Samuel Pitoiset
d25952c3d3
radv/ci: update expected list of failures/flakes on GFX1201
...
50 runs in a row without any unexpected failures/hangs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36806 >
2025-08-20 06:31:14 +00:00
Job Noorman
2a8c5ebc77
ir3: enable scalar predicates
...
Enable the use of scalar predicates by marking predicate dsts as uniform
when possible during instruction emission and in opt_predicates.
Totals:
Instrs: 48207402 -> 47967272 (-0.50%); split: -0.54%, +0.05%
CodeSize: 101907026 -> 101768626 (-0.14%); split: -0.15%, +0.01%
NOPs: 8386320 -> 8165410 (-2.63%); split: -2.88%, +0.25%
MOVs: 1468853 -> 1470546 (+0.12%); split: -0.17%, +0.28%
COVs: 823724 -> 823746 (+0.00%); split: -0.01%, +0.01%
Full: 1716708 -> 1716767 (+0.00%); split: -0.00%, +0.01%
(ss): 1113167 -> 1168194 (+4.94%); split: -0.15%, +5.09%
(sy): 552317 -> 552288 (-0.01%); split: -0.10%, +0.09%
(ss)-stall: 4013046 -> 4261336 (+6.19%); split: -0.11%, +6.30%
(sy)-stall: 16741190 -> 16748983 (+0.05%); split: -0.17%, +0.22%
STPs: 18895 -> 18901 (+0.03%); split: -0.02%, +0.05%
LDPs: 23853 -> 23762 (-0.38%); split: -0.39%, +0.01%
Preamble Instrs: 11506988 -> 11493425 (-0.12%); split: -0.12%, +0.01%
Early Preamble: 121339 -> 121695 (+0.29%)
Last helper: 11686328 -> 11628618 (-0.49%); split: -0.72%, +0.23%
Cat0: 9241457 -> 9020508 (-2.39%); split: -2.62%, +0.22%
Cat1: 2353411 -> 2354860 (+0.06%); split: -0.17%, +0.23%
Cat2: 17468471 -> 17447932 (-0.12%); split: -0.12%, +0.00%
Cat6: 515728 -> 515643 (-0.02%); split: -0.02%, +0.00%
Cat7: 1637795 -> 1637789 (-0.00%); split: -0.05%, +0.05%
Totals from 33275 (20.20% of 164705) affected shaders:
Instrs: 30329487 -> 30089357 (-0.79%); split: -0.86%, +0.07%
CodeSize: 59715922 -> 59577522 (-0.23%); split: -0.26%, +0.03%
NOPs: 6265422 -> 6044512 (-3.53%); split: -3.86%, +0.33%
MOVs: 1058197 -> 1059890 (+0.16%); split: -0.23%, +0.39%
COVs: 427513 -> 427535 (+0.01%); split: -0.02%, +0.03%
Full: 548495 -> 548554 (+0.01%); split: -0.01%, +0.02%
(ss): 769340 -> 824367 (+7.15%); split: -0.21%, +7.36%
(sy): 368276 -> 368247 (-0.01%); split: -0.14%, +0.13%
(ss)-stall: 3076333 -> 3324623 (+8.07%); split: -0.15%, +8.22%
(sy)-stall: 10740547 -> 10748340 (+0.07%); split: -0.27%, +0.34%
STPs: 12872 -> 12878 (+0.05%); split: -0.02%, +0.07%
LDPs: 20808 -> 20717 (-0.44%); split: -0.45%, +0.01%
Preamble Instrs: 6354490 -> 6340927 (-0.21%); split: -0.22%, +0.01%
Early Preamble: 15233 -> 15589 (+2.34%)
Last helper: 8106631 -> 8048921 (-0.71%); split: -1.04%, +0.32%
Cat0: 6888653 -> 6667704 (-3.21%); split: -3.51%, +0.30%
Cat1: 1541452 -> 1542901 (+0.09%); split: -0.25%, +0.35%
Cat2: 10963398 -> 10942859 (-0.19%); split: -0.19%, +0.00%
Cat6: 265945 -> 265860 (-0.03%); split: -0.03%, +0.00%
Cat7: 1164800 -> 1164794 (-0.00%); split: -0.07%, +0.07%
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614 >
2025-08-20 06:14:02 +00:00
Job Noorman
cccb3ecc6a
ir3/opt_predicates: move some helpers up
...
We'll need them earlier in the next commit.
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614 >
2025-08-20 06:14:02 +00:00
Job Noorman
0223ab01b7
ir3/isa: add encoding for scalar predicates
...
Predicate registers can be written from the scalar ALU by using a
special cat2 encoding: if the dst is encoded as a0.c, the instruction
will execute on the scalar ALU and write to p0.c.
This commit follows the blob and disassembles scalar predicates as
up0.c. The "u" presumably stands for "uniform".
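
A minimal sketch of the rule described above, not the actual ir3 ISA code. The enum, function names, and the way the dst encoding is modelled are all hypothetical; only the a0.c -> p0.c behaviour and the "up0.c" spelling come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical tags for how the dst field of a cat2 instruction is encoded. */
enum dst_encoding { DST_ENC_P0, DST_ENC_A0 };

/* Per the commit message: an a0.c dst encoding selects the scalar ALU. */
static bool runs_on_scalar_alu(enum dst_encoding enc)
{
   return enc == DST_ENC_A0;
}

/* Print the predicate dst: "up0.c" for scalar predicates, "p0.c" otherwise.
 * Returns the number of characters snprintf would have written. */
static int print_predicate_dst(enum dst_encoding enc, char comp,
                               char *buf, size_t len)
{
   assert(comp >= 'w' && comp <= 'z');
   return snprintf(buf, len, "%sp0.%c",
                   runs_on_scalar_alu(enc) ? "u" : "", comp);
}
```

Either way the register that ends up written is p0.c; only the executing ALU and the printed prefix differ.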
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614 >
2025-08-20 06:14:02 +00:00
Job Noorman
25ab37ae5b
ir3: make backend aware of scalar predicates
...
Predicate registers can be written from the scalar ALU by using a
special cat2 encoding: if the dst is encoded as a0.c, the instruction
will execute on the scalar ALU and write to p0.c.
This commit makes the ir3 backend aware of scalar predicates. A new
register flag (IR3_REG_UNIFORM) is added that can be used to mark
predicate dsts as being written by the scalar ALU. For such dsts, the
same synchronization rules apply as for shared registers written by the
scalar ALU (e.g., (ss) is needed to read them from the vector ALU).
Scalar predicates can be used in the early preamble, which makes control
flow available there.
In many ways, the backend treats IR3_REG_UNIFORM the same as
IR3_REG_SHARED. A new flag was added because IR3_REG_SHARED is mainly
used to denote a separate register file, not as a flag to indicate usage
by the scalar ALU. Scalar predicates still use the normal predicate
register file but allow it to be written from the scalar ALU.
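
The synchronization rule above could be sketched roughly as follows. The flag names and bit values are hypothetical stand-ins (not the real IR3_REG_* assignments), and treating every shared dst as scalar-ALU-written is a simplification made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration only. */
#define REG_SHARED    (1u << 0) /* lives in the shared register file */
#define REG_PREDICATE (1u << 1) /* lives in the predicate register file */
#define REG_UNIFORM   (1u << 2) /* predicate written by the scalar ALU */

/* Simplification: assume a shared or uniform dst came from the scalar ALU. */
static bool written_by_scalar_alu(uint32_t dst_flags)
{
   /* In this sketch the uniform flag only makes sense on predicates. */
   assert(!(dst_flags & REG_UNIFORM) || (dst_flags & REG_PREDICATE));
   return (dst_flags & (REG_SHARED | REG_UNIFORM)) != 0;
}

/* A vector-ALU read of a scalar-ALU result must wait on (ss). */
static bool vector_read_needs_ss(uint32_t dst_flags)
{
   return written_by_scalar_alu(dst_flags);
}
```

Keeping a separate uniform bit leaves the register-file question (shared vs. predicate) independent of the question of which ALU produced the value.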
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614 >
2025-08-20 06:14:02 +00:00
Job Noorman
bd28a40bd4
ir3/legalize: don't special-case early-preamble a1 reads
...
We can just generically read from the regmask.
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614 >
2025-08-20 06:14:02 +00:00
Job Noorman
8760c36579
ir3: use shared srcs for demote/kill condition
...
No reason to force vector srcs.
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614 >
2025-08-20 06:14:02 +00:00
Job Noorman
dbfed965ae
ir3: use ir3_get_predicate for demote/kill
...
Instead of duplicating its functionality.
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614 >
2025-08-20 06:14:02 +00:00
Job Noorman
2158211eeb
ir3: allow shared srcs for ldc.k
...
This works just fine and opens up a lot more opportunities for early
preamble. Note that I haven't seen actual cases where the index is large
enough to need a register but verified in computerator that it works.
Totals:
MaxWaves: 2377648 -> 2377666 (+0.00%)
Instrs: 48207402 -> 48219491 (+0.03%); split: -0.01%, +0.03%
CodeSize: 101907026 -> 101929790 (+0.02%); split: -0.01%, +0.03%
NOPs: 8386320 -> 8392647 (+0.08%); split: -0.03%, +0.10%
MOVs: 1468853 -> 1474439 (+0.38%); split: -0.19%, +0.57%
Full: 1716708 -> 1716655 (-0.00%)
(ss): 1113167 -> 1115183 (+0.18%); split: -0.05%, +0.23%
(sy): 552317 -> 552334 (+0.00%); split: -0.10%, +0.10%
(ss)-stall: 4013046 -> 4011814 (-0.03%); split: -0.10%, +0.06%
(sy)-stall: 16741190 -> 16738674 (-0.02%); split: -0.20%, +0.19%
Preamble Instrs: 11506988 -> 11422360 (-0.74%); split: -0.79%, +0.06%
Early Preamble: 121339 -> 123955 (+2.16%)
Last helper: 11686328 -> 11688700 (+0.02%); split: -0.01%, +0.03%
Cat0: 9241457 -> 9248390 (+0.08%); split: -0.02%, +0.10%
Cat1: 2353411 -> 2359061 (+0.24%); split: -0.12%, +0.36%
Cat7: 1637795 -> 1637301 (-0.03%); split: -0.18%, +0.14%
Totals from 5370 (3.26% of 164705) affected shaders:
MaxWaves: 66838 -> 66856 (+0.03%)
Instrs: 4127945 -> 4140034 (+0.29%); split: -0.08%, +0.37%
CodeSize: 8376584 -> 8399348 (+0.27%); split: -0.08%, +0.35%
NOPs: 892650 -> 898977 (+0.71%); split: -0.24%, +0.95%
MOVs: 199423 -> 205009 (+2.80%); split: -1.42%, +4.22%
Full: 76648 -> 76595 (-0.07%)
(ss): 106018 -> 108034 (+1.90%); split: -0.56%, +2.46%
(sy): 48427 -> 48444 (+0.04%); split: -1.10%, +1.13%
(ss)-stall: 479348 -> 478116 (-0.26%); split: -0.80%, +0.54%
(sy)-stall: 1880900 -> 1878384 (-0.13%); split: -1.81%, +1.68%
Preamble Instrs: 1096452 -> 1011824 (-7.72%); split: -8.34%, +0.62%
Early Preamble: 0 -> 2616 (+inf%)
Last helper: 1313193 -> 1315565 (+0.18%); split: -0.10%, +0.29%
Cat0: 992161 -> 999094 (+0.70%); split: -0.23%, +0.93%
Cat1: 234329 -> 239979 (+2.41%); split: -1.21%, +3.62%
Cat7: 118722 -> 118228 (-0.42%); split: -2.42%, +2.00%
The regressions in NOPs/MOVs seem to be cases of bad luck in
RA/scheduling. I looked at a couple of cases and the main shader is
essentially the same before RA. It's a bit unfortunate the differences
in the preamble can have such an impact on the main shader...
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36673 >
2025-08-20 05:10:23 +00:00
Tapani Pälli
ef09df004e
compiler/types: handle BFLOAT16 when decoding blob
...
The new type was not handled in the switch, which led to hitting the
following assert when running tests with a pipeline cache:
deqp-vk: ../src/compiler/glsl_types.c:3334: decode_type_from_blob: Assertion `!"Cannot decode type!"' failed.
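
As an illustration of this class of bug (not the actual glsl_types.c code), here is a sketch with a hypothetical, trimmed-down base-type enum. A decoder that switches over the base type needs a case for every type the encoder can emit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down base-type enum; the real one has more entries. */
enum base_type { TYPE_FLOAT, TYPE_FLOAT16, TYPE_BFLOAT16, TYPE_INT, TYPE_COUNT };

/* Returns true if this sketch's decoder knows how to rebuild the type. */
static bool can_decode(enum base_type t)
{
   switch (t) {
   case TYPE_FLOAT:
   case TYPE_FLOAT16:
   case TYPE_BFLOAT16: /* the kind of case that was missing */
   case TYPE_INT:
      return true;
   default:
      return false; /* the real decoder asserts at this point instead */
   }
}
```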
Fixes: 9e5d7eb88d ("compiler/types: add a bfloat16 type")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com >
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36833 >
2025-08-20 04:12:00 +00:00
Kovac, Krunoslav
9452f2ca3f
amd/vpelib: Minor Refactor
...
[WHY]
There will be more conditions for bypassing degamma, so refactor.
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com >
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809 >
2025-08-20 10:42:01 +08:00
Chan, Roy
dda6a76b54
amd/vpelib: check stream_count as well before accessing streams
...
[WHY]
It was found that the caller may call with stream_count = 0 while the
streams array contains garbage.
This randomly ends up modifying output_ctx and leads to a validation
failure.
[HOW]
Add a check on stream_count.
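
A minimal sketch of such a guard. The struct and function names are hypothetical stand-ins, not the vpelib types; the point is only that streams[] is not touched unless stream_count says it holds entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the build parameters. */
struct stream { int format; };
struct build_param {
   uint32_t stream_count;
   const struct stream *streams;
};

/* Only dereference streams[] when the caller reports at least one entry. */
static const struct stream *first_stream(const struct build_param *p)
{
   assert(p != NULL);
   if (p->stream_count == 0 || p->streams == NULL)
      return NULL;
   return &p->streams[0];
}
```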
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com >
Signed-off-by: Roy Chan <Roy.Chan@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809 >
2025-08-20 10:42:01 +08:00
Zhao, Jiali
2b50600a71
amd/vpelib: Extend TMZ value to 8 bit
...
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com >
Signed-off-by: Jiali Zhao <Jiali.Zhao@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809 >
2025-08-20 10:42:01 +08:00
Ansari, Muhammad
c26cf7f74d
amd/vpelib: VPE Events
...
[WHY]
For further debugging, we need to know about the build cmd variables.
[HOW]
Added these input and output parameters to vpe events.
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com >
Signed-off-by: Muhammad Ansari <Muhammad.Ansari@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809 >
2025-08-20 10:42:01 +08:00
Leder, Brendan Steve (Brendan)
a486404e4d
amd/vpelib: General cleanup / optimization tasks
...
Various small optimizations that have been accumulating, deal with them
in one commit:
- Add erase functionality for vector util, remove memsets for time opt.
- Update should_gen_cmd_info to take in any stream variables.
- Program funcs should directly program - update mpcc mux hook func to
take in blend_mode.
- Add reserved bits for debug flags.
Signed-off-by: Brendan Steven, Leder <BrendanSteven.Leder@amd.com >
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809 >
2025-08-20 10:42:01 +08:00
Okenczyc, Andrzej
e5cdc78e0e
amd/vpelib: Move predication size calculation to bufs_req
...
Calculation for the worst case scenario in bufs_req should also include
predication command size.
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com >
Signed-off-by: Andrzei Okenczyc <Andrzej.Okenczyc@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809 >
2025-08-20 10:42:01 +08:00
Assadian, Navid
fbeaca1202
amd/vpelib: Add necessary pointer casting
...
Add necessary pointer casting to prevent unexpected behavior
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com >
Signed-off-by: Navid Assadian <Navid.Assadian@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809 >
2025-08-20 10:42:01 +08:00
Yonggang Luo
bdda1cf5ef
va: Use { 0 } initialize struct
...
../src/gallium/frontends/va/config.c(574): error C2059: syntax error: '}'
MSVC 2019 doesn't support it yet
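
For illustration, the portable form of zero-initialization looks like this. The struct is hypothetical and unrelated to the actual code in config.c.

```c
#include <assert.h>

/* Hypothetical struct; any aggregate behaves the same way. */
struct config_attribs {
   int rt_format;
   int max_width;
   int max_height;
};

static struct config_attribs make_default_attribs(void)
{
   /* An empty initializer "= {}" is only standard C since C23;
    * "{ 0 }" zero-initializes every member on older compilers too. */
   struct config_attribs a = { 0 };
   return a;
}
```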
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com >
Reviewed-by: David Rosca <david.rosca@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36843 >
2025-08-20 02:02:55 +00:00
Yonggang Luo
76c1243dc8
va: Remove unused variable pscreen
...
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com >
Reviewed-by: David Rosca <david.rosca@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36843 >
2025-08-20 02:02:54 +00:00
Caio Oliveira
4fda724fd4
brw: Avoid invalid access when compacting out-of-bounds JIP/UIP
...
Usually JIP will be valid, but as part of other changes, it will be
possible to have a shader that has multiple EOT messages and ends with
an ENDIF instruction. Its JIP will point past the end of the program.
This is fine but was tripping up the compaction code.
Change compaction to not read its internal structures beyond the last
instruction.
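
The idea can be sketched as below. This is not the brw compaction code; the per-instruction array and the function name are hypothetical, and the sketch only shows clamping a jump target so that bookkeeping is never read past the last instruction.

```c
#include <assert.h>
#include <stdbool.h>

/* Count compacted instructions located before `target`. A jump target may
 * legitimately point past the last instruction, so clamp it first. */
static int count_compacted_before(const bool *is_compacted, int num_insts,
                                  int target)
{
   assert(num_insts >= 0 && target >= 0);
   int end = target < num_insts ? target : num_insts; /* never read past end */
   int n = 0;
   for (int i = 0; i < end; i++)
      n += is_compacted[i] ? 1 : 0;
   return n;
}
```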
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36822 >
2025-08-20 00:54:41 +00:00
Eric Engestrom
a5433b44e6
nvk/ci: document some flakes
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36857 >
2025-08-20 00:41:19 +00:00
Eric Engestrom
439a0a5c2e
turnip/ci: document a flake
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36857 >
2025-08-20 00:41:19 +00:00
Eric Engestrom
65b0f2ebe0
etnaviv/ci: document some flakes
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36857 >
2025-08-20 00:41:19 +00:00
Eric Engestrom
a5b516804e
r300/ci: document flake
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36857 >
2025-08-20 00:41:19 +00:00
Eric Engestrom
9cb27063fd
zink+turnip/ci: document fixed tests
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36857 >
2025-08-20 00:41:19 +00:00
Eric Engestrom
19021733e6
zink+turnip/ci: document regression in b22806705c...cac3b4f404
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36857 >
2025-08-20 00:41:19 +00:00
Erik Faye-Lund
03b7054c30
pan/midgard: avoid implicit cast-warning on Clang
...
BITFIELD_MASK() returns a 32-bit unsigned integer, and Clang complains
if we assign it to a 16-bit unsigned integer without a cast. Let's add
that cast.
While we're at it, add an assert() to make it clear to the compiler that
the condition in BITFIELD_MASK() can be optimized away.
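
A sketch of the pattern, using a local approximation of the macro rather than Mesa's header. The helper name and the 16-bit bound are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local approximation of the macro: a 32-bit mask with the low b bits set. */
#define BITFIELD_MASK(b) ((b) == 32 ? ~0u : ((1u << ((b) & 31)) - 1))

static uint16_t low_bits_mask16(unsigned bits)
{
   /* Bounding `bits` documents that the (b) == 32 branch is unreachable. */
   assert(bits <= 16);
   /* Explicit narrowing cast: the macro yields a 32-bit unsigned value. */
   return (uint16_t)BITFIELD_MASK(bits);
}
```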
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org >
Tested-by: Yiwei Zhang <zzyiwei@chromium.org >
Reviewed-by: Eric R. Smith <eric.smith@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606 >
2025-08-20 00:05:36 +00:00
Erik Faye-Lund
e5fda871fd
panvk: avoid implicit cast-warning on Clang
...
BITFIELD_MASK() returns a 32-bit unsigned integer, and Clang complains
if we assign it to a 16-bit unsigned integer without a cast. Let's add
that cast.
While we're at it, add an assert() to make it clear to the compiler that
the condition in BITFIELD_MASK() can be optimized away.
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org >
Tested-by: Yiwei Zhang <zzyiwei@chromium.org >
Reviewed-by: Eric R. Smith <eric.smith@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606 >
2025-08-20 00:05:36 +00:00
Erik Faye-Lund
fed682c506
pan/lib: do not duplicate enum mali_pixel_kill
...
The enum pan_earlyzs is just enum mali_pixel_kill under a different
name, which was needed because the enum was missing from common.xml.
However, because pan_earlyzs_lut is used in files that are both included
with PAN_ARCH unset and set to values including values lower than 6, we
get issues with the way genxml/common_pack.h gets included, resulting in
the enum not being defined.
We don't really depend on the values for this, only on the size. So
let's just use unsigned values in the struct instead, to side-step the
issue.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Tested-by: Yiwei Zhang <zzyiwei@chromium.org >
Reviewed-by: Eric R. Smith <eric.smith@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606 >
2025-08-20 00:05:36 +00:00
Erik Faye-Lund
0dcf510c05
pan: use translate_s_format for stencil
...
While this was also using translate_zs_format() before the commit in
question, that didn't lead to any real issues, because only a single
value was legal here before. While it's not entirely in-spec to use
other values, it seems the HW doesn't mind.
But when this logic was reworked, the typed field was used instead. This
led to a compiler warning on Clang.
Let's correct this properly here, rather than papering over the compiler
warning.
Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Tested-by: Yiwei Zhang <zzyiwei@chromium.org >
Reviewed-by: Eric R. Smith <eric.smith@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606 >
2025-08-20 00:05:36 +00:00
Erik Faye-Lund
30cc9f5b3d
pan/util: use nir_component_mask instead of BITFIELD_MASK
...
To generate a nir_component_mask_t, we should use nir_component_mask,
not BITFIELD_MASK()...
But we're also generating the same mask twice here, so let's just
store that to a variable and reuse the mask when shifting it while we're
at it.
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org >
Tested-by: Yiwei Zhang <zzyiwei@chromium.org >
Reviewed-by: Eric R. Smith <eric.smith@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606 >
2025-08-20 00:05:36 +00:00
Eric Engestrom
69b0245f13
panfrost/meson: drop invalid C++ arg
...
cc1plus: warning: command-line option ‘-Wno-override-init’ is valid for C/ObjC but not for C++
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36840 >
2025-08-19 23:44:22 +00:00
Yonggang Luo
2a0a5a3e3f
d3d10umd: Fixes building with mingw/gcc and windows sdk/ddk 10.0.26100.0
...
Avoid recursive include between DriverIncludes.h and Debug.h
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36844 >
2025-08-19 23:22:07 +00:00
Zan Dobersek
1bc25c855b
tu: disable LRZ writes also for alpha-to-coverage, FS sample coverage output
...
Currently LRZ writes are disabled when depth writes are enabled but the
fragment shader is using discard. Additionally, LRZ writes should be
disabled when the fragment shader outputs sample coverage or the pipeline
state enables alpha-to-coverage, which behaves like a discard.
This fixes rendering problems on Assetto Corsa. Conditions now used for
disabling LRZ writes match one set of conditions under which the
EARLY_Z_LATE_Z z-test mode is used. It was assumed that in that mode the
LRZ writes in binning will not happen until the late-Z phase, but that's
apparently not the case.
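
The resulting condition could be sketched like this. The struct and field names are hypothetical and do not mirror turnip's state tracking; gating on depth_write is a simplifying assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the state mentioned in the commit message. */
struct draw_state {
   bool depth_write;
   bool fs_uses_discard;
   bool fs_writes_sample_coverage;
   bool alpha_to_coverage;
};

/* LRZ writes are only safe if nothing can drop samples after the early test. */
static bool lrz_write_allowed(const struct draw_state *s)
{
   assert(s != NULL);
   bool may_drop_samples = s->fs_uses_discard ||
                           s->fs_writes_sample_coverage ||
                           s->alpha_to_coverage;
   return s->depth_write && !may_drop_samples;
}
```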
Signed-off-by: Zan Dobersek <zdobersek@igalia.com >
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36848 >
2025-08-19 23:05:07 +00:00
Yiwei Zhang
ec4cebbf2e
venus: expose KHR_present_id(2)/wait(2) support
...
Venus does support these via common wsi.
Test: dEQP-VK.wsi.*.present_id_wait.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36834 >
2025-08-19 22:48:35 +00:00
Yiwei Zhang
fd0b41b98d
venus: hide swapchainMaintenance1 behind wsi guard
...
...otherwise it would give a false alarm on Android.
Fixes: acd5497067 ("venus: support wsi maintenance1 extensions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36834 >
2025-08-19 22:48:35 +00:00
Mike Blumenkrantz
0d7e38f431
zink: improve deferred buffer barrier heuristics
...
this is only to catch the case of a bound descriptor being written to
by some operation other than its draw/dispatch descriptor bind,
so any non-write binds are ignored
previously those non-write binds were required because of how sync
analysis could drop non-write access, so that is fixed as well
also use the vbo bind count instead of the mask because why not
also also ignore non-write GENERAL image deferred sync because that shouldn't
need anything deferred
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36846 >
2025-08-19 22:11:51 +00:00
Mike Blumenkrantz
cf5d41575b
zink: remove UNSYNCHRONIZED map flag during unmap flush for non-subdata calls
...
this avoids a scenario where a non-subdata UNSYNCHRONIZED unmap triggers through
tc at the same time the frontend calls an UNSYNCHRONIZED subdata call
in the main thread, which desynchronizes the cmdbuf and hits an assert
Fixes: 8ee0d6dd71 ("zink: add a third cmdbuf for unsynchronized (not reordered) ops")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36846 >
2025-08-19 22:11:51 +00:00
Mike Blumenkrantz
4d0650d188
zink: fix image sync deferral
...
each of these cases wasn't actually checking what the comment claimed
it was checking, which would add unnecessary deferred sync
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36846 >
2025-08-19 22:11:51 +00:00
Mike Blumenkrantz
af7b39a22f
zink: optimize a GENERAL layout case in pre-draw/dispatch barriers
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36846 >
2025-08-19 22:11:50 +00:00
Job Noorman
77c1c688dc
ir3/array_to_ssa: remove trivial all-undef phis
...
remove_trivial_phi erroneously skipped phis containing an undef src
because the remaining srcs may not dominate the phi. However, it's fine
to replace a phi whose srcs are all undef with undef. Fix this by simply
checking if all srcs are equal, whether undef or not.
Note that in practice, this often caused phis with undef srcs to be
inserted all the way up to the entry block, keeping their defs alive for
much longer than necessary.
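
A simplified sketch of the "all srcs equal" check, not the ir3 pass itself. The phi model is hypothetical: each src is a pointer to a def, with NULL standing for undef, and self-referencing srcs are not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical def type; a NULL src pointer means "undef". */
struct def { int id; };

/* If every src is the same (including all undef), store that src in *out
 * and return true so the caller can replace the phi with it. */
static bool phi_is_trivial(const struct def *const *srcs, size_t num_srcs,
                           const struct def **out)
{
   assert(num_srcs > 0 && out != NULL);
   for (size_t i = 1; i < num_srcs; i++) {
      if (srcs[i] != srcs[0])
         return false;
   }
   *out = srcs[0];
   return true;
}
```

A phi mixing a real def with undef is still reported as non-trivial here, which matches the dominance concern mentioned above.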
Fixes unnecessary spilling in God Of War and Neon Noir traces.
Totals:
MaxWaves: 2381774 -> 2384954 (+0.13%)
Instrs: 49052269 -> 49052865 (+0.00%); split: -0.03%, +0.04%
CodeSize: 102493810 -> 102514296 (+0.02%); split: -0.02%, +0.04%
NOPs: 8391570 -> 8385296 (-0.07%); split: -0.14%, +0.07%
MOVs: 1448918 -> 1455153 (+0.43%); split: -0.43%, +0.86%
COVs: 824835 -> 824846 (+0.00%)
Full: 1714015 -> 1707987 (-0.35%)
(ss): 1125974 -> 1126692 (+0.06%); split: -0.14%, +0.21%
(sy): 553893 -> 553561 (-0.06%); split: -0.23%, +0.17%
(ss)-stall: 4011440 -> 4006144 (-0.13%); split: -0.21%, +0.08%
(sy)-stall: 16707741 -> 16664838 (-0.26%); split: -0.48%, +0.23%
STPs: 18953 -> 18495 (-2.42%)
LDPs: 23957 -> 22121 (-7.66%)
Preamble Instrs: 11100893 -> 11100673 (-0.00%)
Early Preamble: 122185 -> 122188 (+0.00%)
Last helper: 11913048 -> 11914963 (+0.02%); split: -0.04%, +0.06%
Subgroup size: 12925248 -> 12926272 (+0.01%)
Cat0: 9246551 -> 9240417 (-0.07%); split: -0.13%, +0.07%
Cat1: 2335781 -> 2341487 (+0.24%); split: -0.29%, +0.53%
Cat2: 18445905 -> 18445930 (+0.00%)
Cat6: 515382 -> 514732 (-0.13%)
Cat7: 1635575 -> 1637224 (+0.10%); split: -0.09%, +0.19%
Totals from 2293 (1.39% of 164705) affected shaders:
MaxWaves: 21622 -> 24802 (+14.71%)
Instrs: 3399456 -> 3400052 (+0.02%); split: -0.49%, +0.51%
CodeSize: 6576806 -> 6597292 (+0.31%); split: -0.24%, +0.55%
NOPs: 774365 -> 768091 (-0.81%); split: -1.54%, +0.73%
MOVs: 226724 -> 232959 (+2.75%); split: -2.73%, +5.48%
COVs: 48005 -> 48016 (+0.02%)
Full: 50599 -> 44571 (-11.91%)
(ss): 88248 -> 88966 (+0.81%); split: -1.85%, +2.66%
(sy): 41345 -> 41013 (-0.80%); split: -3.03%, +2.23%
(ss)-stall: 396793 -> 391497 (-1.33%); split: -2.11%, +0.78%
(sy)-stall: 1594786 -> 1551883 (-2.69%); split: -5.06%, +2.37%
STPs: 1147 -> 689 (-39.93%)
LDPs: 2535 -> 699 (-72.43%)
Preamble Instrs: 707407 -> 707187 (-0.03%)
Early Preamble: 180 -> 183 (+1.67%)
Last helper: 1538341 -> 1540256 (+0.12%); split: -0.35%, +0.47%
Subgroup size: 149248 -> 150272 (+0.69%)
Cat0: 857696 -> 851562 (-0.72%); split: -1.43%, +0.72%
Cat1: 275565 -> 281271 (+2.07%); split: -2.44%, +4.51%
Cat2: 1139467 -> 1139492 (+0.00%)
Cat6: 22505 -> 21855 (-2.89%)
Cat7: 129600 -> 131249 (+1.27%); split: -1.15%, +2.42%
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36714 >
2025-08-19 20:07:34 +00:00
Job Noorman
ca15116fa1
ir3/array_to_ssa: fix updating/removing phis
...
Fix checking instruction flags instead of dst flags, and updating src
instead of def.
Totals:
MaxWaves: 2381954 -> 2381958 (+0.00%)
Instrs: 49073677 -> 49073417 (-0.00%)
CodeSize: 102537524 -> 102536824 (-0.00%)
NOPs: 8396340 -> 8396432 (+0.00%); split: -0.00%, +0.00%
MOVs: 1450777 -> 1450422 (-0.02%)
Full: 1714304 -> 1714287 (-0.00%)
(ss): 1126433 -> 1126463 (+0.00%); split: -0.00%, +0.00%
(ss)-stall: 4013834 -> 4013854 (+0.00%)
(sy)-stall: 16713036 -> 16713082 (+0.00%)
Cat0: 9252109 -> 9252194 (+0.00%); split: -0.00%, +0.00%
Cat1: 2337941 -> 2337592 (-0.01%)
Cat7: 1636810 -> 1636814 (+0.00%); split: -0.00%, +0.00%
Totals from 5 (0.00% of 164705) affected shaders:
MaxWaves: 42 -> 46 (+9.52%)
Instrs: 9052 -> 8792 (-2.87%)
CodeSize: 16806 -> 16106 (-4.17%)
NOPs: 2369 -> 2461 (+3.88%); split: -0.17%, +4.05%
MOVs: 1140 -> 785 (-31.14%)
Full: 133 -> 116 (-12.78%)
(ss): 206 -> 236 (+14.56%); split: -0.97%, +15.53%
(ss)-stall: 901 -> 921 (+2.22%)
(sy)-stall: 6229 -> 6275 (+0.74%)
Cat0: 2695 -> 2780 (+3.15%); split: -0.22%, +3.38%
Cat1: 1333 -> 984 (-26.18%)
Cat7: 419 -> 423 (+0.95%); split: -0.48%, +1.43%
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Fixes: 3ac743c333 ("ir3: Add pass to lower arrays to SSA")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36714 >
2025-08-19 20:07:34 +00:00
Michal Krol
2385fa2098
gallium: Do not flush subnormals during tessellation.
...
D3D11 requires that subnormals are not flushed to zero
when tessellating primitives. Since we are flushing
subnormals during shader execution, we must temporarily
turn flushing off when calling the tessellator.
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36811 >
2025-08-19 19:45:29 +00:00
Gert Wollny
8fc2b0d24c
r600/sfn: Emit thread position as two-slot op
...
It doesn't change much though, because it always has to be scheduled
in the xy channels.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743 >
2025-08-19 19:30:33 +00:00
Gert Wollny
b0bf1d914a
r600/sfn: give more liberty to the channel selection in simple two-slot ops
...
Some ops on 64 bit data don't require the data to reside in neighboring
channels and can be executed as separate 32 bit ops. In these cases we don't
need to pin the registers to a specific channel, but for scheduling it is better
that we make sure that both destination values reside in different channels, so
that they can be scheduled into one ALU group and reduce the probability of
read-port conflicts when used as source values.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743 >
2025-08-19 19:30:33 +00:00
Gert Wollny
206d50ba25
r600/sfn: op1v_flt64_to_flt32 as multi-slot instruction
...
With that the optimizer can better switch the channel.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743 >
2025-08-19 19:30:32 +00:00
Gert Wollny
2d88e9236d
r600/sfn: Handle more ops in desk mask evaluation
...
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743 >
2025-08-19 19:30:32 +00:00
Gert Wollny
00c41ad03a
r600/sfn: replace hard-coded multislot dot handling
...
More ops than op2_dot_ieee + op2_mul_ieee can be submitted
as multi-slot ops. Make it easy to handle additional opcodes
when splitting an ALU op that has only one dst but requires
multiple slots. With that we can emit more multi-slot ops that
use consecutive slots and use a different opcode in the last slot.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743 >
2025-08-19 19:30:31 +00:00
Gert Wollny
f2916b3df4
r600/sfn: Fix the mods when splitting ALU op
...
In preparation for splitting 64 bit two-slot ops with one 32 bit
dest register, use the right start slot.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743 >
2025-08-19 19:30:31 +00:00
Gert Wollny
1ba8ff9fe6
r600/sfn: Take slot count into account when pinning registers
...
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743 >
2025-08-19 19:30:30 +00:00