Commit Graph

16179 Commits

Author SHA1 Message Date
Samuel Pitoiset 44dfeb4479 radv,aco: add a separate function to compile the trap handler shader
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056>
2024-11-12 11:16:13 +00:00
Samuel Pitoiset 62e335c779 radv,aco: dump more SQ_WAVE regs from the trap handler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056>
2024-11-12 11:16:13 +00:00
Samuel Pitoiset 0cc21d0601 radv: cleanup printing SGPRS dumped from the trap handler
It's more readable like that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056>
2024-11-12 11:16:13 +00:00
Georg Lehmann ece1ab3b87 radv: run copy prop before vectorizing
Otherwise there are a lot of scalar movs between texture instructions
and alu. With those removed, the top down vectorizer has more starting
points.

Totals from 296 (0.37% of 79206) affected shaders:
MaxWaves: 5710 -> 5754 (+0.77%)
Instrs: 388051 -> 386630 (-0.37%); split: -0.46%, +0.09%
CodeSize: 2120800 -> 2117144 (-0.17%); split: -0.30%, +0.13%
VGPRs: 17496 -> 17344 (-0.87%)
Latency: 8893751 -> 8901364 (+0.09%); split: -0.10%, +0.18%
InvThroughput: 1740411 -> 1731710 (-0.50%); split: -0.57%, +0.07%
VClause: 6573 -> 6576 (+0.05%); split: -0.21%, +0.26%
SClause: 11233 -> 11209 (-0.21%); split: -0.28%, +0.07%
Copies: 31582 -> 31635 (+0.17%); split: -1.49%, +1.66%
PreSGPRs: 15878 -> 15876 (-0.01%)
PreVGPRs: 15380 -> 15274 (-0.69%)
VALU: 278528 -> 277036 (-0.54%); split: -0.65%, +0.11%
SALU: 49062 -> 49054 (-0.02%); split: -0.03%, +0.02%

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32060>
2024-11-11 18:33:48 +00:00
Samuel Pitoiset 107f29c39a aco: do not reorder s_trap instructions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32055>
2024-11-11 15:46:36 +00:00
Samuel Pitoiset 30d9166d80 radv: dump the trap handler shader with RADV_DEBUG=dump_trap_handler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32031>
2024-11-11 09:34:05 +00:00
Samuel Pitoiset 4d50691ae9 radv: remove unused parameter to radv_fill_nir_compiler_options()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32031>
2024-11-11 09:34:05 +00:00
Konstantin Seurer e3cf6290e0 radv: Add RADV_DEBUG=nirdebuginfo
Annotates the NIR shader with source locations.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:14 +00:00
Konstantin Seurer 736c8c6f23 radv: Dump nir shaders before compiling
It will allow adding source locations that point to the nir_string to
the shader.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:14 +00:00
Konstantin Seurer aaf65d6219 radv: Store debug info inside radv_shader
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:14 +00:00
Konstantin Seurer 54c22656b8 radv: Add a helper for accessing the shader binary
Use pointers into the blob instead of hardcoding the layout everywhere.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:13 +00:00
Konstantin Seurer 69ebba82d4 aco: Pass debug information to the driver
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:13 +00:00
Konstantin Seurer f8ef1afec8 aco: Handle nir_debug_info_instr
Propagated debug info using p_debug_info and Program::debug_info.
Offsets into the shader binary are gathered during assembly.
This will be useful for mapping back the disassembled shader to
nir, glsl or spirv.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:13 +00:00
Konstantin Seurer 7dd9840128 amd: Add ac_shader_debug_info
This is very similar to nir_debug_info_instr but it can exist outside of
a nir shader.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>
2024-11-11 08:39:13 +00:00
Konstantin Seurer c5e40a60f8 radv: Lower non-uniform access after vectorization
Scalar access can make nir_lower_non_uniform_access emit a lot of
waterfall loops.

Totals from 83 (0.10% of 84770) affected shaders:
Instrs: 2747926 -> 2745959 (-0.07%); split: -0.07%, +0.00%
CodeSize: 15022460 -> 14998240 (-0.16%); split: -0.16%, +0.00%
Latency: 18602932 -> 18404976 (-1.06%); split: -1.18%, +0.12%
InvThroughput: 4500730 -> 4450364 (-1.12%); split: -1.18%, +0.06%
VClause: 93651 -> 91848 (-1.93%); split: -1.93%, +0.00%
SClause: 63672 -> 63595 (-0.12%); split: -0.13%, +0.00%
Copies: 229377 -> 229896 (+0.23%); split: -0.04%, +0.27%
Branches: 107630 -> 107627 (-0.00%); split: -0.01%, +0.00%
PreSGPRs: 5247 -> 5253 (+0.11%)
PreVGPRs: 5911 -> 5903 (-0.14%); split: -0.29%, +0.15%
VALU: 1761158 -> 1761540 (+0.02%); split: -0.01%, +0.03%
SALU: 419743 -> 419783 (+0.01%); split: -0.01%, +0.02%
VMEM: 152142 -> 150208 (-1.27%)
SMEM: 80251 -> 80244 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30509>
2024-11-11 07:53:13 +00:00
Visan, Tiberiu d379a3a428 amd/vpelib: remove luma offset (#459)
[WHY]
The shader and VPE do not apply brightness adjustments in the same manner

[HOW]
Removed the luma offset added in VPE

[TESTING]
Tested on real time video rendering

Co-authored-by: Tiberiu Visan <tvisan@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Reviewed-by: Navid Assadian <Navid.Assadian@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32075>
2024-11-11 13:00:54 +08:00
Visan, Tiberiu 2172ab2c2a amd/vpelib: patch to match shader (#456)
[WHY]
Shader and VPE had different behavior while adjusting the brightness

[HOW]
Apply the same normalization factor

[TESTING]
Tested on real video outputs

Co-authored-by: Tiberiu Visan <tvisan@amd.com>
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32075>
2024-11-11 13:00:44 +08:00
Leder, Brendan Steve 891c4694ba amd/vpelib: Refactor OCSC and update missing check
Added the missing check for 601 in the limited format check.
Refactored OCSC to use specific limited depths.
Cleaned up general color processing.

Co-authored-by: Brendan <breleder@amd.com>
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32075>
2024-11-11 13:00:29 +08:00
Samuel Pitoiset 437bd63265 radv,aco: dump m0 and exec from the trap handler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>
2024-11-08 14:00:15 +00:00
Samuel Pitoiset d1d41be43f aco: declare phys regs for tba_hi/tma_hi
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>
2024-11-08 14:00:15 +00:00
Samuel Pitoiset 13bab450a2 aco: fix storing SQ_WAVE_STATUS in the trap handler shader
SQ_WAVE_STATUS can change inside the trap because of SCC.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>
2024-11-08 14:00:14 +00:00
Samuel Pitoiset 494050d2ea aco: add a helper to dump SGPR to memory for the trap handler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>
2024-11-08 14:00:14 +00:00
Samuel Pitoiset 8c6f2fef1b aco: use scalar buffer stores for dumping SGPRS from the trap on GFX8
This avoids using any VGPRs on GFX8.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>
2024-11-08 14:00:14 +00:00
Samuel Pitoiset 17f6b4e51e aco: save/restore SCC in the trap handler shader
SCC is only updated on GFX9+ but let's do it by default because the
trap handler shader is likely going to be more and more complex over
time.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>
2024-11-08 14:00:14 +00:00
Samuel Pitoiset 7b4386facd aco: cleanup using fixed registers in the trap handler shader
It's easier to read and potentially less error prone.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32026>
2024-11-08 14:00:14 +00:00
Pierre-Eric Pelloux-Prayer 9c3ac69568 ac/perfcounter: fix buffer overflow
If block->b->selectors is larger than 999, "+ 4" is not enough.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>
2024-11-08 13:31:02 +00:00
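The overflow fixed in 9c3ac69568 comes from sizing a name buffer with a fixed number of extra bytes for the index. A minimal sketch of one way to avoid that class of bug is to ask `snprintf` for the required length instead of hardcoding it. The helper names and the `"<prefix>_<index>"` format below are hypothetical and only illustrate the idea; they are not taken from the Mesa sources.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: number of bytes (including the terminating NUL)
 * needed to print "<prefix>_<max_index>". Calling snprintf with a NULL
 * buffer and size 0 only computes the length and writes nothing. */
static size_t selector_name_size(const char *prefix, unsigned max_index)
{
   int len = snprintf(NULL, 0, "%s_%u", prefix, max_index);
   return len < 0 ? 0 : (size_t)len + 1;
}

/* Hypothetical helper: allocate and format one selector name. The buffer
 * is sized for the largest index, so every index up to max_index fits. */
static char *format_selector_name(const char *prefix, unsigned index,
                                  unsigned max_index)
{
   size_t size = selector_name_size(prefix, max_index);
   if (size == 0)
      return NULL;

   char *name = malloc(size);
   if (!name)
      return NULL;

   snprintf(name, size, "%s_%u", prefix, index);
   return name;
}
```

With a two-character prefix, index 999 needs 7 bytes and index 1000 needs 8, which is the kind of extra byte a fixed allowance for three digits does not cover.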
Pierre-Eric Pelloux-Prayer cce45dc0bf ac: switch AMD_FORCE_FAMILY handling to using ac_fake_hw_db
ac_fake_hw_db can be the single place where radeon_info content
is emulated when overriding the GPU type.

For some fields we need to avoid overriding them with the value
coming from the ioctls to get the correct behavior.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>
2024-11-08 13:31:02 +00:00
Pierre-Eric Pelloux-Prayer c097c37455 ac: add 'polaris12' gpu to ac_fake_hw_db
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>
2024-11-08 13:31:02 +00:00
Pierre-Eric Pelloux-Prayer 1c31cec31e ac: rename ac_surface_test_common -> ac_fake_hw_db
The next commit will reuse the radeon_info when AMD_FORCE_FAMILY
is used.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>
2024-11-08 13:31:02 +00:00
Pierre-Eric Pelloux-Prayer 9d0aba1f97 ac/surface: add flags to surface metadata
Instead of increasing the version number to describe which fields
are set, use the lower 16 bits for the metadata format version,
and the other bits as flags.
This way the version number defines the layout, and the flags
tell which values are set.

The format version is bumped to 3 (= can have flags), and 2 flags
are defined:
* AC_SURF_METADATA_FLAG_EXTRA_MD_BIT: replaces what was version
  number = 2. This means the metadata contains extra information
  for tools.
* AC_SURF_METADATA_FLAG_FAMILY_OVERRIDEN_BIT: if set, it means the
  surface was allocated from a context that used an overridden gfx
  family. This allows the importer process to fail the import early,
  as the surface is likely to be invalid.
  It also adds an extra dw at the end, to store the fake family.

This is a breaking change for existing code that interpreted
"version > 1" as 2, but only in one case:
AC_SURF_METADATA_FLAG_FAMILY_OVERRIDEN_BIT being set, but not
AC_SURF_METADATA_FLAG_EXTRA_MD_BIT, which produces a version number
of 0x20001 but there is no extra data.
I think this is ok, since both gfx family overriding and extra_md
are debugging tools.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>
2024-11-08 13:31:02 +00:00
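The layout described in 9d0aba1f97 (format version in the lower 16 bits, flags in the remaining bits) can be sketched as follows. The `EXAMPLE_` names are hypothetical. Bit 17 for the family-overridden flag is inferred from the 0x20001 value quoted in the commit message, while bit 16 for the extra-metadata flag is an assumption made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lower 16 bits: metadata format version. Upper 16 bits: flags. */
#define EXAMPLE_VERSION_MASK 0xffffu
/* Assumed bit position (not stated in the commit message). */
#define EXAMPLE_FLAG_EXTRA_MD_BIT (1u << 16)
/* Inferred from the 0x20001 example in the commit message. */
#define EXAMPLE_FLAG_FAMILY_OVERRIDDEN_BIT (1u << 17)

static uint32_t metadata_version(uint32_t word)
{
   return word & EXAMPLE_VERSION_MASK;
}

static bool has_extra_md(uint32_t word)
{
   return (word & EXAMPLE_FLAG_EXTRA_MD_BIT) != 0;
}

static bool family_overridden(uint32_t word)
{
   return (word & EXAMPLE_FLAG_FAMILY_OVERRIDDEN_BIT) != 0;
}

/* Older readers treated the whole word as a version number and took
 * "version > 1" to mean that extra metadata is present. */
static bool legacy_has_extra_md(uint32_t word)
{
   return word > 1;
}
```

For the word 0x20001, a flag-aware reader sees the family-overridden flag without extra metadata, while the legacy check reports extra metadata that is not there. This is the single breaking case the commit message calls out.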
Pierre-Eric Pelloux-Prayer acc32cadf5 radv: set info->family_overridden when RADV_FORCE_FAMILY is used
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31841>
2024-11-08 13:31:02 +00:00
Eric Engestrom e83613d906 radv/ci: add more flakes seen recently
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32045>
2024-11-07 21:49:29 +01:00
Eric Engestrom 9229bcaf13 radeonsi/ci: add more flakes seen recently
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32045>
2024-11-07 21:49:29 +01:00
Samuel Pitoiset 9cc07bbd09 radv: mark some GFX6-7 GPUs as Vulkan 1.3 conformant
It's the first time RADV is Vulkan conformant on GFX6-7! Some chips
are missing because we don't have access but most of the GFX6-7 GPUs
are covered.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32022>
2024-11-07 11:50:10 +00:00
Samuel Pitoiset b67218645d radv: save the trap handler report in the HOME directory
It's similar to where GPU hang reports are saved.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31988>
2024-11-07 09:28:16 +01:00
Rhys Perry 215c44c124 aco: apply extract to v_cvt_f32_ubyte0
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry f1a932bc29 aco: apply extract to p_extract_vector
fossil-db (navi21):
Totals from 46 (0.06% of 79395) affected shaders:
Instrs: 80126 -> 79944 (-0.23%); split: -0.27%, +0.04%
CodeSize: 486860 -> 485668 (-0.24%); split: -0.31%, +0.06%
Latency: 1615395 -> 1614218 (-0.07%); split: -0.07%, +0.00%
InvThroughput: 705479 -> 705013 (-0.07%); split: -0.07%, +0.00%
Copies: 18934 -> 18797 (-0.72%); split: -0.98%, +0.25%
VALU: 52452 -> 52268 (-0.35%); split: -0.41%, +0.06%
SALU: 17253 -> 17255 (+0.01%); split: -0.02%, +0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry 6cb9d39bc2 aco: combine extracts with sub-dword definitions
fossil-db (navi21):
Totals from 23 (0.03% of 79395) affected shaders:
Instrs: 55133 -> 55099 (-0.06%)
CodeSize: 335744 -> 335512 (-0.07%)
Latency: 1709146 -> 1709031 (-0.01%)
InvThroughput: 613788 -> 613713 (-0.01%)
Copies: 14405 -> 14407 (+0.01%); split: -0.03%, +0.04%
VALU: 37038 -> 37000 (-0.10%)
SALU: 11125 -> 11131 (+0.05%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry 30af7ae44f aco: add and use apply_extract_twice helper
This will be used in the next commit.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry 05d0fa894e aco: allow applying sign-extended sel to p_extract more often
In the case of v1=p_extract(v1=p_extract(src, 0, 16, 1), 0, 32, 0).
When we apply extracts with sub-dword definitions, this will also
include v2b=p_extract(v2b=p_extract(src, 0, 8, 1), 0, 16, 0).

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
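The nested pattern in 05d0fa894e can be checked with a small model of `p_extract`. This sketch assumes the operands mean (source, field index, field width in bits, sign-extend flag) and that the result is widened to a 32-bit definition. `model_extract` is a hypothetical stand-in, not ACO code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of p_extract for a 32-bit definition: select field
 * number `index` of width `bits` from src, then zero- or sign-extend it. */
static uint32_t model_extract(uint32_t src, unsigned index, unsigned bits,
                              bool sign_extend)
{
   if (bits == 0)
      return 0;
   if (bits >= 32)
      return src; /* whole dword selected, nothing to extend */

   uint32_t mask = (1u << bits) - 1u;
   uint32_t field = (src >> (index * bits)) & mask;

   /* Replicate the field's top bit into the upper bits if requested. */
   if (sign_extend && (field & (1u << (bits - 1))))
      field |= ~mask;

   return field;
}
```

Under these assumptions the outer full-width, zero-extending extract is an identity, so `model_extract(model_extract(src, 0, 16, true), 0, 32, false)` equals `model_extract(src, 0, 16, true)`, and the sign-extended selection can be applied directly.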
Rhys Perry e47bc3e750 aco: shrink code size of some p_extract
fossil-db (navi21):
Totals from 37 (0.05% of 79395) affected shaders:
CodeSize: 2048204 -> 2047836 (-0.02%)

fossil-db (navi31):
Totals from 307 (0.39% of 79395) affected shaders:
CodeSize: 3075732 -> 3065236 (-0.34%); split: -0.34%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry d285333800 aco: add a bit more p_extract/p_insert validation
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry d3ac69f79b aco: handle SGPR limitations when applying extract
We were already doing this, but missing it in a few places.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry 07e28dad75 aco: disallow p_extract(,,32,)
Nothing uses these.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry f528597906 aco: check for SDWA before applying extract to lshl/cvt_f32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry 6ce51ea168 aco/gfx11: fix v1b=p_extract(src, 0, 16, 0)
This is weird, but the SDWA path supports this.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
2024-11-06 19:31:20 +00:00
Rhys Perry b318fe47e9 aco: don't byte align global VMEM loads if it might be unsafe
Using the byte align path can be unsafe even when 12 byte loads are
supported.

fossil-db (navi21):
Totals from 185 (0.23% of 79395) affected shaders:
Instrs: 391501 -> 391575 (+0.02%); split: -0.03%, +0.05%
CodeSize: 2147336 -> 2147672 (+0.02%); split: -0.03%, +0.05%
Latency: 3762613 -> 3860941 (+2.61%); split: -0.01%, +2.62%
InvThroughput: 871429 -> 888013 (+1.90%); split: -0.08%, +1.98%
VClause: 9712 -> 10210 (+5.13%)
Copies: 53775 -> 53010 (-1.42%); split: -1.46%, +0.04%
VALU: 254009 -> 252146 (-0.73%)
SALU: 56698 -> 56699 (+0.00%); split: -0.00%, +0.00%
VMEM: 18503 -> 19601 (+5.93%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 391bf3ea30 ("aco: don't expand smem/mubuf global loads")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31807>
2024-11-06 19:07:16 +00:00
Marek Olšák 2a9d590b6c Revert "amd/ci: adjust stoney traces checksums"
This reverts commit 5882b5b93b.

It was added because nir_opt_varyings was accidentally disabled.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31994>
2024-11-06 15:51:51 +00:00
Eric Engestrom cdeb284dce amd/ci: document flakes seen lately
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32007>
2024-11-06 14:14:38 +00:00
Georg Lehmann 2cd8a9fef7 amd: lower gl_FragCoord.w rcp in NIR
This allows NIR to remove the rcps if the application uses rcp(gl_FragCoord.w).
D3D provides w, not 1/w like GL/VK in the shader, so this is commonly used.

Foz-DB Navi21:
Totals from 2068 (2.61% of 79206) affected shaders:
MaxWaves: 45636 -> 45652 (+0.04%)
Instrs: 2173444 -> 2169671 (-0.17%); split: -0.18%, +0.00%
CodeSize: 11881304 -> 11867208 (-0.12%); split: -0.12%, +0.01%
VGPRs: 118000 -> 117968 (-0.03%)
Latency: 35689676 -> 35675909 (-0.04%); split: -0.06%, +0.02%
InvThroughput: 9167199 -> 9159801 (-0.08%); split: -0.08%, +0.00%
VClause: 45076 -> 45078 (+0.00%); split: -0.01%, +0.02%
SClause: 92503 -> 92366 (-0.15%); split: -0.31%, +0.17%
Copies: 140282 -> 140303 (+0.01%); split: -0.13%, +0.14%
Branches: 53347 -> 53346 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 96495 -> 96465 (-0.03%)
VALU: 1522980 -> 1519252 (-0.24%); split: -0.25%, +0.01%
SALU: 213451 -> 213460 (+0.00%); split: -0.02%, +0.02%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31967>
2024-11-06 12:57:08 +00:00