Commit Graph

36 Commits

Author SHA1 Message Date
Daniel Schürmann eecd1c020d amd: keep ac_shader_config::lds_size unaligned
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>
2025-10-15 11:20:09 +00:00
Daniel Schürmann fe6ff6d1ef aco: remove DeviceInfo::lds_encoding_granule and DeviceInfo::lds_alloc_granule
Use utility functions instead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>
2025-10-15 11:20:08 +00:00
Daniel Schürmann 11db02d5d9 radv: calculate LDS allocation requirements independently from the compiler
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>
2025-10-15 11:20:07 +00:00
Daniel Schürmann b651234414 amd: change ac_shader_config::lds_size to bytes
We still keep it aligned to allocation granularity.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>
2025-10-15 11:20:07 +00:00
Daniel Schürmann d0b87a0d5f ac/nir_flag_smem_for_loads: call divergence analysis internally
Also don't flag more SMEM instructions (in ACO) after the last
call to ac_nir_lower_mem_access_bit_sizes().

Totals from 75 (0.09% of 79839) affected shaders: (Navi48)

Instrs: 191246 -> 189960 (-0.67%)
CodeSize: 996840 -> 985976 (-1.09%)
Latency: 3066184 -> 2945500 (-3.94%)
InvThroughput: 355373 -> 353106 (-0.64%); split: -0.66%, +0.02%
SClause: 4848 -> 4699 (-3.07%)
Copies: 13827 -> 13925 (+0.71%); split: -0.07%, +0.78%
Branches: 5176 -> 5003 (-3.34%)
PreSGPRs: 6222 -> 6272 (+0.80%)
VALU: 108934 -> 108993 (+0.05%); split: -0.00%, +0.06%
SALU: 31679 -> 31210 (-1.48%); split: -1.51%, +0.03%
SMEM: 7158 -> 6739 (-5.85%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>
2025-10-14 16:33:12 +00:00
Daniel Schürmann 8ff44f17ef amd/lower_mem_access_bit_sizes: also use SMEM for subdword loads
We can simply extract from the loaded dwords as per
nir_lower_mem_access_bit_sizes() lowering.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>
2025-10-14 16:33:11 +00:00
Marek Olšák 3fe651f607 nir: remove load_smem_amd
replaced by load_global_amd + ACCESS_SMEM_AMD

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36936>
2025-10-08 08:54:11 +00:00
Rhys Perry 81df517553 aco: avoid unaligned offsets when selecting load_global_amd
SMEM instructions mask off the low bits for the base and offset sources
both before and after they're added. However, NIR expects ACO to only
care about the alignment of the final address.

fossil-db (gfx1201):
Totals from 21 (0.03% of 79839) affected shaders:
Instrs: 229780 -> 229876 (+0.04%)
CodeSize: 1267724 -> 1268080 (+0.03%)
Latency: 2800924 -> 2800978 (+0.00%)
InvThroughput: 520250 -> 520256 (+0.00%)
Copies: 27878 -> 27876 (-0.01%); split: -0.01%, +0.00%
SALU: 29591 -> 29643 (+0.18%)

fossil-db (polaris10):
Totals from 3 (0.00% of 62201) affected shaders:
Latency: 2651 -> 2652 (+0.04%)
InvThroughput: 662 -> 663 (+0.15%)
PreSGPRs: 51 -> 54 (+5.88%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37301>
2025-09-17 09:15:46 +00:00
Georg Lehmann 714a149396 nir: remove unsigned upper bound config
All config information is now either in nir->info or nir->options.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37361>
2025-09-16 09:24:04 +00:00
Georg Lehmann bb67dae12d nir/uub: remove max_workgroup_size from config
For most hardware, this is the same as max invocations in the workgroup.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37361>
2025-09-16 09:24:04 +00:00
Georg Lehmann f3c08c9d27 nir/uub: use shader_info subgroup size
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37361>
2025-09-16 09:24:04 +00:00
Georg Lehmann d029686e20 aco/isel: fix output args init stack buffer overflow
BITSET range functions include the end of the range.

Fixes: eb249bb18e ("aco: Only fix used variables to registers")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37361>
2025-09-16 09:24:03 +00:00
Natalie Vock 3667a7b687 aco: Add call info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Marek Olšák 7d5288b5b7 aco: check that global addresses are 64bit, apply_nuw_to_ssa to global_amd/smem
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>
2025-08-30 15:04:32 -04:00
Georg Lehmann 3d190f2e9c aco: implement skip_helpers for load_global_amd
Foz-DB GFX1201:
Totals from 119 (0.15% of 80287) affected shaders:
Instrs: 212449 -> 213452 (+0.47%)
CodeSize: 1120656 -> 1124708 (+0.36%)
Latency: 2854370 -> 2855772 (+0.05%); split: -0.02%, +0.07%
InvThroughput: 586142 -> 586210 (+0.01%); split: -0.00%, +0.01%
VClause: 3556 -> 3656 (+2.81%)
SClause: 2708 -> 2710 (+0.07%)
Copies: 14410 -> 14509 (+0.69%)
PreSGPRs: 6810 -> 6850 (+0.59%); split: -0.12%, +0.70%
VALU: 135945 -> 135942 (-0.00%); split: -0.01%, +0.01%
SALU: 22147 -> 23121 (+4.40%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:04 +00:00
Georg Lehmann ee7069f875 aco: implement skip_helpers for load_scratch
Foz-DB GFX1201:
Totals from 2 (0.00% of 80287) affected shaders:
Instrs: 4016 -> 4054 (+0.95%)
CodeSize: 22104 -> 22256 (+0.69%)
Latency: 17123 -> 17129 (+0.04%)
Copies: 406 -> 415 (+2.22%)
SALU: 323 -> 353 (+9.29%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:04 +00:00
Georg Lehmann 2bfd8918a5 aco: implement skip_helpers for load_ssbo/ubo/constant
Foz-DB GFX1201:
Totals from 6676 (8.32% of 80287) affected shaders:
Instrs: 8786161 -> 8829091 (+0.49%); split: -0.01%, +0.50%
CodeSize: 47141800 -> 47320480 (+0.38%); split: -0.01%, +0.39%
VGPRs: 376624 -> 376600 (-0.01%)
SpillSGPRs: 1251 -> 1250 (-0.08%)
Latency: 99716626 -> 99642361 (-0.07%); split: -0.11%, +0.04%
InvThroughput: 14893179 -> 14898323 (+0.03%); split: -0.01%, +0.04%
VClause: 149425 -> 153539 (+2.75%); split: -0.04%, +2.79%
SClause: 251247 -> 251842 (+0.24%); split: -0.06%, +0.30%
Copies: 580304 -> 586424 (+1.05%); split: -0.21%, +1.26%
Branches: 163014 -> 163013 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 356548 -> 357109 (+0.16%); split: -0.18%, +0.33%
VALU: 5149733 -> 5149797 (+0.00%); split: -0.00%, +0.00%
SALU: 1082176 -> 1122718 (+3.75%); split: -0.06%, +3.80%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:03 +00:00
Georg Lehmann bdae511b18 aco: implement skip_helpers for image loads
Foz-DB GFX1201:
Totals from 5 (0.01% of 80287) affected shaders:
Instrs: 1406 -> 1417 (+0.78%)
CodeSize: 8012 -> 8056 (+0.55%)
Latency: 7279 -> 7282 (+0.04%)
Copies: 84 -> 85 (+1.19%)
SALU: 170 -> 180 (+5.88%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:02 +00:00
Konstantin Seurer 9df7b48d2f nir: Use nir_def_as_* in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746>
2025-08-24 14:03:09 +00:00
Marek Olšák 3aadae22ad nir: make nir_block::predecessors & dom_frontier sets non-malloc'd
We can just place the set structures inside nir_block.

This reduces the number of ralloc calls by 6.7% when compiling Heaven
shaders with radeonsi+ACO using a release build (i.e. not including
nir_validate set allocations, which are also removed).

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:48 +00:00
Georg Lehmann 883b1ca364 aco: disable wqm for tex loads when not needed
By only executing VMEM loads for lanes where the result is used, we can save
bandwidth.

The NIR pass only handles tex for now, but those are most common anyway.
We can extend it handle image/ssbo/ubo/global loads in the future.

Foz-DB GFX1201:
Totals from 32633 (40.66% of 80251) affected shaders:
Instrs: 22635910 -> 23193509 (+2.46%); split: -0.00%, +2.46%
CodeSize: 122880044 -> 125093428 (+1.80%); split: -0.00%, +1.81%
VGPRs: 1481868 -> 1481712 (-0.01%)
SpillSGPRs: 3877 -> 4301 (+10.94%); split: -0.52%, +11.45%
Latency: 171480552 -> 171685219 (+0.12%); split: -0.18%, +0.30%
InvThroughput: 24364743 -> 24373441 (+0.04%); split: -0.08%, +0.12%
VClause: 388318 -> 388557 (+0.06%); split: -0.06%, +0.13%
SClause: 774781 -> 776492 (+0.22%); split: -0.29%, +0.51%
Copies: 1416586 -> 1541199 (+8.80%); split: -0.16%, +8.96%
Branches: 419591 -> 419673 (+0.02%); split: -0.02%, +0.04%
PreSGPRs: 1330303 -> 1416540 (+6.48%)
PreVGPRs: 964864 -> 964863 (-0.00%)
VALU: 12919601 -> 12920254 (+0.01%); split: -0.01%, +0.01%
SALU: 2685402 -> 3224147 (+20.06%); split: -0.00%, +20.07%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35970>
2025-08-15 07:03:46 +00:00
Rhys Perry 44ab4ad732 aco: align scratch size after isel
Make it safe for VGPR spilling if it's not a multiple of 4.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36406>
2025-08-04 15:06:43 +00:00
Antonio Ospite ddf2aa3a4d build: avoid redefining unreachable() which is standard in C23
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>

See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in

And this causes build errors when building for C23:

-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
                 from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
  123 | #define unreachable(str)    \
      |         ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
  456 | #define unreachable() (__builtin_unreachable ())
      |         ^~~~~~~~~~~
-----------------------------------------------------------------------

So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.

Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.

This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.

All the instances of the macro, including the definition, were updated
with the following command line:

  git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
  while read file; \
  do \
    sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
  done && \
  sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
2025-07-31 17:49:42 +00:00
Alyssa Rosenzweig 8a1a410389 treewide: use SWAP macro
Via Coccinelle patch + manual clean up:

    @@
    identifier temporary, a, b;
    type T;
    @@

    -T temporary = a;
    -a = b;
    -b = temporary;
    +SWAP(a, b);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36297>
2025-07-23 19:49:47 +00:00
Georg Lehmann d672737372 nir,aco: add byte_perm_amd
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Marek Olšák 5ded4f3c7d aco: remove unused aco_symbol_lds_ngg_gs_out_vertex_base
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
2025-07-12 10:28:21 +00:00
Daniel Schürmann 610a19cf31 aco/isel: allow to select SGPR defs for vectorized bcsel and logical operations
No fossil changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35784>
2025-07-09 14:10:37 +00:00
Daniel Schürmann 764ee3a834 radv: don't lower subdword phis to scalar
Totals from 193 (0.24% of 79839) affected shaders: (Navi48)

MaxWaves: 6004 -> 6024 (+0.33%)
Instrs: 169276 -> 166784 (-1.47%); split: -3.01%, +1.53%
CodeSize: 940608 -> 915768 (-2.64%); split: -4.29%, +1.64%
VGPRs: 8012 -> 7716 (-3.69%); split: -3.99%, +0.30%
SpillVGPRs: 185 -> 0 (-inf%)
Scratch: 13568 -> 0 (-inf%)
Latency: 2159787 -> 2147084 (-0.59%); split: -2.86%, +2.28%
InvThroughput: 664022 -> 395859 (-40.38%); split: -42.59%, +2.21%
VClause: 2998 -> 2880 (-3.94%); split: -4.27%, +0.33%
SClause: 3117 -> 3120 (+0.10%)
Copies: 21290 -> 16278 (-23.54%); split: -24.74%, +1.20%
Branches: 4757 -> 4760 (+0.06%); split: -0.34%, +0.40%
PreSGPRs: 7369 -> 7378 (+0.12%); split: -0.11%, +0.23%
PreVGPRs: 4257 -> 3859 (-9.35%); split: -9.94%, +0.59%
VALU: 83173 -> 79804 (-4.05%); split: -5.68%, +1.63%
SALU: 36672 -> 37318 (+1.76%); split: -0.02%, +1.78%
VMEM: 4012 -> 3762 (-6.23%); split: -6.83%, +0.60%
SMEM: 4300 -> 4303 (+0.07%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35784>
2025-07-09 14:10:36 +00:00
Daniel Schürmann 2c51a8870d nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar()
Similar to nir_lower_alu_width(), the callback can return the
desired number of components for a phi, or 0 for no lowering.

The previous behavior of nir_lower_phis_to_scalar() with lower_all=true
can be elicited via nir_lower_all_phis_to_scalar() while the previous
behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar()
with NULL callback.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
2025-07-08 15:33:59 +00:00
Marek Olšák 4263b49778 ac/nir: remove ngg_scratch LDS ABI, allocate it in the lowering pass
This is a cleanup.

Old gs LDS layout: [es outputs][gs outputs][scratch]
Old nogs LDS layout: [xfb/cull][scratch]

New gs LDS layout: [es outputs][scratch|gs outputs]
New nogs LDS layout: [scratch|xfb/cull]

The LDS scratch is moved to the beginning of the preceding buffer in LDS,
while the addresses in that LDS buffer are offset by the scratch size.
It effectively merges the LDS scratch with the preceding buffer in LDS.
Thanks to that, we no longer need the ngg_scratch ABI and the offset
in a user SGPR.

The lowering passes now return the LDS scratch size, which is used
by the drivers to determine the final LDS size.

The ngg_lds_layout SGPR is now unused without GS in RADV.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>
2025-07-02 20:27:41 +00:00
Alyssa Rosenzweig 67237b6f1b treewide: use nir_break_if
Via Coccinelle patch:

    @@
    expression builder, condition;
    @@

    -nir_push_if(builder, condition);
    -{
    -nir_jump(builder, nir_jump_break);
    -}
    -nir_pop_if(builder, NULL);
    +nir_break_if(builder, condition);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35794>
2025-06-30 14:51:24 -04:00
Georg Lehmann 001cd632ee aco: select float8 to fp32 conversions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:27 +00:00
Georg Lehmann f047a67fba nir,aco: optimize FP16_OFVL pattern created by vkd3d-proton
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:27 +00:00
Georg Lehmann 9e6adcbca0 aco: select fp32 to float8 conversions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:26 +00:00
Georg Lehmann 88753ddd1d aco: allow nir divergence to be printed again
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32990>
2025-06-19 07:02:20 +00:00
Daniel Schürmann 62a92417ef aco: move instruction selection files to /compiler/instruction selection/ subfolder
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34977>
2025-05-16 11:01:19 +00:00