Commit Graph

3942 Commits

Author SHA1 Message Date
Natalie Vock d18b438832 aco: Add RegisterDemand::operator!=
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Yonggang Luo 7db518cfe4 aco: Fixes warning: function get_branch_target/to_clrx_device_name defined but not used
../../src/amd/compiler/aco_print_asm.cpp:156:1: warning: 'bool aco::{anonymous}::get_branch_target(char**, aco::Program*, const std::vector<bool>&, char**)' defined but not used [-Wunused-function]
  156 | get_branch_target(char** output, Program* program, const std::vector<bool>& referenced_blocks,
      | ^~~~~~~~~~~~~~~~~
../../src/amd/compiler/aco_print_asm.cpp:105:1: warning: 'const char* aco::{anonymous}::to_clrx_device_name(amd_gfx_level, radeon_family)' defined but not used [-Wunused-function]
  105 | to_clrx_device_name(amd_gfx_level gfx_level, radeon_family family)
      | ^~~~~~~~~~~~~~~~~~~

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37289>
2025-09-13 08:23:07 +00:00
Rhys Perry e2181744c2 aco/tests: add barrier-to-waitcnt tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 0f32b573a4 aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup lds barriers
fossil-db (navi21):
Totals from 36594 (45.84% of 79825) affected shaders:
Instrs: 19922581 -> 19922563 (-0.00%)
CodeSize: 103616980 -> 103616956 (-0.00%)
Latency: 69862064 -> 69053273 (-1.16%)
InvThroughput: 14607708 -> 14606308 (-0.01%); split: -0.01%, +0.00%

fossil-db (navi31):
Totals from 1641 (2.06% of 79825) affected shaders:
Instrs: 1247591 -> 1247875 (+0.02%); split: -0.00%, +0.03%
CodeSize: 6259516 -> 6260612 (+0.02%); split: -0.00%, +0.02%
Latency: 7657224 -> 7577299 (-1.04%); split: -1.05%, +0.00%
InvThroughput: 1150669 -> 1148171 (-0.22%); split: -0.22%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry ac882985c0 aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup vmem barriers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 145b178de2 aco: fix workgroup-scope barrier between vmem and lds
A barrier between two lds/vmem instructions needs to ensure that the
second starts after the first finishes, which means that we can't just
skip workgroup-scope vmem barriers if there is a lds instruction later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 02718fd4c5 aco: use a separate event for sendmsg_rtn
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 5812c2ea89 aco: update waitcnt events for exports
Include primitive, dual source blend and POS4 exports.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 711c023b55 aco: remove waitcnt code for POPS
We now insert barriers around these instead.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 005694fe1f aco: remove waitcnt code for SMEM stores
These were removed in GFX10.3 and we haven't used them in a while.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 20cd5cf5f7 aco: delay barrier waitcnt until they are needed
fossil-db (navi21):
Totals from 44 (0.06% of 79825) affected shaders:
Instrs: 16001 -> 15932 (-0.43%); split: -0.46%, +0.02%
CodeSize: 85800 -> 85548 (-0.29%); split: -0.30%, +0.01%
Latency: 190124 -> 173458 (-8.77%)
InvThroughput: 23605 -> 22756 (-3.60%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 843acfa50b aco: add a separate barrier_info for release/acquire barriers
These can wait for different sets of accesses.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 6c446c2f83 aco: refactor waitcnt pass to use barrier_info
Currently there's just barrier_info_all, but more will be added later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 21332609b9 aco: don't move acquire barriers before interlock begin
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 0ee1c137f9 aco: don't move release barriers after interlock end
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 7c056dd473 aco: add is_atomic_or_control_instr helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry df6a3b7619 aco: reduce cost of using values defined in predecessors
For code like:
   if (cond) {
      val = load()
   }
   use(val)
The "use(val)" now has a similar cost to a use inside the IF.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Yonggang Luo 773a7f347a clang-format: Update the .clang-format files to conformance clang-format json-schema
The document is at
https://clang.llvm.org/docs/ClangFormatStyleOptions.html

The json-schema at
https://www.schemastore.org/clang-format.json

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37235>
2025-09-09 07:04:55 +00:00
Rhys Perry 1105f7b98f aco: fix signed integer overflow
Fix UBSan error:
runtime error: signed integer overflow: 2147483647 + 32 cannot be represented in type 'int'

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37055>
2025-09-03 11:47:00 +00:00
Samuel Pitoiset decf9af472 radv/rt: only use one user SGPR for the traversal shader addr
All shaders are allocated in the 32-bit addr space. To avoid an issue
with alignment, and also for future work, there is an unused user SGPR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37133>
2025-09-03 05:53:41 +00:00
Daniel Schürmann 441d5aab08 aco/ra: coalesce vector affinities with tied definitions
Totals from 19310 (24.19% of 79839) affected shaders: (Navi48)

MaxWaves: 564238 -> 564542 (+0.05%); split: +0.06%, -0.01%
Instrs: 10856428 -> 10803360 (-0.49%); split: -0.53%, +0.04%
CodeSize: 56405088 -> 56189384 (-0.38%); split: -0.41%, +0.02%
VGPRs: 986120 -> 985952 (-0.02%); split: -0.50%, +0.48%
Latency: 53956142 -> 53940850 (-0.03%); split: -0.11%, +0.09%
InvThroughput: 8769260 -> 8735595 (-0.38%); split: -0.49%, +0.11%
VClause: 237471 -> 237452 (-0.01%); split: -0.05%, +0.04%
SClause: 225385 -> 225389 (+0.00%)
Copies: 799792 -> 744150 (-6.96%); split: -7.25%, +0.30%
Branches: 208574 -> 208572 (-0.00%); split: -0.00%, +0.00%
VALU: 6116920 -> 6061448 (-0.91%); split: -0.95%, +0.04%
SALU: 1442068 -> 1441990 (-0.01%); split: -0.01%, +0.00%
VOPD: 1914 -> 1744 (-8.88%); split: +0.10%, -8.99%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851>
2025-09-02 10:24:27 +00:00
Daniel Schürmann 2f303636f3 aco/ra: consider precolor affinities in get_reg_vector()
No fossil changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851>
2025-09-02 10:24:27 +00:00
Daniel Schürmann 6dbf8f7b90 aco/ra: don't set precolor affinities for already assigned temporaries
Also don't overwrite existing precolor affinities.

Totals from 248 (0.31% of 79839) affected shaders: (Navi48)

Instrs: 154427 -> 154401 (-0.02%); split: -0.12%, +0.10%
CodeSize: 812880 -> 812568 (-0.04%); split: -0.12%, +0.08%
VGPRs: 12432 -> 12408 (-0.19%)
Latency: 851623 -> 851801 (+0.02%); split: -0.03%, +0.05%
InvThroughput: 156569 -> 156581 (+0.01%); split: -0.04%, +0.05%
VClause: 2672 -> 2681 (+0.34%); split: -0.34%, +0.67%
Copies: 12645 -> 12660 (+0.12%); split: -0.53%, +0.65%
VALU: 82894 -> 82909 (+0.02%); split: -0.08%, +0.10%
SALU: 25406 -> 25424 (+0.07%); split: -0.07%, +0.14%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851>
2025-09-02 10:24:26 +00:00
Daniel Schürmann eb557fd090 aco/ra: add vector_info::index to indicate the Operand's index into the vector
This simplifies the code and will allow for a mismatch between the index and
the Operand's temporary.

Totals from 28 (0.04% of 79839) affected shaders: (Navi48)

Instrs: 18453 -> 18440 (-0.07%); split: -0.08%, +0.01%
CodeSize: 98588 -> 98532 (-0.06%); split: -0.06%, +0.00%
Copies: 1347 -> 1333 (-1.04%); split: -1.11%, +0.07%
VALU: 10431 -> 10417 (-0.13%); split: -0.14%, +0.01%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851>
2025-09-02 10:24:26 +00:00
Marek Olšák 4c87d002e3 aco,radeonsi: expand 32-bit shader arg pointers to 64 bits for ACO
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>
2025-08-30 15:04:32 -04:00
Marek Olšák 7d5288b5b7 aco: check that global addresses are 64bit, apply_nuw_to_ssa to global_amd/smem
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>
2025-08-30 15:04:32 -04:00
Georg Lehmann 38e32e39a9 aco: never end wqm early for vmem
The remaining cases where disable_wqm isn't set are either uniform loads
or loads that influence control flow. In the first case, not ending WQM early
is free, and in the second case it's likely still better to not block scheduling.

Foz-DB GFX1201:
Totals from 483 (0.60% of 80287) affected shaders:
MaxWaves: 12654 -> 12642 (-0.09%)
Instrs: 485234 -> 484830 (-0.08%); split: -0.19%, +0.11%
CodeSize: 2630876 -> 2629184 (-0.06%); split: -0.15%, +0.08%
VGPRs: 29980 -> 30004 (+0.08%)
Latency: 4908015 -> 4813167 (-1.93%); split: -1.95%, +0.02%
InvThroughput: 751059 -> 748582 (-0.33%); split: -0.35%, +0.02%
VClause: 8723 -> 8705 (-0.21%); split: -0.30%, +0.09%
SClause: 11085 -> 10986 (-0.89%); split: -1.45%, +0.56%
Copies: 25155 -> 25183 (+0.11%); split: -0.26%, +0.37%
Branches: 6203 -> 6204 (+0.02%)
PreSGPRs: 23763 -> 23780 (+0.07%)
VALU: 296576 -> 296593 (+0.01%); split: -0.01%, +0.02%
SALU: 49095 -> 49416 (+0.65%); split: -0.04%, +0.69%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:04 +00:00
Georg Lehmann 3d190f2e9c aco: implement skip_helpers for load_global_amd
Foz-DB GFX1201:
Totals from 119 (0.15% of 80287) affected shaders:
Instrs: 212449 -> 213452 (+0.47%)
CodeSize: 1120656 -> 1124708 (+0.36%)
Latency: 2854370 -> 2855772 (+0.05%); split: -0.02%, +0.07%
InvThroughput: 586142 -> 586210 (+0.01%); split: -0.00%, +0.01%
VClause: 3556 -> 3656 (+2.81%)
SClause: 2708 -> 2710 (+0.07%)
Copies: 14410 -> 14509 (+0.69%)
PreSGPRs: 6810 -> 6850 (+0.59%); split: -0.12%, +0.70%
VALU: 135945 -> 135942 (-0.00%); split: -0.01%, +0.01%
SALU: 22147 -> 23121 (+4.40%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:04 +00:00
Georg Lehmann ee7069f875 aco: implement skip_helpers for load_scratch
Foz-DB GFX1201:
Totals from 2 (0.00% of 80287) affected shaders:
Instrs: 4016 -> 4054 (+0.95%)
CodeSize: 22104 -> 22256 (+0.69%)
Latency: 17123 -> 17129 (+0.04%)
Copies: 406 -> 415 (+2.22%)
SALU: 323 -> 353 (+9.29%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:04 +00:00
Georg Lehmann 2bfd8918a5 aco: implement skip_helpers for load_ssbo/ubo/constant
Foz-DB GFX1201:
Totals from 6676 (8.32% of 80287) affected shaders:
Instrs: 8786161 -> 8829091 (+0.49%); split: -0.01%, +0.50%
CodeSize: 47141800 -> 47320480 (+0.38%); split: -0.01%, +0.39%
VGPRs: 376624 -> 376600 (-0.01%)
SpillSGPRs: 1251 -> 1250 (-0.08%)
Latency: 99716626 -> 99642361 (-0.07%); split: -0.11%, +0.04%
InvThroughput: 14893179 -> 14898323 (+0.03%); split: -0.01%, +0.04%
VClause: 149425 -> 153539 (+2.75%); split: -0.04%, +2.79%
SClause: 251247 -> 251842 (+0.24%); split: -0.06%, +0.30%
Copies: 580304 -> 586424 (+1.05%); split: -0.21%, +1.26%
Branches: 163014 -> 163013 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 356548 -> 357109 (+0.16%); split: -0.18%, +0.33%
VALU: 5149733 -> 5149797 (+0.00%); split: -0.00%, +0.00%
SALU: 1082176 -> 1122718 (+3.75%); split: -0.06%, +3.80%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:03 +00:00
Georg Lehmann bdae511b18 aco: implement skip_helpers for image loads
Foz-DB GFX1201:
Totals from 5 (0.01% of 80287) affected shaders:
Instrs: 1406 -> 1417 (+0.78%)
CodeSize: 8012 -> 8056 (+0.55%)
Latency: 7279 -> 7282 (+0.04%)
Copies: 84 -> 85 (+1.19%)
SALU: 170 -> 180 (+5.88%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:02 +00:00
Georg Lehmann bf453a7c6a aco/isel: add init_disable_wqm helper
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:01 +00:00
Georg Lehmann 635ac758c9 aco/optimizer: don't create undef copies from p_create_vector
p_create_vector allows undef operands, p_parallelcopy doesn't.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13765
Fixes: 01d20680e2 ("aco/optimizer: generalize p_create_vector of split vector opt")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36963>
2025-08-25 16:47:38 +00:00
Georg Lehmann 8903bb4618 aco/optimizer: don't apply packed clamp to v_fma_mix
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13758
Fixes: 345bf8a2f2 ("aco/optimizer: remove label_vop3p")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36963>
2025-08-25 16:47:38 +00:00
Georg Lehmann 791a57805c aco: fix ra validation for flat/global/scratch/ds load sbyte_d16
Fixes: 18a53230eb ("aco: don't check dst_bitsize in apply_load_extract")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36964>
2025-08-25 10:09:16 +00:00
Konstantin Seurer 951b187b95 nir: Use nir_def_block in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746>
2025-08-24 14:03:10 +00:00
Konstantin Seurer 9df7b48d2f nir: Use nir_def_as_* in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746>
2025-08-24 14:03:09 +00:00
Daniel Schürmann 219c53e6fc aco/ra: don't clear lateKill operands in get_reg_create_vector()
Fixes: 08f088479a ('aco/ra: set late-kill for operands of temporary p_create_vector')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36871>
2025-08-22 15:25:04 +00:00
Marek Olšák 3aadae22ad nir: make nir_block::predecessors & dom_frontier sets non-malloc'd
We can just place the set structures inside nir_block.

This reduces the number of ralloc calls by 6.7% when compiling Heaven
shaders with radeonsi+ACO using a release build (i.e. not including
nir_validate set allocations, which are also removed).

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:48 +00:00
Georg Lehmann 639b91bb48 aco/isel: fix vectorized i2i16 with 8bit vec8 source
The extract index is in dwords, not bytes.

Fixes: 92d433c54a ("aco: vectorize conversions from 8bit to 16bit")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36869>
2025-08-20 10:13:22 +00:00
Daniel Schürmann 0546ecfadb aco/scheduler: small refactor of schedule_VMEM()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:12 +00:00
Daniel Schürmann 0c590eb903 aco/scheduler: schedule VMEM store clauses during the regular forward pass
Totals from 1456 (1.82% of 79839) affected shaders: (Navi48)

MaxWaves: 37780 -> 37128 (-1.73%); split: +0.15%, -1.87%
Instrs: 3788175 -> 3788435 (+0.01%); split: -0.04%, +0.04%
CodeSize: 20468648 -> 20467432 (-0.01%); split: -0.04%, +0.03%
VGPRs: 86820 -> 91440 (+5.32%); split: -0.10%, +5.42%
Latency: 26866232 -> 26858867 (-0.03%); split: -0.04%, +0.01%
InvThroughput: 3491741 -> 3828339 (+9.64%); split: -0.02%, +9.66%
VClause: 90413 -> 89426 (-1.09%); split: -1.27%, +0.18%
SClause: 130532 -> 130530 (-0.00%); split: -0.00%, +0.00%
Copies: 347397 -> 347806 (+0.12%); split: -0.11%, +0.23%
Branches: 117476 -> 117496 (+0.02%)
VALU: 1897427 -> 1897830 (+0.02%); split: -0.02%, +0.04%
SALU: 602365 -> 602379 (+0.00%)
VOPD: 1259 -> 1251 (-0.64%); split: +0.24%, -0.87%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:12 +00:00
Daniel Schürmann f601eb8555 aco/scheduler: move clauses as batch
Totals from 391 (0.49% of 79839) affected shaders:

Instrs: 612478 -> 612515 (+0.01%); split: -0.06%, +0.06%
CodeSize: 3342896 -> 3343228 (+0.01%); split: -0.04%, +0.05%
Latency: 6909794 -> 6909938 (+0.00%); split: -0.03%, +0.03%
VClause: 10752 -> 10167 (-5.44%); split: -5.46%, +0.02%
Copies: 26623 -> 26627 (+0.02%); split: -0.00%, +0.02%
VALU: 377494 -> 377499 (+0.00%); split: -0.00%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:12 +00:00
Daniel Schürmann 70f0c065e8 aco/scheduler: ignore potential SMEM stalls when forming clauses
Totals from 4190 (5.25% of 79839) affected shaders: (Navi48)

MaxWaves: 117020 -> 117014 (-0.01%)
Instrs: 4801892 -> 4801547 (-0.01%); split: -0.06%, +0.05%
CodeSize: 25327632 -> 25325500 (-0.01%); split: -0.05%, +0.04%
VGPRs: 236452 -> 236488 (+0.02%)
Latency: 30569070 -> 30539464 (-0.10%); split: -0.13%, +0.04%
InvThroughput: 4891650 -> 4891062 (-0.01%); split: -0.03%, +0.01%
VClause: 119615 -> 118763 (-0.71%); split: -1.02%, +0.31%
SClause: 100482 -> 100297 (-0.18%); split: -0.44%, +0.26%
Copies: 326644 -> 326756 (+0.03%); split: -0.19%, +0.22%
Branches: 98982 -> 98980 (-0.00%)
VALU: 2712397 -> 2712534 (+0.01%); split: -0.02%, +0.03%
SALU: 591836 -> 591817 (-0.00%); split: -0.00%, +0.00%
VOPD: 993 -> 987 (-0.60%); split: +0.20%, -0.81%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:11 +00:00
Daniel Schürmann d3a0f268b9 aco/scheduler: short-cut downwards_move_clause() when no movement is done
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:11 +00:00
Daniel Schürmann 8543b6cf2e aco/scheduler: remove DownwardsCursor::clause_demand
As we stop scheduling after forming clauses, this value
is not needed anymore.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:10 +00:00
Daniel Schürmann 5ae30deffb aco/scheduler: remove DownwardsCursor::insert_demand_clause
This partially reverts 93872270f0 ('aco/scheduler: keep track of RegisterDemand at DownwardsCursor::insert_idx{_clause}').

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:10 +00:00
Daniel Schürmann e95d728a98 aco/scheduler: split downwards_move_clause() from downwards_move()
We will do batched moves for clauses with the next commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:09 +00:00
Daniel Schürmann 37299a8d1a aco/scheduler: Stop downwards scheduling after encountering the first clause
Totals from 9899 (12.40% of 79839) affected shaders: (Navi48)

MaxWaves: 276355 -> 276317 (-0.01%); split: +0.01%, -0.02%
Instrs: 8781768 -> 8766504 (-0.17%); split: -0.25%, +0.07%
CodeSize: 46297556 -> 46236104 (-0.13%); split: -0.19%, +0.06%
VGPRs: 574680 -> 574800 (+0.02%); split: -0.00%, +0.03%
Latency: 54261324 -> 54357916 (+0.18%); split: -0.14%, +0.32%
InvThroughput: 9122700 -> 9121115 (-0.02%); split: -0.07%, +0.05%
VClause: 222062 -> 218499 (-1.60%); split: -2.33%, +0.73%
SClause: 167138 -> 163233 (-2.34%); split: -2.43%, +0.09%
Copies: 602395 -> 598560 (-0.64%); split: -1.21%, +0.57%
Branches: 161939 -> 161932 (-0.00%); split: -0.01%, +0.00%
VALU: 5063999 -> 5060199 (-0.08%); split: -0.14%, +0.07%
SALU: 988254 -> 988285 (+0.00%); split: -0.02%, +0.02%
VOPD: 2478 -> 2443 (-1.41%); split: +0.40%, -1.82%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:09 +00:00
Daniel Schürmann fb6b95517e aco/scheduler: check dependencies of entire clause upfront
and bail if any instruction of the clause can't be moved.

Totals from 4310 (5.40% of 79839) affected shaders:

MaxWaves: 115826 -> 115834 (+0.01%)
Instrs: 6256436 -> 6257599 (+0.02%); split: -0.05%, +0.07%
CodeSize: 32816488 -> 32820768 (+0.01%); split: -0.04%, +0.05%
VGPRs: 260184 -> 260172 (-0.00%)
Latency: 41207213 -> 41052150 (-0.38%); split: -0.45%, +0.07%
InvThroughput: 6822608 -> 6815208 (-0.11%); split: -0.14%, +0.03%
VClause: 148412 -> 147133 (-0.86%); split: -1.03%, +0.17%
SClause: 120854 -> 120856 (+0.00%); split: -0.01%, +0.01%
Copies: 425910 -> 427276 (+0.32%); split: -0.25%, +0.57%
VALU: 3572293 -> 3573647 (+0.04%); split: -0.03%, +0.07%
VOPD: 2803 -> 2816 (+0.46%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:08 +00:00