Natalie Vock
d18b438832
aco: Add RegisterDemand::operator!=
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531 >
2025-09-15 17:16:20 +00:00
Yonggang Luo
7db518cfe4
aco: Fixes warning: function get_branch_target/to_clrx_device_name defined but not used
...
../../src/amd/compiler/aco_print_asm.cpp:156:1: warning: 'bool aco::{anonymous}::get_branch_target(char**, aco::Program*, const std::vector<bool>&, char**)' defined but not used [-Wunused-function]
156 | get_branch_target(char** output, Program* program, const std::vector<bool>& referenced_blocks,
| ^~~~~~~~~~~~~~~~~
../../src/amd/compiler/aco_print_asm.cpp:105:1: warning: 'const char* aco::{anonymous}::to_clrx_device_name(amd_gfx_level, radeon_family)' defined but not used [-Wunused-function]
105 | to_clrx_device_name(amd_gfx_level gfx_level, radeon_family family)
| ^~~~~~~~~~~~~~~~~~~
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37289 >
2025-09-13 08:23:07 +00:00
Rhys Perry
e2181744c2
aco/tests: add barrier-to-waitcnt tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
0f32b573a4
aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup lds barriers
...
fossil-db (navi21):
Totals from 36594 (45.84% of 79825) affected shaders:
Instrs: 19922581 -> 19922563 (-0.00%)
CodeSize: 103616980 -> 103616956 (-0.00%)
Latency: 69862064 -> 69053273 (-1.16%)
InvThroughput: 14607708 -> 14606308 (-0.01%); split: -0.01%, +0.00%
fossil-db (navi31):
Totals from 1641 (2.06% of 79825) affected shaders:
Instrs: 1247591 -> 1247875 (+0.02%); split: -0.00%, +0.03%
CodeSize: 6259516 -> 6260612 (+0.02%); split: -0.00%, +0.02%
Latency: 7657224 -> 7577299 (-1.04%); split: -1.05%, +0.00%
InvThroughput: 1150669 -> 1148171 (-0.22%); split: -0.22%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
ac882985c0
aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup vmem barriers
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
145b178de2
aco: fix workgroup-scope barrier between vmem and lds
...
A barrier between two lds/vmem instructions needs to ensure that the
second starts after the first finishes, which means that we can't just
skip workgroup-scope vmem barriers if there is a lds instruction later.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
02718fd4c5
aco: use a separate event for sendmsg_rtn
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
5812c2ea89
aco: update waitcnt events for exports
...
Include primitive, dual source blend and POS4 exports.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
711c023b55
aco: remove waitcnt code for POPS
...
We now insert barriers around these instead.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
005694fe1f
aco: remove waitcnt code for SMEM stores
...
These were removed in GFX10.3 and we haven't used them in a while.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
20cd5cf5f7
aco: delay barrier waitcnt until they are needed
...
fossil-db (navi21):
Totals from 44 (0.06% of 79825) affected shaders:
Instrs: 16001 -> 15932 (-0.43%); split: -0.46%, +0.02%
CodeSize: 85800 -> 85548 (-0.29%); split: -0.30%, +0.01%
Latency: 190124 -> 173458 (-8.77%)
InvThroughput: 23605 -> 22756 (-3.60%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
843acfa50b
aco: add a separate barrier_info for release/acquire barriers
...
These can wait for different sets of accesses.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
6c446c2f83
aco: refactor waitcnt pass to use barrier_info
...
Currently there's just barrier_info_all, but more will be added later.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
21332609b9
aco: don't move acquire barriers before interlock begin
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
0ee1c137f9
aco: don't move release barriers after interlock end
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
7c056dd473
aco: add is_atomic_or_control_instr helper
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Rhys Perry
df6a3b7619
aco: reduce cost of using values defined in predecessors
...
For code like:
if (cond) {
val = load()
}
use(val)
The "use(val)" now has a similar cost to a use inside the IF.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491 >
2025-09-09 12:34:40 +00:00
Yonggang Luo
773a7f347a
clang-format: Update the .clang-format files to conformance clang-format json-schema
...
The document is at
https://clang.llvm.org/docs/ClangFormatStyleOptions.html
The json-schema at
https://www.schemastore.org/clang-format.json
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37235 >
2025-09-09 07:04:55 +00:00
Rhys Perry
1105f7b98f
aco: fix signed integer overflow
...
Fix UBSan error:
runtime error: signed integer overflow: 2147483647 + 32 cannot be represented in type 'int'
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37055 >
2025-09-03 11:47:00 +00:00
Samuel Pitoiset
decf9af472
radv/rt: only use one user SGPR for the traversal shader addr
...
All shaders are allocated in the 32-bit addr space. To avoid an issue
with alignment, and also for future work, there is an unused user SGPR.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37133 >
2025-09-03 05:53:41 +00:00
Daniel Schürmann
441d5aab08
aco/ra: coalesce vector affinities with tied definitions
...
Totals from 19310 (24.19% of 79839) affected shaders: (Navi48)
MaxWaves: 564238 -> 564542 (+0.05%); split: +0.06%, -0.01%
Instrs: 10856428 -> 10803360 (-0.49%); split: -0.53%, +0.04%
CodeSize: 56405088 -> 56189384 (-0.38%); split: -0.41%, +0.02%
VGPRs: 986120 -> 985952 (-0.02%); split: -0.50%, +0.48%
Latency: 53956142 -> 53940850 (-0.03%); split: -0.11%, +0.09%
InvThroughput: 8769260 -> 8735595 (-0.38%); split: -0.49%, +0.11%
VClause: 237471 -> 237452 (-0.01%); split: -0.05%, +0.04%
SClause: 225385 -> 225389 (+0.00%)
Copies: 799792 -> 744150 (-6.96%); split: -7.25%, +0.30%
Branches: 208574 -> 208572 (-0.00%); split: -0.00%, +0.00%
VALU: 6116920 -> 6061448 (-0.91%); split: -0.95%, +0.04%
SALU: 1442068 -> 1441990 (-0.01%); split: -0.01%, +0.00%
VOPD: 1914 -> 1744 (-8.88%); split: +0.10%, -8.99%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851 >
2025-09-02 10:24:27 +00:00
Daniel Schürmann
2f303636f3
aco/ra: consider precolor affinities in get_reg_vector()
...
No fossil changes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851 >
2025-09-02 10:24:27 +00:00
Daniel Schürmann
6dbf8f7b90
aco/ra: don't set precolor affinities for already assigned temporaries
...
Also don't overwrite existing precolor affinities.
Totals from 248 (0.31% of 79839) affected shaders: (Navi48)
Instrs: 154427 -> 154401 (-0.02%); split: -0.12%, +0.10%
CodeSize: 812880 -> 812568 (-0.04%); split: -0.12%, +0.08%
VGPRs: 12432 -> 12408 (-0.19%)
Latency: 851623 -> 851801 (+0.02%); split: -0.03%, +0.05%
InvThroughput: 156569 -> 156581 (+0.01%); split: -0.04%, +0.05%
VClause: 2672 -> 2681 (+0.34%); split: -0.34%, +0.67%
Copies: 12645 -> 12660 (+0.12%); split: -0.53%, +0.65%
VALU: 82894 -> 82909 (+0.02%); split: -0.08%, +0.10%
SALU: 25406 -> 25424 (+0.07%); split: -0.07%, +0.14%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851 >
2025-09-02 10:24:26 +00:00
Daniel Schürmann
eb557fd090
aco/ra: add vector_info::index to indicate the Operand's index into the vector
...
This simplifies the code and will allow for a mismatch between the index and
the Operand's temporary.
Totals from 28 (0.04% of 79839) affected shaders: (Navi48)
Instrs: 18453 -> 18440 (-0.07%); split: -0.08%, +0.01%
CodeSize: 98588 -> 98532 (-0.06%); split: -0.06%, +0.00%
Copies: 1347 -> 1333 (-1.04%); split: -1.11%, +0.07%
VALU: 10431 -> 10417 (-0.13%); split: -0.14%, +0.01%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851 >
2025-09-02 10:24:26 +00:00
Marek Olšák
4c87d002e3
aco,radeonsi: expand 32-bit shader arg pointers to 64 bits for ACO
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101 >
2025-08-30 15:04:32 -04:00
Marek Olšák
7d5288b5b7
aco: check that global addresses are 64bit, apply_nuw_to_ssa to global_amd/smem
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101 >
2025-08-30 15:04:32 -04:00
Georg Lehmann
38e32e39a9
aco: never end wqm early for vmem
...
The remaining cases where disable_wqm isn't set are either uniform loads
or loads that influence control flow. In the first case, not ending WQM early
is free, and in the second case it's likely still better to not block scheduling.
Foz-DB GFX1201:
Totals from 483 (0.60% of 80287) affected shaders:
MaxWaves: 12654 -> 12642 (-0.09%)
Instrs: 485234 -> 484830 (-0.08%); split: -0.19%, +0.11%
CodeSize: 2630876 -> 2629184 (-0.06%); split: -0.15%, +0.08%
VGPRs: 29980 -> 30004 (+0.08%)
Latency: 4908015 -> 4813167 (-1.93%); split: -1.95%, +0.02%
InvThroughput: 751059 -> 748582 (-0.33%); split: -0.35%, +0.02%
VClause: 8723 -> 8705 (-0.21%); split: -0.30%, +0.09%
SClause: 11085 -> 10986 (-0.89%); split: -1.45%, +0.56%
Copies: 25155 -> 25183 (+0.11%); split: -0.26%, +0.37%
Branches: 6203 -> 6204 (+0.02%)
PreSGPRs: 23763 -> 23780 (+0.07%)
VALU: 296576 -> 296593 (+0.01%); split: -0.01%, +0.02%
SALU: 49095 -> 49416 (+0.65%); split: -0.04%, +0.69%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785 >
2025-08-28 06:29:04 +00:00
Georg Lehmann
3d190f2e9c
aco: implement skip_helpers for load_global_amd
...
Foz-DB GFX1201:
Totals from 119 (0.15% of 80287) affected shaders:
Instrs: 212449 -> 213452 (+0.47%)
CodeSize: 1120656 -> 1124708 (+0.36%)
Latency: 2854370 -> 2855772 (+0.05%); split: -0.02%, +0.07%
InvThroughput: 586142 -> 586210 (+0.01%); split: -0.00%, +0.01%
VClause: 3556 -> 3656 (+2.81%)
SClause: 2708 -> 2710 (+0.07%)
Copies: 14410 -> 14509 (+0.69%)
PreSGPRs: 6810 -> 6850 (+0.59%); split: -0.12%, +0.70%
VALU: 135945 -> 135942 (-0.00%); split: -0.01%, +0.01%
SALU: 22147 -> 23121 (+4.40%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785 >
2025-08-28 06:29:04 +00:00
Georg Lehmann
ee7069f875
aco: implement skip_helpers for load_scratch
...
Foz-DB GFX1201:
Totals from 2 (0.00% of 80287) affected shaders:
Instrs: 4016 -> 4054 (+0.95%)
CodeSize: 22104 -> 22256 (+0.69%)
Latency: 17123 -> 17129 (+0.04%)
Copies: 406 -> 415 (+2.22%)
SALU: 323 -> 353 (+9.29%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785 >
2025-08-28 06:29:04 +00:00
Georg Lehmann
2bfd8918a5
aco: implement skip_helpers for load_ssbo/ubo/constant
...
Foz-DB GFX1201:
Totals from 6676 (8.32% of 80287) affected shaders:
Instrs: 8786161 -> 8829091 (+0.49%); split: -0.01%, +0.50%
CodeSize: 47141800 -> 47320480 (+0.38%); split: -0.01%, +0.39%
VGPRs: 376624 -> 376600 (-0.01%)
SpillSGPRs: 1251 -> 1250 (-0.08%)
Latency: 99716626 -> 99642361 (-0.07%); split: -0.11%, +0.04%
InvThroughput: 14893179 -> 14898323 (+0.03%); split: -0.01%, +0.04%
VClause: 149425 -> 153539 (+2.75%); split: -0.04%, +2.79%
SClause: 251247 -> 251842 (+0.24%); split: -0.06%, +0.30%
Copies: 580304 -> 586424 (+1.05%); split: -0.21%, +1.26%
Branches: 163014 -> 163013 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 356548 -> 357109 (+0.16%); split: -0.18%, +0.33%
VALU: 5149733 -> 5149797 (+0.00%); split: -0.00%, +0.00%
SALU: 1082176 -> 1122718 (+3.75%); split: -0.06%, +3.80%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785 >
2025-08-28 06:29:03 +00:00
Georg Lehmann
bdae511b18
aco: implement skip_helpers for image loads
...
Foz-DB GFX1201:
Totals from 5 (0.01% of 80287) affected shaders:
Instrs: 1406 -> 1417 (+0.78%)
CodeSize: 8012 -> 8056 (+0.55%)
Latency: 7279 -> 7282 (+0.04%)
Copies: 84 -> 85 (+1.19%)
SALU: 170 -> 180 (+5.88%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785 >
2025-08-28 06:29:02 +00:00
Georg Lehmann
bf453a7c6a
aco/isel: add init_disable_wqm helper
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785 >
2025-08-28 06:29:01 +00:00
Georg Lehmann
635ac758c9
aco/optimizer: don't create undef copies from p_create_vector
...
p_create_vector allows undef operands, p_parallelcopy doesn't.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13765
Fixes: 01d20680e2 ("aco/optimizer: generalize p_create_vector of split vector opt")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36963 >
2025-08-25 16:47:38 +00:00
Georg Lehmann
8903bb4618
aco/optimizer: don't apply packed clamp to v_fma_mix
...
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13758
Fixes: 345bf8a2f2 ("aco/optimizer: remove label_vop3p")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36963 >
2025-08-25 16:47:38 +00:00
Georg Lehmann
791a57805c
aco: fix ra validation for flat/global/scratch/ds load sbyte_d16
...
Fixes: 18a53230eb ("aco: don't check dst_bitsize in apply_load_extract")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Dave Airlie <airlied@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36964 >
2025-08-25 10:09:16 +00:00
Konstantin Seurer
951b187b95
nir: Use nir_def_block in more places
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746 >
2025-08-24 14:03:10 +00:00
Konstantin Seurer
9df7b48d2f
nir: Use nir_def_as_* in more places
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746 >
2025-08-24 14:03:09 +00:00
Daniel Schürmann
219c53e6fc
aco/ra: don't clear lateKill operands in get_reg_create_vector()
...
Fixes: 08f088479a ('aco/ra: set late-kill for operands of temporary p_create_vector')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36871 >
2025-08-22 15:25:04 +00:00
Marek Olšák
3aadae22ad
nir: make nir_block::predecessors & dom_frontier sets non-malloc'd
...
We can just place the set structures inside nir_block.
This reduces the number of ralloc calls by 6.7% when compiling Heaven
shaders with radeonsi+ACO using a release build (i.e. not including
nir_validate set allocations, which are also removed).
Reviewed-by: Gert Wollny <gert.wollny@collabora.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728 >
2025-08-21 06:13:48 +00:00
Georg Lehmann
639b91bb48
aco/isel: fix vectorized i2i16 with 8bit vec8 source
...
The extract index is in dwords, not bytes.
Fixes: 92d433c54a ("aco: vectorize conversions from 8bit to 16bit")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36869 >
2025-08-20 10:13:22 +00:00
Daniel Schürmann
0546ecfadb
aco/scheduler: small refactor of schedule_VMEM()
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599 >
2025-08-19 16:59:12 +00:00
Daniel Schürmann
0c590eb903
aco/scheduler: schedule VMEM store clauses during the regular forward pass
...
Totals from 1456 (1.82% of 79839) affected shaders: (Navi48)
MaxWaves: 37780 -> 37128 (-1.73%); split: +0.15%, -1.87%
Instrs: 3788175 -> 3788435 (+0.01%); split: -0.04%, +0.04%
CodeSize: 20468648 -> 20467432 (-0.01%); split: -0.04%, +0.03%
VGPRs: 86820 -> 91440 (+5.32%); split: -0.10%, +5.42%
Latency: 26866232 -> 26858867 (-0.03%); split: -0.04%, +0.01%
InvThroughput: 3491741 -> 3828339 (+9.64%); split: -0.02%, +9.66%
VClause: 90413 -> 89426 (-1.09%); split: -1.27%, +0.18%
SClause: 130532 -> 130530 (-0.00%); split: -0.00%, +0.00%
Copies: 347397 -> 347806 (+0.12%); split: -0.11%, +0.23%
Branches: 117476 -> 117496 (+0.02%)
VALU: 1897427 -> 1897830 (+0.02%); split: -0.02%, +0.04%
SALU: 602365 -> 602379 (+0.00%)
VOPD: 1259 -> 1251 (-0.64%); split: +0.24%, -0.87%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599 >
2025-08-19 16:59:12 +00:00
Daniel Schürmann
f601eb8555
aco/scheduler: move clauses as batch
...
Totals from 391 (0.49% of 79839) affected shaders:
Instrs: 612478 -> 612515 (+0.01%); split: -0.06%, +0.06%
CodeSize: 3342896 -> 3343228 (+0.01%); split: -0.04%, +0.05%
Latency: 6909794 -> 6909938 (+0.00%); split: -0.03%, +0.03%
VClause: 10752 -> 10167 (-5.44%); split: -5.46%, +0.02%
Copies: 26623 -> 26627 (+0.02%); split: -0.00%, +0.02%
VALU: 377494 -> 377499 (+0.00%); split: -0.00%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599 >
2025-08-19 16:59:12 +00:00
Daniel Schürmann
70f0c065e8
aco/scheduler: ignore potential SMEM stalls when forming clauses
...
Totals from 4190 (5.25% of 79839) affected shaders: (Navi48)
MaxWaves: 117020 -> 117014 (-0.01%)
Instrs: 4801892 -> 4801547 (-0.01%); split: -0.06%, +0.05%
CodeSize: 25327632 -> 25325500 (-0.01%); split: -0.05%, +0.04%
VGPRs: 236452 -> 236488 (+0.02%)
Latency: 30569070 -> 30539464 (-0.10%); split: -0.13%, +0.04%
InvThroughput: 4891650 -> 4891062 (-0.01%); split: -0.03%, +0.01%
VClause: 119615 -> 118763 (-0.71%); split: -1.02%, +0.31%
SClause: 100482 -> 100297 (-0.18%); split: -0.44%, +0.26%
Copies: 326644 -> 326756 (+0.03%); split: -0.19%, +0.22%
Branches: 98982 -> 98980 (-0.00%)
VALU: 2712397 -> 2712534 (+0.01%); split: -0.02%, +0.03%
SALU: 591836 -> 591817 (-0.00%); split: -0.00%, +0.00%
VOPD: 993 -> 987 (-0.60%); split: +0.20%, -0.81%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599 >
2025-08-19 16:59:11 +00:00
Daniel Schürmann
d3a0f268b9
aco/scheduler: short-cut downwards_move_clause() when no movement is done
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599 >
2025-08-19 16:59:11 +00:00
Daniel Schürmann
8543b6cf2e
aco/scheduler: remove DownwardsCursor::clause_demand
...
As we stop scheduling after forming clauses, this value
is not needed anymore.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599 >
2025-08-19 16:59:10 +00:00
Daniel Schürmann
5ae30deffb
aco/scheduler: remove DownwardsCursor::insert_demand_clause
...
This partially reverts 93872270f0 ('aco/scheduler: keep track of RegisterDemand at DownwardsCursor::insert_idx{_clause}').
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599 >
2025-08-19 16:59:10 +00:00
Daniel Schürmann
e95d728a98
aco/scheduler: split downwards_move_clause() from downwards_move()
...
We will do batched moves for clauses with the next commit.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599 >
2025-08-19 16:59:09 +00:00
Daniel Schürmann
37299a8d1a
aco/scheduler: Stop downwards scheduling after encountering the first clause
...
Totals from 9899 (12.40% of 79839) affected shaders: (Navi48)
MaxWaves: 276355 -> 276317 (-0.01%); split: +0.01%, -0.02%
Instrs: 8781768 -> 8766504 (-0.17%); split: -0.25%, +0.07%
CodeSize: 46297556 -> 46236104 (-0.13%); split: -0.19%, +0.06%
VGPRs: 574680 -> 574800 (+0.02%); split: -0.00%, +0.03%
Latency: 54261324 -> 54357916 (+0.18%); split: -0.14%, +0.32%
InvThroughput: 9122700 -> 9121115 (-0.02%); split: -0.07%, +0.05%
VClause: 222062 -> 218499 (-1.60%); split: -2.33%, +0.73%
SClause: 167138 -> 163233 (-2.34%); split: -2.43%, +0.09%
Copies: 602395 -> 598560 (-0.64%); split: -1.21%, +0.57%
Branches: 161939 -> 161932 (-0.00%); split: -0.01%, +0.00%
VALU: 5063999 -> 5060199 (-0.08%); split: -0.14%, +0.07%
SALU: 988254 -> 988285 (+0.00%); split: -0.02%, +0.02%
VOPD: 2478 -> 2443 (-1.41%); split: +0.40%, -1.82%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599 >
2025-08-19 16:59:09 +00:00
Daniel Schürmann
fb6b95517e
aco/scheduler: check dependencies of entire clause upfront
...
and bail if any instruction of the clause can't be moved.
Totals from 4310 (5.40% of 79839) affected shaders:
MaxWaves: 115826 -> 115834 (+0.01%)
Instrs: 6256436 -> 6257599 (+0.02%); split: -0.05%, +0.07%
CodeSize: 32816488 -> 32820768 (+0.01%); split: -0.04%, +0.05%
VGPRs: 260184 -> 260172 (-0.00%)
Latency: 41207213 -> 41052150 (-0.38%); split: -0.45%, +0.07%
InvThroughput: 6822608 -> 6815208 (-0.11%); split: -0.14%, +0.03%
VClause: 148412 -> 147133 (-0.86%); split: -1.03%, +0.17%
SClause: 120854 -> 120856 (+0.00%); split: -0.01%, +0.01%
Copies: 425910 -> 427276 (+0.32%); split: -0.25%, +0.57%
VALU: 3572293 -> 3573647 (+0.04%); split: -0.03%, +0.07%
VOPD: 2803 -> 2816 (+0.46%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599 >
2025-08-19 16:59:08 +00:00