Commit Graph

15171 Commits

Author SHA1 Message Date
Daniel Schürmann
56ac6f26e0 aco/assembler: slightly refactor MTBUF assembly for more readability
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29692>
2024-06-12 11:41:58 +00:00
Daniel Schürmann
14f4906e53 aco/assembler: fix MTBUF opcode encoding on GFX11
We have accidentally set the tfe bit for some opcodes.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29692>
2024-06-12 11:41:58 +00:00
Leo Liu
448c716358 ac/surface/tests: add the test for ADDR3_256B_2D
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29541>
2024-06-11 12:29:11 -04:00
Leo Liu
59e813d953 ac/surface: add GFX12 256B tile mode for video
With VCN5, the DPB buffer uses gfx12 tile/swizzle mode.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29541>
2024-06-11 12:29:11 -04:00
Rhys Perry
7a4f121c5d aco: remove some missing label resets
In the case of:
   c = xor(a, b)
   d = not(c)
   xor(d, e)
it will be optimized to:
   d = xnor(a, b)
   xor(d, e)
because "d" would still had a label with "instr=not(c)", it would then be
further optimized to:
   d = xnor(a, b)
   xnor(c, e)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11309
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29650>
2024-06-11 09:30:16 +00:00
Samuel Pitoiset
51d1e005e8 radv: use the common SQTT implementation
I have verified the generated command stream using PM4 is similar to
the previous one on POLARIS10, VEGA10, NAVI21 and NAVI31.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29499>
2024-06-11 06:46:55 +00:00
Samuel Pitoiset
ea8f29b4a7 radv: emit more consecutive registers for SQTT on GFX8-9
This change is only useful to compare the command stream generated by
PM4 in the next commit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29499>
2024-06-11 06:46:55 +00:00
Samuel Pitoiset
a373ba92c3 amd: add a common implementation for SQTT using PM4
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29499>
2024-06-11 06:46:55 +00:00
Samuel Pitoiset
2fab42ad2e amd: mark more registers that need RESET_FILTER_CAM in PM4
For SQTT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29499>
2024-06-11 06:46:55 +00:00
Samuel Pitoiset
0c08673656 amd: allow to emit privileged config registers in PM4
For SQTT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29499>
2024-06-11 06:46:55 +00:00
Samuel Pitoiset
b82e5c8da8 ac,radv,radeonsi: add more parameters to ac_sqtt
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29499>
2024-06-11 06:46:55 +00:00
Samuel Pitoiset
155399d03b ac,radv: add a helper for SQTT control register
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29499>
2024-06-11 06:46:55 +00:00
Pierre-Eric Pelloux-Prayer
a7880f3edb radv/sqtt: use radeon_check_space before emit_spm_*
This fixes the following error on a rdna2:

   radeon_set_uconfig_reg_seq: Assertion `cs->cdw + 2 + num <= cs->reserved_dw' failed.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29499>
2024-06-11 06:46:55 +00:00
Samuel Pitoiset
a80a1c9838 radv: don't assume that TC_ACTION_ENA invalidates L1 cache on gfx9
Ported from RadeonSI 279315fd73 ("radeonsi: don't assume that
TC_ACTION_ENA invalidates L1 cache on gfx9")

Thanks to Rhys for noticing this by inspection.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29644>
2024-06-11 06:15:12 +00:00
Friedrich Vock
15f2c9c553 aco: Limit rt stages to 128 vgprs
Totals from 35472 (7.40% of 479373) affected shaders:

MaxWaves: 206239 -> 283776 (+37.60%)
Instrs: 193922210 -> 202721106 (+4.54%)
CodeSize: 1056819972 -> 1110833680 (+5.11%); split: -0.00%, +5.11%
VGPRs: 6026704 -> 4540416 (-24.66%)
SpillSGPRs: 23742 -> 25754 (+8.47%)
SpillVGPRs: 118897 -> 2295118 (+1830.34%)
Scratch: 7201792 -> 152752128 (+2021.03%)
Latency: 2713432565 -> 3194796286 (+17.74%); split: -0.20%, +17.94%
InvThroughput: 1052131232 -> 935049835 (-11.13%); split: -16.59%, +5.46%
VClause: 6972784 -> 8716721 (+25.01%); split: -0.02%, +25.03%
SClause: 4879313 -> 4852452 (-0.55%); split: -0.88%, +0.33%
Copies: 32782141 -> 35223995 (+7.45%)
Branches: 11075847 -> 11094087 (+0.16%); split: -0.00%, +0.17%
VALU: 118525960 -> 120929058 (+2.03%)
SALU: 33924572 -> 33973293 (+0.14%); split: -0.03%, +0.17%
VMEM: 12419116 -> 17104582 (+37.73%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29593>
2024-06-10 19:39:52 +00:00
Friedrich Vock
ec8512ce85 aco/spill: Don't spill phis with all-undef operands
Fixes some crashes when limiting RT stages to 128 VGPRs.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29593>
2024-06-10 19:39:52 +00:00
Samuel Pitoiset
128cca21c0 radv: pass a radv_shader to radv_get_compute_pipeline_metadata()
And rename to radv_get_compute_shader_metadata().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29652>
2024-06-10 17:31:12 +00:00
Ruijing Dong
8cd53d95fe radesonsi/vcn: update vcn4 tile processing logic
Vcn4 tile number calculation doesn't consider
some input case, which could result in output
bitstream corruption, this fixed the issue.

Not using the number_of_tiles directly but from
calculation. Also change to re-use some macros
from local vcn_5_0.c to ac_vcn_enc.h header file.

Updated vcn4 spec_misc_av1 ip package.

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29556>
2024-06-10 13:12:20 +00:00
Eric Engestrom
95ca41bef9 radv/ci: drop duplicate navi31-aco flakes line
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29631>
2024-06-09 22:31:30 +02:00
Eric Engestrom
ef0f926aff radv/ci: drop duplicate navi21-aco flakes line
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29631>
2024-06-09 22:31:23 +02:00
Eric Engestrom
f4f30ed826 radeonsi/ci: mark a bunch of tests as fixed on vangogh
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29631>
2024-06-09 22:30:36 +02:00
Georg Lehmann
05ca6e2478 amd/common: set COMPUTE_STATIC_THREAD_MGMT_SE2-3 correctly on gfx10-11
There is a hole between SE1 and SE2 occupied by COMPUTE_TMPRING_SIZE.

Fixes: 3c8b48e310 ("ac,radeonsi: add a function to initialize compute preambles")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29622>
2024-06-08 19:18:53 +00:00
Eric Engestrom
c02329ded1 ci: set a common B2C_JOB_SUCCESS_REGEX with the message that's printed for all jobs
Simpler code, and more reliable against serial corruption because that
message is printed 4 times (vs only once for the other ones).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29608>
2024-06-08 07:16:27 +00:00
Marek Olšák
dc113c418d ac/nir: import the dispatch logic for the universal compute clear/blit shader
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
6b15e45908 ac/nir: import the universal compute clear/blit shader
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
1becc6953c ac/nir: import the MSAA resolving pixel shader from radeonsi
It has a lot of options for efficiency.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
f96bbb64d6 radeonsi: add decision code to select when to use compute blit for performance
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>
2024-06-08 05:48:11 +00:00
Marek Olšák
1b9ce2625f ac/nir/lower_ngg: don't use gfx12 xfb defs outside their basic block on gfx11
Move the defs after nir_pop_if and phis and inside the gfx12 branch.

Fixes: 1ea96a47cd - ac/nir/lower_ngg: use voffset in global_atomic_add for xfb

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29564>
2024-06-08 00:11:18 -04:00
Marek Olšák
ea99c3fcb9 amd: update addrlib
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29564>
2024-06-08 00:11:17 -04:00
Marek Olšák
2ea3cb054b ac/surface: pass the correct addrlib handle to Addr3GetPossibleSwizzleModes
Fixes: d22564d29c - ac/surface: add gfx12

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29564>
2024-06-08 00:11:15 -04:00
Eric Engestrom
bd6ace73f3 radv/ci: document navi31 regression from !29235
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29596>
2024-06-07 20:57:01 +00:00
Vignesh Raman
cea3aeefd0 ci: add farm variable for devices in collabora farm
Add farm variable for devices in the collabora farm so that
the LAVA job submitter uses this in structured log files.

Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29583>
2024-06-07 14:32:49 +00:00
Rhys Perry
5297896856 aco: use ac_get_hw_cache_flags()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:43 +00:00
Rhys Perry
167b6cac45 ac: stop using radeon_info for ac_get_hw_cache_flags
This makes the function easier to use when radeon_info is not available.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:43 +00:00
Rhys Perry
00eccf524f aco: use GFX12 scope/temporal-hint
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry
b41f0f6cc1 aco: use ac_hw_cache_flags
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry
cdaf269924 aco: inline store_vmem_mubuf/emit_single_mubuf_store
Both of these are only used once.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry
185fa04baa aco/gfx6: set glc for buffer_store_byte/short
For the same reason we set it for image stores. GFX6 has a caching bug
which requires this.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Samuel Pitoiset
d4ccae739b radv: fix creating unlinked shaders with ESO when nextStage is 0
When nextStage is 0, the driver needs to assume that a stage might be
used with any valid next stages.

Fixes new dEQP-VK.shader_object.binding.*_no_next_stage.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29567>
2024-06-07 12:21:38 +00:00
Friedrich Vock
f1742d36f3 radv/rt: Fix memory leak when compiling libraries
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29579>
2024-06-06 21:56:11 +00:00
Daniel Schürmann
c452a4d1cc aco/ra: use round robin register allocation
Totals from 74681 (94.06% of 79395) affected shaders: (GFX11)

MaxWaves: 2265668 -> 2263546 (-0.09%); split: +0.01%, -0.10%
Instrs: 44941647 -> 44412809 (-1.18%); split: -1.23%, +0.05%
CodeSize: 234173852 -> 232009132 (-0.92%); split: -0.97%, +0.05%
VGPRs: 3033208 -> 3403000 (+12.19%); split: -0.02%, +12.22%
Latency: 305575738 -> 301100302 (-1.46%); split: -1.70%, +0.23%
InvThroughput: 49366070 -> 49020000 (-0.70%); split: -0.91%, +0.21%
VClause: 875748 -> 854930 (-2.38%); split: -2.65%, +0.27%
SClause: 1369614 -> 1327212 (-3.10%); split: -3.43%, +0.33%
Copies: 2887932 -> 2883061 (-0.17%); split: -1.93%, +1.76%
Branches: 885041 -> 885101 (+0.01%); split: -0.01%, +0.02%
VALU: 25218078 -> 25215170 (-0.01%); split: -0.20%, +0.19%
SALU: 4328640 -> 4326052 (-0.06%); split: -0.20%, +0.14%
VOPD: 9129 -> 9611 (+5.28%); split: +7.48%, -2.20%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
197943ae27 aco/ra: change heuristic to first fit
Totals from 73175 (92.17% of 79395) affected shaders: (GFX11)

MaxWaves: 2217690 -> 2217930 (+0.01%); split: +0.02%, -0.01%
Instrs: 44780731 -> 44784895 (+0.01%); split: -0.14%, +0.15%
CodeSize: 233238960 -> 233255604 (+0.01%); split: -0.11%, +0.12%
VGPRs: 3009116 -> 3007684 (-0.05%); split: -0.29%, +0.24%
Latency: 304320163 -> 304286592 (-0.01%); split: -0.31%, +0.30%
InvThroughput: 49121992 -> 49145025 (+0.05%); split: -0.20%, +0.25%
VClause: 872566 -> 873242 (+0.08%); split: -0.25%, +0.33%
SClause: 1359666 -> 1361640 (+0.15%); split: -0.11%, +0.26%
Copies: 2879649 -> 2881646 (+0.07%); split: -1.13%, +1.20%
Branches: 887102 -> 887093 (-0.00%); split: -0.01%, +0.01%
VALU: 25128240 -> 25128572 (+0.00%); split: -0.12%, +0.12%
SALU: 4328852 -> 4330559 (+0.04%); split: -0.07%, +0.11%
VOPD: 8861 -> 8992 (+1.48%); split: +2.63%, -1.15%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
d76fc005b6 aco/ra: re-use registers from killed operands
Totals from 77283 (97.34% of 79395) affected shaders: (GFX11)

MaxWaves: 2348498 -> 2348250 (-0.01%); split: +0.01%, -0.02%
Instrs: 45304558 -> 45097367 (-0.46%); split: -0.57%, +0.11%
CodeSize: 235719656 -> 234957768 (-0.32%); split: -0.43%, +0.11%
VGPRs: 3065984 -> 3073244 (+0.24%); split: -0.41%, +0.65%
Latency: 308010576 -> 307008565 (-0.33%); split: -0.85%, +0.52%
InvThroughput: 49560307 -> 49464214 (-0.19%); split: -0.54%, +0.34%
VClause: 881895 -> 879739 (-0.24%); split: -0.78%, +0.53%
SClause: 1388139 -> 1374634 (-0.97%); split: -1.12%, +0.14%
Copies: 2918583 -> 2910434 (-0.28%); split: -1.92%, +1.64%
Branches: 893947 -> 893712 (-0.03%); split: -0.06%, +0.03%
VALU: 25260728 -> 25256766 (-0.02%); split: -0.20%, +0.19%
SALU: 4377750 -> 4373595 (-0.09%); split: -0.17%, +0.07%
VOPD: 8603 -> 9163 (+6.51%); split: +8.54%, -2.03%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
b054cfe704 aco/ra: move can_write_m0() check into get_reg_specified()
This way, affinities are also covered.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
8e817cf52b aco/ra: refactor get_reg_simple() with increased stride.
This should avoid some redundant calls.

Totals from 153 (0.19% of 79395) affected shaders: (GFX11)

Instrs: 301717 -> 301687 (-0.01%); split: -0.06%, +0.05%
CodeSize: 1583080 -> 1582988 (-0.01%); split: -0.06%, +0.05%
VGPRs: 10068 -> 10348 (+2.78%)
Latency: 6685446 -> 6685475 (+0.00%); split: -0.11%, +0.11%
InvThroughput: 999241 -> 999316 (+0.01%); split: -0.01%, +0.02%
VClause: 3868 -> 3870 (+0.05%)
Copies: 23752 -> 23769 (+0.07%); split: -0.27%, +0.34%
Branches: 6479 -> 6480 (+0.02%)
VALU: 179290 -> 179307 (+0.01%); split: -0.04%, +0.04%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
1b0edf3f33 aco/ra: Fix array access when finding register for subdword variables
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
5326e033ff aco/ra: fix handling of killed operands in compact_relocate_vars()
Found by inspection.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:14 +00:00
Samuel Pitoiset
afa2070c99 radv: initialize compute preambles with the common helper
The PM4 mechanism can emit paired packets on GFX11+ when possible.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29452>
2024-06-06 20:26:47 +00:00
Samuel Pitoiset
3c8b48e310 ac,radeonsi: add a function to initialize compute preambles
Preambles are very similar between RADV and RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29452>
2024-06-06 20:26:47 +00:00
Samuel Pitoiset
428601095c ac,radeonsi import PM4 state from RadeonSI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29452>
2024-06-06 20:26:47 +00:00