Commit Graph

3434 Commits

Author SHA1 Message Date
Samuel Pitoiset 22d73fc077 amd,radv,radeonsi: add ac_emit_spm_setup()
This moves all SPM emit code to common code. This likely also fixes
SPM on GFX11+ for RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37956>
2025-10-23 08:29:27 +00:00
Samuel Pitoiset 202f8db793 amd,radv,radeonsi: add ac_emit_cp_spi_config_cntl()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37956>
2025-10-23 08:29:27 +00:00
Samuel Pitoiset 5cb400a97b amd,radv,radeonsi: add ac_emit_cp_inhibit_clockgating()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37956>
2025-10-23 08:29:26 +00:00
Samuel Pitoiset bc1080e27f amd,radv,radeonsi: add and use more ac_cmdbuf_XXX helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37956>
2025-10-23 08:29:26 +00:00
Samuel Pitoiset 0fb21e2299 amd,radv: add ac_emit_cp_indirect_buffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37956>
2025-10-23 08:29:25 +00:00
Samuel Pitoiset 50ec03054c amd,radv,radeonsi: add ac_pm4_emit_commands()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37956>
2025-10-23 08:29:24 +00:00
Rhys Perry b18421ae3d amd/lower_mem_access_bit_sizes: fix shared access when bytes<bit_size/8
This can happen with (for example) 32x2 loads with
align_mul=4,align_offset=2.

This patch does bit_size=min(bit_size,bytes) to prevent num_components
from being 0.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 52cd5f7e69 ("ac/nir_lower_mem_access_bit_sizes: Split unsupported shared memory instructions")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>
2025-10-21 22:10:34 +00:00
Rhys Perry e89b22280f amd/lower_mem_access_bit_sizes: be more careful with 8/16-bit scratch load
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.3
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>
2025-10-21 22:10:34 +00:00
Rhys Perry 8829fc3bd6 amd/lower_mem_access_bit_sizes: improve subdword/unaligned SMEM lowering
Summary of changes:
- handle unaligned 16-bit scalar loads when supported_dword=true
- increases the size of 8/16/32/64-bit buffer loads which are not dword
  aligned, which can create less SMEM loads.
- handles when "bytes" is less than "bit_size / 8"

fossil-db (gfx1201):
Totals from 26 (0.03% of 79839) affected shaders:
Instrs: 12676 -> 12710 (+0.27%); split: -0.30%, +0.57%
CodeSize: 67272 -> 67384 (+0.17%); split: -0.24%, +0.40%
Latency: 44399 -> 44375 (-0.05%); split: -0.09%, +0.04%
SClause: 352 -> 344 (-2.27%)
SALU: 3972 -> 3992 (+0.50%)
SMEM: 554 -> 528 (-4.69%)

fossil-db (navi21):
Totals from 6 (0.01% of 79825) affected shaders:
Instrs: 2192 -> 2186 (-0.27%)
CodeSize: 12188 -> 12140 (-0.39%)
Latency: 10037 -> 10033 (-0.04%); split: -0.12%, +0.08%
SMEM: 124 -> 118 (-4.84%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: fbf0399517 ("amd/lower_mem_access_bit_sizes: lower all SMEM instructions to supported sizes")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>
2025-10-21 22:10:34 +00:00
Rhys Perry 79b2fa785d amd/lower_mem_access_bit_sizes: don't create subdword UBO loads with LLVM
These are unsupported.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14127
Fixes: fbf0399517 ("amd/lower_mem_access_bit_sizes: lower all SMEM instructions to supported sizes")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>
2025-10-21 22:10:33 +00:00
Samuel Pitoiset 7cd12e5c6a amd: move CP emit helpers to ac_cmdbuf_cp.c/h
Seems more organized this way.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37881>
2025-10-21 13:31:20 +02:00
Samuel Pitoiset e0ffc41d9a amd,radv: move SDMA utility helpers to common code
Only simple ones for now. Other functions need more rework.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37881>
2025-10-21 13:31:20 +02:00
Samuel Pitoiset 4989b6e6b9 amd,radv,radeonsi: add ac_emit_cp_write_data_{head}()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37881>
2025-10-21 13:31:20 +02:00
Samuel Pitoiset ed7f9df864 amd: add a predicate parameter to ac_emit_cp_copy_data()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37881>
2025-10-21 13:31:20 +02:00
Samuel Pitoiset 29c2d02d64 amd,radv,radeonsi: add ac_emit_cp_load_context_reg_index()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37881>
2025-10-21 13:31:20 +02:00
Samuel Pitoiset c7c237dd27 amd,radv,radeonsi: add ac_emit_cp_nop()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37881>
2025-10-21 13:31:13 +02:00
Samuel Pitoiset 5801986f53 amd: add missing _cp_ to some emit helpers
Just for consistency with other helpers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37881>
2025-10-21 13:30:34 +02:00
Samuel Pitoiset a0117b5e74 amd,radv: add ac_emit_cp_atomic_mem()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37881>
2025-10-21 13:30:34 +02:00
Georg Lehmann 9e41a7c139 treewide: use nir_load_global alias of nir_build_load_global
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37959>
2025-10-21 12:37:58 +02:00
Timur Kristóf d20049b430 ac/nir/ngg_mesh: Lower num_subgroups to constant
Mesh shader workgroups always have the same amount of subgroups.

When the API workgroup size is the same as the real workgroup
size, this is a small optimization (using a constant instead of
a shader arg).

When the API workgroup size is smaller than the real workgroup
size (eg. when the number of output vertices or primitves is
greater than the API workgroup size on RDNA 2), this fixes a
potential bug because num_subgroups would return the "real"
workgroup size instead of the API one.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37947>
2025-10-20 14:05:40 +00:00
Samuel Pitoiset abcaa46f6c amd,radv,radeonsi: add ac_cmdbuf_flush_vgt_streamout()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870>
2025-10-16 06:31:41 +00:00
Samuel Pitoiset 679332f9a9 amd,radv,radeonsi: add ac_emit_cp_acquire_mem()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870>
2025-10-16 06:31:40 +00:00
Samuel Pitoiset 9ad7fb8569 amd,radv,radeonsi: add ac_emit_cp_gfx_scratch()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870>
2025-10-16 06:31:40 +00:00
Samuel Pitoiset 9ff8e71b4e amd,radv,radeonsi: add ac_emit_cp_tess_rings()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870>
2025-10-16 06:31:39 +00:00
Samuel Pitoiset 47a64f5b6f amd,radv,radeonsi: add ac_emit_cp_gfx11_ge_rings()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870>
2025-10-16 06:31:38 +00:00
Samuel Pitoiset 044bafb6ac amd: add a predicate parameter to ac_emit_cp_pfp_sync_me()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870>
2025-10-16 06:31:36 +00:00
Samuel Pitoiset 48b4a43e8f amd,radv,radeonsi: add ac_emit_cp_set_predication()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37870>
2025-10-16 06:31:36 +00:00
Rhys Perry 3e9921f52e radv: only call radv_should_use_wgp_mode() once
This will let the compiler choose between CU and WGP mode.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37791>
2025-10-15 13:37:48 +01:00
Daniel Schürmann b2c44e3a65 amd: change radeon_info::lds_size_per_workgroup for GFX10+ to 64KB
Even though in WGP-mode, there is a total of 128KB LDS, a single workgroup
can access at most 64KB.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>
2025-10-15 11:20:09 +00:00
Daniel Schürmann eecd1c020d amd: keep ac_shader_config::lds_size unaligned
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>
2025-10-15 11:20:09 +00:00
Daniel Schürmann f99eba729d amd/common: remove radeon_info::lds_alloc_granularity and radeon_info::lds_encode_granularity
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>
2025-10-15 11:20:08 +00:00
Daniel Schürmann 6fd5766620 amd: add and use utility functions for LDS size encoding
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>
2025-10-15 11:20:08 +00:00
Daniel Schürmann b651234414 amd: change ac_shader_config::lds_size to bytes
We still keep it aligned to allocation granularity.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37577>
2025-10-15 11:20:07 +00:00
David Rosca 282e8285f1 ac/surface: Limit video modifiers to 64K_S also for VCN 2.2
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14032
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37819>
2025-10-15 10:24:29 +00:00
Samuel Pitoiset 88f53906d8 amd,radv,radeonsi: add ac_emit_cp_pfp_sync_me()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813>
2025-10-15 07:35:25 +00:00
Samuel Pitoiset 7ead034a06 amd,radv,radeonsi: add ac_emit_cp_copy_data()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813>
2025-10-15 07:35:25 +00:00
Samuel Pitoiset 0e358cec52 amd,radv,radeonsi: add ac_emit_cp_release_mem_pws()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813>
2025-10-15 07:35:24 +00:00
Samuel Pitoiset c45035ceb4 amd,radv,radeonsi: add ac_emit_cp_acquire_mem_pws()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813>
2025-10-15 07:35:23 +00:00
Samuel Pitoiset 6329e282b8 amd,radv,radeonsi: add ac_emit_cp_wait_mem()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813>
2025-10-15 07:35:23 +00:00
Samuel Pitoiset 82cbfc964a amd,radv: add ac_emit_write_data_imm()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813>
2025-10-15 07:35:20 +00:00
Samuel Pitoiset ac262c351f amd,radv: add ac_emit_cond_exec()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37813>
2025-10-15 07:35:20 +00:00
David Rosca 3e458d06b8 radeonsi/vcn: Support BT2020 matrix with EFC
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37058>
2025-10-15 06:06:44 +00:00
Daniel Schürmann d0b87a0d5f ac/nir_flag_smem_for_loads: call divergence analysis internally
Also don't flag more SMEM instructions (in ACO) after the last
call to ac_nir_lower_mem_access_bit_sizes().

Totals from 75 (0.09% of 79839) affected shaders: (Navi48)

Instrs: 191246 -> 189960 (-0.67%)
CodeSize: 996840 -> 985976 (-1.09%)
Latency: 3066184 -> 2945500 (-3.94%)
InvThroughput: 355373 -> 353106 (-0.64%); split: -0.66%, +0.02%
SClause: 4848 -> 4699 (-3.07%)
Copies: 13827 -> 13925 (+0.71%); split: -0.07%, +0.78%
Branches: 5176 -> 5003 (-3.34%)
PreSGPRs: 6222 -> 6272 (+0.80%)
VALU: 108934 -> 108993 (+0.05%); split: -0.00%, +0.06%
SALU: 31679 -> 31210 (-1.48%); split: -1.51%, +0.03%
SMEM: 7158 -> 6739 (-5.85%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>
2025-10-14 16:33:12 +00:00
Daniel Schürmann 9b1a635bb3 amd/common: merge radv_nir_opt_access_speculate() into ac_nir_flag_smem_for_loads()
One shader is negatively affected, but we save 2 entire iterations over every shader.
This effect is also mitigated with the next commits.

Totals from 1 (0.00% of 79839) affected shaders: (Navi48)

Instrs: 947 -> 958 (+1.16%)
CodeSize: 4728 -> 4732 (+0.08%)
Latency: 20678 -> 20723 (+0.22%)
InvThroughput: 2697 -> 2698 (+0.04%)
SClause: 26 -> 27 (+3.85%)
Copies: 139 -> 145 (+4.32%)
Branches: 46 -> 47 (+2.17%)
VALU: 460 -> 463 (+0.65%)
SALU: 201 -> 204 (+1.49%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>
2025-10-14 16:33:12 +00:00
Daniel Schürmann 8ff44f17ef amd/lower_mem_access_bit_sizes: also use SMEM for subdword loads
We can simply extract from the loaded dwords as per
nir_lower_mem_access_bit_sizes() lowering.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>
2025-10-14 16:33:11 +00:00
Daniel Schürmann fbf0399517 amd/lower_mem_access_bit_sizes: lower all SMEM instructions to supported sizes
This creates more SMEM instruction, mostly because vec3 64bit are being split
instead of overfetched.

Totals from 442 (0.55% of 79839) affected shaders: (Navi48)

Instrs: 288998 -> 289469 (+0.16%); split: -0.04%, +0.21%
CodeSize: 1538212 -> 1541460 (+0.21%); split: -0.03%, +0.24%
Latency: 3010072 -> 3009373 (-0.02%); split: -0.04%, +0.01%
InvThroughput: 885572 -> 885564 (-0.00%); split: -0.00%, +0.00%
VClause: 6900 -> 6885 (-0.22%); split: -0.28%, +0.06%
SClause: 4457 -> 4469 (+0.27%); split: -0.18%, +0.45%
VALU: 162473 -> 162469 (-0.00%)
SALU: 42871 -> 42855 (-0.04%); split: -0.05%, +0.01%
SMEM: 6893 -> 7239 (+5.02%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37843>
2025-10-14 16:33:11 +00:00
Samuel Pitoiset bc32286e5b radv: declare a new user SGPR for dynamic descriptors
To move them out of push constants.

fossils-db (GFX1201):
Totals from 20700 (25.99% of 79646) affected shaders:
Instrs: 14375624 -> 14370051 (-0.04%); split: -0.07%, +0.03%
CodeSize: 76746128 -> 76723772 (-0.03%); split: -0.05%, +0.02%
Latency: 74103586 -> 74113651 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 11908817 -> 11908798 (-0.00%); split: -0.00%, +0.00%
VClause: 249605 -> 249607 (+0.00%); split: -0.00%, +0.00%
SClause: 337914 -> 337772 (-0.04%); split: -0.08%, +0.04%
Copies: 843585 -> 839233 (-0.52%); split: -0.62%, +0.10%
PreSGPRs: 836283 -> 837260 (+0.12%)
SALU: 1790713 -> 1786374 (-0.24%); split: -0.29%, +0.05%

Co-authored-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37768>
2025-10-14 15:34:43 +00:00
David Rosca c941e57d74 ac/gfx10_format_table: Use new names for 422 subsampled formats
Fixes: f20ee2806e ("util/format: Add subsampling info to our YUV-as-RGB format names")
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37838>
2025-10-14 09:33:28 +00:00
Georg Lehmann e26a8be7af ac/nir: enable nir atomic load/store opts
Foz-DB GFX1201:
Totals from 4 (0.00% of 80287) affected shaders:
Instrs: 2928 -> 2920 (-0.27%); split: -0.31%, +0.03%
CodeSize: 15424 -> 15392 (-0.21%); split: -0.23%, +0.03%
Latency: 835578 -> 823220 (-1.48%)
InvThroughput: 3307941 -> 3258515 (-1.49%)
Copies: 459 -> 447 (-2.61%)
VALU: 1297 -> 1291 (-0.46%)
SALU: 595 -> 589 (-1.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37822>
2025-10-14 06:24:17 +00:00
Samuel Pitoiset ef900e93fc ac/surface: fix host image copies with stencil-only
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37748>
2025-10-10 13:46:51 +00:00