Commit Graph

18721 Commits

Author SHA1 Message Date
Natalie Vock d18b438832 aco: Add RegisterDemand::operator!=
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
David Rosca e1ea3f8bbf radv/video: Always use OBU_FRAME in AV1 encode
Saves couple bytes per frame.

Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37176>
2025-09-15 14:12:17 +00:00
Qiang Yu 310cfda034 ac/surface: add ac_compute_surface_modifier
Used by radeonsi to export existing texture modifier.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31658>
2025-09-15 09:39:19 +00:00
Qiang Yu 98cd68ec05 ac/surface: add radeonsi exported modifiers to supported list
radeonsi will export texture with these modifiers.

piglit tests:
spec@ext_image_dma_buf_import@ext_image_dma_buf_import-export-tex
spec@ext_image_dma_buf_import@ext_image_dma_buf_import-tex-modifier

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31658>
2025-09-15 09:39:19 +00:00
Qiang Yu 1b0ec56c40 ac/surface: refine supported modifier list for multi block size
Reference KMD convert_tiling_flags_to_modifier(). And we are
going to add 4K block size.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31658>
2025-09-15 09:39:19 +00:00
Georg Lehmann 5962659c85 radv: remove uses_rt from radv_shader_info
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:21 +00:00
Georg Lehmann a2d3cbac2a radv: determine subgroup/wave size early
This means we can actually implement varying subgroup size correctly.
It also means that we implement the implicit SPIR-V 1.6 full subgroups
requirement in compute shaders with cswave32/rtwave32.

In the future it will also allow more optimizations that use the subgroup size.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>

The only somewhat complex case here is GFX10 geometry shaders, if gewave32 is
used. We then only know the subgroup size when is_ngg is decided, as legacy
GS doesn't support wave32.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:21 +00:00
Georg Lehmann 76a502d75a ac/nir: set subgroup size for gs copy shader
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:21 +00:00
Georg Lehmann c4cdbee2e6 radv: remove unused ballot_bit_size from shader info
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:21 +00:00
Georg Lehmann 2cda56e8b7 ac/llvm: remove unused ballot size
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:20 +00:00
Georg Lehmann f83a6e6389 radv: add varying subgroup size to shader stage key
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:20 +00:00
Yonggang Luo 7db518cfe4 aco: Fixes warning: function get_branch_target/to_clrx_device_name defined but not used
../../src/amd/compiler/aco_print_asm.cpp:156:1: warning: 'bool aco::{anonymous}::get_branch_target(char**, aco::Program*, const std::vector<bool>&, char**)' defined but not used [-Wunused-function]
  156 | get_branch_target(char** output, Program* program, const std::vector<bool>& referenced_blocks,
      | ^~~~~~~~~~~~~~~~~
../../src/amd/compiler/aco_print_asm.cpp:105:1: warning: 'const char* aco::{anonymous}::to_clrx_device_name(amd_gfx_level, radeon_family)' defined but not used [-Wunused-function]
  105 | to_clrx_device_name(amd_gfx_level gfx_level, radeon_family family)
      | ^~~~~~~~~~~~~~~~~~~

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37289>
2025-09-13 08:23:07 +00:00
Timur Kristóf c183eb5bc8 radv: Flush L2 before CP DMA copy/fill when CP DMA doesn't use L2
In case the source or destination were previously written
through L2, we need to writeback L2 to avoid the CP DMA accessing
stale data.

However, as the CP DMA doesn't write L2 either, an invalidation
is also needed to make sure other clients don't access stale data
when they read it through L2 after the CP DMA is complete.

Doing an invalidation before the CP DMA operation should take
care of both.

Additionally, radv_src_access_flush also invalidates L2 before
the copied data can be read.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36820>
2025-09-13 05:18:28 +00:00
Timur Kristóf 8ef2f492e2 radv: Add comment to document CP DMA prefetch
Explain which caches (MALL, L2) the prefetch works with.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36820>
2025-09-13 05:18:28 +00:00
Val Packett 7f556805c1 radv: detect platform:virtio-mmio devices for virtgpu native context
VirtIO devices can be configured as platform devices instead of PCI,
which is especially common in microVM projects like libkrun.

Let's allow RADV to probe MMIO virtgpu devices.

Signed-off-by: Val Packett <val@invisiblethingslab.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37281>
2025-09-12 12:56:46 +00:00
Karol Herbst 4f94ca6c96 ac/llvm: fix get_global_address for global atomics
They do not have an ACCESS index.

Fixes: 9a33c03654 ("ac/llvm: port load_smem_amd behavior to load_global_amd")
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37292>
2025-09-12 08:55:39 +00:00
Samuel Pitoiset a52483d9e7 radv: fix capture/replay with sampler border color
The border color index must be captured and returned to the app,
otherwise it's broken if samplers aren't replayed in-order.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37291>
2025-09-12 06:51:51 +00:00
Samuel Pitoiset a658be8a24 radv: get NIR options after initializing the physical device cache key
Otherwise, radv_split_fma isn't considered.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13878
Fixes: 7304423b5c ("radv: mark RADV_DEBUG=splitfma as deprecated")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37293>
2025-09-12 06:18:26 +00:00
Eric Engestrom 6ecb7f11b7 radv/ci: document recent flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37302>
2025-09-11 16:02:38 +00:00
Samuel Pitoiset b00b8c763b radv: disable radv_disable_hiz_his_gfx12 for Mafia Definition Edition
This should be properly fixed now. Also remove this drirc option which
should no longer be needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36174>
2025-09-11 15:21:50 +00:00
Samuel Pitoiset ec09ac1501 radv: switch to the full HiZ workaround by default on GFX12
The full HiZ workaround is the only one that fixes the issue reliably.

Sadly, the performance results are mixed and sometimes it hurts. To
maintain performance, RADV will opt-in by selecting which workarounds
to apply from drirc.

As we can't know the full list of games that will be affected by a
potential performance regression, users are encouraged to try with
RADV_GFX12_HIZ_WA (see Mesa documentation for more explanations).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36174>
2025-09-11 15:21:50 +00:00
Samuel Pitoiset 5f8f4686bf radv: replace RADV_GFX12_HIZ_WA by a drirc option
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36174>
2025-09-11 15:21:50 +00:00
Samuel Pitoiset 0a2ef363a8 radv: report an message when RADV_GFX12_HIZ_WA value is invalid
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36174>
2025-09-11 15:21:50 +00:00
Samuel Pitoiset ec87f1338f radv: emit more push shader registers on GFX12
They are supposed to be slightly faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37256>
2025-09-11 06:47:40 +00:00
Samuel Pitoiset 9039f33a8d Revert "radv: handle fbfetch output after binding graphics shaders"
This is actually wrong because if radv_handle_fbfetch_output() triggers
a decompression pass and graphics shaders (ESO) are saved/restored
they won't be updated because radv_bind_graphics_shaders() was called
before.

This fixes a very recent regression that I noticed while implementing
a new extension.

This reverts commit 9b912f00c7.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37194>
2025-09-11 06:22:20 +00:00
Samuel Pitoiset b69b953973 radv: add RADV_DEBUG=bo_history
This dumps the BO history to /tmp/radv_bo_history.log after each BO
operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37062>
2025-09-11 06:03:15 +00:00
Samuel Pitoiset 1da270fb35 radv/amdgpu: add more helpers for managing virtual BOs
All these new helpers will make the SMEM PRT workaround better
organized.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37193>
2025-09-10 14:50:25 +00:00
Samuel Pitoiset 3c4168a3cc radv/amdgpu: return OOM device when BO mapping fails
It's more appropriate than VK_ERROR_UNKNOWN.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37193>
2025-09-10 14:50:24 +00:00
Konstantin Seurer 7c9e945460 radv,vulkan: Avoid a useless barrier in radv_update_bind_pipeline
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36982>
2025-09-10 08:35:50 +00:00
Konstantin Seurer a35dfab281 radv: Use vk_barrier_compute_w_to_compute_r more
vk_barrier_compute_w_to_compute_r shows up in rgp captures and is less
code.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36982>
2025-09-10 08:35:50 +00:00
Samuel Pitoiset c739d836f7 radv: exclude dynamic vertex input stride for the late scissor workaround
RADV_DYNAMIC_VERTEX_INPUT_BINDING_STRIDE doesn't emit any context
registers, so it can be excluded for the late scissor workaround to
avoid re-emitting scissors all the time it's dirty.

This fixes a performance regression noticed with Cyberpunk on Vega10,
but other games are likely affected too. The late scissor workaround is
only applied on Raven/Vega10.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13828
Fixes: d7f401c2bb ("radv: bind the vertex binding strides like a normal dynamic state")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37252>
2025-09-10 07:09:48 +00:00
abdelhadi 3a41644165 aco, radv: remove line duplicate
Signed-off-by: abdelhadi <abdelhadims@icloud.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37243>
2025-09-10 06:34:43 +00:00
Rhys Perry e2181744c2 aco/tests: add barrier-to-waitcnt tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 0f32b573a4 aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup lds barriers
fossil-db (navi21):
Totals from 36594 (45.84% of 79825) affected shaders:
Instrs: 19922581 -> 19922563 (-0.00%)
CodeSize: 103616980 -> 103616956 (-0.00%)
Latency: 69862064 -> 69053273 (-1.16%)
InvThroughput: 14607708 -> 14606308 (-0.01%); split: -0.01%, +0.00%

fossil-db (navi31):
Totals from 1641 (2.06% of 79825) affected shaders:
Instrs: 1247591 -> 1247875 (+0.02%); split: -0.00%, +0.03%
CodeSize: 6259516 -> 6260612 (+0.02%); split: -0.00%, +0.02%
Latency: 7657224 -> 7577299 (-1.04%); split: -1.05%, +0.00%
InvThroughput: 1150669 -> 1148171 (-0.22%); split: -0.22%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry ac882985c0 aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup vmem barriers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 145b178de2 aco: fix workgroup-scope barrier between vmem and lds
A barrier between two lds/vmem instructions needs to ensure that the
second starts after the first finishes, which means that we can't just
skip workgroup-scope vmem barriers if there is a lds instruction later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 02718fd4c5 aco: use a separate event for sendmsg_rtn
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 5812c2ea89 aco: update waitcnt events for exports
Include primitive, dual source blend and POS4 exports.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 711c023b55 aco: remove waitcnt code for POPS
We now insert barriers around these instead.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 005694fe1f aco: remove waitcnt code for SMEM stores
These were removed in GFX10.3 and we haven't used them in a while.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 20cd5cf5f7 aco: delay barrier waitcnt until they are needed
fossil-db (navi21):
Totals from 44 (0.06% of 79825) affected shaders:
Instrs: 16001 -> 15932 (-0.43%); split: -0.46%, +0.02%
CodeSize: 85800 -> 85548 (-0.29%); split: -0.30%, +0.01%
Latency: 190124 -> 173458 (-8.77%)
InvThroughput: 23605 -> 22756 (-3.60%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 843acfa50b aco: add a separate barrier_info for release/acquire barriers
These can wait for different sets of accesses.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 6c446c2f83 aco: refactor waitcnt pass to use barrier_info
Currently there's just barrier_info_all, but more will be added later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 21332609b9 aco: don't move acquire barriers before interlock begin
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 0ee1c137f9 aco: don't move release barriers after interlock end
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry 7c056dd473 aco: add is_atomic_or_control_instr helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry df6a3b7619 aco: reduce cost of using values defined in predecessors
For code like:
   if (cond) {
      val = load()
   }
   use(val)
The "use(val)" now has a similar cost to a use inside the IF.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Yonggang Luo 773a7f347a clang-format: Update the .clang-format files to conformance clang-format json-schema
The document is at
https://clang.llvm.org/docs/ClangFormatStyleOptions.html

The json-schema at
https://www.schemastore.org/clang-format.json

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37235>
2025-09-09 07:04:55 +00:00
Georg Lehmann 4143f0725a radv/nir/lower_cmat: clean up GFX11 ACC->B convert
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37213>
2025-09-09 06:08:55 +00:00
Georg Lehmann 5c0ebcdaef radv/nir/lower_cmat: clean up gfx12 transpose
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37213>
2025-09-09 06:08:55 +00:00