Lionel Landwerlin
53eed61a90
intel: make sure intel_wa.h can be included by opencl code
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Ivan Briano <ivan.briano@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32059 >
2024-11-12 22:48:39 +00:00
Lionel Landwerlin
672d41d22a
anv: split generated draw flags from mocs/dword-count
...
We'll add more flags.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Ivan Briano <ivan.briano@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32059 >
2024-11-12 22:48:39 +00:00
Lionel Landwerlin
d6acb56f11
anv: update shader descriptor resource limits
...
Some limits were still stuck at the old binding table limits. Those don't
apply anymore since EXT_descriptor_indexing was implemented.
Fixes: 6e230d7607 ("anv: Implement VK_EXT_descriptor_indexing")
Fixes: 96c33fb027 ("anv: enable direct descriptors on platforms with extended bindless offset")
Reviewed-by: Ivan Briano <ivan.briano@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31999 >
2024-11-12 22:01:52 +00:00
Gurchetan Singh
1794ff7309
gfxstream: use canonical Mesa dependencies
...
drm_dep -> dep_libdrm, essentially.
Reviewed-by: Aaron Ruby <aruby@blackberry.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32062 >
2024-11-12 19:21:10 +00:00
Gurchetan Singh
5e9c14395d
gfxstream: guest: use internal version of AEMU headers + impls
...
This removes the dependency of gfxstream guest vulkan on
libaemu-v0.1.2.
ALSO:
find ./ -type f -exec sed -i -e 's/android::base/gfxstream::aemu/g' {} \;
Reviewed-by: Aaron Ruby <aruby@blackberry.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32062 >
2024-11-12 19:21:10 +00:00
Gurchetan Singh
a8c1021d79
gfxstream: modify libaemu for Mesa use case
...
- Modifications to directory paths.
- saveStringArray moved to Stream.h/Stream.cpp to avoid
importing StreamSerializing
- C++ include guards
- Namespace changes
find ./ -type f -exec sed -i -e 's/namespace android/namespace gfxstream/g' {} \;
Reviewed-by: Aaron Ruby <aruby@blackberry.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32062 >
2024-11-12 19:21:10 +00:00
Gurchetan Singh
43e378c537
gfxstream: aemu: vendor it
...
This imports certain files from libaemu into gfxstream
guest.
Some are quite specific to gfxstream (Stream, ring_buffer) and others
we expect to Mesa-ify with time (AlignedBuf, Allocator) [probably
while keeping some C++ interface].
The main benefit of importing is easier refactoring and packaging.
Reviewed-by: Aaron Ruby <aruby@blackberry.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32062 >
2024-11-12 19:21:10 +00:00
Gurchetan Singh
c7decb61ee
gfxstream: nuke EntityManager.h include
...
This include is not actually used.
Reviewed-by: Aaron Ruby <aruby@blackberry.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32062 >
2024-11-12 19:21:10 +00:00
Georg Lehmann
8f094a7762
nir: handle fmul(a,a)/ffma(a,a,b) in nir_def_all_uses_ignore_sign_bit
...
Foz-DB Navi31:
Totals from 436 (0.55% of 79395) affected shaders:
Instrs: 808917 -> 805868 (-0.38%)
CodeSize: 4269056 -> 4246512 (-0.53%)
Latency: 5827077 -> 5819815 (-0.12%); split: -0.13%, +0.00%
InvThroughput: 625482 -> 622959 (-0.40%); split: -0.41%, +0.00%
SClause: 21797 -> 21756 (-0.19%); split: -0.23%, +0.04%
Copies: 48502 -> 48505 (+0.01%); split: -0.04%, +0.05%
VALU: 481686 -> 479074 (-0.54%); split: -0.54%, +0.00%
SALU: 76699 -> 76700 (+0.00%)
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31844 >
2024-11-12 18:03:57 +00:00
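The reasoning behind 8f094a7762 (fmul(a,a) and ffma(a,a,b) do not care about the sign bit of a) can be checked with a few lines of C. This is only an illustration of the arithmetic, not NIR code: the helper names are hypothetical, and the multiply-add is written as a plain `a * a + b`, which may round differently from a fused ffma.

```c
#include <stdbool.h>

/* fmul(a, a): squaring discards the sign of a. */
static float square(float a)
{
    return a * a;
}

/* Stand-in for ffma(a, a, b); written unfused here (assumption). */
static float square_add(float a, float b)
{
    return a * a + b;
}

/* True when negating a leaves both results unchanged (non-NaN inputs). */
static bool sign_is_ignored(float a, float b)
{
    return square(a) == square(-a) &&
           square_add(a, b) == square_add(-a, b);
}
```

Because the result is the same for a and -a, an fneg or fabs feeding both operands can be dropped when every use of the value is of this kind.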
Georg Lehmann
7e8a08ae77
aco: use nir_def_all_uses_ignore_sign_bit
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31844 >
2024-11-12 18:03:57 +00:00
Georg Lehmann
7d5db1ee52
pan/bi: use nir_def_all_uses_ignore_sign_bit
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31844 >
2024-11-12 18:03:57 +00:00
Georg Lehmann
34f41abe24
nir: add nir_def_all_uses_ignore_sign_bit
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31844 >
2024-11-12 18:03:57 +00:00
Samuel Pitoiset
44fa24580f
radv: optimize the pipe misaligned L2 cache invalidation on GFX11
...
When using the subresource range, it's possible to reduce the number
of L2 cache invalidations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921 >
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
7a3a65c0c4
radv: pass the image subresource range to radv_{src,dst}_access_flush()
...
This will allow us to optimize the pipe misaligned special case for
GFX11 because only the first mip in the mip-tail needs the L2 cache
invalidation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921 >
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
f7a39fac10
radv: use vk_image_view_subresource_range() when possible
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921 >
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
7a8b725d03
radv: determine the first mip that is pipe misaligned on GFX10+
...
This will allow us to optimize the GFX11 case where not all mips are
affected by the L2 invalidation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921 >
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
c5d5f2fbef
radv: move the GFX11 special case for mips to radv_image_is_pipe_misaligned()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921 >
2024-11-12 17:27:39 +00:00
Samuel Pitoiset
65bb39bf96
radv: do not always invalidate L2 for GPUs with non-coherent RBs on GFX10+
...
According to PAL, L2 should be invalidated only for images with
DCC/HTILE even on GPUs with non-coherent RBs. In practice, most
images have DCC or HTILE, but this can reduce the number of L2
flushes for images without any compression.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31921 >
2024-11-12 17:27:39 +00:00
Boris Brezillon
eff8a3517d
panvk: Enable CI on G610
...
The number of failures/crashes/flakes is still considerable, but the
goal is to catch regressions when fixing bugs or adding features, so
let's enable CI on G610 anyway.
We might decide to turn g610-vk into a post-merge job if CI on G610
is too unstable.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31524 >
2024-11-12 16:46:47 +00:00
Samuel Pitoiset
5e0b81413d
radv: emit nir_debug_break instructions when the trap handler is enabled
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32061 >
2024-11-12 16:05:17 +00:00
Samuel Pitoiset
2d5df46c25
aco: emit nir_intrinsic_debug_break
...
s_trap is used to enter the trap.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32061 >
2024-11-12 16:05:17 +00:00
Samuel Pitoiset
b6c72b3717
spirv: handle NonSemantic.DebugBreak to emit nir_debug_break()
...
NonSemantic SPIR-V allows declaring extended instructions.
NonSemantic.DebugBreak is one such instruction; it emits a breakpoint.
See https://htmlpreview.github.io/?https://github.com/KhronosGroup/SPIRV-Registry/blob/main/nonsemantic/NonSemantic.DebugBreak.html
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32061 >
2024-11-12 16:05:17 +00:00
Samuel Pitoiset
a85f0143e0
nir: add nir_intrinsic_debug_break instruction
...
This instruction can be used as a breakpoint in shaders to enter a
trap if supported by the driver. It will be used to handle
NonSemantic.DebugBreak in SPIR-V.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32061 >
2024-11-12 16:05:17 +00:00
Jose Maria Casanova Crespo
5b951bcdd7
v3d: Enable Early-Z with discards when depth updates are disabled
...
The Early-Z optimization is disabled when there is a discard
instruction in the shader used in the draw call.
But if discard is the only reason to disable Early-Z, and at
draw call time depth updates are disabled, we
can enable Early-Z using a shader variant.
If there are occlusion queries active we also need to disable
the Early-Z optimization.
So this patch enables Early-Z in this scenario.
The performance improvement is significant when running the gfxbench
benchmarks, with an average improvement of 11.15%:
fps_avg helped: gl_gfxbench_aztec_high.trace: 3.13 -> 3.73 (19.13%)
fps_avg helped: gl_gfxbench_aztec.trace: 4.82 -> 5.68 (17.88%)
fps_avg helped: gl_gfxbench_manhattan31.trace: 5.10 -> 6.00 (17.59%)
fps_avg helped: gl_gfxbench_manhattan.trace: 7.24 -> 8.36 (15.52%)
fps_avg helped: gl_gfxbench_trex.trace: 19.25 -> 20.17 ( 4.81%)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32028 >
2024-11-12 13:26:38 +00:00
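The condition described in 5b951bcdd7 can be written as a small predicate. This is a rough sketch, not the v3d code: the struct and field names are hypothetical, and it assumes occlusion queries only matter on the discard path, which the commit message does not state either way.

```c
#include <stdbool.h>

/* Hypothetical per-draw state; field names are illustrative only. */
struct draw_state {
    bool shader_uses_discard;
    bool other_early_z_blocker;   /* any reason other than discard */
    bool depth_updates_enabled;
    bool occlusion_query_active;
};

static bool can_use_early_z(const struct draw_state *d)
{
    if (d->other_early_z_blocker)
        return false;
    if (!d->shader_uses_discard)
        return true;
    /* Discard is the only blocker: Early-Z is still usable when depth
     * is not written and no occlusion query is counting samples. */
    return !d->depth_updates_enabled && !d->occlusion_query_active;
}
```

Since depth updates are only known at draw time, a driver following this scheme would pick between two compiled variants of the same shader based on the predicate.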
Sagar Ghuge
fef8490eb9
anv: Enable MCS_CCS compression on Gfx12+
...
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com >
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32009 >
2024-11-12 12:27:21 +00:00
Karmjit Mahil
2a7df331af
nir: Fix no_lower_set leak on early return
...
Addresses:
```
Indirect leak of 256 byte(s) in 2 object(s) allocated from:
#0 0x7faaf53ee0 in __interceptor_malloc
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x7fa8cfe900 in ralloc_size ../src/util/ralloc.c:118
#2 0x7fa8cfeb20 in rzalloc_size ../src/util/ralloc.c:152
#3 0x7fa8cff004 in rzalloc_array_size ../src/util/ralloc.c:232
#4 0x7fa8d06a84 in _mesa_set_init ../src/util/set.c:133
#5 0x7fa8d06bcc in _mesa_set_create ../src/util/set.c:152
#6 0x7fa8d0939c in _mesa_pointer_set_create ../src/util/set.c:613
#7 0x7fa95e5790 in nir_lower_mediump_vars
../src/compiler/nir/nir_lower_mediump.c:574
#8 0x7fa862c1c8 in tu_spirv_to_nir(tu_device*, void*, unsigned long,
VkPipelineShaderStageCreateInfo const*, tu_shader_key const*,
pipe_shader_type) ../src/freedreno/vulkan/tu_shader.cc:116
#9 0x7fa8646f24 in tu_compile_shaders(tu_device*, unsigned long,
VkPipelineShaderStageCreateInfo const**, nir_shader**,
tu_shader_key const*, tu_pipeline_layout*, unsigned char const*,
tu_shader**, char**, void*, nir_shader**, VkPipelineCreationFeedback*)
../src/freedreno/vulkan/tu_shader.cc:2741
#10 0x7fa85a16a4 in tu_pipeline_builder_compile_shaders
../src/freedreno/vulkan/tu_pipeline.cc:1887
#11 0x7fa85eb844 in tu_pipeline_builder_build<(chip)7>
../src/freedreno/vulkan/tu_pipeline.cc:3923
#12 0x7fa85e6bd8 in tu_graphics_pipeline_create<(chip)7>
../src/freedreno/vulkan/tu_pipeline.cc:4203
#13 0x7fa85c2588 in VkResult
tu_CreateGraphicsPipelines<(chip)7>(VkDevice_T*,
VkPipelineCache_T*, unsigned int, VkGraphicsPipelineCreateInfo const*,
VkAllocationCallbacks const*, VkPipeline_T**)
../src/freedreno/vulkan/tu_pipeline.cc:4234
```
seen in:
dEQP-VK.binding_model.mutable_descriptor.single.switches.uniform_texel_buffer_storage_image.update_write.no_source.no_source.pool_expand_types.pre_update.no_array.vert
Fixes: 7e986e5f04 ("nir/lower_mediump_vars: Don't lower mediump shared vars with atomic access.")
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32057 >
2024-11-12 11:48:11 +00:00
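The class of bug fixed in 2a7df331af (a scratch set leaked when a pass returns early) is commonly avoided by funnelling every exit through one cleanup label. The sketch below shows that pattern with plain malloc/free; all names are hypothetical and it does not reproduce the actual change to nir_lower_mediump_vars.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Counter used only to make leaks observable in this sketch. */
static int live_allocations = 0;

static void *scratch_create(void)
{
    void *p = malloc(64);
    if (p)
        live_allocations++;
    return p;
}

static void scratch_destroy(void *p)
{
    if (p) {
        free(p);
        live_allocations--;
    }
}

/* Hypothetical pass: returns whether it made progress. */
static bool run_pass(bool nothing_to_do)
{
    bool progress = false;
    void *scratch = scratch_create();
    if (!scratch)
        return false;

    if (nothing_to_do)
        goto out;          /* early exit still reaches the cleanup */

    progress = true;       /* real work would happen here */

out:
    scratch_destroy(scratch);
    return progress;
}
```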
Karmjit Mahil
c923eff742
tu: Fix potential alloc of 0 size
...
We can end up calling vk_multialloc_alloc with 0 size when
`attachment_count` is 0 and `clearValueCount` is 0.
Addresses:
```
Direct leak of 1 byte(s) in 1 object(s) allocated from:
#0 0x7faf033ee0 in __interceptor_malloc
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x7fada5cc10 in vk_default_alloc ../src/vulkan/util/vk_alloc.c:26
#2 0x7fac50b270 in vk_alloc ../src/vulkan/util/vk_alloc.h:48
#3 0x7fac555040 in vk_multialloc_alloc
../src/vulkan/util/vk_alloc.h:234
#4 0x7fac555040 in void
tu_CmdBeginRenderPass2<(chip)7>(VkCommandBuffer_T*,
VkRenderPassBeginInfo const*, VkSubpassBeginInfo const*)
../src/freedreno/vulkan/tu_cmd_buffer.cc:4634
#5 0x7fac900760 in vk_common_CmdBeginRenderPass
../src/vulkan/runtime/vk_render_pass.c:261
```
seen in:
dEQP-VK.robustness.robustness2.bind.notemplate.r32i.dontunroll.nonvolatile.uniform_texel_buffer.no_fmt_qual.len_252.samples_1.1d.frag
Fixes: 4cfd021e3f ("turnip: Save the renderpass's clear values in the cmdbuf state.")
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32057 >
2024-11-12 11:48:11 +00:00
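The situation in c923eff742 (both counts are zero, so the combined allocation has size 0) can be guarded against by skipping the allocation entirely. A minimal sketch using malloc rather than vk_multialloc; the struct, function names and element sizes are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical container for per-render-pass host data. */
struct clear_state {
    void *mem;
    size_t size;
};

/* Returns 0 on success, -1 on allocation failure. */
static int clear_state_init(struct clear_state *s,
                            size_t attachment_count,
                            size_t clear_value_count)
{
    s->mem = NULL;
    s->size = attachment_count * sizeof(int) +
              clear_value_count * sizeof(float[4]);

    if (s->size == 0)
        return 0;          /* nothing to allocate, nothing to leak */

    s->mem = malloc(s->size);
    return s->mem ? 0 : -1;
}

static void clear_state_finish(struct clear_state *s)
{
    free(s->mem);          /* free(NULL) is a no-op */
    s->mem = NULL;
    s->size = 0;
}
```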
Karmjit Mahil
53c2d5e426
tu: Fix push_set host memory leak on command buffer reset
...
Addresses:
```
Direct leak of 192 byte(s) in 1 object(s) allocated from:
#0 0x7fbe5e4230 in __interceptor_realloc
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:164
#1 0x7fbd008bf4 in vk_default_realloc
../src/vulkan/util/vk_alloc.c:37
#2 0x7fbbabb2fc in vk_realloc ../src/vulkan/util/vk_alloc.h:70
#3 0x7fbbaead38 in tu_push_descriptor_set_update_layout
../src/freedreno/vulkan/tu_cmd_buffer.cc:3173
#4 0x7fbbaeb0b4 in tu_push_descriptor_set
../src/freedreno/vulkan/tu_cmd_buffer.cc:3203
#5 0x7fbbaeb500 in tu_CmdPushDescriptorSet2KHR(VkCommandBuffer_T*,
VkPushDescriptorSetInfoKHR const*)
../src/freedreno/vulkan/tu_cmd_buffer.cc:3235
#6 0x7fbbe35c80 in vk_common_CmdPushDescriptorSetKHR
../src/vulkan/runtime/vk_command_buffer.c:300
```
seen in:
dEQP-VK.binding_model.shader_access.secondary_cmd_buf.bind.with_push.sampler_mutable.tess_eval.multiple_discontiguous_descriptors.1d_array
Fixes: 03294e1dd1 ("turnip: Keep a host copy of push descriptor sets.")
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32057 >
2024-11-12 11:48:11 +00:00
Samuel Pitoiset
5f79b8ea2d
radv,aco: save/restore overwritten VGPRs in the trap handler shader
...
The trap currently doesn't return to the shader, but returning will be
needed, for example, for the debug mode.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
ccde8ecd64
radv: compute the TMA BO size instead of using a constant
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
3e88f996a5
radv: fix the TMA descriptor size
...
The TMA BO contains the descriptor first.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
6ec0c85908
radv,aco: use the trap handler layout struct while compiling the shader
...
It's less error prone to rely on the layout for offsets.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
6bfd92123f
aco: simplify postprocessing the trap handler shader
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
44dfeb4479
radv,aco: add a separate function to compile the trap handler shader
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
62e335c779
radv,aco: dump more SQ_WAVE regs from the trap handler
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Samuel Pitoiset
0cc21d0601
radv: cleanup printing SGPRS dumped from the trap handler
...
It's more readable like that.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32056 >
2024-11-12 11:16:13 +00:00
Georg Lehmann
ee74b090db
nir/opt_16bit_tex_image: optimize extract half sources
...
I also tried extract_i16/u16, but that causes a lot of regressions.
Foz-DB Navi21:
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 367 -> 355 (-3.27%)
CodeSize: 2156 -> 2136 (-0.93%)
VGPRs: 80 -> 72 (-10.00%)
Latency: 3163 -> 3153 (-0.32%); split: -0.51%, +0.19%
InvThroughput: 424 -> 404 (-4.72%)
Copies: 31 -> 42 (+35.48%); split: -3.23%, +38.71%
PreVGPRs: 27 -> 25 (-7.41%)
VALU: 208 -> 196 (-5.77%)
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32058 >
2024-11-12 10:19:40 +00:00
Mary Guillemard
bad38c1e76
panvk: Implement global priority extensions
...
Wire up with common kmod code.
On JM, this is a no-op implementation only allowing medium priority.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Chia-I Wu <olvaffe@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31961 >
2024-11-12 08:46:22 +00:00
Mary Guillemard
e2c81380a9
pan/kmod: Expose medium priority on panfrost
...
Panfrost currently doesn't support priorities, so assume the default priority
is medium to properly support global priorities on Vulkan.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Chia-I Wu <olvaffe@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31961 >
2024-11-12 08:46:22 +00:00
Mary Guillemard
2237cff1af
panfrost: Report default value for GROUP_PRIORITIES_INFO in drm-shim
...
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Reviewed-by: Chia-I Wu <olvaffe@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31961 >
2024-11-12 08:46:22 +00:00
Zan Dobersek
25b73dff5a
tu/a7xx: use concurrent resolve groups
...
Establish groups of resolve and unresolve operations that the a7xx
hardware can then use to improve efficiency. Creating such groups enables
continuation of command stream processing while these (un)resolves are in
progress, as long as those latter operations don't depend on the grouped
(un)resolves.
To enable concurrent resolves and unresolves, corresponding fields on the
RB_CCU_CNTL register have to be set appropriately.
Resolve groups are tracked through a scoped struct that logs any pending
resolve operation. Once the group is complete, the emit helper function
will write out the CCU_END_RESOLVE_GROUP event to the command stream.
The buffer ID field on the RB_BLIT_INFO register can be used to disperse
different resolve operations across all available slots in the resolve
engine. The 0x8 and 0x9 IDs are reserved for depth and stencil buffers,
while the 0x0-0x7 range is used for color buffers. A simple incremented
counter is used to assign IDs for all color buffers inside any resolve
group. Two color or depth/stencil buffers inside the same resolve group
can end up with identical IDs, but the hardware doesn't seem to
have a problem with handling that.
Two TU_DEBUG options are provided: 'noconcurrentresolves' and
'noconcurrentunresolves' disable the respective operations by adjusting the
mode set through RB_CCU_CNTL.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com >
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31190 >
2024-11-12 07:50:45 +00:00
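The buffer ID assignment described in 25b73dff5a can be sketched as follows. The ID values (0x8 for depth, 0x9 for stencil, 0x0-0x7 for color) come from the commit message; the struct, the function name and the wrap-around with a modulo are assumptions, chosen because the message says identical IDs can occur within a group.

```c
#include <stdint.h>

enum {
    DEPTH_BUFFER_ID      = 0x8,   /* reserved for depth */
    STENCIL_BUFFER_ID    = 0x9,   /* reserved for stencil */
    NUM_COLOR_BUFFER_IDS = 8,     /* color uses 0x0-0x7 */
};

/* Hypothetical tracking state for one resolve group. */
struct resolve_group {
    uint32_t color_buffer_id;     /* incremented per color resolve */
    int pending;                  /* set once any resolve was logged */
};

/* Hands out the next color ID, wrapping after 0x7 (assumption). */
static uint32_t next_color_buffer_id(struct resolve_group *g)
{
    uint32_t id = g->color_buffer_id % NUM_COLOR_BUFFER_IDS;
    g->color_buffer_id++;
    g->pending = 1;
    return id;
}
```

With this scheme a ninth color resolve in the same group reuses ID 0x0, matching the case the commit message says the hardware tolerates.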
Zan Dobersek
f0e5331b21
freedreno/registers: update RB_BLIT_INFO, RB_CCU_CNTL
...
For RB_BLIT_INFO, documentation of the buffer ID field is updated to
explain its use on a7xx.
RB_CCU_CNTL definition for a7xx is updated with fields for concurrent
resolve/unresolve modes and enhanced with dedicated enum types.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com >
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31190 >
2024-11-12 07:50:45 +00:00
Job Noorman
b36a7ce0f1
ir3/ra: prevent moving source intervals for shared collects
...
Non-trivial collects (i.e., ones that will introduce moves because the
sources don't line up with the destination) may cause source intervals
to get implicitly moved when they are inserted as children of the
destination interval. Since we don't support moving intervals in shared
RA, this may cause illegal register allocations. Prevent this by
creating a new top-level interval for the destination so that the source
intervals will be left alone.
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31978 >
2024-11-11 20:08:34 +00:00
Matt Turner
a2c4a34303
anv: Align anv_descriptor_pool::host_mem
...
Otherwise anv_descriptor_set is accessed through an unaligned pointer,
which is undefined behavior in C.
```
anv_descriptor_set.c:1620:17: runtime error: member access within misaligned address 0x61900002c2b5
for type 'struct anv_descriptor_set', which requires 8 byte alignment 0x61900002c2b5
```
Fixes: 2570a58bcd ("anv: Implement descriptor pools")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32070 >
2024-11-11 19:45:14 +00:00
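The misaligned address in a2c4a34303 ends in 0x...2b5, which is not a multiple of 8. Rounding an offset up to the required alignment before placing a struct there is the usual remedy; the helpers below are a generic sketch with hypothetical names, not the anv code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round offset up to the next multiple of a power-of-two alignment. */
static size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Non-zero when ptr is a multiple of a power-of-two alignment. */
static int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}
```

For example, an offset of 0x2b5 rounds up to 0x2b8 for 8-byte alignment, while an offset that is already a multiple of 8 is returned unchanged.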
Georg Lehmann
ece1ab3b87
radv: run copy prop before vectorizing
...
Otherwise there are a lot of scalar movs between texture instructions
and alu. With those removed, the top down vectorizer has more starting
points.
Totals from 296 (0.37% of 79206) affected shaders:
MaxWaves: 5710 -> 5754 (+0.77%)
Instrs: 388051 -> 386630 (-0.37%); split: -0.46%, +0.09%
CodeSize: 2120800 -> 2117144 (-0.17%); split: -0.30%, +0.13%
VGPRs: 17496 -> 17344 (-0.87%)
Latency: 8893751 -> 8901364 (+0.09%); split: -0.10%, +0.18%
InvThroughput: 1740411 -> 1731710 (-0.50%); split: -0.57%, +0.07%
VClause: 6573 -> 6576 (+0.05%); split: -0.21%, +0.26%
SClause: 11233 -> 11209 (-0.21%); split: -0.28%, +0.07%
Copies: 31582 -> 31635 (+0.17%); split: -1.49%, +1.66%
PreSGPRs: 15878 -> 15876 (-0.01%)
PreVGPRs: 15380 -> 15274 (-0.69%)
VALU: 278528 -> 277036 (-0.54%); split: -0.65%, +0.11%
SALU: 49062 -> 49054 (-0.02%); split: -0.03%, +0.02%
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32060 >
2024-11-11 18:33:48 +00:00
Samuel Pitoiset
107f29c39a
aco: do not reorder s_trap instructions
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32055 >
2024-11-11 15:46:36 +00:00
Asahi Lina
252e9a4cdf
hk: Bump up max buffer size
...
Signed-off-by: Asahi Lina <lina@asahilina.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081 >
2024-11-11 14:33:02 +00:00
Asahi Lina
81546c769e
asahi: Use 64bit size fields
...
This allows for BOs >4G.
Signed-off-by: Asahi Lina <lina@asahilina.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081 >
2024-11-11 14:33:02 +00:00
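Why 81546c769e needs 64-bit size fields for BOs larger than 4G: a 32-bit field silently drops the upper bits. The snippet is a generic illustration with hypothetical helper names, not code from the asahi driver.

```c
#include <stdint.h>

/* What a 32-bit size field would store for a given byte count. */
static uint32_t size_as_u32(uint64_t bytes)
{
    return (uint32_t)bytes;       /* keeps only the low 32 bits */
}

/* Non-zero when the size is representable in 32 bits. */
static int fits_in_u32(uint64_t bytes)
{
    return bytes <= UINT32_MAX;
}
```

A 5 GiB buffer (5 << 30 bytes) does not fit, and truncating it would leave an apparent size of 1 GiB.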
Alyssa Rosenzweig
63dd4c13d0
asahi: move agx_gather_device_key
...
for precomp
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081 >
2024-11-11 14:33:02 +00:00
Alyssa Rosenzweig
7e57e0aa7d
asahi: factor out more compiled shader
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081 >
2024-11-11 14:33:02 +00:00