Variable workgroup size works by compiling as much SIMD variants as
possible and then selecting the right one during dispatch (when the
actual workgroup size is passed to us).
Instead of replicating the logic in a separate function, reuse the
same logic for regular SIMD selection. And move function for that
together with the remaining simd selection functions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13249>
Clean up the logic and move it to functions that work with prog_data
attributes to select the right SIMD. This shouldn't change any
behavior compared to the original.
Having it extracted will allow reuse by Task/Mesh and make it easier
to write tests.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13249>
zink_resource_has_binds() only checks descriptor binds, and this doesn't
include streamout or fb bindings, so call these functions from the specific
rebind points to ensure those cases are also checked
fixes#5541
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13490>
this migrates existing uses of zink_descriptor_layout_key to use
zink_descriptor_pool_key instead, which ends up being more helpful overall
since it puts all the pool-related data into a single struct
this also has a(n incredibly small) benefit of removing VkDescriptorPoolSize[6]
from every program data struct and deduplicating it into the descriptor
pool key set
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13483>
When compiling for x86 with MSVC, Vulkan API entry points follow the
__stdcall convention (VKAPI_CALL maps to __stdcall), which uses the
following name mangling:
_<function_name>@<arguments_size>
Fix the vk_entrypoint_stub()/alternatename definitions accordingly.
Fixes: 6d44b21d4f ("vulkan: Fix weak symbol emulation when compiling with MSVC")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13516>
The streamout_config SGPR is used to determine if streamout is enabled.
This fixes a GPU hang with various transform feedback tests:
- dEQP-GLES3.functional.transform_feedback.*
- KHR-GL46.transform_feedback.api_errors_test
- KHR-GL46.draw_indirect.basic-draw*-xfbPaused
- KHR-GL46.geometry_shader.api.draw_calls_while_tf_is_paused
Cc: 21.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13514>
AFBC is keyed to the format. Depending on the hardware, we'll get an
Invalid Data Fault or a GPU timeout if we attempt to sample from an
AFBC-compressed RGBA8 texture as R32F (for example).
Fixes Piglit ./bin/arb_texture_view-rendering-formats_gles3 with AFBC.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13205>
We need to know the internal (physical) formats used for AFBC of a given
logical format, in order to check format compatibility and determine if
we need to decompress AFBC for conformance.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13205>
According to mali_kbase, all Bifrost and Valhall GPUs are affected by
issue TSIX_2033. This hardware bug breaks the INTERSECT frame shader
mode when forcing clean_tile_writes. What does that mean?
The hardware considers a tile "clean" if it has been cleared but not
drawn to. Setting clean_tile_write forces the hardware to write back
such "clean" tiles to main memory.
Bifrost hardware supports frame shaders, which insert a rectangle into
every tile according to a configured rule. Frame shaders are used in
Panfrost to implement tile reloads (i.e. LOAD_OP_LOAD). Two modes are
relevant to the current discussion: ALWAYS, which always inserts a frame
shader, and INTERSECT, which tries to only insert where there is
geometry. Normally, we use INTERSECT for tile reloads as it is more
efficient than ALWAYS-- it allows us to skip reloads of tiles that are
discarded and never written back to memory.
From a software perspective, Panfrost's current logic is correct: if we
clear, we set clean_tile_writes, else we use an INTERSECT frame shader.
There is no software interaction between the two.
Unfortunately, there is a hardware interaction. The hardware forces
clean_tile_writes in certain circumstances when AFBC is used.
Ordinarily, this is a hardware implementation detail and invisible to
software. Unfortunately, this implicit clean tile write is enough to
trigger the hardware bug when using INTERSECT. As such, we need to
detect this case and use ALWAYS instead of INTERSECT for correct
results.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13205>
The gl_FragColor lowering in the fragment shader depends on the number
of render targets, which can change every set_framebuffer_state.
set_framebuffer_state thus needs to force a rebind of the fragment
shader.
Fixes a regression in Piglit fbo-drawbuffers-none gl_FragColor -auto
-fbo from enabling AFBC on Mali G52.
Fixes: 28ac4d1e00 ("panfrost: Call nir_lower_fragcolor based on key")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13498>
This uses the new property for AFBC we've added. The AFBC quirk is
applied only to v4, and we only set dev->has_afbc on v5+ so this is not
a regression. It now respects the hardware-specific AFBC disable.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13497>
u_format defines depth formats as having depth in .x, mesa/st samples for
depth or stencil in .x (not making use of any other channels).
util_make_fs_blit_zs() looks for depth or stencil in .x. The MSAA path
was the exception looking for it in .z or .y, which was causing drivers to
need to splat their values out to the other channels. This should be
better on hardware that can emit shorter messages for sampling just the
first channels.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13446>