On GFX10.3, VRS rates need to be copied to the HTILE buffer but in some
situations, like for mips, it's not always possible to enable HTILE.
In this case, we can fallback to our internal HTILE buffer and tweak
the depth/stencil registers to use this HTILE buffer.
This fixes a bunch of VRS crashes on GFX10.3.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26025>
The main change is to use struct radv_vs_prolog_key directly instead of
the compressed representation to simplify an upcoming rework in prolog /
epilog caching. In doing so the state struct pointer was replaced with
an inline struct.
Care was also taken to pre-mask all the states with the active attribute
mask and other masks when it makes sense; this ensures that we don't
accidentally use information not hashed into the key during compilation.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26023>
This was broken as the field was never assigned to. This will also be
dropped from the upcoming prolog/epilog lookup rework, as it adds to
code complexity while the benefit of saving one hash table memory access
seems questionable.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26023>
On Steam Deck, shaders are pre-compiled for better performance (less
stuttering, less CPU usage, etc). But when a compiler fix needs to be
backported, there is currently no way to handle this properly.
This introduces 3 drirc options
radv_override_{graphics,compute,ray_tracing}_shader_version in order to
force the driver to re-compile pipelines when needed. By default, the
shader version is 0 for all pipelines.
When one drirc is set for a specific game, RADV will re-compile all
pipelines only once with the compiler fix included (because the
pipeline key would be different).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26094>
With DGC, push constants can be set from the cmdbuf (CmdPushConstants())
or from the indirect layout. Instead of always emitting inlined push
constants from the DGC shader, just update the ones that come from the
indirect layout and rely on cmdbuf updates for the other ones.
With that, it should be possible to preprocess push constants with
graphics when all can be inlined in shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25935>
pipeline_flags was 64-bit yet only the first 4 bytes were hashed.
Luckily, the mask included no flag above the 32nd bit, so this was
technically working fine. Still, it's better to use explicit sizeof
constructs to be more resilient to accidental type changes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26145>
Before this, the render pass code or the driver combined the pipeline
create flags and the implicit flags from the render pass, but the
pipeline create flags will need to be sanitized when they are dynamic
state, so we need to do it in vk_graphics_state where we know that
information.
We also weren't combining pipeline flags correctly when linking, which
on turnip was being hidden by the lack of sanitizing for driver-provided
flags. We can't combine them correctly if they're part of the render
pass state, so they need to be pulled out into the overall pipeline
state.
For drivers using emulated renderpasses or tracking feedback loop
information themselves, this won't make a difference, but we have to
adapt turnip to not pass pipeline flags. This also means that we can
drop all handling of feedback_loop_input_only in turnip and just set it
in the runtime.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25436>
A somewhat random collection of fossils:
N Min Max Median Avg Stddev
x 6 16.59 16.61 16.605 16.603333 0.0081649658
+ 6 15.99 16 16 15.998333 0.0040824829
Difference at 95.0% confidence
-0.605 +/- 0.00830327
-3.64385% +/- 0.0485573%
(Student's t, pooled s = 0.00645497)
I'm not sure if nir_opt_if and nir_opt_loop_unroll are actually idempotent
or not.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24197>
When the pitch or slice pitch isn't properly aligned,
the SDMA HW is unable to copy between tiled images and buffers.
To work around this, we process the image chunk by chunk,
copying the data to a temporary buffer which uses supported
pitches, and then copy it to the intended destination.
The implementation assumes that at least one pixel row of the
image will fit into the temporary buffer, and will try to copy
as many rows at once as possible. Sadly, this still results in
a lot of packets being generated for large images.
A possibe future improvement is to copy the image slice by slice
when only the slice pitch is misaligned. However, that is out
of scope for this commit.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25831>
Some copy operations are poorly supported by the SDMA hardware,
meaning that the built-in packets don't support them, so we will
need to work around that by copying to and from a temporary BO.
The size of the temporary buffer was chosen so that it can fit
at least one full pixel row of the largest possible image.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25831>
Previously, RADV only had a simple implementation of
image to buffer copies using the SDMA for the PRIME copy.
This commit replaces that with a full-featured implementation
that includes buffer to image and image to buffer copies and
removes the assumptions that the PRIME copy had, as well as
adds new helper functions which will be shared with other copy
functions in upcoming commits.
Unaligned buffer/image copies require a workaround, which
will be implemented by a future commit.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25831>
This register seems needed to enable compute shader shader invocations
on GFX7. On GFX8+ it's working fine without emitting this register but
I think it doesn't hurt.
This fixes dEQP-VK.query_pool.statistics_query.*_cq on GFX7.
Fixes: a9945216ba ("radv: fix COMPUTE_SHADER_INVOCATIONS query on compute queue")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25957>
The following sequence is valid (although weird) but many other drivers
(including RADV) were broken:
- bind pipeline with some static state
- set state command for that static state (to a bad value)
- bind the same pipeline again
- draw
Fixes new dEQP-VK.dynamic_state.*.double_static_bind.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25954>