Currently, shader part caching logic is duplicated between VS prolog and
PS/TCS epilogs. This commit introduces a common abstraction to
deduplicate the code.
Additionally, there are a few design decisions that diverts from the
current implementation:
1. A simple mutex is used instead of reader-writer lock. Prolog/epilog
constructions are serialized, removing the need to free duplicate
objects in case of a race.
2. A CS-local cache is used to quickly lookup an entry without holding a
lock. This eliminates locking in over 99% of cases.
3. A set is used to reduce number of allocations.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26028>
For 0x07030002 chip id different names are returned on different
phones: Adreno730v3 or Adreno725v1. Settle on 725 to disambiguate
them.
The only difference from base 730 is that it has conditional
execution of compute shader at the start of every command buffer.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25888>
They check for null, alignment, excessive size, and address space wrapping. If any of the checks
fails, `Err(CL_INVALID_VALUE)` is returned.
The caller still has to uphold the other requirements of the `from_raw_parts` fns.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26157>
Raw pointers have bad ergonomics and by using them we opt out of a lot of Rust safety guarantees.
The closure we create modifies the memory behind `svm_ptr`. Make that clear to the compiler by
representing it as slice. `pattern` could also be represented by a slice but then we'd create
overly generic code not exploiting the guarantees given to us be the OpenCL spec.
Namely that there's only a few possible sizes - all of them a power of two - and that `svm_ptr` is
aligned to that size.
Thus, represent `pattern` as one struct per possible size and have the compiler generate optimized
code paths for filling the buffer with each of them. There's one unsafe operation less and the
remaining ones as well as the casts have been documented in detail.
Based on that additional checks of the provided `size` have been added. While it's unlikely that
any application will ever run into them, the old pointer arithmetic already silently relied on
these properties.
Furthermore, since raw pointers are neither `Send` or `Sync` but the Rust types we now use are the
closure can now implement `Send` and `Sync`. That's one step toward marking `EventSig` `Send`.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26157>
Otherwise ctx->VertexProgram._VaryingInputs might not be up to date.
We can't do this in update_program because this breaks vbo_save_playback_vertex_list_gallium:
const GLbitfield enabled = node->enabled_attribs[mode];
_mesa_set_varying_vp_inputs(ctx, enabled); <-- update _VaryingInputs
if (ctx->NewState)
_mesa_update_state(ctx); <-- calls update_program, reverting the
change made above
Fixes: c97961a855 ("mesa: fix 38% decrease in display list performance of Viewperf2020/NX8_StudioAA")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9441
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25956>
On GFX10.3, VRS rates need to be copied to the HTILE buffer but in some
situations, like for mips, it's not always possible to enable HTILE.
In this case, we can fallback to our internal HTILE buffer and tweak
the depth/stencil registers to use this HTILE buffer.
This fixes a bunch of VRS crashes on GFX10.3.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26025>
Patch fxes ESO shadow pass ground corruption on Arc A750. In the colour
pass where the rendering corruption first appears, the depth resource
was used as a "PS - Texture". Immediately afterwards there's a Barrier
where it goes from
VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL =>
VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL
immediately following that there's a Clear from vkCmdBeginRendering
which appears to be a HiZ clear. Things work when using AUX_USAGE_HIZ
but AUX_USAGE_HIZ_CCS_WT (XXX: and AUX_USAGE_HIZ_CCS?) doesn't work.
current thinking is this is related to 14015264727 where we had to add
HDC and DC flushes to CCS and MCS fast clears. Maybe HiZ clears with
CCS also have similar problems? The docs don't appear to indicate that
but the docs were also wrong for color clears until recently...
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9277
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9444
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22717>
Now that NAK is the default for Turing+, we can just chalk any storage
image descriptor handle overruns up to codegen bugs. We could plumb
shader stages all the way through to here and only assert when codegen
is in use but that's a lot of work just for an assert.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24998>