When RADV_DEBUG=shaders is set, printing e.g. different NIR shaders from
different threads at the same time makes the output unreadable. Use a mutex
to synchronize shader dumping so that all shaders get printed in one piece.
Since we're writing everything to a file or terminal anyway, the
performance impact of forcing single-threaded compilation is negligible.
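
A minimal sketch of the idea, assuming a single process-wide lock. The names
shader_dump_mtx and dump_shader_locked are hypothetical and not taken from
RADV. The sketch only serializes the printing itself, whereas the change
described above holds the lock widely enough to make compilation
single-threaded.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical global lock; the name is illustrative, not RADV's. */
static pthread_mutex_t shader_dump_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Print one shader as a single uninterrupted unit.
 * Returns the number of characters written, or a negative value on error. */
static int
dump_shader_locked(FILE *fp, const char *stage, const char *text)
{
   assert(fp && stage && text);

   pthread_mutex_lock(&shader_dump_mtx);
   /* Header and body are emitted while holding the lock, so output from
    * another thread cannot land between them. */
   int written = fprintf(fp, "%s:\n%s\n", stage, text);
   fflush(fp);
   pthread_mutex_unlock(&shader_dump_mtx);

   return written;
}
```

Every thread that dumps a shader would go through dump_shader_locked(), for
example dump_shader_locked(stderr, "NIR", nir_text), where nir_text is an
already formatted string.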
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25215>
Our u_hexdump() squeezes 16-byte chunks filled with zeros, where the unix
hexdump squeezes repeated 16-byte chunks. Turns out panfrost/panvk dumps
can be pretty big when a VM dump is requested
(PANVK_DEBUG/PAN_MESA_DEBUG=dump) and memory regions are
filled with repeated non-zero patterns (like a Z16_UNORM buffer cleared
to 1.0, AKA 0xffff).
Avoiding the repetition of such non-zero patterns in dumps significantly
reduces the size of the dumps. It also clears up any confusion for people
used to the original hexdump semantics where a star means the previous
line is repeated.
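
The squeezing rule can be sketched as follows. This is an illustration under
assumptions, not the actual u_hexdump() code: the function name
hexdump_squeezed, the output layout, and the returned line count are all
made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CHUNK 16

/* Dump `size` bytes, collapsing each run of chunks identical to the
 * previous chunk into a single "*" line. Returns the lines printed. */
static unsigned
hexdump_squeezed(FILE *fp, const uint8_t *data, size_t size)
{
   assert(fp && (data || size == 0));

   unsigned lines = 0;
   bool in_run = false;

   for (size_t off = 0; off < size; off += CHUNK) {
      size_t len = size - off < CHUNK ? size - off : CHUNK;

      /* A full chunk equal to the previous one belongs to a run,
       * whatever its contents (zeros, 0xffff, ...). */
      if (off >= CHUNK && len == CHUNK &&
          memcmp(data + off, data + off - CHUNK, CHUNK) == 0) {
         if (!in_run) {
            fprintf(fp, "*\n");
            lines++;
            in_run = true;
         }
         continue;
      }
      in_run = false;

      fprintf(fp, "%08zx ", off);
      for (size_t i = 0; i < len; i++)
         fprintf(fp, " %02x", data[off + i]);
      fprintf(fp, "\n");
      lines++;
   }

   return lines;
}
```

With this rule, a 64-byte buffer filled with 0xff produces one data line
followed by a single star line, instead of four identical data lines.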
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30692>
lower_driver_param_to_ubo would call ir3_const_state_mut
unconditionally. However, since 850f2aab03 ("ir3, tu: Use a UBO for VS
primitive params on a750+"), it can be called for the binning VS,
causing an assert. This commit makes sure to only call
ir3_const_state_mut when it's really necessary to have mutable access to
the const state.
Fixes: 2c47ad7774 ("ir3: make ir3_const_state less error-prone to use")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30718>
This commit solves the fence-register shortage problem in the blit
functions by checking the number of fence registers after updating the
batch. If too many registers are used, the batch entries and relocs for
the current blit function are removed by resetting batch->ptr and
reloc_count to their values from before the blit call and calling
drm_intel_gem_bo_clear_relocs. The truncated batch is then flushed,
and the batch is updated again for the current blit function.
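
The emit, check, roll back, flush, and re-emit sequence can be sketched with a
toy batch. Everything here is hypothetical: struct toy_batch, MAX_FENCES and
the toy_* helpers stand in for the driver's batch and for libdrm, and the
fence limit is not the hardware value.

```c
#include <assert.h>

#define MAX_FENCES 2 /* illustrative limit only */

struct toy_batch {
   unsigned ptr;          /* next free word, stands in for batch->ptr */
   unsigned reloc_count;  /* relocations recorded so far */
   unsigned fences_used;  /* fence registers the batch needs */
   unsigned flushes;      /* number of submissions, for inspection */
};

/* Submit the batch and start an empty one. */
static void
toy_flush(struct toy_batch *b)
{
   b->ptr = 0;
   b->reloc_count = 0;
   b->fences_used = 0;
   b->flushes++;
}

/* Append one blit that needs `fences` fence registers. */
static void
toy_emit_blit(struct toy_batch *b, unsigned fences)
{
   b->ptr += 8;
   b->reloc_count += 2;
   b->fences_used += fences;
}

static void
toy_blit(struct toy_batch *b, unsigned fences)
{
   assert(b && fences <= MAX_FENCES);

   /* Remember the batch state from before this blit. */
   unsigned saved_ptr = b->ptr;
   unsigned saved_relocs = b->reloc_count;
   unsigned saved_fences = b->fences_used;

   toy_emit_blit(b, fences);

   if (b->fences_used > MAX_FENCES) {
      /* Drop only this blit's entries, flush what came before it,
       * then emit the blit again into the fresh batch. */
      b->ptr = saved_ptr;
      b->reloc_count = saved_relocs;
      b->fences_used = saved_fences;
      toy_flush(b);
      toy_emit_blit(b, fences);
   }
}
```

In the real driver, truncating the relocation list is what
drm_intel_gem_bo_clear_relocs is used for; the toy version just restores a
counter.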
Cc: mesa-stable
Signed-off-by: GKraats <vd.kraats@hccnet.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26769>
If there are no uniforms to push, don't emit the AND or invalidate the
shader analysis. This affects only compute shaders.
The impact is not significant since lots of shaders end up pushing
uniforms. Fossil-db numbers (restricted to compute pipelines only) for DG2:
```
Totals:
Instrs: 3071016 -> 3070894 (-0.00%)
Cycle count: 8320268863 -> 8320264519 (-0.00%)
Totals from 122 (2.70% of 4520) affected shaders:
Instrs: 10675 -> 10553 (-1.14%)
Cycle count: 2060003 -> 2055659 (-0.21%)
```
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30631>
This keeps the allow_fallback behavior for the Lua dependency when
freedreno tools are used, as before, but disables the fallback
mechanism otherwise.
For Intel, the dependency is optional and the tool that uses it is
skipped when Lua is not available, so it is fine not to use the fallback
there.
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30693>
We need to flip triangles from CW to CCW based on the domain origin
specified as dynamic state. Instead of tracking all this on the CPU,
add a scratch register and do the conversion in the MME.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30703>
This effectively splits the two states apart so that we can set them
independently. Inside the macros, we only update states that have
actually changed, which should also be a bit more efficient.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30703>
We're always storing it in a scratch register for register pressure
reasons anyway. We may as well just stash it there as a state reg and
we can avoid emitting it all over the place. This reduces each draw
call to nvk_flush_gfx_state() followed by the actual draw, which is now
independent of any dynamic state.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30703>
mme_set_priv_reg() needs the first three registers to send data to/from
FALCON04. If we don't reserve these in the register space, it may stomp
other things. This only really matters pre-Volta where we need to use
privileged registers for conservative rasterization. However, it's a
good idea to reserve the space nonetheless.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30703>
Instead of the state part of the simulator being baked in, it's now
broken out into a pluggable component that the simulator talks to via a
function pointer interface. This will let us run the simulator without
the full state simulator under the hood.
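
A function pointer interface of this kind might look roughly like the sketch
below. All names (sim_state_ops, struct sim, array_state and so on) are
hypothetical and only illustrate the pattern; they are not the simulator's
actual types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state interface: the simulator only sees these hooks. */
struct sim_state_ops {
   uint32_t (*load)(void *state, uint16_t addr);
   void (*store)(void *state, uint16_t addr, uint32_t val);
};

struct sim {
   const struct sim_state_ops *ops;
   void *state; /* opaque to the simulator, owned by the backend */
};

/* Simulator-side operation that touches state only through the hooks. */
static void
sim_add_to_reg(struct sim *s, uint16_t addr, uint32_t delta)
{
   assert(s && s->ops && s->ops->load && s->ops->store);

   uint32_t v = s->ops->load(s->state, addr);
   s->ops->store(s->state, addr, v + delta);
}

/* A trivial backend: a flat array instead of a full state simulator. */
#define ARRAY_STATE_SIZE 16

struct array_state {
   uint32_t regs[ARRAY_STATE_SIZE];
};

static uint32_t
array_load(void *state, uint16_t addr)
{
   struct array_state *st = state;
   assert(addr < ARRAY_STATE_SIZE);
   return st->regs[addr];
}

static void
array_store(void *state, uint16_t addr, uint32_t val)
{
   struct array_state *st = state;
   assert(addr < ARRAY_STATE_SIZE);
   st->regs[addr] = val;
}

static const struct sim_state_ops array_state_ops = {
   .load = array_load,
   .store = array_store,
};
```

A test can then point struct sim at array_state_ops and a zeroed
struct array_state, which is the kind of lightweight setup that a pluggable
state component allows.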
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30703>
out_args->scratch_offset and in_wg_id_x will alias on <gfx9.
To avoid the conversion code reading a garbage WG ID, move the
scratch/ring offset writing to the very end.
Fixes: 1e354172 ("radv,aco: Convert 1D ray launches to 2D")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30707>