One advantage of moving a bunch of stuff to NIR is that payload types
are now consistent all the way from the NIR conversion to BRW.
This massively simplifies the BRW lowering code and avoids type errors
that are easy to make in the backend.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37527>
Calculate the minimum available buffer age in addition to the maximum,
and if they differ for 1000 frames in a row, destroy the BO for the
highest-age unused buffer.
Without this, Wayland compositors using dynamic triple buffering always
get buffer age 3 once a third BO has been allocated.
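
As a rough illustration of the heuristic described above, here is a
minimal sketch. The struct layout, the pool size and every name other
than the 1000-frame threshold are hypothetical and do not come from the
actual WSI code. It assumes the caller updates buffer ages elsewhere and
calls this once per frame.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_BUFFERS 4               /* hypothetical pool size */
#define AGE_MISMATCH_THRESHOLD 1000 /* frames, as stated in the commit message */

struct buffer {
   bool allocated; /* a BO currently backs this slot */
   bool in_use;    /* held by the compositor, not available for rendering */
   int age;        /* frames since this buffer was last rendered to; 0 = never */
};

struct swapchain {
   struct buffer bufs[NUM_BUFFERS];
   int mismatch_frames; /* consecutive frames with min age != max age */
};

/* Call once per frame. Returns the index of the buffer whose BO was
 * released, or -1 if nothing was released this frame. */
static int
update_buffer_ages(struct swapchain *sc)
{
   int min_age = 0, max_age = 0, oldest = -1;

   for (int i = 0; i < NUM_BUFFERS; i++) {
      const struct buffer *b = &sc->bufs[i];

      /* Only unused buffers with a valid age take part in the comparison. */
      if (!b->allocated || b->in_use || b->age == 0)
         continue;

      if (min_age == 0 || b->age < min_age)
         min_age = b->age;
      if (b->age > max_age) {
         max_age = b->age;
         oldest = i;
      }
   }

   /* Ages agree (or nothing is available): restart the streak. */
   if (oldest < 0 || min_age == max_age) {
      sc->mismatch_frames = 0;
      return -1;
   }

   /* Use == rather than > so the release triggers exactly once per streak. */
   if (++sc->mismatch_frames == AGE_MISMATCH_THRESHOLD) {
      sc->bufs[oldest].allocated = false; /* stands in for destroying the BO */
      sc->bufs[oldest].age = 0;
      sc->mismatch_frames = 0;
      return oldest;
   }

   return -1;
}
```

With two unused buffers of age 2 and 3, the first 999 calls return -1
and the 1000th releases the age-3 buffer, after which the remaining
unused buffer reports a single consistent age.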
v2:
* Rename function to destroy_oldest_unused_bo. (Marek Olšák)
* Move function call into if block.
* Use == instead of > as the condition for the function call.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37511>
When the CPU clock is the same as the authoritative trace clock (which
normally defaults to CLOCK_BOOTTIME), perfetto drops the non-monotonic
snapshots to ensure validity of the global source clock in the
resolution graph. When they are different, the clocks are marked invalid
and the rest of the clock syncs will fail during trace processing.
There's no central daemon emitting consistent snapshots for
synchronization between CPU and GPU clocks on behalf of renderstages and
counters producers. The sequence-scoped clock (64 <= ID < 128) is unique
per producer + writer pair within the tracing session.
Turnip is a bit tricky here, since clocks may be synchronized before
`tu_perfetto_end_submit` is called (in the KGSL case), but the perfetto
event has to be emitted on the same thread as the other renderstage
events. To solve this, save the clocks in `tu_perfetto_state` and emit
them in `stage_end` when needed.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37465>
Links printed to the console are broken on some console types, for
example when running within a GitLab job. Using rich improves this.
Since we also depend on colorama for coloring, migrate all the coloring
to rich as well.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37454>
Operands narrower than int are promoted to int before an addition,
which makes checks of the form a + b < a ineffective.
Use u_overflow.h helpers to perform the check correctly.
The commit would be simpler if it used __typeof__ like so:
    util_add_check_overflow(__typeof__(src0), src0, src1)
But typeof only became standard in C23, so this commit instead extends
nir_opcodes a bit to allow opcodes that need the dest_type to get it.
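
To illustrate the promotion problem, here is a minimal standalone
sketch for 8-bit operands. It does not use the u_overflow.h helpers; the
function names are made up for this example, and the explicit cast
stands in for knowing the destination type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Ineffective: a and b are promoted to int, so the sum of two 8-bit
 * values never wraps and the comparison can never be true. */
static bool
uadd8_overflows_naive(uint8_t a, uint8_t b)
{
   return a + b < a;
}

/* Truncating the sum back to the destination type restores the
 * wraparound that the check relies on. */
static bool
uadd8_overflows(uint8_t a, uint8_t b)
{
   return (uint8_t)(a + b) < a;
}
```

For a = 200 and b = 100 the naive check compares 300 < 200 and reports
no overflow, while the truncating check compares 44 < 200 and reports
one. This is why the check needs the destination type.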
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37331>
Use tile_max_w/h, which are the HW bounds for the tile width/height and
are much smaller than the theoretical maximum width/height of a
lopsided tile with just the depth attachment.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37513>
This module has existed, unchanged, since Meson 0.64, and is now marked
as API stable in 1.8. It provides a number of helpers that reduce the
amount of code we need (including fiddly code about finding
wayland-scanner) by a bit, as well as some nice helpers for finding
external XML files.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35839>
blobAlignment helps with running a 4KB guest on a 16KB host.
But for a 16KB guest on a 4KB host, we'll need to check the guest
page size too. os_get_page_size(..) might not work on all target
guest OSes yet, so default to 4KB.
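
One possible way to combine the two constraints is sketched below. It
assumes both values are powers of two and that the larger of the host
blob alignment and the guest page size should win. The function and
macro names are hypothetical, and the real change may do this
differently.

```c
#include <stdint.h>

/* Fallback when the guest page size cannot be queried (assumed 4KB). */
#define DEFAULT_GUEST_PAGE_SIZE 4096u

/* Round size up to the stricter of the host-provided blob alignment and
 * the guest page size. Pass 0 for guest_page_size if it is unknown. */
static uint64_t
align_blob_size(uint64_t size, uint32_t blob_alignment, uint32_t guest_page_size)
{
   if (guest_page_size == 0)
      guest_page_size = DEFAULT_GUEST_PAGE_SIZE;

   uint64_t align = blob_alignment > guest_page_size ? blob_alignment
                                                     : guest_page_size;

   /* Power-of-two round-up. */
   return (size + align - 1) & ~(align - 1);
}
```

Under these assumptions a 4KB guest on a 16KB host and a 16KB guest on
a 4KB host both end up with 16KB-aligned blob sizes.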
TEST=CF 16KB works
Reviewed-by: David Gilhooley <djgilhooley.gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37460>
Add support for hardware-accelerated transform feedback using the TFB
command register to control capture state.
The hardware state is tracked through an enum distinguishing between
idle (no hardware state established), active (hardware currently
capturing), and paused (hardware stopped).
Hardware commands are emitted based on state transitions:
- ENABLE when moving from idle to active
- RESUME when transitioning from paused to active
- DISABLE when stopping capture
Transform feedback buffer setup uses the existing dirty state
mechanism through ETNA_DIRTY_STREAMOUT_BUFS, while command emission uses
the new ETNA_DIRTY_STREAMOUT_CMD flag. Buffer descriptors are computed by
mapping vertex shader transform feedback outputs to fragment shader input
registers, as required by the hardware.
A 64-byte context buffer is allocated per context to maintain hardware
state isolation between applications using transform feedback
simultaneously. The hardware state persists across pause and resume
cycles within a command stream but resets during flushes since transform
feedback state does not survive command buffer boundaries.
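
The state transitions listed above can be sketched as a small state
machine. All identifiers here are hypothetical and not taken from the
driver. The sketch assumes that stopping capture moves an active stream
to paused, and that a flush drops the state back to idle.

```c
/* Hypothetical names; the driver's enum and command values may differ. */
enum tfb_state {
   TFB_IDLE,   /* no hardware state established */
   TFB_ACTIVE, /* hardware currently capturing */
   TFB_PAUSED, /* hardware stopped */
};

enum tfb_cmd {
   TFB_CMD_NONE,
   TFB_CMD_ENABLE,
   TFB_CMD_RESUME,
   TFB_CMD_DISABLE,
};

/* Start or continue capturing; returns the command to emit. */
static enum tfb_cmd
tfb_start(enum tfb_state *state)
{
   enum tfb_cmd cmd = TFB_CMD_NONE;

   if (*state == TFB_IDLE)
      cmd = TFB_CMD_ENABLE;  /* idle -> active */
   else if (*state == TFB_PAUSED)
      cmd = TFB_CMD_RESUME;  /* paused -> active */

   *state = TFB_ACTIVE;
   return cmd;
}

/* Stop capturing; only an active stream needs a DISABLE. */
static enum tfb_cmd
tfb_stop(enum tfb_state *state)
{
   if (*state != TFB_ACTIVE)
      return TFB_CMD_NONE;

   *state = TFB_PAUSED;
   return TFB_CMD_DISABLE;
}

/* Hardware state does not survive a flush, so start over from idle. */
static void
tfb_flush(enum tfb_state *state)
{
   *state = TFB_IDLE;
}
```

Deriving the command from the previous state keeps emission in one
place: a start after a flush produces ENABLE again rather than RESUME.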
The implementation enables the full transform feedback capability with
support for 4 buffers and up to 64 separate or interleaved components,
replacing the previous debug-only stub implementation.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37320>
Add infrastructure for stream output by implementing the required Gallium
interface functions for creating, destroying, and binding stream output targets.
This lays the groundwork for transform feedback support in etnaviv.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37320>
Add support for transform feedback primitive counting queries using the
hardware TFB query mechanism. The implementation uses dedicated query
registers (VIVS_TFB_QUERY_BUFFER and VIVS_TFB_QUERY_COMMAND) to track
the number of primitives written during transform feedback operations.
The hardware automatically accumulates primitive counts and stores the
final result at offset 0 of the query buffer, eliminating the need for
manual accumulation.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37320>
Extend the supports(..) function signature in acc sample providers
to accept an etna_context parameter, enabling GPU feature validation
during query type support checks.
This change prepares the infrastructure for query providers to make
context-aware decisions based on available GPU capabilities.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37320>
Add native hardware support for rasterizer_discard on GPU cores that
support the HWTFB (Hardware Transform Feedback) feature. This moves
rasterizer discard handling from software clipping to dedicated
hardware state.
Passes all dEQP-GLES3.functional.rasterizer_discard.* with HWTFB.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37320>
Unlike the store/resolve path, which uses A2D, the FDM load path uses
the 3D pipeline and is therefore affected by the hardware FDM offset
registers. The fallback sysmem clear path also uses the 3D pipeline.
Subtract the HW offset from the destination coordinates, similar to how
it is subtracted from viewport and scissor.
Fixes: b34b089ca1 ("tu: Use GRAS bin offset registers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37496>
Intel HW does not support separate destination and reference output pictures
when decoding AV1 video. The only exception is film grain, for which the
Vulkan spec already includes a caveat.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37351>