The pan_afbc_block_info structure describes the extent (offset and
size) of the payload data (compressed data) for a superblock, so use
the pan_afbc_payload_extent structure name instead in order to be more
precise and improve readability. This also allows to differentiate
superblocks and payload data which will be useful later in this series
when new helpers will be added to pan_afbc.h.
A set of payload extents describes the layout of various payloads, so
use the term "layout" instead of the generic term "metadata" to
describe it.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35001>
Preventing the use of the AFBC tiled layout could be useful to further
optimise memory usage when using AFBC packing. This commit introduces
a new option to disable it through a driconf option.
This is exposed as a new AFBC pan_afbc_tiled option (not tied to
pan_force_afbc_packing) because it would otherwise imply a useless
performance hit for the tiled to untiled conversion at packing time:
there's no need to detile if the resource is created untiled in the
first place. This could also be useful to compare the performance of
the AFBC tiled and untiled layouts.
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35001>
Layout refactoring commits broke AFBC packing while removing several
fields to simplify the logic. The stride and height are now derived
when necessary at packing time based on the resource modifier. The
problem is that the code assumes that the source and destination
headers are the same although the source and destination modifiers
might differ and create size mismatches when passed to the AFBC
utilities in pan_afbc.h. The destination modifier is set as the source
modifier without the AFBC_FORMAT_MOD_SPARSE and AFBC_FORMAT_MOD_TILED
flags. While the AFBC_FORMAT_MOD_SPARSE flag doesn't have any impact
on these utilities, the AFBC_FORMAT_MOD_TILED flag does.
This commit fixes the issue by keeping the same header block layout
(linear or tiled header layout) when packing a resource. This allows
to simply parse header blocks linearly without having to bother with
the internal layout (Morton order). The tiled packed resource might
also benefit from better cache accesses.
Fixes: a2e9ce39e9 pan/layout: Drop pan_image_slice_layout::afbc::{stride_sb,nr_sblocks}
Fixes: 01d325ba63 pan/layout: Interleave header/body in AFBC(3D)
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35001>
Implement offset lowering by using the explicit LOD value with nearest-integer
rounding (floor(lod + 0.5)) and reusing the coordinate calculation helper.
Passes dEQP-GLES3.functional.shaders.texture_functions.texturelodoffset.* on GC7000.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35753>
Implement offset lowering by calculating implicit LOD using coordinate derivatives (ddx/ddy)
and doing some deep floating point wizardry matching the binary blob behaviour.
Adds helper functions for coordinate calculation and LOD clamping that will be
reused by subsequent offset lowering passes.
Passes dEQP-GLES3.functional.shaders.texture_functions.textureoffset.* without explicit bias on GC7000.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35753>
This will enable large code removal.
shader->config.lds_size is now always computed the same as ACO except for
compute shaders.
We have to add a new 8-bit user SGPR bitfield called
GS_STATE_GS_OUT_LDS_OFFSET_256B, which contains the offset
that was previously set by the relocation.
Since the offset must be a multiple of 256, we have to add padding
to the LDS size computation to make sure the alignment to 256 for the ESGS
LDS size doesn't cause us to exceed the maximum LDS size.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
It has been implemented and works for PS outputs already.
The lowering callback needs 2 variants because we can't access
pipe_screen from it. The callback is rewritten to be more general.
We also need to do nir_clear_mediump_io_flag for any outputs we don't
lower because the mediump flag might prevent optimizations if it's not
cleared.
v2: fix si_nir_optim
Acked-by: Timur Kristóf <timur.kristof@gmail.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35529>
Remove redundent lines in si_perfetto.cpp
Convert offset_B of buffer gpu_address to uint64_t ptr offset when
si_buffer_map is being called to read timestamp.
Destroy sctx->trace by calling u_trace_fini in si_utrace_fini which
will be called in si_destroy_context.
Signed-off-by: Julia Zhang <julia.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36066>