This is probably not the most efficient way to go for all geometries,
but the assumption is that kernels tend to be x/y-heavy rather than
z-heavy. Iterates over each z slice and passes in the current value via
user param. (And bump all user params by a dword.)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9299>
This makes compute mostly work. For now we're laying out images/buffers
in a fixed offset from each other in the globals "array", but this
should be done dynamically. We're also missing passing image info to
shaders, as well as adding image formats to a shader key.
Heavily inspired by nvc0 variants of these.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9299>
For legacy reasons, we were using the PIPE order, instead of the
hardware order. Reorder the stages to match the order of the
BIND_TIC/etc methods, and adjust internal usage to match.
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
[imirkin: fixed numbering, removed TODO comments]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9299>
Format 0x26 is invalid, formats are in a 4 bit field so they repeat
in increments of 16.
Frame reg flags needs to set 0x01 to actually enable fp16.
The clear color setup is also a bit different for fp16, need to use
the 16 bit values in the first two clear color registers.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9916>
The comment about using the RS engine was correct before the code got
changed to use the 3D blitter and a fallback before etnaviv was merged
into upstream Mesa. It has been incorrect ever since. As it's just
adding confusion, instead of being helpful, remove it.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9310>
SB si known to be buggy and the ultimate aim is to make it go away. To
test workloads with better optimizations it makes sense to be able to
enable SB, but for the NIR backend it should not be enabled together
with NIR the default. Therefore an a specific debug option "nirsb" that
enables NIR with SB.
Fixes: 3b27243b01
r600: Enable sb also for NIR
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10108>
unsetting zink from GALLIUM_DRIVER is required in order for lavapipe to
work, but setting it back is totally broken in the case where an app
creates a ton of screens simultaneously
instead, just leave it set to llvmpipe, and if a race condition occurs,
at least llvmpipe isn't going to fail a test that zink passes
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10120>
a synchronous driver can use PIPE_MAP_ONCE to infer that a buffer is
guaranteed to not be mapped multiple times, as this is only used when
doing map -> memcpy -> unmap directly
a threaded driver performs maps/unmaps asynchronously, so this flag
can only be used by the driver to confirm that the mapped region is accessed
exactly once, not that it will not need to remain mapped for other transfer_map
uses after it is unmapped
in short, consider this scenario:
transfer_map(A) -> memcpy(map, data) -> transfer_unmap(map_A) ->
transfer_map(A) -> memcpy(map, data) -> transfer_unmap(map_A)
when a synchronous driver executes this, the call chain is unmodified
when a tc driver executes this, the call chain may become:
transfer_map(A) -> memcpy(map, data) ->
transfer_map(A) -> memcpy(map, data) ->
transfer_unmap(map_A) -> transfer_unmap(map_A)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10113>
GPUs with the feature bit PE_NO_ALPHA_TEST set have no fixed-function
alpha test unit and we want to let st lower it with a shader variant.
For GC7000K this fixes all fbo-alphatest-formats piglits like:
spec@ext_framebuffer_object@fbo-alphatest-formats
spec@ext_packed_float@fbo-alphatest-formats
spec@ext_texture_srgb@fbo-alphatest-formats
This only works with the NIR compiler backend.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Lukas F. Hartmann <lukas@mntmn.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9871>
The meson.build was unaware of transitive dependencies introduced by
Python imports.
Android still needs fixing. But I did not update the Android files lest
I break the build.
Ideally, we would fix this by using a Python runner that generates
a depfile, similar to how meson creates depfiles for C files by passing
flags -MD -MQ -MF to gcc. But this patch gets the job done, without
stalling on the ideal general solution, by manually tracking the Python
imports in new 'foo_depend_files' variables.
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1466>
It enables SAND modifier with columns 128-bytes-wide support for
NV12 format.
When a DRM_FORMAT_MOD_BROADCOM_SAND128 is enabled an imported NV12
texture format has a different layout. Luma and Chroma planes layout
is interleaved for every 128-bytes-wide columns.
Although TFU was supposed to convert a NV12 with SAND_COL128 modifier
from YUV to sRGB color space, it expects a particular swizzle that is not
the one provided by the video decoder available at the Raspberry Pi 4.
This patch follows a similar approach to VC4 YUV blit, using a custom
blit shader that transforms a NV12 texture with SAND_COL128 modifier
with the two interleaved planes to two not-interleaved textures with
UIF format, as it was a regular NV12 format texture.
To reduce the number of texture-fetch operations during the blit, we
are reading and writing the textures in pixel groups of 32-bits. This
implies some swizzling of the pixels to meet the particularities
of the different micro-tile layouts for 8bpp, 16bpp and 32bpp.
With this approach, we are not adding a new format that could be named
"NV12_SAND128". We are just enabling a format modifier.
v2: Rework checks for supported modifiers (Alejandro Piñeiro)
Destroy custom shaders on context destroy (Alejandro Piñeiro)
Add more comments (Alejandro Piñeiro)
SAND128 in query_dmabuf_modifiers should report external_only true.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10051>
it turns out there's not actually a requirement that resources be unmapped,
which means that a ton of overhead can be saved both in the unmap codepath
(the cpu overhead here is pretty insane) and then also when mapping cached
resource memory, as the map can now be added to the cache for immediate reuse
as seen in radeonsi
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9980>
this was not copied directly and changed the old conditional, which
was intended to catch only non-patch tess io
fixes tess io with legacy builtins, e.g., spec@!opengl 2.0@vertex-program-two-side back front2 back2
Fixes: 2d98efd323 ("zink: pre-populate locations in variables")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9996>