Note that if you grep, you'll find two instances of `wget`, but those are not
executed in our docker images, but on the generic FDo gitlab runners, so the
/etc/wgetrc file in our docker images cannot affect them.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34684>
Invalid for now, but used by vkd3d-proton, where the use case is to convert
a result matrix to lower precision, followed by a store.
For 16bit accumulation matrices, GFX11 only uses 16bits per 32bit register.
RADV's coop matrix code pads the unused space with undefs and uses a vector
with twice as many elements as the matrix length. Extending that to 8bit by
leaving 24 bits unused is unnecessary as these matrices as there
is no hw unit that requires it. And in wave32, it would also result in
vectors larger than NIR's limit.
So tightly pack 8bit matrices without any undef padding.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34382>
Since headless overrides create_mem, it needs to override finish_create
too. Fixes a segfault in nvk that was caused by us mixing
wsi_create_null_image_mem with wsi_finish_create_blit_context, which
would then call CmdCopyImageToBuffer with image->blit.buffer == NULL
Fixes a cts failure on nvk in:
dEQP-VK.image.swapchain_mutable.headless.2d.r8g8b8a8_unorm_b8g8r8a8_unorm_clear_copy_format_list
and several others
Fixes: 579578f10a ("vulkan/wsi/drm: Break create_prime_image in pieces")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34646>
If the optional flag is set, compaction groups TES inputs based on which
outputs they are used for:
- inputs generating only POS/CLIP outputs are first
- inputs generating both POS/CLIP and VAR outputs are next
- inputs generating only VAR outputs are last
shader-db with ACO:
143 shaders have -1.44% average decrease in code size.
There are fewer input loads and more of them are vec4 instead of vec1-3.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32262>
The first pass computes which shader instructions contribute to each
output. It can be used to query how data flows within shaders towards
outputs.
The second pass computes which shader input components and which types of
memory loads are used to compute shader outputs.
The third pass uses the second pass to gather which input components are
used to compute pos and clip dist outputs, which input components are used
to compute all other outputs, and which input components are used to
compute both. This will be used by compaction in nir_opt_varyings for
drivers that split TES into a separate position cull shader and varying
shader to make it less likely that the same vec4 inputs are needed in both.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32262>
the only dynamic allocation left for geometry shaders is all done in the setup
indirect kernel. so just pass the heap to that kernel directly, so we don't
reserve a heap for direct draws with GS (including pure-VS XFB). this should
reduce our memory footprint a lot in certain apps.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34661>
rather than a bunch of subtle booleans telling the driver how to invoke the GS
rast shader, collect everything into a common enum, and provide (CL safe)
helpers to do the appropriate calculations rather than duplicating across
GL/VK/indirects.
this fixes suboptimal handling of instancing with list topologies.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34661>
They were both missing subtle cases. While we're here, fix a bunch of
SrcTypes. They shouldn't matter in practice since it's just used to
determine how many GPRs to allocate but we may as well get them right.
Fixes: 078ffb860b ("nak/sm20: Add initial SM20 encoding")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34675>
Since an FAbs or FNegAbs modifier is going to force the source out to a
register no matter what, we should do this first. That way we avoid the
unnecessary source swaps or other evictions when we have to evict the
source anyway because it has a modifier.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34675>
This logic to enable LD_VAR_BUF[_IMM] is on the conservative side.
For fixed varyings, we would need to know what the VS outputs to correctly
compute the indices the FS has to load from. For general varyings, the
locations are aligned either by the linker or by the application in case
of separable shaders.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34074>
A8_UNORM on v9+ is using RGBA8_UNORM as a pixel format with the
A8_UNORM clump format to dealing with the diffences between
RGBA8 and the actual A8 in-memory layout.
The problem is, LEA_TEX only loads the InternalConversionDescriptor
which contains only the pixel format, and that's what ST_CVT uses
to do the conversion, so we'll actually store 4 components instead
of one.
This shows up with
dEQP-VK.image.load_store.without_any_format.buffer.a8_unorm* after
enabling maintenance5.
For now I've turned off the image storage capability for A8_UNORM
on all gens, but I'd be fine disabling it only on v9+ if you think
that's preferable.
Fixes: d95423686f ("pan/format: Add PAN_BIND_STORAGE_IMAGE flag")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34648>