Georg Lehmann
8a2aca8d6f
aco/select_alu: avoid vector get_alu_src for instructions with scalar operands
...
Foz-DB Navi21:
Totals from 1 (0.00% of 80237) affected shaders:
Instrs: 22 -> 21 (-4.55%)
CodeSize: 112 -> 108 (-3.57%)
Latency: 392 -> 386 (-1.53%)
InvThroughput: 25 -> 24 (-4.00%)
Copies: 4 -> 3 (-25.00%)
PreVGPRs: 8 -> 4 (-50.00%)
VALU: 10 -> 9 (-10.00%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35728 >
2025-07-29 06:07:15 +00:00
Georg Lehmann
ad9c340d86
aco: insert VALU s_delay_alu for WMMA
...
This should avoid some SIMD stalls.
I think this special case was added to try to handle this case:
First Instruction: WMMA
Second Instruction: WMMA instruction with same VGPR of previous WMMA instruction’s Matrix D as Matrix C
Stall if the first and second instruction are not the same type of WMMA or use ABS/NEG on SRC2 of the second instruction
If I read it correctly, we shouldn't need a delay if the type is the same and no
modifier is used. That's kind of complex to handle, so leave it for now.
Not inserting any delays likely hurts more than this.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36328 >
2025-07-29 05:48:29 +00:00
Georg Lehmann
413d0d2ec8
aco/statistics: update GFX12 WMMA cost
...
Based on marketing numbers, but they seem to match RGP.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36328 >
2025-07-29 05:48:29 +00:00
Georg Lehmann
8f61c85880
aco/statistics: add latency to WMMA
...
Assume the normal VALU latency of 4 cycles.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36328 >
2025-07-29 05:48:29 +00:00
Mike Blumenkrantz
a30138c025
zink: verify that no generated tcs is ever in zink_context::gfx_stages
...
this otherwise becomes very confusing to reason about
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36420 >
2025-07-28 20:10:00 +00:00
Mike Blumenkrantz
8af51a08fb
zink: skip all glx piglit tests on anv-adl
...
tried fixing this with weston changes to disable xwayland decor,
but that didn't work
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36422 >
2025-07-28 19:54:31 +00:00
Mel Henning
a1b64fdc12
nak/mark_lcssa_invariants: Invalidate divergence
...
Preserving this was resulting in validation errors like:
error: def->loop_invariant == BITSET_TEST(loop_invariance, def->index) (../src/compiler/nir/nir_validate.c:1890)
Fixes: 1d6082bf56 ("nouveau: switch to nir_metadata_divergence")
Reviewed-by: Karol Herbst <kherbst@redhat.com >
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36415 >
2025-07-28 19:37:51 +00:00
Juston Li
e1ca09317e
anv/android: refactor anb resolve to fix align assertion
...
Retrieving the memory requirement size and alignment via
anv_image_get_memory_requirements() returns 0 before surfaces are added
by resolve_anb_image(), and align64() asserts when the alignment is 0:
Abort message: '../src/util/u_math.h:713: uint64_t align64(uint64_t, uint64_t): assertion "util_is_power_of_two_nonzero64(alignment)" failed'
Refactor out anv_image_bind_from_gralloc() into resolve_anb_image() so
the checks are performed after the surfaces are added.
Resolving also requires API 29 so return VK_ERROR_EXTENSION_NOT_PRESENT
without it.
Fixes: 43cb986d9e ("anv/android: resolve ANB swapchain images on bind")
Signed-off-by: Juston Li <justonli@google.com >
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org >
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36060 >
2025-07-28 18:54:08 +00:00
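Note on e1ca09317e: the assertion fires because rounding up to an alignment is only defined for a nonzero power of two. The sketch below illustrates that precondition in isolation. All names prefixed with sketch_ and the is_pot_nonzero64() helper are hypothetical stand-ins, not the anv or util code, and the guard shown here only demonstrates the failing case; the commit itself fixes the problem by reordering the checks so they run after the surfaces are added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True only for 1, 2, 4, 8, ...; zero is rejected. */
static bool is_pot_nonzero64(uint64_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* Round value up to a multiple of alignment.
 * The alignment must be a nonzero power of two. */
static uint64_t sketch_align64(uint64_t value, uint64_t alignment)
{
   assert(is_pot_nonzero64(alignment));
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical stand-in for memory requirements that stay zero
 * until surfaces have been added to the image. */
struct sketch_mem_reqs {
   uint64_t size;
   uint64_t alignment;
};

/* Returns false instead of asserting when the requirements
 * have not been populated yet. */
static bool sketch_aligned_size(const struct sketch_mem_reqs *reqs,
                                uint64_t *out)
{
   if (!is_pot_nonzero64(reqs->alignment))
      return false;
   *out = sketch_align64(reqs->size, reqs->alignment);
   return true;
}
```

With zeroed requirements sketch_aligned_size() reports failure; once size and alignment are populated (for example 4097 and 4096) it yields the rounded-up size.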
Marek Olšák
35e1000072
radeonsi/ci: update gfx12 and other failures
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36382 >
2025-07-28 18:36:14 +00:00
Marek Olšák
ff42bf8b11
radeonsi/ci: don't build GLES CTS separately
...
GLES tests are available in GL CTS too.
Delete the build_es directory manually.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36382 >
2025-07-28 18:36:14 +00:00
Marek Olšák
2294dcb25d
radeonsi/ci: import piglit & cts build scripts
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36382 >
2025-07-28 18:36:14 +00:00
Mike Blumenkrantz
05cc38bb68
vulkan: silence typed_memcpy -Waddress warnings
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36413 >
2025-07-28 17:31:54 +00:00
Mike Blumenkrantz
0a536c7bf0
iris: silence perf_debug -Waddress warnings
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36413 >
2025-07-28 17:31:54 +00:00
Mike Blumenkrantz
1dae42308b
crocus: silence perf_debug -Waddress warnings
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36413 >
2025-07-28 17:31:54 +00:00
Mike Blumenkrantz
7d2b36e50f
zink: just check multiview availability to advertise extensions
...
now that legacy renderpasses are dropped, this can be more general
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36417 >
2025-07-28 17:06:32 +00:00
Aksel Hjerpbakk
c2284ae8a9
panvk: Use a single FBD for IR
...
Introduce a scratch FBD that will be used in the event of incremental
rendering (IR). Also store a subset of FBD words that are needed to
construct the relevant IR FBD in the scratch FBD memory.
This patch also increases TILER_OOM_HANDLER_MAX_SIZE from 512 to 1024.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34733 >
2025-07-28 16:08:20 +00:00
Aksel Hjerpbakk
8a35a98936
panvk: implement cs_extract64 & cs_extract_tuple
...
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com >
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34733 >
2025-07-28 16:08:20 +00:00
Aksel Hjerpbakk
5984ca8417
panvk: avoid cs jump block with no allocator
...
Also initialize the allocator to NULL for the tiler OOM handler and assert
that the capacity is sufficient when the allocator is NULL.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34733 >
2025-07-28 16:08:20 +00:00
Jose Maria Casanova Crespo
20b61dcde2
v3d: Add V3D_TFU_READAHEAD padding for renderonly resources
...
Fixes: 4e033ffb27 ("v3d: Add V3D_TFU_READAHEAD padding for allocated resources")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13508
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36407 >
2025-07-28 15:37:47 +00:00
Myrrh Periwinkle
abcd02a07d
gallium: Properly handle non-contiguous used sampler view indexes
...
There is nothing guaranteeing that the currently used sampler view
indexes will be contiguous, which means the resulting extra sampler
views created by st_get_sampler_views may not be placed at the end of
the resulting array. Therefore, the exact indexes of these views must
be passed to the caller for releasing instead of simply assuming that
they will always be placed at the end.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13363
Fixes: 73da0dcddc ("gallium: eliminate frontend refcounting from samplerviews")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36397 >
2025-07-28 14:41:56 +00:00
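Note on abcd02a07d: the point of the fix is that views created only for the current call can sit at any used index, so the caller needs their exact positions rather than a count taken from the end of the array. A minimal sketch of that idea follows, assuming a 32-slot array and a bitmask of used indexes. The struct and function names are hypothetical and do not reflect the actual st_get_sampler_views interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical slot: `temporary` marks a view created just for this call. */
struct sketch_slot {
   int view_id;
   bool temporary;
};

/* Record the exact indexes of temporary views among the used slots. */
static uint32_t sketch_collect_temporary(const struct sketch_slot *slots,
                                         uint32_t used_mask)
{
   uint32_t release_mask = 0;
   for (unsigned i = 0; i < 32; i++) {
      if ((used_mask & (UINT32_C(1) << i)) && slots[i].temporary)
         release_mask |= UINT32_C(1) << i;
   }
   return release_mask;
}

/* Release exactly the recorded slots; returns how many were released. */
static unsigned sketch_release(struct sketch_slot *slots,
                               uint32_t release_mask)
{
   unsigned released = 0;
   for (unsigned i = 0; i < 32; i++) {
      if (release_mask & (UINT32_C(1) << i)) {
         slots[i].view_id = -1; /* -1 stands for "no view" in this sketch */
         slots[i].temporary = false;
         released++;
      }
   }
   return released;
}
```

With used indexes 0, 3 and 7 and only slot 3 temporary, the mask contains bit 3 alone, so slot 7 is left untouched even though it is the last used entry.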
Tomeu Vizoso
9fc2f71501
etnaviv/ml: Remove some skips that pass now
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:33 +00:00
Tomeu Vizoso
3909d28924
etnaviv/ml: Support Transpose operation
...
Similar to how we currently support Reshape, add a bypass
pseudo-operation and don't change the actual layout of the tensor.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:33 +00:00
Tomeu Vizoso
0845acf578
teflon: Add support for Transpose
...
Channel-first to channel-last, and the opposite.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:33 +00:00
Tomeu Vizoso
3170b5f31c
etnaviv/ml: Add support for Subtract
...
Based on how we perform addition with a convolution, do something
similar for subtractions.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:33 +00:00
Tomeu Vizoso
005ab1f0fe
teflon: Add support for Subtract
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:33 +00:00
Tomeu Vizoso
a8a2ce1d74
etnaviv/ml: Add support for Logistic
...
Add a TP job that makes use of a look up table to implement a piecewise
linear approximation of the logistic function.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:32 +00:00
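Note on a8a2ce1d74: a piecewise linear approximation from a lookup table can be sketched on the host as below. This is an illustration of the technique only. The table range of [-4, 4], the spacing of 1.0, and the names logistic_lut and sketch_lut_logistic are assumptions made for this example; the commit does not describe the table layout the TP job actually uses.

```c
#include <assert.h>

/* Logistic function sampled at x = -4, -3, ..., 4 (assumed spacing). */
static const float logistic_lut[9] = {
   0.0179862f, 0.0474259f, 0.1192029f, 0.2689414f, 0.5f,
   0.7310586f, 0.8807971f, 0.9525741f, 0.9820138f,
};

/* Approximate 1 / (1 + exp(-x)) by interpolating linearly between
 * neighbouring table entries; inputs outside the table are clamped. */
static float sketch_lut_logistic(float x)
{
   if (x <= -4.0f)
      return logistic_lut[0];
   if (x >= 4.0f)
      return logistic_lut[8];

   float pos = x + 4.0f;   /* position in table units, in (0, 8) */
   int i = (int)pos;       /* truncation equals floor because pos > 0 */
   float t = pos - (float)i;

   assert(i >= 0 && i < 8);
   return logistic_lut[i] + t * (logistic_lut[i + 1] - logistic_lut[i]);
}
```

A finer table reduces the error between sample points; with this coarse spacing the value at x = 0.5 is the midpoint of the two neighbouring entries rather than the exact logistic value.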
Tomeu Vizoso
9c6cab0458
teflon: Add support for Logistic
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:31 +00:00
Tomeu Vizoso
67faa1525b
etnaviv/ml: Add support for Absolute
...
Add a TP job that makes use of a look up table to implement a piecewise
linear approximation of the absolute function.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:31 +00:00
Tomeu Vizoso
519a8b0f4a
teflon: Add support for Absolute
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:31 +00:00
Tomeu Vizoso
a1bb3f3c97
etnaviv/ml: Add support for non-fused ReLU
...
Add a TP job that makes use of a look up table to implement a piecewise
linear approximation of the ReLU function.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:30 +00:00
Tomeu Vizoso
1a102e05b5
teflon: Add support for non-fused Relu operations
...
Typically ReLU will be fused to a convolution, but that is not always
the case.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:30 +00:00
Tomeu Vizoso
ff3162b074
etnaviv/ml: Add support for no-op Reshape operations
...
This operation could be implemented in the TP cores, but this operation
tends to be added by convertors that export to TFLite from frameworks
with different channel order, and end up being no-ops.
Once we move to NIR for tensor operations, we can support this operation
and then remove it when we have an explicit transpose operation that is
negated by a consequent transpose operation.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:29 +00:00
Tomeu Vizoso
dae0af20ab
teflon: Add support for Reshape operations
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34629 >
2025-07-28 14:14:29 +00:00
Gert Wollny
1420f57ec7
r600/sfn: remove code used for vectorized ALU ops
...
ALU is lowered to scalar, so there is no need to check for vectorized
operations.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36101 >
2025-07-28 13:55:24 +00:00
Gert Wollny
c95b86cc0b
r600/sfn: remove obsolete index and address register handling
...
This old code was needed to get the backend assembler to do the
right thing when emitting index and address registers, but sfn
is handling this now so we can drop this.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36101 >
2025-07-28 13:55:24 +00:00
Gert Wollny
5d0719bf8d
r600/sfn: remove some dead code
...
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36101 >
2025-07-28 13:55:23 +00:00
Gert Wollny
5697e6bb31
r600/sfn: lower ineg in nir
...
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36101 >
2025-07-28 13:55:23 +00:00
Gert Wollny
43d877ce1a
r600/sfn: lower bany/ball *(n)equal in nir
...
The code emitted in the backend was not better and often worse
than what we get with the lowering.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36101 >
2025-07-28 13:55:22 +00:00
Erik Faye-Lund
1123987bb3
docs/features: add missing panvk extension
...
Mark VK_EXT_robustness2 as supported on panvk/v10+, as it's been exposed
for a little while.
Fixes: ef91ad64d5 ("panvk/v10+: Advertise nullDescriptor support")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36408 >
2025-07-28 12:01:37 +00:00
Erik Faye-Lund
2adcb4c81a
pan/ci: remove non-existent flag from PAN_MESA_DEBUG
...
It has been a few years since these flags existed, so let's drop them.
Fixes: ea03d0652d ("panfrost: Remove PAN_MESA_DEBUG=deqp")
Fixes: 7c7c38b126 ("panfrost: Remove unused debug parameter")
Acked-by: Valentine Burley <valentine.burley@collabora.com >
Reviewed-by: Eric R. Smith <eric.smith@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36310 >
2025-07-28 11:43:09 +00:00
Erik Faye-Lund
fe8f4084cf
panvk/ci: try to remove all previously slow tests
...
We now have an SSA register allocator that should be a lot faster on
most (if not all) of these tests. Let's re-enable them to gain more
CI coverage.
Acked-by: Valentine Burley <valentine.burley@collabora.com >
Reviewed-by: Eric R. Smith <eric.smith@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36310 >
2025-07-28 11:43:09 +00:00
Erik Faye-Lund
5f6b839315
panfrost: add new skips
...
These all currently take over 30s, so they should be skipped.
Acked-by: Valentine Burley <valentine.burley@collabora.com >
Reviewed-by: Eric R. Smith <eric.smith@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36310 >
2025-07-28 11:43:09 +00:00
Patrick Lerda
cea80e1a10
dri: complete the support for ARGB4444
...
This change is inspired by 1021d6fe62 ("dri: deal
with ARGB1555").
This issue is now mostly fixed by
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081
However, the dri3_cpp_for_fourcc entry is still missing and should
be added.
This change is useful, for instance, with r600, which
can handle this format.
Note: this mode was generated at the "glx visuals" level
on r600 by default before commit d709b42180.
This change was tested on r600 palm and cayman with X11
loaded with a version of mesa generating this very mode:
glx/glx-visuals-depth -pixmap: fail pass
glx/glx-visuals-stencil -pixmap: fail pass
Fixes: 00aa095d53 ("dri: Support 1555/4444 formats")
Signed-off-by: Patrick Lerda <patrick9876@free.fr >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34294 >
2025-07-28 11:17:00 +00:00
Jose Maria Casanova Crespo
5927fe5430
v3d: Reduce CLE submission of CLIP_WINDOW packets
...
When the rasterizer state is updated, we only need to update
the scissoring state if the rasterizer scissor state has changed.
This avoids re-sending the same scissor state any time the rasterizer
is changed.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36352 >
2025-07-28 09:53:56 +00:00
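Note on 5927fe5430: the change amounts to comparing the scissor-relevant part of the old and new rasterizer state before flagging the scissor as dirty. The sketch below shows that comparison under simplified assumptions. The structs, field names, and sketch_bind_rasterizer() are hypothetical and are not the v3d driver's types or entry points.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rasterizer state: only `scissor` affects clip window output. */
struct sketch_rasterizer {
   bool scissor;
   int cull_mode;
};

struct sketch_context {
   const struct sketch_rasterizer *rast;
   bool dirty_rasterizer;
   bool dirty_scissor;
};

/* Bind a new rasterizer state. The scissor is flagged dirty only when
 * there was no previous state or the scissor enable actually changed. */
static void sketch_bind_rasterizer(struct sketch_context *ctx,
                                   const struct sketch_rasterizer *next)
{
   assert(ctx != NULL && next != NULL);

   if (ctx->rast == NULL || ctx->rast->scissor != next->scissor)
      ctx->dirty_scissor = true;

   ctx->rast = next;
   ctx->dirty_rasterizer = true;
}
```

Binding two states that differ only in cull_mode leaves dirty_scissor clear after it has been consumed, so no redundant clip window update would be queued in this model.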
Jose Maria Casanova Crespo
591a894b94
v3d: Mark DIRTY_ZSA if disable_ez is changed from FS.
...
We need to update the CFG_BITS packet if the early_fragment_test status
has changed since the previous draw call. But we don't need to update it
every time the FS is changed, only when the disable_ez
value differs from that of the previous FS.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36352 >
2025-07-28 09:53:55 +00:00
Rhys Perry
928c9c618d
nir/opt_access: support RT/callable shaders
...
I don't know if this affects any real application.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35938 >
2025-07-28 09:19:01 +00:00
Rhys Perry
a9a1da0264
nir/uub: fix 8/16-bit overflow
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Backport-to: 25.1
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13552
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13553
Tested-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36372 >
2025-07-28 08:46:51 +00:00
Danylo Piliaiev
fd1e8cf03f
tu: Fix nullptr dereference in cmd_buffer tracepoint
...
Fixes: ac2046c5b0 ("tu/perfetto: Add app and engine names to the command buffer tracepoint")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36389 >
2025-07-28 08:27:10 +00:00
Mary Guillemard
525f2972a6
pan/bi: Properly handle SWZ.v4i8 lowering on v11+
...
We did not support non-replicate swizzles, which triggered an
assertion on fossils/parallel-rdp/small_subgroup.foz.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Fixes: 1481b14fcb ("pan/bi: Lower SWZ.v4i8 to multiple MKVEC.v2i8 on v11+")
Reviewed-by: Olivia Lee <olivia.lee@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36349 >
2025-07-28 08:00:07 +00:00
Job Noorman
294014e196
ir3: use dummy dst for descriptor prefetches
...
Now that we have the concept of "dummy" registers, we can use it
for descriptor prefetches as well. Currently, they are represented as
having no dst, and a fixup pass during legalization adds the actual
needed dummy dst. This can be prevented by representing their dst using
a dummy register from the start.
Signed-off-by: Job Noorman <jnoorman@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36365 >
2025-07-28 09:02:17 +02:00