Commit Graph

215019 Commits

Author SHA1 Message Date
Aitor Camacho abc719f01f kk: Add multiViewport and EXT_shader_viewport_index_layer support
Reviewed-by: Arcady Goldmints-Orlov <arcady@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38518>
2025-11-19 23:29:00 +00:00
Aitor Camacho 15f170e369 kk: Merge io type modifying passes into one
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38518>
2025-11-19 23:29:00 +00:00
Faith Ekstrand cbd0c9eb3b panvk: Add a panvk_common_sysvals struct
For geometry shaders, we're going to need to compile various graphics
shaders down to compute shaders.  This means that they'll look like
compute shaders to much of the compile pipeline but ultimately get
executed as graphics shaders.  Most of the time, the compiler will just
happily take whatever offset you give and try to load the sysval from
there so you can load a graphics sysval from a compute shader just fine.
However, for the common ones, we switch on the shader stage and load
from a different offset for 3D vs. compute.  This breaks the moment you
have a compute shader that's going to actually load from a 3D sysval
space.

The solution here is to ensure that any common sysvals (currently just
the push uniforms address and the printf buffer) are at exactly the same
offset in both.  This is done by adding a panvk_common_sysvals struct,
some static asserts, and a bit of macro magic to keep things ergonomic.
This also changes push uniform upload to just swap in the push uniform
address instead of writing it to the command buffer on every iteration.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38508>
2025-11-19 23:10:41 +00:00
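The layout rule described in cbd0c9eb3b can be sketched roughly as follows. This is only an illustration of the idea (a shared leading struct plus static asserts); every struct, field, and macro name here is hypothetical and does not come from the panvk sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; the real panvk structs differ. */
struct common_sysvals {
   uint64_t push_uniforms; /* push uniforms address */
   uint64_t printf_buffer; /* printf buffer address */
};

struct graphics_sysvals {
   struct common_sysvals common; /* shared prefix */
   float viewport_scale[3];
   float viewport_offset[3];
};

struct compute_sysvals {
   struct common_sysvals common; /* shared prefix */
   uint32_t num_work_groups[3];
   uint32_t local_group_size[3];
};

/* Fail the build if a common field ever ends up at different offsets. */
#define ASSERT_COMMON_SYSVAL(field)                                   \
   static_assert(offsetof(struct graphics_sysvals, common.field) ==   \
                 offsetof(struct compute_sysvals, common.field),      \
                 #field " must share one offset in both layouts")

ASSERT_COMMON_SYSVAL(push_uniforms);
ASSERT_COMMON_SYSVAL(printf_buffer);

/* Stage-independent lookup: no switch on graphics vs. compute needed. */
static inline size_t common_push_uniforms_offset(void)
{
   return offsetof(struct graphics_sysvals, common.push_uniforms);
}
```

With a guarantee like this, a shader compiled as compute can load a common sysval from a graphics sysval space without knowing which of the two layouts it was handed.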
Eric Engestrom 3ebabe9e43 docs/release-calendar: add 26.0 branchpoint and release candidates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38539>
2025-11-19 23:04:46 +00:00
Lionel Landwerlin 6fe2035065 anv: bump maxTessellationControlTotalOutputComponents
Our backend compiler describes the limits as:

   32 bytes for the patch header (tessellation factors)
  480 bytes for per-patch varyings (a varying component is 4 bytes and
            gl_MaxTessPatchComponents = 120)
16384 bytes for per-vertex varyings (a varying component is 4 bytes,
            gl_MaxPatchVertices = 32 and
            gl_MaxTessControlOutputComponents = 128)

In all, that's:
  * 32 patches * 128 components (counting tessellation factors)
  * 32 vertices * 128 components

8192 total components.

I'm not sure why the limit was set so low. Maybe it is a leftover from older platforms?

Bump the limit to something comparable to the competition.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38523>
2025-11-19 22:44:54 +00:00
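The arithmetic in 6fe2035065 can be replayed with a few constants. This sketch only restates the numbers given in the message; the identifier names are made up for illustration and the resulting driver limit is not taken from the patch itself.

```c
#include <stdint.h>

enum {
   BYTES_PER_COMPONENT      = 4,     /* a varying component is 4 bytes */
   PATCH_HEADER_BYTES       = 32,    /* tessellation factors */
   PER_PATCH_VARYING_BYTES  = 480,   /* gl_MaxTessPatchComponents = 120 */
   PER_VERTEX_VARYING_BYTES = 16384, /* 32 vertices * 128 components */
};

/* Patch header plus per-patch varyings, expressed in components. */
static uint32_t per_patch_components(void)
{
   return (PATCH_HEADER_BYTES + PER_PATCH_VARYING_BYTES) / BYTES_PER_COMPONENT;
}

/* Per-vertex varyings, expressed in components. */
static uint32_t per_vertex_components(void)
{
   return PER_VERTEX_VARYING_BYTES / BYTES_PER_COMPONENT;
}

/* The sum exactly as written in the commit message: 32*128 + 32*128. */
static uint32_t total_components_as_stated(void)
{
   return 32u * 128u + 32u * 128u;
}
```

The first helper shows where the "128 components (counting tessellation factors)" figure comes from: 512 bytes divided by 4 bytes per component.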
Eric R. Smith 65ba14519e pan: fix a bifrost disassembly assert failure
We were overflowing an array during bifrost disassembly. This was
only a problem if the user explicitly set an environment variable,
so it was unlikely to occur in casual use, and it could only be
triggered in very specific, dense code. But we should still get this right!

The specific CTS test that caused the assert is:

'dEQP-VK.graphicsfuzz.stable-quicksort-for-loop-with-injection'

with environment variable `BIFROST_MESA_DEBUG=shaders`. One of the
shaders has a clause with 6 constants (the maximum) and this overflowed
the array because we assume we always have an extra slot (used for
modifier processing).

Cc: mesa-stable
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38501>
2025-11-19 22:10:21 +00:00
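The failure mode in 65ba14519e is the classic "maximum plus one" sizing problem. The sketch below shows one plausible shape of such a fix, assuming the array simply needs room for the extra modifier slot; it is not the actual patch, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CLAUSE_CONSTANTS 6 /* maximum named in the commit message */
#define EXTRA_SLOTS          1 /* slot assumed for modifier processing */

struct clause_constants {
   /* Sized for the maximum AND the extra slot, so a full clause fits. */
   uint64_t values[MAX_CLAUSE_CONSTANTS + EXTRA_SLOTS];
   size_t count;
};

/* Append a constant; refuse instead of writing past the real maximum. */
static bool push_constant(struct clause_constants *c, uint64_t value)
{
   if (c->count >= MAX_CLAUSE_CONSTANTS)
      return false;
   c->values[c->count++] = value;
   return true;
}

/* Index of the extra slot; valid even when the clause is full. */
static size_t extra_slot_index(const struct clause_constants *c)
{
   assert(c->count <= MAX_CLAUSE_CONSTANTS);
   return c->count;
}
```

With an array of only MAX_CLAUSE_CONSTANTS entries, extra_slot_index() would point one past the end for a clause holding all 6 constants, which matches the situation the message describes.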
Dmitry Baryshkov 7a3bfd1f79 rocket: drop file names from the generated file
Having file names and dates in the generated file affects
reproducibility. Build systems (like OE) error out on the gen_header.py
output, because it can contain full paths. Drop file list from the
generated file.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38528>
2025-11-19 16:27:32 +00:00
Dmitry Baryshkov cdb6468c53 ethosu: drop file names from the generated file
Having file names and dates in the generated file affects
reproducibility. Build systems (like OE) error out on the gen_header.py
output, because it can contain full paths. Drop file list from the
generated file.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38528>
2025-11-19 16:27:32 +00:00
Hyunjun Ko 9a9342e4aa anv/video: handle segmentation features for vp9 decoding
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38418>
2025-11-19 15:54:47 +00:00
Hyunjun Ko 1479e1ef82 anv/video: rework for handling alternative quantizer for vp9 decoding.
Includes prep work for handling segmentation features.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38418>
2025-11-19 15:54:47 +00:00
Danylo Piliaiev 8827123fef tu: Disable FLAG_WAIT_FOR_BR sync when CB is disabled
Skip TU_CMD_FLAG_WAIT_FOR_BR wait whenever concurrent binning is disabled.
Without CB there is nothing to wait for, so the sync only adds overhead,
and in workloads with thousands of tiny renderpasses the cumulative overhead
becomes too big.

In one real-world workload I saw the following timings:
- 99.20 ms without disabling TU_CMD_FLAG_WAIT_FOR_BR
- 65.15 ms with TU_CMD_FLAG_WAIT_FOR_BR disabled
- 64.92 ms with TU_DEBUG=nocb

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38378>
2025-11-19 14:35:33 +00:00
Danylo Piliaiev 9370bdc61e tu: Disable by default CB running alongside renderpasses
Disable concurrent binning by default so regular renderpasses have access
to all vertex fetch resources. When a renderpass can actually enable CB,
walk back to the CB barrier at submission time and re-enable CB for all
patchpoints between CB barrier and the renderpass.
Because we expect at most one or two renderpasses with CB per frame,
the number of patches stays small.

The reduced vertex fetch resources resulted in a performance loss of up
to 10%, seen in a targeted benchmark and in a few game captures.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38378>
2025-11-19 14:35:33 +00:00
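The walk-back described in 9370bdc61e can be pictured with a small model. This is a sketch under assumptions: patchpoints are modeled as a flat array, the barrier itself is included in the patched range, and none of the names correspond to real turnip code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a patchable command-stream location. */
struct patchpoint {
   bool is_cb_barrier; /* marks where a CB barrier was recorded */
   bool cb_enabled;    /* CB is off by default */
};

/*
 * Starting at the renderpass that can use CB, walk backwards to the
 * nearest CB barrier and enable CB on every patchpoint on the way.
 * Returns how many patchpoints were changed.
 */
static size_t enable_cb_back_to_barrier(struct patchpoint *pp,
                                        size_t renderpass_idx)
{
   size_t patched = 0;

   for (size_t i = renderpass_idx + 1; i-- > 0;) {
      if (!pp[i].cb_enabled) {
         pp[i].cb_enabled = true;
         patched++;
      }
      if (pp[i].is_cb_barrier)
         break;
   }
   return patched;
}
```

Since the message expects at most one or two renderpasses with CB per frame, a linear walk like this would run rarely and touch few entries.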
Danylo Piliaiev 5d2b171886 tu/cs: Helpers to create a region that can be easily enabled/disabled
To mitigate the CB perf impact, we need to be able to easily toggle
CB-related IB regions.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38378>
2025-11-19 14:35:32 +00:00
Danylo Piliaiev a7f63a5dbb tu: Do not WAIT_FOR_BR if concurrent binning is disabled
The sync emitted on TU_CMD_FLAG_WAIT_FOR_BR didn't disable CB
when CB was previously disabled for the renderpass; this resulted
in fewer vertex processing resources being available for BR.

We can just not emit the sync instead, since the next time CB is
enabled, it will force the sync.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38378>
2025-11-19 14:35:32 +00:00
Danylo Piliaiev f2fb8ad422 tu: Don't CONCURRENT_BIN_DISABLE when there is no depth image
We have to disable CB when LRZ fast-clear is disabled, but if there
is no depth image at all, we can keep it enabled. This means that a
renderpass without depth won't effectively be a CB barrier.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38378>
2025-11-19 14:35:32 +00:00
Danylo Piliaiev ee4f375bfd tu: Fix CB barrier description
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38378>
2025-11-19 14:35:32 +00:00
Janne Grunau 1f144081ec meson: Add asahi to aarch64's auto-generated drivers
Since the Apple silicon M1 and M2 series of SoCs support only aarch64,
split the lists for 'arm' and 'aarch64'.

Signed-off-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38341>
2025-11-19 11:16:53 +00:00
Georg Lehmann fa66b670d4 aco/optimizer: reduce max alu_opt_info stack operands to 4
ALU instructions typically have a maximum of 3 operands, and even when combining
instructions, the peak count will not go above 4.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:43 +00:00
Georg Lehmann 4da74eed96 aco/tests: test packed fma opts
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:43 +00:00
Georg Lehmann 1f0293be0d aco/optimizer: use new helpers for packed fma
Foz-DB Navi48:
Totals from 374 (0.45% of 82419) affected shaders:
MaxWaves: 5476 -> 5480 (+0.07%)
Instrs: 2786653 -> 2784061 (-0.09%); split: -0.11%, +0.01%
CodeSize: 15163340 -> 15153460 (-0.07%); split: -0.08%, +0.01%
VGPRs: 46884 -> 46860 (-0.05%)
SpillVGPRs: 188 -> 189 (+0.53%)
Scratch: 3207936 -> 3208192 (+0.01%)
Latency: 27352681 -> 27350006 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 5933554 -> 5932632 (-0.02%); split: -0.02%, +0.01%
VClause: 62355 -> 62359 (+0.01%); split: -0.03%, +0.04%
Copies: 290221 -> 289786 (-0.15%); split: -0.21%, +0.06%
Branches: 108566 -> 108569 (+0.00%); split: -0.01%, +0.01%
PreVGPRs: 40172 -> 40157 (-0.04%)
VALU: 1355753 -> 1353329 (-0.18%); split: -0.19%, +0.01%
SALU: 524836 -> 524831 (-0.00%); split: -0.01%, +0.01%
VMEM: 90948 -> 90950 (+0.00%)
VOPD: 10489 -> 10490 (+0.01%); split: +0.98%, -0.97%

Foz-DB Navi21:
Totals from 374 (0.45% of 82387) affected shaders:
MaxWaves: 4339 -> 4348 (+0.21%)
Instrs: 2255741 -> 2253554 (-0.10%); split: -0.10%, +0.00%
CodeSize: 12755276 -> 12744184 (-0.09%); split: -0.09%, +0.01%
VGPRs: 40376 -> 40352 (-0.06%)
Latency: 27357012 -> 27348737 (-0.03%); split: -0.07%, +0.04%
InvThroughput: 7213578 -> 7211136 (-0.03%); split: -0.07%, +0.04%
VClause: 62154 -> 62172 (+0.03%); split: -0.01%, +0.04%
Copies: 268204 -> 268048 (-0.06%); split: -0.22%, +0.16%
Branches: 107067 -> 107066 (-0.00%)
PreVGPRs: 37615 -> 37599 (-0.04%)
VALU: 1423326 -> 1421187 (-0.15%); split: -0.16%, +0.01%
SALU: 383388 -> 383390 (+0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:43 +00:00
Georg Lehmann fec10ea3ea aco/optimizer: use new helpers for add16 opts
Foz-DB Navi48:
Totals from 164 (0.20% of 82419) affected shaders:
Instrs: 145304 -> 145335 (+0.02%); split: -0.00%, +0.02%
CodeSize: 794156 -> 794280 (+0.02%); split: -0.00%, +0.02%
Latency: 1884349 -> 1884227 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 350403 -> 350393 (-0.00%)

Foz-DB Navi21:
Totals from 164 (0.20% of 82387) affected shaders:
Instrs: 117416 -> 117414 (-0.00%)
CodeSize: 673328 -> 673312 (-0.00%)
Latency: 1896952 -> 1897094 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 638536 -> 638556 (+0.00%); split: -0.01%, +0.01%
Copies: 14579 -> 14577 (-0.01%)
VALU: 65895 -> 65893 (-0.00%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann e8f5b9374b aco/optimizer: use new helpers to optimize mul(b2f(a), b)
Foz-DB Navi48:
Totals from 979 (1.19% of 82419) affected shaders:
Instrs: 3630560 -> 3629463 (-0.03%); split: -0.03%, +0.00%
CodeSize: 19154176 -> 19147124 (-0.04%); split: -0.04%, +0.00%
Latency: 17700546 -> 17699505 (-0.01%); split: -0.01%, +0.01%
InvThroughput: 3143808 -> 3143254 (-0.02%); split: -0.02%, +0.01%
SClause: 76410 -> 76405 (-0.01%); split: -0.01%, +0.00%
Copies: 256544 -> 256554 (+0.00%); split: -0.02%, +0.02%
PreVGPRs: 40868 -> 40835 (-0.08%)
VALU: 2003291 -> 2002466 (-0.04%); split: -0.04%, +0.00%
SALU: 514000 -> 514006 (+0.00%)
VOPD: 3254 -> 3256 (+0.06%); split: +0.12%, -0.06%

Foz-DB Navi21:
Totals from 926 (1.12% of 82387) affected shaders:
MaxWaves: 21538 -> 21542 (+0.02%)
Instrs: 2984216 -> 2983187 (-0.03%); split: -0.04%, +0.00%
CodeSize: 16104112 -> 16097272 (-0.04%); split: -0.05%, +0.00%
VGPRs: 46864 -> 46848 (-0.03%)
Latency: 15678064 -> 15677099 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 3779550 -> 3778230 (-0.03%); split: -0.04%, +0.01%
VClause: 81590 -> 81598 (+0.01%)
SClause: 70753 -> 70751 (-0.00%); split: -0.01%, +0.00%
Copies: 240446 -> 240466 (+0.01%); split: -0.01%, +0.02%
PreSGPRs: 51121 -> 51062 (-0.12%)
PreVGPRs: 38538 -> 38505 (-0.09%)
VALU: 1978847 -> 1977777 (-0.05%); split: -0.06%, +0.00%
SALU: 439184 -> 439212 (+0.01%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
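The commit e8f5b9374b does not spell out what mul(b2f(a), b) is turned into. A common rewrite for this pattern is a select between b and zero, and the sketch below assumes that form purely for illustration. Note that the two forms are only interchangeable when NaN, infinity, and signed-zero behaviour do not matter or are handled separately.

```c
#include <stdbool.h>

/* b2f: convert a boolean to 1.0f or 0.0f. */
static float b2f(bool a)
{
   return a ? 1.0f : 0.0f;
}

/* Original pattern: a multiply by the converted boolean. */
static float mul_b2f(bool a, float b)
{
   return b2f(a) * b;
}

/* Assumed replacement: pick b or zero, no multiply needed. */
static float select_form(bool a, float b)
{
   return a ? b : 0.0f;
}
```

For finite inputs both functions return the same value, which is why a compiler can trade the conversion and multiply for a single conditional move when the float rules allow it.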
Georg Lehmann f0e24284f5 aco/optimizer: create max3/min3/med3 with salu min/max
Foz-DB Navi48:
Totals from 175 (0.21% of 82419) affected shaders:
Instrs: 465863 -> 465260 (-0.13%); split: -0.13%, +0.00%
CodeSize: 2362264 -> 2360744 (-0.06%); split: -0.07%, +0.00%
Latency: 1548501 -> 1548371 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 227683 -> 227630 (-0.02%); split: -0.08%, +0.06%
Copies: 33646 -> 33648 (+0.01%)
PreSGPRs: 9996 -> 10004 (+0.08%)
VALU: 175836 -> 175850 (+0.01%)
SALU: 122094 -> 121621 (-0.39%); split: -0.39%, +0.00%

Foz-DB Navi21:
Totals from 1 (0.00% of 82387) affected shaders:
InvThroughput: 74 -> 76 (+2.70%)
VALU: 57 -> 58 (+1.75%)
SALU: 61 -> 60 (-1.64%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann d21734e024 aco/optimizer: use new helper functions to create med3
Foz-DB Navi48:
Totals from 9659 (11.72% of 82419) affected shaders:
Instrs: 17301747 -> 17301735 (-0.00%); split: -0.00%, +0.00%
CodeSize: 93378108 -> 93378184 (+0.00%); split: -0.00%, +0.00%
Latency: 145441784 -> 145441791 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 25768777 -> 25768778 (+0.00%)
Copies: 1370123 -> 1370124 (+0.00%)
VALU: 9705655 -> 9705656 (+0.00%)

Foz-DB Navi21:
Totals from 22 (0.03% of 82387) affected shaders:
Instrs: 27433 -> 27406 (-0.10%)
CodeSize: 146440 -> 146352 (-0.06%); split: -0.06%, +0.00%
Latency: 305857 -> 305806 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 63634 -> 63580 (-0.08%)
VALU: 19109 -> 19082 (-0.14%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
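For readers unfamiliar with med3, the following sketch shows why a min/max clamp can be expressed as a single median-of-three. It is a scalar C model written for this listing, not ACO code, and it ignores NaN handling.

```c
/* Two-operand helpers, kept local to avoid depending on libm. */
static float min2f(float a, float b) { return a < b ? a : b; }
static float max2f(float a, float b) { return a > b ? a : b; }

/* Median of three values. */
static float med3f(float a, float b, float c)
{
   return max2f(min2f(a, b), min2f(max2f(a, b), c));
}

/* The two-instruction pattern: min(max(x, lo), hi). */
static float clamp_minmax(float x, float lo, float hi)
{
   return min2f(max2f(x, lo), hi);
}

/* The combined form: valid as a clamp only when lo <= hi. */
static float clamp_med3(float x, float lo, float hi)
{
   return med3f(x, lo, hi);
}
```

When lo <= hi, the clamped result is always the middle of the three values, so one med3 can stand in for the min/max pair.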
Georg Lehmann 6fc250fc06 aco/optimizer: use new helpers for min3/max3/minmax/maxmin
Foz-DB Navi48:
Totals from 10453 (12.68% of 82419) affected shaders:
Instrs: 18676282 -> 18675798 (-0.00%); split: -0.00%, +0.00%
CodeSize: 100603268 -> 100603508 (+0.00%); split: -0.00%, +0.00%
Latency: 157036823 -> 157031708 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 28049331 -> 28048776 (-0.00%); split: -0.00%, +0.00%
Copies: 1452464 -> 1452503 (+0.00%); split: -0.00%, +0.00%
PreVGPRs: 458422 -> 458413 (-0.00%); split: -0.00%, +0.00%
VALU: 10429583 -> 10429353 (-0.00%); split: -0.00%, +0.00%
SALU: 2628403 -> 2628416 (+0.00%); split: -0.00%, +0.00%
VOPD: 21738 -> 21744 (+0.03%); split: +0.04%, -0.01%

Foz-DB Navi21:
Totals from 889 (1.08% of 82387) affected shaders:
MaxWaves: 15641 -> 15639 (-0.01%); split: +0.01%, -0.03%
Instrs: 2505527 -> 2505489 (-0.00%); split: -0.01%, +0.01%
CodeSize: 13975300 -> 13976516 (+0.01%); split: -0.00%, +0.01%
VGPRs: 65584 -> 65576 (-0.01%); split: -0.02%, +0.01%
Latency: 37135606 -> 37132577 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 10937032 -> 10935704 (-0.01%); split: -0.01%, +0.00%
VClause: 63136 -> 63140 (+0.01%); split: -0.01%, +0.01%
Copies: 256011 -> 256073 (+0.02%); split: -0.01%, +0.03%
PreSGPRs: 51804 -> 51809 (+0.01%)
PreVGPRs: 57905 -> 57890 (-0.03%); split: -0.03%, +0.00%
VALU: 1593523 -> 1593339 (-0.01%); split: -0.02%, +0.00%
SALU: 425116 -> 425134 (+0.00%); split: -0.00%, +0.01%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann 5d02eae052 aco/optimizer: add less aggressive pattern matching option
Still a bit more aggressive than the classic is_used_once,
but it should still prevent most regressions for patterns
that use min/max/mul as the outer instruction.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann 2c05aa34aa aco/optimizer: create fma with s_mul_f32/f16
Foz-DB Navi48:
Totals from 14473 (17.56% of 82419) affected shaders:
MaxWaves: 397738 -> 397720 (-0.00%); split: +0.00%, -0.01%
Instrs: 22133626 -> 21984649 (-0.67%); split: -0.68%, +0.01%
CodeSize: 117440104 -> 117111440 (-0.28%); split: -0.30%, +0.02%
VGPRs: 825820 -> 825928 (+0.01%); split: -0.01%, +0.02%
SpillSGPRs: 15496 -> 15512 (+0.10%); split: -0.19%, +0.29%
Latency: 152141755 -> 152058676 (-0.05%); split: -0.07%, +0.02%
InvThroughput: 25715152 -> 25681160 (-0.13%); split: -0.14%, +0.01%
VClause: 402752 -> 400798 (-0.49%); split: -0.53%, +0.04%
SClause: 587448 -> 586772 (-0.12%); split: -0.19%, +0.07%
Copies: 1650891 -> 1661495 (+0.64%); split: -0.14%, +0.78%
Branches: 541341 -> 541334 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 748235 -> 748332 (+0.01%); split: -0.03%, +0.04%
VALU: 11754090 -> 11755396 (+0.01%); split: -0.01%, +0.02%
SALU: 3659133 -> 3536435 (-3.35%); split: -3.36%, +0.01%
VOPD: 17201 -> 17083 (-0.69%); split: +0.05%, -0.74%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann 5abc961514 aco/optimizer: use new helpers to create fma
Foz-DB Navi48:
Totals from 25949 (31.48% of 82419) affected shaders:
Instrs: 30904250 -> 30904153 (-0.00%); split: -0.00%, +0.00%
CodeSize: 164623100 -> 164604652 (-0.01%); split: -0.01%, +0.00%
Latency: 209402611 -> 209402684 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 36622293 -> 36622236 (-0.00%); split: -0.00%, +0.00%
Copies: 2252080 -> 2251998 (-0.00%); split: -0.00%, +0.00%
VALU: 16831507 -> 16831382 (-0.00%); split: -0.00%, +0.00%
VOPD: 28252 -> 28295 (+0.15%)

Foz-DB Navi21:
Totals from 56269 (68.30% of 82387) affected shaders:
Instrs: 43751754 -> 43746463 (-0.01%); split: -0.01%, +0.00%
CodeSize: 233615096 -> 233576912 (-0.02%); split: -0.02%, +0.00%
VGPRs: 2445528 -> 2445520 (-0.00%)
Latency: 276776920 -> 276761183 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 66406450 -> 66402214 (-0.01%); split: -0.01%, +0.00%
VClause: 902951 -> 902947 (-0.00%)
Copies: 3926260 -> 3926289 (+0.00%); split: -0.01%, +0.01%
VALU: 26924056 -> 26918783 (-0.02%); split: -0.02%, +0.00%
SALU: 6938335 -> 6938321 (-0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann 1e2aea7461 aco/optimizer: add new helper functions for combining two instructions
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann 87e168f223 aco/optimizer: make label_mad more generic
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann 53f5e447db aco/optimizer: add extract_float helper
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann 7eccf5c745 aco/optimizer: refactor insert
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Lionel Landwerlin 049adad4f4 anv: split non binding related intrinsics from apply_layout
Trying to cut down apply_pipeline_layout a bit and also allowing some
reuse for a new extension.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38495>
2025-11-19 10:27:27 +00:00
Erik Faye-Lund 138fbb1c6c mesa: introduce and use _mesa_has_texture_buffer_range
This reduces some code-repetition, and makes it a bit easier to reason
about what this actually tests for.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38162>
2025-11-19 09:10:19 +00:00
Erik Faye-Lund ba3a0c580f zink: only expose rgba buffer-textures
Unlike textures, we can't easily do format-conversion of the data
beforehand, because the source is a buffer object and not a texture object.

But we already have a hammer for this in Mesa, which means we'll drop
the ARB_texture_buffer_object extension support, but only for the OpenGL
compatibility profile. We still get GL 4.6, both core and compatibility.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38162>
2025-11-19 09:10:19 +00:00
Erik Faye-Lund 73b1ea4491 panfrost: only expose rgba buffer-textures
Unlike textures, we can't easily do format-conversion of the data
beforehand, because the source is a buffer object and not a texture object.

But we already have a hammer for this in Mesa, which means we'll drop
the ARB_texture_buffer_object extension support, but only for the OpenGL
compatibility profile. We still get GL 3.1 exposed.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38162>
2025-11-19 09:10:19 +00:00
Erik Faye-Lund 08b5876c37 v3d: only expose rgba buffer-textures
Unlike textures, we can't easily do format-conversion of the data
beforehand, because the source is a buffer object and not a texture object.

But we already have a hammer for this in Mesa, which means we'll drop
the ARB_texture_buffer_object extension support, but only for the OpenGL
compatibility profile. We still get GL 3.2 exposed.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38162>
2025-11-19 09:10:18 +00:00
Erik Faye-Lund 70f1603125 mesa/main: do not check for ARB_texture_buffer_object for GL 3.1
While OpenGL 3.1 does require texture buffer objects, the ARB spec for
this requires support for texture buffers with alpha, luminance,
luminance-alpha and intensity formats in addition to RGBA formats. The
version of texture buffer objects that ended up in the OpenGL spec (even
in the compatibility spec) does not require these formats.

But, we don't even need to check this, because this is already included
in the GLSL 1.40 requirement that's also checked. So this shouldn't make
us expose GL 3.1 in cases where it isn't supported in the first place.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38162>
2025-11-19 09:10:17 +00:00
Erik Faye-Lund 3039899d5b mesa/main: correct error message
This code-path hasn't been solely about ARB_texture_buffer_object for a
long time, so let's make the error message more generic to avoid
confusing people. While we're at it, remove the comment that causes the
same confusion.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38162>
2025-11-19 09:10:15 +00:00
Erik Faye-Lund 6f2b8c3f61 mesa/st: do not enable EXT_texture_buffer_object with rgba only
GL_EXT_texture_buffer_object requires support for alpha, luminance,
luminance-alpha and intensity formats. If we can't support those, we
can't enable the extension.

Fixes: 45ca7798dc ("glsl: handle interactions between EXT_gpu_shader4 and texture extensions")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38162>
2025-11-19 09:10:15 +00:00
Erik Faye-Lund 9d5e0c1ad2 mesa/main: correct formatquery error-handling
Most of the time, we remember to check for both extensions. But in one
case, it seems we forgot the GLES extension. Whoops.

Let's switch to a helper here, so we don't have to repeat the logic over
and over again.

Fixes: b4c0c514b1 ("mesa: add OES_texture_buffer and EXT_texture_buffer support")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38162>
2025-11-19 09:10:14 +00:00
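The helper idea from 9d5e0c1ad2 amounts to centralizing an "either extension" test. The sketch below is a guess at the general shape only: the struct, the choice of the two flags, and both function names are hypothetical and are not the Mesa API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the context's extension table. */
struct extension_flags {
   bool ARB_texture_buffer_object; /* desktop GL variant */
   bool OES_texture_buffer;        /* GLES variant */
};

/* One helper that every call site shares, so no variant is forgotten. */
static bool has_texture_buffer(const struct extension_flags *ext)
{
   return ext->ARB_texture_buffer_object || ext->OES_texture_buffer;
}

/* The kind of open-coded check that misses the GLES case. */
static bool has_texture_buffer_open_coded(const struct extension_flags *ext)
{
   return ext->ARB_texture_buffer_object;
}
```

On a GLES-only context the open-coded check reports no support while the shared helper reports support, which is the class of mistake the message describes.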
Samuel Pitoiset 7c9e5b4c1c radv: remove unreachable code for prefetch in radv_cs_emit_cp_dma()
CP DMA prefetches are implemented with a separate function.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38449>
2025-11-19 08:03:38 +00:00
Samuel Pitoiset 60d438e517 radv: always use MALL for CP DMA operations on GFX12
CP DMA isn't coherent with L2 on GFX12, but {SRC,DST}_ADDR_TC_L2 means
MALL.

Only small buffers are using copy/fill CP DMA operations, so this
shouldn't have much effect.

Found by inspection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38449>
2025-11-19 08:03:38 +00:00
Samuel Pitoiset b2a13ce92c radv/tests: require drm-shim and use it instead of RADV_FORCE_FAMILY
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38507>
2025-11-19 07:11:05 +00:00
Samuel Pitoiset 8fd91a1ee9 ci: build drm-shim for RADV tests in debian-vulkan
RADV tests will require AMDGPU drm-shim.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38507>
2025-11-19 07:11:05 +00:00
David Rosca 1f83e73145 radeonsi/vcn: Reduce allocated size for pre-encode recon pics
We use 4x downscale for pre-encode, so we don't need full size
pre-encode reconstructed pictures.

Cc: mesa-stable
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38303>
2025-11-19 05:06:33 +00:00
Yiwei Zhang a49b7adad8 venus: add error log coverage for virtgpu backend
Make life easier for CI debugging, remote debugging, and any kind of bug
report inspection. This has long been needed.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38443>
2025-11-19 04:28:49 +00:00
Yiwei Zhang 0afc408cb9 venus: properly fix the blob mem mapping size
There's a single underlying bo mapping shared by the initial alloc here
and the later import of the same. The mapping size has to be initialized
with the real size of the created blob resource, since the app can query
the exported native handle size for re-import (e.g. the dma-buf size via lseek).

Similar to virtgpu_bo_create_from_device_memory, the app can do multiple
imports with different sizes for suballocation. So on the initial
import, the mapping size has to be initialized with the real size of the
backing blob resource.

Backport-to: 25.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38443>
2025-11-19 04:28:49 +00:00
Yiwei Zhang c259ea24ee venus: avoid re-imported dma-buf having a larger map size
If the allocation originates from the same instance, the tracker map
size follows the allocationSize. After export and re-import, mapping the
whole dma-buf can exceed the original map size. This change backs out
the offending changes.

Test: dEQP-VK.api.external.memory.*.suballocated.host_visible.*
Fixes: 442f242a49 ("venus: requests whole blob mem size for non-dedicated import")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38443>
2025-11-19 04:28:48 +00:00
Qiang Yu a6bf07e7c2 dri: avoid sending too many present requests when the app starts or pauses
Found when running glxgears with vblank enabled and the modesetting DDX.
glxgears sends many present requests at startup, but most of them
receive a complete event with skip mode. This makes glxgears report
~75fps on a 60Hz monitor in its first measurement.
This change reduces it to 60fps.

The Vulkan X11 WSI does not have this problem, as it waits for the first
present request's complete event before sending the second present request.

How the problem happens:
1. The client sends present request 1 with target msc = 1.
2. The server's current msc is 100, so it finds request 1
   outdated and queues it for vblank with target msc = 101.
3. The client sends present request 2 with target msc = 2.
4. The server's current msc is still 100, so it finds request 2
   outdated and queues it with target msc = 101. It also finds
   that request 1 will be overridden, so it marks it as skipped
   and sends an idle notify for it.
5. The client gets the idle notify for request 1 and reuses the
   request 1 buffer as the new back buffer to send present
   request 3.
6. This keeps going until the client sends present request N and
   the server finally processes the vblank queue before msc 101
   arrives and sends complete events for all these requests back
   to the client.

Reviewed-by: Michel Dänzer <michel@daenzer.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38178>
2025-11-19 10:01:50 +08:00
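The sequence in a6bf07e7c2 can be modeled with a toy server that keeps one pending request per vblank. This is a simplified simulation written for this listing; the types and functions are hypothetical and do not mirror the X server or Mesa code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct present_request {
   uint64_t target_msc;
   bool skipped; /* set when a later request overrides this one */
};

struct server {
   uint64_t current_msc;
   struct present_request *pending; /* request waiting for next vblank */
};

/* Queue one request the way steps 2 and 4 of the message describe. */
static void server_present(struct server *s, struct present_request *r)
{
   /* An outdated target is bumped to the next vblank. */
   if (r->target_msc <= s->current_msc)
      r->target_msc = s->current_msc + 1;

   /* An earlier request for the same vblank gets overridden. */
   if (s->pending && s->pending->target_msc == r->target_msc)
      s->pending->skipped = true;

   s->pending = r;
}

/* Count how many requests ended up skipped. */
static size_t count_skipped(const struct present_request *reqs, size_t n)
{
   size_t skipped = 0;
   for (size_t i = 0; i < n; i++)
      skipped += reqs[i].skipped ? 1 : 0;
   return skipped;
}
```

Feeding N requests with stale targets into this model before the msc advances leaves N - 1 of them skipped, which is the inflated frame count the message attributes to glxgears at startup.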