214701 Commits

Author SHA1 Message Date
Konstantin Seurer
c4aee84426 radv: Add re-format commit to .git-blame-ignore-revs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38336>
2025-11-12 07:55:36 +00:00
Samuel Pitoiset
0dba538643 radv/meta: fuse depth/stencil aspects copy with the GFX path
Depth/stencil copies on graphics are twice as fast now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:33 +00:00
Samuel Pitoiset
9d3dd174b8 radv/meta: rework radv_meta_nir_texel_fetch_build_func
This adds a binding parameter that will be used for fused depth/stencil
copies.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:33 +00:00
Samuel Pitoiset
332f881375 radv/meta: simplify aspect/formats in radv_gfx_copy_image()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:32 +00:00
Samuel Pitoiset
cd59db45f9 radv/meta: simplify radv_gfx_copy_memory_to_image() even more
Selecting formats can be simplified.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:32 +00:00
Samuel Pitoiset
ed05c3fc31 radv/meta: remove multiple aspects in radv_gfx_copy_memory_to_image()
Only one aspect at any time is valid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:31 +00:00
Samuel Pitoiset
a1884dc737 radv/meta: remove radv_meta_blit2d_rect
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:31 +00:00
Samuel Pitoiset
1319b2bef6 radv/meta: split radv_meta_blit2d() into two separate functions
It's more code but it's definitely easier to read and it will allow us
to do more cleanups/optimizations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:30 +00:00
Samuel Pitoiset
bb3f69fefe radv/meta: remove useless blit2d_src_temps
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38139>
2025-11-12 07:35:29 +00:00
Andy Hsu
d226c0d97d u_trace: remove redundant char* to string conversion (v2)
Add a string length parameter to the set_name() and
set_value() functions to remove the conversion from
char* to std::string, which takes extra work such as
calling strlen() to compute the string length.

From the callback sampling in the perfetto tracing,
the ratio of trace_payload_as_extra_intel_end_draw_indexed
to intel_ds_end_draw_indexed drops from 63.80% to 59.65%
with this change.

v2: Add the data of the callback sampling to the description.

Signed-off-by: Andy Hsu <hwandy@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38073>
2025-11-12 06:17:16 +00:00
Aitor Camacho
93460e969e docs,kk: Add KosmicKrisp documentation
Adds build instructions and workarounds documentation.
The workarounds documentation only covers the biggest offenders;
there are probably many more in the code that have yet to be
documented.

Reviewed-by: Arcady Goldmints-Orlov <arcady@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38232>
2025-11-12 04:23:59 +00:00
Faith Ekstrand
f187b537b5 pan: Use nir_lower_point_size for the float16 conversion
This is more robust than smashing the variable to mediump and then
asking for mediump to be lowered later.  It's also faster because it
only involves one compiler pass, not two.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38379>
2025-11-12 01:34:36 +00:00
Faith Ekstrand
6ee4ea5ea3 nir: Add a type parameter to nir_lower_point_size()
On Mali, we need to not only clamp but also convert to float16 on Valhall+.
We could have a separate pass for this but it fits in nicely with the
rest of nir_lower_point_size() so we might as well put it there.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38379>
2025-11-12 01:34:36 +00:00
Sviatoslav Peleshko
5af8abbf8b driconf: Add vertex_program_default_out option for Penumbra: Overture
Penumbra's vertex program Diffuse_EnvMap_Reflect_vp.cg produces 3-component
texture coordinates and primitive colors while using the FF fragment
program. Add this WA to fix the misrenderings.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14170
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38295>
2025-11-11 22:16:46 +00:00
Sviatoslav Peleshko
f03432c81a mesa,driconf: Add WA to initialize vertex program outputs to vec4(0,0,0,1)
Per the ARB_vertex_program spec, result registers are 4-component and initially
undefined, and the FF fragment program expects its inputs to be
4-component too. So, if the client's vertex program does not write the
whole vector, it will cause misrenderings unless the same client also
supplies a fragment program that expects fewer than 4 components.

This commit adds a workaround that initializes results to vec4(0, 0, 0, 1)
which seems to be an expected behavior for such clients.

Cc: mesa-stable
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38295>
2025-11-11 22:16:46 +00:00
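The effect of initializing results to vec4(0, 0, 0, 1) can be illustrated with a small model. This is only a sketch in Python, not Mesa code: `DEFAULT_RESULT` and `write_components` are hypothetical names, and the model assumes a program writes the leading components of a register and leaves the rest untouched.

```python
# Illustrative model only: these names are hypothetical, not Mesa APIs.
DEFAULT_RESULT = (0.0, 0.0, 0.0, 1.0)  # value the workaround initializes results to


def write_components(register, values):
    """Overwrite the leading components of a 4-component register.

    Components the program does not write keep their previous value.
    """
    if len(values) > 4:
        raise ValueError("result registers have at most 4 components")
    return tuple(values) + tuple(register[len(values):])


# A vertex program that writes only a 3-component texture coordinate:
# the fourth component falls back to the initialized 1.0.
texcoord = write_components(DEFAULT_RESULT, (0.25, 0.5, 0.75))
print(texcoord)
```

Without the initialization, the fourth component in this model would be whatever the register happened to contain, which matches the "initially undefined" wording in the commit message.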
Eric Engestrom
f30e5ff44b ci: uprev vkd3d
03cca4cd97...4acd227131

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38370>
2025-11-11 20:15:21 +00:00
Faith Ekstrand
51a68ecc87 panvk: Optimize in the preprocess hook
NIR is actually pretty good at optimizing UBO, SSBO, and shared memory
access but in order to do so, we actually have to run the optimizations
before we lower it all.  Same for I/O.  By doing all our lowering in
panvk before we ever run the optimization loop, we risk hampering it
significantly.

Ignoring loop changes (several get unrolled now), fossil-db on Sascha
Willems demos and a few others looks like:

    Instrs: 189054 -> 187802 (-0.66%); split: -0.67%, +0.01%
    CodeSize: 1756160 -> 1747072 (-0.52%); split: -0.52%, +0.01%
    Estimated normalized CVT cycles: 771.367106999997 -> 766.0311719999971 (-0.69%); split: -1.05%, +0.36%
    Estimated normalized SFU cycles: 1407.21875 -> 1406.9375 (-0.02%); split: -0.03%, +0.01%
    Estimated normalized Load/Store cycles: 17477.0 -> 16917.0 (-3.20%)
    Maximum number of threads: 1257 -> 1213 (-3.50%); split: +0.08%, -3.58%
    Number of hardware loops: 283 -> 278 (-1.77%)

    Totals from 186 (19.81% of 939) affected shaders:
    Instrs: 102588 -> 101336 (-1.22%); split: -1.23%, +0.01%
    CodeSize: 834432 -> 825344 (-1.09%); split: -1.10%, +0.02%
    Estimated normalized CVT cycles: 463.226562 -> 457.890627 (-1.15%); split: -1.74%, +0.59%
    Estimated normalized SFU cycles: 1021.84375 -> 1021.5625 (-0.03%); split: -0.05%, +0.02%
    Estimated normalized Load/Store cycles: 8425.0 -> 7865.0 (-6.65%)
    Maximum number of threads: 334 -> 290 (-13.17%); split: +0.30%, -13.47%
    Number of hardware loops: 63 -> 58 (-7.94%)

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
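The headline percentages in the report above can be re-derived from the before/after totals. The snippet below is a minimal sketch, assuming each figure is a plain relative change; `percent_change` is a hypothetical helper, not part of any Mesa tooling, and the `split` values are not derived here.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Totals copied from the first block of the report.
instrs = percent_change(189054, 187802)
load_store = percent_change(17477.0, 16917.0)

# Share of shaders affected, from "186 (19.81% of 939)".
affected_share = 186 / 939 * 100.0

print(f"Instrs: {instrs:.2f}%")
print(f"Load/Store cycles: {load_store:.2f}%")
print(f"Affected shaders: {affected_share:.2f}%")
```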
Faith Ekstrand
1a9c7f8c8a panvk: Only lower outputs to temporaries
We need to lower outputs to get rid of output reads and so that we can
fix up layer writes on Bifrost.  However, there's really no point in
lowering reads besides moving them to the top.  Even then, NIR can
probably copy propagate the copies and we'll end up reading straight
from the input variable anyway.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
Faith Ekstrand
a8b6213983 panvk: Lower copy_deref and indirect derefs before nir_lower_io
Neither nir_lower_io() nor nir_lower_indirect_derefs() know what to do
with copy_deref so we need to get rid of those first.  Also, there are
some NIR passes which can insert more copy_deref or propagate an
indirect load to the I/O variable so we want to lower those away right
before lowering I/O.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
Faith Ekstrand
d6dc0ea5ae panvk: Split var copies and lower local vars early
These two passes are a prerequisite for basically anything that
optimizes on variables.

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
Faith Ekstrand
586e1ac2b8 pan/compiler: Expose the bifrost optimization loop
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
Faith Ekstrand
0e9fcb33c3 nir: Add a couple panfrost sysvals to divergence analysis
Fixes: 2af6e4beeb ("pan: Don't pretend we support load_{vertex_id_zero_base,first_vertex}")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
2025-11-11 17:38:36 +00:00
Daniel Schürmann
5682e39e6b amd: enable load/store_shared2_amd for GFX6
Totals from 1509 (2.43% of 62200) affected shaders: (Pitcairn)

MaxWaves: 8078 -> 8057 (-0.26%); split: +0.09%, -0.35%
Instrs: 977182 -> 951746 (-2.60%); split: -2.62%, +0.02%
CodeSize: 4951468 -> 4758192 (-3.90%); split: -3.92%, +0.01%
SGPRs: 76704 -> 76696 (-0.01%)
VGPRs: 81092 -> 81068 (-0.03%); split: -0.34%, +0.31%
Latency: 11663237 -> 11526070 (-1.18%); split: -1.19%, +0.01%
InvThroughput: 6198904 -> 6114851 (-1.36%); split: -1.43%, +0.07%
VClause: 26656 -> 26655 (-0.00%); split: -0.05%, +0.05%
SClause: 22304 -> 22307 (+0.01%); split: -0.03%, +0.04%
Copies: 107503 -> 109564 (+1.92%); split: -0.23%, +2.15%
Branches: 22917 -> 22918 (+0.00%)
PreSGPRs: 42246 -> 42242 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 64561 -> 64761 (+0.31%); split: -0.01%, +0.32%
VALU: 600285 -> 601139 (+0.14%); split: -0.26%, +0.40%
SALU: 130622 -> 130851 (+0.18%); split: -0.16%, +0.33%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37682>
2025-11-11 17:12:17 +00:00
Daniel Schürmann
9abbcbc00e nir/opt_load_store_vectorize: don't add negative offsets to load/store_shared2_amd
By hoisting the low address instead, we can make use of these instructions on GFX6.

Totals from 3 (0.00% of 79839) affected shaders: (Navi48)

Instrs: 3768 -> 3776 (+0.21%); split: -0.03%, +0.24%
CodeSize: 20024 -> 20048 (+0.12%); split: -0.04%, +0.16%
Latency: 16093 -> 16198 (+0.65%)
InvThroughput: 3868 -> 3864 (-0.10%)
VClause: 97 -> 93 (-4.12%)
VALU: 2333 -> 2331 (-0.09%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37682>
2025-11-11 17:12:15 +00:00
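The idea of hoisting the low address can be sketched as follows. This is an assumption-laden illustration, not the pass itself: `pair_from_low_address` is a hypothetical name, and the way the real pass represents addresses and offsets is not shown in the commit message.

```python
def pair_from_low_address(addr_a, addr_b):
    """Express two addresses as a shared base plus two non-negative offsets.

    Using the lower address as the base guarantees neither offset is negative.
    """
    base = min(addr_a, addr_b)
    return base, addr_a - base, addr_b - base


# If the first access sits at the higher address, basing the pair on it
# would need an offset of -32; basing it on the lower address avoids that.
base, off_a, off_b = pair_from_low_address(96, 64)
print(base, off_a, off_b)
```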
Christian Gmeiner
688718be8b mesa: OES_texture_stencil8 requires OpenGL ES 3.1
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38360>
2025-11-11 15:59:06 +00:00
Valentine Burley
02986c9cec ci/lava: Use a660_zap.mbn from linux-firmware
This is now available in linux-firmware, so we can update the
gfx-ci/firmware archive to include the zap shader for a660 instead of
manually injecting it in LAVA.

e16373de80
6bff1a1967

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38368>
2025-11-11 13:16:18 +00:00
Tapani Pälli
12b2476b40 anv: throw anv_finishme warnings only on debug builds
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14259
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38369>
2025-11-11 12:51:32 +00:00
Samuel Pitoiset
0d9d45db4e radv: add vk_wsi_disable_unordered_submits and enable for GTK
GTK is missing a semaphore between QueueSubmit() and QueuePresent()
causing the WSI submit to be "unordered" and to immediately signal the
semaphores (because it's missing a wait semaphore in QueuePresent()).

The workaround is to disable unordered WSI submits until GTK fixes it
properly.

Cc: "25.3"
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14087
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38351>
2025-11-11 12:13:41 +00:00
Daniel Schürmann
668259ef0b aco/scheduler: move clauses through RAR dependencies
For simplicity, we limit this feature to only one RAR-dependency per clause.
This allows us to quickly correct the register demand changes that occur by
switching the kill flags.

Totals from 5861 (7.34% of 79839) affected shaders: (Navi48)
Instrs: 4891340 -> 4883789 (-0.15%); split: -0.21%, +0.06%
CodeSize: 25556612 -> 25527244 (-0.11%); split: -0.16%, +0.05%
VGPRs: 347044 -> 347140 (+0.03%); split: -0.13%, +0.16%
Latency: 32697095 -> 32642428 (-0.17%); split: -0.25%, +0.08%
InvThroughput: 4975909 -> 4975086 (-0.02%); split: -0.06%, +0.05%
VClause: 102152 -> 93852 (-8.13%); split: -8.22%, +0.10%
SClause: 101232 -> 101205 (-0.03%); split: -0.03%, +0.00%
Copies: 305189 -> 305651 (+0.15%); split: -0.56%, +0.71%
Branches: 87032 -> 87045 (+0.01%); split: -0.00%, +0.02%
VALU: 2776634 -> 2777097 (+0.02%); split: -0.06%, +0.08%
SALU: 662066 -> 660379 (-0.25%); split: -0.26%, +0.01%
VOPD: 4801 -> 4800 (-0.02%); split: +1.21%, -1.23%

Totals from 5680 (7.12% of 79825) affected shaders: (Vangogh)
MaxWaves: 111282 -> 111290 (+0.01%)
Instrs: 4955907 -> 4950709 (-0.10%); split: -0.15%, +0.04%
CodeSize: 26026264 -> 26014272 (-0.05%); split: -0.10%, +0.05%
VGPRs: 320784 -> 320776 (-0.00%); split: -0.03%, +0.03%
Latency: 35645457 -> 35584438 (-0.17%); split: -0.32%, +0.15%
InvThroughput: 8233912 -> 8236524 (+0.03%); split: -0.10%, +0.13%
VClause: 107017 -> 96804 (-9.54%); split: -9.69%, +0.15%
SClause: 98633 -> 98592 (-0.04%); split: -0.05%, +0.01%
Copies: 394041 -> 393584 (-0.12%); split: -0.52%, +0.40%
Branches: 120235 -> 120231 (-0.00%); split: -0.02%, +0.01%
VALU: 3183571 -> 3183114 (-0.01%); split: -0.06%, +0.05%
SALU: 735546 -> 734143 (-0.19%); split: -0.20%, +0.01%

Totals from 2507 (3.96% of 63370) affected shaders: (Vega10)

MaxWaves: 13643 -> 13637 (-0.04%)
Instrs: 1496453 -> 1496135 (-0.02%); split: -0.11%, +0.09%
CodeSize: 7777880 -> 7776608 (-0.02%); split: -0.09%, +0.07%
VGPRs: 134164 -> 134104 (-0.04%); split: -0.11%, +0.07%
Latency: 17465181 -> 17483075 (+0.10%); split: -0.36%, +0.47%
InvThroughput: 8830470 -> 8851751 (+0.24%); split: -0.09%, +0.33%
VClause: 42012 -> 38825 (-7.59%); split: -8.00%, +0.42%
SClause: 34586 -> 34549 (-0.11%); split: -0.12%, +0.01%
Copies: 137896 -> 137668 (-0.17%); split: -0.86%, +0.69%
VALU: 1092468 -> 1092240 (-0.02%); split: -0.11%, +0.09%
SALU: 132956 -> 132569 (-0.29%); split: -0.34%, +0.05%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38135>
2025-11-11 11:31:52 +00:00
Daniel Schürmann
65ba8a0e8b aco/scheduler: refactor downwards dependency check
We can also ignore killed operands when checking for RAR dependencies
as these cannot appear later anymore.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38135>
2025-11-11 11:31:52 +00:00
Daniel Schürmann
ce3cc03153 aco/scheduler: use hashmap for RAR_dependencies
Store information about the (relative) position of the RAR dependency.
This will allow us to correct for register-demand changes when scheduling across.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38135>
2025-11-11 11:31:52 +00:00
Daniel Schürmann
6c0dd8164f aco/scheduler: remove MoveState::RAR_dependencies_clause
Since clauses are now moved as a batch, this can easily be derived from RAR_dependencies.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38135>
2025-11-11 11:31:52 +00:00
Daniel Schürmann
5ef47ba231 aco/scheduler: assert that the register demand stays within pre-determined bounds
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38135>
2025-11-11 11:31:52 +00:00
Daniel Schürmann
82ba730994 aco/scheduler: remove unused include
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38135>
2025-11-11 11:31:51 +00:00
Kenneth Graunke
9ffae42975 brw: Store brw_urb_inst::offset in bytes on Xe2
Xe2 uses byte offsets rather than OWord offsets.  We've been storing the
per-slot offsets in bytes on Xe2 for a while, but kept the global offset
immediate in OWords for some reason, choosing to lower it during logical
send lowering.

This patch makes both offsets (global immediate, per-slot) in the same
units, so they could be added together if necessary without scaling.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38343>
2025-11-11 10:55:44 +00:00
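The unit mismatch described above can be sketched in a few lines. This is an illustrative model rather than the brw code: the function names are hypothetical, and the only hardware fact relied on is that an OWord is 16 bytes.

```python
OWORD_BYTES = 16  # an OWord is 128 bits


def combined_offset_mixed_units(global_owords, per_slot_bytes):
    """Global offset in OWords, per-slot offset in bytes: scaling is required."""
    return global_owords * OWORD_BYTES + per_slot_bytes


def combined_offset_bytes(global_bytes, per_slot_bytes):
    """Both offsets in bytes: they can simply be added together."""
    return global_bytes + per_slot_bytes


# The same location expressed both ways.
print(combined_offset_mixed_units(2, 8))
print(combined_offset_bytes(32, 8))
```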
Kenneth Graunke
cde3a34a43 brw: Use nir_intrinsic_[set_]base rather than poking at const_index[0]
Much clearer, especially since we're dealing with at least four
different kinds of intrinsics.  These helpers were introduced years ago,
but probably didn't exist when we first wrote this code.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38343>
2025-11-11 10:55:43 +00:00
Kenneth Graunke
439c156831 brw: Add an assertion that writemasks can be fully ignored
I noticed that our backend was completely ignoring writemasks, despite
them appearing on many of the intrinsics we're implementing.

Rhys Perry pointed out that nir_lower_mem_access_bitsizes is removing
all non-trivial writemasking today, so ssbo/global/shared/scratch/etc.
stores should only ever see all components enabled.  Which means what
we're doing is legitimate, if non-obvious.  Add an assert to make it
obvious.

Thanks a lot to Rhys for helping me rediscover what made this work.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38343>
2025-11-11 10:55:42 +00:00
Kenneth Graunke
6151eb4372 nir: Drop writemask from all Intel memory store intrinsics
The backend has been fully ignoring all writemasks for a long time,
so it really doesn't make sense to have them on our custom intrinsics.

I'm not sure they even make sense for some of the block intrinsics.

Also, the store_ssbo -> store_ssbo_intel pass was not setting writemask
at all, leaving it at the default value of 0 (aka write nothing, if it
had been respected...)

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38343>
2025-11-11 10:55:41 +00:00
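To show why a default writemask of 0 would mean "write nothing" if it were respected, here is a small model of per-component write masking. It is a sketch only; `full_writemask` and `apply_store` are hypothetical helpers, not NIR functions.

```python
def full_writemask(num_components):
    """Mask with one bit set per component, e.g. 0b1111 for a vec4."""
    return (1 << num_components) - 1


def apply_store(dest, values, writemask):
    """Store values into dest, but only for components whose mask bit is set."""
    return [
        value if writemask & (1 << i) else old
        for i, (old, value) in enumerate(zip(dest, values))
    ]


memory = [0, 0, 0, 0]
print(apply_store(memory, [1, 2, 3, 4], 0))                  # mask 0: nothing written
print(apply_store(memory, [1, 2, 3, 4], full_writemask(4)))  # all components written
```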
Roland Scheidegger
d6fd8b4201 llvmpipe: do bounds checking for shared memory
Just compare against the size that was declared.
This is probably overkill. I couldn't figure out what vulkan says wrt
OOB access of shared memory. D3D however (which is very strict about
these things) says that for TGSM writes the entire contents of the TGSM
becomes undefined, for reads the result is undefined. Hence, rather
than masking out such accesses, to avoid the segfaults it would be
enough to just clamp the offsets to valid values.
nir doesn't seem easily able to tell us if an access is guaranteed
in-bound (unlike for ssbo access), so assume always potentially OOB.

v2: fix rusticl - for cl we don't know the shared size at compilation
time, this is only provided at launch_grid() time, the nir shader info
shared_size might be zero. Hence pass through the size via cs jit
context, there already actually was a member in there which looks
like it was intended for that (interestingly enough, the cs jit context
was actually unused, since resources are passed elsewhere nowadays).

Reviewed-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38307>
2025-11-11 09:28:30 +00:00
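The two options discussed above, masking out an out-of-bounds access versus clamping its offset, can be contrasted with a toy model. This is a sketch under assumptions, not llvmpipe code: the helper names are hypothetical, shared memory is modelled as a Python `bytearray`, and a masked-out load is assumed to return zeros.

```python
def load_masked(memory, offset, access_size):
    """Bounds-checked load: an out-of-bounds access is skipped and yields zeros."""
    if offset < 0 or offset + access_size > len(memory):
        return bytes(access_size)
    return bytes(memory[offset:offset + access_size])


def clamp_offset(offset, access_size, shared_size):
    """Clamp offset so the whole access stays inside the shared memory block."""
    max_offset = max(shared_size - access_size, 0)
    return min(max(offset, 0), max_offset)


def load_clamped(memory, offset, access_size):
    """Clamped load: never faults, but may read from a different location."""
    safe = clamp_offset(offset, access_size, len(memory))
    return bytes(memory[safe:safe + access_size])


shared = bytearray(range(16))          # 16 bytes of declared shared memory
print(load_masked(shared, 14, 4))      # crosses the end: masked out
print(load_clamped(shared, 14, 4))     # crosses the end: offset clamped to 12
```

Both variants avoid touching memory outside the declared size. They differ only in what an out-of-bounds access observes, which is exactly the part the commit message describes as undefined under D3D rules.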
Erik Faye-Lund
4490275332 pvr: rework pds_state array length logic
This attempts to avoid needing hwdefs in headers. It's not perfect, but
hopefully a step in the right direction.

Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>
2025-11-11 10:13:14 +01:00
Erik Faye-Lund
1eab712245 pvr: move static_asserts to source-files
This avoids needless dependencies on HW-defs in header files.

Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>
2025-11-11 10:13:14 +01:00
Erik Faye-Lund
b2b8ec1a4c pvr: move non-rogue helpers to pvr_hw_utils.h
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>
2025-11-11 10:13:14 +01:00
Erik Faye-Lund
02b5e78f0d pvr: rename rogue_get_slc_cache_line_size
This isn't really rogue-specific, so let's rename it to not cause any
confusion.

Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>
2025-11-11 10:13:14 +01:00
Erik Faye-Lund
e7fb4a9948 pvr: factor out pvr_sampler
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>
2025-11-11 10:13:14 +01:00
Erik Faye-Lund
cf08978985 pvr: break out pvr_instance and pvr_physical_device
These files shouldn't be per-arch, so break them out to their own
modules before we start making things multi-arch.

Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>
2025-11-11 10:13:11 +01:00
Erik Faye-Lund
4d0ab70caa pvr: move queue function to pvr_queue.c
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>
2025-11-11 10:13:11 +01:00
Erik Faye-Lund
5e400e7449 pvr: remove needless include
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>
2025-11-11 10:13:11 +01:00
Erik Faye-Lund
428fadd71f pvr: remove unused macros
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38352>
2025-11-11 10:13:11 +01:00
Tapani Pälli
2741ddd75a anv: fix issues found with indirect data stride
Use tristate for the aligned setting, otherwise it is always
first disabled which contributes to the condition if we set the
new stride active.

v2: set ByteStride in dword units and take secondary cmdbuf
    into account (Lionel)

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38349>
2025-11-11 05:05:43 +00:00
Alyssa Rosenzweig
997b3ebbdb poly: fix cull distance
More fallout from strict NIR validation but easy to fix. I hit this when
attempting to CTS changes for parent_instr.

Closes: #14245
Fixes: 2f6b4803ab ("nir/validate: expand IO intrinsic validation with nir_io_semantics")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38356>
2025-11-11 01:34:24 +00:00