11033 Commits

Author SHA1 Message Date
Simon Perretta 880098158d nir/nir_lower_calls_to_builtins: trivially handle IA64 mangled functions
Using __attribute__((overloadable)) when declaring nir ops with
variable-width params in clc results in their symbol names being (IA64)
mangled; this change enables the mangled names to be handled when later
lowering the calls.

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36873>
2025-09-02 16:04:19 +00:00
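To illustrate the mangling mentioned in 880098158d: under the Itanium C++ ABI (the scheme the message calls IA64), a simple non-nested function name is encoded as `_Z`, the decimal length of the identifier, the identifier itself, and then the parameter type codes. The following is a minimal sketch of recovering the base name, assuming only that simple form. `base_name` is a hypothetical helper written for this illustration, not the code in nir_lower_calls_to_builtins.

```python
def base_name(symbol: str) -> str:
    """Return the unmangled identifier for a simple Itanium-mangled symbol.

    Handles only the `_Z<length><identifier><param codes>` form; nested
    names, templates and substitutions are out of scope for this sketch.
    Symbols that are not mangled are returned unchanged.
    """
    if not symbol.startswith("_Z"):
        return symbol
    rest = symbol[2:]
    digits = 0
    while digits < len(rest) and rest[digits] in "0123456789":
        digits += 1
    if digits == 0:
        # e.g. `_ZN...` (nested name): not handled here
        return symbol
    length = int(rest[:digits])
    name = rest[digits:digits + length]
    if len(name) != length:
        return symbol  # truncated or malformed input
    return name
```

With this sketch, `base_name("_Z3fooi")` returns `foo`, and an unmangled name such as `nir_foo` passes through unchanged.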
Robert Mader 1772380307 nir: Fixup 10/12 bit SW decoder YCbCr formats
The highest possible values that can be represented with
16/12/10 bits are 65535/4095/1023, not 65536/4096/1024.
In order to ensure 1023 maps to 65535 in the Sx10 case
we thus need to multiply by 65535 / 1023 ~= 64.06158
instead of 64.

Fixes: a166d7609f ("gles: Add support for 10/12/16 bit SW decoder YCbCr formats")
Suggested-by: Benjamin Otte <otte@redhat.com>
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37077>
2025-09-02 09:08:51 +00:00
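The arithmetic behind 1772380307 can be checked directly. This is a minimal sketch, assuming a 10-bit sample that must be stretched to the full 16-bit range as the message describes; the function names are illustrative and do not come from the patch.

```python
MAX_10 = (1 << 10) - 1   # 1023, largest 10-bit value
MAX_16 = (1 << 16) - 1   # 65535, largest 16-bit value

def scale_old(v10: int) -> float:
    """Old factor: multiply by 64 (= 65536 / 1024)."""
    return v10 * 64.0

def scale_new(v10: int) -> float:
    """Fixed factor: multiply by 65535 / 1023 (about 64.06158)."""
    return v10 * (MAX_16 / MAX_10)

print(scale_old(MAX_10))         # 65472.0, short of full scale
print(round(scale_new(MAX_10)))  # 65535
```

With the old factor the brightest 10-bit sample lands 63 steps below full scale, which is the error the commit corrects.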
Job Noorman e78bd88a06 nir/opt_offsets: add callback to set need_nuw per intrinsic
Whether need_nuw is used is currently decided in two different ways:
- globally through the allow_offset_wrap option;
- per intrinsic but hard-coded in opt_offsets.

Make this more flexible by creating a callback that is called per
intrinsic. This will allow backends to decide, on a per-intrinsic basis,
whether need_nuw is needed.

Note that the main use case for ir3 is to add support for opt_offsets
for global memory accesses. Other intrinsics don't need need_nuw but
global memory accesses do.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37114>
2025-09-01 11:25:07 +00:00
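The per-intrinsic callback described in e78bd88a06 can be pictured roughly as follows. This is a sketch under stated assumptions: `OffsetOptions`, `need_nuw_cb`, `global_only`, and the intrinsic name strings are hypothetical stand-ins rather than Mesa's actual structs or signatures, and the fallback to the global option is assumed, not taken from the patch.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class OffsetOptions:
    """Hypothetical stand-in for the pass options; not Mesa's struct."""
    allow_offset_wrap: bool = False
    # Called once per intrinsic; receives the intrinsic name and cb_data.
    need_nuw_cb: Optional[Callable[[str, object], bool]] = None
    cb_data: object = None

def need_nuw(intrinsic: str, opts: OffsetOptions) -> bool:
    """Decide whether folding an offset requires a no-unsigned-wrap add."""
    if opts.need_nuw_cb is not None:
        return opts.need_nuw_cb(intrinsic, opts.cb_data)
    # Assumed fallback: the global option decides for every intrinsic.
    return not opts.allow_offset_wrap

def global_only(intrinsic: str, cb_data: object) -> bool:
    """Example backend policy: only global memory accesses need nuw."""
    return intrinsic in cb_data

opts = OffsetOptions(allow_offset_wrap=True,
                     need_nuw_cb=global_only,
                     cb_data={"load_global", "store_global"})
print(need_nuw("load_global", opts))  # True
print(need_nuw("load_shared", opts))  # False
```

Passing the policy as a callback keeps the decision in the backend, which matches the ir3 use case in the message: global memory accesses need need_nuw while other intrinsics do not.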
Job Noorman bc03086320 nir/opt_offsets: rename max_offset_data to cb_data
We want to add more callbacks and pass the same data.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37114>
2025-09-01 11:25:07 +00:00
Rhys Perry 2d0f93631c nir/divergence: make smem load_global_amd uniform
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>
2025-08-30 14:55:13 -04:00
Marek Olšák 25294f3dd4 nir/opt_move_to_top: handle load_global_amd with ACCESS_SMEM_AMD
to match the behavior of load_smem_amd

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>
2025-08-30 14:55:13 -04:00
Marek Olšák 48050dbef6 nir/opt_sink: handle load_global_amd
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>
2025-08-30 14:55:13 -04:00
Marek Olšák 219fcd4b32 nir/opt_call: handle load_global(_amd) with SPECULATE as rematerializable
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>
2025-08-30 14:55:13 -04:00
Ashley Smith d9b388af27 mesa: Fix support for GL_EXT_shader_clock
Missing 32-bit entry point in GLSL

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 2ce20170 ("mesa: Add support for GL_EXT_shader_clock")
Signed-off-by: Ashley Smith <ashley.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36041>
2025-08-29 11:09:04 +00:00
Faith Ekstrand 26e32417b9 nir: Add an option to make lower_phis_to_regs_block() less clever
Right now it tries to place reg_write instructions as far up the
predecessor chain as possible.  This is useful for a bunch of the passes
that call it since it ensures they don't get placed in dead blocks or in
single successors and things like that.  But it screws up NAK's control
flow lowering so we need the option to turn it off and make the pass
place the reg_write instructions in the most obvious place possible.

Fixes: b013d54e4f ("nak/lower_cf: Flag phis as convergent when possible")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36914>
2025-08-29 01:24:56 +00:00
Dave Airlie c38170452d nir: add nir_intrinsic_cmat_load_shared_nv
This maps to NAK's OpLdsm

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36363>
2025-08-28 16:09:07 +02:00
Georg Lehmann 3b06824e4c nir/opt_algebraic: optimize some post peephole select patterns
Foz-DB GFX1201:
Totals from 208 (0.26% of 80287) affected shaders:
Instrs: 427684 -> 426834 (-0.20%); split: -0.22%, +0.02%
CodeSize: 2232616 -> 2228816 (-0.17%); split: -0.20%, +0.03%
Latency: 3993934 -> 3992726 (-0.03%); split: -0.04%, +0.01%
InvThroughput: 569055 -> 568622 (-0.08%); split: -0.09%, +0.01%
SClause: 12932 -> 12927 (-0.04%)
Copies: 22567 -> 22604 (+0.16%); split: -0.47%, +0.63%
Branches: 7671 -> 7658 (-0.17%)
VALU: 222047 -> 221625 (-0.19%)
SALU: 83954 -> 83815 (-0.17%); split: -0.29%, +0.13%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36938>
2025-08-27 09:45:19 +00:00
Georg Lehmann 395893e16b nir/peephole_select: allows more lowered io
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36938>
2025-08-27 09:45:19 +00:00
Georg Lehmann e270a7480b nir/lower_io: fix boolean output stores
Stores don't have a definition, we have to check the bit size of the source.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13762
Fixes: c217ee8d35 ("nir: Insert b2b1s around booleans in nir_lower_to")
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36966>
2025-08-27 08:46:34 +00:00
Georg Lehmann 047b95a8c3 nir/shrink_vec_array_vars: detect zero init shared memory using constant initializer
More consistent.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36956>
2025-08-27 06:37:41 +00:00
Georg Lehmann edc5bea61e nir/shrink_vec_array_vars: update constant initializer after shrinking
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13751
Fixes: c7df3b4f64 ("nir/shrink_vec_array_vars: allow nir_var_mem_shared")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36956>
2025-08-27 06:37:41 +00:00
Faith Ekstrand a1d5e8bfdb compiler/rust: Fix the DFS loop detection algorithm
The previous algorithm just looked at the dominator's loop header.
However, if you have multiple consecutive loops like:

    function_impl {
        loop {
            // Stuff
        }
        loop {
            // Other stuff
        }
    }

then it will look like the second loop is contained in the first loop
because the first loop's header dominates the second loop.  This isn't
actually what we want.  Instead, we want a node N to be considered part
of a loop with header H if H dominates N and H is reachable from N.

Fixes: 741f7067f1 ("nak: Add loop detection to the CFG")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36524>
2025-08-27 01:20:05 +00:00
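The membership rule from a1d5e8bfdb (N belongs to the loop with header H if H dominates N and H is reachable from N) can be tried on the two-consecutive-loops shape from the message. This is a minimal Python sketch, not the Rust code in the compiler crate; the CFG encoding and all function names are assumptions made for illustration, and every block is assumed to be reachable from the entry.

```python
def dominators(succs, entry):
    """Iterative dataflow: dom[n] = {n} | intersection of dom[p] over preds."""
    nodes = set(succs)
    preds = {n: set() for n in nodes}
    for n, ss in succs.items():
        for s in ss:
            preds[s].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = set(nodes)
            for p in preds[n]:
                new &= dom[p]
            new |= {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

def reaches(succs, src, dst):
    """True if dst can be reached from src along at least one edge."""
    stack, seen = list(succs[src]), set()
    while stack:
        n = stack.pop()
        if n == dst:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(succs[n])
    return False

def loop_members(succs, dom, header):
    """Blocks N such that header dominates N and header is reachable from N."""
    return {n for n in succs
            if header in dom[n] and reaches(succs, n, header)}

# 0: entry, 1: first loop header, 2: first loop body,
# 3: second loop header, 4: second loop body, 5: exit
CFG = {0: [1], 1: [2, 3], 2: [1], 3: [4, 5], 4: [3], 5: []}
DOM = dominators(CFG, 0)

print(sorted(loop_members(CFG, DOM, 1)))  # [1, 2]
print(sorted(loop_members(CFG, DOM, 3)))  # [3, 4]
```

Block 1 dominates blocks 3 and 4, so a dominance-only test would place the second loop inside the first; the reachability condition is what excludes them, because nothing in the second loop can get back to block 1.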
Georg Lehmann d0f4b535fe nir: constant fold txd with 0 ddx/ddy to txl
Foz-DB GFX1201:
Totals from 34 (0.04% of 80287) affected shaders:
Instrs: 3111158 -> 3111076 (-0.00%)
CodeSize: 16345020 -> 16344908 (-0.00%); split: -0.00%, +0.00%
Latency: 15378053 -> 15378063 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 2940485 -> 2940477 (-0.00%); split: -0.00%, +0.00%
VClause: 79940 -> 79941 (+0.00%)
Copies: 228205 -> 228159 (-0.02%)
VALU: 1730040 -> 1729994 (-0.00%)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36967>
2025-08-26 06:19:43 +00:00
Dave Airlie 7a96a928a2 nir: add coop mat flexible dimensions lowering.
This adds a generic lowering pass for coop mat flexible dimensions.

This should be suitable for all drivers that implement coop mat2 flexible dimensions
or even just lowering sw exposed sizes to hw sizes.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36544>
2025-08-25 18:55:08 +00:00
Konstantin Seurer 951b187b95 nir: Use nir_def_block in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746>
2025-08-24 14:03:10 +00:00
Konstantin Seurer 9df7b48d2f nir: Use nir_def_as_* in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746>
2025-08-24 14:03:09 +00:00
Pierre-Eric Pelloux-Prayer e92638b6bf nir/opt_varyings: fix build with PRINT_RELOCATE_SLOT
Fixes: e3d122ed7b ("nir/opt_varyings: completely exclude mediump from type changes")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36411>
2025-08-23 14:44:29 +00:00
Jesse Natalie 5b3756f231 nir: Add missing #include for c99_alloca.h
Fixes: 3dd9a978 ("nir: add new pass nir_lower_io_indirect_loads")
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36940>
2025-08-22 22:33:50 +00:00
Rhys Perry 2d597b6919 nir/load_store_vectorize: use nir_def_num_lsb_zero in calc_alignment
fossil-db (gfx1201):
Totals from 20 (0.03% of 79839) affected shaders:
Instrs: 15370 -> 15251 (-0.77%)
CodeSize: 89764 -> 88952 (-0.90%)
Latency: 150295 -> 149963 (-0.22%)
InvThroughput: 210291 -> 210105 (-0.09%)
Copies: 1337 -> 1320 (-1.27%)
PreVGPRs: 589 -> 590 (+0.17%)
VALU: 7519 -> 7466 (-0.70%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>
2025-08-22 15:45:55 +00:00
Rhys Perry b03eeb12a9 nir/load_store_vectorize: use nir_def_num_lsb_zero in check_for_robustness
fossil-db (gfx1201):
Totals from 499 (0.63% of 79839) affected shaders:
MaxWaves: 14276 -> 14234 (-0.29%)
Instrs: 520883 -> 508159 (-2.44%); split: -2.45%, +0.01%
CodeSize: 2831220 -> 2731080 (-3.54%); split: -3.54%, +0.00%
VGPRs: 27156 -> 27348 (+0.71%)
SpillSGPRs: 360 -> 390 (+8.33%)
Latency: 4473898 -> 4414552 (-1.33%); split: -1.54%, +0.21%
InvThroughput: 494468 -> 493508 (-0.19%); split: -0.62%, +0.43%
VClause: 14211 -> 14060 (-1.06%); split: -1.16%, +0.10%
SClause: 14653 -> 14354 (-2.04%); split: -2.39%, +0.35%
Copies: 36772 -> 37056 (+0.77%); split: -0.65%, +1.42%
Branches: 11502 -> 11486 (-0.14%)
PreSGPRs: 22605 -> 22848 (+1.07%); split: -0.39%, +1.47%
PreVGPRs: 20571 -> 20833 (+1.27%)
VALU: 242982 -> 243151 (+0.07%); split: -0.08%, +0.14%
SALU: 91332 -> 88069 (-3.57%); split: -3.71%, +0.14%
VMEM: 32275 -> 29137 (-9.72%)
SMEM: 26239 -> 22400 (-14.63%)
VOPD: 345 -> 330 (-4.35%)
SClause: 14646 -> 14347 (-2.04%); split: -2.39%, +0.35%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>
2025-08-22 15:45:55 +00:00
Rhys Perry 46da666205 nir/algebraic: allow non-const for iand(iadd()) -> iadd(iand())
fossil-db (gfx1201):
Totals from 596 (0.75% of 79839) affected shaders:
Instrs: 691926 -> 691819 (-0.02%); split: -0.11%, +0.09%
CodeSize: 3675216 -> 3675180 (-0.00%); split: -0.08%, +0.08%
VGPRs: 37464 -> 37452 (-0.03%)
Latency: 8566849 -> 8563162 (-0.04%); split: -0.09%, +0.05%
InvThroughput: 1068038 -> 1063279 (-0.45%); split: -0.46%, +0.01%
VClause: 17859 -> 17897 (+0.21%); split: -0.01%, +0.22%
SClause: 16704 -> 16735 (+0.19%); split: -0.07%, +0.26%
Copies: 45422 -> 45395 (-0.06%); split: -0.15%, +0.09%
PreSGPRs: 24345 -> 24351 (+0.02%)
PreVGPRs: 29121 -> 29128 (+0.02%)
VALU: 349959 -> 348117 (-0.53%); split: -0.54%, +0.01%
SALU: 105926 -> 107576 (+1.56%); split: -0.02%, +1.58%
VOPD: 252 -> 234 (-7.14%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>
2025-08-22 15:45:55 +00:00
Rhys Perry 4f83059ac5 nir/algebraic: improve is_unsigned_multiple_of_4 and use it more
fossil-db (gfx1201):
Totals from 160 (0.20% of 79839) affected shaders:
MaxWaves: 4008 -> 3952 (-1.40%)
Instrs: 390073 -> 379834 (-2.62%); split: -2.63%, +0.00%
CodeSize: 2126020 -> 2053740 (-3.40%); split: -3.40%, +0.00%
VGPRs: 9492 -> 9612 (+1.26%)
Latency: 6746019 -> 6723893 (-0.33%); split: -0.33%, +0.00%
InvThroughput: 849571 -> 848942 (-0.07%); split: -0.42%, +0.35%
VClause: 11977 -> 11983 (+0.05%); split: -0.20%, +0.25%
SClause: 11828 -> 11824 (-0.03%); split: -0.14%, +0.11%
Copies: 30003 -> 30938 (+3.12%); split: -0.09%, +3.20%
PreSGPRs: 8914 -> 8938 (+0.27%)
PreVGPRs: 7352 -> 7514 (+2.20%); split: -0.04%, +2.24%
VALU: 171829 -> 168829 (-1.75%); split: -1.76%, +0.01%
SALU: 66503 -> 66543 (+0.06%); split: -0.01%, +0.07%
VMEM: 29365 -> 25327 (-13.75%)
VOPD: 864 -> 1013 (+17.25%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>
2025-08-22 15:45:55 +00:00
Rhys Perry 09ab7ff01e nir: add nir_def_num_lsb_zero
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>
2025-08-22 15:45:55 +00:00
Rhys Perry 51dd513789 nir/search: reorder match_value to check constants first
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>
2025-08-22 15:45:55 +00:00
Rhys Perry 84fe10f939 nir/search: don't clear empty hash tables
_mesa_hash_table_clear() memsets the entries, even if it's already empty.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>
2025-08-22 15:45:55 +00:00
Rhys Perry 2a12624532 nir/search: add nir_search_state
A future commit will add another hash table.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>
2025-08-22 15:45:55 +00:00
Georg Lehmann 996c07353b nir/shrink_vec_array_vars: use range analysis for non constant indices
Foz-DB Navi21:
Totals from 84 (0.10% of 80255) affected shaders:
MaxWaves: 1700 -> 1806 (+6.24%); split: +6.59%, -0.35%
Instrs: 90479 -> 91278 (+0.88%); split: -0.15%, +1.04%
CodeSize: 499644 -> 504572 (+0.99%); split: -0.10%, +1.08%
VGPRs: 5400 -> 4912 (-9.04%); split: -9.93%, +0.89%
LDS: 292864 -> 152064 (-48.08%)
Latency: 2001405 -> 2002335 (+0.05%); split: -0.01%, +0.06%
InvThroughput: 545293 -> 543073 (-0.41%); split: -0.52%, +0.11%
VClause: 1510 -> 1508 (-0.13%)
SClause: 2096 -> 2097 (+0.05%); split: -0.05%, +0.10%
Copies: 6373 -> 6431 (+0.91%); split: -0.64%, +1.55%
Branches: 1648 -> 1686 (+2.31%); split: -0.36%, +2.67%
PreVGPRs: 3918 -> 3960 (+1.07%); split: -0.03%, +1.10%
VALU: 67591 -> 68107 (+0.76%); split: -0.14%, +0.90%
SALU: 8352 -> 8490 (+1.65%); split: -0.25%, +1.90%
VMEM: 2685 -> 2683 (-0.07%)

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26388>
2025-08-22 13:47:47 +00:00
Georg Lehmann c7df3b4f64 nir/shrink_vec_array_vars: allow nir_var_mem_shared
This should just work.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26388>
2025-08-22 13:47:47 +00:00
Rhys Perry 2b5681f257 nir/opt_load_skip_helpers: always require helpers for handles
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36850>
2025-08-22 13:15:05 +00:00
Rhys Perry 81dd60df95 nir/opt_load_skip_helpers: move divergence check earlier
This should fix a hypothetical issue such as:
   address = load_global()
   value = load_global(address, access=uses-smem)
where divergence analysis can't prove that 'address' is uniform, but can
prove that 'value' is uniform.

We might then add both load_global to the load_worklist, but only disable
helpers for the first because the second is uniform, making 'address'
divergent for real and potentially incorrect when used with
v_readfirstlane_b32.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36850>
2025-08-22 13:15:05 +00:00
Rhys Perry 354df09c88 nir: add global_amd to nir_get_io_offset_src/nir_get_io_index_src
This is needed for nir_opt_load_skip_helpers.

fossil-db (gfx1201):
Totals from 5 (0.01% of 79839) affected shaders:
Instrs: 2288 -> 2286 (-0.09%); split: -0.13%, +0.04%
CodeSize: 12372 -> 12364 (-0.06%); split: -0.10%, +0.03%
Latency: 18378 -> 20044 (+9.07%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 883b1ca364 ("aco: disable wqm for tex loads when not needed")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36850>
2025-08-22 13:15:04 +00:00
Qiang Yu 9acaa409b9 mesa,glsl: add mesh shader subroutine handling
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36751>
2025-08-22 10:01:57 +00:00
Qiang Yu dbbb46aa38 nir: compute io base for fragment shader inputs which may be per primitive
Some inputs are per vertex in the vertex pipeline and per primitive
in the mesh pipeline. Put these inputs after the other inputs so that
the same fragment shader code can be shared by both pipelines.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36749>
2025-08-22 02:42:57 +00:00
Qiang Yu 7c3f7e1046 nir: lower io support task and mesh shader
Mesh shaders do not have inputs, and task shader IO lowering is
skipped, as it is for compute shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36749>
2025-08-22 02:42:57 +00:00
Yonggang Luo a34756bbed Revert "nir: Temporarily disable optimizations for MSVC ARM64"
This reverts commit 55d153b9f5.

The MSVC bug is https://developercommunity.visualstudio.com/t/Stack-overflow-compiling-C-code-to-ARM64/916235

and it is fixed in Visual Studio 2022 version 17.7.

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36767>
2025-08-22 01:28:23 +00:00
Yonggang Luo 85310e912c Revert "glsl: Work around MSVC arm64 optimizer bug"
This reverts commit 86b5c9278c.

The bug https://developercommunity.visualstudio.com/t/Incorrect-ARM64-codegen-with-optimizatio/10564605

is fixed in VS 17.8.6.

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36767>
2025-08-22 01:28:23 +00:00
Juan A. Suarez Romero ca989ecdec glsl: disable UBSan vptr check for ir_instruction
With UBSan enabled, we get the following issue:

```
../src/compiler/glsl/ir.h:116:4: runtime error: member access within address 0x555637c62c10 which does not point to an object of type 'ir_instruction'
0x555637c62c10: note: object has invalid vptr
 5f 76 61 6c  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00
              ^~~~~~~~~~~~~~~~~~~~~~~
              invalid vptr
```

This only happens the first time an ir_variable (which derives from
ir_instruction) is created; subsequent calls no longer show the issue.

The problem is with the following call in the `new()` operator:

```
((ir_instruction*)((uintptr_t)p))->node_linalloc = ctx;
```

In this case, the ir_instruction structure is not fully constructed and
thus UBSan complains about it. In subsequent calls the structure is
fully constructed, so UBSan no longer complains.

The right approach would be fully creating the structure, and afterwards
doing the context assignment. But this would require quite a lot of
changes, passing the context through the constructors to assign it.

A simpler solution is just disabling this check for this case, as we
know what is happening.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36884>
2025-08-21 12:09:04 +00:00
Georg Lehmann e24db36f20 nir/uub: handle bit_count
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36874>
2025-08-21 10:36:09 +00:00
Georg Lehmann aff391bc77 nir/uub: handle more reduction ops
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36874>
2025-08-21 10:36:09 +00:00
Georg Lehmann 773ee60e48 nir/uub: decrease default max subgroup size to 128
128 is the maximum that all APIs allow.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36874>
2025-08-21 10:36:09 +00:00
Georg Lehmann a2e48d2ede nir/uub: fix exclusive scans
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36874>
2025-08-21 10:36:09 +00:00
Calder Young a3ecdf33a3 nir/builder: Add helper for building uvec8 immediates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36455>
2025-08-21 09:04:54 +00:00
Marek Olšák f34fa80f0d glsl_to_nir: don't allocate 0-sized arrays for Uniform/ShaderStorageBlocks
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:49 +00:00
Marek Olšák 2ab6b275bd glsl_to_nir: don't allocate 0-sized num_params & subroutine_types
It still allocates the ralloc header, which is wasteful.

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:49 +00:00
Marek Olšák deac7cf1a2 glsl/ir_variable_refcount: don't ralloc the hash table
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:49 +00:00