AlexIndustrial/mesa

Author	SHA1	Message	Date
Simon Perretta	880098158d	nir/nir_lower_calls_to_builtins: trivially handle IA64 mangled functions Using __attribute__((overloadable)) when declaring nir ops with variable-width params in clc results in their symbol names being (IA64) mangled; this change enables the mangled names to be handled when later lowering the calls. Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36873>	2025-09-02 16:04:19 +00:00
Robert Mader	1772380307	nir: Fixup 10/12 bit SW decoder YCbCr formats The highest possible values that can be represented with 16/12/10 bits are 65535/4095/1023, not 65536/4096/1024. In order to ensure 1023 maps to 65535 in the Sx10 case we thus need to multiply by 65535 / 1023 ~= 64.06158 instead of 64. Fixes: `a166d7609f` ("gles: Add support for 10/12/16 bit SW decoder YCbCr formats") Suggested-by: Benjamin Otte <otte@redhat.com> Signed-off-by: Robert Mader <robert.mader@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37077>	2025-09-02 09:08:51 +00:00
Job Noorman	e78bd88a06	nir/opt_offsets: add callback to set need_nuw per intrinsic Whether need_nuw is used is currently decided in two different ways: - globally through the allow_offset_wrap option; - per intrinsic but hard-coded in opt_offsets. Make this more flexible by creating a callback that is called per intrinsic. This will allow backends to decide, on a per-intrinsic basis, whether need_nuw is needed. Note that the main use case for ir3 is to add support for opt_offsets for global memory accesses. Other intrinsics don't need need_nuw but global memory accesses do. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37114>	2025-09-01 11:25:07 +00:00
Job Noorman	bc03086320	nir/opt_offsets: rename max_offset_data to cb_data We want to add more callbacks and pass the same data. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37114>	2025-09-01 11:25:07 +00:00
Rhys Perry	2d0f93631c	nir/divergence: make smem load_global_amd uniform Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>	2025-08-30 14:55:13 -04:00
Marek Olšák	25294f3dd4	nir/opt_move_to_top: handle load_global_amd with ACCESS_SMEM_AMD to match the behavior of load_smem_amd Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>	2025-08-30 14:55:13 -04:00
Marek Olšák	48050dbef6	nir/opt_sink: handle load_global_amd Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>	2025-08-30 14:55:13 -04:00
Marek Olšák	219fcd4b32	nir/opt_call: handle load_global(_amd) with SPECULATE as rematerializable Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>	2025-08-30 14:55:13 -04:00
Ashley Smith	d9b388af27	mesa: Fix support for GL_EXT_shader_clock Missing 32-bit entry point in GLSL Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Fixes: `2ce20170` ("mesa: Add support for GL_EXT_shader_clock") Signed-off-by: Ashley Smith <ashley.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36041>	2025-08-29 11:09:04 +00:00
Faith Ekstrand	26e32417b9	nir: Add an option to make lower_phis_to_regs_block() less clever Right now it tries to place reg_write instructions as far up the predecessor chain as possible. This is useful for a bunch of the passes that call it since it ensures they don't get placed in dead blocks or in single successors and things like that. But it screws up NAK's control flow lowering so we need the option to turn it off and make the pass place the reg_write instructions in the most obvious place possible. Fixes: `b013d54e4f` ("nak/lower_cf: Flag phis as convergent when possible") Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36914>	2025-08-29 01:24:56 +00:00
Dave Airlie	c38170452d	nir: add nir_intrinsic_cmat_load_shared_nv This maps to NAK's OpLdsm Reviewed-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Karol Herbst <kherbst@redhat.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36363>	2025-08-28 16:09:07 +02:00
Georg Lehmann	3b06824e4c	nir/opt_algebraic: optimize some post peephole select patterns Foz-DB GFX1201: Totals from 208 (0.26% of 80287) affected shaders: Instrs: 427684 -> 426834 (-0.20%); split: -0.22%, +0.02% CodeSize: 2232616 -> 2228816 (-0.17%); split: -0.20%, +0.03% Latency: 3993934 -> 3992726 (-0.03%); split: -0.04%, +0.01% InvThroughput: 569055 -> 568622 (-0.08%); split: -0.09%, +0.01% SClause: 12932 -> 12927 (-0.04%) Copies: 22567 -> 22604 (+0.16%); split: -0.47%, +0.63% Branches: 7671 -> 7658 (-0.17%) VALU: 222047 -> 221625 (-0.19%) SALU: 83954 -> 83815 (-0.17%); split: -0.29%, +0.13% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36938>	2025-08-27 09:45:19 +00:00
Georg Lehmann	395893e16b	nir/peephole_select: allows more lowered io Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36938>	2025-08-27 09:45:19 +00:00
Georg Lehmann	e270a7480b	nir/lower_io: fix boolean output stores Stores don't have a definition, we have to check the bit size of the source. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13762 Fixes: `c217ee8d35` ("nir: Insert b2b1s around booleans in nir_lower_to") Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36966>	2025-08-27 08:46:34 +00:00
Georg Lehmann	047b95a8c3	nir/shrink_vec_array_vars: detect zero init shared memory using constant initializer More consistent. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36956>	2025-08-27 06:37:41 +00:00
Georg Lehmann	edc5bea61e	nir/shrink_vec_array_vars: update constant initializer after shrinking Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13751 Fixes: `c7df3b4f64` ("nir/shrink_vec_array_vars: allow nir_var_mem_shared") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36956>	2025-08-27 06:37:41 +00:00
Faith Ekstrand	a1d5e8bfdb	compiler/rust: Fix the DFS loop detection algorithm The previous algorithm just looked at the dominator's loop header. However, if you have multiple consecutive loops like: function_impl { loop { // Stuff } loop { // Other stuff } } then it will look like the second loop is contained in the first loop because the first loop's header dominates the second loop. This isn't actually what we want. Instead, we want a node N to be considered part of a loop with header H if H dominates N and H is reachable from N. Fixes: `741f7067f1` ("nak: Add loop detection to the CFG") Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36524>	2025-08-27 01:20:05 +00:00
Georg Lehmann	d0f4b535fe	nir: constant fold txd with 0 ddx/ddy to txl Foz-DB GFX1201: Totals from 34 (0.04% of 80287) affected shaders: Instrs: 3111158 -> 3111076 (-0.00%) CodeSize: 16345020 -> 16344908 (-0.00%); split: -0.00%, +0.00% Latency: 15378053 -> 15378063 (+0.00%); split: -0.00%, +0.00% InvThroughput: 2940485 -> 2940477 (-0.00%); split: -0.00%, +0.00% VClause: 79940 -> 79941 (+0.00%) Copies: 228205 -> 228159 (-0.02%) VALU: 1730040 -> 1729994 (-0.00%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36967>	2025-08-26 06:19:43 +00:00
Dave Airlie	7a96a928a2	nir: add coop mat flexible dimensions lowering. This adds a generic lowering pass for coop mat flexible dimensions. This should be suitable for all drivers that implement coop mat2 flexible dimensions or even just lowering sw exposed sizes to hw sizes. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36544>	2025-08-25 18:55:08 +00:00
Konstantin Seurer	951b187b95	nir: Use nir_def_block in more places Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746>	2025-08-24 14:03:10 +00:00
Konstantin Seurer	9df7b48d2f	nir: Use nir_def_as_* in more places Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746>	2025-08-24 14:03:09 +00:00
Pierre-Eric Pelloux-Prayer	e92638b6bf	nir/opt_varyings: fix build with PRINT_RELOCATE_SLOT Fixes: `e3d122ed7b` ("nir/opt_varyings: completely exclude mediump from type changes") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36411>	2025-08-23 14:44:29 +00:00
Jesse Natalie	5b3756f231	nir: Add missing #include for c99_alloca.h Fixes: `3dd9a978` ("nir: add new pass nir_lower_io_indirect_loads") Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36940>	2025-08-22 22:33:50 +00:00
Rhys Perry	2d597b6919	nir/load_store_vectorize: use nir_def_num_lsb_zero in calc_alignment fossil-db (gfx1201): Totals from 20 (0.03% of 79839) affected shaders: Instrs: 15370 -> 15251 (-0.77%) CodeSize: 89764 -> 88952 (-0.90%) Latency: 150295 -> 149963 (-0.22%) InvThroughput: 210291 -> 210105 (-0.09%) Copies: 1337 -> 1320 (-1.27%) PreVGPRs: 589 -> 590 (+0.17%) VALU: 7519 -> 7466 (-0.70%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	b03eeb12a9	nir/load_store_vectorize: use nir_def_num_lsb_zero in check_for_robustness fossil-db (gfx1201): Totals from 499 (0.63% of 79839) affected shaders: MaxWaves: 14276 -> 14234 (-0.29%) Instrs: 520883 -> 508159 (-2.44%); split: -2.45%, +0.01% CodeSize: 2831220 -> 2731080 (-3.54%); split: -3.54%, +0.00% VGPRs: 27156 -> 27348 (+0.71%) SpillSGPRs: 360 -> 390 (+8.33%) Latency: 4473898 -> 4414552 (-1.33%); split: -1.54%, +0.21% InvThroughput: 494468 -> 493508 (-0.19%); split: -0.62%, +0.43% VClause: 14211 -> 14060 (-1.06%); split: -1.16%, +0.10% SClause: 14653 -> 14354 (-2.04%); split: -2.39%, +0.35% Copies: 36772 -> 37056 (+0.77%); split: -0.65%, +1.42% Branches: 11502 -> 11486 (-0.14%) PreSGPRs: 22605 -> 22848 (+1.07%); split: -0.39%, +1.47% PreVGPRs: 20571 -> 20833 (+1.27%) VALU: 242982 -> 243151 (+0.07%); split: -0.08%, +0.14% SALU: 91332 -> 88069 (-3.57%); split: -3.71%, +0.14% VMEM: 32275 -> 29137 (-9.72%) SMEM: 26239 -> 22400 (-14.63%) VOPD: 345 -> 330 (-4.35%) SClause: 14646 -> 14347 (-2.04%); split: -2.39%, +0.35% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	46da666205	nir/algebraic: allow non-const for iand(iadd()) -> iadd(iand()) fossil-db (gfx1201): Totals from 596 (0.75% of 79839) affected shaders: Instrs: 691926 -> 691819 (-0.02%); split: -0.11%, +0.09% CodeSize: 3675216 -> 3675180 (-0.00%); split: -0.08%, +0.08% VGPRs: 37464 -> 37452 (-0.03%) Latency: 8566849 -> 8563162 (-0.04%); split: -0.09%, +0.05% InvThroughput: 1068038 -> 1063279 (-0.45%); split: -0.46%, +0.01% VClause: 17859 -> 17897 (+0.21%); split: -0.01%, +0.22% SClause: 16704 -> 16735 (+0.19%); split: -0.07%, +0.26% Copies: 45422 -> 45395 (-0.06%); split: -0.15%, +0.09% PreSGPRs: 24345 -> 24351 (+0.02%) PreVGPRs: 29121 -> 29128 (+0.02%) VALU: 349959 -> 348117 (-0.53%); split: -0.54%, +0.01% SALU: 105926 -> 107576 (+1.56%); split: -0.02%, +1.58% VOPD: 252 -> 234 (-7.14%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	4f83059ac5	nir/algebraic: improve is_unsigned_multiple_of_4 and use it more fossil-db (gfx1201): Totals from 160 (0.20% of 79839) affected shaders: MaxWaves: 4008 -> 3952 (-1.40%) Instrs: 390073 -> 379834 (-2.62%); split: -2.63%, +0.00% CodeSize: 2126020 -> 2053740 (-3.40%); split: -3.40%, +0.00% VGPRs: 9492 -> 9612 (+1.26%) Latency: 6746019 -> 6723893 (-0.33%); split: -0.33%, +0.00% InvThroughput: 849571 -> 848942 (-0.07%); split: -0.42%, +0.35% VClause: 11977 -> 11983 (+0.05%); split: -0.20%, +0.25% SClause: 11828 -> 11824 (-0.03%); split: -0.14%, +0.11% Copies: 30003 -> 30938 (+3.12%); split: -0.09%, +3.20% PreSGPRs: 8914 -> 8938 (+0.27%) PreVGPRs: 7352 -> 7514 (+2.20%); split: -0.04%, +2.24% VALU: 171829 -> 168829 (-1.75%); split: -1.76%, +0.01% SALU: 66503 -> 66543 (+0.06%); split: -0.01%, +0.07% VMEM: 29365 -> 25327 (-13.75%) VOPD: 864 -> 1013 (+17.25%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	09ab7ff01e	nir: add nir_def_num_lsb_zero Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	51dd513789	nir/search: reorder match_value to check constants first Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	84fe10f939	nir/search: don't clear empty hash tables _mesa_hash_table_clear() memsets the entries, even if it's already empty. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Rhys Perry	2a12624532	nir/search: add nir_search_state A future commit will add another hash table. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36760>	2025-08-22 15:45:55 +00:00
Georg Lehmann	996c07353b	nir/shrink_vec_array_vars: use range analysis for non constant indices Foz-DB Navi21: Totals from 84 (0.10% of 80255) affected shaders: MaxWaves: 1700 -> 1806 (+6.24%); split: +6.59%, -0.35% Instrs: 90479 -> 91278 (+0.88%); split: -0.15%, +1.04% CodeSize: 499644 -> 504572 (+0.99%); split: -0.10%, +1.08% VGPRs: 5400 -> 4912 (-9.04%); split: -9.93%, +0.89% LDS: 292864 -> 152064 (-48.08%) Latency: 2001405 -> 2002335 (+0.05%); split: -0.01%, +0.06% InvThroughput: 545293 -> 543073 (-0.41%); split: -0.52%, +0.11% VClause: 1510 -> 1508 (-0.13%) SClause: 2096 -> 2097 (+0.05%); split: -0.05%, +0.10% Copies: 6373 -> 6431 (+0.91%); split: -0.64%, +1.55% Branches: 1648 -> 1686 (+2.31%); split: -0.36%, +2.67% PreVGPRs: 3918 -> 3960 (+1.07%); split: -0.03%, +1.10% VALU: 67591 -> 68107 (+0.76%); split: -0.14%, +0.90% SALU: 8352 -> 8490 (+1.65%); split: -0.25%, +1.90% VMEM: 2685 -> 2683 (-0.07%) Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26388>	2025-08-22 13:47:47 +00:00
Georg Lehmann	c7df3b4f64	nir/shrink_vec_array_vars: allow nir_var_mem_shared This should just work. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26388>	2025-08-22 13:47:47 +00:00
Rhys Perry	2b5681f257	nir/opt_load_skip_helpers: always require helpers for handles Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36850>	2025-08-22 13:15:05 +00:00
Rhys Perry	81dd60df95	nir/opt_load_skip_helpers: move divergence check earlier This should fix a hypothetical issue such as: address = load_global() value = load_global(address, access=uses-smem) where divergence analysis can't prove that 'address' is uniform, but can prove that 'value' is uniform. We might then add both load_global to the load_worklist, but only disable helpers for the first because the second is uniform, making 'address' divergent for real and potentially incorrect when used with v_readfirstlane_b32. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36850>	2025-08-22 13:15:05 +00:00
Rhys Perry	354df09c88	nir: add global_amd to nir_get_io_offset_src/nir_get_io_index_src This is needed for nir_opt_load_skip_helpers. fossil-db (gfx1201): Totals from 5 (0.01% of 79839) affected shaders: Instrs: 2288 -> 2286 (-0.09%); split: -0.13%, +0.04% CodeSize: 12372 -> 12364 (-0.06%); split: -0.10%, +0.03% Latency: 18378 -> 20044 (+9.07%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `883b1ca364` ("aco: disable wqm for tex loads when not needed") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36850>	2025-08-22 13:15:04 +00:00
Qiang Yu	9acaa409b9	mesa,glsl: add mesh shader subrotine handling Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36751>	2025-08-22 10:01:57 +00:00
Qiang Yu	dbbb46aa38	nir: compute io base for fragment shader inputs which maybe per primitive Some inputs are per-vertex in a vertex pipeline and per-primitive in a mesh pipeline. Put these inputs after the other inputs so that both pipelines can share the same fragment shader code. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36749>	2025-08-22 02:42:57 +00:00
Qiang Yu	7c3f7e1046	nir: lower io support task and mesh shader Mesh shaders do not have inputs, and task shader IO lowering is skipped, as it is for compute shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36749>	2025-08-22 02:42:57 +00:00
Yonggang Luo	a34756bbed	Revert "nir: Temporarily disable optimizations for MSVC ARM64" This reverts commit `55d153b9f5`. The msvc bug is https://developercommunity.visualstudio.com/t/Stack-overflow-compiling-C-code-to-ARM64/916235 and Fixed In: Visual Studio 2022 version 17.7 Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36767>	2025-08-22 01:28:23 +00:00
Yonggang Luo	85310e912c	Revert "glsl: Work around MSVC arm64 optimizer bug" This reverts commit `86b5c9278c`. As bug https://developercommunity.visualstudio.com/t/Incorrect-ARM64-codegen-with-optimizatio/10564605 is fixed in VS 17.8.6 Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36767>	2025-08-22 01:28:23 +00:00
Juan A. Suarez Romero	ca989ecdec	glsl: disable UBSan vptr check for ir_instruction With UBSan enabled, we get the following issue: ``` ../src/compiler/glsl/ir.h:116:4: runtime error: member access within address 0x555637c62c10 which does not point to an object of type 'ir_instruction' 0x555637c62c10: note: object has invalid vptr 5f 76 61 6c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^~~~~~~~~~~~~~~~~~~~~~~ invalid vptr ``` This only happens the first time a ir_variable (which derives from ir_instruction) is created; next calls don't show the issue any more. The problem is with the following call in the `new()` operator: ``` ((ir_instruction*)((uintptr_t)p))->node_linalloc = ctx; ``` In this case, the ir_instruction structure is not fully constructed and thus UBSan complains about it. In the next calls, as the structure is now fully constructed it doesn't complain any more. The right approach would be fully creating the structure, and afterwards doing the context assignment. But this would require quite a lot of changes, passing the context through the constructors to assign it. A simpler solution is just disabling this check for this case, as we know what is happening. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36884>	2025-08-21 12:09:04 +00:00
Georg Lehmann	e24db36f20	nir/uub: handle bit_count Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36874>	2025-08-21 10:36:09 +00:00
Georg Lehmann	aff391bc77	nir/uub: handle more reduction ops Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36874>	2025-08-21 10:36:09 +00:00
Georg Lehmann	773ee60e48	nir/uub: decrease default max subgroup size to 128 128 is the maximum all apis allow. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36874>	2025-08-21 10:36:09 +00:00
Georg Lehmann	a2e48d2ede	nir/uub: fix exclusive scans Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36874>	2025-08-21 10:36:09 +00:00
Calder Young	a3ecdf33a3	nir/builder: Add helper for building uvec8 immediates Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36455>	2025-08-21 09:04:54 +00:00
Marek Olšák	f34fa80f0d	glsl_to_nir: don't allocate 0-sized arrays for Uniform/ShaderStorageBlocks Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:49 +00:00
Marek Olšák	2ab6b275bd	glsl_to_nir: don't allocate 0-sized num_params & subroutine_types It still allocates the ralloc header, which is wasteful. Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:49 +00:00
Marek Olšák	deac7cf1a2	glsl/ir_variable_refcount: don't ralloc the hash table Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>	2025-08-21 06:13:49 +00:00
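
Note on commit 1772380307 ("nir: Fixup 10/12 bit SW decoder YCbCr formats"): the arithmetic in that message can be checked with a small standalone sketch. The helpers below are hypothetical and are not Mesa code; they use integer math with round-to-nearest purely to illustrate why scaling by 65535 / 1023 maps the largest 10-bit code to 65535, while scaling by 64 does not. How the actual NIR lowering applies the factor is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): expand an n-bit unorm sample
 * to 16 bits so that the largest input code maps to 65535. */
static inline uint16_t
expand_unorm_to_16(uint32_t value, unsigned bits)
{
   assert(bits >= 1 && bits <= 16);
   const uint32_t max_in = (1u << bits) - 1u; /* 1023 for 10 bits, 4095 for 12 */
   assert(value <= max_in);
   /* Scale by the exact ratio 65535 / max_in, rounding to nearest.
    * The product fits in 32 bits because value <= 65535. */
   return (uint16_t)((value * 65535u + max_in / 2u) / max_in);
}

/* Scaling by 2^(16 - bits) instead, i.e. by 64 for 10-bit input. */
static inline uint16_t
expand_by_power_of_two(uint32_t value, unsigned bits)
{
   assert(bits >= 1 && bits <= 16);
   assert(value <= (1u << bits) - 1u);
   return (uint16_t)(value << (16u - bits));
}
```

With these definitions, `expand_unorm_to_16(1023, 10)` yields 65535, whereas `expand_by_power_of_two(1023, 10)` yields 65472, which falls short of full scale.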

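Note on commit a1d5e8bfdb ("compiler/rust: Fix the DFS loop detection algorithm"): the membership criterion stated in that message (a node N belongs to a loop with header H if H dominates N and H is reachable from N) can be tried out on the two-consecutive-loops case with the toy sketch below. It is written in C rather than Rust, every name in it (`toy_cfg`, `compute_dominators`, `reaches`, `in_loop_with_header`) is hypothetical, and it is a literal transcription of the stated criterion on a bitmask graph, not the algorithm the Mesa code actually uses. It assumes at most 32 nodes, node 0 as the entry, and all nodes reachable from the entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 32

/* Toy CFG: succ[i] is a bitmask of the successors of node i; node 0 is the entry. */
struct toy_cfg {
   unsigned num_nodes;
   uint32_t succ[MAX_NODES];
};

/* dom[n] becomes the bitmask of nodes dominating n (iterative data-flow). */
static void
compute_dominators(const struct toy_cfg *cfg, uint32_t dom[MAX_NODES])
{
   assert(cfg->num_nodes >= 1 && cfg->num_nodes <= MAX_NODES);
   const uint32_t all =
      cfg->num_nodes == 32 ? UINT32_MAX : (1u << cfg->num_nodes) - 1u;

   dom[0] = 1u;
   for (unsigned n = 1; n < cfg->num_nodes; n++)
      dom[n] = all;

   bool progress = true;
   while (progress) {
      progress = false;
      for (unsigned n = 1; n < cfg->num_nodes; n++) {
         /* Intersect the dominator sets of all predecessors of n. */
         uint32_t meet = all;
         for (unsigned p = 0; p < cfg->num_nodes; p++) {
            if (cfg->succ[p] & (1u << n))
               meet &= dom[p];
         }
         const uint32_t next = meet | (1u << n);
         if (next != dom[n]) {
            dom[n] = next;
            progress = true;
         }
      }
   }
}

/* True if `to` can be reached from `from` by following one or more edges. */
static bool
reaches(const struct toy_cfg *cfg, unsigned from, unsigned to)
{
   uint32_t seen = 0;
   uint32_t frontier = cfg->succ[from];
   while (frontier & ~seen) {
      const uint32_t todo = frontier & ~seen;
      seen |= todo;
      frontier = 0;
      for (unsigned n = 0; n < cfg->num_nodes; n++) {
         if (todo & (1u << n))
            frontier |= cfg->succ[n];
      }
   }
   return (seen & (1u << to)) != 0;
}

/* The criterion from the commit message: header dominates node, and
 * header is reachable from node. */
static bool
in_loop_with_header(const struct toy_cfg *cfg, const uint32_t dom[MAX_NODES],
                    unsigned header, unsigned node)
{
   return (dom[node] & (1u << header)) != 0 && reaches(cfg, node, header);
}
```

For a graph with edges 0→1, 1→2, 2→1, 1→3, 3→4, 4→3, 3→5 (two consecutive loops with headers 1 and 3), node 1 dominates node 4, so a dominance-only check would place node 4 inside the first loop. Because node 1 is not reachable from node 4, `in_loop_with_header` rejects it, while still accepting node 2 for header 1 and node 4 for header 3.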

11033 Commits