AlexIndustrial/mesa

Author	SHA1	Message	Date
Okenczyc, Andrzej	e5cdc78e0e	amd/vpelib: Move predication size calculation to bufs_req Calculation for the worst case scenario in bufs_req should also include predication command size. Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Andrzei Okenczyc <Andrzej.Okenczyc@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>	2025-08-20 10:42:01 +08:00
Assadian, Navid	fbeaca1202	amd/vpelib: Add necessary pointer casting Add necessary pointer casting to prevent unexpected behavior Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Navid Assadian <Navid.Assadian@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>	2025-08-20 10:42:01 +08:00
Yonggang Luo	bdda1cf5ef	va: Use { 0 } initialize struct ../src/gallium/frontends/va/config.c(574): error C2059: syntax error: '}' MSVC 2019 doesn't support for it yet Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36843>	2025-08-20 02:02:55 +00:00
Yonggang Luo	76c1243dc8	va: Remove unused variable pscreen Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: David Rosca <david.rosca@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36843>	2025-08-20 02:02:54 +00:00
Caio Oliveira	4fda724fd4	brw: Avoid invalid access when compacting out-of-bounds JIP/UIP Usually JIP will be valid, but as part of other changes, it will be possible to have a shader that have multiple EOT messages and end with and ENDIF instruction. Its JIP will point after the program ends. This is fine but was tripping up the compaction code. Change compaction to not read its internal structures beyond the last instruction. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36822>	2025-08-20 00:54:41 +00:00
Eric Engestrom	a5433b44e6	nvk/ci: document some flakes Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36857>	2025-08-20 00:41:19 +00:00
Eric Engestrom	439a0a5c2e	turnip/ci: document a flake Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36857>	2025-08-20 00:41:19 +00:00
Eric Engestrom	65b0f2ebe0	etnaviv/ci: document some flakes Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36857>	2025-08-20 00:41:19 +00:00
Eric Engestrom	a5b516804e	r300/ci: document flake Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36857>	2025-08-20 00:41:19 +00:00
Eric Engestrom	9cb27063fd	zink+turnip/ci: document fixed tests Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36857>	2025-08-20 00:41:19 +00:00
Eric Engestrom	19021733e6	zink+turnip/ci: document regression in b22806705c...cac3b4f404 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36857>	2025-08-20 00:41:19 +00:00
Erik Faye-Lund	03b7054c30	pan/midgard: avoid implicit cast-warning on Clang BITFIELD_MASK() returns a 32-bit unsigned integer, and Clang complains if we assign it to a 16-bit unsigned integer without a cast. Let's add that cast. While we're at it, add an assert() to make it clear to the compiler that the condition in BITFIELD_MASK() can be optimized away. Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Tested-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606>	2025-08-20 00:05:36 +00:00
Erik Faye-Lund	e5fda871fd	panvk: avoid implicit cast-warning on Clang BITFIELD_MASK() returns a 32-bit unsigned integer, and Clang complains if we assign it to a 16-bit unsigned integer without a cast. Let's add that cast. While we're at it, add an assert() to make it clear to the compiler that the condition in BITFIELD_MASK() can be optimized away. Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Tested-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606>	2025-08-20 00:05:36 +00:00
Erik Faye-Lund	fed682c506	pan/lib: do not duplicate enum mali_pixel_kill The enum pan_earlyzs is just enum mali_pixel_kill under a different name, which was needed because the enum was missing from common.xml. However, because pan_earlyzs_lut is used in files that are both included with PAN_ARCH unset and set to values including values lower than 6, we get issues with the way genxml/common_pack.h gets included, resulting in the enum not being defined. We don't really depend on the values for this, only on the size. So let's just use unsigned values in the struct instead, to side-step the issue. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Tested-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606>	2025-08-20 00:05:36 +00:00
Erik Faye-Lund	0dcf510c05	pan: use translate_s_format for stencil While this was also using translate_zs_format() before the commit in question, that's didn't lead to any real issues, because only a single value was legal here before. While it's not entirely in-spec to use other values, it seems the HW doesn't mind. But when this logic was reworked, the typed field was used instead. This lead to a compiler warning on Clang. Let's correct this properly here, rather than papering over the compiler warning. Fixes: `7a763bb0a3` ("pan/genxml: Rework the RT/ZS emission logic") Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Tested-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606>	2025-08-20 00:05:36 +00:00
Erik Faye-Lund	30cc9f5b3d	pan/util: use nir_component_mask instead of BITFIELD_MASK To generate a nir_component_mask_t, we should use nir_component_mask, not BITFIELD_MASK()... But we're also generating the same mask twice here, so let's just store that to a variable and reuse the mask when shifting it while we're at it. Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Tested-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606>	2025-08-20 00:05:36 +00:00
Eric Engestrom	69b0245f13	panfrost/meson: drop invalid C++ arg cc1plus: warning: command-line option ‘-Wno-override-init’ is valid for C/ObjC but not for C++ Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36840>	2025-08-19 23:44:22 +00:00
Yonggang Luo	2a0a5a3e3f	d3d10umd: Fixes building with mingw/gcc and windows sdk/ddk 10.0.26100.0 Avoid recursive include between DriverIncludes.h and Debug.h Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36844>	2025-08-19 23:22:07 +00:00
Zan Dobersek	1bc25c855b	tu: disable LRZ writes also for alpha-to-coverage, FS sample coverage output Currently LRZ writes are disabled when depth writes are enabled but the fragment shader is using discard. Additionally, LRZ writes should be disabled when fragment shader is outputting sample coverage or the pipeline state is enabling alpha-to-coverage which behaves as a discard. This fixes rendering problems on Assetto Corsa. Conditions now used for disabling LRZ writes match one set of conditions under which the EARLY_Z_LATE_Z z-test mode is used. It was assumed that in that mode the LRZ writes in binning will not happen until the late-Z phase, but that's apparently not the case. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36848>	2025-08-19 23:05:07 +00:00
Yiwei Zhang	ec4cebbf2e	venus: expose KHR_present_id(2)/wait(2) support Venus does support these via common wsi. Test: dEQP-VK.wsi..present_id_wait. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36834>	2025-08-19 22:48:35 +00:00
Yiwei Zhang	fd0b41b98d	venus: hide swapchainMaintenance1 behind wsi guard ..otherwise would give false alarm on Android. Fixes: `acd5497067` ("venus: support wsi maintenance1 extensions") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36834>	2025-08-19 22:48:35 +00:00
Mike Blumenkrantz	0d7e38f431	zink: improve deferred buffer barrier heuristics this is only to catch the case of a bound descriptor being written to by some operation other than its draw/dispatch descriptor bind, so any non-write binds are ignored previously those non-write binds were required because of how sync analysis could drop non-write access, so that is fixed as well also use the vbo bind count instead of the mask because why not also also ignore non-write GENERAL image deferred sync because that shouldn't need anything deferred Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36846>	2025-08-19 22:11:51 +00:00
Mike Blumenkrantz	cf5d41575b	zink: remove UNSYNCHRONIZED map flag during unmap flush for non-subdata calls this avoids a scenario where a non-subdata UNSYNCHRONIZED unmap triggers through tc at the same time the frontend calls an UNSYNCHRONIZED subdata call in the main thread, which desynchronizes the cmdbuf and hits an assert Fixes: `8ee0d6dd71` ("zink: add a third cmdbuf for unsynchronized (not reordered) ops") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36846>	2025-08-19 22:11:51 +00:00
Mike Blumenkrantz	4d0650d188	zink: fix image sync deferral each of these cases wasn't actually checking what the comment claimed it was checking, which would add unnecessary deferred sync Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36846>	2025-08-19 22:11:51 +00:00
Mike Blumenkrantz	af7b39a22f	zink: optimize a GENERAL layout case in pre-draw/dispatch barriers Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36846>	2025-08-19 22:11:50 +00:00
Job Noorman	77c1c688dc	ir3/array_to_ssa: remove trivial all-undef phis remove_trivial_phi erroneously skipped phis containing an undef src because the remaining srcs may not dominate the phi. However, it's fine to replace a phi whose srcs are all undef with undef. Fix this by simply checking if all srcs are equal, whether undef or not. Note that in practice, this often caused phis with undef srcs to be inserted all the way up to the entry block, keeping their defs alive for much longer than necessary. Fixes unnecessary spilling in God Of War and Neon Noir traces. Totals: MaxWaves: 2381774 -> 2384954 (+0.13%) Instrs: 49052269 -> 49052865 (+0.00%); split: -0.03%, +0.04% CodeSize: 102493810 -> 102514296 (+0.02%); split: -0.02%, +0.04% NOPs: 8391570 -> 8385296 (-0.07%); split: -0.14%, +0.07% MOVs: 1448918 -> 1455153 (+0.43%); split: -0.43%, +0.86% COVs: 824835 -> 824846 (+0.00%) Full: 1714015 -> 1707987 (-0.35%) (ss): 1125974 -> 1126692 (+0.06%); split: -0.14%, +0.21% (sy): 553893 -> 553561 (-0.06%); split: -0.23%, +0.17% (ss)-stall: 4011440 -> 4006144 (-0.13%); split: -0.21%, +0.08% (sy)-stall: 16707741 -> 16664838 (-0.26%); split: -0.48%, +0.23% STPs: 18953 -> 18495 (-2.42%) LDPs: 23957 -> 22121 (-7.66%) Preamble Instrs: 11100893 -> 11100673 (-0.00%) Early Preamble: 122185 -> 122188 (+0.00%) Last helper: 11913048 -> 11914963 (+0.02%); split: -0.04%, +0.06% Subgroup size: 12925248 -> 12926272 (+0.01%) Cat0: 9246551 -> 9240417 (-0.07%); split: -0.13%, +0.07% Cat1: 2335781 -> 2341487 (+0.24%); split: -0.29%, +0.53% Cat2: 18445905 -> 18445930 (+0.00%) Cat6: 515382 -> 514732 (-0.13%) Cat7: 1635575 -> 1637224 (+0.10%); split: -0.09%, +0.19% Totals from 2293 (1.39% of 164705) affected shaders: MaxWaves: 21622 -> 24802 (+14.71%) Instrs: 3399456 -> 3400052 (+0.02%); split: -0.49%, +0.51% CodeSize: 6576806 -> 6597292 (+0.31%); split: -0.24%, +0.55% NOPs: 774365 -> 768091 (-0.81%); split: -1.54%, +0.73% MOVs: 226724 -> 232959 (+2.75%); split: -2.73%, +5.48% COVs: 48005 -> 48016 (+0.02%) Full: 50599 -> 44571 (-11.91%) (ss): 88248 -> 88966 (+0.81%); split: -1.85%, +2.66% (sy): 41345 -> 41013 (-0.80%); split: -3.03%, +2.23% (ss)-stall: 396793 -> 391497 (-1.33%); split: -2.11%, +0.78% (sy)-stall: 1594786 -> 1551883 (-2.69%); split: -5.06%, +2.37% STPs: 1147 -> 689 (-39.93%) LDPs: 2535 -> 699 (-72.43%) Preamble Instrs: 707407 -> 707187 (-0.03%) Early Preamble: 180 -> 183 (+1.67%) Last helper: 1538341 -> 1540256 (+0.12%); split: -0.35%, +0.47% Subgroup size: 149248 -> 150272 (+0.69%) Cat0: 857696 -> 851562 (-0.72%); split: -1.43%, +0.72% Cat1: 275565 -> 281271 (+2.07%); split: -2.44%, +4.51% Cat2: 1139467 -> 1139492 (+0.00%) Cat6: 22505 -> 21855 (-2.89%) Cat7: 129600 -> 131249 (+1.27%); split: -1.15%, +2.42% Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36714>	2025-08-19 20:07:34 +00:00
Job Noorman	ca15116fa1	ir3/array_to_ssa: fix updating/removing phis Fix checking instruction flags instead of dst flags, and updating src instead of def. Totals: MaxWaves: 2381954 -> 2381958 (+0.00%) Instrs: 49073677 -> 49073417 (-0.00%) CodeSize: 102537524 -> 102536824 (-0.00%) NOPs: 8396340 -> 8396432 (+0.00%); split: -0.00%, +0.00% MOVs: 1450777 -> 1450422 (-0.02%) Full: 1714304 -> 1714287 (-0.00%) (ss): 1126433 -> 1126463 (+0.00%); split: -0.00%, +0.00% (ss)-stall: 4013834 -> 4013854 (+0.00%) (sy)-stall: 16713036 -> 16713082 (+0.00%) Cat0: 9252109 -> 9252194 (+0.00%); split: -0.00%, +0.00% Cat1: 2337941 -> 2337592 (-0.01%) Cat7: 1636810 -> 1636814 (+0.00%); split: -0.00%, +0.00% Totals from 5 (0.00% of 164705) affected shaders: MaxWaves: 42 -> 46 (+9.52%) Instrs: 9052 -> 8792 (-2.87%) CodeSize: 16806 -> 16106 (-4.17%) NOPs: 2369 -> 2461 (+3.88%); split: -0.17%, +4.05% MOVs: 1140 -> 785 (-31.14%) Full: 133 -> 116 (-12.78%) (ss): 206 -> 236 (+14.56%); split: -0.97%, +15.53% (ss)-stall: 901 -> 921 (+2.22%) (sy)-stall: 6229 -> 6275 (+0.74%) Cat0: 2695 -> 2780 (+3.15%); split: -0.22%, +3.38% Cat1: 1333 -> 984 (-26.18%) Cat7: 419 -> 423 (+0.95%); split: -0.48%, +1.43% Signed-off-by: Job Noorman <jnoorman@igalia.com> Fixes: `3ac743c333` ("ir3: Add pass to lower arrays to SSA") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36714>	2025-08-19 20:07:34 +00:00
Michal Krol	2385fa2098	gallium: Do not flush subnormals during tessellation. D3D11 requires that subnormals are not flushed to zero when tessellating primitives. Since we are flushing subnormals during shader execution, we must temporarily turn flushing off when calling the tessellator. Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36811>	2025-08-19 19:45:29 +00:00
Gert Wollny	8fc2b0d24c	r600/sfn: Emit thread position as two-slot op It doesn't change much though, because it always has to be scheduled as in the xy channels. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>	2025-08-19 19:30:33 +00:00
Gert Wollny	b0bf1d914a	r600/sfn: give more liberty to the channel selection in simple two-slot ops Some ops on 64 bit data don't require the data to reside in neighboring channels and can be executed as seperate 32 bit ops. In these cases we don't need to pin the registers to a specific channel, but for scheduling it is better that we make sure that both destination values reside in different channels, so that they can be scheduled into one ALU group and reduce the probability of read-port conflicts when used as source values. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>	2025-08-19 19:30:33 +00:00
Gert Wollny	206d50ba25	r600/sfn: op1v_flt64_to_flt32 as multi-slot instruction With that the optimizer can better switch the channel. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>	2025-08-19 19:30:32 +00:00
Gert Wollny	2d88e9236d	r600/sfn: Handle more ops in desk mask evaluation Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>	2025-08-19 19:30:32 +00:00
Gert Wollny	00c41ad03a	r600/sfn: replace hard-coded multislot dot handling More ops then op2_dot_ieee + op2_mul_ieee can be submitted as multi-slot ops. Make it ease to handle additional opcodes when splitting the alu op that has only one dst but requires multiple slots. With that we can emit more multi-slot ops that use consecutive slots and use a different opcode in the last slot. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>	2025-08-19 19:30:31 +00:00
Gert Wollny	f2916b3df4	r600/sfn: Fix the mods when splitting ALU op In preparation of splitting 64 bit two slot ops with one 32 bit dest register use the right start slot. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>	2025-08-19 19:30:31 +00:00
Gert Wollny	1ba8ff9fe6	r600/sfn: Take slot count into account when pinning registers Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>	2025-08-19 19:30:30 +00:00
Gert Wollny	77eaad8e21	r600/sfn: Fix test when allocating registers more freely With the changes to the register pinning we have to update the test to avoid failures later. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>	2025-08-19 19:30:29 +00:00
Gert Wollny	b6a917b6da	r600/sfn: Only map ssa index to register index if pinning is not free If we have more than one register that is associated with the same ssa index, but can be allocated without a specific channel pinning, then don't add it to the ssa.index/register.index map to not re-use the same register index. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>	2025-08-19 19:30:29 +00:00
Gert Wollny	6e2f08633a	r600/sfn: Take allowed dest mask into account in copy-prop In addition, on Cayman some trans opts can use three or four channels, and it may be an advantage to use the four channel version if the result needs to be written to the w channel to reduce the all-over ALU instruction group count. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36743>	2025-08-19 19:30:29 +00:00
Faith Ekstrand	14b4160792	vulkan/wsi: Only test for dma-buf sync file support once Instead of each helper having a VK_ERROR_FEATURE_NOT_PRESENT fast-reject path, drop those paths and check at the top of each caller. This ensures that we do the check once per wsi_device, and only on a known test dma-buf and that any subsequent fails turn into fails rather than silently turning off explicit/implicit sync in potentially inconsistent ways. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36816>	2025-08-19 18:59:43 +00:00
Faith Ekstrand	6d3c82704d	vulkan/wsi: Sanitize the result of wsi_drm_check_dma_buf_sync_file_import_export() Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36816>	2025-08-19 18:59:43 +00:00
Faith Ekstrand	9ddd29639c	vulkan/wsi: Style nits Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36816>	2025-08-19 18:59:43 +00:00
Natalie Vock	4de3a5cce3	radv: Only expose indirect raytracing on gfx7+ It relies on unaligned indirect dispatches which are broken on gfx6. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30811>	2025-08-19 18:34:41 +00:00
Rob Clark	e1493996b5	freedreno/decode: Add missing varset check Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13688 Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36818>	2025-08-19 18:19:58 +00:00
Samuel Pitoiset	baaf5d643a	radv: emit inlined push constants with buffered SH regs on GFX12 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36570>	2025-08-19 18:01:23 +00:00
Samuel Pitoiset	c710eaa443	radv: emit descriptor pointers with buffered SH regs on GFX12 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36570>	2025-08-19 18:01:22 +00:00
Samuel Pitoiset	95d2f009a9	radv: emit compute pipeline with buffered SH regs on GFX12 This also includes RT, task shaders and DGC IES for compute. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36570>	2025-08-19 18:01:21 +00:00
Samuel Pitoiset	bbf8338443	radv: rework the helper to emit buffered regs on GFX12 Also reserve enough space if needed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36570>	2025-08-19 18:01:21 +00:00
Samuel Pitoiset	1f26f93aa7	radv: emit relocation for task shaders at the same place as other stages Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36570>	2025-08-19 18:01:21 +00:00
Karol Herbst	f2f945c2b7	nak: run nir_opt_move nir_move_comparisons Totals: CodeSize: 914469536 -> 914055696 (-0.05%); split: -0.07%, +0.02% Number of GPRs: 3863818 -> 3866731 (+0.08%); split: -0.01%, +0.08% SLM Size: 841076 -> 840828 (-0.03%); split: -0.03%, +0.00% Static cycle count: 1073101189 -> 1059404451 (-1.28%); split: -1.39%, +0.11% Spills to memory: 57317 -> 54698 (-4.57%); split: -4.57%, +0.00% Fills from memory: 57317 -> 54698 (-4.57%); split: -4.57%, +0.00% Spills to reg: 67707 -> 57646 (-14.86%); split: -15.24%, +0.38% Fills from reg: 80456 -> 71960 (-10.56%); split: -10.75%, +0.20% Max warps/SM: 3672668 -> 3672244 (-0.01%); split: +0.00%, -0.01% Totals from 33585 (38.33% of 87622) affected shaders: CodeSize: 614909536 -> 614495696 (-0.07%); split: -0.10%, +0.03% Number of GPRs: 1771770 -> 1774683 (+0.16%); split: -0.01%, +0.18% SLM Size: 659824 -> 659576 (-0.04%); split: -0.04%, +0.00% Static cycle count: 994849091 -> 981152353 (-1.38%); split: -1.50%, +0.12% Spills to memory: 57317 -> 54698 (-4.57%); split: -4.57%, +0.00% Fills from memory: 57317 -> 54698 (-4.57%); split: -4.57%, +0.00% Spills to reg: 67372 -> 57311 (-14.93%); split: -15.32%, +0.39% Fills from reg: 80178 -> 71682 (-10.60%); split: -10.79%, +0.20% Max warps/SM: 1299808 -> 1299384 (-0.03%); split: +0.01%, -0.04% Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36536>	2025-08-19 17:29:07 +00:00
Karol Herbst	83cf765f8e	nak: run nir_opt_move nir_move_load_ubo Usually we can fold most ldc and ldcx into the instruction using it, however there are a couple of cases where we can't, e.g. when there is an indirect offset. Moving the ldc(x) down to the consumer leads to increase value ranges for uniform registers, but lowering them for normal registers. Totals: CodeSize: 914650304 -> 914469536 (-0.02%); split: -0.05%, +0.03% Number of GPRs: 3879754 -> 3863818 (-0.41%); split: -0.42%, +0.01% Static cycle count: 1073273107 -> 1073101189 (-0.02%); split: -0.09%, +0.08% Spills to reg: 67219 -> 67707 (+0.73%); split: -0.10%, +0.83% Fills from reg: 79733 -> 80456 (+0.91%); split: -0.10%, +1.01% Max warps/SM: 3666036 -> 3672668 (+0.18%); split: +0.18%, -0.00% Totals from 24235 (27.66% of 87622) affected shaders: CodeSize: 444747392 -> 444566624 (-0.04%); split: -0.11%, +0.07% Number of GPRs: 1360384 -> 1344448 (-1.17%); split: -1.20%, +0.03% Static cycle count: 806310857 -> 806138939 (-0.02%); split: -0.12%, +0.10% Spills to reg: 35826 -> 36314 (+1.36%); split: -0.19%, +1.55% Fills from reg: 31863 -> 32586 (+2.27%); split: -0.26%, +2.53% Max warps/SM: 911328 -> 917960 (+0.73%); split: +0.74%, -0.01% Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36536>	2025-08-19 17:29:07 +00:00
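
The phi check described in commit 77c1c688dc (ir3/array_to_ssa: remove trivial all-undef phis) can be illustrated with a small sketch. This is not ir3 code: the `UNDEF` sentinel, the list-of-sources representation, and both function names are hypothetical, and the "old" variant only models the behaviour as the commit message describes it (bailing out whenever any source is undef). The "new" variant treats a phi as trivial whenever all of its sources are equal, undef or not.

```python
class _Undef:
    """Hypothetical sentinel standing in for an undefined SSA value."""

    def __repr__(self):
        return "undef"


UNDEF = _Undef()


def trivial_phi_old(srcs):
    """Model of the pre-fix check: any undef source blocks removal.

    Returns the value that may replace the phi, or None if the phi
    is not considered trivial.
    """
    if not srcs or any(s is UNDEF for s in srcs):
        return None
    first = srcs[0]
    return first if all(s == first for s in srcs) else None


def trivial_phi_new(srcs):
    """Model of the post-fix check: trivial iff all sources are equal.

    An all-undef phi is replaced by undef. A phi mixing undef with a
    real value is still kept, because that value may not dominate the phi.
    """
    if not srcs:
        return None
    first = srcs[0]
    return first if all(s == first for s in srcs) else None
```

Under these assumptions, `trivial_phi_new([UNDEF, UNDEF])` yields `UNDEF` where the old check returned `None`, while a mixed phi such as `[UNDEF, "ssa_1"]` is left in place by both.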

210493 Commits