AlexIndustrial/mesa

Author	SHA1	Message	Date
Mohamed Ahmed	21165c7972	nil, nvk: Add plumbing for compression This lays the groundwork for enabling compression by adding a way to pass in whether the image will be compressed or not from NVK to NIL. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38702>	2025-11-28 00:30:30 +00:00
Faith Ekstrand	02b4647a1c	nvk: Add a dedicated_image to nvk_device_memory Also refactor the dedicated image handling a tiny bit to make the next bit easier. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38702>	2025-11-28 00:30:29 +00:00
Mohamed Ahmed	9bd51ce508	nvk/nvkmd: Fix alignments Previously, there was some mixing up of alignments between the alignment provided by the caller, and the minimum alignment we have (4KiB). Additionally, there was some redundant aligning being done to data already passed in aligned. This didn't matter because we were always using 4K pages anyways due to kernel limitations. However, this now needs fixing to allow for larger page support. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38702>	2025-11-28 00:30:29 +00:00
Mohamed Ahmed	88b92dc3d3	nouveau/winsys: Retrieve and store the PTE kind in the nouveau_ws_bo Previously, for imports we wouldn't carry over the PTE kind with the import, which worked fine up till now. However, compression depends on the PTE kind being correct otherwise there will be a mismatch between both sides. The GEM info object we get from the kernel already has the PTE kind embedded in the tile flags object, so all we have to do is retrieve it and store in the bo object, and then the lower layers can retrieve the kind from the bo directly. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38702>	2025-11-28 00:30:29 +00:00
Mohamed Ahmed	3af0ee04a5	nouveau/winsys: Store the nouveau kernel version This is so we can enable features needing kernel support based on whether the detected kernel driver supports them or not by checking for the version in nvkmd. Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38702>	2025-11-28 00:30:29 +00:00
Mel Henning	7a492e102f	nvk: Use the OS page size in nvk_AllocateMemory In `bccb9fe091` ("nvk/nvkmd: nouveau uses the OS page size"), the alignment size was narrowed to the OS page size in nvkmd_nouveau_alloc_tiled_mem. This makes the same change for nvk_AllocateMemory. This is being done in preparation for large page support, which will be more picky about alignments. Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38702>	2025-11-28 00:30:29 +00:00
Christian Gmeiner	ab86088438	etnaviv: Defer GPU state reset until first draw call Currently, GPU state is reset immediately after each flush and during context creation, even when the next command might be a simple BLT/RS operation that doesn't require the full GPU rendering pipeline. This patch introduces lazy GPU state reset by: - Adding a needs_gpu_state_reset flag to track when reset is needed - Setting the flag to true after flush instead of immediately resetting - Only performing the actual reset in etna_draw_vbo() when rendering Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Acked-by: Lucas Stach <l.stach@pengutronix.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36565>	2025-11-27 23:06:28 +00:00
Christian Gmeiner	55447790c4	etnaviv: rs: Move RS_SINGLE_BUFFER control to per-operation basis Move RS_SINGLE_BUFFER from global context initialization to individual RS operations, enabling it before each operation and disabling it immediately after. The same pattern is seen in traces from the binary blob driver. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Acked-by: Lucas Stach <l.stach@pengutronix.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36565>	2025-11-27 23:06:28 +00:00
Christian Gmeiner	d7fff632cd	lavapipe: Trivially expose VK_GOOGLE_user_type extension There's nothing for the driver to do; it's all handled in spirv_to_nir. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Acked-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38574>	2025-11-27 20:52:17 +00:00
Yiwei Zhang	73e8d6533e	docs: add VK_KHR_robustness2 and supported drivers Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38690>	2025-11-27 19:51:28 +00:00
Yiwei Zhang	78029a2773	venus: enable promoted VK_KHR_robustness2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38690>	2025-11-27 19:51:28 +00:00
Yiwei Zhang	6ba742e334	venus: sync to latest protocol for v1.4.334 This also includes enabling the promoted VK_KHR_robustness2. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38690>	2025-11-27 19:51:28 +00:00
Yonggang Luo	9d3d15f871	util,wgl: Replace usage of putenv with os_unset_option,os_set_option Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>	2025-11-27 18:22:34 +00:00
Yonggang Luo	168042fb05	gfxstream: os_set_option can be used on windows now Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@google.com> Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>	2025-11-27 18:22:34 +00:00
Yonggang Luo	0a32d5e6fd	treewide: Use regexp to replace usage of setenv with os_set_option. setenv$(.), 1$; => os_set_option($1, true); setenv$(.), 0$; => os_set_option($1, false); Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>	2025-11-27 18:22:34 +00:00
Yonggang Luo	1825715623	treewide: Use regexp to replace usage of unsetenv with os_unset_option. unsetenv$(.*)$; => os_unset_option($1); Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>	2025-11-27 18:22:33 +00:00
Yonggang Luo	d277dfdd76	treewide: Replace the usage of setenv manually and #include "util/os_misc.h" when needed Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>	2025-11-27 18:22:33 +00:00
Yonggang Luo	5ab8148f23	util: Update os_get_option* comments to match os_set_option Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>	2025-11-27 18:22:32 +00:00
Yonggang Luo	2771eb39fd	util: Add function os_unset_option/os_set_option for latter use It's will be used to replace SetEnvironmentVariableA,putenv on windows and putenv,setenv on non-windows Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>	2025-11-27 18:22:32 +00:00
Yonggang Luo	123a66fc43	util,asahi,vulkan,panfrost: Replace the remaining usage of getenv with os_get_option Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>	2025-11-27 18:22:32 +00:00
Tapani Pälli	95938823f4	compiler/glsl: validate input blocks with opaque/booleans Commit adds a check for booleans/opaque types inside interfaces, there is existing check for "regular varyings". Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14338 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38613>	2025-11-27 17:40:15 +00:00
Caterina Shablia	a338694c50	panvk: report support for sparseResidencyImage2D Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37483>	2025-11-27 17:05:43 +00:00
Caterina Shablia	5326c45174	panvk/csf: implement sparse image non-opaque binds Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37483>	2025-11-27 17:05:43 +00:00
Caterina Shablia	c87bdde596	panvk: align rows and layers of sparse resident images When laying out a sparse partially-resident image we need to align rows of ordered blocks to a mapping granularity in bytes (i.e. the page size) and array layers to a multiple of sparse block size. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37483>	2025-11-27 17:05:43 +00:00
Caterina Shablia	7421b38521	panvk: sparse partially-resident image -related queries Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37483>	2025-11-27 17:05:43 +00:00
Caterina Shablia	bd9aeeec0a	pan/lib: introduce row_align_B and array_align_B constraints To implement sparse partially-resident images, we need to be able to express mapping in terms of rectangles of texel blocks. With row_align_B we can constrain the rows of ordered blocks to start at mapping boundary (i.e. page size) and using array_align_B we can ensure that each subresource starts at a multiple of whatever sparse block size we decide to use. Not setting each of these fields is the same as setting them to 1. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37483>	2025-11-27 17:05:42 +00:00
Caterina Shablia	dbf20eb49f	panvk: move sparse blackhole stuff to panvk_sparse.{c,h} While we're at it also add the SPDX header to panvk_sparse.c because I forgor to do that when it was first being added. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37483>	2025-11-27 17:05:42 +00:00
Lionel Landwerlin	515d8f8e3a	brw: fix sample mask flag emission It's also used for testing helper invocations. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `e3328dfa2f` ("brw: only initialize sample mask flag if needed") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38699>	2025-11-27 15:59:35 +00:00
Pierre-Eric Pelloux-Prayer	671e943c9b	mesa: fix function prototype Replace void* by GLvoid* and add GLAPIENTRY to match the gl_API.xml version. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14164 Fixes: `ae75b59cb5` ("glthread, tc: Fix buffer release with glthread and tc") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38029>	2025-11-27 16:22:45 +01:00
Leon Perianu	bff723e50c	pvr: pvr_pds_fragment_program_create fix allocation callback usage The staging buffer is persistent until the destruction of the pvr_pipeline object, so we should set the allocation scope to PVR_ALLOC_SCOPE_OBJECT instead of PVR_ALLOC_SCOPE_COMMAND. Also did the same change in the function pvr_pds_coeff_program_create_and_upload for the staging buffer, because that buffer is also destroyed at pipeline destruction. Fixes dEQP-VK.api.object_management.single_alloc_callbacks.graphics_pipeline. Signed-off-by: Leon Perianu <leon.perianu@imgtec.com> Reviewed-by: Karmjit Mahil <karmjit.mahil@igalia.com> Tested-by: Icenowy Zheng <uwu@icenowy.me> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38662>	2025-11-27 13:18:31 +00:00
Juan A. Suarez Romero	b9b9c676e1	v3d/ci: update expected results Some failures in OpenCL tests were fixed due commit `a643681d`. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38694>	2025-11-27 11:59:39 +00:00
Danylo Piliaiev	297c5b5de3	freedreno: Update A7XX_RB_UNKNOWN_8E09 to be in line with blob All A7XX GPUs seem to have A7XX_RB_UNKNOWN_8E09=0x7 according to blob v819. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38680>	2025-11-27 11:27:03 +00:00
Job Noorman	bcd81c8172	freedreno/computerator: add option to print raw disassembly It is sometimes useful to see the raw hex values of what instructions are assembled to, similar to the output of shaders in cffdump. Add an option for this to computerator. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37595>	2025-11-27 10:27:27 +00:00
Job Noorman	e413615d55	ir3: add ir3_disasm_options struct We want to add some disassembly options in the future. Add new ir3_shader_disasm_options function that takes options from a new ir3_disasm_options struct in which we can add options later. The original ir3_shader_disasm becomes a wrapper for the new function to not have to update all call sites now. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37595>	2025-11-27 10:27:27 +00:00
Marek Olšák	166afc592b	gallium/hud: don't fclose stdout for GALLIUM_HUD=...,stdout This fixes printf doing nothing after the context is destroyed and recreated. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38601>	2025-11-27 03:21:12 +00:00
Yonggang Luo	6356efc4e0	gfxstream: Use os_get_option_dup(VK_DRIVER_FILES) As the return value os_get_option should be immediately consumed. Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38687>	2025-11-27 10:20:52 +08:00
Yonggang Luo	d668c0ad42	gfxstream: Use VK_DRIVER_FILES instead of VK_ICD_FILENAMES as VK_ICD_FILENAMES is deprecated for a while. This is a prepare for remove VK_ICD_FILENAMES in source tree. Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38687>	2025-11-27 10:20:48 +08:00
Calder Young	09e8a54087	anv: Fix ray query shadow stack buffer size Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38685>	2025-11-26 22:49:52 +00:00
Sagar Ghuge	d8447fd392	vulkan/runtime: Account for pipeline libraries stage count Don't excludes stages coming from pipeline libraries. This caused valid group indices referring to library stages to be dropped, leading to mismatched stage_count. Fixes: `e05a9b77b6` ("vulkan/runtime: split rt shaders hashing from compile") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38669>	2025-11-26 22:17:57 +00:00
Marek Olšák	e47be4f37b	st/mesa: call nir_opt_intrinsics slightly later It makes more progress after nir_lower_atomics_to_ssbo. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38598>	2025-11-26 16:24:06 -05:00
Marek Olšák	2ea30edc70	st/mesa: call nir_opt_intrinsics for the GL_SELECT shader radeonsi may assert that this pass makes no progress. This is one place that should call the pass. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38598>	2025-11-26 16:24:04 -05:00
Marek Olšák	eea5959a22	nir/lower_io_passes: call nir_opt_undef to eliminate undef output stores If we do it here, we won't have to call nir_recompute_io_bases later again. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38598>	2025-11-26 16:23:49 -05:00
Roland Scheidegger	88ae1f8533	llvmpipe: optimize the centroid implementation All things related to selecting the position when no sample is covered isn't actually dependent on fragment shader loop iteration, in fact it's not even dependent on the shader invocation, only the sample mask (which is from jit context, not from shader key, otherwise could just precalculate all of it). And certainly there's no need for all the extra per-sample selects. Just calculate it once at interpolation context init. LLVM should be able to easily toss out (as with the previous version) all extra code done at interpolation init if centroid interpolation isn't actually used. (Although the code didn't turn out as simple as I hoped...) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38664>	2025-11-26 19:12:17 +00:00
Roland Scheidegger	9fb4b1e6dc	llvmpipe: implement strict d3d11 rules for centroid interpolation D3D11 is pretty strict about how to do centroid interpolation. In particular, llvmpipe didn't honor these rules when no sample was covered for a pixel (relevant for helper pixels), in this case llvmpipe selected the position of the sample with the highest index (just due to initialization, not really by choice). Given that helper pixels are only really used for derivative calculations, and derivatives are generally sketchy with centroid interpolation, this seems quite a lot of work, but I suppose it could be useful if the state sample mask has only 1 sample set (since these d3d11 rules then guarantee that even with centroid the derivatives are actually useful as the interpolation will be done at the position defined by the sample specified in the sample mask, regardless if that sample is covered by the primitive or not). Other APIs might technically not need this (they tend to not even define at which position centroid interpolation is done, other than it must be inside the primitive), but it shouldn't really hurt them neither. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38664>	2025-11-26 19:12:17 +00:00
Samuel Pitoiset	930cab7702	radv: fix fbfetch output with ESO This fixes a real issue when ESO uses fbfetch output because this was determined after instead of before. This solution isn't the most elegant one but binding graphics shaders earlier would require more work. Let's just handle this specific corner case for now. This fixes dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.custom_resolve.shader_objects.fragment_region* on some GPUs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38617>	2025-11-26 17:47:07 +00:00
Samuel Pitoiset	6569acbdf2	radv: make sure to reset uses_fbfetch_output for NULL fragment shaders To prevent useless decompression passes if a previously bound FS was using fbfetch output. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38617>	2025-11-26 17:47:07 +00:00
Ian Romanick	0c089a5c32	brw: Eliminate duplicate fills When the register allocator decides to spill a value, all reads of that value are filled. This can result in cases where the same value is filled many times in a single block. In those cases, the result of an earlier fill may still be available when a later fill occurs. This optimization replaces the later fill with a move from the result of the earlier fill. v2: Use FIXED_GRF for register overlap tests. Since this is after register allocation, the VGRF values will not tell the whole truth. v3: Use brw_transform_inst. Suggested by Caio. Add brw_scratch_inst::offset instead of storing it as a source. Suggested by Lionel. v4: In intervening spill to the same location also invalidates the value. 🤦 v5: Don't eliminate a fill if its destination partially overlaps the preceeding fill destination. Fixes failures in cooperative matrix CTS. shader-db: Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown) total instructions in shared programs: 17249903 -> 17249653 (<.01%) instructions in affected programs: 35550 -> 35300 (-0.70%) helped: 20 / HURT: 0 total cycles in shared programs: 893092398 -> 893101836 (<.01%) cycles in affected programs: 2501720 -> 2511158 (0.38%) helped: 6 / HURT: 14 total fills in shared programs: 1901 -> 1776 (-6.58%) fills in affected programs: 1757 -> 1632 (-7.11%) helped: 20 / HURT: 0 fossil-db: Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown) Totals: Instrs: 929949528 -> 926770338 (-0.34%) Cycle count: 105126671329 -> 104851299099 (-0.26%); split: -0.28%, +0.02% Fill count: 6520785 -> 5021518 (-22.99%) Totals from 54281 (2.69% of 2018922) affected shaders: Instrs: 239616289 -> 236437099 (-1.33%) Cycle count: 22051883404 -> 21776511174 (-1.25%); split: -1.33%, +0.08% Fill count: 6406295 -> 4907028 (-23.40%) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>	2025-11-26 17:20:13 +00:00
Ian Romanick	d2e3707ecc	brw: Eliminate redundant fills and spills When the register allocator decides to spill a value, all writes to that value are spilled and all reads are filled. In regions where there is not high register pressure, a spill of a value may be followed by a fill of that same file while the spilled register is still live. This optimization pass finds these cases, and it converts the fill to a move from the still-live register. The restriction that the spill and the fill must have matching NoMask really hampers this optimization. With the restriction removed, the pass was more than 2x helpful. v2: Require force_writemask_all to be the same for the spill and the fill. v3: Use FIXED_GRF for register overlap tests. Since this is after register allocation, the VGRF values will not tell the whole truth. v4: Use brw_transform_inst. Suggested by Caio. The allows two of the loops to be merged. Add brw_scratch_inst::offset instead of storing it as a source. Suggested by Lionel. v5: Add no-fill-opt debug option to disable optimizations. Suggested by Lionel. v6: Move a calculation outside a loop. Suggested by Lionel. v7: Check that spill ranges overlap instead of just checking initial offset. Zero shaders in fossil-db were affected, but some CTS with spill_fs were fixed (e.g., dEQP-VK.subgroups.arithmetic.compute.subgroupmin_uint64_t_requiredsubgroupsize). Suggested by Lionel. v8: Add DEBUG_NO_FILL_OPT to debug_bits in brw_get_compiler_config_value(). Noticed by Lionel. shader-db: Lunar Lake total instructions in shared programs: 17249907 -> 17249903 (<.01%) instructions in affected programs: 10684 -> 10680 (-0.04%) helped: 2 / HURT: 0 total cycles in shared programs: 893092630 -> 893092398 (<.01%) cycles in affected programs: 237320 -> 237088 (-0.10%) helped: 2 / HURT: 0 total fills in shared programs: 1903 -> 1901 (-0.11%) fills in affected programs: 110 -> 108 (-1.82%) helped: 2 / HURT: 0 Meteor Lake and DG2 had similar results. (Meteor Lake shown) total instructions in shared programs: 19968898 -> 19968778 (<.01%) instructions in affected programs: 33020 -> 32900 (-0.36%) helped: 10 / HURT: 0 total cycles in shared programs: 885157211 -> 884925015 (-0.03%) cycles in affected programs: 39944544 -> 39712348 (-0.58%) helped: 8 / HURT: 2 total fills in shared programs: 4454 -> 4394 (-1.35%) fills in affected programs: 2678 -> 2618 (-2.24%) helped: 10 / HURT: 0 fossil-db: Lunar Lake Totals: Instrs: 930445228 -> 929949528 (-0.05%) Cycle count: 105195579417 -> 105126671329 (-0.07%); split: -0.07%, +0.00% Spill count: 3495279 -> 3494400 (-0.03%) Fill count: 6767063 -> 6520785 (-3.64%) Totals from 43844 (2.17% of 2018922) affected shaders: Instrs: 212614840 -> 212119140 (-0.23%) Cycle count: 19151130510 -> 19082222422 (-0.36%); split: -0.39%, +0.03% Spill count: 2831100 -> 2830221 (-0.03%) Fill count: 6128316 -> 5882038 (-4.02%) Meteor Lake and DG2 had similar results. (Meteor Lake shown) Totals: Instrs: 1001375893 -> 1001113407 (-0.03%) Cycle count: 92746180943 -> 92679877883 (-0.07%); split: -0.08%, +0.01% Spill count: 3729157 -> 3728585 (-0.02%) Fill count: 6697296 -> 6566874 (-1.95%) Totals from 35062 (1.53% of 2284674) affected shaders: Instrs: 179819265 -> 179556779 (-0.15%) Cycle count: 18111194752 -> 18044891692 (-0.37%); split: -0.41%, +0.04% Spill count: 2453752 -> 2453180 (-0.02%) Fill count: 5279259 -> 5148837 (-2.47%) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>	2025-11-26 17:20:13 +00:00
Ian Romanick	b7f5285ad3	brw: Add fill and spill opcodes for LSC platforms These opcodes are emitted during register allocation instead of the scratch reads and writes that were previously emitted. These instructions contain additional information (i.e., the instruction encodes the scratch offset) that enable optimizations to be added later. The fill and spill opcodes are lowered to scratch reads and writes shortly after register allocation. Eventually this lower may have some optimizations (e.g., reuse previous address calculations for successive spills). v2: Add brw_scratch_inst::offset instead of storing it as a source. Suggested by Lionel. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>	2025-11-26 17:20:12 +00:00
Ian Romanick	2215003d95	brw: Add OPT macro to brw_shader.cpp like brw_opt.cpp Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>	2025-11-26 17:20:11 +00:00
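
The etnaviv commit `ab86088438` listed above describes deferring the GPU state reset until the first draw call. A minimal sketch of that pattern follows. Only the flag name `needs_gpu_state_reset` comes from the commit message; the struct, the `sketch_*` helpers, and the `reset_count` field are hypothetical and exist purely for illustration. This is not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a driver context. */
struct sketch_context {
   bool needs_gpu_state_reset; /* flag name taken from the commit message */
   unsigned reset_count;       /* illustration only: counts real resets */
};

/* Stand-in for the expensive full pipeline state reset. */
static void sketch_reset_gpu_state(struct sketch_context *ctx)
{
   ctx->reset_count++;
   ctx->needs_gpu_state_reset = false;
}

/* Context creation only marks the reset as pending. */
static void sketch_context_init(struct sketch_context *ctx)
{
   assert(ctx != NULL);
   ctx->reset_count = 0;
   ctx->needs_gpu_state_reset = true;
}

/* A flush marks the reset as pending instead of performing it. */
static void sketch_flush(struct sketch_context *ctx)
{
   assert(ctx != NULL);
   ctx->needs_gpu_state_reset = true;
}

/* A BLT/RS-style operation does not need the rendering pipeline state,
 * so it leaves the pending reset untouched. */
static void sketch_blit(struct sketch_context *ctx)
{
   assert(ctx != NULL);
}

/* Only the draw path pays for the reset, and only when one is pending. */
static void sketch_draw(struct sketch_context *ctx)
{
   assert(ctx != NULL);
   if (ctx->needs_gpu_state_reset)
      sketch_reset_gpu_state(ctx);
   /* ... the draw itself would be emitted here ... */
}
```

With this shape, a sequence of flushes interleaved with blits never triggers a reset, and consecutive draws share a single one.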


215356 Commits
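
The pan/lib commit `bd9aeeec0a` listed above introduces `row_align_B` and `array_align_B` constraints. The sketch below shows one way such constraints could feed a stride computation. The two field names come from the commit message; the struct, the `sketch_*` functions, and the choice to treat an unset field (0) as 1 are assumptions made for illustration, not the library's actual layout code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constraint container; only the field names are from the
 * commit message. A value of 0 is assumed here to mean "not set". */
struct sketch_layout_constraints {
   uint64_t row_align_B;   /* alignment of each row of blocks, in bytes */
   uint64_t array_align_B; /* alignment of each array layer, in bytes */
};

/* Round value up to a multiple of align; an unset (0) alignment acts as 1. */
static uint64_t sketch_align_up(uint64_t value, uint64_t align)
{
   if (align == 0)
      align = 1;
   return (value + align - 1) / align * align;
}

/* Byte stride between consecutive rows of blocks. */
static uint64_t
sketch_row_stride_B(uint64_t row_size_B,
                    const struct sketch_layout_constraints *c)
{
   assert(row_size_B > 0);
   return sketch_align_up(row_size_B, c->row_align_B);
}

/* Byte stride between consecutive array layers. */
static uint64_t
sketch_layer_stride_B(uint64_t rows, uint64_t row_size_B,
                      const struct sketch_layout_constraints *c)
{
   assert(rows > 0);
   uint64_t surface_B = rows * sketch_row_stride_B(row_size_B, c);
   return sketch_align_up(surface_B, c->array_align_B);
}
```

For example, with a 4096-byte row alignment and a 65536-byte array alignment, a 5000-byte row is padded to 8192 bytes, and three such rows occupy one 65536-byte layer slot. With both fields left at 0 the strides are simply 5000 and 15000 bytes.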