AlexIndustrial/mesa

Author	SHA1	Message	Date
Eric Engestrom	4db58a04f9	ci/vkd3d: print a message when the expected failures file is missing Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29749>	2024-06-18 09:58:54 +00:00
Eric Engestrom	b1f82ce646	ci/vkd3d: deduplicate the diff between the expectation and the results We're seeing weird errors where the results file has disappeared, so let's start by combining the "is this right?" and "what's wrong?" logic into one. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29749>	2024-06-18 09:58:54 +00:00
Danylo Piliaiev	e602a7a392	freedreno/replay: Fix replaying without SET_IOVA Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29753>	2024-06-18 09:46:19 +00:00
Danylo Piliaiev	7c07c44d57	freedreno/rddecompiler: Make possible to use original shader Sometimes decompiled shader isn't easily compiled back into the same binary, e.g. when some part of bitset is not decoded. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29753>	2024-06-18 09:46:19 +00:00
Kenneth Graunke	9e750f00c3	intel/brw: Make opt_copy_propagation_defs clean up its own trash Copy propagation often eliminates all uses of an instruction. If we detect that we've done so, we can eliminate the instruction ourselves rather than leaving it hanging until the next DCE pass. This saves some CPU time as other passes don't see dead code. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	2af84c2d49	intel/brw: Use the defs-based copy propagation along with the old one The new def-based pass works better in many cases, and should be less resource intensive. However, the limited visibility of the defs-based pass due to many values not being SSA yet makes it unable to fully replace the old pass. Try the new one, and if it can't make progress, then try the old one. That way, things will mostly be handled by the new pass, but everything that was being cleaned up still will be. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	580e1c592d	intel/brw: Introduce a new SSA-based copy propagation pass (Quite a few of the restrictions here are ported from the old pass.) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	9690bd369d	intel/brw: Delete old local common subexpression elimination pass We no longer use this older pass, so there's no need to keep it. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	8f09c58ddc	intel/brw: Switch to the new defs-based global CSE pass While the limited visibility due to partial SSA is a downside to the new pass, it has a huge number of advantages that make it worth switching over even now. It's much more efficient, can eliminate redundant memory loads across blocks, and doesn't generate loads of unnecessary copies that other passes have to clean up. This means we also eliminate the infighting between the old CSE, coalescing, and copy propagation passes. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	234c45c929	intel/brw: Write a new global CSE pass that works on defs This has a number of advantages compared to the pass I wrote years ago: - It can easily perform either Global CSE or block-local CSE, without needing to roll any dataflow analysis, thanks to SSA def analysis. This global CSE is able to detect and coalesce memory loads across blocks. Although it may increase spilling a little, the reduction in memory loads seems to more than compensate. - Because SSA guarantees that values are never written more than once, the new CSE pass can directly reuse an existing value. The old pass emitted copies at the point where it discovered a value because it had no idea whether it'd be mutated later. This led it to generate a ton of trash for copy propagation to clean up later, and also a nasty fragility where CSE, register coalescing, and copy propagation could all fight one another by generating and cleaning up copies, leading to infinite optimization loops unless we were really careful. Generating less trash improves our CPU efficiency. - It uses hash tables like nir_instr_set and nir_opt_cse, instead of linearly walking lists and comparing each element. This is much more CPU efficient. - It doesn't use liveness analysis, which is one of the most expensive analysis passes that we have. Def analysis is cheaper. In addition to CSE'ing SSA values, we continue to handle flag writes, as this is a huge source of CSE'able values. These remain block local. However, we can simply track the last flag write, rather than creating entire sets of instruction entries like the old pass. Much simpler. The only real downside to this pass is that, because the backend is currently only partially SSA, it has limited visibility and isn't able to see all values. However, the results appear to be good enough that the new pass can effectively replace the old pass in almost all cases. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	2b30b3bbd4	intel/brw: Print defs in dump_instructions Like NIR, we print SSA defs as %1, %2, and so on. The number here is the VGRF number. VGRFs that don't correspond to a SSA def remain printed as vgrf1, vgrf2, and so on. This makes it much easier to see what values are SSA and which aren't. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Caio Oliveira	08da7edc0e	intel/brw: Track the number of uses of each def in def_analysis Even without a full use list, simply tracking the number of uses will let us tell "this is the only use of the def" or "we've just replaced all uses of a def". It's inexpensive to calculate and will be useful. (rebased by Kenneth Graunke) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	0d144821f0	intel/brw: Add a new def analysis pass This introduces a new analysis pass that opportunistically looks for VGRFs which happen to satisfy the SSA definition properties. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	ad9e414aa9	intel/brw: Skip LOAD_PAYLOADs after every texture instruction if possible This avoids generating a bunch of trash we have to clean up later. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	84219892ad	intel/brw: Make gl_SubgroupInvocation lane index loading SSA Our code to initialize gl_SubgroupInvocation uses multiple instructions some of which are partial writes. This makes it difficult to analyze expressions involving gl_SubgroupInvocation, which appear very frequently in compute shaders. To make this easier, we add a new virtual opcode which initializes a full VGRF to the value of gl_SubgroupInvocation. (We also expand it to UD for SIMD8 so there are not partial write issues.) We then lower it to the original code later on in compilation, after we've done the bulk of our optimizations. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	344d4ee9f0	intel/brw: Make VEC() perform a single write to its destination. This gathers a number of sources into a contiguous vector register, typically using LOAD_PAYLOAD. However, it uses MOV for a single source. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Timothy Arceri	7df492923a	glsl: drop dump-builder support from standalone compiler The support is incomplete and largely untested, but more importantly glsl ir is deprecated at this point. This feature was added to support building additional passes but that shouldn't ever be needed from here on. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29469>	2024-06-18 08:12:45 +00:00
Iago Toral Quiroga	02f33b7d92	broadcom/compiler: initialize payload_conflict for all initial nodes Fixes: `cb83f25b39` ('broadcom/compiler: don't assign payload registers to spilling setup temps') cc: mesa-stable Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29759>	2024-06-18 07:19:07 +00:00
Juan A. Suarez Romero	7dcba7e873	v3dv/ci: fix spurious line in expected Fixes: `c8c9d1a802` ("v3dv/ci: add expected failure") Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29763>	2024-06-18 08:48:38 +02:00
Mike Blumenkrantz	95828d8901	mesa/st: fix zombie shader handling for non-current programs for drivers that don't support PIPE_CAP_SHAREABLE_SHADERS, the zombie shader mechanism is used, storing shaders to delete after the next flush the zombie mechanism also calls bind__state(pipe, NULL) during deletion, however, which breaks drivers in the following scenario: create_all_shaders(pipe_A) * bind_vs(pipe_A, vs_A) * bind_fs(pipe_A, fs_A) * draw(pipe_A) * makeCurrent(pipe_B) * delete_vs(pipe_B, vs_B) * vs_B must only be deleted on pipe_A * zombie_shader_add(pipe_A, vs_B) * makeCurrent(pipe_A) * free_zombie_shaders(pipe_A) * bind_vs(pipe_A, NULL) * delete_vs(pipe_A, vs_B) * draw(pipe_A) * boom the problem being that bind_vs(pipe_A, NULL) was called when deleting vs_B, but it was actually vs_A which was bound to solve this, just flag the shader state for updating and let st figure it out Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11122 cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29680>	2024-06-18 03:51:05 +00:00
Marek Olšák	75777f1dc8	nir: add a NIR option flag nir_io_prefer_scalar_fs_inputs It's a NIR option because passing flags from radeonsi to the GLSL linker is complicated. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29406>	2024-06-17 23:48:35 +00:00
Marek Olšák	3622092614	glsl/linker: vectorize lowered IO Since we scalarize all IO for nir_opt_varyings, we should re-vectorize it. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29406>	2024-06-17 23:48:35 +00:00
Marek Olšák	2514999c9c	nir: add nir_opt_vectorize_io, vectorizing lowered IO Since nir_opt_varyings requires scalar IO and thus all drivers have to scalarize it, this gives the option to re-vectorize IO after that. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29406>	2024-06-17 23:48:35 +00:00
Marek Olšák	0058989357	nir/lower_io_to_scalar: don't create output stores that have no effect This fixes NIR validation errors that happen with certain shaders. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29406>	2024-06-17 23:48:35 +00:00
Marek Olšák	756b4f907e	nir/lower_io_to_scalar: add new_component temporary variable The next commit will use it. No change in behavior. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29406>	2024-06-17 23:48:35 +00:00
Francisco Jerez	06e4e088a3	intel/brw/xe2+: Use active-thread-only barriers available since Xe2+. These allow avoiding dead-locks in non-compliant applications that execute barriers under non-uniform control flow. They're not expected to have any major disadvantage so let's enable them unconditionally. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29562>	2024-06-17 16:19:18 -07:00
Francisco Jerez	8e61d32db8	iris,anv/xe2+: Use pipelined variant of 3DSTATE_DRAWING_RECTANGLE. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29562>	2024-06-17 16:19:17 -07:00
Francisco Jerez	576c9e3af2	iris,anv/xe2+: Set tessellation redistribution regions per patch to recommended values. See also HSDES#14015504893 regarding the region-based tessellation redistribution feature which allows fine-tuning the number of regions per patch. This sets it to the recommended value, since region-based redistribution is enabled by default. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29562>	2024-06-17 16:19:17 -07:00
Francisco Jerez	2aa4652a68	iris,anv/xe2+: Enable the DX10/OGL border mode for YCrCb as per Wa_14014226147. Hardware defaults to DX9 YCrCb border color mode instead of the behavior expected for DX10/OGL. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29562>	2024-06-17 16:19:17 -07:00
Juan A. Suarez Romero	c8c9d1a802	v3dv/ci: add expected failure This was caused when enabling VK_KHR_maintenance5 extension, but the problem is fixed using a new Vulkan Loader. Fixes: `a589901328` ("v3dv: expose VK_KHR_maintenance5") Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29756>	2024-06-17 22:03:01 +00:00
Alyssa Rosenzweig	ae3af4c73a	nir: document restriction on load_smem_amd constantness This came up while reviewing https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398 ... Possibly this intrinsic should be renamed to load_smem_constant_amd for consistency with load_global_constant. But if we're not going to convey constantness in the intrinsic name, let's at least document the restriction, because NIR's optimizer relies on it. (I didn't inspect every call site, but it looks like load_smem_amd is just used for descriptor loads so there's no bug to fix.) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29743>	2024-06-17 21:17:09 +00:00
Alyssa Rosenzweig	15257b65c6	treewide: use nir_metadata_control_flow Via Coccinelle patch: @@ @@ -nir_metadata_block_index \| nir_metadata_dominance +nir_metadata_control_flow ...plus some manual fixups for call sites missed by coccinelle. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom] Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29745>	2024-06-17 16:28:14 -04:00
Alyssa Rosenzweig	90b6dba772	nir: add nir_metadata_control_flow Most passes want to preserve this specific combination of metadata, so let's add an alias for the combination. The alias communicates that the control flow graph is preserved, rather than a particular statement about e.g. dominance preservation. You don't need to understand dominance to write a simple nir_shader_instructions_pass. And since you were going to cargo cult the metadata anyway, this way you'll cargo cult a version you're more likely to understand. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29745>	2024-06-17 16:28:11 -04:00
Daniel Schürmann	cfa5beeeab	spirv: workaround for tests assuming that OpKill terminates invocations or loops Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:16 +00:00
Daniel Schürmann	7af16e9f1e	nir/shader_info: remove uses_demote This flag is mostly redundant with uses_discard and was only introduced to implement demote with LLVM when it didn't have that intrinsic. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:16 +00:00
Daniel Schürmann	e52e8dd02e	zink: pass zink_screen to nir_to_spirv(). Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:16 +00:00
Daniel Schürmann	9b1a748b5e	nir: remove nir_intrinsic_discard The semantics of discard differ between GLSL and HLSL and their various implementations. Subsequently, numerous application bugs occurred and SPV_EXT_demote_to_helper_invocation was written in order to clarify the behavior. In NIR, we now have 3 different intrinsics for 2 things, and while demote and terminate have clear semantics, discard still doesn't and can mean either of the two. This patch entirely removes nir_intrinsic_discard and nir_intrinsic_discard_if and replaces all occurrences either with nir_intrinsic_terminate{_if} or nir_intrinsic_demote{_if} in the case that the NIR option 'discard_is_demote' is being set. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:16 +00:00
Faith Ekstrand	4a84725ebb	intel/blorp: Set nir_shader::options up-front before building Previously, we left it NULL until later in the compile. However, some builder helpers are starting to check the options and they blow up when options == NULL. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:15 +00:00
Daniel Schürmann	073e69c7dc	nir/opt_peephole_select: handle nir_terminate{_if} Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:15 +00:00
Daniel Schürmann	f3d8bd18dd	nir: introduce discard_is_demote compiler option This new option indicates that the driver emits the same code for nir_intrinsic_discard and nir_intrinsic_demote. Otherwise, it is assumed that discard is implemented as terminate. spirv_to_nir uses this option in order to directly emit nir_demote in case of OpKill. RADV GFX11: Totals from 3965 (4.99% of 79439) affected shaders: MaxWaves: 119418 -> 119424 (+0.01%); split: +0.03%, -0.03% Instrs: 1608753 -> 1620830 (+0.75%); split: -0.18%, +0.93% CodeSize: 8759152 -> 8785152 (+0.30%); split: -0.18%, +0.48% VGPRs: 152292 -> 149232 (-2.01%); split: -2.37%, +0.36% Latency: 9162314 -> 10033923 (+9.51%); split: -0.46%, +9.97% InvThroughput: 1491656 -> 1493408 (+0.12%); split: -0.10%, +0.22% VClause: 21424 -> 21452 (+0.13%); split: -0.31%, +0.44% SClause: 53598 -> 55871 (+4.24%); split: -2.15%, +6.39% Copies: 90553 -> 90462 (-0.10%); split: -2.91%, +2.81% Branches: 16283 -> 16311 (+0.17%) PreSGPRs: 113993 -> 113254 (-0.65%); split: -1.84%, +1.19% PreVGPRs: 110951 -> 108914 (-1.84%); split: -2.08%, +0.24% VALU: 963192 -> 963167 (-0.00%); split: -0.01%, +0.01% SALU: 87926 -> 90795 (+3.26%); split: -2.92%, +6.18% VMEM: 25937 -> 25936 (-0.00%) SMEM: 110012 -> 109799 (-0.19%); split: -0.20%, +0.01% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:15 +00:00
Daniel Schürmann	d5821bdf7d	radv: emit discard as demote by default Also removes radv_lower_discard_to_demote debug option. Totals from 1506 (1.90% of 79439) affected shaders: (GFX11) MaxWaves: 46432 -> 46448 (+0.03%) Instrs: 664515 -> 667914 (+0.51%); split: -0.15%, +0.67% CodeSize: 3569656 -> 3583440 (+0.39%); split: -0.12%, +0.51% VGPRs: 50100 -> 49680 (-0.84%); split: -0.96%, +0.12% Latency: 4221359 -> 4217875 (-0.08%); split: -0.67%, +0.59% InvThroughput: 628809 -> 625565 (-0.52%); split: -0.53%, +0.02% VClause: 9948 -> 9965 (+0.17%); split: -0.36%, +0.53% SClause: 19656 -> 19695 (+0.20%); split: -0.77%, +0.97% Copies: 32113 -> 33513 (+4.36%); split: -1.59%, +5.95% Branches: 8406 -> 8378 (-0.33%) PreSGPRs: 42328 -> 42555 (+0.54%); split: -0.39%, +0.93% PreVGPRs: 38451 -> 38203 (-0.64%); split: -0.78%, +0.14% VALU: 390770 -> 390208 (-0.14%); split: -0.16%, +0.02% SALU: 43318 -> 46374 (+7.05%); split: -0.08%, +7.14% VMEM: 15052 -> 15051 (-0.01%) SMEM: 37225 -> 37215 (-0.03%); split: -0.03%, +0.01% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:15 +00:00
Daniel Schürmann	e0ab1ed14e	spirv: make gl_HelperInvocation volatile if demote is being used Non-volatile gl_HelperInvocation after demote is undefined. In order to avoid application bugs, make it volatile if we use demote. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:15 +00:00
Erik Faye-Lund	9336190868	panvk: move macro-definition to header This define is used in panvk_physical_device.c as well, so it needs to be visible there. Fixes: `ac34183ec3` ("panvk: Move the VkPhysicalDevice logic to panvk_physical_device.{c,h}") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29751>	2024-06-17 19:15:10 +00:00
Pavel Ondračka	4b040577d5	r300: vectorization tweaks for R300/R400 Vectorization can make the constant layout worse and increase the constant register usage. The worst scenario is vectorization lowered indirect register access, where we access i-th element and later we access i-1 or i+1 (most notably glamor and gsk shaders). In this case we already added constants 1..n where n is the array size, however we can reuse them unless the lowered ladder gets vectorized later. Thus prevent vectorization of the specific patterns from lowered indirect access. This is quite a heavy hammer, we could in theory estimate how many slots will the current ubos and constants need and only disable vectorization when we are close to the limit. However, this would likely need a global shader analysis each time r300_should_vectorize_inst is called, which we want to avoid. So for now just don't vectorize anything that loads constants if we already have lot of uniforms. This is the final missing piece to make glamor work on R400. shader-db R420: total instructions in shared programs: 107288 -> 107290 (<.01%) instructions in affected programs: 236 -> 238 (0.85%) helped: 2 HURT: 3 total temps in shared programs: 17730 -> 17726 (-0.02%) temps in affected programs: 41 -> 37 (-9.76%) helped: 4 HURT: 0 total cycles in shared programs: 163251 -> 163251 (0.00%) cycles in affected programs: 478 -> 478 (0.00%) helped: 2 HURT: 3 GAINED: 7 (2 glamor and 5 GSK shaders) RV370 is quite similar instruction/temp-wise, but we don't gain any shader there, because they are all over the 64 instructions limit... Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10787 Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Filip Gawin <filip.gawin@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29734>	2024-06-17 18:16:02 +00:00
Pavel Ondračka	5f68ba505b	r300: missing whitespace in shader stats Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29734>	2024-06-17 18:16:02 +00:00
Amol Surati	4bf330471b	nine: avoid using post-compacted indices with state expecting pre-compacted ones The commit `973e6f3b` implemented compaction of the stream-number space. The functions `update_vertex_elements(_sw)` began using the post-compacted stream-numbers/indices when maintaining the `stream_usage_mask` and when reading from the arrays `vtxstride` and `stream_freq`. But, the `stream_instancedata_mask`, with which the `stream_usage_mask` is compared/bitwise-anded, maintains bits for the pre-compacted indices. Additionally, the information within the arrays is stored using the pre-compacted indices. The functions have a disagreement, regarding the type (pre- vs post- compacted) of indices, with the rest of the relevant source. This change removes the disagreement by having them use pre-compacted indices when maintaining the `stream_usage_mask` and when reading from the arrays. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11283 Fixes: `973e6f3b` ("gallium: remove start_slot parameter from pipe_context::set_vertex_buffers") Reviewed-by: Axel Davy <davyaxel0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29704>	2024-06-17 17:53:43 +00:00
Michel Dänzer	0bee32a4c3	wsi: Call drmSyncobjQuery only once for all images Reduces system call overhead. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29199>	2024-06-17 16:06:46 +00:00
Alyssa Rosenzweig	574c5c70de	nir/lower_robust_access: handle MSAA images We need to check the sample too. fixes on Honeykrisp with MSAA storage images: dEQP-VK.robustness.robustness2.bind.notemplate.r32i.dontunroll.nonvolatile.storage_image.fmt_qual.img.samples_4.2d_array.comp Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29741>	2024-06-17 15:28:15 +00:00
Samuel Pitoiset	bd59478d2f	radv: implement streamout on GFX12 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29676>	2024-06-17 14:46:36 +00:00
Samuel Pitoiset	aa9dfcad50	radv/nir: lower nir_intrinsic_load_xfb_state_address_gfx12_amd This intrinsic returns a 64-bit address that points to the streamout state buffer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29676>	2024-06-17 14:46:36 +00:00

190800 Commits