AlexIndustrial/mesa

Author	SHA1	Message	Date
Faith Ekstrand	1ffb0c5af4	nir: Support 0 and 32 bits in some format conversion helpers Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28793>	2024-06-19 01:56:22 +00:00
Faith Ekstrand	34161d3fda	nir: Move most of nir_format_convert to a C file There's no good reason for this to be header-only besides laziness on my part when I first wrote a few "small" helpers. Some of those are pretty good sized and don't need to be inlined. Keeping the original copyright since this is just moving code. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28793>	2024-06-19 01:56:22 +00:00
Faith Ekstrand	9d3b144018	nir: Add a nir_intrinsic_use for unit tests Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28793>	2024-06-19 01:56:22 +00:00
Faith Ekstrand	5b9ac9a68f	nir/format_convert: Use fmin/fmax to clamp R9G9B9E5 data As long as drivers implement an fmin/fmax that do the right thing with NaN, there's no reason for the integer comparison. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28793>	2024-06-19 01:56:22 +00:00
Faith Ekstrand	86aad90e2a	nir/format_convert: Smash NaN to 0 in pack_r9g9b9e5() I have no idea why I flipped the order of these two checks vs. the C code when I wrote the NIR helper. We need to deal with NaN first or else the fmin will smash NaN to MAX_RGB9E5 and it won't get handled as NaN. Fixes: `9981709d8f` ("nir/format_convert: Add a function to pack RGB9_E5 formats") Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28793>	2024-06-19 01:56:22 +00:00
Marek Olšák	75777f1dc8	nir: add a NIR option flag nir_io_prefer_scalar_fs_inputs It's a NIR option because passing flags from radeonsi to the GLSL linker is complicated. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29406>	2024-06-17 23:48:35 +00:00
Marek Olšák	2514999c9c	nir: add nir_opt_vectorize_io, vectorizing lowered IO Since nir_opt_varyings requires scalar IO and thus all drivers have to scalarize it, this gives the option to re-vectorize IO after that. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29406>	2024-06-17 23:48:35 +00:00
Marek Olšák	0058989357	nir/lower_io_to_scalar: don't create output stores that have no effect This fixes NIR validation errors that happen with certain shaders. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29406>	2024-06-17 23:48:35 +00:00
Marek Olšák	756b4f907e	nir/lower_io_to_scalar: add new_component temporary variable The next commit will use it. No change in behavior. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29406>	2024-06-17 23:48:35 +00:00
Alyssa Rosenzweig	ae3af4c73a	nir: document restriction on load_smem_amd constantness This came up while reviewing https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398 ... Possibly this intrinsic should be renamed to load_smem_constant_amd for consistency with load_global_constant. But if we're not going to convey constantness in the intrinsic name, let's at least document the restriction, because NIR's optimizer relies on it. (I didn't inspect every call site, but it looks like load_smem_amd is just used for descriptor loads so there's no bug to fix.) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29743>	2024-06-17 21:17:09 +00:00
Alyssa Rosenzweig	15257b65c6	treewide: use nir_metadata_control_flow Via Coccinelle patch: @@ @@ -nir_metadata_block_index | nir_metadata_dominance +nir_metadata_control_flow ...plus some manual fixups for call sites missed by coccinelle. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Karol Herbst <kherbst@redhat.com> Acked-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom] Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29745>	2024-06-17 16:28:14 -04:00
Alyssa Rosenzweig	90b6dba772	nir: add nir_metadata_control_flow Most passes want to preserve this specific combination of metadata, so let's add an alias for the combination. The alias communicates that the control flow graph is preserved, rather than a particular statement about e.g. dominance preservation. You don't need to understand dominance to write a simple nir_shader_instructions_pass. And since you were going to cargo cult the metadata anyway, this way you'll cargo cult a version you're more likely to understand. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29745>	2024-06-17 16:28:11 -04:00
Daniel Schürmann	7af16e9f1e	nir/shader_info: remove uses_demote This flag is mostly redundant with uses_discard and was only introduced to implement demote with LLVM when it didn't have that intrinsic. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:16 +00:00
Daniel Schürmann	9b1a748b5e	nir: remove nir_intrinsic_discard The semantics of discard differ between GLSL and HLSL and their various implementations. Subsequently, numerous application bugs occurred and SPV_EXT_demote_to_helper_invocation was written in order to clarify the behavior. In NIR, we now have 3 different intrinsics for 2 things, and while demote and terminate have clear semantics, discard still doesn't and can mean either of the two. This patch entirely removes nir_intrinsic_discard and nir_intrinsic_discard_if and replaces all occurrences either with nir_intrinsic_terminate{_if} or nir_intrinsic_demote{_if} in the case that the NIR option 'discard_is_demote' is being set. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:16 +00:00
Daniel Schürmann	073e69c7dc	nir/opt_peephole_select: handle nir_terminate{_if} Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:15 +00:00
Daniel Schürmann	f3d8bd18dd	nir: introduce discard_is_demote compiler option This new option indicates that the driver emits the same code for nir_intrinsic_discard and nir_intrinsic_demote. Otherwise, it is assumed that discard is implemented as terminate. spirv_to_nir uses this option in order to directly emit nir_demote in case of OpKill. RADV GFX11: Totals from 3965 (4.99% of 79439) affected shaders: MaxWaves: 119418 -> 119424 (+0.01%); split: +0.03%, -0.03% Instrs: 1608753 -> 1620830 (+0.75%); split: -0.18%, +0.93% CodeSize: 8759152 -> 8785152 (+0.30%); split: -0.18%, +0.48% VGPRs: 152292 -> 149232 (-2.01%); split: -2.37%, +0.36% Latency: 9162314 -> 10033923 (+9.51%); split: -0.46%, +9.97% InvThroughput: 1491656 -> 1493408 (+0.12%); split: -0.10%, +0.22% VClause: 21424 -> 21452 (+0.13%); split: -0.31%, +0.44% SClause: 53598 -> 55871 (+4.24%); split: -2.15%, +6.39% Copies: 90553 -> 90462 (-0.10%); split: -2.91%, +2.81% Branches: 16283 -> 16311 (+0.17%) PreSGPRs: 113993 -> 113254 (-0.65%); split: -1.84%, +1.19% PreVGPRs: 110951 -> 108914 (-1.84%); split: -2.08%, +0.24% VALU: 963192 -> 963167 (-0.00%); split: -0.01%, +0.01% SALU: 87926 -> 90795 (+3.26%); split: -2.92%, +6.18% VMEM: 25937 -> 25936 (-0.00%) SMEM: 110012 -> 109799 (-0.19%); split: -0.20%, +0.01% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>	2024-06-17 19:37:15 +00:00
Alyssa Rosenzweig	574c5c70de	nir/lower_robust_access: handle MSAA images We need to check the sample too. fixes on Honeykrisp with MSAA storage images: dEQP-VK.robustness.robustness2.bind.notemplate.r32i.dontunroll.nonvolatile.storage_image.fmt_qual.img.samples_4.2d_array.comp Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29741>	2024-06-17 15:28:15 +00:00
Karol Herbst	358e09f9ff	nir: add global_atomic_2x32 variants to nir_get_io_offset_src_number Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29711>	2024-06-17 10:07:56 +00:00
Karol Herbst	d2d966a3c2	nir_lower_mem_access_bit_sizes: support unaligned store_scratch This can trivially be added as it doesn't even need atomics. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29711>	2024-06-17 10:07:56 +00:00
Faith Ekstrand	7e3d157bee	nak,nir: Drop r2ur_nv in favor of as_uniform Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29737>	2024-06-15 06:14:27 +00:00
Job Noorman	0c1bb92690	nir/opt_offsets: add load/store_ssbo_ir3 Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28664>	2024-06-14 17:12:59 +00:00
Job Noorman	609a56d170	nir/opt_offsets: add option to allow offset wrapping On some ISAs (e.g., ir3) the offset calculation wraps the same way as normal unsigned addition so potentially wrapping operations do not have to be ignored. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28664>	2024-06-14 17:12:59 +00:00
Job Noorman	518c93768b	nir/opt_offsets: add callback for max base offset To support cases where different instructions may be used for the same storage type. For example, to load from an SSBO on ir3, either ldib (max offset 127) or isam.v (max offset 255) can be used. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28664>	2024-06-14 17:12:59 +00:00
Job Noorman	d3f8de791d	ir3: lower SSBO access imm offsets Add the BASE index to the load/store_ssbo_ir3 intrinsic to store an immediate offset. This offset is encoded in the corresponding fields of isam.v/ldib.b/stib.b. One extra optimization is implemented: whenever the regular offset is also a constant, the total offset (regular plus immediate) is aligned down to a multiple of the max immediate offset and this is used as the regular offset while the immediate is set to the remainder. This ensures that the register used for the regular offset can often be reused among multiple contiguous accesses. Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28664>	2024-06-14 17:12:59 +00:00
Faith Ekstrand	e05cb967e7	nir: Add nir_foreach_block_in_cf_node_safe() iterators Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>	2024-06-13 20:43:46 +00:00
Faith Ekstrand	b107240474	nir: Add some new _nv intrinsics The ldc_nv and ldcx_nv intrinsics correspond to the index and bindless forms of NVIDIA's LDC instruction, respectively. ldc_nv is pretty much load_ubo without some of the unnecessary constant bits while ldcx_nv takes a 64-bit bindless handle instead of an index. The other two give us a little control over register allocation at the NIR level to ensure that LDCX handles are placed in uniform registers. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>	2024-06-13 20:43:45 +00:00
Faith Ekstrand	290cbf413c	nir/print: Improve divergence information Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29591>	2024-06-13 20:43:44 +00:00
Timothy Arceri	4c3d1a09de	nir: add additional opt_loop_merge() test of deref handling Here we test the rematerialization of the deref produces valid nir when both the deref and array index value are moved to the else branch of the first terminator during the merge. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29686>	2024-06-13 15:00:35 +00:00
Timothy Arceri	abb51f449d	nir: test opt_loop_merge_terminators() skips unhandled loops This test makes sure the merge if pass skips loops with trailing phis as those are not handled by the pass. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29686>	2024-06-13 15:00:35 +00:00
Timothy Arceri	b26ef8f153	nir: correctly track current loop in nir_opt_loop() We were not restoring an outer loop as the current loop after we had finished processing a nested loop. Fixes: `9995f336e6` ("nir: add merge loop terminators optimisation") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29686>	2024-06-13 15:00:35 +00:00
Timothy Arceri	3d2a821198	nir: add test for opt_loop_merge_terminators Makes sure we correctly rematerialize derefs moved during the merge. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29686>	2024-06-13 15:00:35 +00:00
Rhys Perry	92af96e0b3	nir/opt_loop: fix formatting Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29686>	2024-06-13 15:00:35 +00:00
Rhys Perry	cb51a93c1e	nir/opt_loop: rematerialize derefs instead of creating phis Fixes NIR validation of hogwarts_legacy/00ac08423ad6e422. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `9995f336e6` ("nir: add merge loop terminators optimisation") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29686>	2024-06-13 15:00:35 +00:00
Alyssa Rosenzweig	f1144aa56f	nir/builtin_builder: factor out nir_build_texture_query useful for other queries too. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29614>	2024-06-11 13:10:22 +00:00
Timothy Arceri	9995f336e6	nir: add merge loop terminators optimisation Merge two consecutive basic terminators. Acked-by: Pavel Ondračka <pavel.ondracka@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28998>	2024-06-11 01:42:23 +00:00
Timothy Arceri	e25da8d8d7	nir: support more loop unrolling for logical operators Here we support finding loop count when the termination condition is a logical or. Acked-by: Pavel Ondračka <pavel.ondracka@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28998>	2024-06-11 01:42:23 +00:00
Timothy Arceri	987cf4b47d	nir: more aggressively remove in loop during partial unroll Acked-by: Pavel Ondračka <pavel.ondracka@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28998>	2024-06-11 01:42:23 +00:00
Timothy Arceri	9702570994	nir: clarify and update loop conditional instruction This value is intended to be used to remove out of bounds array access when unrolling loops so it should contain the comparison that contains the induction variable, not the overall condition of the loop terminator. So here we update the instruction when dealing with iand/ior loop terminator conditions. Acked-by: Pavel Ondračka <pavel.ondracka@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28998>	2024-06-11 01:42:23 +00:00
Alyssa Rosenzweig	31127d7b02	nir/lower_wpos_center: clean up Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29585>	2024-06-10 16:59:38 +00:00
Emma Anholt	3beae0f98e	nir,panfrost,agx: Fix driver PIXEL_COORD_INTEGER setting and drop workaround. nir_lower_frag_coord_to_pixel_coord was adding .5 to work around that the drivers were mistakenly setting PIXEL_COORD_HALF_INTEGER. With the setting corrected, the GL frontend handles it appropriately (instead of subtracting half in the frontend for ARB_fragment_coord_conventions integer setting and then adding the half back here), and makes the pass reusable from Intel. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29585>	2024-06-10 16:59:38 +00:00
Alyssa Rosenzweig	5f72234745	asahi: split param structs for GS internal kernel this simplifies state management considerably Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29607>	2024-06-07 16:57:03 +00:00
Georg Lehmann	75b1fa9263	nir/opt_algebraic: alternative 8bit pack_[us]norm_4x8 lowering Foz-DB Navi21: Totals from 42 (0.05% of 79395) affected shaders: Instrs: 2709529 -> 2705848 (-0.14%) CodeSize: 14720732 -> 14711384 (-0.06%); split: -0.06%, +0.00% VGPRs: 4096 -> 4104 (+0.20%) Latency: 17907612 -> 17904468 (-0.02%); split: -0.02%, +0.00% InvThroughput: 4723551 -> 4722649 (-0.02%); split: -0.02%, +0.00% Copies: 223516 -> 219819 (-1.65%) Branches: 109578 -> 109594 (+0.01%); split: -0.00%, +0.02% VALU: 1730848 -> 1727151 (-0.21%) Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28882>	2024-06-04 17:00:29 +00:00
Georg Lehmann	f66883a875	nir: lower pack_uvec4_to_uint to pack_32_4x8 if supported Foz-DB Navi31: Totals from 42 (0.05% of 79395) affected shaders: Instrs: 3326544 -> 3324640 (-0.06%) CodeSize: 16908376 -> 16896212 (-0.07%); split: -0.07%, +0.00% VGPRs: 4284 -> 4296 (+0.28%) Latency: 17862544 -> 17855438 (-0.04%); split: -0.05%, +0.01% InvThroughput: 3535291 -> 3533993 (-0.04%); split: -0.04%, +0.00% VClause: 95270 -> 95275 (+0.01%); split: -0.01%, +0.01% SClause: 65402 -> 65397 (-0.01%) Copies: 229723 -> 234124 (+1.92%) Branches: 109481 -> 109518 (+0.03%); split: -0.00%, +0.04% PreVGPRs: 3879 -> 3909 (+0.77%) VALU: 1789208 -> 1787370 (-0.10%); split: -0.10%, +0.00% SALU: 409136 -> 409129 (-0.00%); split: -0.00%, +0.00% Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28882>	2024-06-04 17:00:29 +00:00
Faith Ekstrand	7e6cd395c7	nir: Handle cmat types in lower_variable_initializers Fixes: `b98f87612b` ("spirv: Implement SPV_KHR_cooperative_matrix") Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29509>	2024-06-04 16:34:48 +00:00
Georg Lehmann	18a0ff137f	nir: sink/move inverse_ballot like moves It's just a copy for the backends that don't lower it. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29502>	2024-06-04 15:40:57 +00:00
Georg Lehmann	690f880d18	nir/opt_uniform_atomics: handle inverse_ballot when detecting single lane ifs Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29502>	2024-06-04 15:40:57 +00:00
Ian Romanick	7b7e5cf5d4	nir/algebraic: intel/fs: Optimize some patterns before lowering 64-bit integers v2: Add some comments explaining some of the nuance of the shift optimizations. Fix a bug in the shift count calculation of the upper 32-bits. Move the @64 from the variable to the opcode. All suggested by Jordan. No shader-db changes on any Intel platform. fossil-db: Meteor Lake and DG2 had similar results. (Meteor Lake shown) Totals: Instrs: 154507026 -> 154506576 (-0.00%) Cycle count: 17436298868 -> 17436295016 (-0.00%) Max live registers: 32635309 -> 32635297 (-0.00%) Totals from 42 (0.01% of 632575) affected shaders: Instrs: 5616 -> 5166 (-8.01%) Cycle count: 133680 -> 129828 (-2.88%) Max live registers: 1158 -> 1146 (-1.04%) No fossil-db changes on any other Intel platform. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>	2024-05-31 09:13:23 -07:00
Ian Romanick	4834df82e2	nir/algebraic: More patterns to generate iadd3 I noticed some shaders with patterns similar to these while working on cooperative matrix lowering. Meteor Lake and DG2 are the only platforms that support iadd3, so there were no shader-db or fossil-db changes on any other platforms. shader-db: Meteor Lake and DG2 had similar results. (Meteor Lake shown) total instructions in shared programs: 19869445 -> 19868343 (<.01%) instructions in affected programs: 419426 -> 418324 (-0.26%) helped: 913 / HURT: 2 total cycles in shared programs: 936010029 -> 935909811 (-0.01%) cycles in affected programs: 31746523 -> 31646305 (-0.32%) helped: 495 / HURT: 356 LOST: 10 GAINED: 12 fossil-db: Meteor Lake and DG2 had similar results. (Meteor Lake shown) Totals: Instrs: 154514596 -> 154505466 (-0.01%); split: -0.01%, +0.00% Cycle count: 17540226067 -> 17436266198 (-0.59%); split: -0.63%, +0.04% Spill count: 146887 -> 146886 (-0.00%) Fill count: 272499 -> 272489 (-0.00%); split: -0.01%, +0.00% Max live registers: 32634290 -> 32634739 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 5550128 -> 5550368 (+0.00%) Totals from 4401 (0.70% of 632560) affected shaders: Instrs: 3095239 -> 3086109 (-0.29%); split: -0.30%, +0.00% Cycle count: 7327352564 -> 7223392695 (-1.42%); split: -1.51%, +0.10% Spill count: 28105 -> 28104 (-0.00%) Fill count: 45830 -> 45820 (-0.02%); split: -0.04%, +0.02% Max live registers: 264376 -> 264825 (+0.17%); split: -0.05%, +0.22% Max dispatch width: 43768 -> 44008 (+0.55%) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>	2024-05-31 09:13:23 -07:00
Ian Romanick	f1b941aaec	nir/search: Refactor is_16_bits Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Suggested-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>	2024-05-31 09:13:23 -07:00
Ian Romanick	6e53be2a0a	nir/search: Fix is_16_bits for vectors Require that all elements of a vector be representable as either int16_t or uint16_t. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Fixes: `7ef45e661f` ("intel/fs: Add constant propagation for ADD3") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>	2024-05-31 09:13:23 -07:00
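
Commit 86aad90e2a says that NaN has to be dealt with before the clamp in pack_r9g9b9e5(), because the fmin would otherwise turn NaN into MAX_RGB9E5. The block below is a minimal Python sketch of that ordering issue, not the NIR helper itself. All function names are hypothetical, `fmin`/`fmax` are modelled as returning the non-NaN operand (an assumption about what "do the right thing with NaN" means), and 65408.0 is assumed to be the value of MAX_RGB9E5.

```python
import math

# Assumed largest finite RGB9E5 value: (511 / 512) * 2**16.
MAX_RGB9E5 = 65408.0


def fmin(a, b):
    """Model of a minimum that returns the other operand if one is NaN."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def fmax(a, b):
    """Model of a maximum that returns the other operand if one is NaN."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)


def clamp_nan_first(x):
    """Smash NaN to 0 first, then clamp to [0, MAX_RGB9E5]."""
    if math.isnan(x):
        x = 0.0
    return fmin(fmax(x, 0.0), MAX_RGB9E5)


def clamp_nan_last(x):
    """Hypothetical flipped order: the NaN check comes after fmin."""
    clamped = fmin(x, MAX_RGB9E5)  # NaN has already become MAX_RGB9E5 here
    if math.isnan(clamped):        # so this check can never trigger
        clamped = 0.0
    return fmax(clamped, 0.0)
```

With these definitions, `clamp_nan_first(float("nan"))` yields 0.0 while `clamp_nan_last(float("nan"))` yields `MAX_RGB9E5`, which is the failure mode the commit message describes.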

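Commit d3f8de791d describes splitting a constant SSBO offset into a regular part and an immediate part: the total is aligned down to a multiple of the max immediate offset, and the remainder becomes the immediate. The arithmetic can be sketched in Python as follows. This is an illustration of the description only, not the ir3 code; `split_offset` and its parameters are hypothetical names, and the limit of 127 is borrowed from the ldib example in commit 518c93768b.

```python
def split_offset(regular, imm, max_imm):
    """Split a constant total offset into (regular, immediate).

    The regular part is the total aligned down to a multiple of max_imm,
    and the immediate part is the remainder, so it is always < max_imm.
    """
    total = regular + imm
    new_regular = (total // max_imm) * max_imm
    new_imm = total - new_regular
    return new_regular, new_imm


# Three contiguous accesses end up sharing the same regular offset,
# so one register could serve all of them.
accesses = [split_offset(offset, 0, 127) for offset in (130, 134, 138)]
shared_regular = {regular for regular, _ in accesses}
```

Here every entry in `accesses` has the regular offset 127, which matches the commit's point that the register holding the regular offset can often be reused among contiguous accesses.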

5399 Commits
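
Commits 90b6dba772 and 15257b65c6 introduce nir_metadata_control_flow as an alias for the combination nir_metadata_block_index | nir_metadata_dominance. A rough Python model of such a flag alias is shown below. The class, the numeric values, and the `LIVE_DEFS` member are illustrative assumptions and are not taken from the NIR headers.

```python
from enum import Flag


class Metadata(Flag):
    """Toy stand-in for a metadata bitmask (values are made up)."""
    NONE = 0
    BLOCK_INDEX = 1
    DOMINANCE = 2
    LIVE_DEFS = 4  # illustrative extra flag, not part of the alias
    # The alias names the intent: the control flow graph is preserved.
    CONTROL_FLOW = BLOCK_INDEX | DOMINANCE


ALL = Metadata.BLOCK_INDEX | Metadata.DOMINANCE | Metadata.LIVE_DEFS


def invalidated_by(preserved):
    """Return the metadata a pass invalidates, given what it preserves."""
    return ALL & ~preserved
```

In this model a pass that preserves `Metadata.CONTROL_FLOW` invalidates only `Metadata.LIVE_DEFS`, and the alias compares equal to the spelled-out combination, which is why the Coccinelle replacement is behavior-preserving.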