AlexIndustrial/mesa

Author	SHA1	Message	Date
Simon Perretta	6dd0a5ee2d	pvr, pco: switch to clc query shaders Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37439>	2025-09-22 14:52:04 +01:00
Simon Perretta	6edb72d28b	pco: replace {un,}packing alu ops with intrinsics Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:19 +00:00
Simon Perretta	8104ef4e01	pco: support 1010102 snorm, [us]scaled formats Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:19 +00:00
Simon Perretta	672541d036	nir, asahi: commonize interleave_agx Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:12 +00:00
Simon Perretta	78062fbb75	pvr, pco: improved image write (with format) support, handle 111110 Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:11 +00:00
Simon Perretta	ed652e10fc	pco: force image/texture array coordinate f2i32 conversions to be rtne Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:11 +00:00
Simon Perretta	b50f0b47d2	pco: add support for sscaled8* formats Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:09 +00:00
Simon Perretta	db686e190a	pvr, pco: per frag/vertex input/output rework Adds support for packing and unpacking r10g10b10a2 unorm and r11g11b10 float formats, as well as partial 2x16 and 4x8 formats. Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:09 +00:00
Simon Perretta	b7c0863b97	pco: add uadd64_32 op Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:08 +00:00
Simon Perretta	8ec174b3f9	pco: add support for various selection, complex, trig ops Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>	2025-09-16 18:26:08 +00:00
Alyssa Rosenzweig	b9c2579ae0	nir: unmark 24b multiply as associative Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Suggested-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>	2025-07-21 11:42:19 +00:00
Alyssa Rosenzweig	076f245df8	nir: restrict associativity to binary operations mathematically, associativity is only defined for binary operations. I have no idea what "associativity" would even mean for imad. I can kinda see the idea for iadd3 but iadd3 should not be formed until after reassociating adds so the point is moot. Unmark the "associative" ternary operations, and assert that associativity implies binary. nothing uses associativity yet, so this doesn't cause any functional change. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>	2025-07-21 11:42:19 +00:00
Alyssa Rosenzweig	e466b8735b	nir: introduce "inexact associative" property nothing currently uses the associative flag, but that will change soon. we need to stop incorrectly marking fmul/fadd/etc as associative, because they're not, but they almost are. distinguish these properties so we can correctly handle floating point rules without any opcode-based special casing. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>	2025-07-21 11:42:19 +00:00
Georg Lehmann	d672737372	nir,aco: add byte_perm_amd Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>	2025-07-16 11:46:52 +00:00
Georg Lehmann	f047a67fba	nir,aco: optimize FP16_OFVL pattern created by vkd3d-proton Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:27 +00:00
Georg Lehmann	5addbf63f9	nir: add float8 conversion opcodes Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:24 +00:00
Samuel Pitoiset	226b0e28db	nir: generalize bitfield insert/extract sizes Original patch from Alyssa Rosenzweig Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35209>	2025-06-04 09:37:53 +00:00
Rhys Perry	397920c16e	nir: fix left shift of negative value in ibfe constant folding Fixes "left shift of negative value -128" with parallel_rdp/00f93a9497dfbb3b and UBSan. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255>	2025-06-03 09:45:01 +00:00
Rhys Perry	78aae4b1ba	nir: fix signed overflow in pack_half_2x16 constant folding Without this cast, the left shift is promoted to 'int'. Fixes "left shift of 50432 by 16 places cannot be represented in type 'int'" with horizon_zero_dawn/001064f580f8e3be and UBSan. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255>	2025-06-03 09:45:01 +00:00
Rhys Perry	6852538ba0	nir: fix unpack_unorm_2x16/unpack_snorm_2x16 constant folding Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 25.0 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255>	2025-06-03 09:45:01 +00:00
Alyssa Rosenzweig	759dc70bde	nir: generalize bitfield_reverse bit size No reason we can't reverse other bit sizes, we just need to generalize the constant folding & bit size lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35198>	2025-05-28 16:29:30 +00:00
Georg Lehmann	ba63263f32	nir: add bfdot2_bfadd and use it for lowering bfdot if supported Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>	2025-05-09 11:20:26 +00:00
Caio Oliveira	cf4021f93c	nir: Add opcodes for BFloat16 SPV_KHR_bfloat16 requires a small set of operations, since it doesn't support all the arithmetic ops. This patch adds conversions to/from Float32 and also the necessary ops (bfdot, bffma, bfmul) to implement SpvOpDot using the same lowering approach as the Float32 counterpart. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:36 +00:00
Benjamin Lee	252c59602e	panfrost: implement 16-bit ldexp Bifrost LDEXP.v2f16 takes a 16-bit exponent, which requires messy lowering. The codegen for this is quite bad currently, but would be improved by implementing unpack_32_2x16_split_*, and by fusing comparisons with CSEL. The main alternative is converting to F32, then LDEXP.f32, then converting back to F16. This has better codegen for dynamic exponents currently, but worse in the common case with a constant exponent where all the saturating cast logic can be folded. Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.ldexp.compute.vec2 when shaderFloat16 is enabled in panvk. Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>	2025-02-27 16:49:11 +00:00
Mel Henning	11b8c8b8e6	nak,nir: Add 64-bit lea_nv Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32517>	2025-02-13 17:36:41 +00:00
Mel Henning	0470643047	nak,nir: Add 32-bit nir_op_lea_nv and use it Changes code size by -0.80% on shaderdb. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32517>	2025-02-13 17:36:41 +00:00
Alyssa Rosenzweig	bd89279dd4	nir: add lower_scratch_to_var pass to ease opencl pain. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>	2024-12-12 21:16:13 +00:00
Karmjit Mahil	b79994e92d	nir,ir3: Add icsel_eqz In IR3 `sel.b32` works based on the 0 so add `icsel_eqz` to fuse the cmp and sel that we'd otherwise need. total Instruction Count in shared programs: 1112814 -> 1110473 (-0.21%) Instruction Count in affected programs: 162701 -> 160360 (-1.44%) helped: 81 HURT: 29 Instruction count are helped. total MOV Count in shared programs: 86777 -> 88671 (2.18%) MOV Count in affected programs: 28119 -> 30013 (6.74%) helped: 1 HURT: 292 Mov count are HURT. total COV Count in shared programs: 15070 -> 14962 (-0.72%) COV Count in affected programs: 5770 -> 5662 (-1.87%) helped: 76 HURT: 2 Cov count are helped. total Last helper instruction in shared programs: 592729 -> 590638 (-0.35%) Last helper instruction in affected programs: 91331 -> 89240 (-2.29%) helped: 30 HURT: 1 Last helper instruction are helped. total Instructions with SS sync bit in shared programs: 29336 -> 29546 (0.72%) Instructions with SS sync bit in affected programs: 4702 -> 4912 (4.47%) helped: 8 HURT: 43 Instructions with ss sync bit are HURT. total Estimated cycles stalled on SS in shared programs: 111590 -> 112401 (0.73%) Estimated cycles stalled on SS in affected programs: 27708 -> 28519 (2.93%) helped: 21 HURT: 61 Estimated cycles stalled on ss are HURT. total cat1 instructions in shared programs: 101933 -> 103695 (1.73%) cat1 instructions in affected programs: 35804 -> 37566 (4.92%) helped: 18 HURT: 290 Cat1 instructions are HURT. total cat2 instructions in shared programs: 380299 -> 377499 (-0.74%) cat2 instructions in affected programs: 128609 -> 125809 (-2.18%) helped: 322 HURT: 0 Cat2 instructions are helped. Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32189>	2024-12-06 08:42:36 +00:00
Job Noorman	22fc90a116	nir: add ir3-specific bitwise triop opcodes ir3 has a number of bitwise triops (e.g., shrm == (src0 >> src1) & src2) that don't have NIR-equivalents. Doing instruction selection for them is a lot more convenient using algebraic patterns than to have to manually match for them. This patch adds NIR opcodes for these instructions. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Rob Clark <robclark@freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32181>	2024-11-28 06:19:59 +00:00
Rhys Perry	0619e4db63	nir,aco,ac/llvm: add nir_op_alignbyte_amd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Alyssa Rosenzweig	9ab8d70fa6	nir: add ilea_agx/ulea_agx opcodes to facilitate address mode lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	85b3dc90e0	nir,agx: lower fmin/fmax in NIR we want to elide flushes, doing so requires more sophisticated analysis than I'd like in the middle of isel. also, it should be done before forming preambles for efficiency (notice the uniform reduction here). let's do it with a NIR pass. total instructions in shared programs: 2768481 -> 2757832 (-0.38%) instructions in affected programs: 644084 -> 633435 (-1.65%) helped: 2242 HURT: 18 helped stats (abs) min: 1 max: 349 x̄: 4.77 x̃: 3 helped stats (rel) min: 0.01% max: 34.91% x̄: 3.19% x̃: 2.19% HURT stats (abs) min: 1 max: 19 x̄: 2.89 x̃: 1 HURT stats (rel) min: 0.24% max: 7.94% x̄: 1.27% x̃: 0.81% 95% mean confidence interval for instructions value: -5.20 -4.22 95% mean confidence interval for instructions %-change: -3.30% -3.01% Instructions are helped. total alu in shared programs: 2182880 -> 2172352 (-0.48%) alu in affected programs: 513166 -> 502638 (-2.05%) helped: 2235 HURT: 16 helped stats (abs) min: 1 max: 349 x̄: 4.73 x̃: 3 helped stats (rel) min: 0.02% max: 37.65% x̄: 3.70% x̃: 2.59% HURT stats (abs) min: 1 max: 19 x̄: 2.50 x̃: 1 HURT stats (rel) min: 0.33% max: 3.74% x̄: 1.04% x̃: 0.91% 95% mean confidence interval for alu value: -5.16 -4.20 95% mean confidence interval for alu %-change: -3.83% -3.49% Alu are helped. total fscib in shared programs: 2178643 -> 2168059 (-0.49%) fscib in affected programs: 514666 -> 504082 (-2.06%) helped: 2243 HURT: 17 helped stats (abs) min: 1 max: 349 x̄: 4.74 x̃: 3 helped stats (rel) min: 0.02% max: 37.65% x̄: 3.74% x̃: 2.59% HURT stats (abs) min: 1 max: 19 x̄: 2.65 x̃: 1 HURT stats (rel) min: 0.33% max: 14.71% x̄: 1.85% x̃: 0.93% 95% mean confidence interval for fscib value: -5.16 -4.20 95% mean confidence interval for fscib %-change: -3.87% -3.53% Fscib are helped. 
total bytes in shared programs: 18467348 -> 18403042 (-0.35%) bytes in affected programs: 4403648 -> 4339342 (-1.46%) helped: 2247 HURT: 20 helped stats (abs) min: 2 max: 2132 x̄: 28.73 x̃: 18 helped stats (rel) min: 0.01% max: 33.53% x̄: 2.80% x̃: 1.94% HURT stats (abs) min: 4 max: 72 x̄: 12.60 x̃: 6 HURT stats (rel) min: 0.23% max: 6.58% x̄: 1.06% x̃: 0.75% 95% mean confidence interval for bytes value: -31.29 -25.45 95% mean confidence interval for bytes %-change: -2.90% -2.64% Bytes are helped. total regs in shared programs: 864605 -> 864442 (-0.02%) regs in affected programs: 4692 -> 4529 (-3.47%) helped: 68 HURT: 48 helped stats (abs) min: 1 max: 54 x̄: 7.25 x̃: 3 helped stats (rel) min: 4.26% max: 43.20% x̄: 13.21% x̃: 10.53% HURT stats (abs) min: 1 max: 36 x̄: 6.88 x̃: 6 HURT stats (rel) min: 3.64% max: 91.67% x̄: 23.12% x̃: 24.00% 95% mean confidence interval for regs value: -3.60 0.79 95% mean confidence interval for regs %-change: -2.10% 5.75% Inconclusive result (value mean confidence interval includes 0). total uniforms in shared programs: 2120927 -> 2120911 (<.01%) uniforms in affected programs: 770 -> 754 (-2.08%) helped: 6 HURT: 0 helped stats (abs) min: 2 max: 4 x̄: 2.67 x̃: 2 helped stats (rel) min: 1.79% max: 2.70% x̄: 2.13% x̃: 1.96% 95% mean confidence interval for uniforms value: -3.75 -1.58 95% mean confidence interval for uniforms %-change: -2.50% -1.76% Uniforms are helped. total threads in shared programs: 27612224 -> 27613056 (<.01%) threads in affected programs: 7168 -> 8000 (11.61%) helped: 6 HURT: 3 helped stats (abs) min: 64 max: 192 x̄: 170.67 x̃: 192 helped stats (rel) min: 8.33% max: 23.08% x̄: 20.62% x̃: 23.08% HURT stats (abs) min: 64 max: 64 x̄: 64.00 x̃: 64 HURT stats (rel) min: 8.33% max: 9.09% x̄: 8.59% x̃: 8.33% 95% mean confidence interval for threads value: -3.17 188.06 95% mean confidence interval for threads %-change: -0.92% 22.69% Inconclusive result (value mean confidence interval includes 0). 
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>	2024-10-30 10:14:07 -04:00
Rhys Perry	b2abd3bdba	nir: fix shfr constant folding with zero src2 No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `08903bbe89` ("nir: add mqsad_4x8, shfr and nir_opt_mqsad") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31808>	2024-10-25 09:59:40 +00:00
Georg Lehmann	dbf63a0788	nir: remove nir_op_is_derivative Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Georg Lehmann	f9d2aad7a3	nir: remove alu ddx/ddy Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Alyssa Rosenzweig	6287c8251d	nir: add bounds_agx opcode used to facilitate bounds checking optimization Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31532>	2024-10-05 18:30:11 +00:00
Iago Toral Quiroga	aac1c074cc	nir: make fclamp_pos_mali and fsat_signed_mali opcodes generic V3D can use these too. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>	2024-10-03 09:02:07 +00:00
Alyssa Rosenzweig	749205fe06	pan/bi: switch to derivative intrinsics rewrote most of the impl but shrug. regresses code gen for mediump but I'm not too bothered given the lackluster perf of fp16 on bifrost :( Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30567>	2024-08-14 01:34:54 +00:00
Faith Ekstrand	bbccbd8d50	nir,nak: Add a nir_op_prmt_nv We have this in hardware since forever and it's really useful. May as well add it to NIR so we can use it in various lowerings. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30218>	2024-07-17 13:38:24 +00:00
Alyssa Rosenzweig	6f48fa4ebe	nir: strengthen fmin/fmax definitions with signed zero SPIR-V strengthened the semantics around signed zero, requiring fmin(-0, +0) = -0. Since nir_op_fmin is commutative, we must also require fmin(+0, -0) = -0 to match, although it's unclear if SPIR-V requires that. We must strengthen NIR's definitions accordingly. This strengthening is additionally motivated by the existing nir_opt_algebraic rule like: (('fmin', a, ('fneg', a)), ('fneg', ('fabs', a))), With the strengthened new definition, this transform is clearly exact. With the weaker definition, the transform could change the sign of zero based on implementation-defined behaviours which ... while, not exactly unsound, is undesirable semantically. ... This is probably technically a bug fix, but I'm not convinced it's worth its weight in backporting. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>	2024-07-15 19:29:00 +00:00
Alyssa Rosenzweig	7fc5a2296b	nir: use MIN2/MAX2 opcodes for imin/umax folding This is more idiomatic and already #include'd. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>	2024-07-15 19:29:00 +00:00
Georg Lehmann	99372c1ed7	nir: add ford, funord, fneo, fequ, fltu, fgeu Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29467>	2024-06-27 08:12:29 +00:00
Juan A. Suarez Romero	60e7cb7654	nir: use unsigned types when performing bitshifting Ensure unsigned integers are used instead of signed ones when performing left bit shifts. This has been detected by the Undefined Behaviour Sanitizer (UBSan). Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29772>	2024-06-21 21:07:05 +00:00
Juan A. Suarez Romero	e43cc49806	nir: fix overflow when negating maxint in constant expressions Undefined Behaviour Sanitizer (UBSan) detected the following when running testing `dEQP-VK.graphicsfuzz.cov-fold-negate-min-int-value`: `negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself` SPIR-V spec states that OpSNegate(0x80000000) has to return 0x80000000; in our case, -2147483648 should be -2147483648. While this is not causing any issue because compilers seem to be behaving like that, it is still undefined behaviour, so it should be handled explicitly, which is the purpose of this commit. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29772>	2024-06-21 21:07:05 +00:00
Georg Lehmann	dcab408a6c	nir: remove unpack_half_flush_to_zero It doesn't make sense to have two sets of opcodes for this when all backends that support the flush_to_zero variant just rely on the global floating point mode anyway. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29433>	2024-05-31 09:46:35 +00:00
Connor Abbott	32308fe9f1	ir3/nir: Fix imadsh_mix16 definition The constant-folding definition and comments say that it takes the high 16 bits of the first source and low 16 bits of the second source, but actually it's the opposite. The algebraic optimization, which actually happens and needs to be correct, was correct but the comment above it was wrong. Note that in the way we use it when lowering multiplications, the ordering doesn't matter. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22075>	2024-04-26 12:55:14 +00:00
Rhys Perry	08903bbe89	nir: add mqsad_4x8, shfr and nir_opt_mqsad Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26251>	2024-04-05 11:01:39 +00:00
Faith Ekstrand	f4fb5277c3	nir: Add an imad opcode Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27159>	2024-02-27 21:51:30 -06:00
Rhys Perry	ae54cbeb3f	nir: remove sad_u8x4 All uses of this can be replaced with msad_4x8. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907>	2024-01-05 18:55:22 +00:00
Rhys Perry	0477421f7d	nir: add msad_4x8 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907>	2024-01-05 18:55:22 +00:00

221 Commits