mathemtically, associativity is only defined for binary operations. I have no
idea what "associativity" would even mean for imad. I can kinda see the idea for
iadd3 but iadd3 should not be formed until after reassociating adds so the point
is moot. Unmark the
"associative" ternary operations, and assert that associativity implies binary.
nothing uses associativity yet, so this doesn't cause any functional change.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
nothing currently uses the associative flag, but they will change soon. we need
to stop incorrectly marking fmul/fadd/etc as associative, because they're not,
but they almost are. distinguish these properties so we can correctly
handle floating point rules without any opcode-based special casing.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
Bifrost LDEXP.v2f16 takes a 16-bit exponent, which requires messy
lowering. The codegen for this is quite bad currently, but would be
improved by implementing unpack_32_2x16_split_*, and by fusing
comparisons with CSEL.
The main alternative is converting to F32, then LDEXP.f32, then
converting back to F16. This has better codegen for dynamic exponents
currently, but worse in the common case with a constant exponent where
all the saturating cast logic can be folded.
Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.ldexp.compute.vec2
when shaderFloat16 is enabled in panvk.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
In IR3 `sel.b32` works based on the 0 so add `icsel_eqz` to fuse the
cmp and sel that we'd otherwise need.
total Instruction Count in shared programs: 1112814 -> 1110473 (-0.21%)
Instruction Count in affected programs: 162701 -> 160360 (-1.44%)
helped: 81
HURT: 29
Instruction count are helped.
total MOV Count in shared programs: 86777 -> 88671 (2.18%)
MOV Count in affected programs: 28119 -> 30013 (6.74%)
helped: 1
HURT: 292
Mov count are HURT.
total COV Count in shared programs: 15070 -> 14962 (-0.72%)
COV Count in affected programs: 5770 -> 5662 (-1.87%)
helped: 76
HURT: 2
Cov count are helped.
total Last helper instruction in shared programs: 592729 -> 590638 (-0.35%)
Last helper instruction in affected programs: 91331 -> 89240 (-2.29%)
helped: 30
HURT: 1
Last helper instruction are helped.
total Instructions with SS sync bit in shared programs: 29336 -> 29546 (0.72%)
Instructions with SS sync bit in affected programs: 4702 -> 4912 (4.47%)
helped: 8
HURT: 43
Instructions with ss sync bit are HURT.
total Estimated cycles stalled on SS in shared programs: 111590 -> 112401 (0.73%)
Estimated cycles stalled on SS in affected programs: 27708 -> 28519 (2.93%)
helped: 21
HURT: 61
Estimated cycles stalled on ss are HURT.
total cat1 instructions in shared programs: 101933 -> 103695 (1.73%)
cat1 instructions in affected programs: 35804 -> 37566 (4.92%)
helped: 18
HURT: 290
Cat1 instructions are HURT.
total cat2 instructions in shared programs: 380299 -> 377499 (-0.74%)
cat2 instructions in affected programs: 128609 -> 125809 (-2.18%)
helped: 322
HURT: 0
Cat2 instructions are helped.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32189>
ir3 has a number of bitwise triops (e.g., shrm == (src0 >> src1) & src2)
that don't have NIR-equivalents. Doing instruction selection for them is
a lot more convenient using algebraic patterns than to have to manually
match for them. This patch add NIR opcodes for these instructions.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32181>
SPIR-V strengthened the semantics around signed zero, requiring fmin(-0, +0) =
-0. Since nir_op_fmin is commutative, we must also require fmin(+0, -0) = -0 to
match, although it's unclear if SPIR-V requires that. We must strengthen NIR's
definitions accordingly.
This strengthening is additionally motivated by the existing nir_opt_algebraic
rule like:
(('fmin', a, ('fneg', a)), ('fneg', ('fabs', a))),
With the strengthened new definition, this transform is clearly exact. With the
weaker definition, the transform could change the sign of zero based on
implementation-defined behaviours which ... while, not exactly unsound, is
undesireable semantically.
...
This is probably technically a bug fix, but I'm not convinced it's worth it's
weight in backporting.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30075>
Undefined Behaviour Sanitizer (UBSan) detected the following when
running testing `dEQP-VK.graphicsfuzz.cov-fold-negate-min-int-value`:
`negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself`
SPIR-V spec states that OpSNegate(0x80000000) has to return 0x80000000;
in our case, -2147483648 should be -2147483648.
While this is not causing any issue because compilers seem to be
behaving like that, it is still undefined behaviour, so it expects to be
this handled explicitly, which is the purpose of this commit.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29772>
The constant-folding definition and comments say that it takes the high
16 bits of the first source and low 16 bits of the second source, but
actually it's the opposite. The algebraic optimization, which actually
happens and needs to be correct, was correct but the comment above it
was wrong.
Note that in the way we use it when lowering multiplications, the
ordering doesn't matter.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22075>