Commit Graph

6371 Commits

Author SHA1 Message Date
Timur Kristóf
33630090a2 nir: Add comment to explain the sad_u8x4 opcode.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12649>
2021-09-01 08:42:03 +00:00
Emma Anholt
33182c555f nir/nir_lower_uniforms_to_ubo: Set the explicit stride of the UBO 0 uniform.
Normal UBOs have explicit strides on them, make our lowered one behave the
same.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12175>
2021-08-31 20:12:16 +00:00
Emma Anholt
01759d3fb2 nir: Set .driver_location for GLSL UBO/SSBOs when we lower to block indices.
Without this, there's no way to match the UBO nir_variable declarations to
the load_ubo intrinsics referencing their data.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12175>
2021-08-31 20:12:16 +00:00
Timur Kristóf
548b383310 nir: Fix local_invocation_index upper bound for non-compute-like stages.
The lowered LS and NGG stages use local_invocation_index and they
can benefit from the unsigned upper bound because they can emit a
less expensive integer multiplication instruction.
This was working in the past, but accidentally borked by a refactor.

Fossil DB changes on Sienna Cichlid:

Totals from 956 (0.74% of 128647) affected shaders:
CodeSize: 2354172 -> 2344712 (-0.40%)
Instrs: 434359 -> 434327 (-0.01%)
Latency: 1883949 -> 1876814 (-0.38%)
InvThroughput: 762638 -> 757405 (-0.69%)

Fossil DB changes on Sienna Cichlid (with NGGC enabled):

Totals from 57873 (44.99% of 128647) affected shaders:
CodeSize: 155844192 -> 155607064 (-0.15%)
Instrs: 29799184 -> 29799152 (-0.00%)
Latency: 130959764 -> 130814224 (-0.11%); split: -0.11%, +0.00%
InvThroughput: 21100300 -> 20928635 (-0.81%); split: -0.81%, +0.00%

Fixes: 8af6766062
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>
2021-08-30 14:05:33 +00:00
Timur Kristóf
a25fd1787a nir: Add unsigned upper bound for extract opcodes.
This helps with some cases of extract, such as:
- Emitting more optimal integer multiplications
- Better address calculation
- Possibly others

Fossil DB results on Sienna Cichlid:

Totals from 4064 (3.16% of 128647) affected shaders:
VGPRs: 262040 -> 262032 (-0.00%)
CodeSize: 28856648 -> 28811892 (-0.16%); split: -0.18%, +0.02%
Instrs: 5370279 -> 5367827 (-0.05%); split: -0.08%, +0.04%
Latency: 74230112 -> 74016671 (-0.29%); split: -0.29%, +0.01%
InvThroughput: 12082532 -> 12036365 (-0.38%); split: -0.39%, +0.01%
VClause: 108506 -> 108721 (+0.20%); split: -0.03%, +0.22%
SClause: 217731 -> 216602 (-0.52%); split: -0.67%, +0.15%
Copies: 265689 -> 270811 (+1.93%); split: -0.26%, +2.19%
PreSGPRs: 201982 -> 204907 (+1.45%); split: -0.01%, +1.46%
PreVGPRs: 236099 -> 236079 (-0.01%)

Fossil DB results on Sienna Cichlid with NGGC enabled:

Totals from 60375 (46.93% of 128647) affected shaders:
VGPRs: 2212576 -> 2212568 (-0.00%)
CodeSize: 180870420 -> 179684816 (-0.66%); split: -0.66%, +0.00%
Instrs: 34386715 -> 34213682 (-0.50%); split: -0.51%, +0.01%
Latency: 199676290 -> 198987998 (-0.34%); split: -0.35%, +0.00%
InvThroughput: 32288299 -> 31736433 (-1.71%); split: -1.71%, +0.00%
VClause: 621521 -> 621743 (+0.04%); split: -0.00%, +0.04%
SClause: 900447 -> 899392 (-0.12%); split: -0.16%, +0.04%
Copies: 3439529 -> 3445305 (+0.17%); split: -0.02%, +0.19%
PreSGPRs: 2216297 -> 2219220 (+0.13%); split: -0.00%, +0.13%
PreVGPRs: 1842887 -> 1842867 (-0.00%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>
2021-08-30 14:05:33 +00:00
Caio Marcelo de Oliveira Filho
b34f9740ca spirv: Implement non-Multiview parts of SPV_NV_mesh_shader
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:43 +00:00
Caio Marcelo de Oliveira Filho
10a03e30cf nir: Allow Task/Mesh to lower compute system values
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:43 +00:00
Caio Marcelo de Oliveira Filho
4f52681a2d nir: Don't lower Task/Mesh I/O to temporaries
These won't work since a workgroup can span more than one thread, and
the temporaries are not shared memory.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:43 +00:00
Caio Marcelo de Oliveira Filho
27697d5eb8 nir/divergence_analysis: Handle Task/Mesh shaders
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00
Caio Marcelo de Oliveira Filho
bf5f6add01 nir/lower_io: Identify Mesh output as arrayed
Mesh shader outputs are either:

- non-array builtins
- array builtins that are either per-primitive or per-vertex
- user-defined outputs that must be either per-primitive or per-vertex

So we can identify any array output as "arrayed" for the purposes of
I/O lowering.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00
Caio Marcelo de Oliveira Filho
9631d24c3f compiler: Add Task/Mesh to shader_info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00
Caio Marcelo de Oliveira Filho
813d41829d compiler: Add new non-Multiview Task/Mesh builtins
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00
Caio Marcelo de Oliveira Filho
cd394017c8 nir: Add per-primitive I/O intrinsics
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00
Caio Marcelo de Oliveira Filho
f95daad3a2 nir: Add a way to identify per-primitive variables
Per-primitive is similar to per-vertex attributes, but applies to all
fragments of the primitive without any interpolation involved.

Because they are regular input and outputs, keep track in shader_info
of which I/O is per-primitive so we can distinguish them after deref
lowering.  These fields can be used combined with the regular
`inputs_read`, `outputs_written` and `outputs_read`.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00
Caio Marcelo de Oliveira Filho
927584fa67 nir: Update documentation for location to mention Task/Mesh
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10600>
2021-08-28 03:56:42 +00:00
Filip Gawin
46f3582c6f nir: fix ifind_msb_rev by using appropriate type
As you can see comparion "x < 0" doesn't make
sense if x is unsigned.

Fixes: a5747f8a ("nir: add opcodes for *find_msb_rev and lowering ")

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12548>
2021-08-26 18:35:31 +00:00
Filip Gawin
9083e9a483 nir: fix shadowed variable in nir_lower_bit_size.c
Fixes: 6d79298992 ("nir/lower_bit_size: fix lowering of {imul,umul}_high")

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12527>
2021-08-26 18:04:22 +00:00
Ilia Mirkin
18962b94d3 glsl: fix explicit-location ifc matching in presence of array types
We were treating each field as if it took up a single slot. However
that's not the case. And with strict matching (GLSL 4.20+ / ES 3.1+) we
would end up not matching identical interfaces.

Fixes: c4545676d7 ("glsl/linker: fix location aliasing checks for interface variables")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12479>
2021-08-26 04:07:29 +00:00
Lionel Landwerlin
a13e79843e nir: prevent peephole from generating invalid NIR
We can't append instructions following a return/halt instruction
because the control flow helpers will modify the successor of the
block containing the return/halt. And the NIR validator enforces that
the return/halt must have the end of the function as successor.

This tends to happen following lower_shader_calls lowering which
inserts halts. This probably doesn't prevent the optimization, it'll
just happen in one of the return shaders after the halt has been
removed.

v2: Move prev block ending check earlier in the function (Daniel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12506>
2021-08-25 11:38:21 +00:00
Samuel Pitoiset
cff106c4b6 nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(fabs(b), -a)
and fmin(-fmax(b, a)) to fmin(-fabs(b), -a).

fossils-db (Sienna Cichlid):
Totals from 34 (0.02% of 150170) affected shaders:
CodeSize: 388540 -> 387748 (-0.20%)
Instrs: 74621 -> 74423 (-0.27%)
Latency: 1039407 -> 1039011 (-0.04%)
InvThroughput: 208364 -> 208150 (-0.10%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12519>
2021-08-25 07:18:24 +02:00
Ian Romanick
fe956d0182 spirv: Add support for SPV_KHR_integer_dot_product
v2 (Ivan): Add missing capability enum handling.

v3 (idr): Properly handle cases where dest_size != 32.

v4 (idr): Rewrite most of the error checking to use vtn_fail_if.  Use
nir_ssa_def with vtn_push_nir_ssa instead of vtn_ssa_value with
vtn_push_ssa_value.  All suggested by Jason.  Massive rewrite of the
handling of packed 4x8 saturating opcodes.  Based on some observations
made by Jason.

v5 (idr): Remove some debugging cruft accidentally added in v4.  Noticed
by Jason.

v6: Emit packed versions of vectored instructions when possible.
Suggested by Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick
652d304ee9 spirv: Update headers and metadata from latest Khronos commit
This corresponds to e7b49d7 ("Implement SPV_INTEL_optnone extension
(#230)") in https://github.com/KhronosGroup/SPIRV-Headers.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick
a6db40605e nir/algebraic: Add some extract optimizations
These help quite a bit when vectored versions of SpvOpSDotKHR and
friends are emitted as packed versions and then lowered.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick
839495efc6 nir/algebraic: Add lowering for dot_4x8 instructions
v2: Fix copy-and-paste bugs in lowering patterns.

v3: Add has_sudot_4x8 flag.  Requested by Rhys.

v4: Since the names of the opcodes changed from dp4 to dot_4x8, also
change the names of the lowering helpers.  Suggested by Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick
806cd2341c nir/algebraic: Basic patterns for dot_4x8
v2: Add and modify patterns to let constant folding do better.

v3: Remove '(is_not_zero)' from the patterns that try to combine
addends.  I honestly don't know why I had it there in the first place,
and nothing in my deep git logs could help clue me in.  Noticed by
Alyssa.  Remover patterns that detect open-coded udot_4x8.  Suggested by
Alyssa and Jason.  Add missing sudot_4x8 patterns.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick
6c18a3b497 nir/opcodes: Add integer dot-product opcodes
Six opcodes are added: sdot_4x8_iadd, udot_4x8_uadd, sudot_4x8_iadd,
sdot_4x8_iadd_sat, udot_4x8_uadd_sate, and sudot_4x8_iadd_sat.  These
represent the combinations of integer dot-product and add that operate
on packed source vectors.  That is, the four 8-bit values for each
vector is stored in a single 32-bit integer.

Some hardware may prefer to operate on unpacked byte vectors.  When such
hardware comes to Mesa, we'll have to figure out how to name things.

v2: Add nir_op_iudp4a and nir_op_iudp4a_sat instructions.  These opcodes
are not 2-source commutative.

v3: Rename all opcodes to be more like some existing 4x8 opcodes.
Suggested by Timur.  Change type of packed vector sources to uint32,
change types of constant folding variables to have explicit size, and
delete some extra casts.  All suggested by Jason.

v4: Fix typo previously noticed by Alyssa but missed in v2.

v5: Add has_sudot_4x8 flag.  Requested by Rhys.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Ian Romanick
7d8bf7c167 nir/lower_bit_size: Support add_sat and sub_sat
Without this, lowered saturating ALU instructions would only clamp to
the range of the new type instead of the range of the old type.

v2: Use nir_iclamp.  Suggested by Jason. Use new
u_{int,uint}N_{min,max}() helpers.

Fixes: 090e282407 ("nir: Add a saturated unsigned integer add opcode")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Rhys Perry
3d228b6926 nir/gcm: pin some instructions which require uniform sources
fossil-db (Sienna Cichlid, GCM enabled):
Totals from 6192 (4.12% of 150170) affected shaders:
VGPRs: 548392 -> 542040 (-1.16%)
SpillSGPRs: 3702 -> 3990 (+7.78%); split: -0.54%, +8.32%
CodeSize: 62418488 -> 62481516 (+0.10%); split: -0.07%, +0.17%
MaxWaves: 70582 -> 71718 (+1.61%)
Instrs: 11768497 -> 11795079 (+0.23%); split: -0.07%, +0.30%
Latency: 445891848 -> 523561297 (+17.42%); split: -0.07%, +17.49%
InvThroughput: 115675481 -> 121494913 (+5.03%); split: -0.09%, +5.12%
VClause: 164914 -> 164934 (+0.01%); split: -0.05%, +0.06%
SClause: 405991 -> 395302 (-2.63%); split: -2.64%, +0.00%
Copies: 907216 -> 926429 (+2.12%); split: -1.11%, +3.23%
Branches: 456373 -> 457478 (+0.24%); split: -0.13%, +0.38%
PreSGPRs: 648030 -> 642953 (-0.78%); split: -0.88%, +0.10%
PreVGPRs: 522425 -> 516355 (-1.16%); split: -1.16%, +0.00%

Seems to affect Detroit: Become Human and Cyberpunk 2077. The Cyberpunk
2077 changes look like a fixed bug. At least some of the Detroit: Become
Human changes could probably be removed with better divergence analysis.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12444>
2021-08-24 16:52:31 +00:00
Rhys Perry
884ac52eaa nir: consider push constant loads as always dynamically uniform
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12444>
2021-08-24 16:52:31 +00:00
Daniel Schürmann
2cf164feb9 nir/opt_algebraic: optimize flrp(fadd, fadd, x) only if fadd are used_once
Totals from 201 (0.13% of 150170) affected shaders: (GFX10.3)
VGPRs: 13880 -> 13856 (-0.17%)
CodeSize: 1517328 -> 1518124 (+0.05%); split: -0.04%, +0.10%
MaxWaves: 3184 -> 3192 (+0.25%)
Instrs: 285487 -> 285569 (+0.03%); split: -0.06%, +0.08%
Latency: 7774066 -> 7780877 (+0.09%); split: -0.10%, +0.19%
InvThroughput: 1936341 -> 1935287 (-0.05%); split: -0.07%, +0.02%
SClause: 11446 -> 11448 (+0.02%); split: -0.01%, +0.03%
Copies: 17500 -> 17506 (+0.03%); split: -0.51%, +0.55%
Branches: 8174 -> 8180 (+0.07%); split: -0.13%, +0.21%
PreVGPRs: 12507 -> 12427 (-0.64%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12061>
2021-08-24 16:10:30 +00:00
Daniel Schürmann
89a842b2b6 nir/loop_analyze: consider instruction cost of nir_op_flrp
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12061>
2021-08-24 16:10:30 +00:00
Rhys Perry
aeb1b4c30c nir/lower_io: use nir_vector_insert_imm()
This creates a single nir_op_vecn instead of a nir_op_vecn and several
copies.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12469>
2021-08-24 10:35:19 +00:00
Samuel Pitoiset
f4b858e746 Revert "nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(b, -a)"
This is wrong for negative values.

This reverts commit 07cd30ca29.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12515>
2021-08-24 08:58:38 +00:00
Samuel Pitoiset
07cd30ca29 nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(b, -a)
Found with Cyberpunk 2077.

fossils-db (GFX10.3):
Totals from 128 (2.34% of 5465) affected shaders:
CodeSize: 769720 -> 767656 (-0.27%); split: -0.27%, +0.00%
Instrs: 145748 -> 145229 (-0.36%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11604>
2021-08-23 17:53:38 +00:00
Timothy Arceri
02b394023b glsl: fix variable scope for instructions inside case statements
Fixes: 665d75cc5a ("glsl: Fix scoping bug in if statements.")

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5247

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12435>
2021-08-20 16:13:56 +00:00
Daniel Schürmann
59f2c85845 nir: return false for loops in contains_other_jump()
Allows to unwrap more loops.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12473>
2021-08-19 13:51:17 +00:00
Marcin Ślusarz
e3b4c77ed3 glsl: refactor code to avoid static analyzer noise
Clang analyzer thinks struct_base_offset can be used uninitialized
because it doesn't know that glsl_type_is_struct_or_ifc returns
the same value for the same type.

Refactor the code to make it clear what is going on. As a side effect
this should be faster because glsl_get_length and
glsl_type_is_struct_or_ifc will be called only once (they are not
inline functions).

This is an alternative approach to
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12399.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12403>
2021-08-19 06:59:01 +00:00
Qiang Yu
e6790d4a31 nir/inline_uniforms: support loop
Be able to inline uniforms in loop for unrolling it.
Nested loop/if is also supported.

Some example:

    for (i = 0; i < count; i++)
	...

uniform "count" will be inlined. But note this does not
make sure the loop will be unrolled (ie. count = 1000).

    for (i = 0; i < count; i++)
        for (j = init; j < 10; j++)
            if (type == 2)
                ...

uniform "count", "init" and "type" will be inlined.

It is intentional to not be too aggressive to add uniforms
to avoid false positive case while be able to support most
common usage.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
2021-08-19 02:17:35 +00:00
Qiang Yu
3c93ebbae5 nir/loop_analyze: skip unsupported induction variable early
Instead of fail in trip count calculation, just don't mark such
kind of variable as induction from the beginning.

Don't bother inline uniform to deal with such kind of variable
either.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
2021-08-19 02:17:35 +00:00
Qiang Yu
0b9639c35d nir/loop_analyze: record induction variables for each loop
For being used by uniform inline lowering pass.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
2021-08-19 02:17:35 +00:00
Qiang Yu
c86ec09d11 nir/loop_analyze: move nir_is_supported_terminator_condition() to header
To be shared with uniform inline.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
2021-08-19 02:17:35 +00:00
Qiang Yu
a406fff78a nir/inline_uniforms: support vector uniform
Collect per vector component dependency and lower vector uniform
load to scalar if any component need to be inlined.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
2021-08-19 02:17:35 +00:00
Qiang Yu
9d796b21ac nir/inline_uniforms: add uniforms in condition atomically
Unless all uniforms in the condition can be inlined we can
lower the if/loop. So we rollback added uniforms when one
of uniforms in a if condition fail to be added.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
2021-08-19 02:17:35 +00:00
Ian Romanick
f0a8a9816a nir: intel/compiler: Add and use nir_op_pack_32_4x8_split
A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO.  This results
in a lot of shifts and MOVs.  When that pattern can be recognized, the
individual 8-bit components can be packed much more efficiently.

v2: Rebase on b4369de27f ("nir/lower_packing: use
shader_instructions_pass")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Ian Romanick
89f639c0ca nir/algebraic: Remove spurious conversions from inside logic ops
Not only does this eliminate a bunch of unnecessary type converting
MOVs, but it can also enable some SWAR.  The
dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag test does
something about like:

    c = a.x ^ b.x;
    d = a.y ^ b.y;
    e = a.z ^ b.z;

After this change, it looks more like:

    uint t = i8vec3AsUint(a) ^ i8vec3AsUint(b);
    c = extract_u8(t, 0);
    d = extract_u8(t, 1);
    e = extract_u8(t, 2);

On Ice Lake, this results in:

SIMD8 shader: 41 instructions. 1 loops. 3804 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 31 instructions. 1 loops. 2844 cycles. 0:0 spills:fills, 5 sends

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Ian Romanick
a147717a93 nir/algebraic: Optimize some extract forms resulting from 8-bit lowering
This eliminates some spurious, size-converting moves.  For example, on
Ice Lake this helps dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:

SIMD8 shader: 56 instructions. 1 loops. 4444 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends

v2: Condition two of the patterns on !options->lower_extract_byte.
Suggested by Lionel.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Mike Blumenkrantz
649251ad4e nir/lower_vectorize_tess_levels: set num_components for vectorized loads
this otherwise explodes when rewriting e.g., a single array component load to a vec4

Fixes: f5adf27fb9 ("nir,radv: add and use nir_vectorize_tess_levels()")

fixes zmike/mesa#94

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12419>
2021-08-18 12:18:15 +00:00
Marcin Ślusarz
89bc8ff408 glsl/opt_algebraic: disable invalid optimization
When operators other than eq and ne are involved we can't really
move operands around and negate them because such transformation
may change the value of the whole expression.

Some examples:

For unsigned var:
0 >= 1u + var would eventually become 0xffffffff >= var,
which would always evaluate to true, when original expression
was true only for var == 0xffffffff.

For signed var:
0 >= 1 + var would become -1 >= var, which would evaluate to
false for var == 2147483647, when original expression evaluated
to true (because signed overflow is defined to wrap around in
glsl, 1 + 2147483647 == -2147483648, so 0 >= -2147483648).

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5226
Fixes: 34ec1a24d6 ("glsl: Optimize (x + y cmp 0) into (x cmp -y).")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12359>
2021-08-17 08:17:52 +00:00
Timothy Arceri
edfcc4f022 nir: fix GCM when GVN enabled
Enabling GVN uncovered a bug where we would crash if the pass
thinking about pushing something into a loop.

Fixes: 6538b3e566 ("nir: add heuristic for instructions in loops with GCM")

Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12242>
2021-08-17 03:15:49 +00:00
Rhys Perry
cfc4433015 nir,glsl_to_nir: use nir_fdot()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>
2021-08-16 17:19:45 +00:00