NIR doesn't really support global instructions such as global val
initilisation. So here we add functionality to glsl_to_nir() to
put these instructions into a temporary function that will be
later inlined into main.
We give the function a name starting with gl_mesa_tmp_ as functions
starting with gl_ are reserved and will not have any clashes with
user functions, we finish the name with the blake3 of the shader
source to avoid conflicts with multiple shaders attached to a single
stage.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
Inspired by a commit message in !30934, I set about optimizing the code
generated for nir_copysign. It would be possible to just implement an
opt_algebraic pattern for the specific values used by nir_copysign, but
this casts a slightly larger net.
As noted in a comment in the code, there may be variations of the
pattern that this pass misses. The opt_algebraic pattern would miss them
too.
v2: Use nir_def_replace. Suggested by Alyssa. Allow more "root"
instruction types. Suggested by Georg.
v3: Treat extract_u16(x, 0) as (x & 0x0000ffff), and treat extract_u8(x,
0) as (x & 0x000000ff).
v4: Use nir_scalar. Suggested by Georg.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>
ir3's lowering of variables to scratch memory has to treat 8-bit values as
16-bit ones when comparing such value's size against the given threshold
since those values are handled through 16-bit half-registers. But those
values can still use natural 8-bit size and alignment for storing inside
scratch memory.
nir_lower_vars_to_scratch now accepts two size-and-alignment functions,
one used for calculating the variable size and the other for calculating
the size and alignment needed for storing inside scratch memory. Non-ir3
uses of this pass can just duplicate the currently-used function. ir3
provides a separate variable-size function that special-cases 8-bit types.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>
Backward inter-shader code motion can move any code into the previous
shader if it only uses convergent inputs. The problem is the final input
type can end up being integer or FP64, which is incompatible with
the assumption that convergent inputs can always be interpolated.
If such a case occurs and the type is integer or FP64, either don't
do any code motion, or if the driver exposes the new flag, rewrite
convergent loads to use load_input.
If the new flag is supported, all convergent loads are rewritten to use
load_input, and flat varyings are allowed to be classified as convergent,
which means they are packed into interpolated vec4 slots if there are
unused components.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29895>
"Rewrite and remove" is a super common idiom in NIR passes. Let's add a helper
to make it more ergonomic.
More the point, I expect that /most/ of the time when a pass rewrites uses, they
also want to remove the parent instruction. The principle reason not to is
because it takes extra effort to add in the nir_instr_remove and nir_opt_dce
will clean up after you eventually, right? From a compile time perspective, it's
better to remove earlier to reduce the redundant processing between the pass and
the next DCE run. So ... we want to be doing *more* removes. From a UX
perspective - the way to nudge devs towards that is to make the
preferred "rewrite-and-remove" pattern more ergonomic than the "rewrite but
keep". That justifies the simple "replace" name rather than something silly like
"rewrite_uses_and_remove".
---
Something else I've wanted for a while.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29817>
Most passes want to preserve this specific combination of metadata, so let's add
an alias for the combination. The alias communicates that the control flow graph
is preserved, rather than a particular statement about e.g. dominance
preservation.
You don't need to understand dominance to write a simple
nir_shader_instructions_pass. And since you were going to cargo cult the
metadata anyway, this way you'll cargo cult a version you're more likely to
understand.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29745>
The semantics of discard differ between GLSL and HLSL and
their various implementations. Subsequently, numerous application
bugs occurred and SPV_EXT_demote_to_helper_invocation was written
in order to clarify the behavior. In NIR, we now have 3 different
intrinsics for 2 things, and while demote and terminate have clear
semantics, discard still doesn't and can mean either of the two.
This patch entirely removes nir_intrinsic_discard and
nir_intrinsic_discard_if and replaces all occurences either with
nir_intrinsic_terminate{_if} or nir_intrinsic_demote{_if} in the
case that the NIR option 'discard_is_demote' is being set.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>
This value is intended to be used to remove out of bounds array
access when unrolling loops so it should contain the comparison
that contains the the induction variable not the overall
condition of the loop terminator. So here we update the instruction
when dealing with iand/ior loop terminator conditions.
Acked-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28998>
v2: Add some comments explaining some of the nuance of the shift
optimizations. Fix a bug in the shift count calculation of the upper
32-bits. Move the @64 from the variable to the opcode. All suggested
by Jordan.
No shader-db changes on any Intel platform.
fossil-db:
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 154507026 -> 154506576 (-0.00%)
Cycle count: 17436298868 -> 17436295016 (-0.00%)
Max live registers: 32635309 -> 32635297 (-0.00%)
Totals from 42 (0.01% of 632575) affected shaders:
Instrs: 5616 -> 5166 (-8.01%)
Cycle count: 133680 -> 129828 (-2.88%)
Max live registers: 1158 -> 1146 (-1.04%)
No fossil-db changes on any other Intel platform.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>
This allows us to not generate 64-bit iadd3 on Intel but continue
generating it for NVIDIA.
No shader-db or fossil-db changes.
v2: Add nir_lower_iadd3_64 flag so we can continue to generate 64-bit
iadd3 on NVIDIA platforms.
v3: s/bit_size == 64/s == 64/. This cut-and-paste bug prevented any of
the optimizations from ever occuring.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>
New pass that shares code with nir_opt_load_store_vectorize but
it only updates the alignment of load/store instructions.
It is useful before running other passes which may
potentially destroy that information (eg. by removing some
instructions from which the alignment may be deduced).
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29210>
AGX has limited border colour hardware. To support full
customBorderColorWithoutFormat semantics, we're forced to emulate in shaders at
a substantial performance penalty. Actually, that's needed just to pass CTS
because of other hardware issues stacking on top of each others... Hooray!
Add the texops we need to facilitate efficient custom border colour lowering.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29179>