Commit Graph

7563 Commits

Author SHA1 Message Date
Qiang Yu 49cfbe1fed nir/xfb_info: nir_gather_xfb_info_from_intrinsics update nir xfb_info
Use this function to update nir_shader->xfb_info.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19489>
2023-01-18 05:30:14 +00:00
Gert Wollny 2e05cfa179 nir: Add range_base to atomic_counter and an option to use it
Some drivers may encode constant offsets in the instruction, so
make it possible for the drivers to request lowering the atomic
uniform offset into the range_base variable of the intrinsic.

v2: drop patch to use build-in array offset evaluation, it makes
    problems with zink, and update the code accordingly
v3: always initialize range base

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19980>
2023-01-17 13:19:04 +00:00
Gert Wollny c4cde91c1b nir: Add possibility to store image var offset in range_base
Add the intrinsic range_base value to the image intrinsics and add
the option to store the image array offset into range_base instead
of adding it to the image array index if the driver requests it.

v2: Always initialize range_base

v3: fix for bindless intrinsics

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19980>
2023-01-17 13:19:04 +00:00
Alyssa Rosenzweig c3839bd540 nir: Optimize vendored sin/cos the same way
As we've done for the AMD one, to prevent any codegen regression from switching
the Midgard lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>
2023-01-16 22:20:43 +00:00
Alyssa Rosenzweig a49ba0f1ae nir: Add Midgard-specific fsin/fcos ops
NIR has a fsin instruction that takes an argument in radians. Midgard instead
has an fsinpi argument that takes an argument in multiples of pi. So, we had a
NIR pass that would change fsin(x) to fsin(x / pi) and then map fsin to fsinpi
in the backend.

But that's invalid! In NIR, the opcode fsin is well-defined. fsin(x) means
something very different than fsin(x / pi). They won't usually be equal. The
transform fsin(x) -> fsin(x / pi) is fundamentally unsound.

It did work before, by accident. Most NIR passes don't care about the semantics
of ALU instructions.  fsin(x) and fsin(x / pi) are both well-defined but
fundamentally different NIR shaders. So while rewriting is wrong -- the NIR we
get out is not equivalent to the NIR we put in, and the Midgard ops we generate
are not equivalent to the NIR -- but if we don't run any passes that care about
the definition of fsin the two wrongs will cancel out to make a right.

However, some NIR passes do care about the definitions of ALU instructions,
instead of treating them as named black boxes. In particular, constant folding
(nir_opt_constant_fold) evaluates ALU instructions when their inputs are
constants, according to the definition in nir_opcodes.py. So our little charade
will only work if we don't call nir_opt_constant_fold, or if all the fsin
instructions have non-constant inputs. At the beginning of this series, that is
the case. With the later scalarization change, that's no longer the case, and
the unsoundness translates to real failing tests rather than a quibble of NIR's
semantics.

To mitigate, we define a new NIR opcode with the semantics we want and translate
fsin(x) = fsin_mdg(x / pi), where that equivalence does hold mathematically. So
the new translation is sound and doesn't rely on lucky pass ordering.

This matches the approach already used for AMD and AGX, which have fsin_amd and
fsin_agx opcodes respectively.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>
2023-01-16 22:20:43 +00:00
Jason Ekstrand b39958a3a1 anv,nir: Move the ANV YCbCr lowering pass to common code
Nir changes:
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>

Anv changes:

Acked-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19950>
2023-01-16 14:10:21 +00:00
Jason Ekstrand f02a11e4e4 nir: Add copyright and include guards to nir_vulkan.h
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19950>
2023-01-16 14:10:21 +00:00
Gert Wollny 0ca325cc10 glsl/nir: only set uses_sample_shading when the output is a fbfetch
Constructs like

  out vec4 fs_out;
  ....
  fs_out = vec4(...);
  if (fs_out.w < alpha_test_value)
     discard;

lead to initial nir that reads from fs_out, even though we don't actually
do a framebuffer fetch, and later nir passes will eliminate that direct
read from the output variable. As given in the commit message of 1124bee4
we are actually only interested in the framebuffer fetch, so set the
property only when an output is used for fbfetch reads.

v2: Iterate over all variables (Jason)

Fixes: commit 1124bee4ba
  glsl/nir: Set sample_shading if a FS output ever shows up as an rvalue

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20694>
2023-01-15 22:04:15 +00:00
Jason Ekstrand 433fe592ac nir/builder: Add some texture helpers
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19480>
2023-01-13 20:25:01 +00:00
Jason Ekstrand 30f3fec380 nir: Add more opcodes to nir_tex_instr_is_query()
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19480>
2023-01-13 20:25:01 +00:00
t0b3 267dd1f4d5 nir/nir_opt_move: fix ALWAYS_INLINE compiler error
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Closes: #6825
Fixes: f1d20ec6 ("nir/nir_opt_move: handle non-SSA defs ")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17439>
2023-01-13 14:23:35 +00:00
Alyssa Rosenzweig f4b3201244 nir/peephole_select: Allow load_preamble
load_preamble is intended to be almost free (costing at most a move), and it
does not have special bounds checking requirement, so it's ok to select with it.
With this, drivers that use nir_opt_preamble together with a late call to
peephole_select can optimize sequences like:

   if (x) {
      <uniform-on-uniform calculation>
   } else {
      <different uniform-on-uniform calculation>
   }

to simply

   bcsel(x, <uniform register 0>, <uniform register 1>)

rather than emitting needless control flow / branching over some moves.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20597>
2023-01-13 00:43:04 +00:00
Pavel Ondračka 3fcdd9e4a7 nir/lower_bool: ntt: Generate a good opcode for bcsel
This is heavily copy-pasted from a patch of Ian Romanick, including the
commit message.

Previously, this pass always generated fcsel for bcsel.  This was the
only place that generate fcsel, so various drivers assumed (and needed!)
that src0 was a Boolean with 0.0 or 1.0 as the only values.

Specifically, many DX9 / GL_ARB_vertex_program platforms lack a CMP
instruction in vertex shaders.  In those cases, they would use LRP to
implement fcsel.  The bummer is that many plaforms have a real fcsel
instruction, and those platforms would benefit from other places
generating that opcode.

Instead of leaving assumptions in drivers about the sources of an opcode
that they can't really support, allow them to control the way the
lowering pass translates bcsel.  Two flags are used to control this:

- If the driver sets has_fused_comp_and_csel in nir_options, fcsel_gt
  will be used.  Since the Boolean value is 0.0 or 1.0, this is
  equivalent to fcsel.

- If the parameter has_fcsel_ne is set, fcsel will be used.  This is the
  old path.

- Otherwise, the lowering pass assumes we're on a crufty, old DX9 vertex
  program, and it emits flrp.

With this, the assumptions about src0 of fcsel in NTT can be removed.
If a platform can't handle fcsel, it should ensure that the lowering
pass won't generate it.

No change in shader-db.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20162>
2023-01-12 23:01:05 +00:00
Ian Romanick 70b25d9fe8 nir/lower_int_to_float: Add support for i32csel opcodes
These lower naturally to the corresponding fcsel opcodes.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20162>
2023-01-12 23:01:05 +00:00
Alyssa Rosenzweig 2976548e4a nir/gather_info: Handle store_zs_agx
This acts as a depth/stencil write. The AGX compiler checks outputs_written to
determine what conservative depth settings the driver needs. Nominally, this
should work: the original store_output(FRAG_RESULT_DEPTH) intrinsic causes the
DEPTH outputs_written bit to be set, so the metadata is still correct after
lowering store_output to store_zs_agx. However, there are a handful of places
that call nir_gather_info late, which *resets* the existing outputs_written
value and regathers, causing Asahi to use the wrong conservative depth settings
when shuffling NIR pass order and breaking gl_FragDepth.

To fix, handle store_zs_agx conservatively when gathering info so we don't have
to play games with the pass order or stashing info in a sideband.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20563>
2023-01-11 21:14:20 +00:00
Alyssa Rosenzweig cc5ca8164d nir: Add store_agx intrinsic
This works like store_global, but lets us optimize address arithmetic. Like
load_agx, it is formatted to match the hardware semantic. We don't make use of
any clever formats in this series, though.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20558>
2023-01-11 20:36:51 +00:00
Rob Clark 5fb0992a53 mesa/st: Track complete access qualifier for images
Don't turn gl_access_qualifier coming from NIR back into GL enums,
losing information in the process.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20612>
2023-01-11 20:09:01 +00:00
Jesse Natalie b0f3a387c9 nir_lower_fragcoord_wtrans: Support Vulkan shaders
In Vulkan shaders, you might not have all derefs pointing to a variable

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20400>
2023-01-10 04:25:26 +00:00
Timothy Arceri ac5af6c06d util/driconf: add Dune: Spice Wars workaround
As per the bug report the game does not correctly handle a uniform
index of -1 being returned for the unused array element, which
results in rendering issues. So here we skip the uniform array
resizing optimisation.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6397
Cc: mesa-stable

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20579>
2023-01-10 03:53:19 +00:00
Mary d8e5714e81 isaspec: Fix bitmask conversions when isa.bitsize < 64
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20541>
2023-01-07 00:14:10 +01:00
Jason Ekstrand 39c6f6454c isaspec: Give decode.c/h more descriptive names
Because these are being included across subdir boundaries, the name
"decode" is potentially pretty overloaded.  Instead, prefix them with
"isaspec_".  Also, since they're both weird includes now and not really
complete files in their own right, give them a descriptive suffix.

Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20525>
2023-01-05 18:21:02 +00:00
Jason Ekstrand e8945a8ce6 isaspec: Stop depending on glue headers and out-of-folder C files
The way the isaspec decoder used to work was that it would generate a
header and a C file, each with ISA-specific stuff in it. Then that would
get built together with a stand-alone decode.c file which lives in the
isaspec folder, not the driver's folder.  In order for decode.c to find
the ISA-specific headers, it would also generate a glue header which had
to be named isaspec-isa.h.  This effectively meant that you can't have
multiple isaspec definitions in the same folder.

To solve this, we make do it the other way around and make the generated
header and C files include the stand-alone files.  This is a bit awkward
because it means including a C file from another C file but it's better
for the build system.

Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20525>
2023-01-05 18:21:02 +00:00
Jason Ekstrand 4953a8db25 isaspec: Use argparse
This also cleans up some of our python script execution conventions and
handles mako errors better. Copied a bit from vk_entrypoints_gen.py.

Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20525>
2023-01-05 18:21:02 +00:00
Jason Ekstrand e83ad77ef5 isaspec: Stop using s and xml from the global namespace
We really shouldn't rely on these being global variables.  Pass them
along instead.

Acked-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20525>
2023-01-05 18:21:02 +00:00
Pavel Ondračka a93bc6afc4 nir: check for x - ffract(x) patterns when lowering f2i32
We already skip emitting ftrunc in nir_lower_int_to_float when there is
ffloor, fround or any other integer-making opcode preceding f2i32. However
if lower_ffloor is set for driver that doesn't support integers, the lowered
x - ffract(x) patterns would not be recognized and extra ftruct would be
emitted, doing unnecessary rounding.

This optimization only works if there is no non-trivial swizzling used for
the fadd, fneg and ffract involved, which seems to be 99% of the cases according
to my testing.

This is needed to enable nir ffloor lowering on r300 driver without regressions.

I'm not sure if this helps anybody else, the only hardware which sets
lower_ffloor and converts ints to floats (and can't do trunc) are some old
etnaviv cards, so maybe it will help there a bit.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20208>
2023-01-05 12:01:32 +00:00
Qiang Yu cf2ea3fce9 nir/xfb: save high_16bits output info
It is combined with slot location to identify a varying
when using VARYING_SLOT_VARx_16BIT.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20157>
2023-01-05 01:12:06 +00:00
Ian Romanick 043508d8f8 glsl: Remove bit_count lowering
As far as I can tell, every driver that supports GLSL 1.30 or
GL_EXT_gpu_shader4 (and therefore also enables support for
GL_MESA_shader_integer_functions) also sets the NIR lower_bit_count
flag.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
2023-01-03 18:37:53 -08:00
Ian Romanick abe5acf7fd glsl: Remove bitfield_reverse lowering
As far as I can tell, every driver that supports GLSL 1.30 or
GL_EXT_gpu_shader4 (and therefore also enables support for
GL_MESA_shader_integer_functions) also sets the NIR
lower_bitfield_reverse flag.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
2023-01-03 18:37:53 -08:00
Ian Romanick f5722c4973 glsl: Remove bitfield_extract and bitfield_insert lowering
As far as I can tell, every driver that supports GLSL 1.30 or
GL_EXT_gpu_shader4 (and therefore also enables support for
GL_MESA_shader_integer_functions) also sets some subset of the various
NIR lower_bitfield_extract and lower_bitfield_insert flags.

v2: Declaration of 'result' still needs to be added to the IR. Noticed
by marge.

v3: Fix 'git rebase --autosquash' putting the v2 fix in the wrong
place. I've never seen that happen before. :(

Reviewed-by: Emma Anholt <emma@anholt.net> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
2023-01-03 18:37:53 -08:00
Ian Romanick db241fbd70 nir: Don't allow conflicting bitfield lowering passes
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20323>
2023-01-03 18:37:53 -08:00
Pavel Ondračka 53d9b696e4 nir: basic tests for nir_opt_shrink_vectors
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20213>
2023-01-03 12:32:33 +01:00
Pavel Ondračka 3305c9602d nir: fix shrinking of load_const for large vectors
Specifically when shrinking load_const with number of components
> 5, if the final number of components is not allowed (for example 8->6)
it would report false for progress even if we actually did some
reshuffling and also it would skip on the rewrite of the readers.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20213>
2023-01-03 12:32:33 +01:00
Pavel Ondračka cb7f201288 nir: remove duplicate alu channels in nir_opt_shrink_vectors
This will clean code like:
   vec3 32 ssa_8 = frcp ssa_7.www
   vec3 32 ssa_9 = fmul ssa_7.xyz, ssa_8
into
   vec1 32 ssa_8 = frcp ssa_7.w
   vec3 32 ssa_9 = fmul ssa_7.xyz, ssa_8.xxx

This helps r300 driver because we can only do single channel for math
ops at a time, so the first version would result in three frcp
instructions. The nir_opt_shrink_vectors comments even claim the pass
should be doing this, however it actually does it only for nir_op_vecx
instructions, so extend this for generic alu instructions.

RV530 shader-db:
total instructions in shared programs: 135032 -> 133707 (-0.98%)
instructions in affected programs: 46121 -> 44796 (-2.87%)
helped: 452
HURT: 26
total temps in shared programs: 17051 -> 17033 (-0.11%)
temps in affected programs: 1509 -> 1491 (-1.19%)
helped: 91
HURT: 30

12.02->12.08 (+0.5%) fps gain in Unigine Sanctuary (n=5) with RV530

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7051
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reiewed-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20213>
2023-01-03 12:32:33 +01:00
Christian Gmeiner 9e56f69edf isaspec: encode: handle special fieldname properties
Without this change a fieldname like '{DST::align=12}' was not used for
encoding. Change the regex to include such fieldnames and remove the fieldname
property in a later step.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20462>
2022-12-31 13:43:15 +00:00
Danylo Piliaiev 1c9ee30838 nir/fold_16bit_tex_image: Add type granularity for dst folding
Some HW may be able to fold only some of dst types, e.g.
for Adreno folding i32 -> i16 could cause a different result since
folded variant clamps the result instead of masking it.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20396>
2022-12-23 15:48:18 +01:00
Lionel Landwerlin 3af08b9c30 nir/divergence: handle shader_record_ptr intrinsic
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes 6b8fd65e84 ("spirv: Implement the new ray-tracing storage classes")

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20413>
2022-12-23 09:22:13 +00:00
Danylo Piliaiev 8482ad0110 nir/nir_lower_is_helper_invocation: Lower helper invocation if required
nir_lower_is_helper_invocation lowers intrinsic_is_helper_invocation
and uses load_helper_invocation (which is lowered by nir_lower_system_values).
While nir_lower_system_values may lower SYSTEM_VALUE_HELPER_INVOCATION
into intrinsic_is_helper_invocation.

So they depend on each other. Break the dependency by making
nir_lower_is_helper_invocation aware of lower_helper_invocation option
and emitting lowered load_helper_invocation when required.

Happens with SPIR-V 1.6 for which gl_HelperInvocation is translated into
"BuiltIn HelperInvocation" + "Volatile", which nir_lower_system_values
translates into is_helper_invocation.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19677>
2022-12-20 11:06:52 +00:00
Qiang Yu e85c5d8779 nir/divergence_analysis: add missing intrinsics
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Singed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18666>
2022-12-19 09:22:24 +08:00
Qiang Yu 194add2c23 nir: lower image add lower_to_fragment_mask_load_amd option
Like lower_to_fragment_fetch_amd option in lower tex,
this is for radeonsi to lower MS image ops.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18666>
2022-12-19 09:22:16 +08:00
Qiang Yu 1461b5f61b nir: add image fragment mask load intrinsic
Like nir_texop_fragment_mask_fetch_amd, this is used to load multi
sample image fmask data for AMD GPU.

We will lower multi sample image load and samples_identical intrinsics
to use it latter for radeonsi. RADV does not need this because it
always expand fmask images before dispatch compute shader.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18666>
2022-12-19 09:22:11 +08:00
Alyssa Rosenzweig 71e2028ce3 nir: Add store_zs_agx intrinsic
Will be used for frag depth/stencil export with multisampling.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
2022-12-17 18:10:28 +00:00
Yonggang Luo 36ba2e31f6 glsl: fixes -Werror,-Wunused-but-set-variable for clang-15 in glcpp-parse.y and glsl_parser.yy
error messages:
src/compiler/glsl/glcpp/glcpp-parse.c:1691:9: error: variable 'glcpp_parser_nerrs' set but not used [-Werror,-Wunused-but-set-variable]
    int yynerrs = 0;
        ^

src/compiler/glsl/glsl_parser.cpp:2370:9: error: variable '_mesa_glsl_nerrs' set but not used [-Werror,-Wunused-but-set-variable]
    int yynerrs = 0;
        ^

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19875>
2022-12-16 19:02:17 +00:00
Yonggang Luo 113def3bbd glsl: Fixes indent issue after replace tab with 3 space by tools in glcpp-parse.y
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19875>
2022-12-16 19:02:17 +00:00
Yonggang Luo 3261a54c79 glsl: replace tab with 3 space in glcpp-parse.y
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19875>
2022-12-16 19:02:17 +00:00
Yonggang Luo c5a4520b3c glsl: Fixes ident issue in glsl_parser.yy and update editorconfig for it
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19875>
2022-12-16 19:02:17 +00:00
Karol Herbst 6d6c6caff1 nir_lower_io_to_scalar: handle load/store_global
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20106>
2022-12-16 08:02:32 +00:00
Karol Herbst 3cd641bebd nir_lower_io_to_scalar: make use of nir_get_io_offset_src
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20106>
2022-12-16 08:02:32 +00:00
Kenneth Graunke 0521027182 nir: Allow more than just ALU instructions in 'weak' GVN
This removes the ALU-only restriction on the "weak" GVN introduced by
the previous commit.  This makes it slightly more aggressive, allowing
it to coalesce things like UBO loads (still within sister then/else
blocks).  This also can have surprisingly large cascading effects.

I was concerned that this might increase register pressure, but
shader-db and fossil-db show effectively no change in spills/fills,
so it seems to be fine.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19823>
2022-12-14 20:56:55 +00:00
Kenneth Graunke d5d03a7273 nir: Perform 'weak' global value numbering in all GCM passes
Full global value numbering (GVN) can be pretty aggressive, moving
values far away from their original locations, even out of loops,
and can extend their live ranges a lot.  So we've left it disabled.

This patch introduces a weaker form of GVN: we only allow coalescing
identical values when they appear on either side of the same if/else
construct.  For now, we also only allow ALU instructions.

This allows nir_opt_gcm to clean up identical instructions appearing
on both sides of if/then/else control flow.  But it avoids aggressively
combining every other occurrence of a value in the program.

This can still have surprisingly large cascading effects, as simple
constructs are cleaned up, leading to more opportunities to do the
same clean up, up a chain of nested ifs.  It also enables greater use
of the select peephole as ifs are cleaned up.

shader-db and fossil-db results show a reduction in spills/fills on
Icelake, so it doesn't seem to be hurting register pressure.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19823>
2022-12-14 20:56:55 +00:00
Samuel Pitoiset 877c10efd1 spirv: add support for AMD_shader_early_and_late_fragment_tests
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19738>
2022-12-14 08:16:27 +00:00