Commit Graph

2308 Commits

Author SHA1 Message Date
Timothy Arceri dbf016e259 nir: fix implicit fallthrough warnings
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5705>
2020-07-02 23:52:52 +00:00
Ian Romanick 8591adea38 nir/algebraic: Don't distrubte absolute-value into dot-products
Dot product is multiplication followed by addition, and absolute value
does not distribute into addition.

Only vec4 platforms are affected by this change as scalar-only platforms
never have any of the fdot_replicated instructions.  In the shader-db
results, below, shaders in MANY different applications are affected.
Trine, Doom3, Enemy Territory: Quake Wars, Counter Strike: Global
Offensive, Mad Max, Metro Last Light, and on and on...  I'm really
shocked that there were no test regressions!

All Haswell and earlier platforms had similar results. (Haswell shown)
total instructions in shared programs: 16219743 -> 16219820 (<.01%)
instructions in affected programs: 12171 -> 12248 (0.63%)
helped: 1
HURT: 78
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.78% max: 0.78% x̄: 0.78% x̃: 0.78%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.35% max: 2.38% x̄: 0.91% x̃: 1.06%
95% mean confidence interval for instructions value: 0.92 1.03
95% mean confidence interval for instructions %-change: 0.78% 1.00%
Instructions are HURT.

total cycles in shared programs: 538481383 -> 538491045 (<.01%)
cycles in affected programs: 470796 -> 480458 (2.05%)
helped: 149
HURT: 142
helped stats (abs) min: 1 max: 1338 x̄: 71.13 x̃: 4
helped stats (rel) min: 0.06% max: 40.99% x̄: 2.76% x̃: 0.67%
HURT stats (abs)   min: 1 max: 2092 x̄: 142.68 x̃: 12
HURT stats (rel)   min: 0.07% max: 55.38% x̄: 5.07% x̃: 1.07%
95% mean confidence interval for cycles value: -5.28 71.69
95% mean confidence interval for cycles %-change: -0.07% 2.19%
Inconclusive result (value mean confidence interval includes 0).

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 62795475e8 ("nir/algebraic: Distribute source modifiers into instructions")
Closes: #3129
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5581>
2020-07-02 14:05:33 -07:00
Timothy Arceri d55aa78615 nir: add missing break to nir_opt_access()
Fixes: f2d0e48ddc ("glsl/nir: Add optimization pass for access flags")

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5714>
2020-07-02 12:11:30 +10:00
Alyssa Rosenzweig 54d7907c27 nir: Propagate *2*16 conversions into vectors
If we have code like:

   ('f2f16', ('vec2', ('f2f32', 'a@16'), '#b@32'))

We would like to eliminate the conversions, but the existing rules can't
see into the the (heterogenous) vector. So instead of trying to
eliminate in one pass, we add opts to propagate the f2f16 into the
vector. Even if nothing further happens, this is often a win since then
the created vector is smaller (half2 instead of float2). Hence the above
gets transformed to

   ('vec2', ('f2f16', ('f2f32', 'a@16')), ('f2f16', '#b@32'))

Then the existing f2f16(f2f32) rule will kick in for the first component
and constant folding will for the second and we'll be left with

   ('vec2', 'a@16', '#b@16')

...eliminating all conversions.

v2: Predicate on !options->vectorize_vec2_16bit. As discussed, this
optimization helps greatly on true vector architectures (like Midgard)
but wreaks havoc on more modern SIMD-within-a-register architectures
(like Bifrost and modern AMD). So let's predicate on that.

v3: Extend for integers as well and add a comment explaining the
transforms.

Results on Midgard (unfortunately a true SIMD architecture):

total instructions in shared programs: 51359 -> 50963 (-0.77%)
instructions in affected programs: 4523 -> 4127 (-8.76%)
helped: 53
HURT: 0
helped stats (abs) min: 1 max: 86 x̄: 7.47 x̃: 6
helped stats (rel) min: 1.71% max: 28.00% x̄: 9.66% x̃: 7.34%
95% mean confidence interval for instructions value: -10.58 -4.36
95% mean confidence interval for instructions %-change: -11.45% -7.88%
Instructions are helped.

total bundles in shared programs: 25825 -> 25670 (-0.60%)
bundles in affected programs: 2057 -> 1902 (-7.54%)
helped: 53
HURT: 0
helped stats (abs) min: 1 max: 26 x̄: 2.92 x̃: 2
helped stats (rel) min: 2.86% max: 30.00% x̄: 8.64% x̃: 8.33%
95% mean confidence interval for bundles value: -3.93 -1.92
95% mean confidence interval for bundles %-change: -10.69% -6.59%
Bundles are helped.

total quadwords in shared programs: 41359 -> 41055 (-0.74%)
quadwords in affected programs: 3801 -> 3497 (-8.00%)
helped: 57
HURT: 0
helped stats (abs) min: 1 max: 57 x̄: 5.33 x̃: 4
helped stats (rel) min: 1.92% max: 21.05% x̄: 8.22% x̃: 6.67%
95% mean confidence interval for quadwords value: -7.35 -3.32
95% mean confidence interval for quadwords %-change: -9.54% -6.90%
Quadwords are helped.

total registers in shared programs: 3849 -> 3807 (-1.09%)
registers in affected programs: 167 -> 125 (-25.15%)
helped: 32
HURT: 1
helped stats (abs) min: 1 max: 3 x̄: 1.34 x̃: 1
helped stats (rel) min: 20.00% max: 50.00% x̄: 26.35% x̃: 20.00%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 16.67% max: 16.67% x̄: 16.67% x̃: 16.67%
95% mean confidence interval for registers value: -1.54 -1.00
95% mean confidence interval for registers %-change: -29.41% -20.69%
Registers are helped.

total threads in shared programs: 2471 -> 2520 (1.98%)
threads in affected programs: 49 -> 98 (100.00%)
helped: 25
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.96 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for threads value: 1.88 2.04
95% mean confidence interval for threads %-change: 100.00% 100.00%
Threads are [helped].

total spills in shared programs: 168 -> 168 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0

total fills in shared programs: 186 -> 186 (0.00%)
fills in affected programs: 0 -> 0
helped: 0
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4999>
2020-06-30 16:21:33 +00:00
Boris Brezillon cff418cc4c nir: Add new rules to optimize NOOP pack/unpack pairs
nir_load_store_vectorize_test.ssbo_load_adjacent_32_32_64_64 expectations
need to be fixed accordingly.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5589>
2020-06-29 09:18:26 +02:00
Kenneth Graunke 6a5fb31fef nir: Fix divergence analysis for tessellation input/outputs
The load_per_vertex_{input,output} intrinsics simply mean that they're
reading an arrayed input/output, which have one element per invocation.
Most accesses to those use gl_InvocationID as the subscript.  However,
it's totally possible to read any element of the array.  For example,
an evaluation shader might read gl_in[2].gl_Position, or a control
shader might read output[0].

For threads processing a single patch, an input/output load is
convergent if and only if both sources (the per-vertex-array subscript
and the offset) are convergent.  For threads processing multiple
patches, we continued to mark them divergent.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5613>
2020-06-24 03:25:10 +00:00
Jose Maria Casanova Crespo ba15bb383f nir: only uniforms with dynamically_uniform offset are dynamically_uniform
Previously all nir_intrinsic_load_uniform that were used as sources were
considered to be dynamically_uniform but when offsets of load_uniform
are indirect it can not be determined.

This fixes artefacts in Google Maps 3D view in V3D.

Fixes: 886d46b089 ("nir: Add a function to determine if a source is dynamically uniform")
Reviewed-by: Neil Roberts <nroberts@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5587>
2020-06-23 13:04:04 +00:00
Rhys Perry 9a389322c4 nir: slight correction to cube_face_coord constant folding
ACO does the division with a rcp and then a multiplication.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5547>
2020-06-22 10:28:40 +00:00
Neil Roberts ed29b576cb nir/scheduler: Add an option to specify what stages share memory for I/O
The scheduler has code to handle hardware that shares the same memory
for inputs and outputs. Seeing as the specific stages that need this is
probably hardware-dependent, this patch makes it a configurable option
instead of hard-coding it to everything but fragment shaders.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5561>
2020-06-22 08:23:06 +02:00
Neil Roberts 28e3209985 nir/schedule: Store a pointer to the scoreboard in nir_deps_state
nir_deps_state is a struct used as a closure for calculating the
dependencies. Previously it had two fields copied out of the scoreboard.
The closure is initialised in two seperate places. In order to make it
easier to access other members of the scoreboard in the callbacks in
later patches, the closure now just contains a pointer to the scoreboard
and the two bits of copied information are removed.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5561>
2020-06-22 08:23:06 +02:00
Neil Roberts df8dc30cea nir/scheduler: Handle nir_intrinsic_load_per_vertex_input
load_per_vertex_input should probably be handled in the same way as a
regular load_input. I think the nir_schedule pass was written before V3D
had geometry shader support, so that is probably why it hasn’t taken
this into account until now.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5561>
2020-06-22 08:23:06 +02:00
Karol Herbst feb83f2f82 nir/lower_images: handle dec and inc
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5480>
2020-06-18 15:15:17 +00:00
Jason Ekstrand 20b6ee82ac nir/intrinsics: Put the _intel intrinsics together at the end
All the other driver-specific intrinsics are at the end of the file so
Intel's should go there too.

Reviewed-by: Sagar Ghuge<sagar.ghuge@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5503>
2020-06-16 20:07:33 +00:00
Rob Clark 167fa2887f nir/validate: validate intr->num_components
Validate that num_components is only set for vectorized instructions, to
prevent other nir passes or driver backends from mistakenly relying on
num_components for non-vectorized instructions.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5371>
2020-06-16 02:48:18 +00:00
Rob Clark 2e5b5d9584 nir/lower-atomics-to-ssbo: don't set num_components
Of the possible intrinsics generated, only load_ssbo is vectorized (and
store_ssbo is never generated)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5371>
2020-06-16 02:48:18 +00:00
Rob Clark f70d6030e3 nir/builder: don't set intr->num_components
The "load-sysval" intrinsics are not vectorized.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5371>
2020-06-16 02:48:18 +00:00
Erik Faye-Lund e838acf37d nir: do not try to merge xfb-outputs
It's tricky to merge XFB-outputs correctly, because we need there to not
be any overlaps when we get to `nir_gather_xfb_info_with_varyings` later
on. We currently trigger an assert there if we end up merging here.

So let's not even try. This is an optimization, and we can optimize this
in safe cases later if needed. For now, let's play it safe.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5329>
2020-06-15 16:13:58 +00:00
Rob Clark 399114329b nir/print: print tex dest type
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
2020-06-11 21:59:54 +00:00
Jason Ekstrand 2b676b2ce8 nir: Properly preserve metadata in more cases
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
2020-06-11 05:08:12 +00:00
Jason Ekstrand 5e1c42d85f nir: Call nir_metadata_preserve on !progress
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
2020-06-11 05:08:12 +00:00
Jason Ekstrand b0d1f9a72f nir: Add a nir_shader_preserve_all_metadata helper
There are some passes which really work on the shader level and it's
easier if we have a helper which preserves metadata on the whole shader.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
2020-06-11 05:08:12 +00:00
Jason Ekstrand e017ee95c1 nir: Add a nir_metadata_all enum value
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
2020-06-11 05:08:12 +00:00
Icecream95 bcc8f28b1a nir: Replace the zs_output_pan intrinsic with combined_output_pan
Depth and stencil writes are combined with color writes, so we need
this intrinsic which has sources for color, RT, depth and stencil.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
2020-06-10 13:54:03 +00:00
Alyssa Rosenzweig dc8bffe999 nir: Remove nir_intrinsic_output_u8_as_fp16_pan
Now unused in favour of nir_intrinsic_load_output, happily.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5287>
2020-06-10 09:30:31 +00:00
Ben Skeggs a6c747e8e0 nir: use bitfield_insert instead of bfi in nir_lower_double_ops
NVIDIA hardware doesn't have an equivilant to bfi, but we do already have
a lowering for bitfield_insert->bfi.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5373>
2020-06-09 08:38:22 +10:00
Caio Marcelo de Oliveira Filho d1f6d2f3e8 nir: Fix logic that ends combine barrier sequence
The combination must stop when we see a scoped barrier that have
execution scope, i.e. it has control barrier behavior.  The code was
mistakenly looking at the wrong scope.

Fixes: 345b5847b4 ("nir: Replace the scoped_memory barrier by a scoped_barrier")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5365>
2020-06-08 15:49:24 +00:00
Caio Marcelo de Oliveira Filho b7a3821a5c nir: Fix printing execution scope of a scoped barrier
Fixes: 345b5847b4 ("nir: Replace the scoped_memory barrier by a scoped_barrier")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5365>
2020-06-08 15:49:24 +00:00
Samuel Pitoiset 86f21e4eba nir/lower_explicit_io: fix NON_UNIFORM access for UBO loads
Make sure to propagate the NON_UNIFORM access for UBO loads, so
that non-uniform loads are correctly lowered.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5311>
2020-06-08 07:35:43 +00:00
Erik Faye-Lund e61a98877c nir: reuse existing psiz-variable
For shaders where there's already a psiz-variable, we should rather
reuse it than create a second one. This can happen if a shader writes
gl_PointSize, but disables GL_PROGRAM_POINT_SIZE.

Fixes: 878c94288a ("nir: add lowering-pass for point-size mov")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5328>
2020-06-04 09:12:54 +00:00
Rob Clark 26a3c7b363 nir/lower_tex: fixes for fp16 yuv lowering
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3079
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5318>
2020-06-03 21:24:13 +00:00
Rob Clark 0f3255ef0a nir/builder: add bitsize conversion helpers
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5318>
2020-06-03 21:24:13 +00:00
Rob Clark 866618c5c8 nir: extract out convert_to_bitsize() helper
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5318>
2020-06-03 21:24:13 +00:00
Rob Clark 924bfb6560 nir: get_base_type() should return enum type
Needed by the next patch, for c++ code which is more strict about
conversions between integers and enums.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5318>
2020-06-03 21:24:12 +00:00
Boris Brezillon 345b5847b4 nir: Replace the scoped_memory barrier by a scoped_barrier
SPIRV OpControlBarrier can have both a memory and a control barrier
which some hardware can handle with a single instruction. Let's
turn the scoped_memory_barrier into a scoped barrier which can embed
both barrier types. Note that control-only or memory-only barriers can
be supported through this new intrinsic by passing NIR_SCOPE_NONE to the
unused barrier type.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4900>
2020-06-03 07:39:52 +00:00
Timothy Arceri 04dbf709ed nir: add callback to nir_remove_dead_variables()
This allows us to do API specific checks before removing variable
without filling nir_remove_dead_variables() with API specific code.

In the following patches we will use this to support the removal
of dead uniforms in GLSL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4797>
2020-06-03 02:22:23 +00:00
Marek Olšák cac24bee62 nir: gather which images are MSAA
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5209>
2020-06-02 20:47:49 +00:00
Marek Olšák 6503e4be13 nir: gather which images are buffers
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5209>
2020-06-02 20:47:49 +00:00
Marek Olšák f8ef15c061 nir: don't count samplers and images in interface blocks
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5209>
2020-06-02 20:47:49 +00:00
Marek Olšák 116e006693 nir: add options::vectorize_vec2_16bit to limit vectorization to vec2 16
for hardware that is scalar but can do 2 16-bit operations on low and high
16 bits of registers at once.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5002>
2020-06-02 20:01:18 +00:00
Marek Olšák a6916d1ce8 nir: fix lower_wpos for 16-bit fddy
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5002>
2020-06-02 20:01:18 +00:00
Marek Olšák 92333c6d1a nir: lower int16 and uint16 in nir_lower_mediump_outputs
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5002>
2020-06-02 20:01:18 +00:00
Marek Olšák f798513f91 nir: add i2imp and u2ump opcodes for conversions to mediump
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5002>
2020-06-02 20:01:18 +00:00
Alyssa Rosenzweig f3310cb3e1 nir: Fold f2f16(b2f32(x)) to b2f16(x)
By definition.

This reduces register pressure on freedreno so that the noubo expected
failure goes away.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5002>
2020-06-02 20:01:18 +00:00
Dylan Baker a8e2d79e02 meson: use gnu_symbol_visibility argument
This uses a meson builtin to handle -fvisibility=hidden. This is nice
because we don't need to track which languages are used, if C++ is
suddenly added meson just does the right thing.

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740>
2020-06-01 18:59:18 +00:00
Gert Wollny 682e14d3ea nir: lower_tex: Don't normalize coordinates for TXF with RECT
v2: remove the option to actually request normalization and its
    application in Intel < Gen6 (Jason)

v3: Also don't lower for query operations (Jason)

Fixes: 1ce8060c25
    nir/lower_tex: support for lowering RECT textures

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5105>
2020-05-28 18:39:29 +00:00
Jason Ekstrand e91108691d nir: Fix sources for image atomic fadd
Somehow we ended up with an extra scalar source up-front.  It doesn't
look like any drivers use this opcode yet so no real harm has been done
by it being wrong.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5218>
2020-05-26 23:24:45 +00:00
Rhys Perry 8e2009c448 nir: fix lowering to scratch with boolean access
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 18ed82b084
   ('nir: Add a pass for selectively lowering variables to scratch space')

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5214>
2020-05-26 18:54:17 +00:00
Alyssa Rosenzweig fcbc022787 nir: Add un/pack_32_4x8 opcodes
Complement the existing un/pack_32_2x16 opcodes. These are useful for
8-bit format packing. On Midgard, they are equivalent to just a 32-bit
move, but other GPUs could lower to other packs if needed.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5107>
2020-05-25 20:03:52 +00:00
Dmitriy Nester 0e9af02323 nir: replace fnv1a hash function with xxhash
xxhash is faster than fnv1a in almost all circumstances, so we're
switching to it globally.

Signed-off-by: Dmytro Nester <dmytro.nester@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4020>
2020-05-25 19:41:09 +00:00
Samuel Pitoiset 37c88c670f spirv: add ReadClockKHR support with device scope
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117>
2020-05-24 20:37:50 +02:00