Commit Graph

5371 Commits

Author SHA1 Message Date
Kenneth Graunke 140f53e646 Revert "nir: replace lower_ffma and fuse_ffma with has_ffma"
This reverts commit 939ddf3f67.

Intel has a separate pass for fusing FFMAs selectively.  We split
these flags in commit 1b72c31e1f and
the reasoning still stands.  The patch being reverted was just a
cleanup, so there should be no issue with reverting it.

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6849>
2020-09-24 13:11:50 -07:00
Marek Olšák 939ddf3f67 nir: replace lower_ffma and fuse_ffma with has_ffma
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>
2020-09-24 12:29:11 +00:00
Marek Olšák 771aad3027 nir: split lower_ffma into lower_ffma16/32/64
AMD wants different behavior for each bit size

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>
2020-09-24 12:29:11 +00:00
Marek Olšák 21174dedec nir: split fuse_ffma into fuse_ffma16/32/64
AMD wants different behavior for each bit size

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>
2020-09-24 12:29:11 +00:00
Jesse Natalie 924e27647e nir_lower_system_values: Fix load_global_invocation_id to use base_work_group_id even with no base_global id
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6668>
2020-09-22 21:22:26 +00:00
Eric Anholt ee4cee6dbd android: Disable trying to read/write to the disk cache.
We need the disk cache enabled in Android to get EGL_ANDROID_blob_cache's
callbacks called, but we don't actually want to store anything on disk.
Fixes "Failed to create //.cache for shader cache (Read-only file
system)---disabling." spam on init.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6762>
2020-09-22 20:41:25 +00:00
Jason Ekstrand 9750164c09 nir: Rename get_buffer_size to get_ssbo_size
This makes it explicit that this intrinsic is only for SSBOs.  For the
v3dv driver, we'll be adding a get_ubo_size intrinsic and we want to be
able to distinguish between the two.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6812>
2020-09-22 13:34:12 +00:00
Rhys Perry f100cf0d30 aco: stop multiplying driver_location by 4
This didn't really serve any purpose, doesn't match how FS inputs are
currently done, and prevented us from using
nir_io_add_const_offset_to_base in the future.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6689>
2020-09-22 12:38:43 +00:00
Danylo Piliaiev f2b17dec12 nir/lower_samplers: Clamp out-of-bounds access to array of samplers
Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:

"In the subsections described above for array, vector, matrix and
 structure accesses, any out-of-bounds access produced undefined
 behavior.... Out-of-bounds reads return undefined values, which
 include values from other variables of the active program or zero."

Robustness extensions suggest to return zero on out-of-bounds
accesses, however it's not applicable to the arrays of samplers,
so just clamp the index.

Otherwise instr->sampler_index or instr->texture_index would be out
of bounds, and they are used as an index to arrays of driver state.

E.g. this fixes such dereference:
 if (options->lower_tex_packing[tex->sampler_index] !=
in nir_lower_tex.c

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6428>
2020-09-22 09:06:52 +00:00
Danylo Piliaiev 0ba82f78a5 nir/large_constants: Eliminate out-of-bounds writes to large constants
Out-of-bounds writes could be eliminated per spec:

Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:

"In the subsections described above for array, vector, matrix and
 structure accesses, any out-of-bounds access produced undefined
 behavior.... Out-of-bounds writes may be discarded or overwrite
 other variables of the active program."

Fixes: 1235850522
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6428>
2020-09-22 09:06:52 +00:00
Danylo Piliaiev 66669eb529 nir/lower_io: Eliminate oob writes and return zero for oob reads
Out-of-bounds writes could be eliminated per spec:

Section 5.11 (Out-of-Bounds Accesses) of the GLSL 4.60 spec says:

 "In the subsections described above for array, vector, matrix and
  structure accesses, any out-of-bounds access produced undefined
  behavior....
  Out-of-bounds writes may be discarded or overwrite
  other variables of the active program.
  Out-of-bounds reads return undefined values, which
  include values from other variables of the active program or zero."

GL_KHR_robustness and GL_ARB_robustness encourage us to return zero
for reads.

Otherwise get_io_offset would return out-of-bound offset which may
result in out-of-bound loading/storing of inputs/outputs,
that could cause issues in drivers down the line.

E.g. this fixes such dereference:
 int vue_slot = vue_map->varying_to_slot[intrin->const_index[0]];
in brw_nir.c

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6428>
2020-09-22 09:06:52 +00:00
Vinson Lee 3022cf3bac glsl: Initialize ir_constant member const_elements in all constructors.
Fix defects reported by Coverity Scan.

Uninitialized pointer field (UNINIT_CTOR)
uninit_member: Non-static class member const_elements is not
initialized in this constructor nor in any functions that it calls.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6770>
2020-09-21 18:29:46 -07:00
Jason Ekstrand e1fc23265f nir: Add a pass for lowering CL-style image ops to texture ops
In CL 1.2, images are required to be either read-only or write-only.  We
can always translate the read-only image ops to texture ops.  In CL 2.0
(and an extension), the ability is added to have read-write images but
sampling (with a sampler) is only allowed on read-only images.  As long
as we only lower read-only images to texture ops, everything should stay
consistent.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6578>
2020-09-20 14:28:13 +00:00
Jason Ekstrand 3fc425b355 spirv: Plumb access qualifiers through from image types
In SPIR-V, the access qualifiers for an image are provided on the image
type.  Assuming no one swaps the types around on us (I think that should
be illegal), this means we can reliably fetch the access qualifiers from
the type itself.  The ops which don't really have an easy-to-fetch type
are the atomics because they use OpImageTexelPointer.  However, those
are only allowed on read/write images and that's the default.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6578>
2020-09-20 14:28:13 +00:00
Jason Ekstrand 1e902102c4 spirv: Access qualifiers are not a bitfield
They're an actual enum.  My bad.

Fixes: de36b5b805 "nir/vtn: Add support for kernel images to SPIRV-to-NIR"
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6578>
2020-09-20 14:28:13 +00:00
Jesse Natalie 9aa86eb61a glsl_type: Add packed to structure type comparison for hash map
Fixes: 659f333b3a "glsl: add packed for struct types"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6767>
2020-09-18 19:33:00 +00:00
Tapani Pälli 5805f5ab01 glsl: take EXT_gpu_shader4 in to account when adding round
GL_EXT_gpu_shader4 adds truncate() and round() builtins.

Fixes: 12567de2be ("glsl: mark some builtins with correct glsl(es) version check")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6731>
2020-09-18 05:49:51 +00:00
Gert Wollny 6f2b6952be nir: remove ubo_r600 instrinsic since ubo_vec4 is used now
As suggested by Eric.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
eviewed-by: Eric Anholt <eric@anholt.net>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6743>
2020-09-17 10:11:11 +00:00
Alejandro Piñeiro 2aaa1564ad nir/lower_io: don't reduce range if parent length is zero
When handling arrays, range is increased based on the array size minus
one. But if such is zero, it has the effect of reducing the
range. Handle that case by returning the unknown range value.

v2:
  * Add missing braces.
  * Return unknown range in this case, instead of keeping the initial
    range.
v3: Simplify code, using existing "fail" label. (Jason)

Fixes the following using v3dv:
  dEQP-VK.graphicsfuzz.cov-simplify-clamp-max-itself

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6737>
2020-09-16 23:24:28 +02:00
Gert Wollny 2c9fee9b6a nir: Add option lower_uniforms_to_ubo
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6316>
2020-09-16 10:07:42 +00:00
Marek Olšák 57bf4c2028 nir,radeonsi: move ffma fusing to late optimizations for better codegen
The freedreno trace changes were suggested by Rob Clark.

ALU performance is higher, because ffma is used more often, but so is
register usage, because trinary opcodes (such as ffma) usually need
at least 3 live registers.

54793 shaders in 33659 tests
Totals:
SGPRS: 2639746 -> 2642938 (0.12 %)
VGPRS: 1534120 -> 1536392 (0.15 %)
Spilled SGPRs: 3541 -> 3618 (2.17 %)
Spilled VGPRs: 33 -> 44 (33.33 %)
Scratch size: 292 -> 312 (6.85 %) dwords per thread
Code Size: 55639836 -> 55620116 (-0.04 %) bytes
Max Waves: 964785 -> 963977 (-0.08 %)

Totals from affected shaders:
SGPRS: 1105800 -> 1108992 (0.29 %)
VGPRS: 635292 -> 637564 (0.36 %)
Spilled SGPRs: 3193 -> 3270 (2.41 %)
Spilled VGPRs: 33 -> 44 (33.33 %)
Scratch size: 36 -> 56 (55.56 %) dwords per thread
Code Size: 31568708 -> 31548988 (-0.06 %) bytes
Max Waves: 319991 -> 319183 (-0.25 %)

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6596>
2020-09-16 02:39:02 +00:00
Jesse Natalie bf849b058b spirv: Handle OpTypeOpaque
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6715>
2020-09-15 20:38:37 +00:00
Italo Nicola 00914e2179 nir/algebraic: fold some nested comparisons with ball and bany
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6604>
2020-09-14 17:47:39 +00:00
Pierre-Eric Pelloux-Prayer 8a2a9e9bb8 glsl: fix per_vertex_accumulator::fields size
49d35f3d88 moved gl_Layer/gl_ViewportIndex/gl_ViewportMask
as builtins but fields size wasn't increased.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3508
Fixes: 49d35f3d88 ("glsl: declare gl_Layer/gl_ViewportIndex/gl_ViewportMask as vs builtins")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6688>
2020-09-14 15:24:32 +02:00
Tapani Pälli 12567de2be glsl: mark some builtins with correct glsl(es) version check
GLSL Desktop spec 1.30.x:
   "New built-ins: trunc(), round(), roundEven(), isnan(), isinf(), modf()"

For ES, 3.00.x is the first ES spec that mentions the builtins.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6455>
2020-09-14 11:12:55 +03:00
Marek Olšák 656d8edd9e nir/opt_vectorize: don't lose exact and no_*_wrap flags
This fixes a bunch of dEQP GLES tests.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6694>
2020-09-11 17:41:14 -04:00
Marek Olšák 50d335804f nir/algebraic: add late optimizations that optimize out mediump conversions (v3)
v2: move *2*mp patterns to the end of late_optimizations
v3: remove ftrunc from the optimizations to fix:
    dEQP-GLES3.functional.shaders.builtin_functions.common.modf.vec2_lowp_vertex

Reviewed-by: Rob Clark <robdclark@chromium.org> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
2020-09-10 23:35:13 +00:00
Marek Olšák b86305bb57 nir/algebraic: collapse conversion opcodes (many patterns)
mediump inserts a lot of conversions. This cleans up the IR.
All other combinations are covered too.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
2020-09-10 23:35:13 +00:00
Marek Olšák cdd498bbe8 nir: add new mediump opcodes f2[ui]mp, i2fmp, u2fmp
Algebraic optimizations will select them.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
2020-09-10 23:35:13 +00:00
Marek Olšák 385b4dbc39 nir: enforce 32-bit src type requirement for f2fmp and i2imp
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
2020-09-10 23:35:13 +00:00
Marek Olšák 3d3df8dbff nir: remove redundant opcode u2ump
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
2020-09-10 23:35:13 +00:00
Marek Olšák 26fc5e1f4a nir/algebraic: expand existing 32-bit patterns to all bit sizes using loops
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
2020-09-10 23:35:13 +00:00
Marek Olšák 3c8934a644 nir/algebraic: add flrp patterns for 16 and 64 bits
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
2020-09-10 23:35:13 +00:00
Marek Olšák 40f7afc1e9 nir: fix lower_mediump_outputs to not require variables
If IO is lowered, NIR doesn't have to contain any IO variables
(and in fact radeonsi removes them and other drivers should too).

This makes the pass work without variables.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6621>
2020-09-10 19:52:57 +00:00
Marek Olšák c2ae39e0ce nir: add mediump flag to IO semantics
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6621>
2020-09-10 19:52:57 +00:00
Vinson Lee 0bc36ef50e spirv: Initialize spirv_test member shader.
Fix defect reported by Coverity Scan.

Uninitialized pointer field (UNINIT_CTOR)
uninit_member: Non-static class member shader is not initialized in this
constructor nor in any functions that it calls

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6640>
2020-09-09 22:24:09 +00:00
Jesse Natalie 89401e5867 nir: More NIR_MAX_VEC_COMPONENTS fixes
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6655>
2020-09-09 20:19:42 +00:00
Jason Ekstrand c5dd54e600 nir/idiv_const: Use the modern nir_src_as_* constant helpers
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6655>
2020-09-09 20:19:42 +00:00
Jason Ekstrand d86e38af2c nir: More NIR_MAX_VEC_COMPONENTS fixes
A couple of these probably aren't strictly necessary but they won't
hurt.  The one that's particularly tricky is a fixed-length array in
nir_search.h.  However, to avoid blowing up the binary size of
nir_opt_algebraic by about 2x, we just assert that only small ops are
used.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6655>
2020-09-09 20:19:42 +00:00
Jesse Natalie 7ee5da90ed nir_dominance: Use uint32_t instead of int16_t for dominance counters
We're seeing OpenCL kernels that can hit this INT16_MAX block count.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6657>
2020-09-09 19:01:01 +00:00
Rhys Perry 641d45befb nir/opt_loop_unroll: fix is_access_out_of_bounds with vectors
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsquueze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6347>
2020-09-09 12:34:47 +00:00
Jason Ekstrand 7cedc4128a spirv: Run repair_ssa if there are discard instructions
SPIR-V's OpKill is a control-flow instruction but NIR's discard is not.
Therefore, it can be valid SPIR-V to have

    if (...) {
        foo = /* something */
    } else {
        discard;
    }
    use(foo);

without any phi between the definition of foo and its use.  This is not
true in NIR, however, because NIR's discard isn't considered
control-flow.  Arguably, this is a NIR bug but making discard control-
flow is a very deep change that can have serious ans subtle
side-effects.   The easier thing to do is just fix up the SSA in case we
have an OpKill which might have gotten us into the above case.

Fixes dEQP-VK.graphicsfuzz.vectors-and-discard-in-function with the new
NIR dominance validation pass enabled.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5288>
2020-09-08 19:44:01 +00:00
Jason Ekstrand 45bcb10841 nir: Add a dominance validation pass
We don't do full dominance validation of SSA values in nir_validate
because it requires generating valid dominance information and, while
that's not extremely expensive, it's probably more than we want to do on
every pass.  Also, dominance information is generated through the
metadata system so if we ran it by default in nir_validate, we would get
different beavior of the metadata system based on whether or not you
have a debug build and metadata bugs would be very hard to find.

However, having a pass for it that can be run occasionally, should help
detect and expose bugs.  For ease of use, we add a NIR_VALIDATE_SSA_DOMINANCE
environment variable which can be set to manually enable dominance
validation as a standard part of nir_validate.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5288>
2020-09-08 19:44:01 +00:00
Rhys Perry 6cef804067 nir/opt_if: fix opt_if_merge when destination branch has a jump
Fixes a case where opt_if_merge created code like:
if (...) {
   break;
   loop {
      ...
   }
}
which caused opt_peel_loop_initial_if to complain that the loop pre-header
wasn't a predecessor of the loop header. This patch prevents this
(invalid, I think) unreachable code from being created.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3496
Fixes: 4d3f6cb973 ('nir: merge some basic consecutive ifs')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6633>
2020-09-08 18:39:47 +00:00
Eric Anholt 1ed78bd247 nir: Use explicit deref information to provide real UBO ranges.
freedreno results (note that cat6 is loads from memory as opposed to
pushed constants from the constant file):

total instructions in shared programs: 8044344 -> 8022085 (-0.28%)
total constlen in shared programs: 1411384 -> 1461964 (3.58%)
total cat6 in shared programs: 89983 -> 87065 (-3.24%)

Over the last 3 commits, we increased Manhattan31 performance by ~10%

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>
2020-09-08 18:20:51 +00:00
Eric Anholt f3b33a5a35 nir: Add a range_base+range to nir_intrinsic_load_ubo().
For UBO accesses to be the same performance as classic GL default uniform
block uniforms, we need to be able to push them through the same path.  On
freedreno, we haven't been uploading UBOs as push constants when they're
used for indirect array access, because we don't know what range of the
UBO is needed for an access.

I believe we won't be able to calculate the range in general in spirv
given casts that can happen, so we define a [0, ~0] range to be "We don't
know anything".  We use that at the moment for all UBO loads except for
nir_lower_uniforms_to_ubo, where we now avoid losing the range information
that default uniform block loads come with.

In a departure from other NIR intrinsics with a "base", I didn't make the
base an be something you have to add to the src[1] offset.  This keeps us
from needing to modify all drivers (particularly since the base+offset
thing can mean needing to do addition in the backend), makes backend
tracking of ranges easy, and makes the range calculations in
load_store_vectorizer reasonable.  However, this could definitely cause
some confusion for people used to the normal NIR base.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>
2020-09-08 18:20:51 +00:00
Eric Anholt 3a9356831a nir: Update the comment about nir_lower_uniforms_to_ubo()'s multiplier.
I remembered doing this analysis and was arguing in another MR that this
pass didn't have any driver dependency, but it actually does based on
PIPE_CAP_PACKED_UNIFORMS.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>
2020-09-08 18:20:51 +00:00
Rhys Perry e4d75c22be nir/opt_shrink_vectors: shrink image stores using the format
fossil-db (Navi):
Totals from 657 (0.48% of 135946) affected shaders:
VGPRs: 26076 -> 25520 (-2.13%); split: -2.15%, +0.02%
CodeSize: 3033016 -> 3014472 (-0.61%); split: -0.64%, +0.03%
MaxWaves: 9386 -> 9420 (+0.36%)
Instrs: 590109 -> 585502 (-0.78%); split: -0.82%, +0.04%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5772>
2020-09-07 18:06:50 +00:00
Jason Ekstrand bd428162b6 nir/lower_io: Fix the unknown-array-index case in get_deref_align
The current align_mul calculation in the unknown-array-index
calculation is

    align_mul = MIN3(parent_mul,
                     min_pow2_divisor(parent_offset),
                     min_pow2_divisor(stride))

which is certainly correct if parent_offset > 0.  However, when
parent_offset = 0, min_pow2_divisor(parent_offset) isn't well-defined
and our calculation for it is 1 << -1 which isn't well-defined.  That
said.... it's not actually needed.

The offset to the base of the array is

    array_base = parent_mul * k + parent_offset

for some integer k.  When we throw in an unknown array index i, we get

    elem = parent_mul * k + parent_offset + stride * i.

If we set new_align = MIN2(parent_mul, min_pow2_divisor(stride)), then
both parent_mul and stride are divisible by new_align and

    elem = (parent_mul / new_alig) * new_align * k +
           (stride / new_align) * new_align * i + parent_offset

         = new_align * ((parent_mul / new_alig) * k +
                        (stride / new_align) * i) + parent_offset

so elem = new_align * j + parent_offset where

    j = (parent_mul / new_alig) * k + (stride / new_align) * i.

That's a very long-winded way of saying that we can delete one parameter
from the align_mul calculation and it's still fine. :-)

Fixes: 480329cf8b "nir: Add a helper for getting the alignment of a deref"
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6628>
2020-09-07 17:29:10 +00:00
Jason Ekstrand 013a2b123d spirv2nir: Rework argument handling
The argument handling of this little tool was pretty rubbish.  It had no
help and it required the filename to come first which is just strange.
This reworks it and makes things much nicer.  It's still rubbish but at
least there's a chance people can figure out how to use it now.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6607>
2020-09-07 14:59:40 +00:00