Commit Graph

7563 Commits

Author SHA1 Message Date
Marek Olšák a572ba673b nir/serialize: try to store a diff in var data locations instead of var data
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák c8314678ee nir/serialize: deduplicate serialized var types by reusing the last unique one
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák 545415f45f nir/serialize: don't serialize var->data for temporaries
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Marek Olšák c358c2b2bf nir/serialize: pack src better and limit the object count to 1M from 1G
We need to limit the object count to 1M to free 10 bits for the src
modifiers.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
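
The arithmetic behind the new limit can be sketched as follows: indexing 1G objects takes 30 bits, indexing 1M takes 20, and the difference is the 10 bits the message mentions. The field layout, the 2 spare flag bits, and all names below are assumptions made for illustration; they are not the actual encoding in nir_serialize.c.

```c
#include <stdint.h>

/* Hypothetical 32-bit source header (illustration only):
 *   bits  0..19  object index (20 bits, so at most 1M objects)
 *   bits 20..29  source modifiers (the 10 bits freed by the smaller limit)
 *   bits 30..31  other flags (assumed)
 */
#define OBJ_INDEX_BITS   20u
#define MODIFIER_BITS    10u
#define MAX_OBJECT_COUNT (1u << OBJ_INDEX_BITS)

static inline uint32_t
pack_src(uint32_t object_idx, uint32_t modifiers, uint32_t flags)
{
   return (object_idx & (MAX_OBJECT_COUNT - 1)) |
          ((modifiers & ((1u << MODIFIER_BITS) - 1)) << OBJ_INDEX_BITS) |
          ((flags & 0x3u) << (OBJ_INDEX_BITS + MODIFIER_BITS));
}

static inline uint32_t
unpack_object_idx(uint32_t packed)
{
   return packed & (MAX_OBJECT_COUNT - 1);
}

static inline uint32_t
unpack_modifiers(uint32_t packed)
{
   return (packed >> OBJ_INDEX_BITS) & ((1u << MODIFIER_BITS) - 1);
}
```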
Marek Olšák 35655865cb nir/serialize: pack instructions better
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-23 00:02:10 -05:00
Ian Romanick ca353285cb nir/range_analysis: Make sure the table validation only occurs once
All of the tables are static const, so they only need to be validated
once.  As noted in the previous commit, the compiler should be able to
eliminate all of this code when the assertions would pass.  Even with
the help of the previous commit, this does not always occur.

-Og: -95.688 +/- 3.91935 (-24.9562% +/- 1.0222%) N=5
-O1: No difference proven at 95.0% confidence. N=5
-O2: -1.962 +/- 0.85001 (-0.860013% +/- 0.372589%) N=5

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 08:16:06 -08:00
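
A minimal sketch of the "validate once" idea, assuming a function-local static flag guards the check. The table, the counter, and the function names are hypothetical, and the real change may be structured differently (for example, inside the assertion macros).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical static const table; the real tables live in nir_range_analysis.c. */
static const int table[4] = { 0, 1, 2, 3 };

/* Counts how often validation actually ran, for demonstration only. */
static unsigned validation_runs = 0;

static void
validate_table(void)
{
   validation_runs++;
   for (size_t i = 0; i < 4; i++)
      assert(table[i] == (int)i);
}

static int
lookup(size_t i)
{
   /* The table never changes, so one successful validation is enough.
    * Note: this simple flag is not thread-safe. */
   static bool validated = false;

   if (!validated) {
      validate_table();
      validated = true;
   }
   return table[i];
}
```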
Ian Romanick ccefce46cb nir/range-analysis: Add pragmas to help loop unrolling
I was pretty liberal with these assertions when I wrote this code
because I had assumed that GCC would unroll the loops, inline the lookups
of static const arrays with now-constant indices, and then eliminate
all the actual assertions.  It seems none of this happens even at -O3.

Adding the pragmas helps encourage loop unrolling at some optimization
levels.  I tested by running shader-db with NIR_VALIDATE=false on a Core
i7 Haswell desktop system.

-Og: No difference proven at 95.0% confidence. N=5
-O1: -48.304 +/- 1.221 (-16.3343% +/- 0.412888%) N=5
-O2: -49.94 +/- 1.23521 (-17.9634% +/- 0.444303%) N=5

v2: Add a _Pragma to an inner loop that was accidentally dropped during
a rebase.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 08:16:06 -08:00
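
The technique described above can be sketched like this, assuming GCC 8 or newer, where `#pragma GCC unroll N` (spelled here with `_Pragma`) is accepted directly before a loop. The table and function names are hypothetical. Whether the compiler then folds away the assertions still depends on the optimization level, as the measurements in the message show.

```c
#include <assert.h>

#define TABLE_DIM 3

/* Hypothetical lookup table; symmetric by construction. */
static const int combine[TABLE_DIM][TABLE_DIM] = {
   { 0, 1, 2 },
   { 1, 1, 2 },
   { 2, 2, 2 },
};

/* Checks that the table is symmetric and returns the number of entries
 * checked. The pragmas ask GCC to fully unroll both loops so that the
 * indices become compile-time constants. */
static unsigned
validate_combine_table(void)
{
   unsigned checked = 0;

   _Pragma("GCC unroll 3")
   for (unsigned r = 0; r < TABLE_DIM; r++) {
      _Pragma("GCC unroll 3")
      for (unsigned c = 0; c < TABLE_DIM; c++) {
         assert(combine[r][c] == combine[c][r]);
         checked++;
      }
   }
   return checked;
}
```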
Danylo Piliaiev 25a00b449f glsl: Add varyings to "zero-init of uninitialized vars" workaround
Varyings are similar to the cases already handled, and the workaround's
name, "glsl_zero_init", already suggests that it should cover varyings.

The issue was observed in GiMark subtest from GpuTest.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-22 15:25:56 +00:00
Alyssa Rosenzweig deaebc82a7 nir: Add load_sampler_lod_paramaters_pan intrinsic
This loads in the <min_lod, max_lod, lod_bias> settings for a given
sampler, which is necessary for lowering clamps/biases on certain
Midgard chips.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-22 05:07:19 +00:00
Marek Olšák 0b1452ffdd nir/serialize: do ctx = {0} instead of manual initializations
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-21 18:49:57 -05:00
Marek Olšák ff71fae440 nir: strip as we serialize to remove the nir_shader_clone call
Serializing stripped NIR is faster now.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-21 18:49:57 -05:00
Dave Airlie cce07ea835 nir: fix deref offset builder
Use the correct bit size

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:41 +10:00
Dave Airlie 7325f6ac98 vtn/opencl: add clz support
This is needed for OpenCL

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:41 +10:00
Dave Airlie d0d96053e6 nir: add 64-bit ufind_msb lowering support. (v2)
This adds the option to lower 64-bit ufind_msb opcodes.

v2: use split_x/y, which removes the component loops (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:37 +10:00
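
A scalar sketch of how a 64-bit ufind_msb can be built from 32-bit pieces, assuming the usual semantics (index of the most significant set bit, or -1 for zero). The helper names are hypothetical, and the actual NIR lowering operates on SSA values, not C integers.

```c
#include <stdint.h>

/* Reference 32-bit ufind_msb: index of the highest set bit, or -1 if v == 0. */
static int
ufind_msb32(uint32_t v)
{
   int msb = -1;
   while (v) {
      msb++;
      v >>= 1;
   }
   return msb;
}

/* 64-bit version expressed with 32-bit operations only. */
static int
ufind_msb64_lowered(uint64_t v)
{
   uint32_t lo = (uint32_t)v;          /* low half, the "split_x" part */
   uint32_t hi = (uint32_t)(v >> 32);  /* high half, the "split_y" part */

   /* If any high bit is set, the answer lies in the high half. */
   return hi ? 32 + ufind_msb32(hi) : ufind_msb32(lo);
}
```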
Dave Airlie 12913bcf86 spirv/nir/opencl: handle some multiply instructions.
This adds support for some missing 24-bit and hi multiply
variants.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:25 +10:00
Dave Airlie 5375c30234 spirv: get the correct type for function returns.
This needs to be derived from the address format, not always 1/32.

Suggested by Jason

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:25 +10:00
Dave Airlie b62a925ad1 spirv: don't store 0 to cs.ptr_size for non kernel stages.
cs is a union so storing this there is wrong.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-22 04:37:25 +10:00
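
Why an unconditional store into a union member is a problem can be shown with a small sketch. The struct, its fields, and both functions are hypothetical stand-ins, not the real shader_info definition.

```c
#include <stdint.h>

enum stage { STAGE_FRAGMENT, STAGE_KERNEL };

/* Hypothetical per-stage info: fs and cs share the same storage. */
struct shader_info {
   enum stage stage;
   union {
      struct { uint32_t uses_discard; } fs;
      struct { uint32_t ptr_size; } cs;
   };
};

/* Buggy pattern: writes cs.ptr_size for every stage, so a fragment
 * shader's fs data is overwritten with 0. */
static void
set_ptr_size_buggy(struct shader_info *info, uint32_t ptr_size)
{
   info->cs.ptr_size = info->stage == STAGE_KERNEL ? ptr_size : 0;
}

/* Fixed pattern: only touch the cs member when the stage actually uses it. */
static void
set_ptr_size_fixed(struct shader_info *info, uint32_t ptr_size)
{
   if (info->stage == STAGE_KERNEL)
      info->cs.ptr_size = ptr_size;
}
```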
Iago Toral Quiroga c573b50179 glsl: add missing initialization of the location path field
This was apparently missed in 67b32190f3, which added support
for ARB_shading_language_include to #line, including the 'path'
field for the location.

Fixes crashes in CTS with all drivers as they attempt to access
an uninitialized path string during parsing.

Fixes: 67b32190f3 ("glsl: add ARB_shading_language_include support to #line")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2132
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>
2019-11-21 12:55:15 +01:00
Timothy Arceri cd6322366d compiler: move build definition of pp_standalone_scaffolding.c
This should fix android build issues while still allowing scons to
build the standalone compiler.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2129

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2019-11-21 16:07:08 +11:00
Karol Herbst 5934a53bfe nir/validate: validate num_components on registers and intrinsics
Also make 8 and 16 components invalid. We will enable that again later
when we actually support it.

v2: fix validation of nir_intrinsic_instr::num_components
    correct validation of instr->num_components

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-21 01:10:24 +01:00
Rhys Perry ca2de7ae9c nir/large_constants: use nir_index_vars and nir_variable::index
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-20 15:05:42 +00:00
Rhys Perry 9f92e8b721 nir: add nir_variable::index and nir_index_vars
This will be useful as a deterministic identifier/index for the variable.

v2: fix comment style

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1)
2019-11-20 15:05:42 +00:00
Rhys Perry 45a0b53490 nir: make nir_variable::{num_members,num_state_slots} a uint16_t
Doesn't shrink it (at least, on x86-64) and leaves space for more members.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-20 15:05:42 +00:00
Neil Roberts f6b5abe91a nir/lower_alu_to_scalar: Support lowering 8- and 16-bit reduce ops
Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts 634eb9c04b nir: Add an 8-bit bool type
Adds nir_type_bool8 as well as 8-bit versions of all the bool
opcodes.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts 0f5640c577 nir: Add a 16-bit bool type
Adds nir_type_bool16 as well as 16-bit versions of all the bool
opcodes.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts 2ec97e78a9 nir/opcodes: Add a helper function to generate reduce opcodes
Adds binop_reduce_all_sizes which generates both 1-bit and 32-bit
versions of the reduce operation. This reduces the code duplication a
bit and will make it easier to later add 16-bit versions as well.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts 9a96afb97e nir/opcodes: Add a helper function to generate the comparison binops
Adds binop_compare_all_sizes which generates both 1-bit and 32-bit
versions of the comparison operation. This reduces the code
duplication a bit and will make it easier to later add 16-bit versions
as well.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Timothy Arceri 1201d3377e mesa: add cursor support for relative path shader includes
This will allow us to continue searching the current path for
relative shader includes.

From the ARB_shading_language_include spec:

   "If it is quoted with double quotes in a previously included
   string, then the first search point will be the tree location
   where the previously included string had been found."

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:56 +00:00
Timothy Arceri db5197cec5 glsl: delay compilation skip if shader contains an include
If the shader contains an include, we need to first run the
preprocessor before deciding if we can skip compilation based
on the shader cache.

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:56 +00:00
Timothy Arceri 17df8f8b5d glsl: add can_skip_compile() helper
We will reuse this in the following commit.

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:56 +00:00
Timothy Arceri 5327b756bf glsl: error if #include used while extension is disabled
In other words make sure the shader does this:

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri 13a1426b97 glsl: add preprocessor #include support
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri e0fd2fa689 glsl: pass gl_context to glcpp_parser_create()
This is a small tidy up and will be useful in the following commit.

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri 67b32190f3 glsl: add ARB_shading_language_include support to #line
From the ARB_shading_language_include spec:

   "#line must have, after macro substitution, one of the following
    forms:

       #line <line>
       #line <line> <source-string-number>
       #line <line> "<path>"

    where <line> and <source-string-number> are constant integer
    expressions and <path> is a valid string for a path supplied in the
    #include directive. After processing this directive (including its
    new-line), the implementation will behave as if it is compiling at
    line number <line> and source string number <source-string-number>
    or <path> path. Subsequent source strings will be numbered
    sequentially, until another #line directive overrides that
    numbering."

Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Timothy Arceri 35108caa71 glsl: add infrastructure for ARB_shading_language_include
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
2019-11-20 05:05:55 +00:00
Marek Olšák 654efd38bb nir: don't use GLenum16 in nir.h
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:12 -05:00
Marek Olšák ec7d37c9c0 nir: move data.descriptor_set above data.index for better packing
4 bytes down

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:10 -05:00
Marek Olšák b160acb9f5 glsl_to_nir: rename image_access to mem_access
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:09 -05:00
Marek Olšák 193e2c9625 nir/print: only print image.format for image variables
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:07 -05:00
Marek Olšák ebe7579655 nir: move data.image.access to data.access
The size of the data structure doesn't change.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-11-19 18:20:05 -05:00
Samuel Pitoiset 194bee193c spirv: fix lowering of OpGroupNonUniformAllEqual
It should rely on the source type, not on the return type which
is always a boolean anyways, so vote_feq was never selected. For
OpSubgroupAllEqualKHR it's always an integer comparison.

This fixes some VK_KHR_shader_subgroup_extended_types tests with RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-19 18:01:13 +00:00
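
The selection rule described above can be sketched as a small decision function. The enums and the function name are hypothetical; the real code lives in the SPIR-V to NIR translator and works on vtn/glsl types.

```c
#include <stdbool.h>

enum base_type { TYPE_BOOL, TYPE_INT, TYPE_UINT, TYPE_FLOAT };
enum vote_op { VOTE_IEQ, VOTE_FEQ };

/* Picks the vote opcode from the SOURCE type. Keying on the result type
 * would always see a boolean and therefore never choose the float compare. */
static enum vote_op
pick_vote_op(bool is_khr_all_equal, enum base_type src_type)
{
   /* OpSubgroupAllEqualKHR is always an integer comparison. */
   if (is_khr_all_equal)
      return VOTE_IEQ;

   return src_type == TYPE_FLOAT ? VOTE_FEQ : VOTE_IEQ;
}
```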
Dave Airlie 1468a4f1f3 nir/serialize: fix serializing functions with no implementations.
Store a flag stating if there was an implementation, and use
fxn->impl as a temporary flag between deserialization stages.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-19 09:30:32 +10:00
Dave Airlie 0fd6b8aa98 nir/serialize: pack function has name and entry point into flags.
Suggested by Jason.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-19 09:30:12 +10:00
Jason Ekstrand 7260df5894 nir: Validate that variables are in the right lists
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-18 16:15:30 -06:00
Connor Abbott f9fd04aca1 nir: Fix non-determinism in lower_global_vars_to_local
Using a hash-table walk means that variables will get inserted in
different orders on different runs. Just walk the list of globals
instead, even if some of them can't be turned into locals.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-14 13:10:58 +00:00
Caio Marcelo de Oliveira Filho 7ae506e5b8 spirv: Consider the sampled_image case in wa_glslang_179 workaround
Fixes: 9e440b8d0b ("spirv: Sort out the mess that is sampled image")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-13 12:02:29 -08:00
Brian Paul bd49dedae0 nir: fix a couple signed/unsigned comparison warnings in nir_builder.h
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-12 11:44:02 -07:00
Rhys Perry c877f4d320 nir/divergence: improve DA of shuffle
If the data is uniform, then it's really a uniform copy. If the index is
uniform, then it's really a read_invocation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-12 17:21:38 +00:00
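
Read as a rule, the two observations above mean a shuffle result only needs to be treated as divergent when both of its operands are divergent. A minimal sketch, with a hypothetical function name:

```c
#include <stdbool.h>

/* Uniform data: every invocation reads the same value regardless of the index.
 * Uniform index: every invocation reads from the same lane (read_invocation).
 * Either case yields a uniform result. */
static bool
shuffle_is_divergent(bool data_divergent, bool index_divergent)
{
   return data_divergent && index_divergent;
}
```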
Jason Ekstrand 0c7e0c5599 spirv: Fix the MSVC build
Fixes: 9cc4c2c916 "spirv: Add a vtn_decorate_pointer helper"
Tested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-12 08:34:55 +00:00