Take away const qualifier from return type of these functions as
-Wignored-qualifiers points out it is ignored for these cases.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
vtn supports these, so don't squalk if user is happy with enabling
these.
v2: add new members sorted
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
v2: add assert to verify we have at least one valid bit_size
v3: fix use of load_front_face in nir_lower_two_sided_color and tgsi_to_nir
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
With OpenCL some system values match the address bits, but in GLSL we also
have some system values being 64 bit like subgroup masks.
With this it is possible to adjust the builder functions so that depending
on the bit_sizes the correct bit_size is used or an additional argument is
added in case of multiple possible values.
v2: validate dest bit_size
v3: generate hex values in python code
remove useless imports
rename and move bit_sizes
v4: add 1 to legal bit_sizes for front_face
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
fixes a couple of deqp tests (on nvc0 and potential other drivers):
dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_1
dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_2
dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_3
dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_1
dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_2
dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_3
dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_1
dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_2
dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_3
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Currently we only add a cache key for a shader once it is linked.
However games like Team Fortress 2 compile a whole bunch of shaders
which are never actually linked. These compiled shaders can take
up a bunch of memory.
This patch changes things so that we add the key for the shader to
the cache as soon as it is compiled. This means on a warm cache we
can avoid the wasted memory from these shaders. Worst case scenario
is we need to compile the shaders at link time but this can happen
anyway if the shader has been evicted from the cache.
Reduces memory use in Team Fortress 2 from 1.3GB -> 770MB on a
warm cache from start up to the game menu.
V2: only add key to cache when compilation is successful.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Currently we only add a cache key for a shader once it is linked.
However games like Team Fortress 2 compile a whole bunch of shaders
which are never actually linked. These compiled shaders can take
up a bunch of memory.
This patch changes things so that we add the key for the shader to
the cache as soon as it is compiled. This means on a warm cache we
can avoid the wasted memory from these shaders. Worst case scenario
is we need to compile the shaders at link time but this can happen
anyway if the shader has been evicted from the cache.
Reduces memory use in Team Fortress 2 from 1.3GB -> 770MB on a
warm cache from start up to the game menu.
Acked-by: Marek Olšák <marek.olsak@amd.com>
This basically reverts c2bc0aa7b1.
By running the opts we reduce memory using in Team Fortress 2
from 1.5GB -> 1.3GB from start-up to game menu.
This will likely increase Deus Ex start up times as per commit
c2bc0aa7b1. However currently 32bit games like Team Fortress 2
can run out of memory on low memory systems, so that seems more
important.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Passes' function names, separated by comma, listed in NIR_SKIP
environment variable will be skipped in debug mode. The mechanism is
hooked into the _PASS macro, like NIR_PRINT.
The extra macro NIR_SKIP is available as a developer convenience, to
skip at pointer other than the passes entry points.
v2: Fix typo in NIR_SKIP macro. (Bas)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Otherwise writes get propagated across atomics if no barrier is
used. Without barrier writes should still be visible in the same
invocation, so an atomic has to be considered a write.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: b3c6146925 "nir: Copy propagation between blocks"
Fixes: 62332d139c "nir: Add a local variable-based copy propagation pass"
From @jekstrand's nir-1-bit-bool branch, with improved ior/inot lowering.
ior: fmax instead of fadd allows removing the fsat.
inot: seq(x, 0) can be better than fsub(1, x). On a2xx, it works better
with the scalar instruction set.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Replace calls to create hash tables and sets that use
_mesa_hash_pointer/_mesa_key_pointer_equal with the helpers
_mesa_pointer_hash_table_create() and _mesa_pointer_set_create().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Function's out variable could be an array dereferenced by an array:
func(v[w[i]]);
or something more complicated.
Copy index in any case.
Fixes: 76c27e47b9 ("glsl: Copy function out to temp if we don't directly ref a variable")
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Instead of emitting all of the conditions for the cases of a switch
statement up-front, emit them on-the-fly as we emit the code for each
case. The original justification for this was that we were going to
have to build a default case anyway which would need them all. However,
we can just trust CSE to clean up the mess in that case. Emitting each
condition right before the if statement that uses it reduces register
pressure and, in one customer benchmark, reduces spilling and improves
performance by about 2x.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Even though no one's been brave enough to ever use this pass, I like to
keep it functionally working.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Instead of applying the workaround universally, detect semi-old GLSLang
via the generator ID and only enable the workaround on old GLSLang.
This isn't nearly as precise as one would like it to be because the
first GLSLang generator id version bump was on October 7, 2017 which is
about 1.5 years after the bug was fixed. However, it at least lets us
disable it for non-GLSLang and for more modern versions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
A long time in a galaxy far far away, there was a GLSLang bug with how
it handled samplers passed in as function parameters. (The bug can be
found here: https://github.com/KhronosGroup/glslang/issues/179.)
Unfortunately, that version was shipped in several apps and has been
causing heartburn for our SPIR-V parser ever since.
Recent changes to NIR uncovered a moderately old bug in how we work
around this issue. In particular, we ended up with a deref_cast from
uniform to local which is not a no-op cast so nir_opt_deref wasn't
getting rid of the cast. The only reason why it worked before was
because someone just happened to call nir_fixup_deref_modes which
"fixed" the cast (that shouldn't be happening) and then a later round of
copy-prop would get rid of it. The fact that the deref_cast survived
that long without causing trouble for other parts of NIR is a bit
surprising.
Just whacking the mode of the pointer seems to fix it fairly
unobtrusively. Currently, only apps with this bug will have a local
variable containing an image or sampler.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109304
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
With the new handling of bool types, the conversion to float in glsl_to_nir
should not apply to bool types anymore.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
out_type in the default cast case is always GLSL_TYPE_FLOAT, so we get a
mov otherwise.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
All alu instructions emitted with native_integers=false expect float
(or bool in some cases) constants, so this change is necessary.
This will cause changes with some intrinsics which had integer sources,
such as nir_intrinsic_load_uniform. Apparently it might cause issues with
some opt passes, but perhaps those don't apply in OpenGL ES 2.0 cases?
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
GL_ARB_gl_spirv does not provide a sampler deref for e.g. texelFetch(),
so we can't assume that both are present and identical. Simply lower
each if it is present.
Fixes regressions in GL_ARB_gl_spirv tests since I switched everyone to
using this pass. Thanks to Alejandro Piñeiro for catching these.
Fixes: f003859f97 nir: Make gl_nir_lower_samplers use gl_nir_lower_samplers_as_deref
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Alejandro Piñeiro <apinheiro@igalia.com>
In all GLSL ES versions output variables in fragment shader are allowed
to be invariant.
From Section 4.6.1 ("The Invariant Qualifier") GLSL ES 1.00 spec:
"Only the following variables may be declared as invariant:
...
- Built-in special variables output from the fragment shader."
From Section 4.6.1 ("The Invariant Qualifier") GLSL ES 3.00 spec:
"Only variables output from a shader can be candidates for invariance."
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107842
They have glsl_sampler_dim enum values of 8 and 9 which don't work when
you & them with 0x7. Fortunately, we have plenty of bits.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The check for location aliasing was always asuming output variables
but this validation is also called for input variables.
Fixes: e2abb75b0e ("glsl/linker: validate explicit locations for SSO programs")
Cc: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
NIR metadata validation verifies that the debug bit was unset (by a call
to nir_metadata_preserve) if a NIR optimization pass made progress on
the shader. With the expectation that the NIR shader consists of only a
single main function, it has been safe to call nir_metadata_preserve()
iff progress was made.
However, most optimization passes calculate progress per-function and
then return the union of those calculations. In the case that an
optimization pass makes progress only on a subset of the functions in
the shader metadata validation will detect the debug bit is still set on
any unchanged functions resulting in a failed assertion.
This patch offers a quick solution (short of a larger scale refactoring
which I do not wish to undertake as part of this series) that simply
unsets the debug bit on unchanged functions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Will be used to communicate that a shader uses 64-bit operations to the
concerned lowering passes.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We're going to have multiple functions, so nir_shader_get_entrypoint()
needs to do something a little smarter.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Previously it assumed that only a single function (the entrypoint)
existed and attempted to lower constant initializers of shader outputs
for each function, for instance.