Commit Graph

1638 Commits

Author SHA1 Message Date
Alan Swanson f1e9671442 util/disk_cache: actually enforce cache size
Currently only a one in one out eviction so if at max_size and
cache files were to constantly increase in size then so would the
cache. Restrict to limit of 8 evictions per new cache entry.

V2: (Timothy Arceri) fix make check tests

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Timothy Arceri 50989f87e6 util/disk_cache: use a thread queue to write to shader cache
This should help reduce any overhead added by the shader cache
when programs are not found in the cache.

To avoid creating any special function just for the sake of the
tests we add a one second delay whenever we call dick_cache_put()
to give it time to finish.

V2: poll for file when waiting for thread in test
V3: fix poll delay to really be 100ms, and simplify the wait function

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Jason Ekstrand 9d559ba39d nir/constant_expressions: Refactor helper functions
Apart from avoiding some unneeded size cases, this shouldn't have any
actual functional impact.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand 762a6333f2 nir: Rework conversion opcodes
The NIR story on conversion opcodes is a mess.  We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random.  This commit re-organizes things and makes them all
consistent:

 - All non-bool conversion opcodes now have the explicit size in the
   destination and are named <src_type>2<dst_type><size>.

 - Integer <-> integer conversion opcodes now only come in i2i and u2u
   forms (i2u and u2i have been removed) since the only difference
   between the different integer conversions is whether or not they
   sign-extend when up-converting.

 - Boolean conversion opcodes all have the explicit size on the bool and
   are named <src_type>2<dst_type>.

Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako.  This will make adding
int8, int16, and float16 versions much easier when the time comes.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand 702d1af8ba glsl/nir: Use nir_type_conversion_op
Using the helper is way better than hand-coding the universe.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand 6eb051e36f nir: Rewrite nir_type_conversion_op
The original version was very convoluted and tried way too hard to not
just have the nested switch statement that it needs.  Let's just write
the obvious code and then we know it's correct.  This fixes a bunch of
missing cases particularly with int64.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand 9084b1db30 nir: Add a get_nir_type_for_glsl_base_type helper
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand a136884139 nir/validate: Rework ALU bit-size rule validation
The original bit-size validation wasn't capable of properly dealing with
instructions with variable bit sizes.  An attempt was made to handle it
by looking at source and destinations but, because the validation was
done in validate_alu_(src|dest), it didn't really have the needed
information.  The new validation code is much more straightforward and
should be more correct.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand 370d68babc nir/validate: Validate that bit sizes and components always match
We've always required bit sizes to match but the rules for number of
components have been a bit loose.  You've never been allowed to source
from something with less components than you consume, but more has
always been fine.  This changes the validator to require that they match
exactly.  The fact that they don't always match has been a source of
confusion in NIR for quite some time and it's time we got rid of it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand e9a45a3d5d nir: Make image_size a variable-width intrinsic
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand 0bf0365393 nir/lower_tex: Use tex_instr_dest_size for txs destinations
Using coord_components of the source texture is correct for everything
except cube maps where it's off by one.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand fffa4111df nir/spirv: Restrict the number of channels in texture coordinates
Some SPIR-V texturing instructions pack more than the texture coordinate
into the coordinate source.  We need to mask off the unused channels.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand 3c312be7b3 nir/copy_prop: Respect the source's number of components
In the near future we are going to require that the num_components in a
src dereference match the num_components of the SSA value being
dereferenced.  To do that, we need copy_prop to not remove our MOVs from
a larger SSA value into an instruction that uses fewer channels.

Because we suddenly have to know how many components each source has,
this makes the pass a bit more complicated.  Fortunately, copy
propagation is the only pass that cares about the number of components
are read by any given source so it's fairly contained.

Shader-db results on Sky Lake:

   total instructions in shared programs: 13318947 -> 13320265 (0.01%)
   instructions in affected programs: 260633 -> 261951 (0.51%)
   helped: 324
   HURT: 1027

Looking through the hurt programs, about a dozen are hurt by 3
instructions and the rest are all hurt by 2 instructions.  From a
spot-check of the shaders, the story is always the same:  They get a
vec4 from somewhere (frequently an input) and use the first two or three
components as a texture coordinate.  Because of the vector component
mismatch, we have a mov or, more likely, a vecN sitting between the
texture instruction and the input.  This means that the back-end inserts
a bunch of MOVs and split_virtual_grfs() goes to town.  Because the
texture coordinate is also used by some other calculation, register
coalesce can't combine them back together and we end up with an extra 2
MOV instructions in our shader.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand 60d1aac28a nir/intrinsics: Make load_barycentric_input take a 2-component coor
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-14 07:36:20 -07:00
Timothy Arceri df1d5fc442 glsl: don't use ralloc for blob creation
There is no need to use ralloc here.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-13 09:50:19 +11:00
Timothy Arceri b607aad8e1 glsl: don't recompile a shader on fallback unless needed
Because we optimistically skip compiling shaders if we have seen them
before we may need to compile them later at link time if they haven't
yet been use in a specific combination to create a program.

Rather than always recompiling we take advantage of the
gl_compile_status enum introduced in the previous patch to only
compile when we have previously skipped compilation.

This helps with regressions in app start-up times on cold cache
runs, compared with no cache.

Deus Ex: Mankind Divided start-up times:

cache disabled:               ~3m15s
cold cache master:            ~4m23s
cold cache with this patch:   ~3m33s

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:26:08 +11:00
Timothy Arceri bfa95997c4 mesa/glsl: introduce new gl_compile_status enum
This will allow us to tell if a shader really has been compiled or
if the shader cache has just seen it before.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:24:40 +11:00
Emil Velikov a3782f2b7a glsl/tests: remove any bashisms
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:48 +00:00
Emil Velikov e4c7911150 nir: remove shebang from python scripts
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Emil Velikov 56e58e01e4 glsl: remove shebang from python scripts
All of the scripts are [must be] executed via $PYTHON2 [or equivalent]
hence why they are missing the execute bit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:46 +00:00
Emil Velikov eca18d440d glsl/tests: remove execute bit from compare_ir python script
Nearly all the python scripts used in-tree are invoked via $PYTHON2 or
equivalent. As such having the execute bit not needed and generally
ill-advised.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:45 +00:00
Emil Velikov 7473fcd40b glsl/tests: suffix .sh/.py files as applicable
This makes it easier/clearer as to:
 - if the file should have the execute bit set (.py should not)
 - do we need the shebang in the first place and if so what it should be

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:45 +00:00
Robert Bragg a98ffe2477 exec_list: Add a foreach_list_typed_from macro
This allows iterating list nodes from a given start point instead of
necessarily the list head.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 13:45:50 +00:00
Grazvydas Ignotas 8cd83a6c81 glsl/blob: clear padding bytes
Since blob is intended for serializing data, it's not a good idea to
leave padding holes with uninitialized data, which may leak heap
contents and hurt compression if the blob is later compressed, like
done by shader cache. Clear it.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-09 20:41:02 +11:00
Lionel Landwerlin f81ede4699 glsl: builtin: always return clones of the builtins
Builtins are created once and allocated using their own private ralloc
context. When reparenting IR that includes builtins, we might be steal
bits of builtins. This is problematic because these builtins might now
be freed when the shader that includes then last is disposed. This
might also lead to inconsistent ralloc trees/lists if shaders are
created on multiple threads.

Rather than including builtins directly into a shader's IR, we should
include clones of them in the ralloc context of the shader that
requires them. This fixes double free issues we've been seeing when
running shader-db on a big multicore (72 threads) server.

v2: Also rename _mesa_glsl_find_builtin_function_by_name() to better
    reflect how this function is used. (Ken)

v3: Rename ctx to mem_ctx (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 08:30:36 +00:00
Jason Ekstrand 4483c5d57c spirv: Silence unused variable warnings in release mode
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-03-07 15:22:16 -08:00
Timothy Arceri 6b657cecd5 util/disk_cache: fix make check
Fixes make check after 11f0efec2e which caused disk cache
to create an additional directory.
2017-03-06 16:39:55 +11:00
Timothy Arceri e3a01a5d1b Revert "glsl: Switch to disable-by-default for the GLSL shader cache"
This reverts commit 0f60c6616e.

Piglit and all games tested so far seem to be working without
issue. This change will allow wide user testing and we can decided
before the next release if we need to turn it off again.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-06 09:38:07 +11:00
Jason Ekstrand bc456749bd nir/int64: Properly handle imod/irem
The previous implementation was fine for GLSL which doesn't really have
a signed modulus/remainder.  They just leave the behavior undefined
whenever either source is negative.  However, in SPIR-V, there is a
defined behavior for negative arguments.  This commit beefs up the pass
so that it handles both correctly.  Tested using a hacked up version of
the Vulkan CTS test to get 64-bit support.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-03 13:59:27 -08:00
Jason Ekstrand 9745bef308 nir/builder: Add an int64 immediate helper
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-03 13:59:24 -08:00
Samuel Pitoiset 9fc86d4f53 glsl: fix subroutine mismatch between declarations/definitions
Previously, when q.subroutine was set to 1, a new subroutine
declaration was added to the AST, while 0 meant a subroutine
definition has been detected by the parser.

Thus, setting the q.subroutine flag in both situations is
obviously wrong because a new type identifier is added instead
of trying to match the declaration. To fix it up, introduce
ast_type_qualifier::is_subroutine_decl() to differentiate
declarations and definitions easily.

This fixes a regression with:
arb_shader_subroutine/compiler/direct-call.vert

Cc: Mark Janes <mark.a.janes@intel.com>
Fixes: be8aa76afd ("glsl: remove unecessary flags.q.subroutine_def")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100026
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-03 00:57:57 +01:00
Jason Ekstrand 424ac809bf i965: Do int64 lowering in NIR
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-01 17:00:20 -08:00
Jason Ekstrand 074f5ba0b5 nir: Add a simple int64 lowering pass
The algorithms used by this pass, especially for division, are heavily
based on the work Ian Romanick did for the similar int64 lowering pass
in the GLSL compiler.

v2: Properly handle vectors

v3: Get rid of log2_denom stuff.  Since we're using bcsel, we do all the
    calculations anyway and this is just extra instructions.

v4:
 - Add back in the log2_denom stuff since it's needed for ensuring that
   the shifts don't overflow.
 - Rework the looping part of the pass to be easier to expand.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-01 17:00:20 -08:00
Jason Ekstrand 86e749b1ad spirv: Use nir_builder for control flow
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-01 17:00:20 -08:00
Jason Ekstrand 95972cd4fd nir/lower_indirect: Use nir_builder control-flow helpers
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-01 17:00:20 -08:00
Jason Ekstrand 3ce8eeb5a1 nir/lower_gs_intrinsics: Use nir_builder control-flow helpers
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-01 17:00:20 -08:00
Jason Ekstrand c75f965ab7 glsl/nir: Use nir_builder's new control-flow helpers
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-01 17:00:20 -08:00
Jason Ekstrand e27c716ad7 nir/builder: Add support for easily building control-flow
Each of the pop functions (and push_else) take a control flow parameter as
their second argument.  If NULL, it assumes that the builder is in a block
that's a direct child of the control-flow node you want to pop off the
virtual stack.  This is what 90% of consumers will want.  The SPIR-V pass,
however, is a bit more "creative" about how it walks the CFG and it needs
to be able to pop multiple levels at a time, hence the argument.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-01 17:00:20 -08:00
Jason Ekstrand 2c58709023 glsl/int64: Fix a typo in imod64
The zy swizzle gives us one component of quotient and one component of
remainder.  What we wanted was zw for the remainder.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-01 15:31:44 -08:00
Samuel Pitoiset be8aa76afd glsl: remove unecessary flags.q.subroutine_def
This bit is definitely not necessary because subroutine_list
can be used instead. This frees one more bit in the flags.q
struct which is nice because arb_bindless_texture will need
4 bits for the new layout qualifiers.

No piglit regressions found (including compiler tests) with
"-t subroutine".

v2: set the subroutine flag for validating illegal flags

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-01 14:15:31 +01:00
Kenneth Graunke aa8bb9fc15 compiler: Free types in _mesa_glsl_release_types() rather than autofree.
Instead of using ralloc_autofree_context() to install an atexit()
handler to ralloc_free(glsl_type::mem_ctx), we can simply free them
from _mesa_glsl_release_types().

This is effectively the same, because _mesa_glsl_release_types() is
called from _mesa_destroy_shader_compiler(), which is called from Mesa's
one_time_fini() function, which Mesa installs as an atexit() handler.

The one advantage here is that it ensures the built-in functions are
destroyed before the types.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-27 15:46:12 -08:00
Samuel Pitoiset e69fd0b43c glsl: reject samplers not declared as uniform/function params earlier
This improves consistency with image variables and atomic
counters which are already rejected the same way.

Note that opaque variables can't be treated as l-values, which
means only the 'in' function parameter is allowed.

v2: rewrite commit message

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2017-02-27 19:42:00 +01:00
Samuel Pitoiset 08a052966f glsl: use is_sampler() anywhere it's possible
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-02-27 19:41:14 +01:00
Samuel Pitoiset e12f4edf9c glsl: use is_image() anywhere it's possible
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-02-27 19:41:11 +01:00
Samuel Pitoiset 46562a062b glsl: add missing blend_support qualifier in validate_flags()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-02-27 19:40:12 +01:00
Samuel Pitoiset 87ee1729d0 glsl: use an enum for AMD_conservative_depth layout qualifiers
The main idea behind this is to free some bits in the flags.q
struct because currently all 64-bits are used and we can't
add more layout qualifiers without reaching a static assert.

In order to do that (mainly for ARB_bindless_texture), use an
enumeration for the AMD_conservative_depth layout qualifiers
because it's forbidden to declare more than one depth qualifier
for gl_FragDepth.

Note that ast_type_qualifier::merge_qualifier() will prevent
using duplicate layout qualifiers by returning a compile-time
error.

No piglit regressions found (including compiler tests) with
RX480 on RadeonSI.

v2: use a switch case

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com> (v1)
2017-02-27 19:39:37 +01:00
Samuel Pitoiset de2727925a glsl: add has_shader_image_load_store()
Preliminary work for ARB_bindless_texture which can interact
with ARB_shader_image_load_store.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-27 19:33:10 +01:00
Elie TOURNIER 082d5b1aee nir: Delete unused arg in get_iteration
nir_const_value is not needed in get_iteration

Signed-off-by: Elie Tournier <tournier.elie@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-27 14:35:16 +00:00
Timothy Arceri 6b4bb24acf compiler: style clean-ups in blob.h
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2017-02-25 13:30:28 +11:00
Vinson Lee c3f9540a0c glsl: Fix missing-braces warning.
CXX    glsl/ast_to_hir.lo
glsl/ast_to_hir.cpp: In member function 'virtual ir_rvalue* ast_declarator_list::hir(exec_list*, _mesa_glsl_parse_state*)':
glsl/ast_to_hir.cpp:4846:42: warning: missing braces around initializer for 'unsigned int [16]' [-Wmissing-braces]

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-02-24 16:04:06 -08:00