Commit Graph

3202 Commits

Author SHA1 Message Date
Eric Anholt 447ddfc137 nir: Add nir_lower_alu_to_scalar.
This is the equivalent of brw_fs_channel_expressions.cpp, which I wanted
for vc4.

v2: Use the nir_src_for_ssa() helper, and another instance of
    nir_alu_src_copy().
v3: Drop the non-SSA support.  All intended callers will have SSA-only ALU
    ops.
v4: Use insert_before, drop stale bcsel/fcsel comment, drop now-unused
    unsupported() function, drop lower_context struct.
v5: Completely rename the pass to nir_lower_alu_to_scalar(), add an assert
    about weird input_sizes[].

Reviewed-by: Jason Ekstrand <jason.ekstrand@iastate.edu>
2015-01-23 16:37:23 -08:00
Eric Anholt b200127816 nir: Make some helpers for copying ALU src/dests.
There aren't many users yet, but I wanted to do this from my scalarizing
pass.

v2: Constify the src arguments.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-23 16:37:16 -08:00
Kenneth Graunke 15063d2ad0 nir: Add algebraic optimizations for division and reciprocal.
These also exist in opt_algebraic.cpp.

total NIR instructions in shared programs: 2011430 -> 2011211 (-0.01%)
NIR instructions in affected programs:     42221 -> 42002 (-0.52%)
helped:                                    198

total i965 instructions in shared programs: 6020553 -> 6020116 (-0.01%)
i965 instructions in affected programs:     84322 -> 83885 (-0.52%)
helped:                                     394
HURT:                                       1 (by 1 instruction)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke bbd60f6d79 nir: Add algebraic optimizations for exponential/logarithmic functions.
Most of these exist in the GLSL IR algebraic pass already.  However,
SSA allows us to find more instances of the patterns.

total NIR instructions in shared programs: 2015593 -> 2011430 (-0.21%)
NIR instructions in affected programs:     124189 -> 120026 (-3.35%)
helped:                                    604

total i965 instructions in shared programs: 6025505 -> 6018717 (-0.11%)
i965 instructions in affected programs:     261295 -> 254507 (-2.60%)
helped:                                     1295
HURT:                                       3

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke 391fb32bbe nir: Add algebraic optimizations for simplifying comparisons.
The first batch removes bonus fnot/inot operations, possibly allowing
other optimizations to better recognize patterns.

The next batch replaces a fadd and constant 0.0 with an fneg - negation
is usually free on GPUs, while addition is not.

total NIR instructions in shared programs: 2020814 -> 2015593 (-0.26%)
NIR instructions in affected programs:     411143 -> 405922 (-1.27%)
helped:                                    2233
HURT:                                      214

A few shaders are hurt by a few instructions due to moving neg such
that it has a constant operand, which is then folded, resulting in two
distinct load_consts for x and -x.  We can always clean that up later.

total i965 instructions in shared programs: 6035392 -> 6025505 (-0.16%)
i965 instructions in affected programs:     784980 -> 775093 (-1.26%)
helped:                                     4508
HURT:                                       2

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke 551a752a59 nir: Add algebraic optimizations for pointless shifts.
The GLSL IR optimization pass contained these; we may as well include
them too.

v2: Fix a >> 0 and a << 0 optimizations (caught by Matt).

No change in the number of NIR instructions on a shader-db run.

total i965 instructions in shared programs: 6035397 -> 6035392 (-0.00%)
i965 instructions in affected programs:     542 -> 537 (-0.92%)
helped:                                     2 (in glamor)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke 3e56572c49 nir: Add a bunch of algebraic optimizations on logic/bit operations.
Matt and I noticed a bunch of "val <- ior a a" operations in a shader,
so we decided to add an algebraic optimization for that.  While there,
I decided to add a bunch more of them.

v2: Delete bogus fand/for optimizations (caught by Jason).

total NIR instructions in shared programs: 2023511 -> 2020814 (-0.13%)
NIR instructions in affected programs:     149634 -> 146937 (-1.80%)
helped:                                    1032

total i965 instructions in shared programs: 6035392 -> 6035397 (0.00%)
i965 instructions in affected programs:     537 -> 542 (0.93%)
HURT:                                       2

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke 978b0a9cda nir: Implement CSE on intrinsics that can be eliminated and reordered.
Matt and I noticed that one of the shaders hurt by INTEL_USE_NIR=1 had
load_input and load_uniform intrinsics repeated several times, with the
same parameters, but each one generating a distinct SSA value.  This
made ALU operations on those values appear distinct as well.

Generating distinct SSA values is silly - these are read only variables.
CSE'ing them makes everything use a single SSA value, which then allows
other operations to be CSE'd away as well.

Generalizing a bit, it seems like we should be able to safely CSE any
intrinsics that can be eliminated and reordered.  I didn't implement
support for variables for the time being.

v2: Assert that info->num_variables == 0 (requested by Jason).

total NIR instructions in shared programs: 2435936 -> 2023511 (-16.93%)
NIR instructions in affected programs:     2413496 -> 2001071 (-17.09%)
helped:                                    16872

total i965 instructions in shared programs: 6028987 -> 6008427 (-0.34%)
i965 instructions in affected programs:     640654 -> 620094 (-3.21%)
helped:                                     2071
HURT:                                       585
GAINED:                                     14
LOST:                                       25

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-23 14:53:26 -08:00
Kenneth Graunke cbdd623f13 nir: Pull nir_instr_can_cse()'s SSA checks out of the switch.
This should not be a change in behavior, as all current cases that
potentially answer "yes" require SSA.

The next patch will introduce another case that requires SSA.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-23 14:53:26 -08:00
Matt Turner 145919b2ab glsl: Build a libglsl_util library.
Rather than sourcing files with ../dir/file.c which leads to distclean
wiping out ../dir's .deps directory.
2015-01-23 14:28:44 -08:00
Matt Turner 618c3b35f1 glsl: Build with subdir-objects.
Apparently $(top_srcdir) is not expanded in a source list when using
subdir-objects, so remove that. It's not clear to me why we were going
to such lengths to prefix each source file anyway.
2015-01-23 14:28:42 -08:00
Matt Turner a8b880bd63 nir: Add headers to distribution. 2015-01-23 14:27:39 -08:00
Matt Turner ae494281a4 nir: Add nir_{opt_,}algebraic.py to distribution. 2015-01-23 14:26:53 -08:00
Connor Abbott 68a9d0b36f nir: add generated file to .gitignore
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-23 10:20:46 -08:00
Connor Abbott c8761c8559 glsl: fix stale comment
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-23 00:23:51 -05:00
Eric Anholt fc6938d23e nir: Fix setup of constant bool initializers.
brw_fs_nir has only seen scalar bools so far, thanks to vector splitting,
and the ralloc of in glsl_to_nir.cpp will *usually* get you a 0-filled
chunk of memory, so reading too large of a value will usually get you the
right bool value.  But once we start doing vector bools in a few commits,
we end up getting bad values.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-22 13:52:19 -08:00
Eric Anholt 534a4ec82f nir: Make an easier helper for setting up SSA defs.
Almost all instructions we nir_ssa_def_init() for are nir_dests, and you
have to keep from forgetting to set is_ssa when you do.  Just provide the
simpler helper, instead.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-22 13:52:19 -08:00
Jonathan Gray c5be9c126d glsl: Link glsl_test with pthreads library.
Otherwise pthread_mutex_lock will be an undefined reference
on OpenBSD.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88219
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>
2015-01-22 21:29:43 +00:00
Tapani Pälli adc8cdfa35 glsl: do not allow interface block to have name already taken
Fixes currently failing Piglit case
   interface-blocks-name-reused-globally.vert

v2: combine var declaration with assignment (Ian)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-01-22 07:54:19 +02:00
Matt Turner 28b7c6b285 nir: Replace assert(0) with unreachable().
Fixes a couple of warnings in the process.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-21 21:06:37 -08:00
Jason Ekstrand f88c6a4997 nir: Stop using designated initializers
Designated initializers with anonymous unions don't work in MSVC or
GCC < 4.6.  With a couple of constructor methods, we don't need them any
more and the code is actually cleaner.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88467
Reviewed-by: Connor Abbot <cwabbott0@gmail.com>
2015-01-21 19:55:02 -08:00
Jason Ekstrand 7da60eca4f nir: Add src and dest constructors
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-21 12:21:10 -08:00
Jason Ekstrand 194f6235b3 nir: Add a nir_foreach_phi_src helper macro
Reviewed-by: Connor Abbott <cwabbott02gmail.com>
2015-01-20 16:53:29 -08:00
Micah Fedke d36fa60191 mesa: Add ARB_shader_precision infrastructure
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2015-01-19 16:33:21 +13:00
Vinson Lee 10a4f1e77a nir: s/malloc.h/stdlib.h/
Fix build error on Mac OS X.

  CC       nir_to_ssa.lo
nir_to_ssa.c:29:10: fatal error: 'malloc.h' file not found
         ^

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88478
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2015-01-16 16:14:51 -08:00
Carl Worth 977ddecb69 glsl: Add unit tests for blob.c
In addition to exercising all of the functions in blob.h, this
includes a stress test that forces some reallocing, and also tests to
verify the alignment and overrun-detection code in blob.c.
2015-01-16 13:47:40 -08:00
Tapani Pälli ffcad3a548 glsl: Add blob_overwrite_bytes and blob_overwrite_uint32
These functions are useful when serializing an unknown number of items
to a blob. The caller can first save the current offset, write a
placeholder uint32, write out (and count) the items, then use
blob_overwrite_uint32 with the saved offset to replace the placeholder
value.

Then, when deserializing, the reader will first read the count and
know how many subsequent items to expect.

(I wrote this code after reading a very similar patch written by
Tapani when he wrote serialization code for IR. Since I re-used the
idea of his code so directly, I've credited him as the author of this
code. --Carl)

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-16 13:47:40 -08:00
Carl Worth 1c9877327e glsl: Add blob.c---a simple interface for serializing data
This new interface allows for writing a series of objects to a chunk
of memory (a "blob").. The allocated memory is maintained within the
blob itself, (and re-allocated by doubling when necessary).

There are also functions for reading objects from a blob as well. If
code attempts to read beyond the available memory, the read functions
return 0 values (or its moral equivalent) without reading past the
allocated memory. Once the caller is done with the reads, it can check
blob->overrun to ensure whether any invalid values were previously
returned due to attempts to read too far.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-16 13:47:40 -08:00
Carl Worth f87ffd5cc3 glsl: Add convenience function get_sampler_instance
This is similar to the existing functions get_instance,
get_array_instance, etc. for getting a type singleton. The new
get_sampler_instance() function will be used by the upcoming shader
cache.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-16 13:47:40 -08:00
Jason Ekstrand bc6e57e019 nir/live_variables: Use a worklist
This is a rework of the liveness algorithm using a worklist as suggested by
Connor.  Doing so reduces the number of times we walk over the instructions
because we don't have to do an entire pointless walk over the instructions
just to figure out it's time to stop.  Also, the stuff after the last loop
in the funciton will only ever get visited once.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 16:54:21 -08:00
Jason Ekstrand 4839d1aed1 nir: Add a worklist helper structure
A worklist is a common concept in optimizations.  This adds a structure
that we can reuse for many different types of optimizations.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 16:54:21 -08:00
Brian Paul 0aaaa13ec9 nir: fix incorrect argument passed to validate_src() in validate_tex_instr()
Silences a compiler warning.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 17:41:42 -07:00
Brian Paul aa479a69d6 nir: silence compiler warning from visit_src() call
v2: use proper argument

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 17:09:02 -07:00
Jason Ekstrand 153b8b3525 util/hash_set: Rework the API to know about hashing
Previously, the set API required the user to do all of the hashing of keys
as it passed them in.  Since the hashing function is intrinsically tied to
the comparison function, it makes sense for the hash set to know about
it.  Also, it makes for a somewhat clumsy API as the user is constantly
calling hashing functions many of which have long names.  This is
especially bad when the standard call looks something like

_mesa_set_add(ht, _mesa_pointer_hash(key), key);

In the above case, there is no reason why the hash set shouldn't do the
hashing for you.  We leave the option for you to do your own hashing if
it's more efficient, but it's no longer needed.  Also, if you do do your
own hashing, the hash set will assert that your hash matches what it
expects out of the hashing function.  This should make it harder to mess up
your hashing.

This is analygous to 94303a0750 where we did this for hash_table

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 13:21:27 -08:00
Jason Ekstrand 4c99e3ae78 util: Move main/set to util/hash_set
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 13:21:27 -08:00
Jason Ekstrand 8ed5305d28 hash_table: Rename insert_with_hash to insert_pre_hashed
We already have search_pre_hashed.  This makes the APIs match better.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-01-15 13:21:27 -08:00
Jason Ekstrand 0d05d1226e nir/algebraic: Only replace an instruction once
Without the break, it was possible that an instruction would match multiple
expressions.  If this happened, you could end up trying to replace it
multiple times and get a segfault.  This makes it so that, after a
successful replacement, it moves on to the next instruction.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand 0f85310975 nir/vars_to_ssa: Use the copy lowering from lower_var_copies
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand d3636da902 nir: Add a pass for lowering copy instructions
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand 700ba5daaf nir/vars_to_ssa: Refactor get_deref_node
This refactor allows you to more easily get the deref node associated with
a given variable.  We then use that new functionality in the
deref_may_be_aliased function instead of creating a 1-element deref chain.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand 55b5058e69 nir: Rename lower_variables to lower_vars_to_ssa
The original name wasn't particularly descriptive.  This one indicates that
it actually gives you SSA values as opposed to the old pass which lowered
variables to registers.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand 4aa6162f6e nir/tex_instr: Add a nir_tex_src struct and dynamically allocate the src array
This solves a number of problems.  First is the ability to change the
number of sources that a texture instruction has.  Second, it solves the
delema that may occur if a texture instruction has more than 4 sources.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand dcb1acdea0 nir/validate: Only build in debug mode
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:24 -08:00
Jason Ekstrand 347ab2bf24 nir/lower_variables: Improve documentation
Additional description was added to a variety of places.  Also, we no
longer use the term "leaf" to describe fully-qualified direct derefs.
Instead, we simply use the term "direct" or spell it out completely.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand 8016fa39e1 nir/lower_variables: Use a for loop for get_deref_node
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand 0c0ca8b6ae nir: Use the actual FNV-1a hash for hashing derefs
We also switch to using loops rather than recursion.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand e4115ca9d8 nir: Make intrinsic flags into an enum
This should be much better for debugging as GDB will pick up on the fact
that it's an enum and actually tell you what you're looking at instead of
giving you some arbitrary hex value you have to go look up.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand ed13f4e716 nir: Use static inlines instead of macros for list getters
This should make debugging a lot easier as GDB handles static inlines much
better than macros.  Also, static inlines are typesafe.

Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand b95fae034f nir/variable: Remove the constant_value field
This was a left-over relic of GLSL IR that we aren't using for anything.
If we ever want that value again, we can add it back, but NIR constant
folding should be just as good as GLSL IR's if not better pretty soon, so
I'm not worried about it.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00
Jason Ekstrand 8599b30c67 nir: Add some documentation
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-01-15 07:20:23 -08:00