Roman Stratiienko
fdd6151612
nir: Add missing dependency in Android.nir.gen.mk
...
Fixes incremental build with Android
Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com >
Reviewed-by: Tapani Pälli <tapani.palli@intel.com >
2019-08-19 09:53:18 +03:00
Vasily Khoruzhick
0e394cda0d
glsl/standalone: init shader stage in init_gl_program()
...
Otherwise lima standalone compiler fails when trying to compile fragment
shader with:
lima_compiler: ../src/compiler/nir/nir.c:55: nir_shader_create: Assertion `si->stage == stage' failed
Reviewed-by: Qiang Yu <yuq825@gmail.com >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com >
2019-08-17 11:14:40 -07:00
Rhys Perry
0a790c3019
nir/algebraic: add a few masking-before-unpack optimizations
...
Helps some Dawn of War 3 and F1 2017 shaders with ACO:
Totals from affected shaders:
SGPRS: 2136 -> 2128 (-0.37 %)
VGPRS: 1624 -> 1628 (0.25 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 168068 -> 164332 (-2.22 %) bytes
LDS: 44 -> 44 (0.00 %) blocks
Max Waves: 222 -> 221 (-0.45 %)
Wait states: 0 -> 0 (0.00 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-08-16 12:13:01 +01:00
Erik Faye-Lund
544b088616
win32: unify strcasecmp definitions
...
There were two incompatible definitions of strcasecmp, which led to a
compiler warning. Let's clean this up by only leaving one of them, and
using that one all the time.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-08-15 20:23:44 +02:00
Erik Faye-Lund
c646cd4bac
nir: avoid warning when casting bogus pointer
...
This intentionally-bogus pointer generates a warning on some 64-bit
systems, so let's cast to a properly-sized integer first.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-08-15 20:23:35 +02:00
Erik Faye-Lund
b355eef920
glsl: fixup u64-warning
...
Similarly to the unsigned version, we need to first cast the result to a
suitable integer type before negating the number, otherwise we'll trigger a
warning.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-08-15 20:23:13 +02:00
Eric Engestrom
a3d6024199
meson: add nir tests to the compiler/nir test suite
...
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-08-14 22:17:06 +01:00
Ian Romanick
0e6581b87d
nir/algebraic: Reassociate shift-by-constant of shift-by-constant
...
v2: After some review discussion with Alyssa, the replacements now
correctly account for cases where (b+c) >= bitsize.
v3: Use a temporary to simplify the Python code quite a bit. Suggested
by Jason.
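The case split that v2 describes can be checked with a small model. The following Python sketch is illustrative only and is not the pattern from nir_opt_algebraic.py: `shl`, `merged_shl` and `check_all` are hypothetical helpers that model a fixed-width left shift with plain integers, assuming each individual shift count is below the bit size.

```python
def shl(a, n, bits):
    """Left shift of a `bits`-wide unsigned value; assumes 0 <= n < bits."""
    mask = (1 << bits) - 1
    return (a << n) & mask


def merged_shl(a, b, c, bits):
    """Single-shift replacement for shl(shl(a, b), c).

    When b + c reaches the bit size, every original bit has been shifted
    out, so the result is 0 rather than a shift by an out-of-range count.
    """
    total = b + c
    if total >= bits:
        return 0
    return shl(a, total, bits)


def check_all(bits=8):
    """Exhaustively compare both forms for a small bit size."""
    for a in range(1 << bits):
        for b in range(bits):
            for c in range(bits):
                if shl(shl(a, b, bits), c, bits) != merged_shl(a, b, c, bits):
                    return False
    return True
```

Calling `check_all(8)` compares the two forms for every 8-bit value and every pair of in-range shift counts.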
Haswell and all Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16251155 -> 16249576 (<.01%)
instructions in affected programs: 232627 -> 231048 (-0.68%)
helped: 547
HURT: 1
helped stats (abs) min: 1 max: 15 x̄: 2.89 x̃: 3
helped stats (rel) min: 0.04% max: 7.84% x̄: 1.14% x̃: 1.06%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12%
95% mean confidence interval for instructions value: -3.12 -2.65
95% mean confidence interval for instructions %-change: -1.20% -1.06%
Instructions are helped.
total cycles in shared programs: 365924392 -> 365372103 (-0.15%)
cycles in affected programs: 59207053 -> 58654764 (-0.93%)
helped: 497
HURT: 34
helped stats (abs) min: 1 max: 29300 x̄: 1118.16 x̃: 16
helped stats (rel) min: <.01% max: 10.59% x̄: 1.82% x̃: 1.82%
HURT stats (abs) min: 2 max: 424 x̄: 101.03 x̃: 63
HURT stats (rel) min: 0.07% max: 46.17% x̄: 4.72% x̃: 2.06%
95% mean confidence interval for cycles value: -1426.41 -653.77
95% mean confidence interval for cycles %-change: -1.66% -1.15%
Cycles are helped.
total spills in shared programs: 8870 -> 8871 (0.01%)
spills in affected programs: 104 -> 105 (0.96%)
helped: 0
HURT: 1
Ivy Bridge and all pre-Gen7 platforms had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11956236 -> 11955635 (<.01%)
instructions in affected programs: 94110 -> 93509 (-0.64%)
helped: 106
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 5.67 x̃: 4
helped stats (rel) min: 0.12% max: 4.71% x̄: 1.96% x̃: 0.76%
95% mean confidence interval for instructions value: -6.62 -4.72
95% mean confidence interval for instructions %-change: -2.27% -1.64%
Instructions are helped.
total cycles in shared programs: 179296340 -> 178788044 (-0.28%)
cycles in affected programs: 51009603 -> 50501307 (-1.00%)
helped: 82
HURT: 7
helped stats (abs) min: 5 max: 27820 x̄: 6199.00 x̃: 16
helped stats (rel) min: 0.30% max: 8.16% x̄: 2.58% x̃: 3.11%
HURT stats (abs) min: 2 max: 8 x̄: 3.14 x̃: 2
HURT stats (rel) min: 0.02% max: 1.40% x̄: 0.34% x̃: 0.10%
95% mean confidence interval for cycles value: -7649.38 -3773.00
95% mean confidence interval for cycles %-change: -2.71% -1.99%
Cycles are helped.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com > [v2]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
2019-08-14 11:15:37 -07:00
Ian Romanick
73aaeac0a3
nir/algebraic: Reassociate add-and-shift to be shift-and-add
...
A common thing in many shaders:
uniform vs { vec4 bones[...]; };
...
x = some_calculation(bones[i + 0]);
y = some_calculation(bones[i + 1]);
z = some_calculation(bones[i + 2]);
This turns into stuff like
vec1 32 ssa_12 = iadd ssa_11, ssa_0
vec1 32 ssa_13 = ishl ssa_12, ssa_3
vec1 32 ssa_14 = intrinsic load_ssbo (ssa_7, ssa_13) (16, 4, 0)
vec1 32 ssa_15 = iadd ssa_11, ssa_1
vec1 32 ssa_16 = ishl ssa_15, ssa_3
vec1 32 ssa_17 = intrinsic load_ssbo (ssa_7, ssa_16) (16, 4, 0)
vec1 32 ssa_18 = iadd ssa_11, ssa_2
vec1 32 ssa_19 = ishl ssa_18, ssa_3
vec1 32 ssa_20 = intrinsic load_ssbo (ssa_7, ssa_19) (16, 4, 0)
By reassociating the shift and the add, we can reduce this to
vec1 32 ssa_12 = ishl ssa_11, ssa_3
vec1 32 ssa_13 = iadd ssa_12, ssa_0
vec1 32 ssa_14 = intrinsic load_ssbo (ssa_7, ssa_13) (16, 4, 0)
vec1 32 ssa_16 = iadd ssa_12, ssa_1
vec1 32 ssa_17 = intrinsic load_ssbo (ssa_7, ssa_16) (16, 4, 0)
vec1 32 ssa_19 = iadd ssa_12, ssa_2
vec1 32 ssa_20 = intrinsic load_ssbo (ssa_7, ssa_19) (16, 4, 0)
v2: Add some commentary from Rhys Perry's nearly identical patch.
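The reassociation relies on a left shift distributing over wrapping integer addition. A minimal Python sketch of that identity follows; `shift_of_add` and `add_of_shift` are hypothetical names used only here, and plain masked integers stand in for 32-bit NIR values.

```python
def shift_of_add(a, k, s, bits=32):
    """Original form: add the constant first, then shift."""
    mask = (1 << bits) - 1
    return ((a + k) << s) & mask


def add_of_shift(a, k, s, bits=32):
    """Reassociated form: shift the variable once, then add a shifted constant.

    The shifted variable `base` does not depend on k, so it can be shared
    by every access that differs only in the constant.
    """
    mask = (1 << bits) - 1
    base = (a << s) & mask
    return (base + ((k << s) & mask)) & mask
```

Because `base` is the same for `i + 0`, `i + 1` and `i + 2`, only one shift remains per index variable, and each access needs just an add.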
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16277758 -> 16250704 (-0.17%)
instructions in affected programs: 1440284 -> 1413230 (-1.88%)
helped: 4920
HURT: 6
helped stats (abs) min: 1 max: 69 x̄: 5.50 x̃: 4
helped stats (rel) min: 0.10% max: 18.33% x̄: 2.21% x̃: 1.79%
HURT stats (abs) min: 1 max: 12 x̄: 4.50 x̃: 3
HURT stats (rel) min: 0.18% max: 3.23% x̄: 1.91% x̃: 2.55%
95% mean confidence interval for instructions value: -5.67 -5.31
95% mean confidence interval for instructions %-change: -2.26% -2.16%
Instructions are helped.
total cycles in shared programs: 367118526 -> 365895358 (-0.33%)
cycles in affected programs: 93504145 -> 92280977 (-1.31%)
helped: 2754
HURT: 1269
helped stats (abs) min: 1 max: 47039 x̄: 460.66 x̃: 16
helped stats (rel) min: <.01% max: 34.93% x̄: 3.77% x̃: 1.12%
HURT stats (abs) min: 1 max: 1500 x̄: 35.85 x̃: 9
HURT stats (rel) min: 0.01% max: 17.35% x̄: 2.18% x̃: 0.75%
95% mean confidence interval for cycles value: -387.31 -220.78
95% mean confidence interval for cycles %-change: -2.11% -1.68%
Cycles are helped.
LOST: 1
GAINED: 1
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
2019-08-14 11:15:32 -07:00
Andrii Simiklit
ff2225cf88
nir/find_array_copies: Reject copies with mismatched lengths
...
copy_deref for wildcard dereferences requires arrays of the same
length; otherwise it leads to a crash in optimizations
like 'nir_opt_copy_prop_vars', because these optimizations expect
'copy_deref' only for arrays with the same length.
v2: check was moved to 'try_match_deref' to fix aoa cases
(Jason Ekstrand <jason@jlekstrand.net >)
v3: -fixed comment
-the condition merged with other one
(Jason Ekstrand <jason@jlekstrand.net >)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111286
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com >
2019-08-14 18:11:31 +00:00
Ian Romanick
f2965fde9b
nir/range-analysis: Fail gracefully on non-SSA sources
...
Tested-by: Rob Clark <robdclark@chromium.org >
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-08-14 09:02:38 -07:00
Iago Toral Quiroga
48f5c34301
nir: add a pass to clamp gl_PointSize to a range
...
The OpenGL and OpenGL ES specs require that implementations clamp the
value of gl_PointSize to an implementation-dependent range. This pass
is useful for any GPU hardware that doesn't do this automatically
for either one or both sides of the range, such as V3D.
v2:
- Turn into a generic NIR pass (Eric).
- Make the pass work before lower I/O so we can use the deref variable
to inspect if we are writing to gl_PointSize (Eric).
- Make the pass take the range to clamp as parameter and allow it
to clamp to both sides of the range or just one side.
- Make the pass report progress.
v3:
- Fix copyright header (Eric)
- use fmin/fmax instead of bcsel to clamp (Eric)
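The clamping behaviour described above can be sketched in a few lines of Python. This is only an illustration of the min/max approach, not the pass itself: `clamp_point_size` is a hypothetical function, and using `None` to mean "do not clamp this side" is an assumption of this sketch rather than the interface of the NIR pass.

```python
def clamp_point_size(size, min_size=None, max_size=None):
    """Clamp a point size to [min_size, max_size].

    Either bound may be None, in which case that side is left unclamped.
    max() and min() play the role of fmax and fmin.
    """
    if min_size is not None:
        size = max(size, min_size)
    if max_size is not None:
        size = min(size, max_size)
    return size
```

For example, `clamp_point_size(100.0, 1.0, 64.0)` yields the upper bound, while `clamp_point_size(100.0, min_size=1.0)` leaves the value alone because no upper bound was given.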
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-08-13 09:44:12 +02:00
Rhys Perry
7740149852
nir: merge and extend nir_opt_move_comparisons and nir_opt_move_load_ubo
...
v2: add to series
v3: update Makefile.sources
v4: don't remove a comment and break statement
v4: use nir_can_move_instr
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-08-12 22:01:30 +00:00
Rhys Perry
da8ed68aca
nir: replace nir_move_load_const() with nir_opt_sink()
...
This is mostly the same as nir_move_load_const() but can also move
undef instructions, comparisons and some intrinsics (being careful with
loops).
v2: actually delete nir_move_load_const.c
v3: fix nir_opt_sink() usage in freedreno
v3: update Makefile.sources
v4: replace get_move_def with nir_can_move_instr and nir_instr_ssa_def
v4: handle if uses
v4: fix handling of nested loops
v5: re-write adjust_block_for_loops
v5: re-write setting of use_block for if uses
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Co-authored-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-08-12 22:01:30 +00:00
Marek Olšák
9c7746ceae
compiler: add SYSTEM_VALUE_TESS_LEVEL_OUTER/INNER_DEFAULT
...
TCS system values for internal passthru TCS, needed by radeonsi NIR support
Reviewed-by: Connor Abbott <cwabbott0@gmail.com >
2019-08-12 14:52:17 -04:00
Marek Olšák
1b881852bc
compiler: add SYSTEM_VALUE_USER_DATA_AMD
...
for internal radeonsi shaders
2019-08-12 14:52:17 -04:00
Marek Olšák
f0ccc5457a
compiler: add shader_info.cs.user_data_components_amd
2019-08-12 14:52:17 -04:00
Marek Olšák
028dbd35ba
compiler: add shader_info.vs.blit_sgprs_amd
...
for internal radeonsi shaders
2019-08-12 14:52:17 -04:00
Marek Olšák
9fb2fd0b43
compiler: add ACCESS_STREAM_CACHE_POLICY
...
radeonsi will use this.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com >
2019-08-12 14:52:17 -04:00
Kenneth Graunke
5180a222c0
glsl: Optimize the SoftFP64 shader when first creating it.
...
By optimizing the shader before inlining, we avoid having to redo this
work for each inlined copy of a function. It should also reduce the
memory consumption a bit.
This cuts the KHR-GL46.arrays_of_arrays_gl.SubroutineFunctionCalls2
runtime by 25% on my Icelake. That test compiles many shaders, which
contain large types (dmat4) and division (expensive operations).
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com >
Reviewed-by: Matt Turner <mattst88@gmail.com >
2019-08-12 10:42:32 -07:00
Caio Marcelo de Oliveira Filho
5ed4e31c08
spirv: Drop lower_workgroup_access_to_offsets
...
Intel drivers are not using this anymore, and turnip still doesn't have
compute shaders, so dropping this option won't make a difference.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Acked-by: Rob Clark <robdclark@chromium.org >
2019-08-10 22:15:35 -07:00
Rhys Perry
fd73ed1bd7
nir: add nir_lower_to_explicit()
...
v2: use glsl_type_size_align_func
v2: move get_explicit_type() to glsl_types.cpp/nir_types.cpp
v2: use align() instead of util_align_npot()
v2: pack arrays a bit tighter
v2: rename mem_* to field_*
v2: don't attempt to handle when struct offsets are already set
v2: use column_type() instead of recreating it
v2: use a branch instead of |= in nir_lower_to_explicit_impl()
v2: assign locations to variables and update shared_size and num_shared
v2: allow the pass to be used with nir_var_{shader_temp,function_temp}
v4: rebase
v5: add TODO
v5: small formatting changes
v5: remove incorrect assert in get_explicit_type()
v5: rename to nir_lower_vars_to_explicit_types
v5: correctly update progress when only variables are updated
v5: rename get_explicit_type() to get_explicit_shared_type()
v5: add comment explaining how get_explicit_shared_type() is different
v5: update cast strides
v6: update progress when lowering nir_var_function_temp variables
v6: formatting changes
v6: add more detailed documentation comment for get_explicit_shared_type
v6: rename get_explicit_shared_type to get_explicit_type_for_size_align
v7: fix comment in nir_lower_vars_to_explicit_types_impl()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com > (v5)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
2019-08-08 12:10:39 -05:00
Rhys Perry
8bd2e138f5
nir/lower_explicit_io: add nir_var_mem_shared support
...
v2: require nir_address_format_32bit_offset instead
v3: don't call nir_intrinsic_set_access() for shared atomics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
2019-08-08 12:10:39 -05:00
Erik Faye-Lund
75097114d9
spirv: fixup signature
...
This avoids a warning on some compilers, which complain about implicitly
casting the function pointer.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Fixes: d482a8f "spirv: Update the OpenCL.std.h header"
Acked-by: Eric Engestrom <eric@engestrom.ch >
2019-08-08 18:20:29 +02:00
Connor Abbott
e7fd90e8ef
nir/builder: Add nir_b2i
...
Same as nir_b2f but for integers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-06 18:03:10 -04:00
Pierre-Eric Pelloux-Prayer
a9ec718652
nir: add atomic_inc_wrap/atomic_dec_wrap image intrinsics
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-06 17:41:02 -04:00
Pierre-Eric Pelloux-Prayer
fc0a2e5d01
glsl: add EXT_shader_image_load_store new image functions
...
This extension has 2 functions that are missing from the ARB versions:
- imageAtomicIncWrap
- imageAtomicDecWrap
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-06 17:41:00 -04:00
Pierre-Eric Pelloux-Prayer
70a47fb032
glsl: add EXT_shader_image_load_store keywords to lexer
...
All of them already existed for ARB_shader_image_load_store.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-06 17:40:58 -04:00
Pierre-Eric Pelloux-Prayer
cfba168b6c
glsl: add size qualifiers from EXT_shader_image_load_store
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-06 17:40:56 -04:00
Pierre-Eric Pelloux-Prayer
cd45d09226
glsl: handle differences between ARB/EXT versions of shader_image_load_store
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-06 17:40:55 -04:00
Antia Puentes
954224b714
nir/spirv: Fix gl_BaseVertex for non-indexed draws for OpenGL
...
Lowers BaseVertex to the correct system value for OpenGL.
v2: use options->environment rather than adding a new flag to
spirv_to_nir_options
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
2019-08-06 09:11:27 -07:00
Jonathan Marek
b514f41183
glcpp: use pre-expansion line number for __LINE__
...
Fixes the following deqp tests:
dEQP-GLES2.functional.shaders.preprocessor.predefined_macros.line_2_*
I don't see the spec requiring this, but it seems to be better, as the
clang preprocessor for example has this behavior.
Signed-off-by: Jonathan Marek <jonathan@marek.ca >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
2019-08-06 11:27:04 +00:00
Ian Romanick
5544b2cbbd
nir/algebraic: Use value range analysis to eliminate useless unary ops
...
Sandy Bridge is the big winner because it lies at something of a
crossroads. It supports a fairly high OpenGL version, and it still has
the old style math box. The high OpenGL version means a lot more
shaders can run on it. The old style math box means extra moves are
necessary to resolve source modifiers on operands to complex math
instructions like COS, SQRT, and RCP.
v2: Remove a couple patterns that are now redundant.
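As a rough illustration of the idea, a unary operation becomes useless once the range of its operand is known. The Python sketch below is a toy model, not the patterns added by this commit: the tuple expression form, the range fact names and `simplify_fabs` are all hypothetical.

```python
# Hypothetical range facts for this sketch.
NON_NEGATIVE = "ge_zero"
UNKNOWN = "unknown"


def simplify_fabs(expr, ranges):
    """Drop fabs when the operand is known to be non-negative.

    `expr` is a tuple like ("fabs", "x"); `ranges` maps operand names to
    a range fact. Any other expression is returned unchanged.
    """
    if isinstance(expr, tuple) and len(expr) == 2 and expr[0] == "fabs":
        operand = expr[1]
        if ranges.get(operand, UNKNOWN) == NON_NEGATIVE:
            return operand
    return expr
```

With `ranges = {"x": NON_NEGATIVE}`, `simplify_fabs(("fabs", "x"), ranges)` returns the bare operand, which is the kind of source modifier that would otherwise need an extra move.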
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16282006 -> 16278207 (-0.02%)
instructions in affected programs: 174555 -> 170756 (-2.18%)
helped: 661
HURT: 0
helped stats (abs) min: 1 max: 36 x̄: 5.75 x̃: 3
helped stats (rel) min: 0.06% max: 23.68% x̄: 2.81% x̃: 1.94%
95% mean confidence interval for instructions value: -6.16 -5.34
95% mean confidence interval for instructions %-change: -3.02% -2.60%
Instructions are helped.
total cycles in shared programs: 367168597 -> 367134284 (<.01%)
cycles in affected programs: 1105276 -> 1070963 (-3.10%)
helped: 460
HURT: 150
helped stats (abs) min: 1 max: 568 x̄: 96.60 x̃: 82
helped stats (rel) min: 0.02% max: 32.50% x̄: 7.99% x̃: 4.27%
HURT stats (abs) min: 1 max: 901 x̄: 67.49 x̃: 39
HURT stats (rel) min: 0.07% max: 20.00% x̄: 4.90% x̃: 4.22%
95% mean confidence interval for cycles value: -65.68 -46.82
95% mean confidence interval for cycles %-change: -5.59% -4.05%
Cycles are helped.
Sandy Bridge
total instructions in shared programs: 10824272 -> 10802557 (-0.20%)
instructions in affected programs: 1237988 -> 1216273 (-1.75%)
helped: 8199
HURT: 0
helped stats (abs) min: 1 max: 41 x̄: 2.65 x̃: 2
helped stats (rel) min: 0.12% max: 20.00% x̄: 2.04% x̃: 1.73%
95% mean confidence interval for instructions value: -2.70 -2.59
95% mean confidence interval for instructions %-change: -2.07% -2.00%
Instructions are helped.
total cycles in shared programs: 154009894 -> 153843598 (-0.11%)
cycles in affected programs: 10650486 -> 10484190 (-1.56%)
helped: 4973
HURT: 1533
helped stats (abs) min: 1 max: 3904 x̄: 40.20 x̃: 20
helped stats (rel) min: 0.02% max: 41.72% x̄: 2.63% x̃: 1.67%
HURT stats (abs) min: 1 max: 453 x̄: 21.94 x̃: 8
HURT stats (rel) min: 0.02% max: 41.91% x̄: 1.54% x̃: 0.58%
95% mean confidence interval for cycles value: -28.02 -23.10
95% mean confidence interval for cycles %-change: -1.74% -1.56%
Cycles are helped.
LOST: 0
GAINED: 2
GM45 and Iron Lake had similar results. (Iron Lake shown)
total instructions in shared programs: 8135196 -> 8134888 (<.01%)
instructions in affected programs: 31920 -> 31612 (-0.96%)
helped: 169
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 1.82 x̃: 2
helped stats (rel) min: 0.43% max: 3.23% x̄: 1.23% x̃: 1.16%
95% mean confidence interval for instructions value: -2.01 -1.64
95% mean confidence interval for instructions %-change: -1.32% -1.15%
Instructions are helped.
total cycles in shared programs: 188575724 -> 188574092 (<.01%)
cycles in affected programs: 406840 -> 405208 (-0.40%)
helped: 169
HURT: 0
helped stats (abs) min: 4 max: 72 x̄: 9.66 x̃: 10
helped stats (rel) min: 0.07% max: 2.16% x̄: 0.57% x̃: 0.47%
95% mean confidence interval for cycles value: -10.72 -8.59
95% mean confidence interval for cycles %-change: -0.63% -0.50%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
2019-08-05 20:14:14 -07:00
Ian Romanick
8d14380971
nir/algebraic: Use value range analysis to convert fmin to fsat
...
All Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16297320 -> 16282006 (-0.09%)
instructions in affected programs: 2434498 -> 2419184 (-0.63%)
helped: 8091
HURT: 1
helped stats (abs) min: 1 max: 51 x̄: 1.89 x̃: 2
helped stats (rel) min: 0.04% max: 14.29% x̄: 0.98% x̃: 0.95%
HURT stats (abs) min: 7 max: 7 x̄: 7.00 x̃: 7
HURT stats (rel) min: 0.28% max: 0.28% x̄: 0.28% x̃: 0.28%
95% mean confidence interval for instructions value: -1.94 -1.85
95% mean confidence interval for instructions %-change: -0.99% -0.96%
Instructions are helped.
total cycles in shared programs: 367221624 -> 367168597 (-0.01%)
cycles in affected programs: 126409635 -> 126356608 (-0.04%)
helped: 5612
HURT: 1023
helped stats (abs) min: 1 max: 2332 x̄: 31.11 x̃: 16
helped stats (rel) min: <.01% max: 30.31% x̄: 1.69% x̃: 1.42%
HURT stats (abs) min: 1 max: 2372 x̄: 118.84 x̃: 16
HURT stats (rel) min: <.01% max: 46.98% x̄: 1.46% x̃: 0.35%
95% mean confidence interval for cycles value: -11.52 -4.46
95% mean confidence interval for cycles %-change: -1.26% -1.14%
Cycles are helped.
total spills in shared programs: 8868 -> 8870 (0.02%)
spills in affected programs: 28 -> 30 (7.14%)
helped: 0
HURT: 1
total fills in shared programs: 21903 -> 21904 (<.01%)
fills in affected programs: 42 -> 43 (2.38%)
helped: 0
HURT: 1
Haswell
total instructions in shared programs: 13353925 -> 13338728 (-0.11%)
instructions in affected programs: 2265850 -> 2250653 (-0.67%)
helped: 8127
HURT: 5
helped stats (abs) min: 1 max: 51 x̄: 1.88 x̃: 2
helped stats (rel) min: 0.04% max: 20.00% x̄: 1.13% x̃: 1.07%
HURT stats (abs) min: 5 max: 16 x̄: 9.00 x̃: 6
HURT stats (rel) min: 0.19% max: 0.52% x̄: 0.35% x̃: 0.28%
95% mean confidence interval for instructions value: -1.91 -1.83
95% mean confidence interval for instructions %-change: -1.15% -1.11%
Instructions are helped.
total cycles in shared programs: 375535444 -> 375536343 (<.01%)
cycles in affected programs: 131206582 -> 131207481 (<.01%)
helped: 5590
HURT: 1055
helped stats (abs) min: 1 max: 2844 x̄: 34.15 x̃: 16
helped stats (rel) min: <.01% max: 21.57% x̄: 2.08% x̃: 1.60%
HURT stats (abs) min: 1 max: 2487 x̄: 181.78 x̃: 21
HURT stats (rel) min: <.01% max: 40.66% x̄: 1.96% x̃: 0.37%
95% mean confidence interval for cycles value: -4.74 5.01
95% mean confidence interval for cycles %-change: -1.51% -1.37%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 23401 -> 23407 (0.03%)
spills in affected programs: 248 -> 254 (2.42%)
helped: 2
HURT: 5
total fills in shared programs: 34850 -> 34845 (-0.01%)
fills in affected programs: 383 -> 378 (-1.31%)
helped: 2
HURT: 5
Ivy Bridge
total instructions in shared programs: 11975423 -> 11968117 (-0.06%)
instructions in affected programs: 845703 -> 838397 (-0.86%)
helped: 4071
HURT: 0
helped stats (abs) min: 1 max: 51 x̄: 1.79 x̃: 1
helped stats (rel) min: 0.08% max: 8.21% x̄: 1.04% x̃: 0.93%
95% mean confidence interval for instructions value: -1.87 -1.71
95% mean confidence interval for instructions %-change: -1.06% -1.02%
Instructions are helped.
total cycles in shared programs: 179674318 -> 179635552 (-0.02%)
cycles in affected programs: 5100065 -> 5061299 (-0.76%)
helped: 2650
HURT: 611
helped stats (abs) min: 1 max: 900 x̄: 21.85 x̃: 16
helped stats (rel) min: <.01% max: 21.55% x̄: 2.39% x̃: 1.40%
HURT stats (abs) min: 1 max: 1841 x̄: 31.33 x̃: 6
HURT stats (rel) min: <.01% max: 58.71% x̄: 1.64% x̃: 0.37%
95% mean confidence interval for cycles value: -14.14 -9.64
95% mean confidence interval for cycles %-change: -1.75% -1.52%
Cycles are helped.
LOST: 3
GAINED: 7
Sandy Bridge
total instructions in shared programs: 10828844 -> 10824272 (-0.04%)
instructions in affected programs: 525678 -> 521106 (-0.87%)
helped: 2386
HURT: 0
helped stats (abs) min: 1 max: 51 x̄: 1.92 x̃: 2
helped stats (rel) min: 0.11% max: 7.96% x̄: 1.05% x̃: 0.94%
95% mean confidence interval for instructions value: -2.04 -1.80
95% mean confidence interval for instructions %-change: -1.08% -1.03%
Instructions are helped.
total cycles in shared programs: 154024591 -> 154009894 (<.01%)
cycles in affected programs: 4005766 -> 3991069 (-0.37%)
helped: 1245
HURT: 506
helped stats (abs) min: 1 max: 585 x̄: 21.07 x̃: 16
helped stats (rel) min: 0.02% max: 11.57% x̄: 1.98% x̃: 0.83%
HURT stats (abs) min: 1 max: 639 x̄: 22.81 x̃: 6
HURT stats (rel) min: 0.01% max: 26.21% x̄: 1.07% x̃: 0.26%
95% mean confidence interval for cycles value: -10.57 -6.21
95% mean confidence interval for cycles %-change: -1.23% -0.97%
Cycles are helped.
GM45 and Iron Lake had similar results. (Iron Lake shown)
total instructions in shared programs: 8137248 -> 8135196 (-0.03%)
instructions in affected programs: 148322 -> 146270 (-1.38%)
helped: 992
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 2.07 x̃: 2
helped stats (rel) min: 0.41% max: 9.73% x̄: 1.74% x̃: 1.51%
95% mean confidence interval for instructions value: -2.16 -1.98
95% mean confidence interval for instructions %-change: -1.80% -1.67%
Instructions are helped.
total cycles in shared programs: 188583424 -> 188575724 (<.01%)
cycles in affected programs: 4409620 -> 4401920 (-0.17%)
helped: 956
HURT: 6
helped stats (abs) min: 2 max: 168 x̄: 8.09 x̃: 8
helped stats (rel) min: 0.04% max: 6.76% x̄: 0.27% x̃: 0.18%
HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel) min: 0.10% max: 0.10% x̄: 0.10% x̃: 0.10%
95% mean confidence interval for cycles value: -8.41 -7.60
95% mean confidence interval for cycles %-change: -0.29% -0.25%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
2019-08-05 20:14:14 -07:00
Ian Romanick
b77070e293
nir/algebraic: Use value range analysis to eliminate tautological compares
...
It's only one application on one platform (Haswell) that's affected,
but spills and fills increase quite dramatically. :(
All Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16320850 -> 16297320 (-0.14%)
instructions in affected programs: 448012 -> 424482 (-5.25%)
helped: 1938
HURT: 0
helped stats (abs) min: 2 max: 264 x̄: 12.14 x̃: 10
helped stats (rel) min: 0.35% max: 43.75% x̄: 5.85% x̃: 5.38%
95% mean confidence interval for instructions value: -12.80 -11.48
95% mean confidence interval for instructions %-change: -5.99% -5.72%
Instructions are helped.
total cycles in shared programs: 367496943 -> 367221624 (-0.07%)
cycles in affected programs: 8557232 -> 8281913 (-3.22%)
helped: 1907
HURT: 26
helped stats (abs) min: 4 max: 12802 x̄: 147.21 x̃: 48
helped stats (rel) min: 0.03% max: 75.85% x̄: 5.55% x̃: 3.94%
HURT stats (abs) min: 4 max: 1870 x̄: 208.23 x̃: 20
HURT stats (rel) min: 0.16% max: 32.11% x̄: 8.31% x̃: 0.79%
95% mean confidence interval for cycles value: -165.38 -119.48
95% mean confidence interval for cycles %-change: -5.68% -5.04%
Cycles are helped.
LOST: 1
GAINED: 0
Haswell
total instructions in shared programs: 13374211 -> 13353925 (-0.15%)
instructions in affected programs: 349868 -> 329582 (-5.80%)
helped: 1669
HURT: 1
helped stats (abs) min: 1 max: 264 x̄: 12.57 x̃: 10
helped stats (rel) min: 0.12% max: 46.81% x̄: 6.86% x̃: 6.49%
HURT stats (abs) min: 700 max: 700 x̄: 700.00 x̃: 700
HURT stats (rel) min: 64.34% max: 64.34% x̄: 64.34% x̃: 64.34%
95% mean confidence interval for instructions value: -13.25 -11.04
95% mean confidence interval for instructions %-change: -7.01% -6.63%
Instructions are helped.
total cycles in shared programs: 375763544 -> 375535444 (-0.06%)
cycles in affected programs: 6932686 -> 6704586 (-3.29%)
helped: 1622
HURT: 48
helped stats (abs) min: 2 max: 12229 x̄: 148.31 x̃: 68
helped stats (rel) min: 0.06% max: 74.03% x̄: 5.94% x̃: 4.12%
HURT stats (abs) min: 3 max: 7451 x̄: 259.44 x̃: 41
HURT stats (rel) min: 0.05% max: 54.99% x̄: 8.52% x̃: 2.88%
95% mean confidence interval for cycles value: -159.86 -113.31
95% mean confidence interval for cycles %-change: -5.86% -5.18%
Cycles are helped.
total spills in shared programs: 23258 -> 23401 (0.61%)
spills in affected programs: 54 -> 197 (264.81%)
helped: 4
HURT: 2
total fills in shared programs: 34775 -> 34850 (0.22%)
fills in affected programs: 52 -> 127 (144.23%)
helped: 4
HURT: 1
LOST: 5
GAINED: 0
Ivy Bridge
total instructions in shared programs: 11996051 -> 11977964 (-0.15%)
instructions in affected programs: 346679 -> 328592 (-5.22%)
helped: 1508
HURT: 0
helped stats (abs) min: 2 max: 198 x̄: 11.99 x̃: 10
helped stats (rel) min: 0.26% max: 19.83% x̄: 5.73% x̃: 5.43%
95% mean confidence interval for instructions value: -12.65 -11.34
95% mean confidence interval for instructions %-change: -5.86% -5.60%
Instructions are helped.
total cycles in shared programs: 179891389 -> 179691339 (-0.11%)
cycles in affected programs: 7869479 -> 7669429 (-2.54%)
helped: 1485
HURT: 23
helped stats (abs) min: 1 max: 12615 x̄: 136.16 x̃: 54
helped stats (rel) min: 0.02% max: 71.84% x̄: 4.69% x̃: 3.49%
HURT stats (abs) min: 1 max: 403 x̄: 93.48 x̃: 6
HURT stats (rel) min: 0.04% max: 34.01% x̄: 8.68% x̃: 0.81%
95% mean confidence interval for cycles value: -154.59 -110.73
95% mean confidence interval for cycles %-change: -4.79% -4.19%
Cycles are helped.
Sandy Bridge
total instructions in shared programs: 10829247 -> 10828844 (<.01%)
instructions in affected programs: 21258 -> 20855 (-1.90%)
helped: 88
HURT: 0
helped stats (abs) min: 2 max: 17 x̄: 4.58 x̃: 5
helped stats (rel) min: 0.52% max: 3.92% x̄: 2.05% x̃: 2.21%
95% mean confidence interval for instructions value: -5.03 -4.13
95% mean confidence interval for instructions %-change: -2.21% -1.89%
Instructions are helped.
total cycles in shared programs: 154035437 -> 154024591 (<.01%)
cycles in affected programs: 430176 -> 419330 (-2.52%)
helped: 78
HURT: 10
helped stats (abs) min: 2 max: 4649 x̄: 143.06 x̃: 32
helped stats (rel) min: 0.05% max: 6.02% x̄: 2.03% x̃: 1.07%
HURT stats (abs) min: 3 max: 265 x̄: 31.30 x̃: 6
HURT stats (rel) min: 0.10% max: 8.67% x̄: 1.03% x̃: 0.21%
95% mean confidence interval for cycles value: -232.53 -13.97
95% mean confidence interval for cycles %-change: -2.13% -1.23%
Cycles are helped.
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8137402 -> 8137248 (<.01%)
instructions in affected programs: 2280 -> 2126 (-6.75%)
helped: 10
HURT: 0
helped stats (abs) min: 12 max: 19 x̄: 15.40 x̃: 15
helped stats (rel) min: 3.90% max: 11.73% x̄: 7.19% x̃: 6.95%
95% mean confidence interval for instructions value: -17.69 -13.11
95% mean confidence interval for instructions %-change: -8.99% -5.39%
Instructions are helped.
total cycles in shared programs: 188538716 -> 188583424 (0.02%)
cycles in affected programs: 69326 -> 114034 (64.49%)
helped: 0
HURT: 10
HURT stats (abs) min: 2068 max: 7686 x̄: 4470.80 x̃: 4870
HURT stats (rel) min: 27.20% max: 173.66% x̄: 69.55% x̃: 59.41%
95% mean confidence interval for cycles value: 2830.86 6110.74
95% mean confidence interval for cycles %-change: 39.18% 99.91%
Cycles are HURT.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
2019-08-05 20:14:13 -07:00
Ian Romanick
96fcb3f95b
nir/algebraic: Use value range analysis to eliminate tautological compares not used by if-statements
...
This just eliminates tautological / contradictory compares that are used
for bcsel and other non-if-statement cases. If-statements are not
affected because removing flow control can cause the i965 instruction
scheduler to create some very long live ranges resulting in unnecessary
spilling. This causes some shaders to fall off a performance cliff.
Since many small if-statements are already flattened to bcsel, this
optimization covers more than 68% of the possible cases (2417 shaders
helped for instructions on Skylake vs. 3554).
v2: Reorder and add whitespace to make the relationship between the
patterns more obvious. Suggested by Caio.
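The restriction to non-if uses can be pictured with a toy folding function. This Python sketch is an assumption-laden illustration and not the commit's patterns: the fact names, the `used_by_if` flag and `fold_compare_with_zero` are hypothetical, and NaN handling is ignored.

```python
def fold_compare_with_zero(op, fact, used_by_if=False):
    """Decide `x op 0` from a known range fact about x.

    Returns True or False when the comparison is tautological or
    contradictory, and None when it must be kept. Comparisons that feed
    an if-statement are always kept, mirroring the restriction above.
    """
    if used_by_if:
        return None
    table = {
        ("lt", "ge_zero"): False,  # x >= 0 contradicts x < 0
        ("ge", "ge_zero"): True,   # x >= 0 makes x >= 0 tautological
        ("lt", "lt_zero"): True,
        ("ge", "lt_zero"): False,
    }
    return table.get((op, fact))
```

A result of `None` means the compare is left in place, either because nothing is known about the operand or because folding it would remove flow control.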
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16333474 -> 16322028 (-0.07%)
instructions in affected programs: 438559 -> 427113 (-2.61%)
helped: 1765
HURT: 0
helped stats (abs) min: 1 max: 275 x̄: 6.48 x̃: 4
helped stats (rel) min: 0.20% max: 36.36% x̄: 4.07% x̃: 1.82%
95% mean confidence interval for instructions value: -6.87 -6.10
95% mean confidence interval for instructions %-change: -4.30% -3.84%
Instructions are helped.
total cycles in shared programs: 367608554 -> 367511103 (-0.03%)
cycles in affected programs: 8368829 -> 8271378 (-1.16%)
helped: 1541
HURT: 129
helped stats (abs) min: 1 max: 4468 x̄: 66.78 x̃: 39
helped stats (rel) min: 0.01% max: 45.69% x̄: 4.10% x̃: 2.17%
HURT stats (abs) min: 1 max: 973 x̄: 42.25 x̃: 10
HURT stats (rel) min: 0.02% max: 64.39% x̄: 2.15% x̃: 0.60%
95% mean confidence interval for cycles value: -64.90 -51.81
95% mean confidence interval for cycles %-change: -3.89% -3.36%
Cycles are helped.
total spills in shared programs: 8867 -> 8868 (0.01%)
spills in affected programs: 18 -> 19 (5.56%)
helped: 0
HURT: 1
total fills in shared programs: 21900 -> 21903 (0.01%)
fills in affected programs: 78 -> 81 (3.85%)
helped: 0
HURT: 1
All Gen6 and earlier platforms had similar results. (Sandy Bridge shown)
total instructions in shared programs: 10829877 -> 10829247 (<.01%)
instructions in affected programs: 30240 -> 29610 (-2.08%)
helped: 177
HURT: 0
helped stats (abs) min: 1 max: 15 x̄: 3.56 x̃: 3
helped stats (rel) min: 0.37% max: 17.39% x̄: 2.68% x̃: 1.94%
95% mean confidence interval for instructions value: -3.93 -3.18
95% mean confidence interval for instructions %-change: -3.04% -2.32%
Instructions are helped.
total cycles in shared programs: 154036580 -> 154035437 (<.01%)
cycles in affected programs: 352402 -> 351259 (-0.32%)
helped: 96
HURT: 28
helped stats (abs) min: 1 max: 128 x̄: 14.73 x̃: 6
helped stats (rel) min: 0.03% max: 24.00% x̄: 1.51% x̃: 0.46%
HURT stats (abs) min: 1 max: 117 x̄: 9.68 x̃: 4
HURT stats (rel) min: 0.03% max: 2.24% x̄: 0.43% x̃: 0.23%
95% mean confidence interval for cycles value: -13.40 -5.03
95% mean confidence interval for cycles %-change: -1.62% -0.53%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
2019-08-05 20:14:13 -07:00
Ian Romanick
fa116ce357
nir/range-analysis: Range tracking for ffma and flrp
...
A similar technique could be used for fmin3, fmax3, and fmid3.
This could be squashed with the previous commit. I kept it separate to
ease review.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
2019-08-05 20:14:13 -07:00
Ian Romanick
586602c5d9
nir/range-analysis: Range tracking for bcsel
...
This could be squashed with the previous commit. I kept it separate to
ease review.
v2: Add some missing cases. Use nir_src_is_const helper. Both
suggested by Caio. Use a table for mapping source ranges to a result
range.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
2019-08-05 20:14:13 -07:00
Ian Romanick
3009cbed50
nir/range-analysis: Tighten the range of fsat based on the range of its source
...
This could be squashed with the previous commit. I kept it separate to
ease review.
v2: Use a switch statement and add more comments. Both suggested by
Caio.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
2019-08-05 20:14:13 -07:00
Ian Romanick
405de7ccb6
nir/range-analysis: Rudimentary value range analysis pass
...
Most integer operations are omitted because dealing with integer
overflow is hard. A few things could be smarter with slightly more
tracking of the ranges of integer types (e.g., operands are Boolean,
operand values fit in 16 bits, etc.).
The changes to nir_search_helpers.h are included in this patch to
simplify reordering the changes to nir_opt_algebraic.py.
v2: Memoize range analysis results. Without this, some shaders appear
to get stuck in infinite loops.
v3: Rebase on many months of Mesa changes, including 1-bit Boolean
changes.
v4: Rebase on "nir: Drop imov/fmov in favor of one mov instruction".
v5: Use nir_alu_srcs_equal for detecting (a*a). Previously just the SSA
value was compared, and this incorrectly matched (a.x*a.y).
v6: Many code improvements including (but not limited to) better names,
more comments, and better use of helper functions. All suggested by
Caio. Rework the handling of several opcodes to use a table for mapping
source ranges to a result range. This change fixed a bug that caused
fmax(gt_zero, ge_zero) to be incorrectly recognized as ge_zero.
Slightly tighten the range of fmul by recognizing that x*x is gt_zero if
x is gt_zero. Add similar handling for -x*x.
v7: Use _______ in the tables as an alias for unknown. Suggested by
Caio.
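As a rough illustration of the table approach mentioned in v6, the sketch below derives an fmax table from first principles instead of writing it by hand. It is not the code from this patch: the sign-set model and the names `RANGES`, `NAME_OF`, `fmax_range`, and `FMAX_TABLE` are assumptions, and NaN is ignored. It does reproduce the corrected result that fmax(gt_zero, ge_zero) is gt_zero rather than ge_zero.

```python
from itertools import product

# Each named range is modelled as the set of signs (-1, 0, +1) a value may have.
RANGES = {
    "unknown": frozenset({-1, 0, 1}),
    "lt_zero": frozenset({-1}),
    "le_zero": frozenset({-1, 0}),
    "gt_zero": frozenset({1}),
    "ge_zero": frozenset({0, 1}),
    "ne_zero": frozenset({-1, 1}),
    "eq_zero": frozenset({0}),
}

# Reverse lookup: every non-empty subset of {-1, 0, 1} has exactly one name.
NAME_OF = {signs: name for name, signs in RANGES.items()}


def fmax_range(a, b):
    """Range of fmax(x, y) when x is in range a and y is in range b."""
    # The sign of max(x, y) is the max of the signs, because sign is monotone.
    signs = frozenset(max(x, y) for x, y in product(RANGES[a], RANGES[b]))
    return NAME_OF[signs]


# Table mapping a pair of source ranges to the result range.
FMAX_TABLE = {(a, b): fmax_range(a, b) for a, b in product(RANGES, repeat=2)}
```

Looking up `FMAX_TABLE[("gt_zero", "ge_zero")]` gives `"gt_zero"`: whichever operand wins, the result is at least as large as the strictly positive one.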
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
2019-08-05 20:14:13 -07:00
Ian Romanick
d24edb4b8c
nir/algebraic: Simplify some comparisons like a+constant < constant
...
v2: Remove unsafe integer versions of the optimizations. This change
had no effect on shader-db results. Suggested by Caio.
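To see why an integer version of this kind of rewrite is unsafe, consider 32-bit signed wraparound. The sketch below assumes the rewrite has the form a + c1 < c2 => a < c2 - c1; the commit message does not list the exact patterns, so that form and the helper names `wrap_i32`, `original`, and `rewritten` are illustrative only.

```python
def wrap_i32(x):
    """Wrap a Python int to a signed 32-bit two's-complement value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def original(a, c1, c2):
    """a + c1 < c2, with the addition wrapping at 32 bits."""
    return wrap_i32(a + c1) < c2


def rewritten(a, c1, c2):
    """a < c2 - c1, with the constant folded at 32 bits."""
    return a < wrap_i32(c2 - c1)


INT32_MAX = 2**31 - 1

# The two forms agree when nothing overflows...
agree_small = original(5, 1, 10) == rewritten(5, 1, 10)
# ...but disagree when a + c1 wraps around.
agree_overflow = original(INT32_MAX, 1, 0) == rewritten(INT32_MAX, 1, 0)
```

With a = INT32_MAX, c1 = 1, c2 = 0, the original expression wraps to a negative number and is true, while the rewritten one compares INT32_MAX against -1 and is false.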
All Gen6+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16333713 -> 16332631 (<.01%)
instructions in affected programs: 258112 -> 257030 (-0.42%)
helped: 1275
HURT: 407
helped stats (abs) min: 1 max: 7 x̄: 1.17 x̃: 1
helped stats (rel) min: 0.20% max: 8.33% x̄: 1.33% x̃: 0.86%
HURT stats (abs) min: 1 max: 2 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.11% max: 2.94% x̄: 0.98% x̃: 0.98%
95% mean confidence interval for instructions value: -0.70 -0.59
95% mean confidence interval for instructions %-change: -0.84% -0.70%
Instructions are helped.
total cycles in shared programs: 367596791 -> 367601268 (<.01%)
cycles in affected programs: 3420062 -> 3424539 (0.13%)
helped: 1553
HURT: 783
helped stats (abs) min: 1 max: 742 x̄: 24.36 x̃: 6
helped stats (rel) min: 0.05% max: 21.12% x̄: 1.47% x̃: 0.65%
HURT stats (abs) min: 1 max: 557 x̄: 54.04 x̃: 14
HURT stats (rel) min: 0.01% max: 33.66% x̄: 3.36% x̃: 1.43%
95% mean confidence interval for cycles value: -1.60 5.43
95% mean confidence interval for cycles %-change: -0.03% 0.33%
Inconclusive result (value mean confidence interval includes 0).
LOST: 0
GAINED: 2
Iron Lake
total instructions in shared programs: 8137992 -> 8137874 (<.01%)
instructions in affected programs: 17501 -> 17383 (-0.67%)
helped: 104
HURT: 2
helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1
helped stats (rel) min: 0.25% max: 2.63% x̄: 0.87% x̃: 0.72%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45%
95% mean confidence interval for instructions value: -1.22 -1.00
95% mean confidence interval for instructions %-change: -0.94% -0.76%
Instructions are helped.
total cycles in shared programs: 188540038 -> 188539650 (<.01%)
cycles in affected programs: 704574 -> 704186 (-0.06%)
helped: 125
HURT: 84
helped stats (abs) min: 2 max: 96 x̄: 6.45 x̃: 4
helped stats (rel) min: <.01% max: 3.47% x̄: 0.42% x̃: 0.25%
HURT stats (abs) min: 2 max: 58 x̄: 4.98 x̃: 4
HURT stats (rel) min: 0.01% max: 2.75% x̄: 0.36% x̃: 0.33%
95% mean confidence interval for cycles value: -3.20 -0.52
95% mean confidence interval for cycles %-change: -0.19% -0.03%
Cycles are helped.
GM45
total instructions in shared programs: 5008889 -> 5008830 (<.01%)
instructions in affected programs: 8824 -> 8765 (-0.67%)
helped: 52
HURT: 1
helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1
helped stats (rel) min: 0.25% max: 2.38% x̄: 0.86% x̃: 0.72%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45%
95% mean confidence interval for instructions value: -1.27 -0.95
95% mean confidence interval for instructions %-change: -0.96% -0.71%
Instructions are helped.
total cycles in shared programs: 128969426 -> 128969128 (<.01%)
cycles in affected programs: 399798 -> 399500 (-0.07%)
helped: 74
HURT: 30
helped stats (abs) min: 2 max: 22 x̄: 6.76 x̃: 6
helped stats (rel) min: <.01% max: 1.83% x̄: 0.46% x̃: 0.29%
HURT stats (abs) min: 2 max: 58 x̄: 6.73 x̃: 6
HURT stats (rel) min: 0.06% max: 2.75% x̄: 0.42% x̃: 0.21%
95% mean confidence interval for cycles value: -4.60 -1.14
95% mean confidence interval for cycles %-change: -0.32% -0.08%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
2019-08-05 20:14:13 -07:00
Ian Romanick
7c64cbf49d
nir/algebraic: Recognize (a < 0 || 0 < b) as min(a, -b) < 0
...
Similar to commit 97e6c1b9 and f5cf74d8ba .
First apply 0 < b => -b < 0 to get (a < 0 || -b < 0), then apply some
pre-existing rules to get min(a, -b) < 0.
v2: Substantially update the comment explaining the use of is_used_once
and the duplication of patterns. Suggested by Caio. Also, while flt
and fge are not commutative, ior and iand are. Half of the original
patterns were redundant, so delete them. As alternate justification for
deleting them, fmin(a, -b) < 0 <=> 0 < fmax(-a, b). Proof left as an
exercise for the reader.
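The equivalences above can be spot-checked numerically. This is a sanity check over a handful of sample values rather than a proof, and it uses Python's built-in `min` and `max` as stand-ins for fmin and fmax; NaN inputs are deliberately left out. The function names are made up for this sketch.

```python
import itertools
import math

# Sample values covering negatives, both zeros, positives, and infinities.
SAMPLES = [-math.inf, -2.5, -1.0, -0.0, 0.0, 1.0, 2.5, math.inf]


def original(a, b):
    """The source pattern: a < 0 || 0 < b."""
    return a < 0 or 0 < b


def via_fmin(a, b):
    """The replacement: fmin(a, -b) < 0."""
    return min(a, -b) < 0


def via_fmax(a, b):
    """The mirrored form: 0 < fmax(-a, b)."""
    return 0 < max(-a, b)


def all_agree():
    """True if all three forms agree on every pair of sample values."""
    return all(
        original(a, b) == via_fmin(a, b) == via_fmax(a, b)
        for a, b in itertools.product(SAMPLES, repeat=2)
    )
```

The reasoning behind the check: min(a, -b) is negative exactly when a or -b is negative, and max(-a, b) is positive exactly when -a or b is positive, which are the same two conditions.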
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16333789 -> 16333713 (<.01%)
instructions in affected programs: 11424 -> 11348 (-0.67%)
helped: 32
HURT: 0
helped stats (abs) min: 1 max: 7 x̄: 2.38 x̃: 2
helped stats (rel) min: 0.20% max: 1.67% x̄: 0.76% x̃: 0.69%
95% mean confidence interval for instructions value: -3.03 -1.72
95% mean confidence interval for instructions %-change: -0.89% -0.62%
Instructions are helped.
total cycles in shared programs: 367598295 -> 367596791 (<.01%)
cycles in affected programs: 141414 -> 139910 (-1.06%)
helped: 23
HURT: 6
helped stats (abs) min: 3 max: 386 x̄: 72.52 x̃: 20
helped stats (rel) min: 0.15% max: 4.86% x̄: 1.01% x̃: 0.76%
HURT stats (abs) min: 4 max: 88 x̄: 27.33 x̃: 12
HURT stats (rel) min: 0.22% max: 3.95% x̄: 1.08% x̃: 0.59%
95% mean confidence interval for cycles value: -93.51 -10.21
95% mean confidence interval for cycles %-change: -1.10% -0.05%
Cycles are helped.
Sandy Bridge
total instructions in shared programs: 10830836 -> 10830779 (<.01%)
instructions in affected programs: 6895 -> 6838 (-0.83%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 4.75 x̃: 1
helped stats (rel) min: 0.14% max: 1.61% x̄: 0.65% x̃: 0.33%
95% mean confidence interval for instructions value: -8.46 -1.04
95% mean confidence interval for instructions %-change: -1.03% -0.27%
Instructions are helped.
total cycles in shared programs: 154028477 -> 154032740 (<.01%)
cycles in affected programs: 178433 -> 182696 (2.39%)
helped: 3
HURT: 9
helped stats (abs) min: 3 max: 20 x̄: 11.00 x̃: 10
helped stats (rel) min: 0.07% max: 0.20% x̄: 0.12% x̃: 0.09%
HURT stats (abs) min: 27 max: 1415 x̄: 477.33 x̃: 262
HURT stats (rel) min: 0.22% max: 6.45% x̄: 2.49% x̃: 1.76%
95% mean confidence interval for cycles value: 28.68 681.82
95% mean confidence interval for cycles %-change: 0.37% 3.30%
Cycles are HURT.
Iron Lake
total instructions in shared programs: 8137966 -> 8137992 (<.01%)
instructions in affected programs: 3281 -> 3307 (0.79%)
helped: 0
HURT: 6
HURT stats (abs) min: 3 max: 7 x̄: 4.33 x̃: 3
HURT stats (rel) min: 0.63% max: 1.01% x̄: 0.76% x̃: 0.64%
95% mean confidence interval for instructions value: 2.17 6.50
95% mean confidence interval for instructions %-change: 0.56% 0.96%
Instructions are HURT.
total cycles in shared programs: 188539386 -> 188540038 (<.01%)
cycles in affected programs: 103826 -> 104478 (0.63%)
helped: 0
HURT: 7
HURT stats (abs) min: 16 max: 218 x̄: 93.14 x̃: 80
HURT stats (rel) min: 0.14% max: 0.95% x̄: 0.53% x̃: 0.46%
95% mean confidence interval for cycles value: 10.26 176.02
95% mean confidence interval for cycles %-change: 0.24% 0.81%
Cycles are HURT.
GM45
total instructions in shared programs: 5008876 -> 5008889 (<.01%)
instructions in affected programs: 1645 -> 1658 (0.79%)
helped: 0
HURT: 3
HURT stats (abs) min: 3 max: 7 x̄: 4.33 x̃: 3
HURT stats (rel) min: 0.63% max: 1.00% x̄: 0.76% x̃: 0.63%
total cycles in shared programs: 128968950 -> 128969426 (<.01%)
cycles in affected programs: 64854 -> 65330 (0.73%)
helped: 0
HURT: 4
HURT stats (abs) min: 18 max: 218 x̄: 119.00 x̃: 120
HURT stats (rel) min: 0.14% max: 0.95% x̄: 0.60% x̃: 0.66%
95% mean confidence interval for cycles value: -62.92 300.92
95% mean confidence interval for cycles %-change: -0.05% 1.26%
Inconclusive result (value mean confidence interval includes 0).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
2019-08-05 20:14:13 -07:00
Ian Romanick
92b75c126b
nir/algebraic: Replace checks that a value is between (or not) [0, 1]
...
v2: Add an extra line to one of the proofs. Suggested by Caio.
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16329772 -> 16329427 (<.01%)
instructions in affected programs: 41980 -> 41635 (-0.82%)
helped: 110
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 3.14 x̃: 2
helped stats (rel) min: 0.19% max: 5.56% x̄: 1.12% x̃: 0.94%
95% mean confidence interval for instructions value: -4.10 -2.17
95% mean confidence interval for instructions %-change: -1.28% -0.96%
Instructions are helped.
total cycles in shared programs: 367551273 -> 367549979 (<.01%)
cycles in affected programs: 492462 -> 491168 (-0.26%)
helped: 76
HURT: 25
helped stats (abs) min: 1 max: 400 x̄: 42.86 x̃: 12
helped stats (rel) min: 0.06% max: 10.72% x̄: 1.23% x̃: 0.75%
HURT stats (abs) min: 2 max: 730 x̄: 78.52 x̃: 16
HURT stats (rel) min: 0.17% max: 6.89% x̄: 2.08% x̃: 1.23%
95% mean confidence interval for cycles value: -37.79 12.16
95% mean confidence interval for cycles %-change: -0.90% 0.07%
Inconclusive result (value mean confidence interval includes 0).
LOST: 0
GAINED: 2
Sandy Bridge
total instructions in shared programs: 10831115 -> 10830836 (<.01%)
instructions in affected programs: 37830 -> 37551 (-0.74%)
helped: 70
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 3.99 x̃: 2
helped stats (rel) min: 0.33% max: 7.14% x̄: 1.21% x̃: 0.97%
95% mean confidence interval for instructions value: -5.47 -2.50
95% mean confidence interval for instructions %-change: -1.49% -0.92%
Instructions are helped.
total cycles in shared programs: 154029323 -> 154028477 (<.01%)
cycles in affected programs: 247909 -> 247063 (-0.34%)
helped: 52
HURT: 6
helped stats (abs) min: 2 max: 254 x̄: 25.81 x̃: 4
helped stats (rel) min: 0.07% max: 4.39% x̄: 0.81% x̃: 0.19%
HURT stats (abs) min: 4 max: 403 x̄: 82.67 x̃: 8
HURT stats (rel) min: 0.18% max: 1.60% x̄: 0.71% x̃: 0.53%
95% mean confidence interval for cycles value: -34.83 5.65
95% mean confidence interval for cycles %-change: -0.98% -0.32%
Inconclusive result (value mean confidence interval includes 0).
Iron Lake
total instructions in shared programs: 8138007 -> 8137966 (<.01%)
instructions in affected programs: 4060 -> 4019 (-1.01%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.32 x̃: 1
helped stats (rel) min: 0.68% max: 8.33% x̄: 1.45% x̃: 0.90%
95% mean confidence interval for instructions value: -1.50 -1.15
95% mean confidence interval for instructions %-change: -2.11% -0.79%
Instructions are helped.
total cycles in shared programs: 188539492 -> 188539386 (<.01%)
cycles in affected programs: 26280 -> 26174 (-0.40%)
helped: 25
HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 4.24 x̃: 4
helped stats (rel) min: 0.08% max: 2.11% x̄: 0.54% x̃: 0.50%
95% mean confidence interval for cycles value: -5.08 -3.40
95% mean confidence interval for cycles %-change: -0.70% -0.37%
Cycles are helped.
GM45
total instructions in shared programs: 5008897 -> 5008876 (<.01%)
instructions in affected programs: 2096 -> 2075 (-1.00%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.31 x̃: 1
helped stats (rel) min: 0.68% max: 7.69% x̄: 1.41% x̃: 0.89%
95% mean confidence interval for instructions value: -1.57 -1.06
95% mean confidence interval for instructions %-change: -2.32% -0.49%
Instructions are helped.
total cycles in shared programs: 128969020 -> 128968950 (<.01%)
cycles in affected programs: 18490 -> 18420 (-0.38%)
helped: 15
HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 4.67 x̃: 4
helped stats (rel) min: 0.08% max: 2.11% x̄: 0.51% x̃: 0.48%
95% mean confidence interval for cycles value: -6.03 -3.30
95% mean confidence interval for cycles %-change: -0.78% -0.24%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
2019-08-05 20:14:13 -07:00
Eric Engestrom
178811d8f6
meson: drop unused dep_{thread,dl}
...
Unused as of last commit.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com >
Acked-by: Eric Anholt <eric@anholt.net >
Tested-by: Vinson Lee <vlee@freedesktop.org >
2019-08-03 00:08:37 +00:00
Eric Engestrom
d2d85b950d
meson: replace libmesa_util with idep_mesautil
...
This automates the include_directories and dependencies tracking so that
all users of libmesa_util don't need to add them manually.
Next commit will remove the ones that were only added for that reason.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com >
Acked-by: Eric Anholt <eric@anholt.net >
Tested-by: Vinson Lee <vlee@freedesktop.org >
2019-08-03 00:08:37 +00:00
Connor Abbott
f41516bdb5
nir/find_array_copies: Reject copies with mismatched type
...
When we detect a scalar/vector copy through load_deref/store_deref, we
have to be careful since those can bitcast an int to a float and
vice-versa even though copy_deref can't.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111251
Fixes: 156306e5e6 ("nir/find_array_copies: Handle wildcards and overlapping copies")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
2019-08-02 10:34:29 +02:00
Jason Ekstrand
70dc017aec
nir: Stop whacking gl_FrontFacing to a system value
...
We have a cap bit for gallium and a GLSL compiler flag to control this.
Just trust what GLSL gives us and stop forcing it. In order for this to
be safe, we have to advertise another cap in some of the gallium
drivers.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-08-01 21:59:37 +00:00
Jason Ekstrand
078dcb7ccd
nir/lower_io: Add an option to lower 64-bit varyings
...
Reviewed-by: Matt Turner <mattst88@gmail.com >
2019-07-31 18:14:09 -05:00
Dave Airlie
7ad6ec80d9
nir: use common deref has indirect code in scratch lowering.
...
This doesn't seem to need its own copy here.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
2019-08-01 08:32:12 +10:00
Eric Engestrom
5d7bcac4e7
nir: remove explicit nir_intrinsic_index_flag values
...
These were left after a rebase and happen to make
NIR_INTRINSIC_SWIZZLE_MASK == NIR_INTRINSIC_SRC_ACCESS, which is how it
was noticed.
Fixes: 6f20643b47 ("nir: Allow qualifiers on copy_deref and image instructions")
Cc: Connor Abbott <cwabbott0@gmail.com >
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com >
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-07-31 23:28:20 +01:00