Rhys Perry
14b8227083
nir: add some missing nir_alu_type_get_base_type
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436 >
2022-01-20 22:54:42 +00:00
Rhys Perry
f2fbba7920
nir/algebraic: optimize open-coded fmulz/ffmaz
...
This pattern will be found in future versions of D3D9 DXVK.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436 >
2022-01-20 22:54:42 +00:00
Rhys Perry
312a284980
nir/algebraic: add ignore_exact() wrapper
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436 >
2022-01-20 22:54:42 +00:00
Rhys Perry
7f05ea3793
nir: add nir_op_fmulz and nir_op_ffmaz
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436 >
2022-01-20 22:54:42 +00:00
Emma Anholt
cac6f633b2
nir/opt_offsets: Use nir_ssa_scalar to chase offset additions.
...
For nir_to_tgsi, I want to be able to fold into the base from a vector
load_const, which the ad-hoc scalar chasing couldn't handle.
r300:
total instructions in shared programs: 1278731 -> 1256502 (-1.74%)
instructions in affected programs: 457909 -> 435680 (-4.85%)
total flowcontrol in shared programs: 8316 -> 8313 (-0.04%)
flowcontrol in affected programs: 5 -> 2 (-60.00%)
total temps in shared programs: 213687 -> 213774 (0.04%)
temps in affected programs: 13140 -> 13227 (0.66%)
total consts in shared programs: 952850 -> 949929 (-0.31%)
consts in affected programs: 386352 -> 383431 (-0.76%)
Fixes : #5781
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Matt Turner <mattst88@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14309 >
2022-01-19 22:28:34 +00:00
Emma Anholt
645ca56425
nir/opt_offsets: Also apply the max offset to top-level constant folding.
...
nir_to_tgsi wants this for disabling folding into shared var accesses at
all.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14309 >
2022-01-19 22:28:34 +00:00
Emma Anholt
ec4b9909f0
nir/opt_offsets: Disable unsigned wrap checks on non-native-integers HW.
...
Since we don't have 32-bit ints, these checks for 32-bit unsigned wrapping
don't help and just reduce optimization opportunities (particularly for
DX9 addressing math).
Doesn't affect any current consumers.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14309 >
2022-01-19 22:28:34 +00:00
Emma Anholt
700d2fbd0a
nir: Add a .base field to nir_load_ubo_vec4.
...
This lets nir-to-tgsi fold the constant offset of addressing calculations
into the CONST[] reference, which is important for D3D9-era compatibility:
HW of that age has limited uniform space, and if we do the addressing math
as math in the shader for dynamic indexing, the nir_load_consts end up
taking up uniforms we don't have available.
r300:
total instructions in shared programs: 1279699 -> 1279167 (-0.04%)
instructions in affected programs: 134796 -> 134264 (-0.39%)
total instructions in shared programs: 1279699 -> 1279167 (-0.04%)
instructions in affected programs: 134796 -> 134264 (-0.39%)
total temps in shared programs: 213912 -> 213736 (-0.08%)
temps in affected programs: 2166 -> 1990 (-8.13%)
total consts in shared programs: 953237 -> 952973 (-0.03%)
consts in affected programs: 45980 -> 45716 (-0.57%)
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14309 >
2022-01-19 22:28:34 +00:00
Dave Airlie
ccbf700d6c
nir: remove gl.h include from nir headers.
...
This saves a lot of pointless gl.h includes across the board,
it moves the one place that needs GLenum into a separate file
only used in those passes that require it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605 >
2022-01-19 21:54:58 +00:00
Dave Airlie
1352e0ba0c
mesa/*: add a shader primitive type to get away from GL types.
...
This creates an internal shader_prim enum, I've fixed up most
users to use it instead of GL types.
don't store the enum in shader_info as it changes size, and confuses
other things.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605 >
2022-01-19 21:54:58 +00:00
Connor Abbott
9c9e8c3349
nir: Reorder ffma and fsub combining
...
It's relatively common to do something like "a * b - c", which on most
GPUs can be implemented in a single instruction. Before
opt_algebraic_late this will be something like
"fadd(fmul(a, b), fneg(c))", and we want to turn it info
"ffma(a, b, fneg(c))". But because the fsub pattern was first we
instead turned it into "fsub(fmul(a, b), c)". Fix this by reordering
them.
Selected shader-db results on freedreno:
total instructions in shared programs: 1561330 -> 1551619 (-0.62%)
instructions in affected programs: 780272 -> 770561 (-1.24%)
helped: 1941
HURT: 491
helped stats (abs) min: 1 max: 147 x̄: 7.98 x̃: 4
helped stats (rel) min: 0.07% max: 30.77% x̄: 4.36% x̃: 3.17%
HURT stats (abs) min: 1 max: 307 x̄: 11.76 x̃: 5
HURT stats (rel) min: 0.09% max: 18.71% x̄: 2.26% x̃: 1.38%
95% mean confidence interval for instructions value: -4.57 -3.41
95% mean confidence interval for instructions %-change: -3.21% -2.84%
Instructions are helped.
total nops in shared programs: 358926 -> 356263 (-0.74%)
nops in affected programs: 167116 -> 164453 (-1.59%)
helped: 1395
HURT: 859
helped stats (abs) min: 1 max: 108 x̄: 6.80 x̃: 3
helped stats (rel) min: 0.17% max: 100.00% x̄: 19.15% x̃: 10.57%
HURT stats (abs) min: 1 max: 307 x̄: 7.95 x̃: 3
HURT stats (rel) min: 0.00% max: 381.82% x̄: 20.04% x̃: 10.00%
95% mean confidence interval for nops value: -1.77 -0.59
95% mean confidence interval for nops %-change: -5.55% -2.87%
Nops are helped.
total non-nops in shared programs: 1202404 -> 1195356 (-0.59%)
non-nops in affected programs: 496682 -> 489634 (-1.42%)
helped: 1951
HURT: 265
helped stats (abs) min: 1 max: 39 x̄: 4.02 x̃: 3
helped stats (rel) min: 0.07% max: 15.38% x̄: 2.97% x̃: 1.96%
HURT stats (abs) min: 1 max: 22 x̄: 2.97 x̃: 2
HURT stats (rel) min: 0.05% max: 10.00% x̄: 1.14% x̃: 0.75%
95% mean confidence interval for non-nops value: -3.38 -2.99
95% mean confidence interval for non-nops %-change: -2.60% -2.36%
Non-nops are helped.
total systall in shared programs: 288317 -> 292975 (1.62%)
systall in affected programs: 87876 -> 92534 (5.30%)
helped: 388
HURT: 431
helped stats (abs) min: 1 max: 214 x̄: 14.39 x̃: 8
helped stats (rel) min: 0.25% max: 100.00% x̄: 22.12% x̃: 11.96%
HURT stats (abs) min: 1 max: 232 x̄: 23.77 x̃: 7
HURT stats (rel) min: 0.00% max: 1300.00% x̄: 51.71% x̃: 17.30%
95% mean confidence interval for systall value: 3.07 8.30
95% mean confidence interval for systall %-change: 9.49% 23.97%
Systall are HURT.
(The systall hurt is probably just due to having having fewer
instructions to hide latency with.)
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Acked-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14554 >
2022-01-18 17:44:50 +00:00
Qiang Yu
2cee73f0f7
nir: fix nir_tex_instr hash not count is_sparse field
...
This fixes nir_opt_cse miss replace a non-sparse tex instruction
with a sparse tex instruction and fail the nir_validate_shader().
Fixes: 3a7972f72a ("nir,spirv: add sparse texture fetches")
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14362 >
2022-01-18 16:10:35 +08:00
Rhys Perry
d95a0b52e4
nir/unsigned_upper_bound: don't follow 64-bit f2u32()
...
Fixes Doom Eternal crash.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Fixes: 72ac3f6026 ("nir: add nir_unsigned_upper_bound and nir_addition_might_overflow")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14555 >
2022-01-17 10:59:21 +00:00
Emma Anholt
f6ffefba3e
nir: Apply nir_opt_offsets to nir_intrinsic_load_uniform as well.
...
Doing this for ir3 required adding a struct for limits of how much base to
fold in (which NTT wants as well for its case of shared vars), otherwise
the later work to lower to the 1<<9 word limit would emit more
instructions.
The shader-db results are that sometimes the reduction in NIR instruction
count results in the fewer sampler prefetches due to the shader being
estimated to be shorter (dota2, nexuiz):
total instructions in shared programs: 8996651 -> 8996776 (<.01%)
total cat5 in shared programs: 86561 -> 86577 (0.02%)
Reviewed-by: Rob Clark <robdclark@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14023 >
2022-01-16 19:11:29 +00:00
Emma Anholt
b024102d7c
freedreno/ir3: Use nir_opt_offset for removing constant adds for shared vars.
...
Saves some work in carchase and manhattan31:
instructions in affected programs: 2842 -> 2818 (-0.84%)
nops in affected programs: 1131 -> 1105 (-2.30%)
non-nops in affected programs: 1236 -> 1238 (0.16%)
mov in affected programs: 57 -> 61 (7.02%)
dwords in affected programs: 2144 -> 2150 (0.28%)
cat0 in affected programs: 1195 -> 1169 (-2.18%)
cat1 in affected programs: 151 -> 155 (2.65%)
cat2 in affected programs: 142 -> 140 (-1.41%)
sstall in affected programs: 190 -> 178 (-6.32%)
(ss) in affected programs: 63 -> 63 (0.00%)
systall in affected programs: 532 -> 511 (-3.95%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14023 >
2022-01-16 19:11:29 +00:00
Daniel Schürmann
79a987ad2a
nir/opt_if: also merge break statements with ones after the branch
...
This optimizations turns
loop {
...
if (cond1) {
if (cond2) {
do_work_1();
break;
} else {
do_work_2();
}
do_work_3();
break;
} else {
...
}
}
into:
loop {
...
if (cond1) {
if (cond2) {
do_work_1();
} else {
do_work_2();
do_work_3();
}
break;
} else {
...
}
}
As this optimizations moves code into the NIF statement,
it re-iterates on the branch legs in case of success.
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7587 >
2022-01-13 02:30:32 +00:00
Daniel Schürmann
dad609d152
nir/opt_if: merge two break statements from both branch legs
...
This optimization turns
loop {
...
if (cond) {
do_work_1();
break;
} else {
do_work_2();
break;
}
}
into:
loop {
...
if (cond) {
do_work_1();
} else {
do_work_2();
}
break;
}
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7587 >
2022-01-13 02:30:32 +00:00
Daniel Schürmann
8a78706643
nir: refactor nir_opt_move
...
This patch is a rewrite of nir_opt_move.
Differently from the previous version, each instruction is checked
if it can be moved downwards and then inserted before the first user
of the definition. The advantage is that less insert operations are
performed, the original order is kept if two movable instructions have
the same first user, and instructions without user in the same block
are moved towards the end.
v2: Only return true if an instruction really changed the position.
Don't care for discards, this will be handled by another MR.
v3: fix self-referring phis and update according to nir_can_move_instr().
v4: use nir_can_move_instr() and nir_instr_ssa_def()
v5: deduplicate some code
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3657 >
2022-01-12 13:41:54 +00:00
Marcin Ślusarz
f286ecf906
nir: handle per-view clip/cull distances
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263 >
2022-01-11 22:45:23 +00:00
Marcin Ślusarz
0d6f83cbf1
nir: remove invalid assert affecting per-view variables
...
per-view variables can have arbitrary (but > 0) number of array levels
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263 >
2022-01-11 22:45:23 +00:00
Marcin Ślusarz
4fed440724
nir: add load_mesh_view_count and load_mesh_view_indices intrinsics
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14263 >
2022-01-11 22:45:23 +00:00
Rhys Perry
67fc7a1763
nir/uniform_atomics: fix is_atomic_already_optimized without workgroups
...
dims_needed would have been zero, so this would always returned true for
non-compute stages.
Also fix this for variable workgroup sizes.
Improves Shadow of the Tomb Raider RX 6800 performance by 10.6%, 11.5% and
4.5% (day_of_dead, jungle and paititi scenes).
radv_perf before and after:
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '62.913333333333334', 'min_fps': '62.81', 'max_fps': '62.98', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '64.02666666666666', 'min_fps': '63.93', 'max_fps': '64.11', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '74.81666666666666', 'min_fps': '74.72', 'max_fps': '74.88', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '69.57', 'min_fps': '69.52', 'max_fps': '69.63', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '71.41000000000001', 'min_fps': '71.31', 'max_fps': '71.5', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '78.16666666666667', 'min_fps': '78.07', 'max_fps': '78.23', 'interations': '3'}
Performance now seems slightly better than AMDVLK 2021.Q4.3:
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '68.02666666666666', 'min_fps': '67.95', 'max_fps': '68.16', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '70.24666666666667', 'min_fps': '69.83', 'max_fps': '70.51', 'interations': '3'}
{'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '77.19', 'min_fps': '77.18', 'max_fps': '77.2', 'interations': '3'}
fossil-db (Sienna Cichlid):
Totals from 40 (0.03% of 134621) affected shaders:
CodeSize: 62676 -> 65996 (+5.30%)
Instrs: 11372 -> 12111 (+6.50%)
Latency: 144122 -> 142848 (-0.88%); split: -1.09%, +0.21%
InvThroughput: 19686 -> 19847 (+0.82%); split: -0.06%, +0.87%
VClause: 304 -> 306 (+0.66%)
SClause: 603 -> 604 (+0.17%); split: -0.83%, +1.00%
Copies: 780 -> 858 (+10.00%)
Branches: 235 -> 329 (+40.00%)
PreSGPRs: 1072 -> 1083 (+1.03%); split: -0.37%, +1.40%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14407 >
2022-01-10 19:57:38 +00:00
Rhys Perry
b00138090e
nir/lower_shader_calls: fix store_scratch write_mask
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447 >
2022-01-10 19:01:04 +00:00
Danylo Piliaiev
b8d486f298
nir/algebraic: Separate has_dot_4x8 into has_sdot_4x8 and has_udot_4x8
...
Adreno GPUs has native instruction for unsigned and mixed dot_4x8 but
not signed dot product.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986 >
2022-01-10 13:20:39 +02:00
Gert Wollny
6f348d9c99
nir_lower_io: propagate the "invariant" flag to outputs
...
Ultimately this is consumed by nir-to-tgsi and needed by virglrenderer
to correctly declare output variables.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423 >
2022-01-07 16:35:43 +00:00
Emma Anholt
558a600629
nir_to_tgsi: Enable fdot_replicates flag.
...
That's how the TGSI math opcodes work.
This lets lower_vec_to_regs coalesce the DP output into the .yzw channels,
giving an impressive shader-db win on softpipe:
total instructions in shared programs: 2929840 -> 2794036 (-4.64%)
instructions in affected programs: 1651438 -> 1515634 (-8.22%)
total temps in shared programs: 372730 -> 332744 (-10.73%)
temps in affected programs: 118151 -> 78165 (-33.84%)
and a minor one on r300:
total instructions in shared programs: 51238 -> 51149 (-0.17%)
instructions in affected programs: 2621 -> 2532 (-3.40%)
total vinst in shared programs: 15655 -> 15618 (-0.24%)
vinst in affected programs: 468 -> 431 (-7.91%)
total temps in shared programs: 9838 -> 9828 (-0.10%)
temps in affected programs: 59 -> 49 (-16.95%)
and a bigger one on i915g:
total instructions in shared programs: 398064 -> 395901 (-0.54%)
instructions in affected programs: 29271 -> 27108 (-7.39%)
total tex_indirect in shared programs: 12261 -> 12233 (-0.23%)
tex_indirect in affected programs: 98 -> 70 (-28.57%)
LOST: 0
GAINED: 5
The r300 change is less impressive because it does some backend copy-prop,
but also because intermediate storage of DPs now takes a vec4 instead of a
scalar.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14200 >
2022-01-07 09:58:24 +00:00
Jesse Natalie
c09c0b351f
nir_opt_dead_cf: Remove dead ifs
...
An if that looks like:
if (x) { } else { }
That has no phis following it is dead. Currently these are only
removed by peephole select, but that means that 'x' is considered
used until that pass is run, which can make it difficult to apply
sane lowering in the case where loading 'x' requires complex or
expensive transformations, but 'x' is *really* unused.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14400 >
2022-01-07 05:15:48 +00:00
Alyssa Rosenzweig
24ea7cbb06
nir: Extend store_combined_output_pan
...
Extend store_combined_output_pan to take a dual source blend input in
addition to colour, depth, and stencil inputs. Use the last source for
this, and represent the type with the DEST_TYPE index. This is a hack
but there is no SRC2_TYPE and NIR doesn't seem to mind as long as we
know what we mean. This allows the backend to emit a combined "blend
render target #0 " instruction taking two sources.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13714 >
2022-01-02 01:12:05 +00:00
Alyssa Rosenzweig
5c168f09eb
nir: Eliminate store_combined_output_pan BASE
...
It's meaningless for this intrinsic and is just adding noise to the
lowering pass.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13714 >
2022-01-02 01:12:05 +00:00
Daniel Schürmann
17ecd0b31a
nir/opt_algebraic: lower fneg_hi/lo to fmul
...
This pattern, found in the FSR upscaling shader,
helps the vectorization efforts by keeping the
chain of vectorized instructions intact.
Radeon can optimize it to per-component fneg modifiers.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13688 >
2021-12-21 13:23:37 +01:00
Caio Oliveira
729df14e45
nir: Handle volatile semantics for loading HelperInvocation builtin
...
SPV_EXT_demote_to_helper_invocation added OpDemoteToHelperInvocation
operation to turn an invocation into a helper invocation, but the
value of HelperInvocation (a builtin from Input storage class)
couldn't be modified dynamically without breaking compatibility.
For the extension the operation OpIsHelperInvocation was added to get
the dynamic value.
For SPIR-V 1.6, the demote operation was promoted, but now to get the
dynamic value the shader must issue a load to HelperInvocation with
Volatile memory access semantics.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14209 >
2021-12-17 16:37:14 -08:00
Rhys Perry
a65285f54b
nir/opt_access: infer CAN_REORDER for global access
...
fossil-db (Sienna Cichlid):
Totals from 352 (0.26% of 134621) affected shaders:
VGPRs: 17240 -> 17272 (+0.19%)
CodeSize: 1753640 -> 1755744 (+0.12%); split: -0.04%, +0.16%
Instrs: 323190 -> 323801 (+0.19%); split: -0.03%, +0.22%
Latency: 3241205 -> 3241293 (+0.00%); split: -0.10%, +0.10%
InvThroughput: 568927 -> 568067 (-0.15%); split: -0.16%, +0.00%
SClause: 12109 -> 10444 (-13.75%); split: -13.76%, +0.01%
Copies: 27802 -> 27717 (-0.31%); split: -0.56%, +0.26%
PreSGPRs: 14699 -> 14690 (-0.06%)
PreVGPRs: 15793 -> 15799 (+0.04%)
fossil-db (Polaris10):
Totals from 348 (0.26% of 135668) affected shaders:
SGPRs: 21446 -> 21574 (+0.60%); split: -0.15%, +0.75%
VGPRs: 17004 -> 16996 (-0.05%); split: -0.09%, +0.05%
CodeSize: 1782796 -> 1783060 (+0.01%); split: -0.03%, +0.05%
Instrs: 337828 -> 337921 (+0.03%); split: -0.03%, +0.06%
Latency: 3726328 -> 3726721 (+0.01%); split: -0.09%, +0.10%
InvThroughput: 1307917 -> 1299841 (-0.62%); split: -0.62%, +0.00%
VClause: 4327 -> 4337 (+0.23%); split: -0.09%, +0.32%
SClause: 12178 -> 10529 (-13.54%); split: -13.55%, +0.01%
Copies: 40227 -> 40244 (+0.04%); split: -0.19%, +0.24%
PreSGPRs: 14946 -> 14937 (-0.06%)
PreVGPRs: 15637 -> 15643 (+0.04%)
fossil-db (Pitcairn):
Totals from 351 (0.26% of 135668) affected shaders:
SGPRs: 20382 -> 20619 (+1.16%); split: -0.79%, +1.95%
CodeSize: 1789732 -> 1789836 (+0.01%); split: -0.04%, +0.04%
MaxWaves: 1947 -> 1949 (+0.10%)
Instrs: 352274 -> 352318 (+0.01%); split: -0.04%, +0.06%
Latency: 4057829 -> 4058226 (+0.01%); split: -0.08%, +0.09%
InvThroughput: 1332245 -> 1317578 (-1.10%); split: -1.11%, +0.01%
VClause: 8581 -> 8583 (+0.02%); split: -0.13%, +0.15%
SClause: 12187 -> 10552 (-13.42%); split: -13.43%, +0.02%
Copies: 44906 -> 44915 (+0.02%); split: -0.24%, +0.26%
PreSGPRs: 16571 -> 16562 (-0.05%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14227 >
2021-12-17 18:51:24 +00:00
Rhys Perry
403ae3b48e
nir/algebraic: optimize more 64-bit imul with constant source
...
Two 64-bit shifts and an addition are usually faster than the several
multiplications nir_lower_int64 creates.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14227 >
2021-12-17 18:51:24 +00:00
Rhys Perry
c56cf157c5
nir/opt_load_store_vectorize: improve ssbo/global alias analysis
...
If either the global access or the ssbo access is restrict, they shouldn't
alias.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14227 >
2021-12-17 18:51:24 +00:00
Jason Ekstrand
deec7a590b
anv,nir: Use sample_pos_or_center in lower_wpos_center
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198 >
2021-12-17 16:02:16 +00:00
Jason Ekstrand
e8acc5a7ea
nir: Add a new sample_pos_or_center system value
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14198 >
2021-12-17 16:02:16 +00:00
Marcin Ślusarz
504e5cb4e8
nir/print: print const value near each use of const ssa variable
...
Without/with NIR_DEBUG=print,print_const:
-vec4 32 ssa_60 = fadd ssa_59, ssa_58
+vec4 32 ssa_60 = fadd ssa_59 /*(0xbf800000, 0x3e800000, 0x00000000, 0x3f800000) = (-1.000000, 0.250000, 0.000000, 1.000000)*/, ssa_58
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13880 >
2021-12-17 10:04:50 +00:00
Marcin Ślusarz
23f8f836e0
nir/print: group hex and float vectors together
...
Vectors are much easier to follow in this format, because developer cares
either about hex or float values, never both.
Before/after:
-vec4 32 ssa_222 = load_const (0x00000000 /* 0.000000 */, 0x00000000 /* 0.000000 */, 0x3f800000 /* 1.000000 */, 0x3f800000 /* 1.000000 */)
+vec4 32 ssa_222 = load_const (0x00000000, 0x00000000, 0x3f800000, 0x3f800000) = (0.000000, 0.000000, 1.000000, 1.000000)
-vec1 32 ssa_174 = load_const (0xbf800000 /* -1.000000 */)
+vec1 32 ssa_174 = load_const (0xbf800000 = -1.000000)
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13880 >
2021-12-17 10:04:50 +00:00
Marcin Ślusarz
d2b4051ea9
nir/print: move print_load_const_instr up
...
... to avoid forward declarations in future commit
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13880 >
2021-12-17 10:04:50 +00:00
Marcin Ślusarz
f7e63ec5d8
nir/print: compact printing of intrinsic indices
...
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14222 >
2021-12-16 09:43:13 +00:00
Marcin Ślusarz
d8fa625bb3
nir/print: expand printing of io semantics.gs_streams
...
gs_streams can be set for at least 2 other intrinsics.
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14222 >
2021-12-16 09:43:13 +00:00
Marcin Ślusarz
be25db9f0f
nir/print: simplify printing of IO semantics
...
Some of the tested flags are set for other intrinsics and they are
printed only when set, so there's no point in checking exact intrinsic
name or shader stage.
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14222 >
2021-12-16 09:43:13 +00:00
Caio Oliveira
b1156f23a2
Revert "nir: disable a NIR test due to undebuggable & locally unreproducible CI failures"
...
This reverts commit 6eb3fe2d4f . The root cause was
a bug in Meson when using the new gtest protocol and a test failed before producing
the XML file expected by it. This was fixed in later versions of Meson, so
we've bumped the required meson version to use that feature. The failure should
now be properly identified, so re-enabling the NIR test.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com >
Acked-by: Daniel Stone <daniels@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14204 >
2021-12-15 23:28:09 +00:00
Caio Oliveira
dcc7b19cae
nir: Initialize nir_register::divergent
...
Fixes: c7fc44f9eb ("nir/from_ssa: Respect and populate divergence information")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14205 >
2021-12-15 22:39:06 +00:00
Juan A. Suarez Romero
b8f6685bb5
nir: use call_once() to init debug variable
...
For data-race safety, let's use this function to ensure NIR debug is
initialized only once.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14057 >
2021-12-14 08:01:17 +00:00
Juan A. Suarez Romero
18c039b2e1
tgsi-to-nir: initialize NIR_DEBUG envvar
...
This envvar is initialized when creating a NIR shader, but it needs to
be used before. So initialize it here.
v2 (Juan):
- Use static variable for first initialization.
Fixes: f77ccdfb4a ("nir: add NIR_DEBUG envvar")
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14057 >
2021-12-14 08:01:17 +00:00
Jordan Justen
211e0606c7
nir/lower_tex: Add filter for tex offset lowering
...
Rework:
* Add callback_data (s-b Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14142 >
2021-12-13 16:56:23 -08:00
Samuel Pitoiset
be53b3d1bf
nir/lower_tex: add lower_lod_zero_width
...
On AMD, the hardware will return 0 for the raw LOD if the sum of the
absolute values of derivatives is 0 but Vulkan expects the value to
be in the [-inf, -22.0f] range.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14147 >
2021-12-13 10:00:07 +00:00
Marcin Ślusarz
87f03b1662
nir: limit lower_clip_cull_distance_arrays input to traditional stages
...
Compute, task, mesh & raytracing stages don't support
ClipDistance/CullDistance as input.
This change is not needed for correctness. Just something I stumbled on.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14149 >
2021-12-13 08:32:23 +00:00
Marek Olšák
2785141c16
nir: add nir_has_divergent_loop function
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13966 >
2021-12-11 20:07:35 +00:00