There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b. Instead of adding a huge
pile of additional patterns (including variation that include both ine
and i2b), just emit a != 0 instead of i2b(a).
I think that the changes to the unit tests weaken them slightly, but
perhaps that's okay?
No shader-db changes on any Intel platform. The GLSL paths use other
means to generate i2b operations, but the SPIR-V paths use nir_i2b.
Presumably since 4676b3d3dd (nir: Use nir_test_mask instead of
i2b(iand)), no fossil-db changes either.
v2: Use nir_ine_imm. Suggested by Jesse.
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
v2: Remove the ineg from the b2i in the ior pattern. Suggested by
Jason.
All Ivy Bridge and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19914441 -> 19914369 (<.01%)
instructions in affected programs: 63507 -> 63435 (-0.11%)
helped: 24 / HURT: 0
total cycles in shared programs: 853869766 -> 853851470 (<.01%)
cycles in affected programs: 10551542 -> 10533246 (-0.17%)
helped: 24 / HURT: 0
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141163061 -> 141092683 (-0.0%)
Instructions helped: 14103
Instructions hurt: 55
Cycles in all programs: 9132376195 -> 9133183045 (+0.0%)
Cycles helped: 13775
Cycles hurt: 380
Spills in all programs: 18286 -> 18284 (-0.0%)
Spills helped: 1
Fills in all programs: 30647 -> 30643 (-0.0%)
Fills helped: 1
Gained: 133
Lost: 130
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
This helps because it enables cmod propagation to do more.
The removed patterns involving b2i will be handled by other existing
patterns after the unary operations are removed.
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19914458 -> 19914441 (<.01%)
instructions in affected programs: 5456 -> 5439 (-0.31%)
helped: 17 / HURT: 0
total cycles in shared programs: 855302118 -> 853869766 (-0.17%)
cycles in affected programs: 327354347 -> 325921995 (-0.44%)
helped: 291 / HURT: 81
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141205979 -> 141205961 (-0.0%)
Instructions helped: 4
Instructions hurt: 3
SENDs in all programs: 7466919 -> 7466913 (-0.0%)
SENDs helped: 1
Cycles in all programs: 9133387327 -> 9133384475 (-0.0%)
Cycles helped: 3
Cycles hurt: 12
In the shader that was helped for sends, it appears that a NIR pass that
moves code out of loops was able to move 3 send operations outside a
loop after this change. I did not investigate further.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
This prevents ~400 shader-db regresssions and a handful of fossil-db
regressions after i2b is always lowered.
All Ivy Bridge and newer Intel platforms had similar results. (Ice Lake shown)
total cycles in shared programs: 855301494 -> 855302118 (<.01%)
cycles in affected programs: 52787 -> 53411 (1.18%)
helped: 4 / HURT: 5
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141206055 -> 141205979 (-0.0%)
Instructions helped: 14
Cycles in all programs: 9133376616 -> 9133387327 (+0.0%)
Cycles helped: 13
Cycles hurt: 3
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
No shader-db changes on any Intel platform.
All of the helped shaders were presumably regressed by 4676b3d3dd (nir:
Use nir_test_mask instead of i2b(iand)).
v2: Add some comments explaining why specific replacements are used. In
the umin pattern, only markup the first usage of 'b' in the source
pattern.
Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown)
Instructions in all programs: 141384970 -> 141200966 (-0.1%)
Instructions helped: 45842
Cycles in all programs: 9133648977 -> 9133282672 (-0.0%)
Cycles helped: 26812
Cycles hurt: 6025
Gained: 23
Lost: 135
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
A later commit adds a pattern
(('umin', ('iand', a, '#b(is_pos_power_of_two)'),
('iand', c, '#b(is_pos_power_of_two)')),
('iand', ('iand', a, b), ('iand', c, b))),
When I originally made that pattern, I copied and pasted the search to
the replacement as
(('umin', ('iand', a, '#b(is_pos_power_of_two)'),
('iand', c, '#b(is_pos_power_of_two)')),
('iand', ('iand', a, '#b(is_pos_power_of_two)'),
('iand', c, '#b(is_pos_power_of_two)'))),
The caused the variables in the replacement to be marked is_constant,
and that resulted in an assertion failure deep inside nir_search.
src/compiler/nir/nir_search.c:530: construct_value: Assertion `!var->is_constant' failed.
These extra validation rules catch this kind of error at compile time
rather than at run time.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
It's not valid to be copying input variables to temps when
inlining atomic memory, interpolateAt functions, etc. We got away
with this previously because tree grafting would clean up the
mess but we shouldn't depend on an optimisation to clean up
invalid IR. Also I hope to remove tree grafting in a follow up
merge request.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19890>
This allows us to drop some duplicate code that is already in the
ir_rvalue_visitor. It also allows us to better replace rvalues
and handle swizzle in the following patch without having to add
even more duplicate code.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19890>
link_varyings ignores precisions and can assign the same location to
variables with different precisions. nir_link_varying_precision should
check location_frac as well.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20113>
In try_fold_load_store when trying to extract const addition from
non-const offset source, we should take into account that there is
already a constant base offset, which should count towards the limit.
The issue was found in "Monster Hunter: World" running on Turnip.
Fixes: cac6f633b2
("nir/opt_offsets: Use nir_ssa_scalar to chase offset additions.")
Well, the issue was present before this commit but it made a lot
of changes in surrounding code.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20099>