When the workgroup is 1 dimensional, simply use a vec3
filled with zeroes and the local invocation index.
This is is better than lower_id_to_index + constant folding,
because this way we don't leave behind extra ALU instrs.
Note, this is relevant to mesh shaders on RDNA2 because
it enables us to better detect cross-invocation output
access.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18464>
This also prevents some small regressions in "glsl: remove GLSL IR
inverse comparison optimisations".
shader-db results:
All Sandy Bridge and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19941025 -> 19940805 (<.01%)
instructions in affected programs: 52431 -> 52211 (-0.42%)
helped: 188 / HURT: 6
total cycles in shared programs: 858451784 -> 858431633 (<.01%)
cycles in affected programs: 2119134 -> 2098983 (-0.95%)
helped: 183 / HURT: 12
LOST: 2
GAINED: 0
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8364668 -> 8364670 (<.01%)
instructions in affected programs: 753 -> 755 (0.27%)
helped: 2 / HURT: 4
total cycles in shared programs: 248752572 -> 248752238 (<.01%)
cycles in affected programs: 87290 -> 86956 (-0.38%)
helped: 2 / HURT: 4
fossil-db results:
Skylake, Ice Lake, and Tiger Lake had similar results. (Ice Lake shown)
Instructions in all programs: 144909184 -> 144909130 (-0.0%)
Instructions helped: 6
Cycles in all programs: 9138641740 -> 9138640984 (-0.0%)
Cycles helped: 8
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18006>
Ever since 4246c2869c and 7d85dc4f35 loop unrolling can no
longer depend on inot being eliminated from the loop
terminator condition so we need to be able to handle it.
Here we simply check to see if the inot contains a simple
terminator condition we previously handled. We also update
the previous users of this function to use a newly name
copy of the previous behaviour
nir_is_terminator_condition_with_two_inputs().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18006>
A task shader must use this instruction to specify the dimensions
of the launched mesh shader workgroups.
It is a terminating instruction.
When the task shader doesn't have the optional payload, use the
pre-existing launch_mesh_workgroups intrinsics.
When the task shader has a payload, use a new
launch_mesh_workgroups_with_payload_deref intrinsics which has
a deref that refers to the payload variable.
We also add this new intrinsic to nir_lower_io which lowers this
to the pre-existing explicit intrinsic.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18366>
This reverts commit 6f25d45877, replacing it
with GLSL_PRECISION_MEDIUM. The previous commit ended up not being the
right approach, as it affected only nir vars for spirv phis and not other
nir vars, and we want a tool that does both. The new
nir_lower_mediump_vars pass can do that for you.
No fossil-db change for my angle fossils run on radv.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18259>
SPIRV and GLSL are reasonable at converting ALU ops to mediump, but
variable storage would be wrapped in a 2f32/2mp on store/load, and if
nir_vars_to_ssa doesn't make that storage go away then you'd have extra
conversions. For compute shader shared mem, you'd waste memory too.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18259>
Take this real-world (trimmed) shader:
precision highp float;
in lowp vec4 var_varVertexColor;
layout(location = 0) out vec4 out_FragColor0;
void main() {
vec4 textureColor0 = vec4(1.000000e+00, 0.000000e+00, 0.000000e+00, 1.000000e+00);
vec3 color = vec3(1.000000e+00, 1.000000e+00, 1.000000e+00);
vec4 outColor = vec4(vec3((color).rgb), 1.000000e+00);
(outColor *= vec4(var_varVertexColor));
(out_FragColor0 = outColor);
}
After opts, it's just a store from input to output. If we decide to lower
the input to 16-bit, then as long as the driver can handle 16-bit outputs,
it would be a good idea to demote the output and save the conversions.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18003>
Soon we'll be allocating instructions out of a per-shader pool, which
means that if we don't free too many instructions during the main
optimization loop, the final nir_sweep() call will create holes which
can't be filled. By freeing instructions more aggressively, we can
allocate more instructions from the freelist which will reduce the final
memory usage.
Modified from Connor Abbott's original patch to rebase on top of
refactored DCE and so that the use-after-free in nir_algebraic_impl() is
fixed.
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12910>
This fixes the use of an uninitialzed value:
Conditional jump or move depends on uninitialised value(s)
bcmp (vg_replace_strmem.c:1203)
_mesa_add_sized_state_reference (prog_parameter.c:434)
st_nir_assign_uniform_locations(gl_context*, gl_program*, nir_shader*) (st_glsl_to_nir.cpp:209)
st_finalize_nir (st_glsl_to_nir.cpp:1041)
by 0x58271B9: st_glsl_to_nir_post_opts(st_context*, gl_program*, gl_shader_program*) (st_glsl_to_nir.cpp:571)
...
Uninitialised value was created by a heap allocation
malloc (vg_replace_malloc.c:381)
ralloc_size (ralloc.c:114)
ralloc_array_size (ralloc.c:218)
deref_offset_var (nir_lower_atomics_to_ssbo.c:47)
lower_instr (nir_lower_atomics_to_ssbo.c:111)
nir_lower_atomics_to_ssbo (nir_lower_atomics_to_ssbo.c:204)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18227>
Warning messages:
../src/compiler/glsl/tests/list_iterators.cpp:68:1: warning: 'InstantiateTestCase_P_IsDeprecated' is deprecated: INSTANTIATE_TEST_CASE_P is deprecated, please use INSTANTIATE_TEST_SUITE_P [-Wdeprecated-declarations]
../src/compiler/glsl/tests/list_iterators.cpp:187:1: warning: 'InstantiateTestCase_P_IsDeprecated' is deprecated: INSTANTIATE_TEST_CASE_P is deprecated, please use INSTANTIATE_TEST_SUITE_P [-Wdeprecated-declarations]
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18203>
Warning messages:
../src/compiler/nir/tests/serialize_tests.cpp:113:1: warning: 'InstantiateTestCase_P_IsDeprecated' is deprecated: INSTANTIATE_TEST_CASE_P is deprecated, please use INSTANTIATE_TEST_SUITE_P [-Wdeprecated-declarations]
../src/compiler/nir/tests/serialize_tests.cpp:119:1: warning: 'InstantiateTestCase_P_IsDeprecated' is deprecated: INSTANTIATE_TEST_CASE_P is deprecated, please use INSTANTIATE_TEST_SUITE_P [-Wdeprecated-declarations]
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18203>