7273 Commits

Author SHA1 Message Date
Samuel Pitoiset 68bb58a46e nir,radv: pass the number of samples to load_sample_positions_amd
This will be used to lower it when it's dynamic.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18677>
2022-09-21 10:30:33 +00:00
Samuel Pitoiset dd30e7bfa0 nir: add nir_load_rasterization_samples_amd
This will be used to load the number of rasterization samples when a
fragment shader is compiled inside a library without the MSAA state.
RADV needs to know the number of samples for loading sample positions
with interpolateAtSample().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18677>
2022-09-21 10:30:33 +00:00
Marcin Ślusarz 1f0c39f23c nir/lower_task_shader: lower small stores & loads to shared when requested
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18501>
2022-09-21 09:16:20 +00:00
Marcin Ślusarz 037404b441 nir, anv, hasvk, radv: pull uses_wide_subgroup_intrinsics into shader_info
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18504>
2022-09-20 10:19:21 +00:00
Marcin Ślusarz fa437f87ca nir: add uses_wide_subgroup_intrinsics to task/mesh shader_info
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18504>
2022-09-20 10:19:21 +00:00
Samuel Pitoiset 7f444fc72c nir: add nir_intrinsic_load_sample_positions_amd
This will be used to lower barycentric_at_sample in NIR for RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18615>
2022-09-20 09:52:37 +00:00
Rhys Perry 7d26fafacf radv: fix dynamic RT stack size with VGPR spilling
VGPR spilling might cause VGPRs to be spilled at scratch offset 0, so we
can't use that.

fossil-db (Sienna Cichlid, Q2RTX and Control):
Totals from 4 (0.26% of 1524) affected shaders:
Instrs: 8734 -> 8737 (+0.03%)
CodeSize: 48492 -> 48504 (+0.02%)
Latency: 384375 -> 384369 (-0.00%)
InvThroughput: 256250 -> 256246 (-0.00%)
Copies: 1312 -> 1313 (+0.08%)
Branches: 256 -> 258 (+0.78%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18541>
2022-09-20 01:39:20 +00:00
Kai Wasserbäch 452e5973de fix: nir: unused variable ‘else_block’ [-Wunused-variable]
Only used in debug builds.

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18643>
2022-09-19 22:02:16 +00:00
Jason Ekstrand 8f4af4d700 nir/load_libclc: Don't add generic variants that already exist
At some point in the future, adding generic variants to libclc will
hopefully no longer be needed.  At that point, we don't want the NIR
code adding duplicates.  Check if the generic version already exists
and, if it does, don't re-add it.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18675>
2022-09-19 16:52:17 +00:00
Jason Ekstrand 2aa9eb497d nir: Add a helper for finding a function by name
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18675>
2022-09-19 16:52:17 +00:00
Jason Ekstrand 0a06abbb91 spirv: Don't use libclc for wait_group_events
v2: Drop old code (Karol)

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18675>
2022-09-19 16:52:17 +00:00
Qiang Yu 4e06a8f15e nir: add nir_intrinsic_ordered_xfb_counter_add_amd
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Qiang Yu 1119e06a45 nir,ac/llvm: add nir_intrinsic_load_ordered_id_amd
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Qiang Yu 5c2d710064 nir: add nir_intrinsic_load_streamout_buffer_amd
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Qiang Yu 2ae357aa23 nir: add nir_intrinsic_load_num_vertices_per_primitive_amd
This is used in streamout, as radeonsi passes this value for VS
by arg.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Qiang Yu 417cf031a0 nir: fix nir_xfb_info buffer_to_stream length
Fixes: 19064b8c3a ("nir: Add a pass for gathering transform feedback info")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
2022-09-16 08:51:28 +00:00
Emma Anholt 7e986e5f04 nir/lower_mediump_vars: Don't lower mediump shared vars with atomic access.
I don't know of any GPUs doing 16-bit atomic accesses, nor do I know of
anybody wanting that in shaders.  But deqp has GLES CTS cases that set
mediump on shared variables, so just skip lowering for those vars.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18452>
2022-09-14 14:56:22 -07:00
Rhys Perry b301c33f65 nir/algebraic: optimize fabs(bcsel(b, fneg(a), a))
fossil-db (Sienna Cichlid):
Totals from 207 (0.15% of 134913) affected shaders:
VGPRs: 7152 -> 6928 (-3.13%)
CodeSize: 762404 -> 752888 (-1.25%)
MaxWaves: 6138 -> 6146 (+0.13%)
Instrs: 144031 -> 142184 (-1.28%)
Latency: 817783 -> 807286 (-1.28%)
InvThroughput: 151031 -> 147497 (-2.34%)
VClause: 1490 -> 1453 (-2.48%)
SClause: 3357 -> 3331 (-0.77%); split: -0.92%, +0.15%
Copies: 9632 -> 9555 (-0.80%); split: -0.81%, +0.01%
Branches: 4306 -> 4270 (-0.84%)
PreSGPRs: 11232 -> 11218 (-0.12%); split: -0.15%, +0.03%
PreVGPRs: 6307 -> 6121 (-2.95%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14772>
2022-09-14 12:16:07 +00:00
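
The fabs(bcsel(b, fneg(a), a)) entry above does not spell out the replacement expression. A plausible reading is that the select becomes irrelevant once the absolute value is taken, so the whole expression can collapse to fabs(a). The sketch below only checks that identity on a few floats; bcsel, before and after are illustrative Python stand-ins, not Mesa code, and the actual rule in nir_opt_algebraic may be written differently.

```python
import math

def bcsel(cond, x, y):
    # Stand-in for NIR's bcsel: pick x when cond is true, else y.
    return x if cond else y

def before(b, a):
    # The pattern named in the commit title.
    return math.fabs(bcsel(b, -a, a))

def after(a):
    # Assumed simplified form: the select no longer matters.
    return math.fabs(a)

samples = [0.0, -0.0, 1.5, -2.25, math.inf, -math.inf]
for a in samples:
    for b in (False, True):
        assert before(b, a) == after(a)
        # The sign bit is always cleared, including for signed zero.
        assert math.copysign(1.0, before(b, a)) == 1.0
```

NaN inputs are left out because NaN never compares equal to itself; a bit-level comparison would be needed to cover them.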
Rhys Perry c23411a970 nir/algebraic: optimize bits=umin(bits, 32-(offset&0x1f))
Optimizes patterns which are created by recent versions of vkd3d-proton,
when constant folding doesn't eliminate it entirely:
- ubitfield_extract(value, offset, umin(bits, 32-(offset&0x1f)))
- ibitfield_extract(value, offset, umin(bits, 32-(offset&0x1f)))
- bitfield_insert(base, insert, offset, umin(bits, 32-(offset&0x1f)))

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13225>
2022-09-13 20:36:06 +00:00
Alyssa Rosenzweig 7371803f14 nir: Add nir_intrinsic_texture_base_agx sysval
For non-bindless textures, get the base address of the texture
descriptor array, so we can crawl descriptors in the shader. For
bindless, this isn't needed (since the bindless handle will be the
address itself).

jekstrand suggested the idea of the descriptor crawl. It worked out
pretty well, all considered.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
2022-09-13 16:04:28 +00:00
Alyssa Rosenzweig 6177c43bb9 nir/lower_blend: Avoid emitting unnecessary fsats
The option struct passed to nir_lower_blend doesn't have a "blending
disabled" flag. Unless blending is skipped due to logic ops or
framebuffer formats, nir_lower_blend always blends, even if the blend
mode is "replace" (corresponding to the API level blend disable).

That's mostly okay, since NIR can optimize out the code, at the expense
of a little compile time. However, there's a catch: nir_lower_blend
emits fsat at the start of the shader (for UNORM framebuffers, or
fsat_signed for SNORM). We can expect hardware to saturate the input to
store_output itself, so these operations are redundant, but it's tricky
to optimize these instructions out otherwise. Don't even try: detect the
replace blend mode and don't call nir_blend in that case. Colour masking
is still applied as usual.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18535>
2022-09-12 23:44:54 +00:00
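
To illustrate why the "replace" blend mode in the nir/lower_blend entry above needs no blending code at all, here is a minimal sketch of an additive blend equation. The names blend and is_replace, the string "add" and the float factors are hypothetical and chosen for illustration; they do not reflect the nir_lower_blend option struct.

```python
def blend(src, dst, src_factor, dst_factor):
    # Additive blend equation applied per channel.
    return tuple(s * src_factor + d * dst_factor for s, d in zip(src, dst))

def is_replace(func, src_factor, dst_factor):
    # "Replace" here means: add, source factor one, destination factor zero.
    return func == "add" and src_factor == 1.0 and dst_factor == 0.0

src = (0.25, 0.5, 0.75, 1.0)
dst = (0.9, 0.1, 0.3, 0.6)

# In replace mode the blended colour is just the source colour,
# so a lowering pass could skip emitting the blend math (and its fsat).
assert is_replace("add", 1.0, 0.0)
assert blend(src, dst, 1.0, 0.0) == src
```

Detecting this case up front avoids relying on later optimization passes to remove the redundant arithmetic.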
Karol Herbst 46ee5988cd rusticl: nir bindings
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15439>
2022-09-12 05:58:12 +00:00
Thomas H.P. Andersen 6d19b34571 spirv: avoid allocating memory twice
ptr_type was allocated twice. This drops the second allocation.

It has been like this since the introduction of the code in
b778e7bd6c

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18450>
2022-09-11 22:13:09 +00:00
Gert Wollny 762d377292 mesa/glsl: Add support for NV_shader_noperspective_interpolation
With EXT_gpu_shader4 the support is already in place, we just
have to allow it in glsl and expose the extension name.

v2: Check whether the extension is enabled in the shader (Adam Jackson)
v3: Don't check GLES version in lexer (mareko)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18460>
2022-09-09 07:22:20 +00:00
Timur Kristóf e58a5cca02 nir/gather_info: Clear cross-invocation output mask.
Similar to how other I/O info is cleared at the beginning
of gather_info we should also clear the cross-invocation
mesh shader output mask.

Fixes: 112a856813
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18464>
2022-09-08 20:26:03 +00:00
Timur Kristóf c80d811403 nir/lower_system_values: Add shortcut for 1D workgroups.
When the workgroup is 1 dimensional, simply use a vec3
filled with zeroes and the local invocation index.
This is better than lower_id_to_index + constant folding,
because this way we don't leave behind extra ALU instrs.

Note, this is relevant to mesh shaders on RDNA2 because
it enables us to better detect cross-invocation output
access.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18464>
2022-09-08 20:26:03 +00:00
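
The 1D workgroup shortcut described in the nir/lower_system_values entry above can be sketched as follows. This is a rough model, assuming the workgroup extends along x only, i.e. size (N, 1, 1); index_to_id and index_to_id_1d are hypothetical helper names, not functions from Mesa.

```python
def index_to_id(index, size):
    # General lowering: unflatten a linear index into (x, y, z).
    sx, sy, _sz = size
    return (index % sx, (index // sx) % sy, index // (sx * sy))

def index_to_id_1d(index):
    # Shortcut assumed for a workgroup of size (N, 1, 1):
    # the index itself plus two zeroes, with no ALU work.
    return (index, 0, 0)

size = (64, 1, 1)
for index in range(size[0]):
    assert index_to_id(index, size) == index_to_id_1d(index)
```

The general path needs a modulo and two divisions per invocation, while the shortcut needs none, which matches the commit's point about not leaving extra ALU instructions behind.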
Georg Lehmann 4d7fe94f3a nir/opt_algebraic: Optimize unpacking of upcasts to 64bit integers.
Foz-DB Navi21:
Totals from 7 (0.01% of 134913) affected shaders:
CodeSize: 213364 -> 213028 (-0.16%)
Instrs: 38347 -> 38319 (-0.07%)
Latency: 780148 -> 779776 (-0.05%)
InvThroughput: 520098 -> 519851 (-0.05%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18435>
2022-09-08 14:37:56 +00:00
Timothy Arceri f182b1952a glsl: remove GLSL IR inverse comparison optimisations
As per 7d85dc4f35, GLSL IR is not smart enough to handle this
correctly for NaNs.

Shader-db radeonsi (RX 6800):

Totals from affected shaders:
SGPRS: 26848 -> 26848 (0.00 %)
VGPRS: 13552 -> 13552 (0.00 %)
Spilled SGPRs: 134 -> 134 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 635000 -> 630988 (-0.63 %) bytes
Max Waves: 5474 -> 5474 (0.00 %)

Shader-db iris (BDW):

total instructions in shared programs: 17538859 -> 17539018 (<.01%)
instructions in affected programs: 29369 -> 29528 (0.54%)
helped: 3
HURT: 126
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.49% max: 0.49% x̄: 0.49% x̃: 0.49%
HURT stats (abs)   min: 1 max: 2 x̄: 1.29 x̃: 1
HURT stats (rel)   min: 0.27% max: 1.32% x̄: 0.61% x̃: 0.54%
95% mean confidence interval for instructions value: 1.13 1.33
95% mean confidence interval for instructions %-change: 0.54% 0.63%
Instructions are HURT.

total loops in shared programs: 4866 -> 4866 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total cycles in shared programs: 858548230 -> 858548915 (<.01%)
cycles in affected programs: 1331737 -> 1332422 (0.05%)
helped: 0
HURT: 92
HURT stats (abs)   min: 2 max: 49 x̄: 7.45 x̃: 6
HURT stats (rel)   min: 0.01% max: 1.90% x̄: 0.12% x̃: 0.05%
95% mean confidence interval for cycles value: 5.72 9.17
95% mean confidence interval for cycles %-change: 0.05% 0.19%
Cycles are HURT.

Note: With the addition of "nir/comparison_pre: See through an inot to
apply the optimization", idr's shader-db results are:

All Broadwell and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19940805 -> 19940802 (<.01%)
instructions in affected programs: 582 -> 579 (-0.52%)
helped: 3 / HURT: 0

total cycles in shared programs: 858431633 -> 858431747 (<.01%)
cycles in affected programs: 4938 -> 5052 (2.31%)
helped: 0 / HURT: 3

All older Intel platforms had similar results. (Haswell shown)
total instructions in shared programs: 16715626 -> 16715670 (<.01%)
instructions in affected programs: 9496 -> 9540 (0.46%)
helped: 0 / HURT: 44

total cycles in shared programs: 881224396 -> 881232314 (<.01%)
cycles in affected programs: 600610 -> 608528 (1.32%)
helped: 6 / HURT: 44

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18006>
2022-09-08 01:01:14 +00:00
Ian Romanick 5473536798 nir/comparison_pre: See through an inot to apply the optimization
This also prevents some small regressions in "glsl: remove GLSL IR
inverse comparison optimisations".

shader-db results:

All Sandy Bridge and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19941025 -> 19940805 (<.01%)
instructions in affected programs: 52431 -> 52211 (-0.42%)
helped: 188 / HURT: 6

total cycles in shared programs: 858451784 -> 858431633 (<.01%)
cycles in affected programs: 2119134 -> 2098983 (-0.95%)
helped: 183 / HURT: 12

LOST:   2
GAINED: 0

Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8364668 -> 8364670 (<.01%)
instructions in affected programs: 753 -> 755 (0.27%)
helped: 2 / HURT: 4

total cycles in shared programs: 248752572 -> 248752238 (<.01%)
cycles in affected programs: 87290 -> 86956 (-0.38%)
helped: 2 / HURT: 4

fossil-db results:

Skylake, Ice Lake, and Tiger Lake had similar results. (Ice Lake shown)
Instructions in all programs: 144909184 -> 144909130 (-0.0%)
Instructions helped: 6

Cycles in all programs: 9138641740 -> 9138640984 (-0.0%)
Cycles helped: 8

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18006>
2022-09-08 01:01:14 +00:00
Timothy Arceri 61c3438b27 nir: support loop unrolling with inot conditions
Ever since 4246c2869c and 7d85dc4f35 loop unrolling can no
longer depend on inot being eliminated from the loop
terminator condition so we need to be able to handle it.

This change avoids 292 loop unrolling regressions with shader-db
once the following patch is applied.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18006>
2022-09-08 01:01:14 +00:00
Timothy Arceri 96c19d23c9 nir: update nir_is_supported_terminator_condition()
Ever since 4246c2869c and 7d85dc4f35 loop unrolling can no
longer depend on inot being eliminated from the loop
terminator condition so we need to be able to handle it.

Here we simply check to see if the inot contains a simple
terminator condition we previously handled. We also update
the previous users of this function to use a newly named
copy of the previous behaviour,
nir_is_terminator_condition_with_two_inputs().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18006>
2022-09-08 01:01:14 +00:00
Emma Anholt 7662a5e9d3 mesa: Remove PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED/lower_cs_derived.
We have fine NIR lowering for this (already called from mesa/st), no need
for a separate GLSL pass.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18361>
2022-09-06 17:11:14 +00:00
Christian Gmeiner 912d0383b4 isaspec: Move isa_decode(..) declaration
The implementation of isa_decode(..) is already part of isaspec. So let's
move the function declaration and some related structs to src/isaspec.

Also make the header C++ safe.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18403>
2022-09-03 19:26:04 +00:00
Marcin Ślusarz 14911e8f83 spirv, compiler: add "bool nv" to shader_info.mesh
Not knowing whether we deal with the NV or EXT extension
makes implementation difficult for Intel HW.
NV support will be dropped at some point, so
this ugliness will go away eventually.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18366>
2022-09-02 16:18:33 +00:00
Timur Kristóf 7d1bcf1f55 spirv, nir: Handle EmitMeshTasksEXT opcode.
A task shader must use this instruction to specify the dimensions
of the launched mesh shader workgroups.
It is a terminating instruction.

When the task shader doesn't have the optional payload, use the
pre-existing launch_mesh_workgroups intrinsic.

When the task shader has a payload, use a new
launch_mesh_workgroups_with_payload_deref intrinsic, which has
a deref that refers to the payload variable.

We also add this new intrinsic to nir_lower_io which lowers this
to the pre-existing explicit intrinsic.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18366>
2022-09-02 16:18:33 +00:00
Timur Kristóf 42e906485c spirv: Support TaskPayloadWorkgroupEXT storage class.
Just use the task_payload NIR storage class for this.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18366>
2022-09-02 16:18:33 +00:00
Timur Kristóf a03c30bd8d spirv: Support the CullPrimitiveEXT mesh shader built-in.
This is a per-primitive builtin output which indicates that a
primitive should be culled (deleted) from the output.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18366>
2022-09-02 16:18:33 +00:00
Timur Kristóf c5c6cef893 spirv: Support EXT_mesh_shader SetMeshOutputsEXT.
Use the set_vertex_and_primitive_count intrinsic to
express the number of vertices and primitives that the
mesh shader workgroup outputs.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18366>
2022-09-02 16:18:33 +00:00
Timur Kristóf b3cc09cff3 spirv: Support EXT_mesh_shader mesh/task stages.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18366>
2022-09-02 16:18:33 +00:00
Timur Kristóf bbebc1fb35 spirv: Add mesh_shading capability for EXT_mesh_shader.
Indicates support for the EXT_mesh_shader SPIR-V capabilities.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18366>
2022-09-02 16:18:33 +00:00
Timur Kristóf f6925b8446 spirv: Support EXT_mesh_shader indices and mark them per-primitive.
They are not defined as per-primitive in the EXT, but they behave
like per-primitive outputs so it's easier to treat them like that.
They may still require special treatment in the backend in order to
control where and how they are stored.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18366>
2022-09-02 16:18:33 +00:00
Timur Kristóf c315e2e718 vulkan, spirv: Update to Vulkan 1.3.226 and latest SPIR-V headers.
Done using the "khronos-update.py" script, leaving out parts that
are not relevant to Vulkan.

Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18366>
2022-09-02 16:18:33 +00:00
Emma Anholt e1588cdf9e spirv: Mark phis as mediump instead of directly lowering them to 16 bit.
This reverts commit 6f25d45877, replacing it
with GLSL_PRECISION_MEDIUM.  The previous commit ended up not being the
right approach, as it affected only nir vars for spirv phis and not other
nir vars, and we want a tool that does both.  The new
nir_lower_mediump_vars pass can do that for you.

No fossil-db change for my angle fossils run on radv.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18259>
2022-09-01 22:39:39 +00:00
Emma Anholt 0cee5f3918 nir: Add a pass to lower mediump temps and shared mem.
SPIRV and GLSL are reasonable at converting ALU ops to mediump, but
variable storage would be wrapped in a 2f32/2mp on store/load, and if
nir_vars_to_ssa doesn't make that storage go away then you'd have extra
conversions.  For compute shader shared mem, you'd waste memory too.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18259>
2022-09-01 22:39:39 +00:00
Emma Anholt 5f66a927ec gallium,glsl: Delete PIPE_CAP_VERTEXID_NOBASE and lower_vertex_id.
Every driver uses the nir_lower_system_values path now.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18327>
2022-08-31 22:57:03 +00:00
Emma Anholt 28b2252d0a nir: Make nir_lower_discard_if() handle demotes and terminates, too.
AGX and zink both want all of these lowered, but nir_to_tgsi will want
only demote (and terminate, if it was possible from GLSL, but it's not).

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15932>
2022-08-31 18:26:19 +00:00
Georg Lehmann 6eb4dfca23 nir/opt_algebraic: Optimize d3d9 pow with fmulz.
Foz-DB Navi21:
Totals from 69 (0.05% of 134913) affected shaders:
CodeSize: 255684 -> 253788 (-0.74%); split: -0.74%, +0.00%
Instrs: 46307 -> 46052 (-0.55%); split: -0.55%, +0.00%
Latency: 533255 -> 530742 (-0.47%); split: -0.48%, +0.01%
InvThroughput: 110001 -> 109156 (-0.77%)
VClause: 839 -> 844 (+0.60%); split: -1.19%, +1.79%
SClause: 1411 -> 1395 (-1.13%)
Copies: 1828 -> 1816 (-0.66%); split: -1.09%, +0.44%
PreSGPRs: 2243 -> 2232 (-0.49%)
PreVGPRs: 2213 -> 2192 (-0.95%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18145>
2022-08-31 17:07:24 +00:00
Georg Lehmann 9c2c47884d nir/opt_algebraic: Optimize check for single bit.
Foz-DB Navi21:
Totals from 3239 (2.40% of 134913) affected shaders:
SpillSGPRs: 110 -> 102 (-7.27%)
CodeSize: 17426512 -> 17344808 (-0.47%); split: -0.48%, +0.01%
Instrs: 3194264 -> 3179366 (-0.47%)
Latency: 20498012 -> 20481419 (-0.08%); split: -0.08%, +0.00%
InvThroughput: 3311738 -> 3311282 (-0.01%); split: -0.02%, +0.00%
SClause: 145810 -> 145690 (-0.08%)
Copies: 171748 -> 169009 (-1.59%); split: -1.63%, +0.03%
Branches: 86610 -> 86370 (-0.28%)
PreSGPRs: 138036 -> 137104 (-0.68%)
PreVGPRs: 138540 -> 138545 (+0.00%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17429>
2022-08-31 18:36:33 +02:00
Iago Toral Quiroga a68a2805bf nir/lower_variable_initializers: implement non-scoped barrier path
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18312>
2022-08-31 07:25:00 +02:00
Emma Anholt 80b35fbefe nir/lower_mediump: Lower FS outputs to 16-bit when the value was upconverted.
Take this real-world (trimmed) shader:

precision highp float;
in lowp vec4 var_varVertexColor;
layout(location = 0) out vec4 out_FragColor0;
void main() {
    vec4 textureColor0 = vec4(1.000000e+00, 0.000000e+00, 0.000000e+00, 1.000000e+00);
    vec3 color = vec3(1.000000e+00, 1.000000e+00, 1.000000e+00);
    vec4 outColor = vec4(vec3((color).rgb), 1.000000e+00);
    (outColor *= vec4(var_varVertexColor));
    (out_FragColor0 = outColor);
}

After opts, it's just a store from input to output.  If we decide to lower
the input to 16-bit, then as long as the driver can handle 16-bit outputs,
it would be a good idea to demote the output and save the conversions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18003>
2022-08-31 02:43:45 +00:00