Samuel Pitoiset
9b58c4958b
ac/nir: fix integer comparisons with pointers
...
If we get a comparison between a pointer and an integer, LLVM
complains if the operands aren't of the same type.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3085
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5397 >
2020-06-10 08:18:22 +00:00
Marek Olšák
9538b9a68e
radeonsi: add support for Sienna Cichlid
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Marek Olšák
2cc4bfbe01
radeonsi: don't set any XNACK options on gfx10.3
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Marek Olšák
a1602516d7
ac,radeonsi: replace == GFX10 with >= GFX10 where it's needed
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Samuel Pitoiset
008b0d1701
ac/nir: adjust an assertion for D16 on GFX6-GFX7
...
16-bit types can be used with MUBUF on GFX6-GFX7.
Fixes: c3e0ba52a0 ("ac/nir: support 16-bit data in buffer_load_format opcodes")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5325 >
2020-06-08 08:45:32 +02:00
Marek Olšák
c6c8a9bd55
ac/nir: support v2f16 derivatives
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Marek Olšák
7c423dd721
ac/nir: set the second v_cvt_pkrtz argument to undef if it's unused
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Marek Olšák
bfb95725aa
ac/nir: select v_cvt_pkrtz for all conversions from f32 to f16 for radeonsi
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Marek Olšák
1d80015eaf
ac/nir: handle nir_op_[fiu]2[fiu]mp opcodes
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Marek Olšák
70b6d54011
ac/nir: support 16-bit data in image opcodes
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Marek Olšák
c3e0ba52a0
ac/nir: support 16-bit data in buffer_load_format opcodes
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Marek Olšák
b819ba949b
ac/nir: remove type and num_channels args from ac_build_buffer_store_common
...
They were only used for type overloading where we can just use
the type of data.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Marek Olšák
b98df7bf50
ac/nir: support vector types in the type suffix of overloaded intrinsics
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Marek Olšák
e5ea87cde8
ac/nir: use more types from ac_llvm_context
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5003 >
2020-06-02 16:29:25 -04:00
Dylan Baker
a8e2d79e02
meson: use gnu_symbol_visibility argument
...
This uses a meson builtin to handle -fvisibility=hidden. This is nice
because we don't need to track which languages are used, if C++ is
suddenly added meson just does the right thing.
Acked-by: Matt Turner <mattst88@gmail.com >
Reviewed-by: Eric Engestrom <eric@engestrom.ch >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740 >
2020-06-01 18:59:18 +00:00
Samuel Pitoiset
e99c818cf0
ac/nir: add support for bias/lod with texture gather
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5147 >
2020-05-25 08:51:10 +02:00
Samuel Pitoiset
14292310d9
ac/nir: implement nir_intrinsic_shader_clock with device scope
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117 >
2020-05-24 20:37:58 +02:00
Samuel Pitoiset
b034f6cf2a
ac/nir: fix shader clock with subgroup scope
...
The compiler should emit s_memtime instead of s_memrealtime for
the subgroup scope. I don't know why this LLVM 9 checks was for
but LLVM 8 also has this amdgcn intrinsic.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5117 >
2020-05-24 20:37:54 +02:00
Michel Dänzer
2a6811f0f9
Revert "ac,radeonsi: fix compilations issues with LLVM 11"
...
This reverts commit 42b1696ef6 .
The corresponding LLVM changes were reverted.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5087 >
2020-05-19 07:19:35 +00:00
Marek Olšák
2361e8e722
ac/nir: honor ACCESS_STREAM_CACHE_POLICY for L1 and L0 caches too
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4935 >
2020-05-15 22:12:35 +00:00
Samuel Pitoiset
0d63a1a84d
ac/llvm: add support for texturing with clamped LOD
...
This is a requirement for the shaderResourceMinLod feature which
allows to clamp LOD. This uses all image_sample_*_cl variants.
All dEQP-VK.glsl.texture_functions.texture*clamp.* pass.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4989 >
2020-05-14 10:05:44 +00:00
Pierre-Eric Pelloux-Prayer
ae4379d81e
ac/nir: export some undef as zero
...
NIR already optimizes undef usage.
If undef reaches llvm, it's probably because of a broken shader.
In this situation, rather than letting llvm use the undef values
to do more optimization and probably produce incorrect results,
we replace undef values by 0.
"undef" values that are directly used in exports are kept as undef,
because this allows llvm to optimize them away.
This is only enabled for radeonsi.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2689
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4607 >
2020-05-05 12:26:26 +02:00
Pierre-Eric Pelloux-Prayer
64662dd5ba
radeonsi: add workaround for issue 2647
...
For unknown reasons pixel shaders in KSP game get executed with
infinite interpolation coefficients and this causes an infinite
loop in the shader.
This commit adds a hacky workaround that kills pixel shaders if
invalid interp coeffs are detected and enables it for KSP.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2174
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2647
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4700 >
2020-05-05 09:41:14 +00:00
Marek Olšák
b97cc41aa2
Revert "ac: reassociate FP expressions for inexact instructions for radeonsi"
...
This reverts commit cf2f3c2753 .
It breaks shadows in Unigine Superposition.
Fixes: cf2f3c2753
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4837 >
2020-05-04 11:51:37 -04:00
Pierre-Eric Pelloux-Prayer
7e7bb38bd8
radeonsi: fix export count
...
Fixes: 17acff01a0 ("radeonsi: skip vs output optimizations for some outputs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2877
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4871 >
2020-05-04 15:11:09 +02:00
Samuel Pitoiset
aa94213781
ac/llvm: fix nir_texop_texture_samples with NULL descriptors
...
With VK_EXT_robustness2, descriptors can be NULL and the number of
samples returned by nir_texop_texture_samples should be 0.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4775 >
2020-04-29 07:29:54 +00:00
Samuel Pitoiset
42b1696ef6
ac,radeonsi: fix compilations issues with LLVM 11
...
Latest LLVM replaced LLVMVectorTypeKind.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2826
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4755 >
2020-04-27 17:13:36 +00:00
Marek Olšák
cf2f3c2753
ac: reassociate FP expressions for inexact instructions for radeonsi
...
Totals:
SGPRS: 2591784 -> 2590696 (-0.04 %)
VGPRS: 1666888 -> 1666736 (-0.01 %)
Spilled SGPRs: 4131 -> 4107 (-0.58 %)
Spilled VGPRs: 38 -> 38 (0.00 %)
Private memory VGPRs: 2176 -> 2176 (0.00 %)
Scratch size: 2228 -> 2228 (0.00 %) dwords per thread
Code Size: 52715468 -> 52693584 (-0.04 %) bytes
LDS: 92 -> 92 (0.00 %) blocks
Max Waves: 479897 -> 479892 (-0.00 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696 >
2020-04-27 11:20:16 +00:00
Marek Olšák
4b9370cb0f
ac: generate FMA for inexact instructions for radeonsi
...
NIR mostly does this already.
Totals:
SGPRS: 2588520 -> 2591784 (0.13 %)
VGPRS: 1666984 -> 1666888 (-0.01 %)
Spilled SGPRs: 4074 -> 4131 (1.40 %)
Spilled VGPRs: 38 -> 38 (0.00 %)
Private memory VGPRs: 2176 -> 2176 (0.00 %)
Scratch size: 2228 -> 2228 (0.00 %) dwords per thread
Code Size: 52726872 -> 52715468 (-0.02 %) bytes
LDS: 92 -> 92 (0.00 %) blocks
Max Waves: 479872 -> 479897 (0.01 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696 >
2020-04-27 11:20:16 +00:00
Marek Olšák
f2c2a28073
ac: update and document fast math flags used by radeonsi
...
This should have no effect, because we never use FP division, but
it's safer for the future.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696 >
2020-04-27 11:20:16 +00:00
Marek Olšák
3bb65c0670
ac: force enable -structurizecfg-skip-uniform-regions for LLVM 11
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696 >
2020-04-27 11:20:16 +00:00
Marek Olšák
b4fd8f1919
ac,radeonsi: simplify checking for Navi1x chips
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4698 >
2020-04-24 10:38:54 +00:00
Pierre-Eric Pelloux-Prayer
17acff01a0
radeonsi: skip vs output optimizations for some outputs
...
If PT_SPRITE_TEX is enabled, PS inputs are overriden at runtime so
we can't apply the vs output optim.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2747
Fixes: 3ec9975555 ("radeonsi: eliminate trivial constant VS outputs")
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4559 >
2020-04-20 08:45:16 +02:00
Samuel Pitoiset
9f005f1f85
radv: enable lowering of GS intrinsics for the LLVM backend
...
This replaces emit_vertex with:
if (vertex_count < max_vertices) {
emit_vertex_with_counter vertex_count ...
vertex_count += 1
}
Which is exactly what NIR->LLVM was doing but at NIR level. This
pass is already called by ACO.
pipeline-db changes on GFX10:
Totals from affected shaders:
SGPRS: 1952 -> 1912 (-2.05 %)
VGPRS: 2112 -> 2044 (-3.22 %)
Code Size: 189368 -> 185620 (-1.98 %) bytes
Max Waves: 494 -> 491 (-0.61 %)
No pipeline-db changes on other generations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4182 >
2020-04-08 08:24:05 +02:00
Samuel Pitoiset
3cd5450df5
ac/nir: split 16-bit SSBO stores on GFX6
...
Due to possible alignment issues, make sure to split stores of
16-bit vectors.
Doom Eternal requires storageBuffer16BitAccess.
Cc: 20.0 <mesa-stable@lists.freedesktop.org >
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339 >
2020-04-03 08:01:28 +00:00
Samuel Pitoiset
55fdcc03de
ac/nir: split 16-bit load/store to global memory on GFX6
...
Due to possible alignment issues, make sure to split loads/stores
of 16-bit vectors.
Doom Eternal requires storageBuffer16BitAccess.
Cc: 20.0 <mesa-stable@lists.freedesktop.org >
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339 >
2020-04-03 08:01:28 +00:00
Samuel Pitoiset
c6bf1597d1
ac/nir: split 8-bit SSBO stores on GFX6
...
Due to possible alignment issues, make sure to split stores of
8-bit vectors.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339 >
2020-04-03 08:01:28 +00:00
Samuel Pitoiset
433f3380eb
ac/nir: split 8-bit load/store to global memory on GFX6
...
Due to possible alignment issues, make sure to split loads/stores
of 8-bit vectors.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4339 >
2020-04-03 08:01:28 +00:00
Eric Engestrom
79af30768d
meson: inline inc_common
...
Let's make it clear what includes are being added everywhere, so that
they can be cleaned up.
Signed-off-by: Eric Engestrom <eric@engestrom.ch >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360 >
2020-03-28 21:36:54 +01:00
Samuel Pitoiset
ba2ec1f369
ac/nir: use llvm.amdgcn.rcp in ac_build_fdiv()
...
Instead of emitting 1.0 / x which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.
No pipeline-db changes with VEGA10/LLVM 9.
pipeline-db (VEGA10/LLVM 10):
Totals from affected shaders:
SGPRS: 6672 -> 6672 (0.00 %)
VGPRS: 6652 -> 6652 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 561780 -> 561692 (-0.02 %) bytes
Max Waves: 1043 -> 1043 (0.00 %)
pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 84608 -> 83768 (-0.99 %)
VGPRS: 106768 -> 106636 (-0.12 %)
Spilled SGPRs: 1625 -> 1713 (5.42 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 10850936 -> 10726712 (-1.14 %) bytes
Max Waves: 3152 -> 3180 (0.89 %)
LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.
Cc: 20.0 <mesa-stable@lists.freedesktop.org >
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326 >
2020-03-27 08:05:43 +01:00
Samuel Pitoiset
d548384fc6
ac/nir: use llvm.amdgcn.rsq for nir_op_frsq
...
Instead of emitting 1.0 / sqrt(x) which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.
pipeline-db (VEGA10/LLVM 9):
Totals from affected shaders:
SGPRS: 16872 -> 16864 (-0.05 %)
VGPRS: 15320 -> 15464 (0.94 %)
Spilled SGPRs: 2021 -> 2133 (5.54 %)
Code Size: 1915464 -> 1917476 (0.11 %) bytes
Max Waves: 641 -> 639 (-0.31 %)
pipeline-db (VEGA10/LLVM 10):
Totals from affected shaders:
SGPRS: 43936 -> 44120 (0.42 %)
VGPRS: 41776 -> 41972 (0.47 %)
Spilled SGPRs: 875 -> 875 (0.00 %)
Code Size: 4468164 -> 4468120 (-0.00 %) bytes
Max Waves: 2412 -> 2414 (0.08 %)
pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 60096 -> 60096 (0.00 %)
VGPRS: 63552 -> 63648 (0.15 %)
Spilled SGPRs: 6135 -> 6117 (-0.29 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 6252996 -> 6249772 (-0.05 %) bytes
Max Waves: 2324 -> 2337 (0.56 %)
LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.
Cc: 20.0 <mesa-stable@lists.freedesktop.org >
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326 >
2020-03-27 07:45:47 +01:00
Samuel Pitoiset
66426ce119
ac/nir: use llvm.amdgcn.rcp for nir_op_frcp
...
Instead of emitting 1.0 / x which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.
pipeline-db (VEG10/LLVM 9):
Totals from affected shaders:
SGPRS: 50384 -> 50312 (-0.14 %)
VGPRS: 42572 -> 42696 (0.29 %)
Spilled SGPRs: 1372 -> 1372 (0.00 %)
Code Size: 5692040 -> 5691428 (-0.01 %) bytes
Max Waves: 3954 -> 3951 (-0.08 %)
pipeline-db (VEG10/LLVM 10):
Totals from affected shaders:
SGPRS: 78512 -> 78464 (-0.06 %)
VGPRS: 62408 -> 62484 (0.12 %)
Spilled SGPRs: 1502 -> 1502 (0.00 %)
Code Size: 8106188 -> 8103372 (-0.03 %) bytes
Max Waves: 7759 -> 7753 (-0.08 %)
pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 112760 -> 113232 (0.42 %)
VGPRS: 111132 -> 110568 (-0.51 %)
Spilled SGPRs: 5870 -> 5940 (1.19 %)
Spilled VGPRs: 650 -> 652 (0.31 %)
Code Size: 11887232 -> 11561744 (-2.74 %) bytes
Max Waves: 8964 -> 9015 (0.57 %)
LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.
Cc: 20.0 <mesa-stable@lists.freedesktop.org >
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326 >
2020-03-27 07:45:43 +01:00
Pierre-Eric Pelloux-Prayer
5533c41541
ac: fix ac_build_is_helper_invocation when postponed_kill is null
...
If there was no demote() in the shader, ac_build_is_helper_invocation
behaves exactly the same as ac_build_load_helper_invocation, i.e.
the helper lanes are the same as they were at the beginning of the shader.
Fixes: de57ea2a3d ("amd/llvm: implement nir_intrinsic_demote(_if) and nir_intrinsic_is_helper_invocation")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4301 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4301 >
2020-03-25 08:19:38 +01:00
Samuel Pitoiset
7ac8bb33cd
radv/llvm: fix subgroup shuffle for chips without bpermute
...
bpermute only exists on GFX8+ and only with Wave32 on GFX10. Instead
we have to use readlane with a waterfall loop to defeat the LLVM
backend.
This fixes DOOM Eternal which requires subgroup shuffle.
Cc: <mesa-stable@lists.freedesktop.org >
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284 >
2020-03-23 14:19:03 +00:00
Marek Olšák
303842b2db
ac: fix fast division
...
This stopped working with LLVM 11 and might occasionally have been broken
on older LLVM, because the metadata was set on the mul, not on the rcp.
Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268 >
2020-03-21 22:34:17 +00:00
Bas Nieuwenhuizen
8e4e2cedcf
amd/llvm: Fix divergent descriptor regressions with radeonsi.
...
piglit/bin/arb_bindless_texture-limit -auto -fbo:
Needed to deal with non-NULL dynamic_index without deref in tex instructions.
piglit/bin/shader_runner tests/spec/arb_bindless_texture/execution/images/multiple-resident-images-reading.shader_test -auto:
Need to deal with non-deref images in enter_waterfall_imae.
Fixes: b83c9aca4a "amd/llvm: Fix divergent descriptor indexing. (v3)"
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191 >
2020-03-17 22:53:16 +01:00
Marek Olšák
8dc5e174c7
ac: don't set old denormals flags with LLVM >= 11
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196 >
2020-03-17 20:47:48 +00:00
Marek Olšák
63a5051ea6
ac: set new LLVM denormal flags
...
See: https://reviews.llvm.org/D71358
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196 >
2020-03-17 20:47:48 +00:00
Bas Nieuwenhuizen
b83c9aca4a
amd/llvm: Fix divergent descriptor indexing. (v3)
...
There are multiple LLVM passes that very much move the
intrinsic using the descriptor outside of the loop, defeating
the entire point of creating the loop.
Defeat the optimizer by splitting the break into a separate
if-statement and putting an optimization barrier on the bool
in between.
v2: Move from a callback based system to begin/end loop.
This does not make it significantly less intrusive but
is a bit nicer with all the extra struct and callback
stubs.
v3: Deal with non-divergent values in divergent path.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2160
Fixes: 028ce52739 "radv: Add non-uniform indexing lowering."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4109 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4109 >
2020-03-12 16:12:02 +00:00
Samuel Pitoiset
cc320ef9af
ac/llvm: add missing optimization barrier for 64-bit readlanes
...
Otherwise, LLVM optimizes it but it's actually incorrect.
Fixes: 0f45d4dc2b ("ac: add ac_build_readlane without optimization barrier")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3585 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3585 >
2020-03-12 08:46:42 +01:00