Samuel Pitoiset
7fa00e178f
radv: calculate the GSVS vertex size in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-09-06 15:52:22 +02:00
Samuel Pitoiset
3e8bda66ae
radv: gather primitive ID in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-09-06 15:52:20 +02:00
Samuel Pitoiset
1877e87f1e
radv: gather layer in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-09-06 15:52:19 +02:00
Samuel Pitoiset
84b346eda9
radv: gather viewport in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-09-06 15:52:17 +02:00
Samuel Pitoiset
d21489d415
radv: gather pointsize in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-09-06 15:52:09 +02:00
Samuel Pitoiset
a99d2d5564
radv: gather clip/cull distances in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-09-06 15:52:07 +02:00
Samuel Pitoiset
b16cf6c4c6
radv: move ac_fill_shader_info() to radv_nir_shader_info_pass()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-09-06 15:52:05 +02:00
Samuel Pitoiset
83499ac765
radv: merge radv_shader_variant_info into radv_shader_info
...
Having two different structs is useless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-09-06 15:52:03 +02:00
Samuel Pitoiset
fa13b2f002
radv/gfx10: always set ballot_mask_bits to 64
...
The codegen handles it and it adds the correct casts. This fixes
a bunch of LLVM validation errors when enabling Wave32 for compute.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-09-06 08:11:43 +02:00
Vasily Khoruzhick
9367d2ca37
nir: allow specifying filter callback in lower_alu_to_scalar
...
Set of opcodes doesn't have enough flexibility in certain cases. E.g.
Utgard PP has vector conditional select operation, but condition is always
scalar. Lowering all the vector selects to scalar increases instruction
number, so we need a way to filter only those ops that can't be handled
in hardware.
Reviewed-by: Qiang Yu <yuq825@gmail.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com >
2019-09-06 01:51:28 +00:00
Connor Abbott
3f5b541fc8
radv: Call nir_propagate_invariant()
...
Without this, invariant qualifiers don't do anything. Together with a
fix to the game, this fixes flickering in No Man's Sky.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
2019-09-05 14:05:46 +02:00
Connor Abbott
71a6794200
ac/nir: Enable nir_opt_large_constants
...
vkpipeline-db numbers:
Totals:
SGPRS: 1740306 -> 1741322 (0.06 %)
VGPRS: 1331124 -> 1331712 (0.04 %)
Spilled SGPRs: 21201 -> 21316 (0.54 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 256 -> 256 (0.00 %) dwords per thread
Code Size: 79022628 -> 78694788 (-0.41 %) bytes
LDS: 6500 -> 6500 (0.00 %) blocks
Max Waves: 301413 -> 301302 (-0.04 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 53633 -> 54649 (1.89 %)
VGPRS: 53000 -> 53588 (1.11 %)
Spilled SGPRs: 3454 -> 3569 (3.33 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 5284232 -> 4956392 (-6.20 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Max Waves: 4239 -> 4128 (-2.62 %)
Wait states: 0 -> 0 (0.00 %)
(The biggest VGPR and max wave regression is due to unrolling a loop,
which made the scheduler more aggressive, but in this case it's able to
effectively hide latency so it's actually probably a win.)
shader-db numbers with radeonsi NIR:
Totals:
SGPRS: 3526496 -> 3526512 (0.00 %)
VGPRS: 2198576 -> 2198576 (0.00 %)
Spilled SGPRs: 10463 -> 10463 (0.00 %)
Spilled VGPRs: 86 -> 86 (0.00 %)
Private memory VGPRs: 3182 -> 2528 (-20.55 %)
Scratch size: 3308 -> 2640 (-20.19 %) dwords per thread
Code Size: 74117280 -> 74106140 (-0.02 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 775846 -> 775844 (-0.00 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 856 -> 872 (1.87 %)
VGPRS: 680 -> 680 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 654 -> 0 (-100.00 %)
Scratch size: 668 -> 0 (-100.00 %) dwords per thread
Code Size: 49652 -> 38512 (-22.44 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 182 -> 180 (-1.10 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-09-05 12:21:46 +02:00
Connor Abbott
91626d0865
ac/nir: Support load_constant intrinsics
...
Setup a constant global variable that LLVM will stick in a .rodata
section and generate PC-relative loads for.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-09-05 12:21:42 +02:00
Connor Abbott
5dadbabb47
radv/radeonsi: Don't count read-only data when reporting code size
...
We usually use these counts as a simple way to figure out if a change
reduces the number of instructions or shrinks an instruction. However,
since .rodata sections aren't executed, we shouldn't be counting their
size for this analysis. Make the linker return the total executable
size, and use it to report the more useful size in both drivers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-09-05 12:21:35 +02:00
Connor Abbott
2abf62d348
ac/nir: Fix gather4 integer wa with unnormalized coordinates
...
This adds a bit of unneccesary code on radeonsi, since whether
unnormalized coordinates are used is known at compile time with GL, but
I wasn't sure if it was worth the few instructions to plumb everything
through, especially for something so rare -- my shader-db doesn't have
any instances where this changes anything.
Fixes CTS tests I created at
https://github.com/cwabbott0/VK-GL-CTS/tree/unnorm-gather-tests
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-09-03 13:50:54 +00:00
Connor Abbott
c63ccf90df
ac/nir: Rewrite gather4 integer workaround based on radeonsi
...
The workaround was originally written based on amdgpu-pro traces, but
since then radeonsi has got its own slightly different version. Use the
radeonsi version instead, to be consistent and because it'll be slightly
more convenient for handling unnormalized coordinates.
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-09-03 13:50:54 +00:00
Samuel Pitoiset
6b96c94b5a
radv: keep a pointer to a NIR shader into radv_shader_context
...
This avoids multiple copies for nothing and it's more elegant.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-08-30 09:33:30 +02:00
Samuel Pitoiset
7b1655ccf3
radv: move setting can_discard to ac_fill_shader_info()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-08-30 09:33:27 +02:00
Samuel Pitoiset
081561de16
radv: replace ac_nir_build_if by ac_build_ifcc
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-08-30 09:33:25 +02:00
Samuel Pitoiset
cc3d36b5dd
radv: remove radv_init_llvm_target() helper
...
RADV no longer uses specific LLVM options compared to the common code.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-08-30 09:33:21 +02:00
Samuel Pitoiset
dc27a54c84
radv: remove useless ac_llvm_util.h include from the WSI code
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-08-30 09:33:19 +02:00
Samuel Pitoiset
6cb455c418
radv: remove unused shader_info parameter in ac_compile_llvm_module()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-08-30 09:33:17 +02:00
Samuel Pitoiset
9aaca90123
radv: remove some unused fields from radv_shader_context
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-08-30 09:33:15 +02:00
Samuel Pitoiset
8d44f83844
radv: move lowering PS inputs/outputs at the right place
...
At shaders creation, just after NIR linking.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-08-30 09:29:31 +02:00
Samuel Pitoiset
151d6990ec
radv: gather info about PS inputs in the shader info pass
...
It's the right place to do that.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-08-30 09:29:29 +02:00
Samuel Pitoiset
9f2fd23f99
ac: drop now useless lookup_interp_param from ABI
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-30 08:23:56 +02:00
Samuel Pitoiset
a63719db6a
ac: import linear/perspective PS input parameters from radv/radeonsi
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-30 08:23:54 +02:00
Samuel Pitoiset
b650ecfe31
radv/gfx10: compute the LDS size for exporting PrimID for VS
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-08-29 16:08:37 +02:00
Marek Olšák
2e94cb6693
radeonsi: add PKT3_CONTEXT_REG_RMW
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
2019-08-27 16:16:08 -04:00
Samuel Pitoiset
49f5ddd3ae
radv: make use of has_ls_vgpr_init_bug
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-08-27 08:04:51 +02:00
Samuel Pitoiset
fd54fc85aa
ac: add has_ls_vgpr_init_bug to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:47 +02:00
Samuel Pitoiset
1bf2572dff
ac: add has_msaa_sample_loc_bug to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:44 +02:00
Samuel Pitoiset
021feb1bf6
ac: add rbplus_allowed to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:41 +02:00
Samuel Pitoiset
20c5db02b5
ac: add has_tc_compat_zrange_bug to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:36 +02:00
Samuel Pitoiset
b55919cf2a
ac: add has_gfx9_scissor_bug to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:32 +02:00
Samuel Pitoiset
2b9c371575
ac: add cpdma_prefetch_writes_memory to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:29 +02:00
Samuel Pitoiset
b027ad66d7
ac: add has_out_of_order_rast to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:26 +02:00
Samuel Pitoiset
ed720af46d
ac: add has_load_ctx_reg_pkt to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:22 +02:00
Samuel Pitoiset
63c0b89b8f
ac: add has_rbplus to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:19 +02:00
Samuel Pitoiset
44a46c09de
ac: add has_dcc_constant_encode to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:16 +02:00
Samuel Pitoiset
c08401f035
ac: add has_distributed_tess to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:11 +02:00
Samuel Pitoiset
d62d2840c4
ac: add has_clear_state to ac_gpu_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:05 +02:00
Samuel Pitoiset
af65f9431e
ac: drop llvm8 from some load/store helpers
...
Cleanup.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-08-27 08:04:00 +02:00
Samuel Pitoiset
218ce34962
radv: add mipmap support for the clear depth/stencil values
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-08-26 15:56:59 +02:00
Samuel Pitoiset
e36e260c42
radv: add mipmap support for the TC-compat zrange bug
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-08-26 15:56:55 +02:00
Samuel Pitoiset
9db0dc6b8e
radv: allocate metadata space for mipmapped depth/stencil images
...
For each mipmaps, the driver will store the clear values (8-bytes)
and the TC-compat zrange value (4-bytes).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-08-26 15:56:51 +02:00
Samuel Pitoiset
76812339f7
radv: decompress mipmapped depth/stencil images during transitions
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-08-26 15:56:48 +02:00
Samuel Pitoiset
81c6473b7f
radv: add mipmaps support for decompress/resummarize
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-08-26 15:56:45 +02:00
Samuel Pitoiset
18ccde4d68
radv: add radv_process_depth_image_layer() helper
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-08-26 15:56:42 +02:00
Connor Abbott
b7acf38073
ac/nir: Remove gfx9_stride_size_workaround_for_atomic
...
The workaround was entirely in common code, and it's needed in radeonsi
too so just always do it when necessary. Fixes
KHR-GL45.shader_image_load_store.advanced-allStages-oneImage on gfx9
with LLVM 8.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
2019-08-26 11:00:49 +02:00