3873 Commits

Author SHA1 Message Date
Samuel Pitoiset 7fa00e178f radv: calculate the GSVS vertex size in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:22 +02:00
Samuel Pitoiset 3e8bda66ae radv: gather primitive ID in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:20 +02:00
Samuel Pitoiset 1877e87f1e radv: gather layer in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:19 +02:00
Samuel Pitoiset 84b346eda9 radv: gather viewport in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:17 +02:00
Samuel Pitoiset d21489d415 radv: gather pointsize in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:09 +02:00
Samuel Pitoiset a99d2d5564 radv: gather clip/cull distances in the shader info pass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:07 +02:00
Samuel Pitoiset b16cf6c4c6 radv: move ac_fill_shader_info() to radv_nir_shader_info_pass()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:05 +02:00
Samuel Pitoiset 83499ac765 radv: merge radv_shader_variant_info into radv_shader_info
Having two different structs is useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:03 +02:00
Samuel Pitoiset fa13b2f002 radv/gfx10: always set ballot_mask_bits to 64
The codegen handles it and it adds the correct casts. This fixes
a bunch of LLVM validation errors when enabling Wave32 for compute.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 08:11:43 +02:00
Vasily Khoruzhick 9367d2ca37 nir: allow specifying filter callback in lower_alu_to_scalar
A set of opcodes doesn't offer enough flexibility in certain cases. E.g.
Utgard PP has a vector conditional select operation, but the condition is
always scalar. Lowering all the vector selects to scalar increases the
instruction count, so we need a way to filter only those ops that can't be
handled in hardware.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-06 01:51:28 +00:00
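
The filter-callback idea in the commit above can be illustrated with a minimal sketch. Everything here is hypothetical: the `toy_*` types and functions are stand-ins invented for illustration, not the real NIR structs or the signature of lower_alu_to_scalar. The sketch assumes that only selects with a per-component condition are unsupported in hardware, and it simply counts the instructions that would remain after lowering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for compiler IR types; these are NOT NIR structs. */
enum toy_op { TOY_OP_FADD, TOY_OP_SELECT };

struct toy_alu_instr {
    enum toy_op op;
    unsigned num_components;      /* width of the result vector */
    unsigned cond_num_components; /* width of the select condition, 0 if unused */
};

/* Callback type: return true if the instruction must be scalarized. */
typedef bool (*toy_filter_cb)(const struct toy_alu_instr *instr, const void *data);

/* Assumption for this sketch: only selects whose condition is itself a
 * vector cannot be handled in hardware, so only those are lowered. */
static bool toy_needs_scalarizing(const struct toy_alu_instr *instr, const void *data)
{
    (void)data; /* no extra state needed by this filter */
    return instr->op == TOY_OP_SELECT && instr->cond_num_components > 1;
}

/* Returns the instruction count after "lowering": an instruction accepted by
 * the filter becomes one scalar op per component, the rest stay as they are. */
static unsigned toy_lower_to_scalar(const struct toy_alu_instr *instrs, size_t count,
                                    toy_filter_cb filter, const void *data)
{
    assert(filter != NULL);
    unsigned total = 0;
    for (size_t i = 0; i < count; i++) {
        if (filter(&instrs[i], data))
            total += instrs[i].num_components;
        else
            total += 1;
    }
    return total;
}
```

With a fixed opcode set, both kinds of select would have to be lowered together; the callback lets the backend inspect each instruction and keep the ones it can execute as vectors.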
Connor Abbott 3f5b541fc8 radv: Call nir_propagate_invariant()
Without this, invariant qualifiers don't do anything. Together with a
fix to the game, this fixes flickering in No Man's Sky.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-05 14:05:46 +02:00
Connor Abbott 71a6794200 ac/nir: Enable nir_opt_large_constants
vkpipeline-db numbers:

Totals:
SGPRS: 1740306 -> 1741322 (0.06 %)
VGPRS: 1331124 -> 1331712 (0.04 %)
Spilled SGPRs: 21201 -> 21316 (0.54 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 256 -> 256 (0.00 %) dwords per thread
Code Size: 79022628 -> 78694788 (-0.41 %) bytes
LDS: 6500 -> 6500 (0.00 %) blocks
Max Waves: 301413 -> 301302 (-0.04 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 53633 -> 54649 (1.89 %)
VGPRS: 53000 -> 53588 (1.11 %)
Spilled SGPRs: 3454 -> 3569 (3.33 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 5284232 -> 4956392 (-6.20 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Max Waves: 4239 -> 4128 (-2.62 %)
Wait states: 0 -> 0 (0.00 %)

(The biggest VGPR and max wave regression is due to unrolling a loop,
which made the scheduler more aggressive, but in this case it's able to
effectively hide latency so it's actually probably a win.)

shader-db numbers with radeonsi NIR:

Totals:
SGPRS: 3526496 -> 3526512 (0.00 %)
VGPRS: 2198576 -> 2198576 (0.00 %)
Spilled SGPRs: 10463 -> 10463 (0.00 %)
Spilled VGPRs: 86 -> 86 (0.00 %)
Private memory VGPRs: 3182 -> 2528 (-20.55 %)
Scratch size: 3308 -> 2640 (-20.19 %) dwords per thread
Code Size: 74117280 -> 74106140 (-0.02 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 775846 -> 775844 (-0.00 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 856 -> 872 (1.87 %)
VGPRS: 680 -> 680 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 654 -> 0 (-100.00 %)
Scratch size: 668 -> 0 (-100.00 %) dwords per thread
Code Size: 49652 -> 38512 (-22.44 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 182 -> 180 (-1.10 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-05 12:21:46 +02:00
Connor Abbott 91626d0865 ac/nir: Support load_constant intrinsics
Set up a constant global variable that LLVM will stick in a .rodata
section and generate PC-relative loads for.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-05 12:21:42 +02:00
Connor Abbott 5dadbabb47 radv/radeonsi: Don't count read-only data when reporting code size
We usually use these counts as a simple way to figure out if a change
reduces the number of instructions or shrinks an instruction. However,
since .rodata sections aren't executed, we shouldn't be counting their
size for this analysis. Make the linker return the total executable
size, and use it to report the more useful size in both drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-05 12:21:35 +02:00
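
As a rough illustration of the reasoning in the commit above, the reported size can be thought of as the sum over executable sections only. The `toy_section` struct and `toy_exec_size` function below are hypothetical names made up for this sketch; they do not reflect the actual linker interface used by the drivers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one section in a linked shader binary. */
struct toy_section {
    const char *name;
    uint64_t size;   /* size in bytes */
    bool executable; /* true for code, false for read-only data */
};

/* Sums only the sections that are executed, so read-only data such as
 * .rodata does not inflate the reported code size. */
static uint64_t toy_exec_size(const struct toy_section *sections, size_t count)
{
    assert(sections != NULL || count == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        if (sections[i].executable)
            total += sections[i].size;
    }
    return total;
}
```

For a binary with a 256-byte .text and a 64-byte .rodata, this sketch reports 256 rather than 320, which keeps the number comparable across changes that move constants into read-only data.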
Connor Abbott 2abf62d348 ac/nir: Fix gather4 integer wa with unnormalized coordinates
This adds a bit of unnecessary code on radeonsi, since whether
unnormalized coordinates are used is known at compile time with GL, but
I wasn't sure if it was worth the few instructions to plumb everything
through, especially for something so rare -- my shader-db doesn't have
any instances where this changes anything.

Fixes CTS tests I created at
https://github.com/cwabbott0/VK-GL-CTS/tree/unnorm-gather-tests

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-03 13:50:54 +00:00
Connor Abbott c63ccf90df ac/nir: Rewrite gather4 integer workaround based on radeonsi
The workaround was originally written based on amdgpu-pro traces, but
since then radeonsi has got its own slightly different version. Use the
radeonsi version instead, to be consistent and because it'll be slightly
more convenient for handling unnormalized coordinates.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-03 13:50:54 +00:00
Samuel Pitoiset 6b96c94b5a radv: keep a pointer to a NIR shader into radv_shader_context
This avoids multiple copies for nothing and it's more elegant.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:30 +02:00
Samuel Pitoiset 7b1655ccf3 radv: move setting can_discard to ac_fill_shader_info()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:27 +02:00
Samuel Pitoiset 081561de16 radv: replace ac_nir_build_if by ac_build_ifcc
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:25 +02:00
Samuel Pitoiset cc3d36b5dd radv: remove radv_init_llvm_target() helper
RADV no longer uses specific LLVM options compared to the common code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:21 +02:00
Samuel Pitoiset dc27a54c84 radv: remove useless ac_llvm_util.h include from the WSI code
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:19 +02:00
Samuel Pitoiset 6cb455c418 radv: remove unused shader_info parameter in ac_compile_llvm_module()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:17 +02:00
Samuel Pitoiset 9aaca90123 radv: remove some unused fields from radv_shader_context
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:15 +02:00
Samuel Pitoiset 8d44f83844 radv: move lowering PS inputs/outputs at the right place
At shader creation, just after NIR linking.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:29:31 +02:00
Samuel Pitoiset 151d6990ec radv: gather info about PS inputs in the shader info pass
It's the right place to do that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:29:29 +02:00
Samuel Pitoiset 9f2fd23f99 ac: drop now useless lookup_interp_param from ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-30 08:23:56 +02:00
Samuel Pitoiset a63719db6a ac: import linear/perspective PS input parameters from radv/radeonsi
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-30 08:23:54 +02:00
Samuel Pitoiset b650ecfe31 radv/gfx10: compute the LDS size for exporting PrimID for VS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-29 16:08:37 +02:00
Marek Olšák 2e94cb6693 radeonsi: add PKT3_CONTEXT_REG_RMW
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00
Samuel Pitoiset 49f5ddd3ae radv: make use of has_ls_vgpr_init_bug
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-27 08:04:51 +02:00
Samuel Pitoiset fd54fc85aa ac: add has_ls_vgpr_init_bug to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:47 +02:00
Samuel Pitoiset 1bf2572dff ac: add has_msaa_sample_loc_bug to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:44 +02:00
Samuel Pitoiset 021feb1bf6 ac: add rbplus_allowed to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:41 +02:00
Samuel Pitoiset 20c5db02b5 ac: add has_tc_compat_zrange_bug to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:36 +02:00
Samuel Pitoiset b55919cf2a ac: add has_gfx9_scissor_bug to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:32 +02:00
Samuel Pitoiset 2b9c371575 ac: add cpdma_prefetch_writes_memory to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:29 +02:00
Samuel Pitoiset b027ad66d7 ac: add has_out_of_order_rast to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:26 +02:00
Samuel Pitoiset ed720af46d ac: add has_load_ctx_reg_pkt to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:22 +02:00
Samuel Pitoiset 63c0b89b8f ac: add has_rbplus to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:19 +02:00
Samuel Pitoiset 44a46c09de ac: add has_dcc_constant_encode to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:16 +02:00
Samuel Pitoiset c08401f035 ac: add has_distributed_tess to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:11 +02:00
Samuel Pitoiset d62d2840c4 ac: add has_clear_state to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:05 +02:00
Samuel Pitoiset af65f9431e ac: drop llvm8 from some load/store helpers
Cleanup.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:00 +02:00
Samuel Pitoiset 218ce34962 radv: add mipmap support for the clear depth/stencil values
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:59 +02:00
Samuel Pitoiset e36e260c42 radv: add mipmap support for the TC-compat zrange bug
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:55 +02:00
Samuel Pitoiset 9db0dc6b8e radv: allocate metadata space for mipmapped depth/stencil images
For each mipmap, the driver will store the clear values (8 bytes)
and the TC-compat zrange value (4 bytes).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:51 +02:00
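
The per-mipmap sizes quoted in the commit above give 12 bytes of metadata per level. The sketch below works through that arithmetic. The layout (all clear values first, then all zrange values) and the `toy_*` names are assumptions made for illustration; the commit message only states the per-level sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes taken from the commit message. */
#define TOY_CLEAR_VALUE_BYTES 8u /* clear values per mip level */
#define TOY_ZRANGE_BYTES      4u /* TC-compat zrange value per mip level */

/* Total metadata bytes needed for an image with `levels` mip levels. */
static uint64_t toy_ds_metadata_size(unsigned levels)
{
    return (uint64_t)levels * (TOY_CLEAR_VALUE_BYTES + TOY_ZRANGE_BYTES);
}

/* Assumed layout: an array of clear values, one entry per level... */
static uint64_t toy_clear_value_offset(unsigned levels, unsigned level)
{
    assert(level < levels);
    return (uint64_t)level * TOY_CLEAR_VALUE_BYTES;
}

/* ...followed by an array of zrange values, one entry per level. */
static uint64_t toy_zrange_offset(unsigned levels, unsigned level)
{
    assert(level < levels);
    return (uint64_t)levels * TOY_CLEAR_VALUE_BYTES +
           (uint64_t)level * TOY_ZRANGE_BYTES;
}
```

For an image with 4 mip levels this sketch reserves 48 bytes, with the zrange value of level 1 at offset 36 under the assumed layout.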
Samuel Pitoiset 76812339f7 radv: decompress mipmapped depth/stencil images during transitions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:48 +02:00
Samuel Pitoiset 81c6473b7f radv: add mipmaps support for decompress/resummarize
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:45 +02:00
Samuel Pitoiset 18ccde4d68 radv: add radv_process_depth_image_layer() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:42 +02:00
Connor Abbott b7acf38073 ac/nir: Remove gfx9_stride_size_workaround_for_atomic
The workaround was entirely in common code, and it's needed in radeonsi
too so just always do it when necessary. Fixes
KHR-GL45.shader_image_load_store.advanced-allStages-oneImage on gfx9
with LLVM 8.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-26 11:00:49 +02:00