Marek Olšák
1bb2656276
ac: replace HAVE_LLVM with LLVM_VERSION_MAJOR for atomic-optimizations
...
trivial
2019-09-11 10:56:46 -04:00
Samuel Pitoiset
538766792d
radv/gfx10: declare a LDS symbol for the NGG emit space
...
This fixes some interactions when NGG GS is enabled. It fixes:
- dEQP-VK.clipping.user_defined.clip_cull_distance_dynamic_index.*geom*
- dEQP-VK.tessellation.geometry_interaction.passthrough.*
For some reason, using the computed ESGS ring size causes random hangs
with CTS. For now, just use the maximum LDS size for ESGS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:27:01 +02:00
Samuel Pitoiset
168f8dbafa
radv: calculate GFX9 GS and GFX10 NGG states before compiling shader variants
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:58 +02:00
Samuel Pitoiset
e7ee9a6387
radv: store the ESGS ring size as part of gfx10_ngg_info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:53 +02:00
Samuel Pitoiset
7eba5666fa
radv: store GFX10 NGG state as part of the shader info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:51 +02:00
Samuel Pitoiset
349caedee0
radv: store GFX9 GS state as part of the shader info
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:47 +02:00
Samuel Pitoiset
a9af11f1fa
radv: fill shader info for all stages in the pipeline
...
This shouldn't be in NIR->LLVM because ACO also needs the shader
info. This will also help for computing some NGG values that are
necessary for declaring LDS symbols.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:45 +02:00
Samuel Pitoiset
8cf297c7b1
radv: do not pass all compiler options to the shader info pass
...
Only the pipeline layout and the shader keys are needed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:42 +02:00
Marek Olšák
e4c84d8678
radeonsi: move texture storage allocation outside of radeonsi
...
possible code sharing with radv
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
58ccadfc5c
radeonsi: move HTILE allocation outside of radeonsi
...
ac_surface computes it for amdgpu.
radeon_drm_surface computes it for radeon.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
7d4a10a29f
ac/surface: add RADEON_SURF_NO_FMASK
...
This controls FMASK and CMASK computation for MSAA.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
d95afd8b9e
radeonsi/gfx10: fix wave occupancy computations
...
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
d64593e3c4
ac: use fma on gfx10
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Marek Olšák
d979e5bfab
ac: enable LLVM atomic optimizations
2019-09-09 23:43:03 -04:00
Eric Engestrom
5eb7d48b58
radv: add support for vk_x11_override_min_image_count
...
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:16:05 +01:00
Eric Engestrom
4ad99ee961
amd: move adaptive sync to performance section, as it is defined in xmlpool
...
Fixes: 3844ed8d44 ("radv: Add adaptive_sync driconfig option and enable it by default.")
Fixes: e260493f2a ("radeonsi: Enable adaptive_sync by default for radeon")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:16:05 +01:00
Eric Engestrom
19d9e57f2c
amd: replace major llvm version checks with LLVM_VERSION_MAJOR
...
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Samuel Pitoiset
0bf51b6941
radv/gfx10: determine the number of vertices per primitive for TES
...
This doesn't fix anything known but it's correct now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 17:36:49 +02:00
Samuel Pitoiset
c6be5cefba
radv/gfx10: make use of the output usage mask when exporting NGG GS params
...
It shouldn't matter much because output varyings should have been
compacted during NIR shader linking but it mirrors what the driver
does when emitting NGG GS vertex parameters.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 17:25:28 +02:00
Samuel Pitoiset
b1a872f0c0
radv/gfx10: account for the subpass view for the NGG GS storage
...
If the fragment shader needs the layer index, we have to allocate
one more dword in the NGG GS storage. Found by inspection. This
doesn't fix anything known.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 17:25:28 +02:00
Samuel Pitoiset
f31fb33432
radv: calculate esgs_itemsize in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:24 +02:00
Samuel Pitoiset
7fa00e178f
radv: calculate the GSVS vertex size in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:22 +02:00
Samuel Pitoiset
3e8bda66ae
radv: gather primitive ID in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:20 +02:00
Samuel Pitoiset
1877e87f1e
radv: gather layer in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:19 +02:00
Samuel Pitoiset
84b346eda9
radv: gather viewport in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:17 +02:00
Samuel Pitoiset
d21489d415
radv: gather pointsize in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:09 +02:00
Samuel Pitoiset
a99d2d5564
radv: gather clip/cull distances in the shader info pass
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:07 +02:00
Samuel Pitoiset
b16cf6c4c6
radv: move ac_fill_shader_info() to radv_nir_shader_info_pass()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:05 +02:00
Samuel Pitoiset
83499ac765
radv: merge radv_shader_variant_info into radv_shader_info
...
Having two different structs is useless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:03 +02:00
Samuel Pitoiset
fa13b2f002
radv/gfx10: always set ballot_mask_bits to 64
...
The codegen handles it and it adds the correct casts. This fixes
a bunch of LLVM validation errors when enabling Wave32 for compute.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 08:11:43 +02:00
Vasily Khoruzhick
9367d2ca37
nir: allow specifying filter callback in lower_alu_to_scalar
...
A set of opcodes isn't flexible enough in certain cases. E.g. the
Utgard PP has a vector conditional select operation, but its condition is
always scalar. Lowering all vector selects to scalar increases the
instruction count, so we need a way to filter only those ops that can't be
handled in hardware.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-06 01:51:28 +00:00
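The filter idea from 9367d2ca37 above can be sketched in plain C. This is a toy model, not Mesa code: every toy_* type and function is a hypothetical stand-in for its NIR counterpart, and the rule that only selects with a vector condition need scalarizing is an assumption drawn from the Utgard PP example in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for NIR ALU instructions (hypothetical, not Mesa's types). */
enum toy_op { TOY_OP_FADD, TOY_OP_BCSEL };

struct toy_alu_instr {
    enum toy_op op;
    unsigned num_components;  /* width of the result vector */
    unsigned cond_components; /* width of the select condition, 0 if unused */
};

/* Filter callback: return true if the instruction must be scalarized. */
typedef bool (*toy_filter_cb)(const struct toy_alu_instr *instr,
                              const void *data);

/* Assumed hardware rule: vector ops are fine, except a select whose
 * condition is itself a vector. */
static bool toy_needs_scalar(const struct toy_alu_instr *instr,
                             const void *data)
{
    (void)data; /* no per-driver state needed in this sketch */
    return instr->op == TOY_OP_BCSEL && instr->cond_components > 1;
}

/* Count the instructions a lowering pass would scalarize.
 * A NULL callback means "lower everything". */
static size_t toy_count_lowered(const struct toy_alu_instr *instrs, size_t n,
                                toy_filter_cb cb, const void *data)
{
    size_t count = 0;

    assert(instrs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (cb == NULL || cb(&instrs[i], data))
            count++;
    }
    return count;
}
```

Given a vec4 add, a vec4 select with a scalar condition, and a vec4 select with a vec4 condition, a NULL filter would scalarize all three, while toy_needs_scalar would pick only the last one.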
Connor Abbott
3f5b541fc8
radv: Call nir_propagate_invariant()
...
Without this, invariant qualifiers don't do anything. Together with a
fix to the game, this fixes flickering in No Man's Sky.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-09-05 14:05:46 +02:00
Connor Abbott
71a6794200
ac/nir: Enable nir_opt_large_constants
...
vkpipeline-db numbers:
Totals:
SGPRS: 1740306 -> 1741322 (0.06 %)
VGPRS: 1331124 -> 1331712 (0.04 %)
Spilled SGPRs: 21201 -> 21316 (0.54 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 256 -> 256 (0.00 %) dwords per thread
Code Size: 79022628 -> 78694788 (-0.41 %) bytes
LDS: 6500 -> 6500 (0.00 %) blocks
Max Waves: 301413 -> 301302 (-0.04 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 53633 -> 54649 (1.89 %)
VGPRS: 53000 -> 53588 (1.11 %)
Spilled SGPRs: 3454 -> 3569 (3.33 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 5284232 -> 4956392 (-6.20 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Max Waves: 4239 -> 4128 (-2.62 %)
Wait states: 0 -> 0 (0.00 %)
(The biggest VGPR and max wave regression is due to unrolling a loop,
which made the scheduler more aggressive, but in this case it's able to
effectively hide latency so it's actually probably a win.)
shader-db numbers with radeonsi NIR:
Totals:
SGPRS: 3526496 -> 3526512 (0.00 %)
VGPRS: 2198576 -> 2198576 (0.00 %)
Spilled SGPRs: 10463 -> 10463 (0.00 %)
Spilled VGPRs: 86 -> 86 (0.00 %)
Private memory VGPRs: 3182 -> 2528 (-20.55 %)
Scratch size: 3308 -> 2640 (-20.19 %) dwords per thread
Code Size: 74117280 -> 74106140 (-0.02 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 775846 -> 775844 (-0.00 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 856 -> 872 (1.87 %)
VGPRS: 680 -> 680 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 654 -> 0 (-100.00 %)
Scratch size: 668 -> 0 (-100.00 %) dwords per thread
Code Size: 49652 -> 38512 (-22.44 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 182 -> 180 (-1.10 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-05 12:21:46 +02:00
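The percentages in the 71a6794200 statistics above look like plain relative changes, (after - before) / before. A minimal sketch of that arithmetic follows; toy_percent_change is a hypothetical helper, not part of shader-db or vkpipeline-db, and returning 0 for a zero baseline is an assumption modeled on the "0 -> 0 (0.00 %)" rows.

```c
#include <assert.h>

/* Relative change in percent between two non-negative counters. */
static double toy_percent_change(long long before, long long after)
{
    assert(before >= 0 && after >= 0);

    /* A zero baseline has no defined relative change; report 0 here,
     * as the "0 -> 0 (0.00 %)" rows do (assumption). */
    if (before == 0)
        return 0.0;

    return 100.0 * (double)(after - before) / (double)before;
}
```

For the affected-shaders code size row (5284232 -> 4956392) this gives roughly -6.20, which agrees with the figure reported in the commit message.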
Connor Abbott
91626d0865
ac/nir: Support load_constant intrinsics
...
Set up a constant global variable that LLVM will stick in a .rodata
section and generate PC-relative loads for.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-05 12:21:42 +02:00
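A rough C analogue of what 91626d0865 above describes: constant data lives in one read-only table, and a load reads from it at a byte offset. This is only an illustration; toy_constants and toy_load_constant are hypothetical names, and the actual driver builds an LLVM global rather than C source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A const table with static storage; C toolchains typically place such
 * data in a read-only section such as .rodata. */
static const uint32_t toy_constants[] = { 10, 20, 30, 40 };

/* Load one 32-bit value at a byte offset into the table, similar in
 * spirit to a load_constant with an offset operand (hypothetical). */
static uint32_t toy_load_constant(size_t byte_offset)
{
    uint32_t value;

    /* Reject reads that would run past the end of the table. */
    assert(byte_offset + sizeof(value) <= sizeof(toy_constants));
    memcpy(&value, (const unsigned char *)toy_constants + byte_offset,
           sizeof(value));
    return value;
}
```

Calling toy_load_constant(8) reads the third table entry.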
Connor Abbott
5dadbabb47
radv/radeonsi: Don't count read-only data when reporting code size
...
We usually use these counts as a simple way to figure out if a change
reduces the number of instructions or shrinks an instruction. However,
since .rodata sections aren't executed, we shouldn't be counting their
size for this analysis. Make the linker return the total executable
size, and use it to report the more useful size in both drivers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-05 12:21:35 +02:00
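The accounting change in 5dadbabb47 above amounts to summing only executable sections. A minimal sketch under that reading; struct toy_section and toy_exec_size are hypothetical and do not reflect the real linker interface in Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy description of one section in a linked shader binary. */
struct toy_section {
    const char *name;
    uint64_t size;   /* size in bytes */
    bool executable; /* true for code, false for data such as .rodata */
};

/* Sum the sizes of executable sections only, so read-only data does not
 * inflate the reported code size. */
static uint64_t toy_exec_size(const struct toy_section *sections, size_t n)
{
    uint64_t total = 0;

    assert(sections != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (sections[i].executable)
            total += sections[i].size;
    }
    return total;
}
```

With a 1024-byte .text section and a 256-byte .rodata section, the reported size would be 1024 bytes rather than 1280.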
Connor Abbott
2abf62d348
ac/nir: Fix gather4 integer wa with unnormalized coordinates
...
This adds a bit of unnecessary code on radeonsi, since whether
unnormalized coordinates are used is known at compile time with GL, but
I wasn't sure if it was worth the few instructions to plumb everything
through, especially for something so rare -- my shader-db doesn't have
any instances where this changes anything.
Fixes CTS tests I created at
https://github.com/cwabbott0/VK-GL-CTS/tree/unnorm-gather-tests
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-03 13:50:54 +00:00
Connor Abbott
c63ccf90df
ac/nir: Rewrite gather4 integer workaround based on radeonsi
...
The workaround was originally written based on amdgpu-pro traces, but
since then radeonsi has got its own slightly different version. Use the
radeonsi version instead, to be consistent and because it'll be slightly
more convenient for handling unnormalized coordinates.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-03 13:50:54 +00:00
Samuel Pitoiset
6b96c94b5a
radv: keep a pointer to a NIR shader into radv_shader_context
...
This avoids multiple copies for nothing and it's more elegant.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:30 +02:00
Samuel Pitoiset
7b1655ccf3
radv: move setting can_discard to ac_fill_shader_info()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:27 +02:00
Samuel Pitoiset
081561de16
radv: replace ac_nir_build_if by ac_build_ifcc
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:25 +02:00
Samuel Pitoiset
cc3d36b5dd
radv: remove radv_init_llvm_target() helper
...
RADV no longer uses specific LLVM options compared to the common code.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:21 +02:00
Samuel Pitoiset
dc27a54c84
radv: remove useless ac_llvm_util.h include from the WSI code
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:19 +02:00
Samuel Pitoiset
6cb455c418
radv: remove unused shader_info parameter in ac_compile_llvm_module()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:17 +02:00
Samuel Pitoiset
9aaca90123
radv: remove some unused fields from radv_shader_context
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:33:15 +02:00
Samuel Pitoiset
8d44f83844
radv: move lowering PS inputs/outputs to the right place
...
At shader creation, just after NIR linking.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:29:31 +02:00
Samuel Pitoiset
151d6990ec
radv: gather info about PS inputs in the shader info pass
...
It's the right place to do that.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-30 09:29:29 +02:00
Samuel Pitoiset
9f2fd23f99
ac: drop now useless lookup_interp_param from ABI
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-30 08:23:56 +02:00
Samuel Pitoiset
a63719db6a
ac: import linear/perspective PS input parameters from radv/radeonsi
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-30 08:23:54 +02:00
Samuel Pitoiset
b650ecfe31
radv/gfx10: compute the LDS size for exporting PrimID for VS
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-29 16:08:37 +02:00
Marek Olšák
2e94cb6693
radeonsi: add PKT3_CONTEXT_REG_RMW
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-27 16:16:08 -04:00