Bas Nieuwenhuizen
4bb17c08ae
radv/gfx10: Enable DCC for storage images.
...
v2: Hide it behind a perftest flag.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen
3a5950f501
radv: Add device argument for dcc compression check.
...
Because it is about to be generation dependent.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen
8c63ffe54d
radv: Disable compression for compute DCC decompress store.
...
Previously we relied on stores not using DCC but that is going to
change, so disable compression explicitly.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen
216a9d8871
radv: Add extra struct to image view creation.
...
For extra args. Unlike image creation, I'm not embedding the vk
struct in there, so all the inline structs can be kept.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen
50add1b33a
radv: Do not decompress on LAYOUT_GENERAL.
...
We handle render loops properly now and STORAGE still disables
DCC/TC-compat HTILE in general.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen
66131ceb8b
radv: Pass through render loop detection to internal layout decisions.
...
And do nothing with it yet.
Everything outside a renderpass has no render loop.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen
a171a6663d
radv: Add render loop detection in renderpass.
...
VK spec 7.3:
"Applications must ensure that all accesses to memory that backs
image subresources used as attachments in a given renderpass instance
either happen-before the load operations for those attachments, or
happen-after the store operations for those attachments."
So the only render loops we can have are with input attachments. Detect
these.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
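A minimal sketch of the idea behind a171a6663d, under stated assumptions: within one subpass, an input attachment that is also bound as a color or depth/stencil attachment forms a render loop. The struct, the `ATTACHMENT_UNUSED` constant, and the function name below are hypothetical simplifications for illustration, not the radv data structures or the code the commit actually adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an "unused attachment" marker. */
#define ATTACHMENT_UNUSED UINT32_MAX

/* Hypothetical, simplified subpass description (not the radv struct). */
struct subpass_desc {
    const uint32_t *input_attachments;
    uint32_t input_count;
    const uint32_t *color_attachments;
    uint32_t color_count;
    uint32_t depth_stencil_attachment;
};

/* Returns true if the input attachment at input_idx is also bound as a
 * color or depth/stencil attachment in the same subpass. */
static bool
input_attachment_in_render_loop(const struct subpass_desc *subpass,
                                uint32_t input_idx)
{
    assert(subpass && input_idx < subpass->input_count);

    uint32_t att = subpass->input_attachments[input_idx];
    if (att == ATTACHMENT_UNUSED)
        return false;

    for (uint32_t i = 0; i < subpass->color_count; i++) {
        if (subpass->color_attachments[i] == att)
            return true;
    }
    return subpass->depth_stencil_attachment == att;
}
```

Per the spec quote above, accesses outside the render pass instance are ordered against the load and store operations, so a check of this kind only needs to look at attachments referenced inside a subpass.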
Bas Nieuwenhuizen
04c6feb12c
radv: Fix config reg assert.
...
Using the wrong bounds
Fixes: "219d6939df8 radv: add more assertions to make sure packets are correctly emitted"
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 08:58:23 +10:00
Pierre-Eric Pelloux-Prayer
25fff591c1
radeonsi: add support for nir atomic_inc_wrap/atomic_dec_wrap
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:06 -04:00
Pierre-Eric Pelloux-Prayer
704a6b5948
ac: add ac_atomic_inc_wrap / ac_atomic_dec_wrap support
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:03 -04:00
Marek Olšák
f818d9ae3c
radeonsi/nir: handle key.mono.u.ps.interpolate_at_sample_force_center
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-08-06 17:08:39 -04:00
Bas Nieuwenhuizen
2af00b1fdd
ac/nir: Use correct cast for readfirstlane and ptrs.
...
Fixes: 028ce527 "radv: Add non-uniform indexing lowering."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-06 15:48:50 +00:00
Bas Nieuwenhuizen
2301b2e029
radv: Do non-uniform lowering before bool lowering.
...
Since it can introduce comparisons.
Fixes: 028ce52739 "radv: Add non-uniform indexing lowering."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-06 15:48:50 +00:00
Connor Abbott
74470baebb
ac/nir: Lower large indirect variables to scratch
...
results from radeonsi NIR:
Totals from affected shaders:
SGPRS: 704 -> 464 (-34.09 %)
VGPRS: 2056 -> 672 (-67.32 %)
Spilled SGPRs: 24 -> 0 (-100.00 %)
Spilled VGPRs: 28406 -> 0 (-100.00 %)
Private memory VGPRs: 0 -> 3182 (0.00 %)
Scratch size: 1064 -> 3228 (203.38 %) dwords per thread
Code Size: 935260 -> 40180 (-95.70 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 28 -> 70 (150.00 %)
Wait states: 0 -> 0 (0.00 %)
results from radv:
Totals from affected shaders:
SGPRS: 80 -> 48 (-40.00 %)
VGPRS: 204 -> 108 (-47.06 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 256 (0.00 %) dwords per thread
Code Size: 15792 -> 9504 (-39.82 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1 -> 2 (100.00 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-05 11:45:18 +02:00
Eric Engestrom
9a07606b84
meson: replace last uses of libxmlconfig with idep_xmlconfig
...
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-08-03 00:08:37 +00:00
Eric Engestrom
d2d85b950d
meson: replace libmesa_util with idep_mesautil
...
This automates the include_directories and dependencies tracking so that
all users of libmesa_util don't need to add them manually.
Next commit will remove the ones that were only added for that reason.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2019-08-03 00:08:37 +00:00
Bas Nieuwenhuizen
2d54fdb563
radv: Expose VK_KHR_imageless_framebuffer.
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:35:25 +02:00
Bas Nieuwenhuizen
9475782eac
radv: Implement VK_KHR_imageless_framebuffer.
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:35:19 +02:00
Bas Nieuwenhuizen
a7041f3b4e
radv: Store image view also outside framebuffer.
...
So we can use it with imageless framebuffers.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:19:16 +02:00
Bas Nieuwenhuizen
49e6c2fb78
radv: Store color/depth surface info in attachment info instead of framebuffer.
...
That way we can use it for imageless framebuffers.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:18:51 +02:00
Bas Nieuwenhuizen
72e7b7a00b
ac/nir,radv: Optimize bounds check for 64 bit CAS.
...
When the application does not ask for robust buffer access.
Only implemented the check in radv.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 21:21:55 +02:00
Samuel Pitoiset
e8110e51c6
radv: fix image_has_{cmask,fmask}() helpers
...
The driver should now rely on cmask_offset because CMASK can be
disabled by the driver in some cases (e.g. mipmaps). Apply the
same change for FMASK, although it should be unnecessary there.
Fixes: ad1bc8621d ("radv: remove radv_get_image_fmask_info()")
Fixes: 10d08da52c ("radv/gfx10: add missing dcc_tile_swizzle tweak")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 14:00:50 +02:00
Samuel Pitoiset
ad1bc8621d
radv: remove radv_get_image_fmask_info()
...
It's unnecessary to duplicate fields in another struct.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 13:34:46 +02:00
Samuel Pitoiset
10d08da52c
radv/gfx10: add missing dcc_tile_swizzle tweak
...
Fixes: c90f46700d ("radv/gfx10: mask DCC tile swizzle by alignment")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 13:34:43 +02:00
Samuel Pitoiset
9c9745e8dd
radv: remove radv_get_image_cmask_info()
...
It's unnecessary to duplicate fields in another struct.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 13:34:41 +02:00
Samuel Pitoiset
856487a280
radv: only account for tile_swizzle for color surfaces with DCC
...
It's 0 for depth surfaces with TC compat HTILE enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 13:34:39 +02:00
Bas Nieuwenhuizen
e1c5d8a364
radv: Enable VK_KHR_shader_atomic_int64
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 12:26:32 +02:00
Bas Nieuwenhuizen
a17f2206d3
ac/nir: Implement LLVM9 64-bit buffer compare & exchange.
...
LLVM 9 does not have a 64-bit buffer compswap intrinsic, so this
extracts the ptr, does a bound check and then uses a cmpxchg LLVM
instruction.
Not ideal, but the earliest release we're going to get a proper
intrinsic is LLVM 10.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 12:26:11 +02:00
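The approach described in a17f2206d3 (extract the pointer, do a bounds check, then issue a cmpxchg) can be illustrated on the CPU with C11 atomics. This is only a sketch of the control flow, not the LLVM IR that ac/nir emits: `struct buffer_view`, `buffer_cmpxchg_u64`, and the choice to return 0 for an out-of-bounds access are all assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer descriptor: base pointer plus size in bytes. */
struct buffer_view {
    _Atomic uint64_t *base;
    size_t size_bytes;
};

/* Compare-and-swap a 64-bit value at a byte offset into the buffer.
 * Returns the value that was in memory before the operation, or 0 if the
 * access would fall outside the buffer (an assumption of this sketch). */
static uint64_t
buffer_cmpxchg_u64(struct buffer_view buf, size_t offset,
                   uint64_t expected, uint64_t desired)
{
    /* This sketch requires naturally aligned offsets. */
    assert(offset % sizeof(uint64_t) == 0);

    /* Bounds check: skip the atomic entirely when out of range. */
    if (offset > buf.size_bytes ||
        buf.size_bytes - offset < sizeof(uint64_t))
        return 0;

    _Atomic uint64_t *ptr = &buf.base[offset / sizeof(uint64_t)];

    /* On failure, 'expected' is overwritten with the current value, so it
     * holds the previous memory contents in both outcomes. */
    atomic_compare_exchange_strong(ptr, &expected, desired);
    return expected;
}
```

Commit 72e7b7a00b above notes that the bounds check can be optimized when the application does not ask for robust buffer access; in this sketch that would amount to dropping the range test.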
Connor Abbott
73274c9ec2
Revert "ac/nir: handle negate modifier"
...
This reverts commit bfea7e4d29.
2019-08-02 11:14:50 +02:00
Connor Abbott
4a382d66ee
Revert "ac/nir: handle abs modifier"
...
This reverts commit d3c80733cd.
These were only appearing due to memory corruption.
2019-08-02 11:14:08 +02:00
Samuel Pitoiset
7368000868
radv: re-apply "Optimize rebinding the same descriptor set."
...
This makes it cheaper to just change the dynamic offsets with
the same descriptor sets.
This optimization was reverted a while back because of
random GPU hangs on GFX9. Now it looks fine: at least CTS no longer
hangs on GFX9, and it doesn't hang on GFX10 either.
It fixes a performance problem with Wolfenstein Youngblood.
Suggested-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2019-08-02 09:56:55 +02:00
Samuel Pitoiset
96a5445559
radv/gfx10: use the correct target machine for Wave32
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 09:37:38 +02:00
Samuel Pitoiset
8a86908e9a
radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders
...
It can be enabled with RADV_PERFTEST=gewave32.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 09:37:36 +02:00
Samuel Pitoiset
953bbacc23
radv/gfx10: add Wave32 support for fragment shaders
...
It can be enabled with RADV_PERFTEST=pswave32.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 09:37:34 +02:00
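The Wave32 commits in this log gate each shader stage behind a RADV_PERFTEST value (gewave32, pswave32, cswave32). As a rough illustration of how a comma-separated flag list of that kind can be tested for one flag, here is a small sketch. The helper name `flag_list_contains` is hypothetical, the comma-separated combination of flags is an assumption of the sketch, and this is not the parsing code Mesa itself uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if the comma-separated 'list' contains 'flag' as a whole
 * token. A NULL or empty list contains nothing. */
static bool
flag_list_contains(const char *list, const char *flag)
{
    assert(flag != NULL);

    size_t flag_len = strlen(flag);
    if (!list || flag_len == 0)
        return false;

    while (*list) {
        /* Length of the current token, up to the next comma or the end. */
        size_t tok_len = strcspn(list, ",");
        if (tok_len == flag_len && memcmp(list, flag, flag_len) == 0)
            return true;

        list += tok_len;
        if (*list == ',')
            list++;
    }
    return false;
}
```

A caller could pass the result of `getenv("RADV_PERFTEST")` as `list`. Matching whole tokens avoids treating a hypothetical shorter flag name as enabled just because it is a substring of `pswave32`.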
Samuel Pitoiset
c66021069e
radv/gfx10: implement a GE bug workaround
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-31 12:14:29 +02:00
Samuel Pitoiset
9a3fc7b6fa
radv/gfx10: remove an obsolete VGT_REUSE_OFF workaround
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-31 12:14:29 +02:00
Samuel Pitoiset
bb8f25233a
radv/gfx10: disable LATE_ALLOC_GS on Navi14
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-31 12:14:29 +02:00
Samuel Pitoiset
e041a74588
radv/gfx10: implement a bug workaround for GE_PC_ALLOC
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-31 12:14:29 +02:00
Samuel Pitoiset
0e1724af61
radv/gfx10: implement a bug workaround for NGG -> legacy transitions
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-31 12:14:29 +02:00
Samuel Pitoiset
29cca5f381
radv: skip draw calls with 0-sized index buffers
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-31 12:14:29 +02:00
Eric Engestrom
abc226cf41
tree-wide: replace MAYBE_UNUSED with ASSERTED
...
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-31 09:41:05 +01:00
Eric Engestrom
aed15fa799
radv: drop incorrect MAYBE_UNUSED
...
`compressed` is clearly always used on the line right after.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Samuel Pitoiset
ea38565011
radv/gfx10: add Wave32 support for compute shaders
...
It can be enabled with RADV_PERFTEST=cswave32.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-31 09:35:04 +02:00
Marek Olšák
033c39a660
ac/nir: fix incorrect Phis if callbacks use control flow inside control flow
2019-07-30 22:06:23 -04:00
Marek Olšák
d3c80733cd
ac/nir: handle abs modifier
2019-07-30 22:06:23 -04:00
Marek Olšák
efe2d8c5f9
ac: fix a memory leak in the error path of ac_build_type_name_for_intr
2019-07-30 22:06:23 -04:00
Marek Olšák
f6eca14f1b
ac: allow control flow statements in NIR callbacks
...
This fixes a crash when compiling geometry shaders on radeonsi.
2019-07-30 22:06:23 -04:00
Marek Olšák
bfea7e4d29
ac/nir: handle negate modifier
2019-07-30 22:06:23 -04:00
Marek Olšák
9234275320
radeonsi/nir: implement FBFETCH for KHR_blend_equation_advanced
2019-07-30 22:06:23 -04:00
Marek Olšák
17021efc74
radeonsi: adjust RB+ blend optimization settings
...
based on PAL
2019-07-30 22:06:23 -04:00