1018 Commits

Author SHA1 Message Date
Jason Ekstrand dca6cd9ce6 nir: Make boolean conversions sized just like the others
Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is
one of 8, 16, 32, or 64. This leads to having a few more opcodes but
now everything is consistent and booleans aren't a weird special case
anymore.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:03:07 -06:00
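The naming scheme in this commit can be modeled with a short sketch. This is only an illustration in Python, not NIR's implementation: the function signatures, the `VALID_BIT_SIZES` tuple, and the choice to represent a 32-bit true value as all ones are assumptions made for the example.

```python
# Bit sizes named in the commit message for the b2iN family.
VALID_BIT_SIZES = (8, 16, 32, 64)

# Assumption for this sketch: a 32-bit boolean is 0 for false, all ones for true.
TRUE32 = 0xFFFFFFFF
FALSE32 = 0


def i2b32(value: int) -> int:
    """Hypothetical model of i2b32: any non-zero integer becomes a 32-bit true."""
    return TRUE32 if value != 0 else FALSE32


def b2i(boolean: int, bit_size: int) -> int:
    """Hypothetical model of b2iN: a boolean becomes the N-bit integer 0 or 1."""
    if bit_size not in VALID_BIT_SIZES:
        raise ValueError(f"unsupported bit size: {bit_size}")
    result = 1 if boolean != 0 else 0
    # Mask to the destination width to make the sizing explicit.
    return result & ((1 << bit_size) - 1)
```

With the destination size carried by the opcode, a boolean converts to any integer width the same way other sized conversions do.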
Nicolai Hähnle 776b911365 amd/addrlib: update Mesa's copy of addrlib
Update to the internal master as of 2018-11-15.

This has a lot of gratuitous whitespace change, but on the plus
side it's built using the same tooling that's used for AMDVLK,
which should help going forward.
2018-11-29 13:18:24 +01:00
Nicolai Hähnle 621c107760 ac/surface/gfx9: let addrlib choose the preferred swizzle kind
Our choices here are simply redundant as long as sin.flags is set
correctly.

(v2:
- remove unused function parameter)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-11-29 13:18:23 +01:00
Nicolai Hähnle 729ebdf07e radv: remove dependency on addrlib gfx9_enum.h
v2:
- use SI_CONTEXT_REG_OFFSET

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-29 13:18:23 +01:00
Dave Airlie 3486fe655a ac: handle cast derefs
Just give back the same value for now.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 08:54:46 +10:00
Dave Airlie baa4bdd3a6 radv: handle loading from shared pointers
We won't have a var to load from, so don't try to do the processing
if we don't need it.

This avoids crashes in:
dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.compute.workgroup_two_buffers

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 08:54:42 +10:00
Dave Airlie ec9fe8abc7 ac: avoid casting pointers on bcsel and stores
For variable pointers we really don't want to cast the pointers to int
without a good reason, just add a wrapper for bcsel loading and result
storing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 08:54:25 +10:00
Samuel Pitoiset f4563d8f5b ac/nir: fix intrinsic name string size in visit_image_atomic()
Fixes an assertion in SoTTR.

Fixes: dd0172e865 ("radv: Use structured intrinsics instead of indexing workaround for GFX9.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-20 10:23:45 +01:00
Bas Nieuwenhuizen dd0172e865 radv: Use structured intrinsics instead of indexing workaround for GFX9.
These force the index to be used in the instruction so we don't need the
workaround.

Totals:
SGPRS: 1321642 -> 1321802 (0.01 %)
VGPRS: 943664 -> 943788 (0.01 %)
Spilled SGPRs: 28468 -> 28480 (0.04 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 52415292 -> 52338932 (-0.15 %) bytes
LDS: 400 -> 400 (0.00 %) blocks
Max Waves: 233903 -> 233803 (-0.04 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 238344 -> 238504 (0.07 %)
VGPRS: 232732 -> 232856 (0.05 %)
Spilled SGPRs: 13125 -> 13137 (0.09 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 15752712 -> 15676352 (-0.48 %) bytes
LDS: 139 -> 139 (0.00 %) blocks
Max Waves: 31680 -> 31580 (-0.32 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-19 23:36:00 +01:00
Marek Olšák d059eae269 ac/surface: remove the overallocation workaround for Vega12
not needed anymore (probably since the tile_swizzle fix)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-09 14:55:04 -05:00
Samuel Pitoiset f425d9ee74 radv: use LOAD_CONTEXT_REG when loading fast clear values
This avoids syncing the Micro Engine. This is only supported
for VI+ currently. There is probably a way for using
LOAD_CONTEXT_REG on previous chips but that could be done later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-08 10:41:45 +01:00
Timothy Arceri 9aa3c1915e ac/nir_to_llvm: fix b2f for f64
Fixes: d7e0d47b9d ("nir: Add a bunch of b2[if] optimizations")

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-07 16:35:07 +11:00
Jan Vesely 9cab8ccd6c amd: Make vgpr-spilling depend on llvm version
The option was removed in LLVM r345763

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-11-02 10:32:47 -04:00
Samuel Pitoiset 9278089d05 ac/nir: make use of i1false in few more places
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-01 08:49:05 +01:00
Samuel Pitoiset 9ef8ea1451 radv: use WAIT_REG_MEM_GREATER_OR_EQUAL instead of a magic value
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-31 09:21:28 +01:00
Marek Olšák 26cb93e229 radeonsi: add support for Raven2 (v2)
v2: fix enabling primitive binning

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-30 16:03:02 -04:00
Marek Olšák 8676af12c8 ac: fix ac_build_fdiv for f64
trivial

Fixes: a5f35aa742
2018-10-29 17:24:21 -04:00
Samuel Pitoiset b4eb029062 radv: implement VK_EXT_transform_feedback
This implementation should work and potential bugs can be
fixed during the release candidates window anyway.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:10:58 +01:00
Eric Engestrom bb84fa146f util: use C99 declaration in the for-loop hash_table_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-25 12:43:18 +01:00
Leo Liu b75fb8ee36 amd/common: check DRM version 3.27 for JPEG decode
JPEG was added after DRM version 3.26

Signed-off-by: Leo Liu <leo.liu@amd.com>
Fixes: 4558758c51749(amd/common: add vcn jpeg ip info query)
Cc: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2018-10-23 13:12:05 -04:00
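The version gate described here amounts to a major/minor comparison. A minimal sketch in Python, assuming the version is available as two integers; the function name `has_jpeg_decode` and its parameters are hypothetical and do not come from the Mesa source.

```python
def has_jpeg_decode(drm_major: int, drm_minor: int) -> bool:
    """Return True when the DRM version is at least 3.27.

    The commit message states that JPEG was added after DRM 3.26,
    so 3.27 is treated as the first version that supports it.
    """
    # Tuple comparison orders by major first, then minor.
    return (drm_major, drm_minor) >= (3, 27)
```

Comparing the pair as a tuple avoids the mistake of checking the minor number alone, which would give the wrong answer if the major version ever changed.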
Boyuan Zhang 4558758c51 amd/common: add vcn jpeg ip info query
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2018-10-23 08:50:02 -04:00
Connor Abbott 27fe3f5b5a ac: Fix loading a dvec3 from an SSBO
The comment was wrong, since the loop above casts to a type with the
correct bitsize already.

Fixes: 7e7ee82698 ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 09:44:51 +02:00
Connor Abbott 59535b05cf ac: Introduce ac_build_expand()
And implement ac_build_expand_to_vec4() on top of it.

Fixes: 7e7ee82698 ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 09:44:51 +02:00
Marek Olšák bfc795670e ac: add helpers for fast integer division by a constant 2018-10-16 17:23:25 -04:00
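For readers unfamiliar with the technique, division by a constant is commonly replaced by a multiply and a shift. The sketch below shows one textbook variant (a rounded-up multiplier, after Granlund and Montgomery) for unsigned 32-bit values. It is not necessarily the method these helpers use, and all names in it are hypothetical.

```python
N = 32  # width of the dividend in bits (assumption for this sketch)


def precompute(divisor: int) -> tuple[int, int]:
    """Return (multiplier, shift) for an unsigned 32-bit divisor >= 1."""
    if not 1 <= divisor < (1 << N):
        raise ValueError("divisor must be in [1, 2**32)")
    # l = ceil(log2(divisor))
    l = (divisor - 1).bit_length()
    shift = N + l
    # multiplier = ceil(2**shift / divisor); may need up to 33 bits.
    multiplier = ((1 << shift) + divisor - 1) // divisor
    return multiplier, shift


def fast_udiv(dividend: int, multiplier: int, shift: int) -> int:
    """Compute dividend // divisor using one multiply and one shift."""
    return (dividend * multiplier) >> shift
```

The multiplier and shift are computed once per divisor, so every later division by that divisor costs only a multiply and a shift. Hardware implementations have to handle the 33-bit multiplier with extra steps, which Python's arbitrary-precision integers hide here.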
Marek Olšák fedc1fda30 radeonsi: save raster config in screen, add se_tile_repeat 2018-10-16 15:28:22 -04:00
Marek Olšák 0d05581578 radeonsi: rename si_gfx_* functions to si_cp_*
and write_event_eop -> release_mem
2018-10-16 15:28:22 -04:00
Marek Olšák 6e1cf6532d radeonsi: make si_gfx_write_event_eop more configurable 2018-10-16 15:28:22 -04:00
Alex Smith ca83d51cfb ac/nir: Use context-specific LLVM types
LLVMInt*Type() return types from the global context and therefore are
not safe for use in other contexts. Use types from our own context
instead.

Fixes frequent crashes seen when doing multithreaded pipeline creation.

Fixes: 4d0b02bb5a "ac: add support for 16bit load_push_constant"
Fixes: 7e7ee82698 "ac: add support for 16bit buffer loads"
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-16 08:18:24 +01:00
Samuel Pitoiset 416013b4f5 radv: emit the GLC bit for SSBO loads/stores when needed
This fixes some new memory model tests:
dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.*

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108112
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-12 08:42:08 +02:00
Marek Olšák 77903c8cfb ac: add ac_build_round 2018-10-06 21:50:09 -04:00
Marek Olšák fa023f293e ac: correct PKT3_COPY_DATA definitions 2018-10-06 21:50:09 -04:00
Marek Olšák 82f5f89bf6 ac: simplify LLVM alloca helpers 2018-10-06 21:50:09 -04:00
Marek Olšák a668c8d6ba ac: define all address spaces properly 2018-10-06 21:50:09 -04:00
Samuel Pitoiset 5d6a560a29 radv: do not use the availability bit for timestamp queries
It's unnecessary because we can just check if the timestamp
is different from the default value set when a pool is created
or reset. Instead of waiting for the availability bit to
be 1, we have to emit a not equal WAIT_REG_MEM for checking
if the timestamp is ready.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-09-28 09:08:03 +02:00
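The readiness check described above can be sketched as a comparison against a sentinel. This is an illustrative Python model only: the sentinel value `TIMESTAMP_NOT_READY` and the helper names are assumptions, and the list stands in for GPU memory that the real driver polls with WAIT_REG_MEM.

```python
# Assumption: the default value written on pool creation/reset is one that
# a real timestamp never takes.
TIMESTAMP_NOT_READY = 0xFFFFFFFFFFFFFFFF


def reset_pool(slot_count: int) -> list[int]:
    """Create or reset a pool: every slot holds the default value."""
    return [TIMESTAMP_NOT_READY] * slot_count


def write_timestamp(pool: list[int], index: int, value: int) -> None:
    """Stand-in for the GPU writing a timestamp into a slot."""
    pool[index] = value


def timestamp_ready(pool: list[int], index: int) -> bool:
    """A slot is ready once it no longer equals the default value."""
    return pool[index] != TIMESTAMP_NOT_READY
```

Because the slot itself signals readiness, no separate availability bit has to be stored or waited on.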
Samuel Pitoiset cd76ce0078 ac: add 16-bit support to ac_build_bitfield_reverse()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:37 +02:00
Samuel Pitoiset fc398f4d67 ac: add 16-bit support to ac_build_bit_count()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:34 +02:00
Samuel Pitoiset 94dd08eb7c ac: add 16-bit support to ac_find_lsb()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:32 +02:00
Samuel Pitoiset 5a6c8ca3e8 ac: add 16-bit support to ac_build_umsb()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:30 +02:00
Samuel Pitoiset 3e7f3e2cd1 ac: add 16-bit support to ac_build_isign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:28 +02:00
Samuel Pitoiset cfd6314cfe ac: add 16-bit constant values for zero and one
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:26 +02:00
Samuel Pitoiset 074e29183c ac: add ac_build_bitfield_reverse() helper
Are we missing 64-bit support?

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:23 +02:00
Samuel Pitoiset 371c35e5bb ac: add ac_build_bit_count() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-17 15:18:20 +02:00
Timothy Arceri e29f0ede75 ac: fix get_image_coords() for radeonsi
Because this was setting image to true we would end up calling
si_load_image_desc() when we should be calling
si_load_sampler_desc().

This fixes an assert() in Deus Ex: MD

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-09-15 12:23:32 +10:00
Marek Olšák b00deed66f radeonsi: adjust and simplify max_alloc_size determination
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák be0bd95abf radeonsi: fix GPU hangs with bindless textures and LLVM 7.0
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák fa595e3d0c ac: remove deprecated use of LLVMInt1Type()
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák cc36ebbdc3 ac: use iN_0/1 constants
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák bc09c3d59e ac: add radeon_info::num_good_cu_per_sh
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák a5f35aa742 ac: revert new LLVM 7.0 behavior for fdiv
Cc: 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Dave Airlie 2c1f249f2b ac/radeonsi: fix CIK copy max size
While adding transfer queues to radv, I started writing some tests,
the first test I wrote fell over copying a buffer larger than this
limit.

Checked AMDVLK and found the correct limit.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-08-31 15:11:49 +10:00