Marek Olšák
4cf674c8f7
ac/surface: add a wrapper structure to hold ADDR_HANDLE
...
and more things in the future.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398 >
2020-06-10 15:35:46 +00:00
Marek Olšák
e6996d6fbd
amd/addrlib: remove unused members of ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398 >
2020-06-10 15:35:46 +00:00
Marek Olšák
a99f4d5382
amd/addrlib: don't recompute DCC info for every ComputeDccAddrFromCoord call
...
This decreases the DCC retile map overhead from 23% to 18%.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398 >
2020-06-10 15:35:46 +00:00
Marek Olšák
a1b9eb62f6
ac/surface: don't recompute the DCC retile map for imported textures
...
The retile map is not used in this case, and the retile map computation
takes 39% of CPU time when resizing a window.
This brings it down to 23%.
The dcc_retile_use_uint16 setting has to be derived from DCC sizes.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398 >
2020-06-10 15:35:46 +00:00
Rhys Perry
1b2e1163b2
aco: fix moving sub-dword values out of a register for a fixed definition
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Rhys Perry
edf863d1d2
aco: use Info::definition_size instead of definition's regclass
...
16-bit abs/neg creates v_xor_b32/v_and_b32 with v2b definitions. These
instructions never do partial writes without SDWA.
No shader-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
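For the 16-bit abs/neg commit above, the bit operations behind v_xor_b32/v_and_b32 can be sketched in Python. This is only an illustration of the IEEE 754 half-precision layout (sign bit 0x8000), not ACO code; the helper names f16_bits, f16_from_bits, f16_neg and f16_abs are hypothetical.

```python
import struct


def f16_bits(value):
    """Return the 16-bit IEEE 754 half-precision encoding of value."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def f16_from_bits(bits):
    """Decode the low 16 bits of an integer as a half-precision float."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


def f16_neg(value):
    # Negation flips the sign bit: an XOR with 0x8000.
    return f16_from_bits(f16_bits(value) ^ 0x8000)


def f16_abs(value):
    # Absolute value clears the sign bit: an AND with 0x7FFF.
    return f16_from_bits(f16_bits(value) & 0x7FFF)
```

Both operations touch only the sign bit, which is why a plain bitwise instruction is enough to implement them.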
Rhys Perry
207c35cbe8
aco: add Info::{operand_size,definition_size}
...
No shader-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Rhys Perry
62ea429a99
aco: prefer 4-byte aligned definitions
...
shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
CodeSize: 811984 -> 806224 (-0.71%)
Instrs: 155733 -> 155939 (+0.13%); split: -0.04%, +0.18%
Cycles: 1982568 -> 1984400 (+0.09%); split: -0.06%, +0.15%
VMEM: 7187 -> 7121 (-0.92%); split: +0.86%, -1.78%
SMEM: 1770 -> 1769 (-0.06%)
VClause: 1475 -> 1476 (+0.07%)
Copies: 12406 -> 12606 (+1.61%); split: -0.46%, +2.07%
Branches: 5901 -> 5900 (-0.02%); split: -0.25%, +0.24%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Rhys Perry
56345b8c61
aco: allow reading/writing upper halves/bytes when possible
...
Use SDWA, opsel or a different opcode to achieve this.
shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
VGPRs: 3424 -> 3416 (-0.23%)
CodeSize: 811124 -> 811984 (+0.11%); split: -0.12%, +0.23%
Instrs: 156638 -> 155733 (-0.58%)
Cycles: 1994180 -> 1982568 (-0.58%); split: -0.59%, +0.00%
VMEM: 7019 -> 7187 (+2.39%); split: +3.45%, -1.05%
SMEM: 1771 -> 1770 (-0.06%); split: +0.06%, -0.11%
VClause: 1477 -> 1475 (-0.14%)
Copies: 13216 -> 12406 (-6.13%)
Branches: 5942 -> 5901 (-0.69%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Rhys Perry
98060ba0f0
aco: p_extract_vector in 64-bit u2f16/i2f16
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Rhys Perry
d9cfb8ad48
aco: validate instructions reading/writing upper halves/bytes
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Pierre-Eric Pelloux-Prayer
8275dc1ed5
ac/surface: fix epitch when modifying surf_pitch
...
This is needed otherwise it can cause bad rendering of UYVY files.
The align(..., 256 / surf->bpe) constraint comes from addrlib.
Fixes: 69aadc4933 ("radeonsi: fix surf_pitch for subsampled surface")
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5314 >
2020-06-10 09:11:23 +00:00
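For the epitch fix above, the align(..., 256 / surf->bpe) constraint can be sketched as follows. This is a minimal sketch that assumes the pitch is counted in elements and that bytes per element is a power of two; aligned_pitch is a hypothetical helper, not a function from ac/surface.

```python
def align(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    assert alignment > 0 and alignment & (alignment - 1) == 0
    return (value + alignment - 1) & ~(alignment - 1)


def aligned_pitch(pitch_in_elements, bytes_per_element):
    # Assumption: 256 bytes per row granule, so the pitch in elements
    # must be a multiple of 256 / bytes_per_element.
    return align(pitch_in_elements, 256 // bytes_per_element)
```

Under these assumptions a 2-byte-per-element surface with a pitch of 100 elements would be padded to 128 elements.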
Pierre-Eric Pelloux-Prayer
e9826a1bb2
ac/surface: set SCANOUT if surf->is_displayable
...
Fixes: ba10fb3f7f ("radeonsi: preserve the scanout flag for shared resources on gfx9 and gfx10")
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5314 >
2020-06-10 09:11:23 +00:00
Samuel Pitoiset
9b58c4958b
ac/nir: fix integer comparisons with pointers
...
If we get a comparison between a pointer and an integer, LLVM
complains if the operands aren't of the same type.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3085
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5397 >
2020-06-10 08:18:22 +00:00
Samuel Pitoiset
64f2d45c3b
radv/aco: enable shaderInt8 and VK_KHR_shader_float16_int8 on GFX6-GFX7
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Samuel Pitoiset
be4dd6abd1
radv/aco: enable shaderInt16 on GFX6-GFX7
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Samuel Pitoiset
b3aee3aa23
radv/aco: enable 8-bit/16-bit storage on GFX6-GFX7
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
5cde4989d3
aco: remove unnecessary split- and create_vector instructions for subdword loads
...
This helps GFX6/7 by removing unnecessary shuffle code.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Samuel Pitoiset
5446e3cf2e
aco: fix alignment of vectors with 4 elements
...
I think this case was just missing.
This fixes a bunch of 16-bit storage related CTS failures like
dEQP-VK.ssbo.phys.layout.single_basic_type.std430.u16vec4.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Samuel Pitoiset
c7bd0f8cd5
aco: implement 8-bit/16-bit conversions on GFX6-GFX7
...
Use v_bfe to implement small bitsize conversions because the
compiler probably optimizes this better.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
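For the v_bfe-based conversions above, a bitfield extract can be modeled in Python roughly like this. The functions below are an assumed model of unsigned and signed bitfield extraction for widths 1 to 31, not the ACO implementation or a hardware reference.

```python
def bfe_u32(value, offset, width):
    """Extract width bits starting at offset, zero-extended."""
    return ((value & 0xFFFFFFFF) >> offset) & ((1 << width) - 1)


def bfe_i32(value, offset, width):
    """Extract width bits starting at offset, sign-extended."""
    field = bfe_u32(value, offset, width)
    sign_bit = 1 << (width - 1)
    # Flipping and subtracting the sign bit sign-extends the field.
    return (field ^ sign_bit) - sign_bit
```

An 8-bit to 32-bit unsigned conversion then corresponds to bfe_u32(x, 0, 8), and the signed variant to bfe_i32(x, 0, 8).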
Daniel Schürmann
db957f9135
aco: optimize packing of 16bit subdword registers on GFX6/7
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
2a51840c52
aco: skip partial copies on first iteration when lowering to hw
...
Helps some Detroit: Become Human shaders.
Totals from affected shaders: (VEGA)
Code Size: 47693912 -> 47670212 (-0.05 %) bytes
Instructions: 9183788 -> 9177863 (-0.06 %)
Copies: 910052 -> 904127 (-0.65 %)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
1d6f667193
aco: coalesce copies more aggressively when lowering to hw
...
Helps some Detroit: Become Human shaders.
Totals from affected shaders: (VEGA)
Code Size: 9880420 -> 9879088 (-0.01 %) bytes
Instructions: 1918553 -> 1918220 (-0.02 %)
Copies: 177783 -> 177450 (-0.19 %)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
b21d2d9a9f
aco: add and use scratch SGPR to lower subdword p_create_vector on GFX6/7
...
This is needed to lower some corner cases correctly,
in case the same operand occurs multiple times:
e.g. v0 = p_create_vector(v0[0:8], v0[0:8], v0[0:8], v0[0:8])
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
9e8e12ea6d
aco: adjust GFX6 subdword lowering workarounds for 8bit
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
b083581010
aco: Workarounds subdword lowering on GFX6/7
...
As there are no SDWA instructions, we need to take care not to overwrite
the upper bits of other copy_operation's operands.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
942e3c40c3
aco: use full-register instructions to implement subdword packing on GFX6/7
...
On GFX6/7, there are no SDWA instructions.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
3f03db848d
aco: simplify statistics collection for copies
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
0560831593
aco: fix register assignment for p_create_vector on GFX6/7
...
In case some operand was already placed in the definition space,
it could happen that it wasn't considered for live-range splits.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Marek Olšák
9538b9a68e
radeonsi: add support for Sienna Cichlid
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Marek Olšák
789cdab3b6
ac: align num_vgprs for gfx10.3
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Marek Olšák
2cc4bfbe01
radeonsi: don't set any XNACK options on gfx10.3
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Marek Olšák
788696c7b2
radeonsi: implement R9G9B9E5 render target and image store support on gfx10.3
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Marek Olšák
a54bcb9429
radeonsi: enable larger SDMA clears and copies on gfx10.3
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Marek Olšák
abe89e1329
ac/surface: add displayable DCC code for gfx10.3
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Marek Olšák
a23802bcb9
ac,radeonsi: start adding support for gfx10.3
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Marek Olšák
a1602516d7
ac,radeonsi: replace == GFX10 with >= GFX10 where it's needed
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5383 >
2020-06-09 16:17:36 +00:00
Marek Olšák
ceaf848c56
radeonsi: enable ARB_sparse_buffer
...
This seems to be working now, but it wasn't working before.
I don't know what fixed this. Tested on Raven and Navi14.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5402 >
2020-06-09 16:00:38 +00:00
Samuel Pitoiset
d7923c74d4
radv/llvm: expose VK_EXT_shader_demote_to_helper_invocation with LLVM 9+
...
It should already work with the LLVM backend.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5361 >
2020-06-09 08:04:23 +02:00
Rhys Perry
43e69475ad
aco: use v_xor3_b32
...
fossil-db (Navi):
Totals from 334 (0.26% of 128321) affected shaders:
CodeSize: 3345532 -> 3345484 (-0.00%); split: -0.00%, +0.00%
Instrs: 624662 -> 622778 (-0.30%); split: -0.30%, +0.00%
Mostly affects some parallel-rdp shaders
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5357 >
2020-06-08 13:20:01 +00:00
Rhys Perry
1234faa7bf
ac/gpu_info, radv: set max_wave64_per_simd to 20 on GFX10
...
Fixes RADV max_waves reporting for GFX10
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5356 >
2020-06-08 10:26:59 +00:00
Samuel Pitoiset
008b0d1701
ac/nir: adjust an assertion for D16 on GFX6-GFX7
...
16-bit types can be used with MUBUF on GFX6-GFX7.
Fixes: c3e0ba52a0 ("ac/nir: support 16-bit data in buffer_load_format opcodes")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5325 >
2020-06-08 08:45:32 +02:00
Vinson Lee
faa339e666
Switch from cElementTree to ElementTree.
...
The xml.etree.cElementTree module will be removed in Python 3.9. It has
been deprecated since Python 3.3, and since that release the
xml.etree.ElementTree module uses a fast implementation whenever
available.
Builds using Python 2.7 can still work but with the slower
implementation.
Signed-off-by: Vinson Lee <vlee@freedesktop.org >
Acked-by: Eric Engestrom <eric@engestrom.ch >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5349 >
2020-06-05 23:42:54 -07:00
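For the cElementTree switch above, the replacement import looks like the sketch below. The enum_names helper and the XML shape it expects are hypothetical and only serve to show that the parsing API is unchanged.

```python
# Replaces: import xml.etree.cElementTree as ET
import xml.etree.ElementTree as ET


def enum_names(xml_text):
    """Return the name attribute of every <enum> element, in document order."""
    root = ET.fromstring(xml_text)
    return [element.get("name") for element in root.iter("enum")]
```

Because both modules expose the same interface, only the import line needs to change in calling scripts.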
Rhys Perry
5d13c7477e
radv: set keep_statistic_info with RADV_DEBUG=shaderstats
...
Needed for RADV_DEBUG=shaderstats to dump ACO statistics.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5358 >
2020-06-05 15:11:01 +00:00
Samuel Pitoiset
bfff330f06
radv/aco: enable VK_KHR_shader_subgroup_extended_types on GFX6-GFX7
...
CTS passes on Pitcairn (GFX6). This extension isn't really useful
without 8-bit/16-bit storage, but that is going to be exposed
soon.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327 >
2020-06-05 16:04:08 +02:00
Samuel Pitoiset
6391f9ab4c
aco: fix nir_intrinsic_quad_* with 8-bit in GFX6-GFX7
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327 >
2020-06-05 16:04:06 +02:00
Samuel Pitoiset
e1523b34c2
aco: fix sign-extend 8-bit subgroup operations on GFX6-GFX7
...
SDWA is GFX8+.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327 >
2020-06-05 16:04:05 +02:00
Samuel Pitoiset
ee4bc13de2
aco: use v_bfe_u32 for unsigned reductions sign-extension on GFX6-GFX7
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327 >
2020-06-05 16:04:03 +02:00
Bas Nieuwenhuizen
c67ef7695a
radv: Use ac_surface to allocate aux surfaces.
...
For consistency and a bunch of code sharing.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5194 >
2020-06-05 13:27:55 +00:00
Bas Nieuwenhuizen
63db31fdfc
amd/common: Add total alignment calculation.
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5194 >
2020-06-05 13:27:55 +00:00