Rhys Perry
962c917cea
aco: move MIMG VDATA to its own operand
...
We will want both a VDATA operand and a sampler for some TFE/LWE MIMG
instructions.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775 >
2021-01-08 14:27:07 +00:00
Rhys Perry
2aaf52bb85
aco: fix MIMG_instruction::lwe comment
...
The ISA docs were inconsistent about what this flag does, but that seems
fixed in the RDNA doc.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775 >
2021-01-08 14:27:07 +00:00
Rhys Perry
816b7fb5cb
aco: fix unreachable() for uniform 8/16-bit nir_op_mov from VGPR
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Fixes: d20a752c0d ("aco: use Builder::copy more")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8380 >
2021-01-08 12:54:36 +00:00
Samuel Pitoiset
f40a7d3c93
radv: fix performance regression by restoring TC-compat HTILE in GENERAL
...
This fixes a performance regression for games (eg. Youngblood) that
declare all images as concurrent. This is likely buggy for compute
queues but this just restores the previous behaviour for now.
Fixes: f4f096805b ("radv: fix TC-compat HTILE images with DST_OPTIMAL on the compute queue")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8351 >
2021-01-08 09:22:32 +00:00
Samuel Pitoiset
0ae1cf46a6
radv: fix enabling TC-compat HTILE in GENERAL for writes on GFX10+
...
It wasn't expected to also enable inside render loops.
Fixes: 4bb92d9145 ("radv: enable TC-compat HTILE in GENERAL on GFX10+")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8351 >
2021-01-08 09:22:32 +00:00
Samuel Pitoiset
20683461e3
radv: configure the texture descriptor for TC-compat CMASK on GFX10+
...
This was missing, it can be enabled with RADV_PERFTEST=tccompatcmask.
Note that this feature is still experimental.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8350 >
2021-01-08 08:21:17 +01:00
Samuel Pitoiset
d2f4934121
radv/llvm,aco: always split typed vertex buffer loads on GFX6 and GFX10+
...
To avoid any alignment issues that triggers memory violations and
eventually a GPU. This can happen if the stride (static or dynamic)
is unaligned and also if the VBO offset is aligned to scalar
(eg. stride is 8 and VBO offset is 2 for R16G16B16A16_SNORM).
The AMD Windows driver also always splits typed vertex fetches.
fossils-db (Sienna Cichlid):
Totals from 56508 (40.54% of 139391) affected shaders:
SGPRs: 2643545 -> 2664516 (+0.79%); split: -0.19%, +0.98%
VGPRs: 2007472 -> 1995408 (-0.60%); split: -0.74%, +0.13%
CodeSize: 70596372 -> 73913312 (+4.70%); split: -0.00%, +4.70%
MaxWaves: 772653 -> 774916 (+0.29%); split: +0.37%, -0.08%
Instrs: 14074162 -> 14567072 (+3.50%); split: -0.00%, +3.51%
Cycles: 69281276 -> 71253252 (+2.85%); split: -0.00%, +2.85%
VMEM: 22047039 -> 25554196 (+15.91%); split: +17.20%, -1.29%
SMEM: 4120370 -> 4360820 (+5.84%); split: +7.41%, -1.58%
VClause: 416913 -> 438361 (+5.14%); split: -1.86%, +7.01%
SClause: 536739 -> 542637 (+1.10%); split: -0.33%, +1.43%
Copies: 977194 -> 970015 (-0.73%); split: -2.43%, +1.69%
Branches: 241205 -> 241193 (-0.00%); split: -0.06%, +0.06%
PreVGPRs: 1505645 -> 1505379 (-0.02%)
This fixes GPU hangs with bin/draw-vertices from Piglit on GFX10+
with Zink.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8363 >
2021-01-07 17:28:00 +00:00
Samuel Pitoiset
68c2537062
aco: fix creating the dest vector when 16-bit vertex fetches are splitted
...
Compute the number of components of the destination vector from the
bitsize when eg. a 16-bit vec2 vertex fetches is splitted. This is
because the dst will be a v1, so the p_create_vector should be created
from two v2b fro both sizes to match.
This prevents a regression from the next change which will split
typed vertex buffer loads on GFX6 and GFX10+.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8363 >
2021-01-07 17:28:00 +00:00
Rhys Perry
f5adf27fb9
nir,radv: add and use nir_vectorize_tess_levels()
...
fossil-db (Sienna):
Totals from 1342 (0.97% of 138791) affected shaders:
CodeSize: 3287996 -> 3269572 (-0.56%); split: -0.56%, +0.00%
Instrs: 629896 -> 628191 (-0.27%); split: -0.31%, +0.04%
Cycles: 2619244 -> 2612424 (-0.26%); split: -0.30%, +0.04%
VMEM: 388807 -> 389273 (+0.12%); split: +0.14%, -0.02%
SMEM: 90655 -> 90700 (+0.05%); split: +0.06%, -0.01%
VClause: 21831 -> 21812 (-0.09%)
PreVGPRs: 44155 -> 44058 (-0.22%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202 >
2021-01-07 16:34:53 +00:00
Rhys Perry
bfc777f83e
radv: vectorize shader I/O
...
Fixes code size regressions after enabling TCS/TES for ACO.
fossil-db (Sienna):
Totals from 2588 (1.86% of 138791) affected shaders:
SGPRs: 109950 -> 108480 (-1.34%); split: -1.43%, +0.09%
VGPRs: 107764 -> 112060 (+3.99%); split: -0.03%, +4.02%
CodeSize: 5957760 -> 5321656 (-10.68%)
MaxWaves: 31718 -> 30358 (-4.29%); split: +0.03%, -4.32%
Instrs: 1116300 -> 1029000 (-7.82%)
Cycles: 4600344 -> 4251072 (-7.59%)
VMEM: 980024 -> 812978 (-17.05%); split: +1.14%, -18.18%
SMEM: 275458 -> 258227 (-6.26%); split: +2.34%, -8.60%
VClause: 42925 -> 30533 (-28.87%); split: -31.02%, +2.15%
SClause: 31554 -> 31362 (-0.61%); split: -1.79%, +1.18%
Branches: 15689 -> 15697 (+0.05%)
PreVGPRs: 80399 -> 83953 (+4.42%); split: -0.00%, +4.42%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202 >
2021-01-07 16:34:53 +00:00
Rhys Perry
f199b7188b
nir/load_store_vectorize: add data as callback args
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202 >
2021-01-07 16:34:53 +00:00
Rhys Perry
00c8bec47b
nir: add nir_load_store_vectorize_options
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202 >
2021-01-07 16:34:53 +00:00
Rhys Perry
cacce76db9
radv: workaround games which assume full subgroups if cswave32 is enabled
...
This assumption becomes incorrect with RADV_PERFTEST=cswave32.
Games include Detroit: Become Human and Doom Eternal.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7918 >
2021-01-07 15:01:02 +00:00
Rhys Perry
5bb94ab050
radv: implement CREATE_REQUIRE_FULL_SUBGROUPS_BIT with cswave32
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7918 >
2021-01-07 15:01:02 +00:00
Michel Dänzer
1de2fd0cf2
wsi/x11: Always link against xcb-xrandr
...
The next commit will make use of it even without
VK_USE_PLATFORM_XLIB_XRANDR_EXT.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8197 >
2021-01-07 14:57:45 +01:00
Pierre-Eric Pelloux-Prayer
df5233b977
ac/sqtt: move radv_get_expected_buffer_size to ac
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002 >
2021-01-07 10:10:16 +01:00
Pierre-Eric Pelloux-Prayer
ea6176e63e
ac/sqtt: move ac_is_thread_trace_complete to ac
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002 >
2021-01-07 10:10:14 +01:00
Pierre-Eric Pelloux-Prayer
ffdfe136e6
ac/sqtt: move rgp/sqtt def to ac
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002 >
2021-01-07 10:09:57 +01:00
Pierre-Eric Pelloux-Prayer
4ec5cf5318
ac/radv: move radv_rgp.c to ac
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002 >
2021-01-07 10:09:49 +01:00
Pierre-Eric Pelloux-Prayer
bbc245ab2e
ac/radv: move sqtt structs and helpers to amd/common
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002 >
2021-01-07 10:09:47 +01:00
Pierre-Eric Pelloux-Prayer
04f6ba113c
ac/sqtt: add ac_thread_trace_data
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002 >
2021-01-07 10:09:45 +01:00
Rhys Perry
1fd8b46667
nir,spirv: add sparse image loads
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774 >
2021-01-06 20:36:38 +00:00
Samuel Pitoiset
4bb92d9145
radv: enable TC-compat HTILE in GENERAL on GFX10+
...
GFX10+ supports compressed writes to HTILE, so it should just work
to skip decompressions when transitioning from/to GENERAL.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039 >
2021-01-05 12:10:11 +00:00
Samuel Pitoiset
326c7312bf
radv: only load the DS fast clear values for compressed rendering
...
Otherwise it's useless because we are unlikely to perform a
fast depth stencil clear.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039 >
2021-01-05 12:10:11 +00:00
Samuel Pitoiset
76e33d528b
radv: clean up radv_layout_is_htile_compressed()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039 >
2021-01-05 12:10:11 +00:00
Samuel Pitoiset
f4f096805b
radv: fix TC-compat HTILE images with DST_OPTIMAL on the compute queue
...
This is probably rare but can happen if someone performs a depth-stencil
copy on the compute queue. This might work (untested by CTS) but it
looks more conservative to decompress before perfoming the operation.
Found by inspection.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039 >
2021-01-05 12:10:11 +00:00
Samuel Pitoiset
1c539b6484
radv: add radv_htile_get_initial_value() and document the HTILE dword
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039 >
2021-01-05 12:10:11 +00:00
Samuel Pitoiset
3038c88661
radv: fix potential HTILE issues for TC-compat images on GFX8
...
We can only use the entire HTILE buffer if TILE_STENCIL_DISABLE is
TRUE. On GFX8+, this is only true if the depth image has no stencil
and if it's not TC-compatible because of the ZRANGE_PRECISION issue.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039 >
2021-01-05 12:10:11 +00:00
Samuel Pitoiset
f7f6e9ad56
radv: always clear the SR0/SR1 bits of the HTILE buffer
...
To make sure the stencil compare state is properly initialized and
cleared when the driver performs a fast depth clear.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8039 >
2021-01-05 12:10:11 +00:00
Pierre-Eric Pelloux-Prayer
d0767fc045
amd/addrlib: use cpp.has_argument() to filter compiler arguments
...
Acked-by: Michel Dänzer <mdaenzer@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7846 >
2021-01-05 11:29:11 +00:00
Rhys Perry
c5973ede01
ac/nir: use llvm.readcyclecounter for LLVM9+
...
Unlike llvm.amdgcn.s.memtime, this works on GFX10.3
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4033
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8306 >
2021-01-05 10:27:00 +00:00
Samuel Pitoiset
831d9d406a
radv: remove unused radv_image::aspects
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8324 >
2021-01-05 09:46:01 +00:00
Samuel Pitoiset
58c68bac39
radv: fix clearing images with vkCmdClear{Color,DepthStencil}Image()
...
The image aspects field is actually never set and we should use the
range aspect anyways.
Fixes: 1a7b7b17ad ("radv: avoid oob read during clear")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8324 >
2021-01-05 09:46:01 +00:00
Marek Olšák
b94626d3ee
ac,radeonsi: limit Smart Access Memory to Zen 3 and GFX10.3 due to perf issues
...
Many people experience performance degradation on some systems.
There will be a driconf option to enable SAM on other chips as well as
disable it on enabled systems.
Fixes: d3d6d38145 - ac: add radeon_info::all_vram_visible for Smart Access Memory
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3982
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8225 >
2021-01-05 02:43:55 +00:00
Samuel Pitoiset
3ae1c6a4fb
radv: disable A2 SNORM/SSCALED/SINT for texel buffers & images on all gens
...
AMDVLK and AMDGPU-PRO also don't support these formats for texel
buffers and images.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3386
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8315 >
2021-01-04 17:19:41 +00:00
Rhys Perry
b2d000513e
aco: fix incorrect address calculation for load_barycentric_at_sample
...
Fix address calculation for indirect load_barycentric_at_sample on GFX6-8
with a uniform sample index.
A non-zero uniform sample index does not seem to be tested by CTS.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3966
Fixes: 93c8ebfa78 ("aco: Initial commit of independent AMD compiler")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8302 >
2021-01-04 16:53:29 +00:00
Mike Blumenkrantz
1a7b7b17ad
radv: avoid oob read during clear
...
when clearing a depth/stencil image the passed colorvalue pointer is
smaller than the VkClearValue struct size
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8288 >
2021-01-04 14:11:56 +00:00
Bas Nieuwenhuizen
3898f747ce
radv: Use VRAM for the initial gfx cmdbuffer.
...
Not expect it to make any real difference, but lets be consistent.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7979 >
2021-01-04 13:10:16 +00:00
Bas Nieuwenhuizen
b7cc5dc853
radv: Put commandbuffers in VRAM if all VRAM is CPU visible.
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7979 >
2021-01-04 13:10:15 +00:00
Bas Nieuwenhuizen
f06e91d85a
radv: Use VRAM for upload buffers if entire VRAM is CPU-visible.
...
Not doing this for APUs because spilling is quite likely, due to
overall VRAM pressure.
Also adding a flag to disable for performance debugging.
Finally adds some memset for places where we depended on the memory
being initialized to zero, which we won't get with VRAM anymore.
(I think these places should stop depending on it since it hides
issues with executing the cmdbuffer multiple times, but this
preserves behavior)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7979 >
2021-01-04 13:10:15 +00:00
Samuel Pitoiset
ef06f1bb03
radv: disable stippledBresenhamLines on GFX9
...
Some CTS fail on Vega10 but work on Raven.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8242 >
2020-12-31 09:17:21 +01:00
Tony Wasserka
9d59c84e31
aco/ra: Avoid redundant RegisterFile copies in get_reg_impl
...
Now that this function does not block RegisterFile entries anymore,
the temporary copy is only needed upon reaching the collect_vars call.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8261 >
2020-12-30 17:36:33 +01:00
Samuel Pitoiset
d90a102a01
radv: add a Python script to check if a VA was ever valid
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7891 >
2020-12-30 08:40:21 +01:00
Samuel Pitoiset
6ed4332591
radv: dump VA ranges history when a GPU hang is detected
...
This is enabled only with RADV_DEBUG=hang. This adds a small
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3904
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7891 >
2020-12-30 08:40:19 +01:00
Tony Wasserka
6b538506f2
aco/ra: Fix register allocation for subdword operands
...
ACO attempts to store the output of an instruction in the same register
occupied by its operands where possible. Importantly this only works if
the operands are large enough to store the result register size. The code
failed to consider subdword operands when checking for this, causing
entire register slots to be freed up even though subdword parts were still
used.
In Mafia 3, this affected the following code:
v2b: %363:v[2][0:16], v2b: %362:v[2][16:32] = p_split_vector %360:v[2]
v1: %116:v[2] = v_cvt_f32_f16 %362:v[2][16:32]
v1: %117:v[2] = v_cvt_f32_f16 %363:v[2][0:16]
where v[2] is allocated to %116 even though its original lower 16 bits are
still used in the instruction after.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3717
Fixes: 031edbc4a5
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461 >
2020-12-29 18:57:10 +00:00
Tony Wasserka
187b185502
aco/ra: Add some documentation
...
This should make these somewhat tricky bits easier to follow.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461 >
2020-12-29 18:57:10 +00:00
Tony Wasserka
b841b4fde8
aco: Add tests for subdword register allocation
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461 >
2020-12-29 18:57:10 +00:00
Tony Wasserka
6a246f5c6d
aco/tests: Fix deadlock for too large test lists
...
The write() to the communication pipe shared with check_output.py would block
for large test output streams since the pipe's consumer wouldn't be launched
until the write already completed.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461 >
2020-12-29 18:57:10 +00:00
Tony Wasserka
a240341ec9
aco/tests: Allow specifiying the test subvariant in setup_cs
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461 >
2020-12-29 18:57:10 +00:00
Tony Wasserka
05ca6758cb
aco/tests: Fix GFX10_3 being printed as gfx11
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7461 >
2020-12-29 18:57:10 +00:00