Rhys Perry
72e9a23443
radv/aco: use ACO for GS copy shaders
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421 >
2020-01-24 13:35:07 +00:00
Rhys Perry
f8f7712666
aco: implement GS copy shaders
...
v5: rebase on float_controls changes
v7: rebase after shader args MR and load/store vectorizer MR
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421 >
2020-01-24 13:35:07 +00:00
Rhys Perry
de4ce66f5c
aco: remove needs_instance_id
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421 >
2020-01-24 13:35:07 +00:00
Rhys Perry
e192e268de
aco: explicitly mark end blocks for exports
...
For GS copy shaders, whether we want to do exports is conditional. By
explicitly marking the end blocks, we can mark an IF's then branch as an
export block and ensure that's where the assembler inserts null exports.
v6: only fixup exports in the end block, like before
v8: simplify some code
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421 >
2020-01-24 13:35:07 +00:00
Rhys Perry
d46a54ecff
radv/aco: allow ACO for GS
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421 >
2020-01-24 13:35:07 +00:00
Rhys Perry
8bad100f83
aco: implement GS on GFX7-8
...
GS is the same on GFX6, but GFX6 isn't fully supported yet.
v4: fix regclass
v7: rebase after shader args MR
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421 >
2020-01-24 13:35:07 +00:00
Rhys Perry
40bb81c9dd
radv/aco,aco: implement GS on GFX9+
...
v2: implement GFX10
v3: rebase
v7: rebase after shader args MR
v8: fix gs_vtx_offset usage on GFX9/GFX10
v8: use unreachable() instead of printing intrinsic
v8: rename output_state to ge_output_state
v8: fix formatting around nir_foreach_variable()
v8: rename some helpers in the scheduler
v8: rename p_memory_barrier_all to p_memory_barrier_common
v8: fix assertion comparing ctx.stage against vertex_geometry_gs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421 >
2020-01-24 13:35:07 +00:00
Rhys Perry
70f63c1988
aco: improve support for s_sendmsg
...
In particular, the messages needed for GS.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421 >
2020-01-24 13:35:07 +00:00
Rhys Perry
0da7b3b18b
radv: move gs copy shader creation before other variants
...
ACO lowers output derefs, which breaks the shader_info pass used by gs copy
shader creation.
v3: rebase
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421 >
2020-01-24 13:35:07 +00:00
Timur Kristóf
23edcf6490
aco: Make a better guess at which instructions need the VCC hint.
...
Previously, bool_to_vector_condition would always set the VCC hint
on its result. This commit improves it by having the optimizer set
the VCC hint only when the result really needs to be in the VCC.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451 >
2020-01-24 13:14:23 +00:00
Bas Nieuwenhuizen
0890482969
radv: Allow DCC & TC-compat HTILE with VK_IMAGE_CREATE_EXTENDED_USAGE_BIT.
...
I misunderstood the flag when initially disabling these. This flag
only has an effect with mutable formats. If we have DCC and
mutable formats, the formats are close enough that the allowed
usage flags are neither meaningfully different nor used during
allocation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3424 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3424 >
2020-01-24 11:16:39 +00:00
Bas Nieuwenhuizen
1b447bd2e6
radv: Expose VK_KHR_swapchain_mutable_format.
...
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2354
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3425 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3425 >
2020-01-24 10:47:07 +00:00
Samuel Pitoiset
a31bcf2be6
ac/llvm: fix missing casts in ac_build_readlane()
...
Because ac_build_optimization_barrier() overwrites the original
src_type, we have to keep track of it before emitting that barrier.
Otherwise, wrong conversions are expected for pointers or small
bitsizes.
By doing this, we no longer need to do the cast dance in
ac_build_readlane_no_opt_barrier(); it was only necessary for
ac_build_optimization_barrier().
This fixes a bunch of crashes with subgroup-related tests when
RADV_DEBUG=checkir is enabled, and it also fixes a compiler crash
with The Surge 2.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2395
Fixes: 0f45d4dc2b ("ac: add ac_build_readlane without optimization barrier")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535 >
2020-01-24 07:40:07 +01:00
Samuel Pitoiset
8d5203dad2
aco: implement nir_op_f2i64/nir_op_f2u64 on GFX6
...
V_TRUNC_F64 and V_FLOOR_F64 need to be lowered on GFX6.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477 >
2020-01-23 14:40:48 +01:00
Samuel Pitoiset
4d92601715
aco: implement 64-bit nir_op_ffloor on GFX6
...
GFX6 doesn't have V_FLOOR_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.
Introduce a new function because it will be useful for some other
64-bit operations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477 >
2020-01-23 14:40:45 +01:00
Samuel Pitoiset
fbd169e421
aco: implement 64-bit nir_op_fround_even on GFX6
...
GFX6 doesn't have V_RNDNE_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477 >
2020-01-23 14:40:42 +01:00
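The round-to-nearest-even lowering mentioned above can be illustrated with the well-known "add and subtract 2^52" trick. This is a minimal sketch in plain C, not the ACO code: `rndne_f64_sketch` is a hypothetical name, and it assumes the default IEEE-754 round-to-nearest-even rounding mode.

```c
#include <assert.h>

/* Hypothetical helper: round a double to the nearest integer, ties to even,
 * without a dedicated rounding instruction. */
static double rndne_f64_sketch(double x)
{
    /* 2^52 carrying the sign of x; at this magnitude doubles have no
     * fractional bits, so the addition itself performs the rounding. */
    const double two52 = x < 0.0 ? -0x1p52 : 0x1p52;
    const double mag = x < 0.0 ? -x : x;

    /* volatile keeps each step rounded to double precision. */
    volatile double tmp = x + two52;
    tmp = tmp - two52;

    /* Values with |x| >= 2^52 are already integral and are returned as-is.
     * 0x1.fffffffffffffp+51 is the largest double below 2^52. */
    return mag > 0x1.fffffffffffffp+51 ? x : tmp;
}
```

The sign of a zero result is not preserved in this sketch; a hardware lowering would typically handle it with a sign-copy operation.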
Samuel Pitoiset
87588801d3
aco: implement 64-bit nir_op_fceil on GFX6
...
GFX6 doesn't have V_CEIL_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477 >
2020-01-23 14:40:38 +01:00
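One common way to lower a 64-bit ceil is to build it on a truncation primitive. The sketch below shows that idea in plain C under stated assumptions: `ceil_f64_sketch` and `trunc_standin` are hypothetical names, and the integer cast merely stands in for whatever truncation the hardware lowering provides. It is not taken from the commit.

```c
#include <assert.h>

/* Stand-in for a lowered 64-bit truncation (hypothetical helper).
 * Doubles with |x| >= 2^52 are already integral; NaN fails the range
 * test and passes through. The sign of a zero result is not preserved. */
static double trunc_standin(double x)
{
    if (!(x > -0x1p52 && x < 0x1p52))
        return x;
    return (double)(long long)x;
}

/* ceil(x) = trunc(x), plus 1 when x is positive and not already integral. */
static double ceil_f64_sketch(double x)
{
    double t = trunc_standin(x);
    /* Truncation rounds toward zero, so only positive non-integers
     * end up one below the ceiling. */
    if (x > 0.0 && t != x)
        t += 1.0;
    return t;
}
```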
Samuel Pitoiset
aad5176c58
aco: implement 64-bit nir_op_ftrunc on GFX6
...
GFX6 doesn't have V_TRUNC_F64, it needs to be lowered. Loosely based
on the AMDGPU LLVM backend.
Introduce a new function because it will be useful for some other
64-bit operations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477 >
2020-01-23 14:40:34 +01:00
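A truncation without V_TRUNC_F64 can be expressed as bit manipulation on the IEEE-754 encoding: clear the fraction bits that lie below the binary point. The following is a minimal sketch of that technique in plain C; `trunc_f64_sketch` is a hypothetical name and the code is an illustration, not the function introduced by the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: truncate a double toward zero using integer ops only. */
static double trunc_f64_sketch(double x)
{
    assert(sizeof(double) == sizeof(uint64_t));

    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);

    /* Unbiased exponent and isolated sign bit. */
    int exp = (int)((bits >> 52) & 0x7ff) - 1023;
    uint64_t sign = bits & 0x8000000000000000ull;

    uint64_t result;
    if (exp < 0) {
        /* |x| < 1 (including denormals): the result is a signed zero. */
        result = sign;
    } else if (exp > 51) {
        /* No fractional bits left, or Inf/NaN: pass through unchanged. */
        result = bits;
    } else {
        /* Clear the (52 - exp) low fraction bits that encode the fraction. */
        uint64_t fract_mask = 0x000fffffffffffffull >> exp;
        result = bits & ~fract_mask;
    }

    double out;
    memcpy(&out, &result, sizeof out);
    return out;
}
```

Floor, ceil and float-to-integer conversions can then be layered on top of such a helper, which matches the remark that the new function is useful for other 64-bit operations.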
Samuel Pitoiset
36e7a5f5b9
aco: implement nir_intrinsic_global_atomic_* on GFX6
...
GFX6 doesn't have FLAT instructions, use MUBUF instructions instead.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477 >
2020-01-23 14:40:30 +01:00
Samuel Pitoiset
22d8822683
aco: implement nir_intrinsic_load_global on GFX6
...
GFX6 doesn't have FLAT instructions, use MUBUF instructions instead.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477 >
2020-01-23 14:40:27 +01:00
Samuel Pitoiset
d6af7571c2
aco: implement nir_intrinsic_store_global on GFX6
...
GFX6 doesn't have FLAT instructions, use MUBUF instructions instead.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477 >
2020-01-23 14:40:24 +01:00
Samuel Pitoiset
01f0bef71e
aco: fix wrong IR in nir_intrinsic_load_barycentric_at_sample
...
Only GFX6 was affected, my mistake. The total number of SGPR operands
should be 4 when we want to create a vec4.
Fixes: dbdf3b3ef9 ("aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477 >
2020-01-23 14:40:21 +01:00
Samuel Pitoiset
54e54ec3e8
aco: fix printing assembly with CLRXdisasm on GFX6
...
We thought that CLRXdisasm allowed gfx600 as well as gfx700 but
it actually doesn't. Use the family for GFX6 chips instead.
Fixes: 0099f85232 ("aco: print assembly with CLRXdisasm for GFX6-GFX7 if found on the system")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3531 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3531 >
2020-01-23 11:34:37 +00:00
Samuel Pitoiset
12fe19ba3b
radv: advertise VK_AMD_shader_fragment_mask
...
Only for GFX8+ because it's untested on older generations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304 >
2020-01-23 10:48:02 +00:00
Samuel Pitoiset
e030aef32c
aco: add support for nir_texop_fragment_{mask}_fetch
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304 >
2020-01-23 10:48:02 +00:00
Samuel Pitoiset
9e477d79b7
ac/nir: add support for nir_texop_fragment_{mask}_fetch
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304 >
2020-01-23 10:48:02 +00:00
Samuel Pitoiset
e60de08547
radv: handle missing implicit subpass dependencies
...
When a subpass doesn't declare an explicit dependency from/to
VK_SUBPASS_EXTERNAL, Vulkan says there is an implicit dependency.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3330 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3330 >
2020-01-23 11:25:41 +01:00
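The implicit dependency described above can be sketched as follows. This is an illustration only: the `sk_`/`SK_` names are hypothetical local mirrors of the Vulkan constants (so the snippet builds without vulkan.h), the mask values follow the Vulkan specification's definition of the implicit incoming dependency, and the per-attachment "first use" tracking that the specification also requires is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local mirrors of Vulkan constants (hypothetical names, real values). */
#define SK_SUBPASS_EXTERNAL       (~0u)
#define SK_STAGE_TOP_OF_PIPE      0x00000001u
#define SK_STAGE_ALL_COMMANDS     0x00010000u
#define SK_ACCESS_INPUT_ATT_READ  0x00000010u
#define SK_ACCESS_COLOR_READ      0x00000080u
#define SK_ACCESS_COLOR_WRITE     0x00000100u
#define SK_ACCESS_DS_READ         0x00000200u
#define SK_ACCESS_DS_WRITE        0x00000400u

struct sk_dependency {
    uint32_t src_subpass, dst_subpass;
    uint32_t src_stage_mask, dst_stage_mask;
    uint32_t src_access_mask, dst_access_mask;
};

/* True if the application declared an EXTERNAL -> subpass dependency. */
static bool sk_has_incoming_external(const struct sk_dependency *deps,
                                     uint32_t count, uint32_t subpass)
{
    assert(deps || count == 0);
    for (uint32_t i = 0; i < count; i++) {
        if (deps[i].src_subpass == SK_SUBPASS_EXTERNAL &&
            deps[i].dst_subpass == subpass)
            return true;
    }
    return false;
}

/* The dependency a driver has to assume when none was declared. */
static struct sk_dependency sk_implicit_incoming(uint32_t subpass)
{
    struct sk_dependency d = {
        .src_subpass = SK_SUBPASS_EXTERNAL,
        .dst_subpass = subpass,
        .src_stage_mask = SK_STAGE_TOP_OF_PIPE,
        .dst_stage_mask = SK_STAGE_ALL_COMMANDS,
        .src_access_mask = 0,
        .dst_access_mask = SK_ACCESS_INPUT_ATT_READ | SK_ACCESS_COLOR_READ |
                           SK_ACCESS_COLOR_WRITE | SK_ACCESS_DS_READ |
                           SK_ACCESS_DS_WRITE,
    };
    return d;
}
```

The outgoing direction (subpass to VK_SUBPASS_EXTERNAL) is symmetric, with ALL_COMMANDS as the source stage and BOTTOM_OF_PIPE as the destination stage.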
Samuel Pitoiset
0d2da2a8c0
radv: add explicit external subpass dependencies to meta operations
...
No functional changes because a subpass dependency with dstStageMask
set to VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT is a no-op.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3330 >
2020-01-23 11:25:38 +01:00
Rhys Perry
15a1cc00d3
aco: fix off-by-one error when initializing sgpr_live_in
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2394
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3511 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3511 >
2020-01-22 17:23:30 +00:00
Samuel Pitoiset
bd51538d28
radv: fix double free corruption in radv_alloc_memory()
...
If the driver fails to allocate memory for some reason, it shouldn't
free the 'mem' object twice.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2302
Fixes: 825ddfee59 ("radv: Handle device memory alloc failure with normal free.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3508 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3508 >
2020-01-22 17:01:16 +00:00
Rhys Perry
3f96a1ed86
aco: fix operand kill flags when a temporary is used more than once
...
Helps create v_mac_f32 from v_mad_f32(b, a, b)
Totals from affected shaders:
SGPRS: 35824 -> 35824 (0.00 %)
VGPRS: 33460 -> 33456 (-0.01 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 2187264 -> 2180976 (-0.29 %) bytes
LDS: 127 -> 127 (0.00 %) blocks
Max Waves: 3802 -> 3802 (0.00 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3486 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3486 >
2020-01-22 15:55:00 +00:00
Timur Kristóf
1c9ecb2123
aco: Fix signedness compare warning.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483 >
2020-01-22 11:09:17 +01:00
Timur Kristóf
533a20dbd5
aco: Fix maybe-uninitialized warnings.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483 >
2020-01-22 11:09:14 +01:00
Timur Kristóf
6fb3df2786
aco: Fix -Wstringop-overflow warnings in aco_span.
...
GCC does not understand how aco_span works.
This patch fixes it by casting the aco_span's this pointer
to uintptr_t rather than to a char pointer, effectively
telling GCC not to try to figure it out.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483 >
2020-01-22 11:09:10 +01:00
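The cast described above can be illustrated with a small self-relative span in plain C. This is a sketch under assumptions: it supposes the span stores a byte offset relative to its own address, and `span_u32`, `span_begin`, `instr_sketch` and `instr_init` are hypothetical names, not the aco_span definition.

```c
#include <assert.h>
#include <stdint.h>

/* A span that locates its elements relative to its own address. */
struct span_u32 {
    uint16_t offset; /* byte distance from this struct to the first element */
    uint16_t length; /* number of elements */
};

static uint32_t *span_begin(struct span_u32 *s)
{
    assert(s);
    /* Integer arithmetic on uintptr_t instead of (char *)s + offset:
     * the compiler no longer reasons about the bounds of *s here. */
    return (uint32_t *)((uintptr_t)s + s->offset);
}

/* Example container: the span header is followed by its storage. */
struct instr_sketch {
    struct span_u32 operands;
    uint32_t storage[4];
};

static void instr_init(struct instr_sketch *in)
{
    in->operands.offset =
        (uint16_t)((uintptr_t)in->storage - (uintptr_t)&in->operands);
    in->operands.length = 4;
}
```

With a `char *` cast, GCC can see that the resulting pointer leaves the small span object and may report a string-operation overflow; routing the address through an integer hides that provenance.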
Bas Nieuwenhuizen
bd4380c63c
radv: Remove syncobj_handle variable in header.
...
I strongly suspect it was supposed to be a typedef. However, it is used
nowhere, so we should remove it.
Fixes: eaa56eab6d "radv: initial support for shared semaphores (v2)"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2385
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3479 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3479 >
2020-01-21 12:28:00 +00:00
Marek Olšák
4e4b2d13f0
ac: add helper ac_build_triangle_strip_indices_to_triangle
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
2020-01-20 16:16:11 -05:00
Marek Olšák
0f45d4dc2b
ac: add ac_build_readlane without optimization barrier
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
2020-01-20 16:16:11 -05:00
Marek Olšák
77393cf39b
ac: add prefix bitcount functions
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
2020-01-20 16:16:11 -05:00
Samuel Pitoiset
dbdf3b3ef9
aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6
...
GFX6 doesn't have FLAT instructions which means we have to emit
a 64-bit MUBUF load.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432 >
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
9e2fde84fc
aco: add new addr64 bit to MUBUF instructions on GFX6-GFX7
...
According to the different ISA docs (and to LLVM), this bit seems
to exist only on GFX6-GFX7.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432 >
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
fe9157a700
aco: do not use the vec3 variant for loads on GFX6
...
GFX6 only supports vec3 with the load/store format instructions.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432 >
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
1b5bb204d9
aco: do not use the vec3 variant for stores on GFX6
...
GFX6 only supports vec3 with the load/store format instructions.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432 >
2020-01-20 16:24:55 +00:00
Samuel Pitoiset
b8abfafe86
aco: fix constant folding of SMRD instructions on GFX6
...
SMRD instructions have an 8-bit dword offset on SI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432 >
2020-01-20 16:24:55 +00:00
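The constraint stated above (an 8-bit offset counted in dwords) implies a simple legality check before a constant byte offset is folded into the instruction. A minimal sketch, with hypothetical function names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Can a constant byte offset be folded into an SMRD immediate on GFX6?
 * The immediate is 8 bits wide and counts dwords, so the byte offset must
 * be dword-aligned and at most 255 * 4 = 1020. */
static bool smrd_offset_foldable_gfx6(uint32_t byte_offset)
{
    if (byte_offset % 4 != 0)
        return false;
    return byte_offset / 4 <= 0xff;
}

/* Value to place in the 8-bit immediate field. */
static uint32_t smrd_encode_offset_gfx6(uint32_t byte_offset)
{
    assert(smrd_offset_foldable_gfx6(byte_offset));
    return byte_offset / 4;
}
```

Offsets that fail the check have to stay in an SGPR operand instead of being folded.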
Rhys Perry
29bfe18abd
aco: fix fall-through test in try_remove_simple_block() with back-edges
...
3bca0af2 enhanced empty block determination which exposed this bug and
created an infinite loop in a Guild Wars 2 shader.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Fixes: 3bca0af25d ('aco: ignore parallelcopies to the same register on jump threading')
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2364
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3452 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3452 >
2020-01-20 11:51:45 +00:00
Rhys Perry
e151398de6
aco: fix stack buffer overflow in apply_sgprs()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Fixes: cef7879719 ('aco: rewrite apply_sgprs()')
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2361
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442 >
2020-01-20 11:13:11 +00:00
Samuel Pitoiset
0099f85232
aco: print assembly with CLRXdisasm for GFX6-GFX7 if found on the system
...
LLVM only supports GFX8+. Using CLRXdisasm works most of the time,
so it's useful to add support for it.
Original patch by Daniel Schürmann.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3439 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3439 >
2020-01-17 17:41:32 +00:00
Samuel Pitoiset
b9b393f0ce
aco: fix emitting slc for MUBUF instructions on GFX6-GFX7
...
Same as GFX10, only GFX8/GFX9 moved that bit near the opcode.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3437 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3437 >
2020-01-17 16:56:04 +01:00
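The generation-dependent placement of the slc bit can be sketched as below. The enum and function names are hypothetical, and the bit positions (bit 17 of the first dword on GFX8/GFX9, bit 22 of the second dword elsewhere) are assumptions that are not stated in the commit message; they should be checked against the ISA documentation.

```c
#include <assert.h>
#include <stdint.h>

enum gfx_level_sketch { SK_GFX6 = 6, SK_GFX7, SK_GFX8, SK_GFX9, SK_GFX10 };

/* Set the slc bit in a two-dword MUBUF encoding. */
static void mubuf_set_slc(enum gfx_level_sketch level, uint32_t dwords[2])
{
    assert(level >= SK_GFX6 && level <= SK_GFX10);

    if (level == SK_GFX8 || level == SK_GFX9) {
        /* Assumed position: next to the opcode in the first dword. */
        dwords[0] |= 1u << 17;
    } else {
        /* Assumed position on GFX6-GFX7 and GFX10: second dword. */
        dwords[1] |= 1u << 22;
    }
}
```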
Daniel Schürmann
3bca0af25d
aco: ignore parallelcopies to the same register on jump threading
...
The more conservative lowering to CSSA inserts unnecessary parallelcopies,
which might get coalesced and can be ignored on jump threading.
v2: outline is_empty_block() check.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385 >
2020-01-16 16:01:59 +01:00
Daniel Schürmann
427e5eeb02
aco: handle phi affinities transitively through parallelcopies
...
This can coalesce most unnecessarily inserted parallelcopies
from lowering to CSSA.
v2: refactor loop a bit to make it more efficient and readable.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385 >
2020-01-16 16:01:59 +01:00
Daniel Schürmann
d098024c40
aco: rework lower_to_cssa()
...
This patch makes lower_to_cssa much more conservative in its
assumptions about which phi operands might interfere.
Previously, this pass wasn't exhaustive and could miss some corner cases.
v2: remove optimizations to find better insertion points, as it's hard
to guarantee that they are always correct and they have no overall benefit.
Fixes: 0b8216b2cd ('aco: Lower to CSSA')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385 >
2020-01-16 16:01:59 +01:00