Georg Lehmann
6d949e18fd
aco: fix u2f16 with 32bit input
...
The vulkan spec says all conversions are correctly rounded, so if the input
is larger than the largest fp16 value, we need to return MAX_FLOAT/inf
instead of cutting off the msbs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24826 >
2023-08-23 12:25:56 +00:00
Rhys Perry
1d29a1e2fc
aco: add adjust_bpermute_dst helper
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693 >
2023-08-23 12:36:46 +01:00
Rhys Perry
9169fbf83c
aco: clarify bpermute pseudo opcode names
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693 >
2023-08-23 12:36:46 +01:00
Rhys Perry
8a024c985f
aco: fix p_bpermute_gfx6's exec save/restore with wave32
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693 >
2023-08-23 12:36:46 +01:00
Rhys Perry
85957dd6e5
aco: fix p_bpermute_gfx6 with input at non-zero byte
...
Same as the other bpermute pseudo instructions.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693 >
2023-08-23 12:36:46 +01:00
Samuel Pitoiset
203b4054f3
aco: rework printing shader stages
...
To avoid printing "unknown" for shader object when eg. VS and TCS
are compiled separately.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24810 >
2023-08-23 09:21:33 +00:00
Samuel Pitoiset
aef257fd15
radv: advertise NV_device_generated_commands_compute
...
This extension introduces a token for implementing DGC compute, it's
only intended to be used by vkd3d-proton.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24275 >
2023-08-23 06:05:39 +00:00
Samuel Pitoiset
1a90b7a5da
radv: allow DGC on the compute queue
...
DGC cmdbuf on ACE are executed as IB1 without chaining because IB2
isn't supported on ACE.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24275 >
2023-08-23 06:05:39 +00:00
Samuel Pitoiset
559da06755
radv: implement NV_device_generated_commands_compute
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24275 >
2023-08-23 06:05:39 +00:00
Samuel Pitoiset
a57fe712f7
radv: prepare radv_prepare_dgc() for DGC compute
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24275 >
2023-08-23 06:05:39 +00:00
Samuel Pitoiset
aa0ca1e1db
radv: prepare radv_get_sequence_size() for DGC compute
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24275 >
2023-08-23 06:05:39 +00:00
Samuel Pitoiset
bb82a3402a
radv: track the pipeline bind point for indirect commands layout
...
This will be used to implement DGC compute.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24275 >
2023-08-23 06:05:39 +00:00
Konstantin Seurer
7aee3ba36d
radv: Stop updating the stack_size in insert_rt_case
...
There are two paths that call insert_rt_case:
- Traversal shader: The stack size is ignored.
- Monolithic raygen shader: The stack sizes of the inlined shaders are
accounted for in compute_rt_stack_size.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24809 >
2023-08-22 15:46:29 +00:00
Konstantin Seurer
bdec044c88
aco: Do not fixup registers if there are no shader calls
...
Frees up some registers when using monolithic compilation.
Quake II RTX and Control (with monolithic compilation):
Totals from 10 (29.41% of 34) affected shaders:
MaxWaves: 77 -> 98 (+27.27%)
Instrs: 49047 -> 48984 (-0.13%); split: -0.16%, +0.03%
CodeSize: 260420 -> 259880 (-0.21%); split: -0.25%, +0.04%
VGPRs: 1328 -> 1104 (-16.87%)
Latency: 477134 -> 479377 (+0.47%); split: -0.05%, +0.52%
InvThroughput: 137763 -> 114108 (-17.17%)
VClause: 1318 -> 1286 (-2.43%); split: -2.66%, +0.23%
SClause: 1295 -> 1293 (-0.15%); split: -0.54%, +0.39%
Copies: 7838 -> 7782 (-0.71%); split: -0.82%, +0.10%
Branches: 2592 -> 2589 (-0.12%)
PreSGPRs: 874 -> 796 (-8.92%)
PreVGPRs: 1283 -> 1013 (-21.04%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24809 >
2023-08-22 15:46:29 +00:00
Konstantin Seurer
ec708c26ef
radv/rt: Split stage initialization and hashing
...
The dependency chain is: init stages -> compute pipeline key -> hash
stages.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24809 >
2023-08-22 15:46:29 +00:00
Konstantin Seurer
f3e2900c59
radv/rt: Insert rt_return_amd before lowering shader calls
...
Also skips running nir_lower_shader_calls for the traversal shader. This
will be used to skip the pass and the rt_return_amd insertion for
monolithic raygen shaders.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24809 >
2023-08-22 15:46:29 +00:00
Konstantin Seurer
774421f11e
radv/rt: Add and use radv_build_traversal
...
Moves most of the build code to a helper which will be useful for adding
inline traversal.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24809 >
2023-08-22 15:46:29 +00:00
Konstantin Seurer
2d7965dbff
radv/rt: Do not apply stack_ptr for non-recursive stages
...
stack_ptr is set to 0.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24809 >
2023-08-22 15:46:29 +00:00
Konstantin Seurer
d174a71db8
radv/rt: Remove some dead code
...
- call_idx_base was used for resume shaders in the shader call loop
- hit attribs are lowered elsewhere
- stack_size is set in radv_pipeline_rt.c
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24809 >
2023-08-22 15:46:29 +00:00
Georg Lehmann
9cf6984200
nir: unify lower_find_msb with has_{find_msb_rev,uclz}
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24662 >
2023-08-22 12:08:37 +00:00
Georg Lehmann
2ac7e6614a
nir: unify lower_bitfield_extract with has_bfe
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24662 >
2023-08-22 12:08:37 +00:00
Georg Lehmann
34c3f81614
nir: unify lower_bitfield_insert with has_{bfm,bfi,bitfield_select}
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24662 >
2023-08-22 12:08:37 +00:00
Friedrich Vock
bfb55d0266
ac/sqtt,radv/sqtt: Add and use marker for separate RT compilation
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24371 >
2023-08-22 11:33:11 +00:00
Friedrich Vock
3d3d5c4bc3
radv/sqtt: Handle separately-compiled RT pipelines
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24371 >
2023-08-22 11:33:11 +00:00
Friedrich Vock
1cd9525b18
radv/sqtt: Write LDS size metadata in code objects
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24371 >
2023-08-22 11:33:11 +00:00
Friedrich Vock
7809fb9e49
radv/sqtt: Unregister records based on hash
...
RT pipelines have multiple hashes used in records, so don't always use
the pipeline hash.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24371 >
2023-08-22 11:33:11 +00:00
Friedrich Vock
3ed4cca883
radv/sqtt: Move record filling to helper function
...
RT shaders construct records differently, but this piece of code is
common to all types of pipelines.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24371 >
2023-08-22 11:33:11 +00:00
Friedrich Vock
b4a704b42a
ac/rgp: Add metadata for separate-compiled RT stages
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24371 >
2023-08-22 11:33:11 +00:00
Friedrich Vock
0c4e92bf3e
ac/rgp: Write lds_size metadata
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24371 >
2023-08-22 11:33:11 +00:00
Friedrich Vock
be0e3e8e09
ac/sqtt,radv: Split internal and API hash in PSO correlations
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24371 >
2023-08-22 11:33:11 +00:00
Friedrich Vock
d5f1c9fb4b
ac/msgpack: make fixstrs a const char
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24371 >
2023-08-22 11:33:10 +00:00
Samuel Pitoiset
a29e2c6fbc
aco: implement create_tcs_jump_to_epilog()
...
This implements jumping from the main TCS to the epilog.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643 >
2023-08-22 06:10:32 +00:00
Samuel Pitoiset
e03c09dfb2
aco: allow SGPRs operands with p_jump_to_epilog
...
For TCS epilogs, we will have to pass SGPRs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643 >
2023-08-22 06:10:32 +00:00
Samuel Pitoiset
fc9283938f
aco: adjust TCS epilogs for RADV
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643 >
2023-08-22 06:10:32 +00:00
Samuel Pitoiset
0c2adc7ada
aco: fix jumping from main TCS to epilog on GFX9+
...
On GFX9+, VS is merged with TCS which means this function is called
twice and the epilog was emitted in both shader parts.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643 >
2023-08-22 06:10:32 +00:00
Samuel Pitoiset
131c3aa3dc
radv: add tcs_out_patch_fits_subgroup to radv_tcs_epilog_key
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643 >
2023-08-22 06:10:32 +00:00
Samuel Pitoiset
65191bb351
radv: declare shader arguments for TCS epilogs
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643 >
2023-08-22 06:10:32 +00:00
Samuel Pitoiset
d0808b22cb
radv: stop declaring the scratch offset argument for TCS epilogs
...
ACO skip it for epilogs now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643 >
2023-08-22 06:10:32 +00:00
Samuel Pitoiset
6ad8abf7aa
radv: use the maximum possible workgroup size for TCS epilogs
...
It's similar to when the patch control points value is dynamic.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643 >
2023-08-22 06:10:32 +00:00
Chia-I Wu
e74c3dbb70
ac/surface: limit RADEON_SURF_NO_TEXTURE to color surfaces
...
For z surfaces, flags.texture should be based on
RADEON_SURF_TC_COMPATIBLE_HTILE alone. Otherwise, addrlib could pick a
_X/_T swizzle mode for a MSAA depth texture, which is said to be broken:
When _X/_T swizzle mode was used for MSAA depth texture, TC will get zplane
equation from wrong address within memory range a tile covered and use the
garbage data for compressed Z reading which finally leads to corruption.
Fixes: de0885cdb8 ("amd/surface: add RADEON_SURF_NO_TEXTURE flag")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24767 >
2023-08-22 02:36:20 +00:00
David Heidelberg
6079c3ca49
ci: disable Material Testers.x86_64_2020.04.08_13.38_frame799.rdc trace
...
This change will be revert as soon, as Collabora proxy gets fixed.
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24819 >
2023-08-21 22:31:21 +00:00
Tatsuyuki Ishi
6c5512568b
radv/amdgpu: Do not pass in a BO handle when clearing PRT VA region.
...
This field is invalid to access for virtual BOs.
Fixes: a931d5a4a4 ("radv/winsys: clear the PRT VA range when destroying a virtual BO")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24805 >
2023-08-21 17:24:35 +00:00
Konstantin Seurer
2943bc34e9
radv: Remove leaf_args::dst_offset
...
We can use first_id instead.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24756 >
2023-08-21 12:45:06 +00:00
Konstantin Seurer
90a24c7cb3
radv: Add internal_nodes_offset to scratch_layout
...
It shouldn't be a part of bvh_state.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24756 >
2023-08-21 12:45:06 +00:00
Samuel Pitoiset
b8b42be555
radv/amdgpu: add support for submitting external IBs with the chained path
...
External IBs are currently only used for DGC. With the chained path,
these IBs will only be used to workaround missing IB2 packet on the
compute queue, which is rare enough to care about chaining inside CS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24207 >
2023-08-21 10:52:13 +00:00
Samuel Pitoiset
33f584f033
radv/amdgpu: allow to execute external IBs on the compute queue
...
IB2 isn't supported on ACE, so external IBs should be submitted as IB1.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24207 >
2023-08-21 10:52:13 +00:00
Samuel Pitoiset
e3fae01730
Revert "radv/amdgpu: skip adding per VM BOs for sparse during CS BO list build"
...
This reverts commit 51caece74c .
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24774 >
2023-08-21 09:42:51 +00:00
Samuel Pitoiset
f67eb9ce07
Revert "radv/amdgpu: workaround a kernel bug when replacing sparse mappings"
...
This workaround was added temporarily but it can actually cause
stuttering in some games like Forza Horizon 5.
The kernel fix
(https://lists.freedesktop.org/archives/amd-gfx/2023-June/094648.html )
landed in some stable kernels (5.15.121+, 6.1.40+ and 6.4.5+). Sadly,
older stable kernels don't have it, so you might experiment random GPU
hangs in games that use sparse mapping. Please ensure your kernel is
up-to-date for the best experience.
This reverts commit 9b00867327 .
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9443
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24774 >
2023-08-21 09:42:51 +00:00
Marek Olšák
905a00f10a
ac/surface: add radeon_surf::u::gfx9::uses_custom_pitch
...
so that we don't try to guess when the pitch is overridden
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24759 >
2023-08-19 19:36:56 +00:00
Marek Olšák
9eb00f612a
ac/surface: trivial non-functional changes
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24759 >
2023-08-19 19:36:56 +00:00