Qiang Yu
71fd3c2a35
aco: do not fix_exports when separately compiled ngg vs or es
...
For radeonsi not abort when this case, as it does not jump at
last.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25631 >
2023-10-26 02:01:49 +00:00
Rhys Perry
4c3677094e
aco,nir: add export_row_amd intrinsic
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25040 >
2023-10-24 21:36:06 +00:00
Bas Nieuwenhuizen
5e7c828c0e
aco: Add WMMA instructions.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24683 >
2023-10-24 13:24:18 +00:00
Samuel Pitoiset
f0abdaea9f
amd/llvm,aco,radv: implement NGG streamout with GDS_STRMOUT registers on GFX11
...
According to RadeonSI, this is required for preemption, user queues,
and we only have to wait for VS after streamout which should be more
performant.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25284 >
2023-10-12 16:58:22 +00:00
Qiang Yu
6eb0910d45
aco: do not fix_exports when program has epilog
...
PS with epilog does not need to fix_exports. And radeonsi use
p_end_with_regs so does not have jump instruction at last.
radeonsi may also have exec restore instruction, so may break
before reach to p_end_with_regs.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973 >
2023-10-10 02:36:34 +00:00
Qiang Yu
c9c18d3da5
aco: compact ps expilog color export for radeonsi
...
radeonsi need to compact color export for ps epilog while radv does not.
radv will fill empty color slot, so won't affected by this change.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973 >
2023-10-10 02:36:34 +00:00
Qiang Yu
1517aa7a8a
aco,radv: add radeonsi spec ps epilog code
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973 >
2023-10-10 02:36:34 +00:00
Qiang Yu
9972a385fb
aco: simplify export_fs_mrt_color
...
It's now used by ps epilog only.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973 >
2023-10-10 02:36:34 +00:00
Qiang Yu
77d5966661
aco,radv: rename ps epilog info inputs to colors
...
Will add other mrtz args for radeonsi.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973 >
2023-10-10 02:36:34 +00:00
Qiang Yu
57b0f19582
aco: add create_fs_end_for_epilog for radeonsi
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973 >
2023-10-10 02:36:33 +00:00
Qiang Yu
90c901a987
aco: handle ps outputs from radeonsi
...
radeonsi will keep outputs <FRAG_RESULT_DATA0.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973 >
2023-10-10 02:36:33 +00:00
Qiang Yu
49250f9fc5
aco: add ps prolog generation for radeonsi
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973 >
2023-10-10 02:36:33 +00:00
Qiang Yu
80728a2e71
aco: do not eliminate final exec write when p_end_with_regs block
...
p_end_with_regs just partially end the program, next part need
exec mask to be set correctly. For example p_end_wqm will generate
a exec restore from WQM mode after p_end_with_regs.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973 >
2023-10-10 02:36:33 +00:00
Georg Lehmann
34d8fa6185
aco/gfx11: optimize dual source export
...
We can combine dpp with the v_cndmask_b32.
Foz-DB Navi31:
Totals from 222 (0.28% of 79330) affected shaders:
Instrs: 564392 -> 563373 (-0.18%); split: -0.19%, +0.01%
CodeSize: 2867040 -> 2864728 (-0.08%); split: -0.09%, +0.01%
Latency: 4278957 -> 4275199 (-0.09%); split: -0.09%, +0.00%
InvThroughput: 586636 -> 585824 (-0.14%); split: -0.14%, +0.00%
SClause: 20210 -> 20211 (+0.00%); split: -0.02%, +0.02%
Copies: 39763 -> 39778 (+0.04%); split: -0.13%, +0.17%
PreVGPRs: 13924 -> 13922 (-0.01%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25541 >
2023-10-05 10:37:34 +00:00
Rhys Perry
1a268dc59d
aco: disable FI for quad/masked swizzle
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8330
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525 >
2023-10-04 18:53:43 +00:00
Rhys Perry
26fce534b5
aco: shrink DPP8_instruction
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525 >
2023-10-04 18:53:43 +00:00
Georg Lehmann
79b767f4c0
aco: remove -0.0 for 32 bit fsign with mul_legacy/omod when denorms are flushed
...
v_mul_legacy_f32 and omod return +0.0 if any operand is +0.0/-0.0.
Foz-DB Navi21:
Totals from 4289 (5.60% of 76572) affected shaders:
Instrs: 8100571 -> 8099319 (-0.02%); split: -0.02%, +0.00%
CodeSize: 43433200 -> 43435088 (+0.00%); split: -0.01%, +0.01%
Latency: 88151566 -> 88147232 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 22966705 -> 22965192 (-0.01%); split: -0.01%, +0.00%
VClause: 190010 -> 190009 (-0.00%)
SClause: 269697 -> 269689 (-0.00%)
Copies: 687294 -> 687296 (+0.00%); split: -0.00%, +0.00%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25347 >
2023-10-02 14:02:49 +00:00
Rhys Perry
558738e3c5
aco: remove zero offset optimization
...
This is done in nir_opt_constant_folding now.
No fossil-db changes on navi31.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25477 >
2023-10-02 10:11:37 +00:00
Rhys Perry
c3a894fb47
aco: disable zero offset optimization for strict WQM coords
...
If we try to do this, we end up using {undef,coordx} as the coordinates
for an image_sample instruction, because we can't shrink the linear VGPR.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9767
Fixes: 859e059aa9 ("radv: use fix_derivs_in_divergent_cf")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25477 >
2023-10-02 10:11:37 +00:00
Georg Lehmann
b91616e800
aco: implement 64bit div find_lsb
...
This can be selected for divergent subgroupBallotFindLSB.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25407 >
2023-09-27 14:47:42 +00:00
Georg Lehmann
336ec2a4b4
aco: simplify masked swizzle dpp selection by removing or_mask first
...
and_mask and xor_mask alone can represent all patterns without or_mask
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25115 >
2023-09-21 10:07:27 +00:00
Connor Abbott
c93bcb32fe
amd: Use inverse ballot intrinsic if available
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25123 >
2023-09-20 14:41:18 +00:00
Daniel Schürmann
45f6d38a76
aco: insert a single p_end_wqm after the last derivative calculation
...
This new instruction replaces p_wqm.
Totals from 28065 (36.65% of 76572) affected shaders: (GFX11)
MaxWaves: 823922 -> 823952 (+0.00%); split: +0.01%, -0.01%
Instrs: 22221375 -> 22180465 (-0.18%); split: -0.26%, +0.08%
CodeSize: 117310676 -> 117040684 (-0.23%); split: -0.30%, +0.07%
VGPRs: 1183476 -> 1186656 (+0.27%); split: -0.19%, +0.46%
SpillSGPRs: 2305 -> 2302 (-0.13%)
Latency: 176559310 -> 176427793 (-0.07%); split: -0.21%, +0.14%
InvThroughput: 26245204 -> 26195550 (-0.19%); split: -0.26%, +0.07%
VClause: 368058 -> 369460 (+0.38%); split: -0.21%, +0.59%
SClause: 857077 -> 842588 (-1.69%); split: -2.06%, +0.37%
Copies: 1245650 -> 1249434 (+0.30%); split: -0.33%, +0.63%
Branches: 394837 -> 396070 (+0.31%); split: -0.01%, +0.32%
PreSGPRs: 1019139 -> 1019567 (+0.04%); split: -0.02%, +0.06%
PreVGPRs: 925739 -> 931860 (+0.66%); split: -0.00%, +0.66%
Changes are due to scheduling and re-enabling cross-lane optimizations.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038 >
2023-09-14 09:25:23 +00:00
Daniel Schürmann
28904839da
aco: don't insert a copy when emitting p_wqm
...
Totals from 351 (0.46% of 76572) affected shaders: (GFX11)
Instrs: 709202 -> 709600 (+0.06%); split: -0.02%, +0.08%
CodeSize: 3606364 -> 3608040 (+0.05%); split: -0.01%, +0.06%
Latency: 3589841 -> 3590756 (+0.03%); split: -0.01%, +0.03%
InvThroughput: 463303 -> 463324 (+0.00%)
SClause: 28147 -> 28201 (+0.19%); split: -0.02%, +0.22%
Copies: 43243 -> 43204 (-0.09%); split: -0.24%, +0.15%
PreSGPRs: 21028 -> 21042 (+0.07%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038 >
2023-09-14 09:25:22 +00:00
Daniel Schürmann
040142684c
aco: make p_wqm a marker instruction without Operands/Definitions
...
Totals from 28277 (36.93% of 76572) affected shaders: (GFX11)
MaxWaves: 833930 -> 833898 (-0.00%); split: +0.01%, -0.01%
Instrs: 21366950 -> 21353346 (-0.06%); split: -0.11%, +0.05%
CodeSize: 112855368 -> 112610508 (-0.22%); split: -0.24%, +0.03%
VGPRs: 1157748 -> 1158540 (+0.07%); split: -0.10%, +0.17%
SpillSGPRs: 2465 -> 2463 (-0.08%); split: -0.16%, +0.08%
Latency: 168339886 -> 168383646 (+0.03%); split: -0.10%, +0.12%
InvThroughput: 25164895 -> 25158376 (-0.03%); split: -0.08%, +0.06%
VClause: 347660 -> 346256 (-0.40%); split: -0.55%, +0.15%
SClause: 794460 -> 799521 (+0.64%); split: -0.33%, +0.97%
Copies: 1151908 -> 1148370 (-0.31%); split: -0.54%, +0.23%
Branches: 359447 -> 359437 (-0.00%); split: -0.01%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038 >
2023-09-14 09:25:22 +00:00
Rhys Perry
41b6020ff3
aco: remove fast path in insert_exec_mask's process_instructions
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038 >
2023-09-14 09:25:22 +00:00
Daniel Schürmann
0e8192a76b
aco: append p_logical_end after monolithic RT shaders
...
Fixes: bdec044c88 ('aco: Do not fixup registers if there are no shader calls')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038 >
2023-09-14 09:25:22 +00:00
Georg Lehmann
524a894ba4
aco/gfx11: don't use bfe for local_invocation_id if the others are always 0
...
Foz-DB GFX1100:
Totals from 4469 (3.37% of 132657) affected shaders:
Instrs: 3895053 -> 3893529 (-0.04%); split: -0.04%, +0.00%
CodeSize: 20244128 -> 20220952 (-0.11%); split: -0.11%, +0.00%
Latency: 37864147 -> 37862227 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 5578100 -> 5576469 (-0.03%); split: -0.03%, +0.00%
SClause: 108336 -> 108343 (+0.01%); split: -0.00%, +0.01%
Copies: 275897 -> 275900 (+0.00%); split: -0.00%, +0.00%
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24514 >
2023-09-08 11:28:24 +00:00
Qiang Yu
b5eaec6c80
aco,radv,radeonsi: rename is_monolithic to merged_shader_compiled_separately
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24990 >
2023-09-04 10:53:44 +08:00
Georg Lehmann
2ae94b3894
aco: implement some exclusive scans with inclusive scans
...
exclusive scan lowering uses full wave shift, for iadd/ixor it's faster
to do inclusive scans and subtract/xor the thread's source.
Foz-DB Navi21:
Totals from 21 (0.02% of 132657) affected shaders:
Instrs: 10925 -> 10727 (-1.81%)
CodeSize: 58064 -> 56488 (-2.71%)
Latency: 178471 -> 177928 (-0.30%)
InvThroughput: 24374 -> 24145 (-0.94%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24555 >
2023-09-02 11:42:22 +00:00
Konstantin Seurer
2a5d8d4cf4
aco: Unify demote and demote_if selection
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24906 >
2023-09-01 07:23:33 +00:00
Konstantin Seurer
9af91edda9
aco: Use bytes() instead of size() in emit_wqm
...
This should get most of the cases that would fail validation.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24906 >
2023-09-01 07:23:33 +00:00
Samuel Pitoiset
e104718c9f
aco: allow separate compilation of NGG shaders
...
Also prevent to emit a long-jump for VS as NGG without a GS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24907 >
2023-09-01 06:52:40 +00:00
Samuel Pitoiset
bfb39031f1
aco: flag blocks with long-jump as export_end for separate compilation
...
This will allow us to adjust fix_exports().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24907 >
2023-09-01 06:52:40 +00:00
Qiang Yu
a2ba50aee6
aco: add vs prolog instruction selection for radeonsi
...
Port from llvm si_llvm_build_vs_prolog().
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24712 >
2023-08-31 02:39:15 +00:00
Qiang Yu
3f87413811
aco: prepare fix_ls_vgpr_init_bug to be used by gl vs prolog
...
Prolog does not have nir shader.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24712 >
2023-08-31 02:39:15 +00:00
Qiang Yu
eb4f871034
aco: pass sw_stage when setup_isel_context
...
We are going to add more shader parts.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24712 >
2023-08-31 02:39:15 +00:00
Samuel Pitoiset
f30d47c518
aco: fix emitting TCS epilogs end on GFX9+
...
With merged shaders, the long-jump should be emitted inside the
divergent if (ie. only for TCS invocations) and other non TCS
invocations should just end the program.
This fixes a bunch of failures with CTS by forcing TCS epilogs on
RDNA2.
Not sure how RadeonSI will handle that but maybe doing the merged
wave info thing in epilogs would help.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24832 >
2023-08-29 09:30:07 +00:00
Samuel Pitoiset
8ebb29245c
aco: add support for compiling {VS,TES}+GS separately on GFX9+
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24862 >
2023-08-25 10:22:41 +00:00
Samuel Pitoiset
80177e0296
aco: add support for compiling VS+TCS separately on GFX9+
...
The VS will just jump to the TCS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24697 >
2023-08-25 07:22:04 +00:00
Samuel Pitoiset
aba16211a8
aco: disable shared VGPRs for non-monolithic shaders on GFX9+
...
For unmerged shaders on GFX9+, we would need to jump to the second
shader part which means shared VGPRs can't be enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24697 >
2023-08-25 07:22:04 +00:00
Georg Lehmann
6d949e18fd
aco: fix u2f16 with 32bit input
...
The vulkan spec says all conversions are correctly rounded, so if the input
is larger than the largest fp16 value, we need to return MAX_FLOAT/inf
instead of cutting off the msbs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24826 >
2023-08-23 12:25:56 +00:00
Rhys Perry
9169fbf83c
aco: clarify bpermute pseudo opcode names
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693 >
2023-08-23 12:36:46 +01:00
Konstantin Seurer
bdec044c88
aco: Do not fixup registers if there are no shader calls
...
Frees up some registers when using monolithic compilation.
Quake II RTX and Control (with monolithic compilation):
Totals from 10 (29.41% of 34) affected shaders:
MaxWaves: 77 -> 98 (+27.27%)
Instrs: 49047 -> 48984 (-0.13%); split: -0.16%, +0.03%
CodeSize: 260420 -> 259880 (-0.21%); split: -0.25%, +0.04%
VGPRs: 1328 -> 1104 (-16.87%)
Latency: 477134 -> 479377 (+0.47%); split: -0.05%, +0.52%
InvThroughput: 137763 -> 114108 (-17.17%)
VClause: 1318 -> 1286 (-2.43%); split: -2.66%, +0.23%
SClause: 1295 -> 1293 (-0.15%); split: -0.54%, +0.39%
Copies: 7838 -> 7782 (-0.71%); split: -0.82%, +0.10%
Branches: 2592 -> 2589 (-0.12%)
PreSGPRs: 874 -> 796 (-8.92%)
PreVGPRs: 1283 -> 1013 (-21.04%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24809 >
2023-08-22 15:46:29 +00:00
Samuel Pitoiset
a29e2c6fbc
aco: implement create_tcs_jump_to_epilog()
...
This implements jumping from the main TCS to the epilog.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643 >
2023-08-22 06:10:32 +00:00
Samuel Pitoiset
fc9283938f
aco: adjust TCS epilogs for RADV
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643 >
2023-08-22 06:10:32 +00:00
Samuel Pitoiset
0c2adc7ada
aco: fix jumping from main TCS to epilog on GFX9+
...
On GFX9+, VS is merged with TCS which means this function is called
twice and the epilog was emitted in both shader parts.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643 >
2023-08-22 06:10:32 +00:00
Qiang Yu
51a8479a51
aco: use semantic location as io temp index
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442 >
2023-08-16 02:27:45 +00:00
Qiang Yu
facbd13df1
aco: skip scratch init when no scratch arg provide
...
epilog does not use scratch so has no scratch arg.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442 >
2023-08-16 02:27:45 +00:00
Qiang Yu
d3333609e6
aco: don't emit s_endpgm for tcs with epilog
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442 >
2023-08-16 02:27:45 +00:00