Commit Graph

1062 Commits

Author SHA1 Message Date
Qiang Yu
71fd3c2a35 aco: do not fix_exports when separately compiled ngg vs or es
For radeonsi not abort when this case, as it does not jump at
last.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25631>
2023-10-26 02:01:49 +00:00
Rhys Perry
4c3677094e aco,nir: add export_row_amd intrinsic
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25040>
2023-10-24 21:36:06 +00:00
Bas Nieuwenhuizen
5e7c828c0e aco: Add WMMA instructions.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24683>
2023-10-24 13:24:18 +00:00
Samuel Pitoiset
f0abdaea9f amd/llvm,aco,radv: implement NGG streamout with GDS_STRMOUT registers on GFX11
According to RadeonSI, this is required for preemption, user queues,
and we only have to wait for VS after streamout which should be more
performant.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25284>
2023-10-12 16:58:22 +00:00
Qiang Yu
6eb0910d45 aco: do not fix_exports when program has epilog
PS with epilog does not need to fix_exports. And radeonsi use
p_end_with_regs so does not have jump instruction at last.

radeonsi may also have exec restore instruction, so may break
before reach to p_end_with_regs.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:34 +00:00
Qiang Yu
c9c18d3da5 aco: compact ps expilog color export for radeonsi
radeonsi need to compact color export for ps epilog while radv does not.
radv will fill empty color slot, so won't affected by this change.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:34 +00:00
Qiang Yu
1517aa7a8a aco,radv: add radeonsi spec ps epilog code
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:34 +00:00
Qiang Yu
9972a385fb aco: simplify export_fs_mrt_color
It's now used by ps epilog only.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:34 +00:00
Qiang Yu
77d5966661 aco,radv: rename ps epilog info inputs to colors
Will add other mrtz args for radeonsi.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:34 +00:00
Qiang Yu
57b0f19582 aco: add create_fs_end_for_epilog for radeonsi
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:33 +00:00
Qiang Yu
90c901a987 aco: handle ps outputs from radeonsi
radeonsi will keep outputs <FRAG_RESULT_DATA0.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:33 +00:00
Qiang Yu
49250f9fc5 aco: add ps prolog generation for radeonsi
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:33 +00:00
Qiang Yu
80728a2e71 aco: do not eliminate final exec write when p_end_with_regs block
p_end_with_regs just partially end the program, next part need
exec mask to be set correctly. For example p_end_wqm will generate
a exec restore from WQM mode after p_end_with_regs.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
2023-10-10 02:36:33 +00:00
Georg Lehmann
34d8fa6185 aco/gfx11: optimize dual source export
We can combine dpp with the v_cndmask_b32.

Foz-DB Navi31:
Totals from 222 (0.28% of 79330) affected shaders:
Instrs: 564392 -> 563373 (-0.18%); split: -0.19%, +0.01%
CodeSize: 2867040 -> 2864728 (-0.08%); split: -0.09%, +0.01%
Latency: 4278957 -> 4275199 (-0.09%); split: -0.09%, +0.00%
InvThroughput: 586636 -> 585824 (-0.14%); split: -0.14%, +0.00%
SClause: 20210 -> 20211 (+0.00%); split: -0.02%, +0.02%
Copies: 39763 -> 39778 (+0.04%); split: -0.13%, +0.17%
PreVGPRs: 13924 -> 13922 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25541>
2023-10-05 10:37:34 +00:00
Rhys Perry
1a268dc59d aco: disable FI for quad/masked swizzle
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8330
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525>
2023-10-04 18:53:43 +00:00
Rhys Perry
26fce534b5 aco: shrink DPP8_instruction
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25525>
2023-10-04 18:53:43 +00:00
Georg Lehmann
79b767f4c0 aco: remove -0.0 for 32 bit fsign with mul_legacy/omod when denorms are flushed
v_mul_legacy_f32 and omod return +0.0 if any operand is +0.0/-0.0.

Foz-DB Navi21:
Totals from 4289 (5.60% of 76572) affected shaders:
Instrs: 8100571 -> 8099319 (-0.02%); split: -0.02%, +0.00%
CodeSize: 43433200 -> 43435088 (+0.00%); split: -0.01%, +0.01%
Latency: 88151566 -> 88147232 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 22966705 -> 22965192 (-0.01%); split: -0.01%, +0.00%
VClause: 190010 -> 190009 (-0.00%)
SClause: 269697 -> 269689 (-0.00%)
Copies: 687294 -> 687296 (+0.00%); split: -0.00%, +0.00%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25347>
2023-10-02 14:02:49 +00:00
Rhys Perry
558738e3c5 aco: remove zero offset optimization
This is done in nir_opt_constant_folding now.

No fossil-db changes on navi31.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25477>
2023-10-02 10:11:37 +00:00
Rhys Perry
c3a894fb47 aco: disable zero offset optimization for strict WQM coords
If we try to do this, we end up using {undef,coordx} as the coordinates
for an image_sample instruction, because we can't shrink the linear VGPR.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9767
Fixes: 859e059aa9 ("radv: use fix_derivs_in_divergent_cf")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25477>
2023-10-02 10:11:37 +00:00
Georg Lehmann
b91616e800 aco: implement 64bit div find_lsb
This can be selected for divergent subgroupBallotFindLSB.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25407>
2023-09-27 14:47:42 +00:00
Georg Lehmann
336ec2a4b4 aco: simplify masked swizzle dpp selection by removing or_mask first
and_mask and xor_mask alone can represent all patterns without or_mask

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25115>
2023-09-21 10:07:27 +00:00
Connor Abbott
c93bcb32fe amd: Use inverse ballot intrinsic if available
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25123>
2023-09-20 14:41:18 +00:00
Daniel Schürmann
45f6d38a76 aco: insert a single p_end_wqm after the last derivative calculation
This new instruction replaces p_wqm.

Totals from 28065 (36.65% of 76572) affected shaders: (GFX11)
MaxWaves: 823922 -> 823952 (+0.00%); split: +0.01%, -0.01%
Instrs: 22221375 -> 22180465 (-0.18%); split: -0.26%, +0.08%
CodeSize: 117310676 -> 117040684 (-0.23%); split: -0.30%, +0.07%
VGPRs: 1183476 -> 1186656 (+0.27%); split: -0.19%, +0.46%
SpillSGPRs: 2305 -> 2302 (-0.13%)
Latency: 176559310 -> 176427793 (-0.07%); split: -0.21%, +0.14%
InvThroughput: 26245204 -> 26195550 (-0.19%); split: -0.26%, +0.07%
VClause: 368058 -> 369460 (+0.38%); split: -0.21%, +0.59%
SClause: 857077 -> 842588 (-1.69%); split: -2.06%, +0.37%
Copies: 1245650 -> 1249434 (+0.30%); split: -0.33%, +0.63%
Branches: 394837 -> 396070 (+0.31%); split: -0.01%, +0.32%
PreSGPRs: 1019139 -> 1019567 (+0.04%); split: -0.02%, +0.06%
PreVGPRs: 925739 -> 931860 (+0.66%); split: -0.00%, +0.66%

Changes are due to scheduling and re-enabling cross-lane optimizations.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:23 +00:00
Daniel Schürmann
28904839da aco: don't insert a copy when emitting p_wqm
Totals from 351 (0.46% of 76572) affected shaders: (GFX11)

Instrs: 709202 -> 709600 (+0.06%); split: -0.02%, +0.08%
CodeSize: 3606364 -> 3608040 (+0.05%); split: -0.01%, +0.06%
Latency: 3589841 -> 3590756 (+0.03%); split: -0.01%, +0.03%
InvThroughput: 463303 -> 463324 (+0.00%)
SClause: 28147 -> 28201 (+0.19%); split: -0.02%, +0.22%
Copies: 43243 -> 43204 (-0.09%); split: -0.24%, +0.15%
PreSGPRs: 21028 -> 21042 (+0.07%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:22 +00:00
Daniel Schürmann
040142684c aco: make p_wqm a marker instruction without Operands/Definitions
Totals from 28277 (36.93% of 76572) affected shaders: (GFX11)

MaxWaves: 833930 -> 833898 (-0.00%); split: +0.01%, -0.01%
Instrs: 21366950 -> 21353346 (-0.06%); split: -0.11%, +0.05%
CodeSize: 112855368 -> 112610508 (-0.22%); split: -0.24%, +0.03%
VGPRs: 1157748 -> 1158540 (+0.07%); split: -0.10%, +0.17%
SpillSGPRs: 2465 -> 2463 (-0.08%); split: -0.16%, +0.08%
Latency: 168339886 -> 168383646 (+0.03%); split: -0.10%, +0.12%
InvThroughput: 25164895 -> 25158376 (-0.03%); split: -0.08%, +0.06%
VClause: 347660 -> 346256 (-0.40%); split: -0.55%, +0.15%
SClause: 794460 -> 799521 (+0.64%); split: -0.33%, +0.97%
Copies: 1151908 -> 1148370 (-0.31%); split: -0.54%, +0.23%
Branches: 359447 -> 359437 (-0.00%); split: -0.01%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:22 +00:00
Rhys Perry
41b6020ff3 aco: remove fast path in insert_exec_mask's process_instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:22 +00:00
Daniel Schürmann
0e8192a76b aco: append p_logical_end after monolithic RT shaders
Fixes: bdec044c88 ('aco: Do not fixup registers if there are no shader calls')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25038>
2023-09-14 09:25:22 +00:00
Georg Lehmann
524a894ba4 aco/gfx11: don't use bfe for local_invocation_id if the others are always 0
Foz-DB GFX1100:
Totals from 4469 (3.37% of 132657) affected shaders:
Instrs: 3895053 -> 3893529 (-0.04%); split: -0.04%, +0.00%
CodeSize: 20244128 -> 20220952 (-0.11%); split: -0.11%, +0.00%
Latency: 37864147 -> 37862227 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 5578100 -> 5576469 (-0.03%); split: -0.03%, +0.00%
SClause: 108336 -> 108343 (+0.01%); split: -0.00%, +0.01%
Copies: 275897 -> 275900 (+0.00%); split: -0.00%, +0.00%

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24514>
2023-09-08 11:28:24 +00:00
Qiang Yu
b5eaec6c80 aco,radv,radeonsi: rename is_monolithic to merged_shader_compiled_separately
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24990>
2023-09-04 10:53:44 +08:00
Georg Lehmann
2ae94b3894 aco: implement some exclusive scans with inclusive scans
exclusive scan lowering uses full wave shift, for iadd/ixor it's faster
to do inclusive scans and subtract/xor the thread's source.

Foz-DB Navi21:
Totals from 21 (0.02% of 132657) affected shaders:
Instrs: 10925 -> 10727 (-1.81%)
CodeSize: 58064 -> 56488 (-2.71%)
Latency: 178471 -> 177928 (-0.30%)
InvThroughput: 24374 -> 24145 (-0.94%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24555>
2023-09-02 11:42:22 +00:00
Konstantin Seurer
2a5d8d4cf4 aco: Unify demote and demote_if selection
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24906>
2023-09-01 07:23:33 +00:00
Konstantin Seurer
9af91edda9 aco: Use bytes() instead of size() in emit_wqm
This should get most of the cases that would fail validation.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24906>
2023-09-01 07:23:33 +00:00
Samuel Pitoiset
e104718c9f aco: allow separate compilation of NGG shaders
Also prevent to emit a long-jump for VS as NGG without a GS.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24907>
2023-09-01 06:52:40 +00:00
Samuel Pitoiset
bfb39031f1 aco: flag blocks with long-jump as export_end for separate compilation
This will allow us to adjust fix_exports().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24907>
2023-09-01 06:52:40 +00:00
Qiang Yu
a2ba50aee6 aco: add vs prolog instruction selection for radeonsi
Port from llvm si_llvm_build_vs_prolog().

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24712>
2023-08-31 02:39:15 +00:00
Qiang Yu
3f87413811 aco: prepare fix_ls_vgpr_init_bug to be used by gl vs prolog
Prolog does not have nir shader.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24712>
2023-08-31 02:39:15 +00:00
Qiang Yu
eb4f871034 aco: pass sw_stage when setup_isel_context
We are going to add more shader parts.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24712>
2023-08-31 02:39:15 +00:00
Samuel Pitoiset
f30d47c518 aco: fix emitting TCS epilogs end on GFX9+
With merged shaders, the long-jump should be emitted inside the
divergent if (ie. only for TCS invocations) and other non TCS
invocations should just end the program.

This fixes a bunch of failures with CTS by forcing TCS epilogs on
RDNA2.

Not sure how RadeonSI will handle that but maybe doing the merged
wave info thing in epilogs would help.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24832>
2023-08-29 09:30:07 +00:00
Samuel Pitoiset
8ebb29245c aco: add support for compiling {VS,TES}+GS separately on GFX9+
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24862>
2023-08-25 10:22:41 +00:00
Samuel Pitoiset
80177e0296 aco: add support for compiling VS+TCS separately on GFX9+
The VS will just jump to the TCS.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24697>
2023-08-25 07:22:04 +00:00
Samuel Pitoiset
aba16211a8 aco: disable shared VGPRs for non-monolithic shaders on GFX9+
For unmerged shaders on GFX9+, we would need to jump to the second
shader part which means shared VGPRs can't be enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24697>
2023-08-25 07:22:04 +00:00
Georg Lehmann
6d949e18fd aco: fix u2f16 with 32bit input
The vulkan spec says all conversions are correctly rounded, so if the input
is larger than the largest fp16 value, we need to return MAX_FLOAT/inf
instead of cutting off the msbs.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24826>
2023-08-23 12:25:56 +00:00
Rhys Perry
9169fbf83c aco: clarify bpermute pseudo opcode names
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24693>
2023-08-23 12:36:46 +01:00
Konstantin Seurer
bdec044c88 aco: Do not fixup registers if there are no shader calls
Frees up some registers when using monolithic compilation.

Quake II RTX and Control (with monolithic compilation):

Totals from 10 (29.41% of 34) affected shaders:
MaxWaves: 77 -> 98 (+27.27%)
Instrs: 49047 -> 48984 (-0.13%); split: -0.16%, +0.03%
CodeSize: 260420 -> 259880 (-0.21%); split: -0.25%, +0.04%
VGPRs: 1328 -> 1104 (-16.87%)
Latency: 477134 -> 479377 (+0.47%); split: -0.05%, +0.52%
InvThroughput: 137763 -> 114108 (-17.17%)
VClause: 1318 -> 1286 (-2.43%); split: -2.66%, +0.23%
SClause: 1295 -> 1293 (-0.15%); split: -0.54%, +0.39%
Copies: 7838 -> 7782 (-0.71%); split: -0.82%, +0.10%
Branches: 2592 -> 2589 (-0.12%)
PreSGPRs: 874 -> 796 (-8.92%)
PreVGPRs: 1283 -> 1013 (-21.04%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24809>
2023-08-22 15:46:29 +00:00
Samuel Pitoiset
a29e2c6fbc aco: implement create_tcs_jump_to_epilog()
This implements jumping from the main TCS to the epilog.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643>
2023-08-22 06:10:32 +00:00
Samuel Pitoiset
fc9283938f aco: adjust TCS epilogs for RADV
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643>
2023-08-22 06:10:32 +00:00
Samuel Pitoiset
0c2adc7ada aco: fix jumping from main TCS to epilog on GFX9+
On GFX9+, VS is merged with TCS which means this function is called
twice and the epilog was emitted in both shader parts.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643>
2023-08-22 06:10:32 +00:00
Qiang Yu
51a8479a51 aco: use semantic location as io temp index
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442>
2023-08-16 02:27:45 +00:00
Qiang Yu
facbd13df1 aco: skip scratch init when no scratch arg provide
epilog does not use scratch so has no scratch arg.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442>
2023-08-16 02:27:45 +00:00
Qiang Yu
d3333609e6 aco: don't emit s_endpgm for tcs with epilog
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442>
2023-08-16 02:27:45 +00:00