Commit Graph

1162 Commits

Author SHA1 Message Date
Rhys Perry cdaf269924 aco: inline store_vmem_mubuf/emit_single_mubuf_store
Both of these are only used once.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry 185fa04baa aco/gfx6: set glc for buffer_store_byte/short
For the same reason we set it for image stores. GFX6 has a caching bug
which requires this.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry 4cfb7a0c17 aco: remove support for sub-dword push constants
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29480>
2024-06-06 17:52:05 +00:00
Rhys Perry 8e475bba61 aco: implement nir_intrinsic_nop_amd and nir_intrinsic_sleep_amd
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466>
2024-06-06 14:26:52 +00:00
Rhys Perry 1ad05d4ca8 aco: implement nir_atomic_op_ordered_add_gfx12_amd
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466>
2024-06-06 14:26:52 +00:00
Rhys Perry 2a4424425a aco/gfx12: fix s_wait_event immediate
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466>
2024-06-06 14:26:52 +00:00
Rhys Perry c651eed1d8 aco/gfx12: implement load_subgroup_id
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29466>
2024-06-06 14:26:52 +00:00
Georg Lehmann dcab408a6c nir: remove unpack_half_flush_to_zero
It doesn't make sense to have two sets of opcodes for this when all backends
that support the flush_to_zero variant just rely on the global floating point
mode anyway.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29433>
2024-05-31 09:46:35 +00:00
Samuel Pitoiset ce6557cc04 aco: adjust loading local invocation ID for GS on GFX12
It uses gs_vtx_offset[0] instead.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29417>
2024-05-30 11:05:04 +00:00
Rhys Perry 1829d74ad3 aco: fix fddx/y with uniform inf/nan input
inf or nan subtracted from itself is not zero.

I don't think Vulkan requires this, but this better matches NIR's constant
folding and the divergent implementation.

fossil-db (navi31):
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 537 -> 588 (+9.50%)
CodeSize: 3132 -> 3380 (+7.92%)
Latency: 2806 -> 2819 (+0.46%)
InvThroughput: 286 -> 316 (+10.49%)
Copies: 24 -> 39 (+62.50%)
VALU: 262 -> 289 (+10.31%)
SALU: 33 -> 51 (+54.55%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29418>
2024-05-29 15:18:52 +00:00
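The fddx/y commit above (1829d74ad3) rests on an IEEE 754 detail: subtracting inf or nan from itself yields nan, not zero. A minimal sketch of that arithmetic follows. It is not ACO code. Both function names are hypothetical, and the "folded" variant only stands in for any shortcut that assumes a uniform input always has a zero derivative.

```python
import math


def naive_uniform_ddx(x: float) -> float:
    """Difference of a value with itself, as when all lanes hold the same input."""
    return x - x


def folded_uniform_ddx(x: float) -> float:
    """Hypothetical shortcut: treat the derivative of a uniform input as zero."""
    return 0.0


if __name__ == "__main__":
    # For finite inputs the two agree; for inf/nan only the subtraction gives nan.
    for value in (1.5, math.inf, -math.inf, math.nan):
        print(value, naive_uniform_ddx(value), folded_uniform_ddx(value))
```

For finite inputs both functions return 0.0, so the shortcut is only observable with inf or nan inputs, which fits the small number of affected shaders reported in the commit.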
Konstantin Seurer a93f95c69c radv/rt: Remove load_rt_dynamic_callable_stack_base_amd
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28619>
2024-05-28 12:23:45 +00:00
Konstantin Seurer 432f3eb9ca radv/rt: Track ray_launch_size reads
Totals from 33 (8.71% of 379) affected shaders:
Instrs: 1434025 -> 1433988 (-0.00%); split: -0.01%, +0.00%
CodeSize: 7578824 -> 7578472 (-0.00%); split: -0.01%, +0.00%
Latency: 9241632 -> 9241639 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 3407014 -> 3407049 (+0.00%); split: -0.00%, +0.00%
VClause: 40399 -> 40391 (-0.02%)
SClause: 37755 -> 37760 (+0.01%); split: -0.04%, +0.05%
Copies: 169588 -> 169567 (-0.01%); split: -0.04%, +0.02%
PreSGPRs: 4323 -> 4319 (-0.09%)
VALU: 940500 -> 940484 (-0.00%); split: -0.00%, +0.00%
SALU: 220508 -> 220509 (+0.00%); split: -0.03%, +0.03%

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28619>
2024-05-28 12:23:45 +00:00
Konstantin Seurer 7ba8fccad3 radv/rt: Track ray_launch_id reads
We can expect the z-component to be unused most of the time. Avoid
preserving it in those cases.

Totals from 94 (24.80% of 379) affected shaders:
MaxWaves: 916 -> 935 (+2.07%)
Instrs: 3316697 -> 3318357 (+0.05%); split: -0.06%, +0.11%
CodeSize: 17618704 -> 17616680 (-0.01%); split: -0.09%, +0.08%
VGPRs: 11632 -> 11520 (-0.96%)
SpillSGPRs: 1139 -> 1205 (+5.79%); split: -0.35%, +6.15%
Latency: 22595907 -> 22598225 (+0.01%); split: -0.15%, +0.16%
InvThroughput: 7036479 -> 6923740 (-1.60%); split: -1.74%, +0.14%
VClause: 104325 -> 104361 (+0.03%); split: -0.16%, +0.19%
SClause: 83920 -> 83925 (+0.01%); split: -0.08%, +0.08%
Copies: 328140 -> 330687 (+0.78%); split: -0.27%, +1.05%
Branches: 134521 -> 134541 (+0.01%); split: -0.01%, +0.02%
PreSGPRs: 8753 -> 8806 (+0.61%)
PreVGPRs: 10984 -> 10937 (-0.43%)
VALU: 2149880 -> 2151318 (+0.07%); split: -0.08%, +0.15%
SALU: 499107 -> 499128 (+0.00%); split: -0.08%, +0.09%

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28619>
2024-05-28 12:23:45 +00:00
Rhys Perry 12b4bdc134 aco/gfx12: decrease max_nsa_vgprs for VSAMPLE
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry ef74407577 aco/gfx12: use ttmp9/ttmp7 for workgroup id
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Rhys Perry fae2a85d57 aco/gfx12: implement subgroup shader clock
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29330>
2024-05-28 10:52:11 +00:00
Samuel Pitoiset 3d6957268b aco: use new common helpers for building buffer descriptors
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29268>
2024-05-22 08:31:39 +00:00
Rhys Perry 4ae8a558b2 aco: remove nir_to_aco
This isn't used anymore.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121>
2024-05-21 21:28:13 +00:00
Rhys Perry b1964f03e7 aco: use scalar phi lowering for lcssa workaround
This lets us use non-undef for the last operand, if necessary
(demonstrated in the test).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121>
2024-05-21 21:28:13 +00:00
Rhys Perry bbe4652430 aco: create lcssa phis for continue_or_break loops when necessary
These might not exist because adding them would decrease the quality of
divergence analysis. They are necessary for continue_or_break though, so
add them later, where they won't affect divergence analysis.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10623
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121>
2024-05-21 21:28:13 +00:00
Rhys Perry 418fed1805 aco: update VS prolog waitcnt for GFX12
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29225>
2024-05-20 10:45:39 +00:00
Rhys Perry 74aa6437d6 aco: add GFX11.5+ opcodes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29162>
2024-05-14 20:50:27 +00:00
Rhys Perry 869253b66c aco: support VS prologs with unaligned access
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29071>
2024-05-13 17:22:26 +00:00
Rhys Perry 9ec2fa392f aco: copy VS prolog constants after loads
This way, the loads start earlier.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29071>
2024-05-13 17:22:26 +00:00
Rhys Perry 46b8ba8154 aco: form hard clauses in VS prologs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29071>
2024-05-13 17:22:26 +00:00
Marek Olšák 58a5de5c34 amd: add gfx12 register definitions into the register header generator
The generator renamed some definitions to resolve conflicts.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29007>
2024-05-11 22:14:05 -04:00
Samuel Pitoiset 53a142ad23 aco: add support for remapping color attachments
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27263>
2024-05-07 10:35:04 +00:00
Daniel Schürmann 2d0c6647f0 aco: use SGPR phi lowering for all scalar phis
No fossil-db changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28661>
2024-04-26 08:39:01 +00:00
Daniel Schürmann 6ec6899bff aco: use SGPR phi lowering for all loop header phis
No fossil-db changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28661>
2024-04-26 08:39:01 +00:00
Daniel Schürmann 7c01193299 aco: use SGPR phi lowering for uniform phis in divergent merge blocks
The fossil changes are due to a slightly different register allocation
from a reversed order of phi instructions.

Totals from 1620 (2.04% of 79395) affected shaders: (GFX10.3)

Instrs: 730683 -> 732621 (+0.27%); split: -0.02%, +0.28%
CodeSize: 3888464 -> 3898488 (+0.26%); split: -0.00%, +0.26%
Latency: 3274291 -> 3275549 (+0.04%); split: -0.02%, +0.06%
InvThroughput: 606625 -> 606661 (+0.01%); split: -0.00%, +0.01%
VClause: 9541 -> 9538 (-0.03%)
SClause: 17296 -> 17272 (-0.14%); split: -0.16%, +0.02%
Copies: 81392 -> 83231 (+2.26%); split: -0.17%, +2.43%
Branches: 27023 -> 27020 (-0.01%); split: -0.03%, +0.02%
VALU: 383380 -> 382749 (-0.16%)
SALU: 160895 -> 163369 (+1.54%); split: -0.03%, +1.57%

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28661>
2024-04-26 08:39:01 +00:00
Daniel Schürmann 6e3446422f aco: introduce aco_opcode::p_boolean_phi
This opcode is only used during instruction selection and
immediately lowered to linear phis afterwards.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28661>
2024-04-26 08:39:01 +00:00
Rhys Perry 37e9e8b06c aco: split vop3p results
Removes copies in the case of:
a = fmul
b = fmul
c = vec4(a.x, a.y, b.x, b.y)

fossil-db (navi31):
Totals from 21 (0.03% of 79395) affected shaders:
Instrs: 96481 -> 96338 (-0.15%)
CodeSize: 548452 -> 548196 (-0.05%); split: -0.13%, +0.09%
Latency: 1514460 -> 1514238 (-0.01%); split: -0.02%, +0.00%
InvThroughput: 683048 -> 682942 (-0.02%); split: -0.02%, +0.00%
VClause: 1611 -> 1613 (+0.12%)
Copies: 21326 -> 21190 (-0.64%)
Branches: 2427 -> 2426 (-0.04%)
PreVGPRs: 2289 -> 2298 (+0.39%)
VALU: 59090 -> 58954 (-0.23%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28763>
2024-04-23 12:31:59 +00:00
Georg Lehmann 4b5016a537 aco: support high_16bits FS IO
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28435>
2024-04-10 07:49:27 +00:00
Georg Lehmann 893ee883fe aco: use v1 definition for v_interp_p1lv_f16
The result of the first interpolation step is always fp32.

Fixes: 1647e098e9 ("aco: implement 16-bit interp")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28435>
2024-04-10 07:49:26 +00:00
Rhys Perry 0f2d5ed75c aco: assume no unreachable blocks
These shouldn't happen anymore.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28301>
2024-04-08 18:38:39 +00:00
Rhys Perry 543ca160a5 nir,aco: add test intrinsics
These don't really do anything. They're just a source and user of SSA
defs.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28301>
2024-04-08 18:38:39 +00:00
Rhys Perry 0a25af1d4e aco: save/reset/combine has_divergent_continue in uniform branches
For
if (uniform) {
   if (divergent)
      continue
} else {
   break
}
we don't need to consider the continue to be divergent.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28301>
2024-04-08 18:38:39 +00:00
Rhys Perry 46c734ff02 aco: ensure loop exits exist in NIR
This simplifies instruction selection and fixes the case where the loop
ends with a continue instruction.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28301>
2024-04-08 18:38:39 +00:00
Samuel Pitoiset 7a69d78ba2 aco: use SPDX-License-Identifier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28622>
2024-04-08 15:49:25 +00:00
Rhys Perry 6b301eae36 aco: implement mqsad_4x8 and shfr
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26251>
2024-04-05 11:01:39 +00:00
Timur Kristóf 8883b88dd4 aco: Delete all TCS epilog code.
Now that neither RADV nor RadeonSI uses TCS epilogs, we don't
need to keep the code to compile them in ACO either.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
2024-03-30 21:56:39 +01:00
Timur Kristóf 3fd002f6d5 radv, aco: Remove the code that jumped to RADV's TCS epilogs.
The actual TCS epilog selection code is kept unchanged for now,
we'll delete it when RadeonSI also gets rid of TCS epilogs.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>
2024-03-28 23:44:03 +00:00
Daniel Schürmann a863c7951e aco: remove create_instruction() template parameter
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann 9b0ebcc39b aco: change return type of create_instruction() to Instruction*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann 1187189235 aco: unify different SALU types into single struct SALU_instruction
This removes
- SOP1_instruction
- SOP2_instruction
- SOPC_instruction
- SOPK_instruction
- SOPP_instruction

and their corresponding methods.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Daniel Schürmann 5d265257a0 aco: remove SOPP_instruction::block member
Re-use SOPP_instruction::imm instead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28370>
2024-03-28 11:25:43 +00:00
Timur Kristóf fcf574f4c1 radv, aco: Delete now dead TCS epilog code.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28371>
2024-03-28 09:41:08 +00:00
Timur Kristóf 023d7fc76d aco: Use tess factors when TCS jumps to epilog.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28371>
2024-03-28 09:41:08 +00:00
Timur Kristóf 3422084026 aco: Use common helper for counting tess level components.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28371>
2024-03-28 09:41:08 +00:00
Marek Olšák 6773595ed0 nir: rename AMD XFB intrinsics to *_gfx11_amd
to indicate it's only for gfx11.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27952>
2024-03-22 21:58:02 +00:00