Commit Graph

15692 Commits

Author SHA1 Message Date
Samuel Pitoiset 8edbfbfe68 radv: specialize push constant DGC token
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30852>
2024-08-28 11:03:36 +00:00
Samuel Pitoiset 7d0972711c radv: simplify allocating push constants with DGC
Using a condition will allow to specialize it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30852>
2024-08-28 11:03:36 +00:00
Samuel Pitoiset 545949d12f radv: specialize VBO DGC token
Can't really specialize more without rewriting VBO completely.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30852>
2024-08-28 11:03:36 +00:00
Samuel Pitoiset 64076c652c radv: specialize pipeline DGC token
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30852>
2024-08-28 11:03:36 +00:00
Samuel Pitoiset 7270bf7aa3 radv: specialize index buffer DGC token
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30852>
2024-08-28 11:03:36 +00:00
Samuel Pitoiset 3128eca2d0 radv: specialize draw DGC token
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30852>
2024-08-28 11:03:36 +00:00
Samuel Pitoiset ccd55b55da radv: specialize dispatch DGC token
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30852>
2024-08-28 11:03:36 +00:00
Samuel Pitoiset b4793400f3 radv: add a pointer to the DGC layout in dgc_cmdbuf
Will be useful.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30852>
2024-08-28 11:03:36 +00:00
Samuel Pitoiset c7540d3fd6 radv: prepare for specialized DGC shaders
The DGC prepare shader is getting crazy and it takes a non-trivial
amount of time. Using specialized DGC shaders is cleaner and it's
faster than a pile of conditional SALU instructions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30852>
2024-08-28 11:03:36 +00:00
Georg Lehmann 246e22ff4f aco/tests: do not use mul with constant to tests neg modifier
The neg can be moved to the constant operand, which defeats the point
of the test.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30781>
2024-08-27 20:41:10 +00:00
Georg Lehmann bf67ac30fe aco/tests: allow literals with resolved swizzles in vop3p test
My new optimizer code will resolve swizzles for constants.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30781>
2024-08-27 20:41:09 +00:00
Georg Lehmann 6a18eb6afc aco/tests: parse neg(constant) in vop3p test
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30781>
2024-08-27 20:41:09 +00:00
Georg Lehmann 52465956ca aco/print_ir: use neg() for constants
Otherwise, it's not clear if -1 is 0xffffffff or 0x80000001.
LLVM uses a similar logic.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30781>
2024-08-27 20:41:09 +00:00
Georg Lehmann fb8e730d9b aco/tests: do not use add to tests neg modifer
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30781>
2024-08-27 20:41:09 +00:00
Georg Lehmann f71522e5cf aco/tests: don't test dpp constant propagation with row shift
With bc=1, removing DPP for shifts is invalid.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30781>
2024-08-27 20:41:09 +00:00
Samuel Pitoiset 2fda0db66f ac,radeonsi,radv: add common GFX preambles
RADV and RadeonSI have a few differences.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30789>
2024-08-27 14:14:57 +00:00
Samuel Pitoiset 80e8e18cc6 ac: add ac_gfx103_get_cu_mask_ps()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30789>
2024-08-27 14:14:57 +00:00
Samuel Pitoiset 9bfb23b252 radv: rework computing the DGC cmdbuf layout
This is much better and less error prone because the offset/size are
computed in only one place now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30868>
2024-08-27 12:36:36 +00:00
Samuel Pitoiset 4c1a912372 radv: remove RADV_DEBUG=nogsfastlaunch2
It's been two Mesa releases since this fast-launch mode2 has been fixed
on GFX11 and everything works as expected. The option is no longer
needed, note that GFX12 only has mode2 apparently.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30815>
2024-08-27 07:51:33 +00:00
Dave Airlie 7db16e7cdd radv: turn video decode/encode on for VCN4 with latest fw
With the latest fw in the linux-firmware repo, navi3x passes
all the CTS tests.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30837>
2024-08-26 22:19:09 +00:00
Dave Airlie 4255bbd958 radv: move video decode enable test into a flag
This makes it easier to start conditionalising this on fw releases.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30837>
2024-08-26 22:19:09 +00:00
Benjamin Cheng 95a980b61f radv/video: add event support for VCN4
This was the main missing piece for passing vulkan video CTS
as the video firmwares couldn't do proper vulkan events.

With new enough firmware this is now possible.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30837>
2024-08-26 22:19:09 +00:00
Assadian, Navid cb32bcd3fe amd/vpelib: Add 420 semi-planar 12bit handling
Adds semi-Planar 420 12 bits formats.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Navid Assadian <navid.assadian@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:15 +00:00
Brendan fcad791d07 amd/vpelib: Create virtual stream concept
[Why]
Need to create streams that don't come from input params (ex. for bg
gen) to prepare for future concepts.

[How]
Add enum for stream type, create helper functions to populate virtual
streams, and add custom functions where virtual stream function varies
from input stream function.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Brendan Leder <brendansteve.leder@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Lin, Ricky b670701b65 amd/vpelib: Increase the CD field in vpe descriptor programming
Introduce the vpe desc writer hook.

Co-authored-by: Roy Chan <roy.chan@amd.com>
Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Ricky Lin <ricky.lin@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Shih, Jude cb9175a7af amd/vpelib: Update Plane Descriptor Writer
Refactor to support new plane descriptor hook, and update enum
vpe_scan_direction.

Co-authored-by: Jesse Agate <jesse.agate@amd.com>
Co-authored-by: Roy Chan <roy.chan@amd.com>
Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Jude Shih <shenshih@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Patel, Utpal 18dae30b17 amd/vpelib: Add resource function hooks for checking support
Add function hooks for checking support including rotation, background
color, DCC capability and input/output support check.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Utpal Patel <utpal.patel@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Alan Liu 06097ad64d amd/vpelib: Remove unused structs
Remove the definition of unused structs:
- struct x_axis_config
- struct point_config
- struct curve_points32
- struct lut_point
- struct pwl_parameter2

Reviewed-by: Krunoslav Kovac <krunoslav.kovac@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Alan Liu <haoping.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Chang, Tomson 6483c2c786 amd/vpelib: Add and fix collaborate sync data
[Why&How]
The original implementation always have sync data == 1.
Make it increasing with some 4 bits in random to help debugging
collaborate sync issues across multiple contexts.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Tomson Chang <tomson.chang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Lin, Ricky 015b1b52c8 amd/vpelib: Remove extra collaborate sync commands in IB
Remove extra collaborate sync commands and fix coding format.

Co-authored-by: Roy Chan <roy.chan@amd.com>
Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Ricky Lin <ricky.lin@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Lin, Ricky e9e2fe389f amd/vpelib: Use VPE_IP_LEVEL_1_0 for VPE IP 6.1.3
Use VPE_IP_LEVEL_1_0 for VPE IP version 6.1.0 and 6.1.3.

Reviewed-by: Tomson Chang <tomson.chang@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Ricky Lin <ricky.lin@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Patel, Utpal 73d112f372 amd/vpelib: Add input pixel format support
Add input pixel format support for VPE.

Signed-off-by: Utpal Patel <utpal.patel@amd.com>
Reviewed-by: Jesse Agate <jesse.agate@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Hsieh, Mike 0164bfda65 amd/vpelib: Add cache mechanism for 3D Lut command
[WHY & HOW]
Converting 3D Lut parameters into vpe command takes time.
3D Lut will not change every frame, by adding cache mechanism can improve effeciency.

Reviewed-by: Tomson Chang <tomson.chang@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Mike Hsieh <mike.hsieh@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Kovac, Krunoslav 9817793cd9 amd/vpelib: Reuse existing float to reg format conversion
Remove vpe_fixpt_from_float and use existing conversion
for double(float)->reg custom 1.6.12 format.

Reviewed-by: Roy Chan <roy.chan@amd.com>
Acked-by: Alan Liu <haoping.liu@amd.com>
Signed-off-by: Krunoslav Kovac <krunoslav.kovac@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30715>
2024-08-26 19:57:14 +00:00
Rhys Perry dea1fedf51 aco/tests: add more VALUMaskWriteHazard tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry 11262a01ce aco: preserve bitsets after a lane mask is written
fossil-db (navi31):
Totals from 4840 (6.10% of 79395) affected shaders:
Instrs: 13733449 -> 13761177 (+0.20%); split: -0.00%, +0.21%
CodeSize: 71997868 -> 72102520 (+0.15%); split: -0.00%, +0.15%
Latency: 128385177 -> 128408780 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 21105847 -> 21109475 (+0.02%); split: -0.00%, +0.02%
VALU: 7741209 -> 7741210 (+0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry 61e73c2323 aco: check SALU writing lanemask later for VALUMaskWriteHazard
This should be done after reads are checked and
sgpr_read_by_valu_as_lanemask_then_wr_by_salu is reset. The old version
also skipped checking the reads if the write check passed.

fossil-db (navi31):
Totals from 193 (0.24% of 79395) affected shaders:
Instrs: 3212435 -> 3212735 (+0.01%)
CodeSize: 16462868 -> 16463848 (+0.01%); split: -0.00%, +0.01%
Latency: 19492377 -> 19492462 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 4419705 -> 4419718 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry b1ba7d1b99 aco: don't consider sa_sdst=0 before SALU write to fix VALUMaskWriteHazard
LLVM does but that's probably a bug.

fossil-db (navi31):
Totals from 311 (0.39% of 79395) affected shaders:
Instrs: 380453 -> 381075 (+0.16%)
CodeSize: 1961012 -> 1964744 (+0.19%)
Latency: 4799095 -> 4800313 (+0.03%)
InvThroughput: 958358 -> 958904 (+0.06%)
VALU: 242322 -> 242633 (+0.13%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry 8f5ee70d85 aco: also consider VALU reads for VALUMaskWriteHazard
fossil-db (navi31):
Totals from 9776 (12.31% of 79395) affected shaders:
Instrs: 19348258 -> 19383680 (+0.18%); split: -0.00%, +0.19%
CodeSize: 101223460 -> 101366964 (+0.14%); split: -0.01%, +0.15%
Latency: 172853115 -> 172866070 (+0.01%); split: -0.01%, +0.01%
InvThroughput: 27590468 -> 27592390 (+0.01%); split: -0.00%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11550
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11436
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11337
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11738
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11741
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Rhys Perry ee648326d9 aco: ignore exec and literals when mitigating VALUMaskWriteHazard
LLVM ignores exec and literals don't seem to work in some cases.

fossil-db (navi31):
Totals from 2676 (3.37% of 79395) affected shaders:
Instrs: 10638979 -> 10646019 (+0.07%); split: -0.00%, +0.07%
CodeSize: 55929640 -> 55959416 (+0.05%); split: -0.00%, +0.06%
Latency: 107707408 -> 107712893 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 18119843 -> 18120442 (+0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 24.1
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30818>
2024-08-26 19:16:34 +00:00
Daniel Schürmann 14de650d58 aco: call nir_copy_prop() and nir_opt_dce() before instruction selection
Totals from 1037 (1.31% of 79395) affected shaders: (Navi21)

MaxWaves: 18760 -> 18960 (+1.07%)
Instrs: 4865258 -> 4860063 (-0.11%); split: -0.11%, +0.00%
CodeSize: 27094112 -> 27089224 (-0.02%); split: -0.06%, +0.04%
VGPRs: 68816 -> 68000 (-1.19%)
SpillVGPRs: 2140 -> 2105 (-1.64%)
Scratch: 4237312 -> 4234240 (-0.07%)
Latency: 55894512 -> 55748035 (-0.26%); split: -0.31%, +0.05%
InvThroughput: 11611286 -> 11372897 (-2.05%); split: -2.09%, +0.03%
VClause: 145331 -> 145285 (-0.03%); split: -0.04%, +0.01%
SClause: 150339 -> 150338 (-0.00%)
Copies: 472476 -> 468470 (-0.85%); split: -0.88%, +0.03%
Branches: 206562 -> 206067 (-0.24%); split: -0.24%, +0.00%
PreVGPRs: 61747 -> 61361 (-0.63%)
VALU: 3116434 -> 3112660 (-0.12%); split: -0.13%, +0.00%
SALU: 723154 -> 722887 (-0.04%); split: -0.04%, +0.01%
VMEM: 238656 -> 238586 (-0.03%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30786>
2024-08-26 12:59:00 +00:00
Samuel Pitoiset cc5d481f41 radv/ci: enable RADV_PERFTEST=transfer_queue on GFX9+
To avoid breaking this because it's not enabled by default.

There is a couple of failures because MSAA is still broken with SDMA.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30778>
2024-08-26 09:26:52 +00:00
Samuel Pitoiset 731523a10b radv/ci: update flakes lists for NAVI21/VANGOGH
Found these when I did a stress test with RADV_PERFTEST=transfer_queue
enabled but they are existing flakes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30778>
2024-08-26 09:26:52 +00:00
Dave Airlie 68cd36d9b4 radv/video: fix reporting video format props for encode.
When encode isn't enabled, refuse the image usage, also use
the correct error on the decode check.

Fixes: 05cd42417f ("radv/video: enable video encoding behind perftest flag")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30838>
2024-08-26 08:49:54 +00:00
Samuel Pitoiset 7f7ecaf08c radv: optimize NOPs padding with DGC
There is two different alignment requirements:
a) IB VA must be aligned to ib_alignment
b) IB size must be aligned to ib_pad_dw_mask

Though RADV was aligning DGC cmdbuf to ib_alignment always, but this is
unnecessary. Using the optimal padding size for DGC cmdbuf removes a
bunch of useless NOPs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30768>
2024-08-26 08:22:06 +00:00
Samuel Pitoiset a7547a9781 radv/amdgpu: assert that the DGC IB VA is correctly aligned
It must be aligned to what the kernel returns.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30768>
2024-08-26 08:22:06 +00:00
Qiang Yu 58e412014a ac,radv,radeonsi: stop using quad vote any/all when llvm
ClustedAnd with bool argument and cluster_size==4 will be lowered
to quad_vote_all. So does ALU nir_iand/ior op with bool src.

OpenGL and Vulkan subgroup clustered_and tests with bool argument
fail when using LLVM. It seems LLVM has bug when quad vote bool
is in complex control flow. So stop using it for now.

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
2024-08-26 10:46:15 +08:00
Qiang Yu a37933b721 ac/llvm: build wqm for quad intrinsics only when fragment shader
Otherwise we get wrong result when non-fragment shader.

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30610>
2024-08-26 10:46:11 +08:00
Karol Herbst 74dafa3c79 ac/llvm: fix umul_high
LLVM optimizes umul_hi with a constant to v_mul_hi_i32_i24_e32 which isn't
always what we need here. This causes miscalculations. To prevent LLVM to
apply this optimization, we insert a optimization barrier.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11761
Suggested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30810>
2024-08-24 16:10:20 +00:00
Samuel Pitoiset 28c957409f radv/amdgpu: do not check that a CS is aligned if no padding is added
Some video queues don't require padding.

Fixes: d5efbc7f1c ("radv/amdgpu: fix CS padding for non-GFX/COMPUTE queues")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30795>
2024-08-23 13:48:51 +00:00