Assadian, Navid
9a88afecbd
amd/vpelib: More parameters to the segmentation process and introduce validation hook
...
Generalization for the following:
1. pass in the scaler output alignment requirement to segment number determination function
2. parameter validation hook
Signed-off-by: Navid Assadian <Navid.Assadian@amd.com >
Reviewed-by: Roy Chan <Roy.Chan@amd.com >
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com >
Acked-by: Alan Liu <Haoping.Liu@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33833 >
2025-03-06 02:11:53 +00:00
Zhao, Jiali
37c244998a
amd/vpelib: Fix studio output CSC
...
Fix studio output CSC.
Signed-off-by: Jiali Zhao <Jiali.Zhao@amd.com >
Reviewed-by: Roy Chan <Roy.Chan@amd.com >
Reviewed-by: Evan Damphousse <Evan.Damphousse@amd.com >
Acked-by: Alan Liu <Haoping.Liu@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33833 >
2025-03-06 02:11:53 +00:00
Visan, Tiberiu
da04cbca66
amd/vpelib: Apply normalization for full range
...
[WHY]
The full range needs the same brightness normalization as the
studio range.
[HOW]
Apply the same normalization.
Signed-off-by: Tiberiu Visan <Tiberiu.Visan@amd.com >
Reviewed-by: Tomson Chang <Tomson.Chang@amd.com >
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com >
Acked-by: Alan Liu <Haoping.Liu@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33833 >
2025-03-06 02:11:53 +00:00
Visan, Tiberiu
b3d43cea08
amd/vpelib: Fix studio range
...
[WHY]
Studio signal has an offset.
[HOW]
Subtract that offset.
Signed-off-by: Tiberiu Visan <Tiberiu.Visan@amd.com >
Reviewed-by: Roy Chan <Roy.Chan@amd.com >
Reviewed-by: Navid Assadian <Navid.Assadian@amd.com >
Acked-by: Alan Liu <Haoping.Liu@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33833 >
2025-03-06 02:11:53 +00:00
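The offset that the studio-range fix above refers to can be illustrated with a minimal sketch. It assumes 8-bit luma with the conventional studio-range black level of 16 and white level of 235. The helper name is hypothetical and this is not vpelib's actual code, only the general idea of subtracting the offset before scaling.

```c
#include <stdint.h>

/* Hypothetical helper, not part of vpelib: map an 8-bit studio-range
 * luma code to a normalized [0, 1] value. */
static float studio_luma_to_normalized(uint8_t code)
{
    const float black = 16.0f;  /* studio-range black level (the offset) */
    const float white = 235.0f; /* studio-range nominal white level */

    /* Subtract the offset first, then scale by the usable range. */
    float v = ((float)code - black) / (white - black);

    /* Codes outside the nominal range are clamped. */
    if (v < 0.0f)
        v = 0.0f;
    if (v > 1.0f)
        v = 1.0f;
    return v;
}
```

Without the subtraction, code 16 would map to a small positive value instead of black, which is the kind of brightness error the commit describes.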
Leder, Brendan Steve
69c331e2c0
amd/vpelib: Reformat index variables and update enum
...
Reformat index variables to indicate loop specifics and update enum to match formatting guide.
Signed-off-by: Brendan Steve Leder <Brendansteve.Leder@amd.com >
Reviewed-by: Roy Chan <Roy.Chan@amd.com >
Reviewed-by: Evan Damphousse <Evan.Damphousse@amd.com >
Acked-by: Alan Liu <Haoping.Liu@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33833 >
2025-03-06 02:11:53 +00:00
Mike Blumenkrantz
7200cf8827
radv: don't unnecessarily flag prolog recalc when binding VBOs
...
another 25% for vkoverhead@draw_vbo_change_dynamic
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33806 >
2025-03-06 01:26:02 +00:00
Mike Blumenkrantz
4f71370830
radv: get vbo info directly into dgc upload
...
don't need this memcpy
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33806 >
2025-03-06 01:26:02 +00:00
Mike Blumenkrantz
b78835de13
radv: move non_trivial_format calc to dynamic VI bind
...
this otherwise gets pointlessly recalculated on every draw when a VBO changes
another 10% for vkoverhead@draw_vbo_change_dynamic
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33806 >
2025-03-06 01:26:02 +00:00
Mike Blumenkrantz
42db08c275
radv: split out dynamic vertex input descriptor writing
...
~25% boost to vkoverhead@draw_vbo_change_dynamic
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33806 >
2025-03-06 01:26:02 +00:00
Mike Blumenkrantz
22434edefc
radv: inline some vertex descriptor functions
...
+5-7% in vkoverhead 16
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33806 >
2025-03-06 01:26:02 +00:00
Mike Blumenkrantz
00f51f7215
radv: eliminate a memset in radv_get_vbo_info()
...
very minor perf cost
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33806 >
2025-03-06 01:26:02 +00:00
Mike Blumenkrantz
e2ccd638a8
radv: roll line topology dynamic state changes into existing rast samples flag
...
this eliminates uploading rast samples whenever prim type changes even
when rast samples will not be changed
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33806 >
2025-03-06 01:26:02 +00:00
Mike Blumenkrantz
b2123314bd
radv: store vertex prolog simple input check to cmdbuf on vs bind
...
no need to check this again and again
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33806 >
2025-03-06 01:26:02 +00:00
Mike Blumenkrantz
881d94a40a
radv: store num_attributes to shader info
...
this eliminates a util_last_bit from the prolog hotpath
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33806 >
2025-03-06 01:26:02 +00:00
Mike Blumenkrantz
d40dd4bfb7
radv: rewrite radv_get_line_mode() conditional
...
this was weirdly hard to parse
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33806 >
2025-03-06 01:26:02 +00:00
Samuel Pitoiset
f2eb31b1a2
spirv: move workarounds to an inner struct in spirv_to_nir_options
...
To be more explicit.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33866 >
2025-03-05 19:56:50 +00:00
Rhys Perry
0ec174afd5
aco: insert dependency waits in certain situations
...
This seems to fix some artifacts, but we're not sure why, so it might not
be a correct or optimal solution.
fossil-db (navi31):
Totals from 28424 (35.81% of 79377) affected shaders:
Instrs: 30112910 -> 30348977 (+0.78%); split: -0.00%, +0.78%
CodeSize: 159542980 -> 160485336 (+0.59%); split: -0.00%, +0.59%
Latency: 221438396 -> 221500856 (+0.03%); split: -0.00%, +0.03%
InvThroughput: 38154231 -> 38159984 (+0.02%); split: -0.00%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com >
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33853 >
2025-03-05 16:22:54 +00:00
Samuel Pitoiset
ab4d2d447a
radv: remove redundant radv_instance::drirc::rt_wave64
...
Use RADV_PERFTEST_RT_WAVE_64 instead.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33868 >
2025-03-05 12:45:08 +00:00
Samuel Pitoiset
54a62c5c23
radv: use radv_emulate_rt() more
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33868 >
2025-03-05 12:45:08 +00:00
Samuel Pitoiset
9108c198bb
radv: fix trap handler exception options
...
They are the same values.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33868 >
2025-03-05 12:45:07 +00:00
Georg Lehmann
20dd6dfa12
aco/isel: use s_mul_i32 instead of s_cselect_b32 for a ? b : 0
...
It doesn't require SCC and this is more consistent with b2f.
Foz-DB Navi21:
Totals from 2107 (2.64% of 79789) affected shaders:
Instrs: 6619774 -> 6619280 (-0.01%); split: -0.01%, +0.00%
CodeSize: 36754448 -> 36752396 (-0.01%); split: -0.01%, +0.00%
Latency: 62207779 -> 62206422 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 13090494 -> 13090204 (-0.00%); split: -0.00%, +0.00%
VClause: 171572 -> 171573 (+0.00%)
SClause: 257528 -> 257530 (+0.00%)
Copies: 607680 -> 607204 (-0.08%); split: -0.10%, +0.02%
VALU: 4189422 -> 4189418 (-0.00%)
SALU: 1001750 -> 1001264 (-0.05%); split: -0.07%, +0.02%
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33734 >
2025-03-04 21:36:17 +00:00
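The identity behind the s_mul_i32 change above can be sketched in plain C. This assumes the condition is already held as an integer that is exactly 0 or 1. The function names are hypothetical and this is not ACO code, only the arithmetic the selection relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference form: a ? b : 0 written as a select. */
static uint32_t select_or_zero(uint32_t cond, uint32_t b)
{
    return cond ? b : 0;
}

/* Same result via a multiply, valid only when cond is 0 or 1. */
static uint32_t select_or_zero_mul(uint32_t cond, uint32_t b)
{
    assert(cond <= 1); /* assumption: boolean stored as 0 or 1 */
    return cond * b;
}
```

The multiply form needs no condition-code input, which matches the commit's note that the instruction does not require SCC.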
Georg Lehmann
2d68efd9f3
aco/opt_postRA: remove scc == 0 for more opcodes
...
Convert special case to s_cselect
Foz-DB Navi21:
Totals from 42 (0.05% of 79789) affected shaders:
Instrs: 91826 -> 91690 (-0.15%)
CodeSize: 496304 -> 495680 (-0.13%)
Latency: 1631974 -> 1631948 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 278772 -> 278766 (-0.00%)
SALU: 10627 -> 10491 (-1.28%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33734 >
2025-03-04 21:36:17 +00:00
Georg Lehmann
83247ffa30
aco/opt_postRA: remove scc != 0 with multiple uses
...
These can always be removed.
Foz-DB Navi21:
Totals from 39 (0.05% of 79789) affected shaders:
Instrs: 138352 -> 138299 (-0.04%)
CodeSize: 710424 -> 710272 (-0.02%)
Latency: 468276 -> 468254 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 108970 -> 108973 (+0.00%)
SALU: 18785 -> 18732 (-0.28%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33734 >
2025-03-04 21:36:17 +00:00
Georg Lehmann
6445ba0f05
aco/opt_postRA: allow try_optimize_scc_nocompare for all instructions
...
If the old SCC source worked, the new one will too.
Foz-DB Navi21:
Totals from 106 (0.13% of 79789) affected shaders:
Instrs: 255233 -> 254825 (-0.16%)
CodeSize: 1337308 -> 1335692 (-0.12%)
Latency: 1455208 -> 1454524 (-0.05%); split: -0.05%, +0.00%
InvThroughput: 385624 -> 385612 (-0.00%); split: -0.00%, +0.00%
SALU: 53976 -> 53568 (-0.76%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33734 >
2025-03-04 21:36:17 +00:00
Georg Lehmann
3386ea09d4
aco/opt_postRA: split try_optimize_scc_nocompare in two functions
...
These are two independent steps, no real reason why they should be in the same
function.
No FOZ-DB changes.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33734 >
2025-03-04 21:36:17 +00:00
Lucas Fryzek
cfcc522bf8
vulkan/runtime: Add object type to DMR API
...
radv: Update DMR usage to make use of object type arg
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33767 >
2025-03-04 15:24:39 +00:00
Ivan Avdeev
7271b8ee49
radv,radeonsi: disable compute queue for BC250
...
BC250 is known to have a non-functional compute queue. Thousands
of Vulkan CTS tests fail, and many games are known to have visual
glitches. RADV_DEBUG=nocompute is the known workaround for all these
issues.
Disable the compute queue for this chip in both radv and radeonsi.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33116 >
2025-03-04 08:07:31 +00:00
Ivan Avdeev
ff6504d4c0
radv: add experimental support for AMD BC-250 board
...
AMD BC-250 is a mining board based on an AMD APU with an integrated GPU
that the kernel recognizes as Cyan Skillfish.
It is basically RDNA1/GFX10, but with added hardware ray tracing
support. LLVM calls it GFX1013, see
https://llvm.org/docs/AMDGPU/AMDGPUAsmGFX1013.html
Support for this GPU hasn't been extensively tested. Some games are
known to work, some non-trivial ray query compute and ray tracing
pipeline rendering works too. Q2RTX works.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33116 >
2025-03-04 08:07:31 +00:00
Martin Roukala (né Peres)
8fb80834ed
radv/ci: add hawaii to CI
...
This GPU is located in the same host as Tahiti, and was kindly donated
to the RADV project by Leonardo Frassetto (@DottorLeo).
It's good to finally make use of it, one year after receiving it \o/
On a side note, the skips are removed since they do not appear to be
reducing the chances of hanging once paired with the updated postamble
flushes.
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33563 >
2025-03-03 19:42:21 +00:00
Martin Roukala (né Peres)
f4b1d62f00
radv/ci: reduce the timeout of vkcts-tahiti to a more sensible time
...
The current runtime is just over 33 minutes, so no need for
multi-hour-long timeouts.
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33563 >
2025-03-03 19:42:21 +00:00
Timur Kristóf
3f3a5d8068
radv: Use flush postamble on GFX7 with different flags.
...
Flush caches at the end of each submission on GFX7.
This significantly improves stability on Hawaii
when running the CTS on multiple threads.
Keep previous behaviour on GFX6.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33563 >
2025-03-03 19:42:21 +00:00
Samuel Pitoiset
7f6e28db26
radv: fix re-emitting fragment output state when resetting gfx pipeline state
...
When switching from pipeline to shader objects.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33840 >
2025-03-03 19:19:33 +00:00
Konstantin Seurer
6e3fc37d47
radv: Implement multidimensional ray query arrays
...
This is technically a bug fix, but no sane developer would use this.
It's still nice to implement all corner cases.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32334 >
2025-03-03 12:07:47 +00:00
Konstantin Seurer
febc923a46
radv: Lower ray query vars to structs
...
This is much cleaner than passing an index around, and it will allow
implementing multidimensional ray query arrays.
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32334 >
2025-03-03 12:07:47 +00:00
Julia Zhang
79bb8e3455
radv: advertise VK_EXT_device_memory_report
...
Signed-off-by: Julia Zhang <julia.zhang@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33088 >
2025-03-03 08:26:51 +00:00
Julia Zhang
f504ed9e73
radv: emit device memory report for device memory events
...
Emit a device memory report when radv creates or frees memory.
Signed-off-by: Julia Zhang <julia.zhang@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33088 >
2025-03-03 08:26:51 +00:00
Julia Zhang
313aa44bf1
radv: add obj_id to radeon_winsys_bo
...
mem->bo->obj_id will be used by device memory report.
Signed-off-by: Julia Zhang <julia.zhang@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33088 >
2025-03-03 08:26:51 +00:00
Julia Zhang
900be035c8
radv: add import and export handle_type in radv_alloc_memory
...
The import_handle_type and export_handle_type will be used to set the
memoryObjectId for memory report.
Signed-off-by: Julia Zhang <julia.zhang@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33088 >
2025-03-03 08:26:51 +00:00
Alyssa Rosenzweig
d2edb15454
radv: use VK_COPY_STR
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33826 >
2025-03-01 20:27:26 +00:00
Georg Lehmann
975be7ac5d
ac/nir/mem_access_bit_sizes: split unaligned vec3 lds access to allow more read2/write2
...
Foz-DB Navi21:
Totals from 77 (0.10% of 79377) affected shaders:
Instrs: 69787 -> 68745 (-1.49%); split: -1.51%, +0.02%
CodeSize: 367256 -> 360060 (-1.96%); split: -1.97%, +0.01%
VGPRs: 3896 -> 3880 (-0.41%)
Latency: 335403 -> 335297 (-0.03%); split: -0.11%, +0.08%
InvThroughput: 102766 -> 102931 (+0.16%); split: -0.09%, +0.25%
VClause: 1645 -> 1643 (-0.12%); split: -0.18%, +0.06%
SClause: 1434 -> 1433 (-0.07%)
Copies: 4280 -> 4283 (+0.07%); split: -0.56%, +0.63%
PreVGPRs: 2408 -> 2421 (+0.54%); split: -0.08%, +0.62%
VALU: 45557 -> 45646 (+0.20%); split: -0.10%, +0.29%
SALU: 6458 -> 6474 (+0.25%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33448 >
2025-03-01 18:26:54 +00:00
Georg Lehmann
8b2b3e5704
radv: remove outdated vectorize TODO
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33448 >
2025-03-01 18:26:54 +00:00
Georg Lehmann
7eb43c3b1c
aco/optimizer: delete combine_and_subbrev
...
This is now done in NIR. No Foz-DB changes on Navi21.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33761 >
2025-03-01 07:49:28 +00:00
Natalie Vock
237d8799be
radv/rt: Limit monolithic pipelines to 50 stages
...
Beyond that, monolithic pipelines just bloat to incredible sizes,
destroying compile times for questionable, if any, runtime perf benefit.
Indiana Jones: The Great Circle has more than 100 stages and takes
several minutes to compile its RT pipeline on Deck when using monolithic
compilation, and yet separate shaders still end up faster (probably
because instruction cache coherency in traversal is better).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33818 >
2025-02-28 16:22:45 +00:00
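The stage limit described above amounts to a simple threshold check. The sketch below uses hypothetical names and is not RADV's actual code. Treating exactly 50 stages as still eligible is an assumption, since the commit message only says monolithic compilation is limited to 50 stages.

```c
#include <stdbool.h>

/* Limit taken from the commit title; the identifier is hypothetical. */
#define MAX_MONOLITHIC_STAGES 50u

/* Hypothetical helper: decide whether an RT pipeline should be compiled
 * monolithically or fall back to separately compiled shaders. */
static bool should_compile_monolithic(unsigned stage_count, bool monolithic_allowed)
{
    /* Assumption: the boundary is inclusive, so 50 stages still qualify. */
    return monolithic_allowed && stage_count <= MAX_MONOLITHIC_STAGES;
}
```

Under this sketch, a pipeline with more than 100 stages, like the one mentioned in the commit message, would use separate shaders.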
Natalie Vock
d5a2666ad9
aco/ra: Assert operands only clear their own id
...
This is useful for debugging register assignment, as this case would
usually result in RA silently assigning the same register to multiple
temps at the same time.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29576 >
2025-02-28 16:00:48 +00:00
Natalie Vock
1967b0f0c4
aco/tests: Add tests for precolored operands in different regs
...
The first test verifies that, if possible, we don't emit unnecessary
renames/copies for temporaries where it's possible for them to stay
in their current register (if an operand is precolored to the register
the temporary is currently residing in).
The second test verifies that we correctly choose a non-clobbered
operand even if there is one fixed to the temporary's current register.
To minimize copies, we'll want to have the live copy of
%tmp0 in v[2] there, because v[0-1] gets overwritten.
The third test verifies that we add a copy to another free register and
rename if all possible precolored operands are clobbered.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29576 >
2025-02-28 16:00:48 +00:00
Natalie Vock
b8bcc8e5c5
aco/ra: Handle temps fixed to different regs in different operands
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29576 >
2025-02-28 16:00:48 +00:00
Natalie Vock
7a4775b396
aco/ra: Add option to skip renaming for parallelcopies
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29576 >
2025-02-28 16:00:48 +00:00
Natalie Vock
b339bcfa38
aco/ra: Use struct for parallelcopies
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29576 >
2025-02-28 16:00:48 +00:00
Natalie Vock
3f182bc1fa
aco/ra: Use iterators for linear VGPR copy extraction
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29576 >
2025-02-28 16:00:48 +00:00
Georg Lehmann
ea3c04b535
radv/nir_lower_ray_queries: use nir_foreach_function_impl
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33770 >
2025-02-28 14:38:14 +00:00