Samuel Pitoiset
a0c05dd7b5
radv/meta: convert the HTILE expand CS pipelines to vk_meta
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32744 >
2024-12-29 18:31:50 +00:00
Samuel Pitoiset
3ff28c8f98
radv/meta: convert the DCC retile pipelines to vk_meta
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32744 >
2024-12-29 18:31:50 +00:00
Samuel Pitoiset
4521eb1b2b
radv/meta: convert the FMASK copy pipelines to vk_meta
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32744 >
2024-12-29 18:31:50 +00:00
Samuel Pitoiset
a3aeeab434
radv/meta: convert the FMASK expand pipelines to vk_meta
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32744 >
2024-12-29 18:31:50 +00:00
Samuel Pitoiset
f27bee04ce
radv/meta: convert the copy VRS to HTILE pipelines to vk_meta
...
This pipeline was already always on-demand.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32744 >
2024-12-29 18:31:50 +00:00
Samuel Pitoiset
88ffeb61ae
radv/meta: convert the copy/fill pipelines to vk_meta
...
This also switches these pipelines to on-demand always.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32744 >
2024-12-29 18:31:49 +00:00
Samuel Pitoiset
9ebfe81a24
radv/meta: rework creating meta pipelines for query resolves
...
To use the same design as other meta operations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32744 >
2024-12-29 18:31:49 +00:00
Samuel Pitoiset
2bc155959e
radv: pass extra graphics pipeline create info using pNext
...
This will be needed to convert meta graphics pipeline to vk_meta.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32750 >
2024-12-29 17:51:03 +00:00
Samuel Pitoiset
23b1df7953
radv: use VK_PRIMITIVE_TOPOLOGY_META_RECT_LIST_MESA for meta pipelines
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32750 >
2024-12-29 17:51:03 +00:00
Samuel Pitoiset
0f8d07d355
radv: add support for VK_PRIMITIVE_TOPOLOGY_META_RECT_LIST_MESA
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32750 >
2024-12-29 17:51:02 +00:00
Timur Kristóf
de2cb4a7d3
ac/nir: Only store params to attribute ring that are varying.
...
On GFX11+, varying outputs from the last pre-rasterization stage
are implemented by storing the outputs to the so-called
attribute ring.
Make sure to only store them when necessary.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:41 -06:00
Timur Kristóf
13234a8a8a
ac/nir: Only export parameters when they are actually varying.
...
In AMD terminology, varying outputs are implemented by
parameter export instructions on GFX6-10.3 GPUs.
Only emit those when actually necessary.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:38 -06:00
Timur Kristóf
4d6c00944b
ac/nir: Only export positions when they are really system values.
...
In AMD terminology, a system value is implemented by
position export instructions.
Make sure to only emit those when they are needed.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:36 -06:00
Timur Kristóf
f5981e8c0b
ac/nir: Split GS output usage masks to varying and sysval masks.
...
To keep track of which output is used for what purpose.
Note that this commit just adds the capability to track this
separately in ac/nir. The drivers will need to be updated
in the future to take advantage of this.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:33 -06:00
Timur Kristóf
92464109e3
ac/nir: Mark when pre-rast output is used as varying or sysval.
...
In this commit, just collect the info.
It will be taken into use by subsequent commits.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:29 -06:00
Timur Kristóf
cb0671aede
ac/nir/ngg: Refactor storing per-primitive primitive ID to attribute ring.
...
Simplify the code using the helpers introduced in previous commits.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:26 -06:00
Timur Kristóf
edde762b56
ac/nir/ngg: Move emitting GS vertex param exports to if.
...
On GFX10-10.3 (when no attribute ring is present), only emit
the GS vertex parameter exports on the vertex export threads.
Other threads don't have anything to export.
Move this code around to make it a bit easier to follow.
Also add some comments to better explain what's what.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:23 -06:00
Timur Kristóf
68dbcdd935
ac/nir/ngg: Move wait attr ring workaround for GS to better place.
...
The call depends on the phis created by create_output_phis so
the code becomes more readable if we move it closer to that.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:20 -06:00
Timur Kristóf
9acc2f2435
ac/nir/ngg: Remove dead code for attribute ring stores.
...
These are replaced by the new helpers added in previous commits.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:17 -06:00
Timur Kristóf
f528de896e
ac/nir/ngg: Refactor export_pos0_wait_attr_ring.
...
There is no need to create phis in this function anymore,
because they can already be created by create_output_phis beforehand.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:14 -06:00
Timur Kristóf
badbb01c5d
ac/nir/ngg: Refactor GS attribute ring stores.
...
Use the new helper.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:11 -06:00
Timur Kristóf
23c615bde2
ac/nir/ngg: Refactor VS/TES attribute ring stores.
...
Use the new helper.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:09 -06:00
Timur Kristóf
f38680aa1c
ac/nir: Introduce ac_nir_store_parameters_to_attr_ring.
...
This function is going to be used for storing parameter outputs
to the attribute ring, instead of the current implementation.
It is going to be taken into use in the following commits.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:06 -06:00
Timur Kristóf
c4b45f1ec8
ac/nir: Pass ac_nir_prerast_out to ac_nir_export_position.
...
In a subsequent commit, ac_nir_export_position will
start using other fields from ac_nir_prerast_out.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:04 -06:00
Timur Kristóf
3d291a98c4
ac/nir: Pass ac_nir_prerast_out to ac_nir_export_parameters.
...
In a subsequent commit, ac_nir_export_parameters will
start using other fields from ac_nir_prerast_out.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:31:01 -06:00
Timur Kristóf
896237b52e
ac/nir/ngg: Simplify updating mesh shader output info.
...
All 64-bit outputs are already lowered to 32-bit.
There is no need to handle them here.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:30:58 -06:00
Timur Kristóf
f460e3a36b
ac/nir/ngg: Use ac_nir_prerast_out in mesh shader lowering.
...
This will help us share more code between the mesh shader lowering
and other passes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32640 >
2024-12-28 10:30:53 -06:00
David Rosca
a642ff15a6
frontends/va: Fix deinterlace filter
...
Deinterlace filter uses interlaced buffer for output which needs
to be converted to progressive. Add back code that handles this.
Fixes: c324364f39 ("frontends/va: Only use interlaced surfaces when progressive is not supported")
Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32768 >
2024-12-28 12:02:42 +00:00
Lionel Landwerlin
5e4aeb3ad7
anv: fix index buffer size changes
...
With vkCmdBindIndexBuffer2KHR, it is possible for only the provided size
to change, which currently fails to reprogram the index buffer properly.
Signed-off-by: Lionel Landwerlin <llandwerlin@gmail.com >
Fixes: 5c2aca456e ("anv: implement vkCmdBindIndexBuffer2KHR")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12376
Reviewed-by: Tapani Pälli <tapani.palli@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32785 >
2024-12-27 13:20:49 +00:00
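A minimal sketch of the kind of dirty tracking the anv fix above describes, assuming a simplified state struct. The struct, field names, and bind_index_buffer() are hypothetical and are not anv's actual code. The point is that the size takes part in the comparison, so a rebind that changes only the size still marks the state dirty:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index buffer state; names are illustrative, not anv's. */
struct index_buffer_state {
   uint64_t addr;
   uint64_t size;
   uint32_t index_type;
   bool dirty;
};

/* Record a new binding and flag it for reprogramming if anything changed.
 * Comparing the size as well covers a rebind where only the size differs.
 */
static void
bind_index_buffer(struct index_buffer_state *s, uint64_t addr, uint64_t size,
                  uint32_t index_type)
{
   assert(s != NULL);

   if (s->addr != addr || s->size != size || s->index_type != index_type) {
      s->addr = addr;
      s->size = size;
      s->index_type = index_type;
      s->dirty = true;
   }
}
```

Binding the same address and index type with a different size sets dirty again, while an identical rebind leaves it untouched.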
David Rosca
96cb12ac68
radv/amdgpu: Set VCN version for ac_parse_ib
...
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32760 >
2024-12-27 08:17:16 +00:00
David Rosca
e3d602de98
ac/parse_ib: Parse VCN IB_COMMON_OP_WRITEMEMORY
...
And more small fixes.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32760 >
2024-12-27 08:17:16 +00:00
Qiang Yu
b0c47871ec
ac: remove ac_nir_lower_subdword_loads
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32781 >
2024-12-27 01:58:38 +00:00
Qiang Yu
403cdacaff
radeonsi: replace ac_nir_lower_subdword_loads
...
ac_nir_lower_mem_access_bit_sizes() does the work of it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32781 >
2024-12-27 01:58:38 +00:00
Qiang Yu
955ae53efd
radeonsi: fix OpenCL piglit tests fails when using ACO
...
There are now no regressions compared with using LLVM.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32781 >
2024-12-27 01:58:38 +00:00
Qiang Yu
21f888a3ed
ac,radv: move ac_nir_lower_bit_size_callback to common place
...
To be used by radeonsi for OpenCL.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32781 >
2024-12-27 01:58:38 +00:00
Qiang Yu
5f601361ed
ac/nir: lower access for shared and scratch memory
...
OpenCL may load and store vec16 data, while ACO only
supports accesses of <= 32 bytes. Radeonsi is going to use
ac_nir_lower_mem_access_bit_sizes() for lowering these
accesses.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32781 >
2024-12-27 01:58:38 +00:00
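As a rough illustration of the lowering described in the commit above, the sketch below splits one wide access into power-of-two chunks no larger than a given limit. The split_access() helper is hypothetical. It is not ac_nir_lower_mem_access_bit_sizes(), which works on NIR intrinsics and also has to take alignment and bit sizes into account:

```c
#include <assert.h>

/* Hypothetical helper: split an access of total_bytes into power-of-two
 * chunks of at most max_bytes each. Writes the chunk sizes to chunks[]
 * and returns how many chunks were produced.
 */
static unsigned
split_access(unsigned total_bytes, unsigned max_bytes, unsigned *chunks,
             unsigned max_chunks)
{
   assert(max_bytes > 0);

   unsigned n = 0;
   while (total_bytes > 0) {
      unsigned limit = total_bytes < max_bytes ? total_bytes : max_bytes;

      /* Largest power of two that still fits into the limit. */
      unsigned size = 1;
      while (size * 2 <= limit)
         size *= 2;

      assert(n < max_chunks);
      chunks[n++] = size;
      total_bytes -= size;
   }

   return n;
}
```

With a 32-byte limit, a 64-byte vec16 access of 32-bit components becomes two 32-byte chunks.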
Qiang Yu
9a8eef282b
radeonsi: fix OpenCL shader compile fail
...
sel->stage is statically assigned MESA_SHADER_COMPUTE. After the change
to use nir->info.stage, MESA_SHADER_KERNEL needs to be handled as well.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12372
Fixes: 9b7ea720c9 ("radeonsi: use nir->info instead of sel->info.base")
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32781 >
2024-12-27 01:58:38 +00:00
Marek Olšák
c0e5e8f932
amd: update addrlib
...
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32687 >
2024-12-26 21:02:21 +00:00
Georg Lehmann
33a73203b0
aco/isel: skip and(exec) for top level demote_if/terminate_if
...
In nested control flow this is necessary to not demote/terminate invocations
that are part of the global but not part of the local mask.
At the top level, the masks are the same and no additional invocations
can be accidentally disabled.
Foz-DB Navi21:
Totals from 2095 (2.64% of 79395) affected shaders:
Instrs: 1058326 -> 1056839 (-0.14%)
CodeSize: 5632480 -> 5626616 (-0.10%)
Latency: 12082761 -> 12080520 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 2246677 -> 2246636 (-0.00%); split: -0.00%, +0.00%
Copies: 114446 -> 114433 (-0.01%)
SALU: 230585 -> 229098 (-0.64%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32755 >
2024-12-26 18:34:38 +00:00
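The mask reasoning in the commit above can be modelled with plain bitmasks. This is a simplified sketch and not ACO code: the function names are hypothetical, one bit stands for one invocation, "live" is the global mask of invocations that have not been demoted, and "exec" is the local mask of the current control flow:

```c
#include <assert.h>
#include <stdint.h>

/* Nested control flow: only lanes in the local exec mask may be demoted,
 * so the condition has to be ANDed with exec first.
 */
static uint64_t
demote_nested(uint64_t live, uint64_t exec, uint64_t cond)
{
   assert((exec & ~live) == 0); /* exec is a subset of the live lanes */
   return live & ~(cond & exec);
}

/* Top level: exec equals the live mask, so the AND can be skipped.
 * Condition bits outside the live mask only clear bits that are already 0.
 */
static uint64_t
demote_top_level(uint64_t live, uint64_t cond)
{
   return live & ~cond;
}
```

When exec equals live, both functions return the same mask. When exec is a strict subset, skipping the AND could remove lanes that are live but not currently executing.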
Georg Lehmann
5b4b195f1b
nir: optimize unpacking 8bit values from a 64bit source
...
Useful for load vectorization.
Foz-DB Navi21:
Totals from 299 (0.38% of 79395) affected shaders:
Instrs: 287818 -> 284333 (-1.21%); split: -1.21%, +0.00%
CodeSize: 1557124 -> 1540544 (-1.06%); split: -1.07%, +0.00%
Latency: 4009407 -> 4012389 (+0.07%); split: -0.05%, +0.12%
InvThroughput: 1260613 -> 1262530 (+0.15%); split: -0.01%, +0.17%
VClause: 5472 -> 5369 (-1.88%); split: -1.92%, +0.04%
SClause: 5419 -> 5305 (-2.10%); split: -2.58%, +0.48%
Copies: 36709 -> 36060 (-1.77%); split: -1.81%, +0.04%
PreSGPRs: 11861 -> 11655 (-1.74%)
SALU: 66920 -> 64310 (-3.90%)
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32778 >
2024-12-26 17:50:32 +00:00
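The commit above does not spell out the rewrite, so the following is only an illustration of the underlying identity, with hypothetical function names: byte i of a 64-bit value is byte (i & 3) of one of its 32-bit halves, so the extraction never needs a 64-bit shift:

```c
#include <assert.h>
#include <stdint.h>

/* Reference: extract byte i (0..7) straight from the 64-bit value. */
static uint8_t
extract_u8_from_64(uint64_t v, unsigned i)
{
   assert(i < 8);
   return (uint8_t)((v >> (8 * i)) & 0xff);
}

/* Same result using only 32-bit operations: pick the low or high half,
 * then extract byte (i & 3) from that half.
 */
static uint8_t
extract_u8_via_32(uint64_t v, unsigned i)
{
   assert(i < 8);
   uint32_t half = i < 4 ? (uint32_t)v : (uint32_t)(v >> 32);
   return (uint8_t)((half >> (8 * (i & 3))) & 0xff);
}
```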
Marek Olšák
47cdec24ee
radeonsi: remove unused code
...
Reviewed-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32780 >
2024-12-26 10:12:43 +00:00
Marek Olšák
357ee7f699
radeonsi: switch si_get_blitter_vs to IO intrinsics
...
Reviewed-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32780 >
2024-12-26 10:12:43 +00:00
Marek Olšák
a0579f75fb
radeonsi: fix a TCS regression
...
This change caused the regression:
    @@ -853,7 +853,7 @@ bool si_llvm_compile_shader(struct si_screen *sscreen, struct ac_llvm_compiler *
        /* Reset the shader context. */
        ctx.shader = shader;
    -   ctx.stage = sel->stage;
    +   ctx.stage = nir->info.stage;
        bool same_thread_count = shader->key.ge.opt.same_patch_vertices;
        si_build_wrapper_function(&ctx, parts, same_thread_count);
because "nir" contains the previous shader (LS), not the current shader (HS).
Fix it by using prev_nir for the previous shader, so that we can keep using
"nir".
Fixes: 9b7ea720c9 - radeonsi: use nir->info instead of sel->info.base
Reviewed-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32780 >
2024-12-26 10:12:43 +00:00
Marek Olšák
227a894775
radeonsi/ci: update failures
...
Reviewed-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32780 >
2024-12-26 10:12:43 +00:00
Marek Olšák
19c00c586e
ac/llvm: remove unused code
...
Reviewed-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32780 >
2024-12-26 10:12:43 +00:00
Marek Olšák
c6fd69bd5e
ac: remove unused code
...
Reviewed-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32780 >
2024-12-26 10:12:43 +00:00
Evan
4e89690878
amd/vpelib: Shaper Refactor
...
- Refactor Shaper code to apply linear OR PQ based on input transfer function
- Program gamma based on shaper expected input CS
- fix fp16 input handling
- fix snake case in update_whitepoint
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com >
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com >
Signed-off-by: Evan Damphousse <evan.damphousse@amd.com >
Acked-by: Chih-Wei Chien <Chih-Wei.Chien@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32695 >
2024-12-26 01:23:59 +00:00
Hsieh, Mike
596d9ff8cf
amd/vpelib: Refactor 3D LUT parameters
...
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com >
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com >
Acked-by: Chih-Wei Chien <Chih-Wei.Chien@amd.com >
Signed-off-by: Mike Hsieh <Mike.Hsieh@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32695 >
2024-12-26 01:23:59 +00:00
Chen, Phoebe
7d326ab082
amd/vpelib: Refactor YUV format check
...
Use the general vpe_is_yuv* helper functions for the condition check.
Reviewed-by: Evan Damphousse <evan.damphousse@amd.com >
Reviewed-by: Roy Chan <Roy.Chan@amd.com >
Reviewed-by: Navid Assadian <Navid.Assadian@amd.com >
Acked-by: Chih-Wei Chien <Chih-Wei.Chien@amd.com >
Signed-off-by: Phoebe Chen <phoebe.chen@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32695 >
2024-12-26 01:23:59 +00:00
Ian Romanick
0f3a350087
brw/nir: Don't generate scalar byte to float conversions on DG2+ in optimize_extract_to_float
...
The lowering code does not generate efficient code. It is better to
just not emit the bad thing in the first place. The shaders that I
examined had blocks of NIR like:
    con 32 %527 = extract_u8 %456.o, %5 (0x0)
    con 32 %528 = extract_u8 %456.o, %35 (0x1)
    con 32 %529 = extract_u8 %456.o, %14 (0x2)
    con 32 %530 = extract_u8 %456.o, %11 (0x3)
    con 32 %531 = u2f32 %527
    con 32 %532 = u2f32 %528
    con 32 %533 = u2f32 %529
    con 32 %534 = u2f32 %530
In some cases the u2f results are multiplied with 1/255. There may be
a slightly more efficient way to do this by doing something like
    mov(8) g40<1>UW g12.1<32,8,4>UB
    mov(8) g41<1>UW g12.2<32,8,4>UB
    mov(8) g42<1>UW g12.3<32,8,4>UB
    mov(8) g60<1>F g12<32,8,4>UB
    mov(8) g61<1>F g40<1,1,0>UW
    mov(8) g62<1>F g41<1,1,0>UW
    mov(8) g63<1>F g42<1,1,0>UW
In SIMD16 and SIMD32 that would save temporary register space. It could
save a register in SIMD8 by using g40.8 instead of g42. Making that
happen might be tricky. Maybe we should just add a special NIR opcode
that converts a packed uint32 to a vec4?
v2: Add a bunch of documentation explaining what's going on. Suggested
by Ken.
shader-db:
Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
total instructions in shared programs: 18228689 -> 18228720 (<.01%)
instructions in affected programs: 43091 -> 43122 (0.07%)
helped: 0 / HURT: 30
total cycles in shared programs: 932542994 -> 932544290 (<.01%)
cycles in affected programs: 8150758 -> 8152054 (0.02%)
helped: 15 / HURT: 17
fossil-db:
Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Instrs: 142890605 -> 142890392 (-0.00%); split: -0.00%, +0.00%
Cycle count: 21655049536 -> 21654693720 (-0.00%); split: -0.00%, +0.00%
Totals from 181 (0.03% of 553251) affected shaders:
Instrs: 188022 -> 187809 (-0.11%); split: -0.12%, +0.01%
Cycle count: 85291658 -> 84935842 (-0.42%); split: -0.47%, +0.05%
Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
Totals:
Instrs: 154438050 -> 154436980 (-0.00%)
Cycle count: 15334650326 -> 15334644375 (-0.00%); split: -0.00%, +0.00%
Spill count: 56754 -> 56706 (-0.08%)
Fill count: 95919 -> 95808 (-0.12%)
Scratch Memory Size: 2306048 -> 2304000 (-0.09%)
Max live registers: 32469924 -> 32469899 (-0.00%)
Totals from 112 (0.02% of 642922) affected shaders:
Instrs: 156186 -> 155116 (-0.69%)
Cycle count: 11111478 -> 11105527 (-0.05%); split: -0.62%, +0.56%
Spill count: 1766 -> 1718 (-2.72%)
Fill count: 2815 -> 2704 (-3.94%)
Scratch Memory Size: 78848 -> 76800 (-2.60%)
Max live registers: 11526 -> 11501 (-0.22%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29884 >
2024-12-24 18:09:59 -08:00
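For reference, the NIR pattern quoted in the commit above (four extract_u8 instructions, each followed by u2f32) computes the equivalent of the scalar C sketch below. The function name is hypothetical. It shows only the semantics of the packed uint32 to vec4 conversion that the message muses about, not how the backend emits it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert the four bytes of a packed 32-bit value to floats.
 * out[i] corresponds to u2f32(extract_u8(packed, i)).
 */
static void
unpack_u8x4_to_float(uint32_t packed, float out[4])
{
   assert(out != NULL);

   for (unsigned i = 0; i < 4; i++)
      out[i] = (float)((packed >> (8 * i)) & 0xff);
}
```

Shaders that multiply the results by 1/255, as the message mentions, would apply that scale to each out[i] afterwards.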