Kenneth Graunke
e5c1d00faf
brw: Pass devinfo to brw_nir_lower_tes_inputs
...
This will be useful for using reversed patch header layouts.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482 >
2025-11-25 22:43:56 +00:00
Kenneth Graunke
a1c7ae9d15
brw: Implement URB handle intrinsics for TCS and TES stages
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482 >
2025-11-25 22:43:56 +00:00
Lionel Landwerlin
e290f9641d
brw: Implement load/store URB intrinsics
...
These work the same regardless of stage.
v2 (Ken): Rebase, move from mesh to all stages, add reorderable load
variant, allow channel masks to be non-constant even on Xe2.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482 >
2025-11-25 22:43:55 +00:00
Lionel Landwerlin
0d8ee4ed23
brw: use default builder for urb handle adjustment
...
Be consistent with lowering that happens after, so that it gets a full
vector register and can stride into it.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482 >
2025-11-25 22:43:55 +00:00
Kenneth Graunke
f1ab64ad74
nir: add new intrinsics to load/store from URB on intel
...
We add several new intrinsics for accessing URB handles:
- load_urb_output_handle_intel
- load_urb_input_handle_intel
- load_urb_input_handle_intel_indexed
The latter is used by stages like TCS and GS where each input control
point has a unique handle. The index is which ICP to read from. The
others are for most stages, where all inputs or outputs are accessed
via a single handle.
Then we have URB load and store operations, split for Xe2+ (URB via LSC)
and earlier (HDC OWord messages):
- load_urb_vec4_intel
- load_urb_lsc_intel
- store_urb_vec4_intel
- store_urb_lsc_intel
The legacy vec4 variants take a handle and a 128-bit OWord offset as
sources. Additionally, stores take a set of channel enables to mask
off and avoid writing vec4 components. We don't use the WRITE_MASK
const-index as our channel enables are not required to be constant.
The Xe2+ LSC variants are simpler. Handles are byte offsets into the
URB memory region, and offsets are expressed in bytes. So we simply
add them into a single "address" source. We don't support writemasks
here, as they aren't really necessary with the better addressability.
(Plus, the store_cmask operations work significantly differently than
the previous HDC OWord messages). We will lower disjoint writemasks
to multiple stores.
Based on earlier code by Lionel Landwerlin.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482 >
2025-11-25 22:43:54 +00:00
Joshua Ashton
8916b8a7f4
vulkan/wsi: Handle 0xFFFFFFFF special case in vk_wsi_force_swapchain_to_current_extent driconf
...
Current extent can be 0xFFFFFFFF (-1) which indicates there is no current extent, and the swapchain will inherit that of which the application provides.
Check this before applying the hack.
Signed-off-by: Joshua Ashton <joshua@froggi.es >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31134 >
2025-11-25 22:10:38 +00:00
Marek Olšák
5ee9a76058
nir: fix a typo in NIR_PASS_ASSERT_NO_PROGRESS for non-debug builds
...
Fixes: 4e834b4321 - nir: add NIR_PASS_ASSERT_NO_PROGRESS
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38608 >
2025-11-25 21:38:50 +00:00
Marek Olšák
d9d3f6703c
ac,winsys/amdgpu: report why ac_query_gpu_info failed
...
only these case were not reporting anything
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38602 >
2025-11-25 21:17:35 +00:00
Marek Olšák
1c3e7e4ca0
ac: document RELEASE_MEM limitation with PS_DONE/CS_DONE on gfx6-11
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38602 >
2025-11-25 21:17:35 +00:00
Felix DeGrood
406e6e094a
anv/rt: avoid out of bound access by clamping global id
...
Signed-off-by: Felix DeGrood <felix.j.degrood@intel.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Fixes: cff9d82c66 ("anv/rt: rewrite encode.comp for better performance")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38636 >
2025-11-25 19:59:42 +00:00
Lionel Landwerlin
b1e74a1bb1
anv: shrink image opaque data
...
Noticed renderdoc complaining about our size :
RDOC 692028: [18:08:18] vk_core.cpp(2272) - Warning -
VkPhysicalDeviceDescriptorBufferPropertiesEXT.imageCaptureReplayDescriptorDataSizeis too large at 32
(must be <= 16), can't support capture of VK_EXT_descriptor_buffer
Since we only need 2 pointers (main + private), we can shrink this to
16bytes. The 1/2 planes have a relative offset from the base.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Acked-by: Ivan Briano <ivan.briano@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38625 >
2025-11-25 19:38:53 +00:00
Benjamin Cheng
6aabc3d5d2
ac/parse_ib: Implement VCN dec message parsing
...
This makes the IB dumps more useful for decode, as most of the actual
decode command is within the message buffers.
Reviewed-by: David Rosca <david.rosca@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38631 >
2025-11-25 19:17:12 +00:00
Karol Herbst
28060c1e05
nvk/ci: add broken coop matrix CTS tests to skips
...
Reviewed-by: Mel Henning <mhenning@darkrefraction.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38657 >
2025-11-25 18:50:19 +00:00
Danylo Piliaiev
580ce1c911
tu: Handle mismatch in mip layouts for reinterpreted compressed images
...
If the threshold of the linear mipmap fallback for compressed
format is reached at a different mipmap level than the
size-compatible non-compressed formats the image can be viewed as,
then we have to disable the fallback. Otherwise, for some levels,
texels would be read from the wrong locations due to the tiling
mismatch.
NOTE: Prop driver falls back to LINEAR in this case.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38655 >
2025-11-25 18:25:57 +00:00
Danylo Piliaiev
e347f82aeb
freedreno/layout: Use blocks for linear mipmap fallback where possible
...
Starting from at least A6XX gen3 there is an option for compressed
formats to have linear fallback threshold in compressed blocks, instead
of threshold in raw texels. However, proprietary driver doesn't enable
it on A6XX, so to be safe we also enable it only on A7XX.
Additionaly, clarify what the linear fallback threshold is really about.
It's about tiling, not strictly about UBWC.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38655 >
2025-11-25 18:25:57 +00:00
Rob Clark
654b0dd548
freedreno: Remove use of FDL_MIN_UBWC_WIDTH
...
fdl6_layout_image() already has the fallback-to-linear check. And with
gen8 this is no longer a fixed constant. So lets just consolidate all
the logic in fdl layout helper.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38655 >
2025-11-25 18:25:57 +00:00
Aitor Camacho
c6ad0f781a
kk: Force attachment load as temp solution to preserve attachment
...
Attachment preserve is not correctly handled due to multiple issues:
- No knowledge that we need to do that from render pass
- Setting VkEvent from command buffers break MTLRenderPass
- Pipeline barriers break MTLRenderPasses
- vkCmdBeginRendering and vkCmdEndRendering don't account for it
Need to handle all those above first before we can have efficient
MTLRenderPasses implemented.
Acked-by: Arcady Goldmints-Orlov <arcady@lunarg.com >
Signed-off-by: Aitor Camacho <aitor@lunarg.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38612 >
2025-11-25 17:49:53 +00:00
Aitor Camacho
fa420a8649
kk: Remove mem leaks in cmd buf destroy and residency set creation
...
Acked-by: Arcady Goldmints-Orlov <arcady@lunarg.com >
Signed-off-by: Aitor Camacho <aitor@lunarg.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38629 >
2025-11-25 17:33:50 +00:00
Gurchetan Singh
4f9d7c3385
gfxstream: codegen: don't generate custom protocols in function table
...
vulkan_gfxstream.h contains custom protocols not found
in vk.xml (vk_gfxstream.xml).
gfxstream_vk_entrypoints.h is codegen by Mesa common code,
and it does not accept the custom XML.
So avoid generating implementations for them:
guest/vulkan_enc/gfxstream_guest_vk_autogen_impl/gen/func_table.cpp:5321:1:
note: declare 'static' if the function is not intended to be
used outside of this translation unit
5321 | void gfxstream_vk_CollectDescriptorPoolIdsGOOGLE(
| ^
| static
fatal error: too many errors emitted, stopping now [-ferror-limit=]
Reviewed-by: David Gilhooley <djgilhooley.gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38632 >
2025-11-25 09:12:15 -08:00
Gurchetan Singh
169f571f4f
gfxstream: delete createImmutableSamplersFilteredImageInfo
...
This function is not used.
Reviewed-by: David Gilhooley <djgilhooley.gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38632 >
2025-11-25 09:12:10 -08:00
Gurchetan Singh
b6df034363
gfxstream: make functions static when needed
...
Fixes errors like:
ResourceTracker.cpp:62:6: error: no previous prototype for function 'zx_handle_close'
[-Werror,-Wmissing-prototypes]
Reviewed-by: David Gilhooley <djgilhooley.gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38632 >
2025-11-25 09:12:06 -08:00
Gurchetan Singh
0a231dfb40
gfxstream: silence non-null Clang check on Android
...
Workaround:
src/gfxstream/guest/connection-manager/GfxStreamConnectionManager.cpp:82:51:
error: null passed to a callee that requires a non-null argument [-Werror,-Wnonnull]
82 | tss_set(gfxstream_connection_manager_tls_key, nullptr);
| ^~~~~~~
Ultimately, the Bionic headers look wrong. Passing NULL to tss_set
is completely legit.
Reviewed-by: David Gilhooley <djgilhooley.gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38632 >
2025-11-25 09:11:35 -08:00
Lionel Landwerlin
7e72d392d7
brw: switch to load_(pixel_coord|frag_coord_z|frag_coord_w) intrinsics
...
Allows us to better determine if we need Z/W payload delivery.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36392 >
2025-11-25 15:50:48 +00:00
Natalie Vock
b7f011e653
radv/rt: Correctly copy culling flags when updating to separate AS
...
This was missing and led to the field being uninitialized.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38488 >
2025-11-25 15:25:21 +00:00
Natalie Vock
bc1eea90b9
radv/rt: Keep updated nodes always active
...
In updateable AS, we keep all nodes active even if they're
degenerate/NaN, because too many games ignore API rules about not
making inactive nodes active (and some vendor tips outright advise this
behavior). We also need to match this by keeping everything active in
the update side. The ALWAYS_ACTIVE macro has been long removed and
replaced by VK_BVH_BUILD_FLAG, too. Since updating only happens to
updateable AS, don't even check for the flag, just implement the
always-active handling.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38488 >
2025-11-25 15:25:21 +00:00
Lionel Landwerlin
6d3be477ab
anv: enable application shader printfs with debug option
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38638 >
2025-11-25 14:18:42 +00:00
Lionel Landwerlin
4c3bf04dd0
anv: enable mesh/task shader hashes
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38638 >
2025-11-25 14:18:42 +00:00
Lionel Landwerlin
4b9aa9dc91
nir/lower_printf: fix missing singleton add
...
If we're using the singleton, we need to add to it.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38638 >
2025-11-25 14:18:42 +00:00
Lionel Landwerlin
d24633023f
nir/lower_printf: fix array alignment
...
The pointer arithmetic doesn't need a 4byte alignment, otherwise
everything is broken.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38638 >
2025-11-25 14:18:42 +00:00
Lionel Landwerlin
67faf6dfbd
spirv: fix printf generation
...
Not having the uses_printf will drop the printf info in serialization.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38638 >
2025-11-25 14:18:42 +00:00
Lionel Landwerlin
6940b8fcd7
nir: fix lower_printf with no arguments
...
SPIRV generated printf with no arguments have an undef source (not the
expected deref).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38638 >
2025-11-25 14:18:42 +00:00
Lionel Landwerlin
c2c815afd9
nir: print out number of printfs
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38638 >
2025-11-25 14:18:42 +00:00
Thong Thai
b4e7c13ef4
meson: add libva wrap and fallback option
...
Allow for Mesa to be built with VAAPI support, without having to install
libva headers first.
Signed-off-by: Thong Thai <thong.thai@amd.com >
Acked-by: Ruijing Dong <ruijing.dong@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38271 >
2025-11-25 13:37:49 +00:00
Georg Lehmann
f5eb3fe9cb
aco/optimizer: optimze cndmask(a, b, not(c)) to cndmask(b, a, c)
...
Can happen with nir_op_bitz/b2f/b2i.
Foz-DB Navi48:
Totals from 3465 (4.20% of 82419) affected shaders:
Instrs: 7534077 -> 7527637 (-0.09%); split: -0.09%, +0.01%
CodeSize: 40017384 -> 39993008 (-0.06%); split: -0.07%, +0.01%
Latency: 38593071 -> 38582815 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 8519291 -> 8518620 (-0.01%); split: -0.01%, +0.00%
VClause: 151669 -> 151662 (-0.00%); split: -0.02%, +0.02%
SClause: 155781 -> 155772 (-0.01%); split: -0.01%, +0.01%
Copies: 628453 -> 628531 (+0.01%); split: -0.01%, +0.02%
Branches: 180429 -> 180430 (+0.00%)
PreSGPRs: 182855 -> 182801 (-0.03%)
VALU: 4315173 -> 4315241 (+0.00%); split: -0.00%, +0.00%
SALU: 992125 -> 986876 (-0.53%); split: -0.53%, +0.00%
VOPD: 15827 -> 15838 (+0.07%); split: +0.23%, -0.16%
Foz-DB Navi21:
Totals from 3341 (4.06% of 82387) affected shaders:
MaxWaves: 61924 -> 61950 (+0.04%)
Instrs: 6640276 -> 6635078 (-0.08%); split: -0.08%, +0.00%
CodeSize: 35932788 -> 35913760 (-0.05%); split: -0.06%, +0.00%
VGPRs: 205512 -> 205456 (-0.03%)
Latency: 40201463 -> 40194285 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 12379144 -> 12378028 (-0.01%); split: -0.01%, +0.00%
VClause: 151556 -> 151563 (+0.00%); split: -0.01%, +0.01%
SClause: 157470 -> 157472 (+0.00%); split: -0.00%, +0.01%
Copies: 645034 -> 644947 (-0.01%); split: -0.02%, +0.01%
Branches: 192070 -> 192071 (+0.00%)
PreSGPRs: 173368 -> 173311 (-0.03%)
VALU: 4554790 -> 4554782 (-0.00%); split: -0.00%, +0.00%
SALU: 881251 -> 876087 (-0.59%); split: -0.59%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:19 +00:00
Georg Lehmann
752f1fb4ae
aco/optimizer: extend existing patterns to handle b2f/b2i(not(a))
...
The next commit will optimize b2f(not(a)) and b2i(not(a)),
so handle those in other patterns to prevent regressions.
No Foz-DB changes on its own.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:19 +00:00
Georg Lehmann
c538f47f03
aco/optimizer: create ff0/bcnt0
...
Foz-DB Navi21:
Totals from 1 (0.00% of 82387) affected shaders:
Instrs: 350 -> 347 (-0.86%)
CodeSize: 1800 -> 1788 (-0.67%)
Latency: 2427 -> 2421 (-0.25%)
SALU: 80 -> 77 (-3.75%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:18 +00:00
Georg Lehmann
0f7a1ce23e
aco/optimizer: some more mul opts
...
Foz-DB Navi48:
Totals from 1650 (2.00% of 82419) affected shaders:
Instrs: 975716 -> 970609 (-0.52%); split: -0.53%, +0.00%
CodeSize: 4986260 -> 4982916 (-0.07%); split: -0.09%, +0.02%
Latency: 2795394 -> 2793211 (-0.08%); split: -0.09%, +0.01%
InvThroughput: 620892 -> 620914 (+0.00%); split: -0.00%, +0.01%
VClause: 18773 -> 18729 (-0.23%)
SClause: 13219 -> 13218 (-0.01%)
Copies: 53619 -> 53620 (+0.00%); split: -0.01%, +0.01%
VALU: 592094 -> 592096 (+0.00%); split: -0.00%, +0.00%
SALU: 96586 -> 93532 (-3.16%); split: -3.17%, +0.00%
Foz-DB Navi21:
Totals from 1647 (2.00% of 82387) affected shaders:
Instrs: 1104100 -> 1100149 (-0.36%); split: -0.36%, +0.00%
CodeSize: 5631092 -> 5637668 (+0.12%); split: -0.00%, +0.12%
Latency: 3503029 -> 3501621 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 1088494 -> 1088495 (+0.00%); split: -0.00%, +0.00%
VClause: 20898 -> 20885 (-0.06%)
Copies: 72641 -> 72635 (-0.01%); split: -0.02%, +0.01%
VALU: 725593 -> 725592 (-0.00%); split: -0.00%, +0.00%
SALU: 139046 -> 135175 (-2.78%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:17 +00:00
Georg Lehmann
92dbf42379
aco/optimizer: use cndmask for neg(b2i)
...
Foz-DB Navi48:
Totals from 1310 (1.59% of 82419) affected shaders:
Instrs: 1337622 -> 1338677 (+0.08%); split: -0.00%, +0.08%
CodeSize: 7039828 -> 7043996 (+0.06%); split: -0.00%, +0.06%
Latency: 7783135 -> 7782526 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 1587987 -> 1586644 (-0.08%)
Branches: 24320 -> 24318 (-0.01%)
Foz-DB Navi21:
Totals from 334 (0.41% of 82387) affected shaders:
Instrs: 666102 -> 666094 (-0.00%)
CodeSize: 3599748 -> 3599724 (-0.00%)
Latency: 6873870 -> 6873868 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 2151773 -> 2151780 (+0.00%); split: -0.00%, +0.00%
Branches: 17419 -> 17411 (-0.05%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:17 +00:00
Georg Lehmann
0e4d4aeef7
aco/optimizer: add some bitop combining
...
Foz-DB Navi48:
Totals from 53 (0.06% of 82419) affected shaders:
Instrs: 172843 -> 172769 (-0.04%); split: -0.06%, +0.01%
CodeSize: 937308 -> 936924 (-0.04%); split: -0.04%, +0.00%
Latency: 454652 -> 454823 (+0.04%); split: -0.01%, +0.05%
InvThroughput: 89833 -> 89812 (-0.02%); split: -0.06%, +0.03%
PreSGPRs: 2926 -> 2929 (+0.10%)
PreVGPRs: 2920 -> 2919 (-0.03%); split: -0.07%, +0.03%
VALU: 76638 -> 76556 (-0.11%)
SALU: 37856 -> 37859 (+0.01%); split: -0.01%, +0.01%
VOPD: 10943 -> 10936 (-0.06%)
Foz-DB Navi21:
Totals from 59 (0.07% of 82387) affected shaders:
Instrs: 1047744 -> 1047578 (-0.02%)
CodeSize: 5641948 -> 5640780 (-0.02%)
Latency: 5116816 -> 5116957 (+0.00%); split: -0.00%, +0.01%
InvThroughput: 1274035 -> 1274023 (-0.00%); split: -0.00%, +0.00%
VClause: 30744 -> 30745 (+0.00%)
PreSGPRs: 3329 -> 3333 (+0.12%)
PreVGPRs: 4130 -> 4129 (-0.02%); split: -0.05%, +0.02%
VALU: 689731 -> 689562 (-0.02%)
SALU: 162830 -> 162833 (+0.00%); split: -0.00%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:16 +00:00
Georg Lehmann
ee0354e0f1
aco/optimizer: use new helpers for bitwise n2 opts
...
Foz-DB Navi48:
Totals from 604 (0.73% of 82419) affected shaders:
Instrs: 2759878 -> 2758431 (-0.05%); split: -0.06%, +0.01%
CodeSize: 14801888 -> 14793412 (-0.06%); split: -0.06%, +0.01%
SpillSGPRs: 6237 -> 6233 (-0.06%)
Latency: 23509766 -> 23507853 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 7471297 -> 7471008 (-0.00%); split: -0.00%, +0.00%
Branches: 104979 -> 104977 (-0.00%)
PreSGPRs: 51506 -> 51408 (-0.19%); split: -0.20%, +0.01%
VALU: 1351564 -> 1351561 (-0.00%); split: -0.00%, +0.00%
SALU: 537430 -> 536266 (-0.22%); split: -0.23%, +0.01%
VOPD: 3834 -> 3833 (-0.03%)
Foz-DB Navi21:
Totals from 739 (0.90% of 82387) affected shaders:
Instrs: 2489644 -> 2488228 (-0.06%); split: -0.06%, +0.00%
CodeSize: 13930192 -> 13915972 (-0.10%); split: -0.11%, +0.00%
SpillSGPRs: 980 -> 976 (-0.41%)
Latency: 25027553 -> 25027845 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 8591377 -> 8591097 (-0.00%); split: -0.00%, +0.00%
SClause: 78380 -> 78382 (+0.00%)
Copies: 275433 -> 275393 (-0.01%); split: -0.02%, +0.01%
Branches: 113718 -> 113716 (-0.00%)
PreSGPRs: 48377 -> 48260 (-0.24%); split: -0.27%, +0.03%
VALU: 1589250 -> 1589240 (-0.00%)
SALU: 420348 -> 418962 (-0.33%); split: -0.34%, +0.01%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:15 +00:00
Georg Lehmann
758fe79ad5
aco/optimizer: use new helpers for v_sub opts
...
Foz-DB Navi48:
Totals from 1315 (1.60% of 82419) affected shaders:
Instrs: 1339446 -> 1339428 (-0.00%)
CodeSize: 7049636 -> 7049596 (-0.00%)
Latency: 7790708 -> 7790698 (-0.00%)
InvThroughput: 1588815 -> 1588807 (-0.00%)
VALU: 826831 -> 826821 (-0.00%)
Foz-DB Navi21:
Totals from 344 (0.42% of 82387) affected shaders:
Instrs: 692048 -> 692040 (-0.00%); split: -0.00%, +0.00%
Latency: 6987086 -> 6987066 (-0.00%)
InvThroughput: 2174789 -> 2174762 (-0.00%)
Copies: 57845 -> 57850 (+0.01%)
VALU: 475761 -> 475748 (-0.00%)
SALU: 93692 -> 93697 (+0.01%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:14 +00:00
Georg Lehmann
e42be7536c
aco/optimizer: use new helpers for remaining add opts
...
Foz-DB Navi48:
Totals from 373 (0.45% of 82419) affected shaders:
Instrs: 542269 -> 542186 (-0.02%); split: -0.06%, +0.04%
CodeSize: 2872728 -> 2867204 (-0.19%); split: -0.21%, +0.02%
Latency: 3174435 -> 3174634 (+0.01%); split: -0.01%, +0.01%
InvThroughput: 828783 -> 828600 (-0.02%); split: -0.03%, +0.01%
SClause: 11954 -> 11955 (+0.01%)
Copies: 49104 -> 49110 (+0.01%)
PreSGPRs: 15422 -> 15420 (-0.01%)
VALU: 262635 -> 262641 (+0.00%)
Foz-DB Navi21:
Totals from 426 (0.52% of 82387) affected shaders:
Instrs: 624744 -> 624754 (+0.00%); split: -0.00%, +0.00%
CodeSize: 3382728 -> 3385664 (+0.09%); split: -0.00%, +0.09%
Latency: 3841693 -> 3842101 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 1132036 -> 1132065 (+0.00%); split: -0.00%, +0.00%
VClause: 14008 -> 14011 (+0.02%)
Copies: 73104 -> 73114 (+0.01%); split: -0.00%, +0.02%
PreSGPRs: 19504 -> 19502 (-0.01%)
SALU: 131431 -> 131443 (+0.01%)
Foz-DB Polaris10:
Totals from 812 (1.31% of 61894) affected shaders:
Instrs: 610178 -> 609219 (-0.16%); split: -0.21%, +0.05%
CodeSize: 3142404 -> 3147304 (+0.16%); split: -0.02%, +0.17%
VGPRs: 38380 -> 38376 (-0.01%)
Latency: 8312085 -> 8307755 (-0.05%); split: -0.12%, +0.07%
InvThroughput: 3929970 -> 3924631 (-0.14%); split: -0.15%, +0.01%
VClause: 15714 -> 15632 (-0.52%); split: -0.67%, +0.15%
SClause: 14509 -> 14510 (+0.01%); split: -0.02%, +0.03%
Copies: 70197 -> 70388 (+0.27%); split: -0.61%, +0.89%
PreSGPRs: 26409 -> 26404 (-0.02%); split: -0.02%, +0.00%
PreVGPRs: 30448 -> 30436 (-0.04%)
VALU: 408184 -> 407068 (-0.27%); split: -0.29%, +0.01%
SALU: 95726 -> 95959 (+0.24%); split: -0.30%, +0.54%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:13 +00:00
Georg Lehmann
adc55b1a1e
aco/optimizer: use new helpers for v_and opt
...
Foz-DB Navi48:
Totals from 465 (0.56% of 82419) affected shaders:
Instrs: 372721 -> 372083 (-0.17%); split: -0.18%, +0.01%
CodeSize: 2004568 -> 2003332 (-0.06%)
Latency: 3664162 -> 3660745 (-0.09%); split: -0.10%, +0.00%
InvThroughput: 892042 -> 890994 (-0.12%); split: -0.12%, +0.01%
Copies: 35552 -> 35549 (-0.01%)
VALU: 171781 -> 171333 (-0.26%); split: -0.28%, +0.02%
SALU: 87946 -> 87949 (+0.00%)
VOPD: 48 -> 49 (+2.08%)
Foz-DB Navi21:
Totals from 191 (0.23% of 82387) affected shaders:
Instrs: 139340 -> 139178 (-0.12%); split: -0.13%, +0.02%
CodeSize: 798660 -> 798284 (-0.05%)
Latency: 1672750 -> 1673194 (+0.03%); split: -0.06%, +0.08%
InvThroughput: 634847 -> 634651 (-0.03%); split: -0.06%, +0.03%
Copies: 16372 -> 16366 (-0.04%); split: -0.04%, +0.01%
VALU: 79668 -> 79506 (-0.20%); split: -0.23%, +0.03%
SALU: 38233 -> 38236 (+0.01%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:13 +00:00
Georg Lehmann
7bc6d8e2ad
aco/optimizer: add more v_add_lshl_u32 opts
...
No Foz-DB changes on Navi21.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:12 +00:00
Georg Lehmann
6a1caabd64
aco/optimizer: use new helpers for v_add_lshl_u32
...
Foz-DB Navi48:
Totals from 357 (0.43% of 82419) affected shaders:
Instrs: 244419 -> 243608 (-0.33%); split: -0.34%, +0.01%
CodeSize: 1302584 -> 1304188 (+0.12%); split: -0.00%, +0.13%
VGPRs: 21240 -> 21216 (-0.11%)
Latency: 1226165 -> 1225651 (-0.04%); split: -0.06%, +0.02%
InvThroughput: 162432 -> 161940 (-0.30%); split: -0.30%, +0.00%
Copies: 16607 -> 16610 (+0.02%)
PreSGPRs: 14082 -> 14135 (+0.38%)
PreVGPRs: 15917 -> 15914 (-0.02%)
VALU: 136308 -> 135699 (-0.45%)
SALU: 24415 -> 24418 (+0.01%)
VOPD: 333 -> 334 (+0.30%)
Foz-DB Navi21:
Totals from 319 (0.39% of 82387) affected shaders:
Instrs: 255434 -> 254831 (-0.24%)
CodeSize: 1375792 -> 1378164 (+0.17%)
VGPRs: 15360 -> 15344 (-0.10%)
Latency: 1405956 -> 1405181 (-0.06%)
InvThroughput: 174402 -> 173816 (-0.34%)
Copies: 25892 -> 25891 (-0.00%)
PreSGPRs: 14129 -> 14132 (+0.02%)
PreVGPRs: 12457 -> 12454 (-0.02%)
VALU: 139630 -> 139032 (-0.43%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:12 +00:00
Georg Lehmann
7108dac637
aco/optimizer: use new helpers for s_lshl<n>_add_u32
...
Foz-DB Navi48:
Totals from 7654 (9.29% of 82419) affected shaders:
Instrs: 6170479 -> 6174536 (+0.07%); split: -0.07%, +0.13%
CodeSize: 32489580 -> 32500100 (+0.03%); split: -0.07%, +0.10%
SpillSGPRs: 4253 -> 4224 (-0.68%); split: -0.71%, +0.02%
Latency: 60472662 -> 60489681 (+0.03%); split: -0.02%, +0.04%
InvThroughput: 9218099 -> 9218149 (+0.00%); split: -0.01%, +0.01%
VClause: 121094 -> 121089 (-0.00%); split: -0.01%, +0.00%
SClause: 178092 -> 179830 (+0.98%); split: -0.55%, +1.53%
Copies: 424495 -> 423756 (-0.17%); split: -0.57%, +0.40%
Branches: 120352 -> 120353 (+0.00%); split: -0.01%, +0.01%
PreSGPRs: 334391 -> 333381 (-0.30%); split: -0.33%, +0.02%
VALU: 3349394 -> 3349323 (-0.00%); split: -0.00%, +0.00%
SALU: 957913 -> 957149 (-0.08%); split: -0.25%, +0.17%
VOPD: 9177 -> 9179 (+0.02%); split: +0.03%, -0.01%
Foz-DB Navi21:
Totals from 7649 (9.28% of 82387) affected shaders:
Instrs: 6144605 -> 6143005 (-0.03%); split: -0.06%, +0.04%
CodeSize: 32685976 -> 32672380 (-0.04%); split: -0.08%, +0.04%
SpillSGPRs: 3079 -> 3067 (-0.39%); split: -0.42%, +0.03%
Latency: 64979945 -> 65002741 (+0.04%); split: -0.02%, +0.05%
InvThroughput: 14754398 -> 14754230 (-0.00%); split: -0.01%, +0.01%
VClause: 132336 -> 132357 (+0.02%); split: -0.02%, +0.03%
SClause: 190229 -> 191340 (+0.58%); split: -1.01%, +1.60%
Copies: 511915 -> 511287 (-0.12%); split: -0.44%, +0.32%
Branches: 157156 -> 157154 (-0.00%); split: -0.01%, +0.01%
PreSGPRs: 345761 -> 344826 (-0.27%); split: -0.33%, +0.05%
VALU: 3856887 -> 3856928 (+0.00%); split: -0.01%, +0.01%
SALU: 1001190 -> 1000362 (-0.08%); split: -0.22%, +0.14%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:10 +00:00
Georg Lehmann
d9919c3e10
aco/optimizer: optimize add(mad_u32_u16(a, b, 0), c)
...
Foz-DB Navi48:
Totals from 104 (0.13% of 82419) affected shaders:
Instrs: 3554243 -> 3553555 (-0.02%); split: -0.02%, +0.00%
CodeSize: 18836004 -> 18830572 (-0.03%); split: -0.03%, +0.00%
Latency: 19288034 -> 19287208 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 3527510 -> 3526925 (-0.02%); split: -0.02%, +0.00%
VClause: 89526 -> 89522 (-0.00%); split: -0.02%, +0.01%
SClause: 62484 -> 62492 (+0.01%); split: -0.00%, +0.01%
Copies: 266415 -> 266404 (-0.00%); split: -0.04%, +0.03%
Branches: 102123 -> 102125 (+0.00%)
VALU: 1987067 -> 1986531 (-0.03%); split: -0.03%, +0.00%
SALU: 471348 -> 471346 (-0.00%); split: -0.00%, +0.00%
Foz-DB Navi21:
Totals from 228 (0.28% of 82387) affected shaders:
Instrs: 3069693 -> 3068317 (-0.04%); split: -0.05%, +0.00%
CodeSize: 16582476 -> 16574920 (-0.05%); split: -0.05%, +0.00%
Latency: 20038755 -> 20030986 (-0.04%); split: -0.04%, +0.00%
InvThroughput: 4742546 -> 4738245 (-0.09%); split: -0.10%, +0.00%
VClause: 93157 -> 93135 (-0.02%); split: -0.03%, +0.01%
Copies: 265019 -> 264959 (-0.02%); split: -0.04%, +0.02%
VALU: 2025352 -> 2023897 (-0.07%); split: -0.07%, +0.00%
SALU: 447385 -> 447375 (-0.00%); split: -0.00%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:10 +00:00
Georg Lehmann
0359c8a901
aco/optimizer: use new helpers for v_add_u32 opts
...
Foz-DB Navi48:
Totals from 1554 (1.89% of 82419) affected shaders:
Instrs: 5154325 -> 5151499 (-0.05%); split: -0.08%, +0.02%
CodeSize: 27310012 -> 27318708 (+0.03%); split: -0.01%, +0.05%
VGPRs: 97236 -> 97200 (-0.04%); split: -0.05%, +0.01%
Latency: 34121873 -> 34120894 (-0.00%); split: -0.02%, +0.01%
InvThroughput: 6735276 -> 6730418 (-0.07%); split: -0.08%, +0.01%
VClause: 130106 -> 130090 (-0.01%); split: -0.05%, +0.04%
SClause: 90439 -> 90449 (+0.01%); split: -0.00%, +0.01%
Copies: 382920 -> 382401 (-0.14%); split: -0.18%, +0.05%
Branches: 130089 -> 130091 (+0.00%)
PreSGPRs: 67745 -> 67743 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 72710 -> 72674 (-0.05%)
VALU: 2941866 -> 2938129 (-0.13%); split: -0.13%, +0.00%
SALU: 651032 -> 651779 (+0.11%); split: -0.02%, +0.14%
VOPD: 2446 -> 2393 (-2.17%); split: +0.70%, -2.86%
Foz-DB Navi21:
Totals from 1534 (1.86% of 82387) affected shaders:
MaxWaves: 32481 -> 32479 (-0.01%)
Instrs: 4732755 -> 4730039 (-0.06%); split: -0.06%, +0.00%
CodeSize: 25305728 -> 25313148 (+0.03%); split: -0.00%, +0.03%
VGPRs: 84424 -> 84448 (+0.03%)
SpillVGPRs: 2420 -> 2419 (-0.04%)
Scratch: 180224 -> 179200 (-0.57%)
Latency: 36843383 -> 36846269 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 9252495 -> 9238142 (-0.16%); split: -0.17%, +0.02%
VClause: 146629 -> 146671 (+0.03%); split: -0.02%, +0.05%
SClause: 94502 -> 94512 (+0.01%); split: -0.00%, +0.01%
Copies: 403672 -> 403592 (-0.02%); split: -0.09%, +0.07%
Branches: 141145 -> 141137 (-0.01%)
PreSGPRs: 70003 -> 70001 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 70835 -> 70800 (-0.05%)
VALU: 3114513 -> 3111338 (-0.10%); split: -0.10%, +0.00%
SALU: 651177 -> 651925 (+0.11%); split: -0.02%, +0.13%
VMEM: 271263 -> 271261 (-0.00%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:09 +00:00
Georg Lehmann
715b9214da
aco/optimizer: use new helpers for xor opts
...
Foz-DB Navi48:
Totals from 26 (0.03% of 82419) affected shaders:
Instrs: 180854 -> 180787 (-0.04%)
CodeSize: 948640 -> 948832 (+0.02%); split: -0.01%, +0.03%
Latency: 527883 -> 527858 (-0.00%); split: -0.03%, +0.02%
InvThroughput: 149480 -> 149379 (-0.07%); split: -0.07%, +0.00%
PreVGPRs: 1502 -> 1503 (+0.07%)
VALU: 84220 -> 84168 (-0.06%)
Foz-DB Navi21:
Totals from 26 (0.03% of 82387) affected shaders:
Instrs: 150984 -> 150929 (-0.04%)
CodeSize: 800404 -> 800552 (+0.02%); split: -0.00%, +0.02%
Latency: 541067 -> 540854 (-0.04%); split: -0.04%, +0.00%
InvThroughput: 182046 -> 181983 (-0.03%); split: -0.04%, +0.00%
Copies: 11324 -> 11322 (-0.02%)
PreVGPRs: 1568 -> 1569 (+0.06%)
VALU: 96977 -> 96923 (-0.06%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:08 +00:00
Georg Lehmann
3ba783e716
aco/optimizer: use new helpers for v_or opts
...
Foz-DB Navi48:
Totals from 1518 (1.84% of 82419) affected shaders:
Instrs: 6575669 -> 6575601 (-0.00%); split: -0.01%, +0.01%
CodeSize: 35135060 -> 35136020 (+0.00%); split: -0.00%, +0.01%
VGPRs: 99660 -> 99648 (-0.01%)
Latency: 47912874 -> 47910876 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 9913228 -> 9912959 (-0.00%); split: -0.00%, +0.00%
VClause: 151572 -> 151567 (-0.00%); split: -0.01%, +0.00%
SClause: 133112 -> 133109 (-0.00%); split: -0.00%, +0.00%
Copies: 577835 -> 577837 (+0.00%); split: -0.01%, +0.01%
PreSGPRs: 84939 -> 84898 (-0.05%)
PreVGPRs: 75892 -> 75891 (-0.00%)
VALU: 3520300 -> 3520176 (-0.00%); split: -0.00%, +0.00%
SALU: 1026499 -> 1026529 (+0.00%); split: -0.00%, +0.01%
VOPD: 6830 -> 6850 (+0.29%); split: +0.31%, -0.01%
Foz-DB Navi21:
Totals from 1508 (1.83% of 82387) affected shaders:
Instrs: 5053785 -> 5053710 (-0.00%); split: -0.00%, +0.00%
CodeSize: 27603768 -> 27604048 (+0.00%); split: -0.00%, +0.00%
Latency: 44447441 -> 44444474 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 11666771 -> 11666371 (-0.00%); split: -0.00%, +0.00%
SClause: 121429 -> 121435 (+0.00%); split: -0.00%, +0.01%
Copies: 496693 -> 496642 (-0.01%); split: -0.02%, +0.01%
PreSGPRs: 72106 -> 72071 (-0.05%)
PreVGPRs: 69819 -> 69818 (-0.00%)
VALU: 3294641 -> 3294547 (-0.00%); split: -0.00%, +0.00%
SALU: 799012 -> 799014 (+0.00%); split: -0.01%, +0.01%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:08 +00:00