Samuel Pitoiset
76356ed208
aco: remove unreachable code about viewport index/layer and mesh shaders
...
If the mesh shaders exports the viewport index or the layer, the value
can't be NULL, and it should be implicitly zero.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16438 >
2022-05-13 14:01:54 +00:00
Samuel Pitoiset
27f1da8215
radv,aco: do not implicitly export the primitive ID for mesh shaders
...
From the Vulkan spec:
"VUID-VkGraphicsPipelineCreateInfo-PrimitiveId-06264
If the pipeline is being created with pre-rasterization shader
state, it includes a mesh shader and the fragment shader code
reads from an input variable that is decorated with PrimitiveId,
then the mesh shader code must write to a matching output variable,
decorated with PrimitiveId, in all execution paths"
So, if PS uses PrimitiveID, MS must export it (like GS).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16438 >
2022-05-13 14:01:54 +00:00
Samuel Pitoiset
07954a8fd6
aco: only retrieve the scratch offset when it's declared
...
This allows to run most of the fossils we have right now. I will fix
up scratch in upcoming patches.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
5a119f15aa
radv,aco: export alpha-to-coverage via MRTZ on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
b6ba4cb407
aco: do not set COMPR for exports but use 0x3 channel mask on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
a6e1445d5f
aco: do not set RESOURCE_LEVEL for buffer descriptors on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
7647097d3d
aco: update waitcnt on GFX11
...
Not sure if the vmcnt field can use more than 0x3f bits.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
765aa36b9d
aco: update LDS allocation granularity for PS on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
a284b677ba
aco: do not set GLC stores on GFX11
...
It has no effect.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
eea15d6706
aco: do not set DLC for loads on GFX11
...
It means something different.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
bc8da20dda
aco: export MRT0 instead of NULL on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
42ef89e8db
radv,aco: use the new TCS WaveID SGPR to compute vs_rel_patch_id on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
432cde7f00
radv,aco: add support for packed threadID VGPRs on GFX11
...
Thread ID are packed in one VGPR with 10 bits each.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
52952f51cd
aco: do not align VGPRS to 8 or 16 on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
0cb1b12ec0
aco: recognize GFX11 in few places
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Konstantin Seurer
b30f96dd93
radv,aco: Use ray_launch_size_addr
...
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15712 >
2022-05-12 15:04:31 +00:00
Dave Airlie
24176cae55
aco: drop unused radv include
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Dave Airlie
c44d5d61ce
aco: remove radv vs prolog key from aco internals.
...
This creates an aco specific key, and converts radv to it.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Dave Airlie
04c07a2413
aco/radv: convert to aco shader info at the radv level.
...
This removes the radv shader info type from aco completely.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Dave Airlie
199edce84d
aco/info: add some more fields.
...
These fields are also used in aco.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Dave Airlie
8cfd8420ab
aco: convert vs and so info over to aco structs.
...
This renames the vs to vp (vertex pipeline) on the way past.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Dave Airlie
3dba3458e9
aco: remove radv specific streamout info
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Dave Airlie
9bd89af1bc
aco/info: reduce the gs ring info to what is needed.
...
Only one member was being used.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Dave Airlie
87df607ff5
aco: move to a minimal aco shader info struct.
...
This should be kept to only things aco uses, and expanded when
radeonsi support is added. Things should be removed if lowered in NIR.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Dave Airlie
a2701bfdb8
aco: move info pointer to a copy.
...
This is just setup to move this to a different struct later.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Timur Kristóf
f553076eaf
aco: Remove now-superfluous intrinsics.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13155 >
2022-05-10 17:16:03 +00:00
Rhys Perry
cc410dd4d1
aco: fix cmpswap global atomic definition on GFX6
...
Missed this one.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Fixes: 2f0bb39e16 ("aco: ensure that definitions fixed to operands have matching regclasses")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16367 >
2022-05-10 01:01:13 +00:00
Rhys Perry
152092b8ea
aco: skip s_barrier if TCS patches are within subgroup
...
fossil-db (Sienna Cichlid):
Totals from 518 (0.32% of 162293) affected shaders:
Instrs: 124943 -> 123908 (-0.83%)
CodeSize: 708764 -> 704624 (-0.58%)
Latency: 618380 -> 618279 (-0.02%)
InvThroughput: 214061 -> 214051 (-0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16356 >
2022-05-09 16:30:27 +00:00
Rhys Perry
359e60cf5e
aco: split load_sbt_amd result
...
fossil-db (Sienna Cichlid):
Totals from 11 (0.01% of 162293) affected shaders:
Instrs: 47857 -> 47738 (-0.25%)
CodeSize: 261556 -> 261080 (-0.18%)
Latency: 1176822 -> 1176245 (-0.05%)
InvThroughput: 784549 -> 784165 (-0.05%)
Copies: 5959 -> 5840 (-2.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16203 >
2022-05-06 15:15:13 +00:00
Daniel Schürmann
d70688492c
aco/optimizer: re-combine and copy-propagate p_create_vector(p_split_vector)
...
Totals from 309 (0.23% of 134913) affected shaders: (GFX10.3)
CodeSize: 1853812 -> 1857732 (+0.21%); split: -0.05%, +0.27%
Instrs: 340810 -> 341789 (+0.29%); split: -0.07%, +0.36%
Latency: 3301814 -> 3301774 (-0.00%); split: -0.02%, +0.02%
InvThroughput: 590473 -> 590914 (+0.07%); split: -0.00%, +0.08%
Copies: 28751 -> 29731 (+3.41%); split: -0.87%, +4.28%
PreSGPRs: 14010 -> 14028 (+0.13%); split: -0.01%, +0.14%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15414 >
2022-05-06 14:52:07 +00:00
Daniel Schürmann
5e6e47ecea
aco/ra: improve split_vector register assignment if the operand is not killed
...
This allows for more coalescing when lowering the copies.
Totals from 44801 (33.21% of 134913) affected shaders: (GFX10.3)
VGPRs: 1513264 -> 1513248 (-0.00%)
CodeSize: 113354240 -> 113172872 (-0.16%); split: -0.16%, +0.00%
Instrs: 21648793 -> 21603397 (-0.21%); split: -0.21%, +0.00%
Latency: 95762290 -> 95757403 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 15427354 -> 15427341 (-0.00%); split: -0.00%, +0.00%
Copies: 2065330 -> 2019933 (-2.20%); split: -2.20%, +0.00%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15414 >
2022-05-06 14:52:07 +00:00
Daniel Schürmann
499dc20e6a
aco: don't re-create vectors for load_barycentric_* intrinsics
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15414 >
2022-05-06 14:52:07 +00:00
Rhys Perry
2f0bb39e16
aco: ensure that definitions fixed to operands have matching regclasses
...
If the operand is not killed, the definition needs to be large enough so
that the new location for the operand does not intersect with the old
location.
Fixes with zink:
KHR-GL45.shader_image_load_store.basic-allTargets-atomicCS
KHR-GL45.shader_image_load_store.basic-allTargets-atomicGS
KHR-GL45.shader_image_load_store.basic-allTargets-atomicVS
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6276
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16106 >
2022-05-05 19:56:48 +01:00
Rhys Perry
1b639a0ce5
aco/ra: fix vgpr_limit
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Fixes: b98a4d4dd7 ("aco: refactor GPR limit calculation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16297 >
2022-05-04 11:12:13 +00:00
Georg Lehmann
69cceab718
aco: Remove D16 zero components from image stores.
...
No foz-db changes.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15179 >
2022-05-04 09:58:03 +00:00
Georg Lehmann
7a6dbe0c77
aco: Implement image_load d16.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15179 >
2022-05-04 09:58:03 +00:00
Georg Lehmann
7ffcaf9187
aco: Implement image_store d16.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15179 >
2022-05-04 09:58:03 +00:00
Joshua Ashton
ce7966fcb4
aco: Use movk for AddressHi bits in vertex prolog
...
Eliminates a useless extra dword by transforming
s_mov_b32 s47, 0xffff8000 ; beaf03ff ffff8000
to
s_movk_i32 s47, 0x8000 ; b02f8000
which does the same thing.
Reviewed-By: Tatsuyuki Ishi <ishitatsuyuki@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15415 >
2022-05-03 10:15:36 +00:00
Daniel Schürmann
58bd9a379e
aco/ra: fix live-range splits of phi definitions
...
It could happen that at the time of a live-range split,
a phi was not yet placed in the new instruction vector,
and thus, instead of renamed, a new phi was created.
Fixes: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_i8vec2
Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16248 >
2022-05-03 09:54:30 +00:00
Daniel Schürmann
12d7f911c9
aco/optimizer: prevent any overflow between SGPR and const offset on MUBUF
...
Apparently, if the SGPR offset + const offset overflows,
it doesn't work.
Totals from 145 (0.11% of 134913) affected shaders: (GFX10.3)
SpillSGPRs: 134 -> 104 (-22.39%)
CodeSize: 1632676 -> 1645916 (+0.81%); split: -0.03%, +0.84%
Instrs: 316920 -> 320252 (+1.05%); split: -0.01%, +1.07%
Latency: 1456285 -> 1459686 (+0.23%); split: -0.02%, +0.25%
InvThroughput: 165785 -> 166086 (+0.18%); split: -0.02%, +0.20%
VClause: 6815 -> 6875 (+0.88%); split: -0.03%, +0.91%
SClause: 19089 -> 19079 (-0.05%); split: -0.06%, +0.01%
PreSGPRs: 7302 -> 7304 (+0.03%); split: -0.01%, +0.04%
Fixes: KHR-GL45.shader_storage_buffer_object.basic-operations-case1-cs
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15866 >
2022-04-29 15:57:57 +00:00
Daniel Schürmann
d5dc0c0392
aco: adjust num_waves for LDS before scheduling
...
Totals from 67 (0.05% of 134913) affected shaders: (GFX10.3)
VGPRs: 2024 -> 2136 (+5.53%); split: -0.40%, +5.93%
CodeSize: 162364 -> 162348 (-0.01%); split: -0.08%, +0.07%
MaxWaves: 1882 -> 1816 (-3.51%); split: +0.11%, -3.61%
Instrs: 29176 -> 29162 (-0.05%); split: -0.09%, +0.04%
Latency: 329984 -> 327272 (-0.82%); split: -0.88%, +0.06%
InvThroughput: 54653 -> 54672 (+0.03%); split: -0.01%, +0.04%
VClause: 782 -> 761 (-2.69%); split: -2.81%, +0.13%
SClause: 833 -> 824 (-1.08%); split: -2.28%, +1.20%
Copies: 1872 -> 1873 (+0.05%); split: -0.37%, +0.43%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16039 >
2022-04-29 15:39:10 +00:00
Daniel Schürmann
8d8c59b4cd
aco: split num_waves adjustment into separate function
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16039 >
2022-04-29 15:39:10 +00:00
Daniel Schürmann
6220046ad1
aco: remove 'max_waves' and use 'num_waves' to adjust for LDS and workgroup size
...
Totals from 21 (0.02% of 134913) affected shaders: (GFX10.3)
VGPRs: 1024 -> 1176 (+14.84%)
CodeSize: 127824 -> 127664 (-0.13%); split: -0.17%, +0.04%
MaxWaves: 416 -> 378 (-9.13%)
Instrs: 22521 -> 22502 (-0.08%); split: -0.17%, +0.09%
Latency: 146386 -> 143154 (-2.21%); split: -2.21%, +0.00%
InvThroughput: 28379 -> 28944 (+1.99%); split: -0.23%, +2.22%
VClause: 575 -> 579 (+0.70%); split: -0.87%, +1.57%
SClause: 692 -> 645 (-6.79%)
Copies: 780 -> 747 (-4.23%); split: -4.74%, +0.51%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16039 >
2022-04-29 15:39:10 +00:00
Samuel Pitoiset
6873da0e42
aco: fix load_barycentric_at_{sample,offset} on GFX6-7
...
The computation was wrong.
Fixes dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.*
with Zink on GFX6 (Pitcairn).
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16099 >
2022-04-25 07:17:39 +00:00
Jason Ekstrand
1b8a43a0ba
util: Remove util_cpu_detect
...
util_cpu_detect is an anti-pattern: it relies on callers high up in the call
chain initializing a local implementation detail. As a real example, I added:
...a Mali compiler unit test
...that called bi_imm_f16() to construct an FP16 immediate
...that calls _mesa_float_to_half internally
...that calls util_get_cpu_caps internally, but only on x86_64!
...that relies on util_cpu_detect having been called before.
As a consequence, this unit test:
...crashes on x86_64 with USE_X86_64_ASM set
...passes on every other architecture
...works on my local arm64 workstation and on my test board
...failed CI which runs on x86_64
...needed to have a random util_cpu_detect() call sprinkled in.
This is a bad design decision. It pollutes the tree with magic, it causes
mysterious CI failures especially for non-x86_64 developers, and it is not
justified by a micro-optimization.
Instead, let's call util_cpu_detect directly from util_get_cpu_caps, avoiding
the footgun where it fails to be called. This cleans up Mesa's design,
simplifies the tree, and avoids a class of a (possibly platform-specific)
failures. To mitigate the added overhead, wrap it all in a (fast) atomic
load check and declare the whole thing as ATTRIBUTE_CONST so the
compiler will CSE calls to util_cpu_detect.
Co-authored-by: Alyssa Rosenzweig <alyssa@collabora.com >
Reviewed-by: Marek Olšák <maraeo@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15580 >
2022-04-20 18:44:35 +00:00
Georg Lehmann
d12b5e7633
aco: Reuse previous -1 result in find_msb to avoid using VOP3.
...
Totals:
CodeSize: 388934388 -> 388933712 (-0.00%)
Totals from 208 (0.15% of 134913) affected shaders:
CodeSize: 2008016 -> 2007340 (-0.03%)
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16011 >
2022-04-19 15:18:58 +00:00
Georg Lehmann
a8b29094c2
aco: Remove some old comments in aco_opcodes.py.
...
s_cmovk_i32 isn't GFX8_GFX9 only and s_version doesn't need a comment to say
it's GFX10+ exclusive. The encoding list is enough to provide this information,
as for other GFX10+ instructions.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16006 >
2022-04-18 15:59:38 +00:00
Rhys Perry
63e40adf8c
aco: fix disassembly of SMEM with both SGPR and constant offset
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15890 >
2022-04-14 20:58:36 +00:00
Rhys Perry
c883abda76
aco: implement load_shared2_amd/store_shared2_amd
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778 >
2022-04-13 23:08:07 +00:00
Rhys Perry
5aa5af7776
aco: handle read2st64/write2st64 in optimizer
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13778 >
2022-04-13 23:08:07 +00:00