Rhys Perry
a9c4a31d8d
aco: handle NIR loops without breaks
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11626 >
2021-07-01 10:01:52 +00:00
Rhys Perry
c094765a01
aco: remove resource flags
...
After disabling SMEM stores, nir_opt_access() now does the same analysis
and we don't need this anymore. Doing it in isel is also too late if we
want to lower descriptor loads in NIR.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11652 >
2021-06-30 19:07:12 +01:00
Rhys Perry
ebeda07801
aco/tests: fix 32-bit build
...
"call of overloaded ‘Operand(long unsigned int)’ is ambiguous"
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11627 >
2021-06-29 09:55:32 +00:00
Daniel Schürmann
b14bd285f8
aco/ra: handle copies of copies better
...
Instead of adding a second copy, just redirect
the existing copy.
No fossil-db changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11571 >
2021-06-24 16:53:10 +00:00
Daniel Schürmann
995e218993
aco/ra: handle copies of definition registers
...
Previously, it could happen that a parallelcopy of
a definition was inserted before the instruction.
Fixes Rage 2 with GFX7.
No fossil-db changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11571 >
2021-06-24 16:53:10 +00:00
Timur Kristóf
e6bf5cfe59
aco/gfx10: Emit barrier at the start of NGG VS and TES.
...
The Navi 1x NGG hardware can hang in certain conditions when
not every wave launched before s_sendmsg(GS_ALLOC_REQ).
As a workaround, to ensure this never happens, let's emit a
workgroup barrier at the beginning of NGG VS and TES.
Note that NGG GS already has a workgroup barrier so it doesn't
need this.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10837 >
2021-06-22 14:32:27 +00:00
Timur Kristóf
f9447abb36
aco/gfx10: NGG zero output workaround for conservative rasterization.
...
Navi 1x GPUs have an issue: they can hang when the output vertex
and primitive counts are zero. The workaround is exporting a dummy
triangle.
This commit changes the dummy triangle's vertex so its positions
are all NaN. This should make sure the triangle is never rendered.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10837 >
2021-06-22 14:32:27 +00:00
Jason Ekstrand
f0f713960b
nir,amd: Suffix nir_op_cube_face_coord/index with _amd
...
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11463 >
2021-06-21 09:03:34 -05:00
Timur Kristóf
e5510536e7
aco: Fix checking if load_shared is used by cross lane instructions.
...
This commit fixes two issues with it:
1. Prevent it from going into an infinite loop.
2. Check all uses, not just first use.
Closes : #4916
Fixes: b4e22eb482
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11361 >
2021-06-21 13:42:52 +00:00
Rhys Perry
bc1c527834
aco/lower_phis: don't allocate unused temporary ids
...
The excessive number of temporary IDs caused #4872 's live-out sets to be
extremely large and expensive to iterate.
With this change, #4872 's shader is much faster to compile and uses much
less memory.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4872
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11300 >
2021-06-14 16:48:38 +00:00
Rhys Perry
ecc0353af7
aco/lower_phis: fix undef_operands initialization with >32 predecessors
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11300 >
2021-06-14 16:48:38 +00:00
Rhys Perry
d64f5a3f9d
aco: move VMEM instructions below descriptor loads
...
This is to prevent sequences like:
a = descriptor_load()
vmem(a)
b = descriptor_load()
vmem(b)
and instead create:
a = descriptor_load()
b = descriptor_load()
vmem(a)
vmem(b)
fossil-db (GFX10.3):
Totals from 114521 (78.30% of 146267) affected shaders:
VGPRs: 4540352 -> 4540216 (-0.00%); split: -0.03%, +0.02%
CodeSize: 289864228 -> 289114652 (-0.26%); split: -0.29%, +0.03%
MaxWaves: 2940234 -> 2940338 (+0.00%); split: +0.00%, -0.00%
Instrs: 55112418 -> 54919910 (-0.35%); split: -0.38%, +0.03%
Latency: 956528393 -> 954682011 (-0.19%); split: -0.24%, +0.05%
InvThroughput: 229280830 -> 229238107 (-0.02%); split: -0.04%, +0.02%
VClause: 1141832 -> 1139002 (-0.25%); split: -0.63%, +0.38%
SClause: 2357840 -> 2225008 (-5.63%); split: -6.01%, +0.38%
Copies: 3316040 -> 3331519 (+0.47%); split: -0.31%, +0.77%
Branches: 1187212 -> 1186919 (-0.02%); split: -0.03%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6489 >
2021-06-14 15:47:37 +00:00
Rhys Perry
bc71222cd9
aco: don't move descriptor loads below buffer loads
...
fossil-db (GFX10.3):
Totals from 52870 (36.15% of 146267) affected shaders:
VGPRs: 2109936 -> 2110056 (+0.01%); split: -0.01%, +0.01%
CodeSize: 134898056 -> 134812748 (-0.06%); split: -0.08%, +0.02%
MaxWaves: 1347354 -> 1347346 (-0.00%)
Instrs: 25598063 -> 25575415 (-0.09%); split: -0.11%, +0.02%
Latency: 432491613 -> 432047723 (-0.10%); split: -0.12%, +0.02%
InvThroughput: 90940977 -> 90927545 (-0.01%); split: -0.03%, +0.01%
VClause: 570039 -> 570019 (-0.00%); split: -0.05%, +0.04%
SClause: 1145076 -> 1139040 (-0.53%); split: -0.60%, +0.07%
Copies: 1513949 -> 1513102 (-0.06%); split: -0.32%, +0.26%
Branches: 524279 -> 524275 (-0.00%); split: -0.03%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6489 >
2021-06-14 15:47:37 +00:00
Rhys Perry
f8bf6b9e0a
aco/ra: use adjust_max_used_regs() in compact_relocate_vars()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6489 >
2021-06-14 15:47:37 +00:00
Rhys Perry
1d50ef9ca6
aco: adjust the condition for expanding vertex fetch data format
...
Instead of avoiding out-of-bounds access, avoid creating a load larger
than the original attribute. This should work just as well, since the only
situations expending a load helped was because we shrunk it first.
Also fixes a bug where a 3 component load (4 components with the first
component skipped) would be incorrectly expanded to 4 components because
the stride check would never be performed. Maybe we should avoid skipping
the first component in some situations, but I'm not sure if it's worth
the VGPR cost.
fossil-db (vega10):
Totals from 583 (0.39% of 149974) affected shaders:
CodeSize: 1496848 -> 1500868 (+0.27%); split: -0.03%, +0.30%
Instrs: 286155 -> 286575 (+0.15%); split: -0.07%, +0.22%
Latency: 2947101 -> 2946865 (-0.01%); split: -0.23%, +0.22%
InvThroughput: 797396 -> 797127 (-0.03%); split: -0.08%, +0.04%
fossil-db (polaris10):
Totals from 583 (0.39% of 151365) affected shaders:
SGPRs: 38880 -> 39216 (+0.86%)
VGPRs: 24440 -> 24356 (-0.34%)
CodeSize: 1506808 -> 1510876 (+0.27%); split: -0.01%, +0.28%
Instrs: 288735 -> 289167 (+0.15%); split: -0.06%, +0.21%
Latency: 2963263 -> 2961884 (-0.05%); split: -0.24%, +0.19%
InvThroughput: 802351 -> 801665 (-0.09%); split: -0.12%, +0.04%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9007 >
2021-06-14 09:48:32 +00:00
Rhys Perry
91f8f82806
radv,aco: use all attributes in a binding to obtain an alignment for fetch
...
Instead of assuming scalar alignment for an attribute, we can use the
required alignment of other attributes in a binding to expect a higher
one.
This uses the alignment of all attributes in the pipeline, not just the
ones loaded. This can create slightly better code, but could break
pipelines which relied on unused (and unaligned) attributes no being
loaded. I don't think such pipelines are allowed by the spec.
fossil-db (Sienna Cichlid):
Totals from 44350 (30.32% of 146267) affected shaders:
VGPRs: 1694464 -> 1700616 (+0.36%); split: -0.08%, +0.44%
CodeSize: 60207184 -> 58093836 (-3.51%); split: -3.51%, +0.00%
MaxWaves: 1175998 -> 1174948 (-0.09%); split: +0.02%, -0.11%
Instrs: 11763444 -> 11458952 (-2.59%); split: -2.60%, +0.01%
Latency: 70679612 -> 67062215 (-5.12%); split: -5.27%, +0.15%
InvThroughput: 11482495 -> 11362911 (-1.04%); split: -1.20%, +0.16%
VClause: 359459 -> 343248 (-4.51%); split: -6.36%, +1.85%
SClause: 422404 -> 419229 (-0.75%); split: -1.17%, +0.42%
Copies: 754384 -> 764368 (+1.32%); split: -1.74%, +3.06%
Branches: 197472 -> 197474 (+0.00%); split: -0.03%, +0.03%
PreVGPRs: 1215348 -> 1215503 (+0.01%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9007 >
2021-06-14 09:48:32 +00:00
Daniel Schürmann
bb1c06343d
aco/ra: refactor register assignment for vector operands
...
No functional changes.
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8764 >
2021-06-11 12:35:46 +02:00
Daniel Schürmann
09b99f1b7c
aco/ra: refactor affinity coalescing
...
Also adds v_interp_p2_f32 to the list of
affinity-related instructions.
Totals from 68 (0.05% of 149839) affected shaders (GFX10.3):
CodeSize: 792928 -> 792056 (-0.11%)
Instrs: 152843 -> 152625 (-0.14%)
Latency: 1235353 -> 1235278 (-0.01%)
InvThroughput: 224087 -> 224049 (-0.02%)
Copies: 9218 -> 9000 (-2.36%)
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8764 >
2021-06-11 12:35:31 +02:00
Daniel Schürmann
3a98f484d1
aco/ra: only create phi-affinities for killed operands
...
If a phi-operand is not killed, it must be copied anyway.
The additional affinity would only overwrite any potential
better affinity that was already created
Totals from 1067 (0.71% of 149839) affected shaders (GFX10.3):
VGPRs: 68072 -> 68064 (-0.01%)
CodeSize: 8252588 -> 8245220 (-0.09%); split: -0.12%, +0.03%
Instrs: 1596146 -> 1593941 (-0.14%); split: -0.16%, +0.02%
Latency: 18828176 -> 18823914 (-0.02%); split: -0.08%, +0.06%
InvThroughput: 3575063 -> 3574787 (-0.01%); split: -0.05%, +0.04%
VClause: 24345 -> 24325 (-0.08%); split: -0.16%, +0.07%
Copies: 88712 -> 87398 (-1.48%); split: -1.77%, +0.29%
Branches: 52067 -> 51364 (-1.35%); split: -1.38%, +0.03%
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8764 >
2021-06-11 12:35:12 +02:00
Rhys Perry
9162963f0a
aco: fix emit_mbcnt() with a VGPR mask
...
Found by inspection. Should be possible with nir_intrinsic_mbcnt_amd.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11295 >
2021-06-10 11:21:47 +00:00
Timur Kristóf
18337fbcf2
aco: Use as_vgpr for the second source of mbcnt_amd.
...
Fixes: 1e49018ced
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11292 >
2021-06-10 10:13:02 +00:00
Timur Kristóf
1e49018ced
amd: Add extra source to the mbcnt_amd NIR intrinsic.
...
The v_mbcnt instructions can take an extra source that they add to
the result. This is not exposed in SPIR-V but we now expose it in NIR.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072 >
2021-06-09 16:48:51 +00:00
Timur Kristóf
b4e22eb482
aco: Keep VGPR destinations for uniform shared loads when beneficial.
...
When the result of these loads is only used by cross-lane instructions,
it is beneficial to use a VGPR destination. This is because this allows
to put the s_waitcnt further down, which decreases latency.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072 >
2021-06-09 16:48:51 +00:00
Timur Kristóf
ce141e4c5f
aco: Implement byte and lane permute intrinsics.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072 >
2021-06-09 16:48:51 +00:00
Timur Kristóf
5713e059ea
aco: Add validation for v_permlane instructions.
...
Previously there hasn't been any validation for these instructions,
but after shooting myself in the leg with it a few times, I decided
to add the validation now.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072 >
2021-06-09 16:48:51 +00:00
Timur Kristóf
fd6605367d
aco: Implement nir_op_sad_u8x4.
...
Fix up the operand size for v_sad instructions, and implement
the new NIR horizontal add. There is no viable way to do this
in SALU, so let's always use a VGPR destination.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072 >
2021-06-09 16:48:51 +00:00
Timur Kristóf
228169c87c
aco: Add note about v_alignbyte in the ISA README.
...
We tried to use this instruction for a more optimal sequence,
but it turned out that it doesn't exactly work as it was
supposed to. This note is to help others who want to use it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072 >
2021-06-09 16:48:51 +00:00
Rhys Perry
c129ede523
aco: use ds_read_{u8,u16}_d16
...
This allows partial writes and writes to the upper half of the destination.
fossil-db (Sienna Cichlid):
Totals from 135 (0.09% of 149839) affected shaders:
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113 >
2021-06-09 12:06:50 +00:00
Rhys Perry
6334d73fc9
aco: don't ever widen 8/16-bit sgpr load_shared
...
Doesn't seem to create incorrect code, but it is suboptimal.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113 >
2021-06-09 12:06:50 +00:00
Rhys Perry
4870d7d829
aco: use v1b/v2b for ds_read_u8/ds_read_u16
...
The p_extract_vector isn't necessary.
For ds_read_u8 and ds_read_u16, we used a 32-bit regclass, but did't load
32 bits, and used dst_hint for vector loads when we shouldn't have.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4863
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113 >
2021-06-09 12:06:50 +00:00
Samuel Pitoiset
d169dad393
aco: fix emitting literal offsets with SMEM on GFX7
...
When the offset is negative, reg() isn't 255. Fix this by splitting
SGPR and literal emission. While we are at it, adjust a comment
saying that literals are also accepted on GFX6 which is wrong.
Fixes another batch of robustness tests.
Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11247 >
2021-06-09 11:10:38 +00:00
Samuel Pitoiset
3761d994f6
aco: fix range checking for SSBO loads/stores with SGPR offset on GFX6-7
...
GFX6-7 are affected by a hw bug that prevents address clamping to work
correctly when the SGPR offset is used. Use the VGPR offset to fix it.
Fixes various hangs with dEQP-VK.robustness.robustness2.* on Bonaire.
Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11238 >
2021-06-09 06:40:16 +00:00
Caio Marcelo de Oliveira Filho
8af6766062
nir: Move workgroup_size and workgroup_variable_size into common shader_info
...
Move it out the "cs" sub-struct, since these will be used for other
shader stages in the future.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225 >
2021-06-08 09:23:55 -07:00
Tony Wasserka
3b81f53e34
aco/ra: Split print_regs by lines of 64 registers
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10517 >
2021-06-08 17:03:08 +02:00
Tony Wasserka
69584478c9
aco/ra: Clean up print_regs output and support byte-allocated variables
...
Example output:
00 03 06 09 12 15 18 21 24 27 30 33 36 39 42
sgprs: ·▉█▉███▉▉█··████···········▉████············
00 03 06 09 12 15 18 21 24 27 30 33 36 39 42
vgprs: ▉▉··▉▉▉▉▘▀▉▉▉···▉▘▘▉▉▉▉···▉▉▉▀▀▉············
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10517 >
2021-06-08 17:03:08 +02:00
Tony Wasserka
5bfef2de66
aco/ra: Fix off-by-one-error in print_regs
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Fixes: 3675aefa84 ("aco/ra: Fix build with print_regs enabled")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10517 >
2021-06-08 17:03:08 +02:00
Rhys Perry
c768d7d8f2
aco/tests: add SDWA tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151 >
2021-06-08 08:57:43 +00:00
Rhys Perry
24418304b0
aco/tests: add tests for p_extract/p_insert lowering
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151 >
2021-06-08 08:57:43 +00:00
Rhys Perry
8e0c6e196e
aco: disallow literals with some instruction formats
...
Because isVOPn() is true for many VOP3, SDWA and DPP instructions, this
would often not complain.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151 >
2021-06-08 08:57:43 +00:00
Rhys Perry
cf22eabc68
aco: make validate_ir() output usable in tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151 >
2021-06-08 08:57:43 +00:00
Rhys Perry
54292e99c7
aco: optimize 32-bit extracts and inserts using SDWA
...
Still need to use dst_u=preserve field to optimize packs
fossil-db (Sienna Cichlid):
Totals from 15974 (10.66% of 149839) affected shaders:
VGPRs: 1009064 -> 1008968 (-0.01%); split: -0.03%, +0.02%
SpillSGPRs: 7959 -> 7964 (+0.06%)
CodeSize: 101716436 -> 101159568 (-0.55%); split: -0.55%, +0.01%
MaxWaves: 284464 -> 284490 (+0.01%); split: +0.02%, -0.01%
Instrs: 19334216 -> 19224241 (-0.57%); split: -0.57%, +0.00%
Latency: 375465295 -> 375230478 (-0.06%); split: -0.14%, +0.08%
InvThroughput: 79006105 -> 78860705 (-0.18%); split: -0.25%, +0.07%
fossil-db (Polaris):
Totals from 11369 (7.51% of 151365) affected shaders:
SGPRs: 787920 -> 787680 (-0.03%); split: -0.04%, +0.01%
VGPRs: 681056 -> 681040 (-0.00%); split: -0.01%, +0.00%
CodeSize: 68127288 -> 67664120 (-0.68%); split: -0.69%, +0.01%
MaxWaves: 54370 -> 54371 (+0.00%)
Instrs: 13294638 -> 13214109 (-0.61%); split: -0.62%, +0.01%
Latency: 373515759 -> 373214571 (-0.08%); split: -0.11%, +0.03%
InvThroughput: 166529524 -> 166275291 (-0.15%); split: -0.20%, +0.05%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151 >
2021-06-08 08:57:43 +00:00
Rhys Perry
daa329f664
aco: use byte/word extract pseudo-instructions
...
fossil-db (Sienna Cichild):
Totals from 1890 (1.26% of 149839) affected shaders:
CodeSize: 5104196 -> 5104300 (+0.00%); split: -0.00%, +0.01%
Latency: 11572943 -> 11572880 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 4876941 -> 4876982 (+0.00%); split: -0.00%, +0.00%
SClause: 26774 -> 26775 (+0.00%)
Copies: 125778 -> 125813 (+0.03%)
PreSGPRs: 56452 -> 56451 (-0.00%)
fossil-db (Polaris):
Totals from 1884 (1.24% of 151365) affected shaders:
CodeSize: 3849340 -> 3849312 (-0.00%); split: -0.00%, +0.00%
Instrs: 741391 -> 741382 (-0.00%)
Latency: 13533815 -> 13533439 (-0.00%)
InvThroughput: 12058777 -> 12058500 (-0.00%)
Copies: 120890 -> 120891 (+0.00%)
PreSGPRs: 48940 -> 48939 (-0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151 >
2021-06-08 08:57:43 +00:00
Rhys Perry
1f2518ef9f
aco: implement nir_op_extract/nir_op_insert
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151 >
2021-06-08 08:57:43 +00:00
Rhys Perry
2f94353735
aco: add p_extract/p_insert
...
These will let us make the SDWA optimizer much simpler than if we were to
recognize combinations of shift/and/bfe.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151 >
2021-06-08 08:57:42 +00:00
Rhys Perry
e9d1643288
aco: disallow SDWA for instructions with 64-bit definitions/operands
...
For example, v_cvt_f64_i32. LLVM doesn't seem to allow this either and it
doesn't seem to work correctly.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151 >
2021-06-08 08:57:42 +00:00
Caio Marcelo de Oliveira Filho
c8a7bd0dc8
nir: Rename WORK_GROUP (and similar) to WORKGROUP
...
Be consistent with other usages in Vulkan and SPIR-V, and the recently
added workgroup_size field.
Acked-by: Emma Anholt <emma@anholt.net >
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190 >
2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho
430d2206da
compiler: Rename local_size to workgroup_size
...
Acked-by: Emma Anholt <emma@anholt.net >
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190 >
2021-06-07 22:34:42 +00:00
Tony Wasserka
3c390e2eb6
aco/scheduler: Move cursor handling state to dedicated interfaces
...
This clarifies the semantics of the index variables compared to the previous
version, which used the same variables in a slightly different way depending
on whether they were used for downwards moves or upwards ones.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10885 >
2021-06-07 12:09:39 +02:00
Tony Wasserka
81761a311e
aco/scheduler: Clean up register demand tracking
...
Refactoring total_demand and total_demand_clause to cover non-overlapping
instruction intervals makes the code easier to follow and allows the register
demand to be updated more efficiently in some cases.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10885 >
2021-06-07 12:09:39 +02:00
Daniel Schürmann
d4662e38c4
aco: simplify Phi RegClass selection
...
Also adds moves validation rules to aco_validate.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11181 >
2021-06-04 16:47:01 +00:00