Rhys Perry
5bc100eb2d
aco: use a single instruction for uadd32_sat() on GFX8
...
fossil-db (GFX8):
Totals from 8 (0.01% of 147787) affected shaders:
SGPRs: 352 -> 368 (+4.55%)
CodeSize: 49576 -> 48788 (-1.59%)
Instrs: 9487 -> 9318 (-1.78%)
Latency: 49935 -> 49607 (-0.66%)
InvThroughput: 138493 -> 137443 (-0.76%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9598>
2021-03-17 15:33:34 +00:00
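For reference, the operation that uadd32_sat() computes is an unsigned 32-bit add that clamps to the maximum value instead of wrapping. The sketch below only models that semantics in plain C++; the name uadd32_sat_ref is made up for illustration and is not the ACO helper, and nothing here says which hardware instruction GFX8 uses.

```cpp
#include <cstdint>
#include <limits>

// Reference semantics only (hypothetical helper, not ACO code):
// unsigned 32-bit add that saturates at UINT32_MAX instead of wrapping.
constexpr uint32_t uadd32_sat_ref(uint32_t a, uint32_t b)
{
   const uint32_t sum = a + b; // wraps modulo 2^32 on overflow
   // If the wrapped sum is smaller than an operand, the add overflowed.
   return sum < a ? std::numeric_limits<uint32_t>::max() : sum;
}
```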
Rhys Perry
3decb52c82
aco: use uadd32_sat() helper for nir_op_uadd_sat
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9598>
2021-03-17 15:33:31 +00:00
Rhys Perry
590de30093
aco: implement 64-bit VGPR {u,i}find_msb
...
This can be created by subgroupBallotFindMSB().
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4458
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9598>
2021-03-17 15:33:22 +00:00
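One common way to build a 64-bit find-MSB out of 32-bit pieces is to test the high half first and fall back to the low half. The sketch below shows only the unsigned case and assumes the usual NIR convention that ufind_msb returns -1 for a zero input; the function names are hypothetical and this is not the instruction sequence ACO emits.

```cpp
#include <cstdint>

// Hypothetical reference: index of the highest set bit, or -1 if v == 0.
inline int ufind_msb32_ref(uint32_t v)
{
   int idx = -1;
   while (v) {
      v >>= 1;
      ++idx;
   }
   return idx;
}

// 64-bit variant composed from two 32-bit halves.
inline int ufind_msb64_ref(uint64_t v)
{
   const uint32_t lo = static_cast<uint32_t>(v);
   const uint32_t hi = static_cast<uint32_t>(v >> 32);
   // A set bit in the high half always wins; its index is offset by 32.
   return hi ? ufind_msb32_ref(hi) + 32 : ufind_msb32_ref(lo);
}
```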
Timur Kristóf
ed7c6e46e7
aco: Delete superfluous tess and ESGS I/O code.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
2021-03-17 12:42:23 +00:00
Timur Kristóf
540168fd15
radv: Use new, NIR-based I/O lowering.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
2021-03-17 12:42:23 +00:00
Timur Kristóf
b3a16c0e19
radv: Fill some tess shader info earlier.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
2021-03-17 12:42:23 +00:00
Timur Kristóf
582229585b
aco: Implement new Geometry Shader intrinsics.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
2021-03-17 12:42:23 +00:00
Timur Kristóf
5c95b32c6e
aco: Implement the new tessellation I/O related NIR intrinsics.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
2021-03-17 12:42:23 +00:00
Timur Kristóf
e10e74a7af
aco: Implement new buffer load/store intrinsics.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>
2021-03-17 12:42:23 +00:00
Rhys Perry
0af7ff49fd
aco: lower p_constaddr into separate instructions earlier
...
This allows them to be scheduled properly and simplifies the assembler a
little.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>
2021-03-11 16:31:19 +00:00
Rhys Perry
7d5643c0fe
aco: track divergent and uniform branch depth
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>
2021-03-11 15:35:30 +00:00
Rhys Perry
8f71be0a7b
aco: simplify loop_nest_depth tracking in isel
...
Keep track of the current loop depth in Program and set the depth inside
Program::insert_block() instead of repeating it every time we insert one.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>
2021-03-11 15:35:24 +00:00
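The idea of stamping the depth in one place can be sketched as follows. Everything in this block is a stand-in: the Block and Program structs, the next_loop_depth field and the return type are hypothetical simplifications, not the real ACO definitions.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, heavily reduced stand-ins for the ACO types.
struct Block {
   unsigned index = 0;
   uint16_t loop_nest_depth = 0;
};

struct Program {
   std::vector<Block> blocks;
   uint16_t next_loop_depth = 0; // assumed name for the tracked depth

   // Every inserted block picks up the current depth here, so callers
   // no longer have to set it themselves at each insertion site.
   unsigned insert_block(Block block)
   {
      block.index = static_cast<unsigned>(blocks.size());
      block.loop_nest_depth = next_loop_depth;
      blocks.push_back(block);
      return block.index;
   }
};
```

In this sketch, instruction selection would only bump next_loop_depth when entering a loop and decrement it when leaving.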
Rhys Perry
341dd9d834
aco: set compr for fp16 exports
...
Obviously this didn't affect correctness. Not sure about performance.
It also changes enabled_channels to match radeonsi.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: f29c81f863 ("aco: use VOP2 for v_cvt_pkrtz_f16_f32 if possible")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9459>
2021-03-11 13:54:18 +00:00
Rhys Perry
3a72044ece
aco: add missing usable_read2 check
...
A Hitman 2 shader does: read64(local_invocation_index() * 4 - 4). This was
likely emitting a ds_read2_b32 on GFX6. For local_invocation_index()=0,
because the first dword was out-of-bounds, the second was likely also
considered out-of-bounds (even though it's not, at offset 0).
Likely fixes https://gitlab.freedesktop.org/mesa/mesa/-/issues/3882
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 57e6886f98 ("aco: refactor load_lds to use new helpers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9332>
2021-03-02 13:13:59 +00:00
Rhys Perry
941739619e
Revert "radv,aco: allow unaligned LDS access on GFX9+"
...
This reverts commit 1a0b0e8460.
The bounds checking behaviour of ds_read_b64, ds_read_b96 and ds_read_b128
makes this feature very difficult to use safely.
This fixes a blocking artifact in Hitman 2. Previously, it contained:
ds_read_b64(local_invocation_index() * 4 - 4)
For local_invocation_index()=0, the second dword would be considered
out-of-bounds, even though it's at offset 0.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9332>
2021-03-02 13:13:59 +00:00
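The Hitman 2 case can be reproduced with plain unsigned arithmetic. The sketch below is a simplified model based only on the description above: the LDS size is an arbitrary illustrative value, and the rule "a wide access fails as a whole if its first dword is out of bounds" is an assumption standing in for the real hardware behaviour.

```cpp
#include <cstdint>

// Hypothetical LDS size in bytes, for illustration only.
constexpr uint32_t kLdsSize = 1024;

// local_invocation_index() * 4 - 4, computed in 32-bit unsigned arithmetic.
constexpr uint32_t read_address(uint32_t local_invocation_index)
{
   return local_invocation_index * 4u - 4u; // wraps to 0xfffffffc for index 0
}

// What an independent 32-bit read at this address would see.
constexpr bool dword_in_bounds(uint32_t addr)
{
   return addr < kLdsSize;
}

// Assumed model of one wide (64-bit) access: the second dword is only
// readable if the first dword is in bounds as well.
constexpr bool second_dword_readable_wide(uint32_t addr)
{
   return dword_in_bounds(addr) && dword_in_bounds(addr + 4u);
}
```

For index 0 the second dword sits at offset 0 and is valid on its own, yet the wide access in this model rejects it, which matches the artifact described in the commit message.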
Rhys Perry
c3af0c2079
aco: use p_as_uniform for get_sampler_desc and convert_pointer_to_64_bit
...
Since value-numbering no longer works across loops, we no longer need to
use v_readfirstlane_b32.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9288>
2021-02-26 13:33:56 +00:00
Rhys Perry
5f1b354472
aco: calculate all p_as_uniform and v_readfirstlane_b32 sources in WQM
...
We should avoid a situation where a v_readfirstlane_b32 is in WQM but its
source is calculated in Exact.
Fixes hang when running Assassin's Creed: Valhalla benchmark.
fossil-db (GFX10.3):
Totals from 1021 (0.70% of 146267) affected shaders:
CodeSize: 7835228 -> 7842992 (+0.10%); split: -0.00%, +0.10%
Instrs: 1519208 -> 1521149 (+0.13%); split: -0.00%, +0.13%
SClause: 78921 -> 78920 (-0.00%)
Copies: 44456 -> 45421 (+2.17%); split: -0.05%, +2.22%
Branches: 12987 -> 13933 (+7.28%)
PreSGPRs: 47599 -> 47813 (+0.45%)
Cycles: 10037540 -> 10045304 (+0.08%); split: -0.00%, +0.08%
VMEM: 538381 -> 538777 (+0.07%); split: +0.11%, -0.03%
SMEM: 84553 -> 84554 (+0.00%); split: +0.01%, -0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9288>
2021-02-26 13:33:56 +00:00
Daniel Schürmann
fbf791e70c
aco: value number VOPC instructions with different exec masks
...
This becomes possible as long as we do
val = s_and_b32/64 exec, val
before any subgroup operations.
This precautionary instruction can be removed by the
optimizer if 'val' was computed by a VOPC instruction
using the same exec mask.
Totals from 59 (0.04% of 146267) affected shaders (Navi10):
VGPRs: 2808 -> 2816 (+0.28%)
CodeSize: 340888 -> 340852 (-0.01%); split: -0.20%, +0.19%
Instrs: 61733 -> 61625 (-0.17%); split: -0.18%, +0.01%
Cycles: 470636 -> 469112 (-0.32%); split: -0.33%, +0.01%
VMEM: 8091 -> 7993 (-1.21%)
SMEM: 2736 -> 2719 (-0.62%); split: +0.29%, -0.91%
VClause: 1745 -> 1741 (-0.23%)
SClause: 2394 -> 2392 (-0.08%); split: -0.25%, +0.17%
Copies: 3249 -> 3253 (+0.12%); split: -0.62%, +0.74%
Branches: 1210 -> 1206 (-0.33%)
PreSGPRs: 3126 -> 3176 (+1.60%); split: -0.16%, +1.76%
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9195>
2021-02-25 11:35:42 +01:00
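Why the extra AND is needed can be seen with lane masks modelled as plain integers. In this sketch bit i stands for lane i, and apply_exec is a made-up name for the s_and_b32/64 step; it is an illustration of the idea, not ACO code.

```cpp
#include <cstdint>

// Hypothetical model of "val = s_and_b32/64 exec, val".
// A comparison result reused from a point where more lanes were active
// may still have bits set for lanes that are inactive now; masking with
// the current exec clears them before a subgroup operation reads val.
constexpr uint64_t apply_exec(uint64_t exec, uint64_t val)
{
   return exec & val;
}

// Example values: the compare ran with lanes 0-7 active and was true
// for lanes 4-7; later only lanes 0-5 are active.
constexpr uint64_t kCompareResult = 0xf0;
constexpr uint64_t kLaterExec = 0x3f;
static_assert(apply_exec(kLaterExec, kCompareResult) == 0x30,
              "lanes 6 and 7 are dropped");
// With an unchanged exec mask the AND is a no-op, which is the case
// the optimizer can remove.
static_assert(apply_exec(0xff, kCompareResult) == kCompareResult,
              "same exec mask leaves val untouched");
```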
Daniel Schürmann
29b866fef6
aco: remove special handling of load_helper_invocation
...
These should now behave the same as is_helper_invocation.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9058>
2021-02-17 21:53:52 +00:00
Rhys Perry
1a0b0e8460
radv,aco: allow unaligned LDS access on GFX9+
...
fossil-db (GFX10.3):
Totals from 223 (0.16% of 139391) affected shaders:
SGPRs: 10032 -> 10096 (+0.64%)
VGPRs: 7480 -> 7592 (+1.50%)
CodeSize: 853960 -> 821920 (-3.75%); split: -3.76%, +0.01%
MaxWaves: 5916 -> 5908 (-0.14%)
Instrs: 154935 -> 150281 (-3.00%); split: -3.01%, +0.01%
Cycles: 3202496 -> 3080680 (-3.80%); split: -3.81%, +0.00%
VMEM: 48187 -> 46671 (-3.15%); split: +0.29%, -3.44%
SMEM: 13869 -> 13850 (-0.14%); split: +1.52%, -1.66%
VClause: 3110 -> 3085 (-0.80%); split: -1.03%, +0.23%
SClause: 4376 -> 4381 (+0.11%)
Copies: 12132 -> 12065 (-0.55%); split: -2.61%, +2.06%
Branches: 5204 -> 5203 (-0.02%)
PreVGPRs: 6304 -> 6359 (+0.87%); split: -0.10%, +0.97%
See https://reviews.llvm.org/D82788
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8762>
2021-02-17 12:57:12 +00:00
Rhys Perry
3d4c13f3b8
aco: add DeviceInfo
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>
2021-02-15 13:44:22 +00:00
Rhys Perry
7ff805a19d
radv,aco: add radv_nir_compiler_options::wgp_mode
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>
2021-02-15 13:35:36 +00:00
Daniel Schürmann
8b793f9567
aco: remove dead code for the handling of exec temporaries
...
Totals from 26026 (18.67% of 139391) affected shaders (Navi10):
PreSGPRs: 370993 -> 326177 (-12.08%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8870>
2021-02-12 22:41:31 +00:00
Daniel Schürmann
947bf0bd67
aco: don't decrease the vgpr_limit when encountering bpermute
...
Instead we recalculate vgpr_limit on demand, depending on
the number of needed shared VGPRs.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>
2021-02-12 19:00:18 +00:00
Daniel Schürmann
bacc3b36f5
aco: fix shared VGPR allocation on RDNA2
...
VGPRs are now allocated in blocks of 8 normal
or 16 shared VGPRs, respectively.
Fixes: 14a5021aff ('aco/gfx10: Refactor of GFX10 wave64 bpermute.')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>
2021-02-12 19:00:18 +00:00
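Block-wise allocation amounts to rounding the requested register count up to the block size. The sketch below uses the two granularities named in the commit message (8 and 16); the helper names are hypothetical and the real ACO calculation may take further factors into account.

```cpp
// Round value up to the next multiple of granule (granule > 0).
constexpr unsigned align_up(unsigned value, unsigned granule)
{
   return (value + granule - 1) / granule * granule;
}

// Hypothetical helpers using the block sizes from the commit message.
constexpr unsigned alloc_normal_vgprs(unsigned needed)
{
   return align_up(needed, 8);
}

constexpr unsigned alloc_shared_vgprs(unsigned needed)
{
   return align_up(needed, 16);
}

static_assert(alloc_normal_vgprs(9) == 16, "9 normal VGPRs occupy two blocks of 8");
static_assert(alloc_shared_vgprs(1) == 16, "any shared VGPR use costs a block of 16");
```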
Rhys Perry
f7575fa71f
aco: fix adjust_vertex_fetch_alpha
...
These offsets were wrong and didn't match the old behaviour.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: e8220e106b ("aco: optimize AC_FETCH_FORMAT_SNORM alpha adjust")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8935>
2021-02-10 09:37:17 +00:00
Rhys Perry
e8220e106b
aco: optimize AC_FETCH_FORMAT_SNORM alpha adjust
...
This looks like it was copied from LLVM, which didn't have a fmax
intrinsic.
fossil-db (GFX8):
Totals from 43 (0.03% of 140385) affected shaders:
CodeSize: 49660 -> 49488 (-0.35%)
Instrs: 10434 -> 10348 (-0.82%)
Cycles: 41736 -> 41392 (-0.82%)
VMEM: 13793 -> 13719 (-0.54%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8918>
2021-02-09 12:58:22 +00:00
Rhys Perry
b2dbe2b87b
aco: implement non-uniform get_ssbo_size
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3711
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7969>
2021-01-27 13:00:33 +00:00
Daniel Schürmann
b06609e903
aco: fix nir_intrinsic_ballot with wave32
...
Found by inspection.
Fixes: 21db083504 ('aco/wave32: Allow setting the subgroup ballot size to 64-bit.')
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8703>
2021-01-26 21:06:48 +00:00
Daniel Schürmann
9a49760e82
aco: fix VCC hint on boolean subgroup operations
...
Found by inspection.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8712>
2021-01-26 14:59:30 +00:00
Samuel Pitoiset
bb8f87088c
radv,aco: fix shifting input VGPRs for the LS VGPR init bug on GFX9
...
We were incorrectly shifting the input VGPRs for the instance ID
for chips affected by the LS VGPR init bug (i.e. Vega10 and Raven).
When there are no HS threads, the hardware loads the LS VGPRs
starting from VGPR 0, so they should be shifted by two.
This fixes some sort of vertex explosion with Squad, Visage, Barn
Finders and probably more titles that use tessellation. Note that
only Vega10 and Raven were affected by this bug.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4129
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3311
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Diego Viola <diego.viola@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8694>
2021-01-25 17:10:44 +00:00
Rhys Perry
f03c20ffae
aco: fix WQM for texture instructions with args before the coordinates
...
Previously, we might not have required all coordinates to be in WQM if
there were other args before them. We should probably also require that
the offset is in WQM.
fossil-db (GFX10.3):
Totals from 10053 (7.21% of 139391) affected shaders:
SGPRs: 911032 -> 911048 (+0.00%); split: -0.00%, +0.00%
VGPRs: 689856 -> 688412 (-0.21%); split: -0.26%, +0.05%
CodeSize: 84151460 -> 84140396 (-0.01%); split: -0.02%, +0.01%
MaxWaves: 77526 -> 77527 (+0.00%)
Instrs: 15972106 -> 15971521 (-0.00%); split: -0.01%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4153
Fixes: 4015b3651a ("aco: only require texture coordinates to be in WQM if NSA is used")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8693>
2021-01-25 16:31:39 +00:00
Rhys Perry
e115b01948
aco: return references in instruction cast methods
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:33 +00:00
Rhys Perry
70dbcfa1c9
aco: use instruction cast methods
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Rhys Perry
441ead5fb3
aco: remove Format::{VOP3A,VOP3B}
...
These are really the same as Format::VOP3.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Rhys Perry
dc19fe0e9f
radv,aco: use deref_buffer_array_length
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3993
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8163>
2021-01-21 11:53:12 +00:00
Rhys Perry
914c61d6c0
radv,aco: don't use MUBUF for multi-channel loads on GFX8 with robustness2
...
Fixes several dEQP-VK.robustness.robustness2.* tests on GFX8. Generations
other than GFX8 don't fail the tests because bounds-checking is done using
the index (making it per-vertex).
fossil-db (Polaris):
Totals from 1387 (0.99% of 140385) affected shaders:
(no statistics affected)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 03a0d39366 ("aco: use MUBUF in some situations instead of splitting vertex fetches")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7834>
2021-01-20 17:57:56 +00:00
Rhys Perry
4015b3651a
aco: only require texture coordinates to be in WQM if NSA is used
...
From comment in emit_mimg():
We don't need the bias, sample index, compare value or offset to be
computed in WQM but if the p_create_vector copies the coordinates, then it
needs to be in WQM.
fossil-db (GFX10.3):
Totals from 1778 (1.28% of 139391) affected shaders:
SGPRs: 105080 -> 105072 (-0.01%); split: -0.02%, +0.01%
VGPRs: 96800 -> 96776 (-0.02%); split: -0.07%, +0.05%
CodeSize: 10001120 -> 10001384 (+0.00%); split: -0.04%, +0.04%
MaxWaves: 18164 -> 18163 (-0.01%)
Instrs: 1883750 -> 1883598 (-0.01%); split: -0.06%, +0.05%
Cycles: 34800176 -> 34767840 (-0.09%); split: -0.10%, +0.01%
We don't have a p_create_vector if we use NSA.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
Rhys Perry
c353895c92
aco: use non-sequential addressing
...
fossil-db (GFX10.3):
Totals from 70493 (50.57% of 139391) affected shaders:
SGPRs: 4232624 -> 4231808 (-0.02%); split: -0.09%, +0.07%
VGPRs: 2831772 -> 2764740 (-2.37%); split: -2.53%, +0.17%
CodeSize: 225584412 -> 225048740 (-0.24%); split: -0.44%, +0.21%
MaxWaves: 875319 -> 878837 (+0.40%); split: +0.44%, -0.04%
Instrs: 43157803 -> 42496421 (-1.53%); split: -1.54%, +0.01%
Cycles: 1656380132 -> 1641532056 (-0.90%); split: -0.94%, +0.04%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
Rhys Perry
faf3e9a27f
aco: move VADDR to the end of the operand list
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
Rhys Perry
cd29210fce
aco: add emit_mimg() helper
...
Some fossil-db noise from slightly different order of instructions.
fossil-db (GFX10.3):
Totals from 73 (0.05% of 139391) affected shaders:
SGPRs: 3424 -> 3440 (+0.47%)
CodeSize: 199076 -> 199064 (-0.01%); split: -0.01%, +0.00%
Instrs: 37303 -> 37300 (-0.01%); split: -0.01%, +0.00%
Cycles: 786328 -> 786316 (-0.00%); split: -0.00%, +0.00%
VMEM: 19448 -> 19454 (+0.03%); split: +0.04%, -0.01%
SMEM: 5241 -> 5305 (+1.22%); split: +1.70%, -0.48%
SClause: 1282 -> 1281 (-0.08%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
Rhys Perry
9890dabb1b
aco: have emit_wqm() take Builder instead of isel_context
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8523>
2021-01-20 16:46:54 +00:00
Daniel Schürmann
454bbf8f23
aco: emit packed 16bit instructions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann
5ad52ac906
aco: create helpers to emit vop3p instructions
...
Also make get_alu_src() capable of returning
unswizzled multi-component SGPR sources.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680>
2021-01-13 17:46:56 +00:00
Daniel Schürmann
d495a5c183
radv: enable .lower_ineg
...
We already emit ineg as isub most of the time.
The results are a bit mixed, but shouldn't really make a difference.
A couple of additional copies are needed as isub writes scc.
Totals from 5975 (4.29% of 139391) affected shaders:
CodeSize: 31508648 -> 31509264 (+0.00%); split: -0.00%, +0.00%
Instrs: 6073379 -> 6073531 (+0.00%); split: -0.00%, +0.00%
Cycles: 47186280 -> 47187116 (+0.00%); split: -0.00%, +0.00%
VMEM: 2528515 -> 2529139 (+0.02%); split: +0.03%, -0.01%
SMEM: 596842 -> 596924 (+0.01%); split: +0.02%, -0.00%
SClause: 280596 -> 280594 (-0.00%)
Copies: 288554 -> 288669 (+0.04%); split: -0.00%, +0.04%
PreSGPRs: 240390 -> 240397 (+0.00%)
PreVGPRs: 349630 -> 349749 (+0.03%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8425>
2021-01-12 16:14:00 +00:00
Rhys Perry
04e3d7ad93
aco: improve nir_op_vec with constant operands
...
Could still be improved a little. For example, 8-bit pack without
constants could be:
(s_pack_ll(x, z) & 0x00ff00ff) | ((s_pack_ll(y, w) & 0x00ff00ff) << 8)
fossil-db (Sienna):
Totals from 136 (0.10% of 139391) affected shaders:
CodeSize: 279776 -> 278144 (-0.58%)
Instrs: 50742 -> 50470 (-0.54%)
Cycles: 211560 -> 210472 (-0.51%)
SMEM: 3607 -> 3557 (-1.39%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8421>
2021-01-12 15:50:54 +00:00
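The suggested 8-bit pack can be checked against a straightforward byte pack. The sketch below models s_pack_ll as "low 16 bits of the first source in the low half, low 16 bits of the second source in the high half", which is how s_pack_ll_b32_b16 is understood here; the function names are illustrative and this is not code from the commit.

```cpp
#include <cstdint>

// Assumed model of s_pack_ll_b32_b16: { s1[15:0], s0[15:0] }.
constexpr uint32_t s_pack_ll(uint32_t s0, uint32_t s1)
{
   return (s0 & 0xffffu) | ((s1 & 0xffffu) << 16);
}

// The formula from the commit message: x and z land in bytes 0 and 2,
// y and w are packed the same way and then shifted into bytes 1 and 3.
constexpr uint32_t pack_4x8(uint32_t x, uint32_t y, uint32_t z, uint32_t w)
{
   return (s_pack_ll(x, z) & 0x00ff00ffu) |
          ((s_pack_ll(y, w) & 0x00ff00ffu) << 8);
}

// Straightforward reference: one byte per component, x in the lowest byte.
constexpr uint32_t pack_4x8_ref(uint32_t x, uint32_t y, uint32_t z, uint32_t w)
{
   return (x & 0xffu) | ((y & 0xffu) << 8) | ((z & 0xffu) << 16) |
          ((w & 0xffu) << 24);
}

static_assert(pack_4x8(0x11, 0x22, 0x33, 0x44) == 0x44332211u,
              "bytes end up in x, y, z, w order");
```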
Rhys Perry
a502aa7b04
aco: form sparse load clauses
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry
0bd14be962
aco: implement sparse image loads
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry
382f50ad2c
aco: implement sparse texture fetches
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00
Rhys Perry
5a4f6313b1
aco: implement nir_op_vec5
...
Since sparse fetch/load uses vec5 destinations, it may be possible that we
encounter nir_op_vec5.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7775>
2021-01-08 14:27:07 +00:00