Rhys Perry
92ace0bb31
aco: replace extract_vector with copies
...
Helps a small number of small shaders with situations like this:
a = p_create_vector ...
b = p_extract_vector a, 3
and copy propagation can't be done
Totals from affected shaders:
SGPRS: 14304 -> 14416 (0.78 %)
VGPRS: 8716 -> 6592 (-24.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 184664 -> 176888 (-4.21 %) bytes
Max Waves: 6260 -> 6260 (0.00 %)
Instructions: 35561 -> 33617 (-5.47 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
e686e4765e
aco: add min(-max(), ) and max(-min(), ) optimization
...
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
fa8357eb70
aco: improve clamp optimization
...
Not sure why it checked the use count, it doesn't apply the constants.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 269409 -> 269745 (0.12 %)
VGPRS: 238120 -> 238132 (0.01 %)
Spilled SGPRs: 305 -> 305 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 22908584 -> 22904672 (-0.02 %) bytes
Max Waves: 20217 -> 20217 (0.00 %)
Instructions: 4275312 -> 4263869 (-0.27 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 155409 -> 155233 (-0.11 %)
VGPRS: 153072 -> 153072 (0.00 %)
Spilled SGPRs: 269 -> 269 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 14650824 -> 14650396 (-0.00 %) bytes
Max Waves: 9609 -> 9609 (0.00 %)
Instructions: 2762802 -> 2755517 (-0.26 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
edc888ccb1
aco: fix clamp optimization
...
We can't do the optimization if there are neg/abs in-between.
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
f664cb01ec
aco: improve creation of v_madmk_f32/v_madak_f32
...
Using needs_vop3 check was flawed because it would only combine the
literal if the first operand is the literal. If the second or third
operand is the literal, then needs_vop3 will be true and the literal will
not be combined.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 782051 -> 782051 (0.00 %)
VGPRS: 630048 -> 630048 (0.00 %)
Spilled SGPRs: 195 -> 195 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 54743740 -> 54585548 (-0.29 %) bytes
Max Waves: 67340 -> 67340 (0.00 %)
Instructions: 10182030 -> 10182030 (0.00 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 701990 -> 699590 (-0.34 %)
VGPRS: 566632 -> 566784 (0.03 %)
Spilled SGPRs: 218 -> 218 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 49173564 -> 49007856 (-0.34 %) bytes
Max Waves: 59650 -> 59612 (-0.06 %)
Instructions: 9315135 -> 9293330 (-0.23 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
15e25da3e5
aco: take advantage of GFX10's constant bus limit and VOP3 literals
...
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 2397159 -> 2392494 (-0.19 %)
VGPRS: 1756036 -> 1753920 (-0.12 %)
Spilled SGPRs: 461 -> 470 (1.95 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 110287304 -> 109946304 (-0.31 %) bytes
Max Waves: 318341 -> 318475 (0.04 %)
Instructions: 21019327 -> 20533618 (-2.31 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 0 -> 0 (0.00 %) bytes
Max Waves: 0 -> 0 (0.00 %)
Instructions: 0 -> 0 (0.00 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
9c2d37308f
aco: allow an extra SGPR with multiple uses to be applied to VOP3
...
This is in a separate patch from the apply_sgprs() rewrite so that the
rewrite can be more easily tested.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 3056 -> 3056 (0.00 %)
VGPRS: 1632 -> 1632 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156468 -> 156304 (-0.10 %) bytes
Max Waves: 288 -> 288 (0.00 %)
Instructions: 29510 -> 29469 (-0.14 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 2984 -> 2984 (0.00 %)
VGPRS: 1616 -> 1616 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156132 -> 155968 (-0.11 %) bytes
Max Waves: 289 -> 289 (0.00 %)
Instructions: 29426 -> 29385 (-0.14 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
f4c2c90e1a
aco: allow applying two sgprs to an instruction
...
We could create VALU instructions which read two sgprs, but only if isel
created an instruction which already read one of them.
This change is in a separate patch from the apply_sgprs() rewrite so that
it can be tested if the rewrite affected anything.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1756 -> 1708 (-2.73 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 312 -> 300 (-3.85 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1784 -> 1736 (-2.69 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 319 -> 307 (-3.76 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
7da07ca3e4
aco: follow through temporary when merging tests into constant comparisons
...
This can happen with v_mov_b32(s_mov_b32(literal))
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77488 -> 76928 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14426 -> 14332 (-0.65 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77512 -> 76952 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14432 -> 14338 (-0.65 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
dc6c35e1c3
aco: be more careful with literals in combine_salu_{n2,lshl_add}
...
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
fcf52eb42d
aco: add check_vop3_operands()
...
This will be useful when taking advantage of GFX10 features.
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
cef7879719
aco: rewrite apply_sgprs()
...
This will make it easier to apply two different sgprs (for GFX10) or apply
the same sgpr twice (just remove the break).
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
0be7409069
aco: rewrite literal combining
...
Should make taking advantage of GFX10's increased constant bus limit and
VOP3 literals easier.
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
84b9f3786b
aco: improve can_use_VOP3()
...
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
3cb98ed939
aco: combine two sgprs into a VALU if they're the same
...
This was supposed to be done before but it wasn't done correctly and
everywhere.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 784680 -> 786128 (0.18 %)
VGPRS: 574012 -> 573892 (-0.02 %)
Spilled SGPRs: 461 -> 461 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 45477088 -> 45478172 (0.00 %) bytes
Max Waves: 81294 -> 81277 (-0.02 %)
Instructions: 8657970 -> 8622483 (-0.41 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 780664 -> 782072 (0.18 %)
VGPRS: 573880 -> 573760 (-0.02 %)
Spilled SGPRs: 629 -> 629 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 45445244 -> 45448340 (0.01 %) bytes
Max Waves: 81178 -> 81161 (-0.02 %)
Instructions: 8649902 -> 8614918 (-0.40 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
c240c1aecf
aco: apply literals to split mads
...
Removing the return is also needed to apply literals to mads (which can be
done on GFX10).
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 368787 -> 367555 (-0.33 %)
VGPRS: 312436 -> 312448 (0.00 %)
Spilled SGPRs: 461 -> 461 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 26113388 -> 26098260 (-0.06 %) bytes
Max Waves: 35982 -> 35982 (0.00 %)
Instructions: 5038670 -> 5028941 (-0.19 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 369843 -> 368659 (-0.32 %)
VGPRS: 317224 -> 317196 (-0.01 %)
Spilled SGPRs: 629 -> 629 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 26310540 -> 26295156 (-0.06 %) bytes
Max Waves: 36324 -> 36326 (0.01 %)
Instructions: 5073957 -> 5064164 (-0.19 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
809c8feb92
aco: check if multiplication/clamp is live when applying output modifier
...
It's possible that a multiplication/clamp is dead code and the single use
is from a different user.
Fixes portal rendering in Path of Exile when global illumination is
enabled.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Rhys Perry
ef8abfa790
aco: disable add combining for ds_swizzle_b32
...
ds_bpermute_b32/ds_permute_b32 are fine, I think
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Rhys Perry
69bed1c918
aco: don't DCE atomics with return values
...
We don't create atomics with definitions if they are not used in NIR, but
our own DCE can remove the uses if an export turns out to be null.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Rhys Perry
21eafe30df
aco: better handle neg/abs of sgprs
...
isel/label_instruction currently doesn't create these but we should
probably check anyway.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Rhys Perry
f29a5a205c
aco: check usesModifiers() when identifying a neg/abs
...
This was fine because a literal used to mean that it didn't use modifiers,
but now VOP3 can take a literal on GFX10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Rhys Perry
46fb341b8d
aco: handle omod successors with the constant in the first operand
...
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Rhys Perry
7ce244b7d1
aco: handle VOP3 modifiers when combining a constant comparison's NaN test
...
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Daniel Schürmann
8b7a42d6d0
aco: compact aco::span<T> to use uint16_t offset and size instead of pointer and size_t.
...
This reduces the size of the Instruction base class
from 40 bytes to 16 bytes. No pipelinedb changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332 >
2020-01-10 17:49:18 +00:00
Daniel Schürmann
ffb4790279
aco: compact various Instruction classes
...
No pipelinedb changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332 >
2020-01-10 17:49:18 +00:00
Rhys Perry
6ff92f3d68
aco/wave32: fix comparison optimizations
...
Previously, they weren't done in wave32.
Totals from affected shaders:
SGPRS: 507726 -> 508006 (0.06 %)
VGPRS: 450340 -> 450268 (-0.02 %)
Spilled SGPRs: 298 -> 298 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 39689708 -> 39384488 (-0.77 %) bytes
Max Waves: 39631 -> 39636 (0.01 %)
Instructions: 7865919 -> 7793650 (-0.92 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2019-12-21 12:38:42 +01:00
Daniel Schürmann
8259c97b2d
aco: propagate temporaries into expanded vectors
...
Gives a very slight decrease in code size:
Totals from affected shaders:
Code Size: 1708488 -> 1702768 (-0.33 %) bytes
Max Waves: 2858 -> 2855 (-0.10 %)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
da7ff58835
aco: make 1/2*PI a literal constant on SI/CI
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
6a586a6006
aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
caea4bbfdc
aco: fix SMEM offsets for SI/CI
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Timur Kristóf
c0dbf42a03
aco/wave32: Change uniform bool optimization to work with wave32.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2019-12-04 10:36:01 +00:00
Rhys Perry
df645fa369
aco: implement VK_KHR_shader_float_controls
...
This actually supports more of the extension than the LLVM backend but we
can't enable it because ACO doesn't work with all stages yet.
With more of it enabled, some CTS tests fail because our 64-bit sqrt
is very imprecise. I can't find any precision requirements for it
anywhere, so I'm thinking it might be a CTS issue.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2019-11-15 17:36:21 +00:00
Rhys Perry
b062b92ab1
aco: don't combine literals into v_cndmask_b32/v_subb/v_addc
...
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2019-11-15 17:36:21 +00:00
Timur Kristóf
9b8dc6929e
aco: Optimize out trivial code from uniform bools.
...
This should remove most of the excess code size that was
introduced by making all booleans per-lane.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2019-11-14 17:27:11 +01:00
Timur Kristóf
94e355148f
aco: Make sure not to mistakenly propagate 64-bit constants.
...
ACO's optimizer would try to propagate 64-bit constants, but
does so in such a way that wouldn't work due to how the 64-bit
constants are handled in the IR.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2019-11-14 17:27:10 +01:00
Rhys Perry
2c98d79d11
aco: don't propagate vgprs into v_readlane/v_writelane
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
2019-11-12 17:21:38 +00:00
Rhys Perry
78e3ea9a0f
aco: add Instruction::usesModifiers() and add more checks in the optimizer
...
No pipeline-db changes.
v2: use early-exit for VOP3
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev > (v1)
2019-11-08 00:14:06 +00:00
Rhys Perry
c96289a70e
aco: keep can_reorder/barrier when combining addition into SMEM
...
Affects 30 shaders in the pipeline-db (all youngblood).
Totals from affected shaders:
SGPRS: 2656 -> 2456 (-7.53 %)
VGPRS: 2260 -> 2260 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 240680 -> 240944 (0.11 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 90 -> 90 (0.00 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2019-10-22 18:52:29 +00:00
Rhys Perry
bdf47a1273
aco: properly combine additions into ds_write2_b64/ds_read2_b64
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2019-10-22 18:52:29 +00:00
Rhys Perry
a400928f4a
aco: fix 64-bit p_extract_vector on 32-bit p_create_vector
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2019-10-22 18:52:29 +00:00
Daniel Schürmann
4b458b3e8f
aco: don't combine minmax3 if there is a neg or abs modifier in between
...
This fixes a graphical corruption in HotS.
No pipelinedb changes other than that.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-10-17 16:21:19 +00:00
Rhys Perry
45d6c69b99
aco: use can_accept_constant in valu_can_accept_literal
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2019-10-11 14:26:58 +00:00
Rhys Perry
b37857bcea
aco: don't apply sgprs/constants to read/write lane instructions
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2019-10-11 14:26:58 +00:00
Daniel Schürmann
93c8ebfa78
aco: Initial commit of independent AMD compiler
...
ACO (short for AMD Compiler) is a new compiler backend with the goal to replace
LLVM for Radeon hardware for the RADV driver.
ACO currently supports only VS, PS and CS on VI and Vega.
There are some optimizations missing because of unmerged NIR changes
which may decrease performance.
Full commit history can be found at
https://github.com/daniel-schuermann/mesa/commits/backend
Co-authored-by: Daniel Schürmann <daniel@schuermann.dev >
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com >
Co-authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Co-authored-by: Connor Abbott <cwabbott0@gmail.com >
Co-authored-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com >
Co-authored-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-09-19 12:10:00 +02:00