Daniel Schürmann
28d36d26c2
aco: fix p_extract_vector optimization in presence of unequally sized vector operands
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4506 >
2020-04-13 16:35:40 +00:00
Daniel Schürmann
a39df3bfce
aco: don't constant-propagate into subdword PSEUDO instructions
...
PSEUDO instructions are lowered using SDWA, and thus,
cannot take literals and before GFX9 cannot take constants
at all. As the in-register representation differs between
32bit and 16bit floats, we first need to ensure correct
behavior.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4492 >
2020-04-10 07:19:27 +00:00
Rhys Perry
20a4b1461b
aco: zero-initialize Temp
...
Fixes dEQP-VK.transform_feedback.* crashes from accesses garbage
temporaries in emit_extract_vector().
Fixes: 85521061 ("aco: prepare helper functions for subdword handling")
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4463 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4463 >
2020-04-06 19:15:19 +00:00
Daniel Schürmann
0bb3537676
aco: don't assume split_vector(create_vector) has the same number of elements when optimizing
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002 >
2020-04-03 23:13:15 +01:00
Daniel Schürmann
c436743b0c
aco: don't propagate SGPRs into subdword PSEUDO instructions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002 >
2020-04-03 23:13:15 +01:00
Samuel Pitoiset
c953292630
aco: always optimize v_mad to v_madak in presence of literals
...
v_mad and v_madak are both 64-bit instructions, so it doesn't
increase code size to always apply a 32-bit literal instead of
using v_mad and a sgpr which contains that literal.
Found with some Youngblood shaders but help some other games.
vkpipeline-db (VEGA10):
Totals from affected shaders:
SGPRS: 46168 -> 46016 (-0.33 %)
VGPRS: 45576 -> 45564 (-0.03 %)
Code Size: 5187208 -> 5179584 (-0.15 %) bytes
Max Waves: 3297 -> 3297 (0.00 %)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4410 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4410 >
2020-04-03 07:30:49 +00:00
Timur Kristóf
655c050119
aco: Fix combining DS additions in the optimizer.
...
Previously, it was calculated incorrectly for 64-bit writes and reads.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
2020-03-11 08:34:10 +00:00
Rhys Perry
2d1ba86382
aco: handle v_add_co_u32_e64 in parse_base_offset()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3902 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3902 >
2020-03-03 18:31:06 +00:00
Rhys Perry
483d4ec57c
aco: improve SCC handling in some SALU combines
...
Add some checks and remove some unnecessary checks.
Found by observation. No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3599 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3599 >
2020-02-12 19:18:45 +00:00
Rhys Perry
d45e9451cf
aco: disable some instruction combining if it could change an exec operand
...
Found by observation. No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3599 >
2020-02-12 19:18:40 +00:00
Samuel Pitoiset
ddd767387f
aco: fix creating v_madak if v_mad_f32 has two sgpr literals
...
Do not ignore that src1 can be a sgpr.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2435
Cc: <mesa-stable@lists.freedesktop.org >
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3759 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3759 >
2020-02-11 07:17:31 +00:00
Timur Kristóf
4d34abd15c
aco/optimizer: Don't combine uniform bool s_and to s_andn2.
...
Fixes: 8a32f57fff
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3714 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3714 >
2020-02-05 22:53:45 +00:00
Daniel Schürmann
71440ba0f5
aco: reorder VMEM operands in ACO IR
...
For all VMEM instructions, the resource constant is now
in operands[0]. For MIMG instructions, the sampler shares
operands[1] with write data in case this instruction writes memory.
Moving the VADDR to be the last operand for MIMG is the first step to
support Navi NSA encoding.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602 >
2020-01-29 18:45:23 +00:00
Daniel Schürmann
396be00640
aco: fix combine_salu_not_bitwise() when SCC is used
...
Previously, we didn't use the SCC bit, and thus, we didn't care about it.
With 'aco: Transform uniform bitwise instructions to 32-bit if possible.'
that changed, so that we have to handle it.
Fixes: 8a32f57fff ('aco: Transform uniform bitwise instructions to 32-bit if possible.')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598 >
2020-01-28 18:14:02 +01:00
Rhys Perry
2dc63d39d3
aco: fix literal application with v_cndmask_b32/v_addc_co_u32/etc
...
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Fixes: 0be7409069 ('aco: rewrite literal combining')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541 >
2020-01-27 14:50:37 +00:00
Rhys Perry
827681f921
aco: always add sgprs to sgpr_ids when choosing literals
...
Even if it's a literal, we should add this to sgpr_ids.
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Fixes: 0be7409069 ('aco: rewrite literal combining')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541 >
2020-01-27 14:50:37 +00:00
Timur Kristóf
8a32f57fff
aco: Transform uniform bitwise instructions to 32-bit if possible.
...
This allows removing superfluous s_cselect instructions
that come from turning booleans into 64-bit vector condition.
v2 by Daniel Schürmann:
- Make the code massively simpler
v3 by Timur Kristóf:
- Fix regressions, make it work in wave32 mode
- Eliminate extra moves by not always using the SCC definition
- Use s_absdiff_i32 for uniform XOR
- Skip the transformation for uncommon or invalid instructions
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450 >
2020-01-24 14:40:45 +00:00
Timur Kristóf
23edcf6490
aco: Make a better guess at which instructions need the VCC hint.
...
Previously, bool_to_vector_condition would always set the VCC hint
on its result. This commit improves it by having the optimizer set
the VCC hint only when the result really needs to be in the VCC.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451 >
2020-01-24 13:14:23 +00:00
Samuel Pitoiset
b8abfafe86
aco: fix constant folding of SMRD instructions on GFX6
...
SMRD instructions have an 8-bit dword offset on SI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432 >
2020-01-20 16:24:55 +00:00
Rhys Perry
e151398de6
aco: fix stack buffer overflow in apply_sgprs()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Fixes: cef7879719 ('aco: rewrite apply_sgprs()')
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2361
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442 >
2020-01-20 11:13:11 +00:00
Samuel Pitoiset
a445cb35bd
aco: do not combine additions of DS instructions on GFX6
...
The offset field doesn't work as expected on GFX6.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412 >
2020-01-16 14:06:06 +00:00
Timur Kristóf
dfaa3c0af6
aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible.
...
When possible, get rid of an s_not when all it does is invert the SCC,
and its successor s_cbranch / s_cselect can be inverted instead.
Also modify some parts of instruction_selection to take advantage of
this feature.
Example:
s2: %3900, s1: %3899:scc = s_andn2_b64 %0:exec, %406
s2: %3902 = s_cselect_b64 -1, 0, %3900:scc
s2: %407, s1: %3903:scc = s_not_b64 %3902
s2: %3906, s1: %3905:scc = s_and_b64 %407, %0:exec
p_cbranch_z %3905:scc
Can now be optimized to:
s2: %3900, s1: %3899:scc = s_andn2_b64 %0:exec, %406
p_cbranch_nz %3900:scc
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2020-01-14 21:21:06 +01:00
Timur Kristóf
c0f82165a7
aco: Optimize out s_and with exec, when used on uniform bitwise values.
...
Previously all booleans needed an s_and with exec when they were turned
into a scalar condition. However, this is not needed for uniform booleans.
v2 by Daniel Schürmann:
- Make the code more readable
v3 by Timur Kristóf:
- Fix regressions, make it work in wave32 mode
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2020-01-14 21:21:06 +01:00
Timur Kristóf
1c44129db3
aco: Don't skip combine_instruction when definitions[1] is used.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2020-01-14 21:21:06 +01:00
Timur Kristóf
d962bbd895
aco: Implement 64-bit constant propagation.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
2020-01-14 21:21:06 +01:00
Rhys Perry
f978e0e516
aco: add integer min/max to can_swap_operands
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
92ace0bb31
aco: replace extract_vector with copies
...
Helps a small number of small shaders with situations like this:
a = p_create_vector ...
b = p_extract_vector a, 3
and copy propagation can't be done
Totals from affected shaders:
SGPRS: 14304 -> 14416 (0.78 %)
VGPRS: 8716 -> 6592 (-24.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 184664 -> 176888 (-4.21 %) bytes
Max Waves: 6260 -> 6260 (0.00 %)
Instructions: 35561 -> 33617 (-5.47 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
e686e4765e
aco: add min(-max(), ) and max(-min(), ) optimization
...
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
fa8357eb70
aco: improve clamp optimization
...
Not sure why it checked the use count, it doesn't apply the constants.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 269409 -> 269745 (0.12 %)
VGPRS: 238120 -> 238132 (0.01 %)
Spilled SGPRs: 305 -> 305 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 22908584 -> 22904672 (-0.02 %) bytes
Max Waves: 20217 -> 20217 (0.00 %)
Instructions: 4275312 -> 4263869 (-0.27 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 155409 -> 155233 (-0.11 %)
VGPRS: 153072 -> 153072 (0.00 %)
Spilled SGPRs: 269 -> 269 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 14650824 -> 14650396 (-0.00 %) bytes
Max Waves: 9609 -> 9609 (0.00 %)
Instructions: 2762802 -> 2755517 (-0.26 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
edc888ccb1
aco: fix clamp optimization
...
We can't do the optimization if there are neg/abs in-between.
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
f664cb01ec
aco: improve creation of v_madmk_f32/v_madak_f32
...
Using needs_vop3 check was flawed because it would only combine the
literal if the first operand is the literal. If the second or third
operand is the literal, then needs_vop3 will be true and the literal will
not be combined.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 782051 -> 782051 (0.00 %)
VGPRS: 630048 -> 630048 (0.00 %)
Spilled SGPRs: 195 -> 195 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 54743740 -> 54585548 (-0.29 %) bytes
Max Waves: 67340 -> 67340 (0.00 %)
Instructions: 10182030 -> 10182030 (0.00 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 701990 -> 699590 (-0.34 %)
VGPRS: 566632 -> 566784 (0.03 %)
Spilled SGPRs: 218 -> 218 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 49173564 -> 49007856 (-0.34 %) bytes
Max Waves: 59650 -> 59612 (-0.06 %)
Instructions: 9315135 -> 9293330 (-0.23 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
15e25da3e5
aco: take advantage of GFX10's constant bus limit and VOP3 literals
...
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 2397159 -> 2392494 (-0.19 %)
VGPRS: 1756036 -> 1753920 (-0.12 %)
Spilled SGPRs: 461 -> 470 (1.95 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 110287304 -> 109946304 (-0.31 %) bytes
Max Waves: 318341 -> 318475 (0.04 %)
Instructions: 21019327 -> 20533618 (-2.31 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 0 -> 0 (0.00 %) bytes
Max Waves: 0 -> 0 (0.00 %)
Instructions: 0 -> 0 (0.00 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
9c2d37308f
aco: allow an extra SGPR with multiple uses to be applied to VOP3
...
This is in a separate patch from the apply_sgprs() rewrite so that the
rewrite can be more easily tested.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 3056 -> 3056 (0.00 %)
VGPRS: 1632 -> 1632 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156468 -> 156304 (-0.10 %) bytes
Max Waves: 288 -> 288 (0.00 %)
Instructions: 29510 -> 29469 (-0.14 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 2984 -> 2984 (0.00 %)
VGPRS: 1616 -> 1616 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156132 -> 155968 (-0.11 %) bytes
Max Waves: 289 -> 289 (0.00 %)
Instructions: 29426 -> 29385 (-0.14 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
f4c2c90e1a
aco: allow applying two sgprs to an instruction
...
We could create VALU instructions which read two sgprs, but only if isel
created an instruction which already read one of them.
This change is in a separate patch from the apply_sgprs() rewrite so that
it can be tested if the rewrite affected anything.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1756 -> 1708 (-2.73 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 312 -> 300 (-3.85 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1784 -> 1736 (-2.69 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 319 -> 307 (-3.76 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
7da07ca3e4
aco: follow through temporary when merging tests into constant comparisons
...
This can happen with v_mov_b32(s_mov_b32(literal))
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77488 -> 76928 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14426 -> 14332 (-0.65 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77512 -> 76952 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14432 -> 14338 (-0.65 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
dc6c35e1c3
aco: be more careful with literals in combine_salu_{n2,lshl_add}
...
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
fcf52eb42d
aco: add check_vop3_operands()
...
This will be useful when taking advantage of GFX10 features.
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
cef7879719
aco: rewrite apply_sgprs()
...
This will make it easier to apply two different sgprs (for GFX10) or apply
the same sgpr twice (just remove the break).
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
0be7409069
aco: rewrite literal combining
...
Should make taking advantage of GFX10's increased constant bus limit and
VOP3 literals easier.
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
84b9f3786b
aco: improve can_use_VOP3()
...
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
3cb98ed939
aco: combine two sgprs into a VALU if they're the same
...
This was supposed to be done before but it wasn't done correctly and
everywhere.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 784680 -> 786128 (0.18 %)
VGPRS: 574012 -> 573892 (-0.02 %)
Spilled SGPRs: 461 -> 461 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 45477088 -> 45478172 (0.00 %) bytes
Max Waves: 81294 -> 81277 (-0.02 %)
Instructions: 8657970 -> 8622483 (-0.41 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 780664 -> 782072 (0.18 %)
VGPRS: 573880 -> 573760 (-0.02 %)
Spilled SGPRs: 629 -> 629 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 45445244 -> 45448340 (0.01 %) bytes
Max Waves: 81178 -> 81161 (-0.02 %)
Instructions: 8649902 -> 8614918 (-0.40 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
c240c1aecf
aco: apply literals to split mads
...
Removing the return is also needed to apply literals to mads (which can be
done on GFX10).
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 368787 -> 367555 (-0.33 %)
VGPRS: 312436 -> 312448 (0.00 %)
Spilled SGPRs: 461 -> 461 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 26113388 -> 26098260 (-0.06 %) bytes
Max Waves: 35982 -> 35982 (0.00 %)
Instructions: 5038670 -> 5028941 (-0.19 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 369843 -> 368659 (-0.32 %)
VGPRS: 317224 -> 317196 (-0.01 %)
Spilled SGPRs: 629 -> 629 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 26310540 -> 26295156 (-0.06 %) bytes
Max Waves: 36324 -> 36326 (0.01 %)
Instructions: 5073957 -> 5064164 (-0.19 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883 >
2020-01-14 12:56:28 +00:00
Rhys Perry
809c8feb92
aco: check if multiplication/clamp is live when applying output modifier
...
It's possible that a multiplication/clamp is dead code and the single use
is from a different user.
Fixes portal rendering in Path of Exile when global illumination is
enabled.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Rhys Perry
ef8abfa790
aco: disable add combining for ds_swizzle_b32
...
ds_bpermute_b32/ds_permute_b32 are fine, I think
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Rhys Perry
69bed1c918
aco: don't DCE atomics with return values
...
We don't create atomics with definitions if they are not used in NIR, but
our own DCE can remove the uses if an export turns out to be null.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Rhys Perry
21eafe30df
aco: better handle neg/abs of sgprs
...
isel/label_instruction currently doesn't create these but we should
probably check anyway.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Rhys Perry
f29a5a205c
aco: check usesModifiers() when identifying a neg/abs
...
This was fine because a literal used to mean that it didn't use modifiers,
but now VOP3 can take a literal on GFX10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Rhys Perry
46fb341b8d
aco: handle omod successors with the constant in the first operand
...
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Rhys Perry
7ce244b7d1
aco: handle VOP3 modifiers when combining a constant comparison's NaN test
...
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081 >
2020-01-13 13:26:43 +00:00
Daniel Schürmann
8b7a42d6d0
aco: compact aco::span<T> to use uint16_t offset and size instead of pointer and size_t.
...
This reduces the size of the Instruction base class
from 40 bytes to 16 bytes. No pipelinedb changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332 >
2020-01-10 17:49:18 +00:00