Daniel Schürmann
09850e0a94
aco: during RA only insert into renames table if a variable got renamed
...
This improves the speed of register allocation.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann
48a74b6815
aco: replace assignment hashmap by std::vector in register allocation
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann
ba482c2e5f
aco: improve register assignment when live-range splits are necessary
...
When finding a good place for a register, we can ignore
killed operands.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>
2020-04-09 15:08:57 +00:00
Daniel Schürmann
0680b258f4
aco: align subdword registers during RA when necessary
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002>
2020-04-03 23:13:15 +01:00
Daniel Schürmann
031edbc4a5
aco: adapt register allocation for subdword registers
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002>
2020-04-03 23:13:15 +01:00
Daniel Schürmann
2c74fc98b8
aco: create helper function to collect variables from register area
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002>
2020-04-03 23:13:15 +01:00
Daniel Schürmann
aca2bbf975
aco: add notion of subdword registers to register allocator
...
To avoid having to split the register file into single bytes,
we maintain a map of registers which contain subdword variables.
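
The idea described in this message could be sketched roughly as follows. This is an illustrative sketch only, not the actual ACO data structure: `SketchRegisterFile`, `kSubdwordMarker`, `fill_dword`, `fill_byte` and `owner` are hypothetical names, and the use of a sentinel value in the dword array is an assumption. The file stays dword-granular, and only registers that actually hold subdword variables get a per-byte entry in a side map.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical sketch, not the real ACO RegisterFile.
// Assumed sentinel meaning "look in the subdword map for this register".
constexpr uint32_t kSubdwordMarker = 0xFFFFFFFFu;

struct SketchRegisterFile {
   // One entry per 32-bit register: id of the variable stored there, 0 = free.
   std::vector<uint32_t> dwords;
   // Per-byte variable ids, present only for registers holding subdword variables.
   std::map<uint32_t, std::array<uint32_t, 4>> subdword;

   explicit SketchRegisterFile(std::size_t num_regs) : dwords(num_regs, 0) {}

   // Assign a whole register to one variable.
   void fill_dword(uint32_t reg, uint32_t id) { dwords[reg] = id; }

   // Assign a single byte of a register to a variable.
   void fill_byte(uint32_t reg, uint32_t byte, uint32_t id)
   {
      assert(byte < 4);
      auto it = subdword.find(reg);
      if (it == subdword.end())
         it = subdword.emplace(reg, std::array<uint32_t, 4>{}).first;
      it->second[byte] = id;
      dwords[reg] = kSubdwordMarker;
   }

   // Which variable owns the given byte of the given register (0 = free).
   uint32_t owner(uint32_t reg, uint32_t byte) const
   {
      assert(byte < 4);
      if (dwords[reg] != kSubdwordMarker)
         return dwords[reg];
      return subdword.at(reg)[byte];
   }
};
```

With this layout, registers that only ever hold full dwords cost nothing extra, and the map stays small as long as subdword variables are rare.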
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002>
2020-04-03 23:13:15 +01:00
Daniel Schürmann
90811554da
aco: remove unnecessary reg_file.fill() operation in get_reg_create_vector()
...
No pipelinedb changes
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002>
2020-04-03 23:13:15 +01:00
Daniel Schürmann
7de003473c
aco: fix Temp and assignment of renamed operands during RA
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002>
2020-04-03 23:13:15 +01:00
Rhys Perry
34424b81df
aco: make PhysReg in units of bytes
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002>
2020-04-03 23:13:15 +01:00
Rhys Perry
1872759f55
aco: add a late kill flag
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>
2020-03-16 16:09:02 +00:00
Albert Astals Cid
760fe44e8c
aco: pass vars by const &
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3935>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3935>
2020-03-02 13:18:49 +00:00
Rhys Perry
fe5c5507bd
aco: add some helpers for filling/testing register ranges
...
We do this a lot
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3768>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3768>
2020-02-19 12:23:50 +00:00
Rhys Perry
43497e30e2
aco: add RegisterFile
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3768>
2020-02-19 12:23:50 +00:00
Daniel Schürmann
3b323d6601
aco: fix image_atomic_cmp_swap
...
Fixes: 71440ba0f5 ('aco: reorder VMEM operands in ACO IR')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3652>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3652>
2020-01-31 16:51:46 +00:00
Daniel Schürmann
99d032f3cd
aco: fix register allocation with multiple live-range splits
...
This patch fixes register allocation if multiple live-range splits
occur to the same variable within one instruction.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Daniel Schürmann
71440ba0f5
aco: reorder VMEM operands in ACO IR
...
For all VMEM instructions, the resource constant is now
in operands[0]. For MIMG instructions, the sampler shares
operands[1] with write data in case this instruction writes memory.
Moving the VADDR to be the last operand for MIMG is the first step to
support Navi NSA encoding.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
2020-01-29 18:45:23 +00:00
Rhys Perry
15a1cc00d3
aco: fix off-by-one error when initializing sgpr_live_in
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2394
Fixes: 93c8ebfa78 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3511>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3511>
2020-01-22 17:23:30 +00:00
Daniel Schürmann
427e5eeb02
aco: handle phi affinities transitively through parallelcopies
...
This can coalesce most unnecessarily inserted parallelcopies
from lowering to CSSA.
v2: refactor loop a bit to make it more efficient and readable.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>
2020-01-16 16:01:59 +01:00
Rhys Perry
f9405ceb8a
aco: don't move literal to reg when making an instruction VOP3 on GFX10
...
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 163398 -> 163398 (0.00 %)
VGPRS: 143820 -> 143820 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 13065744 -> 13044308 (-0.16 %) bytes
Max Waves: 18921 -> 18921 (0.00 %)
Instructions: 2514644 -> 2509285 (-0.21 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
2020-01-14 12:56:28 +00:00
Daniel Schürmann
ffb4790279
aco: compact various Instruction classes
...
No pipelinedb changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332>
2020-01-10 17:49:18 +00:00
Daniel Schürmann
6a586a6006
aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Timur Kristóf
e0bcefc3a0
aco/wave32: Use lane mask regclass for exec/vcc.
...
Currently all usages of exec and vcc are hardcoded to use s2 regclass.
This commit makes it possible to use s1 in wave32 mode and
s2 in wave64 mode.
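
As a rough illustration of the selection described above (a sketch with hypothetical names, not the code from this commit): `SketchRegClass` and `lane_mask_class` are made up here, and only the s1/s2 choice itself comes from the message.

```cpp
#include <cassert>

// Hypothetical stand-in for the scalar register classes mentioned above.
enum class SketchRegClass { s1, s2 };

// Pick the register class for lane masks such as exec and vcc:
// one SGPR (s1) covers 32 lanes, two SGPRs (s2) cover 64 lanes.
inline SketchRegClass lane_mask_class(unsigned wave_size)
{
   assert(wave_size == 32 || wave_size == 64);
   return wave_size == 32 ? SketchRegClass::s1 : SketchRegClass::s2;
}
```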
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-04 10:36:01 +00:00
Daniel Schürmann
8861a82be7
aco: don't split live-ranges of linear VGPRs
...
Fixes: 93c8ebfa78 'aco: Initial commit of independent AMD compiler'
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-29 21:54:27 +01:00
Daniel Schürmann
b6f5085dfe
aco: preserve kill flag on moved operands during RA
...
Fixes: 93c8ebfa78 aco: Initial commit of independent AMD compiler
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-12 15:59:48 +00:00
Daniel Schürmann
a2a6880743
aco: fix invalid access on Pseudo_instructions
...
Fixes: 93c8ebfa78 aco: Initial commit of independent AMD compiler
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-12 15:59:48 +00:00
Daniel Schürmann
8023dcd71e
aco: fix live-range splits of phis
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Timur Kristóf
c52ebbcea4
aco: Introduce vgpr_limit to keep track of available VGPRs.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Rhys Perry
08d510010b
aco: increase accuracy of SGPR limits
...
SGPRs are allocated in groups of 16 on GFX8/GFX9. GFX10 allocates a fixed
number of SGPRs and has 106 addressable SGPRs.
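
A minimal sketch of the group-of-16 rounding mentioned above for GFX8/GFX9. The helper name `round_up_sgprs` is hypothetical and not taken from the ACO sources. The fixed GFX10 allocation is not modelled, since its size is not stated here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round a requested SGPR count up to the next
// multiple of the allocation granule (16 on GFX8/GFX9 per the text above).
inline uint32_t round_up_sgprs(uint32_t needed, uint32_t granule)
{
   assert(granule != 0);
   return (needed + granule - 1) / granule * granule;
}
```

For example, a shader that needs 17 SGPRs would occupy 32 under this model, which is why accurate limits matter for the wave counts reported below.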
pipeline-db (Vega):
SGPRS: 5912 -> 6232 (5.41 %)
VGPRS: 1772 -> 1780 (0.45 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 88228 -> 87904 (-0.37 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 559 -> 571 (2.15 %)
pipeline-db (Navi):
SGPRS: 341256 -> 363384 (6.48 %)
VGPRS: 171536 -> 170960 (-0.34 %)
Spilled SGPRs: 832 -> 581 (-30.17 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 14207332 -> 14190872 (-0.12 %) bytes
LDS: 33 -> 33 (0.00 %) blocks
Max Waves: 18072 -> 18251 (0.99 %)
v2: unconditionally count vcc as an extra sgpr on GFX10+
v3: pass SGPRs rounded to 8
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-23 19:11:21 +01:00
Mauro Rossi
c24ad565ae
android: aco: fix undefined template 'std::__1::array' build errors
...
Fixes a few building errors similar to the following:
In file included from external/mesa/src/amd/compiler/aco_instruction_selection.cpp:26:
In file included from external/libcxx/include/algorithm:639:
external/libcxx/include/utility:321:9:
error: implicit instantiation of undefined template 'std::__1::array<aco::Temp, 4>'
_T2 second;
^
Fixes: 93c8ebf ("aco: Initial commit of independent AMD compiler")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2019-09-28 15:56:23 +02:00
Daniel Schürmann
93c8ebfa78
aco: Initial commit of independent AMD compiler
...
ACO (short for AMD Compiler) is a new compiler backend with the goal to replace
LLVM for Radeon hardware for the RADV driver.
ACO currently supports only VS, PS and CS on VI and Vega.
There are some optimizations missing because of unmerged NIR changes
which may decrease performance.
Full commit history can be found at
https://github.com/daniel-schuermann/mesa/commits/backend
Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>
Co-authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Co-authored-by: Connor Abbott <cwabbott0@gmail.com>
Co-authored-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com>
Co-authored-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00