Rhys Perry
483657de32
aco: use mubuf helper in select_gs_copy_shader
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6103 >
2020-10-28 14:59:49 +00:00
Rhys Perry
ec7ecfe9cb
aco: use control flow creation helpers in select_gs_copy_shader
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6103 >
2020-10-28 14:59:49 +00:00
Daniel Schürmann
543f50789a
aco: implement nir_op_unpack_[64/32]_*
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6527 >
2020-10-28 10:14:26 +00:00
Rhys Perry
26e53e3afa
aco: ignore the ACO-inserted continue in create_continue_phis()
...
Otherwise, for loops without continue_or_break, create_continue_phis()
always returns an undef operand.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Fixes: 638cbc21a1 ("aco: handle when ACO adds new continue edges")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2848
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7148 >
2020-10-27 19:53:38 +00:00
Rhys Perry
437995bb70
aco: remove all-undef phi opt
...
This doesn't look like it would create correct IR for 8/16-bit phis and
doesn't seem to help anything. If we ever want to do this, it's probably
better done in nir_opt_remove_phis().
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216 >
2020-10-27 15:24:38 +00:00
Rhys Perry
d20a752c0d
aco: use Builder::copy more
...
fossil-db (Navi):
Totals from 6973 (5.07% of 137413) affected shaders:
SGPRs: 381768 -> 381776 (+0.00%)
VGPRs: 306092 -> 306096 (+0.00%); split: -0.00%, +0.00%
CodeSize: 24440844 -> 24421196 (-0.08%); split: -0.09%, +0.01%
MaxWaves: 86581 -> 86583 (+0.00%)
Instrs: 4682161 -> 4679578 (-0.06%); split: -0.06%, +0.00%
Cycles: 68793116 -> 68261648 (-0.77%); split: -0.83%, +0.05%
fossil-db (Polaris):
Totals from 8154 (5.87% of 138881) affected shaders:
VGPRs: 338916 -> 338920 (+0.00%); split: -0.00%, +0.00%
CodeSize: 23540428 -> 23540488 (+0.00%); split: -0.00%, +0.00%
MaxWaves: 49090 -> 49091 (+0.00%)
Instrs: 4576085 -> 4576101 (+0.00%); split: -0.00%, +0.00%
Cycles: 51720704 -> 51720888 (+0.00%); split: -0.00%, +0.00%
Most of the Navi cycle/instruction changes are from 8/16-bit parallel-rdp
shaders. They appear to be improved because the p_create_vector from
lower_subdword_phis() was blocking constant propagation.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216 >
2020-10-27 15:24:38 +00:00
Rhys Perry
72b307a338
aco: don't do divergent break+discard
...
If the shader does:
   loop {
      if (divergent)
         discard
      else
         a()
      b()
   }
then a()'s block will dominate b()'s block in the logical CFG, but not in
the linear CFG. This will cause value numbering to try to combine SALU from
a() and b().
This didn't happen with break/continue because sanitize_if() would move
a() out of the branch. Using sanitize_if() to fix this doesn't look easy,
because discards are not control flow instructions in NIR.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216 >
2020-10-27 15:24:38 +00:00
Rhys Perry
27ce5d921e
aco: remove isel_context::allocated
...
Now that we have Program::temp_rc, we can replace it with the first
temporary id allocated for NIR's ssa defs.
No fossil-db changes on Navi.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7067 >
2020-10-26 15:14:32 +00:00
Samuel Pitoiset
4e2fe34aa9
aco: fix determining if LOD is zero for nir_texop_txf/nir_texop_txs
...
txf/txs expect the LOD to be a 32-bit unsigned integer, while other
texture operations expect a float.
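A minimal sketch of why the interpretation matters, assuming the LOD is
available as a raw 32-bit constant. The helper name lod_is_zero and its
parameters are hypothetical and are not taken from ACO:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: decide whether a constant LOD is zero from its raw
// 32-bit pattern. txf/txs carry the LOD as an unsigned integer; other
// texture operations carry it as a float.
bool lod_is_zero(uint32_t raw_bits, bool lod_is_integer)
{
   if (lod_is_integer)
      return raw_bits == 0u;

   float lod;
   std::memcpy(&lod, &raw_bits, sizeof(lod)); // reinterpret the bits as a float
   return lod == 0.0f;                        // true for both +0.0 and -0.0
}
```

The bit pattern 0x80000000 shows the difference: read as a float it is -0.0
(zero), read as an unsigned integer it is 2147483648 (not zero).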
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3668
Fixes: 93c8ebfa78 ("aco: Initial commit of independent AMD compiler")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7256 >
2020-10-22 11:30:43 +00:00
Samuel Pitoiset
eb6877d3af
radv,aco: fix use of texop_samples_identical in the resolve meta path
...
The return value of this texture intrinsic should be a NIR 1-bit bool.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7236 >
2020-10-21 13:06:53 +02:00
Tony Wasserka
fd038132de
aco/isel: Miscellaneous cleanups using the new Stage API
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7094 >
2020-10-21 09:49:38 +00:00
Tony Wasserka
34bc9477de
aco: Clean up symbol names and comments related to NGG
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7094 >
2020-10-21 09:49:38 +00:00
Tony Wasserka
86c227c10c
aco: Use strong typing to model SW<->HW stage mappings
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Acked-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7094 >
2020-10-21 09:49:38 +00:00
Bas Nieuwenhuizen
76421667ec
aco: Add VK_KHR_shader_terminate_invocation support.
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7226 >
2020-10-20 22:53:08 +00:00
Timur Kristóf
d8435c1628
aco/ngg: Add assertion to make sure we always know the vertex count.
...
Just a sanity check to avoid hangs caused by missing this
in the future.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7213 >
2020-10-20 07:11:29 +00:00
James Park
af8d488ea5
util,ac,aco,radv: Cross-platform memstream API
...
POSIX memstream is not available on Windows.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7143 >
2020-10-19 03:37:42 -07:00
Rhys Perry
fdb65b8b23
aco: add missing SCC clobber in get_buffer_size
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Fixes: fcd6d83245 ("aco: fix imageSize()/textureSize() with large buffers on GFX8")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7162 >
2020-10-15 21:11:45 +00:00
Tony Wasserka
d5a72319d6
aco/isel: Remove now unused VS-related code from create_null_export
...
Also replaced a hardcoded constant with the appropriate register macro.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102 >
2020-10-14 16:22:51 +00:00
Tony Wasserka
c22c702f35
aco/isel: Remove some dead code
...
exported_pos was always initialized to true (due to the is_pos argument
of the first export_vs_varying call being true), so none of this code has
any effect.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102 >
2020-10-14 16:22:51 +00:00
Tony Wasserka
bf51b11c04
aco/isel: Always export position data from VS/NGG
...
AMD ISA docs explicitly require this for VS, and this likely extends to
NGG too.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3615
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102 >
2020-10-14 16:22:51 +00:00
Daniel Schürmann
f29c81f863
aco: use VOP2 for v_cvt_pkrtz_f16_f32 if possible
...
This patch also slightly reworks export_fs_mrt_color()
to avoid enabling channels which are not used.
Totals from 52404 (38.38% of 136546) affected shaders (NAVI):
SGPRs: 3097443 -> 3097435 (-0.00%)
CodeSize: 189151600 -> 188546200 (-0.32%)
Instrs: 36445061 -> 36445104 (+0.00%); split: -0.00%, +0.00%
Cycles: 1739388020 -> 1739388192 (+0.00%); split: -0.00%, +0.00%
VMEM: 21071501 -> 21071665 (+0.00%); split: +0.00%, -0.00%
SMEM: 3470983 -> 3470982 (-0.00%); split: +0.00%, -0.00%
PreSGPRs: 2058965 -> 2058962 (-0.00%)
PreVGPRs: 1860294 -> 1860295 (+0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
7240edec2a
aco: use VOP2 version of v_cvt_pkrtz_f16_f32 on GFX_6_7_10
...
Totals from 767 (0.56% of 136546) affected shaders (NAVI):
CodeSize: 2862208 -> 2850036 (-0.43%)
Instrs: 561572 -> 561574 (+0.00%)
Cycles: 6455420 -> 6455428 (+0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
2f125908b3
radv,aco: lower_pack_half_2x16
...
This patch also optimizes pack_half_2x16(a, 0.0).
Totals from 1949 (1.43% of 136546) affected shaders (RAVEN):
SGPRs: 83376 -> 83336 (-0.05%)
CodeSize: 3532144 -> 3512352 (-0.56%)
Instrs: 660746 -> 660682 (-0.01%); split: -0.01%, +0.00%
Cycles: 6780716 -> 6780472 (-0.00%); split: -0.00%, +0.00%
VMEM: 990886 -> 990883 (-0.00%); split: +0.00%, -0.00%
SMEM: 150506 -> 150538 (+0.02%); split: +0.05%, -0.03%
SClause: 30595 -> 30594 (-0.00%); split: -0.01%, +0.00%
Copies: 40801 -> 40729 (-0.18%)
PreSGPRs: 52335 -> 52341 (+0.01%); split: -0.03%, +0.04%
PreVGPRs: 45104 -> 45097 (-0.02%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
dae1e6f756
aco: use v_cvt_pkrtz_f16_f32 for pack_half_2x16
...
Apparently, we forgot to remove some debug code.
This patch also fixes the round mode check to consider
the destination bit width.
Totals from 2218 (1.62% of 136546) affected shaders (RAVEN):
SGPRs: 100848 -> 100280 (-0.56%)
VGPRs: 68536 -> 66044 (-3.64%); split: -3.68%, +0.05%
CodeSize: 4882296 -> 4837220 (-0.92%); split: -0.94%, +0.01%
MaxWaves: 18990 -> 19019 (+0.15%); split: +0.19%, -0.04%
Instrs: 938150 -> 930388 (-0.83%); split: -0.83%, +0.00%
Cycles: 8699824 -> 8667648 (-0.37%); split: -0.38%, +0.01%
VMEM: 1144502 -> 1059680 (-7.41%); split: +0.06%, -7.48%
SMEM: 170076 -> 167999 (-1.22%); split: +0.22%, -1.44%
VClause: 18428 -> 18422 (-0.03%)
SClause: 41375 -> 41353 (-0.05%); split: -0.06%, +0.00%
Copies: 60008 -> 60054 (+0.08%); split: -0.31%, +0.39%
PreVGPRs: 56163 -> 56142 (-0.04%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
aec872cda0
aco: use p_split_vector for nir_op_unpack_half_*
...
This enables the use of SDWA if possible
Totals from 9933 (7.27% of 136546) affected shaders (RAVEN):
VGPRs: 731764 -> 731772 (+0.00%); split: -0.00%, +0.00%
CodeSize: 90944852 -> 90671472 (-0.30%); split: -0.30%, +0.00%
Instrs: 17881885 -> 17867831 (-0.08%); split: -0.08%, +0.00%
Cycles: 1597904072 -> 1597771260 (-0.01%); split: -0.01%, +0.00%
VMEM: 1702328 -> 1697383 (-0.29%); split: +0.13%, -0.42%
SMEM: 659583 -> 659049 (-0.08%); split: +0.01%, -0.09%
VClause: 318024 -> 318025 (+0.00%); split: -0.00%, +0.00%
SClause: 631670 -> 631707 (+0.01%); split: -0.01%, +0.01%
Copies: 1504107 -> 1504626 (+0.03%); split: -0.01%, +0.04%
PreVGPRs: 683153 -> 683180 (+0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
a38a497b86
aco: use p_create_vector for nir_op_pack_half_2x16
...
This enables the use of SDWA if possible
Totals from 2218 (1.62% of 136546) affected shaders (RAVEN):
VGPRs: 68508 -> 68516 (+0.01%)
CodeSize: 4897024 -> 4881068 (-0.33%); split: -0.33%, +0.00%
MaxWaves: 18992 -> 18990 (-0.01%)
Instrs: 946942 -> 939161 (-0.82%); split: -0.82%, +0.00%
Cycles: 8737668 -> 8705704 (-0.37%); split: -0.37%, +0.00%
VMEM: 1155362 -> 1145245 (-0.88%); split: +0.00%, -0.88%
SMEM: 170435 -> 170165 (-0.16%); split: +0.01%, -0.16%
VClause: 18426 -> 18425 (-0.01%)
SClause: 41376 -> 41375 (-0.00%)
Copies: 59813 -> 59787 (-0.04%); split: -0.15%, +0.10%
PreVGPRs: 56126 -> 56136 (+0.02%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Rhys Perry
c122315702
aco: fix get_ssbo_size with a vgpr resource
...
The result of load_vulkan_descriptor is passed directly to get_ssbo_size.
This caused convert_pointer_to_64_bit() to skip creating a
v_readfirstlane_b32 even when one was necessary.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Fixes: 05b6612b4e ('radv: do not lower UBO/SSBO access to offsets')
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3628
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7095 >
2020-10-13 14:20:28 +00:00
Rhys Perry
bb5c0ba0d2
aco: implement last_invocation
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558 >
2020-10-13 12:47:21 +00:00
Rhys Perry
36da9c4aa2
aco: implement elect
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558 >
2020-10-13 12:47:20 +00:00
Rhys Perry
bf77f539ee
aco: optimize more uniform reductions/scans
...
Uniform atomic optimization will create these.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558 >
2020-10-13 12:47:20 +00:00
Samuel Pitoiset
b9ca4923d6
aco: implement missing nir_op_unpack_half_2x16_split_{x,y}_flush_to_zero
...
SPIRV->NIR emits nir_op_unpack_half_2x16_flush_to_zero instead of
nir_op_unpack_half_2x16 if the shader enables denorm flush to zero
for 16-bit floating point.
This doesn't fix anything known and CTS doesn't have tests.
Fixes: 56d9bcdded ("radv: enable more float_controls features")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6939 >
2020-10-13 08:35:22 +02:00
Samuel Pitoiset
b0829c6af7
radv: replace RADV_ALPHA_ADJUST by AC_FETCH_FORMAT
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7065 >
2020-10-12 13:13:40 +00:00
Timur Kristóf
61280bb4b6
aco/ngg: Allocate NGG GS space early for const vertex/primitive counts.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:15 +02:00
Timur Kristóf
e8a0409d01
aco/ngg: Use more efficient LDS layout to help reduce bank conflicts.
...
The LLVM backend has a trick which helps reduce LDS bank conflicts
by swizzling the LDS address where each vertex is emitted.
This commit implements the same thing for ACO.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:15 +02:00
Timur Kristóf
dd73719856
aco/ngg: Add shader query support to NGG GS.
...
In each GS thread, we calculate the number of "real" primitives that
were emitted (points, lines, triangles, not strips). Then we
accumulate the number of "real" primitives emitted by the
entire threadgroup in GDS.
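A rough sketch of the counting described above, assuming the usual strip
arithmetic (a strip of N vertices holds N-1 lines or N-2 triangles). The
names OutputPrim, prims_in_strip and threadgroup_total are hypothetical, and
the serial sum only stands in for the accumulation in GDS:

```cpp
#include <cstdint>
#include <vector>

enum class OutputPrim { Points, LineStrip, TriangleStrip };

// Number of "real" primitives contained in one strip of `vertices` vertices.
uint32_t prims_in_strip(OutputPrim prim, uint32_t vertices)
{
   uint32_t verts_per_prim = prim == OutputPrim::Points      ? 1u
                             : prim == OutputPrim::LineStrip ? 2u
                                                             : 3u;
   // An incomplete strip produces no primitive at all.
   return vertices >= verts_per_prim ? vertices - (verts_per_prim - 1u) : 0u;
}

// Serial stand-in for accumulating the per-thread counts of a threadgroup.
uint32_t threadgroup_total(const std::vector<uint32_t>& per_thread_prims)
{
   uint32_t total = 0;
   for (uint32_t n : per_thread_prims)
      total += n;
   return total;
}
```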
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:15 +02:00
Timur Kristóf
df62c8fbea
aco/ngg: Place workgroup barrier outside control flow for NGG GS.
...
Merged shaders have a workgroup barrier which makes sure that
the first half is completed in every wave before the 2nd half
is started.
This barrier is located in divergent control flow, so that waves
that don't have any invocations in the 2nd half can finish as early
as possible. This is problematic for NGG GS because it has more
workgroup barriers after the 2nd half.
So, for NGG GS we need to put the barrier outside
control flow because otherwise the waves that have 0 GS threads
won't be able to wait for the waves which have non-zero GS threads.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:15 +02:00
Timur Kristóf
1129575d5e
aco/ngg: Implement NGG GS output.
...
We store emitted GS vertices in LDS.
Then, at the end of the shader, the emitted vertices are compacted
and each thread loads a single vertex from LDS in order to export
a primitive as needed, and the vertex attributes.
This is done because there is an impedance mismatch between how API GS
and the NGG HW work. API GS can emit an arbitrary number of vertices
and primitives in each thread, but NGG HW can only export one vertex
per thread.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:15 +02:00
Timur Kristóf
62b5012ec3
aco/ngg: Implement workgroup reduce / exclusive scan for NGG GS.
...
This function calculates two things at once:
1. The total number of vertices emitted by the threadgroup.
2. Exclusive scan of emitted vertex count across the threadgroup.
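As a serial reference for the two results listed above (not the cross-wave
implementation in the shader), a sketch with the hypothetical name
reduce_and_exclusive_scan could look like this:

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns {exclusive scan, total} for the per-thread emitted vertex counts.
// scan[i] is the sum of counts[0..i-1], so it does not include thread i.
std::pair<std::vector<uint32_t>, uint32_t>
reduce_and_exclusive_scan(const std::vector<uint32_t>& counts)
{
   std::vector<uint32_t> scan(counts.size());
   uint32_t running = 0;
   for (std::size_t i = 0; i < counts.size(); ++i) {
      scan[i] = running;    // vertices emitted by all earlier threads
      running += counts[i]; // ends up as the threadgroup total
   }
   return {scan, running};
}
```

The exclusive scan value of a thread can serve as the index of its first
vertex in a compacted list, and the total gives the length of that list.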
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:15 +02:00
Timur Kristóf
c29e288fb5
aco/ngg: Create LDS layout for NGG GS.
...
For NGG GS, we need to store the following in LDS:
1. The ESGS ring, similarly to legacy ESGS.
2. Emitted vertices from the GS threads.
3. Temporary space used by the workgroup scan.
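One way to picture such a layout is as three consecutive regions. The sketch
below is only an illustration: the struct, the function name, the ordering
of the regions and the absence of alignment padding are all assumptions, not
the layout the commit creates.

```cpp
#include <cstdint>

// Hypothetical description of the three LDS regions, in bytes.
struct NggGsLdsLayout {
   uint32_t esgs_ring_base; // 1. ESGS ring
   uint32_t emit_base;      // 2. vertices emitted by the GS threads
   uint32_t scratch_base;   // 3. temporary space for the workgroup scan
   uint32_t total_size;
};

// Packs the regions back to back (assumed ordering, no padding).
NggGsLdsLayout make_lds_layout(uint32_t esgs_ring_bytes, uint32_t emit_bytes,
                               uint32_t scratch_bytes)
{
   NggGsLdsLayout l;
   l.esgs_ring_base = 0;
   l.emit_base = l.esgs_ring_base + esgs_ring_bytes;
   l.scratch_base = l.emit_base + emit_bytes;
   l.total_size = l.scratch_base + scratch_bytes;
   return l;
}
```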
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:15 +02:00
Timur Kristóf
9c3d8404de
aco/ngg: Allow NGG GS to create VS exports.
...
NGG GS needs to use the same instructions to export vertex
attributes at the end.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Timur Kristóf
b67878f328
aco/ngg: Allow NGG GS to load per-vertex GS inputs.
...
They work the same way as in legacy GS, so we can reuse that.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Timur Kristóf
8f25d9f821
aco/ngg: Allow NGG GS to store ES outputs.
...
We can reuse the existing ES output code.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Timur Kristóf
b57b1a06e4
aco/ngg: Clean up and reorganize NGG VS/TES code.
...
Make the NGG VS/TES code easier to follow, give better names to
some functions and make ngg_nogs_early_prim_export a variable.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Timur Kristóf
3645a3106a
aco/ngg: Make primitive export packing less prone to error.
...
Use lshl_or instead of lshl_add. This handles -1 and -2 indices more
robustly: they now simply become null exports, which is what we
want.
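The difference can be sketched on the CPU. The two helpers mirror the
arithmetic of v_lshl_add_u32 and v_lshl_or_b32; the packing loop, the 10-bit
spacing between indices and the use of bit 31 as the null-primitive flag are
assumptions made for this illustration:

```cpp
#include <cstdint>

// (a << shift) + b, wrapping on overflow.
uint32_t lshl_add(uint32_t a, uint32_t shift, uint32_t b) { return (a << shift) + b; }

// (a << shift) | b.
uint32_t lshl_or(uint32_t a, uint32_t shift, uint32_t b) { return (a << shift) | b; }

// Packs three vertex indices, assumed to sit 10 bits apart.
uint32_t pack_with_add(const uint32_t idx[3])
{
   uint32_t packed = idx[0];
   for (uint32_t i = 1; i < 3; ++i)
      packed = lshl_add(idx[i], 10u * i, packed);
   return packed;
}

uint32_t pack_with_or(const uint32_t idx[3])
{
   uint32_t packed = idx[0];
   for (uint32_t i = 1; i < 3; ++i)
      packed = lshl_or(idx[i], 10u * i, packed);
   return packed;
}

// Assumed null-primitive flag: bit 31 of the packed value.
bool is_null_export(uint32_t packed) { return (packed >> 31) != 0u; }
```

With the indices {-1, 1, 2}, the addition wraps around and yields 0x002003FF,
a valid-looking primitive with a bogus first index. The OR keeps every bit of
the -1 set and yields 0xFFFFFFFF, which has the assumed null flag set. For
valid indices both variants produce the same value.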
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Timur Kristóf
0bfe0495c1
aco/ngg: Refactor ngg_emit_prim_export in preparation for NGG GS.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Timur Kristóf
b08ced08a2
aco/ngg: Refactor gs_alloc_req in preparation for NGG GS.
...
Previously, this function inferred the vertex and primitive counts
from the gs_tg_info shader argument, but in the case of NGG GS, they
will need to be calculated at runtime.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Timur Kristóf
57d8799284
aco: Optimize thread_id_in_threadgroup when there is just one wave.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Timur Kristóf
5e31fb49a3
aco: Use thread_id_in_threadgroup helper for ES outputs.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Timur Kristóf
924f816fe1
aco: Extract thread_id_in_threadgroup to a separate function.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Timur Kristóf
b1964ad4d6
aco: Extract lanecount_to_mask to a separate function.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00