Samuel Pitoiset
bb00a6860e
radv: fix optimizing needed states if some are marked as dynamic
...
From the Vulkan spec 1.2.157:
"VK_DYNAMIC_STATE_STENCIL_TEST_ENABLE_EXT specifies that the
stencilTestEnable state in VkPipelineDepthStencilStateCreateInfo
will be ignored and must be set dynamically with
vkCmdSetStencilTestEnableEXT before any draw call."
So, stencilTestEnable should be ignored if dynamic. While we are
at it, fix depthBoundsTestEnable too.
Cc: 20.2
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3633
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7112 >
2020-10-14 17:13:29 +00:00
Eric Anholt
68daac28df
docs: Document how to replicate a CI build locally.
...
Who hasn't needed to do this at some point? Turns out it's not too hard
to do, and was useful for me in iterating on the Android build.
Acked-by: Michel Dänzer <mdaenzer@redhat.com >
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700 >
2020-10-14 16:54:59 +00:00
Eric Anholt
0767af3ffe
ci/android: Switch to using the Android NDK.
...
To support Android drivers, we're going to want to be tracking that Mesa's
build succeeds on a real android toolchain. This still uses the android
stubs since these libs aren't in the NDK.
Note that I had to drop the Intel and AMD drivers currently: we don't have
LLVM cross-compiled for Android in this container, and I'm honestly hoping
ACO saves us from that. Intel has dependencies on libexpat, which AOSP
really doesn't want to bring in, and it looks to me like those dependencies
could be optional.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700 >
2020-10-14 16:54:59 +00:00
Eric Anholt
ad6189920b
symbols-check: Add __cxa_guard_* to the list of approved symbols.
...
These are introduced by the compiler during static local initialization in
c++ for thread safety. This seems to end up being public in the driver
with --static-libc++ on android.
Reviewed-by: <ajax@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700 >
2020-10-14 16:54:59 +00:00
Eric Anholt
4722491124
glsl/tests: Make the tests skip on Android binary execution failures.
...
We don't have a suitable exe wrapper for running them, and the missing
linker is throwing return code 255 instead of an ENOEXEC. Catch it and
return skip from the tests.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700 >
2020-10-14 16:54:59 +00:00
Eric Anholt
f51ce21e4e
meson: Drop adding -Wl,--gc-sections to project c/cpp arguments.
...
We already have the targets we care about doing this using
ld_args_gc_sections, and by adding it to project arguments we caused
warnings spam in the android clang build about the compile stage not using
the argument.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6700 >
2020-10-14 16:54:59 +00:00
Tony Wasserka
d5a72319d6
aco/isel: Remove now unused VS-related code from create_null_export
...
Also replaced a hardcoded constant with the appropriate register macro.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102 >
2020-10-14 16:22:51 +00:00
Tony Wasserka
c22c702f35
aco/isel: Remove some dead code
...
exported_pos was always initialized to true (due to the is_pos argument
of the first export_vs_varying call being true), so none of this code has
any effect.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102 >
2020-10-14 16:22:51 +00:00
Tony Wasserka
bf51b11c04
aco/isel: Always export position data from VS/NGG
...
AMD ISA docs explicitly require this for VS, and this likely extends to
NGG too.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3615
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102 >
2020-10-14 16:22:51 +00:00
Daniel Schürmann
f29c81f863
aco: use VOP2 for v_cvt_pkrtz_f16_f32 if possible
...
This patch also does a slight rework of export_fs_mrt_color()
to avoid setting of enabled channels which are not used.
Totals from 52404 (38.38% of 136546) affected shaders (NAVI):
SGPRs: 3097443 -> 3097435 (-0.00%)
CodeSize: 189151600 -> 188546200 (-0.32%)
Instrs: 36445061 -> 36445104 (+0.00%); split: -0.00%, +0.00%
Cycles: 1739388020 -> 1739388192 (+0.00%); split: -0.00%, +0.00%
VMEM: 21071501 -> 21071665 (+0.00%); split: +0.00%, -0.00%
SMEM: 3470983 -> 3470982 (-0.00%); split: +0.00%, -0.00%
PreSGPRs: 2058965 -> 2058962 (-0.00%)
PreVGPRs: 1860294 -> 1860295 (+0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
7240edec2a
aco: use VOP2 version of v_cvt_pkrtz_f16_f32 on GFX_6_7_10
...
Totals from 767 (0.56% of 136546) affected shaders (NAVI):
CodeSize: 2862208 -> 2850036 (-0.43%)
Instrs: 561572 -> 561574 (+0.00%)
Cycles: 6455420 -> 6455428 (+0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
2f125908b3
radv,aco: lower_pack_half_2x16
...
This patch also optimizes pack_half_2x16(a, 0.0).
Totals from 1949 (1.43% of 136546) affected shaders (RAVEN):
SGPRs: 83376 -> 83336 (-0.05%)
CodeSize: 3532144 -> 3512352 (-0.56%)
Instrs: 660746 -> 660682 (-0.01%); split: -0.01%, +0.00%
Cycles: 6780716 -> 6780472 (-0.00%); split: -0.00%, +0.00%
VMEM: 990886 -> 990883 (-0.00%); split: +0.00%, -0.00%
SMEM: 150506 -> 150538 (+0.02%); split: +0.05%, -0.03%
SClause: 30595 -> 30594 (-0.00%); split: -0.01%, +0.00%
Copies: 40801 -> 40729 (-0.18%)
PreSGPRs: 52335 -> 52341 (+0.01%); split: -0.03%, +0.04%
PreVGPRs: 45104 -> 45097 (-0.02%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
dae1e6f756
aco: use v_cvt_pkrtz_f16_f32 for pack_half_2x16
...
Apparently, we forgot to remove some debug code.
This patch also fixes the round mode check to consider
the destination bit width.
Totals from 2218 (1.62% of 136546) affected shaders (RAVEN):
SGPRs: 100848 -> 100280 (-0.56%)
VGPRs: 68536 -> 66044 (-3.64%); split: -3.68%, +0.05%
CodeSize: 4882296 -> 4837220 (-0.92%); split: -0.94%, +0.01%
MaxWaves: 18990 -> 19019 (+0.15%); split: +0.19%, -0.04%
Instrs: 938150 -> 930388 (-0.83%); split: -0.83%, +0.00%
Cycles: 8699824 -> 8667648 (-0.37%); split: -0.38%, +0.01%
VMEM: 1144502 -> 1059680 (-7.41%); split: +0.06%, -7.48%
SMEM: 170076 -> 167999 (-1.22%); split: +0.22%, -1.44%
VClause: 18428 -> 18422 (-0.03%)
SClause: 41375 -> 41353 (-0.05%); split: -0.06%, +0.00%
Copies: 60008 -> 60054 (+0.08%); split: -0.31%, +0.39%
PreVGPRs: 56163 -> 56142 (-0.04%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
9185b7c069
aco: add validation rules for p_split_vector
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
aec872cda0
aco: use p_split_vector for nir_op_unpack_half_*
...
This enables the use of SDWA if possible
Totals from 9933 (7.27% of 136546) affected shaders (RAVEN):
VGPRs: 731764 -> 731772 (+0.00%); split: -0.00%, +0.00%
CodeSize: 90944852 -> 90671472 (-0.30%); split: -0.30%, +0.00%
Instrs: 17881885 -> 17867831 (-0.08%); split: -0.08%, +0.00%
Cycles: 1597904072 -> 1597771260 (-0.01%); split: -0.01%, +0.00%
VMEM: 1702328 -> 1697383 (-0.29%); split: +0.13%, -0.42%
SMEM: 659583 -> 659049 (-0.08%); split: +0.01%, -0.09%
VClause: 318024 -> 318025 (+0.00%); split: -0.00%, +0.00%
SClause: 631670 -> 631707 (+0.01%); split: -0.01%, +0.01%
Copies: 1504107 -> 1504626 (+0.03%); split: -0.01%, +0.04%
PreVGPRs: 683153 -> 683180 (+0.00%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
f503699e10
nir/opt_algebraic: optimize unpack_half_2x16_split_x(ushr, a, 16)
...
Same as extract_u16(a, 1)
Totals from 2021 (1.48% of 136546) affected shaders (RAVEN):
VGPRs: 129516 -> 129524 (+0.01%); split: -0.00%, +0.01%
CodeSize: 12485704 -> 12486600 (+0.01%); split: -0.00%, +0.01%
Instrs: 2435041 -> 2434999 (-0.00%); split: -0.00%, +0.00%
Cycles: 20952552 -> 20952624 (+0.00%); split: -0.00%, +0.00%
VMEM: 374492 -> 374212 (-0.07%); split: +0.01%, -0.08%
SMEM: 123309 -> 123291 (-0.01%); split: +0.00%, -0.02%
VClause: 64156 -> 64164 (+0.01%)
Copies: 191620 -> 191616 (-0.00%); split: -0.03%, +0.03%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
a38a497b86
aco: use p_create_vector for nir_op_pack_half_2x16
...
This enables the use of SDWA if possible
Totals from 2218 (1.62% of 136546) affected shaders (RAVEN):
VGPRs: 68508 -> 68516 (+0.01%)
CodeSize: 4897024 -> 4881068 (-0.33%); split: -0.33%, +0.00%
MaxWaves: 18992 -> 18990 (-0.01%)
Instrs: 946942 -> 939161 (-0.82%); split: -0.82%, +0.00%
Cycles: 8737668 -> 8705704 (-0.37%); split: -0.37%, +0.00%
VMEM: 1155362 -> 1145245 (-0.88%); split: +0.00%, -0.88%
SMEM: 170435 -> 170165 (-0.16%); split: +0.01%, -0.16%
VClause: 18426 -> 18425 (-0.01%)
SClause: 41376 -> 41375 (-0.00%)
Copies: 59813 -> 59787 (-0.04%); split: -0.15%, +0.10%
PreVGPRs: 56126 -> 56136 (+0.02%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
3c2abd7116
aco: expand create_vector more carefully w.r.t. subdword operands
...
No pipelinedb changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Daniel Schürmann
d887eb141b
aco: propagate SGPRs into VOP1 instructions early.
...
This helps DCE. We should reconsider our optimization order
or maybe do the dead code analysis twice
Totals from 106 (0.08% of 136546) affected shaders (RAVEN):
SGPRs: 7184 -> 7152 (-0.45%)
CodeSize: 736912 -> 736052 (-0.12%)
Instrs: 145739 -> 145509 (-0.16%)
Cycles: 2085344 -> 2084268 (-0.05%)
VMEM: 14819 -> 14807 (-0.08%)
SMEM: 7109 -> 7100 (-0.13%); split: +0.04%, -0.17%
SClause: 5383 -> 5385 (+0.04%)
Copies: 13290 -> 13189 (-0.76%)
PreSGPRs: 5265 -> 5221 (-0.84%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777 >
2020-10-14 15:31:38 +00:00
Mike Blumenkrantz
3424e17b9a
zink: unify code for emitting named uint-based variable instructions
...
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7130 >
2020-10-14 15:22:54 +00:00
Samuel Pitoiset
20d73a9049
aco: adjust an assertion about the wavesize in emit_gfx10_wave64_bpermute()
...
This gets rids of one more use of radv_shader_info.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061 >
2020-10-14 15:09:34 +00:00
Samuel Pitoiset
112e66fa09
aco: compute the CS workgroup size from the shader NIR info
...
cs.block_size is copied from cs.local_size during the shader info pass.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061 >
2020-10-14 15:09:34 +00:00
Samuel Pitoiset
e3e8d13ada
radv: move compiler statistics to ACO
...
They are really specific to ACO.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061 >
2020-10-14 15:09:34 +00:00
Samuel Pitoiset
97afb2a0a9
aco: remove unused radv_shader.h includes
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061 >
2020-10-14 15:09:34 +00:00
Samuel Pitoiset
408195ec53
aco: remove useless occurences of radv_nir_compiler_options
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061 >
2020-10-14 15:09:34 +00:00
Samuel Pitoiset
8a6f60fc6b
aco: remove stub lower_wqm() prototype
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061 >
2020-10-14 15:09:34 +00:00
Mike Blumenkrantz
23e731fcdb
zink: export PIPE_CAP_MAX*_VARYINGS values
...
this is separate from PIPE_SHADER_CAP_MAX_OUTPUTS
fixes mesa/mesa#3105
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7110 >
2020-10-14 14:47:22 +00:00
Erik Faye-Lund
d50e8554b9
zink: add feature-documentation
...
This adds some documentation for the current feature-set in Zink,
explaining what extensions are currently needed for what functionality.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7116 >
2020-10-14 14:23:43 +00:00
Mike Blumenkrantz
f85488ab82
zink: redo slot mapping again for the last time really I mean it
...
now that shader compiling is happening all at once, we can store the slot
map on zink_gfx_program directly and reserve it dynamically in order to
use up only the slots that are actually being used across all shader stages
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7100 >
2020-10-14 13:46:05 +00:00
Mike Blumenkrantz
4f144dc92c
zink: don't leak sampler view textures
...
by adding a batch reference for these textures during draw, we can successfully
destroy the resources without crashing
Reviewed-by: Erik Faye-Lun <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6924 >
2020-10-14 09:20:39 -04:00
Mike Blumenkrantz
270969b55e
zink: explicitly flag fb attachments as being written to in render passes
...
we need to ensure that we're accurately setting this hint in order to avoid
synchronization issues when determining whether we can read from the buffer
Reviewed-by: Erik Faye-Lun <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6924 >
2020-10-14 09:20:37 -04:00
Mike Blumenkrantz
8dfb941a4c
zink: add more explicit fencing for transfer maps
...
we're using our (primitive) buffer r/w tracking here to ensure that our
src buffers are synchronized before we do any kind of read operation on them
this is pretty slow in some cases, but it fixes a bunch of piglit tests
Reviewed-by: Erik Faye-Lun <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6924 >
2020-10-14 09:20:35 -04:00
Mike Blumenkrantz
e3ed624072
zink: optimize transfer_map for resources with pending reads/writes
...
we don't need to stall here if we know that we're not about to have any io
conflicts in the buffer
Reviewed-by: Erik Faye-Lun <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6924 >
2020-10-14 09:20:32 -04:00
Mike Blumenkrantz
c6687eef2d
zink: add a mechanism to track current resource usage in batches
...
this is really primitive, but it at least gives an idea of whether a
resource has been submitted for writing in a pending batch
Reviewed-by: Erik Faye-Lun <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6924 >
2020-10-14 09:20:24 -04:00
Samuel Pitoiset
48b988e35f
radv: fix ignoring the vertex attribute stride if set as dynamic
...
The vertex attribute stride should be ignored, so make sure it's
initialized to zero if dynamic to avoid computing a wrong offset.
The fact that each element of pStrides must be greater than or equal
to the maximum extent of all vertex input attributes fetched saves us
one user SGPR for the dynamic stride.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3627
Cc: 20.2
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7101 >
2020-10-14 12:29:39 +00:00
James Park
28d02b9d3e
ac,amd/llvm,radv: Initialize structs with {0}
...
Necessary to compile with MSVC.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7123 >
2020-10-14 12:15:23 +00:00
Samuel Pitoiset
b84d1a0c42
radv/aco: disable NGG GS support because it randomly hangs the GPU
...
Disable ACO NGG GS until the random GPU hangs are fixed
(one CTS run == one GPU hang here). No hangs so far after
5 full CTS runs with this disabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7108 >
2020-10-14 13:52:42 +02:00
Rhys Perry
21422b1ff2
nir/opt_uniform_atomics: remove useless returns
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7117 >
2020-10-14 09:53:34 +00:00
James Park
7758664788
radv: Only close local_fd when valid
...
Necessary when drm_device is bypassed.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119 >
2020-10-13 22:56:31 +00:00
James Park
4ca6faa933
util: Hide timespec_passed on Windows
...
Windows doesn't have clockid_t.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119 >
2020-10-13 22:56:31 +00:00
James Park
1026e2ac0f
radv: Increased const usage
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119 >
2020-10-13 22:56:31 +00:00
James Park
1b551857f9
amd/addrlib: Fix warning list for msvc
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119 >
2020-10-13 22:56:31 +00:00
Jason Ekstrand
5abac85177
intel/fs: Rework scratch handling on Gen9+
...
The current scratch mechanism uses an MRF hack where we reserve a few
GRF registers to treat like the MRF and we collect the data into that
MRF region before doing a scratch write. We also use that region for
the header for scratch reads.
This commit changes things and gets rid of the MRF hack. Instead, we
reserve a single register (which RA is free to pick) for the scratch
header and uses split sends for scratch writes to avoid having to do
the copy. This should provide RA with more freedom in the presence of
spilling as well as avoid some unnecessary data moves. In future, the
new GEN9_SCRATCH_HEADER opcode gives us a place where we can do our own
per-thread scratch base address calculations rather than depending on
the scratch base address that gets pushed into g0. Having an opcode for
this lets us do it once at the top of the shader rather than repeating
it at every read/write.
One other noticeable difference is the use of SHADER_OPCODE_SEND. We
can get away with this thanks to the fact that we're now using a set to
track which instructions are generated by spills and don't rely on the
opcodes to find spill/fill instructions. This allows us to avoid adding
more virtual opcodes and let the normal code paths handle things like
scoreboard dependencies between header setup and the SEND. It also
means that post-RA scheduling may be able to space out the header setup
MOV and the SEND for better latency hiding.
Shader-db results on Skylake:
total spills in shared programs: 12137 -> 10604 (-12.63%)
spills in affected programs: 6685 -> 5152 (-22.93%)
helped: 274
HURT: 2
total fills in shared programs: 13065 -> 11515 (-11.86%)
fills in affected programs: 9007 -> 7457 (-17.21%)
helped: 275
HURT: 1
Shader-db results on Ice Lake:
total spills in shared programs: 12482 -> 10953 (-12.25%)
spills in affected programs: 6586 -> 5057 (-23.22%)
helped: 275
HURT: 0
total fills in shared programs: 12819 -> 11234 (-12.36%)
fills in affected programs: 7867 -> 6282 (-20.15%)
helped: 274
HURT: 0
Shader-db results on Tigerlake:
total spills in shared programs: 11689 -> 10233 (-12.46%)
spills in affected programs: 4740 -> 3284 (-30.72%)
helped: 259
HURT: 0
total fills in shared programs: 10840 -> 9443 (-12.89%)
fills in affected programs: 6244 -> 4847 (-22.37%)
helped: 259
HURT: 0
Fossil-db results on Ice Lake:
Spills in all programs: 245249 -> 201633 (-17.8%)
Fills in all programs: 366066 -> 314368 (-14.1%)
More practically, this seems to give about a 0.5-1% perf boost in
Witcher 3 (DXVK) and Shadow of the Tomb Raider (Vulkan native).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084 >
2020-10-13 21:59:27 +00:00
Jason Ekstrand
e557af9781
intel/fs/ra: Use a set to track added spill/fill instructions
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084 >
2020-10-13 21:59:27 +00:00
Jason Ekstrand
f650c4c0c6
intel/fs/ra: Sanity-check our IP counts
...
Starting with e99081e76d , we don't re-construct liveness information
every time we spill a register. Instead, we're very careful to track
which instructions are spill instructions and not contribute those to
the IP count so that we can continue to use the old liveness information
even though instructions have been added. This commit adds an assert
that sanity-checks that we count the same number of instructions as our
liveness information is based on.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084 >
2020-10-13 21:59:27 +00:00
Jason Ekstrand
d80d0a6ced
intel/fs/ra: Store the last non-spill VGRF node
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084 >
2020-10-13 21:59:27 +00:00
Jason Ekstrand
2af6528c33
intel/fs/ra: Refactor handling of Gen7 scratch reads
...
The attempt at de-duplication with the gen7_read Boolean wasn't actually
saving us anything.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084 >
2020-10-13 21:59:27 +00:00
Jason Ekstrand
74a1843ca0
intel/fs/ra: Increment spill_offset as part of the emit_spill loop
...
This makes it consistent with our handling of src.offset and with our
handling of spill_offset in emit_unspill.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084 >
2020-10-13 21:59:27 +00:00
Jason Ekstrand
06ebf23283
intel/fs: Add a SCRATCH_HEADER opcode
...
This opcode is responsible for setting up the buffer base address and
per-thread scratch space fields of a scratch message header. For the
most part, it's a copy of g0 but some messages need us to zero out g0.2
and the bottom bits of g0.5.
This may actually fix a bug when nir_load/store_scratch is used. The
docs say that the DWORD scattered messages respect the per-thread
scratch size specified in gN.3[3:0] in the message header but we've been
leaving it zero. This may mean that we've been ignoring any scratch
reads/writes from a load/store_scratch intrinsic above the 1KB mark.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084 >
2020-10-13 21:59:27 +00:00
Jason Ekstrand
24b64c8408
intel/fs: Copy the PTSS from g0 for scratch reads/writes
...
In theory, this fixes a bug where we were dropping the PTSS bound on the
floor. The hardware docs claim that the A32 DWORD and BYTE scattered
read/write messages do a PTSS bounds check. However, in practice, it
seems that the hardware ignores the bounds check so this doesn't
actually matter. I verified this with the following couple of piglit
tests:
https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/399
In practice, this prevents the next commit from making a subtle
behavioral change.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7084 >
2020-10-13 21:59:27 +00:00