Rhys Perry
1f2518ef9f
aco: implement nir_op_extract/nir_op_insert
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151 >
2021-06-08 08:57:43 +00:00
Caio Marcelo de Oliveira Filho
c8a7bd0dc8
nir: Rename WORK_GROUP (and similar) to WORKGROUP
...
Be consistent with other usages in Vulkan and SPIR-V, and the recently
added workgroup_size field.
Acked-by: Emma Anholt <emma@anholt.net >
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190 >
2021-06-07 22:34:42 +00:00
Daniel Schürmann
dc807dff3e
radv,aco: scalarize all phis via nir_lower_phis_to_scalar()
...
This allows to remove some ACO code which did so previously.
Totals from 93 (0.06% of 149839) affected shaders (Navi2):
CodeSize: 582424 -> 582348 (-0.01%); split: -0.10%, +0.08%
Instrs: 107083 -> 107011 (-0.07%); split: -0.08%, +0.01%
Latency: 483338 -> 484881 (+0.32%); split: -0.09%, +0.40%
InvThroughput: 101129 -> 101532 (+0.40%); split: -0.03%, +0.42%
Copies: 9893 -> 9774 (-1.20%); split: -1.28%, +0.08%
Branches: 2862 -> 2858 (-0.14%)
PreSGPRs: 3342 -> 3339 (-0.09%)
PreVGPRs: 4567 -> 4565 (-0.04%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11181 >
2021-06-04 16:47:01 +00:00
Rhys Perry
903f814b78
aco: don't create 4 and 5 dword NSA instructions on GFX10
...
"stability issues", apparently: https://reviews.llvm.org/D103348
fossil-db (Navi10):
Totals from 4512 (3.01% of 149839) affected shaders:
VGPRs: 221516 -> 223308 (+0.81%); split: -0.07%, +0.88%
CodeSize: 23000080 -> 23070672 (+0.31%); split: -0.08%, +0.39%
MaxWaves: 107718 -> 107496 (-0.21%); split: +0.11%, -0.32%
Instrs: 4321890 -> 4362822 (+0.95%); split: -0.00%, +0.95%
Latency: 71495710 -> 71581476 (+0.12%); split: -0.07%, +0.19%
InvThroughput: 11858568 -> 11938960 (+0.68%); split: -0.00%, +0.68%
VClause: 76575 -> 76585 (+0.01%); split: -0.05%, +0.07%
SClause: 168771 -> 168709 (-0.04%); split: -0.06%, +0.02%
Copies: 182305 -> 221948 (+21.75%); split: -0.00%, +21.75%
PreVGPRs: 194657 -> 195635 (+0.50%); split: -0.00%, +0.50%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Fixes: c353895c92 ("aco: use non-sequential addressing")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10898 >
2021-06-03 03:49:07 +00:00
Timur Kristóf
a93092d0ed
aco: Use s_cbranch_vccz/nz in post-RA optimization.
...
A simple post-RA optimization which takes advantage of the
s_cbranch_vccz and s_cbranch_vccnz instructions.
It works on the following pattern:
vcc = v_cmp ...
scc = s_and vcc, exec
p_cbranch scc
The result looks like this:
vcc = v_cmp ...
p_cbranch vcc
Fossil DB results on Sienna Cichlid:
Totals from 4814 (3.21% of 149839) affected shaders:
CodeSize: 15371176 -> 15345964 (-0.16%)
Instrs: 3028557 -> 3022254 (-0.21%)
Latency: 21872753 -> 21823476 (-0.23%); split: -0.23%, +0.00%
InvThroughput: 4470282 -> 4468691 (-0.04%); split: -0.04%, +0.00%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779 >
2021-05-28 12:14:53 +00:00
Samuel Pitoiset
729ebe4b17
aco: fix emitting discard when the program just ends
...
For fragment shaders that only contain a discard, the exec mask has
to be zero'd and everything discarded.
It seems unnecessary to emit an export here because if the FS has no
exports, the compiler already emits a null export at the end.
Fixes incorrect hair rendering in Detroit: Become Human.
fossil-db (Sienna Cichlid):
Totals from 3 (0.00% of 149839) affected shaders:
CodeSize: 2896 -> 2872 (-0.83%)
Instrs: 556 -> 553 (-0.54%)
Latency: 29266 -> 29214 (-0.18%)
InvThroughput: 3374 -> 3372 (-0.06%)
Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10955 >
2021-05-26 10:32:59 +00:00
Daniel Schürmann
32c7d17120
aco: remove condition operand from branch in invert block
...
As value numbering only handles logical blocks, this
could lead to invalid IR until insert_exec_mask().
No fossil-db changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10894 >
2021-05-20 17:44:20 +00:00
Samuel Pitoiset
fe2a5716ee
aco: fix derivatives/intrinsics with SGPR sources
...
ds_swizzle_b32 requires a VGPR and DPP can't encode SGPR sources.
Fixes
dEQP-VK.graphicsfuzz.cov-derivative-uniform-vector-global-loop-count.
Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10840 >
2021-05-20 13:24:31 +00:00
Bas Nieuwenhuizen
c7904b5b9b
aco: Implement bvh64_intersect_ray_amd intrinsic.
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10818 >
2021-05-18 23:02:25 +02:00
Bas Nieuwenhuizen
bfe2802188
aco: Add load_sbt_amd intrinsic implementation.
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9767 >
2021-05-18 18:29:36 +00:00
Timur Kristóf
bb127c2130
radv: Use new NIR lowering of NGG GS when ACO is used.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740 >
2021-05-12 13:47:04 +00:00
Timur Kristóf
9732881729
radv: Use new NGG NIR lowering for VS/TES when ACO is used.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740 >
2021-05-12 13:47:04 +00:00
Timur Kristóf
89a76ff786
aco: Implement new NGG specific NIR intrinsics.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740 >
2021-05-12 13:47:04 +00:00
Timur Kristóf
75a002f809
aco: Split ngg_emit_sendmsg_gs_alloc_req from the wave0 check.
...
This allows us to emit the gs_alloc_req independently of the
wave ID check, which is what the NIR lowering will need.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740 >
2021-05-12 13:47:04 +00:00
Timur Kristóf
00fd087f0a
aco: Allow workgroup barrier and shared scope for NGG shaders.
...
NGG already needs to use workgroup barriers, but this commit allows
them to come from NIR as opposed to just emitting it in ACO
instruction selection.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740 >
2021-05-12 13:47:04 +00:00
Rhys Perry
a54f111831
radv,aco: compact vertex buffer descriptors
...
It seems common for there to be holes.
fossil-db (GFX10.3, robustBufferAccess enabled):
Totals from 33791 (23.10% of 146267) affected shaders:
(no statistics changed)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871 >
2021-05-10 12:09:14 +00:00
Rhys Perry
20a0744e22
Revert "radv,aco: don't use MUBUF for multi-channel loads on GFX8 with robustness2"
...
This reverts commit a8a6b9fb2fdcb1bea55707fa0c2b8e96f03c6b5b.
This is no longer necessary now that we fixup the size when creating the
descriptors.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871 >
2021-05-10 12:09:14 +00:00
Rhys Perry
157c6b0f33
radv,aco: use per-attribute vertex descriptors for robustness
...
We have to use a different num_records for each attribute to correctly
implement robust buffer access.
fossil-db (GFX10.3, robustBufferAccess enabled):
Totals from 60059 (41.06% of 146267) affected shaders:
VGPRs: 2169040 -> 2169024 (-0.00%); split: -0.02%, +0.02%
CodeSize: 79473128 -> 81156016 (+2.12%); split: -0.00%, +2.12%
MaxWaves: 1635360 -> 1635258 (-0.01%); split: +0.00%, -0.01%
Instrs: 15559040 -> 15793205 (+1.51%); split: -0.01%, +1.52%
Latency: 90954792 -> 91308768 (+0.39%); split: -0.30%, +0.69%
InvThroughput: 14937873 -> 14958761 (+0.14%); split: -0.04%, +0.18%
VClause: 444280 -> 412074 (-7.25%); split: -9.22%, +1.97%
SClause: 588545 -> 644141 (+9.45%); split: -0.54%, +9.99%
Copies: 1010395 -> 1011232 (+0.08%); split: -0.44%, +0.53%
Branches: 274279 -> 274282 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1431171 -> 1405056 (-1.82%); split: -2.89%, +1.07%
PreVGPRs: 1575253 -> 1575259 (+0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871 >
2021-05-10 12:09:14 +00:00
Rhys Perry
dfa38fa0c7
aco: group loads from the same vertex binding into the same clause
...
In the future, we might have vertex attribute loads from the same binding
but with different descriptors. Since they will be loading from the same
buffer, we should continue grouping them into clauses.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871 >
2021-05-10 12:09:14 +00:00
Rhys Perry
ee9b744cb5
radv,aco: use nir_address_format_vec2_index_32bit_offset
...
The vec2 index helps the compiler make use of SMEM's SOFFSET field when
loading descriptors.
fossil-db (GFX10.3):
Totals from 126326 (86.37% of 146267) affected shaders:
VGPRs: 4898704 -> 4899088 (+0.01%); split: -0.02%, +0.03%
SpillSGPRs: 13490 -> 14404 (+6.78%); split: -1.10%, +7.87%
CodeSize: 306442996 -> 302277700 (-1.36%); split: -1.36%, +0.01%
MaxWaves: 3277108 -> 3276624 (-0.01%); split: +0.01%, -0.02%
Instrs: 58301101 -> 57469370 (-1.43%); split: -1.43%, +0.01%
VClause: 1208270 -> 1199264 (-0.75%); split: -1.02%, +0.28%
SClause: 2517691 -> 2432744 (-3.37%); split: -3.75%, +0.38%
Copies: 3518643 -> 3161097 (-10.16%); split: -10.45%, +0.29%
Branches: 1228383 -> 1228254 (-0.01%); split: -0.12%, +0.11%
PreSGPRs: 3973880 -> 4031099 (+1.44%); split: -0.19%, +1.63%
PreVGPRs: 3831599 -> 3831707 (+0.00%)
Cycles: 1785250712 -> 1778222316 (-0.39%); split: -0.42%, +0.03%
VMEM: 52873776 -> 50663317 (-4.18%); split: +0.18%, -4.36%
SMEM: 8534270 -> 8361666 (-2.02%); split: +1.79%, -3.82%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9523 >
2021-04-27 15:56:07 +00:00
Samuel Pitoiset
4c2add8cba
aco: adjust NGG if provoking vertex mode is last
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10449 >
2021-04-27 07:31:03 +00:00
James Park
1351fcf3c3
amd: Fix warnings around variable sizes
...
Reviewed-by: Jesse Natalie <jenatali@microsoft.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6162 >
2021-04-23 10:37:22 +00:00
Timur Kristóf
74c467d988
aco: Mark VCC clobbered for iadd8 and iadd16 reductions on GFX6-7.
...
On GFX6-7, the 8 and 16-bit integer add reductions use the 32-bit v_add
instruction, which clobbers the VCC register.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10346 >
2021-04-22 11:29:49 +00:00
Rhys Perry
0eaa5dfac0
aco: remove image parameter from get_sampler_desc()
...
We can just check whether tex_instr is NULL instead.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036 >
2021-04-20 17:42:21 +00:00
Rhys Perry
3cbe9894f7
aco: set TRUNC_COORD=0 for nir_texop_tg4
...
Fixes black squares in Assassin's Creed: Valhalla and rendering of
FidelityFX-CACAO demo.
fossil-db (sienna cichlid):
Totals from 3052 (2.09% of 146267) affected shaders:
SpillSGPRs: 8437 -> 8646 (+2.48%)
CodeSize: 30993832 -> 31116916 (+0.40%); split: -0.00%, +0.40%
Instrs: 5869934 -> 5886783 (+0.29%); split: -0.00%, +0.29%
Latency: 250330521 -> 250463770 (+0.05%); split: -0.00%, +0.05%
InvThroughput: 59797617 -> 59814584 (+0.03%); split: -0.00%, +0.03%
VClause: 92114 -> 92132 (+0.02%)
SClause: 197373 -> 197338 (-0.02%); split: -0.02%, +0.01%
Copies: 479482 -> 482394 (+0.61%); split: -0.01%, +0.61%
Branches: 219629 -> 219635 (+0.00%)
PreSGPRs: 248970 -> 249366 (+0.16%)
fossil-db (polaris10):
Totals from 3050 (2.06% of 147787) affected shaders:
SGPRs: 282864 -> 282912 (+0.02%); split: -0.01%, +0.02%
VGPRs: 242572 -> 242612 (+0.02%)
SpillSGPRs: 10387 -> 10675 (+2.77%)
CodeSize: 31872460 -> 31996128 (+0.39%)
MaxWaves: 10924 -> 10925 (+0.01%)
Instrs: 6222217 -> 6239072 (+0.27%)
Latency: 317482545 -> 317773685 (+0.09%); split: -0.00%, +0.09%
InvThroughput: 156149624 -> 156242072 (+0.06%); split: -0.00%, +0.06%
VClause: 92295 -> 92254 (-0.04%); split: -0.05%, +0.01%
SClause: 243342 -> 243321 (-0.01%); split: -0.01%, +0.00%
Copies: 678902 -> 681700 (+0.41%); split: -0.00%, +0.41%
Branches: 219698 -> 219703 (+0.00%)
PreSGPRs: 244251 -> 244644 (+0.16%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Fixes: 58f25098a0 ("radv: Use TRUNC_COORD on samplers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3110
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036 >
2021-04-20 17:42:21 +00:00
Samuel Pitoiset
9434675d60
aco: fix opquantize2f16 on GFX6-7
...
Make sure to preserve signed zeroes.
Fixes dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero
on GFX6 (Pitcairn). Untested on GFX7.
Fixes: 54a09545ec ("aco: optimize a*0.0")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10319 >
2021-04-19 16:33:37 +00:00
Marek Olšák
ec1ddb976a
amd/registers: rename IMG_FORMAT to GFX10_FORMAT to disambiguate the meaning
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10261 >
2021-04-17 02:37:49 +00:00
Timur Kristóf
5dbab03a80
aco: Emit fewer branches for NGG VS/TES with late primitive export.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106 >
2021-04-14 14:25:10 +00:00
Timur Kristóf
af7d5f5b86
aco: Set block_kind_export_end in create_vs/fs_exports.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106 >
2021-04-14 14:25:10 +00:00
Timur Kristóf
2b312a4fd7
aco: Extract ngg_nogs_export_prim_id to a separate function.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106 >
2021-04-14 14:25:10 +00:00
Timur Kristóf
231ef14b3d
aco: Use s_setprio 3 at the beginning of every VS and TES.
...
The user-set priority of shaders matters very little, but we hope
this might still help speed up VS input loads especially.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106 >
2021-04-14 14:25:10 +00:00
Timur Kristóf
4c86c7aa15
aco: Remove useless s_setprio near gs_alloc_req.
...
We learned that the gs_alloc_req is not actually when the export
space allocation happens. So it makes no sense to prioritize it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106 >
2021-04-14 14:25:10 +00:00
Timur Kristóf
75cd43741a
aco: Align NGG scratch size to 16 so a single ds_read can always read it.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155 >
2021-04-14 14:05:24 +00:00
Timur Kristóf
c1346e5c22
aco: Optimize workgroup exclusive scan to better avoid bank conflicts.
...
Previously, every wave had multiple active lanes read the LDS, and
the data was processed by VALU DPP instructions.
Now, only the first lane reads the LDS in order to avoid bank
conflicts, and the results are processed by SALU.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155 >
2021-04-14 14:05:24 +00:00
Rhys Perry
d8f12fd421
aco: fix 16-bit f2{u8,i8} on GFX6/7
...
Not really tested.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081 >
2021-04-12 16:19:46 +00:00
Rhys Perry
d0e15b8c22
aco: fix 16-bit u2f32
...
This shouldn't sign-extend.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081 >
2021-04-12 16:19:46 +00:00
Samuel Pitoiset
1ad295ed6f
radv: allow to force VRS rates on GFX10.3 with RADV_FORCE_VRS
...
This allows to force the VRS rates via RADV_FORCE_VRS, the supported
values are 2x2, 1x2 and 2x1. This supports the primitive shading rate
mode for non GUI elements.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7794 >
2021-04-09 14:47:53 +02:00
Rhys Perry
835c5b7ebf
aco: fix integer tg4 workaround with unnormalized coordinates
...
Same as LLVM from 2abf62d348 .
fossil-db (GFX8):
Totals from 15 (0.01% of 147787) affected shaders:
VGPRs: 744 -> 748 (+0.54%)
CodeSize: 100472 -> 100732 (+0.26%)
Instrs: 19995 -> 20059 (+0.32%)
Latency: 1001530 -> 1001859 (+0.03%)
InvThroughput: 378508 -> 378747 (+0.06%)
SClause: 676 -> 675 (-0.15%)
Copies: 1655 -> 1654 (-0.06%)
PreSGPRs: 735 -> 742 (+0.95%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10053 >
2021-04-07 15:21:51 +00:00
Samuel Pitoiset
65bca137bd
aco: implement a workaround for the image load DCC hw bug on GFX10.3
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9919 >
2021-04-05 08:54:55 +00:00
Samuel Pitoiset
3dfb453626
aco: fix get_sampler_desc() for image loads
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9919 >
2021-04-05 08:54:55 +00:00
Tony Wasserka
8557ac9a12
aco/isel: Add documentation for (u)int64->f16 conversion
...
The upper 32 bits are truncated before converting, which still produces
correct results since they never meaningfully contribute to the result.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597 >
2021-03-26 14:39:23 +00:00
Tony Wasserka
b5be03f39f
aco/isel: Fix large inputs being truncated in int32->f16 conversions
...
The previous code produced incorrect results for inputs outside the
range [INT16_MIN, INT16_MAX].
A problematic case is e.g. i2f16 32768, which previously would be
converted to -32768.0 instead of returning the exactly representable
floating point result.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597 >
2021-03-26 14:39:23 +00:00
Tony Wasserka
4ce8e422e3
aco/isel: Add documentation and asserts for convert_int
...
This function has evolved to be a generic helper function used throughout
the file, so having those assumptions written down explicitly and document
unsupported edge cases should help prevent incorrect use.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597 >
2021-03-26 14:39:23 +00:00
Tony Wasserka
1e03796fa4
aco/isel: Don't request sign extension when truncating signed integers
...
This doesn't change semantics but allows us to reject this potentially
ambiguous configuration in convert_int in a later change.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597 >
2021-03-26 14:39:23 +00:00
Tony Wasserka
3a2b055726
aco/isel: Fix i64/u64->float32 conversion for large inputs
...
Previously, inputs such as 0x100000000 would have their upper 32-bits
ignored despite being representable by 32-bit floats.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597 >
2021-03-26 14:39:23 +00:00
Tony Wasserka
436922c84a
aco/isel: Don't emit unsupported i16<->f16 conversion opcodes on GFX6/7
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Fixes: b86305bb57 ("nir/algebraic: collapse conversion opcodes (many patterns)")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4357
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597 >
2021-03-26 14:39:23 +00:00
Rhys Perry
e3c283e0bc
aco: use -1.0*x and 1.0*|x| for fneg/fabs
...
Besides -1.0*x being 1 dword smaller than x^0x80000000, this commit also
improves generated code when the application requires that denormals are
flushed.
Future versions of DXVK will require that 32-bit denormals are flushed.
fossil-db (GFX8):
Totals from 21021 (14.22% of 147787) affected shaders:
SGPRs: 1288960 -> 1288944 (-0.00%); split: -0.01%, +0.01%
VGPRs: 792672 -> 792848 (+0.02%); split: -0.01%, +0.03%
CodeSize: 62439228 -> 62403552 (-0.06%); split: -0.11%, +0.05%
MaxWaves: 136182 -> 136181 (-0.00%); split: +0.00%, -0.00%
Instrs: 12230882 -> 12239927 (+0.07%); split: -0.01%, +0.08%
fossil-db (GFX10.3):
Totals from 20191 (13.80% of 146267) affected shaders:
VGPRs: 799992 -> 800032 (+0.01%)
CodeSize: 59763656 -> 59715484 (-0.08%); split: -0.12%, +0.03%
MaxWaves: 525378 -> 525376 (-0.00%)
Instrs: 11511082 -> 11517419 (+0.06%); split: -0.00%, +0.06%
fossil-db (GFX8, d3d float controls):
Totals from 87160 (58.98% of 147787) affected shaders:
SGPRs: 5395072 -> 5408480 (+0.25%); split: -0.06%, +0.31%
VGPRs: 3596716 -> 3581592 (-0.42%); split: -0.55%, +0.13%
CodeSize: 271347396 -> 266814460 (-1.67%); split: -1.67%, +0.00%
MaxWaves: 539669 -> 540400 (+0.14%); split: +0.15%, -0.02%
Instrs: 53395194 -> 52257505 (-2.13%); split: -2.13%, +0.00%
fossil-db (GFX10.3, d3d float controls):
Totals from 82306 (56.27% of 146267) affected shaders:
VGPRs: 3572312 -> 3558848 (-0.38%); split: -0.44%, +0.06%
CodeSize: 273494748 -> 269648968 (-1.41%); split: -1.41%, +0.00%
MaxWaves: 2007156 -> 2009950 (+0.14%); split: +0.15%, -0.01%
Instrs: 52251568 -> 51356424 (-1.71%); split: -1.71%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9079 >
2021-03-24 14:02:41 +00:00
Rhys Perry
27e2f82f17
aco: implement image_deref_samples
...
It used to be that this intrinsic was never created and texture
instructions were always used.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Fixes: 50881d59e6 ("compiler/spirv: fix image sample queries")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9686 >
2021-03-19 10:31:46 +00:00
Timur Kristóf
89c8e22cc6
aco: Fix constant address offset calculation for ds_read2 instructions.
...
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9678 >
2021-03-18 10:43:41 +00:00
Rhys Perry
5bc100eb2d
aco: use a single instruction for uadd32_sat() on GFX8
...
fossil-db (GFX8):
Totals from 8 (0.01% of 147787) affected shaders:
SGPRs: 352 -> 368 (+4.55%)
CodeSize: 49576 -> 48788 (-1.59%)
Instrs: 9487 -> 9318 (-1.78%)
Latency: 49935 -> 49607 (-0.66%)
InvThroughput: 138493 -> 137443 (-0.76%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9598 >
2021-03-17 15:33:34 +00:00