Tony Wasserka
b603875482
aco/ra: Use PhysRegInterval for count_zero
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:06 +00:00
Tony Wasserka
c30e83cc51
aco/ra: Use PhysRegInterval for collect_vars parameters
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:06 +00:00
Tony Wasserka
0959b7c435
aco/ra: Use PhysReg when indexing into RegisterFile's containers
...
This gets rid of a lot of implicit/explicit conversions from PhysReg to
unsigned.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:06 +00:00
Tony Wasserka
c3660f4781
aco/ra: Use PhysReg for member functions of PhysRegInterval
...
This replaces the various PhysReg{lb} casts that had been all over the place.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:06 +00:00
Tony Wasserka
d2d0096c0c
aco/ra: Remove unused function parameter
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:06 +00:00
Tony Wasserka
d9e1375e27
aco/ra: Use std::all_of to simplify a loop
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:06 +00:00
Tony Wasserka
f7e6b61379
aco/ra: Add helpers to test for intersection/containment of reg intervals
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:06 +00:00
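The intersection/containment helpers named in f7e6b61379 are not shown in this log. A minimal sketch of what such tests could look like follows, assuming half-open, non-empty intervals; the names `Interval`, `intersects` and `contains` are hypothetical and not taken from the ACO sources.

```cpp
// Hypothetical half-open register interval [lo, lo + size); not ACO's type.
struct Interval {
    unsigned lo;    // first register (inclusive)
    unsigned size;  // number of registers covered

    unsigned hi() const { return lo + size; }  // one past the last register
};

// True if the two (non-empty) intervals share at least one register.
inline bool intersects(const Interval& a, const Interval& b) {
    return a.lo < b.hi() && b.lo < a.hi();
}

// True if `inner` lies entirely within `outer`.
inline bool contains(const Interval& outer, const Interval& inner) {
    return outer.lo <= inner.lo && inner.hi() <= outer.hi();
}
```

With an exclusive upper bound, two intervals that merely touch (one ends where the other begins) do not intersect, which is the usual reason for choosing this convention.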
Tony Wasserka
88f21ad87a
aco/ra: Move commonly repeated code to a helper function
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:06 +00:00
Tony Wasserka
8962510e38
aco/ra: Conservatively refactor get_reg_specified to use PhysRegInterval
...
All expressions have been replaced by their closest equivalent. To minimize
the risk of regressions, no major simplification efforts have been made.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:06 +00:00
Tony Wasserka
46c9d76134
aco/ra: Use std::all_of to simplify a loop
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:06 +00:00
Tony Wasserka
2b3b2f7ff5
aco/ra: Use std::find_if(_not) to clean up get_reg_simple
...
This makes for a more self-describing iteration behavior, and it gets rid
of the need for the duplicated "final check" at the bottom.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:06 +00:00
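The find_if/find_if_not pattern mentioned in 2b3b2f7ff5 can be illustrated with a small search for a run of free slots. This is only a sketch under assumptions: the function name `find_free_run`, the use of a `std::vector<unsigned>` where 0 means "free", and C++17's `std::optional` are all illustrative choices, not the actual get_reg_simple code.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

// Returns the index of the first run of at least `size` consecutive free
// slots (value 0), or std::nullopt if no such run exists.
inline std::optional<std::size_t> find_free_run(const std::vector<unsigned>& regs,
                                                std::size_t size) {
    auto is_free = [](unsigned r) { return r == 0; };
    auto it = regs.begin();
    while (true) {
        // Skip occupied slots until the next free one.
        auto run_begin = std::find_if(it, regs.end(), is_free);
        if (run_begin == regs.end())
            return std::nullopt;
        // Extend the run until the next occupied slot (or the end).
        auto run_end = std::find_if_not(run_begin, regs.end(), is_free);
        if (static_cast<std::size_t>(run_end - run_begin) >= size)
            return static_cast<std::size_t>(run_begin - regs.begin());
        it = run_end;
    }
}
```

Because each run is measured in full before it is accepted or rejected, a run that reaches the end of the container is handled by the same comparison as any other, so no separate final check is needed after the loop.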
Tony Wasserka
ebdb362937
aco/ra: Add iterator interface for PhysRegInterval
...
This enables various loops to use range-based for.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:06 +00:00
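To show how an iterator interface enables range-based for, as ebdb362937 describes, here is a minimal sketch. The type `RegRange` and the helper `sum_regs` are hypothetical, and plain `unsigned` indices stand in for ACO's PhysReg.

```cpp
#include <cstddef>
#include <iterator>

// Hypothetical half-open register range [lo, lo + size).
struct RegRange {
    unsigned lo;
    unsigned size;

    // Minimal input iterator that yields register indices by value.
    class iterator {
    public:
        using iterator_category = std::input_iterator_tag;
        using value_type = unsigned;
        using difference_type = std::ptrdiff_t;
        using pointer = const unsigned*;
        using reference = unsigned;

        explicit iterator(unsigned reg) : reg_(reg) {}
        unsigned operator*() const { return reg_; }
        iterator& operator++() { ++reg_; return *this; }
        iterator operator++(int) { iterator tmp = *this; ++reg_; return tmp; }
        bool operator==(const iterator& o) const { return reg_ == o.reg_; }
        bool operator!=(const iterator& o) const { return reg_ != o.reg_; }

    private:
        unsigned reg_;
    };

    iterator begin() const { return iterator{lo}; }
    iterator end() const { return iterator{lo + size}; }
};

// Example consumer: range-based for visits lo, lo + 1, ..., lo + size - 1.
inline unsigned sum_regs(RegRange range) {
    unsigned total = 0;
    for (unsigned reg : range)
        total += reg;
    return total;
}
```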
Tony Wasserka
689ce1f39d
aco/ra: Remove always-false conditions
...
All code paths that set "found" to true either break or return before the
loop header is reached again, so the checks are unnecessary.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:05 +00:00
Tony Wasserka
46eee40abc
aco/ra: Conservatively refactor existing code to use PhysRegInterval
...
All expressions have been replaced by their closest equivalent. To minimize
the risk of regressions, no major simplification efforts have been made.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:05 +00:00
Tony Wasserka
9bbd6162a9
aco/ra: Introduce PhysRegInterval helper class
...
This mainly clarifies the semantics of register bounds (inclusive vs
exclusive), and further groups related variables together to clarify
sliding-window-style loops.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:05 +00:00
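The idea behind 9bbd6162a9 (one type that pins down inclusive vs exclusive bounds and keeps a window's start and size together) might be sketched as below. This is an assumption-laden illustration, not the class from the Mesa tree: the simplified `PhysReg` wrapper, the member names and the `operator+=` are all hypothetical.

```cpp
#include <cstdint>

// Hypothetical stand-in for ACO's PhysReg: wraps a register index.
struct PhysReg {
    std::uint16_t reg;
};

// Half-open interval [lo, lo + size) of physical registers.
struct PhysRegInterval {
    PhysReg lo_;    // first register in the interval (inclusive)
    unsigned size;  // number of registers covered

    PhysReg lo() const { return lo_; }

    // One past the last register (exclusive upper bound).
    PhysReg hi() const {
        return PhysReg{static_cast<std::uint16_t>(lo_.reg + size)};
    }

    // Slide the window forward by `delta` registers, keeping its size.
    PhysRegInterval& operator+=(unsigned delta) {
        lo_.reg = static_cast<std::uint16_t>(lo_.reg + delta);
        return *this;
    }
};
```

A sliding-window loop then advances a single object instead of updating separate lower-bound and upper-bound variables, and the exclusive nature of `hi()` is stated in one place.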
Tony Wasserka
67c1f32228
aco/ra: Update register use bounds before recursing in get_regs_for_copies
...
Delaying the call to adjust_max_used_regs until after get_regs_for_copies
returns puts the RA context into a state where registers past max_used_gpr
may be blocked. This isn't an issue on its own, but it adds a surprising
corner case to get_reg_simple that is easily avoided now.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7799 >
2021-01-13 18:21:05 +00:00
Daniel Schürmann
288032a873
aco: remove divergent branches which only jump over very few instructions
...
Totals from 18436 (13.23% of 139391) affected shaders (NAVI10):
CodeSize: 138428504 -> 138172588 (-0.18%)
Instrs: 26605127 -> 26541176 (-0.24%)
Cycles: 1624994088 -> 1622461620 (-0.16%)
VMEM: 3689892 -> 3689102 (-0.02%)
SMEM: 1131767 -> 1131761 (-0.00%)
Branches: 851796 -> 787852 (-7.51%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7814 >
2021-01-13 18:04:28 +00:00
Daniel Schürmann
412291ddef
aco: propagate swizzles when optimizing packed clamp & fma
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:56 +00:00
Daniel Schürmann
6ecbccfb23
aco: optimize v_pk_fma_f16 -> v_pk_fmac_f16 on GFX10
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:56 +00:00
Daniel Schürmann
b03be30e07
aco: optimize packed fneg
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:56 +00:00
Daniel Schürmann
e3790fc458
aco: optimize packed clamp
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:56 +00:00
Daniel Schürmann
a9fd9187e8
aco: optimize packed mul+add to v_pk_fma_f16
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:56 +00:00
Daniel Schürmann
01134b0bfe
aco: simplify multiply-add combining
...
When both operands of a v_sub (the same applies to v_add) are mul and one
already uses clamp/omod, pick the other operand to get a chance to
combine to a MAD.
No fossils-db changes.
Co-authored-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:56 +00:00
Daniel Schürmann
fcd2ef23e5
radv: vectorize 16bit instructions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:56 +00:00
Daniel Schürmann
454bbf8f23
aco: emit packed 16bit instructions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:56 +00:00
Daniel Schürmann
5ad52ac906
aco: create helpers to emit vop3p instructions
...
Also make get_alu_src() capable of returning
unswizzled multi-component SGPR sources.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:56 +00:00
Daniel Schürmann
036a369f46
aco: change usesModifiers() considering opsel_hi on packed instructions
...
opsel_hi == 1 means that the high operand selects the
high bits of the input, which is the normal behavior.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:56 +00:00
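The opsel_hi convention described in 036a369f46 means that a set bit is the default rather than a modifier. A rough sketch of a check built on that convention follows; the struct `PackedInstr`, its fields and the function `uses_opsel_modifiers` are hypothetical, and the real usesModifiers() may also consider other fields that are not modeled here.

```cpp
#include <cstdint>

// Hypothetical, simplified view of a packed (VOP3P-style) instruction.
struct PackedInstr {
    std::uint8_t opsel_lo;  // bit i set: low result half reads the HIGH half of operand i
    std::uint8_t opsel_hi;  // bit i set: high result half reads the HIGH half of operand i
    unsigned num_operands;  // number of source operands (assumed to be at most 3)
};

// Returns true if the operand selection differs from the default, which is
// opsel_lo == 0 and opsel_hi == 1 for every operand in use.
inline bool uses_opsel_modifiers(const PackedInstr& instr) {
    const std::uint8_t operand_mask =
        static_cast<std::uint8_t>((1u << instr.num_operands) - 1u);
    return (instr.opsel_lo & operand_mask) != 0 ||
           (instr.opsel_hi & operand_mask) != operand_mask;
}
```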
Daniel Schürmann
178b33c870
aco: allow SGPRs on every src position for VOP3P
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:56 +00:00
Daniel Schürmann
0db4263a3a
aco: allow constants/literals on every src position for VOP3P
...
and prevent literals on VOP3P pre-GFX10.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:56 +00:00
Daniel Schürmann
4a75a28698
aco/RA: fix subdword operands on VOP3P instructions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:55 +00:00
Daniel Schürmann
2caba08c1a
aco: fix VOP3P assembly, VN and validation
...
aco/opcodes: rename v_pk_fma_mix* -> v_fma_mix*
and add modifier capabilities for VOP3P.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6680 >
2021-01-13 17:46:55 +00:00
Samuel Pitoiset
3c1275ccae
radv: enable DCC for MSAA on GFX10+
...
It should work fine now.
This gives +1-2% improvements with Control MSAA (2x and 4x)
on Sienna.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8413 >
2021-01-13 17:24:31 +00:00
Boris Brezillon
0ad83e3361
pan/bi: Fix the !immediate case in bi_emit_store_vary()
...
The base offset was ignored, take it into account.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8469 >
2021-01-13 17:07:14 +00:00
Ilia Mirkin
f9237619d3
nouveau: trigger the current fence's work on destroy explicitly
...
Otherwise the delete yells at us that there's still work pending. This
isn't an actual problem, but annoying to see each time.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu >
Reviewed-by: Karol Herbst <kherbst@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8462 >
2021-01-13 16:59:18 +00:00
Thong Thai
4b208cc503
frontends/va: Return an error if non-interlaced buffer is not supported
...
Add a check to vaDeriveImage to see if a non-interlaced buffer was
created successfully. Otherwise, return an error, since we won't be able
to derive an image from the interlaced buffer.
Prevents a null pointer dereference from occurring on some nVidia cards,
reported by Alexander Kapshuk.
v2: Check for PIPE_VIDEO_CAP_SUPPORTS_PROGRESSIVE support (Ilia)
Fixes: fcb558321e ("frontends/va: Derive image from interlaced buffers")
Signed-off-by: Thong Thai <thong.thai@amd.com >
Tested-by: Alexander Kapshuk <alexander.kapshuk@gmail.com >
Reviewed-by: Leo Liu <leo.liu@amd.com >
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8320 >
2021-01-13 16:37:43 +00:00
Bas Nieuwenhuizen
4a783a3c78
radv: Use L2 coherency on GFX9+.
...
Especially on GFX10 we can avoid pretty much all L2 flushes.
However, instead of that we have to do L2_METADATA invalidations. We
do that every time we could possibly be reading new DCC/HTILE info
from the L2 cache in shaders.
Benchmark results, basemark on high preset with a navi10 on profile_standard
(which is slower than a navi10 on default settings, please don't compare
to random navi10 results you find)
before:
5932
5928
5937
after:
6011
6013
6009
So this looks like a >1% increase.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7202 >
2021-01-13 16:27:19 +00:00
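The ">1%" figure in 4a783a3c78 can be checked from the three runs listed before and after. The helpers below are a small sketch written for this check (the names `mean` and `percent_change` are not from the commit); they average each set of scores and compute the relative change.

```cpp
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of scores.
inline double mean(const std::vector<double>& scores) {
    return std::accumulate(scores.begin(), scores.end(), 0.0) /
           static_cast<double>(scores.size());
}

// Relative change from `before` to `after`, in percent.
inline double percent_change(double before, double after) {
    return (after - before) / before * 100.0;
}
```

Feeding in the scores from the commit message (5932, 5928, 5937 before; 6011, 6013, 6009 after) gives means of about 5932.3 and 6011.0, a gain of roughly 1.3%.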
Bas Nieuwenhuizen
0af86341a2
radv: Use L2 for CP DMA on GFX9+.
...
This enables assuming that the L2 is always up to date for barriers.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7202 >
2021-01-13 16:27:19 +00:00
Bas Nieuwenhuizen
8f8d72af55
radv: Use access helpers for flushing with meta operations.
...
This way we're properly using the Vulkan barrier paradigm instead
of ad hoc guessing what caches need to be flushed. This is more robust
for cache policy changes as we now don't have to revisit all the meta
operations all the time.
Note that a barrier has both a src and dst part though. So
barrier:
   flush src
   meta op
   flush dst
becomes
barrier:
   flush barrier src
   flush meta op dst
   meta op
   flush meta op src
   flush barrier dst
There are also some places where we've been able to replace a CB flush
with a shader flush because that is what we'd need according to Vulkan rules.
(It turns out that in the cases where the CB flush mattered, either the app
sets the bit in one of the relevant flushes, or the flush was needed as a
result of an optimization that we counteracted in the previous patch.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7202 >
2021-01-13 16:27:19 +00:00
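The ordering spelled out in 8f8d72af55 can be made concrete with a toy model that only records the sequence of steps. Everything here is hypothetical (`Trace`, `run_meta_op_in_barrier`, the string labels); it does not call or imitate any radv function, it just encodes the five-step order from the commit message.

```cpp
#include <functional>
#include <string>
#include <vector>

// Records the order in which flushes and the meta operation are issued.
using Trace = std::vector<std::string>;

// Hypothetical wrapper: runs `meta_op` inside a barrier, flushing for the
// barrier's src and dst as well as for the meta operation's own accesses.
inline Trace run_meta_op_in_barrier(const std::function<void(Trace&)>& meta_op) {
    Trace trace;
    trace.push_back("flush barrier src");  // make the app's prior writes available
    trace.push_back("flush meta op dst");  // make them visible to the meta op
    meta_op(trace);                        // the meta operation itself
    trace.push_back("flush meta op src");  // make the meta op's writes available
    trace.push_back("flush barrier dst");  // make them visible to later app work
    return trace;
}
```

The point of the model is that the meta operation is treated like any other access with its own src and dst sides, so it is bracketed by two flushes on each side rather than one.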
Bas Nieuwenhuizen
dba0a523a0
radv: Do dst invalidations for write accesses.
...
For write-after-write hazards.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7202 >
2021-01-13 16:27:19 +00:00
Bas Nieuwenhuizen
9026f10cda
radv: Invalidate CB on SHADER_WRITE for meta operations.
...
To cancel the optimization in radv_dst_access_flush if these helpers
get used by meta operations.
We could also remove that optimization, but I think this triggers less
often, as all SHADER_WRITE flushes on images not supporting STORAGE should
be meta operations.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7202 >
2021-01-13 16:27:19 +00:00
Bas Nieuwenhuizen
3d7713b5a2
radv: Remove redundant WB_L2 flush.
...
INV_L2 already does that.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7202 >
2021-01-13 16:27:19 +00:00
Alyssa Rosenzweig
275277a2b4
panfrost: Implement alpha testing natively
...
Native on Midgard only; we still have to lower on v6+. Passes Piglit
./fbo-mrt-alphatest (saving a cycle in the fragment shader to
compare/discard).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8447 >
2021-01-13 15:17:32 +00:00
Alyssa Rosenzweig
ff44f813fb
panfrost: Add alpha reference to XML
...
Midgard only, v6 dropped support.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8447 >
2021-01-13 15:17:32 +00:00
Alyssa Rosenzweig
7a6a5f3fe1
panfrost: Handle explicit primitive restart
...
Don't fall back. Passes piglit ./bin/primitive-restart on Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8447 >
2021-01-13 15:17:32 +00:00
Samuel Pitoiset
afad13700a
radv: disable VK_EXT_sample_locations again on GFX10+
...
I attempted to enable it for 21.0 with only 2x and 4x supported,
but there are new failures if DCC+MSAA is enabled.
Disable it again because DCC is more important than this feature and
no Mesa releases have it on GFX10+.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8472 >
2021-01-13 15:04:56 +00:00
Boris Brezillon
09bf6910b0
panfrost: Fix panfrost_afbc_format_needs_fixup()
...
This function returns true for PIPE_FORMAT_R8G8B8X8_UNORM, which is
wrong.
Fixes: 44217be921 ("panfrost: Adjust the format for AFBC textures on Bifrost v7")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com >
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8466 >
2021-01-13 14:01:42 +00:00
Samuel Pitoiset
001c1105f1
radv: enable DCC for mipmaps on GFX10+
...
Seems to work fine.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8468 >
2021-01-13 13:42:04 +00:00
Samuel Pitoiset
825e2386dc
radv: do not enable DCC for 3D images with mipmaps on GFX10+
...
This is broken for some reason, and probably rare enough not to
care about for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8468 >
2021-01-13 13:42:04 +00:00
Samuel Pitoiset
755a8313fc
radv: add support for fast-clearing DCC levels on GFX10+
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8468 >
2021-01-13 13:42:04 +00:00
Samuel Pitoiset
5537c9de73
radv: prevent fast-clearing uncompressed DCC levels
...
When size is 0, this means the level can't be compressed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8468 >
2021-01-13 13:42:04 +00:00