For Gen11 and prior, the dispatch mode for TCS was SINGLE_PATCH, and
this debug setting could be used to change it to 8_PATCH (falling back
to SINGLE_PATCH when shader couldn't be in the multi dispatch mode).
However after talking to Ken, seems this debug setting is not really
worth keeping around, so removing it.
For Gen12+ the only option is 8_PATCH, so it was always using that
dispatch mode as before.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18151>
For AMD we kinda have some modifiers with a max size ... (Which is
really a compositor/kms issue, but getting them to try kinda falls
into the unsolved "how to allocate/what pitch to use" bucket, so
we solve it on the allocating side)
Cc: mesa-stable
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Joshua Ashton <joshua@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18139>
LLVM 11 was released in October 2020. If you want to build against
Mesa's Git version, that seems like enough time to upgrade to at least
LLVM 11 (Debian stable has this too).
It reduces the amount of #if gates we need and more will be incoming
again, given the Opaque Pointer transition.
Additionally radeonsi is already requiring LLVM 11. Therefore the
minimum will have been LLVM 11 for many builds anyway.
Note that clc is kept to LLVM 10 for the time being.
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16047>
As explained in the previous commit, GFX9+ has issues with addressing
mipmaps in block-compressed images. In the case of copy commands, we fix
this by doing an extra copy for the missing blocks.
For GFX10, the mipmap layout in memory allows us to do better than that. We
can change the base level of the descriptor to one level bigger than the
requested level and adjust the extent and address to match. This is done by
ComputeNonBlockCompressedView in addrlib. Thus on GFX10 we can skip the
fixup copy workaround, and this will also fix cases outside of explicit
copy commands.
Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>
GFX9+ hardware has an issue where mipmap degradations are calculated
incorrectly due to using divide-by-two integer math and certain mipmap
sizes lose blocks.
This issue has been documented before, and we ported a workaround from
AMDVLK to increase the extent that is programmed into the descriptor, so
that the hardware arrives at the correct result. However, this is
insufficient as we cannot safely increase the extent beyond the physical
extent of the image in memory. If we can't increase it enough, the image
will still be missing blocks.
But there is still hope. In cases where RADV is responsible for copying to
or from an image (such as vkCmdCopyBufferToImage/vkCmdCopyImageToBuffer),
we can perform a second copy of the blocks that the hardware excluded so
that the resulting image is complete. This is another workaround from
AMDVLK.
This fixes corrupted textures in Halo: The Master Chief Collection.
v2: Add RADV_CMD_FLAG_INV_L2 | RADV_CMD_FLAG_INV_VCACHE to flush_bits
just in case (Samuel Pitoiset)
Closes: #3347
Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>
Among others, it would result in the spill offsets being wrong due to
being relative to the parent split and not absolute.
For example when computing a 64-bit multiply on Tesla (which only
supports 16-bit mul in hardware), the sources will first be split into
32-bit values and then a second time down to 16-bit ones. Looking at the
first source, the spill offsets ended being computed as follows:
{ .hihi = +2, .hilo = +0, .lohi = +2, .lolo = +0 }
instead of the expected
{ .hihi = +6, .hilo = +4, .lohi = +2, .lolo = +0 }
This is resolved with this patch.
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>
Allocations larger than the amount of VRAM available might be possible,
but has not been tested.
v4: Change it back to just VRAM size
v3: Increase the minimum from 32 MB to 128 MB, as OpenCL 1.x mandated at
least 128 MB.
v2: Further lower the allocation size.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Pierre Moreau <dev@pmoreau.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>
It's entirely redundant with radv_can_fast_clear_depth except that it's
missing a few checks around view masks and the layer range in the actual
clear rect and assumes you're always clearing the whole image view.
This isn't necessarily true, even for classic Vulkan render passes.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18084>
This commit fixes
dEQP-VK.rasterization.rasterization_order_attachment_access.format on
GFX9 because changing the layout for Vulkan feedback loops will trigger
a fast-clear eliminate. Though, the root cause is unrelated to that and
it's because the CMASK/FMASK initialization on GFX9 is currently broken
for TC-compatible images (there is a TODO somewhere).
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18084>
Even with dynamic rendering, you have to bind both aspects of the image
if the image contains both depth and stencil. One day, we may see this
restriction lifted but that will require deeper driver surgery into the
way we handle depth/stencil layouts.
Fixes: 42db590006 ("radv: convert the meta blit 2d path to dynamic rendering")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18084>
- dpb buffer created while receive the first begin_frame
moved out from encoder creation.
- dpb buffer name changed from cpb.
- dpb buffer allocation to support pre-encoding
- different vcn version IB package adjustment
- merge reconstructed_picture_v4_0_t to reconstructed_picture_t
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17994>
[why]
to enhance va video encoding quality
[how]
use va encoding quality_level interface, and provide
default value and encoding quality adjustment options,
so that users can finetune encoding quality and performance
from va quality interface. (limited to VCNs)
There are 3 settings added:
- preset modes: speed, balance, quality
they are using different encoding strategies
- vbaq modes:
vbaq mode is using variance based strategy
to improve the subjective image quality
- pre-encoding modes:
Using scaled down input image for pre-encoding to have
better rate-control reaction and consume more memory
in the same time. Only preencoding-4x mode is enabled.
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17994>