I wanted to find slow pieces of code in our Anv driver using our
drm-shim stub.
The last bit of code still talking to the compositor was the WSI
swapchain code, which was failing because none of the submissions
actually take place (because of the stub).
This change introduces a new environment variable,
MESA_VK_WSI_HEADLESS_SWAPCHAIN, which when set turns every swapchain
creation into a headless swapchain. This swapchain does not present
anything, allowing the application to spin as many frames as
possible, which helps to identify slow spots in the command buffer
building path.
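A minimal sketch of how such an environment switch could be evaluated. Only the variable name comes from this change; the helper names, the SwapchainKind enum and the rule that unset, empty, "0" and "false" mean "off" are assumptions made for illustration, not the actual WSI code.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: interpret an environment variable's value as a flag.
// Assumption: unset, empty, "0" and "false" mean "off", anything else "on".
bool flag_value_enabled(const char *value)
{
   if (value == nullptr || value[0] == '\0')
      return false;
   return std::strcmp(value, "0") != 0 && std::strcmp(value, "false") != 0;
}

// Hypothetical helper: look the flag up in the process environment.
bool env_flag_enabled(const char *name)
{
   assert(name != nullptr);
   return flag_value_enabled(std::getenv(name));
}

// Hypothetical stand-in for the kind of swapchain that gets created.
enum class SwapchainKind { Windowed, Headless };

// Decide from an already-fetched value, so the logic is easy to test.
SwapchainKind pick_swapchain_kind(const char *env_value)
{
   return flag_value_enabled(env_value) ? SwapchainKind::Headless
                                        : SwapchainKind::Windowed;
}

// Decide from the real environment at swapchain creation time.
SwapchainKind pick_swapchain_kind_from_env()
{
   return env_flag_enabled("MESA_VK_WSI_HEADLESS_SWAPCHAIN")
             ? SwapchainKind::Headless
             : SwapchainKind::Windowed;
}
```

Keeping the value parsing separate from the getenv() call is only a convenience for testing the sketch.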
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6156>
nir->info.subgroup_size can be set to an enum value rather than a size:
SUBGROUP_SIZE_VARYING = 0
SUBGROUP_SIZE_UNIFORM = 1
SUBGROUP_SIZE_API_CONSTANT = 2
SUBGROUP_SIZE_FULL_SUBGROUPS = 3
So compute the API subgroup size value and compare it to the dispatch
size to determine whether we need some bound checking.
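A rough sketch of the idea, not the actual backend code. The four enum values are the ones listed above. The API-constant size of 32, the rule that the other modes follow the dispatch width, and the function names are assumptions made for illustration.

```cpp
#include <cassert>

// Enum values as listed in the commit message.
enum SubgroupSize {
   SUBGROUP_SIZE_VARYING = 0,
   SUBGROUP_SIZE_UNIFORM = 1,
   SUBGROUP_SIZE_API_CONSTANT = 2,
   SUBGROUP_SIZE_FULL_SUBGROUPS = 3,
};

// Assumption for this sketch: the size reported to the API in "constant" mode.
constexpr unsigned kApiConstantSubgroupSize = 32;

// Hypothetical helper: translate the enum into an actual subgroup size.
unsigned api_subgroup_size(SubgroupSize mode, unsigned dispatch_width)
{
   assert(dispatch_width == 8 || dispatch_width == 16 || dispatch_width == 32);
   switch (mode) {
   case SUBGROUP_SIZE_API_CONSTANT:
      return kApiConstantSubgroupSize;
   case SUBGROUP_SIZE_VARYING:
   case SUBGROUP_SIZE_UNIFORM:
   case SUBGROUP_SIZE_FULL_SUBGROUPS:
      // Assumption: these modes expose the hardware dispatch width.
      return dispatch_width;
   }
   return dispatch_width;
}

// An invocation index can only exceed the dispatch width when the size
// seen by the API is larger than what the hardware actually dispatches.
bool needs_invocation_bound_check(SubgroupSize mode, unsigned dispatch_width)
{
   return api_subgroup_size(mode, dispatch_width) > dispatch_width;
}
```

Comparing the raw enum value (for example 2) against a dispatch width of 8 would never report that a bound check is needed, which is the kind of mistake the translation step avoids.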
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9ac192d79d ("intel/fs: bound subgroup invocation read to dispatch size")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21856>
I'm not planning to stand mesa-swrast back up until we get Kata set up, so
turn the testing back on at a reduced fraction so that
venus/llvmpipe/etc. dev can still get some coverage.
I haven't turned lavapipe back on, because it is now unstable in memory
model / atomics tests.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21880>
For instance, to load uniform data with the LSC we usually rely on
transpose messages which have to execute in SIMD1. Those end up being
considered as partial writes, so within loops their live range spreads
to the whole loop, increasing register pressure.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21867>
on pipeline bind with dynamic state, depth_clip_near needs to be set either by
* applying the dynamic state
* using the pipeline state
the previous code always used the pipeline state
fixes:
dEQP-VK.pipeline.*.extended_dynamic_state.between_pipelines.depth_clamp_enable
Fixes: 650880105e ("vulkan,lavapipe: Use a tri-state enum for depth clip enable")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21814>
if it's known that a renderpass is active and the driver wants to do
renderpass optimizing, help out by not forcing a sync and instead doing
what the driver would do: create a staging buffer and copy it to the
image
this requires that the driver already handles buffer -> image copies
with resource_copy_region
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21801>
It is legal to pass nullptr as the instance to vkGetInstanceProcAddr
when resolving global addresses. This wasn't handled correctly, and
an illegal access to a member of a null struct was made.
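A minimal sketch of the null-safe lookup order, assuming a hypothetical Instance struct and stub entrypoints. The command names are real Vulkan commands, but the table and the has_surface_ext member are invented for illustration and do not reflect the driver's dispatch code.

```cpp
#include <cassert>
#include <cstring>

using VoidFn = void (*)();

// Hypothetical instance object; the real driver struct is much larger.
struct Instance {
   bool has_surface_ext;
};

// Stub entrypoints standing in for real implementations.
void create_instance_stub() {}
void enumerate_extensions_stub() {}
void destroy_surface_stub() {}

VoidFn get_instance_proc_addr(const Instance *instance, const char *name)
{
   assert(name != nullptr);

   // Global commands must resolve even when instance is null.
   if (std::strcmp(name, "vkCreateInstance") == 0)
      return create_instance_stub;
   if (std::strcmp(name, "vkEnumerateInstanceExtensionProperties") == 0)
      return enumerate_extensions_stub;

   // Everything else needs a valid instance: check before dereferencing.
   if (instance == nullptr)
      return nullptr;

   if (std::strcmp(name, "vkDestroySurfaceKHR") == 0 &&
       instance->has_surface_ext)
      return destroy_surface_stub;

   return nullptr;
}
```

The point of the ordering is that no member of the instance is read until the null check has passed.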
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21827>
This commit rewrites the KGSL backend to utilize vk common wherever
possible to bring the codebase in line with DRM while implicitly
fixing minor API bugs that may have occurred as a result of manually
implementing VK functions.
As a part of moving to vk common, KGSL sync is now implemented
atop vk common sync and vastly expanded in terms of functionality
such as:
* Import/Export of sync FDs - A required capability for properly
supporting the Android WSI. As these functions were stubbed, a
presentation operation that used semaphores would leak the
imported FDs, since the driver is expected to close them. It also
caused UB due to ignoring the imported FD or not exporting a
valid FD.
* Supporting pre-signalled fences - Vulkan allows fences to be
created in a signalled state. This was previously stubbed, which
can lead to UB.
* Timeline semaphore support - As a result of utilizing vk common
as the backbone for synchronization, its timeline semaphore
emulation has been utilized to provide support for them without
needing kernel support. (Note: On newer versions of KGSL,
timeline semaphores can be implemented natively rather than
using emulation as they support wait-before-signal)
Fixes freezes due to semaphore usage with presentation on:
* Genshin Impact
* Skyline Emulator
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21651>
Originally if we had an anonymous field (ie. field declared as part of
the register definition itself) the name in the generated field struct
would include the gen prefix (ie. .a6xx_rb_stencil_buffer_pitch), but
this doesn't work for variants because the variant regs would have
different gen prefixes. Fix this by using reg name instead of the
full_name.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21846>
For regs with multiple variants, generate a template'ized function to
pack the reg value. If the template param is known at compile time
(which is the expected usage) this will optimize to the same thing as
the "traditional" reg packing.
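A small sketch of what a templated pack function can look like. The chip enum, the field struct and both bit layouts are invented for illustration and do not match any real register or the generator's actual output.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical chip generations used as the template parameter.
enum chip { A6XX = 6, A7XX = 7 };

// Hypothetical field struct for an example register.
struct ExampleRegFields {
   uint32_t pitch;
   bool enable;
};

// One pack function per register, parameterized on the chip generation.
// CHIP is a compile-time constant, so the compiler can fold the branch
// away and emit only the layout for the requested variant.
template <chip CHIP>
constexpr uint32_t pack_example_reg(ExampleRegFields f)
{
   if (CHIP == A6XX) {
      // Invented a6xx layout: pitch in bits 0..11, enable in bit 31.
      return (f.pitch & 0xfffu) | (f.enable ? (1u << 31) : 0u);
   }
   // Invented a7xx layout: enable in bit 0, pitch in bits 4..15.
   return ((f.pitch & 0xfffu) << 4) | (f.enable ? 1u : 0u);
}
```

A caller that knows the generation at compile time would write something like pack_example_reg<A7XX>({0x10, true}).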
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21846>
They have more similarities than differences, so merge them and use
"variant" attribute as needed to manage differences.
Note: initially "variant" is used conservatively when it comes to regs
known on a7xx but not a6xx. It could be that they also exist on later
versions of a6xx. For example, LPAC related regs/bits likely existed
on later a6xx (eg. a660 family) but BV stuff did not.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21846>
To merge a7xx and a6xx regs, using variant property to manage the
differences, we'll want regs/etc to be named according to the first
generation they are used in rather than the domain name. Add a new
prefix type to accomplish this. By default, if no variant property, things
will still be named based on domain (ie. REG_A6XX_...), and things
that have variant="A6XX" will also end up as they currently are
(since the chip enum matches domain name), but things that have
variant="A7XX" will end up as REG_A7XX_...
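The naming rule can be sketched as below. The helper name and the register name in the usage note are hypothetical, and the sketch assumes the variant is a single chip name rather than a range.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the rule described above:
// with no variant, name after the domain; otherwise after the variant.
std::string reg_define_name(const std::string &domain,
                            const std::string &variant,
                            const std::string &reg)
{
   assert(!domain.empty() && !reg.empty());
   const std::string &prefix = variant.empty() ? domain : variant;
   return "REG_" + prefix + "_" + reg;
}
```

With a made-up register RB_EXAMPLE in the A6XX domain, an empty variant or variant "A6XX" both give REG_A6XX_RB_EXAMPLE, while variant "A7XX" gives REG_A7XX_RB_EXAMPLE.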
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21846>
Clang seems more relaxed about this, allowing C99 style initializers
without requiring ordering. But unfortunately g++ is more picky :-/
TODO this doesn't completely fix everything with g++, namely sparse
array initialization. For ir3 driver-params, I think we can convert
these to structs. But there are still one or two others to deal with.
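For reference, a small sketch of the ordering constraint. The struct and its fields are hypothetical. C++20 designated initializers must name members in declaration order, which is the form that both compilers accept.

```cpp
#include <cassert>

// Hypothetical struct; members are declared in this order.
struct DrawParams {
   unsigned first_vertex;
   unsigned vertex_count;
   unsigned instance_count;
};

DrawParams make_draw_params(unsigned count)
{
   assert(count > 0);
   // Designators follow declaration order. Writing .instance_count before
   // .first_vertex is fine in C99 but is rejected by g++ when built as C++.
   return DrawParams{
      .first_vertex = 0,
      .vertex_count = count,
      .instance_count = 1,
   };
}
```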
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21846>
C++ is more picky about a goto jumping over variable initialization,
even if unused after the goto label (presumably because of destructors
that can be called after a variable goes out of scope). Since there is
only a single fallback path, get rid of the goto.
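A minimal sketch of the restructuring, with hypothetical function names. The commented-out form is the pattern C++ rejects, because the goto jumps past the initialization of a variable that is still in scope at the label.

```cpp
#include <cassert>

// Hypothetical fallback used when the fast path is not available.
int slow_path(int x)
{
   return x + 100;
}

// Rejected by C++ compilers (jump crosses initialization of 'scaled'):
//
//    if (!fast_supported)
//       goto fallback;
//    int scaled = x * 2;
//    return scaled;
// fallback:
//    return slow_path(x);

// With a single fallback path, calling it directly removes the goto.
int process(int x, bool fast_supported)
{
   if (!fast_supported)
      return slow_path(x);

   int scaled = x * 2;
   return scaled;
}
```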
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21846>