Extend the vkGetPhysicalDeviceFormatProperties2 cache to include
VkFormatProperties3 from the pNext chain. VkFormatProperties3 was
observed being always attached for DXVK and thus skipping the cache
if not handled.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28194>
For guest vram, there's already roundtrip to protect device memory alloc
ordering. This change adds the same protection for shmem used in below
scenarios and optimize to wait for new shmem only.
- reply shmem
- indirect upload shmem
- cmd stream shmem
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28147>
A roundtrip is to ensure a cmd via virtqueue happens on the renderer
side before a ring relies on it. Since venus is now with multi-ring, the
roundtrip submit and wait should belong to ring instead of instance, and
each ring owns its own roundtrip seqno to synchronize with virtqueue.
No behavior change.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28147>
Below is the common client pattern (app, angle, zink, etc):
- a few resets for queries to be used in this batch
- optional, depending on EXT_host_query_reset
- a few queries
- incremental
- can cross query pool boundary
The HW drivers normally have faster shader path when there are too many
individual reset and copies. Without further resolving, this ends up
with linear overhead on the 2d engines. This change has largely
optimized that:
- angle: many copies => 1 copy (or 2)
- zink: many resets and copies => 1 reset and 1 copy (or 2)
and again...some more renamings around
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28112>
Add a new free helper while renaming the alloc one as well. During query
record resolving, use a dropped list to store those records being reset.
This is to prepare for later further query record resolving.
This change also simplifies a query pool compare.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28112>
Per v1.3.279 spec "VUID-VkDeviceGroupSubmitInfo-commandBufferCount-00083
commandBufferCount must equal VkSubmitInfo::commandBufferCount"
When adding feedback, need to check for vkDeviceGroupSubmitInfo in the
SubmitInfo pNext to update their commandBufferCount and
pCommandBufferDeviceMasks to include feedback cmds.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28029>
Batches must be ignored if batch count is zero, so all batch inspections
have to be gated behind batch count. For memcpy, it's UB if either src
or dst is NULL even when size is zero.
Side note:
- For original commit, this fixes just the memcpy UB
- For current codes, this fixes to not skip ffb batch prepare
Fixes: 493a3b5cda ("venus: refactor batch submission fixup")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28071>
This is done by introduce idep_vulkan_lite_runtime, and only venus
depends on idep_vulkan_lite_runtime.
Modify the meson and source files to allow building venus without
the compiler.
See details Venus build metrics at the MR description.
gfxstream-vulkan forwards the shader to the host, and doesn't
need to convert into NIR in the guest. This results in faster
builds and less parts of Mesa to build. Also venus does the
same thing too, that's what the build is keyed on right now
as an in-tree user.
v7: By Yonggang Luo <luoyonggang@gmail.com>
Add idep_vulkan_common_entrypoints_h into vulkan_lite_runtime_deps because
vk_instance.c depends on idep_vulkan_common_entrypoints_h but vk_common_entrypoints is
not compiled in library `vulkan_lite_instance`.
Rename idep_vulkan_runtime_headers to idep_vulkan_lite_runtime_headers because
both lite/full runtime library depends on this, but lite should not depends on full
vk_meta_private.h added into vulkan_runtime_files
v6: By Yonggang Luo <luoyonggang@gmail.com>
get vulkan_lite_runtime_files and vulkan_runtime_files sorted
v5: By Yiwei Zhang <zzyiwei@chromium.org>
both vk_sampler and vk_ycbcr_conversion can stay in the lite runtime
v4: By Yonggang Luo <luoyonggang@gmail.com>
only build vk_instance.(c|h) twice for reduce compiling time
v3: By Yiwei Zhang <zzyiwei@chromium.org>
less code changes by introduce libvulkan_lite_runtime
v2: By Yonggang Luo <luoyonggang@gmail.com>
allow building Vulkan without libcompiler without compiling flags, the
venus is always built without libcompiler
v1: By Gurchetan Singh <gurchetansingh@google.com>
allow building Vulkan without libcompiler
Signed-off-by: Gurchetan Singh <gurchetansingh@google.com>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26574>
For submissions with an empty last batch containing no cmd buffers but
with semaphores as zink does, adding feedback to that batch would make
it no longer empty and increase submission overhead on some drivers.
Since feedback order is enforced by barriers, the feedback cmds can
instead be appended to the previous batch (if it exists) so that the
last batch remains empty.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27830>
1. move feedback helpers into vn_feedback
2. rename related structs, helpers, etc
3. only recycle wait semaphores is enough for the submission. Later we
can further optimize to only recycle each timeline sempahore once
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27758>
No behavior change, and below is the summary:
1. simplify to drop _timeline_ from semaphore feedback naming
2. update feedback structs to use obj_handle naming
3. for vn_feedback_cmd_pool, use fb_cmd_pool variable naming
4. for vn_feedback_buffer, use fb_buf variable naming
5. for query_feedback_cmd, use qfb_cmd variable naming (already use ffb)
6. s/submit_batches2/submit2_batches/
7. s/cmd_buffer_count/cmd_count/
8. use total_cmd_size instead of cmd_buffer_size if applicable
9. update vn_queue_submission's feedback_cmd_count to cmd_count
10. update setup time local feedback_cmd_count to extra_cmd_count
11. update feedback_event_cmd to event_feedback_cmd
12. other trivial renames
Most semaphore and query feedback cmd renamings are deferred to later
commits.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27758>
Previously we always put fence feedback cmd in a new batch appended,
which ends up with a separate execbuf for most drivers. This change
updates to avoid that separate eb except for empty submission with just
a feedback fence.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27649>
Cache for vkGetPhysicalDeviceImageFormatProperties2 as it is observed
to be called repeatedly with zink/proton layers.
Cache design is the same as the image requirements cache, generating
a hash key from pImageFormatInfo and storing pImageFormatProperties
into a hash table.
There are a couple differences though:
- VkResult gets cached when the query returns NOT_SUPPORTED.
- Unlike pMemoryRequirements that returns VkMemoryRequirements2 and
possibly VkMemoryDedicatedRequirements, VkImageFormatProperties2
has various pNext chains that can be optionally passed in. Hash
the existence of these pNext so that they are considered different
queries and the underlying pNext struct can be optionally cached.
The alternative would be to modify the query to always chain these
pNext so all of them would be cached, but it is unlikely for queries
to only differ in pNext chains.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27401>
This works around some Unity engine behaivor with ANGLE-on-Venus, when
cmd pools are created on main thread once while the render thread only
does descriptor pool creation for set allocations during recording time.
This change also explicitly forces async pipeline create for threads
creating the device instead of implicitly via feedback cmd pool create.
This ensures intended behavior when feedback is disabled.
Fixes: d17ddcc847 ("venus: dispatch background shader tasks to secondary ring")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27347>