Sparse resources (buffers and images) own their address ranges, so
when creating a sparse resource we allocate an address range large
enough to fit the resource. The address range has to start and end
at page boundaries, as that's the memory mapping granularity.
At destruction time, we unmap all memory mapped within the range,
as Vulkan doesn't require the user to unmap memory themselves and
we don't want to retain references to BOs for longer than
necessary. Finally, we free the address range that was previously
owned by the resource.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35287>
We want to be able to survive accesses to, from Vulkan's
perspective, unmapped regions of sparse resources. For that we
allocate a single page -sized bo, which we'll use to implement
sparse unmapping by mapping the address range to this bo.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35287>
Since the kmsro option was removed, it is now just built together with a
list of gallium OpenGL drivers that require it.
On a Vulkan-only build with zink for OpenGL, kmsro is still required for
some wsi paths for those platforms, but it is no longer possible to
explicitly enable it without a gallium OpenGL driver to pull it.
This enables kmsro when zink is enabled to allow the Vulkan-only use
case in those platforms.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37459>
This commit keeps vkcts as a nightly job, but this puts us in shooting
distance to what we've been working for for the past 2.5 years!
We will flip the switch to making this job part of the merge pipeline
after a week of stress testing to make sure reliability issues,
especially around USB, don't come back to haunt my days and nights.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37367>
If the application really thinks it needs pswave32, let it use it.
Fragment shaders also have no concept of full subgroups, so the existing
code that chooses the subgroup size will work already.
For pre raster stages, we cannot allow this because of potential mismatches
in merged stages.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37360>
When writing to a depth texture the driver is first doing a decompress
blit to a stageing resource. On one hand this blit can be skipped, if
PIPE_MAP_DISCARD_WHOLE_RESOURCE is set, OTOH we need to clear the
PIPE_MAP_UNSYNCHRONIZED flag if a partial write is done, because we have to
wait until the blit is finished.
v2: Update the patch with a more targeted approach.
Fixes: 25b97a3a96 ("mesa/st: mark internal texture map calls as UNSYNCHRONIZED")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13916
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37457>
Use FD20 macro that will account for the implicit LSB zero value and is
already used for sources. For the new macro we need to use the entire
bit-range of the field (55-51), so remove the adjustments we used to
do prior to encoding and decoding.
Fixes assertion in vkpeak (https://github.com/nihui/vkpeak) when running
bf16 tests on BMG. And the code now will correctly apply the subreg_nr
to the destination, e.g. a mad(32) gets splitted into two pieces, the
generation would not fill out the upper-part of the register
```
mad(16) g13<1>BF g10<8,8,1>BF g12<8,8,1>BF g56<1,1,1>F { align1 1H A@5 };
-mad(16) g13<1>BF g10.16<8,8,1>BF g12.16<8,8,1>BF g57<1,1,1>F { align1 2H A@5 };
+mad(16) g13.16<1>BF g10.16<8,8,1>BF g12.16<8,8,1>BF g57<1,1,1>F { align1 2H A@5 };
```
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37236>
When CPU clock is the same with the authoritative trace clock (normally
default to CLOCK_BOOTTIME), perfetto drops the non-monotonic snapshots
to ensure validity of the global source clock in the resolution graph.
When they are different, the clocks are marked invalid and the rest of
the clock syncs will fail during trace processing.
There's no central daemon emitting consistent snapshots for
synchronization between CPU and GPU clocks on behalf of renderstages and
counters producers. The sequence-scoped clock (64 <= ID < 128) is unique
per producer + writer pair within the tracing session. So we can use
sequence-scoped clock for gpu clock whenever applicable, and fallback to
use global clock for dynamic minor allocated >= 192.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37425>
In short, perfetto doesn't require the initial clock snapshot to be
earlier than the timestamp to be converted. So we don't have to do
complex handling for it.
With this change:
- renderstage event requires clock sync, so we'd only emit clock
snapshots on the traceq thread that handles the callbacks
- drops redundant sync_timestamp calls as well as sync_gpu_ts tracking
- no need to reset next_clock_sync_ns when tracing is disabled, since a
snapshot is always emitted right after the initial interned data emit
upon tracing start
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37425>
The object name is part of the VkDebugUtilsObjectName event messages.
When the trace buffer is full and the ring buffer fill policy is chosen,
the debug obj events can be overwritten (lost), which is why we need the
RefreshSetDebugUtilsObjectNameEXT.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37425>