If we move the blob-cache path into the async queue, then
path_init_failed is no longer a good way to check if puts
should be a no-op. But fortunately checking if the queue
is initialized is, and is a more obvious check because
what it is guarding is a util_queue_add_job().
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22248>
When combining of ro+rw caches is enabled, at first the ro cache should be
looked up and if data isn't found there then rw cache should be checked.
The rw cache checking got lost by accident after the code rebase and there
was no unit test covering this condition. Fix the rw cache looking up and
add the unit test case.
Fixes: 32fe60e8c4 ("util/disk_cache: Support combined foz ro and non-foz rw caches")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Juston Li <justonli@google.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20595>
Mesa utilizes only one type of cache at a time. This patch enables support
for combined reading from read-only Fossilize cache + non-foz read-write
caches.
From now on, a non-foz read-write caches will first try to retrieve data
from a read-only foz cache if new MESA_DISK_CACHE_COMBINE_RW_WITH_RO_FOZ
environment variable is set to true, otherwise the caching behaviour is
unchanged. The new flag has no effect when MESA_DISK_CACHE_SINGLE_FILE=1,
i.e. when the single-file foz cache is used.
This change allows us to ship a prebuilt RO caches for a certain
applications, while the rest of applications will benefit from the
regular RW caching that supports cache-size limitation. This feature
will be used by ChromeOS.
Usage example #1:
MESA_DISK_CACHE_DATABASE=0
MESA_DISK_CACHE_SINGLE_FILE=0
MESA_DISK_CACHE_COMBINE_RW_WITH_RO_FOZ=1
MESA_DISK_CACHE_READ_ONLY_FOZ_DBS=rocache1,rocache2
Usage example #2:
MESA_DISK_CACHE_DATABASE=1
MESA_DISK_CACHE_SINGLE_FILE=0
MESA_DISK_CACHE_COMBINE_RW_WITH_RO_FOZ=1
MESA_DISK_CACHE_READ_ONLY_FOZ_DBS=rocache1,rocache2
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18551>
Android's implementation of the blob-cache get/put funcs do not
implement any compression. And the default cache size is rather small,
at 2MB (!!) per app (although I assume everyone patches android to
increase the size limit).
We don't bother compressing the has_key/put_key path, since that path is
only storing a uint32_t.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19387>
Introduce new cache type, the Mesa-DB. This is a single-file read/write
cache that is based on the read-only Fossilize DB cache. Mesa-DB supports
cache size capping. It's a much more efficient cache than the multi-file
cache because Mesa-DB doesn't have the inode overhead. The plan is to make
Mesa-DB the default cache implementation once it will be deemed as stable
and well tested. For now users have to set the new MESA_DISK_CACHE_DATABASE
environment variable in order to active the Mesa-DB cache.
Mesa-DB cache is resilient to corrupted cache files and doesn't require
maintenance from users and developers. The size capping is implemented
by evicting least recently used cache items and compacting the cache
database files with the evicted entries. In order to prevent frequent
compaction of the cache, at minimum a half of cache is evicted when cache
is full.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16888>
The shader cache files produced by multi-file cache have a minimum size
limitation imposed by filesystem, usually it's 4KB. In order to make
cache tests suitable for a single file caches, we need to bypass the data
compression, making size of written out data determined for the eviction
testing.
The disk_cache_create() now checks whether driver_id name is set to
"make_check_uncompressed" in order to disable cache data compression.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16888>
Instead of spawning 4 threads when the cache is created,
spawn 1 and let u_queue grow the number of threads if
needed.
I wrote this patch because when running piglit's quick_shader
profile I had lots of samples in disk cache threads - mostly
in native_queued_spin_lock_slowpath kernel function.
Since these tests shouldn't really stress the cache, I assumed
it was caused only by thread creations.
After writing the patch and redoing the measurement, I got an
improvement but I still more hits in the same function for
shader_runner:$disk0 thread so something was wrong.
After digging more, I found out that my shader cache index was
corrupted: the on-disk size was 29MB but the index reported it
was way more than 1GB. So each disk cache thread was spending
a lot of time trying to evict files. Given that my cache had
a really low count of files, the LRU method based on randomly
generating subfolder names failed, so evicting was very slow.
Now that my cache index is fixed, the disk cache threads are
mostly idle but I still think it makes sense to grow the
number of threads instead of spawning 4 at the program start.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11296>
When the MESA_DISK_CACHE_SINGLE_FILE environment variable is set
we make use of the new single file shader cache implementation.
The new cache uses the following directory structure based on the
first defined name as follows:
$MESA_GLSL_CACHE_DIR/driver_id/gpu_name/foz_cache.foz
$MESA_GLSL_CACHE_DIR/driver_id/gpu_name/foz_cache_idx.foz
$XDG_CACHE_HOME/mesa_shader_cache_sf/driver_id/gpu_name/foz_cache.foz
$XDG_CACHE_HOME/mesa_shader_cache_sf/driver_id/gpu_name/foz_cache_idx.foz
<pwd.pw_dir>/.cache/mesa_shader_cache_sf/driver_id/gpu_name/foz_cache.foz
<pwd.pw_dir>/.cache/mesa_shader_cache_sf/driver_id/gpu_name/foz_cache_idx.foz
Where foz_cache_idx.foz is a database of offsets pointing to the location of
the shader cache entries in foz_cache.foz
This initial implementation doesn't have any max cache size handling and is
initially intended to be use by applications such as steam that will handle
cache management for us.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7725>
We need the disk cache enabled in Android to get EGL_ANDROID_blob_cache's
callbacks called, but we don't actually want to store anything on disk.
Fixes "Failed to create //.cache for shader cache (Read-only file
system)---disabling." spam on init.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6762>
Remove filename ralloc comment. filename is allocated by asprintf.
Clean up disk_cache_get dead code left over from 367ac07efc
("disk_cache: move cache item loading code into
disk_cache_load_item() helper").
Fix defect reported by Coverity Scan.
Logically dead code (DEADCODE)
dead_error_line: Execution cannot reach this statement:
free(filename);
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6738>
This allows ZSTD instead of ZLIB to be used for compressing the shader
cache.
On a 72 core system emulating skl with a full shader-db (with i965):
ZSTD:
1915.10s user 229.27s system 5150% cpu 41.632 total (cold cache)
225.40s user 10.87s system 3810% cpu 6.201 total (warm cache)
154M (235M on disk)
ZLIB:
2231.33s user 194.24s system 1899% cpu 2:07.72 total (cold cache)
229.15s user 10.63s system 3906% cpu 6.139 total (warm cache)
163M (244M on disk)
Tim Arceri sees (8 core ryzen and a full shader-db):
ZSTD:
2505.22 user 40.50 system 3:18.73 elapsed 1280% CPU (cold cache)
418.71 user 14.93 system 0:46.53 elapsed 931% CPU (warm cache)
454.3 MB (681.7 MB on disk)
ZLIB:
3069.83 user 40.02 system 4:20.13 elapsed 1195% CPU (cold cache)
425.50 user 15.17 system 0:46.80 elapsed 941% CPU (warm cache)
470.3 MB (701.4 MB on disk)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
If there are queued shaders to be written to disk, wait for that.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This makes use of the total job size limiting feature added in the
previous patch.
The idea is to avoid an excessive build up in memory use due to the
use of both the UTIL_QUEUE_INIT_RESIZE_IF_FULL and
UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY flags.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
When both UTIL_QUEUE_INIT_RESIZE_IF_FULL and
UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY are set, we can get into a
situation where the queue never executes and grows to a huge size
due to all other threads being busy.
This is the case with the shader cache when attempting to compile a
huge number of shaders up front. If all threads are busy compiling
shaders the cache queues memory use can climb into the many GBs
very fast.
The use of these two flags with the shader cache is intended to
allow shaders compiled at runtime to be compiled as fast as possible.
To avoid huge memory use but still allow the queue to perform
optimally in the run time compilation case, we now add the ability
to track memory consumed by the jobs in the queue and limit it to
a hardcoded 256MB which should be more than enough.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Since we set the UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY flag this should
have little impact on low core systems. However just about all modern
CPUs currently available that run Mesa have *at least* 4 cores. For
these CPUs allowing more threads can result in the queue being
processed faster and avoid excessive memory use due to a backlog of
cache entrys building up in the queue.
This change helps avoid a huge build up of cache entrys in the queue
due to using both the UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY and
UTIL_QUEUE_INIT_RESIZE_IF_FULL flags.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
There are multiple `goto path_fail` with an open fd, but none that go to
`fail:` without going through `path_fail:` first, so let's just move the
`close(fd)` there.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>