anv, drirc: Add workaround to speed up Cyberpunk 2077 reg allocation

Calling the ra_allocate function after every single register spill can
take several minutes. This option speeds up shader compilation by spilling
more registers after each ra_allocate failure. It is required for
Cyberpunk 2077, which uses a watchdog thread to terminate the process
if the render thread hasn't responded within 2 minutes.

Execution time of my Cyberpunk2077 shader compilation test:
https://gitlab.freedesktop.org/illia.a.polishchuk/cyberpunk-vulkan-compute-hang-test-anv

Before the patch:

real 1m28,738s
user 1m28,329s
sys 0m0,400s

After the patch:

real 0m33,245s
user 0m32,835s
sys 0m0,404s

I think this patch is acceptable because Cyberpunk benchmarks show
the same FPS with and without it. (I ran the unpatched case using
a game binary patched to disable the watchdog thread.)

Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Requires: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24228
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9241
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24299>
Illia Polishchuk
2023-07-24 15:31:53 +03:00
committed by Marge Bot
parent 739e21fa9a
commit 56e0aff530
4 changed files with 18 additions and 1 deletions
+3
@@ -79,6 +79,7 @@ static const driOptionDescription anv_dri_options[] = {
DRI_CONF_ANV_QUERY_CLEAR_WITH_BLORP_THRESHOLD(6)
DRI_CONF_ANV_QUERY_COPY_WITH_SHADER_THRESHOLD(6)
DRI_CONF_ANV_FORCE_INDIRECT_DESCRIPTORS(false)
DRI_CONF_SHADER_SPILLING_RATE(0)
DRI_CONF_SECTION_END
DRI_CONF_SECTION_DEBUG
@@ -1373,6 +1374,8 @@ anv_physical_device_try_create(struct vk_instance *vk_instance,
device->compiler->indirect_ubos_use_sampler = device->info.ver < 12;
device->compiler->extended_bindless_surface_offset = device->uses_ex_bso;
device->compiler->use_bindless_sampler_offset = !device->indirect_descriptors;
device->compiler->spilling_rate =
driQueryOptioni(&instance->dri_options, "shader_spilling_rate");
isl_device_init(&device->isl_dev, &device->info);
device->isl_dev.buffer_length_in_aux_addr = true;
+3
@@ -712,6 +712,9 @@ anv_pipeline_hash_common(struct mesa_sha1 *ctx,
const bool rba = device->robust_buffer_access;
_mesa_sha1_update(ctx, &rba, sizeof(rba));
const int spilling_rate = device->physical->compiler->spilling_rate;
_mesa_sha1_update(ctx, &spilling_rate, sizeof(spilling_rate));
}
static void
+9
@@ -1079,6 +1079,15 @@ TODO: document the other workarounds.
<application name="Cyberpunk 2077" executable="Cyberpunk2077.exe">
<option name="force_vk_vendor" value="-1" />
</application>
<application name="Cyberpunk 2077" executable="Cyberpunk2077.exe">
<!--
Cyberpunk 2077 uses a watchdog thread to terminate
the process in case the render thread hasn't responded within 2 minutes.
This option speeds up shader compilation.
See: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9241
-->
<option name="shader_spilling_rate" value="15" />
</application>
<!--
Disable 16-bit feature on zink and angle so that GLES mediump doesn't
lower to our inefficent 16-bit shader support. No need to do so for
+3 -1
@@ -411,7 +411,9 @@
DRI_CONF_OPT_B(mesa_no_error, def, \
"Disable GL driver error checking")
#define DRI_CONF_SHADER_SPILLING_RATE(def) \
DRI_CONF_OPT_I(shader_spilling_rate, def, 0, 100, \
"Speed up shader compilation by increasing number of spilled registers after ra_allocate failure")
/**
* \brief Miscellaneous configuration options
*/