intel/i915: give up the execbuf ioctl after ~16s of ENOMEMs

If nothing has freed memory by that point, return the error, which
may make the upper layers report the device as lost. The system could
be swapping very heavily, in which case waiting a little longer might
let the ioctl succeed, but let's try 16s for now.
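
The ~16s figure can be checked by summing the per-retry sleeps. The
sketch below is only an illustration and not part of the patch: the
helper name total_backoff_us is hypothetical, and it assumes that
os_time_sleep() takes its argument in microseconds and that every one
of the attempts fails with ENOMEM.

```c
#include <stdint.h>

/* Hypothetical helper, not part of Mesa: total time slept, in
 * microseconds, when all max_retries attempts fail and the sleep after
 * attempt number `retries` is 100 * retries * retries microseconds
 * (assuming os_time_sleep() takes microseconds).
 */
static uint64_t
total_backoff_us(unsigned max_retries)
{
   uint64_t total = 0;

   for (unsigned retries = 0; retries < max_retries; retries++)
      total += 100ull * retries * retries;

   return total;
}
```

Under those assumptions, total_backoff_us(80) is 16,748,000 us, or
about 16.7s, and the last single sleep (retries == 79) is about 0.62s.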

v2: Bring down the timeout from ~60s to ~16s (José).

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37559>
This commit is contained in:
Paulo Zanoni
2025-10-01 14:44:00 -07:00
committed by Marge Bot
parent 7b1e9af900
commit dc1877a0a1

@@ -146,21 +146,21 @@ static inline int
 i915_gem_execbuf_ioctl(int fd, const struct intel_device_info *info,
                        struct drm_i915_gem_execbuffer2 *execbuf)
 {
-   int ret, retries = 0;
+   int ret, retries;
    assert((execbuf->flags & I915_EXEC_FENCE_OUT) == 0);
    if (unlikely(info->no_hw))
       return 0;
-   while (true) {
+   /* After 80 retries, we spent more than 16s sleeping. */
+   for (retries = 0; retries < 80; retries++) {
       ret = intel_ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf);
       if (likely(!(ret && errno == ENOMEM)))
          break;
       os_time_sleep(100 * retries * retries);
-      retries++;
    }
    return ret;