From 419b02c90c822e6ea89c7fae951fd19d7db20181 Mon Sep 17 00:00:00 2001
From: Jason Ekstrand
Date: Wed, 27 Oct 2021 01:40:36 -0500
Subject: [PATCH] anv,iris: Advertise a max 3D workgroup size of 1024^3
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On GFX version 12.5+ with COMPUTE_WALKER, this is the limit based on
the size of the HW packet. On older HW, we can technically go a bit
bigger but there's not much point. Technically, some hardware can
support a scalar workgroup size up to 2048 but most apps don't go any
bigger than 1024.

As discussed on the merge request page, the current limit assumes
SIMD32, but it is unclear if we want to encourage applications to use
SIMD32 if it may lead to additional register spilling in shader
programs. Many applications have likely tuned for a limit of 1024
based on the OpenGL minimum limit, so it might not gain much by
advertising more than 1024.

Reworks:
 * Jordan: Use MIN2 and limit total invocations as well.
 * Jordan: Add second paragraph to commit message based on merge
   request discussion.

Reviewed-by: Tapani Pälli
Reviewed-by: Lionel Landwerlin
Reviewed-by: Jordan Justen
Signed-off-by: Jordan Justen
Part-of:
---
 src/gallium/drivers/iris/iris_screen.c | 3 ++-
 src/intel/vulkan/anv_device.c          | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/iris/iris_screen.c b/src/gallium/drivers/iris/iris_screen.c
index 44113d91410..5e7e5dd9f19 100644
--- a/src/gallium/drivers/iris/iris_screen.c
+++ b/src/gallium/drivers/iris/iris_screen.c
@@ -527,7 +527,8 @@ iris_get_compute_param(struct pipe_screen *pscreen,
    struct iris_screen *screen = (struct iris_screen *)pscreen;
    const struct intel_device_info *devinfo = &screen->devinfo;
 
-   const uint32_t max_invocations = 32 * devinfo->max_cs_workgroup_threads;
+   const uint32_t max_invocations =
+      MIN2(1024, 32 * devinfo->max_cs_workgroup_threads);
 
 #define RET(x) do { \
    if (ret) \
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 0c903deb6a8..af6e261cc9e 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1817,7 +1817,8 @@ void anv_GetPhysicalDeviceProperties(
       pdevice->has_bindless_images && pdevice->has_a64_buffer_access ?
       UINT32_MAX : MAX_BINDING_TABLE_SIZE - MAX_RTS - 1;
 
-   const uint32_t max_workgroup_size = 32 * devinfo->max_cs_workgroup_threads;
+   const uint32_t max_workgroup_size =
+      MIN2(1024, 32 * devinfo->max_cs_workgroup_threads);
 
    VkSampleCountFlags sample_counts =
       isl_device_get_sample_counts(&pdevice->isl_dev);
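
For illustration only, the clamp that both hunks apply can be sketched as a standalone C function. This is not part of the patch: the helper name clamped_max_invocations is hypothetical, the MIN2 macro below is a local stand-in defined so the sketch compiles on its own, and any thread counts passed to it are assumed values rather than figures from a real device.

```c
#include <stdint.h>

/* Local stand-in for the MIN2 macro used in the patch, defined here
 * only so that this sketch is self-contained. */
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper (not in the patch): mirrors the expression that
 * the patch assigns to max_invocations and max_workgroup_size. */
static uint32_t
clamped_max_invocations(uint32_t max_cs_workgroup_threads)
{
   /* 32 invocations per thread assumes SIMD32, as the commit message
    * notes; the result is then capped at 1024 total invocations. */
   return MIN2(1024, 32 * max_cs_workgroup_threads);
}
```

With an assumed count of 16 threads the product is 512 and the clamp has no effect. With an assumed count of 64 threads the unclamped product would be 2048, the larger scalar workgroup size the commit message mentions, and the function returns 1024 instead.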