Allowing the user to set custom sample locations, by filling the
extension structs and chaining them to the pipeline structs according
to the Vulkan specification section [26.5. Custom Sample Locations]
for the following structures:
'VkPipelineSampleLocationsStateCreateInfoEXT'
'VkSampleLocationsInfoEXT'
'VkSampleLocationEXT'
Once custom locations are used, the default locations are lost and need
to be re-emitted again in the next pipeline creation. For that, we emit
the 3DSTATE_SAMPLE_PATTERN at every pipeline creation.
v2: In v1, we used the custom anv_sample struct to store the location
and the distance from the pixel center because we would then use
this distance to sort the locations and send them in increasing
monotonical order to the GPU. That was because the Skylake PRM Vol.
2a "3DSTATE_SAMPLE_PATTERN" says that the samples must have
monotonically increasing distance from the pixel center to get the
correct centroid computation in the device. However, the Vulkan
spec seems to require that the samples occur in the order provided
through the API and this requirement is only for the standard
locations. As long as this only affects centroid calculations as
the docs say, we should be ok because OpenGL and Vulkan only
require that the centroid be some lit sample and that it's the same
for all samples in a pixel; they have no requirement that it be the
one closest to center. (Jason Ekstrand)
For that we made the following changes:
1- We removed the custom structs and functions from anv_private.h
and anv_sample_locations.h and anv_sample_locations.c (the last
two files were removed). (Jason Ekstrand)
2- We modified the macros used to take also the array as parameter
and we renamed them to start by GEN_. (Jason Ekstrand)
3- We don't sort the samples anymore. (Jason Ekstrand)
v3 (Jason Ekstrand):
Break the refactoring out into multiple commits
v4: Merge dynamic/non-dynamic changes into a single commit (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1887>
The VkPhysicalDeviceSampleLocationPropertiesEXT struct is filled with
implementation dependent values and according to the table from the
Vulkan Specification section [36.1. Limit Requirements]:
pname | max | min
pname:sampleLocationSampleCounts |- |ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize |- |(1, 1)
pname:sampleLocationCoordinateRange|(0.0, 0.9375)|(0.0, 0.9375)
pname:sampleLocationSubPixelBits |- |4
pname:variableSampleLocations | true |implementation dependent
The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.
Also, variableSampleLocations is set to true because we can set sample
locations per draw.
Implement the vkGetPhysicalDeviceMultisamplePropertiesEXT according to
the Vulkan Specification section [36.2. Additional Multisampling
Capabilities].
v2: 1- Replaced false with VK_FALSE for consistency. (Sagar Ghuge)
2- Used the isl_device_sample_count to take the number of samples
per platform to avoid extra checks. (Sagar Ghuge)
v3: 1- Replaced VK_FALSE with false as Jason has sent a patch to replace
VK_FALSE with false in other places. (Jason Ekstrand)
2- Removed unecessary defines and set the grid size to 1 (Jason Ekstrand)
v4: Fix properties reporting in GetPhysicalDeviceProperties2, not
GetPhysicalDeviceFeatures2 (Lionel)
Use same alignment as other functions (Lionel)
Report variableSampleLocations=true (Lionel)
v5: Don't overwrite the pNext in GetPhysicalDeviceMultisamplerPropertiesEXT
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> (v3)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1887>
When calculating a URB configuration, we start with a notion of how
much space each stage /wants/ (to achieve the maximum amount of
concurrency), but sometimes fall back to giving it less than that,
because we don't have enough space. (Typically, this happens when
the per-stage size is large, or there are many stages, or both.)
We now output a "constrained" boolean which is true if we weren't
able to satisfy all the "wants" due to a lack of space.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8721>
We can skip CCS ambiguate if followed by a fast clear within render
pass.
v2: (Jason)
- Check array layer as well since we only fast clear first layer and
first LOD.
- Don't drop fast clear check while doing resolve operation.
Fixes: d5849bc840 "anv: Skip HiZ and CCS ambiguates which preceed fast-clears"
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6988>
According to the BSpec page for MEDIA_VFE_STATE, on Gen12 platforms
"if a fused configuration has fewer threads than the native POR
configuration, the scratch space allocation is based on the number of
threads in the base native POR configuration". However we currently
use the subslice count from devinfo->num_subslices[0], which only
includes the subslices currently enabled by the platform fusing. This
leads to scratch space underallocation and occasional hangs.
The problem is likely to affect most Gen12 GPUs with less than 96 EUs.
GFXBench5 Aztec Ruins is able to reproduce the issue fairly reliably.
Fixes: 9e5ce30da7 "intel: fix the gen 12 compute shader scratch IDs"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8636>
While invalidating the AUX-TT entries, we have to consider the surface
offset as well otherwise, we will end up invalidating another surface's
CCS portion.
For eg. when we have HiZ+CCS and STC_CCS enabled, both will use the CCS
portion allocated at the end of BO. While invalidating the CCS portion
of stencil buffer, we will end up invalidating the CCS portion that
belongs to the depth main surface and vice-versa, if the surface offset
is not considered.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4123
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8677>
Previously, if we had a pipeline transition from something which used,
say, tessellation to something which didn't and we ended up with
tessellation descriptors dirty, we could end up re-emitting far more
than necessary. With this commit, we mask off unused stages so we only
update when necessary.
Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8594>
We incorrectly utilize the stencil layouts structures even if we
should stick to the depth_stencil ones if the layout includes stencil.
v2: Don't forget stencil only layout (Nanley)
Simplify callers of new helper functions (Nanley)
v3: Store VK_IMAGE_LAYOUT_UNDEFINED when no stencil is available (Nanley)
Use a switch statement (Nanley)
v4: Consider all layouts but depth only to be potential stencil layouts (Lionel)
v5: Refactor helper in vk_image_layout_depth_only() and discard
VkAttachmentDescriptionStencilLayoutKHR in
VkAttachmentDescription2KHR if format is not depth/stencil.
v5: s/LAYOUT_COLOR_ATTACHMENT_OPTIMAL/LAYOUT_DEPTH_ATTACHMENT_OPTIMAL/ (Nanley)
v6: Fix overly harsh assert()
Fixes: c1c346f166 ("anv: implement VK_KHR_separate_depth_stencil_layouts")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4084
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8475>
We sometimes use anv_layout_to_aux_state() to compute the aux state of
an image during the resolve operations at the end of a render
(sub)pass.
If we're dealing with a multisampled image that is created without a
transfer usage, our internal code might trigger a resolve using the
transfer layout (see genX_cmd_buffer.c:cmd_buffer_end_subpass), for
which the image doesn't the usage bit. The current code tries to AND
the 2 usages which won't have any bit in common, thus skipping all
checks below.
v2: Add the transfer usages depending on attachment usage (Lionel)
v3: Limit to samples > 1 (Jason) && DEPTH_STENCIL_ATTACHMENT_BIT (Lionel)
v4: Add transfer usage at image creation (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 54b525caf0 ("anv: Rework anv_layout_to_aux_state")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4037
Reviewed-by: Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8307>
When DEBUG is not defined, no error reporting is done, the error is
just returned back. The current definition a couple of warnings in
anv_formats.c. First when the return value is intentionally ignored
../src/intel/vulkan/anv_formats.c:989:48: warning: statement with no effect [-Wunused-value]
989 | vk_errorfi(instance, physical_device, VK_ERROR_FORMAT_NOT_SUPPORTED,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/intel/vulkan/anv_private.h:486:55: note: in definition of macro ‘vk_errorfi’
486 | #define vk_errorfi(instance, obj, error, format, ...) error
| ^~~~~
and also when an argument is used only
../src/intel/vulkan/anv_formats.c:908:25: warning: unused variable ‘instance’ [-Wunused-variable]
908 | struct anv_instance *instance = physical_device->instance;
| ^~~~~~~~
../src/intel/vulkan/anv_formats.c: In function ‘anv_GetPhysicalDeviceImageFormatProperties2’:
../src/intel/vulkan/anv_formats.c:1231:25: warning: unused variable ‘instance’ [-Wunused-variable]
1231 | struct anv_instance *instance = physical_device->instance;
| ^~~~~~~~
to avoid both issues, use a static inline function that just returns
it's argument but can consume other input. Ignoring the return value
of a function is OK, and the extra input can be tagged as UNUSED
getting rid of both warnings.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7860>
Quoting the spec :
"When a pool is destroyed, all descriptor sets allocated from the
pool are implicitly freed and become invalid. Descriptor sets
allocated from a given pool do not need to be freed before
destroying that descriptor pool."
This implies we might leak nodes allocated in the vma object.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0a6d2593b8 ("anv: Allocate descriptor buffers from the BO cache")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7796>
Simple refactor. No intended change in behavior.
Replace each derivation of aux address with anv_image_get_aux_addr().
The function will soon do more in support of
VK_EXT_image_drm_format_modifier, where the image bo and aux bo may be
disjoint.
v2:
- Replace param 'aspect' with 'plane'.
v3:
- Workaround for stencil ccs. If no aux surface, then return
ANV_NULL_ADDRESS.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v3)
Pre-patch, we checked the offsets once per aspect after adding all
surfaces for the aspect. The additional checks will make it easier to
diagnose layout bugs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Pure refactor. No intended change in behavior.
This makes the code infinitely easier to understand. And it uncovers
a potential bug (marked with XXX comment).
v2: Fix narrowing conversions on 32-bit arch. s/size_t/uintmax_t/.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
Months ago, make_surface() added *all* surfaces required for the given
aspect. It was a monster monolithic function, and difficult to reason
about its correctness. In commit c652ff8c (2020-03-06), I split the code
for aux surfaces into its own function, add_aux_surface_if_supported().
This patch continues the splitting, therefore making bugs easier to
identify.
Code changes:
- Move the code that adds the shadow surface from make_surface() to
a new function add_shadow_surface(), called from
add_all_surfaces().
- Move the call to add_aux_surface_if_supported() from make_surface()
to add_all_surfaces().
- To preserve correctness of the assertions on image layout in
make_surface(), move them to the loop in add_all_surfaces() after
all the aspect's surfaces have been added.
- Rename make_surface() to add_primary_surface() because now that's
what it does.
Pure refactor, no intended change in behavior.
v2:
- Rebase onto "anv: Fix isl_surf_usage_flags for stencil images".
- Sanitize the image's extent earlier.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)