I meant to do this years ago when I first added SHADER_OPCODE_SEND. At
the time, the only use for the extended descriptor was bindless handles
which were always one thing and never non-constant. However, it doesn't
actually require any extra instructions because we have to OR in ex_mlen
anyway.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8748>
this is a little hacky since we're still using unpacked layout for everything,
requiring that we "adjust" the value we pass back to the user for std430 to
be the expected value as though we were using packed layout
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8628>
this is actually crazy, but there's no other way to do it from the variable.
ideally, nir would have a separate type for atomic counters to simplify this
and then also stop mangling binding/block index during lower_buffers
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8628>
Commits 0278d1fa323cf1f289..b688ea31fcf7e20436 added a new parameter
to set_vertex_buffers(), set_shader_images(), and set_sampler_views()
which specifies a number of trailing slots to unbind. They updated
the iris functions to do the unbinding, but didn't update the code
to mark which things are bound in the bitfields. This meant that
later code would assume those unbound slots were bound, and crash
on a NULL dereference. All that's needed is to add that slot count
when unbinding things in the bitfield.
Fixes: 0278d1fa32 ("gallium: add unbind_num_trailing_slots to set_vertex_buffers")
Fixes: 72ff66c3d7 ("gallium: add unbind_num_trailing_slots to set_shader_images")
Fixes: b688ea31fc ("gallium: add unbind_num_trailing_slots to set_sampler_views")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8758>
Currently, `allow_higher_compat_version` is only used during context
creation. Doing that means an application that doesn't request a
specific version can be given a version higher than 3.0.
However, an application still cannot request a higher version via
glXCreateContextAttribsARB. The GLX and DRI layers will only see that
version 3.0 is supported, so context creation will fail before the drive
is called. For this to work, max_gl_compat_version must be set to a
higher version.
This enables running many piglit tests on i965 with
allow_higher_compat_version.
v2: Fix a typo in a comment. Noticed by Tim. Fix a typo in the commit
message. Noticed by the spell checker. :)
v3: Don't parse driconf again. Suggested by Tim.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7387>
There are a few places (mainly u_threaded_context) that do:
set_vertex_buffers(...);
for (i = 0; i < count; i++)
pipe_resource_reference(&buffers[i].resource.buffer, NULL);
set_vertex_buffers increments the reference counts while the loop
decrements them.
This commit eliminates those reference count changes by adding a parameter
into set_vertex_buffers that tells the callee to accept all buffers
without incrementing the reference counts.
AMD Zen benefits from this because it has slow atomics if they come from
different CCXs.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
Instead of calling this functions again to unbind trailing slots,
extend it to do it when binding. This reduces CPU overhead.
A lot of drivers ignore "start" and always unbind all slots after "count".
Such drivers don't need any changes here.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
We often do this:
pipe->set_constant_buffer(pipe, shader, slot, &cb);
pipe_resource_reference(&cb->buffer, NULL);
That results in atomic increment in set_constant_buffer followed by
atomic decrement after set_constant_buffer. This new interface
eliminates those atomics.
For the case above, this should be used instead:
pipe->set_constant_buffer(pipe, shader, slot, true, &cb);
cb->buffer = NULL; // if cb is not a local variable, else do nothing
AMD Zen benefits from this. The perf improvement is ~3% for Viewperf13/Catia.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
Allowing the user to set custom sample locations, by filling the
extension structs and chaining them to the pipeline structs according
to the Vulkan specification section [26.5. Custom Sample Locations]
for the following structures:
'VkPipelineSampleLocationsStateCreateInfoEXT'
'VkSampleLocationsInfoEXT'
'VkSampleLocationEXT'
Once custom locations are used, the default locations are lost and need
to be re-emitted again in the next pipeline creation. For that, we emit
the 3DSTATE_SAMPLE_PATTERN at every pipeline creation.
v2: In v1, we used the custom anv_sample struct to store the location
and the distance from the pixel center because we would then use
this distance to sort the locations and send them in increasing
monotonical order to the GPU. That was because the Skylake PRM Vol.
2a "3DSTATE_SAMPLE_PATTERN" says that the samples must have
monotonically increasing distance from the pixel center to get the
correct centroid computation in the device. However, the Vulkan
spec seems to require that the samples occur in the order provided
through the API and this requirement is only for the standard
locations. As long as this only affects centroid calculations as
the docs say, we should be ok because OpenGL and Vulkan only
require that the centroid be some lit sample and that it's the same
for all samples in a pixel; they have no requirement that it be the
one closest to center. (Jason Ekstrand)
For that we made the following changes:
1- We removed the custom structs and functions from anv_private.h
and anv_sample_locations.h and anv_sample_locations.c (the last
two files were removed). (Jason Ekstrand)
2- We modified the macros used to take also the array as parameter
and we renamed them to start by GEN_. (Jason Ekstrand)
3- We don't sort the samples anymore. (Jason Ekstrand)
v3 (Jason Ekstrand):
Break the refactoring out into multiple commits
v4: Merge dynamic/non-dynamic changes into a single commit (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1887>
The VkPhysicalDeviceSampleLocationPropertiesEXT struct is filled with
implementation dependent values and according to the table from the
Vulkan Specification section [36.1. Limit Requirements]:
pname | max | min
pname:sampleLocationSampleCounts |- |ename:VK_SAMPLE_COUNT_4_BIT
pname:maxSampleLocationGridSize |- |(1, 1)
pname:sampleLocationCoordinateRange|(0.0, 0.9375)|(0.0, 0.9375)
pname:sampleLocationSubPixelBits |- |4
pname:variableSampleLocations | true |implementation dependent
The hardware only supports setting the same sample location for all the
pixels, so we only support 1x1 grids.
Also, variableSampleLocations is set to true because we can set sample
locations per draw.
Implement the vkGetPhysicalDeviceMultisamplePropertiesEXT according to
the Vulkan Specification section [36.2. Additional Multisampling
Capabilities].
v2: 1- Replaced false with VK_FALSE for consistency. (Sagar Ghuge)
2- Used the isl_device_sample_count to take the number of samples
per platform to avoid extra checks. (Sagar Ghuge)
v3: 1- Replaced VK_FALSE with false as Jason has sent a patch to replace
VK_FALSE with false in other places. (Jason Ekstrand)
2- Removed unecessary defines and set the grid size to 1 (Jason Ekstrand)
v4: Fix properties reporting in GetPhysicalDeviceProperties2, not
GetPhysicalDeviceFeatures2 (Lionel)
Use same alignment as other functions (Lionel)
Report variableSampleLocations=true (Lionel)
v5: Don't overwrite the pNext in GetPhysicalDeviceMultisamplerPropertiesEXT
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> (v3)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1887>