Our implemention was bogus, it was only putting a cap on the offset
based on the aligned buffer size and this doesn't ensure the access
to the buffer happens within its valid range.
I think the only reason we have been passing the tests is that we
align all buffers sizes to 256B and the tests create buffers with a
size that is smaller than that (like 64B). When get the size of the
buffer from the shader, we get the actual bound range (so 64B in this
case) and by capping to that we don't ensure the access will stay
within that range, but we ensure it will stay within the underlying
memory bound to the buffer (256B), and this is fine by the spec,
however, I think if the actual buffer range was the same as the
underlying allocation we would fail the tests.
A valid behavior for robust buffer access on an out-of-bounds access
is to return any valid bytes within the buffer, so we can just
make that offset 0.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18744>
This is the minimum required by KHR_maintenance4 and there is no
reason we can't support this.
The only restriction we have is that the texture state base
address (which comes into play with texel buffers) must be aligned
to 4-bits, but this doesn't restrict the size of the buffer, only
its base address, and we already have requirements for buffer
alignment that ensure this.
Fixes: dEQP-VK.api.info.vulkan1p3_limits_validation.khr_maintenance4
Fixes: 2c388c1d ('v3dv: set maxBufferSize property')
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18748>
This extension was promoted to Vulkan 1.3 so we should be setting its
properties directly in the VkPhysicalDeviceVulkan13Properties struct
which the common mesa code will use to populate outgoing properties.
Apparently, only the properties struct was promoted and not the features
struct.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Fixes: ee62a4c751 ('v3dv: implement VK_EXT_texel_buffer_alignment')
Fixes: dEQP-VK.api.info.get_physical_device_properties2.properties.basic
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18697>
If we emit a ldunif to load the ubo/ssbo base address and
then we are immediately moving it to the unifa register we
can have the ldunif write directly to unifa and avoid the mov
in between, which won't be done by copy propagation because that
only works with temp registers.
Also, since we can't read from unifa we must be careful to disallow
reuse of the ldunif result for a future ldunif of the same base address.
We do that by only reusing ldunif results from temp registers.
total instructions in shared programs: 12468943 -> 12455139 (-0.11%)
instructions in affected programs: 1661233 -> 1647429 (-0.83%)
helped: 8307
HURT: 3994
total uniforms in shared programs: 3704532 -> 3704522 (<.01%)
uniforms in affected programs: 339 -> 329 (-2.95%)
helped: 7
HURT: 0
total max-temps in shared programs: 2148158 -> 2148290 (<.01%)
max-temps in affected programs: 9320 -> 9452 (1.42%)
helped: 175
HURT: 295
total spills in shared programs: 2202 -> 2202 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0
total fills in shared programs: 3059 -> 3057 (-0.07%)
fills in affected programs: 27 -> 25 (-7.41%)
helped: 1
HURT: 0
total sfu-stalls in shared programs: 21167 -> 21056 (-0.52%)
sfu-stalls in affected programs: 497 -> 386 (-22.33%)
helped: 209
HURT: 127
total inst-and-stalls in shared programs: 12490110 -> 12476195 (-0.11%)
inst-and-stalls in affected programs: 1662875 -> 1648960 (-0.84%)
helped: 8312
HURT: 3987
total nops in shared programs: 316563 -> 313553 (-0.95%)
nops in affected programs: 24269 -> 21259 (-12.40%)
helped: 2158
HURT: 1006
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18667>
We had a comment stating that we were using different program ids for render
and binning but this isn't true. We were only assigning ids to the render
stages and then we would create the binning stages and not assign a program id
to them at all, so they would remain with a program id of 0.
This change removes the comment and makes sure we assign the same program
id to the binning and render stages of the pipeline, which makes it a lot
easier to match render and binning shaders when debugging.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18630>
Instead, we should just return VK_SUCCESS. The physical device
won't be initialized and vkEnumeratePhysicalDevices will not
list it as available, which is the expected behavior here.
Also, VK_ERROR_INCOMPATIBLE_DRIVER is not a valid return code
from vkEnumeratePhysicalDevices, so never return that, instead
we return VK_ERROR_INITIALIZATION_FAILED if a valid device was
found but we failed to create the physical device for it.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-By: Ryan Houdek <Sonicadvance1@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18591>
This extension adds new NONE attachment load / store operations,
which are identical to the DONT_CARE variants with the difference
that DONT_CARE doesn't ensure that the original contents of the
memory within the render area are preserved and these new versions
do (with some caveats).
Our implementation was not destroying data with DONT_CARE anyway
so we already support the new semantics. Our implementation is
such that we don't need to do anything specific with the new
operations and the current behavior will do what is expected.
We pass all the tests under:
dEQP-VK.renderpass*.load_store_op_none.*
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18570>
If the render area is not aligned to tile boundaries it means we have partially
covered tiles in the framebuffer. In this case, we always need to load the tile
buffer from memory in order to preserve the contents outside the render area
on the tile buffer store. However, if in this scenario we know we won't be
storing the tile buffer we can skip the load safely.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18570>
We don't have any special requirements for this, so we can just expose
the extension.
The tests in CTS have an issue where they only check if a format is
supported for sampling but don't check if an image with that format
can be created for sampling. In our case, since we can't sample
1D depth/stencil images, this causes affected tests to crash in the
simulator (they pass on the device though). There is an issue with
a fix here:
https://gitlab.khronos.org/Tracker/vk-gl-cts/-/issues/3923
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18489>
The VC4 and Lima Piglit failures seems to mostly fall in two camps:
1. The hardware lacks sRGB support, but the drivers decide to expose it
nevertheless, with some varying level of emulation. This leads to some
failures, probably because we're missing sRGB decoding somewhere.
2. The spec@ext_texture_compression_s3tc@compressedteximage fails,
mostly due to the test not setting the mipfilter to nearest. With
that fixed, the test passes on VC4, but still fails on Lima due to an
a bit dodgy miplod bias in the driver.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18180>
We were computing these from the final swizzle resulting from
combining the format swizzle and the view swizzle, but here we
want to use the format swizzle alone, which is the one we
use to define these properties in the format table.
Fixes CTS test fails with EXT_border_color_swizzle.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18416>
The best way to tune this value is to test Vulkan
applications. Current somewhat big value (512), was obtained by
testing only vkQuake2. Additionally at that time the bo cache was the
first performance oriented improvement we implemented.
After more improvements were included, and retested with more
applications, the conclusion is that we can reduce the value. More
info on the issue that closes.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7090
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18398>
The common code in Mesa will rewrite the old entry points to these
if available.
Since we implement events and timestamps queries in the driver the
API changes don't quite affect us.
vkQueueSubmit2 is already implemented in the common synchronization
framework and the driver works in terms of a submit hook that is
independent of the actual entry point used by the application.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18290>
Added with KHR_synchronization2. The common code in Mesa will
rewrite vkCmdPipelineBarrier to vkCmdPipelineBarrier2.
With synchronization2 barriers now have a per-barrier stage
and access flags (previously these were shared by all the barriers
in a vkCmdPipelineBarrier commands), so we need to rewrite a bit
the logic to account for this.
Also, stage and access flag bits have been expanded and renamed.
Particularly, some new flags have been added that we need to account
for.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18290>