Since 13cb41f666 PIPE_BIND_SHARED was used to allocate driver internal
video buffers. These buffers are never shared, but the intent was to
get non-suballocated buffers and SHARED was used as an indirect flag.
This commit switches to PIPE_BIND_CUSTOM which isn't used anywhere else,
and is now translated as "no suballocation".
The main benefit here is that this allows these buffers to set
use_reusable_pool to true reducing the CPU overhead a lot.
For instance, running the following command on my system:
ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi \
-i tears_of_steel_1080p.mov -an -c:v h264_vaapi output.mp4
takes 35 sec with this commit vs 45 sec without.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21416>
As we support VK_EXT_image_drm_format_modifier, we could receive
VK_IMAGE_ASPECT_MEMORY_PLANE_0/1/2_BIT_EXT flags.
Fixes several tests like this:
dEQP-VK.drm_format_modifiers.create_explicit_modifier.*
when using CTS 1.3.5.0
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21463>
We have specialized lowering passes dealing with most of that already:
1. gl_nir_lower_samplers_as_deref
2. nir_lower_samplers
3. nir_lower_cl_images
If we need more than that, those passes can deal with following deref
chains as well.
We _might_ need to improve nir_lower_cl_images a bit for more complex
kernels, but CL also doesn't allow indirect images, so we are always able
to optimize the entire deref chain away.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20161>
Some of the OpenCL tests are flaky, because they just take that long.
Builtins can generated really complex code and if we are unlucky they can
timeout.
Proper support for functions would also solve the issue, probably, but for
now increase the deqp-runner timeout so it's less of an annoyence.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20161>
There's just two places where we need any of the WSI specific vulkan
includes, the rest of Zink should do just fine with vulkan_core.h. So
let's include the win32-specific header explicitly in those two places,
and reduce the need for WSI specifics inside zink itself. Kopper
handles the rest of the WSI integration.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21441>
- tcinv - Likely Texture Cache Invalidate (unverified)
- icinv - Mostly sure that it is Instruction Cache Invalidate
- dccln - Data Cache Clean
- dcinv - Data Cache Invalidate
- dcflu - Data Cache Flush
The emission of these instructions were not observed in the wild.
TODO: find out the difference between .shr and .all modes of
dccln, dcinv, dcflu.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14419>
When establishing a texture transfer map, if there is any pending changes on the
texture, instead of trying direct map with DONTBLOCK first, just
use the upload buffer path.
Fixes piglit tests gen-teximages, arb_copy_images-formats
Cc: mesa-stable
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21393>
When an EGLImage is created from a 2D texture and used for texture sharing,
the texture surface might not have been created with the SHARED bind flag.
To allow these surfaces for sharing, this patch sets the USAGE SHARED bit
for surfaces that can be potentially used for sharing even when the SHARED
bind flag is not originally set. Instead of unconditionally enabling the
SHARED bind flag for all surfaces and unnecessarily bypass the surface cache
optimization, this patch only enables the USAGE SHARED bit for surfaces
that also have the RENDER TARGET bind flag.
When the surface handle is inquired and if the surface is currently
marked as cachable, we will need to unset the cachable bit so
the surface handle will not be recycled again.
This patch fixes an assertion in svga_resource_get_handle() when the
EGL_MESA_image_dma_buf_export extension is used in webgl benchamrk running
from firefox in Fedora 37.
Cc: mesa-stable
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21393>
This hack caused problems with some dx9 tests before (due to mipgen
test using nearest filter sampling with tex coords exactly between two
texels hence being extremely sensitive to arithmetic inaccuracies),
and we can no longer distinguish this by using pixel_offset to not get
it enabled. But to pass other tests we don't really need the hack when
there's texture sampling involved anyway.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21407>
This avoids waiting for instance_data which can improve performance:
vk_ray_tracing_ao_KHR_app: 0.2% (The TLAS has 2 instances)
Quake II RTX: 1%
Control: 1%
We also have to shuffle around some code to avoid increasing VGPR usage.
That leaves us with the following stats:
Quake II RTX:
Totals from 7 (14.29% of 49) affected shaders:
CodeSize: 165612 -> 165716 (+0.06%)
Instrs: 31446 -> 31460 (+0.04%)
Latency: 596709 -> 554292 (-7.11%)
InvThroughput: 121998 -> 113327 (-7.11%)
VClause: 596 -> 587 (-1.51%)
Copies: 4664 -> 4646 (-0.39%)
PreVGPRs: 620 -> 639 (+3.06%)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21421>