When drawing using util_translate_prim_restart_ib, zink explicitly
ignores pipe_draw_start_count_bias::start, because
util_translate_prim_restart_ib used to create a new index-buffer without
padding at the start.
This makes a lot of sense, because creating a padded index buffer is
just wasteful.
So let's walk back on the choice of starting to pad the output buffer.
Fixes: 1272c2e052 ("util/prim_restart: fix util_translate_prim_restart_ib")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4851
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11059>
Spec requires apps to use the size returned from
vkGetAndroidHardwareBufferPropertiesANDROID for AHB import, which
includes dedicated image/buffer import and non-dedicated buffer import.
Spec requires venus to use the size from image and buffer memory
requirement for dma_buf fd import if it's dedicated. If not dedicated,
the actual payload size should be used.
For AHB export allocation of VkImage, the app provided size will be 0,
and internally we must use the size from image memory requirement.
For AHB export allocation of VkBuffer, the app provided size comes from
buffer memory requirements and it can be smaller than the actual AHB
size due to page alignment. Internally that's the size we must use.
For AHB import, app must use the size from AHB prop query. Internally,
we have to override that with the size from memory requirement if the
import operation is dedicated to a VkImage or a VkBuffer. If not
dedicated, the actual payload size should be used.
The not working scenario is:
1. App creates an AHB backed VkBuffer, and the exported AHB size is
larger than the buffer memory requirement (very common).
2. App imports the AHB without a dedicated VkBuffer. Then the entire
AHB payload will be imported and the mmap_size might increase.
Test: dEQP-VK.api.external.memory.android_hardware_buffer.*
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11055>
For importing an dma_buf fd, export info is not required. Leaving the
bo_flags missing VIRTGPU_BLOB_FLAG_USE_SHAREABLE bit as well as the
VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE bit. Upon querying fd properties,
DMA_BUF handle type will be specified which generates a new bo_flags
not matching the prior one.
This patch aligns the above 2 bits for the dma_buf import path.
Test: vkGetAndroidHardwareBufferPropertiesANDROID should succeed with
a valid AHB with AHARDWAREBUFFER_FORMAT_BLOB format.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11055>
It is valid to not have a sampler view declaration for the corresponding
sampler in a TGSI shader, and hence we should not rely on the sampler view
declaration to determine if we need to adjust the unnormalized coordinates
for texture rectangle sampling.
This patch is to prep for tgsi shaders that are translated from nir which
in many cases do not issue sampler view declarations.
Fixes: 584b107037 ("st/mesa: Drop the TGSI paths for drawpixels and use nir-to-tgsi")
Reviewed-by: Neha Bhende <bhenden@vmware.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11011>
Sometimes, TGSI shader doesn't have SVIEW declaration if it is not
utilize in shader. In such cases, declare those resources with the
help of information stored in shader key.
Fixes: 584b107037 ("st/mesa: Drop the TGSI paths for drawpixels and use nir-to-tgsi")
Tested with piglit, gleretrace
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11011>
A simple post-RA optimization which takes advantage of the
s_cbranch_vccz and s_cbranch_vccnz instructions.
It works on the following pattern:
vcc = v_cmp ...
scc = s_and vcc, exec
p_cbranch scc
The result looks like this:
vcc = v_cmp ...
p_cbranch vcc
Fossil DB results on Sienna Cichlid:
Totals from 4814 (3.21% of 149839) affected shaders:
CodeSize: 15371176 -> 15345964 (-0.16%)
Instrs: 3028557 -> 3022254 (-0.21%)
Latency: 21872753 -> 21823476 (-0.23%); split: -0.23%, +0.00%
InvThroughput: 4470282 -> 4468691 (-0.04%); split: -0.04%, +0.00%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
This commit adds the skeleton of a new ACO post-RA optimizer,
which is intended to be a simple pass called after RA, and
is meant to do code changes which can only be done
after RA.
It is currently empty, the actual optimizations will be added
in their own commits. It only has a DCE pass, which deletes
some dead code generated by the spiller.
Fossil DB results on Sienna Cichlid:
Totals from 375 (0.25% of 149839) affected shaders:
CodeSize: 2933056 -> 2907192 (-0.88%)
Instrs: 534154 -> 530706 (-0.65%)
Latency: 12088064 -> 12084907 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 4433454 -> 4432421 (-0.02%); split: -0.02%, +0.00%
Copies: 81649 -> 78203 (-4.22%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
Technically, this is correct, and I think it's the right long-term
solution. However, Iris has code that undoes the damage caused by
nir_lower_passthrough_edgeflags (see iris_fix_edge_flags in
iris_program.c), so it should be safe to do the lowering here.
I'm not marking this as closing mesa#4838 because I think we should move
the Iris code up to here, make a different version of the NIR pass, or
something different... to properly fix this problem. In the mean time,
this gets a bunch of tests to stop crashing. :)
Fixes: a76ec17f12 ("mesa/st: Fix iris regression with clip distances.")
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11013>