There are a bunch of optimizations that are broken when DPP is involved.
fossil-db (Sienna Cichlid):
Totals from 100 (0.07% of 150170) affected shaders:
CodeSize: 325204 -> 325192 (-0.00%); split: -0.06%, +0.05%
Instrs: 62773 -> 62664 (-0.17%); split: -0.18%, +0.00%
Latency: 295348 -> 295266 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 73990 -> 73946 (-0.06%); split: -0.06%, +0.01%
Copies: 1650 -> 1609 (-2.48%); split: -2.55%, +0.06%
PreSGPRs: 3554 -> 3520 (-0.96%)
Fossil-db changes are probably because v_sub_f32_dpp(v_mul_f32) is no
longer being combined into MAD and then split back into separate
instructions.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>
I haven't gone through every test (particularly ones I think are loop
unrolling or instruction-count-related ones I think), but this gives a
better picture of what's going on in this driver.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12436>
Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.
Instead, implement the resource_get_param hook to allow users to
fetch this information without WINSYS_HANDLE_TYPE_KMS.
A tiny helper function is introduced to compute the modifier of a
resource.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 7bcb223639 ("v3d, vc4: Fix dmabuf import for non-scanout buffers")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>
Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.
Instead, implement the resource_get_param hook to allow users to
fetch this information without WINSYS_HANDLE_TYPE_KMS.
A tiny helper function is introduced to compute the modifier of a
resource.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 7bcb223639 ("v3d, vc4: Fix dmabuf import for non-scanout buffers")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>
Prior to this commit, the stride, offset and modifier were fetched
via WINSYS_HANDLE_TYPE_KMS. However we can't make such a query
succeed if the buffer couldn't be imported to the KMS device.
Instead, implement the resource_get_param hook to allow users to
fetch this information without WINSYS_HANDLE_TYPE_KMS.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 4c092947df ("panfrost: fail in get_handle(TYPE_KMS) without a scanout resource")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12370>
When this was rewritten to support Vulkan, we stopped initializing
file_max to -1 in the case of no inputs. This causes the draw module
to go down a needlessly pessimistic case, printing an error while we're
at it.
Fixes: 42b5cfdbd2 ("gallivm/nir: fix vulkan vertex inputs")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12440>
On some GPUs when:
- depth bounds test is enabled
- depth test is disabled
- depth attachment uses UBWC in sysmem mode
GPU hangs. As a workaround we should enable z test. That's what blob
is doing for a630. And since we enable z test we should make it always pass.
Blob doesn't emit this workaround on a650 and a660. Untested on a640.
Fixes:
dEQP-VK.pipeline.extended_dynamic_state.two_draws_static.depth_bounds_test_disable
dEQP-VK.pipeline.extended_dynamic_state.two_draws_dynamic.depth_bounds_test_disable
dEQP-VK.dynamic_state.ds_state.depth_bounds_1
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12407>
The OpenGL 4.6 specification, section 14.5.2.1 (Line Stipple) says:
> The masking is achieved using three parameters: the 16-bit line
> stipple p, the line repeat count r, and an integer stipple counter s.
This is pretty clear that the stipple counter shouldn't carry fractional
parts. But we also don't really do anything useful with the fractional
part anyway, apart from skewing the third or later line-segments
Properly carrying over the fractional parts as the Vulkan specification
allows for rectangular lines is trickier than this and would require us
to use a shorter output-line at the start of the following
line-segments.
But let's just do what the OpenGL specification describes, and the
Vulkan specification allows for now.
This, combined with the following patch for the vulkan CTS makes the
last two rasterization-tests pass for me:
https://github.com/KhronosGroup/VK-GL-CTS/pull/279
Fixes the "spec/!opengl 1.1/linestipple/line strip" piglit-test.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12327>
Glue together all the GLES related jobs using the suites feature.
This allow us to reduce the total number of devices required, moving
some of them to help in other jobs, and the remaining free for other
pipelines in parallel.
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12453>
Be able to inline uniforms in loop for unrolling it.
Nested loop/if is also supported.
Some example:
for (i = 0; i < count; i++)
...
uniform "count" will be inlined. But note this does not
make sure the loop will be unrolled (ie. count = 1000).
for (i = 0; i < count; i++)
for (j = init; j < 10; j++)
if (type == 2)
...
uniform "count", "init" and "type" will be inlined.
It is intentional to not be too aggressive to add uniforms
to avoid false positive case while be able to support most
common usage.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11950>
In b9c095cc2c ("panfrost: Rewrite the clear colour packing code"),
packing of clear colours was corrected to use the tilebuffer's
fractional bits, fixing dithering of the clear colour with formats like
RGB565. Unfortunately, that commit did so unconditionally. If the
framebuffer is dithered, but dithering is disabled at the time of
the clear, we would incorrectly dither the clear.
This is a regression, as the old (broken) code passed the relevant CTS
test. What's the catch? Depending on dither state, there are two
formulas to pack tilebuffer colours. We need to handle both. Fixes
KHR-GLES31.core.draw_buffers_indexed.color_masks.
Fixes: b9c095cc2c ("panfrost: Rewrite the clear colour packing code")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12460>
Right now opcode_desc struct, used to define data for all the
operations to pack/unpack, include a version field. In theory that
could be used to check if we are retrieving a opcode valid for our hw
version, or to get the correct opcode if a given one changed across hw
versions, or just the same if it didn't change.
In practice that field was not used. So for example, if by mistake we
asked for an opcode defined at version 41, while being on version 33
hardware, we would still get that opcode description.
This commit fixes that, and as we are here we expand the functionality
to allow to define version ranges, just in case a given opcode number
and their description is only valid for a given range.
v2 (from Iago feedback):
* Fixed some comment typos
* Simplified filtering opcode method
* Rename filtering opcode method
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12301>
Right now there is a helper to get the opcode description from a
packed instruction, used on unpack related instructions. This commit
adds a helper that refactors the equivalent that is already in use on
pack related instructions.
Right now the helper is small, but we plan to extend it on following
commits in order to use the opcode description version field.
To avoid any possible confusion we rename the existing lookup helper.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12301>
* Remove one about waddr 6 being reserved, when at some point it
become NOP
* Fix one comment about reserved signals on v41 map, as 24 and 25
are in fact defined. This seems a C&P issue (see v40 map).
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12301>