Before VK_KHR_maintenance5 point size is undefined unless the
shader explicitly writes it, but this extension changes this and
expects a default point size of 1.0 if none has been written.
We accomplish this by emitting a POINT_SIZE packet with the
default point size the first time we draw with a POINT primitive
in the job. If the shaders used in the draw call doesn't write
point size then the hardware will take the point size from the
state set by the packet. If the shader does write to point size
then the value written in the shader will be used instead.
Passes all tests we support in:
dEQP-VK.rasterization.primitive_size.default_size.points.*
when forcing maintenance5 enabled.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29413>
2712D0 has V3D 7.1.10 which included draw index and
base vertex in the shader state record packet, shuffling
the locations of most of its fields. Handle this at run
time by emitting the appropriate packet based on the
V3D version since our current versioning framework doesn't
support changes based on revision number alone.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29189>
Note that when it is dynamic, it goes to the codepath of having
enabled raster_enabled at the pipeline, even if at the end it will be
disabled. The fragment shader compilation, and the stage keys, depends
on rasterization being enabled or not. As mentioned, if the state is
dynamic, it assumes that the rasterization is enabled.
That would work, as then the rasterization could be discarded at the
CFG_BITS package, by the command buffer at draw time. We just have a
(discarded) shader slightly more complex that it would have been with
rasterization enabled.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28980>
While working on VK_EXT_extended_dynamic_state2 we found two issues
the stencil emission code, after the update for StencilTestEnable
being dynamic.
Specifically:
* pack_stencil_cfg: if we don't have a ds_info, we need to return,
as pack_single_stencil_cfg uses it to fill it up. Also the check
for MESA_VK_DYNAMIC_DS_STENCIL_TEST_ENABLE was not needed. That
state doesn't affect the content of the STENCIL_CFG
packet. Stencil is enabled/disabled at the CFG_BITS packet.
* cmd_buffer_emit_stencil: we can't use pipeline->emit_stencil_cfg
to filter if it is needed to emit that as since
stencil_test_enable and stencil_op become dynamic.
We also update which states we check that are dynamic. As
mentioned STENCIL_TEST_ENABLE doesn't affect here.
Fixes: 60e9237e81 ("v3dv: StencilOp and StencilTestEnable are now dynamic")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28980>
There were some pending places to update after PrimitiveTopology
become dynamic. FWIW, this was not catched by any CTS test.
As we are here we add a comment to explain why we still use the
topology on the pipeline.
Fixes: 2526f74ade ("v3dv: PrimitiveTopology is now dynamic")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28980>
This commit introduces a significant change when we emit STENCIL_CFG,
with any dynamic state: we stop to use cl_emit_with_prepacked, and use
directly cl_emit. The reason is that now most of the STENCIL_CFG
parameters are dynamic, any improvement of using
cl_emit_with_prepacked is minimized. Also gets the code simpler, and
avoid the need to be extra careful with the fact that
cl_emit_with_prepaked doesn't override values.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27609>
Note that although the topology affects the final shader, and it is
part of the v3d_fs_key (through is_points and is_lines), changing
dynamically the topology would not trigger a shader recompilation as
that would only needed if there was a topology class change. From
spec:
"VUID-vkCmdDraw-dynamicPrimitiveTopologyUnrestricted-07500
If the bound graphics pipeline state was created with the
VK_DYNAMIC_STATE_PRIMITIVE_TOPOLOGY dynamic state enabled and the
dynamicPrimitiveTopologyUnrestricted is VK_FALSE, then the
primitiveTopology parameter of vkCmdSetPrimitiveTopology must be of
the same topology class as the pipeline
VkPipelineInputAssemblyStateCreateInfo::topology state"
dynamicPrimitiveTopologyUnrestricted is defined at
VK_EXT_extended_dynamic_state3, so for now it is false. And even if in
the future we support that extension, it is really likely that we
would return False there.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27609>
As the values depends on several values that can be dynamic too.
Note that the current approach of this commit is keeping this info
duplicated on the pipeline and the cmd_buffer. The alternative would
be to just track it at the cmd_buffer, like we did recently with
z_updates_enable, but getting the values for ez_state/incompatible_ez
were more complex, so this commit still computes it when the pipeline
is created, and uses it as default value.
This is debatable though, and the alternative would be to just keep
ez_state/incompatible_ez_state at the command buffer.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27609>
As it depends on values that could be dynamic now. Technically we
could try to keep pre-emitting, just in case that info is provided
statically.
But for the dynamic case, we would still need to compute that bits,
and we would need to discard all the pre-emitted CFG set, and
recompute it completely (as right now cl_emit_with_prepacked doesn't
allow to override values).
It is also gets a simpler code by setting those flags in only one
codepath.
As we are here, we also move z_updates_enable from the pipeline to the
cmd_buffer. This values doesn't require a complex compute, so it is
easier to just keep it on one place.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27609>
Mostly equal to vkCmdBindVertexBuffers, but adding strides, that with
VK_EXT_extended_dynamic_state become dynamic, and setting pSizes.
It is worth to note that at this moment we don't use
CmdBindVertexBuffers2 pSizes.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27609>
Specifically to use the common vk_dynamic_graphics_state.
The advantage of using the common struct is not only reducing the size
of our custom one, but also using common helpers (like all those cmd
buffer setters), and a lot of the logic that in the future will be
used for other extensions.
Some notes:
* We still keep dirty flags, for things like PIPELINE,
DESCRIPTOR_SETS, etc. Other driver do the same. FWIW, this is also
an improvement, as before we were mixing those with the per-spec
Vulkan dynamic info.
* For the port viewport/scissor we still keep some data on a custom
structure, as we cache the translate/scale info that is derived
from scissor/viewport, but used in three different places.
For that we also maintain a custom implementation of
CmdSetViewport, that computes translate/scale, and call the common
implementation.
* We make the same for color_write_enables. The vulkan runtime saves
it as a 8-bit bitset, with a bit per attachment. But when combining
with color_write_mask you need a 32bit with 4 bits set per
attachment. To avoid recompute it during emission, we also cache
the color_write_enables, using the runtime just to track the dirty
status.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27609>
With the simultaneous use flag we can reuse the same command
buffer multiple times. That means, for example, that we can
have an instance of a job running in the GPU while we are
submitting another one for execution to a queue.
This scenario is problematic with dynamic rendering and job
suspension because suspended jobs need to be patched with the
resume address at queue submit time, and thus, if we have another
instance of the same job currently executing in the GPU we could
stomp its resume address, which could be different.
To fix this, at queue submission time, when we detect a suspending
job in a command buffer with the simultaneous use flag, we clone the
job and create its own copy of the BCL so we can patch the resume
address into it safely without conflicting with any other instance
of the job that may be running.
We need to flag these clones as having their own BCL since
we would have to free it when the job is destroyed, unlike other
clones that don't own any resources of their own. Also, because
this job is created at queue submit time, it won't be in the
execution list of the command buffer, so it won't be automatically
destroyed with it, so we need to add it to the command buffer
as a private object.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28521>
We had these pointing to the original job instead of pointing
to the cloned job. This can be confusing, particularly, if we
then emit commands that include references to new BOs into the
cloned jobs, since we would then try to insert these BOs in the
original jobs instead of the clones, which was the situation
we had when we implemented resume address patching with dynamic
rendering.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28521>
Because we are cloning these into primaries but the cloning is
superficial the command lists in them still point to the original
jobs and therefore paching new addresses would make the packing
code add the BO of the resume address to the original job. This
has two problems:
1. This is probably not what we want since the patching should only
be affecting the clone.
2. The bo_count of the clone job will not be updated accordingly and
we end up with a mismatch that will blow up when we submit.
The solution used here is a big hack, but works for now: we just
specify the address by its full offset rather than a relative
offset from a BO. We already have to add all the BOS in the resume
job manually which will include this the BO for the branch address
too, so this is fine.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
This was used only in secondary CL command buffers so it made
sense but with dynamic rendering we are going to also have
regular CLs also in secondaries (since secondaries can now
record full dynamic rendering passes), so renaming this to
INCOMPLETE makes more sense, since this is really what they
refer to: parts of CLs that are intended to be merged into
other primaries through branching.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
Dynamic rendering allows the client to suspend recording of a
render pass and have it continued in a different command buffer.
When a suspended command buffer is submitted to a queue, the
resuming command buffer must be te next one in submission order.
This means we need to be able to "merge" or "stitch" together
these command buffers at submit time.
To accomplish this, when we suspend a command buffer we emit
a BRANCH instruction to finish it. Then at submit time, when
we know the resuming job, we patch the BRANCH address with the
address of the resuming binning list (bcl). This is very similar
to how we execute secondary command buffers inside a render pass.
Also, only the last resuming job should flush the binning lists
in the bcl since we won't have processed the full binning command
list until we have execute the last linked job in the resume
list.
Since all jobs and command buffers in the suspend/resume chain
must be part of the same dynamic render pass, we only need to
produce and emit the render command list (rcl) once.
Since the way we implement stitching is that we branch from the
suspending job into the resuming one, the first job suspending
will link into all the resuming jobs necessary to complete the
chain, therefore, after the stitching is complete, we only want
to submit the first job in the suspend/resume chain, and thus,
we only produce and emit the rcl for this one job.
Notice as well that suspending only affects the last job
recording a dynamic rendering pass (the one that needs the branch
so we can resume execution with another job in another command
buffer).
Resuming affects all jobs in the dynamic render pass, since
we won't produce RCLs for them (as only the originating job
on the suspend/resume chain will emit the RCL).
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
When the Z scale is too small guardband clipping may not clip
correctly, so disable it, which is a new option in V3D 7.x.
This fixes this test in V3D 7.x without needing any workarounds:
dEQP-VK.draw.renderpass.inverted_depth_ranges.nodepthclamp_deltazero
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Starting point for v71 version inclusion.
This just adds it as one of the versions to be compiled (on meson),
updates the v3dX/v3dv_X macros, and update the code enough to get it
compiling when building using the two versions. For any packet not
available on v71 we just provide a generic asserted placeholder of
generation not supported.
Any real v71 support will be implemented on following commits.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
If the viewport center is not positive we can't express it
through the fine coordinates, which are unsigned, and we
need to use the coarse coordinates too.
Fixes new crashes in Vulkan CTS 1.3.6.0:
dEQP-VK.draw.renderpass.offscreen_viewport.*negative*
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23489>
These are split between fine and coarse coordinates. We have only been
using fine until now, so we kept the same naming convention we had
prior to V3D 4.1 for simplicity, but we will start using the coarse
coordinates soon too.
Also, the signedness was reversed: coarse coordinates are signed and
fine coordinates are unsigned.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23489>
This is a prepare step to remove depends on p_defines.h in src/util/*
This is done by:
replace pipe_prim_type with mesa_prim
replace shader_prim with mesa_prim
replace PIPE_PRIM_MAX with MESA_PRIM_COUNT
replace SHADER_PRIM_ with MESA_PRIM_
replace PIPE_PRIM_ with MESA_PRIM_
This patch only replace code only
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23369>
Currently, we postpone binning syncs until we record draw calls
and can validate if any of them require accessing protected
resources in the binning stage, however, if the draw calls are
recorded in a secondary command buffer and the barriers have
been recorded in the primary command buffer, we won't apply the
binning sync in the secondary when we record the draw calls
and so we must apply it when we execute the secondary in the
primary.
Fixes flakyness in:
dEQP-VK.api.command_buffers.record_many_draws_secondary_2
cc: mesa-stable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21162>
Original patches wrote by Ella Stanforth.
Alejandro Piñeiro main changes (skipping the small fixes/typos):
* Reduced the list of supported formats to
VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM and
VK_FORMAT_G8_B8R8_2PLANE_420_UNORM, that are the two only
mandatory by the spec.
* Fix format features exposed with YCbCr:
* Disallow some features not supported with YCbCr (like blitting)
* Disallow storage image support. Not clear if really useful. Even
if there are CTS tests, there is an ongoing discussion about the
possibility to remove them.
* Expose VK_FORMAT_FEATURE_COSITED_CHROMA_SAMPLES_BIT, that is
mandatory for the formats supported.
* Not expose VK_FORMAT_FEATURE_2_MIDPOINT_CHROMA_SAMPLES_BIT. Some
CTS tests are failing right now, and it is not mandatory. Likely
to be revisit later.
* We are keeping VK_FORMAT_FEATURE_2_DISJOINT_BIT and
VK_FORMAT_FEATURE_2_MIDPOINT_CHROMA_SAMPLES_BIT. Even if they
are optional, it is working with the two formats that we are
exposing. Likely that will need to be refined if we start to
expose more formats.
* create_image_view: don't use hardcoded 0x70, but instead doing an
explicit bit or of VK_IMAGE_ASPECT_PLANE_0/1/2_BIT
* image_format_plane_features: keep how supported aspects and
separate stencil check is done. Even if the change introduced was
correct (not sure about that though), that change is unrelated to
this work
* write_image_descriptor: add additional checks for descriptor type,
to compute properly the offset.
* Cosmetic changes (don't use // for comments, capital letters, etc)
* Main changes coming from the review:
* Not use image aliases. All the info is already on the image
planes, and some points of the code were confusing as it was
using always a hardcoded plane 0.
* Squashed the two original main patches. YCbCr conversion was
leaking on the multi-planar support, as some support needed
info coming from the ycbcr structs.
* Not expose the extension on Android, and explicitly assert that
we expect plane_count to be 1 always.
* For a full list of review changes see MR#19950
Signed-off-by: Ella Stanforth <estanforth@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19950>
Our implementation was mostly CPU-based, with things such as query
resets and result copying handled in the CPU, as well as some aspects
of query availability tracking.
This new implementation handles all GPU-side query functions by
dispatching compute shaders to push the work to the GPU. This
involves query availability, reset and result copying.
For now, only occlusion queries are managed this way. Performance
queries can also be implemented in a similar fashion in the future
with some additional work, however, for timestamp queries our only
option to improve this would be to execute the actual timestamp in the
kernel, since we can't take a timestamp from a shader.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19770>
If the render area is not aligned to tile boundaries it means we have partially
covered tiles in the framebuffer. In this case, we always need to load the tile
buffer from memory in order to preserve the contents outside the render area
on the tile buffer store. However, if in this scenario we know we won't be
storing the tile buffer we can skip the load safely.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18570>