PER_QUAD TMU lookups will partially override the predication mask on TMU
writes. If some but not all lanes in a quad are predicated out, setting
PER_QUAD will force them all to be enabled. This can result in TMU
access to bogus addresses when in nonuniform control flow. Also, since
PER_QUAD is needed to make sure derivatives work with helper
invocations, and derivatives are undefined in nonuniform control flow,
there is no reason to leave it enabled in this case.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7726>
Similarly to if statements, uniform loops are now emitted without
predication, using simple branches for breaks and continues. The
uniformity of the loop is determined by running the
nir_divergence_analysis pass.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7726>
Fix defect reported by Coverity Scan.
Side effect in assertion (ASSERT_SIDE_EFFECT)
assignment_where_comparison_intended: Assignment job->ez_state =
VC5_EZ_DISABLED has a side effect. This code will work differently in a
non-debug build.
Fixes: cec2ed7c80 ("v3dv: fix disabling Early Z for the whole frame")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8666>
From vkCmdBindPipeline spec:
"pipelineBindPoint is a VkPipelineBindPoint value specifying to
which bind point the pipeline is bound. Binding one does not disturb
the others."
But internally we were only handling one pipeline per command buffer,
so binding a pipeline of one type would override an alredy bound
pipeline of other type.
Note that for push constants, in the same way that we were keeping one
client array and one bo for the values, for all stages, independently
of the stageFlags specified by vkCmdPushConstants, we are keeping the
same idea here, so such client array and bo is still tied to the
command buffer, and used by the two pipeline bind points. That makes
far easier tracking the push constants. We could revisit in the future
if we want a more fine grained tracking.
Fixes the following crashes:
dEQP-VK.pipeline.push_constant.lifetime.pipeline_change_diff_range_bind_push_vert_and_comp
dEQP-VK.pipeline.push_constant.lifetime.pipeline_change_same_range_bind_push_vert_and_comp
v2 (from Iago review)
* Move removal of v3dv_resource definition to a different commit.
* Use the new v3dv_cmd_pipeline_state on the cmd buffer meta
sub-struct, call it gfx for consistency
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8613>
Although I assume that this should be caught by the validation layers,
recently while triaging the following tests:
dEQP-VK.ycbcr.query.*r8g8b8a8_unorm*
We found that they were setting a descriptorCount of zero, because it
was not handling correctly the differences between Vulkan 1.0 and
Vulkan 1.1.
So let's just assert, just in case it happens again, as that would
make the bugfixing far easier.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8614>
The documentation states that if we disable Early Z for the whole
frame in the RCL Tile Rendering Mode packet, then we should not
emit any draw calls with it enabled (which we can do by enabling
it in the CFG_BITS packet).
Since we emit our RCL after recording our draw calls in the BCL
and we were not considering there if any condition for global disable
would be met, it was possible that we end up with an incorrect
configuration when we decide for a global disable in the RCL, which
can cause rendering artifacts. This can be easily observed by simply
forcing the RCL bit to disable early Z in applications that are known
to enable it in CFG_BITS (such as the UE Shooter demo for example).
With this change we keep track of this scenario when we record
draw calls in the BCL and if decide that we need to disable EZ for
the entire job, we make sure we never enable it for any draw calls
in the frame.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8589>
This is an optimization that should make Z/S clears faster. To enable
this we can't have any Z/S loads or stores in the job. Also, it seems
that enabling early Z/S clearing is independent of whether early Z/S
testing is enabled.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8589>
There was a misunderstanding regarding the scope of some hardware bugs
that led us to think that:
1. The Clear Tile Buffer Z/S bit was broken
2. The Clear Tile Buffer RTs bit would also clear Z/S.
1) is not really true, what happened was that some other bugs for which
we need workarounds anyway would have that effect. 2) was only true
for V3D 4.1, so it doesn't affect v3dv.
This change makes proper use of the Z/S bit instead of falling back to
clearing all tile buffers every time we have a Z/S clear. This also
allows us to do color clears on the tile store (which is faster) rather
than falling back to the clear all RTs bit every time we have a Z/S clear.
v2: rewrite the original comment about the hardwarebug description to
include recent discussions with Broadcom instead of keeping it as
is and amending it with an update note.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8589>
I saw this while inspecting CL dumps from the UE Shooter demo,
where they disable Z writes for occlusion queries. The hardware
is probably doing this internally, but it doesn't hurt
to do this explicitly and make CL traces consistent with intended
behavior.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8571>
If we have dirty descriptor set state we have to update our uniform
data to reference the new resources such as addresses for textures
or UBOs. This is known to have a high CPU cost, so we want to limit
this as much as we can.
It is a common rendering pattern in applications to render many objects
using the same pipeline, but modifying the descriptor sets bound to update
textures, UBOs, etc. In this scenario, we would be incurring in unnecessary
uniform stream updates for stages that don't access descriptor sets at all.
This change makes it so we track which shader stages in a pipeline
use descriptor set state and skips updating uniform streams for them
when dirty descriptor set state is the only reason requiring us to
generate new uniform streams for a draw call.
v2: reuse shader stage information from the pipeline set layouts
to track shader stages that use descriptor state.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8555>
Instead of checking whether the source and destination are the same,
we should check if the underlying BOs are the same, since we may
be suballocating resources from the same allocation and the kernel
will fail to execute jobs if the BO list has duplicated entries.
Fixes aborts with Unreal Engine due to failed TFU jobs.
Fixes: 30f1fc25ce ('v3dv: implement TFU blits')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8098>
Replace mesa's slightly different container_of() with one more aligned
to the linux kernel's version which takes a type as the 2nd param. This
avoids warnings like:
freedreno_context.c:396:44: warning: variable 'batch' is uninitialized when used within its own initialization [-Wuninitialized]
At the same time, we can add additional build-time type-checking asserts
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7941>
The TFU path only activates for blits that are really copies
(no linear filtering, no scaling, same pixel format, etc.), and
we do it slice by slice, so we can easily handle mirroring of the
Z coordinate for 3D images by reversing the order of the layers
as we copy them.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7845>
Same as with other TFU paths, we only handle exact copies without
conversion, so we can rewrite the format to use a compatible TFU format
based on its texel size, which allows us to use this path with more
formats.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7845>