This parameter is only a hint, as tc provides no method for tracking cases
when a buffer is bound multiple times to the same site (e.g., multiple vertex
buffer slots will be counted as 1 bind), so rename it to "minimum" to be clearer.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12898>
upgrade_vertex copies save->copied.nr vertices to the vertex buffer,
so we need to make sure it has enough space to accommodate them.
This commit also drops the usage of COPY_CLEAN_4V_TYPE_AS_UNION in
this function because it always writes 4-components for all attributes,
but our buffer might be smaller. Instead, only write the needed
components.
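As an illustration of "only write the needed components", here is a minimal
standalone sketch. It is not the vbo code: sketch_copy_attr and sketch_defaults
are hypothetical names, and the sketch assumes float attributes whose missing
components take the usual GL defaults (0, 0, 0, 1).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed defaults for components the source does not provide. */
static const float sketch_defaults[4] = { 0.0f, 0.0f, 0.0f, 1.0f };

/* Write exactly dst_size components to dst: the first src_size come from
 * src, the rest from the defaults. Nothing past dst[dst_size - 1] is
 * touched, so a destination slot smaller than 4 floats is never overrun.
 * Returns the number of components written.
 */
static size_t
sketch_copy_attr(float *dst, size_t dst_size,
                 const float *src, size_t src_size)
{
   assert(src_size <= dst_size && dst_size <= 4);

   for (size_t i = 0; i < dst_size; i++)
      dst[i] = i < src_size ? src[i] : sketch_defaults[i];

   return dst_size;
}
```

A fixed 4-component write would instead store dst[0..3] regardless of dst_size,
which is the overrun this commit avoids.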
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5353
Fixes: cc57156dce ("vbo/dlist: rework vertex_store management")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12849>
fd_fence_finish() may be passed the special timeout value PIPE_TIMEOUT_INFINITE.
This gets propagated all the way to get_abs_timeout(), where it gets converted to
a huge timeout value and passed down to the kernel. At least on iMX53, the kernel
may complain about this value being too large and emit a backtrace. The relevant
piece of information there is the following:
schedule_timeout: wrong timeout value bf94984b
Per suggestion by Rob Clark, fix this in get_abs_timeout() by picking the same
rollover implementation present in etnaviv. This fixes one part of the problem
where the tv_nsec becomes larger than NSEC_PER_SEC, which is invalid.
However, PIPE_TIMEOUT_INFINITE is large enough to make tv_sec larger than
KTIME_SEC_MAX, which makes the kernel-side ktime_set() return KTIME_MAX, and
that in turn triggers the above "wrong timeout value N" message. Fix this by
setting the timeout to a large enough finite value in the PIPE_TIMEOUT_INFINITE
case. While the timeout is not truly infinite, it is long enough, as anything
longer than a few seconds means the GPU has hung.
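The two fixes described above (nanosecond rollover plus clamping the infinite
timeout) can be sketched in isolation as follows. This is a simplified
illustration, not the driver code: all sketch_/SKETCH_ names are hypothetical,
the current time is passed in rather than read from the clock so the function
is easy to test, and the one-hour clamp is an assumed value for "large enough".

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC     1000000000ull
/* Stand-in for PIPE_TIMEOUT_INFINITE. */
#define SKETCH_TIMEOUT_INFINITE (~0ull)
/* Assumed clamp: one hour, far longer than any healthy GPU wait. */
#define SKETCH_MAX_TIMEOUT_NS   (3600ull * SKETCH_NSEC_PER_SEC)

/* Convert a relative timeout in nanoseconds to an absolute timespec. */
static struct timespec
sketch_abs_timeout(struct timespec now, uint64_t ns)
{
   struct timespec out;

   assert(now.tv_nsec >= 0 && (uint64_t)now.tv_nsec < SKETCH_NSEC_PER_SEC);

   /* Keep tv_sec small enough that the kernel does not reject it. */
   if (ns == SKETCH_TIMEOUT_INFINITE)
      ns = SKETCH_MAX_TIMEOUT_NS;

   out.tv_sec = now.tv_sec + (time_t)(ns / SKETCH_NSEC_PER_SEC);
   out.tv_nsec = now.tv_nsec + (long)(ns % SKETCH_NSEC_PER_SEC);

   /* Roll over so that tv_nsec stays below one second. */
   if ((uint64_t)out.tv_nsec >= SKETCH_NSEC_PER_SEC) {
      out.tv_nsec -= (long)SKETCH_NSEC_PER_SEC;
      out.tv_sec++;
   }

   return out;
}
```

Because both addends of tv_nsec are below one second, a single subtraction is
enough to normalize the result.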
"util/timespec.h" is included so we can use NSEC_PER_SEC instead of the ad-hoc
constant 1000000000. "pipe/p_defines.h" is needed for PIPE_TIMEOUT_INFINITE.
This problem can be reliably triggered on iMX53 using Qt5 with EGLFS support,
using the qtbase examples, as follows:
/usr/share/examples/opengl/qopenglwidget/qopenglwidget -platform eglfs
Fixes: f3cc0d2747 ("freedreno: import libdrm_freedreno + redesign submit")
Signed-off-by: Marek Vasut <marex@denx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12886>
This helper is simply a wrapper around the clear color fields in the
iris_resource struct. We choose to delete it for two reasons:
1) It incorrectly asserts that the resource argument has an aux BO.
This doesn't hold for CCS_E on XeHP.
2) The majority of functions ignore the helper anyway and access these
fields directly.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12795>
Now that they're no longer ralloc'd, we have to be much more careful
about indirects. We have to make sure every time a source or
destination is overwritten, its indirect (if any) is freed. We also
have to choose a memory ownership convention for the rewrite functions.
Assuming that they will be called with the source from some other
instruction, we choose to always make a copy of the indirect (if any).
It's the responsibility of the caller to ensure its copy of the indirect
is freed.
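The ownership convention above can be illustrated with a small standalone
sketch. It does not use the real NIR types: struct sketch_src and both
functions are hypothetical stand-ins, assuming a source that optionally owns
one heap-allocated indirect.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_src {
   int reg;                      /* register index */
   struct sketch_src *indirect;  /* owned heap copy, or NULL */
};

/* Free the indirect chain owned by src, if any. */
static void
sketch_src_free_indirect(struct sketch_src *src)
{
   if (src->indirect) {
      sketch_src_free_indirect(src->indirect);
      free(src->indirect);
      src->indirect = NULL;
   }
}

/* Overwrite dst with new_src. The indirect is always deep-copied, so
 * new_src (and whoever owns its indirect) is left untouched.
 */
static void
sketch_src_rewrite(struct sketch_src *dst, const struct sketch_src *new_src)
{
   struct sketch_src *copy = NULL;

   assert(dst && new_src);

   /* Copy first, so rewriting a source with itself stays safe. */
   if (new_src->indirect) {
      copy = calloc(1, sizeof(*copy));
      if (!copy)
         abort();
      sketch_src_rewrite(copy, new_src->indirect);
   }

   /* Every overwrite releases the indirect that dst used to own. */
   sketch_src_free_indirect(dst);
   dst->reg = new_src->reg;
   dst->indirect = copy;
}
```

The cost mentioned below is visible here: a rewrite is now a possible
allocation plus a free rather than a plain struct assignment.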
Unfortunately, all this extra logic is going to make
nir_instr_rewrite/move_src/dest more expensive because they now have
all the logic of nir_src/dest_copy instead of a simple struct
assignment. Fortunately, the vast majority of rewrite calls are done by
nir_ssa_def_rewrite_uses which is an SSA-only fast-path.
Fixes: 879a569884 "nir: Switch from ralloc to malloc for NIR instructions."
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12884>
PSIZ output is only needed when:
1. There is a next stage and it reads it.
2. Primitive topology is point list, in the last vertex pipeline stage.
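The two conditions above can be written as a single predicate. This is only a
sketch of the rule as stated here, not the RADV implementation;
sketch_psiz_needed and its boolean parameters are hypothetical.

```c
#include <stdbool.h>

/* Return true when the PSIZ output must be kept. */
static bool
sketch_psiz_needed(bool has_next_stage, bool next_stage_reads_psiz,
                   bool is_last_vertex_stage, bool topology_is_point_list)
{
   /* 1. A following stage consumes the value. */
   if (has_next_stage && next_stage_reads_psiz)
      return true;

   /* 2. Points are rasterized from this stage's output. */
   if (is_last_vertex_stage && topology_is_point_list)
      return true;

   return false;
}
```

In every other case the output is dead and can be removed.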
Zink always adds this output in its vertex (and other) shaders,
because it helps Zink avoid recompiling shader variants.
However, this has a performance impact for RADV because
it needs a scalar memory load. That becomes noticeable
at high primitive rates.
The Fossil stats are unremarkable because our DB doesn't include any
shaders from Zink or D9VK, but there are a few affected shaders.
Note that there may be an increase in LDS use in some GS. This is
because with PSIZ removed the ES per-vertex LDS size is smaller, so
we can squeeze more GS threads in the same workgroup.
Fossil DB stats on Sienna Cichlid:
Totals from 14 (0.01% of 128647) affected shaders:
CodeSize: 119884 -> 119732 (-0.13%)
LDS: 235008 -> 228864 (-2.61%); split: -2.83%, +0.22%
Instrs: 23076 -> 23048 (-0.12%)
Latency: 71667 -> 71625 (-0.06%)
InvThroughput: 19155 -> 18870 (-1.49%)
Copies: 1586 -> 1572 (-0.88%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10725>