When using Freedreno's RD output facilities, enable prefixing any output or
trigger file with the string specified in the FD_RD_DUMP_TESTNAME environment
option. This is similar to how the TESTNAME env can be used with libwrap to
provide a more descriptive name for the output RD file.
These prefixes can be quite long, e.g. the longest test case name in Vulkan CTS
is above 250 characters. For that reason the output name string in the
fd_rd_output struct is now allocated on the heap, and any path building using
the output name has its on-stack string buffer enlarged.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28442>
brw_broadcast and generate_mov_indirect both had similar comments, both
with typos ("insead"). One still referred to IVB bugs, while the other
dropped that during the compiler split. The one that dropped the
comment mentioned "both of these" issues, while citing only one issue;
there was in fact a third issue (no-Q/UQ) that wasn't mentioned in
either comment. One also had some bad grammar in the comments.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>
For BROADCAST and MOV_INDIRECT, required_exec_type was returning
brw_int_type(type_sz(t), false), which is an unsigned type. However,
get_exec_type(inst) returns the original type for either Q or UQ.
This meant that has_invalid_exec_type would detect a mismatch and
trigger lowering.
That lowering would insert new 64-bit MOVs, which would need to be
lowered on platforms which don't support Q/UQ. Except, we already
ran that lowering pass earlier. So, the unlowered Q/UQ MOVs would
reach the software scoreboarding pass, and trigger failures in the
inferred_exec_pipe() function, as no pipe is available to handle
64-bit integer operations.
It turns out that we don't need the region lowering pass to do
anything for these opcodes. The generator code for both BROADCAST
and MOV_INDIRECT already handle decomposing Q/UQ operations into
32-bit MOVs when they're not supported. And, it also implicitly
converts to integer types, even for floating point sources. The
inferred_exec_pipe function already special cases them to note
that they'll always be handled on the integer pipe, so that matches.
Just drop the region lowering code for these opcodes.
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>
We are overriding the type to Q/UQ, so we need to split to two MOVs
if 64-bit integer math is not supported. For reference, Meteorlake
does support 64-bit floats but would still not work correctly here.
See also brw_broadcast(), which does similar indirects but correctly
checks has_64bit_int instead of has_64bit_float.
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>
==134077== 96 bytes in 1 blocks are definitely lost in loss record 1 of 3
==134077== at 0x4840808: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==134077== by 0x6D6F690: vk_default_alloc (vk_alloc.c:26)
==134077== by 0x52EEEBE: vk_alloc (vk_alloc.h:48)
==134077== by 0x52EEEEE: vk_zalloc (vk_alloc.h:56)
==134077== by 0x52EF47E: xe_exec_process_syncs (anv_batch_chain.c:132)
==134077== by 0x52EF8F6: xe_execute_trtt_batch (anv_batch_chain.c:215)
==134077== by 0x5301670: anv_queue_submit_trtt_batch (anv_batch_chain.c:1697)
==134077== by 0x603D135: gfx125_write_trtt_entries (genX_cmd_buffer.c:6091)
==134077== by 0x5370B44: anv_sparse_bind_trtt (anv_sparse.c:595)
==134077== by 0x5370CFC: anv_sparse_bind (anv_sparse.c:629)
==134077== by 0x5370E6E: anv_init_sparse_bindings (anv_sparse.c:670)
==134077== by 0x5328037: anv_CreateBuffer (anv_device.c:5071)
Note to backporters: this is only for when xe.ko is being used and
ANV_SPARSE_USE_TRTT=1 is exported. This is not the regular code path.
Fixes: 18bd00c024 ("anv/trtt: don't wait/signal syncobjs using the CPU anymore")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28455>
update multimedia info to show num_instances and firmware version when valid
and video codec capabilities are shown if the query is supported and valid.
Multimedia info: from Navi21 ASIC is shown below.
Multimedia info:
vcn_decode = 2
vcn_encode = 2
vcn_enc_major_version = 1
vcn_enc_minor_version = 30
vcn_dec_version = 3
jpeg_decode = 1
codec dec max_resolution enc max_resolution
mpeg2 * 4096x4096 - -
mpeg4 * 4096x4096 - -
vc1 * 4096x4096 - -
h264 * 4096x4096 * 4096x2160
hevc * 8192x4352 * 7680x4352
jpeg * 4096x4096 - -
vp9 * 8192x4352 - -
av1 * 8192x4352 - -
v2: fix build error with _WIN32 builds
v3: rebase on 76425cdf23 (ac/gpu_info: Add vcn dec and enc version query)
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28252>
These fields are overlapping with the ones set by brw_message_desc(),
so the latter should be used instead. This fixes corruption of the
LSC message descriptors when inconsistent values are specified through
both helpers, which can happen if the 'inst->mlen' field is modified
during optimization (e.g. by opt_split_sends()).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>
We cannot rely on the immediate message descriptor having accurate
values for mlen and rlen at the IR level, since they are updated at
codegen time via 'inst->mlen' and 'inst->size_written', which could
end up with values inconsistent with the message descriptor if
e.g. the split sends optimization had an effect. Instead, define
helpers that do the computation without relying on the message
descriptor, and use the pre-existing
brw_message_desc_mlen()/brw_message_desc_rlen() helpers (fully
equivalent to the lsc helpers deleted here) during disassembly.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>
There's no real benefit to all this fixed size array nonsense. We're
heap allocating all of the nvk_descriptor_set structs anyway so having
an array-of-pointers walk instead of a linked list is no faster. Also,
this gets rid of another fixed-size array that can overflow.
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28482>