If a secondary command buffer recording a dynamic pass has the
VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT flag
then the rendering information for it should come from a
VkCommandBufferInheritanceRenderingInfo struct in the pNext
chain instead of the usual render pass information in the
VkCommandBufferInheritanceInfo struct. We take the information
from the new struct and build a render pass description from it
assuming a setup without a framebuffer (which is optional for
regular render passes too).
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
Because we are cloning these into primaries but the cloning is
superficial the command lists in them still point to the original
jobs and therefore paching new addresses would make the packing
code add the BO of the resume address to the original job. This
has two problems:
1. This is probably not what we want since the patching should only
be affecting the clone.
2. The bo_count of the clone job will not be updated accordingly and
we end up with a mismatch that will blow up when we submit.
The solution used here is a big hack, but works for now: we just
specify the address by its full offset rather than a relative
offset from a BO. We already have to add all the BOS in the resume
job manually which will include this the BO for the branch address
too, so this is fine.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
This was used only in secondary CL command buffers so it made
sense but with dynamic rendering we are going to also have
regular CLs also in secondaries (since secondaries can now
record full dynamic rendering passes), so renaming this to
INCOMPLETE makes more sense, since this is really what they
refer to: parts of CLs that are intended to be merged into
other primaries through branching.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
Dynamic rendering allows the client to suspend recording of a
render pass and have it continued in a different command buffer.
When a suspended command buffer is submitted to a queue, the
resuming command buffer must be te next one in submission order.
This means we need to be able to "merge" or "stitch" together
these command buffers at submit time.
To accomplish this, when we suspend a command buffer we emit
a BRANCH instruction to finish it. Then at submit time, when
we know the resuming job, we patch the BRANCH address with the
address of the resuming binning list (bcl). This is very similar
to how we execute secondary command buffers inside a render pass.
Also, only the last resuming job should flush the binning lists
in the bcl since we won't have processed the full binning command
list until we have execute the last linked job in the resume
list.
Since all jobs and command buffers in the suspend/resume chain
must be part of the same dynamic render pass, we only need to
produce and emit the render command list (rcl) once.
Since the way we implement stitching is that we branch from the
suspending job into the resuming one, the first job suspending
will link into all the resuming jobs necessary to complete the
chain, therefore, after the stitching is complete, we only want
to submit the first job in the suspend/resume chain, and thus,
we only produce and emit the rcl for this one job.
Notice as well that suspending only affects the last job
recording a dynamic rendering pass (the one that needs the branch
so we can resume execution with another job in another command
buffer).
Resuming affects all jobs in the dynamic render pass, since
we won't produce RCLs for them (as only the originating job
on the suspend/resume chain will emit the RCL).
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
With this we are able to run basic dynamic render passes, however,
we are still missing a few things like support for secondary
render passes, suspend/resume, etc that will be adding in follow-up
patches.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
This builds up on the previous patch and rewrites all the pipeline
code that fetched information from the pipeline's render pass (which
will be NULL for dynamic rendering) to instead fetch it through the
new rendering_info field, which will be valid for both regular and
dynamic render passes.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
With dynamic rendering the API formally eliminates render passes,
so the pipeline create info can now have a NULL render pass, in
which case rendering info must be provided via pNext struct
VkPipelineRenderingCreateInfo, or if this is missing too then
defaults to no multiview and no attachments.
Since we don't want to have separate paths all over the place
whenever we need to access render pass / rendering info for the
pipeline, we will always produce a valid vk_render_pass_state
struct with the relevant information even when we have a render
pass, so we can rely on that always being available.
A follow-up patch will rewrite all the places where we assumed
the existence of a render pass in the pipeline to instead fetch
the info it needs from this new field instead.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
Since the plan is to leverage our render pass infrastructure, we
also need to setup a framebuffer from the rendering info provided
with dynamic rendering.
We allocate the framebuffer lazily, only once, if a dynamic render
pass is used. To do this, we make it so it can hold the maximum
number of attachments possible with our hardware.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
The idea is to build a regular render pass from the rendering info
provided with dynamic rendering. We will use this when recording
dynamic render passes to leverage our existing implementation
for render passes with dynamic rendering.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
It is allowed for a shader to enable the multiview extension
even if the draw call in which it is used doesn't use multidraw.
This allows the shader to still use gl_ViewIndex, which will
always be 0 in that scenario.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
These have been incredibly useful when debugging regressions and weird
behavior in the Intel backend when trying to spill multiple registers
before retrying allocation. With them, we can print out not only what
register was chosen, but the benefit and cost. Seeing lists of chosen
registers where the benefit/cost was not sorted, and poor options were
chosen before better ones, led me to investigate a number of issues.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28257>
We also need the BV/LPAC info, so run the entire thing until we get to a
waitin or read $data without waitin, which the BV microcode does
because it is disabled by default and skips everything until a
CP_THREAD_CONTROL packet.
The actual microcode writes the packet table last, but my simple test
one doesn't, and there's no guarantee it will continue to do so.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26771>
This is similar to a7xx but slightly different, because it inverts the
sense of the bits (the firmware sets to 1 once it starts) and there are
only 2 processors. We didn't need this before because the waiting on
THREAD_SYNC only happens after setting the packet table.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26771>
This frees us from having to strip the trailing colon, and makes it
easier to add other identifiers like for section names. The downside is
that now we can't name a label with a reserved word like "mov" but that
doesn't seem too bad.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26771>
Some a7xx firmwares have junk after the LPAC jump table, and the old
method of hardcoding the location of the jumptable and the jumptable
offset when assembling and disassembling also doesn't work well on a7xx
where the location of the jumptable offset changes. Make jump tables
explicit in the disassembly with a ".jumptbl" directive which expands to
contain the contents of the jump table, make disassembly find the jump
table and emit the directive there. Then add the ability for a literal
to reference a label, which will be used for the jump table offset at
the beginning of the firmware. The disassembler guesses when a word is
actually the jump table offset and replaces it with a relocation.
This restores the ability to disassemble a630_sqe.fw and a650_sqe.fw and
reassemble to an identical binary without modifying the disassembly to
remove the jump table.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26771>
The int filter has precision issues when used with depth textures,
causing failures in tests like piglit
spec@arb_depth_texture@fbo-depth-gl_depth_component16-drawpixels.
In addition to that the texture compare unit for GL_ARB_shadow
operations seems to be only implemented in the float filter pipe,
so this change also fixes this functionality on GPUs where it
isn't emulated, e.g. GC3000.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28218>
Fixes the validation error
VUID-vkBindBufferMemory-memory-02985
reported, e.g. by running piglit
spec@amd_pinned_memory@map-buffer decrement-offset
v2: move setting the export type for user_mem get_export_flags and
also set alloc_info->external for user_mem to avoid adding
a conditional when finning the external buffer create info (Mike)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28244>
Fixes validation error:
VUID-VkShaderModuleCreateInfo-pCode-08737
AtomicFAddEXT: expected Pointer to point to a value of type Result
Type
%51 = OpAtomicFAddEXT %float %49 %uint_1 %uint_0 %50
when running
spec@nv_shader_atomic_float@execution@ssbo-atomicadd-float
Fixes: 9f6be8effb
zink: store and use alu types for ntv defs
v2: Fix commit message (Mike)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28243>
On MacOS/Apple/Dawin you can only get MESA to forward the GL funtions to
the systems OpenGL.framework or run SWRast directly. There is no way to use a gallium driver, even if they have been compiled.
The two gallium drivers of interest are SWRast and Zink, as the rest are hardware drivers and not relavent on MacOS.
The code changes add a new define GLX_USE_APPLE. This is used in combination with the existing GLX_USE_APPLEGL.
GLX_USE_APPLEGL calls the systems OpenGL.framework, Apple's OpenGL.
GLX_USE_APPLE calls the non-system OpenGL code, i.e. Gallium, hence the subtle naming difference. Apple systems are still used, just not the GL ones.
When GLX_USE_APPLE is defined the code will use the DRI/gallium driver sub-system so SWRast and Zink can selected at runtime on MacOS.
This also allows Zink to be run on MacOS, once it is fixed up.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28205>
DRI2 calls are different between Linux and MacOS.
Calling these Linux version on MacOS using xquartz fails with 'unknown' codes.
This patch hardwires a number of the utility DRI2 functions to use the MacOS
specific version that already exist for APPLEGL.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28205>