Instead of blatantly assuming with no assert that they're the same, add
a remap function. Also, be more careful about which enum we use where.
In the AST, we use GLenum and GL_FOO because we also need GL_ISOLINES.
When we translate to shader info, GS gets translated into mesa_prim
and tessellation gets translated into tess_primitive_mode which has
ISOLINES as a valid primitive value.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24821>
RADV compacts all descriptors for multi-planar images into one
combined image sampler, so it should be 96, and not eg. 192 for a two
planes format.
Fixes new CTS
dEQP-VK.binding_model.descriptor_buffer.ycbcr_sampler.*array.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26498>
Save a comparison, and move out the comparison to be more backend friendly.
Saves 2 instrs on AGX (as the remaining comparison now fuses with bcsel).
Results on AGX, all affected shaders in asphalt9.
total instructions in shared programs: 1813003 -> 1812611 (-0.02%)
instructions in affected programs: 119646 -> 119254 (-0.33%)
helped: 333
HURT: 0
Instructions are helped.
total bytes in shared programs: 11870344 -> 11867208 (-0.03%)
bytes in affected programs: 820888 -> 817752 (-0.38%)
helped: 333
HURT: 0
Bytes are helped.
and on Mali-G57:
total instructions in shared programs: 2677538 -> 2677205 (-0.01%)
instructions in affected programs: 206923 -> 206590 (-0.16%)
helped: 333
HURT: 0
Instructions are helped.
total cvt in shared programs: 14667.50 -> 14662.30 (-0.04%)
cvt in affected programs: 1953.64 -> 1948.44 (-0.27%)
helped: 333
HURT: 0
Cvt are helped.
total quadwords in shared programs: 1450664 -> 1450544 (<.01%)
quadwords in affected programs: 5064 -> 4944 (-2.37%)
helped: 15
HURT: 0
Quadwords are helped.
total threads in shared programs: 53282 -> 53309 (0.05%)
threads in affected programs: 27 -> 54 (100.00%)
helped: 27
HURT: 0
Threads are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26489>
Otherwise, we can transition to exact before p_jump_to_epilog, then
transition to WQM again and then back to exact:
p_jump_to_epilog //transitions to exact
p_logical_end //transitions to wqm
p_end_wqm //transitions to exact
We rely on ssa elimination to clean most of this up.
fossil-db (navi21):
Totals from 1 (0.00% of 79330) affected shaders:
Instrs: 111 -> 110 (-0.90%)
CodeSize: 572 -> 568 (-0.70%)
Copies: 16 -> 15 (-6.25%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25440>
This creates a function that can be used by both pipelines and shaders.
Note that we cannot yet call tu_CreateShadersEXT directly inside the
pipeline due to things like pipeline feedback, multiview, and so on, but
further extensions will hopefully bring us closer to that ideal.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25679>
With shader objects, we won't have the pipeline layout available. This
means that the current way we implement dynamic offset descriptors in
combination with fast-linking and independent descriptor sets, where we
use the pipeline layout when fast-linking that has pre-computed offsets
for each descriptor set, won't work. Instead we need to piece together
the sizes of the descriptors in each descriptor set from the shaders.
This is already effectively what we do when we stitch together the
pipeline layout when fast-linking, but we need to make it work with just
the shaders.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25679>
OpBreak and OpBSsy aren't very SSA friendly as they require bar_in and
bar_out to be assigned the same register. We need to encure that RA
knows about this restriction. For now, we just special-case these two
instructions. In the future we may want a more generic mechanism for
this but it's not worth it for just two instructions.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26463>
The NIR intrinsics now take and return a barrier whenever one is
modified instead of modifying in-place. In NAK, we give the internal
instructions the same treatment and convert everything to use barrier
SSA values and RegRefs. In nak_from_nir, we move all barriers to/from
GPRs. We'll clean up the massive pile of OpBMov later.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26463>
OpBar and OpBSync both stall the thread until other threads get to that
point. These instructions must have .yld set. Also, warp barrier ops
don't support the usual instruction barrier mechanism so they should be
marked as having a fixed latency. It's unclear if the barrier file is
internally scoreboarded or if warp barrier ops just stall the whole
thread. In either case, this seems necessary.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26463>
In a scenario like:
CmdBindPipeline(stipple line enabled)
CmdDraw()
CmdBindPipeline(stipple line disabled)
CmdDraw()
The second draw wasn't resetting the stipple line state and this might
have caused issues, though it's uncovered by VK CTS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26429>
Pre v10 Mali GPU were describing GPU jobs as a chain of job descriptors,
but new generations moved to a command stream based approach. The
pan_cs.{h,c} name was chosen based on the assumption this job chain
would replace the command stream we have on other GPUs, things will
become a lot more confusing now that we have a real command stream.
Let's rename these files before it happens. Given all the helpers in
there are either emitting descriptors, and calculating values to be
put in such descriptors, pan_desc.{c,h} sounds like an acceptable
name.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26356>
What pan_scoreboard manipulates is a job chain, how dependencies
between jobs is implemented is an implementation detail, and shouldn't
leak through the name.
Let's rename pan_scoreboard.h pan_jc.h, and prefix the functions
accordingly.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26356>
We never steal the NIR program or free it explicitly, and the state
tracker expects drivers to take ownership of the program object. Since
panfrost doesn't need to keep the original NIR shader around for
compute, let's just free it before returning.
Fixes: 40372bd720 ("panfrost: Implement a disk cache")
Cc: stable
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26424>
Enables 10 bpc / color depth 30 bit support under XOrg
with X11/GLX/OpenGL.
Successfully tested with RaspberryPi OS 11,
running X-Server 1.20, and also with Weston,
on a RaspberryPi 400 on top of current Mesa 24.0-devel.
Alejandro Piñeiro also performed a GLES CTS run
with successful result, citing him:
"Full GLES 31 CTS finished with 0 failres. So all ok"
Note that this commit was originally developed and
successfully tested by myself against Mesa 23.1-devel
from February 2023, and therefore should apply and work
cleanly against recent Mesa stable branches. One could
see this commit as a trivial compatibility fix against
X-Server 1.20 / modesetting-ddx 1.20, which is why I'm
also nominating this commit for the current 23.3 stable
branch, and also the 23.2 stable branch, so it may make
it into RaspberryPi OS 12. Thanks for the consideration.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Backport-to: 23.2
Backport-to: 23.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26472>