If we pass a physical 2D texture descriptor we should also pass 2
coords. Otherwise it just uses the random content in the second
register which ends up funny sometimes.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20696>
If the previously emitted graphics pipeline uses the value A for
col_format_non_compacted and the new bound graphics pipeline uses B.
At bind time, radv_cmd_state::col_format_non_compacted will be set to
B and the rbplus flag will be dirtied. But if there is no draws and a
new graphics pipeline is bound with the same value as A, the next
draw will emit the rbplus state with B instead of A.
This can be basically triggered with meta operations after drawing
because the driver saves/restores the bound pipeline.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8073
Fixes: 11469f7553 ("radv: copy the non-compacted color format at pipeline bind time")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20692>
load_preamble is intended to be almost free (costing at most a move), and it
does not have special bounds checking requirement, so it's ok to select with it.
With this, drivers that use nir_opt_preamble together with a late call to
peephole_select can optimize sequences like:
if (x) {
<uniform-on-uniform calculation>
} else {
<different uniform-on-uniform calculation>
}
to simply
bcsel(x, <uniform register 0>, <uniform register 1>)
rather than emitting needless control flow / branching over some moves.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20597>
multiplying by the array size is always wrong for this case, and not
doing so allows for some simplification and better inlining, though
the output results are identical
the one corner case is clip/cull distance, which need special handling
since they're arrays with vec semantics
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20678>
This is heavily copy-pasted from a patch of Ian Romanick, including the
commit message.
Previously, this pass always generated fcsel for bcsel. This was the
only place that generate fcsel, so various drivers assumed (and needed!)
that src0 was a Boolean with 0.0 or 1.0 as the only values.
Specifically, many DX9 / GL_ARB_vertex_program platforms lack a CMP
instruction in vertex shaders. In those cases, they would use LRP to
implement fcsel. The bummer is that many plaforms have a real fcsel
instruction, and those platforms would benefit from other places
generating that opcode.
Instead of leaving assumptions in drivers about the sources of an opcode
that they can't really support, allow them to control the way the
lowering pass translates bcsel. Two flags are used to control this:
- If the driver sets has_fused_comp_and_csel in nir_options, fcsel_gt
will be used. Since the Boolean value is 0.0 or 1.0, this is
equivalent to fcsel.
- If the parameter has_fcsel_ne is set, fcsel will be used. This is the
old path.
- Otherwise, the lowering pass assumes we're on a crufty, old DX9 vertex
program, and it emits flrp.
With this, the assumptions about src0 of fcsel in NTT can be removed.
If a platform can't handle fcsel, it should ensure that the lowering
pass won't generate it.
No change in shader-db.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20162>
DX9 PS 1.x / GL_ARB_fragment_program shaders that have been converted to
GLSL are littered with patterns like
ps_r1.x = ((ps_t0.y >= 0.0) ? ps_r1.x : ps_c1.y);
This is because CMP is a fundamental opcode in those earlier shading
languages. i915 supports this opcode natively, but there's no way to
get it directly into the backend. Instead, NIR and NTT generate some
combination of fcsel and sge and hope for the best.
i915
total instructions in shared programs: 49032 -> 48897 (-0.28%)
instructions in affected programs: 4173 -> 4038 (-3.24%)
helped: 39
HURT: 0
total temps in shared programs: 2795 -> 2790 (-0.18%)
temps in affected programs: 22 -> 17 (-22.73%)
helped: 5
HURT: 0
total const in shared programs: 4976 -> 4967 (-0.18%)
const in affected programs: 203 -> 194 (-4.43%)
helped: 9
HURT: 0
GAINED: shaders/trine/fp-13.shader_test FS
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20162>
This is a global register which isn't settable by userspace contexts.
It also shouldn't appear in any of our aubinator decodes from error
states or aub dumps, as no userspace batch should be setting it.
So it's not very valuable to have here. Just makes us think we can
set it. Plus, a lot of the field definitions changed a bunch, and
would need updating.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20627>
Some format modifiers change the number of planes used by an image.
For instance AMD DCC modifiers uses 2 or 3 planes. However the
format modifier was ignored in the PIPE_RESOURCE_PARAM_NPLANES
get_param hook.
Fix this by using get_dmabuf_modifier_planes() instead of
util_format_get_num_planes().
This fixes wlroots-based compositors under zink.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: c025cb9ee9 ("zink: fix dmabuf plane returns")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20395>