From the Vulkan spec 1.2.146:
"VK_ACCESS_MEMORY_READ_BIT specifies all read accesses. It is
always valid in any access mask, and is treated as equivalent
to setting all READ access flags that are valid where it is
used."
"VK_ACCESS_MEMORY_WRITE_BIT specifies all write accesses.
It is always valid in any access mask, and is treated as
equivalent to setting all WRITE access flags that are valid
where it is used."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3241
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5807>
Otherwise LLVM does not see the pointers as allowing speculative
loads.
The pipeline-db results are pretty wild, but mostly what is to be
expected from allowing more code movement in LLVM:
Totals from affected shaders:
SGPRS: 157728 -> 168336 (6.73 %)
VGPRS: 158628 -> 158664 (0.02 %)
Spilled SGPRs: 10845 -> 24753 (128.24 %)
Spilled VGPRs: 13 -> 13 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 8 -> 8 (0.00 %) dwords per thread
Code Size: 17189180 -> 17313712 (0.72 %) bytes
LDS: 204 -> 204 (0.00 %) blocks
Max Waves: 5700 -> 5687 (-0.23 %)
Wait states: 0 -> 0 (0.00 %)
This gives some boosts for shaders we can move a descriptor load
outside a loop.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3159>
This updates a3xx/a4xx/a5xx to fix the fetchsize to "PITCHALIGN" (called
"MINLINEOFFSET" by the a3xx docs), and some simplifications to make things
more like a6xx. Also similar simplifications for a2xx layout code.
The pitch can always be determined using a simple calculation from the base
level pitch, so don't pre-calculate a pitch for each mipmap level.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5796>
I want the docs to be discoverable next to the code, and sphinx insists
that all docs are under the top-level docs dir (sigh). We can't symlink
from that dir to .gitlab-ci because windows builds can't do symlinks, so
link back the other direction.
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5510>
In the case where SSA use/def chains are broken, NIR prints out a very
cryptic error and then aborts. This abort happens during validation
rather than after the print is complete, hiding any other errors that
may have been found. One might think, "So what? Fix your use/def issue
first." However, what makes this especially bad is that, when use/def
chains are broken, there's usually a much nicer error inline in the
shader that would have been printed had we not aborted early so the
current behavior simply ensures you get the most cryptic error possible
in an already difficult-to-debug case.
While we're at it, we remove the one other case of abort() which is in
the validation of phi instruction sources.
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5809>
according to EXT_multiview_draw_buffers, gl_FragColor outputs to all available
render targets when used, so we need to translate this to gl_FragData[PIPE_MAX_COLOR_BUFS]
in order to correctly handle more than one color buffer attachment
this fixes the rest of spec@arb_framebuffer_object tests in piglit
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5687>
ntq_setup_fs_inputs and ntq_setup_gs_inputs sort the inputs according to
the driver location. This input array is then used to calculate the VPM
offset for the outputs in the previous stage. However, it wasn’t taking
into account variables that are packed into a single varying slot. In
that case they would have the same driver_location and are
distinguished by location_frac.
This patch makes it additionally sort by location_frac when the driver
locations are equal. This can happen when the compiler packs varyings
that are sized less than vec4. Without this fix, when the VPM is used to
transmit data free-form between the stages (such as VS->GS) then it
would end up writing to inconsistent locations.
Fixes dEQP tests such as:
dEQP-GLES31.functional.primitive_bounding_box.lines.global_state.
vertex_geometry_fragment.default_framebuffer_bbox_equal
Fixes: 5d578c27ce ("v3d: add initial compiler plumbing for geometry shaders")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5787>
gcc is not smart enough to see that
enum pipe_format dst_fmt;
...
switch (data_size) {
case 16:
dst_fmt = PIPE_FORMAT_R32G32B32A32_UINT;
...
break;
case 12:
/* RGB32 is not a valid RT format. This will be handled by the pushbuf
* uploader.
*/
break;
case 8:
dst_fmt = PIPE_FORMAT_R32G32_UINT;
...
break;
case 4:
dst_fmt = PIPE_FORMAT_R32_UINT;
...
break;
case 2:
dst_fmt = PIPE_FORMAT_R16_UINT;
...
break;
case 1:
dst_fmt = PIPE_FORMAT_R8_UINT;
break;
default:
assert(!"Unsupported element size");
return;
}
...
if (data_size == 12) {
...
return;
}
Does not result in dst_fmt being uninitialized when it is used so
lets just initialise it to silence the warning.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5766>