It's not good for performance, but it can be useful for debugging.
Running single-wave GS workgroups could work around any LDS race conditions.
Setting the workgroup size to 64 reliably works around
GLCTS *primitive_counter*line failures, indicating streamout data
corruption with multi-wave GS workgroups.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38328>
This avoids u_upload_data_ref() when cb0 is bound. The u_upload_*_ref()
paths are still problematic to mix with uploaders that the front-end
uses with explicitly managed releasebufs, but this at least side-steps
the issue, and is a legit fix on its own.
Cc: mesa-stable
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38896>
Triggering the rollover where the old upload buffer is released is a
good way to catch bugs with a releasebuf being dropped too soon (i.e.
while the frontend still needs a reference).
This makes it easy to reproduce firefox crashes in any driver where
pipe->const_uploader == pipe->stream_uploader.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38896>
URB messages on Xe2 are LSC messages with FLAT addressing. We can
specify a S19 immediate offset in the extended message descriptor,
which should be more than adequate to hold any offsets we need.
We wrote the original URB code before implementing that, and never
doubled back to take advantage of it. But doing so can drop ADDs
near every URB access.
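
As a rough illustration of the folding described above, here is a minimal
sketch of the range check involved. It assumes "S19" means a 19-bit
two's-complement immediate (range -262144..262143); the struct and function
names are hypothetical and are not the actual brw compiler code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: S19 is a signed 19-bit two's-complement field. */
#define S19_MIN (-(1 << 18))
#define S19_MAX ((1 << 18) - 1)

/* Hypothetical helper: does `offset` fit in the S19 immediate field? */
static bool
fits_in_s19(int64_t offset)
{
   return offset >= S19_MIN && offset <= S19_MAX;
}

/* Hypothetical stand-in for a URB access: a dynamic address plus the
 * immediate offset carried in the extended message descriptor.
 */
struct urb_access {
   uint32_t addr_reg; /* register holding the dynamic part of the address */
   int32_t imm;       /* immediate offset already folded into the message */
};

/* Try to fold a constant offset into the immediate instead of emitting
 * an ADD. Returns false (leaving the access untouched) if the combined
 * offset would not fit, in which case the caller still needs the ADD.
 */
static bool
try_fold_const_offset(struct urb_access *access, int32_t const_offset)
{
   assert(fits_in_s19(access->imm));

   int64_t total = (int64_t)access->imm + const_offset;
   if (!fits_in_s19(total))
      return false;

   access->imm = (int32_t)total;
   return true;
}
```

Every offset that folds this way is one ADD that no longer has to be
emitted before the message.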
fossil-db results on Battlemage:
Totals:
Instrs: 232239759 -> 231432254 (-0.35%)
Cycle count: 34044435848.0 -> 34055507100.0 (+0.03%); split: -0.00%, +0.04%
Spill count: 520370 -> 520362 (-0.00%); split: -0.00%, +0.00%
Fill count: 470790 -> 470803 (+0.00%); split: -0.00%, +0.00%
Max live registers: 72111853 -> 72111369 (-0.00%); split: -0.00%, +0.00%
Totals from 227920 (28.89% of 788851) affected shaders:
Instrs: 59841897 -> 59034392 (-1.35%)
Cycle count: 683385208.0 -> 694456460.0 (+1.62%); split: -0.14%, +1.76%
Spill count: 17278 -> 17270 (-0.05%); split: -0.10%, +0.06%
Fill count: 17481 -> 17494 (+0.07%); split: -0.03%, +0.10%
Max live registers: 23052652 -> 23052168 (-0.00%); split: -0.00%, +0.00%
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38899>
GLSL defines gl_SampleMaskIn as:
"a fragment language that indicates the set of samples covered
by the primitive generating the fragment during multisample
rasterization"
When variable rate shading is enabled, a single invocation might cover
multiple samples. The lowering done in nir_lower_single_sampled() does
not account for that case, so add an option to selectively disable it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38641>
This shows off how we don't need to pass an explicit size per CRB instance
in our non-growable CSes.
However, I don't like the additional indentation I added to make a CRB go
out of scope when I needed it to.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38762>
Loosely based on freedreno's, but simplified since a lot of overflow
handling was already there in tu_cs. It successfully catches issues of:
- Overflowing the CRB reservation.
- Starting a new CRB with one in progress.
- Emitting a pkt4 while a CRB emit is in progress.
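
The three checks above can be sketched with a small state tracker. This is
only an illustration: every name here is hypothetical, and it reports errors
through return codes for clarity, which is not necessarily how the tu_cs
debug code does it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum crb_status {
   CRB_OK,
   CRB_ERR_OVERFLOW,    /* more registers emitted than were reserved */
   CRB_ERR_NESTED,      /* a new CRB was started with one in progress */
   CRB_ERR_PKT4_IN_CRB, /* a pkt4 was emitted while a CRB was open */
};

/* Hypothetical debug state attached to a command stream. */
struct cs_debug {
   bool crb_active;
   uint32_t crb_reserved; /* register writes reserved for this CRB */
   uint32_t crb_emitted;  /* register writes emitted so far */
};

static enum crb_status
crb_begin(struct cs_debug *cs, uint32_t reserved)
{
   if (cs->crb_active)
      return CRB_ERR_NESTED;

   cs->crb_active = true;
   cs->crb_reserved = reserved;
   cs->crb_emitted = 0;
   return CRB_OK;
}

static enum crb_status
crb_add_reg(struct cs_debug *cs)
{
   assert(cs->crb_active);

   if (cs->crb_emitted >= cs->crb_reserved)
      return CRB_ERR_OVERFLOW;

   cs->crb_emitted++;
   return CRB_OK;
}

static void
crb_end(struct cs_debug *cs)
{
   cs->crb_active = false;
}

static enum crb_status
emit_pkt4(const struct cs_debug *cs)
{
   return cs->crb_active ? CRB_ERR_PKT4_IN_CRB : CRB_OK;
}
```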
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38762>
Replace the duplicated swapchain image detection pattern across all
Vulkan drivers with the new wsi_common_is_swapchain_image() helper.
Since the swapchain handle can be extracted from VkImageCreateInfo's
pNext chain inside wsi_common_create_swapchain_image(), remove the
now-redundant VkSwapchainKHR parameter from that function.
This removes the #ifdef guards for Android/WSI platforms from each
driver, as the helper now handles this uniformly.
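
For readers unfamiliar with the pattern, the pNext-chain lookup mentioned
above works roughly as follows. This is a sketch using mock types so that it
compiles without the Vulkan headers; the mock_* names are hypothetical
stand-ins, and the real helper's signature and its Android handling are not
shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock stand-ins for VkStructureType values. */
enum mock_stype {
   MOCK_STYPE_IMAGE_CREATE_INFO,
   MOCK_STYPE_IMAGE_SWAPCHAIN_CREATE_INFO,
   MOCK_STYPE_OTHER,
};

/* Mock of the common header every chained struct starts with. */
struct mock_base_in {
   enum mock_stype sType;
   const void *pNext;
};

/* Mock of the chained struct that carries the swapchain handle. */
struct mock_image_swapchain_info {
   enum mock_stype sType;
   const void *pNext;
   void *swapchain;
};

/* Walk the chain starting at the image create info and return the
 * swapchain handle if a swapchain info struct is present, else NULL.
 */
static void *
mock_find_swapchain(const void *image_create_info)
{
   assert(image_create_info != NULL);

   for (const struct mock_base_in *s = image_create_info; s != NULL;
        s = s->pNext) {
      if (s->sType == MOCK_STYPE_IMAGE_SWAPCHAIN_CREATE_INFO)
         return ((const struct mock_image_swapchain_info *)s)->swapchain;
   }
   return NULL;
}

static bool
mock_is_swapchain_image(const void *image_create_info)
{
   return mock_find_swapchain(image_create_info) != NULL;
}
```

Because the handle can be recovered from the chain like this, a caller does
not need to pass the swapchain as a separate argument.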
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38541>
Now that input attachment lowering is factored out, there's no reason to
be passing the whole shader variant around here. This both makes things
a lot more clear and gives us more flexibility about when we call it,
allowing us to potentially call it once per-shader instead of once
per-variant.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38821>
We were running this in the preprocess step and then trusting that it
would clean up everything before we got to the back-end. However, we
were running the entire optimization loop in between as well as drivers
potentially adding stuff (since panvk has its own passes after
postprocess). Instead, this should be one of the last things run, right
before we go into the back-end.
Cc: mesa-stable
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38821>