We would set the `src` flag on the interval of reloaded sources.
However, the interval might be merged with its parent when inserted and
the parent wouldn't have this flag set. This caused the parent interval
to potentially be reused to reload later sources. Fix this by setting
the `src` flag on the top-level interval after insertion.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33810>
Now that every ANGLE use is covered by tag consistency checks
(structured tagging), we don't need the USE_ANGLE flag anymore, because
if we have ANGLE_TAG set, it means that ANGLE is required in this job.
In detail, it means that the test job has inherited ANGLE_TAG from
`.container-builds-angle`.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33421>
When spilling values, we can detect when a value is known to be constant
and avoid spilling it out to memory and/or GPRs by just re-materializing
the constant value instead of filling.
Shader-db stats:
Totals:
CodeSize: 30101168 -> 30052896 (-0.16%); split: -0.19%, +0.03%
SLM size: 146536 -> 146524 (-0.01%)
Static cycle count: 6952994 -> 6939532 (-0.19%); split: -0.30%, +0.10%
Spills to memory: 174139 -> 173625 (-0.30%)
Fills from memory: 174139 -> 173625 (-0.30%)
Totals from 555 (8.05% of 6891) affected shaders:
CodeSize: 18945520 -> 18897248 (-0.25%); split: -0.30%, +0.04%
SLM size: 128952 -> 128940 (-0.01%)
Static cycle count: 4344118 -> 4330656 (-0.31%); split: -0.47%, +0.16%
Spills to memory: 174139 -> 173625 (-0.30%)
Fills from memory: 174139 -> 173625 (-0.30%)
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33785>
Fixes: 692b5fa9f2 ("anv: Add shader to copy acceleration structures")
This commit fixes the future test "sparse_binding_structures" for
"header_bottom_address" for ray tracing pipeline.
Even on 48-bit ray tracing (Xe1/2), the software-defined part
instance_leaf_part1.bvh_ptr has to be in canonical form for copy.comp
to deference a bvh, which means we have to preserve the upper 16bits.
This is especially relevant in cases where the acceleration structure buffer
is located high, such as sparse buffer.
Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33745>
Fixes: 2fe57947e3 ("anv: Implement encode shader to fit in ANV BVH")
This commit resolves the failures in the future tests
"sparse_binding_structures" for rayquery. Sparse buffers' heaps are
located high, and since it's in canonical form, the higher 16bits are
all set to 1. However, the existing encoder did not expect any non-zero
values at the higher 16bits. As a result, the instance flags got
corrupted, causing most triangle tests to fail.
Thanks for Paulo providing insights about sparse buffer properties.
Co-developed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33745>
... since `BoxedHandleManager` should, well, manager the handles.
This simplifies `VkDecoderGlobalState` a little bit and should also
allow us to remove a bunch of functions that no longer need to
depend on `VkDecoderGlobalState`.
Test: cvd create --gpu_mode=gfxstream_guest_angle_host_swiftshader
Test: cvd snapshot_take --force \
--auto_suspend \
--snapshot_path=/tmp/snapshot1
Test: cvd reset -y
Test: cvd create --snapshot_path=/tmp/snapshot1
Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33740>
Add vkGetFenceStatus and vkWaitForFences functions to the
global state tracking list for the host.
This will allow adding more functionality to the fences
and perform additional operations before waiting for and
signaling them.
Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33740>
... to break the recursive behavior of the replay calling into
VkDecoderSnapshot so that locking and thread safety annotations can be
preserved in VkDecoderSnapshot.
Follow up to aosp/3412302.
Test: cvd create --gpu_mode=gfxstream_guest_angle_host_swiftshader
Test: cvd snapshot_take --snapshot_path=<>
Test: cvd create --snapshot_path=<>
Reviewed-by: Aaron Ruby <aruby@qnx.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33740>
Bifrost LDEXP.v2f16 takes a 16-bit exponent, which requires messy
lowering. The codegen for this is quite bad currently, but would be
improved by implementing unpack_32_2x16_split_*, and by fusing
comparisons with CSEL.
The main alternative is converting to F32, then LDEXP.f32, then
converting back to F16. This has better codegen for dynamic exponents
currently, but worse in the common case with a constant exponent where
all the saturating cast logic can be folded.
Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.ldexp.compute.vec2
when shaderFloat16 is enabled in panvk.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
On vulkan, truncating to S/U16 before converting is not valid, because
out-of-range conversions are specified to be correctly rounded. IEEE 754
requires that out-of-range values round to ±inf with RTNE and ±F16_MAX
with RTZ.
On gl, truncating is valid for U16->F16, because out-of-range int->float
conversions are undefined behavior. For S16->F16, it is not valid
because S16_MAX < F16_MAX, so some in-range values will be truncated as
well.
Instead, just handle S/U16->F16 as S/U16->F32->F16.
Fixes dEQP-VK.spirv_assembly.instruction.compute.convertstof.int32_to_float16_*
when shaderFloat16 is enabled in panvk.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: be74b84e6f ("pan/bi: Fill in some more conversions")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
Right now the aliasing/overlapping checks are only done with index
0. I guess that was done because variables don't get a different
internal location even if you have a different index.
But doing that, the checks would not detect a case like this:
layout(location = 0, index = 1) out vec4 color;
layout(location = 0, index = 1) out vec4 factor;
That was used on the following piglit parser test:
spec/arb_explicit_attrib_location/1.10/compiler/layout-13.frag
And as the spec included on that test, is a link error case:
" * if more than one varying out variable is bound to the same
number and index; or"
This commit executes the aliasing checks for index 1 too, and moves
the skip down, to only skip if the current variable and all previous
location-assigned variables has different index and location.
The bad news is that now such assigned variables need to be tracked on
OpenGL-ES. Before that commit that was avoided.
With this commit the mentioned parser test properly fails to link in
any driver.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33093>