Vulkan says you can do things like image resolves or blits on transfer
queues, but D3D only allows literal copies. We could try to emulate
a Vulkan transfer-only queue backed by multiple D3D queues, where we
use the copy queue when possible but fall back to compute when needed,
but let's wait until there's a good reason to do that...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23218>
Forcing bindless on is nice for apps that don't use EXT_descriptor_indexing,
but for the CTS, whenever EXT_descriptor_indexing is supported, it's used.
To be able to more thoroughly test the not-bindless path, add a debug flag
to turn it off.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23218>
pvr_gpu_upload() can't be used in the case of pvr_gpu_upload_usc() as it expects
the source and destination buffers to be the same size. This isn't the case
because pvr_gpu_upload_usc() adds some padding bytes to the size passed in by
the caller.
Fixes: 547a10f870 ("pvr: switch pvr_cmd_buffer_alloc_mem to use pvr_bo_suballoc")
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23185>
Usually the register demand before an instruction would be considered part
of the previous instruction, since it's not greater than the register
demand for that previous instruction. Except, it can be greater in the
case of an definition fixed to a non-killed operand: the RA needs to
reserve space between the two instructions for the definition (containing
a copy of the operand).
fossil-db (navi21):
Totals from 5 (0.00% of 135636) affected shaders:
PreVGPRs: 35 -> 40 (+14.29%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8807
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22446>
We make the compiler assume the worst possible case (it's not great
because we have to burn 32 GRFs of potential input data) and then we
push the actual value through push constants.
This enables VK_EXT_gpl usage on zink, which causes two traces to change
their results. Raven is an imperceptible change, blender has missing
original pngs but looks plausible.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22378>
We need to do 3 things to accomplish this :
1. make all the register access consider the maximal case when
unknown at compile time
2. move the clamping of load_per_vertex_input prior to lowering
nir_intrinsic_load_patch_vertices_in (in the dynamic cases, the
clamping will use the nir_intrinsic_load_patch_vertices_in to
clamp), meaning clamping using derefs rather than lowered
nir_intrinsic_load_per_vertex_input
3. in the known cases, lower nir_intrinsic_load_patch_vertices_in
in NIR (so that the clamped elements still be vectorized to the
smallest number of URB read messages)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22378>
nir_registers are only supposed to be used temporarily. They may be created by a
producer, but then must be immediately lowered prior to optimizing the produced
shader. They may be created internally by an optimization pass that doesn't want
to deal with phis, but that pass needs to lower them back to phis immediately.
Finally they may be created when going out-of-SSA if a backend chooses, but that
has to happen late.
Regardless, there should be no case where a backend sees a shader that comes in
with nir_registers needing to be lowered. The two frontend producers of
registers (tgsi_to_nir and mesa/st) both call nir_lower_regs_to_ssa to clean up
as they should. Some backend (like intel) already depend on this behaviour.
There's no need for other backends to call nir_lower_regs_to_ssa too.
Drop the pointless calls as a baby step towards replacing nir_register.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23181>
All platforms supported by Iris will have aperture_bytes set as 4Gb.
Also this value is not the actual aperture in i915, it actualy is the
GGTT size.
So here replacing it by the sram size, something that will vary
depending in the amount of RAM available.
This fix some tests with Xe KMD, as it is not setting aperture_bytes.
And will not do that as there is no UAPI to fetch this information
and it is not planned to it to Xe UAPI.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Ack-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22969>
disable_tmu_pipelining has been recently set to false on two
strategies that should set it to true.
Fixes the following CTS test:
dEQP-VK.graphicsfuzz.spv-stable-maze-flatten-copy-composite
Fixes: c950098ab - broadcom/compiler: move buffer loads to lower register pressure
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23207>