From Bspec 46959, a programming note applicable to Gfx12+:
"Since HZ_OP has to be sent twice (first time set the clear/resolve
state and 2nd time to clear the state), and HW internally flushes the
depth cache on HZ_OP, there is no need to explicitly send a Depth
Cache flush after Clear or Resolve."
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24027>
We're interested in a Vulkan-only stack in Chrome OS, where Android's GLES
would be provided by ANGLE-over-Venus-over-ANV. Let's get some testing
covering ANGLE-on-ANV first.
This is structured as a single partial job pre-merge to catch most
regressions, and a longer manual job to do full coverage for when you need
to update the xfails list.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20163>
anv_vma_alloc() returns a canonical address, but explicit_address is a
regular address. This mismatch can potentially cause issues.
So here making bo->offset as always canonical address by converting it
in the explicit case and fixing the only caller that was caling
anv_bo_vma_alloc_or_close() with a canonical address.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23977>
There is no mention in spec about subtract one of the number of
threads, also Iris and blorp code don't subtract.
Alchemist PRMs: Volume 2a: Command Reference: Instructions: CFE_STATE: Maximum Number of Threads:
Normally set to the maximum number of threads: (# EUs) * (# threads/EU)
Cc: mesa-stable
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23973>
Depth/stencil surfaces cannot be linear but they can be 1D, so they
end up being tile64 when sparse (as we force every sparse resource to
be either tile64 or linear).
According to the "1D surfaces" page from BSpec, our driver treats 1D
surfaces as 2D surfaces with a height of 1 texel, since we don't
enable the corresponding bit from HAS_SLICE_CHICKEN7. And since we
support 2D surfaces, we should also support 1D.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22974>
On DG2 I ran into a case where the surface state was not being decoded
with INTEL_DEBUG=bat. This is because the surface states are not part
of a state pool there anymore. Instead BO are allocate manually and
placed in vma heap.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 96c33fb027 ("anv: enable direct descriptors on platforms with extended bindless offset")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23891>
The Vulkan specification indicates that if memory types have
properties which are a strict subset of another type's, then they
should appear before that memory type. Otherwise the specification
does not require a specific ordering of memory types.
But, it appears that Aztec Ruins and the Vulkan CTS make an assumption
that the first host-accessible memory type is host-coherent and select
it when they expect data written by the CPU to become visible without
calling vkFlushMappedMemoryRanges(), even though flushing is required
by the spec, which leads to misrendering and hangs on MTL platforms.
We found that other drivers also put a host-coherent, but not cached
memory type as the first host-accessible memory type, so let's do the
same in order to match the expectations of such broken applications.
Host-coherent uncached memory types are currently implemented with a
WC CPU map on non-LLC platforms, so there shouldn't be a huge
performance penalty from this: If an application intends to do heavy
R/W CPU access on a memory range it's expected to loop over the
available memory types and select one marked as host-cached -- If an
application fails to do that and simply selects the first available
type it seems more robust to stay on the safe side and give them a
host-coherent type rather than a cached one.
Rework:
* Jordan: Add initial explanation to body of commmit message.
* Curro: Add additional comments to commit message.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22878>
Although the following is based on this observations for OpenGL, we
probably need this for Vulkan as well.
KHR-GL46.texture_buffer.texture_buffer_operations_ssbo_writes writes
to an SSBO in a compute program, then issues a memory-barrier, which
causes us to add a DC-flush. Then a second compute program samples
from the SSBO written by the first compute program.
Although we expected the DC-flush to make the writes available to the
second compute program, on MTL this wasn't the case. Adding the
"Untyped Data-Port Cache Flush" fixes this.
The PRM indicates that compute programs must set "Untyped Data-Port
Cache Flush" to flush some LSC writes when flushing HDC. Although we
are setting DC-flush, and not HDC-flush, it does appear that the
following reference might also apply to DC-flush.
In the Intel(R) Arc(tm) A-Series Graphics and Intel Data Center GPU
Flex Series Open-Source Programmer's Reference Manual, Vol 2a: Command
Reference: Instructions, PIPE_CONTROL, HDC Pipeline Flush (DWord 0,
Bit 9), there is a programming note:
> When the "Pipeline Select" mode is set to "GPGPU", the LSC Untyped
> L1 cache flush is controlled by "Untyped Data-Port Cache Flush" bit
> in the PIPE_CONTROL command.
Ref: a8108f1d44 ("anv: Add missing untyped data port flush on PIPELINE_SELECT")
Ref: bd8e8d204d ("iris: Add missing untyped data port flush on PIPELINE_SELECT")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23176>
In the Intel(R) Arc(tm) A-Series Graphics and Intel Data Center GPU
Flex Series Open-Source Programmer's Reference Manual, Vol 2a: Command
Reference: Instructions, PIPE_CONTROL, HDC Pipeline Flush (DWord 0,
Bit 9), there is a programming note:
> When the "Pipeline Select" mode is set to "GPGPU", the LSC Untyped
> L1 cache flush is controlled by "Untyped Data-Port Cache Flush" bit
> in the PIPE_CONTROL command.
Ref: a8108f1d44 ("anv: Add missing untyped data port flush on PIPELINE_SELECT")
Ref: bd8e8d204d ("iris: Add missing untyped data port flush on PIPELINE_SELECT")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23176>