intel/common has a build dependency on intel/dev so the later should
not have any dependendies on the first.
So here moving the definition of intel_engine_class to
intel_device_info.h because it is used in intel_device_info struct
and then including intel_device_info.h in intel_engine.h.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25233>
Generate intel_device_serialize.c from a mako template, providing
functions to dump and parse intel_device_info.
intel_device_info.py declares python objects representing all type
declarations associated with intel_device_info. It is used as a data
source for intel_device_serialize_c.py
intel_device_serialize_c.py emits a c++ file with routines to dump
or load all struct members to/from json. The json format is a direct
translation of the c structure, with 2 exceptions:
- When parsing json, the no_hw member is always set to true to
indicate that the driver's intel_device_info does not correspond to
the current platform.
- When dumping to json, devinfo_type_sha1 is calculated to be a
checksum which changes whenever intel_device_info is updated. This
checksum is encoded in json. When parsing json, the driver verifies
that the checksum matches before loading data into
intel_device_info. This verifies compatibility.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27557>
The round up in 'next_address_8kb = DIV_ROUND_UP(push_constant_kb, 8)'
was not decreasing the amount of URB available for Mesh and Task, what
could cause an over allocation of URB.
There was also no minimum entries enforcement for Mesh and Task, what
could cause 0 r.mesh_entries to be set in a case where tue_size_dw is
90% > than mue_size_dw. Same for r.task_entries when Task is enabled.
Also adding a few more asserts to help debug.
This fixes at least dEQP-VK.mesh_shader.ext.properties.mesh_payload_size
in LNL but it has potential to fixes other Mesh tests as well.
Cc: mesa-stable
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27555>
When doing query result copies in 3D mode, we're flushing the render
target cache, but the shader writes go through the dataport.
Fixes flakes/fails in piglit with shader query copies forced with Zink :
$ query_copy_with_shader_threshold=0 ./bin/arb_query_buffer_object-coherency -auto -fbo
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b3b12c2c27 ("anv: enable CmdCopyQueryPoolResults to use shader for copies")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>
We have to push the lowering of texture operations a bit further in
pipeline since nir_lower_tex gets invoked twice and if there is no LOD
source present, nir_lower_tex adds that as a source. Once that's all
done we can easily combine the LOD and array index into a single 32-bit
value.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>
On Gfx12.0, CCS allocations have to be allocated per image because the
format of the image goes into the AUX-TT PTEs. The effect on memory
allocations is limited since the main surface granularity in the
AUX-TT PTE is 64KB.
On Gfx12.5, the granularity of the AUX-TT PTE is 1MB. This creates a
lot of waste in the application memory allocations. Fortunately the HW
doesn't care about the format put into the PTEs anymore. So it becomes
possible to have 2 images share the same PTE.
To implement this we bring back an earlier version of AUX-TT mappings
where we used to allocate additional CCS space at the end of the
VkDeviceMemory objects. On Gfx12.5, if the BO has additional CCS
space, we will now map the main surface to that space.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26822>
It is possible to free memory backing images before images are
destroyed :
VkFreeMemory:
"Memory can be freed whilst still bound to resources, but those
resources must not be used afterwards."
The spec leaves us the option to keep a reference on the associated
memory and free it only when all the bound resources have been
destroyed. Here we choose to free memory immediately.
One particular test in the CTS
(dEQP-VK.synchronization.internally_synchronized_objects.pipeline_cache_graphics)
does the following :
imgA = vkCreateImage()
imgB = vkCreateImage()
memA = vkAllocateMemory()
vkBindImageMemory(imgA, memA) # Aux mapping with ref count = 1
vkFreeMemory(memA) # Aux mapping removed, ref count = 0
memB = vkAllocateMemory() # Same address as memA
vkBindImageMemory(imgB, memB)
vkDestroyImage(imgA) # Removes the mapping of imgB-memB
vkQueueSubmit() # hang with pagefault in AUX-TT
The solution implemented in this change is to not do anything AUX-TT
related in vkFreeMemory(). This soluation has some consequences,
because a virtual memory address range freed and reallocated cannot be
rebound in the AUX-TT until all the associated resources have released
their AUX-TT mapping (to bring back the AUX-TT refcount of the range
to 0). This should still be better than keeping the memory allocated
through refcounting of the anv_bo.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7b87e1afbc ("anv: track & unbind image aux-tt binding")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10528
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27566>