GCC 12.2.0 warns:
../src/intel/compiler/brw_fs.cpp: In member function ‘bool fs_visitor::
split_virtual_grfs()’:
../src/intel/compiler/brw_fs.cpp:2199:10: warning: ‘void* memset(void*, int,
size_t)’ specified size between 18446744071562067968 and 18446744073709551615
exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
2199 | memset(vgrf_has_split, 0, num_vars * sizeof(*vgrf_has_split));
`num_vars` is an `int` but gets assigned the value of `this->alloc.count`,
which is an `unsigned int`. Thus, `num_vars` will be negative if
`this->alloc.count` is larger than int max value. Converting that negative
`int` to a `size_t`, which `memset` expects, then blows it up to a huge
positive value.
Simply turning `num_vars` into an `unsigned int` would be enough to fix this
specific problem, but there are many other instances where an `unsigned int`
gets assigned to an `int` for no good reason in this function. Some of which
the compiler warns about now, some of which it doesn't warn about.
This turns all variables in `fs_visitor::split_virtual_grfs`, which should
reasonably be unsigned, into `unsigned int`s. While at it, a few now pointless
casts are removed.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19423>
This is meant to remove any integrated GPU only code paths that can't
be compiled in CPU architectures different than x86.
Discrete GPUS don't have need_clflush set to true so it was just
matter of remove some code blocks around need_clflush but was left a
check in anv_physical_device_init_heaps() to fail physical device
initialization if it ever became false.
Signed-off-by: Philippe Lecluse <philippe.lecluse@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19812>
Fixes a number of CTS patterns on DG2 :
- dEQP-VK.dynamic_rendering.primary_cmd_buff.random*
- dEQP-VK.draw.*secondary_cmd*
- dEQP-VK.dynamic_rendering.*secondary_cmd*
- dEQP-VK.geometry.*secondary_cmd_buffer
- dEQP-VK.multiview.*secondary_cmd*
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9c1c1888d9 ("intel/fs: put scratch surface in the surface state heap")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19946>
It is still marked as a flake (along with other copyteximage cases) on all
these boards, so this will reduce the CI IRC channel noise given that we
actually expect a Pass. I haven't found where exactly in history we went
from generally-fail to generally-pass, but it looks like around Feb 2022.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19912>
Previously only 'decode_vs_state' would dump the referenced shader,
which meant we completely ignored every other shader when decoding
the '3DSTATE_PIPELINED_POINTERS' command.
Move the program disassembly logic from 'decode_vs_state' into a
common 'disasm_program_from_group' helper and call it from every
other decode_*_state function, too.
Signed-off-by: Daniil Tatianin <99danilt@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19880>
In 4ceaed7839 we made scratch surface state allocations part of the
internal heap (mapped to STATE_BASE_ADDRESS::SurfaceStateBaseAddress)
so that it doesn't uses slots in the application's expected 1M
descriptors (especially with vkd3d-proton).
But all our compiler code relies on BSS
(STATE_BASE_ADDRESS::BindlessSurfaceStateBaseAddress).
The additional issue is that there is only 26bits of surface offset
available in CS instruction (CFE_STATE, 3DSTATE_VS, etc...) for
scratch surfaces. So we need the drivers to put the scratch surfaces
in the first chunk of STATE_BASE_ADDRESS::SurfaceStateBaseAddress
(hence all the driver changes).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4ceaed7839 ("anv: split internal surface states from descriptors")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7687
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19727>
The intel_perf_counter_pass::pass field is actually useless and
invalid.
Once you have mapped all the counters to all the metrics, the order of
the metrics capture is dictated by intel_perf_get_n_passes().
When reading values that is the order we should follow.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2001a80d4a ("anv: Implement VK_KHR_performance_query")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
__builtin_clz(value - 1) is undefined for with value=1 (because
__builtin_clz(0) is undefined).
Because we set rt_pipeline->stack_size = 1 when a ray tracing pipeline
doesn't need any stack allocation to differentiate from a dynamic size
(rt_pipeline->stack_size = 0) we can run into this undefinied behavior
issue.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: f68d64dac0 ("anv: Add support for vkCmdSetRayTracingPipelineStackSizeKHR")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19781>
In 23c7142cd6 ("anv: disable SIMD16 for RT shaders") we were forcing the SIMD8
using the mechanism for subgroup size control, which is problematic since it has
other effects on the shader behavior.
The code was changed to select the SIMD in a different way in the previous patches,
so we can revert the behavior to the original semantics.
Fixes dEQP-VK.subgroups.builtin_var.ray_tracing.subgroupsize.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>