Commit Graph

8960 Commits

Author SHA1 Message Date
Emma Anholt f6c06ef2f6 ci: Add manual rules variations to disable irrelevant driver jobs.
If you're only affecting one or a couple of drivers, it would be nice if
your pipeline buttons on the web UI weren't full of manual run buttons for
all the other drivers.

This is a bunch of duplicated lines, but less than it could have been now
that we have !references.

In some of these cases (i915g, nouveau, etnaviv), we have no non-manual
jobs for those drivers, so I could have just rewritten the original
"driver-rules" to "driver-manual-rules".  I decided to keep things
consistent between drivers, though, because this is all esoteric enough to
readers already without making different drivers' rules look different.

Fixes: #4891
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17445>
2023-01-26 00:48:19 +00:00
Emma Anholt 67547a04b6 ci: Move the performance jobs' allow_failure:true to the gl rules.
This helps clarify what's going on with these magic manual jobs.

Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17445>
2023-01-26 00:48:19 +00:00
Francisco Jerez 7b5e933629 intel/fs: Fix src and dst types of LOAD_PAYLOAD ACP entries during copy propagation.
The ACP entries created by copy propagation to track the implied
copies of LOAD_PAYLOAD instructions don't model the behavior of
LOAD_PAYLOAD correctly, since (as of 41868bb682) header
moves are implicitly retyped to UD and the destination of non-header
copies implicitly uses the same type as the corresponding source, even
though the ACP entries created for such copies could incorrectly
represent a type conversion, which can lead to mis-optimization of the
program.

According to Marcin, this fixes the func.mesh.ext.workgroup_id.task.q0
crucible test.

Fixes: 41868bb682 ("i965/fs: Rework the fs_visitor LOAD_PAYLOAD instruction")
Reported-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Tested-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18980>
2023-01-25 22:22:12 +00:00
José Roberto de Souza 5ccc2049fa intel/ds: Nuke intel_ds_queue::queue_id
queue_id is not used anywhere.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20888>
2023-01-25 17:43:57 +00:00
José Roberto de Souza f2fdd14907 hasvk: Nuke anv_queue:index_in_family
This value is also stored in vk_queue, so we can nuke from anv_queue.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20888>
2023-01-25 17:43:57 +00:00
José Roberto de Souza 013e849a2c anv: Nuke anv_queue:index_in_family
This value is also stored in vk_queue, so we can nuke from anv_queue.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20888>
2023-01-25 17:43:57 +00:00
José Roberto de Souza 80dd6ed074 intel: Add kmd_type parameter to necessary intel_gem.h functions
Here adding kmd_type parameter to
intel_gem_read_render_timestamp(), intel_gem_can_render_on_fd() and
intel_gem_supports_protected_context().
Those 3 functions will have Xe implementations, the other functions
in intel_gem.h will not be called by Xe code paths so not adding
kernel_driver_type to it.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20773>
2023-01-25 09:17:15 -08:00
José Roberto de Souza 162695f0b8 intel: Add intel_kmd_type parameter to intel_engine_get_info()
This will be needed to execute proper function to fetch engines from
kernel mode driver.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20773>
2023-01-25 09:16:55 -08:00
José Roberto de Souza 2c42641994 intel/dev: Detect what is the kernel mode driver loaded
Intel GPUs will have a new kernel mode driver, support for some
platforms will overlap so we will need to detect in runtime what is
the kernel mode driver loaded for each device.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20773>
2023-01-25 09:16:55 -08:00
Kenneth Graunke a8108f1d44 anv: Add missing untyped data port flush on PIPELINE_SELECT
See the comments in emit_apply_pipe_flushes().  Flushing HDC is not
sufficient in GPGPU mode, and we need to set the untyped data port flush
bit as well.

Fixes many dEQP-VK failures with INTEL_COMPUTE_CLASS=1 on Alchemist.

Fixes: 1067ec90a5 ("anv: Update PIPELINE_CONTROL flush when switching pipeline mode in TGL+")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20774>
2023-01-25 07:08:55 +00:00
Lionel Landwerlin 5b79745b04 intel/common: add a INTEL_DECODE variable to parameter decoder at runtime
Sometimes you want to diff 2 runs with INTEL_DEBUG=bat, but a tiny
allocation change can mess quite badly with offsets printed in the
decoding, making it hard to look at the diff with meld.

Fortunately our decoder can avoid printing offsets. We just need a
variable to specify that.

We still use the defaults specified by the driver but you can turn
things on/off with :

INTEL_DECODE=+color,-offsets,-floats INTEL_DEBUG=bat ./my_app

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20874>
2023-01-24 15:15:17 +00:00
Lionel Landwerlin 31931754bc intel/decoder: print out compute push constants
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20874>
2023-01-24 15:15:17 +00:00
Marcin Ślusarz 536a2acfc2 intel/compiler/mesh: handle const data in task & mesh programs
Started showing up when nir_opt_large_constants call was moved in 88756cee8d.
Fixes dEQP-VK.mesh_shader.ext.smoke.monolithic.fullscreen_gradient*

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 88756cee8d ("intel/compiler: Run nir_opt_large_constants before scalarizing consts")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20876>
2023-01-24 14:47:21 +00:00
Marcin Ślusarz 9bb18a4f9e intel/compiler: fix generation of vec8/vec16 alu instruction
I stumbled on this when I inserted some suboptimal lowering code after all
optimizations. Adding certain subset of optimizations after my lowering code
actually avoided this bug, so I think it's not possible to hit this on upstream.

Let's fix this for the next person generating suboptimal code...

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20857>
2023-01-24 13:15:58 +00:00
Tapani Pälli 0813c1a6fa intel/genxml: set unused 3DSTATE_PS_EXTRA field as mbz
Wa_14015360517 mentions situations where HW produces invalid
occlusion query results when "Pixel Shader Does not write to RT"
bit is set.

"When Pixel Shader Kills Pixel is set, SW must perform a dummy render
 target write from the shader and not set this bit, so that Occlusion
 Query is correct."

Another situation is when writing to UAV or to NULL render target.
Patch sets field as 'must be zero' to discourage possible use of it.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20849>
2023-01-24 10:28:15 +00:00
Lionel Landwerlin a50d2fdb46 intel/fs: avoid cmod optimization on instruction with different write_mask
I've been running into failures with tests like :

dEQP-VK.robustness.robustness2.bind.notemplate.rgba32i.unroll.nonvolatile.uniform_buffer_dynamic.no_fmt_qual.len_4.samples_1.1d.frag

With the load_global_const_block_intel NIR intrinsic, you can load a
vec8/vec16 with a predicate. The predicate is correctly uniformized to
feed into the SEND instruction's flag register.

The problem is that a series of optimization first remove the
find_live_channel and then changes the broadcast into a simple MOV
instruction, on the assumption that the first channel is always active
if there is not control flow. This is correct.

But after that the cmod optimzation will remove this instruction :

   mov.nz.f0.0(16) null:D, vgrf16+0.0<0>:D NoMask

because it seems to be equivalent to :

   cmp.g.f0.0(16) vgrf16:D, vgrf12:D, 63d

In this case vgrf16 is the predicate to the load block SEND
instruction. Since the execution mask is different between both, some
of the channels of the SEND instruction end up not being loaded or
loaded with the wrong predication and we end up with incorrect UBO
data.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20852>
2023-01-24 07:35:42 +00:00
Emma Anholt 849af68dbd ci/piglit: Add some common piglit skips for Mesa CI's testing of glx.
Since our X servers don't have a compositor, and we run tests in parallel,
various swap and frontbuffer tests won't ever be stable.  Rather than
having every driver have to track those flakes, make a general X11 skips
list as a known issue of our CI rather than pointing fingers at drivers.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Martin Roukala <martin.roukala@mupuf.org>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20798>
2023-01-24 00:13:02 +00:00
Paulo Zanoni 5039acfd9d hasvk: check the return value of anv_execbuf_add_bo_bitset()
This is the Hasvk version of Anv's:
  3d37950fd9 ("anv: check the return value of anv_execbuf_add_bo_bitset()")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20800>
2023-01-23 20:43:36 +00:00
Paulo Zanoni 80196aaa5b hasvk: don't leave undefined values in exec->syncobj_values
This is the Hasvk version of Anv's:
  ad6a036a68 ("anv: don't leave undefined values in exec->syncobj_values")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20800>
2023-01-23 20:43:36 +00:00
Lionel Landwerlin 6f02f9d108 anv: fix preemption enable emission in gpu_memcpy
This has to be before the MI_BATCH_BUFFER_END otherwise it has no
effect.

This also was messing around with you batch length alignment.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b9aa66d5d0 ("anv: disable preemption for 3DPRIMITIVE during streamout")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20802>
2023-01-20 22:35:41 +02:00
José Roberto de Souza c1d99283e6 intel/dev: Split hwconfig i915 specific code
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20437>
2023-01-20 17:26:08 +00:00
José Roberto de Souza b3b769894e intel/dev: Move i915 code to i915/intel_device_info.c
No behavior changes here.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20437>
2023-01-20 17:26:08 +00:00
José Roberto de Souza 1332702cf4 intel/dev: Export functions that will be used by different kernel drivers
No behavior changes here.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20437>
2023-01-20 17:26:08 +00:00
Tapani Pälli 58dd9d5134 hasvk: add restrictions for 3DSTATE_RASTER::AntiAliasingEnable
Field must be disabled if any render targets have integer format.

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20671>
2023-01-20 12:50:04 +00:00
Tapani Pälli 9b37ef40f8 anv: add restrictions for 3DSTATE_RASTER::AntiAliasingEnable
Field must be disabled if any render targets have integer
format, additionally for Gfx12+ field must be disabled when
num multisamples > 1 or forced multisample count > 1.

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20671>
2023-01-20 12:50:04 +00:00
Iván Briano 58ababdee6 anv: uncompressed views of compressed 3d images are now valid
There was a VUID-VkImageViewCreateInfo-image-04739 in the Vulkan 1.3
spec that said:

    If image was created with the
    VK_IMAGE_CREATE_BLOCK_TEXEL_VIEW_COMPATIBLE_BIT flag and format is a
    non-compressed format, viewType must not be VK_IMAGE_VIEW_TYPE_3D

That VUID has since been removed, and when a view of a 3D image is
created, with put the depth into the array_len, so it won't be always 1.

Reviewed-by: Mark Janes <markjanes@swizzler.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20803>
2023-01-20 08:41:59 +00:00
David Heidelberg a8d2b288ee ci/piglit: 2023-01-19 uprev
Emergency uprev to get robustness against http 50x errors for traces.

Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20788>
2023-01-19 23:46:44 +00:00
Dylan Baker c31629ee78 meson: remove version checks for < 0.59
Which is now required, so these are useless

Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20752>
2023-01-19 23:06:07 +00:00
Kenneth Graunke 7092c1218a intel/compiler: Use more symbolic source names in components_read()
Rather than hardcoding source 1, source 2, etc.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>
2023-01-19 08:42:22 +00:00
Kenneth Graunke 16b66ab659 intel/compiler: Drop dest checking in atomic code
NIR atomic operation intrinsics all have destinations.  This is just
copy and pasted from other generic intrinsic handling where that may
or may not be the case.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>
2023-01-19 08:42:22 +00:00
Kenneth Graunke 780f3e2e6b intel/compiler: Delete all the A64 atomic variants for type sizes
These are handled identically in almost all cases.  There is one place
in the legacy surface lowering that was obtaining the bitsize from the
opcode, but the LSC-based lowering uses (type_sz(inst->dst.type) * 8)
for that and works just fine.  If we just do that in the legacy lowering
too, then we don't need this plethora of opcodes.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>
2023-01-19 08:42:22 +00:00
Kenneth Graunke 03ddde1230 intel/compiler: Combine nir_emit_{ssbo,shared}_atomic into one helper
These are basically identical save for:
- shared has surface hardcoded to SLM rather than an SSBO index
- shared has to handle adding the 'base' const_index (SSBO have none)
- the NIR source index for data is shifted by one

It's not worth copy and pasting the entire function for this.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>
2023-01-19 08:42:22 +00:00
Kenneth Graunke b84939c678 intel/compiler: Delete fs_visitor::nir_emit_{ssbo,shared}_atomic_float()
These are now basically identical to their non-float counterparts.  The
only thing that differed was the opcode checking to determine which
operands existed.  Now that we have a unified opcode enum and a helper
for the number of data operands, we can just use that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>
2023-01-19 08:42:22 +00:00
Kenneth Graunke f7b29d7924 intel/compiler: Drop redundant 32-bit expansion for shared float atomics
We already expanded data to 32-bit a few lines earlier, so this is just
redundantly doing it a second time.

Fixes: 43169dbbe5 ("intel/compiler: Support 16 bit float ops")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>
2023-01-19 08:42:22 +00:00
Kenneth Graunke 02129eee3a intel/compiler: Eliminate SHADER_OPCODE_UNTYPED_ATOMIC_FLOAT
The only reason for the separate opcode was because of the overlapping
BRW_AOP_* enums, making it impossible to tell whether a particular AOP
was the integer or float operation.  Now that we use the lsc_opcode
enums, we can just have the legacy lowering inspect the opcode and
select the right descriptor.  No need for a separate opcode.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>
2023-01-19 08:42:22 +00:00
Kenneth Graunke 284f0c9a57 intel/compiler: Add an lsc_op_num_data_values() helper
There are a number of places that need to know how many operands an LSC
atomic takes (0 for inc/dec, 1 for most things, 2 for cmpxchg).  We can
add a helper for that and eliminate some code (with more to come).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>
2023-01-19 08:42:22 +00:00
Kenneth Graunke 90a2137cd5 intel/compiler: Use LSC opcode enum rather than legacy BRW_AOPs
This gets our logical atomic messages using the lsc_opcode enum rather
than the legacy BRW_AOP_* defines.  We have to translate one way or
another, and using the modern set makes sense going forward.

One advantage is that the lsc_opcode encoding has opcodes for both
integer and floating point atomics in the same enum, whereas the legacy
encoding used overlapping values (BRW_AOP_AND == 1 == BRW_AOP_FMAX),
which made it impossible to handle both sensibly in common code.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>
2023-01-19 08:42:22 +00:00
Kenneth Graunke 8d2dc52a14 intel/compiler: Move atomic op translation into emit_*_atomic()
There's no need to pass both the intrinsic and an opcode computed from
that same intrinsic.  Just do it in the functions themselves.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>
2023-01-19 08:42:22 +00:00
Lionel Landwerlin 5ff3d4a8a2 anv: fix generated indirect draw shader stats checks
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c950fe97a0 ("anv: implement generated (indexed) indirect draws")
Tested-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20776>
2023-01-19 07:36:19 +00:00
Tapani Pälli 4fd9bf6e7f intel/hasvk: remove some stale comments, wa was removed
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20783>
2023-01-19 06:37:20 +00:00
Francisco Jerez f40e17059a intel/fs/gfx12+: Drop redundant handling of SHADER_OPCODE_BROADCAST in exec pipe inference.
Commit c80c0ed943 introduced handling of
SHADER_OPCODE_BROADCAST into inferred_exec_pipe(), but it was already
being handled, drop the redundant handling.  Shouldn't lead to any
functional changes.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20543>
2023-01-19 06:14:03 +00:00
Francisco Jerez b867d1b851 intel/eu/gfx12+: Implement decoding of 64-bit immediates.
C.f. a12533f2ce.  The corresponding
change for the decoding path was never implemented so the disassembler
was printing incorrect immediate values.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20543>
2023-01-19 06:14:03 +00:00
Francisco Jerez f80f29dc4b intel/disasm/gfx12+: Fix print out of non-existing condmod field with 64-bit immediate.
The conditional mode field doesn't exist for instructions with a
64-bit immediate, so this would currently print garbage.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20543>
2023-01-19 06:14:03 +00:00
Francisco Jerez f3352745ad intel/disasm/gfx12+: Use helper instead of hardcoded bit access for 64-bit immediates.
So we don't have to duplicate code to handle differences in the
encoding of 64-bit immediates across platforms.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20543>
2023-01-19 06:14:03 +00:00
Francisco Jerez 4a2e7306dd intel/fs/gfx12: Ensure that prior reads have executed before barrier with acquire semantics.
This avoids a violation of the Vulkan memory model that was leading to
intermittent failures of at least 8k test-cases of the Vulkan CTS
(within the group dEQP-VK.memory_model.*) on TGL and DG2 platforms.
In theory the issue may be reproducible on earlier platforms like IVB
and ICL, but the SYNC.ALLWR instruction is not available on those
platforms so a different (likely costlier) fix will be needed.

The issue occurs within the sequence we emit for a NIR memory barrier
with acquire semantics requiring the synchronization of multiple
caches, e.g. in pseudocode for a barrier involving the TGM and UGM
caches on DG2:

 x <- load.ugm // Atomic read sequenced-before the barrier
 y <- fence.ugm
 z <- fence.tgm
 wait(y, z)
 w <- load.tgm // Read sequenced-after the barrier

In the example we must provide the guarantee that the memory load for
x is completed before the one for w, however this ordering can be
reversed with the intervention of a concurrent thread, since the UGM
fence will block on the prior UGM load and potentially take a long
time, while the TGM fence may complete and invalidate the TGM cache
immediately, so a concurrent thread could pollute the TGM cache with
stale contents for the w location *before* the UGM load has completed,
leading to an inversion of the expected memory ordering.

v2: Apply the workaround regardless of whether the NIR barrier
    intrinsic specifies multiple storage classes or a single one,
    since an acquire barrier is required to order subsequent requests
    relative to previous atomic requests of unknown storage class not
    necessarily specified by the memory scope information of the
    intrinsic.

Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20690>
2023-01-18 21:34:33 -08:00
Paulo Zanoni f9477770d8 anv: use vk_realloc for the anv_execbuf arrays
Three reasons for that:

0. The operation we're doing here is actually a reallocation.

1. The newer code is, IMHO, easier to read.

2. Realloc has this property where sometimes, when possible, it will
   expand your array without moving it somewhere else, so it doesn't
   need to copy the memory contents, returning the original pointer
   back to you. I did some analysis and while that case is not common,
   it does happen sometimes in real world applications (I could see it
   happening in Shootergame and Aztec Ruins, but not Dota 2), so we're
   able to save a few CPU cycles.

v2: Rebase.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20703>
2023-01-19 02:21:09 +00:00
Paulo Zanoni 6d4fc0e5bf anv: rename anv_execbuf->array_length to bo_array_length
Because this is counting the array length of the things related to the
BOs, just like syncobj_array_length is counting the array length of
the things related to syncobjs.

v2: Rebase.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20703>
2023-01-19 02:21:09 +00:00
Paulo Zanoni e642cafdae anv: run buf_finish() if add_bo() fails during execute_simple_batch()
This is the only code path where we don't run anv_execbuf_finish() in
case anv_execbuf_add_bo() fails. While there is not a bug in the
current tree, I recently made an (uncommitted) modification that
started leaking memory and made me realize the lack of cleanup here.
If we had anv_execbuf_finish() being called upon error like we're
going to have after this patch my modification wouldn't have caused
the memory leak.

I think it's much safer and future-proof if we're able to operate
under the assumption that whatever is allocated and set to anv_execbuf
will be dealt with upon failure of anything else related to it, so
functions that fail should only be required to free pointers not yet
assigned to anv_execbuf.

The dEQP-VK 'alloc_callback_fail' tests should exercise this code
path. The one I was specifically using here is:
  dEQP-VK.api.object_management.alloc_callback_fail.device_group

v2: Rebase.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20703>
2023-01-19 02:21:09 +00:00
Paulo Zanoni 3d37950fd9 anv: check the return value of anv_execbuf_add_bo_bitset()
Because anv_execbuf_add_bo_bitset() calls anv_execbuf_add_bo(), which
can fail if its memory allocations fail.

I have seen dEQP tests exercising memory allocation failures during
anv_execbuf_add_bo(), but I don't think the path coming from
add_bo_biset() was specifically exercised. Anyway, add the error check
just in case.

v2: Rebase.

Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20703>
2023-01-19 02:21:09 +00:00
Paulo Zanoni ad6a036a68 anv: don't leave undefined values in exec->syncobj_values
In anv_execbuf_add_syncobj(), we try to not create or use
exec->syncobj_values if we don't need to. But when we figure we're
going to need it (i.e., when timeline_value is not zero), then we
create exec->syncobj_values with vk_zalloc, which means every previous
value is set to zero, as it should be. This is all correct.

The problem starts when we add a 16th element. In this case we double
exec->syncobj_array_length and realloc the buffer by using vk_alloc
and copying the old array to the new one. After that, we write the
timeline_value to the array only if it's not zero, and that's the
problem: since we just used vkalloc and memcpy, we don't have any
guarantees that the new array will be zero after the 16th element, and
if timeline_value is zero we write nothing to that position.

Once we start using exec->syncobj_values we have to commit to using
it, so the "if (timeline_value)" check near the end of the function
has to be changed to "if (exec->syncobj_values)", so we actually set
elements after the 16th to zero when they need to be zero.  Another
approach to fix this would be to memset the new elements once we
double syncobj_array_length.

In practice, I couldn't find any application or deqp test that used
more than 3 elements in exec->syncobj_array_length, and we need more
than 16 elements in order to be able to reproduce the bug, so I'm not
aware of any real-world bug that goes away with this patch. This issue
was found while reading code.

If we craft a little Vulkan program that submits a ton of timeline and
binary semaphores on vkQueueSubmit, then waits for them, we get the
following error without this patch:

MESA: error: ../../src/intel/vulkan/anv_batch_chain.c:1910: execbuf2 failed: Invalid argument (VK_ERROR_DEVICE_LOST)

v2: Rebase.

Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20703>
2023-01-19 02:21:09 +00:00