We can go further in tracking what BOs are written to by the driver by
tracking when a buffer in unmapped. A BO could be mmap, written, unmap
and never be written to again. In such case we can just write the BO's
content on the first exec buf after unmap and never write it again.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2201>
Create basic aub_context on GEM_CONTEXT_CREATE.
Set it up and submit a context + ring + pphwsp during execbuf
submission, if it has not been initialized yet.
v2: Write the HWSP only once per engine (Lionel).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
v2:
- Only dump context if there were no erros (Lionel).
- Store counter for context handles in aub_file (Lionel).
v3:
- Add a comment about aub_context -> GEM context (Lionel).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The `device` can be set earlier either by a command line or a by
intercepting an ioctl call to get the I915_PARAM_CHIPSET_ID done by
the application early. In both cases `aub_file` and `devinfo` would
not be initialized.
Fix by splitting the conditions
- `device == 0`: use the FD to get both device and devinfo.
- Or `devinfo.gen == 0`: use `device` to initialize it.
And separatedly, initialize aub_file the first time it is needed.
Fixes: d594d2a052 ("intel/tools: use device info initializer")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes formatting errors for 32 bit compilations, eg:
error: format ‘%lx’ expects argument of type ‘long unsigned int’,
but argument 5 has type ‘uint64_t’ {aka ‘long long unsigned int’}
[-Werror=format=]
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Prepare aub write to deal with multiple engine instances. We don't
pass the instance number yet this could be done in the future by
having a 2 dimensional array of struct engine.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
In the future we'll want error2aub to reuse the context image saved by
i915 instead of the default one we write in intel_dump_gpu.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
There was an issue recently caused by the system header being included
by mistake, so let's just get rid of this include path and always
explicitly #include "drm-uapi/FOO.h"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
This fixes a rather astonishing problem that came up while debugging
an issue in the Vulkan CTS. Apparently the Vulkan CTS framework has
the tendency to create multiple VkDevices, each one with a separate
DRM device FD and therefore a disjoint GEM buffer object handle space.
Because the intel_dump_gpu tool wasn't making any distinction between
buffers from the different handle spaces, it was confusing the
instruction state pools from both devices, which happened to have the
exact same GEM handle and PPGTT virtual address, but completely
different shader contents. This was causing the simulator to believe
that the vertex pipeline was executing a fragment shader, which didn't
end up well.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The problem with passing the configuration of the dump lib through a
file descriptor is that it can be read only once. But under gdb you
might want to rerun your program multiple times.
This change hands the configuration through a temporary file that is
deleted once the command line passes to intel_dump_gpu has exited.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
In commit 86cb05a6d3 ("intel: aubinator: remove standard input
processing option") we removed the ability to process aub as an input
stream because we're now rely on mmapping the aub file to back the
buffers aubinator is parsing.
intel_aubdump was the provider of the standard input data and since
we've copied/reworked intel_aubdump into intel_dump_gpu within Mesa,
we don't need that code anymore.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Instead of having quite so many singletons, we use a struct aub_file to
organize the bits we need for writing an aub file.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
For large buffers which span an entire l1 page table, we got the range
calculations wrong. In this case, we end up with an l1_start which is
the first byte represented by the given l1 table and an l1_end which is
the first byte after the range represented by the l1 table. Then
l2_start_index == L2_index(l2_end) due to roll-over. Instead, compute
lN_end using (1Ull << shift) - 1 so that lN_end is the last byte in the
range represented by the Nth level page table. When we do this, we
don't need the conditional expression anymore.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
We were not properly writing page tables when the virtual address
range spans multiple subtrees of the tables.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
For gen8+, write out PPGTT tables in aub files so that full 48-bit
addresses can be serialized.
v2: Fix handling of `end` index in map_ppgtt
v3: Correctly mark GGTT entry as present (Rafael)
Signed-off-by: Scott D Phillips <scott.d.phillips@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>