When transitioning the oa-*.xml files from Gputop to IGT, we also had
to deal with a Python 2 to Python 3 transition. Unfortunately, the
implementation-dependent hash table ordering leaked into the XML files,
so the ordering of their contents changed quite a bit.
This script reorders the contents of the existing files from the old
order to the new one.
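As a rough illustration of the kind of reordering involved (not the actual script from this commit), the sketch below sorts the children of an XML element into a given target order. The function name `reorder_children`, the `symbol_name` key attribute and the sample element names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def reorder_children(parent, key_attr, new_order):
    """Reorder the children of `parent` in place so they follow `new_order`.

    `new_order` is a list of values of the attribute `key_attr`.
    Children whose key is not listed are kept after the listed ones,
    in their original relative order (list.sort is stable).
    """
    rank = {key: position for position, key in enumerate(new_order)}
    children = list(parent)
    children.sort(key=lambda child: rank.get(child.get(key_attr), len(rank)))
    parent[:] = children


# Hypothetical input: three counters written out in the "old" order.
root = ET.fromstring(
    '<set>'
    '<counter symbol_name="B"/>'
    '<counter symbol_name="A"/>'
    '<counter symbol_name="C"/>'
    '</set>'
)
reorder_children(root, "symbol_name", ["A", "B"])
print([child.get("symbol_name") for child in root])
```

Keeping unlisted elements in their original relative order means a partial target order never drops or shuffles entries it does not know about.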
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6518>
For all supported generations, we had a layout describing which
registers to store in order to implement an MI_RPC replacement.
This is needed because on Gen12 we have to snapshot the OAG registers to
get correct values for the perf equations. There, the MI_RPC instruction
captures the OAR registers, which do not have all the information we need.
v2: Fix commented code for debug (Marcin)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6518>
This will be useful when we implement queries using a series of MI_SRM
instead of MI_RPC.
Unfortunately, on Gen12 the MI_RPC command sources values from the OAR
unit, which has a similar set of registers to the OAG unit. Some of the
HW configuration doesn't reach OAR, though, so we have to snapshot
OAG manually instead.
v2: Fix comments
Use const
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6518>
Those are not part of the OA reports and need some additional
scaffolding. Those counters are only available when doing queries as
we need to emit MI_SRMs to record them.
Equations making use of those counters are not there yet; they will
come in a follow-up commit updating a bunch of oa-*.xml files.
v2: Fix typo
v3: Use PERF_CNT_VALUE_MASK (Marcin)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6518>
The `restart_index` field can be uninitialized if `primitive_restart`
is false, so we have to track `restart_index` changes
only if `primitive_restart` is true.
Here is a valgrind warning:
Conditional jump or move depends on uninitialised value(s)
==52021== at 0x6D44968: iris_update_draw_info (iris_draw.c:102)
==52021== by 0x6D450B5: iris_draw_vbo (iris_draw.c:273)
==52021== by 0x642FD8E: cso_multi_draw (cso_context.c:1708)
==52021== by 0x5C434D3: st_draw_gallium (st_draw.c:271)
==52021== by 0x5DF5F1B: _mesa_draw_arrays (draw.c:554)
==52021== by 0x5DF68F7: _mesa_DrawArrays (draw.c:768)
==52021== by 0x49011F2: stub_glDrawArrays (piglit-dispatch-gen.c:12181)
==52021== by 0x11C611: piglit_display (shader_runner.c:4549)
==52021== by 0x4994D83: process_next_event (piglit_x11_framework.c:137)
==52021== by 0x4994E47: enter_event_loop (piglit_x11_framework.c:153)
==52021== by 0x49939A4: run_test (piglit_winsys_framework.c:88)
==52021== by 0x49821A9: piglit_gl_test_run (piglit-framework-gl.c:229)
v2: - don't propagate trash to state->cut_index
(Kenneth Graunke <kenneth@whitecape.org>)
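The rule described above can be modelled in a few lines. This is a simplified Python model, not the C code in iris_draw.c: the `DrawInfo` and `State` classes, the `dirty` flag and the function name are assumptions for illustration, and `None` stands in for an uninitialized `restart_index`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DrawInfo:
    primitive_restart: bool
    restart_index: Optional[int] = None  # None models an uninitialized field


@dataclass
class State:
    primitive_restart: bool = False
    cut_index: int = 0
    dirty: bool = False


def update_draw_info(state, info):
    """Track draw-state changes, reading restart_index only when it is valid."""
    if state.primitive_restart != info.primitive_restart:
        state.primitive_restart = info.primitive_restart
        state.dirty = True
    # Only look at restart_index when primitive restart is enabled, so an
    # uninitialized value is never compared or copied into cut_index.
    if info.primitive_restart and state.cut_index != info.restart_index:
        state.cut_index = info.restart_index
        state.dirty = True


state = State()
update_draw_info(state, DrawInfo(primitive_restart=False))
print(state)
update_draw_info(state, DrawInfo(primitive_restart=True, restart_index=0xFFFF))
print(state)
```

In this model the first call leaves `cut_index` untouched, which mirrors the v2 note about not propagating trash into `state->cut_index`.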
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8409>
Where it is safe to do so, avoid the generation of code to convert a
condition code into a boolean which is then tested to generate a
condition code. This is only done in uniform ifs, and only for condition
values that are SSA and only used once (in that if statement).
shader-db relative to MR 7726:
total instructions in shared programs: 8985667 -> 8974151 (-0.13%)
instructions in affected programs: 390140 -> 378624 (-2.95%)
helped: 810
HURT: 276
helped stats (abs) min: 1 max: 49 x̄: 17.77 x̃: 16
helped stats (rel) min: 0.10% max: 33.63% x̄: 7.97% x̃: 6.45%
HURT stats (abs) min: 1 max: 46 x̄: 10.42 x̃: 10
HURT stats (rel) min: 0.16% max: 21.54% x̄: 2.26% x̃: 2.03%
95% mean confidence interval for instructions value: -11.46 -9.75
95% mean confidence interval for instructions %-change: -5.76% -4.97%
Instructions are helped.
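For reference, the two percentages in the summary lines can be rechecked from the raw instruction counts. This is only a small arithmetic check; the helper name `percent_change` is made up for the example.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Instruction counts copied from the shader-db summary above.
total = percent_change(8985667, 8974151)
affected = percent_change(390140, 378624)
print(f"total: {total:.2f}%  affected: {affected:.2f}%")
# prints "total: -0.13%  affected: -2.95%"
```

Both lines correspond to the same absolute saving of 11516 instructions, measured against different baselines.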
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8709>
Changes:
- disallow NGG culling for GS, fast launch for tess using template args
(GS can't do NGG culling, tess can't do fast launch)
- skip checking current_rast_prim with tessellation
(bake the condition into ngg_cull_vert_threshold)
- use only 1 vertex count threshold for enabling NGG shader culling
to simplify it. I think it doesn't have a big impact. The threshold
computation depends on more parameters than just fast launch.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8434>
When a secondary command buffer is encountered, insert an event that
links to the new batch.
This commit leaves intel_measure timestamp buffer objects mmapped,
which is more efficient than mapping/unmapping several times. With
the BOs mapped at all times, timestamp buffers can be managed directly
by intel_measure, where it will iterate over timestamps of linked
secondary buffers.
With timestamp buffers managed by intel_measure, a more efficient and
accurate check for render completion can be moved into intel_measure
from anv/iris.
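The linking scheme can be pictured with a small model. This is a sketch under assumed names (`Snapshot`, `Batch`, `iter_timestamps`), not the intel_measure data structures: a snapshot either carries a timestamp or links to a secondary batch, and iteration descends into linked batches in order.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Snapshot:
    label: str
    timestamp: int = 0
    # Set when this event links to a secondary batch instead of
    # carrying its own timestamp (hypothetical representation).
    secondary: Optional["Batch"] = None


@dataclass
class Batch:
    snapshots: List[Snapshot] = field(default_factory=list)


def iter_timestamps(batch):
    """Yield (label, timestamp) pairs, following links into secondary batches."""
    for snapshot in batch.snapshots:
        if snapshot.secondary is not None:
            yield from iter_timestamps(snapshot.secondary)
        else:
            yield snapshot.label, snapshot.timestamp


secondary = Batch([Snapshot("s0", 20), Snapshot("s1", 30)])
primary = Batch([
    Snapshot("p0", 10),
    Snapshot("link", secondary=secondary),
    Snapshot("p1", 40),
])
print(list(iter_timestamps(primary)))
```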
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7354>
This infrastructure collects GPU timestamps over common intervals, and
generates a CSV report to show how long rendering took. The overhead
of collection is limited to the flushing that is required at the
interval boundaries for accurate timestamps.
By default, timing data is sent to stderr. To direct output to a
file:
INTEL_MEASURE=file=/tmp/measure.csv {workload}
To begin capturing timestamps at a particular frame:
INTEL_MEASURE=file=/tmp/measure.csv,start=15 {workload}
To capture only 23 frames:
INTEL_MEASURE=count=23 {workload}
To capture frames 15-37, stopping before frame 38:
INTEL_MEASURE=start=15,count=23 {workload}
Designate an asynchronous control file with:
INTEL_MEASURE=control=path/to/control.fifo {workload}
As the workload runs, enable capture for 5 frames with:
$ echo 5 > path/to/control.fifo
Enable unbounded capture:
$ echo -1 > path/to/control.fifo
and disable with:
$ echo 0 > path/to/control.fifo
Select the boundaries of each snapshot with:
INTEL_MEASURE=draw  : DEFAULT - Collects timings for every render
INTEL_MEASURE=rt    : Collects timings when the render target changes
INTEL_MEASURE=batch : Collects timings when batches are submitted
INTEL_MEASURE=frame : Collects timings at frame boundaries
With INTEL_MEASURE=interval=5, the duration of 5 events will be
combined into a single record in the output. When possible, a single
start and end event will be submitted to the GPU to minimize
stalling. Combined events will not span batches, except in
the case of INTEL_MEASURE=frame.
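To make the option syntax above concrete, here is a minimal Python sketch of how such a comma-separated value could be interpreted. It is not the parser added by this commit: the function names, the dictionary layout, the default `start` of 0 and the idea that a snapshot mode can be combined with other options by commas are all assumptions.

```python
SNAPSHOT_MODES = {"draw", "rt", "batch", "frame"}


def parse_intel_measure(value):
    """Parse a string such as "file=/tmp/measure.csv,start=15" into a dict."""
    config = {
        "mode": "draw",   # documented default snapshot boundary
        "file": None,     # None means output goes to stderr
        "control": None,
        "start": 0,       # assumed default, not stated in the text
        "count": None,    # None means no limit on captured frames
        "interval": 1,
    }
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        key, separator, argument = token.partition("=")
        if not separator:
            if key not in SNAPSHOT_MODES:
                raise ValueError(f"unknown snapshot mode: {key}")
            config["mode"] = key
        elif key in ("file", "control"):
            config[key] = argument
        elif key in ("start", "count", "interval"):
            config[key] = int(argument)
        else:
            raise ValueError(f"unknown option: {key}")
    return config


def frame_window(config):
    """Return (first, last) captured frames, inclusive; last is None if unbounded."""
    first = config["start"]
    if config["count"] is None:
        return first, None
    return first, first + config["count"] - 1


print(frame_window(parse_intel_measure("start=15,count=23")))
# prints "(15, 37)"
```

The last frame is `start + count - 1`, which matches the example above where `start=15,count=23` captures frames 15-37 and stops before frame 38.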
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7354>
Now that all drivers are converted over, we can make a few changes.
First off, vk_device_init no longer takes two separate allocators
because we can assume that the parent instance is non-null and it can
pull the instance allocator from that. Second, dispatch tables and the
instance extension table are no longer optional. We leave the device
extension table optional for now because we don't do any verification at
vk_init_physical_device time and some drivers find it more convenient to
set the extensions later in their own physical_device_init for various
reasons.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8676>