1e6af961ec
The intent is to provide an easy way to measure the impact of an
optimization, not by measuring the whole workload completion time
but also by measuring certain chunks of the workload like command
buffers, renderpasses, or even separate draws.
A moderate perf win in a rare case may not translate into statistically
signifacant overall result. An optimization also may hurt perf in some
cases and help in other which is also hard to judge from overall perf.
For best results pin cpu/gpu frequencies and disable gpu suspend.
Exclude all unnecessary tracepoints via TU_GPU_TRACEPOINT.
Usage:
u_trace_gather.py gather_all \
--loops 1 --launcher "renderdoccmd replay --loops 12" \
--traces-list /path/to/traces.txt \
--traces-dir /path/to/dir/with/traces/ \
--results /path/to/results/ \
--alias new-shiny-opt
u_trace_compare.py compare \
--results /path/to/results/ \
--loops-merged true \
--alias-a default \
--alias-b new-shiny-opt \
--event-start start_render_pass \
--event-end end_render_pass \
--filter "int(params['drawCount']) > 10"
u_trace_compare.py details \
--results /path/to/results/ \
--trace-name test.rdc \
--alias default \
--event-start start_render_pass \
--event-end end_render_pass
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16914>