The intent is to provide an easy way to measure the impact of an
optimization, not only by measuring the whole workload completion time
but also by measuring certain chunks of the workload like command
buffers, renderpasses, or even separate draws.
A moderate perf win in a rare case may not translate into a statistically
significant overall result. An optimization may also hurt perf in some
cases and help in others, which is also hard to judge from overall perf.
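As a rough illustration of why the overall number can hide a local win,
here is the arithmetic with made-up figures (they are assumptions for the
example, not measurements):

```python
# Hypothetical numbers, chosen only to illustrate the arithmetic.
def overall_speedup(fraction_of_total, local_speedup):
    """Relative reduction of the total time when a chunk that takes
    `fraction_of_total` of the workload becomes `local_speedup` faster
    (0.10 means the chunk takes 10% less time)."""
    return fraction_of_total * local_speedup


# A renderpass that accounts for 2% of the frame gets 10% faster.
rare_pass = overall_speedup(fraction_of_total=0.02, local_speedup=0.10)
print(f"overall change: {rare_pass:.2%}")
```

A change of that size is easily lost in run-to-run noise, while the same
win is obvious when only the affected renderpass is compared.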
For best results, pin CPU/GPU frequencies and disable GPU suspend.
Exclude all unnecessary tracepoints via TU_GPU_TRACEPOINT.
Usage:
u_trace_gather.py gather_all \
--loops 1 --launcher "renderdoccmd replay --loops 12" \
--traces-list /path/to/traces.txt \
--traces-dir /path/to/dir/with/traces/ \
--results /path/to/results/ \
--alias new-shiny-opt
u_trace_compare.py compare \
--results /path/to/results/ \
--loops-merged true \
--alias-a default \
--alias-b new-shiny-opt \
--event-start start_render_pass \
--event-end end_render_pass \
--filter "int(params['drawCount']) > 10"
u_trace_compare.py details \
--results /path/to/results/ \
--trace-name test.rdc \
--alias default \
--event-start start_render_pass \
--event-end end_render_pass
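A minimal sketch of how a --filter expression like the one above could be
applied, assuming it is a Python expression evaluated with `params` bound
to a dict of the event's parameters (string values). The `matches` helper
and the sample events are hypothetical and not taken from the scripts:

```python
# Assumption: the filter is a Python expression evaluated with `params`
# bound to the event's parameter dict. This is an illustration only,
# not the code used by u_trace_compare.py.
def matches(filter_expr, params):
    allowed_globals = {"__builtins__": {}, "int": int}
    return bool(eval(filter_expr, allowed_globals, {"params": params}))


# Hypothetical start_render_pass events with their parameters.
events = [
    {"drawCount": "3"},
    {"drawCount": "25"},
]

kept = [p for p in events if matches("int(params['drawCount']) > 10", p)]
print(kept)
```

Only the renderpass with more than 10 draws is kept for the comparison.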
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16914>
The android-cts-runner.sh and android-deqp-runner.sh scripts are usually
sourced from android-runner.sh which then propagates the EXIT_CODE value
set in the scripts.
However, this scheme does not cover the case where android-cts-runner.sh
and android-deqp-runner.sh are called directly. In that case they do
not return the previously saved EXIT_CODE as the status of their last
command.
Fix that, allowing developers to call the scripts directly with the
intended behavior.
After this change android-runner.sh does not need to know about the
EXIT_CODE variable anymore, which was a slight layering violation, but
it can just rely on the exit code of the last executed command.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36286>
The Xe kernel driver doesn't allow vm_bind on a compressed BO
if it has a user pointer. And we probably shouldn't enable CCS
compression on such memory in any case.
This change is necessary to prevent failures once we adjust the
priority of compression PAT entries in a following commit:
Vulkan CTS:
dEQP-VK.api.buffer_marker.compute.external_host_mem.top_of_pipe.
memory_dep.buffer_copy
dEQP-VK.memory.external_memory_host.simple_allocation.
minImportedHostPointerAlignment_x3
anv_kmd_backend.c:308: xe_vm_bind_op: Assertion
`errno_ != EINVAL' failed.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36275>
programs start with one reference per contained shader. the only other
places they can be referenced are:
* batch refs
* merged separable programs during compile
this unref did not match any of those cases and caused early deletion
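a toy model of the problem, assuming a plain counter that destroys the
object when it reaches zero (the `Program` class here is hypothetical and
not the driver's data structure):

```python
# Toy refcount model: an unref without a matching ref makes the count
# hit zero while a legitimate owner still holds the object.
class Program:
    def __init__(self, num_shaders):
        # one reference per contained shader
        self.refcount = num_shaders
        self.destroyed = False

    def ref(self):
        self.refcount += 1

    def unref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.destroyed = True


prog = Program(num_shaders=2)
prog.ref()    # batch ref taken
prog.unref()  # batch ref dropped, balanced
prog.unref()  # stray unref with no matching ref
prog.unref()  # first shader releases its reference
print(prog.destroyed)  # the second shader still expects a live program
```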
caselist:
dEQP-GLES31.functional.atomic_counter.inc.8_counters_5_calls_10_threads
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.linear_mipmap_linear_linear_mirror_repeat
dEQP-GLES31.functional.texture.specification.texsubimage3d_pbo.rgb32i_cube_array
dEQP-GLES31.functional.texture.specification.texsubimage3d_pbo.rgb8_image_height_cube_array
dEQP-GLES31.functional.texture.specification.texsubimage3d_depth.depth24_stencil8_cube_array
dEQP-GLES31.functional.sample_shading.min_sample_shading.multisample_texture_samples_2_color
dEQP-GLES31.functional.vertex_attribute_binding.usage.mixed_usage.mixed_attribs_instanced_binding
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36294>
For example, divergence analysis can call nir_print_instr with an
instruction that doesn't have a block set. When that happens,
print_state::shader will be NULL.
I stumbled on this while testing !36147.
v2: Use nir_instr::has_debug_info instead. Suggested by Konstantin.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Fixes: ce0f30b230 ("nir: Add variable debug info to instructions")
Fixes: 3aeab4ce40 ("nir/print: Do not print debug information when gathering it")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36267>
The validation settings file must be named
$GPU_VERSION-validation-settings.txt for deqp-runner.sh to use it.
Enable Vulkan Validation Layers for the two RADV jobs that commit 5fd0b634d4
("zink: add VVL for RADV jobs") attempted to enable, as well as the new
Cezanne job.
Also add filters for VUIDs that error due to unrecognized extensions when
the VVL version in CI is too old.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35717>
The NORMALIZE_SIGN_EXTEND flag is required for SINT vertex formats to
properly handle sign extension when reading signed integer data, not
just for traditional normalized formats.
Replace manual channel description checks with utility functions to
determine when NORMALIZE_SIGN_EXTEND is needed. This makes the code
more maintainable and less error-prone.
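To illustrate why signed integer data needs sign extension, here is a
generic sketch of widening an 8-bit value with and without it. The helper
names are made up for the example; this is not the driver code and does
not model the hardware:

```python
def zero_extend(raw, bits):
    """Widen a `bits`-wide value by padding with zeros."""
    return raw & ((1 << bits) - 1)


def sign_extend(raw, bits):
    """Widen a `bits`-wide two's complement value, keeping its sign."""
    raw &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (raw ^ sign_bit) - sign_bit


raw = 0xFE  # the byte that encodes -2 as a signed 8-bit integer
print(zero_extend(raw, 8), sign_extend(raw, 8))
```

Without sign extension the attribute value -2 would be read as 254.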
Fixes dEQP-GLES3.functional.vertex_arrays.single_attribute.output_types.*
tests with signed integer vertex data.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36274>
Set VIVS_TS_SAMPLER_CONFIG_64BPP_FORMAT when the texture format
has 64 bits per pixel to ensure proper tile status handling for
wide formats.
Fixes at least the following CTS tests:
- dEQP-GLES3.functional.fbo.color.repeated_clear.sample.tex2d.rg32i
- dEQP-GLES3.functional.fbo.color.repeated_clear.sample.tex2d.rg32ui
- dEQP-GLES3.functional.fbo.color.repeated_clear.sample.tex2d.rgba16i
- dEQP-GLES3.functional.fbo.color.repeated_clear.sample.tex2d.rgba16ui
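The formats in the tests above all add up to 64 bits per pixel. A small
sketch of that arithmetic, using a hand-written table (the table, the
`is_64bpp` helper and the rgba8 contrast entry are assumptions for the
example, not how the driver queries the format):

```python
# (channel count, bits per channel) for the formats named above, plus
# rgba8 as a 32bpp contrast case. Hand-written for illustration.
FORMATS = {
    "rg32i": (2, 32),
    "rg32ui": (2, 32),
    "rgba16i": (4, 16),
    "rgba16ui": (4, 16),
    "rgba8": (4, 8),
}


def is_64bpp(name):
    channels, bits_per_channel = FORMATS[name]
    return channels * bits_per_channel == 64


for name in FORMATS:
    print(name, is_64bpp(name))
```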
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36270>
prologs inflate register pressure, so this can help a lot in the monolithic case
(together with dynamic strides). eliminates spilling from some vertex shaders in
Control that read a ton of attributes per vertex.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36271>
A few device features (most importantly bufferDeviceAddress) are behind
a check for has_vm_always_valid. When replaying fossilize captures using
SPIR-V capabilities like PhysicalStorageBuffer addresses (which itself
depends on bufferDeviceAddress) on a null device, these features will be
hidden and replay will fail. Claim vm_always_valid support in the null
winsys - it's not like we'll ever create any BOs anyway.
Fixes: df1224c8 ("radv: rework VM_ALWAYS_VALID handling")
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36221>
Specifically, fe8bc3f23e made the "default" case `manual`
instead of `never`, which causes all staging pipelines to include that
job regardless of any file `changes`, and always as `manual`, which
makes it block pipelines.
Let's make it automatic for upstream pipelines (i.e. staging and
push-to-main), and only run on file `changes`.
Fixes: fe8bc3f23e ("ci: Only run rustfmt when necessary")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36260>
Switch from the legacy API to the atomic API. Atomic support should be
standard at this point, and failing to get a KHR_display connector in its
absence seems reasonable (rather than retaining code that we don't expect
to use or test, as in
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4176)
This is a prerequisite for modifiers support, where we need to be able to
pick a specific plane in order to see its supported modifiers list.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6173>
Instead of setting it up when the swapchain is presented, set it up when
creating the swapchain. This means that multiple swapchains might use
the same crtc, but only one can be active at a time, and the connectors
are now refcounted.
This is necessary for the next commit.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6173>
OpenCL doesn't really allow a wide range of different samplers, so the
cache hit rate is pretty high across all applications.
This also allows us to stop unbinding samplers after each kernel launch.
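A minimal sketch of such a cache, assuming samplers are keyed by the three
OpenCL sampler properties (normalized coordinates, addressing mode, filter
mode). The `SamplerCache` class and the `create_sampler` callback are
hypothetical and only illustrate the idea:

```python
# Sketch: create each distinct sampler state once and reuse it afterwards.
class SamplerCache:
    def __init__(self, create_sampler):
        self._create = create_sampler  # hypothetical driver callback
        self._samplers = {}
        self.misses = 0

    def get(self, normalized_coords, addressing_mode, filter_mode):
        key = (normalized_coords, addressing_mode, filter_mode)
        sampler = self._samplers.get(key)
        if sampler is None:
            self.misses += 1
            sampler = self._create(*key)
            self._samplers[key] = sampler
        return sampler


cache = SamplerCache(lambda *state: {"state": state})
a = cache.get(True, "clamp_to_edge", "linear")
b = cache.get(True, "clamp_to_edge", "linear")
c = cache.get(False, "repeat", "nearest")
print(a is b, cache.misses)
```

Because the key space is small, most lookups after warm-up return an
already created sampler.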
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36243>