I was today years old when I learned this about classic composable UNIX
tools:
~/mesa/mesa lava-submitter-overlay * % bash
[daniels@strictly mesa]$ set -e
[daniels@strictly mesa]$ false | tee
[daniels@strictly mesa]$ echo $?
0
Use tail rather than tee, so it doesn't hide our exit status.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>
Truth is relative in 2021, and Python's duck-typing means truthiness
isn't what you think it is. Use an explicit fatal-error handler to make
sure we crash out hard on failure, rather than hoping sys.exit() behaves
like you think it does, because it doesn't.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>
Trying to get arbitrary strings suitably quoted for shell, embedded in a
YAML file, processed by Python templating, is like seven bad ideas all
embedded into one big can of bees.
Reuse the same script we use for bare-metal to generate the environment,
tar that up into a per-job overlay which is added to the
inter-pipeline-reusable rootfs built by the container jobs and the
intra-pipeline-reusable overlay built by the build jobs.
@anholt wrote a chunk of this - replacing the $ENV_VARS GitLab CI
variable with a Python loop across the POSIX job environment - in
!11192, but this still had YAML quoting nightmares, and was more
needless duplication between LAVA and bare-metal.
The diff is large and annoying, but is mostly a sed job to get
ENV_VARS="FOO=bar BAZ=quux" into FOO: bar\nBAZ: quux.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Co-authored-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>
This brings in some major new features in the runner:
- piglit tests now include subtest reporting
- "-t" support for quick include-filtering of tests.
- piglit tests that crash after their result report are considered crashes.
- throws a nice error if you try to annotate the same failure twice
(e.g. lvp's dEQP-VK.glsl.builtin.precision.pow.highp.vec2,Fail)
Since the runner catches piglit test bugs where the same subtest is run
twice, we also uprev piglit to pull in the fixes for those.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11283>
LAVA doesn't consider failure to download a kernel/initramfs as an
infrastructure error, rather just a user error for supplying a broken
URL. We know our URLs aren't broken (because we're perfect), so assume
that failures in download validation are network issues and retry when
we hit them.
LAVA itself has been fixed to retry internally, so we'll get that when
upgrade in a couple of weeks, but gloss over it for now.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11218>
hang-detection is a vulkan-based lightweight wrapper from
parallel-deqp-runner that periodically submits empty command buffers
and waits for their completions. If the completion never happens, the
GPU is considered hung, the wrapped script is killed, and the job
should get aborted.
This should have no negative impact on the runtime of dEQP/traces/...,
but will allow saving time when the GPU gets hung as we can abort the
job immediately rather than waiting for the timeout.
In the case of B2C, we are using this tool's error message as a way to
trigger the reboot of the test machine and start again.
v2:
- Use hang-detection already with some jobs (Martin).
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11087>
When we remove the contents of the results directory, we `cd` into it.
The script expects that $PWD is /piglit, and $OLDPWD is the Mesa build
directory, however the cd into the results directory will make $OLDPWD
be $BUILDDIR/results.
This means that Piglit emits into results/results/ which looks weird,
but more importantly also fails OpenCL Piglit execution, because we
can't find our baseline result expectations.
Fix it by using an explicit variable rather than relying on history.
Fixes: 683ddf19dc ("ci: remove results directory content only with piglit runners")
Ref: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10856
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11126>
Covert the job submission process to a python script for more
robustness and control. allowing easier manipulation of job data.
As a result, it adds retry logic to deal with Infrastructure Errors in LAVA.
_call_proxy() is equipped with a robust retry logic, which I have been
using already in the past few weeks in stress testing to run hundreds
of jobs.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11079>
Use Piglit's replay profile to measure and store the time that frames
take to render in the GPU.
This job won't run automatically in regular pipelines, but will be
triggered automatically by a script for every successful pre-merge
pipeline.
This is because we want to generate performance data for every relevant
commit merged in main, but we don't want to keep a device busy during
the pre-merge run.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7987>
In order to reduce the amount of building work and network traffic, we
use docker caching. For that, we use the MESA_IMAGE_TAG and
MESA_BASE_TAG env variables which build the MESA_IMAGE variable to
identify different containers.
We are also using these tags to identify the cached artifacts produced
by other containers when those are part of the underlying OS to run
directly in DUTs through the DISTRIBUTION_TAG env variable.
The undesirable collateral effect is that we cannot combine a test job
using a container which would like to make use of some of the cached
artifacts created by another container. In other words, we cannot have
a job using a DISTRIBUTION_TAG and a MESA_IMAGE using a different
MESA_[IMAGE|BASE]_TAG variables.
Now, we split the usage in the DISTRIBUTION_TAG through the definition
of MESA_ARTIFACTS_TAG AND MESA_ARTIFACTS_BASE_TAG.
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10977>