AlexIndustrial/mesa

Author	SHA1	Message	Date
Eric Anholt	e2e3e4cbf3	ci: Stop doing internal retries in bare-metal. We have job-level retry on failure now, and will continue to need to in order to work around fd.o infrastructure flakes. If we stop doing retry inside the job, then we can crank down the gitlab-level timeouts on test jobs to be closer to our CI guidelines and avoid blocking a runner for an hour when things go wrong (for example, cheza #16 failing to boot in a recognized way and continuously looping due to the intra-job retry). Plus, the job logs will be more readable when you don't have two boots in one job, and we'll get the flakes surfaced in our monitoring dashboards. If internal retries were really doing useful work we may see an increase in flakes as a result of this. I'm committing to turning off boards or reducing coverage as necessary to handle this. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25790>	2023-10-19 07:42:15 +00:00
Emma Anholt	0c1b6af1b6	ci/fastboot: Use a case insensitive match for a fastboot line. Newer boards like the RB5 have a capital F, so this will make the script more reusable for drm ci. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25311>	2023-09-21 15:32:30 +00:00
Emma Anholt	cde8c92ab6	ci/bare-metal: Add timeouts to the shell commands called in fastboot. It seems that we sometimes stall out executing "fastboot boot", and if that happens we want to reboot the board and try again. Fixes: #6682 Acked-by: David Heidelberg <david.heidelberg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17607>	2022-07-19 21:05:07 +00:00
Emma Anholt	5f09b1ebe9	ci/bare-metal: Add test phase timeouts to all boards. This should help with "marge got stuck for an hour and all I got was this failed job with no results/" when a system intermittently wedges. This replaces the BM_POE_TIMEOUT ("did we get something on serial in the last 3 minutes?") that rpi had, in favor of checking that the whole test job gets through in 20 minutes. Acked-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>	2022-06-21 21:38:25 +00:00
Emma Anholt	ca453714aa	ci/bare-metal: Add per-boot-stage timeouts for fastboot and poe. This should avoid the 1-hour timeouts if something goes wrong, and just restart. Fixes: #6682 Acked-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>	2022-06-21 21:38:25 +00:00
Emma Anholt	1e15ec1949	ci/bare-metal: Apply autopep8 to our python scripts. My editor likes to pep8 as I edit, and I'm tired of carefully not committing those changes. Acked-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>	2022-06-21 21:38:25 +00:00
Ilia Mirkin	268fc8e5c1	gitlab-ci: detect a3xx gpu hang recovery failure. But don't bail immediately, instead print out some more lines after the hang, hopefully catching info about the cause of the hang. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14033>	2021-12-03 23:26:27 +00:00
Emma Anholt	8f5a0bd9b4	ci/bare-metal: Close serial and join serial threads before exit. This should fix the intermittent (~1/week) cheza failure where python complains that a thread tried to do stdio while the main thread has exited. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13462>	2021-11-10 20:36:57 +00:00
Emma Anholt	306a039472	ci/baremetal: Retry if our network device spontaneously fails. Seen in https://gitlab.freedesktop.org/mesa/mesa/-/jobs/13824132. It's unlikely that graphics would kill the network, so just assume it's not our fault and keep going. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12939>	2021-09-20 19:55:55 +00:00
Daniel Stone	5f32d2a438	ci: Consistent pass/fail result output. One less point of differentiation. Signed-off-by: Daniel Stone <daniels@collabora.com> Acked-by: Martin Peres <martin.peres@mupuf.org> Acked-by: Emma Anholt <emma@anholt.net> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11337>	2021-06-15 14:02:44 +02:00
Emma Anholt	6cfd1298e1	ci/fastboot: Consistently restart the run on intermittent conditions. Not currently on my list of intermittent issues, but let's be resilient hopefully. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11308>	2021-06-11 20:24:55 +00:00
Emma Anholt	fe70badfc3	ci/fastboot: Add a serial timeout to catch fastboot prompt failure. The a530s will occasionally fail to make it to the fastboot prompt, with no other deltas between a working log and a log stalled waiting for that line to show up. So, add a serial timeout (like the rpi boards do for similar reasons), and on timeout restart the run. We actually restart the whole serial watching process, because the SerialBuffer finishes itself on timeout. This should also help with the intermittent issue we've had where a power cycle causes the python serial module to throw an exception. Tested with the gitlab-disabled db820c that never makes it to the fastboot prompt (I think it's one where we need a longer micro cable to connect it!) and saw successful boot looping to retry. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11308>	2021-06-11 20:24:55 +00:00
Eric Anholt	1af7be02d7	ci/bare-metal: Move the db820c lockup detect to the right boot script. Fixes: `2407952ec9` ("ci/bare-metal: Restart a run on intermittent kernel lockups.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9715>	2021-03-19 22:07:57 +00:00
Juan A. Suarez Romero	e45d372968	ci/baremetal: highlight message errors. Highlight in red errors from the baremetal run, so the user is more aware of what happened. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9335>	2021-03-01 18:22:24 +00:00
Eric Anholt	fd2ee49b21	ci/bare-metal: Use python for handling fastboot booting and parsing. Modeling after what I did for cros_servo_run.py, this gives us easy support for restarting the test run on a530 when we detect a spontaneous reboot. I had to touch up serial_buffer.py to handle buffering in from a file instead of a serial device, to support the upcoming etnaviv CI (tested by running it against a serial log from db410c and seeing it step to calling "fastboot"). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>	2020-09-03 23:22:44 +00:00

15 Commits