Comment 31 for bug 1615021

Steve Langasek (vorlon) wrote : Re: [Bug 1615021] Comment bridged from LTC Bugzilla

On Tue, Sep 13, 2016 at 06:20:49PM -0000, bugproxy wrote:
> During investigation, a problem/mistake was found with systemd but is
> almost-certainly not the cause of the hang.

Agreed.

> This fixed systemd was supposedly being made available in xenial-proposed
> repositories, but so far does not seem to have appeared there.

The systemd package is present in the xenial-proposed repository, but no
updated installer image has yet been produced that includes it.

We have had sufficient verification of the systemd change that it will be
released to xenial users for the general problem; we will also update the
debian-installer images as a matter of course.

Based on the feedback from <email address hidden>, it does not appear that the
buggy udev rule is blocking progress on this bug.

> This bug was placed in "verify" state and it started causing email to be
> sent several times a day reminding me to verify the fix.

I don't know why this would be. Our process generates a single message to
the bug when a package is accepted into the -proposed repository; it does
not send daily reminder messages.

> ------- Comment From <email address hidden> 2016-09-13 14:18 EDT-------
> I should also amend my previous comment by saying, if Canonical has some
> suggestions of how to gather more information in order to help debug this,
> they should let us know and we can make test runs for them.

My previous suggestion to gpiccoli on IRC was to modify the initrd to dump
the state of the udev database at a point after the hang. I haven't seen
such output attached here; does that mean it's not possible to produce such
results because the kernel hard locks? Currently the only debugging
information I've seen is that the /lib/debian-installer/start-udev script
never returns, but that does not mean the kernel has locked up - it only
shows that udev believes it has not finished processing. I would still like
to see a dump of the udev database at the point of the hang, not just a udev
debug log showing processing up to that point.
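
For reference, a minimal sketch of what such a dump could look like. It
assumes a shell can still run in the initrd after the hang (for example on a
second console, or from a modified start-udev that backgrounds the stuck
call); the function name dump_udev_state and the output directory are
illustrative, not part of debian-installer.

```shell
# Sketch only: collect udev and kernel state after the hang is observed.
# dump_udev_state and the default output directory are hypothetical names.
dump_udev_state() {
    outdir="${1:-/tmp/udev-state}"
    mkdir -p "$outdir" || return 1
    # Full udev database: one record per device, including its properties.
    udevadm info --export-db > "$outdir/udev-db.txt" 2>&1
    # Process list as reported by ps, to see whether udev processes remain.
    ps > "$outdir/ps.txt" 2>&1
    # Kernel messages, to check for lockups or driver errors.
    dmesg > "$outdir/dmesg.txt" 2>&1
    # Print where the files were written.
    echo "$outdir"
}
```

Calling `dump_udev_state /tmp/udev-state` and attaching the resulting files
to the bug would show which devices udev has and has not finished with.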

Is this problem only reproducible with the X710 ethernet adapter? Is this a
removable ethernet adapter, and have you tested what happens if it's
removed? If it's not removable, have you tested what happens if you
blacklist the i40e driver? The ethernet driver may be a complete red
herring, and the problem may be with something that normally happens after
ethernet driver initialization rather than with the ethernet driver itself.

I would also have asked whether this could be an issue with the console
output being redirected to some different device, but since Guilherme
indicated that the problem appeared to be racy, with boot to the installer
sometimes succeeding, that seems unlikely to be the problem.

If you can reproduce this problem with the cloud image from
<http://cloud-images.ubuntu.com/xenial/current/xenial-server-cloudimg-ppc64el-disk1.img>,
that would present additional debugging opportunities since that uses a
standard Ubuntu initramfs instead of the installer initramfs and will
support various 'break=' options to interrupt the boot and introspect the
system state.
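
As a rough illustration of the 'break=' approach, assuming the cloud image
boots with a standard initramfs-tools initramfs and that the kernel command
line can be edited from the bootloader: the choice of stage (premount) and
the output path below are only examples.

```shell
# Appended to the kernel command line:
#   break=premount
# initramfs-tools then stops at an "(initramfs)" shell before the root
# filesystem is mounted. From that shell:
udevadm info --export-db > /run/udev-db.txt   # dump the udev database
cat /proc/modules                             # list the loaded kernel modules
dmesg                                         # show kernel messages so far
exit                                          # leave the shell to continue booting
```

Other stages accepted by initramfs-tools include top, modules, mount,
mountroot, bottom and init, so the boot can be interrupted before or after
the point where the hang would occur.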