ifup -a does not start dependants last, causes deadlocks with vlans/bonding

Bug #1636708 reported by Chris Pitchford
This bug affects 4 people
Affects                   Status      Importance  Assigned to              Milestone
ifupdown (Ubuntu)         Incomplete  High        Mathieu Trudel-Lapierre
ifupdown (Ubuntu) Xenial  Expired     Undecided   Unassigned

Bug Description

This is a problem I've been struggling with since moving to 16.04.1 from 14.04 (fresh install)

I don't believe this problem affected 14.04. I have used an almost identical interfaces file on 14.04 without problem.

On 16.04.1, however, 9/10 boots would hang during network configuration and leave the network incorrectly configured.

When calling "ifup -a", all candidate interfaces appear to be started in parallel, leading to collisions over lock files. This causes the boot to hang (until timeout) and leaves the network interfaces incorrectly configured.

Imagine this /etc/network/interfaces

auto eno1 bond0 bond0.5

iface eno1 inet manual
        bond-master bond0

iface bond0 inet manual
        bond-slaves eno1
        bond-mode 4
        bond-lacp-rate 1
        bond-miimon 100
        bond-updelay 200
        bond-downdelay 200

iface bond0.5 inet dhcp
        vlan-raw-device bond0

eno1 -> bond0 -> bond0.5 -> dhcp

When calling "ifup -a" at boot time all three interfaces are started at the same time.

bond0 and bond0.5 both attempt to share the same lock file:

  /run/network/ifstate.bond0
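
A quick way to see which process is holding that lock during such a hang (assuming util-linux's lslocks is available; purely a diagnostic aid):

  # show which process holds a lock on the ifupdown state files
  lslocks | grep ifstate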

If bond0 wins the race, the system will start correctly (1/10):

  * bond0 starts and creates the bond0 device and the ifenslave.bond0 file to indicate the bond is ready
  * eno1 polls for the ifenslave.bond0 file, when it appears it attaches eno1 to bond0
  * bond0 finishes and releases the lock
  * bond0.5 now acquires the lock.
  * bond0.5 starts dhclient, which can talk to the network and configure the interface

If, however, bond0.5 wins the lock race, the system will hang at boot (for 5 minutes) and fail to set up the network.

  * bond0.5 is awarded the ifstate.bond0 lockfile
  * bond0.5 starts dhclient waiting to hear from the network
  * bond0 is blocked, so the bond0 device is not created, nor is the ifenslave.bond0 file
  * eno1 polls but never finds the ifenslave.bond0 file so never attaches to bond0
  * bond0.5's dhclient is trying to talk to a disconnected network and never receives an answer

  ! bond0.5 is stuck running dhclient
  ! bond0 is stuck waiting for bond0.5 to finish
  ! eno1 is stuck waiting for bond0 to create the ifenslave.bond0 file

I believe ifup should start interfaces that share lock files in dependency order. The most basic interface must be awarded the lock before its dependants. In this case:

  1 eno1
  2 bond0
  3 bond0.5

but never:

  1 eno1
  2 bond0.5
  3 bond0

As a workaround, in /etc/network/interfaces:

-auto eno1 bond0 bond0.5
+auto eno1 bond0
+allow-bond bond0.5

And also in /lib/systemd/system/networking.service

 ExecStart=/sbin/ifup -a --read-environment
+ExecStart=/sbin/ifup -a --allow=bond --read-environment
 ExecStop=/sbin/ifdown -a --read-environment

Then run:

  systemctl daemon-reload

This causes all "auto" interfaces to start first; once they have completed, all "allow-bond" interfaces are started.
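
Instead of editing the packaged unit under /lib, the same extra ExecStart could be added with a systemd drop-in (a sketch; the drop-in file name is arbitrary):

# /etc/systemd/system/networking.service.d/allow-bond.conf (hypothetical name)
[Service]
# networking.service is Type=oneshot, so this ExecStart is appended after the packaged one
ExecStart=/sbin/ifup -a --allow=bond --read-environment

followed by "systemctl daemon-reload", as above.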

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: ifupdown 0.8.10ubuntu1.1 [modified: lib/systemd/system/networking.service]
ProcVersionSignature: Ubuntu 4.4.0-45.66-generic 4.4.21
Uname: Linux 4.4.0-45-generic x86_64
ApportVersion: 2.20.1-0ubuntu2.1
Architecture: amd64
Date: Wed Oct 26 06:32:57 2016
InstallationDate: Installed on 2016-10-24 (1 days ago)
InstallationMedia: Ubuntu-Server 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719)
SourcePackage: ifupdown
UpgradeStatus: No upgrade log present (probably fresh install)
modified.conffile..etc.init.networking.conf: [modified]
mtime.conffile..etc.init.networking.conf: 2016-10-26T04:52:05.750927

Revision history for this message
Chris Pitchford (newsuk-platform) wrote :
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ifupdown (Ubuntu):
status: New → Confirmed
Changed in ifupdown (Ubuntu):
importance: Undecided → Medium
assignee: nobody → Mathieu Trudel-Lapierre (cyphermox)
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

I thought we have systemd hotplugging as well, can we not generate dependencies that <email address hidden> Wants= and After= <email address hidden>, and prevent all of these locks?

Revision history for this message
Ryan Harper (raharper) wrote : Re: [Bug 1636708] Re: ifup -a does not start dependants last, causes deadlocks with vlans/bonding

On Mon, May 8, 2017 at 9:06 AM, Dimitri John Ledkov <email address hidden> wrote:

> I thought we have systemd hotplugging as well, can we not generate
>

I'm not sure what this means

> dependencies that <email address hidden> Wants= and After=
> <email address hidden>, and prevent all of these locks?
>

Wants= and After= are systemd *unit* dependencies; the locks and ordering here are internal state for ifupdown itself.

Systemd isn't in the picture other than triggering the calls: either ifup@.service calling ifup on a specific interface (physical ones, due to hotplug), or networking.service, which invokes ifup -a (for all non-physical interfaces defined).


Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Not quite.

On boot, there are multiple ways that ifup is called, and effectively it races with itself.

In my case I have two vlans on top of a bond of two NICs. By the time networking.service is called, the two NICs are present. networking.service is essentially `ifup -a`: it looks at the eni file and realises that it should bring up bond0. It looks for bond0 in its internal state and creates it.

This is where the race starts.

ifupdown ships /lib/udev/rules.d/80-ifupdown.rules which calls /lib/udev/ifupdown-hotplug which effectively does
$ exec systemctl --no-block start $(systemd-escape --template ifup@.service $INTERFACE)
(very strange to do this on systemd systems, because one could have just used SYSTEMD_WANTS, but anyway)
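
For comparison, a SYSTEMD_WANTS-based rule would look roughly like this (a sketch of the alternative mentioned in the parenthesis above, not a rule that ifupdown actually ships):

# hypothetical udev rule: let systemd pull in ifup@<iface>.service directly
SUBSYSTEM=="net", ACTION=="add", TAG+="systemd", ENV{SYSTEMD_WANTS}+="ifup@$env{INTERFACE}.service"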

At this point bond0 is being brought up by the networking.service unit (ifup -a) and <email address hidden> (ifup bond0). Sometimes one can see an "already configured" message from either of the two units in the logs.

But also, at this point in time, <email address hidden> and <email address hidden> may have been started as well.

In my case my machine manages to hit this race quite a bit. I am attaching a journal log of what is happening.

The log is produced using:
journalctl -u ifup@*.service -u networking -o verbose | grep -e UTC -e UNIT -e MESSAGE

You can see messages that things are waiting on bond0 to be up, and that one or the other vlan is waiting on the bond0 lock. To beat the locks and to prevent ifup@.service from interfering with networking.service, or executing in parallel and creating deadlocks, I had to encode the dependencies between these units in systemd's brain by doing this:

# cat /<email address hidden>/order.conf
[Unit]
<email address hidden>
<email address hidden>
# cat /<email address hidden>/order.conf
[Unit]
<email address hidden>
<email address hidden>

This way the ordering is enforced for the ifup@.service hotplug. IMHO ifupdown should ship a generator that would create these dependencies and orderings between interfaces, and possibly ifup -a should be reduced to starting ifup@%I.service for every interface it is meant to start for a given command.

I'm not sure if we can cheat and state that ifup@.service should be Wants=networking.service After=networking.service, because I think we may then get ourselves into a situation where ifupdown fails to resolve cycles in the eni when it is specified out of order.

For cloud-init this is more complicated, as on boot the generators will fire before eni is populated. Therefore cloud-init should probably re-run this magical ifupdown generator (just like it does for netplan), or create these symlinks directly and reload systemd before moving on to networking.service.

Does the above make sense at all?

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

@ Chris Pitchford

Could you please try the below, with your proposed workaround reverted, and let us know if that works for you?
The reason is that with a generator we should be able to create these orderings using the original / unmodified eni files as generated by installers / as were in use in Trusty.

mkdir -p /<email address hidden>
cat > /<email address hidden>/order.conf << EOF
[Unit]
<email address hidden>
<email address hidden>

EOF
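
The file and unit names above are masked by Launchpad; for the bond0/bond0.5 configuration in the description they would presumably look something like the following (an educated guess for illustration, not the masked original):

  mkdir -p /etc/systemd/system/ifup@bond0.5.service.d
  cat > /etc/systemd/system/ifup@bond0.5.service.d/order.conf << EOF
  [Unit]
  Wants=ifup@bond0.service
  After=ifup@bond0.service
  EOF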

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Maybe this is easier to read.

root@xnox-iad-nr5:~# journalctl -o short-monotonic -u <email address hidden>
-- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09 15:22:34 UTC. --
[ 6.740201] xnox-iad-nr5 systemd[1]: Started ifup for bond0.
[ 6.750333] xnox-iad-nr5 sh[1184]: Waiting for a slave to join bond0 (will timeout after 60s)
[ 6.987241] xnox-iad-nr5 sh[1184]: ifquery: recursion detected for interface bond0 in pre-up phase
[ 6.987341] xnox-iad-nr5 sh[1184]: ifquery: recursion detected for parent interface bond0 in pre-up phase
[ 6.987425] xnox-iad-nr5 sh[1184]: ifquery: recursion detected for parent interface bond0 in pre-up phase
root@xnox-iad-nr5:~# journalctl -o short-monotonic -u <email address hidden>
-- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09 15:22:34 UTC. --
[ 6.755723] xnox-iad-nr5 systemd[1]: Started ifup for bond0.101.
[ 6.757056] xnox-iad-nr5 sh[1286]: ifup: waiting for lock on /run/network/ifstate.bond0
[ 7.227572] xnox-iad-nr5 sh[1286]: Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
root@xnox-iad-nr5:~# journalctl -o short-monotonic -u <email address hidden>
-- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09 15:22:34 UTC. --
[ 6.760568] xnox-iad-nr5 systemd[1]: Started ifup for bond0.401.
[ 6.761920] xnox-iad-nr5 sh[1290]: ifup: waiting for lock on /run/network/ifstate.bond0
[ 7.197983] xnox-iad-nr5 sh[1290]: Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
root@xnox-iad-nr5:~# journalctl -o short-monotonic -u networking.service
-- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09 15:22:34 UTC. --
[ 6.645323] xnox-iad-nr5 systemd[1]: Starting Raise network interfaces...
[ 6.692530] xnox-iad-nr5 ifup[992]: Waiting for bonding kernel module to be ready (will timeout after 5s)
[ 6.693104] xnox-iad-nr5 ifup[992]: Waiting for bond master bond0 to be ready
[ 7.221867] xnox-iad-nr5 ifup[992]: /sbin/ifup: waiting for lock on /run/network/ifstate.bond0
[ 7.263179] xnox-iad-nr5 systemd[1]: Started Raise network interfaces.

Note that networking, <email address hidden>, and <email address hidden> all hit locks that are held by bond0.
Note that <email address hidden> beats networking.service to acquire the lock.
And that networking.service is blocked until after the ifup@bond0.*01 vlans release the locks.

Revision history for this message
Ryan Harper (raharper) wrote :

On Tue, May 9, 2017 at 10:32 AM, Dimitri John Ledkov <email address hidden> wrote:

> Maybe this is easier to read.
>
> root@xnox-iad-nr5:~# journalctl -o short-monotonic -u <email address hidden>
> -- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09
> 15:22:34 UTC. --
> [ 6.740201] xnox-iad-nr5 systemd[1]: Started ifup for bond0.
> [ 6.750333] xnox-iad-nr5 sh[1184]: Waiting for a slave to join bond0
> (will timeout after 60s)
> [ 6.987241] xnox-iad-nr5 sh[1184]: ifquery: recursion detected for
> interface bond0 in pre-up phase
> [ 6.987341] xnox-iad-nr5 sh[1184]: ifquery: recursion detected for
> parent interface bond0 in pre-up phase
> [ 6.987425] xnox-iad-nr5 sh[1184]: ifquery: recursion detected for
> parent interface bond0 in pre-up phase
> root@xnox-iad-nr5:~# journalctl -o short-monotonic -u
> <email address hidden>
> -- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09
> 15:22:34 UTC. --
> [ 6.755723] xnox-iad-nr5 systemd[1]: Started ifup for bond0.101.
> [ 6.757056] xnox-iad-nr5 sh[1286]: ifup: waiting for lock on
> /run/network/ifstate.bond0
> [ 7.227572] xnox-iad-nr5 sh[1286]: Set name-type for VLAN subsystem.
> Should be visible in /proc/net/vlan/config
> root@xnox-iad-nr5:~# journalctl -o short-monotonic -u
> <email address hidden>
> -- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09
> 15:22:34 UTC. --
> [ 6.760568] xnox-iad-nr5 systemd[1]: Started ifup for bond0.401.
> [ 6.761920] xnox-iad-nr5 sh[1290]: ifup: waiting for lock on
> /run/network/ifstate.bond0
> [ 7.197983] xnox-iad-nr5 sh[1290]: Set name-type for VLAN subsystem.
> Should be visible in /proc/net/vlan/config
> root@xnox-iad-nr5:~# journalctl -o short-monotonic -u networking.service
> -- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09
> 15:22:34 UTC. --
> [ 6.645323] xnox-iad-nr5 systemd[1]: Starting Raise network
> interfaces...
> [ 6.692530] xnox-iad-nr5 ifup[992]: Waiting for bonding kernel module
> to be ready (will timeout after 5s)
> [ 6.693104] xnox-iad-nr5 ifup[992]: Waiting for bond master bond0 to be
> ready
> [ 7.221867] xnox-iad-nr5 ifup[992]: /sbin/ifup: waiting for lock on
> /run/network/ifstate.bond0
> [ 7.263179] xnox-iad-nr5 systemd[1]: Started Raise network interfaces.
>
>
What is the status/timestamp of the physical devices (enspf90 and 91)?

> Note that networking, <email address hidden>, and <email address hidden> all hit locks that are held by bond0.

> Note that <email address hidden> beats networking.service to acquire the lock.
>

hrm, I didn't think we got ifup events for non-physical devices; this is certainly problematic.
The design, IIUC, is for physical devices to get ifup@.service triggers via udev events; see the BindsTo=sys-subsystem-net-devices-%i.device

The result in this boot is that networking.service (which starts at 6.64 seconds) calls ifup -a (pid 992), which is racing with ifup@bond0 (started at 6.74, pid 1184); so we have *two* ifups locking bond0. I think this is the real deadlock.

I'm not sure why we're getting an ifup@bond0 unit execution; the udev scripts are only triggered for physical devices.

Looks like when the modu...


Changed in ifupdown (Ubuntu):
importance: Medium → High
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

> Why would bond0 get created when loading the bonding module?

Because the default value of the bonding module parameter max_bonds is 1, meaning one bond is created when the bonding module loads.

I have since hit a similar issue with networkd, where the bonding module would be autoprobed, bond0 created, and networkd confused (bond0 already exists, not touching it).

I have since applied max_bonds=0 and used networkd. I guess I can try this again with ifupdown and max_bonds=0, and check what sort of udev events are received.

Imho we should default to max_bonds=0 such that loading bonding module doesn't create stray interfaces.
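
Setting that parameter persistently is a one-line modprobe.d entry (a minimal sketch; the file name is arbitrary):

# /etc/modprobe.d/bonding.conf
# stop the bonding module from auto-creating bond0 when it is loaded
options bonding max_bonds=0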

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ifupdown (Ubuntu Xenial):
status: New → Confirmed
Revision history for this message
Tom Verdaat (tom-verdaat) wrote :

We upgraded to the vlan 1.9-3.2ubuntu1.16.04.3 package and our networking broke horribly in a very similar way.

Let me start with our networking configuration. Two slaves, a bond and a vlan on top of that bond:

auto eno1
iface eno1 inet manual
   mtu 1500
   bond-master bond1
   bond-primary eno1

auto eno2
iface eno2 inet manual
   mtu 1500
   bond-master bond1

auto bond1
iface bond1 inet static
   mtu 1500
   address 10.10.10.3
   bond-miimon 100
   bond-mode active-backup
   bond-slaves none
   bond-downdelay 200
   bond-updelay 200
   dns-nameservers 10.10.0.1
   netmask 255.255.0.0

auto bond1.2
iface bond1.2 inet static
   mtu 1500
   address 10.11.10.3
   netmask 255.255.0.0
   vlan-raw-device bond1

This fails to come up correctly, both during boot and manually. Bringing up any of eno1, eno2, bond1 or bond1.2 results in the same problem: "ifup: waiting for lock on /run/network/ifstate.bond1".

The problem seems to be that ifup tries to bring up the base bond1 interface *again*, even if it is already up. It gets stuck waiting for the bond1 interface to be unlocked so it can bring it up, but bond1 is already up and thus locked, so that will never happen.

We also tried bringing all interfaces down and just running "ifup bond1.2" but that results in the same behavior.

The only workaround that seemed to work for us was to:
1) temporarily remove the bond1.2.cfg from /etc/network/interfaces.d
2) bring up eno1, eno2 and bond1
3) put the bond1.2.cfg back in its place
4) run "ifup bond1.2"
5) using another terminal, list all open processes using "ps -ef | grep ifup"
6) kill the "ifup bond1" process

The "ps -ef | grep ifup" during step 5, outputs two ifup processes. One for bond1 and one for bond1.2. As soon as we kill the "ifup bond1" process, the "ifup bond1.2" process completes immediately and correctly configures the vlan 2 subinterface.

This is clearly linked to vlan, because our infiniband interfaces work just fine. Also, it worked just fine before upgrading the package. So my best guess would be that something broke in the code that detects whether the vlan-raw-device is up. Perhaps related to LP #1573272?

Revision history for this message
Dan Streetman (ddstreet) wrote :

First, I believe the original description of this case shows it's the same problem (or same fix at least) as bug 1573272 - the vlan is racing with its raw device bond to come up. The fix in that bug should address this problem, by forcing the bond interface to always come up first (it's forced during udev processing when udev detects a new vlan interface). So I think this bug can be marked as fixed or duplicate - can anyone who had the original problem please test with the latest vlan package to verify this is fixed?

Second, @tom-verdaat, re: comment 11, I set up a system with that config, and rebooted many times and saw no problem. As your description doesn't involve dhcp on the vlan interface - which is what caused the long-term locking of the ifupdown bond0 lock from the initial problem description - it sounds like a new issue. Especially since this is an older bug, and you only now started seeing the problem after upgrading to the latest vlan package. Can you open a new bug for your issue please?

Changed in ifupdown (Ubuntu Xenial):
assignee: nobody → Dan Streetman (ddstreet)
Revision history for this message
Dimitri John Ledkov (xnox) wrote :

The fixes in vlan package are insufficient, as networking.service spawns ifup@.services in parallel with no ordering information and therefore systemd is free to process them in a racy manner. At boot ifup@.services are racing with networking.service and can result in a deadlock as you have experienced.

@tom-verdaat To resolve boot-time races, you need to provide manual dependency ordering of the ifup@.service systemd units, as at the moment, unfortunately, these are not automatically generated. For your pasted config I believe you require the following files:

# cat /<email address hidden>/order.conf
[Unit]
<email address hidden>
<email address hidden>

# cat /<email address hidden>/order.conf
[Unit]
<email address hidden>
<email address hidden>
<email address hidden>
<email address hidden>
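
Again, the unit names above are masked; for the eno1/eno2/bond1/bond1.2 configuration pasted earlier they would presumably be along these lines (a reconstruction under that assumption, not the masked originals):

# /etc/systemd/system/ifup@bond1.service.d/order.conf (assumed name)
[Unit]
Wants=ifup@eno1.service ifup@eno2.service
After=ifup@eno1.service ifup@eno2.service

# /etc/systemd/system/ifup@bond1.2.service.d/order.conf (assumed name)
[Unit]
Wants=ifup@bond1.service
After=ifup@bond1.service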

In 17.10 and later releases these races are fixed with netplan. netplan is also available and can be used on xenial. netplan generates the correct configuration, and all bonds/vlans are then managed by networkd in a race-free manner.
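
For reference, the configuration pasted above (comment #11) would translate to roughly the following netplan YAML (a sketch, not validated against that exact environment; the file name is illustrative):

# /etc/netplan/01-bond-vlan.yaml
network:
  version: 2
  ethernets:
    eno1: {}
    eno2: {}
  bonds:
    bond1:
      interfaces: [eno1, eno2]
      parameters:
        mode: active-backup
        primary: eno1
        mii-monitor-interval: 100
        up-delay: 200
        down-delay: 200
      addresses: [10.10.10.3/16]
      nameservers:
        addresses: [10.10.0.1]
  vlans:
    bond1.2:
      id: 2
      link: bond1
      addresses: [10.11.10.3/16]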

I apologize for the inconvenience =(

Revision history for this message
Dan Streetman (ddstreet) wrote :

> networking.service spawns ifup@.services in parallel with no ordering information

no, networking.service calls ifup -a, it doesn't spawn ifup@ services.

ifup -a doesn't bring up interfaces in parallel; it brings them up sequentially as ordered in the /etc/network/interfaces file(s). However, while an interface is being brought up, other interfaces are frequently created, which does trigger parallel ifup@ service calls. For example, if bond1 and bond1.2 are both defined: while bond1 is being ifup'ed, the actual bond1 interface is created and processed by udev, which then calls /lib/udev/vlan-network-interface with bond1; that parses the ifupdown configuration and finds the bond1.2 interface that uses bond1 as its raw-device, so it runs /etc/network/if-pre-up.d/vlan, which creates the bond1.2 interface; that is then processed by udev, which then ifup's it in parallel to bond1's ifup, etc.
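
One way to watch that chain for a given device is udev's dry-run mode, e.g. (assuming the bond1 device already exists; this only simulates rule processing and does not run the listed programs):

  udevadm test /sys/class/net/bond1 2>&1 | grep -i -e vlan -e ifup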

> To resolve boot time races, you need to provide manual dependency ordering of the
> ifup@.service systemd units

I don't believe that is entirely correct, but if your suggestion fixes things for @tom-verdaat then great. ifupdown (and the vlan add-on to it) is rather a mess, and I can't wait to move on to netplan with networkd.

Dan Streetman (ddstreet)
Changed in ifupdown (Ubuntu Xenial):
assignee: Dan Streetman (ddstreet) → nobody
Revision history for this message
Tom Verdaat (tom-verdaat) wrote :

@ddstreet I opened a separate ticket as requested a while ago. Some eyes on it would be very welcome. It's bug 1759573

Revision history for this message
Dan Streetman (ddstreet) wrote :

@newsuk-platform, are you still having this problem? If not, let's close this bug.

Dan Streetman (ddstreet)
Changed in ifupdown (Ubuntu Xenial):
status: Confirmed → Incomplete
Changed in ifupdown (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for ifupdown (Ubuntu Xenial) because there has been no activity for 60 days.]

Changed in ifupdown (Ubuntu Xenial):
status: Incomplete → Expired