Comment 5 for bug 1771480

Jay Vosburgh (jvosburgh) wrote :

The dev_disable_lro warning is happening due to some logic issues in the features code. The LRO on the VLAN (bond0.200, e.g.) that's being warned about does end up being disabled by a NETDEV_FEAT_CHANGE callback when the underlying bond0's features are updated, so the warning is spurious.

Tracing the dev_disable_lro -> netdev_update_features for the bond0.2004 VLAN, I see:

name="bond0" feat=219db89 hw_feat=20219cbe9 want_feat=20219cbe9 vlan_feat=198069

NETIF_F_LRO = 0x8000

dev_disable_lro
        wanted_features &= ~NETIF_F_LRO
        bond0.2004 wanted_features = 0x200194869 # no LRO

__netdev_update_features
        features = netdev_get_wanted_features
                return (dev->features & ~dev->hw_features) | dev->wanted_features;
        (0x19d809 & ~0x23839487b) | 0x200194869
             ^LRO           ^no LRO        ^no LRO
        0x9000 | 0x200194869
$2 = 0x20019d869
            ^ LRO

vlan_dev_fix_features(dev, 0x20019d869) # has LRO

        struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
        netdev_features_t old_features = features;

        features &= real_dev->vlan_features; # 0x198069 has LRO
        features |= NETIF_F_RXCSUM; # 0x100198069 has LRO
        features &= real_dev->features; # 0x198009 has LRO

        features |= old_features & NETIF_F_SOFT_FEATURES; # save GSO / GRO
        features |= NETIF_F_LLTX;

        return features; # will have LRO

So, basically, LRO is set in the underlying bond0's features, so it ends up being kept in the VLAN device's features even though it wasn't in wanted_features. Later, dev_disable_lro will call dev_disable_lro on all the lower devices (the bond0 in this case), and the update of features for the bond0 will issue a NETDEV_FEAT_CHANGE callback to the bond0.2004 VLAN, which will then set the features correctly.

The Ubuntu 3.13 __netdev_update_features (called by dev_disable_lro via netdev_update_features) lacks the additional logic found in later kernels to sync the features to lower devices. In later kernels, that logic presumably triggers the NETDEV_FEAT_CHANGE within the call to __netdev_update_features, so that the bond0.2004 VLAN is updated before we return to dev_disable_lro (but I haven't verified this).

I suspect the fix to eliminate the warning is to apply the "sync_lower:" block from a later kernel __netdev_update_features to 3.13, along with the netdev_sync_lower_features function it uses.