"IPv6 header not found" in system logs for ICMPv6 Router Advertisements

Asked by Andres Pozo

Hi all,

I am facing what I think is a strange problem with IPv6 and QinQ on a Linux host. Maybe someone has faced a similar problem or could provide a hint.

I have some VMs running on a KVM host, and every time any VM sends an ICMPv6 Router Advertisement, the following message appears in syslog:

    Aug 10 11:18:36 Hostname kernel: [1722430.045240] IPv6 header not found

For the traffic, I use QinQ (802.1Q in both tags). The inner tag is set by OVS on the tap port, and the outer tag is set by a veth interface of type vlan, in the following way:

(Edit: the ASCII art did not render well.) The host is running Ubuntu 16.04 (kernel 4.4.0-62-generic).

VM-1 --- (tag=1)--- OVS ---- veth1.203 ----- veth1 ------ veth0 ----- BRIDGE ---- ens11f1 (physical interface)

'Regular' (non-ICMPv6) traffic seems to work fine; the problem apparently happens only with Router Advertisements.

I checked the code that writes that log message, and I think it is in 'kernel/net/ipv6/exthdrs_core.c':

    if (*offset) {
      struct ipv6hdr _ip6, *ip6;

      ip6 = skb_header_pointer(skb, *offset, sizeof(_ip6), &_ip6);
      if (!ip6 || (ip6->version != 6)) {
        printk(KERN_ERR "IPv6 header not found\n");
        return -EBADMSG;
      }
      start = *offset + sizeof(struct ipv6hdr);
      nexthdr = ip6->nexthdr;
    }

but both the protocol and protocol version seem right in tcpdump:

    11:28:38.675686 02:00:40:00:21:31 > 33:33:00:00:00:01, ethertype 802.1Q (0x8100), length 158: vlan 203, p 0, ethertype 802.1Q, vlan 49, p 0, ethertype IPv6, fe80::40ff:fe00:2131 > ff02::1: ICMP6, router advertisement, length 96
    `....`:...........@...!1................... @...............@.!1..........@.... ........*.. ..!1................*.. ..!1............. '.
    11:28:39.300076 02:00:40:00:23:2a > 33:33:00:00:00:01, ethertype 802.1Q (0x8100), length 158: vlan 204, p 0, ethertype 802.1Q, vlan 193, p 0, ethertype IPv6, fe80::40ff:fe00:232a > ff02::1: ICMP6, router advertisement, length 96
    `....`:...........@...#*...................%@...............@.#*..........@.... ........*.. ..#*................*.. ..#*.............
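
To make the layout in this capture concrete, here is a minimal Python sketch (user-space only, not kernel code) that walks a double-tagged frame and reads the IP version nibble, the same field the kernel check above compares against 6. The frame is synthetic, built from the MAC addresses and VLAN IDs in the first tcpdump line; the helper names `parse_vlan_stack`, `ip_version_at`, and `build_sample_frame` are made up for illustration. It only shows that the version field reads as 6 when both tags are skipped and as something else when the offset is short by one or two tags. Whether that is what actually happens in the kernel here is not established.

```python
import struct

ETH_P_8021Q = 0x8100   # 802.1Q tag
ETH_P_8021AD = 0x88A8  # 802.1ad service tag (not used in this capture)
ETH_P_IPV6 = 0x86DD


def parse_vlan_stack(frame):
    """Return (vlan_ids, ethertype, l3_offset) for an Ethernet frame."""
    offset = 12  # skip destination and source MAC addresses
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    vlans = []
    # Each VLAN tag is 4 bytes: the TCI followed by the next ethertype.
    while ethertype in (ETH_P_8021Q, ETH_P_8021AD):
        tci, ethertype = struct.unpack_from("!HH", frame, offset)
        vlans.append(tci & 0x0FFF)  # low 12 bits of the TCI are the VLAN ID
        offset += 4
    return vlans, ethertype, offset


def ip_version_at(frame, offset):
    """Read the high nibble of the byte at offset (the IP version field)."""
    return frame[offset] >> 4


def build_sample_frame():
    """Synthetic QinQ frame: outer VLAN 203, inner VLAN 49, bare IPv6 header."""
    dst = bytes.fromhex("333300000001")
    src = bytes.fromhex("020040002131")
    outer = struct.pack("!HH", ETH_P_8021Q, 203)
    inner = struct.pack("!HH", ETH_P_8021Q, 49)
    # Version 6, payload length 0, next header 58 (ICMPv6), hop limit 255,
    # followed by zeroed source and destination addresses (placeholders).
    ipv6 = struct.pack("!IHBB", 6 << 28, 0, 58, 255) + bytes(32)
    return dst + src + outer + inner + struct.pack("!H", ETH_P_IPV6) + ipv6


if __name__ == "__main__":
    frame = build_sample_frame()
    vlans, ethertype, l3 = parse_vlan_stack(frame)
    print("vlans:", vlans, "ethertype:", hex(ethertype), "L3 offset:", l3)
    print("version with both tags skipped:", ip_version_at(frame, l3))
    print("version with one tag skipped:", ip_version_at(frame, l3 - 4))
```

With an untagged frame the IPv6 header would start at byte 14; each 802.1Q tag pushes it 4 bytes further, so here it starts at byte 22.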

I already disabled (just in case) all the offloading features on the Ethernet card.

Questions:
* Is there any way I can verify which packets are dropped (that is, which packets are causing the 'header not found' log)?
* Is QinQ (with both tags 802.1Q) supported in the kernel? Is there any config parameter I should change to enable it?
* Any hint as to what the reason could be? QinQ combined with ICMPv6?

Any help/hint is truly appreciated!

Regards,
Andrés

Question information

Language: English
Status: Expired
For: Ubuntu
Assignee: No assignee

Andres Pozo (apoz) said :
#1

I investigated the issue further, using SystemTap to trace the function and print the values of its parameters and locals at the call.

The stap program I used:

    probe kernel.statement("*@net/ipv6/exthdrs_core.c:200") {
        printf("Function call %s -> %s\n", thread_indent(1), ppfunc());
        printf("%s\n", $$parms$);
        printf("%s\n",$$locals$$);
        printf("\nIP6->%s\n",$_ip6$$)
    }

The result is the following:

Function call 1999283 vhost-38719(38729): -> ipv6_find_hdr
skb={<union>={...}, .sk=0x0, .dev=0xffff880def80c000, .cb=[...], ._skb_refdst=0, .destructor=0x0, .sp=0x0, .nfct=0x0, .nf_bridge=0x0, .len=94, .data_len=0, .mac_len=14, .hdr_len=0, .queue_mapping=11, .cloned=1, .nohdr=0, .fclone=0, .peeked=0, .head_frag=0, .xmit_more=0, .headers_start=[...], .__pkt_type_offset=[...], .pkt_type=2, .pfmemalloc=0, .ignore_df=0, .nfctinfo=0, .nf_trace=0, .ip_summed=0, .ooo_okay=0, .l4_hash=1, .sw_hash=1, .wifi_acked_valid=0, .wifi_acked=0, .no_fcs=0, .encapsulation=0, .encap_h
_ip6={.priority=8, .version=4, .flow_lbl="9�?���F�r����� ", .payload_len=34832, .nexthdr='\377', .hop_limit='\377', .saddr={.in6_u={.u6_addr8="F�r����� ", .u6_addr16=[60998, ...], .u6_addr32=[2171792966, ...]}}, .daddr={.in6_u={.u6_addr8="#�q�����", .u6_addr16=[35619, ...], .u6_addr32=[2171702051, ...]}}} ip6={.priority=?, .version=?, .flow_lbl=?, .payload_len=?, .nexthdr=?, .hop_limit=?, .saddr={.in6_u={.u6_addr8=?, .u6_addr16=[?, ...], .u6_addr32=[?, ...]}}, .daddr={.in6_u={.u6_addr8=?, .u6_addr16=[?, .

IP6->{.priority=8, .version=4, .flow_lbl="9�?���F�r����� ", .payload_len=34832, .nexthdr='\377', .hop_limit='\377', .saddr={.in6_u={.u6_addr8="F�r����� ", .u6_addr16=[60998, ...], .u6_addr32=[2171792966, ...]}}, .daddr={.in6_u={.u6_addr8="#�q�����", .u6_addr16=[35619, ...], .u6_addr32=[2171702051, ...]}}}

I also ran the dropped-packets example script provided with SystemTap (the one that traces the kfree_skb function), and it seems some packets are being dropped:

    37 packets dropped at 0xffffffff8172c1f9
    10 packets dropped at 0xffffffff817dc08d
    5 packets dropped at 0xffffffff8181a935
    5 packets dropped at 0xffffffff8176b1fb
    2 packets dropped at 0xffffffffc029627e
    2 packets dropped at 0xffffffffc048f9d9

According to /boot/System.map-4.4.0-62-generic, these addresses fall within:

    37 by ffffffff8172bf30 t __netif_receive_skb_core
    10 by ffffffff817dbf30 T ipv6_rcv
    5 by ffffffff8181a8e0 t tpacket_rcv
    5 by ffffffff8176b060 t ip_rcv_finish

(I can't find the other two.)
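
The address-to-symbol lookup can be sketched in Python as a nearest-preceding-symbol search. This is only an illustration: `load_symbols` and `resolve` are hypothetical helper names, and the table holds just the four System.map lines quoted above, so any address outside those functions would be mis-resolved; with the full System.map the same search would apply. As far as I know, addresses starting with 0xffffffffc0 on x86-64 usually belong to loadable modules, which System.map does not list (they would normally appear in /proc/kallsyms on the running system), and that may be why the last two could not be found.

```python
import bisect

# The four System.map lines quoted in this post (address, type, name).
SYSTEM_MAP_EXCERPT = """\
ffffffff8172bf30 t __netif_receive_skb_core
ffffffff817dbf30 T ipv6_rcv
ffffffff8181a8e0 t tpacket_rcv
ffffffff8176b060 t ip_rcv_finish
"""


def load_symbols(lines):
    """Parse 'address type name' lines into a list sorted by address."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(addr, symbols):
    """Return (name, offset) of the closest symbol at or below addr."""
    addresses = [start for start, _ in symbols]
    index = bisect.bisect_right(addresses, addr) - 1
    if index < 0:
        return None  # addr lies below every known symbol
    start, name = symbols[index]
    return name, addr - start


if __name__ == "__main__":
    symbols = load_symbols(SYSTEM_MAP_EXCERPT.splitlines())
    # Drop addresses reported by the kfree_skb trace.
    for addr in (0xffffffff8172c1f9, 0xffffffff817dc08d,
                 0xffffffff8181a935, 0xffffffff8176b1fb):
        name, offset = resolve(addr, symbols)
        print("%#x -> %s+%#x" % (addr, name, offset))
```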

About the traces in the ipv6_find_hdr function:
* They confirm that some of the packets going through that function are the Router Advertisements (the length matches).
* It looks like the problem is that the IPv6 header has version=4?
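
One caveat about reading `_ip6` in the dump, offered as an interpretation rather than a confirmed finding: skb_header_pointer() copies into the caller's buffer only when the requested bytes are not in the linear part of the skb, and the traced skb shows data_len=0. So `_ip6` may never have been written and could simply hold leftover stack contents, while the real header is behind `ip6`, which the trace could not read (shown as `?`). The Python sketch below decodes the first 32-bit words of `.saddr` and `.daddr` from the dump. The helper names are hypothetical, and `as_kernel_pointer` assumes the four unreadable bytes that follow are all 0xff, which the garbled output cannot confirm.

```python
import struct

# First 32-bit words of .saddr and .daddr as printed in the _ip6 dump.
SADDR_WORD0 = 2171792966
DADDR_WORD0 = 2171702051


def le_bytes(word):
    """Bytes of a 32-bit value as laid out in memory on a little-endian host."""
    return struct.pack("<I", word)


def as_kernel_pointer(word):
    """Combine the word with an ASSUMED upper half of 0xffffffff."""
    return 0xFFFFFFFF00000000 | word


if __name__ == "__main__":
    for name, word in (("saddr", SADDR_WORD0), ("daddr", DADDR_WORD0)):
        print(name, hex(word), le_bytes(word).hex(),
              hex(as_kernel_pointer(word)))
```

The low words come out as 0x8172ee46 and 0x81718b23, and their in-memory bytes start with the `F`/`r` and `#`/`q` characters visible in the dump. Values of this shape resemble the 0xffffffff81... kernel text addresses reported by the dropped-packets script, which would fit stale return addresses on the stack better than IPv6 addresses.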

Launchpad Janitor (janitor) said :
#2

This question was expired because it remained in the 'Open' state without activity for the last 15 days.

Andres Pozo (apoz) said :
#3

The problem is still happening :(

Launchpad Janitor (janitor) said :
#4

This question was expired because it remained in the 'Open' state without activity for the last 15 days.

Andres Pozo (apoz) said :
#5

Can anyone help?

Launchpad Janitor (janitor) said :
#6

This question was expired because it remained in the 'Open' state without activity for the last 15 days.