Possible delay in Quantum GRE and flows

Asked by Marco Colombo

Hi All,
I have a brand-new install of Grizzly on Ubuntu 12.04 with network_type gre.
I see a high RTT for the first packets when I try to ping the VM.
My VM has a private IP, 192.168.178.2, NATted to the public IP 185.21.172.18.
When I try to reach the public IP, I get this output:

PING 185.21.172.18 (185.21.172.18) 56(84) bytes of data.
64 bytes from 185.21.172.18: icmp_req=1 ttl=58 time=6546 ms
64 bytes from 185.21.172.18: icmp_req=2 ttl=58 time=5546 ms
64 bytes from 185.21.172.18: icmp_req=3 ttl=58 time=4546 ms
64 bytes from 185.21.172.18: icmp_req=4 ttl=58 time=3546 ms
64 bytes from 185.21.172.18: icmp_req=5 ttl=58 time=2546 ms
64 bytes from 185.21.172.18: icmp_req=6 ttl=58 time=1546 ms
64 bytes from 185.21.172.18: icmp_req=7 ttl=58 time=546 ms
64 bytes from 185.21.172.18: icmp_req=8 ttl=58 time=2.52 ms
64 bytes from 185.21.172.18: icmp_req=9 ttl=58 time=2.50 ms
64 bytes from 185.21.172.18: icmp_req=10 ttl=58 time=2.73 ms
64 bytes from 185.21.172.18: icmp_req=11 ttl=58 time=2.41 ms
64 bytes from 185.21.172.18: icmp_req=12 ttl=58 time=2.67 ms

It seems that the flows are created with some delay.
The output of the command ovs-dpctl dump-flows br-int | grep 212.29.130
is blank as soon as I start the ping;
when the ping starts working, the flows have been created successfully:

ovs-dpctl dump-flows br-int | grep 212.29.130

in_port(80),eth(src=fa:16:3e:99:1b:4c,dst=fa:16:3e:da:15:4c),eth_type(0x0800),ipv4(src=212.29.130.12,dst=192.168.178.2,proto=1,tos=0,ttl=58,frag=no),icmp(type=8,code=0), packets:7, bytes:686, used:0.276s, actions:push_vlan(vid=11,pcp=0),65

in_port(65),eth(src=fa:16:3e:da:15:4c,dst=fa:16:3e:99:1b:4c),eth_type(0x8100),vlan(vid=11,pcp=0),encap(eth_type(0x0800),ipv4(src=192.168.178.2,dst=212.29.130.12,proto=1,tos=0,ttl=64,frag=no),icmp(type=0,code=0)), packets:1, bytes:98, used:0.276s, actions:pop_vlan,80

If I ping the VM again while the flow still exists, ping works properly; otherwise the latency problem occurs again.
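A rough way to watch for the flow appearing while the ping runs in another terminal (the bridge name and VM IP below are the ones from this report; on a host without Open vSwitch the count simply stays at zero):

```shell
#!/bin/sh
# Count kernel datapath flows that match the VM's fixed IP.
count_vm_flows() {
    ovs-dpctl dump-flows br-int 2>/dev/null | grep -c '192.168.178.2'
}

# Poll with timestamps; the first nonzero count marks the moment
# the kernel flow was actually installed.
i=0
while [ "$i" -lt 5 ]; do
    printf '%s  flows=%s\n' "$(date +%H:%M:%S)" "$(count_vm_flows || true)"
    sleep 1
    i=$((i + 1))
done
```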

Am I doing something wrong? Can anyone help with this?
If you need other information, don't hesitate to ask.
Thanks

Question information

Language: English
Status: Solved
For: neutron
Assignee: No assignee
Solved by: Marco Colombo
Revision history for this message
Darragh O'Reilly (darragh-oreilly) said :
#1

Just to confirm: the first ping request packet is reaching br-int quickly, i.e. it is not getting slowed down outside OpenStack/Quantum and taking 6.5 seconds to reach the quantum router gateway?

Marco Colombo (colo90) said :
#2

Hi Darragh, thanks for the reply.
No, the quantum router gateway is reached quickly; see this log. The tcpdump was taken on the private interface of the quantum router.

09:35:59.143451 IP 212.29.130.12 > 192.168.178.2: ICMP echo request, id 15800, seq 1, length 64
09:35:59.836781 IP 212.29.130.12 > 192.168.178.2: ICMP echo request, id 15800, seq 2, length 64
09:36:00.844828 IP 212.29.130.12 > 192.168.178.2: ICMP echo request, id 15800, seq 3, length 64
09:36:01.852905 IP 212.29.130.12 > 192.168.178.2: ICMP echo request, id 15800, seq 4, length 64
09:36:02.355638 IP 192.168.178.2 > 212.29.130.12: ICMP echo reply, id 15800, seq 1, length 64
09:36:02.355710 IP 192.168.178.2 > 212.29.130.12: ICMP echo reply, id 15800, seq 2, length 64
09:36:02.355731 IP 192.168.178.2 > 212.29.130.12: ICMP echo reply, id 15800, seq 3, length 64
09:36:02.355760 IP 192.168.178.2 > 212.29.130.12: ICMP echo reply, id 15800, seq 4, length 64
09:36:02.854375 IP 212.29.130.12 > 192.168.178.2: ICMP echo request, id 15800, seq 5, length 64
09:36:02.855173 IP 192.168.178.2 > 212.29.130.12: ICMP echo reply, id 15800, seq 5, length 64
09:36:03.856016 IP 212.29.130.12 > 192.168.178.2: ICMP echo request, id 15800, seq 6, length 64
09:36:03.856978 IP 192.168.178.2 > 212.29.130.12: ICMP echo reply, id 15800, seq 6, length 64

and this is the ping output :

64 bytes from 185.21.172.18: icmp_req=1 ttl=58 time=3527 ms
64 bytes from 185.21.172.18: icmp_req=2 ttl=58 time=2520 ms
64 bytes from 185.21.172.18: icmp_req=3 ttl=58 time=1512 ms
64 bytes from 185.21.172.18: icmp_req=4 ttl=58 time=504 ms
64 bytes from 185.21.172.18: icmp_req=5 ttl=58 time=2.56 ms
64 bytes from 185.21.172.18: icmp_req=6 ttl=58 time=2.69 ms
64 bytes from 185.21.172.18: icmp_req=7 ttl=58 time=3.27 ms
64 bytes from 185.21.172.18: icmp_req=8 ttl=58 time=2.47 ms
64 bytes from 185.21.172.18: icmp_req=9 ttl=58 time=2.51 ms

The reply-packet flows (on the quantum server) are created 1-2 seconds (or more) after I start to ping the VM, and so we see the ping replies delayed.
I think that this is my problem, but I don't know how to solve it.

Thanks

Darragh O'Reilly (darragh-oreilly) said :
#3

Hi Marco,

OK - just wanted to check that the delay was within OVS.

I think the creation of new flows requires a context switch to vswitchd in userspace. Does the compute node have lots of free memory, or is it doing a lot of swapping? I see the port numbers are quite high (65 and 80) - how many instances is it currently running? Maybe check the vswitchd logs too.
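Those checks can be scripted on the compute node roughly like this (the memory and swap counters are read from /proc so it works on any Linux box; the log path is the usual Ubuntu default and may differ on other installs):

```shell
#!/bin/sh
# Free memory in MB, read from /proc/meminfo.
free_mb=$(awk '/^MemFree:/ {print int($2 / 1024)}' /proc/meminfo)
echo "free memory: ${free_mb} MB"

# Pages swapped in/out since boot; take two readings a minute apart -
# if these counters are climbing, the node is actively swapping.
grep -E '^pswp(in|out)' /proc/vmstat

# Recent vswitchd messages (Ubuntu default path).
tail -n 20 /var/log/openvswitch/ovs-vswitchd.log 2>/dev/null \
    || echo "no vswitchd log at the default path"
```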

Darragh.

Marco Colombo (colo90) said :
#4

Hi Darragh,
at the moment there are 16 instances.
The load average is 4 (is that too high?) and the free memory is about 1.5 GB.
This is the ovs-vswitchd.log:

May 06 09:51:13|82858|netdev_linux|WARN|ioctl(SIOCGIFINDEX) on tap569e45aa-1c device failed: No such device
May 06 09:51:20|82859|netdev|WARN|Dropped 423 log messages in last 12 seconds (most recently, 1 seconds ago) due to excessive rate
May 06 09:51:20|82860|netdev|WARN|failed to get flags for network device tap4540e0ad-99: No such device
May 06 09:51:22|82861|netdev_linux|WARN|Dropped 119 log messages in last 9 seconds (most recently, 5 seconds ago) due to excessive rate
May 06 09:51:22|82862|netdev_linux|WARN|ioctl(SIOCGIFINDEX) on tap569e45aa-1c device failed: No such device
May 06 09:51:32|82863|netdev|WARN|Dropped 633 log messages in last 12 seconds (most recently, 1 seconds ago) due to excessive rate
May 06 09:51:32|82864|netdev|WARN|failed to get flags for network device tap6364e554-7f: No such device
May 06 09:51:36|82865|netdev_linux|WARN|Dropped 239 log messages in last 15 seconds (most recently, 6 seconds ago) due to excessive rate
May 06 09:51:36|82866|netdev_linux|WARN|ioctl(SIOCGIFINDEX) on tap569e45aa-1c device failed: No such device
May 06 09:51:44|82867|netdev|WARN|Dropped 416 log messages in last 12 seconds (most recently, 1 seconds ago) due to excessive rate
May 06 09:51:44|82868|netdev|WARN|failed to get flags for network device tapa625aa89-fd: No such device
May 06 09:51:46|82869|netdev_linux|WARN|Dropped 119 log messages in last 9 seconds (most recently, 6 seconds ago) due to excessive rate
May 06 09:51:46|82870|netdev_linux|WARN|ioctl(SIOCGIFINDEX) on tap569e45aa-1c device failed: No such device

and this is the output of the command ovs-ofctl show br-int

OFPT_FEATURES_REPLY (xid=0x1): ver:0x1, dpid:0000ba56c69ab54d
n_tables:255, n_buffers:256
features: capabilities:0xc7, actions:0xfff
 1(patch-tun): addr:ba:9d:bd:68:35:92
     config: 0
     state: 0
 2(qr-21d03c13-85): addr:3c:11:ff:7f:00:00
     config: PORT_DOWN
     state: LINK_DOWN
 3(qr-2e9d596d-3b): addr:3c:11:ff:7f:00:00
     config: PORT_DOWN
     state: LINK_DOWN
 4(qr-c7c253a7-02): addr:3c:11:ff:7f:00:00
     config: PORT_DOWN
     state: LINK_DOWN
 5(qr-7e531375-39): addr:00:87:00:00:fa:16
     config: PORT_DOWN
     state: LINK_DOWN
 6(qr-83b0e978-a7): addr:3c:11:ff:7f:00:00
     config: PORT_DOWN
     state: LINK_DOWN
 7(qr-32b06b50-91): addr:fa:16:3e:2f:6a:91
     config: PORT_DOWN
     state: LINK_DOWN
 8(qr-da54554e-20): addr:fa:16:3e:6a:cd:80
     config: PORT_DOWN
     state: LINK_DOWN
 125(tap569e45aa-1c): addr:3c:11:ff:7f:00:00
     config: PORT_DOWN
     state: LINK_DOWN

All my ports are down. I already tried ovs-ofctl mod-port br-int <port> up, but it does not work.
Could this be the problem?

Thanks

Darragh O'Reilly (darragh-oreilly) said :
#5

It looks like vswitchd has a problem with tap569e45aa-1c - does that device exist in Linux? I guess this output is from the network node, as there are only qr-* devices and tap569e45aa-1c, which I guess is for DHCP?

Marco Colombo (colo90) said :
#6

Hi Darragh,
in Linux it does not exist, but it is there in OVS.

Thanks

Darragh O'Reilly (darragh-oreilly) said :
#7

Are you using IP namespaces? If so, you will need to find the namespace names with:

# ip netns

and then list their interfaces with:

# ip netns exec {name-space} ip link
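Combined, the two commands above can be turned into a small search for a given device across all namespaces (the device name below is the one from this thread; the ip netns exec step needs root):

```shell
#!/bin/sh
# Look for the device in the root namespace, then in every network
# namespace known to iproute2.
dev=tap569e45aa-1c

found=no
if ip link show "$dev" >/dev/null 2>&1; then
    echo "$dev exists in the root namespace"
    found=yes
fi

# `ip netns` may print "name (id: N)"; keep only the name column.
for ns in $(ip netns 2>/dev/null | awk '{print $1}'); do
    if ip netns exec "$ns" ip link show "$dev" >/dev/null 2>&1; then
        echo "$dev is in namespace $ns"
        found=yes
    fi
done

if [ "$found" = no ]; then
    echo "$dev not found"
fi
```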

Marco Colombo (colo90) said :
#8

Yes, I'm using namespaces.

ip netns exec qdhcp-7d4651b5-3030-4a54-a1e6-84145e77c4c4 ip link
49: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
1137: tap569e45aa-1c: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:d7:9e:1a brd ff:ff:ff:ff:ff:ff

Thanks

Darragh O'Reilly (darragh-oreilly) said :
#9

Have you pinpointed the delay to be on the compute node or the network node? Are you working from any particular guide? I assume this is the ovs plugin?

Marco Colombo (colo90) said :
#10

Hi Darragh,
thanks for the support. I replaced my quantum server with one with a better CPU, and now the problem has been solved.
The load average on the new server is lower than on the old one.