restart of quantum ovs plugin and network node reboot

Asked by Christoph Arnold

Three-node setup: Quantum + OVS + GRE, with a simple network (one network and one subnet, no router, no floating IPs).

Initial network creation works fine: instances get an IP, can ping each other, and so on. After restarting the quantum-plugin-openvswitch-agent on the network node, I can't connect to existing VMs anymore. Trying to create new instances results in DHCP not providing an IP.

Running tcpdump against all the qbr-, qvo- and qvb-<something> devices showed that ARP requests do get answered by the compute node, but the replies never reach the network node. Packets seem to be dropped by the "actions=drop" flow and no longer match the other flows.

# ovs-ofctl dump-flows br-tun
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=5313.915s, table=0, n_packets=886, n_bytes=41676, priority=3,tun_id=0x1,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=mod_vlan_vid:1,output:1
 cookie=0x0, duration=5313.974s, table=0, n_packets=1644, n_bytes=205097, priority=4,in_port=1,dl_vlan=1 actions=set_tunnel:0x1,NORMAL
 cookie=0x0, duration=5313.856s, table=0, n_packets=182, n_bytes=28539, priority=3,tun_id=0x1,dl_dst=fa:16:3e:12:d1:f3 actions=mod_vlan_vid:1,NORMAL
 cookie=0x0, duration=6213.217s, table=0, n_packets=1427, n_bytes=149242, priority=1 actions=drop
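
To make the counters in a dump like this easier to compare, here is a minimal Python sketch that parses the text output of ovs-ofctl dump-flows and picks out the drop flows. It works on a pasted copy of the dump above, not on a live switch. The function names parse_flows and drop_flows are hypothetical helpers made up for this illustration, and the parser only handles the line layout shown in this post.

```python
import re

# Copy of the br-tun dump shown above.
DUMP = """\
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=5313.915s, table=0, n_packets=886, n_bytes=41676, priority=3,tun_id=0x1,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=mod_vlan_vid:1,output:1
 cookie=0x0, duration=5313.974s, table=0, n_packets=1644, n_bytes=205097, priority=4,in_port=1,dl_vlan=1 actions=set_tunnel:0x1,NORMAL
 cookie=0x0, duration=5313.856s, table=0, n_packets=182, n_bytes=28539, priority=3,tun_id=0x1,dl_dst=fa:16:3e:12:d1:f3 actions=mod_vlan_vid:1,NORMAL
 cookie=0x0, duration=6213.217s, table=0, n_packets=1427, n_bytes=149242, priority=1 actions=drop
"""


def parse_flows(dump):
    """Turn dump-flows text into a list of dicts (hypothetical helper)."""
    flows = []
    for line in dump.splitlines():
        line = line.strip()
        # Skip the reply header and anything that is not a flow entry.
        if " actions=" not in line or "n_packets=" not in line:
            continue
        head, actions = line.rsplit(" actions=", 1)
        packets = int(re.search(r"n_packets=(\d+)", head).group(1))
        prio = re.search(r"priority=(\d+),?(.*)$", head)
        if prio:
            priority = int(prio.group(1))
            match = prio.group(2).strip()
        else:
            # dump-flows omits the priority when it is the default (32768).
            # Match fields are not extracted in that case.
            priority = 32768
            match = ""
        flows.append({
            "priority": priority,
            "n_packets": packets,
            "match": match,       # empty string means "match everything"
            "actions": actions,
        })
    return flows


def drop_flows(flows):
    """Return only the flows whose action list is exactly 'drop'."""
    return [f for f in flows if f["actions"] == "drop"]


if __name__ == "__main__":
    # Highest priority first, which is the order the switch evaluates them.
    for flow in sorted(parse_flows(DUMP), key=lambda f: -f["priority"]):
        print(flow["priority"], flow["n_packets"],
              flow["match"] or "(any)", "->", flow["actions"])
```

Running the parser on two dumps taken a few seconds apart and comparing n_packets for the drop flow would show whether traffic is still falling through to it.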

A workaround was to add another default flow via ovs-ofctl to br-tun.

# ovs-ofctl add-flow br-tun action=Normal

This only helps when packets get lost on their way back through br-tun; there seem to be other issues after a restart of the network node or the Quantum OVS plugin. Sometimes the only thing that helps is recreating the whole network configuration from scratch.

Any idea what goes wrong here and how to get to the core of this issue?

Question information

Language:
English
Status:
Solved
For:
neutron
Assignee:
No assignee
Solved by:
Christoph Arnold
Christoph Arnold (1rnold) said :
#1

After setting "use_namespaces = True" in dhcp_agent.ini and l3_agent.ini, restarting the network node is no longer an issue, and running instances can be reached.
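
For anyone who wants to check this setting on both agents quickly, here is a small Python sketch using the standard configparser module. It assumes the option lives in the [DEFAULT] section of each file, which is an assumption of this example and not something stated above. The function name namespaces_enabled and the sample text are hypothetical and only there for illustration.

```python
import configparser

# Illustrative file content; the [DEFAULT] section is an assumption.
SAMPLE_INI = """\
[DEFAULT]
use_namespaces = True
"""


def namespaces_enabled(ini_text):
    """Return True/False for use_namespaces, or None if it is not set.

    Hypothetical helper: it only reports what the file says and makes
    no claim about the agent's built-in default when the option is absent.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_option("DEFAULT", "use_namespaces"):
        return None
    return parser.getboolean("DEFAULT", "use_namespaces")


if __name__ == "__main__":
    print("use_namespaces:", namespaces_enabled(SAMPLE_INI))
```

To use it on a real node, read the contents of dhcp_agent.ini and l3_agent.ini from wherever your installation keeps them and pass each string to the function.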

I am not sure whether restarting parts of the network without namespace support should work; maybe somebody can provide the missing pieces.

Problem solved anyway.