Why would an OVS bridge not forward ARP?

Asked by Sunil Srivastava

    Bridge "br-eth0"
        Port "br-eth0"
            Interface "br-eth0"
                type: internal
        Port "eth0"
            Interface "eth0"
        Port "phy-br-eth0"
            Interface "phy-br-eth0"
    Bridge br-int
        Port "tap55d1e5e8-ab"
            tag: 1
            Interface "tap55d1e5e8-ab"
                type: internal
        Port "qr-4b50a17d-3c"
            tag: 1
            Interface "qr-4b50a17d-3c"
                type: internal
        Port "int-br-eth0"
            Interface "int-br-eth0"
        Port "tape8d6e0a5-52"
            tag: 1
            Interface "tape8d6e0a5-52"
        Port "tap6176588e-48"
            tag: 1
            Interface "tap6176588e-48"
        Port br-int
            Interface br-int
                type: internal

I can see ARP packets sent from int-br-eth0 to phy-br-eth0 but not to upstream eth0.

So we cannot ping from one VM (or DHCP NetNS) on one machine to another VM on another machine.

I see the ping triggering ARPs. The Tx counter of int-br-eth0 and the Rx counter of phy-br-eth0 were also correlated with the ping.

Question information

Language: English
Status: Solved
For: neutron
Assignee: No assignee
Solved by: Eoghan

Aaron Rosen (arosen) said :
#1

Can you also provide the output of:

ovs-ofctl dump-flows br-int
ovs-ofctl dump-flows br-eth0

Sunil Srivastava (sunil-srivastava) said :
#2

$ sudo ovs-ofctl dump-flows br-int
NXST_FLOW reply (xid=0x4):
cookie=0x0, duration=27644.955s, table=0, n_packets=285993, n_bytes=36330627, priority=2,in_port=20 actions=drop
cookie=0x0, duration=27645.393s, table=0, n_packets=38985, n_bytes=7265979, priority=1 actions=NORMAL
$ sudo ovs-ofctl dump-flows br-eth0
NXST_FLOW reply (xid=0x4):
cookie=0x0, duration=27656.544s, table=0, n_packets=7708, n_bytes=326108, priority=2,in_port=5 actions=drop
cookie=0x0, duration=27656.901s, table=0, n_packets=299545, n_bytes=37154475, priority=1 actions=NORMAL

Sunil Srivastava (sunil-srivastava) said :
#3

This is a sample when ping was going on and failing.

stack@esg-dell-c4-s11:~$ sudo ovs-ofctl dump-flows br-eth0
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=30832.687s, table=0, n_packets=7717, n_bytes=326498, priority=2,in_port=5 actions=drop
 cookie=0x0, duration=30833.044s, table=0, n_packets=336783, n_bytes=41878623, priority=1 actions=NORMAL
stack@esg-dell-c4-s11:~$ sudo ovs-ofctl dump-flows br-eth0
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=30835.541s, table=0, n_packets=7717, n_bytes=326498, priority=2,in_port=5 actions=drop
 cookie=0x0, duration=30835.898s, table=0, n_packets=336806, n_bytes=41881064, priority=1 actions=NORMAL
stack@esg-dell-c4-s11:~$ sudo ovs-ofctl dump-flows br-eth0
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=30845.057s, table=0, n_packets=7720, n_bytes=326628, priority=2,in_port=5 actions=drop
 cookie=0x0, duration=30845.414s, table=0, n_packets=336907, n_bytes=41893018, priority=1 actions=NORMAL
stack@esg-dell-c4-s11:~$ sudo ovs-ofctl dump-flows br-eth0
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=30847.233s, table=0, n_packets=7722, n_bytes=326712, priority=2,in_port=5 actions=drop
 cookie=0x0, duration=30847.59s, table=0, n_packets=336925, n_bytes=41895113, priority=1 actions=NORMAL
stack@esg-dell-c4-s11:~$ sudo ovs-ofctl dump-flows br-eth0
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=30853.972s, table=0, n_packets=7729, n_bytes=327006, priority=2,in_port=5 actions=drop
 cookie=0x0, duration=30854.329s, table=0, n_packets=337000, n_bytes=41903665, priority=1 actions=NORMAL
stack@esg-dell-c4-s11:~$ sudo ovs-ofctl dump-flows br-eth0
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=30861.609s, table=0, n_packets=7736, n_bytes=327300, priority=2,in_port=5 actions=drop
 cookie=0x0, duration=30861.966s, table=0, n_packets=337083, n_bytes=41911871, priority=1 actions=NORMAL
stack@esg-dell-c4-s11:~$ sudo ovs-ofctl dump-flows br-eth0
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=30873.43s, table=0, n_packets=7748, n_bytes=327804, priority=2,in_port=5 actions=drop
 cookie=0x0, duration=30873.787s, table=0, n_packets=337217, n_bytes=41927882, priority=1 actions=NORMAL
stack@esg-dell-c4-s11:~$ sudo ovs-ofctl dump-flows br-eth0
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=30886.031s, table=0, n_packets=7761, n_bytes=328350, priority=2,in_port=5 actions=drop
 cookie=0x0, duration=30886.388s, table=0, n_packets=337345, n_bytes=41942673, priority=1 actions=NORMAL
stack@esg-dell-c4-s11:~$ sudo ovs-ofctl dump-flows br-eth0
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=30902.643s, table=0, n_packets=7780, n_bytes=329156, priority=2,in_port=5 actions=drop
 cookie=0x0, duration=30903s, table=0, n_packets=337542, n_bytes=41964462, priority=1 actions=NORMAL

Sunil Srivastava (sunil-srivastava) said :
#4

--- 10.0.0.6 ping statistics ---
70 packets transmitted, 0 received, +65 errors, 100% packet loss, time 69300ms

The dropped packets (7780 - 7717 = 63) come close to the 65 errors, but I am not sure a 100% correlation can be made.

But we ran tcpdump on ARPs for the source IP on both the Rx and Tx sides.

The Rx side showed ARP packets arriving and the Tx side showed no ARP packets leaving.
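
The counter comparison above can be reproduced from two saved dump-flows samples. The following is only a sketch: the helper name drop_packets is hypothetical, and the two input lines are the first and last drop-rule lines from post #3. The counter counts every packet that hits the rule, not only ARP, so an exact match with the ping error count should not be expected.

```shell
# Print n_packets of every drop rule found in `ovs-ofctl dump-flows` output (stdin).
# drop_packets is a hypothetical helper name, not part of Open vSwitch.
drop_packets() {
    awk '/actions=drop/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^n_packets=/) {
                v = $i
                sub(/^n_packets=/, "", v)   # strip the key
                sub(/,$/, "", v)            # strip the trailing comma
                print v
            }
    }'
}

# First and last drop-rule lines from the samples in post #3.
first=' cookie=0x0, duration=30832.687s, table=0, n_packets=7717, n_bytes=326498, priority=2,in_port=5 actions=drop'
last=' cookie=0x0, duration=30902.643s, table=0, n_packets=7780, n_bytes=329156, priority=2,in_port=5 actions=drop'

before=$(printf '%s\n' "$first" | drop_packets)
after=$(printf '%s\n' "$last" | drop_packets)
echo "dropped during sample window: $((after - before))"
# prints: dropped during sample window: 63
```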

Aaron Rosen (arosen) said :
#5

Hi Sunil,

Is the host (or switch attached to eth0) configured to receive a packet with a VLAN tag on it? If a packet is sent from tape8d6e0a5-52 or tap6176588e-48, the ARP request will enter int-br-eth0 (and a VLAN tag of 1 will be added to the packet). The request will then enter br-eth0 with this VLAN tag and exit eth0.

The other possibility is that, in the output of ovs-dpctl show, eth0 corresponds to port 5, in which case the packets won't be forwarded because of the drop rule in your flow table for br-eth0.

Sunil Srivastava (sunil-srivastava) said :
#6

Hi Aaron,

Please see this output. Port 5 is phy-br-eth0, not eth0.

stack@esg-dell-c4-s11:~$ sudo ovs-ofctl show br-eth0
OFPT_FEATURES_REPLY (xid=0x1): ver:0x1, dpid:000000219bc9d983
n_tables:255, n_buffers:256
features: capabilities:0xc7, actions:0xfff
 5(phy-br-eth0): addr:06:ee:0f:8c:92:b3
     config: 0
     state: 0
     current: 10GB-FD COPPER
 8(eth0): addr:00:21:9b:c9:d9:83
     config: 0
     state: 0
     current: 1GB-FD FIBER AUTO_NEG
     advertised: 1GB-FD AUTO_NEG
     supported: 10MB-HD 10MB-FD 100MB-HD 100MB-FD 1GB-FD COPPER FIBER AUTO_NEG
 LOCAL(br-eth0): addr:00:21:9b:c9:d9:83
     config: PORT_DOWN
     state: LINK_DOWN
OFPT_GET_CONFIG_REPLY (xid=0x3): frags=normal miss_send_len=0
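
The port-number check discussed here can be sketched as a small filter over ovs-ofctl show output. The function name port_name is hypothetical, the sample text is abbreviated from the output in this post, and the sketch assumes port lines keep the N(name): form shown there.

```shell
# Print the interface name for an OpenFlow port number, reading
# `ovs-ofctl show` output from stdin. port_name is a hypothetical helper.
port_name() {
    awk -v port="$1" '
        # Port lines look like " 5(phy-br-eth0): addr:06:ee:..."
        $1 ~ /^[0-9]+\(.*\):$/ {
            split($1, parts, /[():]/)
            if (parts[1] == port) print parts[2]
        }'
}

# Abbreviated sample taken from the output in this post.
show_output=' 5(phy-br-eth0): addr:06:ee:0f:8c:92:b3
     config: 0
 8(eth0): addr:00:21:9b:c9:d9:83
     config: 0
 LOCAL(br-eth0): addr:00:21:9b:c9:d9:83'

printf '%s\n' "$show_output" | port_name 5
# prints: phy-br-eth0
```

Since port 5 resolves to phy-br-eth0, the priority=2 drop rule on br-eth0 applies to traffic arriving from br-int rather than from eth0.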

Sunil Srivastava (sunil-srivastava) said :
#7

The switch is connected to eth0, is configured to accept VLAN 1, and is set to trunk mode.

Aaron Rosen (arosen) said :
#8

Looks like br-eth0 is down; I'm not sure whether that would stop it from forwarding packets. Can you try ifconfig br-eth0 up and see if that changes anything? Are you sure that a tcpdump on eth0 doesn't show any of these ARPs?

Also:

 cookie=0x0, duration=30902.643s, table=0, n_packets=7780, n_bytes=329156, priority=2,in_port=5 actions=drop

would block the returning ARP reply (though if it's not making it out eth0, that doesn't matter yet).

Are you using a particular plugin and it's not working as expected?

yong sheng gong (gongysh) said :
#9

Can you list the networks and show the network you are using:
quantum net-list
quantum net-show

Also make sure the ovs-quantum-agent is active.
It seems the flows in your OVS bridges are not set up correctly.

Sunil Srivastava (sunil-srivastava) said :
#10

Hi Aaron,

I am leaving on a trip and will not have access.

br-eth0 is up. I tried bringing it up anyway, but it did not help.

stack@esg-dell-c4-s11:~/gitstack/devstack$ ifconfig
br-eth0 Link encap:Ethernet HWaddr 00:21:9b:c9:d9:83
          inet6 addr: fe80::221:9bff:fec9:d983/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:660411 errors:0 dropped:1883 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:84199314 (84.1 MB) TX bytes:468 (468.0 B)

Sunil.

Sunil Srivastava (sunil-srivastava) said :
#11

Hi Yong,

That is not the root cause. Still, here is the output:

stack@esg-dell-c4-s11:~/gitstack/devstack$ quantum net-list
+--------------------------------------+---------+--------------------------------------+
| id                                   | name    | subnets                              |
+--------------------------------------+---------+--------------------------------------+
| 68f76ec1-407b-4e42-a089-d0e6553473f8 | ext_net | 09851d25-806f-492c-b708-bf03838d77b3 |
| fa8f9c5e-e41a-4f80-955c-94b3a45b9dcb | net1    | 31ed889f-f3f5-4faa-bb51-1d92344c91a3 |
+--------------------------------------+---------+--------------------------------------+

stack@esg-dell-c4-s11:~/gitstack/devstack$ quantum net-show net1
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| id                        | fa8f9c5e-e41a-4f80-955c-94b3a45b9dcb |
| name                      | net1                                 |
| provider:network_type     | local                                |
| provider:physical_network |                                      |
| provider:segmentation_id  |                                      |
| router:external           | False                                |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   | 31ed889f-f3f5-4faa-bb51-1d92344c91a3 |
| tenant_id                 | b0d8717a0f8b4cf8bdff8d84156622af     |
+---------------------------+--------------------------------------+

stack@esg-dell-c4-s11:~/gitstack/devstack$ quantum net-show ext_net
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| id                        | 68f76ec1-407b-4e42-a089-d0e6553473f8 |
| name                      | ext_net                              |
| provider:network_type     | local                                |
| provider:physical_network |                                      |
| provider:segmentation_id  |                                      |
| router:external           | True                                 |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   | 09851d25-806f-492c-b708-bf03838d77b3 |
| tenant_id                 | 44cb33fdc72b44ad8e200a1326199895     |
+---------------------------+--------------------------------------+

Aaron Rosen (arosen) said :
#12

Sunil, One last thing. If you leave the ping running and then provide the output of

ovs-dpctl dump-flows br-int
ovs-dpctl dump-flows br-tun

That will show the active flow entries in the kernel. Did you try running tcpdump on eth0 to see if you see ARP packets there? You never said how you know that they are not making it out eth0. You just said you were unable to ping. (The drop flow entry you provided blocks the returning replies, so ping definitely will not work.)

Aaron

Sunil Srivastava (sunil-srivastava) said :
#13

Hi Aaron,

Here are the tcpdump captures.

The following output shows there is no link issue between phy-br-eth0 and int-br-eth0.

(1)

root@esg-dell-c4-s11:~# ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_req=1 ttl=64 time=0.056 ms
64 bytes from 10.0.0.2: icmp_req=2 ttl=64 time=0.052 ms
64 bytes from 10.0.0.2: icmp_req=3 ttl=64 time=0.032 ms
64 bytes from 10.0.0.2: icmp_req=4 ttl=64 time=0.041 ms
64 bytes from 10.0.0.2: icmp_req=5 ttl=64 time=0.048 ms
--- 10.0.0.2 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4006ms

The ping above triggers the following on int-br-eth0:

stack@esg-dell-c4-s11:~/devstack$ sudo tcpdump -i int-br-eth0 arp and src 10.0.0.2
tcpdump: WARNING: int-br-eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on int-br-eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
00:41:55.755545 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:41:56.753793 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:41:57.753782 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:41:58.771011 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:41:59.769790 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:42:00.769796 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28

(2)

And again on phy-br-eth0.

root@esg-dell-c4-s11:~# ping 10.0.0.3
PING 10.0.0.3 (10.0.0.3) 56(84) bytes of data.
From 10.0.0.2 icmp_seq=1 Destination Host Unreachable
From 10.0.0.2 icmp_seq=2 Destination Host Unreachable
From 10.0.0.2 icmp_seq=3 Destination Host Unreachable
^C
--- 10.0.0.3 ping statistics ---
5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4024ms

stack@esg-dell-c4-s11:~/devstack$ sudo tcpdump -i phy-br-eth0 arp and src 10.0.0.2
tcpdump: WARNING: phy-br-eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on phy-br-eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
00:43:35.871097 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:43:36.869787 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:43:37.873777 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:43:38.887008 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:43:39.885784 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:43:40.885794 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:43:41.902994 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:43:42.901762 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
00:43:43.901758 ARP, Request who-has inenbasavbl1c.corp.emc.com tell usxxstephc2mbp1.corp.emc.com, length 28
^C

Thanks,
Sunil

Aaron Rosen (arosen) said :
#14

Hi Sunil,

Sorry, I'm not sure what you're trying to show me here. You can ping 10.0.0.2 but not 10.0.0.3, and I don't know where those interfaces reside in your setup. Can you show me the output of ifconfig -a on this machine, and of ovs-dpctl dump-flows while you are pinging? Also, why are you showing me a tcpdump on phy-br-eth0? You should be doing that on eth0, since you say the packets are getting there.

Thanks,

Aaron

P.S: I'll also be in #openstack-dev for a little while longer tonight.

Sunil Srivastava (sunil-srivastava) said :
#15

Someone else has been on the setup and things have changed a bit.

He said he entered a flow entry - but I did not have time to follow up. I am in R/O mode. :-)

br-eth0 was brought up and that may have changed the behavior.

Now ARP is reaching the other machine, and I can see the traffic on eth0, phy-br-eth0 and int-br-eth0.

But there is no ARP reply. The ARPs are not getting to any of the TAP interfaces (but I only checked once).

br-int seems to be dropping them now.

BTW, one TAP interface is gone because its VM was brought down, but ovs-vsctl still lists it.

stack@esg-dell-c4-s10:~/devstack$ ovs-dpctl show br-eth0 flows
system@br-eth0:
        lookups: hit:658100 missed:119465 lost:0
        flows: 27
        port 0: br-eth0 (internal)
        port 6: eth0
        port 9: phy-br-eth0
ovs-dpctl: opening datapath flows failed (No such device)

stack@esg-dell-c4-s10:~/devstack$ sudo ovs-ofctl dump-flows br-eth0
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=54070.378s, table=0, n_packets=80, n_bytes=6552, priority=2,in_port=9 actions=drop
 cookie=0x0, duration=54070.663s, table=0, n_packets=632974, n_bytes=79155773, priority=1 actions=NORMAL

stack@esg-dell-c4-s10:~/devstack$ ovs-dpctl show br-eth0 flows
system@br-eth0:
        lookups: hit:660125 missed:119838 lost:0
        flows: 30
        port 0: br-eth0 (internal)
        port 6: eth0
        port 9: phy-br-eth0
ovs-dpctl: opening datapath flows failed (No such device)

stack@esg-dell-c4-s10:~/devstack$ sudo ovs-ofctl dump-flows br-int
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=54307.045s, table=0, n_packets=608686, n_bytes=77906470, priority=2,in_port=18 actions=drop
 cookie=0x0, duration=54307.359s, table=0, n_packets=14395, n_bytes=2748282, priority=1 actions=NORMAL

stack@esg-dell-c4-s10:~/devstack$ ovs-dpctl show br-int flows
system@br-int:
        lookups: hit:631616 missed:143682 lost:0
        flows: 26
        port 0: br-int (internal)
Sep 21 03:00:32|00001|netdev_linux|WARN|/sys/class/net/tap26583155-34/carrier: open failed: No such file or directory
        port 1: tap26583155-34 (internal)
        port 14: tapd1802d22-b4
        port 15: tapfa0e7fcf-8d
        port 16: tap5eb27feb-05
        port 18: int-br-eth0
ovs-dpctl: opening datapath flows failed (No such device)

Sunil Srivastava (sunil-srivastava) said :
#16

Sorry, I need to go. Someone from EMC will follow up.

yong sheng gong (gongysh) said :
#18

As seen in post #11, you are using the local network type, where traffic does not leave the machine. You can try multiple VMs on the same machine; they should be able to ping each other.

Eoghan (eoghank) said :
#19

Following on from Yong's suggestion about flow control rules I checked the rules on br-int and br-eth0 on both nodes. The example below is from br-int on the controller and port 20 was set to drop by default. This was the same for br-int on the other node, and for br-eth0 on both nodes.

sudo ovs-ofctl show br-int
OFPT_FEATURES_REPLY (xid=0x1): ver:0x1, dpid:00005aa5a97a1541
n_tables:255, n_buffers:256
features: capabilities:0xc7, actions:0xfff
 2(tap55d1e5e8-ab): addr:0b:02:00:00:00:00
     config: PORT_DOWN
     state: LINK_DOWN
 18(tape8d6e0a5-52): addr:b6:a6:52:18:de:00
     config: 0
     state: 0
     current: 10MB-FD COPPER
 19(tap6176588e-48): addr:7e:b5:44:0c:fa:0d
     config: 0
     state: 0
     current: 10MB-FD COPPER
 20(int-br-eth0): addr:6a:36:6c:2e:3a:76
     config: 0
     state: 0
     current: 10GB-FD COPPER

$ sudo ovs-ofctl dump-flows br-int
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=75826.521s, table=0, n_packets=810503, n_bytes=104709261, priority=2,in_port=20 actions=drop
 cookie=0x0, duration=75826.959s, table=0, n_packets=50413, n_bytes=9135002, priority=1 actions=NORMAL

Once I opened these up for both br-int and br-eth0 on both sides I could ping instances from either side so this is now working.

Is there any reason why these ports would be set to drop by default?

Thanks
Eoghan
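
The thread does not record the exact commands used to open the ports. One plausible way, shown here only as a dry-run sketch, is to delete the priority=2 drop rule with ovs-ofctl --strict del-flows. The helper open_port_cmd is hypothetical and only prints the command instead of running it. Port numbers differ per bridge and per node, so they would need to be looked up with ovs-ofctl show first.

```shell
# Print (do not run) a command that would delete the priority=2 drop rule
# for one OpenFlow port. open_port_cmd is a hypothetical helper.
# Usage: open_port_cmd BRIDGE OFPORT
open_port_cmd() {
    printf 'ovs-ofctl --strict del-flows %s priority=2,in_port=%s\n' "$1" "$2"
}

# Port numbers as seen on the controller node in this thread.
open_port_cmd br-int 20
open_port_cmd br-eth0 5
# prints:
# ovs-ofctl --strict del-flows br-int priority=2,in_port=20
# ovs-ofctl --strict del-flows br-eth0 priority=2,in_port=5
```

The agent may reinstall the rule later, so this is better treated as a diagnostic step than as a permanent fix.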

yong sheng gong (gongysh) said :
#20

Both of your networks are of the local type:
stack@esg-dell-c4-s11:~/gitstack/devstack$ quantum net-show net1
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| id                        | fa8f9c5e-e41a-4f80-955c-94b3a45b9dcb |
| name                      | net1                                 |
| provider:network_type     | local                                |

The flow drops by default. If there are VMs on a network with the vlan network_type, the port will be opened.
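
A quick way to confirm the network type is to filter the net-show table. This is a minimal sketch: the function name network_type is hypothetical, and the sample rows are copied from the output quoted above.

```shell
# Print the provider:network_type value from `quantum net-show` table
# output on stdin. network_type is a hypothetical helper name.
network_type() {
    awk -F'|' '$2 ~ /provider:network_type/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $3)   # trim padding around the value
        print $3
    }'
}

# Sample rows taken from the net-show output in this thread.
sample='| name | net1 |
| provider:network_type | local |
| provider:physical_network | |'

printf '%s\n' "$sample" | network_type
# prints: local
```

In a live setup the same filter could be fed from quantum net-show net1; a result of local would be consistent with the drop rules seen on the inter-bridge ports.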

Eoghan (eoghank) said :
#21

I had these in the localrc before I ran stack.sh

ENABLE_TENANT_VLAN=True
TENANT_VLAN_RANGE=1:1000
PHYSICAL_NETWORK=eth0

And nova.conf had vlan_interface=eth0

ovs_quantum_plugin.ini had these flags set:

bridge_mappings = eth0:br-eth0
tenant_network_type = vlan
network_vlan_ranges = eth0:1:1000

Should this be sufficient for the networks to run as VLAN type?

yong sheng gong (gongysh) said :
#22

Yes. But to connect the networks across multiple nodes, you need a corresponding physical network that carries the given VLAN ID.
For example, if your virtual network has provider:segmentation_id = 1, you will have to configure your hardware switch to allow VLAN 1 to pass.

yong sheng gong (gongysh) said :
#23

And if your original question is answered, we should close this question. New questions should be opened separately; mixing different questions in one thread makes it harder for others to search.

Eoghan (eoghank) said :
#24

Original question is answered.

Sunil Srivastava (sunil-srivastava) said :
#25

Thanks Eoghan, that solved my question.