NIC bonding performance

Asked by Claude Durocher

I'm testing NIC bonding with Fuel 6: I have 3 x 1 Gbps NICs bonded with SLB for the storage network. When I copy a large file (1 GB) between two servers (both configured with bonding), I get similar performance whether I use a bonded or a non-bonded interface:

not bonded:
root@node-45:~# scp w2k.qcow2 192.168.0.13:
Warning: Permanently added '192.168.0.13' (ECDSA) to the list of known hosts.
w2k.qcow2 100% 1024MB 60.2MB/s 00:17

bonded:
root@node-45:~# scp w2k.qcow2 192.168.1.10:
Warning: Permanently added '192.168.1.10' (ECDSA) to the list of known hosts.
w2k.qcow2 100% 1024MB 64.0MB/s 00:16

Am I supposed to get better results with bonding?

Question information

Language: English
Status: Solved
For: Fuel for OpenStack
Assignee: No assignee
Solved by: Fabrizio Soppelsa
Miroslav Anashkin (manashkin) said:
#1

Greetings, Claude,

You have probably reached the network speed limit imposed by your current settings.

Explanation:

1. Bond balance mode is set to SLB.

If you use balance-slb, then all streams from a particular host (that is, all streams, if you are not running VMs) will use the same interface; with VMs, the streams from a particular VM will use the same interface. If you use balance-tcp, then streams from a particular host/VM will be spread (statistically) across the available interfaces.
So, with balance-slb every single network client (determined by its MAC address) can use only a single NIC of the bond at any given time.
In order to use all 3 NICs of the bond for a single client session you have to enable balance-tcp on the bond and on the network switches. Additionally, you may have to enable LACP, and the LACP hash policy layer3+4 (the analogue of the OVS balance-tcp mode) may not work correctly with OVS, depending on the NIC drivers.
This issue with LACP layer3+4 service traffic on OVS bonds and certain NIC drivers was one of the reasons we deprecated OVS bonds in Mirantis OpenStack 6.0 and set the default bond provider to Linux in the upcoming 6.1.
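For reference, switching an existing OVS bond to balance-tcp with LACP could look roughly like the sketch below ("bond0" is a placeholder name, the bond is assumed to be an OVS port, and the switch side must be configured for LACP as well):

root@node-45:~# ovs-vsctl set port bond0 bond_mode=balance-tcp
root@node-45:~# ovs-vsctl set port bond0 lacp=active
root@node-45:~# ovs-appctl bond/show bond0   # verify bond mode, LACP status and slave interfaces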

2. I assume you use the standard network frame size (MTU=1500).
To utilize the full network speed, it is recommended to set MTU=9000 on 1 Gbit and faster NICs.
It must be set along the whole network path, including the hardware switches.
In our internal tests with 10 Gbit NICs we could not get throughput greater than ~60% of the full NIC speed - about 6 Gbit/s on 10 Gbit NICs - with the default MTU=1500.
Please note: changing the MTU on OVS-bonded NICs requires a down/up cycle on each NIC, otherwise the OVS bond may freeze.
It is also recommended to increase the MTU inside the VMs accordingly.
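A minimal sketch of such an MTU change (interface names eth1-eth3 are placeholders for the bond slaves; the switch ports and VM interfaces need a matching MTU):

root@node-45:~# ip link set eth1 down && ip link set eth1 mtu 9000 && ip link set eth1 up
root@node-45:~# ip link set eth2 down && ip link set eth2 mtu 9000 && ip link set eth2 up
root@node-45:~# ip link set eth3 down && ip link set eth3 mtu 9000 && ip link set eth3 up
root@node-45:~# ip link show eth1 | grep mtu   # verify the new MTU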

3. You are using SCP - copy over an encrypted SSH tunnel - to test speed.
It is CPU-bound. Additionally, the encrypted data is encapsulated into TCP frames with extra overhead for tunnel headers and other service information, so the actual amount of payload going through SSH may be significantly less than the available network bandwidth for unencrypted traffic. It also depends on the cipher algorithm.
Additionally, SSH may apply its own compression (depending on the Compression setting), which further distorts the throughput numbers.
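If you keep using scp for quick comparisons, at least pin the cipher and disable compression so that runs are comparable (aes128-ctr is only an example of a cheap cipher; pick any one your OpenSSH build supports):

root@node-45:~# scp -c aes128-ctr -o Compression=no w2k.qcow2 192.168.1.10: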

Please use iperf3 (or the older iperf) to test the real network speed - it tells you much more about throughput and packet loss, gives good statistics, and can run multiple parallel streams.
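A typical run with the addresses from your test could look like this (-P 4 opens four parallel TCP streams, which is what gives a balance-tcp bond a chance to spread load across more than one NIC):

on the receiver (192.168.1.10):  iperf3 -s
on the sender:                   iperf3 -c 192.168.1.10 -P 4 -t 30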

Conclusion:
With the current network settings, the achieved 64 MB/s = 512 Mbit/s is about ~50% of the maximum possible speed of a single/SLB-balanced 1 Gbit NIC with MTU=9000, or about 80% of the maximum possible speed of a single/SLB-balanced 1 Gbit NIC with the default MTU=1500.
And all of that is over SSH.
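(The percentages work out roughly as follows: 64 MB/s x 8 = 512 Mbit/s. Against the 1000 Mbit/s line rate, which you can only approach with MTU=9000, that is 512 / 1000 ≈ 0.51, i.e. ~50%. Against the ~60% practical ceiling with MTU=1500 from point 2, i.e. roughly 600 Mbit/s, it is 512 / 600 ≈ 0.85, i.e. roughly the 80% figure above.)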

These are very, very good results! The usual performance hit for, say, GRE tunnels is much higher - GRE typically runs at 1/3 to 1/4 of the possible network speed. You probably have a fast CPU, and if SSH compression is enabled, it can even make the apparent transfer rate exceed the raw network limit.

Kind regards,
Miroslav Anashkin
L2 Support engineer
Fuel Core Team | Mirantis Inc.

Claude Durocher (claude-d) said:
#2

>This issue with LACP layer3+4 service traffic on OVS bonds and certain NIC drivers was one of the reasons we deprecated OVS bonds in Mirantis OpenStack 6.0 and set the default bond provider to Linux in the upcoming 6.1.

If we use SLB, will this be supported in 6.1?

Miroslav Anashkin (manashkin) said:
#3

Yes, all the bonding/hash policy modes are supported in 6.1.
Only the default bonding provider and some default bonding settings have changed.

Claude Durocher (claude-d) said:
#4

Now that version 6.1 is about to go public, I am wondering what type of bond to use (I want load balancing and fault tolerance, and I want the most reliable option). Any preference among:

balance-rr
balance-xor
balance-tlb
balance-alb
balance-slb
balance-tcp

?

Fabrizio Soppelsa (fsoppelsa) said:
#5

Hello Claude, in 6.1 we switched from OVS to Linux native bonds:
https://blueprints.launchpad.net/fuel/+spec/refactor-l23-linux-bridges

Claude Durocher (claude-d) said:
#6

Will Linux native bonds be available in the Fuel web UI? Where can I find the documentation (the blueprints are not very useful for an end user)?

Best Fabrizio Soppelsa (fsoppelsa) said:
#7

Claude,
the UI remains intact, no changes for the end user.
Internally, Fuel will bond with Linux bonds instead of OVS ones.

Fabrizio.

Claude Durocher (claude-d) said:
#8

Thanks Fabrizio Soppelsa, that solved my question.