Infiniband support for storage network

Asked by Dmitry Komarov

Hello everyone,

I'm new to Fuel and OpenStack, so please forgive me if my questions sound stupid.

I'm trying to use Infiniband adapters for the Ceph storage network in my OpenStack cluster. I already have experience with Ceph and CloudStack, where I use Infiniband for RBD access.

When I first installed OpenStack manually, following the official guides for Ubuntu 12.04, I succeeded in connecting to the Ceph cluster over Infiniband without any special issues or workarounds. It just worked "out of the box" with Cinder and Glance.

But now I'm trying to create my cluster with Fuel, for simplified deployment, preconfigured HA and better integration. Unfortunately, Fuel does not offer the Infiniband interfaces as network adapters for the nodes.

Is there any workaround for this? I suppose I have to use Fuel's CLI instead of the GUI for deployment and patch some Puppet templates to preload the Infiniband modules on the nodes, but I don't have a clue how this should be integrated with OVS and the overall networking configuration. I configure my cluster with Neutron and GRE, and I know that this should work flawlessly.

Can anyone please provide some help with this setup? I think this should be quite a common question, as Infiniband is very popular for clouds nowadays.

P.S. My IB cards are not ConnectX-3, so the plugins from Mellanox are useless for me.

Best Regards,
Dmitry

Question information

Language: English
Status: Solved
For: Fuel for OpenStack
Assignee: No assignee
Solved by: Andrey Korolyov
Andrew Woodward (xarses) said :
#1

You will need to add the adapter's firmware to the bootstrap image that is used to discover the node. You can do this by building your own ISO from https://github.com/stackforge/fuel-main and editing https://github.com/stackforge/fuel-main/blob/master/bootstrap/module.mk to install the firmware. You will also want to add this package to requirements-rpm.txt and requirements-deb.txt so that it is included in the mirrors that the ISO produces. Finally, you will need to tweak the Cobbler preseed file (or Anaconda kickstart) so that the driver is included at OS install time. These are found under build/repos/fuellib/deployment/puppet/cobbler/templates/preseed or build/repos/fuellib/deployment/puppet/cobbler/templates/kickstart after you do a 'make build_repos' from the root of the git folder.

If you run into any problems making the ISO, please feel free to reach out to us on IRC.
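
A rough sketch of the kind of edits meant here (the package name and the /etc/modules approach are illustrative only, not checked against the current fuel-main tree):

    # requirements-deb.txt / requirements-rpm.txt: add the driver/firmware package, e.g.
    libmlx4-1

    # Cobbler preseed/kickstart template, in the post-install section:
    # make sure the IPoIB stack is loaded when the target node first boots
    echo "mlx4_ib"  >> /etc/modules
    echo "ib_umad"  >> /etc/modules
    echo "ib_ipoib" >> /etc/modules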

Dmitry Komarov (dmit-q) said :
#2

Thank you for the prompt reply!

So you mean that I have to rebuild the Fuel install ISO and reinstall my master node with the new one?

Actually, both the bootstrap and install PXE images of the current Fuel ISO already contain all the required driver modules:

mlx4_ib
ib_sa
ib_cm
ib_umad
ib_addr
ib_uverbs
ib_ipoib

So can I patch the PXE images to modprobe them at boot, without rebuilding the whole ISO, and keep my existing master node installation?
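
Something like this is what I have in mind, assuming the bootstrap initramfs can be unpacked and repacked with the usual tools (using rc.local is just a guess at where the modprobes could be hooked in):

    # unpack the bootstrap initramfs
    mkdir initrd && cd initrd
    gunzip -c ../initramfs.img | cpio -id

    # load the IPoIB stack at boot
    echo "modprobe mlx4_ib"  >> etc/rc.local
    echo "modprobe ib_umad"  >> etc/rc.local
    echo "modprobe ib_ipoib" >> etc/rc.local

    # repack
    find . | cpio -o -H newc | gzip -9 > ../initramfs_ib.img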

And I just wonder whether this will be enough for the Fuel GUI or CLI to see and manage those Infiniband adapters afterwards, so they can be used for the storage traffic.

I'm asking because Infiniband is not Ethernet compatible: IPoIB is just an L3 IP-over-Infiniband device with its own 20-byte hardware address, so it cannot be L2-bridged by linux-bridge or OVS. It does, however, work perfectly with ordinary routing or GRE tunnels in OpenStack.

Regards,
Dmitry

Best Andrey Korolyov (xdeller) said :
#3

Hey,

Actually, you will not be able to use IPoIB with the current Fuel topology, since it relies heavily on the bridge hierarchy and none of the IP addresses assigned to a host go directly to an interface. You can either switch the cards to Ethernet mode and modify the installation and configuration scripts accordingly, try to do the same with EoIPoIB [0], or just manually adjust the Ceph configuration after installation so that the storage traffic passes via an IB-backed subnet. The last option seems like the easiest, if also the hackiest, way to get there.

0. https://lwn.net/Articles/509448/

Dmitry Komarov (dmit-q) said :
#4

Thanks Andrey Korolyov, that solved my question.

Dmitry Komarov (dmit-q) said :
#5

Andrey, one more question please:

You suggest the EoIPoIB solution with a link to a driver implementation at LWN. Did you just google it, or have you tried this driver yourself?

I was already thinking of using it, but unfortunately I have no clue where the source of the driver lives or where it can be downloaded for compilation :( There are some patches mentioned at the link, but no source.

Any ideas?..

Andrey Korolyov (xdeller) said :
#6

Yes, I tried it, but the amount of work is relatively high: you'll need to patch the kernel with those patches (they can be found at http://thread.gmane.org/gmane.linux.network/238895 or on patchwork). EoIB is a faster and more straightforward way to achieve the same functionality, but it depends on your cards' version (though they are not Mellanox, they have the same capabilities as a regular ConnectX Mellanox card).
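
Roughly, assuming a vanilla kernel tree of a matching version and the series from that thread saved as individual patch files (paths are illustrative):

    cd linux-3.x
    # apply the eipoib series; 'git am' if the mails were saved as mbox files,
    # plain 'patch' if they were saved as unified diffs
    git am ../eipoib/*.patch
    # or: for p in ../eipoib/*.patch; do patch -p1 < "$p"; done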

Dmitry Komarov (dmit-q) said :
#7

"EoIB is a faster and more straightforward way to achieve the same functionality ... though they are not mlx, they have same capabilities as a regular connect-x mellanox card" - would you be more specific please?..

I use an older model of Mellanox ConnectX which has no hardware support for Ethernet switching, so there are no official EoIB drivers from Mellanox for my adapters.

Thank you!
Dmitry

Andrey Korolyov (xdeller) said :
#8

Ehm, I thought that your remark about 'not ConnectX-3' cards referred to the card vendor. So if yours are plain ConnectX (not even -2, which can work with EoIB just fine), you may try the patchset from above and consider address reassignment for plain IPoIB as a fallback.

Dmitry Komarov (dmit-q) said :
#9

OK, thank you very much for your explanatory answer!

I was just surprised that no one here had asked the same thing yet, but now things are getting clearer...

I'm having the same 'bridging' problem with the Ceph / CloudStack combo, which has very limited storage capabilities: I can't use RBD as a secondary storage backend there. So I wanted to shift to OpenStack, where RBD can be used for both Glance and Cinder without this annoying bridging requirement. It was just a bit unclear to me how the Fuel OpenStack distro manages storage traffic.

I would prefer to avoid the EoIPoIB solution, as it is supposed to have a large overhead in resources and latency, and I would like to stay with the plain IPoIB model.

Am I right that all I need to do is leave the OVS br-storage as is, assign another IP range to the IB network, and patch ceph.conf and the hosts files to this new address space on all nodes? As far as I can understand, no other configuration is required?

Thank you!
Dmitry

Andrey Korolyov (xdeller) said :
#10

Yes, but I suggest you stay on the same addresses. An address change would require a ceph.conf change and, of course, it is not an easy task for the MON quorum in the case where the 'public' Ceph network coincides with the 'storage' one.

Dmitry Komarov (dmit-q) said :
#11

Good point, thank you!

If I stay with the same IP addresses, then to avoid an IP conflict with the preconfigured OVS br-storage interface, is it enough to change the IP in /etc/network/interfaces.d/ifcfg-br-storage to a different one? Or will this require a more complex reconfiguration of OVS?

Sorry for bothering you, I have just never used OVS before and don't feel comfortable with its configuration yet.

Dmitry

Andrey Korolyov (xdeller) said :
#12

As long as all physical nodes are able to reach this subnet directly there will be absolutely no problem, so you may just move the addresses in the config from br-storage to ibX. The bridge may remain, but feel free to destroy it at any time after the address migration (and delete the appropriate network config).
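
For example, something along these lines on each node (the address and the interface name are illustrative; take them from your existing br-storage config):

    # /etc/network/interfaces.d/ifcfg-br-storage: drop the address
    auto br-storage
    iface br-storage inet manual

    # /etc/network/interfaces.d/ifcfg-ib0: put the old storage address on the IPoIB interface
    auto ib0
    iface ib0 inet static
        address 192.168.1.3
        netmask 255.255.255.0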

Dmitry Komarov (dmit-q) said :
#13

THANK YOU VERY MUCH!

You saved my day!

Everything works as expected, though I also had to change the public_network parameter in ceph.conf to the storage (infiniband) subnet.

Without this change, all the traffic between compute <-> Ceph was going through the mgmt Ethernet network, and only Ceph's internal traffic was running over Infiniband:

http://ceph.com/docs/master/rados/configuration/network-config-ref/
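
The relevant ceph.conf fragment now looks roughly like this on all nodes (the subnet shown is only an example):

    [global]
    # the cluster (replication) network was already the storage subnet;
    # public_network had to be pointed at it as well so that clients use IPoIB too
    public_network  = 192.168.1.0/24
    cluster_network = 192.168.1.0/24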

Roberto (rgonzalez-i) said :
#14

Hello, I'm a little stuck trying to get this working. I have loaded the modules in the kernel, but Fuel still doesn't see the Infiniband cards. These are a couple of things I tried:

1. I modified the initramfs.img and loaded the modules for Infiniband. Fuel still does not see it.
2. I downloaded the Git source and looked for where the modules are loaded, but did not find it.

Any help is appreciated.

Thanks,
Roberto