RabbitMQ server hangs continuously in multi-node setup

Asked by Dannina Anil Kumar on 2013-01-18

Hi all,

In our organization, we are building a multi-node OpenStack setup as part of a product. The setup contains one controller node (with the api, scheduler, and cert services installed) and more than one compute node (each with the compute and network services installed). In this setup, the rabbitmq service sometimes hangs repeatedly, and when we run the command rabbitmqctl list_queues, not all of the compute nodes are listed.

The scheduler log continuously shows the following error:

2013-01-18 07:29:02 INFO nova.rpc.common [-] Reconnecting to AMQP server on ---------------
2013-01-18 07:29:02 ERROR nova.rpc.common [-] AMQP server on -------------- is unreachable: [Errno 111] ECONNREFUSED. Trying again in 3 seconds.
2013-01-18 07:29:02 TRACE nova.rpc.common Traceback (most recent call last):
2013-01-18 07:29:02 TRACE nova.rpc.common File "/usr/local/lib/python2.7/dist-packages/nova-2012.1.1-py2.7.egg/nova/rpc/impl_kombu.py", line 446, in reconnect
2013-01-18 07:29:02 TRACE nova.rpc.common self._connect()
2013-01-18 07:29:02 TRACE nova.rpc.common File "/usr/local/lib/python2.7/dist-packages/nova-2012.1.1-py2.7.egg/nova/rpc/impl_kombu.py", line 423, in _connect
2013-01-18 07:29:02 TRACE nova.rpc.common self.connection.connect()
2013-01-18 07:29:02 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 95, in connect
2013-01-18 07:29:02 TRACE nova.rpc.common return self.connection
2013-01-18 07:29:02 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 396, in connection
2013-01-18 07:29:02 TRACE nova.rpc.common self._connection = self._establish_connection()
2013-01-18 07:29:02 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 364, in _establish_connection
2013-01-18 07:29:02 TRACE nova.rpc.common return self.transport.establish_connection()
2013-01-18 07:29:02 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 219, in establish_connection
2013-01-18 07:29:02 TRACE nova.rpc.common connect_timeout=conninfo.connect_timeout)
2013-01-18 07:29:02 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 45, in __init__
2013-01-18 07:29:02 TRACE nova.rpc.common super(Connection, self).__init__(*args, **kwargs)
2013-01-18 07:29:02 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/connection.py", line 129, in __init__
2013-01-18 07:29:02 TRACE nova.rpc.common self.transport = create_transport(host, connect_timeout, ssl)
2013-01-18 07:29:02 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/transport.py", line 281, in create_transport
2013-01-18 07:29:02 TRACE nova.rpc.common return TCPTransport(host, connect_timeout)
2013-01-18 07:29:02 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/transport.py", line 85, in __init__
2013-01-18 07:29:02 TRACE nova.rpc.common raise socket.error, msg
2013-01-18 07:29:02 TRACE nova.rpc.common error: [Errno 111] ECONNREFUSED

Can anyone please suggest a solution? How can I keep the rabbitmq server service available all the time?

Thanks in advance,
D Anil Kumar.

Question information

Language: English
Status: Answered
For: OpenStack Compute (nova)
Assignee: No assignee
Last query: 2013-01-18
Last reply: 2013-01-19

Keith Tobin (keith-tobin) said : #1

This story is worth sharing.

We had been talking offline about this one. It seems the problem first showed up as an RPC error in the log.
The nova manager tried to make an RPC call to the nova network manager to set up instance networking. The RPC call was performed, but a response was never received, so an error occurred and was logged. No response came back because quantum networking was configured in the nova config file but quantum was not installed, so nothing was there to reply.
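
To make the failure mode concrete, here is a minimal sketch of an RPC call that waits for a reply that never comes. It is not nova's RPC code: it simulates the pattern with in-process Python queues, and the names rpc_call, start_consumer, and RpcTimeout are made up for illustration. With a consumer running, the caller gets a reply; with nothing consuming the request queue (the situation when the configured network service is not installed), the caller can only time out and log an error.

```python
import queue
import threading


class RpcTimeout(Exception):
    """Raised when no reply arrives within the timeout (hypothetical name)."""


def rpc_call(request_queue, message, timeout=1.0):
    """Send a request and block until a reply arrives or the timeout expires."""
    reply_queue = queue.Queue()
    # The request carries its own reply queue, like a reply-to address.
    request_queue.put((message, reply_queue))
    try:
        return reply_queue.get(timeout=timeout)
    except queue.Empty:
        raise RpcTimeout("no reply to %r within %.1fs" % (message, timeout))


def start_consumer(request_queue):
    """Start a background worker that answers every request it receives."""
    def worker():
        while True:
            message, reply_queue = request_queue.get()
            reply_queue.put("done: " + message)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


if __name__ == "__main__":
    served = queue.Queue()
    start_consumer(served)
    print(rpc_call(served, "setup_network"))

    unserved = queue.Queue()  # nothing ever reads from this queue
    try:
        rpc_call(unserved, "setup_network", timeout=0.5)
    except RpcTimeout as exc:
        print("RPC failed:", exc)
```

The point of the sketch is only that the caller cannot tell a slow service from a missing one; both look like a missing reply.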

I understand that you changed the network manager in the nova.conf file to load nova networking, and that you are now running nova networking.

All seems good now?
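
For anyone else who hits the ECONNREFUSED trace from the question: that error means nothing accepted the TCP connection on the broker's port at that moment. A quick way to check this from a compute node is a plain socket probe, sketched below. It assumes the broker listens on RabbitMQ's default AMQP port 5672; the function name amqp_port_status is made up for this example, and the host is whatever address your nova config points at (it is redacted in the log above).

```python
import errno
import socket


def amqp_port_status(host, port=5672, timeout=3.0):
    """Return 'open', 'refused', 'timeout', or 'error: ...' for a TCP probe."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        # No answer at all: host down or traffic dropped by a firewall.
        return "timeout"
    except OSError as exc:
        if exc.errno == errno.ECONNREFUSED:
            # Host is reachable but nothing is listening on the port.
            return "refused"
        return "error: %s" % exc
    sock.close()
    return "open"


if __name__ == "__main__":
    import sys

    # Usage: python check_amqp.py <broker-host>
    host = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    print(host, amqp_port_status(host))
```

A result of "refused" points at the rabbitmq service being stopped or crashed on the controller, while "open" suggests the broker is up and the problem is further along, as it was in this case.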
