Unable to run instance on second node: "Timed out waiting for RPC response"

Asked by xmdr

I've installed OpenStack Essex on two nodes running CentOS 6.2. node1 runs all nova services; node2 runs only the compute service.
I could launch instances successfully with a single node (node1), but launching started to fail once I added node2 to the cloud.
When I launch an instance, it is scheduled on node2 and goes into the ERROR state after a while. Errors appear in both network.log on node1 and compute.log on node2.

network.log on node 1:
2012-07-03 14:43:00 ERROR nova.rpc.common [-] Timed out waiting for RPC response: timed out
2012-07-03 14:43:00 TRACE nova.rpc.common Traceback (most recent call last):
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py", line 490, in ensure
2012-07-03 14:43:00 TRACE nova.rpc.common return method(*args, **kwargs)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py", line 567, in _consume
2012-07-03 14:43:00 TRACE nova.rpc.common return self.connection.drain_events(timeout=timeout)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/kombu/connection.py", line 139, in drain_events
2012-07-03 14:43:00 TRACE nova.rpc.common return self.transport.drain_events(self.connection, **kwargs)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/kombu/transport/pyamqplib.py", line 223, in drain_events
2012-07-03 14:43:00 TRACE nova.rpc.common return connection.drain_events(**kwargs)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/kombu/transport/pyamqplib.py", line 56, in drain_events
2012-07-03 14:43:00 TRACE nova.rpc.common return self.wait_multi(self.channels.values(), timeout=timeout)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/kombu/transport/pyamqplib.py", line 62, in wait_multi
2012-07-03 14:43:00 TRACE nova.rpc.common chanmap.keys(), allowed_methods, timeout=timeout)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/kombu/transport/pyamqplib.py", line 119, in _wait_multiple
2012-07-03 14:43:00 TRACE nova.rpc.common channel, method_sig, args, content = read_timeout(timeout)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/kombu/transport/pyamqplib.py", line 93, in read_timeout
2012-07-03 14:43:00 TRACE nova.rpc.common return self.method_reader.read_method()
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/amqplib/client_0_8/method_framing.py", line 215, in read_method
2012-07-03 14:43:00 TRACE nova.rpc.common raise m
2012-07-03 14:43:00 TRACE nova.rpc.common timeout: timed out

compute.log on node 2:

2012-07-03 14:43:00 ERROR nova.rpc.common [req-76b47004-835b-4875-8cb0-f4714776aa2b b74e7826ac8c46f1968653e71871d69e 7df8530e806740428d4ae230f6a50cad] Timed out waiting for RPC response: timed out
2012-07-03 14:43:00 TRACE nova.rpc.common Traceback (most recent call last):
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py", line 490, in ensure
2012-07-03 14:43:00 TRACE nova.rpc.common return method(*args, **kwargs)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py", line 567, in _consume
2012-07-03 14:43:00 TRACE nova.rpc.common return self.connection.drain_events(timeout=timeout)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/kombu/connection.py", line 139, in drain_events
2012-07-03 14:43:00 TRACE nova.rpc.common return self.transport.drain_events(self.connection, **kwargs)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/kombu/transport/pyamqplib.py", line 223, in drain_events
2012-07-03 14:43:00 TRACE nova.rpc.common return connection.drain_events(**kwargs)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/kombu/transport/pyamqplib.py", line 56, in drain_events
2012-07-03 14:43:00 TRACE nova.rpc.common return self.wait_multi(self.channels.values(), timeout=timeout)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/kombu/transport/pyamqplib.py", line 62, in wait_multi
2012-07-03 14:43:00 TRACE nova.rpc.common chanmap.keys(), allowed_methods, timeout=timeout)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/kombu/transport/pyamqplib.py", line 119, in _wait_multiple
2012-07-03 14:43:00 TRACE nova.rpc.common channel, method_sig, args, content = read_timeout(timeout)
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/kombu/transport/pyamqplib.py", line 93, in read_timeout
2012-07-03 14:43:00 TRACE nova.rpc.common return self.method_reader.read_method()
2012-07-03 14:43:00 TRACE nova.rpc.common File "/usr/lib/python2.6/site-packages/amqplib/client_0_8/method_framing.py", line 215, in read_method
2012-07-03 14:43:00 TRACE nova.rpc.common raise m
2012-07-03 14:43:00 TRACE nova.rpc.common timeout: timed out
2012-07-03 14:43:00 TRACE nova.rpc.common
2012-07-03 14:43:00 ERROR nova.compute.manager [req-76b47004-835b-4875-8cb0-f4714776aa2b b74e7826ac8c46f1968653e71871d69e 7df8530e806740428d4ae230f6a50cad] [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] Instance failed network setup
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] Traceback (most recent call last):
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 565, in _allocate_network
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] requested_networks=requested_networks)
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] File "/usr/lib/python2.6/site-packages/nova/network/api.py", line 170, in allocate_for_instance
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] 'args': args})
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] File "/usr/lib/python2.6/site-packages/nova/rpc/__init__.py", line 68, in call
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] return _get_impl().call(context, topic, msg, timeout)
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] File "/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py", line 674, in call
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] return rpc_amqp.call(context, topic, msg, timeout, Connection.pool)
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] File "/usr/lib/python2.6/site-packages/nova/rpc/amqp.py", line 343, in call
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] rv = list(rv)
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] File "/usr/lib/python2.6/site-packages/nova/rpc/amqp.py", line 304, in __iter__
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] self.done()
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] self.gen.next()
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] File "/usr/lib/python2.6/site-packages/nova/rpc/amqp.py", line 301, in __iter__
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] self._iterator.next()
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] File "/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py", line 572, in iterconsume
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] yield self.ensure(_error_callback, _consume)
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] File "/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py", line 503, in ensure
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] error_callback(e)
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] File "/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py", line 553, in _error_callback
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] raise rpc_common.Timeout()
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12] Timeout: Timeout while waiting on RPC response.
2012-07-03 14:43:00 TRACE nova.compute.manager [instance: 501652e9-6d09-47dc-9d9b-b646372f1a12]
2012-07-03 14:43:00 ERROR nova.rpc.amqp [req-76b47004-835b-4875-8cb0-f4714776aa2b b74e7826ac8c46f1968653e71871d69e 7df8530e806740428d4ae230f6a50cad] Exception during message handling
2012-07-03 14:43:00 TRACE nova.rpc.amqp Traceback (most recent call last):
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/rpc/amqp.py", line 253, in _process_data
2012-07-03 14:43:00 TRACE nova.rpc.amqp rval = node_func(context=ctxt, **node_args)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/exception.py", line 114, in wrapped
2012-07-03 14:43:00 TRACE nova.rpc.amqp return f(*args, **kw)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 183, in decorated_function
2012-07-03 14:43:00 TRACE nova.rpc.amqp sys.exc_info())
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2012-07-03 14:43:00 TRACE nova.rpc.amqp self.gen.next()
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 177, in decorated_function
2012-07-03 14:43:00 TRACE nova.rpc.amqp return function(self, context, instance_uuid, *args, **kwargs)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 657, in run_instance
2012-07-03 14:43:00 TRACE nova.rpc.amqp do_run_instance()
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/utils.py", line 946, in inner
2012-07-03 14:43:00 TRACE nova.rpc.amqp retval = f(*args, **kwargs)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 656, in do_run_instance
2012-07-03 14:43:00 TRACE nova.rpc.amqp self._run_instance(context, instance_uuid, **kwargs)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 457, in _run_instance
2012-07-03 14:43:00 TRACE nova.rpc.amqp self._set_instance_error_state(context, instance_uuid)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2012-07-03 14:43:00 TRACE nova.rpc.amqp self.gen.next()
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 430, in _run_instance
2012-07-03 14:43:00 TRACE nova.rpc.amqp requested_networks)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 565, in _allocate_network
2012-07-03 14:43:00 TRACE nova.rpc.amqp requested_networks=requested_networks)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/network/api.py", line 170, in allocate_for_instance
2012-07-03 14:43:00 TRACE nova.rpc.amqp 'args': args})
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/rpc/__init__.py", line 68, in call
2012-07-03 14:43:00 TRACE nova.rpc.amqp return _get_impl().call(context, topic, msg, timeout)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py", line 674, in call
2012-07-03 14:43:00 TRACE nova.rpc.amqp return rpc_amqp.call(context, topic, msg, timeout, Connection.pool)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/rpc/amqp.py", line 343, in call
2012-07-03 14:43:00 TRACE nova.rpc.amqp rv = list(rv)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/rpc/amqp.py", line 304, in __iter__
2012-07-03 14:43:00 TRACE nova.rpc.amqp self.done()
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2012-07-03 14:43:00 TRACE nova.rpc.amqp self.gen.next()
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/rpc/amqp.py", line 301, in __iter__
2012-07-03 14:43:00 TRACE nova.rpc.amqp self._iterator.next()
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py", line 572, in iterconsume
2012-07-03 14:43:00 TRACE nova.rpc.amqp yield self.ensure(_error_callback, _consume)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py", line 503, in ensure
2012-07-03 14:43:00 TRACE nova.rpc.amqp error_callback(e)
2012-07-03 14:43:00 TRACE nova.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/rpc/impl_kombu.py", line 553, in _error_callback
2012-07-03 14:43:00 TRACE nova.rpc.amqp raise rpc_common.Timeout()
2012-07-03 14:43:00 TRACE nova.rpc.amqp Timeout: Timeout while waiting on RPC response.
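
The two logs above are long, so it can help to reduce them to their ERROR lines first. The following is only a minimal sketch, assuming the line layout seen above (date, time, level, logger name, message); the function name `error_summary` is made up for illustration and is not part of nova. It is written for Python 3, not the Python 2.6 shown in the tracebacks.

```python
import re

# Assumed layout: "<date> <time> <LEVEL> <logger> <message>"
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) (\S+) (.*)$")
# Leading bracketed groups such as "[-]", "[req-...]" or "[instance: ...]"
CONTEXT = re.compile(r"^(?:\[[^\]]*\]\s*)+")


def error_summary(log_text):
    """Return (timestamp, logger, message) for every ERROR line in the text."""
    summary = []
    for line in log_text.splitlines():
        match = LINE.match(line)
        if match and match.group(2) == "ERROR":
            message = CONTEXT.sub("", match.group(4))
            summary.append((match.group(1), match.group(3), message))
    return summary
```

Applied to the compute.log excerpt above, this would leave three entries: the RPC timeout, "Instance failed network setup", and "Exception during message handling".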

I searched Google, found similar problems, and tried several suggested solutions, but all of them failed.
I restarted the nova-network and nova-compute services following the tips at https://bugs.launchpad.net/nova/+bug/973609, but nothing changed.
I could not find any lock file on either node when following the solution at http://www.gossamer-threads.com/lists/openstack/dev/14833.

So I am stuck with this error and am asking for help.
Here is my nova.conf:

[DEFAULT]
# LOGS/STATE
verbose=True
logdir=/var/log/nova
state_path=/var/lib/nova
lock_path=/var/lock/nova
# AUTHENTICATION
auth_strategy=keystone
# SCHEDULER
compute_scheduler_driver=nova.scheduler.filter_scheduler.FilterScheduler
# VOLUMES
volume_group=nova-volumes
volume_name_template=volume-%08x
iscsi_helper=tgtadm
# DATABASE
sql_connection=mysql://nova:nova@192.168.128.10/nova
# COMPUTE
libvirt_type=kvm
connection_type=libvirt
instance_name_template=instance-%08x
api_paste_config=/etc/nova/api-paste.ini
allow_resize_to_same_host=True
# APIS
osapi_compute_extension=nova.api.openstack.compute.contrib.standard_extensions
ec2_dmz_host=192.168.128.10
s3_host=192.168.128.10
# RABBITMQ
rabbit_host=192.168.128.10
# GLANCE
image_service=nova.image.glance.GlanceImageService
glance_api_servers=192.168.128.10:9292
# NETWORK
network_manager=nova.network.manager.FlatDHCPManager
force_dhcp_release=False
dhcpbridge_flagfile=/etc/nova/nova.conf
dhcpbridge=/usr/bin/nova-dhcpbridge
firewall_driver=nova.virt.libvirt.firewall.IptablesFirewallDriver
my_ip=192.168.128.10
public_interface=eth0
#vlan_interface=eth0
flat_network_bridge=br100
flat_interface=eth0
fixed_range=172.16.1.0/24
flat_network_dhcp_start=172.16.1.2
# NOVNC CONSOLE
novncproxy_base_url=http://192.168.128.10:6080/vnc_auto.html
vncserver_proxyclient_address=192.168.128.10
vncserver_listen=192.168.128.10
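
When the same file is reused on a second node, it can be useful to list every option that points at one particular address so each can be reviewed per node. The following is a minimal sketch using Python 3's standard `configparser`; the helper name `options_containing` and the shortened sample are assumptions for illustration. Whether any listed option should differ on node2 depends on the deployment.

```python
import configparser


def options_containing(conf_text, needle):
    """Return {option: value} for [DEFAULT] options whose value contains needle."""
    # interpolation=None keeps values such as "instance-%08x" literal;
    # the default interpolation would reject the bare "%".
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return {key: value for key, value in parser.defaults().items() if needle in value}


# Shortened excerpt of the nova.conf above, for illustration only.
sample = """\
[DEFAULT]
instance_name_template=instance-%08x
rabbit_host=192.168.128.10
my_ip=192.168.128.10
flat_interface=eth0
"""

for option, value in sorted(options_containing(sample, "192.168.128.10").items()):
    print(option, "=", value)
```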

Could anyone help me?

Question information

Language: English
Status: Solved
For: OpenStack Compute (nova)
Assignee: No assignee
Solved by: xmdr

Nilanjan Roy (nilanjan-r) said:
#1

Can you check the RabbitMQ log files? I faced a similar RPC timeout while listing floating IPs, and the problem boiled down to the rabbitmq-server queue being full. I restarted the service and blocked the source that was making numerous AMQP requests (https://answers.launchpad.net/nova/+question/202628).

I think the servers are communicating with each other by hostname.
Also try setting these flags:

novnc_enabled=true
vncserver_listen=0.0.0.0
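
The two checks suggested above (is the message broker reachable, and do the nodes resolve each other's hostnames) can be sketched with Python 3's standard `socket` module. This is only a rough sketch: the helper names `can_connect` and `resolve` are made up for illustration, and port 5672 is an assumption (the usual AMQP default), since the nova.conf in the question does not set a port.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def resolve(hostname):
    """Return the IPv4 address a hostname resolves to, or None if lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None
```

Run on node2, `can_connect("192.168.128.10", 5672)` would show whether the compute node can open a connection to the `rabbit_host` from the question, and `resolve("node1")` (assuming the node's hostname matches its label in the question) would show what that name resolves to there. A successful TCP connection does not prove the broker is healthy, so the RabbitMQ logs are still worth checking.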

xmdr (xiamianderen) said :
#2

Hi, Nilanjan

Thanks for your reply. However, after many failed attempts I had to reinstall OpenStack, and then I ran into another problem.

I will follow your advice if I hit the same problem again. If you have some time, could you please take a look at my question at https://answers.launchpad.net/nova/+question/202631?