nova-compute fails after system reboot: unable to connect to RabbitMQ

Asked by WangPeng

Hi everyone,
I have finally finished installing OpenStack on 3 nodes (two compute nodes and one all-in-one node acting as controller).
However, when I reboot the system, the controller's nova-compute service (which has instances) fails with this error:
nova-compute.log:
2012-08-22 19:28:51 INFO nova.rpc.common [req-d38eb14e-dfd8-4c6a-ae26-fb1e522765eb None None] Connected to AMQP server on 172.18.32.7:5672
2012-08-22 19:29:51 ERROR nova.rpc.common [req-d38eb14e-dfd8-4c6a-ae26-fb1e522765eb None None] Timed out waiting for RPC response: timed out
2012-08-22 19:29:51 TRACE nova.rpc.common Traceback (most recent call last):
2012-08-22 19:29:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 490, in ensure
2012-08-22 19:29:51 TRACE nova.rpc.common return method(*args, **kwargs)
2012-08-22 19:29:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 567, in _consume
2012-08-22 19:29:51 TRACE nova.rpc.common return self.connection.drain_events(timeout=timeout)
2012-08-22 19:29:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 175, in drain_events
2012-08-22 19:29:51 TRACE nova.rpc.common return self.transport.drain_events(self.connection, **kwargs)
2012-08-22 19:29:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 238, in drain_events
2012-08-22 19:29:51 TRACE nova.rpc.common return connection.drain_events(**kwargs)
2012-08-22 19:29:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 57, in drain_events
2012-08-22 19:29:51 TRACE nova.rpc.common return self.wait_multi(self.channels.values(), timeout=timeout)
2012-08-22 19:29:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 63, in wait_multi
2012-08-22 19:29:51 TRACE nova.rpc.common chanmap.keys(), allowed_methods, timeout=timeout)
2012-08-22 19:29:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 120, in _wait_multiple
2012-08-22 19:29:51 TRACE nova.rpc.common channel, method_sig, args, content = read_timeout(timeout)
2012-08-22 19:29:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 94, in read_timeout
2012-08-22 19:29:51 TRACE nova.rpc.common return self.method_reader.read_method()
2012-08-22 19:29:51 TRACE nova.rpc.common File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
2012-08-22 19:29:51 TRACE nova.rpc.common raise m
2012-08-22 19:29:51 TRACE nova.rpc.common timeout: timed out
2012-08-22 19:29:51 TRACE nova.rpc.common
2012-08-22 19:29:52 CRITICAL nova [-] Timeout while waiting on RPC response.
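
A quick way to read the excerpt above is to measure the gap between the "Connected to AMQP server" line and the timeout. The sketch below does only that, using the two log lines verbatim. The helper names are illustrative, not part of nova. The gap comes out at exactly 60 seconds, which would be consistent with an RPC response timeout of 60 seconds (an assumption, since the config in this thread does not set one). It also suggests the TCP connection to RabbitMQ succeeded and that the reply to an RPC call never arrived.

```python
from datetime import datetime

# Two lines copied verbatim from the nova-compute.log excerpt above.
LOG_LINES = [
    "2012-08-22 19:28:51 INFO nova.rpc.common [req-d38eb14e-dfd8-4c6a-ae26-fb1e522765eb None None] Connected to AMQP server on 172.18.32.7:5672",
    "2012-08-22 19:29:51 ERROR nova.rpc.common [req-d38eb14e-dfd8-4c6a-ae26-fb1e522765eb None None] Timed out waiting for RPC response: timed out",
]


def timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' stamp of a log line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def seconds_between(first, second):
    """Return the number of seconds elapsed between two log lines."""
    return (timestamp(second) - timestamp(first)).total_seconds()


if __name__ == "__main__":
    print(seconds_between(LOG_LINES[0], LOG_LINES[1]))
```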

Question information

Language: English
Status: Answered
For: OpenStack Compute (nova)
Assignee: No assignee

WangPeng (breakwindwp) said :
#1

Only the nova-compute service on the controller fails; the two compute nodes are fine.
This puzzles me.
Has RabbitMQ closed the connection? But the other nodes are fine.
So I checked the RabbitMQ log:
=INFO REPORT==== 19-Aug-2012::15:51:33 ===
closing TCP connection <0.240.0> from 172.18.32.7:50592

=WARNING REPORT==== 24-Aug-2012::11:13:16 ===
exception on TCP connection <0.229.0> from 172.18.32.7:50587
connection_closed_abruptly

=INFO REPORT==== 24-Aug-2012::11:13:16 ===
closing TCP connection <0.229.0> from 172.18.32.7:50587

=WARNING REPORT==== 24-Aug-2012::11:13:16 ===
exception on TCP connection <0.262.0> from 172.18.32.7:50594
connection_closed_abruptly

=INFO REPORT==== 24-Aug-2012::11:13:16 ===
closing TCP connection <0.262.0> from 172.18.32.7:50594

=WARNING REPORT==== 24-Aug-2012::11:13:16 ===
exception on TCP connection <0.306.0> from 172.18.32.7:50600
connection_closed_abruptly

=INFO REPORT==== 24-Aug-2012::11:13:16 ===
closing TCP connection <0.306.0> from 172.18.32.7:50600

=WARNING REPORT==== 24-Aug-2012::11:13:16 ===
exception on TCP connection <0.330.0> from 172.18.32.7:50601
connection_closed_abruptly

=INFO REPORT==== 24-Aug-2012::11:13:16 ===
closing TCP connection <0.330.0> from 172.18.32.7:50601

=WARNING REPORT==== 24-Aug-2012::11:13:16 ===
exception on TCP connection <0.250.0> from 172.18.32.7:50593
connection_closed_abruptly

=INFO REPORT==== 24-Aug-2012::11:13:16 ===
closing TCP connection <0.250.0> from 172.18.32.7:50593

=INFO REPORT==== 24-Aug-2012::11:14:04 ===
Limiting to approx 924 file handles (829 sockets)

=INFO REPORT==== 24-Aug-2012::11:14:04 ===
Memory limit set to 1579MB of 3948MB total.

=INFO REPORT==== 24-Aug-2012::11:14:04 ===
msg_store_transient: using rabbit_msg_store_ets_index to provide index

=INFO REPORT==== 24-Aug-2012::11:14:04 ===
msg_store_persistent: using rabbit_msg_store_ets_index to provide index

=WARNING REPORT==== 24-Aug-2012::11:14:04 ===
msg_store_persistent: rebuilding indices from scratch
I can't find the reason.
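
To see which clients the broker reports as dropping their connections, the log can be tallied by source address. This is a minimal sketch, assuming the log keeps the three-line "exception on TCP connection ... connection_closed_abruptly" shape shown above. The function name and the shortened sample are illustrative. On the full excerpt, every abrupt close comes from 172.18.32.7 at 11:13:16, shortly before what look like broker startup messages at 11:14:04, which fits the reboot described in the question.

```python
import re
from collections import Counter

# A shortened sample in the same shape as the RabbitMQ log above.
SAMPLE = """\
=WARNING REPORT==== 24-Aug-2012::11:13:16 ===
exception on TCP connection <0.229.0> from 172.18.32.7:50587
connection_closed_abruptly

=INFO REPORT==== 24-Aug-2012::11:13:16 ===
closing TCP connection <0.229.0> from 172.18.32.7:50587

=WARNING REPORT==== 24-Aug-2012::11:13:16 ===
exception on TCP connection <0.262.0> from 172.18.32.7:50594
connection_closed_abruptly
"""

# Capture the client IP of each connection that was closed abruptly.
PATTERN = re.compile(
    r"exception on TCP connection <[\d.]+> from (\d+\.\d+\.\d+\.\d+):\d+\s*\n"
    r"connection_closed_abruptly"
)


def abrupt_closes_by_host(log_text):
    """Count connection_closed_abruptly reports per client IP address."""
    return Counter(PATTERN.findall(log_text))


if __name__ == "__main__":
    for host, count in abrupt_closes_by_host(SAMPLE).items():
        print(host, count)
```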

WangPeng (breakwindwp) said :
#2

This is my config file:
[DEFAULT]
###### LOGS/STATE
#verbose=True
verbose=False

###### AUTHENTICATION
auth_strategy=keystone

###### SCHEDULER
#--compute_scheduler_driver=nova.scheduler.filter_scheduler.FilterScheduler
scheduler_driver=nova.scheduler.simple.SimpleScheduler

###### VOLUMES
volume_group=nova-volumes
volume_name_template=volume-%08x
iscsi_helper=tgtadm
iscsi_ip_prefix=172.18.32

###### DATABASE
sql_connection=mysql://nova:*****@172.18.32.7/nova

###### COMPUTE
libvirt_type=kvm
#libvirt_type=qemu
connection_type=libvirt
instance_name_template=instance-%08x
api_paste_config=/etc/nova/api-paste.ini
allow_resize_to_same_host=True
libvirt_use_virtio_for_bridges=true
start_guests_on_host_boot=true
resume_guests_state_on_host_boot=true

###### APIS
osapi_compute_extension=nova.api.openstack.compute.contrib.standard_extensions
allow_admin_api=true
s3_host=172.18.32.7
cc_host=172.18.32.7

###### RABBITMQ
rabbit_host=172.18.32.7

###### GLANCE
image_service=nova.image.glance.GlanceImageService
glance_api_servers=172.18.32.7:9292

###### NETWORK
network_manager=nova.network.manager.FlatDHCPManager
force_dhcp_release=True
dhcpbridge_flagfile=/etc/nova/nova.conf
dhcpbridge=/usr/bin/nova-dhcpbridge
firewall_driver=nova.virt.libvirt.firewall.IptablesFirewallDriver
public_interface=eth0
flat_interface=eth1
flat_network_bridge=br100
fixed_range=172.18.32.160/27
multi_host=true

###### NOVNC CONSOLE
novnc_enabled=true
novncproxy_base_url= http://172.18.32.7:6080/vnc_auto.html
vncserver_proxyclient_address=172.18.32.7
vncserver_listen=172.18.32.7

########Nova
logdir=/var/log/nova
state_path=/var/lib/nova
lock_path=/var/lock/nova

#####MISC
use_deprecated_auth=false
root_helper=sudo nova-rootwrap

WangPeng (breakwindwp) said :
#3

When I stop the nova-compute service on the compute node and reboot the system again, the controller node is fine!

So I suspect a RabbitMQ queue problem.
I then added the following lines to nova.conf:
rabbit_durable_queues=ture
rabbit_max_retries=3
rabbit_retry_backoff=5
rabbit_retry_interval=3

I restarted nova-compute, but now the compute node fails.
Why?
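
One thing worth checking in the added lines is the value `ture`, which is not a recognised boolean spelling. The sketch below uses Python's standard `configparser` as a stand-in to flag such values. How nova's own option parser reacts to it is an assumption and is not shown here. Treating `rabbit_durable_queues` as boolean-typed is also an assumption, and the function name is illustrative.

```python
import configparser

# The four lines added in comment #3, copied verbatim (including "ture").
ADDED = """\
[DEFAULT]
rabbit_durable_queues=ture
rabbit_max_retries=3
rabbit_retry_backoff=5
rabbit_retry_interval=3
"""

# Options assumed to be boolean-typed for the purpose of this check.
BOOLEAN_OPTIONS = ["rabbit_durable_queues"]


def invalid_booleans(conf_text, names=BOOLEAN_OPTIONS):
    """Return (option, value) pairs whose value is not a valid boolean."""
    # Interpolation is disabled so values such as "instance-%08x" in a
    # full nova.conf do not trip the parser.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    bad = []
    for name in names:
        if not parser.has_option("DEFAULT", name):
            continue  # option not set, nothing to validate
        try:
            parser.getboolean("DEFAULT", name)
        except ValueError:
            bad.append((name, parser.get("DEFAULT", name)))
    return bad


if __name__ == "__main__":
    print(invalid_booleans(ADDED))
```

Separately, AMQP brokers reject redeclaring an existing queue or exchange with a different durable flag, so a durability setting generally has to match on every node that shares those queues.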

Yaguang Tang (heut2008) said :
#4

This seems to be a RabbitMQ issue. Which version of RabbitMQ are you running?
