Getting error in /var/log/quantum/agent-dhcp.log

Asked by Srikanth on 2012-09-13

Hi,
I'm using Folsom Testing Packages on Ubuntu 12.04., Quantum V2 with OVS Plugin & Agent running.
I created a network with subnet 10.10.0.0/24. I wanted to create an instance with the created network.
I am trying to create that instance in the same Controller Node.

The following services are up and running.
1.quantum-server
2.quantum-plugin-openvswitch-agent
3.quantum-dhcp-agent
4.nova-api
5.nova-compute
6.nova-cert
7.nova-consoleauth
8.nova-scheduler
9.novnc

The thing is, I am able to create an instance and running. I am not able to get an IP from the assigned subnet.
When I check the processes running, I am not able to see any 'dnsmasq' with the mentioned subnet.
When I check the logs of Quantum DHCP Agent in '/var/log/quantum/agent-dhcp.log', I found the following errors which are getting for every 30 secs:

2012-09-13 14:14:39 INFO [quantum.openstack.common.rpc.common] Reconnecting to AMQP server on localhost:5672
2012-09-13 14:14:39 DEBUG [amqplib] Start from server, version: 8.0, properties: {u'information': u'Licensed under the MPL. See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2011 VMware, Inc.', u'capabilities': {}, u'platform': u'Erlang/OTP', u'version': u'2.7.1'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US']
2012-09-13 14:14:42 ERROR [quantum.openstack.common.rpc.common] AMQP server on localhost:5672 is unreachable: Socket closed. Trying again in 30 seconds.
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 469, in reconnect
    self._connect()
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 446, in _connect
    self.connection.connect()
  File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 154, in connect
    return self.connection
  File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 560, in connection
    self._connection = self._establish_connection()
  File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 521, in _establish_connection
    conn = self.transport.establish_connection()
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 255, in establish_connection
    connect_timeout=conninfo.connect_timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 52, in __init__
    super(Connection, self).__init__(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/connection.py", line 144, in __init__
    (10, 30), # tune
  File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/abstract_channel.py", line 95, in wait
    self.channel_id, allowed_methods)
  File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/connection.py", line 202, in _wait_method
    self.method_reader.read_method()
  File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
    raise m
IOError: Socket closed

Please help to fix this.

Regards,
Srikanth.

Question information

Language:
English Edit question
Status:
Solved
For:
neutron Edit question
Assignee:
No assignee Edit question
Solved by:
Srikanth
Solved:
2012-09-18
Last query:
2012-09-18
Last reply:
2012-09-18
dan wendlandt (danwent) said : #1

the DHCP-agent communicates to the main quantum-server using RPC.

Your DHCP agent is trying to connect to a rabbit server on localhost.... perhaps you are not running rabbit on the localhost? You may need to update rabbit_host and rabbit_password in your quantum.conf config file.

Srikanth (srikanth-lingala) said : #2

Rabbit server is running on my localhost only, as I am trying to add an instance on my Controller Node.
When I see in 'netstat', Rabbit Server showing as running.

#> netstat -ntap | grep 5672
tcp6 0 0 :::5672 :::* LISTEN 2388/beam.smp

I tried with both 'localhost' and 'actual ip' as values to the 'rabbit_host' field in '/etc/quantum/quantum.conf' of the controller node.
But, I am still getting the same issue.

Regards,
Srikanth.

dan wendlandt (danwent) said : #3

Yes, closer inspection suggests that the rabbit server is running, but in intentionally closing the connection made by the DHCP agent. See the line that says:

2012-09-13 14:14:42 ERROR [quantum.openstack.common.rpc.common] AMQP server on localhost:5672 is unreachable: Socket closed.

I would check the log files in /var/log/rabbitmq/ to see if the rabbit server explained why it closed connection. One possibility is if the dhcp-agent is not configured with the right rabbit password.

Srikanth (srikanth-lingala) said : #4

I checked the logs of rabbitmq at '/<email address hidden>'.
I found the following error for every 30 secs.

=ERROR REPORT==== 10-Sep-2012::15:19:11 ===
exception on TCP connection <0.308.0> from 127.0.0.1:33910
{channel0_error,starting,
                {amqp_error,access_refused,
                            "AMQPLAIN login refused: user 'guest' - invalid credentials",
                            'connection.start_ok'}}

How to configure dhcp-agent with rabbitmq details?
I gave the following fields in '/etc/quantum/dhcp_agent.ini', even though I gave the same fields in '/etc/quantum/dhcp_agent.ini':
rabbit_host = localhost
rabbit_userid = guest
rabbit_password = password

And, then restarted quantum-dhcp-agent. Am I correct here??

But, now I am getting the following error in '/var/log/quantum/agent-dhcp.log':

2012-09-14 12:27:24 INFO [quantum.openstack.common.rpc.common] Connected to AMQP server on localhost:5672
2012-09-14 12:28:24 ERROR [quantum.openstack.common.rpc.common] Timed out waiting for RPC response: timed out
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 513, in ensure
    return method(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 590, in _consume
    return self.connection.drain_events(timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 175, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 238, in drain_events
    return connection.drain_events(**kwargs)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 57, in drain_events
    return self.wait_multi(self.channels.values(), timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 63, in wait_multi
    chanmap.keys(), allowed_methods, timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 120, in _wait_multiple
    channel, method_sig, args, content = read_timeout(timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 94, in read_timeout
    return self.method_reader.read_method()
  File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
    raise m
timeout: timed out
2012-09-14 12:28:24 DEBUG [amqplib] Closed channel #1
2012-09-14 12:28:24 DEBUG [amqplib] using channel_id: 1
2012-09-14 12:28:24 DEBUG [amqplib] Channel open

Please guide me.

dan wendlandt (danwent) said : #5

ok, so its definitely a bad rabbit username/password issue.

quantum-dhcp-agent should be passed both dhcp_agent.ini and quantum.conf using --config-file options.

please make sure rabbit has the same user and password configured as you have in the quantum.conf file.

you can make sure a particular password is set on rabbit using rabbitmqctl

Srikanth (srikanth-lingala) said : #6

Ok, But, I gave the correct username/password in both dhcp_agent.ini and quantum.conf.
Still, I am getting the same errors:

2012-09-17 13:54:17 INFO [quantum.agent.dhcp_agent] Synchronizing state
2012-09-17 13:54:17 DEBUG [quantum.openstack.common.rpc.amqp] Making asynchronous call on q-plugin ...
2012-09-17 13:54:17 DEBUG [quantum.openstack.common.rpc.amqp] MSG_ID is 2e7704be7914452b8b89d569e4e76f5d
2012-09-17 13:55:17 ERROR [quantum.openstack.common.rpc.common] Timed out waiting for RPC response: timed out
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 513, in ensure
    return method(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 590, in _consume
    return self.connection.drain_events(timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 175, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 238, in drain_events
    return connection.drain_events(**kwargs)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 57, in drain_events
    return self.wait_multi(self.channels.values(), timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 63, in wait_multi
    chanmap.keys(), allowed_methods, timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 120, in _wait_multiple
    channel, method_sig, args, content = read_timeout(timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 94, in read_timeout
    return self.method_reader.read_method()
  File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
    raise m
timeout: timed out
2012-09-17 13:55:17 DEBUG [amqplib] Closed channel #1
2012-09-17 13:55:17 DEBUG [amqplib] using channel_id: 1
2012-09-17 13:55:17 DEBUG [amqplib] Channel open
2012-09-17 13:55:17 ERROR [quantum.agent.dhcp_agent] Unable to sync network state.
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/quantum/agent/dhcp_agent.py", line 112, in sync_state
    active_networks = set(self.plugin_rpc.get_active_networks())
  File "/usr/lib/python2.7/dist-packages/quantum/agent/dhcp_agent.py", line 260, in get_active_networks
    topic=self.topic)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
    return rpc.call(context, self._get_topic(topic), msg, timeout)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/__init__.py", line 102, in call
    return _get_impl().call(cfg.CONF, context, topic, msg, timeout)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 712, in call
    rpc_amqp.get_connection_pool(conf, Connection))
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/amqp.py", line 368, in call
    rv = list(rv)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/amqp.py", line 329, in __iter__
    self.done()
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/amqp.py", line 326, in __iter__
    self._iterator.next()
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 595, in iterconsume
    yield self.ensure(_error_callback, _consume)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 526, in ensure
    error_callback(e)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 576, in _error_callback
    raise rpc_common.Timeout()
Timeout: Timeout while waiting on RPC response.

Please suggest me.

dan wendlandt (danwent) said : #7

are you still seeing the messages about 'invalid credentials' being generated in the rabbit log? If so, investigating that would be my suggestion. I have no quantum-specific advice on that.

Lawrance (jing) said : #8

hi Srikanth
i have the same problem, the problem is about the configuration “/etc/init/quantum-dhcp-agent.conf”
cat /etc/init/quantum-dhcp-agent.conf
description "Quantum l3 agent"
author "Chuck Short <email address hidden>"

start on runlevel [2345]
stop on runlevel [016]

chdir /var/run

pre-start script
        mkdir -p /var/run/quantum
        chown quantum:root /var/run/quantum
end script

#####################here######################
exec start-stop-daemon --start --chuid quantum --exec /usr/bin/quantum-dhcp-agent -- \
--config-file=/etc/quantum/quantum.conf \
--config-file=/etc/quantum/dhcp_agent.ini \
--log-file=/var/log/quantum/dhcp-agent.log

i hope this could help you!!!

Srikanth (srikanth-lingala) said : #9

Hi Lawrance,
Thanks. That fixed my issue. That error disappears from the DHCP Agent logs.
Now, I can be able to see 2 dnsmasqs running with the given subnet range in Controller Node.
I created a VM Instance, which is up and running. But, I am not able to get IP for that VM Instance.
Do I need to do anything to get IP for the VM Instance?

Regards,
Srikanth.

Srikhant,

Unfortunately I have not followed this thread from the beginning, but here are a few thing you might try out to troubleshoot your issue:

1) check connectivity - does the dhcp discover even arrive at the destination host? You can use tcpdump to this aim or even just give your vm a static address and try to ping the dhcp server (which is not the gateway).

If no connectivity is present, this might mean there's a network configuration issue in your setup. As you're using the OVS plugin, check the following:
- if using tunnelling verify the tunnel endpoints (local_ip) are correctly configured and that you can ping each endpoint from any other host in you rsetup
- if using vlans ensure the physical switches trunk vlans at least in the range you specified in the ovs plugin configuration file.

2) check dhcp leases - goto $state_path/dhcp and see if the port-id for your istance is listed in this folder. If not, this might imply the dhcp-agent has still issues retrieving info from the quantum server.

In any case, also check;
3) the firewall driver used. Ensure that if a firewall driver is used dhcp traffic is still allowed across the hosts.

Srikanth (srikanth-lingala) said : #11

Finally, VM able to get the IP.
I disabled 'iptables' and restarted 'quantum-dhcp-agent'. That fixed the issue.
Thanks for the help.. guys ...

Srikanth.

Adam Gandelman (gandelman-a) said : #12

Exactly what is the error in the /etc/init/quantum-dhcp-agent.upstart file that prevents the agent from reading its config properly? This should be fixed in packaging proper, and should probably have a bug report somewhere.

dan wendlandt (danwent) said : #13

Yes, I"m confused as to the config file change as well... can you provide more details?

What was the value of libvirt_vif_driver that you were using? The issue with the DHCP packets not getting through could be an issue with nova security groups getting an incorrect view of the dhcp server for the subnet. In some recent testing I saw a subnet get two different ports with owner 'quantum:dhcp', which would confuse the security group logic.

Srikanth (srikanth-lingala) said : #14

@Adam:
I got the following error in /var/log/quantum/agent-dhcp.log:

2012-09-17 13:54:17 INFO [quantum.agent.dhcp_agent] Synchronizing state
2012-09-17 13:54:17 DEBUG [quantum.openstack.common.rpc.amqp] Making asynchronous call on q-plugin ...
2012-09-17 13:54:17 DEBUG [quantum.openstack.common.rpc.amqp] MSG_ID is 2e7704be7914452b8b89d569e4e76f5d
2012-09-17 13:55:17 ERROR [quantum.openstack.common.rpc.common] Timed out waiting for RPC response: timed out
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 513, in ensure
    return method(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 590, in _consume
    return self.connection.drain_events(timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 175, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 238, in drain_events
    return connection.drain_events(**kwargs)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 57, in drain_events
    return self.wait_multi(self.channels.values(), timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 63, in wait_multi
    chanmap.keys(), allowed_methods, timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 120, in _wait_multiple
    channel, method_sig, args, content = read_timeout(timeout)
  File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 94, in read_timeout
    return self.method_reader.read_method()
  File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
    raise m
timeout: timed out
2012-09-17 13:55:17 DEBUG [amqplib] Closed channel #1
2012-09-17 13:55:17 DEBUG [amqplib] using channel_id: 1
2012-09-17 13:55:17 DEBUG [amqplib] Channel open
2012-09-17 13:55:17 ERROR [quantum.agent.dhcp_agent] Unable to sync network state.
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/quantum/agent/dhcp_agent.py", line 112, in sync_state
    active_networks = set(self.plugin_rpc.get_active_networks())
  File "/usr/lib/python2.7/dist-packages/quantum/agent/dhcp_agent.py", line 260, in get_active_networks
    topic=self.topic)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/proxy.py", line 80, in call
    return rpc.call(context, self._get_topic(topic), msg, timeout)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/__init__.py", line 102, in call
    return _get_impl().call(cfg.CONF, context, topic, msg, timeout)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 712, in call
    rpc_amqp.get_connection_pool(conf, Connection))
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/amqp.py", line 368, in call
    rv = list(rv)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/amqp.py", line 329, in __iter__
    self.done()
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/amqp.py", line 326, in __iter__
    self._iterator.next()
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 595, in iterconsume
    yield self.ensure(_error_callback, _consume)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 526, in ensure
    error_callback(e)
  File "/usr/lib/python2.7/dist-packages/quantum/openstack/common/rpc/impl_kombu.py", line 576, in _error_callback
    raise rpc_common.Timeout()
Timeout: Timeout while waiting on RPC response.

If I keep default command to start dhcp-agent in '/etc/init/quantum-dhcp-agent.conf', it would have started without any issues. But, I changed that command in '/etc/init/quantum-dhcp-agent.conf' by referring a PDF document to deploy Folsom by Emilien Macchi.
While installing Quantum DHCP Agent, I changed it from
'exec start-stop-daemon --start --chuid quantum --exec /usr/bin/quantum-dhcp-agent -- --config-file=/etc/quantum/quantum.conf --config-file=/etc/quantum/dhcp_agent.ini --log-file=/var/log/quantum/dhcp-agent.log'

to
'exec start-stop-daemon --start --chuid quantum --exec /usr/bin/quantum-dhcp-agent -- --config-file=/etc/quantum/dhcp_agent.ini --log-file=/var/log/quantum/agent-dhcp.log'

That's what I reverted back to get the issue fixed.

@Dan:
In my Nova configuration, I put the following entry:
libvirt_vif_driver=nova.virt.libvirt.vif.LibvirtOpenVswitchDriver

Regards,
Srikanth.