issues observed with multi-node setup

Asked by dan wendlandt

Lately I've been running devstack with quantum in a multi-node setup.

The controller node has pretty much everything enabled:

stackrc:

ENABLED_SERVICES=g-api,g-reg,key,n-api,n-crt,n-obj,n-cpu,n-net,n-vol,n-sch,n-novnc,n-xvnc,n-cauth,horizon,mysql,rabbit,quantum,q-agt,q-svc

I can then add multiple extra compute nodes that run only nova-compute and the quantum agent. They point to the "controller node" for all services:

stackrc:

ENABLED_SERVICES=n-cpu,quantum,q-agt

localrc:

SERVICE_HOST=172.28.33.6
MYSQL_HOST=$SERVICE_HOST
RABBIT_HOST=$SERVICE_HOST

This works great, but I've noticed two quirks that I wanted to ask about.

Issue #1

I noticed that when I do a clean install on a compute node and specify only "ENABLED_SERVICES=n-cpu,quantum,q-agt", nova-compute fails because it cannot find some glance libraries (glance.common.*). Presumably it's this reference:

./nova/image/glance.py:from glance.common import exception as glance_exception

This was a week or two ago, but it seems the glance dependencies were not being installed if the glance service was not running on the node. Specifying g-api as one of the services at least once seemed sufficient to get the libraries pulled down, after which I no longer needed to specify it. Still, it seems like a busted dependency.

Issue #2

This is more recent. If I specify only "ENABLED_SERVICES=n-cpu,quantum,q-agt" on a compute node, "nova-manage service list" shows that the nova-compute on the compute-only node is not phoning home correctly. When I used screen -x stack to access the console, I saw that nova-compute was trying to reach localhost, even though my localrc had the RABBIT_HOST variable set correctly.

I think the root cause is the following line in stack.sh:

if is_service_enabled rabbit ; then
    add_nova_opt "rabbit_host=$RABBIT_HOST"
    add_nova_opt "rabbit_password=$RABBIT_PASSWORD"

My understanding is that the "rabbit" service need only run once per deployment (i.e., on my controller node), but this code means that unless I enable it on each node, rabbit_host is never written to the nova config and nova-compute assumes rabbit is on localhost. I can work around this by specifying rabbit as an enabled service on each of my machines and using RABBIT_HOST to point nova-compute to a single centralized server, but it seems borked, so I thought I would mention it.
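To make the idea concrete, here is a minimal sketch of one way the condition could be changed, so that the rabbit client options depend on a nova service being enabled rather than on the rabbit server running locally. This is not necessarily what any actual devstack patch does. The is_service_enabled and add_nova_opt functions below are simplified stand-ins written for this example, and the NOVA_CONF temp file and the password value are placeholders.

```shell
# Simplified stand-ins for the devstack helpers (assumptions, not the
# real stack.sh definitions), so the sketch runs on its own.
ENABLED_SERVICES=n-cpu,quantum,q-agt
NOVA_CONF=$(mktemp)

is_service_enabled() {
    # Succeeds if $1 appears in the comma-separated ENABLED_SERVICES.
    case ",$ENABLED_SERVICES," in
        *",$1,"*) return 0 ;;
        *) return 1 ;;
    esac
}

add_nova_opt() {
    # Appends one option line to the (temporary) nova config file.
    echo "$1" >> "$NOVA_CONF"
}

# Placeholder values; in devstack these come from localrc.
RABBIT_HOST=172.28.33.6
RABBIT_PASSWORD=secret

# Key the rabbit client options on a nova service being enabled,
# not on the rabbit server running on this node.
if is_service_enabled n-cpu; then
    add_nova_opt "rabbit_host=$RABBIT_HOST"
    add_nova_opt "rabbit_password=$RABBIT_PASSWORD"
fi

cat "$NOVA_CONF"
```

With the compute-only service list, the options are still written even though rabbit itself is not enabled on the node.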

Thanks!

Question information

Language:
English
Status:
Solved
For:
devstack
Assignee:
No assignee
Solved by:
dan wendlandt
dan wendlandt (danwent) said :
#1

Btw, just a note that the above multi-node configuration only works once you have the code from my review here: https://review.openstack.org/#/c/7001/

Kyle Mestery (mestery) said :
#2

The situation is even worse if you try to use qpid instead of rabbitmq with devstack. With qpid, a host running only compute defaults to connecting to qpid locally. I added code to devstack that writes "qpid_hostname=" into nova.conf to fix this. I will propose this fix back to devstack.

John Garbutt (johngarbutt) said :
#3

I have just patched issue #2; hopefully this sorts part of the issue:
https://review.openstack.org/#/c/7757/

I have certainly seen #1. I just added g-api for the moment, but I should look at patching that too.

I think you have to configure the glance and keystone address as well as Rabbit and MySQL for the compute worker nodes:
http://wiki.openstack.org/XenServer/GettingStarted#Steps_to_add_a_second_compute_node

dan wendlandt (danwent) said :
#4

Thanks for the patches.

Btw, keystone and glance default to using $SERVICE_HOST, so as long as you have that set correctly to the IP address of your "controller node", you're fine without setting those explicitly.
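For anyone unfamiliar with how that kind of defaulting works, here is a rough sketch using shell parameter expansion. The variable names GLANCE_HOSTPORT and KEYSTONE_SERVICE_HOST and the 9292 port are assumptions for illustration; check your copy of stack.sh for the exact names it uses.

```shell
# Start from a clean environment so the defaults below are visible.
unset GLANCE_HOSTPORT KEYSTONE_SERVICE_HOST

# The one value set in localrc on the compute node.
SERVICE_HOST=172.28.33.6

# ${VAR:-default} keeps VAR if it is already set and non-empty,
# otherwise it falls back to the controller address.
# (Variable names and the port are assumptions for this example.)
GLANCE_HOSTPORT=${GLANCE_HOSTPORT:-$SERVICE_HOST:9292}
KEYSTONE_SERVICE_HOST=${KEYSTONE_SERVICE_HOST:-$SERVICE_HOST}

echo "glance:   $GLANCE_HOSTPORT"
echo "keystone: $KEYSTONE_SERVICE_HOST"
```

Setting either variable explicitly in localrc would take precedence over the SERVICE_HOST fallback.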