Adding a second compute node fails

Asked by Patryk Kaminski

I'm trying to install OpenStack on two host computers. The first one (192.168.64.66) acts as the controller (nova-api, nova-scheduler, nova-objectstore, nova-network) and also runs compute (nova-compute). The second host should run only compute (nova-compute and anything else that is required; nova-scheduler?).
I use the flat networking model. Both hosts run Ubuntu 10.04.2 x64.

I installed the nova packages from ppa:nova-core/release, and then used the https://github.com/dubsquared/OpenStack-NOVA-Installer-Script/raw/master/nova-CC-install-v1.1.sh script to configure the software on the first host.
I was able to start instances, SSH to them and access the network from within the instances. No issues here.

Next I installed the nova packages on the second host (also from ppa:nova-core/release) and then used the https://github.com/dubsquared/OpenStack-NOVA-Installer-Script/raw/master/nova-NODE-installer.sh script to configure the software.

After restarting the nova services on both computers, nova-manage service list reports the following:
asa-test-a nova-network enabled :-) 2011-04-15 01:18:00
asa-test-a nova-compute enabled :-) 2011-04-15 01:18:00
asa-test-a nova-scheduler enabled :-) 2011-04-15 01:18:00
asa-test-b nova-scheduler enabled XXX 2011-04-14 23:16:15
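
For anyone who wants to check a listing like this automatically, here is a minimal Python sketch. It assumes the six-column layout shown above (host, service, status, state marker, date, time) and assumes that XXX means the service has not checked in recently. The function names are made up for illustration and are not part of nova.

```python
def parse_service_list(text):
    """Turn `nova-manage service list` output into a list of dicts."""
    services = []
    for line in text.splitlines():
        parts = line.split()
        # Assumed layout, taken from the listing in this thread:
        # host, service, status, state marker, date, time
        if len(parts) != 6:
            continue
        host, service, status, state, date, time = parts
        services.append({
            "host": host,
            "service": service,
            "status": status,
            # Assumption: ":-)" means alive, anything else (e.g. "XXX") does not
            "alive": state == ":-)",
            "last_seen": date + " " + time,
        })
    return services


def stale_services(text):
    """Return (host, service) pairs whose state marker is not ':-)'."""
    return [(s["host"], s["service"])
            for s in parse_service_list(text) if not s["alive"]]


def unregistered(text, expected):
    """Return expected (host, service) pairs that have no row at all."""
    seen = {(s["host"], s["service"]) for s in parse_service_list(text)}
    return sorted(set(expected) - seen)


SAMPLE = """\
asa-test-a nova-network enabled :-) 2011-04-15 01:18:00
asa-test-a nova-compute enabled :-) 2011-04-15 01:18:00
asa-test-a nova-scheduler enabled :-) 2011-04-15 01:18:00
asa-test-b nova-scheduler enabled XXX 2011-04-14 23:16:15
"""

print(stale_services(SAMPLE))
print(unregistered(SAMPLE, [("asa-test-b", "nova-compute")]))
```

On this listing the first call reports the scheduler on asa-test-b, and the second shows that nova-compute on asa-test-b never registered.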

Clearly there is an issue with the nova-scheduler on the second host (asa-test-b). Do I need it at all since I already have the nova-scheduler running on the first host (asa-test-a)?
I do not see the nova-compute running on the second host. (log output later, below).
Should nova-network be running on the second host? It seems to fail (log output later, below).

------------------------------------------------------------------------------------
/etc/network/interfaces file from the first host (asa-test-a):
------------------------------------------------------------------------------------
auto lo
iface lo inet loopback

auto br100
iface br100 inet static
        bridge_ports eth0
        bridge_stp off
        bridge_maxwait 0
        bridge_fd 0
        address 192.168.64.66
        netmask 255.255.255.0
        broadcast 192.168.64.255
        gateway 192.168.64.1
        dns-nameservers 192.168.64.1

------------------------------------------------------------------------------------
/etc/network/interfaces file from the second host (asa-test-b):
------------------------------------------------------------------------------------
auto lo
iface lo inet loopback

auto br100
iface br100 inet static
        bridge_ports eth0
        bridge_stp off
        bridge_maxwait 0
        bridge_fd 0
        address 192.168.64.67
        netmask 255.255.255.0
        broadcast 192.168.64.255
        gateway 192.168.64.1
        dns-nameservers 192.168.64.1

------------------------------------------------------------------------------------
/etc/nova/nova.conf (identical on both hosts):
------------------------------------------------------------------------------------

--dhcpbridge_flagfile=/etc/nova/nova.conf
--dhcpbridge=/usr/bin/nova-dhcpbridge
--logdir=/var/log/nova
--state_path=/var/lib/nova
--lock_path=/var/lock/nova
--s3_host=192.168.64.66
--rabbit_host=192.168.64.66
--cc_host=192.168.64.66
--ec2_url=http://192.168.64.66:8773/services/Cloud
--fixed_range=192.168.64.0/24
--network_size=64
--FAKE_subdomain=ec2
--routing_source_ip=192.168.64.66
--verbose
--sql_connection=mysql://root:nova@192.168.64.66/nova
--network_manager=nova.network.manager.FlatManager

------------------------------------------------------------------------------------
The last few entries from the /var/log/nova/nova-compute.log file on asa-test-b:
------------------------------------------------------------------------------------

(nova.root): TRACE: Traceback (most recent call last):
(nova.root): TRACE: File "/usr/bin/nova-compute", line 44, in <module>
(nova.root): TRACE: service.serve()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 231, in serve
(nova.root): TRACE: x.start()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 81, in start
(nova.root): TRACE: self.manager.init_host()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/compute/manager.py", line 120, in init_host
(nova.root): TRACE: self.driver.init_host(host=self.host)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/virt/libvirt_conn.py", line 158, in init_host
(nova.root): TRACE: for instance in db.instance_get_all_by_host(ctxt, host):
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/db/api.py", line 355, in instance_get_all_by_host
(nova.root): TRACE: return IMPL.instance_get_all_by_host(context, host)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/db/sqlalchemy/api.py", line 97, in wrapper
(nova.root): TRACE: return f(*args, **kwargs)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/db/sqlalchemy/api.py", line 744, in instance_get_all_by_host
(nova.root): TRACE: filter_by(deleted=can_read_deleted(context)).\
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/orm/query.py", line 1453, in all
(nova.root): TRACE: return list(self)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/orm/query.py", line 1565, in __iter__
(nova.root): TRACE: return self._execute_and_instances(context)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/orm/query.py", line 1570, in _execute_and_instances
(nova.root): TRACE: mapper=self._mapper_zero_or_none())
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/orm/session.py", line 735, in execute
(nova.root): TRACE: clause, params or {})
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/engine/base.py", line 1157, in execute
(nova.root): TRACE: params)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/engine/base.py", line 1237, in _execute_clauseelement
(nova.root): TRACE: return self.__execute_context(context)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/engine/base.py", line 1268, in __execute_context
(nova.root): TRACE: context.parameters[0], context=context)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/engine/base.py", line 1367, in _cursor_execute
(nova.root): TRACE: context)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/engine/base.py", line 1360, in _cursor_execute
(nova.root): TRACE: context)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/engine/default.py", line 288, in do_execute
(nova.root): TRACE: cursor.execute(statement, parameters)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/MySQLdb/cursors.py", line 166, in execute
(nova.root): TRACE: self.errorhandler(self, exc, value)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/MySQLdb/connections.py", line 35, in defaulterrorhandler
(nova.root): TRACE: raise errorclass, errorvalue

(nova.root): TRACE: OperationalError: (OperationalError) (1054, "Unknown column 'instances.instance_type' in 'field list'") 'SELECT instances.created_at AS instances_created_at, instances.updated_at AS instances_updated_at, instances.deleted_at AS instances_deleted_at, instances.deleted AS instances_deleted, instances.id AS instances_id, instances.admin_pass AS instances_admin_pass, instances.user_id AS instances_user_id, instances.project_id AS instances_project_id, instances.image_id AS instances_image_id, instances.kernel_id AS instances_kernel_id, instances.ramdisk_id AS instances_ramdisk_id, instances.launch_index AS instances_launch_index, instances.key_name AS instances_key_name, instances.key_data AS instances_key_data, instances.state AS instances_state, instances.state_description AS instances_state_description, instances.memory_mb AS instances_memory_mb, instances.vcpus AS instances_vcpus, instances.local_gb AS instances_local_gb, instances.hostname AS instances_hostname, instances.host AS instances_host, instances.instance_type AS instances_instance_type, instances.user_data AS instances_user_data, instances.reservation_id AS instances_reservation_id, instances.mac_address AS instances_mac_address, instances.scheduled_at AS instances_scheduled_at, instances.launched_at AS instances_launched_at, instances.terminated_at AS instances_terminated_at, instances.availability_zone AS instances_availability_zone, instances.display_name AS instances_display_name, instances.display_description AS instances_display_description, instances.locked AS instances_locked, fixed_ips_1.created_at AS fixed_ips_1_created_at, fixed_ips_1.updated_at AS fixed_ips_1_updated_at, fixed_ips_1.deleted_at AS fixed_ips_1_deleted_at, fixed_ips_1.deleted AS fixed_ips_1_deleted, fixed_ips_1.id AS fixed_ips_1_id, fixed_ips_1.address AS fixed_ips_1_address, fixed_ips_1.network_id AS fixed_ips_1_network_id, fixed_ips_1.instance_id AS fixed_ips_1_instance_id, fixed_ips_1.allocated AS 
fixed_ips_1_allocated, fixed_ips_1.leased AS fixed_ips_1_leased, fixed_ips_1.reserved AS fixed_ips_1_reserved, floating_ips_1.created_at AS floating_ips_1_created_at, floating_ips_1.updated_at AS floating_ips_1_updated_at, floating_ips_1.deleted_at AS floating_ips_1_deleted_at, floating_ips_1.deleted AS floating_ips_1_deleted, floating_ips_1.id AS floating_ips_1_id, floating_ips_1.address AS floating_ips_1_address, floating_ips_1.fixed_ip_id AS floating_ips_1_fixed_ip_id, floating_ips_1.project_id AS floating_ips_1_project_id, floating_ips_1.host AS floating_ips_1_host, security_groups_1.created_at AS security_groups_1_created_at, security_groups_1.updated_at AS security_groups_1_updated_at, security_groups_1.deleted_at AS security_groups_1_deleted_at, security_groups_1.deleted AS security_groups_1_deleted, security_groups_1.id AS security_groups_1_id, security_groups_1.name AS security_groups_1_name, security_groups_1.description AS security_groups_1_description, security_groups_1.user_id AS security_groups_1_user_id, security_groups_1.project_id AS security_groups_1_project_id \nFROM instances LEFT OUTER JOIN fixed_ips AS fixed_ips_1 ON fixed_ips_1.instance_id = instances.id AND fixed_ips_1.deleted = %s LEFT OUTER JOIN floating_ips AS floating_ips_1 ON floating_ips_1.fixed_ip_id = fixed_ips_1.id AND floating_ips_1.deleted = %s LEFT OUTER JOIN security_group_instance_association AS security_group_instance_association_1 ON security_group_instance_association_1.instance_id = instances.id AND instances.deleted = %s LEFT OUTER JOIN security_groups AS security_groups_1 ON security_groups_1.id = security_group_instance_association_1.security_group_id AND security_group_instance_association_1.deleted = %s AND security_groups_1.deleted = %s \nWHERE instances.host = %s AND instances.deleted = %s' (False, False, False, False, False, 'asa-test-b', False)
(nova.root): TRACE:

------------------------------------------------------------------------------------
The last few entries from the /var/log/nova/nova-network.log file on asa-test-b:
------------------------------------------------------------------------------------

(nova.root): TRACE: Traceback (most recent call last):
(nova.root): TRACE: File "/usr/bin/nova-network", line 44, in <module>
(nova.root): TRACE: service.serve()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 231, in serve
(nova.root): TRACE: x.start()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 81, in start
(nova.root): TRACE: self.manager.init_host()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/manager.py", line 124, in init_host
(nova.root): TRACE: for network in self.db.host_get_networks(ctxt, self.host):
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/db/api.py", line 928, in host_get_networks
(nova.root): TRACE: return IMPL.host_get_networks(context, host)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/db/sqlalchemy/api.py", line 97, in wrapper
(nova.root): TRACE: return f(*args, **kwargs)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/db/sqlalchemy/api.py", line 1908, in host_get_networks
(nova.root): TRACE: filter_by(host=host).\
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/orm/query.py", line 1453, in all
(nova.root): TRACE: return list(self)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/orm/query.py", line 1565, in __iter__
(nova.root): TRACE: return self._execute_and_instances(context)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/orm/query.py", line 1570, in _execute_and_instances
(nova.root): TRACE: mapper=self._mapper_zero_or_none())
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/orm/session.py", line 735, in execute
(nova.root): TRACE: clause, params or {})
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/engine/base.py", line 1157, in execute
(nova.root): TRACE: params)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/engine/base.py", line 1237, in _execute_clauseelement
(nova.root): TRACE: return self.__execute_context(context)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/engine/base.py", line 1268, in __execute_context
(nova.root): TRACE: context.parameters[0], context=context)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/engine/base.py", line 1367, in _cursor_execute
(nova.root): TRACE: context)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/engine/base.py", line 1360, in _cursor_execute
(nova.root): TRACE: context)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/sqlalchemy/engine/default.py", line 288, in do_execute
(nova.root): TRACE: cursor.execute(statement, parameters)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/MySQLdb/cursors.py", line 166, in execute
(nova.root): TRACE: self.errorhandler(self, exc, value)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/MySQLdb/connections.py", line 35, in defaulterrorhandler
(nova.root): TRACE: raise errorclass, errorvalue
(nova.root): TRACE: OperationalError: (OperationalError) (1054, "Unknown column 'networks.ra_server' in 'field list'") 'SELECT networks.created_at AS networks_created_at, networks.updated_at AS networks_updated_at, networks.deleted_at AS networks_deleted_at, networks.deleted AS networks_deleted, networks.id AS networks_id, networks.injected AS networks_injected, networks.cidr AS networks_cidr, networks.cidr_v6 AS networks_cidr_v6, networks.ra_server AS networks_ra_server, networks.netmask AS networks_netmask, networks.bridge AS networks_bridge, networks.gateway AS networks_gateway, networks.broadcast AS networks_broadcast, networks.dns AS networks_dns, networks.vlan AS networks_vlan, networks.vpn_public_address AS networks_vpn_public_address, networks.vpn_public_port AS networks_vpn_public_port, networks.vpn_private_address AS networks_vpn_private_address, networks.dhcp_start AS networks_dhcp_start, networks.project_id AS networks_project_id, networks.host AS networks_host \nFROM networks \nWHERE networks.deleted = %s AND networks.host = %s' (False, 'asa-test-b')
(nova.root): TRACE:
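
Both tracebacks above end in MySQL error 1054 (unknown column), which means the code on asa-test-b expects columns that the shared database does not have. A small Python sketch like the following can pull the offending table and column names out of long log lines like these. The regular expression is written against the message format seen in these logs, the excerpts are shortened to the part before the SELECT statement, and the helper name is hypothetical.

```python
import re

# Matches the MySQL 1054 message as it appears in the logs above
UNKNOWN_COLUMN = re.compile(
    r"""\(1054, "Unknown column '(\w+)\.(\w+)' in 'field list'"\)"""
)


def missing_columns(log_text):
    """Return sorted, de-duplicated (table, column) pairs from 1054 errors."""
    return sorted(set(UNKNOWN_COLUMN.findall(log_text)))


# Shortened excerpts of the two error lines (SELECT text left out)
LOG_EXCERPTS = """\
(nova.root): TRACE: OperationalError: (OperationalError) (1054, "Unknown column 'instances.instance_type' in 'field list'")
(nova.root): TRACE: OperationalError: (OperationalError) (1054, "Unknown column 'networks.ra_server' in 'field list'")
"""

for table, column in missing_columns(LOG_EXCERPTS):
    print("table %s has no column %s" % (table, column))
```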

Thank you for any help.

P.

Question information

Language: English
Status: Solved
For: OpenStack Compute (nova)
Assignee: No assignee
Solved by: Dan Prince

Best Dan Prince (dan-prince) said :
#1

Hi Patryk,

From the looks of the stack trace, it appears that you are running a nova database schema for Cactus (or very close to it) with an older nova-compute client.

There were some changes to the instances table during the Cactus dev cycle. The instances.instance_type column has been replaced with instances.instance_type_id.

Can you verify the PPA package version on your machines by running dpkg -l and looking at the revision?
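
One way to make that comparison less error-prone is to save the `dpkg -l | grep nova` output from each host and diff the versions with a script. The Python sketch below is an illustration only: the helper names are hypothetical, the first sample uses the version string that shows up later in this thread, and the second sample's nova-compute version is a made-up placeholder, not a real package version.

```python
def parse_dpkg(text):
    """Map package name -> version for installed ('ii') packages."""
    versions = {}
    for line in text.splitlines():
        parts = line.split(None, 3)  # state, name, version, description
        if len(parts) >= 3 and parts[0] == "ii":
            versions[parts[1]] = parts[2]
    return versions


def version_mismatches(listing_a, listing_b):
    """Return {package: (version_a, version_b)} where the hosts disagree."""
    a, b = parse_dpkg(listing_a), parse_dpkg(listing_b)
    return {pkg: (a.get(pkg), b.get(pkg))
            for pkg in sorted(set(a) | set(b))
            if a.get(pkg) != b.get(pkg)}


HOST_A = """\
ii nova-common 2011.2-0ubuntu0ppa1~lucid1 OpenStack Compute - Nova - common files
ii nova-compute 2011.2-0ubuntu0ppa1~lucid1 OpenStack Compute - Nova - compute node
"""

# HYPOTHETICAL-OLDER-VERSION is a placeholder, not a real version string
HOST_B = """\
ii nova-common 2011.2-0ubuntu0ppa1~lucid1 OpenStack Compute - Nova - common files
ii nova-compute HYPOTHETICAL-OLDER-VERSION OpenStack Compute - Nova - compute node
"""

print(version_mismatches(HOST_A, HOST_B))
```

A package missing on one host shows up with None on that side, so a partial install is flagged as well.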

Hope this helps.

Dan

Patryk Kaminski (patryk-kaminski) said :
#2

Hi Dan,

Thank you for your reply.
You were right. I checked the PPA package versions, and found a mismatch.
I then reinstalled the nova packages. Running dpkg -l | grep nova on either host server returns:

ii nova-api 2011.2-0ubuntu0ppa1~lucid1 OpenStack Compute - Nova - API frontend
ii nova-common 2011.2-0ubuntu0ppa1~lucid1 OpenStack Compute - Nova - common files
ii nova-compute 2011.2-0ubuntu0ppa1~lucid1 OpenStack Compute - Nova - compute node
ii nova-doc 2011.2-0ubuntu0ppa1~lucid1 OpenStack Compute - Nova - documetation
ii nova-network 2011.2-0ubuntu0ppa1~lucid1 OpenStack Compute - Nova - Network thingamaj
ii nova-objectstore 2011.2-0ubuntu0ppa1~lucid1 OpenStack Compute - Nova - object store
ii nova-scheduler 2011.2-0ubuntu0ppa1~lucid1 OpenStack Compute - Nova - Scheduler
ii python-nova 2011.2-0ubuntu0ppa1~lucid1 OpenStack Compute - Nova - Python libraries
ii python-novaclient 2.4-0ubuntu1~lucid1 client library for OpenStack Compute API

After restarting both host servers, I get the following output from nova-manage service list:
asa-test-a nova-compute enabled :-) 2011-04-17 17:01:59
asa-test-a nova-network enabled :-) 2011-04-17 17:01:57
asa-test-a nova-scheduler enabled :-) 2011-04-17 17:01:57
asa-test-b nova-scheduler enabled XXX 2011-04-17 17:01:06
asa-test-b nova-network enabled XXX 2011-04-17 17:01:05
asa-test-b nova-compute enabled XXX 2011-04-17 17:01:11

I tried starting an instance, e.g.: euca-run-instances $emi -k mykey -t m1.tiny
I can see that it now tries to launch the instance (sometimes on the primary host, sometimes on the secondary host). Instances run fine on the first host (asa-test-a), but on the second host (asa-test-b) the instance status quickly changes from scheduling to launching to shutdown.
Despite the suggestion in the nova-compute.log, both machines have support for hardware virtualization, and I have successfully run kvm on both of them.

----------------------------------------------------------------------------------------------------
Here is the output from the nova-compute.log file on the second host server (asa-test-b) when it tries to launch an instance:
-----------------------------------------------------------------------------------------------------

2011-04-17 12:55:08,260 AUDIT nova.compute.manager [Q6TB2JI-BQUULBHOXPGI admin asa] instance 16: starting...
2011-04-17 12:55:08,959 INFO nova [-] called setup_basic_filtering in nwfilter
2011-04-17 12:55:08,960 INFO nova [-] ensuring static filters
2011-04-17 12:55:09,146 INFO nova.virt.libvirt_conn [-] instance instance-00000010: Creating image
2011-04-17 12:55:09,412 INFO nova.virt.libvirt_conn [-] instance instance-00000010: injecting key into image 1853417834
2011-04-17 12:55:09,412 INFO nova.virt.libvirt_conn [-] instance instance-00000010: injecting net into image 1853417834
2011-04-17 12:55:10,512 WARNING nova.virt.libvirt_conn [-] instance instance-00000010: ignoring error injecting data into image 1853417834 (Unexpected error
while running command.
Command: sudo tune2fs -c 0 -i 0 /dev/nbd15
Exit code: 1
Stdout: 'tune2fs 1.41.11 (14-Mar-2010)\n'
Stderr: "tune2fs: Invalid argument while trying to open /dev/nbd15\r\nCouldn't find valid filesystem superblock.\n")
2011-04-17 12:55:41,903 ERROR nova.exception [-] Uncaught exception
(nova.exception): TRACE: Traceback (most recent call last):
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.6/nova/exception.py", line 120, in _wrap
(nova.exception): TRACE: return f(*args, **kw)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.6/nova/virt/libvirt_conn.py", line 617, in spawn
(nova.exception): TRACE: domain = self._create_new_domain(xml)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.6/nova/virt/libvirt_conn.py", line 1079, in _create_new_domain
(nova.exception): TRACE: domain.createWithFlags(launch_flags)
(nova.exception): TRACE: File "/usr/lib/python2.6/dist-packages/libvirt.py", line 337, in createWithFlags
(nova.exception): TRACE: if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
(nova.exception): TRACE: libvirtError: internal error process exited while connecting to monitor: char device redirected to /dev/pts/3
(nova.exception): TRACE: qemu: could not load kernel '/var/lib/nova/instances/instance-00000010/kernel': Inappropriate ioctl for device
(nova.exception): TRACE:
(nova.exception): TRACE:
2011-04-17 12:55:41,905 ERROR nova.compute.manager [Q6TB2JI-BQUULBHOXPGI admin asa] Instance '16' failed to spawn. Is virtualization enabled in the BIOS?
(nova.compute.manager): TRACE: Traceback (most recent call last):
(nova.compute.manager): TRACE: File "/usr/lib/pymodules/python2.6/nova/compute/manager.py", line 234, in run_instance
(nova.compute.manager): TRACE: self.driver.spawn(instance_ref)
(nova.compute.manager): TRACE: File "/usr/lib/pymodules/python2.6/nova/exception.py", line 126, in _wrap
(nova.compute.manager): TRACE: raise Error(str(e))
(nova.compute.manager): TRACE: Error: internal error process exited while connecting to monitor: char device redirected to /dev/pts/3
(nova.compute.manager): TRACE: qemu: could not load kernel '/var/lib/nova/instances/instance-00000010/kernel': Inappropriate ioctl for device
(nova.compute.manager): TRACE:
(nova.compute.manager): TRACE:
2011-04-17 12:55:42,105 INFO nova.compute.manager [-] Found instance 'instance-00000010' in DB but no VM. State=5, so setting state to shutoff.

-----------------------------------------------------------------------------------------------------
Here is the output from the nova-network.log file on the second host server (asa-test-b):
-----------------------------------------------------------------------------------------------------
2011-04-17 12:14:01,534 AUDIT nova [-] Starting network node (version 2011.2-workspace:tarmac-20110415024701-a9bdb77vaatk99lh)
2011-04-17 12:14:01,777 INFO nova.rpc [-] Created 'network_fanout' fanout exchange with 'network' routing key

-----------------------------------------------------------------------------------------------------
Here is the output from the nova-scheduler.log file on the second host server (asa-test-b):
-----------------------------------------------------------------------------------------------------

2011-04-17 12:14:01,485 AUDIT nova [-] Starting scheduler node (version 2011.2-workspace:tarmac-20110415024701-a9bdb77vaatk99lh)
2011-04-17 12:14:01,729 INFO nova.rpc [-] Created 'scheduler_fanout' fanout exchange with 'scheduler' routing key

Patryk Kaminski (patryk-kaminski) said :
#3

Well, never mind. Looking at other posts on this forum, I realized that I cannot use local storage for multiple hosts.
I have installed glance and now I can start instances on either host server.

P.

Patryk Kaminski (patryk-kaminski) said :
#4

Thanks Dan Prince, that solved my question.