nova.compute.manager: ERROR .. Is virtualization enabled in the BIOS?

Asked by Liang

I've followed the instructions in http://wiki.openstack.org/NovaInstall/RHEL6Notes-Cactus, did a fresh install of RHEL6 on 2 SunFire x2270 M2 servers, and installed the rpms dated 4-25-2011. I was able to upload a custom-made Red Hat Enterprise Linux 6 image, but when I ran instances I got an error, "failed to spawn". In nova-compute.log I saw the following:
2011-04-28 11:13:48,958 nova.rpc: received {u'_context_request_id': u'UZV0MSI7QFC7JA2F049P', u'_context_read_deleted': False, u'args': {u'instance_id': 4, u'injected_files': None, u'availability_zone': None}, u'_context_is_admin': True, u'_context_timestamp': u'2011-04-28T18:13:25Z', u'_context_user': u'lwang', u'method': u'run_instance', u'_context_project': u'test', u'_context_remote_address': u'10.208.3.239'}
2011-04-28 11:13:48,958 nova.rpc: unpacked context: {'timestamp': u'2011-04-28T18:13:25Z', 'remote_address': u'10.208.3.239', 'project': u'test', 'is_admin': True, 'user': u'lwang', 'request_id': u'UZV0MSI7QFC7JA2F049P', 'read_deleted': False}
2011-04-28 11:13:49,015 nova.compute.manager: AUDIT [UZV0MSI7QFC7JA2F049P lwang test] instance 4: starting...
2011-04-28 11:13:49,355 nova.rpc: Making asynchronous call on network.tcscld01.aero.org ...
2011-04-28 11:13:49,356 nova.rpc: MSG_ID is 507d88b395d74ae795870695674c449c
2011-04-28 11:13:57,617 nova.utils: Attempting to grab semaphore "ensure_bridge" for method "ensure_bridge"...
2011-04-28 11:13:57,617 nova.utils: Attempting to grab file lock "ensure_bridge" for method "ensure_bridge"...
2011-04-28 11:13:57,617 nova.utils: Running cmd (subprocess): ip link show dev br100
2011-04-28 11:13:57,626 nova.utils: Running cmd (subprocess): sudo route -n
2011-04-28 11:13:57,643 nova.utils: Running cmd (subprocess): sudo ip addr show dev eth1 scope global
2011-04-28 11:13:57,661 nova.utils: Running cmd (subprocess): sudo brctl addif br100 eth1
2011-04-28 11:13:57,678 nova.utils: Result was 1
2011-04-28 11:13:57,739 nova.virt.libvirt_conn: instance instance-00000004: starting toXML method
2011-04-28 11:13:57,869 nova.virt.libvirt_conn: instance instance-00000004: finished toXML method
2011-04-28 11:13:57,955 nova: called setup_basic_filtering in nwfilter
2011-04-28 11:13:57,955 nova: ensuring static filters
2011-04-28 11:13:58,033 nova: <nova.db.sqlalchemy.models.SecurityGroupIngressRule object at 0x3409d10>
2011-04-28 11:13:58,033 nova: <nova.db.sqlalchemy.models.SecurityGroupIngressRule object at 0x3828c90>
2011-04-28 11:13:58,047 nova.utils: Attempting to grab semaphore "iptables" for method "apply"...
2011-04-28 11:13:58,047 nova.utils: Attempting to grab file lock "iptables" for method "apply"...
2011-04-28 11:13:58,060 nova.utils: Running cmd (subprocess): sudo iptables-save -t filter
2011-04-28 11:13:58,077 nova.utils: Running cmd (subprocess): sudo iptables-restore
2011-04-28 11:13:58,096 nova.utils: Running cmd (subprocess): sudo iptables-save -t nat
2011-04-28 11:13:58,115 nova.utils: Running cmd (subprocess): sudo iptables-restore
2011-04-28 11:13:58,178 nova.utils: Running cmd (subprocess): mkdir -p /var/lib/nova/instances/instance-00000004/
2011-04-28 11:13:58,187 nova.virt.libvirt_conn: instance instance-00000004: Creating image
2011-04-28 11:13:58,324 nova.utils: Attempting to grab semaphore "00000002" for method "call_if_not_exists"...
2011-04-28 11:13:58,324 nova.utils: Running cmd (subprocess): cp /var/lib/nova/instances/_base/00000002 /var/lib/nova/instances/instance-00000004/kernel
2011-04-28 11:13:58,370 nova.utils: Attempting to grab semaphore "00000001" for method "call_if_not_exists"...
2011-04-28 11:13:58,370 nova.utils: Running cmd (subprocess): cp /var/lib/nova/instances/_base/00000001 /var/lib/nova/instances/instance-00000004/ramdisk
2011-04-28 11:13:58,455 nova.utils: Attempting to grab semaphore "00000004" for method "call_if_not_exists"...
2011-04-28 11:13:58,597 nova.exception: Uncaught exception
(nova.exception): TRACE: Traceback (most recent call last):
(nova.exception): TRACE: File "/usr/lib/python2.6/site-packages/nova/exception.py", line 120, in _wrap
(nova.exception): TRACE: return f(*args, **kw)
(nova.exception): TRACE: File "/usr/lib/python2.6/site-packages/nova/virt/libvirt_conn.py", line 616, in spawn
(nova.exception): TRACE: self._create_image(instance, xml, network_info)
(nova.exception): TRACE: File "/usr/lib/python2.6/site-packages/nova/virt/libvirt_conn.py", line 867, in _create_image
(nova.exception): TRACE: size=size)
(nova.exception): TRACE: File "/usr/lib/python2.6/site-packages/nova/virt/libvirt_conn.py", line 776, in _cache_image
(nova.exception): TRACE: call_if_not_exists(base, fn, *args, **kwargs)
(nova.exception): TRACE: File "/usr/lib/python2.6/site-packages/nova/utils.py", line 607, in inner
(nova.exception): TRACE: retval = f(*args, **kwargs)
(nova.exception): TRACE: File "/usr/lib/python2.6/site-packages/nova/virt/libvirt_conn.py", line 774, in call_if_not_exists
(nova.exception): TRACE: fn(target=base, *args, **kwargs)
(nova.exception): TRACE: File "/usr/lib/python2.6/site-packages/nova/virt/libvirt_conn.py", line 787, in _fetch_image
(nova.exception): TRACE: images.fetch(image_id, target, user, project)
(nova.exception): TRACE: File "/usr/lib/python2.6/site-packages/nova/virt/images.py", line 51, in fetch
(nova.exception): TRACE: metadata = image_service.get(elevated, image_id, image_file)
(nova.exception): TRACE: File "/usr/lib/python2.6/site-packages/nova/image/glance.py", line 117, in get
(nova.exception): TRACE: image_meta, image_chunks = self.client.get_image(image_id)
(nova.exception): TRACE: File "/usr/lib/python2.6/site-packages/glance/client.py", line 224, in get_image
(nova.exception): TRACE: res = self.do_request("GET", "/images/%s" % image_id)
(nova.exception): TRACE: File "/usr/lib/python2.6/site-packages/glance/client.py", line 143, in do_request
(nova.exception): TRACE: res = c.getresponse()
(nova.exception): TRACE: File "/usr/lib64/python2.6/httplib.py", line 986, in getresponse
(nova.exception): TRACE: response.begin()
(nova.exception): TRACE: File "/usr/lib64/python2.6/httplib.py", line 391, in begin
(nova.exception): TRACE: version, status, reason = self._read_status()
(nova.exception): TRACE: File "/usr/lib64/python2.6/httplib.py", line 355, in _read_status
(nova.exception): TRACE: raise BadStatusLine(line)
(nova.exception): TRACE: BadStatusLine
(nova.exception): TRACE:
2011-04-28 11:13:58,599 nova.compute.manager: ERROR [UZV0MSI7QFC7JA2F049P lwang test] Instance '4' failed to spawn. Is virtualization enabled in the BIOS?
(nova.compute.manager): TRACE: Traceback (most recent call last):
(nova.compute.manager): TRACE: File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 236, in run_instance
(nova.compute.manager): TRACE: self.driver.spawn(instance_ref)
(nova.compute.manager): TRACE: File "/usr/lib/python2.6/site-packages/nova/exception.py", line 126, in _wrap
(nova.compute.manager): TRACE: raise Error(str(e))
(nova.compute.manager): TRACE: Error
(nova.compute.manager): TRACE:

Here are my config files:

/etc/nova/nova.conf:
--verbose=true
--ec2_url=http://10.208.3.239:8773/services/Cloud
--s3_host=10.208.3.239
--cc_host=10.208.3.239
--rabbit_host=10.208.3.239
--sql_connection=mysql://nova:nova@10.208.3.239/nova
--glance_port=9292
--glance_host=10.208.3.239
--my_ip=10.208.3.239
--use_s3=true
--libvirt_type=kvm
--use_syslog=false
--node_availability_zone=tcscld
--logdir=/var/log/nova
--logging_context_format_string=%(asctime)s %(name)s: %(levelname)s [%(request_id)s %(user)s %(project)s] %(message)s
--logging_default_format_string=%(asctime)s %(name)s: %(message)s
--logging_debug_format_suffix=
--use_cow_images=true
--auth_driver=nova.auth.dbdriver.DbDriver
--network_manager=nova.network.manager.FlatDHCPManager
--scheduler_driver=nova.scheduler.zone.ZoneScheduler
--dhcpbridge_flagfile=/etc/nova/nova.conf
--dhcpbridge=/usr/bin/nova-dhcpbridge
--image_service=nova.image.glance.GlanceImageService
--use_ipv6=false
--ca_path=/var/lib/nova/CA
--keys_path=/var/lib/nova/keys
--images_path=/var/lib/nova/images
--buckets_path=/var/lib/nova/buckets
--instances_path=/var/lib/nova/instances
--networks_path=/var/lib/nova/networks
--injected_network_template=/usr/share/nova/interfaces.rhel.template
--libvirt_xml_template=/usr/share/nova/libvirt.xml.template
--vpn_client_template=/usr/share/nova/client.ovpn.template
--credentials_template=/usr/share/nova/novarc.template
--state_path=/var/lib/nova
--flat_interface=eth1
--flat_injected=true
--flat_network_dhcp_start=10.0.0.2
--public_interface=eth0
--fixed_range=10.0.0.0/12
--network_size=32
--vnc_enabled=true
--vncproxy_url=http://10.208.3.239:6080
--vncserver_host=0.0.0.0
--vnc_token_ttl=300
--lock_path=/var/lib/nova/tmp
--quota_instances=40

ip a sh:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:21:28:77:6e:1c brd ff:ff:ff:ff:ff:ff
    inet 10.208.3.238/23 brd 10.208.3.255 scope global eth0
    inet6 fe80::221:28ff:fe77:6e1c/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:21:28:77:6e:1d brd ff:ff:ff:ff:ff:ff
    inet6 fe80::221:28ff:fe77:6e1d/64 scope link
       valid_lft forever preferred_lft forever
4: br100: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 00:21:28:77:6e:1d brd ff:ff:ff:ff:ff:ff
    inet6 fe80::221:28ff:fe77:6e1d/64 scope link
       valid_lft forever preferred_lft forever
5: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 02:cf:56:2c:3f:08 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0

brctl show
bridge name     bridge id           STP enabled     interfaces
br100           8000.002128776e1d   no              eth1
virbr0          8000.000000000000   yes

nova-manage service list
tcscld01.aero.org nova-compute enabled :-) 2011-04-28 19:33:44
tcscld01.aero.org nova-network enabled :-) 2011-04-28 19:33:47
tcscld01.aero.org nova-scheduler enabled :-) 2011-04-28 19:33:42
tcscld01.aero.org nova-volume enabled :-) 2011-04-28 19:33:42
tcscld01.aero.org nova-vncproxy enabled :-) 2011-04-28 19:33:43
tcscld02.aero.org nova-vncproxy enabled :-) 2011-04-28 19:33:52
tcscld02.aero.org nova-compute enabled :-) 2011-04-28 19:33:52

nova-manage project quota list
metadata_items: 128
gigabytes: 1000
floating_ips: 10
instances: 40
volumes: 10
cores: 20

I was able to run instances using the same customized RHEL6 images with the same installation procedure but pointing to the 2011.1-2 Bexar repo.
With that previous installation, running instances might fail to survive a host system reboot, and someone suggested using Cactus.

Any help is really appreciated. Unfortunately, our systems only allow running Red Hat Enterprise Linux 6; I hope I can get this installation up and running so I can roll it out to a much larger installation base.

Thanks in Advance

A. Wang

Question information

Language: English
Status: Solved
For: OpenStack Compute (nova)
Assignee: No assignee
Solved by: Everett Toews
Liang (anzadragon) said:
#1

Does anyone have a similar situation to mine? I still can't get it resolved.

Best Everett Toews (everett-toews) said:
#2

Looks like the problem is actually coming from Glance. Are you using Swift as your backend for Glance?

Check /var/log/glance/glance-api.log on your Glance server. You might see a message like,

    raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])

If that's the problem, it's already recorded in this bug https://bugs.launchpad.net/glance/+bug/771849

If you work around the bug, don't forget to

    rm -f /var/lib/nova/instances/_base/*

on all the compute nodes where this happened to clear out the cached 0 byte image files.
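As an illustration of why that cleanup is needed: the failed fetch leaves 0-byte files in the `_base` cache, which Nova would otherwise keep reusing. The sketch below is a slightly more surgical variant of the `rm -f` above, shown against a throwaway demo directory standing in for /var/lib/nova/instances/_base (the filenames are hypothetical):

```python
import os
import tempfile

# Demo directory standing in for /var/lib/nova/instances/_base.
base = tempfile.mkdtemp()
open(os.path.join(base, "00000004"), "w").close()       # truncated 0-byte download
with open(os.path.join(base, "00000002"), "w") as f:
    f.write("cached image data")                        # healthy cache entry

# Remove only the zero-byte entries; nova-compute will re-fetch
# the image from Glance the next time an instance is spawned.
for name in os.listdir(base):
    full = os.path.join(base, name)
    if os.path.isfile(full) and os.path.getsize(full) == 0:
        os.remove(full)

print(sorted(os.listdir(base)))  # ['00000002']
```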

Everett

Liang (anzadragon) said:
#3

Everett,

Thanks for the pointer.
My problem did not actually come from Glance itself, but it was related, so I did not see the message you mentioned in glance-api.log.
Since I can't directly access the problem system, I built another set of systems with the exact procedures I provided, and this new OpenStack setup does not have the problem.
After a thorough comparison, the cause was found: glance.sqlite on the problem system was on an NFS volume, and the file could never release its lock (a manual check by running sqlite3 glance.sqlite always produced something like "Error: database is locked").
After moving the file off the NFS volume, the problem went away.
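For reference, the "database is locked" symptom is easy to reproduce locally. This minimal sketch (using a throwaway database, not the real glance.sqlite) holds an exclusive lock in one connection, as a stale NFS lock would, and then tries to read from another:

```python
import os
import sqlite3
import tempfile

# Throwaway database standing in for glance.sqlite (hypothetical path).
path = os.path.join(tempfile.mkdtemp(), "glance.sqlite")

# First connection grabs an exclusive lock, as a stale NFS lock would.
holder = sqlite3.connect(path, isolation_level=None)  # autocommit mode
holder.execute("CREATE TABLE images (id INTEGER)")
holder.execute("BEGIN EXCLUSIVE")  # the lock is acquired immediately

# Second connection; timeout=0 makes it fail fast instead of retrying.
reader = sqlite3.connect(path, timeout=0)
try:
    reader.execute("SELECT * FROM images").fetchall()
except sqlite3.OperationalError as e:
    print(e)  # database is locked
```

With the database on a healthy local filesystem the lock is released as soon as the holder commits; on NFS, broken or stale file locking can leave every later access stuck in this error.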

A. Wang