Node not downloading image from Glance on Cloud Controller

Asked by Bertrand NOEL

Hello,
I have 1 cloud controller where is installed all services (nova+glance) and 3 nodes with just the compute service. See below :
================================================================
> euca-describe-availability-zones verbose

AVAILABILITYZONE nova available
AVAILABILITYZONE |- krypton.inrialpes.fr
AVAILABILITYZONE | |- nova-compute enabled :-) 2011-11-15 15:55:11
AVAILABILITYZONE | |- nova-network enabled :-) 2011-11-15 15:55:15
AVAILABILITYZONE | |- nova-scheduler enabled :-) 2011-11-15 15:55:16
AVAILABILITYZONE | |- nova-vncproxy enabled :-) 2011-11-15 15:55:17
AVAILABILITYZONE |- cadmium.inrialpes.fr
AVAILABILITYZONE | |- nova-compute enabled :-) 2011-11-15 15:55:18
AVAILABILITYZONE |- fermium.inrialpes.fr
AVAILABILITYZONE | |- nova-compute enabled :-) 2011-11-15 15:55:14
AVAILABILITYZONE |- molecule.inrialpes.fr
AVAILABILITYZONE | |- nova-compute enabled :-) 2011-11-15 15:55:15
================================================================
(krypton is 194.199.25.36)

Each install are Diablo 2011.3 on CentOS 6
My nova.conf on each machine is almost unmodified from default (only ip and verbose modified) :
================================================================
--verbose=true
--ec2_url=http://194.199.25.36:8773/services/Cloud
--s3_host=194.199.25.36
--cc_host=194.199.25.36
--rabbit_host=194.199.25.36
--sql_connection=mysql://nova:nova@194.199.25.36/nova
--glance_api_servers=194.199.25.36:9292
--use_s3=true
--libvirt_type=kvm
--use_syslog=false
--node_availability_zone=nova
--logdir=/var/log/nova
--logging_context_format_string=%(asctime)s %(name)s: %(levelname)s [%(request_id)s %(user)s %(project)s] %(message)s
--logging_default_format_string=%(asctime)s %(name)s: %(message)s
--logging_debug_format_suffix=
--use_cow_images=true
--auth_driver=nova.auth.dbdriver.DbDriver
--network_manager=nova.network.manager.VlanManager
--scheduler_driver=nova.scheduler.zone.ZoneScheduler
--image_service=nova.image.glance.GlanceImageService
--use_ipv6=false
--ca_path=/var/lib/nova/CA
--keys_path=/var/lib/nova/keys
--images_path=/var/lib/nova/images
--buckets_path=/var/lib/nova/buckets
--instances_path=/var/lib/nova/instances
--networks_path=/var/lib/nova/networks
--dhcpbridge_flagfile=/etc/nova/nova.conf
--dhcpbridge=/usr/bin/nova-dhcpbridge
--injected_network_template_dir=/usr/share/nova/interfaces/
--libvirt_xml_template=/usr/share/nova/libvirt.xml.template
--vpn_client_template=/usr/share/nova/client.ovpn.template
--credentials_template=/usr/share/nova/novarc.template
--state_path=/var/lib/nova
--lock_path=/var/lib/nova/tmp
--vnc_enabled=true
--vncproxy_url=http://194.199.25.36:6080
--vncserver_host=0.0.0.0
--vnc_token_ttl=300
--vlan_interface=eth0
================================================================

When I try to launch instances (images stored on Glance) :
- if the compute service is the one on the controller, instance is created.
- if instances are to be launch on nodes, VM stay on "pending" state.
Note : Ip is allocated in both cases.

Log on node are as follow :
================================================================
2011-11-15 16:39:40,782 nova.compute.manager: Updating host status
2011-11-15 16:41:10,943 nova.rpc: received {u'_context_roles': [], u'_context_request_id': u'13da7fb6-40ff-4c85-91ec-0fefb574ce20', u'_context_read_deleted': False, u'args': {u'request_spec': {u'num_instances': 1, u'image': {u'status': u'active', u'name': u'MachineTest', u'deleted': False, u'container_format': u'ami', u'created_at': u'2011-11-15 14:09:12.017433', u'disk_format': u'ami', u'updated_at': u'2011-11-15 14:09:12.368728', u'properties': {u'kernel_id': u'1', u'min_ram': u'0', u'ramdisk_id': u'2', u'min_disk': u'0', u'owner': None, u'type': u'machine'}, u'location': u'file:///var/lib/glance/images/3', u'checksum': u'10047a119149e08fb206eea89832eee0', u'is_public': True, u'deleted_at': None, u'id': 3, u'size': 25165824}, u'filter': None, u'instance_type': {u'rxtx_quota': 0, u'flavorid': 1, u'name': u'm1.tiny', u'deleted': False, u'created_at': None, u'updated_at': None, u'memory_mb': 512, u'vcpus': 1, u'rxtx_cap': 0, u'extra_specs': {}, u'swap': 0, u'deleted_at': None, u'id': 2, u'local_gb': 0}, u'blob': None, u'instance_properties': {u'vm_state': u'building', u'availability_zone': None, u'ramdisk_id': u'2', u'instance_type_id': 2, u'user_data': u'', u'vm_mode': None, u'reservation_id': u'r-y1rt8m7r', u'user_id': u'bertrand', u'display_description': None, u'key_data': u'ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAlfPhqBFvvtSpcJ6sbVjajuROOorAAwiszpNi8RyIqEyOMzbk+TlY2aHCAWkh0ajx9ayz97LqVzJfwvzylTo5oXKs4xieHYXXgyozpm2R9zutAuRswwPMQrpf60kw05Cz9JxDKtn/6WK64/NqROfikMhWpG78Id6h7k92rqu2/f8= <email address hidden>\n', u'power_state': 0, u'project_id': u'projettest', u'metadata': {}, u'access_ip_v6': None, u'access_ip_v4': None, u'kernel_id': u'1', u'key_name': u'CleTest', u'display_name': None, u'config_drive_id': u'', u'local_gb': 0, u'architecture': None, u'locked': False, u'launch_time': u'2011-11-15T15:41:10Z', u'memory_mb': 512, u'vcpus': 1, u'image_ref': 3, u'root_device_name': None, u'os_type': None, u'config_drive': u''}}, u'requested_networks': None, u'availability_zone': None, u'instance_id': 15, u'admin_password': None, u'injected_files': None}, u'_context_auth_token': None, u'_context_strategy': u'noauth', u'_context_is_admin': True, u'_context_project_id': u'projettest', u'_context_timestamp': u'2011-11-15T15:41:10.580369', u'_context_user_id': u'bertrand', u'method': u'run_instance', u'_context_remote_address': u'127.0.0.1'}
2011-11-15 16:41:10,943 nova.rpc: unpacked context: {'user_id': u'bertrand', 'roles': [], 'timestamp': u'2011-11-15T15:41:10.580369', 'auth_token': None, 'msg_id': None, 'remote_address': u'127.0.0.1', 'strategy': u'noauth', 'is_admin': True, 'request_id': u'13da7fb6-40ff-4c85-91ec-0fefb574ce20', 'project_id': u'projettest', 'read_deleted': False}
2011-11-15 16:41:11,218 nova.rpc: Making asynchronous call on network ...
2011-11-15 16:41:11,219 nova.rpc: MSG_ID is d4d7cde953dd4f609f89a0eedd8125ca
2011-11-15 16:41:12,504 nova.compute.manager: instance network_info: |[[{u'bridge': u'br100', u'multi_host': False, u'bridge_interface': u'eth0', u'vlan': 100, u'id': 1, u'injected': False, u'cidr': u'10.0.0.0/24', u'cidr_v6': None}, {u'should_create_bridge': True, u'dns': [], u'vif_uuid': u'479cda54-9b53-4203-9cbd-7a0489996220', u'label': u'private', u'broadcast': u'10.0.0.255', u'ips': [{u'ip': u'10.0.0.3', u'netmask': u'255.255.255.0', u'enabled': u'1'}], u'mac': u'02:16:3e:7c:80:f4', u'rxtx_cap': 0, u'should_create_vlan': True, u'dhcp_server': u'10.0.0.1', u'gateway': u'10.0.0.1'}]]|
2011-11-15 16:41:12,728 nova.virt.libvirt_conn: instance instance-0000000f: starting toXML method
2011-11-15 16:41:12,729 nova.virt.libvirt.vif: Ensuring vlan 100 and bridge br100
2011-11-15 16:41:12,729 nova.utils: Attempting to grab semaphore "ensure_vlan" for method "ensure_vlan"...
2011-11-15 16:41:12,729 nova.utils: Attempting to grab file lock "ensure_vlan" for method "ensure_vlan"...
2011-11-15 16:41:12,730 nova.utils: Running cmd (subprocess): ip link show dev vlan100
2011-11-15 16:41:12,743 nova.utils: Attempting to grab semaphore "ensure_bridge" for method "ensure_bridge"...
2011-11-15 16:41:12,744 nova.utils: Attempting to grab file lock "ensure_bridge" for method "ensure_bridge"...
2011-11-15 16:41:12,744 nova.utils: Running cmd (subprocess): ip link show dev br100
2011-11-15 16:41:12,759 nova.utils: Running cmd (subprocess): sudo brctl addif br100 vlan100
2011-11-15 16:41:12,781 nova.utils: Result was 1
2011-11-15 16:41:12,782 nova.utils: Running cmd (subprocess): sudo route -n
2011-11-15 16:41:12,804 nova.utils: Running cmd (subprocess): sudo ip addr show dev vlan100 scope global
2011-11-15 16:41:12,840 nova.virt.libvirt_conn: block_device_list []
2011-11-15 16:41:12,840 nova.virt.libvirt_conn: block_device_list []
2011-11-15 16:41:13,017 nova.virt.libvirt_conn: instance instance-0000000f: finished toXML method
2011-11-15 16:41:13,017 nova: called setup_basic_filtering in nwfilter
2011-11-15 16:41:13,018 nova: ensuring static filters
2011-11-15 16:41:13,061 nova.virt.libvirt.firewall: iptables firewall: Setup Basic Filtering
2011-11-15 16:41:13,061 nova.utils: Attempting to grab semaphore "iptables" for method "_do_refresh_provider_fw_rules"...
2011-11-15 16:41:40,945 nova.compute.manager: Updating host status
2011-11-15 16:41:41,031 nova.compute.manager: Found 1 in the database and 0 on the hypervisor.
2011-11-15 16:42:41,123 nova.compute.manager: Found 1 in the database and 0 on the hypervisor.
2011-11-15 16:43:41,124 nova.compute.manager: Updating host status
================================================================

Seems that network stuff is done.

There is however a weird thing in the line :
=====
2011-11-15 16:41:10,943 nova.rpc: unpacked context: {'user_id': u'bertrand', 'roles': [], 'timestamp': u'2011-11-15T15:41:10.580369', 'auth_token': None, 'msg_id': None, 'remote_address': u'127.0.0.1', 'strategy': u'noauth', 'is_admin': True, 'request_id': u'13da7fb6-40ff-4c85-91ec-0fefb574ce20', 'project_id': u'projettest', 'read_deleted': False}
=====
This "127.0.0.1"

Moreover, looking in /var/lib/nova/instances/_base/ :
- On the controller node, I can see 3 files, with size different from 0
- On nodes, nothing.

The weird thing is that I can do the following and see the file displaying :
curl http://194.199.25.36:9292/v1/images/1

Has anyone ever had this kind of error?

Question information

Language:
English Edit question
Status:
Solved
For:
OpenStack Compute (nova) Edit question
Assignee:
No assignee Edit question
Solved by:
Bertrand NOEL
Solved:
Last query:
Last reply:
Whiteboard:
Switching to Nova project since I believe this is an issue with the configuration of Keystone and Nova...
Revision history for this message
Bertrand NOEL (bertrand-noel-88) said :
#1

Using Wireshark, I can see that when nodes receive a notification to create an instance, they successfully make a HEAD request to the Glance server on port 9292. But they don't do the GET to download images whereas controller do it when it has to create VM. Very weird.

Revision history for this message
Bertrand NOEL (bertrand-noel-88) said :
#2

Is it mandatory to have Keystone installed just to have a basic cloud ? I have not installed it since it is not written in docs that it has to in order to run the basic cloud I want. I will dig in this direction.

Revision history for this message
Bertrand NOEL (bertrand-noel-88) said :
#3

The problem was with nova trying to apply iptables rules. Nova tried getting a semaphore on iptables by a file (/var/lib/nova/tmp/nova-iptables.lock.lock). The problem came because this file was present, it has not been deleted, probably because a previous time I stopped nova before it finished.