Unable to attach volume to a Xen PVM

Asked by Alvaro Lopez

Hi there.

Currently I'm running Diablo-3 (I am planning to upgrade my systems this week) and I am unable to attach a volume to a given node.

I am using Xen as the hypervisor, and nova is configured with "--libvirt_type=xen". Everything runs fine except when I try to attach a volume. The iSCSI target is accessed OK, but nova-compute throws the following exception:

    2011-10-05 17:08:00,574 ERROR nova [-] Exception during message handling
    (nova): TRACE: Traceback (most recent call last):
    (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/rpc.py", line 232, in _process_data
    (nova): TRACE: rval = node_func(context=ctxt, **node_args)
    (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/compute/manager.py", line 113, in decorated_function
    (nova): TRACE: function(self, context, instance_id, *args, **kwargs)
    (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/compute/manager.py", line 1096, in attach_volume
    (nova): TRACE: raise exc
    (nova): TRACE: Error: POST operation failed: xend_post: error from xen daemon: (xend.err 'Unable to find number for device (vdb)')
    (nova): TRACE:

If I look at xend.log, I see that the disk is passed as "qemu:/dev/disk/by-path/ip-(...)", which is indeed how the device is defined in "nova/virt/libvirt/connection.py":

    (...)
    if type == 'block':
        xml = """<disk type='block'>
                     <driver name='qemu' type='raw'/>
                     <source dev='%s'/>
                     <target dev='%s' bus='virtio'/>
                 </disk>""" % (device_path, mount_device)
    (...)

If I change the driver name to "phy" then I get the following error:

    2011-10-05 17:08:00,574 ERROR nova [-] Exception during message handling
    (nova): TRACE: Traceback (most recent call last):
    (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/rpc.py", line 232, in _process_data
    (nova): TRACE: rval = node_func(context=ctxt, **node_args)
    (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/compute/manager.py", line 113, in decorated_function
    (nova): TRACE: function(self, context, instance_id, *args, **kwargs)
    (nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/compute/manager.py", line 1096, in attach_volume
    (nova): TRACE: raise exc
    (nova): TRACE: Error: POST operation failed: xend_post: error from xen daemon: (xend.err 'Unable to find number for device (vdb)')
    (nova): TRACE:

I think this is due to the "/dev/vdb" device name that I am constrained to use with "euca-attach-volume".

If I run all the steps manually with the corrected values, everything runs smoothly.

Did I miss any configuration option?

Thank you very much,
Alvaro.

Question information

Language: English
Status: Expired
For: OpenStack Compute (nova)
Assignee: No assignee

Launchpad Janitor (janitor) said :
#1

This question was expired because it remained in the 'Open' state without activity for the last 15 days.

Pablo Llopis (pllopis) said :
#2

Hi Alvaro,

I am having the same problem, but I am still not able to attach volumes, even manually.

You posted the same trace twice, but I think I am having the same error.
I guess what you intended to show in the first trace was the exception 'Block device type "qemu" is invalid', is that right?

After changing "qemu" to "phy" I got the same exception 'Unable to find number for device (vdb)', but I don't understand what is going wrong.

I would be very happy if you could be more specific about what you did when you say "run all the steps manually with the corrected values". Which values have changed?
I guess that might have something to do with the device argument "/dev/vdb". Are you supplying a different path to euca-attach-volume -d (and what would that be)?

Thank you in advance!

Pablo

Pablo Llopis (pllopis) said :
#3

Hello again,

I just figured it out.
For anyone who might have run into the same problems, the solution is to patch that same file (nova/virt/libvirt/connection.py) and hardcode the target block device:

    <target dev='%s' bus='virtio'/>
    </disk>""" % (device_path, mount_device)

to

    <target dev='xvdc' bus='virtio'/>
    </disk>""" % (device_path)

since euca-attach-volume does not allow me to supply /dev/xvdc.

Instead of this, a cleaner hack is to remove the block device name check in the attach_volume method of nova/compute/api.py.

Just comment out the first 3 lines of code:

    def attach_volume(self, context, instance_id, volume_id, device):
        """Attach an existing volume to an existing instance."""
        #if not re.match("^/dev/[a-z]d[a-z]+$", device):
        #    raise exception.ApiError(_("Invalid device specified: %s. "
        #                               "Example device: /dev/vdb") % device)
        self.volume_api.check_attach(context, volume_id=volume_id)

This has the benefit of only requiring patching for the nova-api of the cloud controller, as far as I am able to tell.

(I should note that I am running Xen 4.0.1, since I did not manage to get Ubuntu's Xen 4.1 + Kernel 3.0 to work under our setup, and KVM is not an option due to the lack of hardware virtualization support.)

Vish Ishaya (vishvananda) said :
#4

Glad you found the root of this, but wouldn't the best fix be to modify the regex to allow for a prepended x or to replace /dev/x with /dev/ before doing the check if the type is set to xen?

I think we still want to verify that they aren't attaching to something like /dev/doohicky
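
A minimal sketch of the regex change suggested here, not an actual nova patch: the pattern from attach_volume gets an optional leading "x", so Xen-style names such as /dev/xvdc pass while arbitrary paths are still rejected. The names DEVICE_RE and is_valid_device are hypothetical.

```python
import re

# Original check in nova/compute/api.py: ^/dev/[a-z]d[a-z]+$
# Assumed variant: allow an optional "x" prefix for Xen names (xvdb, xvdc, ...).
DEVICE_RE = re.compile(r"^/dev/x?[a-z]d[a-z]+$")


def is_valid_device(device):
    """Return True for names like /dev/vdb, /dev/sdc or /dev/xvdc."""
    return DEVICE_RE.match(device) is not None
```

With this variant, /dev/vdb and /dev/xvdc are accepted and /dev/doohicky is still refused, so the check stays in place for every hypervisor type.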

On Oct 27, 2011, at 5:25 AM, Pablo Llopis wrote:

> Question #173319 on OpenStack Compute (nova) changed:
> https://answers.launchpad.net/nova/+question/173319

Pablo Llopis (pllopis) said :
#5

Hi Vish,

yes, you are right that modifying the regex would provide a better fix, and I agree that a verification is desirable.

However, I don't think it is only a matter of patching those 3 lines. I must clarify that I was mistaken in my previous post: you can't get away with just patching the initial check in the controller. Under my environment (and I would bet under Alvaro's as well), connection.py still needs to be patched by changing the hardcoded "qemu" device driver to "phy". I am not sure what implications this change would have for other environments, since I guess "qemu" was written there for a reason.
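
One way to limit the impact on other environments might be to pick the driver name from the configured libvirt type instead of hardcoding it. The sketch below is only an illustration under assumptions, not nova's code: block_disk_xml and its libvirt_type parameter are hypothetical, "phy" is simply the value that worked in this thread, and bus='virtio' is kept as in the snippet quoted earlier.

```python
from xml.sax.saxutils import quoteattr


def block_disk_xml(device_path, mount_device, libvirt_type):
    """Build a <disk> element for a block device (hypothetical helper)."""
    # Assumption: Xen needs the "phy" driver, everything else keeps "qemu".
    driver = "phy" if libvirt_type == "xen" else "qemu"
    # Only the basename (e.g. "xvdc") goes into the target element.
    target = mount_device.rpartition("/")[2]
    return (
        "<disk type='block'>\n"
        "  <driver name='%s' type='raw'/>\n"
        "  <source dev=%s/>\n"
        "  <target dev=%s bus='virtio'/>\n"
        "</disk>" % (driver, quoteattr(device_path), quoteattr(target))
    )
```

Calling it with libvirt_type="xen" yields the "phy" driver, while any other type keeps the existing "qemu" behaviour, so non-Xen setups would not see a change.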

Additionally, I found out that detaching volumes is not working either. I suppose that this is a related issue, but I can't verify it, since I have not figured out what is preventing the detach from succeeding.

euca-describe-volumes correctly identifies the device as /dev/xvdc, and I can access the /dev/xvdc block device from my instance just fine. With the block device "untouched" and unmounted, an attempt to run euca-detach-volume produces the following trace in the nova-compute log for the instance:

    2011-10-27 23:57:39,487 INFO nova.compute.manager [f73516db-4489-433f-82a4-3ab1e8236c0b novadmin nimbo] check_instance_lock: decorating: |<function _detach_volume at 0x17cfed8>|
    2011-10-27 23:57:39,488 INFO nova.compute.manager [f73516db-4489-433f-82a4-3ab1e8236c0b novadmin nimbo] check_instance_lock: arguments: |<nova.compute.manager.ComputeManager object at 0x164c650>| |<nova.rpc.impl_kombu.RpcContext object at 0x3617b10>| |79|
    2011-10-27 23:57:39,563 INFO nova.compute.manager [f73516db-4489-433f-82a4-3ab1e8236c0b novadmin nimbo] check_instance_lock: locked: |False|
    2011-10-27 23:57:39,563 INFO nova.compute.manager [f73516db-4489-433f-82a4-3ab1e8236c0b novadmin nimbo] check_instance_lock: admin: |True|
    2011-10-27 23:57:39,563 INFO nova.compute.manager [f73516db-4489-433f-82a4-3ab1e8236c0b novadmin nimbo] check_instance_lock: executing: |<function _detach_volume at 0x17cfed8>|
    2011-10-27 23:57:39,639 AUDIT nova.compute.manager [f73516db-4489-433f-82a4-3ab1e8236c0b novadmin nimbo] Detach volume 3 from mountpoint /dev/xvdc on instance 79
    2011-10-27 23:57:40,139 ERROR nova.exception [-] Uncaught exception
    (nova.exception): TRACE: Traceback (most recent call last):
    (nova.exception): TRACE: File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 98, in wrapped
    (nova.exception): TRACE: return f(*args, **kw)
    (nova.exception): TRACE: File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 396, in detach_volume
    (nova.exception): TRACE: virt_dom.detachDevice(xml)
    (nova.exception): TRACE: File "/usr/lib/python2.7/dist-packages/libvirt.py", line 353, in detachDevice
    (nova.exception): TRACE: if ret == -1: raise libvirtError ('virDomainDetachDevice() failed', dom=self)
    (nova.exception): TRACE: libvirtError: Unknown failure
    (nova.exception): TRACE:
    2011-10-27 23:57:40,140 ERROR nova.rpc [-] Exception during message handling
    (nova.rpc): TRACE: Traceback (most recent call last):
    (nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 620, in _process_data
    (nova.rpc): TRACE: rval = node_func(context=ctxt, **node_args)
    (nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1298, in detach_volume
    (nova.rpc): TRACE: return self._detach_volume(context, instance_id, volume_id, True)
    (nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 98, in wrapped
    (nova.rpc): TRACE: return f(*args, **kw)
    (nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 117, in decorated_function
    (nova.rpc): TRACE: function(self, context, instance_id, *args, **kwargs)
    (nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1288, in _detach_volume
    (nova.rpc): TRACE: volume_ref['mountpoint'])
    (nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 129, in wrapped
    (nova.rpc): TRACE: raise Error(str(e))
    (nova.rpc): TRACE: Error: Unknown failure
    (nova.rpc): TRACE:

Running xm block-detach manually seems to detach the device just fine, but this of course confuses nova, and I had to fix the nova database by hand because it was left in an inconsistent state.

Alvaro Lopez (aloga) said :
#6

Hi Pablo.

> You posted the same trace twice, but I think I am having the same error.
> I guess what you intended to show in the first trace was the exception 'Block device type "qemu" is invalid', is that right?

Yes, you're right, it was a copy-and-paste mistake.

> I would be very happy if you could be more specific about what you did when you say "run all the steps manually with the corrected values". Which values have changed?
> I guess that might have something to do with the device argument "/dev/vdb". Are you supplying a different path to euca-attach-volume -d (and what would that be)?

I meant that I used the intended values (that is, "phy" and "/dev/xvdX") and performed all the steps manually (that is, initiating an iSCSI session and attaching the disk to the VM), just to check that everything was running OK and that the problem was not in Xen or in the iSCSI target and initiators.
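
As a rough sketch of those manual steps, the snippet below only builds the two command lines and does not execute anything. The helper names are hypothetical, and the target IQN, portal, domain name, and device path are placeholders rather than values from this thread. iscsiadm and xm block-attach are shown in their usual forms, which may differ between versions.

```python
def iscsi_login_cmd(target_iqn, portal):
    """Command line to open an iSCSI session with open-iscsi's iscsiadm."""
    return ["iscsiadm", "-m", "node", "-T", target_iqn, "-p", portal, "--login"]


def xen_attach_cmd(domain, backend_dev, frontend_dev, mode="w"):
    """Command line to attach a block device to a Xen domain via xm."""
    # "phy:" is the backend type that worked in this thread (instead of "qemu").
    return ["xm", "block-attach", domain, "phy:" + backend_dev, frontend_dev, mode]


if __name__ == "__main__":
    # All values below are placeholders, not taken from the thread.
    print(" ".join(iscsi_login_cmd("iqn.2010-10.org.example:volume-1",
                                   "192.0.2.10:3260")))
    print(" ".join(xen_attach_cmd("instance-1",
                                  "/dev/disk/by-path/example-lun-1", "xvdc")))
```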

I have applied more or less the same hacks that you described, but I asked because, as you also pointed out, I do not know whether they have implications for environments other than mine.

I will have a look today at this issue again to see if we can shed some more light.

Cheers,
Alvaro.