Can't do live migration on x86 RHEL 6.3 because of the error "libvirtError: internal error unable to send file handle 'migrate': No file descriptor supplied via SCM_RIGHTS"

Asked by Hua Zhang

Hi all,

        I encountered a problem when doing live migration on x86 RHEL 6.3: it fails with the error "libvirtError: internal error unable to send file handle 'migrate': No file descriptor supplied via SCM_RIGHTS". Any input will be highly appreciated.

        I have two nodes, node1 and node2.

[olympics@node1 ~]$ nova-manage service list
Binary Host Zone Status State Updated_At
nova-compute node1 nova enabled :-) 2012-11-26 11:22:59
nova-scheduler node1 nova enabled :-) 2012-11-26 11:19:02
nova-network node1 nova enabled :-) 2012-11-26 11:19:04
nova-compute node2 nova enabled :-) 2012-11-26 11:18:58

        I can deploy a VM on either host; I have deployed a VM on node2:
[olympics@node1 ~]$ nova show a1085ace-6ed4-40d7-87d1-8b9728468369
+-------------------------------------+----------------------------------------------------------------+
| Property | Value |
+-------------------------------------+----------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-SRV-ATTR:host | node2 |
| OS-EXT-SRV-ATTR:hypervisor_hostname | node2 |
| OS-EXT-SRV-ATTR:instance_name | instance-00000014 |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | active |
| accessIPv4 | |
| accessIPv6 | |
| config_drive | |
| created | 2012-11-26T10:53:58Z |
| flavor | m1.tiny (1) |
| hostId | 11ffcc6fd02287054ac39ace83987636d53d6ace8d11f593cd4a6a1c |
| id | a1085ace-6ed4-40d7-87d1-8b9728468369 |
| image | cirros-0.3.0-x86_64-uec (fe480628-d78e-4145-9347-a80036040cd7) |
| key_name | None |
| metadata | {} |
| name | i1 |
| private network | 10.0.1.5 |
| progress | 0 |
| security_groups | [{u'name': u'default'}] |
| status | ACTIVE |
| tenant_id | fb7f0b7b239c4df99bafea46f2d972f4 |
| updated | 2012-11-26T10:55:49Z |
| user_id | 75e7e5e64fd24d4aa06308f3ef2539f3 |
+-------------------------------------+----------------------------------------------------------------+

Now I want to live-migrate the VM from node2 to node1:

[olympics@node1 ~]$ nova live-migration a1085ace-6ed4-40d7-87d1-8b9728468369 node1

[olympics@node1 ~]$ nova list
+--------------------------------------+------+-----------+------------------+
| ID | Name | Status | Networks |
+--------------------------------------+------+-----------+------------------+
| a1085ace-6ed4-40d7-87d1-8b9728468369 | i1 | MIGRATING | private=10.0.1.5 |
+--------------------------------------+------+-----------+------------------+

But the VM can't be migrated to node1, and the nova-compute log on node2 shows the following errors:

2012-11-23 23:14:56 DEBUG nova.openstack.common.rpc.amqp [-] Making asynchronous call on compute.node1 ... from (pid=28587) multicall /home/olympics/source/nova/nova/openstack/common/rpc/amqp.py:351
2012-11-23 23:14:56 DEBUG nova.openstack.common.rpc.amqp [-] MSG_ID is 5a09f9c62ddb40bb845d04712d0e523c from (pid=28587) multicall /home/olympics/source/nova/nova/openstack/common/rpc/amqp.py:354
libvir: QEMU error : internal error unable to send file handle 'migrate': No file descriptor supplied via SCM_RIGHTS
2012-11-23 23:15:39 ERROR nova.virt.libvirt.driver [-] [instance: 25fc1b2e-fb86-4ca7-8cb7-773e316c502a] Live Migration failure: internal error unable to send file handle 'migrate': No file descriptor supplied via SCM_RIGHTS

2012-11-23 23:15:39 DEBUG nova.utils [-] Got semaphore "compute_resources" for method "update_usage"... from (pid=28587) inner /home/olympics/source/nova/nova/utils.py:713
2012-11-23 23:15:39 DEBUG nova.openstack.common.rpc.amqp [-] Making asynchronous call on network ... from (pid=28587) multicall /home/olympics/source/nova/nova/openstack/common/rpc/amqp.py:351
2012-11-23 23:15:39 DEBUG nova.openstack.common.rpc.amqp [-] MSG_ID is 2430d93d708743cfa710283c0cb3c017 from (pid=28587) multicall /home/olympics/source/nova/nova/openstack/common/rpc/amqp.py:354
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/eventlet/hubs/poll.py", line 97, in wait
    readers.get(fileno, noop).cb(fileno)
  File "/usr/lib/python2.6/site-packages/eventlet/greenthread.py", line 192, in main
    result = function(*args, **kwargs)
  File "/home/olympics/source/nova/nova/virt/libvirt/driver.py", line 2493, in _live_migration
    recover_method(ctxt, instance_ref, dest, block_migration)
  File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
    self.gen.next()
  File "/home/olympics/source/nova/nova/virt/libvirt/driver.py", line 2487, in _live_migration
    FLAGS.live_migration_bandwidth)
  File "/usr/lib/python2.6/site-packages/eventlet/tpool.py", line 187, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/eventlet/tpool.py", line 147, in proxy_call
    rv = execute(f,*args,**kwargs)
  File "/usr/lib/python2.6/site-packages/eventlet/tpool.py", line 76, in tworker
    rv = meth(*args,**kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1039, in migrateToURI
    if ret == -1: raise libvirtError ('virDomainMigrateToURI() failed', dom=self)
libvirtError: internal error unable to send file handle 'migrate': No file descriptor supplied via SCM_RIGHTS

Removing descriptor: 19

[root@node1]# vi /var/log/libvirt/qemu/instance-0000000d.log
2012-11-23 15:15:27.030+0000: starting up
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/bin/qemu-system-x86_64 -S -M pc-0.12 -m 512 -smp 1,sockets=1,cores=1,threads=1 -name instance-0000000d -uuid 25fc1b2e-fb86-4ca7-8cb7-773e316c502a -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-0000000d.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=readline -rtc base=utc -boot c -kernel /home/olympics/source/data/nova/instances/instance-0000000d/kernel -initrd /home/olympics/source/data/nova/instances/instance-0000000d/ramdisk -append root=/dev/vda console=ttyS0 -drive file=/home/olympics/source/data/nova/instances/instance-0000000d/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -device rtl8139,vlan=0,id=net0,mac=fa:16:3e:6e:1a:5c,bus=pci.0,addr=0x3 -net tap,fd=30,vlan=0,name=hostnet0 -chardev file,id=charserial0,path=/home/olympics/source/data/nova/instances/instance-0000000d/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -usb -device usb-tablet,id=input0 -vnc 127.0.0.1:2 -k en-us -vga cirrus -incoming tcp:0.0.0.0:49154 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
Domain id=5 is tainted: high-privileges
char device redirected to /dev/pts/17
load of migration failed

I use a non-root user (olympics) to do live migration. To support it, I have done the following three steps:

1) Set up NFS shared storage for the instances directory

[olympics@node1 ~]$ mount
9.123.106.28:/home/olympics/source/share/instances on /home/olympics/source/data/nova/instances type nfs (rw,addr=9.123.106.28)
[olympics@node1 ~]$ ll /home/olympics/source/data/nova/instances/
total 8
drwxrwxr-x 2 olympics olympics 4096 Nov 26 18:54 _base
drwxrwxr-x 2 olympics olympics 4096 Nov 26 18:59 instance-00000014

[olympics@node2 ~]$ mount
9.123.106.28:/ on /home/olympics/source/data/nova/instances type nfs4 (rw,addr=9.123.106.28,clientaddr=9.123.106.30)
[olympics@node2 ~]$ ll /home/olympics/source/data/nova/instances/
total 8
drwxrwxr-x 2 olympics olympics 4096 Nov 26 18:54 _base
drwxrwxr-x 2 olympics olympics 4096 Nov 26 18:59 instance-00000014
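For live migration without block migration, the instances directory on both nodes must sit on the same shared filesystem. A small sketch for checking what /proc/mounts reports for a given mount point (the path is the one from the listings above; this is a convenience check, not part of nova):

```python
def mount_fstype(path):
    """Return the filesystem type /proc/mounts reports for a mount point,
    or None if the path is not a mount point."""
    with open("/proc/mounts") as f:
        for line in f:
            fields = line.split()
            if len(fields) >= 3 and fields[1] == path:
                return fields[2]
    return None

# On each node, the instances path should report an NFS type ("nfs" or "nfs4"):
# mount_fstype("/home/olympics/source/data/nova/instances")
```

Note that in the listings above node1 mounts the share as nfs and node2 as nfs4; a check like this makes such differences easy to spot.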

2) On both node1 and node2, update the libvirt configuration. Modify /etc/libvirt/libvirtd.conf:
listen_tls = 0
listen_tcp = 1
tcp_port = "16509"
auth_tcp = "none"

Modify /etc/sysconfig/libvirtd:
LIBVIRTD_ARGS="--listen"

3) Modify /etc/libvirt/qemu.conf so QEMU runs as the olympics user, and give that user ownership of the relevant directories:
user = "olympics"

sudo chown -R olympics /home/olympics/source/data/nova/instances
sudo chown -R olympics /var/cache/libvirt
sudo chown -R olympics /var/cache/glance
sudo chown -R olympics /var/lib/libvirt
sudo chown -R olympics /var/run/libvirt
sudo chown -R olympics /etc/libvirt

sudo chown -R olympics /var/run/libvirtd.pid
sudo chown -R olympics /var/run/libvirt/libvirt-sock*

Finally, restart the libvirtd service: sudo service libvirtd restart

Question information

Language: English
Status: Solved
For: OpenStack Compute (nova)
Solved by: Hua Zhang
Davanum Srinivas (DIMS) (dims-v) said:
#1

Bug #1005557 was closed as invalid, FYI.

Hua Zhang (zhhuabj) said:
#2

I have fixed this issue by upgrading QEMU to 0.15. It is a flaw in QEMU that was fixed upstream, as described here: http://www.redhat.com/archives/libvirt-users/2010-June/msg00053.html
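For context: as I understand it, libvirt hands QEMU an already-open file descriptor for the 'migrate' stream over the UNIX monitor socket, using the SCM_RIGHTS ancillary-data mechanism, and the error means the old QEMU monitor failed to accept it. A minimal Python illustration of SCM_RIGHTS descriptor passing itself (this is an illustration of the mechanism, not libvirt's actual code):

```python
import array
import os
import socket

def send_fd(sock, fd):
    """Send one byte of real data plus fd as SCM_RIGHTS ancillary data."""
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                           array.array("i", [fd]))])

def recv_fd(sock):
    """Receive the byte and extract the passed file descriptor."""
    msg, ancdata, flags, addr = sock.recvmsg(
        1, socket.CMSG_LEN(array.array("i", [0]).itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            return array.array("i", data)[0]
    raise RuntimeError("no file descriptor supplied via SCM_RIGHTS")

# Demonstration over a socketpair, analogous to libvirt's monitor socket:
parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
read_end, write_end = os.pipe()
send_fd(parent, read_end)
passed = recv_fd(child)        # same pipe, new descriptor number
os.write(write_end, b"migrate")
print(os.read(passed, 7))      # the receiver reads through the passed fd
```

The RuntimeError branch corresponds to the failure mode in the log: the receiver gets the message but no descriptor arrives in the ancillary data.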