Couldn't ping or SSH to the instance, seems like VM booting issue

Asked by Tushar Patil

I see some errors in the syslog on the compute node. The instance status shows as running, but both SSH and ping to the instance fail. I have made sure that the security group allows access on port 22 and ICMP.

root@ubuntu-network-api-server:/home/openstack/nova# euca-describe-instances
RESERVATION r-mb1nxjho project-1
INSTANCE i-00000013 ami-vdfo75za 172.16.0.4 172.16.0.4 running test (project-1, ubuntu-compute-02) 0 m1.tiny 2011-01-18 20:24:33 nova

root@ubuntu-network-api-server:/home/openstack/nova# euca-describe-groups
GROUP project-1 default default
PERMISSION project-1 default ALLOWS tcp 22 22 FROM CIDR 0.0.0.0/0
PERMISSION project-1 default ALLOWS icmp -1 -1 FROM CIDR 0.0.0.0/0

/var/log/syslog
Jan 18 12:32:24 ubuntu-compute-02 libvirtd: 12:32:24.886: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Jan 18 12:32:24 ubuntu-compute-02 kernel: [676352.295833] type=1503 audit(1295382744.890:49): operation="open" pid=28383 parent=22720 profile="/usr/lib/libvirt/virt-aa-helper" requested_mask="::r" denied_mask="::r" fsuid=0 ouid=104 name="/var/lib/nova/instances/_base/ami-vdfo75za_sm"
Jan 18 12:32:25 ubuntu-compute-02 kernel: [676352.508359] type=1505 audit(1295382745.102:50): operation="profile_load" pid=28384 name="libvirt-1735a216-316d-b8a6-2f6c-1ececc4e5e86"
Jan 18 12:32:25 ubuntu-compute-02 libvirtd: 12:32:25.202: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Jan 18 12:32:25 ubuntu-compute-02 kernel: [676352.608935] device vnet0 entered promiscuous mode
Jan 18 12:32:25 ubuntu-compute-02 kernel: [676352.609367] br100: port 2(vnet0) entering forwarding state
Jan 18 12:32:35 ubuntu-compute-02 kernel: [676363.204522] vnet0: no IPv6 routers present

nova-compute.log doesn't show any errors.

Any clue?

Question information

Language: English
Status: Solved
For: OpenStack Compute (nova)
Assignee: No assignee
Solved by: Tushar Patil

Vish Ishaya (vishvananda) said :
#1

You are getting unique ip addresses. Are you using flat networking and standard ubuntu images? If so your instance might be stuck trying to hit the metadata server. You can check this by getting console output on the instance and see if it reports a failure to hit metadata over and over. If so it will time out after about 20 minutes and you will be able to hit it. For flat networking, you either need to disable cloud init, or modify your gateway to forward to api so that instances can retrieve metadata.

Tushar Patil (tpatil) said :
#2

I am using VLAN networking and standard Ubuntu images. Unfortunately, I don't see any console output either.

root@ubuntu-network-api-server:/home/openstack/nova# euca-get-console-output i-00000013
i-00000013
2011-01-18 20:54:33.322383

I also ran tcpdump -i br100 -n port 67 or port 68 on both the network node and the compute node, but I don't see any traffic on br100.

network-node

root@ubuntu-network-api-server:/# ps aux | grep dnsmasq
nobody 19102 0.0 0.0 4144 900 ? S Jan17 0:00 dnsmasq --strict-order --bind-interfaces --conf-file= --pid-file=/var/lib/nova/networks/nova-br100.pid --listen-address=172.16.0.1 --except-interface=lo --dhcp-range=172.16.0.3,static,120s --dhcp-hostsfile=/var/lib/nova/networks/nova-br100.conf --dhcp-script=/home/naved/nova/bin/nova-dhcpbridge --leasefile-ro
root 19103 0.0 0.0 4144 256 ? S Jan17 0:00 dnsmasq --strict-order --bind-interfaces --conf-file= --pid-file=/var/lib/nova/networks/nova-br100.pid --listen-address=172.16.0.1 --except-interface=lo --dhcp-range=172.16.0.3,static,120s --dhcp-hostsfile=/var/lib/nova/networks/nova-br100.conf --dhcp-script=/home/naved/nova/bin/nova-dhcpbridge --leasefile-ro
root 20155 0.0 0.0 3320 816 pts/6 S+ 12:57 0:00 grep --color=auto dnsmasq

brctl show output on network-node

root@ubuntu-network-api-server:/home/openstack/nova# brctl show
bridge name bridge id STP enabled interfaces
br100 8000.0026b981379c no vlan100

brctl show output on compute-node
root@ubuntu-compute-02:/var/log# brctl show
bridge name bridge id STP enabled interfaces
br100 8000.0026b98149ee no vlan100
                                                        vnet0

Tushar Patil (tpatil) said :
#3

log from libvirt

/var/log/libvirt/qemu/instance-00000013.log
--------------------------------------------------------
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin /usr/bin/kvm -S -M pc-0.12 -enable-kvm -m 512 -smp 1,sockets=1,cores=1,threads=1 -name instance-00000013 -uuid 0ade0002-78a6-1bb3-773f-39296d44c782 -nographic -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/instance-00000016.monitor,server,nowait -mon chardev=monitor,mode=readline -rtc base=utc -boot c -kernel /var/lib/nova/instances/instance-00000016/kernel -initrd /var/lib/nova/instances/instance-00000013/ramdisk -append root=/dev/vda console=ttyS0 -drive file=/var/lib/nova/instances/instance-00000013/disk,if=none,id=drive-virtio-disk0,boot=on,format=qcow2 -device virtio-blk-pci,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0 -device rtl8139,vlan=0,id=net0,mac=02:16:3e:61:bb:09,bus=pci.0,addr=0x2 -net tap,fd=34,vlan=0,name=hostnet0 -chardev file,id=serial0,path=/var/lib/nova/instances/instance-00000013/console.log -device isa-serial,chardev=serial0 -chardev pty,id=serial1 -device isa-serial,chardev=serial1 -usb -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4
char device redirected to /dev/pts/3

libvirt.xml
-----------
<domain type='kvm'>
    <name>instance-00000013</name>
    <memory>524288</memory>
    <os>
        <type>hvm</type>
        <kernel>/var/lib/nova/instances/instance-00000013/kernel</kernel>
        <cmdline>root=/dev/vda console=ttyS0</cmdline>
        <initrd>/var/lib/nova/instances/instance-00000013/ramdisk</initrd>
    </os>
    <features>
        <acpi/>
    </features>
    <vcpu>1</vcpu>
    <devices>
        <disk type='file'>
            <driver type='qcow2'/>
            <source file='/var/lib/nova/instances/instance-00000013/disk'/>
            <target dev='vda' bus='virtio'/>
        </disk>
        <interface type='bridge'>
            <source bridge='br100'/>
            <mac address='02:16:3e:61:bb:09'/>
            <!-- <model type='virtio'/> CANT RUN virtio network right now -->
            <filterref filter="nova-instance-instance-00000013">
                <parameter name="IP" value="172.16.0.4" />
                <parameter name="DHCPSERVER" value="172.16.0.1" />
                <parameter name="RASERVER" value="fd00::" />
                <parameter name="PROJNET" value="172.16.0.0" />
                <parameter name="PROJMASK" value="255.255.255.224" />
            </filterref>
        </interface>

        <!-- The order is significant here. File must be defined first -->
        <serial type="file">
            <source path='/var/lib/nova/instances/instance-00000013/console.log'/>
            <target port='1'/>
        </serial>

        <console type='pty' tty='/dev/pts/2'>
            <source path='/dev/pts/2'/>
            <target port='0'/>
        </console>

        <serial type='pty'>
            <source path='/dev/pts/2'/>
            <target port='0'/>
        </serial>

    </devices>
</domain>

Vish Ishaya (vishvananda) said :
#4

Hmm, this looks like it could be an AppArmor issue. A patch that just merged may help, although I'm not sure why the instance is reporting as running. Please try again with the new trunk now that the patch has merged. If it doesn't help, we may have to revisit the AppArmor issues.

Vish
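
For anyone checking their own compute node for the same symptom, here is a minimal sketch that pulls AppArmor denials out of syslog lines like the ones quoted in the question. It assumes the key="value" layout shown in those audit lines; the function name apparmor_denials is made up for illustration and is not part of nova or libvirt.

```python
import re

# Matches key="value" pairs as they appear in the kernel audit lines.
PAIR = re.compile(r'(\w+)="([^"]*)"')


def apparmor_denials(lines):
    """Yield (profile, path, denied_mask) for audit lines that carry a denied_mask."""
    for line in lines:
        fields = dict(PAIR.findall(line))
        if "denied_mask" in fields:
            yield fields.get("profile"), fields.get("name"), fields["denied_mask"]


# Two lines copied from the syslog excerpt in the question.
sample = [
    'Jan 18 12:32:24 ubuntu-compute-02 kernel: [676352.295833] type=1503 '
    'audit(1295382744.890:49): operation="open" pid=28383 parent=22720 '
    'profile="/usr/lib/libvirt/virt-aa-helper" requested_mask="::r" '
    'denied_mask="::r" fsuid=0 ouid=104 '
    'name="/var/lib/nova/instances/_base/ami-vdfo75za_sm"',
    'Jan 18 12:32:25 ubuntu-compute-02 kernel: [676352.608935] '
    'device vnet0 entered promiscuous mode',
]

denials = list(apparmor_denials(sample))
for profile, path, mask in denials:
    print("%s denied %s on %s" % (profile, mask, path))
```

In practice the lines would come from open("/var/log/syslog") rather than a hard-coded list.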

Tushar Patil (tpatil) said :
#5

I observe the same problem with the latest trunk Revision 577.

/var/log/syslog

Jan 18 16:01:04 ubuntu-compute-02 libvirtd: 16:01:04.594: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Jan 18 16:01:04 ubuntu-compute-02 kernel: [688842.613618] type=1503 audit(1295395264.598:58): operation="open" pid=31117 parent=22724 profile="/usr/lib/libvirt/virt-aa-helper" requested_mask="::r" denied_mask="::r" fsuid=0 ouid=104 name="/var/lib/nova/instances/_base/ami-vdfo75za_sm"
Jan 18 16:01:04 ubuntu-compute-02 kernel: [688842.826514] type=1505 audit(1295395264.810:59): operation="profile_load" pid=31118 name="libvirt-16942e24-78a7-51fd-c0f7-2c06c411a495"
Jan 18 16:01:04 ubuntu-compute-02 libvirtd: 16:01:04.926: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Jan 18 16:01:04 ubuntu-compute-02 kernel: [688842.942459] device vnet0 entered promiscuous mode
Jan 18 16:01:04 ubuntu-compute-02 kernel: [688842.942814] br100: port 2(vnet0) entering forwarding state

euca-get-console-output doesn't show any console output.

/var/log/libvirt/qemu/instance-00000008.log
--------------------------------------------------------
root@ubuntu-compute-02:/var/log/libvirt/qemu# vi instance-00000008.log
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin /usr/bin/kvm -S -M pc-0.12 -enable-kvm -m 512 -smp 1,sockets=1,cores=1,threads=1 -name instance-00000008 -uuid 16942e24-78a7-51fd-c0f7-2c06c411a495 -nographic -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/instance-00000008.monitor,server,nowait -mon chardev=monitor,mode=readline -rtc base=utc -boot c -kernel /var/lib/nova/instances/instance-00000008/kernel -initrd /var/lib/nova/instances/instance-00000008/ramdisk -append root=/dev/vda console=ttyS0 -drive file=/var/lib/nova/instances/instance-00000008/disk,if=none,id=drive-virtio-disk0,boot=on,format=qcow2 -device virtio-blk-pci,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0 -device rtl8139,vlan=0,id=net0,mac=02:16:3e:24:07:8d,bus=pci.0,addr=0x2 -net tap,fd=34,vlan=0,name=hostnet0 -chardev file,id=serial0,path=/var/lib/nova/instances/instance-00000008/console.log -device isa-serial,chardev=serial0 -chardev pty,id=serial1 -device isa-serial,chardev=serial1 -usb -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4
char device redirected to /dev/pts/2

Tushar Patil (tpatil) said :
#6

I am getting an "nbd15: unknown partition table" message in the syslog.

/var/log/syslog
--------------------
Jan 18 16:14:07 ubuntu-compute-01 kernel: [337973.471306] nbd15: unknown partition table
Jan 18 16:14:08 ubuntu-compute-01 kernel: [337975.170330] nbd15: NBD_DISCONNECT
Jan 18 16:14:08 ubuntu-compute-01 kernel: [337975.182412] nbd15: queue cleared
Jan 18 16:14:08 ubuntu-compute-01 libvirtd: 16:14:08.870: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Jan 18 16:14:08 ubuntu-compute-01 kernel: [337975.286341] type=1503 audit(1295396048.873:32): operation="open" pid=6159 parent=1226 profile="/usr/lib/libvirt/virt-aa-helper" requested_mask="::r" denied_mask="::r" fsuid=0 ouid=104 name="/var/lib/nova/instances/_base/ami-vdfo75za_sm"
Jan 18 16:14:09 ubuntu-compute-01 kernel: [337975.497634] type=1505 audit(1295396049.086:33): operation="profile_load" pid=6160 name="libvirt-740ac8da-97ed-c5cd-238d-aad70cd9f752"
Jan 18 16:14:09 ubuntu-compute-01 libvirtd: 16:14:09.202: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Jan 18 16:14:09 ubuntu-compute-01 kernel: [337975.615222] device vnet0 entered promiscuous mode
Jan 18 16:14:09 ubuntu-compute-01 kernel: [337975.615662] br100: port 2(vnet0) entering forwarding state

Vish Ishaya (vishvananda) said :
#7

The nbd message isn't a problem. I'll try to repro the app-armor issue. You can probably work around the other issue by changing qemu to not use apparmor.
Change line ~90 in /etc/libvirt/qemu.conf from:
# security_driver = "selinux"
to:
security_driver = "None"

Vish
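
To confirm which setting is actually in effect after editing the file, a small sketch like the following can help. It only assumes the simple key = "value" layout of qemu.conf with # comments; active_security_driver is a hypothetical helper name, not a libvirt API.

```python
def active_security_driver(conf_text):
    """Return the value of the last uncommented security_driver line, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip comments and anything that is not a key = value assignment.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, rhs = line.partition("=")
        if key.strip() == "security_driver":
            value = rhs.strip().strip('"')
    return value


# Sample text mirroring the two lines from the suggestion.
sample_conf = '# security_driver = "selinux"\nsecurity_driver = "None"\n'
print(active_security_driver(sample_conf))
```

On a real host the text would be read from /etc/libvirt/qemu.conf, and libvirt has to be restarted before the change takes effect.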

Tushar Patil (tpatil) said :
#8

I made the changes you suggested, but I still see the same problem.

After restarting the libvirt-bin service, I see the following messages in /var/log/syslog:

Jan 18 16:50:46 ubuntu-compute-02 libvirtd: 16:50:46.900: warning : qemudDispatchSignalEvent:396 : Shutting down on signal 15
Jan 18 16:50:47 ubuntu-compute-02 kernel: [691818.023204] virbr0: starting userspace STP failed, starting kernel STP
Jan 18 16:50:47 ubuntu-compute-02 libvirtd: 16:50:47.052: warning : networkAddIptablesRules:850 : Could not add rule to fixup DHCP response checksums on network 'default'.
Jan 18 16:50:47 ubuntu-compute-02 libvirtd: 16:50:47.052: warning : networkAddIptablesRules:851 : May need to update iptables package & kernel to support CHECKSUM rule.
Jan 18 16:50:47 ubuntu-compute-02 dnsmasq[31522]: started, version 2.52 cachesize 150
Jan 18 16:50:47 ubuntu-compute-02 dnsmasq[31522]: compile time options: IPv6 GNU-getopt DBus I18N DHCP TFTP
Jan 18 16:50:47 ubuntu-compute-02 dnsmasq-dhcp[31522]: DHCP, IP range 192.168.122.2 -- 192.168.122.254, lease time 1h
Jan 18 16:50:47 ubuntu-compute-02 dnsmasq[31522]: reading /etc/resolv.conf
Jan 18 16:50:47 ubuntu-compute-02 dnsmasq[31522]: using nameserver 216.98.98.4#53
Jan 18 16:50:47 ubuntu-compute-02 dnsmasq[31522]: using nameserver 216.98.98.20#53
Jan 18 16:50:47 ubuntu-compute-02 dnsmasq[31522]: read /etc/hosts - 7 addresses
Jan 18 16:50:47 ubuntu-compute-02 libvirtd: 16:50:47.351: warning : qemudStartup:1846 : Unable to create cgroup for driver: No such device or address
Jan 18 16:50:47 ubuntu-compute-02 kernel: [691818.456353] lo: Disabled Privacy Extensions
Jan 18 16:50:47 ubuntu-compute-02 libvirtd: 16:50:47.471: warning : lxcStartup:1895 : Unable to create cgroup for driver: No such device or address
Jan 18 16:50:57 ubuntu-compute-02 kernel: [691828.613947] virbr0: no IPv6 routers present
Jan 18 16:51:06 ubuntu-compute-02 libvirtd: 16:51:06.279: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Jan 18 16:51:22 ubuntu-compute-02 kernel: [691853.159004] nbd15: unknown partition table
Jan 18 16:51:24 ubuntu-compute-02 kernel: [691855.037997] nbd15: NBD_DISCONNECT
Jan 18 16:51:24 ubuntu-compute-02 kernel: [691855.038163] nbd15: Receive control failed (result -32)
Jan 18 16:51:24 ubuntu-compute-02 kernel: [691855.049089] nbd15: queue cleared
Jan 18 16:51:24 ubuntu-compute-02 libvirtd: 16:51:24.223: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Jan 18 16:51:24 ubuntu-compute-02 libvirtd: 16:51:24.331: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Jan 18 16:51:24 ubuntu-compute-02 kernel: [691855.258707] device vnet0 entered promiscuous mode
Jan 18 16:51:24 ubuntu-compute-02 kernel: [691855.259071] br100: port 2(vnet0) entering forwarding state
Jan 18 16:51:34 ubuntu-compute-02 kernel: [691865.327560] vnet0: no IPv6 routers present

Tushar Patil (tpatil) said :
#9

I have reinstalled Ubuntu 10.04 and nova revision 577.

Now I can see the VM instance console log. It shows that the instance is sending DHCP discover requests but does not seem to receive the DHCP reply from the network node.

I also verified that the requests reach the network node and that it sends replies.

root@ubuntu-network-api-server:/home/naved# tcpdump -i br100 -n port 67 or port 68
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br100, link-type EN10MB (Ethernet), capture size 96 bytes
17:40:22.173505 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:40:22.173976 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304
17:40:45.445643 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:40:45.446189 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304
17:40:48.450517 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:40:48.450979 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304
17:40:51.455879 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:40:51.456340 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304
17:41:14.728254 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:41:14.728683 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304
17:41:17.732965 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:41:17.733507 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304
17:41:20.737477 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:41:20.738014 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304
17:41:44.012282 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:41:44.012844 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304
17:41:47.017241 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:41:47.017776 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304
17:41:50.021915 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:41:50.022475 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304
17:42:13.298704 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:42:13.299245 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304
17:42:16.303712 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:42:16.304180 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304
17:42:19.308511 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:61:1a:bd, length 280
17:42:19.308984 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, Reply, length 304

My question is: why can't the VM instance get its IP address?
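
A long capture like the one above is easier to read when summarized. The sketch below counts DHCP requests per client MAC and replies per (server, client IP) pair. It assumes the non-verbose tcpdump line format shown in this capture; summarize_dhcp is a made-up name for illustration.

```python
import re

REQUEST = re.compile(r"BOOTP/DHCP, Request from ([0-9a-f:]{17})")
REPLY = re.compile(r"IP (\S+)\.67 > (\S+)\.68: BOOTP/DHCP, Reply")


def summarize_dhcp(lines):
    """Return (requests per MAC, replies per (server, client IP)) as two dicts."""
    requests, replies = {}, {}
    for line in lines:
        match = REQUEST.search(line)
        if match:
            mac = match.group(1)
            requests[mac] = requests.get(mac, 0) + 1
            continue
        match = REPLY.search(line)
        if match:
            pair = (match.group(1), match.group(2))
            replies[pair] = replies.get(pair, 0) + 1
    return requests, replies


# First request/reply pair from the capture above.
sample = [
    "17:40:22.173505 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, "
    "Request from 02:16:3e:61:1a:bd, length 280",
    "17:40:22.173976 IP 172.16.0.1.67 > 172.16.0.4.68: BOOTP/DHCP, "
    "Reply, length 304",
]
requests, replies = summarize_dhcp(sample)
print(requests, replies)
```

If the request and reply counts match on the network node, as they do here, the replies are being lost somewhere between the bridge and the instance rather than at the DHCP server.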

Vish Ishaya (vishvananda) said :
#10

If you are running in VLAN mode on multiple machines, you need to make sure that host-managed VLANs are enabled on your switch. By default nova uses VLANs numbered 100 and up.

Tushar Patil (tpatil) said :
#11

Yes, the switch is VLAN enabled. I had already tested nova on this setup and everything worked as expected.

I rebooted the nova-compute node to see whether that would resolve the problem, but I see the same behavior after the reboot: no VM instance console log and no DHCP request/reply traffic on the network node.

It seems there is some issue with my environment. I have 4 nova-compute machines and I see the same problem on all of them.

Any clue?

Vish Ishaya (vishvananda) said :
#12

Hmm, this could be a bug in the new iptables firewall driver. Maybe try switching to the old NWFilter-based driver?

Koji Iida (iida-koji) said :
#13

Hi Tushar, Vish,

I have the same kind of problem.

Could you review the following patch?

=== modified file 'nova/virt/libvirt_conn.py'
--- nova/virt/libvirt_conn.py 2011-01-18 22:55:03 +0000
+++ nova/virt/libvirt_conn.py 2011-01-19 09:23:10 +0000
@@ -1266,12 +1266,12 @@
             if(ip_version == 4):
                 # Allow DHCP responses
                 dhcp_server = self._dhcp_server_for_instance(instance)
-                our_rules += ['-A %s -s %s -p udp --sport 67 --dport 68' %
-                                                  (chain_name, dhcp_server)]
+                our_rules += ['-A %s -s %s -p udp --sport 67 --dport 68 '
+                              '-j ACCEPT' % (chain_name, dhcp_server)]
             elif(ip_version == 6):
                 # Allow RA responses
                 ra_server = self._ra_server_for_instance(instance)
-                our_rules += ['-A %s -s %s -p icmpv6' %
+                our_rules += ['-A %s -s %s -p icmpv6 -j ACCEPT' %
                                                  (chain_name, ra_server)]

             # If nothing matches, jump to the fallback chain

----
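
Why the missing -j ACCEPT matters: per the iptables man page, a rule with no -j (and no -g) has no effect on the packet's fate and only updates the rule's counters, so a DHCP reply that matches it still continues to the next rule in the chain. The following is a minimal sketch of the before/after rule strings, not the nova code itself; dhcp_rule and has_target are hypothetical helper names, and the chain name and server address are taken from this thread.

```python
def dhcp_rule(chain_name, dhcp_server, with_target=True):
    """Build the 'allow DHCP responses' rule, with or without its target."""
    rule = '-A %s -s %s -p udp --sport 67 --dport 68' % (chain_name, dhcp_server)
    if with_target:
        rule += ' -j ACCEPT'
    return rule


def has_target(rule):
    """True if the rule names a target via -j (jump) or -g (goto)."""
    tokens = rule.split()
    return '-j' in tokens or '-g' in tokens


before = dhcp_rule('nova-inst-99', '172.16.0.1', with_target=False)
after = dhcp_rule('nova-inst-99', '172.16.0.1')

for rule in (before, after):
    print('%-5s %s' % (has_target(rule), rule))
```

The unpatched string matches DHCP replies but decides nothing about them; the patched string accepts them before they reach any later rule.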

Tushar Patil (tpatil) said :
#14

This is my iptables -n -L output on the nova-compute node.

root@ubuntu-compute-02:/var/log# iptables -n -L
Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination
nova-local all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

Chain nova-fallback (1 references)
target prot opt source destination
DROP all -- 0.0.0.0/0 0.0.0.0/0

Chain nova-inst-99 (1 references)
target prot opt source destination
DROP all -- 0.0.0.0/0 0.0.0.0/0 state INVALID
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
nova-sg-1 all -- 0.0.0.0/0 0.0.0.0/0
           udp -- 172.16.0.1 0.0.0.0/0 udp spt:67 dpt:68
nova-fallback all -- 0.0.0.0/0 0.0.0.0/0

Chain nova-local (1 references)
target prot opt source destination
nova-inst-99 all -- 0.0.0.0/0 172.16.0.3

Chain nova-sg-1 (1 references)
target prot opt source destination
ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:22
ACCEPT icmp -- 0.0.0.0/0 0.0.0.0/0
ACCEPT tcp -- 10.2.3.0/24 0.0.0.0/0 tcp dpt:22

Is there any issue with the iptables filters?

tcpdump on nova-compute node.
-----------------------------------------------
root@ubuntu-compute-02:/var/log# tcpdump -i br100 -v -n port 68
tcpdump: WARNING: br100: no IPv4 address assigned
tcpdump: listening on br100, link-type EN10MB (Ethernet), capture size 96 bytes
11:44:35.640815 IP (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 308)
    0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:3f:c3:1c, length 280, xid 0x6458492f, Flags [none]
          Client-Ethernet-Address 02:16:3e:3f:c3:1c [|bootp]
11:44:35.641374 IP (tos 0x0, ttl 64, id 64012, offset 0, flags [none], proto UDP (17), length 332)
    172.16.0.1.67 > 172.16.0.3.68: BOOTP/DHCP, Reply, length 304, xid 0x6458492f, Flags [none]
          Your-IP 172.16.0.3
          Server-IP 172.16.0.1
          Client-Ethernet-Address 02:16:3e:3f:c3:1c [|bootp]
11:44:38.645585 IP (tos 0x0, ttl 64, id 0, offset 0, flags [none], proto UDP (17), length 308)
    0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:16:3e:3f:c3:1c, length 280, xid 0x6458492f, Flags [none]
          Client-Ethernet-Address 02:16:3e:3f:c3:1c [|bootp]
11:44:38.646069 IP (tos 0x0, ttl 64, id 64013, offset 0, flags [none], proto UDP (17), length 332)
    172.16.0.1.67 > 172.16.0.3.68: BOOTP/DHCP, Reply, length 304, xid 0x6458492f, Flags [none]
          Your-IP 172.16.0.3
          Server-IP 172.16.0.1
          Client-Ethernet-Address 02:16:3e:3f:c3:1c [|bootp]
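
In the iptables -n -L listing above, the udp rule in chain nova-inst-99 has an empty target column, so a DHCP reply that matches it is not accepted and moves on to the nova-fallback jump, which drops it. A rough sketch for spotting such rules automatically follows. It assumes the layout shown in this listing, where a rule with no target starts with whitespace; rules_without_target is a made-up name.

```python
def rules_without_target(listing):
    """Return (chain, rule) pairs for rules whose target column is empty."""
    chain = None
    found = []
    for line in listing.splitlines():
        if line.startswith("Chain "):
            chain = line.split()[1]
        elif chain and line.strip() and line[0].isspace():
            # The target column is blank, so the line begins with whitespace.
            found.append((chain, " ".join(line.split())))
    return found


# The nova-inst-99 chain from the listing above.
listing = "\n".join([
    "Chain nova-inst-99 (1 references)",
    "target prot opt source destination",
    "DROP all -- 0.0.0.0/0 0.0.0.0/0 state INVALID",
    "ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED",
    "nova-sg-1 all -- 0.0.0.0/0 0.0.0.0/0",
    "           udp -- 172.16.0.1 0.0.0.0/0 udp spt:67 dpt:68",
    "nova-fallback all -- 0.0.0.0/0 0.0.0.0/0",
])

found = rules_without_target(listing)
for chain, rule in found:
    print("%s: no target on rule: %s" % (chain, rule))
```
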

Tushar Patil (tpatil) said :
#15

If I apply the patch suggested by Koji Iida-san, the problem goes away.
I can ping and SSH to the instance successfully.

This seems to be a bug in the IptablesFirewallDriver.

Ryan Lane (rlane) said :
#16

The patch by Koji also solves the problem I was having today with flatdhcp networking.