VMs seem to hang on KVM

Asked by Laszlo Budai

Hello,

We have 4 Linux guest VMs running on Ubuntu 14.04 with the 3.13.0-35-generic #62-Ubuntu SMP kernel. From time to time the VMs seem to hang for about 30 seconds and then continue operating normally.
I cannot find anything in the syslog files on either the host or the guests.
Performance data collected by sar (10-minute sampling) shows only a slight iowait on the host:

12:05:01 PM CPU %user %nice %system %iowait %steal %idle
12:15:01 PM all 3.16 0.00 0.55 0.00 0.00 96.29
12:25:01 PM all 3.34 0.00 0.73 0.00 0.00 95.93
12:35:01 PM all 3.65 0.00 0.94 1.44 0.00 93.97 <----- iowait is 1.44, the only non-zero value for the whole day
12:45:01 PM all 3.27 0.00 0.65 0.00 0.00 96.08
12:55:01 PM all 3.18 0.00 0.58 0.00 0.00 96.24
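
A note on reading these numbers: each sar row is an average over the 600-second sampling interval, so a 30-second stall is heavily diluted. The sketch below works through that arithmetic. It assumes (hypothetically) that during the stall a fixed fraction of CPU time is spent in iowait and that iowait is zero for the rest of the interval; `diluted_iowait` and `waiting_fraction` are illustrative names, not anything measured on this host.

```python
def diluted_iowait(stall_s, interval_s, waiting_fraction):
    """Average %iowait reported for one interval that contains a single stall.

    waiting_fraction is the assumed share of CPU time spent in iowait
    while the stall lasts (1.0 = every CPU waiting the whole time).
    """
    return 100.0 * waiting_fraction * stall_s / interval_s


# A 30 s stall inside a 600 s sar interval, for a few assumed fractions.
for frac in (1.0, 0.5, 0.25):
    avg = diluted_iowait(30, 600, frac)
    print(f"{frac:.2f} of CPU time waiting -> {avg:.2f} %iowait averaged")
```

Under these assumptions even a stall with every CPU waiting would show up as only 5 %iowait in the 10-minute average, so the 1.44 at 12:35 is at least compatible with a short, sharp I/O stall. It does not prove one.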

The only disk-based filesystem is /:
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 63G 12K 63G 1% /dev
tmpfs 13G 1.4M 13G 1% /run
/dev/sda1 275G 13G 249G 5% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
none 5.0M 0 5.0M 0% /run/lock
none 63G 12K 63G 1% /run/shm
none 100M 0 100M 0% /run/user

The sar values for the disk do not show anything unusual either:
12:05:01 PM DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
12:15:01 PM dev8-0 0.79 0.00 25.20 31.76 0.00 0.00 0.00 0.00
12:25:01 PM dev8-0 0.80 0.00 25.52 31.90 0.00 0.00 0.00 0.00
12:35:01 PM dev8-0 0.76 0.01 25.15 33.18 0.00 0.01 0.01 0.00
12:45:01 PM dev8-0 0.79 0.00 25.01 31.46 0.00 0.01 0.01 0.00
12:55:01 PM dev8-0 0.80 0.00 25.84 32.44 0.00 0.00 0.00 0.00
Average: dev8-0 0.79 0.00 25.34 32.14 0.00 0.00 0.00 0.00

The VMs have their disks on a Ceph cluster and access them using librbd.
I can see a traffic peak on the storage interface:
12:05:01 PM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
12:15:01 PM vlan6 157.49 148.81 42.45 328.90 0.00 0.00 0.00 0.00
12:25:01 PM vlan6 154.82 148.44 41.97 327.32 0.00 0.00 0.00 0.00
12:35:01 PM vlan6 157.22 154.34 47.12 505.42 0.00 0.00 0.00 0.00 <----- txkB/s goes up to 505 from an average of 328
12:45:01 PM vlan6 152.60 147.00 41.15 319.85 0.00 0.00 0.00 0.00
12:55:01 PM vlan6 156.09 147.38 42.22 323.50 0.00 0.00 0.00 0.00
Average: vlan6 155.64 149.19 42.98 361.00 0.00 0.00 0.00 0.00
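
To put the peak in perspective, the extra data sent during the 12:35 interval can be estimated from the table. This is only a rough sketch: it takes the other four samples as the baseline and assumes the 10-minute interval from the post; `extra_volume_kb` is an illustrative helper name.

```python
INTERVAL_S = 600  # sar sampling interval from the post (10 minutes)

# txkB/s for vlan6, copied from the sar output; index 2 is the 12:35:01 PM row.
TX_KB_S = [328.90, 327.32, 505.42, 319.85, 323.50]


def extra_volume_kb(samples, peak_index, interval_s):
    """Data sent above the baseline during the peak interval, in kB.

    The baseline is the mean of all samples except the peak one.
    """
    others = samples[:peak_index] + samples[peak_index + 1:]
    baseline = sum(others) / len(others)
    return (samples[peak_index] - baseline) * interval_s


extra = extra_volume_kb(TX_KB_S, 2, INTERVAL_S)
print(f"extra volume: {extra / 1024:.1f} MB over the interval")
print(f"if sent within 30 s: {extra / 30:.0f} kB/s on top of the baseline")
```

That comes to roughly 106 MB of additional writes in the interval. Whether they were spread over the 10 minutes or concentrated in a short burst cannot be told from averaged data.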

Unfortunately I don't have access to the VMs. I only received the sar files and the system log (/var/log/messages - the guests are CentOS 7), and I cannot see anything unusual in them for the given period.

Does anybody have an idea what else I can check?
Is it possible that the iowait appeared because of the increased network traffic? (AFAIK only filesystem-related syscalls are included in that metric.)
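
One thing that could be checked on the host is iowait at a finer granularity than the 10-minute sar average, by sampling the aggregate `cpu` line of /proc/stat. The following is a minimal sketch, assuming a Linux host; the function names, the 1-second interval, and the sample count are illustrative choices, not part of any existing tool.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, map(int, parts[1:1 + len(FIELDS)])))


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    return 100.0 * deltas["iowait"] / total if total else 0.0


def read_sample(path="/proc/stat"):
    """Read the current aggregate CPU counters (first line of /proc/stat)."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())


def watch(count=60, interval_s=1.0):
    """Print a timestamped %iowait value once per interval, count times."""
    previous = read_sample()
    for _ in range(count):
        time.sleep(interval_s)
        current = read_sample()
        print(time.strftime("%H:%M:%S"), f"{iowait_percent(previous, current):.2f}")
        previous = current
```

Calling `watch()` on the host while a hang occurs would show whether iowait spikes for a few seconds, which a 10-minute average cannot reveal. Running sar itself with a shorter interval would serve the same purpose.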

Any suggestion is welcome.

Kind regards,
Laszlo

Question information

Language: English
Status: Expired
For: Ubuntu
Assignee: No assignee

Launchpad Janitor (janitor) said:
#1

This question was expired because it remained in the 'Open' state without activity for the last 15 days.