Server freezes - 14.04 Trusty

Asked by abadger

Every few days my Ubuntu 14.04 server freezes and has to be power cycled to reboot. This is happening on three different servers, so I'm fairly sure it's not a hardware issue. I suspect software or memory usage is the problem, but how can I debug this? I have installed crashdump, but there is no crash dump after a hardware reset. (I'm using Hetzner hosting for these servers.) I am running 4 KVM Windows 2003 guests on the server, which is why I suspect memory issues.

How should I go about debugging this further? Which commands would you like me to run, and which log files would you like to see?

Thanks for any guidance,
Dave

After latest reboot....

Here is /var/log/syslog from just before the server 'goes offline'; after that you can see it being rebooted:

Sep 14 09:34:12 big3 dnsmasq-dhcp[2736]: DHCPREQUEST(virbr0) 192.168.122.26 52:54:00:ea:f8:59
Sep 14 09:34:12 big3 dnsmasq-dhcp[2736]: Ignoring domain click2sit.com for DHCP host name masswebsrv1
Sep 14 09:34:12 big3 dnsmasq-dhcp[2736]: DHCPACK(virbr0) 192.168.122.26 52:54:00:ea:f8:59 masswebsrv1
Sep 14 09:35:01 big3 CRON[9347]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Sep 14 09:45:01 big3 CRON[10063]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Sep 14 10:58:43 big3 rsyslogd: [origin software="rsyslogd" swVersion="7.4.4" x-pid="1194" x-info="http://www.rsyslog.com"] start
Sep 14 10:58:43 big3 rsyslogd: rsyslogd's groupid changed to 104
Sep 14 10:58:43 big3 rsyslogd: rsyslogd's userid changed to 101
Sep 14 10:58:43 big3 kernel: [ 0.000000] Initializing cgroup subsys cpuset
Sep 14 10:58:43 big3 kernel: [ 0.000000] Initializing cgroup subsys cpu
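
The excerpt above goes silent between the 09:45:01 cron entry and the rsyslogd start at 10:58:43, which brackets the freeze. A minimal sketch for finding such silent periods in a larger log follows. It assumes classic syslog timestamps (no year, English month abbreviations); the function name `find_log_gaps`, the 30-minute threshold, and the year 2014 are illustrative choices, not part of any tool mentioned in this thread.

```python
from datetime import datetime, timedelta


def find_log_gaps(lines, min_gap=timedelta(minutes=30), year=2014):
    """Return (previous, current) timestamp pairs that are more than min_gap apart."""
    gaps = []
    previous = None
    for line in lines:
        # Classic syslog lines start with e.g. "Sep 14 09:45:01" (15 characters).
        # The year is not logged, so it is supplied as an assumption.
        try:
            current = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not begin with a timestamp
        if previous is not None and current - previous > min_gap:
            gaps.append((previous, current))
        previous = current
    return gaps


# Three lines taken from the excerpt above.
sample = [
    "Sep 14 09:35:01 big3 CRON[9347]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)",
    "Sep 14 09:45:01 big3 CRON[10063]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)",
    "Sep 14 10:58:43 big3 rsyslogd: rsyslogd's groupid changed to 104",
]
for before, after in find_log_gaps(sample):
    print(f"silent from {before:%H:%M:%S} to {after:%H:%M:%S} ({after - before})")
```

Run against the whole of /var/log/syslog (and its rotated copies), this would list every window in which the machine logged nothing, which can then be compared across the three affected servers. It does not handle logs that span a year boundary.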

root@big3 /home/dave # free -m
             total       used       free     shared    buffers     cached
Mem:         31745      21991       9753          4         73       4866
-/+ buffers/cache:      17051      14693
Swap:        32767          0      32767
root@big3 /home/dave # uname -a
Linux big3.netfm.org 3.13.0-35-generic #62-Ubuntu SMP Fri Aug 15 01:58:42 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Output from hardinfo....

-Computer-
Processor : 8x Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
Memory : 32507MB (17547MB used)
Operating System : Ubuntu 14.04.1 LTS
User Name : dave (David Herring)
Date/Time : Sun 14 Sep 2014 11:14:14 AM BST
-Display-
Resolution : 1920x1080 pixels
OpenGL Renderer : Unknown
X11 Vendor : The X.Org Foundation
-Multimedia-
-Input Devices-
 Power Button
 Power Button
 AT Translated Set 2 keyboard
 Eee PC WMI hotkeys
-Printers-
No printers found
-SCSI Disks-
ATA INTEL SSDSC2CW24
ATA ST3000DM001-1CH1
ATA ST3000DM001-1CH1

Question information

Language: English
Status: Solved
For: Ubuntu
Assignee: No assignee
Solved by: abadger
actionparsnip (andrew-woodhead666) said :
#1

As a good start, I suggest you test your RAM using Memtest86+ from the GRUB menu.

abadger (dave-netfm) said :
#2

My servers are hosted at Hetzner, and they suggested we update the BIOS on the servers. This has now been done, and so far there have been no more server freezes.

I did get memtest installed and ran it across the 32 GB of memory. No errors were found, but I'm not sure this is as effective as Memtest86, and I was unsure how I could run that on a Hetzner server when I don't have console access.

Anyway, thanks for the help, and a note to all: check for BIOS updates if your server is freezing.

Thanks Dave