Ubuntu 10.10 freezes on occasion

Asked by Poldi

This is similar to question 17260.

I've posted this in the Desktop forum as well, but there were no takers :-(

My system is as follows:

Ubuntu 10.10 64bit (all patches up to date)
Separate boot disk (dual boot)
RAID 1 for rest of FS
NVIDIA drivers

Problem:

The system has frozen completely on at least three occasions in the last 4-5 weeks. It never did this before on 10.10 or previous versions.

The following is an excerpt of the kernel log. You will notice that there were no entries after 23:30 until I rebooted!

---------------------------------------------
May 11 23:30:09 LeosLinux kernel: [175615.640341] ------------[ cut here ]------------
May 11 23:30:09 LeosLinux kernel: [175615.640348] kernel BUG at /build/buildd/linux-2.6.35/fs/buffer.c:3346!
May 11 23:30:09 LeosLinux kernel: [175615.640351] invalid opcode: 0000 [#1] SMP
May 11 23:30:09 LeosLinux kernel: [175615.640355] last sysfs file: /sys/devices/pci0000:00/0000:00:11.0/host4/target4:0:0/4:0:0:0/block/sdb/uevent
May 11 23:30:09 LeosLinux kernel: [175615.640359] CPU 1
May 11 23:30:09 LeosLinux kernel: [175615.640361] Modules linked in: binfmt_misc act_police cls_flow cls_fw cls_u32 sch_htb sch_hfsc sch_ingress sch_sfq xt_time xt_connlimit xt_realm iptable_raw xt_comment xt_recent xt_policy ipt_ULOG ipt_REJECT ipt_REDIRECT ipt_NETMAP ipt_MASQUERADE ipt_ECN ipt_ecn ipt_CLUSTERIP ipt_ah ipt_addrtype nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp xt_TPROXY nf_tproxy_core xt_tcpmss xt_pkttype xt_physdev xt_owner xt_NFQUEUE xt_NFLOG nfnetlink_log xt_multiport xt_mark xt_mac xt_limit xt_length xt_iprange xt_helper xt_hashlimit xt_DSCP xt_dscp xt_dccp xt_conntrack xt_connmark xt_CLASSIFY ipt_LOG xt_tcpudp xt_state iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack i
May 11 23:30:09 LeosLinux kernel: ptable_mangle nfnetlink iptable_filter ip_tables x_tables quota_v2 quota_tree nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss snd_hda_codec_realtek sunrpc snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event usblp nvidia(P) snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc edac_core lp ppdev edac_mce_amd k10temp i2c_piix4 joydev parport_pc parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx usbhid hid raid1 firewire_ohci firewire_core raid0 crc_itu_t pata_atiixp r8169 ahci mii libahci pata_jmicron multipath linear
May 11 23:30:09 LeosLinux kernel: [175615.640468]
May 11 23:30:09 LeosLinux kernel: [175615.640472] Pid: 76, comm: kswapd0 Tainted: P 2.6.35-29-generic #51-Ubuntu GA-MA790FXT-UD5P/GA-MA790FXT-UD5P
May 11 23:30:09 LeosLinux kernel: [175615.640476] RIP: 0010:[<ffffffff8117de5d>] [<ffffffff8117de5d>] free_buffer_head+0x3d/0x50
May 11 23:30:09 LeosLinux kernel: [175615.640485] RSP: 0018:ffff8802238a1900 EFLAGS: 00010206
May 11 23:30:09 LeosLinux kernel: [175615.640488] RAX: ffff88004094f388 RBX: 0000000000000001 RCX: 0000000000000000
May 11 23:30:09 LeosLinux kernel: [175615.640491] RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff88004094f340
May 11 23:30:09 LeosLinux kernel: [175615.640494] RBP: ffff8802238a1900 R08: dead000000200200 R09: dead000000100100
May 11 23:30:09 LeosLinux kernel: [175615.640496] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88004094f340
May 11 23:30:09 LeosLinux kernel: [175615.640499] R13: ffffea0000fe0980 R14: ffffea0000fe0980 R15: ffff88022594639c
May 11 23:30:09 LeosLinux kernel: [175615.640503] FS: 00007ff77e85e720(0000) GS:ffff880001e80000(0000) knlGS:00000000f76116c0
May 11 23:30:09 LeosLinux kernel: [175615.640506] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 11 23:30:09 LeosLinux kernel: [175615.640509] CR2: 00007f6e8fd91da0 CR3: 00000001e76a2000 CR4: 00000000000006e0
May 11 23:30:09 LeosLinux kernel: [175615.640512] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 11 23:30:09 LeosLinux kernel: [175615.640514] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 11 23:30:09 LeosLinux kernel: [175615.640518] Process kswapd0 (pid: 76, threadinfo ffff8802238a0000, task ffff8802238996f0)
May 11 23:30:09 LeosLinux kernel: [175615.640520] Stack:
May 11 23:30:09 LeosLinux kernel: [175615.640522] ffff8802238a1940 ffffffff8117e2a9 ffffea0000457198 ffff88004094f340
May 11 23:30:09 LeosLinux kernel: [175615.640526] <0> 0100000000000008 ffff88004094f340 0000000000000000 ffff88004094f340
May 11 23:30:09 LeosLinux kernel: [175615.640531] <0> ffff8802238a19a0 ffffffff8122a1d7 ffff8802238a19a0 ffffffff8114fcd6
May 11 23:30:09 LeosLinux kernel: [175615.640536] Call Trace:
May 11 23:30:09 LeosLinux kernel: [175615.640541] [<ffffffff8117e2a9>] try_to_free_buffers+0x79/0xc0
May 11 23:30:09 LeosLinux kernel: [175615.640547] [<ffffffff8122a1d7>] jbd2_journal_try_to_free_buffers+0xa7/0x150
May 11 23:30:09 LeosLinux kernel: [175615.640553] [<ffffffff8114fcd6>] ? __mem_cgroup_uncharge_common+0x196/0x280
May 11 23:30:09 LeosLinux kernel: [175615.640558] [<ffffffff811ed825>] ext4_releasepage+0x55/0x90
May 11 23:30:09 LeosLinux kernel: [175615.640562] [<ffffffff811024b2>] try_to_release_page+0x32/0x50
May 11 23:30:09 LeosLinux kernel: [175615.640566] [<ffffffff811116f5>] shrink_page_list+0x485/0x580
May 11 23:30:09 LeosLinux kernel: [175615.640570] [<ffffffff81111abd>] shrink_inactive_list+0x2cd/0x7f0
May 11 23:30:09 LeosLinux kernel: [175615.640575] [<ffffffff8110beda>] ? determine_dirtyable_memory+0x1a/0x30
May 11 23:30:09 LeosLinux kernel: [175615.640579] [<ffffffff8110bf87>] ? get_dirty_limits+0x27/0x2f0
May 11 23:30:09 LeosLinux kernel: [175615.640583] [<ffffffff8110fe22>] ? get_scan_count+0x172/0x2b0
May 11 23:30:09 LeosLinux kernel: [175615.640587] [<ffffffff8111218b>] shrink_zone+0x1ab/0x230
May 11 23:30:09 LeosLinux kernel: [175615.640591] [<ffffffff81112cc9>] balance_pgdat+0x679/0x740
May 11 23:30:09 LeosLinux kernel: [175615.640595] [<ffffffff81112ead>] kswapd+0x11d/0x310
May 11 23:30:09 LeosLinux kernel: [175615.640599] [<ffffffff81080600>] ? autoremove_wake_function+0x0/0x40
May 11 23:30:09 LeosLinux kernel: [175615.640603] [<ffffffff81112d90>] ? kswapd+0x0/0x310
May 11 23:30:09 LeosLinux kernel: [175615.640606] [<ffffffff81080086>] kthread+0x96/0xa0
May 11 23:30:09 LeosLinux kernel: [175615.640611] [<ffffffff8100aee4>] kernel_thread_helper+0x4/0x10
May 11 23:30:09 LeosLinux kernel: [175615.640614] [<ffffffff8107fff0>] ? kthread+0x0/0xa0
May 11 23:30:09 LeosLinux kernel: [175615.640618] [<ffffffff8100aee0>] ? kernel_thread_helper+0x0/0x10
May 11 23:30:09 LeosLinux kernel: [175615.640620] Code: 2a 48 89 fe 48 8b 3d a3 19 b9 00 e8 6e 63 fc ff 65 48 8b 14 25 a0 eb 00 00 48 c7 c0 60 2d 01 00 83 2c 02 01 e8 15 ff ff ff c9 c3 <0f> 0b eb fe eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48
May 11 23:30:09 LeosLinux kernel: [175615.640650] RIP [<ffffffff8117de5d>] free_buffer_head+0x3d/0x50
May 11 23:30:09 LeosLinux kernel: [175615.640654] RSP <ffff8802238a1900>
May 11 23:30:09 LeosLinux kernel: [175615.640658] ---[ end trace 8856969ad921f820 ]---
May 12 08:42:46 LeosLinux kernel: imklog 4.2.0, log source = /proc/kmsg started.
May 12 08:42:46 LeosLinux kernel: [ 0.000000] Initializing cgroup subsys cpuset
May 12 08:42:46 LeosLinux kernel: [ 0.000000] Initializing cgroup subsys cpu
May 12 08:42:46 LeosLinux kernel: [ 0.000000] Linux version 2.6.35-29-generic (buildd@allspice) (gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5) ) #51-Ubuntu SMP Fri Apr 15 17:12:35 UTC 2011 (Ubuntu 2.6.35-29.51-generic 2.6.35.12)
May 12 08:42:46 LeosLinux kernel: [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-2.6.35-29-generic root=UUID=a05bde2b-394f-43c5-97f2-c7397e2af4b2 ro rootdelay=60 quiet splash
May 12 08:42:46 LeosLinux kernel: [ 0.000000] BIOS-provided physical RAM map:
-------------------------------------------------------------
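
For anyone comparing a trace like the one above against existing bug reports, the key details are the "kernel BUG at" location and the call-trace frames not marked with "?". The following is a minimal Python sketch for pulling those out of a log excerpt. The function name `summarize_oops` and the regular expressions are illustrative assumptions based only on the line format shown in this excerpt, not part of any kernel tooling.

```python
import re

# Matches the "kernel BUG at <file>:<line>!" marker of an oops report.
BUG_RE = re.compile(r"kernel BUG at (?P<path>\S+):(?P<line>\d+)!")
# Matches frames such as "[<ffffffff8117e2a9>] try_to_free_buffers+0x79/0xc0".
# An optional "? " before the symbol marks a frame the kernel is unsure about.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unsure>\?\s+)?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_oops(lines):
    """Return (bug_location, frames) for the first oops found in `lines`.

    bug_location is a (path, line_number) tuple or None; frames lists the
    call-trace symbols that are not prefixed with "?".
    """
    bug = None
    frames = []
    in_trace = False
    for line in lines:
        match = BUG_RE.search(line)
        if match and bug is None:
            bug = (match.group("path"), int(match.group("line")))
            continue
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            frame = FRAME_RE.search(line)
            if frame is None:
                in_trace = False  # trace ended, e.g. at the "Code:" line
            elif not frame.group("unsure"):
                frames.append(frame.group("func"))
    return bug, frames


# A few lines copied from the excerpt above, used only as a demonstration.
SAMPLE = [
    "May 11 23:30:09 LeosLinux kernel: [175615.640348] kernel BUG at /build/buildd/linux-2.6.35/fs/buffer.c:3346!",
    "May 11 23:30:09 LeosLinux kernel: [175615.640536] Call Trace:",
    "May 11 23:30:09 LeosLinux kernel: [175615.640541] [<ffffffff8117e2a9>] try_to_free_buffers+0x79/0xc0",
    "May 11 23:30:09 LeosLinux kernel: [175615.640553] [<ffffffff8114fcd6>] ? __mem_cgroup_uncharge_common+0x196/0x280",
    "May 11 23:30:09 LeosLinux kernel: [175615.640558] [<ffffffff811ed825>] ext4_releasepage+0x55/0x90",
]
print(summarize_oops(SAMPLE))
```

The source file, line number and first few frames are usually enough to search a bug tracker for a matching report.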

The system wasn't completely dead until about 0:30, as syslog still logged (some) errors after 23:30:

----------------------------------------------------------------

May 12 00:26:51 LeosLinux postfix/local[10122]: fatal: main.cf configuration error: mailbox_size_limit is smaller than message_size_limit
May 12 00:26:52 LeosLinux postfix/master[2520]: warning: process /usr/lib/postfix/local pid 10122 exit status 1
May 12 00:26:52 LeosLinux postfix/master[2520]: warning: /usr/lib/postfix/local: bad command startup -- throttling
May 12 00:27:52 LeosLinux postfix/local[10124]: fatal: main.cf configuration error: mailbox_size_limit is smaller than message_size_limit
May 12 00:27:53 LeosLinux postfix/master[2520]: warning: process /usr/lib/postfix/local pid 10124 exit status 1
May 12 00:27:53 LeosLinux postfix/master[2520]: warning: /usr/lib/postfix/local: bad command startup -- throttling
May 12 00:28:54 LeosLinux postfix/local[10126]: fatal: main.cf configuration error: mailbox_size_limit is smaller than message_size_limit
May 12 00:28:55 LeosLinux postfix/master[2520]: warning: process /usr/lib/postfix/local pid 10126 exit status 1
May 12 00:28:55 LeosLinux postfix/master[2520]: warning: /usr/lib/postfix/local: bad command startup -- throttling
May 12 08:42:46 LeosLinux kernel: imklog 4.2.0, log source = /proc/kmsg started.
May 12 08:42:46 LeosLinux rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="1191" x-info="http://www.rsyslog.com"] (re)start
May 12 08:42:46 LeosLinux rsyslogd: rsyslogd's groupid changed to 103

-----------------------------------------------------------------
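
The window in which the machine went silent can also be found mechanically by looking for the longest gap between consecutive syslog timestamps. Below is a minimal Python sketch of that idea. The names `parse_stamp` and `largest_gap` are hypothetical, and because these timestamps carry no year, the default of 2011 is an assumption taken from the kernel build date in the boot messages.

```python
from datetime import datetime


def parse_stamp(line, year):
    """Parse the leading 15-character 'May 11 23:30:09' stamp of a syslog line."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def largest_gap(lines, year=2011):
    """Return (gap, line_before, line_after) for the longest silence.

    Lines without a parseable timestamp are skipped. Returns None when
    fewer than two timestamped lines are present.
    """
    best = None
    prev_line = prev_time = None
    for line in lines:
        try:
            stamp = parse_stamp(line, year)
        except ValueError:
            continue
        if prev_time is not None:
            gap = stamp - prev_time
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_line, prev_time = line, stamp
    return best


# Lines copied from the syslog excerpt above, used only as a demonstration.
SAMPLE = [
    "May 12 00:27:53 LeosLinux postfix/master[2520]: warning: /usr/lib/postfix/local: bad command startup -- throttling",
    "May 12 00:28:55 LeosLinux postfix/master[2520]: warning: /usr/lib/postfix/local: bad command startup -- throttling",
    "May 12 08:42:46 LeosLinux kernel: imklog 4.2.0, log source = /proc/kmsg started.",
]
result = largest_gap(SAMPLE)
if result is not None:
    print("longest silence:", result[0])
```

The line just before the gap is the last thing the system managed to log, which narrows down when the freeze became total.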

I'd appreciate any pointers to resolve this, as I'm using this machine for work.

Thanks

Ronald Malloy (r-j-malloy) said :
#1

Hello Vlad,

When I have experienced lock-ups or other freezes, I tend to look at the hardware first. The first thing I would check is for proper ventilation. If you are able to open the case, I would look at the CPU cooling fan and make sure it is clear of dirt/hair/fuzz/lint, and is spinning like you would expect. If the fan is slowing down and speeding up, that might be a sign of a failing cooling fan. Also, you would want to inspect the other cabinet cooling fans.

If you notice an excessive build-up of dirt and other stuff on the fans, Wal-Mart has a tall can of compressed air in the electronics section for about 5 bucks. I suggest powering down the system, moving it to an open outside area so you can open the cabinet and blow the dust and dirt out, but not all over your house. Make sure all the dirt is removed from the CPU cooling fan, and from around the memory. After you have cleaned the cabinet, bring it back in and reassemble the machine and reconnect the cables. Once done, power back up and let it run.

You might also want to look at a DMI tool that can read the motherboard temp, CPU fan speed, and power supply info as well. You might find out something with that tool which could be heat related.

Hope this helps!
-Ron

Poldi (poldi) said :
#2

Hi Ron,

You raised a few good points, but I have a water-cooled PC. I don't think this is in any way heat related, but it may be related to the kernel bug:

May 11 23:30:09 LeosLinux kernel: [175615.640348] kernel BUG at /build/buildd/linux-2.6.35/fs/buffer.c:3346!
May 11 23:30:09 LeosLinux kernel: [175615.640351] invalid opcode: 0000 [#1] SMP
May 11 23:30:09 LeosLinux kernel: [175615.640355] last sysfs file: /sys/devices/pci0000:00/0000:00:11.0/host4/target4:0:0/4:0:0:0/block/sdb/uevent
May 11 23:30:09 LeosLinux kernel: [175615.640359] CPU 1

Cheers

Eliah Kagan (degeneracypressure) said :
#3

This looks like bug 658584. I recommend that you visit that bug page and see if the description matches what's happening on your machine. If so, I recommend you subscribe to the bug, mark it as affecting you (with the green "This bug affects..." link near the top of the bug page), and post to indicate your kernel version and architecture (the output of "uname -a" includes this information) and whether or not you're using the fglrx video driver (and if so, what version of the fglrx package is installed). To find out if you're using the fglrx video driver, check in Additional Drivers (fglrx is the proprietary ATi video driver) or run "sudo lshw -c video | grep driver" and see what it says.
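
If the output of the lshw command is saved to a file or variable, the driver check can be scripted. Here is a minimal Python sketch, assuming lshw reports the driver as a "driver=<name>" token on its configuration line. The function names and the sample text are illustrative only and are not taken from Poldi's machine.

```python
import re

# lshw prints the bound kernel driver as "driver=<name>" in its
# "configuration:" line (assumed format for this sketch).
DRIVER_RE = re.compile(r"\bdriver=(\S+)")


def video_drivers(lshw_output):
    """Return every driver name found in `lshw -c video` output."""
    return DRIVER_RE.findall(lshw_output)


def uses_fglrx(lshw_output):
    """True if any listed video driver name starts with 'fglrx'."""
    return any(name.startswith("fglrx") for name in video_drivers(lshw_output))


# Illustrative text only, shaped like lshw output for an NVIDIA card.
SAMPLE = """\
  *-display
       description: VGA compatible controller
       configuration: driver=nvidia latency=0
"""
print(video_drivers(SAMPLE), uses_fglrx(SAMPLE))
```

Either result is worth posting to the bug alongside the "uname -a" output, since it tells the triagers whether the proprietary ATI driver is involved.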
