External USB drive keeps freezing up

Asked by Paul Tomblin

I have a system that has been running Ubuntu for a number of years; before that I used RedHat, Fedora, and other distros. I back up every hour to an external USB hard drive using "rsync", and that has worked fine since back when I used Fedora. Last week I decided to see if a wipe and re-install would fix some of the glitches I've had ever since I upgraded to 12.04. I installed Kubuntu (although I doubt KDE versus XFCE has any bearing on this problem). Ever since the re-install, my hourly backup works fine for a number of hours and then suddenly freezes up. Once that happens, any attempt to access the USB drive also freezes. The only way I've found to fix it is to power the drive down and back up again. While it is frozen like this, I see the following in /var/log/kern.log (repeated over and over again):

Sep 21 23:18:01 allhats2 kernel: [52652.707110] INFO: task kjournald:4380 blocked for more than 120 seconds.
Sep 21 23:18:01 allhats2 kernel: [52652.707113] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 21 23:18:01 allhats2 kernel: [52652.707114] kjournald D ffffffff81806200 0 4380 2 0x00000000
Sep 21 23:18:01 allhats2 kernel: [52652.707117] ffff8803661e1c10 0000000000000046 ffff8803fd890148 ffff880404f17000
Sep 21 23:18:01 allhats2 kernel: [52652.707120] ffff8803661e1fd8 ffff8803661e1fd8 ffff8803661e1fd8 00000000000137c0
Sep 21 23:18:01 allhats2 kernel: [52652.707122] ffff880404e89700 ffff880361304500 ffff8803661e1bf0 ffff88041f5d4080
Sep 21 23:18:01 allhats2 kernel: [52652.707125] Call Trace:
Sep 21 23:18:01 allhats2 kernel: [52652.707130] [<ffffffff811a8a40>] ? __wait_on_buffer+0x30/0x30
Sep 21 23:18:01 allhats2 kernel: [52652.707133] [<ffffffff8165850f>] schedule+0x3f/0x60
Sep 21 23:18:01 allhats2 kernel: [52652.707135] [<ffffffff816585bf>] io_schedule+0x8f/0xd0
Sep 21 23:18:01 allhats2 kernel: [52652.707137] [<ffffffff811a8a4e>] sleep_on_buffer+0xe/0x20
Sep 21 23:18:01 allhats2 kernel: [52652.707139] [<ffffffff81658ddf>] __wait_on_bit+0x5f/0x90
Sep 21 23:18:01 allhats2 kernel: [52652.707140] [<ffffffff811a8a40>] ? __wait_on_buffer+0x30/0x30
Sep 21 23:18:01 allhats2 kernel: [52652.707142] [<ffffffff81658e8c>] out_of_line_wait_on_bit+0x7c/0x90
Sep 21 23:18:01 allhats2 kernel: [52652.707145] [<ffffffff8108ab20>] ? autoremove_wake_function+0x40/0x40
Sep 21 23:18:01 allhats2 kernel: [52652.707146] [<ffffffff811a8a3e>] __wait_on_buffer+0x2e/0x30
Sep 21 23:18:01 allhats2 kernel: [52652.707149] [<ffffffff81257534>] journal_commit_transaction+0x484/0xfc0
Sep 21 23:18:01 allhats2 kernel: [52652.707152] [<ffffffff8125b5eb>] kjournald+0xeb/0x250
Sep 21 23:18:01 allhats2 kernel: [52652.707154] [<ffffffff8108aae0>] ? add_wait_queue+0x60/0x60
Sep 21 23:18:01 allhats2 kernel: [52652.707155] [<ffffffff8125b500>] ? commit_timeout+0x10/0x10
Sep 21 23:18:01 allhats2 kernel: [52652.707157] [<ffffffff8108a03c>] kthread+0x8c/0xa0
Sep 21 23:18:01 allhats2 kernel: [52652.707160] [<ffffffff81664b74>] kernel_thread_helper+0x4/0x10
Sep 21 23:18:01 allhats2 kernel: [52652.707162] [<ffffffff81089fb0>] ? flush_kthread_worker+0xa0/0xa0
Sep 21 23:18:01 allhats2 kernel: [52652.707163] [<ffffffff81664b70>] ? gs_change+0x13/0x13

My first theory was that something was aggressively spinning down the drive, so I issued the command "hdparm -S 0 /dev/sde". That didn't help. So I tried uninstalling all the auto power management stuff. That didn't help either. I've also changed the munin configuration so it stops sending smartctl commands to the drive. I did that a few hours ago, but this freeze-up usually takes 10 or 18 hours to manifest (i.e., it normally happens when I'm asleep), so I don't know yet whether it will help.
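
Since the freeze tends to happen overnight, it may help to summarize how often the hung-task warning shown above appears in the log and which task it names. The following is a minimal sketch, not something used in this thread: the function name count_hung_tasks is hypothetical, and the pattern assumes the message format matches the kern.log excerpt in the question.

```python
import re
from collections import Counter

# Matches the kernel's hung-task warning as it appears in the excerpt, e.g.
# "... INFO: task kjournald:4380 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def count_hung_tasks(lines):
    """Return a Counter mapping (task name, pid) to the number of warnings seen.

    `lines` is any iterable of log lines (hypothetical helper, for illustration).
    """
    counts = Counter()
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            counts[(match.group("name"), int(match.group("pid")))] += 1
    return counts
```

To use it, pass an open file object for /var/log/kern.log (reading that file may require elevated privileges on some systems) and print the entries of the returned Counter. Only the "INFO: task ..." line is counted, so the call trace that follows each warning does not inflate the totals.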

Question information

Language:
English
Status:
Solved
For:
Ubuntu linux
Assignee:
No assignee
Solved by:
Paul Tomblin

actionparsnip (andrew-woodhead666) said :
#1

Have you tested the drive using the drive manufacturer's tool?

Paul Tomblin (ptomblin) said :
#2

I haven't tested it because it was working fine before the re-install, but I shall try that.

actionparsnip (andrew-woodhead666) said :
#3

Worth checking. I'd also fsck the partition to make sure the filesystem is consistent.

Paul Tomblin (ptomblin) said :
#4

Both checks report no errors.

actionparsnip (andrew-woodhead666) said :
#5

Have you tried a different port and a different USB cable?

Paul Tomblin (ptomblin) said :
#6

I ran it for 20 hours on the internal SATA without problems. I just swapped the USB cradle, cable and power supply for the ones from my laptop's Time Machine drive, and plugged it into a different USB port. I'll report back in 24 hours and/or when it freezes up again.

Paul Tomblin (ptomblin) said :
#7

Well, it's been more than 24 hours. One of the many changes I made along the way seems to have fixed it.