external hard-drive becomes "read only"

Asked by Bogdan Butnaru

This is really weird: I have an external hard-drive --- this one: http://www.lacie.com/products/product.htm?pid=10764 --- where I keep most of my files. It's FAT32 formatted, with a single ~480 GB partition. (Yeah, I know it's rather big for FAT32, but there are good reasons for it.)

I never had any problems until a couple hours ago. I can't tell exactly what happens, but (out of the blue) programs start to report the partition is mounted read-only.

**********************
$ ls /media/ -l
drwxr-xr-x 14 bogdanb root 32768 2007-02-10 21:53 LACIE

$ ls /media/LACIE -l
total 864
drwxr-xr-x 13 bogdanb root 32768 2007-01-17 16:44 Aparatu'
drwxr-xr-x 45 bogdanb root 32768 2007-02-07 20:03 Biblioteca
drwxr-xr-x 4 bogdanb root 32768 2006-11-24 18:55 comics
drwxr-xr-x 7 bogdanb root 32768 2007-02-07 00:56 filme
-rwxr-xr-x 1 bogdanb root 6072 2006-11-27 22:27 gmerge.py
-rwxr-xr-x 1 bogdanb root 448858 2006-08-18 22:23 IPTPS-P2PDB.odp
-rwxr-xr-x 1 bogdanb root 183 2006-12-15 17:31 irbrc
drwxr-xr-x 22 bogdanb root 32768 2007-01-17 16:35 jocuri
drwxr-xr-x 5 bogdanb root 32768 2007-01-17 16:43 kituri
drwxr-xr-x 13 bogdanb root 32768 2007-01-17 17:01 media
drwxr-xr-x 17 bogdanb root 32768 2007-02-07 19:40 misc
drwxr-xr-x 7 bogdanb root 32768 2007-02-10 16:54 music
drwxr-xr-x 8 bogdanb root 32768 2006-12-27 17:30 Recycled
drwxr-xr-x 7 bogdanb root 32768 2007-02-10 18:08 torrent

$ mount
[snip]
/dev/sdb1 on /media/LACIE type vfat (rw,noexec,nosuid,nodev,uid=1000,utf8,umask=022)

$ cat /etc/mtab
[snip]
/dev/sdb1 /media/LACIE vfat rw,noexec,nosuid,nodev,uid=1000,utf8,umask=022 0 0

$ df
Filesystem 1K-blocks Used Available Use% Mounted on
[snip]
/dev/sdb1 488264768 403286592 84978176 83% /media/LACIE

$ echo "test" > /media/LACIE/test.txt
bash: /media/LACIE/test.txt: Read-only file system

**********************

Nautilus displays files as writable, but doesn't allow any change to be done on the disk. (Sometimes it shows the 'lock' emblem on all files, but removes it when I refresh the view.) I unmounted the disk a couple of times, even rebooted. On remount it works for a bit, then this just happens.

While all this happened, Azureus was the only program writing to the partition. It crashes when the partition becomes unwritable, but Azureus crashes a lot anyway. Amarok, and occasionally Beagle, read from it all the time. There's lots of free space there, too.

I'm stumped.

Question information

Language: English
Status: Solved
For: Ubuntu
Assignee: No assignee
Solved by: Bogdan Butnaru

Ralph Janke (txwikinger) said :
#1

Can you read the files from the drive?

Check whether

    dmesg

shows any error messages.
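
One way to narrow the kernel log down to the lines that matter is a small filter like the sketch below. The function name `fs_errors` and the pattern list are assumptions made for illustration (the patterns match the kind of messages quoted later in this thread) and are certainly not exhaustive. The sample input stands in for real `dmesg` output.

```shell
# Hypothetical helper: keep only kernel-log lines that hint at filesystem or disk trouble.
# It reads log text on stdin, so it works with dmesg or with a saved log file.
fs_errors() {
    grep -E -i 'filesystem panic|I/O error|EXT3-fs error|remounting filesystem read-only'
}

# Sample lines in the format seen in this thread; on a real system use: dmesg | fs_errors
fs_errors <<'EOF'
[ 382.178000] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[ 868.190000] FAT: Filesystem panic (dev sdb1)
[ 2566.098000] Core dump to |/usr/share/apport/apport.9135 pipe failed
EOF
```

With the sample input only the "FAT: Filesystem panic" line is printed; the network and apport lines are dropped.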

Bogdan Butnaru (bogdanb) said :
#2

OK, I have a new lead:

****************
$ dmesg
[ 868.190000] fat_bmap_cluster: request beyond EOF (i_pos 15127556805)
[ 868.190000] FAT: Filesystem panic (dev sdb1)
[ 868.190000] fat_bmap_cluster: request beyond EOF (i_pos 15127556805)
[ 868.190000] FAT: Filesystem panic (dev sdb1)
[ 868.190000] fat_bmap_cluster: request beyond EOF (i_pos 15127556805)
[ 868.190000] FAT: Filesystem panic (dev sdb1)
[ 868.190000] fat_bmap_cluster: request beyond EOF (i_pos 15127556805)
[ 868.190000] FAT: Filesystem panic (dev sdb1)
[ 868.190000] fat_bmap_cluster: request beyond EOF (i_pos 15127556805)
[ 868.190000] FAT: Filesystem panic (dev sdb1)
[ 868.190000] fat_bmap_cluster: request beyond EOF (i_pos 15127556805)
[ 868.190000] FAT: Filesystem panic (dev sdb1)
[ 868.190000] fat_bmap_cluster: request beyond EOF (i_pos 15127556805)
... lots of snipping ...
[ 2318.714000] FAT: Filesystem panic (dev sdb1)
[ 2318.714000] fat_bmap_cluster: request beyond EOF (i_pos 15127565026)
[ 2318.714000] FAT: Filesystem panic (dev sdb1)
[ 2318.714000] fat_bmap_cluster: request beyond EOF (i_pos 15127565026)
[ 2566.098000] Core dump to |/usr/share/apport/apport.9135 pipe failed

$ tail /var/log/messages
Feb 10 21:22:03 bogdanb-d620 kernel: [ 382.178000] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
Feb 10 21:22:09 bogdanb-d620 dhcdbd: message_handler: message handler not found under /com/redhat/dhcp/eth1 for sub-path eth1.dbus.get.host_name
Feb 10 21:22:09 bogdanb-d620 dhcdbd: message_handler: message handler not found under /com/redhat/dhcp/eth1 for sub-path eth1.dbus.get.nis_domain
Feb 10 21:22:09 bogdanb-d620 dhcdbd: message_handler: message handler not found under /com/redhat/dhcp/eth1 for sub-path eth1.dbus.get.nis_servers
Feb 10 21:23:42 bogdanb-d620 kernel: [ 482.647000] Core dump to |/usr/share/apport/apport.6590 pipe failed
Feb 10 21:29:14 bogdanb-d620 kernel: [ 814.984000] Core dump to |/usr/share/apport/apport.7539 pipe failed
Feb 10 21:33:07 bogdanb-d620 kernel: [ 1047.584000] Core dump to |/usr/share/apport/apport.7712 pipe failed
Feb 10 21:56:17 bogdanb-d620 -- MARK --
Feb 10 21:58:26 bogdanb-d620 kernel: [ 2566.098000] Core dump to |/usr/share/apport/apport.9135 pipe failed
Feb 10 22:16:17 bogdanb-d620 -- MARK --

***************

I don't know what all that means, but now I'm a bit scared. I really can't afford to have this disk break. My only complaint about this drive is that, though it has a power supply bigger than my laptop's, the power cable tends to slide out much too easily...

Bogdan Butnaru (bogdanb) said :
#3

Oh, this is on up-to-date Feisty.

Ralph Janke (txwikinger) said :
#4

I am afraid you probably need to run 'fsck -a' on that drive.

Hopefully nothing is unrecoverable so far.

Make sure to back up important stuff. Drives can die very fast and unexpectedly.

danielmewes (danielmewes) said :
#5

I would say that there is a problem either with the way Linux recognizes the disk (perhaps it only sees part of it) or with the FAT32 file system.
What size is shown in /proc/partitions? And what do "sudo sfdisk -uM -s /dev/sdb" and "sudo sfdisk -uM -s /dev/sdb1" report?

If the displayed sizes are correct, you can try to repair the file system by running fsck.vfat on /dev/sdb1.

Bogdan Butnaru (bogdanb) said :
#6

$ cat /proc/partitions
major minor #blocks name
[snip]
   8 16 488386584 sdb
   8 17 488384001 sdb1
[snip]

$ sudo sfdisk -uM -s /dev/sdb
488386584

$ sudo sfdisk -uM -s /dev/sdb1
488384001

************

These seemed to work. I tried fsck (see below) and it found just two files that are wrong. I'm a bit suspicious of Azureus, because it created and was writing to those files, and it's very unstable. Though I doubt a user-space program could break the file system...

I'm reluctant to let fsck run on the partition. If it's all just a pair of bad writes caused by a power brownout on the disk, fsck will fix it, but I'm afraid it could break more if the disk itself is failing. I did see some posts on the net mentioning that fsck did even more damage. I can't afford a full backup, and losing all that data would be very annoying.

Are there any more checks I could run before trusting fsck? Is there any way to diagnose what's the cause of the errors?

************

$ fsck.vfat -v -n /dev/sdb1
dosfsck 2.11 (12 Mar 2005)
dosfsck 2.11, 12 Mar 2005, FAT32, LFN
Checking we can access the last sector of the filesystem
Boot sector contents:
System ID "MSDOS5.0"
Media byte 0xf8 (hard disk)
       512 bytes per logical sector
     32768 bytes per cluster
        32 reserved sectors
First FAT starts at byte 16384 (sector 32)
         2 FATs, 32 bit entries
  61033472 bytes per FAT (= 119206 sectors)
Root directory start at cluster 2 (arbitrary size)
Data area starts at byte 122083328 (sector 238444)
  15258274 data clusters (499983122432 bytes)
63 sectors/track, 255 heads
        63 hidden sectors
 976768002 sectors total
/torrent/files.done/Black Sabbath Discography/12-Black Sabbath-1981-Mob Rules/01-Turn Up The Night-mw.mp3
  File size is 6160384 bytes, cluster chain length is 5832704 bytes.
  Truncating file to 5832704 bytes.
/torrent/files.done/Black Sabbath Discography/20-Black Sabbath-1994-Cross Purposes/10-Evil Eye-mw.mp3
  File size is 6256337 bytes, cluster chain length is 6225920 bytes.
  Truncating file to 6225920 bytes.
Checking for unused clusters.
Checking free cluster summary.
Free cluster summary wrong (2655568 vs. really 2654150)
  Auto-correcting.
Leaving file system unchanged.
/dev/sdb1: 95680 files, 12604124/15258274 clusters
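
The figures above can be cross-checked with a little arithmetic, which speaks to the worry that Linux might only see part of the disk. This is a sketch under one assumption: that the #blocks column of /proc/partitions counts 1 KiB blocks, as is usual on Linux. The variable names are illustrative.

```shell
# Numbers copied from the /proc/partitions, fsck.vfat and df output in this thread.
blocks_1k=488384001     # sdb1 size in /proc/partitions (assumed 1 KiB blocks)
sectors=976768002       # "sectors total" in the fsck.vfat boot-sector dump
sector_size=512         # bytes per logical sector
clusters=15258274       # data clusters
cluster_kib=32          # 32768 bytes per cluster

part_bytes=$((blocks_1k * 1024))
fs_bytes=$((sectors * sector_size))
data_kib=$((clusters * cluster_kib))

echo "partition size:  $part_bytes bytes"    # 500105217024
echo "filesystem size: $fs_bytes bytes"      # 500105217024
echo "data area:       $data_kib KiB"        # 488264768, the 1K-blocks figure df reports

if [ "$part_bytes" -eq "$fs_bytes" ]; then
    echo "filesystem and partition sizes agree"
fi
```

Since the boot sector, the partition table and df all describe the same size, the kernel appears to see the whole partition, which points towards filesystem damage rather than a misdetected disk.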

Ralph Janke (txwikinger) said :
#7

Can you write to it now without getting the errors in dmesg?

danielmewes (danielmewes) said :
#8

The proposed corrections are probably safe. The first two may damage the mentioned MP3 files. On the other hand they probably are defective already (thus fsck wants to correct their file system entry).
The change to the free cluster summary would *decrease* the amount of free clusters. This cannot cause any data loss as far as I understand this point.
However I cannot give you any guarantee.

Best Bogdan Butnaru (bogdanb) said :
#9

@txwikinger: the drive is still mounted read-only. By my last few tries and what I saw on the net, I should be able to unmount it and remount it as writable, but it would turn read-only when the error is detected again.

@danielmewes: the files are not completely downloaded, so I can delete them after the filesystem is repaired. They seem safe to me too, but I'm concerned about what caused them.

danielmewes (danielmewes) said :
#10

Probably the following happened:
New data was to be appended to the MP3 files, but a power failure or system hang (for example) occurred. Thus the following information had already been written out to the file system:
- the new file size was written to the file's directory entry
- the clusters were reserved by reducing the free cluster counter of the file system
However the transaction was not complete, since
- the new clusters were not yet assigned to the files (leading to the first two errors)
- the needed clusters were not finally marked as being used (perhaps just the same step; I don't know exactly how FAT works on this point)
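
The size mismatch that fsck.vfat reported for the two files fits this picture and can be worked out by hand. A small sketch, using the 32768-byte cluster size and the byte counts from the fsck output earlier in the thread; the helper name `clusters_for` is made up for illustration:

```shell
cluster=32768   # bytes per cluster, from the fsck.vfat boot-sector dump

# Hypothetical helper: clusters needed to hold a file of $1 bytes (rounded up).
clusters_for() {
    echo $(( ($1 + cluster - 1) / cluster ))
}

# Size recorded in the directory entry vs. length of the cluster chain, per file.
missing1=$(( $(clusters_for 6160384) - 5832704 / cluster ))
missing2=$(( $(clusters_for 6256337) - 6225920 / cluster ))

echo "Turn Up The Night: $missing1 clusters missing from the chain"   # 10
echo "Evil Eye: $missing2 clusters missing from the chain"            # 1
```

Both directory entries claim more clusters than their chains actually contain, which is what one would expect if the size was updated but the chain was never extended.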

FAT32 is known to be very vulnerable to power outages and the like. If you need better safety for your data, you should use a journalling file system like ext3 or ReiserFS (or NTFS for Windows systems). These are specially designed to handle such incidents, while FAT is not and may easily lose data.

Bogdan Butnaru (bogdanb) said :
#11

I know the principle of it. The problem is that I didn't have any crash or even any error messages, at least not until the partition started to become read-only. I do suspect some sort of brown-out on the drive, because it has a very flimsy power connector. At least, I hope that was all.

I know the thing about journaling, but right now I can't use anything but FAT32, because I need the drive to be readable everywhere. As soon as I have the money, though, I'm getting another one to use for backup.

LKRaider (paul-eipper) said :
#12

I have the EXACT same issue after updating to Feisty from Dapper.

My bet is on a bug in the vfat driver somewhere.

Bogdan Butnaru (bogdanb) said :
#13

I forgot to post my results: it turned out that I had a few errors on the file-system, probably because I disconnected the drive without unmounting it, by mistake. As far as I could tell, when the filesystem driver detects some error on a mounted drive it turns it read-only to prevent further damage.
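
This would also explain why `mount` and /etc/mtab still showed "rw" in my original question: as far as I know, on systems like this one /etc/mtab is an ordinary file updated by the mount command, so it does not notice when the kernel itself switches a filesystem to read-only, whereas /proc/mounts reports the kernel's own view. A minimal sketch of a check based on that; the function names are made up for illustration:

```shell
# Hypothetical helpers; both read /proc/mounts-style text on stdin.

# Print the option string (4th field) for the given mount point.
mount_opts() {
    awk -v mp="$1" '$2 == mp { print $4 }'
}

# Succeed if the option list for the mount point contains "ro".
is_read_only() {
    case ",$(mount_opts "$1")," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}

# On a live system:
#   is_read_only /media/LACIE < /proc/mounts && echo "kernel has remounted it read-only"
```

Reading from stdin keeps the helpers usable on a saved copy of /proc/mounts as well as on the live file.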

In my case I solved this by unmounting the drive and using fsck to fix the errors (just a few files that were probably written to when the drive was pulled out). Depending on your filesystem, you could try "fsck -n [device-file]" to see if there are any errors, and then attempt to fix them. Whenever possible make backups!
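
For a FAT volume, the check-then-repair sequence described above could look like the following. It is a sketch, not a tested recovery procedure: the function name is made up, /dev/sdb1 is simply the device from this thread, and the function only prints the commands so they can be reviewed before being run as root. `-n` (check without writing) and `-a` (repair automatically) are fsck.vfat options; `-n` is the one used earlier in the thread.

```shell
# Hypothetical helper: print a cautious check-then-repair sequence for a FAT volume.
# Nothing is executed here; review the lines, then run them by hand as root.
fat_repair_plan() {
    dev="$1"
    printf 'umount %s\n' "$dev"             # fsck should not run on a mounted filesystem
    printf 'fsck.vfat -v -n %s\n' "$dev"    # -n: report problems without writing anything
    printf 'fsck.vfat -v -a %s\n' "$dev"    # -a: repair automatically, only after reviewing the report
}

fat_repair_plan /dev/sdb1
```

Printing instead of executing is a deliberate choice here: the repair step changes the disk, so it seems safer to read the dry-run report before deciding whether to go ahead.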

Robert C. Mullins (mullins) said :
#14

I have a very similar problem on an NTFS drive. I thought it was just Linux having trouble with the drive, but as it turns out, I have the exact same problem as you, only NTFS style.

My question is this:

Can I unmount and run fsck on it?

I think what I did was unplug the drive BEFORE unmounting. I used to do this a lot when I was running it on the Windows system, and I never seemed to have any problems. As a newb on Linux I am learning to start following instructions. ;)

Bogdan Butnaru (bogdanb) said :
#15

For an NTFS disk it's probably a bit more complicated, because support for NTFS isn't as advanced (I think) as that for the other filesystems.

You can try the steps above, but I think it's a bit dangerous, and it might not work anyway. Your best bet is to mount the disk on a Windows computer and run chkdsk (the NT-era counterpart of scandisk) from there.

By the way, in the future, _always_ unmount a disk before removing it. Doing otherwise is a sure recipe for disk errors. (Actually, if you mount the disk read-only, you can unplug it in relative safety, but that's rarely worth the bother.)

Robert C. Mullins (mullins) said :
#16

Bogdan:

Thanks for the input. I was able to recover all of the information (copy-paste) with the exception of the corrupted files. I am going to go ahead and reformat the entire thing. It's been several years since I worked with Linux, so I am very rusty. Would you happen to know a good online resource for reformatting my drive to FAT32 or something worthwhile?

External hdd is a 128 gig Western Digital.

Thanks in advance.

For anyone who is interested, here is my thread on the Ubuntu Forums. Mine isn't the only one, seems that many are having mounting issues with the Feisty build, particularly with NTFS.
http://ubuntuforums.org/showthread.php?t=419692

alexicon (alexicon) said :
#17

I have a very similar problem to yours, and it seems to have evolved into something more complicated here.

I have a Seagate FreeAgent Go 160 GB drive. It came with a bunch of Windows stuff and was formatted NTFS. I installed ntfs-progs or ntfs-3g, as I think Ubuntu calls them, and for a little while the drive worked perfectly.

Then yesterday I worked it a bit more and tried backing up some larger directories, and suddenly it all went wrong. Loads of directories simply disappeared, even directories in the root of the filesystem which I hadn't touched before. I rebooted, checked the drive in Windows, and looked for .trash folders in case I had accidentally removed something: nothing. Stuff was just gone. After 2-3 reboots I was left with only one of the four directories I was backing up to :( So I copied off that final directory, and more than half of the stuff that was in it was gone too.

In my annoyance I just blamed the dodgy NTFS drivers and reformatted the whole drive into one FAT32 partition. I plugged it back in and started backing up again. About 10 minutes into backing up all my dotfiles, I suddenly started getting write errors! Then every time I plugged in the drive I got write errors. I had no idea what was wrong and again blamed Windows filesystems, although my FAT32 iPod still works perfectly.

So this time I reformatted it to ext3, as I don't really use Windows anyway, and this time I didn't even get it to write properly once. After playing a little I noticed I could write if I used sudo, but I shouldn't have to do that and have never needed to before. I'm not sure what has corrupted this process...

Right now the drive has been plugged in for several hours, something my iPod never had problems with. I tried clicking on the one file I transferred earlier and it disappeared from view. I refreshed and some broken-icon thing came up; I clicked that and it disappeared too.

dmesg says things like:
[ 128.300000] kjournald starting. Commit interval 5 seconds
[ 128.300000] EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
[ 128.300000] EXT3 FS on sdb1, internal journal
[ 128.300000] EXT3-fs: recovery complete.
[ 128.304000] EXT3-fs: mounted filesystem with ordered data mode.
[ 133.232000] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[ 143.520000] eth1: no IPv6 routers present
[ 148.980000] ADDRCONF(NETDEV_UP): eth1: link is not ready
[ 150.144000] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[ 167.904000] eth1: no IPv6 routers present
[ 2126.352000] sd 5:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
[ 2126.352000] Additional sense: Logical unit not ready, initializing command required
[ 2126.352000] end_request: I/O error, dev sdb, sector 12375
[ 2126.352000] EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
[ 2126.360000] sd 5:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
[ 2126.360000] Additional sense: Logical unit not ready, initializing command required
[ 2126.360000] end_request: I/O error, dev sdb, sector 63
[ 2126.360000] Buffer I/O error on device sdb1, logical block 0
[ 2126.360000] lost page write due to I/O error on sdb1
[ 2126.372000] sd 5:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
[ 2126.372000] Additional sense: Logical unit not ready, initializing command required
[ 2126.372000] end_request: I/O error, dev sdb, sector 12375
[ 2146.484000] kjournald starting. Commit interval 5 seconds
[ 2146.484000] EXT3 FS on sda9, internal journal
[ 2146.484000] EXT3-fs: mounted filesystem with ordered data mode.
[ 7115.820000] ipw3945: Microcode SW error detected. Restarting.
[ 7115.820000] ipw3945: request scan called when driver not ready.
[ 7116.820000] ipw3945: Can't stop Rx DMA.
[ 7117.100000] ipw3945: Detected geography ABG (13 802.11bg channels, 12 802.11a channels)
[ 7118.624000] ipw3945: association process canceled
[22131.992000] sd 5:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
[22131.992000] Additional sense: Logical unit not ready, initializing command required
[22131.992000] end_request: I/O error, dev sdb, sector 12479
[22131.996000] sd 5:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
[22131.996000] Additional sense: Logical unit not ready, initializing command required
[22131.996000] end_request: I/O error, dev sdb, sector 12495
[22131.996000] Buffer I/O error on device sdb1, logical block 1554
[22131.996000] lost page write due to I/O error on sdb1
[22131.996000] Aborting journal on device sdb1.
[22166.412000] sd 5:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
[22166.412000] Additional sense: Logical unit not ready, initializing command required
[22166.412000] end_request: I/O error, dev sdb, sector 8279
[22166.412000] Buffer I/O error on device sdb1, logical block 1027
[22166.412000] lost page write due to I/O error on sdb1
[22890.540000] sd 5:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
[22890.540000] Additional sense: Logical unit not ready, initializing command required
[22890.540000] end_request: I/O error, dev sdb, sector 12375
[22890.540000] EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
[22890.544000] sd 5:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
[22890.544000] Additional sense: Logical unit not ready, initializing command required
[22890.544000] end_request: I/O error, dev sdb, sector 63
[22890.544000] Buffer I/O error on device sdb1, logical block 0
[22890.544000] lost page write due to I/O error on sdb1
[22897.724000] sd 5:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
[22897.724000] Additional sense: Logical unit not ready, initializing command required
[22897.724000] end_request: I/O error, dev sdb, sector 12375
[22897.748000] EXT3-fs error (device sdb1): ext3_find_entry: reading directory #2 offset 0
[22899.884000] ext3_abort called.
[22899.884000] EXT3-fs error (device sdb1): ext3_journal_start_sb: Detected aborted journal
[22899.884000] Remounting filesystem read-only

I'm not sure what's going on here, but I've tried three different filesystems on this drive. I'm not sure if it's the drive failing, although it's brand new; I only got it last week. There have been some suspect things going on with my system, though: yesterday I was unable to navigate any drop-down menus from the GNOME panel [Applications, Places, System, Preferences, or even the Beryl manager options], although I rebooted and it was fine again [except for failing to boot cleanly because fsck now complains about one of my WinXP partitions, which I just ignored for now].

I just did a fresh install of Feisty and will try reformatting this drive yet again and using it there, to see if I can get more consistent results out of the drive. I might also try another distro and see if I can get this thing working properly... very strange.

feisty fawn
 2.6.20-16-generic #2 SMP Fri Aug 31 00:55:27 UTC 2007 i686 GNU/Linux

Jay R. Wren (evarlast) said :
#18

I am having similar problems with a Seagate FreeAgent Pro 750G using ext3.

So my issue is definitely not with vfat.

[53476.014029] sd 3:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
[53476.014038] Additional sense: Logical unit not ready, initializing command required
[53476.014045] end_request: I/O error, dev sdc, sector 12375
[53476.022142] sd 3:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
[53476.022150] Additional sense: Logical unit not ready, initializing command required
[53476.022156] end_request: I/O error, dev sdc, sector 8279
[53476.022517] EXT3-fs error (device dm-3): ext3_get_inode_loc: unable to read inode block - inode=2, block=1027
[53476.022529] Aborting journal on device dm-3.
[53476.035118] sd 3:0:0:0: Device not ready: <6>: Current: sense key: Not Ready
[53476.035125] Additional sense: Logical unit not ready, initializing command required
[53476.035131] end_request: I/O error, dev sdc, sector 12423
[53476.035137] Buffer I/O error on device dm-3, logical block 1545
[53476.035140] lost page write due to I/O error on dm-3
[53476.035626] Remounting filesystem read-only

I believe this may have happened during or right after the fsck.

Is there a problem with usb or usb-storage?

alexicon (alexicon) said :
#19

I'm still uncertain whether I really had a problem with the NTFS drivers on my FreeAgent Go, but in the end the problem was completely solved by altering the udev entry for the drive. Please refer to this thread on Ubuntu Forums, which solved my issue, especially if you are using a FreeAgent.

http://ubuntuforums.org/showthread.php?t=494673