fsck.ext4 fails to fix multiply-claimed blocks: can't find dup_blk

Bug #1321418 reported by Forest
This bug affects 5 people
Affects              Status         Importance   Assigned to   Milestone
e2fsprogs (Ubuntu)   Invalid        Undecided    Unassigned
Precise              Fix Released   Undecided    Seyeong Kim
Trusty               Fix Released   Undecided    Seyeong Kim

Bug Description

[SRU justification]

[Impact]

The last few times my root ext4 filesystem had its regularly-scheduled boot-time check, errors were reported. The first time it happened, I simply told the system to fix the errors, but since they kept coming up again, I decided to look more closely. I booted from a live USB drive, assembled my raid partitions, and ran fsck.ext4 manually.

Without any options, fsck.ext4 simply reported that the filesystem was clean, and exited. Things got more interesting when I ran it with -f. It reported several multiply-claimed blocks, and when I told fsck to go ahead and clone them, it failed with an internal error. Repeated runs of fsck revealed that the filesystem was still not fixed, and repeated attempts to fix the problem also failed, reporting that the multiply-claimed blocks were already reassigned or cloned.

I was lucky in that the files in question were unimportant, so deleting one of them and running fsck again seems to have fixed my problem this time. However, fsck's internal error and its failure to fix the problem in the first place are worrisome.

[Test Case]

Here's the full output:

$ sudo fsck.ext4 -f /dev/md/1
e2fsck 1.42.8 (20-Jun-2013)
Pass 1: Checking inodes, blocks, and sizes

Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
Multiply-claimed block(s) in inode 27528089: 73467268 9287 9288 9289 9290 9291
9292 9293
Multiply-claimed block(s) in inode 27528105: 73467268 9287 9288 9289 9290 9291
9292 9293
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
(There are 2 inodes containing multiply-claimed blocks.)

File
/home/user/dir/subdir/05.09.2013_13.15.48.300.jpg
(inode #27528089, mod time Mon Jan 13 02:50:08 2014)
  has 7 multiply-claimed block(s), shared with 1 file(s):
        /home/user/.thumbnails/normal/51048a1138d61df87bf3fdc7deed50e3.png/WebpageIcons.db
(inode #27528105, mod time Mon Jan 13 02:50:08 2014)
Clone multiply-claimed blocks<y>? yes
clone_file_block: internal error: can't find dup_blk for 73467268

clone_file_block: internal error: can't find dup_blk for 73467268

File
/home/user/.thumbnails/normal/51048a1138d61df87bf3fdc7deed50e3.png/WebpageIcons.db
(inode #27528105, mod time Mon Jan 13 02:50:08 2014)
  has 7 multiply-claimed block(s), shared with 1 file(s):
        /home/user/dir/subdir/05.09.2013_13.15.48.300.jpg
(inode #27528089, mod time Mon Jan 13 02:50:08 2014)
Multiply-claimed blocks already reassigned or cloned.

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

root: ***** FILE SYSTEM WAS MODIFIED *****
root: 1038356/30285824 files (0.3% non-contiguous), 106319516/121119454 blocks
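
To see at a glance which blocks the two inodes have in common, the Pass 1B lines above can be parsed mechanically. The following is a minimal sketch, not part of e2fsprogs: the function name `shared_blocks` and the `SAMPLE` string are illustrative, and it assumes the line format shown in this report, including block lists wrapped onto a continuation line.

```python
import re
from collections import defaultdict

def shared_blocks(fsck_output):
    """Map each block number to the inodes reported as claiming it,
    keeping only blocks claimed by more than one inode."""
    claims = defaultdict(set)
    inode = None
    for line in fsck_output.splitlines():
        m = re.match(r"Multiply-claimed block\(s\) in inode (\d+):(.*)", line)
        if m:
            inode = int(m.group(1))
            rest = m.group(2)
        elif inode is not None and line.strip() and re.fullmatch(r"[\d\s]+", line):
            rest = line  # wrapped continuation of the previous block list
        else:
            inode = None
            continue
        for blk in rest.split():
            claims[int(blk)].add(inode)
    return {blk: inodes for blk, inodes in claims.items() if len(inodes) > 1}

# Pass 1B lines copied from the output above.
SAMPLE = """\
Multiply-claimed block(s) in inode 27528089: 73467268 9287 9288 9289 9290 9291
9292 9293
Multiply-claimed block(s) in inode 27528105: 73467268 9287 9288 9289 9290 9291
9292 9293
"""

result = shared_blocks(SAMPLE)
for blk in sorted(result):
    print(blk, sorted(result[blk]))
```

On this report's output every listed block, including 73467268 (the one named in the clone_file_block error), is claimed by both inodes.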

[Regression Potential]

[Other Info]

backported from upstream
https://kernel.googlesource.com/pub/scm/fs/ext2/e2fsprogs/+/9a1d614df217c02ea6b2cb0072fccfe706aea111
https://kernel.googlesource.com/pub/scm/fs/ext2/e2fsprogs/+/84397754250d13e8596dd68c157c4c9863800079%5E%21/#F0

Forest (foresto)
summary: - fsck.ext4 fails to fix multiply-claimed blocks
+ fsck.ext4 fails to fix multiply-claimed blocks: can't find dup_blk
Revision history for this message
Theodore Ts'o (tytso) wrote :

Can you send me the output of the following:

debugfs /dev/md/1
debugfs: stat <27528105>
debugfs: stat <27528089>
debugfs: quit

Thanks!!

Revision history for this message
Forest (foresto) wrote :

(Keep in mind that I'm running debugfs after having deleted the file at inode 27528089 using rm.)

debugfs: stat <27528105>

Inode: 27528105 Type: regular Mode: 0600 Flags: 0x80000
Generation: 2927505238 Version: 0x00000000:00000001
User: 1000 Group: 1000 Size: 28672
File ACL: 0 Directory ACL: 0
Links: 2 Blockcount: 64
Fragment: Address: 0 Number: 0 Size: 0
 ctime: 0x52d35460:6c6ed008 -- Sun Jan 12 18:50:08 2014
 atime: 0x537a59e3:7b6818f8 -- Mon May 19 12:22:11 2014
 mtime: 0x52d35460:6c6ed008 -- Sun Jan 12 18:50:08 2014
crtime: 0x4e05aecd:1ddc9980 -- Sat Jun 25 02:47:57 2011
Size of extra inode fields: 28
EXTENTS:
(ETB0):73467268, (0):9280, (1):9281, (2):9282, (3):9283, (4):9284, (5):9285, (6):9286

stat <27528089>

Inode: 27528089 Type: regular Mode: 0600 Flags: 0x80000
Generation: 1720399890 Version: 0x00000000:00000001
User: 1000 Group: 1000 Size: 42789
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 88
Fragment: Address: 0 Number: 0 Size: 0
 ctime: 0x537ba775:61b6f090 -- Tue May 20 12:05:25 2014
 atime: 0x537ba763:3cca25fc -- Tue May 20 12:05:07 2014
 mtime: 0x537ba6bf:bec9095c -- Tue May 20 12:02:23 2014
crtime: 0x537ba6bf:bec9095c -- Tue May 20 12:02:23 2014
Size of extra inode fields: 28
EXTENTS:
(0-10):110330351-110330361
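
The hex values that debugfs prints before each date can be decoded by hand, which helps when comparing them with the "mod time" lines from e2fsck. Below is a minimal sketch: the helper name `decode_debugfs_time` is made up for illustration, and the split of the second word into 2 epoch-extension bits and 30 nanosecond bits is an assumption about the ext4 extra-timestamp layout rather than something stated in this report.

```python
from datetime import datetime, timezone

def decode_debugfs_time(field):
    """Decode a debugfs 'seconds:extra' hex pair into (UTC datetime, nanoseconds)."""
    sec_hex, extra_hex = field.split(":")
    seconds = int(sec_hex, 16)
    extra = int(extra_hex, 16)
    # Assumption: low 2 bits extend the 32-bit seconds field,
    # the upper 30 bits hold nanoseconds.
    seconds += (extra & 0x3) << 32
    nanoseconds = extra >> 2
    return datetime.fromtimestamp(seconds, tz=timezone.utc), nanoseconds

# mtime of inode 27528105 as printed by debugfs above.
when, nsec = decode_debugfs_time("0x52d35460:6c6ed008")
print(when.isoformat(), nsec)
```

The seconds field decodes to 2014-01-13 02:50:08 UTC, which matches the "mod time Mon Jan 13 02:50:08 2014" that e2fsck printed for the same inode. The "Sun Jan 12 18:50:08 2014" shown by debugfs is the same instant eight hours behind UTC, presumably because the two tools were run with different timezone settings.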

Revision history for this message
Theodore Ts'o (tytso) wrote :

Right, I didn't notice the first time I read the bug report that you had already deleted the inode to fix the problem. It would have been nice to have gotten a look at the inode in question so I could really see what was going on.

The "internal error: can't find dup_blk" error is one of those "this should never happen" situations, so it would be good to understand how and why it happened, and hopefully figure out how to reproduce it.

Can you send me the output of dumpe2fs -h /dev/md/1, just for the record?

Thanks!!

Revision history for this message
Forest (foresto) wrote :

Yeah, sorry about that. If I had known you were likely to respond so quickly, I might have left it alone until I heard from you.

dumpe2fs 1.42.8 (20-Jun-2013)
Filesystem volume name: root
Last mounted on: /
Filesystem UUID: 3dd3e793-f873-4db2-8c68-801b217ba06e
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Remount read-only
Filesystem OS type: Linux
Inode count: 30285824
Block count: 121119454
Reserved block count: 2422389
Free blocks: 14803241
Free inodes: 29247104
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 995
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Fri Jun 24 17:47:00 2011
Last mount time: Tue May 20 12:03:32 2014
Last write time: Tue May 20 12:03:31 2014
Mount count: 2
Maximum mount count: -1
Last checked: Tue May 20 11:10:00 2014
Check interval: 2419200 (4 weeks)
Next check after: Tue Jun 17 11:10:00 2014
Lifetime writes: 6072 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
First orphan inode: 27269186
Default directory hash: half_md4
Directory Hash Seed: 814029bb-cac6-432f-9a6b-a7924a5e8f4a
Journal backup: inode blocks
Journal features: journal_incompat_revoke
Journal size: 128M
Journal length: 32768
Journal sequence: 0x00b205bb
Journal start: 237
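
For readers who want to sanity-check the numbers in this header, here is a minimal sketch that parses the "Key: value" lines and derives a few quantities. The function name `parse_dumpe2fs_header` and the trimmed `HEADER` sample are illustrative only; the values are copied from the output above.

```python
def parse_dumpe2fs_header(text):
    """Turn 'Key: value' lines from dumpe2fs -h into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields

# A few lines copied from the header above.
HEADER = """\
Block count: 121119454
Free blocks: 14803241
Block size: 4096
Check interval: 2419200 (4 weeks)
"""

fields = parse_dumpe2fs_header(HEADER)
block_size = int(fields["Block size"])
total_bytes = int(fields["Block count"]) * block_size
free_bytes = int(fields["Free blocks"]) * block_size
interval_days = int(fields["Check interval"].split()[0]) // 86400

print(f"size: {total_bytes / 2**30:.1f} GiB, free: {free_bytes / 2**30:.1f} GiB")
print(f"check interval: {interval_days} days")
```

The check interval of 2419200 seconds is 28 days, which is consistent with "Last checked" (Tue May 20) and "Next check after" (Tue Jun 17) being exactly four weeks apart.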

Revision history for this message
Theodore Ts'o (tytso) wrote :

Thanks, I was able to reproduce the problem using a test file system. The problem appears to have been caused by an extent-mapped inode getting written (or copied) to a different/wrong place in the inode table. I should be able to get this fixed now that I have a simple repro.

Changed in e2fsprogs (Ubuntu):
assignee: nobody → Theodore Ts'o (tytso)
status: New → Confirmed
Revision history for this message
Theodore Ts'o (tytso) wrote :
Revision history for this message
Forest (foresto) wrote :

That's great news. Thanks for the quick response!

Revision history for this message
bigeel (unagi-pie) wrote :

Hi,
I got the same error message while fsck'ing one of my disks.

I did not fully understand the cause of the problem. Did fsck itself cause the file-system corruption (because of an "extent-mapped inode getting written (or copied) to a different/wrong place in the inode table")?

This ticket is labelled e2fsprogs. Could you please explain whether the planned fix is:
- to avoid file-system corruption
- to enable fsck to fix it
- something else?

Revision history for this message
Theodore Ts'o (tytso) wrote :

The file system which I posted in #6 no longer triggers a bug in 1.42.12. It looks like it was fixed as a side effect of commit 9a1d614df217. @bigeel, what version of e2fsck were you using? Could you try using e2fsprogs 1.42.12 and see if that fixes the problem for you?

Thanks!!

Revision history for this message
bigeel (unagi-pie) wrote :

Hi Theodore,

Thanks for your really fast answer. I do not know how to get the fsck version number, but I had the following output:
  fsck from util-linux 2.20.1
  e2fsck 1.42 (29-Nov-2011)

However, I ran `e2fsck f_dup5.img` on your attachment from #6 and no error was reported, so I do not know whether I can trust the "1.42 (29-Nov-2011)" part:
  e2fsck 1.42 (29-Nov-2011)
  f_dup5.img: clean, 13/16 files, 43/100 blocks

Anyway, I do not think I am willing to run fsck again as it takes 4 days to complete, and the data is not that important.

What I am more worried about is what could have caused it, and how serious this is. Do you have a clue?
Would deleting (or cp file bak; rm file; mv bak file) solve the problem?
(at least I have done some cp/rm and the number of errors decreased, but I wonder whether it’s just shallow or the real problem is gone)

Best

Changed in e2fsprogs (Ubuntu):
status: Confirmed → In Progress
assignee: Theodore Ts'o (tytso) → Rafael David Tinoco (inaddy)
tags: added: cts
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Hello, for all those interested..

I'll be providing the fix by backporting/cherry-picking commit 9a1d614df217 from upstream mentioned by Theo so we can create a SRU for this...

inaddy@xxx:~/sources/upstream/e2fsprogs$ git tag --contains 9a1d614df217
v1.42.12

From rmadison all affected versions are:

- Precise (1.42-1ubuntu2.2)
- Trusty (1.42.9-3ubuntu1.2)
- Utopic (1.42.10-1.1ubuntu1)

Vivid is ok (1.42.12-1ubuntu2)
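
The affected/unaffected split above follows from comparing each release's upstream version with v1.42.12, the tag that contains the commit. A minimal sketch of that comparison is below. It is a simplification: `upstream_version` is a hypothetical helper that just strips the Debian revision after the last hyphen, it does not implement full Debian version ordering, and it cannot see fixes that a distribution has backported into an older upstream version.

```python
def upstream_version(package_version):
    """Return the upstream part of a package version as a tuple of ints,
    e.g. '1.42.9-3ubuntu1.2' -> (1, 42, 9)."""
    upstream = package_version.rsplit("-", 1)[0]
    return tuple(int(part) for part in upstream.split("."))

FIXED_IN = (1, 42, 12)  # upstream tag containing commit 9a1d614df217

# Versions as listed above.
releases = {
    "Precise": "1.42-1ubuntu2.2",
    "Trusty": "1.42.9-3ubuntu1.2",
    "Utopic": "1.42.10-1.1ubuntu1",
    "Vivid": "1.42.12-1ubuntu2",
}

affected = {name: upstream_version(v) < FIXED_IN for name, v in releases.items()}
for name, is_affected in affected.items():
    print(name, "affected" if is_affected else "ok")
```

Comparing tuples of integers avoids the string-comparison trap where "1.42.9" would sort after "1.42.12".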

I already cherry-picked the commit and I'm solving conflicts. Then I'll test the fsck with the image provided by Theo to see if it solves the issue and ask for SRU.

Thank you!

Rafael Tinoco

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Okay, I have provided one PPA for precise for this particular case:

https://launchpad.net/~inaddy/+archive/ubuntu/lp1321418

Cherry-picking 2 needed patches from upstream.

I have also tested fsck against the filesystem image that Theo provided, and the version from the PPA fixes the problem.

Second fsck run shows:

inaddy@xxx:~/filesystem$ fsck.ext4 -f ./f_dup5.img
e2fsck 1.42 (29-Nov-2011)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
./f_dup5.img: 13/16 files (15.4% non-contiguous), 65/100 blocks

No errors.

I'm waiting for feedback from some users to provide debdiffs to be proposed as SRU for Precise, Trusty and Utopic.

Thank you very much for reporting this.

Rafael Tinoco

Louis Bouchard (louis)
no longer affects: e2fsprogs (Ubuntu Vivid)
Seyeong Kim (seyeongkim)
Changed in e2fsprogs (Ubuntu):
status: In Progress → Invalid
Changed in e2fsprogs (Ubuntu Precise):
status: New → In Progress
Changed in e2fsprogs (Ubuntu Trusty):
status: New → Incomplete
status: Incomplete → In Progress
Changed in e2fsprogs (Ubuntu Precise):
assignee: nobody → Seyeong Kim (xtrusia)
Changed in e2fsprogs (Ubuntu Trusty):
assignee: nobody → Seyeong Kim (xtrusia)
Changed in e2fsprogs (Ubuntu):
assignee: Rafael David Tinoco (inaddy) → nobody
description: updated
Revision history for this message
Seyeong Kim (seyeongkim) wrote :
description: updated
tags: added: sts
removed: cts
Revision history for this message
Seyeong Kim (seyeongkim) wrote :
Revision history for this message
Chris J Arges (arges) wrote :

Sponsored for P/T.

Revision history for this message
Chris J Arges (arges) wrote : Please test proposed package

Hello Forest, or anyone else affected,

Accepted e2fsprogs into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/e2fsprogs/1.42.9-3ubuntu1.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in e2fsprogs (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed
Changed in e2fsprogs (Ubuntu Precise):
status: In Progress → Fix Committed
Revision history for this message
Chris J Arges (arges) wrote :

Hello Forest, or anyone else affected,

Accepted e2fsprogs into precise-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/e2fsprogs/1.42-1ubuntu2.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Revision history for this message
Forest (foresto) wrote :

Chris, I have no way to test this, because I no longer have a filesystem with multiply-claimed blocks.

Revision history for this message
Simon Déziel (sdeziel) wrote :

This verified fine on Precise and Trusty using the FS image from comment #6. Thank you

tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package e2fsprogs - 1.42-1ubuntu2.3

---------------
e2fsprogs (1.42-1ubuntu2.3) precise; urgency=low

  * fix rule-violating lblk->pblk mappings on bigalloc filesystems (LP: #1321418)

 -- Seyeong Kim <email address hidden> Tue, 01 Sep 2015 10:57:56 -0500

Changed in e2fsprogs (Ubuntu Precise):
status: Fix Committed → Fix Released
Revision history for this message
Chris J Arges (arges) wrote : Update Released

The verification of the Stable Release Update for e2fsprogs has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package e2fsprogs - 1.42.9-3ubuntu1.3

---------------
e2fsprogs (1.42.9-3ubuntu1.3) trusty; urgency=medium

  * fix rule-violating lblk->pblk mappings on bigalloc filesystems (LP: #1321418)

 -- Seyeong Kim <email address hidden> Tue, 01 Sep 2015 07:08:12 -0500

Changed in e2fsprogs (Ubuntu Trusty):
status: Fix Committed → Fix Released
Changed in e2fsprogs (Ubuntu):
assignee: nobody → Jouan Oceane (jouanoceane)
assignee: Jouan Oceane (jouanoceane) → nobody