How do I rebuild corrupted indices?

Asked by Ted Gould

I have a Bazaar repository that got corrupted when my laptop ran out of battery while I was committing to it. Several files now return an error when I simply call cat on them. They are:

cat: 01ef0ba809d28d85ed4e96154ff6924e.cix: Input/output error
cat: 01ef0ba809d28d85ed4e96154ff6924e.rix: Input/output error
cat: 01ef0ba809d28d85ed4e96154ff6924e.six: Input/output error

It appears that the associated pack file is okay; at least it doesn't return an I/O error.

How do I rebuild these index files?

Thank you,
Ted
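
The check described in the question (running cat on each file and watching for I/O errors) can be sketched in Python. This is only an illustration: the function name find_unreadable is hypothetical, and it reports any file whose contents cannot be read in full, whatever the underlying cause.

```python
import os


def find_unreadable(directory):
    """Return the sorted names of regular files in `directory` that cannot be read."""
    bad = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and other non-files
        try:
            with open(path, "rb") as f:
                # Read in chunks until EOF so the whole file is touched, like cat.
                while f.read(1 << 20):
                    pass
        except (IOError, OSError):
            bad.append(name)
    return bad
```

Calling find_unreadable('.bzr/repository/indices') from the branch root would list the damaged index files, assuming the standard pack repository layout.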

Question information

Language:
English
Status:
Solved
For:
Bazaar
Assignee:
No assignee
Solved by:
Ted Gould
Revision history for this message
Robert Collins (lifeless) said :
#1

You can't rebuild index files at the moment - they are part of the data structure. We can back out the transaction though.

Firstly, check that it's only a single commit in there - bzr dump-btree 01ef0ba809d28d85ed4e96154ff6924e.iix - that should only return one record.

John A Meinel (jameinel) said :
#2

Robert Collins wrote:
> Question #96346 on Bazaar changed:
> https://answers.edge.launchpad.net/bzr/+question/96346
>
> Status: Open => Needs information
>
> Robert Collins requested for more information:
> You can't rebuild index files at the moment - they are part of the
> data structure. We can back out the transaction though.
>
> Firstly, check that it's only a single commit in there - bzr dump-btree
> 01ef0ba809d28d85ed4e96154ff6924e.iix - that should only return one
> record
>

Actually, we probably *could* for those indices, but I don't think it is
worth the effort (yet).

I agree that just marking it as not-present and then deleting the files
may be a better way forward. Then you can 'bzr pull' from whatever
source you got them from and start over again.

Though if it was a *commit*, then the specific info would be lost. The
working tree may be in a valid state to just commit again, though.

John
=:->

Ted Gould (ted) said :
#3

It seems that the iix file might be corrupted:

bzr dump-btree 01ef0ba809d28d85ed4e96154ff6924e.iix
bzr: ERROR: 01ef0ba809d28d85ed4e96154ff6924e.iix is not an index of type <class 'bzrlib.btree_index.BTreeGraphIndex'>.

I'm guessing the filesystem's recovery of the file failed. Here are the first few lines:

# libdatetime.la - a libtool library file
# Generated by ltmain.sh (GNU libtool) 2.2.6 Debian-2.2.6a-4
#
# Please DO NOT delete this file!
# It is necessary for linking the library.

John A Meinel (jameinel) said :
#4

Ted Gould wrote:
> Question #96346 on Bazaar changed:
> https://answers.edge.launchpad.net/bzr/+question/96346
>
> Ted Gould posted a new comment:
> It seems that the iix file might be corrupted:
>
> bzr dump-btree 01ef0ba809d28d85ed4e96154ff6924e.iix
> bzr: ERROR: 01ef0ba809d28d85ed4e96154ff6924e.iix is not an index of type <class 'bzrlib.btree_index.BTreeGraphIndex'>.
>
> I'm guessing the filesystem's recovery of the file failed.
> Here are the first few lines:
>
> # libdatetime.la - a libtool library file
> # Generated by ltmain.sh (GNU libtool) 2.2.6 Debian-2.2.6a-4
> #
> # Please DO NOT delete this file!
> # It is necessary for linking the library.
>

That is *definitely* not a bzr index file. :)

Can you do 'head -n 5' on the pack files as well?

I'm a bit concerned that you have some rather severe corruption, given
that file contents are not what they seem...

Make sure that "bzr dump-btree .bzr/repository/pack-names" works, and
look for any temp files with 'pack-names' as part of the name (in
.bzr/repository).

It shouldn't be hard to remove a single offending pack file.

John
=:->
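
The 'head' check suggested above can also be done by comparing the start of each file with the signature a Bazaar file is expected to carry. This is a rough sketch, not part of bzr: the two signature strings are written from memory of bzrlib's btree_index.py and pack.py and should be treated as assumptions to verify against your bzr version, and looks_valid is a hypothetical helper name.

```python
# Assumed signatures, recalled from bzrlib's btree_index.py and pack.py;
# verify them against your bzr version before relying on this check.
BTREE_SIGNATURE = b"B+Tree Graph Index 2\n"
PACK_SIGNATURE = b"Bazaar pack format 1 (introduced in 0.18)\n"

INDEX_SUFFIXES = (".cix", ".iix", ".rix", ".six", ".tix")


def looks_valid(path):
    """Return True or False for index and pack files, None for other file names."""
    if path.endswith(INDEX_SUFFIXES):
        expected = BTREE_SIGNATURE
    elif path.endswith(".pack"):
        expected = PACK_SIGNATURE
    else:
        return None  # not a file type this sketch knows about
    with open(path, "rb") as f:
        # Only the leading bytes are compared; the rest of the file is not checked.
        return f.read(len(expected)) == expected
```

A file that starts with libtool comments, like the .iix file shown earlier, would return False here. A True result only means the header looks right, not that the rest of the file is intact.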

Ted Gould (ted) said :
#5

On Tue, 2010-01-05 at 20:15 +0000, John A Meinel wrote:
> Can you do 'head -n 5' on the pack files as well?
>
> I'm a bit concerned that you have some rather severe corruption, given
> that file contents are not what they seem...
>
> Make sure that "bzr dump-btree .bzr/repository/pack-names" works, and
> look for any temp files with 'pack-names' as part of the name (in
> .bzr/repository).
>
> It shouldn't be hard to remove a single offending pack file.

Yes, that pack looks bad as well. So I guess I need to remove that
also. :(

John A Meinel (jameinel) said :
#6

Ted Gould wrote:
> Question #96346 on Bazaar changed:
> https://answers.launchpad.net/bzr/+question/96346
>
> Status: Answered => Open
>
> Ted Gould is still having a problem:
> On Tue, 2010-01-05 at 20:15 +0000, John A Meinel wrote:
>> Can you do 'head -n 5' on the pack files as well?
>>
>> I'm a bit concerned that you have some rather severe corruption, given
>> that file contents are not what they seem...
>>
>> Make sure that "bzr dump-btree .bzr/repository/pack-names" works, and
>> look for any temp files with 'pack-names' as part of the name (in
>> .bzr/repository).
>>
>> It shouldn't be hard to remove a single offending pack file.
>
> Yes, that pack looks bad as well. So I guess I need to remove that
> also. :(
>

If pack-names is corrupted, then we will have a much harder time
regenerating it, but I think you are saying that only the .pack file is bad.

If that is true, then all we need to do is remove the reference to
01ef0ba809d28d85ed4e96154ff6924e from the pack-names file, and things
should start working again.

I think we'll need some bzrlib magic to do that, though.

John
=:->

Robert Collins (lifeless) said :
#7

Yah - new BTreeIndexBuilder, a filter function to discard this hash, and add_records(iter_all) (thumbnail sketch)

John A Meinel (jameinel) said :
#8

Robert Collins wrote:
> Question #96346 on Bazaar changed:
> https://answers.edge.launchpad.net/bzr/+question/96346
>
> Robert Collins posted a new comment:
> Yah - new BTreeIndexBuilder, a filter function to discard this hash, and
> add_records(iter_all) (thumbnail sketch)
>

I've done it before manually. Let me see:

from bzrlib import btree_index, transport

t = transport.get_transport('.bzr/repository')
pack_name_index = btree_index.BTreeGraphIndex(t, 'pack-names', None)
pack_name_index.key_count() # load the header
# Build a new index with the same shape as the existing pack-names index.
builder = btree_index.BTreeBuilder(pack_name_index.node_ref_lists,
                                   pack_name_index._key_length)

for node in pack_name_index.iter_all_entries():
    # node is (index, key, value[, refs]); the key is a 1-tuple holding the
    # pack name, so compare its first element.
    if node[1][0] == '01ef0ba809d28d85ed4e96154ff6924e':
        continue  # drop the corrupted pack
    builder.add_node(*node[1:])
res = builder.finish()
# Write to a new name so the original pack-names stays untouched for now.
t.put_file('pack-names2', res)

At that point, you can use 'dump-btree' to make sure that the only
difference is the removal of that one entry, and then 'mv pack-names2
pack-names'.

John
=:->
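
The verification step above (confirming that the only difference between the two files is the removed entry) can be sketched as a comparison of the two 'bzr dump-btree' outputs saved as text. This assumes each entry is printed on its own line and that the pack name appears literally in its line; only_entry_removed is a hypothetical helper, not part of bzrlib.

```python
def only_entry_removed(before_lines, after_lines, pack_hash):
    """True if `after_lines` equals `before_lines` minus the one entry naming `pack_hash`."""
    # What the new dump should contain: everything except the offending entry.
    expected = [line for line in before_lines if pack_hash not in line]
    removed = len(before_lines) - len(expected)
    # Exactly one entry must have been dropped, and nothing else may differ.
    return removed == 1 and sorted(after_lines) == sorted(expected)
```

For example, redirect 'bzr dump-btree .bzr/repository/pack-names' and 'bzr dump-btree .bzr/repository/pack-names2' to two text files, read each with readlines(), and pass the two lists along with the hash. Only do the 'mv' if the result is True.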

Ted Gould (ted) said :
#9

On Tue, 2010-01-05 at 20:27 +0000, John A Meinel wrote:
> If pack-names is corrupted, then we will have a much harder time
> regenerating it, but I think you are saying that only the .pack file is bad.
>
> If that is true, then all we need to do is remove the reference to
> 01ef0ba809d28d85ed4e96154ff6924e from the pack-names file, and things
> should start working again.
>
> I think we'll need some bzrlib magic to do that, though.

Yes, the pack-names file looks fine.

Ted Gould (ted) said :
#10

On Tue, 2010-01-05 at 20:39 +0000, John A Meinel wrote:
> I've done it before manually. Let me see:

Cool, thanks. There was a slight error in the script; I've attached the
updated version that worked for me. It just needs to go one level deeper
into the array.

After removing the pack file I still had the issue that my working trees
were pointing at a revision that no longer existed. So I needed to
figure out how to get back to the previous revision. To do that I looked
at the various indices and found the second most recent index:

   $ cd .bzr/repository/indices
   $ ls -lt *.iix

Then I dumped the B-tree of that file:

   $ bzr dump-btree 2e498f7acd7122676522c7879948c553.iix

Which gave me the revision IDs in that file:

   (('<email address hidden>',), '2043 246 0 294', ((('<email address hidden>',),),))

I could then use that to create my new branch using the last good
revision:

   $ bzr branch -r revid:<email address hidden> build-indicator.old/ build-indicator

And then I had a branch to work with! Thanks to both Robert and John
for giving me a hand here.
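
The 'ls -lt *.iix' step above, which picks out the second most recent index, can be sketched in Python as a sort by modification time. The helper name indices_newest_first is hypothetical, and the directory path is an assumption about the standard pack repository layout.

```python
import glob
import os


def indices_newest_first(indices_dir, suffix=".iix"):
    """Return the index file paths in `indices_dir`, most recently modified first."""
    paths = glob.glob(os.path.join(indices_dir, "*" + suffix))
    # Same ordering as `ls -lt`: newest modification time first.
    return sorted(paths, key=os.path.getmtime, reverse=True)
```

With at least two index files present, indices_newest_first('.bzr/repository/indices')[1] is the second most recent one, which can then be passed to 'bzr dump-btree' as shown above.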

  status solved

Ted Gould (ted) said :
#11

Heh, apparently I have to have text here to click on the solved button :)