Corrupt files in /var/lib/apt/lists, possibly bzip2?

Asked by Mikhail Panchenko

We've been observing apt failures causing Chef to fail on 2 of our Ubuntu 14.04 boxes (ami-e7b8c0d7). It took me a while to get around to investigating it, but today I looked and got a little bit scared.

The error was the same on both nodes, except the filenames were different:

E: Encountered a section with no Package: header
E: Problem with MergeList /var/lib/apt/lists/us-west-2.ec2.archive.ubuntu.com_ubuntu_dists_trusty-updates_main_i18n_Translation-en
E: The package lists or status file could not be parsed or opened.

The other node was complaining about us-west-2.ec2.archive.ubuntu.com_ubuntu_dists_trusty-updates_*universe*_i18n_Translation-en instead.

When I opened the files in Vim, both contained garbage similar to what one would get from opening a binary. The same files on "good" nodes contained only text. Both files have an mtime of 05/21/2015 23:02 UTC.

Neither of the files has the executable bits set, but I'm still very nervous about arbitrary-looking junk data showing up on the server. Is this a normal occurrence with apt? With the EC2 apt mirrors? Could there be some reason that the us-west-2 Ubuntu archive would briefly return garbage data?

OR - should I be treating this as a security incident?

Some details about the files:

Node 1:

$ stat us-west-2.ec2.archive.ubuntu.com_ubuntu_dists_trusty-updates_universe_i18n_Translation-en
  File: ‘us-west-2.ec2.archive.ubuntu.com_ubuntu_dists_trusty-updates_universe_i18n_Translation-en’
  Size: 185646 Blocks: 368 IO Block: 4096 regular file
Device: ca01h/51713d Inode: 18571 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2015-06-08 09:45:17.266505631 +0000
Modify: 2015-05-21 23:02:27.000000000 +0000
Change: 2015-05-22 06:58:45.643041679 +0000
 Birth: -

Node 2:
$ stat us-west-2.ec2.archive.ubuntu.com_ubuntu_dists_trusty-updates_main_i18n_Translation-en
  File: ‘us-west-2.ec2.archive.ubuntu.com_ubuntu_dists_trusty-updates_main_i18n_Translation-en’
  Size: 306424 Blocks: 600 IO Block: 4096 regular file
Device: ca01h/51713d Inode: 7735 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2015-06-08 09:04:32.877855276 +0000
Modify: 2015-05-21 23:02:27.000000000 +0000
Change: 2015-05-22 06:55:24.202443173 +0000
 Birth: -

I am happy to provide full copies of the files to interested parties as well. I have done the basics (hexdump and strings) and found nothing of interest in there. `file` seems to think they're bzip2 archives. Trying `bunzip2` on them results in an error; `bzip2recover` produces:

$ sudo bzip2recover us-west-2.ec2.archive.ubuntu.com_ubuntu_dists_trusty-updates_universe_i18n_Translation-en
bzip2recover 1.0.6: extracts blocks from damaged .bz2 files.
bzip2recover: searching for block boundaries ...
   block 1 runs from 80 to 1485168 (incomplete)
bzip2recover: sorry, I couldn't find any block boundaries.

The fact that it's looking for the first block at bit 80 seems to indicate that the file does match up with the bzip2 format as described in http://en.wikipedia.org/wiki/Bzip2#File_format (16+8+8+48=80 bits of header and block magic).
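
For anyone wanting to check other nodes for the same symptom, here is a minimal sketch that lists the files in a directory that start with the bzip2 magic bytes `BZh`. The function name `find_bzip2_lists` is made up for illustration, and the sketch assumes that healthy, uncompressed list files never start with those three bytes. As a side check, 185646 bytes × 8 = 1485168 bits, which matches the end offset bzip2recover printed for block 1, so the block appears to run to the very end of the file.

```shell
#!/bin/sh
# Print every regular file in a directory whose first three bytes are "BZh".
# The directory defaults to apt's list cache; pass another path to test safely.
find_bzip2_lists() {
    dir="${1:-/var/lib/apt/lists}"
    for f in "$dir"/*; do
        # Skip directories (e.g. "partial") and an unmatched glob.
        [ -f "$f" ] || continue
        magic=$(head -c 3 "$f" 2>/dev/null)
        if [ "$magic" = "BZh" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}
```

Calling `find_bzip2_lists` with no argument scans /var/lib/apt/lists; it only reads files and changes nothing.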

Is it safe to assume that some sort of apt network operation failed resulting in truncated bzip2 files and call it a day? As I dug deeper while writing this post, I started to feel better about it, but the initial "binary in what should be text files" feeling was not a good one.

There's a semi-related ticket at https://bugs.launchpad.net/ubuntu/+source/apt/+bug/346386 but that appears to address a problem where a proxy serves HTML instead of the package listing apt expects. This seems like a different issue.

Both hosts have apt version 1.0.1ubuntu2.

Should I file this as a bug?

Question information

Language: English
Status: Answered
For: Ubuntu apt
Assignee: No assignee

actionparsnip (andrew-woodhead666) said :
#1

cd $HOME
wget https://dl.dropbox.com/u/8850924/fixpackage
chmod +x ./fixpackage
sudo ./fixpackage

hope that helps.

Manfred Hampl (m-hampl) said :
#2

I am inclined to assume that some kind of network problem garbled the data during transfer.
The package management processes include some additional integrity checks (such as md5sums), which apparently identified the files as wrong.

I recommend that you delete the broken files (actionparsnip's command should take care of that), and retry updating.
If that problem re-appears frequently, then a deeper investigation seems necessary.
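
A minimal sketch of that cleanup, assuming you would rather move a suspect file aside than delete it so it stays available for inspection. The function name `quarantine_list` and the holding directory are hypothetical, not part of apt.

```shell
#!/bin/sh
# Move one suspect list file into a holding directory instead of deleting it.
# Usage: quarantine_list <file> <holding-dir>
quarantine_list() {
    file="$1"
    holding="$2"
    # Create the holding directory if needed, then move the file into it.
    mkdir -p "$holding" && mv -- "$file" "$holding"/
}

# Example (needs root; the holding path is illustrative only):
#   quarantine_list /var/lib/apt/lists/<broken file> /root/apt-quarantine
#   apt-get update
```

With the broken file out of the way, the next `apt-get update` should fetch the list again.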

If your trust in the us-west-2.ec2.archive.ubuntu.com server has dropped too much, you could also select a different repository server.

Mikhail Panchenko (mihasya) said :
#3

actionparsnip, I've already "fixed" the issue by moving the files out of the way. No need for `sudo ./somescriptsomeguygavemeontheinternet`, thanks though. Regardless, why would you include `dist-upgrade` in that script? That has nothing to do with my problem; an inexperienced user is going to run that script and probably end up in a world of pain.

Manfred Hampl: I would expect that apt would delete and re-download the file on its own if the "security checks" had worked. Instead, it tried to read the compressed binary file as if it were text, looking for headers. This is precisely why I think it's a bug; seems like "download/decompress of the file failed" should result in "download the file again," not "fail perpetually until manual steps are taken to clear out the file." Is that expectation not correct?

Manfred Hampl (m-hampl) said :
#4

You are addressing the root cause behind bug #346386 (now 6 years old): what to do if the downloaded data is not what the package management programs expect. Modifications to the programs have already reduced how often this problem shows up, but your system shows that it still happens from time to time.

Sorry to say, but I also see a problem with your proposed solution:

... seems like "download/decompress of the file failed" should result in "download the file again," not "fail perpetually until manual steps are taken to clear out the file." ...

What should the program do if it discards the wrong file, tries to re-download, and gets wrong data again? This could lead to an endless loop.
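
One common way around the endless-loop concern is to cap the number of attempts. The sketch below is only an illustration of that idea in shell, not something apt does today; `retry_bounded` is a made-up helper name.

```shell
#!/bin/sh
# Run a command up to <max_tries> times and stop at the first success.
# Returns 0 on success, 1 once all attempts have failed.
# Usage: retry_bounded <max_tries> <command> [args...]
retry_bounded() {
    max_tries="$1"
    shift
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        "$@" && return 0
        attempt=$((attempt + 1))
    done
    return 1
}
```

Something like `retry_bounded 3 some_refresh_command` would give up after three bad downloads and leave the decision to the user, where `some_refresh_command` stands for a hypothetical step that discards the bad file and fetches it again.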

Perhaps a "repair" function should be added to the package management programs, allowing the user to delete the possibly broken files with a single keypress, without needing to issue terminal commands.
