9.10 System Won't Boot. Again!

Asked by klausner

For the <A HREF="http://ubuntuforums.org/showthread.php?t=1353645">third</A>
<A HREF="http://ubuntuforums.org/showthread.php?t=1336988">time</A>
in as many weeks my system running 9.10 karmic-64 won't boot. It
starts, then dumps into a shell. Trying the recovery console doesn't
work either. The errors are:</P>
<P STYLE="margin-left: 0.49in; margin-bottom: 0.2in">init: hwclock
main process (nnn) terminated with status 2<BR>.: can't open
/etc/default/rcS<BR>init: mountall main process (nnn) terminated with
status 2<BR>filesystem check failed</P>
<P STYLE="margin-bottom: 0.2in">Ran fsck manually from the shell
prompt and from the Live CD, found some timestamp errors, fixed them,
nothing else.<BR><BR><BR>
</P>
<P STYLE="margin-bottom: 0.2in">Tried 2.6.24-16, 2.6.24-24,
2.6.28-11, and 2.6.31-16. 31-16 gives the above errors, the older
kernels just lock up and don't even go to a prompt.
</P>
<BR>There are a <A HREF="https://bugs.launchpad.net/ubuntu/karmic/+source/udev/+bug/432155" TARGET="_blank">couple</A>
of <A HREF="https://bugs.launchpad.net/ubuntu/+source/util-linux/+bug/436076" TARGET="_blank">bugs</A>
open for clock update problems back in September, but they look like
they were fixed.
<BR><BR>dmesg output <A HREF="http://ubuntuforums.org/showthread.php?t=1366516">here</A>.<BR><BR>I'm
tired of reinstalling Ubuntu! Is there a fix/workaround for this?

<BR>

Question information

Language: English
Status: Solved
For: Ubuntu yelp
Assignee: No assignee
Solved by: klausner

klausner (klausner) said :
#1

Thought this would take HTML. Apologies. Here is the text, links above.

For the third time in as many weeks my system running 9.10 karmic-64 won't boot. It starts, then dumps into a shell. Trying the recovery console doesn't work either. The errors are:

    init: hwclock main process (nnn) terminated with status 2
    .: can't open /etc/default/rcS
    init: mountall main process (nnn) terminated with status 2
    filesystem check failed

Ran fsck manually from the shell prompt and from the Live CD, found some timestamp errors, fixed them, nothing else.

Tried 2.6.24-16, 2.6.24-24, 2.6.28-11, and 2.6.31-16. 31-16 gives the above errors, the older kernels just lock up and don't even go to a prompt.

There are a couple of bugs open for clock update problems back in September, but they look like they were fixed.

dmesg output here.

I'm tired of reinstalling Ubuntu! Is there a fix/workaround for this?

Stefan Eggers (stefan-eggers) said :
#2

Is the hardware reliable with Ubuntu 9.04 or some other Linux? If not, it might be a simple hardware problem.

Do the logs indicate failed reads on the disk? If it is an ATA or SATA disk, install smartmontools and check the output of this command (replace /dev/sda if necessary):

    sudo smartctl -a /dev/sda | less

Among the output something like this should show up:

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

If there is some error reported in this place, I'd assume the disk is bad and has to be replaced. Also inspect the rest of this command's output for anything that looks suspect and report it back.
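
The check described above can be sketched as a pair of small shell helpers. This is only a sketch: `smart_health` and `smart_suspect` are hypothetical names, the list of attributes is an assumption about which counters deserve a first look, and the layout assumed is the attribute table that `smartctl -A` normally prints for ATA disks (raw value in the tenth column).

```shell
# Hypothetical helper: reads smartctl output on stdin and prints the
# overall health verdict (e.g. PASSED), or UNKNOWN if the line is absent.
smart_health() {
    awk -F': *' '
        /overall-health self-assessment test result/ { print $2; found = 1 }
        END { if (!found) print "UNKNOWN" }
    '
}

# Hypothetical helper: reads the attribute table on stdin and prints any
# of a few sector-related counters whose raw value is not zero.
smart_suspect() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ &&
        $10 + 0 > 0 { print $2 "=" $10 }
    '
}

# Usage (replace /dev/sda if necessary):
#   sudo smartctl -a /dev/sda | smart_health
#   sudo smartctl -A /dev/sda | smart_suspect
```

A PASSED verdict with non-zero sector counters would still be worth reporting back, since the overall verdict only reflects the drive's own thresholds.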

It might also be some other hardware problem. For example, about a decade ago I had a 386 system with FreeBSD on it. Compiling a new kernel failed, reporting that a certain signal was causing the gcc process to terminate. Turning off the cache helped. It seems the stress from kernel compilation caused the cache to fail.

Maybe during the boot process - which has lots of activity going on - something similar happens to your hardware? It need not be the cache; it can be pretty much any part involved in transporting and storing data (disk, SATA cables, RAM, whatever).

As for clock update problems: Which ones are those? If they were just causing the wrong system time, this might give files dated in the future. This should not prevent the file from getting opened.
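
To see whether a wrong clock actually left files dated in the future, something like the following could be run once the system time is known to be correct. It is a sketch only: `future_files` is a hypothetical helper name, and it relies on nothing beyond `mktemp` and the standard `find -newer` test.

```shell
# Hypothetical helper: list regular files under a directory whose
# modification time is later than "now". A freshly created temporary file
# serves as the reference point for find's -newer test.
future_files() {
    ref=$(mktemp) || return 1
    find "$1" -xdev -type f -newer "$ref"
    rm -f "$ref"
}

# Usage: future_files /etc
```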

klausner (klausner) said :
#3

Thanks for the response.

smartctl reports:

    SMART overall-health self-assessment test result: PASSED

When the problem began, the first fsck pass reported that the partition timestamps were out of sync with reality (I don't remember the exact message). This seems to be what the two bug reports I referenced were related to. Their links are in the HTML of the first posting. But I fixed those with fsck run from the Live CD, and subsequent repeats did not show the error again.

Of the earlier system failures, one allowed a reinstall with no problems. The second would not let me reinstall Intrepid-32 at all; I had to revert to Hardy-32. But Hardy had too many down-level things, so I bit the bullet and installed Jaunty-64, then upgraded to Karmic. I wanted to avoid grub2, and any possibility that it was behind my problems. I have not tried reinstalling this time.

So it might be HW, but the Windoze partition I occasionally boot makes no complaints, and that was recently rebuilt to Windoze7 (after my Linux problems started).

Are there any log files besides the kernel ring buffer (dmesg) that might be useful?

klausner (klausner) said :
#4

Finally figured out that the error message was telling me something.

THERE WAS NO /ETC/DEFAULT/RCS FILE!!!

Why didn't someone here tell me I am an idiot and point this out!

Copied the file from the CD.
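
What this amounted to can be sketched roughly as follows, assuming the installed system's root partition is mounted at /mnt from a Live CD session. The mount point and the helper names `missing_files` and `restore_file` are assumptions, not details from this thread. The first helper pulls the path out of each `can't open` message and checks whether it exists under the mounted root; the second copies a replacement into place without overwriting anything.

```shell
# Hypothetical helper: read boot messages on stdin, extract the path from
# each "can't open <path>" line, and print those missing under root $1.
missing_files() {
    root=$1
    sed -n "s|.*can't open \(/[^ ]*\).*|\1|p" | while read -r path; do
        [ -e "$root$path" ] || echo "$path"
    done
}

# Hypothetical helper: copy file $1 (e.g. from the Live CD's own /etc) to
# the same path under root $2, refusing to overwrite an existing file.
restore_file() {
    src=$1
    dest=$2$1
    if [ -e "$dest" ]; then
        echo "already present: $dest" >&2
        return 1
    fi
    mkdir -p "$(dirname "$dest")" && cp -p "$src" "$dest"
}

# Usage, as root, with the installed system mounted at /mnt:
#   echo ".: can't open /etc/default/rcS" | missing_files /mnt
#   restore_file /etc/default/rcS /mnt
```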

Anyway, now X and Gnome start to boot, then I get

"There is a problem with the configuration server
(/usr/lib/libconf2/gconf-sanity-check-2 exited with status 256)"

and I get a bare X environment, with no window manager. Tried several suggestions with file permissions, but no joy. One step forward, one step back.
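
Since the suggestions tried involved file permissions, one way to check them systematically is a small helper that compares a path's mode with an expected value. This is a sketch: `check_mode` is a hypothetical name, `stat -c` assumes GNU coreutils, and the paths in the usage lines (/tmp at mode 1777, the home directory and `~/.gconf` owned by the user) are commonly suggested things to check for this error rather than anything confirmed in this thread.

```shell
# Hypothetical helper: compare the octal mode of path $1 with the
# expected mode $2 and report the result.
check_mode() {
    path=$1
    want=$2
    have=$(stat -c '%a' "$path") || return 2
    if [ "$have" = "$want" ]; then
        echo "ok: $path is $have"
    else
        echo "mismatch: $path is $have, expected $want"
        return 1
    fi
}

# Usage:
#   check_mode /tmp 1777
#   stat -c '%U %a %n' "$HOME" "$HOME/.gconf"
```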