System hangs when block device fails, Amazon EC2/EBS

Asked by Rudi Meyer

I'm running Ubuntu 12.04 LTS Precise in Amazon EC2.

I'm testing how to handle a failed EBS volume (block device), which I simulate by force-detaching it through the API. I've tested this with software RAID (mdadm), LVM and GlusterFS. I want the failure to be handled: with RAID, for instance, the array should detect the error and fail the disk.

The procedure is (raid):

1. Attach 2 drives
2. Set them up as RAID1
3. Force-detach one of the drives
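
For anyone trying to reproduce this, steps 2 and 3 could be sketched roughly as below. This is only an illustration under assumptions: the array name, member device names and volume ID are hypothetical placeholders that do not come from the question, and the AWS CLI is assumed for the forced detach (the question only says "through the API"). The script just prints the commands, since running them needs root, real attached volumes and AWS credentials.

```python
import shlex

# Hypothetical placeholders: none of these names appear in the question.
MD_DEVICE = "/dev/md0"
MEMBERS = ["/dev/xvdf", "/dev/xvdg"]
VOLUME_ID = "vol-EXAMPLE"


def build_repro_commands(md_device, members, volume_id):
    """Return the commands for steps 2 and 3 as argument lists."""
    # Step 2: create a RAID1 array from the attached devices.
    create = [
        "mdadm", "--create", md_device,
        "--level=1", f"--raid-devices={len(members)}", *members,
    ]
    # Step 3: force-detach one volume.
    # Assumption: the AWS CLI is the client used to call the EC2 API.
    detach = ["aws", "ec2", "detach-volume", "--volume-id", volume_id, "--force"]
    return [create, detach]


if __name__ == "__main__":
    # Print only; nothing is executed against the system or AWS.
    for cmd in build_repro_commands(MD_DEVICE, MEMBERS, VOLUME_ID):
        print(shlex.join(cmd))
```

Building the commands as argument lists keeps them easy to pass to a process runner later without shell quoting issues.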

The mount of the RAID1 array now hangs. mdadm --detail and other commands don't respond; they just hang, and the system drifts into a state where a reboot is the only option.

What I expected was that the system and mdadm would see the error, mark the drive as failed and continue running. That should be the general idea.
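
To make the expected outcome concrete: when md fails a member, /proc/mdstat normally lists that device with an (F) flag. The sketch below checks for that flag. The sample text is an illustrative excerpt, not output from the system in this question, the device names are hypothetical, and nothing in this thread establishes whether /proc/mdstat stays readable during the hang described above.

```python
import re

# Illustrative /proc/mdstat excerpt (hypothetical devices), not real output
# from the system described in the question.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 xvdg[1] xvdf[0](F)
      8384448 blocks super 1.2 [2/1] [_U]

unused devices: <none>
"""

# A member looks like "xvdf[0]" optionally followed by flags such as "(F)".
MEMBER = re.compile(r"(\w+)\[\d+\]((?:\([A-Z]\))*)")


def failed_members(mdstat_text):
    """Map each md array name to the member devices flagged (F)."""
    result = {}
    for line in mdstat_text.splitlines():
        head, sep, rest = line.partition(" : ")
        # Only lines such as "md0 : active raid1 ..." describe an array.
        if not sep or not head.startswith("md"):
            continue
        result[head] = [
            name for name, flags in MEMBER.findall(rest) if "(F)" in flags
        ]
    return result


if __name__ == "__main__":
    print(failed_members(SAMPLE_MDSTAT))
```

On a live system the same function could be fed the contents of /proc/mdstat instead of the sample string.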

I've tested this out on Ubuntu 10.04.4 LTS (Lucid Lynx) with the same result.
But I also tested it on a Red Hat distribution, where everything seems to work as expected. So what's the difference? How come Ubuntu handles this so poorly?

Question information

Language: English
Status: Open
For: Ubuntu
Assignee: No assignee
Daniel Smoczyk (daniel-smoczyk) said :
#1

+1, same problem here. It is very important to be able to say a graceful goodbye to a lost EBS volume in EC2. I'll provide any testing help if needed.

Daniel Smoczyk (daniel-smoczyk) said :
#2

One more thing: I've run tests on the default Ubuntu 13.04 Amazon EC2 image.
