Comment 3 for bug 1907262

Matthew Ruffell (mruffell) wrote :

Hi Thimo,

Firstly, thank you for your bug report. We really, really appreciate it.

You are correct, the recent raid10 patches appear to cause filesystem corruption on raid10 arrays.

I have spent the day reproducing the problem, and I can confirm that the 4.15.0-126-generic, 5.4.0-56-generic and 5.8.0-31-generic kernels are affected.

The kernel team are aware of the situation, and we have begun an emergency revert of the patches. We should have new kernels available within the next few hours, or a day or so at most.

The current mainline kernel is affected, so I have written to the raid subsystem maintainer, and the original author of the raid10 block discard patches, to aid with debugging and fixing the problem.

You can follow the upstream thread here:

https://www.spinics.net/lists/kernel/msg3765302.html

As for the data corruption on your servers, I am deeply sorry for causing this regression.

When I was testing the raid10 block discard patches on the Ubuntu stable kernels, I did not think to fsck each of the disks in the array. Instead, I was content with the speed of creating new arrays, writing a basic dataset to the disks, and rebooting the server to ensure the array came up again with those same files.

Since the first disk seems to be okay, there is at least a small window of opportunity for you to restore any data that you have not backed up.

I will keep you informed as we get the patches reverted and the root cause fixed upstream. If you have any questions, feel free to ask, and if you have any more details from your own debugging, feel free to share them in this bug or on the upstream mailing list discussion.

Thanks,
Matthew