Comment 61 for bug 1796292

Andrea Righi (arighi) wrote :

Ryan, unfortunately the last reproducer script is giving me a lot of errors and I'm still trying to figure out how to make it run to the end (or at least to a point where it starts to run some bcache commands).

In the meantime (as mentioned on IRC) I've uploaded a test kernel that reverts the patch "UBUNTU: SAUCE: (no-up) bcache: decouple emitting a cached_dev CHANGE uevent":

https://kernel.ubuntu.com/~arighi/LP-1796292/4.15.0-56.62~lp1796292+1/

As we know, this would re-introduce the problem discussed in bug 1729145, but it'd be interesting to test it anyway, just to see if this patch is somehow related to the bch_bucket_alloc() deadlock.

In addition to that, I've spent some time looking at the last kernel trace and the code. It looks like bch_bucket_alloc() always releases the mutex &ca->set->bucket_lock when it goes to sleep (the call to schedule()), but it doesn't release bch_register_lock, which might also be held at that point. I was wondering if this could be the reason for the deadlock, so I've prepared an additional test kernel that does *not* revert our "UBUNTU SAUCE" patch, but instead releases the mutex bch_register_lock when bch_bucket_alloc() goes to sleep:

https://kernel.ubuntu.com/~arighi/LP-1796292/4.15.0-56.62~lp1796292+3/
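
To make the hypothesis above easier to reason about, here is a minimal userspace sketch in Python. It is only a model, not the kernel code and not the actual patch: the names run_model, allocator and freer are made up for illustration, the two Python locks merely stand in for bch_register_lock and ca->set->bucket_lock, and the idea that the path freeing buckets needs bch_register_lock is an assumption of the model, not something confirmed from the trace.

```python
import threading


def run_model(drop_register_lock: bool, timeout: float = 0.2) -> bool:
    """Return True if the allocator got a bucket, False if it stalled."""
    register_lock = threading.Lock()                # stand-in for bch_register_lock
    bucket_lock = threading.Lock()                  # stand-in for ca->set->bucket_lock
    bucket_wait = threading.Condition(bucket_lock)  # stand-in for the bucket wait queue
    free_buckets = []
    waiting = threading.Event()
    got_bucket = threading.Event()

    def allocator():
        # Models a caller that enters the allocation path holding both locks.
        with register_lock:
            with bucket_wait:  # acquires bucket_lock
                while not free_buckets:
                    if drop_register_lock:
                        register_lock.release()
                    waiting.set()
                    # wait() drops bucket_lock while sleeping and retakes it on wakeup.
                    woke = bucket_wait.wait(timeout)
                    if drop_register_lock:
                        register_lock.acquire()
                    if not woke:
                        return  # timed out: in the kernel this would be a hang
                free_buckets.pop()
                got_bucket.set()

    def freer():
        # ASSUMPTION: the path that frees a bucket needs register_lock first.
        waiting.wait(timeout * 4)
        if not register_lock.acquire(timeout=timeout):
            return  # could not make progress: register_lock is still held
        try:
            with bucket_wait:
                free_buckets.append(1)
                bucket_wait.notify()
        finally:
            register_lock.release()

    threads = [threading.Thread(target=allocator), threading.Thread(target=freer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_bucket.is_set()


if __name__ == "__main__":
    print("register lock kept while sleeping:   ", run_model(False))
    print("register lock dropped while sleeping:", run_model(True))
```

In this model the allocator only makes progress when the second lock is dropped across the sleep; timeouts are used in place of a real hang so the script terminates. Whether the real kernel paths behave this way is exactly what the test kernel is meant to show.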

Sorry for asking for all these tests... if I can't find a way to reproduce the bug on my side, asking you to test is the only way I have to debug this issue. :)