Comment 10 for bug 90243

Roland Dreier (roland.dreier) wrote :

Looking at the traceback that John Clemens just posted, it is pretty clearly another variation of the same class of deadlock that I described before:

 - BeaconTimeout() runs from a timer and calls into MlmeHandler()
 - MlmeHandler() runs the table-based state machine
 - The state machine calls MlmeJoinReqAction()
 - Right at the top of the function, MlmeJoinReqAction() does
        del_timer_sync(&pAd->MlmeAux.BeaconTimer);
 - But we're running from the BeaconTimer callback already

and voila, we have another "wait for timer to finish from within timer callback" deadlock.
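
To make the call chain above concrete, here is a minimal user-space sketch of it. This is not the rt61 or kernel code: struct model_timer, model_del_timer(), model_del_timer_sync() and the simulate_*() helpers are hypothetical names made up for illustration, and the model is single-threaded, so "callback is running" always means "we are inside the callback". Where the real del_timer_sync() would spin forever, the model returns -1 instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of a timer; NOT the kernel's struct timer_list. */
struct model_timer {
    bool pending;                        /* armed, callback not started yet */
    bool running;                        /* callback is executing right now */
    void (*fn)(struct model_timer *);    /* timer callback */
};

/* Result of the most recent model_del_timer_sync() call in the chain. */
static int last_result;

/* Model of del_timer(): cancel a pending timer, never wait.
 * Returns 1 if the timer was pending, 0 otherwise. */
static int model_del_timer(struct model_timer *t)
{
    int was_pending = t->pending ? 1 : 0;
    t->pending = false;
    return was_pending;
}

/* Model of del_timer_sync(): cancel, then wait for a running callback.
 * The real function would spin until the callback returns. In this
 * single-threaded model a running callback is always our own caller,
 * so the wait could never end; return -1 (an invention of this model)
 * instead of hanging. */
static int model_del_timer_sync(struct model_timer *t)
{
    int was_pending = model_del_timer(t);
    if (t->running)
        return -1;                       /* would deadlock */
    return was_pending;
}

/* Stand-ins for the driver functions named in the traceback. */
static void mlme_join_req_action(struct model_timer *beacon_timer)
{
    last_result = model_del_timer_sync(beacon_timer);
}

static void mlme_handler(struct model_timer *t)
{
    mlme_join_req_action(t);             /* state machine picks the join action */
}

static void beacon_timeout(struct model_timer *t)
{
    mlme_handler(t);                     /* timer callback enters the state machine */
}

/* Model of the timer core firing an expired timer. */
static void model_fire(struct model_timer *t)
{
    assert(t->pending);
    t->pending = false;
    t->running = true;
    t->fn(t);
    t->running = false;
}

/* Join request reached from inside the BeaconTimer callback. */
int simulate_beacon_timeout(void)
{
    struct model_timer beacon = { true, false, beacon_timeout };
    model_fire(&beacon);
    return last_result;
}

/* Same join request, but reached from outside any timer callback. */
int simulate_join_outside_timer(void)
{
    struct model_timer beacon = { true, false, beacon_timeout };
    mlme_join_req_action(&beacon);
    return last_result;
}
```

In this model simulate_beacon_timeout() returns -1 (the self-wait case), while simulate_join_outside_timer() returns 1 because the pending timer is simply cancelled. The same MlmeJoinReqAction() code is fine from one context and fatal from the other, which is why this only shows up on some paths through the state machine.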

Looking at the rt61 driver source, I really feel it was a mistake to merge this into the Ubuntu kernel in the first place. I realize that there are probably some people who are using the rt61 driver without lockups, but at this point I don't think feisty should ship with rt61 enabled by default.

I think the reason this bug is having a more severe impact now is that all kernels are built with CONFIG_SMP now, so del_timer_sync() is never converted to del_timer(). So another possible fix would be to enable the code in rtmp.h that does:

#undef del_timer_sync
#define del_timer_sync(x) del_timer(x)

That would probably convert this easily-triggered deadlock into much rarer strange crashes on SMP systems (although I know that almost every modern system is at least dual-core).