Comment 138 for bug 1505564

Dan Streetman (ddstreet) wrote:

Ok, here's my analysis of the latest dump.

There are 3 kernel migrate threads waiting; this is the cause of the softlockup. Specifically, pid 101 on cpu 13 is where the softlockup (and then the panic, since panic-on-softlockup is enabled) happens, and the other 2 migrate threads (pids 79 and 151) are also waiting. All of them are waiting for multi_cpu_stop() to finish. The way multi_cpu_stop() works is: the caller sets up one or more cpus to coordinate stopping; inside multi_cpu_stop(), a state machine moves from MULTI_STOP_PREPARE, to disabling irqs, to running the provided function, to exiting when done. However, only the specified cpus (those in the cpumask) actually run the function, and the state machine doesn't advance to the next state until all participating cpus have processed the current one.
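
For reference, the spin/ack loop in kernel/stop_machine.c looks roughly like this (simplified sketch, details elided):

    /* Simplified sketch of multi_cpu_stop() from kernel/stop_machine.c.
     * Every participating cpu spins in this loop; ack_state() only
     * advances msdata->state to the next step once *all* of them have
     * acked the current one. */
    static int multi_cpu_stop(void *data)
    {
        struct multi_stop_data *msdata = data;
        enum multi_stop_state curstate = MULTI_STOP_NONE;
        bool is_active = cpumask_test_cpu(smp_processor_id(),
                                          msdata->active_cpus);
        int err = 0;

        do {
            cpu_relax();                    /* spin, re-reading msdata->state */
            if (msdata->state != curstate) {
                curstate = msdata->state;
                switch (curstate) {
                case MULTI_STOP_DISABLE_IRQ:
                    local_irq_disable();
                    break;
                case MULTI_STOP_RUN:
                    if (is_active)          /* only cpus in the cpumask run fn */
                        err = msdata->fn(msdata->data);
                    break;
                default:
                    break;
                }
                ack_state(msdata);          /* last cpu to ack advances the state */
            }
        } while (curstate != MULTI_STOP_EXIT);

        return err;
    }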

This is where the problem comes in. Here, tasks are being migrated from one numa node to another via numa rebalancing, and there are 3 rebalancing events in flight, each pairing cpu 3 with another cpu: cpu 3 and cpu 10, cpu 3 and cpu 13, and cpu 3 and cpu 20. The migrate threads on cpus 10, 13, and 20 are already running multi_cpu_stop(), but they are stuck waiting, because cpu 3 still has its corresponding stop work sitting unrun in its queue.
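
The numa balancing path gets there via stop_two_cpus(), which (again, roughly sketched from kernel/stop_machine.c, details elided) queues a multi_cpu_stop() work item on the stopper thread of both cpus and then waits; nothing completes until both stopper threads have actually run:

    /* Simplified sketch of stop_two_cpus(), used by the numa balancing
     * task-swap path.  A multi_cpu_stop() work item is queued on the
     * stopper ("migration/N") thread of *both* cpus; the stop can only
     * complete once both threads have run it through MULTI_STOP_EXIT. */
    int stop_two_cpus(unsigned int cpu1, unsigned int cpu2,
                      cpu_stop_fn_t fn, void *arg)
    {
        struct cpu_stop_done done;
        struct cpu_stop_work work1, work2;
        struct multi_stop_data msdata = {
            .fn          = fn,              /* e.g. the task-swap callback */
            .data        = arg,
            .num_threads = 2,
            .active_cpus = cpumask_of(cpu1),
        };

        work1 = work2 = (struct cpu_stop_work){
            .fn   = multi_cpu_stop,
            .arg  = &msdata,
            .done = &done,
        };

        cpu_stop_init_done(&done, 2);
        set_state(&msdata, MULTI_STOP_PREPARE);

        cpu_stop_queue_work(cpu1, &work1);  /* queued for cpu1's migration thread */
        cpu_stop_queue_work(cpu2, &work2);  /* queued for cpu2's migration thread */

        wait_for_completion(&done.completion);
        return done.ret;
    }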

Cpu 3 is writing bytes to the serial port and is currently waiting for confirmation that the write completed. The wait is done by checking the serial port register for CTS; if it's not set, the code delays for 1us and tries again. All of this happens while holding a spinlock with irqs disabled, so while the serial port r/w is in progress, nothing else can run on this cpu. The loop is bounded at about 1 second, though, so presumably it shouldn't tie up the cpu for much longer than that (I haven't dug too far into this, so the function may be called multiple times with the lock held).
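
The wait is roughly this shape (simplified sketch of the 8250 console write path; "up" is the 8250 port structure, and the exact code differs):

    /* Simplified sketch of the serial console flow-control wait.  The
     * caller holds the port lock with irqs off, so while we spin here
     * nothing else can run on this cpu.  1000000 * 1us ~= 1 second. */
    unsigned int tmout;

    for (tmout = 1000000; tmout; tmout--) {
        unsigned int msr = serial_in(up, UART_MSR);

        if (msr & UART_MSR_CTS)     /* remote end says it's ok to send */
            break;

        udelay(1);                  /* 1us delay; implemented via the TSC on x86 */
    }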

For whatever reason, that serial port r/w seems to be taking a long time. The migrate threads on the other cpus are waiting for it to finish so that the migrate thread on cpu 3 can finally run and move the multi_cpu_stop state machine along, but that doesn't happen in time to avoid triggering the softlockup detector.

The multi_cpu_stop() function could arguably use the addition of touch_nmi_watchdog(), since it intentionally spins on the cpu with interrupts disabled; doing so would avoid triggering the softlockup detector (but would not otherwise change the system's behavior). However, multi_cpu_stop() isn't really at fault here, since the real cause is that the other cpu(s) it is waiting on are tied up.
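
Something like this (untested, just to illustrate the placement, using the loop sketched earlier) would keep the watchdog quiet:

    do {
        cpu_relax();                /* spin, re-reading msdata->state */
        touch_nmi_watchdog();       /* proposed: we spin here with irqs off by design */
        if (msdata->state != curstate) {
            /* ... state handling as sketched above ... */
        }
    } while (curstate != MULTI_STOP_EXIT);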

Back on cpu 3 (the one the others are waiting on), that 1us delay is implemented using the TSC. Unfortunately, the TSC is a generally unreliable clock source, so it's possible there is a problem in the delay function itself.
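
For reference, the two x86 delay back-ends look roughly like this (simplified sketches based on arch/x86/lib/delay.c). If the TSC misbehaves (stops, or jumps between cpus), the TSC-based variant can spin far longer than requested, while the loop-based variant doesn't depend on the TSC at all:

    /* Simplified sketches of the two x86 delay back-ends.  With "notsc"
     * the kernel never switches udelay over to the TSC variant, so the
     * plain calibrated loop is used instead. */

    static void delay_tsc(unsigned long loops)
    {
        u64 bclock, now;

        rdtscll(bclock);
        for (;;) {
            rdtscll(now);
            if ((now - bclock) >= loops)    /* relies on the TSC ticking sanely */
                break;
            cpu_relax();
        }
    }

    static void delay_loop(unsigned long loops)
    {
        while (loops--)                     /* pure calibrated busy loop, no TSC
                                               (the real code is a tuned asm loop) */
            cpu_relax();
    }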

To help determine that, can you please boot with the "notsc" kernel parameter, which will change the udelay function to use a simple calibrated loop instead of the TSC, and try to reproduce the softlockup?