Retrieve Python tracebacks from hanging applications

Bug #1015080 reported by Evan
This bug affects 2 people
Affects    Status     Importance  Assigned to  Milestone
Whoopsie   Confirmed  Wishlist    Unassigned

Bug Description

The current plan is to retrieve stack traces from hanging applications by sending them a SIGSEGV from apport. This works fine for binary applications, but for Python applications it will have undesirable results. It will produce C stack traces of the Python interpreter at the point it received the SIGSEGV, which will be unreadable from the point of view of someone debugging a Python application. It will also generate a signature from these stack traces that may produce many false duplicates, since it is based on the state of the interpreter rather than the state of the interpreted program.

We will, however, continue to collect these reports from Python applications. A Python application that hangs in a C extension is worth knowing about. Also, we will not know how important it is to resolve the problem of poor quality Python hang reports until we measure it.
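
For illustration only, the signal-based approach described above amounts to something like the following. This is a minimal sketch, not apport's actual code: `segv_process` and `demo` are hypothetical names, and a sleeping child interpreter stands in for a hung application.

```python
import os
import signal
import subprocess
import sys


def segv_process(pid):
    """Send SIGSEGV to a process so the kernel terminates it as a crash."""
    os.kill(pid, signal.SIGSEGV)


def demo():
    # Stand-in for a hung application: a child interpreter that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    segv_process(child.pid)
    # A negative return code is the number of the signal that ended the child.
    return child.wait()
```

Any core dump produced this way reflects the C state of the interpreter, which is why the resulting stack trace says little about the Python code that was running.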

Context from #ubuntu-devel on 2012/06/19:
[11:09:56] <ev> doko, barry: are there any plans of getting this into Ubuntu: https://fedoraproject.org/wiki/Features/EasierPythonDebugging#py-bt
[11:09:56] <ev> The use case I have is for hanging application. We're going to pop up apport when we detect an desktop application hang. When the user presses submit, apport then SEGVs the process and we get back a nice core dump
[11:09:56] <ev> Which will give us a stack trace, but for python applications it would be better if we could get a python traceback
[11:10:27] <ev> I suppose one alternative is to install a special signal handler in every python application and get it that way
[11:13:07] <doko> ev: this should already be there. I can check if this is the newest version however, but I'd like to avoid backports to gdb
[11:13:14] <doko> hmm, maybe not for python3
[11:13:57] <ev> (gdb) py-bt
[11:13:57] <ev> Undefined command: "py-bt". Try "help".
[11:14:24] <ev> That's gdb 7.4-2012.04-0ubuntu2
[11:16:49] <cjwatson> ev: Which python executable are you debugging?
[11:17:08] <cjwatson> ev: It works with python-dbg.
[11:17:20] <cjwatson> ev: (Which is my reading of the Fedora spec too)
[11:18:56] <ev> ahh
[11:19:01] <ev> cjwatson, doko: that was indeed it
[11:19:02] <ev> thanks
[11:19:24] <cjwatson> But yes, not in python3-dbg right now
[11:20:29] <ev> cjwatson: while I have you here, what are your thoughts on the idea? gdb to python stacktrace or installed signal handler that raises an uncaught exception?
[11:20:37] <ev> if you have a moment, of course
[11:20:40] <ev> otherwise no worries
[11:20:55] <cjwatson> I'm *very* wary of injecting signal handlers into unsuspecting processes, even if they are SEGV.
[11:21:06] <cjwatson> You must be ^-- this tall to use signals, as the saying goes
[11:21:11] <ev> :)
[11:21:39] <ev> looks like that leaves us with one option then
[11:21:40] <cjwatson> You don't want to have to re-exec with python(3)-dbg, though.
[11:21:52] <ev> yeah, hm
[11:22:00] <cjwatson> Is it possible to make the gdb script work with the non-dbg interpreter? (If it isn't, the signal handler approach wouldn't work anyway)
[11:22:42] <ev> Why wouldn't the signal handler approach work with a non-dbg interpreter?
[11:22:57] <ev> or am I incorrectly parsing your sentence
[11:23:22] <cjwatson> Something still has to be able to introspect enough of Python's data structures to generate the backtrace.
[11:23:40] <cjwatson> I don't know whether the dbg-ness is what allows us to do that.
[11:25:06] <ev> cjwatson: python application hangs, send SIGQUIT to it, signal handler does raise(YoullNeverCatchMe), apport python hook picks it up, job done.
[11:25:30] <ev> no need to know the internals of the python data structures
[11:25:40] <ev> mind you, I'm not saying we go with this approach for the reasons you raised
[11:25:51] <ev> just pointing out that I don't see why it needs dbg packages
[11:25:52] <cjwatson> I'm concerned about Python's habit of leaking signal handlers to subprocesses it calls. See SIGPIPE.
[11:26:07] <ev> indeed, I do recall the pain you faced from that in ubiquity :)
[11:26:22] <cjwatson> Unless you were supernaturally careful in a way that I'm not even sure how it's possible to be, the effect of this would be that any non-Python subprocess of Python would have SIGQUIT ignored.
[11:26:43] <cjwatson> So I can't recommend that.
[11:26:48] <ev> hmm, indeed
[11:27:23] <cjwatson> (And yes, I'd misunderstood how you were planning to use the signal handler)
[11:30:28] <ev> It doesn't seem like there's a good way to do this at present then. I think it's still valuable to collect these so that we get statistics on application hangs, but the python ones will just be unparseable.
[11:31:08] <ev> unparseable in that a developer will look at them and their brain will melt out, not that the retracers will have any difficulty handling a SEGV'ing python application
[11:33:37] <ev> The signatures for these may end up looking very similar to one another as well, but I think I'd rather collect python hangs and provide consistent UI than only show the dialog for binary applications.
[11:38:05] <cjwatson> Mm. (A small handful of the Python crashes will be useful even without this.)
[11:39:26] <lifeless> atfork?
[11:39:30] <lifeless> [ok, terrible idea]
[11:39:33] <ev> cjwatson: I'm not sure I follow
[11:49:18] <ev> cjwatson: to be clear, they're hangs not crashes. We're just treating them as crashes because that's the most secure way to do it (https://bugs.launchpad.net/ubuntu/+source/whoopsie-daisy/+bug/1006398)
[11:49:18] <ev> but if the hang were in a c extension, that would indeed be useful to capture
[11:49:20] <ubottu> Launchpad bug 1006398 in whoopsie-daisy (Ubuntu) "Bypassing ptrace restrictions for errors from hanging applications" [Undecided,New]
[11:49:28] <ev> if I'm following your logic :)
[11:51:37] <cjwatson> ev: Right
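
The signal handler alternative that ev describes in the log could look roughly like this. It is a sketch under assumptions, not a proposed implementation: `HangTraceback`, `_on_sigquit` and `demo` are hypothetical names, and the process signals itself where a hang detector would normally do so.

```python
import os
import signal
import time
import traceback


class HangTraceback(BaseException):
    """Hypothetical exception type; BaseException avoids `except Exception`."""


def _on_sigquit(signum, frame):
    # Runs in the main thread once the interpreter regains control, so the
    # exception's traceback includes the Python frames that were executing.
    raise HangTraceback("signal %d received" % signum)


def demo():
    previous = signal.signal(signal.SIGQUIT, _on_sigquit)
    try:
        # Stand-in for the hang detector signalling the application.
        os.kill(os.getpid(), signal.SIGQUIT)
        for _ in range(50):  # stand-in for the application's stuck loop
            time.sleep(0.1)
        return None
    except HangTraceback:
        return traceback.format_exc()
    finally:
        # Restore the earlier disposition (None means it was not set from Python).
        signal.signal(
            signal.SIGQUIT,
            previous if previous is not None else signal.SIG_DFL,
        )
```

In the scenario from the log the exception would be left uncaught so that an excepthook-based reporter could pick it up; `demo` catches it only to show the resulting traceback. Python-level handlers run only when the interpreter regains control, so a hang inside C code would not be reported this way, and the sketch does nothing about the concern raised above regarding signal handling in subprocesses.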

Evan (ev)
Changed in whoopsie:
importance: Undecided → Wishlist
Evan (ev)
Changed in whoopsie:
status: New → Confirmed