0.3.8 init state machine broken?
I've found 3 problems and fixed (patched, more like) 2 of them. My question is: has this already been found and fixed in later versions?
Details:
1) The test job below makes init dump core and restart, triggered by an assert in job.c:
    service
    respawn
    script
      exit 0
    end script
    post-start script
      sleep 4
    end script
As far as I could see JOB_RUNNING was legitimate, so I updated the assert condition to allow it, and respawning worked.
2) After the change above, stopping this job produced another core. This time the PROCESS_POST_START case saw the job in state JOB_WAITING.
I patched this as well by allowing JOB_WAITING and setting state and stop to FALSE in this case.
I'm not sure the second fix is a good one.
I tried to compare against 0.5, but it's been re-written in C++, so there's no simple diff.
Question information
- Language: English
- Status: Solved
- For: upstart
- Assignee: No assignee
- Solved by: Alex Nekrasov
- Solved: 2009-06-15
- Last query: 2009-06-15
- Last reply: 2009-06-15
Launchpad Janitor (janitor) said : | #1 |
This question was expired because it remained in the 'Open' state without activity for the last 15 days.
re-opening
This part of the code has been extensively re-written in later releases, so it's entirely possible the bug is fixed and there are all new bugs waiting to be found ;)
0.5 is not written in C++, it's still plain old C.
I suspect the bug is that the post-start script ends *after* the main script has already exited? Technically the job should remain in the post-start state until the post-start script finishes, but it transitions out too early instead.
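To make that suspicion concrete, here is a toy model in C of how a child reaper could handle the ordering. It is only a sketch, not the code from job.c. The state and process names follow the ones used in this thread, while ToyJob, toy_job_next_after_main_exit() and toy_job_child_exited() are made up for illustration. The assumption is that a main process dying during post-start is merely recorded, and the state change is deferred until the post-start script itself exits.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: names mirror those in this thread, but none of this
 * is taken from upstart's job.c. */
typedef enum { JOB_WAITING, JOB_STARTING, JOB_POST_START, JOB_RUNNING } JobState;
typedef enum { PROCESS_MAIN, PROCESS_POST_START } ProcessType;

typedef struct {
    JobState state;
    bool     respawn;      /* job has a "respawn" stanza */
    bool     main_exited;  /* main died while post-start was still running */
} ToyJob;

/* Hypothetical helper: where the job goes once the main exit is acted on. */
JobState toy_job_next_after_main_exit(const ToyJob *job)
{
    return job->respawn ? JOB_STARTING : JOB_WAITING;
}

/* Hypothetical reaper: called when one of the job's processes exits. */
void toy_job_child_exited(ToyJob *job, ProcessType process)
{
    switch (process) {
    case PROCESS_MAIN:
        if (job->state == JOB_POST_START) {
            /* Post-start is still running: stay in this state and
             * only remember that the main process has gone away. */
            job->main_exited = true;
        } else {
            assert(job->state == JOB_RUNNING);
            job->state = toy_job_next_after_main_exit(job);
        }
        break;
    case PROCESS_POST_START:
        assert(job->state == JOB_POST_START);
        if (job->main_exited) {
            /* Act on the deferred main exit now. */
            job->main_exited = false;
            job->state = toy_job_next_after_main_exit(job);
        } else {
            job->state = JOB_RUNNING;
        }
        break;
    }
}
```

With the job from the original report (main exits immediately, post-start sleeps for 4 seconds), the first call leaves the state at JOB_POST_START, and the restart only begins once the post-start exit is reaped.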
Alex Nekrasov (ennnot) said : | #4 |
Something's wrong. I downloaded what I thought was the 0.5 code and looked in job.c to find the new child reaper and co. I saw classes. OK, so maybe I got some different version. Will need to check.
As to the issue, the same thing happens with other sections that can run in parallel with main. I added my attempt at fixing this to the bug report you opened.