[17:47] <mattrae_> yay there's an upstart channel :) hopefully someone has some ideas there.. i have an upstart job stuck in pre-start.. 'status nova-network' reports 'nova-network stop/pre-start, process 389' but there is no process with PID 389
[17:47] <mattrae_> how do i go about debugging this state and i'm wondering why there is no pid 389
[17:48] <mattrae_> ?
[17:53] <mattrae_> this isn't the stock nova-network upstart job.. it's customized and i don't have all the scripts called in pre-start. the person i'm helping has run the upstart job script manually without problems
[17:54] <mattrae_> the pre-start script does do some stuff creating a cgroup for nova-network.. wondering if that could possibly be related to the pid question
[17:54] <mattrae_> anything i should take a look at?
[17:55] <mattrae_> and any way to deal with an upstart job stuck in pre-start?
[17:56] <mattrae_> so far this has only happened after a reboot on initial startup
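The first check described above (does the PID that `status` reports actually exist?) can be sketched in Python. This is a minimal sketch, not part of upstart: the helper names `parse_status` and `pid_exists` are made up for illustration, and the regular expression only assumes the simple one-line format quoted in the question (`nova-network stop/pre-start, process 389`); other `status` output shapes are not handled.

```python
import os
import re

# Assumed format, taken from the line quoted above:
#   "<job> <goal>/<state>, process <pid>"   (the ", process <pid>" part is optional)
STATUS_RE = re.compile(
    r"^(?P<job>\S+) (?P<goal>\w+)/(?P<state>[\w-]+)(?:, process (?P<pid>\d+))?$"
)


def parse_status(line):
    """Split a status line into (job, goal, state, pid); pid is None if absent."""
    match = STATUS_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised status line: %r" % line)
    pid = match.group("pid")
    return (
        match.group("job"),
        match.group("goal"),
        match.group("state"),
        int(pid) if pid is not None else None,
    )


def pid_exists(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks the PID
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # process exists but belongs to another user
    return True
```

Feeding it the line from the question, `parse_status("nova-network stop/pre-start, process 389")` yields the job name, goal, state, and PID 389; `pid_exists(389)` then confirms whether upstart is tracking a process that is really gone.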
[18:16] <SpamapS> mattrae_: does your job perhaps specify 'expect fork' but the main process does not fork?
[18:17] <SpamapS> mattrae_: that's the only time I've seen upstart reference non-existent pids
[19:02] <mattrae_> SpamapS: ahh cool thanks.. yeah, no 'expect fork' is used
[19:03] <SpamapS> mattrae_: cgroups shouldn't have any bearing, as upstart has no awareness of them..
[19:03] <SpamapS> mattrae_: but if you were going to set up a cgroup, seems like it would need to be in the actual script section, not pre-start
[19:03] <SpamapS> depends on how it's being used I guess
[19:06] <mattrae_> oh good to know cgroups shouldn't be a problem. they're doing cgcreate in pre-start and cgexec in the actual script.
[19:11] <mattrae_> SpamapS: know of how to deal with a job stuck in pre-start?
[19:13] <SpamapS> mattrae_: well usually stop would do it
[19:13] <SpamapS> mattrae_: the fact that it's waiting on a non-existent process is very weird
[19:13] <SpamapS> mattrae_: anything in /var/log/syslog or /var/log/dmesg about the job?
[19:23] <mattrae_> SpamapS: ohh thanks, i will see what we can find in syslog or dmesg about the job
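The log check suggested above can be narrowed with a small filter. This is a rough sketch under two assumptions: that upstart's messages reach the log with an `init:` prefix (it runs as PID 1), and that the job name appears verbatim in the message. The function name `job_lines` is hypothetical.

```python
def job_lines(lines, job):
    """Return the log lines that look like upstart messages about one job.

    Assumption: upstart messages contain "init:" and the job name.
    """
    return [line.rstrip("\n") for line in lines if "init:" in line and job in line]


# Possible usage on the host in question (path taken from the discussion above):
#   with open("/var/log/syslog", errors="replace") as handle:
#       for line in job_lines(handle, "nova-network"):
#           print(line)
```

The same function can be pointed at `/var/log/dmesg`; if nothing matches, loosening the filter to the job name alone is a reasonable next step.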
[19:25] <SpamapS> mattrae_: you might have to use one of the hack scripts out there that exhausts pid space up to the pid needed, and then lets upstart kill it off
[19:29] <mattrae_> SpamapS: ahh ok, in what case is that hack needed?
[19:29] <SpamapS> mattrae_: in the past I've only ever seen it with bad expect fork
[19:29] <SpamapS> mattrae_: never on pre-start
[19:29] <SpamapS> mattrae_: any way you can pastebin the conf?
[19:32] <mattrae_> SpamapS: ahh cool, sure here's the conf: http://pastebin.com/16MJZBv9
[19:34] <SpamapS> mattrae_: yeah looks normal to me. Not sure why that would lose track.. but cgroups does seem like a logical choice since it's the most process-weirdness-related thing in there :)
[19:35] <mattrae_> SpamapS: cool thanks for reviewing :) yeah cgroups did seem logical.. i'll see what more we can find out and if anything is in the logs
[21:58] <mattrae_> SpamapS: sorry to ask again, i was wondering if it's possible to clear the upstart state when the job is stuck in pre-start. is reboot the only way?
[21:59] <SpamapS> mattrae_: you can write a program which "revives" that pid by forking until the pid counter wraps...
[21:59] <SpamapS> mattrae_: reboot is often simpler :)
[22:00] <mattrae_> SpamapS: ahh thanks, good to know that possibility.. yeah i think rebooting will be easier :)
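For reference, the "fork until the pid counter wraps" idea mentioned above might look roughly like the sketch below. It is not one of the existing hack scripts, only an illustration of the technique: the names `claim_pid`, `max_forks`, and `hold_seconds` are invented here, the default of 40000 forks assumes the common Linux `pid_max` of 32768, and it assumes that once the calling program exits, the child holding the target PID is reparented to upstart (PID 1), which can then kill or reap it. Try it on a test machine first; as noted above, a reboot is often simpler.

```python
import os
import time


def pid_exists(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True
    return True


def claim_pid(target_pid, max_forks=40000, hold_seconds=30):
    """Fork short-lived children until one of them is assigned target_pid.

    Returns True in the parent once a child holds target_pid; that child
    sleeps for hold_seconds and then exits. Returns False if max_forks
    children were created without hitting the target.
    """
    if pid_exists(target_pid):
        raise ValueError("pid %d is already in use" % target_pid)
    for _ in range(max_forks):
        pid = os.fork()
        if pid == 0:
            # Child: keep the PID alive only if it is the one we want.
            if os.getpid() == target_pid:
                time.sleep(hold_seconds)
            os._exit(0)
        if pid == target_pid:
            # Parent: do not wait for this child. When the caller exits,
            # the child is orphaned and (by assumption) adopted by PID 1.
            return True
        os.waitpid(pid, 0)  # reap the throwaway child before forking again
    return False
```

A caller would run something like `claim_pid(389)` as root and exit immediately if it returns True, leaving the sleeping child behind for upstart to find.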