[07:35] <lurch_> hi, having an issue with stopping a service through upstart on ubuntu. The app is a very basic sinatra server that listens on some port. Starting goes fine, I can access the service, but there's a problem with restarting/stopping the service. One process stays running, which keeps the port in use. Here's the output I get for the commands: http://pastebin.com/k9hdNJLE
[07:41] <jodh> lurch_: you've probably mis-specified (or not specified) the 'expect' stanza: http://upstart.ubuntu.com/cookbook/#expect
[07:43] <lurch_> jodh: thx. let me look at that
[07:44] <jodh> lurch_: after starting your job, try running 'status app01' and checking that the pid shown is correct (confirm it using 'ps'). If not, it's an expect issue.
[07:50] <lurch_> jodh: the status command isn't showing any pids
[07:50] <lurch_> # status app01
[07:50] <lurch_> app01 start/running
[07:52] <jodh> lurch_: right, so I suspect your daemon forks either once or twice, but you haven't added an 'expect' to allow Upstart to track the forks.
[07:52] <lurch_> thx. added an 'expect fork' now, testing
[07:58] <lurch_> added the 'expect fork' (and also tried 'expect daemon'), but then the start / stop just hangs. Must be doing something stupid, but not that familiar with upstart. Here's the config: http://pastebin.com/RuUwyn8A
[08:00] <lurch_> the upstart scripts are generated by 'foreman'
[08:00] <jodh> lurch_: don't add respawn until you are 100% convinced that you've got the rest of the config correct as it will just cause confusion.
[08:00] <lurch_> ok
[08:03] <jodh> lurch_: also, you can simplify that exec line by doing 'setuid app' and 'chdir /opt/apps/app01'. Note too that you may be able to get rid of all that redirection since Upstart will automatically log everything the job writes to stdout/stderr to /var/log/upstart/app01.log.
[08:04] <jodh> lurch_: Um. where is $PORT being defined? I suspect that might be the cause of your problem.
[08:05] <jodh> lurch_: either set it via 'env PORT=1234', or pass it in on the command line with 'start app01 PORT=1234', or source your app's config file as shown here: http://upstart.ubuntu.com/cookbook/#sourcing-files
[08:06] <lurch_> jodh: ok, i'll try that. thx
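[Note: jodh's suggestions above (setuid, chdir, env PORT, no respawn yet) can be sketched as one job file. This is a minimal sketch, not the foreman-generated config from the pastebin: the exec command and the port value are placeholders, and it assumes the server stays in the foreground, so no 'expect' stanza is used.]

```
# /etc/init/app01.conf (hypothetical sketch, not the original job file)
description "app01 sinatra server"

# Run as the unprivileged user from the app directory instead of
# wrapping the command in su/cd (setuid needs Upstart 1.4 or later).
setuid app
chdir /opt/apps/app01

# Define PORT so that $PORT is not empty when the exec line runs.
# 1234 is the placeholder value from the discussion above.
env PORT=1234

# No 'expect' stanza: this assumes the server does not fork.
# No 'respawn' until the rest of the job is known to work.

# Placeholder command; substitute the real one from the original job.
exec bundle exec ruby app.rb -p $PORT
```

[After 'start app01', running 'status app01' should then show a process ID, which can be compared against 'ps' as jodh describes. If the real command does fork, an 'expect fork' or 'expect daemon' stanza matching the actual number of forks would be needed instead.]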
[20:23] <aaronlevy> I'm having trouble with restarts of a python process that spawns other worker processes.
[20:24] <aaronlevy> The process can't have 2 instances of itself running because a socket will already be in use by the other process. What seems to be happening is that upstart is re-starting the process before it has actually stopped the original (the parent will not exit until all children have completed their work).
[20:25] <aaronlevy> Using start/stop works fine. Restart, however, will periodically cause the above to happen and then the process never exits, but upstart reports that it is not running.
[20:26] <ion> Add something like kill timeout 300
[20:26] <ion> Or whatever number makes sense.
[20:27] <aaronlevy> Hmm. I mean, shouldn't it track the PID of the parent process?
[20:28] <ion> yes
[20:28] <ion> But it sends SIGKILL if it doesn’t terminate in a certain number of seconds after SIGTERM.
[20:28] <aaronlevy> Then just assumes it is closed?
[20:29] <ion> SIGKILL kills it for good, so the process won’t exist anymore.
[20:29] <aaronlevy> Hmm.. but the process is still there (as are the children). 
[20:29] <ion> You can also add a pre-stop exec/script if the server provides a command that tells it to stop and waits until it does.
[20:29] <ion> huh
[20:30] <aaronlevy> No respawns either. Let me double check that it is the same as the originating pid
[20:33] <aaronlevy> So the zombie process is not the same PID as what restart reports
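[Note: ion's two suggestions above ('kill timeout' and a pre-stop command) would look roughly like this in the job file. This is a sketch under assumptions: 300 is ion's example value, and 'myserver-ctl' is a made-up placeholder, since the log does not say whether the program provides a stop command.]

```
# Hypothetical additions to the job file for the Python service.

# Wait up to 300 seconds after SIGTERM before sending SIGKILL
# (the default is 5 seconds). Pick a value that covers the longest
# time the workers need to finish their work.
kill timeout 300

# Optional: ask the server to shut down cleanly and block until it has.
# 'myserver-ctl' is a placeholder; enable this only if the program
# actually ships such a command.
# pre-stop exec /usr/local/bin/myserver-ctl stop --wait
```

[Neither stanza helps if Upstart is tracking the wrong PID, which is what the mismatch aaronlevy reports would suggest.]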
[21:25] <aaronlevy> I made a stub of my actual program that re-creates the issue. https://gist.github.com/7cf1c9ef87a50ef08c9c (requires pyzmq)
[21:25] <aaronlevy> and upstart conf: https://gist.github.com/ecc65af894d72c91cecb
[21:26] <aaronlevy> repeatedly restarting the service quickly will cause it to get an error because the socket is still in use. Upstart will report the process as stopped, but there will be a zombie process (doing nothing)
[21:27] <aaronlevy> This is probably a mistake on my part, but I'm not sure where :/