[07:35] hi, having an issue with stopping a service through upstart on ubuntu. The app is a very basic sinatra server that listens on some port. Starting goes fine, i can access the service, but there's a problem with restarting/stopping the service. One process stays running, which keeps the port in use. Here's the output i get for the commands: http://pastebin.com/k9hdNJLE
[07:41] lurch_: you've probably mis-specified (or not specified) the 'expect' stanza: http://upstart.ubuntu.com/cookbook/#expect
[07:43] jodh: thx. let me look at that
[07:44] lurch_: after starting your job, try running 'status app01' and checking that the pid shown is correct (confirm it using 'ps'). If not, it's an expect issue.
[07:50] jodh: the status command isn't showing any pids
[07:50] # status app01
[07:50] app01 start/running
[07:52] lurch_: right, so I suspect your daemon forks either once or twice, but you haven't added an 'expect' to allow Upstart to track the forks.
[07:52] thx. added an 'expect fork' now, testing
[07:58] added the 'expect fork' (and also tried 'expect daemon'), but then start/stop just hangs. Must be doing something stupid, but I'm not that familiar with upstart. Here's the config: http://pastebin.com/RuUwyn8A
[08:00] the upstart scripts are generated by 'foreman'
[08:00] lurch_: don't add respawn until you are 100% convinced that you've got the rest of the config correct, as it will just cause confusion.
[08:00] ok
[08:03] lurch_: also, you can simplify that exec line by doing 'setuid app' and 'chdir /opt/apps/app01'. Note too that you may be able to get rid of all that redirection, since Upstart will automatically log all output on stdout/stderr to /var/log/upstart/app01.log.
[08:04] lurch_: Um. where is $PORT being defined? I suspect that might be the cause of your problem.
[08:05] lurch_: either set it via 'env PORT=1234', pass it in via the command line ('start app01 PORT=1234'), or source your app's config file as shown here: http://upstart.ubuntu.com/cookbook/#sourcing-files
[08:06] jodh: ok, i'll try that. thx
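Putting jodh's suggestions together, a job config along these lines is what is being described. This is a sketch only: the exec command, user, paths and port are illustrative placeholders, not the contents of the pastebin or the foreman-generated script.

    # /etc/init/app01.conf -- illustrative sketch; user, paths, port and
    # exec command are placeholders
    description "app01 sinatra server"

    start on runlevel [2345]
    stop on runlevel [016]

    # define $PORT here, or pass it on the command line: start app01 PORT=1234
    env PORT=1234

    # replaces any su/cd wrapping in the exec line
    setuid app
    chdir /opt/apps/app01

    # only if the server daemonizes: 'expect fork' for a single fork,
    # 'expect daemon' for a double fork; omit it if the process stays in
    # the foreground
    #expect fork

    # no redirection needed: stdout/stderr end up in /var/log/upstart/app01.log
    exec bundle exec ruby app.rb -p $PORT

If 'status app01' then shows a pid that matches what 'ps' reports for the server, the expect choice (or its absence) is right, and stop/restart should act on the correct process.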
[20:23] I'm having trouble with restarts of a python process that spawns other worker processes.
[20:24] The process can't have 2 instances of itself running because a socket will already be in use by the other process. What seems to be happening is that upstart is restarting the process before it has actually stopped the original (the parent will not exit until all children have completed their work).
[20:25] Using start/stop works fine. Restart, however, will periodically cause the above to happen, and then the process never exits, but upstart reports that it is not running.
[20:26] Add something like kill timeout 300
[20:26] Or whatever number makes sense.
[20:27] Hmm. I mean, shouldn't it track the PID of the parent process?
[20:28] yes
[20:28] But it sends SIGKILL if it doesn’t terminate in a certain number of seconds after SIGTERM.
[20:28] Then just assumes it is closed?
[20:29] SIGKILL kills it for good, so the process won’t exist anymore.
[20:29] Hmm.. but the process is still there (as are the children).
[20:29] You can also add a pre-stop exec/script if the server provides a command that tells it to stop and waits until it does.
[20:29] huh
[20:30] no respawns either. Let me double-check that it is the same as the originating pid
[20:33] So the zombie process is not the same PID as what restart reports
[21:25] I made a stub of my actual program that re-creates the issue: https://gist.github.com/7cf1c9ef87a50ef08c9c (requires pyzmq)
[21:25] and upstart conf: https://gist.github.com/ecc65af894d72c91cecb
[21:26] repeatedly restarting the service quickly will cause it to get an error because the socket is still in use. Upstart will report the process as stopped, but there will be a zombie process (doing nothing)
[21:27] This is probably a mistake on my part, but I'm not sure where :/
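For the restart race being described, the relevant stanzas are 'kill timeout' and, if the server has a clean-shutdown command, 'pre-stop'. The sketch below is illustrative only: the job name, script path, shutdown command and timeout are assumptions, not the contents of the linked gists.

    # illustrative sketch; job name, paths, command and timeout are placeholders
    description "zmq parent that spawns worker processes"

    start on runlevel [2345]
    stop on runlevel [016]

    # the parent runs in the foreground here, so no 'expect' stanza is used

    # give the parent up to 300s after SIGTERM to reap its workers before
    # upstart falls back to SIGKILL
    kill timeout 300

    # if the server exposes a clean-shutdown command that blocks until the
    # socket is released, it could be run here (hypothetical command)
    #pre-stop exec /opt/app/bin/shutdown --wait

    exec /usr/bin/python /opt/app/parent.py

With a long enough kill timeout, upstart waits for the old parent to exit cleanly (reaping its workers) before treating the job as stopped and starting a new instance, rather than escalating to SIGKILL after the short default timeout.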