[17:10] <jdsanders> Hi all - is there any way to provide a custom exec or script for the actual stop procedure?
[17:11] <jdsanders> use case is a server that stops gracefully when it receives a shutdown command over tcp, but loses data when it receives TERM
[17:12] <jdsanders> I tried doing it with "pre-stop exec server-cli shutdown"
[17:12] <jdsanders> which shuts it down properly, but it then respawns
[17:12] <jdsanders> what I really want isn't a pre-stop, but just a stop exec
[17:43] <SpamapS> jdsanders: I believe there is a 'kill signal' keyword
[17:43] <SpamapS>        kill signal SIGNAL
[17:43] <jdsanders> SpamapS: thanks, but I actually want *no* signal
[17:43] <SpamapS> jdsanders: you can also do a 'normal exit SIGWHATEVER' and the signal will not be considered something to respawn on
[17:43] <jdsanders> or more importantly, I want a process killed in a pre-stop clause to not be respawned
[17:43] <jdsanders> oh interesting
[17:43] <jdsanders> i'll read about that command
[17:44] <SpamapS> jdsanders: so if it is graceful on SIGUSR1, then 'normal exit 0 USR1' means don't respawn if it is killed by USR1
[17:44] <SpamapS> jdsanders: the '0' in there also means if the exit code was 0, don't respawn. That's the default btw, so it's surprising that your graceful shutdown doesn't exit(0)
[17:45] <SpamapS> jdsanders: though I believe even if you exit(0) from the USR1 handler, upstart will consider that a kill by SIGUSR1
[17:46] <jdsanders> huh
[17:46] <jdsanders> thanks, sounds like i have a lot more respawn research to do
[17:46] <SpamapS> jdsanders: the default logging level *should* tell you what upstart thinks killed the process
[17:47] <SpamapS> jdsanders: would you mind pasting what you see? your pre-stop, btw, is the right way to go
[17:50] <jdsanders> SpamapS: one moment, going back up my terminal....
[17:58] <jdsanders> shoot can't reproduce!
[17:59] <jdsanders> ah, here
[17:59] <jdsanders> maybe a race condition actually...
[17:59] <jdsanders> deploy@NCVMLITSDEVQUE01:/var/www/apps/smartgrid/current/apollo$ sudo start redis-server
[17:59] <jdsanders> redis-server start/running, process 9535
[17:59] <jdsanders> deploy@NCVMLITSDEVQUE01:/var/www/apps/smartgrid/current/apollo$ sudo status redis-server
[17:59] <jdsanders> redis-server start/running, process 9535
[17:59] <jdsanders> deploy@NCVMLITSDEVQUE01:/var/www/apps/smartgrid/current/apollo$ sudo stop redis-server
[17:59] <jdsanders> redis-server start/running, process 9625
[17:59] <jdsanders> deploy@NCVMLITSDEVQUE01:/var/www/apps/smartgrid/current/apollo$ sudo status redis-server
[17:59] <jdsanders> redis-server start/running, process 9625
[18:00] <jdsanders> SpamapS: ^^
[18:01] <jdsanders> SpamapS: here's the config script - https://gist.github.com/bf5b600645ff1ddc67cf
[18:10] <SpamapS> jdsanders: looking
[19:13] <jdsanders> SpamapS: just got back from a meeting, any luck understanding that redis-server issue?
[19:14] <SpamapS> jdsanders: oops no I got distracted. Looking again
[19:15] <SpamapS> jdsanders: I'd need to see your syslog entries to understand what happened
[19:17] <jdsanders> k, do you know what label to look for in syslog?
[19:17] <jdsanders> don't see anything like "upstart"
[19:18] <jdsanders> looks like "init" maybe
[19:19] <jdsanders> Jul 11 15:18:08 NCVMLITSDEVQUE01 init: redis-server pre-stop process (9765) terminated with status 1
[19:19] <jdsanders> seems problematic maybe
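As the pasted line shows, upstart's messages land in syslog under the "init" tag rather than "upstart". A minimal sketch for pulling out the lines about one job follows; the function name `upstart_job_log` and the default path `/var/log/syslog` are assumptions made for illustration.

```shell
# Print the syslog lines that upstart (logged under the "init" tag) wrote
# about one job. The function name and default log path are assumptions.
upstart_job_log() {
    job=$1
    logfile=${2:-/var/log/syslog}
    # -F treats the pattern as a fixed string, so dots in job names are literal.
    grep -F " init: $job " "$logfile"
}

# Example call (hypothetical): upstart_job_log redis-server
```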
[19:29] <jdsanders> hmmm, yeah it seems like my shutdown command is *expected* to have exit code 1
[19:29] <jdsanders> (oddly enough)
[19:30] <jdsanders> so i think that's causing the problem?
[20:19] <JanC> by default it will consider exit != 0 as an error
[20:20] <JanC> so just override that if you want to ignore that
[20:20] <jdsanders> JanC: thanks, I'm now fairly certain my server is exiting with 0 when it shuts down, and I've modified the pre-stop to make sure it exits 0
[20:20] <jdsanders> all I see in syslog is now basically "process died, respawning"
[20:20] <jdsanders> is there some way I can see more info about why it decided to respawn?
[20:22] <JanC> maybe use a script instead of the main binary
[20:23] <JanC> and make it log whatever seems relevant
[20:28] <jdsanders> ok thanks
[22:12] <SpamapS> jdsanders: you may also want to try 'initctl log-priority info' to get more about why it died
[22:12] <SpamapS> jdsanders: I suspect just setting 'normal exit' properly is all you need
[22:32] <jdsanders> SpamapS: i'll try the log level thing, but the reason I think the exit code is 0 is, I ran the same command that I'm executing from upstart in the foreground
[22:32] <jdsanders> then shut it down from another process
[22:32] <jdsanders> then checked its exit status
[22:32] <jdsanders> and it was 0
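The foreground check described above can be sketched as a small helper that reports whether a command exited on its own or was killed by a signal. It assumes a bash or dash style shell, where death by signal N is reported as status 128+N; a program that itself exits with a status above 128 would be misreported. The name `describe_exit` is made up for this sketch.

```shell
# Run a command in the foreground and report how it ended, as the shell sees it.
# Assumption: the shell reports "killed by signal N" as status 128+N.
describe_exit() {
    status=0
    "$@" || status=$?
    if [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}

# Example call (paths taken from the chat):
# describe_exit /usr/local/bin/redis-server /etc/redis-server.conf
```

The distinction matters for `normal exit`, which accepts both exit statuses and signal names.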
[22:33] <SpamapS> jdsanders: right, so I am wondering what upstart thinks the exit code is
[22:33] <SpamapS> jdsanders: you might also just try 'normal exit 0'
[22:33] <SpamapS> I thought that was the default
[22:33] <SpamapS> but man 5 init isn't entirely crystal clear on it IIRC
[22:34] <SpamapS> "Tasks may exit with a zero exit status to prevent being respawned."
[22:34] <SpamapS> I believe that means jobs may not by default consider 0 a normal exit
[22:34] <SpamapS> "All reasons for a service stopping, except the stop(8) command itself, are considered abnormal."
[22:34] <SpamapS> jdsanders: yeah, try 'normal exit 0'
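Putting the pieces discussed so far together, a job file might look roughly like the one written by this sketch. The actual gist is not reproduced here, so everything in it is an assumption: the use of `redis-cli shutdown` as the shutdown command, the `|| true` guard, and the paths. The file is written to a scratch location; on a real system it would go to `/etc/init/redis-server.conf`.

```shell
# Write an example upstart job file to a scratch location.
# All stanza values and paths below are assumptions for illustration.
job_file="${TMPDIR:-/tmp}/redis-server.conf.example"

cat > "$job_file" <<'EOF'
description "redis-server (example)"

respawn
# Exit status 0 is treated as a deliberate shutdown, so do not respawn after it.
normal exit 0

exec /usr/local/bin/redis-server /etc/redis-server.conf

pre-stop script
    # Ask the server to shut down over TCP before upstart sends any signal.
    # '|| true' keeps a nonzero status from the client from failing pre-stop.
    redis-cli shutdown || true
end script
EOF

echo "wrote $job_file"
```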
[22:37] <jdsanders> huh
[22:37] <jdsanders> i'll try that!
[22:39] <SpamapS> jdsanders: what that means is that upstart won't be able to keep redis running in the face of accidental shutdowns. But that's probably fine.
[22:40] <jdsanders> well
[22:40] <jdsanders> I haven't tested this
[22:40] <jdsanders> but my theory is that any non-graceful shutdown will exit with a different exit code than 0
[22:40] <SpamapS> it should
[22:41] <jdsanders> so I think if somebody *wants* to shut it down with the command that is meant for precisely that, it is reasonable to expect that it doesn't respawn
[22:41] <SpamapS> jdsanders: you're still in better shape than with a bare daemon.
[22:54] <jdsanders> alright, I think I've got this now, thanks for all the help!
[22:54] <jdsanders> SpamapS,JanC: ^^
[22:54] <SpamapS> jdsanders: woot!
[22:55] <jdsanders> woot!
[22:56] <jdsanders> alright...I hate to do this, but here's another one
[22:56] <jdsanders> I'm on ubuntu 10.04
[22:56] <jdsanders> which has upstart 0.6.5
[22:56] <jdsanders> which doesn't support setuid
[22:56] <jdsanders> but I need to run my server as another user
[22:57] <jdsanders> so I'm doing
[22:57] <jdsanders> sudo -u $USER sh -c "/usr/local/bin/redis-server /etc/redis-server.conf 2>&1 >> /var/log/redis/server.log"
[22:57] <jdsanders> problem is that it makes the fork count jankier
[22:57] <jdsanders> when I start the server it actually spawns three processes - one for sudo, one for sh, and one for redis-server
[22:58] <jdsanders> the pid i'd really like to track is the redis-server one
[22:58] <jdsanders> (which isn't doing any forking)
[22:58] <jdsanders> any better way to change users?
[23:00] <SpamapS> jdsanders: sudo is one way
[23:00] <SpamapS> jdsanders: better is start-stop-daemon
[23:00] <jdsanders> ah
[23:00] <jdsanders> ok
[23:00] <jdsanders> I'll look into that
[23:00] <SpamapS> http://upstart.ubuntu.com/cookbook/#changing-user
[23:01] <SpamapS> jdsanders: ^^
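A rough sketch of the start-stop-daemon approach: the helper below only prints an `exec` stanza for the job file. Without `--background`, start-stop-daemon replaces itself with the server instead of forking, so the PID upstart tracks should be the server's own. The user name `redis` and the helper name `render_exec_stanza` are assumptions.

```shell
# Print an upstart exec stanza that changes user with start-stop-daemon
# instead of wrapping the server in sudo and sh. Helper name is hypothetical.
render_exec_stanza() {
    user=$1
    binary=$2
    shift 2
    # Everything after "--" is passed to the server unchanged.
    printf 'exec start-stop-daemon --start --chuid %s --exec %s -- %s\n' \
        "$user" "$binary" "$*"
}

render_exec_stanza redis /usr/local/bin/redis-server /etc/redis-server.conf
```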
[23:01] <jdsanders> awesome, thank you
[23:01] <jdsanders> sorta related: how do i get out of the stop/killed state
[23:08] <jdsanders> I currently have: redis-server start/killed, process 20674
[23:08] <jdsanders> but: sudo kill -0 20674; echo $?
[23:08] <jdsanders> kill: No such process
[23:08] <jdsanders> 1
[23:08] <jdsanders> sudo stop redis-server seems to hang
[23:09] <SpamapS> jdsanders: ugh did you try 'expect fork' at one point?
[23:09] <jdsanders> i did
[23:09] <jdsanders> i was playing with it
[23:09] <SpamapS> Ok, that's a bug
[23:09] <jdsanders> and seem to have foobared it right up
[23:09] <SpamapS> http://paste.ubuntu.com/1087042/
[23:10] <SpamapS> this script loops over the pid space until it reaches the pid given
[23:10] <SpamapS> jdsanders: it's python 3.. might have to port it to python2 ;)
[23:10] <jdsanders> so the theory here is that it's running somewhere, but i don't know where?
[23:10] <SpamapS> no
[23:10] <SpamapS> it's not running
[23:10] <SpamapS> upstart thinks it should be
[23:10] <SpamapS> and will wait forever for it to die
[23:11] <jdsanders> so how does that script help?
[23:11] <SpamapS> it creates processes until the given process exists.. then lets upstart kill it
[23:11] <jdsanders> haha oooooooooooh, i get it
[23:11] <jdsanders> clever
[23:11] <SpamapS> not really, it's a hack on hack on hack
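The idea behind the workaround can be sketched in shell. This is not the pasted script, only an illustration of the technique as described in the chat, and it relies on several assumptions: Linux hands out PIDs sequentially, the script runs as root, and the placeholder process is orphaned so that init adopts it and reaps it. The function name `grab_pid` and the 30-second placeholder are made up.

```shell
# Create short-lived processes until one lands on the PID that upstart is
# still waiting for. The body runs in a subshell "( ... )" so that the
# placeholder is orphaned and adopted by init when the subshell exits.
grab_pid() (
    target=$1
    pid_max=$(cat /proc/sys/kernel/pid_max 2>/dev/null || echo 32768)
    if [ "$target" -ge "$pid_max" ]; then
        echo "pid $target is outside the pid space" >&2
        exit 1
    fi
    if kill -0 "$target" 2>/dev/null; then
        echo "pid $target is already in use" >&2
        exit 1
    fi
    attempts=0
    # Bound the loop so a PID that never comes up cannot spin forever.
    while [ "$attempts" -lt $((pid_max * 2)) ]; do
        sleep 30 &
        pid=$!
        if [ "$pid" -eq "$target" ]; then
            echo "pid $target now exists for about 30 seconds"
            exit 0
        fi
        # Wrong PID: discard this placeholder and try the next one.
        kill "$pid" 2>/dev/null || true
        wait "$pid" 2>/dev/null || true
        attempts=$((attempts + 1))
    done
    echo "gave up before reaching pid $target" >&2
    exit 1
)

# Example call (PID from the chat): grab_pid 20674
```

When the placeholder exits, init sees a child with the expected PID die, which is what the stuck job was waiting for.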
[23:11] <jdsanders> is there any way to re-initialize upstart?
[23:11] <jdsanders> I could always just reboot
[23:11] <SpamapS> jdsanders: yes but then it forgets all job state
[23:11] <jdsanders> well yeah, hacks are clever
[23:11] <SpamapS> yeah reboot works if you're just noodling around
[23:12] <SpamapS> jdsanders: the original fix was in ruby and about 10 lines of it at that.. this one is just more friendly
[23:12] <jdsanders> what do you mean by "forgets all job state"
[23:12] <SpamapS> jdsanders: you can send upstart a command to re-execute itself, but then it forgets what is running and what is not
[23:12] <SpamapS> which means your shutdown does not go so smoothly
[23:15] <jdsanders> hmmm
[23:15] <jdsanders> ok i'll figure it out