jdsanders | Hi all - is there any way to provide a custom exec or script for the actual stop procedure? | 17:10 |
---|---|---|
jdsanders | use case is a server that stops gracefully when it receives a shutdown command over tcp, but loses data when it receives TERM | 17:11 |
jdsanders | I tried doing it with "pre-stop exec server-cli shutdown" | 17:12 |
jdsanders | which shuts it down properly, but it then respawns | 17:12 |
jdsanders | what I really want isn't a pre-stop, but just a stop exec | 17:12 |
SpamapS | jdsanders: I believe there is a 'kill signal' keyword | 17:43 |
SpamapS | kill signal SIGNAL | 17:43 |
jdsanders | SpamapS: thanks, but I actually want *no* signal | 17:43 |
SpamapS | jdsanders: you can also do a 'normal exit SIGWHATEVER' and the signal will not be considered something to respawn on | 17:43 |
jdsanders | or more importantly, I want a process killed in a pre-stop clause to not be respawned | 17:43 |
jdsanders | oh interesting | 17:43 |
jdsanders | i'll read about that command | 17:43 |
SpamapS | jdsanders: so if it is graceful on SIGUSR1, then 'normal exit 0 USR1' means don't respawn if it is killed by USR1 | 17:44 |
SpamapS | jdsanders: the '0' in there also means if the exit code was 0, don't respawn. That's the default btw, so it's surprising that your graceful shutdown doesn't exit(0) | 17:44 |
SpamapS | jdsanders: though I believe even if you exit(0) from the USR1 handler, upstart will consider that a kill by SIGUSR1 | 17:45 |
jdsanders | huh | 17:46 |
jdsanders | thanks, sounds like i have a lot more respawn research to do | 17:46 |
SpamapS | jdsanders: the default logging level *should* tell you what upstart thinks killed the process | 17:46 |
SpamapS | jdsanders: would you mind pasting what you see? your pre-stop, btw, is the right way to go | 17:47 |
jdsanders | SpamapS: one moment, going back up my terminal.... | 17:50 |
jdsanders | shoot can't reproduce! | 17:58 |
jdsanders | ah, here | 17:59 |
jdsanders | maybe a race condition actually... | 17:59 |
jdsanders | deploy@NCVMLITSDEVQUE01:/var/www/apps/smartgrid/current/apollo$ sudo start redis-server | 17:59 |
jdsanders | redis-server start/running, process 9535 | 17:59 |
jdsanders | deploy@NCVMLITSDEVQUE01:/var/www/apps/smartgrid/current/apollo$ sudo status redis-server | 17:59 |
jdsanders | redis-server start/running, process 9535 | 17:59 |
jdsanders | deploy@NCVMLITSDEVQUE01:/var/www/apps/smartgrid/current/apollo$ sudo stop redis-server | 17:59 |
jdsanders | redis-server start/running, process 9625 | 17:59 |
jdsanders | deploy@NCVMLITSDEVQUE01:/var/www/apps/smartgrid/current/apollo$ sudo status redis-server | 17:59 |
jdsanders | redis-server start/running, process 9625 | 17:59 |
jdsanders | SpamapS: ^^ | 18:00 |
jdsanders | SpamapS: here's the config script - https://gist.github.com/bf5b600645ff1ddc67cf | 18:01 |
SpamapS | jdsanders: looking | 18:10 |
jdsanders | SpamapS: just got back from a meeting, any luck understanding that redis-server issue? | 19:13 |
SpamapS | jdsanders: oops no I got distracted. Looking again | 19:14 |
SpamapS | jdsanders: I'd need to see your syslog entries to understand what happened | 19:15 |
jdsanders | k, do you know what label to look for in syslog? | 19:17 |
jdsanders | don't see anything like "upstart" | 19:17 |
jdsanders | looks like "init" maybe | 19:18 |
jdsanders | Jul 11 15:18:08 NCVMLITSDEVQUE01 init: redis-server pre-stop process (9765) terminated with status 1 | 19:19 |
jdsanders | seems problematic maybe | 19:19 |
jdsanders | hmmm, yeah it seems like my shutdown command is *expected* to have exit code 1 | 19:29 |
jdsanders | (oddly enough) | 19:29 |
jdsanders | so i think that's causing the problem? | 19:30 |
JanC | by default it will consider exit != 0 as an error | 20:19 |
JanC | so just override that if you want to ignore that | 20:20 |
jdsanders | JanC: thanks, I'm now fairly certain my server is exiting with 0 when it shuts down, and I've modified the pre-stop to make sure it exits 0 | 20:20 |
jdsanders | all I see in syslog is now basically "process died, respawning" | 20:20 |
jdsanders | is there some way I can see more info about why it decided to respawn? | 20:20 |
JanC | maybe use a script instead of the main binary | 20:22 |
JanC | and make it log whatever seems relevant | 20:23 |
jdsanders | ok thanks | 20:28 |
SpamapS | jdsanders: you may also want to try 'initctl log-priority info' to get more about why it died | 22:12 |
SpamapS | jdsanders: I suspect just setting 'normal exit' properly is all you need | 22:12 |
jdsanders | SpamapS: i'll try the log level thing, but the reason I think the exit code is 0 is, I ran the same command that I'm executing from upstart in the foreground | 22:32 |
jdsanders | then shut it down from another process | 22:32 |
jdsanders | then checked its exit status | 22:32 |
jdsanders | and it was 0 | 22:32 |
SpamapS | jdsanders: right, so I am wondering what upstart thinks the exit code is | 22:33 |
SpamapS | jdsanders: you might also just try 'normal exit 0' | 22:33 |
SpamapS | I thought that was the default | 22:33 |
SpamapS | but man 5 init isn't entirely crystal clear on it IIRC | 22:33 |
SpamapS | "Tasks may exit with a zero exit status to prevent being respawned." | 22:34 |
SpamapS | I believe that means jobs may not by default consider 0 a normal exit | 22:34 |
SpamapS | "All reasons for a service stopping, except the stop(8) command itself, are considered abnormal." | 22:34 |
SpamapS | jdsanders: yeah, try 'normal exit 0' | 22:34 |
jdsanders | huh | 22:37 |
jdsanders | i'll try that! | 22:37 |
SpamapS | jdsanders: what that means is that upstart won't be able to keep redis running in the face of accidental shutdowns. But that's probably fine. | 22:39 |
jdsanders | well | 22:40 |
jdsanders | I haven't tested this | 22:40 |
jdsanders | but my theory is that any non-graceful shutdown will exit with a different exit code than 0 | 22:40 |
SpamapS | it should | 22:40 |
jdsanders | so I think if somebody *wants* to shut it down with the command that is meant for precisely that, it is reasonable to expect that it doesn't respawn | 22:41 |
SpamapS | jdsanders: you're still in better shape than with a bare daemon. | 22:41 |
jdsanders | alright, I think I've got this now, thanks for all the help! | 22:54 |
jdsanders | SpamapS,JanC: ^^ | 22:54 |
SpamapS | jdsanders: woot! | 22:54 |
jdsanders | woot! | 22:55 |
jdsanders | alright...I hate to do this, but here's another one | 22:56 |
jdsanders | I'm on ubuntu 10.04 | 22:56 |
jdsanders | which has upstart 0.6.5 | 22:56 |
jdsanders | which doesn't support setuid | 22:56 |
jdsanders | but I need to run my server as another user | 22:56 |
jdsanders | so I'm doing | 22:57 |
jdsanders | sudo -u $USER sh -c "/usr/local/bin/redis-server /etc/redis-server.conf 2>&1 >> /var/log/redis/server.log" | 22:57 |
jdsanders | problem is that it makes the fork count jankier | 22:57 |
jdsanders | when I start the server it actually spawns three processes - one for sudo, one for sh, and one for redis-server | 22:57 |
jdsanders | the pid i'd really like to track is the redis-server one | 22:58 |
jdsanders | (which isn't doing any forking) | 22:58 |
jdsanders | any better way to change users? | 22:58 |
SpamapS | jdsanders: sudo is one way | 23:00 |
SpamapS | jdsanders: better is start-stop-daemon | 23:00 |
jdsanders | ah | 23:00 |
jdsanders | ok | 23:00 |
jdsanders | I'll look into that | 23:00 |
SpamapS | http://upstart.ubuntu.com/cookbook/#changing-user | 23:00 |
SpamapS | jdsanders: ^^ | 23:01 |
jdsanders | awesome, thank you | 23:01 |
jdsanders | sorta related: how do i get out of the stop/killed state | 23:01 |
jdsanders | I currently have: redis-server start/killed, process 20674 | 23:08 |
jdsanders | but: sudo kill -0 20674; echo $? | 23:08 |
jdsanders | kill: No such process | 23:08 |
jdsanders | 1 | 23:08 |
jdsanders | sudo stop redis-server seems to hang | 23:08 |
SpamapS | jdsanders: ugh did you try 'expect fork' at one point? | 23:09 |
jdsanders | i did | 23:09 |
jdsanders | i was playing with it | 23:09 |
SpamapS | Ok, that's a bug | 23:09 |
jdsanders | and seem to have foobared it right up | 23:09 |
SpamapS | http://paste.ubuntu.com/1087042/ | 23:09 |
SpamapS | this script loops over the pid space until it reaches the pid given | 23:10 |
SpamapS | jdsanders: it's python 3.. might have to port it to python2 ;) | 23:10 |
jdsanders | so the theory here is that it's running somewhere, but i don't know where? | 23:10 |
SpamapS | no | 23:10 |
SpamapS | it's not running | 23:10 |
SpamapS | upstart thinks it should be | 23:10 |
SpamapS | and will wait forever for it to die | 23:10 |
jdsanders | so how does that script help? | 23:11 |
SpamapS | it creates processes until the given process exists.. then lets upstart kill it | 23:11 |
jdsanders | haha oooooooooooh, i get it | 23:11 |
jdsanders | clever | 23:11 |
SpamapS | not really, it's a hack on hack on hack | 23:11 |
jdsanders | is there any way to re-initialize upstart? | 23:11 |
jdsanders | I could always just reboot | 23:11 |
SpamapS | jdsanders: yes but then it forgets all job state | 23:11 |
jdsanders | well yeah, hacks are clever | 23:11 |
SpamapS | yeah reboot works if you're just noodling around | 23:11 |
SpamapS | jdsanders: the original fix was in ruby and about 10 lines of it at that.. this one is just more friendly | 23:12 |
jdsanders | what do you mean by "forgets all job state" | 23:12 |
SpamapS | jdsanders: you can send upstart a command to re-execute itself, but then it forgets what is running and what is not | 23:12 |
SpamapS | which means your shutdown does not go so smoothly | 23:12 |
jdsanders | hmmm | 23:15 |
jdsanders | ok i'll figure it out | 23:15 |
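
A minimal sketch of how the suggestions in the log above might fit together in one job file for Upstart 0.6.5. This is not the configuration from the gist. The `redis` user, the `redis-cli` path, and the `|| true` guard are assumptions for illustration; the `redis-server` binary and config paths are copied from the command pasted in the log.

```
# /etc/init/redis-server.conf (hypothetical sketch, not the gist from the log)
description "redis-server"

respawn
# Treat a clean exit (status 0) as intentional, so a graceful shutdown
# is not respawned.
normal exit 0

# Upstart 0.6.5 has no setuid stanza. start-stop-daemon changes user and
# then execs the server without an extra fork, so the PID Upstart tracks
# should be redis-server itself. The "redis" user is an assumption.
exec start-stop-daemon --start --chuid redis --exec /usr/local/bin/redis-server -- /etc/redis-server.conf

# Ask the server to shut down over TCP before Upstart sends any signal.
# "|| true" keeps a non-zero exit from the client from failing pre-stop.
# The redis-cli path is an assumption.
pre-stop script
    /usr/local/bin/redis-cli shutdown || true
end script
```

If the job still respawns after a graceful shutdown, running `initctl log-priority info` (as suggested in the log) should make `init` write more detail to syslog about why it thinks the process ended.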