Nick | Message | Time |
---|---|---|
=== JanC is now known as Guest22660 | ||
=== JanC_ is now known as JanC | ||
frew | does anyone know if there is a way to detect a restart? That is, I would like to run a command when a service is restarted automatically, but not when it is started via stop then start | 17:27 |
frew | I sorta expect the start on reason or something to be in an env var? | 17:30 |
chras | frew: yeah | 17:32 |
chras | you can do this | 17:32 |
chras | sec, lemme look for my code which does this | 17:32 |
frew | awesome, thanks | 17:32 |
frew | oh is it UPSTART_EVENTS ? | 17:32 |
* frew *just* found http://upstart.ubuntu.com/cookbook/#standard-environment-variables | 17:33 | |
* frew would guess a "respawn" in UPSTART_EVENTS means it crashed | 17:33 | |
chras | yeah you can check on that | 17:34 |
chras | so when you do an initctl restart you want another event to fire? | 17:35 |
chras | or when you kill the running process, and upstart restarts it automatically | 17:35 |
chras | or both? | 17:35 |
frew | no, really all I want, if possible, is that if it crashes (or if I killed it directly) | 17:35 |
frew | so the latter (I think) | 17:35 |
frew | basically if it restarted due to a crash | 17:36 |
chras | right, k, so i have an event which does this | 17:36 |
chras | sec lemme pastebin it | 17:36 |
frew | thanks | 17:36 |
chras | https://pastebin.mozilla.org/8870501 | 17:37 |
chras | but, when i test it, it doesnt trigger on an initctl restart $process | 17:37 |
frew | which is the goal here. | 17:37 |
frew | huh, so are scripts bash? | 17:38 |
chras | more or less | 17:38 |
chras | with a set -e | 17:38 |
frew | ok | 17:38 |
frew | I assumed they'd be sh | 17:38 |
chras | well my sh is linked to bash | 17:38 |
frew | mine's dash | 17:38 |
chras | i dont use dash | 17:38 |
frew | gotcha. | 17:38 |
chras | dash does ... unexpected things | 17:38 |
frew | heh | 17:38 |
frew | so you install this upstart config | 17:39 |
frew | and it captures basically all events? | 17:39 |
frew | (hence the JOB!=self_help, which is what this is) | 17:40 |
chras | yeah im having trouble getting it to trigger on a restart though | 17:41 |
frew | restart meaning a crash? | 17:42 |
chras | my thing triggers on a crash | 17:42 |
chras | just not an 'initctl restart' | 17:42 |
frew | ok that's how I want it | 17:42 |
frew | yeah | 17:42 |
frew | that's what I want. | 17:42 |
frew | if someone manually restarts, I think I'm ok with that not being logged especially | 17:42 |
frew | though it would be nice to know as well | 17:42 |
chras | i made this event back when i had heartbeat dying unexpectedly | 17:42 |
chras | and then i had a config file which took actions based on what happened | 17:43 |
chras | ie | 17:43 |
chras | heartbeat XCPU echo "nas_self_help: heartbeat died unexpectedly with XCPU Signal, restarting" \| wall -n | 17:43 |
chras | so heartbeat was getting a XCPU kill flag for whatever reason (ended up being a bug in heartbeat debug) | 17:43 |
chras | and then id wall it out for instance | 17:43 |
frew | right my thought here is to either page or create an issue or something | 17:44 |
frew | not sure | 17:44 |
chras | and it does trigger on a stop / kill / etc | 17:44 |
chras | but not sure about restart atm | 17:44 |
frew | ok well I need to get this installed and I'll see how well it works | 17:44 |
frew | so GCONFIG and PCONFIG (currently) do nothing | 17:45 |
frew | so key signal rest is like | 17:45 |
frew | who did what | 17:45 |
frew | what they did | 17:46 |
frew | and other info? | 17:46 |
frew | or something? | 17:46 |
frew | oh sorry I am reading this and learning more | 17:46 |
frew | I should talk to my rubber duck before asking here | 17:46 |
chras | well those are just configs that i use | 17:47 |
frew | right | 17:47 |
chras | i can paste those in the thing, sec | 17:47 |
frew | thanks that'd be helpful | 17:47 |
chras | https://pastebin.mozilla.org/8870503 | 17:48 |
chras | i guess the main thing i wanted you to originally see was the 'start on stopped JOB=${some_job}' | 17:49 |
chras | and then have a task which does whatever you want in a script stanza | 17:50 |
frew | right | 17:50 |
frew | well and having a single task that does it is very elegant | 17:50 |
frew | I was gonna have a script I put in all my scripts | 17:51 |
chras | frew: yeah i cant get it working in the context of an 'initctl restart' | 18:09 |
chras | as a workaround you could do an initctl emit in your pre-stop | 18:11 |
chras | yeah a restart never emits a 'stopping' or 'stopped' event | 18:27 |
frew | chras: fwiw the z thing is not really needed anymore iirc | 18:31 |
chras | frew: ug. i got it | 18:31 |
chras | its definitely a bug | 18:31 |
frew | yeah | 18:31 |
chras | if you have a pre-stop script | 18:31 |
chras | then the job goes running -> pre-stop -> start | 18:32 |
chras | and never emits a stopping/stopped | 18:32 |
frew | you think the maintainers of upstart would fix it? | 18:32 |
frew | I was telling my boss about this and he was a little hesitant since upstart is destined for the void | 18:32 |
frew | but so is everything so I'm gonna keep going | 18:32 |
chras | if you DONT have a pre-stop script, then it works as intended | 18:32 |
frew | interesting. | 18:32 |
chras | it MIGHT be this bug, https://bugs.launchpad.net/upstart/+bug/703800 | 18:32 |
frew | btw logger ~~ syslog? | 18:33 |
frew | I'd say even if it is that bug, that bug is 4y old and will probably never get fixed | 18:33 |
chras | right | 18:33 |
frew | I think I'm gonna make an "ANY" job too. | 18:34 |
frew | at least so I can get started with this reasonably. | 18:34 |
chras | anyways, sorry couldnt be more help | 18:36 |
frew | no way this is great | 18:37 |
frew | you 100% solved my issue and added more | 18:37 |
chras | great | 18:38 |
frew | chras: actually I guess I maybe misunderstand you; I don't get logging or anything for a respawn (crash and restart) | 20:15 |
frew | is that what you were saying is broken? | 20:15 |
chras | hi | 20:17 |
chras | k the quick summary was | 20:18 |
chras | IF you have a pre-stop, then an initctl restart does start -> pre-stop -> start , and skips doing a stopping, or stopped event | 20:18 |
frew | ok that's what I thought you were saying. | 20:18 |
frew | this is neither a start, stop, nor restart, but a service that starts and crashes after about 3s, consistently | 20:19 |
frew | and I get no events | 20:19 |
chras | so your event fails to start at all | 20:20 |
chras | and thats what you're trying to catch ? | 20:20 |
frew | well I mean, it never forks, so as long as upstart knows, it worked correctly for 3s | 20:21 |
frew | and then crashed | 20:21 |
frew | but yes, I'm trying to catch an error causing a respawn, likely a compile error | 20:21 |
chras | right. ok, i think i understand | 20:21 |
frew | though it could have been running for three hours | 20:21 |
chras | so your daemon forks, and upstart isnt detecting that it died? | 20:21 |
frew | it would still be the same situation | 20:21 |
frew | no | 20:21 |
frew | it never forks | 20:21 |
frew | it just exec's | 20:21 |
chras | ah | 20:21 |
frew | hence upstart not knowing it's ready | 20:22 |
chras | does your upstart throw an error like | 20:22 |
chras | main process ($pid) terminated with status $exit? | 20:23 |
frew | if it would place an error in the syslog, no | 20:23 |
frew | not sure where else to look | 20:23 |
chras | check dmesg | 20:23 |
chras | do you 'console log' ? | 20:23 |
frew | I don't know? | 20:24 |
chras | mind putting your event on pastebin? | 20:24 |
frew | yeah it's super basic | 20:24 |
frew | https://gist.github.com/frioux/0e4e4c1dc0b02f82a9a1a327ae55c248 | 20:25 |
chras | right k | 20:25 |
chras | i think you want to add a 'console log' to that | 20:26 |
chras | so you can get some minimal logging | 20:26 |
frew | ok | 20:26 |
frew | (fwiw I know why it's crashing, I just want to be able to detect better in the future) | 20:26 |
chras | and you're trying to throw another event when that one fails to start? | 20:27 |
frew | well, I just want to find out, somehow | 20:27 |
frew | originally I was gonna write a pre-start that looked at | 20:28 |
frew | UPSTART_EVENTS | 20:28 |
frew | and if respawn was in there, do a thing | 20:28 |
frew | but your solution was a lot more elegant | 20:28 |
chras | so, when i do this | 20:29 |
chras | https://pastebin.mozilla.org/8870543 | 20:32 |
chras | when my main process ends with a non 0 exit code (which i assume is your failure) | 20:32 |
chras | it still triggers my self_help | 20:32 |
frew | surely it is | 20:32 |
frew | although... it's probably exiting 255 | 20:33 |
frew | any chance upstart ignores high errors? | 20:33 |
chras | should be anything non-zero | 20:33 |
frew | ok | 20:33 |
chras | unless you told it to ignore certain exit codes | 20:33 |
frew | I don't even know how I'd do that? | 20:33 |
frew | where would that be done | 20:33 |
chras | its just a main-config command | 20:33 |
chras | sec | 20:34 |
chras | lemme find it | 20:34 |
chras | http://upstart.ubuntu.com/cookbook/#normal-exit | 20:35 |
frew | well I showed you the whole file | 20:35 |
frew | unless you can globally do it, I didn't do that | 20:35 |
chras | yeah so itll be non-zero exit codes | 20:36 |
chras | will cause it to fail | 20:36 |
chras | so my self_help event isnt triggering on your thing exiting weirdly? | 20:37 |
frew | correct | 20:38 |
frew | I have to admit though | 20:38 |
chras | does your reaper fork ? | 20:38 |
chras | or is it a foreground process | 20:38 |
frew | I did clean up your self_help thing to be shorter and simpler | 20:38 |
frew | no it doesn't fork | 20:38 |
frew | it's foreground | 20:38 |
frew | lemme paste the changes I made | 20:39 |
chras | did you run an initctl reload-configuration? | 20:39 |
frew | they shouldn't harm anything | 20:39 |
chras | to get the new event | 20:39 |
frew | yes | 20:39 |
chras | does an initctl start self_help work? | 20:39 |
frew | well I get output about other services | 20:39 |
frew | but let's see | 20:39 |
chras | ah interesting so you know it IS working, just not with your reaper? | 20:39 |
frew | oh wait | 20:40 |
frew | wtf | 20:40 |
frew | initctl: Unknown parameter: JOB | 20:40 |
frew | does that mean anything to you? | 20:40 |
chras | yeah | 20:40 |
chras | initctl start self_help JOB=test | 20:40 |
chras | thats just the instancing stuff triggering | 20:40 |
chras | normally JOB is emitted automatically by other upstart events | 20:41 |
frew | initctl: Job failed to start | 20:41 |
frew | right | 20:41 |
frew | no obvious reason why it's failing | 20:41 |
chras | hm, pastebin it ? | 20:41 |
frew | the modified self_help or what? | 20:41 |
chras | yea | 20:41 |
frew | k | 20:41 |
frew | https://gist.github.com/frioux/953fe2f4b950c2bef58521f0ca0e0ad9 | 20:42 |
frew | should be effectively the same, but work with any POSIX shell (and only allow one extra config file) | 20:42 |
frew | (and I renamed it watchdog) | 20:42 |
chras | right | 20:43 |
chras | you're using dash right? | 20:45 |
frew | I wrote it with dash in mind, since we are moving to dash soon, but it's still bash now (12.04) | 20:45 |
frew | oh no | 20:46 |
chras | its just a syntax error for me | 20:46 |
frew | 12.04 is dash | 20:46 |
frew | oh? | 20:46 |
* frew eyeballs | 20:46 | |
chras | ./foo.script: 24: ./foo.script: Syntax error: "(" unexpected (expecting "then") | 20:47 |
frew | I wonder why that's wrong | 20:47 |
frew | it's supposed to work | 20:47 |
frew | `(expression) True if expression is true.` from dash(1) | 20:47 |
chras | yeah it looks right | 20:48 |
chras | sec, seeing if i can fix | 20:48 |
frew | can repro fwiw | 20:48 |
frew | fwiw the ()'s are not needed, since -o binds more tightly than -a | 20:49 |
frew | but I thought it made it easier to read | 20:49 |
frew | dash -c '[ ( -f /bin/dash ) ] && echo 1' | 20:50 |
frew | I think this must be a bug in dash. | 20:51 |
chras | dash does unexpected things :P | 20:52 |
frew | I never claimed it was perfect | 20:52 |
frew | I just can't switch what /bin/sh is | 20:53 |
frew | ok I'll just leave out the () for now | 20:54 |
chras | yeah i cant use ()'s in dash for whatever reason | 20:54 |
frew | oh derp | 21:06 |
frew | dash -c '[ \( -f /bin/dash \) ] && echo true' | 21:06 |
frew | gotta escape them for some reason | 21:07 |
frew | must be a history expansion or something. | 21:08 |
frew | nonetheless I do see this: `self_help(pt-heartbeat-update-mysql): caught job[pt-heartbeat-update-mysql] instance[] process[respawn] result[failed] status[] signal[]` in syslog, but nothing about my reaper | 21:09 |
chras | hm | 21:20 |
frew | trying a different route, but 90% sure that upstart is just not creating an event for that | 21:20 |
frew | I don't see where it would in the lifecycle anyway, fwiw | 21:20 |
chras | you can try bumping the debug up with log-priority debug | 21:21 |
chras | and see where its dying | 21:21 |
frew | hm. | 21:23 |
frew | I have a theory. | 21:23 |
* frew "simplifies" | 21:23 | |
frew | yeah, it catches start and stop, but not respawn | 21:28 |
chras | respawn, when the main process dies? | 21:29 |
frew | yeah | 21:29 |
frew | respawn is an even that's not real, to be clear | 21:29 |
frew | I thought it might be | 21:29 |
frew | s/even/event | 21:29 |
frew | I sorta think consuming events won't work for this, since there isn't one for what I'm thinking of | 21:30 |
chras | lemme try here | 21:32 |
chras | k so | 21:32 |
chras | we're triggering self_help/watchdog on the upstart event 'stopped' | 21:33 |
chras | after something is completely stopped | 21:33 |
chras | for it to also hit your respawn, you need to change that to be 'start on stopping ...' | 21:33 |
chras | i should make that change for my own needs as well i guess | 21:34 |
chras | with it only being set to 'stopped' | 21:37 |
chras | it will wait for upstart to give up on respawning before its triggered | 21:38 |
chras | like i suppose, its the difference between if your job fails 10 times in a row, and upstart gives up on it | 21:38 |
chras | getting 10 emails, or 1 | 21:38 |
chras | if your watchdog sends email out at least. | 21:38 |
chras | oo, interesting though | 21:51 |
chras | the respawn attempts have a result[ok] instead of a result[failed] | 21:52 |
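
Two shell details from the log, as a minimal sketch rather than the contents of either pastebin. It assumes the watchdog job sees the variables JOB, INSTANCE, PROCESS, RESULT, EXIT_STATUS and EXIT_SIGNAL from the triggering stopping/stopped event, which is what the syslog line frew pasted suggests. The function names `should_alert` and `format_event` and the job name `watchdog` in the filter are hypothetical. An unescaped `(` is shell syntax, which is why dash rejected it inside `[ ]`; chaining separate tests with `&&` and `||` avoids the escaping entirely.

```sh
#!/bin/sh
# Illustrative helpers for a watchdog-style job; names are made up.

# Return 0 when an event looks worth alerting on.
# $1 = job name, $2 = RESULT value, $3 = PROCESS value
should_alert() {
    # The escaped-parentheses form that worked in dash would be:
    #   [ "$1" != watchdog -a \( "$2" = failed -o "$3" = respawn \) ]
    # The chained form below needs no escaping.
    [ "$1" != "watchdog" ] && { [ "$2" = "failed" ] || [ "$3" = "respawn" ]; }
}

# Print one line in the same style as the syslog excerpt in the log.
# Unset variables are printed as empty brackets.
format_event() {
    printf 'caught job[%s] instance[%s] process[%s] result[%s] status[%s] signal[%s]\n' \
        "${JOB:-}" "${INSTANCE:-}" "${PROCESS:-}" "${RESULT:-}" \
        "${EXIT_STATUS:-}" "${EXIT_SIGNAL:-}"
}
```

Inside a script stanza this could be used as `if should_alert "$JOB" "$RESULT" "$PROCESS"; then format_event | logger -t watchdog; fi`. Matching on `respawn` as well as `failed` is there because the respawn attempts chras saw were reported with result[ok].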