/srv/irclogs.ubuntu.com/2014/04/10/#upstart.txt

=== hatchetation_ is now known as hatchetation
=== balkamos_ is now known as balkamos
styolthanks again xnox, you rock01:33
=== SpamapS_ is now known as SpamapS
=== PaulePan1er is now known as PaulePanter
dstokeshi guys. what's the proper way to execute something when an app crashes but hasn't reached the respawn limit? post-stop is also run when the process is manually stopped so that's not working for me. is there an event i can have another task listen to?15:32
dstokeslike, maybe i can check $PROCESS on the 'stopped' event?15:34
dstokesbasically looking for `start on respawned <job>`16:10
dstokesguise?16:47
styoldstokes: I'm a newbie so I may not have the proper guidance available, but you might be able to `initctl emit your-event-name` within your stop routine and then `start on your-event-name`17:47
dstokesstyol: afaik there is no way to detect a process failure / respawn in stop routines. only difference in a respawn is that pre-stop isn't run17:49
styoldstokes: ah I see what you're saying17:53
dstokeseasy to detect when the process stops, not easy to make a distinction btwn `stop <job>` and process crashing17:54
styoldstokes: this post suggests PROCESS=respawn might be able to be used https://bugs.launchpad.net/upstart/+bug/71680217:54
dstokesstyol: process=respawn indicates a process that reached its respawn limit. it's triggered once after n unsuccessful respawn attempts. i'm currently using a task to monitor those failures.17:55
dstokesgoal now is to detect when apps crash but successfully restart, for debugging purposes17:56
xnoxstyol: dstokes: the events emitted are - stopped/failed JOB='test' INSTANCE='' RESULT='failed' PROCESS='respawn'22:33
xnoxstyol: which is emitted after e.g. the main process segfaulted.22:33
xnoxstyol: after that there will be new starting/started events from respawning.22:34
xnoxwait no, that one is when the respawn limit is reached22:34
styolah i see22:34
styoldstokes: ping22:34
dstokesyup22:35
dstokesrespawn does not represent a respawn, but reaching the respawn limit22:36
xnoxstyol: dstokes: for the intermediate failures one gets: started/failed JOB='test' INSTANCE=''22:36
xnoxstyol: dstokes: let me paste the log of events22:36
dstokesjob failure is indicative of the upstart job failing, not the managed process right? pretty sure i tested that case..22:37
xnoxdstokes: so without respawn the event i see is - stopped JOB=test EXIT_STATUS=1 PROCESS=main RESULT=failed22:46
xnoxdstokes: or e.g. - stopped JOB=test RESULT=failed PROCESS=main EXIT_SIGNAL=FPE22:46
xnox(floating point exception, core-dump)22:46
xnoxdstokes: but with respawn one gets less information. You could leave out respawn, and instead have a second job - e.g. monitor.conf which is "start on stopped mainjob" which then can act appropriate and do "$ start --no-wait mainjob" to "respawn" and/or do other clean-up things 22:47
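A minimal sketch of what such a monitor job's script could do with the event environment, assuming the `stopped` event variables quoted above (JOB, RESULT, PROCESS, EXIT_STATUS, EXIT_SIGNAL) are exported into the job started by that event. The function name `describe_stop` is hypothetical and not part of Upstart.

```shell
#!/bin/sh
# describe_stop: turn the variables of a "stopped" event into one line of text.
# Assumption: JOB, RESULT, PROCESS, EXIT_STATUS and EXIT_SIGNAL are present in
# the environment of the job that was started by the event.
describe_stop() {
    if [ "${RESULT:-ok}" != "failed" ]; then
        echo "${JOB}: stopped normally"
    elif [ "${PROCESS:-}" = "respawn" ]; then
        # Per the discussion, PROCESS=respawn means the respawn limit was hit.
        echo "${JOB}: respawn limit reached"
    elif [ -n "${EXIT_SIGNAL:-}" ]; then
        echo "${JOB}: ${PROCESS} process killed by signal ${EXIT_SIGNAL}"
    elif [ -n "${EXIT_STATUS:-}" ]; then
        echo "${JOB}: ${PROCESS} process exited with status ${EXIT_STATUS}"
    else
        echo "${JOB}: failed (no further detail in the event)"
    fi
}
```

Inside a monitor.conf `script` section this could be called as `describe_stop` before deciding whether to run `start --no-wait mainjob`.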
dstokesi'm only seeing that event after respawn limit with - stopped JOB=test PROCESS=respawn RESULT=failed22:48
xnoxdstokes: looks ugly, and i think it's a bug that /less/ info is passed.22:48
dstokesmaybe i need to check my exit code..22:48
dstokesxnox: then you have to hack together your own respawn limit right?22:48
xnoxdstokes: "so without respawn the event i see is" as in if the main job _does not have "respawn" stanza_ I  see more info in the failed events.22:48
xnoxdstokes: yeap.22:48
xnoxi think it's bug that we don't emit as to /why/ we failed.22:49
dstokesxnox: sry, main job _does_ have respawn stanza, along with limit22:49
dstokesi'm only seeing stopped and stopping emitted at the end of several respawn attempts (when the limit is reached)22:49
xnoxdstokes: respawn -> only one failed event, when respawn limit reached without info; without respawn -> more details as to why main process failed.22:49
dstokesxnox: i see. so that's just the way it is ;)22:50
dstokesto work around it, i should be writing my own respawn task22:50
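A rough sketch of the hand-rolled respawn limit mentioned here, assuming the monitor job keeps restart timestamps in a small state file. The names `should_respawn`, `STATE_FILE`, `LIMIT` and `INTERVAL` and their values are illustrative assumptions, not Upstart features.

```shell
#!/bin/sh
# Hypothetical helper for a do-it-yourself respawn limit.
STATE_FILE="${STATE_FILE:-/tmp/mainjob.respawns}"  # illustrative path
LIMIT="${LIMIT:-10}"        # at most this many restarts ...
INTERVAL="${INTERVAL:-5}"   # ... within this many seconds

# should_respawn NOW: record a restart attempt at time NOW (epoch seconds).
# Returns 0 if the job may be restarted, 1 if the limit has been exceeded.
should_respawn() {
    now="$1"
    cutoff=$((now - INTERVAL))
    touch "$STATE_FILE"
    # Keep only the timestamps still inside the window, then add this one.
    awk -v c="$cutoff" '$1 > c' "$STATE_FILE" > "$STATE_FILE.tmp"
    echo "$now" >> "$STATE_FILE.tmp"
    mv "$STATE_FILE.tmp" "$STATE_FILE"
    count=$(awk 'END { print NR }' "$STATE_FILE")
    [ "$count" -le "$LIMIT" ]
}
```

In the monitor job this might be used as `if should_respawn "$(date +%s)"; then start --no-wait mainjob; fi`, with the else branch handling the "gave up" notification.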
dstokesoften times a process will fail, then startup successfully. those are the cases i'm after so i can debug why it failed before the logs fill up etc22:51
xnoxdstokes: it is weird. I'll open a bug about it, but probably will not help much as it will take a while for such a fix to be created.22:51
dstokesright, thx for your help anyway. happy to at least confirm that it's not misconfiguration on my end22:51
xnoxdstokes: oh, i see. I wonder if you can just crank up upstart logging to get that.22:51
xnoxdstokes: so do you actually want to run any automated scripts/jobs upon failures? or are you simply trying to collect data?22:51
xnoxdstokes: after $ initctl log-level debug (the most debugging)22:52
dstokesthe ideal scenario is: setup main job to respawn a process when it fails, setup secondary task to curl when the process is respawned successfully (curl for notification)22:52
xnoxdstokes: I see - http://paste.ubuntu.com/7232773/22:53
dstokesi suppose i could watch the log, but that's a little more involved than i want to get ;)22:53
xnoxdstokes: i think we can meet your requirements!22:55
dstokesxnox++23:02
xnoxdstokes: one sec, testing.23:03
dstokesfor context: main job http://paste.ubuntu.com/7232809/23:05
dstokesand associated task: http://paste.ubuntu.com/7232813/23:06
dstokesalrdy have the respawn limit task working properly. notifies me when a process fails to respawn (after limit)23:06
xnoxdstokes: so my "main" job simply does "main() { pause()};"23:10
xnoxdstokes: that's the process, and then externally i send FPE (kill -8) to it.23:10
xnox$ cat /etc/init/test.conf 23:10
xnoxrespawn23:10
xnoxexec /tmp/a.out23:10
xnoxthat's main job.23:10
xnoxand here is my "monitor"23:10
xnox$ cat /etc/init/monitor.conf 23:10
xnoxstart on stopping test23:10
xnoxstop on stopped test23:10
xnoxscript23:10
xnoxsleep 2; echo "Main job respawned successfully"23:10
xnoxend script23:10
xnox..23:10
xnoxdstokes: so when job under test fails (stopping test) monitor kicks in. If the job is getting normally stopped or reached respawn limit, the monitor will be stopped.23:11
xnoxdstokes: however, if respawn succeeds and the job stays alive for 2 seconds then in /var/log/upstart/monitor.log i get notification that respawn was successful.23:12
xnoxdstokes: but the "sleep 2" needs to be adjusted. You could do better without a sleep23:12
dstokesclever..23:13
xnoxdstokes: e.g. "stop on stopped test or started test"23:13
xnoxdstokes: and then instead of script, you'd have a post-stop script -> which checks the stop event environment.23:13
xnoxdstokes: if the reason for getting stopped is "started test" it means a successful respawn happened.23:13
xnoxlet me code that.23:14
xnoxdstokes: nah, needs sleep / appropriate matching for respawn limit none-the-less.23:21
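The post-stop check described a few messages earlier could look roughly like the sketch below. It assumes Upstart exposes the names of the events that stopped the job in a space-separated UPSTART_STOP_EVENTS variable inside post-stop; treat that as an assumption to verify against your Upstart version. The function name `respawn_succeeded` is hypothetical, and xnox's last remark still applies: this alone does not handle the respawn limit case.

```shell
#!/bin/sh
# respawn_succeeded: return 0 if the monitor was stopped by a "started" event,
# i.e. the job under test came back up; return 1 otherwise.
# Assumption: UPSTART_STOP_EVENTS holds a space-separated list of event names.
respawn_succeeded() {
    case " ${UPSTART_STOP_EVENTS:-} " in
        *" started "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

In a monitor.conf with "stop on stopped test or started test", the post-stop script might then run `respawn_succeeded && echo "Main job respawned successfully"`.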
