[03:57] <mikedevita> i have an upstart script which for some reason isn't sourcing env variables that i have defined in /home/root/.bashrc
[03:57] <mikedevita> env variables like passwords and such for a nodejs app
[18:24] <dsimon> hey all; is there a convenient way to keep instanced jobs persisted?
[18:24] <dsimon> e.g. suppose i have a job foo that has a line "instance $BAR"
[18:24] <dsimon> and i do "start foo BAR=a" and also "start foo BAR=b"
[18:25] <dsimon> 5 minutes later, my UPS fails for a bit and the system shuts down hard
[18:25] <dsimon> i'd like those specific instances to be restarted when the system comes back up
[18:32] <xnox> dsimon: have a job "task\n script\n start foo BAR=a; start foo BAR=b\n end script" with appropriate start-on conditions.
[18:33] <xnox> dsimon: that way on each boot those instances are started. (or otherwise arrange for events to happen to bring those jobs up)
[18:33] <xnox> dsimon: there is no persistence across reboots, apart from what's encoded in the job config files.
[18:33] <dsimon> xnox, yeah, i just saw that same suggestion from a stack overflow answer, makes sense to me :-)
[18:33] <xnox> dsimon: your other option however is for the instance job to write its own config file out on disk ;-)
[18:33] <dsimon> oh geez, that makes my brain hurt just thinking about it
[18:34] <dsimon> "when i start, create a thing that starts me"
[18:34] <dsimon> hm... actually, now that i say that, though
[18:34] <dsimon> maybe that's not a terrible idea
[18:35] <xnox> dsimon: "echo '@reboot root initctl --system start foo BAR=$BAR' > /etc/cron.d/custom-foo-$INST"
[18:36] <xnox> dsimon: but make sure you clean up that same file as well, once the instance gracefully stops and is no longer needed, e.g. in post-stop.
[18:36] <dsimon> hm, in cron?
[18:36] <dsimon> shouldn't i use upstart itself instead?
[18:36] <xnox> dsimon: that's another option. It's best to write out an upstart job though.
[18:36] <xnox> dsimon: or write a template and parse / write it out.
[18:37] <dsimon> yah
[18:37] <dsimon> well, maybe my best bet is to just have a manager
[18:37] <xnox> dsimon: because that way you get the "start on" conditions right =)
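The "write a template and write it out" idea could look roughly like the sketch below. It only renders the text of a per-instance job that re-starts one instance of foo at boot. The file name pattern `foo-persist-<value>.conf`, the `runlevel [2345]` start condition, and the helper names `render_job` and `job_path` are assumptions made for illustration, not something stated in the discussion.

```python
import re
from string import Template

# Hypothetical template: a task job that starts one instance of "foo" at boot.
# The start-on condition is an assumption; adjust it to whatever foo needs.
JOB_TEMPLATE = Template(
    'description "persisted instance of foo (BAR=$bar)"\n'
    "start on runlevel [2345]\n"
    "task\n"
    "exec start foo BAR=$bar\n"
)

# Only allow values that are safe to embed in a file name and a command line.
SAFE_VALUE = re.compile(r"[A-Za-z0-9_.-]+")


def render_job(bar: str) -> str:
    """Return the text of a job file that starts foo with BAR=<bar>."""
    if not SAFE_VALUE.fullmatch(bar):
        raise ValueError(f"unsafe instance value: {bar!r}")
    return JOB_TEMPLATE.substitute(bar=bar)


def job_path(bar: str, conf_dir: str = "/etc/init") -> str:
    """Return the (assumed) path the rendered job would be written to."""
    if not SAFE_VALUE.fullmatch(bar):
        raise ValueError(f"unsafe instance value: {bar!r}")
    return f"{conf_dir}/foo-persist-{bar}.conf"
```

The instance job would write `render_job(bar)` to `job_path(bar)` when it starts and remove that file in post-stop, mirroring the cleanup advice given for the cron.d variant.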
[18:37] <dsimon> my situation is: i have daemons that implement web services which are available on particular ports during particular scheduled date ranges
[18:38] <dsimon> so maybe i should just implement a thing that checks the running services list vs. the list of services that should be running given the current date, and starts/stops instances as necessary
[18:39] <dsimon> and have cron run it once a minute
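A minimal sketch of the manager described here, assuming each service is identified by the value of BAR and has an inclusive date range. The names `Schedule`, `plan`, and `commands` are hypothetical. Collecting the set of currently running instances (for example from `initctl list`) is left out.

```python
from datetime import date
from typing import Iterable, List, NamedTuple, Set, Tuple


class Schedule(NamedTuple):
    """One service instance and the inclusive date range it should run in."""
    name: str
    first_day: date
    last_day: date


def plan(running: Iterable[str], schedules: Iterable[Schedule],
         today: date) -> Tuple[List[str], List[str]]:
    """Compare running instances with the schedule; return (to_start, to_stop)."""
    wanted: Set[str] = {
        s.name for s in schedules if s.first_day <= today <= s.last_day
    }
    current = set(running)
    return sorted(wanted - current), sorted(current - wanted)


def commands(to_start: List[str], to_stop: List[str]) -> List[List[str]]:
    """Build argv lists for the start/stop commands, assuming a job named foo."""
    cmds = [["start", "foo", f"BAR={name}"] for name in to_start]
    cmds += [["stop", "foo", f"BAR={name}"] for name in to_stop]
    return cmds
```

A cron entry running once a minute would call `plan` with today's date and pass each argv list from `commands` to something like `subprocess.run`. Because the plan is computed from the difference between the two sets, running it repeatedly is harmless when nothing needs to change.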
[18:42] <dsimon> xnox, does that seem like a good idea to you?
[18:50] <xnox> dsimon: yes.
[18:51] <xnox> dsimon: i'm a freak so I would have used nagios for this: encode the policy for when the service must be up, and let nagios agents execute start/stop commands to bring them up/down.
[18:51] <xnox> dsimon: cause it has recovery options, such that when the time is right nagios starts them.
[18:52] <dsimon> hm, that could work
[18:52] <xnox> dsimon: there is a long term plan to implement temporal events, such that one could do "start on midnight" "stop on 1am"
[18:52] <xnox> or somesuch, but there is no working implementation yet.
[18:53] <dsimon> ah, i see
[18:55] <dsimon> ok, cool; xnox, thanks for your help :-)
[18:57] <xnox> no problem =)
[20:43] <dsimon> hm
[20:43] <dsimon> i just ran into a weird issue
[20:44] <dsimon> i had a misconfigured upstart config for nginx
[20:44] <dsimon> so it was starting stuff up badly
[20:44] <dsimon> i corrected it...
[20:44] <dsimon> but things still did not work :-(
[20:44] <dsimon> both "start" and "stop" would hang, regardless of the expect stanza setting
[20:45] <dsimon> ... oh, wait
[20:45] <dsimon> wait wait
[20:45] <dsimon> i bet it was the pid file
[20:47] <dsimon> well, possibly
[20:48] <dsimon> at any rate; after restarting the machine, everything seemed to be behaving correctly
[20:48] <dsimon> so that makes me happy and sad :-|
[20:48] <dsimon> happy because it works, but sad because i am not sure what happened, or how to prevent it from happening again
[20:49] <dsimon> the reason i mention pid files is because "status" would falsely report an old pid for nginx, even if it was stopped
[20:51] <dsimon> okay, now it's happening again; i was starting and stopping nginx just fine, but when i tried "restart", it hung
[20:52] <dsimon> and now, start and stop also hang, and start fails to actually start any processes
[20:52] <dsimon> what could be causing this?
[22:03] <xnox> dsimon: please read the cookbook on "establishing the pid count" and the states it could force your job into.
[22:04] <xnox> dsimon: at the moment, if the expect stanza is wrong, one can get into a "deadlock" like the one you described above (job stuck, cannot reset state)
[22:04] <xnox> dsimon: this is in the process of being fixed (merge proposal under review into trunk)
[22:05] <JanC> well, you can "reset" it with some tricks  ;)
[22:28] <xnox> JanC: yes, yes you can. I've heard of them........ but in no way will i suggest that anyone do that =)))))
[22:30] <JanC> well, they are fine to solve temporary issues
[22:30] <JanC> certainly beats rebooting between tries to get a job right  :p
[22:32] <xnox> true =)
[22:32] <xnox> JanC: anyway it's getting fixed properly =)
[22:32] <JanC> that would be nice
[22:33] <JanC> xnox: any ETA?
[22:34] <xnox> JanC: as I said there is a merge proposal under review for trunk.
[22:34] <xnox> JanC: it should make it into Trusty (14.04 LTS)
[22:35] <JanC> that's more or less what I meant  :)
[22:35] <JanC> a merge proposal is not merged yet, and trunk is not released yet
[22:36] <JanC> (and I haven't followed the ML in ages)
[22:40] <xnox> JanC: i've been misquoted on ETAs in the past =))))) i'm trying to be careful.
[22:41] <JanC> well, the E in ETA is for "estimated", so I don't take them as a promise  :)
[22:42] <xnox> ack =)
[22:42] <xnox> I still have nightmares from time to time, from the times when i was in "consulting"
[22:42] <JanC> ugh
[22:43] <xnox> i'm in a happy place now =)
[22:43] <JanC> I get nightmares from consultants  ;)
[22:43] <JanC> but there are some exceptions, of course
[22:44] <xnox> haha =)
[22:45] <JanC> I've seen some very competent ones, and many, many who know less than an average helpdesk person with a high school diploma  :-/
[22:50] <JanC> (and when I say helpdesk, I mean a real helpdesk, not a bunch of people answering phones with the goal to end calls within less than 1 min 32 sec on average--and one sec less next month, to improve productivity)