[03:57] i have an upstart script which for some reason isn't sourcing env variables that i have defined in /home/root/.bashrc
[03:57] env variables like passwords and such for a nodejs app
=== JanC_ is now known as JanC
[18:24] hey all; is there a convenient way to keep instanced jobs persisted?
[18:24] e.g. suppose i have a job foo that has a line "instance $BAR"
[18:24] and i do "start foo BAR=a" and also "start foo BAR=b"
[18:25] 5 minutes later, my UPS fails for a bit and the system shuts down hard
[18:25] i'd like those specific instances to be restarted when the system comes back up
[18:32] dsimon: have a job "task\n script\n start foo BAR=a; start foo BAR=b\n end script" with appropriate start-on conditions.
[18:33] dsimon: that way on each boot those instances are started. (or otherwise arrange for events to happen to bring those jobs up)
[18:33] dsimon: there is no persistence across reboots, apart from what's encoded in the job config files.
[18:33] xnox, yeah, i just saw that same suggestion from a stack overflow answer, makes sense to me :-)
[18:33] dsimon: your other option however is for the instance job to write its own config file out on disk ;-)
[18:33] oh geez, that makes my brain hurt just thinking about it
[18:34] "when i start, create a thing that starts me"
[18:34] hm... actually, now that i say that, though
[18:34] maybe that's not a terrible idea
[18:35] dsimon: echo "@reboot root initctl --system start foo BAR=$BAR" > /etc/cron.d/custom-foo-$INST
[18:36] dsimon: but make sure you clean up that same file as well, once the instance gracefully stops and is no longer needed, e.g. in post-stop.
[18:36] hm, in cron?
[18:36] shouldn't i use upstart itself instead?
[18:36] dsimon: that's another option. It's best to write out an upstart job though.
[18:36] dsimon: or write a template and parse / write it out.
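(Editor's sketch, not part of the log.) The "write a template and write it out" idea could look roughly like the shell below. It is a minimal sketch under stated assumptions: the function names `persist_instance` and `forget_instance`, the `persist-JOB-VALUE.conf` file naming, the `JOB_DIR` variable, and the `start on runlevel [2345]` condition are all made up for illustration. The log only says to pick "appropriate start-on conditions", so adjust that line to your setup.

```shell
#!/bin/sh
# Sketch: persist an upstart job instance by writing one boot-time task job per instance.
# JOB_DIR defaults to a scratch directory so the sketch can be dry-run safely;
# on a real system it would presumably be /etc/init (assumption).
JOB_DIR="${JOB_DIR:-./init-sketch}"

# persist_instance JOB VALUE
# Writes a task job that runs "start JOB BAR=VALUE" when its start-on event fires.
# VALUE must be safe to use in a file name (no slashes or spaces).
persist_instance() {
    job="$1"
    value="$2"
    mkdir -p "$JOB_DIR"
    cat > "$JOB_DIR/persist-$job-$value.conf" <<EOF
# generated file: restarts instance "$value" of job "$job" at boot
# (the start-on condition below is a placeholder assumption)
start on runlevel [2345]
task
script
    # "|| true" keeps the task from failing if the instance is already running
    start $job BAR=$value || true
end script
EOF
}

# forget_instance JOB VALUE
# Removes the generated job once the instance is no longer wanted.
forget_instance() {
    rm -f "$JOB_DIR/persist-$1-$2.conf"
}
```

One way to wire this up, again as an assumption rather than a tested recipe, would be to call `persist_instance foo "$BAR"` from the instance job's pre-start and `forget_instance foo "$BAR"` from its post-stop, matching the cleanup advice in the log. A hard power loss never reaches post-stop, so the generated file survives and the instance comes back on the next boot.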
[18:37] yah
[18:37] well, maybe my best bet is to just have a manager
[18:37] dsimon: because that way you get the "start on" conditions right =)
[18:37] my situation is: i have daemons that implement web services which are available on particular ports during particular scheduled date ranges
[18:38] so maybe i should just implement a thing that checks the running services list vs. the list of services that should be running given the current date, and starts/stops instances as necessary
[18:39] and have cron run it once a minute
[18:42] xnox, does that seem like a good idea to you?
[18:50] dsimon: yes.
[18:51] dsimon: i'm a freak so I would have used nagios for this: encode the policy for when the service must be up, and give nagios agents start/stop commands to execute to bring them up/down.
[18:51] dsimon: cause it has recovery options, such that when the time is right nagios starts them.
[18:52] hm, that could work
[18:52] dsimon: there is a long term plan to implement temporal events, such that one could do "start on midnight" "stop on 1am"
[18:52] or somesuch, but there is no working implementation yet.
[18:53] ah, i see
[18:55] ok, cool; xnox, thanks for your help :-)
[18:57] no problem =)
[20:43] hm
[20:43] i just ran into a weird issue
[20:44] i had a misconfigured upstart config for nginx
[20:44] so it was starting stuff up badly
[20:44] i corrected it...
[20:44] but things still did not work :-(
[20:44] both "start" and "stop" would hang, regardless of the expect stanza setting
[20:45] ... oh, wait
[20:45] wait wait
[20:45] i bet it was the pid file
[20:47] well, possibly
[20:48] at any rate; after restarting the machine, everything seemed to be behaving correctly
[20:48] so that makes me happy and sad :-|
[20:48] happy because it works, but sad because i am not sure what happened, or how to prevent it from happening again
[20:49] the reason i mention pid files is because "status" would falsely report an old pid for nginx, even if it was stopped
[20:51] okay, now it's happening again; i was starting and stopping nginx just fine, but when i tried "restart", it hung
[20:52] and now, start and stop also hang, and start fails to actually start any processes
[20:52] what could be causing this?
[22:03] dsimon: please read the cookbook on "establishing the pid count" and the states it could force your job into.
[22:04] dsimon: at the moment, if the expect stanza is wrong, one can get into the "deadlock" you described above (job stuck, cannot reset state)
[22:04] dsimon: this is in the process of being fixed (merge proposal under review into trunk)
[22:05] well, you can "reset" it with some tricks ;)
[22:28] JanC: yes, yes you can. I've heard of them........ but in no way will i suggest that for anyone to do =)))))
[22:30] well, they are fine to solve temporary issues
[22:30] certainly beats rebooting between tries to get a job right :p
[22:32] true =)
[22:32] JanC: anyway it's getting fixed properly =)
[22:32] that would be nice
[22:33] xnox: any ETA?
[22:34] JanC: as I said there is a merge proposal under review for trunk.
[22:34] JanC: it should make it into Trusty (14.04 LTS)
[22:35] that's more or less what I meant :)
[22:35] a merge proposal is not merged yet, and trunk is not released yet
[22:36] (and I haven't followed the ML in ages)
[22:40] JanC: i've been misquoted on ETAs in the past =))))) i'm trying to be careful.
[22:41] well, the E in ETA is for "estimated", so I don't take them as a promise :)
[22:42] ack =)
[22:42] I still have nightmares from time to time, from the times when i was in "consulting"
[22:42] ugh
[22:43] i'm in a happy place now =)
[22:43] I get nightmares from consultants ;)
[22:43] but there are some exceptions, of course
[22:44] haha =)
[22:45] I've seen some very competent ones, and many, many who know less than an average helpdesk person with a high school diploma :-/
[22:50] (and when I say helpdesk, I mean a real helpdesk, not a bunch of people answering phones with the goal to end calls within less than 1 min 32 sec on average--and one sec less next month, to improve productivity)