[00:00] what do you mean
[00:00] we, the unix loving people of the world :)
[00:00] ok I'll email them with my final script (maybe) once I get something I like going
[00:00] let me review what's been mentioned here and get back to you SpamapS for a review, if that's ok
[00:06] that's cool SpamapS
[16:05] heya
[16:06] I have an upstart script that originally used respawn. I fixed it to not use respawn, but it always respawns
[16:06] even when stopping the service through upstart
[16:19] jamescarr: stopping a job would never lead to a respawn.
[16:19] jamescarr: respawn only happens if the program exits with a non-normal exit code/signal (by default, the only normal exit is 0)
[16:19] jamescarr: do you have stuff from syslog?
[16:20] SpamapS: well, originally I had respawn in my upstart script, then I realized the service it was starting starts as a daemon, causing it to respawn every second
[16:20] SpamapS: I fixed the upstart script by removing the respawn and changing the service config, but it keeps going
[16:20] no matter how many times I tell it to stop :(
[16:21] syslog is just filling up with "Nov 27 10:21:20 ultipro-api init: ultipro-api-service main process ended, respawning" every minute
[16:21] jamescarr: if it's a daemon, you just use expect fork, or expect daemon, and then it won't respawn like that
[16:22] I set that
[16:22] it keeps going
[16:22] perhaps it forks more than twice?
[16:22] jamescarr: and the program has no "--nodaemonize" mode?
[16:23] yeah
[16:24] should I just reboot!?
[16:24] jamescarr: in that case, you might be better off starting the program in pre-start and recording a pidfile, then stopping it in post-stop
[16:24] SpamapS: how can I just outright stop it for now?
[16:24] it won't stop coming back
[16:25] jamescarr: stop jobname
[16:25] jamescarr: that will not respawn..
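What the `expect` fix being discussed might look like as a complete job file. This is a minimal sketch, not the actual ultipro-api-service config: the filename, description, and exec path are all assumptions.

```
# /etc/init/myservice.conf -- hypothetical job for a program that daemonizes
description "example service that forks into the background"

start on runlevel [2345]
stop on runlevel [!2345]

# The program double-forks into a daemon, so tell Upstart to follow the
# final pid; use "expect fork" instead if it forks exactly once.
expect daemon

respawn

exec /usr/local/bin/myservice
```

If the program forks more than twice, `expect` cannot track it at all, which is when SpamapS's fallback applies: launch it from a `pre-start script` stanza, record a pidfile, and kill that pid in `post-stop script`, leaving the main `exec` out entirely.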
it changes the intended state to 'stop'
[16:26] which exits any respawn code
[16:26] SpamapS: I've done this tons of times
[16:26] sudo initctl stop jobname
[16:26] it just gives me...
[16:26] "ultipro-api-service start/running, process 734"
[16:29] *stop* says that?
[16:29] jamescarr: that's very unusual
[16:29] I should mention I accidentally copied and pasted the start script into the pre-stop script the first time
[16:29] I fixed it, but it keeps doing this
[16:30] jamescarr: it's possible that is causing issues...
[16:30] jamescarr: especially if that script calls 'start'
[16:30] yeah, I meant to replace it with stop but I screwed up
[16:31] I since fixed it, but the results seem to be from the first upstart script
[16:31] jodh: ^^ another use case for a "please forget about this job" command
[16:31] slangasek: ^^
[16:31] jamescarr: yeah, the new conf file isn't loaded until the 1st instance stops
[16:32] oh man
[16:32] any way to stop the first one?
[16:32] jamescarr: this might require a reboot unfortunately
[16:34] jodh, xnox: signal sender=:1.9 -> dest=(null destination) serial=3 path=/com/ubuntu/Upstart; interface=com.ubuntu.Upstart0_6; member=Restarted
[16:35] ended up being a tiny bit more tricky than I had expected, as the code needs to be added much later than jodh first thought, since we need to wait until upstart is done reconnecting to DBus
[16:36] as a result, the signal is emitted whenever upstart restarts, not only for successful stateful re-execs, but I think that makes sense, as the Instance Inits don't really care whether it was successful or not; they need to restart regardless
[16:38] now to figure out how to write tests for that stuff :)
[17:21] hi all!
I have a question about writing/debugging multiple dependencies in an upstart script
[17:21] I see that this has been discussed long ago: https://lists.ubuntu.com/archives/upstart-devel/2010-April/001252.html
[17:22] but at the moment, my job won't start or stop as expected by following the same pattern
[17:22] mattiaza: it's important to start with the right frame of mind. Upstart is not dependency based, it is event based.
[17:23] mattiaza: what start on and stop on do you have now?
[17:24] I understand (but only very vaguely). starts and stops of other services generate events, and then these trigger start/stop on conditions. but how are events stored or cached when combining them?
[17:24] they're not stored or cached
[17:25] the event blocks until all consumers of the event change state
[17:26] mattiaza: it's a very "now" thing, so if nothing needs to change, the event is gone.
[17:26] aha.
[17:26] at the moment, my conditions are:
[17:26] start on (started postgresql and started rabbitmq-server)
[17:26] stop on (stopping postgresql or stopping rabbitmq-server)
[17:27] what I'm trying to achieve is: my service needs postgresql and rabbitmq-server. therefore it should start automatically once both have been brought up, and it should stop when either of them goes away
[17:29] it would be nice if it also worked not only at system startup time, but also when I restart postgresql for example - but this is more of a nice-to-have
[17:31] mattiaza: ok, first, you're doing it wrong. :)
[17:31] mattiaza: those are network services
[17:31] mattiaza: so your app just has to deal with them not being up sometimes.
[17:32] mattiaza: if they were on another server, you couldn't control the boot order
[17:33] mattiaza: you're better off having your service retry and/or degrade until postgres and rabbit are available.
[17:33] mattiaza: and then just 'start on runlevel [2345]'
[17:33] and 'stop on runlevel [!2345]'
[17:33] I'll test it a bit...
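SpamapS's runlevel-only suggestion as a full job file. A sketch under assumptions: the job name, project path, and celery invocation are invented for illustration, and `[!2345]` is the glob negation syntax Upstart documents for runlevel matching.

```
# /etc/init/celery-worker.conf -- hypothetical job, decoupled from pg/rabbit
description "django-celery worker; retries its own connections"

# Start in the normal multi-user runlevels, stop when leaving them.
start on runlevel [2345]
stop on runlevel [!2345]

respawn

exec /usr/bin/python /srv/app/manage.py celeryd
```

The point of this design is that the worker never depends on local boot ordering: if rabbitmq or postgresql is on another host (or simply slow to start), the worker is expected to retry or degrade rather than have init sequence it.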
[17:33] mattiaza: do you understand why, though?
[17:34] mattiaza: upstart *can* express this. But it takes 2 jobs, one to keep track of the state of pgsql and rabbit, and another to run your service
[17:34] if I ever wanted to move postgres or rabbitmq to another server, then sure, it would be useful if it retried connections
[17:34] "my service" is the django-celery worker; I'm investigating whether it handles broken connections or not
[17:34] mattiaza: s/if I ever wanted to /when I am forced to /
[17:42] tested it with various configurations of the services running or not.
[17:44] celery is mostly reasonable: if rabbitmq is down, it retries connections. if postgresql is down, it runs tasks, but their results cannot be written and they are simply discarded forever.
[17:45] the uwsgi django app itself errors when postgresql is down (obviously), but hangs forever if rabbitmq is down and it can't queue up new tasks.
[17:46] so at the moment, by starting celery on runlevels, it would technically run - but it would silently eat all tasks if rabbitmq comes up before postgresql
[17:47] i'd say that at the moment, linking the start on and stop on conditions to postgresql only makes the most sense
[17:47] I don't expect to move the server to another machine in the next 6 months, and for now it appears to be the most reliable configuration
[17:51] mattiaza: that honestly sounds like you have a very broken queue runner
[17:51] mattiaza: you're saying, if pgsql goes down, all of your queued work is lost
[17:51] and that's ok?
[17:52] appears that way!
[17:52] awesome
[17:52] not sure if there are ways to make sure the queued jobs get persisted and retried if they fail
[17:53] mattiaza: you probably shouldn't be consuming the messages until the work is done.
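The "don't consume the message until the work is done" pattern SpamapS is describing, sketched with a toy in-memory queue instead of real AMQP. All names here (`AckQueue`, `run_one`) are illustrative, not part of any library:

```python
from collections import deque

class AckQueue:
    """Toy queue where a message only disappears after an explicit ack."""
    def __init__(self):
        self._pending = deque()   # not yet delivered
        self._unacked = {}        # delivered, awaiting ack
        self._next_tag = 0

    def publish(self, msg):
        self._pending.append(msg)

    def get(self):
        """Deliver a message; it stays tracked until ack() or requeue()."""
        msg = self._pending.popleft()
        self._next_tag += 1
        self._unacked[self._next_tag] = msg
        return self._next_tag, msg

    def ack(self, tag):
        # Only now is the message really gone from the broker.
        del self._unacked[tag]

    def requeue(self, tag):
        # Put an unacknowledged message back for redelivery.
        self._pending.append(self._unacked.pop(tag))

def run_one(queue, handler):
    """Ack only after the handler (e.g. the postgresql write) succeeds."""
    tag, msg = queue.get()
    try:
        handler(msg)
    except Exception:
        queue.requeue(tag)   # db down: the task goes back instead of vanishing
        return False
    queue.ack(tag)
    return True
```

With this shape, a postgresql outage makes tasks pile up in the queue rather than being consumed and silently discarded, which is exactly the failure mattiaza observed.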
[17:53] (and if they are persisted in pgsql, then it's still down :)
[17:53] another option is to put them back in the queue yourself
[17:54] try: write to db; except unrecoverable db error: resubmit
[17:55] I would expect this to be possible, as many other people are also using rabbitmq+celery as the standard django background task queue
[17:55] have to see if there are configuration options I missed
[17:56] It's totally possible.
[17:56] Just.. not actually helpful. :)
[18:08] mattiaza: here's another thought. You could just have your thing start on started postgresql and stop on stopping postgresql
[18:09] yep, that seems to be the most reliable method for now, and would not require complex celery hacking. reading documentation on how to add event emitting to the old postgresql init scripts now
[18:10] my old start on condition would never have worked anyway, as apparently neither postgresql nor rabbitmq-server emits any events :)
[18:16] you can make them emit events if you want...
[18:23] the old-style postgresql startup scripts look daunting.. might give it a try, but don't want to mess them up too much
[18:23] I have another idea though
[18:24] http://manpages.ubuntu.com/manpages/precise/en/man8/upstart-socket-bridge.8.html seems to be a way to see events on sockets - is there a way to listen for an event like "socket on port X started listening"?
[18:24] then I could have an event when postgresql really is ready to accept connections
[18:49] mattiaza: socket activation is not what you want in this case
[19:10] cool, got it working :)
[19:10] added some initctl emit lines to the postgresql scripts
[19:10] now the celery service is started and stopped together with it
[19:11] not the ideal solution, but it should reduce the potential for queued tasks being consumed but failing
[19:14] mattiaza: seems like something the celery people would have dealt with.. this is a pretty common pattern (read a job, do some work, record the result).
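What those "initctl emit lines" plus the matching job conditions might look like. A sketch only: the event names `pg-up`/`pg-down` are invented here, and exactly where the hooks belong in `/etc/init.d/postgresql` varies by release.

```
# In the sysvinit script, right after postgresql has started successfully:
initctl emit --no-wait pg-up || true

# ...and just before it is shut down:
initctl emit --no-wait pg-down || true

# Then in the celery job's /etc/init/*.conf:
start on pg-up
stop on pg-down
```

`--no-wait` returns immediately instead of blocking until every consumer of the event has changed state, and the `|| true` keeps the init script from failing on systems where initctl isn't available.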
Usually in AMQP there is a way to make messages only disappear on "ACK"
[19:14] yep, I'll investigate that tomorrow
[19:16] I'm still overwhelmed by the complexity of rabbitmq + celery workers + task result storage in the database (the recommended production setup), and how all these interact together
[19:21] mattiaza: given that you have 1 server.. it does seem.. overkill
[19:22] yes - it also only needs to scale to an amazing 5 requests per hour :)
[19:31] mattiaza: *l o l*
[19:31] totally worth this much time
[19:37] heading home now, thanks for your help!
[20:06] has anyone run upstart 1.5 with 11.10?
[20:17] nm, that didn't work too well
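For the record, the "only disappear on ACK" behaviour SpamapS mentions maps to a pair of worker settings in Celery of this era. A sketch of a Django settings fragment, under the assumption of an old-style (pre-4.0, uppercase) Celery configuration; whether it fully fixes the lost-task problem still depends on the task doing its database write before returning:

```python
# Hypothetical Django settings fragment for a 2012-era celery deployment.
# acks_late makes the worker acknowledge the message only after the task
# function returns, so a crash mid-task leaves it in rabbitmq for redelivery.
CELERY_ACKS_LATE = True

# Without this, each worker prefetches a batch of unacknowledged messages,
# all of which would be redelivered at once if that worker dies.
CELERYD_PREFETCH_MULTIPLIER = 1
```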