[00:00] specifically I have two jobs, one is instanced on $INTERFACE, the other is a task which starts on (started network-interface and local-filesystems) and just starts an instance of the first job
[00:00] is this a reasonable approach?
[00:02] my guess is the second one may need to be instanced as well, so that it can maintain state for each network-interface in the case that local-filesystems has not yet fired
[00:15] simplified it to a single job, seems to work if the interface comes up before local-filesystems, presumably because then the state exists for that interface, but if the interface comes up after, I guess the instance is created and it never sees the local-filesystems event
[00:30] shamilton: hey
[00:30] shamilton: the local-filesystems event is only emitted once, so all subsequent tasks (which trigger on network-interface) are "stuck" waiting for local-filesystems to be emitted again, which will never happen.
[00:31] shamilton: you can have "start on network-interface" and it will be started on each interface (one at a time, or multiple if you also use instance in the task)
[00:32] shamilton: but, on debian/ubuntu systems it's best to use if-up & if-down scripts anyway, as they are more closely related to network configuration / deconfiguration
[00:32] okay, but my daemon actually does require local-filesystems -- should I just poll for it in my pre-start script?
[00:32] shamilton: if it is a daemon, it's not a task.
[00:32] shamilton: why not make it "start on runlevel [2345]" ?
[00:33] yeah, previously I had a separate task which was spawning the daemon due to some other mapping logic I require, but I've since removed it to try to get something simple working
[00:33] I need an instance per network interface
[00:34] so forget about everything above; the current situation is I have a single job which starts an instance on (started network-interface and local-filesystems)
[00:35] but it doesn't start if the network interface comes up after local-filesystems
[00:35] shamilton: you will only ever have one of those. Since "local-filesystems" is a one-time event (which fires only once), it waits for e.g. one network interface to fire, after which the local-filesystems event is retired.
[00:36] shamilton: and during the lifecycle of a single boot, local-filesystems is never repeated.
[00:36] my understanding is that each job remembers that local-filesystems has already fired, so I'm thinking if I could "touch" the jobs for the interfaces which I know will come up in the future, I could get it working, though it's a bit hacky
[00:37] shamilton: no, nothing remembers anything. That would be state preservation; there is no state.
[00:37] shamilton: write a simple job which does this:
[00:37] start on A and B
[00:38] stop on C
[00:38] $ initctl emit A
[00:38] $ initctl emit B
[00:38] $ status test-job
[00:38] start/running
[00:38] $ initctl emit C
[00:38] $ status test-job
[00:38] stop/waiting
[00:38] $ initctl emit B
[00:38] $ status test-job
[00:38] stop/waiting
[00:39] ..... waiting for event A to be emitted again.
[00:39] does it not remember that A has fired, so that it can start for real when you emit B?
[00:39] so you can have "instance INTERFACE \ task \ start on network-interface"
[00:39] shamilton: no, it does not remember, because upstart is event driven, not state driven.
[00:40] shamilton: it's widely documented and explained multiple times in the cookbook.
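A minimal sketch of the test job from the 00:37 demonstration, assuming a hypothetical /etc/init/test-job.conf; the payload command is an assumption, only the start/stop conditions come from the conversation:

    # /etc/init/test-job.conf -- illustrative sketch only
    description "shows that satisfied conditions are consumed, not remembered"
    start on A and B
    stop on C
    # any long-running command will do; it only exists so that "status test-job"
    # can report start/running after both A and B have been emitted
    exec sleep 3600

With this in place, "initctl emit A" followed by "initctl emit B" starts the job, "initctl emit C" stops it, and a further "initctl emit B" on its own leaves it in stop/waiting until A is emitted again.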
[00:40] http://upstart.ubuntu.com/wiki/JobEventExpressions "When an event occurs, each node in the tree is checked, and the value changed if the event matches. Each parent node is then updated to the new value."
[00:40] my interpretation of this is that it maintains state
[00:40] shamilton: please see the cookbook. http://upstart.ubuntu.com/cookbook/#restarting-jobs-with-complex-conditions
[00:40] shamilton: there are other sections about it as well.
[00:41] shamilton: and just read through http://upstart.ubuntu.com/cookbook/#restarting-jobs-with-complex-conditions it has examples of all the common cases, including mutually exclusive tasks and those that need a 1 to 1 mapping
[00:42] so in your example above, if you remove "initctl emit A", then "initctl emit B" would no longer start the job, correct?
[00:42] I don't see how it could be stateless if so
[00:43] it's implemented as follows
[00:43] an event fires, iterate across the known waiting/blocked jobs and check the events each keys on.
[00:43] if any match, mark it as satisfied, free the event.
[00:43] if the last condition is satisfied, start the job.
[00:43] so for example:
[00:44] $ initctl emit A
[00:44] $ initctl emit B
[00:44] $ initctl emit A
[00:44] $ stop test-job
[00:44] $ initctl emit B
[00:44] will _not_ start the job. (as event A arrived whilst the job was running, thus no conditions are tracked)
[00:45] okay, got it, that's roughly how I understood it
[00:45] it's all to avoid deadlocks and circular dependencies, and to keep the core a very simple one-time event propagation.
[00:46] there is no multi-pass.
[00:47] shamilton: but yeah, in your case you should "instance $INTERFACE \n start on network-interface" and in the pre-start block until all the other things are satisfied (e.g. mountpoints mounted et al.)
[00:48] suppose hypothetically I could touch all the instances I expect in advance at "startup", would it not then remember that local-filesystems has fired, and behave desirably?
[00:48] ignoring the restart thing, which isn't a problem
[00:50] I guess at that point there isn't much reason to use instances
[00:56] shamilton: hm, no, and typically one uses instances only when you do not know ahead of time how many instances you will have.
[00:57] shamilton: e.g. for hot-plug type of things. If you do know all the interfaces you expect to have, then I'd store them in a config file somewhere and then have one arbitrator job which is "start on local-filesystems" and an instance job without any start on conditions at all.
[00:58] then the arbitrator would do: script \n for i in $DESIRED_INSTANCES; do start --no-wait worker INTERFACE=$i; done \n end script
[00:58] shamilton: what is the actual thing you are running? and why does it have to be started per network interface?
[00:59] shamilton: can it not just start one daemon that e.g. connects to network-manager over dbus to monitor networking events and act appropriately on them?
[00:59] it's basically tcpdump, which I want running as early as possible, but I do need a separate instance per interface
[00:59] shamilton: right, true. why do you need local-filesystems then?
[01:00] shamilton: with a recent enough upstart, stdout & stderr are collected and stored into /var/log/upstart/job-instance.log
[01:00] shamilton: and that is done irrespective of filesystems (e.g. it's cached and flushed to disk once the disk becomes writable)
[01:01] shamilton: check $ man 5 init to see if you have "console log" (the default) available to you.
[01:01] shamilton: .... and networks can be established in the initramfs /before/ upstart is run.
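A minimal sketch of the arbitrator/worker layout described at 00:57-00:58, assuming hypothetical file names and a hypothetical /etc/default/capture-interfaces config file; the structure (arbitrator starts on local-filesystems, workers carry no start on condition at all) is the one from the conversation:

    # /etc/init/arbitrator.conf -- assumed file name
    start on local-filesystems
    task
    script
        # DESIRED_INSTANCES is assumed to be defined in a site-specific config file
        . /etc/default/capture-interfaces
        for i in $DESIRED_INSTANCES; do
            start --no-wait worker INTERFACE=$i
        done
    end script

    # /etc/init/worker.conf -- assumed file name; deliberately no "start on",
    # each instance is only ever started explicitly by the arbitrator
    instance $INTERFACE
    # the command line is an assumption ("it's basically tcpdump")
    exec tcpdump -i $INTERFACE

This layout only fits the case where the interfaces are known in advance; hot-plugged interfaces would fall back to the event-driven variant sketched at the end.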
[01:02] root filesystem is RO, the RW disk doesn't come up until a bit later, and the packet inspector needs RW access to the filesystem
[01:03] also this is just my piece of the puzzle, unfortunately I'm not at liberty to make sweeping systemic changes
[01:03] some kind of process controller would be ideal, but in the interest of getting things working quickly, I'm trying to do it with upstart
[01:03] and just polling for the filesystem in a pre-start should do the trick, though it feels suboptimal
[01:03] however, I can leave a comment saying it was vetted by an upstart dev
[01:04] shamilton: what upstart version are you using?
[01:04] # /sbin/init --version
[01:04] init (upstart 1.5)
[01:04] unfortunately
[01:04] might be able to drop in a newer one if it wouldn't break anything
[01:04] shamilton: and I'm not sure why the packet inspector needs RW; if you can redirect output to plain stdout, upstart will collect it for you from early boot, even whilst the filesystem is RO.
[01:05] shamilton: and I'm not sure what you mean by "vetted by an upstart dev". I didn't veto anything =)
[01:06] the inspection engine has a local database, so unfortunately it's not as simple as stdio buffering
[01:06] again, not my call
[01:07] okay, "discussed with someone who sounded like they knew what they were doing"
[01:07] shamilton: well, you can get away without a pre-start
[01:08] oh?
[01:09] shamilton: so have your thing start on network-interface - aka the worker. And have another job called "check-conditions" which is also an instance, with "start on starting worker", which blocks until the filesystems are writable et al. Not really that different, but you'd be able to optimise those checks to short-circuit them after they succeeded the first time.
[01:09] with "start on starting foo" you can pre-empt and block foo from starting until this other job completes.
[01:10] shamilton: again, this is also described in the cookbook. Do please read the cookbook's common examples, to see if there is inspiration for organising your jobs.
[01:10] http://upstart.ubuntu.com/cookbook/
[01:11] 11 Cookbook and Best Practises
[01:17] so I have a task which is "start on starting tcpdump", and it blocks until the filesystem is available, correct? not quite sure what you mean by short-circuiting the tests
[01:22] shamilton: yeah. Short circuit - e.g. cache the result: at the end, touch /var/run/life-is-good, and then the filesystem check can start with: if [ ! -e /var/run/life-is-good ]; then block until filesystem is mounted; fi exit 0
[01:23] k, that's what I thought you meant, but the filesystem test is also a -e, so in this case I don't win much
[01:23] I think the pre-start script is the simplest approach
[01:24] guessing there's no chance of future statefulness
[01:25] shamilton: we did plan to introduce state tracking. E.g. a "while mounted MOUNTPOINT='/' and running webserverd" stanza, and then monitor and start/stop when the state is correct/incorrect.
[01:25] shamilton: but those were very vague plans. Currently in the pipeline are cgroups support & non-blocking process spawning (speed up)
[01:26] shamilton: nobody is working to design nor implement statefulness at the moment.
[01:29] thanks for your help. I'm not completely satisfied with the end result, as I was really hoping it could be completely event driven (sans polling), but at least I can put it to bed
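For reference, a minimal sketch of the pre-start polling approach settled on above, assuming a hypothetical /etc/init/worker.conf, that the network-interface event exports INTERFACE= as in the conversation, and an assumed path for the writable disk:

    # /etc/init/worker.conf -- illustrative sketch, one instance per interface
    instance $INTERFACE
    # assumes the network-interface event carries INTERFACE=<name>
    start on network-interface

    pre-start script
        # block until the writable disk is mounted; the path tested here is an
        # assumption, substitute whatever marks the RW filesystem on the target
        while [ ! -w /var/lib/inspector ]; do
            sleep 1
        done
    end script

    # the command line is an assumption; the real payload is the packet inspector
    exec tcpdump -i $INTERFACE -w /var/lib/inspector/$INTERFACE.pcap

The check-conditions variant from 01:09 would move the same polling loop into a separate instanced job with "start on starting worker", so the worker itself would need no pre-start at all.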