#upstart 2006-11-28
<Keybuk> morning Marco
* AlexExtreme yawns
<AlexExtreme> Keybuk: what did you use to create the graph on http://wiki.ubuntu.com/ReplacementInitscripts ?
<Keybuk> dot
<Keybuk> from the graphviz package
<Keybuk> http://people.ubuntu.com/~scott/feisty.dot
<AlexExtreme> cool, thanks
<Keybuk> (note, that's a text file, not an OpenOffice document <g>)
<AlexExtreme> yeah :)
<AlexExtreme> bbl
* Keybuk tackles the module-init-tools merge
<Keybuk> Md: I guess you saw kay's lkml post about merging some of modprobe into udev?
<Md> Keybuk: not yet
<Keybuk> ah
<Keybuk> it's in the thread BenC started about making it possible to move some of the device/driver binding decisions to userspace
<Keybuk> Md: no major changes of consequence in our m-i-t patch; just our usual two extra patches (silence modprobe & combine modprobe.conf/modprobe.d automatically)
<Keybuk> I suspect we'll be dropping the silence one now that we want verbose log messages by default
<Md> can you tell me the thread title?
<_ion> A nice graph.
<Keybuk> [RFC] Pushing device/driver binding decisions to userspace
<Keybuk> Kay's comment in that
<Keybuk> *yawn*
<AlexExtreme> *yawn* too
<Keybuk> I've almost finally finished merges
<Keybuk> just sysvinit to go
<Keybuk> that'll be fun
<AlexExtreme> i'm trying to find something to code, which is usually a signal that i'm bored ;)
<AlexExtreme> merges of what?
<Keybuk> Debian changes into Ubuntu
<Keybuk> it's the blackhole of developer time in the first few weeks of our release cycle
<AlexExtreme> ah
<AlexExtreme> ok
<AlexExtreme> i'm going to get some food
<AlexExtreme> bbl
<Keybuk> wasabi: hey
<Keybuk> so I thought about things a bit
<Keybuk> I'm convinced that the eight-step state machine is correct
<Keybuk> but think you're right that the events aren't
<Keybuk> especially as the state machine is more complex, you could end up being started by an event but needing to list two or three stop events, depending on the path through the state machine
<Keybuk> so...
<Keybuk> keeping the eight states (but renamed a bit, to make more sense)
<Keybuk> but only having four events
<Keybuk> start (or starting?) when the goal is set to start, started once the entire process of starting is complete
<Keybuk> stop (or stopping?) when the goal is set to stop, stopped once the entire process of stopping is complete
<Keybuk> that should fulfil all the use cases I had
<_ion> Where is the spec about this 8-step state machine?
<Keybuk> _ion: http://upstart.ubuntu.com/wiki/JobStates
<Keybuk> so, the interesting thing about the start and stop events ...
<_ion> "starting" and "stopping" make sense
<_ion> Thanks.
<Keybuk> the "start" and "stop" commands can then just generate those events
<Keybuk> and only once those events are handled, begin the process of starting the job
<Keybuk> so if tomcat had "stop on stop apache"
<Keybuk> then tomcat would be fully stopped before apache began stopping
<Keybuk> anyhoo, bed
#upstart 2006-11-29
<AlexExtreme> strange
<AlexExtreme> is it normal that my karma in launchpad goes up without me doing anything?
<Kamping_Kaiser> yeh, LP is whacko
<AlexExtreme> heh
* AlexExtreme disappears again
<Keybuk> heh
<_ion> It would be easy to make a script that creates a graphviz graph of the current system, like the one in the wiki, btw. It would just need a purely informational stanza about events a job may emit, e.g. fhs_check could contain 'emits fhs-filesystem-mounted fhs-filesystem-unmounted'.
<Keybuk> yeah
<_ion> I'll probably write that script later.
<Kamping_Kaiser> didn't notice you here Keybuk
<Keybuk> ... I have always been here ...
<Kamping_Kaiser> ... oh ...
<_ion> after the complex-event-config spec is final.
<Keybuk> Kamping_Kaiser: actually, that's a lie, I can't back that up
<Keybuk> _ion: http://upstart.ubuntu.com/wiki/Emits
<_ion> keybuk: Nice, thanks.
<_ion> keybuk: Hm, will all that information (the "start/stop on" and the "emits" events) about current jobs be exposed via libupstart? I could just write the bindings for libupstart and use them.
<Keybuk> I think so
<Keybuk> not sure which works best
<Keybuk> have a tool ask upstart for job relationships
<Keybuk> or parse the configuration files itself
<_ion> Yeah, that's what i'm pondering. :-)
<Keybuk> I think the former, that way what you see is always consistent with what upstart knows
<Keybuk> and allows for future registry of jobs from other sources (e.g. autostart files)
<_ion> Yeah.
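
The graph script _ion describes could look roughly like the sketch below. It assumes the job relationships have already been obtained (from upstart or from the configuration files) as a plain dictionary; that dictionary layout, the function name `jobs_to_dot`, and the `mount-home` job are hypothetical and are not part of upstart or libupstart. Only the `emits` example for fhs_check comes from the discussion above.

```python
def jobs_to_dot(jobs):
    """Render job/event relationships as Graphviz dot text.

    `jobs` maps a job name to a dict with optional "start_on", "stop_on"
    and "emits" lists of event names. This layout is an assumption made
    for the sketch, not Upstart's real configuration format.
    """
    lines = ["digraph jobs {"]
    for name in sorted(jobs):
        job = jobs[name]
        lines.append(f'    "{name}" [shape=box];')
        # Events that start the job point at it in green.
        for event in job.get("start_on", []):
            lines.append(f'    "{event}" -> "{name}" [color=green];')
        # Events that stop the job point at it in red.
        for event in job.get("stop_on", []):
            lines.append(f'    "{event}" -> "{name}" [color=red];')
        # Events the job may emit are drawn as dashed outgoing edges.
        for event in job.get("emits", []):
            lines.append(f'    "{name}" -> "{event}" [style=dashed];')
    lines.append("}")
    return "\n".join(lines)


example = {
    "fhs_check": {
        "emits": ["fhs-filesystem-mounted", "fhs-filesystem-unmounted"],
    },
    # Hypothetical consumer of the event, for illustration only.
    "mount-home": {"start_on": ["fhs-filesystem-mounted"]},
}
dot_text = jobs_to_dot(example)
print(dot_text)
```

The printed text can be saved to a file and rendered with dot from the graphviz package.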
<wasabi> morning folks
<Keybuk> heyhey
<wasabi> where did we leave off yesterday?
<wasabi> Oh yes! Yeah. You have a real job. It just seems like it's too good to be real. ;)
<Keybuk> it has its downsides, like any other
<Keybuk> I'm just going through the specs tidying them up and marking those we're pretty much agreed on as approved
<Keybuk> well, that was a conversation killer :p
<wasabi> Heh.
<Keybuk> so, I was thinking on last night's conversation
<Keybuk> I'm convinced that the eight-stage state machine is correct, though the names could do with a little tweaking
<Keybuk> however I think I agree that there should not be eight corresponding events
<Keybuk> the main reason being that the paths through the state machine aren't circular
<Keybuk> e.g. you could have a start event, but a failure along the way, which could cause you to go straight to stopped
<Keybuk> so anything with "from start FOO to stop FOO" wouldn't actually get stopped
<Keybuk> and I think that inconsistency is bad
<Keybuk> going through all the examples, I couldn't find anything that really needed that level of complexity
<Keybuk> so I propose keeping the state machine *but* reducing the events
<Keybuk> in fact, there should be just two pairs
<Keybuk> "start"/"starting" = goal changed to start, process begun
<Keybuk> "stop"/"stopping" = goal changed to stop, process begun
<Keybuk> "started" = process of starting finished
<Keybuk> "stopped" = process of stopping finished
<Keybuk> these events would be emitted at the most appropriate places in the state machine
<wasabi> I agree wholeheartedly.
<Keybuk> and would always go
<Keybuk> starting -> started
<Keybuk> or
<Keybuk> starting -> stopping -> stopped
<Keybuk> or stopping -> stopped
<Keybuk> or stopping -> starting
<Keybuk> etc.
<Keybuk> so they make logical sense
<wasabi> Yes.
<Keybuk> if you get a starting event, you ALWAYS get at least stopping
<wasabi> Makes sense.
<Keybuk> and if you get a stopping event, you ALWAYS get at least starting following
<wasabi> So, if pre-stop fails, it'll emit starting again?
<Keybuk> right
<Keybuk> goal changes back
<wasabi> But not actually traverse through pre-start and post-start.
<Keybuk> yup
<wasabi> Super.
<Keybuk> this also gives us an interesting idea
<Keybuk> the started/stopped events are just "now at rest" events
<wasabi> Yup.
<Keybuk> but the starting/stopping events are "goal being changed"
<Keybuk> we could delay any action on those two events until they've been hancled
<Keybuk> uh, handled
<Keybuk> so:
<Keybuk> start ssh => sets goal to start => emits "starting ssh" => waits for event to finish => changes state to pre-start
<Keybuk> likewise for stopping
<Keybuk> stop ssh => sets goal to stop => emits "stopping ssh" => waits for event to finish => changes state to pre-stop
<Keybuk> so for the more usual example:
<Keybuk> tomcat could be started on starting apache; which means that apache won't be started until apache is actually running
<Keybuk> uh tomcat is actually running
<wasabi> Yes. That makes sense for apache/tomcat.
<Keybuk> network-manager could be stopped on stopping dbus; which means that dbus won't be stopped until network-manager has been stopped
<wasabi> And that also makes perfect sense.
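
A minimal sketch of the blocking behaviour described above, assuming a toy model in which a job is just a name, a goal and a state, and in which pre-start, the fork and post-start are collapsed into a single step. The `Job` class and `start` function are hypothetical and do not reflect upstart's internals; the point is only that handlers of the "starting" event complete before the emitting job leaves its waiting state.

```python
class Job:
    def __init__(self, name, start_on=None):
        self.name = name
        self.start_on = start_on  # e.g. ("starting", "apache"), or None
        self.goal = "stop"
        self.state = "waiting"


def start(job, jobs, log):
    """Set the goal, emit "starting", and only then leave the waiting state."""
    if job.goal == "start":
        return  # goal already set, so the request has no further effect
    job.goal = "start"
    log.append(f"starting {job.name}")
    # Handling the event: every job listening for it is started in full
    # before the emitting job is allowed to move on.
    for other in jobs:
        if other.start_on == ("starting", job.name):
            start(other, jobs, log)
    # Simplification: pre-start, fork and post-start happen here in one go.
    job.state = "running"
    log.append(f"started {job.name}")


apache = Job("apache")
tomcat = Job("tomcat", start_on=("starting", "apache"))
log = []
start(apache, [apache, tomcat], log)
print(log)
```

In this model tomcat reports started before apache does, which is the ordering the tomcat/apache example asks for.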
<Keybuk> are there any examples of something you would run "on starting" or "on stopping" which _shouldn't_ delay the named job?
<wasabi> And in both cases the definition of the dependency is described from the POV of the depending service.
<wasabi> Say that again?
<wasabi> Ahh I see.
<wasabi> I'm unsure.
<wasabi> I can't see it.
<wasabi> I think the most likely case is desiring a service to run on started.
<wasabi> Which leaves the emitting job at rest, done, so blocking doesn't matter.
<wasabi> And I can't imagine any circumstances where that isn't good enough.
<_ion> I agree with having only starting/stopping/started/stopped emitted.
<_ion> And with "on starting" and "on stopping" causing the emitting job to wait.
<wasabi> I think the only desire to have something start but not block "starting" is simply because it's reactionary, but independent. It doesn't actually care about the state of the first job.
<wasabi> Thus making it more async.
<wasabi> And I would probably say that that could be handled better. Don't make your job block.
<wasabi> Let it start quick.
<_ion> Can you think of a use case for having "on starting" without blocking?
<Keybuk> how would you define that, though?
<wasabi> Define it? By getting through pre-start FAST.
<Keybuk> *confused*
<wasabi> Don't just spin in pre-start.
<wasabi> Well, the blocking is only waiting for the dependent job to be started
<Keybuk> the only use cases I can think of for "on starting" do suggest blocking
<wasabi> Which is waiting from child stopped->started
<Keybuk> yeah
<wasabi> I can't imagine that being slow enough to be unsatisfactory justify 
<wasabi> s/justify//
<wasabi> I guess the only potential bottleneck I can think of is that if somebody wants to have their job run before apache runs, but their job is waiting on some other external resource before it starts.
<wasabi> And they represent that waiting by spinning in pre-start =)
<Keybuk> I'd argue that's a bug anyway :)
<wasabi> And that seems pretty broken anyways.
<wasabi> Yeah
<wasabi> running to work now
<wasabi> Are we susceptible to any sort of feedback loop with this?
<wasabi> Or deadlocking
<Keybuk> you can be deadlocked by two jobs
<Keybuk> foo:  start on starting bar
<_ion> A "start on starting B" and B "start on starting A"?
<Keybuk> bar:  start on starting foo
<Keybuk> actually, no you wouldn't
<_ion> That could be easily detected, couldn't it?
<Keybuk> because the "starting bar" event wouldn't affect foo, because its goal is already set
<Keybuk> bar would start immediately
<Keybuk> and thus foo would too
<_ion> Ah, nice.
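
The reasoning about foo and bar can be checked with a small sketch. It assumes, as Keybuk says, that an event asking a job to start is ignored when that job's goal is already start; the dictionary layout and the `start` function are hypothetical.

```python
def start(name, start_on, goals, log):
    """Start `name`; jobs listening for its "starting" event go first."""
    if goals.get(name) == "start":
        return  # goal already set: the event does not affect this job
    goals[name] = "start"
    log.append(f"starting {name}")
    for listener, wanted in start_on.items():
        if wanted == name:
            start(listener, start_on, goals, log)
    log.append(f"started {name}")


# foo: start on starting bar
# bar: start on starting foo
start_on = {"foo": "bar", "bar": "foo"}
goals, log = {}, []
start("foo", start_on, goals, log)
print(log)
```

The recursion ends as soon as "starting bar" reaches foo, because foo's goal is already start, so bar finishes first and foo follows.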
<Keybuk> hmm
<Keybuk> what to do about respawn?
<Keybuk> would you want an event for that?
<Keybuk> should it be when it notices the job has died, or when it has successfully restarted it?
<wasabi_> back
<_ion> died foo 42   exit value
<_ion> stopped foo
<_ion> starting foo
<_ion> started foo
<_ion> ?
<wasabi_> Nice.
<wasabi_> I don't think an event is required for respawn specifically as much as "died"
<wasabi_> And I *love* the idea of including the exit code in the event args
<wasabi_> That's slick.
<Keybuk> failed?
<Keybuk> we have that already
<wasabi_> Yeah.
<Keybuk> my only concern there is that it'd cause any service using starting or stopping to get restarted
<wasabi_> "jobname failed 1"
<Keybuk> but maybe that's right too
<wasabi_> Hmm.
<Keybuk> actually, it'd only do it for anything on stopping
<Keybuk> so if dbus died, all the dbus services would get restarted
<wasabi_> That is most certainly appropriate.
<wasabi_> At least in that case.
<Keybuk> I suspect if something really needs to be stopped on stopping, and block the parent, that does imply that the connection is critical
<wasabi_> I think failed is definitely a bug, as in, something which should never happen. But when it does, it's best to be conservative.
<Keybuk> so does imply that respawning one means to respawn the other
<wasabi_> When it fails, run the post-stop scripts, which would clean up on service stop.
<wasabi_> Then cycle back through to start.
<wasabi_> Also, the post-stop script, and in fact the pre-start script will have access to the event that caused it. Including args.
<wasabi_> [ "$1" == "failed" ]  
<Keybuk> yeah
<Keybuk> ok, so another topic
<wasabi_> There are cases where we have known buggy software, that people want to run, but we know is going to crash.
<wasabi_> (nscd)
<Keybuk> (from the same spec)
<_ion> Hmm, if foo: "start on stopping bar" and bar fails, foo should be stopped. Should there be some magic, or should there be a dummy "stopping bar" event right after the "failed bar" event?
<wasabi_> I think I lean towards having a dummy event.
<_ion> Me too.
<wasabi_> You know, we might not even have a failed event.
<_ion> Whoops, i meant foo: "stop on stopping bar"
* Keybuk thinks about "start on failed self" again
<wasabi_> Are we going to use job names as events or start/stop ?
<wasabi_> We should decide that once and for all. ;)
<Keybuk> wasabi: right, so let's discuss that now
<Keybuk> there's a good reason (imo) to make events have a _different_ namespace to jobs
<Keybuk> the example event is the block-device-added event
<Keybuk> we can write things like /etc/event.d/mount-filesystem with "on block-device-added"
<Keybuk> but someone is going to try and write /etc/event.d/block-device-added
<Keybuk> this has several effects
<wasabi_> Hmm.
<Keybuk> 1) it doesn't get triggered, when they probably expect it to -- no matter, they can add "on block-device-added" in it
<wasabi_> Weird events being mixed.
<Keybuk> 2) so the block-device-added hda event from udev now starts the block-device-added job
<wasabi_> You'd get block-device-added events with data that was not in fact a block device name
<Keybuk> 3) which emits the block-device-added starting event
<Keybuk> and poor mount-filesystem would get both
<wasabi_> You win. =)
<Keybuk> this is a real world problem
<Keybuk> someone (me) shipped /etc/event.d/control-alt-delete in Ubuntu <g>
<wasabi_> Heh.
<Keybuk> which is why the event got renamed
<Keybuk> so I think in general we want to keep the namespace separate
<Keybuk> and I think it's never useful to say "on JOBNAME" and get started whenever that job changes state
<wasabi_> The only reason I was thinking the other way was for our reusable complex event stuff.
<Keybuk> but it's potentially useful to have "on STATE" (on failed) and get started whenever any job reaches that state
<Keybuk> I don't think there's much semantic difference between
<wasabi_> /etc/event.d/some-reusable-state-machine
<Keybuk>     from started apache until stopping apache
<Keybuk> and
<Keybuk>     from apache started until apache stopping
<Keybuk> which is the only real difference
<wasabi_> Yeah. Agreed.
* _ion totally agrees with Keybuk
<Keybuk> the reusable complex event stuff defines states which have the various events
<wasabi_> I'm on board now. ;)
<Keybuk> not a single up/down event
<Keybuk> the cute thing about that is you can define pre-start and post-stop scripts for your reusable state machines <g>
<wasabi_> Yeah.
<Keybuk> of course, this all means that /etc/event.d isn't the right name anymore
<Keybuk> it's a directory that defines jobs, services, tasks and states
<wasabi_> But not events. ;)
<Keybuk> I agree that /etc/event.d/FOO implies that FOO is an event, and the content of the file is what happens on it
<wasabi_> Explicitly not events, yet we call it event.d hehe
<Keybuk> so I think we should rename that directory
<_ion> TBH, i was never very fond of the name "event.d".
<wasabi_> Well, we need a name which better describes { jobs, services, tasks and states }
<Keybuk> http://upstart.ubuntu.com/wiki/JobEvents
<Keybuk> ^ suitable for approval?
<wasabi_> reading
<_ion> /etc/upstart would be quite obvious, but perhaps that would be better preserved for possible configuration of upstart itself, such as the profiles spec.
<wasabi_> I'm curious how internally you will deal with respawning.
<Keybuk> wasabi_: I'm edging closer and closer to dropping it entirely from the internal state
<wasabi_> You have the concept of a goal, which is essentially an enum set on a job, and the job works towards that state.
<wasabi_> Ahh.
<Keybuk> I think that respawn just indicates that the job doesn't change goal when it dies
<wasabi_> But in the case of restart, your goal is started, but you have to instruct the machine to cycle to that.
<Keybuk> we can still kick it into stopping, and let it come back as if it had been manually restarted
<wasabi_> True.
<Keybuk> so it'll go stopping -> starting -> started
<Keybuk> actually, it'll go failed -> stopping -> starting -> started
<Keybuk> I sometimes wonder whether someone with far more forethought than I wrote it :p
<wasabi_> You'd kick it into post-stop, basically.
<Keybuk> right
<_ion> "Events emitted as part of a job state change are currently named after the job, with a suffix indicating the new state."
<wasabi_> And it would just cycle back around naturally.
<_ion> Isn't that the other way around?
<Keybuk> _ion: apache/started 
<_ion> Sorry, i missed the word "currently" and misinterpreted the sentence.
<Keybuk> I usually try in the summary to first say how things are, then say what the spec proposes
<_ion> "it'll go failed -> stopping -> starting -> started"  Shouldn't "stopped" be there after "stopping"?
<Keybuk> err
<Keybuk> yes
<_ion> Have i understood this correctly:
<_ion>  When admin stops a job: "stopping foo", pre-stop script, kill process, post-stop script, "stopped foo"
<_ion>  When a process exits by itself: "failed foo exitval" IF it failed, "stopping foo", pre-stop script, post-stop script, "stopped foo"
<Keybuk> yes
<wasabi_> It might be worth thinking about removing "failed", and replacing it with a third argument to "stopping".
<wasabi_> I might be totally offbase there.
<_ion> Then "foo" would need to know what exit value "bar" has when it fails or doesn't fail.
<wasabi_> If it's set to respawn, any exit code is a failure.
<wasabi_> Even 0.
<wasabi_> At least, that's how I understand it.
<wasabi_> Maybe I'm wrong there too.
<Keybuk> correct
<Keybuk> or any value in normalexit
<wasabi_> Consider this. A piece of software, such as a java application server (which I can attest operates this way), which has a built in shutdown/startup interface.
<wasabi_> So, it's possible to shut down this server using its built-in web interface. This does not instruct upstart to in fact shut it down.
<wasabi_> It could, but any such patches to do so would have to be written.
<wasabi_> Instead, it terminates normally with exit 0.
<wasabi_> Even though you want it to respawn on crash.
<wasabi_> So, basically, you would desire upstart to "respawn on !0"
<Keybuk> right
<Keybuk> respawn
<Keybuk> normal exit 0
<Keybuk> (this works today)
<wasabi_> Ahh. Cool.
<wasabi_> So, that would result in upstart running post-stop, and emitting stopping and stopped.
<Keybuk> yes
<wasabi_> Nice.
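
The respawn rule discussed above can be written as a small decision function. This is only a sketch of the behaviour as described in the conversation (respawn on any exit code except those listed as normal); the function name and return values are hypothetical.

```python
def action_on_exit(exit_code, respawn=False, normal_exit=()):
    """Decide what happens when the main process exits by itself.

    Returns "respawn" when the job is marked respawn and the exit code
    is not listed as a normal exit; otherwise returns "stop".
    """
    if respawn and exit_code not in normal_exit:
        return "respawn"
    return "stop"


# A server shut down through its own web interface exits with 0.
print(action_on_exit(0, respawn=True, normal_exit=(0,)))
# The same server crashing with a non-zero code is brought back.
print(action_on_exit(1, respawn=True, normal_exit=(0,)))
```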
<Keybuk> wasabi: what if a post-start script fails?
<Keybuk> remind me
<Keybuk> I think your description says it kills the process
<wasabi_> I think if post-start fails, it would proceed to "stopped" along normal paths.
<wasabi_> Hmm. Guess not.
<wasabi_> Guess it would skip to stop (bypassing pre-stop)
<wasabi_> kill the process, then post-stop.
<Keybuk> what about if the pre-stop fails?
<Keybuk> which is kinda interesting
<Keybuk> assuming we issue the stopping event first, and hold on that, then things will need to be restarted
<Keybuk> so you'd have to go stopping -> starting -> started again
<Keybuk> without actually doing anything :p
<Keybuk> alternatively one could wait to issue the stopping event until after pre-stop, but that negates the pre-stop being used to instruct the server to shut down safely
<wasabi_> pre-stop wouldn't be used to instruct it to shut down.
<wasabi_> It would be used to determine if it should actually shutdown.
<wasabi_> Maybe that's what we're missing?
<wasabi_> So, it isn't in fact going to stop until pre-stop says it can.
<Keybuk> the trouble with that is that you've already issued the stopping event, haven't you?
<Keybuk> or do you wait until pre-stop says you can?
<Keybuk> I can see both uses for the script
<wasabi_> Hmmm. Also makes me wonder if we in fact need a "start"/"stop" script separately from exec.
<wasabi_> pre-stop i think is an idea we had simply to prevent stopping.
<wasabi_> I'm not even sure that is even necessary.
<wasabi_> I'd say, if there is some actual shell script that is run that initiates the stop, it might be separate from all of this, and a companion with "exec"
<wasabi_> exec foo
<wasabi_> stop script
<wasabi_>    do something to kill foo
<wasabi_> end script
<wasabi_> pre-start;start;post-start,  pre-stop;stop;post-stop,  where start and stop can run at the same time. Each can be expressed by "name {exec|script}"
<wasabi_> So, exec, logically, becomes "start exec".
<wasabi_> Just like we would have "post-start exec", or various other uses.
<Keybuk>  * all works fine: starting, started, stopping, stopped
<Keybuk>  * pre-start fails: starting, failed, stopping, stopped
<Keybuk>  * post-start fails: starting, failed, stopping, stopped
<Keybuk>  * binary fails: starting, failed, stopping, stopped
<Keybuk>  * binary respawns: starting, started, failed, stopping, starting, started
<wasabi_> start/stop are runnable at the same time. while start is still going, we are at rest.
<Keybuk> those are the event sequences I've come up with so far
<Keybuk> probably:
<Keybuk>  * binary terminates badly: starting, started, failed, stopping, stopped
<Keybuk> as well
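
One way to sanity-check the sequences listed above is to write down which event may follow which and test each sequence against that table. The successor table below is inferred from the listed sequences and from the earlier "starting -> stopping -> stopped" path; it is an assumption for this sketch, not something stated in a spec, and it only covers sequences that begin with a job at rest.

```python
# Which events may follow which (None = job at rest, nothing emitted yet).
# Inferred from the sequences in the discussion; an assumption.
NEXT = {
    None: {"starting"},
    "starting": {"started", "failed", "stopping"},
    "started": {"failed", "stopping"},
    "failed": {"stopping"},
    "stopping": {"stopped", "starting"},
    "stopped": {"starting"},
}


def is_valid(sequence):
    """Return True if every event is an allowed successor of the previous one."""
    previous = None
    for event in sequence:
        if event not in NEXT[previous]:
            return False
        previous = event
    return True


SEQUENCES = {
    "all works fine": ["starting", "started", "stopping", "stopped"],
    "pre-start fails": ["starting", "failed", "stopping", "stopped"],
    "post-start fails": ["starting", "failed", "stopping", "stopped"],
    "binary fails": ["starting", "failed", "stopping", "stopped"],
    "binary respawns": ["starting", "started", "failed",
                        "stopping", "starting", "started"],
    "binary terminates badly": ["starting", "started", "failed",
                                "stopping", "stopped"],
}

for name, sequence in SEQUENCES.items():
    print(name, is_valid(sequence))
```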
<wasabi_> I think there are a number of different things represented there which need to be thought about separately.
<wasabi_> 1) emitted events
<wasabi_> 2) "status" of job
<Keybuk> those are simply emitted events
<wasabi_> 3) internal status of job used to maintain state machine
<wasabi_> Okay
<Keybuk> trying to work out what works best
<wasabi_> 2), the queryable statuses I see are "starting", "started", "stopping", "stopped". initctl can get the status of a job and return only those 4.
<Keybuk> do those event sequences make sense?
<wasabi_> 4) executable steps
<wasabi_> Yes I believe so.
<Keybuk> the respawn one is the odd one, as it never reaches "stopped"
<Keybuk> it's stopping, then it's starting again
<wasabi_> Yeah.
<wasabi_> Might emit stopped anyways.
<Keybuk> I think that makes sense, if a job is "stop on stopped foo", and foo respawns, it doesn't get restarted; where a job that's "stop on stopping foo" does
<wasabi_> Since another job may be watching for it.
<Keybuk> the problem is working out where to emit events
<wasabi_> You could emit stopped, but there would be no point at which you could query and return "stopped"
<Keybuk> emitting stopped there is damned hard
<wasabi_> Just let the mainloop run twice.
<Keybuk> no such thing
<wasabi_> I mean, you don't let it do that, you iterate and set it to stopped, and run the normal stopped code, which emits the event, and then it runs again immediately after and progresses.
<Keybuk> once it hit stopped, nothing would get it back out again
<wasabi_> Sure, it's at stopped, but the goal isn't stopped.
<Keybuk> yes
<wasabi_> So the next loop would send it to starting again
<Keybuk> but what would pick that up?
<Keybuk> you'd need something in the main loop checking every single job
<Keybuk> which is damned inefficient
<wasabi_> Ahh.
<wasabi_> Well, I suspect you will end up with an idle bit someplace.
<Keybuk> right now, it flows because each change causes the next
<Keybuk> I'm trying to drop the idle bit :)
<wasabi_> if goal != state idle = false;
<wasabi_> hmm. i see.
<wasabi_> idle = (goal == state) actually.
<wasabi_> Then always "process" jobs where !idle, after work, reset idle again.
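
A rough sketch of the idle-bit approach wasabi_ suggests. It assumes the eight state names used elsewhere in this discussion, treats them as a simple ring, and maps each goal to the state in which the job is at rest; all of that, along with the function names, is a simplification for illustration and not upstart's real state machine.

```python
ORDER = ["waiting", "pre-start", "starting", "post-start",
         "running", "pre-stop", "stopping", "post-stop"]
# The state in which a job is at rest for each goal (an assumption).
REST = {"start": "running", "stop": "waiting"}


def is_idle(job):
    """idle = (goal == state), with the goal mapped to its rest state."""
    return job["state"] == REST[job["goal"]]


def run_once(jobs):
    """One main-loop pass: step every job that is not idle."""
    stepped = []
    for job in jobs:  # note: this scans every job on every pass
        if not is_idle(job):
            index = ORDER.index(job["state"])
            job["state"] = ORDER[(index + 1) % len(ORDER)]
            stepped.append(job["name"])
    return stepped


# A respawning job: it has reached post-stop but its goal is still start.
job = {"name": "foo", "goal": "start", "state": "post-stop"}
trace = [job["state"]]
while run_once([job]):
    trace.append(job["state"])
print(trace)
```

The job passes through waiting and carries on because its goal is still start. The loop over every job on each pass is the cost Keybuk objects to.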
<wasabi_> 4) executable steps. these are basically PIDs that are being tracked.
<wasabi_> 4) pre-start, start, post-start, pre-stop, stop, post-stop
<wasabi_> They can be expressed as "name [exec|script]  ..."
<wasabi_> start exec foo, start script\n foo\nend script
<wasabi_> pre-start exec, pre-start script. Same stuff.
<Keybuk> http://people.ubuntu.com/~scott/states.png
<Keybuk> (reload)
<Keybuk> (reload again)
<Keybuk> follow the green lines while the goal is start
<Keybuk> follow the red lines while the goal is stop
<wasabi_> These are events?
<Keybuk> states
<wasabi_> Didn't think pre-start and such would be states.
<Keybuk> why not?
<Keybuk> they represent discrete points in the lifecycle
<wasabi_> okay i see
<Keybuk> starting is where we fork the process
<Keybuk> stopping is where we kill it
<Keybuk> we emit the "started" event when we hit "running"
<Keybuk> and the "stopped" event when we hit "waiting"
<Keybuk> where to emit the starting and stopping events is the tricky one <g>
<wasabi_> I might say we emit the stopped event when we finish post-stop.
<Keybuk> right now I have them being emitted when the goal changes, so completely outside that diagram
<Keybuk> doesn't work if you don't have a post-stop
<wasabi_> and we emit the started event when we finish post-start, to keep it simple.
<wasabi_> The machine would always progress through post-stop, but if there was no post-stop PID to run/track, it would just be done.
<wasabi_> And on completion, it emits event.
<Keybuk> it'd enter post-stop, but then you'd need some kind of "ok, what state was I just in?" thing
<wasabi_> Not really, always, at the end of post-stop, emit stopped event.
<wasabi_> Regardless of what happened previously.
<Keybuk> no such thing as "end of"
<wasabi_> You'd want to emit stopped on a respawn.
<Keybuk> we define an end of a state by changing to a different one
<Keybuk> see, I'm not sure we do
<wasabi_> Sure there is. "End of" is just the last step in the small machine known as "post-stop".
<Keybuk> why does respawn need to say the job is stopped?
<Keybuk> it never really is
<wasabi_> Functionally it isn't available, or has gone up and down.
<wasabi_> And thus you would have to reconnect.
<wasabi_> nm/dbus.
<Keybuk> right, but I think anything that connects will be hanging on the started and stopping events
<Keybuk> NOT the starting or stopped events
<Keybuk> because those implicitly don't allow connection
<wasabi_> nm could do from (dbus started until dbus stopping) or (dbus started until dbus failed)
<wasabi_> but that would be a bit obnoxious.
<Keybuk> you'd always get stopping
<wasabi_> Hmm. the blue line skips over it
<Keybuk> you're confusing a state with an event
<Keybuk> that's probably my fault :p
<Keybuk> those stopping and starting are just the interim states, nothing to do with the events
<Keybuk> I haven't renamed them yet
<wasabi_> that diagram should have events hanging off of each step.
<wasabi_> heh
<Keybuk> that doesn't work if we don't have an event for each step :p
<wasabi_> Okay, so stopped would not be emitted... so in that case:
<Keybuk> the point is that we're having events to indicate which way round we're going, *not* where we are
<wasabi_> from (dbus starting until dbus stopped) or (dbus starting until dbus failed)
<wasabi_> Since stopped isn't emitted when failed is.
<wasabi_> But the condition is functionally the same.
<Keybuk> ah
<Keybuk> from starting dbus until stopped dbus
<wasabi_> ie no service running, can't connect to it.
<Keybuk> ^ I don't need to connect to dbus, just be around when it is  (and it should wait for me to be around)
<wasabi_> dbus is a bad example there.
<wasabi_> It was just in my buffer and I didn't want to retype.
<Keybuk> better example
<Keybuk> from starting apache until stopped apache
<wasabi_> Yeah.
<Keybuk> apache needs me to be started first, and needs to be stopped before I am
<Keybuk> the counter-example is
<Keybuk> from started dbus until stopping dbus
<Keybuk> I need dbus to be around, and I don't want dbus to be stopped until I am
<wasabi_> {...dbus...{...nm...}...}
<Keybuk> our first example wouldn't get restarted if apache respawned (no need, apache depends on it, not the other way around)
<Keybuk> our second example would get restarted whenever dbus respawned
<wasabi_> {...tomcat...{...apache...}...}
<wasabi_> Yeah I think you're right. If tomcat can run without apache, then it doesn't matter if it knows that apache just died.
<wasabi_> Because apache will just come back.
<wasabi_> But the situation described is still right. tomcat will always be running while apache is.
<Keybuk> from started foo until stopped foo -- I suspect this will be rare; it implies that foo must start first, but you don't really need foo, as it's ok for it to stop
<Keybuk> from starting foo until stopping foo -- odd, foo has to wait for you to start, and wait for you to stop -- some kind of strange hold; this works in the respawn example though
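
The four "from X until Y" combinations can be compared by replaying the respawn event sequence from earlier in the discussion (starting, started, failed, stopping, starting, started) and counting how often a dependent job would be started. The function below is a hypothetical sketch that models such a job as a simple on/off switch.

```python
# Event sequence for a job that respawns once, as listed earlier.
RESPAWN = ["starting", "started", "failed", "stopping", "starting", "started"]


def count_starts(start_event, stop_event, events):
    """Count how many times a "from start_event until stop_event" job starts."""
    running = False
    starts = 0
    for event in events:
        if not running and event == start_event:
            running = True
            starts += 1
        elif running and event == stop_event:
            running = False
    return starts


for start_event in ("starting", "started"):
    for stop_event in ("stopping", "stopped"):
        n = count_starts(start_event, stop_event, RESPAWN)
        print(f"from {start_event} foo until {stop_event} foo: started {n}x")
```

A count of 2 means the dependent job is restarted when foo respawns. In this model that happens only for the "until stopping" variants, since stopped is never emitted during a respawn.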
<wasabi_> {foo  {bar   }foo   ?bar
<wasabi_> Here's an interesting thing.
<wasabi_> dbus will be considered started, and nm will launch.
<wasabi_> But dbus might not have actually opened its socket and been completely prepared.
<Keybuk> that's what the post-start script is for, and why the started event isn't emitted until that finishes
<wasabi_> And hence post-start can hold nm from starting until dbus is sure it's up.
<wasabi_> And that is just totally kick ass.
<wasabi_> Easy race elimination.
<Keybuk> pre-stop is the tricky one
<Keybuk> do we emit stopping before pre-stop, and wait for other jobs to finish first?
<wasabi_> Yeah, interesting.
<Keybuk> or do we run pre-stop first, and then only emit stopping after - waiting for other jobs to finish before killing the process?
<wasabi_> I think that's probably best.
<Keybuk> which means the stopping event is tricky
<wasabi_> 1) execute and wait for pre-stop 2) if going to stop emit and wait for stopping, else reset back to started 3) execute and wait for stop (procedure to terminate process) 4) execute and wait for post-stop
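
The four steps wasabi_ lists can be sketched as a single function. The callables passed in stand for the pre-stop script, the event emission, the stop procedure and the post-stop script; their names and the returned strings are hypothetical, and the sketch follows only this one proposal (pre-stop runs before the stopping event).

```python
def stop_job(pre_stop, emit_and_wait, stop, post_stop):
    """Run the four proposed steps in order; return the resulting rest state."""
    # 1) execute and wait for pre-stop; it decides whether to stop at all
    if not pre_stop():
        # 2) not going to stop: reset back to started
        return "started"
    # 2) going to stop: emit "stopping" and wait for its handlers
    emit_and_wait("stopping")
    # 3) execute and wait for stop (the procedure that terminates the process)
    stop()
    # 4) execute and wait for post-stop
    post_stop()
    return "stopped"


log = []
result = stop_job(
    pre_stop=lambda: log.append("pre-stop") or True,
    emit_and_wait=lambda event: log.append(f"emit {event}"),
    stop=lambda: log.append("stop"),
    post_stop=lambda: log.append("post-stop"),
)
print(result, log)
```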
<Keybuk> http://people.ubuntu.com/~scott/states.png
<Keybuk> reload again
<Keybuk> follow the green lines when the goal is start
<Keybuk> follow the red lines when the goal is stop
<Keybuk> failure of a step sets the goal to stop
<Keybuk> note:
<Keybuk> running can terminate normally, and change the goal to stop (the red line out marked "normal")
<Keybuk> this goes to pre-stop
<Keybuk> running can terminate badly, and change the goal to stop (the red line out marked "failed")
<Keybuk> err
<Keybuk> sorry
<Keybuk> the one marked failed is supposed to be marked "terminated"
<Keybuk> the normal one is a stop request or event
<Keybuk> the idea there is that you skip pre-stop and "kill process" if the process died
<Keybuk> there's a red and green line in that direction, because respawn doesn't change the goal
<Keybuk> pre-stop can change the goal back to start, so there's a green line back to running again
<Keybuk> does that look right?
<wasabi_> still parsing
<wasabi_> between writing an application deployment lifecycle policy and procedures document for the audit guy. blah.
<wasabi_> Is there a way in dotty to make a big grouping of pieces of this?
<Keybuk> I think so, yes
<wasabi_> Like, off to the right side, a big line which says what parts are in a group. stopped, starting, started, stopping.
<Keybuk> what would be in those groups?
<wasabi_> stopped would be "waiting"
<Keybuk> ?
<wasabi_> starting would be [emit starting, pre-start, exec process, post-start] 
<wasabi_> started would be [emit-started, running, pre-stop] 
<wasabi_> stopping would be the rest. get it?
<Keybuk> I'll see what I can make it do :)
<wasabi_> What I mean, is representing the queryable state of the job at every given point.
<Keybuk> reload
<Keybuk> though dot mangled that a bit :p
<Keybuk> right bed, will mull on that a bit
<Keybuk> nite
#upstart 2006-11-30
<Chipp1> hey, I'm trying to write my own script to run a program on startup, but getting a little confused, can anyone help me out?
<_ion> I guess he became unconfused in five minutes.
<Keybuk> morning
<_ion> Hi
<Keybuk> how goes it?
<_ion> Well, i'm alive. :-) How are you?
<Keybuk> not so bad
<Keybuk> day off today, though got the xmas shopping done pretty quickly this morning
<thom> xmas shopping? it's not even december yet!
<Keybuk> thom: father coming up this evening for the weekend
<thom> ahr
<Keybuk> so had to get him and his partner something
<Keybuk> got him the DVD of the Live version of Jeff Wayne's War of the Worlds
<Keybuk> (which I took him to see for his Birthday)
<thom> there's a live version? sounds awesome
<Keybuk> yeah
<Keybuk> they did it for about a week in April or so
<Keybuk> tour of most of the arenas in the country
<Keybuk> live band and orchestra on stage, conducted by Jeff Wayne; huge screen behind with CGI and actors playing out the parts; guest singers in front doing the vocal bits; and a big floating head with a CG Richard Burton projected on it talking and doing the narration
<Keybuk> they're doing another tour next year, apparently
<thom> awesome
<_ion> Sounds cool. :-)
<abultman> Good morning; anybody awake here yet?
<Keybuk> morning
<abultman> Hey there.  Wanna answer a question or two for me?
<Keybuk> sure
<abultman> s/Wanna/Can you/
<abultman> Ok, good.  I have a daemon I'm trying to run with upstart. It doesn't make a pidfile, and it forks.   I need upstart to run this program and restart it when it dies. 
<abultman> Bit difficult when it forks.  So, I edited the source to make it not fork, so it stays in the foreground.
<Keybuk> *nods*
<Keybuk> a common problem
<abultman> Nonforking works, but it doesn't restart my program when it dies. I'm not sure if it's because I don't have 'respawn programname', or because my start/stop scripts in the conf file are dead...
<Keybuk> it's because you don't have "respawn" in the configuration file
<abultman> When it runs the daemon-mode one, it fires up 12 copies and then kills them all ;)
<abultman> Does the respawn line require the program name and all it's command line switches?
<Keybuk> no
<abultman> I  thought that went in the start script
<abultman> Ok.
<Keybuk> ahh
<Keybuk> I see your confusion
<Keybuk> the "start script" isn't supposed to start the daemon
<Keybuk> it's just supposed to prepare for it
<Keybuk> likewise, the "stop script" isn't supposed to kill it
<Keybuk> (we're renaming them to make this less confusing)
<abultman> Ah, ok.  
<Keybuk> let's say your daemon is /usr/sbin/foo
<Keybuk> you could get away with just
<Keybuk> ----
<Keybuk> respawn
<Keybuk> exec /usr/sbin/foo
<Keybuk> ----
<Keybuk> and upstart would run that command, if it exits, respawn it, and when you do "stop foo", it would kill it
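The respawn behaviour described above can be pictured with a small Python sketch. This is only an illustration, not Upstart's implementation: `supervise` and its `max_respawns` limit are hypothetical names added so the example terminates, and the command stands in for a daemon such as /usr/sbin/foo that stays in the foreground.

```python
import subprocess
import sys


def supervise(argv, max_respawns=3):
    """Run argv in the foreground and start it again each time it exits.

    Returns the list of exit codes seen. max_respawns is a hypothetical
    safety limit so that this sketch terminates.
    """
    exit_codes = []
    for _ in range(max_respawns + 1):
        # The child does not fork away, so the supervisor always knows its pid.
        proc = subprocess.Popen(argv)
        # Block until the child exits, then loop round to respawn it.
        exit_codes.append(proc.wait())
    return exit_codes
```

Calling `supervise([sys.executable, "-c", "raise SystemExit(1)"], max_respawns=2)` runs the stand-in command three times in total. The point of the sketch is that respawning only works when the supervisor can wait on the process it started, which is why the daemon has to stay in the foreground.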
<abultman> Oh, so 'exec' is valid.  Is there a way I can specify a run-as user?  (I don't have envuidgid or anything like that around)
<Keybuk> the start/stop scripts are only needed if you need to make directories or clean up
<abultman> So if I have a stop script, I can have the stop script remove any leftover sock files?
<Keybuk> right
<abultman> That's what I figured, good.
<abultman> I'll let you know how it goes.  
<Keybuk> on the forking daemon case, at the moment they must remain in the foreground
<abultman> BTW, this is much nicer than daemontools and stupid solaris SMF
<Keybuk> there's a planned feature to allow it to find the forked process
<Keybuk> but that's not finished yet
<abultman> Solaris supports daemonized processes, but that's the only benefit 
<Keybuk> run-as-user => no; use su or similar
<Keybuk> user jobs are planned, but would run in a full pam session, etc. as cron would
<AlexExtreme> hmm, pam. would it be possible to make it work without pam?
<Keybuk> not planning to, why?
<AlexExtreme> frugalware doesn't use pam (slackware roots, you know ;))
<Keybuk> your mad :)
<Keybuk> I guess you could compile it without pam support or something
<AlexExtreme> it would be quite easy to get it to work with shadow, as far as i know
<Keybuk> why would it need shadow?
<abultman> Keybuk: the upstart people need some better documentation, btw
<Keybuk> the point of pam in this regard is to set up a user environment, with limits, etc.
<Keybuk> abultman: the lack of documentation is ... err ... deliberate
<AlexExtreme> uhh, i dunno. first thing that popped into my head (i was thinking gnome-screensaver, it has 2 auth methods, pam and shadow)
<Keybuk> we didn't want people shipping upstart job files until we knew the config file format wasn't going to change much
<abultman> Keybuk: Ah, ok, it all makes sense now.  I was more than a little irritated. I didn't want to hack this server to use daemontools, and upstart seems good enough at the moment...
<AlexExtreme> daemontools *shudder*
<abultman> AlexExtreme: It's nice in some cases
<abultman> Trying to get daemontools to log sucks
<Keybuk> abultman: the next few releases will see some major increases in documentation
<AlexExtreme> *some* cases ;)
<abultman> Keybuk: Thank you, that's awesome. 
<_ion> keybuk: Let's say we want to start /usr/sbin/foobard as user:group foobard:foobard from /etc/event.d/foobard. Wouldn't a full pam session be a bit excessive?
<Keybuk> _ion: no
<_ion> OTOH, it's not like it really causes any overhead now that i think of it. :-)
<abultman> Keybuk: awesome, that works perfectly, I have my daemon starting and stopping correctly now with upstart
<Keybuk> you need the full pam session to ensure that /etc/security stuff still works
<neemz> hey folks does anyone know how I can stop a startup sequence from autodetecting hardware in edgy?
<_ion> Can you define your problem more accurately? (It has probably nothing to do with upstart, btw)
<neemz> ohh i thought upstart handled everything at startup
<neemz> the problem i'm having is when certain drivers try to load they hang the system
<neemz> i want to stop the system from trying to load the driver, either by telling it not to load that one in particular or by completely turning off hardware autodetection
<tonfa> I think you can blacklist it in /etc/hotplug/blacklist
<neemz> i can't boot the system to do that though
<_ion> In edgy, upstart just runs the traditional sysvrc files. In feisty, startup is going to be based on actual upstart jobs. The initramfs scripts will be run before upstart, though.
<theCore> is it normal that upstart logging output is send to the "active" tty? 
<neemz> i shall attempt to mount the filesystem using a rescue cd
<_ion> I guess creating and editing /etc/modprobe.d/blacklist-local would be a good idea.
<theCore> shouldn't be fixed to the last screen?
<_ion> thecore: Upstart's output or upstart jobs' output?
<theCore> jobs' output
<theCore> I think I will fill a bug against the Ubuntu's package, but I just want to know if it's really a bug
<_ion> Remove "console output" or "console owner" from jobs.
<theCore> will that remove completely the jobs' output?
<_ion> The output will go to /var/log/boot by default, currently.
<theCore> that wouldn't fix the problem
<theCore> I want the output to be send to the Alt-Ctrl-F8's screen  
<neemz> do you guys know how to go into interactive startup? 
#upstart 2006-12-03
<treepio> Can someone help me with a basuc (real basic) system with upstart?
<treepio> I cant seem to find the example files "The requested URL /download/example-jobs.tar.gz was not found on this server." where can I find those ?
<AlexExtreme> it's example-jobs-2.tar.gz
<treepio> AlexExtreme: thank you, I will try that
<treepio> AlexExtreme: perfect, thank you.
<Keybuk> evening
<_ion> That.
<Keybuk> that?
<_ion> Yes, evening. :-)
<Keybuk> heh
<yankees26> that was genius
<yankees26> :p
<theCore> is it just me or upstart haven't been updated for a while?
<theCore> is it done?
<theCore> :P
<AlexExtreme> no, there's just a lack of coding :p
<theCore> ah
<AlexExtreme> and yay, i broke gnome-session!
<AlexExtreme> neither the panel nor metacity came up :P
<yankees26> nice job alex
<_ion> thecore: There's been a lot of planning, see the wiki at the upstart website.
<theCore> oh, there's a wiki now? nice
<theCore> _ion, thanks for the info
<yankees26> sorry to report i wont be using upstart in LFS guys 'cause i've become too lazy to do LFS :P and my package manager is also dead in that case :P
<AlexExtreme> I got bored of LFS :P
<AlexExtreme> I never get past about 5 packages of BLFS :)
<yankees26> heh, i didnt even start
<yankees26> and my previous attempts never got pasted binutils :P
<Keybuk> theCore: as Johan said, mostly been planning out the new stuff for 0.3
<Keybuk> and been distracted by UDS and then the Canonical All Hands, and getting feisty open
<Keybuk> next week is probably my first full week on upstart again :)
<Keybuk> http://upstart.ubuntu.com/wiki/JobSerialisation
<Keybuk> ^ new sexyness :p
<AlexExtreme> nooo
<AlexExtreme> http://upstart.ubuntu.com/wiki/SplashIntegration
<AlexExtreme> ^ that's sexyness! :p
<theCore> sweet
<Keybuk> SI: that was originally what logd was going to do -- subscribe to job notifications from upstart and send stuff to usplash
<Keybuk> never did that though
<theCore> Keybuk, I got a question
<Keybuk> theCore: what is your question?
<AlexExtreme> brb
<theCore> is the intended behaviour for upstart to send the output of the jobs to the "active" console?  
<Keybuk> if you have "console output", yes
<Keybuk> it just sends it to /dev/console
<Keybuk> which is the active console
<theCore> personally, I find this quite annoying
<AlexExtreme> the default is to be logged though
<Keybuk> right, the default is to send it to a daemon that knows better how to do with it
<Keybuk> if you wanted the messages in a file, or on a particular VT, or sent to a daemon or serial port, etc. you could do that with the daemon
<theCore> would I need to edit all the jobs, or I could set that in the upstart conf (if there's one) 
<theCore> ?
<_ion> Sigh, i don't seem to get a freaking Hello World done.
<Keybuk> theCore: it's the default to send it to the daemon
<_ion> Today i thought i'd write an upstart job parser (with support for complex-event-config and emits stanzas, as in the specifications) in ruby and then make it output a graphviz dot file, but just couldn't concentrate.
<theCore> does logd only write the job's output to /var/log/boot?
<_ion> Currently yes.
<theCore> ok, that's what thought 
#upstart 2007-11-26
<soren> Keybuk: I've got defunct process with ppid 1.. :( Anything you want me to do?
<Keybuk> soren: ps output?
<soren> All of it? Or just: soren    13177  0.1  0.0      0     0 ?        Zl   Nov23   5:31 [epiphany-browse] <defunct>
<Keybuk> ps ajx for that pid
<Keybuk> ps j -p 13177
<Keybuk> should do the trick
<soren> 13177  7364  7364 ?        Zl     5:31 [epiphany-browse] <defunct>
<Keybuk> it's ppid isn't 1 then
<Keybuk> ?
<soren> That's what ps -efl says.
<Keybuk> ps -j ! :)
<soren> ps -j -p gives these headers:
<Keybuk> no
<Keybuk> ps j
<soren> Gah.
<Keybuk> ps j -p 13177
<Keybuk> ;)
<soren> I'm an idiot.
<soren>  PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
<soren>     1 13177  7364  7364 ?           -1 Zl    1000   5:31 [epiphany-browse] <defunct>
<Keybuk> and curse ps for having the most insanely bad command-line ever
<Keybuk> hmm
<Keybuk> if you kill -CHLD 1 does it go away?
<soren> Wow.. No.
<Keybuk> sudo initctl version
<Keybuk> ?
<Keybuk> (note: no --)
<soren> Now init (upstart 0.3.9)
<soren> Er.. -now :)
<Keybuk> did it go away?
<soren> No.
<Keybuk> *shrug*
<Keybuk> iz kernel boog
<soren> What makes you say that?
<soren> Surely, it's upstart's job to wait() for it?
<Keybuk> yes
<Keybuk> and you just did two things that guaranteed that upstart would have called wait() :P
<soren> Eh? What was the other one? initctl version?
<Keybuk> yeah
<soren> *g*
<soren> Why?
<Keybuk> upstart always calls wait() in a loop inside its main loop
<soren> Oh.
<Keybuk> rather than in a signal handler
<Keybuk> so you just did two things that would have definitely caused upstart to repeat its main loop
<Keybuk> so definitely caused wait() to be called
<Keybuk> which kernel version?
<soren> Just once? Or until there aren't any more children?
<Keybuk> until there aren't any more children
<Keybuk> it calls waitid (..., WNOHANG) in a loop
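The non-blocking reap loop mentioned above can be sketched in Python. This is an illustration of the general technique only, using `os.waitpid` with `WNOHANG` rather than the `waitid` call Upstart uses; `reap_children` is a hypothetical name.

```python
import os


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, code) pairs, where code is the exit status,
    or the negated signal number if the child was killed by a signal.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # ECHILD: this process has no children at all
        if pid == 0:
            break  # children exist, but none of them has exited yet
        if os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
        else:
            code = -os.WTERMSIG(status)
        reaped.append((pid, code))
    return reaped
```

Running this once per main-loop iteration, instead of inside a signal handler, means that anything which wakes the main loop also causes every pending zombie to be collected.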
<soren> uname -a says: Linux butch 2.6.22-14-generic #1 SMP Sun Oct 14 21:45:15 GMT 2007 x86_64 GNU/Linux
<soren> I may or may not have upgraded the kernel package since I booted, so that's the best I've got.
<Keybuk> sudo stop tty1
<Keybuk> sudo start tty1
<soren> Same.
<Keybuk> any getty processes now zombies?
<soren> Nope
<Keybuk> sudo status tty1
<Keybuk> then kill that pid
<Keybuk> does it respawn?
<soren> Yup
<Keybuk> any zombie gettys?
<soren> Nope.
<Keybuk> epiphany still there?
<soren> Yup
<Keybuk> *shrug*
<Keybuk> not my fault then ;)
<soren> Gah..
<Keybuk> definitely a kernel issue
<Keybuk> try sending SIGCONT to 13177 ?
<soren> To a zombie? Er.. No. I can, though.
<Keybuk> try ie
<Keybuk> try it
<soren> Same.
<Keybuk> fair enough
<soren> What effect could that possibly have?
<Keybuk> -> #ubuntu-kernel :)
<Keybuk> might have released an in-kernel lock
<soren> Oh.
<soren> The epiphany process has an npviewer.bin child process, but I doubt that can mess things up this badly.
<Keybuk> well
<Keybuk> that kinda proves the kernel is stuffed then
<soren> That's a good point.
<soren> :)
<Keybuk> zombie processes can't, by definition, have children
<soren> Precisely.
<Keybuk> since when they exit, the children would have been reparented to init first
<soren> I didn't think of that.
<soren> Yeah.
<soren> Ah... dmesg reveals trouble, too.
<soren> E.g. "Fixing recursive fault but reboot is needed!"
<soren> ...but I had the vmware server modules loaded, so I'm SOL.
<soren> I'll just blame them. There. Problem solved.
#upstart 2007-11-29
<johnf1911> got a bit of an upstart question
<johnf1911> I'm running gutsy
<johnf1911> with mythtv-backend, which needs a local mysql server to be started
<johnf1911> while, per rc2.d, they should start one after the other
<johnf1911> the mysql server is never actually up when mythtv tries to start
<johnf1911> and I need to manually restart it
<johnf1911> how can I fix this?
<Amaranth> With upstart running as sysvinit you can't
<Amaranth> Unless you loop in the mythtv init script waiting for mysql to show up
<Amaranth> johnf1911: ^
<johnf1911> Amaranth: hmm, crap, that wasn't exactly the answer I was hoping for :)
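The "loop waiting for mysql to show up" workaround could look roughly like the following Python sketch. It assumes the MySQL server accepts TCP connections on a known host and port, which is an assumption (many local setups use a Unix socket instead), and `wait_for_port` is a hypothetical helper name.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to (host, port) succeeds.

    Returns True once a connection is accepted, or False if timeout
    seconds pass without one.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An init script could call something like this before launching the backend and give up with an error if it returns False. It is a workaround for the missing dependency ordering, not a replacement for it.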
<Md> is that the huawei hdspa modem?
<Jc2k> ?
<Jc2k> i have a huawei
<Jc2k> from three
<Md> nevermind, wrong channel
<Jc2k> hehe
<Keybuk> heh
<Keybuk> *sigh* Linus missed the second patch in yesterday's patch frenzy
<Keybuk> but I did have a mail from akpm today which hints that it may go in today
<Keybuk> and then I can push my branch at last ;)
* Jc2k is puzzled
<Keybuk> Jc2k: I managed to find several quite low-down kernel bugs with the recent upstart development
<Keybuk> at least one of them got a CVE attached to it
<Jc2k> cool
<Keybuk> so kinda waiting for Linus to push the fixes, otherwise it's "yeah, trunk compiles, but it'll hang your machine"
<ion_> http://heh.fi/tmp/typical_devel_session (screenshot)
<ion_> Now post yours. :-)
<Jc2k> http://photos-d.ak.facebook.com/photos-ak-sctm/genericv2/1306/26/01AwcAX0FJ-e8AAAABAAAAAAAAAAA:.jpg
<AlexExtreme> heh, my typical devel session consists of gedit, gnome-terminal and firefox ;)
<Keybuk> http://people.ubuntu.com/~scott/Screenshot.png
#upstart 2007-11-30
<abhiraj> how does shutdown work with upstart integrated?
<abhiraj> /sbin/poweroff doesn't work for me in my busybox
<lcapriotti> hi, anyone available for an help?
#upstart 2007-12-02
<Keybuk> hurrah
<Keybuk> I think I've eked out all the ptrace race conditions now
<Keybuk> except it now doesn't work on amd64
<Keybuk> *sigh*
<Keybuk> ah, that was just a difference-in-size problem
<Keybuk> sizeof (pid_t) < sizeof (unsigned long) on amd64
<Keybuk> wait for stop
<Keybuk> script
<Keybuk>         echo initialising
<Keybuk>         sleep 5
<Keybuk>         echo yielding
<Keybuk>         kill -STOP $$
<Keybuk>         echo running
<Keybuk>         exec sleep inf
<Keybuk> end script
<Keybuk> -- 
<Keybuk> Dec 02 19:55:32 test-wait (#0) goal changed from stop to start
<Keybuk> Dec 02 19:55:32 test-wait (#0) state changed from waiting to starting
<Keybuk> Dec 02 19:55:32 event_new: Pending starting event
<Keybuk> Dec 02 19:55:32 Handling starting event
<Keybuk> Dec 02 19:55:32 event_finished: Finished starting event
<Keybuk> Dec 02 19:55:32 test-wait (#0) state changed from starting to pre-start
<Keybuk> Dec 02 19:55:32 test-wait (#0) state changed from pre-start to spawned
<Keybuk> Dec 02 19:55:32 process_spawn: Spawned process 415 for test-wait (#0)
<Keybuk> Dec 02 19:55:32 Active test-wait (#0) main process (415)
<Keybuk> yielding
<Keybuk> running
<Keybuk> Dec 02 19:55:37 test-wait (#0) state changed from spawned to post-start
<Keybuk> Dec 02 19:55:37 test-wait (#0) state changed from post-start to running
<Keybuk> Dec 02 19:55:37 event_new: Pending started event
<Keybuk> Dec 02 19:55:37 Handling started event
<Keybuk> Dec 02 19:55:37 event_finished: Finished started event
<Keybuk> -- 
<Keybuk> \o/
<Keybuk> (note hanging around in spawned for the process to raise SIGSTOP -- matching "wait for stop")
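The readiness handshake shown above, where the job raises SIGSTOP once it has finished initialising, can be sketched from the supervisor's side in Python. This is an assumption-laden illustration, not Upstart's code: `spawn_and_wait_for_stop` is a hypothetical name, and it uses plain `os.waitpid` with `WUNTRACED` to notice the stop.

```python
import os
import signal


def spawn_and_wait_for_stop(child_main):
    """Fork child_main() and return its pid once it has stopped itself.

    The child is expected to raise SIGSTOP on itself when it is ready.
    """
    pid = os.fork()
    if pid == 0:
        try:
            child_main()
        finally:
            os._exit(0)  # never fall back into the parent's code
    # WUNTRACED makes waitpid() report stopped children, not only exited ones.
    _, status = os.waitpid(pid, os.WUNTRACED)
    if not os.WIFSTOPPED(status):
        raise RuntimeError("child exited before signalling readiness")
    # The child is ready: resume it and treat the job as running from here on.
    os.kill(pid, signal.SIGCONT)
    return pid
```

The supervisor stays in the equivalent of the "spawned" state until the stop is observed, which matches the five-second gap between the spawned and post-start lines in the log above.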
<ion_> Neat!
#upstart 2008-11-24
<keesj> :.. not good http://paste-it.net/raw/public/q26a6ac/ ..
<keesj> proc is not mounted so I guess it's normal that I can't open "/proc/%d/oom_adj" whatever that is :P
<arekm> you need a patch that ignores this error instead of failing
<arekm> http://cvs.pld-linux.org/cgi-bin/cvsweb.cgi/~checkout~/SOURCES/upstart-oomfail.patch?rev=1.1;content-type=text%2Fplain
<ion_> CVS? *shiver*
<arekm> cvs still works you know
<ion_> FSVO works
<keesj> arekm: thanks 
<keesj> cvs never left me down
<keesj> unlike upstart 
<keesj> right now :p
<arekm> upstart is well, too new to be marked as "good"
<keesj> lol , it's hard to debug the init proccess. for the kernel i use jtag+openocd  and for userland many tricks but the "init" process is just a pain
<keesj> hmm , two more problems
<suihkulokki> keesj: kvm/qemu are convinient when debugging init
<keesj> typing start with no argument segfaults and initctl list looks very bad
<keesj> http://paste-it.net/raw/public/j18a061/
<keesj> suihkulokki: I also have a qemu build running with a nfs export
<keesj> I guess there must be a way to attach gdb to the init process when the kernel starts
<suihkulokki> keesj: qemu -p
<keesj> suihkulokki: I do full system emulation (so also a kernel), as upstart doesn't like being anything other than pid 1
<keesj> it's not that easy :p
<suihkulokki> keesj: have you tried? qemu -p works well system qemu. you can stop anywhere in the guest execution.
<keesj> so that would be the kernel right. 
<keesj> I do something like this http://paste-it.net/raw/public/ve3bd51/ starting with -p doesn't have any effect and I need to switch to the qemu control (control-a c) and type gdbserver 1234 to start it :p
* suihkulokki is having trouble shutting down system cleanly using upstart native mode
<sadmac2> ion_: ping
<ion_> sadmac2: pong
<sadmac2> ion_: I'm working on a waitfd implementation, was wondering if you wanted to comment on how the api should look
<ion_> I could take a look.
<sadmac2> ion_: its not really a code matter. just whatever you'd like the function signature to look like
<ion_> I’m afraid i haven’t thought of waitfd enough to have any ideas, but if you shared your ideas, i might have thoughts. :-)
<sadmac2> ion_: right now it takes basically the same arguments as waitpid, and the descriptor spits out a stream of siginfos that would come from calling waitpid repeatedly.
<ion_> Alright, sounds good.
<sadmac2> ion_: the interesting bits are: is specifying such a descriptor for one process useful, and do we need a way to specify n specific processes
<sadmac2> ?
<ion_> waitfd (..., {pid0, pid1, pid2, NULL}); or waitfd (..., pid0, pid1, pid2) perhaps?
<sadmac2> ion_: varargs? in MY system call?
<ion_> Thus the former one. :-)
<sadmac2> what I dislike there is that it doesn't map easily to the underlying wait4 call
<ion_> How about just waitfd (-1, ...), have it spout siginfo_ts for all pids and filter them as appropriate in userspace, in case the app is interested of multiple pids?
<sadmac2> ion_: that's a given
<sadmac2> ion_: right now you can wait on all, a specific pid, or a group
<sadmac2> you just can't do N specific pids
<ion_> Yeah, i meant that as an answer to how to do N specific pids
<ion_> Capture all, filter them in the app
<sadmac2> ion_: certainly it can be done. is it best?
<ion_> int fd = waitfd (...); ioctl (fd, WAITFDADDPID, pid0); ... :-)
<sadmac2> ion_: no
<sadmac2> ion_: the array thing is more likely
<ion_> Keybuk probably would give better input.
<sadmac2> indeed
<sadmac2> but I haven't seen him today
<ion_> He’s on a leave AFAIK.
<sadmac2> ah
<ion_> I’d spend *more* time on IRC when on a leave. :-P
<sadmac2> heh
<sadmac2> ion_: fun disaster case: what does read(mywaitfd) return when wait() would return ECHILD?
<ion_> Hmm :-)
<sadmac2> ion_: the read manpage says read can return "other things" depending on what its hooked to.
<ion_> siginfo_t with si_pid = 0 or something like that?
<sadmac2> no, I think we should throw an error.
<sadmac2> timerfd already has a custom error code entry, this will be no different.
<ion_> Alright
#upstart 2008-11-25
<popey> hello
<popey> is there really no way of issuing a reload until version 0.5.0 ?
<arekm> kill -TERM 1 ;)
<popey> seriously?
<popey> i know you're kidding
<popey> but I was asking a serious question :)
<arekm> -9 should work, too
<popey> it doesn't appear to cause it to reread the config
<popey> no log lines were written - that was added in 0.5.0 too?
<popey> hmm, maybe it does. thanks
#upstart 2008-11-26
<ssd7> Is there a way to get upstart jobs to talk to usplash without using a script section?
<mbiebl> ssd7: not that I know of
<ssd7> yeah, i've been searching online and it seems to be something that is being worked on
<mbiebl> ssd7: but I would find it rather odd, if upstart would talk to usplash directly
<mbiebl> maybe usplash could be reworked to simply listen to upstart events (via dbus)
<ssd7> it would just be nice to see if things are happening in the order I have set them up to happen 
#upstart 2009-11-24
<trehn> can upstart reload service B for me whenever service A is restarted?
<Keybuk> not yet
<Keybuk> ugh
<Keybuk> I hate it when people find bugs that mean I have to spend hours writing 200+ test cases
<sadmac> Keybuk: which one this time?
<Keybuk> sending a message to a disconnected d-bus connection results in an assertion error ;)
<Keybuk> it doesn't actually affect upstart
<Keybuk> just mountall
<Keybuk> but it's in the most heavily tested bits of the code
<Keybuk> so there's a _lot_ of coverage
<Keybuk> and a test like "sending to a disconnected connection" deserves appearing in each of the coverage cases
<Keybuk> *paste* modify *paste* modify *paste* modify *argh!*
<sadmac> Keybuk: join with me in the brave new world of software proving.
<Keybuk> sadmac: "if it builds, ship it" ?
<sadmac> Keybuk: that's a start...
<sadmac> Keybuk: ultimately we want "if it builds under the most evil compiler ever spat out of hell, ship it."
<sadmac> compilers that find logic bugs for you. together we can make a difference.
<Keybuk> llvm? :p
<sadmac> another good move
<Keybuk> gnargh
<Keybuk> arsing valgrind fuck hate bastard grr
<sadmac> New release name!
<sadmac> Upstart 1.1 "fuck hate bastard"
<Keybuk> lol
* Keybuk tries to figure out what the hell he did with these tags and branches
<sadmac> how bzr
<ion> Meanwhile, i configured automatic failover to 3G if my ADSL is down. \o/
<Keybuk> oh, but of course
<Keybuk> the valgrind errors are coming from the very last function in the file
<sadmac> Keybuk: the one you wrote when you are most tired no doubt
<Keybuk> most rushed I think
<Keybuk> argh
<Keybuk> *one* missing free()
<Keybuk> that's all it was
<ion> :-)
<notting> Keybuk: unless i'm reading it wrong, 0.6 contains compat code to talk to sysvinit after upgrade,but not to 0.3.x?
<Keybuk> right
<notting> <obvious> why? </obvious>
<Keybuk> because it's reasonably safe to just re-exec 0.3 to 0.6 ?
<Keybuk> the runlevel implementation between sysvinit and upstart isn't sufficiently compatible to make that transition safe
<sadmac> Keybuk: but re-execing upstart just drops everything on the floor
<Keybuk> where "everything" is not much in 0.3 ;)
<Keybuk> at least in Ubuntu
<Keybuk> Patches are, as always, Welcome :p
<Keybuk> if you need some 0.3 compat code in there, please do add and I'll certainly accept :p
<Keybuk> I'm just lazy
<sadmac> Keybuk: yeah, been meaning to write that state transfer thing. My point though was if 0.6 just ignores what 0.3 is doing and 0.3 is doing more or less what sysv is doing why can't it ignore what sysv is doing?
<Keybuk> it's still there for Debian
<Keybuk> because they asked to keep it for now
<sadmac> they like pain.
<notting> also, sysvinit is stupid and simple to code.
<sadmac> first requirements for fedora's new init system: 1) We must only ever ship 1 init system. 2) We must switch one and only one time.
<sadmac> 2 kinda fell apart, since Upstart is the switch that keeps on switching, but hey. :)
<sadmac> I guess 1 might fail too since I think somebody's still got initng packaged (at least I keep finding fedora users that use it.)
<Keybuk> I thought you were NIHing your own?
* Keybuk read that on LWN somewhere :p
<sadmac> Keybuk: we were going to nih prcsys (which reeeely needed to be NIHd since its one of the silliest little pieces of code ever written.)
<sadmac> Keybuk: but prcsys turned out to be as bad of an idea as it was an implementation
<Keybuk> ah
<Keybuk> I heard it was going to be called PulseInit
<sadmac> Keybuk: ...that sounds familiar, but it wasn't what I was writing
<Keybuk> maybe that was Lennart ;)
<sadmac> Keybuk: mine was rrn. AFAICT the only public announcement of it ever was at Fudcon, about 2 hours before the meeting where we decided to switch to upstart.
<Keybuk> lol
<sadmac> Keybuk: would have dumped the code on the internet ages ago, but nobody wants it and... the last bit I was working on is a bit sketchy. It was some circular dependency detection logic, and despite consisting of about 40 lines of code and 2 additional elements in a struct, I still can't figure out how I'd meant it to work (which it doesn't)
<sadmac> so it was a closed-casket funeral
#upstart 2009-11-26
<twiinz> hi there
<twiinz> i'm planning to use upstart instances to manage subprocesses, i get the instance name from a database, im not sure what woud be the best approach to get all thoses instance to start automatically on a reboot, or on command
<twiinz> so far i've worked out a system where my instances start on an event, and made second upstart conf file whose job is to loop throughout the list it got from the database and emits events
<twiinz> but i've got a feeling there's an easier way to accomplish that, most of the examples i found about instance mention ttys, is there a built in way to say in the instance conf file, for the given runlevels iterate through this list of ttys [1, 2, 3, 4, 5, 6]?
<BleSS> does upstart is only valid to use with shell scripts?
<ion> Upstart can run and monitor any processes.
<BleSS> I say because I'm changing my bash scripts to python since that bash is very cryptic and unmaintaneable
<BleSS> ion: but in the next page http://upstart.ubuntu.com/getting-started.html says: shell script code that will be executed using /bin/sh
<ion> The job definitions support scripting in sh, yes.
<BleSS> so it isn't possible use python there, is it?
<ion> Within job definitions? Not at the moment at least. But job definitions tend to be so simple sh is sufficient. Having a single interpreter for all job definitions is a good idea IMO, unless there’s a compelling reason to change that.
<BleSS> thanks ion
#upstart 2009-11-27
<ion> keybuk: Feel like merging my mountall patch? :-)
<Keybuk> ion: is there a bug open for it?
<ion> keybuk: Nope, it seems. I’ve only pushed to https://code.edge.launchpad.net/~ion/ubuntu/lucid/mountall/lucid
<Keybuk> ion: if you could file a bug, or at least attach it to a bug
<Keybuk> and a merge request
<Keybuk> that would be great
<ion> Will do
<Keybuk> otherwise I just won't remember to do it
<Keybuk> things like that get processed in a reasonable fashion
<ion> What is the current development branch for mountall in lucid? lp:~ubuntu-core-dev/ubuntu/karmic/mountall/ubuntu still?
<ion> I.e. to where should i make the merge request? :-)
<Keybuk> I think so
<Keybuk> let me check
<ion> There’s also lp:ubuntu/mountall with auto-imports only (no real commit history).
<Keybuk> https://code.edge.launchpad.net/ubuntu/+source/mountall
<Keybuk> right
<Keybuk> was checking if james_w switched them over yet
<Keybuk> are you sure?
<Keybuk> lp:ubuntu/lucid/mountall is a real branch
<Keybuk> that's most odd
<Keybuk> oh
<Keybuk> right
<Keybuk> yes
<Keybuk> it's the auto-import
<Keybuk> james hasn't made the real branch the right one yet
<Keybuk> he's on holiday this week
<Keybuk> it'll all be just lp:ubuntu/PACKAGE soon
<meoblast001> hi
<meoblast001> i just installed upstart, is the binary /sbin/init?
<tormod> does upstart log output from script stanzas somewhere?
<tormod> what does nih as in libnih mean?
<tormod> brb
<Keybuk> tormod: not at the moment (plans to do so)
<Keybuk> nih = not invented here
<Keybuk> if something is written from scratch, rather than re-using existing code, we say it's "NIH"
<Keybuk> it's mostly a joke, because libnih looks like lots of "standard libraries" like glib and such
<Keybuk> but with a few major differences that Upstart needs, like not abort()ing on malloc() failure
<ion> And generally sucking less. :-)
<tormod> I knew the joke, but I thought this can not possibly be a joke...
<tormod> well I saw it a thousand times in the source, and no explanation, so you can always come up with something
<tormod> new init helper?
<tormod> also, I have a suggestion: adding the OMGBroken wiki trick to a README file, so I can look it up while I am in sulogin
#upstart 2009-11-28
<sadmac_home> (GRP (ALT (GRP 'a 'b 'x 'x 'c)) (ALT (GRP 'a 'f 'x 'x 'x 'g)) (ALT (GRP 'a 'b 'x 'x 'x 'd)) (ALT (GRP 'a 'b 'c)) (ALT (GRP 'a 'b 'd)) (ALT (GRP 'a 'f 'x 'g)) (ALT (GRP 'a 'f 'g)))
<sadmac_home> \o/
<sadmac_home> I hate parsers.
#upstart 2010-11-29
<jmux> Hi. I'm trying to fix LP #478392. My problem seems to be, that /tmp is on a seperate partition and also /usr.
<jmux> So /tmp is mounted before /usr, but the mounted-tmp script wants to use find to clean /tmp and fails
<jmux> I tried to "start on (mounted MOUNTPOINT=/tmp and mounted MOUNTPOINT=/usr), but this deadlocks the boot process.
<jmux> Any idea what's going on here?
<aqva> when I try to run a custom python-based server (the conf is here: http://pastebin.com/5sS5Sbup ) but if I kill it manually it does not respawn and "initctl start clevertag-server" outputs "initctl: Job is already running: clevertag-server"... Any clue about what I am doing wrong would be very appreciated!
<ectospasm> what is the proper way to have a custom upstart job start at boot?  update-rc.d doesn't seem to be correct.
<mbiebl> ectospasm: add the necessary "start on ..." line 
<mbiebl> if you want the equivalent of "started in multi-user", use start on runlevel [2345]
<mbiebl> if your service depends on foo, use "start on started <foo>"
<ectospasm> I've got start on startup and start on runlevel [2345], yet it didn't start when I rebooted
<mbiebl> you can only have one
<ectospasm> yeah, the documentation for upstart is out of date, and inadequate
<ectospasm> that was never clear
<ectospasm> So I removed "start on startup", do I need to do anything else?
<mbiebl> do you have a script generating the runlevel events?
<mbiebl> which distro do you use?
<ectospasm> ubuntu, and I wrote the script from scratch
<mbiebl> ectospasm: should be fine then
#upstart 2010-11-30
<zaufi> hi all. anyone alive&
<zaufi> ??
<mbiebl> only zombies around here
<zaufi> :)
<zaufi> I've got a problem w/ writing upstart script for my daemon... :( anyone can help?
<mbiebl> not without a crystal ball
<mbiebl> unless you tell what the problem is
<zaufi> `service stop mydaemon` won't send a SIGTERM to my daemon... it hust hangs on poll() some socket (dbus I guess)...
<zaufi> s,hust,just,
<zaufi> I use 'expect daemon' in the script (which is means that daemon will fork() tiwce, as my daemon do)
<mbiebl> you mean "stop foo", not "service stop foo"?
<zaufi> yep
<zaufi> ... I'm playing with ubuntu 10.10 where `service` calls `start`/`stop` for upstarted programs
<mbiebl> can you run your daemon without forking?
<ion> Don’t use the expect stanza unless you’re *absolute* sure of the forking behavior of your main process. The current version of Upstart gets confused if the behavior differs from what the expect stanza claims.
<ion> absolutely
<zaufi> I'm pretty sure my daemon fork()s exactly twice :)
<ion> Please pastebin the job definition.
<mbiebl> does status foo list the correct pid?
<ion> The next major release of Upstart will have a considerably better method of following forks.
<zaufi> mbiebl: oops... I forgot to check... let me reboot into ubuntu and check :) (I'm in gentoo usually)
<zaufi> I'm just trying to get familiar w/ ubuntu )
<zaufi> mbiebl: PIDs differ...
<mbiebl> that's your problem then
<ion> Please pastebin the job definition.
<zaufi> mbiebl: http://pastebin.com/HRfuCdWr
<mbiebl> as said, can you run your daemon without forking?
<zaufi> http://pastebin.com/VHW8UzcV
<zaufi> Ok I'll try... but how to `stop` it now :((
<mbiebl> kill 2060
<zaufi> so I have to comment 'expect' stanza and remove -d option (to allow fg execution) ... am I correct??
<mbiebl> yeah, try that
<zaufi> huh... `start` hangs
<zaufi> cut it poll() smth
<zaufi> s,cut,cuz.
<zaufi> if I Ctrl-C it next start it says that indexer already running
<zaufi> I guess I have to reboot... to cleanup dbus state
<zaufi> ??
<zaufi> mbiebl: w/o '-d' and expect start/stop works fine
<mbiebl> problem solved
<zaufi> not exactly... (
<zaufi> my program must be running w/ -d
<zaufi> in fg mode it makes spam to console only... in -d (daemon) to syslog
<zaufi> so being running w/o -d from upstart I can't c any log messages
<mbiebl> zaufi: you can try export fork
<mbiebl> instead of daemon
<mbiebl> argh, expect fork
<zaufi> it fork() twice!
<zaufi> as most real daemons do
<ion> Or strace -f it to see exactly what it *really* does. :-)
<zaufi> `expect fork` won't stop it, as expected
<zaufi> http://pastebin.com/ZgHubxM8
<zaufi> here is a strace result... I used 'expect daemon'
<zaufi> and it's still polling...
<ion> I mean strace -o strace.out -f /…/indexer -d
<zaufi> ok
<zaufi> just a sec
<zaufi> this is first child: http://pastebin.com/Vs3EKTbz
<zaufi> this is second generation child: http://pastebin.com/cKSusw2W
<zaufi> I used -ff for strace cuz daemon loads JVM (which makes a lot of spam)
<ion> Use -f
<zaufi> ok
<zaufi> 270K ... some file sharing required
<zaufi> just a sec I'll find smth
<ion> Perhaps just egrep -A5 -B5 'clone|exec|exit' or something like that.
<zaufi> http://up.k10x.net/webgkjidzbojj/indexer.log
<ion> Java seems to do ten forks or so before your code daemonizes.
<ion> Upstart happily follows the first one of them and gets confused.
<zaufi> it forks threads
<zaufi> not processes
<zaufi> I think I can delay JVM loading... and do it after daemonize...
<zaufi> I need some time to fix the code
<ion> You could separate the parameter to daemonize from the parameter to log to syslog.
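The combination that worked for zaufi above (no -d, no expect stanza), together with ion's suggestion, could look roughly like this. The path is a placeholder and --syslog is a hypothetical flag the daemon would have to grow:

```
# /etc/init/indexer.conf  (sketch only)

# No "expect" stanza: the process is assumed to stay in the foreground,
# so Upstart tracks exactly the PID it started.
# /path/to/indexer is a placeholder; --syslog is a hypothetical option
# that enables syslog output without daemonizing.
exec /path/to/indexer --syslog
```

If the PID shown by "status indexer" matches the running process, stop should then be able to signal it.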
<Voting> I want to start my application-level server (a tomcat instance) once everything else is running... how do I run something at the end of the existing startup sequence ?
#upstart 2010-12-01
<zaufi> Voting: I guess u may issue an event at one script and start anything in another one when event occurs
<zaufi> ion: thanx for idea!!! :)
<zaufi> I've fixed my problem
<Voting> if I want to run something "at the end" after everything else has started up, everything that is there now, what should I do? (or is that a dumb question...)
<ashb> join #rvm
<ashb> oops
#upstart 2010-12-02
<ectospasm> so, I rebooted and neither of my upstart jobs started.  One I forgot to modify (it had both "start on startup" and "start on runlevel [2345]"), but the other just had "start on runlevel [2345]".  I'm using Ubuntu 10.10.
<ivoks> ectospasm: is 'start your_upstart_job' working?
<ectospasm> ivoks: for one, yes
<ivoks> the second one?
<ivoks> that had only 'start on runlevel [2345]'?
<ectospasm> yep
<ectospasm> It's like I haven't run update-rc.d (if upstart has an equivalent)
<ivoks> so...
<ivoks> you do know what 'start on runlevel [2345]' means?
<ivoks> are you sure you want to start at that point?
<ectospasm> I want the service to start whenever the system starts in multi-user mode, which I assume is [2345]
<ivoks> maybe you'd rather want to start when you have filesystems
<ectospasm> maybe so
<ivoks> or network
<ectospasm> both, actually
<ectospasm> Maybe upstart is trying to start it too soon?
<ivoks> well, you told it to :)
<ectospasm> I'm beginning to realize that now.
<ivoks> take a look at ssh
<ivoks> 'start on filesystem'
<ivoks> you are starting your service at the same time as ttys :)
<ectospasm> OK, I need network, is there the equivalent of "start on network"?
<ivoks> yes
<ivoks> start on (startup
<ivoks>            and filesystem
<ivoks>            and started networking)
<ectospasm> hmmmm
<ectospasm> If this is in the available documentation I totally missed it.
<ivoks> don't tell anyone, but upstart is a secret project :)
<ivoks> there's no documentation :)
<ectospasm> I did notice how a lot has been deprecated in an old version already (-;
<ivoks> start on filesystem
<ivoks> that should be enough, if the service is listening on 0.0.0.0
<ivoks> if that doesn't work, you can make it start on started networking
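ivoks' two suggestions, written out as start conditions. This is only a sketch; it assumes the Ubuntu 10.10 event and job names used above (filesystem, networking):

```
# Usually enough for a daemon that listens on 0.0.0.0:
start on filesystem

# Alternative if the network must be up first (use one stanza, not both):
# start on (filesystem and started networking)
```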
<ectospasm> ivoks: not that easy
<ivoks> no?
<ectospasm> it's essentially a headless VM running in a screen session
<ivoks> eh :)
<ivoks> have fun :)
<ectospasm> it otherwise works.
<ectospasm> I was looking for a way to start a service without all the cruft the debian init.d system requires
<ectospasm> and rc.local just felt wrong
<ivoks> and too easy? :)
<ectospasm> ivoks: too hacky (-;
<rehch011> hello can someone tell me where i can find the upstart script directory in ubuntu 10.10
<ectospasm> rehch011: /etc/init/
<rehch011> thx
<rehch011> how is it possible to change the runlevel with ubuntu 10.10 resp. upstart?
<rehch011> can any of you put together a comparison between upstart and sysvinit?
<mbiebl> rehch011: you need to ask more targeted questions
<rehch011> How, for example, are the runlevels managed?
#upstart 2010-12-03
<twb> I'm tracing my reboot process, and it looks like upstart calls "halt -w" regardless of whether I'm halting or rebooting -- why?
<twb> Never mind, it's in /etc/init.d/umountnfs.sh
<zdzichuBG> hi
<zdzichuBG> I'm pulling my hair out trying to start a job when a device appears
<zdzichuBG> Can somebody help me with upstart + udev rules?
<zdzichuBG> huh, it worked.  After plea for help... Murphy's laws :)
#upstart 2010-12-04
<patdk-lap> how can I get my system to boot, when I have an nfs mount in fstab, but no network?
#upstart 2010-12-05
#upstart 2011-11-28
<statim> ion:  holy crap amazing… that is a pretty crazy script! did the trick
<amites> Any ideas why I would be able to start a process via upstart - have it show up in ps aux - but be unknown when running stop/restart commands?? Error log showing: unlink() "/var/run/nginx.pid" failed (2: No such file or directory)
<jhunt> amites: you may have mis-specified the expect stanza. See http://upstart.ubuntu.com/cookbook/#expect
<jhunt> amites: you'll probably find that "status <job>" is incorrect.
<amites> jhunt: thank you - reading
<amites> I'm fairly new to upstart and don't see anything about how to correct the behavior?
<jhunt> amites: http://upstart.ubuntu.com/cookbook/#recovery-on-misspecification-of-expect
<amites> As a side note the nginx server (problem process) doesn't connect to a port either -- probably related
<amites> jhunt: kill: No such process
<amites> I should mention that this behavior is new - it was working until there was a port conflict with apache -- corrected port conflict and been this way since
<jhunt> amites: can you show us your .conf file?
<amites> moment
<amites> http://pastie.org/2934063
<jhunt> so, nginx doesn't fork at all?
<amites> I have it set to use 9 workers in the config
<amites> that's the default init/nginx.conf that came from apt-get install
<jhunt> amites: http://wiki.nginx.org/NginxMainModule#daemon suggests it daemonizes, so you'd need "expect daemon", but please work through the expect link section. Also, remove that "respawn" until you are fully convinced you've specified the expect stanza correctly.
#upstart 2011-11-29
<johnm> Hello all. I wonder if anyone can answer what I presume to be a quick question. I have an upstart job with named instances, and I wanted to specify which named instances to automatically start during init. Is this possible?
<jhunt> johnm: you mean at boot? If so, you'll need to arrange for some other job or event to specify which named instances of the job to start.
<jhunt> johnm: have you read http://upstart.ubuntu.com/cookbook/#instance ?
<johnm> jhunt: I have indeed, but it didn't really imply how best to do it regarding multiple instances on boot.
<johnm> I already have a working script, I just presumed there may be a neater way than having a new job in place to call the other with arguments, which is what I presume needs to happen?
<johnm> (as mentioned in the doc)
<jhunt> 'instance' is a way to allow a single job to run multiple times. However, you need something to *tell* that job to run multiple times with a given unique name :)
<johnm> absolutely. I hoped there was something a little fancier in upstart to already cater for it opposed to having another job call it - but it's no biggy :)
<johnm> Thanks.
<jhunt> how would that work though? remember that Upstart needs to track a single pid, hence a job needs to map to a single pid, hence you cannot have a job creating multiple pids. 
<jhunt> but you can have multiple jobs with a single pid each and that's what instance provides.
<johnm> I've only recently been digging into upstart in any serious fashion, but I suppose what I was expecting was an interface somewhat similar to how initctl list would display instanced jobs when it came down to executing jobs. perhaps a small stanza somewhat similar to the start stanza defining which enumerations of the script. But without understanding upstart internals more that might be a very naive way of looking at it.
<johnm> perhaps even something a little like 'start on runlevel [2345] instances ( "INSTANCE=instance1 VAR1=one VAR2=two" "INSTANCE=instance2 VAR1=one VAR2=two" )' etc
<jhunt> job configuration files are supposed to be simple and that's already looking pretty hairy :) Also, you're hard-coding the instance within the job itself. What if you want to create a new instance dynamically to cater for an increase in demand for your service?
<johnm> isn't that the idea of it being instanced anyway? but yes, I imagined the enumeration to happen within upstart outside of the job configuration itself
<johnm> but I realise that upstart doesn't seem to have much external configuration if any outside of the jobs themselves anyway
<johnm> I mean, having the second job to start instanced jobs is much the same as just sticking the initial instances hardcoded in the job itself.
<johnm> it just makes it a single .conf opposed to two.
<jhunt> not quite - *any number* of other jobs could be configured to start new instances of your service should you so desire. And crucially, an admin could also manually start more instances using initctl. It's an interesting idea, but I don't envisage that we'll be changing the design any time soon as the current one works very well :)
<johnm> Yeah indeed, it was more me just not understanding that a new job to orchestrate the instanced job itself was required :)
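The arrangement jhunt describes, one instanced job plus a second job that starts the wanted instances at boot, might be sketched like this. Job names, instance names and the binary path are hypothetical:

```
# /etc/init/myservice.conf  (hypothetical instanced job)
instance $NAME
exec /usr/local/bin/myservice --name "$NAME"

# /etc/init/myservice-all.conf  (hypothetical starter job, separate file)
start on runlevel [2345]
task
script
    # Start each wanted instance; the list is hard-coded here for illustration.
    for name in instance1 instance2; do
        start myservice NAME="$name"
    done
end script
```

An admin could still add another instance by hand with "start myservice NAME=instance3", which is the flexibility jhunt points out.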
<johnm> I thought it may be possible to essentially instruct upstart that when executing jobX to enumerate it passing in ARGs for each instance specified. Perhaps though something similar to update-rc.d, but I think that I missed the point there :)
<johnm> I suppose I'm using a slightly awkward example of mongod
<johnm> there is also the problem of course whereby start-stop-daemon will check the path to the binary on --exec, and for multiple-instance jobs where it's expected to run multiple, it just exits successfully.
<johnm> I saw the bug talking about making that native in the future, which would be nice (since the exec su shell hack is pretty messy), but for cases where the instances are fairly static then it's a bit odd.
<johnm> another example might be multiple instances of MySQL or similar.
<johnm> httpd has been used in the cookbook example somewhere I think.
<johnm> All of these things would require static configuration unless the job itself were to instance based on globbing config files or some such
<johnm> (which to be honest, is probably the more sensible way)
<johnm> just controlling it might then prove a little tricky ;)
<scoopex> my apache webserver uses configuration snippets which are located on an nfs share... on system boot it fails to start because the nfs share is mounted too late (snippet cannot be loaded)... the lsb-tags of apache request $remote_fs... but ubuntu seems to ignore this... any hints?
#upstart 2011-11-30
<laziac> how do i launch something different on tty1 than other ttys at boot?
<laziac> i'm using centos 6 if that makes a difference
<JanC> by providing the appropriate options to getty, I suppose?
<laziac> i don't want to start getty at all on tty1 though, so i'm not sure if that helps?
<JanC> well, whatever can handle a tty...
#upstart 2011-12-01
<jMCg> Hello happy people
<jMCg> I'm looking for a better way to handle this: http://sprunge.us/IAiV
<jMCg> Because the results are just plain horrible (IMO)
<jMCg> http://sprunge.us/WSdc
#upstart 2011-12-02
<litheum> is there a recommended method to completely disable a rule, persistent-net-generator under ubuntu in this case?
<litheum> man, i am so far behind the times on this stuff, dang
<JanC> litheum: that sounds like an udev question, not an upstart one  ☺
<JanC> and how to override system-supplied udev rules is explained in /etc/udev/rules.d/README
<litheum> JanC: yeah i wasn't sure if it was a udev question or upstart, but i do want to disable the upstart rule that is calling udev and i don't want to do anything to udev, so...
<litheum> i guess i didn't understand well enough how things work to want to copy the file and edit it
<litheum> if i don't want it to run at all, is that really the right approach?
<JanC> I guess creating an empty udev rule with the same name in /etc/udev/rules.d/ would work
<litheum> yeah this does sound more like a udev question, doesn't it :)
<litheum> how about a different question that might be related to upstart!
<litheum> i edited /etc/init/hostname.conf so that it sets the hostname based on the name of the VM it is running inside of
<litheum> but i need /etc/init.d/vboxadd to be executed first... is there a way to make the upstart job depend on an old-school init script?
<JanC> edit the SysV init script to emit an event, then depend on that event?
<litheum> yeah, that's what i ended up doing
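JanC's suggestion could be sketched as follows. The event name vboxadd-started is made up for illustration, and where exactly the emit line belongs in the SysV script is an assumption:

```
# Added to the start action of /etc/init.d/vboxadd (shell, shown as a comment):
#   initctl emit --no-wait vboxadd-started    # hypothetical event name

# In /etc/init/hostname.conf, wait for that event:
start on vboxadd-started
```

As litheum notes, a change to the packaged init script may be overwritten on upgrade.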
<litheum> that seems really scary though if that script is just going to get overwritten if i install a new version of those tools at some point
<JanC> and vbox has no hooks for this?
<litheum> hooks for what, running scripts after the module is loaded?
<JanC> yes?
<litheum> i have no idea if it does that or not, but tearing the hostname out of upstart and tying it to some vbox hooks didn't really even enter my mind
<JanC> I mean, emitting an event from such a hook
<litheum> oh, right
<litheum> well, i don't know about that. good suggestions
<jMCg> hmmm... now that there are actually people here, maybe I should repeat my question..
<jMCg> 15:48:53 < jMCg> I'm looking for a better way to handle this: http://sprunge.us/IAiV
<jMCg> 15:49:10 < jMCg> Because the results are just plain horrible (IMO): http://sprunge.us/WSdc
<JanC> jMCg: why are they horrible?  (not sure what you are trying to do)
<jMCg> JanC: the point is that I need the logger only during the startup, but it stays up and running for the whole runtime. I find that... wrong.
<JanC> eh
<jMCg> JanC: the problem is that expect daemon doesn't work.
<JanC> so logger works as intended, right?  ☺
<jMCg> JanC: what doesn't work as expected is upstarts fork tracing :-S
<jMCg> Can it please just work automagickally?
<jMCg> I mean.. how does SMF do this?
<JanC> AFAIK that's a feature that's in the works
<jMCg> I don't remember SMF ever once caring a. Oh. Right.
<jMCg> Contracts.
<jMCg> http://www.freebsd.org/cgi/man.cgi?query=contract&sektion=4&apropos=0&manpath=SunOS+5.10
<jMCg> That's a property all Solaris processes have which doesn't exist on Linux.
<JanC> jMCg: how would that fix the logger issue?
<JanC> also, can't you use start-stop-daemon for example?
<JanC> also, you can probably do some logging in pre-/post- scripts
<jMCg> JanC: the logging is only needed for startup -- and only in case of failure.
<JanC> what does it need to log?
<jMCg> JanC: the reason why it's failing to start, for instance.
<jMCg> That kind of stuff is *reeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeaaally* useful for debugging.
<JanC> how did you do that with SysV init?
<jMCg> JanC: I haven't used SysV init since there's upstart, and I haven't used this setup before.
<broder> jMCg: it sounds like you're looking for https://code.launchpad.net/~jamesodhunt/upstart/job-logging
<JanC> broder: I doubt that will stop logging after startup?
<broder> so?
<JanC> that's what he wants, if I understand correctly?
<JanC> to log startup failures, but not whatever else the application spews after that?
<broder> it's still possible to get that information out of full job logging, though
<broder> once job logging gets merged in, i could imagine extending it to "log everything until the job transitions to a running state"
<broder> pretty easily, even
<jMCg> broder: what I want is to be able to start httpd without -D NO_DETACH without upstart losing track of it.
<jMCg> Oh.
<JanC> he already gets that with logger now, although upstart job-logging would make it easier
<jMCg> expect daemon expects two forks, my fancy pants setup has 3.
<JanC> jMCg: there is some work being done about that, but in the mean time your job should work just fine?
<jMCg> JanC: yup.
<JanC> I mean, the thing you complained about has nothing to do with that  ;)
<JanC> complained about first
<jMCg> JanC: I'm not using expect daemon. I tried doing that, but it wouldn't work. Only now did I grasp why.
<JanC> ah, yes, THAT is a known problem
<JanC> IIRC there are several solutions for that planned  ☺
 * jMCg strongly suggests implementing contracts in Linux.
<JanC> jMCg: have any good link about that?  ☺
<jMCg> JanC: aside from the docs I linked above, there's the OpenSolaris.. well.. OpenIndiana source code.
#upstart 2011-12-03
<SpamapS> jMCg: fyi, there is a planned feature to have 'expect exit' instead of 'expect fork' which will make the fork following far more reliable.
<SpamapS> jMCg: effectively measuring a job's "started" moment as 'when the parent exits' rather than 'when the parent forks'
<jMCg> SpamapS: whoa.
<jMCg> SpamapS: ETA?
<SpamapS> jMCg: Not sure... I think its a maybe for the next release
<SpamapS> jMCg: https://blueprints.launchpad.net/ubuntu/+spec/foundations-p-upstart-roadmap  .. there's some other stuff in the way
<SpamapS> [jamesodhunt] Overcome ptrace limitations: Count exits, not forks (bug #530779): TODO
 * SpamapS decides to keep triaging bugs even though it is 1:26am
<SpamapS> and even though he should be working on his slides for his talk tomorrow
 * broder is writing bots to triage bugs for him :)
 * SpamapS started doing that but then got distracted by making the bots fight
<broder> man, your bots sound so much cooler than mine now
<SpamapS> they're in scala too
<SpamapS> actually 1 is scala, and the other is haskell
<broder> cool and trendy!
<jMCg> SpamapS: goto bed
#upstart 2011-12-04
<linas> sooo ... I upgraded to upstart .... and now my computer hangs during boot...
<linas> I'm, trying to debug with init=/bin/sh .. but where's the best place to ask for this kind of help?
<linas> the hang happens after fsck finishes...
<linas> saying init --verbose and then looking at dmesg says this:
<linas> init: mounted-tmp state changed from post-stop to waiting
<linas> init:Handling stopped event
<linas> init: Handling mounting event
<linas> and that's all .. it's just hung ..
<linas> ahh hmm
<linas> all my file system are mounted, except one .. 
<linas> the one that is not mounted is one that uses the device mapper
<linas> it's a logical volume, an lvm-managed filesystem
<linas> so maybe upstart/mountall doesn't like lvm?
<linas> if there's a better channel for this, let me know ... 
<linas> .. manually trying to mount says mount: special device /dev/mapper/bla-blah does not exist
<linas> so the device mapper didn't run or something...
<linas> lsmod doesn't seem to show device-mapper modules .. hmm
<linas> well, a good a time as any to figure out how device mapper is supposed to work.
<linas> but is there a way to tell upstart to just move on, and finish booting?  cause i DON'T REALLY CARE ABOUT THE MISSING FILESYSTEM...
<linas> WHOOPs bad capslock
<linas> sorry
<linas> arghhh ... so I commented out the lvm volume in /etc/fstab and now the boot goes much farther.
<linas> a bunch of daemons get started, including the lvm daemon.   (!)
<linas> so clearly, mountall was trying to mount lvm volumes wayy too early
<linas> but now, it crashes in fsck, since this time, fsck is being run a second time !?!?! and since all the file systems are already mounted, it's too late for that.
<linas> so wtf ... 
<linas> must be left-over grunge in /etc/init.d and some backwards-compat thingy in upstart...
<JanC> linas: you say you just upgraded to upstart, but you don't say which OS (version) you are on, so it's not obvious which jobs you are using...
<JanC> (e.g. Ubuntu uses different upstart jobs than ChromeOS, and different versions of Ubuntu have improved the upstart jobs they use too)
#upstart 2012-11-26
<fission6> i just wrote my first upstart script for uwsgi, how can i test it?
<SpamapS> fission6: put it in /etc/init/uwsgi.conf and run 'sudo start uwsgi'
<SpamapS> fission6: you can also do 'init-checkconf uwsgi'
<fission6> thanks SpamapS 
<fission6> 'init-checkconf uwsgi' complains there is no file
<fission6> do i need to do 'init-checkconf uwsgi.conf'
<SpamapS> fission6: yeah, may even want to do /etc/init/uwsgi.conf
<fission6> yea i just tried that, ERROR: failed to ask Upstart to check conf file
<fission6> SpamapS: http://dpaste.org/NuSvR/
<SpamapS> fission6: I've not used init-checkconf much.. so forgive me for not knowing how to address that :-P
<fission6> hmm maybe someone else knows, seems like it would be a fundamental issue
<fission6> SpamapS: how do you check if your upstart script works then?
<SpamapS> fission6: start it
<SpamapS> fission6: errors will show up in syslog
<fission6> nice thanks
<fission6> am i going to get burnt if i upgrade from 1.3 to 1.4
<fission6> is symlinking from /etc/init/ to a folder with a conf a bad idea?
<fission6> can you symlink to an upstart conf from /etc/init?
<SpamapS> fission6: no
<SpamapS> fission6: for a number of reasons, the files need to be there, on the root disk, in /etc/init
<fission6> ok
<fission6> any ideas why restart for my job doesn't work but start and stop do
<SpamapS> fission6: the upstart restart command is.. a little weird
<SpamapS> fission6: if you use 'service foo restart', that should do a stop/start
<fission6> really?
<fission6> whats up with that, let me try
<fission6> did not work
<fission6> i find this weird, would have thought it just does a stop start, which both work
<SpamapS> fission6: it does, but it does it really fast, and without running post-stop
<SpamapS> or pre-stop, I forget
<SpamapS> fission6: anyway, its possible you need a post-stop which makes sure your service is dead
<fission6> ?
<fission6> seems odd 
<fission6> and convoluted
<fission6> SpamapS: can you start multiple processes with a single conf?
<SpamapS> fission6: yes, through use of the instance keyword
<fission6> i want to be clear SpamapS i want to start two separate daemons, one to start a queueing piece and one for uwsgi
<SpamapS> fission6: ah, make that two .confs then
<SpamapS> fission6: one can be 'start on started x' and 'stop on stopping x' so they'll act as one.
<SpamapS> fission6: though IMO, its better to have things de-coupled unless they absolutely must be coupled
<fission6> hm
<fission6> i'd prefer a master script to start up the queueing celeryd process then uwsgi web app process
<fission6> i'd prefer one conf
<SpamapS> how are they related? sounds like they aren't.
<fission6> SpamapS: ^
<fission6> welll one wouldn't run without the other
<fission6> celery does all the tasks associated with the web app uwsgi, such as email sending
<SpamapS> you can have 1 conf, but you'll have to use pidfiles and daemonize things yourself
<fission6> so you can't have multiple 'exec' or multiple script stanzas?
<SpamapS> fission6: 2 confs has some real advantages. For instance, when the uwsgi segfaults, respawn can start it back up again.. it doesn't have to then stop celery, then start them both... which is more complicated and error prone
<fission6> so maybe 2 scripts and 1 master script to start each one of them?
<JanC> upstart .conf files aren't scripts, they are configuration files  ☺
<JanC> if one of the two processes depends on the other, then you define such a dependency in the configuration of those jobs, otherwise you just start them in parallel...
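The two-job layout SpamapS and JanC recommend, with uwsgi following celeryd, might look like the sketch below. The binary paths and the app.ini file are placeholders, not fission6's actual setup:

```
# /etc/init/celeryd.conf  (hypothetical)
start on runlevel [2345]
stop on runlevel [!2345]
respawn
exec /path/to/celeryd

# /etc/init/uwsgi.conf  (separate file)
# Follows celeryd: starts once it is running, stops when it is stopping.
start on started celeryd
stop on stopping celeryd
respawn
exec /path/to/uwsgi --ini /path/to/app.ini
```

Each job is supervised and respawned on its own. If the two services do not strictly need each other, both can simply use the runlevel conditions instead.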
<fission6> so it sounds like what i am doing shouldn't be done in one script
<fission6> err conf file
<JanC> most likely not
<fission6> JanC: do you work with web apps by chance and Upstart confs, whats the nature of most of the confs
<ee99ee> Hi, I'm having a problem where this script is hanging after "start myapp": http://pastebin.com/nXSd79yP
<ee99ee> when I run the command (minus "exec" of course) manually, it starts up just fine, so I know the command is at least correct
<ee99ee> I've tried expect in all 3 modes, and it doesn't affect it in any way
<ee99ee> suggestions?
<JanC> fission6: I currently don't really work with webapps that need their own upstart jobs, no
<ee99ee> is there a way to maybe debug that it's even getting to that point?
<ee99ee> or see what's going on at all?
<ee99ee> because "start myapp" just goes to a new line and hangs
<fission6> maybe look into syslog
<JanC> you can enable logging (in recent upstart versions)
<JanC> per-job logging actually
<ee99ee> hmm
<JanC> logging for upstart itself can also be enabled with a kernel parameter or using initctl
<JanC> oh, and I think when using start-stop-daemon, you certainly don't want to use expect fork/daemon
<ee99ee> I tried commenting it out too
<JanC> well, you probably don't want to use the --background option for start-stop-daemon
<ee99ee> ok let me try that
<ee99ee> same thing
<JanC> as the manpage says, you should use that only as a last resort  ☺
<ee99ee> yeah :-/
<fission6> why is exec used in script stanza?
<ee99ee> instead of script stanza, you mean?
<JanC> the 1st respawn line is also obsolete of course
<ee99ee> ahh, thanks
<JanC> you might also want to check the logs of the application
<ee99ee> the application isn't even being started... I looked, but there isn't even a process for the app in the process list
<fission6> why do they use exec with two differnt meanings
<JanC> fission6: eh?
<fission6> exec in a script stanza and exec outside of it, or does it do the same thing?
<ee99ee> I think it does the same thing
<JanC> exec inside a script does not have anything to do with upstart
<fission6> right. why do "we" use it then to kick off the main daemon you are upstarting
<JanC> it does whatever your shell does with the exec command
<fission6> right, so do i need it in an upstart script, seems its used as a standard but i dont see what it does
<fission6> also still unclear why i cant run multiple things in a conf
<JanC> you can run multiple things from one conf
<JanC> but it won't supervise them separately then
<fission6> JanC: one other issue i have is i can cleanly start and stop my service but i can not restart it, it stops but doesn't restart
<JanC> as SpamapS said, that probably means it tries to restart it too quickly
<fission6> ok. so what does that mean in terms of resolving it, seems like an odd issue
<JanC> if you use start-stop-daemon, it offers some ways to deal with that
<fission6> this whole thing is becoming such a tangled mess just to get uwsgi to restart when my ec2 reboots
<ee99ee> fission6: I feel your pain
<ee99ee> JanC: do you have an example of a start-stop-daemon implementation?
<JanC> or otherwise you might need some custom pre-stop or post-stop script or so
<fission6> yea sort of odd was hoping upstart was a bit cleaner, looks like supervisord is a bit easier
<JanC> fission6: I find upstart rather easy to understand, it's usually more work to understand how the applications that it needs to start/stop/etc. work  ;)
<fission6> well when it starts before something stops its a bit retarded, especially when its managing the process
<JanC> yes/no
<JanC> it's difficult to tell when an application starts & stops, or to be more exact: when it stopped "enough" that you can (re)start it  ☺
<SpamapS> fission6: IMO, you are coupling things that don't need to be coupled
<SpamapS> fission6: just start celery and the webapp service on their own. They'll talk if they can.
<SpamapS> fission6: this will make it easier to split out the two into dedicated services later on anyway
<JanC> well, unless the webapp really needs celery, then you make it wait on celery...
<JanC> in both cases, you want 2 jobs
<fission6> thats fine, i agree on tht front
<fission6> i don't like how i can not currently restart uwsgi when before i got into upstart i would just stop and start the process via control-c in a screen window
<fission6> can we focus on why restart isn't working, maybe getting that in place will make me happy
<fission6> this is my uwsgi.conf to be clear http://dpaste.org/w1FBw/
<fission6> SpamapS: JanC could this be why http://projects.unbit.it/uwsgi/wiki/Upstart read the --die-on-term section
<SpamapS> fission6: is your uwsgi service printing errors? I bet its trying to LISTEN on a taken port because the OS hasn't given back the socket yet.
<fission6> nah
<fission6> By default sending SIGTERM to uWSGI would mean "brutally-reload-it", while normally apps tend to shutdown on SIGTERM. To shutdown uWSGI use SIGINT or SIGQUIT. If you cannot live with such setup you can use --die-on-term option.
<fission6> this could be it
<JanC> that might indeed be the reason
<SpamapS> wow, what an abuse of SIGTERM
<JanC> ☺
<SpamapS> TERM is for *TERMINATING* not reloading
<JanC> well, depends on how uwsgi works
<fission6> my guess is it's probably the reason since i only see --die-on-term mentioned in the upstart docs http://projects.unbit.it/uwsgi/wiki/Upstart
<SpamapS> a process that doesn't exit very soon after SIGTERM is going to get a SIGKILL soon after from some overzealous BOFH
<JanC> maybe it has a master process that automatically restarts child processes that get terminated or the like
<SpamapS> fission6: yes you want --die-on-term
<SpamapS> and you should send a really nasty email to the uwsgi authors explaining how dumb this is
<SpamapS> we have HUP for reloads... anything further, use USR1
<SpamapS> fission6: you can also use 'kill signal QUIT' I think
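The two options mentioned above, as a sketch. Paths are placeholders, and whether the "kill signal" stanza is available depends on the Upstart version (SpamapS is not certain of it either):

```
# Option 1: make uWSGI exit on SIGTERM, which is what Upstart sends on stop.
exec /path/to/uwsgi --die-on-term --ini /path/to/app.ini

# Option 2 (instead of option 1): have Upstart send SIGQUIT on stop.
# kill signal QUIT
# exec /path/to/uwsgi --ini /path/to/app.ini
```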
<SpamapS> sort of 6 and 1/2 dozen at that point
<SpamapS> ugh, and it also doesn't realize that Upstart has had socket activation since 10.04
<fission6> when you say "we have"
#upstart 2012-11-27
<fission6> what do you mean
<SpamapS> we, the unix loving people of the world :)
<fission6> ok ill email them with my final script (maybe) once i get something i like going
<fission6> let me review whats been mentioned here and get back to you SpamapS for a review if thats ok
<fission6> that cool SpamapS 
<jamescarr> heya
<jamescarr> I have an upstart script that originally used respawn. I fixed it to not use respawn but it always respawns
<jamescarr> even when stopping the service through upstart
<SpamapS> jamescarr: stopping a job would never lead to a respawn.
<SpamapS> jamescarr: respawn only happens if the program exits with a non-normal exit code/signal (which, by default, there is only one normal exit, 0)
<SpamapS> jamescarr: do you have stuff from syslog?
<jamescarr> SpamapS: well, originally I had respawn in my upstart script, then I realized the service it was starting starts as a daemon, causing it to respawn every second
<jamescarr> SpamapS: I fixed the upstart script by removing the respawn and changing the service config, but it keeps going
<jamescarr> no matter how many times I tell it to stop :(
<jamescarr> syslog is just filling up with "Nov 27 10:21:20 ultipro-api init: ultipro-api-service main process ended, respawning" every minute
<SpamapS> jamescarr: if its a daemon, you just use expect fork, or expect daemon, and then it won't respawn like that
<jamescarr> I set that
<jamescarr> it keeps going
<SpamapS> perhaps it forks more than twice?
<SpamapS> jamescarr: and the program has no "--nodaemonize" mode?
<jamescarr> yeah
<jamescarr> should I just reboot!?
<SpamapS> jamescarr: in that case, you might be better off starting the program in pre-start, and recording a pidfile, then stopping it in post-stop
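A rough sketch of that pre-start/post-stop pattern, for a daemon that forks too many times for expect to follow. The daemon path, its --pidfile flag, and the pidfile location are hypothetical; the point is that the job has no main process, so Upstart has no PID to lose track of or respawn.

```conf
# /etc/init/mydaemon.conf: illustrative; names and paths are assumptions
description "daemon that forks repeatedly and writes its own pidfile"

start on runlevel [2345]
stop on runlevel [!2345]

# No exec/script stanza: Upstart tracks only the job's state, not a PID.
pre-start script
    /usr/local/sbin/mydaemon --pidfile /var/run/mydaemon.pid
end script

post-stop script
    if [ -f /var/run/mydaemon.pid ]; then
        # Ignore a failed kill (stale pidfile) so the job still stops cleanly.
        kill "$(cat /var/run/mydaemon.pid)" || true
        rm -f /var/run/mydaemon.pid
    fi
end script
```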
<jamescarr> SpamapS: how can I just outright stop it for now?
<jamescarr> it won't stop coming back
<SpamapS> jamescarr: stop jobname
<SpamapS> jamescarr: that will not respawn.. it changes the intended state to 'stop'
<SpamapS> which exits any respawn code
<jamescarr> SpamapS: I've done this tons of times
<jamescarr> sudo initctl stop jobname
<jamescarr> it just gives me... 
<jamescarr> "ultipro-api-service start/running, process 734 "
<SpamapS> *stop* says that?
<SpamapS> jamescarr: thats very unusual
<jamescarr> I should mention I accidentally copy and pasted the start script into the pre-stop script the first time
<jamescarr> I fixed it, but it keeps doing this 
<SpamapS> jamescarr: its possible that is causing issues...
<SpamapS> jamescarr: especially if that script calls 'start'
<jamescarr> yeah, I meant to replace it with stop but I screwed up
<jamescarr> I since fixed it, but the results seem to be from the first upstart script
<SpamapS> jodh: ^^ another use case for a "please forget about this job" command
<SpamapS> slangasek: ^^
<SpamapS> jamescarr: yeah, the new conf file isn't loaded until the 1st instance stops
<jamescarr> oh man
<jamescarr> anyway to stop the first one?
<SpamapS> jamescarr: this might require a reboot unfortunately
<stgraber> jodh, xnox: signal sender=:1.9 -> dest=(null destination) serial=3 path=/com/ubuntu/Upstart; interface=com.ubuntu.Upstart0_6; member=Restarted
<stgraber> ended up being a tiny bit more tricky than I had expected as the code needs to be added much later than jodh first thought, as we need to wait until upstart is done reconnecting to DBus
<stgraber> as a result, the signal is emitted whenever upstart restarts, not only for successful stateful re-execs, but I think that makes sense as the Instance Inits don't really care whether it was successful or not, they need to restart regardless
<stgraber> now to figure out how to write tests for that stuff :)
<mattiaza> hi all! I have a question about writing/debugging multiple dependencies in an upstart script
<mattiaza> I see that this has been discussed long ago: https://lists.ubuntu.com/archives/upstart-devel/2010-April/001252.html
<mattiaza> but at the moment, my job won't start or stop as expected by following the same pattern
<SpamapS> mattiaza: its important to start with the right frame of mind. Upstart is not dependency based, it is event based.
<SpamapS> mattiaza: what start on and stop on do you have now?
<mattiaza> I understand (but only very vaguely). starts and stops of other services generate events, and then these trigger start/stop on conditions. but how are events stored or cached when combining them?
<SpamapS> they're not stored or cached
<SpamapS> the event blocks until all consumers of the event change state
<SpamapS> mattiaza: its a very "now" thing, so if nothing needs to change, the event is gone.
<mattiaza> aha.
<mattiaza> at the moment, my conditions are:
<mattiaza> start on (started postgresql and started rabbitmq-server)
<mattiaza> stop on (stopping postgresql or stopping rabbitmq-server)
<mattiaza> what I'm trying to achieve is: my service needs postgresql and rabbitmq-server. therefore it should start automatically once both have been brought up, and it should stop when either of them goes away
<mattiaza> it would be nice if it also worked not only at system startup time, but also when I restart postgresql for example - but this is more of a nice-to-have
<SpamapS> mattiaza: ok, first, you're doing it wrong. :)
<SpamapS> mattiaza: those are network services
<SpamapS> mattiaza: so your app just has to deal with them not being up sometimes.
<SpamapS> mattiaza: if they were on another server, you couldn't control the boot order
<SpamapS> mattiaza: you're better off having your service retry and/or degrade until postgres and rabbit are available.
<SpamapS> mattiaza: and then just 'start on runlevel [2345]'
<SpamapS> and 'stop on runlevel [^2345]'
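Put together, the job SpamapS is describing would be little more than the following. The worker command is a placeholder, and the sketch assumes the worker itself retries its PostgreSQL and RabbitMQ connections.

```conf
# /etc/init/celery-worker.conf: illustrative; the exec line is a placeholder
description "celery worker"

start on runlevel [2345]
stop on runlevel [^2345]

respawn
exec /srv/myapp/bin/run-celery-worker
```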
<mattiaza> I'll test it a bit...
<SpamapS> mattiaza: do you understand why though?
<SpamapS> mattiaza: upstart *can* express this. But it takes 2 jobs, one to keep track of the state of pgsql and rabbit, and another to run your service
<mattiaza> if I ever wanted to move postgres or rabbitmq to another server, then sure, it would be useful if it retried connections
<mattiaza> "my service" is django-celery worker, I'm investigating if it handles broken connections or not
<SpamapS> mattiaza: s/if I ever wanted to /when I am forced to /
<mattiaza> tested it with various configuration of the services running or not.
<mattiaza> celery is mostly reasonable: if rabbitmq is down, it retries connections. if postgresql is down, it runs tasks, but their results cannot be written and they are simply discarded forever.
<mattiaza> uwsgi django app itself errors when postgresql is down (obviously), but hangs forever if rabbitmq is down and it can't queue up new tasks.
<mattiaza> so at the moment, by starting celery on network-runlevels, it would technically run - but it would silently eat all tasks if rabbitmq comes up before postgresql
<mattiaza> i'd say that at the moment, linking the start on and stop on conditions to postgresql only makes most sense
<mattiaza> I don't expect to move the server to another machine in the next 6 months, and for now it appears to be the most reliable configuration
<SpamapS> mattiaza: that honestly sounds like you have a very broken queue runner
<SpamapS> mattiaza: you're saying, if pgsql goes down, all of your queued work is lost
<SpamapS> and thats ok?
<mattiaza> appears that way!
<SpamapS> awesome
<mattiaza> not sure if there are ways to make sure the queued jobs get persisted and retried if they fail
<SpamapS> mattiaza: you probably shouldn't be consuming the messages until the work is done.
<mattiaza> (and if they are persisted in pgsql, then it's still down :)
<SpamapS> another option is to put them back in the queue yourself
<SpamapS> try: write to db except: unrecoverable db error; resubmit
<mattiaza> I would expect this to be possible, as many other people are also using rabbitmq+celery as the standard django background task queue
<mattiaza> have to see if there are configuration options I missed
<SpamapS> Its totally possible.
<SpamapS> Just.. not actually helpful. :)
<SpamapS> mattiaza: here's another thought. You could just have your thing start on started postgresql and stop on stopping postgresql
<mattiaza> yep, that seems to be the most reliable method for now, and would not require complex celery hacking. reading documentation on how to add event emitting to postgresql old init scripts now
<mattiaza> my old start on command would never have worked anyway, as apparently neither postgresql nor rabbitmq-server emits any events :)
<JanC> you can make them emit events if you want...
<mattiaza> postgresql old-style startup scripts look daunting.. might give it a try, but don't want to mess them up too much
<mattiaza> I have another idea though
<mattiaza> http://manpages.ubuntu.com/manpages/precise/en/man8/upstart-socket-bridge.8.html seems to be a way to see events on sockets - is there a way to listen for events on "socket on port X started listening" ?
<mattiaza> then I could have an event when postgresql really is ready to accept connections
<SpamapS> mattiaza: socket activation is not what you want in this case
<mattiaza> cool, got it working :)
<mattiaza> added some initctl emit lines to postgresql scripts
<mattiaza> now the celery service is started and stopped together with it
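What mattiaza describes might look something like the sketch below. The event names, the worker command, and the exact place the emit calls go in the PostgreSQL init script are assumptions; the log does not show what was actually written.

```conf
# Added to the PostgreSQL SysV init script (shell), after a successful
# start and before a stop respectively:
#     initctl emit --no-wait postgresql-started
#     initctl emit --no-wait postgresql-stopping

# /etc/init/celery-worker.conf: illustrative; names are assumptions
start on postgresql-started
stop on postgresql-stopping

respawn
exec /srv/myapp/bin/run-celery-worker
```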
<mattiaza> not the ideal solution, but it should reduce the potential for queued tasks being consumed but failing
<SpamapS> mattiaza: seems like something the celery people would have dealt with.. this is a pretty common pattern (read a job, do some work, record the result). Usually in AMQP there is a way to make messages only disappear on "ACK"
<mattiaza> yep, I'll investigate that tomorrow
<mattiaza> I'm still overwhelmed by the complexity of rabbitmq + celery workers + task result storage in database (the recommended production setup), and how all these interact together
<SpamapS> mattiaza: given that you have 1 server.. it does seem.. overkill
<mattiaza> yes - it also only needs to scale to an amazing 5 requests per hour :)
<SpamapS> mattiaza: *l o l*
<SpamapS> totally worth this much time
<mattiaza> heading home now, thanks for your help!
<cheezit> has anyone run upstart 1.5 with 11.10?
<cheezit> nm that didn't work 2 well
#upstart 2012-11-28
<Redoubt> If I have an Upstart job start on event1 or (event2 and event3), I assumed it would run once upon event1, and again upon event2 and event3
<Redoubt> That doesn't seem to be the case
<Redoubt> Can anyone explain why?
<Redoubt> It only seems to run once
<Redoubt> But if I have a job start on event1, and another job start on (event2 and event3), both jobs run
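A small task along these lines is enough to reproduce the question by hand. The file name, event names, and log path are made up for the test; UPSTART_EVENTS is the variable Upstart sets to the events that started the job.

```conf
# /etc/init/or-test.conf: illustrative test job
start on event1 or (event2 and event3)
task

script
    # Record which event(s) caused this run.
    echo "started by: $UPSTART_EVENTS" >> /tmp/or-test.log
end script
```

Emitting the events with `initctl emit event1`, then `initctl emit event2` and `initctl emit event3`, and reading the log shows which combinations actually start the task.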
<stgraber> Redoubt: well, a job only starts once, unless you define it as a task or use upstart's instances
<Redoubt> Oh my apologies-- it is a task
<stgraber> hmm, odd, my understanding of tasks was that they'd re-trigger every time one of the conditions would match... /me tests
<Redoubt> That's what I thought too. I can give you my specific task if you like
<stgraber> http://paste.ubuntu.com/1393270/
<stgraber> that simple example seems to show that upstart behaves as you'd expect...
<Redoubt> Well this is fortunate. I'm working on that bluetooth upstart job (bug 1073669) which you're involved in
<stgraber> can you pastebin what you've got currently?
<Redoubt> Well, here are my test scripts: http://paste.ubuntu.com/1393294/
<Redoubt> Because when I test them like you did (via initctl), they do seem to work as expected
<Redoubt> But when I reboot, I only get one print from kyle-all.conf, and the event that triggers it is only local-filesystems
<Redoubt> All the others print as well, but kyle-all.conf only prints once instead of the three times I expected
<stgraber> hmm, so the problem seems to be that it's expecting the local-filesystems to be re-emitted too which it won't
<stgraber> testing some alternatives here
<Redoubt> Oh interesting, so I have a fundamental misunderstanding, I guess. I sort of thought of events as... states, I guess
<Redoubt> Something that was persistent
<Redoubt> That doesn't actually make sense though, so... :)
<stgraber> upstart basically tracks event state per job to know when the start condition matches, but as soon as it matches it loses that state, or so it looks like anyway :)
<Redoubt> That makes perfect sense
<Redoubt> But interferes with all my plans, darnit
<Redoubt> Hmm... it would be nice if it analyzed start conditions a little more and was smarter with that
<Redoubt> The only way around this that I can think of is three separate scripts. Yuck
<Redoubt> I would probably do four and make one an instance
<stgraber> well, you could keep rfkill-restore as it is and add a new rfkill-restore-interface which would roughly be
<stgraber> start on net-device-added or bluetooth-device-added
<stgraber> task
<stgraber> exec initctl start --no-wait rfkill-restore
<stgraber> the INTERFACE environment variable should then be sent all the way to the rfkill-restore job that can then change behaviour slightly when it's defined
<Redoubt> Well, what about local-filesystems?
<stgraber> that'd still be the start condition of the rfkill-restore.conf job, just not of the -interface one
<Redoubt> Would exec initctl start --no-wait rfkill-restore fail if local-filesystems hadn't happened?
<stgraber> hmm, no, it'd run the code anyway but would exit 0 immediately because of the first if statement in rfkill-restore.conf
<Redoubt> If not, this two-script combo is the same as just changing rfkill-restore starts to local-filesystems or net-device-added or bluetooth-device-added, right?
<stgraber> so that wouldn't be a problem as such a case would mean that the main rfkill-restore call didn't happen yet and that the interface will be rfkilled a bit later
<stgraber> hmm, good point, and I guess doing that change + adding the needed tweaks to the shell script would do the trick
<stgraber> no need for a separate job
<Redoubt> Okay, I like that. I really appreciate your help man, I've made a couple forum posts about this and no one seems to have any clue
<Redoubt> That saving states until start condition met thing... I'm not sure the docs explicitly mention that. Not sure I ever would have learned it :)
<stgraber> hehe, np and thanks for the work on that bug. I've been quite busy with other things lately, one of which being upstart development, so it's good to have you help with this one :)
<Redoubt> Oh my pleasure!
<stgraber> yeah, the state saving is a bit odd, it makes sense when you think of it, but it can surprise you :) I believe Scott (original upstart author) covered that in some blogpost a few years ago, or maybe it was a session at UDS, can't recall... anyway, may be something we should better cover in the cookbook.
<SpamapS> Redoubt: if you want to poke something when network connections change, just use /etc/network/if-up.d
<SpamapS> Redoubt: upstart is a bit low level for what you're intending
<SpamapS> Redoubt: also if you do need to do state tracking, you can use the 'wait-for-state' job added in Ubuntu 11.04
<Redoubt> stgraber: Yeah, if you think of it, it would be handy to mention that in there somewhere.
<Redoubt> SpamapS: Huh, wait-for-state is a new one for me, thank you for the reference!
<stgraber> SpamapS: wait-for-state only works for jobs though, not for events right? (it waits for a job to get into the WAIT_FOR state)
<SpamapS> stgraber: indeed.. the idea though, is that you use the job to track state.. and then you just call wait-for-state with that job when you want to block on a state
<stgraber> SpamapS: Redoubt is helping me with a limitation of my rfkill-restore job I introduced in 12.10 where some devices you can rfkill show up after the rfkill-restore job is done. The only way to reliably cover this case is to trigger the restore job when network or bluetooth devices show up.
<stgraber> SpamapS: initially I thought I added the local-filesystems requirement of rfkill-restore because I actually needed local-filesystems, but I don't really as the job itself is checking for the paths it needs and exiting if they're not there.
<stgraber> SpamapS: so the simple fix for our problem is to change from "start on local-filesystems" to "start on local-filesystems or net-device-added or bluetooth-device-added" which will guarantee the job to be triggered once per boot + once per device
<stgraber> SpamapS: with some of the per-device call potentially being no-ops if the filesystem isn't there yet, but in such case, the rfkill-restore job will be called a bit later anyway and will take care of any leftovers
<SpamapS> stgraber: Right, so it sounds like you just need a single task, with start on net-device-added or bluetooth-device-added , and an instance value that will be unique enough to run them at the right time
<SpamapS> stgraber: that won't actually guarantee that it will be run once per device
<stgraber> SpamapS: as it's a task, we don't even need to use instances
<SpamapS> stgraber: thats not true at all
<SpamapS> stgraber: the only thing task does is guarantee that it will block until it reaches stopped again
<SpamapS> stgraber: it does not serialize events. If the state is already 'start/running' .. the next event in the or is ignored
<Redoubt> SpamapS: So you think instances are a better way to go?
<stgraber> SpamapS: it won't be start/running because the task exits in a few milliseconds
<SpamapS> I think instances is the only way to go if you need to run it once per event
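For reference, an instanced version of the task might start like this. It assumes every triggering event exports INTERFACE, which is an assumption here, and it omits the restore script itself.

```conf
# /etc/init/rfkill-restore.conf: sketch of the instance idea only
start on net-device-added or bluetooth-device-added
task

# One instance per device: an event for a second device starts its own
# instance instead of being ignored while another instance is running.
instance $INTERFACE

# script ... end script (restore logic omitted)
```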
<SpamapS> stgraber: >< cross your fingers? ;)
<SpamapS> thats actually again where wait-for-state works, because you can wait for stopped on a task, but start it. That way all of the waiters will block until it has been run once
<stgraber> SpamapS: well, the way the script is designed, we don't need the value of INTERFACE, whenever it triggers, it'll restore the rfkill value of any interface it hasn't restored yet. So if we have multiple devices showing up at the same time and we end up only triggering the job once, that's fine as it'll cover all of them anyway
<SpamapS> start wait-for-state WAITER=$UPSTART_JOB WAIT_STATE=stopped GOAL_STATE=start WAIT_FOR=rfkill-restore
<SpamapS> assuming rfkill-restore is a task
<stgraber> it is
<SpamapS> stgraber: sounds like a very tight race..
<SpamapS> [dev1-added][rfkill-restore starts and handles dev1][dev2-added][upstart acks that rfkill-restore is done and returns to stop/waiting]
<SpamapS> seems like there's a very tiny window for dev2 to not be handled
<Redoubt> An instance would then require a helper script, correct?
<Redoubt> helper job, rather
<Redoubt> Two jobs: One instance, one job to run the instance
<SpamapS> Well, the helper is wait-for-state
<Redoubt> So in order to use wait-for-state the job must be an instance?
<SpamapS> but yes, there will be two jobs, one which lists the events, and runs the wait-for-state command for each instance of itself, and then the actual rfkill-restore
<SpamapS> Redoubt: no, its just in this case, it makes more sense that way as it closes the race, even if the race is tiny
<Redoubt> If you have a job that lists the events and runs wait-for-state, doesn't that provide a window for the race to occur as well?
<stgraber> SpamapS: I'd actually have to check, but I vaguely remember the udev events being blocking on the udev side, which is that until the initctl done by udev returns, no other uevent can be emitted
<stgraber> hmm, upstart-events marks them as signal, so I guess not
<SpamapS> http://paste.ubuntu.com/1393368/
<stgraber> ah no, I'm being confused with some other weird udev rules we have somewhere else that doesn't use the bridge
<stgraber> WHATEVER_DEVICE_ADDED_RELIABLY_EXPORTS would be INTERFACE
<SpamapS> stgraber: so udev will queue up the events until the waited for event is handled?
<SpamapS> because that means you can just handle with or's
<stgraber> SpamapS: nope, as I said, I got confused with some other hacks I had to look at recently where something indirectly does an "initctl emit" from a udev hook, which in this case indeed blocks everything on udev's side. But that's not the case for those event emitted by upstart-udev-bridge.
<stgraber> SpamapS: so your solution is fine, as much as I dislike introducing an extra job to cover that.
<SpamapS> stgraber: its a bug in upstart not to have this built in
<SpamapS> stgraber: In fact we talked about just changing or's to work this way, but we'd have to call that upstart 2.0 or something.. because it would likely break some things
<Redoubt> SpamapS: That would be handy!
<SpamapS> Yeah, it would
<stgraber> yeah, or just add another keyword "or2" or whatever, not sure what'd be the most confusing for the users :)
<SpamapS> well, its actually not or that is broken, per se
<SpamapS> its blocking in general
<SpamapS> it just doesn't get done if the goal does not change
<SpamapS> what should happen is, on a blocking event emission, we should look for any start on's or stop ons that match, and then flip only the ones that need changing, but block if there are other events already being waited on for any that don't change
<SpamapS> Of course, another one would be to finally do the "state rewrite" that keybuk intended, which would introduce a 'while' keyword and allow you to do 'while bluetooth-device is running'
<stgraber> yay, my prctl branch builds and appears to work!
<stgraber> stgraber 25187  0.0  0.0  41988  1928 pts/4    S+   10:22   0:00  |   \_ /home/stgraber/data/code/upstart/init --session --confdir=~/.init/
<stgraber> stgraber 25201  0.0  0.0   4328   360 ?        S    10:24   0:00  |       \_ /bin/sleep 2000
<stgraber> neat!
<stgraber> and as init is now the parent, I can even stop that job (in the past it'd just hang as it wouldn't receive the SIGCHLD)
<SpamapS> stgraber: *nice*
#upstart 2012-11-29
<xnox> jodh: I am at a loss how come that test passes on a real system but not in the vm as started by run-adt-test.
<jodh> xnox: alas writing the code is easy in comparison to diagnosing test failures :)
<jodh> xnox: we need to identify the test that creates the file in question and see why it isn't being removed as expected.
<jodh> xnox: you could always compare the autopkgtest env with non-VM env with... procenv :-)
<xnox> the test is correct.
<xnox> let me add a print statement to see if it takes the assumed code path in the vm.
<jodh> xnox: actually - remember to take care with inserting any debug code since you're in a TEST_ALLOC_FAIL() block and nih_debug et al allocate memory (http://upstart.ubuntu.com/cookbook/#gotchas). You may need to use a basic write(2).
<xnox> thanks =)
<jodh> xnox: just a sec - that test looks wrong - there is no call to waitpid() in the non-failure scenario.
<jodh> xnox: we should wait for the pid, then delete the log file in dirname/<job>.log if job_process_spawn() did not fail.
<xnox> jodh: I thought that the point of the test is that allocation_fails to succeed the test.
<xnox> unless I got it utterly wrong.
<jodh> xnox: http://upstart.ubuntu.com/cookbook/#test-alloc-fail is a bit of a 'mind bender', but the loop _will_ succeed once. And that is the problem iteration - there is a race.
<jodh> xnox: so, I suspect if you waitpid() when test_alloc_failed is not set, you'll see the test fail on all systems since the rmdir will fail as by that time, the job has finished and written the log file in the directory the test is attempting to delete.
<jodh> xnox: so, the correct fix I think is: waitpid(), then unlink the log file for the non-failure iteration.
<xnox> jodh: ok. I understand your solution. Can you explain what the test is testing for in this case then?
<jodh> xnox: you'll need to construct the name of the logfile using nih_local and nih_sprintf(). See examples in the code relating to the 'expected' variable.
<jodh> xnox: well it's supposed to be testing that a simple job which produces output can be handled in all error scenarios. But it really could do with a bit of improvement.
<jodh> xnox: namely, we should assert that on success, that the log file is written and the contents are correct.
<xnox> jodh: ok. Thank you for the information, I will try to poke it.
<jodh> xnox: so you could add a call to CHECK_FILE_EQ().
<jodh> xnox: and feel free to add some more comments if you want ;-)
<jodh> xnox: thanks
#upstart 2012-11-30
 * xnox cannot see upstart source code side by side on my laptop & I'm away from my dual screens =(
<xnox> 80 char limit is _good_ sometimes
 * xnox lowers the font size & squints ;-)
<dzhus> good day, can I run an upstart init script if it's symlinked to another location?
<dzhus> like ~/.init/foo.conf -> ~/system-scripts/foo.conf
<dzhus> oh, it's bug #665022
<xnox> jodh: it now fails \o/
<jodh> dzhus: that behaviour is by design, as shown on the bug.
<jodh> xnox: great! so we can now just check the log and then unlink it.
<xnox> bug 665022
 * xnox ponders where is the bug bot.
<xnox> jodh: ;-)
<dzhus> Well, I was wrong. What 665022 (and the code) says is that upstart does not plug in new services when new symlinks are created in init directories. If the symlink is already there, it works fine.
<xnox> ...simple test
<xnox> /bin/bash: line 5: 10384 Segmentation fault      (core dumped) ${dir}$tst
 * xnox ponders how to get a more useful fail....
<xnox> it's ugly, but passes on the machine. Let me try in the autopkgtest now.
<jodh> xnox: 'gdb --batch -ex r init/test_job_process' followed by 'bt' when it fails :)
<xnox> jodh: =))))
<xnox> thanks.
<xnox> jodh: pushed merge-proposal to fix the above test. It's not the most optimal way but it works.
<jodh> xnox: thanks!
<xnox> jodh: all autopkgtest pass together with stgrabers fix & my branch fixing autopkgtests ;-)
<Fecn> Hi Folks - For reasons I won't go into, I'm trying to get upstart working in an initial ramdisk... it mostly sort-of works, but whenever I try to start services using initctl it gives me a variety of different 'Syntax Error: Bad fd number' messages in the debug output.
<Fecn> Is there any way to get more debug output so I can see what is actually causing the bad fd messages?
<xnox> we have a spec for eventdriven initramfs (that is upstart in the initramfs)
<xnox> but it's not fully tested / ready yet.
<xnox> search for upstart eventdriver initramfs if you are interested.
<xnox> Can't help with fd number error, Fecn.
<xnox> Hopefully somebody else will help you.
<Fecn> xnox: Thanks - I'll have a search for the eventdriver stuff
<SpamapS> man, this systemd/udev/w'ever thread on debian-devel just won't die
<jodh> Fecn: try booting with 'console=ttyS0 --debug debug --no-log' and seeing what you get on the serial console. If the error is coming from initctl, you could add strace to the initramfs and use that.
<Fecn> jodh: Serial console could be a challenge.. I'm not in the same physical location as the server... However, /dev/console is accessible to me OK (as is an ssh session if I fire up dhclient and sshd before I exec init)
<Fecn> I figure there will be some vital component missing from my initramfs somewhere along the way
<Fecn> fwiw - the initramfs I'm working on is for a node-imaging system (formats hdd, runs rsync then does a kexec)
<SpamapS> wait, so you're developing a custom initramfs .. remotely?
<Fecn> Yep.. remotely via a KVM console thing
 * SpamapS hands Fecn the "Bravest Craziest Mofo" award he got for upgrading ssh on a slackware box remotely back in 2002
<Fecn> It's not really that crazy - I have power control, screen, keyboard etc... I can reboot it when it goes wrong.. I just can't get to the serial port
<Fecn> sadly I have to do it this way as the only place where the infiniband adapters are located is in the datacentre machines
<SpamapS> I've been through stuff like that before too
<SpamapS> "Developer makes $85k/year, we can't possibly afford $2000 worth of extra equipment for their test setup."
<Fecn> I did have it all working beautifully with a debian-based initrd... but we have our corporate RHEL standard.. so I'm having to rebuild it all using RHEL6 as the base
<Fecn> For that matter, my RHEL based one is nearly working... apart from the kexec at the end, which hangs the machine if I do a forced reboot... hence trying to get enough of upstart to work so that it can actually do a 'shutdown -r' correctly
#upstart 2012-12-01
<JanC> hm, why is grub-common not an upstart task instead of a sysvinit script?
#upstart 2013-11-25
<Xeta> Hi! Is it possible to make upstart-job start an application in the foreground? I want my process to be run on startup and be the first thing the user sees.
<xnox> upstart view of "foreground" is quite different =) can you describe your application? is it graphical or terminal one?
<Xeta> It is a fullscreen video + audio player
<Xeta> But it might take some keyboard input as well
<xnox> Xeta: if you are using saucy or later. I suggest for a default (non-root capable) user account to auto-login. And then you can auto-launch a graphical application on user-session login.
<xnox> Xeta: the locations you can put the user-session upstart job is at ~/.config/upstart/, or more global locations (admin path /etc/xdg/upstart/, packaging path /usr/share/upstart/sessions/)
<xnox> see http://upstart.ubuntu.com/cookbook/ for further details and manpages, man init and related manpages from there.
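A user-session job of the kind xnox describes could be as small as the sketch below. The player path and its option are placeholders, and desktop-start / desktop-end are assumed to be the session events available on the installed release; check upstart-events(7) before relying on them.

```conf
# ~/.config/upstart/player.conf: illustrative; the command is a placeholder
description "fullscreen media player"

start on desktop-start
stop on desktop-end

respawn
exec /usr/local/bin/myplayer --fullscreen
```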
<Xeta> Ok cool, thanks xnox ill check it out
<Xeta> Or hm, I would actually like the app to be started as root and not require the user to login.
<Xeta> We are making kinda like a media box (with the rpi) where the user shouldn't have to login and our app should be the only thing running in the foreground
<Xeta> How could i achieve this? :)
<Xeta> In my current config I'm more or less just doing "exec /myapp". And it is kinda working, the video output from my app is shown after startup, but so is the normal prompt to login as a user.
<Xeta> So hm, maybe my question is: How do I hide the login-prompt after boot so that it just shows the video output from my application?
<xnox> Xeta: you don't want your app to run as root, create a user account to run your app, password-less and set lightdm to autologin.
<xnox> Xeta: and auto-start your app from there.
<xnox> Xeta: or look at e.g. ubiquity, it starts on starting lightdm, and if one "quits" ubiquity one is taken to lightdm (normal or auto login)
<xnox> but it takes care to drop privileges.
<SegFaultAX> Is it possible to make upstart play nice with processes that change pids when reloading themselves?
<SegFaultAX> It's kinda sorta but not quite like a restart.
#upstart 2013-11-26
<xnox> SegFaultAX: can you file a bug against upstart with example process that does that?
<adrien_oww> is there a difference between writing "start on started B" in event.d/A and writing "start on starting A" in event.d/B?
<xnox> adrien_oww: yes. if jobA.conf has "start on started B" then jobA will be automatically attempted to be started when jobB is fully started & running. and it doesn't matter how jobB has started (manually or automatically)
<xnox> adrien_oww: if jobB.conf has "start on starting A", then just before jobA is attempted to be started it will be blocked until jobB fully starts.
<xnox> in practice, you'd need additional stanzas in either of them to be automatically started, unless you intend to only start them manually.
<xnox> this does not convey "state", since e.g. in the "started B" case I can manually start job A without having started job B.
<xnox> in "start on starting A" case I can manually start job B and job A will not start.
<xnox> adrien_oww: if B must be running, before A's main process is started, make sure in A pre-start script a check is made that B's state is correct. e.g. status B | grep -q start/running
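In job-file form, that check could look like the following sketch. It assumes a newer Upstart (jobs in /etc/init and the "start/running" status format), not the 0.3.3 layout adrien_oww is working with, and the daemon path is a placeholder.

```conf
# /etc/init/A.conf: illustrative
start on started B

pre-start script
    # Abort the start cleanly unless B is already running.
    status B | grep -q "start/running" || { stop; exit 0; }
end script

exec /usr/local/sbin/a-daemon
```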
<adrien_oww> right, I see, thanks
<adrien_oww> these scripts should only be started at boot so it won't matter much iiuc
<adrien_oww> this is a fairly old codebase (well, up to 3-4 years) with an upstart 0.3.3 and I'm trying to start a process earlier on without changing too many things but I must have a line wrong
<xnox> adrien_oww: i see. well, 0.3.3 sounds rather very old. I'd recommend upgrading if that is at all possible =) in recent upstarts there is an initctl2dot which can parse jobs and draw a diagram of all jobs and which ones are started when.
<xnox> start with your startup event (the default one is "startup") and trace it to your jobs, and see what things are blocking what.
<adrien_oww> unfortunately, moving away from 0.3.3 is not possible; I know there are some kludges hidden and some of them will break with a newer upstart
<xnox> understand.
<adrien_oww> I have a fairly good idea of what gets to run after what but that's mostly because there is little parallelization ='( 
<xnox> adrien_oww: one should be able to increase debugging. "--debug" should work, and then collect all logs, which should list all the jobs in order, it should then be easier to analyse where the bottlenecks are.
<adrien_oww> I've checked my commits and I don't understand why the service doesn't seem to get started; that said it might be because I'm looking at syslog (started with --verbose on the kernel command-line) and rsyslogd isn't started yet
<xnox> check console & dmesg as well.
<xnox>  / kmsg
<adrien_oww> I'm tempted to be a bit dirty and start on started rsyslogd :D 
<adrien_oww> it's almost as soon as I want it too
<adrien_oww> otherwise, in dmesg, there isn't much infos
<adrien_oww> hmm, had to stop a bit and now getting back on this issue
<adrien_oww> I have a script X which "start on started Y"; Y is "start on started rsyslogd" and correctly starts
<adrien_oww> but X never starts; Y is an abstract job
<adrien_oww> if I check the status, I get "Y (start) starting" and "X (stop) waiting"
<adrien_oww> and if I "initctl emit started Y", X is triggered
<adrien_oww> the only issue I can think of is that Y is an abstract job but I can't find a confirmation in the doc; however that might only be an issue with older upstart version
<adrien_oww> s
<adrien_oww> arf, actually Y has a "script" section
<xnox> adrien_oww: can you paste full script? if it e.g. has task stanza, started is only emitted once script is completed & exited.
<adrien_oww> I think I need to clean things up a bit, both on disk and in my mind
<adrien_oww> but if a job is stuck in "starting" then it means it's waiting on something that is "start on starting $it", right?
<adrien_oww> ah and upstart 0.3.11, not 0.3.3
<xnox> adrien_oww: "A (start) starting", means that "starting A" event was emitted; and "started A" was not.
<xnox> adrien_oww: checkout http://upstart.ubuntu.com/cookbook/ it lists all states (reported by status) and which events are emitted when, and which transitions happen when.
<xnox> it is a useful reference when things are entangled.
<xnox> (cookbook did not exist back in the day....)
<adrien_oww> I've checked the cookbook and I can't see what could make it never move from "starting" to "started"; the only thing I can think of is that there is one job with "start on starting the-process-that-stays-in-starting-state" and which is not completing
<adrien_oww> however the only one like that is redis-server and it does nothing but call the script in /etc/init.d/, the redis-server is working and its state is "redis-server (start) running"
#upstart 2013-11-27
<Unit193> Sooo, is there a way to unwedge upstart then?  As in, when it loses track of a job and still thinks it's running, but you can't start or stop it?  (Causing upgrades to not work)
<adrien_oww> xnox: well, for my issue, I've reverted some changes back without effect; I don't understand what is wrong but it's most probably a recent change on my end
<adrien_oww> I have to change topic a bit but I'll get back to this when I can
<magmatt> I've got an upstart script that when stopped doesn't stop all the child processes: http://paste.ubuntu.com/6484975/  This shows the problem: http://bpaste.net/show/yS98t42s9ki5umrFFDoG/  Why isn't that last processes killed when its parent is killed?
<magmatt> you can see it lingering in line 16 of the second paste
<magmatt> afaik there's no forking going on; nothing daemonizes
<xnox> magmatt: this is very odd job.
<magmatt> it looks like the su might be doing some kind of daemonizing
<magmatt> xnox: yeah, it kind of is :(
<xnox> magmatt: setuid shampain; setgid shampain; exec /home/shampain/start-shampain.sh
<xnox> magmatt: instead of using su, cause stopping will be killing su.....
<xnox> magmatt: with setuid/setgid stanzas you will not need su.
<xnox> magmatt: you might be required to export/set $HOME
<magmatt> xnox: okay, let me try that
<xnox> magmatt: ideally you would not have wrapper script, but write it out directly in script stanza.... but not sure how to deal with decrypt thing.
<magmatt> which brings me to another question: sometimes when I can the upstart conf script, I can't start/stop/restart because it complains of "stop: Unknown job: shampain"
<magmatt> such is the case right now
<magmatt> s/can/change/
<xnox> you must have made a mistake, and it rejected the conf file.
<xnox> init-checkconf /path/to/your/job.conf
<xnox> should say what it complains about.
<magmatt> xnox: I don't have init-checkconf, but the setuid and setgid stanzas are the problem (commenting them out make it work again).  I'm using upstart 0.6.5
<magmatt> perhaps that's why
<xnox> ah.
<xnox> that's rather old.
<xnox> magmatt: in that case, instead of using su -c, use start-stop-daemon and pass command/user args to it. then stop will send signals to start-stop-daemon, which will in turn relay them.
<xnox> it accepts --user option, etc.
<magmatt> xnox: great!  Let me give that a try.  (And I'll see if ops is willing to upgrade upstart)
<magmatt> bah, this box doesn't have start-stop-daemon either :(
<magmatt> xnox: thank you for your help.  You've given me plenty to go on -- I'll need to see what we can do about installing things on this machine
#upstart 2013-11-28
<wrale1> hi.. i'm looking to use an upstart job (task?) to run a program (ansible-pull) periodically, in an end-to-end fashion.  that is, i want ansible-pull to run all the way through before the next run begins.  also, i'd like a sleep period to expire after ansible-pull runs.. i think i understand most of what i need to know.. however, i'm unsure of how best to do the loop.. it seems respawn only works for exit non-zero.. any ideas?
<quizme> I'm writing a custom task.  In the upstart cookbook, it says that session jobs will be looked for in /usr/share/ustart/sessions, but I put a job there and it's not in the list of upstart jobs as advertised in the upstart cookbook.  but if i put my job in /etc/init, it's there.  :~(
<xnox> quizme: /usr/share/upstart/session/* jobs are only in effect in graphical user sessions.
<xnox> quizme: use e.g. $ initctl --system list
<xnox> or $ initctl --user list
<xnox> to tell them apart.
<quizme> xnox oh
<xnox> quizme: similarly $ initctl --system start | stop | restart, etc.
<quizme> xnox is there a way to enable them on the server ?
<xnox> quizme: not really no, but there is a hack/workaround in the cookbook as how to do it. (basically a system job that auto-launches per-user session for the users you want it for)
<xnox> quizme: it's a bit tricky, because on terminal login, you need to query the UPSTART_SESSION var & set it for your self & also get some other vars.
<xnox> but it should work =)
<quizme> basically, I'm writing an application-specific task that i want to run once per hour.  for some reason I was thinking to use upstart instead of cron, is that wrong-headed ?
<quizme> well i was planning to make an upstart task, then call the upstart task from cron.  Then an upstart even would fire when it was finished and run another job.
<quizme> "upstart event would fire" i mean
<quizme> xnox: what do u think?
<quizme> xnox should i just use bash+cron minus upstart ?
<xnox> quizme: there was a plan for temporal/periodic events.
<xnox> quizme: one thing you need to keep in mind is that - you can't start a running job again.
<xnox> quizme: so your cron job can be: stop myjob; sleep 10; start myjob. and cron it to run once a day or some such.
<xnox> quizme: cron is reliable way to do stuff.
<xnox> quizme: you can instead emit events if you want via cron. E.g. cron every midnight to do $ initctl emit FOX_SAYS=midnight
<xnox> or simply $ initctl emit midnight
<xnox> and then your jobs can do "start on midnight" or "start on FOX_SAYS=midnight" or "start on FOX_SAYS"
<quizme> oh yeah that's interesting
<quizme> then when upstart takes over cron, my script will already work 
<xnox> quizme: i don't think upstart will take over cron =) instead it will provide interface akin to "start on every 5 minutes" or "start on 10 minutes after runlevel=2"
<quizme> xnox oh, ok, i thought it was going to take over cron, and then the world.
<xnox> quizme: no, upstart is event based init system. And it does events and it does init.
<xnox> quizme: you may be thinking of systemd software collection, that project has an index page of daemons/commands that it does and plans to add even more http://www.freedesktop.org/software/systemd/man/
<xnox> This index contains 296 entries, referring to 144 individual manual pages.
#upstart 2013-11-29
<quizme> It's obvious they are trying to take over the world.
#upstart 2013-11-30
<wralej1> can a task restart itself ?  that is, if i exec something which exits zero, can i use upstart to completely run it, over and over, without respawning on exit non-zero?
<xnox> wralej1: see "respawn" stanza in $ man 5 init, or cookbook
<wralej1> xnox:  i read the respawn section again.  the first time around, i got the impression that the respawn stanza causes a task restart only on a non-zero exit.. now, i'm not so sure.. the cookbook doesn't say explicitly.. it says: "Likewise, for tasks, (see below), respawning means that you want that task to be retried until it exits with zero (0) as its exit code."
<xnox> wralej1: "task" is also a stanza and has special meaning ;-)
<xnox> wralej1: maybe you want to define which "exit" codes you consider normal, and which abnormal.
<xnox> "normal exit"
<wralej1> xnox: i see.. i'm going to look at the source code.. i gather respawn must not make the non-zero assumption for tasks, at least
<wralej1> yes, that would be my preference.. defining
<xnox> wralej1: man page is more clear "exit 0 || stop" is to finish, and not respawn. So you could guard exits, and call "stop" when/where needed as well.
<wralej1> xnox: awesome.. i'll look at the man page.. thanks for the help
<xnox> wralej1: no problem =)
#upstart 2013-12-01
<igors> hello. I'm having problems trying to create a upstart conf (http://dpaste.com/1488863/), the file is /etc/init/nsw.conf. '$ nsw start' hangs. But if I execute the "exec" manually it works
#upstart 2014-11-25
<cetex> a quick question.. i need to run something after another task has finished completely, the task is "udevtrigger" and it has a "post-stop exec udevadm settle"
<cetex> currently i'm doing "start on stopped udevtrigger" in another  but it seems it's run too early, how do i make sure it's not run until the post-stop has completed?
<cetex> *in another upstart-script*
<cetex> Or, if anyone have an idea how i can trigger on when udev has settled i'd be happy..
<afournier> hi
<afournier> can i know in pre-script if a job is started by the user with the start command or with the "start on" stanza ?
<afournier> yay UPSTART_EVENTS !
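A minimal sketch of how a pre-start script might use that variable. It assumes UPSTART_EVENTS holds the event name(s) that started the job and is empty or unset when the job is started by hand with the start command; that assumption should be checked against init(5) for the upstart version in use. The function name `start_reason` is hypothetical:

```shell
# Hypothetical helper: report whether the job was started by an event or by hand.
# Assumption: UPSTART_EVENTS is empty or unset for a manual "start".
start_reason() {
    if [ -n "${UPSTART_EVENTS:-}" ]; then
        echo "event: $UPSTART_EVENTS"
    else
        echo "manual"
    fi
}

start_reason
```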
<cetex> hm, another issue.
<cetex> how do i (in a script) do something like: "mkdir -p /something/somethingelse" and not have it fail?, i've tried "mkdir .. .. || true", but it doesn't seem to work.
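For reference, a small sketch of the idiom cetex mentions. It mimics the `sh -e` that upstart script sections run under, where any unguarded failing command aborts the script. The helper name `ensure_dir` and the temp paths are hypothetical stand-ins. If this form still fails inside a real job, the cause is probably a different line or the environment at that point in boot, which is a guess and not something established in the conversation:

```shell
set -e   # mimic the -e that upstart script sections run under

# Hypothetical helper: create a directory tree, never failing the script.
# -p already succeeds when the directory exists; "|| true" covers real errors.
ensure_dir() {
    mkdir -p "$1" || true
}

base=$(mktemp -d)

ensure_dir "$base/something/somethingelse"   # creates parents as needed
ensure_dir "$base/something/somethingelse"   # already exists, still fine

# A genuine failure: a path component is a regular file, so mkdir errors out,
# but the "|| true" keeps set -e from aborting.
touch "$base/file"
ensure_dir "$base/file/sub"

echo "still running"
```

Note that `|| true` hides the failure from the job as well, so anything later in the script that needs the directory should check for it.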
#upstart 2014-11-28
<rexxor> hey guys, got a small problem with an autossh upstart job
<rexxor> i have to actually log into the server (on boot) for the job to start, is this normal?
<weallneeda3minut> where do I find the upstart logs on Amazon Linux? my uwsgi script doesn't seem to boot on start
#upstart 2015-11-23
<m1dnight_> Hello. I am wondering how I can set variables for an upstart script. Do they have to be in the .conf file?
<m1dnight_> (Something like /etc/default)
<JanC> m1dnight_: you can source files in /etc/default/ from script sections inside the .conf
<m1dnight_> Ah, I will have look into that. Thanks JanC 
<JanC> if you have an Ubuntu system/VM, then apport.conf uses this
<JanC> and maybe others too
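A minimal sketch of what JanC describes: a script section sourcing an optional defaults file. The variable `MYJOB_OPTS`, the helper `load_defaults`, and the file name are hypothetical, and a temp file stands in for something like /etc/default/myjob so the snippet can be run anywhere:

```shell
# Stand-in for a file such as /etc/default/myjob (hypothetical path).
defaults=$(mktemp)
echo 'MYJOB_OPTS="--verbose"' > "$defaults"

# Hypothetical helper: set a built-in default, then let the file override it
# if it exists and is readable.
load_defaults() {
    MYJOB_OPTS=""
    if [ -r "$1" ]; then
        . "$1"
    fi
}

load_defaults "$defaults"
echo "options: $MYJOB_OPTS"
```

In a real job the same lines would sit inside the script section of the .conf file, before the final exec of the daemon.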
#upstart 2016-11-29
<AlecTaylor> hi
<AlecTaylor> Do I need two upstart scripts, one for the server and one running a command whenever the server produces output? - Or can this be done in one upstart?
<AlecTaylor> Ohhhh looks like I can do it with 'tasks'
 * hallyn wonders whether anyone here would be interested in maintaining a fork of libnih.
<hallyn> (i'd offer a patch to switch from select to epoll, and maybe an upstart patch to use systemd socket activation;  but no sense doing that if libnih won't be maintained)
#upstart 2017-11-30
<dosjota> hi 
<dosjota> an example to run a nodejs application as a daemon
