#juju-dev 2012-07-16
<davecheney> ubuntu@ip-10-4-114-16:~$ pgrep jujud -lf
<davecheney> 5074 /var/lib/juju/tools/juju-0.0.0-precise-amd64/jujud provisioning --zookeeper-servers localhost:2181 --log-file /var/log/juju/provision-agent.log --debug
<davecheney> 5077 /var/lib/juju/tools/juju-0.0.0-precise-amd64/jujud machine --zookeeper-servers localhost:2181 --machine-id 0 --log-file /var/log/juju/machine-agent.log --debug
<rogpeppe> davecheney: mornin'
<davecheney> rogpeppe: how's tricks?
<rogpeppe> davecheney: good thanks. finally got around to painting the kitchen last weekend... two years after we did the rest of it!
<rogpeppe> davecheney: still needs another coat, but getting there
<davecheney> rogpeppe: yup, you'd think as an adult painting would be simple
<davecheney> turns out it's really bloody hard
<rogpeppe> davecheney: i've had an idea about the watchers which i'd like to run past you
<davecheney> btw, I think ec2/config_test.go might be broken
<davecheney> as in, doesn't test anything
<rogpeppe> davecheney: hmm, how so?
<davecheney> trying to write a _failing_ test case for https://bugs.launchpad.net/juju-core/+bug/1025128
<davecheney>         {
<davecheney>                 "admin-secret: 81a1e7429e6847c4941fda7591246594\n" + baseConfig,
<davecheney>                 func(cfg *providerConfig) {},
<davecheney> this should fail right ?
<davecheney>                 "",
<davecheney>         },
<davecheney> rogpeppe: did you read my mid-afternoon status, we have a machine agent running on machine/0 now
 * rogpeppe has a look
<rogpeppe> davecheney: oh, no i didn't - i've got a branch almost done which runs the MA. that's cool though, it's not much code.
<davecheney> rogpeppe: dagnabbit, we're all over each other's toes
<rogpeppe> davecheney: i know, it's as things start to close in
<davecheney> i'm kinda hanging in the wind until deploy secrets is committed
<rogpeppe> davecheney: that'll be soon, i hope, if one or other of the config branches gets in
 * davecheney nods
<rogpeppe> davecheney: did you have a look at this BTW? https://codereview.appspot.com/6343107/
<davecheney> rogpeppe: i had a brief look, SGTM
<davecheney> but i'm really only interested in getting secrets deployed
<rogpeppe> davecheney: it's part and parcel
<davecheney> config has had _so_ many polishes, it must be ground down to a nub by now
<rogpeppe> davecheney: it's surprisingly subtle
<davecheney> rogpeppe: that suggests that it's serving many masters
<rogpeppe> davecheney: i found that when trying to hack on william's code
<davecheney> subtle the way state.Info is subtle
<rogpeppe> davecheney: it was trying to be two things at once
<rogpeppe> davecheney: i don't think state.Info is subtle
<rogpeppe> davecheney: there was lots of code doing not very much
<rogpeppe> davecheney: you're right, config tests are fucked
<davecheney> rogpeppe: should I raise a bug?
<davecheney> i fixed 1025128, but it's hard to test when the test harness is rooted
<rogpeppe> davecheney: i just changed one of the existing error messages and the test still passed...
<davecheney> rogpeppe: indeed o_O
<rogpeppe> davecheney: go test -gocheck.f TestConfig
<rogpeppe> davecheney: runs no tests
<rogpeppe> oops
<davecheney> bwahaha
<davecheney> how did that happen ?
<rogpeppe> davecheney: ha!
<rogpeppe> davecheney: i see
<rogpeppe> davecheney: someone at some point (maybe me!) changed (configSuite) to (s *configSuite)
<davecheney> bzr blame ?
<rogpeppe> davecheney: i dare not
<davecheney> care to unfuck ?
<rogpeppe> davecheney: gimme 30s
<rogpeppe> davecheney: ha! not so fast - loads of tests fail...
<rogpeppe> shit
<davecheney> rogpeppe: maybe log a bug
<davecheney> i'll do it
<rogpeppe> davecheney: looks like william might've done it...
<rogpeppe> fwereade_: ahem
<rogpeppe> so easy to do
<davecheney> rogpeppe: should there be two bugs then ? one against gocheck to avoid this kind of thing ?
<rogpeppe> davecheney: arguably. it should probably give an error if it finds tests on the value type when it's been given a pointer
<davecheney> rogpeppe: i think this should be raised as a bug
<rogpeppe> davecheney: yeah
<davecheney> rogpeppe: i'll do it
<rogpeppe> davecheney: can i run my watcher idea past you?
<davecheney> rogpeppe: https://bugs.launchpad.net/juju-core/+bug/1025138
<davecheney> ^ can you put some debug in this issue
<davecheney> then i'll log one against gocheck to make sure we can't screw ourselves again
<davecheney> rogpeppe: sure, tell me about your watcher
<rogpeppe> davecheney: i'm concerned by how hard it is to write multiplexers. this was some code i suggested to frank for the firewall code: http://paste.ubuntu.com/1089462/
<rogpeppe> davecheney: the code inside loop() there should really only be three lines of code.
<rogpeppe> davecheney: and i think the problem is to do with the way we stop watchers
<davecheney> rogpeppe: one idea i had was to pass into the watcher a channel to receive changes
<rogpeppe> davecheney: each watcher has its own Stop function.
<davecheney> this is unusual, but might make a mux style watcher simpler
<rogpeppe> davecheney: i think a nicer alternative is to pass a stop channel into the watcher
<rogpeppe> davecheney: then the watcher closes its channel when done
<davecheney> rogpeppe: but you can't select on a closed channel, right ?
<rogpeppe> davecheney: that way you've got a *single* stop channel for all watchers you're muxing
<rogpeppe> davecheney: 'course you can
<rogpeppe> davecheney: (select on read)
 * davecheney remains unsure
<rogpeppe> davecheney: and only a single channel to read from each watcher
<rogpeppe> davecheney: i *think* it could simplify things considerably
 * davecheney goes to play with play.g.o for a second
<davecheney> rogpeppe: indeed, this will work, http://play.golang.org/p/G15bEHpZkP
<rogpeppe>  davecheney: absolutely. it's the way closed chans are meant to be used
<davecheney> nice
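The play.golang.org paste above has since expired; a minimal reconstruction of the property under discussion, receiving from a closed channel inside a select:

```go
package main

import "fmt"

func main() {
	stop := make(chan struct{})
	close(stop)

	// A receive on a closed channel never blocks: it yields the zero
	// value immediately, so this case is always ready (level-triggered)
	// and is chosen over default.
	select {
	case <-stop:
		fmt.Println("stop fired")
	default:
		fmt.Println("would block")
	}

	// The comma-ok form distinguishes a sent value from closure, and
	// every reader unblocks at once, which is why close works as a
	// broadcast to any number of goroutines.
	_, ok := <-stop
	fmt.Println(ok) // false
}
```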
<rogpeppe> davecheney: having a single stop channel means that a muxer can "delegate" its stop request to those things it's watching
<davecheney> rogpeppe: if you pass it in as <- chan, then you know you are the only one who can close it
<rogpeppe> davecheney: then every watcher would provide a Wait (sp?) method which would read messages from the watcher chan until EOF, then return the watcher's error
<rogpeppe> davecheney: exactly
<rogpeppe> davecheney: i *think* this means most watchers can dispense with tomb
<davecheney> rogpeppe: certainly for those watchers where we don't need any sort of error value from, yes
<rogpeppe> davecheney: no, errors are easy too
<rogpeppe> davecheney: most watchers are single-goroutine
<rogpeppe> davecheney: it's easy for them to get an error from the watcher they're reading from and put it in an instance variable
<rogpeppe> davecheney: the only way for a client to get the error is to call Wait.
<rogpeppe> davecheney: and when that returns, we're guaranteed that the error is in a known state
<rogpeppe> davecheney: errors are crucial, because all watchers need to be able to return an error
 * davecheney nods
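A sketch of the watcher shape being proposed here, under the assumptions stated above: the stop channel is passed in, the output channel is closed when the watcher finishes, the error lives in an instance variable, and Wait drains remaining events before returning it. The names (ContentWatcher, ContentChange) are illustrative, not the actual juju API:

```go
package main

import "fmt"

// ContentChange is a stand-in for a watcher event.
type ContentChange struct{ Content string }

// ContentWatcher takes a stop channel at creation and signals completion
// by closing its output channel, so no per-watcher Stop method is needed.
type ContentWatcher struct {
	changes chan ContentChange
	err     error
	done    chan struct{}
}

func NewContentWatcher(stop <-chan struct{}) *ContentWatcher {
	w := &ContentWatcher{
		changes: make(chan ContentChange),
		done:    make(chan struct{}),
	}
	go w.loop(stop)
	return w
}

func (w *ContentWatcher) Changes() <-chan ContentChange { return w.changes }

func (w *ContentWatcher) loop(stop <-chan struct{}) {
	defer close(w.changes) // EOF tells clients the watcher has finished
	defer close(w.done)
	for i := 0; ; i++ {
		select {
		case <-stop:
			return
		case w.changes <- ContentChange{Content: fmt.Sprintf("change-%d", i)}:
		}
	}
}

// Wait reads events until EOF, then returns the watcher's error, which is
// guaranteed to be in a known state once the channel is closed.
func (w *ContentWatcher) Wait() error {
	for range w.changes {
	}
	<-w.done
	return w.err
}

func main() {
	stop := make(chan struct{})
	w := NewContentWatcher(stop)
	fmt.Println((<-w.Changes()).Content) // change-0
	close(stop)                          // one broadcast stops the watcher
	fmt.Println(w.Wait())
}
```

A muxer built on several such watchers can hand the same stop channel to all of them, which is the "delegate its stop request" point below.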
<rogpeppe> shit, i've just buggered cobzr
<davecheney> rm -rf ../../.bzr/cobzr ?
<rogpeppe> davecheney: nope, edit .bzr/branch/location
<rogpeppe> davecheney: that's a really bad suggestion BTW!
<rogpeppe> davecheney: it would erase all my branches
<davecheney> rogpeppe: I did the above by mistake once, once
<rogpeppe> davecheney: oops
<rogpeppe> davecheney: in this case it wasn't so bad. i did branch -m ec2-fix-configsuite'
<rogpeppe> '
<rogpeppe> (note the linefeed)
<rogpeppe> davecheney: it doesn't like it!
<rogpeppe> davecheney: at least, it "works"
<rogpeppe> davecheney: then you can't do anything
<rogpeppe> davecheney: i'll raise a bug! that's what i'll do...
<rogpeppe> davecheney: ha. can't do that.
<rogpeppe> davecheney: i did one watcher: https://codereview.appspot.com/6373048
<rogpeppe> davecheney: the important bit is the change on lines 162-167
<rogpeppe> davecheney: we don't have to select on output any more
<rogpeppe> davecheney: which makes muxers much easier (and we will have lots of them in the firewall and unit agent)
<davecheney> rogpeppe: holy shit, a closed channel always fires !??
<rogpeppe> davecheney: absolutely!
<rogpeppe> davecheney: have you not read the spec?
<rogpeppe> davecheney: they're *really* useful
<davecheney> rogpeppe: not for a while
<rogpeppe> davecheney: it's the only way to use a chan to broadcast
<rogpeppe> davecheney: it's always been like that. except in the early days you'd get a panic if you read more than N times N =~ 20.
<davecheney> rogpeppe: i can't find a reference to it in the spec
<davecheney> but a closed channel looks like it always behaves like it has a value of the zero value and false
<rogpeppe> davecheney: http://golang.org/ref/spec#Close
<davecheney> I guess you're talking about this bit "zero value for the channel's type without blocking"
<davecheney> it doesn't specifically mention its use in select {} but whatever
<davecheney> as you say, a very useful property
<rogpeppe> davecheney: yeah
<rogpeppe> to both
<davecheney> the fact it's level triggered is very useful
<davecheney> although, it's not really acting as a default case
<rogpeppe> davecheney: thanks to fwereade_'s refactoring, the state/watcher.go changes are pretty minimal, it seems
<rogpeppe> davecheney: what do you mean by that?
<davecheney> that is, receiving from a closed channel will be chosen pseudorandomly from the set of ready channels
<rogpeppe> davecheney: yeah
<rogpeppe> davecheney: which is fine, i think.
<davecheney> yup
<rogpeppe> davecheney: you might have two closed channels, of course
<davecheney> combined with nil'ing the channel, it's very powerful
<rogpeppe> davecheney: definitely
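A short sketch of the nil'ing idiom just mentioned: because a closed channel is always ready, a drained channel has to be set to nil (receives on a nil channel block forever) to disable its select case, otherwise the loop spins on zero values:

```go
package main

import "fmt"

func main() {
	a := make(chan int, 1)
	b := make(chan int, 1)
	a <- 1
	close(a)
	b <- 2
	close(b)

	sum := 0
	// Loop until both sources are disabled. Setting a channel variable
	// to nil removes its case from contention in the select.
	for a != nil || b != nil {
		select {
		case v, ok := <-a:
			if !ok {
				a = nil // a is drained: drop it from the select
				continue
			}
			sum += v
		case v, ok := <-b:
			if !ok {
				b = nil
				continue
			}
			sum += v
		}
	}
	fmt.Println(sum) // 3
}
```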
<davecheney> anyway, time for a break, and some dinner
<rogpeppe> davecheney: both of those things weren't in Limbo, and i really appreciate them
<rogpeppe> davecheney: enjoy!
<rogpeppe> davecheney: 't'was good to catch you
<davecheney> later lads, if I don't see you again tonight, we'll talk at le standup!
<rogpeppe> davecheney: aye
<TheMue> morning
<fwereade_> TheMue, heyhey
<fwereade_> rogpeppe, also heyhey :)
<rogpeppe> fwereade_: yo!
<rogpeppe> fwereade_: i've had a minor revelation about the watchers
<rogpeppe> TheMue: yay!
<fwereade_> rogpeppe, oh yes?
<rogpeppe> fwereade_: i was thinking about this code, which i suggested to TheMue as part of the firewall code: http://paste.ubuntu.com/1089462/
<fwereade_> rogpeppe, is this to do with the which-ready-channel-you-receive-from-is-random thing?
<fwereade_> rogpeppe, that has had me a little suspicious lately, easy to miss
<rogpeppe> fwereade_: that was a side shoot
<rogpeppe> fwereade_: or... maybe i don't know what you're referring to there
<rogpeppe> fwereade_: i thought that the code above was way more complex than it should be
<fwereade_> rogpeppe, I shouldn't worry about it for now, I'm just developing a slightly twitchy feeling that we may not have fully analysed some of the watchers, I'll figure it out properly myself :)
<rogpeppe> fwereade_: this potential change makes them easier to analyse, i think
<rogpeppe> fwereade_: the idea is that actually most watchers layer on top of other watchers
<fwereade_> rogpeppe, anyway, sorry for the derail
<fwereade_> rogpeppe, yes, this is very true
 * TheMue listens
<rogpeppe> fwereade_: and so the fact that each watcher has its own Stop method makes things hard
<fwereade_> rogpeppe, ok...
<rogpeppe> fwereade_: we can do better, i think, if we *pass in* a stop channel
<rogpeppe> fwereade_: and use the EOF status of the watch channel as an indication that the watcher has completed
<TheMue> rogpeppe: Hmm, an idea: why don't we build the watchers so that, when creating an instance, we pass in behaviors based on interfaces? They are called on changes and errors.
<fwereade_> rogpeppe, hmm, this has crossed my mind in individual cases
<TheMue> rogpeppe: Everything else is handled internally.
<rogpeppe> TheMue: i'm not sure i understand that
<TheMue> rogpeppe: Wait a moment, I'll write a paste as outline.
<rogpeppe> fwereade_: here's an example of what i mean: https://codereview.appspot.com/6373048/diff/1/state/watcher/watcher.go
<fwereade_> rogpeppe, the trouble is that pass-a-stopper is a nice thing internally, but as a consumer of our APIs I would much rather get an object I can Stop() myself
<rogpeppe> fwereade_: yes, that's the down side
<rogpeppe> fwereade_: *but*
<rogpeppe> fwereade_: it makes everything else simpler
<rogpeppe> fwereade_: particularly in the muxing case, which we're going to be doing a *lot* in the firewall and unit agent code
<rogpeppe> fwereade_: that code i pasted earlier turns into something like this: http://paste.ubuntu.com/1094518/
<rogpeppe> fwereade_: i.e. exactly what it *should* look like
<fwereade_> rogpeppe, hmm, yes, I agree it's useful in those cases
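The pastes above have since expired. As a rough reconstruction of the claim: with watchers that close their channels on completion, the muxing loop needs no per-send select against a tomb, it just ranges over each input and forwards. The names (event, mux) are hypothetical:

```go
package main

import "fmt"

// event stands in for a watcher change.
type event struct {
	source string
	value  int
}

// mux forwards events from several watcher channels onto one output and
// closes the output once every input has closed. The single stop channel
// was already broadcast to the source watchers, so EOF on each input is
// the only completion signal the mux needs.
func mux(out chan<- event, ins ...<-chan event) {
	done := make(chan struct{})
	for _, in := range ins {
		in := in
		go func() {
			for e := range in { // range works: watchers close on completion
				out <- e
			}
			done <- struct{}{}
		}()
	}
	go func() {
		for range ins {
			<-done
		}
		close(out)
	}()
}

func main() {
	a := make(chan event, 1)
	b := make(chan event, 1)
	a <- event{"a", 1}
	close(a)
	b <- event{"b", 2}
	close(b)

	out := make(chan event)
	mux(out, a, b)
	total := 0
	for e := range out {
		total += e.value
	}
	fmt.Println(total) // 3
}
```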
 * fwereade_ is fretting that it's not necessarily everything we need
<rogpeppe> fwereade_: how do you mean?
<fwereade_> rogpeppe, eg in one of the things I have up for review
<fwereade_> rogpeppe, relationUnitsWatcher I think
<fwereade_> rogpeppe, I found that stop chans weren't quite enough for my single-thing-watcher goroutines
<fwereade_> rogpeppe, and I convinced myself that I needed a whole separate tomb for every goroutine
<rogpeppe> fwereade_: i don't *think* you do
<fwereade_> rogpeppe, maybe I should have been keeping references to the subwatchers around directly, that would probably be enough
<fwereade_> rogpeppe, the specific issue was "make sure this goroutine is not going to send anything else on the updates channel"
<rogpeppe> fwereade_: ah!
<fwereade_> rogpeppe, ie, make sure I can't get a changed event after a deleted event
<rogpeppe> fwereade_: but that's the important thing about this change!
<rogpeppe> fwereade_: we don't care any more!
<TheMue> rogpeppe: http://paste.ubuntu.com/1094521/ is very, very quick. The important idea should be that you don't bother with different channels and tombs. The watcher does this and calls the methods of the passed behavior.
<rogpeppe> fwereade_: because the way we wait for a watcher to complete is by reading all events from the channel.
<TheMue> rogpeppe: It's used in Erlang/OTP very much. Services provide the backend, behaviors concentrate on business logic.
<rogpeppe> fwereade_: have a look at that watcher code i posted above
<fwereade_> rogpeppe, I do like that, but I'm still not sure it helps my case
<fwereade_> rogpeppe, this is me waiting until I'm sure that one specific goroutine is no longer going to be sending on a shared channel
<rogpeppe> TheMue: i'm not sure i understand the motivation, or the context.
<rogpeppe> fwereade_: why do you need to do that?
<rogpeppe> TheMue: is this a suggestion for state/watcher.go ?
<fwereade_> rogpeppe, to ensure that I will not get any embarrassing changed events after I've sent a departed event for a particular relation unit
<rogpeppe> TheMue: it looks quite similar to what fwereade_'s done in that file already
<rogpeppe> fwereade_: because you've got two watchers watching different aspects of the same thing?
<TheMue> rogpeppe: Inside one watcher, yes. But take a look at your paste above. We still build new watchers upon, with own tombs, with multiple channels and all that stuff.
<fwereade_> rogpeppe, assume that I got a change immediately followed by a delete, and my main goroutine is processing the delete while the child goroutine is processing the change
<TheMue> rogpeppe: My intention is to simplify that.
<rogpeppe> TheMue: i'd have to see a more complete example to understand what you're suggesting.
<fwereade_> rogpeppe, unless I wait for the child to actually die, I can't be sure which of the channels it's selecting on will be selected
<rogpeppe> fwereade_: isn't the problem there that you've got two things operating on the same state?
<fwereade_> rogpeppe, ie when the scheduler gets around to it next, Dying is closed but updates is also unblocked
<rogpeppe> fwereade_: which channels is it selecting on?
<fwereade_> rogpeppe, I have two different things operating on two different pieces of state that are in the same system
<fwereade_> rogpeppe, waiting to receive from a stop chan (which might be a tomb.Dying()), or send on an updates chan for the attention of the main goroutine
<rogpeppe> fwereade_: with this change, you would not do that
<rogpeppe> fwereade_: you'd just send on the updates chan unconditionally
<fwereade_> rogpeppe, right, so how do I avoid getting change events on the updates chan as soon as I want to stop receiving them?
<rogpeppe> fwereade_: you don't.
<rogpeppe> fwereade_: but i don't see why that's a problem.
<fwereade_> rogpeppe, at the time I stop, the child goroutine may already be waiting to send
<rogpeppe> fwereade_: that's fine.
<fwereade_> rogpeppe, I want to be damn sure it will not do so
<rogpeppe> fwereade_: at the receiving side, you have to ignore the subsequent event(s)
<fwereade_> rogpeppe, because a relationUnitsWatcher that sends crap like unit-0-0 deleted followed by unit-0-0 changed is just crackful
<rogpeppe> fwereade_: but that shouldn't be hard
<fwereade_> rogpeppe, when N goroutines are sending on the shared update channel?
<fwereade_> rogpeppe, I keep more state lying around to know which events I should filter out of the updates stream?
<rogpeppe> fwereade_: yeah, it's not a difficult problem, i think. you maintain some state per goroutine on the receiver side.
<rogpeppe> fwereade_: each event coming from the shared update channel will identify its goroutine
<fwereade_> rogpeppe, hmm, doesn't really feel like a win to me, but I could perhaps be convinced
<rogpeppe> fwereade_: so the state can be there trivially
<rogpeppe> fwereade_: i *think* it's a much easier structure to reason about
<fwereade_> rogpeppe, it's just a feeling that the watchers which keep state around are harder to reason about
<fwereade_> rogpeppe, it could well still be a net win; makes most things easier and some harder
<rogpeppe> fwereade_: at least this state is simple local state, no concurrent interactions with it
<rogpeppe> fwereade_: in this case, i think it's as simple as "x.ignore = true"; [..] if x.ignore { continue}
<rogpeppe> fwereade_:  but i may well misunderstand your problem
<fwereade_> rogpeppe, well, it's multiple goroutines, so I'd need to invert it and store the ones I *do* care about in a map
<fwereade_> rogpeppe, I agree not very complex
<fwereade_> rogpeppe, I'm not arguing it's an insupportable burden
<rogpeppe> fwereade_: i don't think so... each goroutine stores info about itself
<rogpeppe> fwereade_: and sends that info on the watch channel.
<rogpeppe> fwereade_: so when an event arrives, you get the info too, which you can manipulate as local state.
<fwereade_> rogpeppe, but the critical info always comes in on the main goroutine, not the child, I don't think the child should be storing that state
<fwereade_> rogpeppe, I am not opposed to exploring this idea further
<rogpeppe> fwereade_: cool
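A sketch of the receiver-side filtering being discussed: each event identifies its source, and the main goroutine keeps simple local state (no concurrent access) to drop late events from units it has already seen depart. The names are hypothetical:

```go
package main

import "fmt"

// unitEvent identifies its source so the receiver can filter.
type unitEvent struct {
	unit string
	kind string // "changed" or "departed"
}

func main() {
	updates := make(chan unitEvent, 4)
	// A child goroutine may already be blocked sending a "changed" event
	// when the main goroutine processes a "departed". Rather than
	// synchronising the stop, the receiver ignores the late event.
	updates <- unitEvent{"unit-0-0", "changed"}
	updates <- unitEvent{"unit-0-0", "departed"}
	updates <- unitEvent{"unit-0-0", "changed"} // late: must be dropped
	close(updates)

	departed := make(map[string]bool) // local state on the receiver side
	var delivered []string
	for e := range updates {
		if departed[e.unit] {
			continue // the "one if"
		}
		if e.kind == "departed" {
			departed[e.unit] = true
		}
		delivered = append(delivered, e.kind)
	}
	fmt.Println(delivered) // [changed departed]
}
```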
<Aram> moin everyone.
<TheMue> Hi Aram
<rogpeppe> Aram: yo
<rogpeppe> !
<fwereade_> rogpeppe, just pointing out a use case which I think is legitimate and not the best fit
<fwereade_> Aram, heyhey
<rogpeppe> fwereade_: i'll be interested to have a look at the CL
<fwereade_> rogpeppe, I have a stack of them, bit insomniac this w/e
<rogpeppe> fwereade_: i had this idea sitting in my head all weekend, but couldn't get anywhere near a computer...
<fwereade_> rogpeppe, TheMue: in fact I think https://codereview.appspot.com/6408045/ is basically trivial
<fwereade_> rogpeppe, TheMue: but one with a surprisingly significant effect
<fwereade_> rogpeppe, TheMue: so I would appreciate both your opinions
<fwereade_> rogpeppe, TheMue: it's independent of https://codereview.appspot.com/6405044/ which is a prereq for https://codereview.appspot.com/6402048/
<fwereade_> rogpeppe, the first one of those 2 is the use case I was talking about
<fwereade_> rogpeppe, the second one ties together a bunch of stuff in a way that I find pleasing
<fwereade_> rogpeppe, and gets us a RelationUnit type that is aware of the existence/settings of other RelationUnits in other agents
<Aram> fwereade_: rogpeppe: TheMue: I have to take at least half a day off... I've stayed too much in the sun and I have a fever and an excruciating headache.
<fwereade_> Aram, np, look after yourself :)
<rogpeppe> fwereade_: 6408045 is because yaml doesn't marshal deterministically?
<fwereade_> rogpeppe, yeah :/
<Aram> woke up and drank 2 liters of water.
<rogpeppe> Aram: ok, sorry about that
<rogpeppe> fwereade_: although i'm jealous of your sun
<rogpeppe> fwereade_: it's been 13 degrees here for weeks
<fwereade_> rogpeppe, it was swapping dict field order about 1 time in 10 in normal use
<rogpeppe> fwereade_: i've got the central heating on
<rogpeppe> fwereade_: ah, makes sense
<fwereade_> rogpeppe, the test case hits it 100 times and it sometimes switches rendering order >50 times
<rogpeppe> fwereade_: LGTM
<fwereade_> rogpeppe, that was a fun midnight bug-hunt though
<fwereade_> rogpeppe, cool
<rogpeppe> fwereade_: good catch!
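The underlying cause is that Go randomises map iteration order, so any marshaller that walks a map directly can emit fields in a different order on every run. The usual fix, sketched here (illustrative, not the actual goyaml change), is to sort the keys before emitting:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// marshalDeterministic emits "key: value" lines in sorted key order, so
// the serialized form is stable even though map iteration order is random.
func marshalDeterministic(m map[string]string) string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%s: %s\n", k, m[k])
	}
	return b.String()
}

func main() {
	settings := map[string]string{
		"private-address": "10.0.0.1",
		"name":            "wordpress/0",
	}
	// Always prints name before private-address, run after run.
	fmt.Print(marshalDeterministic(settings))
}
```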
<fwereade_> TheMue, if I could get one from you too I'll merge it straight in
<fwereade_> rogpeppe, cheers :)
<rogpeppe> fwereade_: although...
<fwereade_> rogpeppe, go on
<rogpeppe> fwereade_: why did it cause anything to fail?
<rogpeppe> fwereade_: i'd've thought the tests should be resilient to config nodes changing-but-not-changing
<fwereade_> rogpeppe, AIUI ConfigNode is meant to *not* change-without-changing
<fwereade_> rogpeppe, a while ago niemeyer said he would consider that behaviour to be a bug
<fwereade_> rogpeppe, and it is an assumption I was depending on in the tests
<rogpeppe> fwereade_: nonetheless, the watchers *should* be resilient to it, no?
<rogpeppe> fwereade_: ah
<fwereade_> rogpeppe, no, IMO the system should assume this useful property of ConfigNode
<fwereade_> rogpeppe, this case specifically
<rogpeppe> fwereade_: ok
<fwereade_> rogpeppe, relation unit writes private-address to its settings node on creation
<fwereade_> rogpeppe, the private address *might* have changed if the UA was restarted and is rejoining an existing one
<rogpeppe> fwereade_: i suppose contentWatcher copes with updates-without-changes at too low a level
<fwereade_> rogpeppe, or it might not, or the node might not exist at all
<fwereade_> rogpeppe, IMO the nice thing to do is just to always set private-address on join
<fwereade_> rogpeppe, and let the lower level filter out the changes-that-don't-change
<fwereade_> rogpeppe, with the bug, a particular node's version can change under the hood and send me an unexpected event
<rogpeppe> "relation unit writes private-address to its settings node on creation"
<fwereade_> rogpeppe, not even an unexpected event, actually
<rogpeppe> i don't understand that
<fwereade_> rogpeppe, just an unexpected version
<fwereade_> rogpeppe, ok, every unit participating in a relation has its own settings and presence nodes
<TheMue> fwereade_: Looks good
<rogpeppe> fwereade_: what's a "relation unit" and what's the private address we're talking about here?
<fwereade_> rogpeppe, existence of the presence node implies validity of the settings
<rogpeppe> fwereade_: ok
<fwereade_> rogpeppe, one thing we guarantee is that if you're in a relation with some other units, you will have access to their private-address setting
<fwereade_> rogpeppe, so that you can do whatever magic you personally require to get your charm talking to the other side of the relation
<rogpeppe> fwereade_: ah, i didn't know that
<fwereade_> rogpeppe, make sense roughly?
<rogpeppe> fwereade_: yup
<fwereade_> rogpeppe, I'm not saying there isn't a better way to do it, but the python seems to work pretty well like that :)
<rogpeppe> fwereade_: i guess i'm surprised that the config node watcher doesn't filter out occasions when the content has changed but the attrs haven't
<rogpeppe> fwereade_: but i'm happy with your fix
<rogpeppe> fwereade_: it seems very plausible
<fwereade_> rogpeppe, feels like a lot more hassle to do it at output time when we already have all the info available at input time, as it were
<rogpeppe> fwereade_: yeah
<fwereade_> rogpeppe, and then we get more predictable behaviour at a lower level and can build higher-level stuff with more confidence
<rogpeppe> fwereade_: definitely.
<rogpeppe> fwereade_: thanks for the explanation
<fwereade_> rogpeppe, need to take a longish break, cath was away for the weekend and I need to JIT some housekeeping which got sacrificed to coding while laura was asleep ;)
<rogpeppe> fwereade_: :-)
<rogpeppe> fwereade_: when you're back i should have a watcher CL for your perusal
<fwereade_> rogpeppe, cool
<fwereade_> rogpeppe, TheMue: I'll just merge the ConfigNode change before I break
<rogpeppe> fwereade_: and i'll add some tests to that config proposal. i *hope* that gustavo likes it.
<fwereade_> rogpeppe, that's always the worry
<fwereade_> rogpeppe, I wish I had a better niemeyer sim
<rogpeppe> fwereade_: mine's still training the neural net.
<rogpeppe> fwereade_: it's pretty erratic
<fwereade_> rogpeppe, indeed, I find it actually gets worse with time because it functions accurately for weeks and then, suddenly, *total* failure
<rogpeppe> fwereade_: wildly non-linear solution space
<fwereade_> rogpeppe, indeed :)
<fwereade_> rogpeppe, TheMue: thanks, submitted
<TheMue> fwereade_: Cheers
<TheMue> Oh, revision 300, nice number.
<TheMue> davecheney: Heya
<davecheney> TheMue: howdy
<davecheney> i'll leave the firewaller to you
<TheMue> davecheney: Thx *bow*
<fwereade_> davecheney, heyhey
<davecheney> 'sup !
 * TheMue thankfully bought the Ubuntu fleece jacket in Oakland. It's pretty cold today here.
 * fwereade_ is exhausted by the mere prospect of the 10-minute walk in blazing sunshine to collect laura from nursery
<fwereade_> bbs
<davecheney> niemeyer: any comment ? https://www.youtube.com/watch?v=32DD4DF7Qpo
<rogpeppe> davecheney, TheMue, fwereade_: request for comment: https://codereview.appspot.com/6373048/
<rogpeppe> davecheney: lol
 * davecheney reads
<rogpeppe> fwereade_: i'm not sure that passing a tomb around would be a good idea
<fwereade_> rogpeppe, yeah, it feels too bulky
<fwereade_> rogpeppe, potentially
<rogpeppe> fwereade_: a tomb has a single idea of an error
<fwereade_> rogpeppe, ah, sorry, what's the problem there?
<rogpeppe> fwereade_: my main inspiration for the CL was my realisation that there's a fundamental asymmetry in using channels
<rogpeppe> fwereade_: channels fan in but they don't fan out
<fwereade_> rogpeppe, yeah
<rogpeppe> fwereade_: which leads me to think that stopping your sources is more appropriate as a broadcast.
<rogpeppe> fwereade_: then each source tells you when it has responded and completed.
<TheMue> rogpeppe, fwereade_: What's still somehow too complex for me is the chaining of goroutines. We do goroutine(baseWatcher) -> goroutine(somehowSpecializedWatcher) -> goroutine(neededWatcher)
<rogpeppe> fwereade_: and when we've got a channel coming from that source, EOF on the channel seems the Right Way.
<fwereade_> rogpeppe, sure, that's what we do anyway, right?
<rogpeppe> TheMue: chaining of goroutines is the Go Way.
<fwereade_> TheMue, yeah, the chaining of goroutines makes me happy
<rogpeppe> fwereade_: yes, but the stopping is still one-to-one.
<fwereade_> TheMue, so long as they're sanely written they make very nice building blocks IMO
<TheMue> rogpeppe, fwereade_: I would like to already tell the first one what I want to do and only think about the business logic.
<rogpeppe> TheMue: i'm not sure what you mean by "the business logic"
<fwereade_> TheMue, IMO the good thing is that you *can* as long as you enforce communication by channels
<TheMue> fwereade_: But we have so much overhead around it, each time with new tombs.
<fwereade_> rogpeppe, the domain logic if you prefer
<rogpeppe> TheMue: my proposal does away with most of the tombs
<fwereade_> TheMue, is this overhead serious?
<TheMue> rogpeppe: The stuff that I want to be done, sorry for the wording.
<fwereade_> rogpeppe, yeah, that's what makes me uncomfortable, I find the tombs *really* helpful
<TheMue> fwereade_: IMHO it can be done more simply using interfaces.
<rogpeppe> fwereade_: a tomb is overkill much of the time
<rogpeppe> fwereade_: did you look at the watcher implementations in the CL?
<fwereade_> rogpeppe, if "knowing when you've stopped" is overkill, then yes
<rogpeppe> fwereade_: they don't use a tomb, and the code is no more complex IMHO
<rogpeppe> fwereade_: when you've only got one goroutine, there's no difficulty knowing when you've stopped :-)
<rogpeppe> fwereade_: you return...
<rogpeppe> TheMue: the fundamental problem we're trying to solve here is that there are things changing all over the system, and we need to respond to them in a coherent way
<rogpeppe> s/to them/to the changes/
<rogpeppe> TheMue: channel multiplexing can work really nicely here, but it's a right pain if you have to select every time you want to send or receive on a channel.
<TheMue> rogpeppe: How does your CL help regarding this point?
<TheMue> rogpeppe: OK, it's getting clearer.
<rogpeppe> TheMue: you see the two pieces of code at the start of the CL description?
<rogpeppe> TheMue: that's the simplification that this CL buys you
<TheMue> rogpeppe: That's why I started with using range.
<rogpeppe> TheMue: *exactly*!
<TheMue> rogpeppe: Indeed.
<rogpeppe> TheMue: but with the way things are, you *can't use range*.
<rogpeppe> TheMue: which seems wrong.
<rogpeppe> IMHO this CL lets us write more natural Go code.
<TheMue> rogpeppe: ack
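The range point in miniature: once a watcher closes its channel on completion, the consumer no longer needs a select against a Dying/stop channel on every receive, and a plain range loop is enough. A minimal sketch:

```go
package main

import "fmt"

func main() {
	changes := make(chan int)
	go func() {
		defer close(changes) // EOF signals the watcher has finished
		for i := 1; i <= 3; i++ {
			changes <- i
		}
	}()

	// With Stop-method watchers this loop would need a select against
	// tomb.Dying() on every receive; with close-on-completion semantics
	// it collapses to natural Go.
	sum := 0
	for c := range changes {
		sum += c
	}
	fmt.Println(sum) // 6
}
```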
<rogpeppe> fwereade_: one possibility i did consider was to allow passing in a nil stop channel to the watcher, which would mean "make your own stop channel"; and a method, Stop, which would close it.
<rogpeppe> fwereade_: i even started doing it, but it doesn't work very well.
<fwereade_> rogpeppe, I think the thing I don't like is the action-at-a-distance nature of it all
<rogpeppe> fwereade_: you mean that stopping isn't synchronous?
<fwereade_> rogpeppe, you close the channel you originally passed in and a whole tree of goroutines shut themselves down, sending an arbitrary number of additional events, before everything is finally stopped
<fwereade_> rogpeppe, internally to the watchers the receive-until-close works nicely
<fwereade_> rogpeppe, but outside of them it feels like we're exposing our internals in a slightly inappropriate way
<fwereade_> rogpeppe, does that make sense?
<rogpeppe> fwereade_: i know what you mean, but i think that it's not an internal thing - it's that telling anything to stop is a two-way process - we tell them to stop and then they get around to doing it.
<rogpeppe> fwereade_: and that an *inevitable* consequence of a synchronous stop is that you have to select on every channel send.
<rogpeppe> fwereade_: and i think that leads to uglier code overall
<rogpeppe> fwereade_: particularly when we come to the unit agent
<rogpeppe> fwereade_: and the firewall code
<rogpeppe> fwereade_: and, i think that it really won't be hard at all to avoid relying on synchronous-stop semantics
<fwereade_> rogpeppe, won't the firewall code also have to deal with a lot more state to figure out what events should just be dropped because we don't care about them any more?
<rogpeppe> fwereade_: i don't think so.
<rogpeppe> fwereade_: we'll only stop things at the end, AFAICS
<fwereade_> rogpeppe, you surely have a clearer idea of the details there than me
<fwereade_> rogpeppe, surely when a machine is terminated you'll want to stop paying attention to changes?
<fwereade_> rogpeppe, some changes anyway
<fwereade_> rogpeppe, which could still be queued up from the past but will potentially want to write to state that you've deleted?
<rogpeppe> fwereade_: if that's the case, it's trivial to mark the machine as dead
<rogpeppe> fwereade_: if we see an update from a dead machine, we'll just ignore it. one if.
<fwereade_> rogpeppe, ok, and when do we tidy up the dead machines?
<rogpeppe> fwereade_: GC
<fwereade_> rogpeppe, how? when do we know that we're guaranteed no more changes for a machine?
<rogpeppe> fwereade_: we don't. the goroutine that's responsible for sending on the channel holds a reference to the machine.
<rogpeppe> fwereade_: when it exits, the last reference goes
<fwereade_> rogpeppe, ok, and how do we tell that goroutine to make the machine dead?
<rogpeppe> fwereade_: we don't
<fwereade_> rogpeppe, we'll be finding out it's dead on a separate goroutine, right?
<rogpeppe> fwereade_: no
<rogpeppe> fwereade_: hold on
<rogpeppe> fwereade_: let me paste some code
<fwereade_> rogpeppe, thanks
<rog> fwereade_: bugger, my machine just randomly rebooted for no apparent reason
<rog> fwereade_: i was about 10 lines into my example
<rog> fwereade_: will be another 5 mins
<fwereade_> rog, np, I'll stick some food on
<rog> fwereade_: http://paste.ubuntu.com/1094802/
<fwereade_> rog, sorry, will converse about this in a bit
<rog> fwereade_: np
 * Aram is feeling just slightly slightly better.
<rog> fwereade_: why don't you like Environ.Config BTW?
<Aram> just enough to get out of bed :).
<niemeyer> Good morning!
<rog> niemeyer: yo!
<rog> niemeyer: did you have a good conference?
<rog> Aram: if you feel bad, just take the day off sick...
<niemeyer> rog: Yeah, it was pretty nice
<rog> niemeyer: your talk went down well, i trust
<niemeyer> rog: Nothing so different, though, given I had been in other MongoDB conferences recently
<niemeyer> rog: Seeing old friends and making new ones is great, though
<niemeyer> rog: Yeah, talk was alright
<rog> niemeyer: yeah
<fwereade_> niemeyer, heyhey
<niemeyer> rog: Haven't had much time to really prepare something great, so it was below what I'd like to do
<niemeyer> fwereade_: Heya!
<rog> niemeyer: you can't do everything!
<niemeyer> The 5-Gram Experiment turned out nicely, though
<fwereade_> rog, I do like your environ config... did I comment on the wrong CL or something? :)
<niemeyer> rog: True that is :)
<rog>  niemeyer: i've been thinking about watchers recently, and i wonder what you think of this: https://codereview.appspot.com/6373048/
<rog> fwereade_: i meant, specifically the Config method on Environ.
<fwereade_> rog, ah sorry
<fwereade_> rog, I don't see a use case for it
<fwereade_> rog, when would you need it?
<rog> fwereade_: apart from anything else, it makes for a great test
<fwereade_> rog, do we really need anything more than name?
<rog> fwereade_: that caught quite a few errors when i was doing the branch
<fwereade_> rog, hm, like what?
<rog> fwereade_: times when i'd forgotten to return all the attributes from Attrs, for example.
<rog> fwereade_: i can also see it being useful for debugging
<rog> fwereade_: (it makes it trivial to print all the attributes of the current environ)
<fwereade_> rog, I dunno, I think we can manipulate and verify and test configs just fine on their own
<rog> fwereade_: by putting the test in jujutest.LiveTests, we automatically get that test for every environ that passes through that, which is more or less every environ tested
<fwereade_> rog, but that environ is usually wrong
<fwereade_> rog, so it will make debugging harder :p
<rog> fwereade_: it doesn't matter if it's wrong or right - it should still roundtrip
<rog> fwereade_: sorry, wrong why?
<niemeyer> rog: This code has just been refactored by fwereade_.. I think it's time for us to let it alone a bit so we can finish the rest
<rog> niemeyer: fwereade_'s refactoring made it nice and easy to make this change :-)
<rog> niemeyer: and i'm pretty sure it'll make writing the firewall and unit agent code much easier
<niemeyer> rog: I don't feel like the code has improved significantly either, to be honest
<rog> niemeyer: what about the two code fragments in the CL description?
<rog> niemeyer: that's a common idiom
<niemeyer> https://codereview.appspot.com/6373048/diff/5001/state/watcher.go
<niemeyer> Line 44 on the new watcher
<niemeyer> 32 on the old file
<rog> niemeyer: i changed the name because it does the go itself now
<niemeyer> It seems just as tricky, and requires as much understanding of the surroundings, or perhaps more, than the previous version
<rog> niemeyer: perhaps runLoop might be better
<fwereade_> rog, I see how you do the machine changes, I guess it's ok, but I'd still rather have the watchers behave predictably than have simple idioms for handling their unpredictability
<niemeyer> rog: Not worried about naming
<niemeyer> fwereade_++
<fwereade_> rog, even at the cost of a few selects :)
<rog> i think we should be able to write nice idiomatic Go code. and selecting on *everything* isn't good for that.
<fwereade_> rog, and I don't think the UA stuff is too much of a car crash with the existing style ;)
<fwereade_> rog, mileages may ofc vary
<rog> the "watcher cleans itself up and tells you when it has done" idiom works well, and is commonly used
 * rog is convinced this, or something like it, is the Right Way to do it.
<rog> niemeyer: that code seems pretty equivalent to me.
<rog> niemeyer: it's nice we can use a range, of course
<rog> niemeyer: and 5 lines shorter is nice too
<niemeyer> rog: LOL
<rog> niemeyer: i agree it doesn't make much difference at the edges, but it really helps the multiplexer case, and both the firewaller and the unit agent do a lot of multiplexing.
<niemeyer> rog: I haven't seen any kind of improvement in that direction in the CL
<rog> niemeyer: that's because we don't have any multiplexers yet
<niemeyer> rog: In fact, both the provisioner and the machiner have gained complexity
<niemeyer> rog: Rather than being simplified
<rog> niemeyer: i don't think one line extra is much gain.
<niemeyer> rog: and now we have those awkward stop channels spread through the whole code base
<niemeyer> (of state, specifically)
<niemeyer> rog: I'd prefer to not do that now, and as suggested focus on pushing things forward
<rog> niemeyer: yes, that is indeed the worst bit of it
<niemeyer> rog: mstate is coming along as well
<rog> niemeyer: whether we're using mstate or state doesn't make much difference to the code i'm thinking about
<niemeyer> rog: Yes, it doesn't, but this is changing the interface while Aram is working on it
<rog> niemeyer: ok, that's a reasonable point.
<niemeyer> rog: It'd be great if we could stop fuzzing with this interface for a few weeks while we push the implementation of agents forward
<rog> niemeyer: it was precisely because i was thinking about the implementation of agents that i wanted to make this change
<rog> niemeyer: but i hope you'll bear this in mind when we find that the firewall code looks unnecessarily big and bulky
<niemeyer> rog: That's only part of the suggestion ;-)
<niemeyer> rog: I haven't seen it, but given the current agents we have in place, I really hope that this isn't the case
<rog> niemeyer: ok, time for another slapdown. i was sympathetic to fwereade_'s cries of distress over the config stuff, tried to make it better, but couldn't. i then wondered whether an alternative approach might work. https://codereview.appspot.com/6353092/
<fwereade_> niemeyer, rog: IMO the ugliest bit of the UA is likely to be the relationUnitWatcher, and I think that worked out tolerably
<rog> niemeyer: it's lacking tests, and william made some comments that i will address if you like it.
<niemeyer> rog: This is a branch from William?
<rog> niemeyer: no, this was something i did very quickly while trying to work out what might work.
<rog> niemeyer: i wanted to avoid saying yet again "try this" without knowing the implications
<niemeyer> rog: This is a branch from William
<rog> niemeyer: oops, so it is
<rog> niemeyer: https://codereview.appspot.com/6343107/
<fwereade_> niemeyer, sorry about that branch -- I did my best to follow what you wanted but wasn't able to make it nice
<fwereade_> niemeyer, davecheney thought it was kinda OK so it might be an ok fallback, but I thought rog's approach was much cleaner
<niemeyer> Oh, this again..
<fwereade_> niemeyer, sorry
<fwereade_> niemeyer, believe me the thorn has been in my side too :)
<rog> fwereade_: i'd like to see your code for relationUnitWatcher
<niemeyer> fwereade_, rog: So, rather than evaluating yet another proposal from the ground up, what is the problem with the one brought up in the mailing list?
<rog> fwereade_: it's your code, you say :-)
<fwereade_> niemeyer, the problem is that I tried my best and was not satisfied with the result
<niemeyer> fwereade_: That's not a problem.. that seems like a consequence of the problem
<rog> from my point of view there was lots of code for little result
<rog> having the config type hold attributes for two different things made things awkward.
<fwereade_> niemeyer, well, my version is still up on codereview if you want to take a look at that; I was not happy with it, though, and I didn't think you would be either
<rog> conversely, embedding it seems to make things fall out more naturally.
<niemeyer> fwereade_: Heh..
<fwereade_> niemeyer, if I misjudged that then sorry; as it was I felt I'd screwed up the 3rd or 4th attempt at implementing it, and cutting my losses did not seem like a bad idea, especially since it let me get back to the unit agent
<niemeyer> fwereade_: Is there a reason you can identify why that is the case? :-)
<rog> niemeyer: the ComposeConfig function in this file seems to me to epitomise the difficulties: https://codereview.appspot.com/6353092/diff/2001/environs/ec2/provider.go
<fwereade_> niemeyer, I feel like the actual code is ugly and hard to follow and I'm still suspicious of corner cases despite heavy testing
<niemeyer> rog: The proposal didn't even have this method
<rog> niemeyer: it's fwereade_'s name for Validate
<rog> niemeyer: same semantics
<fwereade_> niemeyer, my best guess is that I Just Do Not Get It, and have failed to apprehend some aspect of what you're looking for
<niemeyer> rog: There was nothing in config with that semantics, whatever its name
<fwereade_> niemeyer, indeed; I tried to do what you were asking but I presumably misunderstood something, or was too obsessive about validation, or *something* that I cannot precisely identify
<niemeyer> fwereade_: Okay.. I think I'll give it a try then
<fwereade_> niemeyer, I look forward to the inevitable schooling ;p
<fwereade_> niemeyer, but I'm sorry to add more to your plate :(
<niemeyer> fwereade_: Well, one of us will definitely learn something.. not sure which yet :-)
<fwereade_> niemeyer, we'll see :)
<rog> [Monday 09 July 2012] [14:54:01] <rog>	niemeyer: something like this, perhaps? http://paste.ubuntu.com/1082778/
<rog> [Monday 09 July 2012] [14:54:26] <niemeyer>	rog: Yeah, except it should be called Validate as we've been agreeing on
<niemeyer> ?!
<rog> niemeyer: it takes an old config, a new config and returns a validated config by composing the two. seems fairly similar.
<niemeyer> rog: Heh
<rog> niemeyer: but like william, i probably misunderstood
<niemeyer> rog: No, you didn't.. you just changed the proposal
<niemeyer> rog: That paste is very different from what the CL has
<niemeyer> rog: Anyway, would you mind if I tried to implement the proposal made, in smaller chunks?
<rog> niemeyer: that's true, but i tried to move in that direction and failed.
<rog> niemeyer: sigh
<rog> niemeyer: i think we have spent quite a long time on this
<niemeyer> rog: Yes, we have indeed!
<rog> niemeyer: and i don't think my proposal is too shit
<niemeyer> rog: Yet all the time I spent on it seems a bit wasted
<niemeyer> rog: Is my proposal shit then?
<niemeyer> :-/
<rog> niemeyer: i didn't see yours implemented
<rog> niemeyer: but i'm sure fwereade_ did the best he could
<niemeyer> rog: fwereade_ was working on it, supposedly
<niemeyer> rog: and you came up with something else..
<rog> niemeyer: yes, because i tried to make something like yours work, and couldn't.
<fwereade_> niemeyer, I was working on it, and I proposed a CL that pretty clearly betrayed my state of despair
<rog> niemeyer: and i think that it ended up really quite nice a different way
<fwereade_> niemeyer, so from my perspective it was a helpful thing to do that brought a great sense of relief
<niemeyer> That's all fine, but please don't judge me as I try to implement what I actually presented and was ignored
<fwereade_> niemeyer, I would love to see you make it work
<niemeyer> "We've spent quite a long time on this", as you say
<fwereade_> niemeyer, please do not think I did not try
<rog> niemeyer: ok, have a go
<niemeyer> Thanks
<fwereade_> niemeyer, and as you say one of us will learn from it; I suspect it will be me, because it usually is :)
<rog> niemeyer: seems like we have already got something workable, but go for it
<niemeyer> rog: We've got lots of things workable.. fwereade_ had a different proposal a while ago that was workable too
<rog> niemeyer: fair enough
<niemeyer> rog: I don't appreciate the fact New returns "unknown" for example
<rog> niemeyer: ?
<niemeyer> rog: Nor the fact that there's no way for an environment to validate settings
<rog> niemeyer: what?
<niemeyer> rog: All of those things in your proposal were already debated
<rog> niemeyer: which New returns "unknown"?
<rog> niemeyer: and all environment settings are validated in my proposal
<rog> niemeyer: at least, that was the intention
<niemeyer> rog: There's no hook point as far as I can see to compare an old and a new configuration
<niemeyer> rog: and prevent changes at the client side
<rog> niemeyer: environs.Config.Change
<niemeyer> rog: In case the modification is rendered invalid
<niemeyer> rog: This must be per environment
<rog> niemeyer: it *is* per environment
<niemeyer> rog: I don't think I understand
<rog> niemeyer: environs.EnvironConfig is implemented by each environment
<niemeyer> rog: Ah, I see
<rog> niemeyer: you can embed config.Config to get the common stuff
<niemeyer> This is the unknowns I was talking about:
<niemeyer> // New creates a new Config from the given attributes, and also returns any attributes
<niemeyer> // not known.
<niemeyer> func New(attrs map[string]interface{}) (*Config, map[string]interface{}, error)
<niemeyer> rog: How do we return this from state?
<rog> niemeyer: that's because it's designed to be embedded.
<niemeyer> rog: How do we return this from state?
<rog> niemeyer: have a look at how it's used in https://codereview.appspot.com/6343107/diff/11001/environs/ec2/config.go
<niemeyer> rog: How do we return this from state?
<niemeyer> :-(
<rog> niemeyer: what state?
<niemeyer> rog: *State.EnvironConfig()
<rog> niemeyer: i don't think we need to do that.
<niemeyer> rog: Heh
<niemeyer> rog: That whole conversation started there
<rog> niemeyer: i know, but i think it's a false premise
<rog> niemeyer: but...
<rog> niemeyer: if we want it to, then EnvironConfig could live inside the config package
<niemeyer> rog: and then all the registry has to live there as well
<rog> niemeyer: well... either state uses an environ to validate what comes out of the db or not
<rog> niemeyer: if it *does* then there has to be some registry not in environs
<rog> niemeyer: if it does not, then it cannot return a validated environ config
<niemeyer> rog: We can have a first class type being returned from state even if it's not validated
<niemeyer> rog: Because it was already validated on entrance
<rog> niemeyer: why not just return the attributes?
<rog> niemeyer: same difference
<niemeyer> Maybe I should just give up on this.. it's like the fourth time I'm going over this
<niemeyer> Rather than converging there's a brand new implementation every time
<rog> niemeyer: so your objection to my branch is that we can't return a config.Config from state?
<niemeyer> rog: I'm just going bit by bit about how the proposal sent to the mailing list, for which there's no reply other than "looks great", was put in place
<rog> niemeyer: are you talking about my proposal now, or william's, or yours?
<niemeyer> rog: Heh
<TheMue> So, next iteration of firewaller is in for review: https://codereview.appspot.com/6404051 . And this time the adding of the former CL as a requisite worked. Yeah!
<rog> TheMue: i'm taking a look
<niemeyer> rog, TheMue, fwereade_, Aram: Are any of you planning on buying a Raspberry Pi?
<rog> niemeyer: not currently
<rog> niemeyer: i don't have any spare geek time :-)
<niemeyer> rog: Hehe :)
<fwereade_> niemeyer, not really, I like the idea but doubt I'd actually do anything with it if I had one ;)
<fwereade_> niemeyer, what rog said ;)
<niemeyer> rog: That's wise.. I have a bunch of hardware I've never played at home :-)
<niemeyer> Some of them did make up for some good time, though
<rog> niemeyer: yeah, and what fwereade_ said too - too much decaying h/w in the attic
<niemeyer> mramm: What about you, any plans?
<TheMue> niemeyer: I'm not planning to; I've never been good at HW things. Even if it looks interesting.
<mramm> niemeyer: no plans yet
<niemeyer> Crap :)
<mramm> niemeyer: but I keep thinking about it
<TheMue> niemeyer: But a box with a Tilera 64 would be nice. ;)
<Aram> niemeyer: I am planning to buy one (or more), why?
<niemeyer> Aram: Because they are now available, and I was going to ask someone to take one to the sprint in case they arrived on time
<niemeyer> Aram: http://www.raspberrypi.org/archives/1588
<niemeyer> Aram: But now that I read it, it won't fly
<Aram> I'll order some, but it won't get in time
<niemeyer> It's taking months to deliver
<niemeyer> Yeah
<mramm> There is a guy at the co-working space I sometimes use who has a couple
<mramm> but he's been on the list forever
<mramm> and has a startup that's doing a bunch of open source hardware/computer vision stuff
<mramm> so he's unlikely to let me take one away from him for 3 weeks
<mramm> (which is how long I'll be away counting this sprint through the one on the isle of man)
<niemeyer> mramm: Aw :)
<mramm> haha
<niemeyer> Okay, I'll grab some quick lunch and bbiab
<rog> TheMue: you have a review
<rog> TheMue: i'm concerned about the unsafe access to firewaller private variables from the tests.
<rog> TheMue: they could be flaky in a very hard-to-diagnose way in heavily loaded environments
<rog> TheMue: how about something like this, to allow instrumenting the internal state: http://paste.ubuntu.com/1095153/
<rog> TheMue: again, with a comment on CheckProgress this time: http://paste.ubuntu.com/1095160/
<TheMue> rog: Thx, I have to dig deeper into your idea.
<rog> TheMue: how do you mean?
<TheMue> rog: To understand it. I've only had two quick looks at it.
<rog> TheMue: which idea?
<TheMue> rog: Huh, the one you posted above.
<rog> TheMue: ah, the test instrumenting idea?
<TheMue> rog: Yes.
<niemeyer> "The conference will begin when the LEADER arrives.. there are.. currently.. *9* other participants."
<rog> niemeyer: ha
<TheMue> rog: Where do I access private variables?
<rog> TheMue: i don't actually like it that much, because it clutters the runtime code for the tests, but i don't see a better alternative
<mramm> niemeyer: yea, the hold music is particularly awesome today too
<rog> TheMue: in CheckProgress you can safely call functions defined in export_test.go that access private variables
<niemeyer> mramm: Seems the same for me
 * niemeyer tries to code without being disturbed by the bad music
<rog> TheMue: i thought about another alternative that adds a global variable, but i didn't like it
<TheMue> rog: In other places we use that kind of export_test too
<TheMue> rog: No, indeed, no global.
<rog> TheMue: there's nowhere as far as i know that accesses private variables unsafely.
<fwereade_> so, it turns out somehow it's past 7
<mramm> niemeyer: perhaps I just haven't listened to it long enough
<fwereade_> gn all, see you tomorrow
<mramm> fwereade_: good night
<rog> fwereade_: gn, enjoy the evening!
<TheMue> fwereade_: enjoy
<rog> i've gotta go in a moment too
 * TheMue too, but later I'll take a deeper look into rog's ideas
<rog> TheMue: another (and worse) alternative is to put a mutex around the private variables.
<TheMue> rog: No, the tests shouldn't bother the production code
<rog> TheMue: i agree, but i don't see a good alternative
<rog> TheMue: we're trying to introspect the internal state of the firewaller
<rog> TheMue: i was trying to make the instrumentation code as minimal as possible, but it's still there.
<rog> TheMue: to be honest, i wouldn't mind no tests at all for this code and lots of tests for the final behaviour which is what we're after
<TheMue> rog: So maybe I could keep it while we're approaching the final state and then remove it.
<rog> TheMue: but until we get there, something like this is useful i think. we can delete these tests later.
<rog> TheMue: exactly
<rog> TheMue: if there was a "test" build flag, we could do it with no overhead...
<TheMue> rog: ??? How?
<rog> TheMue: i'd have two type definitions, one in testing mode and the other in non-testing mode. the non-testing mode type would have stubs for the functions, which would be empty. the testing mode would have members like testStub in my idea.
<rog> TheMue: calls to empty functions are removed by the compiler, so... no overhead.
<rog> right, i'm off for the day. see y'all tomorrow.
<TheMue> rog: Interesting, thx for sharing. We should discuss it tomorrow.
<rog> TheMue: definitely
<TheMue> rog: Enjoy, it's dinner time here too
 * TheMue waves
<niemeyer> davecheney: yo!
<niemeyer> davecheney: how're things going there?
<davecheney> niemeyer: hey, how was your conference ?
<niemeyer> davecheney: It was pretty good
<niemeyer> davecheney: Always a chance to meet old and new friends
<niemeyer> davecheney: and exchange ideas on how things have been moving
<davecheney> indeed
<niemeyer> davecheney: How're things going for you?
<davecheney> good, just taking a spin through roger's and william's comments
<niemeyer> davecheney: Do you have a moment for a quick call?
<davecheney> sure
<davecheney> skype or g+ ?
<niemeyer> davecheney: G+, if that works
<davecheney> two secs
<davecheney> niemeyer: just rearranging the myriad of usb devices
<niemeyer> davecheney: Cool, no worries
<davecheney> i'm online, but you do not appear to be
<niemeyer> davecheney: I've just invited you
<niemeyer> davecheney: Maybe the wrong you? :)
<davecheney> niemeyer: maybe, nothing here
<niemeyer> https://plus.google.com/hangouts/_/4ac7f87247068630837d29e94eee4ad282d7c8f7?authuser=0&hl=en
<niemeyer> OK: 12 passed
<niemeyer> PASS
<niemeyer> ok      launchpad.net/juju-core/worker/provisioner      8.444s
<niemeyer> davecheney: !!
<niemeyer> :-)
<davecheney> niemeyer: whoop whoop
#juju-dev 2012-07-17
<niemeyer> davecheney: Proposal in the list
<rog> davecheney: mornin'
<davecheney> rog: howdy
<fwereade_> rog, heyhey
<rog> fwereade_: mornin' guv
<fwereade_> rog, how's it going?
<rog> fwereade_: am finally looking at relation-units-watcher BTW...
<fwereade_> rog, cool, if you see any obvious simplifications I will be most pleased
<fwereade_> rog, it's an awful lot better now it has watch-presence-children to depend on
<rog> fwereade_: the main simplifications would be gained, i believe (although i haven't properly verified), by applying the CL i proposed yesterday
<rog> fwereade_: i may be wrong, but all that tomb checking seems unnecessary.
<fwereade_> rog, so how do I avoid sending changes after departs?
<rog> fwereade_: instead of keeping two maps, one with tombs and one with names, i'd keep a single map with a locally defined relationUnit type.
<fwereade_> rog, maintaining state on the loop goroutine feels kinda hacky to me, like saying "we can't make this do the right thing so patch it up later"
<rog> fwereade_: then it would be trivial to mark that so that you don't issue any updates after a depart
<rog> fwereade_: you're already maintaining state with the tombs
<fwereade_> rog, (state to filter the changes, I mean)
<rog> fwereade_: the unitLoop code would be much simpler, for one
<rog> fwereade_: you'd have a couple of extra lines to maintain the state, but about 30 less (total guess!) lines overall
<fwereade_> rog, and, hmm, I don't think I need names at all anyway, do I?
<fwereade_> rog, oh bother, yes I do, for removes
<rog> fwereade_: and the same applies to TheMue's firewaller code. i saw your suggestion to add tombs and i thought, yes you're right, the way things are, we do. but... it's already much bigger than it needs to be, and unnecessary if we just changed our invariants a little
<rog> fwereade_: all the code looks very reasonable given our current rules BTW. i'm just thinking that we could make things simpler if we wanted to.
<fwereade_> rog, I think making those guarantees makes our lives simpler overall
<fwereade_> rog, your mileage clearly varies on this ;)
<rog> fwereade_: clearly :-)
<rog> fwereade_: how does it make our lives simpler?
<rog> fwereade_: given that it's adding lots of code (IMHO obviously)
<fwereade_> rog, it's easier to reason about the components when we use them
<fwereade_> rog, a Stop that doesn't really stop feels kinda weak to me
<rog> fwereade_: if you want a Stop that really stops, it's trivial to do.
<rog> fwereade_: you just stop and wait for eof on the channel.
<rog> fwereade_: think of it as flushing the pipeline
<rog> fwereade_: where the pipeline consists of components like unitLoop
<fwereade_> rog, indeed, we can't do it when we have components like unitLoop; we have to add that deadness state
<rog> fwereade_: actually, i think we could do it without the deadness state
<rog> fwereade_: you *already* do t.Kill then t.Wait
<rog> fwereade_: which is the moral equivalent
<fwereade_> rog, I thought you were saying the tombs were unnecesssary?
<rog> fwereade_: they are
<fwereade_> rog, and to make that work I need the selects that you don't like
<rog> fwereade_: i'm saying that you could close the unit's stop channel, then wait for eof on its update channel.
<fwereade_> rog, we have one updates channel for all unit loops
<rog> fwereade_: oh yeah, good point :-)
<rog> fwereade_: i knew there was a reason
<fwereade_> rog, as I said I'm really +-0 on the change -- it feels to me like the things it makes easier are balanced by the things it makes harder
<fwereade_> rog, either would be a reasonable way to start out but I don't feel like a switch in midstream really gains us much
<rog> fwereade_: from my point of view, we are just starting out on the real uses of watchers.
<fwereade_> rog, I think I'm just less bothered by all the selects than you are ;)
<rog> fwereade_: which is why i'm suggesting the change now
<fwereade_> rog, in the end it's niemeyer's call and I don't think he's convinced... but maybe he will be by looking at relationUnitsWatcher :)
<fwereade_> rog, I don't feel it's a win in this context but I am probably too close to the current implementation to be entirely objective
<rog> fwereade_: i'll try and rephrase relation units watcher to try and show the advantage. i'll probably fail :-)
<fwereade_> rog, either way one of us will learn something :)
<rog> fwereade_: i suppose it's coming from my experience with programming with channels and that working with one-way pipelines is much more elegant than two-way comms.
<fwereade_> rog, makes sense; I personally feel that a one-shot 1-bit pathway in the second direction gives us some explicit clarity that I find helpful
<fwereade_> rog, I may well come to change this view in time :)
<rog> fwereade_: here's how i might expect unitLoop to look, BTW: http://paste.ubuntu.com/1096214/
 * fwereade_ remains unconvinced :)
 * fwereade_ is also going to pop out for some breakfast, he was up late reviewing last night
<rog> fwereade_: enjoy. i haven't broken my fast yet...
<fwereade_> rog, cheers
<Aram> moin.
<fwereade_> rog, do you recall there was a discussion somewhere about journalling for workflow state transitions in the UA?
<rog> fwereade_: yeah
<fwereade_> rog, I was thinking of doing something like that and thought I should probably catch up with the consensus
<rog> fwereade_: i made some notes, which ended up in one of the google docs
<rog> fwereade_: one mo, i'll find 'em
<fwereade_> rog, awesome, tyvm
<rog> fwereade_: they'd been removed from that doc, but i found them anyway... http://paste.ubuntu.com/1096304/
<fwereade_> rog, cheers
<rog> fwereade_: i'd had a few more thoughts further along those lines; i'll see if i can find some more notes
<fwereade_> rog, hmm, ok, this is for hook executions specifically?
<rog> fwereade_: yes
<rog> fwereade_: well, anything that takes any time really
<rog> fwereade_: so that we don't miss updates
 * fwereade_ ponders
<Aram> is the meeting in one hour or two hours?
 * Aram is confused about TZ again.
<fwereade_> rog, it kinda feels like it should be for state transitions rather than hook executions
<fwereade_> Aram, I *think* it's 2h
<rog> Aram: 2 hours, i think
<Aram> thanks.
 * Aram heads for a snack.
<rog> fwereade_: perhaps
<rog> fwereade_: i wanted to avoid writing all state data to the log file
<fwereade_> rog, if you mean the settings of every unit, I agree
<fwereade_> rog, but I think in general we do need to keep track of known membership per-relation and latest settings version per-related-unit
<rog> fwereade_: my idea was that each thing that can trigger a hook corresponds to a single event from some watched thing, and each watched thing has a version.
<fwereade_> rog, not true
<fwereade_> rog, install/start don't
<fwereade_> rog, config-changed does
<rog> fwereade_: install/start i think can be treated as special cases.
<fwereade_> rog, more critically, relation unit changes are not 1:1 with hook execution
<rog> fwereade_: agreed.
<rog> fwereade_: it's 1:N, right?
<fwereade_> rog, I think the only quibble is that we always need to do a -changed after a -joined
<fwereade_> rog, so in practice IMO that actually means we should always precede a -changed of a hitherto unknown unit with a -joined
<rog> fwereade_: that should work ok
<fwereade_> rog, yeah, the code seems to want it :)
<rog> fwereade_: that was the idea behind "for all outstanding intentions"
<fwereade_> rog, ahhh, got you
<rog> fwereade_: because you might execute a joined but not its associated changed
<fwereade_> rog, except, hmm, consider -broken
 * rog considers broken.
<fwereade_> rog, that overrides *everything*
<fwereade_> rog, clear the queue, just break the relation
<rog> fwereade_: remind me of the semantics of broken
<fwereade_> rog, called when the unit itself leaves the relation
<fwereade_> rog, doesn't have access to any useful state really
<fwereade_> rog, similarly, actually, not sure how queue reduction fits in
<rog> fwereade_: sorry, i don't remember at all how -broken works.
<fwereade_> rog, it's just the hook for "this specific unit is no longer part of the relation"
<rog> fwereade_: queue reduction is easy i think
<rog> fwereade_: assuming i understand what you mean by that
<fwereade_> rog, stuff like "a -changed followed by a -departed" should just be a -departed
<rog> fwereade_: the idea is that we only add to intentions when we are just about to execute the hook
<rog> fwereade_: so the queue reduction can happen before the intentions make it to disk
<fwereade_> rog, ok, cool
<fwereade_> rog, so it's just for what in python is the HookScheduler, rather than the workflow stuff
<rog> fwereade_: the workflow stuff?
<rog> fwereade_: the stuff that works out what hooks to execute given what state changes?
<fwereade_> rog, keeping track of the states of the unit and its relations
<fwereade_> rog, more or less yeah
<rog> fwereade_: i'm imagining that the workflow could be a goroutine pipeline component
<rog> fwereade_: which takes in any changes and spits out intentions
<fwereade_> rog, yeah, I'm having leanings in that direction but not quite sure how it will all fit together
<rog> fwereade_: but i'm not entirely sure how that would work, it's just an initial gut feeling
<fwereade_> rog, it needs to store its own state in ZK for sure
<rog> fwereade_: what state does it need to store?
<fwereade_> rog, er, its state -- charm_upgrade_error, or installed, or running, or whatever
<fwereade_> rog, so status can see it
<fwereade_> rog, I think state maintenance and actual hook execution are distinct problems
<rog> fwereade_: i'm wondering if it would be better that the central loop did that
<rog> fwereade_: i see hook execution and state maintenance as two sides of the same coin
<fwereade_> rog, no question that they're intimately connected
<fwereade_> TheMue, morning
<rog> TheMue: hiya
<TheMue> rog, fwereade_ : Morning
<rog> fwereade_: thing is, some of the state maintenance transitions come from the results of hook executions, i think
<fwereade_> rog, some but not all
<rog> fwereade_: which seems to argue in favour of having the central hook-execution loop responsible for all of them, that way there's only one thing in charge of the state.
<fwereade_> rog, I don't think that's right... hook execution is a single and relatively simple responsibility
<fwereade_> rog, having the single hook executor goroutine responsible for keeping the separate workflows of the unit and all its relations up to date, especially when not all workflow transitions necessarily correspond to hook executions, sounds like a nightmare
<fwereade_> rog, *some* external state changes imply workflow state transitions, *some* of which imply hook executions
<rog> fwereade_: hmm, it seems like there's a two way flow, which i hadn't appreciated before. i.e. a different sequence of hooks will be executed depending on the results of previous hooks
<fwereade_> rog, yeah, for example if we're in an error state we won't be worrying about any other transitions that we otherwise might be expected to pay attention to
<rog> i'd been imagining: {external state changes -> workflow state transitions -> hook executions} as a one-way pipeline flow
<fwereade_> rog, I *think* that still applies, what did I miss?
<rog> fwereade_: but perhaps an error state just implies throwing all workflow state transitions away, which would be easy
<fwereade_> rog, I *think* it means treating that workflow as though the UA was not executing at all
<rog> fwereade_: right
<fwereade_> rog, and then when it's active again doing a big-bang diff against the last known state when the workflow was sane
<fwereade_> rog, and executing whatever changes come from that
<rog> fwereade_: "having the single hook executor goroutine responsible for keeping ...". that wasn't my intention
<fwereade_> rog, ah, sorry, I guess I'm just not clear what the intentions correspond to if not hook executions
<rog> fwereade_: i'd thought {external state changes -> workflow state transitions -> (hook executions + state maintenance)}
<fwereade_> rog, I just can't yet figure out how many places we actively need journalling, and of what sort
<fwereade_> rog, I'm not sure the hook *executor* needs it at all but I haven't yet even convinced myself
<rog> fwereade_: the question is: if we crash half way through executing a hook, how do we know that we need to execute it again when restarting?
<rog> fwereade_: actually perhaps the pipeline could be extended: {external state changes -> workflow state transitions -> hook executions -> state maintenance}
<fwereade_> rog, I need to refresh my memory with the python it seems :)
<rog> fwereade_: because AFAICS all state maintenance is done as a result of hook executing (and charm upgrade, presumably, which i think could be considered similar)
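The one-way pipeline rog describes could be sketched as chained goroutine stages, each consuming the previous stage's channel; the string payloads and stage behaviour here are purely illustrative:

```go
package main

import "fmt"

// pipeline wires up two of the stages from the discussion:
// {external state changes -> workflow transitions -> hook executions}.
// Each stage is a goroutine; closing the input channel shuts the whole
// pipeline down in order.
func pipeline(changes <-chan string) <-chan string {
	transitions := make(chan string)
	go func() { // external state changes -> workflow transitions
		defer close(transitions)
		for c := range changes {
			transitions <- "transition(" + c + ")"
		}
	}()
	executed := make(chan string)
	go func() { // workflow transitions -> hook executions
		defer close(executed)
		for t := range transitions {
			executed <- "ran hook for " + t
		}
	}()
	return executed
}

func main() {
	changes := make(chan string, 1)
	changes <- "unit-joined"
	close(changes)
	for s := range pipeline(changes) {
		fmt.Println(s) // ran hook for transition(unit-joined)
	}
}
```

The error-state handling discussed above would then be the exceptional path: tear the stages down and rebuild them, rather than feeding results backwards through the pipe.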
<fwereade_> rog, ok, a given workflow transition is associated with N lifecycle operations which may or may not execute hooks, any of which may in general cause an error and thereby a different workflow transition and set of lifecycle operations
<rog> fwereade_: does an error ever in fact cause anything more than the cessation of lifecycle operations?
<fwereade_> rog, it usually leads to a new state transition IIRC
<fwereade_> workflow transition
<rog> fwereade_: i'm not sure that answers my question
<fwereade_> rog, well, it leads to a different workflow transition than the one we expected
<fwereade_> rog, but in general all the workflow is responsible for is telling the lifecycle to do stuff and writing new state when it has done so
<fwereade_> rog, so the answer kinda depends on your perspective
<rog> fwereade_: that's the way the python code is done, yeah, but i'm wondering if we need that generality
<rog> fwereade_: because that generality is the thing that means that it's not a one-way pipeline
<fwereade_> rog, I think it's reasonable to suggest that the way the workflow and lifecycle can affect one another is confusing and could probably be done better
<rog> fwereade_: and i'm wondering if the error state is sufficiently special that we don't mind hard-coding that
<fwereade_> rog, we have a bunch of error states specific to what transition failed, and how we recover from them differs
<fwereade_> rog, I'm not certain this is an essential property of the system though
<rog> fwereade_: interesting
<rog> fwereade_: what sort of recovery are we talking about here?
<rog> fwereade_: i've got an idea
<fwereade_> rog, what hook to (maybe, depending on resolved mode) run again
 * fwereade_ listens
<rog> fwereade_: when we get an error, we treat that as a break in the pipeline, so we tear down the workflow goroutines, do whatever we need to get the show back on the road, then start it up again
<fwereade_> rog, yeah, I think so
<rog> fwereade_: that means that we've still got a nice clean one way flow, apart from termination, which is an abnormal condition, and therefore requires exceptional handling
<fwereade_> rog, from a relation unit's POV, recovering from an error state should be exactly the same as dealing with process restart
<rog> fwereade_: interesting
<fwereade_> rog, so we never watch anything except resolved when we're in an error state
<rog> fwereade_: and then restart everything
<fwereade_> rog, (these are tentative statements of opinion rather than fact)
<fwereade_> rog, but it feels like it might be a good way to go
<rog> fwereade_: i'd taken that for granted :-)
<fwereade_> rog, I suspect if anything the critical thing is to break up the lifecycle so that bits-that-affect-workflow are separate from bits-affected-by-workflow
<davecheney> evening lads
<rog> fwereade_: definitely
<rog> davecheney: hiya
<fwereade_> davecheney, heyhey
<davecheney> miss anything good ?
<rog> davecheney: a wee discussion about the unit agent stuff
<davecheney> rog: this channel is logged somewhere right ?
<davecheney> anyone know the url ?
<rog> davecheney: one mo
<rog> davecheney: i just google for "ubuntu irc logs"
<rog> davecheney: http://irclogs.ubuntu.com/2012/07/17/%23juju-dev.txt
<davecheney> lmgtfy
<davecheney> noice
<rog> davecheney: sadly it's updated very slowly
<davecheney> probably monitoring a tonne of channels
<TheMue> davecheney: Heya
<rog> davecheney: in this case it's only about 12 minutes out of date
<TheMue> davecheney: I added http://irclogs.ubuntu.com/ to my bookmarks, so the access is easy.
<rog> fwereade_: i'm sure this won't convince you, but for the record: https://codereview.appspot.com/6399052/diff/2001/state/watcher.go
<rog> TheMue: yeah, but you still have to manually navigate to the right date and the right channel, right? :-)
<fwereade_> rog, err and stop and stopped kinda feel like an ad-hoc tomb to me ;)
<TheMue> rog: To the one I want to have, yes. Multiple access paths (e.g. by channel) and search would be nice.
<TheMue> rog, fwereade_: Btw, thx for your reviews.
<fwereade_> TheMue, yw, hope they're useful
<rog> fwereade_: no more than: var err error; ... if err != nil { return } is an ad hoc tomb, IMHO...
<TheMue> rog, fwereade_ : I'll move all watchers into the loop and maybe use a shared environ from the provisioner.
<rog> fwereade_: what do you mean by "move all watchers into the loop"?
<fwereade_> rog, aren't they pretty much giving us all that tomb does, which just returning doesn't?
<fwereade_> TheMue, and I'm not sure the provisioner should actually be responsible for the environ anyway
<fwereade_> TheMue, feels to me like the agent process should get a State and an Environ, and be looking after the changes to the Environ, and provisioner/firewaller should both just be dumbly using the environ
<rog> fwereade_: +1, i think
<fwereade_> davecheney, thoughts? ^^
<rog> fwereade_: because we can just call SetConfig (or whatever it's called) on the environ
<TheMue> fwereade_: Sounds fine too
<fwereade_> rog, that's the idea
<fwereade_> rog, I *think* it should be safe
<fwereade_> rog, AIUI it's meant to be ;)
<rog> fwereade_: i think it was designed to be
<davecheney> fwereade_: yes, the provisioner needs to be responsible for the environ
<davecheney> that was an explicit request from gustabo
<davecheney> gustavo
<fwereade_> davecheney, hmm, I'm saying it shouldn't
<davecheney> fwereade_: i think you have a good case
<fwereade_> davecheney, or rather, I'm saying the agent should be but the worker should not
<davecheney> but that was an extended and painful review process
<davecheney> I would be loath to change it without guidance
<fwereade_> davecheney, heh :)
<fwereade_> davecheney, I know the feeling :)
<davecheney> fwereade_: the reason the firewaller, in my proposals, maintains an independent connection to the environ and the state
<davecheney> was based on a conversation at UDS-q
<davecheney> where it was mentioned that the firewaller service may not live with the provisioner
<davecheney> it only cohabits currently for convenient access to the secrets
<fwereade_> davecheney, hmm, interesting
<fwereade_> davecheney, not sure that's a reason for them to cohabit, is it?
<fwereade_> davecheney, it's not like *anything* is restricted from looking at the secrets ;)
<davecheney> fwereade_: indeed
<davecheney> so with that in mind the provisioner, machiner et al, all operate independently
<fwereade_> davecheney, STM like the agent processes should be responsible for setting up the bits their workers need, and then firing off the workers, but I guess we could make it work either way without duplication
<fwereade_> davecheney, it's the duplication in firewaller/provisioner that currently bugs me
<fwereade_> davecheney, and the fact that watching the environs complicates each of their main loops
<TheMue> fwereade_: Yep, see your point, sounds reasonable to me.
<rog> fwereade_: i think i agree. i don't see why the environment setting needs to be in the provisioner's main loop.
<TheMue> *: So in general, whoever uses a worker is responsible for passing an environment.
<fwereade_> rog, TheMue: cool, cheers
<fwereade_> TheMue, yeah, I think so, and is also responsible for keeping the Environ up to date, because any worker ought to be able to just deal with it
<davecheney> fwereade_: i'm surprised you're more worried about the environ than the state
<rog> fwereade_: it means, i think, that when the PA starts, it would do three things: start the environment settings agent and wait for an environ; start the provisioner and the firewaller with the environment thus obtained
<davecheney> as in, duplicates in process
<fwereade_> davecheney, I'm worried about both :)
<fwereade_> davecheney, I have comments on separate reviews whining about each of those things ;)
<davecheney> fwereade_: good, just checking
<fwereade_> rog, yeah, think so
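That division of labour might look roughly like the following; every name here is invented for the sketch (the real Environ and its SetConfig would also need synchronization, elided here):

```go
package main

import "fmt"

// Config stands in for the environment configuration.
type Config map[string]string

// Environ stands in for a provider environ; the agent keeps it current,
// and workers like the provisioner and firewaller just use it.
type Environ struct{ cfg Config }

func (e *Environ) SetConfig(cfg Config) { e.cfg = cfg }

// runAgent is the agent-side loop: it consumes environment config
// changes until the channel closes, applying each one to the shared
// Environ so the workers never watch the config themselves.
func runAgent(changes <-chan Config, environ *Environ) {
	for cfg := range changes {
		environ.SetConfig(cfg)
	}
}

func main() {
	env := &Environ{}
	changes := make(chan Config, 1)
	changes <- Config{"authorized-keys": "ssh-rsa AAAA..."}
	close(changes)
	runAgent(changes, env)
	fmt.Println(env.cfg["authorized-keys"])
}
```

This matches the startup order rog sketches: wait for an initial environ, then hand it to the provisioner and firewaller while the agent keeps applying config changes.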
<davecheney> so, who has the hangout invite ?
<niemeyer> mramm: I sent an invite a few minutes ago already
<mramm> https://plus.google.com/hangouts/_/79b690ed61262ab9e0bbaa673cd718b151275733
<mramm> ahh
<mramm> sorry
<Aram> you did?
<mramm> I'll join that one
<niemeyer> Aram: Yep
<Aram> damn flash
<Aram> meh
<Aram> wtf
<Aram> A connection error occurred while loading this page. Please try refreshing the page.
<niemeyer> https://bugs.launchpad.net/juju-core/+bug/1022954
<niemeyer> "un-in-progressing"
<niemeyer> fwereade's new verb
<Aram> milestones++
<fwereade_> davecheney, were my reviews clearish?
 * fwereade_ tries to remember what they were, all I can currently remember is that there were things I wasn't sure I'd said effectively
<fwereade_> davecheney, niemeyer: ah, yeah, that was it
<fwereade_> davecheney, niemeyer: doing authorized-keys "properly", ie responding to env-sets, feels to me like it's the MA's job
<fwereade_> davecheney, niemeyer: sane?
<niemeyer> fwereade_: Does sound sane to me
<fwereade_> davecheney, niemeyer: this is something we don't handle at all in the python but feels like it's necessary for env-set authorized-keys=BLAH to be useful
<davecheney> fwereade_: sorry, i'm unclear what you are talking about ?
<fwereade_> davecheney, https://codereview.appspot.com/6405046/diff/2001/environs/ec2/cloudinit.go#oldcode116
<niemeyer> fwereade_: Agreed
<niemeyer> fwereade_: But it sounds like a bug we can file and postpone
<davecheney> fwereade_: right, i'm with you now
<niemeyer> fwereade_: After all, it's not handled at all in Python :-)
<davecheney> as per my reply, it's important, but those comments should go somewhere else
<fwereade_> niemeyer, yeah, I'm not suggesting it's *high* priority, but I think it's something without which we can't reasonably call env-set "done"
<niemeyer> fwereade_: +1
<fwereade_> davecheney, sorry, I saw no replies
<davecheney> fwereade_: with your permission i'll copy that text into a bug for the backlog
<davecheney> is that ok ?
 * fwereade_ goes and pokes at his email client
<fwereade_> davecheney, that's great, thank you
<fwereade_> davecheney, it's a point I was raising as I thought of it rather than a request for immediate fixing
<davecheney> fwereade_: i have a very broad view of how bug trackers work
<davecheney> to me, they are simply places where you put things you don't want to lose
<davecheney> the rest is just semantics
<fwereade_> davecheney, yeah, makes sense
<davecheney> OMG, i just found a bug in LP; if you alter the field on the show bugs screen
<davecheney> the report bug link stops working
<davecheney> fwereade_: done, bug raised
<davecheney> fwereade_: if I can also draw your attention to http://codereview.appspot.com/6408047/#msg5, which may have been lost in the either
<davecheney> and http://codereview.appspot.com/6408047/#msg4
<davecheney> err, ether
<fwereade_> davecheney, was just coming to that
<fwereade_> davecheney, I'm still -1
<fwereade_> davecheney, I really think that a half-initialized machine is a Bad Thing
<niemeyer> fwereade_: I've replied to all your comments on the config branch.. can you please have a look and see what is sensible and what is not?
<niemeyer> fwereade_: I'll step out for breakfast and will come back to act on it
<fwereade_> davecheney, if initzk falls over half way through but everybody else keeps going we will get confused
<niemeyer> fwereade_: Will ping you before starting in case you'd like to discuss
<davecheney> fwereade_: yup, cloud init is not checking the return status of any of the commands
<fwereade_> davecheney, if we have a machine but the instance-id is not set, the PA will go ahead and provision it
<davecheney> witness the hilarity of the ec2 apt mirror snafu a few weeks ago
<fwereade_> davecheney, so I really think it should go inside Initialize
<fwereade_> davecheney, I *also* think Initialize should not *require* it
<fwereade_> davecheney, otherwise all our tests will break, because there will be 1 machine where there were once 0
<davecheney> fwereade_: I have to call it a night, i'm already in the doghouse
<fwereade_> davecheney, sorry :)
<davecheney> could you please raise this as a bug
<fwereade_> I still think it's a blocker on the CL
<davecheney> yup, assign it to me
<fwereade_> but I'll raise the other as a bug
<davecheney> i'll do it tomorrow
<fwereade_> niemeyer, looking now
<fwereade_> davecheney, cheers, awesome
<davecheney> night all
<niemeyer> fwereade_: if path != "" || keys == "" {
<niemeyer> <fwereade_> What happens when both path and keys are empty?
<niemeyer> fwereade_: So, tell me.. what happens? :-)
<rog> lunch
<fwereade_> niemeyer, doh :(
<fwereade_> niemeyer, but shouldn't there be a "whoa, you specified both" error path?
<niemeyer> fwereade_: I'm happy to define that when both are provided, the path takes over.. or to concatenate them
<niemeyer> fwereade_: This is addressing the point you brought up yesterday related to set-env
<fwereade_> niemeyer, I'm easy tbh
<fwereade_> niemeyer, I think I have overthought that particular functionality :/
<fwereade_> niemeyer, my perspective is as likely to be warped as it is to be correct
<niemeyer> fwereade_: I understand.. IMO it's fine to have a well defined behavior that makes the implementation simpler
<fwereade_> niemeyer, SVGTM
<niemeyer> fwereade_: I suspect concatenating them is the least-surprising behavior
<fwereade_> niemeyer, agreed in the small, feels maybe tricky to do the Right Thing on env-set
<niemeyer> fwereade_: Ah, yes
<niemeyer> fwereade_: The current behavior handles that properly I think
<TheMue> fwereade_: Do you have a quick link into your relation-unit branches? I would like to see how you do it with tombs in the sub-goroutines.
<fwereade_> TheMue, https://codereview.appspot.com/6405044/
<TheMue> fwereade_: Thx
<fwereade_> TheMue, given that you're using little types for the sub-goroutines, the tombs would probably be better placed on the types
<fwereade_> TheMue, but I *think* the general idea is sound
<TheMue> fwereade_: I'll place them in those types, yes.
<TheMue> fwereade_: Your deferred finish() is neat.
<fwereade_> TheMue, cheers :
<fwereade_> )
<niemeyer> rog: LGTM on https://codereview.appspot.com/6344113/
<rog> niemeyer: thanks a lot
<rog> niemeyer: we could return ports sorted, but i don't really see the point, as only the testing code is concerned about that, i think.
<niemeyer> rog: Thank you!
<niemeyer> rog: I'd be glad to see ports sorted whenever looking at that information, personally
<rog> niemeyer: if we do that, we'll want to implement something like state.PortSlice, otherwise every provider will need to do its own sort.Interface implementation for ports.
<rog> niemeyer: or state.SortPorts
<rog> i suppose
<rog> niemeyer: i'd be ok doing that if you'd like
<rog> niemeyer: state.SortPorts, that is
<niemeyer> rog: +1
<niemeyer> rog: We already have the implementation anyway
<rog> niemeyer: sure. will do.
<niemeyer> rog: Thanks!
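A state.SortPorts along the lines rog proposes could be as simple as the following; the Port fields here are assumptions for the sketch, not copied from juju-core:

```go
package main

import (
	"fmt"
	"sort"
)

// Port mirrors the rough shape of a provider port: a protocol plus a
// port number. The exact fields are invented for this sketch.
type Port struct {
	Protocol string
	Number   int
}

// portSlice implements sort.Interface once, so every provider can share
// it instead of writing its own implementation for ports.
type portSlice []Port

func (p portSlice) Len() int      { return len(p) }
func (p portSlice) Swap(i, j int) { p[i], p[j] = p[j], p[i] }
func (p portSlice) Less(i, j int) bool {
	if p[i].Protocol != p[j].Protocol {
		return p[i].Protocol < p[j].Protocol
	}
	return p[i].Number < p[j].Number
}

// SortPorts sorts the given ports in place, by protocol then number.
func SortPorts(ports []Port) { sort.Sort(portSlice(ports)) }

func main() {
	ports := []Port{{"udp", 53}, {"tcp", 443}, {"tcp", 80}}
	SortPorts(ports)
	fmt.Println(ports) // [{tcp 80} {tcp 443} {udp 53}]
}
```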
<niemeyer> fwereade_: https://codereview.appspot.com/6354045/ is still deleting .lbox
<fwereade_> niemeyer, gaah, I guess I forgot that :/
<niemeyer> fwereade_: Would you have a moment to talk about the logic in there?
<fwereade_> niemeyer, ofc
<niemeyer> fwereade_: Looking at line 509 on presence.go
<fwereade_> niemeyer, here?
<fwereade_> niemeyer, yep
<niemeyer> fwereade_: It strikes me as odd that we're firing another watch before we even cared to observe what the previous watch said
<niemeyer> fwereade_: I may just be missing the underlying logic, though
<fwereade_> niemeyer, the underlying idea is that AliveW returns a bool and a chan that sends the other bool
<fwereade_> niemeyer, if the bool we get out of the new AliveW is not the same as what was fired by the original watch, state has changed again since the original watch fired
<niemeyer> fwereade_: Well, if you look a third time, it may have changed again.. we may do that ad infinitum
<fwereade_> niemeyer, but I looked and got a watch which should be guaranteed to fire next time it changes
<fwereade_> niemeyer, the point is that if it differs, the latest known state is no different to the last one we notified of
<fwereade_> niemeyer, and therefore we should not send a spurious "hey dude, still alive (or dead)" event, and should just start again with the new watch
<niemeyer> fwereade_: Sorry, I still don't get it
<niemeyer> fwereade_: Why are we not doing that a third time, just in case?
<niemeyer> fwereade_: (it's not a tricky question.. it would help me understand)
<fwereade_> niemeyer, because the watch is guaranteed to fire when the value we get out changes
<niemeyer> fwereade_: Right.. that's true for the first watch too, right/
<niemeyer> ?
<fwereade_> niemeyer, the first watch has fired
<fwereade_> niemeyer, we know something has changed
<niemeyer> fwereade_: Exactly
<fwereade_> niemeyer, but it's just like using ZK normally
<niemeyer> fwereade_: We were just told something has changed.. why are we asking again?
<fwereade_> niemeyer, because it might have changed any number of times again between our being notified that something changed and our starting a new watch
<fwereade_> niemeyer, we need a new watch, right?
<niemeyer> fwereade_: Yes, and it may have changed any number of times again after the second watch
<fwereade_> niemeyer, the point of the watch is that if it has we'll immediately get notified next time through the loop
<niemeyer> fwereade_: Ah, I see
<niemeyer> fwereade_: So your point is that we need this watch anyway, so we may as well do it ahead of time and verify the second result already
<niemeyer> fwereade_: Right?
<fwereade_> niemeyer, yes, we need the watch; and the important current value is the one that comes out when we get the watch; and *if* that value differs from the one we got out of the first watch, the state has changed an odd number of times in the interim
<fwereade_> niemeyer, and therefore the latest state is the same as the last one we notified the client of
<fwereade_> niemeyer, and therefore we should not notify them
<niemeyer> fwereade_: Understood, makes sense, thanks for explaining
<fwereade_> niemeyer, a pleasure
<fwereade_> niemeyer, I imagine it could use an extra comment or two ;)
<niemeyer> fwereade_: Yeah, I'll suggest that
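The re-watch-and-compare logic being discussed might be sketched like this; `aliveW` is an invented stand-in for presence.AliveW, returning the current liveness plus a channel that fires on the next change:

```go
package main

import "fmt"

// watchLoop forwards liveness changes to out. When the old watch fires,
// it immediately takes a fresh watch via aliveW, and forwards the freshly
// read value only if it differs from the last value sent: if the state
// flipped an even number of times in the interim, the latest state equals
// what the client already knows, so no spurious event is sent.
func watchLoop(aliveW func() (bool, <-chan bool), out chan<- bool, done <-chan struct{}) {
	alive, watch := aliveW()
	out <- alive
	for {
		select {
		case <-done:
			close(out)
			return
		case <-watch:
			var current bool
			current, watch = aliveW() // new watch, and the value that matters
			if current != alive {
				alive = current
				out <- alive
			}
		}
	}
}

func main() {
	fires := make(chan bool)
	vals := []bool{false, true, true, false} // read at start, then after each fire
	i := 0
	aliveW := func() (bool, <-chan bool) { v := vals[i]; i++; return v, fires }
	out := make(chan bool, 4)
	done := make(chan struct{})
	go watchLoop(aliveW, out, done)
	fmt.Println(<-out) // false: initial value
	fires <- true
	fmt.Println(<-out) // true: a real change
	fires <- true      // fired, but value read back is still true: suppressed
	fires <- true
	fmt.Println(<-out) // false: the next real change
	close(done)
}
```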
<fwereade_> niemeyer, I'm pretty sure I readded the lbox as it should have been but I'm getting "error: Failed to send patch set to codereview: can't upload base of .lbox: ERROR: Checksum mismatch." -- once I have your review I think I will just clone it onto a fresh branch and link it to the original for continuity of discussion
<niemeyer> fwereade_: I'd prefer to fix it instead, if we manage to
<niemeyer> fwereade_: Every time we recreate a CL we lose all the context for the previous comments
<niemeyer> fwereade_: and I can't review the delta anymore
<niemeyer> fwereade_: There's probably a bug in lbox
<niemeyer> fwereade_: If needed, let's just remove and readd it
<fwereade_> niemeyer, that's what I thought I'd done
<fwereade_> niemeyer, codereview now seems to think that there are two .lbox files, one of which was deleted and another of which was added
<niemeyer> fwereade_: Can you uncommit the re-add?
<niemeyer> fwereade_: Or is it not tip anymore?
<fwereade_> niemeyer, done
<niemeyer> fwereade_: Ok, you'll probably have to force-push now
<niemeyer> fwereade_: Since the revision is already up
<niemeyer> fwereade_: Try "bzr push" just to confirm
<niemeyer> fwereade_: If that doesn't work, just make sure the branch URL is right (and not trunk!) and then do bzr push --overwrite
<fwereade_> niemeyer, ok, yep, needed to --overwrite
<niemeyer> fwereade_: You have a review
<niemeyer> fwereade_: It's really just cosmetic stuff, except for the fact the pinger has changed since the last review, and it's not clear why
<niemeyer> Lunch time.. biab
<rog> niemeyer: quick once-over before i submit? https://codereview.appspot.com/6344113
<niemeyer> rog: Awesome, thank you!
<rog> niemeyer: i take it that LGTY
<niemeyer> rog: Definitely, thanks
<TheMue> rog, fwereade_ : the firewaller's first part is at https://codereview.appspot.com/6374069 in again, will continue this way with the second part.
<TheMue> rog, fwereade_ : looks simpler now, and uses tombs for each machine.
<rog> TheMue: what happens to a machine units watcher when the machine it's watching is removed?
<rog> TheMue: i'm concerned that we might receive an error from that watcher before we see the machine removal, and thus cause the whole thing to fall over
<TheMue> rog: that's in the next branch. I think I'll add there a tomb and a finish() for all tombs too. but I'm not yet exactly sure.
<rog> TheMue: i'm concerned that something that i'd envisaged as about 6 lines of code has expanded to 50 lines. i don't think it should have to be this complex, but i don't quite see how to avoid it with our current conventions.
<rog> lc
<TheMue> rog: yeah, it's getting bigger again. *sigh*
<TheMue> rog: I would like to only have one range loop for machine and co too.
<TheMue> rog: but we also have to react on the firewall, on depending goroutines etc.
<rog> TheMue: i don't see a problem with doing that
<TheMue> rog: I'm listening
<TheMue> rog: btw, the code is now smaller than the one you lgtm'ed yesterday
<TheMue> rog: only the tomb is new, so we really know when the machine instance has ended working
<rog> TheMue: yesterday's code was 144 lines. today's is 165. how is that smaller?
<rog> TheMue: we already know when the machine instance has been removed - we can just flag it as dead and ignore it if it sends any events after we've killed it.
<rog> TheMue: that would only take about 4 lines of code.
<rog> TheMue: but i know fwereade_ has opinions about this too :-)
<TheMue> rog: the additional lines are two helpers, a constructor, a forgotten statement and comments for types and methods.
<fwereade_> rog, I'm still looking :)
<TheMue> rog: could you show me your 4 lines? last time they contained much pseudo code and error testing had to be added
<rog> TheMue: http://paste.ubuntu.com/1096832/
<rog> TheMue: no error checking is needed in that loop, i think.
<rog> TheMue: (it can be done by the central loop when it receives the nil change notification
<rog> )
<fwereade_> rog, I think I'm happier with more lines of code but clearer boundaries
<rog> fwereade_: personally, i find the logic with the myriad of tombs and watchers quite hard to follow
<rog> fwereade_: i like to keep concurrent code as minimal as possible
<TheMue> rog: and when the firewaller dies or stops, or a machineUnitsWatcher?
<rog> fwereade_: as it's quite easy to get wrong and hard to test fully
<rog> TheMue: if the firewaller stops, it kills the machineUnitsWatcher which gives EOF on its changes channel, which propagates through the machine loop and back to the firewaller which exits when all its sub-watchers have terminated.
<niemeyer> TheMue: Sent comments
<rog> TheMue: if a machineUnitsWatcher dies, it gives EOF as above.
<TheMue> niemeyer: thx
<niemeyer> rog: This looks sensible.. the only disadvantage is that any errors on the watcher are lost
<niemeyer> rog: It could be fixed by adding an err field to the Change though
<niemeyer> rog: Or someone else has to Stop() the watcher
<rog> niemeyer: you're talking about the above branch, presumable?
<rog> presumably
<niemeyer> rog: I'm talking about the above paste
<rog> niemeyer: ah, thanks
<niemeyer> rog: I think the logic in there works correctly, though
<niemeyer> rog: It'd probably be wise to move forward with it and let that kind of simplification for a follow up
<rog> niemeyer: the errors can later be retrieved because the central loop can interrogate the watcher itself.
<rog> niemeyer: or, as you say, add an err to the change type
<niemeyer> rog: It can, it has to interrogate all the watchers, and Stop them
<niemeyer> rog: Also must be careful not to close machinesChanges
<rog> niemeyer: definitely.
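The "EOF" convention rog describes might look like the following sketch, with all types invented: each per-machine goroutine forwards unit changes into the central channel and sends one final nil change when its watcher's channel closes, leaving it to the central loop to Stop() the watcher, collect any error, and drop the machine.

```go
package main

import "fmt"

// unitsChange tags a machine-units event with its machine; a nil units
// slice marks the watcher's "EOF" (its channel closed).
type unitsChange struct {
	machineId int
	units     []string
}

// forward copies events from one machine's watcher channel into the
// firewaller's central channel. No error checking happens here: when in
// closes, the final nil change tells the central loop to interrogate
// and stop the watcher itself. Note that out is never closed here, so
// other machines' forwarders can keep using it.
func forward(machineId int, in <-chan []string, out chan<- unitsChange) {
	for units := range in {
		out <- unitsChange{machineId, units}
	}
	out <- unitsChange{machineId, nil}
}

func main() {
	in := make(chan []string, 1)
	out := make(chan unitsChange, 2)
	in <- []string{"wordpress/0"}
	close(in)
	forward(0, in, out)
	fmt.Println(<-out)                 // {0 [wordpress/0]}
	fmt.Println((<-out).units == nil)  // true
}
```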
<fwereade_> TheMue, I'm starting to wonder what we get from the machine type that couldn't be got from (say) a (*Firewaller)machineLoop(*Machine, *tomb.Tomb)
<rog> niemeyer: but i don't think that's hard to ensure.
<niemeyer> It's not hard.. it's just another approach that TheMue will have to get right
<niemeyer> He seems to have gotten that one approach almost correctly
<niemeyer> So I'm tempted to suggest moving forward with it, and simplifying in a follow up
<rog> niemeyer: seems ok.
<fwereade_> TheMue, but then Firewaller has machines and machines has ports and neither of those seem to be used a great deal, and I think my issue is that I haven't closely followed the original discussion so I'm not quite sure of the plan
<TheMue> fwereade_: and then go that loop per machine? would be possible, but I think more complex in the end (there will be more types later). take a look at the Py code, all in one type with callbacks.
<rog> fwereade_: did you see my original sketch?
<fwereade_> rog, not closely enough that it stuck in the mind, I'm afraid
<rog> fwereade_: http://paste.ubuntu.com/1096860/
<TheMue> niemeyer: the small types at the end of the file are left from rog's paste. they grow with each new branch. the follow-up (already in) has a working unit (will rename it to unitTracker).
<fwereade_> rog, ok, I do remember that; not fully seeing the path from here to something-like-there at the moment
<niemeyer> TheMue: Okay
<fwereade_> rog, but, honestly, I has an EOD sleepy on and I think I should take a rest
<rog> fwereade_: TheMue's branch is the beginning of that
<niemeyer> TheMue: I suggest keeping the machineTracker with the current mechanism for now
<rog> fwereade_: it implements the "start machine units watcher for m; add it to machines" piece
<niemeyer> TheMue: To avoid delaying it much further with new logic
<fwereade_> rog, there's something I can't quite put my finger on about having ports on both machine and unit
<rog> fwereade_: i played around with a few configurations, but that one seemed to work best
<TheMue> niemeyer: sounds reasonable
<rog> fwereade_: because they're distinct things
<rog> fwereade_: the machine ports are a union of the unit ports on that machine
<fwereade_> rog, fair enough
<rog> fwereade_: and we need to keep track of the machine ports so that we know which ports to close when a unit's ports change.
<fwereade_> rog, I guess we don't have anything stopping two units on the same machine from attempting to mess with the same ports?
<rog> fwereade_: i discussed this with niemeyer
<TheMue> So, have to step out, Carmen calls for dinner. But will return later.
<rog> fwereade_: we thought it was reasonable if each port was considered "owned" by a given unit
<rog> fwereade_: then the open-port command should give an error if the port is owned by another unit
<rog> fwereade_: note that machine.ports in my sketch is a map from state.Port to *unit
<fwereade_> rog, yeah, that sounds sensible... and, ah-ha; yeah, I'm sleepy :(
<rog> fwereade_: ok, happy snoozes!
<fwereade_> rog, cheers
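The ownership rule rog and niemeyer agreed on could be sketched like this; the types are invented stand-ins (rog's sketch maps state.Port to *unit), with the machine's port set being the union of its units' open ports:

```go
package main

import "fmt"

// port stands in for a state.Port-like value.
type port struct {
	proto  string
	number int
}

// machine tracks the union of its units' open ports; each open port is
// owned by exactly one unit, so we always know which unit's change
// should close which machine port.
type machine struct {
	ports map[port]string // open port -> owning unit name
}

// openPort refuses the request if the port is already owned by another
// unit, matching the error the open-port command should give.
func (m *machine) openPort(unit string, p port) error {
	if owner, ok := m.ports[p]; ok && owner != unit {
		return fmt.Errorf("port %d/%s already opened by unit %q", p.number, p.proto, owner)
	}
	m.ports[p] = unit
	return nil
}

// closePort only lets the owning unit close its own port.
func (m *machine) closePort(unit string, p port) error {
	if owner, ok := m.ports[p]; ok && owner != unit {
		return fmt.Errorf("port %d/%s is owned by unit %q", p.number, p.proto, owner)
	}
	delete(m.ports, p)
	return nil
}

func main() {
	m := &machine{ports: map[port]string{}}
	fmt.Println(m.openPort("wordpress/0", port{"tcp", 80}))
	fmt.Println(m.openPort("mysql/0", port{"tcp", 80})) // refused: owned by wordpress/0
}
```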
<fwereade_> might be on a bit later, not sure
<niemeyer> fwereade_: Have a good one
<niemeyer> Doc appointment.. back in ~40mins
 * rog is off for the night. have fun, see y'all tomorrow
<niemeyer> Break time. biab
<mramm> niemeyer: have fun!
<fwereade__> niemeyer, ping
<fwereade__> niemeyer, so, I did a bit of archaeology and figured out what happened with those Pinger changes: long story short, they were part of the testing change that I thought I'd proposed among the other changes (and were therefore ok, because they were *changed* code but not *moved* and changed code), and it turns out they're not anywhere in the changesets in https://codereview.appspot.com/6348053/ ...so, I thought I'd proposed them, but I actually hadn't, and committed them anyway, which is a pretty monumental screwup :(
<fwereade__> niemeyer, I am very much aware that  this is Not Ok, and I guess it must have happened via a bedtime `lbox propose` whose results I didn't check
<fwereade__> niemeyer, but that's an explanation not an excuse
<fwereade__> niemeyer, and I guess we're just lucky that, actually, it *does* seem to be more stable now and by sheer luck (or maybe judgment, who knows) I didn't screw it up really badly
<fwereade__> niemeyer, I think
<fwereade__> niemeyer, um, anyway, flagellation over, but if you want to take over for a bit I wouldn't feel it was unjustified
<fwereade__> niemeyer, I guess it's good that presence.ChildrenWatcher has such a venerable history, nobody would have noticed if I'd reproposed from scratch as I was tempted to do :/
<davecheney> fwereade_: thank you for raising that issue, i'll take a swipe at it today
<fwereade_> davecheney, cool, cheers
<fwereade_> davecheney, I suspect it will be best to start by changing Initialize and leaving bootstrap alone for a bit, given that we expect some churn in environ config
<davecheney> i'm concerned that this change will be delayed by the other churn in that area
<davecheney> which was my motivation for proposing that it be addressed later
<fwereade_> davecheney, your instincts are probably better than mine: I was only thinking about that code because we need default-series for deploy, and I think you're more deeply involved than I am
<davecheney> fwereade_: i wouldn't say deeply :) i'm motivated primarily by the goal of having something to deploy
<fwereade_> davecheney, but because I wanted the default-series to be a quick fix before going back to the UA, I have been somewhat scattered about it
<fwereade_> davecheney, oh, me too ;)
<davecheney> anyway, i'll take a look now
<fwereade_> davecheney, but implementing deploy ended up feeling like more trouble than I had hoped :/
<davecheney> btw, juju status, the output, is that actual yaml, or just formatted to look similar ?
<fwereade_> davecheney, should be yaml
<fwereade_> davecheney, takes --format, defaults to yaml, also allows json
<davecheney> fwereade_: right, that is what i thought, but I was confused by the doc strings that appeared to dictate a format
<davecheney> or a layout, to be more exact
<fwereade_> davecheney, hmm, also dot, svg, png, it seems
<davecheney> map's gonna map
<fwereade_> haha :)
<fwereade_> davecheney, hmm, I just wanted to create a Hooker type :/
<fwereade_> davecheney, I can probably think of a better name
<davecheney> first bug logged against launchpad, do I get a badge or achievement award ?
<davecheney> fwereade_: forgive my thickness
<davecheney> but reading the bug you raised
<davecheney> i don't see how that is (directly) related to the problem at hand
<davecheney> which I thought was the population of machine/0 into the state after we had returned from initialise
<niemeyer> fwereade_: So what's the history there?
<niemeyer> fwereade_: Sorry, that's not the right question
<niemeyer> fwereade_: What do we want? :-)
<davecheney> niemeyer: two secs
<davecheney> niemeyer: http://codereview.appspot.com/6408047/#msg3, which became, https://bugs.launchpad.net/juju-core/+bug/1025656
<niemeyer> davecheney: Sorry, I'm a bit out of context
<niemeyer> davecheney: I was asking about the stuff fwereade_ said earlier
<davecheney> ok, ignore me then :)
<niemeyer> davecheney: But that's interesting too :-)
<niemeyer> davecheney: What is potentially racy, more precisely?
<niemeyer> davecheney: When you have a few minutes, I'd like to understand better what's going on there
<davecheney> niemeyer: racy because we are setting /initialised, then doing some more fudging of the state before the command exits
<niemeyer> davecheney: That part is totally fine
<niemeyer> davecheney: that node handles the lack of atomicity in zk
<davecheney> yes, i think so too, because the only consumers of those machine entries are started by cloud-init after that process has exited
<niemeyer> davecheney: It prevents code from trying to e.g. create other nodes before the fundamental parents even exist
<niemeyer> davecheney: That's irrelevant, IMO
<davecheney> what is irrelevant
<niemeyer> davecheney: AddMachine in State should work
<niemeyer> davecheney: That's a standalone assumption
<niemeyer> davecheney: Do I misunderstand?
<davecheney> no, we are in agreement
<davecheney> so, can I drop that comment block ?
<niemeyer> davecheney: Yeah, the comment seems to assume /initialized is something it's not
<davecheney> niemeyer: will do
<niemeyer> davecheney: Now, for the second part of the question: why are we adding a machine by hand like this rather than doing it the way we did in Python?
 * davecheney checks the python
<niemeyer> davecheney: I can describe, sorry.. I thought it was a conscious decision
<niemeyer> davecheney: This isn't hard-coded
<niemeyer>     sub_parser.add_argument(
<niemeyer>         "--instance-id", required=True,
<niemeyer>         help="Provider instance id for the bootstrap node")
<niemeyer> davecheney: Hmmm.. nevermind
<niemeyer> davecheney: Clearly my memories are failing me
<davecheney> niemeyer: yup, but the current state.Initialize provides no way to pass that value in
<davecheney> maybe it should
<niemeyer> davecheney: Don't worry.. I'm on crack
<niemeyer> davecheney: i was incorrectly complaining about something else
<davecheney> which is the bug I thought that william was going to raise for me last night
<davecheney> niemeyer: np
<niemeyer> davecheney: I'm surprised to see machine/0 hardcoded around initialize, but it *is* hardcoded in Python too
<davecheney> niemeyer: :)
<niemeyer> davecheney: Which clearly means we shouldn't worry about that now
<davecheney> niemeyer: which was my point
<niemeyer> davecheney: I can see why William was worried too, because in Python /initialize was guarding the machine creation
<davecheney> it could be better, but at the moment I think a todo in the code and bug in the issue tracker for the backlog should suffice
<niemeyer> davecheney: But that's not its reason for existence really.. I do recall that part :-)
<davecheney> niemeyer: hence my comment about a possible race
<niemeyer> davecheney: The problem was that it was extremely boring to guard every single action against the lack of existence of the critical parents
<davecheney> but given the consumer of that piece of information in the state is not started till jujud initzk returns
<niemeyer> Like /charms, /services, etc
<davecheney> yeah, that sounds dull
<niemeyer> davecheney: So /initialized came into play
<niemeyer> davecheney: But the machine creation is clearly outside of that need
<niemeyer> davecheney: So +1 on dropping the comment, and +1 on getting this in
<niemeyer> davecheney: The TODO should also not mention Initialize.. seems fine where it is
<niemeyer> davecheney: Some day we do want that to move into Bootstrap, though.. just not in the immediate future.. :-)
<davecheney> i'll raise a bug for the backlog
<niemeyer> davecheney: Thanks!
<davecheney> niemeyer: thanks for the review
<niemeyer> davecheney: np
<niemeyer> I'm also adding a comment about the stuff we just talked about
#juju-dev 2012-07-18
<TheMue> Morning
<fwereade_> morning TheMue, davecheney, rog
<davecheney> howdy
 * fwereade_ looks back at davecheney's conversation
<fwereade_> davecheney, interesting, sorry to lead you astray on /initialized
<davecheney> fwereade_: no harm done
<fwereade_> davecheney, I always saw it as meaning "if this node does not exist, do not trust the state, because it is wrong"
<davecheney> two commits today, so ain't nothing going to bring me down :)
<fwereade_> davecheney, sweet :)
 * TheMue would like to take his whole family with him to Lisbon. It's raining cats and dogs here today.
<TheMue> Maybe I should try a summer2012 *= -1
<rog> TheMue, fwereade_: mornin'!
<TheMue> rog: Heya
<fwereade_> rog, heyhey
<davecheney> rog: can I ask you a question about gnu flags ?
<rog> davecheney: ask away
<davecheney> i'll describe what I want to do, not what I am currently doing
<davecheney> to support multiple output formatters
<davecheney> I have a map[string]func(io.Writer)
<davecheney> which maps format names to a function that will format the output
<rog> davecheney: ok
<davecheney> but i'm having trouble satisfying the interface for whatever fs.Var takes
<davecheney> which needs Set() and String()
<davecheney> ideally i'd like to avoid having to make a struct { name string render func() } for each map entry
<rog> davecheney: you want a flag that selects the format?
<davecheney> just so we knew its name
<davecheney> yes
<davecheney> that defaults to yaml
<davecheney> rog: please hold, pasting
<davecheney> rog: http://paste.ubuntu.com/1097972/
<davecheney> this may not work exactly, i've been hacking on it
<Aram> moin.
 * davecheney waves
<rog> davecheney: so your difficulty is that you don't know how to define the String method on the formatterVar?
<davecheney> rog: not without having to pass a struct with the name of the formatter around
<davecheney> I was trying to keep it lean using only a function
<rog> davecheney: why not just do: type formatterVar string; and then index into the formatter map at a later stage?
<rog> davecheney: or have another method on formatterVar that returns the function.
<davecheney> the latter might be good
<rog> davecheney: by looking up into the map.
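A minimal sketch of the approach rog suggests: the flag value stores only the format name, and the rendering function is looked up in the map later, so `String` comes for free and no per-entry struct is needed. Names like `formatterVar` and the map contents are illustrative, not juju's actual code.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// formatters maps format names to output functions (entries are illustrative).
var formatters = map[string]func(io.Writer, interface{}) error{
	"yaml": func(w io.Writer, v interface{}) error { _, err := fmt.Fprintf(w, "yaml: %v\n", v); return err },
	"json": func(w io.Writer, v interface{}) error { _, err := fmt.Fprintf(w, "json: %v\n", v); return err },
}

// formatterVar satisfies flag.Value by storing only the name;
// the function itself is looked up at render time.
type formatterVar string

func (f *formatterVar) String() string { return string(*f) }

func (f *formatterVar) Set(name string) error {
	if _, ok := formatters[name]; !ok {
		return fmt.Errorf("unknown format %q", name)
	}
	*f = formatterVar(name)
	return nil
}

// render indexes into the formatter map at a later stage, as suggested.
func (f formatterVar) render(w io.Writer, v interface{}) error {
	return formatters[string(f)](w, v)
}

func main() {
	format := formatterVar("yaml") // default format
	fs := flag.NewFlagSet("status", flag.ContinueOnError)
	fs.Var(&format, "format", "output format")
	fs.Parse([]string{"--format", "json"}) // stand-in for real args
	format.render(os.Stdout, 42)
}
```

Because the value is just a string, unknown formats are rejected in `Set`, which is where flag parsing reports errors.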
<fwereade_> davecheney, take a look at cmd/jujuc/server/output.go
<rog> davecheney: all this stuff is identical to the standard flag package BTW
<davecheney> yeah, i've been looking in cmd/cmd and cmd/jujud for hints
<davecheney> give me a sec to play
<davecheney> thanks for your suggestions
<fwereade_> davecheney, the file I suggested should have ~exactly what you need already implemented
<davecheney> fwereade_: sweeeeeeeeeeeet
<rog> fwereade_: i *thought* i'd seen it before!
<fwereade_> :D
<davecheney> fwereade_: that is _exactly_ what I want
 * davecheney smells refactoring
<fwereade_> davecheney, awesome :)
<TheMue> Aram: Moin
<davecheney> fwereade_: can I move this to cmd/ ?
<fwereade_> davecheney, please do :)
<davecheney> as it will be used by several cmds ?
<fwereade_> davecheney, SGTM, you may want to write a couple of explicit tests for it
<davecheney> fwereade_: will do
<fwereade_> davecheney, beware the --test flag, which we probably don't want on juju status
<fwereade_> davecheney, but you should probably keep --output, pointless though it may be, in the service of interface stability
<davecheney> fwereade_: in that case, may I move c.out.testmode out of the formatter
<davecheney> as test mode is only used by the jujuc/server types
<Aram> fwereade_: there's a problem with multiple testing suites in mstate :).
<Aram> starting a replicated mongod takes 30 seconds, so tests take 200 seconds just to bootstrap!
<Aram> I tried to make mongod start faster but it can't be done.
<fwereade_> Aram, whoa :)
<fwereade_> davecheney, SGTM
<fwereade_> Aram, you *should* be able to do something like we do in testing
<fwereade_> Aram, so at least you share a mongo between tests in every package
<Aram> that's the plan I had in mind.
<davecheney> lbox propose over a cellular connection will probably be a fail
<fwereade_> Aram, and surely you can run most tests without a replicated mongod?
<Aram> that's another plan I had in mind.
<fwereade_> Aram, STM that running most of them without replication, but the actual mstate suite with, would be pretty plausible
<davecheney> fwereade_: rog: https://codereview.appspot.com/6426043
<fwereade_> davecheney, could we not have testMode on ClientContext please?
<davecheney> actually, please hold, still massaging
<davecheney> fwereade_: indeed, i moved it there
<fwereade_> davecheney, emphasis on "not"
<davecheney> it's in server.go, i'll move it into context.go
<fwereade_> davecheney, the commands do embed a ClientContext but I don't think the ClientContext is "commandy"
<davecheney> fwereade_: ahh, my mistake, i misinterpreted you
<fwereade_> davecheney, and I *think* that not every ClientContext-embedder actually wants that flag
<davecheney> only config-get and unit-get care about this
<davecheney> i'll add it explicitly to them
<fwereade_> davecheney, yeah, there will be more but you don't need to worry about them yet
<fwereade_> davecheney, I'm worrying about how I'm going to make all these things work enough for everybody ;p
<davecheney> fwereade_: PTAL
<fwereade_> davecheney, LGTM modulo tests :)
<davecheney> adding some testage to TestCommand now
<davecheney> fwereade_: rog aram: comments most welcome on tests "{\"Juju\":1,\"Puppet\":false}\n"
<davecheney> https://codereview.appspot.com/6426043
<rog> davecheney: does that test pass consistently?
<davecheney> rog: yup
<davecheney> i use a struct for the test data so I don't get screwed by map ordering
<rog> davecheney: oh yeah, it's struct not a map, duh
<davecheney> rog: don't worry, i made that mistake
<davecheney> i added a comment to warn others
<rog> davecheney: i wonder whether instead of specifying the entire error output, it might be better to make a regexp that just includes the most important bits.
<davecheney> rog: yeah, i'll make that change
<rog> davecheney: e.g. `usage.*invalid value \"xml\" for flag.*`
<davecheney> that was overkill
<rog> davecheney: actually i think it may have to be (.|\n)*
<davecheney> Assert(Equals) is already a regex, right?
<rog> davecheney: also, perhaps use a more-obviously unrecognised name than "xml"?
<rog> davecheney: nope, use Assert(Matches)
<rog> davecheney: it's just plausible we may add xml support in the future, i guess
<davecheney> rog: i can invent a more implausible output format
<rog> davecheney: i'm sure you can :-)
<rog> davecheney: you've got a review
<davecheney> thanks rog
<davecheney> rog: Assert(Matches) is having trouble with a multi line message
<rog> davecheney: did you use (.|\n)* ?
<davecheney> no ...
<rog> davecheney: i think you need to, as . doesn't match \n
<rog> TheMue: i wonder if another way of testing the internals of the firewaller might be to hook into the logger.
<davecheney> that got it
<davecheney> thanks
<rog> TheMue: you could make a little type with an Output method that grepped for certain strings.
<rog> TheMue: then there would be no need to have unsafe access to firewaller internals
<davecheney> rog: holy crap
<davecheney> if you include a CL url in the body of lbox commit message
<davecheney> it appends it to the old review
<davecheney> farq!
<rog> davecheney: you mean lbox submit ?
<davecheney> no, look what happened here
<davecheney> https://codereview.appspot.com/6405046#msg14
<rog> *sigh* random emails are going missing, and i don't know why
<rog> davecheney: in particular, i never saw the Submitted email, so i thought i was replying to an active review
<davecheney> yeah, that happens a lot
<rog> davecheney: i wonder why that message came out yellow
<davecheney> but I use my gmail for reviews
<davecheney> which is more reliable
<rog> davecheney: me too
<rog> davecheney: i use gmail for everything
<rog> davecheney: and it worked fine before
<rog> davecheney: i verified that it wasn't any of my filters misbehaving, and it's not going into Spam.
<davecheney> rog: https://codereview.appspot.com/6416044/
<davecheney> rog: email totally crapped out last week, thursday/friday
<davecheney> 24+ hours delays
<davecheney> maybe this is a continuation
<rog> davecheney: email from launchpad was always very slow, but it got there in the end.
<rog> davecheney: these days, i'm not sure i'm seeing any email from launchpad at all.
<davecheney> weird
<TheMue> rog: Instead of logging I could also add a chan of interface{} and send testing data to a goroutine where it's simply added to a slice. Additionally I can send funcs to that goroutine which iterate and type switch over the collected data.
<rog> TheMue: i like the logging approach because it doesn't make the core code do anything unusual
<rog> TheMue: there's no need to be sending to a chan in the normal code flow
<rog> TheMue: and having a log.Debugf call would look quite normal
<rog> TheMue: you could parse the messages trivially, with fmt.Scanf
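A sketch of the log-hooking idea rog describes: a small recorder type collects log lines so a test can grep for expected events, or parse them back with `fmt.Sscanf`, without unsafe access to the worker's internals. The recorder type and the log-line format here are invented for illustration; the real code would hook juju's own log target rather than the stdlib logger.

```go
package main

import (
	"fmt"
	"log"
	"strings"
	"sync"
)

// logRecorder captures log output for later inspection by a test.
type logRecorder struct {
	mu    sync.Mutex
	lines []string
}

func (r *logRecorder) Write(p []byte) (int, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.lines = append(r.lines, string(p))
	return len(p), nil
}

// grep reports whether any recorded line contains substr.
func (r *logRecorder) grep(substr string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	for _, l := range r.lines {
		if strings.Contains(l, substr) {
			return true
		}
	}
	return false
}

func main() {
	rec := &logRecorder{}
	log.SetOutput(rec)
	log.SetFlags(0)

	// The code under test emits ordinary debug lines...
	log.Printf("firewaller: opened port %d for unit %s", 80, "wordpress/0")

	// ...and the test parses them back trivially with fmt.Sscanf.
	var port int
	var unit string
	for _, l := range rec.lines {
		if _, err := fmt.Sscanf(l, "firewaller: opened port %d for unit %s", &port, &unit); err == nil {
			break
		}
	}
	fmt.Println(port, unit)
}
```

The appeal, as rog notes below, is that the production code only does something it would plausibly do anyway: emit a debug log line.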
<TheMue> rog: Have to think about it. I'm not such a big fan of string-stream-data-fetching. ;)
<TheMue> rog: Thx for your review. How do we handle ZK outages in other parts?
<rog> TheMue: i think some tests close the underlying zk and test the behaviour
<rog> TheMue: or perhaps the underlying state.
<rog> TheMue: it's not easy, but i think it's worth doing as much of the logic is about the error paths, so it's important that we get it right.
<TheMue> rog: Btw, the better handling of a dying machine units watcher is in the next branch (goes in today). It hasn't been complete so far.
<rog> TheMue: i'm thinking about the handling of a dying machine watcher in this case
<TheMue> rog: I can follow your motivation but I'm not sure how you would do it.
<rog> TheMue: here's a thought: currently the firewaller opens its own state. how about we change the signature to NewFirewaller(*state.State)?
<rog> TheMue: then we can pass in a state that we have a handle to, and can close that and see what happens
<rog> TheMue: in fact, that enables both the PA and the firewaller to use the same state object, which is probably a good thing.
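A toy sketch of the signature change rog proposes: `NewFirewaller` takes an already-open state handle instead of dialing its own, so the PA and the firewaller can share one connection, and a test can close it to exercise the error paths. The `State` type here is a bare stand-in, not juju's real `state.State`.

```go
package main

import (
	"errors"
	"fmt"
)

// State is a stand-in for *state.State, enough to show the idea.
type State struct{ closed bool }

func (s *State) Close() { s.closed = true }

func (s *State) WatchMachines() error {
	if s.closed {
		return errors.New("state: connection closed")
	}
	return nil
}

type Firewaller struct{ st *State }

// NewFirewaller no longer opens its own state: the caller passes one
// in, so a test holding the same handle can close it and observe how
// the firewaller reacts.
func NewFirewaller(st *State) *Firewaller { return &Firewaller{st} }

func main() {
	st := &State{}
	fw := NewFirewaller(st)
	st.Close() // a test simulating a ZK outage
	fmt.Println(fw.st.WatchMachines())
}
```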
<TheMue> rog: Would be ok for me, yes.
<TheMue> rog: Could you add this as a review comment so that Gustavo could take a look at that idea too?
<rog> TheMue: done
<TheMue> rog: thx
<TheMue> rog: I'll now finalize the follow-up branch as proposal and then return to the initial one
<rog> TheMue: sounds good
<rog> TheMue: here's one way of doing the log-based tests. i think it works quite naturally, but it may be crack! http://paste.ubuntu.com/1098250/
<rog> fwereade: what do you think of the above approach?
 * fwereade reads
 * fwereade isn't sure
<fwereade> rog, I think it's decent scaffolding but bad testing if you see what I mean
<fwereade> rog, it will help us verify behaviour as we build up the firewaller
<rog> fwereade: that's the idea
<fwereade> rog, but I think that once we can test the whole thing in terms of dummy Operations we should trash it
<rog> fwereade: i agree
<fwereade> rog, I think we're on the same page then :)
<rog> fwereade: i just wanted something that was unintrusive but reliable
<rog> fwereade: the current approach of inspecting the firewaller's locals from another goroutine seems icky to me
<rog> TheMue: hey, i just saw your CloseState method again
<rog> TheMue: that approach would be fine, i think.
<fwereade> rog, yep, definitely
<rog> TheMue: maybe we don't need more error checking currently in fact
<TheMue> rog: Eh, sorry, somehow I'm lost.
<rog> TheMue: lost where?
<TheMue> rog: From the paste above (which I'm reading now, so maybe I answered too quickly) and the last statement about CloseState and "don't need more".
<rog> TheMue: the paste above is orthogonal to the error testing issue that i was just referring to
<rog> TheMue: sorry, two things without sufficient disambiguation!
<rog> TheMue: i'm suggesting the paste above as a way to avoid the need for the thread-unsafe AllMachines methods.
<TheMue> rog: ah, ok
<rog> TheMue: it's the kind of thing i was referring to earlier when i talked about hooking into the log messages
<TheMue> rog: yes, ic, your approach is nice. i only have to add some more debug statements then.
<rog> TheMue: exactly
<rog> TheMue: cool, glad you like it
<rog> TheMue: if you like, i'll submit it as another CL after yours is done.
<rog> TheMue: you could leave a "TODO: thread safe testing" or somesuch comment lying around
<TheMue> rog: can do so, but I can also add it myself and update the proposal
<rog> TheMue: ok, that's probably easier tbh
<rog> biab
<TheMue> rog: I'm just testing the follow-up branch, then I can start with it
<TheMue> niemeyer: morning
<niemeyer> Gooood mornings
<rog> niemeyer: yo!
<niemeyer> rog: Heya!
<rog> niemeyer: in https://bugs.launchpad.net/bugs/1017732, you say "The two first parameters are only useful for InferCharm" but there's not currently a function by that name. did you mean InferURL?
<niemeyer> rog: Yeah
<rog> niemeyer: ok. next branch up, i'll propose a fix. i want to make the argument to juju.Conn.Deploy a charm.URL
<rog> niemeyer: which means i want InferRepository as you suggest it
<niemeyer> rog: Sounds great, cheers
<rog> fwereade: sound ok to you? i'll assign the bug to me if you've not already got a fix in progress.
<niemeyer> TheMue: ping
<fwereade> rog, sorry, was having a bite to eat, forgot lunch, reading back
<fwereade> niemeyer, heyhey
<rog> fwereade: np
<fwereade> rog, go for it :)
<rog> fwereade: done
 * fwereade cheers
<TheMue> niemeyer: pong
<TheMue> niemeyer: seen your +1 and will re-propose it later
<niemeyer> TheMue: I'm just sending a few other notes
<niemeyer> TheMue: Can you please see if you'd like to talk about something there?
<TheMue> niemeyer: thx
<niemeyer> fwereade: Heya!
<fwereade> niemeyer, heyhey
<fwereade> niemeyer, how's it going?
<niemeyer> fwereade: Smoothly!
<niemeyer> :)
<fwereade> niemeyer, sweet :D
<TheMue> niemeyer: Only one question regarding your first comment. You would prefer stopping the firewaller in the typical way?
<niemeyer> TheMue: Hmm
<niemeyer> TheMue: You know what, I think the implementation is right
<niemeyer> TheMue: It's the comment that is wrong
<niemeyer> TheMue: "can't stop machine tracker: %v"
<niemeyer> TheMue: The machine tracker *was* stopped, if we got there
<niemeyer> TheMue: Ah, and it should also delete the machine from the map
<TheMue> niemeyer: this is done below with delete. the error is the result of a stopping error. that's why I log "can't stop the tracker". the machine has been removed, but the tracker showed an error during stopping.
<TheMue> niemeyer: maybe better "stopping machine tracker %d showed error %v"
<niemeyer> TheMue: See the comment in the CL
<niemeyer> TheMue: If you 'continue', the delete is never reached
<niemeyer> TheMue: Yes, but "can't stop" is incorrect
<niemeyer> TheMue: It won't get there unless it is stopped
<TheMue> niemeyer: iiirks, the continue, you're right
<TheMue> niemeyer: any good idea for the logging message? or shall i take that "stopping machine tracker %d: %v" thingy?
<niemeyer> TheMue: Have you seen the CL comments?
<niemeyer> TheMue: There's a suggestion there
<TheMue> niemeyer: didn't refresh the screen while we were typing here
<TheMue> niemeyer: rietveld needs auto-refresh. ;)
<niemeyer> TheMue: Yeah :)
<niemeyer> TheMue: What about the channel closing?
<TheMue> niemeyer: you've seen rog's comment? it's the firewaller's channel, only passed to the tracker. I could move it up. it can only be closed by the firewaller. so my fw.finish() has to take care to first stop the machine trackers before closing that channel.
<niemeyer> TheMue: Hold on tight there
<niemeyer> :)
<niemeyer> TheMue: The question was pretty simple
<niemeyer> TheMue: What happens if the channel is closed?
<niemeyer> TheMue: The sentence should start with "When it is closed, ..." :-)
<TheMue> niemeyer: "When it is closed, no tracker should be running."
<niemeyer> TheMue: Okay, that sounds perfect, thanks
<TheMue> niemeyer: ;)
<niemeyer> TheMue: We need to have that idea clear in there
<TheMue> niemeyer: Yes, and I have an idea how to ensure it, based on rog's suggestion
<TheMue> niemeyer: the next CL will have it, it's a simple one
<TheMue> niemeyer: btw, maybe you could help me
<niemeyer> TheMue: Right now it's a puzzle that is being carefully put together, not easy to assess whether it is right, and very easy to screw up in the next change
<niemeyer> TheMue: If nothing else, we must document how to not screw up
<TheMue> niemeyer: I know change sets and their acronym CS, but what is the L in CL?
<niemeyer> TheMue: change list, I don't know where it comes from, to be honest
<TheMue> niemeyer: ah, thought so but not have been sure
<niemeyer> TheMue: In fact, I think we have a problem in there
<TheMue> niemeyer: I think the next propose will make it more clear
<TheMue> niemeyer: fw.loop() has a defer fw.finish()
<TheMue> niemeyer: there first all trackers are stopped
<niemeyer> TheMue: Ah, no, the problem I imagined just now doesn't exist
<TheMue> niemeyer: and then the channels are closed (possibly by making them firewaller fields).
<niemeyer> TheMue: Btw, why do we have to wait for the firewaller.tomb.Dying within the machineTracker?
<TheMue> niemeyer: I think that's a clear tear down. and by referencing fw.xyzChan in the trackers it is more clear that the change is sent back to be handled there
<niemeyer> TheMue: Ah, yes, that sounds good
<niemeyer> TheMue: You mean removing the defers and moving them into the finish method, right?
<TheMue> niemeyer: yes
<niemeyer> TheMue: +1!
<TheMue> niemeyer: which line are you referring to with the wait?
<niemeyer> TheMue: Hmm..
<niemeyer> TheMue: So why do we even close the machineUnitsChanges channel?
<niemeyer> TheMue: I don't think we have to close it.. there's no benefit, apparently
<TheMue> niemeyer: you mean it's ok to let the GC close them?
<niemeyer> TheMue: It's just introducing a race we have to be aware of
<niemeyer> TheMue: No, the GC doesn't close channels
<niemeyer> TheMue: Ever
<niemeyer> TheMue: The GC just collects them, once they are unused
<niemeyer> TheMue: It's fine to leave a channel unclosed
<niemeyer> TheMue: If it makes sense for the intended workflow
<TheMue> niemeyer: yeah, thought a bit about it.
<TheMue> niemeyer: most important is a proper closing of the trackers bottom up.
<niemeyer> TheMue: So, +1 on moving stuff onto finish, +1 on dropping the unnecessary close of the channel too
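A short illustration of the point niemeyer is making: `close` is purely a signal to receivers, never resource cleanup, and a channel that is never closed is simply garbage-collected once nothing references it.

```go
package main

import "fmt"

func main() {
	done := make(chan struct{})

	go func() {
		// ... do work ...
		close(done) // the only sender closes, signalling completion
	}()
	<-done // a receive on a closed channel succeeds immediately

	// Further receives yield the zero value with ok == false.
	_, ok := <-done
	fmt.Println(ok) // false

	// This channel is never closed; once main drops the reference,
	// the GC collects it. No leak, no error.
	_ = make(chan int)
}
```

So a close is only worth doing when the workflow needs the signal; otherwise leaving the channel open is fine, which is the argument for dropping the close of machineUnitsChanges.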
<niemeyer> TheMue: So, going back a bit, the line I was referring to is 132 and 140
<fwereade> niemeyer, quickly, please confirm: a hook is allowed to set the running unit's relation state for any relation the unit is in... right?
<fwereade> niemeyer, s/running/local/ is perhaps clearer
<niemeyer> fwereade: Yes
<fwereade> niemeyer, jolly good, thanks :)
<niemeyer> fwereade: There are defaults, but yes, it is allowed
<fwereade> niemeyer, yep
<TheMue> niemeyer: thinking about it. don't know anymore *argh*
<fwereade> niemeyer, just wanted to check I was following the python correctly
<TheMue> niemeyer: currently see no reason
<niemeyer> fwereade: Now that I think of it, I'm not sure if Python ever did this correctly, since we figured those details as we went through it
<niemeyer> fwereade: I've recently seen bug reports addressing precisely this issue, IIRC
<niemeyer> fwereade: May have been fixed by now
<fwereade> niemeyer, hmmmmm :/
<fwereade> niemeyer, ok, I'll take a look :)
<niemeyer> fwereade: Nothing to worry about, I think.. it was just not something we focused a lot on as we got the basic semantics right in the first go
<fwereade> niemeyer, ah, jolly good
<niemeyer> fwereade: Your feeling of "all hooks should be able to see and touch this" is the right one
<fwereade> niemeyer, still worth double checking what went wrong so I don't make the same mistake
<niemeyer> fwereade: Good point
<fwereade> niemeyer, all my mistakes will be new and fresh :)
<niemeyer> fwereade: LOL
<niemeyer> fwereade: Fresh mistakes just leaving the oven are always so tasty, though!
 * fwereade rubs his tummy
<fwereade> niemeyer, cool, nothing looks likely to be a serious problem
<TheMue> So, switching branches
<rog> niemeyer: https://codereview.appspot.com/6430044
<rog> niemeyer: should be quite trivial
<niemeyer> rog: Clean and sweet
<rog> niemeyer: thanks!
<niemeyer> rog: Btw, if you add -bug to lbox propose, it will link the bug to the proposal
<rog> niemeyer: ah, thank you for reminding me of that.
<niemeyer> rog: and assign to you, and put it in progress, but hopefully those two should be done ahead of time (or with -wip)
<rog> niemeyer: presumably then lbox submit should mark the bug as "fix committed" ?
<niemeyer> That's lunch time!
<rog> fwereade, niemeyer: i'm thinking of a signature something like this for juju.Conn.Deploy: http://paste.ubuntu.com/1098537/
<rog> fwereade: what d'ya think?
<fwereade> rog, looks reasonable, I think
<fwereade> rog, the charm is the potentially-fuzzy charm name from the command line, yes?
<rog> fwereade: yeah
<fwereade> rog, yeah, that looks right, I think
<rog> fwereade: i was wondering about using charm.URL, but i think it's better to keep the interface simpler.
<rog> fwereade: i don't think it loses any generality
<fwereade> rog, I don't think you could even if you wanted, could you?
<rog> fwereade: no?
<fwereade> rog, we don't necessarily know the charm URL until we've messed around using potentially both state's default-series and the LocalRepoPath
<rog> fwereade: i was going to assume that the caller would invoke charm.InferURL
<rog> fwereade: (for which you don't need LocalRepoPath, incidentally)
<fwereade> rog, good point; but they *do* need State
<rog> fwereade: really?
<rog> fwereade: i didn't think InferURL used State
<fwereade> rog, where are you going to get default-series from?
<rog> fwereade: ah!
<rog> fwereade: an excellent point sir
 * fwereade bows graciously
<TheMue> rog: integrated your log testing, works like a charm
<rog> TheMue: cool. i hoped you'd like it!
<TheMue> rog: definitely
<rog> TheMue: see, string-stream-data-fetching can be nice sometimes... :-)
<fwereade> rog, I have a naming problem
<rog> fwereade: god, me too
<fwereade> rog, I'm writing a HookQueue
<rog> :-)
<TheMue> rog: well said
<rog> fwereade: ok
<fwereade> rog, and I basically want to pop hook executions in 2 stages
<TheMue> rog: the test-by-string-stream-data-fetching-and-comparing-idiom
<fwereade> rog, I'd like to call the operations Peek and Pop, except Peek is the thing that returns something and we should panic if we Pop without having Peeked
<fwereade> rog, that I want to do this at all *may* be evidence of total insanity on my part in the first place, but I think it'll all work out rather nicely
<rog> fwereade: you could make the Pop operation work on the thing just peeked i suppose
<fwereade> rog, it actually has to
<rog> fwereade: why do you need a peek operation BTW?
<rog> fwereade: perhaps you could paste a brief outline of your basic design intentions here
<fwereade> rog, I'll give it a god
<rog> fwereade: i've got a spare god if you need one
<fwereade> rog, wait, balls, thought of a problem, need to think a sec
<fwereade> rog, ah no it's ok
<fwereade> rog, OK
<fwereade> rog, the idea is that the HookQueue is a magical change-collapsing queue
<fwereade> rog, we feed it RelationUnitsChange events
<rog> fwereade: yeah, i figured something of the kind
<fwereade> rog, and it decomposes them into individual hook executions
<rog> fwereade: i still don't quite see how Peek fits in though
<fwereade> rog, of which there should only be one of joined/changed/departed per unit in the queue at a time
<fwereade> rog, the context is kinda necessary I think
<rog> fwereade: also, Get is probably a better name for a queue than Pop, which implies a stack to me.
<fwereade> rog, yeah, good point, bad connotations
<fwereade> rog, ok, anyway
<fwereade> rog, we always can collapse two potential hooks
<fwereade> rog, if we get a changed on top of a changed, we can just ignore it
<rog> fwereade: yup
<fwereade> rog, if we get a departed on top of a changed, we drop the changed and append the departed, etc
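The collapsing rules fwereade describes can be sketched roughly like this. Types, names, and the exact rule set are illustrative, not the real uniter HookQueue; persistence and the Peek/Get two-stage pop are omitted.

```go
package main

import "fmt"

type hookKind string

const (
	joined   hookKind = "joined"
	changed  hookKind = "changed"
	departed hookKind = "departed"
)

// hookQueue keeps at most one pending hook per unit, collapsing new
// events into whatever is already queued for that unit.
type hookQueue struct {
	order   []string            // units in arrival order
	pending map[string]hookKind // at most one hook per unit
}

func newHookQueue() *hookQueue {
	return &hookQueue{pending: make(map[string]hookKind)}
}

func (q *hookQueue) add(unit string, kind hookKind) {
	prev, queued := q.pending[unit]
	if !queued {
		q.order = append(q.order, unit)
		q.pending[unit] = kind
		return
	}
	switch {
	case prev == changed && kind == changed:
		// changed on top of changed: ignore the new event
	case kind == departed:
		// departed supersedes a queued changed
		q.pending[unit] = departed
	default:
		q.pending[unit] = kind
	}
}

func main() {
	q := newHookQueue()
	q.add("mysql/0", changed)
	q.add("mysql/0", changed)  // collapsed away
	q.add("mysql/0", departed) // replaces the queued changed
	fmt.Println(q.pending["mysql/0"])
}
```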
<rog> fwereade: uh huh
<fwereade> rog, so, this is nice
<rog> fwereade: ah, i think i see the need for Peek now
<fwereade> rog, but I would like to be able to maintain all the hook queue state in one place
<fwereade> rog, ie the HookQueue saves its state after every change, so it can reload itself if we restart
<rog> fwereade: is that actually necessary?
<fwereade> rog, I think so...
<fwereade> rog, something needs to know what state we think we're in
<TheMue> rog: propose is in again
<rog> fwereade: so your HookQueue kinda corresponds to my "intentions" except that you save all the intentions at every step
<fwereade> rog, and I think the HookQueue has basically all of it that we need
<fwereade> rog, kind of; but an intention is not all we need to save, is it?
<rog> fwereade: i'm not sure. what else do we need to save?
<fwereade> rog, we need to keep track of relation membership and settings versions so far as the unit currently knows
<rog> fwereade: ok, yes, in my original sketch, the intention was saved along with the version that corresponded to that intention
<fwereade> rog, where do you keep the versions that don't correspond to a queued intention?
<rog> fwereade: do they matter?
<fwereade> rog, yes, we don't want to run a change hook for every unit whenever we bounce the process
<fwereade> rog, how else do we reconcile latest state with saved state and know which changes are worth hooking?
<rog> fwereade: we've already run hooks for those units, right?
<rog> fwereade: so there will be an intention stored in the log with a version number
<fwereade> rog, ah, ok, the intentions log grows without bound?
<rog> fwereade: (it will be also be tagged as "done" in the log)
<rog> fwereade: not without bound - we can collapse it occasionally
<rog> fwereade: like a log-structured filesystem
<rog> fwereade: we'd probably collapse it when we restart and if it grows bigger than some bound
<fwereade> rog, hmm, how do we collapse the intentions we don't actually need to run?
<rog> fwereade: each intention has some tag associated with it (originally i thought the zk path, but we don't have access to that) - we collapse intentions with the same tag.
<rog> fwereade: it's quite possible that this scheme is entirely crackful, but i'd like to have a go to see if it is possible, because i *think* it could work nicely.
<fwereade> rog, I don't see how we do all this collapsing without writing it all the time, though
<rog> fwereade: it also potentially scales reasonably well too.
<fwereade> rog, not to mention marking them as done
<rog> fwereade: we *are* writing it all the time
<fwereade> rog, ah, I thought you were claiming there was something different about what we were doing, and that my approach was a problem because I was writing all the time
<rog> fwereade: the collapsing happens occasionally, by reading through the entire log
<rog> fwereade: no
<rog> fwereade: the problem i have is that your approach writes *everything* all the time
<rog> fwereade: if the system grows quite big, there are going to be many many units and many many changes...
<fwereade> rog, ...and a vast log file to scan through and write "done" in the middle of...?
<rog> fwereade: we never write into the middle of the log file
<rog> fwereade: it's append-only
<fwereade> rog, ah, ok, that does make more sense
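Rog's append-only intentions log, with "done" written as a fresh appended record rather than an edit in the middle, and occasional compaction keeping only the newest record per tag, could be sketched like so. The record fields and method names are invented for illustration:

```go
package main

import "fmt"

// record is one appended intention; tag identifies what it is about.
type record struct {
	tag  string
	hook string
	done bool
}

type intentionLog struct {
	records []record
}

// Append never rewrites earlier entries; marking an intention done is
// just another appended record with the same tag.
func (l *intentionLog) Append(r record) {
	l.records = append(l.records, r)
}

// Compact rewrites the log keeping only the newest record per tag,
// as would run at startup or when the log grows past some bound,
// like a log-structured filesystem.
func (l *intentionLog) Compact() {
	latest := make(map[string]int)
	for i, r := range l.records {
		latest[r.tag] = i
	}
	var out []record
	for i, r := range l.records {
		if latest[r.tag] == i {
			out = append(out, r)
		}
	}
	l.records = out
}

func main() {
	var l intentionLog
	l.Append(record{"u/0", "joined", false})
	l.Append(record{"u/0", "joined", true}) // done marker, appended not rewritten
	l.Append(record{"u/1", "changed", false})
	l.Compact()
	for _, r := range l.records {
		fmt.Println(r.tag, r.hook, r.done)
	}
}
```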
<rog> fwereade: presumably in your approach you'd do something like alternate two files, writing the HookQueue alternately to each one?
<fwereade> rog, basically yeah
<fwereade> rog, I'm still trying to figure out how to do the constant event collapsing with the log approach
<fwereade> rog, if we're lagging at all behind "now", which I imagine we will in a big system, how do we strip unwanted intentions from the log?
<rog> fwereade: the hook queue event collapsing?
<fwereade> rog, we add both "done"s and "whoops-don't-bother"s?
<rog> fwereade: or the log collapsing?
<fwereade> rog, the hook queue
<fwereade> rog, the log collapsing doesn't sound too challenging
<rog> fwereade: i'm thinking that the hook queue doesn't have any persistence at all
<rog> fwereade: s/have/need/
<fwereade> rog, so we just run loads more hooks?
<rog> fwereade: no, the hook queue does event collapsing
<rog> fwereade: but i don't think it needs to save its state
<fwereade> rog, so it collapses an event, writes an intention, and then when that intention is no longer valid it...
<fwereade> rog, or it doesn't write the intention until it's actually about to run the hook?
<rog> fwereade: the way i'd thought about it is that the hook queue is running in a separate goroutine. it receives events, collapses them into a queue, and tries to send hooks down a channel to the hook executor
<rog> fwereade: the hook executor is the thing that writes the intentions, when it's read the hooks from the hook queue
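The two-goroutine pipeline rog describes, with the queue goroutine receiving events and trying to hand hooks to the executor over a channel, could look roughly like this. All names are invented, and the per-unit collapsing is elided (a real queue would merge events as discussed above):

```go
package main

import "fmt"

type event struct{ unit, kind string }

// queueLoop receives raw events and feeds them to the executor; while
// something is pending it is simultaneously ready to accept new events
// or hand over the head of the queue.
func queueLoop(events <-chan event, hooks chan<- event) {
	var pending []event
	for {
		if len(pending) == 0 {
			ev, ok := <-events
			if !ok {
				close(hooks)
				return
			}
			pending = append(pending, ev)
			continue
		}
		select {
		case ev, ok := <-events:
			if !ok {
				for _, h := range pending {
					hooks <- h
				}
				close(hooks)
				return
			}
			pending = append(pending, ev)
		case hooks <- pending[0]:
			pending = pending[1:]
		}
	}
}

func main() {
	events := make(chan event)
	hooks := make(chan event)
	go queueLoop(events, hooks)
	go func() {
		events <- event{"u/0", "joined"}
		events <- event{"u/0", "changed"}
		close(events)
	}()
	// The executor side: record the intention, then run the hook.
	for h := range hooks {
		fmt.Println("intend:", h.unit, h.kind)
	}
}
```

The executor, not the queue, is the component that writes intentions, so the queue itself holds only in-memory state.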
<fwereade> rog, so the hook queue needs to read the intentions as well to know how to collapse?
<fwereade> rog, how does it cancel a "joined" intention for example?
<fwereade> rog, I think the second bit is the more interesting question, sorry
<rog> fwereade: the hook queue starts with our current knowledge of the state of the world
<fwereade> rog, sorry, side question: how many not-done intentions will be in the log file at a time?
<rog> fwereade: we've found that out by reading the log file at startup
<rog> fwereade: one or two
<rog> fwereade: any sequence of actions that corresponds to a single state change (i.e. version change)
<rog> fwereade: so the first two intentions would be "install" and "start", presumably
<fwereade> rog, ok, and this intention-storing is per-relation?
<fwereade> who ok no
<fwereade> whoa^^
<fwereade> rog, ok, we have a constant stream of hooks coming from a bunch of places in separate streams, some of which need to be paused for arbitrary chunks of time
<rog> fwereade: the intention storing is per hook execution - we're storing the "intention" to execute a hook, based on some database change
<fwereade> rog, the relationship between database changes and hook executions is not simple
<fwereade> rog, your hook executor will need to know basically everything about the state of the system, won't it?
<rog> fwereade: ah, well that may well be a killer for this approach :-)
<rog> fwereade: i was kinda assuming that a single database change triggers a single hook execution
<fwereade> rog, 0-2 in general, I think, and the changes come from lots of different places
<rog> fwereade: ok... let's enumerate
<rog> fwereade: 0?
<fwereade> rog, and 0s may look like 1s without reference to past state, and 1s may look like 0s without reference to future state
<fwereade> rog, change on change
<fwereade> rog, the first change shouldn't actually happen
<rog> fwereade: ah, i don't care about the number of hook executions a db change triggers
<rog> fwereade: but...
<fwereade> rog, what hooks need to be executed depends on future state
<rog> fwereade: is there ever a time when a hook execution depends on *more than one* db change?
<fwereade> rog, in terms of deciding whether or not to run it, yes
<rog> fwereade: how do you mean?
<fwereade> rog, we can have a joined queued and then decide not to bother because we got a departed
<rog> fwereade: ok, i see that. but i don't *think* that matters.
<fwereade> rog, the fact that the system state changed does not automatically imply that a hook should be run, but we don't know the truth of that until later
<rog> fwereade: the idea is that everything in the hook queue is "fuzzy might wanna do it stuff". but when you get something *out* of the queue, you are really going to try and execute it regardless
<rog> fwereade: at some point, we have to commit to an intention
<rog> fwereade: which is what the intentions log is supposed to mirror
<fwereade> rog, hmm, ok, I might be able to see how it could work if we had one per relation
<rog> fwereade: why one per relation?
<fwereade> rog, what about the hooks we run and need to store the context of for an arbitrarily long time until we run them again?
<fwereade> rog, while running other hooks in the interim?
<rog> fwereade: how do you mean?
<fwereade> rog, resolved
<rog> fwereade: how does resolved work?
<fwereade> rog, hook runs and errors; we pause the hook queue for that relation until the user resolves it, and may or may not want to retry the hook
<rog> fwereade: do any extra hooks get run as a result of this?
<fwereade> rog, you have a stored intention for the changed, after the joined that just failed
<fwereade> rog, you cannot run that hook until the joined is resolved one way or another
<rog> fwereade: how does the user decide whether or not to retry the hook?
<fwereade> rog, --retry-hook or something
<rog> fwereade: well, you don't mark an intention as done until it's completed without error.
<fwereade> rog, ok, so intentions execute out of order?
<fwereade> rog, or, alternatively, you have an intention log per relation, and one for the whole unit
<rog> fwereade: hmm. i see, i thought a hook execution error paused execution of *all* hooks
<fwereade> rog, just because one relation is screwed doesn't mean we should take *everything* down
<rog> fwereade: hmm, seems slightly odd to me - the screwed-uppedness could be to do with the central working of the charm
<rog> fwereade: but anyway, that's how it is
<fwereade> rog, in that case, all the relations fall over, but they do so independently :)
<niemeyer> <rog> fwereade, niemeyer: i'm thinking of a signature something like this for juju.Conn.Deploy: http://paste.ubuntu.com/1098537/
<niemeyer> rog: That's mixing concerns
<niemeyer> rog: Deploy is about, well, deploying
<rog> niemeyer: ok, that's why i was askin'!
<fwereade> rog, I have to go I'm afraid
<niemeyer> rog: And I'm answering!
<rog> fwereade: me too. talk tomorrow. i'll keep thinking about it.
<niemeyer> !? :-)
<rog> niemeyer: thanks!
<rog> niemeyer: what concerns is it mixing?
<niemeyer> rog: Deploy is about deploying.. it shouldn't be resolving URLs IMO
<rog> niemeyer: fwereade made a good point earlier
<rog> niemeyer: i don't think we can resolve the url without a state
<rog> niemeyer: i guess we could add juju.Conn.InferURL
<niemeyer> rog: Yep.. that's going to be an issue with other methods too
<fwereade> rog, hmm, really gtg, but: 3 operations really -- PublishCharm, AddService, AddUnits
<rog> fwereade: yes, that sounds reasonable.
<rog> fwereade: i'm kinda trying to avoid mirroring all the state types in juju. perhaps juju should be a more transparent layering.
<niemeyer> rog: I'm not sure if InferURL is even the best interface in that specific case.. let me paste you the original point in the original CL
<rog> fwereade: func Deploy(*state.State, *charm.URL) *state.Service
<niemeyer> rog: http://pastebin.ubuntu.com/1098633/
<rog> niemeyer: how do we get the charm URL without getting the default series from the state?
<niemeyer> rog: I don't understand why you'd want to do that
<niemeyer> rog: What's the problem with getting default-series from the state?
<rog> niemeyer: it's an argument to InferURL
<niemeyer> rog: That's where it lives..
<niemeyer> rog: Yes.. let's use it.. ? :)
<rog> niemeyer: right, so juju.Conn should expose its State field
<rog> niemeyer: which i actually think is a good idea
<niemeyer> rog: Why?
<rog> niemeyer: otherwise we'll have to open another state just to get the default series, no?
<niemeyer> rog: Not that I think it is a big problem.. it's already exposed, but I don't really get why exposing it or not is relevant in this case
<niemeyer> rog: I'm pretty lost
<niemeyer> rog: Conn has a State.. PutCharm is in Conn.. Deploy is in Conn
<rog> niemeyer: PutCharm takes a charm.URL as an argument
<niemeyer> rog: Ah, sure.. we can have InferURL in Conn too.. is that what you're concerned with?
<rog> niemeyer: the canonical way of getting a charm.URL is by calling charm.InferURL
<rog> niemeyer: yes!
<niemeyer> rog: Phew.. ok..
<rog> [18:08:26] <rog> niemeyer: i guess we could add juju.Conn.InferURL
<niemeyer> rog: Either way sounds fine.. Conn.State exists today.. Conn.InferURL sounds fine too
<rog> niemeyer: to be honest, i prefer the idea of the juju package as a relatively transparent layer atop Environ and State
<niemeyer> rog: i'd prefer to not change what the juju package is today
<rog> niemeyer: i don't intend to do that
<niemeyer> rog: Conn looks great as it is
<niemeyer> rog: It's working, implemented, and tested
<niemeyer> rog: Let's solve the problem we have at hand and move on, IMO
<rog> niemeyer: just that it means that we don't necessarily have to provide a totally complete layer on top of state
<niemeyer> rog: We're not doing that
<rog> niemeyer: so we can allow clients to, for example, get the default series from the state
<niemeyer> rog: Look at the state interface, and look at the Conn interface
<niemeyer> rog: They are pretty different, and seem to be going exactly in the direction we designed them to be
<rog> niemeyer: yeah, but what i'm worried about is having types in juju mirroring the ones in state, e.g. juju.Service, juju.Relation etc
<niemeyer> rog: We don't have that today, and I don't think we should have that.. why would we?
<rog> niemeyer: i'm wondering how juju.Conn.SetRelation would work, for example
<rog> niemeyer: how do we phrase the arguments to it?
<niemeyer> rog: SetRelation?  What's that?
<rog> niemeyer: sorry, RelationSet
<niemeyer> rog: What's that?
<rog> niemeyer: erk
<niemeyer> rog: AddRelation?
<rog> niemeyer: ok, Set
<niemeyer> rog: Sorry, I'm lost.. those names don't mean anything to me
<rog> sorry
<rog> niemeyer: the juju set command
<rog> niemeyer: equivalent in juju pkg would probably be juju.Conn.Set
<niemeyer> rog: Okay, thanks
<rog> hmm, i guess we can just use the svc name
<niemeyer> rog: Yeah, or a *state.Service
<rog> niemeyer: i was wondering if we'd return a *state.Service from juju.Conn.Deploy; then juju.Conn.Set could take a service
<rog> niemeyer: yeah
<rog> niemeyer: that's the kind of thing i mean by a "more transparent layer" - we're not entirely hiding the fact that state is under the surface
<niemeyer> rog: I never thought of Conn like that
<rog> niemeyer: cool
<rog> niemeyer: then i'm happy
<niemeyer> rog: We have State in there, after all
<rog> niemeyer: yeah... i guess i'd thought of that mainly for testing purposes
<niemeyer> rog: It's an API that allows us to poke into both the environment and the state together.. a facade if you will
<rog> niemeyer: yeah, sounds good
<rog> niemeyer: a collection of higher level operations
<niemeyer> Yeah
<rog> niemeyer: ok, so Deploy(*charm.URL) it is. that's lovely.
<rog> niemeyer: and it's funny we both came up with *almost* exactly the same sig for Deploy. (i'd totally forgotten about your remark in that CL!)
<niemeyer> rog: I think the suggested interface solves the issue better, but you'll probably figure as you implement it
<rog> niemeyer: what do you see as the main differences, besides the URL resolving stuff?
<niemeyer> rog: Deploying a charm may be done many times, with charms that are already in the state
<niemeyer> rog: That does not involve any kind of URL resolution or inferring
<niemeyer> rog: Putting a charm in the state can also be done by itself, independently
<rog> niemeyer: yeah, i'm with you. i was wondering about other differences.
<rog> niemeyer: (hence "besides" the URL resolving stuff)
<rog> niemeyer: i've gotta go, but i'll see what you write, later.
<rog> niemeyer: have a great rest-of-day and evening
<niemeyer> rog: That's really it.. it seems to make more sense to deploy a *state.Charm than a *charm.URL, in Conn
<niemeyer> rog: Cheers
<rog> niemeyer: ok cool
<rog> niemeyer: guess i'll see you in lisbon next!
<niemeyer> rog: For you too
<rog> niemeyer: hope your interminable travel time isn't too bad
<niemeyer> rog: Generally not an issue.. there's so much to do with quiet time :)
<niemeyer> rog: Thanks, though
<niemeyer> rog: Have a good time yourself
<niemeyer> "Are you uncertain, or hoping to convince me through Socratic dialogue?"
<niemeyer> fwereade_: LOL
<niemeyer> Giving a ride to Ale.. back in 15
<niemeyer> fwereade_: ping
<hazmat> yes
<niemeyer> hazmat: I like your thinking
<hazmat> econtext wrong channel
<niemeyer> hazmat: The positiveness is welcome no matter what :-)
<hazmat> awesome
<davecheney> niemeyer: thanks for the reviews
<davecheney> at some point today the workmen in my house will need to turn off the power
<davecheney> so i'll go into the city and work from there
<niemeyer> davecheney: np, and thanks for the note
<fwereade_> niemeyer, pong
<niemeyer> fwereade_: Heya
<fwereade_> niemeyer, heyhey
<niemeyer> I think you've got it all in your inbox already :)
<fwereade_> niemeyer, cool :)
<fwereade_> niemeyer, I see nothing on watch-presence-children?
<fwereade_> niemeyer, thank you for the others though, and I think I like the RelationHandler suggestion
<fwereade_> niemeyer, need a little time to absorb it though :)
<fwereade_> niemeyer, I should probably do it while I sleep, though :)
<fwereade_> niemeyer, nn
<niemeyer> fwereade_: Sounds like a good plan! :-)
<niemeyer> fwereade_: Have a good night
<fwereade_> niemeyer, and you :)
<fwereade_> niemeyer, hm, I have just had a thought about RelationHandler
<fwereade_> niemeyer, no, it's not well-formed
<fwereade_> niemeyer, really going now ;)
<niemeyer> fwereade_: Cheers! ;)
#juju-dev 2012-07-19
<niemeyer> EMPTY REVIEW QUEUE
 * niemeyer steps out to celebrate
<davecheney> afternoon
<davecheney> anyone in the chan ?
<andrewsmedina> davecheney: hi
<davecheney> howdy
<davecheney> can I ask, what happens if you run juju status on a freshly bootstrapped environ ?
<davecheney> looking at the code, the result should be the yaml {}
<andrewsmedina> juju status returns a yaml with one machine (bootstrap machine)
<davecheney> ahh yes, of course
 * davecheney goes back to mocking
<andrewsmedina> davecheney: http://bazaar.launchpad.net/~juju/juju/trunk/view/head:/juju/control/tests/sample_cluster.yaml
<davecheney> andrewsmedina: ta
<andrewsmedina> davecheney: sleep time
<andrewsmedina> davecheney: see you tomorrow
<davecheney> andrewsmedina: np, thanks for your help
<andrewsmedina> davecheney: :D
<rog> davecheney: hiya
<davecheney> morning
<fwereade_> rog, davecheney, heyhey
<rog> fwereade_: yo!
<fwereade_> rog, does this look in any way familiar to you?
<fwereade_> ../state/state.go:8: inconsistent definition for type goyaml._Ctype_struct___9 during import\n\tstruct { start *goyaml._Ctype_yaml_tag_directive_t; end *goyaml._Ctype_yaml_tag_directive_t; top *goyaml._Ctype_yaml_tag_directive_t }\n\tstruct { start *goyaml._Ctype_yaml_simple_key_t; end *goyaml._Ctype_yaml_simple_key_t; top *goyaml._Ctype_yaml_simple_key_t }\n
<fwereade_> rog, goyaml builds just fine normally, but falls over like that in TestPutTools
<rog> fwereade_: it's a cgo error, but i don't think i've seen it before
<rog> fwereade_: have you tried go build launchpad.net/juju-core/... ?
<rog> fwereade_: because it looks to me like TestPutTools is failing because it can't build the commands.
<fwereade_> rog, yes indeed, that is the weirdness
<fwereade_> rog, it only fails when I clear out pkg
<rog> fwereade_: hmm, i wonder if it's something to do with GOPATH.
<rog> fwereade_: what happens if you do GOPATH="" go build launchpad.net/juju-core/cmd/... ?
<rog> hmm, it just should fail with "pkg not found" i guess
<fwereade_> rog, in PutTools (or whatever that calls) I presume?
<fwereade_> rog, I'll see :)
<rog> fwereade_: yeah.
<rog> fwereade_: so you cleaned out pkg by doing rm -r $GOPATH/pkg ?
<fwereade_> rog, yeah
<rog> fwereade_: i'll see if i can reproduce. were you testing all packages, or just one?
<fwereade_> rog, all packages
<rog> fwereade_: if you clean out pkg, then do "go install launchpad.net/juju-core/...", *then* start the tests, does it then succeed?
<davecheney> rog: fwereade_ have either of you installed gustavo's patch to the go tool ?
<rog> davecheney: nope.
<fwereade_> davecheney, nope
<rog> davecheney: i don't have a huge $GOPATH/src dir, so it's not so important for me
<davecheney> cool, i have been playing with it, and had some weirdness
<fwereade_> rog, I expect so, I will check
<fwereade_> rog, empty GOPATH just won't build as expected
<rog> fwereade_: hmm, i've got a test that's hanging again :-(
<fwereade_> rog, installed packages work just fine
<fwereade_> rog, bah :(
<fwereade_> rog, which one?
<davecheney> fwereade_: unrelated
<rog> fwereade_: weird
<rog> fwereade_: i'm not sure - it only prints the test name when the test has timed out!
<davecheney> the machine: section in juju status isn't actually state.Machines is it
<davecheney> it's environ.Instances
<rog> fwereade_: i'm gonna kill it and see what's happening
<davecheney> the machines: section is reporting details about the environ.AllInstances
<rog> fwereade_: it's the juju package that's hanging
<fwereade_> rog, huh, weird... won't -gocheck.vv tell you though?
<fwereade_> rog, not so helpful if it's unreliable
<fwereade_> davecheney, sorry, I'm not quite following you
<fwereade_> davecheney, I'm pretty sure status should be telling us about machines, not instances
<davecheney> the two keys that are reported under 'machines' in juju status
<rog> fwereade_: yeah, but i'd already run it without gocheck.vv, and i suspect it's a sporadic failure
<davecheney> fwereade_: I think they are what we call environ.Instances
<davecheney> because the first field in the yaml is dns-name
<davecheney> a property which is only available on the environ.Instance
<rog> fwereade_: it's actually hanging trying to dial zookeeper
<fwereade_> davecheney, I'm suspicious of that assertion, but juju/control/status.py is a rats' nest so I'll have to check ;)
<davecheney> fwereade_:                 pm = yield self.provider.get_machine(instance_id)
<rog> fwereade_: i think that dial should have a timeout actually. it's wrong that zk tries to redial indefinitely.
<davecheney> i think we have a cleaner separation of machine, the virtual, and instance, the concrete, in the go impl
<davecheney> rog: zk never times out
<davecheney> it sucks
<fwereade_> davecheney, STM it's only doing that with the machines it already knows about from state
<rog> davecheney: +1
<fwereade_> davecheney, look 17 lines before that
<davecheney> fwereade_: as you say, it is an uncomfortable union of both concepts
<fwereade_> davecheney, I dunno, I think status info from both is relevant
<fwereade_> davecheney, I think it's ok to smoosh them together there
<davecheney> fwereade_: thanks for the clarification
<davecheney> I was hoping to have more to show
<rog> fwereade_: apart from that one test hangup, all the tests worked ok for me after removing pkg.
<fwereade_> davecheney, instances are really an internal detail, but their properties apply to the machines the user cares about
<davecheney> but all I have tonight is a harness
<fwereade_> davecheney, status is not going to be all that small I'm afraid
<fwereade_> davecheney, probably smaller than the python, but still
<davecheney> fwereade_: indeed, which is why I put some time into setting up a table driven harness
<fwereade_> davecheney, +1
<davecheney> imma signing off in a few minutes
<davecheney> they are closing the coworking space
<fwereade_> davecheney, enjoy your evening :)
<davecheney> but I look forward to your comments
<rog> davecheney: have fun
<rog> davecheney: when do you leave for lisbon?
<davecheney> rog: sunday
<davecheney> but I don't get there til monday 14:25
<rog> davecheney: i saw that
<davecheney> I have NFI how to get from the airport to the hotel
<rog> davecheney: there's some info on the sprint wiki page
<davecheney> need to get some euros as well
<davecheney> rog: I could have got there a day earlier for an extra 800 euro, and a detour via CDG
<rog> davecheney: well, i might catch you tomorrow morning then...
<rog> davecheney: marvellous
<rog> davecheney: well, i'm glad you're making it regardless!
<davecheney> for small values of yay
<davecheney> anyway, i'll catch you around
<rog> davecheney: i usually order currency from travelex to be picked up at the airport. that way you get the convenience of just picking it up, but a much better rate.
<rog> fwereade_: does malta use euros?
<fwereade_> rog, yeah
<rog> fwereade_: well aren't you lucky then?!
<fwereade_> rog, luckyish :)
<rog> :-)
<fwereade_> rog, my bank is still shit at getting me more money when I need it overseas
<rog> fwereade_: not good
<fwereade_> rog, I usually have to bite down the swears and hit up my gradually dwindling UK funds
<davecheney> right-o
<davecheney> have a good day gentlemen
<rog> davecheney: and you. enjoy the evening!
<fwereade_> have a good night davecheney :)
<rog> fwereade_: is that because they don't like you using your card overseas?
<fwereade_> rog, I have a credit card that *works* but is such an aggressively shitty deal that I don't really use it
<fwereade_> rog, they're essentially charging me to borrow the money from myself
<rog> fwereade_: just get cash in advance...
<fwereade_> rog, and I have no way of telling what my "balance" is on it
<fwereade_> rog, I have another one which works only in totally arbitrary circumstances
<rog> fwereade_: another possibility is to get a "foreign currency" card, which you can charge up as required.
<fwereade_> rog, twas sold to me as "this will let you get cash from HSBC machines in the UK without us charging you"
<rog> fwereade_: you get the best rate that way.
<rog> fwereade_: ha
<fwereade_> rog, in practice it *sometimes* works everywhere *except* HSBC machines in the UK, where it never works
<rog> fwereade_: i have a possible idea for why your build might be failing
<fwereade_> rog, I keep meaning to go and shout at them for a host of reasons but I have better things to do with my time ;)
<fwereade_> rog, oh yes?
<rog> fwereade_: i think it must be something in your environment. i'm trying to remember what the gcc heuristics are for finding include files.
<fwereade_> rog, ahhh, yes, that does make perfect sense
<rog> fwereade_: i'm wondering if it's right that bundleTools runs in a clean environment with just GOPATH, GOBIN and PATH set
<fwereade_> rog, I suspect it might be better to just use whatever env the developer is using in the first place
<rog> fwereade_: what happens if you unset everything in your environment except those vars?
<rog> fwereade_: that's what i'm thinking
<rog> fwereade_: ... and then run go install launchpad.net/juju-core/cmd/...
<rog> fwereade_: i'd be interested to find out what var it was, but i think you're right - the fix is to mutate the env, not come up with a new one
<rog> fwereade_: (assuming i've got the right diagnosis, of course!)
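The fix rog settles on here, starting from the caller's environment and overriding just the variables the build cares about, rather than constructing a fresh environment of only GOPATH, GOBIN and PATH, could be sketched like this. `setenv` is an illustrative helper, not necessarily what the go-build-env branch actually contains:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// setenv returns env with entry ("KEY=value") either replacing an
// existing KEY= entry or appended, so everything else in the caller's
// environment (HOME and whatever cgo/gcc depend on) survives.
func setenv(env []string, entry string) []string {
	prefix := entry[:strings.Index(entry, "=")+1]
	for i, e := range env {
		if strings.HasPrefix(e, prefix) {
			env[i] = entry
			return env
		}
	}
	return append(env, entry)
}

func main() {
	// The result would be passed as exec.Cmd's Env for "go install".
	env := setenv(os.Environ(), "GOPATH=/tmp/build-gopath")
	for _, e := range env {
		if strings.HasPrefix(e, "GOPATH=") {
			fmt.Println(e)
		}
	}
}
```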
<fwereade_> rog, I'm peering at my env and can't see what could be the issue :/
<rog> fwereade_: i suspect it might be $HOME
<rog> fwereade_: try HOME="" go install launchpad.net/juju-core/cmd/...
<fwereade_> rog, sorry, just a mo
<fwereade_> rog, empty HOME works fine :(
<rog> fwereade_: try applying lp:~rogpeppe/juju-core/go-build-env and see if it works then
<rog> fwereade_: (your go test, that is)
<rog> fwereade_: (the thing that was failing before)
<fwereade_> rog, will do in a sec, thanks
<fwereade_> rog, I have a side issue to ponder: in https://codereview.appspot.com/6402048/ niemeyer suggests a RelationHandler type
<fwereade_> rog, and I'm not actually sure whether *that* has a benefit over putting the methods directly onto Relation
<rog> fwereade_: yeah, i saw that. i haven't thought it over yet though.
<rog> fwereade_: yeah, what fields are in RelationHandler other than the Relation itself?
<rog> fwereade_: oh, hold on, it's got a particular set of relation end points in it
<fwereade_> rog, that's the only way we have to identify/get a Relation in the first place isn't it?
<fwereade_> rog, the endpoint list is equivalent to the relation's name
 * rog does a godoc to remind himself of the API
<rog> fwereade_: yeah that does sound right
<fwereade_> rog, there is *something* up there
<fwereade_> rog, possibly we ought to be doing it with endpoint identifiers rather than actual RelationEndpoints
<fwereade_> rog, and doing the translation internally
<rog> fwereade_: why's that?
<fwereade_> rog, but I'm not sure that entirely fits expected usage
<fwereade_> rog, mainly because that will be a lot more convenient for add-relation and remove-relation
<fwereade_> rog, which is the only time I'm aware of when we do stuff with relations *other than* via something that already exists within state
<fwereade_> rog, which can identify them how it likes internally
<fwereade_> rog, same as the name/key distinction
<fwereade_> rog, btw, go-build-env works great
<rog> fwereade_: ah good. so *something* in your environment is the culprit!
<rog> fwereade_: i'm not sure i get it. RelationEndpoint is part of the public API; why not use it?
<rog> fwereade_: or are you suggesting removing it from the public API?
<fwereade_> rog, I'm not quite sure :)
<fwereade_> rog, possibly it is the right approach, and we just need a State.Endpoint(name string) method
<fwereade_> rog, except, hmm, that's actually State.Endpoints...
<rog> fwereade_: what would it return?
<fwereade_> rog, meh, it'll become clearer as I do more with it
<fwereade_> rog, look, you need a state to construct a meaningful RelationEndpoint, right?
<rog> fwereade_: BTW i'm all for the idea of reducing the number of types in state
<rog> fwereade_: do you?
<fwereade_> rog, yeah, it's all information that comes from the charm
<rog> fwereade_: well, i suppose you hav... yeah
<rog> fwereade_: a relation endpoint is always associated with exactly one unit, right?
<fwereade_> rog, no, with a service
<fwereade_> rog, the user expresses endpoints as service_name[:relation_name]
<fwereade_> rog, and we need to do magic to figure out what endpoints are actually referred to by those strings
<rog> fwereade_: and for a scope=local relation, there's still only one endpoint for the service?
<fwereade_> rog, and possibly barf if they're ambiguous and ask the user to specify the [:relation] bit
<fwereade_> rog, yeah, the endpoints are purely about the connections between services
<rog> fwereade_: so the RelationEndpoint type is about isolating that magic, so that you're dealing with unambiguous things?
<fwereade_> rog, the details of which units actually talk to one another is handled completely separately, although the choice of how to do so is ofc governed by the endpoint's scope
<fwereade_> rog, or rather the narrowest scope present in the endpoints of the relation as a whole
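The user-facing endpoint syntax fwereade describes, `service_name[:relation_name]`, with the relation part optional, could be split like this. `parseEndpoint` is a made-up helper; the real resolution of the parsed names against charm metadata (and barfing when they are ambiguous) would live wherever add-relation ends up:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseEndpoint splits "service" or "service:relation" into its parts;
// an empty relation means "infer it from the charm metadata".
func parseEndpoint(s string) (service, relation string, err error) {
	parts := strings.SplitN(s, ":", 2)
	if parts[0] == "" {
		return "", "", errors.New("empty service name in " + s)
	}
	service = parts[0]
	if len(parts) == 2 {
		if parts[1] == "" {
			return "", "", errors.New("empty relation name in " + s)
		}
		relation = parts[1]
	}
	return service, relation, nil
}

func main() {
	for _, s := range []string{"wordpress", "mysql:db"} {
		svc, rel, err := parseEndpoint(s)
		fmt.Println(svc, rel, err)
	}
}
```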
<rog> fwereade_: so you do some magic to produce an endpoint, then use that in the API?
<fwereade_> rog, I don't do anything, we don't yet have add-relation ;p
<rog> fwereade_: will do, then :-)
<fwereade_> rog, but, yeah, that's what we'll have to do
<fwereade_> rog, I'm saying we should have AddRelation(epnames ...string)
<fwereade_> rog, ...probably, anyway
<fwereade_> rog, I'm not 100% sure
<rog> fwereade_: where epnames are potentially ambiguous?
<fwereade_> rog, quite possibly we should actually have State.RelationEndpoints(epnames ...string) ([]RelationEndpoint, error)
<rog> fwereade_: that's Service.AddRelation, presumably?
<fwereade_> rog, yes
<fwereade_> rog, no, State.AddRelation
<rog> fwereade_: ah.
<rog> fwereade_: i'm not sure if that magic needs to be in state
<rog> fwereade_: it seems to me like it's something that can be built using the primitive already available.
<rog> primitives
<fwereade_> rog, yeah, reasonable perspective
<rog> fwereade_: it seems to me like that's the meat of the logic of add-relation
<rog> fwereade_: and it could sit happily in the juju package
<fwereade_> rog, ok, I think I'm convinced, thanks :)
<rog> fwereade_: then the state API is still very general
<rog> fwereade_: cool
<fwereade_> rog, (also the meat of remove-relation)
<rog> fwereade_: but... i'm not sure we've answered the original question
<fwereade_> rog, (although I suspect we should have a slightly different algorithm there)
<rog> fwereade_: should RelationHandler just be Relation?
<rog> fwereade_: ah, no, it can't be
<fwereade_> rog, I'm leaning towards "yes"
<fwereade_> rog, really? bother
<rog> fwereade_: because it's more than a relation - it encapsulates a set of units cooperating in a relation
<fwereade_> rog, all that is implicit in the *Unit params to the methods, innit?
<rog> fwereade_: so for locally scoped relations, you'll have a RelationHandler for each machine running the service
<fwereade_> rog, no you won't
<rog> fwereade_: no?
<fwereade_> rog, you just do rh.Join(localUnit); rh.Watch(localUnit), surely?
<fwereade_> rog, ok, the name of Watch is sl. misleading
<fwereade_> rog, WatchAsThoughYouWere(localUnit) ;)
<fwereade_> rog, the rh itself is the "same" object regardless of what UA is running it
<rog> fwereade_: but if all machines are also doing rh.Join, won't we see changes from all machines?
<fwereade_> rog, well, in a global relation, yes; in a locally scoped relation, no
<rog> oh i think i see
<fwereade_> rog, it figures out what scope to watch from the unit we pass in
<fwereade_> rog, ie it finds all the units in the same scope as the one suggested
<rog> fwereade_: you're relying on the fact that every time we get a relation, it's freshly made
<fwereade_> rog, expand please
<rog> fwereade_: are you saying that the Join method doesn't actually affect the state at all?
<rog> hmm, i think i'm confused
<fwereade_> rog, no, I'm not, but I can't connect that with what you're asking
<fwereade_> rog, that would not surprise me, I *think* I now get relations, but it has been a somewhat arduous path
<rog> fwereade_: i'll try and recap; tell me where i go wrong.
<fwereade_> rog, cool
<rog> fwereade_: someone does add-relation svc1:relation1 svc2:relation2; we look up the endpoints for both of those, and call State.AddRelation with those.
<fwereade_> rog, yes
<rog> fwereade_: the unit agent on each machine sees the new relation and calls Join on it.
<fwereade_> rog, yes
<rog> fwereade_: the unit agent runs the relation1-relation-joined hook.
<fwereade_> rog, no
<rog> ah, no i see
<fwereade_> rog, a *different* unit agent runs that hook when it detects that the first UA called Join
<rog> of course
<rog> fwereade_: ok, the unit agent sees *another* machine that has called Join on the new relation.
<fwereade_> rog, yep
<rog> fwereade_: so it runs the relation1-relation-joined hook.
<fwereade_> rog, (ofc it only sees *that* because it's watching the relation from its own perspective, by having called Watch(itsOwnUnit)
<fwereade_> rog, yes
<rog> fwereade_: but how does this work with locally scoped relations?
<fwereade_> rog, the state.unitScopePath is at the heart of it
<fwereade_> rog, are you familiar with the contents of the /relations/relation-XXX node?
<rog> fwereade_: remind me :-)
<fwereade_> rog, if it's a globally scoped relation, it has a settings subnode, and a subnode for each role in the relation
<fwereade_> rog, if it's container-scoped, it contains a subnode for each container that holds some unit participating in the relation, and each of those container nodes contains the role and settings nodes
<fwereade_> rog, when watching, or joining, on behalf of a given unit, we figure out what the role-and-settings-parent node is for that unit, based on the relation's scope and (maybe) the unit's container
<fwereade_> rog, and then we just watch/write to the contents of that tree
<fwereade_> rog, (note: the top-level relation node of a container-scoped relation does *not* have role or settings nodes)
<fwereade_> rog, make sense?
<fwereade_> rog, (similarly, the Settings(*Unit) method does the same thing)
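The scope-dependent layout fwereade_ describes above could be sketched as follows. This is purely illustrative: the function name `scopePath` and the exact path shapes are assumptions reconstructed from the conversation, not the actual `state.unitScopePath` code.

```go
package main

import "fmt"

// scopePath returns the hypothetical parent node under which a unit's
// role and settings nodes would live: a globally scoped relation keeps
// them directly under the relation node, while a container-scoped
// relation adds a subnode per container holding a participating unit.
func scopePath(relation, scope, container string) string {
	base := "/relations/" + relation
	if scope == "container" {
		return base + "/" + container
	}
	return base
}

func main() {
	fmt.Println(scopePath("relation-0000000042", "global", ""))
	// -> /relations/relation-0000000042
	fmt.Println(scopePath("relation-0000000042", "container", "container-7"))
	// -> /relations/relation-0000000042/container-7
}
```

Watching or writing on behalf of a unit then reduces to computing this parent node once and operating only on its subtree, which is why the top-level node of a container-scoped relation needs no role or settings children of its own.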
<rog> fwereade_: ah, so the Join method does the magic based on the unit
<fwereade_> rog, yes; really, *every* Relation(Handler)? operation depends on a specific unit to get the appropriate perspective
<rog> fwereade_: BTW unitScopePath doesn't actually seem to be used anywhere
<fwereade_> rog, update your tree :)
<rog> fwereade_: ah ha!
<fwereade_> rog, and then for examples of actually constructing a useful one, take a look at the relation-unit branch itself
<fwereade_> rog, in Unit.AgentJoin
<fwereade_> rog, it's very small but I'm rather pleased with it
<fwereade_> rog, the python seems to do path-hackery all over the place :)
<rog> fwereade_: ok, so i *think* i understand now, and it seems reasonable then that the methods should be directly on the Relation.
<fwereade_> rog, cool
<fwereade_> rog, I think I'll talk about it with niemeyer too, but it's good to know I'm not *obviously* all underpants-on-head about it ;)
<rog> fwereade_: how would Relation.Join(unit) differ from Unit.AgentJoin(relation) BTW?
<fwereade_> rog, it's only responsible for the Pinger
<fwereade_> rog, AgentJoin does both the pinger *and* the watching *and* gives access to the settings node
<fwereade_> rog, I *suspect* we'll still end up with something similar in effect to the RelationUnit but I can't see far enough into the future to be sure
<fwereade_> rog, the precise types we need to sling around will become clearer as I work on the queuing and tie it into cmd/jujuc/server.ClientContext
<rog> fwereade_: if RelationHandler is actually Relation, then it seems like we've managed to lose a type, which sounds like a good thing.
<rog> fwereade_: well, lose a public type anyway
<fwereade_> rog, yeah, I think so
<rog> fwereade_: i'm definitely +1 on that
<fwereade_> rog, cool :)
<rog> fwereade_: i think Join, Settings and Watch on Relation is an easier thing to understand
<rog> fwereade_: i hadn't quite grasped what a RelationUnit actually *was* :-)
<fwereade_> rog, and we don't have ServiceRelation any more, I never really understood what that was either ;)
<rog> fwereade_: cool :-)
 * rog likes it when an API comes together
<rog> fwereade_:  https://codereview.appspot.com/6421045
<fwereade_> rog, LGTM
 * fwereade_ tuts in niemeyer's general direction
<fwereade_> (config gofmt ;))
 * fwereade_ suddenly grasps the full horror of how the tests will look if he tries:
<fwereade_> var hookQueueTests = []struct {
<fwereade_>     adds []state.RelationUnitsChange
<fwereade_>     gets []relationer.HookInfo
<fwereade_> }{
<fwereade_> ...given that:
<fwereade_> // RelationUnitsChange holds settings information for newly-added and -changed
<fwereade_> // units, and the names of those newly departed from the relation.
<fwereade_> type RelationUnitsChange struct {
<fwereade_>     Changed  map[string]UnitSettings
<fwereade_>     Departed []string
<fwereade_> }
<fwereade_> // UnitSettings holds information about a service unit's settings within a
<fwereade_> // relation.
<fwereade_> type UnitSettings struct {
<fwereade_>     Version  int
<fwereade_>     Settings map[string]interface{}
<fwereade_> }
<fwereade_> ...and:
<fwereade_> // RelationUnitsChange holds settings information for newly-added and -changed
<fwereade_> // units, and the names of those newly departed from the relation.
<fwereade_> type RelationUnitsChange struct {
<fwereade_>     Changed  map[string]UnitSettings
<fwereade_>     Departed []string
<fwereade_> }
<fwereade_> // UnitSettings holds information about a service unit's settings within a
<fwereade_> // relation.
<fwereade_> type UnitSettings struct {
<fwereade_>     Version  int
<fwereade_>     Settings map[string]interface{}
<fwereade_> }
 * fwereade_ ponders grumpily
<TheMue> morning
<TheMue> rog: Btw, your log-tracking-testing will soon fly into the trunk. Just doing the final steps.
<rog> TheMue: cool. i saw gustavo's remarks, and breathed a sigh of relief that he thought it was ok :-)
<rog> TheMue: morning, BTW!
<TheMue> rog: morning and yes, it's indeed a nice idea. didn't think so in the beginning, but seeing it in real life feels good.
<rog> TheMue: it went through a couple of more complex iterations before it arrived at the paste i gave you. i'm quite happy with how it turned out actually.
<rog> fwereade_: erm, you pasted the same thing twice
<rog> fwereade_: did you mean to post the HookInfo type definition?
<fwereade_> rog, oh, blast, yeah
<fwereade_> type HookInfo struct {
<fwereade_>     Name       string
<fwereade_>     RemoteUnit string
<fwereade_>     Members    map[string]map[string]interface{}
<fwereade_> }
<fwereade_> rog, but not to worry, it's readable now
<fwereade_> rog, there are rather a lot of cases to deal with, though ;)
<rog> fwereade_: what are the names in RelationUnitsChange.Departed, BTW? unit names?
<TheMue> And once again heavy rain. Hey, weather god, it's enough!
<rog> TheMue: you'll be too hot next week!
<TheMue> rog: Yes, I've seen, and I promised my family to bring some good weather with me like from Oakland.
<rog> fwereade_: also, it seems odd that RelationUnitsChange gives no way of telling *which* unit changed. is that because we don't need to know?
<TheMue> rog: Tomorrow the school holidays begin, so it *has* to change. ;)
<hazmat> ec2 ssd instances!
<hazmat> http://aws.typepad.com/aws/2012/07/new-high-io-ec2-instance-type-hi14xlarge.html
<rog> TheMue: we haven't really had a summer here. it's been continual rain. and when i got back, found that there'd been some serious rain while we were away: http://www.youtube.com/watch?v=huCnxI8qDb0 http://www.youtube.com/watch?v=lpJVTCLHKUE
<rog> hazmat: sounds expensive :-)
<hazmat> rog, depends on if you need it.. netflix reduced their costs by half for the same throughput on a benchmark.. http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html
<rog> hazmat: interesting
<TheMue> rog: Hard storms, yes. We had some of those in July too. May/June have been some wonderful warm days where I worked on the veranda.
<TheMue> Hmm, Gavin Harrison really motivates me to realize an old dream: learn to play drums.
<TheMue> He's the drummer of Porcupine Tree.
<rog> fwereade_: i don't think you'd need an intention log per relation
<rog> fwereade_: i *think* that one log would suffice
<rog> TheMue: one forecast i saw had it at 37 degrees in lisbon on one day
<TheMue> rog: ouch
<rog> TheMue: but mostly around 30, which is bearable
<TheMue> rog: Yes, just took a look too. 30 is fine
<fwereade_> rog, sorry, lunch :)
<rog> fwereade_: np!
<fwereade_> rog, RelationUnitsChange has Changed map[unitname]info, Departed []unitname
<fwereade_> rog, you're expected to know which relation changed because you started watching a particular relation
<rog> fwereade_: ah, i think the documentation should mention that the key of Changed is the unit name :-)
<fwereade_> rog, sensible, cheers
<rog> fwereade_: when i glanced at it my brain said "attribute name"
<fwereade_> rog, I thought it kinda went without explicitly saying when I wrote it but it could certainly be clearer
<rog> fwereade_: some of us are stupider than you might think :-)
<fwereade_> rog, I'm interested to hear about your further thoughts on intention logs though
<fwereade_> rog, (sorry I had to dash yesterday)
<rog> fwereade_: np
<rog> fwereade_: i'm thinking that when we get an error executing an intention, we can write "broken" to the intention log.
<rog> fwereade_: that doesn't stop the updates coming through the queue for that relation, of course.
<rog> fwereade_: but i'm reconsidering if having the queue in a separate goroutine is the right approach after all
<fwereade_> rog, sorry, I'm not coming up with any responses
<rog> fwereade_: that's fine, it's all vapour anyway :-)
<rog> fwereade_: just my vague thoughts on the matter. you're much closer to the problem.
<fwereade_> rog, I am definitely having difficulty visualising an intention log with the unit lifecycle state and all the relation lifecycle states and brokens and dones and collapses and resolve-handling :)
<fwereade_> rog, collapses and resolves, in-memory, for a single relation, is occupying most of my brain ATM :)
<fwereade_> rog, I think I have something that will work and be easy enough to extend to handle process bounces, but we'll see
<rog> fwereade_: this kind of API, perhaps? http://paste.ubuntu.com/1099903/
<fwereade_> rog, ATM I have http://paste.ubuntu.com/1099905/
<rog> fwereade_: i'm not sure just RelationUnitsChange is enough, as it doesn't hold the name of the relation that changed.
<rog> fwereade_: of course, it could get that
<fwereade_> rog, I'm assuming one of these per relation
<fwereade_> rog, and that the client should know what relation they're dealing with
<rog> fwereade_: that sounds good actually
<fwereade_> rog, it's about all I can fit in my head ATM :)
<rog> :-)
<rog> niemeyer: yo!
<niemeyer> Morning!
<TheMue> niemeyer: Heya
<rog> niemeyer: what if there's already a setting for GOBIN in the environment? are later entries guaranteed to override earlier ones?
<fwereade_> niemeyer, heyhey!
<fwereade_> niemeyer, rog, I have an in-memory HookQueue for your perusal: https://codereview.appspot.com/6422049
<rog> fwereade_: cool
<rog> fwereade_: (not entirely sure about "relationer" as a name, even though we already have the dubious "machiner")
<niemeyer> rog: I guess we shouldn't trust it indeed
<rog> fwereade_: how about just "hook"
<niemeyer> rog: It works, but who knows
<fwereade_> rog, because this is only one of many components of a system that will deal purely with unit agents' participation in relations?
<rog> niemeyer: yeah, it seems impl dependent to me
<rog> fwereade_: i'd really expect to see "uniter" :-)
<fwereade_> rog, uniter will come
<rog> fwereade_: under worker/, anyway
<fwereade_> rog, that has a whole bunch of other responsibilities, including kicking off stuff in relationer
<rog> fwereade_: maybe worker/uniter/relation ?
<fwereade_> rog, like how we have worker/provisioner/firewall? :p
<rog> fwereade_: in a way, firewaller is an independent worker in its own right.
<rog> fwereade_: i'm not *sure* that can be said for relationer
<rog> fwereade_: the fact that the firewaller happens to run in the PA is historical accident IMO
<fwereade_> rog, I guess I could live with worker/uniter/relation/hookqueue.go, but the unit stuff and the relation stuff are really pretty distinct
<rog> fwereade_: "hooks" seem fairly closely tied to the unit agent to me
<rog> fwereade_: i'd be happy for all this to live inside the unit agent, tbh
<rog> fwereade_: not entirely sure what we gain by having a separate package
<fwereade_> rog, mainly what we gain is an opportunity to avoid the ickiness of python, in which we have relation stuff and unit stuff smeared across the same package together despite the connections between the two being very very tenuous
<rog> fwereade_: i'd've thought that we could make that distinction very clear by the types we use
<fwereade_> rog, the unit stuff starts and stops the relation stuff, it is true, but the only other point of contact that I can recall is a shared hook executor, which feels to me much like the environ shared between the provisioner and the firewaller
<fwereade_> rog, uniter.UnitWorkflow/RelationWorkflow, uniter.UnitLifecycle/RelationLifecycle feels nasty to me
<rog> fwereade_: that's ok, i think. we already have fairly self-contained modules within the same package.
<fwereade_> rog, compared to uniter.Workflow/Lifecycle, relationer.Workflow/Lifecycle
<rog> fwereade_: does any of this stuff need exporting?
<rog> fwereade_: i think of all this as internal details of the unit agent/worker.
<fwereade_> rog, certainly the workflow stuff
<fwereade_> rog, or, hmm, maybe we can get away without that
<rog> fwereade_: by workflow stuff, you mean?
<fwereade_> rog, ZK storage of unit/relation state, used by status, but that will probably want to be a separate package in itself
<rog> fwereade_: feels to me like that would live in state
<fwereade_> rog, also disk storage, I think, although we may come up with some nice way around that
<fwereade_> rog, anyway, I think we have now discussed this enough to make the necessary move half a dozen times if it becomes apparent that it is misplaced ;)
<rog> fwereade_: tbh, i'm happy having the relation hook queue stuff in separate package if it's really self-contained, but "relationer" isn't the right name. it's not an "er" :-)
<fwereade_> rog, I am impressed by your precognitive ability :/
<rog> fwereade_: ok, i may be wrong!
<rog> fwereade_: as always...
<fwereade_> rog, not to worry, but the code in there is about 1000 times more interesting than the name of the package IMO
<rog> fwereade_: indeed, i'm looking through it. just thought i'd convey my naming discomfort...
<fwereade_> rog, no worries :)
<rog> fwereade_: any particular reason you need to go through the changed units in alphabetical order, BTW?
<fwereade_> rog, consistent tests
 * rog just noticed the comment, duh
<fwereade_> :)
 * fwereade_ thinks he's spotted a bug
 * fwereade_ writes a test
 * fwereade_ stops and tries to figure out how to
 * fwereade_ thinks it's not a bug
<fwereade_> niemeyer, ping
<niemeyer> fwereade_: Yo
<fwereade_> niemeyer, I was wondering if there was any particular reason to have RelationHandler separate from Relation?
 * niemeyer looks at what Relation is today and how it's used
<fwereade_> niemeyer, and I also think that Watch() should really return a RelationUnitsWatcher, which would then be an exported type, and could be tested in a more pleasing fashion
<niemeyer> fwereade_: Both suggestions look sane
<fwereade_> niemeyer, sweet, thanks
<rog> fwereade_: the hook queue code looks reasonable
<rog> fwereade_: one question: do we actually need a queue there?
<fwereade_> rog, sorry, where?
<rog> fwereade_: could we just keep a bag of stuff waiting to be done?
<rog> fwereade_: in the hook queue
<rog> fwereade_: does it matter which order the relation hooks are executed (between relations, that is)?
<fwereade_> rog, across different relations, no it doesn't
<fwereade_> rog, within one relation yes it does
<rog> fwereade_: i was wondering about a data structure like this: http://paste.ubuntu.com/1100094/
<fwereade_> rog, needs versions but that's by the by
<rog> fwereade_: yeah
<fwereade_> rog, the prospect of entirely dropping ordering seems odd to me
<rog> fwereade_: it's probably crack, mind
<rog> fwereade_: well, we're not *entirely* dropping it :-)
<rog> fwereade_: and if we restart, the order is going to be arbitrary anyway...
<fwereade_> rog, it feels like we'd need to add a bunch of icky heuristic stuff to ensure we didn't just, say, not bother to handle a departed for 20 minutes because there were lots of interesting changes going on
<rog> fwereade_: statistically that would be very unlikely :-)
<rog> fwereade_: BTW shouldn't Done call remove() ?
<fwereade_> rog, remove() is dead
<fwereade_> rog, proposal is updated
<fwereade_> rog, and the queue is updated in Next
<fwereade_> rog, anyway I'm not *sure* but it feels like the unordered solution would be more churny with a heavily loaded queue
<fwereade_> rog, and I think there's something to be said for predictability
<rog> fwereade_: here's a more predictable solution in the same vein: http://paste.ubuntu.com/1100115/
<rog> fwereade_: we have a simple linked list of units
<fwereade_> rog, and I think ensuring joined-then-immediately-changed might also be kinda icky
<rog> fwereade_: i know it's probably utterly irrelevant, but i'd like to see the queue operations be O(1), and i don't think it's hard.
<fwereade_> rog, that's pretty nice, I'm still not sure it does anything other than push the complexity around, but I'll look into it some more
<rog> fwereade_: i'm pretty sure it makes it more efficient. but simpler... i dunno.
<rog> fwereade_: i'm hoping so.
<fwereade_> rog, maaaaybe :)
<fwereade_> rog, it's missing membership and settings information
<fwereade_> rog, and I'm not sure it's entirely trivial to add them cleanly
<fwereade_> rog, (also, list needs to be doubly linked, but that is also by the by)
<rog> fwereade_: i thought a single link would probably be ok
<fwereade_> rog, how do we reorder in O(1)?
<rog> fwereade_: why do we reorder?
<rog> fwereade_: members looks like it's applied only by Next, and probably wouldn't change that much.
<rog> fwereade_: settings is in unitStatus.settings
<fwereade_> rog, because some earlier events are replaced by later events, eg depart on top of change, or change on top of depart
<fwereade_> rog, yeah, but you're missing all the units that aren't in the queue
<rog> fwereade_: no reordering required there, i think.
<rog> fwereade_: and i *think* that the unitStatus map can contain all units, not just ones in the queue
<fwereade_> rog, hmm, the python did need it, but I think you're right; this implementation doesn't
<rog> fwereade_: (ok, i realise that's definitely a change from where i started :-])
<fwereade_> rog, sure, that's the nature of the beast
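The pastebin links with rog's actual design have expired, but the shape being discussed — O(1) queue operations via a doubly-linked list of pending units, plus a status map covering *all* units so membership and settings survive outside the queue — could be sketched roughly like this. Every name here is an assumption, and the sketch deliberately omits the joined-then-immediately-changed follow-up and version-based change detection:

```go
package main

import "fmt"

// unitStatus tracks one remote unit; prev/next link it into the queue
// of pending hooks only while queued is true.
type unitStatus struct {
	unit       string
	hook       string // pending hook: "joined", "changed" or "departed"
	version    int
	prev, next *unitStatus
	queued     bool
}

// hookQueue holds status for all known units, queued or not.
type hookQueue struct {
	head, tail *unitStatus
	info       map[string]*unitStatus
}

func newQueue() *hookQueue { return &hookQueue{info: map[string]*unitStatus{}} }

// push appends us to the tail of the pending list in O(1).
func (q *hookQueue) push(us *unitStatus) {
	us.queued, us.prev, us.next = true, q.tail, nil
	if q.tail == nil {
		q.head = us
	} else {
		q.tail.next = us
	}
	q.tail = us
}

// unlink removes us from the pending list in O(1); the double link is
// what makes mid-queue collapses cheap.
func (q *hookQueue) unlink(us *unitStatus) {
	if us.prev == nil {
		q.head = us.next
	} else {
		us.prev.next = us.next
	}
	if us.next == nil {
		q.tail = us.prev
	} else {
		us.next.prev = us.prev
	}
	us.prev, us.next, us.queued = nil, nil, false
}

// change records a join or settings change, collapsing with any hook
// already queued for the same unit.
func (q *hookQueue) change(unit string, version int) {
	us, ok := q.info[unit]
	if !ok {
		us = &unitStatus{unit: unit, hook: "joined"}
		q.info[unit] = us
		q.push(us)
	} else if !us.queued {
		us.hook = "changed"
		q.push(us)
	}
	us.version = version
}

// depart records a departure: a still-queued "joined" cancels out
// entirely; anything else becomes a pending "departed".
func (q *hookQueue) depart(unit string) {
	us := q.info[unit]
	if us == nil {
		return
	}
	if us.queued && us.hook == "joined" {
		q.unlink(us)
		delete(q.info, unit)
		return
	}
	if !us.queued {
		q.push(us)
	}
	us.hook = "departed"
}

// next pops the oldest pending hook; ok is false when nothing is queued.
func (q *hookQueue) next() (unit, hook string, ok bool) {
	us := q.head
	if us == nil {
		return "", "", false
	}
	q.unlink(us)
	if us.hook == "departed" {
		delete(q.info, us.unit)
	}
	return us.unit, us.hook, true
}

func main() {
	q := newQueue()
	q.change("mysql/0", 1) // queues joined
	q.change("mysql/1", 1) // queues joined
	q.depart("mysql/1")    // joined + departed cancel out
	u, h, _ := q.next()
	fmt.Println(u, h) // mysql/0 joined
	_, _, ok := q.next()
	fmt.Println(ok) // false
}
```

The point of keeping all units in `info`, not just queued ones, is exactly the one made above: `Members` for a hook can be assembled from the map even for units with nothing pending.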
<fwereade_> rog, serializing that with all those pointers will be icky, but serialization and reconciliation is going to be interesting regardless I think
<rog> fwereade_: serialisation would be easy i think, but i'm still not convinced we want to serialise this.
<rog> fwereade_: interesting, yes
<fwereade_> rog, this has all the state we need to serialize wrt the hook queue, I think
<rog> fwereade_: i'm still thinking that incremental might be the way to go, but that may be entirely crack
<rog> fwereade_: http://paste.ubuntu.com/1100165/
<rog> fwereade_: i'll get back to what i'm supposed to be doing now, erk!
<rog> fwereade_: when you run a departed hook, does it have access to the old settings of the remote unit?
<fwereade_> rog, no
<rog> fwereade_: hmm, that seems a pity in a way. i can see that it might be useful.
<fwereade_> rog, I think it's just *gone*, too bad, is the thinking :)
<rog> fwereade_: it means that if you had any resources associated with some attribute exposed by the remote unit, you'd have to keep track of it yourself. but actually, i can see stronger arguments for the settings not being available - if you restart, you won't have access to them.
<TheMue> Aargh! What's that? My proposal ended with a "can't upload base of .lbox: ERROR: Checksum mismatch" and the original issue is messed up. I thought I did it right with the prerequisite this time. *hmpf*
 * TheMue softly switches into panic mode
<TheMue> *phew*
<TheMue> Wrote down a recipe once and it worked.
 * niemeyer => lunch
<fwereade> rog, fwiw, there is definite ugliness related to restarting when there's an inflight hook in which some members no longer exist
<fwereade> rog, haven't quite figured out what to do about that yet
<rog> fwereade: hmm, how is that awkward?
<fwereade> rog, we don't really want to just remove members from the relation without executing departed hooks
<fwereade> rog, we can't sanely execute the departed hooks while we're in a screwed state from the previous errored hook
<rog> fwereade: ah! you mean resolving, not restarting?
<fwereade> rog, I mean restarting, I was using inflight to denote resolviness
<rog> fwereade: ah, so we're in an error state when we restart?
<fwereade> rog, yeah
<fwereade> rog, or, well, we might be
<fwereade> rog, the hook queue doesn't have to worry about *that* at least
<rog> fwereade: hmm, maybe the --retry could be interpreted as "retry that hook eventually" :-)
<rog> fwereade: so you'd run the departed hooks first
<fwereade> rog, and what if it's just in limbo?
<rog> fwereade: sorry, what if _what_ is just in limbo?
<fwereade> rog, the queue... sitting there for an arbitrary amount of time, waiting for resolution
<rog> fwereade: well, we know that the queue is waiting for resolution, right? i'm not sure i see the problem.
<fwereade> rog, the relation should surely still contain its original members until we have executed departed hooks?
<fwereade> rog, but we don't have access to those members, because they don't exist any more
<rog> fwereade: what can access those members? we're not running any hooks for that relation, no?
<fwereade> rog, doesn't the hook that errored (or just was not completed) want them?
<fwereade> rog, forget resolved entirely
<rog> fwereade: if we forget resolved, then we have to assume that the hooks all complete successfully, no?
<fwereade> rog, ha ha :)
<fwereade> rog, if we haven't said Done(), we can't assume that, can we?
<fwereade> rog, it is *important* that we don't skip hooks ;)
<rog> fwereade: i think that if the unit agent goes down while executing a hook, we could treat that as an error and require resolution
<rog> fwereade: after all, we won't *know* if the hook failed or not
<fwereade> rog, could do, might make life easier for us
<fwereade> rog, but there *should* be nothing wrong with re-executing a hook
<rog> fwereade: here's another little question: what happens if a relation gets departed while its join hook is executing. are we still required to execute the change hook?
<fwereade> rog, no, tested
<rog> fwereade: so we can execute joined then immediately departed
<fwereade> rog, yeah, the python knowingly does that
<rog> fwereade: cool. i thought that *should* be the case, but...
<rog> fwereade: so there *is* something wrong with re-executing a hook, in general.
<fwereade> rog, sorry, why is that so?
<rog> fwereade: if we crashed while a change hook is executing, and the relation is departed while we're down, we don't want to re-execute the change hook, but the departed hook, i think.
<fwereade> rog, you know what? all these are specific cases of the general "we sometimes need access to relation unit settings after the units are gone" problem
<fwereade> rog, the python addresses this by saying LALALA I CAN'T HEAR YOU when you point this out
<rog> fwereade: i'm not sure
<fwereade> rog, and trusting that nobody ever actually deletes relation unit settings
<rog> fwereade: i'm not sure i see any places above where we need access to relation unit settings after the units are gone.
<fwereade> rog, that happens *all* the time
<fwereade> rog, whenever two units depart at the same time?
<fwereade> rog, we execute 2 departed hooks
<rog> fwereade: in fact... we always provide the "latest" version of a unit's settings. what happens if that unit has departed already but we haven't seen the departed hook yet
<rog> ?
<fwereade> rog, the second unit is still apparently present during the first depart
<SpamapS> FYI, http://www.oscon.com/oscon2012/public/content/video sabdfl keynoting about juju
<fwereade> rog, that should not be such a problem, we just use the ones we already have
<rog> fwereade: ah, but not if we've restarted, right?
<fwereade> rog, indeed
<rog> fwereade: ah, and that's what you meant by "nobody ever actually deletes relation unit settings"
<rog> fwereade: ... from zk, yes?
 * TheMue watches Mark
 * rog too
<fwereade_> rog, sorry, the last thing I said was "rog, but that does kinda offend my sensibilities a little"
<rog> fwereade_: last thing i saw was "indeed"
<fwereade_> rog, unless we do what we do in python and just hope they exist, at least
<fwereade_> rog, which they will, as long as we don't GC them until the whole relation is gone
<fwereade_> rog, but that does kinda offend my sensibilities a little
<rog> fwereade_: yeah, me too
<fwereade_> the *easy* solution, of which niemeyer is not fond for entirely sensible reasons, is to store all the settings as well
<fwereade_> rog, blast, gtg :(
<rog> fwereade_: k
<fwereade_> rog, I am interested to hear any brainwaves you may have
<rog> fwereade_: i'll continue thinking on it
<fwereade_> rog, cheers
<rog> fwereade_: i'm around late this evening BTW
<rog> fwereade_: though i should pack!
<fwereade_> rog, I *might* be, my later plans remain unclear
<rog> TheMue: crossed fingers the demo works :-)
<TheMue> rog: Yes ;) Always the hard part.
<rog> essentially
<TheMue> rog: That has been nice.
<TheMue> rog: Ha, here he smiles.
<rog> TheMue: why bother checking that machineUnitsChanges is closed at all? we never close it - there's no need to check closed on *every* channel receive, i think.
<TheMue> rog: Which line?
<rog> TheMue: line 64
<rog> TheMue: case change, ok := <-fw.machineUnitsChanges:
<rog> TheMue: why not just: case change := <-fw.machineUnitsChanges ?
<rog> TheMue: it's trivially verifiable that we don't close the channel
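The distinction rog is drawing can be seen in a tiny standalone example (the channel here is illustrative; `fw.machineUnitsChanges` is the firewaller channel under review):

```go
package main

import "fmt"

func main() {
	ch := make(chan int, 1)
	ch <- 42

	// The two-value receive form only earns its keep if the channel
	// can be closed: ok reports whether a real value was received.
	v, ok := <-ch
	fmt.Println(v, ok) // 42 true

	close(ch)
	// A receive from a closed channel returns immediately with the
	// zero value and ok == false ...
	v, ok = <-ch
	fmt.Println(v, ok) // 0 false

	// ... so when a channel is provably never closed, as with
	// fw.machineUnitsChanges, the ok check is dead code and the
	// single-value form `case change := <-ch:` is enough.
}
```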
<TheMue> rog: just a habit; i'll add a note
<TheMue> afk
<niemeyer> fwereade__: ping
<niemeyer> I'll step out for a coffee break, back soon
<niemeyer> Back
<fwereade__> niemeyer, heyhey, I'm around for a bit
<niemeyer> fwereade__: yo
<fwereade__> niemeyer, what can I do for you?
<niemeyer> fwereade__: I just wanted to run a seed concept by you
<fwereade__> niemeyer, cool
<niemeyer> fwereade__: I haven't reviewed your branch yet, but hopefully will do that today still (just working a bit on config)
<niemeyer> fwereade__: But,
<niemeyer> fwereade__: One thing that we talked about at some point in the past
<niemeyer> fwereade__: and that seems curious in the context of the "hook queue"
<niemeyer> fwereade__: Is that, in fact, we may not really need a *queue* per se
<niemeyer> fwereade__: The only thing we need to know is what to run next
<niemeyer> fwereade__: That is, we don't have to take action on events coming from ZooKeeper immediately, as we're doing something else
<niemeyer> fwereade__: Because we have to handle the possibility of losing events in case of errors anyway
 * fwereade__ continues to listen with interest
<niemeyer> fwereade__: So, in a way, this means we can be more comfy and simply consume the next event until we have a decision to do something at hand
<niemeyer> fwereade__: and then leave the next queue of events completely unhandled while we do that
<niemeyer> fwereade__: Does that ring any bells?
<fwereade__> niemeyer, so essentially we just snapshot the state whenever we think it might be a good time to run some hooks, and at that point figure out the diff?
<niemeyer> fwereade__: Not even snapshot
<niemeyer> fwereade__: We consume events from the queue until we have a decision on what to do
<niemeyer> fwereade__: But the "queue" is the network itself, and the zookeeper library.. we don't have to be processing everything in advance
<niemeyer> fwereade__: Of course, we still need the logic for keeping track of where we stand, etc
<fwereade__> niemeyer, I am having a little bit of trouble seeing the distinction
<niemeyer> fwereade__: and recording that for error recovery
<niemeyer> fwereade__: Maybe there's none..
<fwereade__> niemeyer, rog had some interesting ideas about it not really actually needing to be ordered, which I thought were worthy of a bit of exploration
<niemeyer> fwereade__: But if nothing else, I see a small conceptual difference in trying to maintain a queue of upcoming events, and simply computing what's the *next* event
<niemeyer> fwereade__: Uh.. that's not what I mean
<niemeyer> fwereade__: Order is important, I suspect
<fwereade__> niemeyer, that is my *instinct* but I can't quite figure out *why*, apart from it being an awful lot easier to get my head around
<fwereade__> niemeyer, be that as it may, I don't want to derail
<niemeyer> fwereade__: That'd be good enough of a reason to keep it ordered, but a unit departing and then joining sounds very different from joining and then departing :)
<fwereade__> niemeyer, what the hook queue *does*, though, both here and in python, is just keep track of the next event per unit
<niemeyer> fwereade__: Next event, or next eventS
<fwereade__> niemeyer, and I'm not sure we can reduce that to just the next event without dropping considerations of order
<niemeyer> ?
<fwereade__> niemeyer, next event singular
<fwereade__> niemeyer, it collapses operations aggressively
<fwereade__> niemeyer, if there's more than one queued event per unit we've done it wrong ;)
<niemeyer> fwereade__: Is there ever a case where we have more than one pending operation?
<fwereade__> niemeyer, joined/changed but that's handled internally
<niemeyer> fwereade__: Okay, cool
<niemeyer> fwereade__: So maybe it's already optimal
<fwereade__> niemeyer, well, it's surely not *optimal*, I definitely intend to look into a couple of rog's less surprising ideas
<niemeyer> fwereade__: That's really it. I wasn't sure if there was anything to adapt, but I had this thing in my head that I pondered if could be useful or not
<fwereade__> niemeyer, I'm pretty sure it's a little nicer than the python
<fwereade__> niemeyer, but I actually recommend you not worry about that review until tomorrow
<fwereade__> niemeyer, because it *might* get much better
<niemeyer> fwereade__: Ohh, exciting :-)
<fwereade__> niemeyer, and even if it doesn't, a day's delay won't hurt :)
<niemeyer> fwereade__: True
<niemeyer> fwereade__: Have plenty to do on the config side
<niemeyer> fwereade__: I'm hoping to push the whole thing for review today still
<fwereade__> niemeyer, I confess to being a little relieved that you haven't done the whole config thing in 5 minutes flat ;)
<niemeyer> fwereade__: Haha :)
<niemeyer> fwereade__: I'm taking the chance to clean up a few other edges too, such as bring into life that old StringMap type
<fwereade__> niemeyer, I do have one worry about the relation hooks, though
<fwereade__> niemeyer, nice
<niemeyer> fwereade__: I'll also twist Schema so it handles defaults by itself
<fwereade__> niemeyer, +1
<niemeyer> fwereade__: Oh, what's that?
<fwereade__> niemeyer, you remember we discussed sending settings through that whole pipeline rather than just versions?
<niemeyer> fwereade__: Right
<fwereade__> niemeyer, I'm not sure we actually really gain very much from that
<niemeyer> fwereade__: Hmm
<niemeyer> fwereade__: Why's that?
<fwereade__> niemeyer, because I can't see any way around us *sometimes* needing to access settings from departed units
<niemeyer> fwereade__: I'm missing both ends of the problem I think
<fwereade__> niemeyer, ok, simplest possible case
<fwereade__> niemeyer, two units depart at once
<niemeyer> fwereade__: Ok
<fwereade__> niemeyer, we run hooks for the first one, claiming the second one is still part of the relation, and so it's perfectly legitimate for the hook to go through the membership and do something based on the settings for the departed unit that it doesn't yet know is departed
<fwereade__> niemeyer, with settings cached in memory this is nice and easy
<niemeyer> fwereade__: I actually think we should allow units to query settings for departed units
<niemeyer> fwereade__: But not sure if that's relevant for the problem you're explaining
<fwereade__> niemeyer, I think it's very relevant
<fwereade__> niemeyer, the idea that we then can't really GC any relation settings node until the whole relation is down bothers me a little, because it feels somehow untidy
<fwereade__> niemeyer, *but* I don't think there's any way past that
<fwereade__> niemeyer, except to cache the whole damn relation state to disk all the time, which is just too much to deal with
<niemeyer> fwereade__: It would not solve either way.. the unit could have changed its settings before departure
<niemeyer> fwereade__: I think it's fine to leave the garbage uncollected for a while
<fwereade__> niemeyer, true
<niemeyer> fwereade__: We can GC it eventually in the corrective agent hinted at over the lifecycle conversation
<fwereade__> niemeyer, yeah, I'm happy enough not worrying about that really, it hasn't seemed to hurt in python too much
<niemeyer> fwereade__: Okay, but what's that issue that makes sending settings with the version non-useful?
<fwereade__> niemeyer, ok, so, we end up with a situation where any relation hook might relation-get some departed unit (whether or not it is known to be departed according to the "current" membership)
<fwereade__> niemeyer, so we need both the get-arbitrary-settings code that we currently have in python *and* the use-cached-settings code we're working towards here
<niemeyer> fwereade__: Okay, and you feel the get-arbitrary-settings always would be simpler
<fwereade__> niemeyer, and I'm not really sure that the somewhat reduced network access is really worth the extra code plus memory budget
<fwereade__> niemeyer, I don't feel sure either way
<niemeyer> fwereade__: I'm happy to go with whatever you feel would result in less code and less mental trickiness
<fwereade__> niemeyer, but using sheer projected simplicity of code as a heuristic I tend *slightly* towards the always-get-arbitrary
<fwereade__> niemeyer, I don't think I want to toss the caching *yet*
<fwereade__> niemeyer, but consider this early warning that I *might*
<niemeyer> fwereade__: Thanks for the hinting, simplicity rocks :)
<fwereade__> niemeyer, and I'm pleased to hear you're ok exploring the alternatives if it does start to bother me seriously ;)
<fwereade__> niemeyer, anyway, I think I'm going to try to rework the RelationUnit tests in terms of Join, Watch and Settings
<niemeyer> fwereade__: Superb
<niemeyer> fwereade__: Btw,
<niemeyer> fwereade__: While you're there,
<fwereade__> niemeyer, if you happen to see a propose of that stuff land while you're still around, it would be higher-value than the hook queue for now :)
<niemeyer> fwereade__: When looking at the tests I was vaguely considering how tricky it would be to break those tests down into independent tests, or even a table-styled test that used "scripting" rather than that long sequence of contiguous operations
<fwereade__> niemeyer, I think that they will get a *lot* cleaner now that the functionality isn't all hidden in one magic type with freaky action-at-a-distance
<niemeyer> fwereade__: It wouldn't be especially bad if we had to merge them in that long chain, because they're quite well documented (thanks) and sensible
<niemeyer> fwereade__: Having them independent would just be a nice bonus
<fwereade__> niemeyer, cheers :)
<fwereade__> niemeyer, (well, action at a distance is reasonable and expected, it's the bidirectional action at a distance that makes it a little tricky to follow I think)
<niemeyer> fwereade__: Yeah
<fwereade__> niemeyer, btw, re the principalKey change
<niemeyer> fwereade__: Ah, yeah?
<fwereade__> niemeyer, the reason to have an empty principal denoting unit-is-principal in the *topology* is because topology units don't know their own keys
<fwereade__> niemeyer, so, with *just* a topology unit, we can't immediately tell whether its principal differs from itself
<niemeyer> fwereade__: Interestingly, it's not the only reason
<niemeyer> fwereade__: We've debated this very recently in the context of mstate
<fwereade__> niemeyer, ah, ok, that is the reason that made me feel it was more trouble than it was worth to change it
<fwereade__> niemeyer,  go on
<niemeyer> fwereade__: myKey == principalKey requires a full scan
<niemeyer> fwereade__: principalKey == "" is indexable
<fwereade__> niemeyer, ah, yes
<niemeyer> fwereade__: Aram had suggested the same change you suggested in the database side, and we ended up rolling back because of that
<fwereade__> niemeyer, I looked at mstate briefly, and it didn't seem to need any changes to work nicely as is
<niemeyer> fwereade__: For that reason, I suggested keeping principalKey in the *Unit as "" too
<fwereade__> niemeyer, ok, that SGTM
<niemeyer> fwereade__: Simply because it's easier to keep in mind what to expect in the principalKey field
<fwereade__> niemeyer, yeah, very sensible
<fwereade__> niemeyer, ok, I'll do that now :)
<niemeyer> fwereade__: Cheers!
<fwereade__> morning davecheney
<fwereade__> ok, this one is starting to piss me off:
<fwereade__> ----------------------------------------------------------------------
<fwereade__> FAIL: watcher_test.go:44: WatcherSuite.TestContentWatcher
<fwereade__> test 0
<fwereade__> watcher_test.go:55:
<fwereade__>     c.Fatalf("didn't get change: %#v", test.want)
<fwereade__> ... Error: didn't get change: watcher.ContentChange{Exists:false, Version:0, Content:""}
<fwereade__> OOPS: 3 passed, 1 FAILED
<fwereade__> --- FAIL: TestPackage (6.43 seconds)
<fwereade__> FAIL
<fwereade__> I've seen it 3 times today, and it disappears like mist in the sun when I try to look for it :/
 * niemeyer looks at the test
<niemeyer> fwereade__: 200ms seems on the short side
<fwereade__> niemeyer, yeah, I suspect that's it, but it is one of those ones that bothers me because it only seems to show up in a full test run
<fwereade__> niemeyer, on its own it seems 100% reliable
<niemeyer> fwereade__: Full test run == JVM on ZK with data, Go runtime with data, etc
<niemeyer> fwereade__: Sum up two GCs, plus the stars in the right location..
<fwereade__> niemeyer, yeah, indeed
<niemeyer> fwereade__: It's the first test, interestingly
<fwereade__> niemeyer, I just like to imagine that if I run it alone often enough the stars should eventually align right
<niemeyer> fwereade__: I'll have a run through the code just in case
<fwereade__> niemeyer, before you get too deep, I'm just proposing unit-principal-key again
<fwereade__> niemeyer, https://codereview.appspot.com/6421049
<niemeyer> fwereade__: Looking
<fwereade__> niemeyer, and if you decide the ContentWatcher test just needs a timeout bump, please assume a pre-emptive LGTM on that :)
<niemeyer> fwereade__: LGTM!
<fwereade__> niemeyer, cheers
<niemeyer> fwereade__: Thanks :)
<niemeyer> fwereade__: We should be printing the watcher Err() with those messages
<fwereade__> niemeyer, good point
<niemeyer> fwereade__: Looks quite straightforwardly correct on the ContentWatch side at least
<niemeyer> fwereade__: I was just pondering about something too
<niemeyer> fwereade__: Do you have an SSD?
<fwereade__> niemeyer, yeah, it seemed sane to me
<fwereade__> niemeyer, nope :)
<niemeyer> fwereade__: Yeah
<niemeyer> fwereade__: That may be it
<niemeyer> fwereade__: We'll get the "All good!" from ZK before it's actually ready to take action
<fwereade__> niemeyer, ahhhh, I did not know that
<fwereade__> thanks zookeeper :/
<niemeyer> fwereade__: The fact it's the first test seems to point in that direction
<fwereade__> niemeyer, I have never seen it on any test other than the first
<niemeyer> fwereade__: If zk was on the way to attending your request, the After() ticker would already be running
<niemeyer> fwereade__: It also explains why repeating rarely catches the bug
<niemeyer> fwereade__: At this point it's all cached
<fwereade__> niemeyer, yeah, sounds pretty plausible to me
<fwereade__> niemeyer, timeout it is then :)
<niemeyer> fwereade__: I bet that if you flush disk and memory buffers, you'll be able to repeat it easily
<niemeyer> Try this: sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
<niemeyer> fwereade__: and then run the tests again
<niemeyer> fwereade__: Btw, do run sync first :)
<niemeyer> fwereade__: This isn't entirely safe
<fwereade__> niemeyer, I was reading around it and decided I could probably live with the risks
<niemeyer> fwereade__: LOL
<fwereade__> niemeyer, not reproed yet, but it does seem to goose the occasional ssh and store tests into failure pretty well
<fwereade__> niemeyer, I'm comfortable bumping the timeout and seeing whether it happens again :)
<niemeyer> fwereade__: Sounds good :)
<niemeyer> All those "ok" in go test ./... feel so great :-)
<niemeyer> fwereade__: Are you up for a quick review?
<fwereade__> niemeyer, sure
<niemeyer> fwereade__: You already reviewed most of it before, actually.. I'm just reviving it
<fwereade__> niemeyer, even better ;)
<niemeyer> lbox propose churning
<niemeyer> Aaaaaaand
<niemeyer> fwereade__: https://codereview.appspot.com/6423062
 * fwereade__ looks
<fwereade__> niemeyer, LGTM
<niemeyer> fwereade__: Cheers!
<niemeyer> fwereade__: Will add default handling now
<davecheney> niemeyer: thanks for your review
<niemeyer> davecheney: np!
<niemeyer> davecheney: Morning!
<davecheney> i will have to repropose it as i've back ported from my next branch
<davecheney> (as it addressed some of the things you raised)
<davecheney> but the change is quite large
<davecheney> well, more than a LGTM will cover
<niemeyer> davecheney: I'm happy for this branch to go in as-is and these changes be done in the follow up
<davecheney> niemeyer: i'd like to spend a little more time on it
<davecheney> obtained string = "error: json: unsupported type: map[int]interface {}\n"
<davecheney> to make sure it's a good skeleton to build upon
<niemeyer> davecheney: Aha, ok
<davecheney> niemeyer: json doesn't support map[int]interface{}, keys must be strings ... :(
<niemeyer> davecheney: Indeed
<niemeyer> davecheney: Do we depend on that?
<davecheney> niemeyer: possibly
<niemeyer> davecheney: We shouldn't
<niemeyer> davecheney: The json spec itself forbids anything else but strings
<niemeyer> davecheney: So if we're doing that, it's fine to fix it
<davecheney> wonderful
<davecheney> as long as yaml formats a string key without quotes, this will be fine
<davecheney> niemeyer: would you recommend I go with structs to describe the status data, or nested maps ?
<niemeyer> davecheney: I'd personally prefer structs, unless the dynamism of fields being present or not turns out to make it more messy than anticipated
<davecheney> i'm hoping ,omitempty will get us most of the way
<niemeyer> davecheney: The struct gives a very nice descriptive view in one location of what the map is composed of
<davecheney> also, for test stability, structs are probably necessary
<davecheney> however, the names like Relation, Result are already declared in the cmd pkg's namespace
<niemeyer> davecheney: Indeed. Might be worth having a nice prefix on all of them
<niemeyer> Dinner.. biab
<davecheney> kk
<davecheney> niemeyer: structs won't work, the yaml output expects the output of each machine to be a map
<davecheney> so machines => map[int]map[string]string
<niemeyer> davecheney: Oops
<niemeyer> davecheney: Ok
<davecheney> it's not the end of the world
<davecheney> niemeyer: some things can be structs, like Service, Unit, Relation, but Machine must be a map to match the python output
<niemeyer> davecheney: Hmm
<niemeyer> davecheney: It can be a map like map[int]Foo, though
<davecheney> niemeyer: yes
<niemeyer> fwereade__: ping
#juju-dev 2012-07-20
<davecheney> niemeyer: ahh, `yaml:"flow"`, might be the salvation
<niemeyer> davecheney: Ah, hold on
<niemeyer> davecheney: You want to compare output?
<niemeyer> davecheney: for testing purposes?
<davecheney> niemeyer: yes
<niemeyer> davecheney: That's probably not a good direction
<niemeyer> davecheney: We've had a lot of pain comparing yaml like that
<davecheney> ok
<niemeyer> davecheney: It's easier to unmarshal the yaml back
<niemeyer> davecheney: and compare the value
<niemeyer> davecheney: This also helps because you can compare bits, rather than full docs
<davecheney> using deep equals ?
<niemeyer> davecheney: Yeah
<niemeyer> davecheney: Indentation is doable, etc
<davecheney> ok, i'll keep working on it
<davecheney> niemeyer: is this a bug ? http://play.golang.org/p/P1dQvMC9T4
<davecheney> niemeyer: nm, the channel corrected my mistake
<niemeyer> davecheney: Yeah, no commas
<TheMue> davecheney, fwereade__, rog: morning btw
<rog> davecheney, fwereade__, TheMue: yo!
<davecheney> rog: themue, hello!
<davecheney> rog: https://codereview.appspot.com/6432045 if you have any comments on the test harness
<davecheney> i would be glad to hear them
<davecheney> testing json and yaml is a pain in the butt
<davecheney> but at least this way, it isn't a pain in the butt when we're pressed for time and someone points out json output is mandatory
 * rog looks
<rog> davecheney: i wonder if we should use MarshalIndent to produce nice looking json. just a thought.
<davecheney> rog: sure
<rog> davecheney: also, i wonder if the status tests could just specify the value of the data structure required; then we could unmarshal the output (in whatever format we want to check) and make sure it deep equals the
<rog> required value
<davecheney> rog: tried that, didn't work
<rog> davecheney: ah, because they unmarshal differently? :-)
<davecheney> yes, json produces map[string]interface{}
<davecheney> yaml produces map[interface{}]interface{}
<rog> davecheney: hmm, but can't you unmarshal into a data structure of the expected kind?
<davecheney> rog: yes, but then the next level down becomes m[i]i
<rog> m[i]i ?
<rog> oh yeah
<rog> hmm
<davecheney> i don't think it's solvable
<davecheney> anyway, that is enough to start merging in my working branches
<davecheney> and if the harness is utterly wrong
<davecheney> then i'll tackle it then
<rog> davecheney: one last thought: could you start with the expected data structure, then marshal it and unmarshal it again, *then* check against the original unmarshalled value you got?
<davecheney> or marshal it, and compare it to the data in cxt.Stdout
<rog> davecheney: no, that won't work
<davecheney> why not
<rog> davecheney: because marshalling isn't deterministic
<davecheney> for json it is, but i take your point
<rog> davecheney: oh really, json sorts map keys? cool, i didn't know that.
<davecheney> gustavo argued that goyaml can do the same thing
<davecheney> i'm going to have to change goyaml slightly anyway
<rog> davecheney: it would make life easier
<rog> davecheney: and json is a good precedent
 * davecheney thinks that json is about as crack headed as javascript
<davecheney> for example, struct { Machines map[int]interface{} }
<davecheney> cannot be json marshalled
<davecheney> yet _must_ be marshalled by yaml
<davecheney> so, because of that, i have to convert machine.Id to a string, then change goyaml to accept a formatting flag to convert the string keys back to ints
<rog> davecheney: oh that's a pity
<davecheney> anyway, dinner time
<davecheney> then i'll ponder your idea
<davecheney> if that means we describe the output once, that is a big win
<rog> davecheney: given what you've said above, i'm not sure it can work.
<rog> davecheney: pity machine ids aren't strings
<TheMue> rog: Letting the Firewaller only use a passed state has been a nice simplification, but how do I get the environment now? Somehow lost the path.
<rog> TheMue: ha ha. you'll need to pass the environment in as an argument too.
<TheMue> rog: sh**
<TheMue> rog: Already expected it *sigh*
<TheMue> rog: Btw, on Sunday, do we try to get seats next to each other or do we just meet at the baggage claim?
<rog> TheMue: it would be nice to sit together, but i can't quite work out how :-)
<fwereade_> morning all
<TheMue> rog: That's the problem.
<TheMue> fwereade_: Morning
<rog> TheMue: except, i guess, we can try to arrive at the checkin desk together... but most seats are gone by then.
<rog> TheMue: i have an idea
 * TheMue listens
<rog> TheMue: i won't have access to a computer, but perhaps i could give you my booking details and you could try and select a seat for both of us on the airline website
<rog> TheMue: in fact, perhaps they allow us to choose a seat now. i'll have a look.
<TheMue> rog: That could work
<TheMue> rog: Your Environ.OpenPorts() wants all ports that have to be open or just a delta that should be opened with this action?
<rog> TheMue: just a delta
<TheMue> rog: Good
<rog> TheMue: otherwise there wouldn't be a ClosePorts method :)
<TheMue> rog: Makes it more simple
<TheMue> rog: Sounds logical :D
<rog> TheMue: yeah, i toyed with both approaches, but this seemed better
<TheMue> rog: Yes, otherwise I would have to maintain a list of all open ports on a machine inside the firewaller. Now I just tell you to open it.
<rog> TheMue: yeah
<TheMue> rog: Hmm, if a port can't be opened (so you return an error), how hard is this error? I could do a panic, but that wouldn't be helpful. Better would be a kind of retry.
<rog> TheMue: yeah, it's a good question.
<rog> TheMue: not very hard, i'd say
<rog> TheMue: perhaps you could put the retry in later, to avoid cluttering the initial logic
<TheMue> rog: But have to leave now until noon, some shopping with Carmen. We can talk later again.
<TheMue> rog: So far I'll add a TODO, yes
<asachs_> Hello all
<fwereade_> asachs_, hi
<fwereade_> everyone, sorry, I have to pop out for 30 mins
<fwereade_> bbs
<asachs_> anyone seen a MaaS based bootstrap fail to install zookeeper ?
<fwereade_> asachs_, not offhand; can you ssh into the instance at all?
<asachs_> fwereade_: yeah i could - but no zookeeper installed :(
 * fwereade_ desperately tries to remember where cloudinit logs to
<fwereade_> asachs_, can you take a look at /var/log/cloud-init-output.log on the instance?
<rog> fwereade_, asachs_: /var/log/cloud-init.log
<fwereade_> rog, ah ok
<rog> fwereade_, asachs_: also, perhaps: cloud-init-output.log
<rog> fwereade_: here's a script i use occasionally for debugging, which automatically copies the logs from an instance to my local machine: http://paste.ubuntu.com/1101520/
<fwereade_> rog, nice :)
<rog> fwereade_: wanna swap a book or two again?
<fwereade_> rog, I'll see if I can find something worth reading :)
<davecheney> rog: can you paste that link again please ?
<rog> davecheney: which link?
<davecheney> the one you emailed me
<davecheney> play.g.o
<rog> davecheney: http://play.golang.org/p/AMq8qURfBW
<davecheney> ta
<asachs_> rog: thanks - its for a maas setup so the ec2 stuff will not work so nice :)
<rog> asachs_: np. it should be easy enough to port the concept though, if you want. i doubt you have the rc shell installed anyway :-)
<asachs_> rog: it looks fairly straightforward, will have a go at it once my new MAAS server is up
<asachs_> rog: and thx
<rog> asachs_: tbh, i usually ended up doing ssh and tail -f for a more interactive view.
<rog> fwereade_: i wonder if you might take a look at this and see whether it seems reasonable. i haven't yet done the extra tests in the juju package, but the cmd/juju test suite passes: https://codereview.appspot.com/6428061/
<fwereade_> rog, very shortly, have much state loaded atm
<rog> fwereade_: np
<rog> fwereade_: it's fairly trivial refactoring, nothing substantial
<fwereade_> rog, cool
<rog> just off for lunch, then travelling this afternoon - i'll probably leave here in a couple of hours
<TheMue> rog: enjoy
<fwereade> rog, ok, I am at last on your review, lunch intruded
<fwereade> rog, sorry
<rog> fwereade: it's still WIP BTW
<rog> fwereade: 3 possible books: Crystal Rain (Tobias Buckell), Harmony (Project Itoh) and Wolf Hall (Hilary Mantel). read any of 'em?
<fwereade> rog, none of them :)
<rog> fwereade: last one's historical rather than sf, but great anyway
<fwereade> rog, excellent
<fwereade> rog, what's your general view of bleak sprawling messed-up fantasy?
<fwereade> rog, reviewed btw
<fwereade> rog, and I did the linked list thing with HookQueue and I think it worked out really nicely
<fwereade> rog, so many thanks for that
<fwereade> rog, https://codereview.appspot.com/6422049
<fwereade> rog, nothing is O(1) though
<fwereade> rog, Add is O(len(changed_units))
<TheMue> rog: I have some troubles with http://paste.ubuntu.com/1087489/, line 41 ff.
<fwereade> rog, Next is O(len(members))
<fwereade> rog, but those are both better than before :)
<TheMue> rog: It seems like your outline iterates twice over the change (which is the list of open ports).
<rog> fwereade: last night i couldn't resist seeing what the linked list idea might look like; i was going to throw it away, but... http://paste.ubuntu.com/1101809/
 * rog is trying desperately to pack
<fwereade> rog, I'm afraid I really can't judge it without tests ;)
<rog> fwereade: indeed
<niemeyer> Good mornings!
<fwereade> niemeyer, heyhey
<TheMue> niemeyer: Moin
<niemeyer> Heyas!
<rog> niemeyer: yo! i'm heading to the airport in 25 mins...
<niemeyer> rog: Oh, going to Lisbon already?
<rog> niemeyer: first leg only. laying over for the weekend with some university pals in amsterdam (planned ages and ages ago, and cut short a bit by the sprint).
<niemeyer> rog: Ah, nice
<niemeyer> rog: Have some good fun there
<rog> niemeyer: should be fun. at least one of them i haven't seen for 20 years
<niemeyer> rog: WOah
<rog> niemeyer: i'll try not to arrive too hungover!
<niemeyer> rog: Have you graduated 20 years ago!? 8)
<rog> niemeyer: old bugger, me
<niemeyer> rog: LOL
<rog> niemeyer, fwereade, TheMue: right, i'm off. might be online for a bit from the airport though.
<rog> otherwise see y'all in lisbon!
<rog> i'm looking forward to it!
<fwereade> rog, have fun!
<TheMue> rog: Enjoy, we'll see at least at Lisbon airport
<niemeyer> rog: Indeed!
<niemeyer> rog: Have fun!
<rog> toodle pip
<niemeyer> fwereade: So, the validation stuff is up for review
<fwereade> niemeyer, I'm reading it now, it looks *awesome* so far
<niemeyer> fwereade: Very glad (and relieved :-) to hear it
<fwereade> niemeyer, LGTM, but I think control-bucket should also be immutable
<niemeyer> fwereade: Is it not?
<fwereade> niemeyer, huh, couldn't see it
 * fwereade reads again
<fwereade> niemeyer, it's not mentioned in Validate
<fwereade> niemeyer, did you do something clever?
<niemeyer> fwereade: Isn't that awesome? :-)
<niemeyer> fwereade: Oh, wait
<fwereade> niemeyer, but the structure of it all is perfect
<niemeyer> fwereade: You mean immutable as in cannot be changed from old to cfg
<niemeyer> Okay
<fwereade> niemeyer, yeah, sorry
<niemeyer> fwereade: I read that as in *config.Config is immutable
<niemeyer> fwereade: WE don't ever change, or even reference, control-bucket within Validate
<fwereade> niemeyer, nah, Config seems fine to me :)
<niemeyer> fwereade: Which is what I claimed is awesome
<fwereade> niemeyer, ohhhhh
<fwereade> niemeyer, hmmmmm
<fwereade> niemeyer, can't quite decide whether that's awesome or evil
<niemeyer> fwereade: But, you're right I think.. we shouldn't allow the control-bucket to change, at least for now
<niemeyer> fwereade: Really?
<niemeyer> fwereade: Why would it be evil? The schema is doing the whole validation for us
<fwereade> niemeyer, sorry, I misunderstood what you said
<fwereade> niemeyer, not to worry :)
<niemeyer> fwereade: Cool :-)
<fwereade> niemeyer, a suggestion: we should store relation resolved nodes under the relation (like settings and presence), not under the unit
<fwereade> niemeyer, (and we should do the same for the workflow nodes, if they're not already there, which I don't immediately recall)
<fwereade> niemeyer, oh, hmm, maybe that's problematic
 * fwereade goes off to read code
<niemeyer> fwereade: I do think there's an awesometastic potential for simplifying things in that state machine, but indeed we have to think through the error management
<niemeyer> fwereade: "If it's worth checking for a missing endpoint (as we do in Open) I maintain it's worth checking for here rather than there :)."
<fwereade> niemeyer, ...but it turns out you dropped that check entirely, and I'm fine with that
<niemeyer> fwereade: Part of the beauty is that this is now done in both locations, and in SetConfig
<fwereade> niemeyer, oh? sorry, where?
<niemeyer> fwereade: I haven't dropped it.. this branch actually *increases* validation significantly, perhaps to my surprise also :-)
<niemeyer> fwereade: There's a single code path to grab an ecfg
<niemeyer> fwereade: and it validate
<niemeyer> s
<fwereade> niemeyer, there's also a funny little bit in Open, that you deleted, that checks the EC2Endpoint on the chosen region
<fwereade> niemeyer, I don't see EC2Endpoint anywhere else in the diff
<niemeyer> fwereade: Oh?
<fwereade> niemeyer, but I don't really see why we should need to check it in the first place
<niemeyer> fwereade: Where is that?
<fwereade> niemeyer, https://codereview.appspot.com/6416055/diff/4001/environs/ec2/ec2.go#oldcode117
<niemeyer> fwereade: Oh, sorry
<niemeyer> fwereade: I did misread that code
<niemeyer> fwereade: I've read the error message, and inferred the test
<niemeyer> fwereade: If the region exists in aws.Region, I claim we can use it
<niemeyer> fwereade: We're not checking S3Endpoint, for example
<fwereade> niemeyer, agreed
<niemeyer> fwereade: Thanks, though, I did miss the real test
<fwereade> niemeyer, I don't think that bit ever had an explicit test anyway
<niemeyer> fwereade: Indeed
<niemeyer> fwereade: I'll add the control-bucket check
<fwereade> niemeyer, cool
<fwereade> niemeyer, btw, I added those methods to Relation, in https://codereview.appspot.com/6430055/
<fwereade> niemeyer, and in the CL I suggest that maybe they should be on Unit
<fwereade> niemeyer, but in *fact* I now think a RelationUnit type, which just holds the stuff we repeatedly calculate in Relation.unitScope, is really the right place for them
<fwereade> niemeyer, (*Relation)Unit(*Unit) (*RelationUnit, error)
<fwereade> niemeyer, (*RelationUnit) Join() (*presence.Pinger, error)
<fwereade> niemeyer, (*RelationUnit) Settings() (*ConfigNode, error)
<fwereade> niemeyer, (*RelationUnit) Watch() (*RelationUnitsWatcher, error)
<fwereade> niemeyer, and then...
<fwereade> niemeyer, something a bit like (*RelationUnit) WaitResolved() (<-chan bool, error)
<fwereade> niemeyer, and similar methods for Workflow etc etc
<fwereade> niemeyer, would you be OK with that?
<fwereade> niemeyer, sorry, way up there, s/unitScope/unitInfo/
<fwereade> (what the hell, how is it 5pm? time flies :))
<niemeyer> fwereade: I'm on the fence about it, mostly because I don't see the actual benefit and do see increased API surface, and even usage burden (e.g. you now must check error twice to reach a Relation's Settings).
<niemeyer> fwereade: Do you see simplification on the implementation?  And if so, can you describe a bit of it?
<fwereade> niemeyer, the benefit IMO is that we don't need to pass a Relation and a Unit around everywhere we're dealing with the relation lifecycle
<fwereade> niemeyer, I think it'll also have methods like Workflow and SetResolved and so forth
<fwereade> niemeyer, which would be icky on Unit (because it will want stuff like that itself)
<niemeyer> fwereade: Sounds sensible
<niemeyer> fwereade: +1
<fwereade> niemeyer, and the big issue with the current form is (*Relation)Watch(*Unit), which *really* looks like we're watching a unit
<niemeyer> fwereade: Agreed.. I dislike that too
<niemeyer> fwereade: It's also nice that we can have an *actual* Relation.Watch method, taking no parameters
<niemeyer> fwereade: That doesn't ignore any units
<fwereade> niemeyer, ooh, nice
<niemeyer> fwereade: I'm looking forward to use that in a monitoring tool :-)
<fwereade> niemeyer, ok, I'll wip that and make the change
<niemeyer> fwereade: Thanks a lot
<fwereade> niemeyer, but https://codereview.appspot.com/6422049/ is also ready(?) and I think it's pretty cool actually
<fwereade> niemeyer, it lacks the occasional pathological performance characteristics of the first one
<fwereade> niemeyer, I think :)
<niemeyer> fwereade: Sweet, will have a look
<niemeyer> Actually, I'll review that first thing in the afternoon if that's ok
<fwereade> niemeyer, no rush :)
<fwereade> niemeyer, I have two nicely complementary streams of work at the moment, both UA related, and easy to switch between
<fwereade> niemeyer, when they collide I'll be screaming for reviews, so take it easy while you can ;)
<niemeyer> fwereade: LOL
<niemeyer> fwereade: Sweet
<rog> fwereade: is there any particular reason that StartUnit should be StartUnits? it's trivial for the caller to write a for loop, and it means that the caller can decide what to do if one fails after several have succeeded.
<fwereade> rog, just seems more appropriate for a high-level interface somehow
<fwereade> rog, matter of taste, not really bothered, we'll find out what we really need later
<rog> fwereade: if we were going to potentially optimise AddUnits(n) so that it wasn't just a simple for loop around AddUnit, i think i'd agree. but otherwise i can't see the point.
<rog> fwereade: it's not *that* high level - we assume the caller can program :-)
<fwereade> rog, that's what I kinda think we should be doing :)
<rog> fwereade: yeah, ok, will go for AddUnits then
<rog> fwereade: then if the state API changes to allow a more efficient approach, the juju API can remain the same
<fwereade> rog, yeah, exactly
<fwereade> rog, niemeyer: hey, why does AddUnitSubordinateTo exist?
<rog> fwereade: and one approach to that would be to add AddUnits to the state API and not download the charm for every unit! probably a better approach than caching
<niemeyer> fwereade: Hm?
<fwereade> rog, niemeyer: should this not be handled at AddUnit time, with something like Service.Subordinates()?
<fwereade> rog, niemeyer: anyway, total derail really, but I can't see any benefit to exposing the functionality
<rog> fwereade: i guess it depends what level you see the state API
<rog> fwereade: i see it as potentially allowing things outside the scope of what is allowed in the high level juju view
<fwereade> rog, regardless of level, I'm not sure it should let us do things that aren't internally consistent
<fwereade> rog, without the appropriate relations in place, isn't it straight-up nonsensical to try to do that?
<niemeyer> fwereade: Yeah, could be internalized I guess
<fwereade> rog, niemeyer: one for our Copious Free Time, I think, anyway ;)
<rog> fwereade: i think i agree
<niemeyer> fwereade: It doesn't let us do things that aren't internally consistent already, but if we can simplify it that's a bonus
<niemeyer> fwereade: I've added the control-bucket stuff and re-proposed.. if all looks good will submit once I'm back from lunch
<fwereade> niemeyer, cool, I'll take a look
<fwereade> niemeyer, just unwipped https://codereview.appspot.com/6430055
<fwereade> niemeyer, according to how the tests feel, it was a good move :)
<niemeyer> fwereade: Sweet!
<fwereade> niemeyer, dammit, I'm meant to be going out and having fun this evening ;p
<mramm2> fwereade: no rest for the wicked!
<fwereade> mramm2, haha
<niemeyer> fwereade: You really should :-)
<fwereade> niemeyer, oh, I will, but it's really inconvenient right now :)
<niemeyer> fwereade: Next week we'll have tons of time together to push awesomeness forward :)
<niemeyer> fwereade: LOL
<niemeyer> fwereade: I know how that feels
<niemeyer> fwereade: I was hacking until post midnight yesterday
<niemeyer> fwereade: Hard to stop when in a roll :)
<fwereade> niemeyer, yeah :)
<fwereade> TheMue, ping
<TheMue> fwereade: mom
<fwereade> TheMue, it always disturbs me when you call me mom :)
<TheMue> fwereade: You're to me like a mom ;)
<fwereade> TheMue, aww :)
<fwereade> TheMue, well, all right darling
<fwereade> TheMue, (I'll ask now; but no rush, please just respond when you're ready, I might have to go in a sec, I'll see your response later)
<fwereade> TheMue, I'm getting a slight urge to do violence to the ResolvedMode stuff
<fwereade> TheMue, (1) I don't think we have any need for the 1000/1001 magic numbers
<fwereade> TheMue, (2) I don't think we have any need for anything other than SetResolved, ClearResolved, and WaitResolved() (retry bool, err error)
<fwereade> TheMue, ie I can't see a use case for WatchResolved; am I missing something?
<fwereade> TheMue, (hmm, actually, what I'd *really* like is a one-shot <-chan bool watch, I think it will mesh very nicely with my plans for the hook executor)
<fwereade> TheMue, anyway, let me know your thoughts
<fwereade> niemeyer, I would also appreciate your perspective on the above
<fwereade> I'm thinking that I want, on both Unit and RelationUnit:
<fwereade> SetResolved(retry bool) error
<fwereade> ClearResolved() error
<fwereade> WaitResolved() (retry <-chan bool, err error)
<fwereade> but ofc I'm open to mockery and dismissal :)
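A toy in-memory model of that proposed surface, including the one-shot `<-chan bool` watch mentioned above (hypothetical sketch; the real juju state is backed by ZooKeeper, not an in-process struct):

```go
package main

import (
	"errors"
	"fmt"
)

// resolved is a toy in-memory model of the proposed resolved API.
type resolved struct {
	set bool
	ch  chan bool // buffered, one-shot: the retry flag is sent exactly once
}

func newResolved() *resolved {
	return &resolved{ch: make(chan bool, 1)}
}

// SetResolved marks the unit resolved and wakes the one-shot watch.
func (r *resolved) SetResolved(retry bool) error {
	if r.set {
		return errors.New("already resolved")
	}
	r.set = true
	r.ch <- retry
	return nil
}

// ClearResolved clears the resolved flag.
func (r *resolved) ClearResolved() error {
	r.set = false
	return nil
}

// WaitResolved returns a one-shot channel delivering the retry flag,
// in the spirit of the "<-chan bool watch" mentioned in the chat.
func (r *resolved) WaitResolved() (<-chan bool, error) {
	return r.ch, nil
}

func main() {
	r := newResolved()
	watch, _ := r.WaitResolved()
	if err := r.SetResolved(true); err != nil {
		panic(err)
	}
	fmt.Println("retry:", <-watch)
}
```

Note how this shape makes the 1000/1001 magic numbers unnecessary: the retry semantics travel as a plain bool on the channel.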
<fwereade> anyway, later all
<fwereade> niemeyer, (also, am I right in thinking that we can't separately resolve errors in relations of the same name? is this deliberate, or am I misreading juju/control/resolved.py?)
<TheMue> fwereade: So, back again. Our daughter had problems today.
<mramm2> TheMue: sorry to hear about that
<TheMue> oops
<fwereade> TheMue, no worries, I might be able to pop back and read in 5-10 mins
<fwereade> TheMue, hope all is ok now
<TheMue> mramm2: Thankfully it's not her own problem; it's a harder problem of her best friend's. But now our daughter has trouble with her employer because she was late for her afternoon shift
<TheMue> fwereade: Yes, it's ok. Thx.
<TheMue> fwereade: So you're looking if the ResolvedMode is needed? Or what is your question?
<TheMue> fwereade: I've translated it from the python code, but if we don't need it anymore we should remove it.
<fwereade> TheMue, sorry, I'm really just asking if you're aware of any uses for it that I may have missed
<fwereade> TheMue, that can't be covered by what I proposed
<fwereade> TheMue, or if there's anything otherwise obviously ludicrous about what I suggest
<TheMue> fwereade: Right now I don't know. I'll take a look.
<TheMue> fwereade: So far it's not used, yes. I only ported it without looking who later will use it.
<TheMue> fwereade: In Py it's used by the agent.
<niemeyer> fwereade: Hah.. so I screwed up the copy & paste..
<niemeyer> That's what I get for not self-reviewing
<TheMue> So, next proposal for the firewaller is in and I'm out. If we don't see tomorrow evening here we'll see on Sunday in Lisbon.
<TheMue> Have a nice evening.
<niemeyer> TheMue: Thanks, have a great trip!
<TheMue> niemeyer: Thx, yes, and you have a great trip too. Will be a long flight.
<niemeyer> TheMue: Thanks, this one is going to be surprisingly good, actually
<niemeyer> TheMue: A single leg.. first time ever
<TheMue> niemeyer: Yeah, so it's a good location. Only Dave will always have travel problems.
<TheMue> niemeyer: So we should visit him sometime.
<TheMue> :D
<niemeyer> TheMue: True :-)
<TheMue> niemeyer: Next UDS is very near. I'm already checking whether the train might be better.
<niemeyer> TheMue: Nice, that should be a breeze indeed
<TheMue> niemeyer: Yes, but that's still some months away. Let's face our sprint next week.
<niemeyer> TheMue: Yeah, and I have a nightmarish trip before that, in early August
<niemeyer> 4 legs each way.. :-(
<TheMue> niemeyer: Ouch
#juju-dev 2013-07-15
 * thumper does a little scream
<thumper> wallyworld_: ping
<wallyworld_> hi
<thumper> wallyworld_: will the world explode if I return nil for hardware characteristics in start instance?
<wallyworld_> no, but it would be nice to have
<wallyworld_> i think nil is ok, but not sure if an empty struct is preferred
<wallyworld_> let me check
<thumper> kk
<wallyworld_> nil is ok
<thumper> wallyworld_: ok, ta
<thumper> hmm...
<thumper> stabby
<thumper> stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby stabby v
 * bigjools sends calm thoughts across The Ditch
 * thumper scratches his head
<thumper> jam1: btw, your lxc upgrade check breaks all containers
<thumper> jam1: we need to be smarter about them
<thumper> jam1: I'm still working out how
<thumper> more stabbies
<thumper> wtf...
<thumper> wallyworld_: we need public address and private address on state.Machine ASAP
<thumper> this is driving me crazy
 * thumper wonders wtf is going on here
 * thumper thinks
<wallyworld_> thumper: sorry, i didn't see your message before. i'll talk to martin tonight
<thumper> wallyworld_: sure, I felt I was all alone again
<wallyworld_> sometimes i don't notice the irc window blink
<thumper> :)
<wallyworld_> maybe i need two monitors
<thumper> I'm this |..| close to getting the local provider working
<wallyworld_> awsome
<thumper> I found a major(ish) problem though
<thumper> which I'm simmering on
<wallyworld_> which is?
<thumper> I need the storage provider to be running on machine-0 all the time
<thumper> and to refactor how the environment gets created
<thumper> I want to talk to fwereade about it when he starts
<thumper> to get his thoughts on the best way to do this
<wallyworld_> by all the time, you mean it is kicked off when juju bootstrap is run?
<thumper> yes, and stays running
<thumper> so we'd have: mongod, machine-agent, storage provider
<thumper> to deploy a charm
<thumper> the uniter accesses the storage of the environment
<thumper> which isn't listening :)
<thumper> or at least, isn't listening in the right place at the right time for the uniter
<wallyworld_> yay
<wallyworld_> i guess it was all different in py juju
<thumper> probably
<rogpeppe> morning all!
<TheMue> rogpeppe: heya, how have your holidays been?
<jam1> hi rogpeppe, good to have you back
<rogpeppe> TheMue: great thanks!
<rogpeppe> TheMue: we had excellent weather in Norway
<rogpeppe> jam: thanks
<rogpeppe> i've barely glanced at what's been going on while i've been away. any particularly significant mail threads/changes of direction  i should be aware of?
<TheMue> rogpeppe: sounds good. and Norway itself?
<rogpeppe> TheMue: really good. expensive though!
<rogpeppe> TheMue: particularly alcohol
<TheMue> rogpeppe: oh, didn't know. but I remember they have high taxes there.
<rogpeppe> TheMue: yeah. beer is about 8-12 euros / 50cl
<TheMue> rogpeppe: if it's only alcohol it doesn't hurt you. you surely never drink. :D
<rogpeppe> TheMue: surely not
<rogpeppe> TheMue: :-)
<TheMue> rogpeppe: ouch, that's really expensive
<rogpeppe> TheMue: we bought our full quota from the duty free on the way in
<rogpeppe> TheMue: 6 bottles of wine and 8 bottles of beer :-)
<TheMue> rogpeppe: clever
<rogpeppe> TheMue: mandatory :-)
<TheMue> rogpeppe: hehe, indeed
<TheMue> rogpeppe: regarding the project everything went fine. the first week was a bit slower, as you, Dimiter, and (for two days) I were off
<TheMue> rogpeppe: but last week with more speed again
<TheMue> rogpeppe: Tim and I found bad behavior in the http package: a request takes a Reader, but type-asserts it to a ReadCloser and calls Close() later. this is used in the local storage, so the Seek(0, 0) when writing the fake tools fails.
<rogpeppe> TheMue: interesting
<rogpeppe> TheMue: i can't see that that behaviour is documented
<rogpeppe> TheMue: it's trivial to work around though
<TheMue> rogpeppe: Tim is working on it. he stumbled over this behavior.
<rogpeppe> TheMue: all you need to do is pass in ioutil.NopCloser(r) to NewRequest
<rogpeppe> TheMue: presumably this is only actually a problem when the tools writing fails and is retried?
 * rogpeppe goes to get some sun cream
<TheMue> rogpeppe: the writing of the first tool file works, but the second file fails because the file is closed.
 * TheMue is afk for a moment, bringing his daughter to the railway station (first single longer ride)
<dimitern> rogpeppe: hey, welcome back!
<rogpeppe> dimitern: yo!
<dimitern> rogpeppe: good holiday?
<rogpeppe> dimitern: great thanks
<rogpeppe> dimitern: two holidays really :-)
<dimitern> roh
<dimitern> rogpeppe: awesome :)
<rogpeppe> dimitern: both in norway, but the first week we rented a cabin in the south; the second we spent in oslo
<dimitern> rogpeppe: cool and refreshing weather i presume?
<rogpeppe> dimitern: pretty warm, particularly the second week. temp around 22-25
<rogpeppe> dimitern: first week cooler, but we were on the sea
<dimitern> rogpeppe: oh, compared to 32-34 here is nice
<rogpeppe> dimitern: we were v lucky
<rogpeppe> dimitern: yeah, perfect
<rogpeppe> dimitern: and come back to find the garden has gone mental!
<rogpeppe> dimitern: (in a good way)
<dimitern> rogpeppe: :)
<jam> rogpeppe: I think the big focus last week was that upgrading 1.10 => 1.11.2 showed lots of things broken. I think William and I have managed to track them down, I have a couple patches up for review
<jam> We're looking to get a 1.11.3 out this week, with the hopes of it being considered a stable 1.12 with an upgrade path from 1.10
<rogpeppe> jam: i was looking at one of your patches and trying to get my head around the underlying issues
<jam> rogpeppe: If you need to ask questions, I can give more context
<rogpeppe> jam: in particular i was trying to work out what permutations of versions/upgrading would cause the issues to be seen
<rogpeppe> jam: let me go back to the CL and have another look.
<fwereade> rogpeppe, heyhey, welcome back!
<rogpeppe> fwereade: yo!
<jam> rogpeppe: bootstrap with 1.10 client starting 1.10 server, upgrade using 1.11.2 (or trunk --upload-tools)
<jam> stuff breaks
<jam> fwereade: o/
<fwereade> jam, heyhey
<rogpeppe> jam: ah, yes, i saw that issue, but didn't get to the bottom of it
<jam> fwereade:  so the good and bad bits, the good is that I have: https://code.launchpad.net/~jameinel/juju-core/api-set-creds-1199915/+merge/174620 up for review wrt bug 1199915
<jam> the bad is that the existing bug actually means only machine-0 sort-of-works after upgrade
<jam> because all other workers wait for the main thread to get the list of jobs from the api
<jam> but the api can't be connected to because we didn't set the agent's password
<jam> bug #1199915
<jam> where did the bot go?
<jam> _mup_: bug #1199915
<rogpeppe> jam: launchpad says it's in the middle of an update
<fwereade> jam, heh, LP doesn't want to talk to me
<rogpeppe> jam: i was just trying to look at that bug
<rogpeppe> jam: how much of these changes are temporary cruft that should go away eventually?
<jam> rogpeppe: we generally need to come up with an answer for how we handle changing "stuff" when upgrading. Some of this will last as long as we want to be able to upgrade from 1.10
<jam> which may be "short" depends on how long we support 1.10
<rogpeppe> jam: yes, i wondered that.
<rogpeppe> jam: i think we should make sure that we have comments in the code that indicates when the workaround might be removed
<jam> rogpeppe: but generally, I don't think we've done much engineering around "binary 2.X needs these packages and this configuration that 2.X-1 didn't need, how do we configure it after upgrade" ?
<rogpeppe> jam: yeah, that's definitely something we need to think about
<rogpeppe> jam: packages are possibly easier than config
<jam> rogpeppe: so *right* now agents have both a DB connection and an API one, so they can poke the DB connection directly to get their creds updated inside the db. At some point when we cut off direct DB access, upgrading from 1.10 will be infeasible (I think)
<jam> rogpeppe: lp seems to be back up
<jam> fwereade: ^^
<rogpeppe> jam: we should perhaps make it so that the upgraded version sets the API passwords appropriately so that people can get there with an intermediate upgrade
<jam> rogpeppe: which is what my patch does
<rogpeppe> jam: ah, cool
<jam> so you could do 1.10 => 1.12 => version-that-doesn't-have-direct-db-access
<fwereade> jam, I'm fine saying that we only cut off direct db access for 2.0, and we only allow upgrades to 2.0 from 1.x versions that have done the requisite juggling
<fwereade> jam, what you said basically :)
<jam> oddly enough on my system, "make simplify" is faster than "go fmt ./..." even though the former is passing the '-s' flag. I wonder if 'go fmt''s search through the code is slower than 'find *.go'
<jam> (1.7s vs 2.1s)
<rogpeppe> jam: i think that might be because of an outstanding issue with the go command's package matching
<rogpeppe> jam: try find . -name '*.go' | xargs gofmt -w
<rogpeppe> jam: i *thought* the issue had been fixed tho
<jam> 1.56s
<rogpeppe> jam: the problem was that the go command scans all directories in GOPATH anyway, even if there's a fixed prefix
<fwereade> frankban, dimitern: responded to https://codereview.appspot.com/11003044/
<fwereade> frankban, dimitern: LGTM, but I may be missing some nuance of dimitern's thoughts
<dimitern> fwereade, frankban: will take a look again
<frankban> thanks fwereade and dimitern!
<fwereade> jtv, jam: responded to https://codereview.appspot.com/11234043/
<jam> fwereade: sure, you review all the easy ones first :)
<fwereade> jam, apart from wallyworld's pipeline which I'm saving for after the meeting, I'm going purely by age, I think
<jam> fwereade, rogpeppe, dimitern, mgz, wallyworld: standup in 3 min: https://plus.google.com/hangouts/_/f497381ca4d154890227b3b35a85a985b894b471
<fwereade> jam, I'm on your scary one now :)
<jam> fwereade: doesn't mean you didn't get to easy ones first
<dimitern> frankban, fwereade: LGTM
<rogpeppe> jam: gimme 3 minutes while i sort a couple of things out
<jam> rogpeppe: np
<jam> fwereade: +1 on the separate provider section, I also worry about how we get a provisioner running in each provider (or does the provisioner just run
<rogpeppe> jam: back
<jam> at the state server node, and just know how to set stuff on each provider.)
<jam> rogpeppe: starting
<fwereade> jam, I am very consciously not thinking about how we fix the code for this, it's still percolating
<fwereade> jam, I think it'll be one provisioner per provider though
<frankban> dimitern: thanks
<dimitern> frankban: yw
<fwereade> jam, I think I just understood what you were saying better
<fwereade> jam, we'll want the manager node(s) running all the provider provisioners because I Do Not Want to spread manager nodes across clouds until someone can demonstrate that it really is sane & safe to do so
<jam> fwereade: right
<jam> having to have an instance running on each cloud you want to provision on isn't great
<jam> (an instance running the jujud provisioner code)
<fwereade> jam, can't get away with it forever, though, keeping all manager nodes in one cloud renders the whole env susceptible to failures of that cloud
<fwereade> jam, not one to fix this year, though
 * fwereade lunch
<rogpeppe> TheMue: just finished meeting with jam. i've got lunch now. i'll ping you after that, if that's ok.
<TheMue> rogpeppe: it's fine with me, yes
<rogpeppe> TheMue: https://plus.google.com/hangouts/_/a969ef2b58bd85872816e26fa3574a46ef3cb428
<fwereade> jam, https://codereview.appspot.com/11137044/ reviewed
<jam> fwereade: so the SetPassword before setting the Config thing. We are pulling the password out of the existing conf, so we can do it again
<jam> If we write-to-disk first, then we can't tell we need to SetPassword
<fwereade> if anyone's wondering where dimitern is, he's had a power cut for a while now
<mgz> how frustrating
<mgz> I take it he's got phone and computer, just no internet?
<fwereade> I presume he has at least some power in his battery, and yeah, he texted me
<hazmat> what's the recommended way of getting mongodb for dev?
<hazmat> and testing
<hazmat> per https://bugs.launchpad.net/juju-core/+bug/1175493 the package and ppa versions don't work
<_mup_> Bug #1175493: Tests fail using the mongo package from raring, or ppa <juju-core:Triaged> <https://launchpad.net/bugs/1175493>
<hazmat> nm.. the link is in the readme
<Daviey> mgz: Is this expected? http://pb.daviey.com/wIxM/
<mgz> Daviey: yup
<mgz> to use --upload-tools implies you're compiling locally and want to use your own build stuff
<mgz> arguably we want a nicer error/to hide that flag for general users
<mgz> see bug 1135564
<_mup_> Bug #1135564: juju bootstrap --upload-tools fails, missing dependency <juju-core:Triaged> <https://launchpad.net/bugs/1135564>
<Daviey> mgz: http://pb.daviey.com/wP3I/
<Daviey> right now, i feel fubared
<mgz> Daviey: you need to follow lp:juju-core README if you actually want to do that
<mgz> involves getting all the source and building it
<mgz> probably what you really want is not that flag, but to just specify our latest release binaries?
<mgz> Daviey: what exactly are you trying to do?
<hazmat> Daviey, unless you're testing a branch/trunk.. this is probably what you want  $ juju sync-tools
<mgz> fwereade: can you cast an eye over codereview.appspot.com/11284044 to see if you have any issues with the suggested model
<fwereade> mgz, sorry, just going out; I've opened a tab with it and will give it proper attention later today
<mgz> ta
<sidnei> mramm: bug #1201503
<_mup_> Bug #1201503: Add disk constraint <juju-core:New> <https://launchpad.net/bugs/1201503>
<sidnei> which i believe elmo was poking  you about
<rogpeppe> fwereade: ping
<rogpeppe> fwereade: oops sorry, just saw you went out
<rogpeppe> right, that's me for the day
<rogpeppe> see y'all tomorrow
<hazmat> are there any known issues that cause the test suite to just hang and spin on cpu
<hazmat> i've tried with mongo 2.2.4 from raring, and from the s3 dist of mongo 2.2.0
<thumper> morning
<thumper> mramm: did you want to chat at some stage?
<mramm> thumper: sure
<thumper> mramm: like... when?
<mramm> I'm free now
<mramm> and in about 2 hours
<thumper> now is good
<sidnei> hey thumper, i was looking at getting my feet wet with some juju-core and implement a fix for bug #1201503 but wanted some confirmation that it does make sense in the first place
<_mup_> Bug #1201503: Add disk constraint <juju-core:New> <https://launchpad.net/bugs/1201503>
<thumper> sidnei: hmm...
<thumper> sidnei: I'm not entirely sure, do we have access to the amount of disk?
<thumper> sidnei: also
<thumper> sidnei: there are some future plans for dealing with mounting "special" block devices on cloud images
<thumper> as this is provided by some cloud providers
<thumper> but out of scope for right now
<sidnei> thumper: nova flavor-list shows disk and ephemeral disk: http://paste.ubuntu.com/5878931/ in prodstack there's some custom flavors with eg 2 CPUs and 10, 20, 50, 100 of disk, as if multiple variations of m1.small with only different disk size
<sidnei> s/m1.small/m1.medium
<thumper> sidnei: well, if you think it makes sense, and likely that someone wants to do that, then sure, maybe
<thumper> sidnei: you may want to run it by fwereade
<sidnei> thumper: when i said it wasn't supported i got a 'dude, seriously' from elmo, so yeah. :)
<thumper> heh
<thumper> I'll take that as a "it will be used" type thing
<thumper> you should write that in the bug :)
<sidnei> it's somewhat of a workaround to https://bugs.launchpad.net/juju-core/+bug/1183831 since if instance-type was supported we'd just use that
<_mup_> Bug #1183831: ec2 constraints missing <juju-core:Triaged> <https://launchpad.net/bugs/1183831>
<sidnei> i guess workaround is a bad way to put it
<fwereade> thumper, sidnei: I've been having some very interesting conversations that may be able to let us do an end-run around the problems with provider-specific constraints, and it seems that sabdfl is very keen on the direction; so while it remains a little up in the air right now, it is more likely than not that we will see movement on that front soon
<thumper> fwereade: yeah, just reading your doc
<fwereade> thumper, sidnei: a disk constraint is not a bad idea in itself necessarily but we'd need to be quite clear on exactly what it is and is not
<fwereade> thumper, sidnei: and I worry about possible interactions with future plans for storage
<thumper> yeah, the future storage thing was something I was a bit worried about too
<sidnei> yeah, me too. it is somewhat of a blocker though for moving to juju-core, since we don't have instance-type as a forward path either.
<fwereade> thumper, sidnei: so my inclination is to discourage and take this as an even-stronger vote for instance-type/flavor constraints
<fwereade> sidnei, but if you had them, it would be moot, or at least very low priority?
<sidnei> either one of them would be fine, instance-type or disk constraint
<sidnei> instance-type would be the less friction one since we wouldn't need to change existing scripts
<fwereade> sidnei, all the better then, I'd rather have that too
<fwereade> sidnei, can we sync up again on weds evening, at which point relevant plans and schedule discussions will have progressed another couple of steps?
<sidnei> sure
<sidnei> since i haven't seen the proposal, maybe im talking bullshit but would it make sense to prefix provider-specific constraints with a provider tag? as in ec2:instance-type or openstack:flavor and so on? and if you try to use those in a provider that doesn't support them they would be ignored?
<fwereade> sidnei, openstack:flavor is IMO a big no, that was the main problem -- but canonistack:flavor seems just fine
<fwereade> sidnei, openstack:flavor looks way more portable than it actually is
<sidnei> because one can define custom flavors?
<fwereade> sidnei, but prodstack:flavor, or hp:flavor, or whatever, *do* make sense and are also *clearly* not globally portable
<fwereade> sidnei, yeah, the vocabularies may have enough overlap to be tempting but trying too apply them too broadly will hurt like hell ;)
<fwereade> sidnei, and hopefully in practice we'll be able to infer provider anyway, so it should actually just be "instance-type" or "flavor" in many many cases
<sidnei> im thinking in terms of 'openstack:flavor' is a filter that gets passed through to the openstack provider without vocabulary validation; there's no 'hp' or 'prodstack' provider, and neither 'hp' or 'prodstack' are values you specify in the environments config
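The prefixing idea sidnei is floating could be parsed as simply as this sketch; `splitConstraint` is a hypothetical helper, not part of the proposal fwereade refers to:

```go
package main

import (
	"fmt"
	"strings"
)

// splitConstraint separates an optional environment/provider prefix from
// a constraint name, e.g. "canonistack:flavor" -> ("canonistack", "flavor").
// An unprefixed constraint like "instance-type" gets an empty prefix, so
// it could be inferred from the current provider, as discussed above.
func splitConstraint(s string) (prefix, name string) {
	if i := strings.Index(s, ":"); i >= 0 {
		return s[:i], s[i+1:]
	}
	return "", s
}

func main() {
	p, n := splitConstraint("canonistack:flavor")
	fmt.Println(p, n)
	p, n = splitConstraint("instance-type")
	fmt.Printf("%q %s\n", p, n)
}
```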
<arosales> quick 'juju init' question regarding the boiler template
<davecheney> arosales: shoot
<arosales> the boiler template has the following for hpcloud:
<arosales>     # Usually set via the env variable AWS_SECRET_ACCESS_KEY, but can be specified here
<arosales> should that the HP_SECRET_ACCESS_KEY ?
 * arosales not sure if that was intentional or not, thus thought I would ping here.
<davecheney> arosales: probably shouldn't be anything, I don't think there is such a var for HP
<davecheney> it's probably copy pasta
<arosales> I thought you guys did land keys for hp .. .
 * arosales checks env.yaml
<davecheney> arosales: yes, but they wouldn't be called HP_*
<davecheney> we don't have a HP provider, we have an openstack provider
<arosales> ah, ok.  Well that was just a comment in the boiler template
<arosales> the actual yaml keys were
<arosales> secret-key: <secret>
<arosales> access-key: <secret>
<arosales> basically the stanza http://pastebin.ubuntu.com/5879146/
<davecheney> hmm, i see what you mean
<davecheney> neither is correct
<arosales> marteen was trying to make it easier to get started with juju and was trying to script editing the env file and I found the keys for hp and aws to be the same.
<arosales> davecheney, ok sounds like I should file a bug for follow up.
<arosales> davecheney, https://bugs.launchpad.net/juju-core/+bug/1201628
<_mup_> Bug #1201628: HP Cloud Boiler-plate (juju init) template has AWS info <juju-core:New> <https://launchpad.net/bugs/1201628>
<davecheney> arosales: ta
<wallyworld> arosales: those aws keys do work for openstack
<wallyworld> although the openstack ones work too
<arosales> wallyworld, I am guessing the key auth does work for openstack (ie hpcloud)
 * arosales was just wondering about the AWS comment.
<wallyworld> key auth does work now, or is supposed to
<wallyworld> i haven't tested it personally
<arosales> I think I have seen it work
<arosales> bug is out there for triage accordingly
 * arosales grabs some dinner
<arosales> wallyworld, davecheney thanks for the help.
<wallyworld> arosales: np. the AWS key is supported for ease of migration between ec2 and openstack
<wallyworld> it looks for openstack specific env vars first
 * thumper beats the machine over the head with the local provider stick
<wallyworld> but uses the aws ones if there
<thumper> hmm...
#juju-dev 2013-07-16
<wallyworld> thumper: do you know how to kick tarmac in the balls to reset it?
<thumper> wallyworld: sorry, no
<wallyworld> ok. ta. ffs it seems i can never just land a branch without hassle
<thumper> poos
<thumper> I want the opposite of updateSecrets
<thumper> I want to get some settings from the EnvironConfig
<thumper> and put it into the local config
 * thumper thinks
 * thumper resolves to just use default hard coded port numbers
<thumper> and allow people to override them in the config
<thumper> too hard to be dynamic here
 * thumper does a little dance
<thumper> wallyworld: http://paste.ubuntu.com/5879343/
<wallyworld> thumper: \o/
<wallyworld> as Borat would say, niiiiiiiice
<thumper> wallyworld: now... to break it up, land it and have some tests...
<thumper> it has been super hacky so far
<thumper> davecheney: local provider is working
<wallyworld> it runs, who needs tests :-P
<thumper> wallyworld: it is actually pretty hard to test some of this
<thumper> as it requires root
<wallyworld> yeah :-(
<thumper> so test around the edges where you can
<thumper> and have some form of live tests
 * thumper goes to make lunch for the minions
<thumper> wallyworld: didn't you write a function somewhere to get the container type from a machine id?
<wallyworld> sorry, what's the context?
<wallyworld> i've worked on 5 branches today
<thumper> s'ok, found it
<thumper> ContainerTypeFromId
<wallyworld> cool
<wallyworld> thumper: i'm dumb. i read your question as "why didn't you write...."  and was thinking that i thought i had. clearly my brain is dying
<thumper> heh
<thumper> you had already
 * thumper enfixorates the ensure lxc bit
 * thumper starts teasing apart the threads of the local provider work to land bits independently
<thumper> hmm  7 pipes unlanded already
<wallyworld> is that all? :-P
<wallyworld> go on, try for 10
<thumper> wallyworld: it probably will be by the time I've done the teasing
<thumper> wallyworld: but IT WORKS!!!!
<thumper> so far
<wallyworld> thumper: so you sorted out the addressing for local?
 * thumper wonders how to stress test it
<thumper> what do you mean?
<wallyworld> i thought you had to tell the containers what ip address they could use
<wallyworld> or make the containers look at methods not yet existing on state.Machine
<thumper> wallyworld: no, not for the local provider
<thumper> wallyworld: the default lxc settings work fine
<thumper> wallyworld: the problem is having the user, on the outside of the containers getting the ip addresses of the containers
<wallyworld> ah ok. so they all just use the localhost address?
<thumper> wallyworld: instead of fucking around with lxc to get it from the outside
<thumper> wallyworld: I want to fix it in state
<thumper> wallyworld: no, they use a 10.0.3.0/24 address
<thumper> wallyworld: which is routed through 10.0.3.1 bridge
<wallyworld> ok
<thumper> hence my email that I sent a few minutes ago
 * wallyworld hits refresh
 * thumper waits while lbox does its thing
<thumper> this thing that lp does too
<thumper>  bit 1: https://codereview.appspot.com/11321043
<thumper> bit 2: https://codereview.appspot.com/11319044
<thumper> bit 3: https://codereview.appspot.com/11325043
<jam> thumper: but lbox does it synchronously and less accurately. (though you do know when it has actually finished, vs async in lp)
<thumper> hi jam
<thumper> when are you off for holidays?
<jam> thumper: my flight is tomorrow evening.
<jam> so I'll be working some tomorrow, but maybe not a full day
 * thumper nods
 * thumper goes to make coffee and take a break
<thumper> will submit bit 4 RSN
<wallyworld> jam: hi, tarmac is fooked and i don't know how to kick it in the guts
<jam> wallyworld: can you point me to context?
<wallyworld> jam: i approved a mp and nothing happened. the tarmac log says there was a bzr error merging (from memory, can't recall exactly), and now it appears to not even be trying to look at any new approvals to process
<wallyworld> i at least wanted just to kick start it again
<jam> wallyworld: it is running, but it just keeps failing on your proposal: http://paste.ubuntu.com/5879636/
<wallyworld> jam: that timestamp is hours old
<jam> wallyworld: UTC?
<wallyworld> it only failed once and then stopped processing
<jam> 4 hours
<wallyworld> i think it was about 4 hours ago from memory
<wallyworld> jam: and it was the 2nd try because the first run failed some tests
<jam> wallyworld: yeah, date says 4 hrs ago
<jam> I don't see anything running, which is strange, cron should still be firing off tarmac every minute
<wallyworld> yes it is weird
<jam> wallyworld: we use "flock ...." as a way to avoid having 2 tarmac processes running concurrently (you can use crontab -l to see it). I did a lot of searching, but I didn't see any processes running flock or python.
<jam> So I just deleted the lock file
<jam> and it is running now.
<jam> It is going for ~jtv's code
<wallyworld> hmm. ok. i didn't realise that's what we did
<wallyworld> makes sense
<wallyworld> so the error handling needs improving
<wallyworld> to always release the lock
<jam> wallyworld: it is the "flock" process outside of tarmac
<jam> I didn't think we could get that wrong
<jam> as it starts a process, and when that process dies it unlocks
<jam> wallyworld: http://paste.ubuntu.com/5879648/
<jam> from man flock
<jam> makes me think it might have left a mongo or something running.
<wallyworld> yeah, wouldn't be surprised
<jam> wallyworld: --wait 3600 maybe? (so we auto-drop flock after 1 hour) not sure on that one.
<wallyworld> an hour sounds reasonable, maybe even less
<wallyworld> hopefully this won't happen too often
<jam> wallyworld: so the test suite should only take 15min or so, but we might run multiple
<jam> and I don't think flock actually kills the subprocess.
<jam> it would just stop locking
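The setup being debugged — cron firing tarmac under `flock` so runs serialise, and a stale lock file wedging everything — rests on flock semantics that can be demonstrated with the syscall directly (Linux-style sketch; the lock path is hypothetical):

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// tryLock takes an exclusive, non-blocking flock on path, mirroring what
// the `flock` wrapper in the cron entry does: a second run fails fast
// instead of queueing forever behind a wedged one.
func tryLock(path string) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0644)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		f.Close()
		return nil, fmt.Errorf("lock held elsewhere: %v", err)
	}
	return f, nil
}

func main() {
	// flock locks belong to the open file description, so two separate
	// opens of the same file contend even within one process.
	first, err := tryLock("/tmp/tarmac-demo.lock")
	fmt.Println("first acquired:", err == nil)
	second, err2 := tryLock("/tmp/tarmac-demo.lock")
	fmt.Println("second acquired:", err2 == nil)
	if second != nil {
		second.Close()
	}
	first.Close() // releasing (or process exit) is what unwedges the cron job
}
```

This also illustrates jam's point: the lock drops only when the holder's descriptor goes away, so a leftover child (e.g. a stray mongo) keeping the description open would keep the lock held.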
<jam> wallyworld: any ideas what would be non-ascii with what you submitted?
<jam> ISTR there were also non-ascii patches because gcc was outputting Unicode sequences.
<wallyworld> jam: no, that's just the thing. it worked first time, some tests failed, and i just re-approved
<jam> https://bugs.launchpad.net/tarmac/+bug/750930
<_mup_> Bug #750930: breaks on non-ascii characters in verify_command output on failure <Tarmac:In Progress by jameinel> <https://launchpad.net/bugs/750930>
<jam> that is my patch, which appears to still not have landed...
<wallyworld> ah, i did push up a small change
<wallyworld> i'll check the logs
<wallyworld> jam: it was a one-line fix to add "jc." in front of IsTrue because a trunk change while the branch was in review broke my test
<wallyworld> so i don't see any non-ascii in there
<jam> wallyworld: I have a feeling it could be a gcc sort of issue (gcc uses fancy quotes if your terminal is marked UTF-8)
<jam> so maybe something with gwacl or something
<jam> anyway, I have the patch, not sure what the issue is, unfortunately.
<jam> I'm putting together a branch for us
<jam> because lp:tarmac is owned by rockstar only
<jam> I already have a couple local-only patches, and I don't want to make that worse.
<jam> the local patches are just logging changes
<wallyworld> jam: ah, there was a gofmt change in code i didn't touch. the diff in loggerhead shows "nothing" changed, so it could have been a tab
<wallyworld> or something
<jam> wallyworld: well a tab is still ascii
<jam> it sounds like a build that failed
<wallyworld> true
<thumper> bit 4: https://codereview.appspot.com/11326043
<jam> wallyworld: note that the code that failed indicated it was trying to mark the proposal as failed
<wallyworld> hmm. ok. i'll rerun the tests
<wallyworld> jam: just got an email - it ran and merged ok that time
<wallyworld> i didn't change anything
<thumper> bit 5: https://codereview.appspot.com/11327043
<thumper> bit 6:  https://codereview.appspot.com/11327044
<thumper> bit 7: https://codereview.appspot.com/11330043
<wallyworld> jam: tarmac hates me. this time it appears to be stuck on my branch because i did the prereq myself. so i guess i have to repropose against trunk and adjust all the downstream branches?
<thumper> bit 8: https://codereview.appspot.com/11333043
<thumper> and that last one enables the local provider
 * thumper is done for the day
<thumper> plz review nicely :)
<thumper> laters...
<davecheney> is anyone able to bootstrap on ec2 ?
<davecheney> i'm seeing bootstrap nodes stillborn
<wallyworld> davecheney: i did earlier today
<davecheney> just sits there waiting for the mgo server to come up
<davecheney> can't ssh to the instance either
<wallyworld> hmmm. not sure sorry :-(
<jam> wallyworld: you can just propose the prereq and manually mark it merged.
<jam> davecheney: I was bootstrapping a lot yesterday, but I have not tried it today.
<davecheney> jam: wallyworld i cannot bootstrap in ap-southeast-2
<davecheney> ap-southeast-1 works
<davecheney> -2 results in an instance that is running but does not respond at all on the network
<davecheney> no ssh no nothing
<jam> davecheney: can you get the boot information from ec2?
<davecheney> oh, and it looks like destroy-environment doesn't work either
<jam> davecheney: well if you can't read the s3 bucket, you can't destroy stuff, I think.
<jam> It *sounds* like an ec2 side issue, but I could certainly be wrong.
<davecheney> jam: get system log looks fine
<davecheney> jam: if it was a bucket issue, --upload-tools would have failed
<davecheney> they are the same bucket
<jam> true enough
<davecheney> hang on, my ap-southeast-1 is set to raring
<davecheney> ... i wonder if I set it to precise
<davecheney> ...
<davecheney> sigh - no
<davecheney> it's just ap-southeast-2 is busted today
<rogpeppe> davecheney: hiya
<rogpeppe> mornin' all
<davecheney> rogpeppe: howdy
<rogpeppe> davecheney: just wondering: if you needed to reinstall your laptop ubuntu, what would you use for backup and restore? just tar?
<rogpeppe> davecheney: my laptop is in a bad state since upgrading to 13.04 and it's possible that reinstalling might fix things
<davecheney> rogpeppe: i just tar up ~
<davecheney> but really you want to avoid all the dot files shit in ~
<rogpeppe> davecheney: i'm just concerned that i'll lose all the stuff outside $HOME that i've accumulated over the years.
<davecheney> rogpeppe: ahh, i don't ever step outside $HOME for that reason
<rogpeppe> davecheney: i'm thinking mostly of apt-get stuff
<rogpeppe> davecheney: but i guess i can just stumble along until i find something missing, then apt-get it
<davecheney> dpkg -l | grep ^ii
<rogpeppe> davecheney: ah, cool. what's the significance of "ii" ?
<davecheney> ii == installed
<davecheney> lots of other stuff in there as turds
<rogpeppe> davecheney: and do you know of any way i can ask for only those packages which aren't depended on by others
<rogpeppe> ?
<davecheney> no, but jam or tim will know
<rogpeppe> davecheney: something which could produce output suitable for tsort might work
<davecheney> google says debtree
<TheMue> morning
<jam> I know you can ask for rdepends I believe, but I don't know a specific way to say "give me the list of packages explicitly installed ignoring their dependencies". I know it is possible, because when you "apt-get remove" it can tell you "these packages are no longer required"
<jam> I just don't know it.
<rogpeppe> hmm, the solution here looks plausible (though i'm sure i haven't explicitly installed *all* the 2080 packages i get when applying the first answer)
<rogpeppe> http://unix.stackexchange.com/questions/3595/ubuntu-list-explicitly-installed-packages/3624#3624
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: TheMue | Bugs: 6 Critical, 79 High - https://bugs.launchpad.net/juju-core/
<TheMue> rogpeppe: what problem do you have? is your system in trouble?
<rogpeppe> TheMue: yeah, various pieces of the system are broken in weird ways
<rogpeppe> TheMue: it's possible that it's a hardware issue
<rogpeppe> TheMue: but i have to try a fresh install first
<TheMue> rogpeppe: iiirgks, that really doesn't sound good
<TheMue> rogpeppe: i had to cleanup my package list after 13.04 upgrade
<jamespage> davecheney, trying your no-strip suggestion now
<pavel> guys, is juju-core 1.11.2 behavior stable today?
<mgz> pavel:  what do you mean exactly?
<pavel> I mean that I have weird errors all day
<mgz> pastebin?
<pavel> if there's no common issue, then it's on my side
<mgz> yeah, as far as I know we've not broken the published tools or anything of late
<pavel> ok, thanks
<jam> wallyworld: when you're back, I figured out how flock works. It opens a file flocks it, then execs into the child process which means that all spawned processes hold open that flock until they all exit. And there was a 'mongo' running that meant the lock was permanently held.
<mgz> hah
<mgz> o_cloexec plz
<jtv> Anybody up for a second review?  Nothing big – just discarding an unneeded complication: https://codereview.appspot.com/11322043
<rogpeppe> jtv: looking
<jtv> Thanks!
<rogpeppe> jtv: out of interest, what does gwacl stand for?
<TheMue> jtv: you've got a +1
<dimitern> hey guys I need a review on the last bit of the deployer API stuff (client-side): https://codereview.appspot.com/11342043
<TheMue> rogpeppe: i assume go windows azure cloud library (or client library)
<TheMue> dimitern: *click*
<rogpeppe> TheMue: ah, sounds plausible. i reckon it could do with a package doc comment...
<TheMue> rogpeppe: yes, sounds reasonable. didn't look if there exists one
<jam> mgz: well you need it for the first process, and you can set a flag as to 'don't inherit' but I don't know what it kills when.
<rogpeppe> jtv: reviewed
<mgz> jam: the other option is specifically when spawning mongo, to do the post-exec go through and close all file handles hack
<rogpeppe> mgz: what's the issue?
<jam> mgz: or not have the test suite crash without cleaning itself up?  :)
<mgz> that too :)
<dimitern> fwereade: ping
<mgz> rogpeppe: flock persisting when a child process spawns with an fd open
<jam> mgz: http://linux.die.net/man/1/flock
<jam> I would consider --close, but that seems to negate the point of using flock in the first place (
<jam> (don't let 2 tarmac processes run concurrently)
<rogpeppe> mgz: 5&- ?
<rogpeppe> mgz: if 5 is your fd
<jam> rogpeppe: earlier today we had a submission go haywire and the bot was hung unable to process new requests for 4+ hours.
<rogpeppe> mgz: sorry, >5&-
<jam> It would appear because mongod was spawned by a test case, and was not stopped when the 'go test' executable exited
<rogpeppe> jam: hmm, interesting - it *should* be stopped at the end of the test
<jam> rogpeppe: sure, but given the test suite failed via some sort of crash, some resource was not properly cleaned up
<jam> in this case a mongo, which then had a file descriptor it inherited still open
<jam> so while --close sounds like it might work (we want to hold open the handle for tarmac, but not for children)
<jam> it sounds like it closes too early.
<rogpeppe> jam: could we use process groups or sessions or something related, and kill the session after go test finishes?
<rogpeppe> jam: it sounds like it might be a mistake to make gotest/mongo not hold the lock, because presumably we don't want multiple garbage mongod's accumulating
<jam> rogpeppe: we don't, but I also don't mind *a* garbage mongod preventing us from landing anything until I come online, especially since I'm gone for 2+ weeks (2 vacation, 1 for Isle of Man)
<jam> I guess I can just tell Martin and Wallyworld they have to deal with it :)
<rogpeppe> jam: presumably it might end up as an arbitrary number of mongods if the same issue reoccurs.
<jam> rogpeppe: it is slightly harder to discover, but easier to diagnose when it does happen. :)
<rogpeppe> jam: in which case just {go test 200>&-} should do the job
 * rogpeppe hasn't done fd manipulation in sh for a while, it seems :-)
<jam> rogpeppe: the piece that knows what verify_command to run needs to know what handle flock opened. And you can force the flock handle, but it isn't like 200 is the default.
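The flock behaviour jam describes can be sketched like this (the lock path and command are hypothetical, not the bot's actual crontab):

```shell
#!/bin/sh
# flock opens the lock file, takes an exclusive lock on its fd, then
# execs the command. Any child that inherits that fd (e.g. a mongod
# spawned by the test suite and never killed) keeps the lock held after
# the parent exits -- which is how the bot got wedged for 4+ hours.
LOCK=/tmp/tarmac.lock   # hypothetical path

# -n: give up immediately if the lock is already held, rather than
# queuing a second run behind the first (the cron-every-minute case).
flock -n "$LOCK" sh -c 'echo tarmac run goes here' \
  || echo 'lock held, skipping this run'
```

Note that `-w 3600` (the `--wait` option discussed above) only bounds how long flock waits to *acquire* the lock before giving up; as jam says, it does not kill anything the current holder spawned.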
<jtv> rogpeppe: thanks – I think it was Go Windows Azure Client Library.
<jtv> And thanks TheMue too.  :)
<rogpeppe> jam: aren't we writing both bits?
<jam> rogpeppe: we do control both bits, but they are spread far apart in terms of configuration, so if we find we have to have it we can, but I would avoid it
<rogpeppe> jam: $FLOCK_HANDLE ?
<jam> rogpeppe: https://plus.google.com/hangouts/_/f497381ca4d154890227b3b35a85a985b894b471 standup
<jam> mramm: I'm in the 1:1 whenever you're ready
<mramm> ok
<mramm> answering questions for dimitern quickly
<mramm> there in a min or two
<jam> np
<TheMue> dimitern: you've got a review
<dimitern> TheMue: tyvm
<TheMue> dimitern: yw
 * TheMue => lunchtime
<dimitern> fwereade: when you can, PTAL https://codereview.appspot.com/11342043/
 * rogpeppe goes to lunch
<wallyworld__> fwereade: you disappeared. i think we had sort of finished anyway
<dimitern> wallyworld__, others: fwereade just texted me that his connection has gone haywire and he's going to lunch now
<wallyworld__> ok
<fwereade> jam, before I go, about the api addresses -- would it be that awful to connect to state and get api addresses from there? we should be able to assume sane/valid state info, right?
<jam> fwereade: we have a state connection, though it is hidden behind the state.Machine we have in the code.
<jam> we need it to SetPassword
<jam> so we will have a state conn
<jam> but I'm not sure how to get it passed in.
<jam> I can investigate
<fwereade> jam, I feel like it ought to be possible but, yeah, I guess all the entity stuff is a little tangly, maybe it's not worth it
<fwereade> jam, have a little look but if it's going to be costly I guess we're fine without it
<fwereade> jam, I'm just a bit worried about what'll happen if that code lives longer than we expect it to ;)
<fwereade> rogpeppe, bug fix LGTM, thanks
<fwereade> rogpeppe, that was *not* how I expected that Format to work though :)
<fwereade> (evidently ;p)
<jam> fwereade: that isn't how anyone who doesn't read the docs closely thinks it works
<jam> given it is the first time I've seen a strftime that *wasn't* %H:%M:%S based
<fwereade> jam, yeah, and there's even mention of those %H~s etc in the docs iirc
<dimitern> fwereade: thanks for the review
<jam> http://golang.org/pkg/time/#Time.Format doesn't actually mention the % versions that I can see
<jam> and http://code.google.com/p/go/issues/detail?id=444 clearly indicates it doesn't want strftime
<jam> fwereade: Given the issue, is 2006 actually better?
<jam> which is 1
<jam> what is 2
<jam> IMO you still need the docs to figure out what to pass
<jam> maybe if you did it enough you'd remember the magic date better
<fwereade> jam, ha, clearly my crack consumption was much higher than usual that day... I could swear I remembered looking it up
<fwereade> jam, maybe I was an idiot and looked up strftime instead
<fwereade> jam, that's probably it
<jam> fwereade: right, it must use strftime syntax, I'll go track it down and use it
<fwereade> jam, yeah, indeed
<fwereade> jam, there's something quite neat about what they do there but it's a touch astonishing too
<rogpeppe> fwereade: i think it's only astonishing if you're used to strftime. i just wish they'd chosen a more memorable date - the y/m/d ordering is quite parochial
<jam> fwereade: so.. s.State.APIAddresses() is actually wrong in the test suite (because it does the same "give me the default  port on all these things") s.APIInfo(c).Addrs has the correct value...
<jam> It doesn't matter for real world case
<jam> but it is true that we aren't recording in State the *actual* API Info
<jam> (addresses)
 * rogpeppe wishes there was an easy way of traversing forwards through a pipeline of merge proposals
<jam> rogpeppe: pump ?
<rogpeppe> jam: when reviewing
<jam> or you mean web browse would link them
<jam> LP links them together
<jam> well, only towards the prereq maybe
<rogpeppe> jam: does it? perhaps i've missed that.
<rogpeppe> jam: exactly (and even then it only points to the branch, not the MP, i think)
<jam> rogpeppe: so if you go to the MP page, it has links to the branch itself, and the prerequisite branch, if you click on the branch itself, it says "1 branch depending on this one"
<jam> which you can click to
<jam> and then get to the MP from there
<jam> so the links are there
<jam> but not direct
<jam> hmmm.. the "1 branch dependent on this one" takes you to the page which has the list of merges to that branch
<jam> which *doesn't* include the MPs that depend on the branch
<jam> I wonder if it was intended to do so
<jam> rogpeppe: poke tim :)
<rogpeppe> jam: yeah, i just saw that.
<rogpeppe> jam: it looks wrong.
<abentley> jam: verrrrry longstanding bug.
<jam> abentley: :)
<jam> fwereade: https://codereview.appspot.com/11137044/ has been updated with a state.State.APIAddresses call. I wish that actually did the right thing in the test suite.
<jam> (We don't record the API Addresses in the DB, so APIAddresses infers them from the State.Addresses that *are* recorded)
<jam> wallyworld__: something is wrong with your branch. I'm getting: Running test command: go fmt ./... && go build ./... && go test ./...
<jam> Command appears to be hung. There has been no output for 900 seconds. Sending SIGTERM.
<jam> It is trying again right now, but other branches have been able to land I believe.
<rogpeppe> jam: ah, i see now why mongo wasn't shut down
<rogpeppe> hmm, my laptop seems to have stopped talking to its wired ethernet :(
<jam> fwereade,  dimitern: ping about some recent timeouts on go-bot
<fwereade> jam, oh yes?
<jam> we've had several failures like this one: https://code.launchpad.net/~dimitern/juju-core/070-deployer-client-facade/+merge/174973
<jam> Where it appears deployer tests are getting a 500ms timeout
<dimitern> jam: yeah, i noticed
<jam> I have the feeling Canonistack is overloaded, and go-bot is running slowly
<jam> but I noticed the deployer code wakes up every 50ms, but doesn't do anything like StartSync in the inner loop
<jam> fwereade: we have s.State.StartSync() at the beginning, but not in the inner loops
<jam> this doesn't fix everything, but I think when tests are failing they aren't cleaning up cleanly, so we have a bunch of follow on failure.
<fwereade> jam, yeah, those all do look like timeouts while the SUT is actually doing what it should, but slowly
<jam> fwereade: right, it is getting some of them done (like svc 0 but not svc 1)
<fwereade> jam, StartSync in inner loops is only necessary when there's no way to tell when a triggering action has actually taken place
<jam> note stuff like: ok  	launchpad.net/juju-core/worker/uniter	365.596s
<jam> which is one of the slower tests
<jam> but 370s is super long
<jam> tarmac was running the whole test suite in 15min
<jam> fwereade: so I'm considering just bumping up the global LongWait (and changing the deployer code to use that value).
<jam> But I figured I'd bring it up for discussion.
<fwereade> jam, yeah, I'm +1 on that -- in normal circumstances these tests pass relatively fast, but in unhappy circumstances we still want them to work
<fwereade> oof, yeah, I did not see that one
<jam> fwereade: and long timeout is supposed to be the "waiting this long is a failure" not the "sleep a bit to let things progress"
<fwereade> jam, yeah, indeed
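The LongWait principle jam states can be sketched as a polling helper: the long deadline is a failure condition, never a pacing mechanism (the names `longWait` and `shortWait` here are illustrative, not juju-core's actual test helpers):

```go
package main

import (
	"fmt"
	"time"
)

const (
	longWait  = 500 * time.Millisecond // "waiting this long is a failure"
	shortWait = 10 * time.Millisecond  // polling interval; pacing only
)

// waitFor polls cond until it succeeds or longWait elapses.
// A false return means the test should fail, not "sleep a bit more".
func waitFor(cond func() bool) bool {
	deadline := time.After(longWait)
	for {
		if cond() {
			return true
		}
		select {
		case <-deadline:
			return false
		case <-time.After(shortWait):
		}
	}
}

func main() {
	ready := time.Now().Add(50 * time.Millisecond)
	fmt.Println(waitFor(func() bool { return time.Now().After(ready) }))
}
```

On a healthy machine the condition turns true quickly; on an overloaded bot it merely takes more polls, which is exactly why bumping the global deadline is safe.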
<fwereade> jam, rogpeppe1: how might we be getting an "unauthorized" error out of machiner?
<rogpeppe1> fwereade: context?
<fwereade> rogpeppe1, deploying to saucy on azure: http://paste.ubuntu.com/5880671/
<rogpeppe1> fwereade: is machiner using the API now?
<rogpeppe1> fwereade: (i'm presuming not)
<fwereade> rogpeppe1, no, I don't think it actually is, which is what's baffling
<rogpeppe1> fwereade: might line 22 etc be significant here?
<rogpeppe1> fwereade: hmm, no
<rogpeppe1> fwereade: that's expected
<fwereade> rogpeppe1, I think that's normal, yeah
<rogpeppe1> fwereade: this only happened deploying to saucy on azure?
<rogpeppe1> s/happened/happens/
<fwereade> rogpeppe1, I have only seen it reported there
<fwereade> rogpeppe1, but I have not deployed today
<rogpeppe1> fwereade: and this is on the bootstrap machine too, right? that's very odd.
<rvba> rogpeppe1: this is only on the bootstrap machine.  I can't deploy nodes.
<rogpeppe1> rvba: what happens when you run juju status?
<rvba> rogpeppe1: http://paste.ubuntu.com/5881387/
<rogpeppe1> rvba: is it possible for you to give me ssh access to the bootstrap machine?
<rvba> rogpeppe1: nothing is listening on the API port: http://paste.ubuntu.com/5881394/
<rvba> rogpeppe1: sure, just one sec.
<rvba> rogpeppe1: launchpad id?
<rogpeppe1> rvba: that's not surprising
<rvba> k
<rvba> rogpeppe1: ssh ubuntu@juju-azure-saucyy7xrrjl4h9zemnvvbaqpfqtelpm2jzjjzu375hmczp20ldz.cloudapp.net
<rogpeppe1> rvba: as the machine agent is failing to start
<rogpeppe1> rvba: and that's what runs the API
<rogpeppe1> rvba: my ssh public key is:
<rogpeppe1> ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDOjaOjVRHchF2RFCKQdgBqrIA5nOoqSprLK47l2th5I675jw+QYMIihXQaITss3hjrh3+5ITyBO41PS5rHLNGtlYUHX78p9CHNZsJqHl/z1Ub1tuMe+/5SY2MkDYzgfPtQtVsLasAIiht/5g78AMMXH3HeCKb9V9cP6/lPPq6mCMvg8TDLrPp/P2vlyukAsJYUvVgoaPDUBpedHbkMj07pDJqe4D7c0yEJ8hQo/6nS+3bh9Q1NvmVNsB1pbtk3RKONIiTAXYcjclmOljxxJnl1O50F5sOIi38vyl7Q63f6a3bXMvJEf1lnPNJKAxspIfEu8gRasny3FEsbHfrxEwVj rog@rog-x220
<rogpeppe1> rvba: rogpeppe
<rogpeppe1> rvba: (i think - perhaps it's rogpeppe@gmail.com; i never know what launchpad expects)
<rvba> rogpeppe1: you should be able to login now, I've imported the key you put on lp.
<rogpeppe1> rvba: am logged in, thanks
<mgz> remember you can just do `ssh-import-id rogpeppe`
<rvba> That's precisely what I did.
<rogpeppe1> hmm, the agent config file looks ok, but mongo auth as that agent fails. http://paste.ubuntu.com/5881424/
<rogpeppe1> and the log seems to indicate that the password for the MA was created correctly http://paste.ubuntu.com/5881429/
<rogpeppe1> rvba: are you using tip?
<rvba> rogpeppe1: yes
<rogpeppe1> rvba: revno 1464?
<rvba> rogpeppe1: yes
<rogpeppe1> rvba: i'll just check to see if it works under ec2
<rogpeppe1> hmm, looks like trunk is broken against the latest version of gwacl
<rvba> rogpeppe1: yes, we're fixing this, use gwacl's revision number 182.
<rogpeppe1> rvba: thanks. using that.
<rogpeppe1> rvba: hmm, seems to work ok under ec2. can you reproduce the azure behaviour?
<rvba> rogpeppe1: yes, I've tested it ~3 times this afternoon.
<rvba> rogpeppe1: I'll try with a raring image right now.
<rogpeppe1> rvba: please. i'm just trying ec2 with a saucy image (assuming it can find one)
<rogpeppe1> oh bugger, forgot to use --upload-tools. discard as unverified all my previous assertions of okayness
<rogpeppe1> darn, "no "saucy" images in us-east-1 "
<rogpeppe1> i've reached eod and have to stop
<rogpeppe1> rvba: please let me know how you get on with raring
<rvba> rogpeppe1: will do… talk to you tomorrow!  Thanks for your help.
<rogpeppe1> rvba: no help as yet, i'm afraid
<rogpeppe1> rvba: g'night all
<rvba> nn rogpeppe1
<rvba> rogpeppe1: works ok on raring.
<thumper> good morning
<thumper> fark... bikeshed much?
<fwereade> thumper, heh, sorry, is that my reviews to which you refer?
<thumper> fwereade: no
<thumper> fwereade: I gather you wanted to chat?
<fwereade> thumper, yeah, can't quite remember what about -- was it the upload-tools stuff?
<thumper> yeah
<fwereade> thumper, I don't think it needs much actual discussion really; it's settled in my mind and I'm willing to accept it in the name of expediency, because I really want a local provider soon not late
<thumper> :)
<thumper> I'd be very surprised if it actually caused problems
<fwereade> thumper, yeah, you may be right
<thumper> but prepared to fix the problems should they occur
<fwereade> thumper, I still feel it's a bit unnecessarily messy
<fwereade> thumper, however, themue's back looking at auto-sync-tools again
<thumper> I can accept that for now
 * thumper nods
<thumper> what is the auto-sync-tools option?
<fwereade> thumper, and I kinda hope it'll wither naturally
<fwereade> thumper, it's just a nicer first user experience
<fwereade> thumper, if you can't find tools in this cloud, copy them from somewhere else
<thumper> fwereade: hmm
<thumper> fwereade: however the local provider should work disconnected
<thumper> fwereade: providing that the user has the cloud image
 * thumper thinks
<thumper> hmm
<thumper> the auto-update of the container may have problems
<thumper> should check disconnected use
<fwereade> thumper, istm that the jujud shortcut fits better with sync-tools than with upload-tools
<fwereade> thumper, good point
<wallyworld__> fwereade: i'm confused by your reference to agent.conf
<fwereade> wallyworld__, ah, sorry
<fwereade> wallyworld__, to be compatible, we need to be able to bootstrap older code with a newer client
<wallyworld__> and we can i think, i tested it
<wallyworld__> ah, i tested an older client
<wallyworld__> againgst this new code
<wallyworld__> the new provider-state file just appends info
<fwereade> wallyworld__, so the args are deliberately minimal, and we're expected to write additional stuff to some well-known location such that newer code can use it if it's there
<wallyworld__> so the older struct can read it just fine
<fwereade> wallyworld__, sure, I understand that, that's a separate issue
<fwereade> wallyworld__, and I'm not *really* too bothered about the BootstrapState issue
<wallyworld__> i saw the hw info as a natural extension to the current bootstrap state, which right now is just instance ids
<fwereade> wallyworld__, at least the type name indicates the sanity of what you do, and I think it doesn't matter if future code overwrites it and kills the hardware info
<fwereade> wallyworld__, because its only actual client is bootstrap
<wallyworld__> if it is over written later it doesn't matter cause it's been used by then
<fwereade> wallyworld__, wrt agent.Conf, that's the existing mechanism for passing extra stuff into jujud
<wallyworld__> i went to write a separate file but it involves a fair bit of extra code for little benefit
<fwereade> wallyworld__, I have abiding grumpiness that it was designed as a single file with heaps of arbitrary duplication
<wallyworld__> oh ok, i didn't know that about agent.conf, sorry
<wallyworld__> or maybe you told me and i didn't understand
<fwereade> wallyworld__, no worries at all, it's just another of these ill-documented details
<fwereade> wallyworld__, so if you just tack a field in there it won't make it notably worse than it is today
<wallyworld__> but we just want to pass stuff to the first bootstrap node here, not nodes in general
<wallyworld__> so agent.conf is to be avoided i think
<fwereade> wallyworld__, the agent conf has *so* much totally situational crap in there already that I can't get too worked up about it
<wallyworld__> yes, i just tack an extra field in
<wallyworld__> which is read as part of the final machine config process
<fwereade> wallyworld__, it's *everything*... how to connect to *and* to serve both state and the api, and probably a few other things besides
<wallyworld__> so given all this, are you ok with it now?
<fwereade> wallyworld__, I'm not ok with the extra parameter to bootstrap-state, purely because it's an unnecessary compatibility break
<fwereade> wallyworld__, I'm fine with the extra field in envrons.BootstrapState
<fwereade> wallyworld__, (but its precarious nature should be commented and justified I think)
<wallyworld__> oh ok, you are talking about the jujud param
<fwereade> wallyworld__, I'm talking about both in a kind of unhelpful way
<fwereade> wallyworld__, the jujud param is the significant one
 * thumper heading to take the dog to the dog park with the minions
<thumper> bbl
<davecheney> thumper: ping
#juju-dev 2013-07-17
<davecheney> yay, thanks lp ,https://launchpadlibrarian.net/145125461/buildlog.txt.gz
<thumper> davecheney: hi
<thumper> davecheney: whazzup?
<davecheney> thumper: everything OSCON related
<thumper> everything? wow
<davecheney> thumper: https://codereview.appspot.com/11333043/
<thumper> that's a lot
<davecheney> ^ if this lands, does that mean we can say 1.11.3 will have a local provider
<thumper> yes
<thumper> it won't be perfect
<davecheney> or is the reality more nuanced ?
<thumper> but it works
<thumper> there are a few rough edges
<thumper> 1. containers don't restart on reboot yet
<davecheney> i ask because when we add providers to all.go
<thumper> 2. outside addresses for machines is broken
<davecheney> then they show up in juju init
<thumper> right
<thumper> it is functional
<thumper> as in, you can bootstrap, deploy to it, and it works
<thumper> and destroy
<thumper> etc
<davecheney> right
<davecheney> sweet
<bigjools> how do you work out which goroutine caused the "fault" here: http://paste.ubuntu.com/5881651/
<thumper> bigjools: the one that panicked :(
<davecheney> bigjools: always the top one
<bigjools> davecheney: good to know, ta
<davecheney> bigjools: it might look like they are sorted
<davecheney> but that is just a fluke
<bigjools> next question, WTF
<davecheney> the faulting goroutine is the one on the top
<davecheney> 0xb is SIGSEGV
<bigjools> how readable :)
<bigjools> is this a golang bug?
<davecheney> bigjools: very likely, this was go 1.0.2
<bigjools> on raring
<davecheney> bigjools: that fault is a very 'fuck what happened'
<davecheney> last ditch error
<davecheney> normal nil derefs are less cryptic
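For reference, the `0xb` in that trace is signal 11, SIGSEGV. The less cryptic case davecheney mentions is a plain nil dereference in pure Go, where the runtime turns the fault into a recoverable panic with a readable message (a minimal illustration, unrelated to the gwacl code that actually crashed):

```go
package main

import "fmt"

func main() {
	defer func() {
		// The runtime converts the segfault into a panic value that
		// recover() can catch, rather than killing the process raw.
		fmt.Println("recovered:", recover())
	}()
	var p *int
	fmt.Println(*p) // nil dereference
	// prints: recovered: runtime error: invalid memory address or nil pointer dereference
}
```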
<bigjools> same code works on precise
<davecheney> precise has a different version, 1.0
<davecheney> bigjools: is it reproducible
<davecheney> the panic ?
<davecheney> s/panic/fault
<bigjools> I believe so (no personal experience)
<davecheney> ie, always crashes on raring/1.0.2
<bigjools> just catching up with email and saw that rvb had this
<bigjools> I am wondering how he built the binaries now
<bigjools> ie did he build on precise for precise deployment, raring for raring, etc
<bigjools> is this a known bug? ERROR juju runner.go:200 worker: fatal "machiner": unauthorized
<davecheney> bigjools: i do not think so
<davecheney> i haven't seen anything like that reported recently
<bigjools> oh apparently Roger and William are investigating
<bigjools> only happens on saucy
<bigjools> davecheney: the branch of Gavin's that you just approved requires gwacl to be updated too, can you make that happen on the bot or do we need jam?
<davecheney> bigjools: i have no access to the bot :(
<bigjools> \o/
<bigjools> this dependency situation is ridiculous
<bigjools> and other, better spelled, things
<davecheney> http://25.media.tumblr.com/tumblr_lo9qb4Narf1qdw1bro1_500.gif
<thumper> davecheney: when is oscon?
<bigjools> much better
<bigjools> next week
<thumper> davecheney: and are you talking about juju?
<thumper> ah, that's right
<thumper> and mramm is there too right?
 * thumper goes to write some tests
<davecheney> thumper: yes
<thumper> davecheney: is that a yes to you talking about juju?
<thumper> wallyworld__: we are landing machines today
<thumper> go go go
<wallyworld__> indeed
<wallyworld__> i only have one more to go but am making changes and may need another +1 from william
<wallyworld__> thumper: since when did we decide to group imports into blocks? sadly go fmt doesn't know about that and so now i have to tediously inspect each file i change
<thumper> wut?
<wallyworld__> till now i've been able to rely on go fmt to Do The Right Thing
<wallyworld__> i got a code review comment about it
<thumper> oh
<thumper> a while back
<wallyworld__> since go fmt didn't Do The Right Thing
<wallyworld__> was there an email i missed?
<thumper> it just mixes them
<thumper> wallyworld__: probably
<thumper> it was to the list
<thumper> from jam
<wallyworld__> go fmt puts things in alphabetical order
<thumper> right, but it is better to have:
<thumper> standard libs
<thumper> blank
<thumper> other deps
<thumper> blank
<thumper> juju-core
<wallyworld__> sigh. sure. but the tooling doesn't support it :-(
<thumper> it does sort each block
<thumper> so it kinda does
<thumper> it just doesn't split into blocks for you
<wallyworld__> ok. if it sorts each block that's kinda ok
<wallyworld__> too bad there isn't a tool like we had for lp
<wallyworld__> to fix them up
<thumper> you could write one :)
<wallyworld__> still is tedious without one
<wallyworld__> i could in my copious spare time
<thumper> that's the spirit
<wallyworld__> blood from stone and all that
<wallyworld__> gotta sleep and eat sometimes
<wallyworld__> first world problem i know
<thumper> :)
<thumper> wallyworld__: hangout?
<wallyworld__> ok
<thumper> i'll start?
<thumper> https://plus.google.com/hangouts/_/c8ac9986272fa8ed8fcdc3983ab3deeb0297ad2d?hl=en
<thumper> wallyworld__: https://codereview.appspot.com/11327043/
<thumper> wallyworld__: and https://codereview.appspot.com/11327044/
<wallyworld__> ok
<wallyworld__> thumper: you owe me. by +1ing your 2nd mp, i've condemned myself to work fixing it up when my current branch lands
 * thumper hands wallyworld__ a beer voucher
 * wallyworld__ doesn't drink beer :-(
 * thumper hands wallyworld__ a generic drink voucher
<wallyworld__> \o/
<thumper> value TBD
<thumper> wallyworld__: https://codereview.appspot.com/11333043/ needs a look too
<thumper> wallyworld__: it is the last
<thumper> wallyworld__: the previous three are all approved and awaiting the merge bot
 * thumper does a little dance
<wallyworld__> ok
<davecheney> thumper: fuck year
<thumper> a whole year of fuck?
<davecheney> ... berserker
<thumper> wallyworld__: I'll make a branch that uses an environ watcher for the config
<thumper> wallyworld__: instead of the env variables
<thumper> wallyworld__: once the others are in
<thumper> davecheney: re golang backports
<thumper> gpg: requesting key 448413DC from hkp server keyserver.ubuntu.com
<thumper> gpgkeys: key B4CA2F7AA7F663D0B585869B1CB4303F448413DC not found on keyserver
<wallyworld__> thumper: ok
<thumper> wallyworld__: then a branch to make containers restart
<davecheney> thumper: how do I fix that ?
<wallyworld__> sounds good
<thumper> davecheney: I don't know, normally by pushing that key to the keyserver
<davecheney> thumper: that isn't my key
<davecheney> i don't think
<davecheney> it isn't my PPA
<thumper> yeah, I know
<thumper> technically I think it should be there
<thumper> as the ppa thing does that.
 * davecheney is lost
 * thumper looks at bigjools
<thumper> davecheney: got it this time
<thumper> seems like a transient error
 * thumper twiddles thumbs
<thumper> time for another coffee I think
<jtv> Coffee?  Good idea.
<thumper> hmm...
<thumper> wallyworld__: isn't that simple actually
<thumper> wallyworld__: as the worker doesn't know about the config internals
<thumper> wallyworld__: I think that the environment variables are the better way to do it
 * thumper moves on to restarting the containers
<wallyworld__> hmmm. should we pass config to workers
<wallyworld__> shouldn't we
<wallyworld__> env variables seem out of band
<thumper> the workers know generic config
<thumper> but not environment specific config
<thumper> as that is private
<thumper> using environment variables is actually very common
<wallyworld__> sure, but here we are talking pass stuff between application internal moving parts
<wallyworld__> if our config doesn't support that it's broken imo
<thumper> wallyworld__: they are separate executables
<thumper> wallyworld__: in this case, the worker is a special case
<thumper> wallyworld__: I don't advocate using this by default
<thumper> wallyworld__: but the local provider is always going to be a bit /special/
<wallyworld__> ok
<jtv> Hi jam - did you hear we have another gwacl update pending?
<thumper> jtv: jam is only on for half a day
<jtv> ah
<thumper> jam: we need to make sure someone else knows about tarmac
<thumper> jam: and how to poke it
<jtv> We'll need someone to update its gwacl.
<thumper> jam: before you go on holiday
<thumper> davecheney: I wish I could write a test that checked the imports of a package
<thumper> davecheney: can we use magic meta-shit for that?
<jtv> Out of curiosity: what do you want to check them for?
<davecheney> MEGA_SHIT!
<davecheney> COCK!
<davecheney> # launchpad.net/juju-core/environs/azure
<davecheney> src/launchpad.net/juju-core/environs/azure/environ.go:401: not enough arguments in call to gwacl.NewNetworkConfigurationSet
<davecheney> https://launchpadlibrarian.net/145136519/buildlog_ubuntu-precise-i386.juju-core_1.11.3-4~1472~precise1_FAILEDTOBUILD.txt.gz
<davecheney> the build is broken til we can land that change via the bot
<jtv> davecheney: yes, we were just trying to get the tarmac updated.
<davecheney> fucksticks
<davecheney> oh well
<davecheney> the recipe should work after that
<jtv> "oh well": EINCONGRUOUS
<jtv> Will the upcoming Go release finally fix this just-the-tip dependencies crap?
<wallyworld__> jtv: what, and adopt best practice used throughout the software industry? don't count on it
<thumper> jtv: I want to make sure that when I fix agent so it doesn't depend on the world
<thumper> jtv: that I have a test to enforce that
<jtv> thumper: that sounds remarkably close to one of Go's stated design goals, so you'd expect there to be a tool for that.
<thumper> jtv: hopefully
<thumper> I know that the go doc is able to walk the package dependencies
<jtv> They were very very concerned about minimizing imports.  Pike says they'd rather duplicate code than import a package for just one function.
<thumper> so I'm hoping it should be easy enough
<thumper> pity we don't do that
<jtv> IIRC I saw a strings function re-implemented in the url package.
<thumper> the minimizing imports thing
<thumper> obviously we do the duplication
<jtv> "Let's do both!"
<thumper> \o/
<thumper> WINNING
<jtv> One problem is, with the "unused imports and variables are errors," the Go team seem to feel they've already done a lot to fight the problem - it may have taken the wind out of the sails of further support.
<jtv> IME, much of the weight of pointless dependencies comes from unused types, unreachable code, and poor structuring of dependencies.  The unused variables & imports are a drop in the bucket by comparison.
<jam> thumper: wallyworld__, dimitern, and mgz all have logged into the tarmac machine before so I know they have creds. Dimitern has done gwacl updates
<jtv> Hi jam!
<thumper> jam: awesome, is it documented anywhere?
<jam> thumper: I've sent emails, but I don't think there is a simple wiki page sort of thing.
<jtv> Maybe it's time for one...  There does seem to be a lot of oral tradition in this project.
<thumper> hmm... arse biscuits
<thumper> just got a failure from the merge bot
<thumper> has to do with looking for the lxc bridge network adapter
<thumper> which it obviously doesn't have
<thumper> hmm...
<thumper> double hmm...
<jam> jtv: gwacl is currently on rev 179, pull will bring it to 186, does that sound correct to you?
<jtv> My local one was on 179 as well actually.  Must have had a real burst of landings.
<jtv> I'm checking the log.
<jtv> jam: I'll try a local test run - but if that new version breaks anything else it doesn't look like a lot of work to fix it up.
<jam> thumper: as for the dependencies stuff, you can do a few things, 1) go has a very strong ast package that lets you introspect the source, 2) the package go/build lets you introspect packages, etc
<jtv> jam: OK - the only build failure I'm getting is the one that that pending juju-core branch fixes.
<jam> thumper: so it is possible to do, rogpeppe has probably done some of it
<jtv> So please hit the update button!
 * thumper nods
<jam> jtv: update has finished please land your branch
<jtv> Thanks!
<jtv> It's on the way in.
<thumper> davecheney: I've just updated go, how do I force a rebuild of everything?
<thumper> davecheney: is there a go clean?
<jam> thumper: rm $GOPATH/{pkg,bin}  ?
<thumper> yeah, that worked
<jtv> Better make very very sure GOPATH is set though.  :)
<jam> jtv: well I was meaning cd $GOPATH rm pkg/bin, but you're right you could do the rm via env expansion, but it is a bit risky
<thumper> jam: did it manually
<thumper> but I got the idea
<davecheney> thumper: jam that is the ticket
<thumper> davecheney: I'm about to land the branch that enables the local provider
<davecheney> thumper: excellent
<jtv> jam: No worries, it was clear that you were paraphrasing because there was no "-r" either :-)
<thumper> just had to mock out the interface functions for tests
<thumper> it failed the first time
<davecheney> i'll come nagging about some word for the release notes for Evil Nick
<thumper> passed locally because I have a lxcbr0
<davecheney> https://docs.google.com/a/canonical.com/document/d/1ZBV6m0D1cfJQGoHW7EzJEc2qZeJFR38teHM_4OiYkeM/edit#
 * thumper waits the 15 minutes for the tests to run
<jam> I believe someone just poked me, but my IRC window hung
<jam> please resend
<jtv> If it was me, it wasn't important.
<jam> ah it was just dave agreeing with me
<davecheney> jam: was me, was not important
<davecheney> lets hope nobody figures out how I handle dependencies in the LP recipe, http://bazaar.launchpad.net/~dave-cheney/juju-core/package/changes
<davecheney> bigjools will bollock me
<jtv> \o/
<jtv> Trunk is updated for the gwacl change.
<jtv> Please update your gwacl, everyone, before trying to build the latest trunk.
<davecheney> whoop whoop
<jtv> Anyone available to review https://codereview.appspot.com/11409043 ?
<davecheney> ... value *golxc.Error = &golxc.Error{Name:"lxc-ls", Err:(*exec.Error)(0xf840169cc0), Output:[]string(nil)} ("error executing \"lxc-ls\": exec: \"lxc-ls\": executable file not found in $PATH")
<davecheney> shitter
<davecheney> this is going to be (lower case) run
<davecheney> s/run/fun
<jtv> Missing dependency?
<jtv> thumper, have you got anything to do with that?  ^
<davecheney> lucky(~) % lxc-ls
<davecheney> The program 'lxc-ls' is currently not installed. You can install it by typing:
<davecheney> sudo apt-get install lxc
<thumper> davecheney: there are a few bits around there
<thumper> davecheney: what in particular are you looking at?
<thumper> ah...
<thumper> I need to mock out lxc there too
<thumper> forgot about that
<davecheney> jtv: reviewing
<jtv> Thanks
<davecheney> jtv: why is ec2 special ?
<davecheney> (probably a silly question)
<davecheney> ie, why doesn't the ec2 provider call ComposeUserData like the others did ?
<thumper> and try to land it again...
 * thumper waits the requisite 15 minutes
<jtv> davecheney: did I forget a bit there!?
<jtv> davecheney: yup, I did!  But it's the exact same recipe as openstack.  Please stand by while I fix.
<davecheney> jtv: cool, I was wondering why ec2 was different
<davecheney> you called it out in the description
<wallyworld__> thumper: i've just removed InstanceId() from local env provider. i don't think it was actually used, right?
<davecheney> but that didn't marry with what I saw
<thumper> wallyworld__: it was used in bootstrap
<davecheney> thumper: 4 changes per hour, 24 hours per day
<wallyworld__> even for local?
<jtv> davecheney: Absolutely right.  EC2 is the Ur-provider where it all starts.  I suspect I just assumed at one point that I'd done it before anything else.
<davecheney> the progenitor, the scion
<jtv> Thank you.  I knew there were better terms but couldn't think of any.
<jtv> Except Urvater, but that's not English.
<wallyworld__> thumper: so local bootstrap does call jujud --bootstrap-state? i didn't see it using that or cloud init anywhere
<davecheney> jtv: ooh, nice word
<jtv> Oh good, now I have a conflict!
<jtv> Please bear with me.
<davecheney> you need the überlagern
<wallyworld__> thumper: in fact, local bootstrap just writes the state file directly from what i can see
<thumper> wallyworld__: it was used in one and only one place
<wallyworld__> so provider.InstanceId() is not used unless i am on crack and can't see it
<thumper> wallyworld__: if you removed it, you've fixed it
<wallyworld__> used != defined
<wallyworld__> it was implemented
<wallyworld__> but i was asking if anything called it
<jtv> davecheney: überlagern...  isn't that a verb for to spend the night in camb?
<jtv> *camp
<davecheney> jtv: internets says it means to overlay one on top of another
<thumper> wallyworld__: yes, one thing called it
<wallyworld__> really?
<jtv> Ah.  Well if you're going to noun it, you'll need to capitalize it as well.  It's a bit like Go's exporting rules.  :)
<thumper> well it did when I grepped the code for it
<wallyworld__> in local?
<thumper> no,
<wallyworld__> right, i already removed that then
<wallyworld__> thanks :-)
<davecheney> I can't even type the ü, i had to copy pasta it
<jtv> <compose>"u
<wallyworld__> jtv: i'm removing InstanceId() from the Aszure provider. I'll tweak the bootstrap code to make the tests pass. tomorrow after i land can you do a live test for me?
<jtv> wallyworld__: isn't InstanceId() required?
<wallyworld__> jtv: not with my latest code
<jtv> We went through a lot of pain to implement that...
<wallyworld__> jtv: the fact that it was there at all sucks
 * thumper twiddles thumbs some more
<thumper> about 4 minutes to wait
<jtv> I can try to do a live test tomorrow, yes.  It'll be my first one.
<jtv> You're right, it was always a pain and a disappointment.  Can you do the maas side as well then?
<wallyworld__> jtv: sorry about the implementation pain. we had similar pain with openstack
<thumper> if it doesn't land this time...
<thumper> it can wait for tomorrow
<wallyworld__> jtv: maas already done
<wallyworld__> in my branch
<jtv> Great.
<wallyworld__> tests pass fwiw :-)
<jtv> There's probably a lot of code that can disappear from  the Azure provider without this.
<jtv> All the WALA stuff AFAIK.
<thumper> wallyworld__: btw, \o/ to removing the dumb method
<jtv> ../gwacl/storage.go:14: import /home/jtv/go/pkg/linux_386/launchpad.net/gwacl/logging.a: not a package file
<jtv> wtf?
<wallyworld__> jtv: i removed some maas stuff too already
<jtv> wallyworld__: cleanups.  Always feels good.
<wallyworld__> yes. indeed
<wallyworld__> that particular method was nasty
<jtv> It came as an unpleasant surprise...  You don't expect part of the EnvironProvider suddenly to run on the instance and have to figure out stuff it received from the Environ on the server.
<thumper> MERGED!!!! \o/
<jtv> davecheney: diff should be updated...  this drives the line count for my branch even further into the negative.  :)
<jtv> Wow but the Rietveld part of "lbox propose" takes a long time.
<davecheney> delete everything for the win!
<davecheney> thumper: did you do it ?
<davecheney> do we have an lxc provider ?
<thumper> yes
<thumper> yes we do
 * davecheney golf clap!
<jtv> The delay is especially annoying without DVCS because you basically can't get any work done while waiting for lbox.
 * thumper is done
<thumper> see ya later
<wallyworld__> jtv: from what i can see, you could have returned "" for that InstanceId() method cause azsure is not using cloud init
<wallyworld__> and the only thing that used provider.InstanceId() was jujud bootstrap-state which was called from cloud init
<wallyworld__> whereas the azsure provider is saving the bootstrap state directly
<jtv> Wait...  we didn't have an *Ubuntu  image* for Azure that supported cloudinit.  But AIUI we do use cloudinit on Azure.
<jtv> In some highly modified way, I'm sure.
<wallyworld__> ah i think you are right
<jtv> If we'd done all that work for nothing, that would be a bit of a wet blanket.
<wallyworld__> but i'm almost sure it doesn't require or use InstanceId()
<wallyworld__> hence i can just delete that method, but i'll keep checking to be sure
<jtv> Well with any luck we'll never know.  :)
<davecheney> awwww shit
<davecheney> can someone moderate my message to juju-dev https://lists.ubuntu.com/mailman/confirm/juju-dev/cffced6f12ee7bde9b7a60b938cd49ece99e601d ?
<jtv> It seems I'm not privileged.  :(
<davecheney> bradm is fix0ring for me
<jtv> ah
<jtv> btw davecheney, did you get the diff update for the branch you're reviewing?
 * jtv reboots
<davecheney> jtv: just looking now
<davecheney> might have to wait a bit
<davecheney> about to walk out the front door
<rogpeppe1> mornin' all
<rogpeppe1> i'm going to be unavailable for a little while as i'm reinstalling this laptop from scratch
<rogpeppe1> which, hopefully, will fix lots of stuff. crossed fingers.
<davecheney> bzr: ERROR: bzrlib.errors.InvalidHttpResponse: Invalid http response for https://xmlrpc.launchpad.net/bazaar/: Unable to handle http code 502: Bad Gateway
<davecheney> thanks for nothing lp
<davecheney> that is twice today a build has failed because lp soiled itself
<bigjools> davecheney: any idea what would cause this?  we switch from 1.0.2 to 1.1 in the PPA and get failing tests: http://paste.ubuntu.com/5883536/
<bigjools> the panic is.... interesting
<bigjools> or anyone?
 * davecheney looks
<davecheney> bigjools: is it repeatable ?
<bigjools> davecheney: so far yes
<davecheney> bigjools: got a branch I can poke ?
<bigjools> davecheney: lp:gwacl
<bigjools> we were using 1.0 until your email
<davecheney> ok, it's the gwacl branch
<TheMue> bigjools: cleaned the pkg dir after upgrade and before testing?
<bigjools> yes
<bigjools> it bombed out much quicker when I didn't :)
<davecheney> the compiler would have refused to link to old packages
<davecheney> it might have given a confusing error
<bigjools> yes
<davecheney> but it would not have worked
<bigjools> it looks like a bug in the 1.1 runtime
<bigjools> [fp=0x2aaaaac6f4f8] runtime.sigpanic()
<bigjools>         /usr/lib/go/src/pkg/runtime/os_linux.c:239 +0xe7
<davecheney> bigjools: maybe
<davecheney> checking
<davecheney> bigjools: yup, i see a panic
<davecheney> investigating
<bigjools> davecheney: ok ta
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: fwereade | Bugs: 7 Critical, 79 High - https://bugs.launchpad.net/juju-core/
<davecheney> bigjools: as well as the crash
<davecheney> are you also getting other test failures ?
<bigjools> no
<bigjools> well we were but trivial
<bigjools> xml formatting stuff
<davecheney> bigjools: http://paste.ubuntu.com/5883598/
<davecheney> this is what I see
<bigjools> davecheney: pull again
<bigjools> they're fixed
<davecheney> ok
<davecheney> bigjools: confirmed. other failures have ceased, i'm down to the segv now
<bigjools> roger
<rogpeppe> all reinstalled, hopefully this machine will behave better now.
<jtv> Oh, was part of the failure just an omission to mangle calling conventions into the linkage?
<davecheney>  // libcurl go bingding
<davecheney> ^ what can possibly go wrong
<bigjools> spelling
<davecheney> bigjools: please run go test github.com/andelf/go-curl
<bigjools> davecheney: /o\
<bigjools> sigh
<davecheney> === RUN TestCallbackFunction
<davecheney> unexpected fault address 0xfffff8250c8b4864
<davecheney> well, there's your problem
<bigjools> davecheney: given the reluctance on https://code.google.com/p/go/issues/detail?id=5742 to implement renegs, I am at the point of grabbing chunks of what hair I have left
<davecheney> bigjools: understood
<davecheney> i know your on that thread with gavin
<bigjools> yeah
<davecheney> please save your pate
<bigjools> believe me I want to - it's easier to get sunburn here
<bigjools> davecheney: how do we work out where go-curl is breaking? the trace is useless
<bigjools> debug build of Go?
<davecheney> bigjools: psychic debugging
<davecheney> i started with a bias against go-curl
<davecheney> ran their tests
 * bigjools ohmms
<davecheney> saw there was the same failure
<davecheney> ran go test -v
<davecheney> saw it was related to callbacks from C to Go
<davecheney> then I went to get another beer
<bigjools> lol
<bigjools> I gather there's some game of rugby league on tonight
<davecheney> translation: i was equal parts pessimistic and lucky
<davecheney> bigjools: https://github.com/andelf/go-curl/issues/15
<bigjools> davecheney: I'm positive there's a solution hidden behind the, errr, whatever language that is
<bigjools> google translate FTW
<davecheney> what is mandarin for 'eat a dick'
<davecheney> "Suspected cgo callback go when blocked go running. Consider other ways to make trouble. . ."
<bigjools> 吃鸡巴
<davecheney> http://www.toplessrobot.com/2010/11/fireflys_15_best_uses_of_chinese_profanity.php
<davecheney> my favorite is number 7 followed by a number 3
<bigjools> this is kind of a blocker then.  Which pot of piss shall we consume?
<davecheney> that is a number 6
<davecheney> unrelated, https://twitter.com/davecheney/status/357443014064480259
<bigjools> trololol
<davecheney> i think you mean
<davecheney> trolololo, boom, splash, *debris noises*
<rogpeppe> sigh
<davecheney> rogpeppe: still not awesome ?
<rogpeppe> davecheney: some things are now working again
<rogpeppe> davecheney: others are broken
<rogpeppe> davecheney: like currently i can't raise the side menu
<rogpeppe> davecheney: oh, now i can
<rogpeppe> davecheney: and acme is now interpreting a left button click as a right  button click, so i can't select anything
<davecheney> bigjools: calling in favors now
<bigjools> <blink>
<TheMue> rogpeppe: that really sounds evil
<TheMue> rogpeppe: done a fresh install?
<rogpeppe> TheMue: yes, from total scratch
<TheMue> rogpeppe: then it's even worse
<davecheney> bigjools: ok, sitrep
<davecheney> the package is broken because it depends on an exact representation of a function pointer in Go 1.0
<davecheney> that was changed in Go 1.1
<bigjools> ahahahahahaha
<davecheney> http://golang.org/doc/go1.1#method_values
<davecheney> the last line of that section may need some revision
<jam> davecheney: probably no "Go" code is affected, but "cgo" code is.
<davecheney> jam: semantics semantics
<mgz> jam: the bot seems unhappy? I can't ssh to it.
<jam> mgz: there are 1000s of files in /tmp/mongodb*.sock
<jam> I'm trying to "ls" all of them
<jam> and it is bringing the machine to a halt
<mgz> joy
<davecheney> jam: that is standard behavior for our test suite :)
<davecheney> tmpwatch ftw
<jam> mgz: I sent a mail to the list, but can you check your own machine?
<jam> I have 535 of them
<davecheney> i have /tmp on tmpfs
<mgz> I have a few hundred
<davecheney> i know it's time to clean them out when my machine starts to page
<mgz> my machine is pretty disposable though, so tmp doesn't last that long
<bigjools> unreadable variable names in Go doc examples obfuscate things
<jtv> On an unrelated note, is Tarmac stuck again?
<jam> jtv: it is "running" but I'm doing stuff like ls too-many-files*
<jtv> Gulp
<jam> jtv: it is currently trying to merge your patch
<jtv> Good to know, thanks.
<jam> jtv: but it was stuck trying to run mine
<jam> and timing out
<jam> because of aforementioned "too-many-files*"
<jam> jtv: apparently the test suite leaks /tmp/mongodb-*.sock files
<jtv> Just saw the email...  New behaviour?
<jam> my guess being we start and stop mongo a lot and it may not clean up the sock file
<jam> jtv: from what davecheney said, no
<jam> just bot has been running long enough to be a $SERIOUS problem
<jam> bot has no swap file, so it can't page anything :)
<davecheney> sadbot
<jtv> Maybe the test suite should run with a TMPDIR that gets cleaned up from out-of-process afterwards?
<davecheney> jtv: it does, c.Mkdir does that
<davecheney> obviously whatever is creating the mgo files isn't using it
<jam> davecheney: I don't think it sets TMPDIR for mongo
<jam> right
<jtv> davecheney: c.Mkdir makes a directory *in* TMPDIR, but does it set TMPDIR!?
<jtv> It would surprise me.
<jam> jtv: It doesn't set TMPDIR, but IME most tests are good about using c.Mkdir for the actual testing
<jtv> Most.  :)
<jtv> My point is that the thing that runs the test suite should try to insulate itself a bit from the test suite, rather than rely on the tests' good behaviour.
<jtv> Because this kind of thing will just happen from time to time.
<jam> jtv: interesting thought. The question there is: TMP, TMPDIR, TEMP, .... ?
<jtv> The beauty of standards...
<jtv> I thought TMPDIR was more or less the Unixy standard, but I could well be wrong.
<jtv> Another option might be chroot.  Not as any kind of security measure, but to facilitate cleanup.
<jam> jtv: I'm going to have to kill your process so that I can cleanup the machine. forgive me and please re-approve in a while
<jtv> No worries.  I hope it helps!
<jtv> No, Rietveld, this is *not* what I was asking you to do.
<jtv> Why does Rietveld error out so often?
<jtv> Invalid XSRF token.
<bigjools> davecheney: feck. So 1.0.2 doesn't work with ec2, 1.1 doesn't work with Azure.  Fucking score.
<davecheney> bigjools: yup, we hit the jackpot there
<hazmat> how much work is implementing the tls renegotiation?
<jtv> Any other reviewers for a cleanup job that'll eliminate some duplicate code across providers?  https://codereview.appspot.com/11409043/
<davecheney> hazmat: step back
<davecheney> is it known that TLS renegotiation is what we require ?
<hazmat> yes
<davecheney> [citation needed]
<jtv> It seems likely, but we're not actually certain.
<jtv> That's my assessment.  Anyone disagrees, I would be very very happy to hear the rest of the story.  :)
 * hazmat digs around for a ref
<jam> ok, we had *only* 8000 mongodb sock files
<jtv> What a relief!
<mgz> kittens
<davecheney> jam: i have some spare if you need 'em
<hazmat> jtv, davecheney most of the refs seem to refer to our own usage, or the renegotiation attack (for which recommends are disable renegotation or augment with extension from rfc 5746)
 * jam heads to put gas in the car, I should be back in time for standup
<jtv> Yes... we figured that was one of the reasons why the Google folks thought renegotiation was too messy to implement.  But the question was really: is renegotiation really what's needed to solve this problem?  Or is the renegotiation only happening to work around an avoidable condition?
<mgz> gaspoweredcar
<rogpeppe> weird, X is sending mouse events with Mod1Mask set.
<TheMue> jtv: you've got a review
<TheMue> jtv: a real good cleanup
<jtv> Dankeschön
<TheMue> jtv: Bitteschön
<jtv> Ah yes, forgot about the underscore...
<TheMue> :D
<dimitern> jam: btw i noticed that in addition to mongodb-*.sock files (1751 before i removed them) my /tmp contained also quite a few test-mgo* subdirs
<wallyworld__> dimitern: mgz: i will miss the meeting. HUGE game of football on right now
<dimitern> wallyworld__: enjoy ;)
<wallyworld__> i am :-)
<wallyworld__> we are winning but only just
<hazmat> if i had to get a working version of juju-core for a demo with openstack today. what's the recommend? go 1.1 rel branch + juju + trunk?
<mgz> hazmat: what's in saucy should be fine
<hazmat> i'm on precise for this
<mgz> then you can download the tarball and build with the go backport, also the devel ppa is probably okay
<hazmat> mgz, thanks.. i'll try tarball with a dl of go1.1.1
<mgz> ppa:juju/golang should be fine for you
<hazmat> hmm.. hasn't juju init been around for a while.. a build of the tarball says its not a valid command  http://pastebin.ubuntu.com/5883893/
<mgz> hazmat: what does `juju version` say?
<hazmat> doh.. pyjuju
<hazmat> doing  pyjuju for maas -> openstack and then juju-core for openstack workload..
<hazmat> mgz, is there a way to get the openstack provider to ignore invalid certs? http://paste.ubuntu.com/5883925/
<mgz> ho ho ho
<mgz> I think that's another thing we didn't port over from pyjuju
 * hazmat files a bug
<mgz> rogpeppe: ^have a look in a sec
<hazmat> its bug 1202163 for ref
<_mup_> Bug #1202163: openstack provider should have config option to ignore invalid certs <juju-core:New> <https://launchpad.net/bugs/1202163>
<rogpeppe> hazmat: you can configure the http stack to always ignore invalid certs, or ignore them for just a single request
<rogpeppe> hazmat: for the former, you can set http.DefaultClient.Transport to &http.Transport{TLSClientConfig: tls.Config{InsecureSkipVerify: true, (maybe more here)}}
<rogpeppe> hazmat: for the latter, you can create a new http.Client and call Do (or Get or whatever) on that
 * rogpeppe goes for lunch
<hazmat> rogpeppe, thanks
<jam> jtv: your maas loggo patch has landed.
<jtv> Yup, just saw it thanks.
<jtv> Does that mean I can land the next one too?
<jam> jtv: there is currently a fair backlog of things to land, but you should be able to mark it approved and the bot will get to it
<jtv> Thanks.
<jam> to whom it may concern, we should add to the release notes that we can upgrade a 1.10 deployment to 1.11.3
<jam> But I couldn't find the 1.11.3 release notes in my quick search
<mgz> jam: dave linked them in his message in the "local provider reviews" thread
<gary_poster> evilnickveitch, arosales, jcastro this is the proposed replacement for the hooks tab.  AOK?  http://ubuntuone.com/1jW3lOgQLDNRup1bOX5xWf
<jcastro> gary_poster: so "source" instead of "hooks"?
<gary_poster> jcastro, yes per evilnickveitch bug 1201840
<_mup_> Bug #1201840: hooks section lists more than just hooks <charmbrowser> <juju-gui:Triaged by lucapaulina> <https://launchpad.net/bugs/1201840>
<evilnickveitch> gary_poster, cool, where does that link go to?
<evilnickveitch> (the Contribute to this charm one)
<gary_poster> evilnickveitch, https://juju.ubuntu.com/docs/authors-charm-store.html
<gary_poster> evilnickveitch, AOK?  need to run
<evilnickveitch> gary_poster, yeah, that's fine
<gary_poster> cool, will land after lunch
<gary_poster> thanks
<arosales> gary_poster, taking a look
<arosales> gary_poster, my only fear is that it is "different" than what we have from the existing charms
<marcoceppi> gary_poster: what about just "Files"
<jcastro> I would just call it "Code"
<jcastro> charms are code, so just call it code
<arosales> gary_poster, talking with eco team
<arosales> gary_poster, seems it is ok to rename that tab to something else, consensus is looking like "code" but we wouldn't be heart broken if you kept source
<jcastro> gary_poster: what do you think about having the Juju home button go to the browse mode and not the canvas?
<jcastro> https://bugs.launchpad.net/juju-gui/+bug/1202306
<_mup_> Bug #1202306: We need an "all" category <juju-gui:New> <https://launchpad.net/bugs/1202306>
<rogpeppe> rvba: which revno of gwacl should we be using currently?
<rogpeppe> rvba: sorry, ignore me, i hadn't pulled
<rogpeppe> done for the day. see y'all tomorrow.
<marcoceppi> Hey guys, compiling from source, but I already compiled juju-core a while ago. So I ran `go get -u launchpad.net/juju-core/...` which exited 0, ran go install -v launchpad.net/juju-core/... no output exit 0 but my GOPATH/bin/juju has the old version still
<marcoceppi> anything I'm missing?
<ahasenack> marcoceppi: try deleting it
<marcoceppi> ahasenack: cool, I was just doing that actually
<marcoceppi> ahasenack: http://paste.ubuntu.com/5885337/ libcurl3 is installed on the system, ran apt-get dep-build for juju-core as well same response
<ahasenack> marcoceppi: hm, no clue, I run go get -v -u everyday
<ahasenack> and ran it just now
<ahasenack> marcoceppi: trash $GOPATH maybe
<marcoceppi> ahasenack: I did, found the issue. libcurl4-gnutls-dev was needed
<ahasenack> ok
<marcoceppi> ahasenack: thanks!
<ahasenack> marcoceppi: have you tried lxc? I just get "connection refused" when I bootstrap, but telnet to that host/port shows no problems
<ahasenack> it's the mongo port
<marcoceppi> ahasenack: I just got it compiled, about to give it a go
<ahasenack> I get
<ahasenack> 2013-07-17 19:36:23 ERROR juju open.go:89 state: connection failed, will retry: dial tcp 127.0.0.1:37017: connection refused
<ahasenack> and tcpdump indeed shows a syn and a rst
<ahasenack> but when I telnet to the same address and port, I get a connection established
<ahasenack> so no clue what is going on
<marcoceppi> odd
<ahasenack> I also don't know which mongo to use, I grabbed the raring one
<ahasenack> 2.2.4, seems to be listening in ssl mode at least
<ahasenack> marcoceppi: it also complained there was no secret in the env, the juju init template for local: needs admin-secret it seems
<marcoceppi> ahasenack: I just copied the one from cheney's email
<ahasenack> marcoceppi: it's the same as juju init, let's see if it will complain in your case
<marcoceppi> ahasenack: ah, yeah it complains
<marcoceppi> makes sense since that's needed for the api
<ahasenack> makes you wonder if they really tried it ;)
<ahasenack> then you need sudo
<ahasenack> and install mongodb
<ahasenack> and that's where I'm stuck now, bootstrap fails to connect to mongo, even though it's running
<marcoceppi> ahasenack: well, it's a first cut, I'm not expecting perfection. I'm here to report bugs!
<ahasenack> sure
<marcoceppi> technically you had to use "root" to bootstrap the last local provider
<marcoceppi> it just prompted for sudo password during bootstrap
<marcoceppi> ahasenack: seems to have worked for me, waiting for status to return
<ahasenack> marcoceppi: what did you get for bootstrap -v?
<marcoceppi> ahasenack: http://paste.ubuntu.com/5885365/
<marcoceppi> ahasenack: I didn't -v that command :(
<ahasenack> marcoceppi: ok, it was probably full of connection refused too
<ahasenack> because it exits silently
<marcoceppi> I'll destroy and try again in a second
<ahasenack> then status shows what you pasted for me too
<marcoceppi> well I'm getting connection established
<marcoceppi> finally errors with 2013-07-17 19:50:38 ERROR juju supercommand.go:235 command failed: cannot log in to admin database: auth fails
<marcoceppi> error: cannot log in to admin database: auth fails
<marcoceppi> mongodb is 1:2.2.4-0ubuntu1
<ahasenack> so, the mongodb package starts mongo
<ahasenack> juju creates another upstart job
<ahasenack>  /etc/init/juju-db-<something>.conf
<ahasenack> I think it starts mongo on another port, so it doesn't conflict
<ahasenack> that job starts a mongo on port 37017
<ahasenack> with ssl
<ahasenack> marcoceppi: did you try bootstrap -v?
<marcoceppi> ahasenack: bootstrapping now
<marcoceppi> ahasenack: yeah, getting conn fail
<ahasenack> and my juju status also fails with a ton of "connection established"
<ahasenack> like several per second
 * marcoceppi changes port
<marcoceppi> nvm
<ahasenack> I stopped the main mongodb service
<ahasenack> so just the one started by juju is in play
<ahasenack> even if I let it run, it's on a different port
<ahasenack> tcp        0      0 0.0.0.0:38017           0.0.0.0:*               LISTEN      10851/mongod
<ahasenack> and
<ahasenack> hm, wait
<ahasenack> it didn't start
<ahasenack> that's the one from juju
<marcoceppi> ahasenack: well we're in the same boat now
<ahasenack> so when juju's mongo is running, the packaged mongodb won't start, I don't know why
<ahasenack> not sure if it's even relevant
<marcoceppi> ahasenack: it looks like juju mongo runs both 38017 and 37017
<ahasenack> marcoceppi: maybe one is the web admin port
<ahasenack> I'm trying ssl on it, but chrome is being stupid
<ahasenack> "You attempted to reach localhost, but the server presented an invalid certificate."
<ahasenack> and doesn't let me continue
<ahasenack> ok, firefox works
<ahasenack> so 38017 is some sort of web status page
<ahasenack> like apache's /server-status
<marcoceppi> ack
<ahasenack> 37017 is the client port
<ahasenack> what clients should use
<ahasenack> it shows some logs, that's good
<ahasenack>            17:02:12 [conn26]  authenticate db: admin { authenticate: 1, nonce: "dd1208d655bae7f3", user: "admin", key: "770f106419a82fc861b9d03103b0cda6" }
<ahasenack>            17:02:12 [conn26] auth: couldn't find user admin, admin.system.users
<ahasenack> so something needs to create that user somehow
<ahasenack> maybe that's the bootstrap step that failed
<ahasenack> (this was as a result of juju status)
<ahasenack> I think it's a timing issue
<ahasenack> it starts juju-db-andreas-local (upstart job)
<ahasenack> and I kept trying that port in another terminal, also getting connection refused
<ahasenack> after a while, it worked
<marcoceppi> ahasenack: yeah, so I'm running Umongo against it, it seems like core stops trying to connect too soon
<ahasenack> agreed
<marcoceppi> about 45 seconds after bootstrap stops trying to connect, it works
<ahasenack> it's an aggressive retry even
<marcoceppi> this is a little disappointing
<ahasenack> need to stash a sleep in there somewhere, or increase the number of retries
<ahasenack> i counted 38
<ahasenack> 28
<marcoceppi> I'm going to try to piss it off, see if I can get past the retries
<ahasenack> marcoceppi: worked here, I increased the timeout 10x
<ahasenack> marcoceppi: https://pastebin.canonical.com/94586/
<ahasenack> juju status works now
<marcoceppi> ahasenack: brilliant
<marcoceppi> I need to figure out how to patch that in my install
<ahasenack> I ran go install launchpad.net/juju-core/... after that, it rebuilt the binary
<marcoceppi> ahh, just patch it in the src
<marcoceppi> cool
<ahasenack> right
<marcoceppi> ahasenack: I ended up putting it to 60, a minute seems like a fair amount of time
<marcoceppi> ahasenack: have you opened a bug yet?
<ahasenack> no
<ahasenack> so far it's that (too short a timeout) and the admin-secret missing from the template
<ahasenack> doing a deploy now, I see wget action in the process list
<ahasenack> container running
<marcoceppi> wow it's fast as hell
<marcoceppi> It's a shame you still have to download the cloud image on first run
<ahasenack> how big is that?
<marcoceppi> I thought there was going to be a way to seed the cloud images before deployment
<marcoceppi> ~200MB
<marcoceppi> it's used to build the template, so once it's downloaded (as it is during the first deploy command) all future deploys are fast because it's cached
<ahasenack> that's good enough
<marcoceppi> in pyjuju the cache was cleared during each destroy, I wonder if it'll keep this cache around longer
<jcastro> I remember bashing out the workflow with thumper on the cloud image sync
<jcastro> and we asked specifically for a sync-like command to preload the image
<ahasenack> anyway, worked, got wordpress up (with mysql) on lxc
<ahasenack> almost
<ahasenack> got a 502 from nginx
<ahasenack> will debug later
<jcastro> marcoceppi: thumper is online in a few, I'm going to snag him on G+ and ask a buncha questions
<marcoceppi> jcastro: excellent
<marcoceppi> ahasenack: got it running over here
 * marcoceppi shrugs
<marcoceppi> jcastro: dude, local provider is awesome already. But I too have a few qs for thump
<jcastro> indeed
<jcastro> hey do we still need sudo? maybe we can answer that dude's question
<marcoceppi> jcastro: you need sudo for local juju-core provider
<marcoceppi> to bootstrap and destroy
<marcoceppi> jcastro: but I don't think his question is about using juju local provider
<thumper> morning
<marcoceppi> thumper: \o/
<thumper> hi marcoceppi
<marcoceppi> thumper: the other thing that ahasenack found while using local provider is admin-secret is not put in the local provider init template. Would you like me to open a bug for that as well?
<thumper> marcoceppi: yes please
<thumper> I'll look to address ASAP
<marcoceppi> thumper: ack
<ahasenack> thumper: congrats thumper, consider me impressed
<thumper> ahasenack: uh... ok
<thumper> just doing my job :)
<ahasenack> :)
<hazmat> thumper, local provider is pretty nice
<thumper> hazmat: thanks
 * thumper is working on surviving reboot
<thumper> auto-restart containers
<sidnei> thumper: there was a question from marcoceppi earlier about why go get -u + go install doesn't update the built version. i have the same question. :)
<sidnei> i still have 1.11.1-saucy-amd64 after doing that
<thumper> sidnei: I'm not entirely sure I understand your question
<jcastro> thumper: I like that you're swamped with people asking about local. :D
<thumper> perhaps because GOBIN isn't in your path?
<jcastro> this is a pretty awesome milestone dude
<thumper> jcastro: shouldn't you be not working/
<thumper> ?
<sidnei> thumper: did you mean GOPATH?
<jcastro> I'm back
<jcastro> gotta play with local. :D
<thumper> GOPATH/bin even
<thumper> there are several different go env vars
<sidnei> it's there yeah
<sidnei> $ which juju
<sidnei> /home/sidnei/src/go/bin/juju
<sidnei> the timestamp is fine even
<sidnei> $ ls -la /home/sidnei/src/go/bin/juju
<sidnei> -rwxr-xr-x 1 sidnei sidnei 13907104 Jul 17 17:05 /home/sidnei/src/go/bin/juju
<sidnei> so i guess it got updated after all, just reports an odd version?
<sidnei> uhm, no, it just got rebuilt but still no local
<thumper> sidnei: haha
<thumper> I know what it is
<thumper> sudo ~/go/bin/juju bootstrap --debug
<thumper> your GOPATH isn't in root's path
<thumper> this goes away when we have it packaged
<sidnei> uhm, i don't think that's it
<sidnei> juju init | grep local doesn't show anything
<sidnei> so it smells like it wasn't updated after all
<sidnei> source is at the right revno
<sidnei> sidnei@sidnei-laptop:~/src/go/src/launchpad.net/juju-core$ bzr revno
<sidnei> 1279
<sidnei> or?
<thumper> that isn't tip
<thumper> not even close
<thumper> bzr info?
<sidnei>   parent branch: http://bazaar.launchpad.net/~juju/juju-core/trunk/
<sidnei> wtf
<sidnei> seems like the branch changed location
<thumper> yeah
<thumper> sidnei: bzr pull lp:juju-core --remember
<thumper> that'll use bzr+ssh as well
<sidnei> fun that go get -u didn't think of looking at the new location
<sidnei> i guess it just did a pull locally
<sidnei> thanks thumper! eager to try local :)
 * sidnei dinners
 * marcoceppi continues to throw papercut fixes
<davecheney> http://msdn.microsoft.com/en-us/library/windowsazure/gg551722.aspx
<davecheney> how the bloody hell am I supposed to do this ?!?
 * davecheney throws table
<davecheney> bigjools: jtv how do you create an x509 cert for azure with openssl, is it possible ?
<davecheney> can you get ms to create one for you?
<sidnei> thumper: 2013-07-17 23:45:58 ERROR juju runner.go:200 worker: fatal "machiner": unauthorized
<sidnei>  am i missing something?
<davecheney> jtv: bigjools oh hey, it's in the README, thanks heaps!
<davecheney> sidnei: nope, sounds like a bug
<sidnei> davecheney: this is with the local provisioner fwiw, i saw a mention of missing settings in the template juju init spits out
<davecheney> shit you guys are keen
<davecheney> that code only landed yesterday
<sidnei> couldn't help it, im subscribed to commits :)
<sidnei> it keeps looping around with: 2013-07-17 23:45:58 ERROR juju runner.go:200 worker: fatal "machiner": unauthorized
<sidnei> oops
<sidnei> 2013-07-17 23:54:14 INFO juju runner.go:246 worker: restarting "state" in 3s
<sidnei> looks like it might fill the disk if i leave it :)
<thumper> hmm...
 * thumper poks
<thumper> pokes
<thumper> sidnei: where are you seeing that?
<thumper> sidnei: mine seems to be working fine
<sidnei> thumper: locally here, in ~/.juju/local/log/machine-0.log
<sidnei> i ran: sudo /home/sidnei/src/go/bin/juju bootstrap -v -e local
<thumper> hmm, I don't get that
<thumper> sidnei: missing the secret?
<thumper> I have to fix that
#juju-dev 2013-07-18
<thumper> admin-secret is needed in environments.yaml
<sidnei> thumper: nope, bootstrap complained that the secret was missing, didn't finish, so i had to add it
<sidnei> the problem seems to be the  "fatal "machiner": unauthorized"
 * thumper takes the dog out
<ahasenack> sidnei: I got it working here, had to make two changes
<ahasenack> sidnei: haven't seen that fatal error
<ahasenack> I have revision 1481
<ahasenack> sidnei: is that during bootstrap?
<sidnei> ahasenack: nope bootstrap finishes successfully, juju status returns pending for the bootstrap node, tailing ~/.juju/local/log/machine-0.log i see a retry every 3s with that unauthorized
<ahasenack> sidnei: did you bootstrap with -v?
<sidnei> ahasenack: y
<ahasenack> sidnei: so you saw a bunch of "connection refused", and then eventually it worked?
<sidnei> ahasenack: only 2
<ahasenack> really? I see dozens
<ahasenack> it retries several times per second
<sidnei> ahasenack: https://pastebin.canonical.com/94599/
<sidnei> ahasenack: get an ssd? :)
<ahasenack> impressive
<ahasenack> sidnei: so juju status should work right after that, and that's where you have the problem then
<sidnei> yup, status says it's pending
<sidnei> https://pastebin.canonical.com/94600/
<ahasenack> sidnei: what about mongodb, you installed the one from raring?
<sidnei> ahasenack: im running raring
<sidnei> https://pastebin.canonical.com/94601/
<ahasenack> sidnei: right, but you have the raring package?
<ahasenack> sidnei: I'm not onto anything, I'm just comparing notes
<sidnei> ahasenack: sorry, duh. im on saucy not raring
<ahasenack> oh, ok
<sidnei> $ apt-cache policy mongodb
<sidnei> mongodb:
<sidnei>   Installed: (none)
<sidnei>   Candidate: 1:2.4.3-1ubuntu1
<ahasenack> that's different :)
<ahasenack> I have mongo 2.2.4
<sidnei> yet there's a mongod running
<ahasenack> I don't know if a new major mongo db version could cause this
<sidnei> ah, mongodb-server
<ahasenack> could be worth asking around
<sidnei> i guess davecheney would know
<ahasenack> wait, installed none?
<sidnei> Installed: 1:2.4.3-1ubuntu1
<sidnei> mongodb-server is the package name, not mongodb
<ahasenack> so that changed
<ahasenack> mongodb                              1:2.2.4-0ubuntu1        amd64
<ahasenack> it's what I have
<ahasenack> and, well, also mongodb-server
<ahasenack> mongodb is just a meta package probably
<ahasenack> maybe version 2.4 has a new acl or something
<ahasenack> Security Improvements
<ahasenack> New Modular Authentication System with Support for Kerberos
<sidnei> fun
<ahasenack> they talk about mongodb enterprise
<ahasenack> not sure if that's what we have
<ahasenack> but this sounds relevant
<ahasenack> Role Based Access Control and New Privilege Documents
<ahasenack> MongoDB 2.4 introduces a role based access control system that provides more granular privileges to MongoDB users. See User Privilege Roles in MongoDB for more information.
<ahasenack> (from http://docs.mongodb.org/manual/release-notes/2.4/ btw)
<thumper> ah that'd probably be it
<thumper> we need the ssl enabled mongo
<ahasenack> ssl is enabled
<ahasenack> even in sidnei's case, or else bootstrap wouldn't have finished/worked
<ahasenack> it clearly didn't work fully, but at the bootstrap stage it didn't get any errors
<sidnei> indeed
<davecheney> ahasenack: please for the love of $DEITY, don't open the 'lets use a new version of mongodb' can of worms
<ahasenack> davecheney: why would I do that? :)
<ahasenack> sidnei is the one using a newer version :)
<ahasenack> but looks like you have until october to get it working with mongo 2.4 ;)
<ahasenack> since it's in saucy
<sidnei> ah, here's some info in syslog: https://pastebin.canonical.com/94602/
<sidnei> davecheney: indeed, saucy has 2.4.3
<ahasenack> sidnei: that's javascript, right?
<sidnei> no idea. reminds me of couchdb though *wink*
<ahasenack> sidnei:
<ahasenack> "With authentication enabled, eval will fail during the operation if you do not have the permission to perform a specified task.
<ahasenack> Changed in version 2.4: You must have full admin access to run."
<ahasenack> from http://docs.mongodb.org/manual/reference/command/eval/
<ahasenack> is that just to get the date from mongo?
<davecheney> sidnei: yup, and P, Q, and R have something different
<sidnei> oh, lovely
<ahasenack> davecheney: so does go
<davecheney> ahasenack: yup, lets not light two fires at once
<sidnei> hopefully it'll be a trivial fix. not looking forward to downgrading :)
<ahasenack> sidnei: I was just poking around, maybe this could be used instead of javascript and eval: http://docs.mongodb.org/manual/reference/method/Date/
<ahasenack> sidnei: but I don't know how to login on mongo yet, I tried "mongo" with username and key I found in /var/log/syslog, but didn't work
<ahasenack> i wanted to see what that output looks like, then hack it in state/presence/presence.go in the clockDelta() thing, which is what calls the eval
<ahasenack> ok, gotta go
<ahasenack> cya
<hazmat> thumper, future optimization for local provider.. serge & smoser  were discussing being able to use lxc-clone with the cloud-image, ie re run cloudinit post clone, it will make lxc provisioning significantly faster, just a copy or fs snapshot (lvm/btrfs).
<thumper> hazmat: let's get it working simply first, no?
<hazmat> thumper, hence the future label. it seems to be working pretty well so far.
<hazmat> although i'd note bootstrap via sudo -E for folks using JUJU_HOME/JUJU_ENV
<thumper> wut?
<thumper> hmm..
<thumper> oh yeah
<hazmat> sudo strips env vars without -E
<thumper> HOME is passed through
<thumper> but others aren't
<thumper> hmm... I wonder if sudo -E would pick up the right juju for me
<thumper> that would make things a little easier
<hazmat> thumper, one other random.. --force-machine to 0 on deploy.. might do unexpected things, i got a series error just due to mismatch there, but wasn't sure if there was a sanity check against workloads on 0 with local if series did match
<thumper> hazmat: machine 0 doesn't support units
<thumper> hazmat: so would get a different error later
<hazmat> cool
<thumper> wallyworld__: ping
<wallyworld__> hi
<thumper> wallyworld__, davecheney: ?  https://codereview.appspot.com/11455044
<thumper> wallyworld__: hangout?
<wallyworld__> sure
<thumper> https://plus.google.com/hangouts/_/36d42d48a7f8998aa3789caa0e0e93bead3afb24?hl=en
<thumper> wallyworld__, davecheney: also two others pending reviews for minor things
<thumper> davecheney: would love some +1s :)
<davecheney> thumper: looking
<thumper> ta
<thumper> davecheney: one is to add the admin-secret
<thumper> davecheney: the other is a small pipeline, one to add gc and jc prefixes instead of importing into local namespace
<thumper> followed by a branch to make containers auto restart
<davecheney> thumper: why the \n in the maas provider ?
<thumper> davecheney: because I noticed that when I went juju init
<thumper> there wasn't a gap between maas and local
<thumper> and all the others did
<thumper> davecheney: perhaps we should change the default provider in the init to be local?
<thumper> although I'd like it to mature a bit more
<thumper> get some more real world testing first
<thumper> heh, didn't notice that marco did this too
<davecheney> thumper: sounds like the tests don't care
<thumper> davecheney: https://codereview.appspot.com/11491043/ maybe?
<davecheney> thumper: LGTM
<thumper> davecheney: ta
<sidnei> davecheney: regardless of the upgrade to 2.4, which may or not be a side effect, why is an eval used to get the server time? seems like there's some ill side-effects of that: http://docs.mongodb.org/manual/reference/command/eval/ "By default, eval takes a global write lock before evaluating the JavaScript function. As a result, eval blocks all other read and write operations to the database while the eval operation runs."
<sidnei> and then there's "Changed in version 2.4: You must have full admin access to run." (re: eval)
<sidnei> so maybe that's what broke indeed
<davecheney> sidnei: not sure
<davecheney> i don't know that part very well
<davecheney> is it part of our transaction magic ?
<sidnei> davecheney: nope, it's in clockDelta() in presence
<davecheney> sidnei: that is both facepalm and totally correct
<davecheney> you can't rely on the time inside a vm being correct
<davecheney> shit
<davecheney> you can't rely on real hardware keeping good time
<davecheney> so the only choice is to use the time on the db server
<davecheney> at least it may be consistent with itself
<sidnei> yeah, the question is if there's a way to get the time without eval. seems that from 2.2 on you could get isMaster().localTime
<sidnei> and there's also serverStatus.localTime
 * davecheney waves hands
<davecheney> if it ain't broken, something something, profit !
<sidnei> eh
<davecheney> ok, battery is flat
<davecheney> time for lunch
<sidnei> alright, commented out clockDelta body, hardcoded a return 0, nil. status is now started, and deploy kicked off
 * sidnei files bug
<sidnei> https://bugs.launchpad.net/juju-core/+bug/1202480
<_mup_> Bug #1202480: raring: bootstrap local environment works, machiner gets unauthorized <juju-core:New> <https://launchpad.net/bugs/1202480>
<sidnei> https://bugs.launchpad.net/juju-core/+bug/1202481 is a funny one
<_mup_> Bug #1202481: oddly formatted directory name in deployer <juju-core:New> <https://launchpad.net/bugs/1202481>
 * thumper-afk back for meetings (lotsa meetings)
<thumper-afk> in about 4 hours
<sidnei> fwereade_: if you happen to be around, https://codereview.appspot.com/11501043
 * sidnei eods
<davecheney> can everyone please familiarise themselves with the agenda for tonight's call
<davecheney> https://docs.google.com/a/canonical.com/document/d/1eeHzbtyt_4dlKQMof-vRfplMWMrClBx32k6BFI-77MI/edit#
<rogpeppe> davecheney: doing so
<dimitern> davecheney: i don't see go 1.0.3 mentioned on that list - i'm using it and everything builds ok and ec2 works
<dimitern> not sure about azure though
<davecheney> dimitern: good point
<davecheney> it's not available in PPA
<davecheney> but as we're grasping at straws
<dimitern> yeah
<davecheney> dimitern: added
<dimitern> cheers
<thumper> good morning to you europeans
<thumper> davecheney: do the go-curl developers know that it is broken under go 1.1?
<davecheney> thumper: yes, see agenda
 * thumper looks for mgz
<TheMue> thumper: ping
<thumper> TheMue: hey
<thumper> TheMue: so... all my changes around tools are in trunk now
<TheMue> thumper: sent you a mail regarding the autosync stuff
<thumper> yeah, I've not looked at the doc sorry
<TheMue> thumper: yep, already analyzing them
<TheMue> thumper: np
<TheMue> thumper: it's just the place where i note my findings
<TheMue> thumper: so if you have anything to add feel free
<thumper> ok...
<TheMue> thumper: thx
<thumper> TheMue: AFAIK, the major point is to make it easy for prod-stack
<thumper> if we do that, we fix a lot of issues
<TheMue> thumper: we've got a first solution with sync-tools --source /my/path/to/tools
<TheMue> thumper: but that's not enough and convenient
<thumper> TheMue: but anyone can do upload tools now
<thumper> TheMue: if it grabs the binary
<thumper> no build env needed any more
<TheMue> thumper: exactly
<thumper> TheMue: here is an interesting idea
<thumper> ...
 * TheMue listens
 * thumper thinks
<TheMue> ;)
<thumper> hmm...
<thumper> not as easy
<thumper> the problem isn't the cli
<thumper> but the bootstrap node
<thumper> a setting maybe?
<thumper> in the config
<TheMue> thought about that too
<thumper> to say whether we should auto upload on bootstrap
<TheMue> making it explicit helps, otherwise the detection of the need is not simple
<TheMue> that could help
<TheMue> it's semi-automatic
<TheMue> beside the --source for sync-tools I already thought about a --source for bootstrap
<thumper> I think an explicit setting saying "this cloud can't see remote tools", plz upload
<thumper> don't guess
<thumper> then it is a one off config item
<thumper> we would need the sync still
<thumper> to do an upgrade, no?
<thumper> or does upgrade do an --upload-tools too?
<thumper> or probably should
 * thumper goes to have a cuppa tea and watch arrow
<thumper> back to chat to mgz later
<TheMue> upgrade afaik only on demand
<TheMue> thumper-afk: thx for sharing your thoughts
<noodles775> hazmat: Hi! I'm trying to test a deployment, and it's hanging here: http://bazaar.launchpad.net/~hazmat/juju-deployer/refactor/view/head:/deployer.py#L1142
<noodles775> hazmat: afaics, that loop never ends (my deployment times out in that loop), and it's checking for a non-existent key (the unit has 'agent-state', but not 'state').
<noodles775> (well, only ends via timeout - anyway, I'm hacking around it atm, but wasn't sure if it was a WIP - in which case, sidnei, I'm guessing we shouldn't be using it yet).
<rogpeppe> anyone here used juju with azure?
<rogpeppe> i was just trying to set it up and it's not obvious to me where to find the various pieces of management info
<rogpeppe> rvba: ^
<rvba> rogpeppe: sure, I can help you with that, just one sec…
<rogpeppe> rvba: actually, i've just found some of the info by scrolling down and finding "Settings"
<rvba> rogpeppe: I've written a small tutorial… let me forward it to you.
<rogpeppe> rvba: ta
<dimitern> fwereade_: it worked! s.BackingState.Sync() + a few modifications, I even reverted the in := w.in stuff and the flip-flopping to make sure with the fix the test pass and without it doesn't, i.e. the events are not lost
<fwereade_> dimitern, sweet!
<mgz> looks to me that azure-in-all.go did in fact land btw, so we will want to revert that for release
<rogpeppe> rvba: where do you get the azure command from?
<rogpeppe> rvba: or... perhaps i don't need to run it.
<rvba> rogpeppe: you don't need to run it indeed.
<dimitern> fwereade_: https://codereview.appspot.com/11506043 - others as well? this fixes the StringsWatcher
<rogpeppe> rvba: where do i get the cert for the management-certificate-path entry in environments.yaml?
<rvba> rogpeppe: you can just generate it with: http://paste.ubuntu.com/5887043/
<rvba> Then upload the public part to  Azure.
<rogpeppe> rvba: hmm, might be nice if that was in the instructions
<rvba> Settings > Management certificates
<rogpeppe> rvba: the public part being the .cer file, presumably?
<rvba> rogpeppe: yes
<rvba> rogpeppe: that's right… these instructions I've sent you are a draft really, they were meant for our team which has already set up an Azure account with a certificate.  We need to polish that up for general consumption.
<rogpeppe> rvba: yeah, i've been making some notes about places that tripped me up which i'll pass on to you
<rvba> ta
<thumper> mgz: ping
<mgz> thumper: hey, sorry I missed you earlier, forgot to get into irc after rebooting back to ubuntu
<thumper> mgz: yeah, got time to chat?
<thumper> hangout?
<mgz> er...
<thumper> or just irc?
<mgz> that would mean rebooting again :)
<thumper> you can't do hangouts in ubuntu?
<thumper> mgz: dude, get more computers \o/
<thumper> mgz: does mumble work in ubuntu?
<mgz> I have many, not all of which are that portable :)
<thumper> or skype?
<mgz> mumble is good
 * thumper fires up mumble
<thumper> is canonical still running the mumble server?
<mgz> I think I just need to hop...
<mgz> ^yeah, details are on the internal wiki
<thumper> it eventually connected
 * thumper has to remember what his push to talk button was
 * thumper is in the cloud engineering kitchen
<rogpeppe> rvba: what about the storage account key?
<thumper> talking thing isn't working
 * thumper pokes some more
<mgz> audio is such fun
<rvba> rogpeppe: once you've created the storage account, click on it, then click on "manage access keys" at the bottom of the screen to get the key.
<rogpeppe> rvba: ah, thanks - that menu at the bottom is good for evading the eyes :-)
<rvba> rogpeppe: yeah :)
<fwereade_> dimitern, https://codereview.appspot.com/11506043/ reviewed
<dimitern> fwereade_: thanks
<rogpeppe> rvba: do you have to manually create the environment's container?
<rvba> rogpeppe: yes
<rvba> We could automate this part but haven't done it yet.
<rogpeppe> rvba: i assumed so, but was just checking i hadn't done something wrong
<dimitern> fwereade_: I was thinking of AssertChange, but good naming is an art, and it didn't sound right
<dimitern> fwereade_: :) so I guess I can make it AssertChanges and leave the comments I added
<rogpeppe> rvba: is it possible to create the container from the portal? (i'm probably missing another menu option again...)
<rvba> rogpeppe: yes: click on the storage, then the "CONTAINERS" tab, then "Add" at the bottom.
<rogpeppe> rvba: ah, i'd missed that the storage name was a link
<fwereade_> dimitern, AssertOneChange/AssertOnlyOneChange perhaps?
<fwereade_> it's a bit of a nasty global rename
<fwereade_> dimitern, but it's probably clearer than what I suggested first
<rogpeppe> now *that's* a domain name juju-azurea44xm2tkgd51kt2l7zx3228isi0olwasqcpiccatd24pmj3lf5zj9.cloudapp.net
<rvba> rogpeppe: yeah, we'll reduce the size of the random part :)
<dimitern> fwereade_: no, I decided to go with your original suggestion - AssertChange + AssertOneChange, sgtm
<dimitern> fwereade_: I tend to avoid very long idents when there are doc comments anyway :)
<dimitern> fwereade_: updated https://codereview.appspot.com/11506043/ - rogpeppe: can you take a look as well please?
<rogpeppe> dimitern: will do
<rogpeppe> dimitern: is there a reason you can't do : for {data, ok := <-in; if !ok { break }; out <- data.(*params.StringsWatchResult).Changes} ?
<dimitern> rogpeppe: why, how is this better?
<rogpeppe> dimitern: it makes the flip logic much more obvious - why use a select?
<rogpeppe> dimitern: it's considerably shorter and simpler for one thing
<dimitern> rogpeppe: it's not more obvious for me at least
<dimitern> rogpeppe: i've never seen channels being used like that in a loop
<dimitern> rogpeppe: hmm
<rogpeppe> dimitern: how is this not obvious? http://paste.ubuntu.com/5887170/
<rogpeppe> dimitern: it's a classic way to transfer data from one channel to another
<dimitern> rogpeppe: really? didn't know
<rogpeppe> dimitern: read a value; write the value
<dimitern> rogpeppe: but you have to use w.in and w.out in the loop, right? and instead of the panic you can have return nil after the loop
<rogpeppe> dimitern: with the select, it's not obvious that only one arm is ever active at a time
<rogpeppe> dimitern: yes
<dimitern> rogpeppe: if you know the mechanics of a select block, it is (that's by definition)
<rogpeppe> dimitern: no - you have to follow the logic of when in and out can be non-nil
<rogpeppe> dimitern: BTW i think you need to select on tomb.Dying when writing to the out channel
<dimitern> rogpeppe: so for instead of select in this case i agree is probably better
<dimitern> rogpeppe: i can see it now
<rogpeppe> dimitern: well, the original uses for *and* select
<dimitern> rogpeppe: but for more complicated cases where you're not just reading from one and directly writing to another as select gives you more flexibility
<rogpeppe> dimitern: sure
<rogpeppe> dimitern: this is a nice simple case though
<dimitern> rogpeppe: except for a small thing actually
<rogpeppe> dimitern: i know what you're going to say :-)
<dimitern> rogpeppe: the initial event won't be sent with that code until there is another change
<rogpeppe> dimitern: then swap the statements
<dimitern> rogpeppe: hmm, not really
<dimitern> rogpeppe: the initial event comes as an argument to the loop, it's not read from a channel
<rogpeppe> dimitern: data.(*params.StringsWatchResult).Changes
<rogpeppe> oops
<rogpeppe> dimitern: http://paste.ubuntu.com/5887184/
<dimitern> rogpeppe: that will probably work yes
<dimitern> rogpeppe: i'll try it
<dimitern> rogpeppe: it's a pity if we don't as well change the notifywatcher's loop the same way though
<rogpeppe> dimitern: that's slightly different - it's ok if we're always ready to receive on w.in there
<dimitern> rogpeppe: there are no changes to send there, so we can always send the initial event
<rogpeppe> dimitern: yeah
<rogpeppe> dimitern: you probably want something more like this actually: http://paste.ubuntu.com/5887201/
<dimitern> rogpeppe: why wait on the tomb? the commonWatcher is already doing this
<dimitern> rogpeppe: I think it's more like http://paste.ubuntu.com/5887206/
<rogpeppe> dimitern: how else do we get out of the loop if we're trying to write to the out channel and the watcher is stopped?
<dimitern> rogpeppe: oops that was for the strings watcher
<dimitern> rogpeppe: http://paste.ubuntu.com/5887210/ that's for the notifywatcher
<dimitern> rogpeppe: aha! so that's why we need select - see :)
<rogpeppe> dimitern: yeah we need select when sending the out value but not when reading the in value
<dimitern> rogpeppe: the interesting thing is, it still works, without the select on the tomb
<dimitern> rogpeppe: just run all tests - all pass
<rogpeppe> dimitern: i bet i could make it deadlock
<dimitern> rogpeppe: so let's keep the original select blocks then please
<rogpeppe> dimitern: your original suggestion had the same problem
<rogpeppe> dimitern: (the one in the CL)
<dimitern> rogpeppe: how so?
<dimitern> rogpeppe: can you show me how to deadlock it? if the tests aren't detecting this case we need a better test
<rogpeppe> dimitern: yes
<rogpeppe> dimitern: you could deadlock it by creating a new watcher, reading the first change, making a change, waiting for the loop to receive a change, then stop the watcher without reading from the out channel
<dimitern> fwereade_: how do you feel about my last 2 pastes above?
<rogpeppe> dimitern: this doesn't have that problem: http://paste.ubuntu.com/5887201/
<fwereade_> dimitern, rogpeppe, meeting, will have to read back properly in a mo
<dimitern> rogpeppe: i'm not really convinced yet
<rogpeppe> dimitern: what your issue with it?
<rogpeppe> s/what/what's/
<dimitern> rogpeppe: we have a goroutine in commonLoop that waits on w.tomb.Dying()
<rogpeppe> dimitern: yes, and?
<dimitern> rogpeppe: and it exits the loop when it happens
<rogpeppe> dimitern: there are two loops
<rogpeppe> dimitern: if the loop in StringsWatcher.loop is blocked trying to send on out, how does it know that the goroutine in commonLoop has finished?
<dimitern> rogpeppe: hmm..
<dimitern> rogpeppe: it *can* know if it waits for it
<dimitern> rogpeppe: but i suppose waiting on the tomb is the same thing really
<rogpeppe> dimitern: yes. and your code wasn't waiting for it
<dimitern> rogpeppe: nor was john's notifywatcher
<rogpeppe> dimitern: because out != nil => in == nil
<rogpeppe> dimitern: actually john's notifywatcher was
<rogpeppe> dimitern: because it was always ready to receive on in
<dimitern> rogpeppe: hmm
<dimitern> rogpeppe: i don't see how mine is different? because i'm reading changes, rather than always sending empty structs?
<rogpeppe> dimitern: because of line 215
<dimitern> rogpeppe: so if I move that at the end of the case it'll work better?
<rogpeppe> dimitern: no
<dimitern> rogpeppe: why?
<rogpeppe> dimitern: to make it work, you'd have to read on tomb.Dying in the select loop
<dimitern> rogpeppe: but the notifywatcher is not doing that and yet you say it's correct?
<rogpeppe> dimitern: in the notify watcher it's reading on w.in which is always non-nil
<rogpeppe> dimitern: the point is that when in is nil, you're trying to send on out without allowing any way to know when the tomb is killed
<dimitern> rogpeppe: that's only because there are always changes to send (empty)
<dimitern> rogpeppe: but in my case that's also true
<dimitern> rogpeppe: meeting?
<rogpeppe> dimitern: there aren't always changes to send in the notifyWatcher - otherwise it would always be sending
<dimitern> standup time
<rogpeppe> ah yes
<mgz> I can't make standup I'm afraid, will be back shortly though
<dimitern> rogpeppe: lp:~jameinel/juju-core/upgrader-api-worker lp:~jameinel/juju-core/long-timeouts
 * TheMue => lunch
<rvba> rogpeppe: btw, I was told that Ian landed a change which affects how the MAAS and the azure provider get the instance id of a machine (previously, that was in environprovider).  So I need to test this.
<rvba> rogpeppe: I got an awful crash when starting jujud: http://paste.ubuntu.com/5887330/ (bottom of the file).  Does that ring any bell?
<rogpeppe> rvba: interesting
<rogpeppe> rvba: is this on tip?
<rvba> rogpeppe: yes, with a couple of tweaks in the Azure provider (that's what I'm testing) but the crash seems unrelated to my changes.
<rvba> rogpeppe: apparently, it crashed precisely in the new code added by Ian.
<rogpeppe> rvba: hmm, the stack trace doesn't seem to make sense to me
<rogpeppe> rvba: it seems to imply it's crashing in yamlBase64Value.String
<rogpeppe> rvba: ah, no, i was looking at an old version
<rvba> I'm using r 1487.
<rogpeppe> rvba: ha!
<rogpeppe> rvba: the problem is obvious - the previous line is missing a "return"
<rogpeppe> 		fmt.Errorf("cannot load state from URL %q: %v", stateInfoURL, err)
<rvba> rogpeppe: ha!
<hazmat> noodles775, this is with the python support?
<hazmat> noodles775, or with juju-core?
<hazmat> noodles775, re deployer
<noodles775> hazmat: python support (PyEnvironment.wait_for_units)
<noodles775> hazmat: I could be reading incorrectly, but it looks like we shouldn't be using juju-deployer/refactor yet.
<rvba> rogpeppe: are you able to bootstrap a node (with that bug fixed)?  Even with a manual fix for the "return" problem I'm getting an error, the stateInfoURL is the empty string apparently…
<rogpeppe> rvba: i failed to bootstrap correctly, although the node was created.
<rogpeppe> rvba: it failed to download the tools (although it seemed to upload them ok)
<rvba> rogpeppe: same here: http://paste.ubuntu.com/5887414/
<rogpeppe> i saw this error: http://paste.ubuntu.com/5887427/
<rogpeppe> rvba: it looks like your attempt got further than mine
<rvba> rogpeppe: is the container named "juju-private-storage" public?
<rogpeppe> rvba: i don't think so. should it be?
<rvba> If you want to be able to download stuff from there without being authenticated, yes.
<rogpeppe> rvba: usually the private storage is private
<rogpeppe> rvba: ah, the url isn't authenticated?
<rvba> rogpeppe: no
<rogpeppe> rvba: ah, presumably that will be fixed in time
<hazmat> noodles775, pyjuju support there isn't 100%, wait_for_units specifically.
<rvba> rogpeppe: yes, but it's not done yet (I think I wrote a note about this in the tutorial)
<hazmat> noodles775, i can push out update for it today
<rogpeppe> rvba: ha! yes, you're right, mea culpa - i didn't read that far in the instructions.
<rogpeppe> rvba: so there's no way of creating a read-only container?
<sidnei> ohai folks, trivial review? https://codereview.appspot.com/11501043/
<rogpeppe> sidnei: i thought i'd landed a similar fix
<rvba> rogpeppe: well, you can create an account, put things in a public container, and throw away the key;  now you have a read-only container :).
<rogpeppe> sidnei: oh darn it, it looks like it didn't get merged
<rogpeppe> sidnei: https://code.launchpad.net/~rogpeppe/juju-core/339-fix-lp1197369/+merge/174831
<rogpeppe> sidnei: (argh, the merge failed because the tests hung up - i don't *think* it was related to my changes)
<rogpeppe> sidnei: marking as approved again
<noodles775> hazmat: Thanks - the update would be great.
<sidnei> rogpeppe: cool
<rvba> rogpeppe: the problem I'm seeing does not seem to be very Azure-specific… could you try bootstrapping an ec2 environment (I'm asking because I suppose you have the setup already).
<rogpeppe> rvba: will try
<rvba> ta
<rogpeppe> rvba: it bootstrapped ok to me
<rogpeppe> s/to me/for me/
<rvba> rogpeppe: okayâ¦ so it's Azure-specific thenâ¦ thanks for testing.
<rogpeppe> mgz: conventionally, we start merge proposal titles with the package name(s), e.g. state: add address.go for machine location data
<rogpeppe> mgz: it might be nice to stick with that convention
<mgz> rogpeppe: I always forget
<mgz> and struggle to fit in 50 characters anyway...
<mgz> shall try to do better in future
<rogpeppe> mgz: yeah the 50 char limit is punishing
<rogpeppe> mgz: it's nice to have the package name as a first-level filter though
<mgz> hm, seems most commits of late have been forgetting it
<mgz> or have just been changing things across packages
<rogpeppe> mgz: yeah, i should bring it up at a meeting. it's also really good to include the codereview link in the commit message
<mgz> well, that one got through the bot without hanging
<mgz> rogpeppe: any clues on your failed merge test breakage?
<rogpeppe> mgz: i'm waiting to see if it happens again. approved 25 minutes ago. hopefully the queue's not too long.
<mgz> 2013-07-18 13:19:16 DEBUG    Merging https://code.launchpad.net/~rogpeppe/juju-core/339-fix-lp1197369 at revision roger.peppe@canonical.com-20130715172835-nsdxk0l82zpvqtpu
<noodles775> hazmat: just so you don't double-up, sidnei is working on a fix to the PyEnvironment.wait_for_units.
<mgz> (add an hour for BST)
<rogpeppe> mgz: where do you see that?
<mgz> I'm tailing the tarmac logs on the bot
<rogpeppe> mgz: i've been wanting to do that
<rogpeppe> mgz: can anyone do it?
<rogpeppe> anyway, lunchtime
<sidnei> rogpeppe: your mp doesn't have any Approved votes
<mgz> rogpeppe: I've added you if you weren't before, `ssh ubuntu@10.55.63.190 "tail -f ~tarmac/logs/tarmac.log"`
<rogpeppe> sidnei: i marked it as Approved, no?
<rogpeppe> mgz: ta!
<sidnei> rogpeppe: the mp yes, but there's no votes, it's still pending review from juju hackers
<mgz> sidnei: yeah, the votes don't matter in the way we're doing things, as they come in on rietveld rather than launchpad, so bot goes on just the status
<rogpeppe> sidnei: what do you mean by "votes"
<rogpeppe> ?
<sidnei> ah, thanks mgz
<sidnei> rogpeppe: i thought you were using tarmac the 'standard' way :)
<rogpeppe> ah, i guess i'm just unfamiliar with the usual workflow
<sidnei> which looks at the approvals on the mp itself
<dpb1> Hi all -- is debug-log supposed to work with the local provider?
<ahasenack> to expand, it tries to ssh into the bootstrap node with ubuntu@, and I don't have an ubuntu user on my achine
<ahasenack> machine
<ahasenack> it should probably not try to ssh and instead tail the log file locally
<mgz> that sounds like a bug report :)
<rvba> rogpeppe: I'm still investigating but I think I found the problem… Ian did not update the Azure provider's code to cope with his new design.  So the url where to fetch the url to the state info file ends up being empty.
<rvba> s/So the url/So the file/
<jcastro> when's the next release due? sometime this week?
<rvba> rogpeppe: I'm putting together a branch with the fix to the Azure provider.
<hazmat> jcastro, niemeyer knows travis
<jcastro> ack
<rogpeppe> dimitern: what were those two branches that jam handed off to you?
<dimitern> rogpeppe: the uploader api and longwait unification
<rvba> rogpeppe: care to have a look? With that I was able to bootstrap a node on azure and deploy charms to get a working mediawiki service: https://codereview.appspot.com/11462044/
<rogpeppe> rvba: cool!
<rogpeppe> rvba: looking
<rvba> ta
<rogpeppe> rvba: reviewed
<rvba> rogpeppe: thanks
<rvba> dimitern: could you please give me a second review for this: https://codereview.appspot.com/11462044/? It's tiny.
<dimitern> rvba: LGTM
<dimitern> man! i hate when my finger twitches and I send a review like 2-4 times :)
<marcoceppi> error: build command "go" failed: exit status 1; can't load package: package launchpad.net/juju-core/cmd/jujud: import "launchpad.net/juju-core/cmd/jujud": cannot find package
<marcoceppi> when trying to run juju bootstrap from recently compiled tip
<rvba> dimitern: 6 times to be precise :).  Thanks for the review!
<ahasenack> marcoceppi: I had some build problems this morning
<ahasenack> marcoceppi: I rm -rf $GOPATH/* and started from scratch, after 8min it was good again
<marcoceppi> ahasenack: thanks, giving it ago
<rogpeppe> rvba: did you edit the mp description directly inside launchpad?
<rvba> rogpeppe: no
<rogpeppe> rvba: hmm, it's really odd that the description doesn't contain the usual link to the codereview page
<rvba> Something is probably wrong with my setup but I can't figure out what.
<rogpeppe> rvba: presumably you created the lp mp with lbox?
<rvba> Yes
<rogpeppe> rvba: hmm. try proposing again and see if it works this time
<dimitern> rogpeppe: btw you haven't sent me your review yet, but wait, i'm reproposing with the changes as discussed now
<rvba> rogpeppe: I ran "lbox propose" again if that's what you wanted me to do…
<rogpeppe> dimitern: i didn't think there was a point in reviewing until you'd made those changes
<dimitern> rogpeppe: yeah, good point
<rvba> rogpeppe: ""
<rvba> An environments.yaml entry for the image name (and region) would be a much
<rvba> better stop-gap measure than patching the source!
<rvba> rogpeppe: that's on our todo list ;)
<rogpeppe> rvba: did the lbox propose make yet another codereview CL?
<rogpeppe> rvba: and mp
<dimitern> rogpeppe: https://codereview.appspot.com/11506043
<rvba> rogpeppe: no, not this time (maybe because the branch hasn't changed).
<rogpeppe> rvba: hmm, i don't see your changes still
<rvba> rogpeppe: it's on lp: https://code.launchpad.net/~rvb/juju-core/fix-az-prov/+merge/175574
<rvba> Trust me, I made the changes :)
<rvba> I'll land this now.
<rogpeppe> rvba: i believe you :-) i just want our workflow to function correctly
<dimitern> rogpeppe: did you like the deadlock tests? :)
<rogpeppe> dimitern: haven't got there yet...
<marcoceppi> I'm definitely missing something
<marcoceppi> Blown away and go get'd a few times with different GOPATHs; every time I go install -v there's no output, but binaries suddenly show up
<marcoceppi> nevermind.
<rogpeppe> marcoceppi: go get does a go install
<ahasenack> marcoceppi: got a new build?
<marcoceppi> rogpeppe: So I don't need to run go install?
<rogpeppe> dimitern: reviewed
<dimitern> rogpeppe: thanks
<rogpeppe> marcoceppi: not if you've done the right go get (i.e. the packages you need)
<rogpeppe> marcoceppi: go get launchpad.net/juju-core/...
<rogpeppe> marcoceppi: should be sufficient
<dimitern> rogpeppe: yeah, i did test it and it fails without the select block (deadlocks, so I needed to tweak the assert call to timeout)
<marcoceppi> rogpeppe: yeah, I do that only I use the -v flag. Didn't know it just built for you
<marcoceppi> thanks
<rogpeppe> dimitern: cool
<rogpeppe> dimitern: have you got a branch name for jameinel's upgrader API? i presume the timeout unification branch is lp:~jameinel/juju-core/long-timeouts
<dimitern> rogpeppe: it's lp:~jameinel/juju-core/upgrader-api-worker (i did send you both earlier, i think)
<rogpeppe> so you did - i don't know how i missed that
<mattyw> rogpeppe, question for you, if I wanted to get the version of a charm from the api. The only/best place is from the charmUrl right?
<rogpeppe> mattyw: currently yes
<mattyw> rogpeppe, great thanks :)
<fwereade_> rogpeppe, hey, the public bucket holds tools and potentially simplestreams data -- anything else you can think of?
 * rogpeppe thinks
<rogpeppe> fwereade_: it could potentially hold info on available clouds
<rogpeppe> s/clouds/public clouds/
<fwereade_> rogpeppe, a public bucket's already cloud-specific isn't it?
<rogpeppe> fwereade_: gotta start somewhere. autosync is an example of that.
<rogpeppe> fwereade_: it could hold info on how to contact jaas too
<rogpeppe> fwereade_: istm that a public bucket is our first point of information provision that isn't hardcoded into the juju binary
<rogpeppe> fwereade_: but perhaps i'm thinking too far into the future here
<rogpeppe> mattyw: you can use charm.InferRepository and charm.Repository.Get to actually retrieve a charm given its url
<fwereade_> rogpeppe, I agree that some sort of global first point of contact would be a good thing, but what you're talking about goes a bit far beyond what I'm considering today
<fwereade_> rogpeppe, cheers
<rogpeppe> fwereade_: you didn't give me any context for the question :-)
<fwereade_> rogpeppe, well, you did already answer it perfectly before speculating :)
<fwereade_> so I can't really complain ;p
<rogpeppe> i'm going to wrap up a little earlier today to go and enjoy the sunshine. leaving in 9 minutes.
<rogpeppe> dimitern: i've reproposed https://codereview.appspot.com/11439043/ with some tweaks and i hope your remarks addressed
<dimitern> rogpeppe: yes, i saw it, thanks
<rogpeppe> dimitern: re-review appreciated
<rogpeppe> dimitern: the first time was just the original; i've now pushed my changes
<rogpeppe> dimitern: so you can see the exact changes i've made since jam's proposal
<rogpeppe> right, off now. see y'all tomorrow!
<dimitern> rogpeppe: where's your branch then?
<rogpeppe> dimitern: there
<rogpeppe> dimitern: i just linked to it
<rogpeppe> dimitern: oh bugger, i didn't
<rogpeppe> dimitern: one mo
<rogpeppe> dimitern: https://codereview.appspot.com/11529043
<rogpeppe> ttfn
<dimitern> rogpeppe: i'll scan through it quickly
<dimitern> rogpeppe: reviewed
 * fwereade_ is wondering why we demand that the user have an SSH key in order to bootstrap but still do the admin-secret dance
<dimitern> fwereade_: that's an excellent question actually
 * fwereade_ suspects the answer is "because we're stupid"
<dimitern> i suspect overengineering might be root cause ;)
<fwereade_> that said, if we'd gone pk-only we would have problems with the gui, so I guess passwords *are* also necessary
<fwereade_> not sure there's any good reason to use them from the CLI though
<fwereade_> hey ho
<sidnei> fwereade_: https://codereview.appspot.com/11407044
<sidnei> re: https://bugs.launchpad.net/juju-core/+bug/1202480
<_mup_> Bug #1202480: saucy: bootstrap local environment works, machiner gets unauthorized <juju-core:New> <https://launchpad.net/bugs/1202480>
<thumper> morning
 * thumper sighs
<thumper> late night meetings make me feel like forever working
 * thumper sighs again...
<thumper> fwereade_: you about later?
<sidnei> hey thumper
<thumper> hi sidnei
<sidnei> thumper: speaking of reviews, this is a fix for saucy, for the bug i reported yesterday: https://codereview.appspot.com/11407044/
 * thumper looks
<sidnei> it does break compat with mongo 2.0 iiuc, so might need a second opinion
<thumper> any way to have it have a nice fallback?
<sidnei> thumper: i guess you could fallback to the previous code, running the eval
<sidnei> the eval doesn't work with 2.4, because it requires admin privs, and the isMaster.localTime is not present in 2.0, so tricky ;)
<thumper> sidnei: how about:
<thumper> try the 2.4 trick
<thumper> and if that fails, fall back to eval
<thumper> that way we support both?
<thumper> with a preference to the latest version
<sidnei> that's what i'd do yes
 * thumper leaves that comment on the review
<sidnei> thumper: should the fallback go inside the before/after? should there be a flag so it skips the first one after it fails the first time since it's inside a loop?
<thumper> sidnei: thinking...
<thumper> wtf is this command doing?
<thumper> why the wait for 5 seconds?
<thumper> no comment...
<thumper> that's helpful
<thumper> NOT!
<sidnei> thumper: it's not waiting for 5 seconds, it's *retrying* if running the function on the server takes more than 5 seconds!
<sidnei> even nicer, the eval acquires a global write lock on the server, so the likelihood of it waiting to acquire that lock is not exactly low ;)
<thumper> ah...
<thumper> sidnei: can I get you to add a comment to that effect?
<sidnei> sure, i added like 4x more comments than changes already :)
<thumper> :)
<sidnei> thumper: how do i upload codereview? lbox -cr?
<thumper> sidnei: just lbox propose again
<sidnei> thumper: done
<wallyworld> davecheney: hi, you tagging a release this morning?
<thumper> wallyworld: I need to land a fix before the tag hits
<thumper> wallyworld: or destroying local environments is broken
<wallyworld> thumper: i was also hoping to get a 2nd +1 on the force machine rename
<thumper> wallyworld: I didn't realise that the lxc-destroy command removed the symlink from /etc/lxc/auto
<thumper> wallyworld: at least mine is trivial :)
<wallyworld> mine isn't that complicated
<thumper> wallyworld, davecheney: plz... https://codereview.appspot.com/11546043/
 * wallyworld looks
#juju-dev 2013-07-19
<davecheney> wallyworld: yeah, i plan to today
<davecheney> just waiting for a smoke test build to pass
<wallyworld> davecheney: thumper and i had a couple of fixes
<wallyworld> is it too late?
<davecheney> wallyworld: it is not too late
<davecheney> remember my tagging score card
<davecheney> 0 for 2
<wallyworld> davecheney: it would be great if you could +1 this then. you're the only person besides thumper who is around to ask sorry. https://codereview.appspot.com/11493043/
<davecheney> no worries
<wallyworld> thanks. do you have a link to the release notes?
<davecheney> https://docs.google.com/a/canonical.com/document/d/1ZBV6m0D1cfJQGoHW7EzJEc2qZeJFR38teHM_4OiYkeM/edit#
<wallyworld> i'll add some shit
<davecheney> capital!
<davecheney> environs/azure isn't in environs/all
<davecheney> but it is leaking into jujud somewhere
<davecheney> i need to fix that quickly
<thumper> davecheney: in my trunk, azure is in all
<davecheney> yeah, for some reason I couldn't see it last night
<davecheney> i'll have to submit a branch to remove it
<thumper> +1 trivial from me
<thumper> but I'll do it when you have it up
<davecheney> wallyworld: LGTM on your branch
<davecheney> ooops, sam is up
<davecheney> afk for a few mins
<wallyworld> davecheney: thanks, will land
<sidnei> thumper: should i flip that mp to approved? iiuc there's a release about to be cut so maybe wait until after that?
<davecheney> https://codereview.appspot.com/11557043
<davecheney> wallyworld: thumper could I give some love for this change ?
<wallyworld> sure, looking
<davecheney> sidnei: sure, check it in
<sidnei> davecheney: oh, apparently i don't have perms. do the honors? https://code.launchpad.net/~sidnei/juju-core/mongo-2.4-compat/+merge/175653
<davecheney> sidnei: screw that, i'll give you perms
<sidnei>  /o\
<sidnei> ok :)
<davecheney> try now
<sidnei> done
<sidnei> or. slow launchpad is slow.
<sidnei> now for realz.
<bigjools> davecheney: I had a patch from agl to handle tls renegs... can you take a look please?
<davecheney> bigjools: FUCKING A
<bigjools> you can say that if it works
<bigjools> check your email
<davecheney> bigjools: please send
<davecheney> got it
<davecheney> bigjools: speaking of azure signup
<davecheney> did some dude try to call you to welcome you to the azure failmy
<davecheney> family
<bigjools> using what number? *cough*
<thumper> wallyworld: hangout?
<davecheney> i had to give a mobile number so they send me the activation code
<wallyworld> ok
<davecheney> i guess that is how they got my phone number
<davecheney> i did use my @canonical email
<davecheney> maybe this is VIP treatment
<bigjools> ummmm maybe I did that
<bigjools> never got called
<bigjools> I can't remember what I did yesterday let alone 2 months ago
<davecheney> good point
<davecheney> bigjools: patch didn't apply cleanly, but i'll hack it in manually
<bigjools> ok
<bigjools> that's interesting
<bigjools> what is he hacking on...
<wallyworld> thumper: url?
<thumper> https://plus.google.com/hangouts/_/e105bdb56dcdc0f23d38c9cdafc18eea12967111?hl=en
<davecheney> no idea, the tls package hasn't received any updates for a looooong time
<davecheney> I do know that google have their own internal version of Go
<davecheney> but this isn't a fork
<davecheney> it's just the way they manage external source into the big mega google/ repo
<davecheney> so possibly he was working from that copy
<davecheney> anyhoo
<bigjools> davecheney: notice he said it's not suitable for upstream, so we still have a problem
<davecheney> yup
<bigjools> if they are that against it, fixing go-curl is the only realistic option
<davecheney> sidnei: thouest forget the commit message
<davecheney> bigjools: one elephant at a time
<davecheney> ok, tagging time
<bigjools> davecheney: did you get it working?
<davecheney> bigjools: not yet
<davecheney> it's next on my list after tagging 1.11.3
<bigjools> kk
<wallyworld> davecheney: after a spurious bit failure, my last branch for release finally merged
<davecheney> wallyworld: what is the push location for the _real_ trunk now ?
<wallyworld> s/it/bot
<wallyworld> ~gobot/juju-core of something i think
<davecheney> lucky(~/devel/juju-core) % bzr push lp:~gobot/juju-core
<davecheney> bzr: ERROR: Invalid url supplied to transport: "lp:~gobot/juju-core": No such person or team: gobot
<davecheney> ahh, hyphen
<wallyworld> go-ot sorry
<wallyworld> i think you also need to add credentials
<wallyworld> not sure
<davecheney> lucky(~/devel/juju-core) % bzr push lp:~go-bot/juju-core/trunk
<wallyworld> there's an email somewhere perhaps, i'll try and find it
<davecheney> bzr: ERROR: Cannot lock LockDir(chroot-76960720:///~go-bot/juju-core/trunk/.bzr/branch/lock): Transport operation not possible: readonly transport
<wallyworld> yeah, you need credentials
<davecheney> jam said that my key shuld be there
<davecheney> tey fook
<wallyworld> no, the push needs the correct form of the url
<wallyworld> bzr+ssh://.... or something i think
<davecheney> lucky(~/devel/juju-core) % bzr push bzr+ssh://bazaar.launchpad.net/~go-bot/juju-core/trunk
<davecheney> bzr: ERROR: Cannot lock LockDir(chroot-92566480:///~go-bot/juju-core/trunk/.bzr/branch/lock): Transport operation not possible: readonly transport
<wallyworld> davecheney: bzr+ssh://go-bot@bazaar.launchpad.net/~go-bot/juju-core/trunk
<wallyworld> i think
<bigjools> lp:juju-core should work
<wallyworld> bigjools: it's protected
<bigjools> oh you need to use go-butt
<davecheney> bigjools: is that like a ro-butt ?
<bigjools> you could bzr lp-login first then
<bigjools> then push to lp:juju-core
<wallyworld> i didn't know you could do that
<davecheney> lucky(~/devel/juju-core) % bzr lp-login
<davecheney> dave-cheney
<davecheney> lucky(~/devel/juju-core) % bzr push bzr+ssh://bazaar.launchpad.net/~go-bot/juju-core/trunk
<davecheney> bzr: ERROR: Cannot lock LockDir(chroot-67593168:///~go-bot/juju-core/trunk/.bzr/branch/lock): Transport operation not possible: readonly transport
<davecheney> lucky(~/devel/juju-core) % bzr push lp:juju-core
<davecheney> bzr: ERROR: Cannot lock LockDir(chroot-92087248:///%2Bbranch/juju-core/.bzr/branch/lock): Transport operation not possible: readonly transport
<davecheney> oh, do I have to lp-login as go-but ?
<wallyworld> davecheney: did you try my suggestion above?
<wallyworld> add the go-bot@
<davecheney> wallyworld: yes i tried your suggestion
<davecheney> oh, not that one
<davecheney> << tit
<davecheney> lucky(~/devel/juju-core) %  bzr push bzr+ssh://go-bot@bazaar.launchpad.net/~go-bot/juju-core/trunk
<davecheney> Permission denied (publickey).
<davecheney> very close now
<davecheney> ConnectionReset reading response for 'BzrDir.open_2.1', retrying
<davecheney> key isn't actually there
<wallyworld> oh, your key is not registered?
<wallyworld> no, it's not, i just checked
<davecheney> wallyworld: i don't know how to answer that
<davecheney> nothing did not work before
<wallyworld> davecheney: see https://launchpad.net/~go-bot
<wallyworld> i can't see your key
<davecheney> ok, i'll fix it
<davecheney> ok, how do I fix it ?
<davecheney> ie, nothing didn't work before hand
<davecheney> looks like I need to login as go-bot and rummage around
<wallyworld> i don't think i have perms to add your key
<wallyworld> maybe ask in -ops
<thumper> davecheney: If I use go.build package, build.ImportPath(".", 0) and use that in a test, will it always import the directory I think it should?
<davecheney> thumper: not 100% sure
<davecheney> never tried that
<davecheney> by rights it should use whatever the value of $(pwd) is
<davecheney> but that can sometimes be surprising
<thumper> hmm...
<thumper> I want to write a test to make sure people don't mistakenly expand the dependencies of a package
<thumper> so I want to get the list of imports for the current package
<thumper> it seems to work
<thumper> but the "." makes me hesitant
<thumper> also annoying if I want to put the test in a helper package
<thumper> in python there is __file__
<thumper> anything like that in go?
<davecheney> what about this
<davecheney> go list -f '{{ .Imports }}' $PKG
<davecheney> thumper: you want build.ImportPath("launchpad.net/juju-core/cmd/juju", 0) or something
<thumper> that doesn't work
<thumper> &os.PathError{Op:"open", Path:"launchpad.net/juju-core/agent", Err:0x2} ("open launchpad.net/juju-core/agent: no such file or directory")
<davecheney> ok
<davecheney> you are right to be concerned about ".'
<thumper> it seems to work, but I'm not convinced it is always going to work
<davecheney> rightly so
<thumper> hangon,
<thumper> think I have it
<thumper> davecheney:
<thumper> func (*deps) TestPackageDependencies(c *gc.C) {
<thumper> 	c.Assert(
<thumper> 		FindJujuCoreImports(c, "launchpad.net/juju-core/agent"),
<thumper> 		gc.DeepEquals,
<thumper> 		[]string{"log", "state", "version"})
<thumper> }
<thumper> whaddaya think?
 * thumper proposes
<bigjools> wallyworld: is that cost and constraint evaluation done in the core code or each provider?
<bigjools> (as opposed to defining the structs I mean)
<wallyworld> umm. core code i think, let me check
<bigjools> that's cool if so
<wallyworld> yeah, core code
<wallyworld> there's glue in each provider to plug in the data
<wallyworld> and the core code does the rest
<wallyworld> so the common logic is all outside the providers
<bigjools> nice
<bigjools> so we just define the structs and Robert's your father's brother
<wallyworld> that's the idea more or less
<wallyworld> the openstack and ec2 implementations give some idea of what to do
<bigjools> yeah
<bigjools> wallyworld: oh, does cpupower have any meaning outside of ec2?
<wallyworld> bigjools: nope
<bigjools> wallyworld: what should it be set to, anything?
<bigjools> as long as it's the same for each type I guess
<wallyworld> no, don't set it. it will be ignored
<bigjools> ah ok
<bigjools> I used a ¢ symbol in a comment and now I get this "illegal UTF-8 encoding" compile error, what do I do?
<davecheney> bigjools: delete that character ?
<davecheney> did your editor put a BOM at the top of the file ?
<bigjools> not that I can tell
<davecheney> like this ? http://play.golang.org/p/MHE29ZM7Ln
<bigjools> it was working ok until I did a go fmt
<davecheney> oooh interesting
<davecheney> is the \c all mangled as well ?
<bigjools> it now looks like: ÃÂ¢
<bigjools> if I remove the Â Go bitches at me
<davecheney> bigjools: can you provoke the problem in another editor ?
 * bigjools tries again
<davecheney> or can you send me the file pre fmt ?
<bigjools> I used vim to make a basic file
<bigjools> it works
<bigjools> post format it works
<dimitern> ÃÂ¢ looks like latin1 encoding interpreted as utf-8
<bigjools> except it added that Â
<bigjools> hmmm vim adds it later
<bigjools> with no fmt
 * bigjools sighs heavily
<jtv> dimitern: the other way around, no?
<dimitern> jtv: ah, yes right
<dimitern> if it was interpreted as utf there will be <?> things inside
<dimitern> fwereade_: ping
<dimitern> rogpeppe: hey
<rogpeppe> dimitern: yo!
<dimitern> rogpeppe: so about SetAgentAlive again
<rogpeppe> dimitern: okay...
<dimitern> rogpeppe: did we make a way to expose the same functionality through the api connection?
<dimitern> rogpeppe: i think that was what we decided last
<rogpeppe> dimitern: i can't entirely remember if we decided to make it explicit or implicit
<rogpeppe> dimitern: both have advantages and disadvantages
<dimitern> rogpeppe: it was something like auto start a pinger on connect
<rogpeppe> dimitern: yeah, that's the implicit approach
<rogpeppe> dimitern: how do we currently deal with the machine nonce?
<dimitern> rogpeppe: we set it at provisioning time and can verify it in the agent
<rogpeppe> dimitern: yeah, but do we deal with it in the API at all yet?
<dimitern> rogpeppe: i don't think so, let me check
<dimitern> rogpeppe: no
<dimitern> rogpeppe: why do we need that?
<dimitern> rogpeppe: i was thinking to do the api migration in steps
<dimitern> rogpeppe: machiner doesn't need it, only the MA and provisioner
<rogpeppe> dimitern: that's the thing that ensures we don't have two machine agents connected at the same time
<dimitern> rogpeppe: how so?
<rogpeppe> dimitern: isn't that the whole reason for the nonce?
<rogpeppe> dimitern: my suggestion is that we integrate the nonce with the login request, and start the pinger then.
<dimitern> rogpeppe: expand please
<rogpeppe> dimitern: currently when you log in, you provide an entity tag and password, right?
<dimitern> rogpeppe: right
<rogpeppe> dimitern: so it's only at that stage that the API server knows who's at the other end
<rogpeppe> dimitern: we need to know who's at the other end if we're to start a pinger
<rogpeppe> dimitern: but we don't want to start a pinger if an agent comes up with the wrong nonce (because then they'll immediately go away again)
<rogpeppe> dimitern: so we could add the nonce to the login request, so the login will fail without the correct nonce.
<rogpeppe> dimitern: (for machine agents only)
<TheMue> morning
<rogpeppe> dimitern: and then start the pinger on a successful login request (and stop it when the connection goes away)
<dimitern> rogpeppe: hmm
<dimitern> rogpeppe: so add an optional field to apiinfo?
<dimitern> rogpeppe: that's alright, but how can juju status get this to show it (i.e. AgentAlive) ?
<rogpeppe> dimitern: we can still have an API AgentAlive call
<dimitern> rogpeppe: but for the CLI/clients only
<rogpeppe> dimitern: probably, yes
<dimitern> rogpeppe: and assuming we can start the same pinger out-of-order to get the already running one, should be fine
<rogpeppe> dimitern: i don't understand that
<dimitern> rogpeppe: i mean the pinger we start on MA login will be running, and then AgentAlive tries to find whether it's running, but it will be in another process perhaps
<rogpeppe> dimitern: that's always been the case
<dimitern> rogpeppe: so it's fine then
<rogpeppe> dimitern: yeah
<rogpeppe> dimitern: you don't even need a Pinger to ask if something's alive
<dimitern> rogpeppe: ok then, so the machiner can be simplified a bit
<dimitern> rogpeppe: sorry, the MA actually
<rogpeppe> dimitern: yeah
<dimitern> rogpeppe: no need to call CheckProvisioned - if it's not we won't be able to connect
<rogpeppe> dimitern: yup
<dimitern> rogpeppe: but the error was to be a different one from "login failed", so it won't retry
<rogpeppe> dimitern: good point, yes
<dimitern> rogpeppe: I think I can see it now, the CL I mean to enable all that
<rogpeppe> dimitern: great
<dimitern> rogpeppe: if you don't mind I'll try it out and propose it for discussion
<rogpeppe> dimitern: sounds good
<dimitern> rogpeppe: as a first step towards api -> machiner
<rogpeppe> dimitern: i'm not sure there's any problem with LongWait being 10 seconds BTW
<rogpeppe> dimitern: can you think of one?
<dimitern> rogpeppe: why so long?
<dimitern> rogpeppe: why not 1h then? :)
<rogpeppe> dimitern: because it's important in some cases as a worst-case bound; and it's not long to wait for a failed test.
<rogpeppe> dimitern: it will never happen when tests pass, so it doesn't impact anything in the usual case
<dimitern> rogpeppe: as long as it's only a failure timeout then ok
<rogpeppe> dimitern: that's the whole point of LongWait, isn't it?
<dimitern> rogpeppe: i assumed so yes, but in some places it seemed less clear
<rogpeppe> dimitern: anywhere it's part of the test-passing path, i'd like to know
<dimitern> rogpeppe: it should be easy enough to spot - running go test before and after the branch changes a few times and comparing times
<rogpeppe> dimitern: yeah, change it to 1 minute and see which tests take longer
<dimitern> rogpeppe: have you tried that yourself?
<rogpeppe> dimitern: no - i'll do it now
 * TheMue hates import cycles
<rogpeppe> dimitern: useful test - thanks for the suggestion. it found two places we were using LongWait inappropriately
<dimitern> rogpeppe: nice!
<rogpeppe> dimitern: make that three places - there were two related tests in state that were doing it wrong
<dimitern> rogpeppe: i'm glad i asked you to do this test then
<rogpeppe> dimitern: me too
<rvba> Hi guys, would someone be available for a second review of https://codereview.appspot.com/11524044/ ?
<dimitern> rvba: I'll take a look, but it doesn't seem there is a first review there
<rvba> dimitern: yeah, the first review is here https://codereview.appspot.com/11433044/.  You know, I've got this problem that each time I run lbox propose it creates another MP.
<rvba> dimitern: thanks!
<dimitern> rvba: you could probably just do bzr push and publish your comments manually from rietveld
<dimitern> rvba: the only drawback is you won't see the updated diff on rietveld i guess
<rvba> dimitern: well, I think I'll live with the creation of an additional MP each time I run "lbox propose" :)
<dimitern> rvba: if you rebuild lbox from tip using go 1.1.2 (or whatever the latest release is) I think you should be fine - that's what I did and it works.
<rvba> dimitern: okay, I'll try that.
<dimitern> rvba: but just in case it doesn't better keep a copy of the lbox binary around as backup
<rvba> sure
<dimitern> rvba: and also set a separate $GOPATH for go 1.1.2, so it'll be easier to switch back to your current go version
<rvba> Well, because of the go-curl problem I'm sort of stuck with the old version for now.
<dimitern> rvba: ah..
<dimitern> rvba: reviewed
<rvba> ta
<mgz> mornin'
<dimitern> mgz: moin
<dimitern> rogpeppe: so far so good - changed OpenAPIAs and all relevant code to use machine nonce for MA logins, tests pass, only need to add the pinger and will propose
<rogpeppe> dimitern: cool
<rogpeppe> dimitern: final look? https://codereview.appspot.com/11529043/
<dimitern> rogpeppe: sure
 * TheMue removed the cyclic dependency, but our packages are sometimes really tightly entangled
 * fwereade_ out for breakfast, bbiab
<rogpeppe> TheMue: what was the issue in this case?
<dimitern> rogpeppe: reviewed
<TheMue> rogpeppe: I'm trying to extract the sync tools so that they are better reusable
<TheMue> rogpeppe: and my wish has been environs/tools
<TheMue> rogpeppe: sadly I'm using the EC2 HTTP storage reader, and environs/ec2 uses environs/tools. *rofl*
<rogpeppe> dimitern: replied
<rogpeppe> dimitern: (and i'm going to submit it)
<rogpeppe> dimitern: did jam tell you anything about his plans around upgrade worker (in particular the role of his upgrade handler stuff)?
<rogpeppe> fwereade_: ping
<dimitern> rogpeppe: not much that I remember off hand
<rogpeppe> dimitern: ok, i'll figure it out
<dimitern> rogpeppe: its model is similar to notifyWorker
<rogpeppe> dimitern: hmm, that seems a bit odd - isn't notifyWorker structured like that because it's generic?
<rogpeppe> dimitern: i think the upgrader is the same in every case, isn't it?
<rogpeppe> dimitern: i'm a bit concerned that we're taking time to refactor significant logic that really doesn't need refactoring at this stage
<rogpeppe> dimitern: when i tried to estimate the time to implement the API, i did not include time to refactor all the client code too
<dimitern> rogpeppe: i'm not sure, but i think the idea is to refactor the upgrader so it can use the notifyworker, like the machiner
<rogpeppe> dimitern: that's a different thing.
<fwereade_> rogpeppe, pong
<rogpeppe> fwereade_: i was just wondering why params.AgentTools flattens out all the version fields
<fwereade_> rogpeppe, I'd be +1 on changing that
<fwereade_> I asked jam, he said it basically seemed like a good idea at the time but didn't have strong feelings either way
<fwereade_> dimitern, rogpeppe: fwiw I am -1 on spending time making upgrader a handler-style worker
<rogpeppe> fwereade_: it seems like an embedded version.BinaryVersion would be more obvious
<fwereade_> dimitern, rogpeppe: it's got 2 things to handle
<rogpeppe> fwereade_: me too
<rogpeppe> fwereade_: i'm not even sure it'll work well using NotifyWorker
<fwereade_> rogpeppe, ah, we're on go 1.1 now, so marshalling embedded fields works properly, right
<fwereade_> rogpeppe, hey wait
<rogpeppe> fwereade_: no we're not
<fwereade_> rogpeppe, Upgrader... *is* that watching 2 things? oh yes it does the download stuff
<fwereade_> rogpeppe, we're not?
<fwereade_> rogpeppe, I thought we had everything building with go1.1
<rogpeppe> fwereade_: oh, have i missed something?
<fwereade_> rogpeppe, I *think* so
<rogpeppe> fwereade_: is 1.1 backported to precise etc?
<fwereade_> rogpeppe, I suspect mgz will be able to give you the precise (heh) details
<rogpeppe> fwereade_: i thought tarmac was still running 1.0.2
<rogpeppe> fwereade_: if we're really 1.1 throughout, then \o/
<fwereade_> rogpeppe, honestly for the tools I'd just as soon send a string
<rogpeppe> fwereade_: don't we need the version *and* the URL?
<fwereade_> rogpeppe, sure, sorry, I mean send the version as a string
<rogpeppe> fwereade_: ah, you mean for AgentTools
<rogpeppe> fwereade_: version.BinaryVersion marshals as a string anyway AFAIR
<fwereade_> rogpeppe, ah, that's nice
<rogpeppe> fwereade_: ah, i misremembered - it only does that in mongo (it has SetBSON etc methods)
<fwereade_> dimitern, rogpeppe: the other option is to drop the un- or poorly-tested download-cancelling stuff, and then upgrader *would* make a nice NotifyWorker, I think
<rogpeppe> fwereade_: it looks like versions aren't currently used in the API yet, so we can still do it for JSON
<fwereade_> rogpeppe, sgtm
<rogpeppe> fwereade_: the important thing, for the time being, is that the upgrader waits around to complete an upgrade for a while after it's killed (because of the one-kills-all logic in the machine agent)
<rogpeppe> fwereade_: when we fix things so that we can detect non-fatal errors, we can rethink that
<fwereade_> rogpeppe, I thought we weren't doing that on the api side?
<rogpeppe> fwereade_: we are currently. we need to have a hard think about error handling before we change that.
<fwereade_> rogpeppe, I don't think it's worth worrying about preserving that, so long as we just defer landing a NotifyWorker version until it's going via the api?
<fwereade_> rogpeppe, isn't the only reason for the delayed stop to work around the one-error-kills-everything behaviour?
<rogpeppe> fwereade_: not entirely
<fwereade_> rogpeppe, go on
<rogpeppe> fwereade_: even if there is a fatal error due to the state going away, i think we want to allow the upgrade to proceed
<rogpeppe> fwereade_: because we're not downloading the tools from the state
<fwereade_> rogpeppe, ok, but that just involves dropping the download-cancelling cleverness, right? if we just do a blocking download and return the error when it's done we're dorted
<rogpeppe> fwereade_: i think we should probably leave existing logic as is. i know it's tempting to do drive-by refactoring of everything, but it all eats into the critical path
<dimitern> fwereade_, rogpeppe: https://codereview.appspot.com/11424044 please take a look
<rogpeppe> fwereade_: i'm dorted anyway, dunno about you :-)
<fwereade_> rogpeppe, I think that'd hold more water if upgrader were properly tested... as it is the tested bits needed changing, and if we dropped the untested bits we could also drop the somewhat astonishing delayed stop
<fwereade_> rogpeppe, which is at least tested but appears not to be delivering much value now that it's only there to support the untested bits
 * fwereade_ gets "dorted" now, haha :)
<dimitern> fwereade_, rogpeppe: ^^ that CL implements the first step towards ditching SetAgentAlive
<rogpeppe> fwereade_: the delayed stop isn't just there to support the download cancellation
<rogpeppe> fwereade_: it's there to make the system more robust in the face of stupid errors.
<fwereade_> dimitern, cool, I probably won't get to it until later today I'm afraid
<rogpeppe> fwereade_: (and the download cancellation doesn't work anyway AFAIR because we don't provide a means to abort the http download)
<fwereade_> rogpeppe, those errors being ones in parallel workers, right? we still depend on the outer agent mechanisms working right
<dimitern> fwereade_: when you can, np, i have a couple more coming
<rogpeppe> fwereade_: well yes, it has to get into the upgrader logic
<fwereade_> rogpeppe, so... if it starts an upgrader correctly (which we depend on anyway) it won't stop the upgrader unless there's an API problem, right? if there *is* an api problem, no upgrades; if not, we'll get the upgrade, and if we ignore the tomb once we've started downloading there's no further need for delay... is there?
<fwereade_> rogpeppe, the only thing that'll kill the upgrader is ErrTerminateAgent, and that's a potential stupid error the delayed stop doesn't defend against anyway
<rogpeppe> fwereade_: first: i'm not sure we want to ignore the tomb when we're downloading - a download can take a long time.
<rogpeppe> fwereade_: second: it will stop the upgrader if any API-based agent returns any error
<fwereade_> rogpeppe, first, var upgraderKillDelay = 5 * time.Minute
<fwereade_> rogpeppe, second: I thought the whole idea was that it *wouldn't* do that?
<rogpeppe> fwereade_: yes - that happens only for the very first download
<rogpeppe> fwereade_: in fact it doesn't happen if we come up successfully and the version is as expected
<rogpeppe> fwereade_: AFAIR...
<fwereade_> rogpeppe, yeah, we do indeed usually not wait that long
<fwereade_> rogpeppe, but if it's acceptable for the upgrader to block everything for 5 mins while it's figuring stuff out, I think it's also ok for it to wait for the duration of a download
<rogpeppe> fwereade_: the 5 minute wait only happens in a very specific circumstance
<fwereade_> rogpeppe, likewise the wait for a download
<rogpeppe> fwereade_: it might actually be better structured as a statement outside the main upgrader loop actually
<rogpeppe> fwereade_: in my experience, a network download can take lots more than 5 minutes to time out
<fwereade_> rogpeppe, let's try a different approach -- describe a bug that the delayed stop will actually defend against that *isn't* a bug in the mechanism that starts the upgrader anyway
<rogpeppe> fwereade_: say there's a compatibility problem in the unit agent, for example - it encounters an "impossible" state and bombs out
<rogpeppe> s/unit agent/uniter/
<fwereade_> rogpeppe, then the uniter waits 5s and gets restarted
<fwereade_> rogpeppe, the upgrader continues to toddle along just fine
<rogpeppe> fwereade_: except that as things are *currently*, the upgrader is killed too
<fwereade_> rogpeppe, we have been over that
<rogpeppe> fwereade_: because we have no way of distinguishing temporary from permanent errors
<rogpeppe> fwereade_: i'd like to fix the upgrader logic *after* we've added that error logic
<fwereade_> rogpeppe, the upgrader is killed too because isFatal always returns true; but when it's an api worker, it won't kill everything
<fwereade_> rogpeppe, we write a simple upgrader, and defer landing it until we can also land it as an api worker instead of a state worker
<rogpeppe> fwereade_: we don't use isFatal for the APIWorker runner
<rogpeppe> 	runner := worker.NewRunner(allFatal, moreImportant)
<fwereade_> rogpeppe, WTF
<rogpeppe> fwereade_: it has to be like that
<rogpeppe> fwereade_: because we can't currently know when an API connection is borked
<rogpeppe> fwereade_: we do need to do that, but we haven't done it yet
<fwereade_> rogpeppe, so maybe it's not hooked up yet, because we seem to have some pathology where we still somehow write reams of code and never integrate it
<rogpeppe> fwereade_: what's not hooked up yet?
<fwereade_> rogpeppe, but "does Ping return an error" is, I think, adequate
<rogpeppe> fwereade_: i don't believe it is
<rogpeppe> fwereade_: what happens if the underlying state connection that the API server is talking to goes away?
<fwereade_> rogpeppe, then handling that is the api server's job
<rogpeppe> fwereade_: yes, and we don't know how to do that yet
<fwereade_> rogpeppe, it is *not* the agent's concern though
<rogpeppe> fwereade_: it is currently - that's the only way we can recover from.... oh, in fact we never do recover. we're stuffed.
<fwereade_> rogpeppe, the answer to possible crapness in one component is *not* to smear workarounds across the rest of the system, it's to fix the problematic component
<rogpeppe> fwereade_: i agree. but i don't want to remove our current workaround until we've fixed that problematic component
<fwereade_> rogpeppe, using allFatal just makes the agent crap as well
<rogpeppe> fwereade_: so... it seems to me that there are two fixes we really need
<rogpeppe> fwereade_: 1) add another worker that watches api.State.Broken and exits with a fatal error if it is.
<rogpeppe> fwereade_: 2) change the api server so that if it sees an error that implies the mongo state connection is broken, it quits
<fwereade_> rogpeppe, yeah, both good things
<fwereade_> rogpeppe, more important than implementing the API workers how I asked more than a month ago? very much not so
<rogpeppe> fwereade_: i wasn't planning on rewriting all the workers to change them to use the API, and i don't think it's necessary
<rogpeppe> fwereade_: that's what i'm resisting
 * TheMue smiles
<TheMue> tests are passing
<fwereade_> rogpeppe, we *have* to rewrite the upgrader, because you implemented it so that we have to send secrets out to every single agent
<fwereade_> rogpeppe, since we're doing it I would prefer to do it right
<rogpeppe> fwereade_: we don't have to change anything other than change it to read on a notify channel and read the tools when we get a notification
<rogpeppe> fwereade_: that's a fairly trivial change, i think
<rogpeppe> fwereade_: definitely not a rewrite
<fwereade_> rogpeppe, apart from all the magic untested complex code in there
<rogpeppe> fwereade_: that's a totally orthogonal issue
<rogpeppe> fwereade_: it may be the case, but we don't need to fix everything now
<fwereade_> rogpeppe, you know we could have written a NotifyWatcher implementation in the time you've been kicking up dust over this issue?
<rogpeppe> fwereade_: but it would probably be wrong in some subtle way. i like not rewriting things when possible.
<fwereade_> rogpeppe, you are free to think I'm wrong about things
<fwereade_> rogpeppe, you are not free to persistently go off and implement WTF you want to in the face of my instructions
<rogpeppe> fwereade_: no, i think you have good points,
<rogpeppe> fwereade_: and i think we want to go in the same direction
<rogpeppe> fwereade_: to get things straight here: your instructions here are to rewrite the upgrader?
<fwereade_> rogpeppe, to (1) do the API runner as originally requested, so rogue tasks don't take the others down; to (2) implement a simpler upgrader as a NotifyWorker against the API, that does a blocking download and returns the appropriate upgrade error when done; and to (3) unhook the state one and activate the machine one, for the machine agent only
<fwereade_> s/the machine one/the api one/
<rogpeppe> fwereade_: ok, thanks
<dimitern> rogpeppe: ping
<rogpeppe> dimitern: pong
<dimitern> rogpeppe: how about that review?
<dimitern> ;)
<rogpeppe> dimitern: reviewed
<dimitern> rogpeppe: thanks
<jtv> Would any of the Go experts here know how I'd go about debugging this crash?  I went through  it with gdb but it seems to be happening somewhere pretty deep down in the malloc logic... http://paste.ubuntu.com/5890497/
<jtv> I don't even know why it'd list two goroutines, since as far as I know this code doesn't even use goroutines.
<dimitern> fwereade_: standup?
<jtv> Phew.  Go 1.0.2 will produce a proper traceback.  The 1.1.1 from the PPA we're using is a dev version, not a released one, right?
<rvba> Anyone up for two really tiny reviews? https://codereview.appspot.com/11404044/ / https://codereview.appspot.com/11511043/
<rogpeppe> rvba: will look after lunch
 * rogpeppe goes for lunch
<rvba> Thanks.
 * TheMue steps out and continues later
<hazmat> rogpeppe, is there some sort of cache around the api?
<hazmat> rogpeppe, i can do actions, like deploy/add_units/add_relation, then do an all watcher immediately after and not get the results back
<hazmat> i.e. the api is behaving as though it's eventually consistent.
<hazmat> hmm.. nevermind maybe user error
<dimitern> looking for a second review on https://codereview.appspot.com/11424044/ and reviews on https://codereview.appspot.com/11572043/
<dimitern> fwereade_: when you can as well ^^
<rogpeppe> hazmat: the events coming from the allwatcher are only delivered when the global polling goroutine gets around to checking for changes
<rogpeppe> hazmat: the poll interval is 5s
<rogpeppe> hazmat: so that's your worst case delay
<rogpeppe> hazmat: although...
<rogpeppe> hazmat: the first result should come immediately
<hazmat> rogpeppe, immediately, but not necessarily reflective of current reality.
<fwereade_> anybody else unable to connect to irc.canonical.com?
<rogpeppe> hazmat: no, it should be reflective of current reality
<hazmat> rogpeppe, because the status api call wasn't fully implemented i did status  emulation on the watcher
<hazmat> using the initial watch response value
<hazmat> but that doesn't always seem to reflect the last ops done
<hazmat> running into this against some unit tests
<hazmat> rogpeppe, here's an example.. http://pastebin.ubuntu.com/
<hazmat> whoops
<hazmat> http://pastebin.ubuntu.com/5890883/
<hazmat> i deploy two services, add a relation, add a unit.. then do a status immediately after, it reflects only the new machines, a watcher initial result doesn't even show the machines.
<hazmat> ideally these should both show the services, units, relations, and machines of previous api ops
<hazmat> for this simple sync usage on the client, watches here use a separate client connection (env.status is a watch translator due to lack of details in rpc status)
<rogpeppe1> hazmat: what does wait_for_units do?
<hazmat> rogpeppe, in this case nothing, because its waiting for units to be in the 'started' state, but it can't see any units, so its basically a no-op here.
<rogpeppe1> hazmat: does it create a new watcher or use an existing one?
<hazmat> rogpeppe, normally its a watch against units agent-state to reach started
<hazmat> rogpeppe, new watcher
<rogpeppe1> hazmat: it seems odd that the first event isn't producing the current status, because it gets the contents of all collections before producing it
<rogpeppe1> hazmat: could you show me the whole code? (or just a trace of the API messages would be even better)
<hazmat> rogpeppe1, hence the question is there some sort of cache behavior?
<hazmat> rogpeppe1, okay..
<rogpeppe1> hazmat: ah...
<rogpeppe1> hazmat: i've just realised
<rogpeppe1> hazmat: yes, of course, there *is* some sort of cache behaviour!
<rogpeppe1> hazmat: because the allwatcher is shared between all clients
<rogpeppe1> hazmat: and we keep a single copy of all the data
<hazmat> rogpeppe1, i thought each client had independent pointer into that data
<rogpeppe1> hazmat: yes, each  client does
<rogpeppe1> hazmat: but the underlying data is a cache that's only updated when new events arrive (by that 5s poll interval)
<rogpeppe1> hazmat: sorry for the mis-info!
<hazmat> rogpeppe1, okay.. so i think the critical fix here then would be for the status op to report full status instead of just machines.
<rogpeppe1> hazmat: i think the status op should provide the whole status, yes
<hazmat> otherwise, there's no way a client can see its mods in the env.
<rogpeppe1> hazmat: and then we can make the status command use that
<rogpeppe1> hazmat: mgz is working on some stuff to enable that
<hazmat> mgz, ping :-)
<mgz> hazmat: hey
<hazmat> mgz, rogpeppe1 mentioned you might be working on cli using api?
<rogpeppe1> hazmat: no, that's not quite it
<mgz> well, mostly as a side effect
<hazmat> specifically i was wondering about status support, since with the client api it's hard to verify ops without status working, given the 5s delay on the watcher.
<rogpeppe1> hazmat: the work mgz is doing makes it possible to get all the status info over the api, but not the status call itself
<mgz> hazmat: if I read the log correctly, your issue is that just using watcher apis you don't get all the status info you could with manual `juju status` calls?
<rogpeppe1> hazmat: you can get info on individual machines i think
<hazmat> rogpeppe1, that's not useful when i'm verifying services, units, relations..
<rogpeppe1> hazmat: one possibility would be to add a Sync call to the API
<hazmat> rogpeppe1, that would help.. but i'm not sure its something we want in the api
<rogpeppe1> hazmat: to sync any watchers
<rogpeppe1> hazmat: but yeah, i think you may be right
<hazmat> rogpeppe1, the watchers are already sync'd; it'd be a DoS api against updating the stream.
<rogpeppe1> hazmat: i don't know how much dos could be done this way that couldn't be done through other api calls.
<hazmat> which to combat would just introduce the delay elsewhere... what this usage needs is non-cached access. hence status
<rogpeppe1> hazmat: yeah
<hazmat> rogpeppe1, well we have fetch-service-info, but no fetch-unit-info or fetch-machine-info afaik
<rogpeppe1> hazmat: ironically my original plan was to do the status API call as the first actual use of the API
<dimitern> rogpeppe1: https://codereview.appspot.com/11572043/ ?
<rogpeppe1> dimitern: i don't think NewAgentAPI is the right place to start the pinger
<hazmat> mgz, no.. the watch apis do give all the status, but they're cached, so it's eventually-consistent behavior, which is causing issues with unit tests; the issue would be solved if the rpc status call actually returned status info, instead of just partial machine info
<dimitern> rogpeppe1: why?
<rogpeppe1> dimitern: because it's called once for each machiner API call
<rogpeppe1> dimitern: so you're quickly going to have many many pingers :-)
<dimitern> rogpeppe1: isn't that the singleton root?
<dimitern> rogpeppe1: for the machiner
<rogpeppe1> dimitern: nope
<dimitern> rogpeppe1: aaah
<dimitern> rogpeppe1: got you!
<dimitern> rogpeppe1: it's the entry point, but for every call
<rogpeppe1> dimitern: yes
<dimitern> rogpeppe1: so then after the CheckProvisioned in admin then
<rogpeppe1> dimitern: in srvAdmin.Login, you mean?
<dimitern> rogpeppe1: yeah?
<rogpeppe1> dimitern: yes, i think that's the right place
<dimitern> rogpeppe1: ok, will move it there
<mgz> hazmat: I see. seems fixable, but not really what I'm poking right now.
<rogpeppe1> dimitern: or perhaps in apiRootForEntity
<dimitern> rogpeppe1: I've got another CL for you as well :) https://codereview.appspot.com/11574044/
<dimitern> rogpeppe1: in apiRootForEntity with the idea that once it becomes a "root factory" it will be more appropriate?
<rogpeppe1> dimitern: yeah
<dimitern> rogpeppe1: good point
<rogpeppe1> dimitern: i've still got that CL around somewhere
<dimitern> rogpeppe1: it got rejected i think
<rogpeppe1> dimitern:  i don't think it was reviewed actually
<dimitern> rogpeppe1: ah, it might be
<rogpeppe1> dimitern: yeah, no reviews: https://codereview.appspot.com/10684044
<dimitern> rogpeppe1: I think fwereade mentioned something he didn't like there, but can't remember details
<rogpeppe1> dimitern: oh, i'd like to know if there was
<rogpeppe1> dimitern: this was the crux of it. your NewPinger code would go into the NewMachineRoot function.
<rogpeppe1> https://codereview.appspot.com/10684044/diff/1015/state/apiserver/facades/machine.go
<dimitern> rogpeppe1: yeah, well we can still do it next week
<dimitern> rogpeppe1: haven't looked into it in detail myself
<rogpeppe1> dimitern: reviewed
<mattyw> I just did an lbox propose but it didn't make a Rietveld code review for me?
<rvba> Anyone up for a review? https://codereview.appspot.com/11404044/
<fwereade> hey guys
<fwereade> I'm about to go through the reviews from the top
<dimitern> rogpeppe1: thanks
<fwereade> rogpeppe1, I thought https://codereview.appspot.com/10684044 looked cool but hit the nice-to-have buttons not the must-have buttons, so I didn't want anyone to take it over
<rogpeppe1> fwereade: yeah, i appreciate that
<fwereade> rogpeppe1, but since it's here, and you're here to respond, I'm on it now (it's at the top after all :))
<rogpeppe1> fwereade: that's appreciated. it wasn't complete though - in particular i hadn't added any more tests
<fwereade> rogpeppe1, ok, give me 5 mins to reload my state on it just in case there was something I didn't like, but I think I'm happy for you to continue with it, just make it a side task not the main one please ;)
<rogpeppe1> fwereade: ta
<mattyw> fwereade, I just submitted a small proposal but lbox propose didn't create a Rietveld code review for me?
<rogpeppe1> fwereade: i haven't touched it since i got back BTW
<fwereade> mattyw, huh, weird
<mattyw> fwereade, I take it we're still using Rietveld
<rogpeppe1> fwereade: a proposal against juju-core?
<fwereade> mattyw, I use +activereviews as my task list anyway so rest assured I'll see it
<rogpeppe1> s/fwereade/mattyw/
<mattyw> fwereade, ok, shall I try to make a Rietveld review for it or just ignore it?
<mattyw> rogpeppe1, yeah juju-core
<rogpeppe1> mattyw: hmm, weird. can you run lbox propose --debug and paste the result?
<rogpeppe1> mattyw: fwiw, rvba has been having some perhaps similar issues with lbox
<mattyw> rogpeppe1, hmm, running debug tells me it can't login - I guess that's the problem ;)
<fwereade> rogpeppe1, yeah, direction looks sane, thanks :)
<rogpeppe1> mattyw: try removing ~/.lpad_oauth* and trying again
<fwereade> oh, incidentally, did everyone see http://foaas.herokuapp.com/
<rogpeppe1> fwereade: no
<fwereade> rogpeppe1, it made me smile
<rogpeppe1> fwereade: me too
<dimitern> LOL
 * fwereade didn't really have a proper lunch and is off to have one now, reviews will have to wait a bit
<mattyw> rogpeppe, got it working without having to resort to removing .lpad_oauth
 * fwereade will probably just have a sandwich at his desk though, so don't fret ;p
<fwereade> jtv, ping
<jtv> Hi fwereade
<fwereade> jtv, heyhey, it looks like you have 2 LGTMs on "Drop startInstanceParams"
<jtv> Ah great, thanks
<fwereade> jtv, unless you particularly want to, please don't feel you need to wait for my review
<jtv> Thans!
<jtv> *Thanks
<fwereade> jtv, thank *you*, I'm really appreciating what you're doing
<dimitern> rogpeppe: all the other client-side facades for workers use the worker name as the facade type name
<dimitern> rogpeppe: and i think it's reasonable as well, for agent facades (currently MA only) State seems more appropriate
<jtv> hi-ho, hi-ho, to tarmac-land we go
<rogpeppe> dimitern: i think they should all use State
<dimitern> rogpeppe: i don't agree
<rogpeppe> dimitern: that was my original plan, agreed with fwereade AFAIR
<dimitern> rogpeppe: why?
<dimitern> fwereade: ?
<rogpeppe> dimitern: because the Machiner state API facade is not itself a Machiner
<rogpeppe> dimitern: it is actually the state
<rogpeppe> dimitern: with a facade in front of it
<dimitern> rogpeppe: it's not state either
<rogpeppe> dimitern: so calling it State in all those places seems like a Good Thing to me
<dimitern> rogpeppe: it's api
<fwereade> rogpeppe, dimitern: I don't *love* state but IMO it's at least consistent with existing usage and as a name for the source of truth
<rogpeppe> dimitern: the API is just a front-end for the state
<dimitern> rogpeppe: so MachinerAPI then
<rogpeppe> dimitern: and naming it State means we don't need to rename all the variables in every piece of client code
<dimitern> rogpeppe: better than State
<dimitern> rogpeppe: that's not a big deal imo
<dimitern> rogpeppe: we don't have to rename them either way
<rogpeppe> dimitern: it means that it's potentially easy to move code from one worker to another
<rogpeppe> dimitern: or factor it out
<fwereade> rogpeppe, dimitern: I'm ok sticking with state to reduce churn, personally, but I don't have a strong opinion either way
<dimitern> rogpeppe: i don't see how this should be something to consider
<dimitern> rogpeppe: facades are specifically done for a worker/agent
<rogpeppe> dimitern: sure. but from the point of view of that agent, it's talking to the state
<rogpeppe> dimitern: so having it called, say machiner.State, seems like it works well to me
<dimitern> rogpeppe: i think it's misleading
<rogpeppe> dimitern: it means that when you read the code, it's fairly obvious that all these pieces of code are talking to the same underlying thing
<dimitern> rogpeppe: maybe not Machiner then, MachinerAPI is actually closer to the intent
<dimitern> rogpeppe: when it has API in its name it's pretty obvious
<rogpeppe> dimitern: i don't see a significant benefit from machiner.API (over machiner.State) (and i'm not keen on machiner.MachinerAPI
<rogpeppe> )
<rogpeppe> dimitern: as fwereade says, it reduces churn, and that seems like a good thing
<rogpeppe> dimitern: i'd like to be able to change all the client code with minimal fuss
<dimitern> rogpeppe: who's going to change all the other facades then?
<dimitern> rogpeppe: which are already using these names
<rogpeppe> dimitern: i'm happy to do it. i've already done the upgrader one.
<dimitern> rogpeppe: ok then
<rogpeppe> dimitern: given that nothing is using them yet, it's not much of an issue
<dimitern> rogpeppe: i *still* don't agree
<dimitern> rogpeppe: but will do it
<dimitern> rogpeppe: 2:1 against :)
<fwereade> rogpeppe, did you assign yourself to the time.Format bug?
<rogpeppe> dimitern: thanks a lot. i do think it makes sense though, if only from a grandfathered-in p.o.v.
<rogpeppe> fwereade: no
<rogpeppe> fwereade: but i have merged a fix
<rogpeppe> fwereade: so should mark the bug fix committed
<rogpeppe> fwereade: have you got the link to hand?
<fwereade> rogpeppe, not offhand -- it's just that sidnei also fixed it but a bit late
<rogpeppe> fwereade: i think mark assigned the bug to me actually
<hazmat> mgz, gotcha.. could you clarify what you're working on?.. from the kanban it's not relatable at all? goamz for extra ip.. (which is vpc support incidentally) and lxc container addressability.
<fwereade> rogpeppe, oh, double-bad luck, sidnei found and fixed it completely independently
<hazmat> not sure how either of those relates to the api bits
<hazmat> i did a branch on that one too (time.format), but couldn't ever get the full test suite to pass..
<mgz> hazmat: specifically I need a mechanism for keeping machine address details updated in state
<fwereade> rogpeppe, I've marked it fix committed
<rogpeppe> fwereade: thanks
<mgz> which seems easiest to do by monitoring machines in the process of coming up and polling over the cloud's api for changes
<dimitern> rogpeppe: i've updated this as well https://codereview.appspot.com/11572043/
<rogpeppe> dimitern: looking
<mgz> then everything else can just watch state for address stuff, rather than having to go run something provider specific at the unit level
<rogpeppe> dimitern: reviewed
<dimitern> rogpeppe: thanks
<fwereade> rvba, a question about https://codereview.appspot.com/11578043/
<fwereade> mattyw, I think that one's a trivial, go ahead and merge it
<fwereade> mattyw, tyvm
<mattyw> fwereade, trivial is my middle name
<fwereade> mattyw, haha
<mattyw> fwereade, I don't think I can merge it, I only have the option to say it's already merged in lp
<fwereade> mattyw, ok, I've approved it for you
<fwereade> mattyw, keep half an eye on it just in case the bot doesn't like something
<mattyw> fwereade, thanks :)
<jtv> Oh no.
<jtv> Our branches are spending too long in review.  Getting really hard conflicts popping up in the meantime.
<fwereade> dimitern, https://codereview.appspot.com/11424044/ LGTM, but ponder my comments before you merge
<dimitern> fwereade: thanks; looking
<rogpeppe> fwereade, dimitern: https://codereview.appspot.com/11586043
<dimitern> fwereade: let's rename it to Nonce when we have UA connecting the same way?
<rogpeppe> dimitern: will we need a nonce for the unit agent?
<dimitern> rogpeppe: fwereade mentioned that
<dimitern> rogpeppe: looking
<rogpeppe> fwereade: +1 to omitempty - i nearly mentioned that
<dimitern> rogpeppe: will do
<fwereade> dimitern, let's rename it *now* if we're going to
<fwereade> dimitern, it may be that we never will
<fwereade> dimitern, but I suspect that one day units will be able to move from one machine to another
<dimitern> fwereade: will we?
<dimitern> fwereade: I prefer not to complicate more that CL renaming all MachineNonce instances to Nonce
<fwereade> dimitern, I remember thinking they might need it but I can't remember why
<dimitern> fwereade: but can do a follow up
<fwereade> dimitern, ok, so long as we don't release with MachineNonce in the api, or we'll never get away from it ;p
<dimitern> fwereade: why?
<dimitern> fwereade: it's just an argument
<dimitern> a field even
<fwereade> dimitern, that some clients will use one version of, and other clients will use another version of
<dimitern> fwereade: ok, will do a follow-up after the third branch has landed
<fwereade> dimitern, cool
<dimitern> fwereade: and rename *all* MachineNonce to Nonce (even at top-level in agent.Conf)
<dimitern> fwereade: or you meant only the one in API Info actually?
<rogpeppe> dimitern, fwereade: fairly trivial: https://codereview.appspot.com/11589043/
<fwereade> dimitern, I really just meant the one in api info
<fwereade> dimitern, agent.conf is already compatibility-infected, best not touch it unless we have to
<dimitern> fwereade: ok
<dimitern> fwereade: if it's only that one, then I guess it's no bigger change to do it in the current CL
<fwereade> dimitern, that'd be great
<fwereade> dimitern, tyvm
<dimitern> rogpeppe: how will that make the api nicer?
<fwereade> rogpeppe, LGTM
<TheMue> rogpeppe: first review ;)
<fwereade> dimitern, check out params.AgentTools
<dimitern> rogpeppe: i really don't get that CL
<dimitern> sorry
<rogpeppe> dimitern: a version will be encoded as a nice string rather than several fields. params.AgentTools can also just use version.Binary rather than explicitly redeclaring all the version fields.
<fwereade> dimitern, it lets us just have a version.Binary in a params struct, and get it sent nicely over the wire, rather than as like 7 distinct fields
<rogpeppe> fwereade: thanks
<dimitern> rogpeppe, fwereade: ah! I see, thanks
<fwereade> dimitern, https://codereview.appspot.com/11572043/ also LGTM with one tweak
<fwereade> dimitern, and https://codereview.appspot.com/11574044/ LGTM too
<dimitern> fwereade: tyvm
<fwereade> right, that's 6pm: I've survived the week and I'm going to lie down in a park for a bit
<fwereade> I'm on holiday mon-weds but will probably be around mon am a bit
<rogpeppe> fwereade: enjoy
<fwereade> so I can talk to whoever's here about where we're going with environment config
<rogpeppe> fwereade: i'm also off now
<fwereade> happy weekends all!
<rogpeppe> fwereade: i've done the upgrader but not tested it...
<fwereade> rogpeppe, sweet, tyvm
<rogpeppe> g'night all
<dimitern> fwereade: why the need to test whether the pinger is a resource?
<TheMue> fwereade: enjoy your holiday
<dimitern> fwereade: I can add a test that after disconnecting it's no longer alive - will that be enough?
<dimitern> fwereade: ah, sorry didn't see your previous msgs
<TheMue> so, time for me to step out too
<TheMue> enjoy your weekend, dimitern
<dimitern> TheMue: thanks, you too!
<TheMue> dimitern: will do so, tomorrow invited to a breakfast and later to a bbq ;)
<benji> gary_poster: I'll take one
#juju-dev 2013-07-21
<thumper> morning
<davecheney> thumper: you doing ok mate ?
<thumper> davecheney: kinda
<thumper> davecheney: about to take the kids to a movie and work from town for a bit
<thumper> I'm going to a cafe while they watch a movie
<davecheney> thumper: ok
#juju-dev 2014-07-14
<davecheney> seriously, the replica set tests never pass on my machine
<davecheney> they pass in CI
<davecheney> so my care factor is < 1.0
<thumper> all tests just passed locally for me
<davecheney> yet here we are
<davecheney> thumper: i just found a rogue mongodb binary
<davecheney> deleted, let's see if that improves things
<thumper> :)
 * thumper takes the kids and dog out for a walk in the sun
<rick_h__> thumper: let me know if you've got a few min please
<sebas5384> has someone already talked about bundle upgrade?
<rick_h__> sebas5384: ?
<sebas5384> hey rick_h__!
<sebas5384> for example, let's say I deployed a bundle topology into production
<sebas5384> and then, i change something, like a new charm related to an existing deployed charm
<sebas5384> is this already taken care of by juju?
<rick_h__> sebas5384: yea, to do that we have to update Juju to track and know about services/services that were part of the bundle.
<rick_h__> bundles are kind of only an outside idea atm, we're working on pushing the bundle idea deeper into Juju
<sebas5384> great to hear that rick_h__
<sebas5384> because we are using juju with a CD tool
<rick_h__> CD tool?
<sebas5384> continuous delivery tool
<sebas5384> striderCD
<rick_h__> ah k
<sebas5384> it's a CD tool made in nodejs
<rick_h__> ah cool, we almost used that for CI
<sebas5384> ooh nice
<rick_h__> I had some chats with them about our desired github integration/etc
<sebas5384> hmm
<sebas5384> yeah they integrate very well with github
<sebas5384> but I understand the tool is in early stages
<rick_h__> yea
<sebas5384> so the thing is we are planning to do a juju plugin
<sebas5384> for the stridercd
<sebas5384> so you can relate a bundle to a branch
<sebas5384> and then to an environment
<sebas5384> so if I add a charm into the bundle yaml file, it should update my environments
<rick_h__> right, to do that properly, you need to track a lot of info that the env doesn't currently have
<rick_h__> we're working on making bundles support revisions and allowing them to live in the charm store right now
<sebas5384> yeah I thought about that
<rick_h__> it's the first baby step into that data/tracking you're looking for
<sebas5384> its related to the revision API ?
<sebas5384> nice!
<rick_h__> well the idea is that you want to be able to tell a diff from what's changed.
<sebas5384> thats right
<rick_h__> and the store will be able to at least provide you rev 3 bundle and rev 4. Though really you probably just want to know what's different from your current env and the updated bundle
<rick_h__> there's a bunch of that stuff to work out
<sebas5384> of course
<sebas5384> and I saw the new GUI, with the commit changes
<rick_h__> commit changes?
<sebas5384> hmmm i don't know what to call it
<rick_h__> ah ok. Well if you've got any gui questions I'm your guy on that end
<sebas5384> but I saw that we are going to have a staged state, so then you have to "commit" those changes
<sebas5384> yeah I know!!! hehe
<rick_h__> ah, yes. That's what we call the 'deployer bar' and it is part of the machine view work
<rick_h__> so you can run colocation through the GUI
<sebas5384> i'm really excited about the new approaches you guys are developing
<rick_h__> cool, glad ot hear it
<sebas5384> congrats on that :D
<sebas5384> talking about the machine view
<sebas5384> when i imagined how the gui for that should be
<sebas5384> something like what we have with app bundles on our phones
<sebas5384> but man, the work that's being done is awesome!! :)
<rick_h__> sorry, not following 'apps bundles' phrasing
<rick_h__> oh, you mean something like a box for each machine?
<rick_h__> yea, I'll be very glad to get it out there for sure
<sebas5384> yeah!, something like that
<rick_h__> we looked hard at that, but the UX doesn't scale well
<rick_h__> especially when you get into lxc/kvm containers inside machines
<sebas5384> when you drag an app on top of another (on your phone) they join all together in a bundle
<rick_h__> and in 100's of machines those boxes get hard to manage, too big. No good way to go through them quickly/etc
<rick_h__> yea, that's really kind of limited in size/scope
<sebas5384> yeah I agree
<rick_h__> but anyway our UX team spent many months going through design ideas and I think what we've got will be really nice
<sebas5384> yeah for sure man!
<rick_h__> a lot of ideas left on the cutting room floor
<sebas5384> yeah I can imagine
<sebas5384> but it seems not to be like the charm view, you know?
<rick_h__> right, that's why there's a toggle up top. You have the service view, and the machine view
<rick_h__> we'll work on adding other views in the future I think
<sebas5384> great
<sebas5384> so talking about this gui's things
<sebas5384> we are studying about a Juju as a service
<sebas5384> so for that you would have to manage a lot of environments, and some kind of a web gui for the juju quickstart tool
<sebas5384> there are some other ideas like charm factory, etc... but juju as a service is a real need for us right now
<sebas5384> are there plans for something like that?
<rick_h__> interesting. it'd be interesting to get an idea of what you're looking for. What are the features/requirements and reasons for them.
<sebas5384> rick_h__: sorry for throwing all these questions at you hehe
<rick_h__> sebas5384: all good, happy to help/chat
<sebas5384> rick_h__: :)
<rick_h__> sebas5384: there's some work to go in this cycle, I think it was emailed to the -dev list after our vegas sprint?
<sebas5384> really? didn't see it
 * rick_h__ goes to check archive
<sebas5384> I saw a guy from adobe asking for something like that
<sebas5384> and then the conversation was focused on the api's I think
<rick_h__> bah, can't find what I was thinking of
<sebas5384> it's ok :)
<rick_h__> https://lists.ubuntu.com/archives/juju-dev/2014-May/002500.html is a little old thread that might be of interest. There's some work to help with the stuff you're asking about.
<sebas5384> :)
 * sebas5384 reading about
<thumper> rick_h__: hey
<thumper> whazzup?
<rick_h__> yea, worth tracking if you have an interest in this stuff
<rick_h__> thumper: have a doc for you to peek at and a bug to put into your brain
<thumper> rick_h__: hmm... ok
<rick_h__> thumper: not asking for anything today :)
<sebas5384> rick_h__: I see, but my idea is more for the gui
<sebas5384> though the idea of sharing envs is awesome
<sebas5384> I have this vision of making it so easy for people to deploy and orchestrate their topologies as a drag and drop action
<sebas5384> and juju-gui makes this happen!
<sebas5384> so it should be easy to get a juju gui configured environment for you with some clicks and filling in some input fields with the provider's info
<thumper> davecheney: with you in 5 min, just need to get toast
<rick_h__> sebas5384: right, but the GUI relies on the power of Juju to function, so what you'll find is that we add basic support for ideas into juju and then grow on top of it with the GUI
<rick_h__> sebas5384: but I'd love to get your use cases down to make sure we're thinking about them and addressing them. Maybe you've got ideas we've not thought about or some ideas we think are only kind of useful are very useful to you
<rick_h__> sebas5384: so feel free to email the juju list, or me, or the gui list, and we can collect your input.
<sebas5384> yeah for sure rick_h__ I love brainstorming
<sebas5384> and iterating on top of feedback
<sebas5384> it's the most agile way to develop an app
<rick_h__> but for now it's time for me to get a shower and head to bed. Night from the EST
<sebas5384> it avoids wasted features and focuses on what really adds value to the client
<sebas5384> rick_h__: nice to talk to you, and thanks for your attention :)
<sebas5384> rick_h__: good night to you man!
<davecheney> thumper: ok
 * thumper is eating quickly
<davecheney> uniter_test.go:813:
<davecheney>     c.Assert(result, gc.DeepEquals, params.StringsWatchResults{
<davecheney>         Results: []params.StringsWatchResult{
<davecheney>             {StringsWatcherId: "1", Changes: []string{
<davecheney>                 firstAction.Id(),
<davecheney>                 secondAction.Id(),
<davecheney>             }},
<davecheney>         },
<davecheney>     })
<davecheney> ... obtained params.StringsWatchResults = params.StringsWatchResults{Results:[]params.StringsWatchResult{params.StringsWatchResult{StringsWatcherId:"1", Changes:[]string{"wordpress/0_a_1", "wordpress/0_a_0"}, Error:<nil>}}}
<davecheney> ... expected params.StringsWatchResults = params.StringsWatchResults{Results:[]params.StringsWatchResult{params.StringsWatchResult{StringsWatcherId:"1", Changes:[]string{"wordpress/0_a_0", "wordpress/0_a_1"}, Error:<nil>}}}
<davecheney> [LOG] 0:00.471 INFO juju.provider.dummy reset environment
<davecheney> [LOG] 0:00.471 INFO juju.state.apiserver [30] user-admin API connection terminated after 204.572645ms
<davecheney> [LOG] 0:00.476 INFO juju.testing reset successfully reset admin password
<davecheney> [LOG] 0:00.490 INFO juju.testing reset successfully reset admin password
<davecheney> [LOG] 0:00.494 INFO juju.testing reset successfully reset admin password
<davecheney> [LOG] 0:00.494 ERROR juju.state.apiserver.common error stopping *state.actionWatcher resource: state has been closed
<davecheney> OOPS: 50 passed, 1 FAILED
<davecheney> --- FAIL: Test (22.56s)
<davecheney> FAIL
<davecheney> FAIL    github.com/juju/juju/state/apiserver/uniter     22.721s
<sebas5384> davecheney: pastebin dude! :)
<davecheney> eeek
<davecheney> menn0, thumper: nag for the OCR, https://github.com/juju/juju/pull/305
 * thumper should check to see if he is OCR
<thumper> davecheney: https://github.com/juju/juju/pull/290 if you feel like returning the favour
<davecheney> thumper: i can do that
<thumper> cheers
 * thumper has another dependent branch ready
 * davecheney deletes many duplicate worker/mockConfig types
<jam> dimitern: morning, be there in just a sec
<dimitern> jam, morning, me too :)
<TheMue> morning
<dimitern> TheMue, morning
<jam> morning TheMue
<natefinch> jam: morning
<natefinch> heh, gonna be a quiet day, since everyone from the US core devs except me gets a swap day today
<TheMue> natefinch: so go to bed again
<TheMue> natefinch: morning btw
<TheMue> natefinch: ;)
<natefinch> Heh, it's ok, maybe I'll get some work done today
<jam> morning natefinch
<jam> natefinch: I thought you wouldn't be in today, sorry I missed you
<jam> (I have our standup in 10 min)
<natefinch> jam: it's ok, no problem
<natefinch> jam: I don't get a swap day today, since I didn't go anywhere :)  Just a regular week driving into the office.  It's funny, I forgot how much I dislike that :)
<jam> natefinch: driving to work every day?
<natefinch> yeah
<natefinch> It's like 45 minutes in traffic, which is not the end of the world, but pleh.  What a waste of time.
<natefinch> (The sprint was great, don't get me wrong, only the driving was a waste ;)
<TheMue> natefinch: yes, and if you have that on all working days, simply count up the lost time (besides the additional cost)
<jam> dimitern: TheMue: vladk: standup?
<TheMue> jam: already coming ;)
<jam> natefinch: did the week go well other than the commute time ?
<dimitern> jam, sorry, brt
<natefinch> jam: The work week was great.  But at home, my A/C had a leak (yet again).  Such is life.
<bodie_> morning all
<TheMue> bodie_: morning
<natefinch> morning bodie_
<bodie_> addressed TheMue 's comments on PR 301, pushing them up in a moment here
<bodie_> https://github.com/juju/juju/pull/301 should be good to go
<bodie_> would appreciate LGTM!
<TheMue> bodie_: I'm looking again.
<TheMue> bodie_: It was already ok, apart from the comments you've now addressed
<bodie_> :)
<bodie_> TheMue, I mentioned this in the comments but fwereade had indicated he really wanted a refactor of the uniter context and runhook
<bodie_> I'm working on a (pretty bloated by this point) PR to address a lot of that which splits hook context and hides a lot of content inside the new types
<jam1> natefinch: just to remind you guys that I'm on
<jam1> on-call IRC master today
<jam1> so ping either jam1 or jam if you have questions
<natefinch> jam1: thanks.  I think you can probably bail today.  Everyone except me has a swap day today.
<natefinch> (everyone in the US that is)
<TheMue> bodie_: LGTM
<bodie_> TheMue, I still need another lgtm, right?  I'm never quite certain when I need two
<TheMue> bodie_: one is ok
<bodie_> cool, thanks!
<jam1> natefinch: dang, I was hoping you meant I got a swap day, too :)
<vladk> dimitern: I've committed https://github.com/juju/juju/pull/121
<dimitern> vladk, thanks, I'll have a look
 * TheMue too
<dimitern> vladk, re your question about tests - using statetesting.AssertOneChange() and so should be enough, as these methods internally call StartSync
<natefinch> jam1: haha
<bac> allenap: in your vast azure experience did you ever look at using floating ips?
<TheMue> dimitern: would you take a look at https://github.com/juju/juju/pull/306 too? I'm now using dynamic addresses here.
<dimitern> TheMue, sure, in a bit
<TheMue> dimitern: thx
<dimitern> TheMue, LGTM
<TheMue> dimitern: ta (and jam1 too) ;)
<dimitern> vladk, reviewed
<dimitern> TheMue, please take a look https://github.com/juju/juju/pull/307
<dimitern> when you can
<TheMue> dimitern: sure, just seen the mail :)
<dimitern> TheMue, cheers! :)
<TheMue> dimitern: oh, ONLY 24 files :D
<dimitern> TheMue, heh, changes to agent/environ config have this effect yes :)
<TheMue> dimitern: yeah, some are that extensive
 * dimitern needs to step out, bbl
<alexisb> jog, welcome to juju-dev!
<jog> thanks alexisb, happy to be here!
<TheMue> jog: heya from me as well (from Germany)
<jog> hi TheMue, thanks
<alexisb> natefinch, jam1: looks like we have a github defect open:
<alexisb> https://github.com/juju/juju/issues/304
<alexisb> rick_h__ pointed it out to me
<TheMue> dimitern: if you're back, you've got a review
<bodie_> trying to understand a yaml error
<bodie_> goyaml*
<bodie_> "could not find expected directive name"
<bodie_> as I understand it, directives are optional
<TheMue> bodie_: where exactly you get it?
<natefinch> alexisb: pleh, looks like those instance types were just added July 1
<natefinch> (to AWS, not juju)
<bodie_> http://paste.ubuntu.com/7794085/ TheMue
<bodie_> TheMue, perhaps this is a different error -- it indicates the problem is on line 13
<bodie_> which doesn't seem to exist
<TheMue> *click*
<TheMue> bodie_: could you also paste a code snippet?
<bodie_> yeah... let me just shoot you the real thing highlighted on github
<TheMue> ok
<bodie_> TheMue, https://github.com/binary132/juju/blob/367f2b557fa09665bc77bff482e609ea9ac6be3b/worker/uniter/uniter_test.go#L267-L283
<bodie_> that's where the yaml is getting generated
<bodie_> I suspect the issue might be on line 279, but that's based on something that is known working
<tasdomas> when trying to run 'juju debug-log' with trunk, I get: ERROR cannot open log file: open /var/log/juju/all-machines.log: no such file or directory
<tasdomas> is there a way to view the juju log by ssh'ing into the state server?
<bodie_> TheMue, perhaps looking at my spaghetti code tests isn't the best way to spend your time ;)
<bodie_> http://paste.ubuntu.com/7794138/ will perhaps be more helpful (TheMue)
<TheMue> tasdomas: juju ssh 0
<TheMue> bodie_: hehe, will look there too
<bodie_> the error is shown on line 1346 of that output
<tasdomas> TheMue - that part I know, which log files do I look in?
<TheMue> tasdomas: in that directory /var/log/juju take a look at machine-0.log
<tasdomas> TheMue, thanks - that file seems to be not a complete replacement for the debug-log
<TheMue> tasdomas: no, that's the all-machines.log. debug-log is just the command for viewing it
<TheMue> tasdomas: local provider?
<tasdomas> TheMue, ec2
<TheMue> tasdomas: astonishing
<tasdomas> TheMue - unrelated question - there is a way to connect to the juju mongo instance, right?
<TheMue> tasdomas: in which way? with a cli client?
<TheMue> bodie_: just for info, I'm still reading code and log
<tasdomas> TheMue, yes, from within the state server
<TheMue> tasdomas: afaik not, only using a coded client, like we do with mgo
<tasdomas> TheMue, thanks
<TheMue> tasdomas: sorry for having no better info
<TheMue> tasdomas: do you have services deployed or only bootstrapped?
<tasdomas> TheMue - I have services deployed, but I'm trying to debug an ec2 deployment
<TheMue> tasdomas: the wanted all-machines.log is an aggregation of machine-0.log, machine-1.log etc
<tasdomas> TheMue, yes, I understand that
<TheMue> tasdomas: that's why I asked about deployed services. it's strange you don't find an all-machines.log then.
<TheMue> bodie_: in your created output, line 11, why is outfile in square brackets?
<bodie_> TheMue, it's my understanding that yaml permits lists in square brackets
<bodie_> required is expected to be a list
<bodie_> hmm
<bodie_> I think the bit: content := fmt.Sprintf(actionsYamlFull, filepath.Base(actionsYamlPath))
<bodie_> was the issue
<bodie_> I've got it validating now
<TheMue> bodie_: that's not what I expect when getting a YAML prefixed error
<bodie_> I'm not sure I understand the utility of that clause
<TheMue> bodie_: I only know YAML lists written with - on new lines
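For reference, goyaml's "could not find expected directive name" typically means a line starts with a stray `%` that is not a valid YAML directive — exactly the kind of damage an unescaped `%` fed through `fmt.Sprintf` can produce, which matches the fix bodie_ lands on below. The square brackets TheMue asks about are fine on their own: flow-style and block-style lists are equivalent YAML (illustrative snippet, not the actual test fixture):

```yaml
# flow style, as in the pasted output
required: [outfile]

# block style, equivalent
required:
  - outfile
```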
<jam> natefinch: fwiw T2 are going to be interesting, because they come with a lot of different constraints. They are cheaper, but must be in VPC (cannot be used in Classic mode), they "accrue" CPU credits that they can use to burst, etc
<jam> They actually sound like a decent fit for the juju state server (if you consider that it usually goes quiet once everything is stable)
<jam> but the networking around them means they aren't just like other instances
<natefinch> jam: very interesting.  I hadn't looked into all the differences.... the CPU credits is kind of an interesting idea
<allenap> bac: It rings a bell. It might all come back if you want me to look at something.
<bac> allenap: looking into using floating ip on azure it seems to not be possible.  i was hoping to verify that with someone with lots of azure expertise.
<bac> allenap: but we can chat about it tomorrow, since you're well past eod
<bodie_> anyone know when fwereade will be around for me to pester?
<bodie_> mgz will do as well
<allenap> bac: I'm around now, but tomorrow would be better. Too many things to do this evening!
<bac> allenap: sure
<bac> tomorrow is better
<bac> allenap: perhaps have a look at http://msdn.microsoft.com/en-us/library/azure/dn690120.aspx tomorrow and we can chat around 1300UTC
<Egoist> Hi
<Egoist> Is there any way to look how many units is in service, from hook code?
<thumper> ha... thought it was quiet
<thumper> had forgotten to start the irc client
<ChrisW1> blah
<thumper> o/ ChrisW1
<ChrisW1> okay, so what does '0/' mean?
<ChrisW1> o/, even
<ChrisW1> seen it several times tonight, feeling out of touch
<sebas5384> what would be the better approach for charm scaling?
<sebas5384> things like sharing files using nfs or glusterfs came to my mind
<sebas5384> i found this storage charm but I don't know if it's a good option
<sebas5384> marcoceppi: would you recommend the gluster-server and gluster-client charms for scaling?
<voidspac_> wwitzel3: ping
<sebas5384> ok, looking over ceph now
 * thumper needs a git master
<thumper> I need to unfuck a branch
<cmars> thumper, mind if I join onyx standup today? also, I'm not too shabby at git
<thumper> cmars: go ahead
<ChrisW1> thumper: what's your git problem?
#juju-dev 2014-07-15
<davecheney> o_O
<davecheney> func (cfg *MachineConfig) addMachineAgentToBoot(c *cloudinit.Config, tag, machineId string) error {
<davecheney> is the tag _not_ the machine id !?!?
<thumper> davecheney: this is in my area
<thumper> I'm tweaking things around there
<thumper> let me look
 * thumper has a headache
 * thumper need more coffee
<thumper> does anyone remember the git equivalent of uncommit?
<cmars> thumper, git reset
<thumper> cmars: will that leave me the changes still?
<cmars> thumper, depends on how you use it
<thumper> I want the changes
<cmars> thumper, i need to look up the flags for that case
<mwhudson> if you want your head to hurt, read the docs on what git reset --mixed does
<mwhudson> (it's not what's wanted here though)
<mwhudson> er
<mwhudson> --merge
 * thumper steadies himself to read the git man page
<mwhudson> thumper: you want --mixed or --soft i think
<mwhudson> they differ on whether the changes are left staged or not
<thumper> git reset --soft HEAD^
<mwhudson> yar
<mwhudson> the help on git reset --merge is real http://git-man-page-generator.lokaltog.net/ stuff
<thumper> ah actually, it is that followed by another
<thumper> git reset
<thumper> the latter takes the files from the index back into uncommitted state
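The uncommit recipe thumper settles on can be demonstrated end to end in a throwaway repo (assuming a standard git install; paths and commit messages here are made up for the demo):

```shell
#!/bin/sh
set -e

# Throwaway repo to demonstrate "uncommit" keeping the changes.
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git config user.email demo@example.com
git config user.name demo

echo one > f; git add f; git commit -qm first
echo two > f; git add f; git commit -qm second

git reset --soft HEAD^   # drop the last commit, keep its changes staged
git reset -q             # unstage, leaving the changes in the working tree

test "$(git rev-list --count HEAD)" -eq 1   # only the first commit remains
grep -q two f                               # the second commit's change survives
```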
<thumper> mwhudson, cmars: what is the equivalent of "bzr pull --overwrite" ?
<thumper> I want to reset my checkout to be tip of master
<thumper> davecheney: actually, the machine config isn't in my area
 * thumper steps away
 * davecheney has reached the point that tests do not pass, at all, on my machine
<davecheney> yet they pass 100% in CI
<davecheney> whom am I to believe !?!
<thumper> davecheney: huh?
<thumper> what is failing?
 * thumper ran master again just today with no failures
<mwhudson> thumper: git reset --hard origin/master
<mwhudson> (making sure you are in your local master branch first, probably)
<thumper> mwhudson: ah... thanks, will do that next time
<thumper> I ended up with the changes I needed, but not the history I wanted
<thumper> but good enough
<rick_h__> reset --hard won't save the current changes or uncommit a previous commit
<thumper> rick_h__: sure, but I was asking then about an equivalent for "bzr pull --overwrite"
<rick_h__> thumper: oh going off the backlog you asked for uncommit that kept the changes
<thumper> rick_h__: I have "git reset HEAD^ && git reset" for uncommit
<thumper> with a --soft on the first one
<rick_h__> right
<rick_h__> ok cool then
<rick_h__> ah, didn't read far enough down the log my bad
<thumper> rick_h__: np
<thumper> I'm slowly getting my git legs
<rick_h__> woot
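mwhudson's `bzr pull --overwrite` equivalent can be demonstrated in a throwaway clone (assuming a standard git install; repo names are made up for the demo):

```shell
#!/bin/sh
set -e

# Throwaway upstream + clone to demonstrate discarding local history
# and resetting to the remote tip.
tmp=$(mktemp -d)
cd "$tmp"

git init -q upstream
cd upstream
git config user.email demo@example.com
git config user.name demo
echo v1 > f; git add f; git commit -qm base
git branch -m master        # ensure the branch is called master
cd ..

git clone -q upstream local
cd local
git config user.email demo@example.com
git config user.name demo
echo local-change > f
git commit -qam diverge     # a local commit we now want to throw away

git reset -q --hard origin/master   # back to the remote tip, like bzr pull --overwrite
test "$(cat f)" = v1
```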
<davecheney> thumper: https://github.com/juju/juju/pull/309
<davecheney> let's see if this one fits throught ci
<thumper> davecheney: as a note, agentConfig.Tag() currently could be a machine tag or a unit tag
<thumper> davecheney: until we merge the agents at least
<davecheney> thumper: yup
<davecheney> i've only found one case where agentconfig.Tag() could be a unit
<davecheney> once that case is removed, we can tighten up the types and remove the runtime checks
<thumper> davecheney: aarrgghhh
<thumper> davecheney: why are we testing agents with user tags?
<thumper> that's dumb
<thumper> I can see it wasn't you
<thumper> but geez
<davecheney> why are we testing with _invalid_ tags ?
<davecheney> that is the bigger headdesk
<davecheney> agentConfig{tag: "machine-tag"}
<davecheney> ^ not valid
<davecheney> no sir
<thumper> "omg" not valid either
<davecheney> thumper: nope
<davecheney> thumper: consider yourself nagged to nag fwreade to make a decision on the errors package
<thumper> davecheney: I feel that we will make most progress with that next week when we can nag in person
<thumper> davecheney: with sticks
<davecheney> thumper: noted, consider yourself nagged
 * thumper feels nagged
 * davecheney is afraid what happens when I change those tests to not take user tags ...
<davecheney> nothing good, i'll bet
<davecheney> agent_test.go:410: c.Assert(err, gc.IsNil)
<davecheney> ... value *errors.Err = [{/home/dfc/src/github.com/juju/juju/agent/agent.go:274: entity tag must be MachineTag or UnitTag, got names.UserTag}] ("entity tag must be MachineTag or UnitTag, got names.UserTag")
<davecheney> argh!!!
<davecheney> that's what I get for adding a test
<davecheney> thumper: this is correct, right ?
<davecheney> users cannot be agents
<davecheney> or more specifically you cannot have an agent that represents a user
<davecheney> only a machine or a unit, yes ?
<thumper> yes
<thumper> for what we currently have
<thumper> yes
<thumper> agents configs are only for machine agents and unit agents
<thumper> maybe...
<thumper> one day later
<thumper> we may have something else
<thumper> but more likely that we'll only have machine agents
<thumper> and they may do something on behalf of a user
<thumper> but the config is only for the machine
<davecheney> ok
<davecheney> i'll make sure that agent.ConfigParams only accepts tags of that type
<davecheney> add tests and fix the other ones that are passing a user tag 'cos they are dumb
<thumper> I am like 99% sure of that :-)
<davecheney> close enough for government work
<thumper> I would fire up a local provider
<thumper> and deploy something
<thumper> before committing to trunk :)
<davecheney> will do
<thumper> magic
<thumper> juju/api.go:250
<thumper> check that the variable isn't null after calling a function on it that needs it to be not null
 * thumper will fix...
<thumper> ish
 * thumper goes to make coffee
<thumper> ENEEDSCAFFEINE
<davecheney> thumper: shuld I write tests that assert _only_ Units and Machines can host agents ?
<davecheney> or is one test, i.e. that a User Tag won't work, sufficient?
<thumper> I don't think you need to test all possible tags.
<davecheney> thumper: cool
<thumper> I'd test that machines and units work, and users don't
<davecheney> i've tested that users won't fit through the filter
<davecheney> there are shitloads of tests for machines
<davecheney> i'll add one for units and put a bow on it
<davecheney> FOR FUCKS SAKE
<davecheney> agent/agent_test.go
<davecheney> the inspect config logic isn't used
<davecheney> you can write anything in there and the test passes !
<davecheney> my uniter tests have been unreliable since the actions stuff landed
<davecheney> http://paste.ubuntu.com/7796927/
<davecheney> i smell a race
<davecheney> or a logical failure there
<jam> wallyworld: I'm looking at https://github.com/juju/juju/pull/282/files and I'm trying to see the logic change that fixed the bug. I see a bunch of changes towards error propagating, but not something that changed suppressing an error treating it as 'we don't have any instances'
<wallyworld> jam: it's difficult to see (and explain). perhaps easiest to have a quick chat?
<jam> wallyworld: ah, the "default" case
<wallyworld> yeah
<jam> wallyworld: I just couldn't see where the actual logic changed because of all the other error handling changes, but I see it now.
<wallyworld> if we get empty maps back out of that (instead of an error), we're fucked downstream
<wallyworld> jam: yeah, the error handling was a drive by to better log the errors
<jam> wallyworld: yeah, I'm fine having it, just made it hard to locate the fix. FWIW, I don't feel like I knew about the critical customer-facing issues, should it have been escalated to canonical-juju@ at least?
<wallyworld> jam: that default case is the root cause but the actual bug is then triggered several api calls distant from that so it's hard to see the causal link
<jam> wallyworld: sure, it is a case where we suppressed an error (accidentally, perhaps, though we thought we could continue at that point)
<jam> wallyworld: I'm a little concerned that we might get an Unknown error legitimately (because that machine really has been removed) and the provisioner will start failing with errors.
<wallyworld> jam: yeah i guess it should have gone to canonical-juju. has that been our policy previously? i  had assumed that folks who cared were across it because of the ubuntu advantage bug raised
<jam> because we don't have any sort of knowledge about what error should and shouldn't be propagated
<jam> wallyworld: *I* haven't been able to keep up with the bug backlog, and if I haven't then likely a good portion of the team hasn't. But I may just be biased.
<wallyworld> jam: i'd have to look at the code again, but the unknown error is handled i think
<jam> wallyworld: yeah, it was outside the diff
<jam> IsCodeNotFoundOrUnauthorized is one of the cases
<wallyworld> yeah
<jam> wallyworld: k, I'm happy then. I think if we handle the errors we know about, and then puke on the rest, that is reasonable
<jam> wallyworld: I feel like for escalated issues, a canonical-juju post is reasonable. Thoughts?
<wallyworld> jam: fair point about the bug backlog etc. we've had a maybe 3 customer facing issues with 1.20 (some raised by cts).
<davecheney> jam: +1
<wallyworld> yeah i think so
<jam> wallyworld: I just saw the "as you are aware", and I realized that I wasn't aware at all :)
<davecheney> jam: and keep driving people back to the LP issue
<davecheney> juju issues are tracked on LP
<wallyworld> where was the "as you are aware" ?
<davecheney> let's keep that message simple
<jam> wallyworld: in Martin's post to canonical-juju
<davecheney> and avoid people starting their own hit list of issues in google sheets, et al
<jam> davecheney: I'm perfectly happy having it be a link to "here's an escalated bug we need more visibility on"
<wallyworld> jam: i think that was directed at martin :-) but yeah, point taken
<jam> wallyworld: from Martin: I am sending this out to the broader team, based on Robbie's guidance. As you are aware, we have encountered a serious software defect with juju-core,
<davecheney> jam: what is 'it'
<jam> davecheney: the email to canonical-juju can certainly just be a link back to the LP bug.
<wallyworld> jam: oh, right, i misremembered the content, sent so many emails today
<davecheney> jam: +1, yes
<davecheney> exactly
<jam> wallyworld: I think my point is that trying to raise availability on a bug by raising another bug doesn't really help. :)
<jam> raise awareness
<jam> (visibility)
<wallyworld> you mean the public bug linked with the private one?
<davecheney> wallyworld: jam agreed
<wallyworld> that was for a separate but related issue
<davecheney> juju bugs are tracked on launchpad
<davecheney> that should be the constant message here
<jam> wallyworld: your comment was "a u-a bug raised got all the involved parties more visibility", I'm not sure that it brought awareness to the greater juju team like a canonical-juju post would.
<wallyworld> jam: with that comment, i meant customer facing stakeholders
<wallyworld> martin was implying he wasn't across the issues, but he should have been
<jam> wallyworld: I guess my point is that this sort of bug is sufficient that at least all team leads should be aware, and canonical-juju seems the easiest way to escalate it.
<wallyworld> that's not to say that others in the canonical juju community wouldn't have benefited from a post to canonical-juju
<jam> Regardless of what Martin said in his email, this seems serious enough that I would have liked to know about it, and I'm not sure there was a way that isn't "read all your bug mails"
<wallyworld> fair enough, sorry
<jam> wallyworld: that said I *really* appreciate that you and curtis really are tracking it well enough.
<jam> I feel like I should be, but I feel a bit inundated
<wallyworld> we're getting there. 1.20 certainly wasn't the best release
<wallyworld> jam: it's not on your plate to do it - tanzanite is the release manager this cycle
<jam> wallyworld: well, reading bugs can give you a pulse on what's going on with the greater ecosystem, it has the downside that we have a fair amount of bug churn that *doesn't* need to be read.
<jam> Makes me wonder about the groupbuzz post recently
<wallyworld> jam: these juju/mongo connection issues have always been there (part of the original implementation), just seems that they've gotten worse with 1.20, due perhaps to the replicaset stuff being turned on
<jam> where you can easily opt-in/out of a given message thread
<jam> with sane defaults
<jam> wallyworld: perhaps, but they ran into this with 1.18, didn't they?
<wallyworld> jam: yes they did. just seems subjectively worse with 1.20
<wallyworld> there are more places where it bites us
<jam> wallyworld: it seems very weird to have this hit 2 places within short succession and not have hit before
<jam> with a release that is reasonably old now
<wallyworld> jam: some of the 1.20 issues were due to new code added to jujud to deal with replicaset start up
<jam> wallyworld: I do think https://github.com/juju/juju/pull/282/files#diff-edccfba67a01587c9faca9185781e5dbR285 should be backported to 1.18
<wallyworld> jam: i'm inclined to agree. but i'm not yet +1 on backporting any change to safe mode default value
<jam> wallyworld: yeah, I'm a bit confused about bug #1339770
<jam> as it seems to be saying safe-mode, but is actually where you did the "don't kill things when we can't list machines"
<jam> wallyworld: do you want to change the description of that bug to be clearer why it is fix released in 1.20.2 ?
<jam> wallyworld: I think your point about changing safe mode default is perfectly sane. It would only effect new deployments anyway.
<jam> (unless we add an upgrade to 1.18.X that toggles the setting)
<wallyworld> jam: i already changed the bug description once
<wallyworld> it used to be "Make provisioner-safe-mode defaults to True on MAAS provider."
<jam> wallyworld: that's the title
<wallyworld> i changed it to reflect the actual fix that was done
<jam> the description is still "We should change the provisioner-safe-mode config entry to True on MAAS provider, so in any case when the state server goes down, juju will not trigger the MAAS release commands."
<wallyworld> ah right, doh
<wallyworld> yeah, ok
<TheMue> morning
<jam> morning TheMue
<TheMue> jam: heya
<dimitern> morning
<TheMue> dimitern: morning
<dimitern> TheMue, thx for the review
<TheMue> dimitern: yw, liked it
<tasdomas> is there any way to deploy juju (on ec2) with a mongodb that is accessible by the mongo client?
<TheMue> jam: I'm afk for some time, won't be back for standup. after finishing the flaky tests I'm now working on LXC templates for IPv6.
<wallyworld> fwereade: you around?
<fwereade> wallyworld, hey dude
<wallyworld> hey. do you have time to talk?
<fwereade> wallyworld, let's
<wwitzel3_> wallyworld, fwereade: I'm here as well
<wallyworld> see you in the tanzanite standup hangout
<fwereade> wwitzel3_, cool
<fwereade> wallyworld, doesn't seem to be on my calendar -- link?
<wallyworld> https://plus.google.com/hangouts/_/canonical.com/tanzanite-stand
<fwereade> wallyworld, cheers
<psivaa> hello, i'm having a situation in local deployments.. where i have 'default-series: precise' in environments.yaml for 'local' but when i run 'juju bootstrap' i see 'uploading tools for series [trusty]'
<psivaa> this makes precise based deployments fail to deploy.. the agent-state always states 'pending' for them
<psivaa> and the machine-0.log states the following errors: http://paste.ubuntu.com/7797841/
<psivaa> to workaround this we've had to bootstrap with '--upload-tools  --series precise,trusty' during bootstrap
 * fwereade lunch, might be a bit late back
<katco> good morning all
<wwitzel3_> morning katco
<katco> wwitzel3: thanks for a great sprint. it was great to meet you :)
<natefinch> jam1, wallyworld, alexisb: anyone around to talk about the current critical bugs?
<wallyworld> natefinch: i'm here but it's late so may not make much sense
<wallyworld> natefinch: you talking 1.20 or 1.21?
<wallyworld> we have 1.20 covered, apart from the licensing one
<wallyworld> katco: try adding a card now, i've updated your permissions
<katco> wallyworld: ty sir; trying now
<katco> wallyworld: looks like that did it!
<wallyworld> katco: great :-)
<katco> wallyworld: oh and happy birthday to you son! (and axw)
<wallyworld> lol
<wallyworld> my son turns 20 on thursday but we are having dinner etc on wednesday
<katco> cool :) what is the legal drinking age there?
<wallyworld> 18 :-D
<katco> haha so 21 is not a big deal
 * rick_h__ starts to refer to wallyworld as 'old man' :P
<katco> well, 20 is a nice even number :)
<wallyworld> rick_h__: this is a family channel or else i'd tell you to ..... ******
<rick_h__> wallyworld: see, you're not thaaaaat tired. You filtered nicely.
<katco> wallyworld: oh also should bugs 1319474 and 1319475 be assigned to "next stable release" now?
<wallyworld> katco: yeah. too risky trying to get the goamz stuff done this week
<katco> wallyworld: i agree.
<wallyworld> katco: when we first started looking at those bugs, it seems plausible we could get something done but then it sorta got hard
<wallyworld> katco: your family happy to see you again after a week away?
<katco> wallyworld: oh god yes. my wife about tackled me.
<katco> wallyworld: and then handed me a baby and walked away.
<wallyworld> lol, i know how that goes :-)
<wallyworld> i used to travel when my kids were young too
<katco> it's hard! but i'm glad i went
<katco> how was the flight back?
<wallyworld> long
<katco> =/
<wallyworld> but i got home just in time for the 2nd half of the world cup final
<katco> lol nice
<katco> that was an amazing goal
<katco> JITed goal ;)
<wallyworld> yes it was. wish i had that skill
<wallyworld> you nerd :-P
<katco> haha you know it
<katco> glad to be on a team of nerds; i love it!
<wallyworld> i'm sad at myself i actually loled at that
<katco> haha
<katco> nerd is a compliment to me. we make the world go round!
<wallyworld> katco: so you're ok to pick up those bugs for today, not blocked?
<katco> wallyworld: yes, should be good to go
<wallyworld> great. we can talk about juju-gui tomorrow or something
<katco> okie doke; martin said something about just aliasing the juju bootstrap command or something?
<katco> (do not need to answer now actually)
<wallyworld> ok :-)
<katco> go sleep!
<wallyworld> soon :-)
 * rick_h__ perks up his ears...gui?
<wallyworld> rick_h__: we are going to deploy juju-gui auto magically in a juju environment
<rick_h__> wallyworld: ooh shiny
<wallyworld> as we discussed on the call last week
<rick_h__> wallyworld: right, let me know if you hit any issues/questions
<wallyworld> yeah, will be niiiiice to quote borat
<wallyworld> will do
<allenap> bac: Can we delay our meeting by 1h?
<katco> rick_h__: your team made a nice web-app :)
<rick_h__> katco: the team rocks and likes shiny things
<katco> rick_h__: haha... are you guys looking at polymer at all?
<rick_h__> katco: looking, but if we were to do anything we'd layer on top of react
<katco> rick_h__: cool; haven't had a chance to play with it myself
<rick_h__> but all the tools are only partial answers to our various problems and it's kind of ugh
<rick_h__> at the tool/library proliferation
<katco> rick_h__: yeah, i have seen that first-hand
<bac> allenap: i have another then
<bac> allenap: but even later would work
<bac> allenap: 1530Z?
<rick_h__> katco: then again I'm a cranky old man like wallyworld so I naturally hate all the new shiny stuff without proper build/test/integration solutions in place already
<katco> rick_h__: sorry, not sure if that translated; i have seen that 1st hand at another company, not on the juju GUI
<bac> wallyworld: is bug 1316174 on your radar? it affects juju-quickstart's ability to provision an environment and it doesn't look like there is a work-around.
<_mup_> Bug #1316174: 'precise-updates/cloud-tools' is invalid for APT::Default-Release <juju-core:Triaged> <https://launchpad.net/bugs/1316174>
<rick_h__> katco: all good, I think I followed
<wallyworld> bac: no, hadn't seen that
<bac> wallyworld: well, you triaged it.  :)
<wallyworld> bac: ah, sorry, i meant the quickstart flow on effect
<allenap> bac: Cool.
<bac> wallyworld: if juju-core is not installed, quickstart will install juju-core and juju-local before continuing.  those two are no longer sufficient due to this bug
<wallyworld> bac: i must admit, without digging a bit more, i'm not 100% sure what the issue is as i did think juju-local was sufficient
<bac> wallyworld: just wanted to bring it to your attention. have a good evening
<wallyworld> bac: thanks, i'll assign the bug to 1.21-alpha1 so we at least guarantee to look at it before the next release
<wallyworld> hopefully "look at it" = "fix it"
<bac> ty
<katco> seeing some behavior i don't understand with godeps -u... it wasn't pulling down the commit it says it should; i had to do a go get -u. has anyone experienced that? am i doing something wrong?
<natefinch> katco: you need to do godeps -f -u to get commits that aren't on your local machine.  -u will fail if you reference a commit that hasn't been pulled down already
<natefinch> (why this is not just the default behavior, I don't know)
<katco> natefinch: ah shoot that's right
<katco> natefinch: thank you
<natefinch> katco:  welcome :)
<katco> does anyone see any issues with setting the owner of the machine.x.logs to syslog:syslog?
<TheMue> natefinch: hmm, can't find where to see who's allowed to merge into the juju repo
<voidspace> morning all
<katco> good morning voidspace
<voidspace> katco: morning
<voidspace> katco: good to be back with the family?
<voidspace> katco: dumb question I know :-)
<ericsnow> voidspace: a day too soon, no? :)
<katco> voidspace: definitely!
<voidspace> ericsnow: I'm doing a couple of hours today as my wife has an ultrasound tomorrow that I'd like to be at
<ericsnow> katco: :)
<ericsnow> voidspace: nice
<voidspace> ericsnow: :-)
<katco> this might be a matter of opinion, but after i'm reasonably sure i have a good fix, should i run the tests for just the sub-module, or the entire juju project?
<katco> voidspace: congrats again!
<voidspace> katco: thanks
<katco> this is great... i know who people are now. :)
<voidspace> katco: it makes a big difference doesn't it
<katco> voidspace: it really does
<katco> eric's now not just a name!
<ericsnow> katco: haha
<bac> allenap: are you free now?
<allenap> rvba: Are you free to talk to bac now?
<rvba> allenap: yep
<bac> allenap: oh, i didn't know you were bringing backup
<allenap> bac: Yes indeed, because my brain is hopelessly sieve-like :)
<allenap> bac: Anyway, https://plus.google.com/hangouts/_/canonical.com/azure?v=1404955657
<bodie_> https://github.com/juju/juju/pull/311 review would be appreciated!
<alexisb> wwitzel3, ping
<alexisb> natefinch, ping
<TheMue> bodie_: *click*
<bodie_> And another -- https://github.com/juju/juju/pull/312
<wwitzel3> alexisb: back
<alexisb> are you going to be on the tosca call?
<alexisb> I have a conflict
<wwitzel3> alexisb: yeah, I was just getting logged in now
<alexisb> cool
<alexisb> I will get a debrief from you then
<alexisb> I will try to join when I am done with this other call
<alexisb> thanks wwitzel3 !
<natefinch> alexisb: here now
<alexisb> natefinch, I was just making sure we had coverage on the tosca call
<alexisb> wwitzel3, is on
<natefinch> alexisb:  I'll hop on
<katco> https://github.com/juju/juju/pull/313
<voidspace> natefinch: so, I've tried a different approach to "session copying"
<voidspace> natefinch: this time first looking at the watchers rather than the transaction runners
<voidspace> natefinch: and again run into the "auth failed" error
<natefinch> voidspace: ug
<natefinch> voidspace: isn't today a swap day for you?
<voidspace> natefinch: didn't you already reply to my email about that :-)
<voidspace> natefinch: it is, I'm putting a couple of hours in today
<voidspace> natefinch: my main point of attack on this is assuming that the mongo password is being changed after the session is created
<voidspace> natefinch: and the session isn't updated
<voidspace> natefinch: before I deep dive into that, does that sound likely to you?
<voidspace> or does anything else occur to you about it
<voidspace> natefinch: anyway, I'm updating my coffee first
<voidspace> I also wonder how this work of untangling session use from watchers will intersect with the "persistence layer" refactoring that wwitzel3 and wallyworld are embarking on
<natefinch> voidspace: I'm really not sure.  I think you've done a lot more spelunking on it recently than I have.
<voidspace> natefinch: ok, cool
<voidspace> natefinch: I just didn't want to embark on a path that was "obviously dumb" to someone with more knowledge
<voidspace> natefinch: so I'm just running it by you as a sanity check
<voidspace> I'll continue the investigation
<natefinch> voidspace: doesn't sound dumb to me
<voidspace> good, thanks
<TheMue> bodie_: 312 is reviewed
<bodie_> thanks TheMue !
<TheMue> bodie_: yw
<katco> what is the use case for this code branch? why would we want to continue bootstrapping if an environment is already found? https://github.com/juju/juju/blob/master/cmd/juju/common.go#L40
<katco> also, anyone? https://github.com/juju/juju/pull/313
<rick_h__> katco: just as an fyi, not sure if this fits into what you're looking at or not 17:26    balboah| ~balboah@air.joonix.se has joined #juju
<rick_h__> bah
<rick_h__> katco: https://bugs.launchpad.net/juju-core/+bug/1336843
<_mup_> Bug #1336843: bootstrap without a jenv destroys an existing environment  <bootstrap> <juju-core:Triaged> <https://launchpad.net/bugs/1336843>
<katco> rick_h__: hm, looks plausibly related
<natefinch> katco: it's not just for bootstrap, it's also used by sync-tools
<katco> rick_h__: different code-path, but probably very related... maybe i can fix that too if i get time
<rick_h__> katco: rgr, just as an fyi as it seemed close
<katco> rick_h__: yeah tyvm
<katco> natefinch: remind me what sync-tools does again
<natefinch> This copies the Juju tools tarball from the official tools store (located
<natefinch> at https://streams.canonical.com/juju) into your environment.
<natefinch> This is generally done when you want Juju to be able to run without having to
<natefinch> access the Internet. Alternatively you can specify a local directory as source.
<natefinch> katco: juju help sync-tools :)
<katco> natefinch: ahhh! ok, ty. i will have to think on that a bit
<marcoceppi> What is "safe-mode" for juju?
<voidspace> marcoceppi: I'm not familiar with it, but the implication from the email thread is that juju won't destroy instances for you in safe mode - even if it thinks they're unused
<voidspace> marcoceppi: that's what it sounded like anyway
<voidspace> I'm EOD
<marcoceppi> That's what I gathered as well, I just want to update the documentation
<voidspace> g'night all until tomorrow
<katco> voidspace: tc
<voidspace> marcoceppi: ah... you'll want a more authoritative answer
<voidspace> katco: o/
<katco> i could use a review to see if i'm on the right track: https://github.com/katco-/juju/compare/lp-1340893-bootstrap-destroy
<katco> this is for https://bugs.launchpad.net/juju-core/+bug/1340893
<_mup_> Bug #1340893: juju bootstrap in an existing environment destroys the environment <bootstrap> <canonical-is> <juju-core:In Progress by cox-katherine-e> <https://launchpad.net/bugs/1340893>
<tasdomas> I'm trying to debug a txn.Op that works in tests but for some reason fails when I deploy a juju environment running that code on ec2
<thumper> morning
<alexisb> morning thumper
<wallyworld> katco: hi, time for a quick catch up?
<alexisb> wallyworld, you should be sleeping
<wallyworld> alexisb: it's 7:30am here now :-)
<alexisb> o ok
<alexisb> wallyworld, fyi, I found out today that the license bug is holding up the server group release
<alexisb> I sent mail but that is going to get hot given that customers are looking for the destroy env fix that was put in 1.20.1
<wallyworld> alexisb: yeah,  need to talk to you about that. potentially very non trivial. can we chat after the testing meeting?
<alexisb> wallyworld, sure
<alexisb> what ever support you need, we will have to find a way to make it happen
<wallyworld> yeah, will be "fun"
 * thumper needs moar caffeine
<sinzui> thumper, https://bugs.launchpad.net/juju-core/+bug/1342106
<_mup_> Bug #1342106: add-machine fails in recent commit <add-machine> <ci> <manual-provider> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1342106>
<thumper> sinzui: looking
<thumper> sinzui: can we talk through this? hangout?
<sinzui> thumper, okay
<thumper> sinzui: https://plus.google.com/hangouts/_/canonical.com/manual-fail
<wallyworld> sinzui: you already on another hangout?
<voidspace> wallyworld: hey, hi
<wallyworld> hey
<wallyworld> late for you
<voidspace> wallyworld: did you see my latest email? I made some progress.
<wallyworld> voidspace: not yet, been on other stuff, let me look
<voidspace> wallyworld: when I corrected my code to *actually* return the state opened with the changed password those auth failures went away
<voidspace> wallyworld: so progress
<voidspace> wallyworld: that's pretty much it
<voidspace> wallyworld: I have two branches, I'll combine them tomorrow and push and then we can discuss it
<wallyworld> voidspace: fuck, do you know how much i hate := vs =
<voidspace> wallyworld: heh
<voidspace> me too now...
<wallyworld> let's talk tomorrow, we can pick up where you get to
<voidspace> yep, 1am here so going to bed
<voidspace> g'night
<wallyworld> night and thank you
#juju-dev 2014-07-16
<thumper> twice in two days I'm happy I disabled pushing to upstream
<thumper> as it has saved my arse
<wallyworld> ?
<wallyworld> sinzui: you finished with tim?
<thumper> wallyworld: yeah, I'm sufficiently whipped
<wallyworld> lol
<thumper> sinzui: https://github.com/juju/juju/pull/314
<thumper> menn0: hangout now or after lunch?
<menn0> thumper: now is fine with me
<thumper> menn0: https://plus.google.com/hangouts/_/canonical.com/upgrades
<wallyworld> davecheney: what's the status of the potential mgo bug? have you managed to reproduce it in a way that a bug can be filed?
<wallyworld> sinzui: you got time for a quick chat?
<jrwren> menn0: haven't caught you online in a week-ish.  About that regex... I found it an interesting difference between go regex and perl/python. \d matches unicode digits in python.  Thanks for the discussion
<menn0> jrwren: yeah, sorry. I've had a several days off over the last week and a bit
<menn0> jrwren: thanks for following up on the regex differences. good to know!
<jrwren> no worries. I just meant to pick up that conversation with you again.
<sinzui> wallyworld, I am available now
<wallyworld> sinzui: great, wanna rejoin the hangout from before?
<wallyworld> https://plus.google.com/hangouts/_/canonical.com/juju-release
<davecheney> wallyworld: no update
<davecheney> sorry
<davecheney> other bugs to fix
<davecheney> i can see there is an issue where mongo eats the error
<davecheney> but haven't investigated further
<wallyworld> davecheney: otp, will ping you soon
<davecheney> wallyworld: noted
<wallyworld> davecheney: it's becoming critical we get this issue looked at upstream. is there any way you could get enough info together to file a bug?
<davecheney>         if strings.HasPrefix(tag, "environment-") {
<davecheney>                 return ops, nil
<davecheney>         }
<davecheney> ark!
<davecheney> wallyworld: i'll send you what I have
<davecheney> which isn't much
<davecheney> wallyworld: http://paste.ubuntu.com/7801230/
<wallyworld> davecheney: i was hoping you'd be able to progress it to the point of filing the bug. do i need to ask thumper for some of your time?
<davecheney> ^ that's it
<davecheney> wallyworld: yes
<davecheney> i am working on the Ci blocking windows bug
<wallyworld> thumper: consider yourself asked
<wallyworld> it can come after the CI blocking bug is fixed
<davecheney> wallyworld: fyi, i'm wasting^h^h^^h^h^h^h spending all today setting up a windows dev environment
<davecheney> going back to the mgo bug
<wallyworld> oh, you have my sympathies :-(
<wallyworld> davecheney: i'd rather a kick in the bollocks than having to deal with that
<davecheney> what I see is when we have test failures, they are usually accompanied by the additional debugging in the patch above firing
<wallyworld> ok. so seems like it would be possible to package up a bug report which illustrates or at least explains the problem
<wallyworld> in enough detail that it hopefully is uncontentious to fix
<davecheney> all i can show is the symptoms at the moment
<davecheney> thumper: state/annotator.go insertOps
<davecheney> ^ me shakes head
<davecheney> wallyworld: thumper, do I even want to ask if there is windows support for bzr ?
<thumper> davecheney: yes
<thumper> this is
<davecheney> noice
<davecheney> hmm,
<davecheney> that wasn't too painful
<davecheney> no idea how i'm going to get a version of mongodb with ssl support
<thumper> aah fuck
<thumper> wallyworld: you around for a chat?
<wallyworld> sure
<thumper> wallyworld: https://plus.google.com/hangouts/_/gsz5xroyjpu4ty7ryxodhbqirya?authuser=1&hl=en
<davecheney> wallyworld: thumper https://github.com/juju/utils/pull/10
<wallyworld> davecheney: +1, i assume you know the windows stuff :-)
<davecheney> wallyworld: https://github.com/juju/juju/pull/315
<wallyworld> davecheney: LGTM
<davecheney> sinzui: fix is landing for the win32 build failure
<davecheney> is there a way I can tickle the CI apparatus to trigger a build ?
<davecheney> omg
<davecheney> if juju doesn't build from the trunk of juju/utils I will throw a shoe
<davecheney> hmm
 * thumper dashing out for a bit
<thumper> bbs
<wallyworld> jam: hiya, no idea why this can't automatically merge, i'll do it by hand i guess, but could you please look at https://github.com/juju/errors/pull/3
<jam> wallyworld: because unless mgz set up the jobs only juju/juju is automated by the bot?
<jam> wallyworld: you can't click the button because it finds a conflict between master and your branch
<wallyworld> jam: but i just pushed up my branch just then
<wallyworld> so i'm not sure what the conflict is
<wallyworld> i branch off tip of master, add the licence changes, pushed, and create the pr off that branch
<wallyworld> i can merge by hand i guess
<jam> wallyworld: so I'm having trouble finding the git syntax to copy just your fix-licenses branch into my local repository, when I figure that out, I'll let you know if I can find the merge issue
<wallyworld> ok
<wallyworld> jam: i'm not so worried about the merge issue for now, i can just merge by hand if needed
<wallyworld> jam: this one can be merged, created the same way https://github.com/juju/errgo/pull/7
<jam> wallyworld: $ git merge --no-commit --no-ff 890f25ff011baceede953804330b590cbac89c83
<jam> CONFLICT (modify/delete): internals.go deleted in HEAD and modified in 890f25ff011baceede953804330b590cbac89c83. Version 890f25ff011baceede953804330b590cbac89c83 of internals.go left in tree.
<jam> Auto-merging annotation.go
<jam> Automatic merge failed; fix conflicts and then commit the result.
<wallyworld> jam: ok, thanks. i'll just delete that file. not sure why it got left there when I pulled from upstream
<jam> wallyworld: your 'fix-licenses' version is not from the tip of juju/errors/master
<jam> wallyworld: looking at the graph it is based off of "also update the readme"
<jam> which doesn't include "use deep equals" and "merge pull request from howbazaar ..."
<wallyworld> oh, bollocks. could have sworn i branched off master
<wallyworld> ok, thanks, will fix
<jam> wallyworld: you probably branched off your *local* master
<jam> you have to do
<jam> git co master
<jam> git pull origin master
<jam> git co -b fix-licenses
<wallyworld> sigh. i hate git
<wallyworld> jam: fixed now, thank you
<jam> wallyworld: happy to help, I got to learn a few more git commands, though probably that just means next time I need it I'll be googling again :)
<wallyworld> lol, yeah i have to do that too
<wallyworld> it never occurred to me i had branched off an old master
<wallyworld> jam: here's a really trivial one. after this one, i need to branch the oher sub repos because 1.20 and master revs are different https://github.com/juju/schema/pull/2
<jam> wallyworld: as in you want a review? LGTM
<wallyworld> jam: thanks, i really didn't need a review for that i guess
<wallyworld> jam: guess who drew the short straw in having to fix the licence balls up?
<wallyworld> :-(
<jam> wallyworld: clearly Martin
<jam> I actually haven't seen mgz/bz2/… around in the last couple of days, is he on vacation? or just swap days after the sprint?
<wallyworld> just swap days
<wallyworld> should be back today
<jam> wallyworld: yeah, I think for the licensing fixes, as long as we know we want LGPL then it is just trivial and you can just merge them.
<wallyworld> jam: btw, the sub repos are still merged by hand, hence i'm not too worried about desc, title should be self evident
<tasdomas> is there a way to create a juju environment with a mongo instance that is accessible by the mongo client?
<tasdomas> I've run into a strange problem where a txn works in tests, but seems to be applied only partially in an ec2 environment
<jam> tasdomas: the plan is to explicitly restrict that (I think today you can technically get to it, but you have to have the right user/password)
<jam> tasdomas: if you really need it, I would use SSH and port forwarding
<tasdomas> jam, what do you mean by ssh and port forwarding? Forward the port for mongo?
<jam> ssh $MACHINE_0 -L 37017:localhost:37017 and then you connect to your local machine at 37017
<jam> mongo localhost:37017/juju
<jam> you'll need the username and password from machine-0 IIRC
<wallyworld> davecheney: in utils repo, there's a zfile_windows.go that says it has been automatically generated. was that checked in by mistake do you think?
<jam> 'machine-0' I believe is the username, but the password is randomly generated and in /var/lib/juju/agents/machine-0/agent.conf
<tasdomas> jam, thanks
<jam> tasdomas: I've had hard times getting mongo to play well with authenticated connections
<jam> you *might* need to connect to the admin db first to login, and then switch
<tasdomas> jam, the issue I am trying to debug is this: http://paste.ubuntu.com/7801746/
<tasdomas> a []txn.Op runs without any apparent errors, but the end result does not match what should actually have happened
<jam> tasdomas: offhand I don't know what Assert: d- means
<tasdomas> jam, it's txn.DocMissing
<jam> tasdomas: so the Insert there has a pointer rather than a detailed struct?
<jam> IIRC you can't do stuff to objects created earlier in the same transaction (I could be completely wrong)
<jam> but if the first step was creating the doc, it doesn't exist for you to add the item to its set.
<jam> tasdomas: but *my* mongo + transactions knowledge is pretty weak. You're better off chatting with fwereade if you and he can overlap in time.
<tasdomas> jam, hm, thanks for the suggestion
<jam> wallyworld: have you done any TXN stuff?
<tasdomas> jam, the weird part is that those ops work in tests and that the whole transaction does not abort
<wallyworld> jam: in what context?
<jam> I thought it was you who commented on being able to do stuff later in a transaction to stuff earlier
<wallyworld> jam: the asserts are evaluated once at the start of the txn
<wallyworld> so ops that happen later on can't have asserts that depend on previous ops in that same txn
<wallyworld> kinda sucks :-(
<wallyworld> but that's how mgo driver has been written
<tasdomas> wallyworld, thanks
<wallyworld> np
<jam> tasdomas: the asserts I see there are stuff like "value ne dead"
<jam> which nil != dead, right?
<tasdomas> jam, I think so
<jam> tasdomas: anyway, my guess is that the "if the doc doesn't exist insert it" is succeeding, but it isn't letting the "add this value to the set" work
<jam> so maybe if you change the inserted doc to include the new port?
<jam> I really don't know
<jam> but there aren't any asserts there that would fail
<jam> thus the transaction will succeed
<tasdomas> jam, yeah - that's what I'm rewriting this to now
<jam> tasdomas: I would have thought that you could create and then update an object, but maybe the ordering of a transaction isn't really guaranteed, and thus it could try to apply the update before the create.
<jam> and thus, can't do anything
<tasdomas> jam, makes sense
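The semantics wallyworld describes above (all asserts are evaluated against the state as it was before any op applied) can be sketched with a toy evaluator. The names mirror gopkg.in/mgo.v2/txn, but this is a simplified model of the behavior described in the chat, not the real driver:

```go
package main

import "fmt"

// Toy model of the mgo/txn semantics discussed above: every Assert is
// checked against the database state *before* any op is applied, so an
// op later in the slice cannot rely on a document inserted earlier in
// the same transaction.
type assertKind int

const (
	assertNone assertKind = iota
	docMissing // like txn.DocMissing
	docExists  // like txn.DocExists
)

type op struct {
	id     string
	assert assertKind
	insert bool // if true, the op inserts the doc
}

func run(db map[string]bool, ops []op) error {
	// Phase 1: all asserts are checked against the pre-txn state.
	for _, o := range ops {
		exists := db[o.id]
		if o.assert == docMissing && exists {
			return fmt.Errorf("transaction aborted: %s already exists", o.id)
		}
		if o.assert == docExists && !exists {
			return fmt.Errorf("transaction aborted: %s does not exist", o.id)
		}
	}
	// Phase 2: only then are the ops applied.
	for _, o := range ops {
		if o.insert {
			db[o.id] = true
		}
	}
	return nil
}

func main() {
	db := map[string]bool{}
	err := run(db, []op{
		{id: "ports#0", assert: docMissing, insert: true}, // create the doc
		{id: "ports#0", assert: docExists},                // update it in the same txn
	})
	// Aborts: the docExists assert saw the pre-txn state, where the doc
	// did not yet exist.
	fmt.Println(err)
}
```

This is why "insert the doc, then add to its set" in one transaction is fragile, and why folding the new value into the inserted doc itself (as tasdomas ends up doing) sidesteps the problem.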
 * thumper groans while studying code...
<jam> thumper: don't you groan for pretty much anything? :)
<thumper> jam: in this particular instance I'm groaning because what I thought would be simple, isn't
 * thumper ungroans
<davecheney> wallyworld: no, that is correct
<wallyworld> ok
<davecheney> it is generated then committed
<wallyworld> thanks
 * thumper seems to be running around in circles
<thumper> ok, stopped running around in circles
<thumper> moar tests tomorrow
<thumper> menn0: I'm getting close :)
<thumper> hmm if dimitern is arriving, time to leave
 * thumper waves
<dimitern> :D
<jam> morning dimitern
<dimitern> morning jam
<dimitern> (which reminds me to have some breakfast :)
<dimitern> TheMue, jam, vladk, others? review on https://github.com/juju/juju/pull/318 much appreciated - finalizing prefer-ipv6 flag implementation
 * dimitern needs to step out for 1/2 h
<jam> dimitern: quid-pro-quo? https://github.com/juju/juju/pull/317
<TheMue> morning
<TheMue> dimitern, jam: will take a look into both
<TheMue> jam: ping
<jam> TheMue: yes?
<jam> morning, btw
<TheMue> jam: on GH where can I see who is allowed to merge code via $$merge$$? or is everybody allowed?
<TheMue> jam: otherwise Iâll merge an external provided code (like Iâve done on LP last week) ;)
<jam> TheMue: AFAIK the people who are listed as part of the "juju" team (with their membership being public) are allowed to vote $$merge$$.
<jam> TheMue:  https://github.com/orgs/juju/members is the member list
<TheMue> jam: ah, ok, thx, will take a look
<jam> TheMue: so you mean if you review code, you want to check if the user can vote merge for themselves?
<jam> that would probably be the list
<TheMue> jam: yes, exactly. the code is reviewed. the contributor is not on the list, so I'll merge it
<TheMue> …oooOOO( After adding a card, learned from last time. *smile* )
<Egoist> Hello
<Egoist> why does juju execute the same hook a few times on the same unit?
<fwereade> Egoist, hi, depends what hook you mean
<fwereade> Egoist, and when
<fwereade> Egoist, https://juju.ubuntu.com/docs/authors-charm-hooks.html
<TheMue> jam: youâve got a review
<jam> fwereade: hey, good to see you around.
<jam> fwereade: I was hoping you might be able to find time to help me with cloudsigma reviews now that you're back
<jam> TheMue: are you seeing user photos for github users being broken and replaced by text strings? Or is it just me?
<TheMue> jam: hehe, already wanted to ask the same. it's broken here too. already wondered.
<jam> TheMue: yeah, I thought maybe they updated something so I went to upload a new image and it tells me "we can't use that image"… I stopped there.
<jam> dimitern: I reviewed your patch https://github.com/juju/juju/pull/318
<TheMue> jam: funnily the organization images are shown, but not the user images
<jam> bodie_: ping
<dimitern> jam, thanks!
<Egoist> fwereade, i mean -relation-changed, when add more unit to service
<Egoist> fwereade, it's a little strange because a hook that has already finished starts again on the same unit
<fwereade> Egoist, relation-changed is always specific to a remote unit
<fwereade> Egoist, if there are 5 units of service S and 2 units of service T
<fwereade> Egoist, each unit of T will see a -joined and a -changed for each of the 5 units of S
<fwereade> Egoist, the $JUJU_REMOTE_UNIT var tells you which one you're responding to
<Egoist> fwereade, yeah, but when relation-changed has finished on every related unit, it should be over, and until a new unit is added, relation-changed should do nothing
<Egoist> right?
<fwereade> Egoist, it'll also fire whenever a remote unit writes new settings
<fwereade> Egoist, if you're writing a relation-changed hook you need to look at what that remote unit has set, and you probably need to respond to it
<Egoist> ok, get it :)
<TheMue> ouch, looks like GH is down now
<Egoist> fwereade, but that can't be right: it's executing relation-get every ten seconds, and it doesn't stop :/
<fwereade> Egoist, so you're seeing relation-changed firing again and again forever?
<fwereade> Egoist, can you describe the situation a bit more?
<fwereade> Egoist, eg just the conversation between two particular units you're having trouble with?
<Egoist> fwereade, no, it happens when i attach another unit to the service
<Egoist> this new unit fires relation-changed again and again, and it won't stop
<Egoist> it's hard to describe; in relation-changed i just get data from the relation and config, use that data to make changes to the installed software on the unit, and basically that's it
<Egoist> fwereade, and relation-changed returns true at the end
<jam> fwereade:  or mattyw: as OCR, care for a reasonably trivial review: https://github.com/juju/utils/pull/13
<mattyw> jam, why has that file been renamed?
<jam> mattyw: because _linux is only compiled on linux
<mattyw> ^^ (not sure what other question I could ask)
<jam> which isn't darwin
<jam> (OS X)
<mattyw> jam, ok
<jam> or BSD or whatever else for that matter
<fwereade> Egoist, well it will certainly fire it once for every remote unit, and it may do so more if the settings are changing -- are you seeing it again and again for the same remote unit? or cycling through all the remote units? and are the settings for the relevant units definitely not changing?
<jam> but with "posix" it should compile everywhere that has Posix semantics, which should be just fine for a 'symlink' thing.
<mattyw> jam, ah sorry - github wasn't drawing your description for me
<mattyw> I see it now
<jam> mattyw: how nice of it
<jam> thanks mattyw
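For context on the rename just discussed: Go treats filename suffixes like `_linux.go` or `_windows.go` as implicit build constraints, and there is no `_posix` suffix, so code meant for all Unix-likes usually drops the suffix entirely (or uses a build tag). A trivial runnable illustration; the `suffixFor` helper is mine, not from the utils repo:

```go
package main

import (
	"fmt"
	"runtime"
)

// Filename suffixes such as _linux.go compile only for that GOOS, which
// is why a _linux file is excluded on darwin (OS X) and the BSDs. There
// is no _posix suffix, hence the neutral filename in the PR above.
func suffixFor(goos string) string {
	return "_" + goos + ".go"
}

func main() {
	// e.g. files ending in _linux.go build on linux, _darwin.go on OS X
	fmt.Println("files matching", suffixFor(runtime.GOOS), "build on this platform")
}
```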
<Egoist> fwereade, i make a little change in code, and it stopped, so maybe it was a bug in code
<fwereade> Egoist, could be -- thanks for letting me know
<jamespage> dimitern, around? is it possible to have a MAAS environment use an interfaces/bridge other than br0?
<jamespage> for LXC/KVM instances?
<dimitern> jamespage, afaik there's a setting you need to tweak for that
<fwereade> jamespage, the network-bridge config setting might be what you're looking for
<dimitern> jamespage, fwereade, right
<jamespage> fwereade, dimitern: that sounds like the one
<mattyw> davecheney, feel free to land it - or at least try to
<Egoist> is there any way to get how many units are in a service, but not from the command line? i need that in code
<fwereade> Egoist, not really -- you should expect that the number could change at any time, but that you'll be informed by appropriate hooks running
<fwereade> Egoist, what are you trying to do?
<Egoist> fwereade, because in code i use relation-list, and it doesn't always list all units; sometimes one is missing
<fwereade> Egoist, it'll tell you about the units you're expected to know about based on what hooks have run
<fwereade> Egoist, if you haven't seen the -joined hook for unit X yet, you won't see it in relation-list
<fwereade> wallyworld, sorry I'm out of touch: do we have an ETA for in-environment storage? I got the impression it was nearly done, but not quite
<wallyworld> fwereade: the api is done and functional and in the blobstorage repo; the refactoring work to change the juju code to store tools and charms etc is yet to be done
<wallyworld> fwereade: there's ongoing debate about the charm storage side of things that needs to be sorted out also
<fwereade> wallyworld, so I guess we can't yet fix it for manual/local/cloudsigma providers
<wallyworld> fwereade: we can - that's on the immediate todo list. we've already started the Environ api changes and there's a wip branch to internalise the Storage() facility
<wallyworld> fwereade: significant refactoring was done at the sprint; the StateInfo() api is almost gone; replaced by a StateServerInstances() api
<fwereade> wallyworld, ooh! can I direct you at https://github.com/juju/juju/pull/174/files please? just to inform them what needs to be done (or rather doesn't need to be done any more?)?
<wallyworld> looking
<fwereade> wallyworld, if Storage() can be internalised, I guess it's safe to return a null storage now?
<wallyworld> fwereade: Storage() on Environ will be gone. but it's still needed for now for tools and charms
<wallyworld> we're not quite there yet
<wallyworld> fwereade: the stuff in that pr above will be needed until we do more work to remove the need for storage. not quite there yet but making good progress
<fwereade> wallyworld, ok, cool
<fwereade> wallyworld, would you give them a heads up all the same please
<wallyworld> fwereade: sure, who is "them"? do i just talk to nate?
<jam> wallyworld: "them" is Altoros who are writing the CloudSigma provider which has to use the "manual" method for storage because CloudSigma itself doesn't provide a storage solution
<wallyworld> jam: ok, thanks. sadly we are "out of sync" by a few weeks. i reckon we'll be in a good place to be able to ditch mandatory provider storage by that time
 * jam clearly thinks "air quotes" are the way of the future
<wallyworld> but we need to sync with the charm store guys on charm storage, so it's not entirely under our control
<wallyworld> seems like a waste to commit code for only a short time, but i don't think we can hold off on the cloudsigma stuff for that long
<wallyworld> natefinch: can i leave it in your hands to deal with the gojsonschema licence issue? whether it's moved to a different repo, or just the LICENSE file added, or whatever
<jam1> fwereade: maybe you remember best why gojsonschema wasn't merged into the github.com/juju namespace? Was it just evolving fast enough that they didn't want to do forced code review since they aren't directly in that group?
<dimitern> jam1, standup?
<jam1> dimitern: thanks, brt
<fwereade> jam1, I'm sorry, I don't immediately recall -- mgz was more directly involved in the details at that stage iirc, and if not bodie_ will be able to say when he comes on
<natefinch> wallyworld: sure
<wallyworld> \o/
<wallyworld> natefinch: i asked because i thought you already had a relationship with the dev(s) concerned
<natefinch> wallyworld: I'm more than happy to do it.  I'm friendly with them, but otherwise no professional relationship, FYI.
<natefinch> am I the only one who gets duplicate copies of rogpeppe's mailing list posts?
<wallyworld> i do sometimes i think
<rogpeppe> natefinch: i send another copy when i find i've sent from the wrong Sender
<natefinch> rogpeppe: Oh, I see
<rogpeppe> natefinch: because (i believe) juju-dev and canonical-juju will reject posts unless they're from roger.peppe@canonical.com
<natefinch> rogpeppe: right, that makes sense.  I never noticed they were from different senders.
<rogpeppe> natefinch: unfortunately I haven't found a way to force gmail to always choose a particular sender address based on the destination address
<rogpeppe> natefinch: and gmail doesn't show the sender by default, so i often forget to change it
<natefinch> rogpeppe: I just have separate inboxes, so it's not a problem
<natefinch> but I can understand wanting a single inbox for everything, too
<natefinch> man, setting CDPATH=$GOPATH/src is frigging amazing
<natefinch> from anywhere I can do cd github.com/juju/juju and it goes to the right place
<dimitern> jam1, TheMue, vladk, my g+ froze and now I can't go back :/
<jam1> morning mgz
<mgz> hey jam
<natefinch> int54 is evidently not a defined type
<TheMue> jam: your problem seems to be in apiserver/client_test.go
<jam> TheMue: actually it is state/apiserver/client/api_test.go
<jam> but there may be others
<TheMue> jam: gna, but ok, it's no small change
<voidspace> morning all
<dimitern> so does it matter what you put between the $$ merge markers $$ ?
<natefinch> dimitern: it's a regex that is $$\w$$, so no spaces
<dimitern> natefinch, right! good to know, thanks :)
<perrito666> morning
<dimitern> morning perrito666, voidspace
<natefinch> dimitern: we changed it at the sprint last week, for $$lotsOfFunTimes$$
<voidspace> I thought it was \s+
<dimitern> $$exactly$$
<voidspace> dimitern: perrito666: o/ morning
<natefinch> voidspace: isn't \s whitespace?
<mgz> I wonder if I should actually merge the change to make it a re match, it's potentially useful for less silly things
<voidspace> ah, maybe \S+ then
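A runnable sketch of the merge-marker matching being discussed. The exact pattern the bot uses isn't settled in the chat (natefinch quotes `$$\w$$`, voidspace suggests `\S+`), so the `\w+` used here is an assumption, not the bot's source:

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumed shape of the merge-bot trigger: $$ markers with a single word
// (no whitespace) between them, per the "no spaces" rule above.
var mergeMarker = regexp.MustCompile(`\$\$\w+\$\$`)

func main() {
	fmt.Println(mergeMarker.MatchString("$$merge$$"))           // a single word matches
	fmt.Println(mergeMarker.MatchString("$$ merge markers $$")) // spaces do not
}
```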
<katco> mgz: i have to feed my pets, but then do you want to chat about the unit tests?
<mgz> sure thing
<natefinch> wwitzel3: I have to move our 1 on 1 for later today
<sinzui> dimitern, is bug 1261780 still an issue now that we use golang 1.2?
<_mup_> Bug #1261780: go 1.1.2 TLS-enabled client does not accept our CACert <security> <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1261780>
<dimitern> sinzui, i'll take a look
<wwitzel3> natefinch: np
<dimitern> sinzui, it seems that should be fixed in go 1.2
<dimitern> sinzui, had to check a few forums to make sure
<sinzui> thank you dimitern
<sinzui> natefinch, can you help arrange a fix for bug 1342725
<_mup_> Bug #1342725: C:/Juju/lib/juju/nonce.txt does not exist, bootstrap failed in win <ci> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1342725>
<sinzui> dimitern, fwereade, jam: can I be upgraded (AKA moderator) in this channel to update the topic, which is months old?
<dimitern> sinzui, it's done via chanserv
<dimitern> sinzui, and I think we all should be able to do it
<sinzui> not yet: #juju-dev :You're not a channel operator
<dimitern> sinzui, try /msg ChanServ TOPIC #juju-dev <text>
<stokachu> we're seeing an issue where people are attempting to use juju bootstrap with kvm and it fails b/c they don't have /usr/sbin in their path
<stokachu> this line: command := exec.Command("kvm-ok")
<stokachu> could that possibly be changed to the full path of kvm-ok?
<dimitern> stokachu, please file a bug about it
<sinzui> stokachu, please report that to a bug and I will target it to 1.20.2
<stokachu> i will was just curious on your thoughts
<dimitern> stokachu, I think these commands like kvm-ok are supposed to be in the $PATH, and there are some checks about it in tests
<natefinch> sinzui: I can help with that
<stokachu> dimitern, the only related test i see is one checking if the host os is ubuntu
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Bugs: 1 Critical, 147 High - https://bugs.launchpad.net/juju-core/
<stokachu> sinzui, https://bugs.launchpad.net/juju-core/+bug/1342747
<_mup_> Bug #1342747: juju bootstrap fails if kvm-ok not in path <cloud-installer> <juju-core:New> <https://launchpad.net/bugs/1342747>
<sinzui> thank you stokachu
<stokachu> np :D
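A sketch of the PATH problem behind bug 1342747: `exec.Command` with a bare name resolves it via `$PATH`, so a missing `/usr/sbin` entry breaks the lookup. One hedge is an explicit fallback; the `/usr/sbin/kvm-ok` location is the conventional one on Ubuntu, assumed here rather than taken from the juju source:

```go
package main

import (
	"fmt"
	"os/exec"
)

// exec.Command("kvm-ok") fails with a lookup error when /usr/sbin is not
// in $PATH. Trying LookPath first and falling back to an assumed absolute
// path is one possible fix, not necessarily the one juju adopted.
func kvmOkPath() string {
	if p, err := exec.LookPath("kvm-ok"); err == nil {
		return p
	}
	return "/usr/sbin/kvm-ok" // conventional Ubuntu location (assumption)
}

func main() {
	fmt.Println("would run:", kvmOkPath())
}
```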
<lazyPower> I have a user in #juju asking what ports he needs to keep open in UFW to use juju. are there any other ports than the WSAPI port that he would need to open?
<natefinch> lazyPower: that should be it, though he might want 22 open for SSH access
<lazyPower> the wssapi port is 27017 by default right?
<natefinch> lazyPower: that's the default mongo port
<natefinch> lazyPower: hang on, I'll find it, I forget
<lazyPower> haha yeah, i thought we used the mongodb port
 * lazyPower has mongo on the brain
<natefinch> lazyPower: used to be we went straight to mongo, we've stopped doing that and it's all through the API now
<dimitern> lazyPower, it's 17017 for the api and 37017 for mongo, but the latter shouldn't be needed
<lazyPower> i like that we did that. It'll be easier to replace mongo if we ever do that.
<dimitern> lazyPower, so 17017 + 22 should be opened
<lazyPower> dimitern: thanks *hattip*
<dimitern> lazyPower, depending on the provider, if httpstorage is used you need to open its port as well (there are environ settings for some of those)
<dimitern> lazyPower, np
<natefinch> arg..... the fix for the windows client is going to be annoying.  it's one of those things where we used to just assume everything was ubuntu, but now we have *some* windows code, and that's screwing us up.  And so we have to change the signature of functions to pass in a series
<perrito666> natefinch: yup, my patch, which I answered in the thread about that same issue, does a very tiny fix and touches like 20 files
<alexisb> TheMue, you keep joining and dropping :)
<TheMue> alexisb: yeah, it shows me an error and says I shall retry later :(
<TheMue> alexisb: and so I retry :)
<alexisb> TheMue, do you have topics for today?
<TheMue> alexisb: nothing special. feeling good and currently investigating in LXC and IPv6
<alexisb> TheMue, alright, I can give you 30 minutes back, I know I could use them this week :)
<TheMue> alexisb: fine, then we'll see next time again. ;)
<perrito666> natefinch: stand up?
<voidspace> ok, so not such a successful test run - 22509 lines of output...
<voidspace> wallyworld: ping
<voidspace> natefinch: ping
<alexisb> jam1, ping
<alexisb> jam ping
<jam1> alexisb: pong
<jam1> I've been trying to connect to the hangout, but I keep getting "could not start video because of an error"
<natefinch> jam1: is there an easy way to see who has signed the canonical CLA?
<alexisb> jam1, yeah the hangouts seem to be acting all funny today
<alexisb> john and I are on
<alexisb> but no one else
<jam1> natefinch: there is a launchpad group, just a sec
<jam1> natefinch: https://launchpad.net/~contributor-agreement-canonical
<alexisb> lol jam1 everyone keeps joining and dropping
<alexisb> just John and i on
<voidspace> my internet still sucks
<voidspace> I'll sort it out properly on my return from Romania
<voidspace> I do at least have a phone I can tether as a backup now
<voidspace> I have 2.6Mbit downstream and that's the best I've had for over a month!
<voidspace> alexisb: enjoy your holiday. I didn't think managers were *allowed* to go away for ten days ;-)
<alexisb> :)
<alexisb> katco, fyi, got asked for status on this bug: https://bugs.launchpad.net/juju-core/+bug/1319475
<_mup_> Bug #1319475: Juju should support new signing format <ec2-provider> <goamz:New> <juju-core:Triaged by cox-katherine-e> <https://launchpad.net/bugs/1319475>
<alexisb> if you have any updates might be good to add a comment
<katco> alexisb: right, sorry i will do so. short of it is: we're waiting on a new version of goamz from gustavo
<mfoord> natefinch: ping
<natefinch> mfoord: yo
<niemeyer> katco, alexisb: Oh?
<alexisb> katco, cool, we just want to communicate that in the bug so I have a place to pull status
<katco> alexisb: will do
<niemeyer> katco, alexisb: I'm not on the hook to provide a new version of goamz..
<niemeyer> katco, alexisb: .. that I know of
<katco> niemeyer: yes, sorry, it's been bandied about a bit. ian sent another email and told me to shelve it for now
<niemeyer> katco: Ok, so saying you're waiting on me for it is not quite correct
<katco> niemeyer: i apologize. i was told to shelve this for now pending a change to goamz, and how that will occur is up in the air as i understand it.
<katco> niemeyer: hopefully that makes more sense
<alexisb> who owns goamz?
<niemeyer> katco, alexisb: I really have no idea about the internal communication that took place around the issue. On my side I was asked to "merge a fork" and then I asked what the fork comprises and what we want from it, which was not answered.. so I really have no intention or means of moving forward. The ball is in someone else's court at this point.
<katco> alexisb: it's a canonical library, but there are several community forks with the support we need for v4 already added
<alexisb> niemeyer, can you please send me the thread you are referring to so I can follow-up
<alexisb> katco, thanks for the info
<katco> niemeyer: i'll take an action item to have ian follow up. sorry for any confusion
<niemeyer> alexisb: Done
<alexisb> niemeyer, katco thank you!
<niemeyer> alexisb, katco: I see there's a mail from Ian asking me a bunch of questions on the topic from yesterday too.. I'll answer that with CC
<alexisb> niemeyer, thank you
<katco> niemeyer: tyvm gustavo. i think the only additional context that's missing from that thread is that he asked me to shelve the analysis of the forks
<katco> niemeyer: so perhaps a bit of confusion there. sorry about that.
<niemeyer> katco, alexisb: No problem
<mfoord> hmmm... so it's at least possible that some of these problems are due to JujuConnSuite State/BackingState issues
<natefinch> fwereade: the windows bug we have currently is due to using "current OS" when we should have used "target OS".  Really, it's because we used "we assume there's only one OS and that's Ubuntu" instead of "Let's actually determine what OS we're targeting"
<jam1> natefinch: that would also solve running the test suite on Mac OSX, which would probably be easier to start with
<jam1> I tried a couple of times, but for right now I just hack osversion_darwin to return "trusty" and most of the test suite passes.
<voidspace> jam1: ping
<jam1> voidspace: ?
<natefinch> jam1: heh, interesting
<voidspace> jam1: can you explain to me why the code is correct as written and my diff makes it wrong
<voidspace> jam1: http://pastebin.ubuntu.com/7804510/
<natefinch> jam1: OSX would be easier to start with if someone's willing to buy me a Macbook :)
<voidspace> jam1: why do we set the password on the connection info to a hash of password and then change the password to the non-hashed version
<voidspace> note that my fix causes tests to pass when I copy sessions
<jam1> voidspace: checking
<voidspace> thanks
<natefinch> I remember something about that hacky code where some of it is a hash and some is not.
<jam1> voidspace: so the hash of the password is because we originally passed in the passwords via cloud-init
<jam1> and that leaks the admin secret to everyone
<jam1> so we leak only the hash of the password
<jam1> and on the first connect we switch back to the "real" password.
<voidspace> jam1: this is in testing though
<jam1> however, since axw's patch a long time ago
<jam1> we don't do that
<jam1> voidspace: but *bootstrap* does that
<jam1> or did
<jam1> we now set up mongo via "SSH" into the machine instead of cloud-init.
<voidspace> so we connect successfully with the hash (which is what we used originally)
<voidspace> and then we change the password to the real one
<voidspace> that was the idea
<voidspace> so we leaked the hash and then changed to the real password
<voidspace> but as this is in testing, my fix doesn't matter - it just changes us to use a consistent password
<voidspace> which happens to be a hash
<voidspace> note that in the test connecting with the hash still worked
<voidspace> so something in this setup is still using the hash
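The scheme being discussed (store only a salted hash of the admin-secret so cloud-init never carries the plaintext, then switch to the real password on first connect) can be sketched like this. Note this is an illustrative stand-in, not juju's actual `utils.UserPasswordHash`, which I believe uses a proper key-derivation function rather than a bare SHA-512:

```go
package main

import (
	"crypto/sha512"
	"encoding/base64"
	"fmt"
)

// Stand-in for the salted-hash idea above: only the hash travels through
// cloud-init, so leaking it does not leak the admin secret itself. The
// salt+password concatenation and the 18-byte truncation are assumptions
// for illustration, not juju's real algorithm.
func passwordHash(password, salt string) string {
	sum := sha512.Sum512([]byte(salt + password))
	return base64.StdEncoding.EncodeToString(sum[:18])
}

func main() {
	fmt.Println(passwordHash("admin-secret", "compat-salt"))
}
```

The bug voidspace is chasing fits this shape: the test fixture connects with the hash but never completes the "switch to the real password" step consistently, so some sessions still authenticate with the hash.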
<jam1> voidspace: so you probably need to look into testing/mgo.go I would think
<voidspace> ok
<voidspace> jam1: cool
<voidspace> thanks for the help
<voidspace> well, that change fixed all the apiserver tests
<voidspace> so it was definitely a big part of my problem
<voidspace> time to go jogging and do a full test run
<voidspace> it *might* be finished by the time I get back ;-)
<jam1> voidspace: "resetAdminPasswordAndFetchDBNames" looks suspicious
<voidspace> I'll take a look
<voidspace> thanks
<jam1> voidspace: it looks like it is doing admin.RemoveUser("admin") as part of resetting the db, I don't quite understand how removing the admin user allows us to use the database properly afterward
<jam1> voidspace: It might be the "mongo lets us login as anyone if no one is configured" ?
<natefinch> I believe that's true
<jam1> natefinch: 		// We try for a while because we might succeed in
<jam1> 		// connecting to mongo before the state has been
<jam1> 		// initialized and the initial password set.
<jam1> suspicious
<jam1> voidspace: so the issue would appear that we are successfully logging in (and mgo is caching) the original hashed password, but then we immediately set the password to the non hashed version, but we don't update the mgo.Session
<jam1> that sounds like an update we potentially need to do for real connections as well
<perrito666> jam1: runs mongod with --noauth and creates a new admin iirc
<jam1> perrito666: we do that for real mongo (I think), but for the test suite?
<voidspace> jam1: yep, that's the same problem we had in state.Open too which I had to fix
<voidspace> jam1: making the fix in that test code causes *almost* all tests to pass with some of the watcher infrastructure doing session copying
<jam1> voidspace: JujuConnSuite.tearDownConn
<perrito666> jam1: I am reading the same tests for other stuff and it implies so, but I am not sure either
<voidspace> not quite all the tests though
<voidspace> jam1: I'll come back to this in a bit - but this is great
<jam1> 		if err := s.State.SetAdminMongoPassword(""); err != nil && serverAlive {
<jam1> voidspace: fwiw I can't actually find a place where we set the password properly to juju/testing/DefaultMongoPassword
<jam1> I see places where we appear to get rid of that password
<jam1> voidspace: it is probably the dummy environment
<jam1> as we have "juju/provider/dummy/environs.go: admin-secret: testing.defaultMongoPassword
<jam1> voidspace: and there (I believe) it is.
<jam1> juju/provider/dummy/environs.go
<jam1> line 654
<jam1> 		if err := st.SetAdminMongoPassword(utils.UserPasswordHash(password, utils.CompatSalt)); err != nil {
<jam1> voidspace: I'm betting if you change that, you can change the testing/conn.go
<jam1> voidspace: you'll want to confirm with axw, but I'm 95% sure that "juju bootstrap" no longer uses the hash of the password in real world scenarios
<voidspace> jam1: thanks
<voidspace> appreciated
<katco> i'm having trouble getting gocheck to filter down to a single test; i'm executing "go test -gocheck.f ".*" github.com/juju/juju/cmd/juju/..." but it doesn't execute any tests. what am i doing wrong?
<ericsnow> katco: doesn't gocheck require that you be in the directory where the tests are?
<katco> ericsnow: maybe that's why it's not working, but i thought it was just a wrapper around go test and so you could be anywhere
<ericsnow> katco: I haven't looked into it much but I ran into the same thing a while back
<katco> ericsnow: the bootstrap tests are kind of slow, so i wanted to tighten up my testing cycles. let me try cding into the dir
<katco> ericsnow: well, it seems that's working. ty sir :)
<ericsnow> katco: np
<hazmat> are there things within a state server machine agent that connect back to itself on the api?
<bodie_> https://github.com/juju/juju/pull/311 should be good to go btw
<hazmat> can anyone help me diagnose a user issue.. post upgrade to 1.20.1 their env is basically broken
<hazmat> http://pastebin.ubuntu.com/7804968/
<hazmat> afaics..  the underlying issue looks like some sort of timing issue .. juju starts mongodb up.. and the api worker tries to connect to it, succeeds, but does so after it triggers a timeout, which results in the api worker restarting
<hazmat> although perhaps that's normal, and subsequent connects are expected to succeed.. the fact that the subsequent api server is dying because it can't connect is a bit more relevant
<natefinch> hazmat: I think that's normal
<natefinch> bodie_: have you heard that we need a license on your go json schema code?
<bodie_> yes
<natefinch> bodie_: cool
<bodie_> :)
<bodie_> thanks for checking
<hazmat> natefinch, actually it does appear to be that timing issue on reconnect across the subsequent restarts.. ie the last 100 lines of machine-0.log are constant api server restarts and mongodb reconnects .. http://pastebin.ubuntu.com/7805073/
<natefinch> hazmat: is that a local provider environment?
<hazmat> natefinch, it is
<hazmat> natefinch, its hackedbellini's env
<natefinch> hazmat, hackedbellini: can you dump the full logs?  Not really sure what could cause that.
<hackedbellini> natefinch: here is the log from just after the agent is restarted: http://pastebin.ubuntu.com/7804968/
<hackedbellini> it continues spamming this after that: http://pastebin.ubuntu.com/7805073/
<hackedbellini> fyi, our env (using lxc) was deployed on 1.16 (with sudo juju bootstrap at that time), we upgraded then to 1.18 and to 1.19 because of a bug (https://bugs.launchpad.net/juju-core/+bug/1325034)
<_mup_> Bug #1325034: juju upgrade-juju on 1.18.3 upgraded my agents to 1.19.2 <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1325034>
<hackedbellini> today we tried to move back to stable (1.20.1) and that happened just after doing a juju upgrade-juju (both juju/juju-core packages are updated to 1.20.1 from the stable ppa)
<natefinch> so this was an environment deployed with sudo juju?
<natefinch> one thing I see notably absent are the log lines about initiating the mongo replicaset
<hackedbellini> natefinch: yes it was deployed with sudo. But it runs with the user "juju" now
<hackedbellini> I'll post the service files for you to see
<hackedbellini> here: http://pastebin.ubuntu.com/7805056/
<hackedbellini> those are the service files that start juju-db and juju-agent
<natefinch> hackedbellini: so that looks correct
<natefinch> actually,  I withdraw the comment about initiating.... forgot this was an existing environment, not bootstrap.... so used to these problems coming right at bootstrap time.
<hackedbellini> natefinch: hahaha I see =P
<hackedbellini> natefinch: is there any info you need, or anything you want me to try to do here? Just ask. Atm, my juju is totally broken, because of that issue, I can't even do a juju status :(
<natefinch> yeah, juju status uses the API :/
<natefinch> hang on
<hackedbellini> np
<natefinch> hackedbellini: could you dump your whole machine-0 log for me?  Or at least the stuff from just before you upgraded?
<hackedbellini> natefinch: sure, just a minute =P
<hackedbellini> natefinch: its sending the paste... I'll pm it to you
<natefinch> cool
<hackedbellini> natefinch: just sent you the log
<natefinch> hackedbellini: looking
<natefinch> I like the way ubuntu pastebin makes me log in to download the text I can see in my damn browser :/
<hackedbellini> natefinch: hahaha. I'll send you a download link directly from our server here so you can get it using wget if you prefer
<natefinch> hackedbellini: I have it, it's ok
<hackedbellini> natefinch: I don't remember the exact time I did the "juju upgrade-juju", but from the log the error started at 17:50 (it was 14:50 here, the log is in utc =P)
<hackedbellini> probably I did the juju upgrade-juju a little before that
<natefinch> hackedbellini: we should put in some giant ****** LOOK OUT! HERE COMES AN UPGRADE!! ****** log message
<hackedbellini> natefinch: hahahaha yeah, it for sure would be very useful! :)
<natefinch> instead all we get is "upgrade requested from 1.19.3-precise-amd64 to 1.20.1"
<hackedbellini> natefinch: so indeed I started the upgrade at, surprisingly, 17:50:00 utc.
<natefinch> hackedbellini: I saw that.  Nice timing
<hackedbellini> hahahah
<natefinch> thumper: hackedbellini upgraded his local environment from 1.19.3 to 1.20.1  (he'd previously upgraded from 1.18.3 to 1.19 through a bug).  And now his API won't start up.
<natefinch> hackedbellini, thumper: so, previously, connecting to the API, we dialed "wss://192.168.99.5:17070/"  and now we're dialing "wss://localhost:17070/
<natefinch> I remember we did change this code so that we always dial localhost on state machines, since they should always talk to their own APIs (otherwise they might connect to the API on another HA server)
<natefinch> hackedbellini: is there anything unusual about the iptables on that machine?
<hackedbellini> natefinch: could this be the issue? One thing we noticed is that there's no one listening on 17070 (netstat -nl | grep 17070 returns nothing)
<hackedbellini> natefinch: what do you mean by unusual?
<natefinch> hackedbellini: just wondering if maybe using the IP before was getting around some firewall issue... but if nothing's listening, that's obviously a problem
<natefinch> hackedbellini: I presume if you do 'ps -Al | grep jujud' you have a jujud running?
<natefinch> thumper, hackedbellini:  I have to run in a couple minutes, unfortunately.  Previous engagement.
<hackedbellini> natefinch: hrm, I don't know because I'm not the one who manages the network here (I just maintain juju at a user level without sudo privileges =P)
<hackedbellini> when talking to hazmat he said something about that problem... he said that maybe it was a timing issue... that the same process (the agent) should listen and then connect to 17070, but it was not happening
<hackedbellini> natefinch: not at the moment. A workmate here shut the juju agent down because of the spamming in the log. But yes, when the agent is running, I see jujud in ps. Actually, atm I can still see the jujud processes from the lxc machines, but they are all spamming that wss issue
<natefinch> hackedbellini: I gotta run.  hopefully thumper or wallyworld can help you when they fully wake up.
<hackedbellini> natefinch: hahaha np. Thank you for your help so far! I won't be here much longer either (it's 18h in Brazil and soon I'll go home), so maybe I'll leave before they, like you said, fully wake up =P, but I'll be here tomorrow again at 15h utc
<natefinch> hackedbellini: ok, we'll do whatever we can to help out.  I'm an hour behind you, so it's 17:00 here.
<hackedbellini> natefinch: thank you so much for that :)
<natefinch> hackedbellini: np
 * thumper is reading backlog
<thumper> hackedbellini: how do you maintain juju without sudo? it needs sudo for lxc
<hackedbellini> thumper: well, when I need to run something with sudo, I ask someone with sudo powers to do that for me
<hackedbellini> but usually I just "juju ssh <machine/service>" when I need to go inside an lxc
<thumper> hackedbellini: ok, with the local provider the machine agent for machine 0 runs on the host with root privileges
<cmars> thumper, there's also unprivileged containers, https://www.stgraber.org/2014/01/17/lxc-1-0-unprivileged-containers/
<thumper> also, for some reason I'm yet to fathom, the logs were changed to be 0600 owned by the syslog user
<thumper> cmars: I know
<thumper> cmars: but they weren't done in time for us to use it properly
<thumper> cmars: I have plans, but no time
<thumper> hackedbellini: so you are unlikely to be able to look at the logs without sudo
<hackedbellini> thumper: the logs I can see with no problem
<thumper> hackedbellini: so far :-)
<thumper> hackedbellini: it was changed recently
<hackedbellini> sometimes I can't edit a configuration file (like the agent.conf of machine-0 =P)
<thumper> hackedbellini: first thing I'd check is that the juju mongo db service is running
<thumper> hackedbellini: then that the machine agent is running
<thumper> hackedbellini: then check the logs of the machine agent
<hackedbellini> thumper: also, about that sudo issue... we initially did the bootstrap using sudo (as it was a requirement at the time)
<hackedbellini> now the documentation here (https://juju.ubuntu.com/docs/config-LXC.html) says that it doesn't need it anymore; that's why we changed the user running the agent to juju, along with some permissions (although some files are still owned by root)
<thumper> hackedbellini: sudo is still needed for bootstrap, but not explicitly
<thumper> we now ask during bootstrap
<thumper> hackedbellini: yeah, if you edited the upstart script, it will fail
<thumper> hackedbellini: juju expects the agents to run as root
<thumper> we've talked about this before, but all hooks expect root
<hackedbellini> thumper: mongodb is running. The machine agent is running (not atm because a workmate stopped it because it was spamming the 17070 port error), but it runs
<hackedbellini> I have a log from a very early hour today, it contains the log from the upgrade and all the spamming after. Do you want me to pm it to you?
<thumper> hackedbellini: you could pastebin it
<hackedbellini> thumper: hrmmm, so this config is wrong? http://pastebin.ubuntu.com/7805056/
<thumper> hackedbellini: also, what did you change?
<hackedbellini> thumper: sure, I'll pm you
<thumper> hackedbellini: yes, that is wrong
<thumper> hackedbellini: I recall seeing something like this before when the agent was trying to update mongo to be a replica set
<thumper> hackedbellini: there was an early race condition
<thumper> hackedbellini: where one part of the code thought it was done, but it wasn't
<thumper> wwitzel3: did you do some of the replica set stuff?
<thumper> I'm not familiar with it
<hackedbellini> thumper: hrmmm, I see. Anything I can do to try to workaround that?
<thumper> probably, I just don't know what it is
<hackedbellini> thumper: the only person here with sudo powers is going home now... is there anything you want me to test that requires sudo? Once he is gone, I can continue the tests just tomorrow, unfortunately :(
<thumper> hackedbellini: I've been discussing the issue
<thumper> menn0 and I think we may know what happened
<thumper> during the 1.19 cycle, HA was added
<thumper> but the upgrade step only initializes HA when upgrading from 1.18 -> 1.20
<thumper> as upgrades from dev -> prod have never been formally supported
<thumper> we *think* we may be able to trick the code into thinking it was upgrading from 1.18
<thumper> so it initializes mongo properly
 * thumper quickly reads code
<hackedbellini> thumper: hrmmm, very nice!!!
<hackedbellini> I asked my workmate to wait a little
<thumper> hackedbellini: ok, this is a little hacky
<thumper> hackedbellini: are you ready?
<hackedbellini> thumper: ahhh, he just left... he said he was in a very hurry :(
<hackedbellini> but tell me what it is
<thumper> ok
<hackedbellini> I'll see if I can do, and if I can't, I'll write it down and ask him to do it tomorrow
<hackedbellini> hahaha and no problem if it's a little hacky =P
<thumper> you need to edit the machine agent conf file
<thumper> hackedbellini: based on your config, it should be here: /home/juju/.juju/local/agents/machine-0/agent.conf
<hackedbellini> yes, it's there indeed
<thumper> hackedbellini: you need to change the line that has: upgradedToVersion:
<thumper> hackedbellini: to say "1.18.4"
<thumper> or something before "1.19" anyway
<thumper> that way when the machine agent starts
<thumper> it goes "oh, you are pre-ha, let me fix that for you"
<thumper> I *think* that'll fix it
<hackedbellini> thumper: hrmm, I see. And after that, restart the agent service?
<thumper> right
<menn0> which needs sudo unfortunately...
<hackedbellini> menn0: yes, unfortunately :(
<hackedbellini> but np, tomorrow I'll ask my workmate to do that and then tell you guys if it solved the problem =P
<hackedbellini> btw, do I need to do that on all agent.conf of my lxc machines?
<hackedbellini> I can't look at them right now, but probably they have upgradedToVersion: 1.19.3 too
<wwitzel3> thumper: I worked on it a little yes, right when I first started, paired with nate
 * thumper is still trawling through emails
<hackedbellini> well, I'm going now too, it's 19h here in Brazil =P. Thank you guys for all the attention. Let's hope tomorrow I come back here with good news
<menn0> hackedbellini: I *think* you'll just need to update the agent conf for machine-0 (the API server)
<menn0> hackedbellini: that's where the HA initialisation runs
<hackedbellini> menn0: nice. So, let's see what happens tomorrow
<hackedbellini> going now. Cheers!
 * davecheney woke up swinging this morning
<thumper> sinzui: you around?
<thumper> davecheney: I have a task for you...
<davecheney> thumper: speak
<thumper> davecheney: it seems that there are many people, me included, that are unclear on the go compiler rules for conditional compilation
<thumper> it seems that there are suffix rules, and special comment rules
<thumper> can I get you to summarise these to the mailing list?
<davecheney> thumper: http://dave.cheney.net/2013/10/12/how-to-use-conditional-compilation-with-the-go-build-tool
<davecheney> done
<davecheney> next
<thumper> and possibly to put them in the docs directory?
<thumper> nice
<thumper> perhaps linking to that blog post from the hacking doc
<davecheney> thumper: will post to the list
<thumper> where we have a section on different targets
<davecheney> thumper: ok
<thumper> that may help people understand
<davecheney> thumper: PR coming damn soon
<davecheney> thumper: I cannot find said document
<thumper> davecheney: perhaps add to the style guide?
<perrito666> davecheney: I would have gone by euro
<wallyworld> sinzui: hey
<perrito666> hey gm wallyworld
<wallyworld> perrito666: evening
<wallyworld> sinzui: i have a question about bug 1341589 if you are around
<_mup_> Bug #1341589: Distribution tarball has licensing problems that prevent redistribution <juju-core:In Progress by wallyworld> <juju-core 1.20:In Progress by wallyworld> <https://launchpad.net/bugs/1341589>
<voidspace> wallyworld: so I found the cause of *almost* all the remaining test failures in my watcher session copying branch
<wallyworld> voidspace: hey, sorry i missed your ping last night, had fallen asleep at the keyboard
<voidspace> wallyworld: unsurprisingly, JujuConnSuite is opening the state in a custom way and then changing the password
<wallyworld> \o/
<voidspace> wallyworld: haha
<voidspace> I thought you probably weren't around but I thought I'd try just in case
<wallyworld> lol, you almost got me
<thumper> o/ voidspace
<wallyworld> i woke up later to send an email
<voidspace> I believe you and axw have been working on changing JujuConnSuite?
<voidspace> I wonder if this code is about to go away anyway
<voidspace> thumper: hi
<wallyworld> not directly, but yes we do need to do further work and it may result in that code changing significantly
<voidspace> wallyworld: the conn suite follows the pattern of using the hashed password, connecting and then changing the password
<voidspace> which of course screws session copying because the password is out of date
 * thumper is taking kids to see a movie and will be working in town at a cafe
<voidspace> however, I believe that production code no longer follows this pattern (using the hash) anyway
 * thumper doesn't want to see tinkerbell
<voidspace> thumper: have fun
<voidspace> haha, I bet thumper really does...
 * thumper ignores voidspace
<voidspace> wallyworld: so my belief is that I can just delete that little dance
<wallyworld> voidspace: i'm not 100% across the implementation detail (yet), but it seems if we can tweak the suite set up we can try and target the root cause
<thumper> voidspace: my kids are very jealous about our marvel vans
<voidspace> thumper: did you get some too?
<thumper> s/our/your/
<voidspace> ah, ha
<thumper> missed a y
<voidspace> yeah - I still like them
<voidspace> the red Iron Man ones especially
<wallyworld> voidspace: i don't think it does, there's a lot of legacy in jujuconnsuite
<thumper> they have gotten into the marvel movies this holidays
<wallyworld> voidspace: if you push your latest changes, we can take a look
<voidspace> wallyworld: well, the simplest fix was a one line fix to just keep the password as the hash
<voidspace> wallyworld: but that's not the *right* fix
<voidspace> wallyworld: I have that pushed, hang on
<wallyworld> voidspace: that fix was in the test suite, right?
<voidspace> thumper: ah, good stuff. I enjoy the marvel movies.
<voidspace> wallyworld: right
<voidspace> wallyworld: let me get you a link to that change
<wallyworld> voidspace: at this stage, the right fix is what allows us to ship 1.20.2 :-)
<wallyworld> we can clean up in trunk if needed
<voidspace> hah... well
<voidspace> jam did some digging for me and the right fix might be just as simple
<voidspace> I'm about to look at that now
<voidspace> it's nearly 1am but I still have jetlag :-/
<wallyworld> voidspace: ok, but get some sleep first :-)
<voidspace> I didn't get it for vegas
<wallyworld> voidspace: really appreciate the work on this one
<voidspace> but I normally do, so I'll stay up another hour I reckon
<wallyworld> ok, but don't feel obligated to :-)
<voidspace> wallyworld: well, let's see how much I get actually done by Friday :-/
<voidspace> I'd like to get a chunk bitten out
<voidspace> as soon as I get tests passing I can work on the actual changes
<wallyworld> yup, we'll pick up with where you get to
<wallyworld> if i get time today, i'll look to see what you've done, so keep your wip branch up to date :-)
<voidspace> wallyworld: see the changes to juju/testing/conn.go in this branch https://github.com/voidspace/juju/compare/copy-sessions
<wallyworld> looking
<voidspace> wallyworld: did you see the email I sent to juju-dev?
<voidspace> that branch only has the watcher changes
<voidspace> it doesn't have the transaction runner changes
<wallyworld> ah not yet, still getting through backlog
<voidspace> I summarised what I've done so far and gave three pastebins with the diffs
<wallyworld> hopefully the txn runner changes are a lot easier than the collection ones
<voidspace> the diffs are small as the problem is getting tests to pass
<wallyworld> great will read
<voidspace> wallyworld: they're much more isolated, yes
<wallyworld> hopefully getting tests to pass is just a small fix with password
<voidspace> just three methods to change and all they do is copy the session and create a new transaction runner
<wallyworld> yup, sounds right
<voidspace> wallyworld: well, with this in place I have several test failures - down from a shitload
<wallyworld> that's a start
<voidspace> and that was already down from crap-tonne
<voidspace> so steady progress
<voidspace> yep, really a limited few failures now
<wallyworld> several < shitload < fucktonne
<voidspace> :-D
<wallyworld> voidspace: sounds like you're well on the way to having an almost complete fix we can take and finish and run with for next week
<voidspace> I think that change in conn.go is actually a no-op as the password has already been set to the hash
<voidspace> the hash is actually set in the dummy provider
<bigjools> sinzui: I saw you triaged https://bugs.launchpad.net/maas/+bug/1341281, but I suspect the guy has messed up his network config
<voidspace> juju/provider/dummy/environs.go line 654
<_mup_> Bug #1341281: MaaS does not report to juju that the bootstrap node is ready? <juju-core:Triaged> <juju-core 1.20:Triaged> <MAAS:Incomplete> <https://launchpad.net/bugs/1341281>
<voidspace> and that code is (we believe) out of date with the actual implementation now
<voidspace> wallyworld: I'll leave my hack in place and shoot a separate email to axw about that
<wallyworld> voidspace: could be, i'd have to get across the implementation detail
<sinzui> bigjools, I agree. I think getting more people might help close the issue. We don't want to see any more maas misadventures
<voidspace> right
<wallyworld> voidspace: yeah, that conn change is a no-op. also, master has diverged quite a bit from 1.20 - that conn code lives in testing in master but up a level in juju in 1.20. so we will have to backport your changes when master is finished
<voidspace> right
<voidspace> gah, and then other tests *depend* on the password being "dummy-secret", hard coded
<voidspace> e.g. provisioner_test
<wallyworld> sigh
<sinzui> davecheney, et al, sorry, CI is running behind schedule because I tried to switch it over to the new scheduling rules. It wasn't a complete success. I don't think CI will see the queued revisions for another 2 hours
<voidspace> wallyworld: so I'm going to kill the code that sets it to the hash instead
<davecheney> sinzui: ok
<voidspace> well, as well
<wallyworld> voidspace: i think that's ok ottomh. let's get the tests passing and we can revisit and tweak the implementation with all known changes in place
<voidspace> wallyworld: yep, I'll do another progress update tomorrow
<wallyworld> \o/ thank you
#juju-dev 2014-07-17
<voidspace> this discovery explains a bunch more failures
<voidspace> so more progress
<wallyworld> great :-)
<voidspace> thanks, you've been a great rubber duck :-)
<wallyworld> that's me
<wallyworld> sinzui: bug 1341589,  there's been a comment made that we did ship godeps source in 1.20 - i made the assertion that we didn't, and i think you didn't think we did either?
<_mup_> Bug #1341589: Distribution tarball has licensing problems that prevent redistribution <juju-core:In Progress by wallyworld> <juju-core 1.20:In Progress by wallyworld> <https://launchpad.net/bugs/1341589>
<sinzui> wallyworld, oops, we didn't delete all of it
<sinzui> wallyworld, I will add that to my list of deletes.
<sinzui> wallyworld, actually do we agree that I should delete
<sinzui> src/code.google.com/p/go.net/html/charset/testdata/
<sinzui> src/github.com/binary132/gojsonschema/json_schema_test_suite
<wallyworld> sinzui: nate is following up with the gojsonschema author to get the licensing sorted out for that. i think we can delete the html test data though
<wallyworld> sinzui: the gojsonschema repo should be under juju as it was written on canonical time
<davecheney> wallyworld: seconded
<davecheney> it seems foolish to consume a dependency published in a personal account
<wallyworld> nate is dealing with that, i'll follow up with him tonight
 * davecheney mentions that he raised this issue 6 weeks ago when this dependency leaked into our codebase
<davecheney> rick_h__: can I nag you for more msdn keys
<rick_h__> davecheney: wallyworld it was forked from an existing repo though I recall
<davecheney> rick_h__: that's just great
<rick_h__> davecheney: sure, same software etc?
<davecheney> ^ insert heavy sarcasm
 * wallyworld has no idea, wasn't involved with the development
<davecheney> rick_h__: i'll 'ahem' get the cd somewhere else
<davecheney> i'll email you
<rick_h__> davecheney: ok sounds good
<davecheney> rick_h__: thanks, I only have a dodgy xp vm that i've been tending for close to a decade now
<davecheney> seriously, the install date of that vm is 2007
<rick_h__> davecheney: did you put the other key to use?
<rick_h__> davecheney: or do you want that one again?
<rick_h__> davecheney: I've got 5 to use so want to make sure I reuse if that's what'll work
<davecheney> rick_h__: let me check my irc log
<rick_h__> davecheney: k, if you're not using it I can resend that one
<rick_h__> davecheney: I just want to check if you need a second NEW one, or if that other one will work
<davecheney> rick_h__: i have one starting 39
<davecheney> is that win32 or win64
<davecheney> because windows is amazing
<rick_h__> davecheney: either.
<rick_h__> davecheney: so will get you that one and a second one then
<davecheney> rick_h__: ok
<davecheney> given how short supply they are i'll use this key for a win64 install
<davecheney> and keep using my xp 32 bit vm
<davecheney> until that becomes unworkable
<sinzui> davecheney, MS published free images for web development testing. They work with virtualbox, and MS even recommends snapshotting them to prevent them from expiring
<rick_h__> davecheney: all good
<davecheney> sinzui: linkage ?
<davecheney> sinzui: i know this isn't your fault
<davecheney> but this is crap
<davecheney> if we're developing on windows, we shouldn't be scraping together resources to get win dev machines
<davecheney> this is serious software development, it should be done properly
<rick_h__> davecheney: replied
<sinzui> davecheney, But I have been using win images in the cloud this year; when I need one, I just start an instance from the snapshot that does the windows tests. It has golang 1.2, python 2.7 and ssh
<davecheney> rick_h__: ta, when you say 'the download' is that the one that I should look for, um, online ?
<rick_h__> davecheney: I imagine. I think it should work on anything of the right version though
<rick_h__> davecheney: though you'll know more of windows than me soon
<davecheney> yay. life skills
<rick_h__> :)
<rick_h__> job security, no one else will want to do it!
<voidspace> wallyworld: full test suite pass with *some* of the watcher infrastructure copying sessions
<wallyworld> voidspace: farking awesome
<voidspace> wallyworld: so now to change all the rest
<wallyworld> time for you to git push and then go to bed :-)
<voidspace> pushed
<wallyworld> \o/ tyvm
<voidspace> wallyworld: some of them do lovely things like send collections (which have a session reference of course) across a channel
<voidspace> so who closes the session then...
<voidspace> so those changes will need more thought
<wallyworld> lalalalalala
<voidspace> probably stick to using the global session and the receiver can copy
 * wallyworld covers his ears and pretends not to hear
<voidspace> hah
<wallyworld> but agree with receiver copying
<sinzui> davecheney, Sorry, took forever to find the link, and then remember the blog post about how I got it downloaded
<sinzui> http://blog.launchpad.net/general/a-tale-of-two-travesties
<sinzui> davecheney, The vhd you want is probably different since windows and ie have changed
<davecheney> ok
<davecheney> i read that as 'the tale of two transvestites'
<davecheney> now I can't unsee it
<thumper> wallyworld: I'm in town while the kids are at a movie, I won't be home in time for our meeting start
<thumper> movie finishes 10 minutes before our meeting
<wallyworld> thumper: np, just ping me whenever
<thumper> wallyworld: kk
<wwitzel3> thumper: what movie?
<thumper> tinkerbell and some pirate...
<thumper> hence me not going
<wwitzel3> lol, that's why you are in town
<thumper> if it was how to train your dragon 2, I probably would have gone
<rick_h__> thumper: saw that with the boy, it's ok
<perrito666> thumper: that sounds an awful lot like peter pan
<perrito666> from an odd point of view
<thumper> perrito666: it probably is
<wwitzel3> haha
<thumper> rick_h__: the dragon one?
<rick_h__> yea
 * perrito666 would fancy a movie about captain hook and the annoying young boy and his fairy 
 * perrito666 saw many kid movie blockbusters on the plane back from the last sprint
<perrito666> they had cloudy with a chance of burgers 2
<perrito666> anyway, Ill go to sleep so I am ready for our "too early to be pleasant" morning meeting, cheers everyone
<voidspace> wallyworld: ah, the transaction stuff breaks the test hooks, because they expect a global transaction runner
<voidspace> wallyworld: so that needs fixing too
<voidspace> shouldn't be too hard
<wallyworld> voidspace: yeah, i did something in the blobstore sub repo - we can do something similar here too
<wallyworld> go to bed!
<voidspace> I could still use a global jujutxn.Runner and have *it* recreate the underlying transaction runner
<voidspace> that would actually be nicer
<voidspace> the code would look the same... inside state
<voidspace> still not very tired :-/
<voidspace> haven't got to sleep until around 4am the last two nights
<wwitzel3> voidspace: I've been on odd hours as well
<voidspace> wwitzel3: morning
<voidspace> wwitzel3: we had our scan today - 7 1/2 weeks
<wwitzel3> voidspace: morning, congrats, all is well?
<voidspace> wwitzel3: I double checked and I wasn't on a sprint that week, which was a relief ;-)
<voidspace> yeah, all is well
<wwitzel3> wonderful
<voidspace> and there's only one - which is a *big* relief
<wwitzel3> hah, oh that's right .. sometimes there is more than one in there
<voidspace> slightly more likely if you're as old as us
<voidspace> wwitzel3: how's you?
<wwitzel3> voidspace: really? I didn't know that. I'm well, just messing around with tests still :/
<voidspace> apparently so
<voidspace> wwitzel3: heh, I've been doing that all day - *just* got the tests passing with some session copying in place
<wwitzel3> voidspace: somewhere something is doing an extra Close on state, so when a teardown runs, mgo throws an NPE.
<voidspace> ouch
<wwitzel3> voidspace: I'm "walking" it now
<voidspace> heh
<wwitzel3> voidspace: printing and panicing since I could never get gdb working right
<voidspace> I was walking JujuConnSuite.SetupTest
<voidspace> and that's a beast
<voidspace> heh, yeah - panics are very useful for that
<voidspace> "stop here"
<wwitzel3> yeah, these tests don't use JujuConnSuite, just Base and Mgo suite.
<wwitzel3> but it is still a hassle
<voidspace> right, I'm going to meditate and hopefully fall asleep
<voidspace> see you all tomorrow...
<wwitzel3> voidspace: see ya
<voidspace> o/
<voidspace> thumper: enjoy tinkerbell...
<bigjools> sinzui: we get this particular one a lot - it might help a bit if juju had some way to timeout and report that the node has no internet access?
<axw> bigjools: what's the problem? machines not coming up because they can't apt-get?
<bigjools> axw: I think so
<bigjools> it just hangs on bootstrap
<axw> hrm, ok. bootstrap *should* time out after 5 minutes by default
<menn0> thumper: your change and mine are going to have merge conflicts. sucks to be me :)
<axw> wallyworld: there's a new mgo release coming soon, so I think it's best I hold off proposing any changes till I can update
<axw> I have some workarounds in the juju code now that will be irrelevant with that
<thumper> axw: how soon?
<wallyworld> axw: sure, np.
<axw> I'll continue on with killing Environ.StateInfo
<axw> thumper: don't know sorry
<wallyworld> axw: otp, will chat soon
<axw> thumper: well, it doesn't say "soon", but upcoming: https://bugs.launchpad.net/mgo/+bug/1340361/comments/1
<_mup_> Bug #1340361: RemoveUser of non-existent user has different behaviour with mongo 2.4/2.6 <mgo:Fix Committed> <https://launchpad.net/bugs/1340361>
<bigjools> axw: "should" :)
<axw> bigjools: I've witnessed the timeout working recently. There was a recent change to add a longer default for MAAS (for !fastpath-installer) -- maybe it's just not reaching that longer timeout yet
<axw> can't remember how long's long
<bigjools> axw: that would be it
<menn0> thumper: I'm done with the review. A few things but nothing major.
<wallyworld> axw: michael has made good progress on the io timeout issue; he should be able to push up a fairly complete branch tomorrow and we can pick up from there
<axw> wallyworld: cool.
<axw> wallyworld: I have moved address propagation to the backlog, pending additional network/relation hook changes
<wallyworld> axw: yup, np
<thumper> katco: are you still as enthusiastic as your initial blog post?
<davecheney> katco: don't answer that, it's a trap
<davecheney> thumper: i'm going to head out for lunch
<davecheney> do you want to talk to me today about HR stuffs ?
<thumper> kk
<thumper> davecheney: nah, tomorrow it is
<davecheney> thumper: np
<thumper> bbl
<davecheney> ok, windows 8 install
<davecheney> complete
<davecheney> brb, just going to drink some cyanide
<wwitzel3> It's good over ice
<davecheney> mmm, laudanum
<jam> wwitzel3: you got jetlag going from southeast US to northeast US ?
<thumper> wallyworld: I think I remember why that provisioner code was there
<thumper> wallyworld: got a minute?
<wallyworld> sure
<thumper> https://plus.google.com/hangouts/_/g6p7d3ldngpmaqhkp5pqyyhidia?authuser=1&hl=en
<menn0> thumper: any chance you can look at https://github.com/juju/juju/pull/322
<thumper> menn0: yep, just doing some hr stuff first
<menn0> thumper: ok
<thumper> menn0: done
<thumper> dinner time
<thumper> back for meeting in 4 hours
<thumper-afk> bugger
<rogpeppe> mornin' all
<menn0> thumper: thanks for the review. merging now.
<mattyw> davechen1y, ping ping?
<mattyw> davechen1y, cancel that
<voidspace> morning all
<jam> morning voidspace
<voidspace> o/
<Egoist> Hi
<Egoist> is -relation-departed hook is executed after remove unit from service?
<dimitern> Egoist, quoting juju-core/doc/charms-in-action.txt: The "relation-departed" hook for a given unit always runs once when a related unit is no longer related. After the "relation-departed" hook has run, no further notifications will be received from that unit; however, its settings will remain accessible via relation-get for the complete lifetime of the relation.
<dimitern> jam, ping
<jam> dimitern: /wave
<dimitern> jam, should we have a quick hand-off chat re networking?
<jam> certainly
<jam> give me a sec to grab headphones
<dimitern> sure
<TheMue> dimitern: jam: anything interesting regarding ipv6 (especially for lxc) here?
<dimitern> jam, i'm in the standup g+
 * TheMue simply will join too
<dimitern> TheMue, come as well yes
<perrito666> morning everyone
<dimitern> morning perrito666
<TheMue> perrito666: o/
<voidspace> jam: with this code all the tests pass (there's "some" session copying going on in the watchers)
<voidspace> https://github.com/voidspace/juju/compare/master...copy-sessions
<voidspace> jam: that removes the password hashing from the dummy provider and also from JujuConnSuite
<jam> TheMue: dimitern: want to join me in https://plus.google.com/hangouts/_/canonical.com/juju-sapphire
<TheMue> jam: already in
<jam> axw: what was the final results of address changes for charms?
<jam> voidspace: sounds good, I wanted to check with the list that it was time we could do it, but it sounds like we are ready
<voidspace> cool
<axw> jam: not much for now. for the short term, I've updated the uniter so that config-changed is triggered on machine/unit address change. in the longer term, (I think?) there will be a relation equivalent of config-changed that will be triggered when relation addresses change
<axw> so, no automatic changes for now
<axw> I've moved the card to the backlog
<jam> mgz: ping for team standup?
<wwitzel3_> jam: hah, I guess I did. (re: jet lag)
<voidspace> what else is using juju/txn now that it has been broken out into a separate package?
<voidspace> 'coz I want to change the transaction runner in a backwards incompatible way
<natefinch> voidspace, wwitzel3: all team meeting?
<wwitzel3> natefinch: yep, sure
<voidspace> natefinch: I can join
<voidspace> for the sake of nostalgia
<voidspace> oh!
<voidspace> all team meeting
<voidspace> cool
<mgz> doh, should have looked at calendar
<Guest63023> voidspace: blobstore
<voidspace> Guest63023: ok, thanks
<voidspace> wallyworld: ah, I wondered who you were :-)
<voidspace> wallyworld: so I want to make the underlying transaction runner do the session copying
<wallyworld> stupid freenode
<voidspace> wallyworld: as that's the correct thing to do
<voidspace> wallyworld: but it changes the signature of NewRunner
<wallyworld> voidspace: go for it, that would be awesome
<wallyworld> i'll fix blobstore
<voidspace> wallyworld: and only the replicaset tests fail when I do that
<wallyworld> they fail a lot anyway :-)
<voidspace> wallyworld: although obviously the txn tests themselves won't run
<voidspace> hah
<voidspace> they fail deterministically now
<voidspace> so I guess you could call that progress...
<wallyworld> yup :-D
<wallyworld> we should do what's the best approach and fix tests accordingly
<voidspace> only one fails, and it looks like it's the ipv6 one
<wallyworld> hmmm, that may be different root cause?
<voidspace> well, it was passing and now it's failing...
<voidspace> but the specific error is "exception: need most members up to reconfigure, not ok"
<wallyworld> there's one ip6 one that has been flakey
<voidspace> which maybe indicates a timing issue
<wallyworld> yep
<voidspace> I don't think it's this one that is the normally flaky one
<wallyworld> hmmm, strange that it's just the 1 test
<voidspace> this one just starts a replicaset with members location specified using ipv6 format (so the replicaset members talk to each other over ipv6)
<voidspace> yeah, I'm digging into it
<wallyworld> can't see how session clone relates to that, unless sockets somehow break on ip6
<voidspace> it may just be that the extra connections need some extra time
<wallyworld> that would be more plausible
<wallyworld> try increasing the sleep just for shits and giggles to see what happens
<voidspace> yep
<voidspace> but coffee first
<mgz> interestingly the bot has been pretty solid for the last few days
<mgz> all the red seems to be real issues
<dimitern> mgz, really? I had 2 reds yesterday - one landed successfully, the other was genuine
<mgz> dimitern: ah, the only one of yours I saw looked real
<dimitern> mgz, there was a test failure previously on the same PR
<dimitern> mgz, the funny thing though is how jenkins decides to mark a build red very early.. seems fishy
<mgz> oh, yeah, something jenkinsie went wrong on 43, which wasn't test related
<mgz> I forgot that I'd requeued that
<axw> wallyworld: I'm not going to make the standup, team meeting ate into dinner
<mgz> they ate your dinner?!
<wallyworld> axw: no problem at all, talk tomorrow
<axw> wallyworld: I've got my StateInfo branch all working, need to tidy up and split it
<wallyworld> great :-)
<axw> and now there seem to be more action test races
<wallyworld> :-(
 * axw goes for dinner
<wallyworld> axw: i'd like to continue to remove ome more old cruft too
<wwitzel3> why would a test be trying to actual create a machine with the 'someprovider' from the FakeConfig when calling AddMachine?
<wwitzel3> the test is erroring with  value *errors.errorString = &errors.errorString{s:"no registered provider for \"someprovider\""} ("no registered provider for \"someprovider\"")
<wallyworld> katco: mgz: i'll be a bit late for standup, in another meeting
<mgz> wallyworld: poke when ready
<axw> poke me too, I finished dinner and am around atm
<katco> good morning all
<voidspace> katco: o/
<katco> voidspace: hallo
<voidspace> wallyworld: just FYI, it was a timing issue - around destroying instances
<voidspace> wallyworld: we need to give the replicaset time to adjust
<wallyworld> voidspace: yeah, thought so :-)
<voidspace> wallyworld: not sure why it wasn't biting us before
<wwitzel3> katco: morning
<voidspace> wallyworld: we have some "strategy" code around the other operations to deal with it
<wallyworld> that's the nature of it
<katco> wwitzel3: howdy
<voidspace> wwitzel3: but weren't using it around Remove
<wallyworld> sigh
<voidspace> oops, sorry wwitzel3
<voidspace> wallyworld: well, I'm happy it was an easy fix
<wallyworld> voidspace: yes :-)
<voidspace> wallyworld: I have a doctors appointment soon - but should have a txn branch to propose later today
<wallyworld> katco: morning, i'm running late, in another meeting
<wallyworld> voidspace: you're awesome
<katco> wallyworld: no worries, just yell when you're ready
<voidspace> wallyworld: haha, wait until you see how much I leave you to do before you say that
<voidspace> wallyworld: but thanks
<wwitzel3> lol
 * wallyworld is nervous :-)
<katco> voidspace: you rewrote juju in python didn't you
<mgz> rerewrote
<voidspace> katco: we actually came up with a way to do that
<katco> rofl
<voidspace> a go to python source translator would be relatively easy
<katco> -.-
<wwitzel3> hah
<voidspace> so we do that, run it once, then switch to Python dev
<voidspace> mgz said he would work on it over the weekend
<mgz> >_<
<voidspace> :-)
<katco> haha
<katco> axw: replied to your latest review comment (thank you). so i'm OK to add a file-lock to the bootstrap command?
<wallyworld> mgz: katco: ready :-)
<axw> katco: thanks, will read them in a bit. i'm not sure - is it actually possible to get into the race condition with the same $JUJU_HOME? the .jenv file is meant to be a sort of lock
<katco> axw: maybe we can discuss a little at the end of the standup
<cmars> jam1, sure
<cmars> joining
<cmars> oh
<vladk> dimitern: ping
<perrito666> does anyone know how to get a user/password to connect to a github.com/juju/testing . MgoInstance?
<natefinch> perrito666: sorry, I'm not sure.  If you figure it out, can you make a quick PR to put that info in the comments of MgoInstance?
<perrito666> natefinch: I certainly will once I do
<natefinch> mgz, dimitern, rogpeppe, jam:  any of you guys able to point perrito666 to how he can get the user/password for a testing.MgoInstance?
<jam> natefinch: user admin
<jam> password dummy-secret IIRC, or testing.DefaultMongoPassword ?
<perrito666> jam: testing.DefaultMongoPassword does not work, I get auth failures. I'll keep looking, I'll eventually figure it out or throw the computer out the window
<jam> perrito666: where in the code are you trying to do it?
<jam> perrito666: environs/dummy/environs.go Bootstrap sets the password to Config().AdminSecret()
<jam> (or the hash of it before first connect)
<jam> (line 654)
<perrito666> jam: I am writing tests for the new restore, most, if not all uses in tests use MustDial ¯\_(ツ)_/¯
<jam> perrito666: so for restore, are we bootstrapping with the dummy environ?
<jam> if we haven't ever connected to it before then the password should be Hash(password)
<jam> but in real "restore" wouldn't the jujud have come up at least 1 time?
<jam> maybe in testing it hasn't
<jam> perrito666: so I have to ask a bit where this is getting set up, as it probably makes a difference
<jam> until you bootstrap, I don't think we have any users in mong
<jam> mongo
<jam> perrito666: fwiw voidspace has been working in this place a bunch right now, he might have answers if I'm out
<perrito666> jam: mm, I see I have not yet connected to it at that point, I am writing tests for the function that re-sets replset
<bodie_> morning #juju-dev :)
<perrito666> jam: tx for the info :)
<perrito666> bodie_: hi
<jam> morning bodie_
<TheMue> bodie_: heya
<abentley> sinzui: chat?
<sinzui> yes
<abentley> https://plus.google.com/hangouts/_/calendar/Y3VydGlzQGNhbm9uaWNhbC5jb20.5o5o70vgo1v23fjnopr1famgrg?authuser=1
<bodie_> TheMue, did you have anything more to add on PR 311?
<bodie_> I was talking about a change to the whole thing with fwereade -- I have some content I've been working on but refactoring the tests ended up being a big chunk
<TheMue> bodie_: will take a look
 * TheMue's irc client forgot to notify me in case of a mention :(
<natefinch> bodie_: can you put that license file in gojsonschema ASAP?  It's blocking our release
<natefinch> bodie_: I presume the right thing to do is just pull down the one added to the upstream repo
<bodie_> natefinch, sure, give me 5 minutes to wrap up here.  mgz gave me the impression it wasn't a front of the line concern, sorry to keep you waiting
<natefinch> bodie_: it's ok, I didn't really follow up yesterday.
<natefinch> FWIW, I blame mgz too ;)
<perrito666> voidspace: ping?
<voidspace> perrito666: I have to go to the doctors - like *right* now
<mgz> bodie_: finishing your branch was more important, which you did :)
 * bodie_ throws a tomato at mgz
<perrito666> voidspace: no hurry
<voidspace> perrito666: I shouldn't be long though and we can talk when I get back
<perrito666> cu later or tomorrow
<voidspace> perrito666: ok
<bodie_> mgz, if it LGTY, LMK ;)
<mgz> TSGTM
 * natefinch just figured out bodie_'s github account name
<bodie_> hehehe
<bodie_> now I feel a little silly... that name is a throwback to my teens
<natefinch> haha
<bodie_> mgz, natefinch, did you want to have a brief chat about whether we should be using my repo or try to get a proper update merged into xeipuuv's?
<natefinch> bodie_: neither.  we should fork yours into github/juju
<bodie_> that's what seemed reasonable to me as well
<natefinch> bodie_: I'm not confident that xeipuuv will keep it up to date, and since it's written on canonical time, it should be in a canonical repo
<bodie_> that was my primary reason for keeping my work in my fork, he seemed unresponsive on some queries
<natefinch> also, we don't want to be blocked on some external person accepting our pull requests for our code to work
<bodie_> right
<bodie_> can I simply create a repo in juju?
<mgz> nate can
<natefinch> you can't, but I can.  I think what we'll do is have you put the license in your repo, and then I'll just fork your repo into juju
<bodie_> cool
<mgz> I do want the gap between upstream and us to be as small as possible, and them to be aware of our changes
<perrito666> natefinch: bodie_ you made me look up
<natefinch> mgz: that's pretty fair.  Maybe fork upstream and merge in bodie_'s changes?
<natefinch> perrito666: haha
 * perrito666 notices he has been wearing his headphones without music for 2 hs
<natefinch> perrito666: haha, I do that all the time. :)
<bodie_> hate it when that happens... I don't want sweaty ears unless I'm getting something out of it! :P
<natefinch> https://github.com/juju/gojsonschema
<perrito666> bodie_: its winter, my ears feel pretty good about the heat
<bodie_> natefinch, I'll poke a change into core quickly to use that and whatnot
<natefinch> bodie_: ok.  I need to merge in your stuff first.  hopefully it'll just work :/
<bodie_> I'll give it a shot too
<bodie_> erm, it looks like it's still using the other two bad deps
<natefinch> bodie_: in theory, all the fixes you made will just work
<bodie_> sigu-399/gojson* was a big pain point that I had to rework
<bodie_> *pokes natefinch*
<natefinch> merge conflict?  damn
<bodie_> natefinch, there's also a couple of bad deps in that
<bodie_> not sure if you saw
<bodie_> I pulled out all three libs
<bodie_> gojsonschema, gojsonreference, and gojsonpointer
<natefinch> bodie_: right, but if I pull in your changes, it should put the juju repo in the same state as your repo
<bodie_> should've thought to mention that -_-
<natefinch> "should"
<bodie_> well, my gojsonschema repo, yes
<bodie_> but gojsonschema has two more or less shoddy external dependencies in sigu-399's account
<bodie_> which I fixed
<natefinch> yeah, all I'm doing is forking your upstream and then merging in your changes
<bodie_> I'm not sure which one of us is failing to understand the other, haha
<natefinch> bodie_: oh, I see, there's more than the one repo, I get it.  gojsonreference
<bodie_> and gojsonpointer
<natefinch> gah, those have no licenses either
<bodie_> it's prepended to the file, which is acceptable under apache 2
<bodie_> I wasn't totally sure whether that was an issue
<natefinch> oh, cool, yeah, that's fine
<perrito666> natefinch: standup?
<natefinch> yep
<natefinch> perrito666: sorry, no I haven't done the set state server address thing
<katco> mgz: so i'm wondering if this variable name is appropriate, "existing". it's only true if there is an error and the error is that the environment path _doesn't_ exist
<bodie_> heh
<katco> mgz: it's causing some mental gymnastics, and actually might be the source of _a_ bug; e.g.: we cleanup only if an environment was already found, not created
<bodie_> they may be in a standup, btw
<bodie_> not sure how they do things
<katco> mgz?
<katco> he's on my team, not sure if he goes to other stand-ups
<perrito666> katco: he uses irssi, he might not get notified of your pings
<mgz> sec, just wombled off
<katco> perrito666: ah ok :) i used to use irssi; migrated to erc recently
<katco> mgz: wombled... that is a new phrase for me
<mgz> 'existing' should be refering to if there was a .jenv file *before* we called Prepare
<mgz> which sort of makes sense
<mgz> there are just too many negatives in there to make that clear :)
<katco> mgz: haha, yeah... even as i'm trying to clarify i have to reparse this in my head
<mgz> ReadInfo -> !IsNotFound -> existing = true
<bodie_> oy
<bodie_> isn't not found?
<mgz> !existing -> destroy
<bodie_> wouldn't that mean IsNotFound -> destroy?
<mgz> right, any err other than not found means you fall through and existing remains = false
<mgz> (or err == nil in the common but not-explict case)
<mgz> the code is just generally confusing
<katco> mgz: so, true iff either there is no error OR there exists an error but it's not that we couldn't find the env
<mgz> katco: the reverse, false iff...
<mgz> wait, now I'm backwards?
<katco> mgz: haha this code is a mental trap
<mattyw> sorry for missing the core team meeting today folks - completely missed the reminder
<mgz> yours was right
<katco> mgz: i think i'm correct which is perhaps why there's a bug
<katco> mgz: right, b/c currently if there _is_ an error, but it's something other than "couldn't find env", it will still mark "existing = true"
<mgz> right, which is wrong
<katco> mgz: i think this should occur: existing = (err == nil)
<katco> i think that's true for all cases
<mgz> yeah, I think so
<katco> mgz: i think i have the fix; but i think you're going to be the only one who can review it haha
<katco> mgz: take a peek at this and let me know what you think: https://pastebin.canonical.com/113716/
<mgz> yah, the three states seem correct there
<katco> mgz: awesome, i'll go with this. the test i have seems to agree
<mgz> you can probably just do if err == nil { envExists = true } else if IsNotFound(err) { ...warning.. } else { return err }
<katco> mgz: i am not a huge fan of relying on default values, but i'm not opposed if you feel strongly
<katco> mgz: or wait, did i misread that...
<mgz> katco: I mostly just like simple diffs, I agree reassigning envExists isn't super
<katco> mgz: +1 on simple diffs
<mgz> but really it's an either-or for the first two blocks
<mgz> as unexpected errors we should just propagate rather than go on and Prepare at all
<katco> mgz: i like your way better... reads cleaner
<katco> mgz: other than the default value ;)
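The shape mgz suggests can be sketched like this. `readInfo`, `environExists`, and `errNotFound` are illustrative stand-ins for the configstore's ReadInfo and juju's not-found error check; the real code lives in cmd/juju's common.go.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for the configstore's not-found error.
var errNotFound = errors.New("environment info not found")

func isNotFound(err error) bool { return err == errNotFound }

// readInfo is a hypothetical stand-in for store.ReadInfo(envName).
func readInfo(name string) error {
	switch name {
	case "known":
		return nil // a .jenv file already exists
	case "missing":
		return errNotFound
	default:
		return errors.New("i/o error reading environment info")
	}
}

// environExists implements the three-way split from the discussion:
// a nil error means the environment already existed; not-found means
// it did not (warn and carry on); anything else propagates so we
// never Prepare on top of an unexpected failure.
func environExists(name string) (bool, error) {
	err := readInfo(name)
	if err == nil {
		return true, nil
	} else if isNotFound(err) {
		fmt.Printf("warning: no environment info for %q\n", name)
		return false, nil
	}
	return false, err
}

func main() {
	for _, name := range []string{"known", "missing", "broken"} {
		exists, err := environExists(name)
		fmt.Printf("%s: exists=%v err=%v\n", name, exists, err)
	}
}
```

This also sidesteps the original bug: an unexpected error no longer falls through and marks the environment as existing.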
<natefinch> katco: can you fix the comment on environFromName so it explains what the function is for that it returns?
<katco> natefinch: sure... would naming the return param work?
<natefinch> katco: yeah, I was going to suggest that too... can you do both, though?  Cleanup is a good name, but it still doesn't describe when you should call it and stuff.  Pretend you're brand new and have no idea what most of this code does :)
<mgz> that's all of us...
<katco> natefinch: well that's not hard to pretend ;) but yeah, i'll do both. tyvm for the suggestion
<natefinch> right
<natefinch> that's actually something I'd like us to work on... put as much helpful information in comments as possible for the next developer, instead of assuming the next person has a perfect comprehension of the mechanics of the entire system.
<katco> do we prefer line breaks for each arg, or several on a line but break before c80?
<natefinch> katco: up to the individual developer as long as gofmt likes it.  Personally, I prefer one arg per line like a struct declaration, instead of randomly breaking somewhere in the middle of the args
<katco> natefinch: +1
<katco> natefinch: although the indentation isn't how i would do it:
<katco> func environFromName(ctx *cmd.Context,
<katco> 	envName string,
<katco> 	resultErr *error,
<katco> 	action string) (environs.Environ, func(), error) {
<katco> natefinch: i'd like for the args to be under the first, and bringing the 1st down a line reads strangely to me
<natefinch> katco: this is my own personal favorite way:
<natefinch> func destroyPreparedEnviron(
<natefinch> 	bar *cmd.Context,
<natefinch> 	baz environs.Environ,
<natefinch> ) (foo, error) {
<natefinch> reads a lot like a struct definition, which my eyes parse easily
<katco> natefinch: i like that
<natefinch> but again, the Juju rule is, if gofmt likes it, it's ok
<katco> natefinch: consider it adopted! :)
<natefinch> cool
<katco> that is very easy to parse for me
<alexisb> natefinch, team: anyone have time to take a look at this bug today:
<alexisb>  https://bugs.launchpad.net/ubuntu/+source/juju/+bug/1078213
<_mup_> Bug #1078213: logs are not logrotated <amd64> <apport-bug> <canonical-is> <canonistack> <logging> <precise> <juju-core:Triaged> <juju (Ubuntu):Triaged> <https://launchpad.net/bugs/1078213>
<alexisb> it is starting to affect the field
<natefinch> alexisb: I can look at it.  I've wotked on it before
<natefinch> worked
<katco> mgz: i've found another curiosity... cleanup gets passed in a pointer to an error, and if the error is nil, returns. as far as i can tell, that error will never be anything but nil...
<natefinch> alexisb: can you add wayne to canonical-juju?  evidently he's not on it
<alexisb> o, yeah let me see how to do that
<katco> natefinch: alexisb: i'm not sure i'm on that either
<alexisb> katco, ack
<mgz> katco: you mean the resultErr to destroyPreparedEnviron?
<katco> alexisb: at least a quick search shows no messages
<katco> mgz: yeah
<mgz> that's a funky defer hack
<katco> mgz: am i just reading that wrong? it never gets set?
<mgz> cleanup is defered till the end of Run, which has the named return resultErr
<mgz> which was passed *as a reference* to environFromName and on into cleanup
<katco> mgz: right, but we never do anything with it as far as i can tell
<katco> mgz: in any of the stack frames
<mgz> so, when cleanup runs, *after* the return from Run, it's populated with the return err from that runction
<katco> mgz: ooooh i see
<katco> mgz: jees... that is not intuitive at all
<mgz> what we do with it is *err == nil
<katco> mgz: a defer func() { if resultErr != nil { cleanup() } }() would be much clearer
<natefinch> ericsnow: 1:1 in moonstine?
<mgz> so, cleanup does not destroy if Run returned successfully
<ericsnow> natefinch: coming
<mgz> yes, having two levels of guards on destroy is just confusing
<katco> mgz: so the ever-present question: do i make that change?
<mgz> you could pull it up into cleanup pretty trivially if it's not used otherwise, which it seems not to be
<katco> mgz: i agree
<katco> mgz: and i apologize; regarding this morning's discussion; if the environ already existed, we just want to remove the jenv file, not destroy the entire thing, correct?
<katco> mgz: hm it looks like that pattern (err ref in cleanup) is also used in synctools
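The defer hack mgz describes relies on Go's named result parameters: a pointer to the named return is captured before it has been assigned, and only carries the real error once the deferred call fires after Run returns. A stripped-down sketch (`run`, `cleanup`, and the destroy hook are all invented for illustration):

```go
package main

import (
	"errors"
	"fmt"
)

// cleanup destroys the freshly-prepared environ only if the caller
// is returning an error; resultErr points at the caller's *named*
// return value, so dereferencing it is only meaningful once the
// deferred call actually runs.
func cleanup(resultErr *error, destroy func()) {
	if *resultErr != nil {
		destroy()
	}
}

// run mimics cmd/juju's Run: because resultErr is a named result,
// the deferred cleanup observes whatever error run ultimately returns.
func run(fail bool) (destroyed bool, resultErr error) {
	defer cleanup(&resultErr, func() { destroyed = true })
	if fail {
		return destroyed, errors.New("bootstrap failed")
	}
	return destroyed, nil
}

func main() {
	d, err := run(true)
	fmt.Println("failure path destroyed:", d, err)
	d, err = run(false)
	fmt.Println("success path destroyed:", d, err)
}
```

katco's alternative, `defer func() { if resultErr != nil { cleanup() } }()` directly inside Run, expresses the same thing without threading an `*error` through intermediate calls.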
<dimitern> anyone willing to review a small fix for bug 1343219 ? https://github.com/juju/juju/pull/327
<_mup_> Bug #1343219: networker restarts every 3 seconds with the local provider (missing /etc/network/interfaces) <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1343219>
<TheMue> voidspace: to allow a review, could you add a bit more information to your txn PR?
<voidspace> TheMue: which bit don't you understand?
<voidspace> ;-)
<voidspace> TheMue: ok
<TheMue> voidspace: it's more the general comment for the whole PR, about the intention
<voidspace> TheMue: yep, no problem
<TheMue> voidspace: great, thx
<dimitern> TheMue, voidspace ^^ ? :)
<TheMue> voidspace: btw, I'm taking some notes about IPv6 in LXC containers
<TheMue> dimitern: will take a look
<voidspace> dimitern: eh?
<voidspace> ah
<voidspace> TheMue: great
<dimitern> :)
<dimitern> TheMue, oops, sorry - I had to fix a test slightly, but the rest is fine
<voidspace> Sooo... in order to use collections safely we should copy them before performing any actions on them
<voidspace> in order to use a new session
<voidspace> unfortunately State is a big bundle of collections
<voidspace> I'm considering replacing the collections with a function to return a collection instead
<voidspace> so it's impossible to access them without creating a new copy
<voidspace> however, wherever they're copied a closer function is also returned - so session.Close can be called appropriately
<voidspace> that's going to create some uglyish code
<voidspace> it will make it more likely that I catch all the places using collections though
<voidspace> I think it needs an email to juju-dev
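voidspace's accessor idea might look roughly like this. `Session` and `Collection` are minimal stand-ins for the mgo types, and `getCollection` for the proposed method; the key property is that nothing can touch a collection without going through a fresh session copy and its closer.

```go
package main

import "fmt"

// Session and Collection are minimal stand-ins for mgo's types.
type Session struct{ closed bool }

func (s *Session) Close() { s.closed = true }

type Collection struct {
	Name    string
	session *Session
}

// State no longer stores collections directly; it only keeps the
// base session to copy from.
type State struct {
	session *Session
}

// copySession stands in for mgo's Session.Copy, which takes a
// fresh socket from the pool.
func (st *State) copySession() *Session { return &Session{} }

// getCollection hands back the named collection bound to a fresh
// session copy, plus a closer the caller must defer. Because the
// collections are no longer fields on State, the compiler flags
// every call site that still needs converting.
func (st *State) getCollection(name string) (*Collection, func()) {
	session := st.copySession()
	return &Collection{Name: name, session: session}, session.Close
}

func main() {
	st := &State{session: &Session{}}
	machines, closer := st.getCollection("machines")
	defer closer()
	fmt.Println(machines.Name) // prints "machines"
}
```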
<wwitzel3> hey dimitern , any idea why one of my tests in state (calling AddMachine) would be returning an error that there is no registered provider for "someprovider" which is the details from FakeConfig.
<natefinch> voidspace: definitely
<dimitern> wwitzel3, I've never seen that :/
<voidspace> natefinch: my branch with some session copying for the watchers - and session copying for the transaction runner - passes all the tests
<voidspace> wwitzel3: that sounds to me like some mocking isn't taking effect
<natefinch> voidspace: nice
<voidspace> wwitzel3:  something that's supposed to be catching calls isn't and they're propagating too far
<TheMue> dimitern: ping me when you fixed the test, the rest indeed looks good
<dimitern> wwitzel3, i seem to recall some piece of code getting a fakeconfig and changing the type to "someprovider" though
<natefinch> voidspace: that'll help with our scaling up as well.
<wwitzel3> voidspace, dimitern: yeah the fake environ is setup and it is even using the dummy provider
<wwitzel3> voidspace, dimitern: I have one test that it works on and one test is fails on, so right now i'm just talking the code to see where the passing one mocks and the failing one doesn't
<dimitern> wwitzel3, so which test passes and which fails?
<bodie_> mgz, hangout or are we good?  I don't have any real questions yet, I'm just starting to dig into the jujuc stuff
<dimitern> TheMue, PR updated
<TheMue> dimitern: ok
<wwitzel3> dimitern: the failing ones for me are in the megawatcher tests and many of them pass in the other suites in state.
<wwitzel3> dimitern: I am sure it has to do with me not mocking something properly with the new refactors I've made, I was just hoping for a pointer in the right direction for where that gets mocked.
<wwitzel3> I feel like it should just be calling the AddMachine from dummy provider
<wwitzel3> and that should be that
<dimitern> wwitzel3, why you need an environment in state tests?
<TheMue> dimitern: reviews
<dimitern> wwitzel3, I mean - state tests usually don't mess with provider types, etc. just set and get the envconfig
<TheMue> dimitern: eh, reviewed
<dimitern> TheMue, ta!
<TheMue> dimitern: quick solving of the issue
<wwitzel3> dimitern: because this megawatcher_internal_test calls AddMachine a lot.
<wwitzel3> dimitern: I'm not messing with them in the tests, I've messed with things in general and I'm trying to fix the tests :)
<dimitern> TheMue, it's quick-n-dirty fix for now - we'll probably have to handle missing /etc/network/interfaces better in the future
<dimitern> wwitzel3, ah :)
<dimitern> wwitzel3, sorry I couldn't help you more
<wwitzel3> dimitern: no worries, appreciate the time, I'll get it figured out
<wwitzel3> it is most likely something simple and dumb
<wwitzel3> and completely my fault :)
<TheMue> dimitern: yep, we need to know if it is needed or, like in the mentioned case, not
<dimitern> I know the feeling - esp. when I mess around with stuff in state
<dimitern> TheMue, or create it ourselves even
<dimitern> ok guys
 * dimitern is off - see you in 2 weeks ;)
<natefinch> bodie_: one of the jsonschema tests is panicking?
<natefinch> bodie_: http://pastebin.ubuntu.com/7809812/
<bodie_> natefinch -- hrm
<hackedbellini> thumper: we just tried the workaround you mentioned yesterday, but it didn't work :(
<hackedbellini> natefinch: to update you, yesterday after you left, thumper said our problem here is probably because it was missing "ha" (I don't know what that means =P), and it was caused because we upgraded from 1.19 to 1.20 and not from 1.18 to 1.20 (that second migration would trigger that "ha" to be properly setup)
<hackedbellini> as a workaround, he said we could try changing the "upgradedToVersion" on machine-0's agent.conf to read "1.18.4" instead of "1.19.3" so when the agent started, it would maybe fix it
<bodie_> natefinch, is this using my deps?
<bodie_> mine is passing
<katco> apparently errors.IsNotFound(nil) returns false
<natefinch> bodie_: ahh yeah, I was missing your changes to gojsonreference
<natefinch> katco: yeah, most "isFoo" should return false on nil
<natefinch> especially when checking interfaces
<katco> natefinch: generally agreed, but it makes "if environInfo, err := store.ReadInfo(envName); nil == err || !errors.IsNotFound(err) {...}" all the more confusing
<natefinch> katco: double negatives are always a problem
<natefinch> katco: and yeah, that code is not very good.  I would split out the if statement from the rest... I generally only do the inline if statement if it's super simple
<natefinch> line returns are free, developer confusion is expensive
<katco> natefinch: yeah that's what we've decided to do; but i missed a case
<katco> natefinch: i don't know why, but this particular block of code is really good at scrambling people's brains
<natefinch> katco: show me?
<natefinch> file/line no
<katco> natefinch: the code? it was what we were discussing earlier
<katco> natefinch: common.go, environFromName
<katco> natefinch: it's going to look different by the time i submit a PR
<natefinch> katco: that sounds like a good thing :)
<katco> natefinch: absolutely... axw, mgz, and i were thoroughly confused
<katco> natefinch: but i think one of them will have to be the reviewer, otherwise it looks like much too large of a PR
<katco> speaking of which... i've never done this. so i've made changes since my last PR, and i want to resubmit. does github handle that for me, or how do i submit a fresh PR?
<perrito666> katco: if your pr has not beenmerged
<perrito666> just push
<perrito666> that updates the pr
<katco> perrito666: ah ok. ty
<natefinch> depends... pushing will put another commit on the PR, or you can rebase to update the commit already in the PR.
<katco> natefinch: so if i rebase from trunk, it will update the PR but erase prior history (comments, etc.)?
<natefinch> right
<natefinch> so, not rebasing means you keep the comments
<natefinch> ...and this is why we want to move to reviewboard, so it's simpler and easier
<natefinch> I think
 * natefinch has not actually used reviewboard
<katco> cool
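The push-updates-the-PR workflow perrito666 describes can be demonstrated end to end with a local bare repository standing in for GitHub (paths and branch names here are made up):

```shell
set -e
work=$(mktemp -d)

# a bare repo stands in for the GitHub remote
git init -q --bare "$work/origin.git"
git clone -q "$work/origin.git" "$work/clone"
cd "$work/clone"
git config user.email dev@example.com
git config user.name dev

echo one > file.txt
git add file.txt
git commit -qm "initial"
git push -q origin HEAD

# the branch behind an open pull request
git checkout -qb fix-environ-exists
echo two >> file.txt
git commit -qam "fix existing-environ check"
git push -q origin fix-environ-exists   # open the PR from this branch

# review comments arrive; push to the SAME branch and the open PR
# updates in place, keeping its comment history
echo three >> file.txt
git commit -qam "address review comments"
git push -q origin fix-environ-exists

git rev-list --count origin/fix-environ-exists   # prints 3
```

Rebasing and force-pushing instead rewrites the commits under review, which is why it can orphan line comments attached to the old commits.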
<natefinch> bodie_: ok, so all the jsonschema stuff is now under github.com/juju
<bodie_> right on
<bodie_> natefinch, shall I open a PR to correct the imports and godeps?
<natefinch> bodie_: yes please
<katco> natefinch: btw, in environs/configstore/disk.go, my initial impression is all of those mutexes should be removed, and the variables be converted to a non-buffered channel
<katco> natefinch: just ran across that when reading through this code
<natefinch> katco: I'm not entirely sure about the reason for all that locking, so I'm not ready to comment on if it should be changed.
<katco> natefinch: from the docs: "Package sync provides basic synchronization primitives such as mutual exclusion locks. Other than the Once and WaitGroup types, most are intended for use by low-level library routines. Higher-level synchronization is better done via channels and communication."
<katco> natefinch: if we're not using panics b/c it is not idiomatic go, we should not be using mutexes either
<natefinch> katco: meh. The go authors have admitted, sometimes a lock is a much simpler solution to a problem
<bodie_> mgz, good to go on pull 311?
<katco> natefinch: that seems like a double standard to me
<natefinch> It's a pragmatic standard.  Using channels can complicate code where it's not really necessary.  Panic *always* complicates code... and causes your implementation to leak into the caller's code.  Callers of code that uses mutexes don't ever need to know you're using a mutex unless you expose it.
<katco> natefinch: although i disagree about panics (exceptions), there is the same argument for channels
<katco> natefinch: callers need never know how you've implemented synchronization
<natefinch> katco: You're welcome to take it to juju-dev.  I think it's good to want to make our code more idiomatic.
<natefinch> (juju-dev the mailing list that is)
<katco> natefinch: i'd like to once i'm a little more settled. i just thought you'd be interested since we had been discussing it
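For what it's worth, both approaches keep the synchronization invisible to callers, which is natefinch's point. A toy comparison (both cache types are invented for illustration): the mutex version guards a map directly, the channel version serializes access through a goroutine that owns the map.

```go
package main

import (
	"fmt"
	"sync"
)

// mutexCache: the lock is an internal detail the caller never sees.
type mutexCache struct {
	mu sync.Mutex
	m  map[string]string
}

func (c *mutexCache) put(k, v string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[k] = v
}

func (c *mutexCache) get(k string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.m[k]
}

// chanCache: a single goroutine owns the map; all access is
// funneled through it as closures.
type chanCache struct{ ops chan func(map[string]string) }

func newChanCache() *chanCache {
	c := &chanCache{ops: make(chan func(map[string]string))}
	go func() {
		m := make(map[string]string)
		for op := range c.ops {
			op(m)
		}
	}()
	return c
}

func (c *chanCache) put(k, v string) {
	done := make(chan struct{})
	c.ops <- func(m map[string]string) { m[k] = v; close(done) }
	<-done
}

func (c *chanCache) get(k string) string {
	res := make(chan string)
	c.ops <- func(m map[string]string) { res <- m[k] }
	return <-res
}

func main() {
	mc := &mutexCache{m: make(map[string]string)}
	mc.put("a", "1")
	cc := newChanCache()
	cc.put("a", "1")
	fmt.Println(mc.get("a"), cc.get("a")) // prints "1 1"
}
```

For simple shared state like this, the mutex version is shorter and leaks nothing extra to callers; channels earn their keep when the coordination itself is the interesting part.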
<katco> hm it doesn't look like my PR was automatically updated with my push... do i close the existing one and resubmit?
<natefinch> I think if you just re-PR it'll work
<natefinch> sorry, I haven't landed a ton of code since the switch  :/
<katco> natefinch: ok (crosses fingers)
<katco> natefinch: oh no worries, i gather everyone is still figuring this out
<natefinch> I'm sure mgz would know, possibly wwitzel3 or ericsnow too
 * natefinch just pings everyone
<wwitzel3> :)
<ericsnow> katco, natefinch: I've done it a few times at is always works (though I'm pretty sure you have to refresh the page and it *may* not be instant)
<wwitzel3> katco: if you pushed to the same branch, it should have updated it
<ericsnow> s/at is/and it/
<perrito666> voidspace: ping
<katco> ericsnow: wwitzel3: ty gentlemen
<bodie_> natefinch, http://paste.ubuntu.com/7810105/
<bodie_> seems odd that there's still a dep for binary132/gojsonpointer
<natefinch> bodie_: oops, probably my fault
<bodie_> natefinch, I grepped for binary132 without result O.o
<bodie_> oh, oops, of course
<bodie_> yep
<bodie_> gojsonreference
<bodie_> however, I'm not sure how to PR my fix
<bodie_> since it's forked from xeipuuv, when I open the fix branch, it wants to PR against xeipuuv's
<bodie_> looks like you can either merge my change by adding my branch as a remote, or just do it yourself, I suppose
<bodie_> natefinch, it's just the import in gojsonreference
<natefinch> bodie_: yeah, fixing it
<natefinch> bodie_: there you go
<voidspace> perrito666: pong
<voidspace> biab - going jogging
<voidspace> sorry perrito666
<perrito666> lol, voidspace, we'll talk tomorrow, it's just to make sure I understood correctly your changes to admin password (and how they most likely break old restore)
<voidspace> perrito666: https://github.com/voidspace/juju/compare/copy-sessions
<voidspace> perrito666: I don't directly change oldPassword
<voidspace> perrito666: it's still used in production
<voidspace> I haven't changed that
<voidspace> or at least, if it's still used it will still work
<perrito666> ok
<katco> ok https://github.com/juju/juju/pull/319 finally in
<rogpeppe> to whom it may concern: i'm around this evening (for the next 2 hours) if anyone needs to ask any juju-core questions
<bodie_> natefinch, sorry, just about done here, things got a little madcap here chez bodie
<bodie_> natefinch, https://github.com/juju/juju/pull/329
<natefinch> bodie_: why does that change juju/charm?
<bodie_> because that's where the gojsonschema dependency is
<natefinch> bodie_: ahh I see
<bodie_> natefinch, specifically, gojsonschema is used to validate against charm.ActionSpec
<bodie_> so, anywhere we're validating, we're doing so against that type
<bodie_> using a method on the type
<bodie_> I took the liberty of merging the import fix to charm since it tests OK
<natefinch> bodie_: cool
<natefinch> rick_h__: ping on the MaaS nuc stuff
<rick_h__> natefinch: sure thing, I'll pull my list together
<natefinch> rick_h__: thanks
<rick_h__> natefinch: https://pastebin.canonical.com/113747/ is what I picked up
<rick_h__> natefinch: and I got a usb -> ethernet adapter for the maas controller so that it's dual homed on both the external network (for the team to access) and the private on the switch for the maas network to operate
<rogpeppe> anyone fancy a review of the brand-new fresh-off-the-press charm store HTTP router: https://github.com/juju/charmstore/pull/14
<natefinch> rick_h__: awesome, thanks
<alexisb> alrighty all I am out of here! see you in a week
<natefinch> alexisb: have fun!
<natefinch> alexisb: we'll keep the place from burning all the way to the ground while you're gone
<alexisb> :)
<alexisb> I believe you natefinch
 * perrito666 is surprised by natefinch's comment while he fetches gas and a match
<bodie_> I'm sure there won't be any freak gasoline fight accidents.
<rick_h__> alexisb: have a good time, but did reply to your email :P light reading for the trip
<alexisb> rick_h__, thanks
<alexisb> btw rick_h__ good stuff that is very helpful
 * perrito666 begins the burning
<natefinch> I
<natefinch> I'd really love it if canonical hr would stop logging me out and deleting my in-progress reviews
<perrito666> well I only did one, but that did not happen to me
<natefinch> If you stay on the review page for too long "save and continue" means "dump you to ubuntu SSO and then redirect to the salesforce main page"
<perrito666> how fun... not
<natefinch> note that none of that includes actually saving your work
<perrito666> natefinch: try ctrl+click
<perrito666> if it's properly coded it should do that in the new tab
<natefinch> if....
<perrito666> so there you can go by with the re-login and then re save
<natefinch> I'll give it a try
<katco> wallyworld: i have to run to a preschool orientation, but i added some details to the card for the kvm stuff. i wouldn't mind you taking a look to see if i'm on the right track if you have any free time.
<katco> EOD, ciao!
<wallyworld> voidspace: if you're still around, how's the mgo session stuff?
<wallyworld> thumper: have you started looking at the container issues?
<thumper> wallyworld: no, have to do hr stuff that I'm already behind on
<wallyworld> thumper: yeah, i'm behind also :-(
<wallyworld> i'll start looking but may need your input
<wallyworld> i also gotta do a tonne of hr
<voidspace> wallyworld: hey, hi
<voidspace> wallyworld: soooo...
<wallyworld> hey
<voidspace> wallyworld: I did some more, completed the transaction runner and a few more collections in the watcher replaced with session copying
<voidspace> https://github.com/voidspace/juju/compare/copy-sessions
<voidspace> with an mp for juju/txn
<voidspace> https://github.com/juju/txn/pull/1/files
<voidspace> wallyworld: I was searching around looking at all the places that need to change and it was a bit daunting
<voidspace> because state.State is basically a big bag of collections that need to be copied any time they're used
<wallyworld> yep :-(
<voidspace> and whilst I was jogging a better solution occurred to me
<voidspace> as we shouldn't be reusing those collections they just shouldn't be stored on State
<wallyworld> i agree
<voidspace> instead a method that fetches the collection for you when you need it - then we can *ensure* that every time they're touched you have a new session
<voidspace> and the compiler will tell me all the places to fix
<voidspace> so I won't miss any
<wallyworld> yep
<voidspace> so I'm about to do that now
<voidspace> unfortunately it makes a little bit of my work so far redundant
<wallyworld> i think an email to gustavo may be useful just to ensure the right approach is being used
<voidspace> but I have a couple of useful functions I put in the mongo namespace
<voidspace> niemeyer: ping - I don't suppose you're still around are you?
<voidspace> wallyworld: ok, I have an email I started
<wallyworld> i'd like to understand why it was done the way it was in the first place
<wallyworld> considering it seems wrong
<wallyworld> maybe we're missing something
<voidspace> wallyworld: my initial idea was to replace all the collection references with methods to fetch the collection
<voidspace> but there's no need for it
<voidspace> but I started an email about it anyway
<voidspace> wallyworld: so the current revision is a useful point - all tests pass
<wallyworld> \o/
<voidspace> and I'm about to embark on the "grand delete"
<wallyworld> i do think removing the collections from state is the right approach
<voidspace> I'll send the email, but then just begin anyway
<voidspace> and if it turns out not to be wise we can just go back a few revisions
<wallyworld> seems more in line with the expected usage patterns
<voidspace> yep
<wallyworld> thank you
<voidspace> leaving the collections there is dangerous
<wallyworld> yep, i hate essentially global state like that
<voidspace> you only need to have one place that uses the global session to be at risk (extra risk anyway) of timeout
<voidspace> yep
<voidspace> so we have one global session - but we basically use that just to store connection details
<voidspace> for session copying (which is using socket pooling under the hood anyway)
<wallyworld> yep, that is how these things are normally architected
<wallyworld> cache the credentials, peel off a new session from a pool when needed
<wallyworld> hence i'd like to see if we're missing anything given the implementation as it is now seems wrong
<voidspace> and deleting those State collections nicely leverages the compiler to tell me when I've found all the places using them
<voidspace> yeah, no idea
<wallyworld> yes indeed
<voidspace> but gustavo *agrees* it's wrong and wants the change
<wallyworld> ok
<wallyworld> in that case jfdi :-)
<voidspace> and he said the right approach was to copy the session, deferring close, every time you do something
<voidspace> so, going downstairs to do it whilst watching vikings :-)
<voidspace> back online in a minute
<wallyworld> this will be a nice win - delete global state (shudder), fix stale sessions yada yada
<voidspace> there will probably be other places in the code that need work
<voidspace> we can grep for mgo.Collection and session
<voidspace> I had a look at a few - but some of those are innocent (like using the session immediately after creating it is fine I think)
<wallyworld> i think so too, so long as we don't then just hang on to that session and try and use it again later without a clone
<voidspace> right
<voidspace> so, "go build ./..." reports a few errors
<voidspace> followed by
<voidspace> state/addmachine.go:713: too many errors
<voidspace> I guess I fix them a few at a time...
<wwitzel3> voidspace: yeah I never figured out a way to get all of them
<voidspace> wwitzel3: seeing them all would be very depressing I think
<voidspace> a few at a time is probably the way to go...
<voidspace> :-)
<voidspace> wallyworld: we have a lot of code like
<voidspace> a.st.units.Name
<voidspace> where st is state and units is the "units" collection
<wallyworld> yes we do
<voidspace> isn't Name *always* going to be "units"?
<wallyworld> i think so yes ottomh
<voidspace> ottomh?
<wallyworld> off the top of my head
<voidspace> hah
<voidspace> thanks
<wallyworld> :-)
<mwhudson> off the top of my head
<mwhudson> doh
<voidspace> rather than copy the session and get a new collection I'm going to hardcode the name
<mwhudson> was scrolled up :)
<voidspace> and see what happens
<voidspace> mwhudson: morning
<voidspace> mwhudson: what are you doing here?
<mwhudson> voidspace: i was involved in the juju on arm64 stuff and haven't left yet!
<voidspace> hah
<voidspace> cool
<voidspace> I'm doing surgery
<thumper> that moment when you hear a crash followed by the words "don't move, there is glass everywhere"...
<voidspace> oh dear
<voidspace> thumper: how was tinkerbell?
<thumper> voidspace: one said "I literally died", I told her it only "figuratively or metaphorically died"
<thumper> the elder said it was 87 minutes of pure pain and 3 minutes of awesome
<voidspace> hehe
<thumper> youngest said "meh"
<thumper> and she was the one that wanted to see it
<thumper> so, not wonderful
<voidspace> my goodness, that's quite a collection you have
<voidspace> I shall knock it off my "must see" list then...
<thumper> heh
<voidspace> thumper: do you know any reason, beyond a desire not to hardcode strings, to do "state.cleanups.Name" instead of just "cleanups" ?
<voidspace> thumper: our code is littered with them (accessing column names via the columns - where we know the name)
<thumper> voidspace: got a concrete example?
<voidspace> state/cleanup.go line 88
<voidspace> thumper: there are tens of such examples
<voidspace> possibly a hundred, I don't know because the compiler just says "too many errors"...
<thumper> these are collections not columns
<thumper> but no, I think it is just to avoid magic strings
<thumper> voidspace: where is it initialized?
<voidspace> thumper: it's a state.State member created in state.Open (state/open.go) with db.C("cleanups")
<thumper> personally I'd rather have a package level constant: const cleanupCollection = "cleanups"
<thumper> and use that in state.Open, and in the collection descriptions
<thumper> but I can see how it started :)
<voidspace> thumper: well, I'm removing all those collections from state
<thumper> I bet it just grew organically from the first use
#juju-dev 2014-07-18
<thumper> oh?
<voidspace> we need to use a new session every time we execute a query against a collection
<thumper> voidspace: and put them where?
<voidspace> so they shouldn't be long lived
<voidspace> we should fetch them when we need them
<thumper> fair call
<voidspace> but we shouldn't create a new session just to use the name
<thumper> correct
<voidspace> so I'm just using the name
<thumper> make package constants
<voidspace> the pattern just seemed a bit odd
<thumper> please use a named constant not just a naked string
<voidspace> ok, I'll have to go back and change the ones I've done already
<thumper> hopefully that would be pretty mechanical
<voidspace> sure
<wallyworld> voidspace: +1 to constants
<voidspace> yep, doing it
<wallyworld> you also need to sleep
<voidspace> not quite mechanical, the const names will clash with the obvious name for the column collection - so I'll have to append a Col suffix to those
<voidspace> wallyworld: soon enough
<wallyworld> you have 1 more day to work on this, right?
<thumper> voidspace: or "Collection"
<wallyworld> lol
<voidspace> wallyworld: yep
<voidspace> wallyworld: just getting this to compile will be an achievement
<wallyworld> voidspace: make it Cxn to piss off thumper
 * thumper slaps wallyworld with a wet fish
<voidspace> thumper: Col works for Column and Collection - and they're both
<voidspace> Columnection
<thumper> voidspace: but it isn't both is it?
<voidspace> it is
<voidspace> it's a collection representing a column
<thumper> it is either the name of the collection, or the collection itself
<wallyworld> voidspace: non relational dbs don't have columns :-)
 * thumper has no time to bikeshed names
<thumper> voidspace: call them "a", "b", "c" and so on
<voidspace> hah, it's a collection not a column
 * thumper leaves
<voidspace> correct
<voidspace> this is why you need to study maths to be a programmer
<thumper> voidspace: and religion
<voidspace> if you haven't done algebra how else would you know that variables should have single letter names
<voidspace> no-one should study religion
<voidspace> either live it or ignore it
 * wallyworld afk for a bit
<voidspace> wallyworld: made a dent in it
<voidspace> wallyworld: into machines.go switching to the new style collection access
<voidspace> wallyworld: https://github.com/voidspace/juju/compare/copy-sessions
<voidspace> wallyworld: hoping for compilation tomorrow
<voidspace> g'night all
<axw> wallyworld: https://github.com/juju/juju/pull/331
<wallyworld> axw: you free now?
<axw> wallyworld: just a sec, making tea
<wallyworld> ok, no hurry
<thumper> well, that's blown it
<davecheney> ?
<thumper> at the museum thinking I could work in the cafe while the kids played
<thumper> but I need to be in the discovery world with them
<thumper> currently propped in the corner with laptop on my lap
<thumper> not entirely comfortable
<davecheney> thumper: i can't stand it any longer
<davecheney> i'm going for afternoon coffee
<davecheney> do you want to do HR stuff today ?
<menn0> review pls: https://github.com/juju/juju/pull/333/files
<axw> wallyworld: part 2: https://github.com/juju/juju/pull/336
<axw> I'll do azure in a followup
<wallyworld> axw: looking after i finish the current review
<axw> thanks
<axw> wallyworld: FYI, here's the followup too: https://github.com/juju/juju/pull/337
<axw> includes the commit from 336
<wallyworld> rightio
<wallyworld> axw: i'd love to get rid of the remaining old connection guff from juju/testing/conn.go i think it is
<wallyworld> the bit that does all the side effecty state open stuff
<wallyworld> i think the only thing we really want to keep from there is PutCharm
<axw> wallyworld: which bits can go? MongoInfo/APIInfo? I'm pretty sure things still use them...
<axw> JujuConnSuite is everywhere
<wallyworld> axw: the newState() stuff
<axw> ah, ok
<axw> as in collapse State/BackingState
<wallyworld> yeah
<axw> I can take a look at that after I fix up EnsureNotBootstrapped if you like
<wallyworld> and we also do need to fix JujuConnSuite and ConnSuite (there's 2 of them)
<wallyworld> sure, it was just a heads up - one of us will get there first
<axw> okey dokey
<wallyworld> will be great to clean all this shit up
<axw> indeed
<axw> I was nearly tempted to remove the use of JujuConnSuite in the provisioner code, but didn't want to get too sidetracked
<wallyworld> yeah, separate piece of work :-)
<axw> fwereade: hey, are you about?
<fwereade> axw, hey dude
<axw> fwereade: thought you may want to skim https://github.com/juju/juju/pull/336 before I land it
<axw> in case you have different ideas about how this should look
<fwereade> axw, cheers, I'll take a super-quick look
<axw> fwereade: description alone may be enough
<fwereade> axw, description LGTM
<fwereade> axw, I'd say that I'm not convinced that replacing the storage-based implementations is ideal
<fwereade> axw, I would prefer to see us move in a direction in which environs have an explicit SetStateServerInstances method
<fwereade> axw, (which can be backed by storage or tagging or whatever)
<axw> fwereade: which would add a tag to an instance?
<fwereade> axw, but post-bootstrap, I think it's really up to the HA code to determine who's *really* meant to be a state server, and to keep the env updated with that knowledge
<axw> I see
<fwereade> axw, yeah -- and remove that tag from non-state-servers, I think
<fwereade> axw, I mean, yeah, the storage-based implementation isn't great, I'm not *attached* to it
<axw> fwereade: given that we can't do that with azure, I don't think we can get away from tying state-serverness to provisioning any time soon
<rogpeppe> fwereade: does an Environ need to know about the state server instances at all?
<fwereade> axw, but I feel like where we really fall short is in maintaining the important info at an env level, and the less-than-ideal implementation backing our attempts isn't my main focus
 * axw nods
<fwereade> rogpeppe, yes, IMO, I am not willing to discard that functionality
<fwereade> rogpeppe, I've heard plenty of arguments as to why it's kind of a hassle, and why we often don't need it, but IMO any time we force people to manage a juju environment via their provider dashboard we have *failed*
<rogpeppe> fwereade: i'm not suggesting that a provider wouldn't be able to provide an implementation of the state-server-instanceid storage, but that it could be something independent of Environ
<rogpeppe> fwereade: then one could arbitrarily compose instance id storage and Environ
<rogpeppe> fwereade: just a thought. please ignore me :-)
<fwereade> rogpeppe, I definitely think it's part of environ's responsibility -- but equally I am fine with the idea that we could/should be able to delegate that functionality elsewhere
<axw> rogpeppe: I don't see how it could be independent, unless we require something like storage. it always has to be external to the environment, which means leaning on something provider-specific
<fwereade> axw, or just a little remember-my-state-servers web service that we provide independently
<fwereade> axw, it's not inextricably tied in *implementation* to any specific provider
<axw> right, if we had a global thingy that would work
<fwereade> axw, I just remain adamant that it's a responsibility of the environ interface, independent of where we actually store it
<rogpeppe> axw: my thought is that there are many possible ways to do it (e.g. with some external web service). an instance-id storage service could be something that an Environ could be asked for, but wouldn't necessarily have to implement.
<axw> fwereade: I agree with that
<axw> rogpeppe: ah right, yes.
<axw> we could definitely delegate to some provider-independent thing
<rogpeppe> axw: rather than forcing all providers to know about all possible kinds of external instance-id storage services
<rogpeppe> fwereade: the reason i don't see it as a fundamental part of Environ is that the provider really couldn't care less about where instance-ids are stored. That's why I see it (potentially) as a service that the Environ *can* provide but not one that the Environ *must* provide.
<fwereade> rogpeppe, I see it as part of the responsibility of the Environ instance to keep track of what happened when you called Bootstrap -- but indeed not necessarily as something that has to be provider-specific
<rogpeppe> fwereade: i would imagine that Bootstrap could take an instance-id storage service as an argument. That storage service might have been provided by the Environ itself, or externally.
<wallyworld__> fwereade: the reason the storage based impl was replaced is due to the explicit direction given by mark s
<wallyworld__> that we should evolve towards removing cloud storage
<fwereade> wallyworld__, hmm, I feel like there's a bit more nuance there
<fwereade> wallyworld__, removing juju-level dependency on it is definitely important
<wallyworld__> i agree we also want a SetStateServerInstances, but that is somewhat orthogonal to the current work and can come later
<fwereade> wallyworld__, this feels like it goes further, and is close to mandating that we shouldn't use provider storage even if it exists
<wallyworld__> fwereade: exactly, that was his direction
<wallyworld__> he quoted the recent swift config issues
<wallyworld__> as justification for why we need to eliminate it even if it exists
<fwereade> wallyworld__, heh, ok then, I don't fully agree but it's not a hill I'm interested in dying on
<wallyworld__> fwereade: i'm with you tbh
<wallyworld__> but
<wallyworld__> we have been told :-)
<fwereade> wallyworld__, yep :)
<rogpeppe> FWIW, the instance-id storage is the *only* place we *cannot* rely on state-server provided storage
<axw> yup
<fwereade> rogpeppe, understood, agreed
<rogpeppe> ISTR some general agreement that we could potentially move away from instance-id storage and store only instance addresses instead
<rogpeppe> or potentially just state server host-ports
<axw> rogpeppe: what's the point of storing the addresses if they may be out of date?
<rogpeppe> axw: the environment itself keeps them up to date
<rogpeppe> axw: if the environment is moribund, there's not that much point in knowing its instance ids
<Egoist_> Hi
<wallyworld> not sure. if we use instance tagging, we can just query the env's instance pool for the state server instances
<rogpeppe> axw: the "keeping them up to date" piece of the HA work is still to do, AFAIK. that was mainly because I wanted to make that change as part of it (otherwise there's an impedance mismatch between StateServerHostPorts and the instance id storage)
<Egoist_> Why when I add new unit to service the other units can't get relation data, even if relation-set was executed?
<fwereade> Egoist_, expand please? relation-set is not flushed to the state server until the hook in which it was called completes successfully, but I'm not sure that's what you're seeing?
<rogpeppe> would anyone like to take a look over this charm store PR? it's the core of the new charm store, and i think it's quite neat actually :-] https://github.com/juju/charmstore/pull/14
<fwereade> axw, rogpeppe: fwiw I think that the only impedance mismatch is in trying to use only a single piece of data for two purposes -- agents/clients need to know addresses to connect to, environs need to know instances to manipulate, it's about tuning the info to the context it'll be used in. IMO this implies we want 2 separate methods (and 2 separate watchers that'd probably still just share an impl), not that we should make environs concern
<fwereade> itself with addresses when it's fundamentally concerned with instances
<Egoist_> fwereade, No, I mean that in peer relation, when I add new unit this new unit can get relation data about other unit, but other units who was in relation before adding new unit can't get relation data
<rogpeppe> fwereade: do we ever use the state server instance ids for anything other than turning into a state server address?
<Egoist_> btw. what about charm, who is responsible about charms in charmstore?
<rogpeppe> Egoist_: the top level ("promulgated") charms are reviewed and gated for quality and appropriacy by folks at Canonical.
<Egoist_> rogpeppe: the top level ("promulgated") charms -> You mean charms that are more important for juju like juju-gui?
<rogpeppe> Egoist_: i mean any charm that doesn't have "~" in the name
<rogpeppe> Egoist_: (which is probably all the charms you've actually seen :-])
<fwereade> rogpeppe, not sure we do, that doesn't mean that addresses are the correct expression of the info at the environ level
<axw> fwereade, rogpeppe: we currently use StateServerInstances only to generate API addresses (and soon to determine whether the environment is bootstrapped)
<fwereade> Egoist_, at the point the old units try to get the data, what hooks have successfully run on the new unit?
<axw> I don't think provider implementations should be thinking about API addresses, though. If we somehow pulled instance ID storage out of Environ then we could make it not-the-environ's-concern, but as it is I don't think Environ should know anything about API addresses
<fwereade> axw, +1
<Egoist_> -relation-joined and -relation-changed
<Egoist_> fwereade:
<rogpeppe> fwereade: i guess i'm trying to say that this concern is actually entirely independent of the environ. the only exception being, i guess, the fact that Environ.Bootstrap knows about it.
<rogpeppe> axw: +1 too
<fwereade> rogpeppe, and that we want to know it to tell if the environ's bootstrapped, too, right?
<fwereade> rogpeppe, "these are the instances" is a lot more direct and appropriate than "here are some addresses"
<fwereade> rogpeppe, (and apart from anything else addresses are deeply problematic anyway, there's this infuriating pervasive assumption that the set of state server addresses is a single property of the environment)
<fwereade> rogpeppe, (it's really not: different networks, different scopes -> (potentially) different set of state server addresses per agent)
<rogpeppe> fwereade: interesting.
<fwereade> rogpeppe, that's one of the this-really-isn't-bulk things that has had me all grumped up for a while
<fwereade> rogpeppe, it's not "what are the state server addresses for an environment"
<axw> :q
<axw> oops
<fwereade> rogpeppe, it's "what addresses should X entity use to connect"
<fwereade> Egoist_, what's the error you see on the old units when you try to read the relation data you set in joined/changed?
<wallyworld__> axw: lucky it wasn't your password :-)
<axw> :)
<fwereade> brb
<rogpeppe> fwereade: yeah, it's a maze of tunnels and there are no universal addresses for anything.
<Egoist_> fwereade: no error i guess, it's from charm code and it says that the relation data does not exist -> in Python the relation data is None
<Egoist_> fwereade: btw, is there any tool to debug relation problem or something like that?
<fwereade> Egoist_, you can use `juju run` to execute arbitrary code in a hook context; and you can use `juju debug-hooks` to start a session that gives you a session that lets you run each hook
<fwereade> Egoist_, but, hmm, None is very weird
<fwereade> Egoist_, I'd expect at least private-address to be set
<Egoist_> fwereade: yeah, and can't get any data set by relation set
<fwereade> Egoist_, so, just to be clear: there's a new unit that the old ones know about (they've run its joined hook) but for which they can't see any data?
<fwereade> Egoist_, (they've run *or are running*)
<Egoist_> rogpeppe: yeah, I don't see charm with '~' in name. But with who I need to speak about bugs in charms/
<Egoist_> ?
<Egoist_> fwereade: there's a new unit and this unit knows about the other ones, but the old ones can't get any relation data even about themselves, so i don't know if they know about this new unit
<Egoist_> fwereade: thev've run
<Egoist_> fwereade, no no sorry, they are running
<fwereade> Egoist_, hmm, that sounds like we can isolate it to just the old units anyway then?
<fwereade> Egoist_, what exactly are you running on those units, and in what hook, and where did you get the unit names from?
<fwereade> Egoist_, btw https://bugs.launchpad.net/charms ?
<TheMue> morning
<Egoist_> fwereade: what exactly are you running on those units -> trying to deploy a mongodb charm maintained by myself | and in what hook -> in -relation-changed | and where did you get the unit names from -> from relation-list
<fwereade> Egoist_, ok, I'm scratching my head here a little bit -- is it possible you could come up with an absolutely minimal charm that repros what you're seeing and report a bug against juju-core that references the charm?
<fwereade> bbs
<Egoist_> fwereade: yeah, i think i could do that
<fwereade> Egoist_, awesome, tyvm
<mattyw> rogpeppe, ping?
<rogpeppe> mattyw: hiya
<TheMue> one trivial one for the OCR: https://github.com/juju/juju/pull/338
<TheMue> vds: ping, you're OCR today ;)
<voidspace> how do I push without running the pre-push hook?
<voidspace> I need to push a branch that doesn't compile
<TheMue> voidspace: eh, why that?
<voidspace> TheMue: because I've made a big change and it will be a while before it compiles and I don't want it only on my computer
<voidspace> I want to push my wip
<voidspace> I think there's a command line flag, can't find it though
<TheMue> voidspace: ah, to your personal repo?
<voidspace> TheMue: no, to master of course ;-)
<voidspace> TheMue: yeah, to my repo - but the pre push check runs for that too
<TheMue> voidspace: eh, the question mark has been by accident ;)
<voidspace> hehe
<TheMue> voidspace: hmm, isn't the hook already running before committing? not before pushing?
<voidspace> TheMue: no, it's pre-push
<voidspace> TheMue: at least if you follow the instructions in CONTRIBUTING
<voidspace> the command line flag is --no-verify
<TheMue> voidspace: ouch, I only have a pre-commit *blush*
<voidspace> pre-commit is even stricter!
 * TheMue has to take a deeper look into his git hooks
<TheMue> so, added it too
<TheMue> Iâm running most of it manually from inside vim, so no problems so far
<voidspace> cool
<TheMue> voidspace: could you please take a look at https://github.com/juju/juju/pull/338? it's a very trivial one.
<voidspace> looking
<TheMue> voidspace: thx
<voidspace> TheMue: the hash change is markdown syntax?
<voidspace> I'm not familiar with markdown
<voidspace> TheMue: have you run it through a processor to check the output?
<TheMue> voidspace: yep, the === and --- only work for h1 and h2; #, ##, ###, #### etc can go further
<voidspace> ok
<TheMue> voidspace: yep, and took a look on it in GH itself which renders it fine
<voidspace> I think the ==== and --- look better in plain text form though
<TheMue> voidspace: maybe, but it's only habituation. we want to render all docs later from md in GH direct to HTML on j.u.c
<voidspace> well, I think they look more like headings - so not just habituation
<TheMue> ok
<voidspace> If we need the alternative format then fine
<TheMue> in other md docs nick and i use # because very often we need deeper nestings than h2
<voidspace> at least we're not using latex
<TheMue> voidspace: hey, I wrote my golang book in latex, so please don't complain :D
<voidspace> hah
<TheMue> voidspace: btw, my IPv6 info collection so far is at https://docs.google.com/a/canonical.com/document/d/1wfdGL_vyemT2-ncAB7KIySkKI9HbT8efKeC3Sd8ID0I/edit#
<voidspace> cool
<TheMue> voidspace: very raw, no real concept, only a collection of links and snippets
<voidspace> TheMue: I won't get a chance to look at it today - I'm off next week and need to get this branch as close to done as possible
<voidspace> even getting it to compile will be a challenge :-/
<voidspace> I'll bookmark it for later
<TheMue> voidspace: ok, nopro
<voidspace> I'll be back with you guys on networking on my return
<TheMue> voidspace: a quick vacation? dimiter is diving on malta these days
<voidspace> nice
<voidspace> I'd love to go diving
<TheMue> never done that
<voidspace> me neither
<TheMue> I would like to do the opposite, flying and parachuting
<voidspace> I did a parachute jump last year :-)
<TheMue> once done a tandem jump out of 3000m, nice
<TheMue> cool
<voidspace> TheMue: you've done a tandem jump?
<voidspace> that's what I did
<voidspace> great fun
<TheMue> yeah, during free fall I've almost forgot to breathe. and then the silence, when the parachute opened and you're haning there in about 1000m
<voidspace> yep, exhilarating and amazing
<TheMue> hanging
<TheMue> voidspace: the right words
<voidspace> TheMue: LGTM on your PR by the way
<TheMue> voidspace: thx
<vladk> dimitern
<TheMue> vladk: dimitern is on vacation already
<vds> TheMue, pong, just got back to a decent internet connection
<TheMue> vladk: youâre here? have been in the hangout but nobody there
<vladk> TheMue: yes, I was late
<TheMue> vds: hehe, it's ok, voidspace already reviewed it
<TheMue> vladk: iâve got nothing special to tell, working on ipv6 for lxc and documentation
<vladk> I'm, too. I am working on moving cloudinit stuff to networker.
<vladk> TheMue: ^
<TheMue> vladk: ah, ok. does the networker also modify /etc/network/interfaces?
<TheMue> vladk: and this inside containers?
<TheMue> vladk: it could be interesting for routing IPv6 sub-subnets in a container (not yet absolutely sure about it, but found an interesting blog entry)
<wwitzel3> I've put off restarting for a couple weeks, guess I should probably listen to the nag window, brb
<vladk> TheMue: networker requires the presence of /etc/network/interfaces, but it modifies it only for MAAS, and it creates config files for separate interfaces in interfaces.d/ directory. And this should work for all providers except manual
<vladk> TheMue: what is the blog about IPv6?
<katco> good morning all
<mgz> mornin'
<voidspace> katco: o/
<voidspace> mgz: thanks for leaving dominion with reception by the way - I got it fine
<voidspace> mgz: sorry I missed you for the Boston expedition
<voidspace> I hope it was fun
<voidspace> I had a very quiet Saturday morning
<mgz> voidspace: no probs, hope you have a good night on friday
<voidspace> mgz: yeah, pretty epic
<voidspace> got in at 5am
<voidspace> fwereade: ping
<fwereade> voidspace, pong
<voidspace> fwereade: I'm part way through what's turning into quite an invasive change
<voidspace> fwereade: and I think you at least ought to be aware of it - and ideally give a thumbs up to it
<voidspace> fwereade: the basic gist is that we need to be copying sessions in order to use mgo socket pooling
<voidspace> fwereade: which means we need to *not* be storing collections (which have a reference to the global session) on state.State
<voidspace> fwereade: so I've pulled them out, providing methods to get the collection (using a new session) whenever it's needed
<fwereade> voidspace, hmm, ok -- is it just a matter of making them funcs that return suitable collections with new sessions? perfect
<fwereade> voidspace, +1
<voidspace> fwereade: it's *largely* a mechanical change - with a few places that need careful thought
<voidspace> fwereade: work so far https://github.com/voidspace/juju/compare/copy-sessions
<voidspace> fwereade: yeah, it's *mostly* that
<voidspace> fwereade: the compiler won't tell me how many places I have to change - just that there are too many...
<fwereade> voidspace, sounds like a nice commit on its own
<TheMue> vladk: http://blog.toxa.de/archives/606
<voidspace> fwereade: it needs some support work - for example, when we change the password we need to re-open the state because the existing session has the wrong credentials
<voidspace> fwereade: the goal is to eliminate i/o timeout errors which we are pretty certain come from using the one global session
<voidspace> fwereade: my lunch date just arrived
<voidspace> bad timing
<voidspace> biab
<voidspace> well, about an hour or so - sorry
<wallyworld> mgz: katco: standup?
<katco> wallyworld: sure, let me grab another cup of coffee. long night -.-
<wallyworld> sure :-)
<fwereade> bbmuchl
<fwereade> in new zealand :/
<perrito666> late morning everyone
<wwitzel3> hey perrito666
<perrito666> natefinch:
<wallyworld> katco: forgot to ask - did you get access to canonicaladmin?
<natefinch> perrito666: can we move our meeting later?  I have to call my insurance company about why they didn't cover my wife's $500 ultrasound
<perrito666> sure we can
<katco> wallyworld: i emailed sarah and she just said to use my login from my new-hire email(s)
<katco> wallyworld: and i haven't gone back and checked those yet
<wallyworld> ok, np. just wanted to make sure you were ok
<katco> wallyworld: ty i will let you know if i have any issues
<wallyworld> ok
<katco> wallyworld: cheers! :)
<katco> mgz: lmk if you have any questions regarding that PR
<natefinch> wallyworld: do you know why we have the upstart script redirect jujud's output to a file, rather than just having loggo write to a file?
<mgz> katco: sure thing
<wallyworld> natefinch: my guess is because by default, loggo writes to stdout and so it's simpler just to have upstart capture that stdout data and redirect from there
<wallyworld> but we could look at changing that
<wallyworld> i don't imagine it would be too hard, but it would be extra code
<wallyworld> cause we'd need to register a different writer with loggo in each process
<natefinch> yeah, but it's code and not text in a string that gets written to a configuration file :)
<wallyworld> sure, understood. i'm just guessing the reason :-)
<natefinch> :)
<natefinch> wallyworld: do we expect multiple jujud processes running on the same machine?
<wallyworld> yes
<wallyworld> machine agent, unit agent
<natefinch> ahh
<natefinch> still, one log file each wouldn't be a terrible idea
<wallyworld> i think they get one log file each now anyway
<wallyworld> machine-x, unit-y etc
<natefinch> ahh ok, yeah
<wallyworld> natefinch: i updated all the deps and other stuff for the licensing based on the new gojson repos. but then i noticed the readmes in those repos still referenced binary132 and i didn't have the will power to do it all again
<natefinch> wallyworld: I can fix the readme files... sorry, guess I forgot about those.
<wallyworld> natefinch: np. you will need to update dependencies.tsv in both 1.20 and master
<wallyworld> thanks for sorting it out though
<natefinch> welcome
<katco> wwitzel3: btw ran across this and thought you'd be interested: http://jamescooke.info/git-to-squash-or-not-to-squash.html
<katco> wwitzel3: it's an interesting question; not sure where i fall yet. i kind of like squash conceptually
<wwitzel3> katco: read that one a while back. if there was a way to persist the reflog and share it for each branch / the entire repo .. well that would be perfect :)
<wwitzel3> katco: but it is only a local copy (afaik), so for me, doesn't quite solve the problem of keeping tools like git bisect useful when large feature branches introduce subtle bugs.
<katco> wwitzel3: i'm not yet familiar enough with git to comment on that. i haven't used bisect at all.
<katco> wwitzel3: sounds like you're a few steps ahead of me in your thinking
<perrito666> wwitzel3: your dog is so nice
 * perrito666 loves dogs
<wwitzel3> perrito666: which dog? one of them isn't mine
<perrito666> wwitzel3: a dog just paraded behind you while we were having the stand up
<perrito666> like a big shepherd
<perrito666> but pixelated :p
<wwitzel3> perrito666: yeah, that is a German shepherd, her name is Daisy, watching her for a friend.
<sinzui> natefinch, or maybe katco, can you help triage bug 1343318? It relates to logging, which has changed and will change again soon
<_mup_> Bug #1343318: Node syslog full of rsyslog warnings <logging> <juju-core:New> <https://launchpad.net/bugs/1343318>
<sinzui> natefinch, katco : Do we intend to fix this for 1.21.0, or later
<katco> sinzui: i have never triaged before; and i don't think i can comment on when things will be fixed yet, but i'm looking at the log nonetheless
<katco> sinzui: is there a way to tell what version he was running? i recently fixed a bug where rsyslog could not read from a machine-x.log
<sinzui> katco, understood. These are my rules, though they don't list that we can choose Medium when we think we need to discuss it in relation to other issues in a few months: https://docs.google.com/a/canonical.com/drawings/d/1vPqrgXDduzwV_N01ORF6jdnprwykf1ofiIaEAmIwOuA/edit
<katco> sinzui: (version of juju)
<katco> sinzui: thanks for that :)
<sinzui> katco, The user is Canonical IS, so I believe 1.18.1, but they are also migrating some things to 1.20.1. They will not run devel juju
<ericsnow> sinzui: yeah, thanks :)
<sinzui> katco, The easy question to ask is: if this work is thematically related to what we are doing, are we willing to commit to fixing this in the 1.21.x series? If so, I would mark the bug High and assign it to the next-stable milestone. If not, I would mark the bug as Medium because I think we will want this bug closed in the next 6 months
<katco> sinzui: gotcha. i really have to defer to a lead. i'm just not on-boarded enough to make a call. however, it's possible that this is a symptom of a regression that has already been fixed.
<katco> sinzui: https://bugs.launchpad.net/juju-core/+bug/1339715
<_mup_> Bug #1339715: Juju 1.21-alpha1 local provider does not create all-machines.log <local-provider> <logging> <rsyslog> <juju-core:Fix Committed by cox-katherine-e> <https://launchpad.net/bugs/1339715>
<sinzui> katco, exactly why I asked you :)
<katco> sinzui: aha! ;)
<katco> sinzui: so just to make sure the loop is closed; do you mind if i defer to a lead?
<sinzui> katco, not at all
<sinzui> hazmat, do you want to include this bug in your "usability" mandate . bug 1343569 points out that you cannot dump the config of running service to backup or reuse it to deploy another service the same way
<_mup_> Bug #1343569: Juju should have an easy way to dump then reuse service config <charms> <config> <improvement> <usability> <juju-core:New> <https://launchpad.net/bugs/1343569>
<mramm> anybody around to help SABDFL debug stuff?
<mramm> he is having issues with lxc containers
<mgz> not sure what I could usefully suggest
<mramm> well, we have lots of load on a machine and would like somebody to look at it to see if they can find anything
<mramm> and container starts are failing, and we should get logs
<mgz> okay
<mramm> do you have access to the garage MAAS?
<mgz> ugh, probably in theory, but I haven't checked in ages
<mgz> will just search emails quickly
<mgz> mramm: nope, I'm not in the right group
<mgz> I can file an rt to fix that
<mgz> but that's not very immediate
<mgz> mramm: are there no americans around?
<mgz> nate at least has done maasy things quite recently
<katco> mgz: sorry to be a pain, but i want to catch you before your EOD. do you think you will be able to review that PR today?
<mgz> katco: sure, sorry, just in other things
<katco> mgz: oh no worries... as long as it's in before EOD i'm happy
<katco> mgz: i just want to land it by my EOD
<mgz> the diff on github is now annoying to read...
<katco> mgz: =/ could/should i have done it differently?
<mgz> nah, just fallout from having had several rounds of review
<katco> ah ok... yeah it was difficult to understand the commenting system, this having been my first multi-push review
<katco> i understand review board will be better at this
<mgz> katco: done
<katco> mgz: ty sir
<katco> mgz: regarding the break on 80? what is the style in the project? string literal (`foo`) with line breaks? \n embedded in the string?
<mgz> katco: yeah, I'd just try to throw in a \n after the first sentence or something, and trim the second one a bit
<katco> mgz: (friday humor) i just did "M-u 80
<mgz> katco: let me guess, that transforms your editor into a giant robot?
<katco> mgz: LOL
<katco> M-u is just a way to tell emacs to do something X number of times
<katco> so i say M-u 80 and then hit the right button
<bodie_> hmmm, go emacs?  how do you like it?
<katco> bodie_: how does one like air? water?
<katco> bodie_: hehe no i love emacs. i am typing this from emacs (erc)
<bodie_> I guess it depends how problematic it is ;)
<bodie_> I like the air here much better than the air in LA
<katco> haha
<bodie_> I enjoy emacs for clojure but I'm not sure it would feel right for Go, but some people do swear by it
<katco> bodie_: i poke fun at my love of emacs often. but i use it for everything. i find writing go in it is wonderful
<bodie_> again... best OS with a reasonably tolerable text editor :P
<katco> bodie_: lol yup
<katco> it was funny, at the sprint axw was reviewing some of my code
<katco> and he's a vim user
<katco> so i hopped into evil-mode
<bodie_> nice
<katco> and now i can say axw used emacs ;)
<bodie_> emaxw
<katco> rofl
<katco> but yeah, google's go-mode ties into a lot of great tools
<bodie_> oo, there's a google-sponsored package? I might have to check that out
<katco> yeah it's in elpa and everything
<katco> i think it's distributed with go, but the elpa is more up to date
<katco> http://dominik.honnef.co/posts/2013/03/writing_go_in_emacs/
<katco> older, but will give you some idea
<katco> and on top of that, magit is wonderful for git
<bodie_> true, magit is quite nifty from what I've used of it
<katco> i don't have to remember all the flags etc
<katco> and as i'm building muscle memory, it is _so_ fast
<bodie_> I've been using a nice vim plugin set that has code completion, goto-definition, hideable fn and symbol table, and hideable dir tree browser
<bodie_> it's actually almost too heavy
<bodie_> but, really nice
<katco> bodie_: yeah i have all that up right now too... ecb
<bodie_> :D
<katco> it is heavy if i know where i'm at in the code
<katco> but since i'm new, that's almost never, and it's great for holding my hand
<katco> also, i run tests from emacs and then C-` to jump to errors
<katco> i will go on and on if you let me ;)
<bodie_> oooo
<katco> flymake highlights errors while i'm typing
<bodie_> afai'm concerned, this is the productive opposite of an editor war :P
<katco> rofl
<katco> i can add imports with C-c Ca, prune them with C-c C-r
<katco> *C-c C-a
<katco> but by FAR the best is the hook-in with oracle
<katco> points out callers, callees, etc... all clickable
<jcastro> Error details:
<jcastro> invalid Joyent provider config: control-dir: expected string, got nothing
<jcastro> did something get added to the joyent provider?
<bodie_> katco, now you're talking
<bodie_> I definitely have to give this a spin
<katco> phew... i was at risk of losing my emacs license. too few converts.
<bodie_> unfortunately, my piano skills are a little rusty :P
<katco> haha
<katco> another thing to check out if you're into productivity tools: org-mode
<bodie_> I have heard the ranting and raving about org-mode
<katco> my life is in org-mode
<bodie_> but haven't taken the time to poke around
<bodie_> lol
<bodie_> I'm trying to write something a bit like a team enabled org-mode-as-a-service on the side right now
<bodie_> would be neat to write a lisp client to interface between that and "real" org-mode
<katco> http://blog.kate.cox2.name/2014/01/reflecting-on-2013.html check out the "To-Do System..." header
<katco> bodie_: yeah i wish there was an interface into org-mode for leankit
<katco> there's one for trello
<bodie_> no way, too cool
<katco> and i've written some very basic hooks into launchpad just to grab some info
<katco> so like i can put the point on a bug id, and then i do M-x kt/browse-launchpad-bug
<katco> seems trivial but i use it _so_ much
<katco> need to get some food bbiab
<bodie_> o/
<voidspace> signing off for a few hours. Back on in around 4 hours.
<voidspace> g'night everyone
<katco> tc voidspace
<voidspace> I'm not in next week. Don't have too much fun without me...
<voidspace> o/
<TheMue> o/
<TheMue> bodie_: youâre using vim? Iâm an old vimthusiast
<TheMue> bodie_: for go Iâm using the standard of the go team plus an own little and helpful plugin
<TheMue> bodie_: also tagbar is nice for navigation inside a file
<katco> hmm.. i'm trying to write a unit test to check if kvm-ok is on the path, but i would rather not have the test fail if the dev running the test doesn't have kvm-ok... any ideas?
<katco> perhaps creating a dummy kvm-ok on a dummy path?
<perrito666> katco: you can set the path to whatever c.Mkdir() returns and adda kvm-ok there
<natefinch> yeah, that's what I'd do.
<katco> perrito666: perfect, ty guys
<katco> i kinda wish there was a better way to mock this
<natefinch> I presume the code is the thing checking for kvm-ok, and you want to make sure it does the right thing?
<TheMue> hehe, mocking, always a nice topic
<katco> yeah, we're adding the ability to check a given path if it can't find it in the user's path
<natefinch> right
<natefinch> I think the way mentioned is a good way to do it.  It physically sets the environment to either correct or incorrect.   An alternate way would be to mock out os.Stat, but that's a little more implementation dependent
<katco> cool ty natefinch
<perrito666> also you can set PATH to your temp dir, fail, add the file, succeed; that way the test rather explains the flow
<katco> perrito666: yeah i'm planning on doing just that
 * perrito666 remembers one of his testing mentors "tests should tell a story"
<katco> perrito666: that's really interesting; i would listen if you wanted to expound
<natefinch> I like tests that only test one thing.  So, one to test the good case, one to test the bad case.   But sometimes setup is a big pain, so you end up testing multiple things along the way.
<katco> natefinch: i try to find the "middle way" for that reason
<katco> natefinch: an architect i respected once said, "if you ever find yourself thinking in absolutes, that should be a red flag"
<natefinch> katco: yep... never ever use absolutes.
<ericsnow> natefinch: you can create a new suite for that and write a SetUpTest() that is common to those tests
<perrito666> katco: well, you ideally should know how many exits your code has and therefore should tell the story of how data -> box -> info flows from shorter to longer path
<katco> natefinch: that joke is _always_ funny
<natefinch> :)
<natefinch> yep
<natefinch> ericsnow: I kinda hate suite-based tests, actually.  It seems to encourage really elaborate test setups, rather than refactoring your code to be more testable
<perrito666> katco: that should be read with a fake old chinese wise man voice
<natefinch> "hey look, I've spent two weeks writing code so that I can start up mongo and a juju server with just three lines of code!"
<katco> perrito666: ah, so that touches on something i am familiar with: the tests being a good way for devs to examine your api
<katco> perrito666: rofl
<katco> perrito666: the argentinian accent is just as effective :)
<bodie_> TheMue, I agree about tagbar :)
<katco> bodie_: heresy! come to emacs!
 * natefinch just installed acme so he could take part in the editor wars.....
<katco> haha
<katco> natefinch: i will be interested in how you find it
<katco> natefinch: personally i don't like how it's mouse driven
<natefinch> katco: I'm coming from Visual Studio and Sublime text, so I'm pretty mouse-centric anyway.  I have come to love the command line...  but I like the ability to just look at a word and perform an action on it, like go to definition or whatever
<natefinch> but I literally just started acme for the first time, so we'll see
<perrito666> natefinch: well, being around you when you use your mouse sounds a lot like being near a CS player
<bodie_> natefinch, lol
<katco> natefinch: yeah i understand; the two are not mutually exclusive fyi; i do M-. to go to defs on word, among other keystrokes
<katco> perrito666: haha
<bodie_> I may be a CLI fighter but I'm not Acme crazy
<TheMue> katco: emacs = eight megabytes and continuously swapping
<TheMue> katco: oh, no, today itâs 8 GB, for Atom
<katco> TheMue: haha
 * katco wonders if that's a compliment these days
<katco> ah darn
<bodie_> natefinch, I love sublime too, but I discovered that not having to leave the keyboard reduces friction just that little extra bit... meaning I can be just a little bit lazier ;)
<bodie_> lol TheMue
<katco> TheMue: actually that came up in the sprint... i went to look and see how much memory emacs was consuming expecting 500MB or so... 29 MB
<TheMue> I left Sublime, somehow I came back to vim. love the speed and that it's available in the GUI and in the terminal
<natefinch> I've heard a lot of good things about vim-go, but I don't know that I can deal with vim's keyboard navigation stuff
<katco> TheMue: it's taking up ~230MB right now, but i have considerably more running
<katco> natefinch: emacs can help you there! ;)
<TheMue> katco: today the old ones are all small, even with all their plugins
<bodie_> natefinch, yeah, it's a little bit of a learning curve for sure, especially for the first week or so
 * katco taps fingertips together
<bodie_> once you're used to it it's very friendly (same with emacs imo) but getting there is weird and annoying
<katco> obligatory http://ergoemacs.org/emacs/i/emacs_learning_curves.png
<TheMue> in early times smalltalk had no chance because of the resources it needed. when running pharo today you laugh about the resources it needs
 * TheMue will stay with vim
<bodie_> precisely
<bodie_> (katco)
<perrito666> if you really want to be part of a good editor war, you need to code your own editor, clearly
<bodie_> well...
<TheMue> perrito666: good idea
<katco> perrito666: seriously speaking, i consider the point of emacs to customize it specifically to you until it is an extension of you
<bodie_> emacs is basically just a lisp interpreter, right?
<TheMue> perrito666: on your own os with your own language
<katco> bodie_: i actually have scripts i run from the command line utilizing emacs as a host
<bodie_> bodim on bodhi linux running bodiescript?
<katco> bodie_: check this out: http://elnode.org/
<katco> sometimes i use that for stubbing web apis
<bodie_> oh, M-s-nap
<katco> lol
<bodie_> ... that was a stretch
<TheMue> oh, and today everything has to run in the browser. so even the local editor has to be a server and files are edited by https://localhost:12345/home/themue/code/..
<katco> bodie_: i enjoyed it
<bodie_> TheMue, you must be thinking of LightTable IDE... :P
<katco> loll
<TheMue> bodie_: noooooooo /o\
<bodie_> I really wanted that to be cool :S
 * TheMue âs first real hard editor has been ISPF on TSO (and his cousin SPF/2 on OS/2), very strange, but powerful once you get it 
<bodie_> I was fantasizing the other day about how a keyboardless shell / code editor might be possible
<katco> i started out on vim, but have since forgotten all of it
<bodie_> to me vim is like a very productive text editor whereas emacs is more like a semantic tree editor
<bodie_> and go is not very ... tree-y
<bodie_> but it is somewhat verbose (relative to more tree-y things like lisps)
<perrito666> bodie_: I have never been into a level of productivity where that degree of speed is useful, I usually spend more time thinking than writing
<TheMue> bodie_: never thought about it, but yes
<katco> i'm not sure i understand that. what is the relation b/t editor and how close to sexps the language is?
<TheMue> Atom will be the next emacs, only js instead of lisp. and node.js as backend *shivver*
<bodie_> it seems to me that in a sexpy language your syntax is closer to a semantic tree while in a more imperative style language your syntax is more about editing the state machine.... or something
<katco> bodie_: ah ok. i am still not able to make the connection on how the editor impacts that, perhaps b/c i haven't used vim in so long
<bodie_> well to me it feels like vim is more about manipulating letters / lines / words while emacs is more about manipulating sexps
<bodie_> maybe I'm just used to paredit
<katco> hm. again, perhaps it's b/c i haven't used vim in so long, but emacs seems to work just fine editing things at the document level (letters, lines, words, etc.)
<katco> it does assign meaning to syntax based on what mode you're in, so you can jump around the document based on functions, etc.
<katco> but i don't think that precludes operating at a purely document level
<bodie_> I'm probably just more comfy with vim's toolbelt :)
<katco> i do get the feeling that vim is better for that kind of thing
<katco> vim golf
<bodie_> heh
<katco> but we have evil-mode! ;)
 * bodie_ dramatically keels over clutching his head
<katco> haha
<katco> quick! M-x revive-developer
<natefinch> honestly, the worst thing about emacs users is how they always have to write out all their hotkeys (and call them chords to make them sound more exotic)
<natefinch> :D
<TheMue> *ROFL*
<perrito666> natefinch: nah, chords are easy, you can play a piano with only 10 fingers, its not like emacs
 * perrito666 hides
<katco> hold on i accidentally did the wrong chord and it's reformatting my hard drive
<katco> dammit emacs!
<bodie_> if you just map c, s, and m to C maj, F maj, G 7....
<TheMue> katco: as long as you donât reformat EC2
<bodie_> lol perrito666
<katco> TheMue: nah, that's bound to C-S...
<bodie_> or was it M-C-S?
<TheMue> last time I used emacs I got a knot into my fingers and had been in the hospital for three weeks
<katco> M-x describe-key to the rescue!
<katco> lol
<bodie_> actually
<bodie_> I find emacs easier on my hands
<bodie_> because I *never* have to leave home row
<bodie_> position*
<natefinch> evidently acme runs its own virtualized operating system. Take that, emacs!
<katco> rofl
<katco> editor wars are too fun to take seriously
<bodie_> clearly, emacs is superior in that respect, since its primary operating system is the brain of the user
 * katco refrains from making a comment about the OS needing an upgrade
<bodie_> I'm afraid there is no known remedial action for removing such a virus
<katco> i knew when i found that tetris had come bundled with emacs that something was amiss
<katco> but it was too late for me
<natefinch> rogpeppe: you around?
<natefinch> good god ubuntu software center is slow
<jcastro> apt-get dude!
<natefinch> but if I install stuff from a deb, I have no clue what apt thinks it's called
<katco> ok this is stupid. what's a cross platform command i can place in an executable script? very few assumptions can be made about what resides on the system
<natefinch> need more context
<katco> i'm still working on this stupid test. if i create an empty file and make it executable, i get a "fork/exec (file...) exec format error"
<katco> if i fill it in with something, it works
<natefinch> katco: I think we have a util for that...
<katco> natefinch: i honestly can't tell if you're joking
<natefinch> katco: not joking :)
<katco> haha ok. please elucidate
<natefinch> katco: cross platform is actually pretty tricky... we usually use "exit 0" as the script contents
<katco> oh jees... why didn't i think of that
<katco> i blame 3 hours of sleep and my baby
<natefinch> #!/bin/bash --norc
<natefinch> exit 0
<katco> that won't work on windows will it?
<natefinch> not cross platform, obv
<natefinch> correct
<katco> trying to be mindful of your windows work :)
<natefinch> echo works :)
<katco> hmmm... it doesn't like that either
<natefinch> might not be possible to be cross platform... that's ok, we can always refactor it so we create an OS-specific script
<natefinch> jcastro: how do I list installed stuff with apt-get? I installed a deb I downloaded through the ubuntu software center, but now it's not showing as installed
<natefinch> jcastro: (and I don't know what the heck apt-get thinks it's called)
<jcastro> natefinch, in this case install synaptic, then look in the "local" section
<jcastro> and it will show you all the horrible things from around the web you've installed
<katco> natefinch: apt-cache search <regex>
<natefinch> that's ok, I was able to guess :)  But that'll be useful later
<natefinch> hahah... it's not that ubuntu software center couldn't find the software, it was just so slow and didn't show any progress, I thought it couldn't find it, but I went back to the window a few minutes later, and it was there
<rick_h__> sinzui: really awesome email :)
 * rick_h__ tosses three cheers to sinzui abentley abel and company
<sinzui> thank you rick_h__
<TheMue> sinzui: +1
<natefinch> sinzui: thank you for that.  There had been grumbling that Juju was unstable, but I didn't know where it was coming from.  Nice to hear someone who uses and tests juju thinks it's pretty stable
<natefinch> man our logging infrastructure is overly complicated
<perrito666> sinzui: +1 that kind of feedback is very refreshing
<katco> where is this very awesome email?
<ericsnow> katco: canonical-juju
<ericsnow> katco: "Some facts about Juju CI and Juju 1.20.x"
<katco> oh that's right... i still don't think i'm on that list
<katco> i guess alexis didn't get around to that yesterday
<ericsnow> katco: I'll forward it to you
<katco> ericsnow: oh thank you!
<katco> for now it's past my EOD and i desperately need sleep. have a good weekend all
<voidspace> wallyworld: ping
<voidspace> hmmm... unless it's now saturday in your world
<voidspace> I guess technically it is in mine now
<voidspace> wallyworld: anyway, I'll email you tomorrow...
#juju-dev 2014-07-19
<voidspace> morning all
<voidspace> it compiles
<voidspace> just need the tests to pass
<wwitzel3> voidspace: nice :) .. I think I'll have my tests passing today as well
<voidspace> wwitzel3: cool
<voidspace> wwitzel3: madly packing for a month now. :-)
#juju-dev 2014-07-20
<thehe> hey guys - is there any option to set machine0 (in local deployment) to host units? i want (HAVE TO) juju-gui on the host machine!
#juju-dev 2015-07-13
 * wallyworld sighs. ocr on leave means reviews kinda stall
<alexisb> wallyworld, we should start having folks assign back-up for ocr when they go on vacation
<wallyworld> yes we should - we used to have that as a policy
<wallyworld> i guess we still do in theory
<wallyworld> folks just have to follow it :-)
<menn0> wallyworld: just so you know, i'm currently dealing with a nasty upgrade issue that CTS has run in to
<wallyworld> oh no
<wallyworld> bug?
<menn0> wallyworld: seems to affect any upgrade from 1.24.x
<menn0> wallyworld: the agent won't restart... a worker is failing to stop
<menn0> wallyworld: only seems to happen with big complex envs (bootstack in this case)
<menn0> wallyworld: i'm using the bootstack staging env atm to repro and am making some progress
<menn0> wallyworld: RT 82240... i'll make sure there's a Juju ticket if there isn't already
<menn0> thumper was looking at this last week but I've taken it over in his absence
<wallyworld> oh, ty
<wallyworld> menn0: is it bug 1468653
<mup> Bug #1468653: jujud hanging after upgrading from 1.24.0 to 1.24.1(and 1.24.2) <canonical-bootstack> <juju-core:Triaged> <juju-core 1.24:In Progress by thumper> <https://launchpad.net/bugs/1468653>
<alexisb> menn0, that is a bootstack bug
<menn0> alexisb: I don't think so
<menn0> alexisb: jujud is definitely not doing the right thing... it's getting stuck when trying to shut down
<menn0> wallyworld: that is the ticket though (LP was timing out for me)
<wallyworld> ok
<menn0> wallyworld: this is definitely leadership related
<wallyworld> oh joy
<menn0> wallyworld: when I reproed there were 9 stuck API connections
<menn0> wallyworld: and there were 9 goroutines waiting in BlockUntilLeadershipReleased
<menn0> wallyworld: in 1.24.0 at least there's a naked channel read there
<wallyworld> stuck before upgrade during shutdown of agents?
<menn0> wallyworld: yep
<menn0> wallyworld: when did you fix all the naked channel ops?
<wallyworld> i didn't fix those
<wallyworld> i think maybe william did?
<wallyworld> or tim?
<menn0> ok, I thought you did some
<wallyworld> if i did i can't remember
<menn0> anyway, I should hopefully have this soon
<wallyworld> i guess we think that 1.24.2 is ok
<wallyworld> and 1.24.0 upgrades may need a manual process if it hangs
<menn0> not sure, I think the problem still happens when upgrading from 1.24.2
<menn0> I'll check that soon
<wallyworld> ok
<menn0> it takes about 30 mins to build up the env to test it
<menn0> and I don't want to tear it down just yet until I've finished looking at this env
<wallyworld> ok
<menn0> wallyworld: ok, i understand the problem now
<menn0> wallyworld: checking to see if someone has already fixed it in a later 1.24 or in master
<wallyworld> ok
<menn0> wallyworld: basically if any of the leadership API requests are active (and some are quite long running) while an upgrade is initiated the server will get stuck
<menn0> wallyworld: the more units you have the more likely you are to hit the problem
<wallyworld> menn0: yes, that sounds very plausible based on what has been observed before and what andrew/william fixed
<wallyworld> i don't think there's a fix we can do because 1.24.0 is already running
<menn0> wallyworld: that's true, but it would be good to ensure that the next 1.24.0 doesn't have the problem
<menn0> sorry 1.24.x
<menn0> wallyworld: from looking at the code, the problem is still there in master
<menn0> wallyworld: hangout?
<wallyworld> sure, sec
<menn0> wallyworld: onyx standup?
<menn0> wallyworld: it's not actually as simple as we thought... the thing returned by NewLeaseManager is actually the singleton which is supposedly getting killed
<menn0> wallyworld: there must be some other aspect
<wallyworld> otp, sec
<menn0> wallyworld: no worries
<menn0> wallyworld: i'll sort it out
<wallyworld> ty
<menn0> wallyworld: I have a fix for the problem
<menn0> wallyworld: it's way past my EOD and i'm not working tomorrow so I'm writing up notes for thumper so he can write some tests around it and land it
<menn0> wallyworld: we're not out of the woods yet though... post upgrade about 50% of the units on this env have hook failures
<menn0> wallyworld: will send an email
<wallyworld> menn0: thanks for sticking with it, i'll talk to tim tomorrow
<mup> Bug #1471231 changed: debugLogDBIntSuite teardown fails <ci> <unit-tests> <juju-core db-log:Fix Committed> <https://launchpad.net/bugs/1471231>
<perrito666> morning
<alexisb> morning perrito666
<perrito666> there is something about the annual medical check that makes me feel old
 * perrito666 sighs and makes appt
<anastasiamac> perrito666: wait until u get kids... :)
<ashipika> anastasiamac: +1
<sinzui> perrito666: katco I cannot find the on-call reviewer callendar. Any clues?
<katco> sinzui: it's just the juju team calendar
<sinzui> katco: Any more clues? Canonical's Google Calendar tells me none of the juju email addresses have a calendar?
<sinzui> wwitzel3: Can you review http://reviews.vapour.ws/r/2140/
<natefinch> man, coming back from vacation is always so hard
<katco> natefinch: o/ hope you had a good time
<natefinch> katco: amazing time.  Could have used another week (and a raise to be able to afford it ;)
<TheMue> katco: he had, seen it on Instagram ;)
<katco> natefinch: lol
<katco> TheMue: :)
<wwitzel3> sinzui: taking a look now
<TheMue> natefinch: looked like a lot of fun in a cool environment
<natefinch> TheMue: it was great.  We did it last year in a house half this size... the extra interior space and nicer beach made this year even better.
<natefinch> (and 50% more expensive... but worth it)
<katco> ericsnow: wwitzel3: natefinch: we have 2 meetings overlapping. just meet in moonstone
<wwitzel3> sinzui: just the dep updates? I was able to update to them and build juju and bootstrap, so combined with the tests you did, LGTM.
<TheMue> natefinch: your family grew, you need the space
<TheMue> ;)
<natefinch> TheMue: yeah, I gotta stop doing that
<TheMue> natefinch: cute family, no need to stop
<sinzui> wwitzel3: it is, I just wanted a dev to ask the hard questions about consequences. Thank you. This is the comparable branch for master. http://reviews.vapour.ws/r/2141/
<natefinch> TheMue: haha... the number of bedrooms in my house, seats in my car, and lack of hair on my head say otherwise ;)
<TheMue> natefinch: hmm, ok, there are constraints, yep :D
<perrito666> sinzui: tim and wayne
<sinzui> thank you perrito666 katco and fwereade sorted me out
<natefinch> katco: I didn't check my calendar until now and just realized we have the iteration meeting nowish.  I have to take my daughter to a swim lesson in about 15 minutes... can we push the iteration meeting back a couple hours?  Sorry for the late notice... I forgot we'd pushed the iteration meeting to today.
<katco> natefinch: not really, i am taking the middle of today off to catch up on some things. you need to start checking your calendar dude
<natefinch> katco: I know, I know.  Totally my fault. I'm sorry.
<cherylj> Is there someone who owns the CentOS support within Juju?
<cherylj> alexisb: ^^
<cherylj> (I figure you'd be the most likely to know :)
<alexisb> gsamfira and team did the work
<natefinch> katco, wwitzel3, ericsnow_afk: how goes?
<wwitzel3> natefinch: good, I'm just working on wpm bugs
<bogdanteleaga> cherylj: I might be able to answer questions
<katco> natefinch: pick up some of the bugs in the backlog if you don't mind
<katco> wwitzel3: please tag the bug you're working on and move to actively working
<natefinch> katco: will do
<wwitzel3> katco: thanks
<natefinch> katco: FYI: one bug was fixed by someone else, one was marked invalid, and one seems to be assigned to gsamfira, though that was 5 days ago, so I'm not sure if he's actually working on it.  The other bug in the backlog is being worked on by wwitzel3.  I could do my "clean up assigned bugs" task, unless you think there's something more important
<alexisb> natefinch, pending katco's arrival, there are plenty of bugs against 1.25 you can tackle :)
<alexisb> lots and lots
<natefinch> alexisb: heh ok
<mup> Bug #1424892 changed: rsyslog-gnutls is not installed when enable-os-refresh-update is false <cloud-init> <logging> <juju-core:Fix Released by natefinch> <juju-core 1.24:Fix Released by natefinch> <https://launchpad.net/bugs/1424892>
<natefinch> mgz: don't suppose you're around?
<natefinch> sinzui: is my CI blockers bookmark incorrect?  It shows no blockers, but trying to merge some code to main returns "does not match fixes-blah"  My bookmark, for reference: https://bugs.launchpad.net/juju-core/+bugs?field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.importance%3Alist=CRITICAL&field.tag=ci+regression+&field.tags_combinator=ALL
<sinzui> natefinch: The status changes about 6 months ago, and the tags 3 months ago: look at this
<sinzui> https://bugs.launchpad.net/juju-core/+bugs?field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=FIXCOMMITTED&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE&field.importance%3Alist=CRITICAL&field.tag=ci+blocker+&field.tags_combinator=ALL
<sinzui> natefinch: CI is testing the fixes now
<sinzui> looks like the osx change is good too
<natefinch> sinzui: mind if I add that link to the blocking bugs wiki page that Martin made today?  That way, hopefully it'll get updated if the requirements change
<sinzui> natefinch: go ahead
<natefinch> done
<natefinch> thanks sinzui
<mup> Bug #1473461 changed: OSX/darwin builds fail: undefined: password.EnsureJujudPassword <blocker> <ci> <osx> <regression> <juju-core:Fix Released by bteleaga> <https://launchpad.net/bugs/1473461>
<thumper> ah mah gard, so many emails
<thumper> fwereade: I'm here
<fwereade> thumper, heyhey
<fwereade> thumper, not so critical really, I think it's just JujuConnSuite being shite
<fwereade> thumper, and I've convinced myself that it's an INFO log anyway so it's moot
<fwereade> thumper, but if, in your Copious Free Time, you were to come up with a clean way of separating the logging (that wasn't just "replace JujuConnSuite"), that would be awesome
<thumper> fwereade: there is a way...
 * fwereade is all ears
<thumper> the base suite brings in a logging suite
<thumper> the logging suite captures the logs
<thumper> and replaces the default logger (stderr) with one that goes to gocheck
<thumper> so... wondering what the problem is
<fwereade> thumper, well, me too, I'm vaguely assuming that because JCS has everything running all at once there's some global logging setup somewhere that dumps the state stuff into the stderr of the testing.Context
<fwereade> thumper, I imagine the cmd.Logger or whatever it is has a hand in it?
<thumper> IIRC, there was some change to the default loggers with the log roller
<fwereade> thumper, and it's not wrong to be sending all those logs to stdout
<thumper> but I've not looked deeply
<fwereade> thumper, it's just that it's happening in the same process, which is out of the ordinary, and so gets logged with everything else
<fwereade> thumper, I guess the answer with that specific test is to run it against an api stub and check it doesn't log when nicely isolated
<thumper> :)
<davecheney> \o/
<thumper> hi davecheney
<thumper> davecheney: how'd the conference go for you?
<davecheney> thumper: excellently
<davecheney> i guess that means I beat axw back to Australia
<bradm> is there any way to see what jujud is doing, load wise?  we've got one constantly sitting between 100% - 150% cpu, and the logs aren't particularly illuminating - doesn't look too busy at all
<mup> Bug #1446871 changed: Unit hooks fail on windows if PATH is uppercase <ci> <hooks> <windows> <juju-core:Fix Released by natefinch> <juju-core 1.24:Fix Released by natefinch> <https://launchpad.net/bugs/1446871>
<thumper> bradm: best suggestion is to change the log settings to debug
<thumper> bradm: or are they at debug already?
 * thumper takes a deep breath and resolves conflicts between master and jes-cli branch
<bradm> thumper: we have a 20G log file, so either we're on debug or it's very very verbose for info, but we'll check.
<bradm> yes, we're definitely on debug
<bradm> we're seeing a lot about ClaimLeadership
#juju-dev 2015-07-14
<thumper> bradm: which version of juju?
<thumper> bradm: sounds to me like some runaway leadership code
<bradm> thumper: the agents are 1.23.3, we have juju 1.24.2 client installed
<thumper> bradm: ok... I'm guessing leadership stuff then (known to be problematic in 1.23)
<thumper> but it is mostly a guess
<thumper> bradm: does restarting the agents fix it?
<thumper> is that possible ?
<bradm> thumper: we're being super cautious with this environment, it's prodstack 4.5
<thumper> :-)
<thumper> I understand the caution
<thumper> sorry, personally I don't have an answer for you
<bradm> that's fine, just trying to see if there's something we should be doing to see what's going on
<bradm> it's not a problem per se, the load is only sitting at about 1, just curious
<bradm> so it looks like it was restarted, cleared out a bunch of memory, but it's still chewing cpu time
<bradm> the memory usage was more of a problem, but the restart cleared most of that
<thumper> ah fark
<anastasiamac> thumper: ?
<thumper> just resolving a mega-merge (master -> jes-cli) and I deleted a file I shouldn't have
<anastasiamac> :( sounds painful ...
 * thumper runs make check again
<anastasiamac> ericsnow: ping?
<wallyworld> axw: can you remind me - there was a bug about upgrades not ingesting charms from cloud storage to env storage I think?
<axw> wallyworld: tools, not charms
<axw> wallyworld: on ec2 only
<axw> wallyworld: you want the number?
<wallyworld> ah. there's a bug report that implies charms are not being loaded
<wallyworld> sure
<axw> wallyworld: the one I fixed was about the s3 signing, due to ":" being in the URL
<wallyworld> i'll have to try and replicate the charms one
<wallyworld> ah yes, i recall now
<axw> wallyworld: https://bugs.launchpad.net/juju-core/+bug/1469130
<mup> Bug #1469130: tools migration fails when upgrading 1.20.14 to 1.24.1 on ec2 <ec2-provider> <upgrade-juju> <juju-core:Fix Committed> <juju-core 1.24:Fix Committed by axwalk> <https://launchpad.net/bugs/1469130>
<wallyworld> ty
<thumper> wallyworld: http://reviews.vapour.ws/r/2152/
<wallyworld> looking
<wallyworld> thumper: looks ok. are the other issues raised about rsyslog etc valid?
<thumper> I'm not sure which issues you are referring to
<wallyworld> thumper: in menno's email
<thumper> yes, the other issues are valid
<wallyworld> :-(
<thumper> also, looking through the code...
<thumper> I saw something.
 * thumper wonders what happens
<thumper> if you try to send on a closed channel in a select
 * thumper has to head out for a bit
<mup> Bug #1474195 opened: juju 1.24 memory leakage <cpec> <juju-core:New> <https://launchpad.net/bugs/1474195>
<thumper> davecheney: are you really around?
<thumper> if so, I have a golang question
<thumper> if I have a select, and one of the cases is a send to a channel
<thumper> and something closes that channel
<thumper> will it blow up, or just choose the other case?
<thumper> hmm...
 * thumper goes to test with the playground
<wallyworld> thumper: i'm sad
<wallyworld> upgrades from 1.20 to 1.22 are busted
<thumper> :(
<thumper> I'm sad
<wallyworld> the migration of the charm collection to add the env uuid occurs AFTER the migration into storage of the charms, hence no charms are imported
<thumper> a send on a closed channel panics even in a select
<wallyworld> oh?
<thumper> you asked "was there another place where we need to care about closing the channels"
<thumper> and I looked
<wallyworld> i shouldn't have asked :-)
<thumper> there are go routines that are started to send values down the subscriber channels
<thumper> which try to send for a while, then timeout
<thumper> if there are any of these running when we close the subscriber channels
<thumper> panic
<wallyworld> i guess we need to include <-chan in the select
<thumper> it is worse than that
<thumper> we need to make sure they are all dead before we close the subscribers
<thumper> which is a bitch to write and probably worse to test
<wallyworld> yuk
<thumper> well, isn't too hard to write
<thumper> just a wait group
<wallyworld> thumper: save me looking, can you remember, are upgrade steps run serially in the order defined? i would have hoped so
<thumper> yes
<thumper> yes they are
<wallyworld> so how the f*ck are we running charm import in the middle of env uuid fixing
<thumper> NFI
<thumper> is it not an upgrade step?
<wallyworld> it is
<thumper> is it doing things asynchronously?
<wallyworld> not that i can see, i'll have to look closer
 * thumper stabs the lease code
<wallyworld> not sure how to fix, may need to trigger the upgrade steps again somehow
<wallyworld> ah so migrate charms into storage is a 1.21 step
<wallyworld> add env uuid to collections is a 1.22 step
<wallyworld> and yet  the initial env uuid steps are logged prior to the 1.21 steps
<wallyworld> oh wait, env uuid steps are split across 1.21 and 1.22
<wallyworld> and if 1.22 state server runs the steps, it expects an env uuid to be there
<wallyworld> but it won't be because that is only done in a 1.22 step
<wallyworld> and the charm migration happens in a 1.21 step
<wallyworld> so looks like we need to force an upgrade via 1.21 \o/
<thumper> hang on... hangout so you can talk me through this
<thumper> something doesn't smell right
<wallyworld> ok
<wallyworld> 1:1
<thumper> ack
<thumper> wallyworld: you froze
<thumper> and I couldn't hear you
<thumper> so I hung up
<wallyworld> thumper: sorry, bigjools wifi died
<bigjools> thumper: best thing to do is hang up on wallyworld
<thumper> :-)
<wallyworld> thumper: i did the same to you
<thumper> fair enough
<mup> Bug #1253613 changed: Hooks want to publish information to juju status <feature> <hooks> <juju-core:Fix Released> <https://launchpad.net/bugs/1253613>
<wallyworld> dooferlad: hey, i've addressed your comments in the PR, if it looks ok, and you and dimiter think it is sufficient to solve the issues you've seen, let me know and i'll land
<dooferlad> wallyworld: +1 from me. Doesn't look like dimitern is around. TheMue, could you take a look at http://reviews.vapour.ws/r/2138/ please?
<wallyworld> ty
<TheMue> dooferlad: will do, just merging my branch
 * wallyworld off to soccer training now anyway
<dooferlad> wallyworld: have fun! One of us can $$merge$$ it
<TheMue> dooferlad: I added one question to #2138, could you answer it as a native speaker?
<dooferlad> TheMue: sure
<TheMue> dooferlad: thx
<TheMue> dooferlad: it's just a feeling based on the German meanings of those two different words
<dooferlad> TheMue: no problem. In this case only is the right word to use.
<dooferlad> TheMue: I think that this is mostly because solely is a bit of a mouthful and we tend to only use it in more formal language, such as "Bob is a sole trader" .
<TheMue> dooferlad: fine, so ignore my note
<TheMue> dooferlad: in German "only" has a more negative meaning while "solely" is narrowing down and positive
<TheMue> dooferlad: but btw, it won't merge, does not fix the right blocker \o/
<dooferlad> TheMue: :-(
<TheMue> dooferlad: yep, my latest PR also knocks on the CI's doors
<TheMue> dooferlad: as a fix it's accepted, but the tests failed
<dimitern> wallyworld, hey, thanks for taking care of bug 1472014 !
<mup> Bug #1472014: juju 1.24.0: wget cert issues causing failure to create containers on 14.04.2 with lxc 1.07 <openstack-installer> <juju-core:Fix Committed by wallyworld> <juju-core 1.24:Fix Committed by wallyworld> <https://launchpad.net/bugs/1472014>
<mup> Bug #1474291 opened: juju called unexpected config-change hooks after read tcp 127.0.0.1:37017: i/o timeout <juju-core:New> <https://launchpad.net/bugs/1474291>
<dimitern> TheMue, dooferlad, so no maas call today  - the guys are away it seems
<TheMue> dimitern: ok, thanks for info
<dooferlad> dimitern: ack
<perrito666> good morning all
<wallyworld> axw: meeting?
<mup> Bug #1461993 changed: support using an existing vpc <feature> <network> <juju-core:Triaged> <https://launchpad.net/bugs/1461993>
<mup> Bug #1457575 opened: archive/tar: write too long <backup-restore> <intermittent-failure> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1457575>
<dimitern> oh boy, juju not supporting a vpc account without a default vpc strikes again! we should really fix this bug 1321442
<mup> Bug #1321442: Juju does not support EC2 with no default VPC <ec2-provider> <network> <juju-core:Triaged> <https://launchpad.net/bugs/1321442>
<dimitern> hazmat will like this a lot :)
<mup> Bug # changed: 1461959, 1462417, 1463133, 1464280, 1469186
<mup> Bug # changed: 1463870, 1464254, 1464255, 1466513, 1469184, 1469189
<mup> Bug # opened: 1463870, 1464254, 1464255, 1466513, 1469184, 1469189
<mup> Bug #1474382 opened: MeterStateSuite teardown failure on windows <ci> <regression> <test-failure> <windows> <juju-core:Triaged by bteleaga> <https://launchpad.net/bugs/1474382>
<mup> Bug #1474386 opened: Problems bootstrapping the manual provider with CentOS <juju-core:New> <https://launchpad.net/bugs/1474386>
<mgz> bogdanteleaga: poke me if you need anything from the windows test suite
<mgz> bogdanteleaga: I'll also proofread your message to the list about setup function calling if you like
<bogdanteleaga> mgz: sure
<bogdanteleaga> mgz: any eta on when the 1.24.3 thing will start testing the upgrade?
<mgz> bogdanteleaga: when we have 1.24.3 on our maas machine, I'll ask sinzui about the release timeline
<ericsnow> katco, natefinch, wwitzel3: standup?
<ericsnow> could anyone tell me how to deploy a charm on the dummy provider such that hooks get run? or is that not an option?
<ericsnow> natefinch: looks like you didn't add a link for the forward-port PR to bug #1370896
<mup> Bug #1370896: juju has conf files in /var/log/juju on instances <canonical-bootstack> <logging> <rsyslog> <juju-core:Triaged by natefinch> <juju-core 1.24:Fix Released by natefinch> <https://launchpad.net/bugs/1370896>
<mup> Bug #1474411 opened: juju --help text for upgrade is out of date <juju-core:New> <https://launchpad.net/bugs/1474411>
<natefinch> ericsnow: thanks for reminding me
<ericsnow> natefinch: :)
<mup> Bug #1472596 changed: bootstrap failed yet retry says it succeeded <bootstrap> <ui> <juju-core:New> <https://launchpad.net/bugs/1472596>
<mup> Bug #1472596 opened: bootstrap failed yet retry says it succeeded <bootstrap> <ui> <juju-core:New> <https://launchpad.net/bugs/1472596>
<mup> Bug #1473197 changed: openstack: juju failed to launch instance, remains pending forever <deploy> <openstack-provider> <ubuntu-openstack> <juju-core:New> <https://launchpad.net/bugs/1473197>
<katco> natefinch: hey just checking in between errands. did you find something to work on? alexisb's suggestion of 1.25 bugs is not bad, but check with ericsnow to see if there's anything he needs help with to get the wpm demo ready for the mid-cycle
<perrito666> mm, where is the list of environment aware collections?
<perrito666> nevermind
<mup> Bug #1473197 opened: openstack: juju failed to launch instance, remains pending forever <deploy> <openstack-provider> <ubuntu-openstack> <juju-core:Triaged> <https://launchpad.net/bugs/1473197>
<natefinch> katco: yep, doing bugs, cleaning up some of my old bugs that needed forward porting
<katco> natefinch: cool. i looked at the bugs we have flagged in our backlog, and they look valid to me? what was the problem?
<natefinch> katco: there were a couple more that I deleted that were already fixed... sorry if deleting was not the correct thing to do.  I didn't want to put them into done, since they weren't work that we actually did.
<natefinch> katco: but they were definitely already marked as fix released in all series to which they were targeted (probably work was done after we created the cards)
<katco> natefinch: oh, i'm referring to the two bugs in our iteration backlog
<katco> natefinch: created on the 6th, no one assigned in LP
<katco> natefinch: e.g. https://canonical.leankit.com/Boards/View/114568542/115913838
<natefinch> katco: I... somehow completely overlooked those in favor of what was in the backlog (not iteration backlog).  Sorry.  Well, I'll start on one of those right away
<katco> natefinch: cool beans. ty
<katco> natefinch: did that not come up in the standup this morning?
<natefinch> katco: no, I was doing my "cleanup assigned bug" task (which will be done when trunk opens again), and talking to eric about one of the bugs that he'd worked on.    I think eric just believed me when I said all the bugs were assigned or finished ;)
<katco> natefinch: ah, ok :)
<mup> Bug #1473470 changed: Windows cannot ensurePassword <blocker> <ci> <regression> <windows> <juju-core:Fix Released by bteleaga> <https://launchpad.net/bugs/1473470>
<mbruzek> I think I found a networking problem with the KVM provider.  https://bugs.launchpad.net/juju-core/+bug/1474508  -  Can someone from core take a look and let me know if they need anymore information?
<mup> Bug #1474508: Rebooting the virtual machines breaks Juju networking <juju-core:New> <https://launchpad.net/bugs/1474508>
<mup> Bug #1474291 changed: juju called unexpected config-change hooks after read tcp 127.0.0.1:37017: i/o timeout <hooks> <openstack> <sts> <uosci> <juju-core:Invalid> <ceilometer (Juju Charms Collection):New> <https://launchpad.net/bugs/1474291>
<mup> Bug #1474508 opened: Rebooting the virtual machines breaks Juju networking <juju-core:New> <https://launchpad.net/bugs/1474508>
<mup> Bug #1474508 changed: Rebooting the virtual machines breaks Juju networking <juju-core:New> <https://launchpad.net/bugs/1474508>
<mup> Bug #1474291 opened: juju called unexpected config-change hooks after read tcp 127.0.0.1:37017: i/o timeout <hooks> <openstack> <sts> <uosci> <juju-core:Invalid> <ceilometer (Juju Charms Collection):New> <https://launchpad.net/bugs/1474291>
<natefinch> mbruzek: attach the machine and unit log from that machine, and the state server's machine log as well, if you can.
<mbruzek> natefinch: I can't juju ssh or juju scp from the units any longer since Juju thinks the IP address changed.
<mbruzek> natefinch: I will get the logs if I can figure out another way.
<natefinch> mbruzek: juju ssh won't work, but plain old ssh should still work.
<mbruzek> It does, just having trouble submitting the form on Launchpad
<mbruzek> The button does not work on Firefox, and on Chrome I get an error
<natefinch> mbruzek: maybe you should try IE ;)
<mbruzek> natefinch: I have tried this several different ways, I cannot upload the machine-0.log
<mbruzek> natefinch: because it was owned by root, not mbruzek
<mbruzek> natefinch: uploaded
<natefinch> mbruzek: great :)
<mbruzek> launchpad doesn't tell you that, I can navigate to the location and select the actual file.
<natefinch> mbruzek: may be a browser/OS issue where the specific problem is not well communicated.
<mbruzek> Yeah I was just stating the reason I did not immediately see the resolution
<_thumper_> fwereade: you around?
<fwereade> thumper, yeah, in a few minutes
<thumper> I get the feeling we are being shafted by the mgo txn stuff
<thumper> and while I think I have grokked the problem, I'd like to discuss
<thumper> fwereade: also I was very much hoping you would cast your eye over http://reviews.vapour.ws/r/2152/diff/# as you have been doing a lot of lease work
<fwereade> thumper, LGTM
<thumper> fwereade: ok
<fwereade> thumper, will be back to chat in a minute
<thumper> fwereade: have standup on the hour
<thumper> fwereade: if you want to give us 15 or 20 minutes
<fwereade> I'll be a few more minutes than 1 then
<thumper> it may well be worthwhile talking with menn0 and waigani then
<fwereade> thumper, ping me when you're out, I should still be around
<thumper> kk
<fwereade> thumper, cool, I'll just join your hangout when I get back
<fwereade> thumper, I'll sit quietly ;p
 * perrito666 sees fwereade quietly sitting in a corner and hand him a pr for update-status
<wallyworld> thumper: did you have 5?
<thumper> wallyworld: no
<thumper> not right now
<wallyworld> ok
<wallyworld> menn0: is the rsyslog issue you found with that upgrade bug 1468653 an issue that needs its own bug?
<mup> Bug #1468653: jujud hanging after upgrading from 1.24.0 to 1.24.1(and 1.24.2) <canonical-bootstack> <juju-core:In Progress by thumper> <juju-core 1.24:Fix Committed by thumper> <https://launchpad.net/bugs/1468653>
<menn0> wallyworld: yes it does... or perhaps just reopen one of the previous bugs that cover this problem
<menn0> wallyworld: new bug is probably less confusing
<wallyworld> menn0: yeah, could you please do that and assign to 1.24.3?
<wallyworld> the hook execution one was fixed elsewhere right?
<menn0> wallyworld: i'm not sure if the hook failure one has been dealt with in 1.24 yet. thumper?
<wallyworld> menn0: if i recall one of the windows guys may have been working in the area or doing a different fix that would address it? not sure
<menn0> wallyworld: fwereade said that bogdan had committed a fix so that a unit wouldn't indicate that it had started hook execution until it had the lock
<menn0> wallyworld: that needs to be backported to 1.24
<wallyworld> right, that's the one thanks
<wallyworld> or sounds like it
<wallyworld> fwereade: just to check - the above hook execution fix is ok to backport
<fwereade> I think so, yes
<wallyworld> as it appears implicated in a 1.24 upgrade issue
<wallyworld> ok, ta
<fwereade> I remember it being clean, and I don't think there have been major changes in that area
<wallyworld> is there a bug for it do you recall?
<wallyworld> i'll do a search
<fwereade> huh
<fwereade> I have what might be good or bad news
<fwereade> https://github.com/juju/juju/pull/2681
<fwereade> how old is the 1.24 that's experiencing this?
<wallyworld> hmmm, might be older than the fix i hope
<wallyworld> i need to check when 1.24.2 came out
<wallyworld> 2 july
<wallyworld> i'll check the source
<wallyworld> fwereade: damn, that code is in 1.24.2
<fwereade> wallyworld, damnshit
<wallyworld> menn0: you definitely saw the hook bug in 1.24.2?
<menn0> wallyworld, fwereade: yep. after upgrading from 1.24.0 to the official 1.24.2 release, around half of the units had a hook failure
<menn0> wallyworld, fwereade: mostly leadership-settings-changed and leader-elected but also config-changed and others
<wallyworld> menn0: damn, could you add that info to a new bug?
<fwereade> menn0, wallyworld: ...I think that is a property of 1.24.0 rather than 1.24.2
<menn0> wallyworld, fwereade: we can probably get access to the bootstack staging env again to recreate
<wallyworld> fwereade: i have no insight into this issue
<wallyworld> would be happy if it were 1.24 related
<fwereade> menn0, wallyworld: if we want 1.24.2 to recover from that situation I think we basically need to retry failed hooks automatically
<fwereade> menn0, wallyworld: and that, surprise surprise, may have tentacles
<wallyworld> as part of an upgrade?
<fwereade> wallyworld, I think we should do it anyway
<wallyworld> i'd be loath to auto retry hooks
<wallyworld> not in a point release
<fwereade> wallyworld, UX-wise just conditioning users to press the big red retry button is less good than just doing it ourselves
<fwereade> wallyworld, right
<fwereade> wallyworld, and an auto-retry on upgrade might be tricky?
<wallyworld> so for 1.24.0 -> 1.24.2 we document that users need to retry
<fwereade> wallyworld, doable actually
<fwereade> hmm
<wallyworld> well on upgrade we don't have the full api
<wallyworld> so if we retry, hooks may fail
<fwereade> wallyworld, I think we just set the retry flag on any unit in an error state
<fwereade> wallyworld, and the units handle the retry when they get to it
<menn0> fwereade: can we go back to the root cause of the issue? why is this even happening and why is it suddenly more of an issue
<menn0> I think I've noticed this before but it's always just been one or 2 units in an openstack deployment
<menn0> is it because with leadership we have a lot more hooks going off?
<fwereade> menn0, I don't *think* that accounts for it all
<fwereade> menn0, and indeed I don't understand why it wasn't happening before
<fwereade> menn0, *unless*
<fwereade> menn0, something about how we handle the machine lock on upgrade has changed as well
<fwereade> menn0, *or* I fucked up somewhere in the uniter changes of ~jan/feb, and unwittingly changed something about how the uniter handles the lock
<fwereade> menn0, but I might have expected to see that earlier?
<fwereade> menn0, perhaps not?
<fwereade> menn0, sorry I don't have further context, I'm flagging a bit
<fwereade> menn0, I think it would be easier to eliminate the change-to-locking-on-upgrade hypothesis
<fwereade> menn0, so if nothing happened there it's probably my fault
<menn0> fwereade: i'll try to do some digging
<fwereade> menn0, thanks
<menn0> fwereade: it should be easy enough to repro the issue and then work from there
<menn0> wallyworld: did you create a ticket for the hook failures on upgrade issue?
<menn0> wallyworld: i'm currently doing some repro for the rsyslog issue so we have decent details
<wallyworld> menn0: didn't create ticket - i'll go back through emails and dig up details. i haven't actually seen the issue first hand so am unsure how to describe it exactly
<menn0> wallyworld: I'll create the ticket. was just checking you hadn't already.
<wallyworld> ok, ta
<menn0> wallyworld, fwereade: the other issue that needs looking at is the leaseManager worker constantly dying and restarting due to "concurrent updates"
<wallyworld> oh
<wallyworld> didn't know about that one
<menn0> wallyworld: that is my email as well I think
<wallyworld> i'll go and re-read
<wallyworld> my brain fifo kicked in
<wallyworld> sinzui: i created a 1.22.7 milestone
<menn0> wallyworld: here's what I said: 3. The lease manager dies at least once a minute, sometimes more often due to "simultaneous lease updates occurred".
<wallyworld> awesome :-(
<menn0> wallyworld: I suspect fwereade's upcoming work will fix this but I don't know if we want to do something else in the mean time
<wallyworld> we may need to
<wallyworld> it's kinda broken as is
<menn0> wallyworld: i'm not sure what the consequences of this error are
<wallyworld> me either without digging
<menn0> wallyworld: maybe the worker just needs to not treat this error as fatal and resync/retry instead
<wallyworld> worth considering
<mup> Bug #1474588 opened: Many hook failures after upgrade <regression> <juju-core:Triaged by menno.smits> <juju-core 1.24:Triaged by menno.smits> <https://launchpad.net/bugs/1474588>
<perrito666> menn0: could you post status-history for some of those?
<menn0> perrito666: that's a v good idea. i keep forgetting about that feature
<menn0> perrito666: when I repro the issue I'll do that
<perrito666> menn0: tx
<rick_h_> NOTICE: having an issue with prodstack that is causing jujucharms.com to be unresponsive. Also means 1.24.X juju deploys will probably not be successful atm. Working with webops to keep an eye on it and correct it
 * thumper has in-laws arriving for lunch...
<thumper> yay?
<rick_h_> thumper: lucky dog
<thumper> :)
<perrito666> menn0: have a sec?
 * perrito666 lures menn0 into a dark alley
 * menn0 pretends everything is going to be fine
<menn0> perrito666: what's up
<perrito666> I have to report a bug and need some input from you to make sure I don't lie a lot
<perrito666> also, have this completely innocent looking candy I extracted from my coat
<perrito666> menn0: basically I have found that status for our entities has been left out of envuuid :(
<menn0> ooooh candeeee
<menn0> perrito666: do you mean the docs in the statuses collection?
<perrito666> yeah, give a man the choice between possibly dangerous food or a bug in envuuid and see where he goes
<perrito666> menn0: yes, statusesC
<menn0> perrito666: I remember that collection having multi-env support added
<menn0> perrito666: what are you seeing?
<perrito666> menn0: so, the entries for that collection are being created with the services
<perrito666> since that employs createstatusOp it all goes well because envuuid is added
<menn0> ok
<perrito666> but every subsequent setstatus uses updateStatusOp, which does an Update: bson.D{{"$set": doc}} where doc is a new status doc lacking envuuid
<perrito666> so, only the first status of every entity has envuuid
 * perrito666 can hear menn0 cursing from here
#juju-dev 2015-07-15
<bradm> we've got a charm that seems to be stuck on installing, but the juju logs say the start hook ran, any ideas on how we can dig into it?
<rick_h_> thumper: wallyworld ^ this is related to current IS issues
<wallyworld> ok, thanks for update
<menn0> perrito666: that does indeed seem to be a serious problem
<menn0> perrito666: file a bug and point thumper at it. one of us will fix it soon.
 * thumper keeps his head down
<perrito666> thumper: candy?
 * thumper mutters under his breath something about deadlines and too much work
 * thumper ignores the candy
<menn0> perrito666: we'll have to add an upgrade step to fix existing records
<menn0> bradm: if you haven't already can you look at the logs on the unit's machine itself? (/var/log/juju/unit-FOO.log)
<bradm> menn0: all good now, it was just taking a long time to realise it was up
<menn0> bradm: ok, good to hear
<bradm> well, all good might be a stretch, but its all onto ceph now
<bradm> its making good progress
<menn0> perrito666: I can certainly see the statuses env-uuid problem in a local env here
<perrito666> menn0: adding the bug with detail
<rick_h_> NOTICE: jujucharms.com and the charmstore are back up. The storage in IS is working to rebalance/sync and might time out or be slow for a bit longer.
<rick_h_> menn0: ^
<menn0> rick_h_: sweet
<menn0> perrito666: I wonder how status lookups are even working at all
<perrito666> menn0: me too
<perrito666> menn0: thumper https://bugs.launchpad.net/juju-core/+bug/1474606
<mup> Bug #1474606: entities status is loosing env-uuid upon setting status. <juju-core:New> <https://launchpad.net/bugs/1474606>
<perrito666> menn0: enjoy
<rick_h_> hah, he says as he takes his candy back
<perrito666> rick_h_: well I left a very nice report in exchange
<perrito666> menn0: this happens for services, units, agents, machines and every other thing that has a status
<menn0> perrito666: I see you've already got a fix for it too
<menn0> perrito666: although I might try and fix this in the multi-env txn layer
<menn0> too
<perrito666> menn0: I do, but I was not sure if I cover all aspects of this issue
<perrito666> I just fixed my patch of land
<menn0> perrito666: understood
<davecheney> what the heck is this test doing ?
<davecheney> FAIL: kvm-broker_test.go:241: kvmBrokerSuite.TestStartInstancePopulatesNetworkInfo
<davecheney> [LOG] 0:00.001 DEBUG juju.testing setting feature flags: address-allocation
<davecheney> kvm-broker_test.go:251: instanceConfig := s.instanceConfig(c, "42")
<davecheney> /home/ubuntu/src/github.com/juju/juju/container/testing/common.go:90: c.Assert(err, jc.ErrorIsNil)
<davecheney> ... value *os.PathError = &os.PathError{Op:"mkdir", Path:"/var/lib/lxc", Err:0xd} ("mkdir /var/lib/lxc: permission denied")
<davecheney> of course this is going to fail
<davecheney> mortals don't have permission to write to that dir
<menn0> wallyworld: just confirmed... rsyslog is screwed both before and after the upgrade
<menn0> :(
<menn0> writing up a ticket now
<wallyworld> oh dear
<menn0> in different ways though so you know, at least that's interesting
<wallyworld> even better
<menn0> I think the issue in 1.24.0 has already been fixed in a later 1.24
<wallyworld> the rsyslog issue? so do we need to amend an upgrade step?
<menn0> not sure yet
<menn0> before the upgrade rsyslogd is continually being restarted by juju
<menn0> every 30s or so
<wallyworld> sounds like a bug we may have fixed yeah
<menn0> the only thing in juju's logs is the rsyslog worker saying "reloading rsyslog config"
<menn0> after the upgrade that stops
<menn0> but most/all of the units can't connect
<menn0> due to cert verification
<axw> thumper: "I did say that the way to fix this properly was to not use the state method to load the charms."  -- what else would you use to enumerate charms, if not state?
<menn0> which seems like the thing that's been attempted to be fixed several times now
<menn0> anyway, i'll write up the ticket
<menn0> wallyworld: ^^
<mup> Bug #1474606 opened: entities status is losing env-uuid upon setting status. <juju-core:New for menno.smits> <juju-core 1.24:New for menno.smits> <https://launchpad.net/bugs/1474606>
<mup> Bug #1474607 opened: worker/uniter/relation: HookQueueSuite.TestAliveHookQueue failure <juju-core:New> <https://launchpad.net/bugs/1474607>
<thumper> axw: making a request of the collection and have it return the raw data as dicts
<thumper> axw: that way you aren't expecting a particular structure
<thumper> axw: we have had to write many upgrade steps in this way
<thumper> and I believe it is better
<thumper> because you just ask for what you need, and change what you must
<thumper> and don't worry about the structure of the doc as much
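A minimal sketch of the map-based upgrade style thumper is advocating, with an in-memory slice standing in for a raw mongo collection (field names are illustrative):

```go
package main

import "fmt"

// addEnvUUID has the shape of upgrade step thumper describes: read the
// raw docs as generic maps, touch only the field the step cares about,
// and leave the rest of each doc alone rather than round-tripping
// through a typed struct that bakes in today's schema.
func addEnvUUID(docs []map[string]interface{}, envUUID string) {
	for _, doc := range docs {
		if _, ok := doc["env-uuid"]; !ok {
			doc["env-uuid"] = envUUID // change only what we must
		}
	}
}

func main() {
	docs := []map[string]interface{}{
		{"_id": "wordpress", "charmurl": "cs:wordpress-4"},
		{"_id": "mysql", "charmurl": "cs:mysql-2", "env-uuid": "deadbeef"},
	}
	addEnvUUID(docs, "deadbeef")
	for _, d := range docs {
		fmt.Println(d["_id"], d["env-uuid"])
	}
}
```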
<wallyworld> menn0: sorry was distracted by a review. i recall vaguely the cert issue was fixed. rsyslog i think uses a different cert to state server connections
<menn0> wallyworld: do you know when it was fixed given that I'm seeing this with an upgrade to 1.24.2?
<wallyworld> menn0: not offhand - ericsnow might know
<wallyworld> thumper: i am going to fix it like you wanted in 1.22 - do we really need to do the work in 1.24 if the step runs after env uuid has been added?
<wallyworld> if yes, i can fix
<axw> wallyworld: is there a test we can add that would have highlighted the issue?
<axw> wallyworld: and that would highlight future issues
<wallyworld> axw: for this case yes - a CI test that adds charms to the 1.20 env prior to upgrade and then checks that they are imported after
<wallyworld> that's on my todo list to follow up on
<axw> wallyworld: presumably we could write a unit test that adds a charm entry to state, and wipes out its env-uuid field to exercise the bug? but the CI test would be better, since it'll catch future bugs too
<wallyworld> yeah, that was my thinking
<wallyworld> menn0: and for further joy, bug 1469077 is back again, i'll need to point william to it
<mup> Bug #1469077: Leadership claims, document larger than capped size <landscape> <leadership> <juju-core:Triaged> <juju-core 1.24:Confirmed> <https://launchpad.net/bugs/1469077>
<menn0> wallyworld: yeah I saw that
<menn0> wallyworld: so unawesome
<wallyworld> i know right :-(
<menn0> here's the rsyslog ticket: bug 1474614
<mup> Bug #1474614: rsyslog connections fail with certificate verification errors after upgrade to 1.24.2 <regression> <juju-core:New> <juju-core 1.24:New> <https://launchpad.net/bugs/1474614>
<menn0> wallyworld: ^^
<wallyworld> looking
<wallyworld> thanks for filing, a nice bug report
<wallyworld> menn0: axw will pick up that rsyslog bug
<wallyworld> axw: goose pr lgtm
<axw> wallyworld: thanks
<mup> Bug #1474291 opened: juju called unexpected config-change hooks after read tcp 127.0.0.1:37017: i/o timeout <hooks> <openstack> <sts> <uosci> <juju-core:New> <ceilometer (Juju Charms Collection):New> <https://launchpad.net/bugs/1474291>
<mup> Bug #1474614 opened: rsyslog connections fail with certificate verification errors after upgrade to 1.24.2 <regression> <juju-core:New> <juju-core 1.24:New> <https://launchpad.net/bugs/1474614>
<wallyworld> axw: tagging pr reviewed, but there is a question
<axw> wallyworld: ok, thanks, will look in a sec
<thumper> wallyworld: if we don't, it is just a time bomb for the next time things change
<thumper> it is making a problem for future us
<wallyworld> thumper: looking at the code - there's *lots* of current upgrade steps that use the docs directly, not maps. the only ones that use maps are the ones to insert env uuid
<thumper> well... we are just making problems for ourselves IMO
<wallyworld> thumper: i think your team wrote a lot of them
<thumper> you are probably not wrong
<wallyworld> i guess the difference is that the docs are used with the raw collection
<thumper> I'm telling you the result of accumulated wisdom
<wallyworld> so rawCollection.Find(&somedoc)
<wallyworld> one would hope that CI tests would evolve to better catch upgrade issues
<lazyPower> thumper: sorry was deep in a support scenario.
<lazyPower> thumper: sounds good mate, submit it earlier is way better than later, as the queue takes a couple days to sift down to newly submitted stuff
<thumper> lazyPower: no worries
<thumper> I'm not going to block my deployment on it getting reviewed :)
<lazyPower> we're averaging ~ 5 days on initial touch, still trying to get that number down, but it's way better than the 13 days in history.
<lazyPower> as you shouldn't be :)
<lazyPower> namespaces!!!!
 * lazyPower toots the namespace horn
<lazyPower> cheers
<thumper> namespaces?
<menn0> thumper, waigani, wallyworld: see email for findings related to bug 1474195
<mup> Bug #1474195: juju 1.24 memory leakage <cpec> <deployer> <performance> <regression> <juju-core:Triaged> <juju-core 1.24:In Progress by menno.smits> <https://launchpad.net/bugs/1474195>
<menn0> thumper, waigani, wallyworld: looks like will's theory was right
<waigani> menn0: red box of death?
<menn0> waigani: yep... see the note on the field
<waigani> ah yep, just saw note
<menn0> :)
<waigani> nice work
<menn0> waigani: when you added the auto env life assertion to the txn layer did you remove the ones that already existed elsewhere, or did they not exist anywhere before JES?
<menn0> I guess it wasn't really necessary when there was just one env
<waigani> menn0: yeah, it's going back a bit now, but I don't remember there being any - which as you point out makes sense.
<menn0> cool
<menn0> waigani: i'm trying to figure out the right places to check
<menn0> adding a machine certainly
<waigani> menn0: you mean where we really need to assert for a live environ?
<menn0> waigani: yep
<waigani> menn0: as a starting point, didn't will say whenever we add a service, unit, relation or machine?
<mup> Bug #1454468 changed: nodes deployed successfully by maas but juju status remains pending with juju 1.23.2 and services stuck in allocating <deploy> <oil> <juju-core:Expired> <https://launchpad.net/bugs/1454468>
<menn0> waigani: I wonder if we can reduce that set to just service and machine
<waigani> so what's the worst case? we add a unit/relation to a dying environment...
<menn0> waigani: I'm wondering if we can just add an environment cleanup to kill them
<menn0> waigani: in fact, the current cleanupServicesForDyingEnvironment might already do it
 * thumper is going to lie down
 * thumper is not 100%
<waigani> menn0: so that sets the existing services to dying, but expects that no new services can be added to a dying environment
<menn0> waigani: standup hangout? (as per PM)
<waigani> menn0: yep
<wallyworld> menn0: should be any time we allocate something that costs
<wallyworld> eg machine, storage etc
<menn0> wallyworld: yep, that's what i'm looking at now... anything that results in a physical change certainly needs the env life assert
<menn0> (as physical as a virtual machine is anyway)
<wallyworld> well, physical change that costs $$$
<wallyworld> don't really care about containers
<menn0> wallyworld: i've got a pretty clear picture now of what I want to do. i'm going to catch will later on tonight to confirm
<wallyworld> but machines yes
<wallyworld> ok
<menn0> wallyworld: that's a good point, maybe we don't check for containers
<wallyworld> menn0: yeah, so for stuff that doesn't cost, we just have a cleanup job after env is killed
<menn0> wallyworld: yep, and we already have most of that it turns out
<wallyworld> i can't see why we'd check more than is necessary
<wallyworld> and before JES, we didn't check
<wallyworld> so TBH i'm not sure why we started checking with JES
<wallyworld> i guess JES has greater chance of concurrent access
<menn0> yep and b/c before when you issued destroy-environment everything died at that point including the API server
<menn0> so you had very little opportunity to add a new machine or whatever once the env was dying
<wallyworld> waigani: so is that +2 a ship it? btw - that func needs to be exported because it is in state package
<wallyworld> and called by upgrades package
<menn0> now for hosted envs the API server stays up so there's a much greater chance of env changing operations as the env is dying
<wallyworld> fair point
<menn0> anyway, stopping now since i'm going to be back on later
<dimitern> dooferlad, morning
<dooferlad> dimitern: hi
<dimitern> dooferlad, I thought we dealt with the kvm-inaccessible-after-reboot issue in 1.24 as well ? see bug 1474508
<mup> Bug #1474508: Rebooting the virtual machines breaks Juju networking <juju-core:New> <https://launchpad.net/bugs/1474508>
<dooferlad> dimitern: I thought so too.
<dimitern> dooferlad, maybe the fix is in master only?
<dooferlad> dimitern: will need to take a look and see if I missed landing it
<dimitern> dooferlad, cheers
<dooferlad> dimitern: darn, wasn't backported.
<dooferlad> dimitern: will be trivial to do.
<waigani> wallyworld: sorry, just saw your message - this for moving charm tests to state? yes, +2 shipit.
<wallyworld> ta
<waigani> wallyworld: ah, I thought I clicked shipit - done now. The pattern of needing exported state funcs for upgrade steps is something fwereade is keen to change - possibly just exporting one upgrade step from state which then calls the other unexported steps. But for now we just need to try to make it clear that while the func is exported, no-one except the upgrades package should be using it.
<wallyworld> waigani: np. and this was for a 1.22 release, so old code
<waigani> wallyworld: yeah true
<dimitern> dooferlad, awesome! will you do the dance then please? - card, bug, etc.
<dimitern> :)
<dooferlad> sure
<dimitern> ta!
<menn0> fwereade: ping?
<dooferlad> dimitern: http://reviews.vapour.ws/r/2163/ for a quick +2
<dimitern> dooferlad, ship it! :)
<jam1> fwereade: dimitern: food just arrived so I'm going to miss the standup. But I'm working on breaking down the Uncommitted state stuff into development items (I'd like to chat directly with fwereade later if you have time before our cycle review)
<dooferlad> jam, fwereade, TheMue, dimitern: stand up!
<dimitern> dooferlad, omw
<TheMue> omw
<dimitern> jam, thanks for the heads up
<rogpeppe> can anyone tell me something about plans relevant to the EnvironmentsCacheFile feature?
<rogpeppe> jam, fwereade: ^
<jam> rogpeppe: I don't particularly know it by that name, but it looks like something thumper would have been doing to support multiple environments
<mup> Bug #1474788 opened: ec2: provisioning machines sometimes fails with "tagging instance: The instance ID <ID> does not exist" <ec2-provider> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1474788>
<jam> just by reading its description from https://github.com/juju/juju/blob/master/feature/flags.go#L29
<rogpeppe> jam: i'm just wondering what our future plans are. is the plan to do away with .jenv files entirely?
<jam> rogpeppe: thats how I read the description in there. I haven't heard of that before, nor had read that particular detail in the JES stuff. But it does read that way.
<rogpeppe> jam: surely we have some roadmap plans somewhere?
<rogpeppe> cherylj: do you know about this, by any chance?
<jam> rogpeppe: so I've got docs for JES CLI, JES Logging, MESS Work Items and one more. The last two are "Historical" and it might be described in there, but they are roughly before I started tracking all the proposals directly.
<dooferlad> dimitern: this is what I have for the spaces API stuff: https://github.com/juju/juju/compare/net-cli...dooferlad:net-cli-apiserver-spaces?expand=1
<dooferlad> dimitern: would be good to have a chat about if that is shaping up in the way you imagined. I am not sure I like having the stub network stuff in apiserver/testing. I think having its own package is nicer.
<dooferlad> dimitern: what do you think?
<dimitern> dooferlad, looking
<dimitern> dooferlad, I like the refactoring around moving the shared stubs in apiserver/testing
<dimitern> dooferlad, haven't looked at every line, but so far it looks solid
<dimitern> dooferlad, please, s/ast/apiservertesting/ (or whichever alias for that path is more common)
<dooferlad> do you have an opinion about if fake_spaces_subnets.go should be in its own package so we can just import and use it rather than having to call InitStubNetwork?
<dimitern> dooferlad, also InitStubNetwork() could be defined as a method on a fixture struct, which can be embedded into the suites that need it and call it in SetUpSuite, rather than init()
<dimitern> dooferlad, have a look at LiveTests (or was it Tests ?) for example
<dooferlad> dimitern: sure, github.com/juju/juju/environs/jujutest/livetests.go right?
<dimitern> dooferlad, I have a lingering feeling the shared stubs are not goroutine safe (when used outside apiserver/testing) - make sure you run with -race
<dimitern> dooferlad, that's the one yeah
<dooferlad> dimitern: great, thanks for the pointers.
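dimitern's fixture suggestion, sketched without gocheck: put the stub setup on a fixture type, embed it into each suite, and run it from SetUpSuite instead of a package init(). StubNetworkFixture and SetUpSuite follow the gocheck naming convention, but the types here are simplified stand-ins:

```go
package main

import "fmt"

// StubNetworkFixture holds the shared stub state; suites embed it and
// call SetUpSuite themselves, so nothing mutates globals at import time
// and each suite gets a fresh, deterministic setup.
type StubNetworkFixture struct {
	Subnets []string
}

func (f *StubNetworkFixture) SetUpSuite() {
	// Per-suite setup; nothing happens at package init() time.
	f.Subnets = []string{"10.0.0.0/24", "10.0.1.0/24"}
}

type spacesSuite struct {
	StubNetworkFixture // embedded: the suite gets SetUpSuite for free
}

func main() {
	var s spacesSuite
	s.SetUpSuite()
	fmt.Println(len(s.Subnets), "stub subnets")
}
```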
<wallyworld> fwereade: bug 1469077 has come up again on 1.24.2, so i removed the incomplete status
<mup> Bug #1469077: Leadership claims, document larger than capped size <landscape> <leadership> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1469077>
<fwereade> wallyworld, grar. axw, do you have context on this? ^^
<axw> fwereade: nope
<wallyworld> fwereade: i have no context on the cause or fix sadly, but i see some info has been attached to the bug
<jam> wallyworld: fwereade: I could see a case where contending on the txn-queue field and having it grow large enough that we can't handle all the txns listed before a new one comes in
<jam> and then it grows every 30s until there are so many entries that it is larger than we're allowed to make a document (or in this case larger than the size of a capped collection?)
<wallyworld> sounds plausible
<fwereade> jam, yeah -- I just thought that *someone* had addressed the writes that caused that
<fwereade> jam, I just forget who
<fwereade> jam, perhaps I hallucinated it
<jam> fwereade: we handled that for addresses by fixing the addresser
<wallyworld> ah mr *someone* :-)
<jam> I don't know of a fix for the leadership stuff
<fwereade> jam, ok, bugger, I thought that was part of the stuff axw had done but evidently not
<jam> m-enn-o had done some work to clean out transactions that are thought of as already applied (to handle our other assertion-only TXNs don't get cleaned out)
<jam> but I would think that's a different issue.
<fwereade> jam, yeah
<fwereade> jam, ok, let's chalk it up to a hallucination then ;p
<fwereade> jam, oh wait
<fwereade> jam, I thought it was the remove/insert behaviour that led to growing txn queues, and mr.someone had made a fix to the lease persistor that stopped it doing that?
<fwereade> wallyworld:
<fwereade> 	// TODO(wallyworld) - this logic is a stop-gap until a proper refactoring is done
<fwereade> 	// We'll be especially paranoid here - to avoid potentially overwriting lease info
<fwereade> 	// from another client, if the txn fails to apply, we'll abort instead of retrying.
<fwereade> wallyworld, originally it was remove/insert every time, which was causing unbounded queue growth
<wallyworld> hmmm, let me look up that code
<fwereade> wallyworld, state/lease.go
<fwereade> wallyworld, maybe it wasn't backported..?
<wallyworld> i don't recall that todo at all, yet it has my name on it :-)
<fwereade> haha
<fwereade> I know the feeling
<wallyworld> fwereade: i just checked the code, 1.24 is the same
<jam> wallyworld: I do remember fwereade reviewing a patch you submitted so that we changed how leases are requested so that it wouldn't be a "delete current one, create a new one" sort of operation.
<jam> fwereade: as far as that goes, *if* we ever get to a point where we have an invalid TXN in the queue (one that we cannot clear)
<jam> then we'll overflow the txn-queue eventually
<jam> because creation of a *new* txn adds a value to the field
<jam> and then when we go to evaluate the txn, we see the bad txn and die, and now we have yet-another txn in the queue
<jam> so the "document too big" could just be a symptom of "invalid TXN in queue"
<wallyworld> jam: i vaguely recall that too, i'll have to go digging
<jam> fwereade: iteration planning meeting?
<fwereade> jam, there
<jam> hm. I don't see you in the one I'm in
<wwitzel3> katco: ping
<mup> Bug #1474508 changed: Rebooting the virtual machines breaks Juju networking <juju-core:Fix Released by dooferlad> <juju-core 1.24:In Progress by dooferlad> <https://launchpad.net/bugs/1474508>
<perrito666> morning all
<wwitzel3> perrito666: o/
<wwitzel3> ericsnow: ping
<wwitzel3> so cold and alone
<natefinch> wwitzel3: lol
<wwitzel3> natefinch: these things tend to happen when working on rsyslog stuff ;)
<natefinch> wwitzel3: ahh yeah, totally
<dooferlad> dimitern, TheMue: please be opinionated at http://reviews.vapour.ws/r/2169/
<TheMue> dooferlad: ok
<TheMue> dooferlad: too many files, cannot be good *lol*
<dimitern> dimitern, kiijubg\
<dimitern> wtf?!
<dimitern> dooferlad, looking :)
<mup> Bug #1468815 opened: Upgrade fails moving syslog config files "invalid argument" <ci> <regression> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1468815>
<cherylj> rogpeppe: I didn't do the work to enable the cache file, but I might be able to answer specific questions you may have about it.
<dimitern> dooferlad, reviewed
<dooferlad> dimitern: thanks - exactly what I needed.
<dimitern> dooferlad, cool :)
<mup> Bug #1474885 opened: juju deploy fails with ERROR EOF <juju-core:New> <https://launchpad.net/bugs/1474885>
<mup> Bug #1474892 opened: User friendly error message for system destroy could be improved <juju-core:New for cherylj> <https://launchpad.net/bugs/1474892>
 * fwereade is stopping, has a review up: http://reviews.vapour.ws/r/2172/
<katco> ericsnow: meeting
<alexisb> katco, would you mind filling gsamfira in on the details of how we use feature branches?
<alexisb> I pointed him to the wiki but he has some questions I am not qualified to answer
<katco> alexisb: sure thing
<katco> gsamfira: lmk what questions you have
<mattyw> fwereade, I'll be proposing small uniter changes later on, don't need reviewing yet but I'll ping you about them tomorrow, wanted to let you know before then just to let you know that part of it might be controversial but I think I have a good justification
<davecheney> kvm-broker_test.go:201: kvm0 := s.startInstance(c, "1/kvm/0")
<davecheney> /home/ubuntu/src/github.com/juju/juju/container/testing/common.go:90: c.Assert(err, jc.ErrorIsNil)
<davecheney> ... value *os.PathError = &os.PathError{Op:"mkdir", Path:"/var/lib/lxc", Err:0xd} ("mkdir /var/lib/lxc: permission denied")
<davecheney> I am seeing this error constantly on a fresh ubuntu machine
<davecheney> it seems pretty fatal
<davecheney> has anyone else ever seen this
<davecheney> i'm sure it's because lxc packages are not installed, so /var/lib/lxc is not present
<davecheney> but this seems like a pretty serious isolation failure
<natefinch> davecheney: I haven't seen it, but I have lxc installed
<davecheney> this is on a fresh install
<davecheney> tests fail because this directory is
<davecheney> 1. not present
<davecheney> 2. will not be present, because /var/lib is owned by root
<natefinch> davecheney: certainly, it's an isolation problem.  I wonder if there aren't a lot more similar problems in those tests, if lxc is not installed.
<davecheney> i'm too scared to look
<davecheney> also, how is that test supposed to pass on windows ?
 * davecheney logs a bug and moves on
<natefinch> davecheney: I presume all the tests are marked as skipped on windows
<davecheney> do we have voting windows CI tests ?
<natefinch> ericsnow, wwitzel3: review me? http://reviews.vapour.ws/r/2174/
<natefinch> sinzui: what davecheney said ^
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1474946
<mup> Bug #1474946: worker/provisioner: tests are poorly isolated <juju-core:New> <https://launchpad.net/bugs/1474946>
<natefinch> davecheney: I know they run and passed at one time, but I don't know if they're voting or not.  I believe so, since I have gotten windows bugs
<natefinch> from CI failures
<sinzui> natefinch windows tests do vote. they have passed, but not in a week. I am told many tests are skipped
<natefinch> wwitzel3: btw, I already foward ported the first bug in your bug task: https://bugs.launchpad.net/juju-core/+bug/1370896
<mup> Bug #1370896: juju has conf files in /var/log/juju on instances <canonical-bootstack> <logging> <rsyslog> <juju-core:Fix Committed by natefinch> <juju-core 1.24:Fix Released by natefinch> <https://launchpad.net/bugs/1370896>
<sinzui> juju-ci-tools has a similar problem when we run its own suite on OS X. I created /var/lib/lxc on the machine to get a pass
<davecheney> that's terrible
<wwitzel3> natefinch: yeah, saw that, I discovered that the problem still exists in juju-1.24 master so I'm working on a fix now, before porting the other PRs
<davecheney> why doesn't it fail for the landing bot ?
<sinzui> davecheney: windows test suite is run by ci, not the merge bot, and since the tests take about 2 hours to get a pass, do you really want to slow down merges? mgz suggested that the test suite be made reliable so that we could get the run down to 40 minutes per merge
<davecheney> sinzui: i like forcing the issue
<sinzui> ;)
<mup> Bug #1474946 opened: worker/provisioner: tests are poorly isolated <juju-core:New> <https://launchpad.net/bugs/1474946>
<sinzui> perrito666: juju-ci-tools has the first part of my testing arg change. I am going to do another round too, but it won't be merged until tomorrow.
<perrito666> sinzui: tx for the heads up
<katco> ericsnow: sorry, got caught up in meetings. reviewing your prs now
<wwitzel3> afk picking my car up from the shop
<menn0> perrito666: ping?
<perrito666> menn0: pong?
<perrito666> good morning
<menn0> perrito666: good evening
<menn0> perrito666: regarding the problem you found yesterday
<perrito666> yes?
<menn0> perrito666: thumper reminded me that using $set to overwrite a doc is a no-no with mgo/txn
<menn0> perrito666: because it blows away the mgo/txn fields (txn-queue, txn-revno etc)
<perrito666> menn0: oh, expand
 * menn0 goes to find the mailing list post about this
<menn0> perrito666: it was SO. the last paragraph here: http://stackoverflow.com/a/24458293/195383
<menn0> perrito666: we should probably change the status update code to do a more conventional update
<menn0> perrito666: and add some protection to stop people doing this again in the future.
<perrito666> Definitely
<menn0> perrito666: can you handle the first part (changing the status update code)
<menn0> ?
<menn0> perrito666: I'll handle the second part (preventing these kinds of updates)
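The "more conventional update" could look roughly like this, sketched with maps: build the $set payload from only the fields being changed, so env-uuid and the mgo/txn bookkeeping fields are never part of the update (field names are illustrative, not the exact juju schema):

```go
package main

import "fmt"

// statusSetFields builds the $set payload from only the fields being
// changed, so env-uuid and the mgo/txn bookkeeping fields (txn-queue,
// txn-revno) are never mentioned and therefore never overwritten.
func statusSetFields(status, info string) map[string]interface{} {
	return map[string]interface{}{
		"status":     status,
		"statusinfo": info,
	}
}

// applySet mimics MongoDB's $set semantics.
func applySet(stored, set map[string]interface{}) {
	for k, v := range set {
		stored[k] = v
	}
}

func main() {
	stored := map[string]interface{}{
		"env-uuid":  "deadbeef",
		"txn-revno": int64(4),
		"status":    "allocating",
	}
	applySet(stored, statusSetFields("started", "unit is ready"))
	// env-uuid and txn-revno survive; only the status fields change.
	fmt.Println(stored["env-uuid"], stored["txn-revno"], stored["status"])
}
```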
<perrito666> I wonder if that is not the cause of some of the eff ups of txns, this has been there for who knows how long
<perrito666> I'll fix update
<menn0> perrito666: it could well be
<perrito666> I am just making a quick grocery shop and I'll send a patch upon returning
 * perrito666 is surprised at how slow the "< 10 items" line can be
<menn0> perrito666: no problems.. it can wait until tomorrow
<menn0> perrito666, thumper: statusesC isn't the only place where we replace docs using $set
<thumper> menn0: how many other places?
<menn0> perrito666, thumper: not many: stateServingInfoC, constraintsC, settings
<menn0> kinda important ones though!
<thumper> :)
<thumper> settings change often IIRC
<thumper> moving a service around a gui updates settings doesn't it?
<menn0> thumper: no that's annotations
<thumper> ah
<thumper> good
<menn0> thumper: settings is all the relation and env settings
<thumper> but equally important bits
<thumper> relation settings is the core communication channel between services right?
<menn0> thumper: esp b/c they all get watched for changes
<perrito666> Aghh this line (all this conversation happened in the market line)
<menn0> thumper: and also bad b/c constraints and settings are multi-env so really should have the env-uuid set
 * thumper nods
<thumper> fark!!!
 * menn0 extends bug 1474606
<mup> Bug #1474606: entities status is losing env-uuid upon setting status. <juju-core:Triaged by menno.smits> <juju-core 1.24:Triaged by menno.smits> <https://launchpad.net/bugs/1474606>
<perrito666> Menn0 do we need some sort of repair steps?
<menn0> perrito666: we will need to implement DB migrations to fix the env-uuid fields
 * menn0 is having doubts and does a quick check to ensure that $set with a struct really replaces the whole doc
<perrito666> menn0: can we implement db migrations to run even when not having min version change?
<perrito666> by min I mean maj.min.micro
<menn0> thumper, perrito666: no it does what we thought, so all made
<menn0> urgh, so all bad
<thumper> menn0: also... there are a bunch of weird relation bugs that I have a feeling are caused by this
<menn0> thumper: could be
<thumper> where an openstack deployment is made and some relations don't get the settings
<thumper> especially if the relation config is more complicated
<thumper> which is likely to be with some openstack charms
<perrito666> basically anything being updated is left out of an env
<perrito666> and most likely breaks a transaction
 * perrito666 makes a t-shirt that says "every time you $set a doc a txn dies"
<menn0> perrito666: all upgrade steps for the current major version are run whenever upgrading to any version within that major version so if we add upgrade steps they will get run
<perrito666> excellent, I was in doubt there
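menn0's point about step selection, sketched with a simplified registry: the real upgrade machinery keys steps off full version numbers, but this stand-in only captures the behaviour he describes, that any upgrade within a major version runs all of that major version's steps (which also implies steps must be idempotent, since they may run more than once):

```go
package main

import "fmt"

// step and stepsToRun are simplified stand-ins for the upgrade
// machinery, not the real juju upgrades package.
type step struct {
	major int
	name  string
}

// stepsToRun selects every registered step for the target major
// version, regardless of which minor/micro version is being upgraded to.
func stepsToRun(steps []step, targetMajor int) []string {
	var run []string
	for _, s := range steps {
		if s.major == targetMajor {
			run = append(run, s.name)
		}
	}
	return run
}

func main() {
	steps := []step{
		{1, "add env-uuid to statuses"},
		{1, "repair docs replaced via $set"},
		{2, "some future 2.x step"},
	}
	// Upgrading to any 1.x release runs every registered 1.x step.
	fmt.Println(stepsToRun(steps, 1))
}
```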
 * perrito666 feels ignored by the bot
<menn0> bug 1474606 updated
<mup> Bug #1474606: Document replacements using $set are problematic <juju-core:Triaged by menno.smits> <juju-core 1.24:Triaged by menno.smits> <https://launchpad.net/bugs/1474606>
<perrito666> menn0: Ill propose a fix for status right away
<perrito666> I take it the migration step will be rather generic and just be called with all affected collections once all is fixed?
<perrito666> menn0: btw, thanks for putting all that effort into this, I completely overlooked the txn issue.
<menn0> perrito666: it was thumper who remembered this, not me.
<perrito666> aww, I don't want to thank thumper, he did not take my candy
<thumper> haha
<thumper> perrito666: no if you were offering nice steak or wine... that would be a different proposition
<thumper> s/no/now/
<marcoceppi> jw4: you still around?
<thumper> perrito666: I have a gut feeling that the replacement of the docs in the settings collection is the source of a collection of weird unreproducable relation errors
<thumper> hey marcoceppi
<thumper> marcoceppi: quick question for you
<marcoceppi> hey thumper o/
<marcoceppi> thumper: shoot
<thumper> marcoceppi: if you are deploying a large bundle, how often are there strange relation config issues?
<marcoceppi> I wouldn't know, I've only done openstack bundles
<marcoceppi> that's the biggest I've gotten
<perrito666> thumper: I think you might not like steak how it's done here :p but if you were ever to visit I might cook you a decent local meat dish with wine
<thumper> :)
<jw4> marcoceppi: yep, sorry missed your ping
<alexisb> thumper, I am available when ever you would like to chat
<marcoceppi> jw4: does action-fail immediately exit after it's called?
<marcoceppi> as in, kill the action?
<jw4> marcoceppi: I don't think so
<marcoceppi> or do I still need to exit
<marcoceppi> kk
<marcoceppi> thanks
<jw4> yw :)
<jw4> marcoceppi: just confirmed - it only sets the status of the action but doesn't terminate execution
<perrito666> thumper: menn0 what I am wondering, and you might be too, is how in the universe are these things working even though they lack env-uuid
<perrito666> sounds like we have another bug somewhere
<perrito666> at least in state
<perrito666> http://reviews.vapour.ws/r/2178/ <-- fix for update status
<menn0> perrito666: yes, I was wondering the same thing
<menn0> perrito666: there might be a bug in the multi-env txn stuff
<perrito666> mm, isnt (or wasnt) the env also encoded in the id?
<perrito666> I just noticed the breakage once I needed to use something with an int _id
<menn0> perrito666: ship it
<menn0> perrito666: the env uuid is prefixed on to the front of the _id
<menn0> perrito666: it needs to be a string
<menn0> perrito666: where do you have int _ids?
<perrito666> menn0: status-history works differently
<perrito666> its a simple pile
<menn0> perrito666: so it has int _ids?
<perrito666> menn0: yes, sequential
<perrito666> also doesnt use txn
<perrito666> all by hand
<menn0> ok, well if it's not using the txn system then it doesn't matter what you do
<perrito666> menn0: yup its a different beast
<perrito666> menn0: btw, I think that, at least for status, what is happening is that, since the ids of the entities and statuses are the same, it is returning the statuses correctly anyway (and the envuuid aware txn might be letting blank envuuid pass, which it shouldn't)
<menn0> perrito666: yeah it's not supposed to
<perrito666> menn0: its just a theory
<perrito666> but behavior seems to suggest that this is happening
<menn0> perrito666: i'm dealing with the another critical bug at the moment, then I'll get to this one
<perrito666> life is fun, isn't it?
<perrito666> if it makes you feel better, you are one day closer to the weekend than I am
<thumper> perrito666: it is working because we don't use the env-uuid value unless we are cleaning up documents
<thumper> perrito666: all the queries use the _id field
<thumper> which is the same
<thumper> and has the env-uuid prefixed
#juju-dev 2015-07-16
<davecheney> thumper: i've been staring at this one all afternoon, https://bugs.launchpad.net/juju-core/+bug/1475056
<mup> Bug #1475056: worker/uniter/relation: HookQueueSuite.TestAliveHookQueue failure <juju-core:New> <https://launchpad.net/bugs/1475056>
<davecheney> it happens super reliably for me
<thumper> davecheney: with you shortly, writing big email
<davecheney> thumper: that's ok
<davecheney> no action needed
<davecheney> just letting you know 'cos I missed standup
<thumper> davecheney: ok
<davecheney> i'm a bit worried
<davecheney> i cannot see anything in the logic that the test actually guarantees
<davecheney> ie, it's adding then removing a relation
<davecheney> and hoping that happens fast enough that no events are generated
<davecheney> this is, at best, a coincidence
<thumper> haha
<thumper> that's terrible
<perrito666> davecheney: uff, that relies on the uniter being busy in a different path on the loop :|
<perrito666> or not yet in it
<mup> Bug #1475056 opened: worker/uniter/relation: HookQueueSuite.TestAliveHookQueue failure <juju-core:New> <https://launchpad.net/bugs/1475056>
<wallyworld> menn0: is there a chance bug 1469077 is caused by the mgo/txn issue you are working on?
<mup> Bug #1469077: Leadership claims, document larger than capped size <landscape> <leadership> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1469077>
<wallyworld> it has been raised again as an issue
<wallyworld> for a 1.24 deployment
<menn0> wallyworld: it's possible, not sure of the likelihood
<menn0> wallyworld: do you know which collection has the out of control txn-queue fields?
<wallyworld> not yet
<menn0> wallyworld: also note that I'm not working on that one yet
<menn0> wallyworld: i'm currently dealing with bug 1474195
<mup> Bug #1474195: juju 1.24 memory leakage <cpec> <deployer> <performance> <regression> <juju-core:Triaged> <juju-core 1.24:In Progress by menno.smits> <https://launchpad.net/bugs/1474195>
<menn0> wallyworld: that's going well
<wallyworld> yay
<wallyworld> cannot resume transactions: document is larger than capped size 1326012 > 1048576
<wallyworld> is the only error i can see so far
<wallyworld> doesn't say what collection
<menn0> wallyworld: yeah, you need to look at the DB to see where the problem is
<wallyworld> ok
<wallyworld> happens writing the lease token, so could be leadership related
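Some rough arithmetic behind the "document is larger than capped size" error above: each aborted transaction leaves a token in the document's txn-queue array, and an out-of-control queue can push the doc past the 1 MiB limit. The per-token and base-doc sizes here are illustrative assumptions, not measured values.

```go
package main

import "fmt"

const (
	cappedSize  = 1048576 // 1 MiB, from the error message
	tokenSize   = 50      // assumed average bytes per txn-queue entry
	baseDocSize = 512     // assumed size of the doc without its queue
)

// tokensToExceedCap estimates how many queued txn tokens would push
// a document past the capped size.
func tokensToExceedCap() int {
	return (cappedSize-baseDocSize)/tokenSize + 1
}

func main() {
	fmt.Printf("~%d stuck txn tokens exceed the %d byte cap\n",
		tokensToExceedCap(), cappedSize)
}
```

Under these assumptions it takes on the order of twenty thousand stuck tokens, which is consistent with a long-running leadership write being retried continuously.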
 * perrito666 yells at bot
<davecheney> thumper: which makes me wonder
<davecheney> should I just delete the test ?
<davecheney> there cannot be code relying on this behavior
<davecheney> 'cos
<davecheney> well
<davecheney> the behaviour only exists in tests
<davecheney> in real life
<davecheney> there is no way this timing could exist
<menn0> wallyworld: didn't we fix that problem already ... when we saw this before it was lease/leadership related too
<thumper> davecheney: what is it testing exactly?
<wallyworld> menn0: a fix was made to error if any concurrent change was made to leadership document. not sure though how the previous implementation or the current one would impact txn queue
<wallyworld> s/leadership/lease
<menn0> ok
<menn0> I have no idea what's going on then
<wallyworld> by error, i mean return with error rather than trying again
<wallyworld> exit txn loop early
<wallyworld> if anything, that should have helped the situation
<wallyworld> so the bug was marked as incomplete
<wallyworld> but was recently reported as the issue still occurs :-(
<davecheney> test 0: Nothing happens if a unit departs before its joined is run
<menn0> wallyworld: I think we need to point jam and fwereade  and this one
<menn0> at this one
<wallyworld> yeah
<wallyworld> i'll ping them later
<axw> wallyworld: would you PTAL at http://reviews.vapour.ws/r/2154/ ?
<wallyworld> sure
<wallyworld> axw: looks ok, just a quibble
<axw> wallyworld: ta
<menn0> wallyworld, axw: is it important for there to be an assertion that the env is alive around createStorageOps?
<wallyworld> yes
<menn0> wallyworld, axw: I ask b/c it gets called as part of unit creation,  and we're trying to avoid that assertion when units are created
<wallyworld> because storage costs $$
<axw> wallyworld: yes, for persistent storage anyway. we don't want to destroy an environment while there's persistent storage around
<wallyworld> well some
<axw> err
<axw> menn0: :)
<wallyworld> yes, just persistent
<wallyworld> axw: blonde moment - how could line 28 in this pastebin result in a nil pointer given that "ch" is used just above
<wallyworld> http://pastebin.ubuntu.com/11885503/
 * axw looking
<menn0> wallyworld: so yes if there's persistent storage involved?
<wallyworld> menn0: yeah
<axw> wallyworld: ch.URL() dereferences the charmDoc.URL field
<menn0> wallyworld: well that sucks b/c we can't fully remove this bottleneck then
<axw> wallyworld: so if it's nil...
<wallyworld> menn0: we want to avoid provisioning machines / volumes etc that could cost the user
<menn0> wallyworld: yeah I understand
<menn0> wallyworld: what actually provisions the storage/
<menn0> wallyworld: maybe we can block it there
<wallyworld> axw: sure, so why isn't the line number where the charm doc is then?
<wallyworld> inside URL()
<axw> wallyworld: show me the panic?
<wallyworld> menn0: there's a storage provisioner
<wallyworld> similar to machine provisioner
<wallyworld> axw: http://data.vapour.ws/juju-ci/products/version-2882/aws-upgrade-trusty-amd64/build-2233/machine-0.log.gz
<axw> wallyworld: I feel like I'm missing something, that panic points to the MigrateCharmStorage function
<wallyworld> axw: yeah, it's in 1.22
<axw> and not the state code
<wallyworld> i have to move it
<wallyworld> had
<axw> I see
<wallyworld> because we needed to use the raw collection
<axw> wallyworld: not entirely sure, possibly inlining?
<wallyworld> yeah could be, weird though
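A minimal reproduction of the confusion above, with hypothetical types: the accessor dereferences a pointer field on the doc, so a doc deserialised without that field panics, and if the accessor is inlined the runtime can attribute the panic to the caller's line rather than a line inside the accessor.

```go
package main

import "fmt"

type charmURL struct{ Name string }

type charmDoc struct{ URL *charmURL }

type charm struct{ doc charmDoc }

// URL dereferences the doc field; panics when doc.URL is nil.
func (c *charm) URL() charmURL {
	return *c.doc.URL
}

// safeURL converts the panic into an error for demonstration.
func safeURL(c *charm) (u charmURL, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic: %v", r)
		}
	}()
	return c.URL(), nil
}

func main() {
	ch := &charm{} // URL never populated, as after a bad migration
	if _, err := safeURL(ch); err != nil {
		fmt.Println("dereferencing nil charm URL:", err)
	}
}
```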
<wallyworld> here's the new code https://github.com/juju/juju/blob/1.22/state/upgrades.go#L964
<wallyworld> i'll do some digging
<axw> yeah I found it, thanks
<wallyworld> menn0: did you find it?
<menn0> wallyworld: yep I found the storage provisioner
<menn0> wallyworld: it'll be a bit of work to add watching of env life in there
<wallyworld> menn0: it calls into state methods - may be able to modify one of those
<menn0> wallyworld: I'll go for adding the assertion only for persistent storage
<wallyworld> axw: ^^^ so if there's a EBS volume involved, that is bound to the machine, the above approach will be ok i think?
<wallyworld> maybe it should assert the storage binding instead
<wallyworld> or i mean do the assert if binding = env
<wallyworld> but wait, this is 1.24
<wallyworld> so will be different
<axw> yes I think that'll work. machines will prevent env death, so machine-bound storage will be fine
<wallyworld> so do that for 1.25
<axw> menn0: why would you add env life watching?
<menn0> axw: it's automatically added everywhere by the multi-env txn layer
<menn0> axw: but that's created a massive perf bottleneck
<menn0> so that's being ripped out
<menn0> in favour of selectively adding it in a few key places
<menn0> storage is one of those places
<axw> menn0: understood, by why does that mean adding a watcher?
<menn0> b/c we don't want someone to be able to add storage to an env just as it's dying
<axw> menn0: the way things work atm with storage, we use cleanups to trigger death of storage when the bound-to entity dies
<axw> menn0: so you destroy a machine with storage, then a cleanup is queued that destroys the attached storage
<menn0> axw: but if storage is added as the env is dying and that txn takes a while to run the cleanup could miss it
<wallyworld> axw: that cleanup is only in master though from memory
<menn0> axw: but I guess the machine or unit will be dead so the txn will probably still fail
<axw> menn0: the env can't die while there's still machines right?
<axw> hrm
 * axw ponders
<wallyworld> we need a 1.24 solution too
<anastasiamac> clear
<anastasiamac> oops
<axw> menn0: if storage is added, then its life will be set to Dying by the cleanup regardless of whether it's been provisioned
<menn0> axw: yes it can
<menn0> axw: the first thing that happens is the env is set to Dying and then machines and everything else get killed off
<axw> but you're saying that the txn that adds the storage may happen after the cleanup...
<perrito666> ah, wonderful, a test that only breaks when run non-isolated....
<menn0> axw: there's a slim chance that it could
<menn0> axw: right now that's not possible because we have an automatically added env life assertion on almost all txns
<menn0> axw: but that's going away
<menn0> axw: seems like adding an extra check in the storage provisioners before it does anything might be sensible?
<menn0> provisioner
<axw> menn0: sorry I mistyped before: the env can't be *removed* until there's no machines? i.e. it can go to Dying, but can't be Removed until the dependents  are gone?
<axw> hrmph still doesn't really help
<menn0> axw: yes that's right
<axw> menn0: we're going to have this problem with the machine provisioner too right?
<menn0> no because the machine addition ops now include an explicit env life assertion
<menn0> (but only for top level machines, not containers)
<axw> menn0: so why can't we do that in storage? they're no more plentiful than machine addition ops
<menn0> we don't want that for units though because units are often added in huge bulk (this is where users are seeing the current bottleneck)
<menn0> axw: b/c storage ops get added as part of unit addition
<axw> menn0: I think we could do it for machine storage (volumes, filesystems), but not storage instances
<menn0> my unfamiliarity with storage is probably not helping here :)
<axw> so do machines, except if you're using --to
<menn0> axw: b/c I'm slow can you please summarise :)
<axw> menn0: if a charm requires storage, then adding a unit will add a "storage instance". that will cause the creation of either a volume or filesystem when the unit is assigned to a machine
<axw> menn0: a volume can be e.g. a loop device, or an EBS volume
<menn0> ok
<axw> menn0: actually we never create storage without an accompanying machine, so if the machine is prevented due to env being Dying, then we're fine
<axw> menn0: the storage provisioner won't create a volume or filesystem until the due-to-be-attached machine is provisioned
<menn0> axw: ok that sounds promising then
<menn0> axw: I think you were hinting at this before, but what about when a unit is added to an already provisioned machine
<axw> menn0: so I think we can drop the env life checks in storage
<axw> ah yeah
<axw> :|
<menn0> axw: I guess the machine will be dying or about to die if the env is going down
<menn0> axw: and that should clean up the storage?
<axw> menn0: it will... but only if the storage is bound to the machine. there's a concept of lifecycle binding, where storage is bound to either a unit/service, a machine, or the environment
<menn0> axw: also, won't the storage provisioner itself die if the env goes to dying
<axw> menn0: currently we're fine because we always bind to either the unit, service or machine
<axw> menn0: there was an intention of binding storage to env initially if marked persistent though
<axw> menn0: I hope the worker would continue to run until the env is removed, not just Dying
 * menn0 checks 
<axw> menn0: otherwise the provisioner won't clean up any remaining things
<menn0> axw: so it looks we're ok because there isn't a storage provisioner per env
<menn0> axw: it's not run under the envWorkerManager
<axw> menn0: ok, cool
<menn0> axw: the worker is up until the machine agent dies
<menn0> the storage provisioner worker I mean
 * axw nods
<menn0> axw: ok so it looks like we don't need env life assertions for the state stuff in storage then
<axw> menn0: so... I think we're ok unless/until we allow storage to be created that is bound to an env
<axw> menn0: currently not the case, so we're fine atm
<menn0> axw: we can do the assert only for the case where storage is bound to the env
<axw> yep, that should be fine
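The approach agreed above, sketched with an Op type that mirrors the shape of mgo/txn's Op (the field names, collection names, and isAliveDoc assertion are assumptions modelled on the conversation, not juju's exact code): only env-bound storage gets the extra env-alive assertion, so bulk unit creation avoids it.

```go
package main

import "fmt"

// Op mirrors the shape of mgo/txn's transaction operation.
type Op struct {
	C      string
	Id     interface{}
	Assert interface{}
}

type binding int

const (
	boundToMachine binding = iota
	boundToUnit
	boundToEnviron
)

var isAliveDoc = map[string]interface{}{"life": "alive"}

// createStorageOps adds the env-alive assertion only for env-bound
// storage; machine/unit-bound storage already rides on the machine or
// unit lifecycle assertions.
func createStorageOps(envUUID string, b binding) []Op {
	ops := []Op{{C: "storageinstances", Id: envUUID + ":storage-0"}}
	if b == boundToEnviron {
		ops = append(ops, Op{C: "environments", Id: envUUID, Assert: isAliveDoc})
	}
	return ops
}

func main() {
	fmt.Println(len(createStorageOps("uuid", boundToMachine)))
	fmt.Println(len(createStorageOps("uuid", boundToEnviron)))
}
```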
<menn0> axw: which will be a fairly low frequency event I imagine so not a performance issue
<axw> yes I think so
<menn0> axw: thanks for your help
<axw> menn0: nps, thank you for fixing. sounds messy :)
<menn0> axw: it is
<wallyworld> axw: found out why charm url is nil - serialisation changed between 1.20 and 1.22. which also means charm migration is broken in general and we didn't notice because migration function was never called
<axw> wallyworld: :(
<wallyworld> fixing now :-)
<thumper> menn0: based on axw's points above, we should at least get together to talk about environment destruction
<thumper> menn0, axw: because I feel that we have some bad interactions
<thumper> and I'd like to check
<menn0> thumper: sure.
<menn0> thumper: now?
<thumper> not just now, Rachel is arriving home shortly and I'll be stopping for coffee
<thumper> but perhaps in 30-40 minutes?
<menn0> thumper: sure just let me know
<menn0> thumper: with axw too?
<thumper> axw: have you got some time?
<thumper> waigani: how goes environment destroy?
<waigani> thumper: merging cli command to jes-cli branch now.
<waigani> thumper: and writing Will an email to review environ.Destroy branch
<thumper> kk
<thumper> coolio
<waigani> thumper: Will usually starts around 8, so I'll check in with him this evening and hopefully finish off / land tonight.
<thumper> cool
<thumper> wallyworld: any idea if master is capable of being blessed at the moment? or are there known failures?
<wallyworld> thumper: not sure, i'd have to look at build logs
<wallyworld> i don't know of any failures
<thumper> there was a windows issue at some stage
<thumper> has that all been fixed now?
<waigani> there's an open critical bug on 1.25
<waigani> #1468815
<mup> Bug #1468815: Upgrade fails moving syslog config files "invalid argument" <ci> <regression> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1468815>
 * thumper sighs
<thumper> why has it not been forward ported?
<menn0> thumper, wallyworld: I have a likely fix to bug 1474195 ready... although I need to talk env destruction with thumper
<mup> Bug #1474195: juju 1.24 memory leakage <cpec> <deployer> <performance> <regression> <juju-core:Triaged> <juju-core 1.24:In Progress by menno.smits> <https://launchpad.net/bugs/1474195>
<wallyworld> great
<thumper> I'm waiting for axw before we talk destruction
<thumper> menn0: I can look at the fix if you like
<menn0> thumper: pushing now
<menn0> thumper: https://github.com/juju/juju/pull/2801
 * thumper looks
<thumper> menn0: for the machine insertion
<thumper> menn0: does that method also do the containers
<thumper> ?
<thumper> or is there a different one to add containers
<thumper> as I thought we were going to skip the alive assertion for containers
<menn0> a different one does containers
<thumper> kk
<menn0> see the docstring at the top of the method I added the assert to
<menn0> wallyworld, thumper: any tips on debugging an lxc container that is stuck in "pending"?
<menn0> I can't ssh to it
<menn0> and lxc-console gives me nothing
<wallyworld> menn0: the logs are available locally
<thumper> menn0: look here: /var/lib/juju/containers/...
<wallyworld> /var/lib/lxc/blah/root
<wallyworld> then look at cloud init logs
<thumper> and also where wallyworld said
<menn0> wallyworld: thanks, i'll look there
<thumper> the cloud init logs are in the /var/lib/juju/containers dir
<thumper> menn0: shipit
<menn0> thumper: what about your concerns?
<thumper> this branch doesn't touch the concerns I have
<menn0> ok great
<thumper> any bad thing we are doing, we are already doing
<thumper> which is why I think we need to talk to axw about environment destruction of hosted environments
<thumper> because we are going "bullet to the head" on all the machines, then removing all the docs
<thumper> what impact is this going to have for any attached storage
<menn0> thumper: I want to do some manual performance comparisons and if it looks like things are faster then I'll merge
<menn0> thumper, wallyworld: this appears to be why that container didn't start: http://paste.ubuntu.com/11886010/
<menn0> any clues?
<thumper> I'm guessing this line: WARN     lxc_start - start.c:signal_handler:307 - invalid pid for SIGCHLD
<thumper> NFI why though
 * menn0 is googling
<wallyworld> menn0: yeah, NFI sorry
<menn0> this looks like the bug (a race) but it was fixed in lxc 1.0.0-alpha2: https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526
<mup> Bug #1168526: race condition causing lxc to not detect container init process exit <bot-stop-nagging> <linux (Ubuntu):Confirmed> <lxc (Ubuntu):Fix Released> <https://launchpad.net/bugs/1168526>
<thumper> menn0: what version of lxc do you have?
<menn0> thumper: 1.1.2 (stock vivid)
<menn0> (as far as I know)
<thumper> so not fixed released then...
<menn0> thumper: ?
<thumper> menn0: try #lxcontainers
<menn0> thumper: I will
<thumper> menn0: because it is happening to you...
<menn0> out of 10 containers 1 failed
<menn0> but this happened earlier today and yesterday as well
<thumper> yeah, but we create a lot of containers
<thumper> wallyworld: master curse seems to be: bad record MAC, mongo not coming up, and intermittent failure collecting metrics in the uniter suite
<wallyworld> sigh
<wallyworld> those would all be intermittent right
<thumper> yup
<wallyworld> i'll look at the logs when i can
 * thumper heading off until meeting later tonight
<wallyworld> axw: could you look at http://reviews.vapour.ws/r/2181/ when you get a chance? it looks larger than it is because i reverted the move done previously
<axw> wallyworld: ok
<wallyworld> ty
<axw> wallyworld: LGTM
<wallyworld> ty
<mup> Bug #1475163 opened: when the uniter fails to run an operation due to an error, the agent state is not set to "failed" <juju-core:Triaged by wallyworld> <juju-core 1.24:In Progress by wallyworld> <https://launchpad.net/bugs/1475163>
<wallyworld> axw: and one more sorry http://reviews.vapour.ws/r/2184/
<axw> wallyworld: reviewed
<axw> wallyworld: machine provisioning and hook errors are a bit different: they're coming from the IaaS provider and the hook execution respectively. Maybe I misunderstood, but it sounded like these errors might include, say, errors talking to the API server
<wallyworld> axw: yeah, could be those. i think your idea not to include is good
<wallyworld> fixing patching will be more work, but i have soccer now so will do later
<axw> wallyworld: ok. I have to go out soon anyway, so will check later
<jam> fwereade: dimitern: standup ?
<mup> Bug #1475212 opened: Environment destroy can miss manual machines and 	persistent volumes <juju-core:New> <https://launchpad.net/bugs/1475212>
<jam> fwereade: so I'm supposed to be in a call now, but he's not arrived yet. So on the concept of Token being reusable...
<dimitern> dooferlad, TheMue, fwereade, jam, sorry guys for missing standup - I had to renew my car insurance in the morning, but it took more time than expected :/
<jam> dimitern: no worries
<dooferlad> dimitern: jam just said my standard response, so ^^
<fwereade> jam, listening
<dimitern> I discovered yesterday, after wasting almost a full day, that running go test with both -race and -cover (or -coverprofile=) *itself* leads to races!
<TheMue> dooferlad: hehe, maybe the number of calls will get negative when passing a black hole. but you're right, a bool flag would be enough
<dooferlad> dimitern: well, that sucks
<dimitern> supposedly fixed in go 1.3+, can be worked around by adding also -covermode=atomic (which is the default behavior in 1.3+)
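Why -covermode=atomic is the fix: coverage instrumentation increments per-block counters from every goroutine, and plain increments on shared counters are exactly the kind of data race -race flags, while atomic increments are race-free. The snippet below illustrates the race-free pattern; it is not the actual code the coverage tool generates.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countAtomically increments a shared counter from many goroutines
// using sync/atomic, the same style covermode=atomic uses for
// coverage counters, so the race detector stays quiet.
func countAtomically(goroutines, perG int) uint64 {
	var counter uint64
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perG; j++ {
				atomic.AddUint64(&counter, 1) // safe under -race
			}
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countAtomically(8, 1000)) // always exactly 8000
}
```

A plain `counter++` in those goroutines would both race and lose increments; with `-covermode=count` (non-atomic) the instrumented counters behave like the former.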
<dooferlad> TheMue: I was more thinking about uint
<dooferlad> TheMue: but yes, types matter and sometimes we live with inappropriate choices
<perrito666> Morning
<dimitern> i'll send this to juju-dev as well, just in case I can save somebody else the same experience
<dimitern> s/the same/from the same/ even
<jam> fwereade: he showed, sorry. I did want to overview of how I felt tokens should work.
<fwereade> jam, just braindump whenever you get the chance :)
<dimitern> TheMue, hey
<TheMue> dimitern: heya
<dimitern> TheMue, didn't we discuss using bulk client-side api calls for the addresser?
<dimitern> TheMue, like RemoveIPAddresses taking params.Entites and returning params.ErrorResults, error, rather than forcing the worker to remove them one by one?
<TheMue> dimitern: have to take look in my notes
<TheMue> dimitern: we talked about where the "work" has to be done when I suggested that e'thing could be done via one call on server-side
<dimitern> TheMue, I don't insist on doing it now (just the addresser using api instead of state is already a big improvement, esp. around the entity watcher), but it seems to me it will be slightly better
<mup> Bug #1455628 changed: TestPingTimeout fails <ci> <intermittent-failure> <lxc> <test-failure> <unit-tests> <vivid> <juju-core:Triaged> <https://launchpad.net/bugs/1455628>
<mup> Bug #1456726 opened: UniterCollectMetrics fails <ci> <tech-debt> <juju-core:Triaged> <juju-core 1.22:Triaged> <https://launchpad.net/bugs/1456726>
<TheMue> dimitern: when I asked why we need an API usable only for the worker and providing its calls
<dimitern> TheMue, that's what *all* our apis are doing anyway :)
<anastasiamac> dimitern: tyvm for being adventurous and running tests with 2 flags not one :D
<dimitern> TheMue, however I see your point - we should (re)use better defined api interfaces across multiple workers/etc.
<TheMue> dimitern: and I oriented at your instancepoller, which is acting on one machine each too
<dimitern> anastasiamac, I'm even using -check.v :D
<anastasiamac> dimitern: \o/
<TheMue> dimitern: that's why I implemented the IPAddress(Proxy) as a type
<TheMue> dimitern: but n.p., I simply can change it, one tine missed gofmt dislikes my try to merge, hehe
<TheMue> tiny
<dimitern> TheMue, yes, as it was easiest to do - a gradual improvement over using state directly, but from a design perspective we can do better for workers where it makes more sense to batch multiple ops in a single api call
<TheMue> and I thought I ran my pre-commit check *grmfplx*
<dooferlad> TheMue: your git client doesn't auto-run the pre-commit hook?
<dimitern> TheMue, so I suggest you go ahead and still land this (if you can perhaps add a TODO somewhere in the code we can improve the behavior by using bulk calls)
<TheMue> dooferlad: different environment here, as you know. script integration didn't work, so I integrated it into my jdt (juju development tool)
<TheMue> dimitern: ok, will do so
<dimitern> TheMue, cheers
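The bulk call dimitern sketches above would look roughly like this. The Entities/ErrorResults shapes follow juju's usual params conventions; the facade body here is a hypothetical stand-in, not the real addresser implementation.

```go
package main

import "fmt"

type Entity struct{ Tag string }

type Entities struct{ Entities []Entity }

type ErrorResult struct{ Error error }

type ErrorResults struct{ Results []ErrorResult }

// RemoveIPAddresses follows the standard bulk-call convention: one
// result per input entity, in the same order, so the worker removes
// N addresses in one API round-trip instead of N calls.
func RemoveIPAddresses(args Entities) (ErrorResults, error) {
	results := ErrorResults{Results: make([]ErrorResult, len(args.Entities))}
	for i, e := range args.Entities {
		if e.Tag == "" {
			results.Results[i].Error = fmt.Errorf("invalid tag")
			continue
		}
		// otherwise: remove the address, recording any error per-entity
	}
	return results, nil
}

func main() {
	res, _ := RemoveIPAddresses(Entities{[]Entity{{"ipaddress-0"}, {""}}})
	fmt.Println(len(res.Results), res.Results[1].Error != nil)
}
```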
<dooferlad> TheMue: Clearly you need to switch clients :p
<TheMue> dooferlad: it's not the client, it's more complex. will show you when having our next meeting.
<dooferlad> dimitern: is the logic behind addSubnetsCache just to speed things up? Isn't state fast enough and the canonical source of information?
<dimitern> dooferlad, the main reason for its existence is to improve the case when multiple subnets are added in the same API call
<dimitern> dooferlad, so I guess it might be actually moot if we don't allow users to add multiple subnets with the CLI (unless we add an "import these subnets definitions as a batch" thing, which was discussed at some point)
<dooferlad> dimitern: if we have the ability at some point to dump the output of juju status to a file, then load that back, then yes we will benefit.
<dimitern> dooferlad, ewww.. yeah, I got your point :) but we'll have state deltas before that happens most likely
<dooferlad> dimitern: I mostly don't like caches because if somebody does something unexpected to what they are caching you can have "fun" finding bugs. In this case though, I was looking at it in terms of what I needed to do for space create.
<dimitern> (just imagined having to parse a moving target like the status yaml output)
<dooferlad> dimitern: which seems to be, not caching.
<dimitern> dooferlad, for space create I don't think you need to do it the same way
<dooferlad> dimitern: +1
<dimitern> dooferlad, I've realized addSubnetsCache now looks totally over-engineered to me :/
<dooferlad> dimitern: well, I am sure it was fun engineering, so I am not worrying!
<dimitern> dooferlad, you bet :)
<alexisb> fwereade, jam leads call
<mattyw> TheMue, not tried lfe yet, but it's on my list of things to try
<TheMue> mattyw: it has a nice approach for lisplers, but it never will get a larger community *sigh*
<TheMue> dooferlad: btw, just found why my pre-commit failed. only one missing line
<mattyw> TheMue, I'd love to have sessions at sprints where we can just hack on stuff
<mattyw> TheMue, maybe we should make the time this sprint
<TheMue> mattyw: definitely would raise the experience with different approaches, avoiding to get routine-blinded
<perrito666> morning all
<thumper> crap, perrito666 is back, time to go
<mup> Bug #1475271 opened: Intermittent test failure UniterSuite.TestUniterCollectMetrics <intermittent-failure> <test-failure> <juju-core:Triaged by cmars> <https://launchpad.net/bugs/1475271>
<perrito666> I see thumper does the same I do to figure EOD
<perrito666> has anyone noticed we are getting curses for no space left on device? mgz sinzui ?
<mgz> yeah, I see the vivid build failing
<mgz> we have tests running in the current still though, so I was not in a rush to retest
<jam> fwereade: ok. Pie-in-the-sky view of how tokens work: you would get the token at Auth checks, and then apply that token to each operation you do. I feel like token failures are the sort of thing that wouldn't need to be retried if we knew they were the cause of the failure.
<fwereade> jam, right
<jam> For example, if I was leader, and I said X, then I failed to be the leader for a while, then I was leader *again*, my original X should actually be invalid.
<fwereade> jam, we could implement it like that but I'm not sure I think it's good
<fwereade> jam, tokens have to be reusable anyway
<fwereade> jam, other ops will cause ErrAborted
<fwereade> jam, next time through the buildTxn func we need to check again
<jam> fwereade: everything causes ErrAborted right? So we can't distinguish the why
<fwereade> jam, but we have to distinguish why
<fwereade> jam, hence the form of Runner.Run()
<fwereade> jam, refusing to check again once a token's failed might be an interesting optimisation
<fwereade> jam, but not relevant for my purposes because I'll be returning the error as soon as I get one
<jam> fwereade: so I don't quite see how Token.Read() isn't reusable.
<sinzui> perrito666: I just woke up and yes I am disappointed. The machine only had to live for 4 more days
<fwereade> jam, it's a single snapshot of past state
<fwereade> jam, unless it's able to get fresh state and return an error, it will push everything into ErrExcessiveContention
<fwereade> jam, by returning the same (failing) txn ops
<fwereade> jam, always corresponding to the reality-check that's now several cycles in the past
<jam> fwereade: so is the use case that my leadership cert expired and I renewed it?
<fwereade> jam, it is to catch the situation when the leadership lease expires and is removed while some other component is running a txn that depends on it
<fwereade> jam, that other component (should!) have the looping form, in which it starts off using recent state from db or memory, interrogates that state for reasons to fail, then packages it up as asserts and sends it on to execute
<fwereade> jam, the txn fails
<fwereade> jam, what went wrong?
<fwereade> jam, we need to read current leadership state to be able to pin it on that
<fwereade> jam, sane?
<jam> fwereade: so I agree that we want to be able to read the current state at some point, but I worry that we'll read the current state and apply it as the new "its ok to do this as long as this holds true"
<jam> fwereade: so you want *a* token that says "the person who is making this request is the current leader"
<fwereade> jam, no
<fwereade> jam, I want a token that will, on request, tell me whether a unit is leader
<fwereade> jam, existence of a token implies nothing
<fwereade> jam, Check()ing a token implies that the fact the token is attesting to was recently true
<fwereade> jam, passing an out ptr into check gives you a very specific tool that allows you to check whether it still holds true in the future
<jam> fwereade: so your Token interface only has Read()
<fwereade> jam, sorry, I renamed it Check
<fwereade> jam, otherwise the same
<fwereade> jam, and those still-hold-in-the-future things are critically important; but yes, I don't know how best to encourage people to use mgo/txn correctly :(
<jam> fwereade: so I think you're saying that Auth wants to return a Checker (and possibly calls it one time), but that the Checker is part of the inner loop
<fwereade> jam, yeah
<fwereade> jam, the initial call is technically redundant, am undecided, leaning towards not having it
<fwereade> jam, most/all the actual use of the Token will be inside state
<jam> fwereade: from an Auth func it is nice to fail early
<jam> SetStatus failing immediately with "you're not the leader" rather than waiting until it goes to update the DB with an actual change?
<fwereade> jam, agreed, there are forces pushing both ways :)
<fwereade> jam, it won't try to run a txn...
<fwereade> jam, I contend that constructing a txn is much cheaper than running one
<jam> fwereade: I certainly agree that stuff in memory vs once you've written it to the DB
<fwereade> jam, so what it will do is one up-to-date leadership check, and then hand over the ops representing it
<jam> it seems a little funny to have something like GetAuth not actually have checked your auth on the assumption that once you've actually processed the request you'll have finally checked they're allowed.
<fwereade> jam, point taken, but I think it follows from the mgo/txn dependency
<fwereade> jam, technically, any auth that isn't checked *at txn time* is leaky
<fwereade> jam, when working in state we just have to ...embrace the madness, and use the techniques that are reliable in this context :)
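The token discussion above can be sketched as a buildTxn-style retry loop that re-checks the token on every attempt, rather than trusting a snapshot taken before the first try, so an aborted txn can be pinned on lost leadership instead of being retried into ErrExcessiveContention. The names here (Token, errAborted, run) are illustrative, not juju's exact API.

```go
package main

import (
	"errors"
	"fmt"
)

// Token attests to a fact such as "unit X is leader"; Check reports
// whether it held recently.
type Token interface {
	Check() error
}

var errAborted = errors.New("transaction aborted")

type fakeToken struct{ leader bool }

func (t *fakeToken) Check() error {
	if !t.leader {
		return errors.New("not leader")
	}
	return nil
}

// run retries the txn, re-checking the token each attempt so a
// leadership failure surfaces as its own error instead of spinning.
func run(token Token, tryTxn func() error, maxAttempts int) error {
	for i := 0; i < maxAttempts; i++ {
		if err := token.Check(); err != nil {
			return err // pin the failure on leadership
		}
		if err := tryTxn(); err != errAborted {
			return err // success (nil) or a real error
		}
	}
	return errors.New("excessive contention")
}

func main() {
	tok := &fakeToken{leader: true}
	err := run(tok, func() error {
		tok.leader = false // leadership expires mid-flight
		return errAborted
	}, 5)
	fmt.Println(err)
}
```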
<fwereade> perrito666, http://reviews.vapour.ws/r/2185/ ?
<fwereade> perrito666, and whatever the other branch is
<fwereade> perrito666, does statusDoc have txn-revno or txn-queue fields?
<perrito666> fwereade: arent those added by txn?
<fwereade> perrito666, yes
<fwereade> perrito666, unless you have those fields specified in your doc, [$set, doc] is fine
<perrito666> fwereade: sorry I got distracted by watching a singer called ladybeard... oddly hypnotizing
<fwereade> heh
<fwereade> good name :)
<perrito666> bearded man in japanese 5yo girl costume singing metal version of jpop songs, amazing
<perrito666> fwereade: this attacks the immediate issue with envuuid for this particular collection while a better fix is being worked on for envuuid auto-adding on Updates
<fwereade> perrito666, what makes you believe it changes anything?
<fwereade> perrito666, you have inserted a comment that is a straight-up lie
<perrito666> oh?
<fwereade> perrito666, https://bugs.launchpad.net/juju-core/+bug/1474606/comments/1
<mup> Bug #1474606: Document replacements using $set are problematic <juju-core:Triaged by menno.smits> <juju-core 1.24:Triaged by menno.smits> <https://launchpad.net/bugs/1474606>
<perrito666> it is a partial lie, if I insert that doc as is it wipes envuuid
<fwereade> perrito666, ok, so you're saving a doc with an empty env-uuid field
<fwereade> perrito666, why do you not know the env-uuid?
<fwereade> perrito666, ohhh, right
<perrito666> fwereade: I might need to change the var name so the comment is not confusing
<perrito666> do not insert That doc
<perrito666> :)
<fwereade> perrito666, this just makes me more adamant that it's the leaky multiEnv stuff that is the problem
<fwereade> perrito666, ok, so
<fwereade> perrito666, that comment is certainly not accurate re txn
<fwereade> perrito666, and re env-uuid
<fwereade> perrito666, can we not just drop the dependency on the env-uuid field and take them off all the doc structs?
<perrito666> fwereade: I honestly do not know, I wouldn't think so
<fwereade> perrito666, well, we definitely can
<fwereade> perrito666, it's more "should we"?
<fwereade> perrito666, and the more I think the more I think "yes of course we should, it would take a day at the outside"
<fwereade> perrito666, counterpoint?
<fwereade> perrito666, which might mean 3 days in practice
<fwereade> perrito666, but how much dev time have these sorts of issues cost us already?
 * perrito666 sits like a rubber duck
<fwereade> perrito666, haha
<fwereade> perrito666, so looking through state for EnvUUID it really doesn't seem like it's even used most of the time
<fwereade> perrito666, it exists only for the convenience of the multi-env layer
<fwereade> perrito666, but it also breaks the multi-env layer because you have to pay attention to that field all the time
<fwereade> perrito666, so
<fwereade> perrito666, if the multi-env layer just converted *everything* into bson.D *before* rewriting
<fwereade> perrito666, no more need for the fields
<fwereade> perrito666, right?
<fwereade> perrito666, there may be a couple of relevant fields we should keep
<fwereade> perrito666, but they're very much the minority
<fwereade> perrito666, quack. quack quack?
<fwereade> perrito666, and then we'd be able to insert docs that weren't pointers
<fwereade> perrito666, and we wouldn't have that scary surprising leakage out to the original docs either
 * perrito666 re-reads
<fwereade> perrito666, (and my lease stuff would Just Work without having to know it's in a multi-env collection, too)
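[Editor's note: fwereade's idea above is that the multi-env layer should convert every doc into a generic form before rewriting, injecting env-uuid centrally so doc structs no longer need the field. A minimal sketch of that idea, with a plain key/value slice standing in for bson.D and all names hypothetical:]

```go
package main

import "fmt"

// pair stands in for a bson.D element in this sketch.
type pair struct {
	Key   string
	Value interface{}
}

// mungeForEnv rewrites a generic doc for a multi-env collection:
// it prefixes the _id with the environment UUID and appends the
// env-uuid field itself, so the original doc struct never needs
// an EnvUUID field at all.
func mungeForEnv(envUUID string, doc []pair) []pair {
	out := make([]pair, 0, len(doc)+1)
	for _, p := range doc {
		if p.Key == "_id" {
			p.Value = envUUID + ":" + fmt.Sprint(p.Value)
		}
		out = append(out, p)
	}
	return append(out, pair{"env-uuid", envUUID})
}

func main() {
	doc := []pair{{"_id", "wordpress/0"}, {"status", "started"}}
	fmt.Println(mungeForEnv("deadbeef", doc))
}
```

Because the rewrite happens on a generic copy, the caller's original doc is untouched, which also avoids the "scary surprising leakage" mentioned above.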
<perrito666> fwereade: ok a couple of things
<perrito666> 1st are you sure no one is working in anything whatsoever heavily dependent on this?
<perrito666> 2nd, even though I believe in the empirical proof you showed me, in the original discussion about txn a link arose http://stackoverflow.com/questions/24455478/simulating-an-upsert-with-mgo-txn/24458293#24458293 which has gustavo saying it shouldn't
<fwereade> perrito666, that's my reading of it; I see 21 uses of .EnvUUID in state, and most of them are irrelevant
<perrito666> I was rather wondering about work in progress
<fwereade> perrito666, in that link, where does gustavo suggest you shouldn't $set a struct?
<wwitzel3> axw: I'll handle the forward porting of that issue
<wwitzel3> axw: well, the patch to master that is
<perrito666> fwereade: the final paragraph seems to be implying it
<fwereade> perrito666, (1) "you can set every field in a value by offering the value itself to $set"
<fwereade> perrito666, (2) "If you replace the whole document with some custom content, these fields will go away"
<fwereade> perrito666, they are talking about different situations
<perrito666> fwereade: I see
<perrito666> that might have caused the misunderstanding
<fwereade> perrito666, yeah, it could be clearer
<fwereade> perrito666, in particular it *is* dangerous to do a $set with any of our doc types that include a txn-revno
<fwereade> perrito666, so we do need to keep an eye out for that
<fwereade> perrito666, but that's more a matter of watching the doc definitions, and only allowing TxnRevno when it's *really* necessary, and commenting it clearly
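[Editor's note: the danger fwereade describes is that `$set` with a whole doc struct writes *every* field, including zero or stale values of fields mgo/txn manages. A self-contained sketch simulating mongo's `$set` merge with a plain map; the struct and field names are illustrative, not the real juju state types:]

```go
package main

import "fmt"

// unitDoc mirrors the problematic pattern: a doc struct that also
// carries a txn-revno field, which mgo/txn, not our code, should own.
type unitDoc struct {
	Name     string `bson:"name"`
	Status   string `bson:"status"`
	TxnRevno int64  `bson:"txn-revno"`
}

// applySet simulates what mongo does for {$set: doc}: every field of
// doc is written, including zero values.
func applySet(stored map[string]interface{}, doc unitDoc) {
	stored["name"] = doc.Name
	stored["status"] = doc.Status
	stored["txn-revno"] = doc.TxnRevno // stale in-memory revno clobbers mongo's
}

func main() {
	stored := map[string]interface{}{
		"name": "wordpress/0", "status": "started", "txn-revno": int64(7),
	}
	// A doc built in memory has a zero TxnRevno; $set-ing it wipes
	// the real revno that mgo/txn relies on.
	applySet(stored, unitDoc{Name: "wordpress/0", Status: "error"})
	fmt.Println(stored["txn-revno"])
}
```

Keeping TxnRevno off doc structs (or only adding it where truly needed, with a clear comment, as suggested above) makes this class of bug impossible.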
<fwereade> katco, do you have any time to review http://reviews.vapour.ws/r/2186/ ?
<katco> fwereade: today is my meeting day :(
<fwereade> katco, ah bother, not to worry
<jam> fwereade: I've been reading through https://pubsubhubbub.googlecode.com/git/pubsubhubbub-core-0.4.html and it doesn't feel like a great fit, as when you subscribe to a topic you pass an HTTP callback URL. We could do that internally but it does feel a bit odd. Certainly I don't really expect to have general routing back to a client outside of the current connection.
<katco> jam: get in touch with https://github.com/go-kit/kit. they are actively soliciting feedback on features like this
<fwereade> jam, agreed
<katco> jam: doh, nm: in the "Non-goals": Supporting messaging patterns other than RPC (in the initial release) - pub/sub, CQRS, etc
<jam> :)
<jam> fwiw, I rather like https://github.com/grpc/grpc
<jam> but it feels like we're rewriting our communication infrastructure a bit too much at that point.
<davecheney> jam i agree
<jam> there is https://godoc.org/google.golang.org/cloud/pubsub which is less about the HTTP aspects
<jam> though IIRC it is strictly a client for Google's cloud pub/sub and not a server implementation.
<fwereade> jam, btw, 2172 has been superseded by reviews.vapour.ws/r/2186/ which has new-style Token
<fwereade> jam, so, yeah, doesn't sound like very rich pickings
<fwereade> perrito666, LGTM
<mup> Bug #1475341 opened: juju set always includes value when warning that already set <juju-core:New> <https://launchpad.net/bugs/1475341>
<perrito666> fwereade: it sounds like a more sincere comment :)
 * perrito666 is tempted to lunch on a happy meal just to get a new minion toy
<fwereade> perrito666, can I hit you up for a review on reviews.vapour.ws/r/2186/ please?
 * perrito666 looks
<fwereade> perrito666, it's just a rework of the leadership interfaces such that my stuff and katco's has matching interfaces (well, at least they both implement Claimer)
<fwereade> perrito666, cheers
 * perrito666 sees the length of the review and realizes "hit" was quite literal :p
<alexisb> davecheney, what part of the world are you in right now?
 * perrito666 tries to acquire a second monitor of the same model as the one he has and notices the price is exactly double what he paid less than a year ago :p inflationary countries are fun
<katco> wwitzel3: 1:1
<davecheney> alexisb: san fran
<davecheney> damnit, i missed the opportunity to say i was omnipresent
<alexisb> heh
 * perrito666 looks over his shoulder just to make sure davecheney isn't
<davecheney> i'm watching, always watching
<fwereade> perrito666, it's almost all renames
<perrito666> fwereade: ?
<perrito666> ah the review
<fwereade> perrito666, the big review
<fwereade> perrito666, probably start with leadership/interface.go
<perrito666> I would kill for threaded conversations on irc
<fwereade> perrito666, sorry, I should have said that in the blurb
<perrito666> fwereade: for starters I would like the pr description to say more why than what
<perrito666> by reading the code I can assert that you did exactly what that list of changes says, but I am not sure I'll be able to say what the end result of it is.
<fwereade> perrito666, heh, good point
<sinzui> perrito666: is bug 1474606 fix committed in 1.24?
<mup> Bug #1474606: Document replacements using $set are problematic <juju-core:Triaged by menno.smits> <juju-core 1.24:Triaged by menno.smits> <https://launchpad.net/bugs/1474606>
<perrito666> sinzui: no, just a partial for 1.24 and master
<sinzui> thank you perrito666
<perrito666> sinzui: that is why I did not change anything on it
<TheMue> so /me says goodbye, daughter has graduation ball today *proud-daddy-mode*
<dooferlad> TheMue: congratulations to you both!
<perrito666> TheMue: congrats man :) have fun
<wwitzel3> ericsnow: ping
<ericsnow> wwitzel3: hey
<wwitzel3> ericsnow: hey, is there anything you think you can break off of what you are doing or should I look in to destroy?
<ericsnow> wwitzel3: halfway through this yak :/
<ericsnow> wwitzel3: so maybe you had better
<ericsnow> wwitzel3: it will depend on my state patch
<bdx> hello everyone
<bdx> core: anyone familiar with this error showing up in the cloud-init-output.log on bootstrap node?
<bdx> core: 2015-07-16 16:43:47 ERROR juju.cmd supercommand.go:430 relative path in ExecStart ($MULTI_NODE/usr/lib/juju/bin/mongod) not valid
<bdx> then 2015-07-16 16:43:47 ERROR juju.cmd supercommand.go:430 failed to bootstrap environment: subprocess encountered error code 1
<bdx> and bootstrapping fails after
<bdx> grr
<perrito666> ericsnow: ping?
<ericsnow> perrito666: hi
<perrito666> hi :D
<perrito666> hey, are you still the reviewboardmonger?
<ericsnow> perrito666: depends on what you need :)
<perrito666> I was wondering if I could see the logs for rb, I find the javascript for the comment/response textbox failing too often and the browser console says it's an api call failing to respond
<perrito666> also the js might need to be uncompressed
<perrito666> some paths fail with a syntax error
<ericsnow> perrito666: its the reviewboard service in the juju-ci4 env
<ericsnow> perrito666: I can take a look but not quite yet
<perrito666> no hurry, just had the issue while we are in the same TZ so didn't want to let it pass
<mup> Bug #1475386 opened: unit not dying after failed hook + destroy-service <juju-core:New> <https://launchpad.net/bugs/1475386>
<rick_h_> NOTICE: jujucharms.com is having a webui outage due to a failed redis. Charm deploys should work as normal and the API is available.
<mup> Bug #1475386 changed: unit not dying after failed hook + destroy-service <juju-core:New> <https://launchpad.net/bugs/1475386>
<natefinch> bdx: I think the problem is that $MULTI_NODE is not getting expanded
<natefinch> bdx: or not set or set weirdly
<natefinch> bdx: kind of a terrible error message, sorry about that
<mup> Bug #1475386 opened: unit not dying after failed hook + destroy-service <juju-core:New> <https://launchpad.net/bugs/1475386>
<rick_h_> NOTICE: jujucharms.com webui is back up
<davechen1y> rick_h_: \o/
<natefinch> sinzui: I'm trying to reproduce https://bugs.launchpad.net/juju-core/+bug/1471657   but when I try to get juju's code on stilson-07  I get this error:
<natefinch> fatal: unable to access 'https://code.googlesource.com/google-api-go-client/': Received HTTP code 403 from proxy after CONNECT
<natefinch> seems like it must be a proxy/firewall issue?
<mup> Bug #1471657: linker error in procsPersistenceSuite unit test on ppc64 <ci> <ppc64el> <test-failure> <unit-tests> <juju-core:Triaged> <juju-core feature-proc-mgmt:Triaged> <https://launchpad.net/bugs/1471657>
<sinzui> natefinch: those machines are on a private network. They cannot access google or aws, or hp or joyent. They cann access canonistack. I think you need to move to another machine
<natefinch> sinzui: I'll take whatever PPC machine is available, I just knew how to connect to those.  Is there a different PPC machine I can use that has connection to the public internet?
<sinzui> natefinch: those are the only ones, and they have special access . all others are more restricted
<davecheney> natefinch, yes, you'll have to raise an RT to get that firewall exception
<davecheney> or you could just scp in the code from your machine
<davecheney> that's what I do
<sinzui> yep, I do that all the time
<natefinch> davecheney: yeah, that was going to be my next thought - scp.  I just figured, since so much of the rest of it worked, the fact that one random url didn't work seemed like more of a bug than intentional
<natefinch> davecheney: or maybe none of it worked and that's just the first leaf package to try to download.  I didn't actually check
<davecheney> just part of life behind the firewall
<davecheney> this is a new dep for google gae
<natefinch> davecheney:  I see
<natefinch> davecheney: are you still in the US?  I presume you're not awake back home at this time of night
<sinzui> natefinch: I had a day last week spent tarring, scping, untarring, go testing :( this situation is also true for our one machine that can run maas
<davecheney> sadness
<sinzui> davecheney: natefinch There is a plan to add ppc64el to canonistack. That might fix this situation
<katco> wwitzel3: how's that doc coming?
<wwitzel3> katco: good, I think we have a couple ideas
<katco> wwitzel3: mind if i tal?
<wwitzel3> katco: shared the doc with you, which I've pasted irc logs into; I haven't distilled anything yet, so haven't given any structure to the doc
<katco> wwitzel3: hrm. worried that this might be too complicated for a demo
<wwitzel3> katco: ok
<katco> wwitzel3: to give you some kind of idea. wallyworld's storage demo was bringing up postgres with external storage and then showing the contents of the external storage (i think)
<katco> wwitzel3: a cool idea would be cool, but i don't want it to be anything so elaborate that i mess it up and don't know enough about the charms to fix it
<katco> ericsnow: did you create a bug to track the OVA images card?
<ericsnow> katco: #1468383
<katco> ericsnow: ty... and is there an email i can piggy-back off to email ben?
<ericsnow> katco: not really
<katco> ericsnow: the remaining wpm cards are created?
<ericsnow> katco: not yet
<katco> ericsnow: wwitzel3: we need to be ready to go over the demo and how to get there by tomorrow
<ericsnow> katco: k
<wwitzel3> katco: ok, in that case, updated the doc
<katco> wwitzel3: simple, love it :p
<katco> wwitzel3: not that i'm not *very* interested in what whit et al. are working on (i.e. real-world use-cases)
<katco> wwitzel3: but for demo, just need proof that it works
<katco> wwitzel3: it would be cool to have a 2nd demo in case i'm feeling ambitious, if they have something ready to go
<katco> natefinch: 1:1
<natefinch> katco: oops, sorry, coming
<katco> ericsnow: can you take a look at requirements section here: https://docs.google.com/document/d/1etgWYADQHVSY_yT5rd-_DqPXBNUIWYBj-z8-Cpxc2-U/edit#heading=h.u3tics2c141k
<katco> ericsnow: and update with what else needs to be done?
<ericsnow> katco: sure
<katco> wwitzel3: also, do we need a mysql component there as well to prove they can talk to each other?
<katco> wwitzel3: whoop nm looks like that's there isn't it
<cmars> natefinch, can you please take a look at http://reviews.vapour.ws/r/2188/ ?
<cmars> natefinch, it's passing on hyperv
<cmars> and linux of course ;)
<natefinch> cmars: np
<cmars> natefinch, ty
<natefinch> cmars: gah... whoever wrote ReplaceFile did it backwards :/
<natefinch> cmars: Go standard is foo(dest, src)
<natefinch> to mimic a = b
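[Editor's note: the convention natefinch is describing is visible in Go's builtin `copy`, where the destination comes first, mirroring assignment. A quick illustration:]

```go
package main

import "fmt"

func main() {
	src := []int{1, 2, 3}
	dst := make([]int, 3)
	// Destination first, mirroring the assignment dst = src;
	// this is the ordering natefinch says ReplaceFile inverted.
	n := copy(dst, src)
	fmt.Println(n, dst) // prints: 3 [1 2 3]
}
```

As cmars notes just below, `os.Rename(oldpath, newpath)` is a long-standing stdlib exception to this ordering.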
<cmars> natefinch, i noticed that
<natefinch> cmars: well, there's no fixing it now, I guess.
<cmars> natefinch, that'd be a heavy lift
<cmars> natefinch, os.Rename is kind of the same way though, http://golang.org/pkg/os/#Rename
<natefinch> cmars: huh, weird, yeah
<natefinch> cmars: probably written before they settled on the other scheme.  Oh well. Better to be consistent.
<cmars> natefinch, i should return proper os.LinkErrors.. i'll fix that
<natefinch> cmars:  reviewed
<cmars> natefinch, thanks!
<natefinch> cmars: welcome.  Anything to avoid working on this ppc bug ;)
<mup> Bug #1475425 opened: There's no way to query the provider's instance type by constaint <juju-core:New> <https://launchpad.net/bugs/1475425>
<davecheney> thumper: sorry i'm on another call
<perrito666> wallyworld: you are a bit frozen
 * perrito666 hums let it go to wallyworld 
<mup> Bug #1475056 changed: worker/uniter/relation: HookQueueSuite.TestAliveHookQueue failure <juju-core:New> <https://launchpad.net/bugs/1475056>
<perrito666> wallyworld: time to get a new modem?
<wallyworld> perrito666: maybe, trying to join again now
<wallyworld> perrito666: except now chrome hates me
<katco> cherylj: still there?
<davecheney> thumper: sorry i missed the standup
<davecheney> was on another call
<davecheney> wrt the arm issue
<davecheney> is there a maas install that I can use to reproduce it
<davecheney> wallyworld: you were trying to get access to the system ?
<davecheney> did you succeed ?
<wallyworld> davecheney: i didn't succeed, but maybe that's just me. there's access instructions in the bug
<thumper> davecheney: if you aren't able to get access through the instructions in the bug, try bugging the hyperscale team, Andrew Cloke or Sean
<alexisb> davecheney, Sean specifically said he would provide any access needed
<alexisb> so we should hold them to that
<davecheney> are we talking about the same bug ?
<davecheney> there is nothing in the issue
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1415517
<mup> Bug #1415517: juju bootstrap on armhf/keystone hangs <armhf> <bootstrap> <hs-armhf> <juju-core:Confirmed> <https://launchpad.net/bugs/1415517>
<alexisb> davecheney, that is the one i am thinking of
<davecheney> are the instructions like hidden or something ?
<thumper> cmars: what are the two return values of utils.MoveFile ?
<thumper> cmars: or more specifically, why are you checking the ok value if err != nil?
<thumper> cmars: isn't it more idiomatic go to not expect any other value to have meaning if err is not nil?
<thumper> cmars: nm, went and read the source
<wallyworld> davecheney: damn connection problems today, not sure if you saw last messages
<davecheney> nope
<davecheney> i kept saying "I'm not sure what access details you are seeing in that issue -- i cannot see them "
<wallyworld> [08:09:58] <wallyworld> davecheney: the issue is that state server jujud process dies on arm
<wallyworld> [08:10:16] <wallyworld> they can run workloads, but not state servers
<davecheney> ok
<wallyworld> the jujud process just disappears
<davecheney> dmesg ?
<wallyworld> i've asked for stuff like that
<wallyworld> i think they want us to ssh in
<davecheney> ok
<wallyworld> and see for ourselves
<wallyworld> there's a whole maas cluster
<davecheney> ok
<wallyworld> you need to use the vpn
<davecheney> fuk
<davecheney> that won't work from where I am
<wallyworld> i can get http access to maas, but maas rejects my ssh attempts
<wallyworld> and i know nothing about arm
<davecheney> this is linux
<davecheney> this is user space
<davecheney> it won't be arm specific
<wallyworld> true, also not my specialty :-(
<wallyworld> low level system stuff
<davecheney> i'm not sure what the next step is
<davecheney> x wants us to do y
<davecheney> we've tried y
<davecheney> it didn't work
<davecheney> how can we break the stalemate
<wallyworld> didn't work for me. i've asked them to attach any post mortem and relevant info to bug
<davecheney> +1
<davecheney> i'm subscribed to the bug
<wallyworld> i may need to poke them again
<mup> Bug #1466087 changed: kvmBrokerSuite TestAllInstances fails <ci> <test-failure> <juju-core:Incomplete> <juju-core devices-api-maas:Triaged> <https://launchpad.net/bugs/1466087>
<mup> Bug #1474291 changed: juju called unexpected config-change hooks after read tcp 127.0.0.1:37017: i/o timeout <hooks> <openstack> <sts> <uosci> <juju-core:Invalid> <ceilometer (Juju Charms Collection):New> <https://launchpad.net/bugs/1474291>
<mup> Bug #1475386 changed: unit not dying after failed hook + destroy-service <destroy-service> <juju-core:New> <https://launchpad.net/bugs/1475386>
<thumper> fark...
<thumper> davecheney: still here?
<davecheney> thumper: ack
 * thumper is looking at bug 1474946
<mup> Bug #1474946: kvmBrokerSuite worker/provisioner: tests are poorly isolated <blocker> <ci> <regression> <test-failure> <juju-core:In Progress by thumper> <https://launchpad.net/bugs/1474946>
<thumper> I moved my /var/lib/lxc dir out of the way
<davecheney> it's a shitstorm
<thumper> and confirmed that my user can't create a dir there
<thumper> but when I run the tests, they pass
<davecheney> you have lxc installed
<thumper> WT actual F
<thumper> yes
<thumper> but the dir /var/lib/lxc doesn't exist
<davecheney> mkdir -p will always pass if the directory exists
<thumper> because I moved it
<davecheney> what is the ownership of /var/lib ?
<thumper> doesn't allow my user to create dirs
<davecheney> possibly installing lxc changes group ownerships
<thumper> that was the first thing I tested
<davecheney> puts you in wheel
 * thumper digs more
<thumper> FFS
<thumper> this test is bullshit
<thumper> it is a kvm test
<thumper> that checks the lxc dir for networking setup
<cmars> thumper, thanks for the review. i described the return bool here: https://github.com/juju/utils/blob/master/file_unix.go#L30
<cmars> thumper, did you want a comment in juju as well describing the use of it?
<davecheney> da fuq
<thumper> just in that use of it, yes
<cmars> thumper, ok, np
<thumper> code should be obviously correct when you read it
<thumper> davecheney: also, my version passes because for some reason, the lxc data dir is /home/tim/.local/share
<thumper> more modern lxc I guess
 * davecheney reaches for emoji
<davecheney> possibly, i'm on 14.04.2
<thumper> oh fuck
 * thumper head desks
 * thumper head desks
 * thumper head desks
<davecheney> always a good sign ...
 * thumper head desks
<thumper> in order to be a good citizen...
<thumper> we do this:
<thumper> LxcContainerDir  = golxc.GetDefaultLXCContainerDir()
<thumper> which does this:
<thumper> run("lxc-config", nil, "lxc.lxcpath")
<thumper> for root, it is probably the right thing
<thumper> for a user with modern lxc
<thumper> it isn't
 * thumper thinks
<thumper> ugh
<thumper> since the local provider jujud runs as root
<thumper> I think we are ok
<davecheney> lxc-config won't exist if lxc isn't installed
<thumper> but this is why the tests passes
<thumper> ack
<thumper> if there is an error
<thumper> it returns /var/lib/lxc
<thumper> which then doesn't exist
<thumper> however
<davecheney> if lxc-config ... fails, we fall back to /var/lib/lxc ?
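[Editor's note: a sketch of the fallback behaviour being described - ask `lxc-config` for `lxc.lxcpath` and fall back to /var/lib/lxc when the binary is missing or errors. The real golxc implementation may differ; this is illustrative only:]

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// defaultLXCContainerDir mirrors the described behaviour of
// golxc.GetDefaultLXCContainerDir: query lxc-config, and if that
// fails (e.g. lxc isn't installed), fall back to /var/lib/lxc.
func defaultLXCContainerDir() string {
	out, err := exec.Command("lxc-config", "lxc.lxcpath").Output()
	if err != nil {
		return "/var/lib/lxc" // lxc-config absent or failed
	}
	return strings.TrimSpace(string(out))
}

func main() {
	fmt.Println(defaultLXCContainerDir())
}
```

This is why the tests behaved differently across machines: with modern lxc installed the path resolves to a user-writable dir like ~/.local/share, and without lxc it silently falls back to /var/lib/lxc.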
<thumper> the bigger problem
<thumper> is that the test is bullshit
<davecheney> derp-tastic!
<thumper> we shouldn't be adding network config in lxc dir for kvm tests
 * thumper renames the function so it is obviously wrong
<thumper> and removes it
<davecheney> phase 1. delete test
<davecheney> phase 2. ??
<davecheney> phase 3. build is green
<thumper> phase 1: rename function to include LXC
<thumper> phase 2: make it so the local dir can't be created
<thumper> phase 3: run all tests
<thumper> phase 4: remove lxc function from kvm test
<thumper> phase 5: ensure no other failures
<thumper> phase 6: send email to network folks to see what  should be there
<thumper> phase 7: profit
<davecheney> 7 steps ?
<davecheney> that's too enterprise
<thumper> http://reviews.vapour.ws/r/2190/diff/#
<menn0_> wallyworld: the env life assert PR has now failed twice due to test timeouts in cmd/jujud/agent
<menn0_> wallyworld: makes me think it's an effect of the change
<menn0_> wallyworld: but of course it always works on my machine
<thumper> menn0_: did you want me to try here?
<menn0_> thumper: if you have the time, yes please
<thumper> menn0_: if you want to review said branch above
<menn0_> thumper, wallyworld: not sure if it's related but the race detector finds 11 races in that package
<thumper> really?
<thumper> menn0_: what did you do?
<menn0_> davecheney: have you been backporting your data race fixes to 1.24?
<menn0_> thumper: I haven't changed a thing in that package
<thumper> menn0_: no, he hasn't AFAIK
<davecheney> no
<davecheney> i have not
<menn0_> thumper: that could be why the races are still there then
<thumper> :)
<davecheney> yup
<menn0_> thumper: this PR only touches state
<thumper> this is on 1.24 is it?
<menn0_> thumper: yep
<menn0_> the races might be nothing to do with the test hangs
<menn0_> but it could be to do with txns being decoupled, changing the timings of things
<thumper> we should back port the apiserver wait group change
<thumper> because that could be it
<menn0_> thumper: I already did that I think
 * menn0_ checks
<menn0_> thumper: yep that's there
<thumper> hmm
<wallyworld> menn0: sorry was in meeting, but normally if agent tests timeout more than once there's an issue
<menn0> wallyworld: it's been 2 different tests in that pkg that have gotten stuck but they're both upgrade related
<menn0> wallyworld: i'm going to have peek at them in case something obvious jumps out
<perrito666> wallyworld: mm, that is the test that failed merging the patch yesterday. I think it can only be reproduced by running the whole suite; I have been able to do it only once and no more, so I could not get to it and thought it was one of the long-standing flaky tests
<wallyworld> could be a flakey test but i think work has been done recently to fix a lot of the agent related tests
<perrito666> wallyworld: mm, could definitely be something in the change that fixes the issue with status, but that would mean that the test is waiting for the wrong assumption
<perrito666> wallyworld: did you ever re-merge the code for agent status?
<wallyworld> perrito666: which code?
<perrito666> wallyworld: updateAgentStatus
<axw> wallyworld: on master, AFAIK, bootstrap will put image metadata directly into gridfs without using swift
<wallyworld> perrito666: that code being missing would only have failed to report a failed state
<wallyworld> the fix is merging now
<wallyworld> the refactoring reported non error status elsewhere
<axw> wallyworld: it's just that we weren't searching it (the fix for that landed already I think?)
<perrito666> wallyworld: it is odd that fixing the code would break the test :(
<wallyworld> perrito666: the code hasn't merged yet
<wallyworld> axw: i didn't think it did put it into gridfs, or i don't recall if it did
<axw> wallyworld: I'll find the code. I'm 99% sure it does
<wallyworld> axw: the search issue - that was cloud storage not being searched
<wallyworld> ie swift
<wallyworld> i didn't think 1.24 and master were different in that respect
<axw> wallyworld: oh... we're meant to be looking in gridfs as well
<wallyworld> axw: i didn't realise at all the simplestreams data has been added to gridfs
<axw> wallyworld: https://github.com/juju/juju/blob/master/cmd/jujud/bootstrap.go#L237
<menn0> thumper: so these tests are hanging because the machine agent Stop call is not returning
<axw> wallyworld: we're writing the image metadata into "state storage", which is gridfs
<thumper> heh
<menn0> thumper: but it's not the apiserver... I can see that does stop
<thumper> yeah...
<thumper> oh?
<thumper> interesting
<thumper> which one is it?
<menn0> thumper: still digging through the logs to figure out which workers are not stopping
 * menn0 is very grateful to thumper for adding the extra logging in the runner
<wallyworld> axw: i see, i had thought that the stor used was EnvironStorage
<thumper> axw: need to talk to you about environment destruction
<axw> thumper: mkay
<axw> wallyworld: hm, so looking back over anastasiamac's change, I don't think that's actually what we should be doing. we're meant to be looking in gridfs, and the individual providers can add additional search paths if they want to (e.g. look in keystone)
<wallyworld> axw: i'd prefer not to have a simplestreams blob
<wallyworld> structyred data is much better
<axw> wallyworld: I understand, and that's being fixed, but atm we're talking about *where* the blob is
<wallyworld> simplestreams should not be in env
<axw> in provider storage vs. gridfs
<axw> we should not be perpetuating provider storage
<wallyworld> agreed, and we're not
<wallyworld> i didn't realise we weren't writing to provider storage
<axw> wallyworld: the latest change reintroduces searching metadata in provider storage...
<thumper> axw: when do you have some time?
<wallyworld> because i thought we were writing metadata there based on the information i had
<axw> thumper: can chat now
<menn0> thumper: looks like it might be the certupdater
<thumper> haha
<menn0> thumper: it's blocked on a channel send
<thumper> bwa haha
<thumper> naked send?
<thumper> oh...
<thumper> I remember that...
<thumper> it is buffered
<thumper> with one value
<thumper> but sends twice
<menn0> wasn't that fixed?
<axw> thumper: where aboots?
<thumper> I thought so...
<thumper> perhaps not
 * menn0 keeps digging
<thumper> axw: https://plus.google.com/hangouts/_/canonical.com/env-destruction
 * menn0 loves tracebacks + decent logs
#juju-dev 2015-07-17
<menn0> thumper: yep naked send
 * menn0 checks master
<menn0> thumper: fixed in master but not 1.24
 * menn0 backports
<thumper> cmars: are you going to merge your branch?
<cmars> thumper, i'd like to
<cmars> thumper, master's blocked
<thumper> I've committed a fix
<thumper> cmars: please JFDI it
<cmars> thumper, with pleasure
<thumper> menn0: thanks for the backport
<menn0> thumper: np. hopefully my PR will make it through once this is in
<katco> cherylj: still there?
<thumper> wallyworld: I'm having a lateish lunch with Rachel
<thumper> wallyworld: will probably be late for our 1:1
<wallyworld> sure, np
<natefinch> evening all
<natefinch> man I really hate linux utilities' default of "don't tell the user wtf is going on"
<thumper> wallyworld: back
<thumper> wallyworld: good now?
<wallyworld> sure
<wallyworld> axw: just finishing up, be there in a sec
<axw> wallyworld: thanks for reminding me ;)
<thumper> cmars: http://reports.vapour.ws/releases/2890/job/run-unit-tests-win2012-amd64/attempt/908
<thumper> cmars: another intermittent metrics collection failure
 * thumper sighs
<thumper> cmars: I'm wondering if all the extra tests in uniter also pushed the ppc timeout over 10 minutes
 * thumper is looking at http://reports.vapour.ws/releases and feeling sad
<thumper> menn0_: got a minute to talk through another intermittent failure?
<menn0_> thumper: yep
<thumper> menn0_: standup hangout
<thumper> menn0_: click "join" :)
<menn0_> thumper: it's not that. I have to re-auth and it wants a new password
<thumper> ah
<natefinch> anyone seen a gccgo error where it thinks you've multiply defined things that are not multiply defined?
<natefinch> thumper: ^ ?   I seem to remember a bug we've hit with gccgo in the past where like embedding things the wrong way will make gccgo mad.  Do you happen to remember what caused that?
<natefinch> nvm, pretty sure it's this: https://github.com/golang/go/issues/7627    which is fixed in go 1.3 .... which does not help us
<thumper> decisions, decisions
<thumper> wanting a new laptop bag
<thumper> trying to decide between black or red
<natefinch> if not-black is an option you like, go with not-black.  There's too much black in everything tech
 * natefinch looks at his black keyboard, black monitor stands, black headset, black camera strap, black camera, black CD player... all just on his desk
 * thumper chose black
<thumper> it was pointed out how grubby red would look when a little dirty
<thumper> ok, I'm done
<thumper> drink time
<thumper> laters folks
<mup> Bug #1475509 opened: upgrade-charm --force behavior causes races <juju-core:New> <https://launchpad.net/bugs/1475509>
<TheMue> dooferlad: hangout, and do you know where dimitern is?
<dooferlad> TheMue: No, I don't. I think we should wait for him.
<dimitern> here I am
<dooferlad> ah
<dimitern> sorry guys :)
<TheMue> hehe
<dooferlad> no worries. Joining
<dimitern> tgif :) why do meteors always land in craters :D http://9gag.com/gag/aPGB9YV?ref=fbp
<dimitern> ^^ if I heat my SSD drive until it becomes a gaseous state hard drive would that enable cloud computing? <<- good idea for a juju plugin
<mup> Bug #1475565 opened: juju expose should allow ip/port restrictions <juju-core:New> <https://launchpad.net/bugs/1475565>
<mup> Bug #1475565 changed: juju expose should allow ip/port restrictions <juju-core:New> <https://launchpad.net/bugs/1475565>
<mup> Bug #1475565 opened: juju expose should allow ip/port restrictions <juju-core:New> <https://launchpad.net/bugs/1475565>
<fwereade> perrito666, ping
<dimitern> dooferlad, check out bug 1474946 btw and thumper's fix http://reviews.vapour.ws/r/2190/diff/#
<mup> Bug #1474946: kvmBrokerSuite worker/provisioner: tests are poorly isolated <blocker> <ci> <regression> <test-failure> <juju-core:Fix Committed by thumper> <https://launchpad.net/bugs/1474946>
<dimitern> dooferlad, I believe the kvm broker shouldn't need to call EnsureRootFS.. as it only applies to lxc broker tests with lxc-clone: true (when we have the template)
<fwereade> perrito666, what's the source of the differences in SetStatus on unit vs service in state?
<fwereade> perrito666, and for that matter unitagent?
<perrito666> fwereade: sorry was @gym
<fwereade> perrito666, no worries, I expect asynchronicity :)
<perrito666> fwereade: your comment feels a bit lacking in context
<fwereade> perrito666, so, we have 4 SetStatus methods in state
<fwereade> perrito666, they're all basically setting the same document, and they're all different
<fwereade> perrito666, do you know what the source of the differences is, and/or if there are any plans to fix this?
<fwereade> perrito666, some gate on entity life, some don't; some error out when the entity disappears, some don't; some use buildTxn functions, some don't; some follow the very few documented guidelines for state methods, some don't...
<perrito666> fwereade: there are different rules for each entity, they satisfy state.StatusSetter, they could be a bit smaller and cleaned of redundant code tho
<fwereade> perrito666, what are the differences that aren't encoded in the newFooStatusDoc funcs?
<perrito666> fwereade: little to nothing, there is some extra resilience around service status and machine status has no history
<perrito666> oh and unit agent has a slightly different op
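The "gate on entity life" rule fwereade is asking about can be sketched in miniature. This is a deliberately simplified stand-in, not juju's state package: the real SetStatus methods assemble mgo/txn operations, but the invariant is the same -- refuse the status write once the entity is no longer alive.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical, simplified stand-ins for juju's state entities.
type Life int

const (
	Alive Life = iota
	Dying
	Dead
)

type entity struct {
	life   Life
	status string
}

var errNotAlive = errors.New("entity is no longer alive")

// setStatus gates on entity life: a Dead entity's status must
// not change. Some of the four real SetStatus methods enforced
// this and some didn't, which is the inconsistency at issue.
func setStatus(e *entity, status string) error {
	if e.life == Dead {
		return errNotAlive
	}
	e.status = status
	return nil
}

func main() {
	u := &entity{life: Alive}
	fmt.Println(setStatus(u, "started"), u.status)
	u.life = Dead
	fmt.Println(setStatus(u, "stopped"))
}
```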
<mup> Bug #1475635 opened: Subordinate can break juju run for principal service <juju-core:New> <https://launchpad.net/bugs/1475635>
<perrito666> omg you've got to be kidding me one of my monitors will do a humming noise unless brightness is set in 56-60 or >95 levels
<perrito666> it has been driving me crazy since last night
<wwitzel3> katco: ping?
<fwereade> perrito666, do you know the reasons behind the various differences?
<fwereade> perrito666, eg, when implementing history, why *didn't* we do machine status?
<perrito666> partly
<fwereade> perrito666, why does service status need extra resilience, but not the others?
<fwereade> perrito666, why doesn't the unit agent check its entity? etc
<mup> Bug #1475641 opened: Bug in hooks can make jujud unresponsive <juju-core:New> <https://launchpad.net/bugs/1475641>
<mup> Bug #1475386 opened: unit not dying after failed hook + destroy-service <destroy-service> <juju-core:Triaged> <https://launchpad.net/bugs/1475386>
<perrito666> is anyone working on https://bugs.launchpad.net/juju-core/+bug/1474946 ?
<mup> Bug #1474946: kvmBrokerSuite worker/provisioner: tests are poorly isolated <blocker> <ci> <regression> <test-failure> <juju-core:Fix Committed by thumper> <https://launchpad.net/bugs/1474946>
<perrito666> says thumper there but I think he is no longer here
<fwereade> perrito666, I've figured out why RB2148 is making me uncomfortable
<fwereade> perrito666, lots of the changes are to the tests, changing the way they reach into the internals of state to "test" it
<fwereade> perrito666, which means that none of them are testing the important things -- what happens in real life when people call particular state methods -- they're *all* testing how the methods work when someone makes a specific change to the db
<fwereade> perrito666, so those tests become less accurate, and potentially have to be rewritten, every time someone changes the status code
<fwereade> perrito666, if you test to the exported interface you can change the implementation and have some confidence that it works how it used to
<dooferlad> TheMue, dimitern: http://reviews.vapour.ws/r/2197/ please!
<TheMue> dooferlad: *click*
<katco> wwitzel3: here now
<dimitern> dooferlad, looking
<TheMue> dooferlad: reviewed
<fwereade> dooferlad, one significant comment
<fwereade> TheMue, I am a bit disappointed you didn't spot that...
<TheMue> fwereade: pardon?
<fwereade> TheMue, non-bulk api call
<TheMue> fwereade: iiirks, yes
<fwereade> TheMue, ehh, it's friday afternoon, these things happen
<TheMue> fwereade: but should not
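For context on the "non-bulk api call" review comment: juju facade methods conventionally accept a collection of entities and return a parallel slice of results, so one round trip can cover many entities. A sketch of that shape, with local simplified stand-ins for the real `params.Entities` / `params.ErrorResults` types:

```go
package main

import "fmt"

// Simplified stand-ins for juju's apiserver params types.
type Entity struct{ Tag string }
type Entities struct{ Entities []Entity }

type ErrorResult struct{ Error error }
type ErrorResults struct{ Results []ErrorResult }

// EnsureDead shows the bulk-call convention: N entities in,
// N parallel results out, rather than one API call per entity.
func EnsureDead(args Entities) ErrorResults {
	results := make([]ErrorResult, len(args.Entities))
	for i, e := range args.Entities {
		if e.Tag == "" {
			results[i].Error = fmt.Errorf("invalid tag")
			continue
		}
		// a real facade would act on the entity here
	}
	return ErrorResults{Results: results}
}

func main() {
	res := EnsureDead(Entities{Entities: []Entity{{Tag: "machine-0"}, {Tag: ""}}})
	fmt.Println(len(res.Results), res.Results[1].Error)
}
```

The per-entity errors mean a partial failure doesn't abort the whole batch, which is why the convention matters beyond just saving round trips.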
<perrito666> fwereade: too bad you realized after I pulled it :p
<perrito666> I'll change it
<fwereade> is anyone feeling up to speed on the ugly details of destroy-environment in JES?
<sinzui> dimitern: katco : can you ask someone to triage this issue? I cannot say if the issue is the local machine or a juju bug, and if it is a juju bug, is it something that needs fixing in 1.25 https://bugs.launchpad.net/juju-core/+bug/1474885
<mup> Bug #1474885: juju deploy fails with ERROR EOF <local-provider> <precise> <juju-core:New> <https://launchpad.net/bugs/1474885>
<katco> cherylj: perrito666: dimitern: moonstone is in meetings for the next 4 hours. can one of you TAL?
<dimitern> sinzui, katco, will do
<katco> dimitern: ty
<dimitern> dooferlad, I have not finished yet, but do have a few comments already, please wait for my review first
<katco> ericsnow: meeting time
<ericsnow> katco: omw
<katco> akhavr: o/
<dimitern> dooferlad, reviewed
<dooferlad> dimitern: thanks
<dooferlad> fwereade: thanks as well
<dooferlad> dimitern: guess I should have tried harder for a pair programming session earlier!
<dimitern> dooferlad, :) no worries
<dimitern> we can still do it, but it's getting a bit late for me ;/
<dooferlad> dimitern: can you do 30 minutes? Clearly there was some stuff I just hadn't encountered before and combining that with a little too much copy and paste has set me off in the wrong direction.
<dooferlad> dimitern: though I guess it is mostly that I should have gone back and looked at the spec again and need to know about the significance of bulk API calls.
<dimitern> dooferlad, sure, let's use the standup g+
<dimitern> dooferlad, juju-sapphire
<mup> Bug #1475565 changed: juju expose should allow ip/port restrictions <expose> <improvement> <juju-core:Triaged> <https://launchpad.net/bugs/1475565>
<natefinch> davecheney: about that  bug I emailed you about.  It was marked as fixed in 1.3, but we're running 1.2.1 officially.  I'm not sure how that interacts with the gccgo version, but it seems like maybe we're not yet running a version that has that fix?
<natefinch> sinzui: davecheney says that enabling trusty updates should fix that ppc issue... is that something that we're allowed to do on the stilson machines?
<sinzui> natefinch: they are supposed to be on... ubuntu doesn't support machines that do not accept updates
<sinzui> natefinch: I think they must have been turned off when diagnosing other issues.
<natefinch> sinzui: my local machine has gccgo 4.9.1-16ubuntu6, but stilson-07 has 4.9.1-13ubuntu1
<sinzui> :/
<natefinch> sinzui: oh, forgot I'm not on trusty anymore
<natefinch> sinzui: just utopic, though
<sinzui> natefinch: trusty-updates are already enabled on stilson-09
<katco> natefinch: wwitzel3: planning time
<wwitzel3> katco: brt
<natefinch> me too
<natefinch> sinzui: weird, that one has gccgo 4.9.1-1ubuntu3
<natefinch> at least we're consistently inconsistent
<sinzui> natefinch: yes, stilson 6-8 permanently dirty. That is why we want the tests to pass in lxc so we can run them in a clean disposable env
<natefinch> sinzui: can I mark the bug failure as "invalid because it's running on a dirty machine"?  https://bugs.launchpad.net/juju-core/+bug/1471657
<mup> Bug #1471657: linker error in procsPersistenceSuite unit test on ppc64 <ci> <ppc64el> <test-failure> <unit-tests> <juju-core:Triaged> <juju-core feature-proc-mgmt:Triaged> <https://launchpad.net/bugs/1471657>
<sinzui> natefinch: 09 is not dirty
<natefinch> sinzui: oh, sorry, misread, thanks for the clarification
<rogpeppe> natefinch: you might want to take a look at this - it's a little cache package that i've just moved out of charmstore internal into juju/utils: https://github.com/juju/utils/pull/144
<natefinch> rogpeppe: nice.  Needs a package level comment, though.
<rogpeppe> natefinch: good point
<mup> Bug #1474382 changed: MeterStateSuite teardown failure on windows <ci> <regression> <test-failure> <windows> <juju-core:Fix Released by bteleaga> <https://launchpad.net/bugs/1474382>
<mup> Bug #1474946 changed: kvmBrokerSuite worker/provisioner: tests are poorly isolated <blocker> <ci> <regression> <test-failure> <juju-core:Fix Released by thumper> <https://launchpad.net/bugs/1474946>
<mup> Bug #1475724 opened: UniterSuite.TestUniterRelations <blocker> <ci> <ppc64el> <regression> <unit-tests> <windows> <juju-core:Triaged> <juju-core jes-cli:Triaged> <https://launchpad.net/bugs/1475724>
<davecheney> wallyworld: ping
<davecheney> oh, sorry, it's saturday
<alexisb> davecheney, it is also very early for him
<perrito666> alexisb: early as in friday night :p
<perrito666> its around 3AM for ian
<katco> davecheney: you people from the western hemisphere wouldn't understand.
<alexisb> AM == morning
<perrito666> alexisb:  I place the line between late and early around 5AM
<natefinch> 4am == night time, I agree with that :)
<perrito666> which is the time when, if I wake up, I'll go with a coffee instead of more sleep
<mup> Bug #1475724 changed: UniterSuite.TestUniterRelations <blocker> <ci> <ppc64el> <regression> <unit-tests> <windows> <juju-core:Triaged> <juju-core jes-cli:Triaged> <https://launchpad.net/bugs/1475724>
<mup> Bug #1474382 opened: MeterStateSuite teardown failure on windows <ci> <regression> <test-failure> <windows> <juju-core:Fix Released by bteleaga> <https://launchpad.net/bugs/1474382>
<mup> Bug #1474946 opened: kvmBrokerSuite worker/provisioner: tests are poorly isolated <blocker> <ci> <regression> <test-failure> <juju-core:Fix Released by thumper> <https://launchpad.net/bugs/1474946>
<jcastro> Does anyone have a quick tldr on Juju/LXD support? I'd like to answer this question: http://askubuntu.com/questions/643658/lxd-and-juju-management
<mup> Bug #1474382 changed: MeterStateSuite teardown failure on windows <ci> <regression> <test-failure> <windows> <juju-core:Fix Released by bteleaga> <https://launchpad.net/bugs/1474382>
<mup> Bug #1474946 changed: kvmBrokerSuite worker/provisioner: tests are poorly isolated <blocker> <ci> <regression> <test-failure> <juju-core:Fix Released by thumper> <https://launchpad.net/bugs/1474946>
<mup> Bug #1475724 opened: UniterSuite.TestUniterRelations <blocker> <ci> <ppc64el> <regression> <unit-tests> <windows> <juju-core:Triaged> <juju-core jes-cli:Triaged> <https://launchpad.net/bugs/1475724>
<davecheney> natefinch: that bug you emailed me about earlier
<davecheney> i'm pretty sure that is fixed
<davecheney> and has been fixed for a long time
<davecheney> _but_ there is a catch
<davecheney> it's not available for 14.04 out of the box
<davecheney> even in 14.04.2
<davecheney> you _have_ to enable trusty-updates to get it
<natefinch> davecheney: sinzui says trusty updates is enabled on the ppc machines
<natefinch> sinzui: silson-09 says that gccgo is installed and set to "manual"... does that mean it's not getting updates?
<davecheney> the magic package is libgo5
 * davecheney struggles to contain his rage
<davecheney> WE OWN THE DISTRIBUTION
<davecheney> WHY CAN WE NOT ACTUALLY SHIP UPDATES FOR SHIT WE MAINTAIN
<natefinch> s/silson/stilson/
<davecheney> phase 1. fix bug
<davecheney> phase 2. ???
<davecheney> phase 3. nobody actually gets the bug fix
<davecheney> ffs
<natefinch> afk for 15-ish or so, sorry.  Kids need naps
<sinzui> davecheney: the machine always had updates on, but the issue might be that trusty didn't officially support ppc64el. Its packages come from ports.ubuntu.com
<davecheney> natefinch: just emailed you the background
<davecheney> actually no
<davecheney> that was the wrong email thread
<davecheney> natefinch: i cannot find it now
<davecheney> but the summary was
<davecheney> "no versions of our LTS release ship with a working gccgo, you must be using the updated versions from trusty-updates"
<rogpeppe> davecheney: i'd appreciate a glance at https://github.com/juju/utils/pull/144 if you have a few moments.
 * davecheney looks
<davecheney> rogpeppe: safe flight back to blighty ?
<rogpeppe> davecheney: yup, no probs at all
<rogpeppe> davecheney: you still in US ?
 * rogpeppe leaves
<katco> wwitzel3: i'm ready to party.
<wwitzel3> katco: ok, I'm running a bit behind, lots of plates spinning
<katco> wwitzel3: np, lemme know when you're ready
<alexisb> perrito666, ping
<perrito666> alexisb: pong
<alexisb> are you still working this bug w/ menno:
<alexisb> https://bugs.launchpad.net/juju-core/+bug/1474606
<mup> Bug #1474606: $set updates may clear out the env-uuid field <juju-core:Triaged by menno.smits> <juju-core 1.24:Triaged by menno.smits> <https://launchpad.net/bugs/1474606>
<perrito666> alexisb: I added a fix for the original issue, but as I understand menn0 was working on some additional code to do a more general fix
<alexisb> perrito666, ok
<alexisb> sinzui, we are still blocked on 1474606
<sinzui> alexisb: understood
<mup> Bug #1475779 opened: EnsureJujudPassword fails if the registry keys are not present <juju-core:New for bteleaga> <https://launchpad.net/bugs/1475779>
<katco> jw4: hey didn't you do a relation state diagram at some point?
<jw4> katco: yeah - a first pass
<jw4> it should be in the docs folder
<katco> jw4: do you remember where that is? i couldn't find it
<katco> jw4: oh, hm. nothing jumped out at me there
<jw4> katco: I think it's doc/uniter-model.txt
<jw4> katco: pretty thin
<katco> jw4: no worries, ty
<jw4> :e
<jw4> um... that was supposed to be :)
<katco> jw4: haha thought you were in vim for sec
<jw4> exactly
<mbruzek> rick_h_: ping
<mbruzek> I have a bundle question rick_h_
#juju-dev 2015-07-19
<fwereade> jam, if your will to live becomes too strong today, I pulled together a massive leadership-backport branch: http://reviews.vapour.ws/r/2205/
<jam> fwereade: is there more than just porting other code?
<fwereade> nope, I mentioned the couple of non-automated bits I hit
<fwereade> jam, ^^
<jam> fwereade: why are you here on the weekend?
<jam> fwereade: backporting the state.Open() change seems like a more serious change to the internal API in a stable release. We do depend on it, though?
<jam> that seems to be the bulk of the line-by-line changes
<fwereade> jam, I think we probably don't strictly *depend* -- but unpicking it felt riskier than including it
<fwereade> jam, yeah
<fwereade> jam, and state.Open isn't really API, is it? it's only ever called in-process
<fwereade> jam, so we can trust the compiler :)
<jam> hence "internal" API
<jam> fwereade: so I don't know that we've stated "we won't change our function signatures in a stable release" as much as our over-the-wire API
<fwereade> jam, I'd see the various, er, internal API methods as the "internal" API ;p
<fwereade> jam, why would we want to restrict internal function signatures?
<jam> fwereade: ABI better for you ?
<fwereade> jam, if I were worrying about dylibs, maybe
<fwereade> jam, but when it's all one big chunk of code compiled and linked together, I can't get too worked up about it
<jam> fwereade: well we do happen to know some people writing extensions for various parts that will need to be aware of something like this.
<jam> I think it falls more under if you look at "feature" vs "bugfix". I understand why we need to backport this stuff, but it does fall under the "questioning eye for a stable change"
<jam> fwereade: anyway LGTM
<fwereade> jam, I think I see your point, but I'm not really willing to commit to any stable package interface I don't absolutely have to ;)
<fwereade> jam, tyvm
<bdx> hey whats going on everyone?
<bdx> I'm hitting my head on some mongodb issues using local provider for juju
<bdx> my /var/log/upstart/juju-db-bdx-local.log shows http://paste.ubuntu.com/11906031/
<fwereade> bdx, shot in the dark based on that traceback -- is something using port 38017 ?
<fwereade> bdx, it's a bit surprising because I didn't think we ran mongod with --rest
<thumper> o/ fwereade
<fwereade> thumper, o/
<thumper> fwereade: Y U make my life difficult?
<fwereade> thumper, mainly for the sheer devilment of it, I must admit
<fwereade> thumper, (what did I do in particular?)
<thumper> system destroy
<fwereade> thumper, oh, yes
<fwereade> thumper, am I annoyingly correct or just annoying? or have you not decided yet?
<thumper> there is no list blocks command for a system yet
<fwereade> thumper, (either way, it may please you to know that the mosquitos are killing me here)
<thumper> and I'm not sure I agree about the general output formatter
<thumper> not for this case
<thumper> I like the formatters for general "list this" type commands
<thumper> or about this
<fwereade> thumper, well, for this case I can put up with whatever makes sense to stderr
<thumper> but this output was really to help inform the user
<fwereade> thumper, I think that if it's stdout it's a bit different?
<thumper> why?
<thumper> we output messages to std out
<thumper> plain text stuff
<thumper> this is just more
<thumper> context.Infof etc
 * thumper digs into the windows test failure
<thumper> fwereade: it seems the intermittent uniter test failures are becoming more common on win 2012
<thumper> bug 1475724
<mup> Bug #1475724: UniterSuite.TestUniterRelations <blocker> <ci> <ppc64el> <regression> <unit-tests> <windows> <juju-core:Triaged> <juju-core jes-cli:Triaged> <https://launchpad.net/bugs/1475724>
<fwereade> thumper, I think context.Infof writes to stderr? -- and, yeah, I saw that earlier, I am a little bit suspicious of update-status but haven't looked into it
<fwereade> thumper, would you nudge wallyworld in that direction when he comes on please?
<thumper> fwereade: hmm... maybe it does
<thumper> fwereade: I'm looking into it now as it is blocking the landing of the jes-cli branch
<thumper> fwereade: that and timeouts on ppc for uniter tests :(
<thumper> they take almost 8 minutes on my machine
<fwereade> thumper, I dunno, I just have a vague but strong expectation that stdout contain results of what you asked for and stderr have more freeform explanatory stuff and other arbitrary bits that seemed like a good idea
<thumper> and ppc is slower
<thumper> fwereade: re: output, I think that's ok
<fwereade> thumper, fwiw, I have someone on the internet who says it's true :) http://www.jstorimer.com/blogs/workingwithcode/7766119-when-to-use-stderr-instead-of-stdout
<fwereade> thumper, re Output, I'm not sure why people haven't been using it at all
<fwereade> thumper, did everyone just forget it existed? or is there something important it doesn't do and is hard to fix?
<thumper> fwereade: what are you referring to exactly?
<thumper> fwereade: ctx.Infof and Verbosef ?
<fwereade> thumper, juju/cmd/output.go
<fwereade> thumper, the source of all the --format args
<thumper> fwereade: people do use it
<thumper> for all the places it makes sense at least
<thumper> I do think that 'juju block list' should be fixed to use it
<fwereade> thumper, cool -- and, ah, I see what bugged me
<fwereade> thumper, I saw the tabwriter, and saw no output formatter
<thumper> fwereade: you mean this time?
<fwereade> thumper, and thought OMFG-have-all-these-new-output-formats-been-hacked-in-horribly
<thumper> fwereade: in this case... personally, I'm ok with this
<fwereade> thumper, but no, I see formatTabular here and there as it should be and am mollified
<thumper> as this is not 'standard' output for the command
<thumper> it is informational only
<fwereade> thumper, put it on stderr and I'll fold
<thumper> ok
<thumper> done
<fwereade> thumper, cool :)
<thumper> but I agree in principle about the formatters
<thumper> and I do use it, and expect others to
<thumper> it is used for all the user list, env list, etc
<thumper> but these default to --format=tabular
<thumper> for niceness
<fwereade> thumper, yeah, I just saw those, should have been looking earlier
<thumper> no more 'smart' bollocks
<fwereade> thumper, and +100 to default-tabular
<fwereade> thumper, and, yes, naming something "smart" is basically cursing it with an eternal suck magnet
 * thumper tries to get this win 2012 VM running the juju unit tests again
<thumper> windows just told me it finished the worker/uniter tests in -24672.05 seconds
<thumper> I can tell you it took longer than that
<anastasiamac> thumper: it's kind of funny.. in a sad way :D
<thumper> oh FFS
 * thumper looks at perrito666
<perrito666> thumper: ?
<thumper> perrito666: the source of the windows critical blocker is uniter_test.go line 272
<thumper> startUpgradeError{},
<thumper> that doesn't work on windows
<thumper> as it does chmod 555 $CHARMDIR
<thumper> so getting start-failed
<thumper> instead of started
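One way to frame the problem: `chmod 555` has no useful effect on Windows (the filesystem only has a read-only attribute, not POSIX mode bits), so the failure has to be injected differently per OS. A hedged sketch of that idea -- the commands are illustrative, taken from the options discussed here, not necessarily what the uniter tests ended up using:

```go
package main

import "fmt"

// breakCharmDirCommand returns a shell command intended to make
// the charm dir unwritable so the upgrade fails. On Windows,
// POSIX mode bits don't apply, so an ACL change is used instead.
func breakCharmDirCommand(goos, dir string) string {
	if goos == "windows" {
		return fmt.Sprintf("cacls %s /p everyone:r", dir)
	}
	return fmt.Sprintf("chmod 555 %s", dir)
}

func main() {
	fmt.Println(breakCharmDirCommand("linux", "$CHARMDIR"))
	fmt.Println(breakCharmDirCommand("windows", "$CHARMDIR"))
}
```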
<thumper> bogdanteleaga: ping
<thumper> menn0: do you know much windows?
<perrito666> thumper: apologies, I hadn't noticed that
<thumper> I'm wondering how to rework this without disabling
<perrito666> thumper: you can disable that particular one by running it in a separate runUniterTests with the windows condition before and add a todo with my name in it
<perrito666> so you are unblocked
<perrito666> I would but it is a bit sunday dinner time for me
<thumper> understood
<thumper> I'm just wondering if I can just use 'exit 2' at the end of the script
<thumper> instead of a chmod
<menn0> thumper: I know it a bit
<menn0> thumper: but i'm no expert
<menn0> thumper: I can help with this issue now though
<menn0> thumper: https://bugs.launchpad.net/juju-core/1.24/+bug/1474606/comments/5
<mup> Bug #1474606: $set updates may clear out the env-uuid field <juju-core:Triaged by menno.smits> <juju-core 1.24:In Progress by menno.smits> <https://launchpad.net/bugs/1474606>
<menn0> thumper: it turns out what I was looking at is not quite as critical (but still needs fixing)
<thumper> quick hangout?
<menn0> yep
<perrito666> thumper: we seem to require a different failure for windows; we might need to do a fully different charm failure for that
<perrito666> anyway, dinner, cheers
<bogdanteleaga> thumper: I'm around for 3-5 mins
<thumper> bogdanteleaga: don't worry, I'm working through it with menn0
<bogdanteleaga> thumper: fwiw, you're probably going to have to find another way of making it fail, it *should* be possible with permissions but it might get hairy
<bogdanteleaga> thumper: and you could use check.f to only run that test so you don't have to wait for everything
<thumper> bogdanteleaga: looking at 'cacls $CHARMDIR /p everyone:r'
<thumper> bogdanteleaga: think that'll work?
<bogdanteleaga> thumper: not sure if it'll disable other permissions, you'd have to try it out; also check whether that works from both ps and cmd as I'm not sure under which that particular script gets run
<thumper> bogdanteleaga: I'm trying it now
<menn0> thumper: this seems to be the correct way: http://paste.ubuntu.com/11906714/
<thumper> menn0: does it still ask for verification?
<menn0> thumper: what do you mean? I didn't see any prompts.
<thumper> menn0: good, tested and copied in
<thumper> running the test now
<thumper> if this doesn't work, I'll be ripping the test out and use test.skip for windows
<thumper> until someone can fix it
<thumper> it fails
<menn0> thumper: why?
<thumper> start-failed
<thumper> the test doesn't give a lot of output
<thumper> no
<thumper> I see it
 * thumper goes to try some again
<thumper> for my sanity, I'm going to the gym
#juju-dev 2016-07-18
<mup> Bug #1456717 changed: TestUpgradeStepsStateServer fails <ci> <test-failure> <juju-core:Invalid> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1456717>
<mup> Bug #1532849 changed: precise-amd64 and trusty-ppc64el unittests do not complete <ci> <ppc64el> <precise> <regression> <unit-tests> <juju-ci-tools:Fix Released by sinzui> <juju-core:Won't Fix> <juju-core service-to-application:Won't Fix> <https://launchpad.net/bugs/1532849>
<mup> Bug #1568150 changed: xenial lxc containers not starting <cpec> <cloud-init:Fix Committed> <juju-core:Invalid> <cloud-init (Ubuntu):Fix Released> <https://launchpad.net/bugs/1568150>
 * thumper goes to make lunch while the merge into model-migrations hopefully lands
<mup> Bug #1603841 opened: delete support for legacy models.yaml/accounts.yaml format <juju-core:Triaged> <https://launchpad.net/bugs/1603841>
<mup> Bug #1603841 changed: delete support for legacy models.yaml/accounts.yaml format <juju-core:Triaged> <https://launchpad.net/bugs/1603841>
<mup> Bug #1603841 opened: delete support for legacy models.yaml/accounts.yaml format <juju-core:Triaged> <https://launchpad.net/bugs/1603841>
<thumper> I have a test that is trying to read a charm from the charm dir of a unit agent
<thumper> obviously in the test it is unlikely to be there
<thumper> the worker can't be patched from the outside
<thumper> so I'd like to copy the test charm into the agent worker dir for the test
<thumper> anyone know of places where we do this?
<anastasiamac> thumper: we have charms we use for tests... neither of them work? u cannot add another test charm?
<thumper> actually...
<thumper> I think it is working
<thumper> but due to various parts of the system not sharing the testing clock
<thumper> it takes 3s for it to notice
 * thumper fires up the windows laptop to test
<mup> Bug #1602935 changed: Juju 2.0 DB2 charm giving error while deployed using ZFS as storage backend   <charmstore:New> <juju-core:Invalid> <https://launchpad.net/bugs/1602935>
<mup> Bug #1582667 changed: i/o timeout when deploying/upgrading charm <juju-core:Expired> <https://launchpad.net/bugs/1582667>
<mup> Bug #1582667 opened: i/o timeout when deploying/upgrading charm <juju-core:Expired> <https://launchpad.net/bugs/1582667>
<wallyworld> axw: when you get a chance, no rush, would appreciate a review on http://reviews.vapour.ws/r/5256/
<mup> Bug #1582667 changed: i/o timeout when deploying/upgrading charm <juju-core:Expired> <https://launchpad.net/bugs/1582667>
<axw> wallyworld: will do a bit later, trying to finish up this modelcmd branch
<wallyworld> yep, no rush
<menn0> wallyworld or axw: this just happened to me: http://paste.ubuntu.com/19866993/
<menn0> known issue?
<axw> menn0: not known to me
<wallyworld> hmmmm, not seen that before
<wallyworld> it must be intermittent because i've bootstrapped with beta13
<menn0> wallyworld, axw: it's intermittent for me too (only happened once).
<wallyworld> bug time
<menn0> wallyworld, axw: I /was/ bootstrapping 2 controllers concurrently
<menn0> wallyworld, axw: I'll try to repro and will write up a ticket
<menn0> wallyworld, axw: I just made it happen again, so it's not too hard to repro
<axw> :(
<wallyworld> might be the 2 concurrently thing
<axw> wallyworld: I'm guessing it's related to dropping bootstrap config fallback, since that's the biggest change recently related to login I think
<axw> probably something wasn't quite right before, and it was falling back silently. just a guess tho
<wallyworld> could be. there was also a change to check for controller api addresses being present, or else bootstrap was considered not to be finished, but actually, i think that went away when the fallback was dropped
<mup> Bug #1603865 opened: migration: cater for virt-type constraint <juju-core:Triaged by anastasia-macmood> <https://launchpad.net/bugs/1603865>
<wallyworld> axw: and here's the fix for determining model config value source http://reviews.vapour.ws/r/5258/
<axw> wallyworld: sorry this change is taking ages :/  I'll probably be working later on, so will look later if I don't get to it soon
<wallyworld> axw: no worries
<mup> Bug #1594665 changed: reboot-executor is missing from the list of workers <juju-core:Invalid by fwereade> <juju-core model-migration:Fix Committed by fwereade> <https://launchpad.net/bugs/1594665>
<mup> Bug #1594665 opened: reboot-executor is missing from the list of workers <juju-core:Invalid by fwereade> <juju-core model-migration:Fix Committed by fwereade> <https://launchpad.net/bugs/1594665>
<mup> Bug #1594665 changed: reboot-executor is missing from the list of workers <juju-core:Invalid by fwereade> <juju-core model-migration:Fix Committed by fwereade> <https://launchpad.net/bugs/1594665>
<axw> wallyworld: how are we supposed to tie-break config source names?
<axw> currently taking the last match, just wondering if that was on purpose
<wallyworld> axw: there will be a fixed set (default, controller, region)
<wallyworld> default is defaults from code
<wallyworld> region not done yet
<axw> wallyworld: yep. so if the default is the same in all three, and the value is default, what should the source be?
<axw> I would have thought "default", rather than "region"
<axw> most general, rather than most specific
<wallyworld> i have currently done it to pick the most specific other than model, because we want to show the user what they would get if they unset a model attribute and caused the default to be used
<wallyworld> there's a separate model-tree command to show the source of config values
<wallyworld> so if i have set apt-mirror in my model, and I unset it, what value would be used. that's what Default will need to show, hence we want the most specific value in get-model-config output
<mup> Bug #1603888 opened: be able to specify "--bind" for bundle deployments <binding> <bundles> <network> <ux> <juju-core:Triaged> <https://launchpad.net/bugs/1603888>
<axw> wallyworld: if they're all the same, it's also valid to say that it'll be set to the controller or default, so I don't see how saying that it's going to be set to the region value is any more helpful
<axw> (nor is saying the others, that's just what *I* expected)
<wallyworld> axw: side issue - we won't show region specifically as a source AFAIK
<axw> ok
<wallyworld> just default / controller / model
<wallyworld> the spec is a little vague
<wallyworld> axw: the reason for saying the source is the most specific also means that you can reason about what would happen if you updated the more specific value and then created a new model
<axw> wallyworld: ok. had to reset my brain a bit, I get it now
<wallyworld> axw: no worries, i still need to think a bit each time
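The tie-break rule they settle on -- report the most specific layer that defines the key, because that is what the model would fall back to if the model-level attribute were unset -- can be sketched like this. A simplified illustration, not juju's actual config code:

```go
package main

import "fmt"

// Layers ordered least to most specific; "region" was not yet
// wired up at the time of this conversation.
var precedence = []string{"default", "controller", "region"}

// valueSource returns the value for key and the most specific
// layer that defines it, per wallyworld's rule: show what the
// model would get if its own setting were unset.
func valueSource(key string, layers map[string]map[string]string) (value, source string) {
	for _, layer := range precedence {
		if v, ok := layers[layer][key]; ok {
			value, source = v, layer
		}
	}
	return value, source
}

func main() {
	layers := map[string]map[string]string{
		"default":    {"apt-mirror": "http://archive.ubuntu.com"},
		"controller": {"apt-mirror": "http://mirror.internal"},
	}
	fmt.Println(valueSource("apt-mirror", layers))
}
```

Note that when all layers hold the same value, this still reports the most specific source -- which is exactly the behaviour axw initially found surprising and the sketch makes explicit.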
<axw> wallyworld: I've redone http://reviews.vapour.ws/r/5205/, could you please take a look - tomorrow is fine
<jam> babbageclunk: standup?
<babbageclunk> oops, omw
<jam> babbageclunk: I just mean stand up, there's something on your chair :)
<wallyworld> axw: looking in a bit
<mup> Bug #1603910 opened: model-level log forwarding not supported <oil> <juju-core:New> <https://launchpad.net/bugs/1603910>
<frobware> jam, babbageclunk: if we don't have a default gateway for a container should we error or log as a critical warning? My feeling is error. Thoughts?
<wallyworld> fwereade_: how do you make a worker run *only* for the controller model?
<fwereade_> wallyworld, I would probably make sure it was singular and run in the machine agent if JobManageModel
<fwereade_> wallyworld, warning: lots of machine jobs were done wrong, with their activity controlled by logic hidden away in the manifold instead of dragged out into the light
<fwereade_> wallyworld, agent/engine.Flag and agent/engine.Housing are your friends when Doing It Right
<wallyworld> fwereade_: i'm not across the new manifold config stuff so much - i assume i put something in agent/model/manifolds.go, but i have no idea what
<fwereade_> wallyworld, I think we probably want to put it in agent/machine/manifolds.go
<fwereade_> wallyworld, for consistency's sake
<wallyworld> fwereade_: sorry, was a typo
<fwereade_> wallyworld, ah cool ok
<fwereade_> wallyworld, what's the job, just for context?
<wallyworld> fwereade_: fwiw, it's the log forward worker; i think it is starting for hosted models as well as the controller model; i need to start a system to check, but the worker is there already
<wallyworld> line 404
<wallyworld> of said manifolds.go
<fwereade_> wallyworld, right, the manifold is deeply confused
<wallyworld> fwereade_: yeah, this was the initial implementation handed across, i'm trying to fix stuff
<fwereade_> wallyworld, worker/logforwarder/manifold.go:53 in particular
<fwereade_> wallyworld, it declares a dependency on state, which, WTF
<fwereade_> wallyworld, but presence or absence of a dependency does not affect whether or not you're started
<wallyworld> yeah, that bit confused me too
<fwereade_> wallyworld, so it literally does nothing
<fwereade_> wallyworld, and it also does a bunch of default-logic-inserting garbage, and has no tests at all
<fwereade_> wallyworld, but, derail
<wallyworld> fwereade_: there are a lot of tests missing in a few places - the code just needed to land to meet a contractual deadline
<fwereade_> wallyworld, the quick fix is to find another important-looking manifold and see what that does -- it will figure out the agent jobs by some means, probably foul, and exit if it's not wanted
<fwereade_> wallyworld, this is dumb but at least it's consistent
<wallyworld> fwereade_: yeah, i tried to find something i could cargo cult but was not sure - maybe the migration master?
<fwereade_> wallyworld, the correct thing to do would be to extract N JobFlag workers, and wrap the workers that need them in engine.Housings that declare the dependency, which *will* cause the worker not to be started if suitable flags aren't present
<fwereade_> wallyworld, worker/resumer/manifold.go
<wallyworld> ta, looking at that
<fwereade_> wallyworld, clean textbook example of doing it wrong
<jam> frobware: so without a default gateway, is there any way for us to tell that there is a problem? I'd tend to go with Warning if there is any way we can progress
<wallyworld> fwereade_: so what not to do :-)
<fwereade_> wallyworld, the worker shouldn't care where it's running, and it shouldn't hit some crazy different facade to find out the jobs :)
<frobware> jam: it's not clear any progress will really be made. you can get into the container but no route out means not much will get done.
<jam> frobware: sure, but just dying will mean we get restarted anyway
<wallyworld> fwereade_: right. agreed. perhaps there's no current example to look at then?
<fwereade_> wallyworld, if you *were* to extract a JobFlag worker/api/apiserver, and extract all -- or just some -- of the job-munging logic from the worker manifolds, that would be great; and if you wanted to do *that*, see the lifeflag and migrationflag workers, and how they're used in the manifolds funcs
<frobware> jam: but to me it smacks of the config is just so wrong nothing but manual intervention will help
<jam> frobware: so that is entirely likely to be true, but if you stop what you're currently doing, the other infrastructure code will just restart you, I think. Or is this during cloud-init ?
<frobware> jam: generating cloud-init for the container
<fwereade_> wallyworld, lifeflag and migrationflag are very similar, and I think jobflag would be too
<fwereade_> wallyworld, I looked a little while ago and couldn't figure out how to generalise nicely though
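The flag pattern fwereade_ describes can be sketched without juju's real types. The stand-ins below (`Flag`, `jobFlag`, `startIfSet`) are illustrative only, not juju's actual `engine.Flag`/`engine.Housing` API; they just show the core idea that a boolean flag dependency decides whether a worker starts at all, rather than the worker inspecting jobs itself:

```go
package main

import "fmt"

// Flag mirrors the shape of a flag worker: it exposes one boolean,
// e.g. "does this agent have JobManageModel?".
type Flag interface {
	Check() bool
}

type jobFlag struct{ hasJob bool }

func (f jobFlag) Check() bool { return f.hasJob }

// startIfSet only starts the wrapped worker when the flag is set --
// the declarative equivalent of what a Housing-wrapped manifold gets
// from depending on a flag resource.
func startIfSet(f Flag, start func() string) (string, error) {
	if !f.Check() {
		return "", fmt.Errorf("not running: flag unset")
	}
	return start(), nil
}

func main() {
	controller := jobFlag{hasJob: true}
	hosted := jobFlag{hasJob: false}

	out, err := startIfSet(controller, func() string { return "logforwarder started" })
	fmt.Println(out, err)
	out, err = startIfSet(hosted, func() string { return "logforwarder started" })
	fmt.Println(out, err)
}
```

The point of the pattern is that the worker itself never asks where it is running; presence or absence of the flag dependency gates startup.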
<jam> frobware: so... will we even be able to download the tools tarball if we can't setup a gateway?
<jam> is this likely to actually be a problem in practice, or is it just a drive-by thing?
<wallyworld> fwereade_: i'll take a look. realisitcally, i have zero time to spend though, as this log forward work is supposed to be finished in beta 11
<frobware> jam: part of this it's just a plain ol bug from our side (https://bugs.launchpad.net/juju-core/+bug/1602054)
<mup> Bug #1602054: juju deployed lxd containers are missing a default gateway when configured with multiple interfaces <2.0> <network> <regression> <juju-core:In Progress by frobware> <https://launchpad.net/bugs/1602054>
<frobware> jam: we were assuming eth0 would be the route out, but not so if your MAAS network topology ends up with no route out
<frobware> jam: in the general case we will have a default route (well, once the bug is fixed). so I would say if we really really end up with a network config that has no gateway then there's clearly no point continuing.
<frobware> jam: in this particular bug it's not that we didn't have one, we just made wrong assumptions
<jam> frobware: so, I'd counter with the fact that we can set up the agent, and get tools, etc, because they could very easily be in the same subnet.
<jam> so you can actually have a perfectly running simple charm even without a default gateway
<frobware> jam: you're talking about addressable containers. :-D
<frobware> jam, yep, ok.
<jam> the Controller could be in the same subnet, which lets us at least get the agent up, which takes the 'machine' out of pending.
<jam> we could then have the agent tell us "I can't get the charm, etc" but that's better than the machine being stuck in Pending, IMO
 * frobware reverts his lunch. and some of his changes too. :)
<frobware> jam: here's another. two interfaces (eth0 dhcp, eth1 static) but static has a gateway, should we assume the gateway comes from the DHCP lease or should we write the gateway option when rendering eth1?
<perrito666> rogpeppe1: ping
<perrito666> morning all
<jam> frobware: doesn't the entire machine get a default gateway?
<jam> (each interface is likely to have *a* gateway, but there is one global default gateway for the machine)
<babbageclunk> frobware: I have a problem where I can add a machine, and once it starts it is reachable, but then if it gets rebooted the network never comes back up.
<babbageclunk> I've used your trick to remove the password so I can get in through the terminal, but I can't see why the network isn't working.
<frobware> babbageclunk: can we HO in say 20mins?
<frobware> jam: ENI can only have one entry that is 'gateway'
<babbageclunk> frobware: that would be great, thanks
<frobware> jam: the point about DHCP and static (where static has a GW) is which wins?
<frobware> balloons: ^^ I think our 1 hour quick test should first reboot any machine, once bootstrapped, and then run the tests
<jam> frobware: if we actually have that situation, which one wins in routes?
<balloons> frobware, any particular reason?
<balloons> I'm curious how that is more or less representative
<frobware> balloons: it helps verify that the ENI we rewrite doesn't just work the once. If we reboot the machine it helps verify that things are just working because of current state
<frobware> aren't
<balloons> frobware, ok. It's not something we do as part of any test, so we're expanding into something new
<frobware> balloons: sure - just something that has been on my mind for a while as a few people/bugs have mentioned that on reboot some things are not working
<frobware> jam: don't know off-hand. I don't know if you have eth0/dhcp, eth1/static whether ifupdown will apply the latest - I'm guessing so but any DHCP re-lease(sp?) would/could change that
<balloons> frobware, mind if I return a question back at you then? I can't seem to get juju to build from scratch using go get / go install. Am I doing something wrong? It fails to fetch the azure provider and if I workaround that, it fails to build
<frobware> jam: as we're processing and generating ENI in the container we could say that if we've seen a DHCP iface then we've seen that gateway too
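The eth0-dhcp/eth1-static case frobware raises can be made concrete with an illustrative ENI rendering (addresses made up). Per his point that ENI can only carry one effective `gateway` entry, the static stanza omits it and lets the DHCP lease on eth0 supply the default route:

```text
# Illustrative /etc/network/interfaces for the two-interface case.
auto eth0
iface eth0 inet dhcp

auto eth1
iface eth1 inet static
    address 192.168.10.5/24
    # no `gateway` line here: the default route comes from eth0's DHCP lease
```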
<balloons> I run go get -d -v github.com/juju/juju/... then go install -v github.com/juju/juju/...
<frobware> balloons: dependencies need updating?
<frobware> balloons: if building from scratch I have: "rebuild-juju is aliased to `[ -f $PWD/juju/api.go ] && { nukegopkg; make clean; godeps -u dependencies.tsv; make install;}'"
<balloons> frobware, mmm.. possibly. I'll try playing with godeps. The thing is, this is a clean pull, so in theory nothing should be getting in the way
<balloons> frobware, thank you I will try that.. nuking is what I had in mind ;-)
<frobware> balloons: nukegopkg is aliased to `[ -d "$GOPATH/pkg" ] && rm -rf $GOPATH/pkg'
<frobware> babbageclunk: ho?
<babbageclunk> yes
<frobware> babbageclunk: standup HO
<balloons> frobware, ty btw. That worked
<frobware> balloons: possibly related to stale stuff in go/bin, go/pkg
<rogpeppe1> perrito666: pong
<mup> Bug #1600722 changed: MachineSuite.TestHostedModelWorkers is unreliable <intermittent-failure> <tech-debt> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1600722>
<mup> Bug #1604006 opened: BundlesDirSuite.TestGet fails <ci> <intermittent-failure> <jujuqa> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1604006>
<perrito666> rogpeppe1: I just answered via mail :)
<frobware> babbageclunk: "console=ttyS0,115200"
<alexisb> katco, ping
<katco> alexisb: pong
<alexisb> heya katco happy monday
<katco> alexisb: happy monday
<babbageclunk> frobware: I can't get into new vms created using your script - it's like cloud-init isn't running.
<frobware> babbageclunk: want to HO again?
<babbageclunk>  frobware yes please
 * rick_h_ grabs lunchables
<redir> I'm so lagged. I wonder why.
<katco> perrito666: hey, thought of you when i saw this: http://blog.binchen.org/posts/enhance-emacs-git-gutter-with-ivy-mode.html. how is your emacs adventure going?
<perrito666> katco: hijacked by nvim :p
<perrito666> It was my more lasting attempt I must admit
<katco> perrito666: lol
<perrito666> also, got large amounts of pain in my wrist :p
<katco> ah, that's no good
<perrito666> man I really am sorry we did  not use this 1->2 change to make all "Id" appearances into "ID"
<perrito666> It would save me from many complaints by my linter
<mgz> perrito666: did you forget about monsters from the id?
<perrito666> well that was an obscure reference for sure
<perrito666> mm, there is a book that might be a bit more bearable than the movie, I might even read it
<mgz> ...return to the forbidden planet isn't that obscure...
<mgz> though, I have to admit I don't know how popular it is in south america
<natefinch> mgz: most movies from 60 years ago are pretty obscure
<perrito666> the robot I never heard of it and I am as nerd as it gets
<mgz> sure, the movie is old, but there's the musical as well
<mgz> I was the robot in school musical
<natefinch> lol I had no idea there was a musical
<perrito666> mgz: how old are you? and what is a school musical?
<mgz> ..it was less than 60 years ago
<perrito666> i thought school musicals only happened in disney movies
<natefinch> perrito666: a play with songs and music performed by students at school
<natefinch> perrito666: lol, no, they're ubiquitous in the US, at least.
<perrito666> and are all the kids participating unrealistically aesthetic like in disney movies?
<natefinch> no :)
<perrito666> meh, and now you are telling me that they are not pro pop singers either
<mgz> perrito666: just said *I* was in it...
<perrito666> mgz: you are not in the US, plus you might have been a been of uncanny beauty in your youth, then learned how to code
<mgz> eheheh
<perrito666> I was rather good looking and in good shape in high school
<perrito666> then my folks bought a computer
<mgz> you're a good shape now, just a cuddlier one
<perrito666> sadly the only known thing my country has are footballers so people expectations are rather hight
<perrito666> high
<rogpeppe1> perrito666: sorry, took a while, but you around for a chat?
<mgz> rogpeppe1: nope, only for a cuddle
<rogpeppe1> mgz: c'mon then
<mgz> ehehe
<perrito666> rogpeppe1: a sec
<natefinch> rick_h_: so, I'm not sure what to do about the default lxd cloud being called "localhost".  I mean, obviously that's a thing I can change, but I presume it was chosen on purpose, even if I personally think its a horrible name.
<rick_h_> natefinch: so we have the list there, can we just call it lxd in the list
<rick_h_> natefinch: and treat localhost as the auto region
<rick_h_> natefinch: so you get done with a lxd-localhost named entry?
<rick_h_> natefinch: especially since I think we should pull the 'type' column
<natefinch> rick_h_: well, so, the cloud is called localhost.  juju list-clouds
<rick_h_> natefinch: thinking
<natefinch> I can do whatever you want, but then lxd will be inconsistent with the rest that use the actual cloud name
<rick_h_> natefinch: right, but I'm ok with that in the sense that it's the one that doesn't make any sense if you follow the rules
<rick_h_> natefinch: so it's going to be an exception one way or other
<rick_h_> natefinch: so I'm open to suggestions on how best to encode that "special" ness
<rick_h_> natefinch: but feel strongly that it reads well for the lxd case.
<natefinch> rick_h_: so, for clouds that only have one region, let's not put the region in the controller name. It's like, duh, I know that azure-china is gonna be in cn-north-1 given that's the one and only region (which I presume people in china are well aware of).
<natefinch> rick_h_: I think the idea is that the type of the cloud should be much less important than the cloud name.  People shouldn't really care that it's lxd locally, just know it's running on localhost.  At least, that's what it seems like we're going for.
<rick_h_> natefinch: the issue there is that when it comes to getting help/docs not having lxd in the name hampers you
<rick_h_> natefinch: as it won't be obvious for everyone that tries it what it is imo
<natefinch> rick_h_: that's a good point.
<rick_h_> natefinch: for that I think I do want to try to do lxd-localhost and when we get to actually supporting remote lxd it'll be -not-localhost
<rogpeppe1> perrito666: one sec or two? :-)
<perrito666> solving a git conflict, brt
<perrito666> could anyone review http://reviews.vapour.ws/r/5259/ please?
<perrito666> rogpeppe1: ok, ready
<mup> Bug #1604081 opened: lxd out of order in list-clouds <juju-core:New> <https://launchpad.net/bugs/1604081>
<redir> natefinch: rick_h_ I thought the 1 pager wanted the active user's username to be prepended to the default name
<rick_h_> redir: the default name of the cloud?
<redir> rick_h_: the controller
<natefinch> redir: oh yeah, I missed that it was using the logged in user's username... misread it and thought that might be the credential name or something
<redir> rick_h_: I think I see what you mean
<natefinch> in that case, mark-localhost is not so bad
<rick_h_> redir: let's leave it out atm. Right now list-controllers outputs the controllers name, the user, and the cloud/region.
<redir> natefinch: mark-lxd-localhost no?
<rick_h_> adding all that into the name of the controller seems overloaded
<natefinch> redir: the spec calls for username-region
 * redir backs away slowly and goes back to removing users
<natefinch> redir: actually, it conflicts with itself
<rick_h_> this is why I want to bite off the earlier bits
<rick_h_> it's a ton of duplicated info down the layers here
<natefinch> Not sure if it's a good thing or a bad thing that the problem we're having is deciding which strings to concatenate
<rick_h_> natefinch: bad thing, so we're just trying to get through the minimum atm
<rick_h_> natefinch: so we can build up the rest of the experience around it and try to push back on the whole concat all the things
<rick_h_> stokachu: is the maas local file spec up to date with what's the plan?
<stokachu> rick_h_: yea
<mup> Bug #1563936 changed: juju bootstrap azure azure WARNING juju.provider.azure instancetype.go:100 found unknown VM size "Standard_D15_v2" <azure-provider> <bootstrap> <juju-core:Triaged> <https://launchpad.net/bugs/1563936>
<mup> Bug #1604081 changed: lxd out of order in list-clouds <juju-core:Won't Fix> <https://launchpad.net/bugs/1604081>
<perrito666> ok ppl, will be back in about 1.5h
<alexisb> katco, ping
<katco> alexisb: hey
<alexisb> heya katco, I am on the HO
<katco> alexisb: oh... really? i'm sitting in there lol
<katco> alexisb: one sec let me refresh
<natefinch> review comment of an internal method I exported: If this is now exported there should be a test for it.
<natefinch> ...... this is why I test internal methods
<mbruzek> Our partners at IBM pinging me about the bug: https://bugs.launchpad.net/juju-core/+bug/1600311  I see it is listed as incomplete, but I believe Prabakaran has added the information requested.
<mup> Bug #1600311: Juju 2.0 Bootstrap Fails on Ubuntu Trusty Power machine. <juju-core:Incomplete> <https://launchpad.net/bugs/1600311>
<natefinch> mbruzek: oh, we don't support trusty in 2.0 any more
<natefinch> mbruzek: .... kidding!
<mbruzek> natefinch: good one!
<mbruzek> natefinch: And ppc64le no less
<natefinch> mbruzek: it sounds like a lxd bug
<mgz> mbruzek: looks like a apparmour thing?
<mgz> see the second log
<mbruzek> It looks like a lxd bug to me. I asked Prabakaran to show me 'sudo lxc image list' and that worked fine.
<mgz> mbruzek: probably need one of the lxd guys
<mup> Bug #1604106 opened: azure provider 500s are not explained as azure problems <azure-provider> <observability> <reliability> <ui> <juju-core:Triaged> <https://launchpad.net/bugs/1604106>
<mbruzek> mgz: but we had the discussion in #juju-dev
<mbruzek> mgz: I see my logs in that room.
<mgz> yeah, I have it
<mgz> you did somehow have a juju2 package still installed
<mgz> hm...
<mgz> we really need to sort out our ppa packaging too
<mbruzek> mgz: I am happy to help where you need it
<natefinch> rick_h_, redir: I went with this naming scheme: (os-username || "local")-(len(regions)>1 ? region-name : cloud-name)
<rick_h_> natefinch: no local please. We're trying to get rid of the local: and such from the strings.
<natefinch> rick_h_: sure.... what would you recommend?  there's a call os/user.Current() ... which can fail.  I need a replacement standin, in that case
<rick_h_> natefinch: just leave it blank?
<rick_h_> natefinch: skip it vs adding a 'local' to it
<natefinch> rick_h_: ok
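The scheme natefinch and rick_h_ settle on can be sketched as a small function (illustrative only, not juju's actual implementation): prefix with the OS username when one is available, skip the prefix otherwise rather than substituting "local", and use the region name only when the cloud has more than one region:

```go
package main

import "fmt"

// defaultControllerName implements the naming rule from the discussion:
// (username ? username+"-" : "") + (numRegions > 1 ? region : cloud).
// Single-region clouds repeat no information, so they get the cloud name.
func defaultControllerName(username, cloudName, regionName string, numRegions int) string {
	suffix := cloudName
	if numRegions > 1 {
		suffix = regionName
	}
	if username == "" {
		// username unavailable: skip it instead of adding "local"
		return suffix
	}
	return username + "-" + suffix
}

func main() {
	fmt.Println(defaultControllerName("mark", "aws", "us-east-1", 14))     // mark-us-east-1
	fmt.Println(defaultControllerName("", "azure-china", "cn-north-1", 1)) // azure-china
}
```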
<mup> Bug #1604120 opened: new models do not inherit image-metadata-url from bootstrap config <juju-core:New> <https://launchpad.net/bugs/1604120>
<mgz> stokachu: doing a test build now
<mgz> urk... balloons imported the tarball with a - not a ~
<balloons> yep.. I've been down that road, but lp wants its '-'
<mgz> well, the tarball needs that, right
<mgz> but the import should say the version for debian purposes is with a tilde, see my beta10 import
<mgz> anyway, building now
<balloons> mgz, ahh, right, I think I could make that happen
<mgz> yeah, you just specify the revision with a tilde on the import
<mgz> but never mind
<mgz> we need to find some time to just redo this at some point
<redir> gah. My laptop fonts are gone everywhere but chromium after suspend/resume
<mgz> debian-changelog-line-too-long line 9
<mgz> thanks lintian...
<stokachu> mgz: coo
<redir> natefinch: so Y/n in parens (Y/n) but other items in brackets [some-default]?
<natefinch> redir: hmm... it's tricky.  Y/n are options, not a default.
<redir> natefinch: understood, just verifying before I start changing things
<natefinch> redir: I guess parens for now.  at least different behavior is different display, even if both are arbitrary
<redir> natefinch: also a ? about destroy/remove
 * redir puts it in the doc
<redir> for posterity
<natefinch> redir: what about destroy vs remove?
<natefinch> redir: other than it's confusing? :)
<redir> natefinch: asked in doc
<redir> brb reboot and lunchables
<natefinch> redir: responded in the doc.
<mup> Bug #1603584 changed: juju-uitest calls obsolete --show-passwords <ci> <juju-gui> <regression> <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1603584>
<alexisb> wallyworld, thumper can you guys kick off the release standup
<alexisb> I will be htere shortly
<wallyworld> ok
<perrito666> no one should be allowed to leave a job without clarifying all their todos
<mup> Bug #1604176 opened: github.com/juju/juju/worker/reboot timeout on windows <ci> <regression> <timeout> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1604176>
<perrito666> anyone from team network around?
<thumper> perrito666: whazzup?
<perrito666> thumper: sorry I answered myself
<thumper> ok
<mup> Bug #1600311 changed: Juju 2.0 Bootstrap Fails on Ubuntu Trusty Power machine. <juju-core:Invalid> <lxd (Ubuntu):New> <https://launchpad.net/bugs/1600311>
#juju-dev 2016-07-19
<menn0> wallyworld, thumper, axw : I think I've figured out that bootstrap problem
<menn0> the bootstrap command embeds a ModelCommandBase and that's never told the name of the controller that's being bootstrapped
<menn0> so it defaults to the current controller
<menn0> so the correct API details are being written out to controllers.yaml
<menn0> but the details for the other controller are being read out
<menn0> and the other controller doesn't have any details yet
<menn0> trialing a fix now
<menn0> wallyworld, thumper, axw: yep, that was it
 * menn0 cleans up and adds tests
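The failure mode menn0 diagnoses can be sketched with illustrative stand-ins (these are not juju's real types): writes go to the controller being bootstrapped, but because the command was never told its controller name, reads fall through to the "current" controller, which has no API details yet:

```go
package main

import "fmt"

// store is a toy model of the client-side controllers.yaml cache.
type store struct {
	details map[string]string // controller name -> API address
	current string            // the "current controller" fallback
}

func (s *store) apiDetails(explicit string) (string, error) {
	name := explicit
	if name == "" {
		name = s.current // the fallback that caused the bug
	}
	d, ok := s.details[name]
	if !ok {
		return "", fmt.Errorf("controller %q has no API details yet", name)
	}
	return d, nil
}

func main() {
	s := &store{
		details: map[string]string{"new-ctrl": "10.0.0.1:17070"}, // written correctly
		current: "old-ctrl",                                      // other controller, no details
	}
	// Buggy path: no explicit name, so we read the wrong controller.
	_, err := s.apiDetails("")
	fmt.Println(err)
	// Fixed path: bootstrap passes its own controller name.
	d, _ := s.apiDetails("new-ctrl")
	fmt.Println(d)
}
```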
<menn0> wallyworld: can I pls have permission to modify the A team board?
<wallyworld> menn0: sorry was otp, done
<menn0> wallyworld: cheers
<axw> menn0: cool, thanks for fixing
<axw> wallyworld: is this a bit clearer? http://reviews.vapour.ws/r/5205/diff/4-5/
<wallyworld> looking
<wallyworld> axw: awesome, thanks
<axw> wallyworld: are we still JFDIing things, or should I fix https://bugs.launchpad.net/juju-core/+bug/1603596 first?
<mup> Bug #1603596: HA often fails on azure creating virtual machine <azure-provider> <blocker> <ci> <ha> <regression> <juju-core:Triaged by axwalk> <https://launchpad.net/bugs/1603596>
<wallyworld> axw: i think there's more than one critical blocker? if that bug is the last one, would be good to fix that first; if not, what's one more jfdi right?
<axw> wallyworld: just that one and the one you fixed yesterday. I'll take a look at the azure one now
<wallyworld> ok
<wallyworld> ta
<mgz> axw: if you have a bug that is also causing a CI failure and is marked crtical and tagged blocker, edit the bug to have that and use fixes-nnnn
<mgz> *is not marked
<axw> mgz: okey dokey
<axw> mgz: it's not a bug related thing, we were just JFDIing things for a while because of time pressure.
<axw> but no more
<thumper> menn0: oh FFS, master and model-migration conflict again
<thumper> menn0: do you wanna take this one?
<thumper> wallyworld: the CI run for model-migration is mostly good
<thumper> there are a few failures, but I don't think they are new, and certainly not related to migration
<wallyworld> you want to land it
<thumper> so I'm thinking lets just fix the merge conflicts and merge into master
<thumper> wallyworld: thoughts?
<wallyworld> let me cast a second set of eyes on it so we can say we've done due diligence
<thumper> wallyworld: http://reports.vapour.ws/releases/4148
<thumper> wallyworld: compare with the latest curse email on master
<wallyworld> thumper: won't the race test failures be an issue?
<wallyworld> i thought we were gating on those now for master
<thumper> races aren't
<thumper> but joyent races are long known
<thumper> and I should double check that the joyent deps are right
<thumper> because dave fixed some of those
<wallyworld> hmmm, i could have sworn race tests were gating
<wallyworld> other than that, looks ok to merge
<wallyworld> or merge regardless and if i'm wrong it will land
 * thumper sets up a branch
 * thumper runs the tests
<natefinch> anastasiamac: did we have a meeting set up for tonight?
<anastasiamac> natefinch: we did not. we said whenever u and i r available :D
<natefinch> anastasiamac: are you available? :)
<anastasiamac> natefinch: for u, i could be :D
<anastasiamac> natefinch: in 15mins k?
<natefinch> anastasiamac: sure, whenever is good for you.
<anastasiamac> natefinch: \o/
<redir> natefinch: another issue
 * redir puts it in doc
<natefinch> redir: good... better to find them now
<menn0> thumper: yep, I'll take a look
<thumper> menn0: too late
<thumper> done it
<menn0> ok
<thumper> and it is being merged
<menn0> thumper/wallyworld/axw: fix for that concurrent bootstrap issue: http://reviews.vapour.ws/r/5262/
<menn0> easy fix, horrible test
<wallyworld> looking
<menn0> it took me 3 attempts at different approaches to arrive at this
<thumper> ugh
 * thumper leaves for wallyworld
<wallyworld> menn0: jesus, you were not wrong about the test
<menn0> wallyworld: yeah, pretty terrible. I almost thought about not testing the change.
<wallyworld> you are a better man than me :-)
<menn0> ha :)
<menn0> thumper: is model-migration merged into master?
<menn0> looks like it
<thumper> menn0: yep
<menn0> thumper: awesome \o/
<mup> Bug #1604223 opened: Concurrent bootstrap fails with "no API addresses" <bootstrap> <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1604223>
<menn0> axw: only just saw your review and it's already merging. I'll land a separate micro-PR afterwards.
<axw> menn0: it's no big deal
<axw> menn0: feel free to leave it
<menn0> axw: it's ok. your suggestion is good and it's really not much effort.
<menn0> wallyworld, thumper: that bootstrap fix was rejected because it's not one of the blockers. jfdi or wait?
<thumper> JFDI IMO
<wallyworld> menn0: update with new comment too :-)
<menn0> wallyworld: of course :)
<wallyworld> the change is only one line, what could possibly go wrong :-D
<thumper> wallyworld, axw: in the new world of commands, I want to get the modelTag for a modelCmd
<thumper> it isn't yet obvious to me
<wallyworld> there's a CurrentModelUUID somewhere on the base command, i'd need to look it up
<axw> thumper: 1. get the client store (c.ClientStore()); 2. call store.ModelByName(c.ControllerName(), c.ModelName()) to get model details; 3. names.NewModelTag(details.ModelUUID)
<axw> thumper: see cmd/juju/model/show.go
<thumper> axw: ta
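axw's three steps can be sketched with minimal stand-ins for the client store (the real types live in juju's jujuclient and names packages; everything below is illustrative): look up the model by controller/model name, then build the tag from the UUID. juju model tags serialize as `model-<uuid>`, which the sketch assumes:

```go
package main

import "fmt"

// modelDetails stands in for the jujuclient model details struct.
type modelDetails struct{ ModelUUID string }

type clientStore interface {
	ModelByName(controller, model string) (modelDetails, error)
}

// memStore is a toy in-memory client store keyed by "controller/model".
type memStore map[string]modelDetails

func (s memStore) ModelByName(controller, model string) (modelDetails, error) {
	d, ok := s[controller+"/"+model]
	if !ok {
		return modelDetails{}, fmt.Errorf("model %s not found", model)
	}
	return d, nil
}

// modelTag follows the three steps from the conversation:
// get the store, ModelByName, then NewModelTag(details.ModelUUID).
func modelTag(store clientStore, controller, model string) (string, error) {
	details, err := store.ModelByName(controller, model)
	if err != nil {
		return "", err
	}
	return "model-" + details.ModelUUID, nil
}

func main() {
	store := memStore{"ctrl/default": {ModelUUID: "deadbeef-0bad-400d-8000-4b1d0d06f00d"}}
	tag, err := modelTag(store, "ctrl", "default")
	fmt.Println(tag, err)
}
```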
<thumper> ha poop
<thumper> ah poop even
<thumper> I have an old controller lying around
<thumper> which I now can't talk to because missing creds
<thumper> axw: recoverable? or just kill the machines and blow away the cache dir?
<axw> thumper: umm. I think you can get the admin password from the machine? in the agent.conf?
<axw> old password or something?
<thumper> ugh
<thumper> too hard
<thumper> :)
<axw> thumper: not sure. there should be a command to fix it, but we don't have one
<axw> probably easiest to restart
<thumper> um...
<thumper> where is our config?
<thumper> ~/.config? or .cache?
<axw> thumper: ~/.local/share/juju
<thumper> just wondering...
<thumper> but why did we choose there?
<thumper> hmm
<thumper> nm
<axw> thumper: IIRC, things that you need for operation to continue should go in the data dir
<axw> optional config goes in config
 * thumper starts afresh
<anastasiamac> skipping windows tests review, plz :D http://reviews.vapour.ws/r/5263/
<menn0> wallyworld: what's the name you're using for the config inheritance work?
<wallyworld> juju model tree
<wallyworld> or juju model config tree
<menn0> ok, that'll do (it's for a document i'm writing on current model migration status)
<wallyworld> axw: no rush, when you are free, would love a review on http://reviews.vapour.ws/r/5264/
<axw> wallyworld: ok, in a little while
<thumper> yes... new command in under 350 lines http://reviews.vapour.ws/r/5265/diff/#
<thumper> juju dump-model
<thumper> (behind developer-mode feature flag)
 * thumper out now to go make dinner
<thumper> laters
<axw> wallyworld: where abouts do we generate the certs in the lxd provider?
<axw> I got lost in a twisty maze
<wallyworld> axw: there's a juju/tools/lxdclient package
<wallyworld> there's a UseRemoteTCP() method or something like that. it's called when config is finalised
<wallyworld> that method generates client certs, connects to server, and gets back the server cert
<wallyworld> the client certs are based on network interfaces and host name at the time
<wallyworld> so it's implausible to generate ahead of time
<axw> wallyworld: I was thinking we'd only generate them at bootstrap time
<axw> wallyworld: I guess having it in DetectCredentials opens users up to accidentally storing them in credentials.yaml. would be good to auto-generate still though...
<wallyworld> exactly
<wallyworld> but at bootstrap time seems ok i think
<wallyworld> i'd like to do that after the work to properly store credentials
<wallyworld> out of the model
<axw> wallyworld: ok
<wallyworld> for now, the PR as it exists I think adds value
<wallyworld> model-config looks sweet now
<axw> wallyworld: model-config looks good, I'm not sure there's any point in adding the tls auth-type without the implementation though?
<wallyworld> axw: it exists merely to define the attributes to filter out
<axw> ah, right
<axw> wallyworld: reviewed, just a few little things
<wallyworld> axw: ta, will look after I get back from school pickup, heading out now to do that
<wallyworld> axw: i explained the \n thing in the CLI - i could just drop the code since we don't display authorised keys now, but figured it can't hurt to keep it?
<axw> wallyworld: I'm not strongly opposed to it, I just wondered what the reason was - wasn't mentioned in the review description. why drop the leading character?
 * axw looks at comment
<wallyworld> axw: cause i'm an idiot
<wallyworld> typo
<axw> wallyworld: maybe just use strings.TrimSuffix(valString, "\n") then ?
<wallyworld> yeah, that's much better
<anastasiamac> axw: wallyworld: fix of the fix :D http://reviews.vapour.ws/r/5267/
<anastasiamac> for those of u with eagle eyes :D
<anastasiamac> axw: fast \o/ tyvm!!
<mup> Bug #1567708 changed: unit tests fail with mongodb 3.2 <juju-core:Fix Released by 2-xtian> <https://launchpad.net/bugs/1567708>
<frobware> babbageclunk: my kvm-maas scripts now work on 2.0 - please let me know otherwise
<babbageclunk> frobware: wilco, thanks. In other news it looks like the dhcp was the problem.
<frobware> babbageclunk: as in lack of?
<babbageclunk> frobware: yup
<babbageclunk> frobware: just working out what I should be putting into the network info instead now.
<frobware> babbageclunk: want to HO?
<babbageclunk> frobware: yeah, why not?
<babbageclunk> frobware: I'm in juju-sapphire
<perrito666> wallyworld: tx for the overnight merge on controller tag
<wallyworld> perrito666: no worries
 * perrito666 fears the merge with master
<frobware> babbageclunk: http://pastebin.ubuntu.com/20024195/
<frobware> babbageclunk: https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1590104
<mup> Bug #1590104: network config from datasource overrides network config from system <amd64> <apport-bug> <uec-images> <xenial> <cloud-init:Confirmed> <cloud-init (Ubuntu):Fix Released> <cloud-init (Ubuntu Xenial):Confirmed> <cloud-init (Ubuntu Yakkety):Fix Released> <https://launchpad.net/bugs/1590104>
<frobware> babbageclunk: https://bugs.launchpad.net/maas/+bug/1604169
<mup> Bug #1604169: maas login yields "ImportError: No module named 'maasserver'" <MAAS:Confirmed> <https://launchpad.net/bugs/1604169>
<frobware> jam: ^ the maas bug I talked about this morning
<jam> frobware: I'll brt, just need to grab a snack
<frobware> ack
<rogpeppe> anyone know how to recompile a Go stdlib package without rebuilding the whole go install?
<rogpeppe> it used to work but no longer seems to
<perrito666> I believe that at some point we should accept that there is no way to be descriptive enough in a test name (in some cases) and start adding a couple of doc lines in the tests
<fwereade> perrito666, I heartily endorse explanatory comments all over the place
<perrito666> heh, my issue is with tests that usually are called something like TestThisHappensWhenX which is explanatory of what the test does, but not why, which is the important bit
<perrito666> bbl heating people is finally here
<mup> Bug #1604408 opened: LogForwarderSuite.TestConfigChange obtained Next, expected close <ci> <intermittent-failure> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1604408>
<mup> Bug #1604408 changed: LogForwarderSuite.TestConfigChange obtained Next, expected close <ci> <intermittent-failure> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1604408>
<mup> Bug #1604408 opened: LogForwarderSuite.TestConfigChange obtained Next, expected close <ci> <intermittent-failure> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1604408>
<babbageclunk> frobware: Those scripts work an absolute treat!
<frobware> babbageclunk: \o/
<babbageclunk> frobware: I guess I shouldn't be surprised, but it's nice! :)
<frobware> babbageclunk: what we need next is a means of describing what NICs you want connected to which subnets and/or dhcp/manual, et al
<babbageclunk> frobware: The only thing I'm having trouble with is that I don't want to allocate 16GB of disk to each node.
<babbageclunk> frobware: Even if I set VIRT_DISK_SIZE and KVM_DISK_SIZE to 10, I still get 16GB disks.
<babbageclunk> frobware: Ah, worked it out - removing the node doesn't remove the pool, so even though I'm specifying the size for add-node it's reusing the already-created image.
<babbageclunk> frobware: gah, nope - still can't see why the images are always coming up with 16G disks.
<babbageclunk> frobware: ignore, everything's fine now! PICNIC
<frobware> cherylj: fyi, marking this as critical  - https://bugs.launchpad.net/juju-core/+bug/1604482
<mup> Bug #1604482: MAAS bridge script should drop all 'source' stanzas from original file <network> <juju-core:New> <https://launchpad.net/bugs/1604482>
<mup> Bug #1604474 opened: Juju 2.0-beta12  userdata execution fails on Windows if azure-provider is used <azure-provider> <juju2.0> <windows> <juju-core:New> <https://launchpad.net/bugs/1604474>
<mup> Bug #1604482 opened: MAAS bridge script should drop all 'source' stanzas from original file <network> <juju-core:New> <https://launchpad.net/bugs/1604482>
<mup> Bug #1604514 opened: Race in github.com/joyent/gosdc/localservices/cloudapi <blocker> <ci> <joyent-provider> <race-condition> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1604514>
<balloons> can someone approve this? It's just a markdown change -- perhaps I did it wrong. https://github.com/juju/juju/pull/5820
<natefinch> balloons: LGTM
<katco> perrito666: i believe you are OCR. tal at a small change? http://reviews.vapour.ws/r/5270/
<perrito666> I am, and now with heating
<katco> yay :)
<perrito666> sadly my office is the last portion of the water circuit :p
<mgz> balloons: you need to get the keys to reviews.vapour.ws off ericsnow then you could add yourself and your pr would be more visible :)
<balloons> natefinch, do I need to tell the bot to merge then? Not sure of the formal process as I do generally hit it with the comment
<perrito666> balloons: I can merge it for you
<perrito666> balloons: done
<natefinch> well, except it'll fail because master is blocked
<perrito666> not really, just added the $$merge$$ but you will have to watch if that goes through
<perrito666> ahaa
 * perrito666 curses in spanish
 * natefinch adds a JFDI
<balloons> indeed.. I just wasn't sure if the person proposing should tell it to merge, or if the reviewer(s) do
<mgz> balloons: also with that mp^ - you shouldn't be using the juju namespace for merge proposals
<mgz> balloons: we had a spurious CI run because of it
<balloons> mgz, right, I noticed it did a CI run
<mgz> you need your git config to have different upstream and origin
<balloons> I did it on github as I saw the error and thought nothing more of it
<balloons> so I was led astray by gh :-)
<katco> are we still jfding changes?
<mgz> you have to treat owner privs with care or I shall remove them :)
<natefinch> yeah, that's not really his fault.  I'm kind of surprised it let him make a branch there
<mgz> it is surprising the web ui would do that
<natefinch> well, it's made to make it easy to make minor edits to docs.
<natefinch> it's our fault that we automatically run CI on any branch created there
<natefinch> ...and that CI takes so damn long :)
<balloons> working on that last part!
<marcoceppi> production deployment, beta11, AWS: 'listing instances: An internal error has occurred (InternalError)'
<marcoceppi> when trying to provision a machine
<stokachu> marcoceppi: aws seems down
<balloons> stokachu, us east 1 is
<balloons> or was at least
<marcoceppi> stokachu balloons not according to the status anymore
<mup> Bug # opened: 1604542, 1604551, 1604559, 1604561
<stokachu> balloons: yea still is, error retrieving resource count
<stokachu> marcoceppi: aws.amazon.com shows failure retrieving instances
<marcoceppi> stokachu: http://status.aws.amazon.com/
<stokachu> marcoceppi: i see that, but im just telling you what i see
<stokachu> marcoceppi https://usercontent.irccloud-cdn.com/file/nH5BLoCy/
<stokachu> the status is reporting it working
<stokachu> but the console says otherwise
<marcoceppi> stokachu: works for me http://i.imgur.com/fnlaUkl.png
<marcoceppi> so, lets pretend they fixed this, can someone help me troubleshoot that error?
<marcoceppi> rather
<marcoceppi>  'cannot run instances: Request limit exceeded. (RequestLimitExceeded)'
<stokachu> aren't you limited to 20 instances
<stokachu> or smaller?
<marcoceppi> not on this account
<marcoceppi> this is API requests
<stokachu> i see
<marcoceppi> http://docs.aws.amazon.com/AWSEC2/latest/APIReference/query-api-troubleshooting.html#api-request-rate
<mgz> marcoceppi: what region?
<marcoceppi> us-east-1
<marcoceppi> seems to be catching the tail end of the api outage
<marcoceppi> I got a series of other errors about root-tagging
<marcoceppi> and now it's pending again
<mgz> sinzui says that's borked
<mgz> try a different region
<mgz> you can also see some red on our test-cloud-aws-us-east-1 health job
<perrito666> fcs is it really so hard to add docs to public structs?
<natefinch> That's why I love having golint running on save... the little squiggles drive me crazy, so I fix them ASAP.
<perrito666> some people might need for golint to kick them in the face to actually do something about
<perrito666> I would totally love a golint with that feature
<natefinch> haha
<natefinch> perrito666 is taking over Dave's role as grumpy cat, it seems.
<perrito666> oh, you sure are one to talk
<natefinch> lol
<natefinch> well, to be fair, maybe if we both worked at it together, we might be able to almost match dave's grumpiness.
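For reference, the comment shape golint asks for on exported identifiers looks like this (the type and fields are made up, not juju's):

```go
// Doc-comment sketch: golint wants every exported identifier documented
// with a comment that starts with the identifier's name.

package main

import "fmt"

// MachineSpec describes a machine to be provisioned (illustrative type).
type MachineSpec struct {
	// Series is the OS series to deploy, e.g. "xenial".
	Series string
	// Constraints holds free-form placement constraints.
	Constraints string
}

// String implements fmt.Stringer; exported methods need comments too.
func (m MachineSpec) String() string {
	return m.Series + " [" + m.Constraints + "]"
}

func main() {
	fmt.Println(MachineSpec{Series: "xenial", Constraints: "mem=4G"}) // prints: xenial [mem=4G]
}
```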
<perrito666> mm, ok, it would seem that I need thumper to make sure I dont break anything in this
<thumper> wat?
<perrito666> oh, that actually works, sweet
<perrito666> mm, ok, it would seem that I need one million dollars so I dont break anything in this
<perrito666> meh, nothing happend
<natefinch> heh
<perrito666> thumper: I am adding ControllerUser to migrations, but it would seem that migrations are per model?
<thumper> perrito666: yes, migrations are only per model
<thumper> controller user probably doesn't need to be migrated
<perrito666> thumper: ok then controller permissions are not something we will be migrating, that makes sense, since we migrate models
 * perrito666 facepalms
<thumper> aye
 * perrito666 adds a bit of info to that pesky test that tells you that you need to migrate stuff
<perrito666> natefinch: your jfdi did not like windows
<natefinch> perrito666: juju doesn't like windows :/
<perrito666> well bad luck, since we are gating landings on it
<redir> without gates there'd be no windows
<natefinch> *rimshot*
<redir> so if I juju set-model-config default-series=xenial && juju deploy wordpress I should expect wordpress to be deployed on xenial, yes?
<mup> Bug #1604586 opened: devel Charm Docs are out of date and examples do not work <juju-core:New> <https://launchpad.net/bugs/1604586>
<thumper> menn0: http://reviews.vapour.ws/r/5271/diff/#
<menn0> thumper: looking
<thumper> menn0: ta
<perrito666> redir: it'll fail iirc
<redir> tx perrito666 looks like it falls back if the charms doesn't support the series
<redir> but I have another error: ERROR storing charm for URL "cs:trusty/wordpress-4": delegatable macaroon cannot be obtained for public entities
<redir> think I must have something twisted.
<redir> like my understanding
<menn0> thumper: done
<thumper> menn0: as reviewer, here's another http://reviews.vapour.ws/r/5265/diff/#
<menn0> thumper: looking
<thumper> menn0: ta
<rick_h_> katco: can you add the link to your pr to the card on the board please?
<rick_h_> natefinch: you around to be able to give that a look over?
<katco> rick_h_: oh sorry, sure
<rick_h_> katco: np, ty much
<rick_h_> katco: there's a new bug added there but post the provider doc work just as a heads up, will cover in standup tomorrow
<katco> rick_h_: k
<katco> wallyworld: hey i see your comment in my PR for the config change... this change isn't in environs/config.go. should i still be writing unit tests there?
<wallyworld> katco: it's a bit messed up. the controller config was being parsed by environs/config.go since at bootstrap we handle a single bucket of config items which are then split up. i think that's still the case
<perrito666> does anyone know what goes on the depends on field on rb to actually have a dependency?
<wallyworld> katco: so as with log forward on/off, i think we can just extend the environs/config tests
<redir> perrito666: interface{}
<perrito666> redir: of reviewboard, sorry
<perrito666> missed that
<redir> perrito666: i got that, just a bad joke
<thumper> menn0: and another http://reviews.vapour.ws/r/5273/
<rick_h_> axw: wallyworld folks arriving on the call
<menn0> thumper: done and done. reboot worker one to go.
<mup> Bug #1603577 changed: backup-restore: panic: empty value for "api-port" found in configuration <backup-restore> <blocker> <ci> <regression> <juju-core:Fix Released by wallyworld> <https://launchpad.net/bugs/1603577>
<menn0> coffee first
<thumper> menn0: the reboot one is just a backport
<thumper> doesn't need its own review IMO
 * perrito666 would be a bit happier if we stopped passing State around as if it were a hot potato and started passing a reduced interface of state
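A sketch of what perrito666 is asking for: declare the few methods a consumer actually calls as a small interface, instead of taking a whole *state.State (method names here are illustrative, not juju's actual state API):

```go
package main

import "fmt"

// ModelReader is a reduced view of the hot potato: a consumer declares
// only the methods it calls, not the whole *state.State.
type ModelReader interface {
	ModelUUID() string
	ModelConfig() (map[string]interface{}, error)
}

// CountSettings needs read access only, so it asks for ModelReader;
// anything with these two methods satisfies it -- including test doubles.
func CountSettings(m ModelReader) (int, error) {
	cfg, err := m.ModelConfig()
	if err != nil {
		return 0, err
	}
	return len(cfg), nil
}

// fakeModel shows how cheap a narrow interface is to stub in tests.
type fakeModel struct{ cfg map[string]interface{} }

func (f fakeModel) ModelUUID() string { return "deadbeef-0bad-400d-8000-000000000000" }

func (f fakeModel) ModelConfig() (map[string]interface{}, error) { return f.cfg, nil }

func main() {
	n, _ := CountSettings(fakeModel{cfg: map[string]interface{}{"name": "m", "series": "xenial"}})
	fmt.Println(n) // prints: 2
}
```

The payoff is exactly the testability point: no JujuConnSuite-style setup is needed to exercise CountSettings.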
<menn0> thumper: ok cool. I thought it sounded familiar :)
<thumper> hmm...
<thumper> merge bot is rejecting a 1.25 merge, but juju.fail doesn't show any blockers
<thumper> ah... a private bug
<thumper> hence juju.fail doesn't see it
<menn0> wallyworld: any idea on the status of this: http://reviews.vapour.ws/r/5173/
<thumper> ugh... none of the core description stuff has any json serialization tags
<thumper> perhaps we should add some
<wallyworld> looking
<wallyworld> menn0: it can be discarded for now, i'll do that
<menn0> wallyworld: cheers
<alexisb> wallyworld, thumper do you guys have time to meet before the standup?
<thumper> yeah
<alexisb> I will jump on the hangout
<wallyworld> alexisb: can do, i have a 1:1 with redir, hopefully he will be ok if we do that later
<alexisb> wallyworld, it is not super urgent
<redir> wfm
<menn0> thumper: I just went through the review checklists with http://reviews.vapour.ws/r/5265/. It wasn't too painful and I found a bunch more stuff.
<anastasiamac> menn0: how r we maintaing review checklist? is it part of our wiki? or juju/juju/docs?
<anastasiamac> maintaining*?
<menn0> anastasiamac: it's on the wiki. there will be an official announcement this week about us using it soon from alexisb and rick_h_.
<menn0> anastasiamac: https://github.com/juju/juju/wiki/Code-Review-Checklists
<menn0> anastasiamac: I just thought I'd get some actual experience with using it
<alexisb> menn0, yeah we will try to get that out tomorrow
<anastasiamac> menn0: so for item "Are there unit tests with reasonable coverage?"
<anastasiamac> i think it should be coverage specific to what PR addresses
<anastasiamac> not general
<anastasiamac> it's not responsibility of an individual PR to ensure correct test coverage
<anastasiamac> PRs should b focused
<menn0> anastasiamac: sure. all the items related to the work being reviewed, not the code around it
<anastasiamac> menno my hero :D
<menn0> anastasiamac: we can make that clearer in the guidance at the top
<anastasiamac> menn0: \o/
<axw> wallyworld: seen https://bugs.launchpad.net/juju-core/+bug/1604561 ?
<mup> Bug #1604561: restoreSuite.TestRestoreReboostrapBuiltInProvider map ordering wrong <blocker> <ci> <regression> <unit-tests> <juju-core:Triaged by wallyworld> <https://launchpad.net/bugs/1604561>
<axw> oh it's assigned, I guess so
<wallyworld> yep :-)
<alexisb> menn0, https://hangouts.google.com/hangouts/_/canonical.com/menno-alexis
<perrito666> wallyworld: ill go have dinner then ping you re syncing about acls\
<wallyworld> perrito666: ok
<wallyworld> redir: there now
<redir> wallyworld: omw
#juju-dev 2016-07-20
<thumper> dog walk, then addressing review comments...
 * thumper afk for a bit
<redir> manana juju-dev
<wallyworld> menn0: a small one, fixes 2 blockers, when you have a moment http://reviews.vapour.ws/r/5276/
<menn0> wallyworld: looking
<menn0> wallyworld: ship it
<wallyworld> menn0: ta
<anastasiamac> menn0: axw beat u to it too :D
<menn0> wallyworld: I think I've figured out what's going on with https://bugs.launchpad.net/juju-core/+bug/1604514
<mup> Bug #1604514: Race in github.com/joyent/gosdc/localservices/cloudapi <blocker> <ci> <joyent-provider> <race-condition> <regression> <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1604514>
<menn0> it's certainly not a new issue
<menn0> and I really don't think it should be a blocker
<wallyworld> yeah, i'd be surprised if it were
<menn0> I think the problem is that the joyent provider destroys machines in parallel
<wallyworld> it's not a regression
<wallyworld> i'm surprised it was marked as such
<menn0> but the joyent API test double isn't safe to access concurrently
<wallyworld> sounds plausible
<menn0> the correct place to fix it is in the test double but that's not our code
<wallyworld> yep, i think we can unmark as a blocker and figure out what to do from there
<wallyworld> we may need to pull in that external code, as i doubt we will get it fixed upstream
<menn0> wallyworld: ok, i'll update the ticket so it's no longer blocking
<menn0> wallyworld: and then I'll poke it some more to see if I can figure out a fix
<menn0> wallyworld: I can /occasionally/ reproduce the race if I use dave's stress script
<wallyworld> maybe there's a work around in the non test code, but would be better to fix upstream i guess
<stokachu> menn0: im still seeing https://bugs.launchpad.net/juju-core/+bug/1604644
<mup> Bug #1604644: juju2beta12: E11000 duplicate key error collection: juju.txns.stash <blocker> <conjure> <mongodb> <juju-core:Triaged> <https://launchpad.net/bugs/1604644>
<stokachu> just fyi
<menn0> stokachu: that's the issue xtian was looking at
<stokachu> menn0: this one was https://bugs.launchpad.net/bugs/1593828
<mup> Bug #1593828: cannot assign unit E11000 duplicate key error collection: juju.txns.stash <ci> <conjure> <deploy> <intermittent-failure> <oil> <oil-2.0> <juju-core:Fix Released by 2-xtian> <https://launchpad.net/bugs/1593828>
<stokachu> and it was marked fixed
<menn0> stokachu: they're the same issue (dup)
<menn0> stokachu: which version of Juju are you using? I think it was only fixed very recently (not sure exactly when though)
<stokachu> menn0: correct, i opened a new issue as the previous version was marked fixed release
<stokachu> Bug #1604644: juju2beta12: E11000 duplicate key error collection: juju.txns.stash
<mup> Bug #1604644: juju2beta12: E11000 duplicate key error collection: juju.txns.stash <blocker> <conjure> <mongodb> <juju-core:Triaged> <https://launchpad.net/bugs/1604644>
<stokachu> juju beta 12
 * menn0 checks when the fix went in
<stokachu> beta12 lol
<thumper> menn0: perhaps the patch approach didn't work?
<mup> Bug #1589471 changed: Mongo cannot resume transaction <canonical-bootstack> <juju-core:Invalid> <https://launchpad.net/bugs/1589471>
<menn0> stokachu, thumper: nope the fix didn't make beta12
<mup> Bug #1604641 opened: restore-backup fails when attempting to 'replay oplog' again <backup-restore> <blocker> <ci> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1604641>
<mup> Bug #1604644 opened: juju2beta12: E11000 duplicate key error collection: juju.txns.stash <blocker> <conjure> <mongodb> <juju-core:Triaged> <https://launchpad.net/bugs/1604644>
<stokachu> lmao
<stokachu> it got mark fixed release
<menn0> the fix is here: 99cb2d1c148f5ed1d246bf4fe44064363226e12e (Jul 15)
<menn0> it's not in beta12
<stokachu> menn0: can you update that bug with your findings
<stokachu> 1604644
<menn0> stokachu: will do. shall I also mark it as a dup of the other one?
<stokachu> menn0: the other bug is already mark fixed released
<thumper> menn0: I thought the patch was applied to the top of our mgo branch
<stokachu> i think we should leave that one alone and work off this new one
<thumper> menn0: check with mgz and sinzui
<thumper> and balloons I suppose
<stokachu> sinzui: ^ they are saying it didnt make it in
<stokachu> make it in beta12
<menn0> thumper: no, it looks like we copied in a fixed version of mgo's upsert code into juju
<menn0> ah crap... chrome crash
<menn0> thumper: oh never mind, you're right we patch over mgo in the build
<menn0> thumper: at any rate, that change isn't in beta12
<thumper> Which was the release we just did? It should be in that
<thumper> if stokachu is building from source, he won't have it
<stokachu> this is from the ppa
<thumper> hmm...
<thumper> that should have the fix
<menn0> thumper: the latest tag in git is "juju-2.0-beta12"
<menn0> the fix is 99cb2d1c148f5ed1d246bf4fe44064363226e12e
<menn0> when I check out the tag, the fix isn't there
<menn0> when I check out master, it is
<thumper> ugh
<stokachu> im guessing a one-off was done for this issue
<stokachu> ?
<menn0> perhaps there was some miscommunication about when the release was ok to cut
<lazyPower> booo, that was in the release notes too
<lazyPower> mgo package update that retries upserts that fail with "duplicate key error" lp1593828
<lazyPower> speaking of o/ hey core team :
<lazyPower> :)
<stokachu> so we're sure that fix isn't in beta 12 from the ppa?
<stokachu> because it's also uploaded to the archive :)
<menn0> stokachu: pretty sure. the release tag is there in git, and the fix isn't part of that release.
<stokachu> menn0: ok, if you don't mind updating that bug so i can follow up with balloons/mgz in the morning
<menn0> awesome :(
<menn0> stokachu: will do. i'll poke xtian too so he's in the loop
<stokachu> menn0: ok cool thanks a bunch
<sinzui> menn0: thumper: The patch was added to the juju tree, and the script that makes the tar file applies it. that is the hack that mgz put together
<thumper> sinzui: looks like something didn't take though
<thumper> o/ lazyPower
<axw> wallyworld: I'm planning to add this to the cloud package: http://paste.ubuntu.com/20129296/. one of those will be present in a new environs.OpenParams struct. sound sane?
<menn0> sinzui: it looks like the rev didn't make the cut of the release.
<axw> sound/look
<sinzui> Yeah, that is a bad way to deliver a fix
<wallyworld> axw: loking
<menn0> sinzui: what to do now?
<axw> wallyworld: open to suggestions for a better name also
<sinzui> menn0: I have no idea. I think godeps should define the repo and rev. Other wise we continue to maintain the patch in the tree and apply it each time the tar file is made
<menn0> sinzui: the immediate problem is that beta12 didn't include the fix at all. the revision with the fix was committed *after* beta12 was cut.
<menn0> sinzui: the mgo patch doesn't exist in beta12
<stokachu> we should amend the release notes and set the fix for beta13
<wallyworld> axw: i don't think that struct belongs in cloud - it's an amalgamation of things used for an environ and should really belong in there
<wallyworld> and then it could be called CloudSpec
<wallyworld> or something
<sinzui> menn0: I cannot help at this point. The release was started, we aborted and tried again.
<stokachu> so can the mgo fix be pulled into godeps now?
<stokachu> what was the reason for applying the fix during the tarball build
<thumper> anastasiamac: while you are doing virt-type fixes, core/description/constraints_test.go:25, the virt type needs to be added to the allArgs func
<axw> wallyworld: yeah ok, that's what I had to start with. issue is how to then make State implement EnvironConfigGetter. I think I'll have to define a type outside of the state package that adapts it to that interface
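The adapter axw describes could look like this: a shim defined outside the state package that presents state's richer methods as the narrow getter interface environs wants (all names below are hypothetical):

```go
package main

import "fmt"

// cloudSpec is a stand-in for the struct being discussed.
type cloudSpec struct {
	Type, Name, Region string
}

// environConfigGetter is the narrow interface environs code would
// depend on (shape is illustrative).
type environConfigGetter interface {
	CloudSpec() (cloudSpec, error)
}

// fakeState stands in for *state.State, whose methods have different
// shapes from what environs wants.
type fakeState struct {
	cloud, region string
}

func (s *fakeState) CloudInfo() (string, string, error) { return s.cloud, s.region, nil }

// stateShim adapts fakeState to environConfigGetter; because it lives
// outside both packages, state never has to import environs.
type stateShim struct{ st *fakeState }

func (s stateShim) CloudSpec() (cloudSpec, error) {
	cloud, region, err := s.st.CloudInfo()
	if err != nil {
		return cloudSpec{}, err
	}
	return cloudSpec{Type: "ec2", Name: cloud, Region: region}, nil
}

func main() {
	var g environConfigGetter = stateShim{st: &fakeState{cloud: "aws", region: "us-east-1"}}
	spec, _ := g.CloudSpec()
	fmt.Println(spec.Name, spec.Region) // prints: aws us-east-1
}
```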
<wallyworld> stokachu: the reason was we don't control upstream and we could not get the fix landed for us to use
<wallyworld> so we were forced to adopt a solution where the change was patched in as part of the build
<stokachu> wallyworld: so the fix was pulled in before the PR was accepted?
<thumper> stokachu: more complicated than that...
<stokachu> ah ok
<stokachu> just trying to understand
<thumper> related to golang, imports and the mgo release process
<wallyworld> stokachu: no, the upstream PR was unaccepted but it was landed in an unstable v2 branch which we could not use directly
<wallyworld> it's all a mess
<thumper> s/unaccepted/accepted/
<stokachu> ok, but the status in master is it is now part of the tree?
<wallyworld> no :-(
<thumper> kinda
<wallyworld> not that i am aware of
<thumper> but poorly
<thumper> wallyworld: it is in a patch...
<thumper> in the tree
<thumper> ick
<stokachu> how do you guys do it, this makes my head hurt
<wallyworld> sure, but unless you apply the patch manually....
<thumper> yes
<wallyworld> mine too
<thumper> stokachu: many years of built up resistance
<lazyPower> stokachu - i'm going to say copious amounts of beer and callous to shenanigans
 * thumper goes to put the kettle on
<stokachu> thumper: lol, you guys will lead the zombie resistance
<stokachu> lazyPower: :D
<thumper> I for one await the zombie apocalypse
<menn0> stokachu: this is partially due to the way Go handles imports
<lazyPower> I never trusted go imports
<stokachu> ok so not as simple as placing the git rev in the Godeps stuff
<menn0> stokachu: b/c mgo is imported all over the place across multiple repos, if we want to fork it, we would have to change *everything*
<menn0> stokachu: no, b/c the fix got accepted into mgo's unstable branch, but isn't yet in the stable branch
<stokachu> ah i see
<stokachu> gotcha, i didnt realize it was never in the stable branch
<menn0> stokachu: we *could* use the unstable branch, but that pulls in a bunch of other stuff we don't really want
<stokachu> understood
<lazyPower> doesn't that mean its going to wind up landing in stable and pull in that bunch of other stuff eventually?
 * lazyPower is showing his ineptitude at golang
<natefinch> the whole "unstable" thing in the import path just seems like a bad idea.  Either make it a new version or don't.  If you want to mark it as unstable, do so in the readme.
 * thumper notes that we are still using charm.v6-unstable
<natefinch> yep
<natefinch> dumb idea
<natefinch> instead of having to go change all the imports once when we move to a new version, we have to do it twice.  Assuming we ever actually bother to rename it from unstable.
<menn0> wallyworld: fix for the joyent race: http://reviews.vapour.ws/r/5277/
<wallyworld> looking
<wallyworld> menn0: lgtm
<menn0> wallyworld: thansk
<menn0> thanks even
<menn0> wallyworld: backport to 1.25 as well/
<menn0> ?
<wallyworld> menn0: um, it's such a simple fix, why not
<menn0> wallyworld: ok
<wallyworld> might get a bless more often than twice a year
<thumper> menn0: re dump-model review, and See Also, I copied it from elsewhere...
<thumper> I did think it was strange
 * thumper looks for a good example
<thumper> menn0: updated http://reviews.vapour.ws/r/5265/
<thumper> added a few drive by fixes for "See also:" formatting, made consistent with juju switch
<thumper> menn0: made the apiserver side a bulk call, client api still single
<thumper> added client side formatting
<menn0> thumper: looking. I wasn't really suggesting that you had to do the bulk API work given the rest of the facade but great that you did anyway :)
<menn0> thumper: "See also" is already quite inconsistent between commands
<menn0> sigh
<thumper> I thought that switch was most likely to be right
<thumper> I looked at quite a few
<menn0> thumper: oh hang on... you fixed them all!
<thumper> and picked the resulting style
<menn0> thumper: nice
<thumper> well, in that package
<menn0> thumper: ship it!
<thumper> menn0: ta
<babbageclunk> menn0: D'oh.
<frobware> dooferlad: ping
<dooferlad> frobware: hi
<frobware> dooferlad: any change we can meet now?
<frobware> chance
<dooferlad> frobware: need 5 mins
<frobware> dooferlad: I have a plumber arriving in ~30 mins which is likely to clash with our 1:1
<frobware> dooferlad: ok
<babbageclunk> menn0: ping?
<menn0> babbageclunk: howdy... i'm in the tech board call atm. talk after?
<babbageclunk> menn0: cool cool
<menn0> babbageclunk: hey, done now
<babbageclunk> menn0: Sorry, in standup.
<wallyworld> fwereade: in prep for some work, i have needed to move model config get/set/unset off client facade to their own new facade, so essentially a copy of stuff and a bit of boiler plate for backwards compat until gui is updated. would love a review at your leisure so i can land when CI is unblocked http://reviews.vapour.ws/r/5279/
<wallyworld> i also removed jujuconnsuite tests \o/
<menn0> babbageclunk: np, I'll hang around for a bit.
<fwereade> wallyworld, ack, thanks
<fwereade> wallyworld, I presume: s/have needed to/gladly took the opportunity to/ ;p
<wallyworld> fwereade: that too, but also a need
<wallyworld> :)
<babbageclunk> menn0: Sorry, rambling discussion about godeps and vendoring. Nearly done.
<menn0> babbageclunk: sounds like a repeat of the tech board meeting :)
<babbageclunk> menn0: quite
<babbageclunk> menn0: ok, done
<babbageclunk> menn0: did you manage to reproduce stokachu's problem?
<babbageclunk> menn0: sorry, I mean, has anyone had a chance to reproduce it?
<menn0> babbageclunk: nope. I gave stokachu a rebuild of 2.0-beta12 which definitely had the patch applied.
<babbageclunk> menn0: And does he see it with that?
<menn0> babbageclunk: he was going to try it out and see if the problem happened with that as he's able to make it happen fairly reliably.
<menn0> babbageclunk: I don't know. He never got back to me. I think it was quite late for him at the time.
<menn0> babbageclunk: he was going to report back on the ticket but hasn't yet.
<babbageclunk> menn0: Ok, cool - I had a go with a checkout of the right commit and the patch applied, but no luck yet - not sure which bundle to use.
<menn0> babbageclunk:
<menn0> babbageclunk: my goal was to establish whether or not the patch made it into the release or not
<menn0> (and whether or not it worked)
<menn0> babbageclunk: I imagine we'll hear back from stokachu when he starts work again
<babbageclunk> menn0: Also not sure whether my laptop has enough oomph to cause the contention needed.
<menn0> babbageclunk: it seems like there was some process failure when the official beta12 was produced so I'm not ruling out that the patch didn't actually make it into the release
<babbageclunk> menn0: Yeah, it was a bit crazy.
<menn0> babbageclunk: stokachu said he could make the problem happen quite often with just using add-model and destroy-model
<menn0> I'm not sure how hard he was really pushing things
<babbageclunk> menn0: Ok, I'll try that a few more times. The hadoop-spark-zeppelin bundle really squishes my machine. It's pretty cool.
<menn0> babbageclunk: I guess you could try making the problem happen with a juju that's built without the patch
<menn0> and when you have a reliable way of triggering the problem
<menn0> rebuild with the patch and see if it goes away
<babbageclunk> menn0: Well, I'm more concerned that the 5-retry thing just made it a bit less likely, but not really better.
<menn0> or, you could hold off and do something else until we hear more from the QA peeps and stokachu
<babbageclunk> menn0: I'll give it a couple more kicks and then get in touch with the US peeps.
<menn0> you would think 5 would be enough...
<babbageclunk> I would and did!
<menn0> maybe a random short sleep between each loop would help?
<menn0> ethernet style
<babbageclunk> Yeah, could help - want to be sure it's happening first though.
<menn0> for sure... need more info
<babbageclunk> amusing - the test that was originally causing the problem in tests has been deleted.
<babbageclunk> I mean, in our suite.
<menn0> babbageclunk: for unrelated reasons?
<babbageclunk> yeah, because address picking has been removed.
<menn0> ha funny... still needs to be fixed of course
<menn0> babbageclunk: I've got to go. I've got a literal mountain of washing to contend with.
<babbageclunk> menn0: ok, thanks. Happy climbing!
<mup> Bug #1604785 opened: repeatedly getting rsyslogd-2078 on node#0 /var/log/syslog <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1604785>
<mup> Bug #1604787 opened: juju agents trying to log to 192.168.122.1:6514 (virbr0 IP) <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1604787>
<frankban> cherylj: hey morning, could you please merge trivial http://reviews.vapour.ws/r/5280/ ?
<cherylj> frankban: sure
<frankban> cherylj: ty!
<mup> Bug #1598272 changed: LogStreamIntSuite.TestFullRequest sometimes fails <ci> <intermittent-failure> <test-failure> <juju-core:Fix Released by fwereade> <https://launchpad.net/bugs/1598272>
<stokachu> babbageclunk: retrying to reproduce this morning, was late last night for me
<perrito666> morning all
<frankban> cherylj: how do I check what failed at /var/lib/jenkins/workspace/github-merge-juju/artifacts/trusty-err.log ?
<frankban> cherylj: sorry, at http://juju-ci.vapour.ws:8080/job/github-merge-juju/8475/console
<cherylj> frankban: I've pinged mgz to take a look.  I think it's a merge job failure
<frankban> cherylj: ty
<mup> Bug # changed: 1603596, 1604176, 1604408, 1604561, 1604644
<perrito666> wallyworld: go to sleep?
<wallyworld> ok, about that time
<mup> Bug #1604817 opened: Race in github.com/juju/juju/featuretests <blocker> <ci> <intermittent-failure> <race-condition> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1604817>
<natefinch> wallyworld: if you he 2 minutes, I'd love it if you could just read and maybe quickly respond to a couple review comments I have: http://reviews.vapour.ws/r/5238/
<natefinch> s/he/have
<wallyworld> ok
<wallyworld> natefinch: done
<natefinch> wallyworld: thanks
<natefinch> hey, we're down to just two blocking tests in master, awesome
<natefinch> (sorta)
<babbageclunk> fwereade: ping?
<frankban> cherylj: should I try merge again?
<fwereade> babbageclunk, pong
<fwereade> babbageclunk, what can I do for you?
<babbageclunk> fwereade: I'm trying to understand the relationship between container and machine provisioners.
<babbageclunk> fwereade: Sorry, environ provisioners
<babbageclunk> fwereade: (looking at bug 1585878)
<mup> Bug #1585878: Removing a container does not remove the underlying MAAS device representing the container unless the host is also removed. <2.0> <hours> <maas-provider> <network> <reliability> <juju-core:Triaged by 2-xtian> <https://launchpad.net/bugs/1585878>
<fwereade> babbageclunk, at the heart of a provisioner there is a simple idea: watch the machines and StartInstance/StopInstance in response
<fwereade> babbageclunk, I think that's called ProvisionerTask?
<babbageclunk> fwereade: yup, and it's the same between the environ and container provisioners.
<babbageclunk> fwereade: but with different brokers, I think.
<fwereade> babbageclunk, yeah, exactly
<babbageclunk> fwereade: So it looks like the environ provisioner explicitly excludes containers from the things it watches
<fwereade> babbageclunk, ultimately we *should* be able to just start each of them with a broker, an api facade, and knowledge of what set of machines they should watch
<fwereade> babbageclunk, yeah, that should be encapsulated in what it watches
<fwereade> babbageclunk, I expect they actually make different watch calls or something, though? :(
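fwereade's "simple idea" can be sketched as one task loop parameterised by a broker; the environ and container provisioners differ only in the Broker plugged in (method shapes are simplified from juju's real interfaces):

```go
package main

import "fmt"

// Broker abstracts whatever actually creates and destroys instances;
// each provisioner supplies its own implementation.
type Broker interface {
	StartInstance(machineID string) error
	StopInstance(machineID string) error
}

// provisionerTask reacts to machine lifecycle changes: alive machines
// get instances started, dead ones get their instances stopped.
type provisionerTask struct {
	broker Broker
}

func (t *provisionerTask) process(changes map[string]bool) error {
	for id, alive := range changes {
		var err error
		if alive {
			err = t.broker.StartInstance(id)
		} else {
			err = t.broker.StopInstance(id)
		}
		if err != nil {
			return err
		}
	}
	return nil
}

// logBroker is a stand-in broker that just counts calls.
type logBroker struct{ started, stopped int }

func (b *logBroker) StartInstance(id string) error { b.started++; return nil }

func (b *logBroker) StopInstance(id string) error { b.stopped++; return nil }

func main() {
	b := &logBroker{}
	task := &provisionerTask{broker: b}
	_ = task.process(map[string]bool{"0": true, "0/lxd/1": true, "2": false})
	fmt.Println(b.started, b.stopped) // prints: 2 1
}
```

Which machines each provisioner sees (containers vs hosts) is then purely a property of what it watches, as fwereade says.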
<babbageclunk> fwereade: Ok - in the maas case I need to tell maas the container's gone away after getting rid of it.
<cherylj> frankban: yes, looks like one PR went through, so something's working...
<cherylj> frankban: so I'd retry
<frankban> cherylj: retrying
<fwereade> babbageclunk, ha, ok, let me think
<babbageclunk> fwereade: Until I started saying this, I thought that the container broker didn't talk to the environ, but now I think that's wrong - it needs to tell it when it starts, right?
<fwereade> babbageclunk, I am confident that a container provisioner should *not* talk to the environ directly, because that would entail distributing environ creds to every machine
<babbageclunk> fwereade: Ok, that makes sense. So in order to clean up the maas record of the container, the environ provisioner would also need to watch containers, right?
<babbageclunk> I should trace the start path so I can see where maas gets told about the container.
<fwereade> babbageclunk, I would be most inclined to have a separate instance-cleanup worker on the controllers, fed by provisioners leaving messages (directly or indirectly) on instance destruction
<babbageclunk> fwereade: leaving messages how? Files?
<fwereade> babbageclunk, db docs?
<babbageclunk> fwereade: oh, duh
<fwereade> babbageclunk, ;p
<fwereade> babbageclunk, there is a general problem with having all-the-necessary-stuff set up before a provisioner sees a machine to try to deploy
<babbageclunk> fwereade: ok, so the container provisioner creates a record indicating that it killed a container, and a controller-based worker watches those and does the environ-level cleanup.
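[editor's note: the "leave a db doc, let a controller worker clean it up" flow summarised above might look like this; the doc shape and names are invented for illustration, with an in-memory store standing in for the mongo collection]

```go
package main

import "fmt"

// cleanupDoc is a hypothetical record a provisioner leaves behind when it
// stops an instance that still needs provider-level cleanup (e.g. removing
// the MAAS device that represented a container).
type cleanupDoc struct {
	InstanceID string
	Done       bool
}

// store stands in for the db collection the real worker would watch.
type store struct{ docs []*cleanupDoc }

func (s *store) add(id string) { s.docs = append(s.docs, &cleanupDoc{InstanceID: id}) }

// pending returns the docs not yet cleaned up.
func (s *store) pending() []*cleanupDoc {
	var out []*cleanupDoc
	for _, d := range s.docs {
		if !d.Done {
			out = append(out, d)
		}
	}
	return out
}

// cleanupWorker drains pending docs, doing the environ-level cleanup for
// each; a failed cleanup leaves its doc in place to be retried later.
func cleanupWorker(s *store, clean func(string) error) {
	for _, d := range s.pending() {
		if err := clean(d.InstanceID); err != nil {
			continue // retry on the next watch event
		}
		d.Done = true
	}
}

func main() {
	s := &store{}
	s.add("maas-device-42")
	cleanupWorker(s, func(id string) error {
		fmt.Println("cleaned", id)
		return nil
	})
	fmt.Println("pending:", len(s.pending()))
}
```

The point of the separation is that only this controller-side worker needs environ credentials; the container provisioners just write docs.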
<fwereade> babbageclunk, trying to set up networks etc in the provisioner is wilful SRP violation -- but I think we do have a PrepareContainerNetworking (or something) call that the provisioner task makes
<babbageclunk> fwereade: ok
<fwereade> babbageclunk, yeah, I would be grateful if we would cast it in terms that applied to machines and containers both, and didn't distinguish between them except in the worker that actually handles them
<babbageclunk> fwereade: so that's in the environ provisioner - it talks to the provider.
<fwereade> babbageclunk, I don't think any provisioner should be responsible for doing this work, I think it should be a separate instance-cleanup worker
<babbageclunk> fwereade: (oops, that was in response to the prev)
<fwereade> babbageclunk, yuck :)
<babbageclunk> fwereade: Ok - so the provisioner task would just say "this instance needs cleaning"...
<babbageclunk> fwereade: and then the new worker would see all of them and just do stuff for the containers for now.
<fwereade> babbageclunk, so, really, *that* should be happening in an instance-preparer worker, which creates tokens watched by the appropriate provisioner, which can then only try to start instances that have all their deps ready
<fwereade> babbageclunk, yeah, I think so
<fwereade> (I refer, above, to the instance-prep work currently done by the provisioner, not to what you just said, which I agree with)
<babbageclunk> fwereade: right, I was just going to check that.
<babbageclunk> fwereade: sounds good, thanks!
<fwereade> babbageclunk, note that there's an environ-tracker manifold available on environ manager machines already, it gets you a shared environ that's updated in the background, you don't need to dirty your worker up with those concerns
<babbageclunk> fwereade: ok, I'll make sure to base my worker on that.
<fwereade> babbageclunk, and it is called "environ-tracker", set up in agent/model.Manifolds
<fwereade> babbageclunk, just use it as a dependency and focus the worker around the watch/response loop
<fwereade> babbageclunk, you should then be able to just assume the environ's always up to date, and if you do race with a credential change or something it's nbd, just an error, fail out and let the mechanism bring you up to try again soon
<babbageclunk> fwereade: ok
<fwereade> babbageclunk, ...or, hmm. be careful about those errors, actually
<fwereade> babbageclunk, we want those to be observable, I think
<babbageclunk> fwereade: observable?
<fwereade> babbageclunk, and we probably shouldn't mark the machine that used them dead until they've succeeded
<fwereade> babbageclunk, report the error in status, I think, nothing should be competing for it by the time this is running
<babbageclunk> fwereade: oh, gotcha
<fwereade> babbageclunk, so, /sigh, this implies moving responsibility for set-machine-dead off the provisioner and onto the instance-cleaner
<fwereade> babbageclunk, which is clearly architecturally sane, but a bit of a hassle
<fwereade> babbageclunk, otherwise we'll be leaking resources and not having any entity against which to report the errors
<fwereade> babbageclunk, sorry again: not set-machine-dead, but remove-machine
<fwereade> babbageclunk, the machine agent sets itself dead to signal to the rest of the system that its resources should be cleaned up
<babbageclunk> fwereade: ok
<frankban> cherylj: looked at the tests and I've found that the failure is real for my branch. I have a fix already, but how do I check the tests that actually failed from the CI logs?
<fwereade> babbageclunk, but we shouldn't *remove* it until both the instance (by the provisioner) and other associated resources (by instance-cleaner, maybe more in future) have been cleaned up
<babbageclunk> fwereade: Yeah, that makes sense.
<cherylj> frankban: go to your merge job:  http://juju-ci.vapour.ws:8080/job/github-merge-juju/8475/
<cherylj> frankban: and click trusty-err.log
<fwereade> babbageclunk, ...and ofc *that* now implies that we *will* potentially have workers competing for status writes
<cherylj> frankban: argh, looks like it failed to run again
<cherylj> balloons, sinzui - can you take a look:  http://juju-ci.vapour.ws:8080/job/github-merge-juju/8475/artifact/artifacts/trusty-err.log
<fwereade> babbageclunk, so... it's not trivial, I'm afraid, but I can't think of any other things that'll interfere
<frankban> cherylj: I am running 8477 now
<frankban> cherylj: let's see if it will fail to run again; it should fail with 2 test failures in theory
<fwereade> babbageclunk, do you know what dimitern has been doing lately? I think he had semi-detailed plans for addressing the corresponding setup concerns but I'm not sure he started implementing them
<cherylj> frankban: ah, well, when it completes you can view that trusty-err.log file for the test output
<babbageclunk> fwereade: sorry, no - he's been away for the last week and a bit, not sure what he's working on at the moment.
<frankban> cherylj: yes thank you, good to know
<fwereade> babbageclunk, no worries
<fwereade> babbageclunk, do sync up with him when he returns
<babbageclunk> fwereade: hang on, why multiple workers competing to write status?
<fwereade> babbageclunk, if the provisioner StopInstance fails that should report; if the instance-cleaner Whatever fails, that should also report
<fwereade> babbageclunk, it might also be useful to look at what storageprovisioner has done
<fwereade> babbageclunk, with the internal queue for delaying operations if they can't be done yet
<babbageclunk> fwereade: Oh I see, so if both of them fail then an error in the provisioner might be hidden by one in the cleanup worker.
<fwereade> babbageclunk, yeah, exactly
<babbageclunk> fwereade: ok, that's heaps to go on with - I'll probably need more pointers once I'm a bit further along.
<babbageclunk> fwereade: Thanks!
<fwereade> babbageclunk, (nothing would be *lost*, because status-history, but it would be good to do better)
<fwereade> babbageclunk, np
<fwereade> babbageclunk, always a pleasure
<mup> Bug #1604644 opened: juju2beta12: E11000 duplicate key error collection: juju.txns.stash <conjure> <mongodb> <juju-core:New> <https://launchpad.net/bugs/1604644>
<sinzui> sorry cherylj: got pulled into a meeting. Go is writing errors to stdout. You can see the failure in http://juju-ci.vapour.ws:8080/job/github-merge-juju/8475/artifact/artifacts/trusty-out.log
<sinzui> cherylj: I think we can create a unified log so that the order of events and where to look are in a single place
<rick_h_> katco: ping for standup
<katco> rick_h_: oops omw
<mup> Bug #1604883 opened: add us-west1 to gce regions in clouds via update-clouds <juju-core:New> <https://launchpad.net/bugs/1604883>
<mup> Bug #1604883 changed: add us-west1 to gce regions in clouds via update-clouds <juju-core:New> <https://launchpad.net/bugs/1604883>
<mup> Bug #1604883 opened: add us-west1 to gce regions in clouds via update-clouds <juju-core:New> <https://launchpad.net/bugs/1604883>
<perrito666> anyone has spare time to review this http://reviews.vapour.ws/r/5282/diff/# ? its not a very short one, its part of a set of changes to support ControllerUser permissions, I am happy to discuss what this particular patch does if anyone goes for the rev
<natefinch> rick_h_: I have a ship it for the interactive bootstrap stuff... should I push it through or wait for master to be unblocked?
<rick_h_> natefinch: wait for master please atm
<rick_h_> natefinch: just mark it as blocked on the card on master
<natefinch> rick_h_: will do
<rick_h_> natefinch: got a sec?
<natefinch> rick_h_: yep
<rick_h_> natefinch: https://hangouts.google.com/hangouts/_/canonical.com/rick?authuser=1
 * rick_h_ goes for lunchables then
<natefinch> god I love small unit tests
<natefinch> I love that it tells me "you have an error in this 20 lines of code"
<rick_h_> jcastro: marcoceppi arosales heads up docs switch is done and the jujucharms.com site is all 2.0 all the time https://jujucharms.com/docs
<marcoceppi> rick_h_: yesssss
<mup> Bug #1604915 opened: juju status message: "resolver loop error" <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1604915>
<rick_h_> marcoceppi: will send an email shortly, want to check on status of b12 in xenial update to go along with it
<mup> Bug #1604919 opened: juju-status stuck in pending on win2012hvr2 deployment <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1604919>
<natefinch> rick_h_: output for interactive commands on stdout or stderr?
<rick_h_> natefinch: so jam had some thoughts and added notes to the interactive spec on that
<natefinch> rick_h_: ok, I was wondering who added that.  it was incomplete so I was hoping for clarification
 * rick_h_ loads doc to double check
<rick_h_> natefinch: ah, yea looks like he didn't finish typing
<natefinch> rick_h_: I know the answer for non-interactive commands, but not sure if it should be different for interactive
<natefinch> rick_h_: given that there's no real scriptable output
<natefinch> (I mean, you can script anything, but it's not made with that in mind)
<rick_h_> natefinch: can you ping him to clarify the rest, but the start is there as far as for interactive I think that's the idea that the questions/etc should go to stderr, but if we confirm things "successfully added X" it's stdout
<natefinch> rick_h_: ok, yeah, I'll talk to him about it.
<rick_h_> natefinch: ty
<natefinch> oh man... writing this package to handle the formatting of user interactions was the best idea I ever had.
<natefinch> ok, maybe not the best idea ever. But... it's certainly saving my ass.
<rick_h_> natefinch: <3
<alexisb> natefinch, so that begs the questions, what was your best idea ever
<natefinch> alexisb: that's like the best set up for a joke I've ever had....
<natefinch> alexisb: marrying my wife, obviously.  Only slightly behind would be the idea to switch from Mechanical Engineering to Computer Science in school.  Dodged a bullet there.
<natefinch> I have a couple mech-e friends... they basically design screws all day long
<alexisb> natefinch, yep
<alexisb> I got to my first statics class, followed by drafting and went "o hell no!"
<alexisb> I also had some time at Racor systems (Parker affiliate) and watched their engineers at a ProE screen all day
<alexisb> no thank you
<natefinch> yuuup
<natefinch> I realized fairly early that I found physics fascinating in the abstract, but the reality of actually figuring shit out was mind-bogglingly boring.
<alexisb> at racor, the actual factory was AWESOME, which is where I started with control systems
<perrito666> wanna do some boring mech things, try calculating elevators for a living
<perrito666> most revealing class I ever had
<natefinch> friend of mine makes maglev elevators for things like aircraft carriers.... still pretty boring work in the small
<natefinch> he'd probably say the same for my job, though ;)
<natefinch> "So... you twiddled with carriage returns all day?"
<perrito666> lol "so, found that missing statement?"
<mup> Bug #1604931 opened: juju2beta12: unable to destroy controller properly on localhost <conjure> <juju-core:New> <https://launchpad.net/bugs/1604931>
<perrito666> but I was talking about building elevators, I actually had to spend a semester calculating those
 * rick_h_ runs to get the boy from school
<arosales> rick_h_: great to hear thanks for the fyi
<natefinch> are we supposed to be able to add-cloud for providers like ec2?
<natefinch> it doesn't look like we're stopping people from doing that
<mup> Bug #1604955 opened: TestUpdateStatusTicker can fail with timeout <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1604955>
<mup> Bug #1604959 opened: Failed restore juju.txns.stash 'collection already exists' <backup-restore> <ci> <intermittent-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1604959>
<natefinch> rick_h_: It's a little weird that clicking on "stable" in jujucharms brings you to the 2.0 docs, which say at the top, in red "Juju 2.0 is currently in beta which is updated frequently. We don't recommend using it for production deployments."
<natefinch> the problem with MAAS API URL is that it looks like I'm shouting, but really it's just TLA proliferation
<mup> Bug #1604961 opened: TestWaitSSHRefreshAddresses can fail on windows <ci> <intermittent-failure> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1604961>
<mup> Bug #1604965 opened: machine stays in pending state even though node has been marked failed deployment in MAAS <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1604965>
<redir> you sure are
<redir> ignore ^^
<menn0> perrito666: ping
<perrito666> menn0: ping
<perrito666> sorry pong
<perrito666> menn0: did I break something?
<menn0> perrito666: no, I just wanted to apologise for not getting to your ACLs PR yesterday... the day got swallowed up by critical bugs
<menn0> perrito666: I was about to review now and see that it's been discarded?
<perrito666> menn0: oh, no worries, you would not have been able to review it yesterday anyway, it had a dependency on an unmerged branch and the diff was incomprehensible, I dropped it, merged the pending branch and re-proposed
<perrito666> RB is really misleading, I thought that adding the dependency in the "depends on" field would fix the diff, but it did nothing at all, and then it would not allow me to upload my own diff
<perrito666> I think we should replace RB with something a bit more useful, like snapchat
<menn0> perrito666: LOL :)
<menn0> perrito666: I've been wondering about Gerrit or Phabricator, they seem like the best alternatives
<perrito666> I checked one of those during the sprint, and I liked it, I can't remember which one though, Phabricator I think
<perrito666> menn0: also, the only person who knew something about our RB is no longer on this team, which makes for an interesting SPOF
<menn0> perrito666: I don't think the ops side of RB is particularly hard
<menn0> perrito666: and I *think* the details are written down /somewhere/
<perrito666> menn0: I fear that the certain somewhere is an email :s
<perrito666> anyway, eric usually knew the dark secrets, like how to actually make a branch depend on another
<menn0> perrito666: phab is nice. I've used it a bit at one job. it supports enforcing a fairly strict (but customized) development process.
<menn0> perrito666: you do this: rbt post --disable-ssl-verification -r <review_number> --parent <parent_branch>
<menn0> perrito666: and then check how it looks on the RB website and hit Publish
<katco> menn0: perrito666: i've been interested in how this works out for teams: https://github.com/google/git-appraise
<perrito666> menn0: ah, I need some non magic interaction :)
<perrito666> menn0: if you ask me (and even if you don't), if it can't be done on the website, it's broken
<menn0> perrito666: I think you can upload arbitrary diffs to RB... but I've never done it
<menn0> katco: looks interesting! I hadn't heard of git-appraise before
 * menn0 reads more
<katco> menn0: i enjoy the decentralized nature. no ops needed
<katco> menn0: or at least i think i *would*. i've never used this
<perrito666> menn0: well I actually tried, it seems to assume RB has something it doesn't; we might have broken that particular workflow with our magic bot
<perrito666> katco: that looks amazing but seems to not work very nicely with github workflow (which we sort of use)
<menn0> katco: storing the reviews in git is a nice idea. the way you add comments is a little unfriendly though. I guess the expectation is that someone will create a UI/tool for that.
<katco> perrito666: just saw this: https://github.com/google/git-pull-request-mirror
<katco> menn0: and just found this: https://github.com/google/git-appraise-web
<redir> who's the resident data race expert?
<perrito666> katco: mm, really interesting, do you know actual users of this, I am interested in seeing how it behaves in heavily conflicting envs
<perrito666> redir: we all are good adding data races :p
<redir> perrito666: OK who's the resident data race tortoise?
<perrito666> redir: well, you are not in luck, its dave cheney :p
<menn0> katco: that improves the situation somewhat! :)
<perrito666> redir: just throw the problem to the field and we'll see how we can attack it
<katco> perrito666: i do not. this looks fairly active? https://git-appraise-web.appspot.com/static/reviews.html#?repo=23824c029398
<perrito666> man, was that English broken or what? ;p I am losing my linguistic skills
<redir> I think it is pretty straightforward
<perrito666> katco: very interesting, I really like the idea of storage of these things In the repo
<perrito666> but I'll say something very shallow
<redir> https://github.com/go-mgo/mgo/blob/v2/socket.go#L329 needs to be locked so it doesn't race with https://github.com/go-mgo/mgo/blob/v2/stats.go#L59
<redir>  I think
<perrito666> the UI is ugly as f***
<redir> trouble reproducing
<katco> perrito666: it is certainly spartan
<katco> perrito666: personally, i would be writing an emacs plugin for this if someone hasn't already
<katco> redir: why are stats being reset before kill has been returned? i think there's a logic bomb there
<perrito666> I don't know what kind of spartans you know, the ones from the movie certainly look better than that UI :p
<katco> perrito666: sorry, i intended this usage: "adj.	Simple, frugal, or austere: a Spartan diet; a spartan lifestyle."
<perrito666> katco: I know, I intended to: "troll (/ˈtroʊl/, /ˈtrɒl/) is a person who sows discord on the Internet by starting arguments or upsetting people"
<katco> lol
<perrito666> redir: while killing, imho you should be locking everything indeed, but I have not checked past these two links to know if I am speaking the truth about this particular issue
<perrito666> katco: I do dislike the ui though, I prefer something like github without the insane one mail per comment thing
<katco> redir: also i don't think that's the race. socketsAlive locks the mutex before doing anything: https://github.com/go-mgo/mgo/blob/v2/stats.go#L135
<redir> mkay thanks
<redir> perrito666: katco ^
<perrito666> moving to a silent neighbourhood is glorious for work
<mup> Bug #1604988 opened: Inconsistent licence in github.com/juju/utils/series <jujuqa> <packaging> <juju-core:Triaged> <juju-core 1.25:Triaged> <juju-core (Ubuntu):New> <https://launchpad.net/bugs/1604988>
<menn0> katco: you're convincing me that we should experiment with vendoring some more :)
<katco> menn0: eep...
<katco> menn0: as long as how go does vendoring is well understood, i'm happy. i am scared of diverging too much without forethought
<menn0> katco: sure... it's not something we should do lightly. and if we do it, it should use Go's standard mechanism.
<perrito666> menn0: re our previous talk http://reviews.vapour.ws/r/5282/diff/#
<katco> menn0: yeah, agreed
<menn0> perrito666: ok. I can take a look.
<menn0> perrito666: my initial comment is that I wish this was 2 PRs: one for state and one for apiserver (but I will cope)
<perrito666> menn0: I am sorry I promise I tried to make it smaller
<perrito666> menn0: its smaller than it looks though, small changes in many files
<katco> menn0: i think i messed up the tech board permissions. i was trying to get a link and it looked publicly accessible, so i disabled that. now i can't view it
<menn0> katco: I'll take a look
<katco> menn0: sorry about that
<menn0> katco: you completely removed canonical access :) not sure how to put it back yet
<mup> Bug #1605008 opened: juju2beta12 and maas2rc2:  juju status shows 'failed deployment' for node that was 'deployed' in maas <oil> <oil-2.0> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1605008>
<katco> menn0: wait what! all i did was turn off link sharing :(
<menn0> katco: figured it out. what was it before? anyone at canonical can edit or view?
<menn0> or comment?
<katco> menn0: could comment i think, but it looked like external people with link could view as well
<menn0> katco: ok, it's fixed. anyone from canonical can comment again.
<katco> menn0: ta, sorry
<axw> wallyworld: did I miss anything on the call? slept through my alarm supposedly, pretty sure it didn't go off though
<axw> need to ditch this dodgy phone
<perrito666> axw: or get a clock
<wallyworld> axw: not a great deal, just release recap, tech board summary
<axw> perrito666: could do that too, I'd rather have it near my head so I don't wake up my wife
<axw> suppose I could move the clock...
<perrito666> axw: get a deaf people clock
<axw> wallyworld: ok, ta
<perrito666> (not trolling, these are a thing)
<axw> perrito666: ah, have not seen one
<perrito666> they have a thing that you put in your pillow and it vibrates
<perrito666> much like your phone, but less points of failure
<axw> I guess I could just use my fitbit then. if I can find it, and my charger...
<mup> Bug #1605008 changed: juju2beta12 and maas2rc2:  juju status shows 'failed deployment' for node that was 'deployed' in maas <oil> <oil-2.0> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1605008>
<axw> anyway
 * axw stops debugging alarm replacement issues
<mup> Bug #1605008 opened: juju2beta12 and maas2rc2:  juju status shows 'failed deployment' for node that was 'deployed' in maas <oil> <oil-2.0> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1605008>
<alexisb> axw, thumper ping
<axw> coming, sorry
<thumper> coming
<redir> axw: thanks for the protip in the review. helpful
<axw> redir: np
<thumper> menn0: so you are working with redir on the race?
#juju-dev 2016-07-21
<redir> i'll be around for a while
<menn0> thumper: no right now I'm not. I'm working on this mgo dup key issue
<menn0> thumper: I can help if needed though
<redir> thumper: menn0 I'll be around for another 90 minutes or so
<redir> so no huge rush
<menn0> redir: should I have a quick look at it with you now while you're still around?
<redir> menn0: if you are at a good stopping point sure. If not let me know when you get to one.
<menn0> redir: now is good
<menn0> redir: hangout or IRC?
<redir> HO
<redir> ?
<menn0> ok, let's just use the tanzanite-standup one
<menn0> redir: ^
<redir> k
<wallyworld> axw: do you have a moment for a quick chat?
<axw> wallyworld: sure
<wallyworld> standup wil do
<axw> wallyworld: very hard to hear you
<wallyworld> axw: i'll reconnect
<thumper> redir: is that race sorted?
<mup> Bug #1604919 changed: juju-status stuck in pending on win2012hvr2 deployment <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1604919>
<redir> menn0: https://github.com/reedobrien/mgo/tree/fix-race-in-stats
<redir> menn0: I can make a PR if you want. But if you think it needs more polish... go ahead.
<redir> menn0: same results on the tests with or without the patch.
<redir> thumper: ^^
<thumper> menn0: http://reviews.vapour.ws/r/5284/diff/#
<thumper> menn0: I think you fixed the original, this is 1.25 now
<thumper> axw: oh, you are oncall reviewer
<thumper> axw: you could look above if you like
<mup> Bug #1458585 changed: SSHGoCryptoCommandSuite.TestCommand fails <blocker> <ci> <go1.6> <intermittent-failure> <regression> <test-failure> <wily> <xenial> <juju-core:Invalid> <juju-core 1.23:Won't Fix> <juju-core 1.25:In Progress by thumper> <https://launchpad.net/bugs/1458585>
<axw> thumper: sure, looking
<axw> thumper: LGTM
<thumper> axw: ta
<thumper> axw,wallyworld: isn't status history updated outside of transactions?
<thumper> so raw access?
<axw> thumper: I think so?
<wallyworld> thumper: i thought so
<thumper> allcollections doesn't mark it as rawAccess
<thumper> I'm looking at the bug about big transactions to clean up
 * thumper looks
<wallyworld> maybe the collection is accessed directly without txn ops
<perrito666> it is accessed without txn ops
<perrito666> or was last time I looked
<axw> st.getRawCollection(statusesHistoryC)
<thumper> hmm... we probably want to mark it as so otherwise removing the items uses transactions :)
<thumper> I'll do it
<menn0> thumper: sorry, was lunching... looking
<menn0> thumper: looks good, although I don't think I fixed this in master (at least I don't remember doing it)
<thumper> I compared the patch with the utils package
<thumper> menn0: axw has reviewed anyway
<natefinch> Is add-cloud intended to work with the static clouds, for example, ec2?  (it does, I just am wondering if it's *supposed to*)
<wallyworld> natefinch: it is supposed to allow you to add your own config
<wallyworld> thumper: so getRawCollection() - does that not work if collection is not marked as raw?
<wallyworld> thst sorta sucks a bit
<thumper> wallyworld: it does
<natefinch> wallyworld: ok, I had assumed it was only for things like manual, openstack, and maas, where you can actually create your own cloud
<thumper> wallyworld: but we have code that iterate over the all collections
<thumper> and do different things for raw
<wallyworld> ah ok
<wallyworld> natefinch: private cloud set up is one use case for sure
<natefinch> wallyworld: just wondering, since I'm working on the interactive add-cloud
<wallyworld> do the private cloud first, and then the public cloud config stuff can be done later - i'm pretty sure the doc references it
<wallyworld> one issue is that we need to enumerate valid config items to make user input sensible
<wallyworld> and there's not really a clear path to do that yet
<natefinch> wallyworld:  yeah... right now I'm just hard-coding it per cloud.
<wallyworld> the valid config items?
<natefinch> wallyworld: what values to ask the user for (which map to config values)
<wallyworld> we should be able to use the config schema definition
<wallyworld> just as is done when parsing config.yaml
<wallyworld> but use it to get at the valid item names and types
<natefinch> I'm not sure I can use the schema to make a user-friendly UI.
<wallyworld> really? it has the attribute names, their types, whether they are mandatory etc
<natefinch> what might be a good compromise is to put the interactive add-cloud code inside each provider, so it lives with the code that knows what values to ask about.
<natefinch> wallyworld: the key word there is user-friendly
<wallyworld> how else would it be done if not using the schema?
<natefinch> wallyworld: by having a human write sentences that humans can understand
<wallyworld> huh? for name-value input?
<natefinch> wallyworld: this is for the interactive add-cloud.  I'm not going to just prompt them with the yaml field names.
<wallyworld> that's what is used now in config.yaml files, and on the CLI as --config args
<natefinch> yes, and it's awful, and that's why we're writing an interactive mode.
<wallyworld> what's wrong with allowing them to enter a value for "http-proxy" say
<wallyworld> that is entirely readable
<wallyworld> you'd use tab completion to help too
<wallyworld> and print the description of each field also
<natefinch> because if we write it by hand, it'll be clean, crisp, and tasteful. If we auto-generate it, it'll feel auto-generated, it won't be able to do intelligent verification (like checking that a URL is a valid URL)...
<wallyworld> why can't we check a valid url? of course we can
<natefinch> the schema doesn't say it's a url, it just says it's a string.... I'm sure we check it later on sometime, but it's one of those things where you'll enter 10 fields, then it'll go off and run through a verification, and spit back that field #7 is incorrect
<wallyworld> whether it is done off a schema or we hard code prompts, the UX is the same
<natefinch> I guarantee you, the UX will not be the same if it's hand coded.
<wallyworld> the schema does need to say it's a url, yes. we can do that
<natefinch> That's just one example
<wallyworld> why won't it be the same?
<wallyworld> it's printing a prompt to enter a value
<wallyworld> in either case
<natefinch> yes, but the prompt is quite likely not to be exactly the same as the description of the yaml field
<wallyworld> so we fix our metadata
<natefinch> also, there's intelligence that is outside the schema... like the fact that openstack username and password only apply if the auth-mode is user-pass
<wallyworld> how do we handle the case where the config schema gains a new value?
<wallyworld> we have already talked about the need to make the schema support grouping and conditional fields
<natefinch> making the schema more complicated is not the answer
<natefinch> it's ok to write code
<wallyworld> it's not ok to create something that is hard to maintain
<wallyworld> and prone to skew
<wallyworld> schemas are the common solution for user data entry. our gui uses json schema for example
<natefinch> we'll spend days, weeks, adding gobs of complexity to the schema just to avoid the simplest of imperative code.
<wallyworld> it's not gobs of complexity
<natefinch> every time we have to do an if/else... that's significant added complexity and cognitive overhead to the schema... when it's completely trivial in imperative code.
<natefinch> how do you signify that openstack user and password only apply if auth-mode is userpass?
<wallyworld> using a directive in the schema - that's what schemas are for
<natefinch> how do you do it?
<wallyworld> http://stackoverflow.com/questions/9029524/json-schema-specify-field-is-required-based-on-value-of-another-field
 * thumper taps his fingers while the state tests run
<thumper> perhaps I should go and clean the coffee machine
<wallyworld> i haven't got the definitive reference handy
<wallyworld> but we are talking about bog standard schema features
<thumper> oh ffs
<thumper> state tests take over 10 minutes
<natefinch> that's jsonschema... the provider config schema is our own custom thing, isn't it? gopkg.in/juju/environschema.v1
<wallyworld> if we hard code it, who will take the lead on ensuring any config addition or change to the schema is reflected in add-cloud? that sort of double handling and non-DRY leads to bugs
<thumper> time go test -check.v -test.timeout=1200s -cover
<wallyworld> yes it is our own, but trivial to enhance
<thumper> lets see what that shows us
<thumper> now I'm definitely going to clean the coffee machine
<natefinch> wallyworld: I was actually thinking that the interactive code for each provider should live in the provider. Then it lives right next to the configuration, and so it's easy to check that any changes to the config have corresponding changes in the interactive code.
<wallyworld> but a lot of the base config is defined outside of the provider in environs/config
<natefinch> sure, if and when we decide to automate those parts, the common interactive bits for those parts can live next to the common config definitions.  The other nice thing about just writing code is that it's easy to pick a sensible subset of things to ask the user, rather than giving them 50 prompts, when 48 of those are optional.
<wallyworld> the optional schema attributes would not be prompted for - the code can use the is-optional property of the schema to drive that
<natefinch> we're getting down a rat hole.  My initial statement is still the same.  We'll have to add a ton of complexity to our schemas (and I *do not* believe that adding conditional dependencies to our schema is a small amount of work), all to save a very small and easily maintained bit of code.  And every time we have some other custom field that works in some custom way, we'll have to extend our schema in new and interesting ways and it'll become a
<natefinch> maintenance burden of its own.  All to avoid writing self-contained, very simple imperative code that lives near the configuration values it is intended to fill.
<wallyworld> i disagree, as does every other project that uses a schema to drive user input
<thumper> coverage: 85.3% of statements
<thumper> ok  	github.com/juju/juju/state	617.074s
<thumper> real	11m3.809s
<natefinch> wallyworld: you can generate a functional UI, you can't generate a *good* one.  And I bet I can write all the UIs for all our private clouds in less time than I could write the update to our schema to just support openstack.
<wallyworld> sure, but at the cost of maintainability. it's always easy to bang out bespoke code
<wallyworld> openstack and aws both have different credential types
<thumper> axw: http://reviews.vapour.ws/r/5285/
<natefinch> it's not complicated code.  Print "What auth mode would you like (options..)";  if authmode == openstack.userpass {  Print "What username?"  }  etc.
<wallyworld> and that can be trivially driven by a schema
<wallyworld> to produce the exact same output
<wallyworld> but will be maintainable
<natefinch> I already find our usage of schemas to be hard to understand.... adding more complexity is just going to make that worse.
<natefinch> to be fair, I think that has less to do with the schema itself and more our usage of it
<wallyworld> seems like you are conflating a schema driven approach with not understanding our schema
<natefinch> well, no.
<axw> thumper: reviewed
<natefinch> wallyworld: I'm trying to avoid writing a framework, and just write what we need.  Do the simplest thing that could possibly work.  If I'm some newbie juju dev, and I add a config value to some provider, and it needs some custom logic in the prompt, it's trivial to just write that code.  If instead I have to go try to figure out how to bend the schema to work, or, more likely, add logic to the schema, it's going to take me an order of magnitude
<natefinch> longer.  We haven't even started and we've already mentioned two additions we would have to make for the schema.
<wallyworld> we can disagree :-)
<natefinch> sure, but we gotta pick one :)
<axw> natefinch: the reason for using a schema is because you may not have access to the provider code at all times
<axw> natefinch: e.g. the GUI, or an older juju client
<natefinch> well, the older client wouldn't have the new provider code anyway, so it wouldn't know what to do with the information even if it could request it.
<axw> natefinch: why not? the client doesn't talk to the cloud directly, except for when it's bootstrapping or destroying the controller
<axw> natefinch: sorry, you're specifically looking at bootstrap aren't you. credentials are bigger than just bootstrap
<natefinch> axw: currently looking at add-cloud
<axw> natefinch: it should be possible to add credentials for a cloud after bootstrap also
<thumper> axw: removed the retry, and cleaned up
<thumper> axw: the undertaker will restart if there is an error
<thumper> and the other uses are in migration
<thumper> and I can work with menn0 to make sure those are retried if necessary
<axw> thumper: thanks
<axw> thumper: LGTM with comment fix
<thumper> axw: ta
 * thumper waits for test run to complete
<thumper> PASS: allwatcher_internal_test.go:940: allWatcherStateSuite.TestStateWatcherTwoModels	305.009s
<thumper> this test seems excessively long
<natefinch> axw, was that 6 minutes for one test?
<axw> natefinch: I guess you mean thumper
<thumper> natefinch: yes
<natefinch> axw: oops, sorry, yes
<thumper> 11.5 minutes for the whole package
<natefinch> thumper: good lord
<thumper> 5 minutes for that one
<natefinch> er yes, 5, bad math penalty.
 * anastasiamac impressed with thumper's machine - my machine panics as of late while running all tests in state package
<mup> Bug #1605050 opened: Controller doesn't use tools uploaded with juju bootstrap --upload-tools <juju-core:New> <https://launchpad.net/bugs/1605050>
<thumper> anastasiamac: just provide the timeout option
<thumper> mine takes > 10 minutes too
<wallyworld> thumper: can you recall - what is the error if you try and deploy to a dying model
<thumper> I don't recall
<anastasiamac> thumper: \o/ yep, test.timeout is super-useful :D
<anastasiamac> thumper: tyvm
 * thumper is looking at this very slow test
<thumper> hmm...
<thumper> menn0: got a minute?
<menn0> thumper: give me 30s
<thumper> k
<thumper> hmm...
<thumper> I wonder if this fix is safe
<thumper> 305s -> 6s
<menn0> thumper: ok, what's up?
<thumper> 1:1 ?
<menn0> yep
<menn0> thumper: ok back again
<mup> Bug #1605057 opened: allWatcherStateSuite.TestStateWatcherTwoModels takes > 5 minutes <tech-debt> <unit-tests> <juju-core:In Progress by thumper> <https://launchpad.net/bugs/1605057>
<thumper> w00t
<thumper> test win of the day
<thumper> menn0: http://reviews.vapour.ws/r/5286/
<menn0> thumper: looking
<mup> Bug #1605057 changed: allWatcherStateSuite.TestStateWatcherTwoModels takes > 5 minutes <tech-debt> <unit-tests> <juju-core:In Progress by thumper> <https://launchpad.net/bugs/1605057>
<thumper> menn0: if you have a txn.Op with an assertion but no Insert/Remove etc, and it is the only op, what happens if the assertion fails?
<thumper> I mean as a []txn.Op
<thumper> but it is the only one
<menn0> thumper: then an ErrAborted is returned (I guess)
<wallyworld> axw: here's a fix for that deleted model bug http://reviews.vapour.ws/r/5287
<menn0> thumper: and we should be careful with this kind of thing. remember it seems to encourage mgo/txn to create runaway txn-queue fields
<mup> Bug #1605057 opened: allWatcherStateSuite.TestStateWatcherTwoModels takes > 5 minutes <tech-debt> <unit-tests> <juju-core:In Progress by thumper> <https://launchpad.net/bugs/1605057>
<thumper> menn0: ack, I'll do something else
<menn0> thumper: review done. looks good. just a couple of minor suggestions
<axw> wallyworld: reviewed
<wallyworld> ta
<wallyworld> axw: i was using the "Word of God", but yours is better :-)
<axw> wallyworld: lol
<wallyworld> axw: also, william has indicated in the past that he hates a generic notfound code used everywhere
<wallyworld> but i can ask again
 * thumper is done for the day
<thumper> laters folks
<mup> Bug #1605096 opened: Juju 2.0 Resource Error - sha mismatch error while re-deploying the charm <juju-core:New> <https://launchpad.net/bugs/1605096>
<babbageclunk> menn0: Nice spotting on the upload tools problem!
<wallyworld> fwereade: hey, i have a small PR which uses a new bespoke error code for a model which is no longer there (ie model not found). andrew looked but wanted your +1 on using the new error code. i'm sure we've talked in the past and not just using generic notfound everywhere is preferred. could you take a look? http://reviews.vapour.ws/r/5287/
<wallyworld> i need to land this for beta13 so if you +1 could you hit $$JFDI$$ for me as i need to head to soccer for a bit
<menn0> babbageclunk: just finishing en email to you with a few bits
<fwereade> wallyworld, LGTM, I left a trivial issue open but I'm happy to land it
<babbageclunk> menn0: Cool cool
<menn0> babbageclunk: sent!
 * menn0 has to run for now. will check in again later
<babbageclunk> menn0: kthxbye
<mgz> ha... we're not actually using the uploaded tools? well, that explains some of the oddness.
 * fwereade is trying to arrange for the split water main outside the house to be dealt with, might not make the meeting
<babbageclunk> Oh no, I totally missed the meeting! I got the 10-minute reminder but was wrestling with a patch and forgot about it by 11. :(
<menn0> babbageclunk: back again for a bit
<babbageclunk> menn0: oh hai
<babbageclunk> menn0: Replied to your message.
<menn0> babbageclunk: just reading that now
<menn0> babbageclunk: your modification LGTM. it's a little more correct that way isn't it.
<menn0> babbageclunk: does all look well when you run this under your test?
<babbageclunk> yup
<wallyworld> fwereade: thanks for +1, back from soccer, will land after eating
<babbageclunk> menn0: So, for the logging were you suggesting a sync.Once living at global scope, and then a call to Do from Query.Apply?
<menn0> babbageclunk: or newSession or something? you don't want it too early, certainly not at import time, because logging may not be set up properly yet.
<babbageclunk> menn0: Ok, makes sense - and the key thing is that the once lives at package scope so that it doesn't log once per session.
<menn0> babbageclunk: yep exactly
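The shape menn0 and babbageclunk settle on could be sketched like this (stdlib only; the message text, counter, and function names are placeholders rather than the actual patch):

```go
package main

import (
	"fmt"
	"log"
	"sync"
)

// Package-scoped, so the message fires once per process rather than once
// per session.
var (
	logPatchOnce sync.Once
	timesLogged  int // counts emissions, just to make the behaviour visible
)

// newSession stands in for mgo's session setup; per menn0's point, the
// real call site would be session creation, never import time, since
// logging may not be configured yet at import.
func newSession() {
	logPatchOnce.Do(func() {
		timesLogged++
		log.Println("E11000 retry patch is active")
	})
}

func main() {
	for i := 0; i < 3; i++ {
		newSession() // the message appears only on the first call
	}
	fmt.Println("logged", timesLogged, "time(s)")
}
```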
<stokachu> babbageclunk: menn0: 10 runs in, no errors yet
<babbageclunk> stokachu: Awesome!
<menn0> stokachu: ok great.
<menn0> stokachu: please give it a bit more testing if you can
<stokachu> im going to let them run about 4 hours
<menn0> That should do it :)
<stokachu> :D
<stokachu> ill post my findings before lunch
<babbageclunk> menn0: Do we set mgo's logger in juju? I can't see a SetLogger call anywhere.
<menn0> babbageclunk: I don't think you want to use mgo's logging infrastructure in this case
<menn0> import loggo, create a logger with loggo.GetLogger and use that
<menn0> that'll feed straight into juju's logging system
<babbageclunk> menn0: Ok, sounds good.
<babbageclunk> menn0: Probably doesn't matter, but: logging at info level? Questionable, but it means we'd be sure to see it without the reporter having to change the log level.
<babbageclunk> menn0: https://github.com/babbageclunk/mgo/commit/17abfbbc91fbb6a37e682946a8d423b1f7c4633d
<menn0> babbageclunk: I thought the default log level was WARNING, not sure though
<babbageclunk> menn0: In Juju?
<menn0> yep
<babbageclunk> huh
<menn0> It starts at DEBUG and then at some point during agent startup the model's logging config is applied
<menn0> if there hasn't been an explicit one specified a default is used
 * menn0 checks
<menn0> babbageclunk: TBH this message is almost certainly going to appear before the model's logging config is applied so maybe it's a moot point
<babbageclunk> Ok, I'll do it as debug and make sure I can see it in a bootstrap.
<menn0> babbageclunk: sounds good
<menn0> babbageclunk: i'm off. past bed time here
<babbageclunk> Can someone who knows about Juju logging help me? I've added the logging I was discussing with menn0 ^^, but I can't see it in logs after bootstrapping.
<fwereade> babbageclunk, am pretty sure that `juju set-model-config logging-config=INFO` will set the level globally
<fwereade> babbageclunk, also take note of maybe wanting to log some things in the controller model and some in the hosted model
<fwereade> babbageclunk, s/log some things/see some logs/
<babbageclunk> fwereade: So I should check machine-0.log on the controller and in (say) default?
<babbageclunk> fwereade: I've changed the place where I log to be a warning but it still doesn't seem to show up either on the controller or a machine that I add.
<fwereade> babbageclunk, huh
<babbageclunk> fwereade: Do we log to mongo?
<babbageclunk> https://github.com/go-mgo/mgo/commit/071b5646959162617d7356b49be8f419e5f8229c
<fwereade> babbageclunk, we do; I am brainfarting on exactly where the code lives
<babbageclunk> I'm logging in a sync.Once in session.Dial. If the log writer calls Dial, then maybe that first message gets dropped on the floor (since the writer's being set up) and then the message never gets logged again (because Once)?
<fwereade> babbageclunk, hmm, I am a little loath to suggest it but I would be inclined to use the mgo logging stuff there
<mup> Bug #1605241 opened: juju2beta12 lxd instances not starting <conjure> <juju-core:New> <https://launchpad.net/bugs/1605241>
<fwereade> babbageclunk, and use mgo.SetLogger to give it something that near-enough matches what it expects
<babbageclunk> fwereade: Where would the logging go? Oh, whereever the logger writes to.
<fwereade> babbageclunk, which can itself be wrapping a loggo.Logger
<perrito666> la
 * perrito666 wrong window
<babbageclunk> fwereade: Wouldn't that mean my patch would need to patch something within juju/juju?
<fwereade> babbageclunk, as a patch to mgo alone, I think it should use mgo's existing logging facilities
<fwereade> babbageclunk, if we wanted to make use of it in juju, we can/should separately register a logger for mgo so the output goes where we want it
<babbageclunk> fwereade: But there's nowhere in juju/juju that already calls mgo.SetLogger
<fwereade> babbageclunk, this is true -- what's the high level goal here? to get something in juju's logs that tells us that mgo did the right thing?
<fwereade> babbageclunk, if so, I think we should be using the existing mgo logging
<fwereade> babbageclunk, if not, sorry I misunderstood :)
<babbageclunk> fwereade: Just so we can tell that the juju running has the patch without torture testing (like stokachu's doing now)
<babbageclunk> fwereade: but the existing mgo logging isn't hooked up, so I'd need to make a separate change to do that.
<fwereade> babbageclunk, and I'm not sure that would be super-helpful either... we probably don't want to risk logging db ops to the db
<babbageclunk> fwereade: yeah - I think that's kind of the problem I'm hitting.
<fwereade> babbageclunk, so, how about exposing `mgo.HasAppliedE11000Patch() bool` and logging that explicitly in controller setup?
<fwereade> babbageclunk, which can just Once the patch and return true?
<fwereade> that's weird though, logging shouldn't be what triggers it
<babbageclunk> fwereade: Would I have to do that with reflection in the Juju code? Otherwise no-one would be able to build without applying the patch.
<babbageclunk> (Which might be a good thing.)
<fwereade> babbageclunk, ha. dammit again :)
<fwereade> babbageclunk, although, yeah
<fwereade> babbageclunk, that may very well be a good thing
<fwereade> babbageclunk, I am not comfortable developing against arbitrarily-varying dependencies
<babbageclunk> fwereade: probably means we'd need better (some) tooling to apply/unapply patches.
<fwereade> babbageclunk, it would, and we were just discussing in the tech board how we probably didn't want to devote effort to it because we wanted this situation to be rare and not encourage it
<babbageclunk> fwereade: crazy idea - in the once, start a goroutine that waits 2 seconds and then logs the patch message.
 * babbageclunk felt bad typing that.
 * fwereade peers disapprovingly over half-moon spectacles
 * babbageclunk is just going to try it for shits and giggles.
<mup> Bug #1605241 changed: juju2beta12 lxd instances not starting <conjure> <juju-core:New> <https://launchpad.net/bugs/1605241>
 * fwereade -- if such a thing lands -- will try to arrange matters such that he gets to hunt you for sport ;p
<fwereade> babbageclunk, but, yeah, see if it makes a difference
<babbageclunk> Would at least confirm that that's the problem.
<fwereade> babbageclunk, I am generally uncomfortable with our using logging as this sort of data channel, *and* with adding a loggo dependency to mgo...
<fwereade> babbageclunk, ...but I don't have a better answer offhand, I must admit
<babbageclunk> fwereade: This logging is only intended to be in the patch - it'll never go upstream, and will be gone when the patch can go.
<babbageclunk> fwereade: but I know what you mean.
<fwereade> babbageclunk, there's nothing so permanent as a temporary fix
<babbageclunk> quite
<fwereade> babbageclunk, how will we know when we're using a fixed upstream build? ;p
<fwereade> babbageclunk, unless that also logs...
<mup> Bug #1605241 opened: juju2beta12 lxd instances not starting <conjure> <juju-core:New> <https://launchpad.net/bugs/1605241>
<babbageclunk> The problem with the patch is there's no place we can look to determine whether a binary includes it - if the binary is tagged with a hash we can at least look in dependencies.tsv and follow the chain.
<fwereade> babbageclunk, (can we abuse the flag package to accept `-mgo.ensure-e11000-flag` or did we isolate it well enough that we can't? ...not that that's any better really anyway)
<fwereade> babbageclunk, wondering about env vars... they're probably the least worst globals I know...
 * fwereade actually has no idea
<fwereade> sorry
<babbageclunk> Hmm. Am I looking in the wrong place?
<babbageclunk> Is everything in machine-0.log on the controller machine?
<fwereade> babbageclunk, I most often just use juju `debug-log --replay` to get it locally so long as the system's generally functioning sanely
<fwereade> babbageclunk, with appropriate `-m whatever`
<babbageclunk> ok - that's what I'm doing (with -m controller)
<babbageclunk> bums, I was sure that would work.
<fwereade> babbageclunk, hey
<fwereade> babbageclunk, can you just make it write out a file somewhere..? say, in the standard temp dir, whenever it starts?
<babbageclunk> Yeah, that seems extremely obvious now.
<babbageclunk> I can definitely do that.
<fwereade> babbageclunk, cool
<babbageclunk> fwereade: Thanks!
<fwereade> babbageclunk, a pleasure :)
 * babbageclunk just wasn't expecting this to be so hard!
<babbageclunk> I guess that's all of software in a nutshell.
<fwereade> babbageclunk, never a truer word was spoken
<fwereade> babbageclunk, that still does have the what-will-a-fixed-upstream-build-do problem, though
<babbageclunk> I was thinking about putting the time in it, but I guess that's really just the mtime of the file. So an empty file it is!
<babbageclunk> fwereade: Well, it won't write the file, but we'll be able to see that the PR was merged into the commit. It's just normal change-management stuff, isn't it?
<babbageclunk> fwereade: (I'm not sure I understand the problem.)
<fwereade> babbageclunk, yeah, if it's just an internal check that's fine -- but if we expect anyone in the field to need to check it, surely we should always write it forever?
<fwereade> babbageclunk, i.e. if it's just for us, to have confidence that it was applied in build X, we can trigger the write on an env var and just make that one of the things we do until we merge the new version
<fwereade> babbageclunk, if it's not just for us it's 1000x harder
<babbageclunk> fwereade: I think it's just for us.
<babbageclunk> fwereade: why an env var?
<babbageclunk> fwereade: (as opposed to unconditionally)
<babbageclunk> fwereade: Ha ha - what if trying to log from inside dial was triggering a panic due to infinite recursion and then loggo was suppressing that?
<fwereade> babbageclunk, just because it's nicer not to dump the file wherever we run in production
<fwereade> babbageclunk, (that would be engagingly horrifying)
<babbageclunk> fwereade: yeah, makes sense - is it hard to set the env var so that it's set before the controller agent gets bootstrapped? Or would we check the env var each time, and only once.Do the writing when the env var was true
<babbageclunk> ?
<fwereade> babbageclunk, I was imagining that patched-mgo could have an init() that checks the var, writes the file if true, and that's all
<babbageclunk> fwereade: ok, but how would we set the env var on the bootstrapped machine in time for that? I guess we could set the var and bounce jujud.
<rick_h_> katco: ping for standup
<babbageclunk> fwereade: ha ha, I'm lame - all my theorising was for naught. The real problem was that I was logging in Dial, which we never call, and not in DialWithInfo.
<fwereade> babbageclunk, haha, bad luck
<babbageclunk> So just logging with loggo works fine.
<fwereade> babbageclunk, ok, please decorate it with suitably apocalyptic warnings about the terrible practices and I'm happy
<fwereade> ;p
<babbageclunk> wilco! (At least it's not the asynchronous logging with a sleep version. I really only suggested that to push the Overton window across. :)
<natefinch> fwereade: I can send an email about what we're doing with interactive commands... gotta run, will be back in an hour and a half, which I know is getting late for you.
<fwereade> natefinch-afk, I might stop then but will try to be back later
<frobware> dear lazy-web, anybody know off-hand if Python 3's standard library has YAML support (reading/writing) baked in?
<rick_h_> frobware: everyone in python land uses http://pyyaml.org/
<frobware> rick_h_: is that part of the standard library?
<rick_h_> frobware: no, it's not
<frobware> rick_h_: asking because the script (bridge) has no 3rd party dependencies
<frobware> rick_h_: if we introduce dependencies we just need to ensure they get to the node being deployed before the script is run
<frobware> rick_h_: I just added that as a "risk" to the doc
<mgz> frobware: can you just output json?
<mgz> that's also yaml, and is parsable by python stdlib
<mgz> we should really only use yaml for the config files we expect users to edit a lot
<frobware> mgz: need to parse networkd's (YAML) config, and rewrite.
<rick_h_> frobware: where does this script come from?
<mgz> frobware: ah, it's not our yaml? suck
<frobware> rick_h_: juju - we add it to cloud-init as text which gets dropped into a directory and then we have additional "rules" in our cloud-init to go look for that file and run it.
<mgz> frobware: wait, don't we always have pyyaml on cloud images?
<rick_h_> frobware: and it's python because that's what it needs to be for cloud-init?
<mgz> frobware: it's a cloud-init dependency
<mgz> so, we can't run cloud-init if it's not installed
<frobware> rick_h_: it was the easiest way for it to get to the node being deployed. It's essentially a char* - for those old enough. :)
<frobware> rick_h_: https://github.com/frobware/juju/blob/1.20/provider/maas/environ.go#L659
<frobware> rick_h_: and this time for real: https://github.com/juju/juju/blob/master/provider/maas/environ.go#L1200
<alexisb> babbageclunk, I am going to change locations before we meet
<alexisb> will be back online in a few minutes, but may be a little later than half past
<babbageclunk> hi alexisb - sorry, was afk when you sent that message.
<fwereade> has anyone else seen cmd/jujud/agent failures with "model is not prepared", and/or deadlocks involving the dummy provider?
<rick_h_> frobware: so I just bootstrapped, created a machine with add-machine and from there ran python3 import yaml
<rick_h_> frobware: and it's there
<rick_h_> frobware: so I don't know if it's the image, the cloud-init, or whatever setup but pyyaml is ootb on a machine in xenial
<frobware> rick_h_: great & thanks
<frobware> rick_h_: the only downside given our current logic is that we look for python2 first
<frobware> rick_h_: do you still have the environment running? can you install python2 and try the same import?
<rick_h_> frobware: sure thing
<natefinch> fwereade: I'm here, but feel free to not be here, since it's dinner time over there
<rick_h_> frobware: no luck there, no yaml ootb in python2
<frobware> rick_h_: so the issue there is that there's no guarantee somebody doesn't customise their MAAS image and we get python2 by our current search order.
<rick_h_> frobware: you have to install python-yaml to get it in py2
<rick_h_> frobware: so can we just change the search order?
<rick_h_> frobware: I mean py3 is the default now?
<frobware> rick_h_: see the comment: https://github.com/juju/juju/blob/master/provider/maas/environ.go#L1200
<rick_h_> frobware: so I see/process that. I guess if we know the script works in py3, and if we find it (they modified so py3 is there in trusty for instance) what's the risk of things going boom? maybe the version of py3?
<frobware> rick_h_: I would say less so than Feb/March - it was all new at that point.
<frobware> rick_h_: and the unit tests test both python2 and 3 -- assuming you have both installed (on the SUT)
<babbageclunk> a quiet stokachu is a happy stokachu?
<rick_h_> frobware: ok, just pushing/questioning :)
<stokachu> babbageclunk: so far so good!
 * babbageclunk dances
<stokachu> babbageclunk: ive been beating the hell out of it with menno's patched binary
<stokachu> no issues yet
<frobware> rick_h_: yep, the issue we have is... unless you go and test all this manually we can't just suck it and see against a CI / test rig.
<frobware> rick_h_: which is true for a lot of what we do
<babbageclunk> stokachu: (ever seen Archer? I want to say phrasing)
<babbageclunk> stokachu: great!
<stokachu> babbageclunk: love archer
<rick_h_> frobware: understand, ok...so where does this leave you?
<rick_h_> frobware: you get yaml in py3, but not other stuff. But this is all about using the new networking in yakkety stuff which is yaml?
<rick_h_> frobware: or is this for something else?
<frobware> rick_h_: no, you're right this is yakkety (so python3) so maybe ok
<rick_h_> frobware: so if they monkey with the maas image we'll still know to use py3 because it's got yakkety only networkd stuff
<rick_h_> frobware: so triggering off the OS vs the python search tree seems like a potential path forward?
<frobware> rick_h_: the monkeying comes from me taking a default Y image and making my curtin/custom_pressed have python2 for nefarious reasons
<rick_h_> frobware: so you mean to not just add py2, but to remove py3?
<rick_h_> I mean, just because py2 is there doesn't mean we have ot use it as long as py3 is there as well.
<frobware> rick_h_: where "me" is some customer in the wild who knows he wants python2 from the get-go
<frobware> rick_h_: yep, we get first switch on series
<frobware> get/can
<frobware> rick_h_: well, that's likely to break stuff. the issue is really only in adding py2.
<rick_h_> frobware: right, but we can be resilient to adding py2 because we can be smarter than just looking at python available
<frobware> rick_h_: but I don't think we should sweat this. we can first switch on series.
<rick_h_> if we want to marshall/unmarshall yaml it's because networkd gave it to us and we know to use py3 in that case
<frobware> rick_h_: yep
<rick_h_> frobware: rgr
<rick_h_> sounds like a plan
<frobware> rick_h_:  I'm always wary of the sands shifting underneath my feet. :-D
<rick_h_> wheeeee
 * frobware is out of here before there's a python 4...
<rick_h_> lol
<mup> Bug #1605313 opened: juju 2.0-beta12 ERROR unable to contact api server after 61 attempts: upgrade in progress (upgrade in progress) <sts> <juju-core:New> <https://launchpad.net/bugs/1605313>
<redir> brb reboot
<mup> Bug #1605335 opened: Bootstrap agent initialization timeout too small <juju-core:New> <https://launchpad.net/bugs/1605335>
<mup> Bug #1605313 changed: juju 2.0-beta12 ERROR unable to contact api server after 61 attempts: upgrade in progress (upgrade in progress) <sts> <juju-core:New> <https://launchpad.net/bugs/1605313>
<perrito666> again?
<mup> Bug #1605383 opened: Tab complete causes traceback when trying to tab-complete juju commands <juju-core:New> <https://launchpad.net/bugs/1605383>
<rick_h_> natefinch: ty for the email, while that shakes out can we work on talking with techboard/etc folks on path forward but in the meantime pick up the windows bug in the todo side for now please?
<rick_h_> natefinch: and hold onto the add-cloud while the path forward shakes out
<natefinch> rick_h_: sounds good
<rick_h_> natefinch: cool ty
<redir> brb stepping away for a minute
<anastasiamac> natefinch-afk: could u plz review this one? http://reviews.vapour.ws/r/5291/
<anastasiamac> anyone else is welcome to PTAL :)
<mup> Bug #1604514 changed: Race in github.com/joyent/gosdc/localservices/cloudapi <ci> <joyent-provider> <race-condition> <juju-core:Fix Released by menno.smits> <juju-core 1.25:Fix Released by menno.smits> <https://launchpad.net/bugs/1604514>
<mup> Bug #1604559 changed: LockingSuite.TestTestLockingFunctionDetectsDisobeyedLock did not obey <ci> <intermittent-failure> <regression> <unit-tests> <juju-core:Fix Released by thumper> <https://launchpad.net/bugs/1604559>
<mup> Bug #1605057 changed: allWatcherStateSuite.TestStateWatcherTwoModels takes > 5 minutes <tech-debt> <unit-tests> <juju-core:Fix Released by thumper> <https://launchpad.net/bugs/1605057>
<mup> Bug #1589635 changed: github.com/juju/juju/state fails on TestMachinePrincipalUnits with an unexpected name <ci> <regression> <test-failure> <unit-tests> <juju-core:Fix Released> <https://launchpad.net/bugs/1589635>
<thumper> menn0: thoughts on this failure? http://juju-ci.vapour.ws:8080/job/github-merge-juju/8503/artifact/artifacts/windows-out.log
<mgz> thumper: just looks like bug 1600301
<mup> Bug #1600301: cmd/jujud/agent MachineSuite.TestHostedModelWorkers fails because compute-provisioner never really starts <ci> <intermittent-failure> <regression> <unit-tests> <juju-core:In Progress by dooferlad> <https://launchpad.net/bugs/1600301>
<menn0> thumper: so I've seen this kind of thing in that package quite a bit recently
<menn0> as I said in the call there seems to be a few different failure modes
<menn0> I'm guessing they're all the same root cause
#juju-dev 2016-07-22
<wallyworld> rick_h_: running a minute late
<wallyworld> in another meeting
<rick_h_> wallyworld: ty for the heads up
<rick_h_> menn0: got a sec?
<menn0> rick_h_: sure
<rick_h_> can you hop in https://hangouts.google.com/hangouts/_/canonical.com/rick?authuser=1
<thumper> wallyworld, menn0: http://reviews.vapour.ws/r/5294/diff/#
<wallyworld> thumper: we are both talking to rick, will look soon
<perrito666> bbl
<thumper> ack
<wallyworld> thumper: lgtm, ta
<thumper> ta
<alexisb> thumper, wallyworld, do we have anything in the release that will break cli scripts or that are api changes?
<alexisb> if so we need to send them to the list and note them in the release notes
<wallyworld> alexisb: not this release. when i land my queued branch after the beta is cut, that will be an api change
<alexisb> I didnt see anything, but it would be easy to miss something
<alexisb> yep wallyworld I remember that one
<wallyworld> alexisb: but even then, there's a shim for compatibility so stuff doesn't break
<alexisb> as soon as that lands we will need a note to juju-dev
<wallyworld> yep
<alexisb> menn0, did you see my draft note
<alexisb> anything you want me to add/edit?
<menn0> alexisb: sorry, I've been pulled all over the place
<menn0> alexisb: looking right now!
<alexisb> anastasiamac, how close are we to your PR landing?
<anastasiamac> alexisb: it has not been reviewed
<alexisb> o yeah, natefinch ^^^
<alexisb> that one is critical
<menn0> alexisb: done. Two small suggestions.
<alexisb> if you are around
<alexisb> menn0, thanks
<natefinch> anastasiamac: link me?
<anastasiamac> natefinch: the critical PR to review is: http://reviews.vapour.ws/r/5291/
<anastasiamac> natefinch: tyvm \o/
<redir> later #juju-dev
<alexisb> bye redir
<blahdeblah> Hi all - anyone able to tell me what this means?  ERROR cannot upgrade service "prometheus" to charm "prometheus": storage "metrics-filesystem" removed
<blahdeblah> ^ This is from an environment which, AFAICT, has never used juju storage
<blahdeblah> ah, looks like it's https://bugs.launchpad.net/juju-core/+bug/1599503
<mup> Bug #1599503: Cannot upgrade charm if storage is modified, even if the service doesn't use said storage <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1599503>
<blahdeblah> Anyone able to suggest a workaround until such time as that bug is investigated?
<alexisb> natefinch, just checking that you have anastasiamac review, otherwise I will allocate to someone else
<natefinch> alexisb: I am looking at it.  unfortunately, I'm not super familiar with how image metadata is used
<alexisb> natefinch, that is fine, so long  as you are looking at it I am good :)
<anastasiamac> natefinch: m happy to pair up for review if it helps \o/ want to HO?
<alexisb> thumper, when you get back please make sure to support anastasiamac in what she needs, I have left her with the "emailing to release team" task so I can eod.
<alexisb> goodnight all!
<anastasiamac> alexisb: nite!
<natefinch> anastasiamac: HO that would be good, to help me understand what's going on, so I don't have to type out a bunch of dumb questions :)
<anastasiamac> natefinch: awesome. ill create one right now
<anastasiamac> natefinch: invite ine calendar/email
 * thumper gets a cuppa before chasing up with menn0, anastasiamac, and wallyworld
<wallyworld> thumper: flat white please
 * anastasiamac does not nee to be chased. all  under control for now
<anastasiamac> need*
<anastasiamac> natefinch: PR updated... unless u want to PTAL, m reading for landing \o/
<natefinch> anastasiamac: I gave you a fixit then shipit.  That means I trust you to fix the problems.  If I didn't, it wouldn't get the shipit :)
<anastasiamac> natefinch: u r awesome. m pressing the button then ;D
<natefinch> do it! :)
<anastasiamac> blahdeblah: raised 1599503 to critical to be addressed in next beta. but when u say "production", what version of juju are u using? 1.25.6?
<mup> Bug #1605449 opened: juju2: command error message (usage) consistency <landscape> <usability> <juju-core:New> <https://launchpad.net/bugs/1605449>
<thumper> wallyworld: wanna catch up?
<thumper> anastasiamac: all good?
<wallyworld> sure
<anastasiamac> thumper: of course :D always, sir!
<thumper> anastasiamac: do we have everything landed we need to?
<anastasiamac> thumper: landing now. will send an email once it's done..
<anastasiamac> almost there...
<mup> Bug #1605383 changed: Tab complete causes traceback when trying to tab-complete juju commands <juju-core:New> <https://launchpad.net/bugs/1605383>
<wallyworld> thumper: fyi, that unit storage bug is a legit issue when destroying machines; no user warning needed, just a code fix it appears
<thumper> cool
<natefinch> wallyworld: how come add-credential requires the cloud/provider name on the CLI, but prompts for everything else?
<wallyworld> natefinch: the info we were given asked that add-credential take a cloud name as a positional arg. i guess that doesn't have to be the case. i think the expectation was that the user knew what cloud they were adding a credential for since they would do it as they wanted to bootstrap that cloud
<natefinch> wallyworld: in theory they know all the information they're prompted for :)
<wallyworld> huh? that's sort of irrelevant to the question at hand. the point of add-credential is to get what they know and have juju record it in credentials.yaml. the current add-credential simply expects them to nominate what cloud they are adding a credential for
<natefinch> wallyworld: I just mean, saying they know what cloud they want to add credential for doesn't seem like much of an argument, they should know what name they want to call the credential, too, etc.  It just seems weird to require one specific positional arg, but prompt for everything else.
<wallyworld> the use case is that they know what cloud they want to bootstrap, so instead of going through a step of prompting for that, short circuit it and just prompt for the bits that need to be interactive. but that's just one way of looking at it and was in the spec we were given
<wallyworld> no reason why it couldn't prompt for cloud name if not supplied
<natefinch> I think it would be more user friendly if we did that..  it would also put it in line with how bootstrap works (i.e. if there's no arguments, go into interactive mode).
<natefinch> anyway, I gotta get to bed, and plan on how to move to Canada if Trump becomes President
<blahdeblah> anastasiamac: thanks, but yeah - 2.x is not any good to us; it's *production*, not "production", and it's all 1.2x :-)
<blahdeblah> anastasiamac: For now I just worked around it by reverting the charm storage code to the way it was previously.  I'm about to test the upgrade now.
<anastasiamac> blahdeblah: ack
<babbageclunk> frobware: is there a standup today? or is that leftover from the old meetings?
<frobware> babbageclunk: left over
<babbageclunk> :(
<frobware> sorry
<babbageclunk> frobware: might pop my head into the Rick team standup today just for some interaction. (:
<frobware> babbageclunk: sure
<mup> Bug #1544724 changed: repeatedly checks /dev/fd0 when it doesnt even exist <juju-core:Invalid> <https://launchpad.net/bugs/1544724>
<mup> Bug #1605593 opened: Fix missleading values in error message: x is not a valid distro_series. <juju-core:New> <https://launchpad.net/bugs/1605593>
<perrito666> morning
<babbageclunk> Has anyone had any luck using go-guru on the juju codebase?
<perrito666> babbageclunk: I believe my vim-go setup uses it
<babbageclunk> perrito666: If I try to find callees using juju/juju/... as the scope one of my cores just shoots to 100% and nothing else happens.
<perrito666> babbageclunk: ah, I usually don't
<perrito666> but sounds plausible, especially if it doesn't have any caching policy
<rick_h_> macgreagoir: ping, morning
<macgreagoir> rick_h_: g'day
<rick_h_> macgreagoir: howdy, as OCR today I wanted to ping and see if you can get a chance to peek at http://reviews.vapour.ws/r/5290/ before your EOD please
<macgreagoir> rick_h_: I had tried earlier, but struggled to understand it to review. I will look again.
<rick_h_> macgreagoir: k, wrangle up someone to pair up with you then, get a hangout and go over it together
<macgreagoir> ack
<rick_h_> macgreagoir: and run down using the review checklist which should help at least knock out chunks of the review
<babbageclunk> perrito666: Turned out I was using way too broad a scope - setting something more specific churned for a bit but then gave good results.
<rick_h_> macgreagoir: so your pairing partner can limit down to things you have questions/uncertainty on
<rick_h_> dooferlad: can you take a few to meet up with macgreagoir to help please? ^
<rick_h_> macgreagoir: and in the future if you look but aren't sure make sure to leave a comment to the effect in the PR so we can see where the PR stands/etc.
<dooferlad> rick_h_: macgreagoir sure
<rick_h_> ty much dooferlad
<dooferlad> macgreagoir: give me a moment to refill my water. Melting here.
<macgreagoir> nw
<macgreagoir> dooferlad: I'm in the quartz ho, when you're ready.
<natefinch> gsamfira: you around?
<natefinch> sinzui: do you have a windows 2012 machine I could RDP into to check out its TLS support?
<sinzui> natefinch: developer-win-unit-tester.vapour.ws
<sinzui> natefinch: check your email for info about connecting
<sinzui> natefinch: and I promised alexisb that I will get a google doc explaining access to all CI hosts so that you don't need to wait for me
<natefinch> sinzui: awesome, thanks :)
<rick_h_> dooferlad: ping for standup
<mup> Bug #1564662 changed: Juju binaries should be stripped <packaging> <juju-core:Fix Released> <juju-core (Ubuntu):Fix Released> <https://launchpad.net/bugs/1564662>
<katco> redir: ping
<mup> Bug #1605653 opened: backup-restore failed creating collection juju.blockdevices <backup-restore> <ci> <regression> <xenial> <juju-core:Triaged> <https://launchpad.net/bugs/1605653>
<mup> Bug #1605653 changed: backup-restore failed creating collection juju.blockdevices <backup-restore> <ci> <regression> <xenial> <juju-core:Triaged> <https://launchpad.net/bugs/1605653>
<frobware> macgreagoir: want to touch base with the relations bug?
<macgreagoir> frobware: Yes, please. Standup HO?
<frobware> macgreagoir: 10mins
<macgreagoir> nw
<mup> Bug #1605653 opened: backup-restore failed creating collection juju.blockdevices <backup-restore> <ci> <regression> <xenial> <juju-core:Triaged> <https://launchpad.net/bugs/1605653>
<marcoceppi> balloons sinzui we haven't been putting beta builds into homebrew, this is problematic
<frobware> macgreagoir: in standup HO for whenever works
<sinzui> marcoceppi: homebrew thinks juju2 is problematic
<marcoceppi> why.
<sinzui> marand I have been working with them to put juju2 into homebrew
<sinzui> marcoceppi: it is not compatible with juju 1. Users will get an update and break
<redir> katco:pong
<marcoceppi> we should just build a `juju load-credentials` which takes .juju/environments.yaml and converts it
<sinzui> marcoceppi: This is the current PR after others were abandoned https://github.com/Homebrew/homebrew-devel-only/pull/47
<katco> redir: hey i was just looking at your race condition again trying to understand what i missed. i'm leaving a comment on the PR
<katco> redir: aside from the race detector complaining, are we seeing an actual issue from this?
<balloons> sinzui, sigh.. they killed the other PR in the end eh
<sinzui> balloons: yes. The direction is for users to install juju or juju2; they conflict. the user chooses the version they need.
<sinzui> balloons: marcoceppi: The killer was the fact that juju's plugin detection is bong
<marcoceppi> what does bong mean?
<redir> katco: ResetStats could change (create a new one with copied values) or SetStats could remove (setting to nil) the underlying stats object during access.
<redir> katco: AFAIU Stats is only used in testing. So I think it is possible to get wrong results, but as to whether we were seeing them -- I don't know
<stokachu> marcoceppi: something that takes the sticky icky?
<katco> redir: i think the point i was going to make is invalid. i am worried that we have a logic race in our tests that this will fix, but at the wrong logical level
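Editor's note: the race redir and katco are discussing (one goroutine swapping the underlying stats value while others read it) is the classic case the Go race detector flags, and serializing all access through one mutex is the usual fix. A minimal sketch with illustrative names (`Server`, `Incr`, `ResetStats` are not juju's actual API):

```go
package main

import (
	"fmt"
	"sync"
)

// Server holds a stats bag like the one redir describes: ResetStats
// swaps in a fresh value while other goroutines may be reading it, so
// every access must take the same mutex or `go test -race` reports it.
type Server struct {
	mu    sync.Mutex
	stats map[string]int
}

func (s *Server) Incr(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.stats[key]++
}

// ResetStats replaces the underlying map; without the lock this write
// is exactly what the race detector reports against concurrent Incr.
func (s *Server) ResetStats() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.stats = make(map[string]int)
}

// Snapshot copies the map under the lock so callers never hold a
// reference that ResetStats can yank away.
func (s *Server) Snapshot() map[string]int {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make(map[string]int, len(s.stats))
	for k, v := range s.stats {
		out[k] = v
	}
	return out
}

func main() {
	s := &Server{stats: make(map[string]int)}
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() { defer wg.Done(); s.Incr("requests") }()
	}
	wg.Wait()
	s.ResetStats()
	fmt.Println(s.Snapshot()["requests"]) // 0 after reset
}
```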
<mgz> OCR: review please, http://reviews.vapour.ws/r/5295/
<macgreagoir> mgz: ack
<mup> Bug #1605669 opened: grant-revoke User could not check status with read permission <blocker> <ci> <grant> <regression> <juju-core:In Progress by gz> <https://launchpad.net/bugs/1605669>
<mgz> macgreagoir: if it didn't get through before I dropped...
<mgz> for simplicity, compare with http://reviews.vapour.ws/r/5205 - it should be the inverse
<macgreag1ir> mgz redir: 5295 +1 from me too. mgz, thanks for the 5205 link.
<mgz> thanks guys
<frobware> dooferlad: it's something (ip6tables, et al) between my libvirt interface and the host that causes the packets to get dropped. see nothing obvious atm
<dooferlad> frobware: :-(
<macgreagoir> frobware: lp:1603473 updated. The work-around may not be helpful to the containers (if they just grab all active interfaces), but we'll see if it starts to help, at least.
<frobware> macgreagoir: thanks
<mup> Bug # changed: 1592456, 1600404, 1603865, 1604223
<rick_h_> katco: ping
<katco> rick_h_: pong
<rick_h_> katco: can I steal some time?
<katco> rick_h_: sure
<rick_h_> katco: https://hangouts.google.com/hangouts/_/canonical.com/rick?authuser=1
<alexisb> frobware, ping
<frobware> alexisb: pong
<alexisb> heya frobware happy friday
<alexisb> on this bug: https://bugs.launchpad.net/juju-core/+bug/1604482
<mup> Bug #1604482: MAAS bridge script should drop all 'source' stanzas from original file <2.0> <network> <juju-core:Triaged> <https://launchpad.net/bugs/1604482>
<alexisb> why is it critical if there is a workaround
<frobware> alexisb: so... people keep running into this. and the behaviour is subtly broken when it happens. I wanted to say "when juju bootstraps your node, we will fix networking for you"
<frobware> alexisb: I did originally take the opinion that we should rely on the workaround, but the curtin fix has not landed and it has been >4 ? weeks now
<alexisb> so that bug requires a curtin fix?
<frobware> alexisb: happy to drop the priority
<frobware> alexisb: the fix is committed, but not released. needs to get out of proposed
<alexisb> lets keep it on our radar but make it high
<alexisb> frobware, thanks for the info, I will leave you to your day now
<frobware> alexisb: fine for me
 * rick_h_ EOD's to pack and get ready to get out of here tonight. If anyone needs anything hit me up on my phone via email, telegram, or such
 * mgz sends smoke signals to rick_h
<mgz> *have*
<mgz> *a*
<mgz> *good*
<mgz> *flight*
<perrito666> runRawTransaction is the one that doesnt do the model uuid mangling right?
<mup> Bug #1605710 opened: Fix and reland axw/cli-model-owner <ci> <grant> <juju-core:Triaged by axwalk> <https://launchpad.net/bugs/1605710>
<mup> Bug #1605714 opened: juju2 beta11: LXD containers always pending on ppc64el systems <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1605714>
<redir> so master is unblocked?
<alexisb> redir, yes sir
<redir> SO I can $$merge$$?
<katco> CRAP! quick someone block it!
<alexisb> yes sir
<alexisb> katco, it is staying unblocked till thursday
<katco> noooo
<alexisb> but if you cause a regression you will get assigned to it as your top priority
<katco> i... i can write a race condition in if you want
 * redir light some firecrackers and pops champagne
<alexisb> katco, its your own pain ;)
<redir> great someone put conflicts in my way
<alexisb> alright all, I am off for a quick ride, back in about an hour
<alexisb> happy merging
<redir> have a good ride
<natefinch> oh internet explorer.... thanks for wrapping this plaintext file I downloaded in <html><body><pre>
<natefinch> ahh... windows is so crappy
<kwmonroe> hey.  i friggin love beta12.  nice work folks!!  (resource-get slowness was killing me, i feel alive again now)
<natefinch> kwmonroe: awesome! :)
<mup> Bug #1605747 opened: [ juju2 beta11 ] Maas system is deployed but agent remains pending <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1605747>
<natefinch> gah... I cannot for the life of me get .net to use tls 1.2
<perrito666> natefinch: the language?
<natefinch> the framework, yeah.  in this case I'm calling it from powershell... but it always requests tls 1.1 as the highest value
<natefinch> ah... because it's .net 4.0
<natefinch> well shit
<perrito666> just .net get a new ver... oh, thats right, you cant
<perrito666> </trollface<
<natefinch> lol, juju depends on .net 4.5, evidently ;)
<perrito666> natefinch: since when exactly?
<perrito666> it used to work with windows 2k images that, afaik, have 4.0
 * perrito666 admits his knowledge of the ungo-ly world of windows :p
<natefinch> but recently we changed to only supporting TLS 1.2
<natefinch> and 4.0 doesn't support TLS 1.2 as far as I can tell
<perrito666> ahh, I see
<natefinch> our userdata uses powershell, which uses .net
<perrito666> that is indeed a problem, because windows 2k is a very popular hypervisor img
<perrito666> and I dont even want to think in the consequences of upgrading .net
<perrito666> mm, isnt there a 3rd party .net implementation of tls 1.2?
<perrito666> that happens to be floss?
<perrito666> or a patch? or something?
<alexisb> kwmonroe, be sure to make use of all the extra time
<natefinch> well, I think the OS itself supports TLS 1.2, it's just that .net 4.0 doesn't use it, because of windows XP compatibility (at least that was suggested in an article I read, and sounds plausible)
<redir> should things work on mongo 3.2?
<perrito666> redir: things is a broad word mate
<perrito666> statistically speaking, stuff should not work in mongo 3.2
<perrito666> for most values of stuff
<perrito666> juju is not one of those though, it should just work (except for the tests, those wont)
<mup> Bug #1577949 changed: windows services cannot upgrade to 1.25.6 <ci> <regression> <upgrade-juju> <windows> <juju-core:Invalid> <juju-core 1.25:Fix Released by cherylj> <https://launchpad.net/bugs/1577949>
<mup> Bug #1605756 opened: [ juju2 beta11 ] system show up in juju status as pending but there is no attempt to deploy in maas <oil> <oil-2.0> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1605756>
<alexisb> look at perrito666 picking up aussie slang, nice job mate!
<perrito666> I might be spending too much time with down unders
<redir> perrito666: juju
<redir> I see so tests won't pass
<redir> so no juju doesn't work with 3.2 yet
<perrito666> if you increase timeout to something near 2 days they will actually
<natefinch> lol
<perrito666> for 10 points, guess what is the part of juju that does not work well with mongo 3.2?
<natefinch> uh, the part that uses mongo?
<alexisb> lol
<redir> mmm it doesn't even find its path
<perrito666> no one will say juju-conn suite?
<perrito666> redir: ah, that is new
<natefinch> heh.... we said tests
<perrito666> redir: you have to tell the tests where is mongo
<natefinch> the tests are what, like 10x slower on 3.2?
<perrito666> natefinch: I think it was 100x
<natefinch> OMG
<perrito666> we devised a way so they would run at the same speed, it required some hacking of the mongo options plus a ramdisk plus I cant recall what other thing
<perrito666> its all the fault of the new storage for mongo, it is fast for anything but what we do in the tests
<perrito666> if you make the tests run with the legacy storage they pass fast
<perrito666> the thing is, that storage engine is EOL
 * perrito666 knows more about this subject than he would like
<redir> perrito666: finds it if I set JUJU_MONGOD
<perrito666> yes, that is the one
<redir> and uninstall all the old mongos
<perrito666> the mongo using tests are not all that smart
<redir> which I did first
<perrito666> I have other adjectives for it
<redir> I have a few myself
<perrito666> lol
<redir> but thought I'd try out the tests while I shepherd some merges
<perrito666> this deserves t-shirts
<redir> with the new mongo pkg
<perrito666> redir: unless you intend to be merging for ~30min*100 better try something else
<redir> 10gen printed plenty of shirts
<redir> which might be part of the problem
<redir> perrito666: 3 merges...
<redir> ~ 27 minutes each
<redir> last one starting now
<perrito666> oh, so its 20 days, you might want to find something else to do
<mup> Bug #1575794 changed: Agent config format version should be changed for 2.0 <juju-release-support> <rc1> <tech-debt> <juju-core:Fix Released by anastasia-macmood> <https://launchpad.net/bugs/1575794>
<mup> Bug #1584565 changed: Documentation regarding user management is out of date <helpdocs> <usability> <juju-core:Fix Released by macgreagoir> <https://launchpad.net/bugs/1584565>
<mup> Bug #1590947 changed: TestCertificateUpdateWorkerUpdatesCertificate failures on windows <intermittent-failure> <tech-debt> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1590947>
<mup> Bug #1575794 opened: Agent config format version should be changed for 2.0 <juju-release-support> <rc1> <tech-debt> <juju-core:Fix Released by anastasia-macmood> <https://launchpad.net/bugs/1575794>
<mup> Bug #1584565 opened: Documentation regarding user management is out of date <helpdocs> <usability> <juju-core:Fix Released by macgreagoir> <https://launchpad.net/bugs/1584565>
<mup> Bug #1590947 opened: TestCertificateUpdateWorkerUpdatesCertificate failures on windows <intermittent-failure> <tech-debt> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1590947>
<natefinch> weird, it says we have .net framework 4.5 installed, but powershell is still using 4.0
<natefinch> nope, I'm wrong... .net 4.5 is clr version 4.0.30319.17001 - 18400 ... makes perfect sense
<mup> Bug #1605767 opened: MachineSuite.TearDownTest no reachable servers <ci> <intermittent-failure> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1605767>
<mup> Bug #1605769 opened: LXD Bootstrap Failure: Cannot add new machine <bootstrap> <lxd-provider> <ui> <juju-core:Triaged> <https://launchpad.net/bugs/1605769>
<mup> Bug #1605770 opened: firewallerSuite.TearDownTest inst.Dial() failed <ci> <intermittent-failure> <regression> <unit-tests> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1605770>
<mup> Bug #1605776 opened: annot dial mongo to initiate replicaset: no reachable servers <bootstrap> <ci> <mongodb> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1605776>
<mup> Bug #1605777 opened: munna/juju2 is too slow to deploy windows images <ci> <maas-provider> <regression> <windows> <juju-ci-tools:Triaged> <juju-core:Triaged> <https://launchpad.net/bugs/1605777>
<alexisb> alrighty all, spending some time with family after a long week and before  I travel
<alexisb> have a great weekend everyone
<katco> alexlist: tc
<katco> oops mt
<mup> Bug #1605790 opened: Unable to initialize agent <vpil> <juju-core:New> <https://launchpad.net/bugs/1605790>
<redir> perrito666: the full suite on 3.2 only takes about 2 minutes longer than 2.6 for me.
<perrito666> redir: nice, most likely using the old storage engine
<perrito666> if not, wow
#juju-dev 2016-07-23
<redir> does this failure mean anything to anyone at a glance:
<redir> http://goo.gl/f2Ns14
<perrito666> redir: I believe it's an intermittent one
<redir> yup
<redir> thanks
<redir> OK. EoW. Have a great weekend juju-dev
#juju-dev 2016-07-24
<mup> Bug #1581297 changed: confusing failure mode if juju-db systemd unit fails to start up <juju-core:Expired> <https://launchpad.net/bugs/1581297>
<mup> Bug #1585359 changed: juju debug-log prints some log messages, but quickly stops streaming <juju-core:Expired> <https://launchpad.net/bugs/1585359>
<mup> Bug #1605976 opened: [2.0] bump mongod to 3.2.8 <juju-core:New> <https://launchpad.net/bugs/1605976>
<mup> Bug #1605986 opened: Creating container: can't get info for image 'ubuntu-trusty' <oil> <oil-2.0> <juju-core:New> <https://launchpad.net/bugs/1605986>
<menn0> anastasiamac: apparently it's just going to be you and me in the standup today. did you still want to do it?
<anastasiamac> menn0: sure \o/ in 25min :D
<menn0> anastasiamac: yeah, usual time. just checking now :)
<menn0> anastasiamac: standup?
<mwhudson> menn0: oh huh, i didn't get an invite for that
<menn0> mwhudson: the tanzanite standup slot is now the new "A Team" standup slot
<menn0> mwhudson: the onyx standup no longer happens
<mwhudson> menn0: yeah, but i don't think i got invited to the "A Team" standup, is it in some calendar i should be able to see?
<menn0> mwhudson: I just sent you an invite
<mwhudson> obviously me not being invited is fine, but not the impression i got on friday...
<menn0> (I hope)
<menn0> mwhudson: tell me if you didn't get it
<mwhudson> i got an invite for today's meeting it seems
<mwhudson> but that one only
<mwhudson> i assume it recurs?
<menn0> mwhudson: grrr... yes it recurs
<anastasiamac> menn0: mwhudson: i can only add to an individual event, not recurrently :( m happy to do it if there is no other way :D
<mwhudson> anastasiamac: menn0 invited me tomorrow, i'll ask about it there
<anastasiamac> mwhudson: \o/
<anastasiamac> menn0: i'll play around with this today.. we were also missing chris :)
#juju-dev 2017-07-17
<axw> babbageclunk: as is my wont, here's a gigantic PR: https://github.com/juju/juju/pull/7644. I've split it up into 4 commits, so I hope it's not too hard. let me know if it's too much, and I'll send separate PRs
<babbageclunk> :)
<babbageclunk> axw: doesn't look too bad on the commit level - looking at it now.
<axw> babbageclunk: thank you. going for a ride, I'll bbs
<pranav_> Hello good folks. Need some assistance with subordinate charms
<pranav_> I am writing a subordinate charm which is subordinate of ceilometer application
<pranav_> but needs a regular relation with mysql:shared-db interface
<pranav_> is this possible to achieve?
<Dockerya> Folks - i've an openstack cluster on MaaS. Power outage happened and everything else seems to be working fine except ceph
<Dockerya> https://thepasteb.in/p/RghnZlB8zN2fz
<Dockerya> fixed clock skew and everything.. it still has issues electing a leader
<Dockerya> hi
<schkovich> hi guys. im stuck with bootstrapping to private openstack cloud with cryptic message: the configured region "RegionOne" does not allow access to all required services, namely: compute
<schkovich> openstack service list is correctly giving nova service and openstack endpoint list is giving all 3 URLs for nova
<schkovich> i found one similar bug https://bugs.launchpad.net/juju/+bug/1667020 but it expired
<mup> Bug #1667020: openstack keystone v3: misleading error when domain doesn't have access to compute endpoint <oil> <openstack-provider> <juju:Expired> <https://launchpad.net/bugs/1667020>
<schkovich> http://paste.ubuntu.com/25115244/
<schkovich> RegionServiceURLs:map is missing compute URL
<hml> schkovich: sorry for some basic questions... how did you configure the credentials in juju for your private openstack cloud
<schkovich> hml: juju add-credential --replace -f credentials.yaml ostom
<hml> schkovich: the bug is about keystone v3 is that what you're using?
<schkovich> hml: yes, i'm using keystone v3
<schkovich> hml: though, i don't think that it is v3 authentication problem since i would be already authenticated when error is thrown
<hml> schkovich: i had better luck setting up juju credentials using the "juju autoload-credentials" command when using keystone v3
<hml> schkovich: the pastebin output indicates that juju isn't able to use the credentials as they were given
<schkovich> hml: hmhh, okay, let me try it
<schkovich> hml: now authentication is failing completely :(
<hml> schkovich: interesting... and not in a good way - you've sourced a novarc file yes?  (just being paranoid.  :-) )
<schkovich> hml: yes, i  grabbed v3 config from os dashboard
<hml> schkovich: i see now the first line of the pastebin... so juju is getting different results from API calls to the openstack cloud.  one where authentication worked, and one where it didn't.
<hml> hrm
<schkovich> hml: i would say so :)
<schkovich> hml: the last try with auto-loaded credentials http://paste.ubuntu.com/25115313/
<hml> schkovich: iâm taking a look at the code for a hintâ¦.
<schkovich> hml: thanks. openstack is running on lxd ocata bundle
<schkovich> hml: figured it out
<schkovich> hml: auto-generated credentials are missing project-domain-name
<hml> schkovich: ahhâ¦ that would do it.
<schkovich> hml: of course, I have a different error now :)
<hml> ha
<schkovich> hml: no image metada found
<hml> schkovich: did you use conjure-up to deploy the openstack cloud?
<schkovich> hml: no, juju :)
<hml> schkovich: you loaded the metadata into swift it appears?  what did you call the service?
<schkovich> hml: most likely i did not set properly swift
#juju-dev 2017-07-18
<hml> if the service for the image metadata is setup and named per the docs - it will be found automagically by juju based on the naming.
<schkovich> hml: product-stream
<hml> and the service type is product-streams?
<schkovich> hml: yes
<hml> did the debug output list the URL juju was trying it?
<hml> you might need to list it as: http://10.0.8.52:80/swift/v1/simplestreams/images
<hml> if thatâs how you set it up
<schkovich> hml: image-metadata-url: invalid URL "http://10.0.8.52:80/swift/v1/streams/v1/index.json" not found
<hml> can you download from the URL I just listed?
<hml> if so , you can update your endpoint.
<hml> juju will add the streams/v1/index.json to the end itself
<schkovich> hml: http://10.0.8.52/swift/v1/simplestreams is giving me 200 OK
<hml> sounds like you just need to update the endpoint service url and/or the url given on the CLI
<schkovich> hml: yup, that was it :)
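Editor's note: the behaviour hml describes above (the operator supplies the simplestreams base URL and juju appends `streams/v1/index.json` itself) can be illustrated with a small Go sketch. `indexURL` is a hypothetical helper, not the actual simplestreams package code:

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// indexURL joins the simplestreams index path onto the base metadata
// URL, the way juju does when resolving image-metadata-url.
func indexURL(base string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = path.Join(u.Path, "streams/v1/index.json")
	return u.String(), nil
}

func main() {
	s, err := indexURL("http://10.0.8.52:80/swift/v1/simplestreams")
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // http://10.0.8.52:80/swift/v1/simplestreams/streams/v1/index.json
}
```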
<schkovich> hml: of course, im missing correct instance type now :)
<schkovich> hml: anyway, thank you very much
<hml> schkovich: youâre welcome.
<hml> my openstack-novalxd is using the m1.medium instance type for the controller fwiw, and m1.small for a unit currently
<schkovich> hml: i did not create all flavours eg m1.medium is missing
<schkovich> hml: should be simple but it's 2am on my side of the globe. time to go to bed. i will continue tomorrow. :)
<hml> schkovich: sounds good :-)
<schkovich> hml: it's so much easier when there is someone to discuss the problem with :) thanks once again
<axw> babbageclunk: thanks for the review. did I miss anything in the standup?
<axw> babbageclunk: https://github.com/juju/juju/pull/7646
<babbageclunk> axw: sorry, was out - no, nothing much
<ashipika> hi core team.. any chance i could get a review of https://github.com/juju/juju/pull/7642 ?
<menn0> ashipika: i'm just reviewing something else. i'll do yours next.
<ashipika> menn0: tyvm
<menn0> ashipika: done
<rogpeppe1> i've got a PR for a change to testing/checkers that will undoubtedly break juju tests (but in a good way, i'd suggest). https://github.com/juju/testing/pull/129
<hml> crunchywelch: welcome - here's the wiki link: https://github.com/juju/juju/wiki/Implementing-environment-providers
<crunchywelch> aaay, thanks! o/
<babbageclunk> axw: I'm looking at PR 7649 now
<axw> babbageclunk: muchas gracias
<babbageclunk> axw: It looks good, except I'm confused by destroy methods with don't-destroy parameters.
<blahdeblah> any parameter on anything that says "don't do X" should be burned with fire
<axw> babbageclunk: you mean "juju remove-storage --no-destroy" ?
<axw> blahdeblah: happy to hear alternative suggestions to ^^
<blahdeblah> axw: Pretty sure there's a bug about that which says that destroying should be non-default. :-)
<axw> blahdeblah: AFAIK there's only a bug that says destroying an application/unit shouldn't destroy the storage, and that's the case now
<blahdeblah> Ah, that might be the one I was thinking of.
<babbageclunk> axw: No, I'm fine with `remove-storage --no-destroy` - it's the methods internally that are called DestroyX but have a parameter that means that they don't destroy the x.
<axw> babbageclunk: ok. probably the API method, I'll check the review
<blahdeblah> I was referring more to the general case where code does X by default and then someone puts in an option to toggle the behaviour, and calls it "disable_X" which defaults to false rather than "enable_X" and defaulting to true.  It just makes the logic more complex for the user to understand.
<babbageclunk> axw: I'd rather they were named in the same way as the command - RemoveX with an argument that says whether to also destroy.
<axw> babbageclunk: hmm I guess I'm OK with that. the main reason it's not Remove is because of the distinction between Destroy and Remove in state
#juju-dev 2017-07-19
<axw> blahdeblah: if the default should be to do X, what's a better approach?
<axw> forcing someone to say "--do-X" when X is the obvious default isn't very friendly
<axw> (whether the obvious default is to destroy on remove is debatable, but it's the default already and I'm not sure we can change *that* behaviour)
<blahdeblah> axw: Personally, whenever it's destructive, I wouldn't have a default.  But like you say, may not be something you can change...
<axw> blahdeblah: that has come up as an option, it's probably what we're going to do with destroy-model/destroy-controller (force you to choose between destroying/keeping storage). so *maybe* for remove-storage too
<babbageclunk> axw: Yeah, I think I've talked myself around in the state case - it *is* destroying the storage in the model, it's just not destroying the underlying cloud storage. But there are also places in the providers and provisioner that I think would be clearer not calling the method Destroy.
<axw> blahdeblah: FWIW, the spelling at this stage for destroy-model will be --keep-storage/--destroy-storage
<blahdeblah> That makes good sense
<axw> so no double negatives in sight :)
<blahdeblah> \o/
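Editor's note: the spelling axw settles on, two positively named, mutually exclusive flags instead of a "--no-X" double negative, can be sketched with the standard `flag` package. Hypothetical CLI wiring, not juju's actual cmd package:

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// storageChoice parses --keep-storage/--destroy-storage and forces the
// operator to pick exactly one, so the destructive choice is explicit.
func storageChoice(args []string) (destroy bool, err error) {
	fs := flag.NewFlagSet("destroy-model", flag.ContinueOnError)
	keep := fs.Bool("keep-storage", false, "release storage instead of destroying it")
	destroyStorage := fs.Bool("destroy-storage", false, "destroy the model's storage")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	switch {
	case *keep && *destroyStorage:
		return false, fmt.Errorf("--keep-storage and --destroy-storage are mutually exclusive")
	case !*keep && !*destroyStorage:
		return false, fmt.Errorf("specify one of --keep-storage or --destroy-storage")
	}
	return *destroyStorage, nil
}

func main() {
	destroy, err := storageChoice(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Println("destroy storage:", destroy)
}
```

The design point from the exchange: both names read positively, so there is no default that silently destroys data and no double negative for the user to decode.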
<axw> babbageclunk: ok. I'll see what can be done to clarify
<babbageclunk> axw: Anyway, just wanted to give you a heads-up before I finish the review - it started to feel a bit like I was harping on about it, but it's really my only issue with the PR.
<axw> babbageclunk: no problem, I'm always happy to make the code easier to comprehend
<axw> hml: would you kindly take a quick look at https://github.com/juju/juju/pull/7653?
<hml> axw: looking
<axw> hml: thanks
<axw> babbageclunk hml: standup
<babbageclunk> sorry!
<axw> babbageclunk: thanks for the review. ReleaseVolumeParams doesn't really work because release = non-descructive remove. how about "RemoveVolumeParams"? i.e. params for destroying or removing the (cloud) volume
<axw> er, non-destructive
<babbageclunk> axw: Yeah, I like remove too
<babbageclunk> axw: That makes sense - so remove becomes the general term that could denote a release or a destroy.
<axw> babbageclunk: yep. except in state :)
<babbageclunk> axw: sure, not much we can do about that one.
<axw> babbageclunk: can you PTAL at https://github.com/juju/juju/pull/7649/commits, I haven't rebased yet so you can see the changes I've made today
<babbageclunk> axw: thanks - looking
<axw> babbageclunk menn0: do you know any tricks for removing a method from a newer version of a facade? I'm embedding v3 in v4 for the Storage API, but I want to drop a method... I'd rather not duplicate methods for the ones I want to keep
<babbageclunk> axw: If you put a method on the facade that takes 2 args, the RPC layer will ignore it.
<babbageclunk> axw: That effectively removes it.
<babbageclunk> axw: (This hack brought to you by wpk)
<menn0> axw: what babbageclunk said. I think the Uniter facade does that.
<axw> babbageclunk: that'll work, I think I saw something like that fly by
<axw> babbageclunk menn0: ok, ta
<menn0> axw: see the bottom of apiserver/uniter/uniter.go
<babbageclunk> axw: what he said
<menn0> axw: it's kinda awful but it works
<axw> cool, got it
<axw> yeah
<babbageclunk> yeah, it's gross but better than the alternative.
 * babbageclunk really should make an emacs func that builds a github url for a given go file.
<babbageclunk> axw: Reviewed - for some reason it marked my comments as outdated, not sure why - maybe a rebase?
<babbageclunk> axw: lgtm, anyway
<babbageclunk> axw: Thanks for making those changes, sorry if it was a pain!
<axw> babbageclunk: not at all
<axw> babbageclunk: thank you. not sure why outdated, I didn't rebase
<ashipika> menn0: hey.. i fixed https://github.com/juju/juju/pull/7642 , if you could please take another look at it..
<menn0> ashipika: reviewed!
<menn0> just one suggestion
<menn0> but ship it
<ashipika> menn0: tyvm
<rogpeppe> jam: I just responded to https://github.com/juju/testing/pull/129 FWIW
<mattyw> anyone around to discuss update status?
<niedbalski> wpk, ping
<mattyw> or more generally how uniters respond to lost connections with controllers
<bdx> hello everyone
<bdx> how are new series enabled in metadata for Juju to use?
<niedbalski> wpk, hey
<wpk> hello
<niedbalski> wpk, we were testing the reload-spaces functionality and I wondered what's the limitation/constraint for not updating the already existing space names and just the ids/subnets? i.e. we changed the name of a space on maas, run reload-spaces but the name remains as the original
<wpk> niedbalski: it's the next step - we identify the space by name so in case it is changed we need to trace all places in which it's used and change it too
<wpk> niedbalski: it is on our roadmap
<niedbalski> wpk, ok, so its a known constraint..
<niedbalski> wpk, is there a LP bug for tracking this implementation?
<niedbalski> anastasiamac, ^^ do you know?
<wpk> niedbalski: I'm looking and there is one for space/subnet remove but I can't find one for rename, you're free to create one
<niedbalski> wpk, ok
<niedbalski> wpk, thanks!
<anastasiamac> niedbalski: not that i can immediately recall ;) so probably not :D
#juju-dev 2017-07-20
<axw> burton-aus: standup?
 * babbageclunk goes for a run
<babbageclunk> bdx: hey, if you need I can help clean up those failing cleanups that are preventing a migration?
<bdx> babbageclunk: that would be amazing
<babbageclunk> bdx: are you familiar with doing stuff in the mongo shell?
<bdx> yeah
<bdx> what do I need to do
<babbageclunk> bdx: cool
<babbageclunk> bdx: find the records in the cleanups collection and delete them
<bdx> awesome
<babbageclunk> bdx: (they should be the only records in the collection, so that shouldn't be hard)
<bdx> ok
<babbageclunk> bdx: You should be able to see them with db.cleanups.find().pretty()
<bdx> babbageclunk: should I need to pass any args to get into the mongo shell?
<bdx> ahh port
<bdx> should be all eh?
<babbageclunk> bdx: hang on, pulling up the wiki page - I've got a script so I've forgotten all of the details.
<babbageclunk> bdx: you'll need the password from the agent config as well: https://github.com/juju/juju/wiki/Login-into-MongoDB
<bdx> thats a goodie!
<bdx> ok im in
<babbageclunk> bdx: can you see the problem cleanups?
<bdx> http://paste.ubuntu.com/25135798/
<bdx> they are resources
<bdx> strange
<babbageclunk> bdx: great, just like we'd expect from the log.
<babbageclunk> Oh, have you seen my comment on the bug?
<bdx> do you think there might be a bug where juju isn't cleaning up resources adequately on application removal?
<bdx> no, checking
<babbageclunk> Well, the error in the log is that the cleanup is failing because the resource is already removed.
<babbageclunk> So I'm fixing the cleanup to call that success (although I don't know why the resource has already gone).
<bdx> wow
<bdx> that makes sense
<bdx> ok
<babbageclunk> Potentially the cleanup was queued but some other operation removed the resource before it ran?
<bdx> I cant really be sure ... I think this is happening on another controller too
<bdx> checking
<bdx> babbageclunk: looks like I do have another one of these plaguing another controller http://paste.ubuntu.com/25135820/
<bdx> so I can just remove this entry from the db and all will be good?
<bdx> to some extent
<babbageclunk> bdx: assuming you have the same error in the logs for that controller, then yes.
<bdx> as far as getting migrations to work goes
<babbageclunk> Yup
<bdx> babbageclunk: I can just run db.cleanups.remove() ?
<bdx> to clear these out
<babbageclunk> bdx: Might be better to do them by id just to make sure you don't inadvertently get any others.
<bdx> entirely
<babbageclunk> bdx: So db.cleanups.remove({_id: "129a3d11-4d70-4504-86b6-c8442c95ae12:ObjectIdHex(\"590a50bbfd51631474ae0182\")"})
<babbageclunk> etc
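Putting babbageclunk's steps together, the whole session looks roughly like this in the mongo shell on the controller (after logging in per the wiki page linked above; the _id below is copied from bdx's paste, and any stuck cleanup docs on another controller will have different ids):

```
// inspect the stuck cleanup documents
db.cleanups.find().pretty()

// remove a specific one by its exact _id
// (safer than a blanket db.cleanups.remove())
db.cleanups.remove({_id: "129a3d11-4d70-4504-86b6-c8442c95ae12:ObjectIdHex(\"590a50bbfd51631474ae0182\")"})

// confirm nothing is left before retrying the migration
db.cleanups.count()
```

Removing by exact _id, as babbageclunk suggests, avoids inadvertently deleting cleanups that are still valid.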
<bdx> I got the same resource cleanup error in the logs on the other controller http://paste.ubuntu.com/25135845/
<bdx> awesome, I was just getting there
<bdx> thx
<babbageclunk> bdx: yeah, looks the same
<babbageclunk> bdx: ok, after that the migration should work (hopefully)
<bdx> ok they are cleaned up
<bdx> trying the migration now
<bdx> babbageclunk: I'm migrating!
<bdx> babbageclunk: many thanks
<babbageclunk> bdx: no worries!
<axw> anastasiamac: standup?
<axw> veebers: ^
<veebers> axw: ah oops, need a couple secs to get headset, omw
#juju-dev 2017-07-21
<seyeongkim> could somebody check this LP? https://bugs.launchpad.net/juju/+bug/1704376
<mup> Bug #1704376: juju failed to deploy lxd when making bridge <sts> <juju:New> <https://launchpad.net/bugs/1704376>
 * anastasiamac looking
<anastasiamac> seyeongkim: looks awesome \o/ majority of ppl that can comment intelligently on the subject will be waking up soon
<anastasiamac> i'll triage it high for now :D
<seyeongkim> thanks anastasiamac
<anastasiamac> seyeongkim: loved code tracing details btw!! yvm :D
<seyeongkim> heh
<anastasiamac> tyvm*
<seyeongkim> no problem :)
<babbageclunk> axw: can you review this? https://github.com/juju/juju/pull/7664
<axw> babbageclunk: sure, looking
<babbageclunk> menn0: Feel free to take a look too if you're still around? :)
<babbageclunk> axw: thanks!
<axw> babbageclunk: too easy, LGTM
<jamespage> jam: https://bugs.launchpad.net/juju-core/+bug/1386926
<mup> Bug #1386926: provide a way to upgrade-charm when there are removed peer relations <canonical-bootstack> <canonical-webops> <relations> <upgrade-charm> <juju:Triaged> <juju-core:Won't Fix> <nova-compute (Juju Charms Collection):Fix Released by gnuoy> <https://launchpad.net/bugs/1386926>
<bdx> babbageclunk: hitting another bump in the road > https://bugs.launchpad.net/juju/+bug/1705730
<mup> Bug #1705730: `juju migrate` fails with source prechecks failed ERROR <juju:New> <https://launchpad.net/bugs/1705730>
#juju-dev 2017-07-22
<schkovich>  /LOGOUT
#juju-dev 2018-07-17
<domc> Hi all, multiple attempts to create a sample charm using "charm create simple" are failing with a Python "TypeError: a bytes-like object is required, not 'str'" in /snap/charm/158/usr/lib/python3.5/tempfile.py". Any suggestions?
<domc> I have even gone as far as a fresh OS install (Ubuntu 16.04) and then reinstalling snapd and charm, no luck
<domc> Would be very grateful for some help or tips.
<domc> Or maybe even a sign of life really...
