#juju-dev 2012-09-24
<TheMue> morning
<TheMue> fwereade, rogpeppe: hi
<fwereade> TheMue, heyhey
<fwereade> rogpeppe, heyhey
<rogpeppe> TheMue, fwereade: yo!
<rogpeppe> fwereade, TheMue: i was planning to take a swap day today, but the weather is minging
<rogpeppe> fwereade: so maybe i won't
<TheMue> rogpeppe: understandable
<fwereade> rogpeppe, yeah, I haven't quite decided yet myself... I think I'll still be in figuring-it-out mode for a bit
<fwereade> I will at least get the rename-kill-die branch in though
<fwereade> should be ready, but a full test run will take forever
<fwereade> (couldn't figure out how to prevent the sync on open yet, and it's a distinct change anyway...)
<rogpeppe> fwereade: i took a little break from juju yesterday, and did a little JSON-compatible encoding i'd been wanting to do for ages: http://go.pkgdoc.org/launchpad.net/goson
<fwereade> rogpeppe, cool, I remember you mentioning that idea
<rogpeppe> fwereade: it seems to work well
<rogpeppe> fwereade: (i forked the go JSON package, so i got the advantage of a pretty good test suite)
<rogpeppe> fwereade: see the MarshalIndent example to see how it looks
<fwereade> rogpeppe, the example links appear to be broken
<rogpeppe> fwereade: i'm going to use it for displaying and editing JSON even if i don't actually store or transmit anything in this format.
<rogpeppe> fwereade: hmm, works for me
<rogpeppe> fwereade: http://go.pkgdoc.org/launchpad.net/goson#_example_MarshalIndent
<rogpeppe> fwereade: you need to click on the Example link inline
<fwereade> rogpeppe, yeah; doing so takes me to an arbitrary top level doc for some reason
<rogpeppe> fwereade: weird
<fwereade> rogpeppe, ah no, it takes me *just* below but doesn't expand
<rogpeppe> fwereade: you can also look at the example source: http://bazaar.launchpad.net/+branch/goson/view/head:/example_test.go#L85
<fwereade> rogpeppe, just go getting it :)
<rogpeppe> fwereade: :-)
<rogpeppe> fwereade: now there's a command "goson" - takes goson or json on stdin and produces goson or json on stdout; launchpad.net/goson/cmd/goson
<fwereade> rogpeppe, neat :)
<rogpeppe> fwereade: goson -indent '' (compact output) reduces json size by 7% on average too. that's quite nice.
<fwereade> rogpeppe, sweet :)
<rogpeppe> fwereade: here's a nice before/after comparison: before: http://paste.ubuntu.com/1224159/; after: http://paste.ubuntu.com/1224160/
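(The pastes above have since expired. As a rough illustration of the size point rogpeppe is making: goson was forked from Go's standard encoding/json package, so its Marshal/MarshalIndent API is analogous; this sketch uses the standard library only, and `compactSmaller` is an illustrative helper, not goson API.)

```go
// Illustrates why compact output ("goson -indent ''") saves space
// compared to indented output, using the standard library.
package main

import (
	"encoding/json"
	"fmt"
)

// compactSmaller reports whether the compact encoding of v is shorter
// than the two-space-indented one.
func compactSmaller(v interface{}) bool {
	compact, err1 := json.Marshal(v)
	indented, err2 := json.MarshalIndent(v, "", "  ")
	if err1 != nil || err2 != nil {
		return false
	}
	return len(compact) < len(indented)
}

func main() {
	status := map[string]interface{}{
		"services": map[string]interface{}{
			"builddb": map[string]interface{}{"exposed": false},
		},
	}
	fmt.Println(compactSmaller(status)) // → true
}
```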
<fwereade> rogpeppe, weird, the before doesn't exist
<fwereade> rogpeppe, TheMue: btw, I really ought to know what exactly you guys are working on
<fwereade> rogpeppe, I am torn between, y'know, fixing what looks like avoidable ugliness and noise in the tests, and doing the RelationUnitsWatcher I think I really *should* be working on
<TheMue> fwereade: currently i'm going through the reviews in gustavos last mail
<fwereade> rogpeppe, but just convincing myself that what I just merged made the test suite no worse than it was before took a disturbingly long time
<rogpeppe> fwereade: i'm working on getting environs/ec2 working live, currently.
<fwereade> rogpeppe, cool, I'm sticking with RUW for the time being
<TheMue> lunchtime
<niemeyer> Good morning!
<TheMue> morning
<rogpeppe> niemeyer: yo!
<rogpeppe> niemeyer: nice work on the plane, BTW
<fwereade> niemeyer, heyhey
<niemeyer> rogpeppe: Thanks
<fwereade> ok, my brain is melted for now... bbl
<niemeyer> fwereade: Cheers
<niemeyer> I'm just off a meeting, and will grab some lunch
<fwereade> GAAAAH
 * fwereade now better understands StartSync's place in the ecosystem
 * fwereade kinda wants to kick something
<fwereade> (it emerges that -- obviously, in hindsight -- just StartSyncing before waiting for a watch is *not* enough if the event you're watching for hasn't happened yet)
<TheMue> fwereade: does the current ServiceRelationsWatcher do what you expected in https://bugs.launchpad.net/juju-core/+bug/1032539 ?
<rogpeppe> fwereade: ah, good point.
<rogpeppe> fwereade: i wonder if StartSync should really be SetPollInterval(somethingSmall)
<rogpeppe> fwereade: then it won't be so vulnerable to buffering between request and reply
<niemeyer> -1 in principle.. the current behavior is pretty great in that it forces race conditions to be visible. I'm pretty sure there were races with zk that were never seen just because it happened to return fast enough from the watch locally.
<niemeyer> The observable delay between action + watch is real
<rogpeppe> niemeyer: the problem with StartWatch is that the delay might be >1s
<niemeyer> and it'll be even more real in a loaded and distributed system
<niemeyer> rogpeppe: Just make the logic deterministic then
<niemeyer> rogpeppe: See both the provisioner and the firewaller for examples of it working fine
<rogpeppe> niemeyer: i'm wondering if perhaps those are easier because they're making changes directly to the state, whereas (i hypothesize) perhaps some operations on the uniter do not respond to the state directly.
<rogpeppe> niemeyer: maybe that's not true though
<niemeyer> rogpeppe: I'm sure there are trickier cases waiting to be seen, even if that's not the case now.. I don't think yet that those guarantee a change in the rules, though
<rogpeppe> niemeyer: you may well be right. we'll see how things develop.
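(A toy model of the StartSync pitfall fwereade hit above: a watcher that only polls state when nudged. All names here — `toyWatcher`, the channel layout — are illustrative, not juju-core's actual API. The point is that syncing *before* the change you are waiting for has happened delivers nothing; the sync must follow the change.)

```go
// A watcher that only observes state when it receives a sync nudge.
package main

import "fmt"

type toyWatcher struct {
	change chan struct{} // state changes land here
	sync   chan struct{} // StartSync-style nudges
	events chan string   // delivered events
}

func (w *toyWatcher) run() {
	pending := false
	for {
		select {
		case <-w.change:
			pending = true
		case <-w.sync:
			// drain any change that raced in just before the nudge
			select {
			case <-w.change:
				pending = true
			default:
			}
			if pending {
				pending = false
				w.events <- "change"
			}
		}
	}
}

// demo returns what a test would observe: nothing after an early sync,
// and the event only after a sync that follows the change.
func demo() []string {
	w := &toyWatcher{make(chan struct{}, 1), make(chan struct{}), make(chan string, 1)}
	go w.run()
	var seen []string
	w.sync <- struct{}{} // sync before the change: nothing to deliver
	select {
	case e := <-w.events:
		seen = append(seen, e)
	default:
		seen = append(seen, "no event yet")
	}
	w.change <- struct{}{} // now the change happens
	w.sync <- struct{}{}   // and a later sync surfaces it
	seen = append(seen, <-w.events)
	return seen
}

func main() {
	fmt.Println(demo()) // → [no event yet change]
}
```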
<niemeyer> TheMue: ping
<niemeyer> Hmm.. uh oh.. merge race
 * rogpeppe is off for the day. see y'all tomorrow.
<niemeyer> rogpeppe: Have a good evening man
<TheMue> re
<niemeyer> TheMue: Yo
<fwereade> niemeyer, if you're in the mood for some light reading: https://codereview.appspot.com/6567044/
<fwereade> niemeyer, not sure if you saw: https://codereview.appspot.com/6567044/ has a RelationUnitsWatcher (and a few Uniter tests passing)
<niemeyer> fwereade: I haven't seen it yet
<niemeyer> fwereade: I'm trying to get the uniter tests to pass as-is as well
<fwereade> niemeyer, very recent -- but I mentioned it shortly before you timed out
<niemeyer> fwereade: Ah, no, I didn't get that, thanks
<niemeyer> fwereade: I have a few fixes for the uniter already
<fwereade> niemeyer, ah, I was doing more uniter tests on top of that
<niemeyer> fwereade: Mainly ports from the specific watches to the entity ones
<fwereade> niemeyer, mostly fixing up unit/service watchers
<niemeyer> fwereade: Oh, but do you have uniter tests passing yet?
<fwereade> niemeyer, ha, we are colliding right now
<fwereade> niemeyer, only some so far but it seems healthyish
<niemeyer> fwereade: I imagined we could, but I wasn't too worried about having *two fixes* for the uniter :-)
<niemeyer> fwereade: I'm about to "time out"
<fwereade> niemeyer, ha, yeah :)
<fwereade> niemeyer, np, I should too
<niemeyer> fwereade: So perhaps I can push a branch for your enjoyment so that you can merge what you find useful, and drop what you don't
<fwereade> niemeyer, we can look at the fixes tomorrow :)
<fwereade> niemeyer, cool
<niemeyer> fwereade: Hold on, I'll send the WIP
<fwereade> niemeyer, yes please :)
<niemeyer> fwereade: Curious about what you find
<niemeyer> fwereade: I was just starting to debug a non-stopping watcher.. but I have to visit a relative that has an injured leg in a bit
<fwereade> niemeyer, no worries
<niemeyer> Pushing
<niemeyer> fwereade: It all works, btw
<niemeyer> fwereade: Surprisingly, even the uniter
<fwereade> niemeyer, surprisingly?
 * fwereade looks smug
<fwereade> ;)
<niemeyer> fwereade: I did a full deployment of builddb
<fwereade> niemeyer, awesome :D
<niemeyer> fwereade: Well, considering we did no work after we moved the state, I was expecting it to need tweaks, as we're doing now :)
<fwereade> niemeyer, yum yum dogfood ;)
<niemeyer> fwereade: I was surprised that it worked despite us not touching it
<niemeyer> fwereade: Good job :)
<niemeyer> fwereade: https://codereview.appspot.com/6562045
<fwereade> niemeyer, sweet, thanks
<niemeyer> fwereade: Oops.. I missed the pre-req.. please ignore the changes in unit
<fwereade> niemeyer, np
<niemeyer> fwereade: These are already up for review in an independent branch
<fwereade> niemeyer, cool, I must catch up with those tomorrow
<niemeyer> fwereade: Very quickly, did we do the same thing: https://codereview.appspot.com/6562045/diff/1/worker/uniter/modes.go
<fwereade> niemeyer, yeah, give or take a var name or two ;)
<niemeyer> fwereade: This is awesome.. for awesome, I appreciate the duplication of efforts :-)
<niemeyer> s/for awesome/for once
<fwereade> niemeyer, haha, yeah :)
<niemeyer> fwereade: I know it's late there, but in case you want the no-error unit.Resolved in trunk by the time you wake up, just have a quick look at: https://codereview.appspot.com/6570043/; I'll do required changes and merge when I'm back, if so.
<niemeyer> fwereade: Otherwise, we can easily handle it tomorrow
 * fwereade looks
<fwereade> niemeyer, LGTM, much nicer
<niemeyer> fwereade: cheers
<niemeyer> fwereade: Bootstraps are working again with trunk, btw
<fwereade> niemeyer, excellent
<niemeyer> Well, with the branch I'm merging right now
<niemeyer> Alright, I'm stepping out..
<niemeyer> fwereade: Have a good sleep there
<fwereade> niemeyer, cheers, I hope your relative mends soon
<niemeyer> fwereade: Thanks!
#juju-dev 2012-09-25
<davecheney> niemeyer: testing/mgo_test.go wasn't being run
<davecheney> i'll roll that fix into my next proposal
<niemeyer> davecheney: Oops.. thanks!
<davecheney> niemeyer: https://code.launchpad.net/~niemeyer/goetveld/trunk/+merge/126136
<davecheney> ^ not proposed with lbox, sorry
<davecheney> urgh, change list is empty as well ...
<davecheney> ahh, because i did it ass backwards
<davecheney> niemeyer: https://code.launchpad.net/~dave-cheney/goetveld/001-add-darwin-termios-support/+merge/126137
<niemeyer> davecheney: LGTM
<niemeyer> davecheney: I've just changed the project so it's trunk is owned by the gophers team
<niemeyer> s/it's/its/
<niemeyer> davecheney: Please feel free to repropose against the new trunk and submit it straihgt
<niemeyer> straight
 * niemeyer => bed
<fwereade> ha! my first Refresh-related bug with a long-lived-entity
<rogpeppe> fwereade: morning!
<rogpeppe> fwereade: it was bound to happen
<fwereade> rogpeppe, what would you think about putting Refresh in at the start of Watch on Unit and Service?
<fwereade> rogpeppe, that way the initial event actually is the current state, rather than potentially being whatever aged rubbish you started to watch?
<rogpeppe> fwereade: what are the events that those watchers produce?
<fwereade> rogpeppe, something-changed
<fwereade> rogpeppe, send fresh units back
<rogpeppe> fwereade: so won't the initial event be a fresh unit?
<rogpeppe> fwereade: and hence have all the currently correct settings?
<fwereade> rogpeppe, AFAICT it'll correspond to *some* transaction more recent than the revno of the old document
<fwereade> rogpeppe, hey wait that makes no sense
<rogpeppe> fwereade: hmm, that seems wrong
<fwereade> rogpeppe, ok, time seemed to be flowing in reverse, and a Refresh fixed it, but clearly more thought is required
<rogpeppe> fwereade: the initial event should be the most recent revno
<fwereade> rogpeppe, ah, no, the initial event is just the unit you originally watched
<rogpeppe> fwereade: that seems wrong to me
<fwereade> rogpeppe, yeah, agreed
<rogpeppe> fwereade: then there's no point in having the initial event
<fwereade> rogpeppe, well, yeah; we could either Refresh it or reconstruct a new one and send that back
<rogpeppe> fwereade: i'd incline towards the latter
<rogpeppe> fwereade: we don't do any implicit Refresh anywhere else
<rogpeppe> TheMue: yo!
<fwereade> rogpeppe, ok, sgtm
<rogpeppe> fwereade: and every other event on the channel is a fresh object
<fwereade> TheMue, heyhey
<fwereade> rogpeppe, yeah, feels neat
<TheMue> morning rogpeppe and fwereade
<fwereade> rogpeppe, TheMue: should be trivial: https://codereview.appspot.com/6571047
<rogpeppe> fwereade: LGTM
<fwereade> rogpeppe, cheers; trivial enough to submit directly, do you think?
<rogpeppe> fwereade: i *think* so, but it is a significant change in one sense.
<rogpeppe> fwereade: even though the code is trivial
<fwereade> rogpeppe, in the sense that it causes the uniter to work properly, it is significant ;p
<fwereade> rogpeppe, and I think it is unambiguously (1) correct, because watches should send current state and then deltas; and (2) safer, because we don't ever send the originating entity over to some unknown client over a channel
<rogpeppe> fwereade: it's that last thing that i think is the killer argument
<fwereade> rogpeppe, indeed, it is only secondary in my mind because it wasn't the thing that bit me
<rogpeppe> fwereade: i'd say go for it
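(A sketch of the fix being agreed here: the watcher's initial event is built from current state rather than sending back the possibly-stale entity the watch was started on. Types and names are illustrative stand-ins, not juju-core's.)

```go
// The initial watch event reflects current state, not the stale receiver.
package main

import "fmt"

type store map[string]int // name -> revno (stand-in for mongo documents)

type Unit struct {
	name  string
	revno int
}

// Watch sends an initial event carrying the store's *current* revno,
// no matter how stale the receiving Unit value is.
func (u *Unit) Watch(st store) <-chan Unit {
	out := make(chan Unit, 1)
	out <- Unit{name: u.name, revno: st[u.name]} // fresh copy, not u itself
	return out
}

func main() {
	st := store{"wordpress/0": 1}
	stale := &Unit{name: "wordpress/0", revno: 1}
	st["wordpress/0"] = 7 // the document moves on; stale.revno is now old
	first := <-stale.Watch(st)
	fmt.Println(first.revno) // → 7: the initial event is current state
}
```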
 * fwereade cheers
<Aram> moin.
<davecheney> hello and goodbye
<davecheney> time to make the dinner
<davecheney> then i'll see ya'll for the hangout
<fwereade> ok  	launchpad.net/juju-core/worker/uniter	117.284s
<fwereade> ok  	launchpad.net/juju-core/worker/uniter/charm	7.835s
<fwereade> ok  	launchpad.net/juju-core/worker/uniter/hook	0.007s
<fwereade> ok  	launchpad.net/juju-core/worker/uniter/relation	8.826s
<davecheney> fwereade: did you see my comment to the list about i/o usage
<davecheney> that might explain your excessive test times
<fwereade> davecheney, I did, makes a lot of sense
<fwereade> davecheney, thanks
<davecheney> fwereade: maybe aram can help
<davecheney> i had a look today, but I wasn't much chop for anything
<davecheney> invoking mgo the way the tests do, doesn't make it create that big _tmp
<davecheney> file
<davecheney> but then again, I never connected a client to the manually invoked one
<Aram> 20MB
<Aram> make it smaller
<Aram> 1MB
<davecheney> on my machine _tmp was 256mb
<Aram> hmm?
<Aram> perhaps testing/mgo had leaked?
<Aram> or was this only from one test?
<Aram> aaah
<Aram> I know
<Aram> niemeyer made the capped collection 20MB
<Aram> 200
<davecheney> aram what is _tmp ?
<Aram> and it needs contiguous space for that
<davecheney> (see email)
<fwereade> Aram, aaaahhh, and it's only made little in *some* tests
<Aram> ok, let me read the email
<fwereade> Aram, right?
<davecheney> fwereade: only made little in cloudinit -> deploy to ec2
<davecheney> oh, that explains why my hand-started version didn't use any disk space
<davecheney> it won't do that until someone connects to it and does ensure index ...
<fwereade> davecheney, StartMgoServer specifies everything that addMongoToBoot does, doesn't it?
<Aram> fwereade: make the capped collection 1MB and see if tests take less.
<Aram> in open.go
<Aram> wtf is wrong with launchpad today, so slow.
<Aram> actually, my internet is slow.
 * Aram reboots router
<TheMue> lunchtime, see you in the hangout
<TheMue> hmm, my build complains about an undefined bson.SetZero. i already updated mgo. or is it a different problem?
<Aram> bzr pull
<Aram> go get -u is not enough
<TheMue> Aram: thx, will try
<TheMue> Aram: it pulled nothing, but i'm still using 1.0.2. may this be the problem?
<Aram> no
<Aram> you pulled mgo, right?
<TheMue> yes
<Aram> what does bzr log --line -r -1 say?
<davecheney> so, how about that meeting ...
<TheMue> funny, 172 with the added SetZero
<davecheney> TheMue: maybe a fast path is to
<davecheney> rm -rf .../v2/mgo
<davecheney> go get ...
<davecheney> then cd v2/mgo
<davecheney> bzr pull
<davecheney> that is what I did
<davecheney> i'm pretty sure it will work
<TheMue> will try that too
<davecheney> please double check you don't have another mgo snuck away somewhere else
<Aram> davecheney: but it's always funny to find new bzr misfeatures and broken behaviors to complain about.
 * davecheney smiles at Aram 
<davecheney> it's like bzr has set the revision to a revision in a branch
<fwereade> hm, should we be meeting now?
<davecheney> indeed
<davecheney> we should
<Aram> we should have a meeting about the lack of the meeting.
<davecheney> Aram: please propose an agenda first
<Aram> a user story.
<Aram> which we have to send to a program manager.
<TheMue> shall i send out the invite?
 * davecheney is texting mrarmm
<fwereade> TheMue, please do
<TheMue> done
<mramm> sorry I was late to the meeting
<davecheney> s'ok
<davecheney> lets get going
<mramm> I'm in the meeting, called by frank
<davecheney> err, nobody else is
<Aram> we are
<Aram> :)
<Aram> davecheney: https://plus.google.com/hangouts/_/74ad471b07ee1c4e3215133bbf54c62e79ff5c7b?authuser=1&hl=en
<davecheney> ta
<Aram> rogpeppe: ^ ^
<rogpeppe> oops, didn't see it
<rogpeppe> will be there in a moment
<mramm> 16:06 niemeyer: Woah
<mramm> 16:06 niemeyer: services:
<mramm> 16:06 niemeyer:  builddb:
<mramm> 16:06 niemeyer:    charm: builddb
<mramm> 16:06 niemeyer:    exposed: false
<mramm> 16:06 niemeyer:    units:
<mramm> 16:06 niemeyer:      builddb/0:
<mramm> 16:06 niemeyer:        agent-version: 0.0.0
<mramm> 16:06 niemeyer:        machine: 1
<mramm> 16:06 niemeyer:        public-address: ec2-204-236-223-223.compute-1.amazonaws.com
<mramm> 16:06 niemeyer:        status: started
<fwereade> davecheney, sorry, did you paste it somewhere I missed?
<davecheney> sorry, email
<davecheney> irc is on another machine, sorry
<fwereade> davecheney, np, ty
<fwereade> lunch bbl
<niemeyer> Morning juju masters!
<davecheney> niemeyer: hello
<niemeyer> davecheney: Heya!
<davecheney> sleep, it would seem, eludes me
<TheMue> niemeyer: hi
<niemeyer> davecheney: Wow, yeah, you've been here for a while :)
<niemeyer> TheMue: Yo
<fwereade> niemeyer, heyhey
<niemeyer> fwereade: Heya
<niemeyer> fwereade: Just reviewing the relation units stuff
<niemeyer> fwereade: Superb so far
<fwereade> niemeyer, for some reason I made the make-uniter-work branch depend on that one
<fwereade> niemeyer, jolly good, hopefully I won't need to reparent the followup ;)
<fwereade> niemeyer, I submitted one trivial this morning btw that was necessary
<niemeyer> fwereade: No, certainly not.. I think there were a few opportunities to have split the branches, as you noted yourself in the description, but it's alright.. happy to have it in whatever form.
<fwereade> niemeyer, cheers
<niemeyer> fwereade: Super, thanks re. trivial
<fwereade> niemeyer, now I come to think of it, the timeout bumps in the followup are surely only necessary because I wasn't using a tmpfs
<fwereade> niemeyer, I will adjust them down as seems fit
<niemeyer> fwereade: Hmm
<niemeyer> fwereade: The timeouts on the sad cases may be high
<fwereade> niemeyer, I was thinking "conservatively", at first, anyway -- just to what they were originally
<niemeyer> fwereade: Those prevent spurious breaks when the machine happens to get some load or when tests are running on a slow machine
<niemeyer> fwereade: I haven't seen the branch
<niemeyer> fwereade: What were they?
<fwereade> niemeyer, 5s to 10s is the notable one, in a couple of places
<fwereade> niemeyer, and I really don't think it deserves that long
<niemeyer> fwereade: wow
<niemeyer> fwereade: Yeah, that's a lot
<niemeyer> fwereade: How come we're getting such long timeouts? Are we missing a resync?
<fwereade> niemeyer, that was pessimistic, in practice it only missed the 5s mark by a few milliseconds even under load
<fwereade> niemeyer, but I was in no mood to mess around ;)
<fwereade> niemeyer, almost certainly it is because the mongo tests stress the hell out of my machine unless I have a tmpfs
<niemeyer> fwereade: Hmm
<niemeyer> fwereade: I wonder why
<niemeyer> Well, I do have an SSD
<fwereade> niemeyer, I think Aram and davecheney have a better developed understanding of it -- the stuff I've been working on has not generally been affected enough to get really worked up about when you're running a few tests at a time
<fwereade> niemeyer, that said, it feels *much* nicer now than it ever did before, not sure if this is psychosomatic
<fwereade> niemeyer, is anyone working on jujud btw?
<niemeyer> fwereade: not that I know of
<niemeyer> fwereade: I was about to ask what else has broken tests atm
<fwereade> niemeyer, that seems to be the one remaining eyesore in the test output
<niemeyer> fwereade: Anything obvious there?
<fwereade> niemeyer, barely looked tbh
<fwereade> niemeyer, focus has been elsewhere
<fwereade> niemeyer, schema error in bootstrap_test is the first
<fwereade> niemeyer, the remaining failures can probably be mostly attributed to collateral damage
<niemeyer> fwereade: That rings a bell somehow
<niemeyer> fwereade: Review sent
<fwereade> niemeyer, cheers
<niemeyer> Aram: ping
<Aram> pong
<niemeyer> Aram: Heya
<niemeyer> Aram: How're things going there?
<fwereade> niemeyer, the resyncing is to pick up Alive changes, without which Unit.Status will report "down" regardless of content
<Aram> niemeyer: yesterday I've taken a swap day and today I'm working on the missing watchers and lifecycle in watchers. but a little bit later, ATM I'm running some rrands.
<Aram> errand
<Aram> s
<niemeyer> fwereade: Sorry, I'm still slow today.. can you cover it a bit more piecemeal?
<niemeyer> Aram: Okay, let's please talk once you're back
<fwereade> niemeyer, ah, sorry, ok: I'm actually on crack, that's a different piece of code
<fwereade> niemeyer, no reason for the resyncing you correctly complain about
<niemeyer> fwereade: Is it just a matter of refreshing the unit itself?
<fwereade> niemeyer, yeah, that is the sane and easy way
<niemeyer> fwereade: Super.. I'm really just trying to understand these new patterns and see where/whether they fall short
<fwereade> niemeyer, that was almost certainly an unremoved attempt at working around the weirdness in the entity watchers
<rogpeppe> niemeyer, fwereade: just checking, do you think the provisioner is currently working in trunk?
<fwereade> rogpeppe, STM to be, just deployed a unit
<fwereade> rogpeppe, not from trunk, but I haven't touched the provisioner
<rogpeppe> fwereade: hmm, my live tests aren't working
<niemeyer> rogpeppe: It was yesterday
<rogpeppe> niemeyer: the live test?
<niemeyer> rogpeppe: The provisioner
<rogpeppe> niemeyer: the provisioner test suite runs fine for me locally - just not live
<niemeyer> rogpeppe: I've used the provisioner live yesterday
<niemeyer> rogpeppe: Multiple times
<rogpeppe> niemeyer: i've probably broken the live tests somehow then
<niemeyer> rogpeppe: Or they are just not working after the state migration
<rogpeppe> niemeyer: entirely possible. but it's not relying on watchers, so i *thought* it should work ok
<rogpeppe> niemeyer: did juju status report you the correct agent version for the bootstrap machine?
<niemeyer> rogpeppe: Yep
<niemeyer> rogpeppe: I have a paste somewhere, hold on
<niemeyer> rogpeppe: Sorry, can't find it
<niemeyer> rogpeppe: I can easily fire a new env, though
<rogpeppe> niemeyer: i think i saw it actually
<rogpeppe> niemeyer: i'm trying again, but not in the test this time
<niemeyer> fwereade: I think we'd benefit from adding back service.CharmURL, btw
<niemeyer> fwereade: We can probably just revert it from somewhere in history
<fwereade> niemeyer, I can only think of one place it'd be a real boon... quite often we want the charm itself as well
<niemeyer> fwereade: In hindsight, it was the SetCharmURL bit that was ill-conceived.. CharmURL is fine
<fwereade> niemeyer, but yeah no actual objections
<niemeyer> fwereade: Yeah, but often we grab the charm and throw it away.. we might grab the charm only if
<fwereade> niemeyer, fair enough
<niemeyer> fwereade: Not an issue by any means.. just sharing brain state :)
<fwereade> niemeyer, ok, re MachineUnitsWatcher: what is the mechanism by which it guarantees it does not send events with the same unit both Added and Removed?
<niemeyer> fwereade: I think there isn't any right now..
<niemeyer> fwereade: I've pondered about the same thing in the context of the MachinesWatcher
<fwereade> niemeyer, I guess that was true both before and after the loop change
<niemeyer> fwereade: Right
<niemeyer> fwereade: I've actually made a comment regarding this in the CL
<niemeyer> fwereade: Dave suggested it wasn't an issue, but I'm not yet sure
<niemeyer> fwereade: I'm tempted to drop them too
<fwereade> niemeyer, it feels to me like it is
<niemeyer> fwereade: If nothing else, we must handle the situation where it doesn't show up correctly
<niemeyer> fwereade: Because it is just a perspective of timing
<niemeyer> fwereade: If we made the same query moments afterwards, we'd see nothing
<fwereade> niemeyer, yeah, indeed
<fwereade> niemeyer, I guess it's best to use that style though, and fix the watchers so that mergeChanges always does the Right Thing
<niemeyer> fwereade: +1
<fwereade> niemeyer, the advantage of the double-loop style in RUW is that I think it *did* allow me to make useful guarantees of that nature
<fwereade> niemeyer, anyway, shouldn't be too tricky :)
<niemeyer> fwereade: I've been doing a few changes, btw: renaming mergeChanges to merge; getInitialEvent to initial, and pass the change itself onto initial; plus the w.out change I've mentioned
<niemeyer> fwereade: Hmm
<fwereade> niemeyer, those all sgtm
<niemeyer> fwereade: I don't think so
<niemeyer> fwereade: I mean, I don't think there were any guarantees before either
<niemeyer> fwereade: Because events were aggregated regardless
<fwereade> niemeyer, I *think* that because the loops in which I send and the loops in which I get scope changes are different, it all Just Works
<fwereade> niemeyer, I don't pick up a new scope event until I've sent the change derived from the first one
<niemeyer> fwereade: I don't see why it makes a difference.. the code that may be executed or not is exactly the same, with the new convention or with the old one
<fwereade> niemeyer, that change may have had any number of settings changes merged in but those are safe
<niemeyer> fwereade: The second loop is exactly a copy of the first one with the send disabled
<fwereade> niemeyer, not in RUW
<niemeyer> fwereade: Which is exactly what we do in the new convention, but with less code
<niemeyer> fwereade: Oh?
<niemeyer> fwereade: /me looks again
<fwereade> niemeyer, although it is just one more var to track really
<niemeyer> fwereade: I see.. I think it's actually preferable to teach merge about how to handle the situation
<niemeyer> fwereade: Otherwise it's unnecessarily buffering changes arbitrarily
<fwereade> niemeyer, yeah, sounds sane
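(A sketch of teaching merge to handle the situation, as just agreed: if a unit is removed before its pending Added event was delivered, the two cancel out, so no event ever lists the same unit as both added and removed. Types here are illustrative, not juju-core's.)

```go
// merge folds a removal into a pending, undelivered change.
package main

import "fmt"

type Change struct {
	Added, Removed []string
}

// remove deletes name from s, reporting whether it was present.
func remove(s []string, name string) ([]string, bool) {
	for i, v := range s {
		if v == name {
			return append(s[:i], s[i+1:]...), true
		}
	}
	return s, false
}

func merge(pending *Change, removed string) {
	var wasPending bool
	pending.Added, wasPending = remove(pending.Added, removed)
	if wasPending {
		return // added and removed before delivery: drop both
	}
	pending.Removed = append(pending.Removed, removed)
}

func main() {
	pending := &Change{Added: []string{"mysql/0", "mysql/1"}}
	merge(pending, "mysql/1") // seen added, then removed: cancels out
	merge(pending, "mysql/2") // genuinely removed
	fmt.Println(pending.Added, pending.Removed) // → [mysql/0] [mysql/2]
}
```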
<rogpeppe> niemeyer: ah, i think i *may* have discovered the reason why live tests are failing
<niemeyer> rogpeppe: Sweet!
<rogpeppe> niemeyer: just a hunch as yet
<rogpeppe> niemeyer: but...
<rogpeppe> niemeyer: what's the name of the public bucket containing the mongo binaries?
<niemeyer> rogpeppe: juju-dist.. why does it matter?
<rogpeppe> niemeyer: is it hard-coded?
<niemeyer> rogpeppe: yes
<rogpeppe> drat
<niemeyer> rogpeppe: As of today
<niemeyer> rogpeppe: It shouldn't be, but this avoids yet another refactoring without all tests green
<rogpeppe> niemeyer: that's fine. hrmph.
<niemeyer> fwereade: Follow up reviewed too
<TheMue> niemeyer: testing of removals in the firewaller did show an effect. the fw dislikes the service removal when it is stopped. stopping the service watcher returns an error then.
<fwereade> niemeyer, ta
<niemeyer> TheMue: More details please?
<niemeyer> TheMue: If the firewaller is stopped, how can it dislike anything?
<fwereade> niemeyer, ok, so *that* was the bit with a weird StartSync that I think is justified
<niemeyer> fwereade: Yeah, I imagined it when reading
<fwereade> niemeyer, Unit.Status does an alive check on the UA
<niemeyer> fwereade: I'm just curious about how it unrolls into it
<niemeyer> fwereade: Aha, ok.. is that all?
<fwereade> niemeyer, Status returns "down" when the pinger is not there
<fwereade> niemeyer, yeah, that's it
<niemeyer> fwereade: So a suggestion: let's move the sync down to right before Status, and call Sync rather than StartSync
<fwereade> niemeyer, ah, yeah, sounds sensible
<fwereade> niemeyer, cheers
<niemeyer> fwereade: Thanks!
<niemeyer> TheMue?
<TheMue> niemeyer: sorry, my wife called me ;)
<TheMue> niemeyer: fw.Stop() is expected to return no error
<TheMue> niemeyer: but if a service is removed it returns a service not found
<TheMue> niemeyer: because during the stop all internal go routines are stopped, and with them the service watchers
<TheMue> niemeyer: and this stopping does state.Service(name) for the non-existing service
<niemeyer> TheMue: This is a bug that should be fixed
<TheMue> niemeyer: indeed
<TheMue> niemeyer: so you had the right feeling about testing it
<niemeyer> TheMue: Would you mind reviewing all fw call sites that grab objects, and considering the appropriate action when the respective entity isn't found?
<niemeyer> TheMue: Happy to have that done in tiny branches that follow each other so that we have a good way to coordinate as we go
<TheMue> niemeyer: could you please rephrase "fw call sites"? thx.
<niemeyer> TheMue: call site == a point in the code that calls something
<TheMue> niemeyer: ah, yes. ok, will do.
<niemeyer> TheMue: Once you have that first issue fixed, please push for review so we can talk a bit about it with more concrete logic
<TheMue> niemeyer: ok
<rogpeppe> niemeyer: hmm, live tests fail even though bootstrapping and deploying live work ok. looks like the provisioner never sees the new machine.
<niemeyer> rogpeppe: How can the provisioner never see the new machine if it is the one firing off new machines
<rogpeppe> niemeyer: it's not the one that creates the new Machine though
<niemeyer> rogpeppe: Exactly.. and if it didn't see that machine, it wouldn't fire an instance for it
<niemeyer> rogpeppe: Also, tests pass
<rogpeppe> niemeyer: the live deploy tests are currently disabled
<niemeyer> rogpeppe: Provisioner tests pass
<rogpeppe> niemeyer: sure. i'm just saying what i see.
<niemeyer> rogpeppe: Of course, and I'm just explaining that "provisioner never sees the new machine" can't be true on its own
<niemeyer> rogpeppe: The builddb charm works
<niemeyer> rogpeppe: Firing new machine
<niemeyer> rogpeppe: With machiner, uniter, etc
<rogpeppe> niemeyer: absolutely. i see that working too.
<niemeyer> rogpeppe: That doesn't happen with a provisioner that never sees a new machine
<rogpeppe> niemeyer: but my live test that used to work no longer works. so *something* is up.
<rogpeppe> niemeyer: it might be the fact that the second machine is added before the provisioner comes up
<niemeyer> rogpeppe: I'm sure..
<niemeyer> rogpeppe: Again, tests pass..
<niemeyer> rogpeppe: You seem to argue that the provisioner is completely broken.. I doubt it's as simple as that
<rogpeppe> niemeyer: i'm not arguing that at all
<rogpeppe> niemeyer: i'm saying that live tests fail that used to pass, that's all
<rogpeppe> niemeyer: and i don't see the provisioner see the new machine, which seems to be a symptom
<niemeyer> rogpeppe: Sorry, okay.. I don't know how to help then
<rogpeppe> niemeyer: i'm not asking for help, just giving a status update
<niemeyer> rogpeppe: That live tests are broken, ok :)
<rogpeppe> niemeyer: yeah, it's weird
<niemeyer> fwereade, rogpeppe: Regarding https://codereview.appspot.com/6571047, I really want to stop sending entities down the pipe at all
<niemeyer> Everything is a lot simpler and more correct if we just sent the entity identification
<rogpeppe> niemeyer: just send a change notification?
<niemeyer> rogpeppe: Yeah, with the id/name
<rogpeppe> niemeyer: i tend to agree.
<niemeyer> The machines watcher was significantly simpler and more obvious to handle correctly on the other side with it
<niemeyer> That said, I've been resisting suggesting this for now, just so we can get it all working as it used to
<niemeyer> After we're comfortable again, I think we should change
<fwereade> niemeyer, no strong feelings myself
<niemeyer> This also avoids the spurious reads we have nowadays within the watcher itself
<niemeyer> when the entity changes repeatedly without being consumed
<niemeyer> and it also simplifies the Dead-handling logic..
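(A sketch of the convention niemeyer is proposing: the watcher sends only entity ids, and the client fetches a fresh document on receipt. Names are illustrative. This sidesteps both stale snapshots and the watcher re-reading entities nobody consumed.)

```go
// Watchers deliver ids; clients read the current entity on demand.
package main

import "fmt"

type Machine struct {
	Id         string
	InstanceId string
}

type State map[string]*Machine

func (st State) Machine(id string) *Machine { return st[id] }

func main() {
	st := State{"0": {Id: "0", InstanceId: "i-123"}}
	changes := make(chan string, 1)
	changes <- "0"               // the watcher says *which* machine changed, nothing more
	st["0"].InstanceId = "i-456" // the entity changes again before consumption

	id := <-changes
	m := st.Machine(id)       // the client reads the current document
	fmt.Println(m.InstanceId) // → i-456: latest state, no stale snapshot
}
```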
<niemeyer> Anyway, lunch time here
<niemeyer> rogpeppe: If you're still struggling by the time I'm back, let's try to pair on it
<rogpeppe> niemeyer: i think i might be getting somewhere now. just expressing the problem was helpful.
<rogpeppe> niemeyer: aaargh!
<rogpeppe> niemeyer: found it
<rogpeppe> so frickin simple
<platypusfriend> Greets
<platypusfriend> I'd like to suggest that, on the https://juju.ubuntu.com/ page, under the "Try it for yourself" heading, the instructions to install Juju from the PPA don't work out of the box with Ubuntu 12.04.1
<platypusfriend> gregkrsak@brewmaster:~$ add-apt-repository ppa:juju/pkgs
<platypusfriend> The program 'add-apt-repository' is currently not installed.  You can install it by typing: sudo apt-get install python-software-properties
<platypusfriend> (thanks!)
<niemeyer> platypusfriend: kthxbye
<niemeyer> rogpeppe: What was it?
<rogpeppe> niemeyer: well, one problem was that i was assuming that the machine received from the machine watcher had no information content, so was reading agent tools from the original.
<rogpeppe> niemeyer: but... that doesn't appear to have fixed the problem. we will see.
<rogpeppe> niemeyer: am currently running a live test again
<rogpeppe> niemeyer: hmm, looks like the machine watcher might not be sending an initial event.
<niemeyer> rogpeppe: Indeed, I don't think it is
<rogpeppe> niemeyer: i thought we tested for that
<niemeyer> rogpeppe: I can fix that.. what else is using it?
<niemeyer> Hmm
 * rogpeppe checks
<niemeyer> Apparently only used in tests
<niemeyer> In two places only, one of them being the live tests
<rogpeppe> niemeyer: that sounds about right
<niemeyer> rogpeppe: I'm on it.. will take the chance to simplify the watcher according to the latest conventions
<rogpeppe> niemeyer: sounds good
<niemeyer> rogpeppe: Hah, curious
<niemeyer> rogpeppe: If we send just the id, the initial event becomes a bit silly
<rogpeppe> niemeyer: i'm not sure
<rogpeppe> niemeyer: it's actually still useful, even though it carries no info content
<rogpeppe> niemeyer: because it for a very simple pattern inside various watcher clients
<rogpeppe> s/it/it makes/
<niemeyer> rogpeppe: Cool, I'm not suggesting we change the convention right now either way, since we're in the middle of a huge transition
<rogpeppe> niemeyer: +1
<niemeyer> rogpeppe: Let's just observe the changed sites and see how they look once we do this
<rogpeppe> niemeyer: sounds good
<rogpeppe> niemeyer: my observation was that in quite a few places, the initial event is not acted upon any differently from a changed event, so it makes sense to handle them both in the same branch of the select statement.
<niemeyer> rogpeppe: If that's generally the case for the entity watchers, +1
<rogpeppe> niemeyer: so it's nice to guarantee that we get at least one event
<niemeyer> rogpeppe: It perhaps also conveys useful information, although I'm not entirely sure:
<niemeyer> rogpeppe: When we get the watch fired, we know we're subscribed
<rogpeppe> niemeyer: interesting. i wonder if that's ever actually useful to know.
<niemeyer> rogpeppe: Ah, no, sorry, it doesn't matter
<niemeyer> rogpeppe: We provide the machine revno when watching, so there are no holes
<rogpeppe> niemeyer: ah, so in that case the initial event *does* carry useful info
<niemeyer> rogpeppe: ?
<rogpeppe> niemeyer: oh, sorry, other way around
<niemeyer> Yeah
<rogpeppe> niemeyer: i thought you were suggesting the events would contain the revno as well as the id
<rogpeppe> niemeyer: which may be a good idea, i suppose
<niemeyer> rogpeppe: Nope.. not suggesting that yet, at least
<rogpeppe> niemeyer: it kinda makes sense - you give a revno to watch from, then you're handed revnos as it changes.
<niemeyer> rogpeppe: I'm slightly concerned that we might see code ignoring the initial watch thinking "Oh, but when the initial event arrives, the machine/unit/service hasn't yet changed"
<niemeyer> rogpeppe: Which is not necessarily true.. let's watch out for that
<rogpeppe> niemeyer: yeah
<niemeyer> rogpeppe: The concept of revno is internal pretty much everywhere
<niemeyer> rogpeppe: I'm not keen on exposing it, if possible
<rogpeppe> niemeyer: fair enough.
<rogpeppe> niemeyer: though doesn't the uniter make use of revnos?
<niemeyer> rogpeppe: That's where the "pretty much" comes from. The only exception is the settings version, because we have to persist them to disk.
<rogpeppe> niemeyer: here's an idea for how an updated watcher API might work:
<rogpeppe> http://paste.ubuntu.com/1226959/
<rogpeppe> niemeyer: i.e. each entity watcher embeds the entity that it's watching
<rogpeppe> niemeyer: oops, with one crucial addition: http://paste.ubuntu.com/1226963/
<niemeyer> rogpeppe: I don't understand what this is solving
<rogpeppe> niemeyer: it's solving, for me at any rate, the fact that i want to watch the same kind of thing on two different kinds of entity
<rogpeppe> niemeyer: perhaps that's not a problem anywhere else though, i guess
<rogpeppe> niemeyer: it seems kinda logical to tie the watcher together with the thing it's watching.
<niemeyer> rogpeppe: What is the MachineUnitsWatcher going to be tied with?
<rogpeppe> niemeyer: is MachineUnitsWatcher an entity watcher?
<rogpeppe> niemeyer: i'm only thinking of this for things where currently we just send an instance of the object down the channel
<niemeyer> rogpeppe: Sorry, seems like a big red-herring
<rogpeppe> niemeyer: ok fair enough
<rogpeppe> niemeyer: i wondered if UnitWatcher and MachineWatcher could actually be implemented by the same type, and that led me to think in this direction
<rogpeppe> niemeyer: looking at TestWatchMachine, it looks as if it *does* test for an initial event
<niemeyer> rogpeppe: Please leave that with me.. I'll send a branch in a moment
<rogpeppe> niemeyer: ok
<rogpeppe> niemeyer: i'm off now. might be able to have a look back in a bit, otherwise see ya tomorrow!
<niemeyer> rogpeppe: have a good time there
<niemeyer> rogpeppe: Do live tests work, btw?
<rogpeppe> niemeyer: no, i'm waiting for your machine watcher CL
<niemeyer> rogpeppe: Well.. this will just add an extra event
<rogpeppe> niemeyer: i *think* that's what's stopping them passing
<rogpeppe> niemeyer: that will be good enough
<rogpeppe> niemeyer: currently it blocks waiting for that event and never gets it
<niemeyer> rogpeppe: It's a spurious event.. should be trivial to just not wait for it
<niemeyer> Either way, too late.. the watcher is ready and you're off.. I'll have a look at it later
<rogpeppe> niemeyer: i've actually got another hour or so now
<niemeyer> rogpeppe: Oh, okay, let me push the watcher then
<niemeyer> rogpeppe: I don't think it's the initial event here, btw
<niemeyer> rogpeppe: Have you changed the test so it calls Refresh?
<rogpeppe> niemeyer: ok. i changed the test so it used the Machine that it receives on the channel
<niemeyer> rogpeppe: That won't work.. the channel contains an id now..
<niemeyer> rogpeppe: Just Refresh the machine
<rogpeppe> niemeyer: ok, i'll change as needed when i merge
<niemeyer> rogpeppe: Ok, I won't touch it
<rogpeppe> niemeyer: sounds good
<niemeyer> rogpeppe: You'll see why the original tests were working, btw
<rogpeppe> niemeyer: cool
<niemeyer> Committing & proposing
<rogpeppe> niemeyer: i'm interested - they looked plausible!
<niemeyer> rogpeppe: Agreed.  Two bugs coalesced for it to work.
<niemeyer> rogpeppe:  https://codereview.appspot.com/6564049
<rogpeppe> *click*
<rogpeppe> niemeyer: so the test was ok, then, and the problem was that the txn-revno was out of date, so it sent an event anyway?
<rogpeppe> niemeyer: i'm not sure i see the other bug you refer to though
<niemeyer> rogpeppe: Right
<niemeyer> rogpeppe: Well, the other bug is the fact it didn't send
<niemeyer> rogpeppe: The two bugs, together, make the test pass
<rogpeppe> niemeyer: the watcher didn't send?
<niemeyer> rogpeppe: The initial event.. :-)
<rogpeppe> niemeyer: ah of course!
<rogpeppe> niemeyer: it all becomes blindingly obvious
<rogpeppe> niemeyer: it would be nice if we could have a test that would have failed with the previous bugs.
<niemeyer> rogpeppe: The current test fails if we fix the revno issue
<niemeyer> rogpeppe: But it's not easy to test the fact the revno was wrong, precisely because it causes an initial event to be sent, which is the behavior we want
<rogpeppe> niemeyer: i wondered about getting a machine from a different source than AddMachine
<niemeyer> rogpeppe: The bug is in AddMachine
<niemeyer> was
<rogpeppe> niemeyer: given that there are several Machine constructors that can be bad in different ways
<rogpeppe> niemeyer: indeed.
<niemeyer> rogpeppe: Nope.. they are all ok, because they grab the db value
<rogpeppe> niemeyer: if the original hadn't used AddMachine, we'd have found the bug
<niemeyer> rogpeppe: AddMachine is special because it uses the memory value
<rogpeppe> niemeyer: yeah, i'm just saying. maybe there should be a test that AddMachine returns exactly the same values as Machine. but that's hard to do externally.
<niemeyer> rogpeppe: We could abuse DeepEquals for that.. it's generally a bad idea, but we can have a single trivial spot doing that for the entities, at least so we consciously know when we break that rule.
<rogpeppe> niemeyer: i wondered about that
<niemeyer> rogpeppe: I'll add a test
<rogpeppe> niemeyer: one thought: is it really worth sending the machine id down the channel. we're never going to actually use it AFAICS. why not chan struct{} ?
<niemeyer> rogpeppe: Happy to take it off while we don't care
<Aram> niemeyer: what is it that you wanted to talk about?
<rogpeppe> niemeyer: it would make me happy. then MachineWatcher and UnitWatcher can implement the same interface.
<niemeyer> Aram: Isn't it quite late for you already?  If you're taking the day off, we can talk tomorrow..
<niemeyer> rogpeppe: Cool, I'll drop it before merging
<Aram> niemeyer: it's late but it's fine, if we talk today I can work on it tomorrow in the morning, when you'll not be here.
<rogpeppe> niemeyer: in fact, both Machine.Watch and Unit.Watch could return the same interface type. then Machine and Unit are directly compatible for watching.
<rogpeppe> niemeyer: the MachineWatcher type could be unexported.
<niemeyer> Aram: I don't actually have much to talk about, I think.. I was going to bring up the auth + SSL stuff, but we don't even have the machine units watcher done yet, so that's probably the first thing to work on
<niemeyer> Aram: I think you've seen as well discussion about the latest conventions
<Aram> niemeyer: aah, yes. so we continue to do full document watchers or just ids?
<niemeyer> Aram: Regarding w.out, nil, dropping the two loops, etc
<Aram> I've seen that
<Aram> very nice way of dropping the extra loop
<niemeyer> Aram: Just ids, and with liveness
<niemeyer> Aram: Similar to how the current MachinesWatcher is working
<Aram> ok
<niemeyer> Aram: Dead+Alive rather than added/removed
<Aram> yeah
<niemeyer> rogpeppe: Yeah, we can do a lot, but let's hold off a bit.. this is already going quite far
<rogpeppe> niemeyer: cool. agreed.
<niemeyer> I'm actually going to split this equivalence stuff in its own branch
<rogpeppe> niemeyer: that sounds good too
<niemeyer> I had to do changes for the charm equivalence which is unrelated to the original goal
<niemeyer> Come on LP
<niemeyer> rogpeppe: It's in
<rogpeppe> niemeyer: i saw. thanks!
<niemeyer> rogpeppe: np, will push the follow up for review now
<niemeyer> https://codereview.appspot.com/6566050
<niemeyer> rogpeppe: ^
 * rogpeppe looks
<rogpeppe> niemeyer: LGTM
<niemeyer> rogpeppe: Thanks!
<rogpeppe> niemeyer: ok, i've updated the live tests for the new interface. now running them live - we'll see what happens.
<niemeyer> rogpeppe: Fingers crossed!
<rogpeppe> niemeyer: i've got a bus to catch in 15 minutes so if this doesn't work, it'll have to be tomorrow
 * rogpeppe wants a connection with faster upload speed
<niemeyer> rogpeppe: +1
<Aram> yeah, in Romania I have 100Mbps upload for $9/month.
<Aram> here in Austria I pay 50€ for only 10Mbps.
<Aram> that's 71 times more expensive upload.
<rogpeppe> Aram: here i have no option for faster upload. it's crappy copper and they've no plans to upgrade.
<Aram> yeah, in Romania I was lucky to have fiber.
<rogpeppe> still, it only took 3:10 to upload the tools that time
<Aram> at some point they forgot to set speed limits for me and I have 1Gbps download/upload for months :-).
<rogpeppe> i should do what dave does and run the live tests on ec2 itself
<rogpeppe> gotta go, or will miss bus
<rogpeppe> tests still waiting mongod
<rogpeppe> niemeyer, Aram: have fun
<Aram> cheers.
<niemeyer> I wonder what "tests still waiting mongod" means.. It's pretty much instantaneous for me nowadays
<hazmat> out of curiosity how big are the tools?
<niemeyer> hazmat: Each binary is about 5M stripped
<hazmat> niemeyer, and what's the size with symbol?
<niemeyer> hazmat: 7MB or so
<hazmat> niemeyer, cool, thanks
<niemeyer> hazmat: np
<niemeyer> Stepping out for a while.. back soon
<fwereade> niemeyer, discussion in https://codereview.appspot.com/6567044/ ; trivial in https://codereview.appspot.com/6570050 ; it would be awesome if you could take a look at the first, in particular, if you're around at any stage
<niemeyer> fwereade: I'm here
<niemeyer> fwereade: Looking
<niemeyer> fwereade: Hmm
<niemeyer> fwereade: I've already replied to the first a while ago
<fwereade> niemeyer, ah, sorry, missed that
<niemeyer> fwereade: np
<niemeyer> fwereade: LGTM on second
<fwereade> niemeyer, hmm, ok, I see; LGTM re out = nil
<niemeyer> fwereade: Super
<fwereade> niemeyer, LGTM stands with the change then?
<niemeyer> fwereade: I'll have some food while another round of live tests run
<niemeyer> fwereade: Yep!
<fwereade> niemeyer, sweet, the tests will look much nicer once the uniter's in
<niemeyer> fwereade: Btw, I'd like to bring the id on machine watcher back
<fwereade> niemeyer, sorry, context
<niemeyer> fwereade: I think it was a mistake to remove it.. the first time I've tried to fix some code, it was using the return :)
<niemeyer> fwereade: Ah, sorry
<niemeyer> fwereade: We've dropped the result this afternoon, so it returns struct{}
<niemeyer> fwereade: It was a candidate change, but it was a mistake.. first code I went to fix in the live tests, it was using the id of the result meaningfully
<niemeyer> fwereade: I'll send a branch so we can bring it back.. I don't want to hack the tests more at this point
<fwereade> niemeyer, cool, I should sleep soon but I'll be around for a few mins at least
<fwereade> niemeyer, I'll just treat those failures as deliberate/expected
<niemeyer> fwereade: Which failures?
<fwereade> niemeyer
<fwereade> cmd/jujud/provisioning_test.go:69: undefined: state
<fwereade> # launchpad.net/juju-core/environs/jujutest
<fwereade> environs/jujutest/livetests.go:262: m.AgentTools undefined (type struct {} has no field or method AgentTools)
<fwereade> niemeyer, I was assuming that jujud was broken deliberately to prevent ugly panics dirtying up the test log
<niemeyer> fwereade: Yeah, this is broken and is expected.. I'm working to fix all of this
<fwereade> niemeyer, cool
<fwereade> niemeyer, I shall not allow them to trouble me
<niemeyer> fwereade: https://codereview.appspot.com/6571053
<niemeyer> fwereade: Hehe :)
<fwereade> niemeyer, LGTM
<niemeyer> fwereade: Cheers
<niemeyer> fwereade: I have quite a few nice trivials on the pipeline, but I'll try to catch up with Dave after dinner
<niemeyer> fwereade: Thanks, and have a nice sleep there
<fwereade> niemeyer, cool, perhaps gently encourage him to fix config-get, because I think that will give us more charms to play with right away
<niemeyer> fwereade: I think he's on it, but I'll talk to him
<fwereade> niemeyer, tyvm
<niemeyer> fwereade: np
<niemeyer> Dinner time.. biab
<fwereade> niemeyer, enjoy
#juju-dev 2012-09-26
<niemeyer> davecheney: Yo!
<niemeyer> davecheney: Are you staying around for a bit?
<davecheney> niemeyer: yy
<niemeyer> davecheney: I guess the answer is no.. :)
<davecheney> yeah, i'm around
<niemeyer> davecheney: Ah, hey :)
<niemeyer> davecheney: I'm about to get live tests working, and have a set of changes in the pipeline.. if you'll be around for a bit, we can quickly interact on those
<davecheney> niemeyer: excellent
<niemeyer> davecheney: It's all pretty agreeable stuff
<davecheney> let me know what you need me too do
<niemeyer> davecheney: Btw, I have a connected RPi turning a led on and off here :-)
<davecheney> nice
<davecheney> niemeyer: http://dave.cheney.net/2012/09/25/installing-go-on-the-raspberry-pi
<niemeyer> Boards with GPIO <3
<davecheney> finally wrote it up last night
<niemeyer> davecheney: Oh, sweet
<niemeyer> davecheney: Very cool
<niemeyer> davecheney: I got Go working yesterday too
<davecheney> odessa(~/devel/src/launchpad.net/juju-core/testing) % go test ../cmd/juju
<davecheney> --- FAIL: TestPackage (0.00 seconds)
<davecheney> mgo.go:65: 	exec: "mongod": executable file not found in $PATH FAIL
<davecheney> FAIL	launchpad.net/juju-core/cmd/juju	0.116s
<davecheney> ^ with your review comments applied
<niemeyer> ?
<niemeyer> davecheney: I don't understand.. I didn't suggest any changes to the path or whatever?
<davecheney> niemeyer: nah, this is just responding to your comments in https://codereview.appspot.com/6560045
<niemeyer> davecheney: Ah, ok, just showing how it looks like.. super, thanks
<niemeyer> davecheney: Phew.. I thought it was all broken :)
<davecheney> nah, just testing that it does work as expected if you don't have mgo installed
<niemeyer> davecheney: Looks good
<davecheney> ty
<niemeyer> Amazon is taking ages to allocate a machine apparently.. and S3 is failing.. not a good night to test it live
<niemeyer> Anyway, I'll break the CLs down
<niemeyer> https://codereview.appspot.com/6572050
<davecheney> niemeyer: Proposal: https://code.launchpad.net/~dave-cheney/goetveld/002-add-darwin-termios-support/+merge/126362
<davecheney> for some reason goetveld doesn't create CL's, only LP reviews
<davecheney> niemeyer: try another zone ?
<davecheney> i often use southeast-1
<niemeyer> https://codereview.appspot.com/6573050
<niemeyer> https://codereview.appspot.com/6579043
<niemeyer> error: Failed to update merge proposal log: EOF
<niemeyer> Maybe it's my connection that is in a bad state
<niemeyer> https://codereview.appspot.com/6567048
<niemeyer> davecheney: Last for the day ^
<davecheney> niemeyer: i haven't seen that one for a while, the EOF
<davecheney> reviewing last CL now
<niemeyer> Shower.. will pass by in a bit..
<niemeyer> davecheney: Thanks for the reviews!
<davecheney> kk
<davecheney> niemeyer: no worries
<niemeyer> davecheney: Okay, I guess I'll merge this stuff
<niemeyer> davecheney: So people can make progress tomorrow on top of it
<niemeyer> davecheney: How's config-get going there?
<davecheney> niemeyer: dunno, i should pull that branch and see if it compiles
<niemeyer> davecheney: I mean the stuff that you were pushing forward since the sprint
<niemeyer> davecheney: Or what have you been up to?
<davecheney> i've been on swap days since the sprint
<niemeyer> Today as well?
<niemeyer> davecheney: ?
<davecheney> niemeyer: getting yelled at from downstairs, back in a while
<TheMue> morning
<davecheney-a6faf> morning
<fwereade> mornings
<davecheney-bb1ad> fwereade: http://paste.ubuntu.com/1228005/
<davecheney-bb1ad> did I just break transactions ?
<fwereade> davecheney-bb1ad, huh, not seen that before
<fwereade> davecheney-bb1ad, er, maybe? :)
<davecheney-bb1ad> fwereade: shitter
<fwereade> davecheney-bb1ad, but I actually have no idea
<fwereade> davecheney-bb1ad, what did you do? :)
<davecheney-bb1ad> just what I typed
<davecheney-bb1ad> that is trunk
<TheMue> fwereade: morning
<fwereade> TheMue, heyhey
<fwereade> davecheney, bah, I'll try myself
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1056642
<davecheney> for posterity
<fwereade> morning rogpeppe
<rogpeppe> fwereade: hiya
<TheMue> rogpeppe: morning
<davecheney> fwereade: running those commands in a loop
<davecheney> fwereade: another failure
<fwereade> davecheney, oh dear :( same one?
<davecheney> different
<davecheney> http://paste.ubuntu.com/1228059/
<fwereade> davecheney, hmm, is that just ec2 taking too long?
<davecheney> possibly, but we should be able to cope with that
<davecheney> lucky(~/src/launchpad.net/juju-core/cmd/juju) % juju destroy-environment
<davecheney> lucky(~/src/launchpad.net/juju-core/cmd/juju) % bash -x stress.bash
<davecheney> + set -e
<davecheney> + true
<davecheney> + juju bootstrap --upload-tools
<davecheney> + juju deploy mongodb
<davecheney> error: no instances found
<davecheney> that happened in 2 minutes
<davecheney> less
<davecheney> our timeout is 10 minutes
<davecheney> lucky(~/src/launchpad.net/juju-core/cmd/juju) % bash -x stress.bash
<davecheney> + set -e
<davecheney> + true
<davecheney> + juju bootstrap --upload-tools
<davecheney> + juju deploy mongodb
<davecheney> + juju status
<davecheney> error: instance i-bbd62fc6 for machine 1 not found
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1056679
<fwereade> bizarre :(
<TheMue> ha!
<TheMue> caught the service bug in firewaller, phew
<TheMue> hmm, still sometimes have a race *sigh*
<davecheney> rogpeppe: http://paste.ubuntu.com/1228172/
<davecheney> ^ evil, or very evil ?
<Aram> moin
<rogpeppe> davecheney: why?
<davecheney> the py tests for juju ssh were using this weird mocking thing
<davecheney> I needed a way to catch the ssh arguments, without running the command
<rogpeppe> davecheney: the way i was doing that before was by changing $PATH
<rogpeppe> davecheney: is that not reasonable? (it doesn't involve changing the actual code at all)
<davecheney> i guess I want to test that the code is generating the correct invocation
<rogpeppe> davecheney: i realise it's more work though
<davecheney> so I need the args
<TheMue> Aram: hi
<rogpeppe> davecheney: you can get the args. just make the ssh into a shell script that saves the args.
<davecheney> rogpeppe: sure, that would work, but then I have to write that stuff to disk, blah blah
<rogpeppe> davecheney: i thought we were doing something like this anyway
<rogpeppe> davecheney: we do, in TestSSHErrors
<davecheney> fair point
<rogpeppe> davecheney: i don't think it's much more hassle to get the script to save its args.
<rogpeppe> davecheney: or just use a different script template
<rogpeppe> davecheney: i'm not that keen on mocking Command.
<davecheney> fair enough
<rogpeppe> davecheney: just change the script to do something like this: for i in "$@"; do echo $i >> {{.Dir}}/args; done
<rogpeppe> fwereade: do you find the "JUJU:DEBUG watcher: loading new events from changelog collection..." messages useful?
<rogpeppe> fwereade: i'm inclined to delete them - they're so noisy.
<fwereade> rogpeppe, sometimes; I would be quite keen on a package-specific debug flag that defaults to false though
<rogpeppe> fwereade: how would you turn it on?
<Aram> rogpeppe: good catch about nil != empty on golang-dev. I was like "wtf is this" but didn't say anything because I assumed I missed some subtlety and I was wrong.
<fwereade> rogpeppe, watcher.SetDebug(true) in the setups of tests that I thought it would help in debugging
<Aram> :)
<Aram> yes, delete the damn verbosity.
<rogpeppe> fwereade: or even just watcher.Debug = true :-)
<fwereade> rogpeppe, indeed :)
<fwereade> Aram, btw, what's the priority on the confignode read-missing change?
<Aram> fwereade: ?
<Aram> I'm missing something :)
<Aram> what's wrong
<fwereade> Aram, sorry, I thought I overheard you discussing it with niemeyer with a vague view to doing it soon
<fwereade> Aram, I mean changing ConfigNode such that an attempt to read an empty one returns an error
<fwereade> Aram, sorry, a *missing* one
<fwereade> Aram, empty is fine
<Aram> hmm.
<Aram> I'll take a look.
<Aram> the behavior used to be like you described.
<fwereade> Aram, honestly I'm easy either way, I think it would be a good change but not an overwhelmingly awesome one
<fwereade> Aram, I just wanted to know the plan because I'm about to fiddle around with  one of the clients
<Aram> the behavior should be like you described, if it is not, I'll do it after I do the machine units watcher.
<Aram> which I'll start after lunch.
<Aram> which is now
<Aram> :L)
<rogpeppe> fwereade: what's the current status of the uniter w.r.t. lifecycle? does it watch for Dying?
<fwereade> rogpeppe, nope, no handling at all
<fwereade> rogpeppe, that would be a good thing to add, actually :)
<TheMue> lunchtime (did i already say i hate testing for races *grmpf*)
<rogpeppe> fwereade: where would you fit the watcher in? Uniter.loop is sooo neat currently...
<rogpeppe> fwereade: hmm, i suppose it could just be a separate thing that kills the watcher, perhaps with a known error.
<fwereade> rogpeppe, I was expecting all the steady state modes to watch for it
<fwereade> rogpeppe, maybe I can have occasional checks at transition time too
 * fwereade shrugs
<rogpeppe> fwereade: is there anything mode-specific that needs to be done at shutdown time?
<fwereade> rogpeppe, yes, all sorts of watcher stops, generally
<rogpeppe> fwereade: those will happen anyway when the uniter tomb is killed, no?
<fwereade> rogpeppe, s/anyway/assuming the existence of a new entity whose responsibility it is to do that on unit death/
<rogpeppe> fwereade: ah, of course, the watchers are independent of the modes
<fwereade> rogpeppe, sorry I think I am miscommunicating
<fwereade> rogpeppe, yes, when we return from a mode, its watchers will be cleaned up; there are no watchers outside modes at the moment
<rogpeppe> fwereade: that's what i thought originally
<rogpeppe> fwereade: so...
<fwereade> rogpeppe, but 2/3 of the steady state modes already watch the unit
<rogpeppe> fwereade: why do we need a new entity to stop watchers on unit death?
<fwereade> rogpeppe, we don't
<fwereade> rogpeppe, we just need to use unit watchers in the modes
<fwereade> rogpeppe, no call at all to mess around with uniter.loop IMO
<rogpeppe> fwereade: i'm wondering if there's any reason to clutter up every single mode with dying logic
<fwereade> rogpeppe, well, yes
<fwereade> rogpeppe, usually Dying is of no concern at all
<rogpeppe> fwereade: when we can have an entirely separate entity that just kills the uniter when life changes to state.Dying
<fwereade> rogpeppe, whoa how do yo propose to do that?
<fwereade> rogpeppe, we don't want to *kill* the uniter
<rogpeppe> fwereade: no?
<fwereade> rogpeppe, we enter an extended and detailed shutdown sequence in which lots of things happen
<fwereade> rogpeppe, when we hit Dead is a different matter
<rogpeppe> fwereade: ah, ok. that's what i meant by "is there anything mode-specific that needs to be done at shutdown time?"
<fwereade> rogpeppe, but if you're in a hook error state it matters not one whit whether you're alive or dying
<fwereade> rogpeppe, you wait for that user resolution, like it or not
<rogpeppe> fwereade: does that mean we can't remove units that are in an error state?
<fwereade> rogpeppe, there will be mode-specific subtleties though
<fwereade> rogpeppe, yes
<fwereade> rogpeppe, unless you -force
<fwereade> rogpeppe, eg, a hook error state should probably just ignore charm upgrades while dying
<rogpeppe> fwereade: ok, this all makes sense. not a trivial change then :-)
<fwereade> rogpeppe, afraid not :)
<fwereade> rogpeppe, I am slightly worried that all the lifecycle logic will rather dirty up the nice clean uniter
<rogpeppe> fwereade: i *think* that it'll just result in another couple of modes
<rogpeppe> fwereade: and some logic in the existing modes to make that transition
<fwereade> rogpeppe, I'm talking more about responses to life checks in the various watchers within the modes
<fwereade> rogpeppe, I worry that they will obscure the rest ;)
<rogpeppe> fwereade: i guess we'll find out...
<fwereade> rogpeppe,  yeah :)
<rogpeppe> fwereade: do you think it's reasonable that Refresh return a *NotFoundError when the entity has been removed?
<fwereade> rogpeppe, hmm, yeah, I think so
<fwereade> rogpeppe, we'll see how it is in practice
<Aram> fwereade: rogpeppe: since now we are only returning Ids, what do you say about the idea of only returning two kinds of changes, one for ints and one for strings, instead of a custom change type for each watcher.
 * fwereade kinda has a sad about this
<fwereade> generally speaking, when I get a change I really *like* being able to do something useful with it
<Aram> so you don't like that we are only returning ids?
<fwereade> Aram, I guess I need more context; which watchers?
<Aram> all of them.
<Aram> :)
<Aram> look at how machines watcher is now.
<Aram> it returns
<Aram> type MachinesChange struct {
<Aram> 	Alive []int
<Aram> 	Dead  []int
<Aram> }
<Aram> all the others will return struct { Alive, Dead []string }
<fwereade> Aram, yeah, I do appreciate the new structure... but, hmm, nothing needs Dying?
<Aram> other watchers sure need Dying.
<Aram> unsure about this.
<Aram> niemeyer wrote it
<fwereade> Aram, I thought we agreed a while back that we'd be supplying a single list of those entities whose life status had changed?
<fwereade> Aram, I'm not 100% sure that's right, but I'm sketchy on just having Alive/Dead fields
<Aram> yeah, I remember we agreed on that long ago, but I also remember we agreed on this more recently.
<Aram> "agreed"
<fwereade> Aram, blast, I missed that conversation
<fwereade> Aram, (sure, I am aware that agreement is a fluid that changes its nature as it flows within a group)
<Aram> if it were my choice, I'd just return map[Life]Id, because we might add more life states in the future
<Aram> map[Life][]Id
<Aram> so they are grouped by life
<Aram> and you can have as many life states as you want
<fwereade> Aram, that sounds good to me
<Aram> but it's not my choice :).
<fwereade> Aram, ...although...
<fwereade> Aram, I kinda feel that reporting by status is somewhat against the spirit of the ids-only change
<fwereade> Aram, really just `Changed []Id` seems best
<Aram> I agree in principle.
<fwereade> Aram, ok back to concrete :)
<fwereade> Aram, I want to watch a service's relations, and know when they are dying
<fwereade> Aram, am I now expected to set up an individual Dying watch, per relation that I detect alive?
<Aram> no
<fwereade> Aram, how can I avoid this?
<Aram> you'll get a ServiceRelationWatcher which will return struct { Alive, Dying []string }
<fwereade> Aram, ok, so each change type is tailored to its clients, and the Alive/Dead thing is not a brook-no-exceptions Agreement
<fwereade> Aram, that SGTM then :)
<Aram> but struct { Alive, Dying []string } is hardly tailored to its client, almost all watchers will use it.
<fwereade> Aram, I still don't like the amount of reloading I'll be doing on doc watches...
<Aram> yeah, I agree in principle. simpler watcher are good, but when they put a burden on the client the fact that they are simple means nothing.
<fwereade> Aram, on the basis that the ids change will be really ugly and inconvenient for the doc watchers' only clients... as it will be I think for ConfigNode watches... would you hold off on doing them until you've changed the other ones?
<Aram> sure.
<fwereade> Aram, I will try to remember to bring it up with niemeyer this pm
<fwereade> Aram, cheers
<fwereade> Aram, and for the relations, I should be thinking in terms of lists of relation IDs only?
<fwereade> Aram, (grouped as Alive/Dying ofc)
<Aram> whatever you want, but the primary key is the string formed from the endpoints and that has the advantage that you can extract information from it without having to load the document.
<fwereade> Aram, hmm; do we have a Relation accessor on state that accepts that key?
<fwereade> Aram, I don't think we do...
<Aram> surprisingly no, though State.Relation is close in spirit
<Aram> since the key is the stringified endpoints
<fwereade> Aram, I think we have some thinking to do
<fwereade> Aram, *lossily* stringified endpoints, I think
<fwereade> Aram, we can't reconstruct the originals without hitting state
<Aram> yeah, we can't ATM
<fwereade> Aram, and that will involve figuring stuff out by getting information from the charms
<fwereade> Aram, which I'm pretty sure we don't want to do because it will break all our tests
<fwereade> Aram, (because most tests just slap a dummy charm into state, which doesn't declare any relations)
<niemeyer> Hello all!
<Aram> hi
<fwereade> niemeyer, heyhey
<TheMue> niemeyer: hiya
<niemeyer> How's the day looking?
<niemeyer> Good stuff?
<niemeyer> > db.mycoll.insert({_id: 1})
<niemeyer> duplicate key insert for unique index of capped collection
<niemeyer> > db.foo.insert({_id: 1})
<niemeyer> E11000 duplicate key error index: test.foo.$_id_  dup key: { : 1.0 }
<niemeyer> It's unfortunate that these are different errors
 * niemeyer reports upstream
 * TheMue still fights with a race condition
<Aram> niemeyer: do we return dying machines in the initial event of the machine units watcher?
<Aram> or do we now.
<Aram> s/now/not/
<Aram> returning dying machines makes it consistent with Machine.Units
<niemeyer> Aram: Dying, yes
<niemeyer> Aram: The only thing that cares about an Alive => Dying transition is the entity agent itself
<niemeyer> Aram: For all other purposes, the thing is still alive
<Aram> niemeyer: so you're fine with UnitsChange being
<Aram> type UnitsChange struct {
<Aram> 	Alive []string
<Aram> 	Dying []string
<Aram> }
<Aram> ?
<niemeyer> Aram: s/Dying/Dead/?
<fwereade> Aram, surely that's an Alive/Dead one
<niemeyer> Yeah
<fwereade> Aram, ServiceRelations is Alive/Dying, because while the watching entity is not itself a relation, it *is* responsible for responding to the relation's lifecycle changes
<fwereade> Aram, sorry that was not a helpful sentence
<Aram> how does one get Alive -> Dying if we don't deliver this event here? one watcher per each unit?
<fwereade> Aram, I think the question here is entirely situational, and dependent on what the client needs to use
<fwereade> Aram, the MPW only needs to know about Alive (to deploy a container) and Dead (to destroy it)
<Aram> ok, that seems sensible.
<fwereade> Aram, the Uniter is responsible for watching the unit for Dying, and then shutting itself down in an orderly fashion before making itself Dead
<niemeyer_> Weird.. that was an abrupt "disconnection by peer"
<fwereade> Aram, the Uniter is also responsible for handling everything about relations, and itself sets Dead, so it should only need Alive/Dying
<fwereade> Aram, from SRW
<niemeyer_> fwereade: That said, hmm
<fwereade> niemeyer_, I personally would prefer just []ChangedLife
<fwereade> niemeyer_, and handle it appropriately at the various call sites
<niemeyer_> fwereade: Wouldn't it be weird.. let's say.. for a machine watcher to report a unit as alive when it's dying..
<fwereade> niemeyer_, which will I suspect contain a number of subtle differences
<niemeyer_> fwereade: just so it starts up, and dies
<Aram> it seems kind of wrong to return Dying units inside the Alive field of UnitsChange though.
<fwereade> niemeyer_, so a unit that is initially seen to be Dying should be ignored, and should not generate a Dead
<niemeyer_> fwereade: Kind of..
<niemeyer_> fwereade: That's even more incorrect
<niemeyer_> fwereade: imagine the same situation, but the unit is actually still running
<fwereade> niemeyer_, ha
<rogpeppe> niemeyer_, fwereade: fairly trivial: https://codereview.appspot.com/6564054
<niemeyer_> rogpeppe: Thanks
<fwereade> niemeyer_, wait, surely the machine agent *knows* what is running anyway?
<niemeyer_> fwereade: I think something along your ChangedLife idea might be plausible
<niemeyer_> fwereade: Forces acknowledgement of the possibilities
<fwereade> niemeyer_, but regardless, yeah, I feel that's the way to go for now
<niemeyer_> fwereade: Or even Alive/Dying/Dead fields
<niemeyer_> fwereade: Which makes the need for handling even more explicit
<fwereade> niemeyer_, I'd prefer to keep it just as "changed!", rather than risk a lie (when the id is loaded and revealed to be in a different state)
<niemeyer_> fwereade: Good point
<fwereade> niemeyer_, on a related note: ids from watchers
<niemeyer_> fwereade: ok
<Aram> so Changed is the consensus?
<fwereade> niemeyer_, I like it in principle, and (I think) in practice for the collection watchers; I feel that it's going to be unhelpful when applied to the document watchers
<niemeyer_> Aram: It may be just a slice of ids, I guess, but let's wait until the end of the conversation
<fwereade> niemeyer_, as the only client so far, every time I get a changed service or unit, I want it refreshed
<niemeyer_> fwereade: There are a few different details that have been going unperceived
<fwereade> niemeyer_, can I handwave single document watchers to be "different enough" as to be sent as objects?
<fwereade> niemeyer_, go on
<niemeyer_> fwereade: 1) We're faking the data for the entity; since the entity might not be there by the time we try to fetch it, we change its field to Dead and send a cached version
<niemeyer_> fwereade: Which is quite wrong.. the unit may have died in an entirely different state
<niemeyer_> fwereade: and such a Dead + arbitrary state may never have happened.. the justification for the death is in the unit that we didn't see (unit being just an example here)
<niemeyer_> fwereade: 2) If we get 100 changes between the last time we've observed the unit, and the next time we're able to handle the change because e.g. the hook returned,
<niemeyer_> fwereade: we're reloading the unit 100 times within the watcher, for absolutely no reason
<fwereade> niemeyer_, re (1), I dunno -- for what clients is the distinction important?
<niemeyer_> fwereade: I don't know.. we'll likely find out when something explodes in our face
<fwereade> niemeyer_, re (2), hmm, true
<niemeyer_> fwereade: 3) The cost of loading the unit is minimal in all cases I've ported
<fwereade> niemeyer_, I mean, I've been thinking about it, and I can't see any -- doesn't Dead mean "the only safe way to interact with this document is to destroy it"?
<fwereade> niemeyer_, yeah, it is only a few lines of code
<niemeyer_> fwereade: and, interesting, in some cases it cleaned up..
<niemeyer_> fwereade: The firewaller, for example, didn't really care much about that machine object
<niemeyer_> fwereade: It ended up not using it at all in most cases
<niemeyer_> fwereade: It was using it just because it was being handed off anyway
<fwereade> niemeyer_, I'm 100% behind it on the collections
<fwereade> niemeyer_, and I think I'm now convinced on the documents side
<niemeyer_> fwereade: 4) It makes the watchers simple (!) :-)
<fwereade> niemeyer_, ;)
<niemeyer_> fwereade, Aram: So, slice of ids?
<fwereade> niemeyer_, +1
<Aram> Changed []string?
<niemeyer_> Aram: []string
 * rogpeppe wishes that machine ids were strings too
<Aram> hmm
<niemeyer_> Aram: <-chan []string
<Aram> rogpeppe++
<Aram> niemeyer_: ok, that seems fine
<niemeyer_> rogpeppe: We should talk about that someday when we're not in the middle of a big change
<rogpeppe> niemeyer_: yeah
<niemeyer_> rogpeppe: Reviewed
<rogpeppe> niemeyer: thanks
<rogpeppe> niemeyer: i chose FitsTypeOf because then if the test fails it'll say what the failing error actually is
<niemeyer> rogpeppe: If the test fails it's trivial to find out what's wrong
<rogpeppe> niemeyer: ok
<niemeyer> rogpeppe: That's how it's being done in the rest of the code already
<niemeyer> rogpeppe: We have a test for IsNotFound that verifies it actually works as intended, and the rest is relying on it working
<rogpeppe> niemeyer: that's fine. i generally try to make test failures as informative as i can, but i'm happy to use IsNotFound too
<niemeyer> rogpeppe: That's a great approach, but there's value in testing the semantics we want.. we'll generally be using IsNotFound in client code, and it should work
<rogpeppe> niemeyer: ok, but i thought that's what the IsNotFound test is for.
<rogpeppe> niemeyer: anyway, i've changed it already
<niemeyer> rogpeppe: Nevermind. Thank you!
<rogpeppe> and one other occurrence too
<TheMue> have to leave due to an emergency, bbl
<niemeyer> TheMue: Ouch.. hope it's all good there
<jamespage> niemeyer, hey! available for a chat about versions of mongodb in quantal?  I understand 2.2.0 is desired....
<fwereade> niemeyer, whoops, I need to leave early today... I might have a CL for you later, but nothing yet I'm afraid
<niemeyer> jamespage: Yo, here
<niemeyer> fwereade: np, have a pleasant time there
<fwereade> niemeyer, Aram: ooo, one important thought: to make the ids change work with RelationUnitsChange et al, we will need a map[unit-name]txn-revno
<niemeyer> fwereade: Hmm
<fwereade> Aram, niemeyer, actually, I don't think we do
<jamespage> niemeyer, I had a request to up the version of mongodb in quantal to 2.2.0 to support go-juju
<niemeyer> fwereade: That's good I guess, because I have no idea about what you have in mind yet :D
<fwereade> niemeyer, we just clear the settings out when we get a change
<jamespage> niemeyer, currently we will be shipping 2.0.6 (maybe 2.0.7 if I get time)
<niemeyer> jamespage: Col
<niemeyer> jamespage: Cool
<niemeyer> jamespage: 2.2.0 would be good indeed
<fwereade> niemeyer, I was thinking we'd need to keep track of revnos at the watcher level because they're used by the relation context level
<niemeyer> jamespage: Very good, in fact
<jamespage> niemeyer, at the moment I'm pushing back - it's very late in the cycle
<fwereade> niemeyer, in persistent relation state storage
<jamespage> niemeyer, I'm assuming that this will be required to support usage on precise as well
<niemeyer> fwereade: Right, I think that's sensible
<niemeyer> jamespage: In principle, no
<jamespage> niemeyer, ??
<niemeyer> jamespage: Well, it depends a bit on what you mean by that
<niemeyer> jamespage: Support usage in which sense?
<niemeyer> fwereade: Let's have a chat once you have some more time
<jamespage> niemeyer, say if someone wants to use go-juju on 12.04
<fwereade> niemeyer, but we don't: we can treat change notifications as clear-your-settings; and if we return a txnRevno from ReadSettings, we can persist *that*
<niemeyer> fwereade: I'd appreciate the insight there
<jamespage> rather than 12.10 ++
<fwereade> niemeyer, although it may require unhelpful extra plumbing
<fwereade> niemeyer, I need to think it through a bit more
<fwereade> anyway gtg, take care & have fun
<niemeyer> fwereade: Sounds sensible I think
<niemeyer> fwereade: Cool, ttyl
<niemeyer> jamespage: Hm
<niemeyer> jamespage: SO,
<niemeyer> jamespage: So,
<niemeyer> jamespage: There are tons of very attractive changes on 2.2.0
<niemeyer> jamespage: Things like improved concurrency support, an aggregation framework that enables queries that are impossible otherwise, speed ups, etc etc
<niemeyer> jamespage: We want to benefit from those
<niemeyer> jamespage: That said,
<niemeyer> jamespage: We can benefit from those either way, in the server side
<niemeyer> jamespage: and we have other reasons to be working around the stock packaging in those cases
<jamespage> niemeyer, sounds like you have issues with the distro packaging then?
<jamespage> other than the version...
<niemeyer> jamespage: No, it's really the version
<niemeyer> jamespage: The problem is that we have to support N releases of Ubuntu with the same code
<niemeyer> jamespage: Back and forth
<jamespage> niemeyer, my suggestion is that we take the same approach as you guys did with zookeeper before
<jamespage> i.e. ship it in PPA alongside go-juju
<niemeyer> jamespage: not only that, but we need to make sure that upgrades don't happen without our explicit approval
<niemeyer> jamespage: Since an upgrade of the core MongoDB behind juju's back might kill the whole deployment
<niemeyer> jamespage: Well, sure.. if we have to work around it, we can do that by several means
<niemeyer> jamespage: The point, though, is that we'd benefit from having the new version on the distro
<niemeyer> jamespage: So that people can more easily deploy juju *locally*
<niemeyer> jamespage: For development, testing, etc
<niemeyer> jamespage: Without having to go around the distro packages
<jamespage> niemeyer, I would agree with that IF go-juju was going to land in the distro for quantal
<niemeyer> jamespage: If it's not doable, I'll understand, and will start thinking of plan-b
<jamespage> niemeyer, I'm happy to push time at upgrading mongodb to 2.2.0 next cycle
<niemeyer> jamespage: If it's not in Quantal, it'll be there a bit afterwards
<jamespage> yes
<niemeyer> jamespage: It'll still come to life during Quantal's lifetime
<niemeyer> jamespage: That said, we don't have local support yet, so perhaps you're right and it doesn't matter for this cycle
<jamespage> niemeyer, I just think the overall risk this late in the cycle is too great for a number of reasons
<jamespage> 1) none of the server team have much experience with mongodb or its packaging (it being in universe)
<niemeyer> jamespage: Understood, sounds good.. I'd probably be causing you pain for no good reason
<jamespage> 2) limited time with charm sprints and ODS in the next two weeks...
<jamespage> niemeyer, OK - I'll keep you notified of any work that happens
<jamespage> and will make sure its back-portable to precise/quantal
<niemeyer> jamespage: Thanks, and sorry for the half-false alarm
<jamespage> niemeyer, no problemo
<niemeyer> jamespage: Can we put this on the agenda for the next release?
<jamespage> niemeyer, absolutely - if mongodb is going to support juju going forward then it needs to be MIR'ed and more owned in ubuntu anyway
<jamespage> owned/loved....
<niemeyer> jamespage: +1
<niemeyer> jamespage: Meanwhile, FYI we've built a version of 2.2.0 and are shipping alongside juju in a bare-bones fashion (bins only)
<niemeyer> jamespage: This means we're able to deploy in all releases of Ubuntu we want to support
<jamespage> coolio
<niemeyer> jamespage: Precise, Quantal, Quantal+1, etc
<jamespage> niemeyer, all archs as well?
<niemeyer> jamespage: Yep, if/when needed
<jamespage> niemeyer, good-oh
<niemeyer> jamespage: tools/mongo-2.2.0-precise-amd64.tgz
<jamespage> I know quite a bit of testing has been happening using juju/maas on arm for example
<niemeyer> jamespage: I don't have any experience with mongo on arm, though
<niemeyer> "juju-core has no active code reviews."
<niemeyer> It's been a while since I've seen that message..
<niemeyer> Btw, we've topped almost 6k lines in during the sprint
<niemeyer> and 11k lines *out*
<niemeyer> (thanks to zk's departure)
<niemeyer> Didn't take long, though: https://codereview.appspot.com/6574049
<Aram> fwereade: "
<Aram> <fwereade> niemeyer, Aram: ooo, one important thought: to make the ids change work with RelationUnitsChange et al, we will need a map[unit-name]txn-revno"
<Aram> do we need this?
<niemeyer> Aram: I don't know.. have to talk to William later about details
<niemeyer> Aram: Does it make a difference for you now?
<Aram> I think it would be easier to adapt later if we need to.
<niemeyer> Aram: Are you working on RelationUnitsChange?
<Aram> no.
<Aram> I was thinking in general.
<niemeyer> Aram: Okay, so that's not needed. RelationUnitsChange is very special in that regard
<Aram> about this.
<Aram> ok.
<Aram> suits me.
<niemeyer> Aram: :-)
<niemeyer> rogpeppe: Have you evolved/tried live tests today?
<rogpeppe> niemeyer: yeah, i've got a branch that is stubbornly refusing to work live. each time it's something a little bit different.
<niemeyer> rogpeppe: :(
<niemeyer> rogpeppe: We need to polish those live tests.. the organization right now is pretty confusing, there are bits in ec2 that should be in jujutest, unrelated tests are being done together, etc
<rogpeppe> niemeyer: i agree
<rogpeppe> niemeyer: which bits are you thinking can move from ec2 to jujutest, BTW?
<rogpeppe> niemeyer: i'm starting to think that my policy of always returning a non-nil Tools is not a good one
<rogpeppe> niemeyer: i think that if the tools haven't been set, AgentTools should return nil and NotFoundError
<rogpeppe> niemeyer: this is an error i get when calling juju status on the current test environment:
<rogpeppe> 2012/09/26 16:59:33 JUJU:DEBUG juju status command failed: cannot get all machines: invalid binary version ""
<rogpeppe> error: cannot get all machines: invalid binary version ""
<niemeyer> rogpeppe: That's what I'd expect too
<niemeyer> rogpeppe: (as a client of that API)
<niemeyer> rogpeppe: Hmm.. are you missing an update from trunk?
<niemeyer> rogpeppe: I'm running an env here.. status works fie
<niemeyer> fine
<rogpeppe> niemeyer: i don't *think* so, unless it's a very recent one
<niemeyer> rogpeppe: Last evening
<niemeyer> rogpeppe: I've fixed the marshalling of tools
<rogpeppe> niemeyer: no, i've merged trunk
<niemeyer> rogpeppe: Ok, so worth investigating..
<niemeyer> rogpeppe: Hmm.. wouldn't that happen if the machine was retrieved before it got an agent set?
<rogpeppe> niemeyer: that's what i'm thinking
<niemeyer> rogpeppe: the machine sets its own agent on startup
<rogpeppe> niemeyer: yup
<niemeyer> rogpeppe: So that must be it
<rogpeppe> niemeyer: that's why i thought moving to using a nil tools might help
<rogpeppe> niemeyer: but maybe it'd call SetBSON on the toolsDoc.Version anyway?
<niemeyer> rogpeppe: What does "using a nil tools" mean?
<rogpeppe> niemeyer: hmm, now i look, i'm not sure :-)
<rogpeppe> niemeyer: it seems to me that the tools *will* be nil in a newly created machine doc.
<niemeyer> rogpeppe: I think Tools is just missing the SetZero return if the value is bson's null
<rogpeppe> niemeyer: ah, you're probably right.
<niemeyer> rogpeppe: I'm always probably right.. I'm often probably wrong too.. :-)
<niemeyer> rogpeppe: I thought you were referring to this logic earlier:
<niemeyer>         if m.doc.Tools == nil {
<niemeyer>                 return &Tools{}, nil
<niemeyer>         }
<niemeyer> This is bogus
<niemeyer> It should return nil, notFound("tools for machine %v", m.doc.Id)
<rogpeppe> niemeyer: yes, that's what i was saying above
<niemeyer> rogpeppe: Cool, so two bugs
<niemeyer> Btw, I found out why it's taking a bit to init mongo.. it's preallocing the journal
<niemeyer> I'll find a way to work around it
<rogpeppe> niemeyer: that would be great
<rogpeppe> niemeyer: how do you find if a bson.Raw is a null?
<niemeyer> rogpeppe: Compare raw.Kind against a constant.. I don't recall which, but I think we're already doing this elsewhere..
<niemeyer> rogpeppe: Otherwise,
<niemeyer> rogpeppe: bsonspec.org
<niemeyer> Lunch!
<rogpeppe> niemeyer: enjoy!
<rogpeppe> niemeyer: ah! i see why the state tests passed now...
<rogpeppe> 	c.Skip("Marshalling of agent tools is currently broken")
<rogpeppe> hmm, no, it passes
<niemeyer> rogpeppe: Try to Marshal, Unmarhsal, re-Marshal, and re-Unmarshal
<niemeyer> rogpeppe: I suspect the third will fail
<niemeyer> rogpeppe: Sorry, the fourth one
<niemeyer> rogpeppe: The third will save the bogus empty tools
<niemeyer> rogpeppe: Fourth will blow up
<rogpeppe> niemeyer: how does that sequence happen in our current code? i'm not sure we ever Marshal the whole machine doc
<rogpeppe> niemeyer: i'm trying to write a test that currently fails
<niemeyer> rogpeppe: The suggested sequence will fail, I suspect
<rogpeppe> niemeyer: Marshal isn't available externally to state
<niemeyer> rogpeppe: It marshals when it writes to the database
<niemeyer> rogpeppe: Ah, it's probably GetBSON that is broken too
<rogpeppe> niemeyer: i don't *think* it marshals when it writes to the db
<niemeyer> Hmm, or maybe not..
<niemeyer> rogpeppe: Huh?
<niemeyer> rogpeppe: There's no way for mgo to send anything to the database without marshaling
<rogpeppe> niemeyer: well, it marshals the agent tools when they're set, but given that they're set, it should all work ok
<niemeyer> rogpeppe: That's where the sequence above comes from.. due to the lack of SetZero on SetBSON, it will marshal empty tools
<rogpeppe> niemeyer: even if the tools aren't empty?
<niemeyer> rogpeppe: It's the same thing we just talked above
<niemeyer> <niemeyer> rogpeppe: I think Tools is just missing the SetZero return if the value is bson's null
<rogpeppe> niemeyer: i can see how we get an invalid tools in the machineDoc, but i'm not sure how we can get that marshalled again.
<rogpeppe> niemeyer: because AFAICS we never marshal from the machine doc's Tools
<rogpeppe> niemeyer: obviously you're right in one sense, but i don't quite see yet exactly how it happens. i'm probably being stupid, sorry.
<niemeyer> rogpeppe: You're not.. I don't know either
<niemeyer> rogpeppe: let's see the code that failed
<niemeyer> rogpeppe: Well, actually, can you push the WIP branch?
<niemeyer> rogpeppe: Otherwise I have no idea about what *actually* failed
<rogpeppe> niemeyer: ok, i'll do that
<rogpeppe> niemeyer: https://codereview.appspot.com/6566058
<niemeyer> Cheers
<niemeyer> s
<Aram> I'm preparing some roast pork, but I'll be back after all the stuff is in the oven
<rogpeppe> niemeyer: i'm running the live tests again to check if i get the same symptom
<rogpeppe> Aram: my mouth waters
<niemeyer> rogpeppe: If you get the error, see if you can look at the logs
<rogpeppe> niemeyer: i looked at the logs before and saw nothing unusual
<niemeyer> rogpeppe: That test environment, are you sure you bootstrapped it with up-to-date tools?
<rogpeppe> niemeyer: i'm doing go test -amazon in environs/ec2
<rogpeppe> niemeyer: that should be sufficient
<niemeyer> rogpeppe: Because that's very similar to the bug I fixed yesterday
<rogpeppe> niemeyer: i'm pretty sure i'm using the up to date tools
<rogpeppe> niemeyer: well, the tools from the source in the CL above
<niemeyer> Ok
<niemeyer> rogpeppe: Ahh, I think the mystery is solved
 * rogpeppe is all ears, having just reproduced the problem
<niemeyer> rogpeppe: txn has to re-marshal the doc on insertion to inject the key
<niemeyer> rogpeppe: Well, actually.. hmm
<niemeyer> rogpeppe: Nevermind.. makes no sense
<rogpeppe> darn
<niemeyer> rogpeppe: The remarshalling will not use the type
<niemeyer> rogpeppe: Do you have a live env with it?
<rogpeppe> niemeyer: yes
<niemeyer> rogpeppe: Can I get access?
<rogpeppe> niemeyer: yeah, one mo
<niemeyer> rogpeppe: There's a handy tool to import ssh keys from lp
<rogpeppe> niemeyer: don't i need to give you an ssh key?
<niemeyer> rogpeppe: ssh-import-id
<niemeyer> rogpeppe: I have one :)
<rogpeppe> niemeyer: yeah, but the environment won't be authorized for your key
<niemeyer> rogpeppe: ssh-import-id
<rogpeppe> niemeyer: or... can i add keys dynamically to a machine in the environment?
<rogpeppe> niemeyer: is that what you're suggesting?
<niemeyer> rogpeppe: Yes, you can log into the machine, and run ssh-import-id
<niemeyer> rogpeppe: ssh-import-id niemeyer, more precisely
<rogpeppe> niemeyer: ok, i'll try that
<rogpeppe> niemeyer: try now: ec2-23-20-216-248.compute-1.amazonaws.com
<niemeyer> I'm in
<niemeyer> rogpeppe: Tools versions in the env look fine
<niemeyer> rogpeppe: Is a "juju status" against it breaking?
<rogpeppe> niemeyer: yeah
<niemeyer> rogpeppe: Hah, nice.. that's going to be easy
<rogpeppe> niemeyer: but all juju status does is call AllMachines
<niemeyer> rogpeppe: How do you call juju status?
<rogpeppe> niemeyer:  juju status --debug -e sample-b765066ea6aab3bb
<niemeyer> rogpeppe: Cool
<rogpeppe> niemeyer: (it helps that i log the environment public attrs in the test now)
<rogpeppe> niemeyer: out of interest, what mongo command did you use to inspect the db?
<niemeyer> rogpeppe: "mongo"
<niemeyer> rogpeppe: "mongo localhost:37017/juju"
<niemeyer> rogpeppe: More precisely
<rogpeppe> niemeyer: i got that far.
<rogpeppe> niemeyer: ah, got it. it's a literal "db".
<niemeyer> rogpeppe: db.machines.find()
<rogpeppe> niemeyer: i thought i had to type juju.machines.find()
<niemeyer> rogpeppe: I can't reproduce the error without some further hacking
<niemeyer> rogpeppe: Can you please enable the mgo logs, and run juju status with them on
<niemeyer> rogpeppe: mgo.SetDebug(true); mgo.SetLogger(log.Target)
<rogpeppe> niemeyer: ok
<niemeyer> rogpeppe: Sent a few comments on https://codereview.appspot.com/6566058/
<rogpeppe> niemeyer: ah! you caught the bug i was actually trying to fix. thanks!
<rogpeppe> niemeyer: i'm so f*!#ing blind
<rogpeppe> niemeyer: i'm glad we found this in the process though
<niemeyer> rogpeppe: +1, and don't blame yourself too much.. it's a single char.. very easy to miss
<rogpeppe> niemeyer: ok, so i'm an idiot as suspected
<rogpeppe> niemeyer: and you were exactly right. i was conflating two problems and assuming they were linked
<rogpeppe> niemeyer: the "juju" i was running to run status was not the current juju. doh!
<rogpeppe> niemeyer: and v sorry to waste your time
<rogpeppe> niemeyer: it all works
<niemeyer> rogpeppe: Woohay!
<rogpeppe> pwd
<niemeyer> /home/rog
<rogpeppe> rm -rf *
<niemeyer> rm: command not found
<rogpeppe> give up in disgust
<niemeyer> rogpeppe: :-)
<niemeyer> rogpeppe: Those debug sessions are nice because we get some intimacy with the whole process
<rogpeppe> niemeyer: that's a nice slant to put on it :-)
<rogpeppe> niemeyer: right, i'm running the live tests again, and i hope and trust they will work this time
<rogpeppe> niemeyer: i've got to go and cook keema muttar curry :-)
<niemeyer> rogpeppe: Have good fun there
<rogpeppe> niemeyer: the "does nothing" channel range loop BTW is so that in a subsequent branch i can just slot in a Unit and watch it in the same way
<rogpeppe> niemeyer: that was the reason for my "i wish machine ids were strings" remark earlier
<niemeyer> rogpeppe: Ok, thanks
<rogpeppe> niemeyer: i'll pop back in 10 minutes or so and see if the test succeeded
<niemeyer> rogpeppe: Super, fingers crossed
<niemeyer> rogpeppe: Btw, I've put your review fixes here: https://codereview.appspot.com/6574049
<TheMue> so, back, will join full in a few moments
<rogpeppe> niemeyer:
<rogpeppe> 08.43.171 PASS
<rogpeppe> 08.43.176 ok  	launchpad.net/juju-core/environs/ec2	521.154s
<niemeyer> !!!
<rogpeppe> :-)
 * niemeyer dances the funky chicken
<niemeyer> rogpeppe: https://codereview.appspot.com/6569060
<niemeyer> rogpeppe: This will speed the next one up
 * niemeyer loves "watcher was stopped cleanly" panics
<niemeyer> So much background activity in tests was caught by it
<niemeyer> ok      launchpad.net/juju-core/cmd/jujud       13.860s
 * fwereade cheers at niemeyer
<niemeyer> fwereade: Hey! Good timing..
<fwereade> niemeyer, heyhey
<niemeyer> fwereade: With this change, the only tests that seem broken here are the uniter ones
<niemeyer> fwereade: But I thought you had fixed those.. is the change pending?
<fwereade> niemeyer, oh? which ones?
<fwereade> niemeyer, would you paste them to me, it should all work...
<niemeyer> fwereade: Cool
<niemeyer> fwereade: http://paste.ubuntu.com/1229166/
<niemeyer> fwereade: I haven't investigated, so could be something trivial
<fwereade> niemeyer, that is odd, it looks kinda familiar, but I forget what was involved
<fwereade> niemeyer, I'll grab a fresh trunk and try to repro
<niemeyer> fwereade: Not sure if it matters, but this was part of a full round run
<niemeyer> 2012/09/26 16:56:23 JUJU Unit status is "started"
<niemeyer> 2012/09/26 16:56:23 JUJU Built files published at http://ec2-177-71-232-16.sa-east-1.compute.amazonaws.com
<niemeyer> 2012/09/26 16:56:23 JUJU Remember to destroy the environment when you're done...
<niemeyer> error: Failed to update merge proposal log: EOF
<niemeyer> Why do you hate me LP
<niemeyer> One more change up: https://codereview.appspot.com/6565052
<niemeyer> Another one: https://codereview.appspot.com/6567054
<niemeyer> I guess I'll just bundle them up in a mail later.. the interactive reviewing isn't working today :)
<fwereade> niemeyer, https://codereview.appspot.com/6565052/ LGTM
<fwereade> niemeyer, and the other too
<niemeyer> fwereade: Danke!
<fwereade> niemeyer, bitte!
<fwereade> niemeyer, btw, what's your current thinking on the necessity of a corrective agent?
<fwereade> niemeyer, I'm feeling like I want one :)
<niemeyer> fwereade: Seems unnecessary
<niemeyer> fwereade: What are you missing?
<fwereade> niemeyer, ok, the situation that makes me want it is as follows
<fwereade> niemeyer, the Uniter has detected a Dying unit and discharged all its responsibilities
<fwereade> niemeyer, it calls EnsureDead() on the unit
<fwereade> niemeyer, at this point, all bets are off
<niemeyer> fwereade: In which sense?
<fwereade> niemeyer, in an ideal world, the uniter would smoothly delete the unit, and also the service (if it's dying and this was the last unit)
<fwereade> niemeyer, in the pessimistic case, it will be half way through doing that when the machine agent nukes the container and the unit agent with it
<fwereade> niemeyer, I think *something* needs to be capable of cleaning up after that
<niemeyer> fwereade: The uniter should never delete a unit
<niemeyer> fwereade: In my view so far, at least
<fwereade> niemeyer, ah, ok then, that's for the machiner to do once the unit is dead?
<niemeyer> fwereade: Its work ends at EnsureDead
<fwereade> niemeyer, that makes sense to me
<niemeyer> fwereade: Yes, once the machiner gets rid of the unit container, it deletes the unit
<fwereade> niemeyer, and once the machiner has cleaned up the container, it completely trashes the unit and possibly the service
<fwereade> niemeyer, cool
<niemeyer> fwereade: As we've talked a few times, Dead is always handled by the "lifecycle manager" so to speak
<fwereade> niemeyer, what happens when the machiner is AWOL?
<niemeyer> fwereade: The machiner<>provisioner relationship is the same
<niemeyer> fwereade: The same as usual
<fwereade> niemeyer, that sounds sensible
<niemeyer> fwereade: When the machiner handles the dead unit, it may also handle the service transparently
<niemeyer> fwereade: We should probably put that logic within RemoveUnit itself
<fwereade> niemeyer, +1, but worrying about races
<fwereade> niemeyer, I expect we can make it work, we just need to be careful
<niemeyer> fwereade: I think the idea we discussed at the sprint, with a refcount, works well
<fwereade> niemeyer, (and, to clarify: when a machiner is down it never removes the unit, and we sit and wait until someone fixes it?)
<niemeyer> fwereade: yep
<fwereade> niemeyer, ok, cool
<niemeyer> fwereade: Things should work in expected ways during shutdown.. juju will not kill resources without user consent
<fwereade> niemeyer, permanently blocking a terminate-machine because of a dead unit (that's not being picked up due to a screwed machiner) seems a touch user-hostile if they don't have the tools to resolve it
<fwereade> niemeyer, a Dying unit is one thing, they can resolve that
<niemeyer> fwereade: We're not blocking because of a dead unit.. we're blocking because of a non-responding machiner
<niemeyer> fwereade: I want to know why the machiner is not responding, rather than swiping that kind of error under the covers
<fwereade> niemeyer, yeah, good justification
 * fwereade is happy he doesn't have to worry directly about cleaning up services too :)
<fwereade> not right now, anyway ;)
<niemeyer> fwereade: yeah :)
<niemeyer> Review queue is owned today: https://code.launchpad.net/juju-core/+activereviews
<niemeyer> fwereade: I'll have to step out for an errand.. back later, but hopefully you'll be resting by then.. have a good time there
<fwereade> niemeyer, I'll almost certainly have a WIP branch I'd appreciate a look at... I'll mail you if I manage it
<fwereade> niemeyer, cheers
<fwereade> niemeyer, https://codereview.appspot.com/6575053 is (I think) reasonably complete, and does still pass all existing tests
<fwereade> niemeyer, still WIP, but I intend to write actual tests tomorrow morning and submit something pretty close, so preliminary comments would be much appreciated
<niemeyer> fwereade: Awesome!
<niemeyer> fwereade: Will have a look
#juju-dev 2012-09-27
<TheMue> morning
<fwereade> TheMue, heyhey
<TheMue> fwereade: heya
<TheMue> ooops, disconnected *hmmm*
<Aram> moin.
<TheMue> Aram: moin
<TheMue> lunchtime
<rogpeppe> fwereade: fairly trivial: https://codereview.appspot.com/6576053
<fwereade> rogpeppe, cheers
<fwereade> rogpeppe, LGTM
<rogpeppe> fwereade: ta
<rogpeppe> fwereade: and another, even more trivial: https://codereview.appspot.com/6566060
<fwereade> rogpeppe, LGTM
<rogpeppe> fwereade: tyvm
<rogpeppe> fwereade: i think i'll submit both, given your remark
<fwereade> rogpeppe, +1
<TheMue> have one too: https://codereview.appspot.com/6567060
<rogpeppe> TheMue: i would look, but i can't seem to pull from lp currently
<rogpeppe> TheMue: rather, i was looking, but i wanted to look more closely but failed to talk to launchpad
<rogpeppe> TheMue: i can't even pull from trunk currently
<TheMue> rogpeppe: oh, which error message?
<rogpeppe> Aram, fwereade: are you having any problems accessing launchpad through bzr?
<TheMue> rogpeppe: i just pulled trunk and it worked
<Aram> same
<Aram> works
<rogpeppe> TheMue: http://paste.ubuntu.com/1230255/
<TheMue> rogpeppe: iiirks, never seen that
<rogpeppe> TheMue: yeah, i don't know what's going on - it worked fine earlier
<rogpeppe> i wonder whether my password has expired or something
<rogpeppe> i suspect it may be something to do with using a long branch name
<rogpeppe> lbox just managed to push fine. weird
<rogpeppe> TheMue: all working now. who knows what was going on...
<TheMue> rogpeppe: maybe service-side
<rogpeppe> lol, i was trying to find the definition for the "delete" function, totally forgetting it was built in!
<Aram> :)
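For reference, `delete` really is a Go built-in for map entries rather than a library function, which is why rogpeppe couldn't find a definition for it:

```go
package main

import "fmt"

// pruneUnit removes a unit entry from the map. delete is a Go
// built-in, and is a no-op when the key is absent.
func pruneUnit(units map[string]bool, name string) {
	delete(units, name)
}

func main() {
	units := map[string]bool{"wordpress/0": true, "mysql/0": true}
	pruneUnit(units, "wordpress/0")
	pruneUnit(units, "not-there") // harmless no-op
	fmt.Println(len(units))
}
```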
<TheMue> Aram: seems I found a bug in the machines watcher
<TheMue> Aram: typically a machine can be removed after EnsureDead()
<rogpeppe> TheMue: you've got a review
<TheMue> Aram: when testing the firewaller and removing a machine I get no error returned but the machine watcher panics with "machine removed before being dead"
<TheMue> rogpeppe: just seen, cheers
<Aram> TheMue: I know, the implementation was correct and niemeyer broke it.
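A minimal sketch of the lifecycle invariant under discussion, with illustrative names rather than juju-core's actual API: an entity moves Alive → Dying → Dead, and removal is only legal once it is Dead, which is why the watcher treats a removal of a not-yet-dead machine as panic-worthy:

```go
package main

import (
	"errors"
	"fmt"
)

// Life mirrors the Alive -> Dying -> Dead lifecycle discussed above.
type Life int

const (
	Alive Life = iota
	Dying
	Dead
)

type Machine struct{ life Life }

// EnsureDead forces the machine directly to Dead.
func (m *Machine) EnsureDead() { m.life = Dead }

// Remove enforces the invariant the watcher relies on: only a Dead
// machine may be removed. The firewaller test tripped the panic
// because state upheld this but the watcher observed a removal while
// it still considered the machine alive.
func (m *Machine) Remove() error {
	if m.life != Dead {
		return errors.New("machine is not dead")
	}
	return nil
}

func main() {
	m := &Machine{}
	fmt.Println(m.Remove()) // error: not dead yet
	m.EnsureDead()
	fmt.Println(m.Remove()) // <nil>
}
```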
<TheMue> rogpeppe: good hints, thx
 * Aram is back in 15 minutes.
<rogpeppe> fwereade: another one (this one makes live tests pass, yay!): https://codereview.appspot.com/6566058/
 * fwereade looks
<fwereade> rogpeppe, LGTM :)
<rogpeppe> fwereade: thanks!
<rogpeppe> oh why can i never remember how to print a full revision id in bzr?!
<rogpeppe> nor is it easy to google for
<Aram> fwereade: meh, how I wish we had actual container entities.
<fwereade> Aram, heh, me too, I remain totally convinced it's the Right Thing to do
<Aram> the machine units watcher is ridiculously complex because we don't; I have to watch 1) the machine, 2) n principal units, 3) all the units, and take great care to integrate and use all this knowledge.
<rogpeppe> fwereade: the uniter tests seem to spend ages and ages doing: want resolved mode '\x00', got '\x02'; still waiting
<rogpeppe> fwereade: and similar
<fwereade> rogpeppe, hell, niemeyer had that yesterday; I was unable to repro :/
<rogpeppe> fwereade: the uniter tests passed for me just now but took 124s to run
<fwereade> rogpeppe, they do indeed take a long time: some of it may be the wanton fsyncing but probably not all
<fwereade> rogpeppe, I'm working on a suggestion of niemeyer's at the moment which changes the uniter quite interestingly, though so I was kinda hoping that would induce magical improvement somewhere ;)
<rogpeppe> fwereade: i'll paste you a copy of the output of the test, with timestamps so you can see where the time is going
<rogpeppe> fwereade: lol
<Aram> fwereade: for the machine units watcher, we only want to deliver events if there's a change in lifecycle, not a change in any other attribute, right?
<rogpeppe> fwereade: interesting, when i ran it that time, it failed with "never reached desired status"
<fwereade> rogpeppe, yeah, indeed, there is clearly something funky wrt Resolved -- that is one of the bits that is changing, actually
<fwereade> Aram, I think so, yes
<fwereade> Aram, tedious, innit
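The filtering just agreed on can be sketched as a small stateful check that only reports first sightings and lifecycle transitions, dropping changes to any other attribute. This is a toy illustration, not the real watcher (which works on mgo change events):

```go
package main

import "fmt"

type Life int

const (
	Alive Life = iota
	Dying
	Dead
)

// lifeFilter remembers the last lifecycle value seen per unit and
// reports whether an event should be delivered.
type lifeFilter struct {
	known map[string]Life
}

func newLifeFilter() *lifeFilter {
	return &lifeFilter{known: make(map[string]Life)}
}

// changed returns true for a newly seen unit or a lifecycle
// transition; a revision bump with the same Life is swallowed.
func (f *lifeFilter) changed(unit string, life Life) bool {
	old, ok := f.known[unit]
	f.known[unit] = life
	return !ok || old != life
}

func main() {
	f := newLifeFilter()
	fmt.Println(f.changed("mysql/0", Alive)) // true: first sighting
	fmt.Println(f.changed("mysql/0", Alive)) // false: attribute-only change
	fmt.Println(f.changed("mysql/0", Dying)) // true: lifecycle moved
}
```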
<rogpeppe> fwereade: http://paste.ubuntu.com/1230376/
<fwereade> rogpeppe, cheers
<rogpeppe> fwereade: that output is actually from a branch other than trunk
<rogpeppe> fwereade: i'll try in trunk and see if the same thing happens - i might have broken something
<Aram> there's a huge impedance mismatch between state/watcher and what we actually want. I don't find any purpose to state/watcher; we could have implemented watchers just the way we wanted without an additional 1kLOC layer, and they would have been simpler.
<rogpeppe> Aram: how would you have implemented watchers?
<Aram> rogpeppe: I actually did like 4 or 5 times.
<Aram> let me find a branch.
<fwereade> rogpeppe, I wouldn't worry too much about it, I will be looking into it
<rogpeppe> Aram: i think the point of state/watcher is that it's efficient enough to cope with many thousands of documents
<rogpeppe> niemeyer: yo!
<niemeyer> Good morning!
<TheMue> morning
<niemeyer> rogpeppe: Just sent some comments on the branch
<niemeyer> rogpeppe: Thanks for fixing the tests
<rogpeppe> niemeyer: cheers
<niemeyer> rogpeppe: The tools waiter looks like a nice idea, but it's sprawling too far IMO
<niemeyer> rogpeppe: A tool waiter should wait for the tools, and nothing else. It's messing up the already messy situation even further.
<rogpeppe> niemeyer: ok. i thought it was quite neat to bundle up the watcher with the thing it's watching, but evidently not.
<niemeyer> fwereade: ping
<fwereade> niemeyer, pong
<niemeyer> rogpeppe: It's a tool waiter
<niemeyer> rogpeppe: It's not nice to bundle things with arbitrary things so we can use the underlying things' interface
<niemeyer> fwereade: yo
<niemeyer> fwereade: Couple of questions
<niemeyer> fwereade: Did you figure what was wrong with the uniter tests? I think I can still reproduce
<fwereade> niemeyer, listening (btw thank you for excellent advice re uniter, it seems to be coming along pretty nicely)
<fwereade> niemeyer, afraid not; rog was just hitting it today
<niemeyer> fwereade: My pleasure, glad it was useful
<niemeyer> fwereade: Okay, I'll dive in right away then
<rogpeppe> niemeyer: the reason it made things nicer is that otherwise we end up always passing around the object along with the watcher channel.
<niemeyer> fwereade: Please keep going in whatever you were doing before.. :-)
<fwereade> niemeyer, my suspicion is that my ClearResolved() is funky, it looked off when I was changing it today
<rogpeppe> niemeyer: the waiter object made that more straightforward by bundling them together
<niemeyer> rogpeppe: It's a tools waiter.. it's not reasonable to have a tool waiter sprawling the live tests.. makes no sense whatsoever to be saying ToolsWaiter.EnsureDead (!!)
<rogpeppe> niemeyer: i admit that was a bit of a short cut :-)
<niemeyer> rogpeppe: Heh
<niemeyer> rogpeppe: These tests should be cleaned up, reduced in size and scope.. that's the opposite of it
<rogpeppe> niemeyer: i'd like to have a chat about that
<niemeyer> rogpeppe: Cool, let's just extinguish the fire first, and then I'm happy to discuss it
<rogpeppe> niemeyer: i'd like to do that, but i don't want the time taken to run the tests to double.
<niemeyer> rogpeppe: They won't double.. we already have BootstrapOnce
<rogpeppe> niemeyer: tests like this one start a unit.
<rogpeppe> niemeyer: that takes almost as long as bootstrapping
<rogpeppe> niemeyer: so it makes sense to test some stuff with the unit once we've started it, i think.
<niemeyer> rogpeppe: BootstrapOnce can create a default unit, with a well known testing charm
<niemeyer> rogpeppe: Which follow the same rules of the first machine (do not mess with it, and if you do, put it back in place)
<rogpeppe> niemeyer: i think that's wrong, for the same reason it was wrong in the other tests we changed recently.
<niemeyer> rogpeppe: Curiously, you won't be able to do that for upgrades, so I don't think it's too much to upgrade it
<rogpeppe> niemeyer: we don't want loads of arbitrary context in the suite
<niemeyer> rogpeppe: Erm, to deploy a new unit
<niemeyer> rogpeppe: Yep, so let's pay the price of time
<niemeyer> rogpeppe: Pick your fight :)
<rogpeppe> niemeyer: live tests already take 10 minutes. i don't want them to take 20.
<niemeyer> rogpeppe: I don't care
<niemeyer> rogpeppe: I won't be sitting and looking at the screen in either case
<niemeyer> rogpeppe: I do care, though, that we have incomprehensible and long-winded tests
<niemeyer> rogpeppe: Which took a week to fix when we changed something fundamental
<niemeyer> rogpeppe: *one* test
<rogpeppe> niemeyer: the difficulty i'm having is that all the tests i'd factor out are subsets of the current Bootstrap and Deploy test. that is, to test upgrading, i think we need to do almost exactly what we already do in that test.
<niemeyer> rogpeppe: That's why we have a BootstrapOnce test
<niemeyer> rogpeppe: s/test/helper/
<niemeyer> rogpeppe: Because it's fundamental to live tests
<niemeyer> rogpeppe: Everything else is not
<niemeyer> rogpeppe: Of course you need bootstrap to test deploy, of course we need bootstrap to test upgrade, of course etc etc
<rogpeppe> niemeyer: how can i test upgrade without doing a deploy?
<niemeyer> rogpeppe: But that's certainly not reasoning to put *all of those tests* in the same place
<niemeyer> rogpeppe: Otherwise all we know is "Huh.. the big blob of code is broken again!"
<niemeyer> rogpeppe: You can do a deploy without testing upgrade
<rogpeppe> niemeyer: sure. then we have a set of tests, each of which includes the other as a prefix. i'm with you, but i don't want our live tests to take 2 hours
<niemeyer> rogpeppe: What upgrade is a prefix of?
<rogpeppe> niemeyer: deploy is a prefix of upgrade. maybe upgrade is as far as it gets, yes.
<niemeyer> rogpeppe: bingo
<niemeyer> rogpeppe: Bootstrap and deploying things are fundamental.. everything else is not
<rogpeppe> niemeyer: so... does it make sense to have deploy as a separate test, when upgrade will test *exactly* the same path?
<niemeyer> rogpeppe: Does it make sense to know that deploy works, despite the fact that upgrade doesn't?
<rogpeppe> niemeyer: sure. but our test failure will tell us that by where it failed. i don't think it justifies an extra three minutes testing time.
<niemeyer> rogpeppe: So let's erase all the tests.. 0 minutes
<rogpeppe> niemeyer: now yer being silly :-)
<niemeyer> rogpeppe: Where's the value?
<niemeyer> rogpeppe: I am
<niemeyer> rogpeppe: But that's the point.. you're worried about timing and ignoring everything else
<niemeyer> rogpeppe: THere are other ways to optimize time
<niemeyer> rogpeppe: Running tests in parallel for instance
<niemeyer> rogpeppe: But if you take away the value of the test, there's little reason to have them
<niemeyer> rogpeppe: All we have is a black/white works/doesn't work
<rogpeppe> niemeyer: does that mean table-driven tests are of no value because they're all bundled into a single test?
<niemeyer> rogpeppe: Can't see that leap
<niemeyer> rogpeppe: "table tests" are generally very contained in scope
<niemeyer> rogpeppe: Bootstrapping, deploying, upgrading, deploying more units, remove units, test firewaller, open ports, close ports, check expose, blah blah blah
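The table-test shape rogpeppe alludes to, sketched with the standard library only (the juju-core suites use gocheck's c.Check rather than this manual loop); `isUnitName` is a made-up example function. Each case is small and labelled, so a failure points at one row rather than "the big blob of code is broken again":

```go
package main

import (
	"fmt"
	"strings"
)

// isUnitName reports whether s looks like "service/number" in the
// loosest sense. Purely a stand-in for something worth testing.
func isUnitName(s string) bool {
	i := strings.Index(s, "/")
	return i > 0 && i < len(s)-1
}

func main() {
	tests := []struct {
		about string
		input string
		want  bool
	}{
		{"valid unit name", "mysql/0", true},
		{"missing unit number", "mysql/", false},
		{"no separator", "mysql", false},
	}
	for _, t := range tests {
		if got := isUnitName(t.input); got != t.want {
			fmt.Printf("FAIL %s: got %v, want %v\n", t.about, got, t.want)
		} else {
			fmt.Printf("ok   %s\n", t.about)
		}
	}
}
```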
<rogpeppe> niemeyer: ok, i'll change it and see how it looks
<niemeyer> rogpeppe: hold on, this is all future work.. let's please make things stable first
<rogpeppe> niemeyer: ok. so what do you want me to do about the current branch?
<niemeyer> rogpeppe: Look at the review, and move forward?
<rogpeppe> niemeyer: ok. i thought you weren't keen on most of what i was doing there.
<niemeyer> rogpeppe: I've sent a review that details exactly my feeling about what's in that branch
<niemeyer> rogpeppe: And I've started the conversation saying that I appreciate the refactoring you did to wait for tools
<niemeyer> rogpeppe: And I also said I didn't want to get into this discussion before the fire was extinguished
<rogpeppe> niemeyer: fair enough
<rogpeppe> niemeyer: how about if toolsWaiter was named differently? i think the bundling together of object and channel works well - it made the code simpler.
<niemeyer> rogpeppe: :-(
<niemeyer> rogpeppe: Can we please make the tools waiter *wait for tools*!?
<rogpeppe> niemeyer: ok, i think i'm there
<niemeyer> rogpeppe: Thanks
<niemeyer> fwereade: Curiously, the tests are passing now :-(
 * niemeyer looks at the paste
<fwereade> niemeyer, yeah, I did see it once myself -- have a small amount of patience, because your suggestions led to a certain amount of rearrangement that happened to somewhat affect error resolution
<fwereade> niemeyer, but I'm not 100% there yet
<niemeyer> fwereade: Cool
<rogpeppe> niemeyer: i think this might be better: https://codereview.appspot.com/6566058
 * rogpeppe goes for a bite of lunch
<niemeyer> rogpeppe: Thanks, LGTM. Just sent a couple of questions for your consideration
<rogpeppe> niemeyer: the advantage of having watcher as an embedded field is just so we don't need to define another Stop method. but i guess it's probably best to be explicit and do that anyway.
<niemeyer> rogpeppe: Okay, I personally don't mind much either way in that case.. whatever you're happy with
<rogpeppe> niemeyer: we don't need a refresh because waitAgentTools has already done a Refresh, but again, maybe it's best to do it so it's obvious
<rogpeppe> niemeyer: i'm happy to do that if you prefer
<niemeyer> rogpeppe: Yeah, sounds slightly more resilient.. we could change the tools waiter without breaking the test
<rogpeppe> niemeyer: ok, will do. thanks for the review BTW
<niemeyer> rogpeppe: My pleasure, and sorry for the confusion
<rogpeppe> niemeyer: np
<niemeyer> I think I was the go fmt offender..
<niemeyer> I stumble upon it every other commit
<rogpeppe> niemeyer: i do it quite a bit too :-)
<niemeyer> fwereade: You might appreciate this one: https://codereview.appspot.com/6565057
<niemeyer> fwereade: Figured it out while having a look at the uniter test runs.. :)
<niemeyer> rogpeppe: ;)
<fwereade> niemeyer, yay!
<fwereade> niemeyer, LGTM
<fwereade> niemeyer, that was on my copious-free-time list too :)
<niemeyer> fwereade: The uniter test that broke in the paste was relative to resolution
<niemeyer> fwereade: I'm wondering if it's just a sync race
<niemeyer> fwereade: I'll see if I can reproduce this afternoon
<TheMue> niemeyer: ping
<niemeyer> TheMue: Yo
<fwereade> niemeyer, hmm, I feel like that bit *should* be pretty clean, but I'm not certain
<TheMue> niemeyer: the MachinesWatcher seems to have a problem
<niemeyer> TheMue: Oh?
<TheMue> niemeyer: while the test of RemoveMachine() in state works I now tried it for the firewaller
<TheMue> niemeyer: and there I get no error in EnsureDead() and RemoveMachine() but a panic in the watcher telling me that the machine is not dead
<niemeyer> TheMue: Ah, interesting.. I'd like to have a closer look at that, as I'm about to refactor that watcher to use a slice of changes too
<niemeyer> TheMue: Do you have a reproducer?
<TheMue> niemeyer: currently in an own little branch of the firewaller test. but i'll set up an extra branch for you
<niemeyer> TheMue: ?
<niemeyer> TheMue: Thanks, please push something, or just put the test in a paste and send a link
<niemeyer> TheMue: I'll have a look after lunch
<TheMue> niemeyer: ok, will do
<niemeyer> I'll step out for an earlier lunch today to join Ale.. see you all soon
<rogpeppe> niemeyer, fwereade: i still get sporadic failures of worker/uniter
<rogpeppe> niemeyer: but that's the only failure
<rogpeppe> niemeyer: enjoy!
<niemeyer> rogpeppe: It's known.. I'll try to reproduce in the afternoon, and fwereade is also refactoring things somewhat
<rogpeppe> niemeyer: a slightly reddish shade of green then :-)
<niemeyer> rogpeppe: The message is the same: tests should pass before you merge. If something broke, don't commit
 * niemeyer steps out
<rogpeppe> uniter upgrade tests pass, yay!
<rogpeppe> (live)
 * fwereade cheers at rogpeppe
<fwereade> well holy shit
<fwereade> after that *brutal* refactoring, the uniter tests pass in about half the time they did before
<rogpeppe> fwereade: trivial: https://codereview.appspot.com/6566063
 * rogpeppe cheers at fwereade
<fwereade> rogpeppe, (and worked second time after I got it to build, too, the only bug (detected thus far...) was errorContextf-related)
<rogpeppe> fwereade: nice one
<fwereade> rogpeppe, LGTM
<fwereade> rogpeppe, bbs
<rogpeppe> fwereade: so how long do the uniter tests take now?
<rogpeppe> k
 * niemeyer waves
<fwereade> rogpeppe, bit more than a minute
<niemeyer> TheMue: ping
<fwereade> niemeyer, that suggestion was *awesome*
<niemeyer> fwereade: Woohay!
<TheMue> niemeyer: pong
<fwereade> niemeyer, I think I remained true to its spirit, just got it building, and OMG at least twice as fast
<fwereade> niemeyer, but I have to go... you'll get a CL tonight for sure, unless I discover some subtle screwup
<TheMue> niemeyer: it is not reproducible standalone, only in the firewaller
<TheMue> niemeyer: shall i push that branch so that you can test it there?
<niemeyer> TheMue: Yeah, or even just a tiny test for the firewaller that exposes the bug
<TheMue> niemeyer: looked into the watcher. the revno = -1 seems to come too early, because the machine is still in w.alive
<niemeyer> TheMue: I've sent a review too
<TheMue> niemeyer: cheers
<niemeyer> TheMue: Yeah, I want to look at that.. I've put the panic mainly so I'd see those cases
<TheMue> niemeyer: I paste the test, one moment please
<niemeyer> TheMue: Thanks!
<niemeyer> fwereade: Superb, that sounds very exciting!
<rogpeppe> niemeyer: uniter upgrade, finally: https://codereview.appspot.com/6561063
<niemeyer> rogpeppe: Looking
<TheMue> niemeyer: http://paste.ubuntu.com/1230645/ , the panic is in merge() of the MachinesWatcher
<niemeyer> TheMue: Thanks
<niemeyer> TheMue: please let me know if you have any questions on the review sent
<TheMue> niemeyer: the fixed bug had no failing effect; I only discovered that my removing of the service didn't work. then I saw that the map contained the unit under two ids: the service name and the unit name. but we never looked up the service name, so no error. my length test just didn't become 0.
<TheMue> niemeyer: the other comments are pretty clear, thanks
<niemeyer> TheMue: That's how you discovered the bug
<niemeyer> TheMue: But does it really have "no failing effect"? Why do we use this map at all then?
<TheMue> niemeyer: yes, but before it showed no effect
<niemeyer> TheMue: This sounds like a missing test?
<niemeyer> TheMue: Why is the map used?
<TheMue> niemeyer: the map only contains one entry with the service name and the last added unitd
<TheMue> niemeyer: now it is used for me to stop the serviced, but i have to look why we have added it once
<niemeyer> TheMue: Sorry, that doesn't sound right
<niemeyer> TheMue: Feels very hand-wavy
<niemeyer> TheMue: Why have you added that map in the first place?
<niemeyer> TheMue: Where is it used?
<TheMue> niemeyer: that's what i wanted to say. i right now don't remember and have to look for the reason
<niemeyer> TheMue: Bingo.
<TheMue> <TheMue> niemeyer: now it is used for me to stop the serviced, but i have to look why we have added it once <= last sentence, i have to look
<niemeyer> TheMue: <TheMue> niemeyer: the map only contains one entry with the service name and the last added unitd
<niemeyer> TheMue: That's what I was talking about
<TheMue> niemeyer: if you don't want me to describe what i've seen just let me know
<niemeyer> TheMue: Heh
<niemeyer> TheMue: I don't need you to tell me that you've seen a map with a service name and a unit
<niemeyer> TheMue:                                 unitd.serviced.unitds[unit.ServiceName()] = unitd
<niemeyer> TheMue: I can read the code
<TheMue> niemeyer: ok, so i don't describe stuff like that in future
<niemeyer> TheMue: I also don't need you to tell me that tests were passing before.. as that's obvious.
<niemeyer> TheMue: What I do need your help with, is making sure we have a test covering the fix
<niemeyer> TheMue: I do need you for that, a lot.
<TheMue> niemeyer: I'll do my best.
<niemeyer> TheMue: And that's the only thing I've suggested in the review
<niemeyer> TheMue: Do we have any test covering two services?
<TheMue> niemeyer: The map has been used after an exposed change. So the one (last added) unit has been passed twice to flushUnits() for opening/closing of ports
<TheMue> niemeyer: so far the map has been used nowhere else
<niemeyer> TheMue: Have you seen the line right below the one you changed/fixed?
<TheMue> niemeyer: yes
<niemeyer> TheMue: and?
<TheMue> niemeyer: here a changed unit is added directly to the slice, so only once
<niemeyer> TheMue: Slice?
<TheMue> niemeyer: but after the exposed change we iterate over serviced.unitds, and the one unit is twice in the map and gets added to the slice twice
<TheMue> niemeyer: yes, the slice that is passed to flushUnits()
<niemeyer> TheMue: I mean line 130 and 131
<niemeyer> TheMue: The line you fixed, and the line right below it
<TheMue> niemeyer: ah, ic what you mean
<TheMue> niemeyer: the referenced serviced is the same
<niemeyer> TheMue: Yes, we have the unit twice in the same map with different keys
<TheMue> niemeyer: so one line isn't needed
<niemeyer> TheMue: Right, the bottom one
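A toy reconstruction of the bug just diagnosed (names illustrative): with the unit stored under both the service name and the unit name, iterating the map yields the same unit twice, so flushUnits would have been handed it twice:

```go
package main

import "fmt"

// collect gathers the units out of a unitds-style map, the way the
// firewaller builds its slice for flushUnits.
func collect(unitds map[string]string) []string {
	var units []string
	for _, u := range unitds {
		units = append(units, u)
	}
	return units
}

func main() {
	buggy := map[string]string{
		"wordpress":   "wordpress/0", // keyed by service name: the bug
		"wordpress/0": "wordpress/0", // keyed by unit name
	}
	fixed := map[string]string{
		"wordpress/0": "wordpress/0",
	}
	fmt.Println(len(collect(buggy))) // 2: same unit flushed twice
	fmt.Println(len(collect(fixed))) // 1
}
```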
<niemeyer> TheMue: Do we have any tests with multiple services?
<TheMue> niemeyer: have to look, afaik not yet
<niemeyer> TheMue: Can we have one in a follow up branch?
<TheMue> niemeyer: will add it, sure
<niemeyer> TheMue: Happy to have this in just with the review comments sorted, plus the duplicated line removed
<niemeyer> TheMue: Thank you
<TheMue> niemeyer: yw
<rogpeppe> niemeyer: just a heads up: i'm rolling the provisioner into the machine agent, and i don't particularly see any reason to treat the firewaller as inherently bundled with the provisioner, so i'm creating two bools on the Machine - Provisioner and Firewaller. does that seem reasonable to you?
<rogpeppe> niemeyer: i think it makes the implication of the flags more clear, as well as being more flexible.
<niemeyer> rogpeppe: Sounds like it could be a list.. machine.AgentWorkers() []WorkerKind
<rogpeppe> niemeyer: yeah, i was wondering about that kind of thing too
<rogpeppe> niemeyer: or a map[WorkerKind]bool
<niemeyer> rogpeppe: Doesn't sound right..
<rogpeppe> niemeyer: we can have more than one instance of a given worker?
<rogpeppe> niemeyer: well, i guess so
<niemeyer> rogpeppe: Probably not, but that's unrelated
<rogpeppe> niemeyer: oh - that was the reason i suggested it.
<niemeyer> rogpeppe: Yep.. that's unrelated
<rogpeppe> niemeyer: so why don't you think it's good like that?
<niemeyer> rogpeppe: Lists without duplications are a fine concept
<rogpeppe> niemeyer: ok. where would you put the check? or would we just trust?
<niemeyer> rogpeppe: Where would you put the check?
<rogpeppe> niemeyer: probably in the agent itself. but it depends whether we want to allow multiple instances of a given worker in principle
<rogpeppe> niemeyer: BTW would the agent worker kinds include the machiner too?
<niemeyer> rogpeppe: Good question.. I think it should, for consistency
<rogpeppe> niemeyer: it kinda doesn't seem right to leave it out, but then again i can't think of a situation where we don't want to run it
<niemeyer> rogpeppe: Let's have it in.. it reflects reality and reduces implicit assumptions
<niemeyer> rogpeppe: machine.SetAgentWorkers([]WorkerKind{MachineWorker, ProvisionerWorker, FirewallerWorker})
<rogpeppe> niemeyer: ok, so if the machine agent starts and no workers are set, it... does what?
<niemeyer> rogpeppe: It explodes and laughs
<rogpeppe> niemeyer: sounds good
<rogpeppe> niemeyer: to relieve some typing, i think perhaps SetAgentWorkers(... state.WorkerKind)
<niemeyer> rogpeppe: +1
<niemeyer> rogpeppe: SetWorkers sounds good too
<rogpeppe> niemeyer: +1
<rogpeppe> niemeyer: and type AgentWorker string ?
<niemeyer> rogpeppe: Uh oh.. we have a problem
<niemeyer> rogpeppe: A race, more specifically
<niemeyer> rogpeppe: type sounds good
<niemeyer> rogpeppe: We have to create the machine with its workers set
<rogpeppe> niemeyer: ah yes
<rogpeppe> niemeyer: we could put the workers into AddMachine
<niemeyer> rogpeppe: +1
<rogpeppe> niemeyer: cool. then i won't provide a SetWorkers method, so there's no pretence
<niemeyer> rogpeppe: Should fail if MachineWorker isn't set, or if there are duplicates
<rogpeppe> niemeyer: sounds good
<rogpeppe> niemeyer: although we could just add the MachineWorker implicitly
<rogpeppe> niemeyer: and return it from machine.Workers
<niemeyer> rogpeppe: Let's please make it explicit. Very soon it may well make sense to have a machine that has no units. That's a trivial step from now.
<rogpeppe> niemeyer: ok, that seems good
<rogpeppe> niemeyer: actually we could make it not return an error, but do as requested (assuming at least one worker). we'd just need to tweak AssignToUnusedMachine to choose a machine with a MachineWorker
<rogpeppe> niemeyer: still i'll start by returning an error if no MachineWorker
<niemeyer> rogpeppe: One thing at a time, please
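The design just agreed on, sketched with hypothetical names: workers are passed to AddMachine at creation time (closing the race a later SetWorkers would open), with duplicates and a missing MachineWorker rejected:

```go
package main

import (
	"errors"
	"fmt"
)

// WorkerKind identifies an agent worker a machine should run.
type WorkerKind string

const (
	MachineWorker     WorkerKind = "machiner"
	ProvisionerWorker WorkerKind = "provisioner"
	FirewallerWorker  WorkerKind = "firewaller"
)

type Machine struct{ workers []WorkerKind }

// AddMachine takes the worker list at creation time and fails on
// duplicates or a missing MachineWorker, per the discussion above.
func AddMachine(workers ...WorkerKind) (*Machine, error) {
	seen := make(map[WorkerKind]bool)
	hasMachiner := false
	for _, w := range workers {
		if seen[w] {
			return nil, fmt.Errorf("duplicate worker: %s", w)
		}
		seen[w] = true
		if w == MachineWorker {
			hasMachiner = true
		}
	}
	if !hasMachiner {
		return nil, errors.New("machine must run a machine worker")
	}
	return &Machine{workers: workers}, nil
}

func main() {
	m, err := AddMachine(MachineWorker, ProvisionerWorker, FirewallerWorker)
	fmt.Println(m.workers, err)
	_, err = AddMachine(ProvisionerWorker)
	fmt.Println(err)
}
```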
<Aram> OK: 124 passed, 1 skipped
<Aram> how can one skip a test?
<Aram> is this an old or new feature?
<niemeyer> Aram: Hah, I think that's actually a bug
<niemeyer> Aram: I mean, in the sense that we want the test running
<niemeyer> Aram: c.Skip(reason)
<Aram> I see
<Aram> 	c.Skip("Marshalling of agent tools is currently broken")
<Aram> I'm probably on an older branch
<rogpeppe> Aram: you are, and they aren't
<Aram> I'll merge trunk at some point
<niemeyer> Aram: https://codereview.appspot.com/6576056
<rogpeppe> niemeyer: i thought i'd already done that...
<rogpeppe> niemeyer: probably just in some other abandoned branch
<niemeyer> rogpeppe: Maybe.. one of the reasons why small/in-focus branches rock
<rogpeppe> niemeyer: point taken :-)
<TheMue> niemeyer: CL is in again
<TheMue> niemeyer: I have to leave, family needs me, but I'll look into mail via phone later
<SpamapS> niemeyer: was just talking on a call and we were discussing the possibility of adding some things into metadata.yaml...
<SpamapS> niemeyer: How would you feel about a field at the same level as 'interface' that is 'recommends' and lists charms that are the recommended other side of that relationship?
<SpamapS> niemeyer: and on a related note, how about a text field called 'description' for the relation itself, that is displayed whenever there is confusion on which relation to establish?
<niemeyer> SpamapS: Unless we define behavior for such a field, I'm -1 on it
<niemeyer> SpamapS: This sounds like getting into stacks, which I very much hope we start diving into soon
<niemeyer> SpamapS: On the description one, sounds more interesting
<SpamapS> niemeyer: The thought was to help people build stacks. The only behavior for it is to make use in the charm store and such.
<SpamapS> niemeyer: it's for the case where something is optional, like memcached, but there may be several implementations that provide: memcache protocol.
<SpamapS> niemeyer: prefer may be a better term than recommend
<niemeyer> SpamapS: Understood.. I'm still -1 on adding such a field before we define how stacks work and what behavior we want from juju itself when it looks at it
<SpamapS> niemeyer: ok, mind if I open up a bug in juju-core?
<niemeyer> SpamapS: Not at all, thanks for that.. If you don't mind, please try to phrase the bug in terms of the problem you're trying to solve, rather than the solution
<rogpeppe> hmm, seems like i've used up all my flocks:
<rogpeppe> bzr: ERROR: Could not acquire lock "/home/rog/src/go/src/launchpad.net/juju-core/.bzr/checkout/dirstate": [Errno 11] Resource temporarily unavailable
<SpamapS> niemeyer: indeed
<niemeyer> I wish Rietveld's copy & pasting worked as nicely as pastebin.ubuntu.com
<fwereade> niemeyer, https://codereview.appspot.com/6568060/ is still WIP but I think it hews closer to your vision
<fwereade> niemeyer, the tests worked first time I got it building (ok, ok, second time) and the tests runs about twice as fast as they did before :)
<niemeyer> fwereade: Oh, nice!
<niemeyer> fwereade: Why do we need the wantFoo stuff?
<niemeyer> Oh, maybe I misunderstand
 * niemeyer reads more
<niemeyer> Yeah, I think I did.. nice
<rogpeppe> fwereade: if you have a moment, you might want to have a glance at https://codereview.appspot.com/6561063
<rogpeppe> i'm now off for the evening
<rogpeppe> night all
<Aram> niemeyer: how do I turn off debug output from state/watcher?
<niemeyer> fwereade: Sent comments.. great stuff
<fwereade> niemeyer, sweet :D
<fwereade> niemeyer, I'll have time to look at them properly later
<niemeyer> fwereade: Oh, I think there are a few races in the filter logic, but they're fixable.. I'll send a note
<niemeyer> Aram: Just started hacking on the MachinesWatcher
<niemeyer> Aram: Curious to see the effect of the model we discussed yesterday
<Aram> niemeyer: cool, so how can I disable debug output from state/watcher while still keeping it from state?
<niemeyer> Aram: Have you had a look at the package?
<niemeyer> fwereade, Aram: I wonder about the right moment to show Dead things
<niemeyer> If we show entities that first show up as Dead after the initial event, we'll have to cache them to avoid showing them repeatedly
<niemeyer> Hmm.. or perhaps not, actually
<niemeyer> We can clean up when we find the entity removed
<niemeyer> That sounds sane... nevermind
<niemeyer> fwereade, Aram: http://paste.ubuntu.com/1231019/
<niemeyer> Sane?
<Aram> whenever one or more machines are added or change their lifecycle.
<niemeyer> Aram: Sounds good
<rogpeppe> niemeyer: Machine.Workers: https://codereview.appspot.com/6564063
<niemeyer> rogpeppe: Looking
<rogpeppe> niemeyer: thanks. i really am gone now :-)
<niemeyer> rogpeppe: have a good evening!
<niemeyer> rogpeppe: And a laugh, in case you're still around:
<niemeyer>  128 »       ProvisionerWorker WorkerKind = "firewaller"
<niemeyer>  129 »       FirewallerWorker  WorkerKind = "provisioner"
<Aram> machine units test pass
<Aram> hell yeah
<Aram> now to add more tests to test for more stuff
<niemeyer> Aram: Woot!
<Aram>  	// test -gocheck.f TestMachineWaitAgentAlive
<Aram> comment in the source
<Aram> I did NOT do that :).
<Aram> that was roger
<Aram> proposed
 * Aram has dinner
<Aram> bye
<fwereade> niemeyer, ping
<niemeyer> fwereade: yo
<fwereade> niemeyer, heyhey
<fwereade> niemeyer, clarification re: MachinesWatcher comment you pasted above
<niemeyer> fwereade: Cool
<fwereade> niemeyer, but I'm having trouble writing what I want clearly :/
<niemeyer> fwereade: It changed slightly after Aram's suggestion
<fwereade> niemeyer, ok, is it possible that the watcher will ever miss a Dead, and only observe the removal? and, if that is possible, and it occurs, will it send a Dead anyway?
<niemeyer> fwereade: It is possible, and it does report anyway
<fwereade> niemeyer, excellent, such was my supposition; I think there might be a nicer way to express that aspect of it
<fwereade> niemeyer, or maybe it's not worth it :)
<niemeyer> fwereade: I think the documentation actually guarantees it
<fwereade> niemeyer, it may just be my faulty reading then
<niemeyer> fwereade: You know too much :-D
<fwereade> niemeyer, what I really want to talk about is the uniter change I suggested earlier
<fwereade> niemeyer, I agree that resolved() is an ugly little knot of code
<niemeyer> fwereade: 'k
<fwereade> niemeyer, and I also have some sympathy for the notion that ModeHookError is really 2 modes, one Alive and one Dying, although I am somewhat resistant here
<fwereade> niemeyer, but the tension between the two is what really bothers me
<niemeyer> fwereade: I'm not sure if it was clear from my comment, but I wasn't suggesting getting rid of the current mode
<fwereade> niemeyer, in that unpacking the resolved method leads to duplication of slightly fiddly logic in 2 places
<niemeyer> fwereade: I was suggesting that it sounds fine to have it hand off parts of its work to two independent modes
<niemeyer> s/parts/part
<fwereade> niemeyer, or 4 if ModeHookErr and ModeConflicted are both split
<fwereade> niemeyer, and in either case, the resolved-handling and the upgrade-handling are duplicated
<niemeyer> fwereade: I'm hoping we can remove the fiddleness enough for it to not be an issue.
<fwereade> niemeyer, I don't think the <-u.charmUpgrades(false) suggestion will fly -- getUpgrade only works right when it's run on the main goroutine
<niemeyer> fwereade: Upgrade handling, if the suggestions are made are possible, is simply return ModeUpgrade(foo)
<niemeyer> fwereade: Why?
<niemeyer> fwereade: What I see is actually a race caused by the fact it runs elsewhere
<fwereade> niemeyer, filter goroutine reads uniter state: ok, cool, we're not upgrading | main goroutine starts an upgrade | filter goroutine reads the "current" state from the charm dir, instead of using the upgrade state
<fwereade> niemeyer, it's rather deliberate that the filter doesn't have access to the uniter itself, anyway
<niemeyer> fwereade: Hmm
<niemeyer> fwereade: Yeah, the fact it doesn't access the uniter sounds good
<niemeyer> fwereade: You see the race, though?
<fwereade> niemeyer, ah, sorry, just saw the new review, need to read that
<fwereade> niemeyer, yeah, you're absolutely right, and I would be more than happy to send both ResolvedModes and CharmThingys down the channels
<niemeyer> fwereade: Cool, we're in sync then.. I also see why you want getUpgrade in the uniter
<niemeyer> fwereade: It'll just loose some of its inner workings since we'll get that info from the channel
<niemeyer> lose
<fwereade> niemeyer, yeah -- and I suspect resolve will look a little different in this light, too
<fwereade> niemeyer, might give me a way out of the ickiness
<niemeyer> fwereade: Sweet
<fwereade> niemeyer, thinking about the Alive/Dying (sub)Modes a bit more
<fwereade> niemeyer, switching channels on and off in response to events on other channels is IMO not all that unpleasant a technique; it should surely not be used arbitrarily, but in each of those cases we have a situation in which...
<fwereade> niemeyer, the stimulus/response set in Dying is a strict subset of that in Alive; and the only valid transition is from Alive to Dying
<fwereade> niemeyer, it seems a shame to duplicate the identical bits any more than they have to be
<fwereade> niemeyer, but it will look different once I fix filter
<niemeyer> fwereade: The amount of logic you have added to enable the shifting is worth about exactly the same as the logic that you actually need in the dying case by itself.
<niemeyer> fwereade: One of them is entirely straightforward.. the other is not
<niemeyer> fwereade: I suggest, once you're done with everything else, at least trying to see how it looks
<niemeyer> fwereade: If you still feel the same way, I won't argue much :)
<fwereade> niemeyer, I'll see how it goes :)
<niemeyer> fwereade: and by worth I mean the number of lines
<fwereade> niemeyer, I'm really not sure about that... we lose the upgrade stuff, but the resolved stuff is still needed when dying
<fwereade> niemeyer, in both cases
<fwereade> niemeyer, it all hinges on how terse I can make the resolved handling ;)
<niemeyer> fwereade: Right, the resolved stuff is the only duplication in both cases
<niemeyer> fwereade: You only need tombDying next to it
<niemeyer> fwereade: In one of them
<fwereade> niemeyer, hmm, I think I should probably always have tombDying :)
<niemeyer> fwereade: or default..
<fwereade> niemeyer, in that one specific case in the main uniter loop, ok, yeah :)
<fwereade> niemeyer, actually that has tombDying *and* default
<fwereade> niemeyer, I must be missing something
<niemeyer> fwereade: Yeah, that's slightly surprising
<fwereade> niemeyer, when in a Mode is a select default useful?
<fwereade> niemeyer, (that bit is actually totally unnecessary -- it really just allows us to indirectly check for unit-Dead in between modes)
<niemeyer> fwereade: When in a mode is an if statement useful? :-)
<fwereade> niemeyer, d'oh, ok, yes, I use them
<fwereade> niemeyer, but in a for { select loop, surely all they will cause is spinning?
<niemeyer> fwereade: Ah, definitely, that'd be wrong
<niemeyer> davechen1y: Morning!
<davechen1y> morning
<davechen1y> niemeyer: thank you for your review, just responding to your comments now
<niemeyer> davechen1y: Cheers
<davechen1y> wrt. splitting hostFromTarget into two methods, one for machines, the other for units
<davechen1y> i'd still like to have a single method that dispatches to both of those
<davechen1y> so I can reuse it in scp
<davechen1y> what do you think ?
<niemeyer> davechen1y: Sounds like a good idea
<davechen1y> niemeyer: also, Unit.PublicAddress only works when called from the unit itself
<davechen1y> :(
<niemeyer> davechen1y: Isn't that EnvironProvider.PublicAddress?
<davechen1y> niemeyer: let me check before making more incorrect statements
<davechen1y> sorry, you are correct
<davechen1y> right ... so that comes out of the document, which means the UA sets that field, i'm guessing on Alive
<niemeyer> davechen1y: Yeah, it should set it at some point
<davecheney> niemeyer: what are your thoughts on changing environs.Environ.Instances([]string) to be ...string ?
<niemeyer> davecheney: I personally don't mind too much in either direction, but I wonder if environ.Instance(instId string) might be more useful
<davecheney> niemeyer: in the majority of cases where i've used that method, i only have a single id I want to resolve
<davecheney> so wrapping it in a []string is overkill
<davecheney> making the param variadic means we don't have to have two methods on the interface
<niemeyer> davecheney: It also means we'll call InstanceS when we want one, and have to unwrap a slice
<niemeyer> davecheney: We also are using a real slice less than we should
<niemeyer> davecheney: The provisioner, for example, has a TODO for avoiding the overkill of per instance query
<niemeyer> davecheney: Slightly off topic, though, I know
<davecheney> the providers all expect a slice
<davecheney> as in the method takes a []string now
<davecheney> so changing it to be variadic would have no impact on them
<niemeyer> davecheney: I mean that in "foo, err := environ.Instances(whatever)", foo is a slice too
<davecheney> yeah, that also doth suck
<davecheney> just as common is foo, err := ... ; foo[0].something()
<niemeyer> davecheney: So if we're missing a handier "give me one", I'd suggest a real one
<davecheney> niemeyer: fair call
<davecheney> i will propose something
<niemeyer> davecheney: Cheers
<niemeyer> davecheney: I'll step out to hack a home appliance here.. wish me luck :)
<davecheney> roger
<davecheney> send more paramedics
#juju-dev 2012-09-28
<davecheney> niemeyer: scp & scp reproposed, some nice reuse in there
<niemeyer> davecheney: Sweet, will have a quick look
<niemeyer> davecheney: http://play.golang.org/
<niemeyer> Erm
<niemeyer> davecheney: http://play.golang.org/p/cPscJ6RuoX
<davecheney> niemeyer: yes, i had to check that as well
<niemeyer> davecheney: Sorry, I don't get it?
<niemeyer> davecheney: "This will panic if len(c.Args) == 1. I've redone the logic to be less crackful."
<davecheney> http://play.golang.org/p/okbxoy-UI2
<niemeyer> davecheney: c.Target, c.Args = c.Args[0], c.Args[1:]
<davecheney> niemeyer: http://play.golang.org/p/TObeRIa8wL
<davecheney> yes, you are right
<davecheney> but I dont understand why http://play.golang.org/p/Fd_jNi6mVe
<niemeyer> davecheney: Because that's out of bounds
<niemeyer> davecheney: s[1:] works when len(s) == 1 for the same reason that s[:1] works.
<davecheney> i don't think that is correct
<davecheney> but this isn't the right place to argue about it
<davecheney> i'll fix my code
<davecheney> ok, someone explained it to me in the channel
<niemeyer> davecheney: Stepping out for the night.. have a good day!
<davecheney> enjoy
<rogpeppe> davecheney, fwereade_: mornin'
<davecheney> rogpeppe: howdy
<fwereade_> davecheney, rogpeppe: heyhey
<rogpeppe> fwereade_: it'd be great if you could have a glance at the uniter upgrade branch, if you have a moment sometime this morning: https://codereview.appspot.com/6561063/
<fwereade_> rogpeppe, sure
<fwereade_> rogpeppe, (still thinking)
<rogpeppe> fwereade_: np
<fwereade_> rogpeppe, ISTM that it would be simpler (not to mention less conflicty, and mildly kinder to the network) to get the Unit only when you actually need to run an upgrade
<fwereade_> rogpeppe, doing that removes the need to change runOnce and Uniter
<fwereade_> rogpeppe, is there a deeper motivation in play than "I need a unit"?
 * rogpeppe has a look to remind himself
<rogpeppe> fwereade_: we need to unit immediately
<rogpeppe> fwereade_: because we use it to announce the current agent version
<rogpeppe> fwereade_: but that's not the reason we change runOnce and Uniter
<rogpeppe> fwereade_: the reason for that is that we want to upgrader to be as independent as possible of uniter bugs
<rogpeppe> fwereade_: so we do the absolute minimum necessary before starting the upgrader
<rogpeppe> fwereade_: hence it's important that the uniter factory method doesn't return an error - early errors should not take down the upgrader.
<rogpeppe> s/need to unit/need the unit/
<rogpeppe> fwereade_: does that make sense?
<fwereade_> rogpeppe, hmm, apart from the ones which should, but I see it's tricky
<rogpeppe> fwereade_: which errors *should* take down the upgrader?
<rogpeppe> fwereade: last thing i was was [07:23:38] <fwereade_> rogpeppe, hmm, apart from the ones which should, but I see it's tricky
<rogpeppe> s/was/saw/
<fwereade> rogpeppe, cheers
<fwereade> rogpeppe, I'm not quite convinced that deferring the error helps you much though, 1 mo
<fwereade> rogpeppe, surely an early error return from the Uniter will hose the upgrader just as badly, because runTasks will terminate it
<rogpeppe> fwereade: no, because we've got some special case logic in the upgrader for just such an eventuality
<rogpeppe> fwereade: if it's killed early on, it waits until it has at least had a squizz at the proposed version
<rogpeppe> fwereade: and if that's changed, it doesn't exit until it has actually downloaded the upgrade
<rogpeppe> fwereade: (well, with some timeout too)
<rogpeppe> fwereade: it's given 5 minutes
 * fwereade continues to think
<fwereade> rogpeppe, something about all that does make me a little queasy... I feel that if I ask an upgrader to stop, it should jolly well stop, by criminy
<rogpeppe> fwereade: this was discussed at the time
<rogpeppe> fwereade: i think the upgrader is special
<rogpeppe> fwereade: because it's the only way we can escape bad s/w
<fwereade> rogpeppe, I agree, but IMO that means it should have control of the tasks rather than being antisocial :)
<rogpeppe> fwereade: it *will* stop... when it's made good and sure that someone is not trying to upgrade us
<rogpeppe> fwereade: well, as you know, that's how i started and that was deemed incorrect
<rogpeppe> fwereade: so this is what we're doing. and it doesn't seem bad to me.
<fwereade> rogpeppe, I do feel your pain there... but I think I need to talk to niemeyer about this
<rogpeppe> fwereade: ok
<fwereade> rogpeppe, sorry :(
<rogpeppe> fwereade: if we change this now, BTW, it knocks everything off
<rogpeppe> fwereade: because everything is built in this way currently.
<rogpeppe> fwereade: the upgrader is "just" another task
<rogpeppe> fwereade: so, given that we're not swimming in free time, and this architecture will work for the time being, perhaps we could move forward as we are and consider a change later?
<fwereade> rogpeppe, the collision with my own uniter changes is not entirely trivial
<fwereade> rogpeppe, and there is a value I return from NewUniter that really should block upgrades
<rogpeppe> fwereade: which is?
<fwereade> ErrUnitDead
<rogpeppe> fwereade: i don't think it matters too much tbh. it's an edge case - it doesn't matter much if we do upgrade in that case, it's just an extra 5s delay
<TheMue> morning
<rogpeppe> fwereade: FWIW i think i'd probably structure it in a similar way even if the upgrader was in control - we want the upgrader to be independent right up until the moment it decides that now is the time to upgrade.
<rogpeppe> TheMue: hiya
<TheMue> rogpeppe: hiya
<rogpeppe> fwereade: if you like, i'll merge your most recent uniter branch and include it as a prerequisite
<fwereade> rogpeppe, surely the right thing to do there is to have the upgrader in change of runTasks? the wait & retry business could be handled internally, surely?
<fwereade> rogpeppe, it's not the code conflict that bothers me so much as the lack of clarity
<rogpeppe> fwereade: it's not *that* straightforward. you want to start the upgrader independently, regardless of what the other tasks are doing. then the wait and retry isn't logic that's in the outer loop (because it's dependendent on what gets downloaded) but in a separate goroutine.
<rogpeppe> fwereade: so the structure starts to look similar to what we've currently got
<rogpeppe> fwereade: i think the notion of a task being able to delay shutdown until it's done what it needs to do is a reasonable one, and one of the reasons we use tombs like we do.
<fwereade> rogpeppe, yeah, I see that
<rogpeppe> fwereade: last seen: [07:53:09] <fwereade> rogpeppe, yeah, I see that
<fwereade> rogpeppe, I *think* I'm convinced, although I still don't quite entirely like something about it
<rogpeppe> fwereade: i understand.
<fwereade> rogpeppe, that said, I think I am coming around to your perspective that something outside the uniter should handle death-watching
<rogpeppe> fwereade: wasn't there a load of mode-specific stuff that you needed to do when killed?
<fwereade> rogpeppe, the uniter's Dying response is not interesting from your perspective, I think (although there's little point upgrading a dying unit)
<fwereade> rogpeppe, I think you probably *should* be watching for Dead and crash-stopping though
<rogpeppe> fwereade: "me" being...?
<rogpeppe> fwereade: ah, i see
<rogpeppe> fwereade: you mean i should crash-stop when something returns ErrDead
<fwereade> rogpeppe, almost certainly yes
<rogpeppe> fwereade: maybe. although i think it's a very rare edge case tbh
<fwereade> rogpeppe, but what I was trying to say is that the upgrader itself should know not to bother watching a unit once it's dying
<rogpeppe> fwereade: the upgrader doesn't watch a unit
<fwereade> rogpeppe, I submit that it should, so that binary upgrades work within the same framework as anything else -- I don't think they transcend entity lifetime ;)
<rogpeppe> fwereade: there's always a delay between entity being killed and entity actually dying. this just makes it a little longer in some edge cases.
<rogpeppe> fwereade: if you upgrade a system, there's a possibility that some dying units might linger for a few seconds more as they download a new version. i don't think that's too bad a price.
<fwereade> rogpeppe, well, it means that it continues to act alive -- in some, but not all, respects -- for arbitrarily longer
<rogpeppe> fwereade: does it?
<fwereade> rogpeppe, it may be that this is actually correct behaviour
<rogpeppe> fwereade: what is the upgrader keeping alive?
<rogpeppe> fwereade: after all, the Unit is marked dead
<fwereade> rogpeppe, wait, I'm talking about bad behaviour on Dying more than ugliness on Dead here
<fwereade> rogpeppe, I don't like the ugliness on Dead but I could live with it
<rogpeppe> fwereade: i'm not sure it changes the Dying behaviour at all, does it?
<fwereade> rogpeppe, the question that currently exercises me is "should Dying entities upgrade their code?"
<fwereade> rogpeppe, maybe, actually, they should
<rogpeppe> fwereade: i think so
<fwereade> rogpeppe, yeah, I can imagine a bug blocking clean shutdown that can be resolved by an upgrade
<rogpeppe> fwereade: i think it's nice to have upgrading totally divorced from any of the other logic
<fwereade> rogpeppe, the fact that they stop watching for charm upgrades on Dying then becomes potentially problematic
<rogpeppe> fwereade: i think charm upgrades are a different thing - they're at a higher level.
<rogpeppe> fwereade: and they really do relate directly to the unit
<rogpeppe> fwereade: so it makes sense not to upgrade a charm when the unit is dying
<fwereade> rogpeppe, hm, the bug-blocking-clean-shutdown I guess does *not*apply because the user can already use juju resolved
<rogpeppe> fwereade: for charm upgrade, yeah - we're providing the always-available layer on top of a charm.
<fwereade> rogpeppe, ok, all sounds reasonable, I'll finish the review :)
<rogpeppe> fwereade: tyvm
<rogpeppe> fwereade: there's also this fairly trivial: https://codereview.appspot.com/6564063/
<rogpeppe> everything in my window system has just start breaking. i'm gonna reboot
<rogpeppe> started
<rog> anyone here use multiple monitors under ubuntu?
<fwereade> rog, ok, I have another thought
<fwereade> rog, sorry no
<rog> fwereade: ok, listening
<fwereade> rog, I will be comfortable with this if there is some very basic life handling at the top level
<fwereade> rog, I think we need:
<fwereade> rog, 1) drop the unit return from runOnce, because it's potentially panic-inducing anyway
<fwereade> rog, 2) when we get the unit in runOnce, return a special error on NotFound, and that same special error if the unit exists but is Dead
<fwereade> rog, 3) if we see that error in Run, return nil
<fwereade> rog, 4) return the UpgradedError when we're doing an upgrade restart
<fwereade> rog, 5) add a "normal exit 0" stanza to the upstart conf
<fwereade> rog, and I think that's it
<fwereade> rog, that I think conveys my intent as best I can
<fwereade> rog, the precise details of implementation are ofc approximate
<davecheney> the disparity between *ConfigNode.Map() and Charm.Config().Option is a pain in the balls
<fwereade> rog, does that sound sane
<fwereade> davecheney, ouch, I bet
<davecheney> map[string]string vs map[string]interface{}
<davecheney> mix in some json or yaml and it's a royal pain
<rog> fwereade: i don't know what you mean about dropping the unit return from runOnce - i don't think it can induce panic
<rog> fwereade: if we can't get the unit in runOnce, we'll never start the upgrader
<fwereade> rog, bah, true
<rog> fwereade: we could also check for dead if you like.
<rog> fwereade: that would be trivial
 * rog wishes we used map[string]string throughout
<rog> bbs
<fwereade> rog, leaving unit stuff aside for a moment, what is your opinion of the "normal exit 0" thing?
<fwereade> rog, (when you return ofc)
<rog> fwereade: maybe the agent should just remove its own upstart conf
<fwereade> rog, that feels icky to me
<rog> fwereade: who else is going to remove it?
<fwereade> rog, the machine agent?
<rog> fwereade: ... or the principal unit agent, right?
<fwereade> rog, and I think it's fine to have a unit agent run on startup and exit immediately without error to indicate that it's done all it has to do
<fwereade> rog, yeah, whoever deployed it
<fwereade> rog, or a machine agent for that matter
<rog> fwereade: yeah, i think you're right
<rog> fwereade: presumably it *is* possible to get upstart to never start something again after it's exited ok
<rog> fwereade: i'm slightly dubious though - if we reboot, surely it'll start again anyway
<fwereade> rog, AIUI the stanza "normal exit 0" should be sufficient
<rog> fwereade: ok, sounds fine.
<fwereade> rog, it will; and the UA will cleanly observe its deadness, exit without error, and never trouble the machine again that run
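For reference, the `normal exit` stanza is a real Upstart keyword: listing a status tells Upstart that exiting with that status is a normal termination, so `respawn` does not fire. An illustrative conf fragment (paths and names are assumptions, not juju's actual files):

```
# /etc/init/jujud-unit-wordpress-0.conf (illustrative)
description "juju unit agent"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
# exit status 0 counts as a normal exit, so a clean shutdown
# (e.g. the unit is Dead) is not respawned
normal exit 0
exec /var/lib/juju/tools/unit-wordpress-0/jujud unit --unit-name wordpress/0
```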
<rog> fwereade: interesting point: currently if one of the workers dies without an error, runTasks will just let it die, but continue on. i wonder if actually it should kill everything as usual in that case, so the tasks are always tied together.
<fwereade> rog, hmmmmmmmmmm
<rog> fwereade: then the unit agent can exit with a nil error when the unit is dead
<rog> fwereade: and everything gets shut down cleanly.
<fwereade> rog, I think that in code I would prefer an explicit error return
<rog> fwereade: a nil error seems a fine way of saying "i'm shutting down with no error" to me
<fwereade> rog, and an error seems to me to be a fine way to signal "what you asked me to do ain't gonna happen"
<rog> fwereade: it's done *exactly* what we asked it to do, surely?
<rog> fwereade: but... given that it wants other things to die along with it, perhaps ErrDead is good.
<fwereade> rog, gaah, sorry
<rog> fwereade: last thing i saw was: [08:52:16] <fwereade> rog, an an error seems to me to be a fine way to signal "what you asked me to do ain't gonna happen"
<fwereade> rog, well, we get to choose how we define it -- I see the condition of lacking a unit as fundamentally an error condition for the uniter, because it means it can't do anything
<fwereade> rog, the client code may be in a position to handle that specific error in a different way
<rog> fwereade: zero is a ok value :-)
<rog> fwereade: last thing i said before you went:
<rog> [08:53:45] <rog> fwereade: but... given that it wants other things to die along with it, perhaps ErrDead is good.
<fwereade> rog, a Uniter *needs* a viable Unit to do its job -- if it cannot get the unit, or the unit is dead, that is an error :)
<fwereade> rog, ah, I missed that, sorry
<fwereade> rog, +1 on ErrDead causing nil return from Run
<rog> fwereade: ok, will do
<fwereade> rog, btw, would you put it somewhere easily accessible, like state, so I can also return it from the Uniter please?
<rog> fwereade: it could even go in the Uniter if we wanted
<rog> fwereade: i mean, in worker/uniter
<fwereade> rog, +1
<fwereade> rog, ok, sorry to keep banging on about the unit return, but it really doesn't feel right
<fwereade> rog, and it's a little goroutine-icky, even if not technically unsafe
<rog> fwereade: it's never used by two goroutines simultaneously
<fwereade> rog, but it crosses my mind that the *upgrader* already has the unit, and can surely send down its PathKey in the UpgradedError itself?
<fwereade> rog, that's why I said not unsafe
<rog> fwereade: i'm not sure i see the issue
<rog> fwereade: runOnce only returns when none of its sub-tasks are running
<rog> fwereade: therefore there can be no problem with goroutine ickiness
<rog> fwereade: i may be on crack
<fwereade> rog, I guess I just don't like a return value that is present-but-useless when no error, or most errors; present-and-useful on one specific error, and not-present on some other errors
<rog> fwereade: it's always present :-)
<rog> oh no it's not
<fwereade> rog, so you somehow return a unit when a unit is not found?
<rog> indeed
<fwereade> rog, ISTM that the PathKey is the important thing, not the context, and that that would be a fine thing to send down with the error
<fwereade> rog, sorry, "not the context" is kinda meaningless
<rog> fwereade: that does assume that the agent is *always* going to be named after the PathKey.
<rog> fwereade: otherwise the upgrader wouldn't be able to send it
<rog> fwereade: i know what you mean about the ickiness though
<fwereade> rog, hmm, AIUI that was the only point of PathKey in the first place
<fwereade> rog, does Machine have one too?
<rog> fwereade: yeah i know - that's why i wanted to call it "AgentName" ...
<rog> fwereade: it does
<fwereade> rog, feels to me like the way to go
<rog> fwereade: the ironic thing is that we already have the means to make the agent name, without involving unit or machine
<fwereade> rog, ha!
<fwereade> rog, but then, meh, we (currently) always need the unit anyway, we may as well ask it what it thinks
<fwereade> rog, +1 on AgentName
<rog> fwereade: been there, it was wrong
<fwereade> rog, understood, but it may just have been an idea before its time :)
<rog> fwereade: maybe. perhaps i'll start with PathKey and see whether the incongruence might change niemeyer's mind
<fwereade> rog, SGTM
<rog> fwereade: the other thing that occurred to me is the UpgradedError could contain a closure which would do the upgrade (i.e. call ChangeAgentTools)
<rog> fwereade: but that's probably a bit icky
<fwereade> rog, the only thing that stopped me proposing that was a feeling that UpgradedError was weird enough already
<fwereade> rog, but I think it would probably actually be the cleanest solution
<rog> [09:23:16] <rog> fwereade: but that's probably a bit icky
<fwereade> rog, not sure -- it's already not quite right, because it's not an upgrade*d* error
<fwereade> rog, it's actually an UpgradeReadyError
<rog> fwereade: i dunno - it's "someone has upgraded me"
<fwereade> rog, and if you think of it like that, a RunUpgrade method makes sense
<fwereade> rog, the actual upgrade of the agent does not take place until the agent calls ChangeAgentTools though
<rog> fwereade: UpgradeReadyError doesn't sound much like an error
<fwereade> rog, in what way is it worse than UpgradedError?
<rog> fwereade: i agree
<rog> fwereade: you're probably right, it's just about the same
<fwereade> rog, indeed -- the whole error idea is what still feels mildly abusive
<fwereade> rog, but I think at least it's worth a try
<fwereade> rog, https://codereview.appspot.com/6561063/ reviewed anyway
<fwereade> rog, https://codereview.appspot.com/6564063/ LGTM
<rog> fwereade: thanks
<rog> fwereade: dammit, the upgrader can't do the actual upgrade easily
<rog> fwereade: it doesn't have agent.Conf.DataDir
<fwereade> rog, really? blast
<rog> fwereade: and i'm slightly reluctant to pass that in just for this
<fwereade> rog, doesn't it use DataDir somehow to figure out the place to put the real tools dir in the first place?
<fwereade> rog, I *thought* the upgrader put the tools, and then just told the client that they can now symlink to the tools' known location
<rog> fwereade: yes, it does indeed have dataDir around, you're right
<fwereade> rog, cool
<rog> fwereade: ok, np then
<Aram> hello.
<rog> Aram: mornin'
<TheMue> Aram: moin
<rog> TheMue: hiya
<TheMue> rog: didn't we already see each other 3h ago? ;)
<rog> TheMue: i thought maybe i hadn't :-)
<TheMue> rog: it's ok, we both are 40+. so i know that. :P
<rog> fwereade: do you think *all* upstart scripts we produce should have the "normal exit 0" stanza?
<rog> TheMue: :-)
<fwereade> rog, hmm, it demands a similar change in the machine agent, but -- for our purposes at least -- that might be a sensible way to go
<fwereade> rog, it will not be unbearably hard to make it configurable later if we want to
<rog> fwereade: i've already made the change to the machine agent
<fwereade> rog, well then go for it :)
<TheMue> release meeting time
 * davecheney waits
<TheMue> i ping the boss
<Aram> aah, meeting.
<Aram> I forgot about this.
<TheMue> hmm, no answer
<rog> fwereade: this is still WIP, but hopefully is closer to what you'd like: https://codereview.appspot.com/6561063
<mramm2> pong
<TheMue> ah, the master
<mramm2> hahah
<TheMue> just wanted to collect the state of all to send it to you, but now you're here. fine.
<rog> davecheney, fwereade: invites are out
<Aram> davecheney: fwereade: https://plus.google.com/hangouts/_/9dde8fe32d77795910d89838149fce4ccddea664?authuser=1&hl=en
<rog> i lost conn
<rog> now "we're having trouble connecting with the plugin"
<rog> fwereade: presumably your net connection isn't good enough for a hangout this morning?
<fwereade> rog, oh *hell* I completely forgot and was eating lunch
<fwereade> rog, joining
<rog>  fwereade: https://plus.google.com/hangouts/_/9dde8fe32d77795910d89838149fce4ccddea664?authuser=1&hl=en
<davecheney> all: http://paste.ubuntu.com/1247363/
<davecheney> ^ hook, sadness
<TheMue> fwereade: is the current ServiceRelationWatcher what you expected with issue #1032539?
<fwereade> TheMue, just a mo, let me take a look; if it has Added/Removed, no
<rog> davecheney: have you looked at the log output?
<rog> davecheney: that's where the hook output will go, which should be diagnostic
<rog> bbs
<rog> fwereade: done, i think: https://codereview.appspot.com/6561063
<rog> (live tests pass)
<fwereade> rog, awesome, I'll take a look in a sec
<rog> fwereade: thanks
<rog> fwereade: trivial? https://codereview.appspot.com/6568064
<rog> lunch
<fwereade> rog, LGTM on UpgradeReadyError... oh wait there was something I meant to check
<fwereade> rog, what's the deal with changing the args to the machine agent?
<rog> fwereade: hmm, let me check
<rog> fwereade: i'm not sure what you mean. which argument?
<fwereade> rog, I can't find it, I may be remembering from an older version
<fwereade> rog, yeah, it doesn't exist -- sorry
<fwereade> rog, https://codereview.appspot.com/6568064/ looks trivial to me
<rog> fwereade: cool, will submit
<rog> fwereade: (thanks!)
<niemeyer> Hello all!
<niemeyer> Sorry, my phone warned me about the early meeting a bit late.. did we have one?
<Aram> we did
<niemeyer> Aram: Nice
<niemeyer> Aram: Have you seen the machines watcher I've pushed?
<Aram> niemeyer: still wip, found a bug and making final adjustments now.
<Aram> niemeyer: no, I'll take a look.
<niemeyer> Aram: https://codereview.appspot.com/6566066/
<niemeyer> Aram: Thanks
<niemeyer> Aram: Since we're both working on that stuff, it's good to cross-review
<niemeyer> Aram: Btw, I've talked about a similar pattern to the watcher you're pushing with William in the sprint, and now it occurred to me that you were not around in the conversation
<niemeyer> Aram: is it using a single goroutine, or a goroutine per unit?
<Aram> a goroutine for the machine plus one goroutine for each principal unit (not for subordinates).
<niemeyer> Aram: I don't think we need more than a single goroutine
<niemeyer> Aram: The logic ends up simpler rather than more complex with a single goroutine
<niemeyer> Aram: Please have a look at RelationUnitsWatcher
<niemeyer> Aram: It has a similar problem in which it has to decide what to watch as it goes
<Aram> niemeyer: yeah, it can be done with only one goroutine. I'll change it after I fix this one bug. I did it this way because it matched the way I thought of the problem, but it's easy to change now.
<niemeyer> Aram: It started with the same design of one-per-subcontext, and then after a conversation it got refactored to be a single goroutine
<niemeyer> Aram: Yeah, I totally understand that.. it feels "right", which is why I figured I should ask
<rog> niemeyer: yo!
<niemeyer> rog: Yo!
<rog> niemeyer: after discussion with william this morning, i made quite a few changes to the uniter upgrade branch
<rog> niemeyer: i hope you find them ok...
<rog> niemeyer: https://codereview.appspot.com/6561063/
<rog> niemeyer: am just running the first live test on the machine agent with provisioner and firewaller in-built
<rog> niemeyer: if that works, we're basically there for upgrading
<niemeyer> rog: It looks like you've made changes to the upgrader itself?
<rog> niemeyer: as requested, yes
<niemeyer> rog: Why did it affect the branch, specifically?
<rog> niemeyer: because fwereade was not happy about the fact that runOnce returned the unit
<rog> niemeyer: because it was only valid sometimes
<rog> niemeyer: so the fix was to make the upgrader responsible for doing the actual upgrade itself.
<niemeyer> rog: Yeah, I'm not against it for sure.. sounds like a great idea
<niemeyer> rog: I just wish this was done by itself
<niemeyer> rog: Rather than on top of another big branch
<rog> niemeyer: yes, perhaps i should have done that. it seemed like i'd have got into a twisty mess, but it may have been ok
<niemeyer> rog: I think that's where I am right now :)
<rog> niemeyer: ok, i'll back out the changes and create another branch
<niemeyer> rog: It's fine, I'm already on it
<niemeyer> rog: if err == uniter.ErrDead {?
<niemeyer> rog: Was this merged with William's change?
<rog> niemeyer: no, this was another thing that william suggested
<rog> niemeyer: perhaps i should have left that for later too
<niemeyer> rog: How's that related to upgrading?
<niemeyer> 	 82         if state.IsNotFound(err) || err == nil && unit.Life() == state.Dead {
<niemeyer> ?
<niemeyer> rog: Gosh man..
<rog> niemeyer: it's related to the logic around upgrading.
<niemeyer> rog: That's related to unit lifecycle
<rog> niemeyer: william asked me to change it, so i did. i'll change it back, np
<niemeyer> rog: Which is not handled now, and as far as I understand is completely unrelated to upgrading
<niemeyer> rog: Okay, can we please go back to applying the upgrading pattern that was already in place to the unit?
<niemeyer> rog: Or, optionally, refactor the upgrading out *without* adding it to the unit
<niemeyer> rog: and then add it to the unit in a separate step
<niemeyer> rog: The branch as it is seems somewhat out of scope in whichever angle we look
<rog> niemeyer: if i just take out the lifecycle changes, would that be enough?
<niemeyer> rog: I have no idea.. right now I have a branch that has at least: 1) Add upgrade support to unit; 2) Refactor the upgrade mechanism; 3) Add lifecycle support to the unit
<niemeyer> rog: I'd do them as (2), (1), and keep (3) out entirely.. William is working on it
<rog> niemeyer: ok
<niemeyer> rog: I'd also be fine with (1) + (2), though, as that's the order you did and might be easier to get back onto that state
<rog> niemeyer: that would indeed be much easier, as i did the lifecycle thing last
<niemeyer> rog: Right
<rog> niemeyer: thanks
<rog> niemeyer: i should know better when to stop what i'm doing and make a new independent branch
<niemeyer> rog: I have a hard time understanding it, to be honest
<niemeyer> rog: It's so pleasing to get small branches in..
<niemeyer> rog: Fast, painless
<niemeyer> rog: Easy for people to look at and "Hah, of course we want that"
<rog> niemeyer: yeah, but in this case, the incentive was "if i just make these few small changes, then william will be happy"
<niemeyer> rog: You can always make William happy in another branch :-)
<rog> niemeyer: yeah
<niemeyer> rog: Even more when the change is *already* huge
<TheMue> so, late lunchtime today, afterwards an appointment outside
<Aram> niemeyer: you got a review.
<TheMue> niemeyer: maybe i come back to you in the evening regarding the security groups, but most is already clear
<niemeyer> Aram: Thanks!
<Aram> niemeyer: I'll mark my branch as WIP and do some more changes.
<niemeyer> TheMue: Super, that branch Aram reviewed has your test, btw
<niemeyer> Aram: Cool
<TheMue> niemeyer: yes, i've reviewed it too and like it
<niemeyer> Aram: Can you clarify this bit: "to check that when we get an event, the lifecycle of the entity is really what we expect it to be."
<Aram> niemeyer: we get []int{2,3,5}, it would be nice if we could check that 5 is alive because it was just added, 2 is dead and 3 is dying.
<niemeyer> Aram: Hmm.. what are we testing?
<niemeyer> Aram: This feels like testing the test itself? I mean, it's the test itself that is putting the unit to Dead
<Aram> the test puts the unit to dead, but perhaps the watcher misfired for whatever wrong reason, still delivering the correct event. by checking that the unit is dead we make sure that the watcher fired for the correct reason.
<niemeyer> Aram: Sorry, I don't think that's the case. The test is saying "u.EnsureDead()".. it doesn't make sense for this specific test to ask "Is u actually dead?"..
<niemeyer> Aram: We have tests for EnsureDead elsewhere
<niemeyer> Aram: Regarding this: "I pondered about this. If we use Select anyway, why don't we use the real document, machineDoc in this case, and ignore the fields we don't care about?"
<niemeyer> Aram: I was on the fence..
<niemeyer> Aram: I think you're right.. we should just use the machine doc
<rog> niemeyer: i hope this is more digestible: https://codereview.appspot.com/6561063
<niemeyer> rog: Thanks very much, looking
<rog> fwereade: i've taken out the life-cycle-related changes - they'll go in another CL
<fwereade> rog, ok, SGTM, so long as they come soon I won't fret :)
<niemeyer> rog: At these times I <3 Rietveld
<rog> niemeyer: yeah, being able to diff against the different stages is invaluable
<niemeyer> rog: Yeah, and diffing across the back and forth produces clean diffs
<rog> niemeyer: ah yes
<Aram> at previous job we had the worst review tool ever.
<Aram> custom made in house, probably 15 years ago.
<rog> fwereade, niemeyer: factored-out branch: https://codereview.appspot.com/6567067/
<niemeyer> rog: Reviewed
<niemeyer> rog: The first one, that is
<rog> niemeyer: tyvm
<niemeyer> Aram: Nasty. I think I never had anything close to Rietveld either, to be honest
<niemeyer> So much lifetime wasted :-)
<rog> niemeyer: the cloudinit change is necessary because the upgrader uses the machine's PathKey for the agent name. I could roll that back and explicitly pass in an agent name to NewUpgrader instead, if you like.
<Aram> dy = -dx/(x + c1)
<Aram> y = -I[dx/(x + c1)]
<Aram> y = -ln|x + c1| + c2' = -ln|x + c1| + ln(exp(c2')) = -ln(exp(c2')|(x + c1)|) = -ln|c2(x + c1)|
<Aram> meh, sorry, not here.
<Aram> damn paste.
<niemeyer> rog: No, sounds sensible then, thank you
<rog> niemeyer: cool.
<niemeyer> Aram: Curious
<rog> niemeyer: BTW for the nil error, i think maybe it would be best as: if err == nil {err = fmt.Errorf("tasks finished with no error") } or something
<niemeyer> rog: "uniter error: %v" feels very terse and clear
<rog> niemeyer: rather than "uniter error: nil" which is still not great
<rog> niemeyer: ok, will do
<rog> niemeyer: seems funny saying "error" when there's no error, that's all
<Aram> niemeyer: I found an awesome hacky way of solving the not so complex differential equation x * y' + 1 = exp(y).
<niemeyer> Aram: What does the equation mean?
<Aram> can't remember where I first encountered it, particle physics for sure. A new solution just sprang (?) in my mind randomly and I had to write it down.
<niemeyer> rog: Fair enough regarding the error message, you're right
<niemeyer> Aram: That's how it generally goes :-)
<rog> niemeyer: ok, i'll go with my suggestion above, thanks
<niemeyer> rog: Thank you
<rog> ah, this will be why my combined machine and provisioning agent isn't working!
<rog> 2012/09/28 14:47:51 JUJU loaded invalid environment configuration: no registered provider for "ec2"
<rog> not too hard to fix :-)
<Aram> I must specify that I've probably encountered this equation more than 7 years ago though, heh.
<Aram> at a particular physics olympiad.
<niemeyer> rog: Follow up reviewed
<rog> niemeyer: thanks!
<rog> niemeyer: i wondered about worker.ErrDead but considered that it might be useful for a client to be able to distinguish which worker had given the ErrDead. but maybe that's overthinking it.
<niemeyer> Aram: Differential equations are surprisingly useful.. I wish I could keep the background math in my head for longer.. but haven't really ever used them in anger, so it remains just as a spare tool
<niemeyer> rog: The only agent that can be dead is the one that is running
<rog> niemeyer: ok.
<niemeyer> rog: e.g. we can't have a dead provisioning worker without a dead machine
<Aram> I have to step out early today guys, but I'll finish work in the weekend on the watcher.
<rog> Aram: have fun!
<niemeyer> Aram: Have a pleasant EOD
<Aram> thanks.
<rog> machiner/provisioner/firewaller upgrade worked live, yay!
<niemeyer> rog: WOAH
<niemeyer> I guess we're closing September in pretty good shape
<rog> niemeyer: in worker: var ErrDead = errors.New("agent object is dead") ?
<rog> niemeyer: better than we were a week ago, no question :-)
<niemeyer> rog: s/object/entity/, otherwise LGTM
<rog> niemeyer: cool
<rog> niemeyer: https://codereview.appspot.com/6570063/
<rog> niemeyer: the final piece of the puzzle
<niemeyer> rog: Looking
<rog> fwereade: https://codereview.appspot.com/6570063/
<fwereade> rog, ack
<fwereade> rog, that is awesome
<fwereade> rog, LGTM
<rog> fwereade: cheers!
<rog> fwereade: i need to do another branch to pass the state into the Machiner directly, i think.
<rog> fwereade: or maybe niemeyer will call that out in this branch
<rog> fwereade: actually, nope, it's not a problem, cool.
<niemeyer> rog: Done
<rog> niemeyer: thanks!
<niemeyer> rog: Very cool
<rog> niemeyer: it was very easy and *almost* worked first time...
<niemeyer> rog: Great stuff
<niemeyer> Lunch is calling
<niemeyer> biab
<rog> right upgrades are GO!
<rog> mramm2: ^
<mramm2> rog: awesome
<rog> mramm2: the only thing we don't have currently is the --bump-version functionality.
<rog> niemeyer: stage 1 of --bump-version: https://codereview.appspot.com/6560066
<niemeyer> rog: Looking
<niemeyer> rog: Done
<rog> niemeyer: thanks!
<niemeyer> rog: My pleasure
<rog> niemeyer: BTW, about --bump-version:
<rog> niemeyer: i'm wondering when we *don't* want it enabled
<rog> niemeyer: if we're uploading tools to private storage
<niemeyer> rog: Any time checking out a release and uploading it
<rog> niemeyer: why would we want to do that without bumping the build version? nothing will see the new release.
<niemeyer> rog: Bumping the version is a hack.. uploading tools to private storage is not
<niemeyer> rog: People can upload tools to private storage in their own cloud, with a real juju release
<rog> niemeyer: ok. should we make it an error if the tools already exist in the storage then?
<niemeyer> rog: How's that connected to the above?
<rog> niemeyer: because otherwise it's likely to be a mistake, and people might be surprised when nothing happens.
<niemeyer> rog: When you copy a file to a location that already contains the same file, nothing happens other than the same file being in the same place
<niemeyer> rog: That's not surprising
<rog> niemeyer: i'm not sure. in this case, it may well feel like we're asking juju to use a particular version.
<rog> niemeyer: which it will be, just not the version we've just uploaded
<niemeyer> rog: It will be the version we just uploaded if that's the version that is being used
<niemeyer> rog: I don't see the problem, really
<niemeyer> rog: --upload-tools uploads tools.. that's all
<rog> niemeyer: well, the problem there is we've got two "versions" of the s/w with the same version number
<niemeyer> rog: If people expect more than this, they'll be wrong
<rog> niemeyer: it's actually "upgrade-juju --upload-tools" and the problem i see is in the first word there.
<niemeyer> rog: That means you've changed the release
<rog> niemeyer: yeah - we've changed the release but the version number hasn't changed. that feels a bit like a mistake to me.
<rog> niemeyer: i'm trying to imagine a situation where it's useful to overwrite the existing tools for a given version
<niemeyer> rog: If you checkout 2.2.0, that's 2.2.0
<niemeyer> rog: Not 2.2.0.100, not 2.2.0.1001
<rog> niemeyer: ok, that's fine.
<niemeyer> rog: We're doing development, and we have tools to do what we want when we have to
<niemeyer> rog: Thus --bump-version
<rog> niemeyer: if someone has already checked out and uploaded 2.2.0, is it useful to them to be able to overwrite the existing 2.2.0 in storage?
<niemeyer> rog: Don't know.. doesn't look like a problem we should worry about right now
<rog> niemeyer: (i'm not suggesting automatic --bump-version BTW)
<rog> niemeyer: ok
<niemeyer> rog: If people ask to upload, and it uploads, sounds fine
<niemeyer> rog: We can fine tune that behavior over time too.. anything we think right now will likely be wrong in the real world
<rog> niemeyer: ok, seems reasonable
<rog> niemeyer: submitted. ... and with that, it's time for me to call it a week. oh, i might sign off a few bugs first :-)
<niemeyer> rog: Super, thanks for the great stuff this week
<rog> niemeyer: got there in the end :-)
<niemeyer> rog: Yeah
<rog> fwereade, Aram, niemeyer, everyone else: have a great weekend!
<fwereade> rog, and yourself :)
<niemeyer> rog: Thanks, you too!
<fwereade> niemeyer, btw, I just put a couple of trivials in the queue
<niemeyer> fwereade: I'll look right now
<niemeyer> fwereade: Done
<niemeyer> fwereade: Done
<fwereade> niemeyer, lovely, thanks
<niemeyer> fwereade: My pleasure.. have a couple of questions on the second one
<fwereade> niemeyer, the bits you're asking about are basically the same
<fwereade> niemeyer, if you do a plain waitHooks{}, it means, "I expect no further hooks to have been run"
<fwereade> niemeyer, the initial StartSync is to poke the uniter into responding to events that might cause spurious events to happen
<fwereade> niemeyer, the subsequent ones are to make sure that it responds to state changes in a timely way
<fwereade> niemeyer, sound sensible?
<niemeyer> fwereade: Ah, yeah, thanks.. I suggest dropping it down to, say, 100ms then, since this is the happy case
<fwereade> niemeyer, SGTM
<niemeyer> fwereade: We're getting a second every 5 of those, otherwise
<fwereade> niemeyer, indeed, dropping it will not hurt at all
<fwereade> niemeyer, ok to submit with that change?
<niemeyer> fwereade: LGTM
<fwereade> niemeyer, cheers
<fwereade> hmm, got to go out in a mo, actually, but they should be in soon enough :)
<niemeyer> fwereade: Have a good weekend, and thanks for the hard work too.. feels like we're on the runway for a fully working implementation
 * niemeyer steps out for a while
#juju-dev 2012-09-29
<mxd> any juju guru (poetic) talkin tonight?
<mxd> hello
<SpamapS> mxd: heh, just about to go to bed
<SpamapS> mxd: ?
<mxd> was looking for a juju expert to see if this is a bug or me
<SpamapS> mxd: you have me for a couple more minutes while I install updates. :) go ahead
<mxd> ive been trying to get juju and maas going and cant make it work ...juju gives an error every time ...
<mxd> so i thought i would hook up to my amazon ec2 account
<mxd> but when i go to launch juju i get an error on the control-bucket and for the life of me i can't find that info at amazon on the web or with hybrid fox
<mxd> the guide says its not needed but bootstrap crashes if i take it out
<mxd> so is that a bug or not?
<SpamapS> I'd need to see the error
<SpamapS> mxd: you do need control-bucket.. and the aws creds you give juju need to have full control of that bucket
<mxd> ok i dont use s3 so where do you get the control-bucket information from ..it isnt in security anywhere that i can find
<mxd> as far as i can see for amazon basic ec2 service ..i need ...access key ..secret-key ..account ..and password region and maybe the url if its not coded ....Not seeing bucket info anywhere
<SpamapS> mxd: it has to be *globally* unique
<SpamapS> mxd: the 'uuid' program is useful
<mxd> so this is a value from my system .....ie a volume location?
<mxd> so if i stick the default back in there with the sample ..will juju create (use ) that?   I was assuming that was a storage bucket from S3 services it was looking for, not a repository on my net
<SpamapS> mxd: the way S3 works, buckets have to be globally unique... its just the place to put your stuff
<SpamapS> mxd: even across different users, it has to be unique. Its not local, its on S3, but you, the user, have to decide what the name should be
<SpamapS> mxd: and yes, juju will create a control-bucket that does not exist
<SpamapS> mxd: and if it exists but you can't write to it, you'll get an error
<mxd> ok so it is my storage .........so i can mount a drive iscsi or something get the uuid and put that in there ...and then all the resources and charms will go in there
<SpamapS> ugh, no
<SpamapS> mxd: there is a *program* called UUID
<SpamapS> mxd: it will generate a UUID
<SpamapS> mxd: you can paste the UUID it prints out in as control-bucket, and that should have a very high chance of working
<mxd> that does beg the question if you don't put a real device in there how does juju map to real storage?
<mxd> or maybe a better question how (where) do you control it
<mxd> that worked not a bug ...my not understanding what was going on ....
<mxd> now back to figuring out maas and juju ..key errors are the suckiest thing about the whole freaking cloud ...
<mxd> thanks Spam btw ...
<SpamapS> sorry did we switch back to maas?!
<SpamapS> mxd: I think my brain capacity has dropped to "Cro-Magnon" levels. g'nite
<mxd> gnite to ya ..and yes i have an instance running on ec2 now ...next there will be how to control it .. doesn't pop up in maas control yet ...need to figure out my local cloud first though ..just wanted juju to work
<mxd> to look at what it took to connect ..names keys etc...
<mxd> man i need a new keyboard
<SpamapS> mxd: I'd recommend #juju for further discussion btw. It gets pretty heavy in the developers chatting about code in here.. users tend to hang over there
#juju-dev 2012-09-30
<mxd> anyone know if maas has a channel? not ask ubuntu
#juju-dev 2013-09-23
<axw> thumper: I have a couple more small sshstorage MPs, if you have time to look
<axw> also, good morning
<thumper> axw: sure, line them up
 * davecheney waves
<davecheney> % env LD_LIBRARY_PATH=/opt/gccgo/lib64 ./juju version
<davecheney> 1.15.0-raring-amd64
<axw> thumper: https://code.launchpad.net/~axwalk/juju-core/sshstorage-put-stdin/+merge/186936
<axw> then https://code.launchpad.net/~axwalk/juju-core/sshstorage-tmpdir-default/+merge/186939
<axw> thanks
<axw> davecheney: cool :)
<davecheney> axw: lots of little edge cases
<davecheney> but they are around building gccgo
<davecheney> nothing to do with juju-core
<davecheney> wallyworld_: juju _STILL_ waits too long before telling you that your environment is already bootstrapped
<davecheney> http://paste.ubuntu.com/6143764/
<davecheney> worse, it happily overwrites tools, then notices your env is bootstrapped
 * axw wonders how much of juju llgo can build
<wallyworld_> davecheney: i never did any work on the "waits too long" bug
<wallyworld_> i was doing work on simplestreams
<davecheney> ok
<davecheney> should I raise an issue
<wallyworld_> but fwiw i agree with you
<wallyworld_> yes
<wallyworld_> mark it high i reckon
<davecheney> ok, will do
<wallyworld_> maybe i can get to it this week
<wallyworld_> still have simplestreams mirrors and checksum support to finish
<davecheney> done
<davecheney> kk
<thumper> axw: can we talk about this change?
<thumper> I have a few questions
<thumper> probably faster in a hangout
<axw> yes, just a sec
 * thumper waits
<axw> thumper: https://plus.google.com/hangouts/_/d485a58f42c7e12792ce53c161e010a0775851a8?authuser=1&hl=en
<axw> wallyworld_: your MP from the other day fixes this, right? https://bugs.launchpad.net/juju-core/+bug/1223224
<wallyworld_> looking
<wallyworld_> yes indeed
<wallyworld_> and errors
<wallyworld_> thumper did the work, i just added a test and landed
<wallyworld_> i'll link the branch
<axw> yeah, we had a chat, I was a bit confused when I saw it come from you :)
<axw> cool thanks
<axw> I'll assign the card in LeanKit to you and move it to merged if that's ok
<axw> or thumper
<axw> rather
<wallyworld_> sure, doesn't really matter
<thumper> nah, it's done now
<thumper> :)
<wallyworld_> so long as it's in merged
<wallyworld_> if it is in merged, it will be added to the release notes
<wallyworld_> s/will be/should be perhaps
<axw> yup
<axw> done
<thumper> school run, back shortly
 * thumper disconnects to head to ice-skating
<thumper> caitlin is skating
<thumper> I'm going to be working in the cold
 * thumper just proposes a fix to a fix
<thumper> tests still passed, but wasn't as clean as it should be
<rogpeppe> mornin' all
<rogpeppe> axw: hiya
<axw> rogpeppe: howdy
<rogpeppe> axw: i'm looking for a review on this. i reckon you might like it. https://codereview.appspot.com/13249054/
<axw> okey dokey, looking
<jam> morning rogpeppe and axw
<rogpeppe> jam: hiya
<axw> heya jam
<rogpeppe> jam: you might want to take a look too. i had some spare time over the weekend and thought it was worth doing.
<jam> rogpeppe: from the overview, I'm happy to not have the "Discarded" stuff going into the log. Having it at Trace level might be ok
<rogpeppe> jam: i'm happy for it to be silent, tbh
<rogpeppe> jam: this now means that the decision can be made externally
<jam> rogpeppe: I would be ok with a flag to somehow enable it, just because of the "why isn't this method being detected" it could be helpful for developement. But it serves no purpose in production.
<rogpeppe> jam: see https://codereview.appspot.com/13827043/
<rogpeppe> jam: with that, you'll get a test failure if you add a method that's not detected
<rogpeppe> jam, axw: here's an exploration of a potentially interesting idea that this enables. it only took an hour or so after i thought of the idea: http://paste.ubuntu.com/6143236/
<rogpeppe> jam, axw: it generates API client code for the whole API
<rogpeppe> jam, axw: (the output looks like this: http://paste.ubuntu.com/6143244/ )
<axw> rogpeppe: nice
<rogpeppe> axw: text/template was a perfect match for the problem - i only made a single tweak to the rpcreflect API to make it easier to use (make a method return (Method, error) rather than (Method, bool))
<rogpeppe> TheMue: mornin'
<TheMue> rogpeppe: good morning
<TheMue> rogpeppe: just scanning the mail from friday till today
<TheMue> rogpeppe: what's weird, i've got a pre-filtering to sort it a bit. it mostly works, but sometimes mail still lands in the inbox
<davecheney> TheMue: from the driver guys ?
<TheMue> davecheney: this, but also other mails. i think it depends on replies and the threading. while the first mail is sorted correctly a reply lands in the inbox and so the whole thread
<fwereade> mgz, ping
<jam> fwereade: I often don't see him around yet, anything I can help with?
<jam> good morning, btw
<jam> rogpeppe: so to come back to your "auto-generate the client side of the API", one caveat with what you've written is that everything is exposed as a Params object, but if you actually look at state/api/client.go we expose functions that have variables that hide the internal Params object.
<jam> For example, your auto-generated one does: DestroyRelation(p params.DestroyRelation)
<jam> while the existing one is: DestroyRelation(endpoints ...string)
<rogpeppe> jam: yes, that is of course an issue.
<jam> I'm pretty sure we wanted API clients to generally think in terms of "just a function call" rather than dealing with the params structs.
<rogpeppe> jam: but i don't really know how much of an issue it is in practice. most API calls are used in just one or two places.
<jam> rogpeppe: well, I think it made it easy to transition from state.State access to api.State access
<jam> since the func args are relatively the same.
<rogpeppe> jam: yes, it did.
<rogpeppe> jam: there's no question that the current api state client adds some value.
<rogpeppe> jam: one possibility is to make the current api state clients call the auto-generated code
<rogpeppe> jam: which actually has a few advantages, i think - for one it's type safe
<jam> rogpeppe: given the sum total of current API client code is "wrap the args into a Params and pass it to Call" it doesn't seem to gain us much
<rogpeppe> jam: and for another, we get an interface type that we can mock
<jam> rogpeppe: well we can mock the "State" object today, because it is just a "common.Caller"
<rogpeppe> jam: you're maybe right that it isn't worth doing, but i think it's worth considering - i'm not really sure that the current API client code pulls its weight - it's a thin shim with a tiny bit of local state.
<rogpeppe> jam: my motivation for thinking about this was: why is our api code so big (> 17000 lines including tests) when all it is is a relatively thin layer on top of the existing mongo state code?
<jam> rogpeppe: so searching for mongo error codes (another patch of yours) I did find: https://github.com/mongodb/mongo/blob/master/docs/errors.md
<jam> interestingly, 10057 is remarked as a Code, but no comment as to what it means :)
<jam> and the link takes you to a code page that isn't that code
<rogpeppe> awesome!
<jam> so it does seem to list and describe them, but the doc itself is pretty out of date for the "code" links :)
<rogpeppe> jam: from one point of view, all our agent API calls are internal.
<rogpeppe> jam: and your example is actually an interesting one - there is exactly *one* call to DestroyRelation in the non-test code. is it *really* worth the extra layer for the convenience of the variadic arguments?
<rogpeppe> jam: anyway, i'm not suggesting that we do this right away, just that it's something worth thinking about
<rogpeppe> fwereade: it would be nice if you could take a look at this too, please:  https://codereview.appspot.com/13249054/
<fwereade> rogpeppe, sure, cheers
<rogpeppe> fwereade: (my spare time became slightly more copious in the weekend :-])
<fwereade> rogpeppe, ha, that's always nice :)
<fwereade> rogpeppe, LGTM, that's really nice
<rogpeppe> fwereade: cool, thanks
<fwereade> rogpeppe, I think it made things more readable too
<fwereade> rogpeppe, but I think we should include the name in the error ;p
<rogpeppe> fwereade: yeah, i think it made the structure better. there's actually another branch in waiting that adds rpcreflect.Value that implements Call
<fwereade> rogpeppe, stuttery messages are bad, but less bad than useless messages
<fwereade> rogpeppe, I'd rather risk the former when we don't control the clients in-package
<rogpeppe> fwereade: originally it returned bool
<rogpeppe> fwereade: i don't think the current error is worse than that
<rogpeppe> fwereade: but, fair enough, i'll make it return a NotFoundError
<fwereade> rogpeppe, heh, I find a bool less bothersome than a half-specified error
<rogpeppe> fwereade: it's not half-specified!
<fwereade> rogpeppe, with a bool it's totally clear that interpretation of the result is on the client
<rogpeppe> fwereade: i suppose so. but the docs do make that clear.
<rogpeppe> fwereade: and there are lots of other precedents for static error values which work ok
<rogpeppe> fwereade: e.g. io.EOF
<fwereade> rogpeppe, I *think* that the possibility of other errors is what makes the static error value a cromulent choice
<rogpeppe> fwereade: in this case there's no possibility of other errors, as documented
<fwereade> rogpeppe, in such cases, I think I like bools myself...
<fwereade> rogpeppe, but, eh -- it' somewhat academic anyway, because we certainly do control the client at the moment
<rogpeppe> fwereade: i only made the change so that i could do this more easily: http://paste.ubuntu.com/6143236/
<rogpeppe> fwereade: which was a little experiment
<rogpeppe> fwereade: to see how easy it might be to generate the client API code automatically
<fwereade> rogpeppe, that's interesting, indeed
<rogpeppe> fwereade: this is what the output looks like: http://paste.ubuntu.com/6143244/ )
<rogpeppe> fwereade: it compiles but i haven't tried using it
<fwereade> rogpeppe, cool
<fwereade> rogpeppe, anyway I do not think I am adding value by quibbling over the perfect return for the method :)
<fwereade> rogpeppe, follow your heart
<rogpeppe> fwereade: thanks. i did originally consider both possibilities - this is what my heart said :-)
 * fwereade had maybe better go buy tobacco, I have a feeling shops close at 12 here
<rogpeppe> fwereade: where are you?
<fwereade> rogpeppe, we moved to st pauls bay
<fwereade> rogpeppe, less city, more towny
<rogpeppe> fwereade: ah
<rogpeppe> "I'm ok with a blacklist, but we might just want a whitelist test. (both?) and then when you add something you update the lists."
<rogpeppe> jam: i'm not quite sure what you mean there
<rogpeppe> jam: it isn't a blacklist, by my understanding of the word
<rogpeppe> jam: although are you suggesting that i do what i mentioned in the CL description?
<jam> rogpeppe: it is a list of the entries that shouldn't be exposed, vs a list of entries that *should* be exposed
<rogpeppe> jam: well, really it's a whitelist of entries that we allow to be ignored
<jam> interestingly, one could argue that you should test that all of them *are* ignored, so the list doesn't grow stale.
<rogpeppe> we do that
<rogpeppe> jam: if you add another method that is ignored, the test will fail
<jam> rogpeppe: sure, I'm actually saying the inverse, if you remove a method that was previously ignored, but you're doing an exact match
<rogpeppe> jam: (that's why i don't think of it as a blacklist)
<jam> not iterating over them and looking them up as a set
<jam> rogpeppe: one could argue that we could do both sides as exact matches
<rogpeppe> jam: i agree with that
<rogpeppe> jam: it's a little bit more work when adding API calls, but probably worth it
<jam> it does add slightly more to the effort to expose a new name
<jam> were I doing it today, I probably would, but I don' think we have to worry too much
<rogpeppe> jam: we could easily have a trivial program that auto-generated the list
<jam> mgz: poke?
<mgz> jam: ey
<mgz> fwereade: what were you after earlier?
<rogpeppe> fwereade, jam: this factors out a little more logic from rpc to rpcreflect: https://codereview.appspot.com/13778046/
<fwereade> mgz, hey, I was wondering about how the get-a-new-address thing ties into the environs interface
<fwereade> mgz, a method on Instance?
<fwereade> mgz, actually, can we chat after the standup please? I have food to eat quickly
<mgz> fwereade: sure
<mgz> I was thinking an environ operation still
<fwereade> mgz, at the moment, maas does it by magic hackery, right?
<jam> standup time https://plus.google.com/hangouts/_/6f82d7382f19fc7b18493254974fb66ed1b99244
<jam> mgz: ^^
<jam> dimitern: ^^
<rogpeppe> dimitern: ^
<natefinch> mgz, and whoever else: https://pastebin.canonical.com/97850/
<mgz> ta
 * TheMue => lunch
<mgz> missing a number natefinch?
<mgz> doh, canonical
<mgz> "state or API addresses not found in configuration"
<natefinch> yeah, saw that... but I don't know what it means or how to fix it
<natefinch> I tried looking for that string in the sourcecode, but it only seems to appear in tests
<mgz> natefinch: comes from l326 agent/agent.go
<natefinch> mgz: oh, I see, that's why I couldn't find it
<natefinch> mgz: doesn't really tell me how it got in the bad state or how to fix it though.  What configuration is it talking about?
<mgz> something odd has happened with the agent config, it seems to not be producing the format it claims
<mgz> or, actually, it seems okay... fun fun
<mgz> rog is trying some debugging
<mgz> natefinch: are you trying with trunk, or a slightly older version?
<natefinch> mgz: it's a branch from last week that hasn't been updated from trunk. I could update from trunk if you think that'll help
<mgz> it would make reasoning about it a little less fraught, but only if it doesn't take you too long to do
<natefinch> mgz: easy enough
<natefinch> mgz: done, bootstrapping a new environment
<mgz> natefinch: when it's up, if you have the same error, can you check the contents of /var/lib/juju/agents/bootstrap/agents.conf
<natefinch> mgz: will do
<rogpeppe> natefinch: actually, please check the entire contents of that directory
<rogpeppe> natefinch: /var/lib/juju/agents/bootstrap, that is
<rogpeppe> natefinch: i suspect that there might not be a "format" file there
<rogpeppe> natefinch: but if there is, it'll blow that theory out of the water
<natefinch> mgz, rogpeppe: finally was able to log in... there's a format file there
<rogpeppe> natefinch: bugger
<natefinch> mgz, rogpeppe: agent.conf - http://pastebin.ubuntu.com/6145223/
<rogpeppe> natefinch: could you try executing the command line on line 414
<natefinch> (and yes, it got the same error)
<rogpeppe> natefinch: (you'll need to re-quote it)
<natefinch> rogpeppe: line 414 of what?
<rogpeppe> natefinch: one mo
<rogpeppe> natefinch: this command: http://paste.ubuntu.com/6145226/
<mgz> natefinch: I think you'll need to elevate to do that
<rogpeppe> +1
<natefinch> mgz: yeah, it told me so :)   I get the same error
<natefinch> state or API addresses not found in configuration
<mgz> good good
<rogpeppe> natefinch: yay!
<rogpeppe> natefinch: (kinda)
<natefinch> heh
<natefinch> reproducibility is good, even if you don't yet know how to fix it
<rogpeppe> natefinch: is this machine accessible from outside?
<rogpeppe> natefinch: i.e. can i ssh into it?
<natefinch> I think you have to ssh into garage maas (maas.mallards) first and then you can ssh into the node that is running the virtual maas, maas-1-01
<rogpeppe> natefinch: ok, different approach: can you compile this program, scp it to the node and try running it? http://paste.ubuntu.com/6145248/
<frankban> juju devs: maybe I have something wrong with my local configuration, but it seems that juju-core trunk does not execute the config-changed hook when using "juju set" to change charm options. "juju debug-log" also seems to no longer show hooks output. Reverting to an older revision of core (e.g. 1790)  everything restarted to work as usual... thoughts?
<rogpeppe> frankban: hmm, sounds bad
<rogpeppe> fwereade: any idea?
<frankban> rogpeppe: yeah, but it is possible that it is some configuration problem on my side, could you please try to reproduce?
<mgz> frankban: what are you doing to run into this exactly?
<rogpeppe> frankban: will do. what charm were you seeing the problem with?
<frankban> rogpeppe: juju-gui. mgz: deploy juju-gui, start debug-log, then change an option (e.g. juju set builtin-server=true).
<natefinch> rogpeppe: I get a nil error as long as I run elevated (otherwise permission denied opening agent.conf)
<rogpeppe> natefinch: did you upload-tools ?
<frankban> rogpeppe: the GUI correctly shows the option changed (so I guess the new value is correctly sent by the megawatcher). but ISTM that the hook is never called
 * frankban lunches
<natefinch> rogpeppe: yeah.  Let me make sure my process is correct.  I don't have a build environment on the maas virtual server, so I'm just building juju and jujud locally and copying them up to the server, then running bootstrap --upload-tools, which finds the jujud I copied up there
<rogpeppe> natefinch: what does jujud version print?
<natefinch> rogpeppe:  oh hmm... interesting... 1.15.0-raring-amd64
<natefinch> raring I'm guessing is the problem
<natefinch> (my dev environment is raring)
<natefinch> and the nodes are precise
<rogpeppe> natefinch: what's in /etc/lsb-release ?
<rogpeppe> natefinch: i think your node must be raring actually
<natefinch> rogpeppe: the nodes are definitely precise, just double checked lsb-release
<rogpeppe> natefinch: could you paste the output of "du -a /var/lib/juju" please
<natefinch> rogpeppe: the maas host where juju is running is raring though
<rogpeppe> natefinch: ah, sorry, i thought that's what you were talking about
<natefinch> rogpeppe: sorry, my terminology may not be great.  Yes, the host is raring, the nodes inside maas are precise
<natefinch> rogpeppe: du returns this on the bootstrap node: http://pastebin.ubuntu.com/6145289/
<rogpeppe> natefinch: what's in /var/lib/juju/tools/1.15.0.1-precise-amd64/FORCE-VERSION ?
<natefinch> rogpeppe: 1.15.0.1
<rogpeppe> natefinch: where is your shell finding jujud?
<rogpeppe> natefinch: i.e. what does "which jujud" print?
<natefinch> rogpeppe: which jujud on the bootstrap node doesn't find anything
<rogpeppe> natefinch: so how did you run it?
<natefinch> rogpeppe: oh sorry, I was running that previously on the maas host...
<natefinch> rogpeppe: thought you just wanted the version of what was getting uploaded
<rogpeppe> natefinch: that's fine, np
<rogpeppe> natefinch: that tiny source file i just pasted you: did you compile that against exactly the same juju source as the one you uploaded with --upload-tools ?
<natefinch> yep
<rogpeppe> weirdness
<natefinch> rogpeppe: I have to make a phone call, can  we pick this up in a bit?  Sorry... it's time sensitive
<rogpeppe> natefinch: np
<rogpeppe> natefinch: ok, when you're back, could you try this: cd $GOPATH/cmd/jujud; go build; scp jujud $MY_MAAS_NODE
<rogpeppe> natefinch: then on the maas node, try running that just-uploaded jujud binary with the same args as in this paste: http://paste.ubuntu.com/6145226/
<rogpeppe> natefinch: (just to sanity check that we can reproduce the problem directly)
<rogpeppe> mgz: for(i in `{go list -f '{{.Dir}}' ./...}){src=$i/*.go; if(grep -l 'launchpad.net/gocheck' $src > /dev/null&& ! grep -l 'gc\.TestingT' $src> /dev/null){echo $i}}
<rogpeppe> :-)
<mgz> chaos!
<rogpeppe> mgz: this works better. more chaos; love them one-liners :-) for(i in `{go list -f '{{.Dir}}' ./...}){if(! ~ $i *testing){src=$i/*.go; if(grep -q -l 'launchpad.net/gocheck' $src && ! egrep -l -q 'gc\.TestingT|testing\.MgoTestPackage' $src){echo foo $i}}}
<sinzui> rock! Next time someone says my sed lines are impossible, I'll point them to rogpeppe's bash
<rogpeppe> sinzui: it's not bash
<sinzui> Even better go emulating bash
<rogpeppe> sinzui: not quite
<rogpeppe> sinzui: it's rc actually
<rogpeppe> sinzui: which is considerably simpler than bash though you wouldn't guess it from that snippet :-)
<sinzui> :)
<natefinch> rogpeppe:  that ran without error.  maybe I accidentally uploaded an old jujud?  I can retry bootstrap with the new jujud
<mgz> natefinch: I think upload-tools can sometimes surprisingly pick something other than what you intended
<rogpeppe> natefinch: do you still have the executable that you wanted upload-tools to upload?
<natefinch> rogpeppe: I just overwrote it :/
<rogpeppe> mgz: https://codereview.appspot.com/13321052
<rogpeppe> natefinch: ok, so...
<rogpeppe> natefinch: i think you need to try again
<rogpeppe> natefinch: so bootstrap with an executable that you've built against trunk, verify that the problem happens.
<rogpeppe> natefinch: then md5sum the binary at both sides and see if it matches
<rogpeppe> natefinch: oh yes
<rogpeppe> natefinch: please bootstrap with --debug
<natefinch> yep
<rogpeppe> natefinch: because --upload-tools *might* print the path of the binary it's found, which might not be what you're expecting
<natefinch> rogpeppe:  it does.  it's getting the right one (it checks $HOME, looks like)
<natefinch> rogpeppe, mgz: and now it works fine.  Feh.  I swear I did the exact same thing last time... but evidently something was out of sync.
<mgz> natefinch: best guess is you maybe didn't run go install .... the upload-tools logic here is frustratingly tricky
<natefinch> mgz: shouldn't have been the problem, since I wasn't running upload-tools in my dev environment... anyway, it was obviously a problem of jujud being out of sync.
<natefinch> fwereade: FWIW in initial testing, maas-tags are working great.  I want to do some more testing around edge cases and stuff, but so far so good
<sinzui> mgz, ping
<natefinch> rogpeppe, mgz, jam:  what's the expected behavior if you specify constraints that can't be met?
<mgz> expected, or desired?
<mgz> you get no machine :)
<rogpeppe> you should see something in the status
<natefinch> mgz: what is juju right now coded to do?  Trying to make sure there's no bugs when you fail to match tags
<natefinch> rogpeppe: ok
<frankban> rogpeppe: any news on the error I reported?
<rogpeppe> frankban: sorry, i haven't had a look
<frankban> rogpeppe: np
<rogpeppe> frankban: please file a bug!
<rogpeppe> frankban: and i'll try to make time to have an investigate
<frankban> rogpeppe: ok, I'll try again, removing binaries, and then file a bug
<fwereade> natefinch, sorry, missed you
<rogpeppe> fwereade: did you see about frankban's problem above?
<frankban> rogpeppe: is " godeps -u dependencies.tsv" the right thing to do to set up the deps?
<fwereade> natefinch, jam, rogpeppe: btw you all seem to have positively reviewed branches not landed
<fwereade> rogpeppe, frankban, I did not, I shall scroll back
<rogpeppe> fwereade: i'm waiting for at least one branch to land
<fwereade> natefinch, that is awesome news
<fwereade> natefinch, thank you very much
<fwereade> rogpeppe, sweet
<fwereade> rogpeppe, frankban: I have no immediate input on the problem -- did someone repro, or not?
<rogpeppe> fwereade: i haven't tried yet
<frankban> fwereade: not yet, retrying and then filing a bug
<natefinch> this makes me sad: $ juju help constraints        ERROR unknown command or topic for constraints
<rogpeppe> natefinch: file a bug
<natefinch> rogpeppe: I will.
<fwereade> is anyone au fait with the latest state of upload-tools wrt simplestreams?
<rogpeppe> fwereade: i'm not, sorry
<frankban> fwereade, rogpeppe: filed bug 1229286
<_mup_> Bug #1229286: debug-log and boolean options are broken in trunk <juju-core:New> <https://launchpad.net/bugs/1229286>
<rogpeppe> frankban: is it just boolean options?
<frankban> rogpeppe: it seems so
<frankban> it seems they are always false
<rogpeppe> frankban: TheMue has been dealing with setting issues recently and might know more about this
<frankban> rogpeppe: I was trying to set a boolean option to true and it was not working and, since no output was printed in debug-log, I suspected the hook was not executed. But then I was able to successfully change a string option. As I wrote in the bug description, those are likely to be two different problems.
<frankban> or maybe a problem in my local devenv
<rogpeppe> frankban: i think it's probably all related to a single problem around boolean flags
<frankban> rogpeppe: cool
<rogpeppe> frankban: that logic has changed recently, i believe
<frankban> rogpeppe: as I mentioned in the bug revno 1750 works well
<rogpeppe> frankban: have you bisected between then and now?
<frankban> rogpeppe: no I haven't, I just took a revision near the latest stable release
<rogpeppe> frankban: ok, thanks
<mgz> sinzui: the 1.14.1 release announcement seems a little wonky
<sinzui> yeah 1.14.1 is replacing 1.14.0
<mgz> says it replaces 1.14.1, and the resolved issues list looks wrong
 * sinzui replies
<sinzui> mgz, I was told/interpreted the instructions for a replacement release to mean re-listing the issues
<mgz> fair enough, seems a little odd not to mention the openstack security group fix though
<utlemming> is there a way to bootstrap juju with out a tty?
<mgz> sure, write "juju bootstrap" in a text file, make it executable, then double click it in your shell
<mgz> or did you want a less silly answer? the juju-gui can do a lot of management, I'm not sure what their easy setup plans are.
<rogpeppe> natefinch, fwereade, jam, dimitern, mgz: some provider/dummy updates. https://codereview.appspot.com/13594044
<rogpeppe> and that's me for the day
<rogpeppe> g'night all
<natefinch> rogpeppe: g'night. I'll take a look
<fwereade> utlemming, the only other option is to write your own go code to invoke juju
<natefinch> smoser: any reason you can think of that I might not be able to ssh into the nodes in my maas environment after the host rebooted?  I double checked, and resolv.conf is still set up correctly.  ssh just hangs forever not being able to connect
<sinzui> natefinch, do you know what runs the juju-core test suite for merges? What machine?
<natefinch> sinzui: no idea, sorry
<sinzui> okay
<smoser> natefinch, hold on
<thumper> mramm: ping
<mramm> thumper: pong
<mramm> sorry, got caught in another meeting
<thumper> mramm: still want to chat
<mramm> sure, yea
<wallyworld_> sinzui: i just read the backscroll, the juju-core landing bot is 10.55.32.52
<wallyworld_> it's an instance on canonistack
#juju-dev 2013-09-24
<thumper> axw: hey
<thumper> axw: fwereade was considering staying up to chat to you but when he realized what time you started he decided not to :)
<axw> thumper: yo
<axw> hehe
<thumper> axw: probably about the precheck branch
<thumper> I've not read his response other than to realize that he made one
<axw> thumper: I think about the null provider actually
 * thumper is busy with kvm
<thumper> ah
<axw> but yes, precheck needs work too
<thumper> expect a ping from him when he wakes
<axw> yup, will do
<axw> thanks
<thumper> np
<thumper> axw: when a manually provided machine is added, the agents still run as root, right?
<axw> thumper: yes
<thumper> cool
<thumper> just checking
<fwereade> axw, heyhey
<axw> fwereade: heya
<fwereade> axw, so, sorry I got confused about the time -- I hope I didn't leave you hanging too badly
<axw> fwereade: nope, not at all. thumper let me know you realised later about the time :)
<axw> I would feel quite guilty if you got up in the middle of the night to talk to me ;)
<thumper> axw: you get over that :)
<axw> heh
<fwereade> axw, haha
<axw> I guess it's inevitable
<fwereade> axw, ok, just catching up on your responses
<fwereade> axw, I *think* that series is maybe still a useful thing to have
<fwereade> axw, because it's the environ that's more or less in charge of making tools, images, etc available to juju
<thumper> also... for future container checks
<thumper> we probably want the instance id
<thumper> so we can get the machine
<thumper> and see if it said kvm was ok
<thumper> fyi, on ec2, no kvm
<fwereade> axw, and at least a check for "do we have an image available" depends on series + type
<axw> thumper: ah yes, I forgot you mentioned kvm-ok
<axw> fwereade: fair enough
<thumper> I plan to have the machine agent run it on start up
<fwereade> thumper, AFAIK Prechecker will be somehow used inside state
<fwereade> thumper, so we'll already know the machine
<thumper> and if kvm-ok fails, then we should record in state that the host machine can't do kvm
<fwereade> thumper, definitely
<thumper> fwereade: but is it passed through?
<thumper> we probably should
<fwereade> thumper, not yet
<axw> thumper: passing through is in a followup
<fwereade> thumper, we need to figure out how to get this code where we need it without making baby jesus cry
<thumper> that's fine
<axw> hehe
<thumper> fwereade: I'm struggling getting the damn kvm code to actually run
<thumper> constantly fails for me
<fwereade> :(
<thumper> I'm writing the interfaces and mocks
<thumper> and broker
<thumper> and stuff
<thumper> hoping that we can plug working bits in later
<fwereade> thumper, cool, but if the stuff you're mocking seems flaky...
<fwereade> thumper, yeah
<thumper> it is what it is
<thumper> we can work with it for now
<fwereade> quite so
<thumper> we know basically what it will take
<thumper> so can mock out that
<thumper> it won't be perfect
<thumper> until the utility code actually works
<thumper> so I'm mocking out what I think we'll need
<thumper> as far as params go
<thumper> the function calls are known though
<thumper> start/stop/list
<thumper> and that's about it
<thumper> except for the initialize
<thumper> go get me an image type code
<fwereade> thumper, ok, great
<axw> fwereade, thumper: so I'll put instance.Id in since it'll be used for kvm-ok, and I'll put series in for checking image availability
<axw> (into PrecheckContainer params)
<fwereade> axw, ah, ok, clearly I'm missing something -- why's the instance id needed there?
<fwereade> axw, thumper: surely what will happen is the machine agent will run and check, and record in state whether kvm is a viable option
<fwereade> axw, eliminating the need to ask the env?
<fwereade> axw, or am I ignorant about something?
<axw> ah yes of course
<fwereade> (I am surely ignorant about many things ofc)
<axw> sorry, I'm still getting my head around what's known in state etc.
<axw> that makes sense
<fwereade> axw, no worries
<axw> nope, I'm ignorant :)
<axw> ok. just series then
<fwereade> axw, that's not known yet anyway
<fwereade> axw, cool, thanks
<axw> fwereade: what's not known?
<fwereade> axw, sorry -- the can-we-run-kvm code does not *yet* write anything into state to record the results
<axw> right
<fwereade> axw, but it should/will, so we're good with series + container kind
<axw> fwereade: thanks, I'll update it after proposing my httpstorage auth branch
<fwereade> axw, brilliant, thanks
<fwereade> axw, bah, it looks like I may be off shortly... the power company has scheduled maintenance, apparently
<fwereade> axw, so it I disappear that's where I'll be
<axw> fwereade: no worries, thanks for coming on early to chat.
<fwereade> axw, and thanks for addressing all those things, those CLs look solid, I'll give them a closer look in a bit
<axw> cool
 * fwereade is always a bit worried about searching for his power company
<axw> bad reviews? :)
 * fwereade fears the consequence of a fat-fingered search for, yes, "enemalta"
<axw> haha
<fwereade> it's particularly alarming seeing branded fuel tankers driving around the airport
<fwereade> aaanyway
<axw> speaking of alarming branding, I saw this in a parenting magazine yesterday: http://www.juju.com.au/
<fwereade> haha
<davecheney> oh dear
<rogpeppe> mornin' all
<rogpeppe> fwereade: hiya
<fwereade> rogpeppe, heyhey
<rogpeppe> fwereade: martin reviewed this, but suggested another look might be good, so i wondered if you could take a glance before i approve it. https://codereview.appspot.com/13778046/
<fwereade> rogpeppe, ah, sure
<fwereade> rogpeppe, btw I will disappear for an unknown period at some point today, power company doing some maintenance
<rogpeppe> fwereade: ok, thanks for the heads up
<TheMue> morning
<rogpeppe> TheMue: yo!
<TheMue> rogpeppe: btw, the chrome plugin you twittered is nice - but i dropped chrome
<rogpeppe> TheMue: what do you use now?
<TheMue> rogpeppe: too much memory and cpu consumption
<TheMue> rogpeppe: now back to the good old ff
<dimitern> TheMue, I have exactly the opposite experience with ff and chrome - the latter works faster and it's more lightweight on my machine :)
<TheMue> rogpeppe: imho chrome becomes more and more its own os, only missing an integrated go and dart ide ;)
<TheMue> rogpeppe: funny
<TheMue> rogpeppe: some of my friends switched too and feel better now
<TheMue> rogpeppe: may depend on the exact plugins one is using
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: TheMue | Bugs: 12 Critical, 135 High - https://bugs.launchpad.net/juju-core/
<TheMue> hmm, the number of bugs is only increasing. we need a feature freeze to simply handle those bugs
<jam> wallyworld_: I'm looking at the fetch code right now, and I see it failing to find sjson in 3 places before it even tries to look for .json: http://paste.ubuntu.com/6149206/
<jam> that looks like private-bucket, then ? then tools-url
<jam> wallyworld_: I thought the goal was to search for sjson then json in private, then fallback to site-tools, then user config?
<jam> wallyworld_: the issue is that the signed/unsigned flag is being done at a higher level
<jam> with "requireSigned"
<jam> environs/tools/simplestreams.go always calls "GetMaybeSignedMetadata(...,requireSigned=true)" before calling it with something else if it hasn't found anything
<jam> doesn't that mean when we have juju.canonical.com or even just signed mirror data, it *won't* ever use user-specified unsigned metadata ?
<raywang> jamespage, ping
<jam> wallyworld_: I'm also surprised you decided to take our "implicit" URLs before user-explicit URLs (you check the custom urls before "tools-url")
<jam> how would a user override an explicit setting in the env? Only by using their private bucket?
<wallyworld_> jam: i had no choice for the 1.15 release, because in HP Cloud (for example) the tools url was set automatically and this hid the uploaded tools when using a dev release
<jam> wallyworld_: how is tools-url set automatically ?
<wallyworld_> jam: see certifiedclouds.go
<wallyworld_> it sets up the tools url in config validation
<wallyworld_> we won't need this once the repository comes online
<jam> wallyworld_: so... that indicates to me that our layering is probably wrong. Can't we put it into the CustomSources instead of overriding user-config ?
<jam> wallyworld_: we could call GetCertified... in openstack/provider.go GetToolsSources, couldn't we ?
<wallyworld_> maybe. it was done late friday night because i was smoke testing for the release and found the issue. but we didn't release 1.15 so it was for nought
<wallyworld_> i was just trying to get something done so the release would be ok
<wallyworld_> it's on my list to revisit
<jam> sure, did you see my point about always searching for DefaultBaseURL+ .sjson *before* searching the private-bucket for .json ?
<jam> that will also cause problems when we have "official" locations online
<wallyworld_> i'm reading it now
<wallyworld_> the thinking way back when was that signed metadata should take precedence but this could cause issues as you say
<wallyworld_> jam: the hp cloud url was done in config because i needed to convert the deprecated public-bucket-url into a tools url and that deprecation code lives in config, but i think it can be split out into custom sources
<wallyworld_> i was initially looking to keep the code dealing with deprecated config values together
<jam> wallyworld_: sure. It just feels like it should be "user-config" then "private" then "default tools-for-cloud" then "juju.canonical.com"
<jam> *maybe* private then user-config
<jam> but right now we have "default-tools-for-cloud" before user-config
<wallyworld_> sure. but a few short hours before when i thought the release was being done i needed a quick fix
<jam> np
<wallyworld_> tomorrow i should get to do mirrors and hence was making tomorrow my "deal with simplestreams" stuff day
<wallyworld_> stuff = where to find things
<jam> np
<jam> wallyworld_: I only noticed because I had to touch those bits, and it caused conflicts when merging trunk
<wallyworld_> np. i hadn't appreciated/properly thought about the sjson vs json thing so i'm glad you asked
<rogpeppe> jam, mgz, dimitern, TheMue, axw, natefinch: a small cleanup to the environs.Environ interface: https://codereview.appspot.com/13587046
<TheMue> rogpeppe: looking
<rogpeppe> TheMue: thanks
<TheMue> rogpeppe: reviewed
<rogpeppe> TheMue: ta!
<dimitern> rogpeppe, hey, the provisioner is ready for review :) https://codereview.appspot.com/13720051/
<rogpeppe> dimitern: cool. i'll take a look soon.
<jamespage> raywang, hello
<raywang> Hi jamespage, just want to check with you that after deploy ceph-radosgw, anything extra works need to be done for testing s3 or swift compatibility?
<jam> standup time
<jam> mgz: dimitern https://plus.google.com/hangouts/_/239d0ac12a07a73dd5a83cc2b9d8bb4047ce20b4
<jamespage> raywang, #juju
<raywang> jamespage, ok
<TheMue> here's my CL for review: https://codereview.appspot.com/13430044/
 * TheMue => lunch
<natefinch> jam - one thing I forgot to mention is that once the maas tags bugs are fixed, I'll need another project to work on, if you have ideas.
<jam> natefinch: I would look closely at supporting the work from the weekly agenda https://docs.google.com/a/canonical.com/document/d/1eeHzbtyt_4dlKQMof-vRfplMWMrClBx32k6BFI-77MI/edit#
<jam> whether that is doing code reviews
<jam> or working with Martin on Amazon + VPC stuff
<jam> natefinch: also, if you have MaaS up and running
<jam> there is something I'm concerned about with our container strategy
<jam> in that we fire up a container and it grabs an address from the DHCP server.
<jam> I'm concerned that the MaaS infrastructure itself thinks that these are new machines that it is going to want to PXE boot.
<natefinch> ahh yeah, I remember you mentioned that
<natefinch> jam I can do some testing on that
<jam> natefinch: essentially I think "juju deploy --constraints container=lxc" and then see what changes on the MaaS master node
<natefinch> jam: I can do that. So that'll deploy a service to a container on one of the nodes, and we just want to make sure maas doesn't think you've added a new node for it to manage, right?
<jam> natefinch: right.
<jam> or at least document what happens so we can open bugs about how we want to fix it
<natefinch> jam: cool. I'll test that out once I get my maas in working order again.
<natefinch> smoser: you around?
<rogpeppe> wallyworld_: i've got a few remarks on https://codereview.appspot.com/13842044 if you can hold off approval for a few minutes
<wallyworld_> ok
<rogpeppe> wallyworld_: you have a review
<wallyworld_> thanks, looking
<mgz> sorry for missing standup, had to step out
<wallyworld_> rogpeppe: thanks for the Go tips, will implement those tomorrow before landing
<dimitern> fwereade, hey
<fwereade> dimitern, hey dude
<dimitern> fwereade, provisioner review? https://codereview.appspot.com/13720051/
<fwereade> dimitern, that took longer than expected
<fwereade> dimitern, twould be a pleasure
<dimitern> fwereade, I was wondering where are you
<fwereade> dimitern, power cut
<fwereade> dimitern, was apparently scheduled, I found out just today, wasn't expecting it to be like 4 hours though
<dimitern> fwereade, aw.. well at least it's just 4 hours :)
<jam> fwereade: welcome back
<rogpeppe> dimitern: quick question: why do non-global provisioners need the environ config at all?
<dimitern> rogpeppe, afaik they need to know the environ config is saved to state already
<rogpeppe> dimitern: why so? just asking, because in future we won't want them to be able to get access to the env config, so it'll save us work if we can avoid them accessing it now.
<jam> rogpeppe: so today the code is structured just that we wait for environ config changes and instantiate an "Environ" object
<jam> which needs an EnvironConfig
<jam> they then later on do stuff with that Environ
<jam> but the two parts of the code are pretty far apart
<rogpeppe> jam: what does the local provisioner do with an Environ?
<jam> rogpeppe: I'm not 100% sure how it all hangs together. From what I can tell it uses "getBroker" which returns either the Environ itself, or an LXCBroker
<jam> rogpeppe: and the Config stuff ends up getting handed off to a lot of places
<dimitern> rogpeppe, so in short, it seems the lxc one does not need it at all, but that's how the code is written
<rogpeppe> dimitern: I'd much prefer it if the local provisioner never called WatchEnvironConfig
<jam> so what comes to mind is having an API that returns a minimialistic EnvironConfig, which we later ask another API for the credentials needed for the environment broker
<rogpeppe> dimitern: (i mean WatchForEnvironConfigChanges of course)
<rogpeppe> dimitern: couldn't you only set p.environ if p.pt == ENVIRON ?
<jam> looking at the code, it does feel like we have a bunch of switch statements that might look better overall if we just had 2 types
<jam> an LXCProvisioner
<jam> vs an EnvironProvisioner
<fwereade> dimitern, ha, I was just coming to start the above conversation
<fwereade> dimitern, rogpeppe: so, yeah, the environ stuff is tied in too tightly there
<rogpeppe> ah, i see why
<jam> fwereade: so I think landing what we have to guarantee API-based provisioner in 13.10 and *then* trying to split things up would be our best way forward
<rogpeppe> i don't think the change would be difficult at all, at first glance anyway
<fwereade> dimitern, rogpeppe: in particular, it looks like an lxc provisioner is waiting for a valid environ config
<fwereade> dimitern, rogpeppe: else how can it set p.environ
<fwereade> dimitern, rogpeppe: and that means we're pooing secrets onto every machine again
<fwereade> dimitern, rogpeppe, jam: in the name of expediency I could put up with a hack-for-landing that had a p.environ field that was always nil for non-environ provisioners, for example
<fwereade> dimitern, rogpeppe, jam: but it'd really have to be addressed very soon afterwards
<rogpeppe> the problem is that ContainerManager.StartContainer wants a config.Config
<rogpeppe> ahhh
<rogpeppe> so we actually do want a significant portion of the environment config there
<rogpeppe> just not the secret bits
<fwereade> rogpeppe, ha, yes, another slice through environ config with its own special idiosyncrasies
<rogpeppe> because we need to call FinishMachineConfig, which needs things like authorized keys,
<fwereade> rogpeppe, what does FinishMachineConfig use for non-managers aside from the auth-keys?
<fwereade> rogpeppe, AIUI it is in fact *only* the authkeys we need
<rogpeppe> fwereade: provider type
<fwereade> rogpeppe, aw, ffs
<rogpeppe> fwereade: and it uses cfg.AdminSecret, which seems wrong
<rogpeppe> fwereade: and StatePort and APIPort
<fwereade> rogpeppe, nah, that's all after "if !mcfg.StateServer{ return nil}"
<rogpeppe> fwereade: ah! missed that, thanks.
<fwereade> rogpeppe, I guess provider type is not so bad
<jam> fwereade: we *are* putting environ creds onto every machine, but we know that, and while it lets you start and stop instances, it isn't the same as giving you root on all the machines.
<rogpeppe> fwereade: for lxc, i presume it's always lxc
<jam> For example, you can't actually add ssh keys to machines just with provider creds (only startup ssh keys, etc)
<fwereade> rogpeppe, but I think there's a pretty good case for a narrower FinishBasicConfig which is called by FinishBootstrapConfig
<rogpeppe> yeah, after reflection, i think i'm with jam that we should land this now and work towards eliminating environ config later
<fwereade> jam, it lets an attacker spend a user's money
<fwereade> jam, keeping those particular credentials secret is one of the major points of this
<jam> fwereade: I agree 100% that we want to get away from this. But I think getting to the point where "root on machine-a doesn't give root on all other machines" is a very useful step forward.
<rogpeppe> jam: +1
<rogpeppe> fwereade: i think that this CL is progress
<rogpeppe> fwereade: and it's nice to see it when not too much logic has changed
<jam> fwereade: the main problem is that we have a "config.Config" and we honestly don't know without a lot of deep inspection what we need out of that.
<fwereade> rogpeppe, jam, dimitern: I don't care if the provisioner API does something ludicrously dirty like look up SecretAttrs and crap in NO-NOT-YOURS values over what they return
<jam> so I think incremental steps. But having an EnvironProvider and an LXCProvider will actually clean up the code a bit as well as make it clearer what bits we actually need.
<fwereade> rogpeppe, jam, dimitern: but we really must not put provider cred on every machine
<rogpeppe> fwereade: agreed
<rogpeppe> fwereade: but that doesn't have to happen in this CL
<dimitern> exactly
<rogpeppe> fwereade: we don't have to do it all at once
<rogpeppe> fwereade: "progress not perfection"
<fwereade> rogpeppe: sorry, but this is one of the main points of this work
<rogpeppe> fwereade: sure
<rogpeppe> fwereade: but it doesn't have to happen in *this* CL
<rogpeppe> fwereade: which is already big enough
<rogpeppe> fwereade: and gives us behaviour that's better than we had before
<jam> fwereade, dimitern: so something like line 158 of state/apiserver/provisioner/provisioner.go after we grab Config.AllAttrs() we could do a "JobHostsStateServer" check and if that isn't part of the unit making the request, we nuke whatever secrets we can ?
<jam> fwereade: aren't those specific to the provider itself (openstack has different secrets than ec2 than azure)
<jam> or can we just nuke all secret attrs ?
<fwereade> jam, yes they are, but we don't need to construct an actual environ to get secrets, that's on EnvironProvider
<rogpeppe> jam: yeah, we can do that (only if the provisioner isn't the global one though, of course)
<fwereade> jam, but we will probably need to dick around and figure out the types of the secret attrs so we can overwrite them usefully, and in the general case overwriting them usefully may not even be possible
<dimitern> so do we need an Environ (and its config) in a lxc provisioner at all?
<fwereade> dimitern, we *need* two values out of the config
<dimitern> fwereade, which ones?
<fwereade> dimitern, provider type and authorized keys
<dimitern> fwereade, why?
<fwereade> dimitern, they're the ones actually used by FinishMachineConfig for a non-state-server
<jam> fwereade: so the Environ.SecretAttrs seems to be the only place that knows what the private attributes are
<fwereade> dimitern, but IIRC we agreed that authorized-keys should come from the machine anyway?
<jam> state.EnvironConfig doesn't combine sections or anything
<fwereade> jam, yeah
<fwereade> jam, afraid not
<dimitern> fwereade, how about provider type?
<jam> fwereade: so I would be fine with a quick hack, but I think if we can have LXCProvisioner then it doesn't need anything like EnvironConfig, it can just do a "GetMyProvisionerConfig" sort of call.
<jam> dimitern: I would guess that we don't actually need environ for LXC provisioner
<jam> dimitern: but the current code layering
<dimitern> jam, yeah
<jam> means a lot of line-by-line auditing of "can I get here in an LXC provisioner"
<rogpeppe> personally I think we should move towards having known attributes in the state in a well known form separate from environ config.
<fwereade> rogpeppe, yeah, I'm definitely keen on migrating stuff into sensible buckets
<fwereade> dimitern, I'm trying to figure out why we need provider type
<dimitern> fwereade, we can do ProviderType() (string, error) api call, like we did with the uniter
<rogpeppe> fwereade: we'd have to be careful around juju set-environment
<fwereade> dimitern, yeah, indeed
<jam> dimitern: moving towards that also gets us away from "NewSimpleAuthenticator" running on the LXCProvisioner, which makes it easier to say "this does not actually directly touch Mongo"
<fwereade> rogpeppe, yeah, definitely
<dimitern> jam, we don't have to change that - we can change the api not to set mongo passwords
<fwereade> rogpeppe, but, eh, that stuff's all fatally broken anyway, it only works by luck
<jam> dimitern: this is more about auditability
<jam> by sharing the code
<jam> you have to look at switches, etc
<jam> oh, this isn't enabled in this configuration
<jam> if we split them apart
<jam> then there just isn't a state connection object anywhere in the LXCProvisioner
<dimitern> fwereade, so how about a follow-up that introduces ProviderType() and apiprovisioner.Machine.GetAuthorizedKeys() API calls, and uses that in the lxc provisioner?
<jam> dimitern: I don't think we need the api to get the ProviderType, as cmd/jujud/machine.go is the only place where we ever call NewProvisioner
<jam> and if we *want*, we can make Provisioner just an interface{}
<jam> and then implement 2 of them
<dimitern> fwereade, GetAuthorizedKeys() needs to read the env config internally and return a slice of what?
<jam> and NewProvisioner returns one or the other based on the flag you passed in.
<dimitern> jam, it currently does that
<fwereade> dimitern, GetAuthorizedKeys I think takes an []tag, and returns an []string
<dimitern> jam, but not as an interface
<dimitern> fwereade, machine tag ok, and which env config keys should it return?
<jam> dimitern: no, it currently returns a Provisioner that has an internal flag set. vs a separate type/struct/whatever
<jam> dimitern: I guess my point is, we shared a bunch of the code because they look "similar" but honestly there are a *lot* of different bits, and I'd like to split them further apart.
<fwereade> dimitern, it just returns the value of authorized-keys in config
<fwereade> jam, yeah, they should unquestionably be split
<dimitern> fwereade, ok, one per machine
<fwereade> jam, that was always the original plan
<jam> and have 2 implementations that share an interface, rather than 1 concrete type that has a bunch of switch statements.
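The split jam describes can be sketched as a small interface with two concrete types, chosen by the constructor instead of an internal flag. The names below (Provisioner, environProvisioner, lxcProvisioner) are illustrative, not juju-core's actual types.

```go
package main

import "fmt"

// Provisioner is the shared interface; real worker methods
// (starting/stopping machines, etc.) would live here.
type Provisioner interface {
	Kind() string
}

type environProvisioner struct{} // provisions cloud instances
type lxcProvisioner struct{}     // provisions containers on one host

func (environProvisioner) Kind() string { return "environ" }
func (lxcProvisioner) Kind() string     { return "lxc" }

// NewProvisioner returns one implementation or the other based on the
// flag, so the LXC path simply cannot reach environ-only code, and
// auditing "can I get here in an LXC provisioner" becomes trivial.
func NewProvisioner(lxc bool) Provisioner {
	if lxc {
		return lxcProvisioner{}
	}
	return environProvisioner{}
}

func main() {
	fmt.Println(NewProvisioner(true).Kind())
	fmt.Println(NewProvisioner(false).Kind())
}
```

The point of the design is auditability: with two types there are no switch statements to trace, and the LXC type need never hold a state connection at all.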
<fwereade> jam, but the authkeys-in-config scotched it
<dimitern> fwereade, and then a follow-up that splits the 2 provisioners into separate types implementing an interface
<fwereade> dimitern, do you know offhand why there's an environ on p?
<jam> fwereade: GetAuthorizedKeys returns a []{tag: Name, keys []string} doesn't it?
<dimitern> fwereade, and a third CL, which binds the previous 2 together
<jam> well, []{tag: Name, keys []string, Error}
<dimitern> fwereade, it's an instance broker
<fwereade> jam, authkey is just a string in the format one would expect in .ssh
<rogpeppe> jam: we currently represent a set of authorized keys as a single string
<fwereade> dimitern, wtf, why's it called environ then?
<jam> fwereade: []{tag: Name, keys string, Error} ?
<dimitern> jam, not really - we can go with just StringsResults - no name needed
<fwereade> dimitern, +1
<fwereade> dimitern, ok, I need to think and read more code a little
<dimitern> fwereade, because it implements what InstanceBroker's interface was extracted from
<jam> dimitern: I personally *really* like APIs that return "result + context you requested it"
<rogpeppe> jam: too late :-)
<dimitern> jam, we'd need to unnecessarily change all calls
<dimitern> jam, and it's working well already
<rogpeppe> jam: also, it's really not necessary - adds more network traffic and more error conditions to check
<jam> rogpeppe: slightly more traffic vs being able to actually look at a response and instantly understand it....
<jam> xml/json are reasonably self documenting over ASN.1 for a reason
<dimitern> jam, you can look at the request and understand the response
<jam> dimitern: still takes 2 bits of information, when it could be all together
<rogpeppe> dimitern: +1. without the request, you're pretty much out of luck anyway
<dimitern> jam, i really don't see this as a drawback
<jam> dimitern: so if you take this array of things, and sort of squint at it enough, you can line it up with that array of things.
<rogpeppe> jam: i could potentially change the rpc package so it always included all the request data with the response
<jam> *or* you could write it down as an array of tuples
<jam> and immediately see the pairing
<rogpeppe> jam: we never pass more than one thing anyway - the whole point is moot
<dimitern> jam, and send twice as much over the wire
<rogpeppe> jam: we might possibly one day have two uses for bulk calls
<jam> dimitern: one is a tag that is 5-10 bytes, one is an authorized-keys string that is 1000 bytes, that is not 2x over the wire
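The two result shapes being argued over can be sketched for a hypothetical bulk GetAuthorizedKeys call: a parallel-array style where results line up with the request by index (what dimitern means by StringsResults), versus the self-describing tuple style jam prefers. Neither struct here is the real juju-core params type.

```go
package main

import "fmt"

// StringsResult holds one answer; in the parallel-array style,
// Results[i] answers tags[i] from the request.
type StringsResult struct {
	Keys []string
	Err  error
}

type StringsResults struct {
	Results []StringsResult
}

// KeyedResult carries the tag it answers, so a response can be
// understood without the request in hand.
type KeyedResult struct {
	Tag  string
	Keys []string
	Err  error
}

// authorizedKeys stands in for reading authorized-keys from environ
// config; the key text is a placeholder.
func authorizedKeys(tag string) []string {
	return []string{"ssh-rsa AAAA... user@host"}
}

func bulkByIndex(tags []string) StringsResults {
	out := StringsResults{Results: make([]StringsResult, len(tags))}
	for i, tag := range tags {
		out.Results[i] = StringsResult{Keys: authorizedKeys(tag)}
	}
	return out
}

func bulkKeyed(tags []string) []KeyedResult {
	out := make([]KeyedResult, len(tags))
	for i, tag := range tags {
		out[i] = KeyedResult{Tag: tag, Keys: authorizedKeys(tag)}
	}
	return out
}

func main() {
	tags := []string{"machine-0", "machine-1"}
	fmt.Println(len(bulkByIndex(tags).Results), bulkKeyed(tags)[0].Tag)
}
```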
<fwereade> dimitern, ok, so, that environConfig goes on a loong adventure between the getBroker call and the final FinishMachineConfig, but that's the only place it happens
<dimitern> jam, rogpeppe: if anything, logging api requests and responses can be improved - like print the request data in the response log
<fwereade> dimitern, so resolving that in this CL is indeed definitely not an option
<rogpeppe> fwereade: +1
<fwereade> dimitern, however I remain unrepentantly opposed to putting secrets, via the API, anywhere that is not known to be authorized to read them
<rogpeppe> fwereade: me too
<dimitern> rogpeppe, and we need to get rid of these discard * call signature debug logs from the rpc
<rogpeppe> dimitern: they're gone
<dimitern> rogpeppe, sweet!
<dimitern> fwereade, ok, it's just a temporary state
<rogpeppe> dimitern: https://codereview.appspot.com/13249054/
<dimitern> fwereade, it'll be fixed in the next 3 cls
<rogpeppe> dimitern, fwereade: i'd still like a second review of this (martin's request) BTW: https://codereview.appspot.com/13778046/
<fwereade> rogpeppe, bugger, I thought I texted you -- I'd just finished reading it when the power went out
<fwereade> rogpeppe, I said "LGTM" :)
<rogpeppe> fwereade: ah, you did! my phone was muted
<rogpeppe> fwereade: i sent you an email, expecting your "from-the-future toy" response...
<rogpeppe> fwereade: thanks
<fwereade> dimitern, can we have a quick g+ in 5 please?
<rogpeppe> fwereade: ha, looks like i merged it anyway!
<dimitern> fwereade, sure, let me start one
<dimitern> fwereade, https://plus.google.com/hangouts/_/96f599fd9a86362a79be136169847367eaaf5539
<jam> dimitern: http://paste.ubuntu.com/6150016
<jam> fwereade: ^^
<jam> that is in EnvironConfig
<jam> we already have the AuthEnvironManager
<jam> to handle stuff like WatchEnvironMachines
<jam> so we can just use it here to hide all secrets if we aren't an environ manager
<dimitern> jam, and we need to restrict WatchForEnvironConfigChanges to the environ manager as well as EnvironConfig()
<dimitern> jam, then we won't need to delete any attrs
<dimitern> fwereade, i'm waiting
<jam> dimitern: so I think we need an entirely new structure for non environ provisioners, but if we just want "don't leak secrets to those machines" that should be enough, I think.
<jam> as in, we can do a bit of testing and land that in 2 hours
<jam> well hours not days
<fwereade> jam, ok, dimitern will be closing the API hole such that the current structure will work without secrets, and in the meantime I'll be finishing the review and trying to come up with a way forward
<fwereade> jam, well, without *real* secrets anyway
<TheMue> fwereade: I proposed the latest CL after the review changes again. would you mind taking a look there too?
<fwereade> TheMue, will do
<TheMue> fwereade: great, thx
<rogpeppe> fwereade, mgz, jam, TheMue, dimitern: small review if anyone wants to look: https://codereview.appspot.com/13839046/
<TheMue> rogpeppe: already reviewing it
<rogpeppe> TheMue: ta!
<TheMue> rogpeppe: done
<rogpeppe> TheMue: thanks
<fwereade> dimitern, wait, shit -- that SimpleAuthenticator thing needs a valid environ
<dimitern> fwereade, to call StateInfo() on it, yeah
<fwereade> dimitern, so we can't plug the hole until we've handled that
<fwereade> dimitern, it's easy, to be fair, because I'm pretty sure the provisioner is started with an agent config anyway, so it can clone its own
<dimitern> fwereade, so should I stop doing the masking of the secret attrs then?
<fwereade> dimitern, that's still necessary, I'm just trying to figure out the ordering
<dimitern> fwereade, ah, ok
<fwereade> dimitern, we can land secret masking right now with no impact
<fwereade> dimitern, but with secret masking in place, we can't land the provisioner until the StateInfo requirement is fixed
<fwereade> dimitern, but similarly, we can land provisioner with no impact, but then can't land secret masking until StateInfo is dropped
<dimitern> fwereade, hmm
<dimitern> fwereade, but it seems only one secret attr gets masked - "secret" - and it's needed by stateInfo only, we don't need to use it, right?
<fwereade> dimitern, in practice the ec2 secret-key will be masked
<fwereade> dimitern, preventing us, I think, from reading the instance from storage and figuring out its address
<fwereade> dimitern, hey
<fwereade> dimitern, don't we already have API code for saying what the API addresses are?
<fwereade> dimitern, I guess maybe we don't
<dimitern> fwereade, we have one for the deployer
<fwereade> dimitern, yay! then we should surely reuse that
<dimitern> fwereade, so we get rid of environ.StateInfo() and call the APIAddresses() instead
<fwereade> dimitern, yeah, I think so, and if it turns out that some annoying client depends on state.Info being meaningful we can just fill it with nonsense in order to move forward
<fwereade> dimitern, it's nice that we have that auth interface for easy swapping out, innit ;)
<dimitern> fwereade, the manual provisioner uses the SimpleAuthenticator as well
<fwereade> dimitern, that's fine, it has legitimate access to the environ
<fwereade> dimitern has a power cut too
<dimitern> hmm it seems it was only my e-meter that tripped off
<dimitern> fwereade, back
<fwereade> dimitern, ah cool
<fwereade> dimitern, https://codereview.appspot.com/13720051/ just reviewed
<dimitern> fwereade, cheers
<rogpeppe> dimitern: a few comments from me too that i forgot to publish earlier
<dimitern> fwereade, you mean all api watchers should return tags?
<dimitern> fwereade, none of them do
<dimitern> rogpeppe, thanks
<rogpeppe> dimitern: i hadn't noticed that, but for consistency all the watchers should really return tags not ids, i guess
<dimitern> rogpeppe, you'd have surely noticed if you did a state-to-api migration of existing code :)
<rogpeppe> dimitern: surely i would :)
<fwereade> dimitern, part of the point of the relation tag changes was so that the watcher could convert them to tags, iirc, did we forget to do that bit?
<fwereade> dimitern, when we noticed we knew we were stuck with deployer's watchers but I think the others were fine
<fwereade> dimitern, ie not yet implemented
<fwereade> dimitern, eh, we both dropped the ball there I guess
<fwereade> dimitern, what else have we released with non-tag-based watchers
<fwereade> ?
<dimitern> fwereade, all of them
<dimitern> fwereade, but I need to check which ones exactly
<fwereade> dimitern, ok, let's not sweat it now
<rogpeppe> all tests pass
<dimitern> fwereade, so machiner, deployer, provisioner and uniter all have stringswatchers that need fixing
<rogpeppe> yay!
 * rogpeppe says, a little wearily
<fwereade> dimitern, tech-debt bug pointing out that all the StringsWatchers use a silly format but it's two-way transformable without context so nbd really
<fwereade> dimitern, we'll fix it in our Copious Free Time, but at this stage we may as well be consistent and fix them all together
<dimitern> fwereade, added bug 1229755
<_mup_> Bug #1229755: Watchers returned from the API should report tags, not ids as changes <tech-debt> <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1229755>
<fwereade> dimitern, cheers
<rogpeppe> fwereade: perhaps we could change the client code now to accept ids or tags, and transform them to tags if it gets ids. that way it's backward compatible when we move to returning tags from the API
<dimitern> rogpeppe, this is how all the client (well agent really) code works
<fwereade> rogpeppe, dimitern, hmm: maybe we could do that in the api stringsWatcher anyway... we can tell what kind a name/id has by inspection, can't we?
<fwereade> except wait, better yet
<dimitern> fwereade, except for the relation units watcher
<rogpeppe> fwereade: i'm not keen on doing that if it's still called "stringsWatcher". "tagsWatcher" would be a better name in that case.
<fwereade> rogpeppe, actually I had it the wrong way round anyway
<dimitern> rogpeppe, tags are strings
<fwereade> rogpeppe, stringsWatcher -- *if* it is indeed actually a names/ids/tags/whatever watcher will still want to produce non-tag things
<rogpeppe> dimitern: yes, but strings aren't tags
<dimitern> rogpeppe, i'm -1 on tagsWatcher
<dimitern> rogpeppe, perhaps entityTagsWatcher
<rogpeppe> dimitern: sure
<rogpeppe> dimitern: just i'm -1 on anything tag-related inside something that calls itself "stringsWatcher", implying that it's just a bunch of strings with no other attached semantics
<fwereade> rogpeppe, dimitern: ok, so, we can in that case wrap the existing stringswatchers with wrappers that untag the strings as they land, if they are in fact tags
<dimitern> fwereade, why so complicated?
<dimitern> fwereade, we just assume we always get tags
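The client-side wrapper fwereade and dimitern are circling around could look like this: the api watcher accepts either bare ids ("0") or tags ("machine-0") and always delivers ids, giving forward compatibility when the server starts sending tags. This untag helper is a sketch for machine tags only; real parsing of all entity kinds (units, services, etc.) would go through juju-core's names handling, not string trimming.

```go
package main

import (
	"fmt"
	"strings"
)

// untag converts a machine tag to a bare id; anything without the
// known prefix is assumed to already be an id and passed through.
func untag(s string) string {
	if strings.HasPrefix(s, "machine-") {
		return strings.TrimPrefix(s, "machine-")
	}
	return s
}

// untagAll wraps one watcher event (a batch of changes), so existing
// callers that expect ids keep working whichever form the API sends.
func untagAll(changes []string) []string {
	out := make([]string, len(changes))
	for i, c := range changes {
		out[i] = untag(c)
	}
	return out
}

func main() {
	fmt.Println(untagAll([]string{"machine-0", "1", "machine-42"}))
}
```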
<jamespage> fwereade, who is looking at juju memory consumption on service units aka bootstrap node won't run on a tiny
<fwereade> dimitern, because that's the low-impact change we can make today that gives us forward compatibility with an api that returns tags from watchers, I think?
<dimitern> fwereade, changing the stringsWatcher to return tags at client-side api seems low-impact to me too
<fwereade> jamespage, I think I will have to force some space to look at that myself, I'm afraid :(
<fwereade> dimitern, except that all the existing code is expecting non-tags
<rogpeppe> jamespage: various of us have been tangentially involved
<dimitern> fwereade, the changes needed to start expecting tags from these watchers are minimal
<fwereade> jamespage, I know jam has been looking at it to some degree too
<mgz> jamespage: what we're really missing is some expertise from the other side
<fwereade> dimitern, but there is code out there in the wild that we have released using ids, and changing to tags will mess with them
<jamespage> mgz: from the other side?
<mgz> it seems pretty clear something changed on canonistack, but no idea what
<jamespage> mgz, hmm - well I see exactly the same thing on serverstack
<dimitern> fwereade, I think we messed up pretty much by being lenient about version upgrades
<fwereade> dimitern, by implementing clients that can handle tags when they land in the future we can avoid changing the API today
<dimitern> fwereade, and now we lock ourselves in a vicious circle every time we need to change the api in any way
<jamespage> mgz, and I think I'm seeing a related issue with the mysql charm
<mgz> well, or possibly in cloudinit or the images
<mgz> we went from being able to run on tiny, to not, one afternoon without the juju code being changed (old versions now also exhibit the problem)
<mgz> so, there's stuff juju can do to make this work, some of which is documented in the bug, but would be good to understand what exactly broke
<fwereade> dimitern, yeah, I should have raged and screamed at the first version of the API and insisted on explicit version handling from the beginning
<fwereade> dimitern, but the questions of compatibility just shift there really
<dimitern> fwereade, but fair enough, I see your point with decorating the stringsWatchers at client-side to untag stuff it gets
<fwereade> dimitern, and honestly, I don't think even that is really worth doing at this point
<dimitern> fwereade, then we need to change a bunch of watchers in state to return tags directly
<fwereade> dimitern, hmm, not so sure, I was thinking that tags were really an API language
<fwereade> dimitern, the first cut at them did I think leak into state, which is a shame
<fwereade> dimitern, but the transform necessary from id to tag is not hard to do at Next() time
<rogpeppe> dimitern: it's easy enough to change the API in a backwardly compatible way if we want - just return both tags *and* ids for the time being, then deprecate the ids field later.
<fwereade> rogpeppe, indeed
<dimitern> fwereade, so we don't need to do anything for bug 122975 for now
<_mup_> Bug #122975: thunderbird-bin crashed with SIGSEGV in raise() after click in email to show it in the preview pane <mt-needtestcase> <mt-waitdup> <thunderbird (Ubuntu):Invalid by mozilla-bugs> <https://launchpad.net/bugs/122975>
<fwereade> dimitern, that's a relief
<fwereade> ;p
<dimitern> oops bug 1229755
<_mup_> Bug #1229755: Watchers returned from the API should report tags, not ids as changes <tech-debt> <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1229755>
<fwereade> dimitern, but yeah we don't need to do anything for bug 1229755 either
<_mup_> Bug #1229755: Watchers returned from the API should report tags, not ids as changes <tech-debt> <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1229755>
<dimitern> :)
<dimitern> fwereade, I didn't quite get this https://codereview.appspot.com/13720051/diff/1/cmd/jujud/machine.go#newcode201
<fwereade> dimitern, jujud shouldn't know about differences between providers
<dimitern> fwereade, you're asking to add a tech-debt bug, which says something like "We need a MachineJob called JobHostLXCContainers" ?
<fwereade> dimitern, yeah, we're running LXC based on kooky inferences
<fwereade> dimitern, we should be getting what to try to run from state, I think
<dimitern> fwereade, and who should decide whether the machine has that job or not?
<fwereade> dimitern, the provider
<dimitern> fwereade, and cmd/juju stuff that adds a machine?
<fwereade> dimitern, don't think so, no
<fwereade> hey wait sorry wrong job
<fwereade> dimitern, ok there is more context mixed in the linked CL
<fwereade> dimitern, I *think* that we figure out whether we can run containers by asking the environment
<fwereade> dimitern, but we should talk about this with axw tomorrow morning, I think
<dimitern> fwereade, ok
<dimitern> fwereade, morning being? :)
<fwereade> dimitern, the important thing is that you don't need to take action now but you do need to assign corrective action to yourself and we'll figure out exactly what it should be tomorrow
<dimitern> fwereade, 10pm our time?
<fwereade> dimitern, nah, morning our time, I think he's still around then usually
<dimitern> fwereade, ok, so no bug to file until then
<fwereade> dimitern, I was going to wait up last night but I don't think he actually got in until 3am or so
<fwereade> dimitern, I guess the bug is something like "MachineAgent.APIWorker uses witchcraft to start lxc provisioner"
<axw> fwereade, dimitern: I'm awake for the time being. What's up?
<fwereade> dimitern, but yeah, we can characterise it in context with axw tomorrow
<fwereade> axw, heyhey!
<axw> :)
<fwereade> axw, more layering violations of the kind I was bitching about in the null provider review
<axw> fwereade: is this the LXC provisioner worker thing in the same area?
<fwereade> axw, machine agent deciding whether to do things based on provider/container type
<fwereade> axw, close but not directly touching
<fwereade> axw, the common thread is that machine agents are depending on environment info that's at the wrong level of abstraction
<fwereade> axw, and that we have a perfectly good mechanism for telling agent what to do already, which is jobs
<axw> right, as in they check the provider name, rather than checking what capabilities they have
<fwereade> axw, exactly
<fwereade> axw, and I thought that since I am telling both of you that things must be fixed, and could perhaps be fixed by defining new jobs, it would be good for us all to be in sync
<rogpeppe> fwereade, mgz, TheMue, dimitern, axw, natefinch, jam: i'd love it if someone could take a look at this: https://codereview.appspot.com/13573046
<axw> fwereade: righto
<rogpeppe> it's unfortunately big, but not really splittable
<axw> fwereade: so, I was looking at a TODO you left in bootstrap.go regarding customising jobs
<fwereade> axw, yeah, that's the one :)
<axw> fwereade: I think that's going to be necessary here
<axw> and that leads to the question I emailed you regarding upgrade
<mgz> rogpeppe: okay, as it's so small :)
<fwereade> axw, yeah, it's a little fiddly, and in your case involves sending special instructions via cloud-init, I think
 * rogpeppe blows mgz a kiss
<axw> fwereade: for local-storage, yes
<axw> dimitern: you can reference 1229507 if you like
<axw> once one's done, it should be simple to do a second (or third, as is the case)
<dimitern> axw, ok, mine will be similar
 * rogpeppe goes back to the branch for which i made these changes... 46 branches previously
<fwereade> axw, ok, updating jobs should be pretty tolerable, but there is currently no code that does this IIRC
<fwereade> axw, I will expand on that in the response
<TheMue> rogpeppe: I like your large ones
<axw> fwereade: to my naive mind, this sounds like it might require generic upgrade support, which could also be used for state schema upgrades
<fwereade> dimitern, I will cc you in
<rogpeppe> TheMue: ha ha!
<fwereade> axw, yeah, and at least the actual upgrading is now behind the api server and could thus perhaps be managed with a bit more finesse
 * rogpeppe likes a good double entendre
 * fwereade considers that to be barely a single entendre
<jamespage> mgz, for context the mysql charms attempt to configure mysql with 80% of total memory on a service unit
<jamespage> this borks fairly frequently now
<mgz> interesting
<jamespage> mgz, different openstack deployment but I see the same issue with the bootstrap node on a m1.tiny
<mgz> so, the main issue we were seeing was on bootstrap, starting jujud during cloudinit setup failing due to being short on memory
<TheMue> rogpeppe: and please change NewMem() to NewMemory() (or at least NewDisk() to NewDsk(), just to make it similar *smile*)
<mgz> mysql charm issues is a later point, and there's no mongo install happening
<rogpeppe> TheMue: somehow NewMemory doesn't do quite the right thing for me as NewMem. I'm not quite sure why.
<rogpeppe> TheMue: and Mem is a very commonly used abbreviation (e.g. memcached, etc)
<mgz> jamespage: do you have any clues on what could have changed?
<mgz> we were presumably close to the limit previously, but something made everyone start hitting this issue...
<rogpeppe> TheMue: i'm not sure it's worth bikeshedding over
<jamespage> mgz, poking again right now
<TheMue> rogpeppe: still NewMem reads strange ;)
<rogpeppe> TheMue: to do it right, it would probably be NewMemoryBasedConfigStorage
<rogpeppe> TheMue: but i'm actually reasonably happy with NewMem
<TheMue> rogpeppe: yeah, java style
<TheMue> rogpeppe: ok, can live with it
<TheMue> rogpeppe: NewMBCS()
<TheMue> ;)
<rogpeppe> TheMue: :)
<TheMue> rogpeppe: reviewed
<rogpeppe> TheMue: that was quick! thanks.
<jamespage> mgz, doh! I already forced the dataset-size smaller for my config
 * jamespage pushed the car back up the hill to see if it crashes next time
<TheMue> rogpeppe: most of the changes have been using NewMem as an additional arg, so the reading has been simple
<rogpeppe> TheMue: cool
<rogpeppe> mgz: were you reviewing it; if not, i'll approve it now.
<mgz> nearly there
<mgz> nothing substantive
<mgz> I have some sympathy with TheMue's complaint about NewMem now he's mentioned it
<TheMue> mgz: also want a change from NewMem to NewMemory? ;)
<fwereade> TheMue, lgtm with a couple of comments
<TheMue> mgz: h5
<TheMue> fwereade: ta
<fwereade> rogpeppe, onto yours in a few minutes :)
<mgz> rogpeppe: posted
<rogpeppe> fwereade: cool, thanks
<rogpeppe> mgz: you're dead right about TestDestroyEnvironmentCommandConfirmation; i'll split it.
<mgz> there are still so many little aspects of the go language that trip me up...
<mgz> return statements being one of the more trivial...
<TheMue> fwereade: the hook name export does indeed have an error; I overlooked that I'm using it in the same package
<natefinch> mgz: what about return statements trips you up?
<mgz> there are so many possible variations to how they'll look in a function
<natefinch> mgz: I think as long as people avoid bare returns (which I think are an anti-pattern, and I wish they'd do away with them), then it's not so different than other languages... except for the multiple returns
<rogpeppe> fwereade: are you still planning to look at that branch?
<fwereade> rogpeppe, sorry, the first file is still open
<rogpeppe> fwereade: np; i'm still working on a fix for TestBootstrapWithDefaultSeries live test as it happens
<rogpeppe> fwereade: do you know what the deal is with tools/juju-1.15.0-raring-amd64.tgz versus tools/releases/juju-1.15.0-raring-amd64.tgz ?
<rogpeppe> fwereade: it seems to be uploading to the latter, but StorageName is returning the former
<fwereade> rogpeppe, huh
<fwereade> rogpeppe, there is some horrible patching function
<rogpeppe> fwereade: any clues as to where it might be?
 * fwereade meditates
<fwereade> rogpeppe, environs/tool/storage.go, right at the top
<fwereade> rogpeppe, I would guess there's something screwy about the patching/unpatching
<fwereade> rogpeppe, *but* I cannot recall why it was needed in the first place
<rogpeppe> WTF!?!?!
<rogpeppe> we set global variables for testing purposes, but SetTestPrefix is bonkers
<rogpeppe> it means that SyncTools is fundamentally not thread-safe
 * rogpeppe feels slightly queasy
<rogpeppe> fwereade: ^
<fwereade> rogpeppe, I do recall seeing it and being scared before, but being on the trail of bigger game, so your surmise is probably correct
<fwereade> rogpeppe, so we call that outside a testing context?
<fwereade> rogpeppe, or just that the tests are screwy?
<rogpeppe> fwereade: we call it in ReadList
<fwereade> rogpeppe, uhh?
<rogpeppe> fwereade: and in copyOneToolsPackage
<rogpeppe> fwereade: which is called by SyncTools
<rogpeppe> fwereade: and Upload calls it too
<rogpeppe> fwereade: as does SyncTools itself
<fwereade> rogpeppe, well, while it *might* coincidentally be safe, yeah, that's crack
<rogpeppe> fwereade: it's just random side-effects unrelated to the function
<rogpeppe> fwereade: ReadList, for example just changes the global tool prefix if it gets ErrNoTools
<rogpeppe> fwereade: oh actually it doesn't, sorry
<fwereade> rogpeppe, yeah, I think it always cleans up after itself
<fwereade> rogpeppe, but it's totally unaware of concurrency
<rogpeppe> fwereade: even if it was, it would be totally wrong
<rogpeppe> fwereade: you could protect it with a mutex and it would be just as wrong
<fwereade> rogpeppe, yeah, it's a pretty fundamentally insane operation, that stuff should be passed around, no question
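The fix they're implying - pass the prefix around instead of mutating a package-global via SetTestPrefix - can be sketched as below. The toolsStorage type and storageName method are illustrative stand-ins for whatever environs/tools actually exposes; the point is that a pure function of its inputs is trivially safe under concurrency, where a global prefix makes SyncTools racy even behind a mutex.

```go
package main

import "fmt"

// toolsStorage carries the prefix explicitly, instead of reading a
// mutable package-level variable.
type toolsStorage struct {
	prefix string // e.g. "tools/" in production, "test-tools/" in tests
}

// storageName is a pure function of its receiver and arguments, so
// concurrent callers with different prefixes cannot interfere.
func (s toolsStorage) storageName(vers, series, arch string) string {
	return fmt.Sprintf("%sjuju-%s-%s-%s.tgz", s.prefix, vers, series, arch)
}

func main() {
	prod := toolsStorage{prefix: "tools/"}
	test := toolsStorage{prefix: "test-tools/"}
	fmt.Println(prod.storageName("1.15.0", "raring", "amd64"))
	fmt.Println(test.storageName("1.15.0", "raring", "amd64"))
}
```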
<rogpeppe> fwereade: why did the tools need to move, BTW?
<fwereade> rogpeppe, I have absolutely no recollection of that, I'm afraid
<rogpeppe> fwereade: because we *could* put in lots of extra context and change quite a bit of code that assumes that names are independent of storage context, *or* we could just leave the names as they are
<rogpeppe> wallyworld: ping
<rogpeppe> fwereade: well, i think i'll leave that test as broken as is, as it's currently broken on trunk
<rogpeppe> fwereade: and i'll file a bug
<rogpeppe> fwereade: and i will approve my branch, unless you want to have a look first.
<fwereade> rogpeppe, I'm reading through it, if I find any reason to panic I will be sure to inform you
<rogpeppe> fwereade: i'm sure you will :-)
<rogpeppe> fwereade: if you think you'll finish tonight, i'll hold off
<rogpeppe> fwereade: https://bugs.launchpad.net/juju-core/+bug/1229839
<_mup_> Bug #1229839: provider/ec2: LiveTests.TestBootstrapWithDefaultSeries is broken <juju-core:New for wallyworld> <https://launchpad.net/bugs/1229839>
<fwereade> rogpeppe, nice, thanks
<fwereade> rogpeppe, LGTM, that's great
<rogpeppe> fwereade: cool, thanks
<fwereade> rogpeppe, am I sane re: https://codereview.appspot.com/13573046/diff/5001/environs/open.go#newcode101 ?
<rogpeppe> fwereade: yeah, i feel quite good about the direction we're going
<rogpeppe> fwereade: i've been thinking about this issue
<rogpeppe> fwereade: short answer: yes, i was planning on just storing a diff
<fwereade> aww
<fwereade> it'll be big and smelly and redundant
<rogpeppe> fwereade: because otherwise the attributes may well have changed between the time you call Prepare and the time you call Open
<fwereade> loads of auto-inserted defaults
<rogpeppe> fwereade: that's true, but at least you'll have one single place to see all the settings that go into an environment
<rogpeppe> fwereade: that smelliness is actually our genuine stink
<fwereade> rogpeppe, well, it's still two places if you consider what's in environments.yaml ;)
<rogpeppe> fwereade: eventually, i hope that environments.yaml will actually be replaced by a network call to fetch actual config values
<fwereade> rogpeppe, I do agree there's a foul smell to the whole thing
<fwereade> rogpeppe, eh, the CLI should hardly ever need to know them in the first place
<fwereade> rogpeppe, bootstrap is, I think, the special case
<rogpeppe> fwereade: that's true
<rogpeppe> fwereade: and in most cases, the CLI will never see any attributes at all
<rogpeppe> fwereade: they're only relevant for the bootstrapper
<rogpeppe> fwereade: i think that, for an admin, knowing that they can bootstrap, then copy *only* environments.yaml and environments/name.xxx (ext TBD!) to another machine, is a really useful property
<rogpeppe> fwereade: without needing to remember to copy across any number of provider-specific env vars
<rogpeppe> too
<rogpeppe> fwereade: how about we go with "just a diff" to begin with, and if we think it looks too smelly, we can trim it down later.
<rogpeppe> ?
<rogpeppe> fwereade: i've got to go
<fwereade> rogpeppe, sorry, got called away myself
<fwereade> rogpeppe, I thought it'd be 5 seconds...
<fwereade> rogpeppe, the env vars are a very good point, but my heart still yearns for making the provider -- that knows about that -- responsible for recording those
<fwereade> rogpeppe, I'm not against the env file + environments.yaml, I think that's good, but I don't think it needs defaults inserted
<fwereade> rogpeppe, values read from other files, probably, yes, though
<fwereade> rogpeppe, anyway I must return
<rogpeppe> fwereade: we'll chat about this tomorrow
<rogpeppe> fwereade: g'night!
<rogpeppe> fwereade: BTW, only provider defaults will go in, as it's currently structured.
<rogpeppe> fwereade: and it's perhaps actually nice to see what those are
<rogpeppe> fwereade: by my logic above though, we should put in authorized keys and all the other defaults, making them concrete.
 * rogpeppe is really gone now
<natefinch> mgz, fwereade. jam: if  there's anyone left online: https://codereview.appspot.com/13802045
<mgz> looking
<mgz> ah, and it's one I really should review too
<mgz> tags is stored as a list in Constraints I take it?
<mgz> hence the change to IsEmpty checks rather than compare to {}
 * mgz j-s it up
<natefinch> mgz: yeah, I added a comment to that effect after I realized it wasn't going to be obvious at first
<mgz> I'm still not sure about that vs just storing a space-separated string
<mgz> (for other reasons, not because it would save an Isempty function, that's fine)
<natefinch> in general I really like to store lists of strings as lists of strings... then you let each consumer reformat (or not) as necessary
<natefinch> then it's really clear "hey, there's multiple values here"  and you don't have to go look at the implementation to know how to separate the values
#juju-dev 2013-09-25
 * thumper tries to formulate some replies
<axw> thumper: did you want to look at my local-storage auth change before I land? I have fwereade's LGTM
<thumper> axw: have I reviewed it?
<axw> thumper: no, just fwereade
 * thumper is replying to the upgrade emails
<thumper> nah, I trust you, and if fwereade has acked it, that is good enough for me
<axw> okey dokey
<axw> cool
<axw> thumper: the gist is, using TLS client certs for authentication; authentication=authorisation
 * thumper nods
<thumper> school run time
<thumper> axw, wallyworld: reminder on the code review time in 15 min
<wallyworld> yep
<axw> thanks
<thumper> ugh
<thumper> having a very meh day
<axw> thumper: I didn't really understand this statement in your email- "I think I agree with William here, these are more associated with a
<axw> provider rather than just agent config."
<thumper> axw: I was thinking that we wouldn't need to put it in the config
<axw> my intention was for the provider to control what jobs get put into the MachineConfig, which drives agent.Config creation
<thumper> but in state
<axw> thumper: yes, but the bootstrap process puts the jobs into state
<thumper> the machine jobs are defined in state
<thumper> are we talking about the bootstrap jujud process?
<thumper> or the machine agent?
<axw> yep
<axw> um
<axw> what's the difference?
<axw> cmd/jujud/bootstrap.go
<thumper> one is 'jujud bootstrap' the other is 'jujud machine'
<axw> currently does an InjectMachine with hard-coded set of jobs
<thumper> there is a key/value store in format 1.16
<axw> oh right. I'm talking about the former
<thumper> for the agent config
<thumper> so if you use that, you don't need to teach the format anything new
<axw> hmm yes, it just felt a bit wrong to put it into something so free-form... but I suppose the machine agent doesn't care about this at all
<thumper> right
<thumper> but I do get your meaning
<thumper> could be command line params to the bootstrap ?
<thumper> although that isn't much better
<thumper> and then creates a mix of data sources
<axw> thumper: they've all got pros and cons, so I guess key/value will do; at least there's no format change required then
 * thumper nods
<axw> thumper: it sounds like we're thinking along the same lines regarding upgrades
<thumper> good
<axw> I was thinking more after I sent my email last night, and that the code that I will need to add for this could be used for state schema upgrades
<thumper> even the multi-version jump?
<axw> yes definitely
<thumper> good
<thumper> I think that previously that has been in the "too hard" basket
<axw> having to hop along like that sounds horrible
<thumper> but it definitely needs to be done
<thumper> I keep thinking back to what we did with postgresql
<thumper> where we had a number of patch files
<thumper> a similar thing could be done for our version changes
<thumper> for each minor version, we have a slice of functions to call
<thumper> that will modify state based on what was left by the previous call
<axw> yep that makes sense
<thumper> so going from say 1.16 -> 1.18 would just require the one stack to be called
<thumper> but 1.16 -> 1.22 (say) would run three, one after the other
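thumper's patch-stack idea can be sketched in a few lines of Go. This is only an illustration of the chaining he describes, not juju's actual upgrade code; the `upgradeStep` type, the version keys, and the toy state map are all invented:

```go
package main

import "fmt"

// upgradeStep mutates state as left by the previous step.
type upgradeStep func(st map[string]interface{}) error

// steps holds, per minor version, the slice of functions to run when
// upgrading *to* that version (versions and steps here are made up).
var steps = map[string][]upgradeStep{
	"1.18": {func(st map[string]interface{}) error { st["a"] = 1; return nil }},
	"1.20": {func(st map[string]interface{}) error { st["b"] = 2; return nil }},
	"1.22": {func(st map[string]interface{}) error { st["c"] = 3; return nil }},
}

// upgrade runs each version's stack in order: 1.16 -> 1.18 calls one
// stack, while 1.16 -> 1.22 chains the 1.18, 1.20 and 1.22 stacks
// one after the other, each seeing what the previous stack left.
func upgrade(st map[string]interface{}, versions ...string) error {
	for _, v := range versions {
		for _, step := range steps[v] {
			if err := step(st); err != nil {
				return fmt.Errorf("upgrade to %s failed: %v", v, err)
			}
		}
	}
	return nil
}

func main() {
	st := map[string]interface{}{}
	if err := upgrade(st, "1.18", "1.20", "1.22"); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(len(st)) // 3: all three stacks ran
}
```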
 * thumper waves hands
<thumper> magic
<axw> :)
<axw> thumper: so I'm vaguely thinking that there'll be an environs/upgrade package, which will handle generic upgrades, and defer to an optional EnvironUpgrader interface, which an Environ may implement
<axw> haven't thought through in much detail yet
<axw> environs/upgrade would do schema upgrades
<thumper> sounds good to me
<axw> EnvironUpgrader would do provider-specific things, like adding jobs
<thumper> axw: I'm wondering whether state/upgrade should do schema upgrades given that state has the core "state"
<axw> thumper: yes, probably
 * thumper calls a wash on today, back tomorrow
<rogpeppe> mornin' all
<axw> hey rogpeppe
<rogpeppe> axw: hiya
<rogpeppe> nice to see juju getting a shout out in this talk https://www.youtube.com/watch?v=sYukPc0y_Ro
<axw> rogpeppe: yeah, saw that this morning - very nice! :)
<rogpeppe> axw: last night thumper mentioned an email thread about upgrades - where is that?
<axw> rogpeppe: I emailed fwereade and rogpeppe only; I'll forward to you if you're interested
<axw> err not rogpeppe  )
<rogpeppe> axw: please
<axw> :)
<axw> thumper
<axw> actually I'll just CC the list, since it's grown into something that everyone will probably care about
<rogpeppe> good plan
<rogpeppe> wallyworld: hiya
<wallyworld> hi
<rogpeppe> wallyworld: you've replied to this, but i don't think you've pushed your changes yet : https://codereview.appspot.com/13842044
<wallyworld> i have
<wallyworld> i pushed to lp
<rogpeppe> wallyworld: could you re-propose too, please?
<wallyworld> it's been merged also
<wallyworld> :-(
<rogpeppe> wallyworld: so i can see the changes in codereview
<wallyworld> i don't have that branch easily accessible right now
<wallyworld> look at lp
<rogpeppe> ok
<wallyworld> bit stressed trying to get @^%^@!^%@! gpg working
<wallyworld> i can't generate a private key and have Go use it
<rogpeppe> wallyworld: but in general it's nice to have the pushed version in codereview because the link is in the commit message
<wallyworld> guess so
<wallyworld> i hate codereview
<rogpeppe> wallyworld: you can't generate a private key?
<wallyworld> too hard to use
<rogpeppe> wallyworld: funny, i think it's really good, but there y'go
<wallyworld> i can generate fine, but when i try and use it in go, it complains it is encrypted
<rogpeppe> wallyworld: what library are you using?
<wallyworld> so i have a private key block which i got from pgp, then i use keyring, err := openpgp.ReadArmoredKeyRing(bytes.NewBufferString(signedMetadataPrivateKey))
<wallyworld> i'm trying to generate some signed data to test with, i have both public and private key blocks from a test key i generated in gpg
<wallyworld> errors.InvalidArgumentError = "signing key is encrypted" ("openpgp: invalid argument: signing key is encrypted")
<wallyworld> is the error
<wallyworld> i'm not sure how to make a signing key that is unencrypted
<wallyworld> rogpeppe: i hate codereview because i can't see the whole diff at once, so it's hard for me to navigate around the changes
<rogpeppe> wallyworld: i like codereview because i can see the whole file in context (and all the comments)
<rogpeppe> wallyworld: but it would be nice if it was much faster
<wallyworld> yeah
<wallyworld> rogpeppe: i got to go and buy dinner, but if you had a clue as to how I can get a pulic and private key i can use with openpgp in go that would be awesome
<rogpeppe> wallyworld: i'll have a look
<wallyworld> thanks
<wallyworld> i have a test private key but that comes from a go lib test
<rogpeppe> wallyworld: did you see this bug BTW? https://bugs.launchpad.net/juju-core/+bug/1229839
<_mup_> Bug #1229839: provider/ec2: LiveTests.TestBootstrapWithDefaultSeries is broken <juju-core:New for wallyworld> <https://launchpad.net/bugs/1229839>
<wallyworld> and the data is precanned
<wallyworld> i replied
<rogpeppe> wallyworld: so you haven't generated the private key yourself?
<wallyworld> we have had that SetToolsPrefix in the code for a long time
<rogpeppe> wallyworld: it's hideous!
<wallyworld> rogpeppe: i have, but go refuses to import it, as
<wallyworld> per the above error
<wallyworld> rogpeppe:  i have a chunk of text wrapped with -----BEGIN PGP PRIVATE KEY BLOCK----- etc
<wallyworld> but i can't use ReadArmoredKeyRing to get something i can sign stuff with
<rogpeppe> wallyworld: what command line did you use to generate your gpg private key?
<rogpeppe> wallyworld: just so i can reproduce your issue
<wallyworld> rogpeppe: i used seahorse and exported it
 * rogpeppe hasn't heard of seahorse
<wallyworld> i generated a sign only key
<wallyworld> it's a gpg gui
<wallyworld> rogpeppe:  but i guess "gpg --gen-key" would have worked
<rogpeppe> wallyworld: that's ok, i can never remember the gpg usage either :-)
<axw> wallyworld: I just generated one, it worked for me
<axw> with --gen-key
<wallyworld> wtf
<wallyworld> can you read it using ReadArmoredKeyRing
<wallyworld> without getting an error
<axw> wallyworld: yep
<wallyworld> did you type in a passphrase when you generated it?
<wallyworld> axw: i'm stupid
<wallyworld> the reading in bit works
<wallyworld> it's the signing that fails
<wallyworld> 	plaintext, err := clearsign.Encode(&buf, keyring[0].PrivateKey, nil)
<wallyworld> fails
<axw> ah
<wallyworld> using the keyring read in
<axw> wallyworld: did *you* type in a passphrase?
<axw> will seahorse allow you not to?
<wallyworld> no, i'll try command line
<axw> (gpg --gen-key assigned one even when I said not to)
<wallyworld> not sure if that will force it
<wallyworld> how do i get an unencrypted private key is the question
<wallyworld> don't know why Go needs that
<wallyworld> anyway, got to run to get dinner, back later
<axw> wallyworld: call PrivateKey.Decrypt(passphrase)
<axw> wallyworld: see the comment on the PrivateKey.Encrypted field (http://godoc.org/code.google.com/p/go.crypto/openpgp/packet#PrivateKey)
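The fix axw points to can be sketched as follows, using the go.crypto openpgp package being discussed (imported here by its `code.google.com/p/go.crypto` path of the time; it later moved to `golang.org/x/crypto`). The key material and passphrase are placeholders, so this is a sketch of the call sequence rather than a runnable program:

```go
package main

import (
	"bytes"
	"fmt"
	"os"

	"code.google.com/p/go.crypto/openpgp"
	"code.google.com/p/go.crypto/openpgp/clearsign"
)

// armoredPrivateKey would hold the exported
// -----BEGIN PGP PRIVATE KEY BLOCK----- ... text (placeholder here).
var armoredPrivateKey = "..."

func main() {
	keyring, err := openpgp.ReadArmoredKeyRing(bytes.NewBufferString(armoredPrivateKey))
	if err != nil {
		panic(err)
	}
	priv := keyring[0].PrivateKey
	// This is the step wallyworld was missing: an exported key is
	// usually passphrase-protected, so decrypt it before signing.
	if priv.Encrypted {
		if err := priv.Decrypt([]byte("passphrase")); err != nil {
			panic(err)
		}
	}
	var buf bytes.Buffer
	w, err := clearsign.Encode(&buf, priv, nil) // now succeeds
	if err != nil {
		panic(err)
	}
	fmt.Fprintln(w, "metadata to sign")
	w.Close()
	os.Stdout.Write(buf.Bytes())
}
```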
<axw> rogpeppe: if you didn't see it yet, I forwarded the upgrade email chain to juju-dev
<rogpeppe> axw: thanks
<rogpeppe> axw: here are some notes i made on major-version upgrades a while ago: http://paste.ubuntu.com/6153562/
<axw> rogpeppe: heh, pretty much what I wrote I think :)  nice to know I wasn't off base
<axw> the pending flag bit is a bit different
<rogpeppe> axw: yeah, it seemed pretty similar
<jamespage> davecheney, still around?
<axw> rogpeppe: I think thumper was suggesting that the API server just lock out all connections, and API clients keep attempting to reconnect
<axw> so the "pending flag" is in effect cleared when they can finally connect
<axw> I kinda like that approach
<rogpeppe> axw: just shutting off access to all agents seems a little draconian.
<rogpeppe> axw: but it does have some advantages
<rogpeppe> axw: like you don't have to wait for everyone
<rogpeppe> axw: but i think it might be good to let all the agents download their new version before we start to upgrade
<rogpeppe> axw: that way we can have a clean break
<axw> rogpeppe: what if the API server fails during upgrade? how would you roll back the tools on the machine agents?
<axw> or do you mean, download but don't action
<rogpeppe> axw: both are possible
<wallyworld> jam: i have 2 simplestreams mps, which fix the issues you were talking about, i'm hoping to get these into 1.15 https://codereview.appspot.com/13899044/ https://codereview.appspot.com/13888043/
<wallyworld> i got to go make dinner, but back later
<rogpeppe> axw: if we make sure that at least the new API is backwardly compatible to the point of finding out the agent version, then agents can downgrade if the upgrade fails
<axw> that sounds reasonable
<dimitern> fwereade, hey
<dimitern> fwereade, quick question
<dimitern> fwereade, if we don't take an environ in NewSimpleAuthenticator in the provisioner, how should we get the state and api infos? The manual provisioner needs a state connection I think, so we can't just return nil stateInfo
<jam>  fwereade: standup ?
<jam> https://plus.google.com/hangouts/_/3d4586c62aa1310a0c3f40960494578688c86f1a
<jam> fwereade: ^^
<fwereade> huh, hangout experiencing difficulties
 * TheMue => lunch
<dimitern> fwereade, ping
<abentley> sinzui: ping for standup
<fwereade> natefinch, https://codereview.appspot.com/13802045/ LGTM with comments (and a followup?) -- let me know what you think
<fwereade> natefinch, wait, sorry, I seem to have skipped some files
<fwereade> natefinch, added to it
<fwereade> natefinch, maybe not LGTM any more, but should be reasonably simple to address
<fwereade> natefinch, I think you just need to make a distinction between nil tags and empty tags
<natefinch> fwereade: good point about masking tags
<dimitern> ping
<dimitern> :)
<dimitern> i meant fwereade ping
<mgz> was going to ask :)
<fwereade> dimitern, heyhey
<fwereade> dimitern, looking
<natefinch> fwereade: good call on empty tags... we actually aren't handling it well at all.   tags=   is getting treated as a single empty string tag, rather than an empty list
<mgz> erk
<fwereade> mgz, eh, that's what review are for
<natefinch> ironically, the "round trip" serialization tests don't catch this (I didn't have a test for it, but I added one, that still passes)
<fwereade> natefinch, thanks for checking it out
<mgz> fwereade: yeah, but I really should have looked out for that >_<
<natefinch> because []string{""}  serializes the same as []string{}
<natefinch> so, I wrote a specific test for it to check that tags= gets deserialized into []string{} ... which currently fails, but I'll fix that.   TDD, woo!
<mgz> yeay!
 * fwereade cheers
<rogpeppe> fwereade: fancy a small review (ExtraConfig in configstore)? https://codereview.appspot.com/13912043/
<rogpeppe> mgz, dimitern, TheMue, natefinch: ^
<mgz> looking
<TheMue> me too
<TheMue> rogpeppe: one minor comment, otherwise LGTM
<rogpeppe> TheMue: i'd prefer not to define yet another map[string]interface{} type
<rogpeppe> TheMue: i'm wondering about consolidating all of them to one type
<rogpeppe> TheMue: but in the meantime it's nice to avoid the type conversion if it's used with testing.Attrs, for example
<TheMue> rogpeppe: as those types don't cost a lot i like them explicit to give a better semantic expression
<rogpeppe> TheMue: what does that mean?
<TheMue> rogpeppe: especially when you set data not directly near to the function call
<rogpeppe> TheMue: sorry, i'm not sure i get you
<TheMue> rogpeppe: you can say cfg := ExtraConfig{"foo": 1234} then do something else, maybe add some more config, and later call SetExtraConfig(cfg)
<TheMue> rogpeppe: in this case already the initialization shows the usage of the data
<rogpeppe> TheMue: it's never going to be used that way
<dimitern> rogpeppe, +1 on naming types like map[string]interface{}
<dimitern> rogpeppe, we have too many too generic ones already
<TheMue> rogpeppe: sure for every developer until eternity? ;)
<rogpeppe> dimitern: i think map[string]interface{} expresses exactly what is required
<rogpeppe> dimitern: but i'm thinking of creating utils.Attrs and making *everything* use that
<TheMue> rogpeppe: it's technical, yes, but it doesn't say anything about the intention
<dimitern> rogpeppe, i don't want to argue, just giving my opinion
<TheMue> dimitern: thanks
<rogpeppe> TheMue: neither does "int", but we don't retype every int
<rogpeppe> TheMue: the argument name or function name is usually good enough
<TheMue> rogpeppe: will start to refactor :D
<TheMue> rogpeppe: but if it is enough you don't even need to create a utils.Attrs
<TheMue> rogpeppe: then let us stay with it as it is
<rogpeppe> TheMue: the current advantage to using an unnamed type is that it's compatible with other named types.
<TheMue> rogpeppe: that's right
<rogpeppe> TheMue: and also, given that config.Config.AllAttrs returns map[string]interface{}, I think my function should accept that exact type, because that's where its attributes are coming from
<dimitern> rogpeppe, there's already testing.Attrs btw
<TheMue> many attrs ;)
<dimitern> rogpeppe, and it's map[string]interface{}, so maybe while deciding to refactor all such cases should be considered in the codebase
<rogpeppe> dimitern: i know - i made it :-)
 * TheMue has to step out, somehow i'm stuck with my testing and will start freshly tomorrow morning
<rogpeppe> dimitern: i'd move it to utils
<rogpeppe> dimitern: and then make everything use it
<TheMue> rogpeppe: +1 for utils, yes
<TheMue> so, good n8 everybody
<rogpeppe> mgz: about the overall intention of the ExtraConfig stuff: the intention is that the extra configuration attributes are created only once, at Prepare time; subsequent operations should not add attributes, as that may well be racy.
<rogpeppe> mgz: adding endpoint address information should not be racy, as everyone will be trying to save the same info, so it's idempotent
<mgz> rogpeppe: okay, I sort of see
<mgz> but then how does that distinguish from just not letting the config get overwritten once in place?
<mgz> (having the bool "has this been created" flag, that is)
<rogpeppe> mgz: well, create fails if the info already exists
<rogpeppe> mgz: so you can only set extra config attrs if you are the one that first created the info
<rogpeppe> mgz: ... does that make sense? i'm not sure i understood your question
<mgz> hm, sort of, I think this is implementation detail rather than something that will break things later anyway
<natefinch> man, I love tests. I found like 4 bugs with the tests I wrote in the last hour
<mgz> natefinch: :D
<natefinch> mgz, fwereade: btw, you both missed the fact that constraints.IsEmpty had no tests, and actually had a bug in it ;)
<mgz> I complained about the indenting, which should give me half points...
<natefinch> rofl
<natefinch> the indenting is as gofmt would have it, all hail gofmt
<mgz> but but... it makes no sense ;_;
<natefinch> mgz: it's grouping
<natefinch> mgz: it's showing explicitly with the indenting what the implicit order of operations will do
<natefinch> mgz: gofmt does similar stuff when it's all on one line by using spaces or no spaces to show what will get calculated together, it's actually pretty awesome
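The spacing natefinch describes shows up on a one-liner (the values here are arbitrary): gofmt drops the spaces around the higher-precedence operator and keeps them around the lower-precedence one, so the grouping is visible at a glance.

```go
package main

import "fmt"

func main() {
	a, b, c, d := 2, 3, 4, 5
	// gofmt renders this with no spaces around * and spaces around +,
	// mirroring the implicit order of operations:
	fmt.Println(a*b + c*d) // 26

	// The same idea drives the multi-line indenting: the continuation
	// line after || is indented to show that && binds tighter.
	ok := a == 2 && b == 3 ||
		c == 0
	fmt.Println(ok) // true
}
```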
<dimitern> fwereade, there it is the secret attrs patch https://codereview.appspot.com/13916043
<rogpeppe> oops, just pastebin'ed all my aws security credentials
<mgz> natefinch: I'd understand if it was just the indent after the ||, it's the indent after the first && that makes no sense
<rogpeppe> fwereade: FYI here's what the environment info file looks like with *all* the attributes stored in it: http://paste.ubuntu.com/6155286/
<natefinch> mgz: it makes more sense if you put another || at the end like this: http://pastebin.ubuntu.com/6155319/
<mgz> okay, but that still took a little bit of thinking about, I'd be breaking out the brackets if I got that far
<natefinch> mgz: yes, in general... but it's handy if you think you know what it's doing , and turn out to be wrong... at least you get a hint of it.
<natefinch> mgz: this kind of thing has already saved my butt:  https://groups.google.com/forum/#!msg/golang-nuts/qFtA6g8AIxE/oumyR1ytP4MJ
<fwereade> rogpeppe, hmm, I dislike that a bit less than I expected to
<rogpeppe> fwereade: i think it has some definite advantages - it's nice being able to see everything in one place, minus all the config initialisation magic
<fwereade> natefinch, ouch, I was so happy to see IsEmpty everywhere
<natefinch> fwereade: well, there's a test for it now, no harm, no foul. :)
 * fwereade strokes his chin thoughtfully
<natefinch> (given that I was the one that wrote it and failed to write a test.....)
<fwereade> rogpeppe, it just feels like it's asking to replicate the Conn problems
<rogpeppe> fwereade: the Conn problems?
<fwereade> rogpeppe, the Environ
<fwereade> rogpeppe, which is wrong
<fwereade> rogpeppe, and which people keep trying to use because it's so ridiculously convenient
<rogpeppe> fwereade: well, we'll want to purge the environment from the environ info when we can
<rogpeppe> fwereade: (in fact, it might even be a command)
<rogpeppe> fwereade: really, we only want to keep it around until after the first connection
<rogpeppe> fwereade: but there's that problem with what happens if you lose contact with the last-contacted API address
<rogpeppe> fwereade: but that's a problem for everyone, and most people *won't* have the extra config attrs
<rogpeppe> fwereade: so i guess i'd prefer to work on that problem (finding API endpoints without any environ credentials), and just throw the credentials away at the first opportunity
<fwereade> rogpeppe, well, up to a point -- we *do* still have the creds around in environments.yaml and it would be silly not to use them
<rogpeppe> fwereade: well, one person does, yes
<rogpeppe> fwereade: but by fixing the issue for other people, we fix it for that person too
<fwereade> rogpeppe, yeah -- taking that knowledge (along with eg control-bucket) away from that user is not helpful I think
<rogpeppe> fwereade: i'm not sure how much mechanism is worth adding here
<fwereade> rogpeppe, I accept that a global fix would be ideal
<fwereade> rogpeppe, but today we still need actual environ access for the CLI in a bunch of situations
<rogpeppe> fwereade: indeed
<rogpeppe> fwereade: i don't see what difference it makes how many attributes you've got in that file though
<rogpeppe> fwereade: you've either got the environ attrs or not
<rogpeppe> fwereade: keeping them all in one place makes our life simpler
<rogpeppe> fwereade: (and probably users' lives too)
<fwereade> rogpeppe, I need to sleep on it I think
<dimitern> wtf is params.StatusData ?
<fwereade> rogpeppe, at the moment I still feel that if you can get a complete environ config from that file alone we screwed up
<fwereade> dimitern, dict of useful info attachable to status
<dimitern> somebody landed this and changed machine.SetStatus() underneath
<fwereade> dimitern, relation hook status reporting is particularly bad
<dimitern> ok, so do need to care about it in the provisioner?
<fwereade> dimitern, I think you can ignore it safely for now, nothing sets it on a machine
<dimitern> fwereade, ok thanks
<fwereade> rogpeppe, I do see the advantages in rendering to the One True Format, once, explicitly, though
<rogpeppe> fwereade: that's interesting. i'm not sure *how* that means we've screwed up, given that environments.yaml+info = the same thing
<fwereade> rogpeppe, the problem is that it looks like a sane config but most of the time it is not
<fwereade> rogpeppe, I have a thought though
<fwereade> rogpeppe, what if we called it creation-config
<fwereade> rogpeppe, that might make its status a bit more apparent
<rogpeppe> fwereade: that's not a bad idea
<rogpeppe> fwereade: or bootstrap-config
<fwereade> rogpeppe, it's still a bit risky in the same way Conn.Environ is, but it's better than before
<fwereade> rogpeppe, perfect
<natefinch> I'm getting a weird test failure in state/service_test.go in TestUpdateConfigSettings
<natefinch> code that is no where near anything I've touched
<fwereade> rogpeppe, validate and complete it OAOO, record it locally so we can finish bootstrap nicely, subsequently stick to changing the state-servers field and fall back to bootstrap-config just in case it offers a save for a screwed situation
<rogpeppe> fwereade: i could imagine a command, say clear-bootstrap-info, which would delete the local bootstrap config
<rogpeppe> fwereade: although it's probably unnecessary
<fwereade> rogpeppe, I'm not sure it deserves a UI
<fwereade> rogpeppe, anyone who can be trusted to run it can handle deleting a few lines;p
<rogpeppe> fwereade: yeah
<rogpeppe> fwereade: what *might* be nice though is a way of exporting the file without the bootstrap-config
<fwereade> rogpeppe, yeah, I think that definitely deserves a UI
<fwereade> rogpeppe, import/export environment basically
<rogpeppe> fwereade: yeah
<fwereade> rogpeppe, but I think that vocab's taken so we should think of something else
<rogpeppe> fwereade: endpoint
<rogpeppe> fwereade: export-endpoint, import-endpoint ?
<fwereade> rogpeppe, I'm wondering whether it's more like an identity almost
<rogpeppe> fwereade: it would be great if it contained the environ's UUID
<fwereade> rogpeppe, that's a good point
<natefinch> mgz, fwereade, rogpeppe, dimitern: help with this test failure?  I don't know where it's coming from - http://pastebin.ubuntu.com/6155445/
<fwereade> rogpeppe, we do something annoying like defer its creation to bootstrap-state now, don't we?
<rogpeppe> fwereade: tbh it would be nice if the UUID was generated at bootstrap time
<rogpeppe> fwereade: yeah
<rogpeppe> fwereade: i argued against that at the time :-)
<fwereade> rogpeppe, well forseen
<fwereade> rogpeppe, eh, we can always add that to the info file, along with the state-servers, rather than jam it into config
<rogpeppe> fwereade: that's true
<fwereade> rogpeppe, just pick it up a bit later
<dimitern> natefinch, probably you need to merge trunk and frank's changes for updating default/empty settings values?
<rogpeppe> fwereade: although it would be nicer to verify that the other end has that UUID rather than taking it for granted we're talking to the right one
<natefinch> dimitern: hmm... I did just merge trunk.. let me double check
<fwereade> rogpeppe, agreed that Prepare-time UUID generation would be nicer
<jam> rogpeppe: any chance to not double space the cert?
<rogpeppe> jam: that's just how goyaml produces multiline strings
<rogpeppe> jam: if you wanted to patch goyaml... :-)
<jam> rogpeppe: is it part of the yaml spec? in that if you don't double space you end up with everything on-one-line ?
<rogpeppe> jam: inside single quotes, yes
<jam> rogpeppe: you can always use a custom type with a MarshalYAML or whatever the func is.
<rogpeppe> jam: yaml has the most abstruse quoting rules ever
<rogpeppe> jam: that can't help us, unfortunately
<rogpeppe> jam: because MarshalYAML returns interface{} containing values which will be marshalled according to the usual yaml marshalling rules
<rogpeppe> jam: GetYAML, i mean
 * rogpeppe has reached eod
<natefinch> dimitern: all set.  looks like it was old cruft from go install that needed to be rebuilt.
<dimitern> natefinch, ah, yeah - we keep bumping into that
<dimitern> fwereade, any chance of looking at https://codereview.appspot.com/13916043 ?
<rogpeppe> fwereade: seems like we have a plan
<fwereade> dimitern, sorry, got lost on the stack, doing it now
<natefinch> dimitern: yeah, I've hit it before, just didn't think of it this time... I just try never to do go install anymore... it's too fraught with problems like that
<fwereade> rogpeppe, I think so
<fwereade> rogpeppe, thanks
<rogpeppe> fwereade: cool.
<rogpeppe> fwereade: i'm happy.
<dimitern> natefinch, I tend to always do go install in cmd/juju and cmd/jujud before doing a live test with --upload-tools
<rogpeppe> g'night all
<jam> natefinch: I always use "cd cmd/juju; go build; ./juju ..." which means it won't ever find "jujud" next to it :)
<fwereade> dimitern, nice, LGTM
<natefinch> jam: yeah, I've taken to the go build; ./juju thing too
<dimitern> fwereade, cheers, the next one coming up shortly
<fwereade> dimitern, awesome, I am workecueing and will be around for a bit
<dimitern> fwereade, nice! :)
<natefinch> mgz, fwereade: updated the maas-tags stuff with some more tests and bug fixes and now we handle nil tags vs empty list of tags properly  https://codereview.appspot.com/13802045/
<dimitern> fwereade, and the last one https://codereview.appspot.com/13919043
<fwereade> natefinch, LGTM
<fwereade> dimitern, looking
<dimitern> fwereade, thanks!
<fwereade> dimitern, natefinch: if you're both around, I think we may want to verify that we can still deploy a container on maas with the changes
<fwereade> and I believe natefinch to have something resembling reliable maas access
<fwereade> dimitern, pending verification, LGTM
<dimitern> natefinch, you just need "juju deploy wordpres --to lxc:0" or something and make sure it works
<dimitern> fwereade, but natefinch will need to pull my branch, right?
 * dimitern brb
<dimitern> fwereade, you mean a maas live test specifically?
<fwereade> dimitern, that's where we need that provisioner
<fwereade> sorry knuckle typing
<fwereade> pig fat
<dimitern> :)
<dimitern> I can do a live tests on ec2 or the local provider (or both) tomorrow
<fwereade> ec2 runs no lxc provisioner now I thought
<dimitern> so local provider then?
<dimitern> I think ec2 has it though
<dimitern> I've seen it start in the logs while live testing iirc
<fwereade> dimitern, hmm, maybe that has not landed yet
<fwereade> dimitern, see whether it does run on ec2
<fwereade> dimitern, if it does, add a container and check it's deployed sanely even if not actually addressable
<fwereade> dimitern, that'd be good enough for me I think
<dimitern> fwereade, ok, will do first thing tomorrow
<fwereade> dimitern, great, tyvm
<natefinch> dimitern, fwereade: sorry, had to step out.  I can try pulling dimiter's branch and test it out
<fwereade> natefinch, no worries, and only if it's convenient, we can do it tomorrow as easily
<fwereade> I'm about to be off myself
<natefinch> fwereade: figured. I'll actually see if I can just add everyone to the maas host's trusted_keys so we can all use this virtual maas environment
<fwereade> natefinch, oo, that'd be handy
<fwereade> natefinch, thanks
<natefinch> fwereade: welcome
<gary_poster> hello world.  I'm using juju core 14.1 on saucy and can't get local lxc environ to work.  I had it working fine on another machine with saucy a week ago, but not working here.  the lxc machine never goes past pending after 30 min, 45 min, 1 hr. excerpt from local log of most recent attempt (now 7 minutes into attempt) is here: http://pastebin.ubuntu.com/6156103/ .  My environments.yaml at this point in my attempts is entirely barebones: only a local environment with an admin-secret.
<gary_poster> Can anyone help?  thumper, you around, despite incredibly early time for you?
<gary_poster> To be clear, this is the kind of status I see: http://pastebin.ubuntu.com/6156113/
<thumper> hi gary_poster
<thumper> it is the start of my normal work day :)
<gary_poster> hey thumper.  oh cool...I think :-) what time is it?
<thumper> just before 9am
<gary_poster> oh cool, not too bad
<thumper> next week I will be 1 more hour closer to you
<thumper> as we go to UTC+13
<gary_poster> ah, right, then IIRC we get even closer when we do our time shift
<thumper> gary_poster: can you post the log from ~/.juju/local/logs/
<gary_poster> sure
<thumper> gary_poster: yes
<gary_poster> thumper, http://paste.ubuntu.com/6156123/
<gary_poster> that is local/log/machine-0.log .  there is no machine-1 fwiw
 * thumper nods
<thumper> have a look in /var/lib/juju/containers
<thumper> there should be a directory there gary-local-machine-1
<gary_poster> there is
<thumper> in there there should be two log files
<thumper> can you pastebin them?
<gary_poster> on it
<gary_poster> container.log http://paste.ubuntu.com/6156133/
<gary_poster> console.log (as root) http://paste.ubuntu.com/6156140/
<gary_poster> thumper, ^^
<thumper> gary_poster: yeah, looking
<gary_poster> cool, thx
<thumper> I think this is a key one: cloud-init-nonet waiting 120 seconds for a network device.
<thumper> cloud-init-nonet gave up waiting for a network device.
<thumper> also there seems to be some issues with the container.log
<thumper> perhaps lxc in saucy has changed something
<thumper> will need to follow up with serge
<gary_poster> ack thanks thumper.  thank you for looking.  Not blocked here.  anything else you need from my machine?
<thumper> perhaps apt-cache policy lxc
<gary_poster> thumper, http://paste.ubuntu.com/6156168/
<gary_poster> running away
<gary_poster> thanks and talk to you later
<thumper> kk
<wallyworld_> sinzui: hello, did you want to catch up?
<sinzui> wallyworld_, yeah
<wallyworld_> mumble hates me at the moment
<wallyworld_> so a hangout?
<wallyworld_> https://plus.google.com/hangouts/_/57e2c830e5a49e266ebc4dbeef8b459386dbf5bf
 * sinzui gets drink
 * thumper goes for a walk to think through some issues (juju related)
#juju-dev 2013-09-26
 * thumper misses list comprehension in go
<wallyworld_> thumper: mr ocr, i have a branch which hooks up simplestreams mirrors support for tools https://codereview.appspot.com/13952043
<wallyworld_> for fuck's sake, our landing bot has been shut down
<wallyworld_> ah maintenance i think
<thumper> hi wallyworld_
<wallyworld_> hi
<thumper> I'll look shortly
<wallyworld_> i hope canonistack is back soon
<wallyworld_> np
<bradm> wallyworld_: I thought it was back already?
<wallyworld_> bradm: when i nova list it says the instances are shutdown
<wallyworld_> +--------------------------------------+----------------------+---------+-------------------------+
<wallyworld_> | ID                                   | Name                 | Status  | Networks                |
<wallyworld_> +--------------------------------------+----------------------+---------+-------------------------+
<wallyworld_> | 4829b364-72ad-4ee7-a21c-3ba640f28854 | juju-gobot-machine-0 | SHUTOFF | canonistack=10.55.32.55 |
<wallyworld_> | 97a7c226-a195-4014-9df5-c998bba3a491 | juju-gobot-machine-3 | SHUTOFF | canonistack=10.55.32.52 |
<wallyworld_> +--------------------------------------+----------------------+---------+-------------------------+
<bradm> wallyworld_: yeah, the compute node being rebooted will do that
<wallyworld_> bradm: would it not have been in the procedures to restart stuff that was running?
<bradm> wallyworld_: I wasn't directly involved, but it would seem not to be the case
<wallyworld_> :-(
<wallyworld_> this is the second time our instances have been broken :-(
<bradm> you can't just power it on?
<wallyworld_> i'm not sure how
<wallyworld_> i assume there's a nova command
<wallyworld_> i'll take a look
<bradm> nova start <id>
<wallyworld_> yeah, trying that now
<wallyworld_> bradm: back running, seems quicker perhaps
<bradm> wallyworld_: probably, there's likely hardly anyone else's instances going :)
<wallyworld_> \o/
<bradm> the compute nodes are pretty beefy machines
<bradm> they're just being overcommitted by a lot
<bradm> I'll chase up what happened internally, the announcements did say things would be restarted
<bradm> but that definitely appears not to be the case, or at least not consistently
<wallyworld_> thanks :-)
<thumper> wallyworld_: something is wrong with the gobot
<thumper> no mongod
<wallyworld_> :-(
<wallyworld_> i'm not familiar with how it is set up sadly
<thumper> we need more monday gods
<thumper> mon-god
<wallyworld_> yeah
<wallyworld_> although stopping and starting should not have affected it you'd think
<bradm> fwiw with my dinky little juju test env on lcy02 the reboot didn't break it, it's back up and going
<thumper> oh good
<wallyworld_> thumper: i had a quick look - mongod is in /usr/local/bin and /usr/local/bin is in the path so i'm not sure
 * thumper -> haircut
<hazmat> thumper, testing saucy local fwiw
<hazmat> thumper, is there a particular version of interest? trunk i assume?
<hazmat> thumper, fails for me.. although looks like a different issue, namely the upstart job needs a wait between dropping an upstart template to disk and starting it, until inotify triggers and it registers with upstart.
<bradm> this is very interesting, a default juju bootstrap on lcy02 fails, since the instance type isn't big enough
<bradm> mongodb shuts itself down saying there's not enough space
<wallyworld_> bradm: that started happening about a week ago for some reason, i think folks are looking into it
<bradm> wallyworld_: I can tell you why
<bradm> wallyworld_: the default instance is a m1.tiny, which has a 2G /
<wallyworld_> ok :-)
<wallyworld_> juju used to be ok in 1G
<wallyworld_> or even 512
<bradm> wallyworld_: I just bootstrapped with more, mongodb alone uses 3G
<bradm> wallyworld_: it's disk that's the issue, not memory
<wallyworld_> serious? the landing bot bootstrap machine used to be a 512M instance
<wallyworld_> ah disk
<wallyworld_> i thought you were talking about ram
<wallyworld_> still, juju should not pick tiny on canonistack
<bradm> I can't say why mongodb suddenly wants all your disk, but that seems to be the case
<bradm> bootstrap it with a m1.tiny and you'll see, check the logs in /var/log/mongodb
<bradm> it pretty clearly says it needs more disk
<wallyworld_> there's some issue in how juju is choosing the instance
<wallyworld_> it used to work. it should be picking small
<wallyworld_> i'm not sure of the current status though but it is being looked at
<bradm> yeah, not sure where it changed, but that's the fix, to bootstrap with constraints that give you a bigger disk
<bradm> is that what's happening with your gobot?  needs more disk for mongodb?
<bradm> I wonder if mongodb should be using the smallfiles option
<wallyworld_> bradm: could be, but it was running fine before the shutdown
<bradm> wallyworld_: /var/log/mongodb/mongodb.log should make it pretty clear
<wallyworld_> bradm: yeah, true. i'm tied up trying to get some coding finished, but i'll look soon
<bradm> wallyworld_: cool, I can do some more testing myself once I've gotten this charm done
<wallyworld_> bradm: ok. i'm flat out right now as i'm off from tomorrow for a week and am trying to get everything done before i go. i'll hopefully be able to look a bit later
<bradm> wallyworld_: actually, I'm off next week too :)
<wallyworld_> \o/
<wallyworld_> going anywhere?
<bradm> yeah, my parents have taken our son for a holiday, we're driving up there to pick him up and spend some time with them
<bradm> it's one of the first times (outside of hospital) that we've been away from him, it's interesting
<wallyworld_> how old is he?
<bradm> 6
<wallyworld_> yeah, we didn't spend time away from our son for a few years either
<bradm> there are medical issues with him too, so we're probably a bit more protective than normal
<wallyworld_> yeah, i can understand that
<bradm> he had 2 open heart surgeries before he was 5
<wallyworld_> wow
<wallyworld_> glad he's ok
<bradm> yeah, he's pretty good given what he's had to go thru
<bradm> how about you?  going anywhere interesting?
<wallyworld_> hervey bay to watch whales, then to fraser island for a few days
<wallyworld_> looking forward to it
<bradm> ahh, nice - I've been to hervey bay whale watching before, lots of fun
<wallyworld_> yeah, me too about 10 years ago
<wallyworld_> with kid #1. now with kid #2
<bradm> we're starting to think along those lines for holidays as the boy gets older, he might actually get a bit more out of it
<wallyworld_> yep. we took kid #1 to nz when he was 4 and he remembers nothing. what a waste
<bradm> it'd be pointless for us before now, we always seemed to spend a good portion of the year with him in and out of hospitals
<wallyworld_> that's a shame, i hope he gets well asap
<bradm> he's been really good this year
<bradm> usually a flu would mean a trip to hospital, this year so far things have been good
<wallyworld_> \o/
<bradm> ohh, there's two mongodb running in my juju env
<bradm> and it's the non-juju one taking up all the space
<bradm> the juju started one has --smallfiles, the other one doesn't
<wallyworld_> ah
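For reference, the difference bradm spotted comes down to one mongod option: smallfiles caps the size of preallocated data and journal files. A hedged sketch of what the non-juju daemon's config would need (old ini-style /etc/mongodb.conf, assuming the stock Ubuntu mongodb package):

```
# /etc/mongodb.conf (sketch, assuming the stock Ubuntu mongodb package)
# smallfiles caps preallocated data and journal files, which is what
# keeps the juju-started mongod from filling a 2G root disk
smallfiles = true
```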
<jam> wallyworld_: as I'm working through some other things, I came across this question. How does "juju bootstrap --upload-tools" work today w/ openstack. Doesn't it put the tools in your private bucket, which should *not* be world readable?
<wallyworld_> jam: yes
<wallyworld_> it puts them in private bucket
<jam> wallyworld_: right, and both cloud-init and Upgrader just use a "wget" to get the tools
<jam> no Auth
<wallyworld_> jam: bot is down since the canonistack maintenance. i haven't had a chance to look deeply, but running tests says it can't find mongod in path, but mongod is in /usr/local/bin and that dir is in the path from what i can see
<wallyworld_> jam: it uses a temp url
<wallyworld_> which is publically readable
<wallyworld_> jam: i've tested bootstrapping with upload tools and simplestreams and it works fine
<wallyworld_> unless i've missed something
<wallyworld_> jam: when getting the tools URL, it does a storage.URL() which for environs storage returns a url from which anyone can read
<jam> wallyworld_: we don't have temp urls on canonistack, IIRC. I'm worried we're actually making our private containers world readable
<davecheney> wallyworld_: sounds like the bot is using the old tarball
<davecheney> it should use mongodb from ppa:juju/stable
<jam> wallyworld_: when working it out originally, we decided it was ok that the "public-bucket-url" had to be world readable
<jam> I don't expect it to be any different with tools-url
<jam> but I'm seriously suspecting that we should be able to "juju bootstrap --upload-tools" on Canonistack
<wallyworld_> jam: i'll have to check but it all seemed to work ok
<wallyworld_> upload tools is now automatic
<jam> wallyworld_: you mean sync-tools ?
<wallyworld_> no, upload
<jam> again, I think if it *is* working, we have a security hole
<wallyworld_> i don't recall explicitly setting permissions on the control bucket
<jam> wallyworld_: "swift stat $PRIVATE-BUCKET" has ".r:*,.rlistings"
<jam> wallyworld_:  :(
<wallyworld_> hmmm. the tool stuff doesn't set that i'm pretty sure
<jam> wallyworld_: I don't know *who* is setting it, but it is wrong, and it means private tools won't work when we "fix" it.
<jam> wallyworld_: I think the "auto-upload-tools" stuff creates a bucket and sets it world readable
<wallyworld_> i'll have to check
<wallyworld_> jam:
<wallyworld_> 		containerName: ecfg.controlBucket(),
<wallyworld_> 		// this is possibly just a hack - if the ACL is swift.Private,
<wallyworld_> 		// the machine won't be able to get the tools (401 error)
<wallyworld_> 		containerACL: swift.PublicRead,
<wallyworld_> this was put in in january
<wallyworld_> by dimiter i think
<jam> wallyworld_: with nobody realizing "you can't get the tools, but you're exposing all of your secrets to the world" ?
<jam> I'm not 100% sure what goes in the private bucket
<jam> as I don't think we put creds there.
<jam> So it *might* be ok
<wallyworld_> we put the state file there
<jam> wallyworld_: which is just the IP address, right?
<wallyworld_> i'd have to check but i don't think creds go there
<wallyworld_> i *think* so
<jam> wallyworld_: I think the only actually private thing is potentially private charms
<jam> As I'm pretty sure we put the charm data in there
<jam> however
<jam> that *also* needs to be accessible via "wget" because of how we removed Environ creds from the Uniter agents.
<wallyworld_> so it could be worse i guess
<jam> fwereade: I need to chat with you about this.
<jam> wallyworld_, fwereade: security bug #1231278
<wallyworld_> ok
<jam> (mup won't find it because it is private)
<fwereade> wallyworld_, jam, reading back
<jam> We just need a discussion, because there is certainly a "vulnerability vs not working at all" that we have to sort through.
<jam> fwereade: G+ might be appropriate
<jam> wallyworld_: bigjools seems to be enjoying himself without you so far :)
<fwereade> jam, well, I'm here, if the sight of a dressing gown will not be damaging to your sensibilities
<wallyworld_> jam: how do you know?
<jam> wallyworld_: he's posting pics of the great barrier reef on G+
<wallyworld_> fwereade: my eyes, my poor, poor eyes
<jam> fwereade: started: https://plus.google.com/hangouts/_/26fdcf993421ca83a1cf0b1a3ddd35772695e493
<wallyworld_> jam: ah ok. that social networking thing i ignore
<jam> fwereade: you could just turn the camera off :)
<davecheney> https://code.launchpad.net/~dave-cheney/juju-core/158-lp-1210407/+merge/187675
<davecheney> axw: thanks for your review, see my comments
<axw> looking
<axw> davecheney: will lgtm, just curious about this: "we don't reboot machines"  -- it doesn't work?
<axw> I get your point though - it doesn't really matter
<davecheney> axw: if you reboot a machine it gets a new ephemeral ip
<davecheney> and at that point, nothing works
<davecheney> axw: why do you say twice ?
<davecheney> I get your point. It just feels wrong to do it twice when it only ought
<davecheney> to be done once. But, given that's not really possible... LGTM.
<davecheney> "
<axw> davecheney: *if* it were able to reboot
<axw> it's idempotent though, so doesn't matter.
<davecheney> axw: fair point
<davecheney> also, bootcmd http://cloudinit.readthedocs.org/en/latest/topics/examples.html
<davecheney> does what runcmd does
<davecheney> it has the same firstboot properties
<axw> davecheney: ok, then the doc comment on juju-core/cloudinit/Config.AddBootCmd is wrong :)
<davecheney> axw: ok, i'll fix that in a followup
<axw> davecheney: in that page, the comment for bootcmd has a hidden gem: " * bootcmd will run on every boot"
<davecheney> urgh
<davecheney> oh well
<davecheney> care factor, quite small
<davecheney> this may fix the azure disk suckage
<davecheney> but while reading that page
<davecheney> where does it say runcmd is only run once ?
 * axw shrugs
<davecheney> i'm glad we've arrived at this place
<axw> it says it in juju-core, but that's maybe not authoritative
<davecheney> axw, best I can tell, we've never tried
<davecheney> everyone knows rebooting an ec2 instance will screw it
<axw> no worries, it's not a big deal
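For reference, the semantics axw and davecheney dug out of the cloud-init examples page, as a minimal #cloud-config sketch (the echo commands are placeholders):

```yaml
#cloud-config
bootcmd:
  - echo "bootcmd runs on EVERY boot, early in the boot sequence"
runcmd:
  - echo "runcmd runs ONCE per instance, on first boot only"
```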
<axw> fwereade: when you have a moment, would you mind expanding on your comments here? https://codereview.appspot.com/13832045/
<fwereade> axw, sure
<axw> I've changed the authentication stuff around a bit to allow HTTP GETs & HTTPS PUTs; wanted to know what you meant first, though, in case I was expending too much effort on this...
<fwereade> axw, I was just ruminating that *if* cert distribution proves to be some sort of hassle (mainly because the CLI still needs direct storage access for deploy/upgrade-charm) we *could* use ssh storage for the manual provider and filesystem storage for the local one, because the clients that need write access should already have the information needed to set up the appropriate Storage types
<axw> ah, right
<axw> fwereade: does anything other than CLI need write access?
<fwereade> axw, the API server itself may do
<fwereade> axw, but (assuming non-HA, anyway) that's doable via the filesystem
<axw> fwereade: ok. it can write directly, given it's local, so that's fine
<axw> yep
<axw> ok, so yes I did expend too much effort
<axw> oh well
<fwereade> axw, well, you expended it too early, at least
<axw> :)
<fwereade> axw, but no real harm done, I think
<fwereade> axw, we would ideally like to not depend on provider storage at all but that's not an immediate plan
<axw> ok
<fwereade> axw, what we will need to do soon, though, is start exposing storage access via the API, so that an API-only CLI can still upload charms from local repos
<axw> fwereade: also, when you're not busy, would you please look at my latest replies on these two: https://code.launchpad.net/~axwalk/+activereviews
<axw> fwereade: I was wondering if/when storage would be API based
<fwereade> axw, will do
<axw> thanks
<axw> fwereade: actually, my changes to httpstorage aren't for naught
<axw> they'll allow GETs to not require a self-signed cert
<axw> forgot that important bit :)
<fwereade> axw, I don't think they are, indeed
<axw> fwereade: I mean, changes I haven't pushed yet
<fwereade> axw, ah right -- cool then :)
<axw> I've been changing things today
<fwereade> axw, https://codereview.appspot.com/13632046/ LGTM
<axw> fwereade: thanks. the error is tested in jujutest/livetests
<fwereade> axw, I was just thinking of direct tests for New/Is
<axw> fwereade: ok, then no. I'll add some before landing
<fwereade> axw, cheers
<axw> eh that package has no tests... time to add some
<rogpeppe1> mornin' all
<axw> morning rogpeppe1
<rogpeppe1> axw: hiy
<rogpeppe1> a
<rogpeppe1> :-)
<fwereade> axw, https://codereview.appspot.com/13255051/ nearly LGTM, take a look and let me know your thoughts
<fwereade> rogpeppe1, morning
<rogpeppe1> fwereade: yo!
<axw> fwereade: thanks, reading
<axw> fwereade: replying now, but yeah, there is currently no handling of destruction for bootstrap nodes
<axw> the others can be destroyed as usual
<axw> I wasn't really sure where to draw the line with "null" :)
<rogpeppe1> fwereade: i don't quite understand this comment: https://codereview.appspot.com/13912043/diff/1/environs/configstore/disk.go#newcode134
<rvba> Hi jam, hi mgz… would any of you have time to talk about what seems to be a serious bug in the MAAS provider (bug 1229275)?
<rogpeppe1> fwereade: i *think* the only time we add attributes is when we call Prepare
<_mup_> Bug #1229275: juju destroy-environment also destroys nodes that are not controlled by juju <maas (Ubuntu):New> <https://launchpad.net/bugs/1229275>
<rogpeppe1> rvba: oops!
<mgz> rvba: yup, but I'll need to get on a bus in a sec
<fwereade> rogpeppe1, it's not really actionable, especially in the light of our later discussions
<rogpeppe1> fwereade: ok, cool
<fwereade> rogpeppe1, prepare chooses bootstrap-state and writes it; bootstrap uses exactly that
<rogpeppe1> fwereade: yup
<fwereade> rogpeppe1, it may involve some light massage of bootstrap responsibilities vs prepare responsibilities
<fwereade> rogpeppe1, but nbd
<rvba> So basically, juju destroys all the instances it gets back from the provider's instances() method, and that is basically all the instances.
<fwereade> rvba, that looks like a critical to me
<rvba> Critical indeed.
<fwereade> rvba, how does the maas provider mark instances as controlled by itself?
<rogpeppe1> rvba: the provider's Instances method should not be returning instances it didn't itself create
<rvba> fwereade: it doesn't
<rvba> rogpeppe1: that's the problem indeed.
<rogpeppe1> rvba: the other providers take care to avoid that
<fwereade> rvba, well, crap -- as someone maasy, how would you recommend we do so?
<rvba> fwereade: if this needs to be addressed on the MAAS side, then the easiest way is probably to set a tag on the nodes.  A tag identifying the juju environment.
<rvba> Out of curiosity, how do the other providers do it?
<mgz> rvba: either by looking at the security groups or the name attached to the instances, I believe
<mgz> instances that env controls are given names juju-ENVNAME-*
<rvba> Rightâ¦ that's how the Azure provider works too now that I think of it.
<fwereade> mgz, rvba: fwiw envname is bad
<fwereade> mgz, rvba: long-term, envname can only ever be a local alias for the actual environment uuid
<mgz> it's all a little dodgy, but I don't like the alternatives much
<fwereade> mgz, rvba: and we've already had problems with two people using the same env name and same provider credentials
<fwereade> it's easy to say "don't do that then"
<mgz> well, we should check for that on bootstrap and blow up
<mgz> right, I need to get on bus
<fwereade> but that's not as helpful as designing things such that we don't have to do so in the first place
<fwereade> rvba, I need to take a break for a bit, but... actually just a mo
<fwereade> rvba, how does juju not destroy those other instances first?
<fwereade> rvba, the provisioner will be asking for all instances and culling those it doesn't recognise
<fwereade> rvba, so *starting* a juju environment should also kill everything else
<fwereade> rvba, as should upgrading it
<fwereade> rvba, do you know if that's the case?
<rvba> fwereade: I just tested it, that's not what happens.
<rvba> (I'm testing with the latest trunk)
<fwereade> rvba, ok, that's weird
<fwereade> rvba, if it's not culling unknown instances it implies that actually AllInstances is reporting the right ones
<rvba> fwereade: that should happen during bootstrap right?
<jam> fwereade: well, you also have to run "juju status" first before it can poll the Provider at all
<jam> we've had many "run status and everything dies" bugs :)
<rvba> fwereade: I simply tested running "juju bootstrap", is the culling supposed to happen there or later, for instance when the bootstrap node comes up?
<fwereade> rvba, jam speaks truth, you need to connect once before the bad things will happen
<fwereade> rvba, it'll happen when the provisioner starts running
<rvba> Okay, testing that now.
<rvba> (node is installing)
<fwereade> rvba, which will happen just after the first command that connects
<fwereade> rvba, cheers
<fwereade> axw, responded, let me know what you think of the Destroy error question
<fwereade> bbiab
<axw> fwereade: yes sorry, I agree Destroy should return an error for now
<rvba> fwereade: you were right, culling did happen.
<fwereade> rvba, well, ok, the good thing here is we don't have to worry about backward compatibility then, because nothing (sensible) we do can make the situation any worse
<fwereade> axw, this right here ^ is a reason for an EnvironUpgrader that acts directly on the environment (independent of the should-it-hit-state discussion)
 * axw reads back
<fwereade> axw, short version: maas instances are not tied back to their environment, and getting instances from maas gets *all* instances, not all instances in the *environment*
 * axw nods
<axw> and destroys them all
<axw> fwereade: where's the link to EnvironUpgrader?
<axw> I didn't get much sleep last night, so a little slower than usual today
<fwereade> axw, sorry, we were chatting about it in the state-upgrades thread
<fwereade> axw, your contention was that it should connect to state
<fwereade> axw, I think that's the wrong way round
<fwereade> axw, *but* that adding an optional upgrade method to environ might be a good idea for other reasons
<axw> fwereade: for example, so you could add a tag to the maas nodes that you control?
<fwereade> axw, exactly so :)
<fwereade> axw, or indeed so we could correct the envname problem (above) for the other providers
<axw> fwereade: your latest reply clarifies things for me, and yes, much nicer to not manipulate state from environ
<fwereade> axw, great
<axw> fwereade: I've updated https://codereview.appspot.com/13255051/
<axw> okay if I handle Destroy properly in a followup?
<fwereade> axw, absolutely
<fwereade> axw, LGTM
<axw> thanks
<axw> fwereade: I'll get the last of the httpstorage stuff in next, then get onto Destroy
<fwereade> axw, perfect, tyvm
<axw> fwereade: and then Prechecker wireup
<fwereade> great -- that one's going to be a bit interesting, I think, we should plan how we get it in there ahead of time
 * fwereade bbiab again, see you all at the meeting
<axw> me too, I need a break. bbl
<thumper> rogpeppe1: I've realized that I really don't like mornings
<rogpeppe1>  thumper: that's taken you a while :-)
<rogpeppe1> thumper: i've realised that i forgot (again!) about our chat last night
<thumper> My head just isn't in it that early
<thumper> I should go check the agenda
<jam> https://bugs.launchpad.net/bugs/1229275 is that actually Critical ?
<_mup_> Bug #1229275: juju destroy-environment also destroys nodes that are not controlled by juju <juju-core:Triaged> <maas (Ubuntu):Triaged> <https://launchpad.net/bugs/1229275>
<jam> seems High at best
<jam> especially given "nobody is assigned to it"
<dimitern> fwereade, there it is https://codereview.appspot.com/13963043 - first part, the secrets blanking will follow
<jam> dimitern: the other way around
<fwereade> dimitern, would you take a really quick look at https://bugs.launchpad.net/juju-core/+bug/1229286 ? it feels somewhat likely to be unitery
<_mup_> Bug #1229286: debug-log and boolean options are broken in trunk <juju-core:New> <https://launchpad.net/bugs/1229286>
<dimitern> fwereade, looking
<fwereade> dimitern, the config bits specifically
<fwereade> dimitern, may be helpful to confer with TheMue, he was touching config recently
<dimitern> fwereade, I haven't tried juju set when live testing the api uniter
<dimitern> fwereade, just debug-hooks and relation-set/get
<fwereade> dimitern, yeah, I should have thought of that
<fwereade> dimitern, in fact the stuff you're doing is as critical as this regardless
<fwereade> TheMue, is there any likelihood you'll be able to look into it this pm?
<TheMue> fwereade: yep, will do
<TheMue> fwereade: lunch in a few moments, but then
<dimitern> fwereade, did debug-log show the hooks output before?
<dimitern> frankban, hey
<fwereade> TheMue, cool, thanks, please just verify what's happening with set vs config-changed
<dimitern> frankban, about that bug ^^
<dimitern> frankban, have you tried using debug-hooks instead?
<frankban> dimitern: no
<fwereade> dimitern, frankban: re logging you need to enable that logging in env config now
<fwereade> dimitern, frankban: thumper knows exactly
<dimitern> frankban, debug-hooks will show you if config-changed got fired
<frankban> dimitern: as I mentioned in the bug description, I am pretty sure that config-changed is called
<thumper> dimitern, frankban: it is due to logging changes that were made recently to make things more "productiony"
<thumper> bootstrap with --debug
<thumper> or --log-config=<root>=DEBUG
<frankban> thumper: cool, good to know
<thumper> or whatever you want
<thumper> this log config then propagates to all the agents
<frankban> thumper: so, by default, hooks output is not displayed in the debug log, correct?
<dimitern> ah, good to know
<thumper> can be updated using "juju set-env log-config=blah"
<thumper> frankban: correct
<thumper> only warning and errors
<thumper> used to be debug for everything
<thumper> I'll write an email for juju-dev tomorrow to explain the changes
<thumper> and hooks
<thumper> not juju hooks
<thumper> but how to do other logging stuff
<frankban> dimitern: so, the real bug is about boolean options: it seems they are always set to false
<frankban> thumper: thanks for the clarification
<thumper> np
<dimitern> frankban, hmm.. TheMue, can this be relevant to your recent config changes?
<TheMue> dimitern: should not, only empty settings have been touched
<TheMue> dimitern: i will take a look after lunch
 * TheMue => lunch
<dimitern> fwereade, did you have a chance to look at https://codereview.appspot.com/13963043 ?
<fwereade> dimitern, been in meetings I'm afraid, i'll try to fit it in before I go for lunch
<dimitern> fwereade, ok
<fwereade> dimitern, did we not have an implementation for Upgrader that swapped out 127.0.0.1?
<fwereade> dimitern, er, Deployer
<dimitern> fwereade, that's from there
<dimitern> fwereade, it's not swapping anything
<dimitern> fwereade, and it actually works like proposed - live tested on ec2
<fwereade> dimitern, I see, ok, no quibbles with what we're doing
<fwereade> dimitern, but would you please pull the common implementation of those methods out into a common type we can embed, like the other shared functionality?
<fwereade> dimitern, I can live with that as an *immediate* followup
<dimitern> fwereade, even though it's going away as soon as we have machine addresses?
<fwereade> dimitern, we're still going to need to do the same thing in the same two places, aren't we?
<dimitern> fwereade, I'll do it in this CL, not too much to do I think
<fwereade> dimitern, we'd just stop using an environ to do so, surely
<fwereade> dimitern, that's even better :)
<fwereade> thanks
 * fwereade quick lunch
<jam> dimitern: https://codereview.appspot.com/13964043/ looks pretty much the same as the one you set back to WIP and were going to resubmit. Did you mark the wrong one?
<jam> https://code.launchpad.net/~dimitern/juju-core/145-apiserver-provisioner-blank-secrets/+merge/187577 looks just like https://code.launchpad.net/~dimitern/juju-core/147-apiprovisioner-blank-env-secrets/+merge/187738
<jam> dimitern: maybe you meant to reject https://code.launchpad.net/~dimitern/juju-core/146-apiprovisioner-addresses/+merge/187719 ?
<dimitern> jam, no, it has almost the same description and diff, but different prereq
<gary_poster> TheMue, when you get back would like to know how https://bugs.launchpad.net/juju-core/+bug/1224568 is doing
<_mup_> Bug #1224568: Improve hook error reporting <juju-core:In Progress by themue> <https://launchpad.net/bugs/1224568>
<TheMue> gary_poster: it's almost done, one smaller CL is missing. after investigating frankban's problem i'll continue (tests are missing)
<gary_poster> awesome thanks TheMue @
<gary_poster> !
<TheMue> frankban: ping
<frankban> TheMue: pong
<TheMue> frankban: the boolean value, how is it configured?
<frankban> TheMue: I saw every boolean value set to False, both when they are true by default (in config.yaml) and when they are set to True using "juju set". Hope that answers your question
<TheMue> frankban: the setting makes me wonder, there has been a change in how nil values are handled when a default is set
<TheMue> frankban: the change happened with rev 1800
<frankban> TheMue: it is possible, I saw this problem in trunk, but it works as usual reverting to 1750
<frankban> TheMue: the bug includes instructions to dupe, I'd ensure this is not something wrong in my local configuration before investigating
<TheMue> frankban: so if 1799 would be ok and 1800 not we've got it ;)
<TheMue> frankban: the change has been to omit nil values if default is set. and this may be interpreted as false
<frankban> TheMue: the weird thing is that it seems the value is False in the hooks execution even when you explicitly set an option to true (and the default is false)
<TheMue> frankban: are you still on 1750 or back on trunk
<frankban> TheMue: 1750
<TheMue> frankban: the hook execution part is strange
<TheMue> frankban: take a look at http://bazaar.launchpad.net/~go-bot/juju-core/trunk/revision/1800, get.go line 52 (the rest are tests)
<jam> TheMue: so rev 1800 has "if option.Default != nil { info["value"] = option.Default" which seems to be the only change. Otherwise we leave value untouched.
<TheMue> frankban: yes, exactly
<TheMue> frankban: before that change the map contained the key "value", just with a nil value
<TheMue> frankban: so if a quick hack on your 1750 to behave here like 1800 shows the same errors, that shows it's a shitty CL :/
<frankban> TheMue: so you duped?
<TheMue> frankban: yes, i would revert it then
<TheMue> frankban: but it would help me if you make that quick hack test to be sure that this is the correct conclusion
<frankban> TheMue: are you sure the problem is there? AFAICT ServiceGet works correctly (the correct values are shown, e.g. in the GUI, and the GUI takes that information using the API)
<TheMue> frankban: no, i'm not sure, that's so far the only change i've found regarding config later than 1750
<fwereade> TheMue, there's another biiig one
<fwereade> TheMue, uniter working via API
<TheMue> frankban: so you see the correct values in GUI? fine
<frankban> TheMue: yes
<TheMue> frankban: ok, will investigate there (uniter)
<dimitern> fwereade, updated https://codereview.appspot.com/13963043
<fwereade> dimitern, cheers
<fwereade> dimitern, nice and clean, LGTM
<dimitern> fwereade, thanks
<fwereade> dimitern, remind me what else is on your plate after that one? the blanking?
<mgz> got a lead on our memory/tiny booting issues, bug 1227425 may be related
<_mup_> Bug #1227425: Cloud images do not need apt-xapian-index <bot-comment> <cloud-images-build> <ubuntu-cloud-images> <Ubuntu:New> <https://launchpad.net/bugs/1227425>
<fwereade> TheMue, ah-ha
<fwereade> TheMue, a true boolean is being reported to the uniter as ""
<dimitern> fwereade, I realized we no longer need StateAddresses() and APIAddresses() on agent.Config, so I'll remove these as well
<fwereade> dimitern, nice
<fwereade> dimitern, thanks
<TheMue> fwereade: i'm currently digging in the uniter
<TheMue> fwereade: where are you
<fwereade> TheMue, add a boolean to testing/repo/series/wordpress/config.yaml
<fwereade> TheMue, find the uses of assertYaml in uniter_test.go
<fwereade> shit
<fwereade> config data is getting squeezed through map[string]string and we didn't spot it because we didn't have tests involving non-string config settings at the sharp end
<rogpeppe1> a small MP that might speed up tests slightly: https://codereview.appspot.com/13968043/
<frankban> TheMue: revno 1800 works well fwiw. trying 1845 now
<TheMue> fwereade: testing it, just had to change something in my test code ;)
<TheMue> frankban: aha
<fwereade> TheMue, frankban, dimitern: state/apiserver/uniter/uniter.go:509
<fwereade> TheMue, frankban, dimitern: those are not relation settings and are most definitely not a map[string]string
<fwereade> TheMue, frankban, dimitern: this is critical
<frankban> so sval, _ := v.(string) is killing booleans?
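The one-liner frankban quotes is exactly the failure mode; a minimal standalone sketch (the settings map here is invented for illustration, not juju's real config):

```go
package main

import "fmt"

// flattenToStrings mimics the lossy conversion: a comma-ok type
// assertion silently replaces every non-string value with "".
func flattenToStrings(settings map[string]interface{}) map[string]string {
	out := make(map[string]string)
	for k, v := range settings {
		sval, _ := v.(string) // ok is false for bool/int, so sval stays ""
		out[k] = sval
	}
	return out
}

func main() {
	settings := map[string]interface{}{
		"blog-title": "hello", // strings survive
		"debug":      true,    // booleans are silently dropped to ""
		"port":       8080,    // ints too
	}
	fmt.Printf("%q\n", flattenToStrings(settings))
}
```

This is why a true boolean reaches the uniter as "": the value is present, but the comma-ok assertion throws its type away without any error.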
<dimitern> fwereade, hmm
<dimitern> fwereade, ok, so we need map[string]interface{} there?
<fwereade> dimitern, yeah
<TheMue> wow
<fwereade> dimitern, the confusing range of configgy/settingsy types with their selection of arbitrarily different rules is deeply depressing to me
<dimitern> fwereade, if it's only that, it's easy enough to fix the API
<fwereade> dimitern, bad luck for getting caught up in it (and I probably reviewed it too :/)
<fwereade> dimitern, I believe so
<fwereade> dimitern, we did release with the uniter api active, didn't we?
<dimitern> fwereade, we did
<fwereade> dimitern, still, upgrading the return type won't actually hurt
<fwereade> dimitern, or will it
<fwereade> dimitern, what happens if we try to deserialize a map[string]interface{} with mixed values into a map[string]string?
<dimitern> fwereade, it ignores non-strings?
<fwereade> dimitern, that'd be nice, and I think it might, but we should check
<dimitern> fwereade, I mean - non-strings get empty string values
<fwereade> dimitern, that would mean behaviour wouldn't change
<dimitern> fwereade, I can do a CL that changes the result of ConfigSettings() to params.ConfigResults (new type - like SettingsResults, but with params.Config instead)
<fwereade> dimitern, can we give them explicit ConfigSettingsResults and RelationSettingsResults names please?
<fwereade> dimitern, and name the types they use ConfigSettings and RelationSettings
<dimitern> fwereade, well, ConfigResult is used by the provisioner actually, for environ config result
<rogpeppe1> dimitern, fwereade, TheMue, natefinch, mgz, jam: environment file extension: anyone want to weigh in? https://codereview.appspot.com/13969043
<dimitern> fwereade, we can change these, but that means even more api incompatibility
<fwereade> dimitern, type names are arbitrary, aren't they? where's the incompatibility?
<fwereade> dimitern, field names are a problem
<dimitern> fwereade, protocol on-the-wire might change?
<dimitern> fwereade, or not, ok
<fwereade> dimitern, if they suck we just have to eat it up and hope we learn from our mistakes :)
<dimitern> fwereade, next CL will be about that then
<fwereade> dimitern, I think it's even more important than the secret-masking tbh
<fwereade> dimitern, this is a pretty devastating regression
<TheMue> rogpeppe1: reviewed
<dimitern> fwereade, I'm done with the provisioner for now - submitted the first for landing, the second one is next, and while waiting I'll tend to the uniter
 * fwereade throws flowers before dimitern's path
<natefinch> I like jenv because if we decide we don't like yaml anymore, we can put something else in there.  I do sorta have a hatred for prefixing things with j, just due to an inordinate amount of time exposed to java crap
 * natefinch isn't bitter though...
<dimitern> fwereade, (if we ask Captain Hindsight for advice it'll be:) we would've caught this if we had tests for non-string settings
<fwereade> thank you, Captain Hindsight!
<fwereade> dimitern, perfectly correct
<dimitern> fwereade, so I'll look about adding some
<fwereade> dimitern, stick to local unit tests for the bit you change, for now, please -- I consider this critical and don't want to release with it *again* ;p
<fwereade> dimitern, changing the uniter tests to exercise it may be noisy
<fwereade> dimitern, they must ofc be done but they'll delay landing the fix
<dimitern> fwereade, ok
<fwereade> dimitern, that said, hmm, how do we test in the api?
<smoser> hey
<smoser> looking at https://codereview.appspot.com/13962043/
<fwereade> dimitern, if we use wordpress' config settings
<smoser> rather than disabling certificate checking ...
<smoser> wouldn't it be better to add the certificates ?
<smoser> it seems juju would know them.
<smoser> cloud-init has config that explicitly allows adding certificates that should then be accepted.
<smoser> hazmat, ^ ?
<fwereade> jam, smoser makes an interesting point ^
<fwereade> dimitern, anyway: if we are using wordpress as the "standard" testing charm
<smoser> http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/doc/examples/cloud-config-ca-certs.txt
<dimitern> fwereade, I have some simple charms I can use
<fwereade> dimitern, we should probably just add all config types to it and so gently encourage people testing to actually check them all
<fwereade> dimitern, you may find that the uniter is tightly coiled around the fake wordpress charm
<fwereade> dimitern, but, eh, that's the next branch anyway, I'll stop distracting you
<fwereade> smoser, I think the encompassing issue may be that some clouds don't even have certs configured
<smoser> is that possible ?
<smoser> ignorance being exposed....
<smoser> but when i go to some https site with firefox
<smoser> it says "Hey, this doesn't look right. You want to get the certificate and trust it?"
<fwereade> smoser, I have only second-hand "knowledge", inferred from the conversations of those who know more than me
<smoser> can't juju client just do the "get the certificate" bit. and then launch instances with that.
<fwereade> mgz, IIRC you were doing ugly things to induce certificate errors recently -- did I misunderstand your saying you'd been removing the certs temporarily and things had still worked?
<dimitern> fwereade, the fix is done, testing now
<mgz> fwereade: jam had done testing along those lines, but only for the client side so far I think (as it's harder to screw up the certs on a booted node and check that works)
<jam> mgz, fwereade: I ssh'd into the node and messed up the certs for testing the patch I proposed.
<jam> fwereade, smoser: While I like adding the functionality to allow a new known cert, I don't think it has the same user impact
<jam> because digging up the cert and adding it to the config is far more complex than just shoving a "false" in there when you are testing.
<jam> so I'd be happy to add support for custom certs
<jam> but I think we still need the "disable" ability
<smoser> jam, not necessarily
<smoser> see my comment about firefox above
<smoser> firefox basically allows me to say 'false' for checking of that server. and it does the rest.
<smoser> i've actually done this once before on a project for explicitly this reason. i figured out how firefox does what it does... how it gets the certificate... and did that, then inserted that certificate.
<smoser> i do see the point about this being "testing" and that https is likely only used without certificates on "test" scenarios.
<rvba> mgz: just one question about the tag solution: if you upgrade a juju deployment that was created before we used the tags and then use a version of juju which uses the tags to filter out machines, your deployment will be broken.  What's the policy to solve that kind of upgrade problems in juju?
<mgz> hm, good question
<mgz> that would be the case with either solution
<rvba> True.
<mgz> we could use compat code that detects "hey, no tag named after our environment" and assumes the old behaviour of all machines being ours
<mgz> but that may not be the best way
<fwereade> rvba, mgz: we are getting closer to sanity for upgrades, but there's little so far
<rvba> mgz: that seems like the only solution
<fwereade> rvba, I was tending towards mgz's suggestion myself... it's bad but I don't see alternatives
<rvba> Well, another solution is to have juju detect that there is no tag, and then create it and attach all the nodes it knows about to it.
<mgz> we'd need to be doubly sure that destroy-environment *twice* wouldn't then go and delete all maas nodes anyway
<fwereade> rvba, mgz, tag only the machines that have instance ids assigned in state?
<mgz> because hey, the second time there's no tag named after our env, so everything must be ours, so wipe it...
<rvba> fwereade: yes
<rvba> mgz: the second time, no machine id will be stored, so no machine removed.
<fwereade> rvba, mgz: that can't happen automatically within the environment though
<mgz> it seems an easy enough disaster to avoid
<fwereade> rvba, mgz: yeah -- axw has a lot on his plate right now but he seems enthusiastic about doing the long-overdue upgrade stuff in the near future
<rvba> mgz: maybe the first solution (explicitly supporting the old behavior) is simpler after all.
<rvba> fwereade: out of curiosity, why doesn't juju itself keep track of the machines it owns?
<fwereade> rvba, it does -- but Destroy is entirely internal to the environment, which is itself expected to keep track of its own machines and differentiate between those in and out of the environment
<fwereade> rvba, it would indeed be possible to have written it such that juju had to specify all the instances it knew about
<dimitern> anyone seen this local provider error: http://paste.ubuntu.com/6159055/
<fwereade> rvba, but I think that would make it very hard for juju to effectively reap instances that it needed to itself
<dimitern> it used to work fine a week ago
<dimitern> loaded invalid environment configuration: storage-port: expected int, got 8040
<fwereade> dimitern, that looks kinda like an int has been inappropriately coerced to a string somewhere, doesn't it
<rvba> fwereade: I don't want to bother you with that, but I don't really understand.  If juju has the list of all the machines it owns, it can pass it to the environment when destroying it.
<dimitern> fwereade, it does
<rvba> But that's not the way it works now so we have to fix the MAAS provider anyway :).
<fwereade> rvba, if we start an instance but fail to record it against a machine, we want to automatically trash that instance
<rvba> fwereade: hum, I see.
<fwereade> rvba, I will try to make the situation clearer than it currently is in the writing-a-provider doc I'm working on
<rvba> Cool
<rogpeppe1> mgz: what's the status of the VPC-only bug?
<hazmat> mgz, if you read the bug report, it states in the description how to get enabled with that on an existing account
<hazmat> https://bugs.launchpad.net/juju-core/+bug/1221868
<_mup_> Bug #1221868: juju broken with ec2 and default vpc <juju-core:Confirmed for gz> <https://launchpad.net/bugs/1221868>
<hazmat> it took about 2 business days
<dimitern> fwereade, ping
<fwereade> dimitern, pong
<dimitern> fwereade, how do you suggest to live test that thing? so far I tried ec2 live testing and calling juju set svc flag=True, calls config-changed in a debug hooks session and config-get shows it as expected
<fwereade> dimitern, that sounds solid
<fwereade> dimitern, but that local provider thing is really alarming
<dimitern> fwereade, I'll check on trunk to see if it's my branch or it's broken
<fwereade> dimitern, thanks
<TheMue> ah, tests pass
<dimitern> fwereade, same effect in trunk
<natefinch> argh... couple of annoying bugs in goyaml... unmarshalling "" into a *string makes the string nil (not an empty string), and unmarshalling [] into a slice gives you a nil slice (not an empty slice).  PITA
<dimitern> fwereade, so the local provider was broken earlier
 * fwereade freaks out at dimitern but wants to chat to nate for a moment
<fwereade> natefinch, that's annoying
<natefinch> fwereade: yeah, we already had one workaround in constraints
<fwereade> natefinch, I'm sure there was a similar bug with goyaml in the past
<natefinch> fwereade: yeah, we had  to set up a whole SetYAML method because the containertype was getting unmarshaled as nil instead of empty.
<fwereade> natefinch, ouch -- do you know if there's a goyaml bug for that?
<dimitern> can someone else try bootstrapping a local environment from trunk and deploying anything, to see if all-machines.log shows this error http://paste.ubuntu.com/6159055/
<natefinch> fwereade: didn't look like it when I perused the bug list (only 13 bugs listed)
<dimitern> TheMue, rogpeppe1, jam, mgz  ^^ ?
<dimitern> and please make sure you did go install . in cmd/juju and jujud/, and use --upload-tools on bootstrap
<mgz> hazmat: thanks, I'm just not certain I want to do that on the shared bzr account, how disruptive was it for you?
<dimitern> fwereade, there's the fix https://codereview.appspot.com/13908044
<hazmat> mgz, seamless, just pick a region you're not using
<hazmat> mgz, you have to clear out ec2 resources in that region (ie no running instances, also good to clear out groups)
<mgz> ah, that does seem good
<hazmat> mgz, so i take it then there hasn't been any progress on this? we really need it for 1.16..
<hazmat> i ran into two users last week, who couldn't use juju on ec2..
<natefinch> fwereade: now there are bugs
<fwereade> natefinch, thanks
<dimitern> ok, so no one wants to try to reproduce the local provider issue, i'm filing a bug
<sinzui> fwereade, do you have a revision that you want to release as 1.15.0?
<fwereade> sinzui, I am very worried that I do not, because dimitern's problem seems pretty critical to me
<sinzui> fwereade, okay. That's fine. Is there a bug I can track
<fwereade> sinzui, dimitern is filing it as we speak
<sinzui> fab. Thank you.
<dimitern> fwereade, sinzui: there it is bug 1231543
<_mup_> Bug #1231543: upgrader startup failure with local provider <juju-core:New> <https://launchpad.net/bugs/1231543>
<sinzui> Thank you dimitern
<fwereade> dimitern, would you please mark that critical and start investigating? TheMue, are you on something else or can you assist reproing?
<dimitern> fwereade, it's filed as critical
<dimitern> fwereade, and I'm looking at it
<dimitern> fwereade, the uniter fix is proposed already
<fwereade> dimitern, you anticipate my micromanagement with aplomb and panache
<fwereade> dimitern, I'm about to LGTM it I think
<fwereade> dimitern, yep, LGTM, just one tweak needed
<dimitern> fwereade, ok, will tend to it afterwards
<TheMue> fwereade: can do tomorrow morning, have to reactivate the matching VM (not enough space anymore on disk)
<TheMue> fwereade: currently I'm fighting with a called but non-existing constructor *sigh*
 * TheMue still will propose now, so the changes can be reviewed
 * fwereade is taking a short family break but will return anon
<TheMue> shit, propose will not work with the missing function :(
<TheMue> dimitern: i'll start to setup my testing vm now
<TheMue> dimitern: will you note any findings in the issue so that i can support you after setup later?
<TheMue> cu later
<dimitern> TheMue, so far I've tested that it happens in trunk and r1885, will go further
<rogpeppe1> dimitern, mgz, jam, natefinch: next stage in environment info storage, reviews appreciated please: https://codereview.appspot.com/13970043
<rogpeppe1> fwereade: ^
<rogpeppe1> dimitern: ping
<dimitern> ok, so it doesn't happen as far as r1844, going back up
<dimitern> rogpeppe1, pong
<dimitern> rogpeppe1, I'm up to my elbows in the local provider atm
<rogpeppe1> dimitern: i'm just wondering about API connections and how they can find the API addresses to store locally
<dimitern> rogpeppe1, expand a bit please
<rogpeppe1> dimitern: so, the plan is that when we make an API connection, we find out the current set of API addresses and store that locally in a .jenv file
<dimitern> rogpeppe1, how about if they change after that?
<rogpeppe1> dimitern: we refresh the cache each time we connect
<rogpeppe1> dimitern: and fall back to environ config info if the connection fails
<dimitern> rogpeppe1, sgtm
<rogpeppe1> dimitern: but we need to find out the current set of API addresses so we can store them
<rogpeppe1> dimitern: and i'm thinking of an API call that's available to anyone that can access the API that returns them
<jam> rogpeppe1: it could be returned from Login
<dimitern> rogpeppe1, so like a Login call
<rogpeppe1> jam: that's an interesting idea
<rogpeppe1> jam: i quite like that actually.
<rogpeppe1> jam: then api.Open can cache it, so it can be retrieved by a later call
<rogpeppe1> jam: so we don't have to change the type sig
<jam> something like that, yeah
<rogpeppe1> ah, there's a problem, i think
<rogpeppe1> jam: i *think* that State.APIAddresses just returns the same IP addresses that mongo peers use to talk to each other
<rogpeppe1> jam: which probably won't be public IP addresses
<dimitern> rogpeppe1, they aren't
<rogpeppe1> damn. i guess i'll need to fix that first
<dimitern> rogpeppe1, but with the addresser stuff coming up it might not be needed
<dimitern> machine addressability
<rogpeppe1> dimitern: go on... how does that help?
<dimitern> rogpeppe1, machines will know their own addresses (public, private, all)
<rogpeppe1> dimitern: go on
<dimitern> rogpeppe1, and you can query state for them, and there will be a worker to update them as needed
<dimitern> rogpeppe1, mgz has been working on that for some time, I think
<rogpeppe1> dimitern: so to find the API addresses, you do a search for all machines with JobManageState, then query their addresses?
<dimitern> rogpeppe1, yes
<dimitern> rogpeppe1, and for other potential new jobs we have
<rogpeppe1> dimitern: that seems somewhat inefficient. wouldn't it be a linear scan?
<dimitern> rogpeppe1, who needs to know?
<rogpeppe1> dimitern: it'll happen every time someone connects to the API
<dimitern> rogpeppe1, and currently it happens through the StateInfo
<rogpeppe1> dimitern: i was thinking that we'd have a doc in mongo which held the API addresses, then some agent would maintain that
<dimitern> rogpeppe1, that might be an addition to the addressability stuff, or even orthogonal to it
<rogpeppe1> dimitern: i think it's orthogonal, yes
<rogpeppe1> hmm, how does a machine's public address get filled in now? by the provisioner, i guess
<mgz> rogpeppe1: that's the idea
<mgz> not sure what you mean by "linear scan" though
<rogpeppe1> mgz: well, if i want to find out the addresses of all machines that are state servers, how should i do it?
<dimitern> rogpeppe1, not really
<dimitern> rogpeppe1, the unit's addresses are set by the uniter, but the machine addresses are taken from the environment
<mgz> query out machines that have the stateserver bit set in mongo, and pull the address?
<dimitern> rogpeppe1, by the provisioner, but it doesn't set them anywhere yet
<rogpeppe1> mgz: won't that be a linear scan through all machines?
<mgz> having a separate table with addresses of state servers doesn't *sound* faster to me
<mgz> but is also perfectly possible, it's just a denormalisation
<dimitern> fwereade, I found the culprit - the issue in bug 1231543 starts to happen in r1877
<_mup_> Bug #1231543: upgrader startup failure with local provider <juju-core:New> <https://launchpad.net/bugs/1231543>
<rogpeppe1> mgz: to me it sounds like one fetch of a document in a single document collection, versus a scan through potentially many hundreds
<rogpeppe1> mgz: but... i think that for the time being it's probably fine
<rogpeppe1> mgz: storing the addresses separately is an optimisation really.
<rogpeppe1> dimitern: hmm, so the uniter API has PublicAddress and SetPublicAddress. is there any particular reason for that?
<dimitern> rogpeppe1, the uniter sets these on startup
<rogpeppe1> dimitern: what i mean is: why have the PublicAddress method if it's only there to pass its result to SetPublicAddress?
<rogpeppe1> dimitern: (which also gives a compromised uniter the potential freedom to muck with its reported public address, something you probably don't want)
<dimitern> rogpeppe1, the uniter needs both to set public/private addresses of a unit, and to read them
<rogpeppe1> dimitern: why's that?
<dimitern> rogpeppe1, the addresses shouldn't be on a unit at all - they should be on a machine, but that's that
<rogpeppe1> dimitern: i'm wondering about an API call, say Start, which informs the API that the uniter has started
<dimitern> rogpeppe1, because public-address is one of the relation settings set automatically when entering scope for example
<rogpeppe1> dimitern: ah, good point, so we need PublicAddress
<dimitern> rogpeppe1, the API very well knows when the unit agent connects, and starts a pinger now
<rogpeppe1> dimitern: in that case, that's probably the moment that the public and private addresses should be set
<dimitern> rogpeppe1, perhaps, if we're not using a separate worker for that
<dimitern> rogpeppe1, and setting them on the machine, not on the unit
<rogpeppe1> dimitern: yeah
<rogpeppe1> dimitern: but the point is that we could remove that stuff from ModeInit, i think
<rogpeppe1> dimitern: hmm, except not right now of course
<rogpeppe1> dimitern: because it really does get the public address from the provider
<rogpeppe1> dimitern: ok, ignore my stupidity
<mgz> I've added an explanation to bug 1227533 about our memory woes the last week
<_mup_> Bug #1227533: Juju fails to bootstrap if memory is lower than 1GB <juju-core:Triaged> <https://launchpad.net/bugs/1227533>
<mgz> now I must depart, farewell!
<rogpeppe1> mgz: one mo, please?
<mgz> one mo while I close things :)
<dimitern> rogpeppe1, there's a todo about it in mode init
<rogpeppe1> mgz: kapil was asking about the status of the VPC-only bug...
<rogpeppe1> dimitern: yeah, i understand that now :-)
<dimitern> rogpeppe1, ...and a few other places, and there's the tech-debt bug 1205371
<_mup_> Bug #1205371: state.Addresses and APIAddresses need better implementation <juju-core:In Progress by gz> <https://launchpad.net/bugs/1205371>
<rogpeppe1> dimitern: hmm, so there's no way of finding out a machine's public address currently unless it has a unit on it?
<mgz> rogpeppe1: it's the next on my list, but haven't started yet, saw his comments earlier
<rogpeppe1> mgz: ok, cool
<mgz> will tackle the registration stuff at least tomorrow
<mgz> okay, now must fly
 * dimitern is totally puzzled how r1877 could lead to that local provider issue
<rogpeppe1> i'd love a review of  https://codereview.appspot.com/13970043/ if anyone has a little time
<natefinch> rogpeppe1: I can take that
<rogpeppe1> natefinch: ta muchly
<fwereade> dimitern, thanks, I will meditate upon 1877
<fwereade> dimitern, "The simplestreams tools metadata includes a sha256..."?
<natefinch> rogpeppe1: what's the difference between 	 done := make(chan struct{})
<natefinch>  go func() { info.BootstrapConfig(); done <- struct{}{} }()
<natefinch> <-done
<natefinch> and just calling info.BootstrapConfig() in the current goroutine?  They both just block waiting for bootstrapconfig to finish, right?
<rogpeppe1> natefinch: ha, there is a subtle difference, but it's just a debugging remnant
<rogpeppe1> natefinch: i'll revert it
<rogpeppe1> natefinch: 2 points if you can tell me why i did it :-)
<natefinch> rogpeppe1: if you had a panic in bootstrap config it would make the call stack a lot shorter
<rogpeppe1> natefinch: close
<natefinch> rogpeppe1: could be something to do with the scheduler, but that seems too subtle to matter
<rogpeppe1> natefinch: nah
<rogpeppe1> natefinch: it's to do with gocheck
<rogpeppe1> natefinch: if you panic, then gocheck catches it and distorts things
<rogpeppe1> natefinch: so by panicking in a goroutine you get a much cleaner idea of what's going on at that moment
<natefinch> ahh ok
<natefinch> rogpeppe1: I presume you'll take out the log messages in there as well
<rogpeppe1> natefinch: yes
<natefinch> k
<natefinch> rogpeppe1: btw, is "erewhemos" someone misspelling "somewhere" backwards, or something that actually makes more sense?
<rogpeppe1> natefinch: the former :-)
<dimitern> rogpeppe1, sweet! i'll remember that trick next time i'm fighting test panics
<natefinch> rogpeppe1: ha, ok.  I thought so, but you never know
<rogpeppe1> natefinch: just a nonsense name that's unlikely to be confused with anything in the production code
<fwereade> natefinch, I'm sorry about that, there was a satirical work by samuel butler called "erewhon" which is not *quite* "nowhere" backwards
<fwereade> natefinch, it seemed like a good idea at the time
<dimitern> fwereade, yes that's what i found so far
<fwereade> dimitern, just to be crystal clear: 1876 works, 1877 does not?
<rogpeppe1> fwereade: we're in the *dystopia* right?
<natefinch> fwereade: haha, ok. not up on my Victorian authors
<fwereade> rogpeppe1, heh
<dimitern> fwereade, that's what I see, but I'll double check, just a minute
<dimitern> fwereade, indeed
<dimitern> fwereade, and the error now makes sense 2013-09-26 17:48:00 ERROR juju runner.go:211 worker: exited "upgrader": cannot set agent tools for machine 0: empty size or checksum
<dimitern> fwereade, but, interestingly the coercing error is not there in 1877
<rogpeppe1> natefinch: still waiting for that review, BTW :-)
<natefinch> rogpeppe1: still doing it. Had to stop in the middle for a little bit.  Almost done :)
<rogpeppe1> natefinch: np
<dimitern> fwereade, so the other error starts to show in my r1884 that switches to api provisioner
<rogpeppe1> fwereade: do you know what stage mgz is at with the addressing stuff?
<natefinch> rogpeppe1: done
<rogpeppe1> fwereade: i just started hacking up the publisher/addresser worker, then realised that he might already have done/nearly done it
<rogpeppe1> natefinch: thanks
<fwereade> rogpeppe1, I'm afraid I do not actually know, I was kinda expecting a CL from him today
<rogpeppe1> fwereade: i need that, or something like it, to cache the API addresses
<fwereade> dimitern, ah ok
<fwereade> dimitern, so the upgrader thing appears to be a problem
<rogpeppe1> fwereade: this is the sketch of the code i just wrote: http://paste.ubuntu.com/6159815/
<dimitern> fwereade, yeah
<rogpeppe1> fwereade: oops, this is better: http://paste.ubuntu.com/6159817/
<fwereade> dimitern, I thought all we were meant to be setting was a version, not a whole tools
<dimitern> fwereade, and the other thing - it doesn't seem to be an int coerced to a string, it's an int - I've debugged far enough to say the provisionerAPI returns the correct map[string]interface{} in worker/WaitForEnviron
<fwereade> dimitern, oh, ffs, is it possibly a json problem? definitely an int and not a float?
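One plausible mechanism for that suspicion: JSON has no integer type, so any number that round-trips through encoding/json into an interface{} comes back as float64, and a strict "expected int" check then rejects a value that still prints as 8040. A small sketch (the config key mirrors the error message, the rest is hypothetical):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip serializes a value and decodes it back generically, the way
// config tends to travel over a JSON-based API.
func roundTrip(v map[string]interface{}) map[string]interface{} {
	wire, _ := json.Marshal(v)
	var got map[string]interface{}
	json.Unmarshal(wire, &got)
	return got
}

func main() {
	got := roundTrip(map[string]interface{}{"storage-port": 8040})
	v := got["storage-port"]
	// The value survives, but its Go type does not: int in, float64 out.
	fmt.Printf("%v %T\n", v, v) // 8040 float64
}
```

So an "expected int, got 8040" error is consistent with the value arriving intact but typed as float64 after deserialization.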
<fwereade> rogpeppe1, sorry, I have only skim-read it, but I think it may well have overlap
<dimitern> fwereade, trying to see exactly what now
<rogpeppe1> fwereade: yeah, if he's doing an addresser worker, it almost certainly will
<rogpeppe1> fwereade: well, i'll keep it around in case
<dimitern> any idea why this error? ERROR juju.provider.local environ.go:482 could not install machine agent service: exec ["start" "juju-agent-dimitern-local"]: exit status 1 (start: Job is already running: juju-agent-dimitern-local)
<rogpeppe1> time to stop for the day
<fwereade> dimitern, aw hell, that really should be fixed for 1.16 too, we don't seem to shut down local envs cleanly
<dimitern> fwereade, hmm - we *are* stopping them, but the upstart job remained and it thought "because it's there, it must be running"
<fwereade> dimitern, looks like we're calling StopAndRemove though
<dimitern> fwereade, hmm.. it gets deeper
<dimitern> fwereade, so now the upstart job hangs
<dimitern> fwereade, that's why the bootstrap doesn't complete and I terminated it
<rogpeppe1> g'night all
<rogpeppe1> might be back later, actually
<fwereade> rogpeppe1, see you soon
<fwereade> dimitern, "cannot install, already running" seems to imply that it really was running
<fwereade> dimitern, and was thus not properly cleaned up
<dimitern> fwereade, believe me, ps xa | grep juju was the first thing I did - no results, even as root
<dimitern> fwereade, just the upstart job was there
<fwereade> dimitern, very strange
<dimitern> fwereade, so the mongo hangs at bootstrap
<dimitern> fwereade, and that fails the whole thing
<dimitern> fwereade, it's indeed running now, and the error is correct
<fwereade> dimitern, ok, so we have *some* sort of poorly characterized local provider cleanup problem
<dimitern> fwereade, and even upstart believes jujud job is running
<dimitern> fwereade, and I can't see it
<fwereade> dimitern, and a clear current issue: that we're recording full agent tools including hashes for no clear reason, when all we really care about is the binary version they're running
<fwereade> dimitern, concur?
<dimitern> fwereade, not sure I get you there
<fwereade> dimitern, so the problem seems to be that we're setting *tools* on the agent, rather than just setting the binary version which is all anyone cares about AFAIK
<fwereade> dimitern, and we can't set tools because we didn't record the hash we downloaded and verified
<fwereade> dimitern, and it seems a bit pointless to report it back to juju when juju told it to us in the first place
<dimitern> fwereade, yes, that seems likely
<dimitern> fwereade, I have to stop though.. lest my head explodes :/
<fwereade> dimitern, no worries at all, you are already above and beyond
<fwereade> dimitern, is there a specific bug for the tools issue?
<dimitern> fwereade, don't know
<dimitern> fwereade, I added the one for the upgrader, but this seems unrelated
<fwereade> dimitern, the upgrader was what I meant by the tools issue
<dimitern> fwereade, bs, actually the upgrader error is about tools, the other errors were different
<dimitern> fwereade, :)
<fwereade> dimitern, I think there is one for screwy local-env destruction
<dimitern> fwereade, maybe
<rogpeppe1> fwereade: the point of setting tools on the agent was so that it was possible to make available that information in the status, so you could know exactly what s/w was running on each machine
<fwereade> rogpeppe1, ok, so we *should* have to record and write into the tools dirs the hashes of the original tarballs?
<rogpeppe1> fwereade: yes
<fwereade> rogpeppe1, I don't really see how that helps anyone
<rogpeppe1> fwereade: when debugging stuff it means you have an unambiguous record of what is being run where, which i *think* could be very useful at times
<rogpeppe1> fwereade: for reproducibility and diagnosis of difficult issues in a highly distributed environment
<rogpeppe1> fwereade: and i don't really see why it should be a hard thing to do, though i haven't read through the discussion above, so i don't know what the current issue is
<fwereade> rogpeppe1, it looks like we're barfing when calling SetAgentTools because the tools in state now demand a hash
<rogpeppe1> fwereade: and you can't have a Tools with an empty hash?
<fwereade> rogpeppe1, apparently not
<fwereade> rogpeppe1, it seems to be demanding that if there's a URL, there must be a size and checksum
<rogpeppe1> fwereade: oh yes, checkToolsValidity
<fwereade> rogpeppe1, but not barfing if there's no URL
<fwereade> rogpeppe1, when I *thought* we always wrote a URL
<fwereade> rogpeppe1, but ofc do not necessarily have the original tgz available and so can't always manage size/hash
<fwereade> rogpeppe1, (not that we do, even when we do, AFAIK -- maybe that changed somewhere?)
<rogpeppe> fwereade: sorry, computer just crashed
<rogpeppe> fwereade: last thing i saw was "it seems to be demanding that if there's a URL, there must be a size and checksum"
<natefinch> sigh.... goyaml doesn't differentiate between nil slices and empty slices :/
<wallyworld> fwereade: hiya, saw the email about the error, i can take a look
<fwereade> wallyworld, tyvm
<wallyworld> any clues to get me started? i see a few comments in the bug
<wallyworld> could it be related to the env split up?
<thumper> grr
<thumper> I have the upgrader constantly bouncing
<thumper> any one else noticed?
<thumper> wallyworld: fwereade: ??? http://paste.ubuntu.com/6160651/
<fwereade> thumper, https://bugs.launchpad.net/juju-core/+bug/1231543
<_mup_> Bug #1231543: upgrader startup failure with local provider <regression> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1231543>
<fwereade> thumper, wallyworld is looking at it now; dimitern has, I think, stopped
<wallyworld> thumper: that error looks like tools checksum is failing to be calculated
<thumper> kk
<thumper> I'm trying to chase the lxc issues
<wallyworld> fwereade: thumper's error message mentions checksums, whereas bug says something about ports
<fwereade> wallyworld, that is also a problem
<wallyworld> yeah, so 2 issues \o/
<fwereade> wallyworld, but the tools checksum is easier to get a handle on and isolate
<wallyworld> the tools one is my fault
<wallyworld> if i can't easily find it i can just disable the checksum check for now
<fwereade> wallyworld, so do we now write out size/sha256 into the tools dir when we unbundle?
<wallyworld> we do
<wallyworld> but for some reason the checksum is not getting passed down the api
<fwereade> wallyworld, I bet we just miss it in the local provider then
<fwereade> wallyworld, or is it happening everywhere?
<wallyworld> it could be that the tools are being read from the old place which means no checksum
<fwereade> wallyworld, although, hmm, yeah exactly
<wallyworld> fwereade: i tested bootstrapping on ec2, hp etc with the new stuff and it works
<fwereade> wallyworld, I'm a little sceptical about the value of recording all that in state anyway
<fwereade> wallyworld, cool
<wallyworld> fwereade: we recorded the url in state, from which a tools struct is made. and that tools struct is used to find a tools tarball. so it needs the checksum
<fwereade> wallyworld, we only ever call SetAgentTools in code that has already been extracted from the tarball in question
<wallyworld> fwereade: i'll have to re-read the code - what do we use the agent tools stored in state for? the tools info from SetAgentTools?
<fwereade> wallyworld, not much
<thumper> fwereade: we should get around to fixing the tools for the local provider
<wallyworld> so i could drop the checksum requirement. i thought it was needed somewhere, can't recall though
<fwereade> wallyworld, that said, minimal changes good, I am not encouraging you to rewrite and would most favour a simple tweak to the local provider that made sure it wrote its tools dir properly
<thumper> rather than the upload-tools malarky we do now
<fwereade> thumper, oh, god, yes we should
<thumper> fwereade: however I'm not sure what the best way is
<fwereade> thumper, I'm quite sure we can harmonise it with all the simplestreams stuff
<thumper> I hope so
<wallyworld> fwereade: when you say "not much" - is there a simple explanation of why we store the tools url and version in state?
<fwereade> wallyworld, the version we need for status
<fwereade> wallyworld, series is duplicated, a machine should already know its own series
<wallyworld> why the url?
<fwereade> wallyworld, and for that matter arch should always be in hardware characteristics too
<wallyworld> do we ever use the url to fetch tools?
<wallyworld> if not, i can drop the need for insisting on checksum
<fwereade> wallyworld, I was asking rogpeppe -- I hope I am not mischaracterising him to say that it's there just in case it turns out to be useful one day
<fwereade> wallyworld, SetAgentTools is, as far as I'm aware, purely a record of what the agent reports itself to be running
<wallyworld> well
<fwereade> wallyworld, url and checksum and size are not, I think, exposed anywhere
<wallyworld> not sure i agree with recording all that extra info just to report a version
<fwereade> wallyworld, all that detail in (once) state.Tools would have been great if we'd ever stored an environment's available tools in state
<wallyworld> fwereade: would you object if i zero out url and checksum in set agent tools
<wallyworld> if we have a url and not the checksum, that is not something we should encourage
<fwereade> wallyworld, because then we could just grab the tools for a particular machine with a trivial query, get the url and size and checksum, and hand them straight over
<fwereade> wallyworld, well
<wallyworld> or i could find out why checksum is missing
<fwereade> wallyworld, the url really just indicates "this is where we got them from"
<wallyworld> ok, i'll see how it pans out. for the release, where we need something done, it may just be easier to drop the mandatory checksum requirement
<wallyworld> and fix next week
<fwereade> wallyworld, indeed, if that's what it comes to then so be it
<wallyworld> cause the other issue sounds more tricky
<thumper> wallyworld, fwereade: I'll look at the port int issue
<fwereade> thumper, <3
<thumper> wallyworld: if you want to tackle the checksum thing
<wallyworld> yes indeed
<thumper> heh, interesting,
<wallyworld> fwereade: i'm also part way through ripping out all legacy tools support - that will need to be landed after 1.15 when all clouds have had simplestreams tools uploaded by the release team
<thumper> I can see from the rpc logging that the value is being sent through as an int
 * thumper digs
<thumper> what the actual fuck...
<fwereade> wallyworld, awesome news
<fwereade> thumper, that sounds less awesome
 * thumper just digging
 * wallyworld needs a coffee
<wallyworld> thumper: how do i reproduce  your issue?
<thumper> wallyworld: all I did was bootstrap the local provider
<wallyworld> ok
<thumper> I did try to deploy some things
<thumper> before I checked the logs
<thumper> so not entirely sure
<wallyworld> np thanks
<thumper> but I feel just bootstrap is enough
<thumper> I also feel that my problem may be shadowing yours
<wallyworld> should be easy to find then hopefully
<thumper> so you might not get yours fixed
<thumper> until mine is
<wallyworld> let's find out
<thumper> hmm...
<thumper> I think I know what it is, but it is weird
<thumper> and not sure why it hasn't broken before this
<thumper> if it is what I think it is
 * fwereade wants to watch, but is going to bed instead
<fwereade> gn all
<wallyworld> fwereade: night
<wallyworld> thumper: i found the spot where SetAgentTools was passing in incomplete tools
<thumper> wallyworld: cool, I've found out where the validate is failing, but unsure as to why
<wallyworld> but i'm not sure i have the size and checksum info at that point to pass in also
<wallyworld> it really is just passing in a version wrapped in a tools struct which seems silly
<wallyworld> thumper: ah, actually i think when local provider starts up, the tools hack it uses might not be recording the checksum etc, so when that info is read back later, it is missing
 * wallyworld is guessing
<thumper> how do I get the type of something printed out?
<wallyworld> %T
<wallyworld> fmt.Println("%T", thing)
<wallyworld> Printf
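The exchange above can be collapsed into one runnable snippet: `%T` prints a value's dynamic type, and it has to go through `Printf`/`Sprintf` (as corrected above), since `Println` would print the verb literally:

```go
package main

import "fmt"

// typeName returns the dynamic type of a value, which is what the
// %T verb prints.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	fmt.Println(typeName(8040))          // int
	fmt.Println(typeName(float64(8040))) // float64
}
```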
<thumper> stabby!!!!!!!!!!!!!!!!!!1
<thumper> error used to be: storage-port: expected int, got 8040
<thumper> added type info
<thumper> guess what?
<thumper> storage-port: expected int, got float64(8040)
<thumper> this is why it is failing
<thumper> FFS
<thumper> is it because json serialization only has float64?
<thumper> how do we fix this in a non-sucky way?
 * thumper wonders how the api port is handled
 * thumper digs
<thumper> stabby stabby
<thumper> the difference is:
<thumper> schema.Int
<thumper> vs
<thumper> schema.ForceInt
<thumper> guess which is which?
<thumper> huh?
<thumper> I changed it, now I get a panic
<wallyworld> thumper: you need a custom json demarshaller i think
<wallyworld> for the struct
<thumper> no, found it
<thumper> you wouldn't believe it if I told you
<thumper> well, you might
<thumper> schema.Int -> int64
<thumper> schema.ForceInt -> int
<wallyworld> wtf
<thumper> ok, that fixes it
<wallyworld> \o/
<wallyworld> thumper: save me some time - can you point me to where the local provider does its tools hacky thing to find the tools to bundle
<thumper> it does the default --upload-tools bit
<thumper> what do you mean exactly?
<wallyworld> for some reason, the tools struct passed to bootstrap is (i think) missing the checksum info
<wallyworld> i need to find out how that is happening
<wallyworld> just working backwards to find it
<thumper> probably the possible tools created by the upload-tools stuff
<thumper> at a guess
 * thumper proposes a couple of branches
<wallyworld> thumper: found it, fixed, testing
<thumper> https://codereview.appspot.com/14005043/  is just logging tweaks
<wallyworld> the local environ did not implement CustomToolsSource interface
<wallyworld> so it did not find tools using simplestreams, and defaulted to legacy
<wallyworld> which means no checksums
<thumper> https://codereview.appspot.com/14006043 is the fix for the config
<thumper> ah
 * thumper goes to set commit messages in prep
 * thumper waits for review
<thumper> almost time for lunch
<wallyworld> thumper: done with one comment
<thumper> added a little context
<thumper> wallyworld: the new test failed with the same expected error output to the log file
<thumper> changed the schema, and all good \o/
<wallyworld> yay
<wallyworld> thumper: i'll be proposing a fix soon, maybe you can look after lunch
<thumper> ok
 * thumper is heading into town to lunch with veebers
<thumper> wallyworld: once you review the actual fix, you can approve it
<thumper> I'm hoping you won't find any issue
 * thumper -> lunch
<wallyworld> ok
#juju-dev 2013-09-27
<wallyworld> davecheney: if you are free, i'd love a review for a critical 1.15 release fix https://codereview.appspot.com/14011043
 * davecheney looks
<wallyworld> thanks :-)
<davecheney> wallyworld: LGTM. not a lot to complain about there
<wallyworld> \o/
<davecheney> if you want, i *can* complain
<davecheney> i have had a lot of practice
<wallyworld> i really wish we had CI testing 'cause these things would have shown up sooner
<davecheney> does anyone know what 'provider state lookup' is ?
<thumper> davecheney: I thought ` quoted strings were raw?
<thumper> but it is complaining about \
<thumper> what is the difference between ` and " ?
<davecheney> ` is raw
<davecheney> so
<davecheney> `"tim"` == "\"tim\""
<davecheney> what is 'it' ?
<thumper> a fuck sticks
<thumper> there is a mix of ` and " in the file for error matches
<davecheney> https://bugs.launchpad.net/charms/+source/haproxy/+bug/1231768
<_mup_> Bug #1231768: config-changed: fails when global_maxconns is not set <haproxy (Juju Charms Collection):Triaged> <https://launchpad.net/bugs/1231768>
<davecheney> what has gone on here ?
 * thumper shrugs
<thumper> this is a bug about all bool options being false, perhaps related?
<thumper> wallyworld: I'll send an email to the list about the local provider issues
<wallyworld> thumper: ok. did you notice one of your branches had a few small test issues?
<thumper> yeah, fixed and resubmitted
<wallyworld> thumper: i'm trying to get legacy tools removal finished. i think we should land that because legacy tools does not pick up checksums and running from trunk now uses simplestreams anyway
<thumper> wallyworld: how does this impact the local provider?
<wallyworld> the local provider uses simplestreams now that i have landed a previous fix
<wallyworld> it was failing for you because it was falling back to legacy
<wallyworld> hence no checksums
<wallyworld> so removing legacy will catch the error earlier
<wallyworld> s/the/any
<thumper> ok
<axw> wallyworld: null-provider is getting "cannot set agent tools for machine 0: empty size or checksum"  -- is this related to not implementing SupportsCustomSources?
<wallyworld> axw: yes. it needs to do what i just did for local provider
<axw> I just added the GetToolsSources method to nullEnviron, but I get that still
<axw> wallyworld: thanks, I'll see what else you did
<wallyworld> what did you return
<axw> wallyworld: same as local
<wallyworld> that should have worked then
<wallyworld> it will have the checksum error if it can't find tools via simplestreams
<wallyworld> are tools uploaded?
<wallyworld> is this in a test?
<axw> wallyworld: simplestreams metadata has been uploaded
<axw> wallyworld: in a live test against my VM
<wallyworld> if you run with debug and you see a message saying can't find tools, using legacy (or something like that) then it is not finding the tools metadata
<axw> wallyworld: nothing about legacy, just this:
<axw> 2013-09-27 03:09:57 ERROR juju runner.go:211 worker: exited "upgrader": cannot set agent tools for machine 0: empty size or checksum
<wallyworld> that will happen as a result of not finding tools metadata. or, it could be a Tools{} struct is being constructed somewhere without Size and Checksum being set
<axw> wallyworld: I guess it's not finding it, because what's there does have the size and checksum
<axw> I'll keep digging
<wallyworld> run with debug and paste the output if you like
<wallyworld> are you sure the tools have been put in the right place?
<wallyworld> bbiab
 * thumper upgrades to saucy to look for lxc issues
<thumper> I'm having trouble replicating some of the bugs with raring
<davecheney> thumper: god's speed
<axw> wallyworld: ah, I think I know the problem. It's because of the sftp:// url ;)
<davecheney> i've finally got this laptop stable after the random reboot o rama of precise and quantal
<davecheney> there is no way i'm upgrading again
<thumper> heh
<wallyworld> axw: ah ok.
<axw> thumper: given this issue that's just popped up, and me remember that Monday is a holiday, one of my tasks probably does want to go red
<axw> remembering*
<wallyworld> davecheney: i forgot to add public bucket to ec2 before https://codereview.appspot.com/14021043
<thumper> 2.4 gig to download to upgrade
<thumper> 18 packages are going to be removed. 232 new packages are going to be  installed. 1943 packages are going to be upgraded.
 * thumper goes to make a coffee
<davecheney> wallyworld: LGTM, trivial
<davecheney> fire at will
<wallyworld> thanks :-)
<davecheney> o.O
<davecheney> 2.4 G to download ?
<davecheney> i don't think precise occupied 2.4 Gb on disk
<thumper> davecheney: raring
<davecheney> wow
<davecheney> DOWNLOAD ALL THE THINGS!
<davecheney> can anyone help me with this issue
<davecheney> p1/0:config-changed % config-get --format json global_maxconn
<davecheney> ""
<davecheney> ^ inside the unit the default value is not being reported
<davecheney> this value defaults to 4096 in the config.yaml
<davecheney> but is being returned as "" in the hook context
<rogpeppe> mornin' all
<davecheney> mornning rogpeppe
<davecheney> any idea on the question above ?
<davecheney> it's blocking me atm
<rogpeppe> davecheney: just looking
<davecheney> i'm redoing the environment
<rogpeppe> davecheney: i wonder if this is related to the recent problems with boolean in a hook context
<davecheney> i REALLY hope this isn't a transient failure
<davecheney> rogpeppe: we copy the config settings into the state don't we ?
<davecheney> well, obviously
<davecheney> that is how juju get works
<davecheney> and get reports the value correctly, and understands its got a default, and is an int
<rogpeppe> davecheney: this stuff has changed not too long ago and i'm not familiar with the code any more
<davecheney> oh poo
<davecheney> change is bad
<rogpeppe> davecheney: how can i get sudo to use my normal environment vars?
<rogpeppe> davecheney: scratch that, found it
<davecheney> -E from memory
<davecheney> probably not right
<rogpeppe> davecheney: yeah, but it doesn't preserve $PATH. crappy frickin' thing. a recipe for unexpected behaviour
<rogpeppe> davecheney: so is this *really* the hoop one needs to jump through to bootstrap the local provider? sudo -E sh -c "export PATH=$PATH; juju bootstrap -e local --debug"
<rogpeppe> axw: ^
<davecheney> rogpeppe: juju will already be in the path
<davecheney> unless you're a developer
<davecheney> which sucks for you
<rogpeppe> davecheney: true
<davecheney> ahh
<davecheney> now I understand your comment on unexpected behavior
<davecheney> and why none of you guys test the juju ppa's :)
<rogpeppe> davecheney: and non devs won't need GOROOT set either, i suppose
<axw> rogpeppe: ?
<axw> rogpeppe: sudo `which juju` ...
<rogpeppe> axw: you've been using the local provider quite a bit
<davecheney> rogpeppe: NOBODY needs $GOROOT set
<davecheney> but lets not get distracted
<rogpeppe> axw: that doesn't work for me
<fwereade> davecheney, https://bugs.launchpad.net/juju-core/+bug/1231457
<_mup_> Bug #1231457: uniter API discards non-string config settings <juju-core:Fix Committed by dimitern> <https://launchpad.net/bugs/1231457>
<rogpeppe> davecheney: ahem
<axw> rogpeppe: what's the problem?
<rogpeppe> davecheney: yeah, it's just path
<axw> I mean, what's the outcome
<rogpeppe> axw: i bootstrapped it ok - i just found it... unintuitive
<davecheney> fwereade: shit, i'm using yesterdays rev
<davecheney> well, that is a day I won't get back
<rogpeppe> axw: if you've got a version of juju that isn't at /usr/bin/juju, it fails
<axw> rogpeppe: yeah, it's not great. IIANM, future versions of Ubuntu/lxc/upstart will allow us to have local without root
<rogpeppe> axw: that would definitely be an improvement!
<fwereade> davecheney, shit, bad luck :(
<davecheney> fwereade: checking now
<davecheney> thanks for the answer
<davecheney> i take it that the cli doesn't use the api in the same way the unit agent does
<fwereade> davecheney, the CLI still hardly uses the api
<fwereade> davecheney, but even when it does there will be approximately no overlap with the agent API
<davecheney> that would explain the disparity
<fwereade> davecheney, yeah -- and particularly in terms of config settings the results are very different *anyway*
<fwereade> davecheney, because ServiceGet or whatever it is returns a great ugly pile of human-targeted bs like "description" and whether or not a value is the default
<davecheney> fwereade: oh fuck, don't mention default
<davecheney> i'll cry
<fwereade> davecheney, when really it should be a charm url + basic config settings, and the client can dress that up however he wants
<rogpeppe> fwereade: do you have anything to say about https://codereview.appspot.com/13969043/ before i approve it?
<fwereade> rogpeppe, I still don't love it but I'm not going to fight it, because an extension *will* be useful if those files are floating around freely, and it'll be better to have all such files with a consistent convention
<rogpeppe> fwereade: well, i'm very happy to go with any alternative suggestion
<fwereade> rogpeppe, if you're just thinking of the actual choice of extension I'm fine with .jenv... can't see anything else obviously using it
<rogpeppe> fwereade: yeah, that's what i was thinking of
<rogpeppe> fwereade: ok, cool
<rogpeppe> fwereade: you'd prefer no extension at all?
<fwereade> rogpeppe, only slightly, and I don't have strong arguments to justify it
<fwereade> rogpeppe, whereas your arguments in favour of an extension do seem pretty strong
<rogpeppe> fwereade: ok, cool
<rogpeppe> fwereade: i'll approve then
<rogpeppe> fairly simple review, anyone? https://codereview.appspot.com/14028043
<rogpeppe> dimitern, fwereade, axw, TheMue, wallyworld: ^
<TheMue> rogpeppe: looking
<rogpeppe> TheMue: thanks
<rogpeppe> axw: thanks for the review
<axw> nps
<wallyworld> fwereade: i have to do a few things now, but can we talk about the release etc later
<TheMue> rogpeppe: reviewed
<fwereade> wallyworld, sure
<fwereade> wallyworld, pre-standup?
<wallyworld> fwereade: maybe, depends when i'm back. i have a branch which removes all legacy tools. there are about 6 upgrader tests commented out because of a slight change in tools finding logic (it stops looking earlier than legacy) and the tools in private hide the ones in public
<wallyworld> i could out that up for a pre-review if you want
<wallyworld> put
<wallyworld> i had to rename all our fake charms from "series" to "quantal"
<wallyworld> since simplestreams does not allow fake series
<wallyworld> and no test machine will be running quantal
<wallyworld> so there's a lot of one line changes all over the place
<wallyworld> fwereade: so i might propose it with the upgrader tests mentioned above commented out and come back later to talk about it
<wallyworld> fwereade: https://codereview.appspot.com/14031043/   i'll come back later, i may miss the standup but do want to discuss the release
<TheMue> fwereade, rogpeppe, dimitern: one not too complex review, https://codereview.appspot.com/14030043/
<rogpeppe> TheMue: why do we need to keep the old Status method around?
<rogpeppe> TheMue: can't you just return some more data from Status (like you added the data param to SetStatus) ?
<TheMue> rogpeppe: only want to do this in two CLs
<TheMue> rogpeppe: just keep this one small
<dimitern> TheMue, looking
<rogpeppe> TheMue: i don't think a few extra underscores add much to a review burden, but fair enough i guess
<dimitern> TheMue, reviewed
<TheMue> dimitern: thx, just seen
<TheMue> dimitern: maybe I should really change Status immediately ;)
<axw> if anyone has cycles, could I please get a review on this? https://codereview.appspot.com/14011046/
<axw> null provider is broken without it.
<axw> fwereade: I've POCd some changes to wire up prechecker; let me know if it's too evil ;)  https://codereview.appspot.com/14032043
<axw> fwereade: only thing I'm not really sure about is the modification to cmd/jujud. it's a new dependency, but it seems like the setup of State does belong here
<rogpeppe> wallyworld: any particular reason you changed "series" to "quantal" in your MP?
<fwereade> axw, cheers, I'll take a look
<rogpeppe> axw: i think your changes there are a good reason for us to avoid optional environs interfaces.
<rogpeppe> axw: but it may well be too late for that.
<rogpeppe> mgz: ping
<rogpeppe> axw: i'm wondering about bootstrap storage
<rogpeppe> axw: would it be possible for a provider to know if it has already been bootstrapped and therefore whether it's appropriate to use bootstrap storage or not?
<rogpeppe> axw: then we could lose all externally visible BootstrapStorage stuff
<mgz> rogpeppe: hey
<rogpeppe> mgz: could we have a chat about the addressing stuff
<rogpeppe> ?
<mgz> yup, before or after standup?
<rogpeppe> mgz: how about now, same hangout?
<mgz> I'm there
<fwereade> aand I'm back through no logical means that I can discern
<axw> rogpeppe: I couldn't think of a good way to do that before, but I'd certainly prefer that if possible
<axw> going out now - have a nice weekend
<dimitern> axw, have a good one as well
<fwereade> gaah
<TheMue> rogpeppe, dimitern, fwereade: to be sure, please take another look: https://codereview.appspot.com/14030043/
<fwereade> TheMue, reviewed
<TheMue> fwereade: thanks
<dimitern> rogpeppe, I'm trying to add a test to prove state.Open fails for unit agents
<dimitern> rogpeppe, the problem is it always fails, even when I call SetMongoPassword on the unit in the test
<dimitern> rogpeppe, mongo passwords are not hashed, right? Whatever I pass in SetMongoPassword should be it, right?
<dimitern> rogpeppe, I want to make the test fail reliably, so when I remove SetMongoPassword on the unit it will succeed
<rogpeppe> dimitern: how does it fail?
<rogpeppe> dimitern: perhaps you could paste the test code?
<dimitern> rogpeppe, It doesn't fail, that's the problem
<dimitern> rogpeppe, here it is http://paste.ubuntu.com/6162450/
<dimitern> rogpeppe, and I have unit.SetMongoPassword(initialUnitPassword) in primeAgent
<rogpeppe> dimitern: you said "the problem is it always fails" ... then "it doesn't fail"... which one?
<rogpeppe> dimitern: so that test fails, because it succeeds in connecting to the state?
<dimitern> rogpeppe, the test case, as written should pass when I *didn't* set a mongo password
<dimitern> rogpeppe, but it should fail if I did, and it passes in both cases
<natefinch> it fails by not failing
<rogpeppe> not failing is passing
<rogpeppe> oh no, that's wrong
<mgz> fail test is failing
<dimitern> c'mon :)
<dimitern> maybe I'm not explaining it well enough
<natefinch> if at first you succeed, try again until you fail
<rogpeppe> dimitern: so you're saying that that test successfully connects to the state, even though the mongo password isn't set (supposedly) ?
<dimitern> if only..
<rogpeppe> dimitern: perhaps you should paste the output from the test too...
<dimitern> rogpeppe, no, I'm saying the test (as written) always passes (i.e. fails to connect to state), whether I set a mongo password or not in primeAgent
<natefinch> this is like watching Who's On First :)
<rogpeppe> something whizzes over roger's head
<dimitern> rogpeppe, does that make more sense?
<rogpeppe> dimitern: that would imply that a unit agent could never connect to the state...
<dimitern> rogpeppe, the agent is not even running yet
<dimitern> rogpeppe, that's step 2
<dimitern> rogpeppe, at step 1 here, I want to make sure I'm setting a mongo password and with it I can connect to state as that unit
<dimitern> rogpeppe, and when I remove the password setting it'll fail to connect
<dimitern> rogpeppe, istm state connections are not based on tag+password, just a password?
<rogpeppe> dimitern: that does seem a bit odd
<dimitern> rogpeppe, how can we tell who's connecting?
<rogpeppe> dimitern: no, they're based on tag and password
<dimitern> rogpeppe, ah, so maybe I'm missing Tag = unit.Tag() in state.Info
<rogpeppe> dimitern: maybe so
<rogpeppe> dimitern: you'll be setting the admin password if tag is blank
<dimitern> rogpeppe, I see
<dimitern> rogpeppe, now it works! :)
<rogpeppe> great!
<dimitern> rogpeppe, what's an admin password and what it is used for?
<rogpeppe> dimitern: that's what the juju clients use
<rogpeppe> dimitern: it's guarded by admin-secret
<rogpeppe> dimitern: we should be able to delete that account entirely in time, probably
<dimitern> rogpeppe, I see, ok
<dimitern> rogpeppe, thanks
<fwereade> dimitern, rogpeppe: any chance we could drop the ""->"user-admin" business? it doesn't seem like a win to me
<rogpeppe> fwereade: i'd love to, but given that it's going away and has compatibility issues, i think it's probably not worth changing at this point.
<rogpeppe> s/has comp/changing it has comp/
<fwereade> rogpeppe, how's it going away? that would entirely satisfy me :)
<rogpeppe> fwereade: because nobody needs to connect to mongo with the admin password, in the future
<fwereade> rogpeppe, oh! it's mongo-only is it? sorry, I see
<rogpeppe> fwereade: yea
<rogpeppe> h
<fwereade> rogpeppe, heh, we *really* want UUID in environ, don't we
<rogpeppe> fwereade: context?
<fwereade> rogpeppe, advising people not to use environment name as a unique identifier when writing environs
<fwereade> rogpeppe, I think they might reasonably ask wtf else they're expected to use
<rogpeppe> fwereade: ha, yes
<mgz> but the name is so handy...
<fwereade> mgz, for users, yes
<mgz> (the main advantage being anywhere the value resurfaces, it's easy to see at a glance what juju thing it's related to)
<fwereade> mgz, internally, for distinguishing between environments? it's hopeless
<fwereade> mgz, using it as such leads to great pain when developers use shared provider credentials
<mgz> right, I do have sympath with that
<mgz> +y
<mgz> I just suffer from dealing with uuids in nova daily
<fwereade> :)
<mgz> is this one a volume or a machine... did I repaste the right thing to the commandline
<fwereade> ha, yeah
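The UUID-in-environ idea fwereade is arguing for needs nothing more than a random 128-bit identifier. A minimal sketch of a version 4 UUID generator using only the standard library — juju's eventual implementation is not reproduced here, this is just the shape of the thing:

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUID returns a random (version 4) UUID string -- the kind of
// unambiguous environment identifier being argued for above.
func newUUID() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, _ := newUUID()
	fmt.Println(id)
}
```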
<TheMue> fwereade: https://codereview.appspot.com/14030043/ looks better now
<dimitern> rogpeppe, fwereade: so right now, because of the firewaller  we need to enable state workers for JobManageEnviron as well
<fwereade> dimitern, yep, sounds right
<dimitern> fwereade, ok
<fwereade> hey, cool, that *was* the red arrows who flew across the bay with fancy coloured smoke, must be practising for the airshow tomorrow
<TheMue> fwereade: send pics *lol*
<gary_poster> are compilation complaints in trunk goyaml and juju-core expected, or is this indicative of something wrong on my local saucy system?  http://pastebin.ubuntu.com/6162806/
<fwereade> gary_poster, that looks more like goamz to me, but I wasn't aware of changes there
<fwereade> gary_poster, and also goose
<TheMue> fwereade, gary_poster: i had the goose one too. clearing all 3rd party pkg and src and re-getting them helped
<gary_poster> fwereade, TheMue thanks, was just thinking of doing the same thing
<TheMue> fwereade: btw, could you find a moment to take a look into the reproposed CL
<TheMue> ?
<fwereade> TheMue, I'm churning through a big one from ian before he goes to sleep
<TheMue> fwereade: ok, makes sense
<dimitern> rogpeppe, fwereade: why do we have only a PasswordHash() method on agent.Config, but not the actual password?
<rogpeppe> dimitern: because we never store the actual plaintext
<rogpeppe> dimitern: um
<dimitern> rogpeppe, well I need it for testing now :)
<rogpeppe> dimitern: hmm, actually i'm talking rubbish
<dimitern> rogpeppe, how can I access the password that gets changed after opening state the first time?
<rogpeppe> dimitern: one mo, i'll have a look
<rogpeppe> dimitern: it seems to me that there's no reason
<rogpeppe> dimitern: it could just return the password
<dimitern> rogpeppe, you mean rename agent.Config.PasswordHash() to Password() and return the plain test?
<dimitern> text
<rogpeppe> dimitern: i think so, yeah
<gary_poster> TheMue, fwereade killing all third party and rebuilding did the trick, thanks
<dimitern> rogpeppe, it's actually used only in 2 places in tests only
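The point of a `PasswordHash()`-only accessor is that a one-way hash cannot be turned back into the plaintext the test needs. A sketch of that property; the hashing scheme here (unsalted SHA-512, base64, truncated) is an assumption for illustration, not juju's actual algorithm:

```go
package main

import (
	"crypto/sha512"
	"encoding/base64"
	"fmt"
)

// passwordHash is a stand-in for the one-way hash behind something
// like agent.Config.PasswordHash(); juju's real scheme (salting,
// iteration count) is not reproduced here.
func passwordHash(password string) string {
	sum := sha512.Sum512([]byte(password))
	return base64.StdEncoding.EncodeToString(sum[:])[:24]
}

func main() {
	// Deterministic, so it can be compared -- but the plaintext
	// cannot be recovered from the stored value.
	fmt.Println(passwordHash("sekrit") == passwordHash("sekrit")) // true
	fmt.Println(passwordHash("sekrit") == passwordHash("other"))  // false
}
```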
<wallyworld> fwereade: i just noticed that the hp cloud image metadata on cloud-images has a trailing / for the auth url whereas our boilerplate does not. so bootstrapping hp cloud fails. i could add that change into  the tools branch, or else a new branch will be required
<fwereade> wallyworld, well I'd really prefer a separate branch
<wallyworld> ok
<TheMue> gary_poster: yw
<fwereade> rogpeppe, ISTM that the only things that use Instances are the firewaller and the ssh command (which should be getting machine addresses)
<fwereade> mgz, speaking of which, is there anything I can do to support you in the addresses stuff?
<mgz> fwereade: rog and I are making pretty good progress here I think
<rogpeppe> fwereade: me and mgz are rattling it off as we speak...
<fwereade> mgz, I am primarily talking emotional support to be fair, but let me know if there's something practical I can do
<fwereade> mgz, rogpeppe: <3
<rogpeppe> fwereade: the forthcoming worker/addresspublisher will use Instances
<dimitern> rogpeppe, can you take a look at this please http://paste.ubuntu.com/6162906/, more specifically line 9
<dimitern> rogpeppe, how will this ever work?
<fwereade> rogpeppe, is Instances likely to move onto InstanceBroker then?
<fwereade> rogpeppe, that would be convenient :)
<rogpeppe> fwereade: not as far as i was aware
<dimitern> rogpeppe, and what's the original intent?
<rogpeppe> fwereade: at the moment we're thinking of the address publisher as something that runs independently of the provisioner.
<fwereade> rogpeppe, ah ok, I guess it's not part of the provisioner
<fwereade> rogpeppe, jinx :)
<rogpeppe> fwereade: hold on, i'll give you a sneak preview if you like
<rogpeppe> fwereade: then we can get some feedback
<fwereade> rogpeppe, cool
<rogpeppe> fwereade:
<fwereade> rogpeppe, I should do TheMue's first though
<rogpeppe> package addresspublisher
<rogpeppe> import (
<rogpeppe> 	"fmt"
<rogpeppe> 	stdtesting "testing"
<rogpeppe> 	"time"
<rogpeppe> 	gc "launchpad.net/gocheck"
<rogpeppe> 	"launchpad.net/juju-core/instance"
<rogpeppe> 	"launchpad.net/juju-core/state"
<rogpeppe> 	coretesting "launchpad.net/juju-core/testing"
<rogpeppe> 	jc "launchpad.net/juju-core/testing/checkers"
<rogpeppe> 	"launchpad.net/juju-core/testing/testbase"
<rogpeppe> )
<rogpeppe> func TestPackage(t *stdtesting.T) {
<rogpeppe> 	gc.TestingT(t)
<rogpeppe> }
<rogpeppe> var _ = gc.Suite(&publisherSuite{})
<rogpeppe> type publisherSuite struct {
<dimitern> oh nooo :)
<rogpeppe> 	testbase.LoggingSuite
<rogpeppe> }
 * fwereade goes off for a 5 minute break? ;p
<rogpeppe> var testAddrs = []instance.Address{instance.NewAddress("127.0.0.1")}
<rogpeppe> func (*publisherSuite) TestSetsAddressInitially(c *gc.C) {
<rogpeppe> 	ctxt := &testMachineContext{
<rogpeppe> 		getAddresses: addressesGetter(c, "i1234", testAddrs, nil),
<rogpeppe> 		dyingc:       make(chan struct{}),
<natefinch> lol
<rogpeppe> 	}
<rogpeppe> 	m := &testMachine{
<rogpeppe> 		instanceId: "i1234",
<rogpeppe> 		refresh:    func() error { return nil },
<rogpeppe> 		life:       state.Alive,
<rogpeppe> 	}
<rogpeppe> 	died := make(chan machine)
<rogpeppe> 	publisherc := make(chan machineAddress, 100)
<mgz> paste error :)
<rogpeppe> 	// Change the poll intervals to be short, so that we know
 * dimitern ducks for cover
<rogpeppe> 	// that we've polled (probably) at least a few times while we're
<rogpeppe> 	// waiting for anything to be sent on publisherc.
<natefinch> abort abort!
<rogpeppe> 	defer testbase.PatchValue(&shortPoll, coretesting.ShortWait/10).Restore()
<rogpeppe> 	defer testbase.PatchValue(&longPoll, coretesting.ShortWait/10).Restore()
<mgz> who can kick? :)
<rogpeppe> 	go runMachine(ctxt, m, nil, died, publisherc)
<rogpeppe> 	select {
<rogpeppe> 	case addr := <-publisherc:
<rogpeppe> 		c.Assert(addr.addresses, gc.DeepEquals, testAddrs)
<rogpeppe> 		c.Assert(addr.machine, gc.Equals, m)
<rogpeppe> 	case <-time.After(coretesting.LongWait):
<rogpeppe> 		c.Fatalf("publisher never published the expected address")
<rogpeppe> 	}
<rogpeppe> 	select {
<rogpeppe> ah, sorry about that
<natefinch> haha
<rogpeppe> fwereade: http://paste.ubuntu.com/6162921/
<rogpeppe> at least i know now that reconnecting aborts
<natefinch> wow, weird, goyaml recovers from panics inside a type's GetYAML and SetYAML and returns them as errors... didn't see that one coming.
<natefinch> I guess they don't return an error value, so that's sort of all you can do.
<rogpeppe> natefinch: they should definitely return errors. it's a design flaw that they don't
<rogpeppe> dimitern: what's the issue you have with that paste?
<rogpeppe> dimitern: it looks roughly plausible to me
<rogpeppe> dimitern: we connect using the password, falling back to the old password if the password isn't set or we can't connect with it
<rogpeppe> mgz: i'm gonna grab some lunch too
<natefinch> rogpeppe: I'm like || this close to rewriting the whole stupid thing.  I already looked around for alternatives and didn't see anything. The other yaml parsers just build up a document tree in memory... which is like, sorta useless by itself
<dimitern> rogpeppe, we construct an empty state.Info, filling in some stuff, not setting the password, and then immediately after that we check if the password is set
<abentley> sinzui: I've filed RT #1224057 asking for a VPN.
<dimitern> rogpeppe, ah, sorry, I'm blind
<dimitern> :)
<sinzui> thank you abentley
<fwereade> rogpeppe, that's looking pretty sane to me
<fwereade> TheMue, that sorta LGTM -- I think you need *some* more testing of Status() returns
<fwereade> TheMue, but if you agree that a followup putting the 3 vars into a struct would be a good idea, you can defer most of that work until then
<fwereade> dimitern, opinion? ^^
<fwereade> TheMue, this would make me happy because if you landed it now then gary_poster would have status-data in 1.15
<dimitern> fwereade, looking
<gary_poster> yay :-)
<dimitern> fwereade, TheMue, I like the idea of a SetStatusParams struct
<dimitern> fwereade, TheMue, and definitely +1 on more specific tests for statusData
<dimitern> fwereade, TheMue, I'm ok with doing these in a follow-up though, if it's soon
<TheMue> fwereade: yep, we've already talked about the struct, so it would follow
<wallyworld> fwereade: ok, merge is done. so release good to go hopefully
<fwereade> wallyworld, <3
<wallyworld> i smoke tested on ec2 and hp. but that was only using dev tools
<fwereade> TheMue, if you can just fill in all the Status-specific unit/machine unit tests that's good enough for this CL
<wallyworld> hopefully will be ok with tools pre-uploaded
<fwereade> dimitern, TheMue: I was thinking of just a Status struct in state tbh, I am agnostic re api/params
<fwereade> wallyworld, indeed so :)
<wallyworld> actually, i did test with tools already uploaded to priavte hp cloud bucket and it was ok
<fwereade> dimitern, TheMue: but actually yeah it would go in params wouldn't it
<TheMue> fwereade: could you rephrase "just fill in all the ..."
<wallyworld> but we need to smoke test with tools in various public locations, especially ec2 which has not had simplestreams before
<fwereade> TheMue, in unit_test.go and machine_test.go, find any tests that are testing Status/SetStatus directly, and make sure they're not using _
<TheMue> fwereade: ah, ok
<wallyworld> fwereade: so you ok to liaise with sinzui wrt the release?
<fwereade> TheMue, anything else -- eg if there's a test that SetStatus doesn't cause a watcher change -- you can leave
<fwereade> wallyworld, I think so
<sinzui> yep. I just need a rev and any late additions to the release notes
<wallyworld> i'll check in again in a few hours after some sleep to make sure its all ok
<fwereade> sinzui, you'll have a rev very soon, and I can give you one at any time you're blocking on us
<wallyworld> no additions to the release notes that i am aware of
<sinzui> fwereade, I am not blocked
<fwereade> wallyworld, yeah, no worthwhile distinction between "we use simplestreams" and "we use simplestreams and don't fall back to the original"
<wallyworld> nah
<fwereade> sinzui, cool, please let us know if you are in danger of becoming so
<wallyworld> should be transparent
<sinzui> ack
<natefinch> fwereade, rogpeppe: I'm removing the --yaml flag from juju get-constraints, since obviously, it won't work
<fwereade> natefinch, SGTM
<natefinch> fwereade: handily, the panics I put in GetYAML and SetYAML cause the tests that checked for yaml support to fail, so it was easy to find them.
<rogpeppe> natefinch: you could always do "go panic(...)"... :-)
<fwereade> natefinch, nice :)
<natefinch> rogpeppe: go panic is not a bad idea
<rogpeppe> fwereade: cool, glad to know we're not too far off track
<TheMue> defer func()  { panic("ouch")  }()
<dimitern> aaarghh!
<dimitern> wtf is this about Waiting for sockets to die: 1 in use, 1 alive
<dimitern> I can see in the logs that the state connection gets closed
<dimitern> how come there are sockets open? and how can I see who's to blame?
<dimitern> rogpeppe, ^^
<TheMue> fwereade: proposal is in again, so i could land it now
<rogpeppe> dimitern: there's probably another one that hasn't been closed
<fwereade> TheMue, LGTM, just a couple of stray lines to remove
<TheMue> fwereade: oh, yes, forgot them
<dimitern> rogpeppe, I figured that much, but it's really hard to debug so many layers
<rogpeppe> dimitern: i tend to add some log statements to show where state instances are being allocated, and by whom
<natefinch> fwereade, mgz:  updated code review with some more tests and code to disable yaml serialization of constraints: https://codereview.appspot.com/13802045
<dimitern> rogpeppe, that's what I'm doing - adding stack traces to state.Open and Close, using your debug package btw
<mgz> natefinch: ace
<rogpeppe> dimitern: i'm a bit involved currently, but could take a look later
<dimitern> rogpeppe, whew.. found the culprit
<rogpeppe> dimitern: \o/
<natefinch> fwereade: there was something Tim mentioned in the meeting yesterday about testing something on MaaS.... but in banging my head against goyaml for a day, I've forgotten what he wanted
<TheMue> fwereade: the eagle has landed
<fwereade> natefinch, ghhh not sure exactly -- almost certainly, though, it's making sure we can run containers there
<fwereade> natefinch, but in terms of *specifically* what might be a problem? I don't know
<natefinch> fwereade: ok, thanks.  I've done some initial testing with containers in MaaS  (specifically --constraints container=lxc) and it seems to work fine. I'll ping him and see if he needed more than that
<fwereade> natefinch, have a go at `juju add-machine lxc:0`
<fwereade> natefinch, actually no that's crack
<fwereade> natefinch, would you start 2 machines, each running containers; put a related unit in each and check they can actually communicate
<natefinch> fwereade: sure
<fwereade> natefinch, awesome
 * fwereade just smacked a mosquito that had *just* been snacking on him
<fwereade> ahh, that was satisfying :)
<natefinch> ha nice
<mgz> snacking on it would have been even more appropriate
<natefinch> significantly more disturbing, however
<TheMue> so, have to step out. our local public festival starts today :)
<TheMue> have a nice weekend
<fwereade> TheMue, tyvm
<fwereade> TheMue, have fun
<TheMue> fwereade: thx, will have
<fwereade> natefinch, so, a horrible thought...  would it work, hateful as it is, if we used *[]string? would that let us dance through the hoops?
<fwereade> natefinch, ie a nil pointer is "not set", non-nil pointer to nil (or empty) slice is "cleared"?
<natefinch> fwereade: maybe. I'm not convinced that goyaml will treat the pointer any differently than a non-pointer slice... but I can give it a quick test.
<fwereade> natefinch, thanks
<natefinch> fwereade: that works, actually.  I'm not sure if that makes me happy or not
<fwereade> natefinch, haha
<natefinch> fwereade: so, update the Tags to be a *[]string?  Doable, just... ug.
<fwereade> natefinch, it's one of those humdrum workaday development tradeoffs -- working code at the cost of just a little piece of your soul ;p
<natefinch> haha yeah
<fwereade> natefinch, I don't see a better way to do it I'm afraid
<natefinch> fwereade: yep, you're right.
<mgz> the "just string" argument is stronger and stronger :)
<natefinch> mgz: :p
<dimitern> fwereade, rogpeppe, finally I'm almost done with the disabling of state access to agents, I'd appreciate if both of you have a look when I propose it shortly
<fwereade> dimitern, will do soon
<rogpeppe> fwereade, dimitern, natefinch: start of the addresspublisher package: https://codereview.appspot.com/14038045/
 * rogpeppe has reached the weekend
<rogpeppe> dimitern: sorry, i won't be able to review your branch this evening. i will shortly be drinking copious quantities of BEER.
<rogpeppe> happy weekends all
<dimitern> rogpeppe, also good :)
<natefinch> rogpeppe: have fun :)
<dimitern> fwereade, well, it was a struggle, but I made it https://codereview.appspot.com/14036045
<fwereade> dimitern, cool, looking
<fwereade> dimitern, ok, I think it may be a bit of a struggle to finish this tonight
<dimitern> fwereade, no rush, I can hardly look at it anymore anyway :)
<dimitern> fwereade, it's probably not very elegant, but tests pass at least
<dimitern> fwereade, and I suspect some live testing is required, which I haven't done yet
<fwereade> dimitern, most assuredly so :)
<dimitern> fwereade, as they say in kerbal space program, I'll have to pull a Jebediah :)
<fwereade> dimitern, haha
<dimitern> fwereade, I've been watching some amazing videos past few days
<dimitern> fwereade, I'll play some myself over the weekend
<fwereade> dimitern, http://mlkshk.com/p/UBFM
<fwereade> dimitern, not KSP but pretty cool
<natefinch> fwereade: reproposed with pointers to lists now
<natefinch> fwereade: https://codereview.appspot.com/13802045/
<fwereade> natefinch, on it, thanks
<dimitern> awesome!
<dimitern> fwereade, take a look at this when you have 20 mins http://www.youtube.com/watch?v=oqYKXo1gbc8
<fwereade> dimitern, cheers
<fwereade> natefinch, I think that LGTM
<fwereade> natefinch, but I think I had a question about maas setting hardware characteristics in an earlier review -- did you find anything out about that?
<natefinch> fwereade: maas doesn't set anything automatically.   You can create tags that have a definition that match against the hardware characteristics returned from a syscall, and it'll get automatically applied to the machines that match
<fwereade> natefinch, it's not a blocker here indeed
<fwereade> natefinch, but we will need a followup that gets the tags on the maas instance that gets picked, and puts them into the HardwareCharacteristics return value from StartInstance
<natefinch> fwereade: ahh yeah
<fwereade> natefinch, and you'll need to do something with it in state.Unit.findCleanMachineQuery  as well to actually match it
<fwereade> natefinch, I'm assigning you to https://bugs.launchpad.net/juju-core/+bug/1161919 which I think is already fixed, but which as it happens covers that bit quite neatly
<_mup_> Bug #1161919: unused machine assignment ignores constraints <juju-core:Triaged> <https://launchpad.net/bugs/1161919>
<fwereade> natefinch, but you'll be in a position to check it in detail and probably just mark it fix released:)
<natefinch> fwereade: cool
<fwereade> natefinch, tyvm for pushing that
<fwereade> natefinch, did you get a chance to fiddle with containers on maas and check sanity?
<natefinch> fwereade: no problem.  I started to... actually most of the way to it - deployed two services in separate containers, just need to try the relation
<natefinch> arg
<natefinch> fwereade: once again, someone has shut down my maas host
<fwereade> natefinch, bleh, that sucks
<natefinch> fwereade: last time it royally screwed up the environment
<natefinch> fwereade: I'll wait for it to come back up and poke at it, and if it's all gone to hell like last time, I'll complain to red squad.
<natefinch> fwereade: actually, I'll complain to red squad regardless
<fwereade> natefinch, sgtm :)
<natefinch> smoser: someone powered off my maas host machine :/
<smoser> hm..
<smoser> i'm still up
<smoser> let me check
<smoser> natefinch, fwiw, i blame juju for this.
<natefinch> smoser: rofl
<natefinch> smoser: I suppose it's possible it's juju's fault. However, my guess is that it was someone doing something they weren't supposed to do.
<smoser> https://bugs.launchpad.net/ubuntu/+source/maas/+bug/1229275
<_mup_> Bug #1229275: juju destroy-environment also destroys nodes that are not controlled by juju <juju:Confirmed> <juju-core:Triaged> <maas (Ubuntu):Triaged> <https://launchpad.net/bugs/1229275>
<smoser> i suspect you are a victim of that.
<natefinch> smoser: ahh, hmm... I guess I'm surprised juju doesn't just go through its list of machines and dispose of the ones it knows about
<smoser> yeah, wouldn't that make sense?
<smoser> natefinch, fwiw...
<fwereade> smoser, natefinch: it would indeed make sense, sadly the maas provider never did anything to mark instances with their environ and nobody spotted it until now
<smoser> fwereade, i filed a similar bug over a year ago i think.
<smoser> pyjuju probably.
<fwereade> smoser, yeah, we need to look at pyjuju and see how it solves it
<smoser> oh, it doesn't.
<smoser> :)
<smoser> bug 1081247
<_mup_> Bug #1081247: maas provider releases all nodes it did not allocate [does not play well with others] <juju:New> <MAAS:Invalid> <https://launchpad.net/bugs/1081247>
<fwereade> smoser, lol
<fwereade> smoser, we're bug-compatible!
 * fwereade sighs gently
<natefinch> nice
<smoser> read that bug for a workaround.
<smoser> its not as bad as it seems.
<smoser> since you can have multiple maas keys, you can have multiple juju environments.
<smoser> at least i hope juju-core is consistent with that feature of pyjuju
<fwereade> smoser, a quick read reveals no reason it would not be
<smoser> its a fun bug to find.
<smoser> you deploy soething with juju
<smoser> and then you deploy another node.
<smoser> (without juju)
<smoser> and then juju shoots your other node
<smoser> and you say WTH! i swear i deployed a node.
<smoser> and you do it again
<smoser> and repeat
<fwereade> smoser, ouch
<fwereade> smoser, pyjuju also had sort of the opposite bug
<fwereade> smoser, someone had an old environment lying around and they spotted a couple of unused instances, so they killed them
<fwereade> smoser, but didn't terminate the machines in juju
<fwereade> smoser, so they kept popping up and would not go away ;)
<fwereade> smoser, we fixed that at least ;)
<smoser> i dont know which id' rather lose.
<smoser> data or money
<smoser> :)
<fwereade> haha
<fwereade> natefinch, uh-uh, is the bot broken?
<fwereade> natefinch, gaah commit message
<fwereade> natefinch, set one on the tags branch
<fwereade> natefinch, feel free to do the same for 015-typos after this one's landed, but that might want a fresh look, it's a bit old
<fwereade> natefinch, OTOH I remember it and, yeah, it's just typos :)
<sinzui> fwereade, is r1902 a sensible release candidate?
<natefinch> fwereade: yeah, really just was typos... though roger suggested a wording change... still, it's just a string
<fwereade> sinzui, it would be ideal to wait for r1903 which we approved a while ago... and then forgot to set the commit message on, but the bot should be on it now
<sinzui> fwereade, okay. I am not pressing anyone to rush and make mistakes. Once I know the revision, it will take a short drive of it before feeding into my scripts
<fwereade> sinzui, sensible
<fwereade> sinzui, it will depend upon simplestreams data in ec2, hpcloud, canonistack
<sinzui> yep. I will upload that before actually announcing it
<natefinch> fwereade: you put a comment on the maas tags branch ,so is that going in now?
<fwereade> natefinch, I hope so
<natefinch> fwereade: thanks, btw
<fwereade> natefinch, no worries, thank you :)
<fwereade> natefinch, sinzui: ok, great, merged at revision 1903
<sinzui> fab. Thank you fwereade , et al.
<fwereade> natefinch, please add a comment to the release notes :)
<natefinch> fwereade: I'd be happy to.... where are they?
<fwereade> dimitern, that video was *awesome*
<dimitern> fwereade, it's only 23 out of like 30 something like that from the same guy - from basically scratch to building amazing infrastructure
<natefinch> dimitern: is that a video game?
<natefinch> dimitern: I didn't recognize it, but I'm pretty far out of the loop
<dimitern> natefinch, yes, it's "an early alpha" but completely playable like a sandbox simulation, there's a demo as well
<natefinch> dimitern: very cool
<dimitern> natefinch, check it out some time, it's worth it, if a bit frustratingly realistic
<natefinch> dimitern: heh... what's it called?
<dimitern> natefinch, kerbal space program
<fwereade> sinzui, I feel obliged to state that it will also need simplestreams on azure, but this is just an obsessive personal-correctness thing and not any sort of actual concern that you won't ;)
<natefinch> fwereade: dang.... speaking of azure, we need to add juju help azure
<fwereade> natefinch, oh bugger -- ah well, too late for this one
<natefinch> fwereade: as long as it makes it into saucy, that seems ok
<sinzui> fwereade, I have the power, though I don't like all the steps needed to do that. I need to script that out still
<sinzui> natefinch, what is the effort needed to add the missing help?
<natefinch> sinzui: it's really just a string.  Adding it to the code is trivial. It's a matter of someone typing it up in a way that is coherent and useful.
<sinzui> understood
<fwereade> rogpeppe, I don't suppose you recall why the instance ports methods take a machine id?
<fwereade> rogpeppe, oh right, group names are named after machine id
<wallyworld> sinzui: how's the release? anything i can do?
<wallyworld> fwereade:  everything good with 1.15?
<sinzui> wallyworld, still waiting for the build to complete
<sinzui> Everything has been fine. I don't see anything that will cause a problem
#juju-dev 2013-09-29
<thumper> anyone else around?
#juju-dev 2014-09-22
<bradm> wallyworld_: fwiw juju 1.20.8 proposed works the same for me as 1.21-alpha1 did, as far as the disable-network-management thing is concerned.
<bradm> wallyworld_: as in, I can get a juju bootstrap working
<wallyworld_> bradm: that's good then :-) thanks for letting me know
<bradm> I'm having some weirdness with the rest of the deploy, but we haven't really gotten this going before, so I'm not sure what the issue is
<wallyworld_> ok, if you can pin it down to a specific issue, maybe we can help
<bradm> wallyworld_: I'm leaning towards it being an issue with our preseed in maas, but thanks, will let you know if I think I'm seeing juju issues.
<wallyworld_> ok
<menn0> waigani: review for http://reviews.vapour.ws/r/70/ done
<waigani> menn0: thanks
<menn0> waigani: so should I pick up another document to do the env UUID conversion for?
<menn0> waigani: I guess I should wait for your services branch to land as it has some useful helpers
<waigani> menn0: I've replied to your comments but not published them, should have the next pr up soon. Should I publish my comments now anyway?
<davecheney> waigani: wait til you have responded to them with code
<davecheney> otherwise you'll confuse others
<menn0> waigani: what davecheney said
<menn0> waigani: in the meantime I'll pick out the next one
<menn0> and start figuring out what needs to happen
<davecheney> thumper: ready when you are
<thumper> davecheney: ok
<menn0> thumper: I'm getting a panic in environInfoUserTag when running juju status against an env bootstrapped with trunk on Friday
<thumper> menn0: otp, with you soonish
<menn0> thumper: k
<menn0> waigani: once we've done the env UUID work for units and machines we should then do the statuses collection. The "global keys" used there need a env UUID prefix.
<waigani> menn0: how do I update rbt?
<menn0> waigani: rbt post -r <review number>
<menn0> waigani: rbt post -u will work too, but only if you haven't rebased
<waigani> thought it was smart enough to remember, I just made a new pr accidentally
<menn0> waigani: nope
<waigani> menn0: http://reviews.vapour.ws/r/70
<menn0> waigani: looking
<menn0> waigani: didn't hit publish?
<waigani> what?
<waigani> oh ffs...
<waigani> now I get an 500 server error
<menn0> waigani: just do it from the web interface
<menn0> if you've uploaded a new diff there will be a publish button
<waigani> yep I hit it and I get a 500 err
<menn0> you could have also passed -p to rbt post to have it publish immediately
<menn0> waigani: awesome
<menn0> waigani: that's not something I have seen before
<menn0> waigani: I've got my 1:1 now anyway
<waigani> ericsnow: coded the easter egg just for me ;)
<waigani> ERROR: Could not reach the Review Board server at https://reviews.vapour.ws/
<waigani> menn0: it's uptodate on github
<menn0> waigani: can you get to the site at all?
<waigani> menn0: yes
<menn0> odd
<waigani> menn0: https://github.com/juju/juju/pull/802
<waigani> menn0: did you get my replies to your comments?
<menn0> yep
<menn0> waigani: actually I can see an updated diff on RB now
<waigani> is the right one? I still get the publish button and 500 every time I click it?!?
<waigani> okay, looks like the right diff is up. So what is with this publish button then?
<menn0> well it seems to have some of the changes I brought up
<menn0> waigani: what happens if you refresh the page?
<menn0> waigani: does the button go away?
<waigani> menn0: nop
<menn0> strange
<waigani> I'm in perpetual draft - a draft that everyone can see...
<menn0> waigani: hey there's still lots of cases where DocId should be getting used instead of called idForEnv
<waigani> menn0: okay, I'll do another sweep
<menn0> waigani: an idea: try removing the review request you accidentally added (mark it as Discarded)
<thumper> menn0: there now,
<menn0> runtime.panic(0xd79660, 0x1bc2aa8)
<menn0> 	/usr/lib/go/src/pkg/runtime/panic.c:266 +0xb6
<menn0> github.com/juju/juju/juju.environInfoUserTag(0x0, 0x0, 0xc2101664c0, 0xc210145ba0, 0x9, ...)
<menn0> 	/home/menno/go/src/github.com/juju/juju/juju/api.go:230 +0x2a
<menn0> github.com/juju/juju/juju.func·003(0xc210165480, 0x0, 0x0, 0x0, 0x0)
<menn0> 	/home/menno/go/src/github.com/juju/juju/juju/api.go:168 +0xc4
<menn0> github.com/juju/utils/parallel.func·005()
<menn0> 	/home/menno/go/src/github.com/juju/utils/parallel/try.go:135 +0x41
<menn0> created by github.com/juju/utils/parallel.(*Try).loop
<menn0> 	/home/menno/go/src/github.com/juju/utils/parallel/try.go:86 +0x9f
<menn0> that's for thumper :)
<waigani> menn0: I did change all the DocIDs, but files were not saved so didn't get added
<menn0> waigani: cool. in 1:1
<waigani> thumper: are we doing standup?
<waigani> 1:1 or whatever it's called
<thumper> waigani: yes, just otp with menn0 and finishing up
<waigani> menn0: otp
<menn0> waigani: kk
 * thumper goes to put the coffee machine on
<menn0> waigani: http://reviews.vapour.ws/r/70/
<waigani> menn0: how do I look at the original diff you commented on?
<menn0> waigani: There's a slider at the top of the diff viewer which lets you control the diff that's shown
<waigani> thanks
<menn0> waigani: you can also click on the title from the Reviews UI
<menn0> waigani: (the page which just lists the review comments together)
<thumper> waigani: review done
<waigani> thumper: thanks
<thumper> wallyworld_: could I get you to review axw's branch: http://reviews.vapour.ws/r/62/diff/#
<thumper> wallyworld_: I'm not comfortable with the tools storage
<wallyworld_> sure
<thumper> and I think I may miss something
<wallyworld_> np
<thumper> cheers
<thumper> I did the other one :)
<wallyworld_> \o/
<axw> thanks
<wallyworld_> thumper: last day today, off tomorrow to central queensland
<thumper> wallyworld_: have a nice holiday pre-sprint
<thumper> wallyworld_: and I'll see in in Belgium
<wallyworld_> thumper: will do, scary bit is i gotta get straight on a plane when we get back so i gotta pack for the sprint this afternoon
<thumper> haha
<wallyworld_> it's a 5 hour drive back to brisbane, then to airport, hope there's no delays
<thumper> :)
<wallyworld_> axw: i just looked at http://reviews.vapour.ws/r/72/ - was that bug raised earlier really a critical regression then?
<axw> which bug?
<wallyworld_> in the topic - 1371605
<wallyworld_> i marked it as Invalid
<axw> ah, missed it
<axw> looking
<wallyworld_> as i thought that HP Cloud should have had a keystone product-stream endpoint
<wallyworld_> well, i'm sure it did, but maybe i misremember
<wallyworld_> if i'm wrong, we'll need to reopen
<axw> wallyworld_: even if it's meant to be there on HP, it won't be in all OpenStack installations
<axw> so yes, it's a real bug/regression
<wallyworld_> axw: ok, i'll reopen
<wallyworld_> for some reason i thought it would only kick in for CPCs
<wallyworld_> axw: in upgradejuju_test, why is there a change to  append /tools to the tools-metadata-url? this seems like an error
<wallyworld_> the url should just be the base url
<wallyworld_> hmmm, maybe not
<axw> wallyworld_: pretty sure it didn't work without it. just a moment, will need to rewind
<wallyworld_> axw: ignore me, i think it is needed
<wallyworld_> but surprising that the original code didn't have it
<wallyworld_> maybe there was code to append /tools if it wasn't there at one point
<axw> wallyworld_: I think in the old code, that tools-metadata-url was just being skipped over and it was using provider storage
<axw> notice I took out the MustUpload... calls
<wallyworld_> haven't got to that bit yet
<axw> line #316, just below
<wallyworld_> axw: just came across the deletion of the assertMirrors test - for our CPCs, like AWS, HP Cloud etc, provider storage will still be used for mirroring the tools local to the cloud
<wallyworld_> so we will still need a test of some sort for that
<axw> wallyworld_: I'll write a new more targeted test
<wallyworld_> axw: awesome, ty
<axw> just for sync.StorageToolsUploader
<wallyworld_> axw: also, can you double check that there's still tests which check that tools can be downloaded from a local mirror? there will be simplestreams ones, but i can't recall if there are provider ones for ec2 or openstack
<wallyworld_> great that the retries bool is gone
<axw> wallyworld_: I don't recall seeing any tests of that sort in the providers
<wallyworld_> ok, i don't recall off hand what's there
<wallyworld_> i just wanted to be sure we weren't reducing test coverage of the mirrors stuff
<axw> wallyworld_: pretty sure I didn't remove any mirror reading tests, I'll double check tho
<wallyworld_> ok, ta
<wallyworld_> it's got a fairly high risk of breaking something if we get it wrong
<wallyworld_> since mirrors are used for aws etc
<waigani> thumper, menn0: http://reviews.vapour.ws/r/70/
<thumper> wallyworld_, menn0: http://reviews.vapour.ws/r/74/diff/
<thumper> waigani: I'm being dragged away, will look in the morning
<thumper> wallyworld_: this is a critical bug fix
<thumper> wallyworld_: I broke it last week
<waigani> thumper: okay
<wallyworld_> thumper: i get a 404
<thumper> really?
<wallyworld_> ah, not anymore
<waigani> thumper, menn0: I've made all requested changes - no push back. I'll run all the tests tonight
<wallyworld_> maybe i was too quick
<thumper> wallyworld_: gah... who do I choose for a review?
<thumper> reviewer
<wallyworld_> for the 74?
<thumper> done
<thumper> I hadn't published the review
<thumper> geez
 * thumper comes back later
<menn0> waigani: that PR is looking good. make sure you test an actual upgrade from 1.20 to your latest to make sure the upgrade step actually does what you expect
<waigani> menn0: good advice - I'm currently running make test - so far so good
<menn0> also I've just found one more problem... sorry. see the review
<waigani> menn0: fixed. I've hit some failing tests. So I'll push up the fix with the next round.
<TheMue> morning
<jam> morning TheMue
<jam> fwereade: I have a mongo question when you're around
<wallyworld> axw: if you had time to look at http://reviews.vapour.ws/r/73/ that would be great, i have to pop out but will be back later
<axw> wallyworld: sure
<wallyworld> ty
<fwereade> jam, oops, sorry
<fwereade> jam, (I'm here)
<voidspace> morning all
<TheMue> voidspace: heya, all preparations for your trip done?
<voidspace> TheMue: my visa arrived last week
<voidspace> TheMue: so just the packing to do :-)
<voidspace> TheMue: PyCon UK this weekend was fun too - and I got to practise my talk
<voidspace> TheMue: although the one for PyCon India needs to be 50% longer - so still some work to do
<TheMue> voidspace: sorry, had phone, insurance company because of my daughters :D
<TheMue> voidspace: India is fantastic, I enjoyed my -- too short -- time there
<voidspace> TheMue: it did occur to me that "treat your servers as cattle" is not the right metaphor for an Indian audience...
<voidspace> jam: I spent some time at PyCon UK with Wes Mason from online services
<voidspace> jam: he's the replacement for Sidnei on that team - and he's a mongo expert
<voidspace> (he hates it...)
<voidspace> jam: he's willing (beuno permitting) to give me or us some time looking at replicasets at some point
<TheMue> voidspace: what does he hate? mongo?
<voidspace> TheMue: yes :-)
<TheMue> voidspace: oh
<voidspace> TheMue: but he knows a lot about it
<jam> voidspace: sounds good
<wesleymason> voidspace: TheMue: jam: there is a certain correlation between my experience and levels of hate
<voidspace> wesleymason: I assumed that was the case...
<jam> wesleymason: where else is it being used? I thought U1 was using Cassandra, not Mongo
<wesleymason> jam: push are using it for queues, but my experience is from previous workplace
<jam> wesleymason: maybe I can ask you my mongo question. If you do: "servicesCollection.Find(bson.D{}).All(&sdocs)" is there any particular order that the results would be in?
<jam> (in mongo, they seem to always come back in alphabetical sorted, in Tokumx I'm seeing them come back in the order they were inserted, I just wanted to confirm that the test is wrong)
<wesleymason> jam: unless specified it's called "natural order", which is 99% of the time insertion order, but that can be different depending on factors, for example if data were restored from a backup it could be sorted by datetime encoded inside the _id ObjectId
<jam> fwereade: ^^ if you have an answer you're welcome to respond
<jam> wesleymason: well, I'm seeing "insert(wordpress), insert(mysql), Find() => mysql, wordpress" for Mongo and wordpress,mysql for Tokumx
<jam> now, the test claims: 	// Check the returned service, order is defined by sorted keys.
<jam> but I don't see any sorting anywhere
<wesleymason> jam: have you tried the find on the mongo repl just to make sure there's nothing happening in the driver?
<jam> wesleymason: well, same driver, I believe
<fwereade> jam, I think if we care about order we should explicitly Sort()
<jam> fwereade: I don't think we care for State.AllServices() do we ?
<fwereade> jam, IMO, no
<jam> I think the test happened to get a sorted order and thought it was guaranteed
<jam> and we should be Sorting in the test
<fwereade> jam, agree
<wesleymason> ^ this, and if it's really important, sort on a key other than _id, as that can be recreated and isn't always guaranteed to have the original datetime
<jam> wesleymason: so it would appear that my data was backwards. Toku is returning in sorted order, Mongo was returning in insertion order. Anyway, fixing the test
<wesleymason> jam: that makes MUCH more sense :D
<jam> wesleymason: yeah, I think Toku's "fractal index" and mvcc means that it ends up sorting on insert
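The fix the discussion converges on is to impose an explicit order rather than relying on mongo's natural order. A minimal Go sketch of normalizing results before asserting; `serviceDoc` and its field are illustrative stand-ins, not juju's actual state types:

```go
package main

import (
	"fmt"
	"sort"
)

// serviceDoc stands in for the documents returned by Find(...).All;
// the field name is hypothetical, not juju's actual schema.
type serviceDoc struct {
	Name string
}

// sortByName normalizes result order before a test asserts on it,
// since mongo's "natural order" varies by storage engine.
func sortByName(docs []serviceDoc) {
	sort.Slice(docs, func(i, j int) bool { return docs[i].Name < docs[j].Name })
}

func main() {
	// Simulated Find() results: insertion order on mongo,
	// index order on tokumx.
	sdocs := []serviceDoc{{"wordpress"}, {"mysql"}}
	sortByName(sdocs)
	fmt.Println(sdocs[0].Name, sdocs[1].Name) // mysql wordpress
}
```

In the real query the same effect comes from an explicit `.Sort("name")` on the query itself, which is what fwereade suggests.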
<jam> http://reviews.vapour.ws/r/76/diff
<jam> thumper-eod: or wwitzel3 ^^
<jam> or fwereade^
<fwereade> jam, LGTM
<axw> wallyworld: not really sure why the bot is barfing on my branch. I think it must be running really close to the line on /tmp usage, and I tipped it over with the changes to how tools are built
<axw> gotta make dinner, will take another look later
<jam> axw: your branch addresses bug https://bugs.launchpad.net/juju-core/+bug/1371605
<jam> ?
<mup> Bug #1371605: HP Bootstrap fails: no endpoints known for service type: product-streams <bootstrap> <ci> <hp-cloud> <regression> <streams> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1371605>
<axw> jam: yes, amongst other things. if I can't land it soon I'll pull out the fix
<axw> can I get a review on http://reviews.vapour.ws/r/77/ please someone? fixes CI blocker
<jam> TheMue: voidspace: I'm going to be late to standup, and dimiter is out today, so probably you two can just chat with each other about what's going on and have a short standup, since I've talked with you both
<voidspace> jam: TheMue: cool
<TheMue> jam, voidspace: yep, sounds fine to me
<TheMue> voidspace: I'm hanging out now ;)
<voidspace> TheMue: omw
<wallyworld> axw: lgtm
<voidspace> mgz: ping
<axw> wallyworld: thanks
<wallyworld> np
<mgz> voidspace: yo
<voidspace> mgz: sorry, I thought the bot wasn't working
<voidspace> mgz: but we were actually blocked on a critical bug I hadn't noticed
<mgz> voidspace: I'm just trying to see if I can open trunk
<mgz> but andrew's change hasn't made it through the testing yet
<hazmat> jam, ping
<natefinch> fwereade, perrito666: we should talk about charm sync.  I have a little time right now, but will likely get interrupted in 20-ish minutes... but I know it gets late fast on the other side of the pond, so want to talk whenever is good for you, William.
<perrito666> hey hi natefinch
<perrito666> hi fwereade
<fwereade> perrito666, natefinch: hey guys
<fwereade> natefinch, how long will your interruption be, do you think?
<fwereade> natefinch, it's only 3pm for me, but in 90mins I will be going into meeting mode for the rest of the day
<perrito666> fwereade: ouch, I hope "rest of the day" doesn't mean until midnight
<natefinch> fwereade: probably 20-ish minutes of disruption.  I think we can at least get a good start now if you like
<fwereade> natefinch, perrito666: ok, sgtm, would you start a hangout and I'll be there in <5?
<natefinch> https://plus.google.com/hangouts/_/canonical.com/moonstone?authuser=1
<axw> wallyworld: I have nfi what your email says, because my pgp key is invalid :)
<wallyworld> axw: just letting you know the tools stream branch has landed in trunk
<axw> okey dokey
<hazmat> quite a few yummy things landed in trunk
<hazmat> axw, so the provider storage removal is complete?
<axw> hazmat: *very* close to, but not quite
<hazmat> wallyworld, the --to on ensure-availability should also bring support for manual providers?
<axw> backups still relies on it
<axw> hazmat: (re --to): eventually it should, but currently "ssh:" is all handled CLI-side
<hazmat> axw, fair enough.. sounds like it's enough though that a non-storage-providing provider could be made
<hazmat> axw, well --to an existing machine shouldn't need the ssh.. its a pointer to an env machine
<axw> hazmat: after my next branch, you could provide a dummy Storage() method on Environ and 99.9% of Juju would work
<hazmat> don't really need to --to=ssh:
<hazmat> axw, awesome
<wallyworld> hazmat: ensure-availability --to nominally uses maas names, like is done with bootstrap
<wallyworld> bootstrap --to
<hazmat> bummer
<wallyworld> hazmat: the placement directive just gets passed through as is to the provisioner
<hazmat> wallyworld, if it took env machine ids, then it would work there as well.. i thought people could already specify maas-name via --constraints on ensure-avail
<wallyworld> that's what the landscape guys wanted
<axw> that would involve promoting machines from non-state-server, which is something we don't do atm
<wallyworld> yep
<hazmat> btw.. found a nice cli based git gui.. tig
<hazmat> http://jonas.nitro.dk/tig/
<wallyworld> "nice" and "git" often don't go together
<hazmat> wallyworld, it's working for me, i needed a better browsing interface for keeping up.. and this one does the trick for me. tig --first-parent to browse merges and a custom external command to do proper diff (per juju workflow) on merge commits.
<hazmat> but yeah.. git tooling is relatively esoteric compared to bzr
<wallyworld> yep
<hazmat> but i'll take the speed anyday ;-)
<wallyworld> bzr is not really slow nowadays
<hazmat> axw, so basically altering machine jobs post creation and updating stateServers 'e' doc for votingids stuff
<hazmat> wallyworld, for net ops.. git is still significantly faster for me
<wallyworld> fairynuff
<hazmat> katco, re format=oneline, great stuff.. but i wonder if the name shouldn't just be line.. cause it's not like one line of output.. and it's shorter.
<axw> hazmat: I forget what exactly we don't handle. it won't work at all on Azure, for example, because state servers are bound to the same cloud service
<wallyworld> hazmat: conceptually we can change a machine's jobs, but the plumbing is not quite there yet, needs a bit of work
<hazmat> axw, oh.. that problem child ;-)
<katco> hazmat: i was leaning on feedback from the spec. that is fine with me as long as everyone agrees.
<axw> :)
<katco> hazmat: i'm going to focus on landing the tabular and summary formatters, and then we can circle back on the "oneline". can you leave some feedback on the spec, and maybe ping marco et. al.?
<hazmat> katco, sure
<katco> hazmat: ty, sir. good suggestion.
<voidspace> jam: so you can ask if the -short flag is on, so CI tests could skip when short is on.
<voidspace> jam: making the default developer invocation use short, and leaving CI unchanged
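voidspace's -short idea maps onto Go's standard `testing.Short()` flag. A minimal runnable sketch; the `runMode` helper is hypothetical, and in a real test you would call `t.Skip` rather than return a string:

```go
package main

import (
	"flag"
	"fmt"
	"testing"
)

// runMode reports what should run given -test.short; illustrative only.
// In a real test: if testing.Short() { t.Skip("skipping in short mode") }
func runMode(short bool) string {
	if short {
		return "short mode: skipping expensive CI-only tests"
	}
	return "running full test suite"
}

func main() {
	testing.Init() // registers -test.short; `go test` normally does this
	flag.Parse()
	fmt.Println(runMode(testing.Short()))
}
```

With this pattern, developers run `go test -short` by default while CI invokes the full suite unchanged.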
<katco> what's the eta of the trunk landing being cleared now that the blocker is fixed?
<ericsnow> natefinch, wwitzel3, perrito666: standup?
<perrito666> aghh amazon not having enough machines again
<wwitzel3> natefinch: ping
<mgz> update for everyone on trunk landiness status:
<mgz> blocking bug has been fixed by andrew, but CI got stuck on a job over the weekend and it's catching up on several revisions now
<mgz> so, need to wait a little bit still for the change to go through verification and then get the bug marked fixed so things can land again
<natefinch> mgz: thanks
<hazmat> is there a provider interface?
<hazmat> ah.. environs/interface.go EnvironProvider
<natefinch> hazmat: would love to hear your thoughts on the process of writing a provider.  And definitely if you need any help with a DO provider, I'd be happy to offer whatever help I can (given that I haven't writt+
 * katco is off to get her glasses adjusted. bbiab.
<natefinch> written a provider, that may not be much ;)
<hazmat> natefinch, i need to convince DO to bake cloudinit into their images atm. currently doing the email thing with them on it
<natefinch> hazmat: oh interesting, didn't realize that was missing
<hazmat> natefinch, they added userdata support to get coreos going.. and it's the exact same interface as ec2.. we just need them to rebake their ubuntu images with cloudinit installed
<natefinch> hazmat: sorta surprised we let people deploy "ubuntu" without cloudinit
<hazmat> natefinch, we don't generally control that
<jrwren_> why don't they just use cloudimg?
<hazmat> its not a cpc cloud, so we don't build the images
<hazmat> no.. most non cpc clouds don't
<hazmat> the only public clouds using cloudimg are the ones we push the images to directly
<natefinch> hazmat: I guess I figured cloudinit was one of those things we require for an image to be called "ubuntu"
<hazmat> natefinch, its not on desktop or ubuntu core
<hazmat> ie. its not even on most of the images we distribute
<hazmat> getting userdata/cloudinit support is generally one of the harder aspects of juju enablement for clouds that don't have direct support for it
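For context, the user-data that cloud-init consumes on first boot looks roughly like this. A minimal illustrative cloud-config, not juju's actual provisioning payload:

```yaml
#cloud-config
# Illustrative user-data executed by cloud-init on first boot.
# Without cloud-init baked into the image, none of this runs,
# which is why juju enablement stalls on clouds like DO.
packages:
  - curl
runcmd:
  - [sh, -c, "echo provisioned > /var/run/bootstrap-done"]
```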
<lazyPower> if anyone has time, we have a user that's discovered brokenness in our homebrew recipe build of juju: https://bugs.launchpad.net/juju-core/+bug/1372550
<mup> Bug #1372550: juju metadata missing from brew juju 1.20.7 <papercut> <juju-core:New> <https://launchpad.net/bugs/1372550>
<jam> hey hazmat /wave
<jam> sinzui or mgz: the bot still thinks bug #1371605 is blocking trunk
<mup> Bug #1371605: HP Bootstrap fails: no endpoints known for service type: product-streams <bootstrap> <ci> <hp-cloud> <regression> <streams> <juju-core:Fix Committed by axwalk> <https://launchpad.net/bugs/1371605>
 * sinzui checks if test passed
<sinzui> jam, the commit is queued to test
<jam> sinzui: for 5 hrs ?
<sinzui> jam 1. CI was stuck testing a single test over the weekend. 2. 1.20 always tests before devel
<hazmat> jam, hey wanted to sync up on toku
<hazmat> and also some discoveries in mongo trunk
<sinzui> jam, I unblocked the stuck test 4 hours ago. I see 1.20 finishing its tests now
<jam> hazmat: sure. I'm in a meeting, but I'm done in about 5 min.
<jam> sinzui: so what is the process for unblocking trunk, it doesn't just check what the current blockers are, it checks them at the end of a test run?
<sinzui> jam, it checks that the bug is marked Fix Released. We mark our bugs Fix Released when we see the test pass
<sinzui> jam, this addresses the problem where a commit is made that does NOT fix the test
<mgz> jam: it is somewhat painfully slow when it's a fix done at the start of a working day for us
<jam> mgz: I had hoped it was a bit more automated than requiring you guys to jump on it.
<sinzui> mgz, jam, AWS is painfully slow today. The slow tests are all trying to provision instances in AWS
<jam> vacation/sickness means the dev team could be blocked for >24hrs
<mgz> well, the testing is automated, just getting a revision through review, landing, then ci tests takes several hours at best
<mgz> and today has been much slower as curtis explained
<mgz> (partly my fault, I wanted to land a 1.20 change to see if that was blocked as well, which is now taking queue priority when it shouldn't really)
<jam> I guess we always have the JFDI hammer if we find you guys aren't responding.
<mgz> jam: there's no reason that bugs can't be marked fixed by anyone
<jam> marking it Fix Released vs Fix Committed seems a bit of a stretch, but sure, if that is the actual flag
<jam> I was told it was: https://bugs.launchpad.net/juju-core/+bugs?field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.importance%3Alist=CRITICAL&field.tag=ci+regression+&field.tags_combinator=ALL
<mgz> it's just whether the policy of waiting for a CI run to verify a blocking bug is fixed to do that marking is helpful
<mgz> jam: yeah, that got changed, which I hadn't gathered initially either
<mgz> I think blocking all of trunk to do an amendment to a landed change is generally a bad idea
<mgz> response to blockers should be identify the broken change, back that out, reopen the bug/pr that landed it for the followup fix
<sinzui> jam, sorry, I changed the rules a little over a week ago as an item from my meeting with alexisb , wallyworld , natefinch , and others. I failed to send an email explaining why we added Fix Committed as a blocking condition
<sinzui> mgz, I will add ...revert the ci test if it was discovered to be changed, or fix the test if it is discovered to be wrong
<sinzui> mgz, jam. 1.21-alpha2 started testing
<hazmat> jam, still around.. my meeting went a bit late?
<hazmat> ..
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: None
<waigani> thumper-eod: it's bod
<waigani> thumper-eod: did you get my email about parseTag - what id should it return?
<wallyworld> sinzui: i see you've created a 1.20.9 milestone. can i target the apt pinning and tools streams bugs at that? are you going to release the 1.20.8 built last week and then build a subsequent 1.20.9?
<sinzui> wallyworld, don't hesitate to target to it
<sinzui> wallyworld, Even if 1.20.8 is reviled, we will still move on to 1.20.9
<wallyworld> ok, will do
<sinzui> wallyworld, I don't think anyone will block 1.20.8. We are being polite and letting some say no
<wallyworld> sinzui: ok, i'm off today till the sprint, andrew will backport the tools stream change. was my email about streams behaviour correct?
<sinzui> yes please, but wallyworld the promise for "moving" the dirs was this year, not this month
<wallyworld> sinzui: the solution in my latest email doesn't require new dirs
<sinzui> wallyworld, we/my team can always make the index.json point to the locations we need
<wallyworld> the streams are encoded in the existing json files
<wallyworld> same dirs
<sinzui> wallyworld, rock, I just wanted to be clear that your plan for :released: :proposed: as stanzas in index.json was loved by many
<wallyworld> sinzui: i hope it is because that's what i've implemented :-) seems to work well, and only requires you to generate the files, not any changes to the dirs as such
<wallyworld> sinzui: one point - the mirrors file also has the :released:, :proposed: etc in it
<sinzui> wallyworld, oh! thank you for telling me. I had not noticed
<wallyworld> sinzui: i hope that is ok, i think it makes sense
<sinzui> wallyworld, looks like I have a choice of many...
<sinzui> but now I recall Ben wanted just cpc-mirrors.json, so it will have all the stream names and index will point to it in each stanza
<wallyworld> sinzui: so in that case, i think the implementation i did is correct - a single mirrors json file, containing different stanzas for each stream
<wallyworld> sinzui: if you need tweaks, you can ask andrew; he's available till mid next week
<sinzui> wallyworld, thank you. I hope not to need anything more than time on my side of things
<wallyworld> i hope so too
<wallyworld> the work is in trunk, should be backported over the next day or so
<wallyworld> 1.20.9 now has 3 bugs assigned
<wallyworld> right, i have to go and drive for 6 hours to my first destination, see you in brussels
<katco> wallyworld: wait till the last minute why don'tcha! ;)
<katco> wallyworld: tc, have fun, and i'll see you in brussels!
<wallyworld> katco: i just wanted to make sure everything was sorted out :-)
<katco> wallyworld: hehe
<wallyworld> katco: will do, see you then, have fun without me :-)
<katco> wallyworld: hehe tc
<rick_h_> thumper-eod: ping whe you've got time
<thumper> rick_h_: otp right now
<rick_h_> thumper: rgr when you have time
<waigani> menn0, thumper: http://reviews.vapour.ws/r/70/
<waigani> davechen1y: standup?
<thumper> rick_h_: free now
<rick_h_> thumper: rgr
<rick_h_> thumper: calling ya
<menn0> thumper, davechen1y, waigani: it looks like us non-manager types can't edit that spreadsheet
<waigani> thumper: ^
<davechen1y> menn0: sad trombone
<menn0> davechen1y, waigani: I'm just doing it in my own sheet which I'll send to thumper
<thumper> menn0: sounds good
#juju-dev 2014-09-23
<thumper> waigani: the parseTag method is only used in the annotator
<thumper> waigani: and it is doing a doc exists check
<thumper> so the id has to be the docID, not a local ID
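thumper's point, that the exists check must use the full doc ID rather than a local ID, hinges on state documents being keyed by environment UUID. A hypothetical sketch of the prefixing helper; names are illustrative, not juju's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// ensureEnvUUID sketches the kind of helper under discussion: state
// documents are keyed "<env-uuid>:<local-id>", so an exists check must
// use the full doc ID. Names are hypothetical, not juju's real API.
func ensureEnvUUID(envUUID, id string) string {
	if strings.HasPrefix(id, envUUID+":") {
		return id // already a doc ID
	}
	return envUUID + ":" + id
}

func main() {
	fmt.Println(ensureEnvUUID("e-uuid", "mysql/0"))        // e-uuid:mysql/0
	fmt.Println(ensureEnvUUID("e-uuid", "e-uuid:mysql/0")) // e-uuid:mysql/0
}
```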
<waigani_> when I bootstrap and deploy mysql with trunk I get the following in juju status: agent-state-info: 'hook failed: "install"'
<rick_h_> waigani_: what did you deploy it with?
<rick_h_> waigani_: the gui or cli?
<waigani_> rick_h_: cli
<rick_h_> waigani_: k, nvm then. There was a known gui bug
<waigani_> okay, anyone else hit this with trunk or am I a special snowflake?
<rick_h_> waigani_: knowing what's in the unit log would be helpful
<rick_h_> waigani_: juju ssh mysql/0 and then check the end of /var/log/juju/unit-<tab>
<waigani_> rick_h_: ugh I just destroyed the env. Let me repeat my steps and I'll check out the log
<rick_h_> waigani_: rgr
<waigani_> rick_h_: http://pastebin.ubuntu.com/8407234/
<rick_h_> waigani_: what series is this?
<rick_h_> waigani: curious what series that is?
<waigani> rick_h_: https://bugs.launchpad.net/juju-core/+bug/1372699
<mup> Bug #1372699: juju status reports: agent-state-info: 'hook failed: "install"' <juju-core:New> <https://launchpad.net/bugs/1372699>
<rick_h_> waigani: gotcha, interesting.
<rick_h_> waigani: last bit of info I'd include is where you were doing this at, lxc, ec2, etc
<waigani> rick_h_: ah right, thanks
<rick_h_> since the issue is around a package install wonder if there's some mirror thing going on there
<waigani> rick_h_: Using trunk (daee198586afe412948b23d68f58592c625de08f) with LXC containers on my machine (amd64 14.04) ?
<rick_h_> waigani: hmm, you know what. I wonder if this is any part of the apt-get upgrade by default stuff
<rick_h_> waigani: is that env still up?
<waigani> rick_h_: sorry, I moved on. I had to test a different branch.
<waigani> rick_h_: what did you want me to check?
<rick_h_> waigani: understood, still trying to get an lxc env up to test something out
<rick_h_> waigani: so I'm wondering if you could manually apt-get install python-jinja2
<rick_h_> waigani: and if not, if an apt-get update && apt-get install python-jinja2 would work
<rick_h_> waigani: just trying to figure out why that install would fail to run and pondering
<waigani> rick_h_: Setting up python-jinja2 (2.7.2-2) ...
<rick_h_> waigani: :/
<waigani> installed no problem
<waigani> could a flaky Internet connection have anything to do with it?
<rick_h_> waigani: yes, if that apt-get install failed to work then it would die and cause the non-0 exit
<rick_h_> waigani: in which case it would be a temp issue I'd imagine
<rick_h_> assuming flaky means it works some of the time :)
<waigani> rick_h_: right, my guess is that is it.
<rick_h_> waigani: ah ok
<waigani> rick_h_: raises the question: is there a better way to handle this situation?
<rick_h_> waigani: well the tough part is that the charm is doing it in the script. If it doesn't really check/handle that I'm not sure what juju can do.
<waigani> right, got ya
<rick_h_> waigani: there is talk of juju providing a way for defining package deps before the charm even gets going, which might help as juju then can watch/manage that
<waigani> rick_h_: +1 from me
<rick_h_> waigani: not sure where that is in the pain point list currently.
<rick_h_> davechen1y: any idea? ^
<waigani> top of my list today ;)
<rick_h_> looks like wallyworld is afk today
<davechen1y> rick_h_: hold up,reading the backscroll
<rick_h_> davechen1y: is there anything on the charms listing system deps as part of the charm vs in the hook itself on the pain point list for the nearish future?
<axw> thumper: is there a topic doc for the sprint?
<davechen1y> rick_h_: nothing
<rick_h_> axw: one got started today
<davechen1y> rick_h_: what is the context
<davechen1y> ie, what doesn't the install hook provide you that you need ?
<axw> rick_h_: okey dokey, can you link me please?
<rick_h_> davechen1y: k, sorry waigani.
<rick_h_> axw: sent
<axw> cheers
<rick_h_> davechen1y: well the idea is that it would run pre-hook and in this case the charm doesn't do a good job of helping identify the issue when install failed
<rick_h_> davechen1y: whereas if it was something more juju controlled it could do better, make sure deps are there before the install hook (python deps used in the python based single hook), etc
<rick_h_> davechen1y: so I know it was brought up but wasn't sure where it landed on plans/measure of pain
<davechen1y> rick_h_: what is a pre-hook ?
<davechen1y> rick_h_: i'm having trouble filtering out the bits you wish juju had from the problem you've hit
<rick_h_> davechen1y: so one problem is the new single hook stuff. If it's a python script, and I need python deps (python-jinja2)
<rick_h_> how can the install hook run?
<davechen1y> rick_h_: is the default hook stuff going in?
<davechen1y> i thought that it had many problems
<rick_h_> davechen1y: or let's say my hook is py2 but utopic is py3 out of the box, I need to have py2 available
<rick_h_> davechen1y: my understanding is that it's waiting on charm feature flags in some form or other
<rick_h_> davechen1y: and that's the only blockers
<davechen1y> rick_h_: i strongly recommend always writing the install hook in shell
<davechen1y> 'cos you can't depend on anything
<rick_h_> davechen1y: right, well that's the point :)
<davechen1y> rick_h_: default hook wasn't my suggestion
<davechen1y> i think it sounds good in theory
<davechen1y> but fails in practice
<rick_h_> davechen1y: ok, well anyway. I was just curious. It's come up as an idea but guess it didn't have as much traction as it seemed
<davechen1y> default hook is a large hammer to solve very specific problem, no symlink support on windows
<davechen1y> but windows charms by definition are independent of unix charms
<davechen1y> so they should adopt a different pattern
<davechen1y> for unix charms, you use symlinks
<rick_h_> well the other problem is that so many charms find a single script easier to write/maintain. That's regardless of platform
<rick_h_> davechen1y: but all good, thanks for the sanity check on the idea
<thumper> axw: yep
<davechen1y> waigani: 1372699
<davechen1y> do you still have this machine up ?
<davechen1y> you're missing the crucial bit of the log
<davechen1y> it appears above the python stack trace
<waigani> davechen1y: ah no, sorry
<waigani> davechen1y: I can try to reproduce, though I had a bad Internet connection at the time - pretty sure that was it
<rick_h_> davechen1y: sorry, was going to update that. In chatting we assume a flaky network connection was to blame
<davechen1y> rick_h_: waigani which provider ?
<davechen1y> my money is on another process holding the apt lock
<davechen1y> or apt failing
<rick_h_> davechen1y: lxc
<rick_h_> davechen1y: a manual apt-get install worked in debugging after the fact
<davechen1y> waigani: rick_h_ yup, that matches both scenarios
<davechen1y> ie, if the apt lock was held by another process running apt-get afterwards won't fail because the other process has finished
<davechen1y> also, if your network crapped out, same
<davechen1y> another common charm bug is not running apt-get update in the install hook
<davechen1y> because of 'reasons' the apt mirrors will 404 debs that are not current
<davechen1y> so if you have an old apt index on disk
<rick_h_> davechen1y: ok, I couldn't dupe it locally in my lxc but also not running trunk but 1.20.7
<davechen1y> and go to apt-get install some package, if that package has been replaced by a newer one in the archive
<davechen1y> you get a 404 and apt returns status 1
<davechen1y> and then your day is ruined
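The failure modes davechen1y lists (held apt lock, flaky network, stale index) are all transient, which is why a defensive install hook helps. An illustrative shell pattern, not a real charm; the actual apt commands are shown in comments and simulated with `echo` so the sketch runs anywhere:

```shell
#!/bin/sh
# Illustrative charm install-hook pattern: refresh the index first so a
# stale index doesn't 404 on superseded debs, and retry once to ride
# out a transient network failure or a briefly held apt lock.
set -e

retry() {
    "$@" && return 0
    sleep 1
    "$@"
}

# In a real hook these would be:
#   retry apt-get update
#   retry apt-get install -y python-jinja2
retry echo "apt-get install python-jinja2 (simulated)"
```

With `set -e`, a second failure still exits non-zero, so juju correctly reports the hook as failed instead of masking a persistent problem.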
<davechen1y> rick_h_: i'd be very surprised if this was related to any version of juju
<davechen1y> this is most likely a charm bug
<davechen1y> well
<davechen1y> that isn't true
<davechen1y> older versions of juju would force apt-get update/upgrade on bootup
<davechen1y> now we don't do this
<rick_h_> davechen1y: just stating the diff in my attempt to replicate vs reported bug
<davechen1y> it shows up the charms that expect this to happen
<axw> does anyone have time for a quick review? fixes a critical bug that I'd like to backport to 1.20 ASAP: http://reviews.vapour.ws/r/75/diff/#
 * thumper looks
<thumper> axw: done
<axw> thanks thumper
<bradm> anyone about who can help with a juju deploy --to question?  I want to have some charms deployed to lxc on a specific unit, as in openstack-ha/0 - I know you can do things like lxc:25, but is there a way to say lxc:openstack-ha/0 ?
<bradm> (where openstack-ha is just a cs:ubuntu unit)
<bradm> looks like juju-deployer supports the lxc:openstack-ha=0 style of deployment, if I'm reading the docs right
<thumper> bradm: no, I don't think you can say that
<bradm> thumper: thats a pity, we're using it quite extensively
<thumper> menn0: got a minute for a hangout?
<menn0> thumper: yep
<menn0> thumper: where?
<thumper> here -> https://plus.google.com/hangouts/_/g2ozjdukkzbchwadejkjpa5e2ma?hl=en
<davechen1y> waigani: would you please find some time today to review http://reviews.vapour.ws/r/66/
<waigani> davechen1y: I'll try, probably this evening
<davechen1y> ta
<waigani> menn0, thumper: on trunk I get disconnected from the API if I upgrade AND specify  the version e.g: juju --show-log upgrade-juju -e local --version 1.21-alpha2 --upload-tools
<waigani> menn0: your script did not test with specifying the version
<waigani> menn0: could you run that upgrade step (with the version) from 1.20 to verify?
<menn0> you mean the test case I used before
<menn0> ?
<waigani> menn0: yep
<waigani> menn0: so for me, on trunk this works: juju upgrade-juju --upload-tools
<waigani> menn0: this does not: juju --show-log upgrade-juju -e local --version 1.21-alpha2 --upload-tools
<waigani> the latter is taken from the original CI blocking bug
<menn0> waigani: I'll try it out on my machine
<menn0> waigani: works for me
<menn0> waigani: with master
<menn0> waigani: although I did see the "invalid entity name or password" error once
<menn0> waigani: the next attempt at juju status worked
<waigani> menn0: yeah that's what I get
<menn0> waigani: I suspect that could be before one of the DB migration steps has run
<waigani> menn0: that would be a bug wouldn't it?
<menn0> waigani: I don't think so
<waigani> menn0: I upgraded twice and got that error message both times
<menn0> waigani: that's just because you're trying juju status before the upgrade steps have run
<menn0> if you had waited slightly longer it would have been ok
<waigani> menn0: yeah I get that, but from the user's point of view it does not look like that
<waigani> menn0: it looks like it's broken
<menn0> waigani: well we could try and add more hacks like the one I did
<menn0> waigani: so that more stuff works even before the migration has run
<waigani> menn0: or change the message
<waigani> "upgrade in progress"
<menn0> except we want status to work during upgrades if possible
<waigani> and if not possible?
<menn0> I guess if the login fails during an upgrade we could return "upgrade in progress"
<menn0> that's probably a good idea
<menn0> lets write up a ticket for that.
<waigani> okay
<menn0> waigani: will you or should I?
<waigani> menn0: I've got to finish testing this branch so i can land it and you can use it
<menn0> I'm already using it :)
<menn0> but I'll write up that ticket
<waigani> oh - right well then
<waigani> okay, I'm easy
<thumper> axw: slight problem with the latest branches you have landed with upgrade steps
<thumper> axw: should have checked it earlier
<thumper> I have waigani adding in upgrade steps for 1.21-alpha2
<thumper> which is where any upgrade steps should go since the latest version was tagged
<thumper> will not impact anyone going from release to release
<waigani> ah...
<thumper> but will if someone tries upgrading from 1.21-alpha1 to 1.21-alpha2
<thumper> which is supported but I'm not sure how formally
<axw> thumper: I didn't think we bothered with that
<axw> can fix though, not a big deal
<thumper> I think we should bother
<thumper> axw: very minor change, and can be done after waigani's branch lands
<thumper> small tweak
<axw> okey dokey, no problems
<thumper> cheers
<menn0> waigani: here's bug 1372752
<mup> Bug #1372752: juju status fails with "invalid entity name or password" before DB migrations have run <juju-core:New for menno.smits> <https://launchpad.net/bugs/1372752>
<waigani> menn0: thanks :)
<waigani> thumper: talking of which, completed live testing
<thumper> and...
<waigani> thumper: replied to your rb comment
<waigani> all good
<thumper> cool
<menn0> waigani: \o/
 * thumper feels an ever so slight amount of terror getting this in
<thumper> but someone has to be first
<waigani> haha
<waigani> like a lamb to the slaughter
<thumper> waigani: done
<thumper> waigani: I suggest the machine collection next
<thumper> waigani: nothing like getting the big ones done first
<waigani> thumper: yeah, sorry I took a wrong turn or two. I *think* I've got a pretty good idea how to do the rest now (famous last words)
<waigani> thumper, right machinesC next? menn0 what are you doing?
<menn0> waigani: units
<thumper> waigani: ack
<waigani> okay cool
<menn0> thumper, waigani: I think we need to discuss these env UUID changes a little more on a hangout. now or tomorrow?
<axw> thumper: won't the 1.21 upgrade steps always run in alpha versions? i.e. they'd run from 1.20 -> 1.21-alpha1, and also from 1.21alpha1->1.21-alpha2... up until 1.21 (non-alpha)
<menn0> thumper, waigani: I'm seeing a lot of places where we need to worry about filtering and I'm thinking the "minimal change" approach we're going with is going to be the less safe approach
<axw> (I can test, but if you know the answer...)
<thumper> menn0: now is fine
<waigani> menn0: yep
<thumper> axw: it collects all from where it is, to the version going to
<waigani> hangout channel?
<menn0> thumper, waigani: yep, hangout channel
<thumper> axw: so you are right for migrating from 1.20 to 1.21
<thumper> it collects all the 1.21-alpha1 and 1.21-alpha2 steps along the way
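thumper's description of step collection can be sketched as a filter over registered steps. Versions are simplified to lexicographically comparable strings here, and all names are illustrative rather than juju's real upgrades package:

```go
package main

import "fmt"

// step sketches an upgrade step registered against a target version.
type step struct {
	target string
	desc   string
}

// stepsFor collects every step registered after `from` up to and
// including `to` — the "collects all along the way" behaviour above.
// Real code compares parsed version structs, not raw strings.
func stepsFor(from, to string, all []step) []step {
	var out []step
	for _, s := range all {
		if s.target > from && s.target <= to {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	all := []step{
		{"1.21-alpha1", "rewrite settings docs"},
		{"1.21-alpha2", "add env UUID to machines"},
	}
	// Upgrading 1.20 -> 1.21-alpha2 collects both alpha steps.
	for _, s := range stepsFor("1.20", "1.21-alpha2", all) {
		fmt.Println(s.desc)
	}
}
```

This is also why a 1.21-alpha1 to 1.21-alpha2 upgrade only runs the alpha2 steps, the case thumper wants handled.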
<bradm> ugh, this cloud-tools ppa being pinned lower is causing me issues
<dimitern> fwereade, ping
<fwereade> dimitern, pong
<dimitern> fwereade, I've updated the firewaller PR: https://github.com/juju/juju/pull/799 and I'd appreciate if you have a look
<fwereade> dimitern, will do
<dimitern> fwereade, it turned out surprisingly hard to get to the unexported environTag inside *api.State from *apifirewaller.State
<fwereade> dimitern, hmm, that's surprising, I could have sworn I'd seen a simple EnvironTag method
<fwereade> dimitern, yeah, api.State.EnvironTag()
<dimitern> fwereade, yes, in *api.State, but what you get in api/firewaller/New** is a base.FacadeCaller
<dimitern> fwereade, so I added EnvironTag to the base.APICaller interface and exposed it as firewaller.State.EnvironTag by internally calling st.RawAPICaller().EnvironTag()
<fwereade> dimitern, cool
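dimitern's plumbing, exposing EnvironTag on the facade by delegating to the underlying caller, looks roughly like this; the types are simplified stand-ins for juju's api packages, not the real interfaces:

```go
package main

import "fmt"

// APICaller sketches the base.APICaller interface with the newly
// added EnvironTag method (simplified to return a string).
type APICaller interface {
	EnvironTag() string
}

// apiState stands in for *api.State, which knows the tag.
type apiState struct{ tag string }

func (s apiState) EnvironTag() string { return s.tag }

// State is the facade-side wrapper (cf. apifirewaller.State); it
// exposes the tag by delegating to the underlying caller.
type State struct{ caller APICaller }

func (st State) EnvironTag() string { return st.caller.EnvironTag() }

func main() {
	st := State{caller: apiState{tag: "environment-deadbeef"}}
	fmt.Println(st.EnvironTag()) // environment-deadbeef
}
```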
<fwereade> dimitern, sent a couple of quick responses to your responses, looking from the start again now
<dimitern> fwereade, cheers
<dimitern> jam, re exposing EnvironTag() from *api.State to facade callers ^^ I'd like to hear your thoughts as well
<jam> dimitern: is there a Reviewboard version?
<dimitern> jam, nope, when I proposed it, I forgot :/
<jam> dimitern: well, you can always add it at any time :)
<jam> I guess the discussion is already here
<dimitern> jam, indeed, but it will lose a bit of context after some comments were added on the PR
 * jam really dislikes the "everything is State" because then it is confusing which foo.EnvironTag() you're actually calling.
<waigani> davechen1y: http://reviews.vapour.ws/r/66 reviewed as requested
<davechen1y> waigani: ta
<jam> dimitern: so is Tag.Id() a tag-thing or an id "thing" ?
<jam> (envTag.Id() returns environ-UUID or just UUID ?
<dimitern> jam, envTag.Id() == UUID, envTag.String() == "environment-<UUID>"
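The Id()/String() distinction dimitern states can be pinned down with a toy tag type; this is a stand-in illustrating the semantics, not juju's real names.EnvironTag implementation:

```go
package main

import "fmt"

// environTag is a minimal stand-in for juju's names.EnvironTag,
// illustrating why passing Id() where a full tag string is expected
// (the bug discussed below) silently changes the wire value.
type environTag struct{ uuid string }

// Id returns just the UUID.
func (t environTag) Id() string { return t.uuid }

// String returns the full "environment-<UUID>" tag form.
func (t environTag) String() string { return "environment-" + t.uuid }

func main() {
	tag := environTag{"deadbeef"}
	fmt.Println(tag.Id())     // deadbeef
	fmt.Println(tag.String()) // environment-deadbeef
}
```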
<jam> dimitern: k, because there are lines like: https://github.com/juju/juju/pull/799/files#diff-06b44def560ee3d858becda14c28ba17L201
<dimitern> jam, that's one of rogpeppe's :) calling most facades *State or *API
<jam> where it looks like it used to take something called a Tag, and it is now taking the UUID
<jam> it may be that st.EnvironTag() returned the UUID, which was a bad name
<dimitern> jam, yes it did - it was string
<jam> dimitern: it was a string, but was it "environ-uuid" or "uuid" ?
<jam> In which case, you should be passing .String() and not .Id()
 * rogpeppe hears his name taken in vain :-)
<dimitern> :)
<rogpeppe> i don't understand the issue here
<dimitern> jam, hmm, you're right in that case - it should be envTag.String(), I'll fix it (looking deeper the string is parsed as a tag in cacheChangedAPIInfo)
 * rogpeppe probably doesn't need to
<jam> dimitern: k, it would appear we are lacking test coverage if you were able to change it without something breaking
<dimitern> rogpeppe, it's about how naming a whole lot of things State in the api gets confusing
<jam> state.State, api.State, uniter.State
<dimitern> jam, I haven't run the full test suite yet, but I'm doing it now to see if anything will break
<TheMue> United States of Juju
<jam> TheMue: :)
<rogpeppe> jam: given you've always got the package prefix there, why is it confusing?
<jam> dimitern: I'm going to need a bit longer to digest your proposal, I have to take my dog out before standup.
<jam> rogpeppe: because you often don't have the package prefix
<jam> like in all the packages
<jam> and when looking at a diff
<jam> all you see is "State" objects
<jam> and you have to carefully reread what file you're currently in
<rogpeppe> jam: but then you don't have the "State" name either, right?
<dimitern> jam, sure, np
<jam> rogpeppe: everything is just a State there
<rogpeppe> jam: oh i see, in the package itself
<jam> rogpeppe: and the bigger shortcoming is that then everyone abbreviates their variable as "st"
<rogpeppe> jam: but you should surely be aware of what file you're looking at?
<jam> and you don't have a clue what "self.st" is
<jam> rogpeppe: so the information is there, but it is 20 lines away past the line boundary (again, when looking at a diff)
<rogpeppe> jam: if you're just looking at a diff without surrounding context, i can see
<rogpeppe> jam: the idea behind it is that actually they're all windows onto the same underlying state.
<rogpeppe> jam: a given package is only going to be using a single facade throughout, right?
<dimitern> fwereade, re https://github.com/juju/juju/pull/799#discussion_r17899691 - do you mean the api watcher should return 0:juju-public changes (not the state watcher - it should still return global keys as changes I think, and the api watcher will convert them)?
<fwereade> dimitern, no
<fwereade> dimitern, globalKeys should not leak out of state
<fwereade> dimitern, they're meant to be purely internal
<fwereade> dimitern, so fix the state watcher soon
<fwereade> dimitern, and one day we'll fix the api watchers by inserting an id->tag translation thingy in a consistent place/way
<dimitern> fwereade, ideally, we should have a PortsWatcher instead of StringsWatcher, which returns changes like []{MachineId: 0, NetworkName: "juju-public"}
<fwereade> dimitern, yeah, that'd probably be nicer
<dimitern> fwereade, ok, I'll do a follow-up for that
<fwereade> dimitern, great, thanks
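The PortsWatcher idea above can be sketched with a structured change type instead of the encoded "0:juju-public" strings. The type name, field names, and the "machine:network" string format are assumptions for illustration, not the real implementation.

```go
package main

import "fmt"

// PortsChange is a hypothetical change type for the proposed
// PortsWatcher: the fields are carried directly rather than encoded
// into a string the way a StringsWatcher forces.
type PortsChange struct {
	MachineID   string
	NetworkName string
}

// parsePortsChange converts the legacy "machine:network" string form
// (assumed format) into the structured change.
func parsePortsChange(s string) (PortsChange, error) {
	for i := 0; i < len(s); i++ {
		if s[i] == ':' {
			return PortsChange{MachineID: s[:i], NetworkName: s[i+1:]}, nil
		}
	}
	return PortsChange{}, fmt.Errorf("malformed ports change %q", s)
}

func main() {
	c, err := parsePortsChange("0:juju-public")
	if err != nil {
		panic(err)
	}
	fmt.Printf("machine %s, network %s\n", c.MachineID, c.NetworkName)
}
```

A structured change type also removes the need for every consumer to re-parse (and possibly mis-parse) the string encoding.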
<dimitern> jam, i can't hear you btw
<jam> dimitern: my browser crashed
<jam> dimitern: and after restarting it... it crashed again, I think I have to reboot after the last update, bbiab
<dimitern> TheMue, standup?
<TheMue> dimitern: omw
<jam> fwereade: dimitern: so we should have a discussion about how we want to handle Agents in a MESS scenario
<jam> because so far, the MESS design was that the Environ was implicit to the connection
<fwereade> jam, is this about the firewallers/provisioners for the various envs?
<jam> It may be that for Agents we'll want them to be multi-environment aware, but that wasn't the MESS design (as I understand it), because Login takes the environ-uuid, and not the rest of the system.
<jam> fwereade: so this is brought up in the context of Dimiter's changes, but I'd like us to have a plan for what we're going to be doing so that we can work towards consistency.
<fwereade> jam, I *think* it's doable without conflict -- ok, an environ is implicit to a connection, because the agents are running in the initial environment; but I don't think that prevents us from explicitly using tags for other environs, for those environs' workers
<fwereade> jam, at the moment we pull the environ uuid out of the connection
<fwereade> jam, but we are explicitly referencing environs in api calls
<fwereade> jam, and I think it's relatively simple to move that awareness down a level, directly into the workers, without mucking with the actual API
<fwereade> jam, ie without mucking with the wire format
<fwereade> jam, there will still be code changes ofc :)
<fwereade> jam, any places where we depend on the implicit environment will be problems
<fwereade> jam, but I hope there aren't many of them
<fwereade> jam, I've tried to spot them and block them
<jam> fwereade: so Dimiter is needing EnvironTag from his api.Client object in order to pass it back into the api.Networker facade
<jam> which means changing api.FacadeCaller to also include EnvironTag, etc
<jam> which is why his changes are large-ish
<jam> but if you're getting your context from your context, it doesn't make a ton of sense.
<fwereade> jam, well, yes, I think I have a comment along those lines -- that there's no need for explicit knowledge of environ tag at the firewaller level at the moment, because the client can put it into the calls directly
<fwereade> jam, dimitern: I waffled on that comment though
<jam> fwereade: some of it would be "do you want to run a multi-environment firewaller" as in, 1 Firewaller worker for N environs
<fwereade> jam, I do not want to do that
<dimitern> fwereade, we're finishing standup, will respond shortly
<fwereade> jam, both provisioner and firewaller are bloated enough
<fwereade> jam, layering multi-environment-icity on top will be horrible
<fwereade> jam, isn't it "just" a matter of having a parent worker that keeps track of the list of environs, and starts appropriate workers for each?
<fwereade> jam, a provisioner-deployer, if you will
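The parent-worker shape fwereade suggests can be sketched with plain goroutines. This is a deliberately minimal stand-in under stated assumptions: `runEnvironWorker` and `startWorkers` are invented names, and a real implementation would track environ additions/removals over time rather than a fixed list.

```go
package main

import (
	"fmt"
	"sync"
)

// runEnvironWorker stands in for starting a firewaller or provisioner
// scoped to a single hosted environment.
func runEnvironWorker(environTag string, out chan<- string, wg *sync.WaitGroup) {
	defer wg.Done()
	out <- "started worker for " + environTag
}

// startWorkers is the "parent worker": it holds the list of environs
// and starts one child worker per environ, as sketched in the
// discussion, rather than making one worker multi-environment aware.
func startWorkers(environs []string) []string {
	out := make(chan string, len(environs))
	var wg sync.WaitGroup
	for _, tag := range environs {
		wg.Add(1)
		go runEnvironWorker(tag, out, &wg)
	}
	wg.Wait()
	close(out)
	var results []string
	for msg := range out {
		results = append(results, msg)
	}
	return results
}

func main() {
	for _, r := range startWorkers([]string{"environment-aaaa", "environment-bbbb"}) {
		fmt.Println(r)
	}
}
```

The design choice here is that each child worker stays single-environment, keeping the already "bloated" provisioner and firewaller logic untouched.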
<jam> fwereade: well the question is, do these things Login separately (and thus all have their own state connection), or is it all a shared connection and the Environ they are operating on needs to be explicit in the API
<jam> I *thought* the plan was that the apiserver's State root object would know what environment you were working in.
<jam> certainly if you are then talking about some *other* environment you'd have a tag for it
<jam> But do we need the API to be "StartInstance(environTag, constraints, etc)"
<fwereade> jam, different example please? StartInstance isn't our API ;p
<jam> fwereade: AddMachine(environtag) ?
<jam> that is at the Client level, though
<fwereade> jam, WatchPorts(environTag)?
<fwereade> jam, yes, I think it should be
<fwereade> jam, implicit args have seemed to cause us more trouble than they're worth
<jam> fwereade: so, we just did Login(environTag) for that worker, right?
<fwereade> jam, but I think there's no conflict in being connected to the initial environment, and asking about some hosted environment
<fwereade> jam, yes
<jam> fwereade: I don't view environments as things that can be nested. Do you mean connected to the initial state server?
<fwereade> jam, where are the workers for the hosted envs running, if not in the initial env?
<fwereade> jam, blast, brb, keep talking
<jam> a fair point. though the workers for the hosted environs could have their own connections
<jam> fwereade: it feels like, if we want to be explicit about Environ, we pretty much need to add it to *every* request, which then means that we might as well make it common to every request
<jam> and it should just go back to implicit
<fwereade> jam, mmm, I'm not sure we really want N connections, one for each hosted env, from a single agent in a MESS
<fwereade> jam, I think there's some additional distinction too but I'm having some trouble articulating it
<dimitern> fwereade, jam, I'll make the environ tag argument to WatchOpenedPorts client-side firewaller facade implicit, but the server-side will still take Entities
<dimitern> fwereade, jam, is that OK?
<fwereade> jam, there's something worse about operate-on-nothing getting an implicit context than operate-on-something having it
<fwereade> dimitern, yeah, that's what I'd favour here
<dimitern> fwereade, cheers
<fwereade> dimitern, with the expectation that we do add environ-awareness to FW and P at some point in the not-too-distant future
<fwereade> dimitern, but that we don't want or need it this minute
 * jam is making some coffee, but I feel this is probably hangout worthy
<fwereade> dimitern, and that all we'd need to do to accommodate that is to tweak the client code on one side, and the auth on the other
<fwereade> jam, just a sec, let me see
<jam> fwereade: so if you are WatchPorts(environTag), then surely anything you then do because of that watch
<jam> needs to take an environTag
<dimitern> fwereade, by "don't want or need it this minute" do you also mean making WatchOpenedPorts at server-side not taking Entities as well?
<jam> You can't do WatchPorts(envTag), => OpenPort()
<jam> fwereade: and as for multiple "Conns", it could be as simple as a different logical connection, it wouldn't have to be a separate TLS/TCP session.
<fwereade> jam, ha, yes, you're completely right
<fwereade> dimitern, server-side we do want it
<fwereade> dimitern, jam: I still *think* that this consideration is specific to provisioner/firewaller (maybe some others?)
<dimitern> fwereade, ok
<fwereade> jam, separating out logical connections would be nice, yeah
<jam> fwereade: so I'm fine enough that we'll have some "things" that are actually aware of multiple environments (cross environ relationship manager), but I'd like to be cautious about adding EnvironTag to any API that isn't ever going to take 2 different ones
<fwereade> jam, I'm still worried that every time we do a no-args thing we end up with problems
<jam> fwereade: I think mixing noargs and args is going to be worse in the general case
<jam> now, if we actually went all the way to stateless... then you have always-args
<jam> which I'm actually happy with, but hasn't been the design we went for
<fwereade> jam, heh, I just realised OpenPort doesn't apply -- but asking which ports are open still does
<jam> fwereade: well, there is an "actually implement this change in the Provider" which at this point may be just a direct callout, but the logical "if I have an API that takes an environTag, then whatever I call as a result of that first API call then also needs an environTag"
<jam> and I feel like, 90% ? of what we are going to do needs to then take an EnvironTag, which is why we made it implicit in the first place
<fwereade> jam, I guess that may be the main point of difference? I feel like the firewaller/provisioner are relatively rare and special
<jam> fwereade: so I think there is a definite case for different "rings" of agents
<jam> There is the Client API for "juju"
<jam> there is the Agent API for machine-10/unit-mysql-5
<jam> and there is the EnvironmentManager API for Provisioner/Firewaller/etc.
<jam> It would be good to actually separate those out in code.
<fwereade> jam, that does sound sane, yeah
<fwereade> jam, certainly different forces apply to the different categories
<jam> (and then there is the API server which is obviously underneath that)
<fwereade> jam, I'm not wholly clear on how that separation would work though -- purely as an organisational thing?
<dimitern> fwereade, re https://github.com/juju/juju/pull/799#discussion_r17900095 - do you think it's better to change "case changes, ok := <-in:" to "case changes := <-in:" as it was?
<fwereade> dimitern, no, I'd prefer to be explicit about checking for channel closes, they can happen
<dimitern> fwereade, ok
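The comma-ok receive fwereade asks for looks like this in a watcher-style loop. A minimal sketch: `drain` is an invented helper, and a real worker loop would also select on a tomb/dying channel, which is omitted here.

```go
package main

import "fmt"

// drain consumes a watcher-style changes channel, handling closure
// explicitly rather than treating the zero value of a closed channel
// as a real change, per the review comment.
func drain(in <-chan []string) (count int, wasClosed bool) {
	for {
		select {
		case changes, ok := <-in:
			if !ok {
				// The channel was closed out from under us: report
				// that instead of looping forever on zero values.
				return count, true
			}
			count += len(changes)
		}
	}
}

func main() {
	in := make(chan []string, 2)
	in <- []string{"0:juju-public"}
	in <- []string{"1:juju-public", "2:juju-public"}
	close(in)
	n, wasClosed := drain(in)
	fmt.Println(n, wasClosed)
}
```

Without the `ok` check, a closed channel makes the `case` fire continuously with nil slices, silently spinning instead of surfacing the failure.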
<jam> fwereade: right, just different directories (apiserver/RING0/provisioner)
<dimitern> tasdomas, ping
<fwereade> dimitern, it's the fact that MustErr can panic
<jam> fwereade: given that is the whole point of Must, right?
<fwereade> jam, sure
<fwereade> jam, it's really the usage/existence of MustErr
<jam> fwereade: I certainly think we would benefit from "Agent" vs "Client" apis, especially as we split up Client into many Facades
<dimitern> tasdomas, re https://github.com/juju/juju/commit/7694614984932598535639e09129820f15b9d58d - changing dependencies.tsv to bump a revision of a versioned import path like gopkg.in/juju/charm.v3 instead of "releasing" a new version gopkg.in/juju/charm.v4 is BAD, let's not do that please
<fwereade> jam, I feel like it should be more GiveMeSomeErrorIndicatingEitherTheWatcherErrOrTheFactThatThereIsntOne
<dimitern> tasdomas, the whole point of having versioned import paths is to rely on "vX" meaning the same thing as the package evolves
<jam> fwereade: EnsureErr ? (EnsureThatIHaveSomeSortOfError)
<fwereade> jam, good idea
<jam> most Must* mean panic
<fwereade> jam, agree
<fwereade> jam, and yes I think splitting the API up like that would be smart too
<jam> fwereade: and in thinking about Client vs Agent, it does reveal that there are some meta-juju level APIs, but I'm not as sold that those are worth splitting out
<jam> if we did, then where does Upgrader fit in, as it is also pretty "Meta"
<fwereade> jam, it feels agenty to me but maybe I'm not thnking it through properly
<jam> fwereade: you could consider the division to be the point where things run on machine-0, but you could also consider it to be where the worker is doing things about Juju vs things for the user
<fwereade> jam, I don't think machine-0 is quite right
<fwereade> jam, (or even state server)
<fwereade> jam, only-on-state-server though I guess?
<jam> fwereade: sure, I'm not saying explicitly 0, but potentially "things running on agents with JobManageEnviron" ?
<fwereade> jam, yeah, that sounds good to me
<fwereade> dimitern, fwiw, I kinda think we should be dropping unitData, my spidey sense says there's a big simplification waiting to get out
<fwereade> dimitern, but I'm not sure it's one for this CL
<dimitern> fwereade, ok, so this seems the last blocking issue, I'll repropose for a final look shortly (did a live test just now to make sure it works ok)
 * fwereade lunch
<dimitern> fwereade, final look at https://github.com/juju/juju/pull/799 before landing?
<natefinch> ericsnow: you around?
<dimitern> fwereade, if you think it's ok, I'll queue it for landing, and I'm already working on 2 follow-ups - state opened ports watcher to report "0:juju-public" changes and environTag handling across the API facades
 * dimitern late lunch
<perrito666> I'll tell you this, my friends, the Spanish kb layout is a torture for programming
<wwitzel3> whenever I try to add an alternate KB layout my system hangs and I have to hard reset. So now I just type without the accents.
<perrito666> wwitzel3: why would you use accents in english?
<wwitzel3> perrito666: I wouldn't
<wwitzel3> perrito666: it is for French
<perrito666> ah hehe, I can write French with this layout but it is a bit like using emacs
<wwitzel3> haha
<katco> perrito666: which is to say, awesome! right! right? guys?
<mgz> can I get a stamp on <http://reviews.vapour.ws/r/78/>? needs rebasing now but can do that as I land
<mgz> natefinch: maybe plz? ^
<perrito666> katco: I need a couple of extra fingers
<wwitzel3> mgz: LGTM
<perrito666> katco: but the issue is more the fact that I don't recall where the keys are since I don't use them
<mgz> wwitzel3: thanks!
<wwitzel3> jam: ping, you still around?
<dimitern> wwitzel3, it's perhaps a bit late for jam at this time
<wwitzel3> I figured as much, sometimes he hangs around though :)
<perrito666> dimitern: it's not like many of us have lives :p
<dimitern> perrito666, :) well, some of us at least
<perrito666> shame on you
<perrito666> I actually am around while cooking dinner
<perrito666> :p
<perrito666> my washing machine is an excellent standing desk
<ericsnow> natefinch: you still need me?
<natefinch> ericsnow: yeah... rbt is being annoying
<ericsnow> natefinch: what's up?
<natefinch> rbt setup-repo keeps failing for me
<natefinch> ericsnow: we can talk in the standup
<ericsnow> natefinch: k
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1372961
<sinzui> jam, natefinch can you get someone to look into bug 1372961
<sinzui> mgz, adeuring : just reported bug 1372961 and this is my investigation of the errors in CI https://docs.google.com/a/canonical.com/document/d/1HJozm1Yo_d3mC0QyQif9XfQXf5AxRtd86_aroMd1OpE/edit#
<natefinch> sinzui: ok
<mgz> sinzui: ace, I'll read that
<sinzui> mgz, there are some failures I think are not juju, can you read the two logs I included in 1862?
<sinzui> abentley, mgz, adeuring, jog: joyent is very slow this week. tests fail because of timeouts, and we can see many tests fail during setup. We may need to take away their voting rights
<abentley> sinzui: That's a shame.
<sinzui> abentley, mgz, adeuring http://juju-ci.vapour.ws:8080/view/Cloud%20Health/job/test-cloud-joyent/ doesn't show the extent of the problem
<wwitzel3> anyone know offhand of any cmd client commands that have been refactored out of api/client and into their own facade?
<wwitzel3> I'd be interested in looking at examples of how we did it previously
<fwereade> SHIT
 * fwereade just overfilled bath
<fwereade> bbiab
<wwitzel3> I will reserve my laughing for later after we've determined there is no serious damage
<fwereade> wwitzel3, it'll be fine
<fwereade> wwitzel3, all on tiles
<fwereade> wwitzel3, any damage there may be will be as nothing compared to the 3 separate pipes that went a week or two ago
<fwereade> wwitzel3, my illusion of competence is the real victim here
<wwitzel3> that was too sad and self deprecating for me to laugh at all now
 * wwitzel3 kicks the dirt
<fwereade> wwitzel3, ehh, I deserve at least a bit of pointing and laughing
 * fwereade really *needs* a bath now after mopping all that up, but may be out of towels :-/
<mbruzek> Hello juju developers.  I noticed that I can no longer view my own logs on juju local.  I need sudo to view them.  There was a recent change to the log rotate function.  Has anyone else experienced this problem?
<katco> mbruzek: hi there. this was actually an issue raised by our security reviewers. there is some sensitive information that we felt any user on the system shouldn't be able to read.
<katco> mbruzek: this changed back in... 1.18 i believe. does that help at all?
<mbruzek> katco: Thanks for the reply.  As a developer I regularly look at the logs to diagnose failures.  This seems more recent to me, I keep up-to-date on juju releases, this seems like a 1.21 change as I was able to view the logs last week.
<katco> mbruzek: interesting. this is what i am referring to: https://bugs.launchpad.net/juju-core/+bug/1286518
<katco> mbruzek: which logs are you trying to read?
<mbruzek> I am usually interested in the unit-<charm-name>-#.log files because all-machines or the machine logs are too noisy.
<mbruzek> katco: no this is not the problem I am seeing, I was able to view them as recently as last week.  They are indeed 600 permissions, but now they seem to be owned by syslog.
<mbruzek> -rw------- 1 syslog syslog 25642 Sep 23 09:57 machine-0.log
<mbruzek> which is != mbruzek, thus the need for sudo.
<katco> natefinch: ping ^^^
<katco> mbruzek: does seem like a new issue. what's the path? in ~/.juju?
<natefinch> katco: in a meeting, but we did just change log rotation stuff... that may have changed things for local accidentally.
<katco> natefinch: ok np, thanks for the response.
<katco> natefinch: any dev i should ping?
<mbruzek> katco: I actually see the logs all owned by syslog, but in up to 3 groups (adm, syslog, and fuse)
<mbruzek> -rw------- 1 syslog adm    26787 Sep 23 09:57 all-machines.log
<mbruzek> -rw------- 1 syslog syslog 25642 Sep 23 09:57 machine-0.log
<mbruzek> -rw------- 1 syslog fuse   44601 Sep 23 09:56 unit-mongodb-0.log
<katco> mbruzek: what's the path?
<natefinch> katco: wwitzel3 may know about the all-machines stuff
<katco> natefinch: ty sir
<mbruzek> katco: the path I am looking at is /var/log/juju-mbruzek-local/  the one you referenced (~/.juju/local/log) is a symbolic link there.
<mbruzek> katco: your log location would be /var/log/juju-katco-local/ I suspect.
<katco> mbruzek: not quite, but close ;)
 * mbruzek does not know the username you use.
<katco> mbruzek: i'll have to defer to natefinch et al. since they implemented the new log rotation. looking at my log directory, it looks like my permissions fix may have actually been undone in addition to the bug you're seeing
<katco> mbruzek: all of my stuff is world-readable and owned by root
<mbruzek> katco: my memory is not clear but I KNOW I was able to view the files as recently as last week.  What version is your juju?
<katco> mbruzek: i run tip
<mbruzek> of course you do!
<katco> mbruzek: but i suppose it's possible that once the logs are created correctly, the code would never recreate them, so only new log files would have issues
<mbruzek> $ juju version
<mbruzek> 1.21-alpha1-trusty-amd64
<katco> 1.21-alpha2-trusty-amd64
<mbruzek> katco: That is a possibility. I often destroy-environment and my understanding is that wipes out the "local" directory.
<katco> mbruzek: but maybe not anything buried in var? it would be elucidating to manually remove the log files to see what juju creates
<mbruzek> katco: Let me do that now...
<mbruzek> katco: destroy-environment does not wipe out the log files, but it does remove ~/.juju/local directory.
<katco> mbruzek: how does juju create the log files if they don't exist?
<mbruzek> the ~/.juju/local directory only contained a symbolic link to /var/log/juju-mbruzek-local/
<katco> mbruzek: right; i'm wondering if you move/remove log files under /var how juju will recreate them. i.e.: are you seeing a ghost of a bug that is now fixed
<mbruzek> katco: ahh!  Let me try that
<mbruzek> katco: http://pastebin.ubuntu.com/8411564/
<katco> mbruzek: interesting. yep, i think you need to talk to natefinch's team
<mbruzek> I moved the old directory and the new one was created upon bootstrap
<mbruzek> katco: OK thanks for troubleshooting with me.
<katco> wwitzel3: perrito666: ericsnow: ping ^^^
<katco> mbruzek: sorry i couldn't help more. i tuned in b/c i thought i had worked in the space you were describing.
 * mbruzek is glad someone tuned in.
<mbruzek> wwitzel3, natefinch, perrito666, ericsnow if you are done with meeting please ping me.
<natefinch> mbruzek: will ping once I'm out
<dimitern> fwereade, here's the follow-up about the state.openedPortsWatcher http://reviews.vapour.ws/r/85
<dimitern> natefinch, ^^ as OCR, can you take a look as well please?
<natefinch> dimitern: sure
<fwereade> dimitern, LGTM with a trivial
<dimitern> fwereade, cheers!
<jcw4> rogpeppe: your review is much appreciated!  Will discuss and get your suggestions implemented
<rogpeppe> jcw4: np
<natefinch> evilnickveitch: https://github.com/juju/docs/pull/174
<rogpeppe> jcw4: i'm not sure i've finished, but i'm taking a break from it for a while.
<rogpeppe> jcw4: i'd really like to see doc comments for everything
<jcw4> rogpeppe: cool.  Some of your comments I can explain, but some of your suggestions (charminfo) make a lot of sense to me
<rogpeppe> jcw4: the last thing i was looking at was Cancel - i'm not sure quite how that's meant to work
<jcw4> rogpeppe: Yeah, I should have added the doc comments like you suggested
<jcw4> rogpeppe: I'll be updating today with those comments, etc.
<rogpeppe> jcw4: great, thanks
 * fwereade is really tired and taking a break, will be back in the evening sometime
<wwitzel3> 6
<wwitzel3> sure, why not
<wwitzel3> mbruzek: the permission commit that katco referenced was on 2014-07-02, that is when that went into master.
<mbruzek> wwitzel3: Would you know why I was able to look at the logs last week and not this week?  The owner looks different between katco and my system.
<wwitzel3> mbruzek: the only thing I can think of, I think was mentioned before, if the old logs were not cleaned up for some reason and had the old permissions.
<wwitzel3> mbruzek: actually let me check when that commit was actually merged in .. the day it was committed and the day it was merged can be very different
<mbruzek> wwitzel3: Please do, I think this is a big change that will make it harder for charm authors to diagnose problems when they are writing charms.
<wwitzel3> mbruzek: https://github.com/juju/juju/pull/232 .. was merged in 02 Jul 2014. Not sure why you would have had access to the logs since then, if they had been newly created.
<mbruzek> wwitzel3: ack.  Thanks.  I am kind of concerned about this change since it will be more difficult for authors to look at the log files of their own units to diagnose problems.  Was there anything in there about the owner being changed?
<wwitzel3> mbruzek: the original ticket is here, https://bugs.launchpad.net/juju-core/+bug/1286518 and has a comment that shares your concern, I'd recommend bumping that
<mup> Bug #1286518: juju log files should not be world readable <logging> <juju-core:Fix Released by cox-katherine-e> <https://launchpad.net/bugs/1286518>
<mbruzek> wwitzel3: this link is likely why I have not seen it until this week. https://bugs.launchpad.net/juju-core/+bug/1286518
<mbruzek> It looks like it was fix released on 09/10
<wwitzel3> mbruzek: yep
<mbruzek> thanks wwitzel3.
<wwitzel3> mbruzek: yep, np
<katco> wwitzel3: ah missed that point. i just assumed it had been released awhile ago
<katco> wwitzel3: ty
<wwitzel3> katco: np
 * katco needs to get her glasses adjusted again (sigh). bbiab.
 * natefinch did not realize glasses were something one got adjusted
<perrito666> I am not sure if she speaks about the prescription or the nose thing
<perrito666> I have to change the nose thinguies every 3 months
<perrito666> although the ones I got last time lasted for the whole cycle :p
<perrito666> ericsnow: ping
<ericsnow> perrito666: hey
<perrito666> ericsnow: I need your help swimming through a sea of indirections
<ericsnow> perrito666: k
<perrito666> ericsnow: lets go priv
<natefinch> turns out the answer is "yes" to the question "are the outdoor receptacles on the same circuit as my office?"
<natefinch> not that it was a question I had intended to ask today
<perrito666> apparently your AP is not on that circuit
<perrito666> :p
<perrito666> I for one have the internet connection on a separate 6mm cable line along with the tv, the ps3 and the fridge
<perrito666> because priorities
<natefinch> haha
<natefinch> Yeah, I'm currently actually running an ethernet cord from the other room to here.... I have an ethernet wall jack I need to install... but, well, it involves getting into the crawlspace under my office, and... priorities
<perrito666> lol
<perrito666> well when I moved here the house was under remodeling so I just passed special cables for the important stuff, a different circuit breaker line and walled the home theater cables so the satellite speakers look wireless
<natefinch> nice
<natefinch> Evidently Mark Shuttleworth is annoyed that HA is only a single command "ensure-availability" for both starting HA and recovering from a failed machine, so he wants us to change it.
<natefinch> Which is valid
<natefinch> I wonder if this is my penance for actually documenting it :)  https://juju.ubuntu.com/docs/juju-ha.html
<ericsnow> natefinch: no good turn goes unpunished :)
<natefinch> No, I kind of agree....  I just was ready to not look at HA for another 6 months at least :)
<natefinch> ericsnow, perrito666, wwitzel3: who wants to be in the meeting to talk about revamping HA?  I may be able to find the time to do it, but I might not... which means it might get delegated.
<katco> natefinch: i just got this pair. they fit to your nose/ears differently and apparently you have to go through a few rounds of adjustments before they don't hurt your head
<ericsnow> natefinch: I will if you need me to but would rather stay focused on backups
<perrito666> katco: so how does that work, do they adjust your nose and ears?
<perrito666> :p
<natefinch> ericsnow: yeah, good point... you're disqualified
<natefinch> (lucky ;)
<katco> perrito666: lol no they put the frames in this heat thing and bend them
<ericsnow> natefinch: :)
<katco> perrito666: this is all new to me, but my wife is a veteran glasses wearer lol
<natefinch> katco: ahh, I didn't realize that was necessary
<perrito666> katco: ah yes, you can do that in your house with a cheap hair dryer
<natefinch> having never had eyeglasses myself...
<katco> natefinch: apparently so
<katco> natefinch: i literally got my first pair 4 days ago
<natefinch> katco: I naively assumed they were like sunglasses where you can just slap them on
<natefinch> katco: ahh
<perrito666> natefinch: well after wearing a few days you can notice how they slide on your face
<katco> but now i can smugly say, "of _course_ you have to adjust them (guffaw)"
<perrito666> and also the material settles and you might need to adjust a bit
<perrito666> if the ear thinguies are not properly adjusted the nose ones will hurt your skin because of the weight and vice versa
<natefinch> maybe this is how they justify paying $200 for some wire and plastic
<katco> lol
<perrito666> natefinch: that is how they justify getting laser surgery
<natefinch> haha
<natefinch> ericsnow: you around?
<ericsnow> natefinch: yeah
<natefinch> ericsnow: actually, nevermind.  I had problems with rbt, but realized I don't actually need that code to get onto reviewboard anymore
<ericsnow> natefinch: ok
<wwitzel3> if I register a facade and put the import in all facades .. why am I still getting an ERROR unknown object type "RunCommand"
<katco> wwitzel3: did you update your servers?
<katco> i.e. juju upgrade-juju --upload-tools
<wwitzel3> katco: yep :/
<katco> wwitzel3: hrm. that was my guess! :)
<cmars> thumper, time for a hangout?
<thumper> cmars: aye, already there
<davecheney> hazmat: ping
<waigani> menn0: so there is a switch in state.parseTag. In the case for services I've set the id to DocID. You'll need to do the same for the unit case. The other place you'll have to do this is in allWatcherStateBacking.docID
<thumper> cmars: review done
<cmars> thumper, thanks
<waigani> menn0: allWatcherStateBacking.docID is in the megawatcher, it's only used in the Changed function there.
<waigani> menn0, thumper, davecheney: if every watched entity was a tag (i.e. had a .Tag() method that returned a tag of the entity), then we could use state.parseTag(tag) whenever we needed to get the docID
<thumper> waigani: I don't think that is the case though
<thumper> is it?
<thumper> pretty sure we watch many things that are only loosely linked to entities
<waigani> thumper: note the *if*
 * thumper nods
<waigani> just an observation while working on that branch
<thumper> although might be interesting to have something that goes from the shitty globalId() ('m#3' for machine-3) to a tag
<thumper> waigani: definitely worth leaving a note to think about
<waigani> so forget entities, basically if anything that is watched is a tag...
<wwitzel3> since I have a fresh group .. if I register a facade and put the import in all facades .. why am I still getting an ERROR unknown object type "RunCommand"
#juju-dev 2014-09-24
<menn0> waigani: yep thanks. I changed parseTag during the meeting and those annotator tests started passing
<menn0> waigani: and I knew about the megawatcher change
<waigani> menn0: sweet
<menn0> waigani: from reviewing the services change
<waigani> menn0: okay, i *think* they are the only spots that will need progressive updating as we migrate
<waigani> other than dealing with collection specific quirks of course
<menn0> waigani: that sounds right
<menn0> waigani: and then there's all the queries that need to be updated to filter by env uuid
<menn0> waigani: but we should get the collections in shape first
<hazmat> out of curiosity.. anyone using gooracle
<hazmat> davecheney, pong
<waigani> menn0: largely mechanical
<waigani> hazmat: I wrote a sublime text plugin: https://github.com/waigani/GoOracle
<waigani> hazmat: I use it most days, has really helped :)
<hazmat> waigani, nice.. i've got sublime.. but an emacs user myself.. currently resurrecting my go editor conf.. oracle seems pretty sweet
<waigani> hazmat: menn0 uses emacs with oracle
<waigani> I couldn't keep up with him, so I wrote the plugin for sublime ;)
<hazmat> waigani, solid.. i'm currently pimping my ride following along to http://yousefourabi.com/blog/2014/05/emacs-for-go/
<davecheney> hazmat: are you coming to paris on friday ?
<hazmat> davecheney, unclear.. it's not sold out, is it?
<hazmat> nope.. still avail
<hazmat> oh.. crap.. super late bird only now
<davecheney> hazmat: well, you are super late
<hazmat> davecheney, its like SO next month :-)
<davecheney> hazmat: its like so 15 days away
<hazmat> if i get through a day i'm on top of the world
<hazmat> davecheney, noted.. i'll have it sorted this week
<davecheney> hazmat: ok, if you get stuck I can ask the organisers
<davecheney> it won't be a free ticket
<davecheney> but we might be able to get you into the overflow
<hazmat> i gotta coordinate with sidnei he'll come out as well if i go
<davecheney> is sidnei still in zurich ?
<hazmat> yup
<davecheney> cool
<davecheney> well, let me know
<hazmat> will do
<hazmat> heading out later this week to surgecon.. aka disaster porn
<hazmat> hah.. twitter feed on that topic.. "Bryan Cantrill @bcantrill · 8h Does systemd have you #illumos-curious? "
<davecheney> lol
<davecheney> systemd-make-me-a-sandwich
<menn0> hazmat, waigani: the oracle emacs integration is pretty good. I use go-oracle-referrers and go-oracle-implements fairly often.
<menn0> thumper, davecheney: is the first line of this method a little crazy?
<menn0> http://paste.ubuntu.com/8414641/
<menn0> or is that intentional for some reason?
<menn0> thumper, davecheney: actually I can understand why it might be making a copy, but assigning the copy to u seems a little sketchy
 * thumper looks
<thumper> menn0: where is the assigning to the copy?
<thumper> oh
<thumper> ew
<menn0> first line
<thumper> wha...?
<thumper> I'm guessing the receiver is effectively a value type
<thumper> assigning to it means nothing other than saving declaring another value?
<thumper> seems a bit esoteric
<menn0> it would be clearer if done like this: u2 := &Unit{st: u.st, doc: u.doc}
<thumper> agreed
<menn0> ok.
<thumper> or... unitCopy
<menn0> as long as it's not a bug or pure evil
<thumper> or whatever
<thumper> I don't think it is either of those
<thumper> just a bit confusing
<thumper> davecheney: do you agree with that assessment?
<menn0> thumper: using a longer than 2 character variable name??? That's crazy talk! This is Go!
<thumper> :P
<menn0> :)
<thumper> axw: you around?
<thumper> axw: two things, I just noticed that we have a ci blocker, see topic ^^^
<thumper> axw: secondly I want to chat to you about the gridfs storage stuff
<axw> thumper: I am here now, sorry, bit late
 * axw looks at topic
<axw> gah
<axw> ok, will get onto it
<axw> thumper: what about gridfs?
<thumper> axw: I'll munch on lunch and then catch up
<axw> okey dokey
<thumper> axw: mostly about how we store stuff in the paths and what impact multiple environments will have
<thumper> if any
 * thumper is hoping for no impact
<axw> mkay
<axw> paths already encode env UUID
<axw> we do need some modifications to pass in env-specific UUIDs though, atm it's just using State.Environment() to get the one and only UUID
<thumper> axw: hangout? https://plus.google.com/hangouts/_/gupglu6np5t3logeqtcab2praua?hl=en
<axw> thumper menn0 davecheney: trivial review please: http://reviews.vapour.ws/r/91/ (fixes CI blocker)
 * menn0 looks
<thumper> axw: what happens if we run this test on arm or power?
<thumper> axw: should it work?
<thumper> or do we mock out the architecture elsewhere?
<axw> thumper: it's uploading fake tools, should work still
<axw> the UploadArches just tells the fake-tools-uploading code which arches to upload for
<thumper> ok, but uploading fake tools for only i386 and amd64
<axw> it's ec2, there are only i386 and amd64
<thumper> ah, ok
<menn0> thumper, axw: i've already given it a Ship it (looks through the code to figure out the above)
<thumper> axw: can I get you to add just a comment before the UploadArches?
<axw> sure
<axw> menn0: thanks
<thumper> axw: just to say that if ec2 expands its architectures, we should add extra tools here?
<axw> thumper: will do
<thumper> thanks
 * thumper goes to rubber stamp menn0's review
 * thumper goes to make a coffee
<davecheney> thumper: /me reads
<davecheney> hmm, that is some complex code
<davecheney> such side effect
<davecheney> much impurity
<menn0> thumper, waigani: this units env UUID change has been tough, especially the unit watcher
<menn0> 11 failing tests in state left
<menn0> then the megawatcher change
<menn0> and then the migration
<thumper> menn0: yeah, I thought it would be pretty shitty
<davecheney> menn0: i *think* that line 2 on that paste does not update u
<davecheney> well, it updates the copy of the pointer passed to WatchSubUnits
<davecheney> but that is fine
<menn0> davecheney: that's the conclusion that thumper and I came up with too
<davecheney> but it's confusing code and should be rewritten
<menn0> but it's not especially clear
 * menn0 nods
<menn0> since finding that one I've found a few other methods that do the same thing
<davecheney> *u = &Unit would update the caller's copy
<davecheney> actually
<davecheney> *u = Unit
<menn0> yep
<davecheney> u = &Unit changes the pointer value passed to you to point to something else
<davecheney> it's just a bit nuts
<thumper> I'm pretty sure it is explicitly creating a copy as there will be a go-routine accessing it
<thumper> and this way, they avoid extra mutexes
<thumper> davecheney: I agree, unclear code
<thumper> menn0: I suggest you create a new variable, not reusing 'u' :)
<menn0> davecheney: as I said earlier something like u2 := &Unit{...} would have been much clearer
<davecheney> it updates u in the scope of that method
<davecheney> but to point to a new value
<davecheney> not the value that u points to
<waigani> what code are you lot talking about?
<davecheney> waigani: http://paste.ubuntu.com/8414641/
<waigani> thanks
 * thumper makes a sad face
<thumper> 	sel := bson.D{{"_id", bson.D{{"$regex", "^" + regexp.QuoteMeta(ensureActionMarker(prefix))}}}}
<thumper> 	iter := actionsCollection.Find(sel).Iter()
<thumper> bodie_: you around?
<jcw4> thumper: I may have been responsible for that code
<thumper> jcw4: hey
<jcw4> hey :)
<thumper> jcw4: I'm looking at the collections working out what is needed for the multi-environments
<thumper> what makes up the id for an action?
<thumper> also, what is the prefix normally in the above code?
<thumper> as a general rule, I don't like encoding values into an id field that need to be decoded later
<jcw4> thumper: the prefix of an action id is the id of the unit to which it has been queued
<jcw4> thumper: agreed
<thumper> jcw4: is an action always for a unit?
<jcw4> thumper: at this point yes
<jcw4> thumper: the intent is to allow prefixes of service id's too
<thumper> jcw4: can you conceive of a time where it won't be?
<thumper> hmm...
<jcw4> thumper: although even in the case that an action is queued for a service I think it will ultimately end up queued for a unit
<jcw4> thumper: so maybe not
<thumper> jcw4: right, so the idea that the CLI says run this action for this service, and we create an action for each unit
<jcw4> thumper: the main rationale for the prefix is so that the watcher can easily filter
<thumper> that exists at that time
<jcw4> thumper: right
<thumper> jcw4: was there pushback from someone about having a Unit field explicitly and independently from the id?
<jcw4> thumper: long story - I think fwereade may have a strong opinion on it
 * thumper chuckles
<jcw4> (but I don't want to put him on the spot)
<jcw4> :)
<jcw4> thumper: when I talk to you, my initial feeling that the id should not encode the unit id is reinforced
<thumper> this may be a situation where fwereade and I both hold strong and opposing points of view
<jcw4> thumper: however when I talk to fwereade he makes a compelling case too
<thumper> what is the case to keep it in the id,
<thumper> apart from saving a few bytes?
<jcw4> thumper: I believe the primary reason is to make the watchers efficient
<thumper> ah fark
<thumper> that's wright
<thumper> most of the watchers only see the id field right?
<jcw4> thumper: exactly
<thumper> wright?
<thumper> right!
<jcw4> haha
<thumper> fudge cake
<jcw4> thumper: that being said, I think env UUID prefix on the unit would transparently flow through to the action?
<thumper> I hate designing our system around problems that we have made for ourselves
<davecheney> *cough* mongo
<thumper> jcw4: no...
<davecheney> where is wallyworld when you need him
<thumper> jcw4: not exactly
<thumper> jcw4: the way we are dealing with the environment uuid is as follows
 * jcw4 leans forward
<thumper> jcw4: and we may want to tweak the actions collection too
<thumper> jcw4: we read the full document
<thumper> jcw4: and then prefix the existing _id field with "<env-uuid>:"
<thumper> jcw4: also adding "env-uuid": <env-uuid>
<thumper> jcw4: and changing the existing _id field to something meaningful
<thumper> like "name" or "foo"
<thumper> jcw4: I think it would be worthwhile also adding in "unit"
<thumper> jcw4: when we do the schema migration for actions
<jcw4> thumper: prefixing the existing _id field and then adding something meaningful?
<thumper> jcw4: the service document changes are now in master
<thumper> jcw4: do you have an up to date master as of about 18 hours ago?
<thumper> jcw4: I'll explain
<jcw4> thumper: yep
<jcw4> thumper: lemme look
<thumper> hangout?
<jcw4> sure
<thumper> jcw4: who do I add?
<jcw4> johnweldon4@gmail.com
<thumper> kk
<thumper> https://plus.google.com/hangouts/_/g424bdmxb7yrh4awwwarvkuw7ia?hl=en
<thumper> jcw4: this commit: 59cfe5aff287b46de440e45dfc2aa25b34dac571
<thumper> jcw4: git diff 3c0b77b..59cfe5aff
<jcw4> thumper:  git diff 59cfe5aff^..59cfe5aff
<thumper> jcw4: ah, that works too? I'm assuming it takes the first parent if a merge?
<jcw4> thumper: yeah I think so
<jcw4> thumper: http://git-scm.com/book/ch6-1.html#Ancestry-References
<thumper> jcw4: cheers
<menn0> thumper, waigani: almost done with the units env uuid work
<thumper> menn0: awesome
<waigani> menn0: nice
<menn0> thumper, waigani: just saw this in TestAddEnvUUIDToServicesIDIdempotent:
<menn0> 	serviceResults[0].DocID = s.state.docID(serviceName)
<menn0> }
<menn0> that's supposed to be an assertion
<menn0> thumper, waigani: we all missed it :(
<waigani> yikes
<thumper> haha
<thumper> yep
 * menn0 fixes
<waigani> thank you
<menn0> waigani, thumper: fixed. thankfully the tests pass with the assert in place
<thumper> :)
<waigani> few
<thumper> phew?
<waigani> true
 * thumper whimpers
 * thumper head desks
<thumper> menn0: is the upgrades collection used solely by the state servers to synchronise updates?
<menn0> thumper: yep
<menn0> thumper: although Will and I were thinking of using it to report the status of the last upgrade in the juju status output
<thumper> menn0: so, about env-uuid, should we add it?
<menn0> thumper: probably not... since the upgrade isn't environment specific
<thumper> well... it sis
<thumper> is
<menn0> thumper: the status thing isn't planned for any time soon
<menn0> thumper: well it is sort of
<thumper> yeah, sort ov
<thumper> of
<menn0> thumper: the state server upgrades aren't
<thumper> ugh
<menn0> thumper: the whole upgrades with MESS thing needs some serious thought. Maybe a good topic while we're in Brussels?
<thumper> probably
<menn0> thumper: for your current planning I would lean towards not including the env UUID in the upgrades doc
<thumper> menn0: that is what I have decided too for now
<menn0> thumper: k
<menn0> thumper, waigani: the tests for adding env uuid to services and units are now each one line
<menn0> the test implementation is generic and works for units and services
<thumper> woot woot
<menn0> it should now be much easier to add more env UUID migration steps
<menn0> about to try a similar refactoring for the upgrade steps themselves
<thumper> menn0: we also need to talk about what backup means for other environments
<menn0> thumper: definitely
<jcw4> thumper-cooking: http://reviews.vapour.ws/r/92/ (fwereade and TheMue too)
<menn0> thumper-cooking: env UUID change for units here: http://reviews.vapour.ws/r/93/diff/
<axw> waigani: if you're still working, can you please take a look at my response to your review?
<waigani> axw: done
<axw> waigani: thanks!
<axw> davecheney: you too, if you don't mind - there's two issues you raised that I didn't resolve, but commented on
<davecheney> axw: /me looks
<davecheney> axw: which link ?
<axw> sec
<axw> davecheney: http://reviews.vapour.ws/r/82/
<davecheney> axw: i can't see your comments
<axw> davecheney: nothing shows up next to your comments on the front page?
<axw> davecheney: there were two things unresolved: you asked if RB screwed up the formatting; looks fine to me on RB, and go fmt is happy anyway
<axw> the other was whether we should return an error from StartInstance if we can't record the instance-id
<axw> and I said no, because (a) we never recorded instance-ids for non-bootstrap machines before (known deficiency), and (b) if we fail to do it for bootstrap ,we'll have other issues anyway and bootstrap will tear down
<davecheney> axw: nope, can't see your comments
<davecheney> did you publish them ?
<davecheney> nah, RB is screwing up the indentation of some blocks
<davecheney> it's happened on other reviews as well
<axw> davecheney: yes, I did publish. weird, looks fine on my end
<davecheney> great, it's not even showing me my comments unless i'm on the review screen
<davecheney> it won't show them on the diff screen
<davecheney> what the f
<axw> davecheney: thanks. comments on the diff screen show up as a little box on the LHS, which you need to hover over
<davecheney> yup, couldn't see them
<davecheney> honestly given up
<davecheney> your change is good
<davecheney> no need to sweat the small stuff
<axw> hrm, yeah not showing up for me either. maybe because I uploaded a new changeset? I dunno
<axw> woot, once more step till Environ.Storage() is no more
<axw> one*
<thumper> \o/
 * thumper looks at the reviews
 * thumper takes a deep breath
<davecheney> axw: i've no fucks left to give rb
<axw> heh :)   I must say I'm not overwhelmed by it, I'd gotten used to GitHub reviews
 * thumper headdesks
<thumper> fwereade: have you started?
<thumper> guess not
<mattyw> morning folks
<dimitern> morning mattyw
<dimitern> mattyw, you're OCR today, right? have a look at http://reviews.vapour.ws/r/95/ please?
<mattyw> dimitern, be happy to
<rogpeppe> does anyone know if it's ok to have a non-subordinate relation to juju-info ?
<rogpeppe> fwereade: ^
<rogpeppe> dimitern: ^
<dimitern> rogpeppe, why not? it's not bound to subordinates afaik
<rogpeppe> dimitern: i *thought* it's probably ok, but i'm just writing some relation-verification code in the charm package and wanted to check
<rogpeppe> dimitern: i'll assume it's ok for now. the code in the state package doesn't look as if it complains
<dimitern> rogpeppe, but in general, using juju-info is frowned upon
<rogpeppe> dimitern: frowning isn't relevant here :-)
<dimitern> rogpeppe, :) just saying
<mattyw> dimitern, LGTM
<dimitern> mattyw, cheers!
<fwereade> rogpeppe, sorry I missed that -- but yeah I think it's fine
<rogpeppe> fwereade: cool, thanks
<fwereade> rogpeppe, pretty sure that if it wasn't we would have scoped it differently
<rogpeppe> fwereade: yeah
<fwereade> rogpeppe, I think one of the possible use cases was a penetration testing charm :)
<rogpeppe> fwereade: nice idea
<mattyw> reviewboard seems slow today - anyone else seeing that?
<jam> axw: aren't you supposed to link your RB review back to the original branch somehow? (like via a github PR? )
<axw> jam: could be helpful I guess - I didn't think anyone would care about the PR
<axw> I can do that tho
<jam> axw: so *I* occasionally like to actually download the code, and that is hard to do from just RB
<axw> fair enough
<jam> axw: in this case, I trolled github and found your PR and your branch, but it would be nice to have a direct link (though even just to the git branch would be fine for my purposes)
 * axw nods
<jam> though I don't think github makes it easy to see what PR's a given branch is involved in
<jam> I can see what PRs are proposed to go into a branch, but is there an obvious "this branch is proposed in these PRs" ?
<axw> not that I'm aware of
<jam> mattyw: it seems intermittent, and I'm seeing slowdown elsewhere, so I'm wondering if it is "Internet" or RB
<mattyw> jam, it feels like it's just slower on the larger reviews
<jam> dimitern: standup ?
<dimitern> jam, omw
<perrito666> morning
<dimitern> fwereade, jam, mattyw, axw, please take a look http://reviews.vapour.ws/r/96/
<mattyw> dimitern, looking
<dimitern> mattyw, ta!
<mattyw> dimitern, one thing from me, might be something we need to discuss
<dimitern> mattyw, I haven't changed the behavior of what's allowed and what's not, I just converted the check already in place to use AuthFuncs
 * mattyw looks again
<mattyw> dimitern, I see it now - that's a great point - and a potential bug - thanks very much
<dimitern> mattyw, sure, np
<mattyw> folks, just wanted to say I've landed some code that fails the go vet test because I forgot to enable the pre-push hook - this is me saying I'm sorry, it won't happen again
<jam> mattyw: I'm thinking mgz should be aware that the bot should be running go vet if we want to require it
<perrito666> mattyw: we love you anyway
<jam> mattyw: I don't hold people to very high "you must be very careful doing X", but I do hold our *process* to that standard. So if something happened that shouldn't have, that is more a sign that we're missing a step.
<jam> especially something like 'go vet' which is highly automatable
<mattyw> perrito666, I have to assume you love me because of all this ;)
<mattyw> jam, in that case think of me as a part time - "problem in process finder"
<axw> fwereade: if you have any time today, I'd appreciate if you could weigh in on http://reviews.vapour.ws/r/94/ -- just to make sure I've not veered wildly off course
<jam> mattyw: everybody makes mistakes in process, which is why I don't feel bad.  Anytime the design of something is "as long as we're careful" that's a sign of something brittle that should be done differently.
<jam> mattyw: we wouldn't need a test suite if we were just careful enough to not make mistakes :)
<mattyw> jam, sounds awesome - let's do it!
<rogpeppe> jam: what state is charm.v4 in? is it ok to move the charm store to use it?
<jam> rogpeppe: the only change there (to my knowledge) is updating the gocheck dependency, so it will still need to have v3 merged into it
<rogpeppe> jam: cool, thanks
<rogpeppe> jam: i have actually been applying v3 patches to v4 too, so all should be ok
<wwitzel3> jam: any idea why I would still be getting the, ERROR unknown object type "RunCommand", error from the rpc reflection, even though I've defined an init, registered a common facade 'RunCommand', and added that import to allfacades?
<perrito666> wwitzel3: not using upload tools?
<wwitzel3> using upload-tools
<perrito666> wwitzel3: facadeversions?
<perrito666> with capital V
<wwitzel3> perrito666: ?
<wwitzel3> perrito666: isn't facadeVersions built from registered facades?
<perrito666> I am checking Backups facade to see what other steps are required
<perrito666> wwitzel3: I see a map of strings and numbers
<wwitzel3> perrito666: I do too :/
<wwitzel3> first time noticing it, le sigh
<perrito666> it would be nice to have a facade todo list :p
<wwitzel3> I even read through TheMue's API implementation doc, and I don't recall it mentioning facadeversions, but I probably just missed it.
<TheMue> wwitzel3: the current proposal? pushed it yesterday again.
<jam> wwitzel3: I'd have to see your code to give you any hints
<wwitzel3> TheMue: yeah, current version, I didn't see any mention of api/facadeversions
<wwitzel3> jam: well perrito666 pointed me at facadeversions which has me getting a different error now, but it is progress :)
<TheMue> wwitzel3: ah, ok, currently I only cover the server-side
<jam> wwitzel3: sure, but I'll help you with it if you point me to code.
<wwitzel3> TheMue: I'll write some notes so I can help fill out the client side
<perrito666> wwitzel3: backups is very cool for that, it's a very uncommon word so it makes it easy to find what must be done for a facade
<TheMue> wwitzel3: great, thanks
<wwitzel3> jam: https://github.com/wwitzel3/juju/blob/ww3-juju-run-with-context/apiserver/runcmd/runcmd.go
<wwitzel3> jam: and the client is https://github.com/wwitzel3/juju/blob/ww3-juju-run-with-context/api/runcmd/client.go
<wwitzel3> jam: the new error is ERROR no such request - method RunCommand(1).Run is not implemented .. that is after adding RunCommand, 1 to api/facadeversions.go
<jam> wwitzel3: your Run() function takes a slice
<jam> you're only allowed to put structs
<jam> so if you need a slice, then you need a struct that has a slice
<jam> wwitzel3: I *think* if you run at the right debug level the RPC code will tell you that it is discarding certain functions (and hopefully why)
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: None
<jam> wwitzel3: make sense?
<wwitzel3> jam: yep, makes sense
<jam> wwitzel3: so without setting facadeversions, it was probably using BestAPIVersion to then request version 0 of the facade, which didn't (and shouldn't) ever exist
<jam> wwitzel3: and Run() just wasn't being exposed because it was taking a slice.
<wwitzel3> jam: thank you
<wwitzel3> perrito666: thank you
<perrito666> wwitzel3: np
<mattyw> dimitern, http://reviews.vapour.ws/r/96/ ship it
<mattyw> can I get a quick review from someone? I'm OCR today but I don't trust myself: http://reviews.vapour.ws/r/97/
<rick_h_> mattyw: that's a decent excuse I suppose :P
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1373424
<sinzui> fwereade, can you comment on bug 1372016. We either have a bug in the API or we need to document that clients cannot assume that the error information is in a consistent place
<mup> Bug #1372016: api errors coming back in response instead of envelope <api> <juju-core:Triaged> <https://launchpad.net/bugs/1372016>
 * fwereade looks
<jam> sinzui: fwereade: I think the statement there is "if your whole request is invalid, Error is on the overall scope", if one of the pieces of your request is invalid, then that piece gets the error.
<jam> since EA now takes a slice of machines, right?
<jam> (most bulk apis work this way, just because 1 entry of a slice is bad, we don't error on the whole request)
<sinzui> jam, fwereade, thank you. since we don't have a time machine to fix old envs, it might be true that all API clients need to check for the error in many locations
<fwereade> jam, sinzui: yes, the top-level errors are not the primary way of discovering problems at all -- if things are so bad that we can't give you the usual response, that's where the error will be
<natefinch> gsamfira: can you join #juju? someone is having difficulty deploying a windows charm
<gsamfira> natefinch: joined
<sinzui> natefinch, can you ask someone to look into bug 1373424. I think the issue has been in the code for a long time
<mup> Bug #1373424: method Client.AddMachinesV2 is not implemented <api> <ci> <compatibility> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1373424>
<natefinch> sinzui: will do.
<wwitzel3> is there a good way to check the type of error? in this case I want to check if the error is one of the ERROR from the rpc, so I can attempt to call the old API client.
<wwitzel3> ahh found it, in apiserver/params/apierror
<natefinch> if v, ok := err.(SomeErrorType) ; ok {  // it's that type, v is the value of that type }
<natefinch> wwitzel3: standup?
<gsamfira> heya guys. This PR: https://github.com/juju/utils/pull/27 has been reviewed. Can we merge it?
<katco> mattyw: can i get a (hopefully) final review and "Ship It!" (if appropriate) on http://reviews.vapour.ws/r/66/?
<mattyw> katco, of course
<katco> mattyw: thanks :)
<mattyw> katco, done, not quite "Ship it" from me I'm afraid
<katco> mattyw: no worries, doing a good job.
<katco> mattyw: i struggle with the split comments however. all that does is just shift the functions to a broader scope and possibly introduce new parameters due to lack of closure
<katco> mattyw: defining several functions within a function is a pattern, and it's a bit like defining a class w/o the boilerplate.
<mattyw> katco, I'm fine with the functions being unexported and only being used in FormatSummary. For me it's all about how much I can keep in my head. I don't mind a function having a number of inputs as long as I can quickly work out what it's doing. When functions get long I start getting confused. There's the rule of thumb that a person can remember about 7 "things". With closure I have to remember what they do but also work
<mattyw> out what parts they're closing over
<mattyw> katco, if it's all split out that I can usually look at a single function, work out what it does then only need to remember the one thing it does when I go back to what's calling it
<katco> mattyw: can't you do that within the scope of a function?
<katco> mattyw: closures are exactly equivalent to classes, except you can't modify the member variables they close over.
<katco> (outside the scope of what they close over)
<mattyw> katco, yes, but the larger the scope the harder it is to do
<katco> mattyw: which is equivalent to "the more member variables, the harder..." i think
<katco> mattyw: well, you're the second one to bring it up, so i'll move one scope-level higher and make it a class
<mattyw> katco, sorry - I wasn't ignoring you I was just thinking
<katco> mattyw: no worries at all
<mattyw> katco, I don't think you need to go that far, I agree with you that it adds boiler plate
<mattyw> katco, and as you say the more fields the harder it is to work out what's going on
<mattyw> katco, I think if you just turned the closures in that function into separate functions that would be a good start - conceptually nothing much would have changed, but the FormatSummary function would be shorter and I think easier to read
<katco> mattyw: i guess i just don't like the idea of them floating around when they're only hyper-specific to the summary formatter
<mattyw> katco, I sort of know what you mean
<mattyw> katco, I agree. But the reason they are floating around is because people are weak. They're floating around to make a function with lots of things to do look smaller
<katco> mattyw: mmm... like if this were java, and the summary function were a class, this would look exactly the same indentation-level wise
<katco> mattyw: i am trying not to be too obstinate, but i'm struggling with why it's easy to refer to top-level functions, but not variables which are functions
<wwitzel3> katco: it is about the noise ratio, with the variables, I have to stop reading. with extracted methods, I can read to the end, I don't have to remember to stop.
<wwitzel3> katco: so as I'm reading through format summary, I am getting bogged down with the implementation specifics of each step
<wwitzel3> katco: when what I really want to read is each step, then go see the specifics, when/if/should i care to
<katco> wwitzel3: i guess i just scan over closures unless i care what they do. sounds like i'm a weirdo ;)
<wwitzel3> no you're just a lisp programmer / emacs user
<katco> wwitzel3: rofl
<wwitzel3> oh right, you said weirdo .. correct
<wwitzel3> :P
<katco> bam!
<katco> and to be fair, i think i can only barely claim to understand lisp. i've not used it that much. but i am trying to change that ;)
<mattyw> katco, you are an emacs user though right?
<katco> mattyw: absolutely
<mattyw> wwitzel3, vim?
<katco> mattyw: i make fun of myself, but i love that editor
<mattyw> katco, I moved from emacs to vim I'm afraid
<katco> mattyw: the only thing i argue vehemently to people is that they learn 1 editor deeply
<katco> mattyw: i think it's unproductive to argue over what works for you hehe
<mattyw> katco, +1000
<wwitzel3> mattyw: yeah, vim :)
<wwitzel3> katco: +1 on learning one editor really well
<katco> wwitzel3: it's like right above your brain on the programming stack. really important.
<wwitzel3> katco: yep, I try to go through some commands so I don't forget them, even if I haven't needed some of them in a while.
<wwitzel3> katco: I hate the context switch of .. oh, how do I do that?
<katco> wwitzel3: see in emacs, that's just C-s (get up and flush the toilet) M-` 1 2 3 Q
<katco> wwitzel3: do you do vimgolf?
<wwitzel3> katco: haha, I did for a bit, it makes you faster for a while
<wwitzel3> then it makes you slower trying to find obscure ways to avoid keystrokes
<wwitzel3> until eventually vim is a regex parser
<katco> rofl
<wwitzel3> natefinch: heading to moonstone
<natefinch> I like thinking about my code not my editor.  That's why I use "dumb" gui editors like sublime :)
<natefinch> wwitzel3: omw too
<natefinch> I know like 3 hotkeys that are specific to sublime and that's it
<mattyw> calling it a day folks, see you all tomorrow
<katco> mattyw: ty for the review, i'll make those changes
<mattyw> katco, ping me when you want to take another look
<katco> mattyw: will do
<mattyw> night all
<natefinch> hazmat: HA meeting?
<hazmat> yup
<sinzui> natefinch, this job is failing. I changed the test to help juju pass http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore/759/console I see (error: cannot re-bootstrap environment: restore does not support HA juju configurations yet) and it worries me. this looks like a policy change that breaks a test
<perrito666> sinzui: :| what?
<perrito666> that has always been there and I never in my life saw it triggered
<perrito666> ok, so suddenly environs.Environ StateServerInstances is returning something different
<sinzui> perrito666, All three tries with the revised test consistently show "error: cannot re-bootstrap environment: restore does not support HA juju configurations yet" I recall the rule is that the restore returns a single state-server and the user can run --ensure-ha again
<sinzui> perrito666, there was traffic on one the mailing lists where we recommend to backup the state-servers in HA because it is always a last restore in a catastrophe
<perrito666> sinzui: the thing is, what restore does not support is to restore while other state servers are up and running
<sinzui> perrito666, right, and the test shows they were deleted using nova.
<perrito666> sinzui: I do see the deletion taking place after the error
<sinzui> perrito666, the console logs are out of order because the error is reported last. when I see http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore/760/console, I see the 3 state-servers deleted, then the restore is started, but it fails, then the status/destroy-env cleanup code is run, then we see the error message from restore
 * sinzui changes test to print the error immediately so that the order of events is clear
<natefinch> katco: do you know if your team has scheduled time to make the debug-log fix from the pain points document?
<natefinch> katco: (that the filter requires using the internal juju identifier, not the CLI-style unit/0 identifier)
<katco> natefinch: um, i know ian sent out some bugs to work on while he was out if we ran out of things to do
<katco> natefinch: let me check those
 * sinzui re-runs test with better output
<perrito666> sinzui: tx
<katco> natefinch: are you referring to the customer pain points doc?
<natefinch> katco: correct
<katco> natefinch: he prioritized the apt-related bugs for axw and i
<natefinch> katco: ok
<katco> natefinch: and for me, that was prioritized behind the status work (also in that document)
<natefinch> katco: ok, cool.  I was thinking of having one of the people on my team tackle the debug-log change and didn't want to overlap.  Sounds like there wouldn't be overlap, so that's good.
<katco> natefinch: yeah i think you're gtg.
<katco> natefinch: i am half-remembering that i thought someone was looking at this already. for some reason voidspace comes to mind
<natefinch> katco: ok, I'll just email out on the list, then.  no big rush
<perrito666> sinzui: lemme know when you have logs
<sinzui> perrito666, I am watching http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore/761/console, and I think we are 45 minutes away from completion
<perrito666> sinzui: :( ok, ill take a look in 45
<sinzui> perrito666, It is failing at this minute: http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore/761/console
<perrito666> sinzui: very odd, I do see something different
<perrito666> WARNING: Could not find the instance_id in output
<perrito666> which comes from assess_recovery.py line ~94
<hazmat> axw, ping re https://bugs.launchpad.net/juju-core/+bug/1373592
<mup> Bug #1373592: When bootstrapping select-zone could retry if the instance-type is full in the region <papercut> <juju-core:New> <https://launchpad.net/bugs/1373592>
<perrito666> sinzui: and it suggests that at that point restore failed with the HA error once
<perrito666> and then you delete and after that it fails again with the same
<sinzui> perrito666, your script is old.
<sinzui> perrito666, update to pull the test changes
<sinzui> perrito666, the failure you are seeing is because the test wanted to get the instance_id that was rejected. It isn't used, so I changed the test to just warn that juju no longer tells us what was rejected.
<perrito666> sinzui: my script? I am copying that from jenkins
<sinzui> perrito666, sorry, more ambiguity. the error is not the test; the test is showing why juju refused to restore to a working state-server (juju-restore correctly refused to restore because the state-server was still up). That is correct. The test once looked for the instance_id in the text. Once we are certain juju won't do something dangerous, the script will remove all 3 state-servers, then we see "starting restore", followed by "Restore Failed"
<perrito666> sinzui: so the test should no longer look for the instance_id?
<sinzui> perrito666, that is right. the test is correct. Why is juju showing a message that contradicts what CTS advises when we restore (error: cannot re-bootstrap environment: restore does not support HA juju configurations yet)
<perrito666> sinzui: what I am trying to clarify is: if the test no longer looks for the instance_id, why does jenkins still show the error as if the test was looking for it?
<sinzui> perrito666, because I wanted a warning in place. I think the missing instance_id means juju has intentionally introduced a regression
<perrito666> sinzui: ok, since restore has not changed I will guess that this is an unrelated regression that broke restore by accident, ill try to find it
<perrito666> btw something else the description on the jenkins job is odd
<perrito666> the revision for the first failure is 0fafcd986aa7501e9796519a2c432c16fe5b231f
<perrito666> yet it says: gitbranch:master:github.com/juju/juju r0fafcd98
<perrito666> shouldn't the hex be the short form of the sha?
<perrito666> bbl
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1373424 1373611
<sinzui> perrito666, I need to remove the 'r'; the 'r' is from the days of testing bzr revnos
<fwereade> natefinch, if thumper wasn't fixing the debug-log mess I don't think anyone else was
<fwereade> natefinch, I would be most grateful if you would unfuck that
<natefinch> fwereade: heh ok
<thumper> fwereade: ping?
<perrito666> sinzui: back
<fwereade> thumper, heyhey, just having a ciggie, with you in 5
<thumper> fwereade: ack
<perrito666> sinzui: ping me when you have a moment
<sinzui> perrito666, I am about
<perrito666> ok, if you can just resolve my doubt about the commit hash so I know where to start looking
<sinzui> perrito666, I will arrange for that annoying 'r' to be removed tomorrow from future hashes. Maybe I can also get the branch and hash added to the tests that are missing that info
<perrito666> sinzui: duhhh
<perrito666> sorry I hadn't noticed the r
<perrito666> my apologies
 * perrito666 prepares coffee
<perrito666> sinzui: well, I am looking at the offending commit.. as usual, it does not seem to change anything relevant... such is life
<sinzui> perrito666, no need to apologise. Humans should work with Jenkins. We are adding links to the s3 data now. next week, engineers can go to pages like this to download everything that the tests collected http://reports.vapour.ws/releases/1871
<sinzui> perrito666, there are three suspects https://bugs.launchpad.net/juju-core/+bug/1373611
<mup> Bug #1373611: cannot restore a HA state-server <backup-restore> <ci> <ha> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1373611>
<sinzui> perrito666, and I agree none look like they wanted to change restore
<perrito666> whyy whyyy, something always breaks, and the commit is so long
<waigani> On ReviewBoard, is there a way/place to see a list of review requests that you are involved in?
<waigani> or a list or your reviews?
<ericsnow> waigani: 2 places: "Outgoing" and via some of the columns
<perrito666> sinzui: it does not really matter; that filter should be removed since I made restore HA compatible, pushing a fix
<waigani> ericsnow: what I'm actually after is a list of my reviews - is that possible?
<waigani> I did a bunch yesterday and now I need to follow up
<ericsnow> waigani: one of the columns is "My Comments" (it shows a colored square with a curly right edge)
<waigani> ericsnow: right, just found that, thanks
<ericsnow> waigani: np
<perrito666> who is OCR?
<perrito666> http://reviews.vapour.ws/r/101/
<perrito666> or who would be so nice to R this anyway, it is extremely trivial
<perrito666> sinzui: that patch fixes the issue
<perrito666> I am not sure if this is a regression or what in the universe
<sinzui> thank you perrito666. I really appreciate your fast response
<thumper> perrito666: if it works in HA, I'm +1
<perrito666> thumper: sinzui my theory is that a bug that had environs.Environ StateServerInstances returning the wrong number of instance ids was fixed and that is why that condition triggered
<perrito666> after I finish putting away the groceries I just bought I'll merge it. tx for the review thumper and ericsnow
<menn0> thumper: can you remind me of how to get a State connected to another env in tests? you mentioned something recently.
<thumper> menn0: st.ForEnviron(tag)
<menn0> thumper: cheers. I had been searching but it wasn't jumping out.
#juju-dev 2014-09-25
<menn0> thumper: I was going to add a test to make sure the machine unit watcher doesn't show things from other environments but that's opened a whole can of worms in terms of filtering work we need to do and required test infrastructure
<menn0> thumper: do you want me to press on or do the minimal work required to get the units env UUID work landed?
<menn0> thumper: I'm aware that the identity work needs to get done too
<ericsnow> how do you get an agent.Config from state?
<axw> ericsnow: you don't. agent.Config is read from disk, and only from disk
<thumper> menn0: how big is that can of worms?
<menn0> thumper: at least the rest of the day
 * thumper hears "and Friday"
<ericsnow> axw: my real question is where do I look up the environment's data dir
<menn0> thumper: likely
<ericsnow> axw: from what I could tell environs.Config doesn't give it to you
<thumper> menn0: well, we know we will need a bunch of extra tests later anyway
<axw> ericsnow: that'd be encoded in the upstart job
<thumper> menn0: I say punt on it for now, but leave notes in the code
<menn0> thumper: and actually, since writing to you last i've realised that we can't really do this until machines have been migrated too
<axw> ericsnow: the agent is told where it is as a command line arg
<thumper> menn0: that makes sense
<axw> ericsnow: what are you trying to do?
<thumper> menn0: because they all have dependencies on each other
<ericsnow> axw: that's what I gathered
<ericsnow> axw: for backups we currently have a bunch of paths hard-coded
<menn0> thumper: I was trying to set up a second env with its own machines, services and units to test with and kept running into collisions with the initial env
<thumper> heh
 * thumper nods
<menn0> thumper: I will add some TODOs
<ericsnow> axw: I'm looking at how to pull those from elsewhere
<axw> ericsnow: IIRC, datadir is passed into the apiserver as a "resource" ... one sec
<ericsnow> axw: you're right
<menn0> thumper: I have a fair idea of how the test infrastructure for this should look (i.e. a new method on ConnSuite to create a new env and an easy way to get a Factory for that)
 * thumper nods
<axw> ericsnow: can you not take it from there then?
<thumper> axw: CI blocker: http://reviews.vapour.ws/r/104/diff/ ?
<ericsnow> axw: yeah, I guess I was hoping for a single-source-of-truth for paths
<axw> thumper: looking
<thumper> ericsnow: are you guys looking at https://bugs.launchpad.net/juju-core/+bug/1373611
<mup> Bug #1373611: cannot restore a HA state-server <backup-restore> <ci> <ha> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1373611>
<thumper> perrito666: is this your fix from before?
<ericsnow> thumper: no
<axw> thumper: bugger, I thought our compatibility was only the other way around...
<thumper> axw: I pasted back in the code I deleted earlier
<ericsnow> thumper: http://reviews.vapour.ws/r/101/ has a fix
<axw> thumper: yep. LGTM
<thumper> axw: yeah... see the bug description https://bugs.launchpad.net/juju-core/+bug/1373424
<mup> Bug #1373424: method Client.AddMachinesV2 is not implemented <api> <ci> <compatibility> <regression> <juju-core:In Progress by thumper> <https://launchpad.net/bugs/1373424>
<ericsnow> thumper: for #1373611
<mup> Bug #1373611: cannot restore a HA state-server <backup-restore> <ci> <ha> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1373611>
<thumper> ericsnow: I was wondering if that was it
<thumper> ericsnow: has it been submitted?
<ericsnow> thumper: don't think so: https://github.com/juju/juju/pull/836
 * thumper submits it
<wwitzel3> does anyone know where in the API call steps we unmarshall values?
<thumper> wwitzel3: in the magic bit of the code
<thumper> wwitzel3: juju/rpc maybe?
<wwitzel3> thumper: well, I'm getting ERROR json: cannot unmarshal object into Go value of type names.Tag
<thumper> yeah...
<wwitzel3> but the logs have no information on where this is being generated from at all
<thumper> the wire protocol needs to use strings
<thumper> until we fix the wire protocol
<thumper> wwitzel3: we could fix it...
<thumper> wwitzel3: you just need to provide the json serialization functions for all the concrete types
<thumper> wwitzel3: the error is being raised by the json serialization code
<wwitzel3> thumper: where is that code? apiserver? the jsoncodec call?
<thumper> wwitzel3: the json code? inside the standard library
<thumper> json.Marshal
<wwitzel3> thumper: I mean within our code, where is the unmarshalling happening?
<thumper> wwitzel3: I'm guessing, but probably in juju/rpc where it puts the code on the wire
<thumper> wwitzel3: api.client unmarshals actually
<wwitzel3> thumper: if I could fucking spell, I'd probably have less questions
<wwitzel3> acking for "unmarshall" as an exact match wasn't ever going to be helpful
<wwitzel3> I'm a moron
<wwitzel3> thank you
<wwitzel3> Umarshall that is
<wwitzel3> I at least had the punctuation right
<wwitzel3> thumper: so, is the solution to switch back to strings? or to define how to unmarshal names.Tag?
<thumper> wwitzel3: yes...
<thumper> wwitzel3: for assistance, version.go:117
<thumper> not sure if you want to go that way or not just yet
<wwitzel3> thumper: a Tag is pretty easy type to implement that for
<thumper> wwitzel3: yep
<wwitzel3> thumper: and it is still that something based on string isn't easily serializable to JSON anyway
<thumper> wwitzel3: because we even have the magic parseFooTag methods
<wwitzel3> s/still/silly
<wwitzel3> ok, I'll go that route since I think it makes things better
<thumper> good luck
<wwitzel3> it's just typing
<thumper> and tests :)
<wwitzel3> yep
<davecheney> can someone help me
<davecheney> for some reason I can no longer see both the comments and the diff on the same screen
<thumper> davecheney: maybe
<davecheney> something has happened and reviewboard does not show me the comments that others have made on a review
<davecheney> unless I go to the "view review" screen
<davecheney> but then it doesn't show me the code
<davecheney> and I need to go to the "view diff" screen
<thumper> davecheney: I've had to go to view diff screen too
<thumper> I thought it was just me getting used to the tools
<perrito666> thumper: did you merge my fix?
<thumper> perrito666: aye
<perrito666> thumper: thank you sir
<davecheney> thumper: this worked a few weeks ago
<davecheney> ie, on the diff screen you would see inline comments
<davecheney> now they are all hidden on another screen
<perrito666> btw axw you might have inadvertently fixed a problem in environs.Environ StateServerInstances congrats, you now subconsciously rock
<thumper> how the hell do I change the topic
<axw> perrito666: heh, what was that?
<davecheney> massive storm in sydney atm
<davecheney> if i go offline
<davecheney> please sent paramedics
<thumper> heh
<perrito666> axw: well the bug I fixed should have been triggered long ago but apparently StateServerInstances was returning only one instance :p and after your commit it now returns all the state servers
<thumper> ugh
<axw> perrito666: ah :)  that only happens for new environments tho
<thumper> the bot still thinks there are ci blockers
<perrito666> davecheney: don't you prefer us to send an internet technician? I mean I don't see paramedics getting you back online
<thumper> but they are both fix committed now
<perrito666> thumper: takes a bit I think
<axw> perrito666: it was an intentional improvement, didn't know it fixed anything in particular tho :p
 * thumper taps his fingers impatiently
<perrito666> axw: neither did I, in fact none of us knew it was broken
<perrito666> axw: the test you added does the same thing restore did, that is why I realized
<davecheney> thumper: http://reviews.vapour.ws/r/79/diff/#
<davecheney> can you go to line 100 in filestorage/wrapper.go
<davecheney> and tell me if the line is properly gofmt'd on your screen
<davecheney> i'd like to know if that was something that RB could do
 * thumper looks
<davecheney> or this one http://reviews.vapour.ws/r/103/diff/#
<davecheney> api/backups/list_test.go lines 27 onwards
<davecheney> it looks like RB gives up after three levels of indentation
<thumper> davecheney: looks properly formatted to me
<thumper> davecheney: list_test looks fine to me too
<davecheney> hang on, screenshotting
<davecheney> thumper: check your mail
<thumper> davecheney: yeah, yours looks different to mine
<thumper> davecheney: I see proper formatting
<thumper> davecheney: may I suggest it is your shitty browser?
<thumper> :)
<davecheney> chrome, just like everyone else
<thumper> hmm...
<thumper> it really does look fine here
<thumper> chromium here
<davecheney> fine, i'll just ignore any formatting issues
<davecheney> the bot will catch them for me
<thumper> gah... why is the bot not working out that we have fixed the blockers?
<menn0> thumper: http://reviews.vapour.ws/r/93/diff/ PTAL
<thumper> menn0: ok
<cmars> thumper, i told the buildbot that those blockers were fixed. it seemed to accept builds now
<thumper> cmars: how?
<thumper> cmars: I thought it looked at LP
<cmars> you put $$fixes-<bug#>$$ in the merge comment
<davecheney> $$__JFDI__$$
<cmars> ah, crap. buildbot is not remembering the fixes-NNN. i suppose that was already tried..
<menn0> thumper: are you still looking at that PR?
<thumper> yup
<menn0> thumper: ok
<thumper> cmars: I did that
<menn0> thumper: just making sure you didn't forget to hit publish
<thumper> cmars: that is how it landed the fixes
<sinzui> cmars, CI is blocked by the two critical bugs in the topic. a fix for 1 of them is being tested now
<thumper> sinzui: I thought the blockage got removed when we committed them?
<sinzui> no
<thumper> sinzui: are they removed later now?
<thumper> ah
<sinzui> CI was being reopened but the tests failed
<sinzui> we now wait for someone to mark the bug fix released when the test has passed
<thumper> sinzui: ack
<wwitzel3> oh man, so apparently if you implement a method on an interface with a pointer receiver .. every single place in the world you've ever used that breaks
<thumper> wwitzel3: you can't implement a method on an interface
<thumper> at least I didn't think so...
<thumper> davecheney: ?
<thumper> wwitzel3: yeah, the more I think about it, the more I think that you can't do that, it makes no sense
<thumper> menn0: published response
<wwitzel3> thumper: no, I mean on the concrete implementation
<wwitzel3> thumper: sorry, it's late, shitty wording
<thumper> wwitzel3: ok
<thumper> why does it break?
<wwitzel3> thumper: I added the UnmarshalJSON method to the tag interface and then added an implementation on MachineTag. Now I have several dozen errors for "does not implement Tag (UnmarshalJSON method has pointer receiver)". everywhere we type-assert tag.(MachineTag)
<wwitzel3> which makes sense
<wwitzel3> I would need to pass in a pointer
<wwitzel3> but there are a ton of touch points in the test :/
<thumper> wwitzel3: you don't need to add the marshal and unmarshal methods to the tag interface...
<thumper> wwitzel3: or at least, not yet
<wwitzel3> but the params we are sending over the wire are []names.Tag
<thumper> wwitzel3: yes, the unmarshal needs a pointer
<thumper> wwitzel3: as it changes the instance
<thumper> wwitzel3: and not a copy of it
<thumper> wwitzel3: hmm...
<thumper> maybe you do need it ?
<wwitzel3> if I don't add them to the interface, it complains it doesn't know how to Unmarshal a Tag, which makes sense
<wwitzel3> I'll figure it out eventually
<wwitzel3> tag.(*MachineTag) .. is probably what I want
<menn0> thumper: thanks
<menn0> thumper: I did make the change in service.go!
<thumper> menn0: did you
<menn0> thumper: the tests wouldn't be passing without it
<thumper> must have missed it
<menn0> thumper: but you're right that a migration step would also be needed so I might just undo that
<thumper> menn0: ah, so you did
<thumper> yeah, it makes me a little sad, but probably faster for now
<menn0> thumper: or *evil grin* could we just strip a trailing "/" off the prefix if it's there?
<thumper> haha
<menn0> thumper: I'm half-serious
<thumper> do it
<menn0> thumper: alrighty
<wwitzel3> that's my favorite exchange of the night
<menn0> thumper: http://reviews.vapour.ws/r/93/diff/
<thumper> menn0: do your upgrade test while I have a coffee
<thumper> that sounds a bit weird
<thumper> but Rachel just got home
 * thumper cracks the whip
<menn0> thumper: doing it already :)
<menn0> thumper: what's a quick and easy subordinate charm I can add to my test environment?
<davecheney> menn0: nagios-nrpe
<menn0> davecheney: thanks
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1373611
<menn0> thumper: well that didn't go so well...
<menn0> thumper: some services have disappeared
<menn0> thumper: I'm going to check the DB
<thumper> oh...
<thumper> jam: hey there
<thumper> menn0: so... how is the db looking?
<thumper> menn0: did you want to talk it through?
<menn0> thumper: the db looked as I expected it but obviously something wasn't right
<wwitzel3> so the Marshal methods have to be part of the interface definition for the generic names.Tag to support json encoding, and eventually I got it all working on the names package, but there are just too many places where we do something like []names.Tag{NewUnitTag()} .. which no longer works the moment you have a pointer receiver
<menn0> thumper: I'm just testing the same env with master so that I can be sure it was the units change
<thumper> menn0: ok,
<menn0> thumper: ok master is fine
<menn0> thumper: going to rebase my branch to upstream and recreate the problem again
<thumper> kk
 * thumper notices that st.AllServices() definitely leaks services across environments...
<menn0> thumper: there's tons and tons of stuff that leaks across environments
<menn0> thumper: once we have the base DocID changes done we'll need to circle back and deal with all that, writing tests as we go
<thumper> agreed
<menn0> thumper: I was thinking it might be a good idea to set up an alternate environment in the setup of ConnSuite that doesn't get directly referred to in tests but will expose places that leak
<menn0> thumper: unexpected data should show up all over the place and cause tests to fail
<menn0> thumper: this canary env should be fairly well set up with several machines, services, subordinate units etc to maximise the stuff it'll expose
 * thumper nods
<menn0> the mediawiki charm just won't install at the moment
<thumper> that seems reasonable, but potential overkill
<menn0> I keep having to ssh to the box and do an "apt-get update" and then run "juju resolved --retry mediawiki/0" to get it to install
<menn0> is that expected?
<menn0> this is with 1.20.7
<thumper> menn0: local?
<menn0> thumper: yep
<thumper> menn0: could be due to the apt upgrade / update thing
<thumper> your template is out of date
<menn0> thumper: right
<menn0> thumper: very likely
<menn0> thumper: delete the template machine or is there a faster way? (start the template and update it?)
<thumper> yeah
<thumper> yes to starting the template
<thumper> and logging in, and doing an apt-get update/upgrade
<menn0> thumper: ok, i'll do that
<thumper> then shutting it down again
<wwitzel3> (â¯Â°â¡Â°)â¯ï¸µ â»ââ»
<thumper> â¬ââ¬ï»¿ ã( ã-ãã)
<bradm> hey, do we have a stable 1.20.8 yet?  wondering when I should expect to need to change to it
<menn0> thumper: http://paste.ubuntu.com/8423049/ :(
<menn0> thumper: the exact final result doesn't appear to be deterministic either
<thumper> menn0: oh crud
<thumper> menn0: is it the status collection?
<thumper> menn0: units have status
 * menn0 checks
<thumper> the status command would be looking there, right?
<menn0> yeah but we haven't changed the statuses collection keys yet.
<menn0> maybe that's the problem.
<menn0> it didn't break services...
 * thumper nods
<thumper> menn0: I think...
<thumper> menn0: that status should still keep local ids
<thumper> but need to check
<thumper> it is how it gets the underlying entity
<thumper> I'm being called for dinner
 * thumper will be back for team-lead meeting later 
<dimitern> morning all
<menn0> thumper-afk: the migration steps for units and services have mixed everything up.
<menn0> thumper-afk: for example, all services now have a name of "rsyslog-forwarder-ha"; I can only tell them apart by the _id.
<menn0> thumper-afk: must be something with the generic upgrade function. I'll take a look
<dimitern> davecheney, hey
<davecheney> dimitern: o/
<dimitern> davecheney, due to the way names.NewEnvironTag() works, you *can* create an invalid tag without panicking, i.e. tag := names.NewEnvironTag(""), then tag.Id() == "" is equivalent to tag == nil if tag used to be names.Tag
<davecheney> dimitern: thanks for confirming
<davecheney> i think it is still a bug
<davecheney> i think the problem is NewEnvironTag("") should fail
<dimitern> davecheney, I agree
<dimitern> davecheney, but environ tags are relatively recent, and we still have to support the case when it's not set
<davecheney> dimitern: i wish thumper was here
<davecheney> environ tags are basically mandatory for mees
<davecheney> mess
<davecheney> whatever
<davecheney> we can't keep continuing to say, oh well, if you don't have an env tag we can just overlook it this time
<dimitern> davecheney, since I saw your comments too late, I'll do a follow-up to do some of your suggestions and reply about NewEnvironTag
<davecheney> ok, thanjs
<davecheney> thanks
<menn0> thumper-afk: I've figured out the garbled data following the migration. if you read data into a map using mgo and then save that map in a slice you need to recreate that map before starting the next loop. setting it to nil was sufficient.
<menn0> thumper-afk: there's still what looks like a presence issue. after the upgrade the units are marked as down. the agents are started (in the DB too) but the presence check is failing.
<menn0> thumper-afk: I'll continue tomorrow. I need to EOD.
<dimitern> davecheney, wanna have a look? http://reviews.vapour.ws/r/105/
<davecheney> looking
<davecheney> dimitern: looks good, two issues
<dimitern> davecheney, cheers
<dimitern> davecheney, I don't think returning an error there is sensible, due to the possibility of not having environ UUID at the endpoint level
<dimitern> davecheney, and the login *will work* without it, for the same reason
<davecheney> sure
<dimitern> davecheney, but maybe adding a warning and continuing will be useful
<davecheney> var e names.EnvironTag
<davecheney> e.String()
<davecheney> no
<davecheney> e.Id() => ""
<davecheney> so there is no way to distinguish between presenting an environ id that is "", and the environ id being invalid and passing "" as a fallback
<davecheney> simply put, there is no point validating if we're not going to reject values which are not valid
<dimitern> davecheney, given a names.EnvironTag already, Id() == "" is the same as it being invalid
<dimitern> davecheney, but names.IsValidEnvironment(uuid-or-tag.Id()) works
<davecheney> crap
<dimitern> davecheney, the point of that code there is to only use the uuid if it's valid (an improvement from before when we didn't even validate the uuid)
<dimitern> davecheney, that's why I suggest to do a logger.Warningf("API endpoint has invalid environment UUID %v", uuid) when it's not valid, but still let it through
<davecheney> i guess that is all we can do
<davecheney> this will end up being a bug
<davecheney> i'm sure of this
<davecheney> but what else can we do
<dimitern> or maybe "ignoring invalid API endpoint environment UUID %v"
<davecheney> sure
<dimitern> yeah, we can incrementally make it better :)
<dimitern> davecheney, updated - http://reviews.vapour.ws/r/105/diff/1-2/
<dimitern> davecheney, is it good to land like this?
<axw> jam: contributions to things like gwacl still need CLA signing, right?
<jam> axw: I'm pretty sure all our stuff is (c) Canonical under the same CLA
<jam> so they don't need to sign it separately
<jam> but it does need to be signed
<axw> jam: ok - someone sent in an MP for gwacl specifically
<axw> if they've signed, it shows up on their LP ~ page right?
<jam> axw: I see "Canonical Contributer Agreement" as one of my groups on  https://launchpad.net/~jameinel/+participation
<jam> so I think so
<axw> thanks
<jam> though I don't see it for https://launchpad.net/~axwalk/+participation
<dimitern> jam, hey, can you have a look at http://reviews.vapour.ws/r/105/diff/ please?
<jam> I'm not sure that everyone in "Canonical" is also in CCA
<jam> dimitern: can you actually compare Tag objects? I thought it was an interface, and you don't want to compare exact internals
<jam> if you can, those changes seem fine
<axw> jam: AFAIK, only required if you don't work for Canonical; employment contract covers employees
<jam> I'm just concerned if doing "t1 := names.Tag("foo"); t2 := names.Tag("foo"); t1 != t2"
<jam> axw: right, you don't need to sign it, I just mean you need to look for one or the other
<axw> ah yep
<dimitern> jam, only names.Tag is an interface, the others are simple structs implementing it
<jam> dimitern: axw: https://launchpad.net/~not-canonical :)
<axw> hehe
<jam> I got there from jelmer, who used to work at canonical
<jam> Trying to find someone who isn't me or at canonical who should have signed the agreement to confirm participation
<dimitern> jam, yep that's the "former employees and people often mistaken as employees" list :)
<jam> dimitern: is there a simple way to link part of an RB diff?
<jam> I'm looking at one that is very-much comparing a names.Tag to a names.Tag
<dimitern> jam, you can link a X-Y diff, like this http://reviews.vapour.ws/r/105/diff/1-2/
<jam> dimitern: yeah, I want to link a line in the review: http://reviews.vapour.ws/r/105/diff/#1
<jam> one of those is *definitely* a names.Tag
<dimitern> jam, apistate.EnvironTag() (names.EnvironTag, error)
<dimitern> jam, and environ.EnvironTag() (names.EnvironTag)
<dimitern> jam, confusingly enough, environ has at least 4 more Tag methods - ServerTag, Tag, etc.
<jam> func(tag names.Tag) bool {
<jam> dimitern: ^^
<jam> from 3 lines above that
<dimitern> jam, aah, this one
<dimitern> jam, yeah, but it really is a names.EnvironTag
<dimitern> jam, because we do tag, err := ParseEnvironTag(strTag) and then call canAccess(tag)
<dimitern> jam, it's just wrapped inside a names.Tag due to the way AuthFunc is defined
<jam> dimitern: http://play.golang.org/p/wbU4gcGnWe seems to work
<jam> the interface isn't the same, but it seems to compare the underlying object
<dimitern> jam, http://play.golang.org/p/DzZ6RmhiE3
<dimitern> jam, :) yep
<jam> dimitern: in your example you're using concrete types, Tag isn't doing anything
<jam> ah, I guess 'u' was wrapped in a Tag
<dimitern> jam, exactly
<dimitern> jam, if I had func(tag Tag) bool { return tag == internalTag }, I could call it like f(Tag(u))
<dimitern> oops, I meant just f(u)
<jam> dimitern: so what I was more concerned about was the e1 == e2
<jam> because the interface comparison *could* compare the underlying pointers
<jam> rather than the objects that those pointers represent
<jam> dimitern: but you can use all kinds of wrappers, "==" seems to just compare the underlying structs
<dimitern> jam, yeah, but if that's the case st.EnvironTag() will return some different pointer than the tag arg points to
<dimitern> and they'll never be ==
<dimitern> yeah, I seem to recall some doc or article about golang that comparisons work like that
<dimitern> jam, so, is it good to land?
<dimitern> jam, I have at least 2 follow-ups queued :)
<jam> dimitern: this seems strange: if names.IsValidEnvironment(apiInfo.EnvironTag.Id()) {
<jam> convert it from being a tag, just to check if it was a valid tag
<jam> not saying there is a better way
<jam> LGTM
<dimitern> jam, if you follow the discussion we had earlier with davecheney
<dimitern> jam, NewEnvironTag does not panic or validate at all what you give it
<dimitern> it's a bug, but kinda intentional wrt backwards-compatibility for envs without uuid
<dimitern> jam, thanks!
<jam> dimitern: sure, it just feels like something that should be "tag.IsValid()"
<davecheney> jam: no
<davecheney> i disagree
<davecheney> you should never have to ask tag.IsValid
<davecheney> there are two possiblitues
<davecheney> var t names.Tag
<davecheney> t == nil { // not a valid tag, in fact not a tag at all
<davecheney> var t names.UserTag
<davecheney> t _must_ have a valid value
<davecheney> ie, don't do this
<davecheney> do this insteat
<davecheney> t := names.NewUserTag
<dimitern> -> panic
<dimitern> better do ParseUserTag
<davecheney> yes
<jam> dimitern: well, you should never panic in response to user input.
<davecheney> jam: good point
<dimitern> +100
<davecheney> william thinks we should only use tags over the api
<davecheney> so in a way they are not user input
<jam> davecheney: validating in the client sending the data != validating in the server receiving the data
<dimitern> but they are still strings when passed over the wire
<jam> "just don't send bad data"
<davecheney> yes, and we do that across the api boundaries with ParseUserTag
<jam> and then my servers won't crash
<davecheney> it gets more interesting when you're writing the implementation of, say
<davecheney> juju ssh unit/0
<davecheney> the command needs to convert that string into a unit tag
<davecheney> and it isn't as simple as
<dimitern> i'm itching to write a simple DoS script that given an api endpoint tries API calls with all sorts of crazy tags
<dimitern> :)
<davecheney> "unit-"+argv[1]
<davecheney> dimitern: you're welcome to try
<davecheney> that path is very well tested
<dimitern> davecheney, yeah, at least for all facades I wrote; not all of them check all cases though
<aznashwan> hey guys; I'm currently working on some tests in which I need to patch the container type as obtained from (juju/api.State).Agent().Entity(machineTag)
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: None
<jam> ah, ffs. "github.com/juju/txn" depends on "github.com/juju/juju/testing" not "github.com/juju/testing"...
<axw> jam: just commented on the PR - we can fix that by disabling TLS. I can do that now if you like
<axw> I don't think there's any need for TLS in these tests
<jam> axw: yeah, I brought that up to bogdan, but seems fine as long as someone else agrees.
<axw> dimitern: was there a particular reason for choosing ^ over ! for excluding networks?
<axw> (I'm just curious)
<natefinch> heh I was just going to ask that.
<axw> :)
<natefinch> man, I swear the calendar on my phone said the team meeting was now... must be an old copy
<dimitern> axw, natefinch, yes, mainly due to ! having special meaning in bash and also fwereade and jam recommended ^ vs. ! IIRC
<axw> dimitern: ok, I had suspected bash. makes sense.
<natefinch> wow, that is a damn shame
<natefinch> (bash, that is)
<natefinch> ! is so much more obvious, but yeah, since bash decided no one ever gets to use !, there's no real alternative
<axw> nobody's very happy with bash today
<jam> dimitern: TheMue: taking the dog out, will try to be back in time
<dimitern> jam, sure
<TheMue> jam: no problem
<natefinch> gsamfira: nice point on Juju units not needing rsyslog anymore
<natefinch> removing external dependencies is already paying dividends :)
<gsamfira> natefinch: I should have removed that dependency in the same commit. I don't know where my head was
<gsamfira> natefinch: less moving parts, the better
<natefinch> gsamfira: especially since we let people install random stuff on the same machine
<jam> dimitern: standup?
<dimitern> jam, sorry, brt
<jam> TheMue: I think you clicked the link :)
<TheMue> ouch
<jam> TheMue: I'll give it a look, right now i'm in the team leads meeting
<TheMue> ok
 * thumper -> bed
<cmars> jam, hangout?
<jam> cmars: brt
<cmars> jam, re: login API PR, just pushed a minor comment fix, and confirmed v0 facade fallback on current master
<jam> cmars: thanks
<jam> we just had: FAIL: upgrade_test.go:420: UpgradeSuite.TestLoginsDuringUpgrade fail in a seemingly unrelated change
<jam> is that a known flaky test?
<jam> It failed with:     c.Assert(err, gc.IsNil) ... value *params.Error = &params.Error{"", "upgrade in progress - Juju functionality is limited"} ("upgrade in progress - Juju functionality is limited")
<natefinch> spurious "upgrade in progress" errors is definitely a problem we've hit with the tests before... I forget exactly what causes it
<perrito666> natefinch: its a race-ish condition iirc
<perrito666> authenticator enters upgrade mode because of the upgrade worker and this doesnt finish before the test does whatever its trying to do
<perrito666> dimitern: ping
<dimitern> perrito666, pong
<perrito666> dimitern: I see that you assigned me a card during the night/morning
<perrito666> was that a mistake or you really wanted to assign that to me?
<dimitern> perrito666, mistake, sorry - it was about the AddMachinesV2 API not implemented on 1.18; I added a card, found the issue, commented on the bug and realized I won't have time to fix it, but thumper did it
<perrito666> np, just trying to figure out if action was required from me
<perrito666> I wonder if this is the image people have of us http://io9.com/scorpion-brings-the-stupidest-most-batshit-insane-hack-1638333877
<natefinch> WTF
<natefinch> seriously, totally unrealistic.  If you have to stop as fast as possible, why would you turn your tires SIDEWAYS?!
<cmars> perrito666, lol, that was amazing
<natefinch> I don't care about the ridiculous antics of flying an aircraft 20 feet off the ground w/ ethernet dragging behind... but the fact that they think you can't land an airplane without software is just ridiculous.
<Spads> especially when they show you can get most of a touch&go
<perrito666> natefinch: apparently his car doesnt have abs
<perrito666> also, really? nobody has a laptop? does not seem to be a very large quantity of data
<perrito666> they can copy it to a sd and then just throw that over the window :p
<wwitzel3> I know it is my own fault, but perrito666 .. you owe me 2:17 of my life back
<cmars> i'm just still amazed at the strength of that rj45 connector
<perrito666> I believe that these shows are written the following way: "we have the following stunts in the budget, let's build some story around them"
<wwitzel3> cmars: that is the only believable thing from the video .. I've seen the damage a cable accidentally hooked around a foot does ;)
<cmars> ooh ouch
<perrito666> cmars: I am equally amazed by the strength of the laptop's rj45 adapter, which should be about as strong as a mini usb connection
<cmars> jam, what do you think re: https://github.com/juju/juju/pull/392. ok to land?
<wwitzel3> perrito666: also if you're driving a 458 you should probably have flipped the ABS to off anyway, otherwise you shouldn't be driving a 458.
<perrito666> wwitzel3: I drive an opel corsa from BEFORE abs so I wouldnt know
<wwitzel3> I have arrays, I'd like all their content to be in a single array .. there are points in the code where we validate each item, in each array .. so should just add an append to the validation loops of each array?
<perrito666> wwitzel3: you lost me a bit there
<wwitzel3> I have 3 []string, I want to make a 4th []string that holds the contents of the all the []string. There are loops that already validate each item in the original []string.
<wwitzel3> So, should i just add an append for my new array to each of the original []string loops?
<wwitzel3> Or is there a better Go way
<jam> wwitzel3: is it ok to modify in place?
<jam> wwitzel3: you can do "ar1 = append(ar1, ar2...)"
<jam> ar1 = append(ar1, ar3...)
<jam> else, you can create a new ar, and then do appends
<jam> wwitzel3: but "append(foo, slice...)" will append all of slice to foo
<jam> wwitzel3: no loop needed
<natefinch> wwitzel3: there's no "mash these N slices together".  foo = append(foo, bar...); foo = append(foo, baz...); foo = append(foo, bat...) is the best you get
<perrito666> natefinch: that would be a nice feature to have
<natefinch> natefinch: not really.  The 3 lines are perfectly clear.  If you really want, you can make it into a single loop  for _, s := range [][]string{bar, baz, bat} { foo = append(foo, s...) }
<perrito666> talking to self?
<katco> natefinch: you could chain your appends too: append(append(append(foo, bar...), baz...), bat...)
<perrito666> katco: nice listp
<perrito666> lisp
<katco> oh c'mon that's not lisp lol
<cmars> append is like a cons... hmm
<katco> haha
<wwitzel3> yeah perrito666, lisp is 2 brackets per every other character.
<perrito666> lol
<wwitzel3> I actually like lisp, but it is just so much fun to pick on
<perrito666> wait, you can actually use it? I thought it was just for the lulz
<perrito666> :p
<perrito666> natefinch: ericsnow wwitzel3 stdup?
<natefinch> perrito666, ericsnow, wwitzel3: there's actually a TOSCA call now, and wwitzel3 and I should probably be on that
<natefinch>  can we delay until the afternoon?
<natefinch> I have another meeting after tosca today
<perrito666> natefinch: np, better for me I can continue with my enrique iglesias playlist
<ericsnow> perrito666: nice
<natefinch> haha
<perrito666> you guys laugh as if it was a joke
<natefinch> I'm just sad you don't share during standups
<ericsnow> perrito666: I wasn't laughing :)
<natefinch> I was
<ericsnow> perrito666, natefinch: now I'm laughing
<perrito666> its very good to concentrate, there is no way you can get carried away with his music
<natefinch> heh
<ericsnow> perrito666: so it's like a white noise generator/sound machine
<perrito666> ericsnow: latino white noise :p
<natefinch> lol
<perrito666> its much like listening to fm radio without the guy speaking and the pub
<wwitzel3> interesting, so if a method returns type FooBar, which is a concrete implementation of an interface Foo, I can write var f Foo .. then f = GetFooBar()
<wwitzel3> but I can't create a function that takes a function that returns type Foo and then pass it a function that returns type FooBar
<perrito666> wwitzel3: I believe it compares the identity of the function type there, what you propose requires a bit more intelligence
<perrito666> you can double wrap though
<perrito666> func() YourInterface { return yourConcreteFunc() }
<ericsnow> is there an easy way to restart the jujud process running on machine-0 ("service" and "initctl" don't see the upstart job)?
<natefinch> ericsnow: kill it and let upstart restart it?
<ericsnow> natefinch: isn't initctl an interface for upstart (it doesn't see the job)?
<ericsnow> natefinch: I'll try it
<ericsnow> natefinch: well, that worked
<ericsnow> natefinch: thanks
<ericsnow> ha, since starting with Go I keep typing $GOHOME instead of $GOPATH
<ericsnow> guess my subconscious is trying to tell me something
<perrito666> ericsnow: something strange indeed, since you work at home
<ericsnow> perrito666: yeah, but in the context of programming languages, Go isn't home :)
<perrito666> }
<perrito666> well I hope I never need to go home bc I don't really feel like writing assembly and BASIC again .p
<perrito666> :p
<ericsnow> ha
<natefinch> ericsnow: for me $GOPATH=$HOME, so in essence, $GOHOME is correct :)
<ericsnow> natefinch: :P
<natefinch> although, what really revolutionized it for me was $CDPATH=$GOPATH/src
<natefinch> so I can do cd github.com/juju/juju and it does the right thing
<natefinch> anyone know lxc and want to help a user in #juju?  he has a bunch of lxc containers that aren't coming up, though existing ones work fine
<jcw4> OCR: trivial change to fix go vet warning http://reviews.vapour.ws/r/106/
<natefinch> Anybody?  I am the world's least helpful person when it comes to lxc
<jcw4> natefinch: my only bit of wisdom about lcx containers is when the templates get messed up : http://irclogs.ubuntu.com/2014/07/30/%23juju-dev.html#t02:08
<katco> does juju not harvest machines if there are no running units on them?
<natefinch> katco: if you remove the last service on them it will, yes
<natefinch> it used not to by default, but now it does by default, and there's a setting somewhere to turn it off
<katco> well i worked on the harvest mode settings
<katco> i thought it worked like that, but i wanted to 2x check before replying to nick moffitt's email
<Spads> hi
<Spads> katco: so even if it's supported, it can be important to remove a unit, do a postmortem, and then manually destroy it later
<Spads> katco: so having the tools show you the states of things is essential even if you can have it auto-destroy
<katco> Spads: oh hi :)
<Spads> my last name has a unique spelling, so I tend to highlight on it :)
<katco> Spads: right i was just about to respond... i love your idea to return 0 if nothing is errored. i'll add that to the spec and look into it
<katco> Spads: :D
<katco> Spads: so does the filtering take care of your post-mortem use-case? exit-code aside
<Spads> katco: can you filter machines that are not associated with services?
<katco> Spads: sure, if you specify a state, it just looks for that state
<Spads> katco: but "not used by any service" isn't a machine state
<Spads> the machine shows up in machines: but nowhere else
<Spads> also again, I'm not interested specifically in ERROR
<katco> Spads: oh sorry, i misunderstood
<Spads> but also in anything that isn't perfect
<Spads> anything in progress
<katco> Spads: so you'd like a filter on un-utilized?
<Spads> anything pending
<Spads> yeah
<Spads> basically when I run juju status it's to learn about things that are out of the ordinary
<Spads> so I don't want to see services that are STARTED or machines that are JUST_FINE_THANKS
<katco> haha
<Spads> I want to see everything that's *not* those
<Spads> like I want to filter *out* states
<Spads> does that make sense?
<katco> Spads: hm. wondering if it would be acceptable to add a not conditional to the filter
<Spads> yeah
<katco> or if that's too fancy/scope creep
<Spads> well I think that filter/filter-out are a common pattern
<Spads> grep/grep -v
<Spads> etc
<katco> i don't think i can make the decision, but i at least understand what you're saying now
<Spads> select where not...
<katco> i'll write up a user-story for you to look at to make sure i have it right, and then i'll check in with a lead
<Spads> cool, do you want to summarise it for the list?
<katco> Spads: sure will do
<Spads> That's perfect.  Many thanks!
<katco> Spads: thank you for the input :)
<Spads> if I had my druthers
<Spads> juju status would behave as I described above
<Spads> and juju status --verbose would dump the whole state
<Spads> but I think that ship has sailed :)
<katco> as default
<katco> ?
<Spads> yep
<katco> i believe we're planning on moving to --format summary as default
<katco> which makes the situation at least a little better
<Spads> I'll have to find time to take a closer look at tip
<katco> but i don't think that decision has been made yet (hint hint)
<katco> like a 1.21 or 1.22 thing i think
 * Spads nods
<katco> at any rate. good discussion. i'll update the spec
<Spads> excellent.  Thanks for your time!
<bodie_> so what's the policy on gopkg dep versioning?
<bodie_> I have a breaking change to charm I need to make, and a juju/juju branch ready to land which unbreaks it, I just need to figure out whether I need to move to charm v5 or how to properly integrate my work
<mgz> bodie_: what specifically?
<bodie_> it's just a variable rename but it does cause juju to fail the tests without the core branch, so I'm thinking it would need a new gopkg charm version
<rick_h_> bodie_: on the charm package?
<mgz> ah. not sure that's been fully argued yet. in theory, yes, vNEXT and change imports and dependencies.tsv for juju
<mgz> but there was a thread recently with jam and rog about charm specifically
<rick_h_> bodie_: because we have stuff on that as well to keep up to date. rogpeppe has done a lot of work around the charm package and versioning and would be good to make sure we're in the loop
<rogpeppe> bodie_: what's the variable?
<bodie_> rogpeppe, renaming ActionRequested to Action per fwereade, and removing it from unitHooks
<rogpeppe> bodie_: in general if something's in gopkg.in, we try very hard to keep changes backwardly compatible according to the rules specified in http://gopkg.in/
<jcw4> rogpeppe: fwiw, I'm fairly sure no-one has actually depended on that variable in production yet....
<rogpeppe> bodie_: you could just keep the old ActionRequested variable around
<rogpeppe> bodie_: and document it as deprecated
<jcw4> rogpeppe: +1
<bodie_> hmm...  so use both
<rogpeppe> bodie_: it can be deleted if/when we change the version
<bodie_> ok, and then leave it as v4 since it's not a breaking change?
<rogpeppe> bodie_: yeah
<rogpeppe> jcw4: i also am tempted to agree with you about known users
<rogpeppe> jcw4: but...
<bodie_> rogpeppe, that makes sense.  my other question is whether I need to rebase it on master or on the godeps'd version
<rogpeppe> jcw4: i think it's nice to exercise this stuff so we know what to do when we *do* have users...
<jcw4> rogpeppe: +2
<bodie_> rogpeppe, jcw4, there's already code in core using ActionRequested
<bodie_> fwiw
<rogpeppe> bodie_: i don't understand that question
<rogpeppe> bodie_: why do you need to rebase anything?
<bodie_> rogpeppe, I have a commit to charm I want to land on top of the commit history... what I normally do is rebase my changes on top of master to ensure they're compatible with the current version
<rogpeppe> FWIW, i agree with fwereade about the name change
<rogpeppe> bodie_: that seems usual, yes
<rogpeppe> bodie_: (i often don't actually rebase, but merge then reset)
<bodie_> rogpeppe, but in this case, the version of charm in use by juju core (i.e. the hash from godeps) isn't master, unless I'm mistaken
<rogpeppe> i seem to get fewer conflicts that way
<bodie_> interesting
<rogpeppe> bodie_: that shouldn't influence how you land changes to charm
<rogpeppe> bodie_: it just means that juju-core hasn't caught up with recent charm changes
<bodie_> rogpeppe, I just don't want to make a change to godeps that implicitly includes a charm version that juju isn't built against
<bodie_> i.e. if Juju is on charm version C and my change is version A (i.e. master), but A includes B because A was rebased onto B instead of onto C.... am I making sense here?
<bodie_> say B was the previous master
<rogpeppe> bodie_: the entry in godeps should always reference a commit in the dependency's history
<rogpeppe> bodie_: you'll just be adding to the head of that history, so juju will be pointing somewhat back in time
<rogpeppe> bodie_: and when juju is ready, it can change godeps to point to the new charm master
<rogpeppe> bodie_: i'm not really understanding your question in fact
<rogpeppe> bodie_: godeps *specifies* which charm version juju is built against
<bodie_> right
<rogpeppe> bodie_: and all that matters is the state at that actual referenced commit. if there are some commits in the history that juju was never built against, it doesn't matter
<rogpeppe> bodie_: in the end, just commit to the charm repo as if juju didn't exist
<rogpeppe> bodie_: then change juju to refer to the newly pushed commit by changing godeps
<rogpeppe> bodie_: does that make sense?
<bodie_> rogpeppe, then, I guess my question is more about juju; I notice godeps indicates a version of charm older than master, so if I land my charm changes on top of charm master, then update godeps to the new charm master, it will implicitly include the charm master that I landed on top of, which previously wasn't in juju's deps
<rogpeppe> bodie_: yeah, that's fine. it's water under the bridge
<rogpeppe> bodie_: the charm package could change very frequently, but juju wouldn't need to update godeps for every commit
<bodie_> rogpeppe, I assumed that if godeps points to an older charm hash, it's because juju isn't ready to have the newer charm master
<bodie_> ah
<bodie_> hm
<natefinch> bodie_: you're right that one of those new commits could break juju
<rogpeppe> bodie_: it's usually a good idea to update godeps to the latest commits to all dependencies
<natefinch> bodie_: (one of the ones that you didn't author)
<bodie_> natefinch, exactly
<rogpeppe> bodie_: particularly for gopkg.in deps
<rogpeppe> bodie_: for others, you have to be more careful because changes might not be backwardly compatible
<natefinch> bodie_: there's not really anything you can do about that except try it, and if it breaks, figure out who made the intervening commits, and figure out why it broke
<bodie_> rogpeppe, regarding gopkg.in deps, that does make sense; I think my confusion about gopkg came because I had previously used gopkg for charm and been advised not to do so
<natefinch> bodie_: but in theory, the whole point of using gopkg.in is that you're *not* supposed to make breaking changes on the same branch
<rogpeppe> bodie_: but you can't do anything too bad because the 'bot will throw your PR out if the updated dep breaks something
<rogpeppe> bodie_: yeah, davecheney doesn't like gopkg.in
<rogpeppe> bodie_: but it's too late now
<bodie_> true, and it passes all my tests.  hmm...  the current charm master must be for a juju branch that hasn't landed.
<rogpeppe> bodie_: and i think it makes a lot of sense, although there are potential issues too
<natefinch> I don't see what the difference is, really.  We could also just use github.com/juju/charm.v3 instead of a branch called v3
<rogpeppe> natefinch: except then you'd need a different repo for each version
<rogpeppe> natefinch: so you couldn't share issues between them
<natefinch> rogpeppe: right... my point is, gopkg.in isn't hurting anything.  It's just like using a completely different package whenever you change version numbers
<rogpeppe> natefinch: yup
<natefinch> rogpeppe: and it makes many things nicer, like issues on the
<natefinch> repo
<rogpeppe> bodie_: the recent charm changes were probably just me adding to the bundles in the charm testing dir
<bodie_> single point of failure, bus factor of one... just saying
<rogpeppe> natefinch: yeah. dave's objection is to having the version encoded in the package path
<rogpeppe> bodie_: it's not a spof
<rogpeppe> bodie_: there are fallback servers
<bodie_> well, only if whoever's paying for gopkg.in stops paying for it
<rogpeppe> bodie_: and the code is almost trivial
<rogpeppe> bodie_: you're assuming that it costs something to run
<natefinch> if gopkg.in goes away, the biggest thing we have to do is move our branches into separate repos and find/replace in the code
<rogpeppe> bodie_: i suppose the domain name costs something
<natefinch> or run our own redirector
<bodie_> fair enough
<natefinch> rogpeppe: servers that fail over to one another generally aren't free
<natefinch> rogpeppe: though I guess things like heroku and GAE do offer free tiers
<bodie_> yeah, I didn't mean a technical single server but rather the service itself, but I don't see it going away if people depend on it and it's foss
<bodie_> just the domain is mildly problematic
<bodie_> potentially
<natefinch> bodie_: the domain name can go away, and that's nearly as bad as the service going away.  But still, all it requires is a find&replace and you're back in business.
<bodie_> haha, I'm seriously not saying I *think* this would happen
<bodie_> yeah, good point
<bodie_> heh, I always wonder how long github is going to be around
<rogpeppe> bodie_: one nice thing about gopkg.in is that it needs no persistent state at all
<rogpeppe> bodie_: yeah
<rogpeppe> anyway, i gotta
<rogpeppe> go
<bodie_> sure, take care.  thanks for the advice
<rogpeppe> bodie_: np, good chat
<rogpeppe> g'night all
<rick_h_> thanks for the help rogpeppe, night
<bodie_> for the gopkg deprecated variable, should I mention it in README or is it sufficient to comment the variable with a TODO?
<natefinch> bodie_: I'd say in the code is fine
<perrito666> natefinch: meeting
<perrito666> ericsnow: likewise
<stokachu> is there a situation where juju would not run kvm-ok? like maybe on an amd system?
<stokachu> http://paste.ubuntu.com/8427274/, line 1484 just says kvm container creation failed
<stokachu> but thats it
<natefinch> is trunk broken?
<natefinch> I tried to bootstrap on ec2 and it takes a long time and then says "no instances found"
<natefinch> mmm.. godeps out of date.... will retry
<bodie_> can I get an LGTM on https://github.com/juju/charm/pull/53 please?
<bodie_> it's a -3/+2 pr
<natefinch> and yet it took two commits
 * natefinch is just giving you a hard time
<natefinch> bodie_: is that right, you just removed action-requested from unitHooks?
<bodie_> heh
<bodie_> natefinch, yeah, and added Action kind
<bodie_> labeled ActionRequested DEPRECATED
<bodie_> it's just a dep for 617, is the thing, which is horribly overripe and finally ready to land
<natefinch> ok, I guess?  I don't know that I have a good enough idea of how this stuff is used to have any clue if you're breaking everything or not.   I guess as long as no one calling UnitHooks() relies on ActionRequested being in there, that's ok.
<sinzui> natefinch, bodie_ I hope both of you can help with the situation in https://bugs.launchpad.net/juju-core/+bug/1374087
<mup> Bug #1374087: Joyent is not deploying services reliably <joyent-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1374087>
<bodie_> natefinch, doesn't break juju core
<sinzui> natefinch, bodie_ I don't have much information :( I can collect what you advise
<bodie_> natefinch, the goal is to prevent people from calling using an Action as a UnitHook
<bodie_> "calling using" = using... derp
<natefinch> bodie_: lgtm'd
<bodie_> natefinch, thanks
<natefinch> sinzui: looking
<natefinch> sinzui: do we have a contact at Joyent we could talk to?  You'd think that a change that breaks outside users would be something they'd want to communicate to partners
<sinzui> natefinch, I don't know who it is, but I know who I can ask
<jcw4> mgz: how do we persuade CI to pick up a merge when there are LOTS of comments on the review?
<perrito666> jcw4: that should be no blocker
<sinzui> jcw4, I think mgz merged his paging fix
<jcw4> perrito666: I vaguely remembered a discussion where lots of comments pushed the $$merge$$ onto a new page
<jcw4> sinzui: I see
<perrito666> jcw4: what sinzui said
<jcw4> sinzui: bodie_ is trying to land 617
<jcw4> and it's not being noticed by CI
<jcw4> sinzui, mgz if you get a chance can you help us out in getting CI to notice that 617 is ready to land?
<sinzui> jcw4, I don't have access to the machine or experience with the server
<jcw4> sinzui: okay, thanks!  I don't know who else to ask... wallyworld maybe?
<jcw4> I guess he's not on
<sinzui> I see the paged comment support is merged https://github.com/juju/jenkins-github-lander/commits/develop
<natefinch> btw if anyone wants to lend a hand in #juju, it would be appreciated.  Trying to debug one guy's mystery machine and now a second guy has come on.
<jcw4> natefinch: feel like a one armed paper hanger?
<natefinch> yep
<natefinch> ug... people really need to remember to update the help on commands
<natefinch> juju upgrade-juju still says odd minor versions are considered development versions
<hazmat> dumb golang question.. i'm looking at the ec2 provider / ec2.go ...  its got a couple lines like this var _ environs.Environ = (*environ)(nil)
<hazmat> is that some sort of side effect for the compiler to check?
<hazmat> ie. another.. var _ simplestreams.HasRegion = (*environ)(nil)
<natefinch> hazmat: yes, it's a compile-time check that environ fulfills the environs.Environ interface
<natefinch> just a belt and suspenders check, really, since tests should also be checking that... but it can't hurt to have the compile-time check, too.   It's useful for packages that intend to fulfill an interface, but don't actually use their type as that interface inside the package
<hazmat> magical side effects ;-)
<hazmat> cool
<hazmat> more like inline assert instance creation cast
<natefinch> sure
<natefinch> except not magical at all.  :)
<bodie_> any sufficiently advanced technology is indistinguishable from magic
<bodie_> :P
<katco> sinzui: thank you for all your efforts with releasing 1.20.8
<sinzui> you're welcome katco
<bodie_> hmmm, so how should I land 617?  the CI bot doesn't seem to be picking up my $$merge$$
<bodie_> I guess I have to wait for wallyworld?
<thumper> bodie_: I wouldn't wait for wallyworld, he on holiday for a bit :)
<bodie_> thumper, any suggestions?
<thumper> bodie_: which PR?
<bodie_> https://github.com/juju/juju/pull/617 @thumper
 * thumper shrugs...
<thumper> I wonder if the bot is awake
<bodie_> thumper, I think so.  jcw4 was saying he landed a trivial earlier today, I believe
<jcw4> yep
<bodie_> thumper, we thought it might be the pagination issue with PR comments, but apparently a fix for that was landed
<bodie_> I almost wonder if the fixed code is running on the ci server?
<thumper> bodie_: I have no idea, sinzui?
<bodie_> already pinged him earlier..
<sinzui> thumper, the lander is not connected to CI. It does defer testing to CI though
<bodie_> ah
<bodie_> so this is a dead end until wallyworld is back?
<ericsnow> running a local environment, should an "ubuntu" user exist on my box?
<sinzui> ericsnow, the answer is NO but...
<sinzui> ericsnow, there is an insane edge case that is not fixed. If your local host is an ubuntu server, you cannot delete the ubuntu user. It's ridiculous because localhost on server or desktop always creates the container under the login user's name, but when the os is ubuntu server and the ubuntu user is deleted, juju tries to use it anyway
<ericsnow> got it
<ericsnow> thanks
<sinzui> bodie_, I have a cunning plan...can you create a new pull request for your branch, and leave a comment with a link to the stuck PR and $$merge$$
<sinzui> bodie_, maybe the problem with the lander is that it is a few revs behind trunk, which has the paginated comment fix
<bodie_> sure, I'll try that.  thanks sinzui
<rick_h_> thumper: working on getting back in network went boom or something
<sinzui> I need to EOD now and it will be a hard reset because I need to switch to OS X to build juju and let my computer get the reboot it wanted on Monday
<bodie_> oh thank zombie baby jesus, the lander picked it up
<bodie_> anyone know what the deal is with godeps -t reporting differences even on master?  I'm scouring my inbox but not seeing a discussion about a new "right way"
<menn0> thumper, waigani, davecheney: standup?
<thumper> menn0: geez on the dot of 11
#juju-dev 2014-09-26
<perrito666> hi night shift
<perrito666> sweet my house is finally on street view,I am in the 21st century
<bodie_> welcome to the internet perrito666
<bradm> anyone know if it's possible to deploy things to the juju bootstrap nodes if you're using HA?  I'm trying to do the lxc:0, lxc:1 type of thing
<bradm> in the past we've done a juju deploy of cs:ubuntu to the units based on constraints
<bradm> ah, I might have to use native juju deploy to get the cs:ubuntu deployed, then use juju-deployer
<thumper> bradm: yes you can
<thumper> bradm: however the bootstrap node isn't as special in an HA world
<thumper> bradm: it just becomes one of the state servers
<thumper> don't rely on hard coded numbers
<thumper> you can use placement directives for stateserver nodes just like normal
<bradm> thumper: I'm trying to deploy openstack HA things to those 3 nodes
<thumper> bradm: we don't have any magic short cut names
<thumper> bradm: but if this is something that you envision doing a lot
<bradm> thumper: I'm pretty sure this is a juju deployer bug
<thumper> it might make sense to add some sort of alias
<thumper> probably deployer
<bradm> thumper: I just did a juju deploy cs:trusty/ubuntu --to 0, and then juju add-unit ubuntu --to 1
<bradm> thumper: the use case here is bootstack, we want to have a HA juju bootstrap node, and then use those same 3 nodes as the HA for the rest of openstack
<bradm> thumper: by deploying ubuntu to the 3 HA bootstrap nodes, we can then deploy the rest via lxc
<axw> jam: any idea what's up with the old landing bot? https://code.launchpad.net/~mark-sheahan-ms/gwacl/cert-args/+merge/235889/comments/577883
<bradm> thumper: aha, bug 1324129 seems relevant to my interests, just added a comment to it.
<mup> Bug #1324129: unit placement can only go to machine 0 <juju-deployer:New> <https://launchpad.net/bugs/1324129>
<bradm> there's a real mismatch between juju and juju-deployer about what you can target for placement
<thumper> bradm: why are you bothering to deploy ubuntu? it is the do nothing charm
<bradm> thumper: hrm, I probably could just use lxc:1, couldn't I, now I'm outside of juju-deployer
<bradm> or whatever the format is
<bradm> thumper: our standard way of deploying it has been to deploy ubuntu using constraints and give it a name, and then use that to deploy the units to
<bradm> thumper: oh, because I need to be able to refer to those hosts by service name, thats why
<thumper> whaaa???
<bradm> thumper: juju-deployer will only let you refer to machine id 0, and thats it.
<thumper> bradm: yes
<thumper> bradm: yes to the "lxc:1"
<bradm> thumper: right, which won't work inside juju-deployer yaml
<bradm> I can't see any other way around it until juju-deployer loses the idea that machine id 0 is special
<bradm> ideally both juju and juju deployer would agree on what you can deploy to
<thumper> ideally
<bradm> juju deployer will let you use service names, but only machine id 0
<bradm> and juju will let you use whatever machine id, but not service names
<bradm> so we're having to switch and change a bit
<bradm> thumper: well, there's now bugs on both programs to make it equal.. :)
<thumper> bradm: how does the deployer let you specify a service name as a target?
<thumper> and what does it mean when there are 0 units
<thumper> or more than 1 unit?
<bradm> thumper: you can do things like lxc:ceph=2
<bradm> thumper: and that'll do the 2nd ceph unit
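The deployer placement form bradm describes can be sketched in a bundle fragment like this; the deployment and service names here are hypothetical examples, and only the `to: lxc:ceph=2` line is the syntax under discussion (a container on the machine hosting the second ceph unit):

```yaml
# hypothetical juju-deployer bundle fragment (names are examples)
openstack:
  services:
    ceph:
      charm: cs:trusty/ceph
      num_units: 3
    mysql:
      charm: cs:trusty/mysql
      num_units: 1
      # place mysql in an lxc container on the machine
      # running the 2nd ceph unit
      to: lxc:ceph=2
```

As the rest of the conversation notes, plain `juju deploy --to` at the time accepted machine ids but not service names, which is the asymmetry being filed as a bug.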
<thumper> bradm: file a bug, and I'll make someone do it :)
<bradm> thumper: already done days ago
<bradm> thumper: bug 1372759
<mup> Bug #1372759: Extend juju deploy --to format to allow specifying service name <placement> <juju-core:Triaged> <https://launchpad.net/bugs/1372759>
<thumper> ta
<bradm> well, way to paint myself into a corner - I can't use juju deployer to deploy to machines 0, 1 and 2, but I can't use juju to deploy via service name
<rick_h_> bradm: fixing that is part of stuff we'll be looking at in the coming weeks. We've got a call next week with the deployer devs to make sure we move the parts in sync
<bradm> rick_h_: awesome, that'll be good.
<bradm> I still need to figure out a way to deploy with HA bootstrap nodes, and with HA openstack, both on the same 3 nodes.
<rick_h_> bradm: hmm, we're looking at specifying new machines in the work. This is slightly different. We'll keep this in mind as we complete out the task list.
<bradm> rick_h_: basically, we're trying to use the 3 HA bootstrap nodes as a target for lxc containers for HA openstack, if that makes ense.
<bradm> er, sense.
<rick_h_> bradm: so I'd be curious how an HA juju env looks in the GUI machine
<rick_h_> machine view that we put out today
<rick_h_> bradm: and then you can use the webUI to create containers on each of the state servers and place services on it
<bradm> rick_h_: easy enough to deploy it, is it in the normal charmstore?
<rick_h_> bradm: yes, it was published today
<bradm> rick_h_: ok, I'll have a look at that shortly, just have to answer the door
<rick_h_> bradm: rgr
<rick_h_> bradm: shared a video of the machine view stuff on the orange box doing containers/etc. So it's missing the HA juju part which I've not tried out. It might give some idea if it'd be useful or not if I'm following you.
<rick_h_> bradm: what provider are you using? MAAS?
<menn0> thumper: ping
<thumper> menn0: hey
<menn0> thumper: just back from kids swimming.
<menn0> thumper: I've just figured out why those units aren't coming up after upgrade
<thumper> and?
<menn0> thumper: the unit agents aren't starting up after shutting down to restart into the new tools version
<menn0> thumper: once I manually start the agent, all is fine
<menn0> thumper: I have another unit that's still in that state so I'm going to poke around
<menn0> thumper: but do you have any ideas?
<thumper> only that if the process exits with a particular exit code, upstart won't try to restart it
<thumper> but I don't recall what it is, axw might
<axw> I wasn't aware of special exit codes
<axw> *I'm not aware
<thumper> hmm... pretty sure there is one
<thumper> normal exit 0
<thumper> in the upstart config file
<thumper> menn0: so if the unit agent exited with a value of 0, it wouldn't have been restarted by upstart
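The upstart stanza thumper is recalling can be sketched like this; the job name, paths, and unit name are hypothetical, and only the `normal exit 0` line is the behaviour under discussion:

```
# /etc/init/jujud-unit-example-0.conf  (sketch; names and paths hypothetical)
description "juju unit agent for example/0"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
# an exit status of 0 is treated as a deliberate stop:
# upstart will NOT respawn the job when it exits 0
normal exit 0
exec /var/lib/juju/tools/unit-example-0/jujud unit --unit-name example/0
```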
<menn0> thumper: yep I got that
<menn0> thumper: I'm not seeing anything in the unit agent's logs that indicates why this might have happened.
<menn0> thumper: I'm going to try with master just to make sure this isn't a more general problem.
 * menn0 is glad he scripted the environment setup
<thumper> ok
<bradm> rick_h_: yes, we're using maas
<bradm> rick_h_: this is on softlayer hosts
<ericsnow> davecheney: thanks for all those reviews
<ericsnow> davecheney: would you mind taking one more look at http://reviews.vapour.ws/r/88/?
<davecheney> ericsnow: looking
<ericsnow> davecheney: thanks
<menn0> thumper: it's definitely my changes causing the unit agents to not start
<menn0> strange
<davecheney> review done
<thumper> menn0: did you want to talk through it?
<menn0> thumper: probably a good idea
<menn0> thumper: standup hangout?
<thumper> menn0: I'm in the standup hangout :)
<davecheney> HOLY SHIT
<davecheney> did everyone know that in RB you can attach a comment to MORE THAN ONE LINE ?
<thumper> oh?
<ericsnow> davecheney: that's one thing I like about reviewboard
<ericsnow> click and drag, baby!
<menn0> davecheney: I did not but I just tried it and you totally can :)
<davecheney> ericsnow: it occurred to me in the review comment mode, why it would only show one single line with no context
<davecheney> so, that's the reason
<davecheney> now if the bastard would only show comments and diffs on the same page
<ericsnow> davecheney: the comments are on the diff: the little numbered boxes on the left (click to expand and reply from the diff page)
<davecheney> ericsnow: yes, people keep saying that
<davecheney> so, click on the box
<ericsnow> davecheney: :)
<davecheney> and you want to reply to their reply
<davecheney> try it
<davecheney> you'll see why it doesn't work
<davecheney> ericsnow: is there an option to have RB auto expand those little boxes
<davecheney> ?
<ericsnow> davecheney: you mean that it redirects you away from the diff?
<ericsnow> davecheney: not natively
<davecheney> ericsnow: bingo
<ericsnow> davecheney: apparently reviewboard has quite a few extensions out there so I expect there's something
<bradm> rick_h_: https://chinstrap.canonical.com/~bradm/ha-juju.png is what HA juju looks like in the gui, in the machine view
<axw> restarting into utopic. bbs.. or possibly not
<axw> my god, it just worked
<menn0> thumper: ping
<thumper> menn0: ya?
<menn0> thumper: so scratch what I previously thought it might be
<menn0> thumper: that unit died when the state server restarted
<menn0> thumper: (into the new tools version)
<menn0> thumper: lots of workers died because they lost their API connection (expected)
<menn0> thumper: but for some reason the uniter decided it should die too
<menn0> thumper: logs here: http://paste.ubuntu.com/8430245/
<menn0> thumper: that's the tail of the unit's log
<menn0> thumper: the key part is: INFO juju.worker.uniter uniter.go:144 unit "rsyslog-forwarder-ha/1" shutting down: tomb: dying
<menn0> and probably DEBUG juju.worker.uniter modes.go:398 ModeAbide exiting
<menn0> that shouldn't have happened
<menn0> the broken pipes, dying workers and watcher errors are expected when the state server goes away
<menn0> the other units had workers restart when the state server died but they came back after a few retries, once the state server was available again
<menn0> but for some reason this unit (and others in previous upgrade attempts) decided to die
<menn0> I'm guessing it's a timing thing somehow
<thumper> hmm...
<thumper> ModeAbide?
<menn0> I learned about that in my first week thanks to talk by Will
<menn0> :)
<menn0> it's a function in the uniter
<menn0> but also the steady-state mode of the unit agent
<thumper> I would have thought that any exit of the agent due to an error should be non-zero, and hence restarted
<thumper> what you seem to be saying is that there are some situations where that doesn't seem to be the case
<menn0> yes
<thumper> hmm...
<menn0> that seems to be the case
<thumper> look at agentDone function
<menn0> and beyond that, the unit agent shouldn't have even been exiting at that point
<thumper> jujud/agent.go:315
<thumper> worker.ErrTerminateAgent
<thumper> I'm wondering if that is the error being returned somehow
<menn0> what it looks like from looking at how ModeAbide is exited is that the unit's state went to Dying
<menn0> which might explain why the agent exited with 0
<menn0> it was as if the unit had been told to terminate
<thumper> huh?
<thumper> the uniter thought it was dying?
<menn0> yep
<thumper> that would explain why it didn't restart
<menn0> exactly
<thumper> but seems weird that it might think it is dying because it can't talk to the api server
<menn0> I'm just checking what the state server sent back over the API about this unit
<menn0> thumper: nope. the "life" calls for that unit always respond with "alive"
<thumper> probably worthwhile running this test a few times on master if it is timing based
<thumper> as pretty sure it isn't your change that did this
<thumper> yes it is important, but perhaps a decent bug report would be best right now
<menn0> thumper: you think it's not my change? I'm not sure yet.
<menn0> thumper: but good to check on master a few times to be sure
<thumper> fairly sure...
<thumper> not 100%
<thumper> but well up there
<menn0> ok. recreating the env and switching to master now.
<dimitern> any reviewers around? http://reviews.vapour.ws/r/109/
<dimitern> TheMue, cmars, as OCR can you have a look please? ^^
<TheMue> dimitern: yep, will do
<TheMue> morning btw
<dimitern> fwereade, you might be interested, if you have time ^^
<dimitern> TheMue, morning :)
<dimitern> fwereade, TheMue, cmars, another trivial cleanup follow-up - http://reviews.vapour.ws/r/110/
<TheMue> dimitern: I'm almost through with 109, will take a look then
<dimitern> TheMue, cheers!
<Spads> Is there a document somewhere with details on the upcoming Brussels sprint?
<TheMue> dimitern: so, you've got a review
<TheMue> Spads: afaik only the typical wiki page
<Spads> TheMue: where is it?
<Spads> https://wiki.canonical.com/CDO/Sprints/JujuOct14 <-- oh, this?
<dimitern> TheMue, tyvm
<TheMue> Spads: yep
<Spads> TheMue: many thanks
<TheMue> Spads: yw
<dimitern> TheMue, cmars, fwereade, and another small follow-up for the uniter - http://reviews.vapour.ws/r/111/
<TheMue> dimitern: hey, I wanna be productive today too, but you don't let me :D
<dimitern> TheMue, sorry to be a pest, but I really want to finish with the port ranges stuff today if I can, so we can start again on container addressability next week :)
<TheMue> dimitern: hehe, yeah, and it's my job as ocr today
<dimitern> TheMue, +1
<TheMue> dimitern: #111 is marked as WIP, so waiting with review? Oh, and btw, #110 is done.
<dimitern> TheMue, yes, all of the follow-ups are WIP, because they can't land before their parent is merged into juju:master
<TheMue> dimitern: ok
<TheMue> dimitern: having a quick talk?
<dimitern> TheMue, yep, sorry, omw
<dimitern> TheMue, cmars, fwereade, last one for today :) I promise - http://reviews.vapour.ws/r/112/
<TheMue> dimitern: 112 is the german number for emergency calls, so I'm not sure :D
<dimitern> TheMue, :D it's the same here
<perrito666> TheMue: it's the Argentinian number for helpdesk, which never works btw
<TheMue> perrito666: *lol*
<katco> axw: standup?
<axw> katco: I'm there
<katco> axw: huh... so am i?
<axw> I'll rejoin
<rick_h_> http://www.reddit.com/r/juju/comments/2his7a/juju_gui_now_has_an_awesome_machine_view/ for your upvotes please
<axw> rick_h_: sexy
<katco> rick_h_: looking really sharp +1
 * katco watching your video right now
<axw> shame about the lack of hardware details in the old maas provider
<rick_h_> yea, glad to see the bug get fixed
<perrito666> you actually got me pausing my music to look at the vid
<rick_h_> so hopefully it'll just get better in time as juju support hardware info more
<axw> rick_h_: they all do now
<axw> so yep :)
<axw> (all do in master)
<axw> .. and 1.20.8 I think?
<perrito666> rick_h_: you need music on those vids :p and perhaps that sound from a boats horn that is used in all the movie trailers nowadays
<axw> +1
<rick_h_> lol
<perrito666> something like
<perrito666> THIS <horn> SUMMER <horn> JUJU <horn> WILL <horn> HAVE <horn> LESS <horn> BUGS
<Spads> SUNDAY SUNDAY SUNDAY
<axw> *explosion*
<katco> rofl
<perrito666> then DEPLOYS <coral arrangement> BUNDLES <coral arrangement> HA <coral arrangement> NEW VIEW <coral arrangement>
<katco> i'm just wondering when they'll start combining the buzzfeed headlines with trailers
<katco> "YOU (DUNNNNN) WON'T (DUNNN) BELIEVE (pkaw!) HOW THESE TWO PEOPLE FALL IN LOVE"
<dimitern> rick_h_, upvoted!
<rick_h_> dimitern: ty!
<perrito666> re
<perrito666> k
<perrito666> sdadg
<perrito666> sorry kb hit the ground
<perrito666> rick_h_: upvoted
<perrito666> if you tweet something Ill also retweet you
<rick_h_> you want like an officially tweet https://twitter.com/jujuui/status/515467739951923200
<rick_h_> or a personal tweet https://twitter.com/mitechie/status/515462052106629121
<rick_h_> or another point of view tweet https://twitter.com/jaycee/status/515468467567603712
<rick_h_> :)
<perrito666> rk
<perrito666> j
<perrito666> mm the fall clearly was not good on this kb
<perrito666> now sometimes shift enters
<perrito666> rick_h_: retweets are for free so I can go with all three of them
<rick_h_> thanks for all the points/sharing, go juju! :)
<hazmat> bradm, deployer will support arbitrary machines.. it was a safety belt for portability.. but the reality is it's very useful
<hazmat> to go to arbitrary machines
<natefinch> ahh the old "gmail is unavailable, check our status page" ..... Status page says everything is fine
<natefinch> reloading fixes it, though, so, I guess I won't ragequit gmail today
<perrito666> well, they dont say that the status page is working
<natefinch> lol
<natefinch> where's the status page for the status page?
<perrito666> natefinch: "check the status page" is a euphemism for "stop hitting our mail servers while we try to get them working again you refresh maniac"
<natefinch> I'm very sad that juju.io is not something we own
<natefinch> juju.wtf is available, though.
<Spads> juju.bike
<natefinch> there's juju.works which is kinda cute
<TheMue> +1
<perrito666> and most likely juju.ninja should be available too
<natefinch> nope, taken
<perrito666> maaan, I finally got to know how this is done http://cdn.diply.com/img/6d874ee0-a5db-43f2-ad34-b358b618bd5f.jpg
<perrito666> I can die happily
<natefinch> neat!
<perrito666> I spent a whole day trying to get http://ide.ninja/ :p and finally got it, they are oddly inexpensive
<natefinch> gojuju.io is available, which is a pretty good compromise I think.  Not ideal, but since we already failed to get any juju domain names when deciding on the name of the product (seems like bad planning.... everyone knows the first thing you do when starting a project is buy the domain name!)
<perrito666> natefinch: lol
<perrito666> when I started my company I decided the name based on the one with the lowest google hits and domains taken
 * natefinch has several projects that never got past the "buy the domain name" stage, for example...
<perrito666> so while we did that we had a functioning company for like 4 months called thing2009
<natefinch> haha
<perrito666> in spanish, which is coso 2009
<perrito666> natefinch: my calendar says 1:1
<ericsnow> TheMue: could you get me a review on http://reviews.vapour.ws/r/107/?  It's pretty small.
<TheMue> ericsnow: will do it next, right now I'm doing the 113
<ericsnow> TheMue: cool, thanks!
<ericsnow> TheMue: I could also really use http://reviews.vapour.ws/r/88/, but 107 is a bigger priority for me :)
<TheMue> ericsnow: hmm, just wanted to make you happy with a done #107 and you've got already the next one for me. ;)
<ericsnow> TheMue: I'm a slave driver ;)
<TheMue> ericsnow: yeah, I see it. dimitern has already been a slave driver this morning.
<ericsnow> TheMue: ah, to be OCR and free!
<ericsnow> for backups upload and download we could follow the precedent of tools and charms (add handlers to the API server so it serves those HTTP requests directly)...
<ericsnow> ...or teach the RPC server to handle files too.
<ericsnow> any quick thoughts?
<ericsnow> (I'll probably end up taking this to the ML)
<natefinch> wwitzel3: standup?
<rick_h_> anyone know off the top of their heads if add-unit -n 3 --to=1,2,3 is possible?
<natefinch> rick_h_: I don't think so, but not 100% sure
<rick_h_> natefinch: thanks, will see if I can try it out
<katco> marcoceppi: the summary formatter just landed :)
 * marcoceppi revs up the git machine
<marcoceppi> katco: I had a huge 20 machine deployment yesterday, but completely forgot about testing as it was late at night, I'll be spinning up another one soon though
<katco> marcoceppi: no worries. it's been pretty well banged on, i'm just excited to see what large deployments look like :)
<TheMue> oh, just got notified my new provider will install the new connection (100 mbps) on tuesday. fingers crossed that everything works
<natefinch> 100 Mbps is pretty sweet
<natefinch> dimitern, TheMue: could you guys talk to johnmc on #juju and try to give him a hand?  I spent a few hours working with him yesterday.  He has a machine that used to be able to deploy lxc containers and now can't
<TheMue> natefinch: sadly have to step out after my actions meeting in 2 min
 * perrito666 has 12 m and is happy enough
<wwitzel3> ericsnow: is there anything backups related I can assist with?
<ericsnow> wwitzel3: you could take a look at upload/download or at adding more functional tests
<ericsnow> wwitzel3: preferably the functional tests :)
<ericsnow> wwitzel3: thanks for the offer, BTW
<wwitzel3> ericsnow: sure, any specific tests you're looking to have done?
<ericsnow> wwitzel3: I guess one each on the backups interface (state/backups), on the API (api/backups), and the command (cmd/juju/backups)
<wwitzel3> ericsnow: so by more, you meant all? :P
<wwitzel3> ericsnow: sounds good, I'll ping you when I'm getting ready to start on them, I'm wrapping up the runcmd stuff now.
<ericsnow> wwitzel3: well, we have functional tests on CI that we will switch to the new backups once we ditch the old, but yeah :)
<ericsnow> wwitzel3: cool
<wwitzel3> ericsnow: I learned that case Foo, Bar: expression .. is much more useful than case Foo:\ncase Bar: expression
<wwitzel3> ericsnow: coincidentally learning that also fixed the last known bug I was struggling with in apiserver/runcmd
<wwitzel3> which then made me feel dumb
<wwitzel3> yay!
<ericsnow> wwitzel3: oh, you mean how each case has an implicit break and you must explicitly ask it to chain them?
<wwitzel3> ericsnow: yep
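The switch behaviour wwitzel3 ran into can be shown with a minimal sketch: Go cases break implicitly, so an empty `case Foo:` matches and silently does nothing, while `case Foo, Bar:` handles both values with one body (function and values here are made up for illustration):

```go
package main

import "fmt"

// classify demonstrates implicit break vs multi-value cases.
func classify(n int) string {
	// C-style instinct gone wrong: in Go an empty case does NOT
	// fall through; it matches, runs no statements, and breaks.
	switch n {
	case 1: // matches 1, does nothing, implicit break
	case 2:
		return "two (only)"
	}
	// The multi-value form is what actually handles both values.
	switch n {
	case 1, 2:
		return "one or two"
	}
	return "other"
}

func main() {
	fmt.Println(classify(1)) // first switch swallowed it; prints "one or two"
	fmt.Println(classify(3)) // prints "other"
}
```

An explicit `fallthrough` statement is the only way to get C-style chaining, which is why the `case Foo, Bar:` form is usually what you want.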
<ericsnow> wwitzel3: I hate stuff like that
<perrito666> anybody using 14.10?
<natefinch> ericsnow: you around?
<ericsnow> natefinch: yeah
<natefinch> ericsnow: finally got to look at the backups list stuff
<ericsnow> natefinch: cool
<natefinch> ericsnow: how is this safe? https://github.com/ericsnowcurrently/juju/blob/backups-list/state/backups/backups.go#L99
<natefinch> ericsnow: by definition, you're letting people create a backup value with anything that implements filestorage.Filestorage
<ericsnow> natefinch: we control what goes in so we have a guarantee as to what comes out
<natefinch> This says you will accept anything that implements the interface: https://github.com/ericsnowcurrently/juju/blob/backups-list/state/backups/backups.go#L51
<natefinch> ericsnow: either we accept anything with the interface, in which case, the downcast is bad, or we don't, and we should only accept the concrete type we really expect to get passed to NewBackups
<ericsnow> natefinch: hmm
<ericsnow> natefinch: you make a good point
<natefinch> ericsnow: it's perfectly ok to take a concrete type for NewBackups.... it's not really a big deal
<ericsnow> natefinch: an error would be better there than a panic
<ericsnow> natefinch: it makes testing a pain
<natefinch> well, I can write a test that will make List() panic right now ;)
<ericsnow> natefinch: right
<ericsnow> natefinch: sorry, the second statement was not in relation to the first :)
<ericsnow> natefinch: I agree that the code should be doing an "ok" check on the type assertion
<ericsnow> natefinch: and passing a concrete type to NewBackups makes testing a pain
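The fix being agreed on is the comma-ok form of a type assertion: instead of `v := x.(Concrete)`, which panics on a mismatch, check the second result and return an error. A minimal sketch (the types here are hypothetical stand-ins, not the actual backups/filestorage code):

```go
package main

import (
	"errors"
	"fmt"
)

// FileStorage stands in for the interface a constructor accepts.
type FileStorage interface{ Path() string }

// diskStorage is the one concrete implementation actually expected.
type diskStorage struct{ path string }

func (d *diskStorage) Path() string { return d.path }

// newBackups downcasts safely: an unexpected implementation yields
// an error at construction time, not a panic deep inside a later call.
func newBackups(fs FileStorage) (*diskStorage, error) {
	ds, ok := fs.(*diskStorage)
	if !ok {
		return nil, errors.New("unsupported FileStorage implementation")
	}
	return ds, nil
}

// fakeStorage satisfies the interface but is not the expected type.
type fakeStorage struct{}

func (fakeStorage) Path() string { return "fake" }

func main() {
	if _, err := newBackups(fakeStorage{}); err != nil {
		fmt.Println("error:", err)
	}
	ds, _ := newBackups(&diskStorage{path: "/tmp/backups"})
	fmt.Println(ds.Path())
}
```

This is also the argument for accepting the concrete type directly: if only one implementation can ever be used, the interface parameter promises more than the function delivers.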
<natefinch> ericsnow: yeah, but there's a difference between making testing difficult and making your interface lie.
<ericsnow> natefinch: at present the two are mutually exclusive so I picked the latter :)
<natefinch> I think the problem is that people are trying too hard to be generic with interfaces, when they should be generic with plain old structures (or POS as they say in the business ;)
<katco> natefinch: you're missing a closing paren there, friend.
<katco> (friday humor) ;)
<natefinch> I usually count the paren in the smiley face as the close paren (since I so often end asides with smilies :)
<katco> blasphemy!
<katco> parens must bring balance to the universe
<katco> i often wonder if only programmers notice those kinds of things
<natefinch> They said that about Anakin Skywalker, forgetting that, at that point, all the Jedi were currently good guys.
<katco> or maybe grammar nazis
<perrito666> katco: to have balance you dont need another parens just to begin with (;
 * katco is a trekkie
<katco> perrito666: lol
<perrito666> also ;)) looks like a guy with double chin
<katco> no, it's a big grin!
<natefinch> yeah, the problem is that my smilies are generally supposed to be inside the context of the parens, and doing (foo :) ) just looks weird
<natefinch> I just have the compiler automatically insert the extra parens where needed
<perrito666> natefinch: when inside parens I just :p just in case
<katco> syntactically, commas and em-dashes are acceptable for asides. they have the same meaning as parens, it's just a matter of how much of a pause you want to cause the reader
<natefinch> ericsnow: I am still thinking about your code, btw
<ericsnow> natefinch: k
<ericsnow> natefinch: thanks!
<perrito666> katco: I am pretty sure grammatical rules dont include smileys
<perrito666> at least in spanish
<natefinch> katco: I like parens because it's immediately clear that where there's an open paren, there's going to be a close paren - whereas with a dash, it might just be a single dash.
<katco> perrito666: true, true
<katco> natefinch: but as i lex your sentence, i parse :) as a single token, so i'm still waiting for the closing paren!
<katco> was that too nerdy even for this crowd? :p
<natefinch> katco: nah, just trying to get some work done too.  I Was going to make some joke about keeping parsers flexible enough for real-world inputs
<katco> hrm. aren't parsers generally very pedantic?
<natefinch> browsers aren't *shrug*
<katco> many think to their detriment... quirks mode promotes horrible practice
<natefinch> oh I know
<perrito666> katco: your tokenizer could be a bit more fault proof
<perrito666> an actual lexer for english would have parsed that correctly and dropped the ; since there is no construction that includes ;) and ending in a ; is most likely an error
 * perrito666 worked in NLP before juju
<katco> oooh fun stuff!
<katco> one of the 1st go programs i wrote was a NLP using markovian chains
<perrito666> katco: I used heavy stuff made by linguists, awful c++
<katco> perrito666: i will have to buy you a beer sometime
<perrito666> and slow as a snail in reverse
<perrito666> ericsnow: who calls newapi?
<ericsnow> perrito666: look in the init func
<perrito666> oh I understand
<natefinch> fatal: remote eric already exists
<perrito666> natefinch: I am under the impression that you just made an attempt at a joke
<natefinch> nah, I just thought it was a funny error message
<perrito666> wait... you actually got that message?
<natefinch> haha yeah, in response to
<natefinch> git remote add eric https://github.com/ericsnowcurrently/juju
<ericsnow> natefinch, perrito666: I thought I was the only one :)
<natefinch> so then I had to do
<natefinch> git fetch eric
<natefinch> which worked
<ericsnow> natefinch: this is just weird :)
<natefinch> lol..... fridays... or maybe I'm just easily amused
<natefinch> so the real problem is that filestorage.Metadata is an interface for no good reason
<natefinch> most of its functionality is just getters
<natefinch> except for Doc, which just returns an empty interface anyway
<ericsnow> natefinch: right, but not all
<natefinch> .... and I can't even begin to imagine what is being done with that empty interface, certainly nothing good
<ericsnow> natefinch: FWIW, one of the patches I have up addresses that somewhat
<natefinch> yeah, I was looking at that... it still has the Doc() interface{} method though
<natefinch> Maybe that's a mistake there?
<natefinch> public structs are not evil.  They're actually very handy.  They make testing SUPER easy.  Why?  Because you can just make one, and just populate it with whatever fake data you want, and you don't need to make a fake implementation or anything.
<natefinch> Your filestorage cleanup adds like 6 new interfaces
<natefinch> maybe not, some of them were just moved
<natefinch> anyway
<ericsnow> natefinch: if you mean interfaces.go, only 2 are new (Doc and DocStorage)
<natefinch> yeah, sorry, I realized looking back that most of them were already there
<natefinch> how many implementations of these interfaces do we expect to actually write?
<ericsnow> natefinch: I've written at least 3 with another on the way
<ericsnow> natefinch: most are for testing
<natefinch> ericsnow: let me spend some more time with it... but I think the fact that in the end, we always cast back to a single implementation is pretty telling that we don't really need/want the genericism
<ericsnow> natefinch: sounds good.  thanks for having a look.
<ericsnow> natefinch:  keep in mind that my use of interfaces is mostly driven by testing (and for the concise summary it gives you for a type)
<perrito666> I was happy because I am getting new floor in my house entry way until I realized I am trapped inside my house :|
<perrito666> dang
<natefinch> haha
<natefinch> ericsnow: I am heartened that there are fewer methods using interface{} in your filestorage cleanup proposal
<ericsnow> natefinch: yeah, it bugged me too
<natefinch> I wish git unstash was a command.... I find it terribly asymmetric that the opposite of git stash is git stash apply
<ericsnow> natefinch: well, that's the opposite of "git stash save", for which git stash is a short cut (I use git stash pop rather than apply)
<jcw4> natefinch: the full command is git stash save
<jcw4> ericsnow: jinx
<ericsnow> jcw4: :)
<natefinch> yeahbut, then why isn't git unstash a shortcut for pop/apply?
<jcw4> natefinch: you can make it so
<jcw4> git alias
<ericsnow> jcw4: I was just about to say that!
<jcw4> haha
<natefinch> I don't like aliases and plugins etc, because then when I get on some other machine without those aliases, I'm lost
<perrito666> natefinch: allow me to improve your life
<natefinch> or I tell someone to just do "git unstash" and they're like "whaaa?"
<perrito666> natefinch: ah I was going to suggest an alias lol
<jcw4> natefinch: I'm tempted to ask you if you code with notepad
<katco> natefinch: where do you spend most of your time?
<jcw4> ;)
<natefinch> this happened with bazaar, where people would have to explain three different ways of doing things based on what plugins someone had installed
<perrito666> natefinch: I dont go to a remote machine without scping .vim and .gitconfig
<jcw4> perrito666: +1
<katco> i have an entire config directory under vcs that i just pull down
<natefinch> heh
<natefinch> well, today I learned that I have been doing git stash apply and should be doing pop
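The alias jcw4 suggests is one line of git config; a sketch of the whole wrong-branch workflow, run in a throwaway repo (the alias name `unstash` is made up, as natefinch wished; `git stash` is itself shorthand for `git stash save`):

```shell
#!/bin/sh
set -e
# demo runs in a throwaway repo so it is self-contained
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name demo
# make "git unstash" an alias for "git stash pop"
git config alias.unstash 'stash pop'

echo one > file.txt
git add file.txt
git commit -qm 'initial'

echo two >> file.txt          # uncommitted change on the "wrong branch"
git stash -q                  # shorthand for "git stash save"
git unstash >/dev/null        # pop: applies the stash AND drops it
grep -q two file.txt && echo "change restored"
```

Unlike `git stash apply`, `pop` removes the entry from the stash list after a clean apply, which fits the common "oops, wrong branch" case where you never want the stash again.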
<jcw4> natefinch: unless you want to keep your stash after applying it
<natefinch> jcw4: well, yes, but I almost never do. It's almost always "oops, I'm on the wrong branch, let me stash this for a second while I go fix that"
<ericsnow> natefinch, perrito666: or keep it on a flash drive, on a web site, or in a repo
<jcw4> natefinch: yeah me too
<perrito666> natefinch: there are no two people that use git in the same way
<natefinch> perrito666: that's a problem, because then you can't talk to anyone else about it
<perrito666> ericsnow: I guess I picked up the scp habit when I was working as a sysadmin
 * natefinch is tempted to make a git stache alias
<perrito666> ericsnow: where the machines did not always have internet
<jcw4> lol
<perrito666> natefinch: the first rule of git is you dont talk about it
<perrito666> may people use git as svn
<perrito666> many*
<natefinch> perrito666: the second rule of git is: "just copy this long list of commands and it'll fix your branch"
<perrito666> natefinch: that is true for most things
<natefinch> also true
<perrito666> "just copy this to your bashrc"
<katco> i heard that git was never really meant to be used w/o an interface
<katco> i use magit, so most operations are just a series of keystrokes
<katco> save stash for me is "z z"
<katco> stash apply is "a" on the line in a list of stashes
<perrito666> katco: zz is clearly undo
<katco> perrito666: haha
<jcw4> katco: you may be gratified to hear that my productivity has dropped to almost nothing while I try to force my vim muscle memory to conform to emacs
<natefinch> z z is much more obvious than stash save
<perrito666> katco: given the author of git, it might have been designed to piss people off mostly
<perrito666> jcw4: happened to me, didn't last a week
<perrito666> natefinch: ztash zave
<katco> jcw4: haha... noooo i'm never happy to hear that. remember i don't think everyone should use emacs just b/c i enjoy it.
<jcw4> perrito666: lol
<perrito666> it is spelled as if told by the bad guy from get smart
<jcw4> katco: I started with xemacs 15 years ago, and decided to switch since I worked mostly in telnet sessions
<katco> well in magit, s is reserved for staging
<katco> lower-case for the file you're on. upper for all
 * perrito666 ponders if going to 14.10 so he has time to fix his computer during the weekend
<katco> jcw4: that's a long time to use emacs and then switch. what convinced you?
<jcw4> katco: heh...no I mean I used it for a few months 15 years ago, and since then been hardcore Vim
<bodie_> perhaps he switched to the linux os
<katco> jcw4: ahhhh
<bodie_> gnu/linux, sorry
<katco> bodie_: i will ignore your attempt at humor, sir! ;)
<jcw4> bodie_: I was doing solaris / hp-ux / irix / all the fun unices that used to be prevalent
<jcw4> bodie_: linux was the snotty kid that no adult wanted to play with
<perrito666> jcw4: oh cmon 15 years ago linux was already being used by many people
<perrito666> I maintained a set of redhat+oracle servers
<jcw4> Not for financial industry hard core C++ class library use
<perrito666> jcw4: financial industry is stil deciding if they go from cobol to something newer
<jcw4> (not that I knew anything about the financial industry... they just happened to be the primary users of the Rogue Wave C++ Class Libraries)
<jcw4> perrito666: true dat
<katco> perrito666: we were using c#/java and a little groovy
<katco> and excel spreadsheets. (shudder)
<jcw4> katco: no Access DB?
<katco> oh yes. there was that too.
<jcw4> :)
<katco> i need to go wash my hands. i'm feeling dirty for some reason.
<jcw4> hahaha
<perrito666> lol
<perrito666> I recall porting a set of msaccess functions to oracle pl just so the devs of a very bad erp system could migrate their system to it
<jcw4> eeew
<perrito666> actually I was hired to migrate the thing from access to oracle but their code was hardcoded to such a level that I just changed the connection and rewrote access in oracle :p
<katco> perrito666: our users knew a bit of sql
<perrito666> these guys templated the sql queries all over the code
<katco> perrito666: so they'd regularly send us these 3-5k line sql queries and ask why they were slow
<perrito666> so the same query would be repeated over and over
<jcw4> katco: perrito666 make the bad memories stop
<katco> i know right
<perrito666> they did this in vb and had things like query on type so each keystroke would call the same query
<katco> omg!
<katco> that poor database
<jcw4> and perrito666 ups the level
<katco> yeah nk! jeez!
<perrito666> and the deployment was done by having everyone using the same binary compiled from a samba share
 * jcw4 /leaves to save his sanity
<jcw4> or wait... is that /part ?
<perrito666> so the devs would work on a shared samba with the code and compile to replace that one file :p
<perrito666> jcw4: had enough?
<jcw4> perrito666: I'm squirming
<katco> my users would dictate database architecture
<katco> they wouldn't let us do roles. select any table had to be turned on
<perrito666> I actually quit the day that I had to explain tcp/ip to the head engineer to explain to him why the system would not work from different vlans
<katco> oyyy
<perrito666> when you ask the head engineer the rhetorical question "you know how basic tcp/ip works" to begin explaining vlans and you get No for an answer
<perrito666> something snaps inside of you
<jcw4> and perrito666 was never the same since
<perrito666> I only returned once to that company when they asked me to consult because the db seemed to have lost all data, someone had been making space on the hds because they were trying to avoid buying more (this was a hardware selling company) and the guy doing that deleted huge files with extension .log...  guess what is oracle's extension for the db data files
<jcw4> perrito666: dum-da-duumm
<natefinch> wow, the extension for oracle data files is .log?
<perrito666> natefinch: was in oracle.. 8 or around
 * natefinch has been lucky enough to never have to deal with oracle
<perrito666> but they are in a pretty obvious path
<katco> oracle is actually pretty nice
<katco> i believe it's the most performant
<perrito666> jcw4: also the backup was not working because no one wanted to pay a watchdog and so it had been off since I left
<katco> .log makes sense if you know the architecture of oracle
<perrito666> katco: it is, I believe you cannot beat it in terms of db clustering
<jcw4> perrito666: you're trying to make me cry aren't you?
<perrito666> jcw4: heh I was a kid back then, I was like 19
<perrito666> also the salary was like US$300/month if I recall the exchange rate correctly
<perrito666> so it was a cheap place internationally speaking
<jcw4> perrito666: that seems really cheap
<perrito666> jcw4: well that made up for my rent and food
<perrito666> everything was as cheap
<perrito666> and I lived in a pretty cool place
<jcw4> perrito666: where was that?
<perrito666> jcw4: cordoba argentina ~2003
<perrito666> I actually had to dig up the exchange rate http://es.wikipedia.org/wiki/Anexo:Cotizaci%C3%B3n_hist%C3%B3rica_de_monedas_de_la_Argentina
<jcw4> perrito666: I'd love to visit Argentina some time
<perrito666> jcw4: it's nice. 2 people can have a good standard of living with 1.5k/month so you can come for a cheap vacation
<jcw4> nice
<arosales> natefinch: any folks around who can assist with juju-core on ppc64el
<natefinch> arosales: uh....
<arosales> :-)
<natefinch> arosales: I think the number of people that know anything about ppc64el is approximately 2
<natefinch> (on our team)
<perrito666> that would be dave and ian right?
<arosales> mbruzek: is working with some community folks on an upcoming demo at IBM Enterprise 2014 and may need some support
<natefinch> And I'm just assuming someone other than Dave knows something about it
<ericsnow> natefinch: did you review that backups list patch?
<arosales> questions maybe just confirming behaviour, like juju selecting the correct tools on deploy from an x86 box to a MAAS ppc64el environment
<arosales> mbruzek: were you able to get a deploy working ?
<mbruzek> arosales: no
<arosales> mbruzek: resource issues, core, or environment?
<mbruzek> arosales: we can not bootstrap maas yet.
<natefinch> arosales: we can answer general questions like that.... especially where the answer is - yes it should do the right thing and if not, it's a bug.
<arosales> natefinch: understood and thanks :-)
<arosales> mbruzek: so you're still trying to get juju to recognize maas API keys and creds?
<akash_> hi guys, im trying to bootstrap juju to ibm ppc environment using 1.20.8 version into a kvm instance via maas
<arosales> mbruzek: remember you may need to set up sshuttle with your vpn
<akash_> when it starts to bootstrap, it says pulling arbitrary tools for amd64 though, and ultimately the bootstrap fails
<natefinch> unfortunately I have to bail to go make dinner for the family
<arosales> akash_: can you bootstrap with --debug
<akash_> can someone walk through whats required to bootstrap to ppc via maas
<arosales> and pastebin
<akash_> sure yes
<natefinch> ericsnow: not exactly... working on it.  Monday morning I think.
<ericsnow> natefinch: no worries
<arosales> natefinch: any other core devs around for the next hour or so?
<natefinch> arosales: friday 10 minutes before 5 is not the best time to find people online :)
<arosales> natefinch: agreed, I honestly would like to be cracking open beers at this time too
<natefinch> arosales: ericsnow and perrito666 are both here for now... katco too.  They're all pretty new, but I'm sure will help as much as possible
<arosales> but as demos would have it . . .
<arosales> natefinch: ericsnow, perrito666, katco thanks
<mbruzek> natefinch: the error we are seeing is:    WARNING juju.provider.maas environ.go:434 picked arbitrary tools &{1.20.8-trusty-amd64 https://streams.canonical.com/juju/tools/releases/juju-1.20.8-trusty-amd64.tgz 6abe3d33dc22601509e88febb11511aed9f9616e2598d7844fca0d16499ad9ca 8109965}
<arosales> not sure if it's core related but may want to confirm.
<mbruzek> natefinch: the problem is that is a ppc64le system.
<mbruzek> akash_ and I were wondering why that would come up with amd64 tools
<arosales> mbruzek: juju core does have logic to select ppc64el
<natefinch> yes it does
<arosales> mbruzek: can we see how you are bootstrapping
<natefinch> and those tools exist: https://streams.canonical.com/juju/tools/releases/
<natefinch> I don't know... my wife is gonna kill me if I don't go now.  Good luck.
<arosales> natefinch: have a good weekend
<arosales> thanks for the replies here
<akash_> destroying environment and starting over from get go
<arosales> we'll see if we can work with er, perrito666 or katco
<arosales> akash_: cool thanks
<akash_> we will capture all of it here in a sec
<arosales> akash_: on next bootstrap
<perrito666> arosales: I am checking at the code
<arosales> please append --debug
<arosales> perrito666: looking at "juju help constraints"  it says that arch that are recognized are only amd64, i386, and arm.
<arosales> perrito666: I would expect to see ppc at least in there if not ppc64el
<perrito666> arosales: we discovered recently that our help might be outdated
<arosales> ah ok
<arosales> perrito666: so where do Juju do the selection of which arch to use for tools? Is at the client, provider, neither, or both?
<perrito666> arosales: I am also checking at the warning
<perrito666> is there a more extensive log?
<arosales> perrito666: mbruzek and akash are getting that now.
<arosales> they are rebootstrapping with --debug and are going to post the output
<katco> if there's anything i can do, please let me know. i haven't worked in that area yet.
<perrito666> arosales: from what I see, the WARNING posted corresponds, or at least should, to a different kind of error
<mbruzek> perrito666: here is the architecture of the system. http://pastebin.ubuntu.com/8435653/
<mbruzek> (it is still attempting to bootstrap)
<akash_> so some things appear to be working out right but we have no idea if these things are working, and will paste logs:
<akash_> * is the architecture right?
<akash_> * once the vm is powered up and we ssh in, are the simplestreams and other sites accessible so they can populate the vm?
<akash_> the issue is we are NOT on the hypervisor so we can't remmina in
<akash_> we are in another vm, and the bootstrapped vms themselves are separate vms also
<akash_> hope that gives insight into what we are dealing with
<perrito666> seems to me that juju is expecting ppc64el
<akash_> point on the second * item: my assumption is that juju uses ssh on the maas node to finish its process of installing necessary components as i remember
<akash_> ok
<mbruzek> perrito666: arosales, Here is the bootstrap log
<mbruzek> http://paste.ubuntu.com/8435677/
<perrito666> eries="trusty", arch=<nil>, version=<nil>
<perrito666> pass ppc64el as the arch on the constraitns
<perrito666> constraints
<perrito666> sorry that word always gives me trouble
<mbruzek> juju bootstrap --constraints arch=ppc64el ?
<perrito666> put around the constraints ""
<perrito666> juju bootstrap --constraints "arch=ppc64el"
<perrito666> mbruzek: works?
<akash_> looking
<akash_> wondering if the ssh key is borked between maas and juju as well
<mbruzek> We bootstrapped again and it is still running; we think it can not authenticate to the maas node
<perrito666> mbruzek: that is odd
<katco> what happens if you attempt to run that ssh command manually?
<mbruzek>  ssh -o "StrictHostKeyChecking no" -o "PasswordAuthentication no" -o "ServerAliveInterval 30" -i /home/iicroot/.juju/ssh/juju_id_rsa -i /home/iicroot/.ssh/id_rsa ubuntu@S822L04-vm1.IBMCloud.vm /bin/bash
<mbruzek> ssh: Could not resolve hostname s822l04-vm1.ibmcloud.vm: Name or service not known
<mbruzek> perrito666: katco: http://paste.ubuntu.com/8435783/
<akash_> katco, so we are able to see the node in maas "allocate to admin" , however, during bootstrap, it appears ssh fails?
<mbruzek> I called the bootstrap this way: juju bootstrap --constraints "arch=ppc64el" --debug 2>&1 | tee bootstrap.txt
<akash_> in other words we cant ping the hostname above from the vm, but can ping via ip
<katco> interesting... and again i apologize, i'm pretty new and this isn't an area i've poked around in before
<katco> this line kind of sticks out: 2014-09-26 21:12:17 DEBUG juju.environs.bootstrap bootstrap.go:47 network management by juju enabled: false
<katco> since there appears to be a networking issue; trying to determine if picking the tools is an antecedent problem
<mbruzek> katco: we can ping 172.26.48.102
<mbruzek> ssh -o "StrictHostKeyChecking no" -o "PasswordAuthentication no" -o "ServerAliveInterval 30" -i /home/iicroot/.juju/ssh/juju_id_rsa -i /home/iicroot/.ssh/id_rsa ubuntu@172.26.48.102 /bin/bash
<mbruzek> ssh: connect to host 172.26.48.102 port 22: Connection refused
<perrito666> mmm, I wonder if the node is in good shape
<mbruzek> we can see the node start up in the MAAS gui
<mbruzek> we can ping it, but not ssh to it.
<katco> have you shared your environments.yaml?
<katco> specifically, have you specified anything related to enable-os-updates?
<mbruzek> not that I am aware of.
<mbruzek> https://pastebin.canonical.com/117754/
<mbruzek> katco: that is our env.yaml
<katco> thanks
<akash_> katco, is there a timeout constraint that i can pass in environments.yaml or at the command line? (i've forgotten what exactly it is)
<katco> akash_: looks like there is bootstrap-timeout, bootstrap-retry-delay, and bootstrap-addresses-delay
<katco> all in seconds
<katco> right now kind of wondering if sshd didn't get installed for some reason. it looks like it picked the correct tools in your second attempt.
<mbruzek> ack
<katco> i don't suppose you have console access to that machine through maas?
<mbruzek> yes
<katco> can you eliminate that assumption and lmk if sshd is there?
<mbruzek> katco: We are using a different tool and can see that MAAS is installing Ubuntu.
<mbruzek> so sshd is not installed on the bootstrap node at this time.
<mbruzek> but it is installing...
<katco> ah, so you're thinking maybe the bootstrap just timed out?
<mbruzek> That is our current theory... but we have cycled through a few now.
<arosales> maas can take a while to bootstrap
<arosales> mbruzek: have you used any of the bootstrap-timeout options ?
<arosales> perhaps the power hardware isn't being as responsive as we would like.
<mbruzek> arosales: no, we are using a tool called kimchi to see the VM console
<mbruzek> We can see that the OS is STILL installing
<arosales> mbruzek: sorry I meant on the juju bootstrap
<mbruzek> arosales: no but we will do that next, talking with Brian F. in another room
<katco> mbruzek: if the OS is still installing, that kind of precludes juju from doing anything. is the issue that juju won't pick up after the OS installation is completed?
<arosales> mbruzek: ack
<mbruzek> kateco
<katco> ?
<mbruzek> Sorry we are talking with a new person, explaining what is going on.
<katco> ah no worries
<mbruzek> katco: I agree with you it is taking a long time.
<katco> what i don't know is if juju will retry infinitely. if so, we should just see what happens when the install is complete.
<mbruzek> katco: We can watch the system boot and bootstrap seems to give up well before the install completes.
<mbruzek> I can give some extremely long timeouts while we think about something else.
<katco> good idea, the unit is seconds
<arosales> mbruzek: suggest to give a long timeout on bootstrap, append --debug, if that fails lets sync up with the maas folks first thing Monday
<katco> mbruzek: specifically i would make the bootstrap-retry-delay very long
<mbruzek> juju bootstrap --bootstrap-retry-delay=10000
<mbruzek> ?
<mbruzek> How are they called from the command line?
<katco> i always use the environments.yaml
<mbruzek> OK we can add that
<mbruzek> What keys
<katco> bootstrap-timeout, bootstrap-retry-delay, and bootstrap-addresses-delay
<katco> i would suggest making bootstrap-retry-delay very long since we want to wait a long time for maas to come up
<arosales> katco: perrito666: the help is much appreciated
<mbruzek> bootstrap-timeout: 6000  (is it just an int, or 1000s )?
<katco> mbruzek: it's just an int
<katco> arosales: np, wish i knew a bit more
<mbruzek>     bootstrap-timeout: 7200
<mbruzek>     bootstrap-retry-delay: 60
<mbruzek> Are those too long?
<arosales> katco: you know a lot about juju core internals :-)
<katco> arosales: lol not _that_ much. it's only been 3 months!
<arosales> it is a big go code base
<katco> mbruzek: i would make bootstrap-retry-delay 6000 as you mentioned
<katco> mbruzek: otherwise you'll wait in the wrong spot.
<arosales> +1 just make it ridiculously long to rule that out
<perrito666> although if it takes ridiculously long for it to install, something else might be wrong
<mbruzek> katco: arosales, the retry DELAY should not be over 1 hour should it?
<mbruzek> I have the bootstrap-timeout 3 hours (in seconds) but the delay between retries I would want every minute right?
<katco> mbruzek: that is the delay i would think you'd want higher. b/c the other delays are just going to fail outright i think.
<katco> i could be misunderstanding the situation, but that's my take on it.
<mbruzek> Can I just set the retry delay to 100?  I think we want it to retry for 3 hours
<katco> in other words, let it fail fast for the reasons it is, but give it a long time before it tries again b/c the machine is probably coming up
<katco> i think the net effect might be reasonably close, so w/e you're comfortable with
<mbruzek> I am bootstrapping again with these values set, I want it to retry every so often, and have the bootstrap-timeout of 3 hours.
<katco> ok well good luck! i need to EOD and take care of dinner :)
 * arosales would go with katco's suggestion
<arosales> katco: thanks!
<hazmat> any powerpc knowledge around?
<perrito666> hazmat: sorry most powerpc knowledge is in australia
<akash_> np thanks
<alexisb> akash_, are you going to be online this weekend?
<alexisb> I can send some of those guys mail and see if they are around to help out
<akash_> alexisb, tomorrow for sure for a few hours
<akash_> i haven't taken a vacation so i'm out of pocket starting 7pm to monday  :)
<akash_> alexisb, thanks...that would be very helpful
<alexisb> ok, will do
#juju-dev 2014-09-28
<menn0> thumper: morning.
<thumper> menn0: morning
 * thumper is just updating before rebooting
<menn0> thumper: you're going to love what that problem was that I was looking at on Friday
<menn0> thumper: I figured it out Friday night
<thumper> menn0: here is a pro tip: don't upgrade just before flying out
<thumper> menn0: get a stable base several days before
<menn0> thumper: I wasn't planning on it :)
<thumper> menn0: was that the modifying dict?
<thumper> menn0: or another issue
<thumper> the agent not starting issue?
<menn0> thumper: no, remember the units magically shutting down?
<thumper> yeah... what was that then?
<menn0> thumper: it turns out it's something that Jesse and I had already noticed but thought it was only a minor inconvenience for users
<menn0> bug 1372752
 * thumper looks at mup
<menn0> if units try to connect to the state server before some of the migrations have run
<thumper> ah...
<thumper> they think it is all bad?
<menn0> the login fails with an authorisation error and the unit thinks it has to terminate
<thumper> ah, bugger
<menn0> also the unit doesn't log anything when it does this
<menn0> it just shuts down
<menn0> I'll fix this too
<menn0> it's fairly easy to fix I think: report a different error when logins fail while upgrades are in progress
 * thumper nods
<menn0> something like "upgrades in progress" (which we already have defined as a server error)
<thumper> yup...
<thumper> sounds good
<menn0> this should make the problem friendlier for users as well
<menn0> so I'll do that first thing this morning
<menn0> brb... kids
<thumper> awesome
<menn0> is anyone able to review this bug fix? it's an easy one. http://reviews.vapour.ws/r/119/
<menn0> davecheney: morning
<davecheney> menn0: o/
<menn0> davecheney: NZ daylight saving came in to effect yesterday so it's midday for us
<davecheney> yay
<davecheney> yay insanity
<menn0> davecheney: no standup yet as thumper had to do a thing with his kids
<menn0> davecheney: and jesse is out
<davecheney> kk
<menn0> davecheney: thumper should be back soon-ish so perhaps do standup when he's back?
<davecheney> i saw there was a ppc64 issue over the weekend
<davecheney> does anyone have teh details
<davecheney> is there an issue ?
<menn0> davecheney: I haven't heard about that but I'm not caught up on email
<menn0> davecheney: any chance you could review this? it's an easy one and fixes a bug that's blocking me from landing a much bigger branch. http://reviews.vapour.ws/r/119/
 * davecheney looks 
<davecheney> done
<menn0> davecheney: thanks. my replies here: http://reviews.vapour.ws/r/119/
<davecheney> menn0: fair enough
<davecheney> i don't really know enough about what is going on here anyway
<menn0> davecheney: k
<davecheney> you have you lgtm
<davecheney> s/you/your
 * thumper goes to make lunch
<thumper> so hungry
#juju-dev 2015-09-21
<menn0> waigani, thumper or davecheney: http://reviews.vapour.ws/r/2721/
<menn0> easy one
<rick_h_> thumper: got 10min to chat?
<thumper> rick_h_: I do now
<mup> Bug #1497801 opened: provider/joyent: data races <juju-core:New> <https://launchpad.net/bugs/1497801>
<mup> Bug #1497802 opened: provider/maas: data races <juju-core:New> <https://launchpad.net/bugs/1497802>
<mup> Bug #1497807 opened: worker/rsyslog: data race <juju-core:New> <https://launchpad.net/bugs/1497807>
<mup> Bug #1497809 opened: worker/statushistorypruner: data race <juju-core:New> <https://launchpad.net/bugs/1497809>
<mup> Bug #1497810 opened: uniter/remotestate: data race <juju-core:New> <https://launchpad.net/bugs/1497810>
<davecheney> there are currently 5 data races in juju
<anastasiamac> davecheney: is it an improvement? :D
<davecheney> no, a regression of +5
<mup> Bug #1497809 changed: worker/statushistorypruner: data race <juju-core:New> <https://launchpad.net/bugs/1497809>
<mup> Bug #1497810 changed: uniter/remotestate: data race <juju-core:New> <https://launchpad.net/bugs/1497810>
<mup> Bug #1497809 opened: worker/statushistorypruner: data race <juju-core:New> <https://launchpad.net/bugs/1497809>
<mup> Bug #1497810 opened: uniter/remotestate: data race <juju-core:New> <https://launchpad.net/bugs/1497810>
<mwhudson> davecheney: guess that job didn't get made voting then?
<davecheney> sad trombone
<davecheney> too many test suites depend on github.com/juju/juju/testing.BaseSuite, when they should be depending on github.com/juju/testing.CleanupSuite
<davecheney> ^ ie, all the suites that _only_ use PatchValue
<menn0> hmmm my merge run is stuck trying to bring up an EC2 instance
 * menn0 cancels it
 * natefinch really just wants to run some shit in a goroutine that reads the db, but in Juju that means a watcher and a worker and an API facade and an api client...
<davecheney> https://github.com/juju/juju/pull/3340
<davecheney> i was able to remove testing.BaseSuite from this package
<natefinch> davecheney: nice
<natefinch> ahh, confusing... juju/testing vs juju/juju/testing
<davecheney> -ETOOMUCHBASESUITE
<natefinch> +100
<mup> Bug #1497829 opened: session closed in data source <juju-core:In Progress by anastasia-macmood> <https://launchpad.net/bugs/1497829>
 * thumper goes to make dinner, will check in later to see william
<thumper> fwereade: if you are around? a quick chat
<thumper> if you aren't, then enjoy the public holiday :)
<fwereade> thumper, am here
<thumper> I'll be quick
<jam> fwereade: standup?
<jam> frobware: dooferlad: the flag is "source destination check"
<dooferlad> thanks jam
<jam> dooferlad: you can google "ec2 vpc NAT instance" and "Source Destination Check" for more details
<jam> frobware: ^^ would be probably good to understand that discussion, since we're essentially natting for containers
<frobware> jam: ack
<frobware> jam, dooferlad: for the freeze on the 22nd is there anything we are missing or need to do?
<jam> frobware: have you kept the "15.10 Networking Work Items" up to date?
<jam> I think you and dimiter went over that last week
<dooferlad> frobware: on https://canonical.leankit.com/Boards/View/101652562#workflow-view you can set up a filter using the funnel on the top right. If you use the hilight mode and filter on tag 'phase-1.1-mvp' you can see all cards related to the initial networking support.
<dooferlad> frobware: assuming the tags are correct, that should be everything
<frobware> jam: dimiter and I went through that on friday and corrected the dates
<frobware> jam: for october there are two items outstanding which is in review
<dooferlad> frobware: there are three cards that aren't in merged. One is docs (assigned to Frank), one is Dimiters code for EC2 (that I can clean up and get someone to review, then land) and one is a low priority code cleanup card.
<frobware> jam, dooferlad: and the other item is waiting for James' ci tests to land - so, I believe the October section is up to date.
<dooferlad> frobware, jam: why do we have a doc for that when we have the board? My single source of truth alarm went off.
<frobware> dooferlad, jam: predates me but it's easier to see a top-down list vis-a-vis cards on the board which eventually go out of view because they're "done"
<jam> frobware: dooferlad: I would also guess size of scope for each one. I think cards are supposed to be finer grained than an overall summary of things that have been completed.
<fwereade> jam, want to do the series-metadata thing? if you're comfortable with the phase 1/2 bits I think I am too
<jam> fwereade: I thought you were off for public holiday today
<fwereade> jam, my concerns are all around the edge cases in the model when we have non-homogeneous units
<fwereade> jam, yeah, but so was ian and it's important to him :)
<jam> yeah, I'd like to chat, just started a different conversation because I thought you were gone, I'll swing back around in 20 min or so? Is that ok?
<fwereade> jam, sure, ping me, will try to be near
<perrito666> anastasiamac: axw ?
<perrito666> axw: standup
<jam> katco: when you're around, I'd like to meet with you in about 40 min or so if you're up.
<jam> katco: ping if you're around
<perrito666> jamespage: hey, I just bzr branched ubuntu:juju-mongodb, then I ran apt-get build-dep juju-mongodb but dpkg-buildpackage fails with: dpkg-source: error: can't build with source format '3.0 (quilt)': no upstream tarball found at ../juju-mongodb_2.4.10.orig.tar.{bz2,gz,lzma,xz}
<perrito666> jamespage: am I missing some step? sinzui you might know too
<jamespage> perrito666, bzr bd
<jamespage> (hint use the bzr builddeb plugin)
<perrito666> jamespage: tx a lot
<lazypower> perrito666, o/
<axw> perrito666 anastasiamac: sorry, I just forgot :/  I have set an alarm for tomorrow night
<bogdanteleaga> anybody fancy a fast pr? https://github.com/juju/utils/pull/152
<perrito666> lazypower: hi
<mup> Bug #1498010 opened: TestSetsStatusWhenDying <ci> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1498010>
<perrito666> well building the mongo deb takes some time :p
<mup> Bug #1498010 changed: TestSetsStatusWhenDying <ci> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1498010>
<katco> jam: hey around now
<katco> jam: still there?
<mup> Bug #1498010 opened: TestSetsStatusWhenDying <ci> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1498010>
<perrito666> jamespage: a bit more help :p at the end it tries to sign the package and fails because I lack the secret keys, any way to get around it?
<jam> katco: I'm stopping by IRC from time to time in case you are around today
<katco> jam: i'm here! :)
<jam> katco: I'm joining https://plus.google.com/hangouts/_/canonical.com/john-meinel-kat
<jamespage> perrito666,  "-- -us -uc"
<perrito666> tx again
<natefinch> man, second time we've been creating a service with a default value only to set the value correctly immediately afterward... where the only change needed to do it atomically is to pass in the value and use it at service creation time.
<natefinch> katco: fix committed on the 1.25 forward port of the io timeout bug btw.
<katco> natefinch: awesome! make sure LP is updated
<natefinch> katco: yep.
<natefinch> gah.... I hate it when people use unnecessary pointers
<lazypower> o/ Morning, who can i poke about the juju storage provider and loopback storage?
<katco> lazypower: wallyworld, axw, anastasiamac. i actually wrote the loopback provider, but don't have time this morning
<alexisb> lazypower, what is it that you need?
<alexisb> is it urgent?
<lazypower> alexisb, a CPP partner is running into an issue with loopback storage provider, http://paste.ubuntu.com/12515220/
<lazypower> it appears it's trying to do what's right, but encounters an error and just leaves itself cycling in a broken state.
<alexisb> lazypower, can you open a bug
<lazypower> surely
<alexisb> and I can get eyes on it this afternoon
<alexisb> lazypower, thanks
<lazypower> https://bugs.launchpad.net/juju-core/+bug/1498081
<mup> Bug #1498081: loopback storage provider not completing successfully <storage> <juju-core:New> <https://launchpad.net/bugs/1498081>
<lazypower> Thanks for taking a look alexisb
<alexisb> thanks for the bug
<mup> Bug #1498081 opened: loopback storage provider not completing successfully <storage> <juju-core:New> <https://launchpad.net/bugs/1498081>
<mup> Bug #1498084 opened: TestManageEnvironRunsCharmRevisionUpdater fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1498084>
<mup> Bug #1498086 opened: TestImplicitRelationNoHooks fails <ci> <test-failure> <juju-core:Incomplete> <https://launchpad.net/bugs/1498086>
<mup> Bug #1498094 opened: TestCommitHook fails <ci> <test-failure> <juju-core:Incomplete> <juju-core maltese-falcon:Triaged> <https://launchpad.net/bugs/1498094>
<natefinch> katco: you around?
<katco> natefinch: yep
<natefinch> katco: I think I must be doing something dumb with registering my facade, because i keep getting an error from the API that is  unknown object type "UnitAssigner"
<natefinch> when I clearly do register a facade with the UnitAssigner name
<katco> natefinch: the rpc server uses reflection to check that it conforms to a type signature
<katco> natefinch: so the errors are at runtime
<katco> natefinch: make sure you have the correct methods defined with the exact type signatures expected
<katco> natefinch: does that help, or just stating the obvious?
<natefinch> katco: both ;)  Often, stating the obvious is helpful :)
<natefinch> lol... if you print something out at the wrong time from the client, it won't bootstrap
<natefinch> ERROR failed to bootstrap environment: cannot upload bootstrap tools: invalid version "2015-09-21 18:22:51 ERROR juju.apiserver.unitassigner unitassigner.go:15 Registering UnitAssigner\n1.26-alpha1.1-vivid-amd64" printed by jujud
<natefinch> I bet I know what it is... the version. I bet the default version is 0 for the server and 1 for the client, or vice versa, and so they're not matching up
<natefinch> (version of the facade)
<natefinch> yep.... client defaults to v0, server defaults to v1.  Awesome.
<natefinch> er, I guess the server doesn't default, but most things are at v1, so I thought v1 would be the default.
<natefinch> man, would have been nice if the log message included the fact that it was the specific version of the facade that was missing, not the name
<katco> natefinch: hey, need a status update on bug 1497312 for the release meeting. almost done?
<mup> Bug #1497312: make assignment of units to machines use a worker <juju-core:In Progress by natefinch> <juju-core 1.25:In Progress by natefinch> <https://launchpad.net/bugs/1497312>
<natefinch> katco: not really... keep running into problems. It's not going to land today unless a miracle happens.  Sorry... lots of little things keep cropping up that are hard to debug.
<mup> Bug #1498175 opened: lxcBrokerSuite.* fails on centos <centos> <ci> <lxc> <test-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1498175>
<thumper> perrito666: ping
<perrito666> thumper: this is not to boast about the rugby game, is it?
<thumper> perrito666: no...
<thumper> perrito666: would like to talk work stuff
<thumper> if you have some time
<perrito666> sure, need a hangout?
<perrito666> thumper: ?
<thumper> yeah, just otp with alexisb right now
<thumper> after that?
<perrito666> sure, just ping me over here
<perrito666> thumper: didn't forget about me, right?
<thumper> perrito666: nope, ready now
 * thumper creates hangout
<thumper> perrito666: https://plus.google.com/hangouts/_/canonical.com/status
<perrito666> thumper: well that was an awful lot of info to lift from the back of my head, hope I didn't lie too much to you :p if you catch me on the other side of the day I might be more lucid
#juju-dev 2015-09-22
<thumper> kk
 * thumper heads out for lunch
<anastasiamac> waigani: hi :D
<waigani> anastasiamac: hello :)
<mup> Bug #1498232 opened: provider/ec2: provisioning with spaces should be provider-independent <ec2-provider> <network> <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1498232>
<mup> Bug #1498235 opened: provider/ec2: add unit and feature tests for provisioning instances with spaces in constraints <ec2-provider> <network> <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1498235>
<anastasiamac> waigani: i was wondering if ur old PR needed reviewing but I've reviewed it anyway :P
<anastasiamac> waigani: so all good :D
<anastasiamac> waigani: how r u, anyway?
<waigani> anastasiamac: sorry, internet dropped out. thanks for the review.
<axw> lazypower: if you're around, I can field answers about the loop storage provider now
<axw> lazypower: if not, feel free to email me
<natefinch> thumper, davechen1y, axw, etc: I have a function that gets run by a worker that iterates over a collection and performs an action for each document.  In theory, one or more of these actions may fail.  How do I handle the failures? Do I break out of the loop at the first one and let the worker restart? Do I run as many a possible and then collate the errors somehow?
<axw> natefinch: what's the action?
<axw> I'm not sure there's a general answer
<natefinch> axw: assigning units to machines.... this is the second half of a bugfix that makes adding services atomic... in this case, the user's desired units and their placement criteria are saved to a new collection when the service is created, and a worker comes along and does the assigning.
<axw> natefinch: and what would cause individual ones to fail? bad placement?
<axw> natefinch: I think this is a case where you'd want to try them all and collate errors (or set individual errors as unit statuses?), so one bad placement doesn't block everything
<natefinch> axw: in theory the service could have been destroyed, or yeah, bad placement
<natefinch> axw: yeah, that was what I had gone to do, but I was hoping someone else had made a standard error collating thingy
<axw> natefinch: maybe collating isn't the right thing to do though. if it may fail due to user input, then you probably want to report the error rather than bounce the worker
<natefinch> axw: we validate the placement synchronously during service deployment, so the only thing that could really fail is if you specify some combination that doesn't match any possible machines.
 * thumper agrees with axw
<thumper> collate and report
<natefinch> thoughts on how to turn N errors into 1 error?
<axw> well I'm actually saying don't collate/combine, but return an error per unit... and then update unit status with those errors
<natefinch> axw: the unit statuses will be updated, that code already exists and I'm just reusing it
<axw> natefinch: ok, what're you going to do with the error then?
<natefinch> axw: that's what I was just thinking about... it's just an error reported to the worker, it either logs it and ignores it or restarts.  There's not much else to do.  If the state code logs each error individually... the worker shouldn't really need every detail of every error.
<axw> natefinch: so I'd probably return something like ([]UnitAssignmentResult, error) to the worker, where each result contains a unit-specific error. you can log them or not, but only bounce the worker if the top-level error result is non-nil
<axw> natefinch: and if you need to do something with them later that isn't logging, you haven't thrown away information
<natefinch> axw: makes sense.  Thanks for the help :)
<axw> natefinch: nps
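A minimal Go sketch of the pattern axw describes above. The names (`UnitAssignmentResult`, `assignUnits`) are illustrative assumptions, not juju's actual API: per-unit failures travel in the results slice, and only a non-nil top-level error would bounce the worker.

```go
package main

import "fmt"

// UnitAssignmentResult carries a per-unit outcome; a nil Err means the
// unit was assigned. (Hypothetical type, sketching the discussion above.)
type UnitAssignmentResult struct {
	Unit string
	Err  error
}

// assignUnits tries every unit rather than stopping at the first failure.
// The top-level error is reserved for faults that should restart the worker.
func assignUnits(units []string) ([]UnitAssignmentResult, error) {
	results := make([]UnitAssignmentResult, 0, len(units))
	for _, u := range units {
		results = append(results, UnitAssignmentResult{Unit: u, Err: assignOne(u)})
	}
	return results, nil
}

// assignOne stands in for the real placement logic; it fails only for
// a unit whose placement matches no machine.
func assignOne(unit string) error {
	if unit == "bad/0" {
		return fmt.Errorf("no machine matches placement for %q", unit)
	}
	return nil
}

func main() {
	results, err := assignUnits([]string{"good/0", "bad/0"})
	if err != nil {
		fmt.Println("fatal:", err) // only this case would bounce the worker
		return
	}
	for _, r := range results {
		if r.Err != nil {
			fmt.Printf("unit %s: %v\n", r.Unit, r.Err) // or set unit status
		} else {
			fmt.Printf("unit %s: assigned\n", r.Unit)
		}
	}
}
```

The caller can then log or surface each unit's error without throwing information away, as axw notes.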
<axw> davechen1y: in response to your review comment: https://github.com/juju/juju/pull/3346
<davechen1y> axw: lgtm, i dunno why reviewboard hasn't picked it up
<axw> davechen1y: ta. maybe because I was lazy and didn't add a description
<axw> oh it's there now... just a bit slow
<thumper> and with that epic email, I'm done for the day
<thumper> laters
<mup> Bug #1498349 opened: juju upgrade fails with tools upload error due to invalid series "wily" <juju-core:New> <https://launchpad.net/bugs/1498349>
<mup> Bug #1498349 changed: juju upgrade fails with tools upload error due to invalid series "wily" <juju-core:New> <https://launchpad.net/bugs/1498349>
<dooferlad> mgz: do you want me to split up those merge requests some more, or are you ok with them?
<dooferlad> mgz: https://code.launchpad.net/~dooferlad/juju-ci-tools/addressable-containers-tools/+merge/271836, https://code.launchpad.net/~dooferlad/juju-ci-tools/addressable-containers-assess/+merge/271837
<mup> Bug #1498481 opened: HAProxy charm broken by recent commit <blocker> <charm> <ci> <haproxy> <quickstart> <regression> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1498481>
<mup> Bug #1498481 changed: HAProxy charm broken by recent commit <blocker> <charm> <ci> <haproxy> <quickstart> <regression> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1498481>
<perrito666> mattyw: your tweet just made me get again into the painful task of getting refills for my fountain pen
<mattyw> perrito666, you're one of those people as well are you?
<mattyw> perrito666, I'm new to it all
<mattyw> tasdomas, ^^
<perrito666> mattyw: lol, I am now using a very nice Staedtler I got as a prize in the last sprint's charming contest because finding the refills for my pen is especially hard here
<perrito666> but yes, I am picky about my writing materials
<perrito666> and I write a lot
<rick_h_> perrito666: /me looks at my Staedtler fountain pen I picked up during nuremberg sprint
<katco> perrito666: rick_h_: i only use the one true writing utensil: the apple pencil
<rick_h_> katco: hah!
<rick_h_> my only pencils are my old drafting pencils
<rick_h_> solid nice things
<wwitzel3> I use the square construction pencils for everything
<rick_h_> wwitzel3: those are my fav in the shop
<rick_h_> nothing like taking a break to sharpen a pencil with a knife or a bandsaw :)
<wwitzel3> rick_h_: I bought a box for the shop and they've migrated inside as well, so now I just use them for everything
<perrito666> I only use the square ones when I help my wife with cooking, those things will write on anything
<wwitzel3> even got me a proper carpenter pencil sharpener
<perrito666> wwitzel3: oh, is that a thing? I use an opinel
<wwitzel3> perrito666: yeah, they are very cheap, around a US dollar or two, but they are worth it, extends the life of the pencil by a lot (for me anyway)
<perrito666> well look at that, never saw one of those
<perrito666> katco: alexisb I completely forgot about the interlock sorry
<katco> perrito666: too busy talking about pencils
<perrito666> katco: and compiling mongo, oh joy
<katco> perrito666: calendar notifications ftw ;)
<perrito666> katco: yeah, I dismissed the notification of 10 min before but I have no notif for "starting now"
<katco> perrito666: gcal really needs a snooze button
<perrito666> katco: +1
<mup> Bug #1498511 opened: state server records lxbr0 address which clients attempt to use for charm downloads <juju-core:New> <https://launchpad.net/bugs/1498511>
<mup> Bug #1498518 opened: Error when attempting to deploy service in manual env without specifying machine <juju-core:New> <https://launchpad.net/bugs/1498518>
<rogpeppe> anyone ever seen this error before (from the state tests) ? http://paste.ubuntu.com/12521386/
<rogpeppe> hmm, looks like it's a TLS error
<natefinch> rogpeppe: doesn't look familiar.
<rogpeppe> natefinch: yeah, first time i've seen it too
<natefinch> niemeyer: you around?
<niemeyer_> natefinch: Yes, but on a meeting
<natefinch> niemeyer_: ok, ping me when you're out?  Trying to figure out which mgo assertion is failing
<rogpeppe> i've been looking for a review of this for 5 days now. any takers? http://reviews.vapour.ws/r/2689/
<rogpeppe> mgz, sinzui: i've pushed up a hopefully-working version of the use-charm.v6-unstable feature branch. any chance we could get a CI run on it, please?
<sinzui> rogpeppe: yes. I may not need to intervene because CI has better feature branch rules
<rogpeppe> sinzui: thanks
<rogpeppe> natefinch: you're OCR? fancy a review? http://reviews.vapour.ws/r/2689/
<perrito666> rogpeppe: reviewed
<rogpeppe> perrito666: ta!
<rogpeppe> perrito666: not sure about your "comment name is wrong" suggestion. the comment looks as correct as it was before. or are you saying that it was always wrong?
<rogpeppe> perrito666: FWIW i wouldn't use that phrasing either, but i didn't want to make too many gratuitous changes
<perrito666> sorry, if you look carefully the name of the function in the comment and the actual function name are not the same
<rogpeppe> perrito666: ha, good point
<rogpeppe> perrito666: so it was always wrong
<perrito666> most likely, just noticed and since you were there :p
<perrito666> apart from that looks good, I really like that change
<perrito666> I feel this doc is lying to me http://doc.bazaar.canonical.com/latest/en/tutorials/using_bazaar_with_launchpad.html#personal-branches
<perrito666> I branched a project and push will not do what it says there
<abentley> perrito666: Did you do the launchpad-login step above?
<perrito666> abentley: yup
<abentley> perrito666: What happens?
<natefinch> rogpeppe: yeah, sorry, forgot I was OCR
<rick_h_>  natefinch http://reviews.vapour.ws/r/2732/ has one as well please for frankban
<perrito666> abentley: http://pastebin.ubuntu.com/12521781/
<natefinch> rick_h_: looking
<abentley> perrito666: I suspect juju-mongodb is a package, not a project.
<abentley> perrito666: The package URL is something like lp:~hduran-8/ubuntu/trusty/juju-mongodb/juju-mongodb2.6
<wwitzel3> where does the aws-quickstart-bundle live?
<frankban> natefinch: ty!
<sinzui> wwitzel3: lp:juju-ci-tools/repository maybe
<sinzui> wwitzel3: http://bazaar.launchpad.net/~juju-qa/juju-ci-tools/repository/view/head:/bundles.yaml
<wwitzel3> sinzui: thank you
<perrito666> abentley: ah thank you my mistake
<mattyw> TheMue, ping?
<natefinch> alexisb, katco: have we picked a date for the juju-core sprint?
<frobware> mattyw, he's out on holiday this week
<mattyw> frobware, ok no problem
<katco> natefinch: i think dec. 7th-11th is looking like it may be the winner. nothing official.
<natefinch> katco: thanks.  I presume no word on where yet?
<katco> natefinch: nope. i think everyone's still focused on upcoming sprint
<natefinch> katco: fair enough... actually, I guess it's further away than I was thinking... I forgot november exists ;)
<katco> natefinch: haha that's sep. for me... always seems to blow by for some reason
<wwitzel3> sinzui: trying to reproduce https://bugs.launchpad.net/juju-core/+bug/1498481 with current 1.25 against aws, is this happening all the time for you? Or just intermittent?
<mup> Bug #1498481: HAProxy charm broken by recent commit <blocker> <charm> <ci> <haproxy> <quickstart> <regression> <juju-core:Incomplete> <juju-core 1.25:Triaged by wwitzel3> <https://launchpad.net/bugs/1498481>
<sinzui> wwitzel3: all the time on aws, hp, joyent, and maas
<wwitzel3> sinzui: should I be doing something other than using deployer to deploy the bundle to replicate?
<mup> Bug #1498575 opened: Create min helper function to eliminate duplicate definitions <juju-core:New for cherylj> <https://launchpad.net/bugs/1498575>
<mup> Bug #1498575 changed: Create min helper function to eliminate duplicate definitions <juju-core:New for cherylj> <https://launchpad.net/bugs/1498575>
<sinzui> wwitzel3: after quickstart finished, the script just polled the status. It exits early if an agent is in error. If all agents are started, then success. This is a success from a previous test of 1.25 http://reports.vapour.ws/releases/3065/job/aws-quickstart-bundle/attempt/1016
<mup> Bug #1498577 opened: juju deploy lis-test-charm hangs in pending <juju-core:New> <https://launchpad.net/bugs/1498577>
<mup> Bug #1498577 changed: juju deploy lis-test-charm hangs in pending <juju-core:New> <https://launchpad.net/bugs/1498577>
<alexisb> natefinch, yes, week of dec 7th
<alexisb> natefinch, venue still not locked down
<natefinch> alexisb:  cool, thanks
<perrito666> alexisb: well locking down the venue just for us is a bit overkill
<alexisb> perrito666, I want to ensure proper network
<rick_h_> perrito666: take over all the things!!!
<natefinch> niemeyer_: ping?
<perrito666> jamespage: thank you :D https://launchpad.net/~hduran-8/+archive/ubuntu/juju-mongodb2.6
<niemeyer_> natefinch: Heya
<niemeyer_> natefinch: So tell me, what's up with assertions there?
<natefinch> niemeyer_: hey, sorry, had to step out.  Back now.  There's just a large-ish list of ops being put into a single transaction. Most of them were there before, but in separate transactions.  The transaction is returning ErrAborted, which I understand means an assertion is failing?  But I don't know how to figure out which one, as there are several.
<niemeyer_> natefinch: There's no magic way.. you need to introspect the current state
<natefinch> niemeyer_: ahh, boo.  I had hoped there was a magic log file somewhere or something that could tell me which one was failing.  That's fine.  I can slog it out the hard way.
<niemeyer_> natefinch: No.. it's very hard to do that given the constraints we have in terms of db features
<niemeyer_> natefinch: All the assertions need to match before we fail.. when we fail, we can't be sure of what is broken since things are running concurrently
<natefinch> niemeyer_: understood.
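As niemeyer says, mgo/txn keeps no record of which assertion aborted the transaction, so the practical debugging move is to re-check each op's assert against the current documents yourself. A self-contained sketch of that technique, with in-memory maps standing in for the mongo collections (real code would fetch each op's document and compare):

```go
package main

import "fmt"

// doc is a stand-in for a mongo document; in real debugging you would
// load the live document for each op's collection/id.
type doc map[string]interface{}

// op mirrors the shape of a txn.Op for this sketch: a document id and
// the field values its Assert requires.
type op struct {
	ID     string
	Assert doc
}

// findFailingAsserts re-evaluates every simple equality assert against
// the current state and reports the ones that would abort the txn.
func findFailingAsserts(current map[string]doc, ops []op) []string {
	var failing []string
	for _, o := range ops {
		d := current[o.ID]
		for k, want := range o.Assert {
			if got, ok := d[k]; !ok || got != want {
				failing = append(failing,
					fmt.Sprintf("%s: field %q is %v, assert wants %v", o.ID, k, d[k], want))
			}
		}
	}
	return failing
}

func main() {
	state := map[string]doc{
		"machine-0": {"life": "alive"},
		"unit-0":    {"life": "dying"},
	}
	ops := []op{
		{ID: "machine-0", Assert: doc{"life": "alive"}},
		{ID: "unit-0", Assert: doc{"life": "alive"}}, // this one aborts the txn
	}
	for _, msg := range findFailingAsserts(state, ops) {
		fmt.Println(msg)
	}
}
```

Note niemeyer's caveat still applies: other clients run concurrently, so a re-check after the abort only tells you what is failing *now*, not necessarily what failed then.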
<mup> Bug #1498642 opened: juju lxc template does not DHCP release <juju-core:New> <https://launchpad.net/bugs/1498642>
<mup> Bug #1498642 changed: juju lxc template does not DHCP release <juju-core:New> <https://launchpad.net/bugs/1498642>
<katco> wwitzel3: hey, need an update on bug 1498481 for the release meeting. how's it going?
<mup> Bug #1498481: HAProxy charm broken by recent commit <blocker> <charm> <ci> <haproxy> <quickstart> <regression> <juju-core:Incomplete> <juju-core 1.25:Triaged by wwitzel3> <https://launchpad.net/bugs/1498481>
<perrito666> man waiting for ppas to build is nerve-wracking
<mup> Bug #1498642 opened: juju lxc template does not DHCP release <juju-core:New> <https://launchpad.net/bugs/1498642>
<perrito666> you know mup I feel like you are trying to tell me something
<katco> wwitzel3: ping
<mwhudson> good morning
<katco> mwhudson: o/
<mwhudson> perrito666: someone pointed me at https://wiki.ubuntu.com/SimpleSbuild recently
<mwhudson> perrito666: makes for fewer ppa embarrassments
<perrito666> mwhudson: thanks
<natefinch> you know it's bad when the simple process is 11 steps
<perrito666> mwhudson: but this is the first build of my ppa anyway, and it's mongo so it will take time :p
<mwhudson> ah yes
<katco> wwitzel3: ping ping ping
<alexisb> katco, thumper can you guys cover the release call today?
<alexisb> and ping me if you need me
<katco> alexisb: yes; anything in particular you want us to raise?
<alexisb> give green light on beta1 release
<alexisb> do a first pass of priorities on beta2
<alexisb> let me know if you have questions
<katco> alexisb: can't give a green light on beta1, 1 crit. open. wwitzel3 is working on it
<alexisb> katco, crap yeah ok, forgot about that
<alexisb> thanks
<davecheney> what's the story with https://bugs.launchpad.net/juju-core/+bug/1497297
<mup> Bug #1497297: TestFindToolsExactInStorage fails for some archs Again <blocker> <ci> <precise> <regression> <test-failure> <unit-tests> <juju-core:Fix Committed by cherylj> <https://launchpad.net/bugs/1497297>
<davecheney> it's been fixed released since friday
<davecheney> sorry, fixed comitted
<katco> davecheney: sinzui says we're still waiting for a blessed master
<davecheney> katco: ok, i'll keep waiting
<davecheney> http://reviews.vapour.ws/r/2731/
<davecheney> review, going free
<katco> davecheney: you might poke him for a re-run in just a bit
<mup> Bug #1484419 changed: Local provider: fail to download charm from state server when using an isolated network <deploy> <ha> <landscape> <network> <juju-core:Triaged> <https://launchpad.net/bugs/1484419>
#juju-dev 2015-09-23
<thumper> axw: ping
<axw> thumper: PONG!
<thumper> axw: do you have a few minutes to talk though the uniter fix you proposed?
<axw> thumper: sure
<thumper> axw: https://plus.google.com/hangouts/_/canonical.com/arghh-run-sucks?authuser=1
<axw> hehe
<thumper> axw... hmm... not looking good just now, seems like the tests are hanging
<thumper> axw: hmm...
<thumper> ran the uniter tests by themselves and all is good
<thumper> did hit two other problems with the whole test suite
<thumper> one was a weird tear-down failure in cmd/jujud where everything panicked because a mongo session was already closed
<thumper> and the other is the peergrouper tests failing with Go 1.5
 * thumper tries to write a failing test
<thumper> UniterSuite.TestUniterUpgradeConflicts intermittent failure
<thumper> hazaah
<thumper> NOT
<thumper> axw: confirmed here too that this fixes the bug as presented
<axw> thumper: cool
<thumper> axw: now the hard bit, write a failing test with the current code that the code fixes :)
 * thumper afk to collect a child
<anastasiamac> thumper: the panic in cmd/jujud due to the closed session is fixed
<anastasiamac> well, i have the fix but m waiting for master to land...
<anastasiamac> *for master to un-block that is...
<thumper> good to know
<thumper> hey git folks, is there a way to git stash some of the changed files?
<thumper> or interactive stash, like bzr shelve?
<anastasiamac> thumper: git stash
<anastasiamac> :D
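To answer thumper's actual question: git can stash a subset of changes, which is the closest it gets to `bzr shelve`. `git stash -p` is the interactive form; the pathspec form shown below requires git >= 2.13.

```shell
# Interactively pick hunks to stash, like `bzr shelve` (answer y/n per hunk):
git stash -p

# With git >= 2.13, stash only specific files (path is illustrative):
git stash push -m "wip: just this file" -- state/machine.go

# Bring the stashed changes back later:
git stash pop
```

Everything not selected stays in the working tree, so you can build and test the remaining changes in isolation.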
<mup> Bug #1498746 opened: azure: units fail to attach block storage <azure-provider> <storage> <juju-core:Triaged by axwalk> <juju-core 1.25:Triaged by axwalk> <https://launchpad.net/bugs/1498746>
<mup> Bug #1498746 changed: azure: units fail to attach block storage <azure-provider> <storage> <juju-core:Triaged by axwalk> <juju-core 1.25:Triaged by axwalk> <https://launchpad.net/bugs/1498746>
<thumper> axw: Your change which I approve, and my test: http://reviews.vapour.ws/r/2735/
<thumper> test confirmed failed before the good change
<axw> thumper: looking
<axw> thumper: sorry to be annoying, but can you delete the TODO(fwereade) at the top of Uniter.RunCommands? I forgot to delete it while I was there
<thumper> axw: ah yes, was meaning to do that
<mup> Bug #1498746 opened: azure: units fail to attach block storage <azure-provider> <storage> <juju-core:Triaged by axwalk> <juju-core 1.25:Triaged by axwalk> <https://launchpad.net/bugs/1498746>
<axw> thumper: LGTM, thanks. I'll take care of updating master later on
<thumper> axw: ok, I'll fix 1.25
<thumper> should be same fix
 * thumper done now, 
<frobware> voidspace, ping - can we catch up sometime today.
<voidspace> frobware: sure
<voidspace> frobware: after standup?
<frobware> voidspace, possibly. clashes with dooferlad, but we can discuss in standup
<voidspace> frobware: ah, it's Wednesday
<voidspace> frobware: feels like a monday
<voidspace> frobware: ok, at your convenience
<voidspace> frobware: I'm here all day...
<voidspace> frobware: now is good for me if you're free, otherwise later
<mup> Bug #1498859 opened: juju.InitJujuHome should not have side effects <juju-core:New> <https://launchpad.net/bugs/1498859>
<wwitzel3> sinzui: I'm stumped
<wwitzel3> sinzui: exact revision using quickstart on that bundle .. no issues in half a dozen tries
<mup> Bug #1498869 opened: doc/backup_and_restore.txt describes deprecated backup plugin <juju-core:New> <https://launchpad.net/bugs/1498869>
<rogpeppe> mgz: ping
<rogpeppe> sinzui: ping
<axw> anastasiamac perrito666: standup?
<perrito666> brt
<mup> Bug #1497297 changed: TestFindToolsExactInStorage fails for some archs Again <blocker> <ci> <precise> <regression> <test-failure> <unit-tests> <juju-core:Fix Released by cherylj> <https://launchpad.net/bugs/1497297>
<mattyw> rogpeppe, yeah, I tried that :(
<mup> Bug #1496997 changed: TestErrorReadingEnvironmentsFile calls chmod on win <ci> <regression> <unit-tests> <windows> <juju-core:Fix Released by cherylj> <juju-core 1.25:Fix Released by cherylj> <https://launchpad.net/bugs/1496997>
<mup> Bug #1498869 changed: doc/backup_and_restore.txt describes deprecated backup plugin <juju-core:Invalid> <https://launchpad.net/bugs/1498869>
<rogpeppe> mattyw: hrmph
<mattyw> rogpeppe, master is unblocked thought - you should be able to go for it now?
<rogpeppe> mattyw: ah, cool
<bogdanteleaga> sinzui, mgz: can you mark this as closed? https://github.com/juju/juju/issues/3179
<sinzui> bogdanteleaga: Done
<bogdanteleaga> thanks
<perrito666> editor with completion and checking plus compiling mongo, this computer is about to bail on me
<mup> Bug #1498904 opened: outdated lp:juju-core branch <juju-core:New> <https://launchpad.net/bugs/1498904>
<mup> Bug #1498904 changed: outdated lp:juju-core branch <juju-core:Fix Released by sinzui> <https://launchpad.net/bugs/1498904>
<mbruzek> frobware: I think I am seeing the same problem as 1491592
<frobware> mbruzek, which is good...
<mbruzek> frobware: Are you unable to reproduce?
<mbruzek> frobware: I added a comment to https://bugs.launchpad.net/juju-core/+bug/1491592
<mup> Bug #1491592: local provider uses the wrong interface <charmers> <customer-support> <kvm> <local-provider> <networking> <juju-core:In Progress by frobware> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1491592>
<frobware> mbruzek, I could not reproduce. were you able to reproduce based on the git repo I cloned?
<mbruzek> frobware: no I did not clone a repo, I was just working on some stuff and noticed the problem today.
<frobware> mbruzek, I noticed in the original description that "There was no reboot of any of his systems."
<mbruzek> frobware: Yes I believe that is correct, unfortunately my reproduction was involving a reboot
<frobware> mbruzek, either way there's obviously an issue here. Your repro is a lot quicker/easier
<mbruzek> frobware: I still have the error environment up.  Do you want any additional information before I destroy-environment?
<frobware> mbruzek, please could you add your environment info/setup to the bug report
<frobware> mbruzek, I see "-e kvm" but don't know what the setup really is.
<mbruzek> frobware: absolutely, you want my environments.yaml for that environment?  And what else?  Any log files or other information?
<frobware> mbruzek, ahh... is this repro on 1.24.5?
<frobware> mbruzek, yes to environments.yaml.
<mbruzek> frobware: all-machines.log ?  I can not get to the docker system any longer to get that machine log file
<frobware> mbruzek, on the KVM node did you add a bridge? https://jujucharms.com/docs/devel/config-KVM#kvm-guest-network-bridge
<perrito666> what is the process to add a feature flag?
<mbruzek> frobware: I don't remember adding a bridge but I do see lxcbr0 in my interface listing.
<mbruzek> Is it possible that was created by lxc ?
<mbruzek> where kvm != lxc ?
<frobware> mbruzek, oh certainly
<frobware> mbruzek, when you said "Restart the KVM host" which machine is that?
<mbruzek> my laptop
<mbruzek> I was using these two charms yesterday and restarted, tried to continue to use them today
<mbruzek> but I could not ssh to the docker/0 charm
<mbruzek> frobware: in any case I can still get to the nagios kvm machine, and it has network access
<mbruzek> frobware: do you need anything else before I destroy the environment?
<frobware> mbruzek, nope & thanks
<natefinch-afk> perrito666: you're assuming there's a process
<mup> Bug #1498942 opened: UniterSuite.TestRunCommand fails on windows <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1498942>
<sinzui> hi katco bug 1495591 (the master bug of 1498942) is a regression in 1.24. There is a trivial fix already in master that can fix 1.24
<mup> Bug #1495591: TestRunCommand fails on windows <blocker> <ci> <regression> <test-failure> <windows> <juju-core:Fix Released by cmars> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1495591>
<natefinch> sinzui: pretty sure she's out for the rest of the week
<sinzui> ouch
<xwwt> sinzui: I will see who should take it
<sinzui> thank you natefinch : thumper will find work for himself if it isn't resolved in a few hours
<natefinch> sinzui:  I can backport the fix listed in the bug, if we want
<sinzui> natefinch: that is preferable to rolling back. I think the fix is trivial to apply
<natefinch> sinzui: wicked trivial
<mup> Bug #1498942 changed: UniterSuite.TestRunCommand fails on windows <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1498942>
<bogdanteleaga> does anybody have "extensive" knowledge of how proxies work in juju?
<natefinch> bogdanteleaga: probably not
<natefinch> bogdanteleaga: I know we've had customers complaining that we obey the OS's proxy settings too much... which I find amusing
<natefinch> sinzui: it's working: https://github.com/juju/juju/pull/3357
<sinzui> thank you natefinch
<bogdanteleaga> natefinch: any idea if there are proxy settings being obeyed right at the first connection to the state machine? and if so, how?
<natefinch> bogdanteleaga: I really don't know the details, would have to dig into the code.
<natefinch> bogdanteleaga: I know we've had customers complain that they can't customize the port that the state server listens on, so they have to open their firewall for the port we've chosen
<voidspace> natefinch: ping
<natefinch> voidspace: sup?
<voidspace> natefinch: got a couple of minutes spare to help me with a problematic transaction?
<alexisb> natefinch, thanks for stepping up to help w/ https://launchpad.net/bugs/1495591
<mup> Bug #1495591: TestRunCommand fails on windows <blocker> <ci> <regression> <test-failure> <windows> <juju-core:Fix Released by cmars> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1495591>
<natefinch> alexisb: merging 3 line changes is my specialty ;)
<alexisb> :)
<natefinch> voidspace: sure
<voidspace> natefinch: I sent you a pm with the details
<voidspace> no reason it should be a pm really though
<voidspace> this assert: http://pastebin.ubuntu.com/12531747/
<voidspace> always causes this failure: &errors.errorString{s:"cannot set addresses of machine 0: state changing too quickly; try again soon"}
<voidspace> and I'm wondering why...
<voidspace> omitempty is set, so I wonder if a missing value would be the problem
<voidspace> but putting an arbitrary assert that should always pass also seems to trigger the same problem
<voidspace> so I suspect I'm missing something fundamental
<natefinch> I think it's a problem with the transaction code in mgo
<natefinch> voidspace: as I said in DM it's probably a copy of this bug: https://bugs.launchpad.net/juju-core/+bug/1334773
<mup> Bug #1334773: Upgrade from 1.19.3 to 1.19.4 cannot set machineaddress <landscape> <lxc> <maas-provider> <precise> <regression> <upgrade-juju> <juju-core:Fix Released by axwalk> <juju-core 1.20:Fix Released by axwalk> <https://launchpad.net/bugs/1334773>
<voidspace> natefinch: :-/
<voidspace> natefinch: thanks
<natefinch> you might ask niemeyer if he's around today.
<voidspace> natefinch: we have some tests that actually expect that error (not for machine address but contention tests)
<natefinch> voidspace: this looks like the error: https://github.com/juju/txn/blob/master/txn.go#L41
<voidspace> natefinch: right, thanks
<voidspace> natefinch: so it's an error that we're raising
<voidspace> natefinch: that makes me think it's likely to be my fault
<voidspace> more debugging needed
<perrito666> Bbl
<mup> Bug #1498968 opened: ERROR environment destruction failed: destroying storage: listing volumes: An internal error has occurred (InternalError) <destroy-environment> <juju-core:New> <https://launchpad.net/bugs/1498968>
<natefinch> voidspace: that error is caused by too many ErrTransientFailure errors, which is only ever returned here, AFAICT: https://github.com/juju/juju/blob/6c929a78879c396a2a244caad99851375984e7f8/state/service.go#L129
<voidspace> natefinch: I thought it meant that the transaction had failed too many times
<voidspace> hmmm
<natefinch> voidspace: well, yes
<voidspace> there's no service.Destroy involved here, that's for sure
<natefinch> voidspace: oh yeah, it looks like if you get txn aborted too many times, it will return that error
<natefinch> voidspace: misread a loop
<voidspace> yeah, I just don't see why that transaction would fail - I'm just asserting that the field I'm changing hasn't been changed by anything else
<voidspace> natefinch: found it
<voidspace> natefinch: it's specified as bson:",omitempty"
<voidspace> natefinch: needed to avoid a migration
<voidspace> natefinch: but causes the transaction to fail the first time I set the value - comparing a missing value against an empty address fails
<voidspace> hmmmm
<voidspace> that's tricky
<voidspace> natefinch: so I can assert that the field is either unchanged *or* missing, and that works
<natefinch> voidspace: ahh, there you
<natefinch> go
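The fix voidspace lands on — assert that the field is either unchanged *or* missing — can be sketched roughly like this, using plain Go maps in place of bson documents and a hypothetical local `Op` type rather than the real mgo/txn package:

```go
package main

import "fmt"

// Op mirrors the shape of an mgo/txn operation (a hypothetical local type
// for illustration, not the real mgo/txn package).
type Op struct {
	C      string      // collection name
	Id     interface{} // document id
	Assert interface{} // assertion query document
}

// unchangedOrMissingAssert builds an assertion that passes when the field
// either still holds oldVal or is absent from the document entirely.
func unchangedOrMissingAssert(field string, oldVal interface{}) map[string]interface{} {
	return map[string]interface{}{
		"$or": []map[string]interface{}{
			{field: oldVal}, // unchanged since we read it
			{field: map[string]interface{}{"$exists": false}}, // never written yet
		},
	}
}

func main() {
	op := Op{
		C:      "machines",
		Id:     "0",
		Assert: unchangedOrMissingAssert("preferredpublicaddress", nil),
	}
	fmt.Printf("%+v\n", op.Assert)
}
```

Because a field tagged with omitempty is simply never stored while empty, a plain equality assert against the empty value aborts the very first transaction that sets it; the `$exists: false` branch covers that first write.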
<frobware> dooferlad, did you come to any conclusion w.r.t. containers and spaces?
<dooferlad> frobware: still testing, but it looks like EC2 containers are broken.
<dooferlad> frobware: have created bug 1498982
<mup> Bug #1498982: failed configuring a static IP for container "1/lxc/0": cannot allocate addresses: instId not supported <juju-core:New> <https://launchpad.net/bugs/1498982>
<frobware> dooferlad, voidspace: the order of args to `subnet add' seems awkward to me. I want to do something like: juju subnet add dmz $(get-me-some-subnet-ids) ...
<voidspace> frobware: we wanted space to be an optional argument, not sure if that was a factor
<voidspace> so subnets default to the default space
<frobware> voidspace, I wonder which is the common case though: adding to [default], or something very explicit.
<frobware> voidspace, on EC2 If I get "cannot run instances: Request limit exceeded" is there something I can do ... ?
<frobware> dooferlad, any idea about working around EC2 "cannot run instances: Request limit exceeded"
<dooferlad> frobware: seems like the canonical account has enough people hammering it that we are hitting the request limit. Unfortunately it seems like the solution is to ask Amazon to up the limit for us.
<dooferlad> frobware: which the account holder can do, but I am not sure who that is. You could try joey on the Canonical IRC server
<dooferlad> http://docs.aws.amazon.com/AWSEC2/latest/APIReference/query-api-troubleshooting.html#api-request-rate
<dooferlad> http://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html
<dooferlad> :-(
<dooferlad> frobware: strange thing is that I am not hitting that message and I am using the canonical account on some machines in eu-central-1
<voidspace> frobware: sorry, just seen this
<voidspace> frobware: or "that"
<dooferlad> voidspace: Could you take a quick look at https://bugs.launchpad.net/juju-core/+bug/1498982 and post any insights?
<mup> Bug #1498982: failed configuring a static IP for container "1/lxc/0": cannot allocate addresses: instId not supported <juju-core:New> <https://launchpad.net/bugs/1498982>
 * dooferlad reaches EOD
<voidspace> dooferlad: oh yuck
<voidspace> dooferlad: have to look at it tomorrow I think
<frobware> voidspace, dooferlad: thanks. seems like a transient thing, have a machine booted now.
<dooferlad> voidspace: thanks. We need to hammer on getting everything EC2 working before next week (I think frobware would like his demo to work)
<voidspace> heh
<voidspace> dooferlad: I can do some spelunking in the code
<frobware> voidspace, dooferlad: +(a lot)
<dooferlad> right, I detect a daughter. See you all tomorrow.
<voidspace> dooferlad: I think something recent must have broken it, this was working fine
<voidspace> o/
<voidspace> I'm EOD too
<voidspace> the good news is that my transaction works and I have race condition tests that pass (and also fail with the Assert commented out)
<voidspace> just a handful of missing tests for my new network functions and I'm done
<voidspace> (I think...)
<rogpeppe> sinzui: it looks like this CI build is hung up waiting for an instance (it's been like that for quite a long time now): http://juju-ci.vapour.ws:8080/job/github-merge-juju/4846/console
<rogpeppe> sinzui: about 3 hours if the times are to be believed
<sinzui> :/
<sinzui> rogpeppe: I have seen this before, and this time matches AWS failures we saw elsewhere. I can kill the job, but someone will need to re-submit
<rogpeppe> sinzui: thanks
<perrito666> you have got to be kidding me bzr: warning: skipping /home/hduran/Develop/canonical/ppas/branches/juju-mongodb3.0/src/mongo-tools/mongodump (larger than add.maximum_file_size of 20000000 bytes)
<perrito666> that is NOT a sane thing to do as a default
<sinzui> perrito666: I didn't see that when I attempted to package mongodb3. Are you packaging mongodb or mongo-tools?
<perrito666> I am packaging mongodb3 with the tools (which implies bundling the binaries)
<sinzui> perrito666: Lp wont let you
<sinzui> a source package only contains sources, not binaries
 * perrito666 sighs and curses at different things
<sinzui> perrito666: In general, if a human cannot audit the content of a package and change it, the package will not be accepted.
<perrito666> sinzui: I am just making a PPA for my personal use; are those covered by the same rules?
<sinzui> perrito666: I already asked jamespage for advice about reconciling the massive change between mongo 2.4 and 3.0.1
<perrito666> sinzui: so I pretty much have managed to smash 3 into juju-mongodb
<sinzui> perrito666: you can get away with it in your PPA.
 * sinzui did the same to provide elastic-search java deps
<perrito666> sinzui: cool, my plan is to build the whole migration tool using the ppas as a way to prove the whole thing works
<perrito666> and then we can build decent packages
<sinzui> +1
<perrito666> we will need a juju-mongodb2.6 and juju-mongodb3 which can live next to the current juju-mongodb and, now I learn, also mongo-tools
<perrito666> or we could make juju-mongodb3 extra smart and compile both things
<perrito666> I am completely aware that my juju-mongodb3 is not production level, so I didn't bother making it all that nice
<sinzui> perrito666: yeah, the co-installable issue is one of the many issues I asked for help with
<perrito666> the scons options have changed a lot too
<perrito666> and the overall layout of the code, for that matter; I don't really understand how their .deb is bundled with the tools when the root of the repo doesn't build the package with them
 * perrito666 buys a bottle of gin and sits in a table with sinzui to cry about their ordeal packaging mongo
<perrito666> you might notice too that wiredTiger will not build on i386, and that there is a mystery option (or options) to be tweaked so it compiles on wily; apparently there are issues with the new gcc and some threading feature default, aaand also I think it doesn't build on trusty unless we provide different boost libs :p
 * perrito666 reads the news and decides to change the brand of his next car
<natefinch> sinzui: I think the build bot is stuck
<natefinch> mgz ^
<natefinch> perrito666: heh, no VW for you eh?
<sinzui> natefinch: again? me looks
<sinzui> natefinch: I aborted the stuck job and re-enqueued it. AWS has failed to provide instances several times today :(
<natefinch> sinzui: cool. Maybe my merge into 1.24 will happen sometime today, then.  I was wondering why it hadn't merged in 5 hours
<sinzui> natefinch: master has been slow to test for the same reason. I am retesting the last job now. So I hope your branch merges, then  CI starts the test of 1.24
<perrito666> natefinch: well I was considering a VW up
<perrito666> but I might reconsider :p
<perrito666> then again, it's not like I never commented out a test myself :p
<natefinch> lol
<mup> Bug #1498349 changed: juju upgrade fails with tools upload error due to invalid series "wily" <precise> <streams> <upgrade-juju> <wily> <juju-core:New> <https://launchpad.net/bugs/1498349>
<perrito666> finalllyyyy
<mwhudson> juju code should be able to import encoding now /cc rogpeppe
<mwhudson> (gccgo-go is fixed in trusty-updates)
<alexisb> wwitzel3, pint
<alexisb> ping
<wwitzel3> alexisb: pong
<alexisb> heya wwitzel3
<alexisb> I was just checking in on this bug: https://bugs.launchpad.net/juju-core/+bug/1498481
<mup> Bug #1498481: HAProxy charm broken by recent commit <blocker> <charm> <ci> <haproxy> <quickstart> <regression> <juju-core:Incomplete> <juju-core 1.25:Triaged by wwitzel3> <https://launchpad.net/bugs/1498481>
<alexisb> any progress today?
<wwitzel3> alexisb: nope, I'm completely stuck on it, I've looked through the logs of the re-run and it didn't lend any help
<wwitzel3> alexisb: I can't reproduce locally against the same revision using that bundle, have tried it against aws and maas
<wwitzel3> alexisb: there has to be some variable i'm missing in attempting to reproduce, since it happens every time in CI
<wwitzel3> alexisb: but I'm stumped. I've managed to get some other bug fixes in while thinking on this one, which isn't much, since they aren't blocking beta1
<alexisb> wwitzel3, ack, let's look at doing a hand-off w/ thumper's team given it is their start of day
<alexisb> maybe fresh eyes will help
<thumper> wwitzel3: hey, off my call now and available to chat
<thumper> wwitzel3: looking at alexisb's initial attention grab, thinking "yeah, I need a pint about now"
<wwitzel3> thumper: haha
<wwitzel3> thumper: so I've updated the lp bug with the steps sinzui gave me on the call yesterday for running the juju-ci bundle
<thumper> wwitzel3: did you want a quick hangout?
<wwitzel3> thumper: sure
<thumper> wwitzel3: https://plus.google.com/hangouts/_/canonical.com/we-hate-haproxy
<perrito666> anyone is familiar with what is required to add a feature flag besides adding it to feature/flags.go?
<alexisb> perrito666, thumper is
<alexisb> but he is busy in a hangout w/ a pint
<perrito666> and here I am with just water
<thumper> perrito666: that is all that is required, just add a constant, and start checking against it in code
<thumper> very easy
<perrito666> thumper: wow, I would have expected more boilerplate, just by adding a constant featureflags.Enabled(blah) will be true?
 * perrito666 is distrustful of easy things
<perrito666> thumper: I am not getting my feature flag enabled, in cmd/jujud/bootstrap.go is it supposed to work that way?
<thumper> flags aren't magically set
<thumper> you have to set them in tests
<perrito666> thumper: tests? I am setting it in the env
<perrito666> I am trying to hide a feature inside a flag
<thumper> you can't set the env and expect it to work :)
<thumper> look for "initialFeatureFlags"
<perrito666> ahaa
<thumper> eg s.SetInitialFeatureFlags(feature.DbLog)
<perrito666> thumper: I am not trying to use this inside a test :)
<thumper> quick review for someone http://reviews.vapour.ws/r/2741/diff/#
<thumper> perrito666: oh? IRL?
<perrito666> thumper: yup
<thumper> perrito666: where?
<thumper> perrito666: what is your exact command?
<perrito666> thumper: I am trying to hide bootstrapping into mongo3 behind a feature flag, so I went to cmd/jujud/bootstrap.go -> Run() and checked if the flag is set before starting mongo. For that I set a "mongo3" flag as a const and exported it in my env as JUJU_DEV_FEATURE_FLAG=mongo3
<perrito666> then I juju bootstrap
<thumper> I'm fairly sure that the jujud process picks up feature flags early
<thumper> perrito666: FLAGS not FLAG
<perrito666> thumper: I set both
<perrito666> I wasnt sure
<thumper> osenv/vars.go
<perrito666> thumper: yes, I could have grepped too
<perrito666> but anyway, they are not available at that stage, bummer
<thumper> which stage?
<thumper> it is
<thumper> there is an init function in cmd/jujud
<thumper> that sets the flags based on the env var
<perrito666> there is, wtf is wrong then?
<thumper> sinzui: ping
<thumper> sinzui: looking at bug 1498481 and http://reports.vapour.ws/releases/3090/job/aws-quickstart-bundle/attempt/1055 as mentioned in comment https://bugs.launchpad.net/juju-core/+bug/1498481/comments/2
<mup> Bug #1498481: HAProxy charm broken by recent commit <blocker> <charm> <ci> <haproxy> <quickstart> <regression> <juju-core:Incomplete> <juju-core 1.25:Triaged by wwitzel3> <https://launchpad.net/bugs/1498481>
<thumper> sinzui: the logs don't have debug information
<thumper> sinzui: oh, you hit the dreaded positional arg bug
<thumper> sinzui: please put the --debug after the quickstart not before
<thumper> wwitzel3: if you are still around, can you send me the bundle they use?
<thumper> I don't have it
<thumper> perrito666, waigani, davechen1y: can someone please rubberstamp http://reviews.vapour.ws/r/2741/ ?
<waigani> thumper: shipit
<thumper> cheers
#juju-dev 2015-09-24
<axw> evilnickveitch: any significance to the pear and cake? what you happened to be eating at the time you merged? ;)
<rick_h_> axw: heads up. I've added you to a seattle session per alexisb's request as the expert as we got poking at an issue with ssh keys being requires for createEnvironment. Some notes in the doc linked to the session: https://docs.google.com/document/d/1JsVFVx0P6wKdBIPqYPOBiYqYNRpb5kR8ept_NfmkXPw/edit#heading=h.13n2f81u0zpj
<rick_h_> axw: just a heads up in case there's any pre-thought required :)
<axw> rick_h_: thanks
<davechen1y> thumper: fyi https://github.com/gorilla/websocket#gorilla-websocket-compared-with-other-packages
<thumper> davechen1y: handy
<davechen1y> hopefully it helps
<davechen1y> http://reviews.vapour.ws/r/2742/, anyone ?
<perrito666> davechen1y: ship it
<perrito666> I thought I was the only one changing deps by hand
<davechen1y> Posted 7 minutes from now (Sept. 24, 2015, 11:56 a.m.)
<davechen1y> ^ reviewboard lives in the future
<anastasiamac> davechen1y: yes, bleeding-edge technology :P
<davechen1y> perrito666: thanks
<davechen1y> axw: cherylj natefinch-afk you have reviews that have been approved
<davechen1y> please land then
<axw> davechen1y: mine are all blocked
<axw> I would love to land them :)
<davechen1y> axw: blocked on what ?
<davechen1y> are they against master, or a branch ?
<axw> davechen1y: 1.24 and 1.25 are blocked
<axw> those two
<davechen1y> \bummer
<thumper> I think I've landed an unblocker for 1.24
<thumper> waiting the bless
<thumper> and I'm looking at 1.25 bug
<thumper> but missing info
<thumper> davechen1y, waigani: team meeting time
<davechen1y> thumper: see you there
<waigani> thumper: coming
<perrito666> davechen1y: if you spoke spanish I would get you to record my answering machine message
<natefinch> lol
<axw> thumper waigani: fixed the link. for some reason the event shows up on my calendar twice: one I can edit, one I can't
<waigani> axw: nice work :)
<perrito666> axw: one is your event (it copies when you accept it) the other one is the actual event
<axw> perrito666: ah.
<natefinch> good god our watcher code needs better documentation
<natefinch> what's the correct way to watch a collection where I just want a notification whenever anything is added to the collection?  Looks like I should use newEntityWatcher or newDocWatcher, but they're both woefully underdocumented as to what "key" is supposed to be, and it is ever so helpfully an interface{} :/
<thumper> natefinch: I have no idea sorry
<thumper> hazaah, blessed 1.24
<natefinch> yay
 * thumper forward ports to 1.25
<wwitzel3> aka patch or commit
<natefinch> perrito666: http://www.amazon.com/gp/product/B004UWY4KG
<perrito666> ok ok,ill buy one
<natefinch> perrito666: lol
<perrito666> k ppl gnight
<natefinch> perrito666: I think I don't have a tron poster, oddly enough.  I have several others - star wars (a reproduction), hackers, 5th element... forget what else.  I had a dedicated movie room for a long time that I decorated... now they're just in storage.  I'll take out the hackers poster and put it next to my desk, though.
<natefinch> perrito666: night
<perrito666> natefinch: ill send one of those to the hotel once I figure where it is
<thumper> is there any way to go back through the logs of a closed hangout?
<natefinch> thumper: not to my knowledge
<thumper> cmars: ping?
 * thumper is being hopeful
<davechen1y> google says you can retrieve them via gmail
<thumper> davechen1y: and if you dn't use gmail?
<davechen1y> well, i couldn't find it in gmail
<davechen1y> it might allso be in the hangouts app on g+
 * davechen1y tries to help, but hits his knee on the coffee table
<davechen1y> mongo is one shithouse piece of software
<thumper> davechen1y: what brought this on?
 * thumper is amused with "shithouse" as Ian uses that too
<thumper> must be an ozzie thing
 * thumper back later for TL meeting
<davechen1y> thumper-afk: i'm fucking sick of losing 1 CI run out of 5 because mongo shits itself on startup
<axw> wwitzel3: you're not working, right? I'm going to fix this 1.25 blocker if you're not actively working on it atm
<axw> davechen1y: would you kindly review http://reviews.vapour.ws/r/2745/? fixes the 1.25 blocker
<rogpeppe> axw: hiya
<davechen1y> axw: ship it
<axw> davechen1y: ta
<axw> rogpeppe: howdy
<rogpeppe> davechen1y, axw: do you know what the procedure is meant to be for merging a feature branch, by any chance?
<axw> rogpeppe: nope. I do know that it has to be blessed by CI first
<axw> other than that, no
<rogpeppe> axw: it has (finally!) been blessed
<davechen1y> rogpeppe: nfi
<rogpeppe> axw: and i wanna get it merged quickly before it gets loads more conflicts and consequent CI cursing
<axw> rogpeppe: simplest way would be to create a new branch and rebase it onto master, then propose
<rogpeppe> axw: i'm not sure that it should be rebased
<axw> rogpeppe: well maybe not "simplest"... but easiest :)
<rogpeppe> axw: but in this case, perhaps it should
<rogpeppe> axw: 'cos it's only one commit anyway
<rogpeppe> axw: ok, i'll give that a go
<axw> rogpeppe: quick, while master's still unblocked :)
 * rogpeppe types like lightning
<axw> it's been a nightmare week for getting stuff merged
<rogpeppe> axw: yeah
<rogpeppe> axw: my last branch failed because of this error:
<rogpeppe> ... value *errors.Err = &errors.Err{message:"", cause:(*net.OpError)(0xc2103e3640), previous:(*errors.Err)(0xc21028a050), file:"github.com/juju/juju/state/open.go", line:114} ("cannot create index: local error: bad record MAC")
<rogpeppe> axw: do we know what causes that sporadic failure?
<axw> that old chestnut
<axw> I think it's a mongo WONTFIX
<rogpeppe> axw: marvellous
<axw> fixed in 3.0 IIRC
<axw> or maybe 2.6
<rogpeppe> well, let's just change mongo versions! :)
<axw> rogpeppe: perrito666 is working on that, making good progress
<rogpeppe> axw: moving to 3.0 ?
<axw> rogpeppe: yup
<rogpeppe> \o/
<axw> there's some fun migration steps required, that's the main issue
<axw> you have to go 2.4 -> 2.6 -> 3.0
<rogpeppe> yay! another day, another bunch of conflicts.
<rogpeppe> axw: really? the db format has changed?
<axw> rogpeppe: I don't know why it's necessary
<rogpeppe> axw: so any old juju installation needs to go through that upgrade path. that's awkward.
<axw> rogpeppe: yes, but perrito666 is working on making it all automated
<rogpeppe> axw: yeah, still awkward and probably error-prone and slow though :)
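The stepwise path axw mentions — each intermediate major version has to run once against the data files, since mongo can't skip a release — can be illustrated with a toy helper (version strings and names here are illustrative only, not juju's actual upgrade code):

```go
package main

import "fmt"

// upgradePath returns the ordered sequence of versions that must be run,
// one after another, to move the data files from one release to another.
// Skipping an intermediate major release is not supported, so the path
// always walks every step in between.
func upgradePath(from, to string) []string {
	order := []string{"2.4", "2.6", "3.0"} // supported releases, oldest first
	var path []string
	collecting := false
	for _, v := range order {
		if v == from {
			collecting = true // start collecting after the current version
			continue
		}
		if collecting {
			path = append(path, v)
		}
		if v == to {
			break
		}
	}
	return path
}

func main() {
	fmt.Println(upgradePath("2.4", "3.0")) // [2.6 3.0]
}
```

So an old juju installation on 2.4 needs two full upgrade passes, which is why automating the whole sequence (as perrito666 is doing) matters.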
<rogpeppe> axw: any chance of a quick LGTM on this branch please (now officially blessed :]) https://github.com/juju/juju/pull/3275
<axw> rogpeppe: looking
<rogpeppe> axw: ta
<axw> rogpeppe: LGTM
<rogpeppe> axw: ta!
 * rogpeppe hits the $$merge$$ button after more than two weeks...
<rogpeppe> axw: really hoping that your in-merge-progress branch won't conflict :)
<axw> rogpeppe: mine's to 1.25
<rogpeppe> axw: phew
<voidspace> dooferlad: frobware: omw to standup - just wrestling firefox
<frobware> fwereade, coming to standup?
<fwereade> frobware, oops, ty
<voidspace> fwereade: when you have some time could you check the transaction in this updated PR: http://reviews.vapour.ws/r/2593/
<voidspace> I *think* it's complete :-)
<natefinch> fwereade: you around?
<fwereade> natefinch, yeah, what can I do for you?
<natefinch> fwereade: I'm trying to set up a watcher that'll notify me whenever anything is added to a collection.  Am I right in thinking we don't have a general purpose type/function for doing this yet?
<fwereade> natefinch, take a look at the action watcher, it's implemented in a couple of layers, I suspect one of them will solve at least half of it for you
<fwereade> natefinch, newIdPrefixWatcher?
<natefinch> fwereade: so, let me make sure I'm not going down the wrong path here.  This is for the unit assignment worker... when a service is added, I store the unit assignment data as a document in this new collection.  then with this watcher I'm building, the worker will get notified and do the assignment
<fwereade> natefinch, yeah, that sounds entirely sane to me
<fwereade> natefinch, and notifying only on add is also entirely sane, just make sure you clearly document what the strings it sends actually mean
<natefinch> fwereade: so, I'm not sure I can properly act on just the information in the ID.... I need the unit name and the placement directive for the unit.  dumping that all into the id seems like too much
<fwereade> natefinch, yeah, definitely, I'd be inclined to just send out the unit ids
<fwereade> natefinch, then have the worker request the relevant info in bulk
<fwereade> natefinch, or, indeed, just send the list of units back up -- I doubt you'll have the opportunity to extract the actual assignment logic from state
<natefinch> fwereade: is the list of units useful?  I wasn't even going to bother since I'd have to get the rest of the data from the collection anyway, I was just saying "hey, go run everything in the collection"
<fwereade> natefinch, I'll cop to it being a bit forward-looking -- you could just have a NotifyWatcher that triggers on any add, and an assign-everything api method
<fwereade> natefinch, but as it is I think we have way too much detailed assignment logic in state
<fwereade> natefinch, which responsibility could/should be extracted
<fwereade> natefinch, so, if anything, I'm thinking the ideal is (1) watcher sends unit list (2) worker requests corresponding assignments in bulk (3) worker requests application of corresponding assignments in bulk
<natefinch> fwereade: I can do that - it at least keeps the logic of where we get the assignments separate from the actual assignment logic.
<fwereade> natefinch, because (1) that opens a path towards putting some of the machine-selection/addition logic outside state, and thus towards *consolidating* the various implementations that have grown up around deployer, gui, et al; and (2) assuming we properly separate the "service" from the facade that exposes it, we ensure the assignment service is decoupled from the worker that implements it
<fwereade> natefinch, awesome
<fwereade> natefinch, (I mean "service" as in https://github.com/juju/juju/wiki/Managing-complexity )
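The watch-then-bulk-fetch flow fwereade sketches — watcher emits batches of unit ids, worker resolves each batch with a single bulk request instead of one call per unit — could look something like this. All names are hypothetical, not juju's actual API:

```go
package main

import "fmt"

// assignment pairs a unit with its placement directive (hypothetical types).
type assignment struct {
	Unit      string
	Placement string
}

// bulkAssignments resolves a whole batch of unit ids in one call,
// standing in for a single bulk API request to the state server.
func bulkAssignments(ids []string, store map[string]string) []assignment {
	out := make([]assignment, 0, len(ids))
	for _, id := range ids {
		if p, ok := store[id]; ok {
			out = append(out, assignment{Unit: id, Placement: p})
		}
	}
	return out
}

func main() {
	changes := make(chan []string, 1)
	store := map[string]string{"mysql/0": "zone=a", "mysql/1": "zone=b"}

	changes <- []string{"mysql/0", "mysql/1"} // watcher emits one batch of ids
	close(changes)

	for ids := range changes { // worker drains batches, one bulk fetch each
		for _, a := range bulkAssignments(ids, store) {
			fmt.Printf("assign %s -> %s\n", a.Unit, a.Placement)
		}
	}
}
```

Keeping the watcher's payload to bare ids and doing the lookups in bulk is what decouples the assignment logic from the watcher, per steps (1)-(3) above.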
<natefinch> fwereade: btw, is it me, or is idPrefixWatcher a terrible name for that type?
<fwereade> natefinch, I think you're right
<fwereade> natefinch, I would not want you to think you should preserve horrible type names just because they happen to exist
<natefinch> fwereade: ha, no worries there
<fwereade> ;)
<natefinch> mattyw: you around?
<mattyw> natefinch, I am
<mattyw> natefinch, what can I do for you?
<natefinch> mattyw: document this better  :)  https://github.com/juju/juju/blame/master/state/watcher.go#L1488
<mup> Bug #1499277 opened: api: Client.UploadTools does not use error code <juju-core:New> <https://launchpad.net/bugs/1499277>
<mattyw> natefinch, like this https://github.com/juju/juju/blame/master/state/watcher.go#L1474. or is that not good enough
<natefinch> mattyw: no... wtf is the key in dockey?
<mattyw> natefinch, I take your point - I'll tell you and then turn it into docs
<natefinch> mattyw: thanks :)
<mattyw> natefinch, the original watcher just took keys ..string that it would watch for changes. The changes I made was to allow you to do that across collections. so you need to associated each key with a collection. So having those in a struct seemed like the best approach
<mattyw> natefinch, I'm embarrassed, I thought I documented this much better, I certainly remember writing something somewhere
<natefinch> mattyw: luckily easily fixable
<mattyw> natefinch, do you think that's sufficient documentation?
<mattyw> natefinch, (what I just typed)
<natefinch> mattyw: you still haven't explained what the key is :)
<axw> fwereade: any chance of another review on http://reviews.vapour.ws/r/2685/ later?
<fwereade> axw, sure
<axw> thanks
<mattyw> natefinch, the key is actually the id of the document, now that you mention it, it's a terrible name
<mattyw> natefinch, I was keeping in line with the existing getTxnRevno function which calls that value key, but I probably should have renamed that as well
<mattyw> natefinch, https://github.com/juju/juju/blame/master/state/watcher.go#L1511
<natefinch> mattyw: if it's just an id (which is a string, right?), why is it interface{}?
<mattyw> natefinch, that's a good question, probably a hangover from whatever was before
<mattyw> rogpeppe, ping?
<rogpeppe> mattyw: pong
<mattyw> rogpeppe, do you know why the key in this function is an interface{} https://github.com/juju/juju/blob/master/state/watcher.go#L1511
<rogpeppe> mattyw: yes
<rogpeppe> mattyw: but i'm not telling you
<mattyw> rogpeppe, because it's only used to findId, but id will always be a string right?
<mattyw> rogpeppe, I'll let you win next time we play dominion
<natefinch> lol
<mattyw> trying to work out if you're really not going to tell us, or just typing...
<rogpeppe> mattyw: it's an interface 'cos theoretically you don't need to use strings
<rogpeppe> mattyw: although in practice we only use strings
<mattyw> rogpeppe, I thought mongo required it to be a string
<mattyw> rogpeppe, or stringable
<rogpeppe> mattyw: i don't think so
<rogpeppe> mattyw: it can be an int for example
 * mattyw consults the mongodb he felt forced into buying last week
<mattyw> rogpeppe, natefinch huh, TIL
<mattyw> db.foobar.insert({"_id": {"foobar": "foo"}, "bar": "foo"})
<mattyw> { "_id" : { "foobar" : "foo" }, "bar" : "foo" }
<mattyw> natefinch, I'll improve the docs, is there anything else you need?
<mattyw> rogpeppe, I reckon getTxnRevno should take id interface{} rather than key
<mattyw> rogpeppe, natefinch I'm doing that now unless there are objections
<rogpeppe> mattyw: it looks like the word "key" is used more than just there in that file
<rogpeppe> mattyw: e.g. newEntityWatcher, docKey, etc
<rogpeppe> mattyw: so it might be better to be consistent in the naming there
<rogpeppe> mattyw: it would probably be fine to change it all though, docKey -> docId etc
<rogpeppe> mattyw: (if you do, please do it as a separate PR)
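The struct mattyw describes — pairing a collection name with a document id, the id left as interface{} because mongo _ids need not be strings — might look roughly like this standalone sketch (not the real state/watcher.go type):

```go
package main

import "fmt"

// docKey identifies a single document by collection and id, so one watcher
// can track documents across several collections at once. The id is
// interface{} because a mongo _id can be a string, an int, or even a
// sub-document — in practice juju only uses strings.
type docKey struct {
	collection string
	id         interface{}
}

func main() {
	// A watcher might keep the last-seen txn-revno per watched document,
	// keyed by (collection, id) rather than by id alone.
	watched := map[docKey]int64{}
	watched[docKey{"units", "wordpress/0"}] = 2
	watched[docKey{"machines", 0}] = 5 // a non-string id is a legal key too
	fmt.Println(len(watched))
}
```

Making the collection part of the key is what lets the same id in two collections be watched independently — which is the whole point of the change mattyw made.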
<urulama> did anyone try to bootstrap an env with 1.26 and then bootstrap it again (by mistake) with 1.24.6, reusing the .jenv file? bootstrap will continue, and the environment set up with 1.26 will get wiped out
<mattyw> urulama, that sounds nasty - can you open a bug?
<urulama> mattyw: i'll verify it first, then yes, will do ...
<mattyw> natefinch, http://reviews.vapour.ws/r/2748/
<mup> Bug #1499332 opened: "no such host" when bootstrapping manual provider <juju-core:New> <https://launchpad.net/bugs/1499332>
<mup> Bug #1499338 opened: charm GET endpoint is not authenticated <juju-core:New> <https://launchpad.net/bugs/1499338>
<mup> Bug #1499356 opened: all units have false hook errors after reboot <sts> <juju-core:New> <https://launchpad.net/bugs/1499356>
 * fwereade collecting laura, bbiab
<sinzui> cherylj: 1.24 is unblocked, you can merge if you were waiting
<cherylj> sinzui: thanks!
<mup> Bug #1499400 opened: number of disks constraint for maas provider <juju-core:New> <https://launchpad.net/bugs/1499400>
<frobware> bootstrapping on ec2 with 1.25 I just got "WARNING expected one instance, got 2." Is this a warning I can/should ignore?
<frobware> dooferlad, voidspace: more spaces related issues https://bugs.launchpad.net/juju-core/+bug/1499426
<mup> Bug #1499426: deploying a service to a space which has no subnets causes the agent to panic <network> <juju-core:New> <https://launchpad.net/bugs/1499426>
<voidspace> frobware: I've tracked down the cause of the bug dooferlad reported
<voidspace> https://bugs.launchpad.net/juju-core/+bug/1498982
<mup> Bug #1498982: failed configuring a static IP for container "1/lxc/0": cannot allocate addresses: instId not supported <network> <juju-core:New> <https://launchpad.net/bugs/1498982>
<voidspace> frobware: fixing the new bug you reported should be easy enough
<voidspace> frobware: but it should be fairly high priority
<frobware> voidspace, I think I set it as high - no?
<voidspace> frobware: yeah
<voidspace> frobware: I'm also playing with bootstrapping to ec2 - and I'm seeing the "WARNING: expected one instance, got 2"
<voidspace> frobware: it's probably / possibly because we're both using the account
<frobware> voidspace, ahh
<voidspace> frobware: bootstrapping seems to be successful anyway
<voidspace> dooferlad: interestingly, deploying a container to ec2 with addressable containers switched on seems to work for me!
<dooferlad> voidspace: what address does it get?
<frobware> voidspace, dooferlad: so who wants to pick up which bug? voidspace does it make sense for you to continue with 1498982
<dooferlad> voidspace: does it manage to assign a static IP?
<voidspace> dooferlad: it got a 10.0 address, which is correct I think
<dooferlad> voidspace: 10.0.3.* is wrong.
<voidspace> dooferlad: ah, right
<dooferlad> voidspace: those are from the LXC bridge
<voidspace> dooferlad: nothing at all in the logs
<voidspace> dooferlad: I'm confirming the flag is set properly and rebootstrapping
<voidspace> with a better log level
<voidspace> frobware: I've assigned 1498982 to myself
<voidspace> frobware: I think that's a critical regression
<dooferlad> voidspace: all-machines.log doesn't have "failed configuring a static IP for container "0/lxc/0": cannot allocate addresses: instId not supported" in it?
<voidspace> frobware: I still need my unit address bug completed too though
<frobware> voidspace, thanks; yep agreed. the other one I just raised may not happen if I remember to add some subnets to my space
<voidspace> frobware: we really should have that fixed in 1.25 final though
<voidspace> frobware: panics are bad!!
<voidspace> dooferlad: I've blown away the environment and am trying again
<frobware> voidspace, yep agreed to unit address bug. which is why I was asking ...
<voidspace> frobware: just waiting on a review from fwereade
<voidspace> frobware: so working on the one dooferlad reported in the meantime
<voidspace> frobware: dooferlad: I know the cause of the bug - just not sure of the best fix
<voidspace> frobware: dooferlad: implementing ec2 support for instance Id filtering of subnets may be the path of least resistance
<fwereade> voidspace, frobware: only just started looking; think I'll take you up on that "focus on the txn" thing
<voidspace> going to look at how the code used to look first
<voidspace> fwereade: sure, I realise you're a slightly busy man!
<voidspace> fwereade: I would get dimiter to look at it if he was here
<voidspace> fwereade: the substantial change is that the transaction is completely different - it's now done on address setting, not on preferred address fetching
<voidspace> fwereade: the rest of the changes are just logical consequences of those changes
<voidspace> but it still amounts to effectively a complete rewrite of the original PR :-)
<voidspace> fwereade: the race condition tests fail if you comment out the assert - which makes me believe they are both good tests and that the assert works...
<voidspace> of course I may well be deluded on both counts...
<fwereade> voidspace, no, I think it's good
<fwereade> voidspace, reviewed the state changes, just one naming quibble
<voidspace> fwereade: thanks, looking
<mup> Bug #1499426 opened: deploying a service to a space which has no subnets causes the agent to panic <network> <juju-core:New> <https://launchpad.net/bugs/1499426>
<voidspace> fwereade: hah, that did occur to me
<voidspace> fwereade: but I thought getSetPreferredAddressOps was just silly...
<voidspace> setPreferredAddressOps it is
<fwereade> voidspace, I think of <verbPhrase>Ops as being conventional
<voidspace> cool, thanks
<natefinch> jw4: you around?
<natefinch> fwereade: btw, that idPrefixWatcher is surprisingly action-specific: https://github.com/juju/juju/blob/master/state/watcher.go#L2173
<jw4> natefinch: yeah, but otp
<natefinch> jw4: no worries... lunch just arrived, back in a bit
<jw4> natefinch: I believe the idPrefixWatcher is a 'base' type that the action specific code uses?
<voidspace> if you bootstrap a dev version of juju the toolsversionchecker spams your log with errors
<fwereade> natefinch, ha, I'd missed that bit
<fwereade> natefinch, at first glance it looks as though it could be parameterised?
<voidspace> dooferlad: ok, the error I see (at INFO level!) is the same one you do "machine-0: 2015-09-24 15:56:55 INFO juju.provisioner lxc-broker.go:114 not allocating static IP for container "0/lxc/0": cannot allocate addresses: no interfaces available"
<voidspace> this is on master
<voidspace> latest master I think
<voidspace> so the error you reported was coming from an earlier revision
<voidspace> dooferlad: I'm going to switch to getting the unit address stuff merged and ported
<voidspace> dooferlad: then I'll come back to this
<jw4> natefinch, fwereade: crap, it's in mergeIds too
<jw4> I wonder if that conversion could happen later, when consuming those ids instead
<jw4> although... actionNotificationId just passes the original back in harmlessly if the id isn't an actionId
<jw4> maybe the name of the function just needs to change to make it less confusing
<jw4> I'd be happy to make a quick PR if that seems like an acceptable approach
<alexisb> cherylj, ping
<alexisb> wwitzel3, ping
<natefinch> jw4, fwereade: it definitely looks like it could be pretty easily converted to be more generic.
<natefinch> jw4: don't change the name... I'm going to change the name and see if I can make those conversions injected rather than hard coded.
<cherylj> alexisb: what's up
<alexisb> heya cherylj critical bug on 1.24 came up in the interlock call
<alexisb> do you have a time to have a look?
<cherylj> alexisb: sure, what's the bug #?
<alexisb> https://bugs.launchpad.net/juju-core/+bug/1499356
<mup> Bug #1499356: all units have false hook errors after reboot <sts> <juju-core:Triaged> <https://launchpad.net/bugs/1499356>
<dooferlad> voidspace: Just stopping for the day, sorry for missing your message 10 mins ago. I was running from the 1.5 branch with the bug I reported if that helps.
<natefinch> biab, gonna run home from the office.
<natefinch> bbiab that is
<jw4> natefinch: I have (what I think is) a better idea... I'll change the filterFn in the idPrefixWatcher to return a modified id, if necessary, when running the filter
<jw4> that way we can inject the action specific modifier in the action specific constructor call
<natefinch> jw4: ok, that's cool
<jw4> kk.. I'll make a quick PR and you can decide if you like it :)
<natefinch> jw4: cool :)
<natefinch> back in like 45 mins.
<perrito666> bbl
<natefinch> jw4: btw I think I have most of the changes needed to rework the idprefixwatcher into a generic collection watcher
<jw4> natefinch: okay - I'll push up my changes anyway and you can pick and choose at least.  I think the actions stuff is more tightly coupled than I thought at first
<jw4> just running the final tests
<jw4> (and setting up reviewboard again)
<natefinch> jw4: doesn't look too too bad if you just factor out the actionNotificationIdToActionId into a generic "id conversion" function that gets inserted... but I may have missed something
<jw4> natefinch: that's essentially what I'm doing
<jw4> I just did a bit of refactoring and cleanup there too for the tests, etc.
<natefinch> cool
<jw4> natefinch: http://reviews.vapour.ws/r/2749/  do with it as you wish
<mup> Bug #1499499 opened: juju-deployer reports that config specifies num units for subordinate when it doesn't <juju-deployer> <juju-core:New> <https://launchpad.net/bugs/1499499>
<mup> Bug #1499501 opened: com_juju_juju_testing.InitCommand / nil pointer dereference <ci> <intermittent-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1499501>
<mup> Bug #1499499 changed: juju-deployer reports that config specifies num units for subordinate when it doesn't <juju-deployer> <juju-core:New> <https://launchpad.net/bugs/1499499>
<mup> Bug #1499501 changed: com_juju_juju_testing.InitCommand / nil pointer dereference <ci> <intermittent-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1499501>
<natefinch> jw4: I know dave had said the var _ foo = thingies should be in tests... but, why?
<mup> Bug #1499499 opened: juju-deployer reports that config specifies num units for subordinate when it doesn't <juju-deployer> <juju-core:New> <https://launchpad.net/bugs/1499499>
<mup> Bug #1499501 opened: com_juju_juju_testing.InitCommand / nil pointer dereference <ci> <intermittent-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1499501>
<mup> Bug #1499499 changed: juju-deployer reports that config specifies num units for subordinate when it doesn't <juju-deployer> <juju-deployer:New> <https://launchpad.net/bugs/1499499>
<jw4> natefinch: I forget the exact rationale, but I remember that being raised as an issue before.  I guess we don't want the interface type assertions in production code.
<mup> Bug #1499499 opened: juju-deployer reports that config specifies num units for subordinate when it doesn't <juju-deployer> <juju-deployer:New> <https://launchpad.net/bugs/1499499>
<mup> Bug #1499499 changed: juju-deployer reports that config specifies num units for subordinate when it doesn't <juju-deployer> <juju-deployer:New> <https://launchpad.net/bugs/1499499>
<mup> Bug #1347322 changed: juju ssh results in a panic: runtime error <panic> <ppc64el> <juju-core:Fix Released> <juju-core (Ubuntu):Fix Released> <https://launchpad.net/bugs/1347322>
<mup> Bug #1347322 opened: juju ssh results in a panic: runtime error <panic> <ppc64el> <juju-core:Fix Released> <juju-core (Ubuntu):Fix Released> <https://launchpad.net/bugs/1347322>
<mup> Bug #1347322 changed: juju ssh results in a panic: runtime error <panic> <ppc64el> <juju-core:Fix Released> <juju-core (Ubuntu):Fix Released> <https://launchpad.net/bugs/1347322>
<alexisb> thumper-afk, ping
<perrito666> well it would seem that asus intends to make it really hard for me to buy a replacement power brick for my laptop
<perrito666> none of you lives near an asus retailer by any chance ?
<mup> Bug #1489142 changed: cpu-power constraint conflicts with with instance-type when trying to launch a t2.medium <constraints> <juju-core:Fix Released by cox-katherine-e>
<mup> <juju-core 1.24:Fix Released by cox-katherine-e> <juju-core 1.25:Fix Released by cox-katherine-e> <https://launchpad.net/bugs/1489142>
<mup> Bug #1490603 changed: TestSubnets fails <ci> <intermittent-failure> <test-failure> <juju-core:Fix Released by dooferlad> <juju-core 1.25:Fix Released by dooferlad> <https://launchpad.net/bugs/1490603>
<mup> Bug #1493444 changed: juju upgrade from 1.24-beta2 to 1.24.5 broken <status> <upgrade-juju> <juju-core:Fix Released by hduran-8> <juju-core 1.24:Fix Released by hduran-8> <juju-core 1.25:Fix Released by hduran-8> <https://launchpad.net/bugs/1493444>
<mup> Bug #1491398 changed: RebootSuite test failures on windows <ci> <regression> <test-failure> <windows> <juju-core:Fix Released> <https://launchpad.net/bugs/1491398>
<perrito666> thumper: after experimenting a bit I found out that the flags are not available in cmd/jujud/bootstrap for some reason
 * thumper thinks
<thumper> really?
<thumper> ah
<perrito666> thumper: featureflags.All is an empty array
<thumper> ha
<thumper> yeah... I know why
<perrito666> lol
<thumper> perrito666: quick hangout?
<perrito666> sure let me dig my headphones
<thumper> perrito666: https://plus.google.com/hangouts/_/canonical.com/boostrap-sucks?authuser=1
<perrito666> meh, google put me on hold to enter the call wtf
<perrito666> I am in
<perrito666> you are not
<mup> Bug #1491398 opened: RebootSuite test failures on windows <ci> <regression> <test-failure> <windows> <juju-core:Fix Released> <https://launchpad.net/bugs/1491398>
<mup> Bug #1491398 changed: RebootSuite test failures on windows <ci> <regression> <test-failure> <windows> <juju-core:Fix Released> <https://launchpad.net/bugs/1491398>
#juju-dev 2015-09-25
<perrito666> thumper: I could kiss you .... and also kill you
<axw> anastasiamac: when you're back, could you please review http://reviews.vapour.ws/r/2736/ for me?
<mup> Bug #1499570 opened:  public no address <backup-restore> <reliability> <retry> <juju-core:Triaged> <https://launchpad.net/bugs/1499570>
<mup> Bug #1499571 opened: Restore failed: error fetching address <backup-restore> <reliability> <retry> <juju-core:Triaged> <https://launchpad.net/bugs/1499571>
<anastasiamac> axw: looking :D
<anastasiamac> axw: lgtm :)
<axw> anastasiamac: thanks
<mup> Bug #1499570 changed:  public no address <backup-restore> <reliability> <retry> <juju-core:Triaged> <https://launchpad.net/bugs/1499570>
<mup> Bug #1499571 changed: Restore failed: error fetching address <backup-restore> <reliability> <retry> <juju-core:Triaged> <https://launchpad.net/bugs/1499571>
<mup> Bug #1499570 opened:  public no address <backup-restore> <reliability> <retry> <juju-core:Triaged> <https://launchpad.net/bugs/1499570>
<mup> Bug #1499571 opened: Restore failed: error fetching address <backup-restore> <reliability> <retry> <juju-core:Triaged> <https://launchpad.net/bugs/1499571>
<mup> Bug #1499573 opened: TestString failed on windows <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Incomplete> <juju-core feature-proc-mgmt:Triaged> <https://launchpad.net/bugs/1499573>
<cherylj> is there anyone around who's familiar with envworkermanager?
<thumper> davecheney: you have multiple versions of go handy right?
<thumper> I was testing a bug backport to 1.22
<thumper> but golang 1.5 really doesn't seem to like it, even with GOMAXPROCS=1
<thumper> I was wondering if I could get you to see if you can get the tests to pass with a different version
<thumper> https://github.com/howbazaar/juju/tree/ignore-machine-addresses-1.22
<anastasiamac> axw: series from version translation in utils :D PTAL if u have a chance http://reviews.vapour.ws/r/2751/
<axw> anastasiamac: before I review, can you show me where you need it? since we've not needed until now...
<anastasiamac> axw: sure, in https://github.com/juju/juju/blob/master/apiserver/imagemetadata/metadata.go#L210
<anastasiamac> in files, image metadata has versions
<anastasiamac> but we are storing series in structured image metadata
<anastasiamac> hence we need a translation
<anastasiamac> apiserver is the first place... another place will be
<anastasiamac> when we are caching custom images
<anastasiamac> since this is clearly a util, it should belong in a centralised, logical location, not apiserver where i have originally implemented it
<anastasiamac> in addition, it's counter-intuitive that we can go one way and not the other... i guess the need did not arise until now
<thumper> axw: cheers for finding the service config problem, how'd you identify the issue?
<axw> thumper: nps. I looked through the commits listed as suspects, and that was the only likely candidate. then I repro'd exactly as described in the bug (took me a few goes because I picked the wrong bundle to start with)
<thumper> axw or anastasiamac: I'm after a favour from someone
<anastasiamac> thumper: on Friday arvo?
<thumper> I need someone with an earlier golang to run the unit tests for a 1.22 potential fix
<thumper> golang 1.5.1 hates 1.22
<thumper> anastasiamac: yeah, it is pretty easy though
<thumper> I have this https://github.com/howbazaar/juju/tree/ignore-machine-addresses-1.22
<thumper> which I think isolates and backports a 1.24 fix into the 1.22 branch
<thumper> but I can't run the tests here successfully
<thumper> not at all
<thumper> I even tried to downgrade golang, but then go failed for other reasons
<thumper> and I couldn't get it working
<thumper> and at the end of friday, I'm losing the will
<anastasiamac> thumper: :( i know the feeling - it being friday and all...
<axw> thumper: I'll give it a shot
<thumper> axw: ta
<thumper> so if you start from a fresh 1.22
<axw> thumper: any particular tests?
<thumper> and pull in that branch
<axw> mk
<thumper> axw: no, just all of them :)
<thumper> it touches a few places
<thumper> machiner worker
<thumper> state
<thumper> and apiserver
<thumper> could probably get away with: api, apiserver, cmd/jujud, state, worker/machiner
<thumper> if you wanted to limit it
<axw> thumper: apiserver is happy. I'll run the lot and let you know
<thumper> I've never been so tempted to spend 2.5kUSD https://glowforge.com
<thumper> axw: ta
<thumper> axw: what would the differences be in the binary created between me compiling with golang 1.5.1 and what you are using (which I'm assuming is an earlier one) ?
<thumper> I know that golang 1.5 changed the default GOMAXPROCS
<thumper> but does that impact the binary created?
<axw> thumper: yes, the runtime is linked in statically. I'm using 1.4.2.
<axw> thumper: couldn't tell you the differences without poring over the release notes
<thumper> ah
<thumper> in which case, can I get you to upload the built juju and jujud binaries to chinstrap somewhere?
<thumper> I'll propose the backport, but I'm going to see if we can get some confirmation that it actually helps first
<axw> thumper: sure
<axw> thumper: tests are just hanging.. nfi what they're doing
<thumper> bugger
<axw> thumper: anyway, binaries are at https://chinstrap.canonical.com/~axw/ignore-machine-addresses-1.22.tgz
<thumper> axw: ta
<axw> thumper: giving up on tests now, they don't appear to be doing anything
<thumper> kk
<thumper> which ones hung?
<axw> thumper: I think it was in the cmd/juju package
<thumper> axw: or they were just taking ages?
<axw> thumper: maybe, hard to tell. it was going on for quite a long time (didn't have timestamps but around 10 mins?)
<thumper> hmm...
<thumper> axw: fyi https://bugs.launchpad.net/juju-core/+bug/1464304
<mup> Bug #1464304: Sending a SIGABRT to jujud process causes jujud to uninstall (wiping /var/lib/juju) <sts> <juju-core:Triaged> <https://launchpad.net/bugs/1464304>
<thumper> axw: I wonder if we should move the manual cleanup / removal code to SIGUSR1 or 2 rather than SIGABRT
<axw> thumper: yes, I think so
<thumper> axw: for some unknown reason, various folks have hit this
<axw> thumper: main problem is how to change it while still being able to destroy old environments
<thumper> yeah... always a problem
<thumper> here's my suggestion
<thumper> we fix it in 1.25 / master
<thumper> in 1.25, when the api client connects, it stores the server version in the client api
<thumper> so we can ask the server "what version are you"
<thumper> then switch based on known version
<thumper> or alternatively
<thumper> ask for the environ config agent version
<thumper> which is probably technically more correct
<thumper> as the api says what version the server is
<thumper> not what version the environment is
<axw> thumper: that doesn't work when there's no API connection though. we still have to support --force
<axw> thumper: probably could just "jujud --version" instead
<thumper> does that work?
<thumper> yep
<thumper> it does
<thumper> I think that would be a good start
<axw> thumper: I believe we're on bugs next week, I'll fix it then. trying to finish up some ceph-related storage changes atm
<thumper> kk
<thumper> have a good weekend
<thumper> chat next week
<axw> thumper: cheers, you too
<mup> Bug #1499613 opened: Windows device path mismatch in volumeSuite.TestListVolumesStorageLocationBlockDevicePath <ci> <test-failure> <unit-tests> <windows> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1499613>
<mup> Bug #1499617 opened:  juju machine add --constraints arch=i386 says machine created, then doesn't create machine <juju-core:New> <https://launchpad.net/bugs/1499617>
<mup> Bug #1499617 changed:  juju machine add --constraints arch=i386 says machine created, then doesn't create machine <juju-core:New> <https://launchpad.net/bugs/1499617>
<mup> Bug #1499617 opened:  juju machine add --constraints arch=i386 says machine created, then doesn't create machine <juju-core:New> <https://launchpad.net/bugs/1499617>
<mup> Bug #1499617 changed:  juju machine add --constraints arch=i386 says machine created, then doesn't create machine <juju-core:New> <https://launchpad.net/bugs/1499617>
<mup> Bug #1499617 opened:  juju machine add --constraints arch=i386 says machine created, then doesn't create machine <juju-core:New> <https://launchpad.net/bugs/1499617>
<frobware> voidspace, dooferlad: Thinking of postponing planning/retrospective until Monday when dimiter and frank are back. Thoughts?
<voidspace> frobware: agreed
<dooferlad> frobware: seems reasonable
<frobware> voidspace, dooferlad: it would allow us to concentrate on the spaces bugs
<voidspace> frobware: so, Monday?
<frobware> voidspace, yep
<voidspace> standup as usual today then
<frobware> voidspace, yes, probably not long but we should...
<voidspace> ok
<voidspace> frobware: how's your git-fu?
<voidspace> frobware: I wish to branch off 1.25
<frobware> voidspace, mostly OK within Emacs... :)
<voidspace> heh
<frobware> voidspace, in your github clone?
<voidspace> if I checkout 1.24 (a tag from my upstream) it works fine
<voidspace> frobware: yep
<voidspace> if I checkout 1.25 I get told it doesn't exist
<voidspace> git checkout upstream/1.25 works
<voidspace> but puts me in a detached head
<frobware> voidspace, I have 1.2x branches in my github fork so I just branch from those
<voidspace> frobware: I wonder why my github fork doesn't have a 1.25 branch/tag
<voidspace> and how I get one...
<voidspace> if
<voidspace> th
<voidspace> e
<frobware> voidspace, I ran into this the other day and just pushed a branch
<voidspace> oops
<voidspace> if
<voidspace> if
<voidspace> 8f0
<voidspace> s-a
<voidspace>  
<voidspace>   
<dooferlad> well, this is interesting to watch
<dooferlad> it looks like voidspace just had a cat sit on his enter key :-)
<dooferlad> btw, you need to git checkout -b localname origin/branchname
<dooferlad> http://stackoverflow.com/questions/471300/git-switch-branch-without-detaching-head
<voidspace> my keyboard switched into "crazy mode" and I failed to get it to behave itself
<voidspace> even on reboot
<voidspace> so new keyboard it is
<voidspace> dooferlad: thanks, I can branch off the detached head
<voidspace> dooferlad: I just wondered why I apparently have a 1.24 tag/branch and not a 1.25 one
<voidspace> and also I wondered if the detached head was expected/correct
<dooferlad> voidspace: you may have branched rather than checked out previously?
<voidspace> your implication is that it is
<voidspace> dooferlad: possibly
<fwereade> frankban, ping
<frankban> fwereade: hi
<fwereade> frankban, ancient history now, but: https://github.com/juju/juju/commit/c67e13c37948d5b3e41125c40425fccbee592452
<dooferlad> voidspace: this is why I paid for http://www.syntevo.com/smartgit/ -- it is all the git-fu I need
<dooferlad> voidspace: shame it doesn't do bzr!
<fwereade> frankban, do you recall what the motivation for adding JujuOsEnvSuite to BaseSuite was?
<voidspace> dooferlad: your mental model is better than mine too I think
<voidspace> although I'll look at smartgit
<voidspace> anything that makes my life easier is good...
<dooferlad> voidspace: it feels like there should be a joke in there about me being mental...
<voidspace> dooferlad: oh, I wouldn't joke about that!
<dooferlad> voidspace: :p
<frankban> fwereade: it seems that BaseSuite was cleaning up env var before as well
<fwereade> frankban, yeah, sorry, on closer inspection it looks like it's just an extraction
<frankban> fwereade: yeah, np
<dooferlad> voidspace: https://plus.google.com/hangouts/_/canonical.com/sapphire
<voidspace> dooferlad: omw
<voidspace> dooferlad: frobware: oh, I forgot to mention in standup. I have a meeting at daughter's school for an hour this afternoon (school just round the corner so not much travel time). Will work later to make up the time.
<mgz_> master is broken.
<anastasiamac> mgz_: oh?.. why?
<mgz_> anastasiamac: pr3210
<mgz_> doesn't build on windows. last time dave poked the version also broke the build.
<anastasiamac> :(
<bogdanteleaga> mgz_: we really ought to get a GOOS=windows build test in the hook if this actually happens that often
<mgz_> bogdanteleaga: I have a bigger, more painful for everyone solution
<mgz_> that I've been procrastinating over
<mgz_> but given the failure rate recently on windows testing, it's justified to just gate on a full windows run, even though it doubles the time
<mup> Bug #1499689 opened: Windows ftb after version.Binary.OS change <blocker> <ci> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1499689>
<bogdanteleaga> mgz_: might still save time overall :)
<mup> Bug # changed: 1466498, 1466514, 1488576, 1497456, 1498481
<rogpeppe> mgz_: why would gating on a full windows run double the time? does it really take three times as long to run the tests on windows?
<mgz_> rogpeppe: takes longer, and doing two runs means the various intermittent failures are more likely to be hit on any given merge attempt
<rogpeppe> mgz_: presumably you could do the two runs concurrently?
<mgz_> yeah, which is why it's only going to ~double the time
<mup> Bug # opened: 1466498, 1466514, 1488576, 1497456, 1498481
<rogpeppe> mgz_: (in general it might be interesting to consider running all the tests concurrently across a few machines)
<rogpeppe> mgz_: (shouldn't be that hard, i'd think)
<mgz_> rogpeppe: if the tests were just reliably run either in parallel, or under lxc, could already massively speed up the run
<rogpeppe> mgz_: why would running them under lxc help?
<mgz_> don't actually need multiple machines, just to be able to use all the cpus
<rogpeppe> mgz_: doesn't go test use all cpus anyway?
<mgz_> rogpeppe: because then I can just use multiple containers, also avoids the overhead of starting up a new machine each time
<rogpeppe> mgz_: i'm fairly sure that by default it runs GOMAXPROCS packages' tests at the same time
<mup> Bug # changed: 1466498, 1466514, 1488576, 1497456, 1498481
<rogpeppe> mgz_: so we can't run tests under lxc?
<mgz_> rogpeppe: I may be wrong, but I believe GOMAXPROCS defaults to 1 for us
<mgz_> rogpeppe: they mostly work, but are less reliable and hits timing issues much more
<mgz_> though, job also not helped by being on wily these days and the tests not passing on wily:
<mgz_> <http://juju-ci.vapour.ws/job/xx-run-unit-tests-lxc-wily-amd64/>
<rogpeppe> mgz_: i think you'd need to deliberately set GOMAXPROCS to get it to default to 1
<rogpeppe> mgz_: you can find out with 	fmt.Println(runtime.GOMAXPROCS(0))
<rogpeppe> mgz_: tbh we *should* always run with GOMAXPROCS>1
<rogpeppe> mgz_: anyway, spreading tests across machines would also be very feasible
<mgz_> rogpeppe: wasn't the default changed in some go > 1.2.1?
<rogpeppe> mgz_: i think it's controlled by the -p flag:
<rogpeppe> 	-p n
<rogpeppe> 		the number of builds that can be run in parallel.
<rogpeppe> 		The default is the number of CPUs available.
<rogpeppe> mgz_: so unless you're deliberately running with -p 1, i think you'll be running num-CPU tests at a time currently anyway
<rogpeppe> mgz_: contrary to my assertion above, GOMAXPROCS is orthogonal to this
<mgz_> rogpeppe: yeah, they're different effects. we used to force -p to something small, but that at least is no longer needed.
<rogpeppe> mgz_: so probably you'll need to split across multiple machines in order to speed things up
<rogpeppe> mgz_: another way to speed things up would be to have a git cache so that we're not fetching all the deps each time
<mgz_> rogpeppe: another way would be make the tests less pants :)
<rogpeppe> mgz_: i'm not asking for miracles :)
<mgz_> the download isn't much of a speed issue - it's more painful when github api is down and the job fails out
<rogpeppe> mgz_: if i had the task of speeding up the tests, i'd start by looking at setup/teardown time - there are lots of tests that take almost no time but fixture setup and teardown takes at least 0.25s
<mgz_> rogpeppe: yeah, a bunch of the problem is still we're using suites that are terrible
<mgz_> though there was some connsuite destruction recently
<bogdanteleaga> anybody has an idea if juju uses the proxy settings right on the first setup? and if so, how?
<bogdanteleaga> I can see it takes the config values from the state machine when proxyupdater starts
<rogpeppe> mgz_: and i'd also look at the problem that there are quite a few tests that run for 5s or 10s or 15s because they're waiting for the poll interval. that shouldn't be too hard to sort out.
<rogpeppe> mgz_: one thing that might potentially speed things up is to use a single external mongodb instance for all packages.
<rogpeppe> mgz_: running 4 (or however many) mongodb instances at a time is not gonna be good for speed
<rogpeppe> mgz_: i think that moving towards mocking everything because it's too slow is actually a step backwards in some ways.
<rogpeppe> mgz_: and yeah, jujuconnsuite sets up much more than it needs to (directories, files, etc). most tests don't need that.
<mup> Bug #1499617 changed:  juju machine add --constraints arch=i386 says machine created, then doesn't create machine <add-machine> <constraints> <juju-core:New> <https://launchpad.net/bugs/1499617>
<rogpeppe> hi all. i have another step in my apiserver changes for macaroon auth available for your delight and edification. i'm sure you'll all be dying to review it. http://reviews.vapour.ws/r/2758/
<mgz_> how cute rog.
<rogpeppe> mgz_: :)
<rogpeppe> TheMue, cmars: it seems you're OCR... any chance of a review? :) http://reviews.vapour.ws/r/2758/
<cmars> rogpeppe, yep, got a bit of a backlog already though
<rogpeppe> cmars: fair enough, had to try :)
<frobware> rogpeppe, TheMue is out this week
<rogpeppe> frobware: ah, ok
<frobware> rogpeppe, which is obviously a bit late in the day to find out...
<rogpeppe> frobware: it's fine, 'cos cmars is on the case, right casey? :)
<cmars> rogpeppe, reviewing!
<rogpeppe> cmars: much appreciated
<natefinch> fwereade: you around?
<mattyw> fwereade, ping?
<mup> Bug #1499781 opened: macaroonLoginSuite fails on windows on chicago-cubs <ci> <test-failure> <windows> <juju-core:Incomplete> <juju-core chicago-cubs:Triaged> <https://launchpad.net/bugs/1499781>
<fwereade> natefinch, just passing -- can I help?
<natefinch> fwereade: no worries, was wondering about the unit assigning worker I'm writing... I presume it should be a singular worker, since we don't want multiple workers assigning units at the same time.
<mup> Bug #1499689 changed: Windows ftb after version.Binary.OS change <blocker> <ci> <regression> <windows> <juju-core:Fix Released by dave-cheney> <https://launchpad.net/bugs/1499689>
<sinzui> abentley: hangout?
<abentley> sinzui: let's.
<cmars> http://reviews.vapour.ws/r/2762/, fixes LP:#1499613
<mup> Bug #1499613: Windows device path mismatch in volumeSuite.TestListVolumesStorageLocationBlockDevicePath <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1499613>
 * natefinch changes code... test fails.
 * natefinch changes test... test passes.
 * natefinch wonders if anyone actually cares what error is returned from this function, so long as it actually fails.
<cmars> natefinch, can you review these? http://reviews.vapour.ws/r/2762/ http://reviews.vapour.ws/r/2763/
<natefinch> cmars: ship 'em
<cmars> natefinch, thanks!
<natefinch> cmars: thanks for fixing it, and FWIW, I wish we gated on the windows tests too
<mup> Bug #1499900 opened: scope: container is too ambiguous and confusing <juju-core:New> <https://launchpad.net/bugs/1499900>
#juju-dev 2015-09-26
<mup> Bug #1478287 changed: Juju bootstrap fails on Azure <azure-provider> <juju-core:Expired> <https://launchpad.net/bugs/1478287>
<mup> Bug #1478287 opened: Juju bootstrap fails on Azure <azure-provider> <juju-core:Expired> <https://launchpad.net/bugs/1478287>
<mup> Bug #1478287 changed: Juju bootstrap fails on Azure <azure-provider> <juju-core:Expired> <https://launchpad.net/bugs/1478287>
#juju-dev 2015-09-27
<thumper> morning anastasiamac
<thumper> anastasiamac: we are expecting wallyworld back today yes?
<thumper> wallyworld: hey there
<wallyworld> hey
<davecheney> http://reviews.vapour.ws/r/2757/
<davecheney> ping
<davecheney> this is replacement for #3210
<davecheney> review 2588
<anastasiamac> davecheney: morning! trick question - did u test it on windows? :P
#juju-dev 2016-09-26
<thumper> menn0: https://github.com/juju/juju/pull/6315
<menn0> thumper: looking
<thumper> hmm... best go pick up the dog
<thumper> menn0: the QA steps were all with the next branch, which adds the command aspects
<thumper> there is no QA for just the server bits
<thumper> because it is a new call
<menn0> ok cool, that's fine
<menn0> thumper: sorry that I forgot
<thumper> :)
<thumper> I'll add the bits shortly
<thumper> let me add docstrings, tweak names and move on to submitting the next bit
<veebers> thumper: seems like a bit of an edge case but if you create and then delete a model it still shows up in list-models, but cannot be deleted. But it also unselects it as the focused model: http://pastebin.ubuntu.com/23232282/
<veebers> thumper, menn0: You seen something like that before? ^^ Going to file a bug as I can't find an existing bug for it
<thumper> veebers: it hangs around until the undertaker kills it
<thumper> and cleans up
<thumper> it shouldn't take too long
<menn0> veebers: this could be related to an existing ticket
 * menn0 finds
<thumper> when we first did it, we had it keep the model around for a day so logs and things could be removed
<thumper> but folks didn't like that
<thumper> so it was shortened, but not sure what to
<thumper> menn0: PR updated
<veebers> menn0, thumper: hmm ok the models (I tried a couple of times) are still there and the status says 'available', that should be 'destroying' or something no?
<thumper> veebers: when you say delete,
<menn0> veebers: yeah, that sounds like the bug I'm looking for
<thumper> what are you doing?
<veebers> thumper: as per the command in the pastebin: juju --show-log add-model -c charm-test model89; juju --show-log destroy-model model89 -y
<thumper> yes, the model hangs around for a while
<thumper> is it still there?
<thumper> hang on
<thumper> those commands errored out
<thumper> with not found
<veebers> thumper: yeah, after the original delete attempt (no error there) any follow up attempts error
<thumper> oh, first line does it all
<veebers> thumper: just re-checked list-models and they are still there (with status available)
<menn0> veebers, thumper: nope, I can't find that ticket
<thumper> yeah, that's definitely odd
<veebers> menn0: you thinking of this one? https://bugs.launchpad.net/juju/+bug/1613960
<mup> Bug #1613960: list-models can show a model that was supposed to have been deleted <juju:Triaged> <https://launchpad.net/bugs/1613960>
<veebers> menn0: huh right I have come across this before (as I filed that bug :-\)
<thumper> ha
<thumper> menn0: wanna +1 that PR?
<menn0> veebers: I was thinking of a different one, where an error like that appears after lots of add/destroy model commands
<menn0> thumper: yep
<menn0> thumper: done
<thumper> ta
<veebers> menn0: ah, that might be one that was alluded to when I tried this test run here (create a bunch of models, then delete a bunch of models)
<veebers> menn0: hmm, or not I think this might be the one I'm thinking of: https://bugs.launchpad.net/juju/+bug/1625774
<mup> Bug #1625774: memory leak after repeated model creation/destruction <eda> <oil> <oil-2.0> <juju:Triaged by alexis-bruemmer> <https://launchpad.net/bugs/1625774>
<menn0> veebers: no not that one
<veebers> menn0: if we keep going through _all_ the bugs I'm sure we'll finally uncover the one we're looking for ;-)
<menn0> veebers: haha ... I'm sure there is one but I can't find it
<menn0> veebers: I saw it when helping babbageclunk with something
 * thumper waits for branch to land before proposing next
<dimitern> jam: morning
<dimitern> jam: do you have the HO link?
<jam> morning
<jam> yes
<dimitern> ok
<jam> I'll be there in about 5 min
<dimitern> +1
<voidspace> macgreagoir: hey, I think on Friday I may have just been impatient - I did eventually see one container deploy working
<voidspace> macgreagoir: I think I just may not have been allowing enough time for image download
<voidspace> macgreagoir: so I'm retrying your branch
<macgreagoir> voidspace: Enjoy!
<voidspace> :-)
<voidspace> macgreagoir: are you at the London sprint now?
<macgreagoir> I was wondering if you needed to dpkg-reconfig maas pkgs to get dhcp on your new subnet too.
<macgreagoir> voidspace: I am.
<voidspace> macgreagoir: have fun :-)
<macgreagoir> Cheers!
<redir> http://www.ryman.co.uk/search/go?w=adapter
<redir> wrong link
<redir> how about https://docs.google.com/spreadsheets/d/1AGF6ED7kOtigvWTOBS8lkC0t2st63IRhbdpPWeofauU/edit#gid=1152189692
<frobware> redir: https://bugs.launchpad.net/juju/+bug/1611766
<mup> Bug #1611766: upgradeSuite.TearDownTest sockets in a dirty state <ci> <intermittent-failure> <regression> <unit-tests> <juju:Triaged> <https://launchpad.net/bugs/1611766>
<mup> Bug #1626576 changed: credential v. credentials is confusing <usability> <juju:Triaged> <https://launchpad.net/bugs/1626576>
<mup> Bug #1626878 changed: ERROR juju.worker.dependency engine.go <juju:Triaged> <https://launchpad.net/bugs/1626878>
<mup> Bug #1627554 changed: juju binary broken on sierra <juju:Triaged> <https://launchpad.net/bugs/1627554>
<anastasiamac> jam: dimitern: macgreagoir: replacing JujuConnSuite in state with ConnSuite: https://github.com/juju/juju/pull/6317
<jam> anastasiamac: looking
<anastasiamac> ja \o/
<anastasiamac> ta even :D
<jam> anastasiamac: +1
<anastasiamac> jam: amazing \o/
<voidspace> macgreagoir: so I'm afraid I still see - with a machine with a single nic on the pxe subnet a lxd container starts fine
<voidspace> macgreagoir: with two nics, the "first" on a separate subnet, the container starts but gets no address
<voidspace> macgreagoir: your branch
<voidspace> macgreagoir: I'm just trying to confirm it's not an oddity of the way I've set up the two nics
<macgreagoir> voidspace: You're seeing the addressing issue on my branch too?
<voidspace> macgreagoir: yup
<voidspace> macgreagoir: can't connect to the lxd at all (nor exec commands in it) to see the rendered /e/n/i
<voidspace> macgreagoir: unless you know a trick to get it
<macgreagoir> voidspace: Can you see inside /var/lib/containers/<container>/rootfs ?
<macgreagoir> /var/lib/lxd/containers... that is
<voidspace> macgreagoir: will try shortly - just adding a lxd container with your branch with the second NIC unconfigured
<voidspace> macgreagoir: to check that works
<voidspace> macgreagoir: for the second NIC (ethA not on pxe subnet) I have gateway address *on* that subnet - which probably means that subnet is not routable to the other one (or the wider internet)
<voidspace> macgreagoir: I wonder if that might be the issue and if the gateway address for 172.16.1.0/24 should be 172.16.0.1 (on the pxe subnet)
<rogpeppe> i've just resurrected https://github.com/juju/testing/pull/108 after leaving it languishing for a month or so. could someone review it please? (i got a positive review from fwereade, but it needed tests which i've just done).
<rogpeppe> it has a companion branch at https://github.com/juju/utils/pull/242 (much smaller)
<anastasiamac> rogpeppe: we'll look shortly :D thank you for the tests!
<rogpeppe> anastasiamac: ta!
<voidspace> macgreagoir: hmmm... with an unconfigured NIC as the "first" NIC it *looks* like I'm still seeing no address for the container
<voidspace> macgreagoir: restoring the order and trying *again*
<rick_h_> morning
<rick_h_> dooferlad: welcome back
<dooferlad> hi!
<voidspace> macgreagoir: yup, if I reorder the NICs then it starts fine.
<rick_h_> dooferlad: how's the little one?
<voidspace> macgreagoir: will try again with the order reversed and see if I can get to /e/n/i
<dooferlad> rick_h_: doing well. Old enough to smile now, which is lovely.
<voidspace> dooferlad: hey, hi!
<voidspace> dooferlad: you back, or just a visit?
<dooferlad> voidspace: hello
<dooferlad> I am back
<voidspace> dooferlad: so good when they can smile :-)
<voidspace> dooferlad: congratulations and welcome back!
<dooferlad> voidspace: thanks!
<dooferlad> voidspace: I just wish big sister would go back to sleeping well!
<voidspace> dooferlad: oh no!
<voidspace> dooferlad: I feel your pain, Benjamin is in (another) phase of not going to sleep until about 1am
<voidspace> very tiring, literally and figuratively
<voidspace> the joy of children :-)
<dooferlad> voidspace: yea, Naomi is often up before 6. I was just about feeling human before that started!
<voidspace> dooferlad: ah man, not much fun
<rogpeppe> ashipika: for some reason i seem to have been disconnected from canonical IRC
<rick_h_> rogpeppe: yea, lots of stuff down
<rick_h_> LP, etc
<rogpeppe> rick_h_: marvellous :)
<ashipika> rogpeppe, rick_h_: interesting :)
<rick_h_> "space| dooferlad: ah man, not much fun
<rick_h_> 11:49   rogpeppe| is now known as rogpeppe1
<rick_h_> 11:49  rogpeppe1| is now known as rogpeppe
<rick_h_> 11:49   rogpeppe| ashipika: for some reason i seem to have been disconnected from canonical IRC
<rick_h_> bah
<rick_h_> rogpeppe: ashipika looks like firewall issue atm, being worked on
<ashipika> rick_h_: thank you for the info!
<rogpeppe> ashipika: re: waiting for AfterFunc - we've got the Alarms method to tell when things have started waiting. That is unfortunately necessary, but anything more seems like it would be more than the test code should be relying on. For example, the code could change to start a goroutine itself rather than calling AfterFunc and the test code wouldn't be able to tell when that finished.
<ashipika> rogpeppe: and i suppose you'd have to change the signature of the parameter function to AfterFunc
<rogpeppe> ashipika: no, i don't think so
<ashipika> rogpeppe: ok.. it was just a thought.. feel free to land the PR
<rogpeppe> ashipika: any chance you could approve this too please? https://github.com/juju/utils/pull/242
<ashipika> rogpeppe: done
<rogpeppe> ashipika: ta
<rogpeppe> ashipika: ha, marvellous, there's a cyclic dependency between juju/utils and juju/testing
<ashipika> rogpeppe: my condolences :)
<rick_h_> natefinch: ping, how goes the rackspace work?
<natefinch> rick_h_: mostly figured out what was going on Friday.  still have one question for curtis when he gets on
<rick_h_> natefinch: k, I've got a call with rax in a bit under an hour and wanted to know where we stand with things
<natefinch> rick_h_: hoping to get a fix up today
<perrito666> voidspace: did the binary from my branch work? I finally fixed my bug, am writing tests now
<natefinch> rick_h_: but I would tell rackspace a couple days to be safe :)
<rick_h_> natefinch: k, all good. it's on a different topic but it might come up
<rick_h_> natefinch: so need to make sure we're still rc2 targeted
<natefinch> rick_h_: yep
<rick_h_> voidspace: ping, did we get anywhere with the MAAS issues friday?
<babbageclunk> redir: http://bazaar.launchpad.net/~juju-qa/juju-ci-tools/repository/
<frobware> mgz: ping
<katco> voidspace: standup time
<voidspace> katco: omw
<voidspace> rick_h_: I suggested something that might be the cause of the problem
<voidspace> rick_h_: hang on, in standup - I'll come back to you after that
<rick_h_> voidspace: rgr ty
<mgz> frobware: yo
<mgz> frobware: can I help?
<frobware> mgz: please - I was trying to run assess_recovery.py but I don't think I have enough runes; bombs with permission denied
<mgz> frobware: run with --verbose --debug and pastebin?
<frobware> mgz: heh pastebin seems to be down
<mgz> frobware: eheh, try a different pastebin
<mgz> hm, come back lp, I need to finish getting my stuff reviewed
<babbageclunk> alexisb: wanna catchup?
<alexisb> babbageclunk, sure
<mgz> dooferlad: I could do with bugging you at some point today about some cross maas version network things
<dooferlad> mgz: sure. When works for you?
<mgz> half an hour?
<mgz> +in
<dooferlad> mgz: sounds good
<rock> Hi. I have an OpenStack-on-lxd setup; the juju version is 2.0-beta15. I am trying to install the multipath-tools package on the nova-deployed LXD container and the cinder-deployed LXD container using our "cinder-storage driver" charm, but the package fails to install on the LXD containers: http://paste.openstack.org/show/582953/
<rock> I ran `apt-get install --yes multipath-tools` directly on the LXD container console, and it gave the same error as I pasted above.
<rock> The #lxd and #lxccontainers channels are not active.
<rock> If anyone has any idea about this, please let me know.
<perrito666> voidspace: this should fix your issue with some luck https://github.com/juju/juju/pull/6321
<natefinch> rock: you'll probably have better response on #juju but it sounds like a packaging problem since it's a dpkg error
<perrito666> I need a non trivial review here https://github.com/juju/juju/pull/6321
<rock> natefinch: OK. Thank you.
<voidspace> rick_h_: irc or hangout
<voidspace> rick_h_: but the summary, custom binaries from here *may* solve the issue: https://github.com/juju/juju/pull/6321
<voidspace> rick_h_: I've sent an email
<rick_h_> voidspace: ty
<voidspace> perrito666: thanks!
<voidspace> perrito666: on your branch, is the status polling in a goroutine the same pattern used by the other providers?
<voidspace> perrito666: and have you manually tested with maas 1.9 and 2...
<voidspace> perrito666: the code changes themselves look pretty straightforward, I like the maas2Controller interface
<dimitern`> LP still broken - we can't merge anything due to check-blockers.py getting 503
<perrito666> voidspace:  answering in order :)
<perrito666> 1) the status polling goroutine is not a pattern, we are not doing it for other providers (and we should)
<perrito666> voidspace: we were only updating the "instance status" which is wrong
<perrito666> I have manually tested with maas 1.9
<perrito666> sorry maas 2
<dimitern`> perrito666: please, keep in mind we have 2 separate code paths for maas 1.9 and 2.0
<dimitern`> perrito666: both should be tested if the change applies to both versions
<voidspace> dimitern`: it looks good to me
<perrito666> dimitern`: I have (not very nicely separated btw :p ) but yes I kept it in mind while coding the fix, I guess I can start a 1.9 maas to try this
<voidspace> dimitern`: I'll see if I can check with maas 1.9, need to fail a deployment...
<voidspace> perrito666: ah, well - happy for you to do it
<voidspace> perrito666: and oi! the separation is *great*
<perrito666> voidspace: if you have a 1.9 I would be very thankful if you did it for me
<dimitern`> voidspace, perrito666 thanks guys! :)
<perrito666> voidspace: I have to install the whole thing
<perrito666> if not ill do it
<voidspace> perrito666: I have one setup
<voidspace> perrito666: how did you test - what did you do to get deployment to fail. Mark as broken after deployment starts?
<perrito666> voidspace: I shall pay in beer :p
<perrito666> voidspace: I wrote the QA steps :p, basically bootstrap and once it is up, break the power profile for the nodes and deploy something
<voidspace> perrito666: I'll let you know how it goes.
<perrito666> tx a lot
<dimitern`> macgreagoir: http://paste.ubuntu.com/23233677/
<anastasiamac> mgz: is the bot stuck? http://juju-ci.vapour.ws:8080/job/github-merge-juju/9327/console
<mgz> likely, lp is down
<dimitern`> ok, my bad
<anastasiamac> mgz: is there a timeout for the blocker check?
<dimitern`> my PR got picked up by the bot, which due to LP being down, is now stuck at check-blockers.py
<mgz> not independently, but we can't land when lp isn't up anyway
<dimitern`> and that's because check-blockers is not called with a timeout
<natefinch> gah forgot launchpad is down
<voidspace> frobware: did you manage to reproduce the telefonica issue?
<redir> http://reports.vapour.ws/releases/issue/5762fb3b749a5667e3627666
<redir> frobware: babbageclunk ^
<natefinch> sinzui: I tried doing the easy fix for rackspace, just hack the endpoint url. but I'm getting a 401 response from rackspace:
<natefinch> 11:54:41 DEBUG juju.provider.openstack provider.go:625 authentication failed: authentication failed
<natefinch> caused by: requesting token: Unauthorised URL https://dfw.images.api.rackspacecloud.com/v2/auth/tokens
<natefinch> caused by: request (https://dfw.images.api.rackspacecloud.com/v2/auth/tokens) returned unexpected status: 401
<natefinch> sinzui: do I need to access that identity.api.rackspacecloud.com url first, to authenticate?  and if so, where do I get the api key?
<sinzui> natefinch: I think so. I am sprinting this week. josvaz in @cloudware has most of the details. I found the API key in the rackspace web ui. There isn't anything in the juju config to show that. I do have a rackspacrc file. It exports the standard OpenStack vars. I see "_RACKSPACE_API_KEY" defined, but unused. I didn't notice it until now :(
<natefinch> hmm ok
<natefinch> sinzui: thanks for the info
<natefinch> sinzui: I'll talk to josvaz
<perrito666> ouch, I actually need lp to pick a new bug
<natefinch> it's back
<perrito666> just in time
<alexisb> perrito666, do you need bug suggestions?
<perrito666> alexisb: sure
<perrito666> admit it, you have a script checking on me talking about bugs
<alexisb> :)
<alexisb> that is one of the duties of my job
<voidspace> perrito666: hmmm... so after a long update / new image import / bootstrap cycle
<voidspace> perrito666: I'm now seeing on maas 1.9: after a deploy, then manually marking the machine as broken in maas (my nodes all have manual power types so that seemed easier)
<voidspace> perrito666: the machine stays as pending
<voidspace> perrito666: status doesn't change
<voidspace> perrito666: I'll try again :-/
<perrito666> voidspace: interesting, tx, if you have that issue again ill investigate with my setup here
<natefinch> sinzui: is there anyone else I can talk to? Josvaz is past EOD, AFAICT.
<sinzui> natefinch: rcj?
<natefinch> sinzui: thanks
<redir> babbageclunk: https://bugs.launchpad.net/juju/+bug/1606310
<mup> Bug #1606310: storeManagerSuite.TestMultiwatcherStop not stopped <ci> <intermittent-failure> <regression> <unit-tests> <juju:Triaged> <https://launchpad.net/bugs/1606310>
<voidspace> perrito666: so with a broken power type I do see the status change to down
<voidspace> perrito666: however, if I manually break the machine I don't see a status change I don't think
<voidspace> perrito666: I'm going to try that with maas 2 - but probably tomorrow now as I'm nearly EOD
<voidspace> perrito666: I left a question and a comment - the question is likely to be just me being dumb
<perrito666> voidspace: tx a lot for the tests
<voidspace> np
<CorvetteZR1> hi.  i got openstack up and running with openstack-base-xenial-mitaka
<CorvetteZR1> i can log into the dashboard, but when i go to containers, i get an error:  Unable to get the Swift container listing.
<CorvetteZR1> how do i configure this?  how do i log into the servers juju configured?
<CorvetteZR1> ssh into them i mean
<hml> question - how do I add a new endpoint for running the juju go tests with the openstack provider? I'm missing the piece
<rick_h_> katco: natefinch have any hint for hml ? ^
<katco> hml: what is the juju go test? our suite of tests written in go?
<hml> katco: looking at the contribute.md - you run go test github.com/juju/juju to test changes?
<natefinch> yeah
<natefinch> those won't hit a real openstack
<katco> hml: ah ok. you should be able to just run that command; i don't know what you mean by add an endpoint. can you explain?
<hml> katco: the code change starts to use the neutron api.  however the test environment doesn't know about an endpoint to find neutron -
<natefinch> you can't make a test that hits a real openstack.... they have to be (more or less) self contained.
<hml> katco: for the related goose pkg changes - in the tests i had to add code to spoof neutron
<katco> hml: ah, as natefinch says we don't do that in juju. the test should be a unit test and only utilize things in memory
<natefinch> the way to connect to openstack using juju normally is to use add-cloud which will prompt for the endpoint
<hml> natefinch: hrm... so how do the juju openstack provider tests for nova run then?  they look for a novaClient.
<natefinch> hml: there's a lot of spoofing in the tests, precisely to keep it from hitting real infrastructure.  I'm afraid I don't know the details of the openstack tests
<katco> hml: i don't know the specifics of the openstack provider tests, but you would mock a novaClient and pass that in. any new tests should not hit anything outside of memory
<hml> natefinch: okay - i believe that if i could find where the nova spoofing is done, i could figure it out for neutron, but i haven't been able to find it yet
<hml> katco: i'm looking for where the novaclient is mocked, so i can do the same for a neutronclient.  but so far i'm missing how it's done.
<natefinch> hml probably something with this: gopkg.in/goose.v1/testservices/novaservice
<katco> hml: natefinch: i think it's this: https://github.com/juju/juju/blob/master/provider/openstack/local_test.go#L1917
<katco> hml: natefinch: called from here: https://github.com/juju/juju/blob/master/provider/openstack/local_test.go#L170
<hml> natefinch: katco: cool - iâll take a look.  thanks
<katco> hml: hth, gl
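The mocking approach katco describes — have the code under test depend on a narrow client interface, then substitute a fake in tests — looks roughly like this in miniature. The interface and fake below are hypothetical illustrations; the real openstack provider tests spoof services via goose's testservices packages instead:

```go
package main

import "fmt"

// networkClient is a hypothetical narrow interface in the spirit of
// mocking a nova/neutron client for unit tests.
type networkClient interface {
	ListNetworks() ([]string, error)
}

// fakeNeutron is an in-memory stand-in used only by tests, so nothing
// outside of memory is ever touched.
type fakeNeutron struct{ nets []string }

func (f fakeNeutron) ListNetworks() ([]string, error) { return f.nets, nil }

// networkCount is the "code under test": it only sees the interface,
// so production code can pass a real client and tests can pass a fake.
func networkCount(c networkClient) (int, error) {
	nets, err := c.ListNetworks()
	if err != nil {
		return 0, err
	}
	return len(nets), nil
}

func main() {
	n, _ := networkCount(fakeNeutron{nets: []string{"net0", "net1"}})
	fmt.Println(n) // 2
}
```

For the actual neutron spoofing, the analogue of `gopkg.in/goose.v1/testservices/novaservice` mentioned above would be the place to mirror.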
<thumper> morning
<alexisb> morning thumper
<alexisb> voidspace, you still around?
<voidspace> alexisb: kind of
<voidspace> alexisb: :-)
<perrito666> how can this not be working now if it was working on friday
<alexisb> voidspace, trying to leave dimiter and andy alone
<alexisb> voidspace, do you know if this bug is still an issue for 2.0:
<alexisb> https://bugs.launchpad.net/juju/+bug/1560331
<mup> Bug #1560331: juju-br0 fails to be up when no gateway is set on interface <juju:Triaged> <juju-core:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1560331>
<alexisb> o is dooferlad back??
<voidspace> alexisb: he is!
<alexisb> welcome back dooferlad!
<alexisb> dooferlad, you could probably answer the q above as well, if you are still around
<voidspace> alexisb: it's nearly 9pm UK time so unlikely he's around, nor the sprint people
<voidspace> alexisb: I don't know specifically about that bug, but that area of the code has changed dramatically in recent months
<voidspace> alexisb: we have new bugs related to "bridging all the things" for example
<voidspace> alexisb: I can talk to dooferlad tomorrow morning and email you
<rick_h_> alexisb: that should be corrected in rc1
<alexisb> lol yes that is a fun topic for today
<alexisb> rick_h_, ack will mark it so
<voidspace> cool, sorry I couldn't be more helpful
<voidspace> alexisb: ah, there's a comment on the bug saying "if we bridge all the things it will go away"
<voidspace> alexisb: and that is done
<alexisb> its fixed so that is good :)  one less bug
<thumper> alexisb: morning
<voidspace> thumper: o/
<thumper> hey voidspace
<thumper> sprinting?
<voidspace> thumper: nope, going downstairs to watch a movie with the wife and hope that the lad falls asleep before 1am tonight :-)
<voidspace> thumper: have a good day
<voidspace> see you on the other side maybe
<thumper> ha
<thumper> night
<katco> rick_h_: ping
<rick_h_> katco: pong
<katco> rick_h_: https://github.com/juju/juju/pull/6323 https://github.com/juju/juju/pull/6324
<katco> rick_h_: double PR into develop branch failed. what do i do? $$merge$$ other pr?
<katco> rick_h_: i wasn't there for the discussion on this
<rick_h_> katco: only deal with merging from the one that deals with master
<rick_h_> katco: but we should look at what failed in that check
<katco> rick_h_: that merge into develop brought in like 30 commits from before mine
<rick_h_> katco: yea, that's all good, because it's not constantly kept up with develop that'll happen
<katco> rick_h_: just not sure what failure is related to. looks possibly related to my pr, but i will trust the results of $$merge$$ into master
<rick_h_> katco: k
<rick_h_> katco: http://juju-ci.vapour.ws/job/github-check-merge-juju/42/artifact/artifacts/trusty-out.log though with a failure in "testAddLocalCharm" seems like it might be a real thing
<katco> rick_h_: yes it's possible
<rick_h_> katco: the windows one there's an intermittent test failure that's hit before there. Might check it matches up, but not sure.
<rick_h_> katco: but might be worth double checking that test while the merge runs to get ahead of the game if there is something
<perrito666> alexisb: having fun?
<alexisb> always having fun
<perrito666> you have changed the standup 5 times :p
<menn0> thumper: phew, https://github.com/juju/juju/pull/6325
<alexisb> perrito666, yeah I was learning something new
<katco> alexisb: is ian on vacation or something?
<thumper> yeah
<alexisb> katco yes
<alexisb> and he did not update the calendar
<katco> alexisb: ah ok :) ty
<alexisb> which I will be pestering him about when he returns next week
<alexisb> ;)
<katco> lol no biggie
<thumper> menn0: review done
<veebers> thumper: a little later on today I would like to bother you again about bug; https://bugs.launchpad.net/juju/+bug/1626784 I have some more details regarding it
<mup> Bug #1626784: upgrade-juju --version increments supplied patch version <juju:Incomplete> <https://launchpad.net/bugs/1626784>
<thumper> ok
<alexisb> axw, ping
<axw> thumper: you said you had me pinged? (sounds a bit like "had me made")
<thumper> :)
<perrito666> menn0: ping me when you need me
<thumper> I think I said pinned
<thumper> I was really wanting the time queue as part of the provisioner
<thumper> but as nice as it would be
<thumper> it isn't as high a priority as many of the current fires
<menn0> perrito666: could you please try deploying something into a container with Juju 2.0 using MAAS 2.0?
<axw> thumper: oh right, gotcha
<perrito666> menn0: sure, bootstrapping, gimme a moment
<menn0> perrito666: thank you
<perrito666> menn0: any particular formula or just placement?
<menn0> perrito666: we don't have much detail at the moment
<menn0> perrito666: seems the user was trying to deploy openstack using maas and all the container watchers were panicking
<menn0> perrito666: so let's just establish whether or not container deployments work at all for you
<perrito666> menn0: k, ill go get a beer for the US debate while this bootstraps
<menn0> perrito666: sounds good :)
#juju-dev 2016-09-27
<veebers> thumper: this is a quick brain dump re: the version bug I was talking about yesterday: http://pastebin.ubuntu.com/23235882/ note we have 2.0-rc2 available in the streams but it starts upgrading to 2.0-rc2.1.
<veebers> thumper: rats, I realise i'm running late for a thing in town, I'll chase up when I get back :-) sorry to ping and run
<thumper> why is rc2 in streams?
<thumper> OMG status on beta7 looks terrible compared to now
<veebers> thumper: I think it's in testing streams, right? http://juju-dist.s3.amazonaws.com/parallel-testing/agents/...
<menn0> perrito666: figure anything out?
<perrito666> yes, that today is ... I guess I would translate to clerks day?
<perrito666> and therefore only open businesses are those tended by their owners
<perrito666> anyway, deploying to a container as we speak
<perrito666> also juju deploy cs:mysql --to lxd/0 didnt have the effect I was hoping
<veebers> thumper: You have a moment to consider the version bug with the details I linked?
<veebers> (I'm back from lunch now :-) )
<thumper> I'm expecting a visitor any minute
<thumper> but she isn't staying, just picking something up
<thumper> I do need to take broken coffee things down to hardly normal
<thumper> and get another coffee
<veebers> thumper: ack
<thumper> how about a little later in the afternoon?
<veebers> thumper: hopefully they can repair it quickly. They do have some nice Rockets at southern hospitality that aren't stupidly expensive (for some value of stupid)
<veebers> thumper: sounds good
<thumper> rockets are too manual for me
<thumper> I prefer the push button
<thumper> but I do like grinding my own beans
<thumper> so not too push button
<veebers> heh :-)
<thumper> alexisb: changes to the state package since beta 7:  232 files changed, 25849 insertions(+), 13540 deletions(-)
<perrito666> k ppl, EOD
<alexisb> thumper, yep
<veebers> axw: do you have a couple of moments to spare while I wait for thumper? :-) I'm waiting to further this bug here https://bugs.launchpad.net/juju/+bug/1626784 (which I spoke with you about a couple days back)
<mup> Bug #1626784: upgrade-juju --version increments supplied patch version <juju:Incomplete> <https://launchpad.net/bugs/1626784>
<axw> veebers: looking
<veebers> axw: cheers, this is some follow up thoughts: http://pastebin.ubuntu.com/23235882/ (as mentioned above)
<axw> veebers: are you using *agent-metadata-url*? tools-metadata-url was renamed to agent-metadata-url. I don't think we handle the old name in juju 2.0
 * veebers double checks
<veebers> axw: ah yes, we're using agent-metadata-url, the log output is out of date, but the command/config used is correct
 * veebers updates the logging call
<axw> veebers: seems to me that you're doing all the right things. I'll try and repro locally
<veebers> axw: awesome, thanks. Please take into consideration the details in that pastebin as it was pointed out to me that there was some missing details in the bug report itself
<axw> veebers: yup
<veebers> (I will update the bug with those details shortly)
<thumper> veebers: I'm around for about 10 minutes, then I have to go and collect a child
<axw> veebers: well, I can repro
<axw> need to finish something off then I can come back and have a deeper look
<veebers> thumper: I bugged axw about it as well. Seems he's able to repro
<veebers> axw awesome thanks
 * thumper needs to collect kiddo
<veebers> axw: please let me know if there is anything I can help with that bug, it's blocking the upgrade tests which we'll need for between rc to rc and onwards
<axw> veebers: ok, I will look again in a little while and let you know. I think we can take it from here though
<veebers> sweet
<axw> wow, weather just went from zero to torrential
<veebers> axw: just noticed the bug status changed on that, I'm champing at the bit to see what you come up with ;-)
<axw> veebers: I was just about to comment :)
<axw> veebers: I'm not sure if I did exactly the same as you, but in my case, after bootstrap it's at version 2.0rc1.1, not 2.0rc1
<axw> veebers: the reason being that the agent stream doesn't include 2.0rc1 agents
<veebers> axw: does that .1 carry over to the rc2? (The upgrade attempt is to 2.0-rc2)
<axw> then because the controller is custom, it seems to not want to upgrade to a streams version
<axw> veebers: ^^ haven't gotten to the bottom of that bit yet, but the fact that the server is an unreleased build seems to force uploads
<veebers> axw: oh, sorry you're still investigating :-)
<axw> veebers: https://github.com/juju/juju/pull/6327
<veebers> axw: sweet, cheers for sorting that :-)
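The behaviour axw diagnosed can be sketched with a simplified version type: when the controller is running an unofficial custom build, the upgrade forces an upload, which bumps a build suffix onto the requested version instead of adopting the streams version verbatim. The names and logic below are an illustration of the reported symptom, not juju's actual upgrade code:

```go
package main

import "fmt"

// vers is a hypothetical, simplified version number; the real type
// lives in github.com/juju/version and is richer than this.
type vers struct {
	base  string // e.g. "2.0-rc2"
	build int    // nonzero marks an unofficial, locally built binary
}

func (v vers) String() string {
	if v.build == 0 {
		return v.base
	}
	return fmt.Sprintf("%s.%d", v.base, v.build)
}

// nextUploadVersion sketches the forced-upload path: the build
// component is incremented, so a request for "2.0-rc2" comes out
// as "2.0-rc2.1".
func nextUploadVersion(v vers) vers {
	v.build++
	return v
}

func main() {
	target := vers{base: "2.0-rc2"} // the streams version the user asked for
	fmt.Println(nextUploadVersion(target))
}
```

This matches the symptom in the pastebin above: 2.0-rc2 is available in the (testing) streams, yet the upgrade proceeds to 2.0-rc2.1 because the running controller is itself a custom build.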
<anastasiamac> redir: this is the "main" unreachable server bug :D https://bugs.launchpad.net/juju/+bug/1605767
<mup> Bug #1605767: MachineSuite.TearDownTest no reachable servers <ci> <intermittent-failure> <regression> <unit-tests> <juju:Triaged> <https://launchpad.net/bugs/1605767>
<anastasiamac> "main" = one of many \o/
<macgreagoir> dimitern: https://github.com/juju/juju/pull/6328
<dimitern> macgreagoir: http://paste.ubuntu.com/23239901/
<redir> mainy
<redir> frobware: babbageclunk: https://github.com/juju/juju/pull/6330 PTAL
<dimitern> frobware: here's what I have so far: https://github.com/juju/juju/compare/master...dimitern:maas-bridge-some?expand=1
<rogpeppe1> here's some cleanup in the apiserver cert changing code - a little simplification and a test: https://github.com/juju/juju/pull/6331
<rogpeppe1> dimitern: fancy a review?
<dimitern> rogpeppe1: I'll put it on my list, sure
<rogpeppe1> dimitern: ta
<dimitern> rogpeppe1: might take an hour though, as I'm in crit-bug-fix-mode atm :)
<rogpeppe1> dimitern: np
<marcoceppi_> o/ just wanted to drop by and share some love. rc1 has been awesome from the UX fixes in juju status throughout
<frobware> dimitern: https://github.com/frobware/juju/tree/master-lp1627037
<frobware> dimitern: unit tests are broken but it now supports '--interfaces-to-bridge' which is space delimited
<dimitern`> frobware: perfect!
<redir> babbageclunk: PTAL https://github.com/juju/juju/pull/6332
<dimitern> rogpeppe1: you've got a review with 1 comment
<macgreagoir> dimitern: No rush, just another group https://github.com/juju/juju/pull/6333
<rogpeppe1> dimitern: ta!
<redir> mgz: yt?
<mgz> redir: yo?
<redir> hey mgz are patches in juju/patches applied only at test time or also at build time for releases?
<mgz> they are applied as part of building the tarball
<mgz> which happens as the first stage of the revision test process
<mgz> and before any unit test runs
<mgz> and as part of the release
<mgz> only time it won't happen is on a dev's machine unless they specifically include it
<redir> mgz: thanks
<redir> jam: per our discussion. easy peasy https://github.com/juju/testing/pull/112 PTAL
<babbageclunk> alexisb: do we need to talk today? I don't think I have anything to talk about.
<alexisb> babbageclunk, have you heard back from jhobbs?
<rick_h_> voidspace: mgz natefinch ping for standup
<voidspace> rick_h_: crap, sorry - omw
<frobware> rick_h_: status update; we are getting closer to a patch we can use / build a binary from but have just run into an issue with aliases.
<rick_h_> frobware: ok, thanks for the heads up. What's the aliases issue?
<frobware> rick_h_: two things; all the bridge script idempotent transformations fail
<frobware> rick_h_: second is that the stuff that dimiter has done on the maas side doesn't consider aliases.
<frobware> rick_h_: we could dispense with the second (unit test) transformation but would rather not...
<frobware> rick_h_: so this issue is new-right-now. thinking about what we do about it. bbiab.
<babbageclunk> alexisb: Only what was on that email - there was something wrong with the test setup, he's rerunning now.
<alexisb> babbageclunk, ok, can you send me a mail to your current PR/branch?
<babbageclunk> alexisb: Sure - sent!
<alexisb> thanks
<babbageclunk> alexisb: actually I hadn't sent it then but I have now.
<kwmonroe> hiya, when i bootstrap aws with rc1, i get a machine with 3.5G RAM, when i bootstrap azure with rc1, i get a machine with 1.7G.  is that normal?
<kwmonroe> i ask because once i get up to 10ish machines in a model, azure slows waaaay down (or rather, juju commands like 'status' and 'models' slow waaay down).
<anastasiamac> jam: PTAL https://github.com/juju/juju/pull/6334
<rick_h_> kwmonroe: so the instances we spin up are based on cpu constraints I think.
<rick_h_> kwmonroe: we're looking at tweaking them for 2.0 final but not sure yet on the final bit. I'd be surprised though if 10 machines slow down the status/etc.
<rick_h_> kwmonroe: I'd be curious if it's network lag or something? I mean does it really time slower after each machine?
<natefinch> definitely 10 machines should be perfectly fine with 1.7G of RAM.  I think, honestly, it's just that the azure machine in general is a much slower machine than what you get from AWS
<natefinch> azure machines in general are pretty slow, and the AWS machine we get is one of their newer sets.
<kwmonroe> rick_h_: natefinch: mongo (1.2G) + jujud (150M) eat up just about all the rams: http://paste.ubuntu.com/23242577/  and kswapd0 is using 50% of my cpu trying to find swap space (which doesn't exist):
<kwmonroe> yeah rick_h_, it does time slower after reaching double digit machines...
<kwmonroe> $ time juju status
<kwmonroe> ...
<kwmonroe> real	15m31.609s
<kwmonroe> user	0m0.090s
<kwmonroe> sys	0m0.050s
<rick_h_> 15 minutes?
<natefinch> that's crazy
<natefinch> but mongo using that much is pretty unusual too
<kwmonroe> natefinch: rick_h_:  is there anything i can pull off this controller to make a useful bug report (like why mongo is using so much ram)?  if not, i'm gonna try adding swap space to see if it's slow because it's out of ram, or slow because kswapd0 is eating so much cpu.
<rick_h_> kwmonroe: not that I'm aware of but if you can replicate it pretty consistently it'd be good to  know what region, deployment, etc you've got going that is causing mongo to chew up like that
<kwmonroe> ack rick_h_, i'll keep an eye out
<rick_h_> dooferlad: also assigned the other related card your way as I think the two bugs are related and might be useful to keep both in mind.
<dooferlad> rick_h_: thanks
<natefinch> rick_h_, kwmonroe: we should file a bug, so we can track it.
<natefinch> no one ever said "man, I wish I hadn't filed that bug" ;)
<kwmonroe> on it natefinch, i'll spin up a couple different regions to see how consistent it is
<natefinch> kwmonroe: awesome, thanks :)
<mup> Bug #1628155 opened: cmd/juju: juju deploy "see also" refers to non-existent command <helptext> <juju-core:New> <https://launchpad.net/bugs/1628155>
<anastasiamac> redir: babbageclunk: do u think that this failure will be improved with ur work? https://bugs.launchpad.net/juju/+bug/1625768
<mup> Bug #1625768: github.com/juju/juju/state go test timeout <ci> <intermittent-failure> <regression> <unit-tests> <juju:In Progress by thumper> <https://launchpad.net/bugs/1625768>
<babbageclunk> anastasiamac: potentially...
<redir> anastasiamac: PTAL https://github.com/juju/juju/pull/6335
<frobware> rick_h_: now live testing https://github.com/frobware/juju/tree/master-lp1627037
<rick_h_> frobware: <3
<anastasiamac> redir: lgtm
<voidspace> dimitern: ping
<dimitern> voidspace: pon
<dimitern> g
<voidspace> :-)
<voidspace> dimitern: it's alright, I'm going to think a bit more and maybe come back to you
<dimitern> voidspace: sure :)
<voidspace> dimitern: you still there?
 * rick_h_ goes for lunchables
<rick_h_> dimitern: frobware ping please
<frobware> rick_h_: pong
<rick_h_> frobware: dimitern can you please join https://hangouts.google.com/hangouts/_/canonical.com/rick?authuser=1
<frobware> rick_h_: omw
<natefinch> rick_h_: I think so?  I'll hack the code to pretend we get this from the service and see what happens.
<rick_h_> natefinch: k
<kwmonroe> anyone know much about azure that could hop on a hangout real quick?
<kwmonroe> i see a failed deployment in the azure portal, but juju isn't catching it
<rick_h_> kwmonroe: what's up?
<kwmonroe> rick_h_: might be easier to show you.. can you join tvansteenburgh and i?  https://hangouts.google.com/hangouts/_/canonical.com/eco-wx
<rick_h_> kwmonroe: omw
<alexisb__> rick_h_, do you want me to join, given azure is in our court atm
<rick_h_> alexisb__: you can, getting folks to file a pair of bugs
<natefinch> uh, do we not have a way to enable trace logging on the client?
<natefinch> logging-config only affects the server and --debug only makes us output debug
<kwmonroe> alexisb__:  rick_h_: tvansteenburgh:  if there's anything else you might find helpful to debug the "pending" vs "error" machine in azure, lmk.  we've still got the env for https://bugs.launchpad.net/juju/+bug/1628246
<mup> Bug #1628246: juju not detecting azure provisioning failure <juju:New> <https://launchpad.net/bugs/1628246>
<tvansteenburgh> kwmonroe: thanks for filing that
<rick_h_> kwmonroe: ty
<rick_h_> kwmonroe: leave it up a few hours in case axw thinks of anything that he'd find particularly useful
<kwmonroe> ack
<perrito666> we need a sprint on mars
<natefinch> perrito666: +1 for mars
<natefinch> man simplestreams is a giant PITA
 * katco` just realized she has been glancing at a disconnected #juju-dev screen all day
<menn0> axw: ping
<perrito666> menn0: too early
<perrito666> katco`: emacs :p
<menn0> perrito666: yeah I figured. just trying my luck :)
<menn0> doesn't matter, I put my question on the review instead
<katco`> perrito666: i know you're joking, but something's going on with my bouncer. the katco here is connected, but my bouncer won't let me connect to that session =|
<perrito666> menn0: the standup time is pretty much the earliest youll find andrew, that's why it's that time
<perrito666> katco`: interesting, bouncer spitting any logs?
<katco`> perrito666: yeah, but nothing that was interesting in 5m of looking
<perrito666> katco`: just kill it, it happened to me a couple of times when using bip
<katco`> perrito666: i rebooted my headless this morning; i think the disk might be going out (don't know if that's related)
<katco`> was getting segfaults running any command; inodes corrupt, etc.
<katco`> watched fsck run; seemed to clean some things up (shrugs)
<perrito666> katco`: ouch, that depends too much on your bouncer implementation but makes sense that it would need some sort of access to logs/cache to let you in
<katco`> well the reboot seems to have cleared that up, that's why i'm not sure if it's related
<katco`> plus... the session is connected
<katco`> really no clue yet. i'll probably have to look at it tonight
<perrito666> katco`: well if the files are corrupt you might have that issue even if fsck attempted to repair the disk
<katco`> i don't know why it would still connect the session though...
<menn0> katco`: what's the story with PRs against develop? as OCR should I ignore them for now?
<katco`> menn0: yeah, working on a PR that encompasses that
<katco`> menn0: ta for asking
<menn0> katco`: I take it we will all soon be landing into develop instead of master?
<rick_h_> menn0: hopefully
<katco`> menn0: i think that's the idea. right now the goal was to begin double-pushing to develop to test out the automated testing, but seems that's happening on master already
<katco`> menn0: i imagine that when we switch we'll just rebase develop against a fresher master
<rick_h_> katco`: yea figure just delete/recreate develop/staging and block landing on master
<katco`> rick_h_: cool
<alexisb> thumper, ping
<thumper> veebers: standup?
<veebers> thumper: I'm just about to head out the door, see ping to alexisb in #juju :-)
#juju-dev 2016-09-28
<veebers> thumper, menn0 I'm getting a _bunch_ of log messages like this when trying to get the status "INFO  juju.api apiclient.go:507 dialing "wss://10.0.8.193:17070/model/25914173-c6bf-4a50-86b3-4ba6060f5d59/api""
<veebers> I can see that the lxd containers still exist (my first thought is it being wiped from underneath). While the env is still up what debugging can I do?
<menn0> veebers: context please :)
<menn0> veebers: what led up to this?
<veebers> menn0: oh sorry sure, I bootstrap, I add a new model, I deploy a charm (CI dummy charm), I then add a unit to it. I then wait for it to be considered 'started' and with 'workloads'. At this point I check the status in a loop for a couple of minutes with a bit of a sleep inbetween checks
<veebers> The first status check succeeded, it output something, the second attempt is now sitting there with that message repeating over and over
<veebers> menn0: it's now failed with this: http://pastebin.ubuntu.com/23245012/
<menn0> veebers: the apiclient "dialing" messages are attempts to connect to the Juju API server(s)
<menn0> veebers: it keeps trying and then has given up
<axw> veebers: that upgrade-juju thing should be fixed in master now
<menn0> veebers: this might mean the controller was unhappy
<axw> menn0: when you're free, can you please review https://github.com/juju/juju/pull/6336
<veebers> axw: sweet, I've just built and about to try it out.
<menn0> veebers: can you get logs for the controller machine(s)
<veebers> menn0: I have the lxd containers still (have paused the test from cleaning up) is there anything I can grab off them of interest while it's running? (normal log collection will occur once I un-pause it)
<menn0> ?
<menn0> veebers: the normal log collection should be enough
<veebers> menn0: oh, I took too long typing :-) Yep I can grab them. I'll just let it complete and grab the logs from there
<menn0> axw: will look shortly
<veebers> will be some minutes as part of the clean up is to check status, so we'll go through this timeout process again
<veebers> menn0: ok, I now have the log files, you want them?
<menn0> veebers: sure. dropbox?
<menn0> veebers: or email if they're not too big
<veebers> menn0: which log? or do you want them all? You could probably scp them off the machine itself
<veebers> axw: Where would I find juju stuff on the controller machine? I'm logged into an lxd machine that "apparently" is the controller based on the bootstrap output but /var/log/juju is empty and "ps aux | grep -i juju" returns nothing. I find this quite odd
<veebers> I've logged in because I have this issue again with "INFO  juju.api apiclient.go:507 dialing "wss://10.0.8.180:17070/api" being repeated over and over (I imagine because the controller is borked somehow)
<axw> veebers: that's where you would find the logs. agent and other data is in /var/lib/juju. if the agent failed to come up, there should be something in /var/log/cloud-init-output.log
<veebers> axw, this is really odd as the system had been working before, I added models, deployed charms etc. and now it's not responding. /var/lib/juju doesn't even exist on that machine :-\
<axw> veebers: juju 2.0?
<veebers> axw: yep, 2.0-rc2 (although a day or so old)
<axw> veebers: there have been bugs where juju would uninstall itself. I thought it was sorted for 2.0, but possibly not...
<axw> I don't *think* it wipes /var/log/juju on uninstall though
<veebers> axw: did that uninstall bug include wiping all logs etc.?
<veebers> heh
<veebers> /var/log/juju exists, but it's empty
<veebers> axw: juju was totally running on this machine beforehand, a grep juju /var/log/syslog says so, also status stopping juju ... etc.
<axw> veebers: the uninstall code doesn't touch logs. so it's something else
<veebers> axw: Would you have a couple of minutes to help me work out if it's an existing bug or needs a new one?
<axw> not sure what I can tell you if there's nothing there. I have looked at the uninstall code and it doesn't touch logs, so it's not the same old one I referred to
<veebers> axw: ok. I'll have a dig around and see what details I can uncover
<veebers> axw: oh btw your fix unblocks me and the functional upgrade test, cheers :-)
<axw> cool
<voidspace> dimitern: ping
<dimitern> voidspace: pong, otp though - might be slow to respond
<voidspace> dimitern: ok
<voidspace> dimitern: currently the network.Select*Address|HostPort return the first address matching the requested scope
<voidspace> dimitern: we have a "bug" where IPv6 addresses are being returned when users want an IPv4 one
<voidspace> dimitern: so I'm changing the functions to prefer IPv4 (but still return IPv6 if that's all that is available as an exact match for the scope)
<voidspace> dimitern: so the choice is, should the order of preference be:
<voidspace> dimitern: IPv4, hostname, IPv6
<voidspace> dimitern: or
<voidspace> dimitern: IPv4, IPv6, hostname
<voidspace> dimitern: or alternatively:
<voidspace> dimitern: IPv4, hostname or IPv6 (so *only* prefer IPv4 - no weighting on IPv6 or hostname, just whichever appears first)
<redir> frobware: ping
<voidspace> dimitern: whatever we pick I have to adjust some tests
<voidspace> dimitern: I like just preferring IPv4 (option 3)
<hoenir> option 3 sounds the most reasonable
<voidspace> hoenir: cool, thanks
<redir> rogpeppe: yt?
<rogpeppe> redir: yt?
<rogpeppe> redir: ah "you there?"
<rogpeppe> redir: yes
<dimitern> voidspace: sorry, we just finished now - reading scrollback
<voidspace> dimitern: I'm doing option 3, just fixing tests
<voidspace> dimitern: so you can ignore it if you like
<dimitern> voidspace: option 3 sounds reasonable atm
<dimitern> however we should really move away from the notion of a single preferred address
<dimitern> voidspace: please ping me with the PR when you're done, I'd like to have a look
<voidspace> dimitern: sure, it's pretty straightforward really
<rogpeppe> here's an update to juju-core that fixes it for the changed Clock interface and updates dependencies: https://github.com/juju/juju/pull/6337
<redir> thanks rogpeppe
<voidspace> dimitern: a consequence is that the Select*Addresses/HostPorts (note the plural) only return IPv4 if they are available - not IPv6 at all
<voidspace> dimitern: I think that's still ok as we'll want to use IPv4 if they're available - still treating IPv6 as a fallback
<voidspace> dimitern: we can change that if we need to
<voidspace> dimitern: PR https://github.com/juju/juju/pull/6338
<dimitern> voidspace: looking
<voidspace> dimitern: I'm doing a "juju deploy ubuntu -n 15" on lxd
<voidspace> dimitern: so far all of the machines have IPv4 addresses
<dimitern> voidspace: that'll only work if you bump up the limits on number of open files
<voidspace> also running a full test suite just to check no tests depend on the old behaviour
<voidspace> dimitern: working fine so far
<dimitern> voidspace: please, double check what actually runs in each container
<voidspace> dimitern: well, 9 started so far
<voidspace> now 10
<voidspace> dimitern: what do you mean?
<dimitern> voidspace: if it's only /sbin/init and a couple of other processes, that's the limit issue - I'll try to find the bug
<voidspace> dimitern: 14 of the 15 have started ok and are reporting IPv4 addresses
<rogpeppe> some fixes to make test pass under Go tip; review appreciated: https://github.com/juju/juju/pull/6339
<dimitern> voidspace: ok, if you hit the open file limit issue, they should be stuck in pending
<voidspace> dimitern: nope, all started fine - killing it now because it has ground my machine to a crawl
<voidspace> running tests at the same time!
<babbageclunk> jam: https://github.com/babbageclunk/juju/tree/mongo-ssl
<jam> very interesting babbageclunk, seems all the mongo testing infrastructure already supported not having an SSL cert. I'm a little surprised that juju connecting to mongo isn't complaining that it isn't trusted.
<dimitern> voidspace: I'll review your PR after lunch, if that's ok - we've been on a HO with jam till now
<jam> ah, but maybe we don't check the SSL of mongo to the same degree that we check the API Conn.
<jam> babbageclunk: have you looked at all to see if we are using an SSL cert on the Controller as well for things like JujuConnSuite, and whether we're are connecting to ourselves over an SSL connection?
<babbageclunk> jam: Ah, no - didn't think to check that, sorry.
<babbageclunk> jam: Can take a look at that now, if you like?
<jam> babbageclunk: might be worthy of a look at least.
<babbageclunk> jam: Ok, I'll have a go now.
<jam> given how small your patch is, and that it is ~10% across the board, I'm starting to lean more towards it being worth doing.
<frobware> dimitern: http://paste.ubuntu.com/23246440/
<rogpeppe> anyone wanna sign off on this before i land it? (trivial-ish changes to make Juju tests pass under Go tip) https://github.com/juju/juju/pull/6339
<perrito666> rogpeppe: wont that break the tests with other go versions?
<rogpeppe> perrito666: it shouldn't, no
<rogpeppe> perrito666: the CI bot should make sure that's OK
<rogpeppe> perrito666: i tried to change things so that it would work under all versions
<perrito666> rogpeppe: cool, as long as you tested with other versions ship it
<dimitern> voidspace: reviewed, with some questions
<rick_h_> voidspace: are you able to help review/QA macgreagoir's branch so we can try to get it into the next rc please?
<dimitern> rick_h_: are we planning on including the maas bridge fixes in rc2?
<rick_h_> dimitern: yes, that's why the EOD timeline
<rick_h_> dimitern: they have to be there to allow openstack to go back to being install-able by folks
<frobware> rick_h_: I was also going to try macgreagoir's change today; largely because voidspace reported that he could not repro
<rick_h_> frobware: k, all good
<dimitern> rick_h_: I've tested the changes, as agreed on maas 1.9, testing on 2.0 now
<rick_h_> dimitern: k, ty
<macgreagoir> voidspace frobware: I retested earlier this week and can repro :-) Comment in the review.
<dimitern> rick_h_: I'll propose the PR with the changes; also the last CI run (of the same branch, minus the "configured" handling) passed OK
<rick_h_> dimitern: k
<dimitern> here's the PR https://github.com/juju/juju/pull/6341
<rick_h_> natefinch: if the rax stuff works properly now can you file a bug against the docs/rackspace setup with any notes/etc please?
<rick_h_> natefinch: and check the cloud config docs around that as I know there was some question over a domain name and such in the past
<rick_h_> voidspace: assigned a card your way for next please.
<rick_h_> dooferlad: how goes the return? Have you gotten through catch up enough to pick up the hostname/lxd issues today?
<dooferlad> rick_h_: naughty me - didn't move the cards
<rick_h_> dooferlad: cool, ty. Just checking :)
<natefinch> rick_h_: will do
<dimitern> dooferlad: hey there, long time no see ;)
<dooferlad> dimitern: hi
<voidspace> rick_h_: I did QA it and it didn't work for me
<rick_h_> voidspace: ic, ok
<voidspace> rick_h_: I see you put me on the list-spaces card - I was thinking about picking that one up
<voidspace> rick_h_: is the kanban board ordered?
<natefinch> rick_h_: I actually think it's a bug in our code... juju add-credential rackspace asks for a domain name, and shouldn't
<rick_h_> voidspace: somewhat, I do try to stick critical at the top
<rick_h_> natefinch: k, I want to make sure we're polishing off that experience there so that this works ootb.
<voidspace> kk
<rick_h_> natefinch: if that's a bug in asking for it then we need a card to remove that please
<voidspace> rick_h_: grabbing coffee before our sync up - be with you in 5
<rick_h_> voidspace: but it tends to wander sometimes when I don't keep at it every day
<rick_h_> voidspace: rgr
<natefinch> rick_h_: making the bug then the card, right now
<rick_h_> natefinch: ty
 * dimitern can confirm https://github.com/juju/juju/pull/6341 works as expected on MAAS 1.9 + 2.0 by manual testing
<macgreagoir> dimitern: http://reviews.vapour.ws/r/5715/
<natefinch> rick_h_: ug, so this seems to be a problem for openstack as well - the interactive add-credentials code is very dumb, and just prompts for everything you could possibly set, which doesn't take into account whether or not it's v2 or v3.  Ideally we'd ask the user if it's v2 or v3 and then only prompt for domain name for v3.... but there's no way to do that right now.  I *can* fix it for rackspace by overriding the entire
<natefinch> schema for rackspace to exclude domain-name.
<rick_h_> natefinch: since it's different enough from openstack I'd be +1 to a rackspace specific config
<natefinch> rick_h_: cool
<natefinch> rick_h_: I'll make a separate bug for the openstack issue.  It's addressable via documentation, but just not user-friendly that way.
<rick_h_> natefinch: understand, ty
<rick_h_> natefinch: look for an existing bug though, I know this domain-name issue has come up and thought it was in bug form.
<rick_h_> but I could be wrong and it was in email or something
<natefinch> rick_h_: https://bugs.launchpad.net/juju/+bug/1577776
<natefinch> rick_h_: seems like we just added domain-name to openstack config, which might be why other things around it are breaking
<rick_h_> natefinch: gotcha
<rick_h_> natefinch: ok, so let's update rackspace so we can try to get it smooth ootb and we'll have to revisit the openstack case
<natefinch> yep
<natefinch> rick_h_: ping for standup ;)
<voidspace> dimitern: I answered your questions by the way
<voidspace> rick_h_: ping for standup...
<dimitern> voidspace: cheers, looking
<rick_h_> voidspace: doh omw
<voidspace> dimitern: thanks!
<dimitern> voidspace: LGTM
<natefinch> katco: we need to develop one universal standard error package that covers both juju errors and errgo
<katco`> natefinch: "and then you have 3 problems"
<natefinch> katco`: just reminded me of that xkcd with 14 competing standards :)
<katco`> natefinch: not sure if you were serious :)
<natefinch> katco`: no no :)
<katco`> haha
<natefinch> dooferlad: show us your shirt
<natefinch> dooferlad: thought that was Nathan Fillion at first
<dooferlad> natefinch: https://en.wikipedia.org/wiki/Con_Man_(web_series)
<natefinch> oh, it is, ok :)
<natefinch> dooferlad: weird, I recognize the title font and stuff, but somehow never looked deeper than that.  It sounds amazing
<dooferlad> natefinch: it is rather good. I haven't seen it all yet due to new children.
<natefinch> dooferlad: totally understand that
<natefinch> dooferlad: I only recently got past season 3 of Game of Thrones for the same reason :)
<natefinch> katco: hooly crap. According to godoc.org, Dave's github.com/pkg/errors package is imported by 623 packages
<katco> natefinch: yep. he brought the lessons learned on juju to the whole community
<natefinch> katco: there have been other errors packages, but I guess star power has its benefits :)
<katco> :)
<mup> Bug #1628155 changed: cmd/juju: juju deploy "see also" refers to non-existent command <helptext> <usability> <juju:Triaged> <https://launchpad.net/bugs/1628155>
<voidspace> babbageclunk: ping
<babbageclunk> voidspace: pong
<babbageclunk> dooferlad: how's double-dadhood going?
<dooferlad> babbageclunk: OK. The expected problem (not enough sleep) but from an unexpected source (elder daughter).
<dooferlad> babbageclunk: Mostly loving it though!
<babbageclunk> dooferlad: yeah, we found that too - you don't get to take advantage of all the time little babies spend asleep to catch up on some yourself!
<dooferlad> babbageclunk: definitely not when you are working :-)
<natefinch> oh man, just used go rename in vscode... this is game changing
<perrito666> natefinch: how so?
<perrito666> I mean, go rename is an external tool :p
<natefinch> perrito666: right right, but the CLI is kind of hard to use...  in vscode I just click the thing I want to rename and hit F2, type in the new name, and it just works.
<perrito666> ah never used it in the cli, I use vim-go
<perrito666> I would guess katco has a similar one for emacs, I dont think anyone uses go rename in the cli
<katco> indeed i do
<katco> comes standard in the emacs go-mode
<katco> i can also visually debug with delve
<natefinch> yeah, I just started debugging with delve a few days ago, also amazing
<katco> yeah i missed visually debugging hehe
<natefinch> yep
<katco> it's still not my goto bc it's still a little more cumbersome than i'd like, but it's there and very helpful
<natefinch> it's pretty great in vscode.  Very easy to set breakpoints, give it a command line to run, inspect local variables, etc.
<natefinch> anyway, gotta run, back after lunch
<perrito666> redir: ping
<redir> yo
<redir> otp perrito666
<perrito666> redir: just ping me when you hang up
<redir> will do perrito666
<perrito666> tx
<anastasiamac> babbageclunk: this is what it gives me :D http://juju-ci.vapour.ws:8080/job/github-merge-juju/9349/artifact/artifacts/windows-out.log
<voidspace> dimitern: rick_h_: ping
<rick_h_> voidspace: pong
<voidspace> rick_h_: landing onto develop now, right?
<rick_h_> voidspace: starting monday
<voidspace> ah
<rick_h_> voidspace: so not today, getting the ducks in a row to prepare
<voidspace> rick_h_: so just land this branch on Monday then
<voidspace> I mean on master
<rick_h_> voidspace: yea, on master please
<voidspace> rick_h_: I think you did say this in standup...
<voidspace> kk
<rick_h_> voidspace: yep, all good
<dimitern> voidspace: pong
<voidspace> dimitern: unping :-)
<voidspace> dimitern: landing the prefer IPv4 branch
<dooferlad> bridge everything... LXD not getting on the right subnet https://www.irccloud.com/pastebin/0h06Df4K/
<dimitern> voidspace: go for it :)
<dooferlad> dimitern, frobware: ^^
<dimitern> dooferlad: which branch are you testing?
<frobware> dooferlad: let's do the testing on the master-lp1627037-final branch I am about to push.
<frobware> dooferlad, dimitern: it's not clear to me we're all testing the final thing
<dooferlad> This is without your latest changes. Just master + a couple of changes of my own that shouldn't do anything to break this.
<frobware> dooferlad: ohhhh....
<dooferlad> it has bridged everything
<dooferlad> but LXD gave the container a 10. address, not something from the host 192.168.1.0/24...
<dimitern> dooferlad: that might happen if the 192.168.1.0/24 has no available IPs
<dooferlad> dimitern: there are plenty
<dimitern> dooferlad: I had this case when I used a subnet which I forgot that I reserved .1-.254 previously
<dooferlad> dimitern: I launched machines since that are fine
<dimitern> dooferlad: any errors in the log of the host ?
<dooferlad> dimitern: not that I have seen yet
<dooferlad> oh, fun: "ERROR juju.worker.proxyupdater proxyupdater.go:160 lxdbr0 has no ipv4 or ipv6 subnet enabled"
<dooferlad> It looks like your lxdbr0 has not yet been configured. Please configure it via:
<dooferlad> sudo dpkg-reconfigure -p medium lxd
<dooferlad> and then bootstrap again.
<voidspace> dooferlad: I always get that, and then it works
<voidspace> dooferlad: I *think* it's normal
<voidspace> deceiving though, because it takes a while to download the template, so it seems like it hasn't worked and there is this error message in the logs...
<voidspace> and then about ten minutes later it completes...
<dooferlad> voidspace: you are right - I get it on working machines too.
<voidspace> annoying, especially at error level
<dooferlad> what a rubbish message!
<rogpeppe> perrito666: did you have a look at https://github.com/juju/juju/pull/6339 ? i'd quite like to land it if poss.
<dimitern> it's total crap yeah
<dimitern> but it's fine otherwise
<dimitern> doesn't stop it from working
<dooferlad> Well, the address is wrong for the container... https://www.irccloud.com/pastebin/7BdR7noH/
<rogpeppe> dimitern: or maybe you might have a look - it's basically trivial. https://github.com/juju/juju/pull/6339
<dimitern> rogpeppe: we'll be leaving the office in a minute or so, sorry :/
<rogpeppe> dimitern: ok
<rogpeppe> dimitern: i might land it anyway
<rogpeppe> dimitern: i have one review (not from core though)
<redir> perrito666: what's up?
<perrito666> rogpeppe: I did, sorry I lgtmd in irc and forgot to do it in gh
<perrito666> redir: priv
<rogpeppe> perrito666: thanks
<voidspace> perrito666: where are you based again?
<perrito666> voidspace: argentina
 * perrito666 looks out of the window and sees a rocket from voidspace approaching
<voidspace> perrito666: heh
<voidspace> perrito666: fancy bringing me a large rock to the next sprint?
<perrito666> voidspace: too many variables in that sentence
<voidspace> perrito666: there are some lovely amethyst geodes in your part of the world and I would really like one
<perrito666> define large and rock
<voidspace> perrito666: especially a sphere
<voidspace> perrito666: a few kilos...
<perrito666> I presume customs wouldnt have an issue with it so I dont see why not
<voidspace> perrito666: :-)
<perrito666> send me more details and ill try to procure one
<voidspace> perrito666: I'll send you a link to an example and see if you can find one - I think it would be much *cheaper* if you find one than me buying it from a non-indigenous provider...
<perrito666> certainly I think these things are sort of easy to find here, hippies use them to do all kinds of crafts
<voidspace> hippies are great :-)
<perrito666> if you say so
<voidspace> I do, and I'm glad you consider me an authority
<perrito666> ill go to the flea market and find out once you send me a pic
<perrito666> lol
<voidspace> perrito666: this is the sort of thing I would love, the purpler the better
<voidspace> perrito666: http://www.ebay.co.uk/itm/TOP-GRADE-HUGE-DARK-PURPLE-AMETHYST-GEODE-SPHERE-FROM-URUGUAY-/232091390623?hash=item3609b9929f:g:5E8AAOSwTA9X4-8m
<voidspace> perrito666: that size is wonderful, that price is not
<perrito666> should this be a perfectly shaped ball?
<voidspace> perrito666: ideally but not necessarily, to be fair I love any and all beautiful minerals
<voidspace> perrito666: as they're cut from a geode they're usually ground to a shape like a sphere with a "bite" taken out of it
<voidspace> perrito666: so not a perfect sphere - I'd much prefer one with some of the crystals intact
<perrito666> do search "amatista" in mercadolibre.com
<perrito666> .com.ar
<voidspace> perrito666: cool, looking
<voidspace> perrito666: this is the closest, not really exactly what I'm after (would prefer more purple and more spiky) http://articulo.mercadolibre.com.ar/MLA-629608847-esfera-de-agata-y-amatista-hermosa-115mm-1600grs-_JM
<voidspace> perrito666: I'll look again, thanks
<voidspace> perrito666: and if you see anything - or any other large beautiful minerals - let me know :-)
<perrito666> ill do, ill telegram you pics
<perrito666> the hippie fair opens only on weekends
<voidspace> perrito666: fluorite is nice I think there is some in Argentina
<voidspace> perrito666: I have agate
<voidspace> perrito666: anyway, thanks :-)
 * rick_h_ goes for lunchables
<perrito666> I could use reviews in https://github.com/juju/juju/pull/6344 https://github.com/juju/juju/pull/6340 https://github.com/juju/juju/pull/6321
<natefinch> greedy
<natefinch> perrito666: I'm looking at 6344
<rogpeppe> this PR implements letsencrypt certificate support for controllers. anyone fancy taking a look? https://github.com/juju/juju/pull/6345
<natefinch> whoa, awesome!
<rogpeppe> natefinch: note: if you want to try it out, you need to bootstrap with api-port=443
<natefinch> rogpeppe: ahh, interesting
<rogpeppe> natefinch: alternatively you can run a port forwarder on the controller instance to forward from 443 to 17070 but then you'll need to open up port 443 in the security group
<natefinch> rick_h_: so there's some auto-detection of environment variables for openstack, so like you can set OS_TENANT_NAME and we'll pick it up.  Currently that's being reused for rackspace... I don't think we should reuse those environment variables, what do you think?
<rick_h_> natefinch: meet you in standup?
<natefinch> rick_h_: yep
<natefinch> fastest hangout ever
<rick_h_> :)
<cmars> hello, i have a runaway jujud burning up my machine. how do i attach the profiler to it?
<cmars> for example, https://paste.ubuntu.com/23247920/
<natefinch> cmars: anything interesting in the logs?
<cmars> looking
<cmars> natefinch, maybe. lots of messages like this are spewing to machine-0.log: https://paste.ubuntu.com/23247941/
<natefinch> cmars: can you run juju model-config logging-config="<root>=TRACE"
<natefinch> rick_h_: when did set-model-config and get-model-config get rolled into one command?
<cmars> natefinch, set model-config on the controller model?
<natefinch> cmars: uh, yes
<natefinch> cmars: I can never remember if it matters, probably does
<cmars> natefinch, controller model worked. here's the last 2000 lines with trace on: https://pastebin.canonical.com/166645/
<alexisb> natefinch, about 3 weeks ago regarding the model-config command
<alexisb> set-config and get-config also got collapsed
<natefinch> alexisb: personally not a fan of losing the verbness on set
<rick_h_> natefinch: yea, that was the negative, was in beta18.
 * rick_h_ goes to get the boy from school, biab
<thumper> rogpeppe: I haven't looked at your letsencrypt branch, just saw the email fly past. Just wanted to say that I love the idea, and very cool that you've done this
<rogpeppe> thumper: thanks!
<natefinch> simple PR anyone?  https://github.com/juju/juju/pull/6346  just overriding a couple methods on the rackspace provider to strip out domain name as a valid credential attribute.
<gQuigs> are there daily builds available for juju 1.25?   I'd like to try a fix that's committed but not released
<katco> thumper: hey we need a third opinion, got a sec?
<thumper> katco: sure
<thumper> gQuigs: I don't think so
<katco> thumper: can you TAL at my comment regarding juju/rety here? https://github.com/juju/juju/pull/6321/files
<thumper> sure
<katco> thumper: and lmk if that's appropriate? perrito666 would rather use a simple for loop
<thumper> ack
<natefinch> thumper: https://github.com/juju/retry/pull/4
 * thumper pushes it on the stack
<natefinch> thumper: just a doc update, no rush
<katco> thumper: sweet, i didn't know you were preemptable. the possibilities...
<thumper> queue, not stack
<thumper> geez
<katco> aw boo
<thumper> nerds
<thumper> well
<thumper> geeks
<thumper> but that is just stating the obvious
<natefinch> also pedants, evidently
<thumper> my kids are beginning to understand just how much of a geek household they live in
<katco> sudo review my comment already!
<thumper> well, we do work at pedantical
<katco> lol
<katco> natefinch: comment left on your pr
<natefinch> katco: added QA steps
<natefinch> katco: thanks for the reminder
<katco> thumper: ta
<natefinch> does anyone else think we'll get push back for calling our Kubernetes distro the Canonical Distribution of Kubernetes?  Makes it sound like it's the official one.
<katco> thumper: so can you clarify when it's appropriate to use juju/retry?
<thumper> I think that play on words is hilarious
<thumper> I don't think it is
<thumper> retry either has time limits or retry limits
<thumper> whereas this is polling
<katco> thumper: does juju/retry demand a time/retry limit?
<thumper> yeah, it does
<thumper> I wonder if I added an infinite?
<natefinch> I think retry implies "try until you succeed"
<natefinch> otherwise it would just be called juju/loop
 * thumper checks something
<katco> thumper: ah i see in https://github.com/juju/retry/blob/master/retry.go#L155
<thumper> yeah, it expects either a max duration or max times
<thumper> whereas I think perhaps that should be loosened
<thumper> to expect one or more of: max duration, max retries, or stop channel
<natefinch> I was just gonna mention stop channel
<thumper> pull request welcome :)
<thumper> actually
<katco> see? he's totally preemptable
<thumper> if you say Attempts: retry.UnlimitedAttempts
<natefinch> katco: lol
<katco> i'm just going to keep bringing stuff up
<thumper> then that passes
<katco> hey thumper what do you think of mongo?
<thumper> it is lovely
<thumper> as long as you don't care about all your data
<katco> ah i'm not crazy! no wonder i couldn't figure out where this error message was coming from. different results after first time deploy is run
<kwmonroe> hey alexisb, yesterday we talked about azure slowing down on long running deployments.  i've got a controller with molasses in its tubes if you're interested.  bug 1628206 has my logs.. i can keep it up as long as you want.
<mup> Bug #1628206: azure controller size seems too small <juju:Triaged> <https://launchpad.net/bugs/1628206>
<alexisb> kwmonroe, thanks for the bug
<kwmonroe> np alexisb, thanks for being such a wonderful person.
<alexisb> axw comes online this afternoon
<alexisb> :)
<alexisb> kwmonroe, can you send me info with details on access
<alexisb> then I can have axw take a look when he comes online
<kwmonroe> will do alexisb
<alexisb> katco if you have time: https://github.com/juju/juju/pull/6347
<alexisb> I would like to land this today if possible and it is a simple change
<katco> alexisb: looks simple enough; you grepped for all possible places that reference get-controller, etc.?
<alexisb> yes
<katco> alexisb: lgtm
<babbageclunk> thumper: sorry, back now - hangout?
<thumper> babbageclunk: ack
<natefinch> rick_h_: I presume I should work on the card assigned to me in the todo?  this bug: https://bugs.launchpad.net/juju/+bug/1621375
<mup> Bug #1621375: "juju logout" should clear cookies for the controller <juju:Triaged by rharding> <https://launchpad.net/bugs/1621375>
<rick_h_> natefinch: yea, so I suspected that was related to the other one you had in tracking
<rick_h_> natefinch: my thought was that might be why a logout didn't actually log you out
<katco> rick_h_: i need an opinion, hangout rq?
<alexisb> rick_h_, ping
<katco> PR for someone: https://github.com/juju/juju/pull/6348
<thumper> katco: comments added
<thumper> alexisb: unit leader in status https://github.com/juju/juju/pull/6350
<alexisb> thumper, awesome
#juju-dev 2016-09-29
<axw> perrito666: I'm not sure what your message was in reply to ("I believe so too...")
 * axw goes to get breakfast
<alexisb> alrighty all I am off for the night, see everyone tomorrow
<thumper> axw: good point about not showing for only one unit
<thumper> I'll look into it
<axw> thumper: thanks. one complication would be when you specify status filtering
<axw> i.e. you might just be showing a subset of units, and there may be more than what you're showing
<thumper> yeah...
<thumper> also, what about showing leader in yaml / json?
<thumper> keep there even if only one?
<axw> thumper: IMO it's fine to have it in there
<thumper> ok
 * thumper will think on it
<axw> thumper: seems I led you astray on the openstack storage fix. https://bugs.launchpad.net/juju/+bug/1615095
<mup> Bug #1615095: storage: volumes not supported <landscape> <juju:In Progress by axwalk> <https://launchpad.net/bugs/1615095>
<thumper> oh?
<thumper> damn
<axw> thumper: you know how we're looking for the volume endpoint in SetConfig?
<axw> thumper: doesn't work, because the client isn't authenticated yet
<thumper> ah
<thumper> oops
<axw> we don't authenticate to keep Open fast
<axw> thumper: gtg out, will fix it tomorrow
<thumper> ack
<rogpeppe1> axw: thanks a lot for the review of my letsencrypt branch
<dimitern> rogpeppe1: hey, I'm trying to figure out why the api server started to throw errors like this: ERROR juju.worker runner.go:210 exited "apiserver": cannot start api server worker: crypto/tls: private key does not match public key
<dimitern> rogpeppe1: it seems to happen frequently after the controller machine was restarted, on CI
<rogpeppe1> dimitern: interesting
<rogpeppe> dimitern: which version of juju are you using?
<dimitern> rogpeppe: what I found so far seems to indicate the key file was corrupted somehow
<rogpeppe> dimitern: wouldn't it be nice if debugging output printed the whole error stack?
<dimitern> rogpeppe: 2.0, on a feature branch master-lp1627037
<rogpeppe> dimitern: so have you seen this issue on master?
<dimitern> rogpeppe: yes, 2 days ago
<dimitern> rogpeppe: http://reports.vapour.ws/releases/4429/job/functional-container-networking-maas-2-0/attempt/1093
<dimitern> rogpeppe: and the last occurrence is http://reports.vapour.ws/releases/4436/job/functional-container-networking-maas-2-0/attempt/1108
<dimitern> rogpeppe: the errors are visible in machine-0.log on the controller, after it has been rebooted - it seems intermittent though
<rogpeppe> dimitern: ah, so the reports.vapour.ws log doesn't have that error in
<dimitern> rogpeppe: it just shows it failed to connect to the apiserver after 10m of trying
<dimitern> rogpeppe: but the machine-0.log shows the apiserver keeps restarting every 3s with that error
<rogpeppe> dimitern: i don't see any occurrence of "does not match" in http://data.vapour.ws/juju-ci/products/version-4429/functional-container-networking-maas-2-0/build-1093/controller/machine-0/machine-0.log.gz
<dimitern> rogpeppe: sorry, so the one on master shows a different error, but still related: 2016/09/27 13:24:45 http: TLS handshake error from 10.0.30.40:43974: remote error: bad certificate
<dimitern> or maybe not related ... /me is getting confused :/
<rogpeppe> dimitern: it seems like it might well be related
<dimitern> rogpeppe: I know some things around tls / certs have changed lately
<rogpeppe> dimitern: they have?
<rogpeppe> dimitern: have you got a link to a log that contains the "private key does not match" error?
<dimitern> rogpeppe: might have.. not sure - I know you added tests, but that shouldn't have caused such things
<dimitern> yeah, just a sec
<rogpeppe> dimitern: yeah, actually, i did change some stuff, it's true, i'd forgotten that
<dimitern> rogpeppe: here's the log http://data.vapour.ws/juju-ci/products/version-4436/functional-container-networking-maas-2-0/build-1107/machine-0.log.gz
<rogpeppe> dimitern: are all the failures since that commit (42d0c9c07cffbc5075cb05add9e1398056f0d890) ?
<dimitern> rogpeppe: it's from the feature branch CI run, but the branch itself does not have anything to do with tls or certs - just changes in provider/maas
<dimitern> rogpeppe: let me check
<rogpeppe> dimitern: does that feature branch include commit 42d0c9c07cffbc5075cb05add9e1398056f0d890 ?
<rogpeppe> dimitern: does the report (http://reports.vapour.ws/releases/4429/job/functional-container-networking-maas-2-0/attempt/1093) mention the commit id anywhere? i can't find it currently.
<dimitern> rogpeppe: I can't seem to find that commit on the fbranch or master
<rogpeppe> dimitern: i don't know what "revision 4429" means in a git context
<dimitern> rogpeppe: check the top of the report - has a link to the commit hash tested and it links to github
<rogpeppe> dimitern: what's the text of the link?
<dimitern> rogpeppe: ah, no actually - the
<dimitern> Jenkins link links to the job which mentions the commit id
<dimitern> e.g. http://juju-ci.vapour.ws:8080/job/functional-container-networking-maas-2-0/1107/
<dimitern> gitbranch:master-lp1627037:github.com/juju/juju bbd844f
<rogpeppe> dimitern: i get a 404 from the jenkins link
<dimitern> rogpeppe: ah, looking at that job's build history I can confirm the commit 42d0c9c
<rogpeppe> dimitern: how do you find the build history?
<dimitern> rogpeppe: failed
<rogpeppe> dimitern: failed what?
<dimitern> rogpeppe: can you open http://juju-ci.vapour.ws:8080/job/functional-container-networking-maas-2-0/ for example?
<rogpeppe> dimitern: nope
<rogpeppe> dimitern: 404
<voidspace> mgz: ping
<dimitern> rogpeppe: try logging in and then the link above?
<rogpeppe> dimitern: ok, works now, thanks
<rogpeppe> dimitern: one might've thought the 404 would contain a login link
<anastasiamac> rogpeppe: try to login ;) this 404 is misleading... it really means that u r not authenticated...
<anastasiamac> :)
<voidspace> mgz: unping :-)
<rogpeppe> anastasiamac: yeah, i hate that :)
<dimitern> rogpeppe: yeah, it's not *that* helpful..
<anastasiamac> rogpeppe: \o/ keeping us on our toes
<rogpeppe> dimitern: so it looks like http://juju-ci.vapour.ws:8080/job/functional-container-networking-maas-2-0/1093/console doesn't include the commit we're thinking of (42d0c9c)
<rogpeppe> dimitern: so perhaps we shouldn't level the blame at that :)
<rogpeppe> dimitern: assuming the "REVISION_ID=a5606e7126c0ee5b816b3c52e85f5c77635b5ce3" holds the revision being tested
<dimitern> rogpeppe: well, 42d0c9c did fail the job on master though...
<dimitern> rogpeppe: http://juju-ci.vapour.ws:8080/job/functional-container-networking-maas-2-0/1098/
<rogpeppe> dimitern: and that was the first time it failed like that?
<dimitern> rogpeppe: it failed before, but not like this I think.. checking earlier logs
<dimitern> rogpeppe: it failed before because the substrate was unclean - no machine matches constraints
<dimitern> rogpeppe: but interestingly, the very next run on 42d0c9c passed ok..
<dimitern> might be just flaky.. or misconfigured maas node
<dooferlad> dimitern, frobware: So, this keeps happening:
<dooferlad> MODEL  CONTROLLER  CLOUD/REGION  VERSION
<dooferlad> foo    maas        maas          2.0-rc2.1
<dooferlad> APP  VERSION  STATUS  SCALE  CHARM  STORE  REV  OS  NOTES
<dooferlad> UNIT  WORKLOAD  AGENT  MACHINE  PUBLIC-ADDRESS  PORTS  MESSAGE
<dooferlad> MACHINE  STATE    DNS            INS-ID                                                          SERIES  AZ
<dooferlad> 0        started  192.168.1.101  /MAAS/api/1.0/nodes/node-67b68b08-1452-11e6-9228-54a050d5d9eb/  xenial  default
<dooferlad> 0/lxd/0  started  10.0.0.199     juju-df0cd5-0-lxd-0                                             xenial
<dooferlad> 1        started  192.168.1.102  /MAAS/api/1.0/nodes/node-7b5b54e0-1452-11e6-9228-54a050d5d9eb/  xenial  default
<dooferlad> 1/lxd/0  started  192.168.1.103  juju-df0cd5-1-lxd-0                                             xenial
<dooferlad> oops, that should have gone to pastebin
<dooferlad> hang on
<dimitern> yeah
<dimitern> :)
<dooferlad> https://www.irccloud.com/pastebin/PrlicZuK/
<dooferlad> I have a smart IRC client, dumb user.
<rogpeppe> dimitern: which is the juju report associated with http://juju-ci.vapour.ws:8080/job/functional-container-networking-maas-2-0/1098/ ?
<frobware> dooferlad: on the 0/lxd/0 machine can you cat:
<frobware> /var/lib/cloud/seed/nocloud-net/network-config
<dooferlad> frobware: on it
<rogpeppe> dimitern: i'd like to see the machine-0.log for it
<dooferlad> https://www.irccloud.com/pastebin/5d4y61hL/
<dooferlad> frobware: ^^
<dimitern> rogpeppe: it should say somewhere.. looking
<dooferlad> frobware: it is the same as on the LXC that worked
<dimitern> rogpeppe: http://reports.vapour.ws/releases/4431
<frobware> dooferlad: MAAS?
<dooferlad> frobware: yes
<dooferlad> frobware: 1.9
<frobware> dooferlad: this look like you need mick's fix
<rogpeppe> dimitern: but which link on that page has the actual test run that contains the machine-0.log artifact for that run?
<frobware> dooferlad: https://github.com/juju/juju/pull/6276
<frobware> dooferlad: I just ran into this too.
<dooferlad> frobware: it was doing this yesterday too though, before that landed (I think)
<frobware> dooferlad: Just checking that it has landed...
<dimitern> rogpeppe: scroll down
<frobware> dooferlad: so it has.
<rogpeppe> dimitern: i see hundreds of links but no artifacts
<dimitern> rogpeppe: and search for the job name - on the build number to the right (hovering) you'll see the logs
<rogpeppe> dimitern: what job name am i looking for?
<frobware> dooferlad: 17 hours ago - is that before or after the CI job?
<dimitern> rogpeppe: it's frustrating how it overlaps the logs, but if you hover on functional-container-networking-maas-2-0 | Succeeded | >1099<
<dooferlad> frobware: it is showing up in the history, so it must have merged, right?
<rogpeppe> dimitern: but that's a success - i thought this was meant to have failed
<frobware> dooferlad: agreed. but pull and check
<dimitern> rogpeppe: the list that appears has 1098 ... but unfortunately http://reports.vapour.ws/releases/4431/job/functional-container-networking-maas-2-0/attempt/1098 does not appear to have the machine-0 log
<dooferlad> frobware: or is this the 'new process' stuff that I have been ignoring
<frobware> dooferlad: that was going through my head
<dimitern> damn :/ hate it when that happens!
<dooferlad> frobware: yes, it is there in my build
<rogpeppe> dimitern: so you're not sure whether 42d0c9c failed because of the issue you've described?
<dooferlad> frobware: and it wasn't yesterday when I was running into it the first time
<rogpeppe> dimitern: looking at the changes i made in that branch, i'd find it very unlikely that they could cause an extra error case when starting the api server
<rogpeppe> dimitern: the changes were all about how changed certificates were handled
<rogpeppe> dimitern: i think the issue probably arises because a duff cert/key pair is being passed into the api server
<rogpeppe> dimitern: it's difficult to say without being able to reproduce the issue
<rogpeppe> dimitern: perhaps add some debugging log statements that might help if the issue happens again
<rogpeppe> dimitern: for example if the cert is wrong, log it and the key
<dimitern> rogpeppe: well, I'll let you know if we repro it again, and thanks for looking into it!
<rogpeppe> dimitern: np
<frobware> dooferlad: did you draw any conclusion?
<dooferlad> frobware: no. Got stuck in other email. Will take another look in a moment or 10
<frobware> dooferlad: pulling that change on top of what I'm doing fixed that lxd/dhcp/eth0 case for me
<dooferlad> frobware: frustrating that it didn't help me then :-|
<frankban> hey, I need a review for https://github.com/juju/juju/pull/6352 anyone available? thanks!
 * dooferlad tries again in case of user error
<rogpeppe> jam: i've replied to https://github.com/juju/juju/pull/6345 and made one or two changes. do you think it's good to land?
<dimitern> voidspace: your PR fails make check on trusty btw
<dimitern> voidspace: you can see which failed in trusty.out log - GetServerAddrs or something like this
<dimitern> voidspace: http://juju-ci.vapour.ws:8080/job/github-merge-juju/9362/artifact/artifacts/trusty-out.log
<dimitern> voidspace: ah, you've updated that I guess it will pass now - sorry for the noise :)
<frankban> axw: could you please take a look at https://github.com/juju/juju/pull/6352 when you have time? thanks
<voidspace> dimitern: I know, already fixed
<voidspace> dimitern: but thanks
<dooferlad> ok, latest master still has the random not getting the right address problem. No question. Yay. Starting more digging...
<dooferlad> frobware, dimitern: the answer is in /var/log/lxd/<name>/lxc.conf -- the container that doesn't end up on the right subnet gets:
<dooferlad> lxc.network.type = veth
<dooferlad> lxc.network.flags = up
<dooferlad> lxc.network.link = lxdbr0
<dooferlad> lxc.network.hwaddr = 00:16:3e:4c:3d:b3
<dooferlad> lxc.network.name = eth0
<dooferlad> The right thing would be more like:
<dooferlad> lxc.network.type = veth
<dooferlad> lxc.network.flags = up
<dooferlad> lxc.network.link = br-eth0
<dooferlad> lxc.network.hwaddr = 00:16:3e:ca:f3:6c
<dooferlad> lxc.network.mtu = 1500
<dimitern> dooferlad: what's in the machine-0.log around PrepareContainerInterfaceInfo API call?
<dooferlad> lxc.network.name = eth0
<dooferlad> lxc.network.type = veth
<dooferlad> lxc.network.flags = up
<dooferlad> lxc.network.link = br-eth1
<dooferlad> lxc.network.hwaddr = 00:16:3e:40:36:21
<dooferlad> lxc.network.mtu = 1500
<dooferlad> lxc.network.name = eth1
<dimitern> dooferlad: any errors
<dimitern> aaaaah stop it:)
<frobware> dooferlad: what does: lxc config show <container-name> show?
<anastasiamac> dooferlad: \o/ i second dimitern :D
<dooferlad> bah
<frobware> dooferlad: humbug?
<dooferlad> dimitern: logs https://www.irccloud.com/pastebin/4xfeEDhy/
<dimitern> dooferlad: I can't see PrepareContainerInterfaceInfo response there?
<dimitern> dooferlad: it should be a bit earlier in the controller
<dimitern> dooferlad: controller's machine log
<frobware> dimitern, dooferlad, voidspace, babbageclunk: I concluded my manual testing on https://github.com/juju/juju/pull/6342
<frobware> dimitern, dooferlad, voidspace, babbageclunk: want to land this but looking for final review/approval now
<dimitern> frobware: http://pasteboard.co/8ThbB7FWl.png
<rogpeppe> dimitern: do you know what the set-numa-control-policy setting does, by any chance?
<rogpeppe> dimitern: i was mucking around in controller/config.go and came across: 	// NumaControlPolicyKey stores the value for this setting.
<rogpeppe> dimitern: which is possibly the most uninformative doc comment I have ever come across
<rogpeppe> the above is a question for anyone else too, BTW.
<dooferlad> dimitern: sorry, my fingers typed 'juju ssh 0', not 'juju ssh -m controller 0'
 * dooferlad curses CLI changes
<anastasiamac> rogpeppe: sounds like my thing :D off memory, it was done as a fix for https://bugs.launchpad.net/juju-core/+bug/1350337
<mup> Bug #1350337: Juju DB should use numactl when running mongo on multi-socket nodes <canonical-bootstack> <hours> <maas> <mongodb> <juju-core:Fix Released by anastasia-macmood> <https://launchpad.net/bugs/1350337>
<rogpeppe> this is also hilarious: // DefaultNUMAControlPolicy should not be used by default.
<anastasiamac> rogpeppe: it's a setting specific for NUMA machines. r u using NUMA?
<rogpeppe> anastasiamac: no, but i think that every attribute should be adequately documented
<anastasiamac> rogpeppe: mongo needs to have flag setup to run on NUMA. Hence, the setting
<dimitern> :D
<dimitern> it is
<dimitern> hilarious
<frobware> dimitern: guestmount -a /var/lib/libvirt/images/maas19-node6.qcow2 -o ro -m /dev/sda1 /mnt
<anastasiamac> rogpeppe: agreed :) it was my ealry days on juju and of course, as a dev, it was obvious to me at the time :) sorry
<rogpeppe> anastasiamac: ok, well it should be documented that it's about running mongo with numactl for a start
<anastasiamac> rogpeppe: agreed, feel free to clarify as an external force that is now aware :-P
<anastasiamac> rogpeppe: especially, since u r in the area :)
<rogpeppe> anastasiamac: do you know what a "multi-socket server" is?
<anastasiamac> rogpeppe: :D not sure where u r reading "server"... i thought it was about nodes :)
<rogpeppe> anastasiamac: BTW shouldn't this be done by some code on the system to test if it's NUMA rather than with a config setting?
<rogpeppe> anastasiamac: from the bug report you linked "When running Juju on multi-socket servers I see this in the mongo log:"
<rogpeppe> anastasiamac: perhaps it's talking about physical CPU sockets
<anastasiamac> rogpeppe: haha ;) no, i do not know... can't really cast my mind that far back: feels like another lifetime \o/
<rogpeppe> anastasiamac: how about this for a doc comment?
<rogpeppe> 	// NUMAControlPolicyKey specifies whether the MongoDB
<rogpeppe> 	// instance on the controller nodes should be run under numactl.
<rogpeppe> 	// This should be set if the controller will run on NUMA hardware.
<anastasiamac> rogpeppe: \o/ sounds perfect :D
<dooferlad> dimitern, frobware: got it:
<dooferlad> 77427b4d-a1e4-4659-8b34-fc17ed10ac2b machine-0: 2016-09-29 10:10:35 DEBUG juju.apiserver request_notifier.go:140 -> [3C2] machine-0 694.756636ms {"request-id":107,"response":"'body redacted'"} Provisioner[""].PrepareContainerInterfaceInfo
<dooferlad> e6b42b7c-5cb7-47e2-8e4d-27040bdc810b machine-0: 2016-09-29 10:10:35 WARNING juju.provisioner lxd-broker.go:62 failed to prepare container "0/lxd/0" network config: creating device interface: ServerError: 400 BAD REQUEST ({"vlan": ["This field is required."]})
<dooferlad> missing vlan field
<frobware> dooferlad: what does the setup look like to get into this state?
<dooferlad> frobware: could you be more specific?
<frobware> dooferlad: :) MAAS setup / node config / interface config on the nodes
<dooferlad> frobware: ip addr https://www.irccloud.com/pastebin/mxvcDvoP/
<dooferlad> frobware: sorry, ignore that
<dooferlad> frobware: this is the right machine https://www.irccloud.com/pastebin/XrgdgkDx/
<dooferlad> that machine has 3 NICs with eth1 assigned an address
<dooferlad> no VLAN tags
<dooferlad> I really don't want to see if it works if I change the NIC with an address to eth0. It would make me throw things if that was it.
<dooferlad> frobware: I need to go make lunch for older daughter. Back in a few minutes.
<perrito666> axw: still here?
<rogpeppe> a large-but-mechanical change to use consistent spellings for API and NUMA throughout Juju. review appreciated, thanks. https://github.com/juju/juju/pull/6353
<hoenir> https://bugs.launchpad.net/juju-core/+bug/1173122/+index?ss=1
<mup> Bug #1173122: API server should not log passwords <logging> <security> <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1173122>
<hoenir> why log passwords? and why not review the patch sended? (check diff).
<hoenir> anyone?
<rick_h_> hoenir: sorry, I've no idea why this never got looked at. It's from over 2 years ago on the juju-core release in maintenance mode because we've focused on juju 2.0 in launchpad.net/juju and since the code's moved to github we've not looked at code reviews in LP for some time
<hoenir> rick_h_, thanks for making it clear.
<hoenir> rick_h_, it makes me wonder... why do we keep them if they are invalid? Why not close all bugs/PRs on bzr/launchpad and add into the project description something like "Juju doesn't receive any more commits on launchpad, we moved to github"?
<rick_h_> hoenir: all I can say is that I've never even looked to be honest.
<rick_h_> hoenir: there's no reason not to
<frobware> dooferlad: sorry, was sidetracked. will look in a bit.
<frobware> rick_h_: we have not landed the patch (yet!). We ran into an issue with aliases when testing this morning.
<hoenir> I think we'd make our lives better if we took some time cleaning up the project's issues/bugs... It's kind of a mess.
<rick_h_> frobware: k, please keep us up to speed on that.
<frobware> rick_h_: we should sync (now?)
<rick_h_> frobware: k, meet you in standup
<frobware> rick_h_: omw
<anastasiamac> hoenir: "project issues/bugs".. are u referring to launchpad?
<hoenir> anastasiamac, yeah
<frobware> dooferlad, babbageclunk, voidspace, macgreagoir: PTAL @ https://github.com/juju/juju/pull/6342 and rubber stamp an approval if you're happy with it.
<hoenir> anastasiamac, has the review system changed lately, or am I understanding it wrong? Do we still use http://reviews.vapour.ws or the github one?
<anastasiamac> hoenir: we r experimenting with doing reviews on github
<anastasiamac> hoenir: ur patch in launchpad, is it possible for u to re-propose against github?
<Andrew_jedi> Hello guys, I was wondering whether this bug fix was included in the openstack oslo messaging charms for Liberty? https://bugs.launchpad.net/oslo.service/+bug/1524907
<mup> Bug #1524907: [SRU] Race condition in SIGTERM signal handler <sts> <sts-sru> <Ubuntu Cloud Archive:Fix Released> <Ubuntu Cloud Archive liberty:In Progress by hopem> <oslo.service:Fix Released> <python-oslo.service (Ubuntu):Fix Released> <python-oslo.service (Ubuntu Wily):Won't Fix>
<mup> <python-oslo.service (Ubuntu Xenial):Fix Released> <python-oslo.service (Ubuntu Yakkety):Fix Released> <https://launchpad.net/bugs/1524907>
<anastasiamac> Andrew_jedi: this is mostly juju-dev channel, u might get more info in #juju - where charmers are \o/
<Andrew_jedi> anastasiamac: thanks!
<hoenir> anastasiamac, could you reformulate the last question again?
<hoenir> anastasiamac, the patch in the launchpad It's not mine.
<anastasiamac> hoenir: oh k.. i wonder if we have addressed it in the codebase in github already
<rick_h_> dooferlad: are you still reviewing the bridge changes pr?
<hoenir> Yeah that's why I'm suggesting to clean up a bit the project tracker issue.
<rick_h_> mgz: heads up, might be a couple of min late to our call this morning. Gotten sidetracked this morning
<mgz> rick_h_: no problem
<jam> babbageclunk: I was playing around with your tests again, and if I run with '-v' I do see some errors here and there about "http: TLS handshake error from..." I wonder if the mongo driver is assuming ssl?
<jam> and periodically it is trying to poll the replicaset but failing because of TLS stuff, which isn't failing the primary test
<babbageclunk> jam: I think we see that occasionally in normal test runs anyway? Maybe.
<jam> babbageclunk: it certainly sounds like something is genuinely wrong and we just haven't been noticing.
<jam> I'll try running with pure trunk and see if I still see it
<babbageclunk> jam: Yeah, could be.
<rogpeppe> jam: have you got a final opinion on https://github.com/juju/juju/pull/6345 (your review didn't approve or not-approve) ?
<jam> rogpeppe: I haven't seen your response yet, will try and look again
<rogpeppe> jam: ta
<rick_h_> mgz: omw
<mgz> rick_h_: I await
<jam> rogpeppe: I had some comments, but I think LGTM
<rogpeppe> jam: thanks. i responded to "I would guess the minimum is still how long it takes for DNS to update."
<rogpeppe> jam: it's independent of DNS updates
<jam> rogpeppe: so its not entirely independent. if you run 2 controllers at the same IP address, then sure, they both get a signed cert.
<jam> but if you are doing "juju bootstrap" you then need to go update your IP record
<jam> and then it has to propagate
<jam> and *then* you can get a signed cert.
<rogpeppe> jam: you do, but that's independent of letsencrypt
<rogpeppe> jam: ideally I think we'd provide a way for Juju to automatically update its own DNS record
<rogpeppe> jam: and then of course it would take a while for the record in the IP address to propagate
<rogpeppe> jam: one other thing, kinda trivial: I've been wanting to standardise on Api vs API spelling for ages (we're inconsistent all over), and this PR does that. Do you think it's a good idea? https://github.com/juju/juju/pull/6353
<jam> rogpeppe: you didn't address Http while you were at it?
<rick_h_> rogpeppe: jam my one concern would be if things like deployer/etc break due to api changes to them ^
<jam> I'm +1 on the general concept of preferring API
<rogpeppe> jam: good point, that would be good too
<rogpeppe> rick_h_: and +1 on that
<rogpeppe> rick_h_: i'll double check we're not changing any API calls
<jam> the only thing I worry about with a global search/replace is things that we've exposed as part of a public interface.
<jam> as rick_h_ pointed out as well.
<rogpeppe> jam: looks like no API calls are affected
<rogpeppe> jam: or by Http vs HTTP either, which I'll do too
<rick_h_> rogpeppe: <3 ty
<jam> babbageclunk: looks like you're right, the TLS stuff is pre-existing.
<jam> I wonder if something is trying to update to an old address that got torn down
<jam> (my quick hack of apiserver.go to drop TLS shows about 10 tests that actually need ssl cause they are testing that we expose SSL endpoints, and 54s vs 58s runtime)
<jam> so another 10%-ish, but quite a bit more actual work to get all of those to actually run.
<jam> rogpeppe: looks like you missed a legacy_test.go http://juju-ci.vapour.ws:8080/job/github-merge-juju/9366/artifact/artifacts/trusty-err.log
<mup> Bug #1173122 changed: API server should not log passwords <logging> <security> <tech-debt> <juju-core:Fix Released> <https://launchpad.net/bugs/1173122>
<rogpeppe> jam: darn, i thought i'd run all the tests locally
<rogpeppe> jam: ah, it's not surprising - that code has only just landed in master
 * rogpeppe should've rebased before $$merge$$
<rogpeppe> jam: BTW I changed https://github.com/juju/juju/pull/6353 to do all of: API, NUMA, HTTPS, HTTP, FTP and URL.
<jam> rogpeppe: so I'm +1 on theory, but I'd like someone to really go through it carefully to watch out for any actual public API changes.
<rogpeppe> jam: so I did a grep for all those names inside apiserver/... and there are no methods that have changed name; and there are no fields in apiserver/params without JSON annotations that have changed name either
<rick_h_> frobware: dimitern ty for getting that in and landed! please enjoy the rest of the sprint!
<dimitern> rick_h_: thank you ;) I'm glad it landed
<rogpeppe> jam: i'd really like it if we had an automatic API compatibility checking tool that checked for any type and name changes. obviously it wouldn't be sufficient, but it would be a good sanity check to have.
<rogpeppe> jam: i started doing one when we had "friday labs" but no time now
<dimitern> oh the good ol' days..
<dimitern> :D
<rogpeppe> jam: here are all the differences that are inside apiserver/... that aren't in _test.go files: http://paste.ubuntu.com/23251153/
<rogpeppe> dimitern: :)
<mup> Bug #1173122 opened: API server should not log passwords <logging> <security> <tech-debt> <juju-core:Fix Released> <https://launchpad.net/bugs/1173122>
<mup> Bug #1173122 changed: API server should not log passwords <logging> <security> <tech-debt> <juju-core:Fix Released> <https://launchpad.net/bugs/1173122>
<natefinch> rick_h_, voidspace, dooferlad, macgreagoir: ping for standup
<voidspace> natefinch: omw :-)
<voidspace> natefinch: thanks
<anastasiamac> natefinch: macgreagoir is at sprint :) r u just missing Mick or do u need to update him? :D
<macgreagoir> natefinch: Aye, sorry, I'm sprinting :-)
<natefinch> macgreagoir: forgot, nevermind :)
<babbageclunk> jam: Sorry, have you already done the no-ssl-api-connection hack?
<jam> babbageclunk: only in one package and a very dirty dirty hack without making all the tests work.
<jam> with failing tests it's hard to say if it's faster because of the SSL or because of the tests stopping early
<babbageclunk> jam: ah, cool - I'm finding it pretty slow going/very whack-a-mole
<jam> I just did "apiserver/" itself.
<jam> I have the feeling that the difficulty you're seeing in implementing is enough of an answer in itself.
<jam> not worth it for the time to whack all the moles
<babbageclunk> jam: yeah, I'm thinking that too
<jam> babbageclunk: because to do it *cleanly* you still have to whack those moles
<frankban> natefinch, axw: I have two branches up for review when you have time: https://github.com/juju/juju/pull/6352 and https://github.com/juju/juju/pull/6354 thanks!
<dimitern> macgreagoir, anastasiamac: PTAL https://github.com/juju/juju/pull/6355 - small time-related state improvements
<natefinch> frankban: will take a look
<frankban> natefinch: ty!
<anastasiamac> dimitern: lgtm
<rick_h_> natefinch: ping, sorry forgot to stay on
<rick_h_> natefinch: can you meet me back in the standup?
<anastasiamac> jam: tell user when operations are blocked \o/ https://github.com/juju/juju/pull/6356
<natefinch> rick_h_: sure
<rick_h_> dooferlad: re: https://bugs.launchpad.net/juju/+bug/1623480 did you run frobware's test on azure and that's corrected?
<mup> Bug #1623480: Cannot resolve own hostname in LXD container <lxd> <network> <juju:Triaged by rharding> <https://launchpad.net/bugs/1623480>
<dooferlad> rick_h_: no.
<dooferlad> rick_h_: probably should!
<katco`> rick_h_: hey what was the consensus on the new charm URL format. is the change from cs:<series>/<charm>-<rev> to cs:<series>/<charm>/<rev> intended?
<rick_h_> dooferlad: ty
<rick_h_> katco`: consensus was "watch out! it's a swamp!"
<katco`> lol
<katco`> rick_h_: but i'm OK to correct unit tests to new format? i'm not just fixing by coincidence?
<rick_h_> katco`: I've got a reply on the latest there I need to go process.
<rick_h_> katco`: i'm not sure tbh
<katco`> rick_h_: ok, well lmk; no rush
<rick_h_> katco`: the quick test is can you use that url in trunk atm?
<katco`> rick_h_: looks like it locates it at least
<rick_h_> katco`: ok, I know not all the url space works so I think it might be in an in-between land
<rick_h_> katco`: so we have to run with what we've got, as getting it all working seems beyond the 2.0 scope atm
<katco`> rick_h_: although it's cs:<charm>/<series>/<rev>
<rick_h_> katco`: right, a lovely mongrel
<rick_h_> although actually that's good charm/series/rev, wonder if charm/rev works
 * rick_h_ needs to update trunk and tinker
<katco`> rick_h_: doesn't look like it
<katco`> rick_h_: if you're going to specify a rev, looks like you need a series. which doesn't make any sense with multi-series charms
<rick_h_> katco`: right :/
<dooferlad> frobware, dimitern: found it https://bugs.launchpad.net/juju/+bug/1628973
<mup> Bug #1628973: maas provider uses 'primary interface' logic - allocateContainerAddresses1 fails when interface 0 doesn't have a VLAN <juju:Confirmed> <https://launchpad.net/bugs/1628973>
<dimitern> dooferlad: looking
<dimitern> dooferlad: there is in fact such a thing as primary nic for a device
<dimitern> dooferlad: it's the only one maas creates along with the device when you pass it a mac address
<dooferlad> dimitern, frobware: that whole function is suspect; looking at a subnet to find what VLAN an interface should use? You can have the same subnet used by two different VLANs - kind of the point.
<dooferlad> dimitern: no, there isn't.
<dooferlad> dimitern: that logic uses 'primary' to mean 'first in the list'
<dimitern> dooferlad: no, there is ever only one vlan for any subnet in maas
<dimitern> in fact subnets are contained within a vlan
<dooferlad> dimitern: that is a current MAAS limitation, not a network limitation.
<rogpeppe> dimitern: BTW that doc comment on NowToTheSecond is misleading - mongo stores time in millisecond resolution, not second resolution
<dimitern> and I'm talking about the maas vlan db entity, not a VLAN with tag 1234
<dimitern> dooferlad: I agree it's maas specific, but that's what the code is handling there
<dimitern> dooferlad: ok the name might be changed to "first" vs "primary"
<dooferlad> dimitern: I suggest comments. They are useful :-)
<dimitern> maas db triggers ensure every device has at least 1 NIC, but it doesn't link it to the correct vlan (i.e. uses the "default vlan" for it)
<dooferlad> the problem with that statement is that 'default vlan' could be the absence of a VLAN in terms of the network, right?
<dimitern> dooferlad: I'm not saying they aren't useful ;) I'm thankful
<dimitern> dooferlad: it's a vlan used when maas cannot figure out which vlan to use - it's in fact hardcoded in maas src with id=5001
<dimitern> horrible, horrible stuff
<dooferlad> yikes!
<dooferlad> do you know what happens if we don't specify the VLAN field in the API request? It seems like that is the right thing to do in this case.
<dooferlad> I need to look at the API docs (and hope they are correct)
<dimitern> I know yes - it fails, as for a physical nic vlan is required
<dimitern> and all device nics are physical
<dimitern> that chunk of code tries to satisfy the requirement
<dooferlad> oh my
<dooferlad> I think we need to just use id=5001 when it isn't set then :-(
<dimitern> you can easily check: maas <profile> devices create parent=xyz mac_addresses=aa:bb:cc:dd:ee:f0 ; maas <profile> interfaces create-physical <dev-id> [no vlan=]
<dimitern> might be, but if it happens not to match the vlan used by the host bridge, you'll get issues
<dimitern> fortunately, figuring out which subnet the host bridge is on (and the vlan of that subnet) is easier now, since we only bridge host interfaces with ip addresses
<dimitern> it was much harder previously, as the host bridge might be without address (hence subnet)
<dooferlad> dimitern: in allocateContainerAddresses1 that isn't the case. On my hardware br-eth0 and br-eth2 don't have addresses, so we shouldn't be trying to get lxd/0 eth0 or eth2 addresses. That code even uses the horror of if nic.InterfaceName == "eth0".
<dooferlad> it will fail if there isn't an interface called eth0 configured to use as a set of defaults.
<dimitern> dooferlad: are you using tip of master? how come those address-less br-* get created?
<dooferlad> dimitern: yes, using the tip of master
<dooferlad> dimitern: as of this morning that is.
<dimitern> dooferlad: yeah, well the bridge-only-configured (master-lp1627073 ?) PR landed, so now you shouldn't be seeing such bridges anymore
<dooferlad> dimitern: so your changes will have stopped br-eth0 from turning up, but this code still looks for an eth0 (from MAAS?)
<dooferlad> so it doesn't make any difference
<dooferlad> ...maybe
<dimitern> dooferlad: eth0 is inside the device
<dooferlad> will it be linked to br-eth0?
<dimitern> dooferlad: it will be linked to the first bridge on the host
<dimitern> dooferlad: which has an address - might be br-ens5 or br-eth0 (depends on the host node network config)
<dooferlad> so, it will be linked to br-eth1 in this case. OK.
<dimitern> yeah
<dimitern> dooferlad: also maas names the first device nic "eth0", fwiw
<dooferlad> dimitern: but first != special
<dooferlad> dimitern: and untagged is a magic API value
<dimitern> dooferlad: it kinda is, because it's always there
<dooferlad> (5001)
<dimitern> dooferlad: so we can't create it, just find it and update it (if needed)
<dimitern> maas auto-creates it using the mac address passed to create device
<dooferlad> so after the message "NIC %v has no subnet - setting to manual and using untagged VLAN", shouldn't we assign the value "5001" to match the API?
<dimitern> dooferlad: I think that's just dead code after that PR landed :/
<dimitern> dooferlad: since it will have a subnet always
<dooferlad> dimitern: I am all for deleting it and replacing it with an error!
<dimitern> dooferlad: if you're willing to go there, please do - I'll happily review the changes
<dimitern> (now rc2's been cut off anyway)
<dooferlad> dimitern: I think we need to add a cleanup card to schedule the work. That function is over 100 lines long without any comments and I think has bigger problems than the dead code we just identified.
<katco`> p
<dimitern> dooferlad: sgtm
<dooferlad> dimitern: confirmed the problem has gone with your latest changes :-)
<dimitern> dooferlad: \o/ !! :)
<natefinch> rick_h_: so, authentication with apikey is going to be more difficult than I thought.  It's special for rackspace, not using the openstack standards, and since our rackspace code just reuses all that... it would take some work to extract the authentication code into something rackspace can override
<rick_h_> natefinch: yea, that's what I was thinking
<rick_h_> natefinch: so it'd be good to drop any notes into the bug, rename it to "add support for api-key auth to the rackspace provider" and 2.1 it
<natefinch> rick_h_: yep
<natefinch> rick_h_: it's unfortunate... I can change the names we call them, but then deep in the goose code, it has hard coded expectations of what the values will be called, and those aren't what rackspace expects.
<rick_h_> natefinch: yea, that's what I expected
<natefinch> rick_h_: oh well.  I can disable access-key type authentication easily enough
<rick_h_> natefinch: ty
<natefinch> Simple review anyone? +25 -17: https://github.com/juju/juju/pull/6357
<natefinch> rick_h_: ^ "fixed" :)
<rick_h_> natefinch: ty
<hatch> when connecting to a controller via the API the user-info returns two fields 'controller-access' and 'model-access' - why would these values differ from what is returned via the cli for the same user?
<hatch> they are both "" when the cli reports that it should have addModel
<rick_h_> natefinch: so when you add credential there's no option right?
<rick_h_> natefinch: as far as which type to use?
<hatch> rick_h_: via the api you can specify for certain clouds
<natefinch> rick_h_: correct.  it just says 'Using auth-type "userpass"'
<natefinch> $ juju add-credential rackspace
<natefinch> Enter credential name: bar
<natefinch> Using auth-type "userpass".
<natefinch> Enter username: foo
<natefinch> Enter password:
<natefinch> Enter tenant-name: a
<natefinch> Credentials added for cloud rackspace.
<rick_h_> natefinch: <3 ty
<natefinch> rick_h_: d-(^_^)-b
<hatch> ok let me rephrase my question, when logging into the controller should the acl data be returned in the response, or do I need to make another request for the data?
<rick_h_> perrito666: ^
<katco`> https://github.com/juju/juju/pull/6348 could use another look. warning: scripted renames ahead
<rick_h_> natefinch: katco` can you all swap for review then please?
<natefinch> yep yep
<katco`> yep
<natefinch> katco`: is this correct? cs:~a-user/trusty/spam-5  is now   cs:trusty/spam/~a-user/5  ?
<katco`> natefinch: i think i got the bargain in this transaction: lgtm
<natefinch> katco`: lol I was gonna say.... :)
<katco`> natefinch: sigh no that's not correct
<natefinch> katco`: well, at least that's fairly easy to find with a regex
<katco`> yeah
<katco`> i have a feeling i probably missed some cases too
<natefinch> I presume /2 instead of -2 is the new correct way?
<katco`> yeah
<katco`> read commit message
<natefinch> ahh yeah, missed that, but figured it from context
<katco`> i actually don't know how it's supposed to look for a user url
<natefinch> I presume ~user/charmname/series/revision ..... anything else would be kinda wacky
<natefinch> (which, yes, means the old way was wacky ;)
<katco`> natefinch: cs:charm/series/rev is kinda wacky for multiseries charms too
<natefinch> katco`: very true
<natefinch> katco`: if I ran the zoo, all charms would be multiseries... it's silly to have different charms when the charm format is cross platform.
<perrito666> hatch: to be honest, I dont remember
<katco`> what was the consensus on squashing commits with reviews?
<natefinch> katco`: do it, I believe was the consensus
<katco`> natefinch: squashed with fix for user urls
<hatch> perrito666: heh ok
<rick_h_> perrito666: can you please help hatch out with the gui using ACL bits. There's a possibility of a bug in what we're about to head out in rc2 and I want to make sure it's ok
<perrito666> rick_h_: sure
<hatch> thanks all
<hatch> you rock
<perrito666> yes we do
<natefinch> katco`: I like the way you show the regex you used as if I can somehow tell if it's right or wrong ;)
<katco`> natefinch: lol, well it's mainly there in case we need it again or it's screwed something up :)
<katco`> natefinch: just good to pair the command used with the commit
<natefinch> katco`: lgtm.  I have a quibble with the logging, but meh, it's not that important
<katco`> natefinch: yeah i am considering your comments
<katco`> natefinch: i usually have a rather black and white view on the matter; i.e. it's not the place of a function to determine what happens with its errors. i'm weighing your opinion trying to decide if it's not always so
<katco`> natefinch: you are correct in that we only ever log it; the case i consider is the future. it's much easier to change things on the edges, and when the handling is done on the bottom-edge, it couples all callers together with their handling
<natefinch> well, either senderror has to handle its own errors, or the http handler functions have to handle their own errors
<katco`> natefinch: right, and i always prefer the top-edge (i.e. http handler)
<katco`> pushing decisions down into your tree makes code rigid, but easier to read/use
<natefinch> I guess I see senderror as just a shortcut instead of copying and pasting that whole thing into every http handler
<katco`> natefinch: i believe sometimes it goes through other functions before making its way up to the http handler?
<katco`> natefinch: at any rate, i am still mulling over your comments. i'm going to $$merge$$ so that the tests i undoubtedly missed fail and take a late lunch
<natefinch> heh good idea
<natefinch> if the merge goes through, like I said, it's mostly a philosophical difference.  It's obviously not incorrect as it is.
<natefinch> katco`: btw, it looks like sendError *used* to get called from lower in helper functions, and I agree that would be bad... but now those helper functions (correctly) just return errors, so sendError is only ever called from the top level http handler functions.
<katco`> natefinch: ah
<katco`> natefinch: i have been told to wait on lunch in case my wife needs help stuffing our cat into the crate for the vet
<natefinch> haha, I know how that goes
<natefinch> one of our cats is like "No!!! DEATH FIRST!"  And the other one is like "hey, what's this, a nice little place to curl up? don't mind if I do"
<katco`> lol yep. we have those same 2 cats
<balloons> redir, you about/
<alexisb> balloons, nope he wont be
<alexisb> balloons, something I can help you with?
<alexisb> perrito666, ping
<balloons> alexisb, just looking for someone to +1 this so we can merge it ;-) https://github.com/juju/juju/pull/6358
<natefinch> I'm on call reviewer
<balloons> ahh, I misread.. howdy natefinch
<balloons> you've been through the drill before
<natefinch> balloons: lgtm
<alexisb> natefinch, beat me to it :)
<perrito666> alexisb: pong (sorry was having afternoon tea, yes we do that here)
<alexisb> :) np
<redir> balloons: all set?
<balloons> redir, yeppers!
 * redir nods
<natefinch> anyone remember if facades are supposed to start at version 0 or 1?
<katco`> natefinch: 1
<katco`> natefinch: bc 0 is the default value and is equivalent to forgetting about the version
<veebers> Has this been discussed before? During a bootstrap, when an 'apt-get' fails juju continues on to try to install some packages and doesn't time out at all.
<perrito666> axw: ?
<thumper> morning
<katco`> heya thumper
<axw> perrito666: awake now, but I'm heading into a meeting shortly
#juju-dev 2016-09-30
<axw> thumper: back, ready to chat when you are
<alexisb> axw, you must have eaten a big breakfast ;)
<axw> alexisb: :p  may have done some other things too...
<thumper> axw: gimmie 10 to finish up what I'm doing
<axw> thumper: np, ping whenever
<thumper> axw: now is good
<axw> thumper: standup?
<thumper> sure
<thumper> axw: as in bluejeans or hangout?
<axw> thumper: sorry, I meant hangout
<axw> derp
<thumper> heh
<thumper> link?
<axw> thumper: https://hangouts.google.com/hangouts/_/canonical.com/a-team-standup?authuser=1
<jamespage> morning all
<jamespage> any storage gurus around? I'm seeing this with the openstack provider and juju 2.0 rc-1
<jamespage> juju add-storage ceph-osd/3 osd-devices=4,10G
<jamespage> failed to add "osd-devices": adding storage to unit ceph-osd/3: pool "cinder" not found
<frobware> anastasiamac: https://bugs.launchpad.net/juju/+bug/1466514
<mup> Bug #1466514: MachineSuite.TestCertificateUpdateWorkerUpdatesCertificate timeout while waiting for certificate to be updated <ci> <intermittent-failure> <test-failure> <juju:In Progress by thumper> <https://launchpad.net/bugs/1466514>
<jamespage> fwiw raised https://bugs.launchpad.net/juju/+bug/1629229 for ^^
<mup> Bug #1629229: unable to add-storage with openstack provider <juju:New> <https://launchpad.net/bugs/1629229>
<rogpeppe> axw: i wonder if you might be amenable to giving this the thumbs up... https://github.com/juju/juju/pull/6353
<rogpeppe> axw: large but fundamentally trivial
<axw> jamespage: https://bugs.launchpad.net/juju/+bug/1615095
<mup> Bug #1615095: storage: volumes not supported <landscape> <juju:In Progress by axwalk> <https://launchpad.net/bugs/1615095>
<axw> jamespage: (same underlying issue)
<jamespage> axw, dupe then - thanks!
<axw> rogpeppe: my brain is fried, but seeing as it's mechanical
<rogpeppe> axw: :) thanks!
<voidspace> fsck
<voidspace> formatted text output comes from iteration over a dictionary
<voidspace> so testing output is a damn pain
<dooferlad> voidspace: ouch!
<voidspace> dooferlad: going to sort in the code just so I can test it
<dooferlad> voidspace: +1
<voidspace> dooferlad: I bet the default string sort is asciibetical
<voidspace> dooferlad: but I don't think I care...
<voidspace> dooferlad: and of course the only way to get all the keys out of a golang map is to iterate and build a slice
<voidspace> dooferlad: which is just inelegant code
<voidspace> but ah well
<axw> rogpeppe: +1
<rogpeppe> axw: tyvm
<redir> babbageclunk macgreagoir https://github.com/juju/juju/wiki/Intermittent-failures
<dimitern> great page and diagrams redir!
<rogpeppe> redir: interesting. isn't that kind of thing why the Alarms channel exists?
<macgreagoir> redir: +1, really nice and clear with links to code.
<rogpeppe> heads up: my rather large branch which standardises acronym spelling in juju-core has just landed. you might wanna rebase before submitting any existing PRs.
<rogpeppe> oh no, it hasn't actually landed, darn
<rogpeppe> it has now, finally
<babbageclunk> rogpeppe: I think the problem with the Alarms channel here is that you might have had many calls to clock.After, but have nothing in the waiting queue.
<babbageclunk> rogpeppe: So WaitAdvance would continue even though there isn't anything in waiting.
<rogpeppe> babbageclunk: unless someone's passing zero duration to time.After, how could there be nothing in the waiting queue?
<rogpeppe> babbageclunk: (assuming the waiting queue is drained before you call Advance)
<rogpeppe> babbageclunk: hmm, i guess if you advance a long way
<babbageclunk> rogpeppe: Yeah, that would clear multiple waiters, potentially.
<rogpeppe> babbageclunk: what's WaitAdvance?
<rogpeppe> babbageclunk: is it something added in a feature branch?
<babbageclunk> rogpeppe: Oh, this is the proposed solution. It waits until there are waiters before advancing.
<rogpeppe> babbageclunk: that's never going to be sufficient
<rogpeppe> babbageclunk: because you need to wait for *all* the waiters
<babbageclunk> rogpeppe: We're aware that the name isn't very clear :(
<rogpeppe> babbageclunk: the problem is always going to be that you need to know exactly how many entities there are that will be waiting
<babbageclunk> rogpeppe: True
<rogpeppe> babbageclunk: which is always going to be fragile
<babbageclunk> rogpeppe: But so's advancing before the waiters get there.
<rogpeppe> babbageclunk: well exactly. you always need to wait for everyone to block on the clock
<rogpeppe> babbageclunk: before advancing
<babbageclunk> rogpeppe: Another possibility would be to have the waiters specify an absolute time instead of a delta.
<rogpeppe> babbageclunk: i've been toying with that idea
<rogpeppe> babbageclunk: but often waiters will actually want to wait for a certain length of time
<rogpeppe> babbageclunk: so they'll call time.Now
<rogpeppe> sorry, clock.Now
<rogpeppe> babbageclunk: and that will depend on whether advance has been called or not
<rogpeppe> babbageclunk: so i think you'll end up back in the same situation
<babbageclunk> rogpeppe: yup :/
<rogpeppe> babbageclunk: one possibility might be to assume that the number of goroutines remains constant
<rogpeppe> babbageclunk: then the first time you wait for a while to gather them all, then always wait for that number of alarms in the future
<rogpeppe> babbageclunk: but that's not great either
<rogpeppe> babbageclunk: another possibility, perhaps better but more work, is this:
<rogpeppe> babbageclunk: if you start a goroutine that's going to block, then register it with the clock
<rogpeppe> babbageclunk: and unregister when it exits
<rogpeppe> babbageclunk: then you know exactly how many alarms to wait for
<babbageclunk> rogpeppe: Makes sense, but that's really invasive.
<rogpeppe> babbageclunk: yup
<rogpeppe> babbageclunk: it could also potentially be useful for other things though - it could give an idea of how many long-lived workers there are in the system.
<rogpeppe> babbageclunk: hmm, but it doesn't work in general
<rogpeppe> babbageclunk: because most workers don't block on time events.
<rogpeppe> babbageclunk: i think that you've bitten off a problem that's too hard to chew
<rogpeppe> babbageclunk: i don't think there's going to be any decent solution to this in the large scale
<babbageclunk> rogpeppe: But we already have intermittent failures for tests using clock.Advance.
<rogpeppe> babbageclunk: yup, i'm not that surprised
<babbageclunk> rogpeppe: Fixing the easier constant number of waiters problem still seems worthwhile even though we can't do the broader one.
<rogpeppe> babbageclunk: the problem is that the number of waiters won't remain constant
<rogpeppe> babbageclunk: for example, you might have a worker that mostly blocks on a network event, but occasionally makes a call with a retry and wait
<babbageclunk> rogpeppe: Well, if we make the number of waiters a parameter then the test can change as it needs.
<rogpeppe> babbageclunk: it's gonna be inevitably fragile because the test needs to know exactly all the things underneath that can block, and that's an implementation detail
<babbageclunk> rogpeppe: True, but I can't see any other way.
<rogpeppe> babbageclunk: i suspect that clock mocking is useful in the small scale but becomes an anti-pattern in more integration-level tests
<babbageclunk> rogpeppe: Well, other than not using clock
<babbageclunk> rogpeppe: Yeah, I think you're right
<rogpeppe> babbageclunk: at that level, better just to have very small durations for waiting and use wall clocks
<babbageclunk> rogpeppe: Nothing's ever easy! :(
<rogpeppe> babbageclunk: :)
<rogpeppe> babbageclunk: sometimes it's better to cut yer losses
<babbageclunk> rogpeppe: Thanks for the discussion/pep-talk anyway.
<rogpeppe> babbageclunk: np. it's a very interesting area.
 * babbageclunk lunches
 * perrito666 forgot to tell hr that today is a local holiday... and also forgot himself
<frankban> hey, is anyone available for a second review for https://github.com/juju/juju/pull/6352 ? thanks
<rogpeppe> frankban: i'll take a look
<rogpeppe> frankban: it seems a little weird that we serve the default icon instead of a not-found error for any file in the archive at all. what's the rationale behind it?
<rogpeppe> frankban: oh, i see now, it's only if you ask for the icon
<frankban> rogpeppe: yes
<rogpeppe> frankban: reviewed
<frankban> rogpeppe: ty
<redir> babbageclunk: frobware anastasiamac macgreagoir https://github.com/juju/testing/pull/113 is ready for review PTAL
<rogpeppe> add official-dns-name controller attribute: https://github.com/juju/juju/pull/6363; reviews appreciated, thanks.
<babbageclunk> katco: ping?
<voidspace> wwitzel3: ping
<voidspace> wwitzel3: actually, unping - but hi o/
<voidspace> wwitzel3: gotta nip off - will catch you in a bit :-)
<voidspace> review for someone https://github.com/juju/juju/pull/6364
<katco`> babbageclunk: hey
<voidspace> dooferlad: ooh, thanks for the review
<babbageclunk> katco`: Hey, I'm chasing an intermittent test failure in https://github.com/juju/juju/blob/master/worker/logforwarder/logforwarder_test.go#L221
<alexisb> happy friday all!
<voidspace> alexisb: o/ happy friday
<katco`> babbageclunk: ok
<babbageclunk> katco`: Trying to understand the logforwarder - do you know why the loop method has two goroutines in https://github.com/juju/juju/blob/master/worker/logforwarder/logforwarder.go#L188
<katco`> babbageclunk: confused; i only see 1?
<katco`> babbageclunk: do you mean why does it start a goroutine when it's running in a catacomb goroutine?
<babbageclunk> katco`: Oh, sorry - you're right. I mean, why the internal goroutine pushes records over to the main loop via a channel, rather than just doing sender.Send() itself.
<katco`> babbageclunk: ah, let me read through the code rq
<babbageclunk> katco`: Thanks!
<natefinch> this may be a remnant of the fact that it used to be two separate workers
<katco`> this is some sloppy code...
<katco`> e.g. why is L189 not after L188
<katco`> doh L187 after L188
<babbageclunk> katco`: Oh, yeah - that's pretty dangerous.
<babbageclunk> natefinch: Yeah, that would explain it a bit.
<natefinch> stream is used in the closure
<katco`> natefinch: right but it's not used outside the closure
<natefinch> oh ha
<natefinch> really, that whole closure should have been pulled out into a separate function.  No sense putting almost 40 lines of code inside another function.
<natefinch> Also would make the interdependencies more obvious
<katco`> there's no need for the member variable enabled either. could use enabledCh, have: func (l *LogForwarder) Enabled() bool { _, ok := <- l.enabledCh; return ok == false }
<katco`> lf.waitForEnabled, and then we do if !enabled?
<katco`> babbageclunk: my guess is that it's in a goroutine because we're not using channels for streaming records; so we have to obtusely convert a method into a channel
<katco`> babbageclunk: if it were a pipeline all the way down, and stream returned a channel, we could just have one loop, one select
<katco`> babbageclunk: does that make sense? if all you have is a method to retrieve records, you can't use it in a select block unless you create a goroutine to convert the calls into a channel
<babbageclunk> katco`: Ah, makes sense - because stream.Next() blocks?
<katco`> babbageclunk: yeah, and it must be a preemptable op
<katco`> babbageclunk: this is why i advocate for API's returning channels. bc you end up having to do this dance
<babbageclunk> katco`: Thanks! That's much clearer.
<katco`> select is way more fundamental of an op than people realize; it's how modules come together sanely
<katco`> babbageclunk: hth
<katco`> babbageclunk: if you're touching this code, please try and clean it up a bit? again, waitForEnabled could be returning whether the enabled chan is closed or not
<babbageclunk> katco`: definitely - I'm beginning to get a feel for this but it's still not second nature.
<natefinch> awesome article on using channels in APIs, for anyone who hasn't seen it: https://inconshreveable.com/07-08-2014/principles-of-designing-go-apis-with-channels/
<babbageclunk> katco`: ok, will do. Trying to understand the test now.
<katco`> gets rid of a variable and a bunch of locking
 * babbageclunk clicks
<babbageclunk> the link I mean.
<katco`> babbageclunk: it's probably too much surgery/controversial to change stream.Next() to channels, but i would encourage you to pull that goroutine out into a method that has a clear name
<babbageclunk> katco`: Yeah, I'll definitely do that - seems like a nice Friday afternoon refactoring task.
<natefinch> +! ^
<natefinch> +1 that is :)
<alexisb> anastasiamac, ping
<anastasiamac> alexisb: pong?
<babbageclunk> + it might help me understand the timing issue causing the intermittent failure.
<alexisb> anastasiamac, can you jump in a HO with me real fast
<anastasiamac> alexisb: i can. let me get ears :)
<alexisb> ok, anastasiamac I will meet you on teh a-team bluejeans
<anastasiamac> k
<rick_h_> alexisb: anastasiamac looks like they're using 1.25?
<alexisb> katco ping
<alexisb> natefinch, ping
<natefinch> alexisb: yo
<redir> babbageclunk: https://github.com/juju/juju/pull/6366 PTAL
<alexisb> heya natefinch can you come join the party
<natefinch> alexisb: I love parties
<alexisb> https://bluejeans.com/5036865018
<alexisb> o you are really going to love this party
<babbageclunk> ominous
<katco`> alexisb: sorry, i have 2 nicks in here right now =/
<alexisb> babbageclunk, natefinch was soooooo glad he joined the party ;)
<natefinch> power pc AND simplestreams.  I'm so lucky
<natefinch> oh, and 1.25
<katco`> =|
<babbageclunk> d'oh
 * redir backs away slowly
<redir> s/quickly
<natefinch> lol
 * natefinch jumps on the grenade
<alexisb> katco natefinch may need your morale support today
<alexisb> :)
<natefinch> http://i.imgur.com/YQ6kI9z.gif
<katco`> i'm afraid i'm fresh out of morale =|
<natefinch> heh
<katco`> v1/2 -> v3 charm url has drained me
<babbageclunk> katco`: ouch! Sorry!
<katco`> babbageclunk: yeah i had to pull in the latest charm lib. breaks a bunch of tests. like a lot.
<babbageclunk> katco`: I know - I fixed most of them! In a branch that isn't landed yet. :(
<katco`> babbageclunk: well too late now ;p
<babbageclunk> Oh no.
<katco`> babbageclunk: https://github.com/juju/juju/pull/6348
<katco`> babbageclunk: doesn't have my latest pushed up yet
<babbageclunk> katco`: I updated juju/names to handle the new charm urls: https://github.com/juju/names/pull/74
<rogpeppe> any chance of a review of this branch, by any chance? https://github.com/juju/juju/pull/6363
<rogpeppe> i need a review from someone in juju-core please
<katco`> babbageclunk: yeah: https://github.com/juju/juju/pull/6348/files#diff-bc1c339eba7450f0fe12ab61e4e1987aR81
<babbageclunk> katco`: The thing that stalled me was that the cmd/juju/application tests were failing because the charmstore would send bundle URLs back in the V3 format (which doesn't include "bundle")
<alexisb> voidspace, do you have a moment to review rogpeppe PR https://github.com/juju/juju/pull/6363
<babbageclunk> alexisb: free for a hangout?
<redir> mgz: yt?
<mgz> redir: yo
<redir> mgz: there's two things that landed this week that will hopefully reduce or remove some intermittent issues. 1. a patch in patches to fix a race in mgo.Stats 2. an update to the testing clock.
<redir> 1. mgz there's a change in clock at https://github.com/juju/testing/blob/master/clock.go#L107 which should show up if we experience intermittent failures like https://bugs.launchpad.net/juju/+bug/1607044
<mup> Bug #1607044: WorkerSuite.TestUpdatesAfterPeriod timed out <ci> <intermittent-failure> <regression> <unit-tests> <juju:In Progress by reedobrien> <https://launchpad.net/bugs/1607044>
<mgz> gotcha, so we want to look out for that new message in failures
<mgz> and also see if there's an obvious change in incidence
<redir> mgz exactly
<redir> and the other is related to https://bugs.launchpad.net/juju/+bug/1604817
<mup> Bug #1604817: Race in mgo Stats implementation <ci> <intermittent-failure> <race-condition> <regression> <unit-tests> <juju:In Progress by reedobrien> <https://launchpad.net/bugs/1604817>
<redir> which might fix various tests where that race is the cause.
<redir> mgz I plan to send an email to the qa list, but thought I'd mention it in case I forget over the long weekend.
<redir> :)
<mgz> redir: thanks!
<alexisb> babbageclunk, I am now
<babbageclunk> alexisb: ok, jumping into the hangout
<alexisb> babbageclunk, I am there
<smoser> i'm trying to get a cloud-utils upload into yakkety. it is blocked on juju's dep8 test due to failures on ppc64:
<smoser>  http://autopkgtest.ubuntu.com/packages/j/juju-core-1/yakkety/ppc64el
<smoser> and:
<smoser>   http://autopkgtest.ubuntu.com/packages/j/juju-core-1/yakkety/amd64
<smoser> (linked to from http://people.canonical.com/~ubuntu-archive/proposed-migration/update_excuses.html)
<smoser> i really do not think cloud-utils is related at all, as the only change to that package is in mount-image-callback, which i'm pretty sure is not used by juju
<smoser> anyone able to help refute or confirm that ?
<mgz> Setting up lxc1 (2.0.4-0ubuntu4) ...
<mgz> ERROR: ld.so: object 'libeatmydata.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
<mgz> looks like your rebuild is being unfairly blamed
<mgz> but, does seem to be a real issue of some kind
<mgz> actually, maybe it is the utils change
<mgz> "Tools checksum mismatch"
<mgz> is not the clearest message
<mgz> but seems to be the final failure?
<mgz> smoser: we can run these tests in CI, and perhaps with extra debugging
<smoser> mgz, did you look at ppc64el or amd64
<smoser> they fail differently
<katco`> babbageclunk: you still around?
<babbageclunk> katco`: yup yup
<katco`> babbageclunk: what's the new bundle url format for v3?
<babbageclunk> It's not distinguishable from the charm url
<babbageclunk> katco`: ^
<katco`> babbageclunk: do i specify a series for a bundle then?
<mgz> smoser: the amd64 failures all seem to be testbed related
<mgz> it's possible the ppc64 one is as well, in a more subtle way
<babbageclunk> katco`: No, I don't think so. Some more details here: https://bugs.launchpad.net/juju/+bug/1584193
<mup> Bug #1584193: juju deploy <bundle> is in a different form than jujucharms.com <2.0> <landscape> <usability> <juju:In Progress by 2-xtian> <https://launchpad.net/bugs/1584193>
<mgz> perhaps we want to poke pitti about it?
<katco`> babbageclunk: ta. if i specify "wordpress-with-endpoint-bindings/1" i get "series not specified"
<babbageclunk> katco`: In deploying on command line or in tests?
<katco`> tests
<babbageclunk> right - in that case it's going through a charmstore running locally and the url that comes back is in the V3 format.
<babbageclunk> babbageclunk: And there's code in the deploy command looking for series == "bundle"
<babbageclunk> duh, who am I again?
<babbageclunk> katco`: ^
<katco`> babbageclunk: https://github.com/juju/juju/blob/master/cmd/juju/application/bundle_test.go#L124
<katco`> babbageclunk: "cannot post archive: series not specified"
<katco`> babbageclunk: i've changed the test to be wordpress-with-endpoint-bindings/1
<katco`> babbageclunk: seems like it's an error on uploading the bundle... maybe i need to pull in a newer copy of charmrepo?
<babbageclunk> katco`: Ah, I fixed that (in my branch) but then got to the next problem and had an existential crisis.
<babbageclunk> katco`: Hang on - finding link
<katco`> babbageclunk: well maybe we can meet halfway :) i think i had already fixed what you ran into
<katco`> honestly it's all becoming a bit of a blur
<babbageclunk> katco`: same - sorry you're running into all of this too.
<babbageclunk> katco`: So I've got a local change that sets the series to bundle when uploading http://paste.ubuntu.com/23256099/
<babbageclunk> katco`: Then the deploy line in the test fails because the urls coming back from the local charmstore don't indicate that it's a bundle.
<katco`> babbageclunk: this looks... scary. if i introduced this, what exactly are we testing? you say deploy has something similar?
<babbageclunk> katco`: I'm certainly not sure it's the right thing to do.
<katco`> babbageclunk: hm... thanks for the diff. i will think on this for a few mins
<babbageclunk> katco`: Here's the bit of deploy that is checking for series == "bundle". If this fails then it falls back to assuming it's a charm and you get UnsupportedSeries errors back.
<babbageclunk> https://github.com/juju/juju/blob/master/cmd/juju/application/deploy.go#L882
<katco`> babbageclunk: ah, i remember that bit! :)
<babbageclunk> katco`: basically all of this stems from the change I made in charm splitting URL.String() and URL.Path().
<babbageclunk> katco`: I think it's possible to fix the tests by changing the charmstore code to use url.Path() where it's currently using .String()...
<babbageclunk> katco`: But I'm not sure that's the right thing to do.
<babbageclunk> katco`: Or another fix would be to make charm urls distinct from bundle urls by including "bundle" in the V3 format.
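The `series == "bundle"` check babbageclunk links to relies on the pre-V3 convention that bundles carry the pseudo-series "bundle" in their URL — which is exactly what the V3 format drops, so charms and bundles become indistinguishable. A minimal illustration of that convention (a hypothetical helper, not the actual deploy.go code, which uses the charm library's URL type):

```go
package main

import (
	"fmt"
	"strings"
)

// isBundleURL reports whether a charm store URL of the form
// "cs:[~user/][series/]name-revision" refers to a bundle under the
// pre-V3 convention, where bundles use the pseudo-series "bundle".
func isBundleURL(url string) bool {
	s := strings.TrimPrefix(url, "cs:")
	parts := strings.Split(s, "/")
	// Skip a leading ~user segment if present.
	if len(parts) > 0 && strings.HasPrefix(parts[0], "~") {
		parts = parts[1:]
	}
	// With a series segment, the URL looks like "series/name-rev".
	return len(parts) == 2 && parts[0] == "bundle"
}

func main() {
	fmt.Println(isBundleURL("cs:bundle/wordpress-simple-1"))    // true
	fmt.Println(isBundleURL("cs:xenial/wordpress-5"))           // false
	fmt.Println(isBundleURL("cs:~charmers/bundle/mediawiki-6")) // true
}
```

In the V3 format the series segment is gone entirely, so a check like this has nothing to look at — hence the fallback to assuming a charm and the resulting UnsupportedSeries errors.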
<voidspace> alexisb: rogpeppe: sorry, missed your message - it has a couple of approves already, but I can see changes since those
<voidspace> alexisb: rogpeppe: I'll take a look
<alexisb> voidspace, thanks, we just want to make sure we get one person from core to review
<voidspace> alexisb: ah, cool
<perrito666> katco`: seen this? https://mobile.twitter.com/filler/status/781175066742534144
<katco`> perrito666: yes :)
<admcleod> hi guys! with a workload state of blocked, how often will update-status retry?
<admcleod> retry/execute/whatever
<voidspace> rogpeppe: yup, LGTM
<admcleod> babbageclunk: hi! ;)
<admcleod> anastasiamac_: hello also :) any ideas ^^?
<babbageclunk> admcleod: hi! I don't know off the top of my head, and this isn't a great time for code-spelunkery, sorry!
<anastasiamac_> admcleod: perrito666, do u know ^^
<admcleod> sinzui: ^^^^ :}
<perrito666> anastasiamac_: 5 something, sec or min, cant recall which
<admcleod> perrito666: and then apparently it increases?
<perrito666> admcleod: ^ gimme a sec and I'll look
<natefinch> update-status is every 5 or 10 minutes, I forget which
<perrito666> 5 minutes
<babbageclunk> admcleod: doesn't look like it changes
<perrito666> it doesnt
<perrito666> it runs every 5 mins
<admcleod> http://paste.ubuntu.com/23256225/
<natefinch> admcleod: I wonder if other hooks are preventing it from running on time
<kwmonroe> admcleod: i did a --replay and the output didn't change from your pastebin.. 'active' charms update-status every 5.  'blocked' charms are every 25.  (at least in this case)
<perrito666> other things might be preventing the update-status hook from being invoked
<kwmonroe> maybe, but show-status-log isn't telling...
<kwmonroe> 30 Sep 2016 15:08:36Z	juju-unit	idle       	
<kwmonroe> 30 Sep 2016 15:12:55Z	juju-unit	executing  	running update-status hook
<kwmonroe> 30 Sep 2016 16:56:21Z	         	idle       	last 3 statuses repeated 20 times
<admcleod> i really dont think anything else will be getting in the way
<admcleod> kwmonroe: what version juju?
<kwmonroe> rc1
<perrito666> admcleod: the code running that hook is very different since I was last there so cant give you a more complete answer without some digging
<kwmonroe> ya know what's interesting admcleod?  the last line of show-status-log above says "repeated 20 times", which is 100 minutes if update-status runs every 5... which is the time difference between the last 2 messages in that output.
<admcleod> kwmonroe: ah
<kwmonroe> i reckon i'll just watch the stupid thing for 5 minutes and put this to rest.
<kwmonroe> ha!  don't have to.  admcleod, machine logs on the unit show it running every 5.  not sure why debug-log --replay showed every 5th one.
<kwmonroe> (and did so consistently over the last 1.5 hours)
<admcleod> kwmonroe: nice one, thanks
<katco`> wishing these tests didn't spin up an entire charmstore and actually deploy charms right about now
<rogpeppe> voidspace: thanks!
<hml> while trying to test goose library changes in juju, i'm hitting the known issue where godeps leaves your git repository in a detached head state.  and my changes don't seem to take.
<hml> google ways to fix involve not using godeps restore or something.
<hml> does anyone have a better way in juju land?
<natefinch> hml: git checkout master
<natefinch> hml: if you just want to get the repo back into a known state
<natefinch> hml: not sure exactly what you're doing, but I have hit similar problems with gopdes
<natefinch> godeps
<hml> natefinch: ideally i'd like to test juju using the testservices from my goose branch - it's not in the master yet because testing isn't complete.
<natefinch> hml: ahh, ok.
<natefinch> hml: so, run godeps -u dependencies.tsv for juju.  *then* switch your goose code to your branch, then build juju
<natefinch> make sure your goose code is in the same directory as the default (so, $GOPATH/src/gopkg.in/goose.v1/)
<natefinch> the way I usually do it is to just git remote add <my_fork>
<hml> natefinch: iâm working in a branch off of <my_fork>â¦ maybe thatâs the problem
<natefinch> hml: totally doesn't matter. go install/go build etc only care about what code is in $GOPATH/src/gopkg.in/goose.v1/   it doesn't know about git or branches or anything
<natefinch> the *only* thing that understands git / vcs at all is "go get"  after the code is on your drive, the rest of the commands only look at the files on disk in the directories they're told to look at by import statements
<hml> natefinch: running off to try this
<hml> natefinch: got it working - thank you, i think i was confused about how often to run godeps to update the dependencies… i'm picking up my code to test now.
<natefinch> hml: yeah, you only have to do that once, immediately after you switch branches
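The workflow natefinch describes, as a command sequence (paths and the fork URL are illustrative; `my-branch` stands in for whatever branch holds the goose changes):

```shell
# 1. Pin all of juju's dependencies to the revisions in dependencies.tsv.
#    This is the step that leaves checkouts in detached-head state,
#    and it only needs to run once, right after switching branches.
cd "$GOPATH/src/github.com/juju/juju"
godeps -u dependencies.tsv

# 2. *Then* switch the goose checkout to your branch. go build only
#    looks at the files on disk under the canonical import path, so
#    the code must live at $GOPATH/src/gopkg.in/goose.v1/.
cd "$GOPATH/src/gopkg.in/goose.v1"
git remote add myfork https://github.com/<you>/goose.git  # hypothetical fork
git fetch myfork
git checkout myfork/my-branch

# 3. Build juju against the modified goose.
cd "$GOPATH/src/github.com/juju/juju"
go install ./...
```

The ordering matters: running godeps again after step 2 would pin goose back to the revision in dependencies.tsv and silently discard the branch checkout.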
<papertigers> Where can I find what uuid the cloud provider of choice is using for bootstrap?
#juju-dev 2016-10-01
<mup> Bug #1512566 changed: Panic in deployerSuite unittests while attempting to connect to mongo <ci> <test-failure> <unit-tests> <juju-core:Fix Released> <https://launchpad.net/bugs/1512566>
<mup> Bug #1534643 changed: cookies file locked for too long <2.0-count> <ci> <intermittent-failure> <juju:Fix Released> <juju-core:Invalid> <https://launchpad.net/bugs/1534643>
<mup> Bug #1438274 changed: unit test failures: storeManagerStateSuite.TestStateWatcher <ci> <juju-core:Fix Released> <https://launchpad.net/bugs/1438274>
<mup> Bug #1494798 changed: Juju fails to report it cannot create buckets <ci> <ec2-provider> <storage> <juju-core:Fix Released> <https://launchpad.net/bugs/1494798>
<mup> Bug #1497301 changed: mongodb3  SASL authentication failure <ci> <mongodb> <unit-tests> <juju:Incomplete> <https://launchpad.net/bugs/1497301>
#juju-dev 2017-09-25
<thumper> review someone please: https://github.com/juju/names/pull/83
#juju-dev 2017-09-26
<kjackal_> Hey juju core people! I wanted to testdrive cross-controller relations for kubernetes-CDK. Got 2.3-alpha1-xenial-amd64 from edge snap. The CDK deployments fail. There is an error on the uniter in non leader subordinate charm flannel. Have you seen this before?
<rick_h> kjackal_: someone was mentioning this yesterday I think. The teams' in NY for the sprint atm so not sure if anyone's poked at it
<kjackal_> thanks rick_h. The discussion yesterday was here on irc or was it on an email thread?
<rick_h> kjackal_: someone in irc just asked if edge was broken with the same looking story
<kjackal_> thank you
<rick_h> kjackal_: just sorry I don't have better news. Maybe file a bug and we can poke the folks in NY as they're getting going today
<rick_h> one of the few times thumper and company are around now :)
<thumper> o/
<thumper> I don't know of anything but the CDK folk are around somewhere
<thumper> we should probably get them involved
<kjackal_> thumper: yeap we are at nyc as well. I just wanted to check with you people before i go opening bugs
<kjackal_> thumper: which room are you?
<thumper> 14.09
<kjackal_> should I drop by in 10-20 minutes when i have the deployment failing? is it a good time?
<kjackal_> thumper: ^
<thumper> https://github.com/juju/juju/pull/7875
<thumper> babbageclunk: ^^
#juju-dev 2017-09-27
<thumper> babbageclunk: https://github.com/juju/juju/pull/7875
<babbageclunk> thumper: https://github.com/juju/1.25-upgrade/pull/47
#juju-dev 2017-09-29
<rick_h> thumper: ping, see the PR on the juju version go by. Just wanted to check is the actual need functionality available or juju version specifically?
<wallyworld> rick_h: zup
<rick_h> wallyworld I see the PR on the juju version go by. Just wanted to check is the actual need functionality available or juju version specifically?
<rick_h> wallyworld: I always get afraid checking juju version numbers (especially point releases) will turn into nightmares down the road
<wallyworld> we've sort of been down this road a bit before - we couldn't agree on a clean way to advertise capabilities. the charm folks just want version
<rick_h> wallyworld: and much prefer feature checks vs version checks and we went through a lot of this when we added the min_juju_version in charms in an attempt to not specify a single version but make sure things assumed forward compatible
<rick_h> wallyworld: ok, just wanted to dbl check it wasn't a drive by thing without bigger audience
<wallyworld> juju_version is analogous to min version
 * rick_h sits in his desk and sees single PR since he's far away and wanted to dbl check
<wallyworld> no worries, thanks for asking
<wallyworld> there's still time to refine before relase
<wallyworld> i'll be syncing with the charm folks again
<rick_h> wallyworld: yea, I'd be curious how the charmers will make sure that there's common tools for charm authors to express to users "this failed/errored/won't work because of your juju version" clearly
<rick_h> so that users don't have to debug the stuff
<rick_h> min_juju_version at least fails early and clearly while this could be in deep hook context during some relation/config change and such
<wallyworld> rick_h: it's more that the openstack folks want to say "what version of juju is running, if not 2.3 or later, then X won't be there"
<rick_h> wallyworld: right, but once the tool is there people will use it how they wish
<rick_h> wallyworld: not just in the clean way some folks want to use it
<wallyworld> that's true of any feature right?
<wallyworld> it's meant to alleviate the need for charm helpers to have all these try/except blocks
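The kind of check this discussion is about — "if the running juju is not 2.3 or later, feature X won't be there" — comes down to a version comparison like the one below. This is a generic sketch, not actual charm-helpers or juju code; the function name and suffix handling are assumptions:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// atLeast reports whether a juju version string such as "2.3.1" or
// "2.3-beta1" is at least the given major.minor. Unparseable input
// conservatively reports false (i.e. "feature not available").
func atLeast(v string, major, minor int) bool {
	// Strip any -beta1 / -rc2 style suffix before splitting.
	if i := strings.IndexByte(v, '-'); i >= 0 {
		v = v[:i]
	}
	parts := strings.Split(v, ".")
	if len(parts) < 2 {
		return false
	}
	mj, err1 := strconv.Atoi(parts[0])
	mn, err2 := strconv.Atoi(parts[1])
	if err1 != nil || err2 != nil {
		return false
	}
	return mj > major || (mj == major && mn >= minor)
}

func main() {
	fmt.Println(atLeast("2.3.1", 2, 3)) // true: use feature X
	fmt.Println(atLeast("2.2.4", 2, 3)) // false: fall back
}
```

This is also where rick_h's worry bites: a check like this deep in hook context fails quietly at relation/config time, whereas min_juju_version fails early and visibly at deploy.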
#juju-dev 2019-09-23
<mup> Bug #1844975 opened: Published 1.25.x streams for Bionic are buggy <juju-core:New> <https://launchpad.net/bugs/1844975>
<mup> Bug #1844975 changed: Published 1.25.x streams for Bionic are buggy <juju-core:New> <https://launchpad.net/bugs/1844975>
