#juju-dev 2012-04-09
<niemeyer> Good mornings
<fwereade_> niemeyer, heyhey; nice holiday?
<niemeyer> fwereade_: Heya!
<niemeyer> fwereade_: Yeah, nice overall.. went to my dad's for a nice barbecue. Ale was working over most of it, though, which removed some of the feeling of easter.
<niemeyer> fwereade_: How about you, good stuff there?
<fwereade_> niemeyer, aw :(
<fwereade_> niemeyer, had a lovely easter actually; went out with cath's family for *awesome* food, and a spot of light walking in the countryside afterwards
<niemeyer> fwereade_: Oh, sounds nice indeed
<niemeyer> fwereade_: What's the countryside like there?
<fwereade_> niemeyer, at this time of year, both rocky and green, once you actually get out there
<fwereade_> niemeyer, lots of flowers
<fwereade_> niemeyer, over summer it's a bit brown but still really rather lovely
<niemeyer> fwereade_: Ah, colorful then
<niemeyer> (at this time)
<fwereade_> niemeyer, yeah :)
<niemeyer> mthaddon: ping
<niemeyer> Lunch, biab
<hazmat> fwereade_, re error on private cloud with constraints.. http://pastebin.ubuntu.com/922019/
<hazmat> ic the problem
<hazmat> fwereade_, effectively required constraints aren't populated for private clouds.. http://pastebin.ubuntu.com/922057/
<hazmat> az and instance type are still valid to specify even lacking the ability to define the vocabulary; the values should pass through
<hazmat> fwereade_, i ended up just putting in a bandaid to remove the using_amazon check when setting up ec2 provider constraints in r518.. else private cloud usage via the ec2 api was broken.. as the provider requires both constraint keys defined in several places, and the default selection of instance-type m1.small relies on it as well.
 * hazmat moves on to checking out the sub agent branch
<fwereade_> hazmat, thanks, and sorry
<fwereade_> hazmat, but I would say that surely we should in fact be getting values rather than getiteming them, rather than exposing meaningless values through the UI?
<hazmat> fwereade_, well typically the standard values do have meaning in private contexts, just not the ones we think they do.. i looked at doing gets.. but that's also problematic because the default-instance-type is resolved via constraint
<hazmat> which fails when the constraint itself isn't defined
<hazmat> what a tangled web we weave when first we practice to abstract ;-)
<fwereade_> hazmat, I'm confused; default-instance-type short-circuits constraints
<hazmat> fwereade_, it does.. but say you don't have it defined
<hazmat> fwereade_, then the default value comes from generic constraints
<hazmat> but those are only set up upon registering generics
<hazmat> ie. instance-types
<fwereade_> hazmat, hmmm, yeah, got you; I'd imagined people were always using default-instance-type and default-image together :(
 * hazmat grabs some snacks and calls it lunch
<hazmat> fwereade_, that might be the case for some, but not all.
<fwereade_> hazmat, clearly so :(
<fwereade_> hazmat, everyone uses default-image-id, though, right?
<hazmat> fwereade_, on private clouds yes
<fwereade_> hazmat, jolly good
<fwereade_> hazmat, when you have a mo, please let me know whether you feel http://paste.ubuntu.com/922279/ is trivial enough to be a trivial
<fwereade_> hazmat, tiny tweak to above, http://paste.ubuntu.com/922285/
<hazmat> fwereade_, hmm.. i don't see this as particularly better.. the common vocabulary would suggest common usage even if we can't guarantee the value mapping
<hazmat> although likely the ec2 zone is wrong given its part in value construction
<hazmat> fwereade_, also i noticed that passing instance-type when it wasn't defined didn't trigger a user error
<hazmat> fwereade_, ie. given this patch the user wouldn't ever be able to specify via constraint any instance type; they'd have to fall back to specifying it via hot swapping in environments.yaml... even though the value they want to give would work if just passed through
<fwereade_> hazmat, fair points;
<fwereade_> hazmat, I shall think on them
<fwereade_> hazmat, thanks
<hazmat> fwereade_, np.. i think the underlying issues are the same as when this was implemented just that we haven't had time to put in place the right solutions.. i mean from a different perspective.. --constraints are exposed on private clouds, if usage of them is ignored without warning that's a bad ux. the ideal scenario is of course the same, constraints work everywhere, but that requires both image mapping and instance type mapping definitions, which i think we agree is the long term solution.
<fwereade_> hazmat, yeah, I think I'm converging on an implementation of that
#juju-dev 2012-04-10
<niemeyer> Alow
<andrewsmedina> niemeyer: hi :D
<niemeyer> andrewsmedina: yo
<andrewsmedina> niemeyer: everything ok?
<niemeyer> andrewsmedina: Yep, things running well
<niemeyer> andrewsmedina: On the quiet side given the holidays
<niemeyer> andrewsmedina: at least on the Go side of things.. py folks pushing hard
<andrewsmedina> niemeyer: you have any issue for me?
<niemeyer> andrewsmedina: The one I suggested last time still looks like a good one to handle, unless you'd like something else for whatever reason
<andrewsmedina> niemeyer: is that issue done in the python version?
<niemeyer> andrewsmedina: Yeah, I believe so
<niemeyer> andrewsmedina: hazmat can confirm, though
<hazmat> greetings
<hazmat> andrewsmedina, re env? no
<andrewsmedina> niemeyer: I need to talk with you about juju working on centos. I'm needing that a lot :(
<hazmat> andrewsmedina, it should be fairly straight forward from a software perspective.. the charms though are not portable
<hazmat> so a separate distro needs effectively its own set of charms
<hazmat> and we would need cloud-init to get the machine initialization in place
<andrewsmedina> hazmat: juju commands are restricted to apt
<hazmat> andrewsmedina, the apt-specific usages there are pretty minimal
<hazmat> the biggest work would be making cloud-init run on centos.. amazon has done that already to some extent with their linux ami.. actually it looks like fedora support is already there.
<hazmat> https://bugs.launchpad.net/cloud-init/+bug/883286
<andrewsmedina> hazmat: nice
<niemeyer> Stepping out for the day, or perhaps back later.. cheers all
<hazmat> niemeyer, cheers
<SpamapS> *argh*
<SpamapS> constraints totally rewrote the same stuff ssl verification touches
<SpamapS> :-P
 * SpamapS puts on his merge pants
<SpamapS> wow sheesh, this is like.. a huge refactor
<SpamapS> tests changed.. whole structure changed
#juju-dev 2012-04-11
<SpamapS> alright, not so bad once I got into it
 * SpamapS pushes fix for SSL cert verification into juju
<hazmat> SpamapS, woot!
<hazmat> SpamapS, meld is your friend on the nasty merges
<hazmat> i typically invoke via bzr qconflicts
<SpamapS> hazmat: none of it was really nasty actually. We both just legitimately changed the same path in a lot of places, but luckily it never got that ugly
<SpamapS> most of it was that get_current_ami changed to something much more elegant..
<SpamapS> and what I did was very non-intrusive..
<SpamapS> hazmat: oo, never used that
<SpamapS> hazmat: know any good tools for analyzing aggregate code stats for python?
<SpamapS> I am putting together a little retrospective for the evolution of the juju codebase over the last 6 months.
<hazmat> SpamapS, you mean want LOC over time?
 * hazmat gives up on english
<SpamapS> LOC and maybe test coverage though I know I have to actually *run* the tests to do that
<SpamapS> hazmat: I can find a lot of tools to do per-file stats
<SpamapS> but I just want to point at a dir and get some interesting numbers
<hazmat> SpamapS, nothing python specific comes to mind for that.. you could try just pushing it to ohloh
<SpamapS> oooo
<SpamapS> yeah that might work :)
<SpamapS> argh.. yet another site to login to
<hazmat> SpamapS, else.. scripting http://cloc.sourceforge.net/ with bzr -u revno
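hazmat's suggestion of driving cloc across a range of bzr revisions could be scripted roughly like this (a sketch in Go, assuming `bzr` and `cloc` are on the PATH and the working directory is a bzr tree; the revno range is a placeholder):

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
)

func main() {
	// Placeholder revision range; adjust for the branch at hand.
	for revno := 400; revno <= 520; revno += 10 {
		// Roll the working tree to the given revision.
		if err := exec.Command("bzr", "update", "-r", fmt.Sprint(revno)).Run(); err != nil {
			log.Fatalf("bzr update -r %d: %v", revno, err)
		}
		// cloc prints a per-language LOC summary for the tree.
		out, err := exec.Command("cloc", "--quiet", ".").Output()
		if err != nil {
			log.Fatalf("cloc at revno %d: %v", revno, err)
		}
		fmt.Printf("=== revno %d ===\n%s", revno, out)
	}
}
```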
<SpamapS> hazmat: 96 test coverage... not too shabby
<SpamapS> 96% I should say
<SpamapS> ahh and the worst file is juju/lib/lxc/__init__ .. which would be better if I had not skipped those tests :)
<hazmat> yeah.. we dropped 2% since last release
<hazmat> sloppy ;-)
<SpamapS> I got really weird errors when running coverage tests
<SpamapS> exceptions.IOError: /usr/lib/python2.7/dist-packages/ubuntuone-storage-protocol/juju/hooks/tests/hooks/success-hook doesn't exist
<hazmat> huh
<hazmat> odd path
<hazmat> SpamapS, via make coverage?
<SpamapS> https://www.ohloh.net/p/juju
<SpamapS> hazmat: yeah
<hazmat> SpamapS, haven't seen that one before
<SpamapS> Step 2 of 3: Importing source code into database (Running 0/1)
<SpamapS> *get on with it!* ;)
<hazmat> bcsaller, the latest lbox diff looks off
<bcsaller> when doesn't it
<bcsaller> I'll take a look
<hazmat> ic some of the changes that jimbaker made in that diff that are on trunk
<bcsaller> hazmat: which branch are you looking at?
<hazmat> https://codereview.appspot.com/5892046/
<hazmat> bcsaller, the pre-req needs to have trunk merged
<hazmat> lp gets the diff wrong as well.. say its 7k
<hazmat> bcsaller, might be easiest to just remove the pre-requisite on the mp and run lbox again
<hazmat> but getting the pre-req merged to current trunk should also work
 * hazmat is tired
<hazmat> beer o'clock back in a bit
<hazmat> bcsaller, can you run lbox again
<bcsaller> hazmat: I had, you think it will be different if I do it again?
<hazmat> bcsaller, i modified the mp
<hazmat> just now
<bcsaller> ahh, ok, running it now, for both
<hazmat> bcsaller both?
<hazmat> bcsaller, btw the problem b4, was you had to push the pre-req branch to lp after merging trunk
<hazmat> lbox pulls there for the pre-req
<bcsaller>  https://codereview.appspot.com/5892046 and  https://codereview.appspot.com/5991079
<bcsaller> though the latter includes the former
<hazmat> merging it locally doesn't do anything by itself
<bcsaller> I proposed both before as well
<hazmat> bcsaller, the status branch which is the pre-req for agent wasn't pushed
<hazmat> with a trunk merge
 * hazmat calls it a night
<bcsaller> I thought that was merged
<bcsaller> maybe it didn't detect that
<bcsaller> I think the problem is the pre-req stays around even after merge and it shouldn't when the merge is with the lbox propose -for target
<wrtp> mornin' all
<TheMue> wrtp: morning
<fwereade_> morning wrtp, TheMue
<TheMue> heya fwereade_
<wrtp> fwereade_, TheMue: hiya. hope you've had a good time over Easter.
<fwereade_> wrtp, lovely, thanks; and yourself?
<wrtp> fwereade_: just great. we even had *some* nice weather at the start (the rest was a bit shit tbh)
<fwereade_> wrtp, cool
<fwereade_> wrtp, I forget, what was it you were doing
<fwereade_> ?
<wrtp> fwereade_: went up north, to skye (where my folks live) via the ardnamurchan peninsula with our bikes.
<wrtp> fwereade_: so shitty weather is par for the course
<TheMue> wrtp: ah, skye, home of good malts
<wrtp> TheMue: only one malt really :-)
<fwereade_> wrtp, indeed, but I must say that I've always found bikes to be an unpleasantness multiplier for shitty weather
<wrtp> TheMue: distilled about 3 miles from where my parents live...
<wrtp> fwereade_: luckily we had the good weather at the start when we were on the bikes
<fwereade_> wrtp, perfect :)
<wrtp> fwereade_: here was a photo i took on my phone about half way through the longest day trip. we'd just come along the coast (mostly walking as our mountain bike skillz are not sufficiently l33t): https://www.facebook.com/photo.php?fbid=10150779643720903&set=a.445021760902.235579.754250902&type=3&theater
<fwereade_> wrtp, that's lovely
<fwereade_> wrtp, oddly reminiscent of a nice bay in gozo
<wrtp> fwereade_: pity about the shitty fb photo resolution. you can see the snow shower that hit us if you look vertically above the wee house.
<wrtp> a good mixed day
<fwereade_> wrtp, that ever-so-slight fuzziness?
<wrtp> fwereade_: xactly
<wrtp> fwereade_: sun and blizzards, what more could we ask for?!
<fwereade_> wrtp, sounds pretty awesome to me :)
<TheMue> wrtp: wonderful photo. and did you bring a fine Talisker with you?
<wrtp> TheMue: of course!
<TheMue> wrtp: (envy) ;)
<TheMue> wrtp: i had the hope that my highland park 30yo would be here before easter. but it's so rare that i have to wait until mid-May. (sigh)
<wrtp> fwereade_: here's the bike route we did that day, excepting the bit that google maps doesn't know about. we started from Acharacle and went anti-clockwise. http://g.co/maps/wup3q
<wrtp> i love the fact that the google van went down all the tiny little roads around there (not all the tracks though, sadly :-))
<niemeyer> Good morning
<wrtp> niemeyer: yo!
<wrtp> niemeyer: hope you had a good Easter...
<niemeyer> wrtp: Heya! Welcome back
<niemeyer> wrtp: Yeah, pretty good stuff
<niemeyer> wrtp: Although I've just been notified that the blog has been hacked again.. :(
<niemeyer> It's time to drop wordpress..
<wrtp> which blog?
<niemeyer> wrtp: blog.labix.org
<wrtp> niemeyer: oops. i'd better check mine.
<TheMue> niemeyer: morning
<niemeyer> It's time to drop wordpress..
<niemeyer> TheMue: Heya!
<TheMue> aaaaaaargh! f*ck, watching the right node during tests definitely helps :D
<andrewsmedina> morning
<niemeyer> Lunch time.. biab
<hazmat> bcsaller1, it looks like the problems are the same bzr circular merge that was here before..
<hazmat> bcsaller1, ignoring reitveld's diff.. the actual diff to trunk is borked
<bcsaller1> hazmat: that's not what I thought the issue was, but I'm looking at it again
<bcsaller1> hazmat: my diff to trunk looked ok, what are you seeing?
<hazmat> bcsaller1, the lp diff
<hazmat> bcsaller1, diffing manually to trunk produces a sane result
<bcsaller1> yeah, that's what I was seeing
<hazmat> hmm.. actually not..
<hazmat> test_service diff is odd
<hazmat> bcsaller, UnsupportedSuborinateServiceRemoval
<hazmat> missing a 'd'
<bcsaller> ha
<hazmat> and the doc string for the same has a typo
<bcsaller> type once, expand everywhere :(
<hazmat> bcsaller, huh.. nevermind that actually is screwed on trunk re test_service.py
<bcsaller> I think there was a whitespace issue, ran reindent.py on it when I saw that
<bcsaller> not sure what happened though
<hazmat> bcsaller, looks like it was introduced in the merge of subordinate control
<bcsaller> well, that should fix it
<wrtp>  /me is off for the night.
<wrtp> :-)
<wrtp> see y'all tomorrow
<hazmat> bcsaller, review in
<hazmat> wrtp, cheers
<bcsaller> hazmat: thank you
#juju-dev 2012-04-12
<TheMue> morning
<wrtp> TheMue: yo!
<niemeyer> Hello all!
<andrewsmedina> niemeyer: hi
<niemeyer> andrewsmedina: Heya
<niemeyer> TheMue: ping
<TheMue> niemeyer: pong
<niemeyer> TheMue: Heya
<TheMue> niemeyer: hi
<niemeyer> TheMue: How're things going there?  Good progress on the watchers stuff?
<TheMue> niemeyer: now yes, had two hard days hunting a bug
<niemeyer> TheMue: Oh?
<TheMue> niemeyer: yeah, it didn't want to do what i wanted it to do
<niemeyer> TheMue: Cool.. glad you're unblocked
<TheMue> niemeyer: first i had to find out that a watch of GetW() won't fire later if a non-existing node is created (unlike the watch of ExistsW())
<niemeyer> TheMue: Yeah, it fails immediately I believe
<niemeyer> TheMue: So the watch is never established
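The behaviour niemeyer describes, where GetW fails outright on a missing node while ExistsW leaves a watch that fires on creation, looks roughly like this (a sketch against the launchpad.net/gozk API of the time; signatures are from memory and may not match exactly):

```go
package zkwatch

import (
	"fmt"

	zookeeper "launchpad.net/gozk/zookeeper"
)

// waitForContent sketches the two watch behaviours discussed above.
func waitForContent(conn *zookeeper.Conn, path string) (string, error) {
	// ExistsW succeeds even when the node is missing, and its watch
	// fires once the node is created.
	stat, watch, err := conn.ExistsW(path)
	if err != nil {
		return "", err
	}
	if stat == nil {
		<-watch // blocks until the node appears
	}
	// GetW on a missing node fails immediately instead, so no watch
	// is ever established and there is nothing to wait on.
	data, _, _, err := conn.GetW(path)
	if err != nil {
		return "", fmt.Errorf("node vanished again: %v", err)
	}
	return data, nil
}
```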
<mthaddon> niemeyer: have you seen there are some questions outstanding on the charmstore RT?
<niemeyer> mthaddon: Nope
<niemeyer> mthaddon: I missed those
<TheMue> niemeyer: and in the tests i had to put those changes i'm watching in a goroutine. you'll see, i'll check in today
<niemeyer> mthaddon: I'm working on stats, btw
<niemeyer> mthaddon: Almost ready, but need some reviews
<mthaddon> niemeyer: ok, thx - if you could take a look and get back to me (via the RT or here) on the questions when you get a chance that'd be great
<niemeyer> mthaddon: Definitely
<niemeyer> mthaddon: Done
<niemeyer> I'm stepping out for lunch, biab
<flacoste> hi SpamapS, are you uploading a new juju version before final freeze (today)? the archive version still doesn't support constraints
<hazmat> fwereade_, i'm looking over the orchestra constraints branch, do you have a moment to chat regarding it?
<fwereade_> hazmat, ofc
<hazmat> mostly i'm curious how reliable the cobbler profile data and arch data is
<fwereade_> hazmat, it's installed as part of orchestra and so should be solid; but it is true that I have no way of dealing with an installation that someone's messed around with enough
<fwereade_> hazmat, OTOH setting up distros/profiles is a bit headachey and IMO one of the really nice things about orchestra is that it does that for you so you don't have to worry about it
<flacoste> fwereade_, hazmat: shouldn't we just remove the orchestra provider, now that we have maas?
<flacoste> Daviey: wdyt? ^^^
<flacoste> Daviey: what's the plan with the packages in the archives?
<flacoste> leave it universe?
<hazmat> flacoste, from a juju perspective, maas for 12.04 has significantly less features
<flacoste> hazmat: really, how so?
<hazmat> flacoste, because it can only address machines by name
<hazmat> ie. poor constraints support
<fwereade_> flacoste, I expect we will want to retire the orchestra provider sooner rather than later though
<Daviey> hmm
<Daviey> What can the juju provider for orchestra do?
<Daviey> There isn't a problem leaving it in i guess
<Daviey> I mean, it's not exposed to the user, right?
<flacoste> hazmat: but you still have to implement the orchestra constraints support, dropping it would means less maintenance :-)
<hazmat> it can use a specified orchestra class as a constraint
<hazmat> flacoste, the orchestra class support for constraints is already in
<hazmat> there's an additional branch in review which i'm looking at that adds arch/series
<fwereade_> hazmat, sadly, it *doesn't* add arch, mainly because I am an idiot :(
<fwereade_> hazmat, it matches existing arch
<fwereade_> hazmat, that branch is a result of me suddenly realising that ubuntu-series is critical, and totally unaddressed in orchestra
<Daviey> fwereade_: *please* tell me you aren't doing development for the orchestra provider at the moment.
<flacoste> Daviey, hazmat, fwereade_: yeah, that's my point about reducing maintenance
<flacoste> :-)
<flacoste> not doing work is great!
<flacoste> the zen of non-doing
<hazmat> Daviey, flacoste  having a useable baremetal solution that works with juju's expectation of a provider is important
<hazmat> even without constraints, right now neither maas nor orchestra does that because they're ignoring series
<hazmat> which is critical to being able to deploy a charm as intended because the charm is tied to a series
<flacoste> right, but maas only supports precise
<hazmat> and we have zero precise charms atm
<hazmat> that's supposed to change this week
<hazmat> but come 12.10 do we expect folks to move core infrastructure pieces off an LTS release for a newer version of maas?
<fwereade_> Daviey, sorry to say I am, simply because we have this provider that (1) still exists and (2) doesn't work right
<Daviey> fwereade_: wow.. if it doesn't work.. drop it.
<Daviey> Really.. It's a *total* waste of time working on Orchestra.
<Daviey> Would it help if i removed orchestra from the archive?
<flacoste> fwereade_: i agree
<flacoste> Daviey: we are getting the following error when bootstrapping:
<flacoste> SSH authorized/public key not found.
<flacoste> 2012-04-12 12:09:14,139 ERROR SSH authorized/public key not found.
<fwereade_> Daviey, flacoste: I would personally have no problem with that, but it feels like this is a very late stage to be doing this
<hazmat> woah.. it does work
<Daviey> flacoste: fixed
<flacoste> Daviey: how did you fix that?
<flacoste> i see
<flacoste> you just created an ssh key
<Daviey> right
<flacoste> Daviey: pgraner says hold on your fire :-)
<flacoste> but thanks!
 * Daviey lets go of the fire.
<flacoste> Daviey: +1 to dropping orchestra from the archive :-)
<flacoste> Daviey: better to be honest to our users about what's happening here
<flacoste> hazmat: yes, it's expected that people who want to use 12.10 will use a more recent version of juju and maas
<fwereade_> Daviey, flacoste: my understanding from Budapest was that we'd *maybe* drop orchestra if we had maas doing everything we wanted by 12.04; when I came back to the python to resurrect the constraints work a couple of weeks ago, it was still around and maas was still a little up in the air
<fwereade_> Daviey, flacoste: I guess I should have made a bunch of noise about it then
<flacoste> Daviey: Invalid host for SSH forwarding: ssh: Could not resolve hostname node-00e081d1b147.local: Name or service not known
<Daviey> flacoste: that just means it's not switched on/installed.
<flacoste> Daviey: did we set it up with IPMI?
<fwereade_> Daviey, flacoste: I think I may have been unclear: the specific thing that doesn't work with orchestra is series selection (that is, deploying oneiric/foobar may or may not get an oneiric machine on which to run foobar)
<flacoste> fwereade_: right, but that's still work that has to happen for an unsupported piece of software
<flacoste> i'd say remove it from juju and leave the community to support it if they need it
<hazmat> please don't remove orchestra from the archive.. for 12.04 it's the only bare-metal provider that will work well with juju both for constraints and soon series.
<hazmat> i understand that orchestra is dead, but maas doesn't fulfill the needs of users who want to use it with juju at the moment imo
<hazmat> future dev will resolve that i'm sure, but that's not about 12.04.
<flacoste> hazmat: maas works well enough for precise
<flacoste> keeping orchestra around just confuses things
<fwereade_> Daviey, flacoste: that's a decision I would have been comfortable supporting *if* we'd had a public discussion about it a month or so ago, and determined that maas was in a position to entirely replace it; but (1) we didn't, and (2) it isn't
<flacoste> i'd rather handle flak for going without 1 and 2
<flacoste> than handle the confusion for the next 5 years
<fwereade_> flacoste, ok, it works well enough for precise; but it seems to me to be bad form to entirely drop the ability to provision pre-precise machines
<flacoste> why?
<flacoste> they can stick with 11.10 if they like it
<Daviey> this is flippin' crazy
<fwereade_> flacoste, largely because I imagine people to have been using it to deploy pre-precise machines and I'm not sure a supposed upgrade should take features away from people
<flacoste> fwereade_: we remove features all the time
<flacoste> especially ones that have few users
<flacoste> otherwise you just grow a pile of unmaintainable mess
<flacoste> it's not like we are forcing upgrades on anyone here
<fwereade_> flacoste, fair point; I think there's a disconnect somewhere between this and the "we have users now, we can't just keep breaking their installations on upgrade" attitude I have recently developed
<fwereade_> flacoste, does anyone have any numbers on what providers people are using?
<flacoste> fwereade_: you should know this better than I :-)
<flacoste> maybe jorge would know
<fwereade_> flacoste, my main worry is that *anyone* wanting to use juju for bare metal will have been using orchestra, and that if we trash that and offer only a less-featureful alternative we'll be chasing them away
<flacoste> fwereade_: it's not less featureful
<flacoste> stop that
<flacoste> these "more" features don't exist yet
<flacoste> constraints -> that's not a feature they can use
<flacoste> it requires more development to support
<flacoste> it's more featureful than the existing setup
<fwereade_> flacoste, depending on exactly what machines you have set to use juju, it could be argued either way
<fwereade_> s/to use/to be used by/
<fwereade_> flacoste, if you currently have a couple of dozen oneiric machines running oneiric charms, it's fine; and (1) that's what any users will have been using, and (2) maas can't do that
<hazmat> constraints support for orchestra is already in trunk
<niemeyer> wrtp: ping
<wrtp> niemeyer: pong
<niemeyer> wrtp: Heya
<niemeyer> wrtp: How're things going there?
<fwereade_> flacoste, hazmat: indeed, and has been for a while; this is more a bug than a feature, because I just wasn't handling every applicable constraint correctly
<wrtp> niemeyer: pretty slow getting back up to speed yesterday, but more or less back in the groove now, doing the tests for the ssh forwarding stuff
<wrtp> niemeyer: not entirely convinced that my approach to testing error paths was the correct one. i think perhaps i *should* try to provoke all the desired misbehaviours from ssh.
<wrtp> niemeyer: am intending to get around to looking at your reviews soon!
<wrtp> niemeyer: how about you?
<niemeyer> wrtp: Super
<niemeyer> wrtp: I'm good as well.. have been a bit slow early this week too, but also getting up to speed
<niemeyer> wrtp: Happy about the stats stuff that's coming into the store
<wrtp> niemeyer: cool. it'll be nice to see some feedback.
<wrtp> niemeyer: (from the store traffic, that is)
<wrtp> niemeyer: are we still planning a meeting on mondays?
<niemeyer> wrtp: Yeah
<niemeyer> wrtp: We've missed the last couple of meetings
<niemeyer> wrtp: But we should definitely do them from now on
<wrtp> niemeyer: sounds good.
<wrtp> niemeyer: perhaps we could do them when you first come on line (given you're in the latest time zone)? i usually have to finish fairly promptly on monday evenings.
<fwereade_> wrtp, +1
<niemeyer> wrtp: Isn't the timing we agreed to good?he timing we've agreed to sounds
<niemeyer> Erm
<niemeyer> wrtp: Isn't the timing we agreed to good?
<wrtp> niemeyer: i don't remember agreeing to a timing...
<wrtp> niemeyer: perhaps i'm missing a calendar entry
<niemeyer> wrtp: 14UTC was the one we settled on
<wrtp> niemeyer: that sounds fine to me.
<wrtp> niemeyer: i'll put an entry in my calendar
<niemeyer> wrtp: Cheers
<wrtp> niemeyer, fwereade_: i've made a calendar for us and added the meetings to it. i dunno if that's the best way to do it.
<niemeyer> wrtp: We can try it out and see
<wrtp> niemeyer: at least it makes it easy for us to amend the dates and times easily and be reminded of the change...
<hazmat> bcsaller, ping
<bcsaller> hazmat: hey
<hazmat> bcsaller, are we ready to get the last subs branches in..
<bcsaller> you approved it yesterday
<hazmat> bcsaller, right.. but is it merged?
<bcsaller> yes
<hazmat> sweet!
<bcsaller> :)
<wrtp> niemeyer, fwereade: i'm off. see you tomorrow.
<niemeyer> wrtp: Have a good EOD
<wrtp> niemeyer: will do, ta!
<hazmat> bcsaller, the change to lxc broke the tests
<hazmat> actually the env.defaults thing that i thought got removed..
<bcsaller> ugh, those didn't run on my last try (and yes that should have been removed), checking
<hazmat> bcsaller, i'm on it
<hazmat> i'm rolling up the fix into my next merge
<hazmat> its a one liner
<bcsaller> hazmat: thanks
<hazmat> bcsaller, i was realizing though that the subordinate state change from last week is still going to cause issues with extant envs because of the topology change, even with the transparent upgrade... actually because of the transparent upgrade
<hazmat> because there are older agents running around
<bcsaller> w/o that upgrade code...
<hazmat> its actually worse than not upgrading it
<bcsaller> hmm
<hazmat> because all of the agents are going to fail simultaneously without any user warning
<hazmat> i think we need to disable that migration
<hazmat> it's changing the topology format, so really nothing running older code is going to survive.
<hazmat> its better to just error out on the cli b4 that happens
<hazmat> bcsaller, thoughts?
<bcsaller> it's odd, because even as far back as reviewing the spec we thought this was a good idea. I'm trying to think how we'd make migrations like this work at all if this is the case now
<bcsaller> would we force the agents to restart?
<hazmat> bcsaller, they won't get the new code. restart doesn't help
<hazmat> without consistent code versioning and a notion of code/db upgrade as an explicit op, incompatible state changes are hosers
<bcsaller> I guess it could support returning the topo in the requested version as the data is still there, but that wouldn't work in all cases
<bcsaller> and would still require eating this at least once to get there
<hazmat> bcsaller, we've got maybe 30m to resolve this, afaics the only thing to do is to disable the transparent topo migration
<bcsaller> then thats what we should do I guess
<hazmat> bcsaller, do you see any alternative?
<bcsaller> no, some of its a data reorg and some of its new data
<bcsaller> the new data doesn't leave us a choice
<bcsaller> or we could just forward from 1 on read for now (for v2 clients)
<bcsaller> but that new data has to live in there
<hazmat> new data isn't a problem by itself depending on the struct (ie a new key in a dict is fine), it's the struct change that kills it.
<bcsaller> then v2 clients will just refuse to connect to v1 topos for now
<bcsaller> hazmat: so rather than call migrate you just raise incompat version. Do you want to make that change?
<hazmat> bcsaller, done
<bcsaller> cool. It seemed like a good plan at the time :-/
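The gate they settled on amounts to refusing to read a topology whose version doesn't match, rather than migrating it in place under running agents. The code in question was python; a generic sketch of the idea, in Go:

```go
package topology

import "fmt"

const currentVersion = 2

// parse refuses to touch a topology of any other version, so clients
// fail loudly up front instead of every agent breaking at once after
// an in-place migration that older code cannot read.
func parse(raw map[string]interface{}) (map[string]interface{}, error) {
	v, ok := raw["version"].(int)
	if !ok || v != currentVersion {
		return nil, fmt.Errorf("incompatible topology version %v (want %d)", raw["version"], currentVersion)
	}
	return raw, nil
}
```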
<hazmat> bcsaller, jimbaker, fwereade_  all changes are in, pls consider trunk frozen
<niemeyer> Woohay!
<flacoste> hazmat: so juju is now complaining on juju deploy mysql
<flacoste> about series constraints not being implemented
<flacoste> Error processing 'cs:precise/mysql': entry not found
<flacoste> and Bad 'ubuntu-series' constraint 'oneiric': MAAS currently only provisions machines running precise
<flacoste> if i try to do oneiric/mysql
<flacoste> what's my work-around until there is a precise mysql charm?
<jimbaker> hazmat, awesome news!
<hazmat> flacoste, with maas.. you'd have to pull the charm down to local disk
<hazmat> and put in a charm repo under a precise series
<hazmat> the precise version of that charm does not exist in the store
<flacoste> hazmat: that's what i'm doing now
<niemeyer> hazmat: Are you querying the cs automatically to feed the web UI in any way?
<hazmat> niemeyer, not atm. i'm planning to update the ui to mark items not in the charm store when looking at the official series
<hazmat> niemeyer, it pre-dates the charm store, and there is no query on the charm store outside of querying the thing you want directly by name afaik
<niemeyer> hazmat: Cool, no worries.. I'm happy to extend with stuff that makes things simpler on the other end
<niemeyer> hazmat: I'm asking mainly due to stats
<niemeyer> hazmat: Please don't add any automatic querying without syncing up
<niemeyer> hazmat: In case you need/want to query one of the existing APIs, I'll hand you a parameter so it's not counted as a legitimate request
<hazmat> niemeyer, cool, sounds good
<hazmat> niemeyer, if you have a url to grab stats that would be nice to pull in
<niemeyer> hazmat: I will have, many of those actually
<niemeyer> hazmat: Per charm, etc
<SpamapS> jimbaker: https://code.launchpad.net/~jimbaker/juju/relation-hook-commands-spec/+merge/97733 .. you should land this in docs :)
 * SpamapS is trying to close out the florence milestone
<jimbaker> SpamapS, sounds good
<SpamapS> jimbaker: there are 4 open, Approved bugs left in florence, all for lp:juju/docs, all for you :)
<jimbaker> SpamapS, yeah, i was planning on combining those together in a more cohesive whole, but it's probably best to do that as a second step
<SpamapS> jimbaker: I'll push them off to the galapagos milestone if you're not ready to commit them just yet.
<jimbaker> SpamapS, that probably makes sense
<SpamapS> Blueprints:
<SpamapS> 1 Implemented
<SpamapS> Bugs:
<SpamapS> 108 Fix Released
<SpamapS> Kapil and William tie for most assigned at 27 :)
#juju-dev 2012-04-13
 * niemeyer <= lunch
<wrtp> i'm off a little early today. have a great weekend everyone!
<hazmat> niemeyer, wtf needs a dep update for txzk
<hazmat> wrtp, enjoy, cheers
<niemeyer> hazmat: Will check it
<niemeyer> wrtp: I'd appreciate some reviews, when you have a moment..
<niemeyer> There are several branches pending already
<wrtp> niemeyer: will do, sorry for the delay.
<niemeyer> wrtp: np, just want to get this in soonish so we get to have an idea of the growth
<m_3> niemeyer: is there a shortcut to lp:~charmers/charms/precise/hadoop/trunk from the 'juju deploy' command?  'juju deploy hadoop' couldn't find a cs:precise/hadoop
<niemeyer> m_3: Was that blessed as an official charm?
<niemeyer> m_3: If it was, it's supposed to work
<m_3> it's owned by ~charmers, and in the correct url format... there has _not_ been an alias created for lp:charms/hadoop.  I'm not sure what exactly it means to be an official charm
<m_3> niemeyer: (background) I'm testing charms against precise in prep for the official changeover... but wanted to know if robbie et al can use the charmstore shortcuts to deploy precise charms before the aliases have been created
<niemeyer> m_3: It means using the promulgation stuff to brand the charm
<niemeyer> m_3: The two things are connected.. if robbie has access to the charms in the shortcut format, everybody else does as well
<niemeyer> m_3: But robbie can test the charms in another URL
<m_3> niemeyer: so if we do not get the charms officially changed over to precise before the next demos, what is the syntax for deploying those directly from lp?
<m_3> is that even possible? or should I have them stick to a local charm repo?
<niemeyer> m_3: Ah, sorry, I didn't realize that was the question
<niemeyer> m_3: cs:~charmers/precise/hadoop
<m_3> awesome... thanks
<niemeyer> np
<m_3> sorry that wasn't clearer
<niemeyer> m_3: Re-reading your original question, it was actually somewhat clear.. I just went into another direction for some reason
<m_3> niemeyer: that just worked beautifully btw... 'juju deploy cs:~charmers/precise/hadoop hadoop-master'
<niemeyer> m_3: Sweet!
<niemeyer> I'm getting out for some exercising..
<hazmat> niemeyer, the store oneiric/mysql charm is still borked afaics
<hazmat> oh.. thats my cache getting in the way.. nm
<niemeyer> Cool
#juju-dev 2013-04-08
<thumper> davecheney: yeah
<davecheney> thumper: merged your packaging fixes
<davecheney> thank you very much
<thumper> davecheney: awesome, ta
<thumper> davecheney: lp hasn't noticed, have you pushed your packaging branch back up?
<davecheney> it has
<davecheney> it's sending me all sorts of email
<davecheney> spooky
<thumper> davecheney: ah, I see it now, the MP is marked as merged.
<thumper> davecheney: I wrote that magic :)
<davecheney> how does it work ?
<davecheney> smell your branch in my push ?
<davecheney> (ewww)
<thumper> kinda...
<thumper> it notices a push from your branch.
<thumper> it then looks for any pending merge proposals targeting your branch
<thumper> it then looks at the tip revision id for each of those merge proposals
<thumper> and if the tip revision of the source branch is in your branch's ancestry, it gets marked merged
<thumper> and it also works out which revision number on your branch it was merged at and saves that too
<davecheney> very nice
 * bigjools appreciates thumper's work there too
<thumper> the branch also gets its state changed to "merged" too
<thumper> so it drops off the various branch listing pages
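The ancestry test at the heart of that is just a reachability walk over the revision graph. A generic sketch (not Launchpad's actual implementation; the parent map stands in for whatever the real code queries):

```go
package revgraph

// isAncestor reports whether rev is reachable from tip by following
// parent links, which is the "is the MP's tip in the pushed branch's
// ancestry" check described above.
func isAncestor(parents map[string][]string, tip, rev string) bool {
	seen := make(map[string]bool)
	stack := []string{tip}
	for len(stack) > 0 {
		cur := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if cur == rev {
			return true
		}
		if seen[cur] {
			continue
		}
		seen[cur] = true
		stack = append(stack, parents[cur]...)
	}
	return false
}
```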
 * thumper fires off the can of worms email before going to lunch
<thumper> so... lunch
 * davecheney reads
<thumper> davecheney: did you say that there were bugs around this pipe closing behaviour?
<thumper> davecheney: or were you referring to the hook serialisation bugs?
<davecheney> oh, that
<davecheney> that is fine
<davecheney> you are sending EOF to stdin on the child process
<davecheney> then waiting for it to exit
<davecheney> the race is actually the line after that
<davecheney> where we close the logger
<thumper> ah... no, I don't think that is right
<thumper> we aren't closing stdin of the child
<thumper> the outWriter is only set for stdout and stderr
<thumper> so the process writes to the pipe
<davecheney> that is crack, if you close the stdout of a child process, it'll get EPIPE
<davecheney> this code is really subtle
 * davecheney goes to reread
 * thumper goes to lunch, and to drop stuff in at the 2nd hand shop
<bigjools> I've got juju core tests passing on a new raring VM, but when I try locally I get loads of failures (I mean LOADS) like this:
<bigjools> [LOG] 24.71999 ERROR api: error receiving request: read tcp 127.0.0.1:57535: use of closed network connection
<bigjools> I've no idea what's different locally, how can I work this out?
<bigjools> also, constraints_test.go:283: is failing on raring, it claims that lp:1132537 is fixed
<davecheney> bigjools: i think that isn't the error
<davecheney> just a symptom of the TearDownTest running after a failure
<bigjools> almost certainly
<davecheney> bigjools: got a larger paste ?
<bigjools> I'll re-run, the output is massive
<bigjools> give it its due - this is the first test suite that brought my 4-core to its knees
<davecheney> that we do
<bigjools> the disk is getting hammered
<davecheney> blame mongo
<davecheney> i put /tmp on tmpfs
<bigjools> heh
<bigjools> I used to do that for Launchpad tests
<bigjools> didn't make a lot of difference
<bigjools> on the bright side, the mongo in raring works out of the box with juju-core
<davecheney> excellent
<davecheney> that is good news
<bigjools> davecheney: http://paste.ubuntu.com/5688064/
<davecheney> bigjools: turns out
<davecheney> it doesn't
<davecheney> PANIC: addrelation.go:0: ConstraintsCommandsSuite.TearDownTest
<davecheney> ... Panic: local error: bad record MAC (PC=0x4117A4)
<davecheney> /build/buildd/golang-1.0.2/src/pkg/runtime/proc.c:1443 in panic
<davecheney> /home/ed/canonical/GO/src/launchpad.net/juju-core/testing/mgo.go:178
<davecheney> ^ your mongo does not speak TLS
<bigjools> davecheney: wtf, the same mongo is installed on the VM, which works
<davecheney> ldd $(which mongod)
<davecheney> (from memory)
<bigjools> yeah
<bigjools> ssl is linked
<bigjools> better, mongod -h
<bigjools> shows ssl options at bottom
 * bigjools is stumped
<davecheney> got several mongos installed ?
<bigjools> not that I can see
<bigjools> but looking for stray ones as we speak
<bigjools> well, waiting for updatedb :)
<thumper> bigjools: I'm running raring, and tests passing
<thumper> bigjools: apart from an intermittent one this morning
<bigjools> thumper: ta
<bigjools> go fmt is great and all, but it creates extra lines of diff (and buggers bzr blame) lining up struct items if I add a new long one :(
<davecheney> seroiusly fellas, can you please create a new channel for bitching about Go
<bigjools> will that make Go better?
<davecheney> it'll make me feel better
<davecheney> Go is what it is
<davecheney> it's not going to change
<davecheney> not in a timeframe that is relevant to the problem at hand
<davecheney> so all this bitching just gives me heartburn
<bigjools> "it's not going to change" ... then it'll die
<thumper> davecheney: can you remind me what it means to do a case <- some_channel in a select?
<thumper> davecheney: under what situations does it wait?
<thumper> davecheney: if it is a closed channel, it skips it?
<thumper> bigjools: not everyone cares about whitespace changes
<davecheney> it will wait only in the case that there is no default: clause in a select, and none of the other cases are rady
<thumper> bigjools: better tools can help that
<davecheney> ready
<bigjools> thumper: that's my point
<thumper> bigjools: what I mean is that it is not necessarily gofmt that is at fault, but the diff checking tools
<bigjools> thumper: arguably, yes. Although whitespace is important in some places of course (Python)
<thumper> davecheney: so for example: cmd/jujud/machine.go:80 will wait for both channels to close before moving on
<thumper> bigjools: sure, but not really go
<davecheney> yes, that will block until one of those conditions is true
<thumper> god I hate the name tomb
<thumper> it doesn't explain what it does at all
<davecheney> thumper: the trick is that it then nils out the channel reference, so it can never be satisfiable again
<thumper> davecheney: sure, but it looks like if either one fails, the tomb sets to dying
<davecheney> yes, then it loops again
<thumper> davecheney: so <-nil is valid?
<davecheney> from a quick reading, it's waiting for both of those things to finish
<davecheney> then keeping the more important error of the two
<davecheney> moreImportant(err, err) may be a subjective judgement
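The trick davecheney mentions relies on nil channels: a receive from a closed channel succeeds immediately, while a receive from a nil channel blocks forever, so nilling a channel disables its select case. A minimal sketch of the wait-for-both pattern (simplified to keep the first error rather than the moreImportant comparison in the real code):

```go
// waitBoth blocks until both workers have signalled completion.
func waitBoth(a, b <-chan error) error {
	var first error
	for a != nil || b != nil {
		select {
		case err := <-a:
			a = nil // this case can never be selected again
			if first == nil {
				first = err
			}
		case err := <-b:
			b = nil // this case can never be selected again
			if first == nil {
				first = err
			}
		}
	}
	return first
}
```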
<davecheney> WHO THE FUCK CHANGED THIS !!!
<davecheney> 2013/04/08 12:33:09 INFO JUJU:juju:bootstrap environs: searching for tools compatible with version: 1.9.13-precise-amd64
<davecheney> 2013/04/08 12:33:09 DEBUG JUJU:juju:bootstrap listing tools in dir: tools/juju-1.
<davecheney> 2013/04/08 12:33:14 DEBUG JUJU:juju:bootstrap listing tools in dir: tools/juju-1.
<davecheney> 2013/04/08 12:33:15 ERROR JUJU:juju:bootstrap juju bootstrap command failed: cannot find tools: use of closed network connection
<davecheney> the directory that we look for public tools has changed
<thumper> huh?
<davecheney> it should be tools/
<davecheney> not tools/juju-1.
<thumper> hmm...
<thumper> I thought that listed tools starting with that, so only listed version 1 tools
<davecheney> maybe the listing hasn't change
<davecheney> changed
<davecheney> it just never failed before
<davecheney> odd, it works with the source version, even without using upload tools
<davecheney> maybe ec2 is having a lie down
<thumper> maybe
<davecheney> yup, now it works
<thumper> :)
<thumper> panic over?
<davecheney> it's justifiable
<davecheney> almost every day I go to test the tool
<thumper> panic expected?
<davecheney> it's borken
 * thumper sighs...
<thumper> why does go make me have a return statement at the end of the function if it just says "return"
<thumper> that's dumb
<bigjools> thumper: do you get FAIL: constraints_test.go:276: ConstraintsSuite.TestGoyamlRoundtripBug1132537
<bigjools>  on raring?
<thumper> bigjools: nope
<thumper> bigjools: tests pass for me on raring
<bigjools> that's the only one I see failing
<bigjools> did you update goyaml lately?
<thumper> nope
<bigjools> I am on revno 39
 * thumper looks
<thumper> 36
<bigjools> that'll be it then
 * bigjools lunches, ttyl
<thumper> maybe...
 * thumper wonders where kapil's tool is to get the reproducible builds...
 * thumper goes to get girls from school
<thumper> bbs
<thumper> back
 * thumper wondered why juju bootstrap failed so often with error: cannot find tools: use of closed network connection
<thumper> and then it suddenly worked
<thumper> ?!
 * thumper away, probably back later
<rogpeppe> mornin' all
<rogpeppe> dimitern: ping
<dimitern> rogpeppe: hey
<rogpeppe> dimitern: ho!
<rogpeppe> dimitern: i was just looking at status watching for the allWatcher
<dimitern> rogpeppe: yeah?
<rogpeppe> dimitern: and realised that there's no statusDoc - we have a different doc type for units and machines
<rogpeppe> dimitern: that makes things more awkward, and i was wondering if there's a compelling reason for it
<dimitern> rogpeppe: that's right
<dimitern> rogpeppe: because they're potentially different types of status information (units/machines)
<rogpeppe> dimitern: i'm considering two alternatives: 1) just define the status as a string for each, and type-convert the result when returning Status. 2) have a different field name for each, and omit the other one when empty
<rogpeppe> dimitern: because it makes my code considerably simpler if i can assume that each collection has a uniform schema
<dimitern> rogpeppe: what's the complication now?
<rogpeppe> dimitern: the code is simpler if i assume that there's a single type for a collection that can know how to marshal itself from the collection and how to update the allInfo when it changes
<rogpeppe> dimitern: and i think that the rest of the code becomes no more complex, to be honest
<rogpeppe> dimitern: in fact, it probably becomes a bit less magic
<dimitern> rogpeppe: i'm not convinced getting rid of typed UnitStatus + constants (and for machines as well) is a better solution
<rogpeppe> dimitern: i'm not suggesting that
<rogpeppe> dimitern: the external interface (Status and GetStatus) would remain exactly as is
<dimitern> rogpeppe: you're saying to save them in a single doc, casted as strings
<rogpeppe> dimitern: that's option 1, yes.
<rogpeppe> dimitern: it's statically type safe
<rogpeppe> dimitern: which the current scheme is not
<dimitern> rogpeppe: and the 2nd option? how's the field names important?
<rogpeppe> dimitern: you'd have type statusDoc struct {UnitStatus params.UnitStatus `bson:",omitempty"`; MachineStatus params.MachineStatus `bson:",omitempty"`; Info string}
<dimitern> rogpeppe: so it's like a union
<rogpeppe> dimitern: yeah
<rogpeppe> dimitern: which is kinda what's happening now anyway
<dimitern> rogpeppe: hmm... yeah, although it's explicit
<dimitern> rogpeppe: so what about the field names?
<rogpeppe> dimitern: what about them?
<dimitern> rogpeppe: ah, sorry - I thought using the same names for the 2 different docs was confusing
<rogpeppe> dimitern: the point is i'd like there to be only one doc, same as we've got for every other collection
<dimitern> rogpeppe: both options seem fair enough
<dimitern> rogpeppe: i'd like to pass this through fwereade as well
<rogpeppe> dimitern: yeah
<dimitern> rogpeppe: I might be missing some subtleties
<rogpeppe> dimitern: the second is more versatile (if we wanted to change UnitStatus to be more than a string, that would be ok); but that would be a major-version change in the current scheme too.
<rogpeppe> dimitern: i'm leaning towards option 1 currently. it's the simplest approach, and not hard to change in a backwardly compatible way.
<rogpeppe> dimitern: thanks for the discussion. i'll pass it by william when he comes online
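For reference, the two shapes being weighed look roughly like this (a sketch; the params status types are stand-ins, written here as plain string types):

```go
// Hypothetical stand-ins for the typed status constants in params.
type UnitStatus string
type MachineStatus string

// Option 1: one uniform schema per collection. Status is stored as a
// plain string and type-converted in Status/GetStatus, so the external
// interface stays typed while the doc stays uniform.
type statusDoc struct {
	Status string
	Info   string
}

func (d *statusDoc) unitStatus() UnitStatus { return UnitStatus(d.Status) }

// Option 2: a union-style doc, one field per entity kind, with the
// unused field omitted when marshalling.
type unionStatusDoc struct {
	UnitStatus    UnitStatus    `bson:",omitempty"`
	MachineStatus MachineStatus `bson:",omitempty"`
	Info          string
}
```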
<dimitern> fwereade: hey
<fwereade> dimitern, hey dude
<dimitern> fwereade: having a bit of discussion with rogpeppe here, about having a single statusDoc to simplify the allwatcher
<rogpeppe> fwereade: morning!
<fwereade> rogpeppe, heyhey
 * fwereade raises at least half an eyebrow but would like to hear more
<rogpeppe> fwereade: yeah - status is the only place (i think!) where we have more than one schema doc for a collection
<fwereade> rogpeppe, ah, yes
<rogpeppe> fwereade: it doesn't look like the gains are worth that much to me
<rogpeppe> fwereade: and if it was consistent, it would make my code simpler (and probably the existing code would be a little simpler too)
<fwereade> rogpeppe, well, it's really down to "are the unit status constants the same as the machine ones"
<fwereade> rogpeppe, I am not unsympathetic to arguments that, yes, they are
<rogpeppe> fwereade: in my mind it's down to "can they both be represented by a string?"
<rogpeppe> fwereade: but there's an alternative approach if we think they might grow bigger
 * fwereade is listening
<rogpeppe> fwereade: we could have a different field for the different kinds of status
<rogpeppe> fwereade: type statusDoc struct {UnitStatus params.UnitStatus `bson:",omitempty"`; MachineStatus params.MachineStatus `bson:",omitempty"`; Info string}
<fwereade> rogpeppe, mm, not so awfully keen there
<rogpeppe> fwereade: (which is actually the go standard approach to "unions" when marshalling)
<fwereade> rogpeppe, interesting
<rogpeppe> fwereade: but tbh at the moment i'm leaning towards type statusDoc struct {Status string; Info string}
<rogpeppe> fwereade: and (static) type-convert when returning from the Status function and calling SetStatus
<TheMue> moin
<rogpeppe> TheMue: hiya!
<fwereade> rogpeppe, that does sound reasonable, I think
<rogpeppe> fwereade: cool
<fwereade> TheMue, heyhey
<dimitern> rogpeppe: I have to finish nonced provisioning branches first, then I'll propose the change to status
<rogpeppe> dimitern: i'm happy to do it if you like
<rogpeppe> dimitern: it's on the critical path for me
<dimitern> rogpeppe: that'll be awesome
<dimitern> rogpeppe: thanks
<dimitern> fwereade: did we agree not to make MachineNonce required in the first CL (agent.Conf / MachineConfig)?
<fwereade> dimitern, I think we planned to require it and pass a non-empty fake
<fwereade> dimitern, but as you prefer
<dimitern> fwereade: yeah, so I recall correctly
<rogpeppe> dimitern: i'm just trying to understand this comment, that the _id in the status doc is omitted "to allow direct use of the document in both create and update transactions."
<dimitern> fwereade: hmm.. no actually - see here: https://codereview.appspot.com/8247045/diff/1/environs/agent/agent_test.go#newcode226
<rogpeppe> dimitern: how does omitting _id allow that?
<dimitern> rogpeppe: it allows you to do statusDoc{status, info}, rather than statusDoc{x.globalKey(), status, info}
<rogpeppe> dimitern: and if that's true, how do the other docs get away with having that field and using the same doc for both create and update transactions
<rogpeppe> ?
<rogpeppe> dimitern: so we're just trying to save one function call?
<dimitern> rogpeppe: no, we're still calling it, when we need to pass Id: of the txn.Op
<rogpeppe> dimitern: so is the comment misleading?
<rogpeppe> dimitern: is it actually "allowing" something, or is it just a convenience?
<dimitern> rogpeppe: maybe it should've been "allows you to avoid specifying an explicit key for the document" or something
<rogpeppe> dimitern: i'm only saying because i'd find it useful to have it there
<rogpeppe> dimitern: and i'm wondering if there's a particular reason to avoid it
<dimitern> rogpeppe: have the _id key?
<rogpeppe> dimitern: yeah
<rogpeppe> dimitern: it means a status doc can know its own id
<dimitern> rogpeppe: well, it works better for settingsRefDoc{1} than for statusDoc{x, y} perhaps
<dimitern> rogpeppe: is this that useful?
<fwereade> rogpeppe, sorry, which docs with _ids do we use in updates?
<fwereade> rogpeppe, I see a statusDoc as a dumb transport mechanism essentially -- to get one, you have to have a globalKey
<dimitern> fwereade: looking at the comment link I posted above, istm for that test to be valid, nonce shouldn't be required
<fwereade> rogpeppe, but in your case I can see how it would be useful
<fwereade> dimitern, ah, sorry
<fwereade> dimitern, we agreed required in MachineConfig
<dimitern> rogpeppe: but not in agent.Conf?
<fwereade> dimitern, every MC refers to a machine; not every Conf does
<rogpeppe> fwereade: yeah, the only other place is constraintsDoc, i see
<rogpeppe> fwereade: which equally doesn't have an id field
<dimitern> fwereade: ok
<fwereade> rogpeppe, I was asking about the other way round actually
<rogpeppe> dimitern: yeah, agent.Conf is used by all agents
<fwereade> rogpeppe, how often do we update whole docs?
<rogpeppe> dimitern: but the nonce is only used by the machine agent
<rogpeppe> fwereade: constraints is the only place
<dimitern> hey guys, meet danilos - newest member of blue squad!
<rogpeppe> fwereade: and i'm going to have the same issue there
<rogpeppe> danilos: welcome!
<fwereade> danilos, heyhey
<fwereade> rogpeppe, I am having trouble figuring out the problem though
<fwereade> rogpeppe, you get events which have the _id of the changed document
<fwereade> rogpeppe, that tells you everything you need to know, right?
<rogpeppe> fwereade: yeah, you're right, there's no problem.
<fwereade> rogpeppe, what's the situation where you need an _id and don;t have one available?
<fwereade> rogpeppe, ah ok
<rogpeppe> fwereade: i now understand that comment :-)
<dimitern> rogpeppe: care to explain what you understood for me? :)
<rogpeppe> dimitern: i'd thought that other docs were used in update transactions
<rogpeppe> dimitern: but actually they're not (and presumably you'll get an error if you do)
<fwereade> rogpeppe, yeah, IIRC it complains about setting _id even if it matches
<dimitern> rogpeppe: what other docs? they're only 2 of them
<rogpeppe> dimitern: unitDoc, machineDoc, etc
<dimitern> rogpeppe: ah, right!
<rogpeppe> fwereade: you could probably use omitempty, but yeah.
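The convenience under discussion, a statusDoc with no _id field feeding both create and update operations, comes out roughly like this with labix.org/v2/mgo/txn (a sketch; the setStatus helper and the "statuses" collection name are assumptions, not the real state package):

```go
package state

import (
	"labix.org/v2/mgo/bson"
	"labix.org/v2/mgo/txn"
)

type statusDoc struct {
	Status string
	Info   string
}

// setStatus reuses the same doc type for both paths: since statusDoc
// carries no _id field, the key lives only in the txn.Op, so an insert
// can take the doc verbatim while an update goes through $set.
func setStatus(runner *txn.Runner, globalKey, status, info string, create bool) error {
	op := txn.Op{C: "statuses", Id: globalKey}
	if create {
		op.Insert = statusDoc{Status: status, Info: info}
	} else {
		op.Update = bson.D{{"$set", bson.D{{"status", status}, {"info", info}}}}
	}
	return runner.Run([]txn.Op{op}, "", nil)
}
```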
 * dimitern needs to dash to the bank before it closes, bbiab
<danilos> rogpeppe, fwereade: hi, thanks :)
<mattyw> rogpeppe, if I wanted to call the api from a unit, is there anything special I need to do? compared to calling it from local machine?
<rogpeppe> mattyw: you'll need to know the address of the api server
<mattyw> rogpeppe, ok, we can grab that from a file - the same as the gui does
<rogpeppe> mattyw: and... it might be best if you don't give that unit the keys to the whole environment
<rogpeppe> mattyw: but it'll be ok for the time being if you do
<rogpeppe> mattyw: actually, you don't use that anyway, so there's no problem
<rogpeppe> mattyw: you'll have to store the admin secret in a file too
<rogpeppe> mattyw: rather than grabbing it from environments.yaml
<rogpeppe> mattyw: or... you could just do exactly the same thing and put bogus provider keys in the environments.yaml file
<rogpeppe> mattyw: although tbh it would be better not to - that way your unit can be portable across providers
<mattyw> rogpeppe, at the moment we can probably just send the admin-secret up to the unit via config setting in the charm
<rogpeppe> mattyw: seems reasonable
<rogpeppe> mattyw: not very "secret" though :-)
<mattyw> rogpeppe, yeah - it's not very long term
<TheMue> fwereade: ping
<fwereade> TheMue, pong
<TheMue> fwereade: i'm a bit trapped ;)
<fwereade> TheMue, I'm sorry about that, how can I help?
<TheMue> fwereade: i changed the doc to be embedded, i've missed that before. but then i ran into a mgo bug returning the error "slice of unaddressable array".
<rogpeppe> dimitern: https://codereview.appspot.com/8510043
<TheMue> fwereade: so i changed, just for testing, uuid back into a slice instead of an array. and hey, it works.
<fwereade> TheMue, then the complications of the array are too much hassle, I guess
<fwereade> TheMue, although, embedded?
<TheMue> fwereade: so next step, set bson:_id on the uuid field. and oh no, it is not hashable. *sigh*
<fwereade> TheMue, I meant to say a "doc" field
<rogpeppe> TheMue: i'd use string for the uuid
<TheMue> fwereade: eh, not embedded, as a regular field, like in the other entities. wrong word, sorry.
<rogpeppe> TheMue: simpler all round
<fwereade> rogpeppe, +1
<fwereade> TheMue, cool, just checking
<fwereade> rogpeppe, TheMue, I would be fine even if trivial.NewUUID returned a string
<rogpeppe> fwereade: sounds good to me
<fwereade> rogpeppe, TheMue, we don't really seem to get a lot of value out of the type itself
<TheMue> fwereade: ok, will do so, but only due to technological reasons, which i dislike. a uuid is a 16-octet array, and it has a string representation.
<TheMue> fwereade: but it shall be ok here.
<fwereade> TheMue, understood, but I'm fine with any data representation which is (1) unambiguous and (2) convenient
<rogpeppe> TheMue: i think there's no good reason why we need to tie ourselves to UUID "standards"
<TheMue> rogpeppe: i would prefer a database client not having troubles with it. ;)
<rogpeppe> TheMue: i'm not sure what you mean
<rogpeppe> TheMue: what db client is going to have a problem with an arbitrary string key?
<TheMue> rogpeppe: and it's no "standards", it's an OSF/DCE standard.
<TheMue> rogpeppe: not with a string, but the client should be able to handle a simple array of bytes.
<TheMue> rogpeppe: using the string representation here is just a compromise, but hey, it's ok.
<rogpeppe> TheMue: can you name a single advantage that we get from adhering to the OSF/DCE standard?
<rogpeppe> TheMue: there's at least one disadvantage that i can think of
<TheMue> rogpeppe: currently none, but the consequence is to ask for each standard why you follow it. later, e.g. when we have more integration aspects or new people coming into the team to maintain the code, it's not good if they have to ask themselves why a standard wasn't used here.
<TheMue> rogpeppe: and we also have no real reason to not follow the standard, only the problem of mgo.
<rogpeppe> TheMue: AFAICS the standard here just makes things a little more complex for no gain and some loss. in particular, you can't tell by seeing the UUID that it actually refers to a juju environment or which juju environment it might refer to.
<rogpeppe> TheMue: if i'm seeing lots of juju env UUIDs flying around, that's very useful information
<TheMue> rogpeppe: so you don't talk about a UUID anymore. you talk about an environment identifier and the decision of how it has to look.
<rogpeppe> TheMue: yes
<rogpeppe> TheMue: that's what we need
<TheMue> rogpeppe: that's a totally different discussion, even it is a fine one :)
<rogpeppe> TheMue: a universally unique environment identifier
<TheMue> rogpeppe: hehe
<TheMue> rogpeppe: would not have any problem with it. how would you ensure that it "speaks" and is also unique?
<TheMue> fwereade: what do you say about it? don't use a UUID but our own invention?
<rogpeppe> TheMue: "speaks"?
<rogpeppe> TheMue: we ensure that it's unique in exactly the same way as you're doing currently - by using random bytes.
<fwereade> TheMue, is there some problem with using a string representation of a UUID as you've already implemented it?
<TheMue> rogpeppe: yeah, i don't have a better word, just a direct translation. ;) an identifier that contains useful information.
<rogpeppe> fwereade: see my comments above. i think it would be nice if our environment identifiers at least identified themselves as environments
<TheMue> fwereade: please see the discussion above. not technological, but rogpeppe correctly mentioned that a regular one doesn't contain helpful information.
<rogpeppe> fwereade: BTW it would be wrong, i think, to use the UUID as the key in the environment collection.
<fwereade>  rogpeppe, TheMue: (1) we need to supply JUJU_ENV_UUID (2) if we're calling it a UUID, we should make it resemble one (3) it's convenient to represent it as a string, both in the data layer and when communicating it to hooks (4) make it a string :)
<fwereade> (5) that came from a UUID in the "standard" way
<fwereade> (6) unless there's some concern about ambiguity
<fwereade> rogpeppe, expand please
<rogpeppe> fwereade: it means that the only way to find out the uuid is by listing all items in that collection.
<rogpeppe> fwereade: which may in the future contain more items
<TheMue> fwereade: reasonable. where does somebody as a user have direct contact with it? or is it just a pure technical reference to identify an environment?
<fwereade> rogpeppe, if it contains more items, you'll be accessing it via an environment-specific state connection, won't you?
<fwereade> TheMue, some charms want to use it, I'm afraid I don't know the detailed use case
<rogpeppe> fwereade: it seems odd to use a key that you can't know in advance.
<fwereade> TheMue, also, landscape wants to be able to identify environments
<TheMue> fwereade: ok, and today it is a v4 uuid (in the py code).
<fwereade> TheMue, and that is a fine thing, and it's convenient to represent it as a string in our case
<fwereade> TheMue, do they not store it as a string in python?
<fwereade> TheMue, *surely* that's the convenient representation for yaml-in-zk?
<rogpeppe> i think a string is the right representation, and leaves us open to changing the representation easily in the future if we should choose to do so
<TheMue> fwereade: i'd have to look, they generate it with an external package imho.
<fwereade> TheMue, python is irrelevant really anyway
<rogpeppe> fwereade: i think an "environment uuid" can reasonably look different from other kinds of uuid
<rogpeppe> fwereade: i'm thinking of places where we've got something as-yet-to-be-invented that's dealing with lots of environments
<TheMue> fwereade: but i have no problem with the uuid as string, i'm only coming from a world where the types are used as long as possible and the conversion into a representation (string, xml, json etc) is done when needed. ;)
<rogpeppe> fwereade: and that an env-uuid would be a very useful thing to see in log files.
<fwereade> TheMue, ISTM that we are doing it when needed ;)
<TheMue> fwereade: due to database client reasons, but ok. :D
<rogpeppe> fwereade: it's really an *environment id* we're talking about here
<fwereade> rogpeppe, agreed -- but the UUID is unique, and no other aspect of an environment necessarily is
<rogpeppe> fwereade: which we need to be globally unique
<TheMue> fwereade: eh, btw, as a slice there's no problem. really strange.
<rogpeppe> TheMue: your Copy and Raw code is a bit dubious BTW
<rogpeppe> TheMue: both of them do exactly the same thing
<fwereade> rogpeppe, if we're dealing with lots of environments, we'll be identifying them by uuid
<fwereade> rogpeppe, surely?
<rogpeppe> fwereade: yes, but if there are lots of uuids flying around, it would be nice to know which ones refer to juju environments
<rogpeppe> fwereade: surely what?
<TheMue> rogpeppe: only different return types.
<rogpeppe> TheMue: Copy is redundant.
<rogpeppe> TheMue: uuid1 := uuid
<rogpeppe> TheMue: does the same
<fwereade> rogpeppe, ha, hadn't spotted that
<rogpeppe> TheMue: Raw can be just return [16]byte(uuid)
<TheMue> rogpeppe: came from a uuid being a slice first and indeed has to be changed, yes. thanks.
<fwereade> rogpeppe, TheMue: weren't we just going to drop the type and return a string?
<rogpeppe> fwereade: yeah, just saying for future reference.
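For the record, a tiny illustration of the point about Go array values, assuming UUID is a [16]byte array type as described:

    package trivial

    type UUID [16]byte

    // Raw returns the underlying array; a plain conversion is enough.
    func (u UUID) Raw() [16]byte { return [16]byte(u) }

    // A Copy method is redundant: arrays are values in Go, so
    //     u2 := u
    // already yields an independent copy.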
<fwereade> rogpeppe, TheMue: heh, I didn't know you could just "cat /proc/sys/kernel/random/uuid"
<TheMue> fwereade: any problem with keeping the type but adding a function NewUUIDString() string?
<rogpeppe> i'd prefer to see something like juju-24e5f46753
<rogpeppe> fwereade: cool
<TheMue> fwereade: there are many ways.
<rogpeppe> fwereade: that's a bit bigger than our 16 bytes :-)
<fwereade> rogpeppe, what sort of context are you concerned about?
<TheMue> rogpeppe: how would the juju-prefix help?
<rogpeppe> TheMue: it would say "this string of random hex bytes means something in the context of juju"
<fwereade> rogpeppe, huh? it just gives you a string representation, like we're discussing
<fwereade> rogpeppe, I don't see the use case there
 * TheMue neither
<rogpeppe> fwereade: we're talking about a name for something
<rogpeppe> fwereade: it's nice if names have some meaningful information in them, i think
<TheMue> rogpeppe: no, the name is the name, the uuid is simply an identifier.
<fwereade> rogpeppe, what TheMue says
<rogpeppe> TheMue: what is an identifier if not a name?
<fwereade> rogpeppe, I don't want to denormalize name out into the uuid -- what if names become changeable?
<fwereade> rogpeppe, a name is a property of an environment identified by a UUID
<TheMue> rogpeppe: it's a simple reference for something identifying it in a given context and not especially used by human beings.
<rogpeppe> fwereade: we already use the name as a key into the user's environment
<rogpeppe> TheMue: human beings will see these things all the time
<TheMue> rogpeppe: they will see the name of the environment, not the id
<rogpeppe> fwereade: i think there are probably sound reasons for not including the environment name actually (in particular it's nice to have a fixed-length key)
<rogpeppe> fwereade: but i still think our string representation should identify itself as a juju environment identifier
<fwereade> rogpeppe, yeah, and using plainly non-unique things as keys is rarely very sensible, it's going to be a problem for us
<rogpeppe> fwereade: it's a level of sanity checking apart from anything else
<rogpeppe> fwereade: it's only part of a key. uuids often contain non-unique sequence numbers too
<fwereade> rogpeppe, obviously not every bit of a uuid has a unique value :/
<fwereade> rogpeppe, look, it's a UUID
<rogpeppe> fwereade: indeed.
<fwereade> rogpeppe, a UUID looks like X
 * TheMue just references http://en.wikipedia.org/wiki/Universally_unique_identifier
<fwereade> rogpeppe, not like juju-X
<fwereade> rogpeppe, I'm not going to tell people "we decided to improve on UUIDs"
<fwereade> rogpeppe, even if they need improvement, now is not the time
<rogpeppe> fwereade: it's not "improving on". it's adding some more information so we can easily say "this cannot possibly be a juju uuid"
<rogpeppe> fwereade: there is absolutely nothing special about a UUID
<TheMue> rogpeppe: where do you expect we need to answer this question?
<rogpeppe> TheMue: UUIDs come from outside. whenever they do, we ask this question.
<fwereade> rogpeppe, why do we care about the answer?
<TheMue> rogpeppe: where do they come from outside?
<fwereade> rogpeppe, "here's the env" or "env not found"
<rogpeppe> fwereade: or "bogus environment identifier" - would be more helpful.
<TheMue> rogpeppe: if the code asks the db "hey, i wanna have the env with the id 1234" and it finds it, it's ok.
<rogpeppe> fwereade: if UUIDs were so special, there wouldn't be so many different versions of them.
<fwereade> rogpeppe, how so? who to? in what way does it pay for 3 engineers at $X/hr when there's a clear implementation out there already that we can just match?
<TheMue> rogpeppe: only 5
<fwereade> rogpeppe, so, you're saying we should improve on them...
<rogpeppe> fwereade: i'm saying they're just 16 random bytes in hex. job done.
<TheMue> rogpeppe: and those 5 versions differ in the way they are generated
<TheMue> rogpeppe: nothing else
<rogpeppe> TheMue: and in their string representations
<TheMue> rogpeppe: those may be identifiers, but they aren't uuids.
<rogpeppe> TheMue: why not?
<rogpeppe> TheMue: they're ids and they're universally unique.
<fwereade> rogpeppe, that is not the same thing as a UUID
<TheMue> rogpeppe: there is only one standard way to represent those 16 octets as a string, no more
<fwereade> rogpeppe, we need a UUID
<dimitern> looking for a second review on https://codereview.appspot.com/8247045/
<fwereade> rogpeppe, that is what we will be implementing
<fwereade> rogpeppe, not an I-decided-this-was-better-than-a-UUID
<TheMue> fwereade: +1
<rogpeppe> fwereade: it's trivially *exactly the same* as a UUID
<fwereade> rogpeppe, the bikeshed is puce, and I'm sticking to it
<fwereade> rogpeppe, words have meanings
<TheMue> rogpeppe: not the same, only similar
<TheMue> rogpeppe: and JUJU_ENV_UUID contains the term UUID, where people expect it to be one
 * rogpeppe never understood why the form of a set of random bytes mattered
<dimitern> rogpeppe: maybe due to parsing something expected
<rogpeppe> dimitern: it's a string...
<TheMue> rogpeppe: you're discussing a design decision that was made a long time ago. had the variable been introduced as JUJU_ENV_ID it would be more open.
<TheMue> rogpeppe: but the decision has already been made.
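For illustration, a minimal sketch of a generator along the lines agreed above: a version 4 UUID built from crypto/rand and returned directly in the canonical string form (the function name is hypothetical, floated earlier as NewUUIDString):

    package trivial

    import (
        "crypto/rand"
        "fmt"
    )

    // NewUUIDString returns a random (version 4) UUID rendered in the
    // canonical 8-4-4-4-12 hex form defined by RFC 4122.
    func NewUUIDString() (string, error) {
        var b [16]byte
        if _, err := rand.Read(b[:]); err != nil {
            return "", err
        }
        b[6] = b[6]&0x0f | 0x40 // version 4
        b[8] = b[8]&0x3f | 0x80 // RFC 4122 variant
        return fmt.Sprintf("%x-%x-%x-%x-%x",
            b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
    }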
<dimitern> rogpeppe: btw an easy quick review? https://codereview.appspot.com/8247045/
<rogpeppe> dimitern: i'll swap you https://codereview.appspot.com/8510043/
<dimitern> rogpeppe: already did yours ;)
 * rogpeppe refreshes his mail and sees it
<rogpeppe> dimitern: ta!
<rogpeppe> dimitern: "what happens if somehow an invalid value ends up here?"
<rogpeppe> dimitern: isn't it just the same as it was before?
<rogpeppe> dimitern: i'm not sure i see how things are any different now.
<dimitern> rogpeppe: well, before Status was typed, now it isn't
<rogpeppe> dimitern: i don't think that makes any difference
<rogpeppe> dimitern: each one could be read into the other before, as now
<dimitern> rogpeppe: it does until you cast it explicitly i guess
<rogpeppe> dimitern: if you used a machine id to fetch a unit status doc before, you'd get exactly the same results as you get now
<dimitern> rogpeppe: but anyway it's internally accessible only, so it'll be all right
<rogpeppe> dimitern: yeah
<rogpeppe> dimitern: we do no checking of valid statuses currently
<rogpeppe> dimitern: anyone could do m.SetStatus("fdvdsfvfdsvfdsv", "")
<rogpeppe> dimitern: with no repercussions
<dimitern> rogpeppe: not sure - it must be a params.MachineStatus value, mustn't it?
<rogpeppe> dimitern: no
<rogpeppe> dimitern: a constant string is assignable to a typed string
<rogpeppe> dimitern: you even do it yourself
<dimitern> rogpeppe: ha! didn't know that
<rogpeppe> dimitern: in Status you return "" for MachineStatus
<rogpeppe> dimitern: constants in Go are untyped
<dimitern> rogpeppe: :) cool, point taken
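A small, self-contained example of the untyped-constant point (SetStatus's signature is simplified here):

    package main

    type MachineStatus string

    func SetStatus(status MachineStatus, info string) {}

    func main() {
        // An untyped string constant is assignable to any named string
        // type, so the compiler accepts an arbitrary "status".
        SetStatus("fdvdsfvfdsvfdsv", "")
    }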
<dimitern> can someone tell me what I need to set up to run the ec2 live tests?
<danilos> heya, anybody seen this problem with juju-core and golang from raring: http://pastebin.ubuntu.com/5688989/? I am sure I am doing something wrong, but I wonder what (and I couldn't find any reference web page for the package net/http/cookiejar)
<mgz> dimitern: just source your ec2 creds and run in the environs/ec2 dir with -live I think
<mgz> though you may also have some other flags now as well
<dimitern> mgz: I don't have the same creds file as novarc for canonistack
<dimitern> danilos: you need to call "go build ./..." and then possibly "go test ./..." (note the dot and the slash)
<danilos> dimitern, ah, lovely syntax, thanks :)
<mgz> you don't? I'm pretty sure john or I emailed you the bzr creds
<dimitern> mgz: ha! let me look
<mgz> look for gpg encrypted junk :)
<dimitern> mgz: can't seem to find them; I just created an account with aws anyway
<dimitern> mgz: is there a wiki page with a similar setup for ec2, like for canonistack?
<mgz> there's a docs page
<mgz> but, I didn't re-edit the ec2 one, so it might not be as useful as the openstack one.... but you should work it out
<mgz> I think there's a wrinkle with juju-core in that it doesn't look for the euca2ools form of the envvars
<dimitern> so do I need anything else beside AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY env vars?
<mgz> ...I really want to have a session with evilnick at some point about what we need to make the docs sane for juju-core
<mgz> dimitern: only the other normal bits. you can set some things like the region to something other than us east in your environment if you want
<dimitern> mgz: ok
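Pulling the thread together, the setup looks roughly like this (the -live flag is as remembered above, so treat the exact invocation as approximate):

    export AWS_ACCESS_KEY_ID=...          # your EC2 credentials
    export AWS_SECRET_ACCESS_KEY=...
    cd $GOPATH/src/launchpad.net/juju-core/environs/ec2
    go test -live                         # plus whatever newer flags the suite has grown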
<rogpeppe> dimitern: you have a review https://codereview.appspot.com/8247045/
<dimitern> rogpeppe: cheers!
<rogpeppe> fwereade: any chance of a second opinion? https://codereview.appspot.com/8510043/
<dimitern> rogpeppe: any idea why i'm getting this error http://paste.ubuntu.com/5689043/ while running the ec2 live tests?
<rogpeppe> dimitern: are you getting it every time?
<dimitern> rogpeppe: that was the first run, but then I got "your account is being verified for the next 2h...", weird though - some tests pass, I can see instances running in the dashboard
<dimitern> rogpeppe: will try later, I just registered the account 30m ago
<rogpeppe> dimitern: with this kind of problem, i recommend ssh'ing to the machine as it's bootstrapping (as soon as the test logging prints its DNS name) and tail -f the cloud-init-output.log on the machine
<dimitern> rogpeppe: will do, thanks
<rogpeppe> dimitern: it looks like you've started the machine ok, but the mongo init has failed
<jamespage> bigjools, (or anyone else): are you still seeing SSL issues with the MongoDB in raring? I upload 2.2.4 on Friday which contained some SSL related fixes
<dimitern> rogpeppe: maybe it's something related to the account being not fully verified yet
<rogpeppe> dimitern: there is a race between the connecting client and the init process, but the code logic *should* deal with that
<rogpeppe> dimitern: i don't see anything ec2-related there
<rogpeppe> dimitern: unauthorizedError comes from mongo, i think
<dimitern> rogpeppe: I didn't paste the whole thing
<rogpeppe> dimitern: it managed to make the connection to the mongo on the new machine
<rogpeppe> dimitern: which indicates that your ec2 credentials worked fine
<dimitern> rogpeppe: that error was reported a few times: http://paste.ubuntu.com/5689057/
<rogpeppe> dimitern: that's different
<dimitern> probably only 1 instance at a time is allowed during these 2h
<rogpeppe> dimitern: perhaps
<rogpeppe> dimitern: i don't think the problem you pasted first is to do with that though
<TheMue> lunchtime
<rogpeppe> dimitern: have you still got an instance running from that account?
<dimitern> rogpeppe: how can I pause the test to inspect the log before it shuts down the instance?
<dimitern> rogpeppe: not anymore after the tests
<rogpeppe> dimitern: i generally ssh in while the test is running
<rogpeppe> dimitern: so you can see the log as it's being produced
<rogpeppe> dimitern: alternatively, just ^C the test
<dimitern> rogpeppe: I'll try just that test again, while sshing
<rogpeppe> dimitern: good plan
<rogpeppe> dimitern: one other thing
<dimitern> FAIL: live_test.go:136: LiveTests.TestInstanceGroups - that's the one
<rogpeppe> dimitern: i'd add a log statement
<rogpeppe> dimitern: in juju.NewConn
<rogpeppe> dimitern: after the "if state.IsUnauthorizedError" test
<rogpeppe> dimitern: so that we can be sure that logic is being triggered
<rogpeppe> dimitern: (if it gets an unauthorized error, it should continue trying to connect for a minute)
<rogpeppe> dimitern: and i recommend running only TestBootstrapAndDeploy to start with
<dimitern> rogpeppe: now it passed
<dimitern> rogpeppe: ok, I'll add the log stmt
<rogpeppe> dimitern: i don't think TestInstanceGroups was the one that failed for you, BTW
<dimitern> rogpeppe: but for the bootstrap+deploy to work, i need to be able to run more than 1 instance, right?
<rogpeppe> dimitern: yeah, but you'll see that error when it tries to initially connect to the state
<dimitern> rogpeppe: add log.Warningf("juju: unauthorized error while connecting to state server; retrying") ?
<rogpeppe> dimitern: sounds good
<dimitern> rogpeppe: in the "if IsUnauth()" block?
<rogpeppe> dimitern: actually, i think it's more a Notice
<rogpeppe> dimitern: because it can happen in the normal course of events
<rogpeppe> dimitern: yeah
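So the change being sketched is roughly this, inside juju.NewConn's connect loop (the surrounding loop and the exact logger call are assumptions based on the conversation):

    // Unauthorized errors are expected while the freshly bootstrapped
    // state server's mongo init races the connecting client, so log at
    // Notice level and keep retrying (for up to about a minute) rather
    // than failing outright.
    if state.IsUnauthorizedError(err) {
        log.Noticef("juju: unauthorized error while connecting to state server; retrying")
        continue
    }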
<rogpeppe> dimitern: i'm just wondering about name of the newly factored-out allWatcher
<rogpeppe> dimitern: it's no longer inherently an all-watcher - it just watches a collection of stuff
<rogpeppe> dimitern: i'm wondering about "combiWatcher" as a name
<rogpeppe> fwereade: thoughts? ^
<dimitern> rogpeppe: combinedWatcher?
 * fwereade reads back
<rogpeppe> dimitern: i'd really like something shorter
<rogpeppe> fwereade: i've factored the allWatcher code into its own package
<fwereade> rogpeppe, maybe multiWatcher?
<rogpeppe> fwereade: i wondered about that
<fwereade> rogpeppe, panopticon ;)
<fwereade> rogpeppe, (ok, not "pan")
<rogpeppe> fwereade: but several other things are actually "multi-watchers"
<dimitern> rogpeppe: bootstrap and deploy failed again, this time the notice is there
<rogpeppe> dimitern: did you ssh to the instance when it was bootstrapping?
<fwereade> rogpeppe, no more so than other things are combi-watchers, surely?
<rogpeppe> fwereade: yeah, i was thinking that
<dimitern> rogpeppe: no, because i'm figuring out how to set up my .ssh/config to allow me to use my newly generated keypair
<rogpeppe> dimitern: i do: ssh -i $home/.ec2/rog.pem ubuntu@$1
<rogpeppe> dimitern: in a script that i call "ec2ssh"
<rogpeppe> dimitern: is this against trunk?
<fwereade> dimitern, just `ssh ubuntu@` seems to work for me, I just have a standard id_rsa in ~/.ssh
<rogpeppe> fwereade: it depends if you generated an ssh key pair or not
<fwereade> rogpeppe, it was to having done that that I was alluding
<dimitern> I added Host *.amazonaws.com + the key
<rogpeppe> fwereade: ah yes, that will probably work with the usual authorized keys logic
<dimitern> rogpeppe: ssh-ing and tailing cloud-init-output.log shows no errors (just host key generated; cloudinit finished)
<rogpeppe> dimitern: hmm
<dimitern> running again
<rogpeppe> dimitern: can you bootstrap and run status ok?
<rogpeppe> dimitern: could you paste the log of the test run?
<dimitern> still running
<rogpeppe> dimitern: i mean the one that just failed
<rogpeppe> fwereade: i'm also looking for another name
<rogpeppe> fwereade: there are two types exported by multiwatcher
<rogpeppe> fwereade: one is the actual watcher used by clients
<rogpeppe> fwereade: which was called StateWatcher, but now is (provisionally) multiwatcher.Watcher
<rogpeppe> fwereade: the other is the type that's shared between several Watcher instances, that holds the record of everything that's currently around
<rogpeppe> fwereade: it's only there to be used by state
<rogpeppe> fwereade: i'm thinking multiwatcher.Store
<rogpeppe> fwereade: but i dunno
<dimitern> rogpeppe: cloud-init-output.log: http://paste.ubuntu.com/5689114/
<dimitern> rogpeppe: test log: http://paste.ubuntu.com/5689116/
<dimitern> rogpeppe: that's the one
<fwereade> rogpeppe, Store sgtm
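As a summary of the naming just settled on, the package's two exported types would look roughly like this (shapes hypothetical, details elided):

    package multiwatcher

    // Watcher is the client-facing watcher, formerly StateWatcher.
    type Watcher struct {
        // ...
    }

    // Store is shared between several Watcher instances and holds the
    // record of everything currently in the environment; it exists
    // only for use by state.
    type Store struct {
        // ...
    }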
<rogpeppe> dimitern: ah, so it did fail
<rogpeppe> dimitern: see line 131 of the cloud-init output
<rogpeppe> dimitern: it failed to download the tools
<dimitern> rogpeppe: yeah, I saw that - when I was tailing it before it was too late probably
<rogpeppe> dimitern: perhaps you're not authorized to push to s3
<dimitern> rogpeppe: could be, that's why I'll try in an hour or so, after verification
<rogpeppe> dimitern: i'd be interested in exploring this failure mode actually
<rogpeppe> dimitern, fwereade: i wonder if we shouldn't pipeline wget straight into tar xz
<rogpeppe> we should probably download the tools, check the sha1sum and *then* untar them
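A sketch of the check-then-untar idea, assuming the expected digest is delivered via the (trusted) cloud-init data rather than computed on the instance:

    package main

    import (
        "crypto/sha1"
        "encoding/hex"
        "fmt"
        "io"
    )

    // verifyTools reads the whole tools tarball, checks its SHA-1
    // against the expected digest, and only then returns the bytes for
    // untarring - instead of piping wget straight into tar xz.
    func verifyTools(r io.Reader, wantSHA1 string) ([]byte, error) {
        data, err := io.ReadAll(r)
        if err != nil {
            return nil, err
        }
        sum := sha1.Sum(data)
        if hex.EncodeToString(sum[:]) != wantSHA1 {
            return nil, fmt.Errorf("tools checksum mismatch")
        }
        return data, nil
    }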
<fwereade> rogpeppe, +1 in spirit although I would prefer that they were actually signed
<rogpeppe> fwereade: what would signing give us that an sha1sum doesn't?
<rogpeppe> fwereade: (after all, signing is usually just signing a hash of the content anyway, right?)
<fwereade> rogpeppe, requires compromise of the actual key used to sign, rather than just compromise of the hash and binary delivery channels
<rogpeppe> fwereade: really?
<fwereade> rogpeppe, isn't that the difference? what it takes to fake it?
<rogpeppe> fwereade: isn't the key delivered through the same channel?
<rogpeppe> fwereade: if someone can compromise your cloudinit, it doesn't matter what key you used to sign
<fwereade> rogpeppe, I'm just talking about how the binaries/hashes get to juju
<rogpeppe> fwereade: yeah - through cloudinit, right?
<rogpeppe> fwereade: well, the hashes anyway
<rogpeppe> fwereade: i think we have no choice but to trust the integrity of the cloud-init script
<fwereade> rogpeppe, I wasn't concerned about cloud-init
<rogpeppe> fwereade: in which case, a secure hash is as good as a signed binary, i think
<rogpeppe> fwereade: it's a different matter for the mongo binary though
<fwereade> rogpeppe, who generates the hash, and how do we trust them?
<rogpeppe> fwereade: Bootstrap generates the hash
<rogpeppe> fwereade: ah... doh!
<rogpeppe> fwereade: public tools, sorry
<rogpeppe> fwereade: i was thinking of upload-tools
<fwereade> rogpeppe, yeah, I had the same realisation just now
<rogpeppe> fwereade: yeah, we need signing
<rogpeppe> fwereade: and for the public key to be embedded in the juju source.
<fwereade> rogpeppe, exactly
<fwereade> rogpeppe, ah, hmm
<rogpeppe> fwereade: well, possibly with an override
<fwereade> rogpeppe,  we can't let a potentially-evil binary verify its own integrity
<rogpeppe> fwereade: i'm not sure there's any alternative
<fwereade> rogpeppe, the pubkey goes in the client, pushed up to cloudinit, verify signature of downloaded tools inside cloudinit
<fwereade> rogpeppe, untrusted client = game over anyway
<rogpeppe> fwereade: exactly
<rogpeppe> fwereade: we do need an override though, so that sync-tools can work
<rogpeppe> sync-tools --public, that is
<fwereade> rogpeppe, upload-tools?
<rogpeppe> fwereade: that too
<fwereade> rogpeppe, for sync-tools they're just the same files in a different location, though, right?
<fwereade> rogpeppe, upload-tools involves either overriding or just disabling the checks in development mode
<fwereade> rogpeppe, however I would rather have the override just so we're always exercising that code path
<rogpeppe> fwereade: sync-tools --public is different, because other juju environments will use that public bucket
<fwereade> rogpeppe, I'm not following
<rogpeppe> fwereade: so we need to provide a env config way of specifying the tools/mongo(/either?) public key, i think
<fwereade> rogpeppe, I agree that's the way to do it
<rogpeppe> fwereade: bootstrap --upload-tools is {generate key; generate binaries; upload binaries signed with new key; bootstrap with cloudinit holding new key}
<rogpeppe> fwereade: normal bootstrap is {get key from somewhere (defaulting to our public key); bootstrap with cloudinit holding that key}
<fwereade> rogpeppe, yeah, sgtm
<rogpeppe> s/somewhere/environment config/
<fwereade> rogpeppe, yep
<rogpeppe> fwereade: i guess we want to allow sync-tools to generate a key too
<mgz> fwereade: have we got any good starter bugs for danilo in juju-core?
<fwereade> mgz, ooh, let me take a look, just a sec
<fwereade> mgz, hmm, how about keyauth support for goose? you'd know better than I
<fwereade> mgz, it's the card that looks most obviously bitesize to me
<mgz> ah, that would be good
<fwereade> mgz, alternatively, if he wants to hit core, then "hook execution serialization" might be a good one -- it should be relatively simple code in a single location
<mgz> danilo: bug 1135335
<_mup_> Bug #1135335: Keyauth support not available for Openstack providers <Go OpenStack Exchange:Triaged> <juju-core:New> < https://launchpad.net/bugs/1135335 >
<fwereade> danilos, sorry, don't know why I'm talking about you instead of to you
<danilos> fwereade, hi, let me read up :)
<fwereade> danilos, keyauth for openstack is probably better because it demands a bit less context, I think
<danilos> mgz, fwereade: ok, I'll start with the keyauth and then after I am done I'll take a look at hook execution serialization (I assume that's bug 1121968)
<_mup_> Bug #1121968: hook serialization per-system <juju-core:New> < https://launchpad.net/bugs/1121968 >
<fwereade> danilos, perfect, thanks
<fwereade> danilos, I'm going to eat in a mo but feel free to ping me if you have any questions, I shouldn't be long
<mgz> so, the trick with keyauth is to look at the current docs, and python implementation, then fixup the goose code
<mgz> https://juju.ubuntu.com/docs/provider-configuration-openstack.html
<danilos> fwereade, thanks, sounds good
<mgz> lp:juju juju/provider/openstack/credentials.py and ./client.py and tests
<mgz> then lp:goose identity/
<fwereade> danilos, incidentally, before you do work on core, you might find it helpful to read the developer docs (most of the doc/ dir)
<mgz> I'm happy to be bugged with questions or pair if you'd find that helpful
<danilos> fwereade, yeah, started on that already, thanks
<mgz> and yeah, the docs directory in juju-core is very useful for background on the design and implementation
<fwereade> danilos, and I would be most grateful for criticisms from your POV, if there's anything unclear
<danilos> mgz, sure, I'll take a look around first and will ping you as needed
<danilos> fwereade, ack
<TheMue> fwereade: another ping
<dimitern>  fwereade: updated https://codereview.appspot.com/8429044/ PTAL
<dimitern> rogpeppe: also you please? ^^
<TheMue> dimitern: MachineNonce, my translator tells me that a nonce is a paedophile. i hope there are other meanings too. ;)
<dimitern> TheMue: :D - see here http://en.wikipedia.org/wiki/Cryptographic_nonce
 * dimitern bbi30m
<TheMue> dimitern: ah, not http://en.wikipedia.org/wiki/Nonce_(slang)
<rogpeppe> dimitern: why do we need a special value for the bootstrap nonce?
<rogpeppe> TheMue: that's new to me. the other meaning i know for nonce is "homosexual".
<TheMue> rogpeppe: reminds me of the Mitsubishi Pajero, a word with a special meaning in South America :)
<TheMue> rogpeppe: always interesting when terms have so different or multiple meanings
<rogpeppe> dimitern: you have a review
 * dimitern is back
<dimitern> rogpeppe: tyvm
<dimitern> rogpeppe: because it's not generated by the provisioner
<dimitern> fwereade: ping
<fwereade> dimitern, pong
<dimitern> fwereade: https://codereview.appspot.com/8429044/ :)
<fwereade> dimitern, cool
<rogpeppe> dimitern: that's true, but i'm not sure i see why it needs to be a particular special value
<dimitern> rogpeppe: well, it has to have some value, and it's already special, so why not?
<rogpeppe> dimitern: giving it a special constant makes it seems like it *has* to be that value
<dimitern> rogpeppe: i don't understand your point, sorry - expand a bit more on why this is a bad idea?
<rogpeppe> dimitern: i'm not sure it is.
<TheMue> fwereade: a ping by me too ;)
<dimitern> rogpeppe: what would you rather have there instead?
<rogpeppe> dimitern: i'd probably just have the provisioners pass in ""
<fwereade> rogpeppe, it does have to be that value
<dimitern> rogpeppe: the provisioner will never see that
<dimitern> rogpeppe: it'll be already there at bootstrap time
<rogpeppe> dimitern: oops, i meant the provider
<rogpeppe> fwereade: oh?
<fwereade> rogpeppe, that will be done in jujud bootstrap-state, I think, by passing it into InjectMachine (IIRC)?
<dimitern> rogpeppe: ah, well I don't like "", it could be generated at bootstrap time, like in the provisioner though, so it's not fixed
<rogpeppe> dimitern: i don't think that would help
<fwereade> dimitern, I don't think there's much payoff to making it configurable
<TheMue> fwereade: may i reserve the next ping-timeslot? ;)
<fwereade> TheMue, sorry, pong
 * rogpeppe quite likes the zero value as a default
<TheMue> fwereade: hehe
<TheMue> fwereade: e'thing is now green, only one thing is open
<dimitern> rogpeppe: well, what's not to like about "user-admin:bootstrap" ? :)
<rogpeppe> dimitern: it makes it look like there's more structure there than there actually is
<TheMue> fwereade: i changed the operation order when creating the env and co after Dimiters review.
<fwereade> TheMue, yeah, you changed it the wrong way, it was right before
<dimitern> rogpeppe: that's because you don't yet know the format of what the PA will generate
<TheMue> fwereade: you say that this order is wrong. could you explain why and which kind of error you expect here?
<fwereade> TheMue, if a settings document exists without a matching entity, no big deal
<dimitern> rogpeppe: it'll be similar: "machine-<id>:<random hex token of (possibly) fixed length>"
<rogpeppe> dimitern: why not just a uuid? :-)
<benji> I can't bootstrap, I'm getting "error: cannot log in to admin database: auth fails"
<fwereade> rogpeppe, because the story is not specified to require a UUID?
<dimitern> rogpeppe: it's too heavy perhaps..
<TheMue> fwereade: but doesn't the txn ensure that all is created or nothing?
<rogpeppe> fwereade: i'd have thought that just a random hex token would be enough
<rogpeppe> fwereade: but if we think we want an extra prefix, then who am i to argue? :-)
<dimitern> rogpeppe: it has to be unique across the environment
<rogpeppe> dimitern: sure. that's what randomness gives you.
<fwereade> TheMue, yes, but with some subtleties
<dimitern> rogpeppe: so even a counter should do
<benji> I have run go get -u launchpad.net/... and done a build -a ./... and still have the problem.  I have also switched S3 buckets (which has fixed similar issues in the past)
<rogpeppe> dimitern: sure.
<fwereade> TheMue, txn execution might be interrupted and not resumed for a while
<TheMue> fwereade: ok, that's an argument
<dimitern> benji: ec2?
<fwereade> TheMue, we would like to use the existence of the entity documents as guarantees that certain other documents exist
<benji> dimitern: yep
<rogpeppe> dimitern: i just like the simplicity of "bootstrap" - it suggests a simple starting point.
 * TheMue likes real dbms transactions ;)
<dimitern> benji: I had the same issue - are you using a newly created ec2 account?
 * fwereade has become quite fond of mgo/txn
<TheMue> fwereade: so the order is constraints, settings, environment
<benji> nope; this one has been active for a few years
<rogpeppe> dimitern: and it's a really special value, so it doesn't matter that it's a different format. in fact that might even be better.
<rogpeppe> dimitern: anyway, i don't wanna push any more. i just wanted to raise the question.
<fwereade> TheMue, there's no relationship between constraints and settings, but both must come before the environment doc itself
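In mgo/txn terms the ordering described above looks roughly like this (collection names and doc values are illustrative):

    package state

    import "labix.org/v2/mgo/txn" // the mgo/txn package juju-core used at the time

    func createEnvironmentOps(uuid string, constraintsDoc, settingsDoc, envDoc interface{}) []txn.Op {
        // Ops are applied in slice order, and a transaction may be
        // interrupted and resumed much later, so the dependent docs go
        // first and the entity doc last: its existence then guarantees
        // the others exist.
        return []txn.Op{
            {C: "constraints", Id: uuid, Insert: constraintsDoc},
            {C: "settings", Id: uuid, Insert: settingsDoc},
            {C: "environments", Id: uuid, Insert: envDoc},
        }
    }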
<TheMue> fwereade: ok, then the CL will flow in in a few seconds
<dimitern> rogpeppe: well, I think we should keep it simple and use a const, what exact value  to choose doesn't bother me that much
<dimitern> fwereade: perhaps you can chip in on your idea behind "user-admin:bootstrap" meaning?
<fwereade> benji, would you try pointing your public-bucket to an empty one?
<benji> fwereade: control-bucket?
<dimitern> fwereade: I had the same issue earlier, with a new bucket (well, haven't created it, just picked juju-askfjasljgfjgljfg)
<fwereade> benji, public-bucket
<dimitern> but I did set control-bucket, no public
<benji> fwereade: I don't have a config setting with that name.  Should I add one?
<fwereade> benji, actually, simpler: would you check the current version number in version/version.go?
<benji> fwereade: const version = "1.9.14"
<fwereade> benji, if you were to add one we could eliminate the possibility of inappropriate tools being picked up for some reason
<dimitern> fwereade, rogpeppe: I got it - bootstrap nonce is badged with the tag of the responsible entity
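So the constant under discussion is, in sketch form (the exact name is hypothetical):

    // BootstrapNonce marks the bootstrap machine, badged with the tag
    // of the entity responsible for creating it; provisioner-created
    // machines instead get "machine-<id>:<random hex token>".
    const BootstrapNonce = "user-admin:bootstrap"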
<benji> fwereade: will do
<fwereade> benji, we have surprising fallback behaviour across buckets that I'm working on
<fwereade> benji, it's not certain to be the cause but at least it eliminates one avenue of ugliness
<benji> fwereade: do I need to make that bucket public in S3?
<fwereade> benji, not at all
<benji> k
<fwereade> benji, it defaults to "juju-dist" and gets the released tools
<fwereade> benji, but if you point it somewhere empty, it'll work fine so long as you have sensible tools in the control-bucket
<fwereade> benji, (well, maybe it will: there may be more going on, but that's how to get a definitely-clean environment)
<benji> fwereade: I'm bootstrapping now with new public and control buckets
<fwereade> benji, thanks -- if it's weird again, please paste me /var/log/cloud-init-output.log
<benji> k
<TheMue> Sh**, forgot to remove a comment.
<rogpeppe> fwereade: here's an interesting thing
<rogpeppe> fwereade: the API server does need to use JujuHomePath
<rogpeppe> fwereade: because it gets charms
<rogpeppe> fwereade: and therefore needs a charm cache
<rogpeppe> fwereade: and that's how the charm cache works
<fwereade> rogpeppe, that's an argument for parameterising the charm cache, I think
<rogpeppe> fwereade: perhaps we should divorce the charm cache from JUJU_HOME
<fwereade> rogpeppe, +1
<rogpeppe> dimitern: here's your allwatcher refactoring branch. it's a lot of lines but no logic changes: https://codereview.appspot.com/8458044/
<dimitern> rogpeppe: cheers, will look shortly
<rogpeppe> dimitern: thanks
 * dimitern hates this error already! => 2013/04/08 15:53:59 ERROR JUJU:juju:status state: connection failed, paused for 2s: dial tcp 23.21.7.175:37017: connection refused
<dimitern> it's arguable whether it's even an error
<dimitern> so, practically I cannot do anything live on ec2, because there are no charms for quantal and no tools can be found without --upload-tools (or they fail to unpack during cloudinit)
<rogpeppe> dimitern: yup
<rogpeppe> dimitern: you can always download charms and rename them to quantal :-)
<rogpeppe> dimitern: it's not an error
<dimitern> I had to resort to juju deploy cs:~charmers/quantal/glance-4 as the easiest :)
<rogpeppe> dimitern: it's a Notice
<rogpeppe> dimitern: oh, hold on!
<rogpeppe> dimitern: you can use fake-tools
<rogpeppe> dimitern: or whatever it's called
<dimitern> rogpeppe: eh?
<dimitern> rogpeppe: --upload-tools --fake-tools ?
<rogpeppe> dimitern: juju bootstrap --upload-tools -fake-series precise
<rogpeppe> dimitern: it's in trunk
<rogpeppe> fwereade: the help message should say that the series are comma-separated (assuming they are)
<niemeyer> Hullah
<rogpeppe> niemeyer: yo!
<rogpeppe> niemeyer: good news, eh?!
<niemeyer> rogpeppe: indeed!
<dimitern> rogpeppe: cool, i'll try that next
<rogpeppe> niemeyer: hopefully it'll be ultra-stable now :-)
<niemeyer> rogpeppe: Let's hope so :)
<dimitern> mramm: you around?
 * dimitern yes! fake_nonce and the provisioner still work
<benji> fwereade: same error; however, I don't see my new public bucket listed in S3; is the config setting simply "public-bucket" under the environment name?  like so:
<benji> environments:
<benji>   ec2:
<benji>     type: ec2
<benji>     public-bucket: foo
<fwereade> benji, yeah
<benji> I wonder why it isn't listed in s3 then
 * dimitern lunch, finally
<fwereade> benji, ok, I guess that wasn't it
<fwereade> benji, we don't try to create it
<benji> oh!  I'll create it and try again
<fwereade> rogpeppe, offhand, will we list an empty bucket without error? I *think* we do
<rogpeppe> fwereade: i think so too
<fwereade> benji, that shouldn't matter
<fwereade> rogpeppe, I have been up to the elbows in a not-quite-up-to-date branch, haven't actually run tip lately
<fwereade> rogpeppe, have you?
<rogpeppe> fwereade: i think so.
<rogpeppe> fwereade: actually, maybe not
<rogpeppe> fwereade: i'll check again
<dimitern> fwereade: it does create it, just tried
<fwereade> dimitern, all the better then
<mattyw> where do all the charm files in a unit get installed under juju-core? In pyju it was /var/juju/....
<benji> fwereade: I get the same error with a newly-created public-bucket
<fwereade> benji, sorry, I clearly didn't communicate that it wouldn't make any difference -- would you paste me cloud-init-output.log please?
<benji> fwereade: you communicated fine; I tried it anyway ;)
<rvba> fwereade: Hi… in MAAS we don't have shared storage to upload the tools once and for all… would it be reasonable to get the provider to always upload the tools as part of the Bootstrap process?
<fwereade> rvba, I'd recommend that people sync-tools before bootstrap
<mattyw> mattyw, ^^ I'm an idiot :(
<rvba> fwereade: so we don't do anything special in the provider itself, but we encourage people to use 'juju --upload-tools' ?
<fwereade> rvba, I'd prefer to avoid that
<rvba> fwereade: you mean avoid uploading the tools in the provider when Bootstrap() is called?
<fwereade> rvba, jam recently added the juju sync-tools command
<rvba> Ok, seems sane to me to keep the providers as similar as possible.
<fwereade> rvba, if you have ec2 credentials in your env it will copy the latest ones form the official juju-dist bucket
<fwereade> rvba, it's an extra step but not too painful
<fwereade> rvba, I hope
<rvba> fwereade: sounds good to me.
<rvba> I'll cleanup the MAAS' provider code now.
<rvba> fwereade: thanks!
<mattyw> rogpeppe, ping?
<benji> fwereade: I had to re-bootstrap so it took a while, but now I am confused as to how to fetch cloud-init-output.log if I can't ssh into the machine
<fwereade> benji, can you not just ssh ubuntu@dns-name?
<benji> fwereade: I'll try that
<benji> fwereade: https://pastebin.canonical.com/88664/
<fwereade> benji, hmm, how about /var/lib/juju/... um, poke around for an agent.conf somewhere inside a dir called "bootstrap"?
<rogpeppe> mattyw: pong
<rogpeppe> mattyw: (sorry, was in a call)
<mattyw> rogpeppe, no problem? can I borrow you in cloud-green?
<rogpeppe> mattyw: sure
<benji> fwereade: /var/lib/juju/agents/machine-0/agent.conf perhaps?
<fwereade> benji, damn, sorry, I got completely distracted by test failures
<fwereade> benji, I *think* there should be another one for bootstrap stuff
<fwereade> rogpeppe, can you confirm how that one works ^^
<rogpeppe> fwereade: sorry, how what works?
<benji> failing tests can have that effect on people :)
<fwereade> rogpeppe, the agent conf for bootstrap-state
<fwereade> rogpeppe, does it get it by pretending to be an agent called "bootstrap" or something?
<rogpeppe> fwereade: something like that, yes
<rogpeppe> fwereade: and the agent conf gets removed immediately
<benji> fwereade: I have completely rebuilt the world (as best I understand how in go) and it seems to be working now.
<rogpeppe> fwereade: after bootstrap-state has run
<benji> boundless confidence
<fwereade> rogpeppe, ah, ok -- funny, it seemed like benji was seeing bootstrap-state falling over
<rogpeppe> fwereade: it's a very short-lived agent :-)
<fwereade> rogpeppe, I'd expect the conf to be somewhere still around
<rogpeppe> fwereade: no, i'm fairly sure it gets removed after use
<fwereade> rogpeppe, that's a bit annoying if it happens even on failure
<fwereade> benji, heh :(
<rogpeppe> fwereade: yeah. tbh, we should probably do a set -e at the top of cloudinit
<fwereade> benji, my current intention is not to leave this desk until I have sane tools
<rogpeppe> fwereade: i've thought that a few times before but never actually done it
<niemeyer> I'm getting a failure on CmdSuite.TestDestroyEnvironmentCommand
<niemeyer> Is that known?
<fwereade> niemeyer, no, I don't think so; bug please :)
<niemeyer> [LOG] 51.40394 DEBUG api: <- error: read tcp 127.0.0.1:59023: use of closed network connection
<niemeyer> [LOG] 51.40395 ERROR api: error receiving request: read tcp 127.0.0.1:59023: use of closed network connection
<niemeyer> [LOG] 51.41987 ERROR rpc: client protocol error: unexpected EOF
<niemeyer> fwereade: Cool
<benji> fwereade: stay hydrated
<fwereade> niemeyer, it's usually the bit just above that's important
<fwereade> benji, cheers :)
<niemeyer> fwereade: I don't see anything obvious
<fwereade> niemeyer, but if nothing failed before that rogpeppe will be interested ;)
<rogpeppe> he will indeed :-)
<niemeyer> :)
<niemeyer> Opening a bug
<niemeyer> Ah, hold on
<niemeyer> ... Panic: unauthorized (PC=0x42ED71)
<rogpeppe> niemeyer: trunk tests pass for me
<niemeyer> That's the issue
<rogpeppe> niemeyer: that *might* have something to do with it :-)
<niemeyer> /home/niemeyer/src/launchpad.net/juju-core/testing/mgo.go:205
<niemeyer>   in MgoReset
<rogpeppe> niemeyer: ah, i think i know what the issue is
<rogpeppe> niemeyer: i've been thinking i could put it off, as it didn't seem to raise its head in normal tests
<rogpeppe> niemeyer: the problem, i *think*, is that in the tests the API server runs as the same mongo user that the client connects as
<rogpeppe> niemeyer: and the client changes its password immediately after connecting
<rogpeppe> niemeyer: which invalidates the API server connection
<niemeyer> rogpeppe: I see
<rogpeppe> niemeyer: the answer is to have the API server connect as a different user (for instance, by creating machine 0 and setting the mongo password for the machine-0 entity)
<rogpeppe> niemeyer: but that buggers up lots of tests
<rogpeppe> niemeyer: so i left it on the back burner until i had a bit more time
<niemeyer> rogpeppe: So a race.. okay
<rogpeppe> niemeyer: yeah
<rogpeppe> niemeyer: i saw the issue when i lowered the state/watcher refresh interval
<niemeyer> fwereade: Done your suggestions
<niemeyer> fwereade: On the publish command
<niemeyer> fwereade: I also ended up just fixing InferURL so it handles the no-series case properly
<fwereade> niemeyer, lovely, thanks, I'll try to take a look tonight
<fwereade> niemeyer, <3
<niemeyer> fwereade: Rather than risking a change in behavior later
<niemeyer> fwereade: Thanks
<fwereade> niemeyer, yeah, +1
<rogpeppe> fwereade: there's one time that i don't have the _id to hand
<fwereade> rogpeppe, oh yes?
<rogpeppe> fwereade: which is when i fetch everything at the beginning
<rogpeppe> fwereade: i guess i'll just duplicate the entity docs to give myself an id
<fwereade> rogpeppe, can't you just fetch all the other stuff after fetching the entities that use them?
<rogpeppe> fwereade: i'd prefer to avoid n round trips
<rogpeppe> fwereade: although actually, now you come to mention it
<rogpeppe> fwereade: hmm no
<rogpeppe> fwereade: we really *should* avoid all those round trips if we can
<rogpeppe> fwereade: i guess i'll just leave it as a todo
<fwereade> rogpeppe, ah! yes, I see... sorry about that then
<fwereade> rogpeppe, embed  the doc in a wrapper with _id?
<fwereade> rogpeppe, when getting all at once I mean
<rogpeppe> fwereade: depends whether bson embedded types work properly or not, i suppose
<rogpeppe> fwereade: unfortunately it doesn't work
<fwereade> rogpeppe, bah
<fwereade> rogpeppe, very well, do as you must
<rogpeppe> fwereade: i'll go with the round trips for now
<TheMue> Anyone interested in sparing me a second LGTM: https://codereview.appspot.com/8322043
<fwereade> niemeyer, does http://paste.ubuntu.com/5689862/ look like something you'd know about?
<niemeyer> fwereade: Looking
<niemeyer> fwereade: Hmm
<niemeyer> fwereade: I can see the issue, but I don't know why it's happening
<niemeyer> fwereade: If I run "bzr log -v --long" here
<niemeyer> fwereade: i get stuff like this:   bzr/bzr.go                     bzr.go-20130403210030-zt48wpftmd160ox3-2
<niemeyer> fwereade: IOW, there's a digest after the filename
<niemeyer> fwereade: The output you're observing doesn't have the digest
<niemeyer> fwereade: It's easy to fix the test by dropping the empty space
<niemeyer> fwereade: Slightly curious about why bzr is showing different results for that, though
<fwereade> niemeyer, it's what I see when I do `bzr log -v --long` here
<niemeyer> fwereade: It doesn't matter in this specific test, either way
<fwereade> niemeyer, cool
<niemeyer> fwereade: Right, so it's a mystery why the output string in that test doesn't show it
<niemeyer> fwereade: But again, it doesn't matter.. the test is juts checking that the given file was added
<fwereade> niemeyer, no, the weird version is what I see
<niemeyer> fwereade: To verify that the commit behavior was basically sane
<fwereade> niemeyer, yeah -- the other bits of the test look like they do that ok
<niemeyer> fwereade: I suggest just dropping the space, on the basis that we don't care
<fwereade> niemeyer, ok, I'll bug it and assign it to myself, I don't want to get myself too distracted :)
<niemeyer> fwereade: Btw, these multiple line outputs are handy.. I haven't seen it working in practice very often :)
<niemeyer> fwereade: I can quickly shoot a one-liner
<niemeyer> fwereade: Hold on
<fwereade> niemeyer, tyvm
<fwereade> niemeyer, I've found them pretty handy in general
<niemeyer> fwereade: https://codereview.appspot.com/8521043/
<fwereade> niemeyer, sorry, but I don't see the revision-id in that output
<fwereade> niemeyer, that's also a (surprising) part of the problem
<niemeyer> fwereade: Oh, indeed.. I missed that
<fwereade> niemeyer, maybe I have a hideously outdated bzr?
<niemeyer> fwereade: I have no idea then
<niemeyer> fwereade: It looks like --long was ignored entirely
<niemeyer> fwereade: Oh, hold on
<niemeyer> fwereade: I probably mixed the options.. I may have some defaults locally.. just a sec
<niemeyer> fwereade: Yep
<niemeyer> fwereade: --show-ids
<niemeyer> fwereade: Can you please run log with --show-ids there and see if it helps?
<fwereade> niemeyer, bingo
<niemeyer> fwereade: Okay, so..
<fwereade> niemeyer, revision-id: fwereade@gmail.com-20130408082602-zu4ypfo9xoexns33
<niemeyer> fwereade: The test already had --show-ids.. :)
<niemeyer> fwereade: Does it show the id next to the file as well?
<fwereade> niemeyer, yeah
<niemeyer> fwereade: So there's something funny going on there
<niemeyer> fwereade: Why isn't that working in the test?
<niemeyer> fwereade: It already uses -v --long --show-ids
<fwereade> niemeyer, oh crap, my branch isn't up to date
<niemeyer> fwereade: Hah. Tim fixed it
<niemeyer> fwereade: Mystery solved
<fwereade> niemeyer, heh
<fwereade> niemeyer, thanks, sorry to distract
<niemeyer> fwereade: np
<rogpeppe> next CL in pipeline, if anyone cares to have a look: https://codereview.appspot.com/8487044/
<rogpeppe> fwereade: ping
<fwereade> rogpeppe, pong
<rogpeppe> fwereade: i just noticed that the testing charm name contained the series, and wondered why
<rogpeppe> fwereade: like cs:series/series-wordpress
<rogpeppe> fwereade: it looked redundant, but i imagine there's a good reason
<rogpeppe> fwereade: it confused me for a while - i thought i'd mucked something up
<fwereade> rogpeppe, convenience somewhere... possibly getting an easy and unique ident string?
<rogpeppe> fwereade: i'd've thought the charm name would be enough
<rogpeppe> fwereade: (within the series, of course)
<fwereade> rogpeppe, IIRC it came up when we were adding series support somewhere
<fwereade> rogpeppe, exact provenance blurs slightly
<rogpeppe> fwereade: yeah, it's quite a recent change, so i thought you might remember
<rogpeppe> fwereade: (bzr ascribes credit to you :-])
<fwereade> rogpeppe, that's as close as I can get without hunting down the revision
<rogpeppe> fwereade: ok, np
<bac> hi rogpeppe, i know it's late for you but can i bug you for a second?
<rogpeppe> bac: sure
<bac> rogpeppe: cool.  so i've had to change one of the API return structs, to rename one of the existing entries and to add another field.  tests all updated and pass.
<rogpeppe> fwereade: if you're still around, i could really do with a second review of https://codereview.appspot.com/8510043/ (i'm about to need it as a second dependency). it's pretty trivial.
<rogpeppe> bac: ok
<bac> rogpeppe: i rebuilt and reinstalled everything.  multiple times
<rogpeppe> bac: ok
<bac> rogpeppe: but when i get the data back on the client side it looks like the old version
<bac> rogpeppe: i've blown away $GOPATH/bin and pkg.  rebuilt, reinstalled, killed my S3 bucket
<rogpeppe> bac: you are bootstrapping with --upload-tools --fake-series precise, right?
<bac> i'm at wits end
<bac> no
<bac> just --upload-tools
<bac> juju bootstrap --upload-tools
<rogpeppe> bac: what series are you deploying from?
<bac> precise
<bac> er, wait
<bac> rogpeppe: my host is quantal.  default-series in environments.yaml is precise
<rogpeppe> bac: what API field have you changed?
<bac> rogpeppe:  ServiceGetResults s
<rogpeppe> bac: perhaps you could push the branch, and i'll have a look
<bac> rogpeppe: here's a diff: http://paste.ubuntu.com/5690437/
<rogpeppe> bac: in general though, it's *really* worthwhile bootstrapping with --fake-series precise --upload-tools
<rogpeppe> bac: otherwise when you deploy a precise charm, it will fall back to using the old version of the tools
<bac> rogpeppe: i'll try that.  it smells environmental
<bac> rogpeppe: ok, but like i said i changed buckets
<rogpeppe> bac: yeah, but this is *within* an environment
<bac> rogpeppe: right, but where would it get an old version of the tools if not from the bucket?
<bac> unless i misunderstand
<rogpeppe> bac: from the public bucket
<bac> er
<rogpeppe> bac: the one that gets used even if you don't use --upload-tools
<bac> oh
<rogpeppe> bac: that shouldn't be used on the bootstrap node though
<rogpeppe> bac: but will be used on any other nodes that don't match the bootstrapped node's series
<bac> rogpeppe: ok, i'm trying again with fake-series
<rogpeppe> bac: ok, crossed fingers :-)
<fwereade> rogpeppe, LGTM
<rogpeppe> fwereade: ta!
<bac> rogpeppe: --fake-series worked.  thanks.
<rogpeppe> bac: yay
<rogpeppe> !
<bac> rogpeppe: is that indicative of a problem that should be investigated?
<rogpeppe> bac: no
<rogpeppe> bac: it's the solution to a problem that was :-)
<rogpeppe> bac: and fwereade is working on a way to make it less easy to muck up that way
<rogpeppe> bac: please share in juju-gui the fact that everyone should use --fake-series
<bac> rogpeppe: your comments above made it sound like it shouldn't be required.  fine by me...i can proceed!
 * thumper settles down to read email
<fwereade> thumper, hell, I have not yet done my self-promised email burst today
<thumper> :)
 * thumper is replying to the placement thread...
<rogpeppe> thumper: hiya
<thumper> hi rogpeppe
<thumper> rogpeppe: still up I see
<rogpeppe> thumper: yeah, trying to unblock the gui folks
<fwereade> rogpeppe, I'm thinking I really should rename that -- we may be faking it up now, but cross-compiling should not actually be too far out of reach
<rogpeppe> fwereade: i'm not sure. it might be an elusive goal unfortunately
<fwereade> rogpeppe, meh, probably not worth it then
<rogpeppe> fwereade: i'd prefer to have a wildcard tools fallback actually
<rogpeppe> fwereade: if matching against the series directly fails, try "linux"
<rogpeppe> fwereade: then make upload-tools upload to series==linux
<rogpeppe> fwereade: i mean, we do have cross-compilation currently *almost*
<fwereade> rogpeppe, it is just a dev tool anyway, I guess, we can be a bit freer about changing things
<rogpeppe> fwereade: yeah
<thumper> what is the purpose of the machine nonce?
<fwereade> thumper, there's a window between provisioning a machine and recording it as provisioned in state
<fwereade> thumper, if the provisioner goes down in that window, it will start a new machine
<fwereade> thumper, and we'll end up with two instances running the "same" machine agent
<fwereade> thumper, the nonce allows the machine agent to be sure that it's the official one before it starts messing around changing passwords and deploying units and stuff
<thumper> so what happens when it isn't the official one?
<fwereade> thumper, it stops immediately and waits for the provisioner to reap the instance
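A sketch of the agent-side guard just described (function and variable names are hypothetical):

    package main

    import "fmt"

    // checkOfficial compares the nonce this instance was launched with
    // against the nonce the provisioner recorded in state; a mismatch
    // means another instance was provisioned as the "same" machine.
    func checkOfficial(stateNonce, startNonce string) error {
        if stateNonce != startNonce {
            // Not the official instance: stop immediately and wait for
            // the provisioner to reap this one.
            return fmt.Errorf("machine nonce mismatch: not the provisioned instance")
        }
        return nil
    }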
 * thumper nods
 * thumper still thinks there is a problem with the hook logger
<thumper> rogpeppe: did you want to talk about it, or should I just keep it on the list
 * fwereade is not entirely comfortable with it but doesn't quite have the headspace
 * fwereade will watch conversations with interest though
<rogpeppe> thumper: let's talk about it
<thumper> rogpeppe: ok, hangout?
<rogpeppe> thumper: ok
<thumper> it'll make it hard for fwereade to watch :)
 * thumper starts one
<rogpeppe> fwereade: feel free to join the fun
<thumper> https://plus.google.com/hangouts/_/8078c0dcc36c791886bcd025be7df7b497f49c00?authuser=0&hl=en
<rogpeppe> thumper: one little question, if i'm using bzr pipes, can i rename a branch in the pipeline without breaking the pipe links?
<thumper> rogpeppe: not really
<rogpeppe> thumper: ok, thought so, but thought i'd check.
<rogpeppe> thumper: another thing: how can i link an existing branch into a pipeline?
<thumper> rogpeppe: add-pipe referencing the existing branch
<rogpeppe> thumper: ah, ok
<rogpeppe> right that's me done for the day
<rogpeppe> i've got quite a few reviews out if anyone cares to look
<rogpeppe> g'night all
#juju-dev 2013-04-09
<bigjools> thumper: is this likely to be fallout from your series changes? http://pastebin.ubuntu.com/5691125/
<bigjools> I am trying to work out if it's a bug in the core or in the maas provider
 * thumper looks
<thumper> bigjools: yes, you want --fake-series=precise
<bigjools> thumper: on the bootstrap command line?
<thumper> aye
<bigjools> ok cheers
<thumper> as you are uploading tools
<thumper> and by default, it makes tools for your series
<thumper> but the machine you are booting says precise
<bigjools> it worked \o/
<thumper> although you should have public tools for 1.9.13 now
<bigjools> thumper: public?
<thumper> as in, ec2 public bucket
<bigjools> thumper: I am not using ec2
<bigjools> this is maas
 * thumper nods, ok
<bigjools> anyway, bootstrap "completed" so thanks :)
<bigjools> is there a local placement option in juju-core like there was in Python?
<markramm> there is one under review right now
<bigjools> ok ta
<markramm> bigjools: it will function similarly to deploy-to but be called force-machine
<markramm> discussion on the mailing list about it is ongoing
<bigjools> markramm: I don't know much about this - is deploy-to a cmd line option?
<davecheney> bigjools: no, that isn't an option at the moment
<markramm> bigjools: it was a jitsu feature in python
<markramm> that took a machine ID and installed a service on that particular machine
<bigjools> ah ok, no wonder I hadn't heard of it
<bigjools> thanks both
<markramm> I just now realize that I'm not sure what you meant by "local placement"
<markramm> so perhaps I got it wrong
<bigjools> markramm: "placement: local" in the environment config
<bigjools> juju-core complains if I have that in
<davecheney> bigjools: yes, we don't understand that key
<bigjools> davecheney: the error message was a little opaque: error: placement: expected nothing, got "local"
<davecheney> yeah, that isn't that helpful
 * bigjools files a bug
<davecheney> bigjools: good idea
<bigjools> I was actually looking to avoid bringing up another machine in maas and re-using the bootstrap node
<bigjools> since it's slow - there's no fast installer yet
<bigjools> but no biggie
<davecheney> bigjools: https://launchpad.net/ubuntu/raring/+source/mongodb/1:2.2.4-0ubuntu1
<davecheney> is it possible to drag this into a PPA targeting precise ?
<bigjools> it might be, depends if it builds with precise's available deps
<davecheney> should I try a source build on a Precise vm?
 * thumper goes to make dinner
<bigjools> davecheney: use pbuilder
<davecheney> bigjools: speak to me as if I were a child
<bigjools> well you can use a VM
<bigjools> davecheney: I've got 4 naughty kids, don't tempt me!
<davecheney> bigjools: if you want to come to NSW for a spankin' let me know
<davecheney> i'm happy to arrange
<bigjools> what, you'd whizz straight past me then
<bigjools> so you can use a VM, or pbuilder automates an environment for a particular release
<bigjools> whatever you're comfortable with
<davecheney> i'll try pbuilder
<davecheney> i expect to get screwed on permissions
<bigjools> frankly, I fire up a canonistack instance these days
<bigjools> massively quicker
 * davecheney deployes the ubuntu charm
<bigjools> just dget the source for raring on a precise instance
 * bigjools afk
<rogpeppe> mornin' all!
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: TheMue | Bugs: 0 Critical, 52 High - https://bugs.launchpad.net/juju-core/
<TheMue> moin moin btw
<rogpeppe> TheMue: hiya
<rogpeppe> TheMue: i believe you are the on-call reviewer today...
<rogpeppe> TheMue: i have some reviews for you :-)
<rogpeppe> TheMue: https://codereview.appspot.com/8458044/ https://codereview.appspot.com/8487044/ https://codereview.appspot.com/8533044/ https://codereview.appspot.com/8534043/
<rogpeppe> fwereade: if you fancy having a look, another opinion would be nice too
<fwereade> rogpeppe, cheers, I'll try to get there soon
<rogpeppe> fwereade: thanks
<rogpeppe> fwereade: that last one gets us back to having unit and machine status for the gui folks
<TheMue> rogpeppe: will look at them.
<rogpeppe> TheMue: thanks!
<TheMue> rogpeppe: a lot of stuff, but i'm making progress
<rogpeppe> TheMue: thanks a lot. the big changes are pretty much mechanical
<TheMue> rogpeppe: yes, that helps
<rogpeppe> TheMue: thanks for the review. i'm not sure what you're thinking of when you say "exposed constants for the EntityTypes"
<TheMue> rogpeppe: simple public constants, nothing more. but as i also wrote, today they are needed nowhere, so simply forget it. ;)
<rogpeppe> TheMue: sorry, do you mean the entity kinds? i don't see any occurrence of the word EntityType in the source.
<TheMue> rogpeppe: oh, sorry, yes. where you return strings like "service" etc.
<rogpeppe> TheMue: ah, i see.
<rogpeppe> TheMue: they are actually compared, but not outside the params package.
<TheMue> rogpeppe: i haven't seen them as strings elsewhere, only that you fill that one map by reflection. am i right?
<rogpeppe> TheMue: look for the string "machine" in state/api/params and you'll see what i mean
<danilos> heya, I am getting a bunch of test failures with juju-core trunk on raring: http://pastebin.ubuntu.com/5691888/
<TheMue> rogpeppe: ah, ic. so probably constants make sense, but not public ones.
<danilos> I did follow README and CONTRIBUTING docs, except that I am using system-installed mongodb
<rogpeppe> TheMue: i'm happy just using the strings. test coverage is sufficient that you won't be able to get it wrong.
<danilos> (because it's 2.2.4 and 2.2.0 was referenced in the README)
<danilos> anyone has any ideas what did I mess up?
<rogpeppe> danilos: i think you do need the SSL-capable mongo
<rogpeppe> danilos: it looks like you might be using GOPATH wrongly
<rogpeppe> danilos: what's the value of your $GOPATH ?
<danilos> rogpeppe, it's /home/danilo/.gopkgs
<danilos> rogpeppe, I've got a symlink from ~/.gopkgs/src/launchpad.net/juju-core to ~/juju-core, but it behaves the same no matter what directory I am in
<danilos> rogpeppe, or maybe not the same, but I've seen a lot of test failures either way
<rogpeppe> danilos: symlinks don't work, i'm afraid
<rogpeppe> danilos: your juju source must live in /home/danilo/.gopkgs/src/launchpad.net/juju-core
<rogpeppe> danilos: you can symlink the other way if you want
<danilos> rogpeppe, :/ but originally I tried it like that as well, maybe the non-SSL mongo was to blame
<rogpeppe> danilos: you definitely need the SSL-capable mongo
<rogpeppe> danilos: were you getting build errors before?
<danilos> rogpeppe, no, go build seems to work correctly either way
<rogpeppe> danilos: sorry, i mean build errors when testing
<rogpeppe> danilos: like the errors you pasted above
<rogpeppe> danilos: the easiest way to start is to do: "go get launchpad.net/juju-core/..."
<rogpeppe> danilos: making sure you remove that symlink first
<danilos> rogpeppe, right, that's how I started off then rearranged the trees a bit to follow my branch organization model (which is a shared repo with lightweight checkouts: I like having multiple branches around, especially when starting on a project)
<rogpeppe> danilos: that will get you the source into the right place
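(For reference, the workspace layout being described here, using the $GOPATH from this session, is roughly the following; the go tool matches source directories against $GOPATH/src by path, which is why a symlinked tree in the other direction confuses it:

    $GOPATH = /home/danilo/.gopkgs
    /home/danilo/.gopkgs/
        src/
            launchpad.net/
                juju-core/      <- the real source tree must live here
    ~/juju-core -> $GOPATH/src/launchpad.net/juju-core   (optional convenience symlink, pointing the other way)
)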
<mgz> go is really picky about placement of things
<rogpeppe> danilos: i use cobzr
<rogpeppe> danilos: fwereade doesn't but i don't know how he *does* use multiple branches
<danilos> rogpeppe, right, I hate the git model of a single working directory and local merge problems when you forget to commit stuff, which is why I tried to avoid it
<mgz> jam has a setup like yours, I'll just see if I have what his layout was anywhere
<mgz> it certainly can be made to work
<mgz> ...I think thumper does as well
<rogpeppe> mgz: i think thumper uses native bzr colocation
 * mgz has been using this to test the 2.6 colo stuff
<danilos> rogpeppe, it seems there are no build failures when I get the code directly in, argh :/ there're still a bunch of test failures
<rogpeppe> danilos: are you using the right version of mongo?
<danilos> mgz, rogpeppe: ok, I'll check with them if I don't figure something out, and switch to cobzr model until then
<danilos> rogpeppe, not yet, I'll get it downloaded and set up as well
<rogpeppe> danilos: all you need to do is download the tar file and put the files somewhere in your $PATH
<danilos> rogpeppe, ok
<danilos> rogpeppe, so far it seems that was the remaining problem; I'll figure out a tree layout that works for me, thanks for the input
<rogpeppe> danilos: cool, np
<mgz> I know one thing that works, is having the normal shared-repo+treeless branches in your normal dev location
<mgz> then lightweight checkouts in the go-expected locations
<mgz> there's a trick then to make switch easy, by doing something in locations.conf I think, but am not sure what it is
<danilos> mgz, right, and can probably live with something like that, thanks
<rogpeppe> fwereade: i'd really appreciate your comments on this branch before it gets waved through, as it's the technique that's directly impacted by the doc split between unit/machine and status. https://codereview.appspot.com/8534043/
<fwereade> rogpeppe, ok, looking now
<rogpeppe> fwereade: thanks
<fwereade> rogpeppe, that one's approved
<fwereade> rogpeppe, need to grab some lunch, bbiab
<rogpeppe> fwereade: thanks!
<rogpeppe> fwereade: ping
<fwereade> rogpeppe, pong
<rogpeppe> fwereade: i was just wondering why we don't just use constraints.Value directly as a mongo doc
<dimitern> mgz: standup?
<mgz> I'm there
<danilos> mgz, dimitern: https://pastebin.linaro.org/2129/
<danilos> mgz, dimitern: http://pastebin.ubuntu.com/5692068/
<dimitern> TheMue: cheers for updating the topic!
<TheMue> dimitern: yep, seen it as my task as reviewer today. ;)
<fwereade> rogpeppe, sorry, I left a comment half-typed
<fwereade> rogpeppe, because I'd rather have a dumb but explicit translation layer between the api type and the database representation than conflate the two
<dimitern> no critical bugs? how come?
<rogpeppe> fwereade: do you think we should do that for all the other types we marshal directly to mongo then?
<rogpeppe> fwereade: i'm wondering if we should marshal constraints to mongo as string
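(A minimal sketch of what storing constraints as a string could look like, assuming constraints.Parse reverses constraints.Value.String as it does in juju-core; the machineDoc type and its fields are invented for illustration, not juju-core's actual schema:

    package state

    import "launchpad.net/juju-core/constraints"

    // machineDoc is an invented illustration: the point is storing
    // constraints in their human-readable string form.
    type machineDoc struct {
        Id          string `bson:"_id"`
        Constraints string `bson:"constraints"` // e.g. "cpu-cores=2 mem=4G"
    }

    // constraintsFromDoc round-trips the stored string back through the
    // parser on read.
    func constraintsFromDoc(doc machineDoc) (constraints.Value, error) {
        return constraints.Parse(doc.Constraints)
    }
)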
<bac> anyone have time for a quick review? https://codereview.appspot.com/8532043
<fwereade> rogpeppe, I can't think of many of those... state.Tools, any others?
<rogpeppe> fwereade: charm.Meta
<rogpeppe> fwereade: (that's a big one)
<fwereade> rogpeppe, hmm, yeah, in hindsight the pressures maybe feel different now constraints is out of state
<rogpeppe> fwereade: charm.Config
<rogpeppe> fwereade: i'd quite like a tool that automatically checked for mongo compatibility between versions
<rogpeppe> fwereade: but i'm not entirely sure how it would work
<fwereade> rogpeppe, yeah, I think it'd be tricky
<rogpeppe> fwereade: there are two approaches: 1) make some kind of formal description of the schema and check that everything adheres to it 2) write a test that talks to two different versions, each of which looks at the same mongodb, getting one to write and the other to read
<rogpeppe> fwereade: (at least) two approaches :-)
<fwereade> rogpeppe, yeah, it's not going to be fun
<rogpeppe> fwereade: maintaining compatibility is crucial though.
<fwereade> rogpeppe, agreed
<rogpeppe> fwereade: there's another possibility actually
<fwereade> rogpeppe, I'm honestly not sure about external types, but I am starting to feel that from this perspective it's important that we keep a clear eye on just wtf we are storing in mongo
<rogpeppe> fwereade: use a tool to tell us exactly what types we're storing in mongo
<rogpeppe> fwereade: then we can use a variant of the go compatibility checker to make sure we don't change those types in backwardly incompatible ways.
<rogpeppe> fwereade: it won't work for types that know how to marshal themselves, but those are probably not where the problems will occur
<rogpeppe> fwereade: because if you change a MarshalBSON method, you'll have a pretty good idea that it might have some impact on mongo storage
<fwereade> rogpeppe, well, it's the types that know how to marshal themselves I am concerned about tbh -- those changes are going to be easy to make accidentally
<rogpeppe> fwereade: i'm more concerned about someone just changing the type of a field
<rogpeppe> fwereade: without realising that the type is used in mongo
<fwereade> rogpeppe, yep, that too
<fwereade> rogpeppe, agreed that having what you described in place would take one big worry off the table
<fwereade> rogpeppe, in the meantime I think I *would* prefer to avoid directly storing external types in mongo, but obviously we can't change all that now, we just need to fret and hope :/
<rogpeppe> fwereade: for now, perhaps we should just comment every type that's used in mongo, so that people know it's important not to change it.
<rogpeppe> fwereade: it should be straightforward to do
<fwereade> rogpeppe, that sounds like a good idea to me
<rogpeppe> fwereade: i've added a ticket to the kanban
<fwereade> rogpeppe, cheers
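(The kind of annotation being proposed might look like this; the struct below is a hypothetical example, not juju-core's actual schema:

    package state

    // unitDoc is persisted directly to MongoDB.
    //
    // WARNING: changing a field's name or type changes the wire format
    // of documents already written by older jujud versions; consider
    // upgrade compatibility before touching it.
    type unitDoc struct {
        Name    string `bson:"_id"`
        Service string
        Life    int8
    }
)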
<danilos> mgz, dimitern: the other test seems to really be fixed by r39 in goyaml from a week or so ago by gustavo
<dimitern> rogpeppe, fwereade: next chapter in the nonced provisioning saga: https://codereview.appspot.com/8561044/
<mgz> trivial test fix branch to review: https://codereview.appspot.com/8570043
<mgz> danilos: ^you can practice your reitveld on that too :)
<danilos> mgz, I will, that's even easier :)
<danilos> I am having some trouble getting lbox to do anything with codereview.appspot.com: I suppose I need to configure something (even though I set my bzr email address to the one I use to log in to codereview.appspot.com, no review is created there [one is created on LP])
<rogpeppe> TheMue: here's a review for you: https://codereview.appspot.com/8568044
<danilos> ah, ok, so if updating the milestone fails (which I don't have permissions for yet), then it borks out and doesn't do a code review submit
<TheMue> rogpeppe: just doing dimiters, then yours will follow
<rogpeppe> TheMue: ta!
<dimitern> danilos: how did you manage to update a milestone with lbox? :)
<danilos> dimitern, I didn't, but it tries to update a bug milestone and I am still not a member of the ~juju team which is the maintainer for the juju-core project
<danilos> dimitern, I am only passing the -bug=... parameter to the propose command
<dimitern> danilos: I see
<dimitern> danilos: didn't know it touches milestones when you link a bug
<danilos> dimitern, yeah, it tries to create a bug if you don't link it (or so it seems)
<danilos> now I need Kapil, Gustavo or Dave Cheney to add me to the team
<fwereade> dimitern, reviewed
<danilos> hazmat, hi, can you please add me to the ~juju team?
<dimitern> fwereade: cheers!
<dimitern> fwereade:  and this is the final step: https://codereview.appspot.com/8561045/
<fwereade> dimitern, ta
<hazmat> danilos, welcome to juju core, and done
<danilos> hazmat, thanks! :)
<dimitern> fwereade: I don't think asserting on both instanceid and nonce being "" will work for replayed transactions
<danilos> TheMue, hi, I wouldn't mind getting a review for https://codereview.appspot.com/8572043/ :)
<fwereade> dimitern, I think replayed transactions leak badly here anyway
<fwereade> dimitern, but in the actual use cases, we'll never be setting the same instance id anyway, so that specific check actually reduces to "never set" anyway
<TheMue> danilos: enqueued, just doing rogers ;)
<danilos> TheMue, thanks
<danilos> mgz, dimitern: I wouldn't mind if one of you gives me a second ack (or nack, depending on your opinion :) on https://codereview.appspot.com/8572043/
<fwereade> dimitern, I think it's clearer and simpler and no less correct ;p
<dimitern> fwereade: it's simpler for sure, just have a nagging feeling I'm missing something, if i don't assert instid should either be "" or "instid"
<dimitern> fwereade: perhaps you could alleviate my concern - what'll happen when we need to replay a SetProvisioned transaction without asserting "instid", which is already set?
<fwereade> dimitern, I think there is a general issue wrt txns that need to be synchronized with non-txn state
<fwereade> dimitern, the short answer is "we're probably screwed", but asserting on the possibility of the same instance id can't actually help, I think
<dimitern> fwereade: istm asserting only on "" will cause ErrAborted, which then by virtue of the machine being still alive will return "already set"
<dimitern> fwereade: that's the only reason to check for same id in the assert, if you think it's superfluous, I'll drop it
<dimitern> danilos: you've got a review
<fwereade> dimitern, essentially a delayed SetProvisioned txn leaves state badly inconsistent with reality until it's completed, and there's no way round that except to do some distracting cleverness re completing machine txns before making provisioning-related decisions
<fwereade> dimitern, and it's low-impact enough that I plan to just skip it for now
<dimitern> fwereade: yeah, and hard to induce such an error to test it properly imo
<dimitern> fwereade: ok, I'll do as you suggest
<fwereade> dimitern, yeah, indeed
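(For concreteness, the transaction assert being debated might be sketched with mgo/txn like this; the collection and field names are illustrative rather than juju-core's actual ones:

    package state

    import (
        "labix.org/v2/mgo/bson"
        "labix.org/v2/mgo/txn"
    )

    // setProvisionedOps sketches the op under discussion.
    func setProvisionedOps(machineId, instId, nonce string) []txn.Op {
        return []txn.Op{{
            C:  "machines",
            Id: machineId,
            // Asserting both fields are empty means the op can only
            // succeed once; a replayed or raced attempt aborts with
            // txn.ErrAborted instead of silently overwriting.
            Assert: bson.D{{"instanceid", ""}, {"nonce", ""}},
            Update: bson.D{{"$set", bson.D{
                {"instanceid", instId},
                {"nonce", nonce},
            }}},
        }}
    }
)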
<danilos> dimitern, thanks
<danilos> dimitern, fwiw, I did test it with both tip revisions and going back one revision of goyaml to confirm it fails expectedly with it
<dimitern> danilos: cheers
<TheMue> rogpeppe, danilos: you've got reviews
<rogpeppe> TheMue: ta!
<TheMue> rogpeppe: yw, nice little helpers in the code
<TheMue> danilos: does go get -u retrieve the fixed goyaml version?
<danilos> TheMue, I can try reverting to r38 to see if it does (since I started yesterday, I got the latest version by default: r39 is from April 2nd)
<danilos> TheMue, yeah, it does update it for me at least
<TheMue> danilos: that's good
<danilos> TheMue, btw, is "LGTM" a convention of rietveld (i.e. does it interpret it as a positive review and makes the review message green), or just internal juju team convention to give out a shorter "looks good to me"?
<mgz> it's for the pretty green
<danilos> mgz, for some definition of pretty, yes :)
 * TheMue almost missed hangout
<TheMue> danilos: it is interpreted by rietveld, and "not lgtm" marks it red
<dimitern> danilos: and there's the red NOT LGTM for Disapprove
<danilos> TheMue, dimitern: ack, thanks
<dimitern> danilos: not sure if it's case-sensitive, but it definitely catches it as part of the word (like "this is LGTMy"), ymmv
<dimitern> fwereade: updated https://codereview.appspot.com/8561044/
<dimitern> fwereade: and there's the other one as well - https://codereview.appspot.com/8561045/
 * dimitern lunch
<mgz> heh, even later lunch than me
<danilos> niemeyer, heya, I've marked a couple of bugs that you've fixed in goyaml with r39 as fix-committed, I hope you don't mind :)
<niemeyer> danilos: Thanks!
<niemeyer> danilos: They're even released already
<danilos> niemeyer, even better :)
<rogpeppe> danilos: are you Данило Шеган ?
<rogpeppe> danilos: 'cos if so, i think you've just earned your first "broken trunk" badge :-)
<TheMue> rogpeppe: with an updated goyaml?
<rogpeppe> TheMue: yes
<TheMue> rogpeppe: oh, where does it break?
<rogpeppe> TheMue: constraints
<TheMue> danilos: you've done go build ./... and go test ./... in the juju-core directory?
<danilos> TheMue, yeah, with latest trunk
<danilos> TheMue, are you seeing any problems?
<danilos> rogpeppe, btw, I am, what did break for you?
<danilos> rogpeppe, got the traceback?
<mgz> ha, that would be nice :)
<mgz> correct term is "cryptic one-line error"
<rogpeppe> danilos: http://paste.ubuntu.com/5692522/
<danilos> rogpeppe, that's exactly what should fail if you don't have fixed goyaml version
<rogpeppe> mgz: aww come on, it's a cryptic four-line error, surely?!
<rogpeppe> danilos: ah, so i need to pull goyaml?
<danilos> rogpeppe, i.e. when I tested it with r38 of goyaml, it failed like that
<rogpeppe> danilos: ah, i'm on v36
<danilos> rogpeppe, yes, as per my email (this is goyaml r39 from 2013-04-02 by niemeyer): it was failing for me since I just fetched it yesterday for the first time
<mgz> that probably deserves a post to the mailing list then, if everyone needs to update
<mgz> ...which you've already done
<danilos> mgz, I posted to juju-dev
<rogpeppe> danilos: ahhhh, i think gustavo must've moved the goyaml repo
<niemeyer> Hm?
<rogpeppe> danilos: because it thinks 36 is the latest version
<TheMue> rogpeppe: that's why i asked for the goyaml. *phew*
<mgz> yeah, need `bzr pull --remember lp:goyaml`
<danilos> rogpeppe, what does 'bzr info' say for you in there?
<rogpeppe> danilos: http://paste.ubuntu.com/5692535/
<mgz> changed from lp:~gophers/goyaml/trunk
<danilos> it should be something like bzr+ssh://bazaar.launchpad.net/+branch/goyaml off the top of my head, let me check on disk for me
<rogpeppe> niemeyer: i've done "bzr pull" in my goyaml directory and it thinks r36 is the latest revision, but apparently there's an r39 out there
<mgz> rogpeppe: ^see my bzr pull command above
<niemeyer> It's still lp:goyaml.. it's under ~goyaml's trunk now, indeed.. the change was announced when we switched everything out of ~gophers
<danilos> rogpeppe, yeah, lp:goyaml points at http://bazaar.launchpad.net/~goyaml/goyaml/trunk/ for me (so not bzr+ssh, guess that's go thingy)
<niemeyer> danilos: Nope
<mgz> yeah, go get is... not helpful
<niemeyer> Ah, yes
<rogpeppe> niemeyer: ah, presumably i had to do a bzr pull --overwrite --remember lp:goyaml/trunk
<niemeyer> rogpeppe: Hmm.. no..
<niemeyer> rogpeppe: bzr pull lp:goyaml --remember
<danilos> niemeyer, any idea why it didn't save the "top" URL of lp:goyaml (which I think evaluates to bzr+ssh://bazaar.launchpad.net/+branch/goyaml or similar)?
<niemeyer> danilos: I have no idea..
<rogpeppe> niemeyer: ah, ok, done, thanks
<niemeyer> danilos: I'd expect that too
 * rogpeppe wonders what other packages are "frozen" in the same way
<danilos> niemeyer, I guess it's not using "lp:" for fetching branches, which would explain it
<mgz> yeah, jam proposed a fix for that
<niemeyer> danilos: Maybe.. I'd not expect that to make a difference, though
<niemeyer> danilos: Whatever is fetching branches is still talking about launchpad.net/goyaml
<mgz> it may even have landed in go trunk
<niemeyer> danilos: Not ~foo/goyaml/trunk
<danilos> niemeyer, maybe it's doing so over http (I had the http ~foo/goyaml/trunk in it after a simple 'go get', and my bzr login is set properly)
<niemeyer> danilos: Maybe, but I'd still consider that a bug in bzr itself
<niemeyer> danilos: If the user asks for lp.net/foo, that's what it should be linked with
<niemeyer> danilos: Even if it fetches from elsewhere
<niemeyer> danilos: We probably won't solve that bug today, though
<danilos> niemeyer, yeah, or in Launchpad (I've just double-checked that, if I do a 'bzr pull --remember http://launchpad.net/goyaml', it does the wrong thing)
<dimitern> still waiting for a review on https://codereview.appspot.com/8561045/
<danilos> niemeyer, indeed :)
<dimitern> fwereade: sorry, but i'm bugging you again ^^
<mgz> there does seem to be a launchpad forwarding bug
<fwereade> dimitern, sorry dude
<mgz> but... using that http url is... more than a little bogus anyway
<niemeyer> mgz: Hmm.. why?
<danilos> mgz, yeah, LP sends permanent redirects for that http url (it's not perfect, but it seems to work and I suppose go get uses it)
<TheMue> dimitern: started revisioning
<danilos> niemeyer, I don't think it's advertised anywhere as the URL to use for fetching branches
<dimitern> TheMue: thanks
<niemeyer> danilos: Well.. 1) It works; 2) It's extremely user-friendly
<danilos> niemeyer, I am not disagreeing, I think there's a LP (or maybe bzr) bug in there
<mgz> the branch also has a broken parent reference
<rogpeppe> danilos: i can't see your post to juju-dev anywhere. what was the subject line?
<niemeyer> rogpeppe: Update goyaml dependency for juju-core
<rogpeppe> niemeyer: ah, just turned up
<mgz> niemeyer: can you run this please: `bzr config --scope=branch -d bzr+ssh://bazaar.launchpad.net/+branch/goyaml --remove parent_location`
<danilos> rogpeppe, I've posted an update with your experience as well, hope this helps somebody else who hits the same problem
<rogpeppe> danilos: thanks
<danilos> rogpeppe, and now I see that mgz beat me to it... oh well :)
<mgz> :)
<niemeyer> mgz: What does that do?
<mgz> removes a broken parent location reference on the trunk branch
<mgz> it's trunk, it shouldn't have a parent (and certainly not a ../../ one that goes nowhere)
<TheMue> dimitern: you've got a review
<dimitern> TheMue: tyvm
<niemeyer> mgz: Never seen that
<niemeyer> mgz: Neat
<niemeyer> mgz: The command, I mean
<mgz> beats editing .bzr/branch/branch.conf, especially when it's a remote branch :)
<danilos> phew
 * danilos hands the first "broken trunk" badge back ;)
<mgz> okay, I think that probably does fix the launchpad redirect too
<mgz> so, go get should now have the right origin
<niemeyer> mgz: Done
<danilos> mgz, it does? I just filed https://bugs.launchpad.net/launchpad/+bug/1166854, let me check that :)
<_mup_> Bug #1166854: bzr http URL stores evaluated URL instead of the link <Launchpad itself:New> < https://launchpad.net/bugs/1166854 >
<danilos> mgz, it doesn't for me, it still stores the evaluated URL instead of the shorter one
<mgz> oh, bzr resolving redirects and storing the destination is a known thing
<danilos> mgz, at least 'bzr pull --remember http://launchpad.net/goyaml' does (haven't tried 'go get' itself)
<mgz> but launchpad actually had the wrong redirect
<danilos> mgz, right; oh, the redirect worked fine for me at least
<danilos> mgz, btw, is bzr supposed to store temporary redirects as well? could that be a solution?
<danilos> mgz, solution on the LP side, that is
<mgz> the launchpad side for this is very messy
<mgz> there are two levels, the first of which isn't actually an http redirect at all, but a fake branch that references another location
<danilos> mgz, it's starting to sound nice
<danilos> mgz, "interesting" is a perfect word, I assume :)
<danilos> anyway, enough of that for now, those smarter about code-hosting in LP and bzr can comment on the bug and close it as invalid if needed; it'd still be nice to improve the experience for potential Go Launchpad users
<dimitern> niemeyer: reviewed
<niemeyer> dimitern: Cheers
<mgz> sorry, http->https redirect first... then:
<mgz> $ curl https://launchpad.net/goyaml/.bzr/branch/location
<mgz> http://bazaar.launchpad.net/~goyaml/goyaml/trunk
<niemeyer> fwereade: Any idea of when we'll make log messages reasonable?
<danilos> I am seeing a problem in environs/openstack/config_test.go's configTest.check method: it doesn't seem to clean up state properly after a failing test: I expect my first test to fail (I am just adding it before adding the code), but the second one shouldn't fail: if I move my new test to the end, the second one doesn't fail anymore
<fwereade> niemeyer, ...damn, I have a draft email proposing a small doable card
<niemeyer> fwereade: Just got a request by dimitern to uglify messages to conform to other ugly messages
<niemeyer> fwereade: I'm doing it, but the output is comical
<niemeyer> 2013/04/09 12:39:56 INFO JUJU:juju:publish cmd/juju: charm published at 2013-04-01T15:53:05Z as cs:~niemeyer/precise/ubuntu-6
<fwereade> danilos, that sounds like a problem to me
<dimitern> niemeyer: i think a consistent format across the source is better than a half-baked improvement in only some places
<niemeyer> JUJU! JUJU! JUJU! I mean it, JUJU!
<niemeyer> dimitern: I'm doing it..
<dimitern> niemeyer: and i agree the format needs to change to cut the bullshit of the logs :)
<niemeyer> dimitern: But I don't have to agree.. :)
<fwereade> niemeyer, the proposal will eliminate that at least -- thank you for bearing with us re consistency in the meantime, though
<danilos> fwereade, http://pastebin.ubuntu.com/5692650/ and my test is in lp:~danilo/juju-core/bug-1135335
<niemeyer> fwereade: np.. I'm more bothered by presenting people with such obviously bad UX than the fact I have to change it
<dimitern> fwereade: i thought the JUJU: badge was supposed to somehow help the logs integrate better with syslog, but it's kinda crap if it's a requirement - a smart syslog will badge these according to sources
<fwereade> danilos, omg, that's evil
<fwereade> danilos, var credentialsTestConfig = configTests[12]
<danilos> fwereade, aah
<fwereade> danilos, I haven't checked your branch, but I bet you inserted a test before that one, thus throwing off the hardcoded magic number
<mgz> yeah, that is good.
<fwereade> danilos, please kill that technique with fire if you would be so good :)
<danilos> fwereade, I'd be delighted to :) any suggestions as to how since I am just getting to grips with go syntax? :)
<danilos> fwereade, should I simply split it out into a separate var, and then reference it in the big array after that?
<mgz> move it above the block of test definitions as its own thing, then add it in the list
<fwereade> danilos, easiest way: just stick that test in a var, then stick the var into the []tests
<fwereade> danilos, but cast an eye over the actual uses of it
<danilos> mgz, fwereade: ok, I guess we are all on the same page, thanks :)
<danilos> fwereade, of course
<fwereade> danilos, it may be that it's not worth it in the first place ;)
<danilos> fwereade, right, it's used in a single place with a different test.err and a couple of different env vars (just below the declarations): I'll see if I can move that as a separate entry in the configTests array
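(The refactoring being suggested, sketched with an invented cut-down configTest type:

    package openstack_test

    // configTest is a cut-down stand-in for the real test table entry.
    type configTest struct {
        summary string
        err     string
    }

    // Name the shared case instead of reaching into the slice with a
    // hardcoded index like configTests[12], which silently breaks as
    // soon as someone inserts a test earlier in the list.
    var credentialsTest = configTest{
        summary: "missing credentials",
        err:     ".*expected.*credentials.*",
    }

    var configTests = []configTest{
        // ... other cases ...
        credentialsTest,
    }
)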
<dimitern> today i tried the classic bootstrap+deploy mysql/wordpres+add-unit for both+expose; in ec2 - everything worked with the new provisioning!
<fwereade> danilos, great, thanks
 * fwereade cheers at dimitern
 * danilos tries to count from 0 to 12 in an array :)
<fwereade> dimitern, https://codereview.appspot.com/8561044/ LGTM
<fwereade> dimitern, btw, what's the value of BootstrapNonce?
<fwereade> dimitern, didn't spot that
<dimitern> fwereade: "user-admin:bootstrap", as you suggested - it was part of the previous CL
<fwereade> dimitern, ah, great, thanks
<dimitern> fwereade: thanks for the review; i really want to finish and land both of these today
<fwereade> dimitern, what was the thinking behind UUID for nonce?
<fwereade> dimitern, seems a shame not to have it badged with the provisioning machine
<fwereade> dimitern, nothing wrong with a UUID in itself, but I'd prefer the nonce to look like "machine-0:<UUID>" if we're going with that rather than a sequence
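(A sketch of generating a nonce in that form; the uuid bytes here come straight from crypto/rand rather than whatever helper juju-core actually uses:

    package main

    import (
        "crypto/rand"
        "fmt"
    )

    // newNonce returns a nonce of the form "<machine-tag>:<uuid>", so a
    // replayed or raced provisioning attempt can be traced back to the
    // provisioner that issued it.
    func newNonce(machineTag string) (string, error) {
        var uuid [16]byte
        if _, err := rand.Read(uuid[:]); err != nil {
            return "", err
        }
        return fmt.Sprintf("%s:%x", machineTag, uuid), nil
    }

    func main() {
        nonce, err := newNonce("machine-0")
        if err != nil {
            panic(err)
        }
        fmt.Println(nonce) // e.g. machine-0:9f86d081884c7d65...
    }
)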
<niemeyer> publish is going in!
 * fwereade cheers at niemeyer
<niemeyer> and the server was updated last night, which means it should work for everybody using tip already
<dimitern> fwereade: fair enough, machine-0:<uuid> it is
 * rogpeppe eats the last piece of Easter chocolate
<TheMue> rogpeppe: enjoy, i'll have dinner in a few moments. today no lunch. :/
<dimitern> i need two LGTMs still on the following CLs: https://codereview.appspot.com/8561044/ and https://codereview.appspot.com/8561045/
<dimitern> fwereade: ping for the second one
<rogpeppe> dimitern: looking
<rogpeppe> dimitern: i know that we have to set the nonce at the same time as the instance id, so SetProvisioned is reasonable there (i'm not keen on the name, but can't think of a better one currently), but i think InstanceId should just return the instance id, and a new method InstanceNonce (?) can return the nonce. no need to make calling instanceid harder, i think.
<rogpeppe> dimitern: it's a pity that SetInstanceId is used as a test example everywhere!
<dimitern> rogpeppe: not keen on having a way to get the nonce directly, it's supposed to be internal
<rogpeppe> dimitern: huh? it's blatantly not internal - the function returns it!
<dimitern> rogpeppe: I mean it's intentional not having a way to get it, you're supposed to check if it's sane with CheckProvisioned only
<rogpeppe> dimitern: another possibility would be to define InstanceId as struct {Id string; Nonce string}, i suppose
<rogpeppe> dimitern: in which case, don't return it from InstanceId
<dimitern> rogpeppe: InstanceId returning a bool is kinda half-useful now, because we have CheckProvisioned, i agree
<dimitern> rogpeppe: it's not returned from InstanceId
<rogpeppe> dimitern: ah, sorry, i misread
<dimitern> rogpeppe: so InstanceId() -> InstanceId and that's it?
<rogpeppe> dimitern: i'm not sure i understand you there
<dimitern> rogpeppe: if i understood you correctly, you're suggesting to change machine.InstanceId() to return InstanceId only, rather than InstanceId and bool (instance != "")?
<rogpeppe> dimitern: no, not really. i lost that argument aeons ago.
<rogpeppe> dimitern: i'd misinterpreted a change in your CL
<dimitern> rogpeppe: we're living in a brave new world now, time for such changes, if they make sense
<dimitern> :)
<rogpeppe> dimitern: i'll leave it for now
<dimitern> rogpeppe: works for me
<rogpeppe> dimitern: hmm, i'm surprised tests pass. i thought there were probably various tests that assumed you could call SetInstanceId more than once
<dimitern> rogpeppe: there was one in api_test, which I changed (TestMachineRefresh), to work like the similar test in state
<rogpeppe> dimitern: ah, i wondered why TestMachineRefresh had changed so much
<dimitern> rogpeppe: and there was another one calling SetInstanceId(x, y) and then SetInstanceId("", ""), where I removed the second case, because the next case is SetAgentTools(), which should also trigger a change
<danilos> TheMue: another one from me if you've got the time: https://codereview.appspot.com/8584043/
<danilos> fwereade, I suppose you might want to comment on the general approach at least: https://codereview.appspot.com/8584043/
<dimitern> danilos: I think he signed off already, I can take a look
<danilos> dimitern, sure, I'd appreciate it (this is just a refactor to fix problems I hit in the test when adding my own code)
<niemeyer> Trivial one: https://codereview.appspot.com/8585043
<dimitern> danilos: good change, but while you're at it, why not fix the source of the problem (ec2 provider, where this code was copied from)?
<mgz> at least let him do that in a seperate mp :)
<danilos> dimitern, heh, I did not know it was copied from there :)
<danilos> dimitern, but yeah, let's do that in a separate MP, I want to get some traction on the actual bug fix as well
<dimitern> mgz: well, it's all the same - the changes should be identical, and a bit less shit overall :) so win-win
<dimitern> danilos: at least put a TODO(danilos) Fix this test to save/restore env vars, like openstack config_test ?
<danilos> dimitern, sure
<danilos> dimitern, I can try seeing if it's simple enough (depending on how similar they are: there might be some intricate test details that I don't want to just do over in haste)
<dimitern> danilos: they should be almost identical, the check() func for sure
<dimitern> danilos: but a TODO is fine, lest we forget about it
<danilos> dimitern, a bunch of variables are renamed (looking at the diff in trunk), even in the configTest struct
<dimitern> danilos: ok then, then it's better to leave it off for now
<dimitern> niemeyer: reviewed
<dimitern> danilos: you too
<danilos> dimitern, thanks
<niemeyer> dimitern: Danke!
<danilos> dimitern, for the "invalid-mode", do you want me to define a const in the test (since this is a non-existent auth mode, the value is typed in as a throw-away string)?
<dimitern> danilos: you can use AuthMode("invalid-mode") w/o problems, it'll be even more obvious I think
<danilos> dimitern, ah, right, thanks
<dimitern> danilos: or maybe even without the cast - since literal string consts in go are converted automatically
<danilos> dimitern, it is being used without the cast right now, so now I am not sure what do you mean?
<dimitern> danilos: when in doubt if anything will work, try it on https://play.golang.org/ - you can even share it :)
<danilos> dimitern, you mean define it as a const in the test file and then use it there?
<dimitern> oops - it's http: only, no s
<danilos> dimitern, cool, nice tip, I'll see if it can replace what the python shell does for me
<dimitern> danilos: I mean try authMode: "invalid-mode", when authMode is typed, rather than just string - if it compiles, you're good, if not, cast it
<danilos> dimitern, ah, ok
<dimitern> danilos: the best thing about go is most things are compile-time checked
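(The conversion rule in question, as a compilable fragment; AuthMode here stands in for the openstack provider type under discussion:

    package main

    import "fmt"

    type AuthMode string

    type configTest struct {
        authMode AuthMode
    }

    func main() {
        // An untyped string constant converts implicitly to AuthMode,
        // so no explicit conversion is needed at the literal...
        t := configTest{authMode: "invalid-mode"}

        // ...but a value of static type string does need one:
        s := "invalid-mode"
        t.authMode = AuthMode(s)
        fmt.Println(t.authMode)
    }
)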
<niemeyer> dimitern: If I can exploit you one last time today: https://codereview.appspot.com/8589043
<niemeyer> dimitern: Also tiny
<rogpeppe> eod for me. g'night all.
<niemeyer> rogpeppe: Cheers!
<dimitern> niemeyer: sure, np
<rogpeppe> niemeyer: nnight
<dimitern> rogpeppe: night!
<niemeyer> I believe this is the last piece I'll be doing on publish right now
<dimitern> niemeyer: LGTM
<niemeyer> dimitern: Super, thanks!
<niemeyer> fwereade: Are you around by any chance?
<danilos> dimitern, it compiles, but it ends up being AuthMode for the expected value, and string for the received value: I suppose config.go keeps it as a string
<dimitern> danilos: ah, bugger.. ok, so reluctantly I'd go with strings then (why do we have consts if we're not also using them in tests? because the config stuff is basically crap..)
<danilos> dimitern, right, but let's not fix all the problems in there today :)
<dimitern> danilos: indeed :)
<dimitern> 'night guys, i'm off
<danilos> dimitern, night
<niemeyer> danilos: Wanna do a second review on those two CLs?  They're both tiny and easy going: https://codereview.appspot.com/8585043/, https://codereview.appspot.com/8589043
<niemeyer> Wow.. look at those numbers, btw
<danilos> niemeyer, yeah, very confusing
<danilos> niemeyer, I am not sure I know how to use rietveld to review stuff yet, but sure (if you'd trust a 2nd day Go expert like myself :)
<niemeyer> danilos: It's already the second review.. (or 3rd, in the case of the first change, as I talked to William about it ahead of time)
<niemeyer> danilos: So, sure, I'm happy to take your review on it
<niemeyer> danilos: Look at the CLs, if you have any comments/questions, double-click on the respective line and comment
<niemeyer> danilos: When you're done with the whole review, click on "Publish+Mail comments" at the top
<niemeyer> danilos: Include "LGTM" if you're happy with the changes to go in, or any other commnets
<danilos> niemeyer, cool, I'll do them and you can be my second reviewer on https://codereview.appspot.com/8584043/ :)
<danilos> niemeyer, LGTM on both, one minor comment on the first one
<niemeyer> danilos: LGTM as well, with a comment on the restoring
<niemeyer> danilos: Sensible comments, thanks.
<danilos> niemeyer, nice, thanks for reminding me of the defer
<niemeyer> danilos: Yeah, it's a nice superpower we have in Go :)
<niemeyer> danilos: Note that it's not just convenience.. it solves an actual issue because there are possible paths before the current restoring that could exit the function without restoring
<danilos> niemeyer, right, I can imagine other errors being caught and handled in the code in between
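(The defer idiom in question, sketched as a save/restore of an environment variable; this uses the standard testing package rather than gocheck, and the variable name is invented:

    package config_test

    import (
        "os"
        "testing"
    )

    func TestConfigAuthURL(t *testing.T) {
        // Save and restore on every exit path, including early returns
        // and panics from failed assertions further down. (If the
        // variable was originally unset, this restores it to "" rather
        // than unsetting it, which is usually close enough for tests.)
        orig := os.Getenv("OS_AUTH_URL")
        defer os.Setenv("OS_AUTH_URL", orig)

        os.Setenv("OS_AUTH_URL", "http://example.invalid/v2.0")
        // ... exercise the code under test ...
    }
)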
 * danilos -> off
<fwereade> niemeyer, hey, I'm back unless laura wakes back up
<niemeyer> fwereade: Heya
<niemeyer> fwereade: All good
<niemeyer> fwereade: Already got reviews, and changes are in
<niemeyer> fwereade: Including the validator changes we talked about
<fwereade> niemeyer, you rock
<fwereade> niemeyer, thank you very much :)
<niemeyer> fwereade: It's been my pleasure
<niemeyer> fwereade: Thanks for bearing with me as well :)
<thumper> morning
 * thumper frowns
<thumper> fwereade: so... is there a go debugger that would allow me to step through a test?
<thumper> to figure out what isn't working?
<thumper> test fails when I'd expect it to pass
<fwereade> thumper, hmm, I am not a debugger person, so all I can do is try to hunt down the blog post I've seen people mentioning
<fwereade> thumper, but if the symptoms are amenable to concise description they might ring a bell for me
<fwereade> thumper, http://golang.org/doc/gdb
<thumper> fwereade: well, we could chat about it, hangout would be faster
<fwereade> thumper, sure
<thumper> fwereade: I'll start it
<thumper> ?
<fwereade> thumper, yes please, sorry
#juju-dev 2013-04-10
 * thumper needs food badly
<thumper> davecheney: hi there
<davecheney> thumper: ack
<thumper> davecheney: I'm busy refactoring some config.Config stuff
<davecheney> noice
<thumper> and we have identical code in all three provider specific configs
<thumper> around setting the default firewall mode to instance
<davecheney> i would not doubt it
<thumper> do you think we'll always do that?
<davecheney> copy pasta is copy pasta
<thumper> if so, I can move the code up
<davecheney> i think it was highlighted in austin that we should do a better job of refactoring out the common provider stuffs
<thumper> kk
 * thumper moves it
<davecheney> sgtm
<thumper> set-environment command review request incoming
<thumper> Rietveld: https://codereview.appspot.com/8610043
 * thumper has to run daughter to sports
<thumper> bbl
<davecheney> bigjools: looks like the raring 2.2.4 works fine on precise
<davecheney> well
<davecheney> precise/i386 so far
<bigjools> davecheney: good news
<davecheney> damn skippy
<davecheney> otherwise we'd be royally fooked
 * bigjools is this -><- close to deploying a charm with the new maas provider
<bigjools> what out of Ports/OpenPorts/ClosePorts needs implementing?
<davecheney> bigjools: depends, how do firewalls work in Maas ?
<bigjools> davecheney: what firewalls? :)
<bigjools> none of this was implemented in the Python version
<bigjools> but I am unsure how it works in Go
<davecheney> bigjools: then i'd say you have 0 work to do
<bigjools> well I see a panic in the jujud log
<bigjools> one of our "not implemented" messages
<davecheney> ok, so change that to return nil :)
<bigjools> so something is calling one of them
<bigjools> furry muff
<davecheney> the firewaller (runs inside the provisioning agent) will do that whenever a new machine appears
 * bigjools hacks
<bigjools> davecheney: what about Ports()
<davecheney> that is only used by the status command
<davecheney> and i'm not even sure we show the value of Ports()
<davecheney> just return an empty slice
<bigjools> ok
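(A sketch of the no-op implementations being described, with deliberately simplified signatures and an invented Port type; juju-core's actual Instance interface differs in detail:

    package maas

    // Port is an invented stand-in for the provider's port type.
    type Port struct {
        Protocol string
        Number   int
    }

    // maasInstance sketch: MAAS has no provider-level firewall, so the
    // firewaller's calls can simply succeed without doing anything.
    type maasInstance struct{}

    func (inst *maasInstance) OpenPorts(machineId string, ports []Port) error {
        return nil // nothing to open; MAAS machines are reachable directly
    }

    func (inst *maasInstance) ClosePorts(machineId string, ports []Port) error {
        return nil
    }

    func (inst *maasInstance) Ports(machineId string) ([]Port, error) {
        return []Port{}, nil // status simply sees no managed ports
    }
)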
<davecheney> bigjools: i need help
<davecheney> https://launchpad.net/~juju/+archive/experimental/+packages
<davecheney> ^ I have copied the package for Precise
<davecheney> but I cannot repeat the process for Quantal
<bigjools> what are you copying for quantal, the exact same source?
<davecheney> yes
<davecheney> well
<davecheney> no
<bigjools> ah yes I see the error
<bigjools> you need to copy the one *in* the PPA to a new series
<bigjools> with binaries
<davecheney> ok, i'll try that
<bigjools> pool-based repos mean that you can't have more than one copy of a particular version
<davecheney> nope
<davecheney> no joy
<bigjools> let me try
<bigjools> you want 2.2.4-0ubuntu1 in quantal?
<davecheney> yes
<davecheney> the plan is to use this PPA for P and Q
<davecheney> and the archive for R
<bigjools> there you go
<bigjools> you forgot to check "include binaries" I expect
<davecheney> no, that was deliberate
<davecheney> Copy options: (tick) Rebuild the copied sources
<davecheney> oh well
<davecheney> fuck it
<davecheney> if quantal is broken that will only give more weight to fixing this properly
<davecheney> i can't wait til the gmail move
<davecheney> getting away from thunderbirds fucked up search
<bigjools> gmail has plenty of other problems
<davecheney> don't care
<davecheney> thunderbird is terrible
<bigjools> t'bird is a crock these days
<bigjools> but then all email clients are quite frankly
<davecheney> yeah, what is with that
<bigjools> I haven't used a good one for at least 5 years
<davecheney> every man and his dog is writing a web framework or a mobile operating system
<bigjools> then all tend towards shite
<davecheney> but no love for IMAP
<davecheney> what gives
<bigjools> IMAP in tbird is the only reason I use it
<bigjools> it's very solid
<davecheney> my favorite tbird feature is the way it burns your nuts while running your CPU at 100%
 * bigjools guffaws
<bigjools> my favourite is the default setting to download all of your email that you have, even if it's gigs and gigs
<davecheney> COME GET SOME!!
<bigjools> is there an easy way to replace the tools on a bootstrap node without re-bootstrapping?
<bigjools> my TDD cycle is a tad long
<thumper> bigjools: technically
<thumper> bigjools: you could do an upgrade --upload-tools
<thumper> but I've not tried it
<bigjools> thumper: well I'll give it a go next time, I have nothing to lose (literally).  Ta.
<davecheney> lucky(~/src/launchpad.net/juju-core) % juju status
<davecheney> machines:
<davecheney>   "0":
<davecheney>     agent-version: 1.9.14
<davecheney>     dns-name: ec2-54-253-21-192.ap-southeast-2.compute.amazonaws.com
<davecheney>     instance-id: i-295d9914
<davecheney> services: {}
<davecheney> it worked!
<davecheney> no tarball in sight
<davecheney> Get:14 http://ppa.launchpad.net/juju/experimental/ubuntu/ precise/main mongodb-server amd64 1:2.2.4-0ubuntu1 [5137 kB]
<davecheney> Fetched 32.9 MB in 37s (880 kB/s)
<thumper> \o/
<davecheney> ubuntu@ip-10-248-69-6:~$ tail -f /var/log/juju/unit-mysql-0.log
<davecheney> /bin/sh: 1: exec: /var/lib/juju/tools/unit-mysql-0/jujud: not found
<davecheney> /bin/sh: 1: exec: /var/lib/juju/tools/unit-mysql-0/jujud: not found
<davecheney> /bin/sh: 1: exec: /var/lib/juju/tools/unit-mysql-0/jujud: not found
<davecheney> /bin/sh: 1: exec: /var/lib/juju/tools/unit-mysql-0/jujud: not found
<davecheney> something is borken
<rogpeppe> mornin' all
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: fwereade | Bugs: 0 Critical, 52 High - https://bugs.launchpad.net/juju-core/
<rogpeppe> fwereade: you've got a review: https://codereview.appspot.com/8604043/
<fwereade> rogpeppe, thanks, good ideas
<rogpeppe> fwereade: cool
<fwereade> rogpeppe, I originally called it "exclude" and then though "hell, it's really set difference" but happy to change back
<rogpeppe> fwereade: for me, set difference could work either way. "Without" might be good too.
<rogpeppe> fwereade: (i mean that x.Difference(y) could mean x - y or y - x)
<rogpeppe> (or even (x - y) union (y - x))
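(For concreteness, a name like Without removes the ambiguity rogpeppe describes, since x.Without(y) can only read as x - y; a toy sketch:

    package set

    // Set is a toy string set for illustration.
    type Set map[string]struct{}

    // Without returns the elements of s not present in other, i.e.
    // s - other; the method name makes the direction unambiguous.
    func (s Set) Without(other Set) Set {
        result := make(Set)
        for v := range s {
            if _, ok := other[v]; !ok {
                result[v] = struct{}{}
            }
        }
        return result
    }
)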
<mattyw> has anyone seen an issue with juju-core leaving a single volume in aws after a juju destroy-environment? I've got no useful debug yet but it's happened twice
<rogpeppe> mattyw: i haven't seen that issue, but i usually have loads of s3 garbage so i wouldn't notice
<fwereade> mattyw, I don't think it's meant to delete your control-bucket, just to clear its contents
<fwereade> mattyw, iirc bucket existence consistency is even more eventual than the rest of s3 and it became a hassle when spinning a single env up/down
<fwereade> mattyw, but that recollection does not have a high degree of confidence attached
<fwereade> mattyw, but if destroy-environment leaves any files in the control bucket, or any relevant instances not stopping/terminated, I think that's a bug
<dimitern> fwereade: I think it does leave stuff in the bucket after the live tests at least
<mattyw> fwereade, dimitern next time it happens I'll see if there's anything in the bucket - if there's stuff there shall I raise a bug?
<rogpeppe> fwereade, mattyw: it does try to delete the bucket.
<fwereade> rogpeppe, ah, ok
<fwereade> rogpeppe, if that fails, destroy-environment should fail too, right?
<rogpeppe> fwereade: yes
<rogpeppe> fwereade: have you got time for a quick G+ call?
<fwereade> sure, would you start it please?
<rogpeppe> fwereade: https://plus.google.com/hangouts/_/7e115006bcf50750316f3b966eb0825d1c263baf?authuser=0&hl=en-GB
<TheMue> lunchtime
<dimitern> so now with one not lgtm from dave and 2 lgtms already on https://codereview.appspot.com/8561044/, I cannot land it or any of the next 4 CLs depending on it, until we clear up the misunderstanding..
<dimitern> hey, what do you know :) dave responded and it's now fine, I'll submit it shortly
<dimitern> fwereade: when's the kanban meeting? at 16h daily?
<fwereade> dimitern, I think it's 1615 now
<dimitern> fwereade: oh, i see, cheers
<fwereade> dimitern, or it might be anyway, I'm not quite clear
<dimitern> fwereade: I'll start attending from today
<rogpeppe> fwereade: back to 1600 now
<fwereade> dimitern, cool
<fwereade> rogpeppe, ah ok, thanks
<rogpeppe> fwereade: just checking: there's no get-environment command (or equivalent) in pyjuju, right?
<fwereade> rogpeppe, there's "look at environments.yaml"
<rogpeppe> fwereade: :-)
<rogpeppe> fwereade: just wondering about compatibility issues
<dimitern> fwereade, rogpeppe: re get/set environment - shouldn't it be "get" and "set" rather than "get-environment" and "set-environment" ?
<fwereade> dimitern, discussed in atlanta: env config and service settings are different realms
<rogpeppe> dimitern: i agree. but there's an argument that says that they're different enough to require different commands
<dimitern> fwereade: so get/set will be for service config only
<fwereade> dimitern, as opposed to env constraints and service constraints, which interact closely
<fwereade> dimitern, yeah
<dimitern> fwereade: so for a service it'll be "juju get <svc> [<value>]" ?
<fwereade> dimitern, I think it already is
<rogpeppe> dimitern: juju get doesn't provide a way to get a specific key
<dimitern> fwereade: I see, although I lean towards having a single command (get/set) one with service, another without - why having separate ones necessarily better?
<fwereade> dimitern, that's my own personal preference as well, but we agreed otherwise in atlanta
<fwereade> dimitern, but different realms imply potentially different options, different output formats, etc
<fwereade> dimitern, segregating them is probably wise even if it leads to slightly less visually pleasing cli commands
<dimitern> fwereade: ok, i see
<rogpeppe> for anyone who's interested, you might want to check out https://ec2-50-17-53-143.compute-1.amazonaws.com/
<rogpeppe> it's running the GUI on top of go juju
<rogpeppe> fwereade, dimitern: ^
<dimitern> The site's security certificate is not trusted!
<dimitern> :)
<dimitern> rogpeppe: password?
<rogpeppe> dimitern: any password
<dimitern> ah, it works without
<dimitern> if I move a node the relation links are not updated
<rogpeppe> dimitern: yeah, i've just found that
<fwereade> rogpeppe, nice
<rogpeppe> dimitern: i'm just filing a bug.
<dimitern> mgz: standup?
<rogpeppe> dimitern: it's probably a known issue
<mgz> gah, clock...
<davecheney> evening gents
<dimitern> https://codereview.appspot.com/8561045/ need second LGTM on this
<dimitern> davecheney: evening
<davecheney> i got bootstrap working on raring, _AND_ precise without the tarball
<davecheney> the diff is quite small
 * dimitern cheers at davecheney
<dimitern> davecheney: how? from the archive? +ssl?
<davecheney> archive for raring
<davecheney> ppa for precise
<davecheney> haven't tested quantal yet
<davecheney> dimitern: I posted a sample in the channel this afternoon
<davecheney> it should be in th channel logs
<dimitern> ah ok
<dimitern> rogpeppe: a fairly small one https://codereview.appspot.com/8561045/
<dimitern> fwereade: ping
<fwereade> dimitern, pong
<fwereade> davecheney, awesome, tyvm
<dimitern> fwereade: so by "minimal machine errors in status" we mean short agent-state + agent-state-info for a machine?
<fwereade> dimitern, yeah
<TheMue> davecheney: ping
<dimitern> fwereade: cool
<davecheney> TheMue: ack
<dimitern> fwereade: how can I tell which machine the provisioner is running on, to get its tag for the nonce badge?
<dimitern> fwereade: now it's always machine-0, but i'd like the code to handle the case when it's not, if possible
<TheMue> davecheney: got my mail regarding the status? just wanted to know if there's already existing code or if i can start at zero?
<fwereade> dimitern, ha, good question -- I guess just pass it into the provisioner when it's started
<davecheney> yes, i got your email
<davecheney> did you not get my reply ?
<davecheney> i have a few branches, but they are so old
<davecheney> you'd be better to start from scratch rather than trying to merge them to trunk
<dimitern> fwereade: sgtm, and add a field for it in the Provisioner struct
<fwereade> dimitern, yeah, sgtm
<TheMue> davecheney: haven't received your mail yet, strange.
<fwereade> dimitern, (I have a lurking belief that all tasks should accept something like an agent.Conf, so they're all responsible for extracting the context they need rather than the agent having to do it for them, but meh, that's not one for today)
<TheMue> davecheney: ok, so i'll start new, thanks. enjoy your evening.
<TheMue> davecheney: ah, just seen, my mail app seems to have a problem with the canonical ssl. hmmm.
<dimitern> fwereade: I agree; for now I added machineId to NewProvisioner
<dimitern> fwereade: and a TODO for NewFirewaller
<davecheney> TheMue: some old branches, https://codereview.appspot.com/8619043
<davecheney> probably not useful
<TheMue> davecheney: thanks, will take a look
<davecheney> there isn't much there sadly
<TheMue> davecheney: but at least it looks like the approach i had in mind too. h5
<dimitern> fwereade: what happened to that critical bug about upgrade-charm should accept --switch?
<fwereade> dimitern, I imagine it was downgraded given everything else we're doing, but I'm not aware specifically
<dimitern> fwereade: sorry, my bad - check out this https://codereview.appspot.com/8620043/
<dimitern> fwereade: I see, ok
<dimitern> and I still need a second LGTM on this CL: https://codereview.appspot.com/8620043/
<TheMue> dimitern: do i get it right, the prefix of the nonce of each machine is the machine id of the provisioner?
<dimitern> TheMue: yes
<dimitern> TheMue: except for the bootstrap node, where it's the user-admin who created the environment
<TheMue> dimitern: ok, what's the reason behind this prefix?
<fwereade> TheMue, audit trail if multiple provisioners ever come to race
<TheMue> fwereade: thx, that was the missing info. ;)
<fwereade> dimitern, reviewed
<dimitern> fwereade: tyvm
<TheMue> dimitern: and another one
 * fwereade lunch
<dimitern> TheMue: thanks
<dimitern> this is ridiculous! how come I cannot cast []byte to [16]byte (of the same length)?
<dimitern> I have to range over the slice to fill in the array
<rogpeppe> dimitern: https://code.google.com/p/go/issues/detail?id=395
<dimitern> rogpeppe: i'm glad I'm not the only one who was thinking about this :)
<rogpeppe> dimitern: why do you need a [16]byte?
<rogpeppe> dimitern: oh yes, you don't need to range
<rogpeppe> dimitern: var x [16]byte; copy(x, slice)
<dimitern> rogpeppe: because UUID is [16]byte
<dimitern> rogpeppe: ah, cool, I'll try copy then
<dimitern> first argument to copy should be slice; have [16]byte
<dimitern> doesn't work
<dimitern> rogpeppe: ^^
<rogpeppe> dimitern: ah, sorry, copy(x[:], slice)
<dimitern> rogpeppe: nice!
<dimitern> rogpeppe: it's working now, thanks
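(The exchange above, in runnable form; Go at the time had no conversion from a slice to an array, so the idiom is to copy into a full slice of the array:

    package main

    import "fmt"

    func main() {
        slice := []byte("0123456789abcdef") // 16 bytes
        var arr [16]byte
        // copy's destination must be a slice, so take a full slice of
        // the array; arr[:] shares the array's storage, so this fills arr.
        n := copy(arr[:], slice)
        fmt.Println(n, arr)
    }
)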
<rogpeppe> fwereade: as far as you know, are there any restrictions on interface or relation names currently?
<rogpeppe> fwereade: other than the interface name can't start with "juju-" or be "juju"
<TheMue> afk, biab
<dimitern> fwereade: the status displaying machine agent/instance state is not dependent on MA setting status
<fwereade> dimitern, weeell, it would be a lot saner if the MA stuff were in place
<dimitern> fwereade: I mean one does not depend on the other
<fwereade> dimitern, true
<dimitern> fwereade: I'll propose both shortly, the status is done, and I think it came along nicely
<fwereade> dimitern, great news, tyvm
<fwereade> bbiab
<rogpeppe> fwereade, dimitern: a branch cleaning up some redundant statecmd/params stuff; big but almost entirely mechanical: https://codereview.appspot.com/8626043
<dimitern> rogpeppe: will take a look shortly
<rogpeppe> dimitern: thanks.
<dimitern> rogpeppe: in the mean time, i'll swap it for this https://codereview.appspot.com/8561046/
<rogpeppe> dimitern: we're calling machine.AgentAlive, but i can't see anywhere that we're actually calling Machine.SetAgentAlive, can you?
<rogpeppe> dimitern: have you tested that branch live?
<rogpeppe> dimitern: i suspect it will always show machines as "down"
<dimitern> rogpeppe: it cannot be tested live until the MA actually sets the status, which is what i'm working on right now
<rogpeppe> dimitern: ah, cool
<rogpeppe> dimitern: if you could make another branch that also gets the machine agent to SetAgentAlive, that would be marvellous
<dimitern> rogpeppe: will do soon, once the tests pass
<TheMue> fwereade: just for info, after a first look into the py code for the units it doesn't seem so weird. i hope this impression will stay. ;)
<dimitern> I had a curious error from gocheck:
<dimitern> ... Panic: Couldn't create temporary directory: mkdir /tmp/gocheck-669867415: file exists (PC=0x41175F)
<dimitern> It turned out I have loads of these in /tmp/, after rm -fr /tmp/gocheck-* it works now
<fwereade> TheMue, yeah, it's not insane or anything, it just uses concepts we don't expose
<TheMue> dimitern: hmm, interesting. why do they remain? good question.
<dimitern> TheMue: probably because I almost never restart my machine, and they're not removed by gocheck at the end
<dimitern> fwereade: https://codereview.appspot.com/8561046/
<TheMue> dimitern: imho gocheck is removing them (that's what i thought). but you may be right.
<fwereade> dimitern, cheers
<rogpeppe> fwereade: on call today? fancy a look at this: https://codereview.appspot.com/8626043 ?
<fwereade> rogpeppe, I'm looking at it now
<rogpeppe> fwereade: thanks
<rogpeppe> fwereade: sorry about the size, but it's really just cleaning up cruft.
<fwereade> rogpeppe, it's also wrong, but I suspect it was before
<fwereade> rogpeppe, how the hell did we write this stuff without tests?
<rogpeppe> fwereade: which stuff in particular?
<rogpeppe> fwereade: (i'm thinking of one thing, but you might be thinking of another)
<dimitern> shit, I forgot the kanban meeting today.. mramm can you invite me through the calendar, so I can keep track of it please?
<mramm> dimitern: I'll do that
<dimitern> mramm: cheers
<fwereade> rogpeppe, hmm, there are some tests, but I'm sure they were missing last time I looked
<fwereade> rogpeppe, anyway the yaml stuff is not the same as python
<rogpeppe> fwereade: how does it differ?
<fwereade> rogpeppe, in python it's equivalent to a map[string]map[string]interface{}
<rogpeppe> fwereade: orly?
<fwereade> rogpeppe, yeah
<rogpeppe> fwereade: so what's the top level map?
<fwereade> rogpeppe, service name
<rogpeppe> fwereade: but we already know the service name, right?
<fwereade> rogpeppe, use case is having a single config file for your whole environment
<rogpeppe> fwereade: ah, i see
<fwereade> rogpeppe, not sure if we handle it on deploy, we should really
<fwereade> rogpeppe, ah, we do
<fwereade> rogpeppe, but still in the wrong way
<fwereade> rogpeppe, I think it's just a matter of fixing SetYAML though
<rogpeppe> fwereade: i'm not entirely sure
<rogpeppe> fwereade: ISTR that the gui asks you to specify a config file when deploying a given service, not a config file for the whole thing
<rogpeppe> fwereade: so it would be quite weird to specify a yaml file like that and have all the settings ignored because the service name didn't match
<rogpeppe> fwereade: i can see use cases for both possibilities
<fwereade> rogpeppe, what is the point of having yaml config setting in the gui if not so people can use the settings files they do in the cli?
<rogpeppe> fwereade: let me just check in the gui
<rogpeppe> fwereade: the problem i see with it is that i can't have a yaml file holding default config settings for a charm i might want to deploy as more than one service
<fwereade> rogpeppe, that's not the use case it's addressing as far as I'm aware
<rogpeppe> fwereade: it seems like a reasonable one to me. that's how i'd want to use it anyway.
<rogpeppe> fwereade: because service names are fluid
<fwereade> rogpeppe, meh, settings for one service are not necessarily settings for another service, even if they share a charm
<rogpeppe> fwereade: it would be more useful if it mapped from charm name (or url) to settings
<fwereade> rogpeppe, it's a services-config file for a whole environment
<dimitern> rogpeppe: there is the rest https://codereview.appspot.com/8630043
<rogpeppe> fwereade: so like a half-hearted stack?
 * fwereade shrugs
<fwereade> rogpeppe, didn't say it was the world's most inspired feature
<fwereade> rogpeppe, just that it's a compatibility break for no good reason ;p
<rogpeppe> fwereade: so what does set do? read the yaml and return an error if there's no matching service?
<arosales> jcastro: can we get a spot on ubuntu-meeting for the charmer meeting?
<rogpeppe> fwereade: istm that if that's what we've got, we really want "juju set --config foo.yaml" to change the settings on all the services mentioned in the yaml file.
<jcastro> arosales: I can work that
<arosales> jcastro: thanks
<fwereade> rogpeppe, agreed, but that's not what we have now
<arosales> jcastro: we also need to sort the on-air recording
<rogpeppe> fwereade: no. i just find it difficult to countenance a Service.SetYAML that works like that. i'll learn to deal with it, i guess.
<dimitern> rogpeppe: reviewed
<rogpeppe> dimitern: thanks
<fwereade> rogpeppe, yeah, I don't like it either
<fwereade> dimitern, ping
<dimitern> fwereade: I have 2 CLs for you to review please: https://codereview.appspot.com/8561046/ and https://codereview.appspot.com/8630043/
<dimitern> fwereade: pong :)
<rogpeppe> fwereade: agreed about half-deployed services when it comes to AddUnit time. but before we've added any units, i really think we should clean up the service.
<dimitern> fwereade: what's that g+ invite about?
<fwereade> dimitern, I want to talk about one of them, was thinking about what rogpeppe said
<fwereade> rogpeppe, can we talk about this a bit later?
<rogpeppe> fwereade: sure
<rogpeppe> fwereade: trivial? https://codereview.appspot.com/8616044
<dimitern> rogpeppe: ping about https://codereview.appspot.com/8630043/
<fwereade> rogpeppe, https://codereview.appspot.com/8616044/ LGTM trivial
<rogpeppe> fwereade: ta
<fwereade> dimitern, btw, could I get a review on https://codereview.appspot.com/8545043/ please?
<dimitern> fwereade: sure, I'll be on it shortly
<fwereade> rogpeppe, ok, half-deployed services
<rogpeppe> fwereade: ok
<dimitern> fwereade: btw is the meeting tomorrow morning at 9 or at 10?
<fwereade> dimitern, er 10 I think
<fwereade> rogpeppe, destroying them is probably right
<dimitern> fwereade: ok, and the kanban is at 16
<fwereade> rogpeppe, I'm just bothered by the cases where we don't manage to
<rogpeppe> fwereade: yeah. i think if you try to deploy a service with a malformed config, the service should not be created.
<fwereade> rogpeppe, the config should probably be validated first then
<fwereade> rogpeppe, a misconfigured subordinate could be very annoying to clean up if we dropped the connection just after creating
<rogpeppe> fwereade: agreed. it is an issue, but failing to add a unit is a much rarer occurrence (network error) than a failed config (user error)
<fwereade> rogpeppe, I think that the only reasonable thing to do is to validate the config before creating the service
<fwereade> rogpeppe, I didn't catch on that we weren't doing so
<rogpeppe> fwereade: i'm not that bothered really. the SetConfig step can still fail either way, and we'll want to clean up afterwards.
<rogpeppe> fwereade: validating beforehand reduces the likelihood of a failure, but it's only making the window smaller, not eliminating it
<fwereade> rogpeppe, well, in what ways can it sensibly fail (aside from dropped connections, in which case no point trying to clean up)?
<fwereade> rogpeppe, how so?
<fwereade> rogpeppe, we know the charm, we created the service: we should be able to set a config with 100% certainty
<fwereade> rogpeppe, hm, immediate concurrent ninja-edits are possible
<fwereade> rogpeppe, but this is why I contend that this should all be in a transaction
<fwereade> rogpeppe, (not that I'm saying to do that now, just to illustrate a general point)
<rogpeppe> fwereade: if someone's running around calling SetConfig on about-to-be-created services, i think they deserve what they get :-)
<fwereade> rogpeppe, yeah
<fwereade> rogpeppe, so that reduces the cases we care about to dropped connections, in which case there's no point trying to clean up
<fwereade> rogpeppe, we just have to validate the incoming data against the charm before we start with the service
<rogpeppe> fwereade: assuming there are no transient mgo errors, yeah
<fwereade> rogpeppe, fair enough
<rogpeppe> fwereade: given current timeline, i'm going to leave it as a TODO for now
<fwereade> rogpeppe, your call
<rogpeppe> fwereade: and certainly not in that already-too-big branch :-)
<fwereade> rogpeppe, +1
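
A sketch of the ordering argued for above, using invented stand-in types rather than the real charm/state API: validate the incoming settings against the charm first, and only create the service once user error has been ruled out.

```go
package main

import "fmt"

// Stand-ins for the real charm/state types; all names here are invented.
type charmConfig struct{ options map[string]bool }

// Validate rejects settings the charm does not declare: a user error.
func (c *charmConfig) Validate(settings map[string]string) error {
	for k := range settings {
		if !c.options[k] {
			return fmt.Errorf("unknown option %q", k)
		}
	}
	return nil
}

type service struct{ name string }

// deploy validates settings before creating the service, so a bad config
// can never leave a half-created service behind; only dropped connections
// remain, and those can't be cleaned up anyway.
func deploy(cfg *charmConfig, name string, settings map[string]string) (*service, error) {
	if err := cfg.Validate(settings); err != nil {
		return nil, err
	}
	svc := &service{name: name}
	// ... set config, add units: only transient errors can fail us now.
	return svc, nil
}

func main() {
	cfg := &charmConfig{options: map[string]bool{"tuning-level": true}}
	if _, err := deploy(cfg, "mysql", map[string]string{"bogus": "x"}); err != nil {
		fmt.Println("deploy refused:", err)
	}
}
```
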
<rogpeppe> fwereade: i had a clean live test with API auth enabled, BTW
 * fwereade cheers at rogpeppe
<dimitern> something very weird is going on - m.SetAgentAlive() returns no error, but m.AgentAlive() returns false even after 5 sec., and m.WaitAgentAlive(5sec) times out
<rogpeppe> fwereade: enable API auth: https://codereview.appspot.com/8626044/
<fwereade> rogpeppe, sorry, gotta catch the shops
<fwereade> rogpeppe, I'll be back some time later though
<rogpeppe> fwereade: np
<rogpeppe> right, that's me for the day
<rogpeppe> reviews of https://codereview.appspot.com/8626044/ much appreciated
<rogpeppe> g'night all
<niemeyer> fwereade, rogpeppe: The agent state remains as pending although the install hook is actually running
<niemeyer> fwereade, rogpeppe: Sounds familiar?
<niemeyer> fwereade, rogpeppe: There's also something weird going on with the juju tools selection
<niemeyer>     agent-version: 1.9.14
<niemeyer>     agent-version: 1.9.13
<niemeyer> That's with machines 0 and 1
<niemeyer> I used --upload-tools to bootstrap the environment
<niemeyer> I'll send a note to the list
<niemeyer> Done
<thumper> morning
<fwereade_> thumper, heyhey
<thumper> morning fwereade_
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: thumper | Bugs: 0 Critical, 52 High - https://bugs.launchpad.net/juju-core/
 * thumper goes to read the godocs again
<thumper> fwereade_: are you still around?
<thumper> happy birthday davecheney
<fwereade_> thumper, yeah
<fwereade_> davecheney, happy birthday
<thumper> fwereade_: just looking at the set-env comment re: agent-version
<thumper> fwereade_: right now, in config.New, it sets the agent-version if it is unset or empty
<fwereade_> thumper, ah, yes, sorry, that's not proposed yet
<thumper> fwereade_: I just changed how it was doing it
<fwereade_> thumper, disregard my comments, I'll suck up the merge
<thumper> fwereade_: oh, you are changing it?
<fwereade_> thumper, yeah, the pick-latest-tools behaviour on bootstrap is one that nobody has argued against
<fwereade_> thumper, but an explicit setting should win
<thumper> fwereade_: sure, and it will
<thumper> fwereade_: config.New just makes sure there is something there that is valid
<fwereade_> thumper, indeed -- but if there's a real default there's no way to tell the difference between an explicit setting of version.CurrentNumber and an automatic one
<thumper> fwereade_: also, if a default one is retrieved, it can always be overridden with Apply anyway
<thumper> fwereade_: why should config care?
<fwereade_> thumper, as a reader of config when choosing bootstrap tools, I care
 * thumper thinks
<fwereade_> thumper, otherwise I always bootstrap with the client version of the tools (unless I upload-tools)
<thumper> fwereade_: the problem is, a config without an agent-version isn't valid
<fwereade_> thumper, I think this is just another one of those cases where the validity of a key depends on context
<thumper> fwereade_: and what are you doing instead?
<fwereade_> thumper, AgentVersion() (version.Number, bool)
<thumper> fwereade_: hmm...
<thumper> fwereade_: so do you want me to change my branch?
<thumper> fwereade_: I could remove the default
<fwereade_> thumper, no, I said disregard it
<thumper> kk
<fwereade_> thumper, I'll suck up the conflicts
<fwereade_> thumper, so long as it doesn't change assumptions made across the codebase we're fine ;)
<thumper> fwereade_: no, I've made no sweeping changes :)
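
A sketch of the comma-ok accessor fwereade_ describes, letting callers distinguish "explicitly set" from "absent" instead of seeing a silently applied default. The real method returns a version.Number; a plain string keeps this example self-contained.

```go
package main

import "fmt"

// Illustrative only; the real config type carries many more attributes.
type config struct {
	attrs map[string]interface{}
}

// AgentVersion reports the agent-version attribute and whether it was
// actually set, rather than inventing a default the caller can't detect.
func (c *config) AgentVersion() (string, bool) {
	v, ok := c.attrs["agent-version"].(string)
	return v, ok
}

func main() {
	cfg := &config{attrs: map[string]interface{}{}}
	if _, ok := cfg.AgentVersion(); !ok {
		fmt.Println("agent-version not set: bootstrap may pick latest tools")
	}
}
```
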
<fwereade_> thumper, btw, I just reproposed https://codereview.appspot.com/8604043
 * thumper nods
<thumper> I'll take another look
<fwereade_> thumper, I think an LGTM from you would be enough to land it
<thumper> ok
<thumper> I'm just going through the set-env review
<thumper> not much to change
<thumper> but it needs another LGTM, perhaps I'll poke davecheney :)
<davecheney> how did ya' all know ?
<fwereade_> thumper, nah, mostly just ramblings
<thumper> davecheney: magic
<davecheney> thumper: LGTMs for everyone today!!
 * fwereade_ cheers
<thumper> davecheney: and google+ said so
<davecheney> thumper: fwereade_ I think I may have a clue about what niemeyer is talking about with pending agents
<davecheney> i saw one case last night when I was testing the mongo stuff
<davecheney> basically unit agent has the wrong path for tools
<davecheney> or the tools symlink is not there
<davecheney> that is about all I can say at the moment
<davecheney> but obviously that will leave the machine in pending
<fwereade_> davecheney, s/machine/unit/ right?
<fwereade_> davecheney, but, hmm, that is ...troubling
<davecheney> fwereade_: i meant machine == virtual machine == service
<davecheney> but yes, you are correct
<davecheney> fwereade_: i will investigate more
<davecheney> i thought that it was something I had screwed up on my branch
<davecheney> as, as usual, I was crossing series to deploy the environment
<fwereade_> davecheney, I'm going to change --fake-series to default to config.DefaultSeries and the actual config's DefaultSeries()
<davecheney> fwereade_: +1
<davecheney> that would be more useful
<davecheney> atm, we could have 3 series in play
<davecheney> the workstation, the fake series, and the default-series
<davecheney> that is too many
<fwereade_> davecheney, yeah, you should only need to remember it on relatively rare occasions
<davecheney> there is a slightly complicated case where you are deploying an R or Q bootstrap node, and precise service units, because they follow the charm spec
<davecheney> but honestly, that is always painful
<davecheney> and in those cases I just hack version.version
<thumper> fwereade_: FYI, I am swayed by your reasoning for the local test array
<thumper> fwereade_: just the }{{ and }}{ lines take some getting used to
<fwereade_> thumper, cool -- and, yeah, I know, it definitely repulsed me at first :)
<davecheney> thumper: that part is unfortunately lisp like
<thumper> :)
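
A sketch of the inline test-table style in question, written against the standard testing package to stay self-contained (juju-core itself uses gocheck); the back-to-back brace pairs are what produce the }{{ and }}{ lines thumper mentions.

```go
package demo

import "testing"

// TestAdd iterates over an anonymous, function-local table of cases, so
// the data is scoped to the test that uses it rather than a package global.
func TestAdd(t *testing.T) {
	for i, test := range []struct {
		a, b, sum int
	}{{
		1, 2, 3,
	}, {
		2, 2, 4,
	}} {
		if got := test.a + test.b; got != test.sum {
			t.Errorf("test %d: got %d, want %d", i, got, test.sum)
		}
	}
}
```
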
<davecheney> who's got something for review? i'm in a benevolent mood
<thumper> fwereade_: we should take your nice description to the list to get general approval
<thumper> davecheney: o/
<thumper> davecheney: the set-environment needs another LGTM
<thumper> let me just update the diff with review comments.
 * davecheney looks
<fwereade_> thumper, good idea, I will try to remember it before I sleep :)
 * thumper wonders if the second lbox propose will remember the prereq from before
<davecheney> thumper: did you land the prereq ?
<thumper> not yet
<davecheney> ok
<davecheney> np
<thumper> davecheney: I wanted to land them together
<thumper> lbox propose failed anyway
<thumper> does it remember?
<thumper> or do I need to be explicit?
<thumper> davecheney: the prereq is all good to merge, but I want to land the pair
<davecheney> you need to do -req on the first cycle of lbox propose
<davecheney> for that branch it remembers (or probably relies on bzr to remember)
<fwereade_> davecheney, btw, https://codereview.appspot.com/8545043/ is up as well and should be fairly simple
<thumper> fwereade_: how did you get the "please take a look" with the new patch at the same time?
<thumper> I always end up with two comments
<fwereade_> thumper, lbox propose sends drafted comments
<thumper> fwereade_: oh, I didn't realise that
<davecheney> thumper: yeah, anything you haven't sent, it will send when you propose again
<thumper> fwereade_: you have your two https://codereview.appspot.com/8604043/
<thumper> davecheney:  https://codereview.appspot.com/8610043 should be updated now
<davecheney> thumper: ta
<fwereade_> thumper, cheers
<davecheney> thumper: re set environment
<davecheney> i have a recollection that we do _exactly_ that logic when
<davecheney> shit, it was the first lisbon sprint
<davecheney> something about pushing the config into the environment
<davecheney> minus the secrets
<davecheney> something about updatesecrets, but only if there are some
<davecheney> it is in environs/config
<davecheney> something like that
<davecheney> does that ring any bells ?
<davecheney> thumper: oh look, you've found that part
<thumper> davecheney: I used that as a basis of the set-environment changes
<davecheney> thumper: i'm sorry you had to spend so much time looking at the environment configuration horror show
<thumper> :)
<davecheney> you have made it much less horrid
<thumper> davecheney: the change to agent-version defaults was to normalize the approach
<thumper> davecheney: if it wasn't set, it was set inside config.New
<thumper> davecheney: so it made sense to be consistent
<thumper> davecheney: however fwereade_ is changing that behaviour, and has agreed to suck up my changes and tweak as necessary
#juju-dev 2013-04-11
<davecheney> SGTM
<thumper> davecheney: does lbox submit also send pending comments?
<davecheney> yes
 * thumper pulls fwereade_'s latest commit and will rerun the tests before submitting
<fwereade_> thumper, ah, sorry
<thumper> fwereade_: np
<thumper> fwereade_: it is only a few minutes...
<bigjools> hey fwereade_ did you get anywhere with the review on the maas branch?
<fwereade_> thumper, I have endured worse test times than this... I've endured build times worse than these test times
<thumper> fwereade_: so have I
<thumper> I remember one bank I was at where the compile time was two hours
<fwereade_> bigjools, I have been most remiss there -- it all looks essentially fine, but I have yet to step up and convince myself it all fits together
<fwereade_> bigjools, I will do another pass
 * bigjools had a 4 hour build once
<thumper> ah wat?
 * thumper sighs
<thumper> lbox submit expects the prereq to be committed first
<bigjools> fwereade_: ok we're still making some changes to it, some as requested in the initial pass and others to actually make it work. I deployed a charm on my microservers yesterday!
<thumper> half the point of a prereq is to get work reviewed independently of landing
<davecheney> thumper: of course
<thumper> that's bollocks
<fwereade_> bigjools, sweet!
 * thumper goes to follow lbox's rules
<bigjools> thumper: lbox expects quite a few things that are unreasonable IMO
<davecheney> if you propose the chain A -> B -> C
<davecheney> i don't think it is unreasonable to expect A to land before B, etc
<thumper> I think it is unreasonable
<bigjools> if C has B and A, then C will implicitly land them
<thumper> what if A isn't complete?
<thumper> or doesn't make sense without B?
<davecheney> then that isn't a prereq in the sense that lbox understands
<thumper> in which case lbox is dumb
 * davecheney is not fucking arguing about the tools today
<thumper> :)
<bigjools> careful  thumper, you don't want to be accused of bitching
<thumper> :P
 * thumper merges before heading into town for lunch and errands
<davecheney> thumper:                 Tools:           newSimpleTools("1.2.3-linux-amd64"),
<davecheney> this will make you mad
<davecheney> what series are these tools ?
<davecheney> no wait, they are the linux series
<thumper> linux of course
<thumper> :)
<davecheney> larrikin linux
 * davecheney fixes that shit
 * thumper waits for lbox to do its slow merge
 * thumper bitches
<thumper> and done
 * thumper heads to lunch
<fwereade_> bigjools, ok, I think I've finished the review
<fwereade_> bigjools, I whine about lots of things but basically I want it to land
<fwereade_> davecheney, I don't suppose you're just putting the last finishing touches to a https://codereview.appspot.com/8545043/ review? :)
<davecheney> fwereade_: sorry, i was not
<davecheney> it looked really large
<davecheney> so I was selfishly noodling with my own branch
<fwereade_> davecheney, no worries -- but it's a bit less bad than it looks, honest :)
<davecheney> fwereade_: you -> sleep, will review after lunch
 * davecheney goes back to fighting with the cloudinit checkers
<davecheney> map[interface{}]interface{} can bite me
 * fwereade_ goes off to sleep :)
 * fwereade_ solemnly advocates killing it with fire
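
goyaml decodes untyped YAML mappings as map[interface{}]interface{}, the type davecheney is cursing. A common workaround, sketched here rather than taken from the actual cloudinit checkers, is to normalise recursively into map[string]interface{}.

```go
package main

import "fmt"

// cleanup converts goyaml-style map[interface{}]interface{} values into
// map[string]interface{}, descending into nested maps and slices.
func cleanup(v interface{}) interface{} {
	switch v := v.(type) {
	case map[interface{}]interface{}:
		m := make(map[string]interface{})
		for k, val := range v {
			m[fmt.Sprint(k)] = cleanup(val)
		}
		return m
	case []interface{}:
		for i, val := range v {
			v[i] = cleanup(val)
		}
		return v
	}
	return v
}

func main() {
	raw := map[interface{}]interface{}{"runcmd": []interface{}{"a", "b"}}
	fmt.Printf("%#v\n", cleanup(raw))
}
```
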
<bigjools> fwereade_: thanks
<bigjools> not sure I will ever get used to "if" preconditions
<bigjools> pre-statements even
<davecheney> constraints_test.go:283: // A failure here indicates that goyaml bug lp:1132537 is fixed; please // delete this test and uncomment the flagged constraintsRoundtripTests. c.Assert(val.Hello, IsNil)
<davecheney> ... value *string = (*string)(0xc20006b4d0)
<davecheney> whut ?
<davecheney> I guess I need to merge to trunk
<bigjools> are there plans to introduce library versioning into Go at any point?
<davecheney> bigjools: nothing has been discussed on the list
<thumper> davecheney: there was a list thread about pulling the latest goyaml
<davecheney> yeah, i did that
<davecheney> but the failure above sounds like the opposite
<davecheney> i've just merged trunk
<davecheney> i think that has fixed it
<thumper> cool
<davecheney> juju debug-log is awesome
<davecheney> i can watch everything happening in an environment
<davecheney> shit
<davecheney> Proposal: https://code.launchpad.net/~dave-cheney/juju-core/113-juju-bootstrap-raring-cloudinit-II/+merge/158263
<davecheney> error: Failed to send patch set to codereview: can't upload base of environs/mongo.go: ERROR: Checksum mismatch.
<davecheney> ^ i deleted then added this file during the branch
<davecheney> now it is really upset
<davecheney> any suggestions
<thumper> huh?
<thumper> you removed a file?
<thumper> and it is complaining?
<davecheney> then I added it back
<davecheney> as in copied it from another branch
<thumper> oh...
<thumper> not so good
<davecheney> fuckit
<thumper> it will complain about file-ids
<thumper> can't you just do a reverse merge of the revision that deleted it?
<thumper> and discard everything else?
<davecheney> i'll figure it out
<davecheney> screw it, just imported the diff into a new branch
<thumper> davecheney: you can delete merge proposals in LP
<thumper> instead of rejecting them, (if you like)
<davecheney> thumper: too late now
<m_3> davecheney: hey... gotta sec?
<m_3> having an hp problem 'error: cannot log in to admin database: auth fails'
<m_3> status -v (http://paste.ubuntu.com/5697341/) shows I'm connecting, but...
<davecheney> m_3: sorry that environment is stillborn
<davecheney> grab /var/log/cloud-init-output.log from the first machine
<davecheney> then destroy the environment
<m_3> davecheney: ok, lemme look
<m_3> davecheney: that log looks normal
<m_3> davecheney: and I can ssh to the box
<m_3> davecheney: ubuntu@15.185.162.247... your keys are there
<m_3> if you have a sec to help, I'm trying to set up a scale-test environment
<m_3> nm, I'm just using that instance to test from there
<thumper> davecheney: what would your first thought be if I said I wanted "environment: whatever" as the first line of `juju status` ?
<thumper> personally I think it is crazy we don't have this in status somewhere
<m_3> yeah, it's not even in there huh
<thumper> hmm...
 * thumper takes it to the list
<thumper> I would just do it
<thumper> but it seems that some people get hung up on backward compatibility
<thumper> but it would be nice to see something while it waits for bootstrap to finish...
<m_3> Ha! ok, that's funny, I figured for sure that'd be in the juju-0.6 status
<m_3> most of the time you're sort of filtering status output anyways
<m_3> the machine ids up top are just noise normally
<m_3> but yes, it absolutely needs the current JUJU_ENV imo
<thumper> m_3: feel free to comment on my email message then :)
<davecheney> m_3: checking
<davecheney> m_3: 2013/04/11 04:09:32 JUJU jujud machine command failed: state entity name not found in configuration
<davecheney> error: state entity name not found in configuration
<davecheney> ok, this broke because there is no 1.9.13 tools for hp
<m_3> davecheney: np... I think I worked around it
<davecheney> jam gz: caused by: https://az-2.region-a.geo-1.compute.hpcloudsvc.com/v1.1/17031369947864/servers%!(EXTRA string=Resource limit exeeded at URL %s., string=https://az-2.region-a.geo-1.compute.hpcloudsvc.com/v1.1/17031369947864/servers)
<davecheney> ^ is there a bug about this format string snafu ?
 * thumper off swimming, back for the meeting
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: <noone> | Bugs: 0 Critical, 52 High - https://bugs.launchpad.net/juju-core/
<rogpeppe> mornin' all
<davecheney> rogpeppe: morning
<rogpeppe> davecheney: yo!
<rogpeppe> davecheney: how's tricks?
<davecheney> good!
<rogpeppe> davecheney: i could do with another review on https://codereview.appspot.com/8626044/ if you fancy it
<davecheney> wrap your peepers around this, https://codereview.appspot.com/8648043/
<rogpeppe> davecheney: swap ya!
 * davecheney looks
<rogpeppe> davecheney: nice! reviewed.
<davecheney> rogpeppe: and to explain myself in my review of your code
<davecheney> i'm concerned about the magic strings
<davecheney> "machine-password"
<davecheney> not because they are secrets
<davecheney> just appear to be test warts
<rogpeppe> davecheney: they're only magic in tests
<davecheney> as you said on the other CL yesterday
<davecheney> so why not change SetPassword(string) to be ResetPassword() -> string, error
<davecheney> then it tells you what the password is
<davecheney> and we don't have to add even more fixtures
<rogpeppe> davecheney: are you talking about state.User.SetPassword there?
<davecheney> yes
<rogpeppe> davecheney: no can do
<davecheney> why not
<rogpeppe> davecheney: the whole point is that we're setting the password from the admin-secret in the environment config
<davecheney> ok fine
<davecheney> objection withdrawn
<rogpeppe> davecheney: cool
<rogpeppe> davecheney: your point about possibly leaving a blank password on the admin user is a good one, i think, *because* we don't abort the bootstrap init process if juju bootstrap-state fails
<rogpeppe> davecheney: i think we should
<rogpeppe> davecheney: i've thought for a while now that we should do "set -e" at the start of the cloudinit script
<rogpeppe> davecheney: what do you think?
<davecheney> if you just did AddUser("admin", $random) that would solve the problem
<davecheney> the user is created, but nobody knows the password
<davecheney> would that work ?
<rogpeppe> davecheney: i'd prefer to solve the deeper problem
<rogpeppe> davecheney: why are we even running the machine agent if the initial set up has failed?
<davecheney> sounds like a reasonable question
<davecheney> none of those cloud init steps are optional
<rogpeppe> davecheney: exactly
<rogpeppe> davecheney: i might change cloudinit to make each command do foo || error foo failed >&2
<rogpeppe> davecheney: that way you won't have to infer the failure in the logs from the command's stderr output
<davecheney> doesn't cloudinit send &1 and &2 to /var/log/cloud-init-output ?
<rogpeppe> davecheney: yeah, the &2 is probably unnecessary, just habit
<rogpeppe> davecheney: maybe just set -ex actually
<rogpeppe> davecheney: does that sound reasonable to you?
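
A sketch of the proposed hardening, with an invented helper rather than the real cloudinit package: emit set -e so the first failing command aborts the whole init script, plus -x so each command is echoed to the log before it runs, instead of inferring failures from stderr alone.

```go
package main

import (
	"fmt"
	"strings"
)

// renderScript prepends "set -ex" so any failed command aborts the script
// (-e) and every command is logged before execution (-x).
func renderScript(cmds []string) string {
	var b strings.Builder
	b.WriteString("#!/bin/sh\nset -ex\n")
	for _, cmd := range cmds {
		b.WriteString(cmd + "\n")
	}
	return b.String()
}

func main() {
	fmt.Print(renderScript([]string{
		"echo bootstrapping state",
		"echo starting machine agent",
	}))
}
```
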
<rogpeppe> niemeyer: hiya! bit early for you, ain't it?
<niemeyer> rogpeppe: Nope.. it's actually very late :)
<rogpeppe> niemeyer: :-)
<rogpeppe> niemeyer: squalling bairn?
<niemeyer> rogpeppe: Have been cooking a release of mgo fixing a cluster resync bug, and spent some hours on it
<rogpeppe> niemeyer: cool
<niemeyer> Just about done now
<rogpeppe> niemeyer: the resync code is quite subtle, i thought
<niemeyer> rogpeppe: Yeah, real world is not so friendly
<niemeyer> rogpeppe: Servers popping in and out with their own ideas of what they should be doing is fun
<rogpeppe> niemeyer: while you're about it, it would be nice to have the redial interval configurable. we're hacking around it currently by sleeping in Dial(), but i think the right place is in mgo.
<niemeyer> rogpeppe: What's the issue?
<rogpeppe> niemeyer: by default, mongo dials about 10 times a second.
<rogpeppe> niemeyer: that's not great when you're waiting for a server to come up.
<niemeyer> rogpeppe: Where does it do that again?
<rogpeppe> niemeyer: the logic is scattered
<rogpeppe> niemeyer: the emergent behaviour is that it dials 3 (5?) times then sleeps for a bit (0.5s ?) then repeats
<niemeyer> rogpeppe: Ah, right
<rogpeppe> niemeyer: i've been through the logic and understood it a few times, but i always forget immediately afterwards :-)
<niemeyer> rogpeppe: It's that retry that I always forget to count
<niemeyer> rogpeppe: I think we could just take that out..
<niemeyer> rogpeppe: But, not today
<rogpeppe> niemeyer: indeed
<rogpeppe> niemeyer: our hack works ok for the time being
<niemeyer> rogpeppe: Will keep that in mind
<rogpeppe> niemeyer: some time, a bounded exponential backoff would probably be good.
<rogpeppe> niemeyer: so we don't suffer too badly from the herd effect after a network outage.
<niemeyer> rogpeppe: This will probably never happen
<rogpeppe> niemeyer: ok. too hard? or just not the right thing to do?
<niemeyer> rogpeppe: A database driver is something you want connected as soon as the network is back
<niemeyer> rogpeppe: But more reasonable timings is a good idea
<rogpeppe> niemeyer: if the max bound was only a small number of seconds, it would still be ok.
<niemeyer> rogpeppe: Sure, but it would also be fairly irrelevant I suppose
<rogpeppe> niemeyer: i'm not sure. if the network goes down, we don't really want all the clients dialling in synchrony
<rogpeppe> niemeyer: the exponential backoff (with a random element) can allow a good spread over time, i think.
<niemeyer> rogpeppe: Perhaps.. it depends on the scale and on the application behavior
<rogpeppe> niemeyer: indeed
<niemeyer> rogpeppe: The random element is somewhat unnecessary as well.. they're not wall-clock synchronized
<niemeyer> rogpeppe: This would only be really effective if we allowed the backoff to actually back off
<rogpeppe> niemeyer: if they all see the net go down at the same moment, they all redial immediately and they're all sleeping for the same amount of time, i suspect they'll keep reasonably clustered.
<niemeyer> rogpeppe: Like, spanning through several tens of seconds
<rogpeppe> niemeyer: the only random element i'd add would be the first sleep interval.
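
A sketch of the scheme rogpeppe suggests (invented function, not mgo's actual redial logic): a random first interval spreads out clients that all saw the outage at the same instant, the delay doubles thereafter, and the bound keeps reconnection within a few seconds of the network returning.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// redialDelays returns the first n sleep intervals between dial attempts:
// jittered start, exponential growth, capped at max.
func redialDelays(n int, max time.Duration) []time.Duration {
	// Only the first interval is random; subsequent doubling preserves
	// the initial spread between clients.
	d := time.Duration(rand.Int63n(int64(100*time.Millisecond))) + time.Millisecond
	delays := make([]time.Duration, 0, n)
	for i := 0; i < n; i++ {
		delays = append(delays, d)
		if d *= 2; d > max {
			d = max // bounded: never wait more than max
		}
	}
	return delays
}

func main() {
	fmt.Println(redialDelays(8, 5*time.Second))
}
```
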
<davecheney> 2013/04/11 06:43:16 ERROR JUJU:jujud:machine cmd/jujud: upgrader loaded invalid initial environment configuration: required environment variable not set for credentials attribute: User
<davecheney> any ideas ?
<niemeyer> rogpeppe: I guess I could put something intelligent based on the number of clients
<niemeyer> rogpeppe: and have the backoff take that into account
<rogpeppe> niemeyer: that sounds like a great idea if you have that info
<niemeyer> rogpeppe: Yeah, I think it's around in the server
<rogpeppe> davecheney: interesting
<niemeyer> Either way, that's the kind of bug I'd love to get someone complaining about :-)
<rogpeppe> niemeyer: lol
<davecheney> rogpeppe: environment still appears to work
<davecheney> this is in HP
<davecheney> we don't need that attr in ec2
<rogpeppe> davecheney: ah.
<rogpeppe> davecheney: yeah, i wondered
<niemeyer> Most people that use mgo heavily do so on the basis of a few machines
<niemeyer> millions of requests a day, but just a few servers
<davecheney> i wonder if this is part of our 'don't upload the secrets' logic
<rogpeppe> niemeyer: well, we are going to move in that direction
<niemeyer> davecheney: Heya
<rogpeppe> davecheney: it's possible. is this with the openstack driver?
 * davecheney waves
<niemeyer> rogpeppe: Yeah, I have my fingers crossed to have a Critical filed! ;)
<rogpeppe> niemeyer: so far the API seems to be working quite well
<rogpeppe> niemeyer: although none of the agents are using it yet
<niemeyer> Alright, I really need to sleep now, or I won't wake up in time for the meeting in a few hours
<rogpeppe> niemeyer: sleeeeeeep....
<niemeyer> have a good Beginning Of Day folks
<rogpeppe> niemeyer: you too
<rogpeppe> niemeyer: well, a good sleep anyway :-)
<niemeyer> Thanks :)
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1167723
<_mup_> Bug #1167723: environs/openstack: error relating to upgrader on bootstrap node <juju-core:New for dimitern> < https://launchpad.net/bugs/1167723 >
<rogpeppe> jeeze, i'm not surprised some of the live tests fail. having started an instance, it didn't see that instance when asking for it, even when waiting a whole 10 seconds later.
<rogpeppe> how the f*!@# can we make this stuff really reliable?
<dimitern> rogpeppe: increase timeouts and wait more?
<rogpeppe> dimitern: i'm just trying with a 20s timeout
<rogpeppe> dimitern: but really, if 20s why not 5 minutes?
<dimitern> rogpeppe: :) seriously?
<rogpeppe> dimitern: i've no idea
<rogpeppe> dimitern: i don't know what's happening inside amazon
<dimitern> rogpeppe: would it affect local live/other tests?
<rogpeppe> dimitern: yeah.
<dimitern> rogpeppe: btw a short and quick review? https://codereview.appspot.com/8630043/
<rogpeppe> dimitern: any time you're expecting an error, you'd need to time out the max amount
<rogpeppe> dimitern: looking
<rogpeppe> dimitern: i just saw the same thing happen even when waiting for 20 seconds
<rogpeppe> dimitern: i guess i should check our logic again. although it succeeds sometimes, it's possible we're getting something wrong somehow.
<TheMue> rogpeppe: is there any way to log the communication between the client and ec2?
<dimitern> rogpeppe: sgtm
<rogpeppe> TheMue: yeah, we can do that
<TheMue> rogpeppe: fine, maybe we just interpret the feedback wrong (or they changed a tiny bit in the protocol that now fools our logic).
<rogpeppe> TheMue: it's possible, but slightly unlikely, as our logic does work... some of the time!
<TheMue> rogpeppe: that's what i meant, the difference between "working" and "done" if you only look at "done" but never interpret "working" (w/o me knowing the protocol)
<rogpeppe> TheMue: i'm pretty sure it's sending exactly the same request each time
<rogpeppe> TheMue: (i'm just checking that)
<TheMue> rogpeppe: the client is sending the request, but i'm more interested in ec2s answer
<rogpeppe> TheMue: yeah
<dimitern> can someone send me the g+ link for the meeting?
<davecheney> https://plus.google.com/hangouts/_/calendar/bWFyay5yYW1tLWNocmlzdGVuc2VuQGNhbm9uaWNhbC5jb20.gdt9rkp5uspih9n3db6b95kccc
<rogpeppe> ha! found the bug!
<rogpeppe> TheMue: our own logic *was* screwed
<fwereade_> mramm, ping
<rogpeppe> mramm: bleep bleep bleep :-)
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: - | Bugs: 0 Critical, 53 High - https://bugs.launchpad.net/juju-core/
<dimitern> rogpeppe: I have an issue with m.SetAgentAlive() - returns no error, I can see the pinger starting in the log, but pwatcher.Alive(m.globalKey()) returns false (that's what m.AgentAlive() returns), even after a timeout of 5s or using m.WaitAgentAlive(), still fails - any clues?
<rogpeppe> dimitern: you'll have to debug it i'm afraid
<dimitern> rogpeppe: bugger.. ok, I thought I might be missing something obvious
<rogpeppe> dimitern: check out the pinger tests
<dimitern> rogpeppe: i'm getting this every time I run the machiner tests: [LOG] 20.79113 NOTICE juju: authorization error while connecting to state server; retrying
<dimitern> rogpeppe: and then it reconnects and continues
<dimitern> rogpeppe: could it be an indication why the pwatcher is not working right?
<dimitern> rogpeppe: I found it! after adding mr.st.StartSync() in the MA code (not tests) after calling SetAgentAlive(), the tests pass
<dimitern> why did i ever have to call this? i thought it was for tests only
<mramm> hello all
<mramm> I am very sorry I missed the meeting!
<dimitern> mramm: hiya
<dimitern> mramm: it went well, we even took notes: https://docs.google.com/a/canonical.com/document/d/1tKtAcrz9ADGF-o6lQtmGKYRf2ZHIR7lb-WctXvcLnfg/edit
<mramm> I had the alarm set for the meeting, but it was set an hour late!
<mramm> I am reading the notes now.
<fwereade_> dimitern, is the machiner using the same *State as the tests?
<fwereade_> dimitern, you shouldn't have to call StartSync usually, I don't think
<dimitern> fwereade_: probably not, because the tests use the jujuConn one
<dimitern> fwereade_: I had StartSync in both, but adding it to the MA code and removing it from the test didn't cause failures, so I left it at that and submitted
<fwereade_> dimitern, sorry, I don't think that's ok
<dimitern> fwereade_: i'm open to suggestions
<dimitern> fwereade_: why do you think that?
<fwereade_> dimitern, ok, does SetAgentAlive work in practice? when you do juju status against a live environment, does it work?
<fwereade_> dimitern, because Sync and StartSync are purely for the convenience of tests, as far as I'm aware
<dimitern> fwereade_: in order to test this, I need to finish the status CL, then I'll test it live
<fwereade_> dimitern, hmm, is this the tests for the actual machiner, or for the machine agent?
<dimitern> fwereade_: machiner
<mramm> So, from the notes it seems to me like we ought to call this next release 1.10.0 and that we should try to do a release as soon as it is possible to do the python+go thing
<mramm> it will have some bugs, which we should focus on fixing for the next few days, and release 1.10.1 early next week
<fwereade_> mramm, +1
<dimitern> fwereade_: sorry, I keep mixing "task" and "agent", even though now I know the difference :)
<fwereade_> dimitern, don't worry it still bugs me
<fwereade_> dimitern, but ok I think I know what the problem is, just a mo
<mgz> this weekend should be the aim for python+go release testing
<mgz> the python code is in review for including in raring, and the go part should be ready to go after that
<mramm> My long term proposal would be that we do a "stable" release at the end of the month every month, and then switch to an unstable number for dev versions for weekly releases within that month.
<mramm> mgz: that sounds reasonable
<dimitern> mramm: +1
<mramm> we should then try to come up with some good ways to test that the 1.10.x to 1.12.x release goes smoothly.
<fwereade_> dimitern, ok, the problem is that StartSync needs to be called after SetAgentAlive to get a timely response, but that the test doesn't have an obvious way of knowing when that's happened
<mramm> From the notes: "John: It will be parallel installable and easy to remove; they can't easily bootstrap"
<mramm> what's the second part mean?
<mramm> Also, can't we do this for them: "We need to make it clear that they need to set up a completely fresh directory structure for juju2"
<fwereade_> dimitern, I think the right thing to do is to SS in the test every 50ms while waiting for WaitAgentAlive in another goroutine and sending the result out into the same select you're using for the SSs
<dimitern> fwereade_: I tried that as well, didn't work
<dimitern> fwereade_: oh, wait - I didn't try a separate goroutine
<dimitern> fwereade_: let me just finish the status CL and I'll test it live as is first
<fwereade_> dimitern, ok cool
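
A self-contained sketch of the test pattern fwereade_ suggests, with fakeState standing in for the real state/presence machinery: run WaitAgentAlive in a goroutine and keep nudging the watcher with StartSync every 50ms from the same select loop.

```go
package main

import (
	"fmt"
	"time"
)

type fakeState struct {
	syncs chan struct{}
}

// StartSync nudges the simulated presence watcher along.
func (st *fakeState) StartSync() { st.syncs <- struct{}{} }

// WaitAgentAlive only makes progress when the watcher has been synced;
// here, three syncs stand in for "the agent's pinger has been observed".
func (st *fakeState) WaitAgentAlive(timeout time.Duration) error {
	for i := 0; i < 3; i++ {
		select {
		case <-st.syncs:
		case <-time.After(timeout):
			return fmt.Errorf("still not alive after %v", timeout)
		}
	}
	return nil
}

func main() {
	st := &fakeState{syncs: make(chan struct{})}
	done := make(chan error)
	go func() { done <- st.WaitAgentAlive(time.Second) }()
	for {
		select {
		case err := <-done:
			fmt.Println("WaitAgentAlive returned:", err)
			return
		case <-time.After(50 * time.Millisecond):
			st.StartSync() // keep the watcher moving while we wait
		}
	}
}
```
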
 * dimitern machine needs a restart I think, bbiab
<jtv> fwereade_: progressing steadily with our provider feature branch - thanks again for the reviews. We have some things that are unclear though. Could you comment on https://code.launchpad.net/~maas-maintainers/juju-core/maas-provider-skeleton/+merge/157025 and specifically the latest comments by Raphaël?
<fwereade_> jtv, sure
<jtv> Thanks
<fwereade_> jtv, I think I've covered it, let me know if I missed something
<fwereade_> jtv, awesome that you got a deploy working :)
<rvba> fwereade_: thanks for your reviews.
<jtv> fwereade_: looking now...  Thank rvba for getting the deployment running!
<fwereade_> rvba, thanks for the deployment!
<rvba> ;)
<fwereade_> rvba, let me know if anything needs clarification
<rogpeppe> interesting, i just saw a real-world failure caused by a temporary provisioning failure:
<rogpeppe> 2013/04/11 09:54:08 ERROR JUJU:jujud:machine worker/provisioner: cannot start instance for machine "1": cannot run instances: The service is unavailable. Please try again shortly. (Unavailable)
 * rogpeppe has finally submitted those ec2 expenses
 * rogpeppe hopes it's not too late
<dimitern> f@$%! time zone issues with both g calendar and thunderbird
<dimitern> fwereade_: so the problem is not only with tests
<fwereade_> rogpeppe, cool
<fwereade_> dimitern, oh yes?
<dimitern> fwereade_: w/o SS in M's loop(), I'm not even getting AgentAlive() -> true, nil right after calling SetAgentAlive() with no error
<fwereade_> dimitern, that's expected
<fwereade_> dimitern, the MA doesn't need to detect it's made itself alive
<dimitern> fwereade_: or, after using WaitAgentAlive for 5s
<dimitern> fwereade_: i'm confused now
<rogpeppe> fwereade_: i think that just having the pinger around should cause AgentAlive to return true, no?
<fwereade_> dimitern, are you sure about the presence timeout being 5s?
<fwereade_> rogpeppe, hmm, ok, that is an interesting point
<dimitern> fwereade_: I gave it 5s
<fwereade_> dimitern, but where are you asking AgentAlive()?
<fwereade_> dimitern, sorry, I mean the underlying pwatcher refresh frequency
<dimitern> fwereade_: I had the following (all combinations tried): SAA() -> no error, WAA(5s) -> timeout, AA() -> false, nil (after SAA()) - that's all without SS after SAA()
<fwereade_> dimitern, when and where are you making these calls, in response to what? :)
<dimitern> fwereade_: with SAA() + SS() all works
<fwereade_> dimitern, maybe we should g+
 * fwereade_ starts one
<dimitern> fwereade_: at the beginning of the loop, before setting status
<dimitern> fwereade_: sure
<rogpeppe> dimitern: if what you're saying is true, surely the presence tests would fail?
<dimitern> rogpeppe: I got it working, just a sec
<TheMue> davecheney: oh, almost missed it. happy birthday.
<davecheney> TheMue: thanks mate
<TheMue> davecheney: yw, 3h left. and hey, you should party, not be here online ;)
<davecheney> TheMue: party on the weekend
<TheMue> davecheney: i would like to join, sadly got no time. *cough* *cough*
 * TheMue steps out for lunch, biab
<dimitern> fwereade_: while i'm on to this, how about doing the same with the "down" state for units: "status: info" (regardless of the status, not only on error, and include info only when != "")?
<dimitern> fwereade_: agent-state-info, that is
<dimitern> fwereade_: anyway, PTAL https://codereview.appspot.com/8561046/
<rogpeppe> dimitern: what was the problem BTW?
<dimitern> rogpeppe: not really sure, but switching between state.Sync() and StartSync() made the difference
<dimitern> rogpeppe: take a look, if you want ^^
<rogpeppe> dimitern: are you saying it worked with StartSync but not with Sync?
<dimitern> rogpeppe: i'm saying the other way around
<rogpeppe> dimitern: but in the CL you're using StartSync, no?
<jam> mgz: poke for standup
<rogpeppe> dimitern: it shouldn't make any difference, given that you're doing WaitAgentAlive anyway
<dimitern> rogpeppe: in the machiner CL yes, but now in the one above I changed it to how it should be (not to use StartSync() in the code, just call Sync() in the test, and it passed)
<rogpeppe> dimitern: oh, you were calling Sync in the *code*... i see
<dimitern> rogpeppe: :) yeah
<dimitern> rogpeppe: i'm testing it live on ec2 now and it seems to work fine
<rogpeppe> dimitern: but i was just looking at the one above, and it calls StartSync not Sync
<rogpeppe> dimitern: but that's fine - it shouldn't make a difference
<jam> allenap: is mgz with you?
<dimitern> rogpeppe: are you sure you're looking at the right one? https://codereview.appspot.com/8561046/diff/11001/worker/machiner/machiner_test.go
<rogpeppe> dimitern: ah, i was looking at status_test
<fwereade_> dimitern, LGTM
<dimitern> fwereade_: cheers, I'll make the changes and submit it soon
<dimitern> fwereade_: and then i'll need some direction on deuglifying the logs
<fwereade_> dimitern, it's in the card
<fwereade_> dimitern, drop the JUJU: badge throughout, and drop all package name prefixes from "main" packages (leave others intact)
<dimitern> fwereade_: ah, ok - so only main packages
<rogpeppe> dimitern: so when you used StartSync in the machiner test, it failed?
<dimitern> fwereade_: I thought it was to generally reduce the JUJU strings
<dimitern> rogpeppe: yeah
<rogpeppe> dimitern: hmm, that seems fragile.
<fwereade_> dimitern, this gives us reasonably sensible output as feedback without sacrificing logging clarity, I think
<fwereade_> rogpeppe, I forget why but AgentAlive tests all seem to use sync
<dimitern> rogpeppe: I reckon because Sync() is synchronous, while StartSync() is not
<fwereade_> rogpeppe, possibly I never knew why, IIRC gustavo did the whole presence system
<rogpeppe> dimitern: i know that, but even synchronous just means "delivered *somewhere*"
<rogpeppe> dimitern: not necessarily acted on
<rogpeppe> dimitern: so if it was racy with StartSync, it's probably racy with Sync
<rogpeppe> dimitern: i'll have a quick check in the presence code
<dimitern> rogpeppe: cheers
<dimitern> it works live, just tested
<dimitern> fwereade_: what do you mean by "parenthesize the infos" ?
<dimitern> fwereade_: agent-state-info: "(somestatus: someinfo)" ?
<fwereade_> dimitern, agent-state: down; agent-state-info: (started)
<dimitern> fwereade_: right
<rogpeppe> dimitern: i see why Sync is different from StartSync in this case. i think it's justifiable.
 * rogpeppe doesn't like the way the watcher code doesn't behave synchronously on receipt of a message, even though it easily could.
<rogpeppe> in my opinion, this is a hack:
<rogpeppe> 	w.next = time.After(0)
<dimitern> rogpeppe: ha! so it returns a channel that's available to read on immediately
<rogpeppe> dimitern: yeah. or it sets the timer channel to that anyway.
<rogpeppe> fwereade_: i was thinking just yesterday that there really wasn't much value in having two status types
<dimitern> rogpeppe: it definitely seems like a hack
<rogpeppe> dimitern: it's just to save writing a function.
<rogpeppe> dimitern: that would be called from two places
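
A sketch of the trick being described: rather than calling the refresh logic from a second place, the loop resets its timer channel with time.After(0) so the periodic branch of the select fires immediately on the next iteration.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// The periodic refresh would normally wait a full interval...
	next := time.After(time.Minute)
	requests := make(chan string, 1)
	requests <- "sync request"

	for i := 0; i < 2; i++ {
		select {
		case req := <-requests:
			fmt.Println("handled", req)
			// ...but handling a request schedules an immediate
			// refresh by swapping in a channel that fires at once.
			next = time.After(0)
		case <-next:
			fmt.Println("refreshing now, without waiting the full minute")
		}
	}
}
```
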
<fwereade_> rogpeppe, I'd be +1 on unifying them if you are
<dimitern> rogpeppe: nasty :)
<fwereade_> rogpeppe, will simplify state a bit too
<rogpeppe> fwereade_: definitely.
<fwereade_> dimitern, would you tack a branch to do that onto the end of your status work then please? or integrate it if you prefer
<dimitern> rogpeppe, fwereade_: just to be clear here, we're talking about having just the same status type, rather than separate MachineStatus and UnitStatus?
<rogpeppe> fwereade_: const StatusError Status = "error" ?
<fwereade_> rogpeppe, dimitern: yep
<dimitern> fwereade_: it'll simplify some things, but not too much I think
<rogpeppe> why do those darn live tests keep passing when i want them to fail?!
<rogpeppe> dimitern: the Status functions become a little simpler
<rogpeppe> dimitern: no type conversion required
<dimitern> rogpeppe: and the status command as well
<rogpeppe> dimitern: definitely - that being the motivating example in this case
<dimitern> rogpeppe: we can have back the processStatus(status, info string) for both machines and units
<rogpeppe> dimitern: +1
<dimitern> aren't we missing something? it's easy to do this now, since all the places that use them are easy to find and change, but won't we lose some value out of it?
<dimitern> fwereade_, rogpeppe ^^ ?
<rogpeppe> dimitern: i don't think so
<rogpeppe> dimitern: but YMMV
<dimitern> if not, I can add a card for it and do it later today
<fwereade_> dimitern, describe the negative consequences from your POV
<dimitern> fwereade_: not sure I see any :)
<rogpeppe> dimitern: we're all agreed then :-)
<fwereade_> dimitern, then let's do it :)
<dimitern> cool, card added and assigned to myself
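
A sketch of the unification just agreed on (constant names illustrative): one shared Status type for machines and units, so helpers like processStatus need no type conversions.

```go
package main

import "fmt"

// Status replaces the separate MachineStatus/UnitStatus string types.
type Status string

const (
	StatusPending Status = "pending"
	StatusStarted Status = "started"
	StatusError   Status = "error"
	StatusDown    Status = "down"
)

// processStatus can now serve both machines and units directly.
func processStatus(status Status, info string) string {
	if status == StatusError && info != "" {
		return fmt.Sprintf("%s: %s", status, info)
	}
	return string(status)
}

func main() {
	fmt.Println(processStatus(StatusError, "hook failed"))
	fmt.Println(processStatus(StatusStarted, ""))
}
```
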
 * dimitern lunch
 * fwereade_ lunch too
<mgz> jam: sorry, lunch
<jam> mgz: sure sure, I hope that sandwich was worth your position... dun dun duuuunnnn
<mgz> :D
<mgz> sausage roll special, so it may well have been
<allenap> jam: Those sausage rolls really are quite special.
<mgz> jamespage has just finished his
<jamespage> hmmmm
 * jamespage yom yom yom
<rogpeppe> dimitern, fwereade_: trivial but important: https://codereview.appspot.com/8661043/
 * dimitern is back
<dimitern> rogpeppe: looking
<dimitern> rogpeppe: LGTM
<rogpeppe> dimitern: thanks
<dimitern> rogpeppe: I found a bug in rietveld - if you double click the send button it sends the mail twice :)
<rogpeppe> dimitern: nice
<rvba> fwereade_: could I have your opinion on this: https://code.launchpad.net/~rvb/juju-core/providers-import-fwreade-1/+merge/158369 ?  Maybe you'll have a better idea on where to put the import statements.
<fwereade_> rvba, hmm, why would maas be importing testing?
<fwereade_> rvba, oh, wait, you have all the tests in-package
<fwereade_> rvba, that would maybe be the problem?
<rvba> fwereade_: hum, having the tests in the same package allows us to monkey patch things easily.
<fwereade_> rvba, hmm, wait
<fwereade_> rvba, why does cmd/ import environs/all?
<fwereade_> rvba, individual commands can and should, but I don't think the command-utility package should know about them
<rvba> fwereade_: well, if we want all the providers registered, it needs to happen in cmd/ right?
<fwereade_> rvba, do it in the individual command packages -- cmd/juju, cmd/jujud
<fwereade_> rvba, and only the ones that actually need it
 * fwereade_ thinks that should do it
<rvba> fwereade_: indeed, that will probably fix the import loop as well...
<rvba> fwereade_: ta
<fwereade_> rvba, fwiw I ardently support the detailed testing you have in place in maas
<fwereade_> rvba, but I would really prefer those tests which can be external to be external
<fwereade_> rvba, are you familiar with export_test?
<rvba> fwereade_: we have a card for that, we will look into it.
<rvba> fwereade_: no
<fwereade_> rvba, ok, files ending in "_test.go" are only compiled when running tests
<fwereade_> rvba, this means that you can take internal functions and define them in a file called export_test.go, but still in the package rather than the test package
<fwereade_> rvba, and anything exported in there will be visible to the tests and only the tests
<fwereade_> rvba, this is useful, and allows for monkey-patching just fine
<fwereade_> rvba, but it has a sting in the tail
<rvba> fwereade_: indeed, looks exactly like what we need.
<fwereade_> rvba, go test ./... does not guarantee that the tested code will build when _test.go files are not compiled in
<fwereade_> rvba, so you should always check go build ./... as well
<rvba> fwereade_: all right
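
A sketch of the export_test.go pattern just described, with invented names. It assumes the package's own files define an unexported fetchMetadata function and apiTimeout variable; because the filename ends in _test.go it is compiled only by "go test", so these exports are visible to the tests and nothing else.

```go
// export_test.go, in the package under test (names are illustrative).
package provider

// Assuming provider.go defines, unexported:
//     func fetchMetadata(url string) (string, error)
//     var apiTimeout = 30
// this file re-exports them for the external provider_test package.

// FetchMetadata lets tests call the internal function directly.
var FetchMetadata = fetchMetadata

// SetAPITimeout monkey-patches the internal timeout and returns a restore
// function for the test to defer, keeping the patch scoped to one test.
func SetAPITimeout(t int) (restore func()) {
	old := apiTimeout
	apiTimeout = t
	return func() { apiTimeout = old }
}
```

As fwereade_ notes, the sting in the tail is that `go test ./...` compiles these files in, so `go build ./...` is still needed to prove the package builds without them.
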
<dimitern> rogpeppe: is it a problem if state imports state/api/params? I want to make statusDoc have the status as params.Status, rather than string
<dimitern> can somebody send me a link to the kanban g+?
<rogpeppe> dimitern: no, that's the main point of params
<fwereade_> dimitern, https://plus.google.com/hangouts/_/539f4239bf2fd8f454b789d64cd7307166bc9083
<rogpeppe> dimitern: when the agents aren't using state directly, i think params should be able to be merged into state/api
<dimitern> rogpeppe: I can simplify the status ops then, getting the fields directly, instead of a doc
<dimitern> fwereade_: cheers
<rogpeppe> dimitern: i'd use a doc
<dimitern> rogpeppe: cool then
<mgz> do we test juju-core with gccgo at all?
 * dimitern wonders which is better: params.Started or params.StatusStarted?
 * TheMue has to find out why his microphone doesn't work. Already changed the browsers, but that doesn't seem to be the reason.
<dimitern> fwereade_, rogpeppe: thoughts? ^^
<fwereade_> dimitern, StatusStarted please
<rogpeppe> +1
<dimitern> fwereade_: yeah, I was leaning towards that
<fwereade_> mgz, not that I'm aware
<rogpeppe> TheMue: i wonder if all the noise we were hearing from you was from your mike going wrong
<rogpeppe> mgz: i've never used gccgo
<rogpeppe> mgz: it would be great to have a try
<rogpeppe> mgz: i'm not sure how well cgo works, but i suppose it *should* work even better...
<TheMue> rogpeppe: yeah, maybe, i'm checking right now
<dimitern> TheMue: if you're using an hdmi external monitor it's worth checking the volume control to see it's using the correct input - i have this problem with skype after g+ every time (not the other way round though)
<TheMue> dimitern: i'm just using my macbook and using the mic for dictation works fine
<TheMue> dimitern: would you pass me your skype address for a test?
<dimitern> TheMue: sure: ultramarine_crystal
<TheMue> dimitern: Thx, contacted you.
<fwereade_> rogpeppe, TheMue: btw, did my defence of inline test arrays find any favour?
<dimitern> fwereade_: I can live with it :)
<rogpeppe> fwereade_: i'm still not keen
<rogpeppe> fwereade_: i don't think the naming is a problem (just name it after the test)
<fwereade_> rogpeppe, apart from the level of indentation
<fwereade_> rogpeppe, the naming really is a problem that exists and has the drawbacks I listed
<rogpeppe> fwereade_: and i like knowing that all the data is uninfluenced by the stuff that comes into the function
<rogpeppe> fwereade_: i know that it's got to pass through the logic in the code, but i can scan that easily
<fwereade_> rogpeppe, the global test arrays have way worse sanity/stability guarantees than the scoped ones
<rogpeppe> fwereade_: really?
<fwereade_> rogpeppe, yes
<fwereade_> rogpeppe, they're modifiable from anywhere, and they can be used in multiple test methods
<fwereade_> rogpeppe, if they're not I cannot think of any reason to use a wider scope than necessary
<fwereade_> rogpeppe, apart from, as you, said, indentation ;)
<fwereade_> rogpeppe, I'm willing to pay a tab to banish spooky action at a distance
<TheMue> fwereade_: for consistency i personally prefer the tests outside the function, but it's no big reason
<rogpeppe> fwereade_: i don't really mind if they're modifiable from anywhere - it's easy to see if anywhere else has a reference to them
<fwereade_> rogpeppe, abuse of global test arrays tripped danilos up yesterday :)
<rogpeppe> fwereade_: he won't do it again :-)
<fwereade_> rogpeppe, he didn't do it
<fwereade_> rogpeppe, someone left a landmine for him
<rogpeppe> fwereade_: honestly, i like to see the name of the test next to the logic of the test
<rogpeppe> fwereade_: not separated by 200 lines
<rogpeppe> fwereade_: or 500 in at least one case
<fwereade_> rogpeppe, but the names only sometimes match the tests
<rogpeppe> fwereade_: we should make them match always
<rogpeppe> fwereade_: that's the convention i've been trying to follow
<fwereade_> rogpeppe, if there are two suites that have TestFoo, you can't have two fooTests variables
<fwereade_> rogpeppe, so you either relax that or you say, ok, let's have really redundant names everywhere
<rogpeppe> fwereade_: the moment you have a collision, you get told
<rogpeppe> fwereade_: i don't really see the issue
<rogpeppe> fwereade_: i'm interested to know about the issue danilos had yesterday though
<fwereade_> rogpeppe, it's a constant low-level irritation and a source of nothing but hassle
 * rogpeppe doesn't find it so
<fwereade_> rogpeppe, someone had made some test depend on otherTestArray[12]
<rogpeppe> fwereade_: oh jeeze
<rogpeppe> fwereade_: that was straight-up crack
<fwereade_> rogpeppe, clearly we can't be trusted with sharp tools ;p
<rogpeppe> fwereade_: but shared test data can be very useful
<danilos> rogpeppe, bzr diff -r1133..1134 :)
<fwereade_> rogpeppe, sure -- but it should be differentiated from that which is tightly scoped
<fwereade_> rogpeppe, globals are bad, mmkay?
<danilos> rogpeppe, basically, there was an array of test configs, and then one of them was overridden in a couple of tests later, and it was referenced as array[12]
<rogpeppe> danilos: yeah, i remember now
<rogpeppe> fwereade_: i don't think we've got anywhere else like that though
<rogpeppe> fwereade_: if i'd seen it i'd have called it out
<rogpeppe> i don't agree that globals are bad
<fwereade_> rogpeppe, every global carries a cognitive load
<rogpeppe> fwereade_: i like static data
<fwereade_> rogpeppe, you may be better at bearing that load than I am
<rogpeppe> fwereade_: i think that a large potentially dynamic array carries its own cognitive load
<fwereade_> rogpeppe, how are the global vars not dynamic?
<rogpeppe> fwereade_: "is there somewhere in this data that changes at runtime?"
<fwereade_> rogpeppe, yeah, how do you guarantee that for the global ones?
<rogpeppe> fwereade_: it's trivially easy to verify that they're only referred to once
<fwereade_> rogpeppe, instead of reading the test, you have to read the whole package
<rogpeppe> fwereade_: grep fooTests *.go
<fwereade_> rogpeppe, every time you see one, you should check?
<fwereade_> rogpeppe, or maybe you shouldn't *have* to
<danilos> fwereade_, +1
<rogpeppe> fwereade_: i think it's easier to check for global references than dynamic data
<fwereade_> rogpeppe, perhaps there's some case you're concerned about that I'm not seeing
<danilos> fwereade_, rogpeppe: I'd definitely agree with the notion that test input and expectations should be collocated, ideally in their respective tests
<rogpeppe> fwereade_: i can't think of any other issues we've had that relate to this issue, other than occasionally needing to rename a global variable
<rogpeppe> fwereade_: and i really think that being able to see the test function as a whole is important
<fwereade_> rogpeppe, the test code is the last block
<fwereade_> rogpeppe, the definition is the first block
<fwereade_> rogpeppe, the tests separate the two
<fwereade_> rogpeppe, putting them all in one place tightens the scope, and the only thing you lose is having the name of the test next to the logic... with the benefit of no longer having an extra global var to worry about
 * rogpeppe tries to find the CL where this came up
<rogpeppe> fwereade_: still not keen. i think the second version here is generally easier to read. https://codereview.appspot.com/8604043/diff2/2001:18001/environs/tools/list_test.go
<rogpeppe> fwereade_: it emphasises the difference between the static and moving parts
<fwereade_> rogpeppe, I dunno, to me the version I landed is strictly worse
<rogpeppe> fwereade_: in the second version, i can read the logic of TestMatch at a glance. in the old one, the bulky test data interferes.
<fwereade_> rogpeppe, but the actual context you need is still way off up at the top of the definition, it seems like a straight tradeoff readability-wise... one use becomes easier, another harder
<rogpeppe> fwereade_: in a way, both the test data and the function can be read independently, and i like it that way
<fwereade_> rogpeppe, if it were a simple matter of readability I wouldn't care, it's just the wanton spamming of globals to no clear benefit... you could separate the test definitions by creating a local var, if you really wanted to
<rogpeppe> fwereade_: the test method definition is already global for most intents
 * rogpeppe guesses he perhaps has less of a problem with globals than other people.
<fwereade_> rogpeppe, it's a method on a type... that's not global in the "global vars are bad" sense
<dimitern> rogpeppe: can you help me with an obscure failing megawatcher_internal_test ?
<rogpeppe> fwereade_: i really don't see the problem. you read it once. you don't change it. i have *never* had an issue with this.
<rogpeppe> dimitern: sure
<rogpeppe> fwereade_: and i don't wanna lose that indent either - test data often struggles with line length.
<fwereade_> rogpeppe, every global variable is a vector for spooky action at a distance within your code
<rogpeppe> fwereade_: only if someone else has a reference to it
<dimitern> rogpeppe: this is the error http://paste.ubuntu.com/5698822/, and the only changes to the code from trunk are params.Unit/Machine* (status) -> params.Status*
<rogpeppe> fwereade_: and in this kind of case, it's trivial to check that noone does.
<fwereade_> rogpeppe, every single time?
<dimitern> rogpeppe: it seems fishy somehow - in setUp the unit is added and set to started, but it's seen as in error state
<dimitern> rogpeppe: I tried getting the status right after the set, asserting it's the same, calling Sync() or StartSync() before or after SetStatus(), calling SetStatus() twice (!) - all without problems, and the same failure
<fwereade_> rogpeppe, every global variable costs a grep per developer per change to the source code ;p
<fwereade_> rogpeppe, assuming nobody ever abuses them and causes the developers to actually think about it
<dimitern> rogpeppe: it seems the change it sees is for a different unit
<rogpeppe> fwereade_: you only write the tests once. they're read 100 times.
<mramm> so, here is my summary of the versioning situation:
<mramm> https://docs.google.com/a/canonical.com/document/d/1XvVvVls3ka0z_JahAKl9CHVvT1WAMvE1BTmpKdezlQc/edit
<rogpeppe> dimitern: is this trunk, or your branch?
<fwereade_> rogpeppe, no argument there
<fwereade_> rogpeppe, anyway, I'll take it to the lists
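A minimal Go sketch of the two styles under debate, using made-up data and assumed gocheck conventions rather than the actual list_test.go from the CL:

    package list_test

    import (
        "strings"
        "testing"

        gc "launchpad.net/gocheck"
    )

    func Test(t *testing.T) { gc.TestingT(t) }

    type suite struct{}

    var _ = gc.Suite(&suite{})

    // Package-level table: readable on its own, but a global that every
    // test in the package can see (and, in principle, mutate).
    var matchTests = []struct{ in, want string }{
        {"a", "A"},
        {"b", "B"},
    }

    func (s *suite) TestMatchGlobal(c *gc.C) {
        for _, t := range matchTests {
            c.Check(strings.ToUpper(t.in), gc.Equals, t.want)
        }
    }

    // The same table as a local: the test name sits further from the
    // data, but nothing outside this function can touch it.
    func (s *suite) TestMatchLocal(c *gc.C) {
        matchTests := []struct{ in, want string }{
            {"a", "A"},
            {"b", "B"},
        }
        for _, t := range matchTests {
            c.Check(strings.ToUpper(t.in), gc.Equals, t.want)
        }
    }

Either way the loop body is identical; the disagreement is purely about where the table lives.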
<mramm> I will send something like that to the list in an hour or so, so if you have feedback let me know between now and then
<dimitern> rogpeppe: my branch, but as i said the only changes are replacing params.(Unit|Machine)(\w+) with params.Status\1
<rogpeppe> dimitern: ah, i know the problem
<rogpeppe> dimitern: you've changed the statusDoc definition, yes?
<dimitern> rogpeppe: and the types of the Status field in Unit/Machine info to params.Status, instead of params.Unit/MachineStatus
<dimitern> rogpeppe: yes?
<rogpeppe> dimitern: hmm, actually, no.
<rogpeppe> dimitern: could you push your branch and i'll take a look
<dimitern> rogpeppe: sure, that's the only thing left before proposing
<dimitern> rogpeppe: lp:~dimitern/juju-core/031-unify-machine-unit-status-types
<rogpeppe> dimitern: looking
<rogpeppe> dimitern: ha, i see your problem!
<rogpeppe> dimitern: you broke backingStatus.updated
<dimitern> mramm: great summary, thanks - only two comments: 1) "...the command line interface, the mongo..." -> "...the command line interface or output, mongo..."; 2) "...of these three api's..." -> "...of these three things(?)..." (only the last one is an API)
<dimitern> rogpeppe: oh? how?
<rogpeppe> dimitern: case doesn't work like in C
<mramm> well they are effectively API's
<mramm> since people program against them
<dimitern> mramm: just suggestions :)
<mramm> sure
<rogpeppe> dimitern: you need to revert to the earlier version and just remove the type conversions.
<rogpeppe> dimitern: (earlier version of that function, that is)
<dimitern> rogpeppe: can you point me to the location please?
<rogpeppe> dimitern: line 216
<rogpeppe> dimitern: what is that case supposed to be doing?
<mramm> dimitern: I will clarify the API bit
<dimitern> rogpeppe: line 216 is var allWatcherChangedTests
<rogpeppe> dimitern: megawatcher.go:216
<rogpeppe> dimitern: the tests are still fine
<rogpeppe> dimitern: you just broke the code itself :-)
<rogpeppe> dimitern: so it was right that the tests failed...
<dimitern> rogpeppe: ah, well, since they're essentially the same I wanted to unify them
<rogpeppe> dimitern: you can't
<rogpeppe> dimitern: unless...
<dimitern> rogpeppe: ewww.. ok, I'll revert it back
<rogpeppe> dimitern: you kinda could, but it wouldn't be worth it
<dimitern> rogpeppe: isn't that supposed to handle both cases like this, or I'm missing something?
<rogpeppe> dimitern: you can't have a variable that's two types at once unless it's an interface
<rogpeppe> dimitern: no
<rogpeppe> dimitern: cases don't fall through in Go
<rogpeppe> dimitern: (that's why we don't put a "break" after each one)
<dimitern> rogpeppe: oh! they're defined there, right!
<rogpeppe> dimitern: you can separate possible values with commas though
<dimitern> rogpeppe: how about case *params.UnitInfo, *params.MachineInfo: ?
<rogpeppe> dimitern: that's valid syntax
<rogpeppe> dimitern: but wouldn't work
<rogpeppe> dimitern: because then the type of info would be the type of info0
<dimitern> rogpeppe: I see now, ok, sorry about the noise - i'll revert them
<rogpeppe> [16:57:17] <rogpeppe> dimitern: you can't have a variable that's two types at once unless it's an interface
<rogpeppe> dimitern: if you defined a setStatus function on the two types, then you could unify the cases
<rogpeppe> dimitern: but you'd need a generic way of making a copy too
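A self-contained sketch of the two Go points above, with stand-in types rather than the real params structs: cases do not fall through, and a multi-type case leaves the bound variable with the interface type:

    package main

    import "fmt"

    type UnitInfo struct{ Status string }
    type MachineInfo struct{ Status string }

    func setStarted(info0 interface{}) {
        // No fall-through in Go: each case is independent, no "break" needed.
        switch info := info0.(type) {
        case *UnitInfo:
            info.Status = "started" // info has static type *UnitInfo here
        case *MachineInfo:
            info.Status = "started" // ...and *MachineInfo here
        }
    }

    func typeName(info0 interface{}) string {
        switch info := info0.(type) {
        case *UnitInfo, *MachineInfo:
            // With several types in one case, info keeps the static type
            // of info0 (interface{}), so info.Status would not compile.
            return fmt.Sprintf("%T", info)
        }
        return "unknown"
    }

    func main() {
        u := &UnitInfo{}
        setStarted(u)
        fmt.Println(u.Status, typeName(u)) // started *main.UnitInfo
    }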
<dimitern> rogpeppe: I don't want to make this otherwise almost trivial CL into a more complicated one
<dimitern> rogpeppe: agreed?
<rogpeppe> dimitern: indeed. i don't believe it's worth it.
<rogpeppe> dimitern: i had two cases there before. you can have two cases there still...
<dimitern> rogpeppe: sure
<dimitern> rogpeppe: although before they looked a bit more different
<rogpeppe> dimitern: agreed
<rogpeppe> dimitern: but they're still generating quite different assembly code
<benji> rogpeppe: after a fresh bootstrap I'm getting "error: no reachable servers" when trying "juju status"
<rogpeppe> benji: your bootstrap has probably failed for some reason
<rogpeppe> benji: try juju status --debug; that will show you the bootstrap node's address; then ssh to that node and see what /var/log/cloud-init-output.log has in it
<dimitern> rogpeppe: indeed - easy to overlook that though
<rogpeppe> benji: i always like to know what failure modes are around
<benji> rogpeppe: will do (it will take a second because I have destroyed the environment to test something, I'll re-bootstrap and dig around)
<dimitern> fwereade_, rogpeppe: unified status for machines/units: https://codereview.appspot.com/8667043/ - mostly mechanical replacing
<rogpeppe> benji: does it happen every time?
<benji> rogpeppe: for the last 3 or so times it has
<rogpeppe> benji: ah, ok, that's "good" :-)
<benji> ooh, a new error when bootstrapping: error: cannot save state: cannot write file "provider-state" to control bucket: remote error: handshake failure
<rogpeppe> benji: i have a recommendation for that one
<rogpeppe> benji: if you cd to $GOPATH/src/launchpad.net/goamz, what does bzr revno tell you?
<benji> rogpeppe: 35
<rogpeppe> benji: hmm, darn
<rogpeppe> benji: oh i see
<rogpeppe> benji: goamz/s3 has auto retry logic for Get but not Put
<rogpeppe> benji: that's an unfortunate transient error
<benji> so I should rebootstrap
<rogpeppe> benji: yeah
<benji> k
<rogpeppe> benji: i'll try to propose a change to goamz/s3 to help
<benji> rogpeppe: error: cannot save state: cannot write file "provider-state" to control bucket: read tcp 207.171.189.80:443: connection reset by peer
<rogpeppe> benji: sounds like a similar issue
<rogpeppe> benji: s3 is horribly flaky
<rogpeppe> benji: try again...
<benji> will do
<rogpeppe> niemeyer: ping
<niemeyer> rogpeppe: pong
<rogpeppe> niemeyer: i'd like to fix s3 so it's resilient in the face of transient Put errors
<rogpeppe> niemeyer: i have a proposal i'd like to run by you
<rogpeppe> niemeyer: it goes something like this: http://paste.ubuntu.com/5698946/
<niemeyer> rogpeppe: I have some ideas on how to do that already, but they require changing the API, which I think it's necessary either way
<niemeyer> rogpeppe: There are other things to be fixed in that API, and I'd like to do that all at once
<rogpeppe> niemeyer: ok. could we apply this band-aid in the meantime. (see benji's woes above)
<rogpeppe> ?
<niemeyer> rogpeppe: Can't we apply a band-aid in juju-core instead?
<rogpeppe> niemeyer: ok. i'd hoped to avoid duplicating the shouldRetry logic, but i guess not.
<niemeyer> rogpeppe: I'd rather not introduce yet another way to put a file just to break the API soon
<rogpeppe> niemeyer: i haven't introduced anything new
<rogpeppe> niemeyer: just changed PutReader so if you *happen* to give it a ReadSeeker, it'll retry
<niemeyer> rogpeppe: Even worse.. that's breaking the API
<rogpeppe> niemeyer: not badly. there's no guarantee of how much data PutReader will read.
<rogpeppe> niemeyer: though i suppose you might give it a reader already half-way through.
<rogpeppe> niemeyer: hmm, yeah
<rogpeppe> niemeyer: will band-aid in juju
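A rough sketch of the band-aid idea with invented names (not the goamz or juju-core code): retry the Put only when the reader happens to be an io.ReadSeeker, rewinding between attempts:

    package main

    import (
        "errors"
        "fmt"
        "io"
        "io/ioutil"
        "strings"
    )

    func putWithRetry(put func(io.Reader) error, r io.Reader, attempts int) error {
        seeker, canRewind := r.(io.ReadSeeker)
        var err error
        for i := 0; i < attempts; i++ {
            if err = put(r); err == nil {
                return nil
            }
            if !canRewind {
                return err // a plain Reader may be half-consumed; don't retry
            }
            if _, serr := seeker.Seek(0, 0); serr != nil { // 0, 0: rewind to start
                return serr
            }
        }
        return err
    }

    func main() {
        calls := 0
        put := func(r io.Reader) error {
            calls++
            if calls == 1 {
                return errors.New("transient: connection reset by peer")
            }
            _, err := io.Copy(ioutil.Discard, r)
            return err
        }
        err := putWithRetry(put, strings.NewReader("provider-state"), 3)
        fmt.Println(err) // <nil>: succeeded on the second attempt
    }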
<niemeyer> rogpeppe: Thanks.. I'm hoping it won't be so long until we fix this
<niemeyer> rogpeppe: The new API will look a lot better, and enable other things that are simply impossible ATM
 * rogpeppe lives in hope
<rogpeppe> niemeyer: great
<rogpeppe> niemeyer: it would be nice to see CopyObject too, i just wanted it today
<niemeyer> rogpeppe: Hah, yeah, I thought about that after your message
<niemeyer> rogpeppe: Actually, I should answer it
<rogpeppe> niemeyer: which message?
<niemeyer> Done
<niemeyer> rogpeppe: "A few issues"
<rogpeppe> niemeyer: ah yes!
<rogpeppe> niemeyer: that must've been what prompted me to go looking for it - i'd forgotten!
<niemeyer> rogpeppe: I thougth about the copying too, but scratched the reply because it's S3-specific
<niemeyer> rogpeppe: The suggestion I just gave isn't, though
<rogpeppe> niemeyer: we've already unified the binaries
<niemeyer> rogpeppe: So it's a single one?
<rogpeppe> niemeyer: yup
<niemeyer> rogpeppe: Brilliant!
<rogpeppe> niemeyer: still takes a minute to upload though
<niemeyer> rogpeppe: Wow.. really? I don't think it takes a minute even for me
<rogpeppe> niemeyer: shitty rural internet bandwidth
<niemeyer> rogpeppe: Ah, okay
<rogpeppe> niemeyer: the ADSL is very "A"
<niemeyer> :-)
<dimitern> fwereade_: reviewed
<dimitern> rogpeppe: https://codereview.appspot.com/8667043/?
<rogpeppe> dimitern: looking
<benji> rogpeppe: https://pastebin.canonical.com/88970/
<rogpeppe> dimitern: reviewed
<rogpeppe> benji: looking
<benji> thanks
<dimitern> rogpeppe: cheers
<rogpeppe> benji: ha, looks like a problem with dave cheney's new mongo db packaging branch
<rogpeppe> who here knows about dpkg issues?
<rogpeppe> mgz: ^
<rogpeppe> see line 308 and after on benji's paste above: https://pastebin.canonical.com/88970/
<mgz> looking
<rogpeppe> benji: is this in ec2?
<benji> yep
<rogpeppe> benji: this was with --upload-tools?
<benji> rogpeppe: yep; this is the command I ran: juju bootstrap --fake-series precise --upload-tools
<rogpeppe> benji: and you're running on quantal?
<mgz> probably borked in quantal
<mgz> I'd expect that to work in raring
<rogpeppe> mgz: that what i'm thinking
<rogpeppe> oh rats
<rogpeppe> back to the tarball we go
<mgz> I don't think I have a quantal vm up right now, but can spin one up quickly
<rogpeppe> benji: there's a (sigh) easy fix for you just to get things working. change the CurrentSeries function in the version package to just return "precise"
<mgz> has any of the backport work for mongo actually happened yet?
<rogpeppe> mgz: what backport work?
<benji> rogpeppe: ok, let me give that a try
<mgz> well, this is never going to work till we have mongo 2.2 with ssl in something usable by the pre-raring series
<rogpeppe> mgz: it works in precise, i think
<mgz> we have the ppa...
<mgz> I guess the issue is the boost versioning somehow
 * rogpeppe has never used boost but loves it anyway
<benji> rogpeppe: I get this error when bootstrapping with the hard-coded precise: error: cannot find tools: no compatible tools found
<benji> should I still be using --fake-series precise and --upload-tools
<rogpeppe> benji: what does juju version print?
<benji> rogpeppe: 1.9.14-quantal-amd64
<rogpeppe> benji: looks like you're not using the version you just changed
<benji> hmm, I did a "go build -a launchpad.net/juju-core/..." is there some other rebuild command I should have used?
<rogpeppe> benji: go install launchpad.net/juju-core/...
<rogpeppe> benji: go build just checks; it doesn't affect anything
<benji> running
<benji> ok, we have "1.9.14-precise-amd64" now, re-bootstrapping
<benji> rogpeppe: "juju bootstrap --fake-series precise --upload-tools" still gives me "error: cannot find tools: no compatible tools found"
<rogpeppe> benji: darn
<rogpeppe> benji: try adding --debug to that. what does it print?
<benji> rogpeppe: http://paste.ubuntu.com/5699072/
<rogpeppe> benji: line 6 is weird
<benji> yep
<rogpeppe> benji: could you paste your changed version of version.go, please?
<benji> rogpeppe: here's the diff http://paste.ubuntu.com/5699080/ Do you want the whole file too?
<rogpeppe> benji: no, that's fine
<rogpeppe> benji: erm
 * rogpeppe is slightly baffled
<rogpeppe> benji: ah!
<rogpeppe> benji: you need to change default-series in your environments.yaml too
<benji> rogpeppe: trying
<rogpeppe> fwereade_: looks like the upload-tools logic has been broken
<rogpeppe> fwereade_: upload-tools should override default-series, i think
<benji> it is looking better
<benji> rogpeppe: all working; much thanks for your help
<rogpeppe> benji: np. it illuminated a bug, so it's all for the best.
<rogpeppe> pwd
<dimitern> last CL for today, trivial: https://codereview.appspot.com/8674043/
<dimitern> niemeyer: you'll like that :) ^^
<niemeyer> dimitern: Woohay!
<dimitern> rogpeppe, fwereade_: any one of you wanna take a look as well? ^^
<rogpeppe> i just took a look at a single machine-0.log file
<rogpeppe> the environment had been booted for a couple of hours
<rogpeppe> it's 23MB
<rogpeppe> 148643 lines
<dimitern> rogpeppe: that's with debug on?
<rogpeppe> dimitern: yeah
<dimitern> rogpeppe: yeah.. not that surprising
<rogpeppe> dimitern: 4.3MB even with debug off though
<rogpeppe> dimitern: (53098 lines)
<rogpeppe> dimitern: that's not really that sustainable
<dimitern> rogpeppe: we need logrotate + zipping
<rogpeppe> dimitern: total time elapsed since log file started: 2h23m
<dimitern> rogpeppe: the logs are full of duplicated stuff, so should compress well
<rogpeppe> dimitern: with debug on, 1.2MB compressed; with debug off, 177K
<dimitern> rogpeppe: that's not that bad
<dimitern> rogpeppe: although... for 24h that's like 2MB compressed
<rogpeppe> dimitern: exactly
<dimitern> rogpeppe: we can implement some smart off loading to the bucket + logrotate and compression
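A toy stand-in for that logrotate-plus-compression idea, with an invented name and policy: gzip the current log to a sidecar file and truncate the original.

    package main

    import (
        "compress/gzip"
        "io"
        "os"
    )

    // rotate gzips the current log to path+".1.gz" and truncates the
    // original in place, so the agent can keep appending to the same file.
    func rotate(path string) error {
        in, err := os.Open(path)
        if err != nil {
            return err
        }
        defer in.Close()
        out, err := os.Create(path + ".1.gz")
        if err != nil {
            return err
        }
        defer out.Close()
        zw := gzip.NewWriter(out)
        if _, err := io.Copy(zw, in); err != nil {
            return err
        }
        if err := zw.Close(); err != nil {
            return err
        }
        return os.Truncate(path, 0)
    }

    func main() {
        _ = rotate("machine-0.log") // path illustrative
    }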
 * rogpeppe is glad he has a command line interface that can easily cope with a 23MB history
<dimitern> :)
<dimitern> rogpeppe: btw have a look at the log deuglifying CL when you have 2m
<rogpeppe> dimitern: reviewed
<dimitern> rogpeppe: the card said only do the main packages for now
<dimitern> rogpeppe: thanks
<rogpeppe> dimitern: i'm pretty sure it was intended to take JUJU out too
<dimitern> fwereade_: ping
<rogpeppe> dimitern: let me check
<rogpeppe> dimitern: from william's email:
<rogpeppe> > 1) Drop package badging from log calls in "main" packages 2) Drop
<rogpeppe> > the JUJU:... badging across the board
<dimitern> rogpeppe: ah, you're right the card says both, but in an earlier talk with fwereade_ he said not to do the JUJU stuff for now, IIRC
<dimitern> i'll leave it hanging for now then, until we resolve what to do exactly
<rogpeppe> dimitern: oh go on, please do :-)
<dimitern> rogpeppe: I want to, but not today :)
<rogpeppe> dimitern: ok then
<dimitern> rogpeppe: i have some meatballs and beer waiting
<rogpeppe> dimitern: time for one more? https://codereview.appspot.com/8545045
<dimitern> rogpeppe: will look in a bit
<rogpeppe> dimitern: hmm, the SetProvisioned logic has broken this environment
<dimitern> rogpeppe: oh?
<rogpeppe> dimitern: the machine agent is repeatedly doing this: http://paste.ubuntu.com/5699268/
<dimitern> rogpeppe: do you have more context where the instance was started?
<rogpeppe> dimitern: it was started just there
<dimitern> rogpeppe: oh, sorry, yes
<dimitern> rogpeppe: but how about the machiner log?
<rogpeppe> dimitern: here's the first stuff we hear from the provisioner: http://paste.ubuntu.com/5699287/
<dimitern> rogpeppe: what does status report?
<dimitern> rogpeppe: reviewed btw
<rogpeppe> dimitern: http://paste.ubuntu.com/5699292/
<dimitern> rogpeppe: yeah.. as expected
<rogpeppe> dimitern: there are two machines because when the first one failed, i removed the service and tried to remove the machine
<rogpeppe> dimitern: why can't it set the instance id when there's nothing reported by status for inst id ?
<dimitern> rogpeppe: really weird though.. I tested this both with tests and with live instances, several times - no problems
<rogpeppe> dimitern: how does it judge "already set"?
<dimitern> rogpeppe: 		Assert: append(isAliveDoc, notSetYet...),
<dimitern> rogpeppe: notSetYet := D{{"instanceid", ""}, {"nonce", ""}}
<fwereade_> dimitern, I did not intend to say we should keep the JUJU badging
<fwereade_> dimitern, nobody in the whole world likes the JUJU badging AFAIK ;)
<mgz> I believe the characters JUJU should appear whereever JUJU possible
<fwereade_> mgz, that's JUJU crazy talk
<dimitern> rogpeppe: are you sure the tools the env was bootstrapped with include all my latest CLs?
<fwereade_> dimitern, I would hope that each one of them would work in order ;p
<dimitern> rogpeppe: I can't see the agent-state being set in status, and where set it says "running", which is wrong, it should be started
<dimitern> rogpeppe: I removed that case
<rogpeppe> dimitern: i'm not sure
<dimitern> fwereade_: so is it good like this?
<rogpeppe> dimitern: but would that impact this bug?
<dimitern> rogpeppe: checking..
<rogpeppe> dimitern: i just retrieved the value of the Machine:
<rogpeppe> &state.Machine{st:(*state.State)(0xc200335d10), doc:state.machineDoc{Id:"1", Nonce:"", Series:"precise", InstanceId:"", Principals:[]string{}, Life:1, Tools:(*state.Tools)(nil), TxnRevno:4, Jobs:[]state.MachineJob{1}, PasswordHash:""}, annotator:state.annotator{globalKey:"m#1", tag:"machine-1", st:(*state.State)(0xc200335d10)}}
<fwereade_> rogpeppe, it's not possible the cli tools have a funny version, is it?
<fwereade_> dimitern, we need to lose the JUJU badging
<fwereade_> dimitern, I'm sorry, I clearly hideously miscommunicated
<dimitern> fwereade_: ok, so I leave it WIP for now and finish it tomorrow
<fwereade_> dimitern, sgtm
<fwereade_> dimitern, wipped
<dimitern> rogpeppe: this confirms it - it's using tools from before the nonce was generated
<dimitern> rogpeppe: probably even before startinstance was respecting the passed nonce
<rogpeppe> dimitern: but...
<rogpeppe> dimitern: i just tried calling SetProvisioned directly from my client connection
<rogpeppe> dimitern: and it failed saying "already set"
<rogpeppe> dimitern: even though InstanceId and Nonce are both empty
<dimitern> rogpeppe: there was a lurking bug in there, initially, which I fixed afterwards
<rogpeppe> dimitern: i still see that issue with the latest version of trunk
<rogpeppe> dimitern: that is, this code: http://paste.ubuntu.com/5699342/
<dimitern> rogpeppe: ok then, that's good, because I don't
<rogpeppe> dimitern: produces this output:
<rogpeppe> &state.Machine{st:(*state.State)(0xc20032cdc0), doc:state.machineDoc{Id:"2", Nonce:"", Series:"precise", InstanceId:"", Principals:[]string{"buildbot-master/1"}, Life:0, Tools:(*state.Tools)(nil), TxnRevno:2, Jobs:[]state.MachineJob{1}, PasswordHash:""}, annotator:state.annotator{globalKey:"m#2", tag:"machine-2", st:(*state.State)(0xc20032cdc0)}}
<rogpeppe> 2013/04/11 19:33:57 set prov: cannot set instance id of machine "2": already set
<rogpeppe> dimitern: i'm not sure how that could happen, regardless of what's out there in the cloud
<dimitern> rogpeppe: file a bug then please, I'll dig into it tomorrow, if I can reproduce it
<rogpeppe> dimitern: i have the environment online now...
<rogpeppe> dimitern: i can leave it until the morning if you like
<dimitern> rogpeppe: can I access it?
<dimitern> rogpeppe: so I can debug the code in place?
<rogpeppe> dimitern: hmm, let me think
<dimitern> rogpeppe: actually, can you try adding some logging into SetProvisioned
<rogpeppe> dimitern: sure
<rogpeppe> dimitern: i just sent you a PM on canonical IRC
<dimitern> rogpeppe: log the exact error on Run
<dimitern> rogpeppe: ok, I'll try it now
<rogpeppe> dimitern: it must be ErrAborted
<dimitern> rogpeppe: ok, let's think aloud
<dimitern> rogpeppe: indeed it has to be, otherwise it'll be caught and reported earlier
<rogpeppe> dimitern: yup
<dimitern> rogpeppe: this means either assert failed, and since we're checking for alive before that, it has to be the other assert, right?
<dimitern> rogpeppe: no other case that I can see, AFAIU state/mgo transactions
<rogpeppe> dimitern: i assume the composition for AND conjunction works, but i don't *know*
<dimitern> niemeyer: ping
<niemeyer> dimitern: Heya
<dimitern> niemeyer: hey, can you please take a look at this code: http://paste.ubuntu.com/5699363/
<rogpeppe> dimitern: i've gotta go in a few moments
<niemeyer> dimitern: Sure.. what should I be looking for?
<dimitern> niemeyer: and reading a bit further up the log, tell me if i'm correct
<niemeyer> dimitern: Can you be a bit more specific?
<dimitern> niemeyer: so we're seeing "already set" error being reported from this method
<dimitern> niemeyer: and in state both instanceid and nonce are empty for that machine
<rogpeppe> dimitern: ok, so without the asserts it did succeed.
<dimitern> niemeyer: so the asserts should work fine
<dimitern> niemeyer: but somehow it aborts - can it abort for something other than a failed assert?
<niemeyer> dimitern: No
<rogpeppe> dimitern: ah...
<niemeyer> dimitern: Are you sure these values are present and empty?
<rogpeppe> dimitern: i think i know what's going on
<dimitern> rogpeppe: without both or without only notSetYet?
<rogpeppe> niemeyer: that's the issue
<niemeyer> rogpeppe: Cool, bingo
<niemeyer> dimitern: "not set" != "empty"
<dimitern> niemeyer: rogpeppe connected to the state server and extracted the machine: &state.Machine{st:(*state.State)(0xc20032cdc0), doc:state.machineDoc{Id:"2", Nonce:"", Series:"precise", InstanceId:"", Principals:[]string{"buildbot-master/1"}, Life:0, Tools:(*state.Tools)(nil), TxnRevno:2, Jobs:[]state.MachineJob{1}, PasswordHash:""}, annotator:state.annotator{globalKey:"m#2", tag:"machine-2", st:(*state.State)(0xc20032cdc0)}}
<dimitern> what?
<rogpeppe> dimitern: so it is a compat issue after all
<rogpeppe> dimitern: i haveta go
<rogpeppe> see you tomorrow
<dimitern> rogpeppe: see you
<dimitern> niemeyer: can you explain please, because i didn't get it
<dimitern> niemeyer: the doc is there, so they should be set to empty, right?
<niemeyer> dimitern: MongoDB documents may have an empty field ({"nonce": ""}), and it may also have a non-existent field ({}). Those aren't the same thing.
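Illustratively, in bson terms (this is not the eventual juju-core fix): the assert above only matches when the fields are present and equal to "", so a document created without them aborts the transaction; matching "empty or absent" would need to accept null too, since a Mongo query for null also matches missing fields.

    package main

    import (
        "fmt"

        "labix.org/v2/mgo/bson"
    )

    func main() {
        // Matches only documents where the fields exist and equal "".
        notSetYet := bson.D{{"instanceid", ""}, {"nonce", ""}}

        // Matches documents where nonce is "", null, or missing entirely.
        notSetOrMissing := bson.D{{"nonce", bson.D{{"$in", []interface{}{"", nil}}}}}

        fmt.Println(notSetYet, notSetOrMissing)
    }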
<dimitern> niemeyer: is mongo ignoring empty string fields when you insert a doc?
<niemeyer> dimitern: No
<niemeyer> dimitern: But you can do that in several ways from code
<dimitern> niemeyer: because the code that adds the machine is ..
<dimitern> niemeyer: state.addMachine, which inserts a machine doc, setting only Id and Life
<dimitern> niemeyer: others, by the virtue of being uninitialized string fields, should be set to empty string, no?
<dimitern> niemeyer: no, actually the code is like this: http://paste.ubuntu.com/5699387/ and they should be set explicitly
<niemeyer> dimitern: Just load the document from the database before running code that is failing
<niemeyer> dimitern: and print it
<niemeyer> dimitern: Into a map
<dimitern> niemeyer: good idea, but how?
<niemeyer> var m map[string]interface{}
<niemeyer> err := collection.FindId(id).One(&m)
<dimitern> niemeyer: no, I mean I just do st.machines.FindId(m.Id()).One(&map) ?
<niemeyer> if err != nil { return err }
<dimitern> ah, ok, 10x
<niemeyer> dimitern: I think that's what I've just said, yeah :)
<dimitern> niemeyer: indeed, thanks for the help
<niemeyer> dimitern: np, let me know what it shows
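The recipe niemeyer sketches above, consolidated into one hypothetical helper (collection and id are placeholders): loading the raw document into a map shows which fields actually exist, rather than what the struct's zero values suggest.

    package main

    import (
        "fmt"

        "labix.org/v2/mgo"
    )

    // dumpDoc prints the raw machine document as a map, so absent fields
    // are visibly absent instead of decoding to "".
    func dumpDoc(machines *mgo.Collection, id string) error {
        var m map[string]interface{}
        if err := machines.FindId(id).One(&m); err != nil {
            return err
        }
        fmt.Printf("%#v\n", m)
        return nil
    }

    func main() {}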
<dimitern> niemeyer: unfortunately, I cannot access it, I tried, but it's rogpeppe's environment and despite his comment that the aws keys shouldn't matter, they do - i cannot use mine
<niemeyer> dimitern: Well, they don't matter much at least
<niemeyer> dimitern: Do you have ssh access to it?
<dimitern> niemeyer: ah, let me try that
<dimitern> niemeyer: same problem - perm denied
<niemeyer> dimitern: Sure, but you have the whole data at your hand
<dimitern> niemeyer: it's not a shared account or anything
<niemeyer> dimitern: Just connect to the database with mongo and do the same query
<niemeyer> dimitern: mongo localhost:<whatever port>
<niemeyer> dimitern: use juju
<dimitern> niemeyer: I can't access the mongo there in rog's environment
<niemeyer> dimitern: db.machines.find({_id: <the id>})
<niemeyer> dimitern: Hmm.. why?
<dimitern> niemeyer: ssh is not working (my key is different)
<niemeyer> dimitern: Oh, okay.. huh
<niemeyer> dimitern: How come Roger assumed you could access it?
<dimitern> niemeyer: :) probably he's tired
<niemeyer> dimitern: It's a bit of a weird idea if you have no keys whatsoever :-)
<dimitern> exactly :)
<dimitern> anyway, I'm tired too, so have a good evening all!
<niemeyer> dimitern: Either way.. Roger said "that's the issue"
<niemeyer> dimitern: So I assume he checked it
<dimitern> niemeyer: yeah, hope he remembers :)
<dimitern> niemeyer: thanks again, if the issue is reproducible tomorrow, I'll try what you suggested
<niemeyer> dimitern: indeed :)
<niemeyer> dimitern: np.. I'm pretty sure it's an issue with the document
<niemeyer> dimitern: the code path for such a trivial assertion was exercised enough, I'd hope
<dimitern> niemeyer: yeah, mongo keeps surprising me here and there
<niemeyer> dimitern: What kind of surprise did you have so far?
<dimitern> niemeyer: syntax mostly - it's not always trivial to translate from mongo docs into D{{}} things
<niemeyer> dimitern: Hmm
<niemeyer> dimitern: It's actually 1-to-1.. !?
<dimitern> niemeyer: probably, but haven't got the hang of it yet - still try to find similar examples in the code and adapt
<niemeyer> dimitern: It's really 1-to-1
<dimitern> niemeyer: the nested {{}} and sometimes []D{{}} are not helping :) but i'm learning
<niemeyer> dimitern: yeah, the visuals may get confusing
<niemeyer> dimitern: Note that this is an optimization
<niemeyer> dimitern: For non-important code paths, you can use maps, which look a lot better
<dimitern> niemeyer: how?
<niemeyer> dimitern: {"foo": "bar"} in the mongo shell is M{"foo": "bar"}
<niemeyer> dimitern: So the overhead is a single char :)
<dimitern> niemeyer: and M is map[string]interface{} ?
<niemeyer> dimitern: assuming M is bson.M or your own map[string]interface{}
<niemeyer> dimitern: yeah
<niemeyer> dimitern: You can define your local type whenever you feel like it
<dimitern> niemeyer: this sheds more light on it, actually
<niemeyer> dimitern: type m map[string]interface{}.. m{"foo": "bar"}
<dimitern> niemeyer: yeah, i did that in some places, esp. nested maps like config attrs
<niemeyer> dimitern: there's zero support for the bson.M type, specifically
<niemeyer> dimitern: It's just a map
<dimitern> niemeyer: it's not bad, it's just confusing at first to see the go equivalent and parse it visually
<dimitern> niemeyer: but I agree it's the shortest workaround possible in go probably
<niemeyer> dimitern: Right.. we need at least one char there
<niemeyer> dimitern: Which doesn't feel so bad :)
<dimitern> niemeyer: indeed
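The same query both ways, as a sketch (field names invented): bson.D preserves key order, which some server operations care about; bson.M reads closer to the mongo shell at the cost of that one extra character.

    package main

    import (
        "fmt"

        "labix.org/v2/mgo/bson"
    )

    func main() {
        // The shell query {life: 0, series: "precise"} in both forms.
        queryD := bson.D{{"life", 0}, {"series", "precise"}}
        queryM := bson.M{"life": 0, "series": "precise"}
        fmt.Println(queryD, queryM)
    }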
<thumper> morning
<mgz> hey thumper
<fwereade_> thumper, heyhey
<thumper> hi fwereade_
<fwereade_> thumper, can you think of any reason to upgrade juju with --upload-tools *without* bumping the build number?
<thumper> fwereade_: I've not looked into it too deeply, but my first thought was, no, bumping the build number sounds essential with upload-tools
<thumper> the only time you wouldn't
<thumper> is if you have updated the version number yourself since your last upload
<fwereade_> thumper, *and* there are no tools in the bucket with a matching m.m.p that need to be superseded
<fwereade_> thumper, cool, thanks
<thumper> np
<davecheney> mramm: ping
#juju-dev 2013-04-12
<thumper> fwereade_: around?
<fwereade_> thumper, yeah, kinda
<thumper> fwereade_: got time for a 5 min chat?
<thumper> fwereade_: got a few quick questions
<fwereade_> thumper, sure, would you start it please?
<thumper> yep
<thumper> fwereade_: https://plus.google.com/hangouts/_/096d99c30b8876a0dec1471a202468258e010074?authuser=0&hl=en
<fwereade_> thumper, hmm: type StringSet map[string]bool
<fwereade_> thumper, then you can just return StringSet{}
<thumper> that doesn't give enough type hiding IMO
<thumper> it exposes the map functions that I don't want there
<fwereade_> thumper, fair enough
<fwereade_> thumper, just a thought :)
<thumper> thanks
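A sketch of the trade-off, with assumed names: the bare map type still exposes indexing, delete, and range, while wrapping the map in a struct hides those behind methods.

    package main

    import "fmt"

    // A bare map type: s["x"] = true and delete(s, "x") still work on it.
    type StringSet map[string]bool

    // Wrapping the map in a struct exposes only the methods you define.
    type Strings struct{ values map[string]bool }

    func NewStrings(initial ...string) Strings {
        s := Strings{values: make(map[string]bool)}
        for _, v := range initial {
            s.values[v] = true
        }
        return s
    }

    func (s Strings) Add(v string)           { s.values[v] = true }
    func (s Strings) Contains(v string) bool { return s.values[v] }

    func main() {
        s := NewStrings("precise", "quantal")
        s.Add("raring")
        fmt.Println(s.Contains("precise")) // true
    }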
 * thumper afk
<mramm> davecheney: just got back
<bigjools> trying to keep up with core changes in the maas branch is hard - what do we need to fix errors about missing MongoURL on machine config?
<jtv> Hi there - anyone else getting this test failure for a nonexistent function, environs.Setenv()?
<jtv> environs/tools_test.go:344: undefined: environs.Setenv
<thumper> I know what that is...
<thumper> but I don't get that error
<thumper> Setenv is "exported" in export_test.go (I think)
<bigjools> thumper: I get it too FWIW
<thumper> hmm... I haven't tested with latest trunk
<thumper> just yesterday :)
<thumper> environs/export_test.go
<thumper> var Setenv = setenv
<jtv> Yes, the theory works.  Having some trouble with the practice.
 * thumper had tests pass
 * thumper pulls latest
<thumper> and retries
<thumper> ok, it passes for me on raring with tip of trunk
<thumper> jtv: are you sure you are using tip? and have no changes?
<jtv> I'm not sure of anything, really.  I'll see if I can run it in another configuration.
<jtv> No change so far...
<jtv> Ah!  Got it.
<jtv> It passes when I run "go test launchpad.net/juju-core/environs"
<jtv> ...not when I run "go test ./environs"
<jtv> So it must be the softlinking again.
<bigjools> !
<jtv> One test failure because of that difference in all of environs nicely hits the bitter spot: not enough to make you think there's a systemic problem with the way you're running the tests, but enough to make you stop and look for problems in the actual code.
<bigjools> jtv: what do you mean by softlinking?
<jtv> My $GOPATH/src/launchpad.net/juju-core is softlinked to a branch.
<davecheney> jtv: that'll screw you every time
<jtv> Yes, especially with the feature branch now.  Can't really afford to colocate such different worlds.
<bigjools> jtv: it's not that then, I don't do that
<bigjools> jtv: I am using native colo, works very well
<arosales> davecheney, let me know if you still hit the 20 compute limit on hp cloud
<davecheney> arosales: just saw your mail
<davecheney> will try again
<arosales> davecheney, the 200 limit bump should apply to the juju-scale-test project
<arosales> but as I learned things can be different than expected :-)
<davecheney> third time's a charm
<thumper> davecheney: []string(nil) is a nil slice? whereas []string{} is an empty slice?
<davecheney> yes
<thumper> hmm...
<thumper> ok, my tests pass now
<thumper> if you take params(values... string)
<thumper> and don't pass anything in
<thumper> you get a nil slice, not an empty slice
<davecheney> yes
<davecheney> normally this doesn't matter
<davecheney> but gocheck cares
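A runnable illustration of the distinction (params mirrors the variadic signature mentioned above):

    package main

    import (
        "fmt"
        "reflect"
    )

    func params(values ...string) []string { return values }

    func main() {
        a := params()   // no arguments: a is a nil slice
        b := []string{} // an empty but non-nil slice
        fmt.Println(a == nil, b == nil)      // true false
        fmt.Println(len(a), len(b))          // 0 0: equal for most purposes
        fmt.Println(reflect.DeepEqual(a, b)) // false: a DeepEqual-style
        // comparison, as gocheck's DeepEquals uses, tells them apart
    }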
<arosales> davecheney, :-)
<davecheney> m_3: ping
 * thumper proposes the hacking for the day
 * davecheney listens
<davecheney> thumper: i'll LGTM your proposal if you rename trivial to utils
<davecheney> or util
<davecheney> even
<thumper> I didn't really want to do that until we agreed on a name :-|
<thumper> but I do want to rename trivial
<davecheney> that's my deal, take it or leave it
<thumper> or kill it and move the content
<thumper> :)
<thumper> I can wait
<davecheney> i'll also accept that
<thumper> I'll propose another branch first thing monday to move / change the trivial package to util(s)
<thumper> davecheney: I wanted a set to deal with the bootstrap fake series
<thumper> which is now the top of the pipeline
<davecheney> what is the CL again ?
<thumper> CL?
<thumper> for what?
<thumper> it seems you can't add a prereq for something already proposed
<davecheney> thumper: nope
<thumper> oh well, friday calls, and warsaw's second law says time to stop :)
<davecheney> fucking prereqs
<thumper> davecheney: have a good weekend
<davecheney> you too mate
<rogpeppe> davecheney: hiya
<rogpeppe> davecheney: i don't want to rename trivial to utils. we deliberately decided against utils as a name.
<davecheney> i understand
<davecheney> i still side with tim
<davecheney> the only thing worse than having a utils package, is having two of them
<rogpeppe> davecheney: i'm ok with a utils *directory*
<rogpeppe> davecheney: which is what he's actually done
<rogpeppe> davecheney: so utils/trivial, utils/cloudinit, etc
<davecheney> that is gold plating it
<rogpeppe> davecheney: i don't think so
<rogpeppe> davecheney: the problem with a "utils" package is it becomes a huge grab-bag of random stuff
<rogpeppe> davecheney: but that's not a problem if each thing is in its own package, and we just use the "utils" name to keep the root name space uncluttered
<rogpeppe> davecheney: (see my recent comments on https://codereview.appspot.com/8672044/)
<rogpeppe> davecheney: i feel that encourages not one utils package, but a forest of them
<rogpeppe> davecheney: i think that's ok - each package can be small and simple and targeted to what it needs to do. i suggest reserving the name for leaf dependencies which could be as well implemented outside juju-core.
<rogpeppe> davecheney: so nothing that imports juju-core/log, for example
<davecheney> sure
<davecheney> we can have different opinions here
<rogpeppe> davecheney: ping
<m_3> davecheney: pong
<rogpeppe> m_3: are you around for a very brief chat?
<rogpeppe> m_3: i want to knock up a minimal version of the juju-wait tool and want to make sure i'm doing the right thing first
<m_3> rogpeppe: really can't atm... about to board a plane
<rogpeppe> m_3: :-) ok!
<m_3> yikes... lemme see
<rogpeppe> m_3: short story: juju-wait (machine|unit)
<rogpeppe> m_3: waits for machine or unit to come into a stable state
<m_3> yeah, sorry man... that's just too ambitious atm... I'm trying to see if we're past our account limit on hp in the few minutes before the flight
<rogpeppe> m_3: exits with non-zero if it ended in an error state
<rogpeppe> m_3: okeydokey
<m_3> stable is one of "started, start-error, install-error" or the equiv?
<rogpeppe> m_3: yup
<rogpeppe> m_3: brings you to a state where you can query status and find out what's actually gone on.
<m_3> rogpeppe: and then it'd be nice to actually get the error state easily from there
<m_3> rogpeppe: ack right
<rogpeppe> m_3: ok, i'll print the error state
<rogpeppe> m_3: or perhaps i'll always print the final state
<m_3> we can filter status to get the state of the thing we're waiting on, but filters would be cool again at some point
<m_3> it'd be great to also get the final state
<m_3> but shell exit codes are important
<rogpeppe> m_3: agreed. i'm wanting this to take less than an hour though :-)
<m_3> haha
<m_3> nice
<m_3> ok, yeah, and we can totally refine over time
<m_3> just being able to wait for a set of states to be reached
<m_3> that'll rock
<rogpeppe> m_3: juju-wait machine [state....] ?
<rogpeppe> m_3: waits for any of the given states to be reached?
<m_3> by service instead?
<rogpeppe> m_3: ah...
<rogpeppe> m_3: that's more interesting. wait for *all* units in the service to reach the given state?
<m_3> juju wait mysql "started | start-error"
<m_3> or perhaps just a unit
<rogpeppe> m_3: just a unit has more obvious semantics
<m_3> that's all that's needed right now... `juju wait mysql/0 --state=...`
<m_3> something along those lines
<rogpeppe> m_3: juju wait mysql/0 'started|.*error' ?
<rogpeppe> m_3: i.e. a regexp for the acceptable stages
<rogpeppe> states
<m_3> davecheney: I started 30 instances from ubuntu@15.185.162.247... might not have connectivity to wait them out and shut them down... please check on it in an hour or so if you can
<m_3> davecheney: shut em down either way... then you can play if the numbers are up
<m_3> rogpeppe: dude, that syntax would rock
<rogpeppe> m_3: cool. that's also more trivial to do.
<m_3> rogpeppe: that'd be great... probably all we'd need
<rogpeppe> m_3: wonderful, just what i want to hear
<rogpeppe> m_3: just units for the time being, no machines or other things?
<m_3> rogpeppe: right
<rogpeppe> m_3: great
<rogpeppe> m_3: thanks - that's very helpful!
<m_3> rogpeppe: prob not worth the time to work up any other logical expressions of state ( ! & | )
<m_3> rogpeppe: regexps'll be awesome
<rogpeppe> m_3: juju wait -v mysql/0 '.*error'
<rogpeppe> m_3: wait for anything *but* the given regexp
<rogpeppe> m_3: analogous to grep -v
<m_3> yup... but really we can add that later
<rogpeppe> m_3: ok, cool
<m_3> rogpeppe: thanks!
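A toy sketch of that matching rule, anchoring the regexp so it must match the whole status string (names invented; this is not the juju-wait CL):

    package main

    import (
        "fmt"
        "regexp"
    )

    // matches reports whether the status satisfies the user-supplied
    // pattern, treated as a regexp over the entire status string.
    func matches(pattern, status string) (bool, error) {
        return regexp.MatchString("^(?:"+pattern+")$", status)
    }

    func main() {
        for _, s := range []string{"started", "install-error", "pending"} {
            ok, _ := matches("started|.*error", s)
            fmt.Println(s, ok) // started true, install-error true, pending false
        }
    }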
<davecheney> m_3: hey
<davecheney> i was going to ask you about building that mongodb package
<davecheney> but I figured it out from your instructions
<dimitern> rogpeppe, fwereade_: looking for a critical eye on the log deuglifying CL - now with JUJU: and prefixes removed - https://codereview.appspot.com/8674043/
<rogpeppe> dimitern: looking
<rogpeppe> dimitern: reviewed
<dimitern> rogpeppe: cheers
<fwereade_> dimitern, please drop all the JUJU: crap
<fwereade_> dimitern, not just JUJU
<dimitern> fwereade_: I did, didn't I?
<fwereade_> dimitern, ERROR jujutest blah command failed: BAM!
<fwereade_> dimitern, ERROR command failed: BAM!
<dimitern> fwereade_: so rogpeppe is suggesting to drop the "command" from there, do you agree?
<dimitern> fwereade_: but leave the command name
<fwereade_> dimitern, not necessarily -- IMO that's out of scope
<fwereade_> dimitern, no, drop the command name too please
<fwereade_> dimitern, none of the JUJU:... stuff needs to exist
<dimitern> rogpeppe: agreed?
<fwereade_> dimitern, rogpeppe: "command failed: %v" is a useful final thing to note, right?
<dimitern> fwereade_: well, which command was that?
<fwereade_> dimitern, the one you just ran
<rogpeppe> fwereade_: i suppose so, although the command will print that anyway. hmm, but maybe not to the log file
<fwereade_> dimitern, I am not interested in catering to goldfishes ;p
<rogpeppe> fwereade_: it was just the redundancy of "jujutest foo" and "command"
<rogpeppe> fwereade_: FWIW i think that the final message printed by a failing juju command *should* be prefixed by the name of the command
<rogpeppe> fwereade_: but not necessarily the log message
<fwereade_> rogpeppe, and honestly I am not *very* interested in logging the CLI stuff -- I'd hope we'd have an Infof in supercommand, maybe, saying what we're running, and leave it at that
<dimitern> ok, I'll drop both the "command" and the command name from the log message
<fwereade_> rogpeppe, doesn't sound unreasonable
<rogpeppe> fwereade_: yeah. the logging stuff is interesting from a CLI p.o.v. because it's what gets printed for --verbose/debug
<fwereade_> dimitern, please just stick to the task as given ;p
<rogpeppe> dimitern: if you drop the command name, you can leave "command"
<fwereade_> rogpeppe, yeh, I see the job of log in a CLI context as feedback not logging IYSWIM
<fwereade_> rogpeppe, dimitern: messing with "command failed" is out of scope
<fwereade_> rogpeppe, dimitern: we discussed this on the lists and crafted it carefully so as not to piss away too much time
<rogpeppe> fwereade_: +1
<dimitern> fwereade_, rogpeppe: so finally, leave just "ERROR command failed: xyz" instead of "ERROR jujutest blah command failed: xyz" ?
<rogpeppe> dimitern: yup
<dimitern> ok, submitting with that change then
<dimitern> fwereade_: I need a crash course talk for the constraints implementation for openstack :)
 * TheMue happily discovered that the endpoints in relation keys are sorted, so the tests are not only "green by accident".
<dimitern> rogpeppe: what happened yesterday with the SetProvisioned() issue? you said you found the problem, but didn't share
<rogpeppe> dimitern: i was running into the incompatible tools issue
<rogpeppe> dimitern: i was using an older client with uploaded tools
<dimitern> rogpeppe: so your setup was flawed, not the code?
<rogpeppe> dimitern: so it was creating the Machine in mongo without the new fields
<rogpeppe> dimitern: so the assertion failed
<rogpeppe> dimitern: yes, but...
<dimitern> rogpeppe: right! I was beginning to get concerned
<rogpeppe> dimitern: it was good because it made me aware of a new and quite subtle potential compatibility issue to watch out for
<dimitern> rogpeppe: incompatible schema? it's not new i think
<rogpeppe> fwereade_: have you seen this bug, BTW? https://bugs.launchpad.net/juju-core/+bug/1168154
<rogpeppe> dimitern: it's *how* it's incompatible that's interesting
<rogpeppe> dimitern: i had assumed that we could add fields without necessarily being incompatible
<dimitern> rogpeppe: ah, right :)
<rogpeppe> dimitern: and that can be true, but not if we assume that they start life as set.
<dimitern> rogpeppe: this opens a potentially vast can of worms in the code, where we need to care about fields not being there, when they should be, all kinds of failing asserts..
<dimitern> rogpeppe: and i suppose that was the reason for the log file being so huge after 2h?
<rogpeppe> dimitern: i think it's only a problem when we're asserting that something is an empty string
<rogpeppe> dimitern: yeah, the provisioner was constantly falling over and being restarted
<dimitern> rogpeppe: how about asserting on non-empty field, which isn't there?
<rogpeppe> dimitern: i don't think that's a problem
<dimitern> rogpeppe: cool
<rogpeppe> dimitern: the assertion will fail as it did in our example yesterday
<dimitern> rogpeppe: but it's essentially the same failure as if the field was there, but with a different value
<rogpeppe> dimitern: yes, which is fine. if we assert that field=="value" then we don't care about the difference between field=="" and field==nil.
<rogpeppe> fwereade_: you might've missed my comment: have you seen this bug? https://bugs.launchpad.net/juju-core/+bug/1168154
<rogpeppe> fwereade_: it's a bit concerning, but may be trivial to fix
<fwereade_> rogpeppe, responded, I'm not certain there's a juju problem there so much as a charm problem
<dimitern> fwereade_: and how about a quick chat about openstack constraints?
<rogpeppe> fwereade_: ah, interesting. what charm hooks need to run before the unit can be destroyed?
<fwereade_> rogpeppe, broadly speaking, once it starts it expects to pass through install, config-changed, start before stop and suicide
<rogpeppe> fwereade_: ah, i'd forgotten there was a stop hook
<fwereade_> dimitern, sure
<fwereade_> dimitern, would you start a hangout please?
<rogpeppe> fwereade_: yeah, i think that if install or start fail, we should allow unit removal without resolving the error.
<rogpeppe> fwereade_: i wonder if actually we should allow the stop hook to run regardless.
<dimitern> fwereade_: https://plus.google.com/hangouts/_/143d8b05982bc269466e8bb2402d68e8d0018523?authuser=0&hl=en
<rogpeppe> fwereade_: the reason for the charm failing is that it tried to do "apt-get install -y --force-yes python-shell-toolbox" which failed (error: Unable to locate package python-shell-toolbox)
<fwereade_> rogpeppe, sorry, could have sworn I sent a reply
<rogpeppe> fwereade_: a reply to the bug? you did.
<fwereade_> rogpeppe, that hook failure sounds like a charm problem rather than a juju problem
<rogpeppe> fwereade_: yup
<fwereade_> rogpeppe, ok, so I'm not too bothered there
<rogpeppe> fwereade_: but i think it should be possible to destroy a failed unit
<fwereade_> rogpeppe, unless we're deploying it on the wrong series or something and killing it that way?
<rogpeppe> fwereade_: because calling "resolved" on it might lead to worse problems
<fwereade_> rogpeppe, like what?
<rogpeppe> fwereade_: like subsequent hooks fail or behave badly because the previous hook has failed but been "resolved" without any actual resolution
<fwereade_> rogpeppe, then if you must you can painstakingly `resolved` your way through it to destruction
<rogpeppe> fwereade_: also, if i call destroy on a unit that's in an error state, i think it's reasonable to assume that i want to kill it, not let it carry on.
<rogpeppe> fwereade_: yeah, i think that's wrong - lots of work for no gain.
<fwereade_> rogpeppe, STRONGLY disagree
<fwereade_> rogpeppe, you might want to kill it but you still have to resolve it and walk it through an orderly shutdown
<fwereade_> rogpeppe, fancy features for skipping that need careful thought
<rogpeppe> fwereade_: this is particularly true for install/start hook failures
<rogpeppe> fwereade_: i don't mind too much about later failures
<fwereade_> rogpeppe, so long as there's a clear boundary I'm ok extending the window
<rogpeppe> fwereade_: if a charm has failed to install, i want to be able to blow it away, and i can't see any particular reason why we shouldn't allow that
<fwereade_> rogpeppe, but not now
<fwereade_> rogpeppe, there is a path to resolution
<fwereade_> rogpeppe, it may suck but it exists
<rogpeppe> fwereade_: fair enough
 * fwereade_ bbiab
<rvba> fwereade_: we've fixed most of the problems you pointed out... we're landing the MAAS provider feature branch right now.
<jtv> All the important ones are fixed.
<jtv> fwereade_: feature branch has landed...  see the MP for another note, on your comment #2 - if it's weird, it's EC2's kind of weirdness so I don't think it needs changing right now.
 * dimitern bbiab
 * TheMue loves merging conflicts *gnarf*
<fwereade_> rogpeppe, ping
<rogpeppe> fwereade_: pong
<fwereade_>             assertNothingHappens(c, upgraderDone)
<fwereade_> rogpeppe, in jujud/upgrader_test.go
<rogpeppe> fwereade_: i'm just fixing that exact code
<rogpeppe> fwereade_: it's crack
<fwereade_> rogpeppe, fucking tell me about it
<rogpeppe> fwereade_: it's nearly done
<rogpeppe> fwereade_: it was from before i knew that interlinked table-driven tests were an anti-pattern
<fwereade_> rogpeppe, I am a little surprised that you are doing it given that I said I would, though
<rogpeppe> fwereade_: i said i'd fix the upgrader tests
<fwereade> rogpeppe, I said, not sure if you saw: I am a little surprised that you are doing it given that I said I would, though
<rogpeppe> [12:05:56] <rogpeppe> fwereade_: i said i'd fix the upgrader tests
<rogpeppe> fwereade: to be compatible with both old and new dev-versions
<rogpeppe> fwereade: and you said "yes please" AFAIR
<rogpeppe> fwereade: 'cos i knew they'd be a hassle
<fwereade> rogpeppe, as I recall you said they needed to be fixed, and I said I knew, and that they fit well with what I was currently doing
<rogpeppe> fwereade: oh, crossed wires, sorry
<rogpeppe> fwereade: how far down the road are you?
<rogpeppe> fwereade: i've unfucked the tests in principle; i just need to get them to pass now.
<fwereade> rogpeppe, I have spent about 12 hours straight unfucking upgrades
<fwereade> rogpeppe, and I have a green build
<rogpeppe> fwereade: oh, ok, cool./
<rogpeppe> fwereade: i've only spent the last 2.
<fwereade> rogpeppe, and I am on a sanity pass through
<rogpeppe> fwereade: assertNothingHappens could never be triggered
<rogpeppe> fwereade: because the channel never gets sent to
<rogpeppe> fwereade: i think it probably did in an earlier incarnation which changed.
<fwereade> rogpeppe, however I did just spot that that can't possibly work, yeah
<rogpeppe> fwereade: i changed it to: http://paste.ubuntu.com/5701129/
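A plausible shape for that fix (the paste itself isn't reproduced here), assuming gocheck and an arbitrary 100ms window:

    package upgrader_test // package name assumed

    import (
        "time"

        gc "launchpad.net/gocheck"
    )

    // assertNothingHappens fails if the upgrader finishes within a short
    // window, instead of selecting on a channel that is never sent to.
    func assertNothingHappens(c *gc.C, upgraderDone <-chan error) {
        select {
        case err := <-upgraderDone:
            c.Fatalf("upgrader finished unexpectedly: %v", err)
        case <-time.After(100 * time.Millisecond):
        }
    }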
<fwereade> *yoink*
<rogpeppe> fwereade: good word :-)
<dimitern> mramm: ping
<TheMue> *hmpf* test fails after merge, but sure it will be easy
<TheMue> Bingo
<TheMue> fwereade: could you pls take a look at https://codereview.appspot.com/8705043. it's still wip, i will add more test scenarios (multiple relations, peer relations), but i would like a feedback about the current approach by you
<fwereade> TheMue, will do shortly
 * TheMue just has been called for lunch, bbiab
<TheMue> fwereade: great, thx
<dimitern> the trunk is now broken
<dimitern> environs/maas/environ.go:7:2: import "launchpad.net/gomaasapi": cannot find package
<TheMue> dimitern: did you go get it?
<rvba> dimitern: we just merged the MAAS provider.  You need to get the lib gomaasapi.
<dimitern> TheMue: I tried go get launchpad.net/gomaasapi/...
<dimitern> and it reported: # launchpad.net/gomaasapi/example
<dimitern> ../gomaasapi/example/live_example.go:45: not enough arguments in call to gomaasapi.NewAuthenticatedClient
<dimitern> go get launchpad.net/gomaasapi (w/o /... works) though
<TheMue> dimitern: ouch, so gomaasapi is broken and the trunk fails as a result of this
<rvba> dimitern: let me look into itâ¦
<dimitern> rvba: now running go build ./... && go test ./... in juju-core/
<dimitern> rvba: but the same fails inside gomaasapi with the error above
<rvba> dimitern: I'm on it.  We updated something in gomaasapi but forgot to update the 'live_example.go' file.
<dimitern> rvba: cheers
<dimitern> rvba: always run "go build ./... && go test ./..." successfully before submitting please
<dimitern> rvba: juju-core tests pass
<rogpeppe> lunch
<rvba> dimitern: we did get the tests to pass before merging.
<dimitern> rvba: I mean in the dependencies as well, like gomaasapi
<dimitern> so juju-core trunk is not broken after all, sorry
<mramm> dimitern: pong
<dimitern> mramm: sorry, I wanted to ask about the swap days after US flights
<mramm> sure
<dimitern> mramm: but I figured I can ask in oakland :)
<mramm> cool
<dimitern> mramm: anyway, I updated the calendar with my holiday leave in june (10-25) and for europython in july (1-5), filed swap days, etc. in cadmin
<mramm> dimitern: great -- I saw the calendar updates
<mramm> but will login to cadmin and approve stuff
<dimitern> mramm: cheers, actually probably jam needs to approve these still, but anyway
<mramm>  right!
<TheMue> mramm: btw, during oakland we have a national holiday. i'll take it together with the swap days and two of my holidays of last year in the week after oakland.
<mramm> sure
<rvba> dimitern: the fix has landed.  I'll also land a fix to make sure that kind of breakage does not happen again (i.e. to make sure that the content of examples/ [which is not tested by the unit tests] compiles).
<dimitern> rvba: that's good, but now I see another error:
<dimitern> # launchpad.net/gomaasapi/templates
<dimitern> templates/source_test.go:11: undefined: GomaasapiTestSuite
<rvba> dimitern: that's fixed in my next branch :)
<dimitern> rvba: ah, cool
<TheMue> dimitern: hehe, that's real wip
<rvba> dimitern: all fixed now.
<dimitern> rvba: indeed, thanks!
<dimitern> i still can't find the kanban g+ link :(
<dimitern> mramm: can you send it please? I'll add it to my calendar manually
<dimitern> ah! found it! sorry
<mramm> https://plus.google.com/hangouts/_/539f4239bf2fd8f454b789d64cd7307166bc9083
<fwereade> TheMue, I'm not too sure about https://codereview.appspot.com/8705043/
<fwereade> TheMue, where did you get that data format from?
<fwereade> rogpeppe, dimitern: the upgrade-juju stuff is here: https://codereview.appspot.com/8663045/
<rogpeppe> fwereade: woo!
<TheMue> fwereade: by dave's code and my interpretation of the py code
<TheMue> fwereade: i have to step out for i think 1h. will ping you then again.
 * fwereade now has to dust off the provision-time changes and see how badly they rotted in the last day
<rogpeppe> fwereade: and in return, juju-wait: https://codereview.appspot.com/8710043
<rogpeppe> dimitern: ^
<dimitern> fwereade: will take a look a bit later
<dimitern> rogpeppe: and at yours too
<rogpeppe> dimitern: ta!
 * dimitern needs to think a bit, so going for a short run - bbi30m
<jtv> Would anyone be interested in having a makefile?  There are a few advantages that come from convenience...  Suggestion is at https://codereview.appspot.com/8711043
<rogpeppe> jtv: +0; i don't think i'd ever use it.
<rogpeppe> jtv: but i wouldn't mind it being around
<rogpeppe> fwereade: any chance of a review of https://codereview.appspot.com/8710043/ ?
<rogpeppe> fwereade: i'm a fair way into your upgrade-juju review, BTW
<dimitern> rogpeppe: i'm on it now
<rogpeppe> dimitern: cool, thanks
<dimitern> rogpeppe: reviewed
<rogpeppe> dimitern: ta!
<dimitern> fwereade: your internet connection sure is flaky today
<rogpeppe> fwereade: you've got a first round of review comments
<rogpeppe> fwereade: hmm, what's our rationale for not letting agent version change in config again?
<fwereade> rogpeppe, it's just a way to mess up the upgrade process afaict
<rogpeppe> fwereade: ah, of course.
<fwereade> rogpeppe, having no way except the approved one to change it feels like a win to me
<rogpeppe> fwereade: might be good to add a comment there
<rogpeppe> fwereade: +1
<fwereade> rogpeppe, definitely
<rogpeppe> fwereade: i *knew* there was a good reason!
<fwereade> rogpeppe, sorry, btw, I need to stop for a while, I don't think I'll manage to finish it today
<rogpeppe> fwereade: ok
<fwereade> rogpeppe, if I do it'll be late
<rogpeppe> fwereade: sure
<rogpeppe> fwereade: i need to stop soon too
 * fwereade strikes out in search of pizza
<rogpeppe> fwereade: hopefully i'll get to the end of your review
 * rogpeppe quite fancies pizza
<rogpeppe> fwereade: have a great weekend!
<dimitern> fwereade: reviewed as well
<dimitern> i have to stop
<dimitern> happy weekends everyone!
<rogpeppe> m_3: ping
<rogpeppe> m_3: if you're around at some point, please could you take a look at https://codereview.appspot.com/8710043/ and see if it looks like something you could use
<rogpeppe> right, that's me done
<rogpeppe> g'night all and happy weekends too
#juju-dev 2013-04-14
<thumper> morning
<fwereade_> thumper, heyhey
<thumper> hi fwereade_
<thumper> fwereade_: just reading some review comments
<thumper> why do I feel like I'm wading in treacle?
 * fwereade_ sighs a little
<fwereade_> thumper, I do not get what's going on there
<thumper> there seems to be some fundamental differences in design opinion
<fwereade_> thumper, yeah... I have developed a nose for things-people-won't-like and tend to avoid them where possible, but I haven't been able to comprehend the underpinnings
<thumper> there are fundamentals and techniques...
<fwereade_> thumper, in particular the "abstraction too far" business
 * thumper doesn't want to preach to the converted
<thumper> I've not read all the comments yet
<thumper> I need to go through them...
<fwereade_> thumper, yeah, I suspect we'd just loudly agree with each other for a bit
<thumper> :)
<thumper> so you have a branch that changes the precise/local env default for tools upload to?
<fwereade_> thumper, I have something that by default uploads to host, default, and precise
<fwereade_> thumper, I feel somewhat bad about the code dumps I have been indulging in this weekend
<fwereade_> thumper, but it turns out that when you're trying to integrate things you end up uncovering a lot of weirdness... the sort with tentacles
<thumper> :)
 * thumper does something to make dave happier
<thumper> fwereade_: still around?
<fwereade_> thumper, yeah
<thumper> fwereade_: care to comment on the trivial->utils proposal?
<thumper> fwereade_: it is entirely mechanical
<thumper> fwereade_: although I did want to have utils.ReadYaml and utils.WriteYaml move into a package utils/yaml and change the methods to Read and Write
<thumper> fwereade_: then the code would be yaml.Read instead of utils.ReadYaml
<thumper> fwereade_: but I held myself back
<fwereade_> thumper, I'd still consider that mechanical, but yeah, why not, save it for another day :)
<thumper> :)
<thumper> namespaces are a honking good idea, let's do more of those
<thumper> so says the zen of python
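A hypothetical sketch of that utils/yaml package, assuming goyaml underneath (none of this is the proposed CL):

    package yaml

    import (
        "io/ioutil"

        goyaml "launchpad.net/goyaml"
    )

    // Read unmarshals the YAML file at path into v, so call sites read
    // yaml.Read rather than utils.ReadYaml.
    func Read(path string, v interface{}) error {
        data, err := ioutil.ReadFile(path)
        if err != nil {
            return err
        }
        return goyaml.Unmarshal(data, v)
    }

    // Write marshals v as YAML to path.
    func Write(path string, v interface{}) error {
        data, err := goyaml.Marshal(v)
        if err != nil {
            return err
        }
        return ioutil.WriteFile(path, data, 0644)
    }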
<fwereade_> thumper, LGTM trivial
<thumper> w00t
<thumper> now just to wait for dave to turn up
<thumper> fwereade_: I have a few comments on the set one
<thumper> fwereade_: I'm not strictly attached to StringSet
<fwereade_> thumper, in case it's not clear, btw, "trivial" means "go ahead and merge it"
<thumper> fwereade_: what I really want is a generic set
<thumper> fwereade_: oh, I didn't realise that
 * thumper goes and merges then...
<fwereade_> thumper, (even if it wasn't trivial, the urge to badge that one trivial would have been almost overwhelming)
<thumper> haha
<thumper> fwereade_: set.NewStrings ... are you happy with that name?
<thumper> and set.Strings as the type?
 * thumper wants set.Set<string>
<thumper> but hey
<fwereade_> thumper, set.NewStrings and set.Strings SGTM
 * thumper nods and edits
<thumper> once the submission is done
<thumper> I did it in the pipe prior to set
<thumper> I will get a clash on directory names
<thumper> but I'll resolve that
<thumper> and move forwards
<fwereade_> thumper, (but please keep the useful methods that make it a set)
<thumper> fwereade_: oh, I will
<thumper> don't worry about that
<thumper> :)
<fwereade_> :)
 * thumper does a little research on python set methods and c++ set methods to get precedence
 * thumper pushes latest string set changes
 * thumper takes the car quickly to an engineering place, bbs
<davecheney> packaging ... packaging ... packaging
<thumper> davecheney: morning
<thumper> davecheney: I think we should have more namespaces, not less...
<thumper> set.Strings makes more sense than utils.Strings
<davecheney> thumper: i disagree
<thumper> since the rename that rog wanted
<davecheney> if you want to write java, write java
<davecheney> but i honestly can't give a fuck about this bikeshedding
<thumper> this isn't about java
 * davecheney goes to change his vote on that CL
 * thumper likes the zen of python
<davecheney> thumper: you've got 2
<thumper> and lots of c++ namespaces
<davecheney> submit it
<thumper> davecheney: ta
<thumper> davecheney: how about https://codereview.appspot.com/8701043/ ?
 * davecheney is looking
<davecheney> thumper: do you have a test for
<davecheney> var s set.StringSet
<davecheney> s.Add("hello"); s.Add("world")
<davecheney> ?
<thumper> davecheney: yep
<davecheney> ok, cool
<thumper> TestUninitialized
<thumper> I found myself wanting all the methods to work properly with uninitialized variables.
<davecheney> yeah, that is very nice
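For reference, a minimal sketch of the zero-value-safe set being discussed (illustrative names and methods, not the exact juju-core utils/set code):

    package main

    import (
    	"fmt"
    	"sort"
    )

    // Strings is a set of strings whose zero value is a usable empty set,
    // the property TestUninitialized checks for.
    type Strings struct {
    	values map[string]bool
    }

    // NewStrings returns a set containing the given values.
    func NewStrings(values ...string) Strings {
    	s := Strings{values: make(map[string]bool)}
    	for _, v := range values {
    		s.values[v] = true
    	}
    	return s
    }

    // Add inserts value, lazily allocating the map so Add works on an
    // uninitialized Strings.
    func (s *Strings) Add(value string) {
    	if s.values == nil {
    		s.values = make(map[string]bool)
    	}
    	s.values[value] = true
    }

    // Contains reports whether value is in the set; reading a nil map
    // is safe in Go, so no initialization check is needed here.
    func (s Strings) Contains(value string) bool {
    	return s.values[value]
    }

    // SortedValues returns the members in sorted order.
    func (s Strings) SortedValues() []string {
    	result := make([]string, 0, len(s.values))
    	for v := range s.values {
    		result = append(result, v)
    	}
    	sort.Strings(result)
    	return result
    }

    func main() {
    	var s Strings // uninitialized, as in davecheney's test case
    	s.Add("hello")
    	s.Add("world")
    	fmt.Println(s.SortedValues(), s.Contains("hello"))
    }

The pointer receiver on Add is what lets the lazy allocation stick; the read-only methods can use value receivers because a nil map reads as empty.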
#juju-dev 2014-04-07
<rick_h_> thumper-gym: hey, tracking the debug-log work. Is that branch you've got proposed the final?
<davecheney> o/ thumper-gym
<thumper> hi davecheney
<davecheney> thumper: o/
<davecheney> re that that change to peergrouper
<davecheney> i thought it was a fix
<davecheney> but it just made things more confusing
<thumper> heh
<davecheney> rog says that the peergrouper should return things in order
<davecheney> but it doesn't
<davecheney> even fixing the mock server to make it respect order doesn't return things in order
<davecheney> so I don't know what to do
<davecheney> if order is important to the operation of the peergrouper
<davecheney> then there is a problem
<davecheney> if its not
<davecheney> then we can just go back to my previous commit
<davecheney> s/commit/proposal
<davecheney> the point is, both proposals can't both be right or both wrong
<waigani> hi all :)
<thumper> o/ waigani
<davecheney> waigani: thumper http://paste.ubuntu.com/7215168/
<davecheney> getting there, slowly
<waigani> davecheney: I've been out of action sorry, getting back into it now
<davecheney> waigani: no worries
<waigani> davecheney: fyi I put up a wip at the end of last week: https://codereview.appspot.com/84360043. I'm just reading it now to remind myself where I was at.
<davecheney> waigani: i can't reproduce the issue
<davecheney> do you have the latest gccgo and gccgo-go ?
<davecheney> ubuntu@winton-02:~/src/launchpad.net/juju-core/provider/local$ go test .
<davecheney> ok      launchpad.net/juju-core/provider/local  30.707s
<waigani> davecheney: oops, just saw your message hmm
<waigani> i can't reproduce it now either. Though when I run make check I'm hitting nil pointer panics - something to do with mongo being checked for
<davecheney> waigani: can you paste what you see?
<davecheney> waigani: do you have a /usr/bin/mongod ?
<davecheney> i suspect you will not
<waigani> davecheney: http://pastebin.ubuntu.com/7215232/
<waigani> davecheney: ah, I'll have a look - I need to run to a standup now
<waigani> bb in 30min
<waigani> davecheney: I do not have /usr/bin/mongod. I am confused as I thought that mongod was a dependency of juju-local, which I do have.
<waigani> davecheney: I thought the idea was to add juju-local to install-dependencies in the Makefile and check only for juju-local as mongod would be installed as a dependency of juju-local
<waigani> davecheney: though I may very easily have got that wrong - thumper?
<thumper> I'm busy responding
<thumper> please hold
<davecheney> waigani: there is a bug
<davecheney> you need to symlink /usr/lib/..../mongod to /usr/bin
<waigani> davecheney: ah okay
<waigani> davecheney: if this is a bug, does it make sense for me to fix it in this branch? As it now relies on mongod being installed correctly as a dep.
<waigani> s/it/we
<waigani> wallyworld_: is the simplestreams a universal metadata format or juju specific?
<wallyworld_> universal
<waigani> hmm, how did I miss that? I'll have to read up.
<wallyworld_> from the project page on lp
<wallyworld_> Simple Streams describe streams of like items in a structural fashion.
<wallyworld_> A client provides a way to sync or act on changes in a remote stream.
<wallyworld_> It's been used to describe cloud images which we just happen to make use of in Juju
<waigani> ah, that helps, thanks
<waigani> Does the server API rely on it?
<waigani> anyway, I shouldn't distract you - I'll read up :)
<thumper> davecheney, waigani: there is another problem with the tests on trusty, in that they should use the juju-mongodb mongo, and not the mongodb-server mongo
<waigani> thumper: okay, I'll update my branch
<waigani> thumper: your comment about cloud-init. I don't understand?
<thumper> waigani: hangout?
<waigani> sure
<waigani> give me a sec
<davecheney> thumper: yes, exactly
<waigani> thumper: got a link? Tried calling via hangouts, no luck.
 * thumper goes through axw's review comments
<axw> btw I have been through the tests now, and just had the one additional comment - expect no more pedantry :)
<axw> until you reply at least
<davecheney> ubuntu@winton-02:~/src/launchpad.net/juju-core/provider/openstack$ go test
<davecheney> OK: 53 passed, 5 skipped
<davecheney> PASS
<davecheney> ok      launchpad.net/juju-core/provider/openstack      59.350s
<davecheney> \o/
<davecheney> gentlemen, we have TWO (at least) ways of creating fake tools
<davecheney> everyone should feel ashamed of themselves
<thumper> davecheney: only two?
<thumper> axw: wondering about agent vs. entity
<thumper> axw: the log files are aggregated by agent
<thumper> for now at least
<thumper> I'll change to entity
<thumper> just because I think it makes more sense
<thumper> we can work out what to do later when we merge the unit agents into the machine agent
<davecheney> thumper: i'd be fine with N, where N was the number of providers
<davecheney> but it's two
<axw> thumper: thanks
<axw> thumper: that'll be "interesting" :)
<thumper> davecheney: I think it would be N where N is the number of developers who have needed it :-)
<davecheney> thumper: well played
 * davecheney runs go test ./... and goes to make lunch
 * davecheney throws a chair
<davecheney> everywhere we use version.Current
<davecheney> that should be a version number, not a full number/arch/series
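davecheney's complaint is that version.Current bundles a release number with series and arch; a hedged sketch of the distinction he wants (simplified types, not juju-core's actual version package):

    package main

    import "fmt"

    // Number is just the release version, which is all most call sites
    // comparing versions actually need.
    type Number struct {
    	Major, Minor, Patch int
    }

    // Binary adds the build target; only code choosing a tools tarball
    // should care about these extra fields.
    type Binary struct {
    	Number
    	Series string // e.g. "precise"
    	Arch   string // e.g. "amd64"
    }

    func (n Number) String() string {
    	return fmt.Sprintf("%d.%d.%d", n.Major, n.Minor, n.Patch)
    }

    func (b Binary) String() string {
    	return fmt.Sprintf("%s-%s-%s", b.Number, b.Series, b.Arch)
    }

    func main() {
    	current := Binary{Number{1, 18, 0}, "precise", "amd64"}
    	fmt.Println(current)        // 1.18.0-precise-amd64
    	fmt.Println(current.Number) // 1.18.0 is enough for most comparisons
    }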
<thumper> axw: refactoring the authentication may be bigger than I want
<thumper> I'll attempt it
<thumper> and see how massive it is
<axw> mk
<thumper> I didn't know it was a copy
<thumper> was in the work I was passed
<axw> ah right
<tvansteenburgh> thumper: i'm looking at https://bugs.launchpad.net/juju-core/+bug/1302935
<_mup_> Bug #1302935: trusty local provider unit agents stuck pending <deploy> <local-provider> <lxc> <juju-core:Incomplete by thumper> <https://launchpad.net/bugs/1302935>
<tvansteenburgh> `sudo lxc-ls` returns 'juju-precise-template'
<thumper> axw: which is better `maxLines value "foo" is not a valid unsigned number` or `maxLines value "foo" is not a valid unsigned number: strconv.ParseUint: parsing "foo": invalid syntax`
<thumper> tvansteenburgh: yeah...
<tvansteenburgh> do i need to kill that somehow?
<thumper> no, it makes no difference
<thumper> it is filtered out before anyone cares
<waigani> What write permissions should /usr/lib/juju have?
<thumper> waigani: why?
<thumper> 0755 at least
<axw> thumper: I see, fair enough :)
<axw> leave it then
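The two candidate messages differ only in whether the underlying strconv error is chained on; a small sketch of how each would be produced (the parsing function here is assumed, not thumper's actual code):

    package main

    import (
    	"fmt"
    	"strconv"
    )

    // parseMaxLines shows the two error styles compared above.
    func parseMaxLines(value string) (uint64, error) {
    	n, err := strconv.ParseUint(value, 10, 64)
    	if err != nil {
    		// Verbose form: keeps the strconv detail for debugging.
    		return 0, fmt.Errorf("maxLines value %q is not a valid unsigned number: %v", value, err)
    		// The short form would drop the ": %v" and the err argument.
    	}
    	return n, nil
    }

    func main() {
    	_, err := parseMaxLines("foo")
    	fmt.Println(err)
    }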
<waigani> thumper: http://pastebin.ubuntu.com/7215512/
<tvansteenburgh> thumper: hrm, not the answer i was hoping for. i deleted the trusty lock file, redeployed, and same result. i'll post the new log file
<waigani> it seems the dirs can be created, no problem
<thumper> tvansteenburgh: how long did you wait?
<thumper> tvansteenburgh: for me to create a new template, and start it, it took over 5 minutes
 * tvansteenburgh turns suddenly sheepish
<tvansteenburgh> thumper: you're right, now it's all started up
<tvansteenburgh> sorry :(
<thumper> tvansteenburgh: the slow startup is just the first time the template is started
<thumper> tvansteenburgh: that is fine
<thumper> we should communicate this more clearly
<thumper> tvansteenburgh: it will be much faster now :-)
<thumper> tvansteenburgh: on my todo list is a plugin command to create the template out of band
<thumper> with more feedback
<thumper> so you can see what is going on
<tvansteenburgh> cool. unfortunately i would have never figured out the lock file thing on my own
<thumper> yeah, another problem with communication I think
<tvansteenburgh> but now i know, so thanks for that
<thumper> we should be aware that people will kill it in the middle there
<tvansteenburgh> i guess i killed it at some point, but didn't even remember that
<thumper> axw: I'm up for ideas on how we can make the destroy environment call on Ctrl-C remove the lock file if it created one
<thumper> tvansteenburgh: this is good feedback though
<thumper> don't feel sheepish
<axw> thumper: which lock file is that?
<thumper> axw: when we create a new lxc template for a series
<thumper> axw: it can take a while
<thumper> we take a fslock
<axw> thumper: is there a reason we don't just use flock? then we wouldn't have this problem, because nobody would be holding the lock
<thumper> axw: I'm sure there was originally, but now I'm not so sure...
<wallyworld_> hazmat: you online?
<axw> thumper: otherwise it'll get messy if you want to handle multiple users concurrently trying to add containers in different environments
<axw> which is why it's in /var/lib/juju, rather than data-dir, I presume
<thumper> yeah
<axw> can't just blow it away; you'd probably want to record the PID, and check if it's alive, etc.
<thumper> meh... add it to the todo list
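axw's suggestion relies on flock being released by the kernel when the holding process dies, so a Ctrl-C cannot leave a stale lock the way an on-disk lock can. A Linux-only sketch, not juju's actual locking code:

    package main

    import (
    	"fmt"
    	"os"
    	"syscall"
    )

    // acquireFlock takes an exclusive flock on path. If the process is
    // killed while holding it, the kernel releases the lock when the
    // descriptor is closed, unlike an fslock left behind on disk.
    func acquireFlock(path string) (*os.File, error) {
    	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0644)
    	if err != nil {
    		return nil, err
    	}
    	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
    		f.Close()
    		return nil, fmt.Errorf("cannot lock %s: %v", path, err)
    	}
    	return f, nil
    }

    func main() {
    	f, err := acquireFlock("/tmp/juju-template.lock")
    	if err != nil {
    		fmt.Println(err)
    		return
    	}
    	defer f.Close() // closing the file releases the lock
    	fmt.Println("lock held; safe to build the lxc template here")
    }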
 * thumper is refactoring http auth rubbish
 * davecheney consoles thumper 
<waigani> wip: https://codereview.appspot.com/84360043/
<thumper> axw: refactored the auth code
<thumper> axw: reusing the httpHandler now
<axw> thumper: cool, thanks
<thumper> just waiting on lbox now
 * axw looks forward to cutting out that particular bottleneck
<axw> takes 5-10 mins for me sometimes...
<thumper> geez, how to freak out people: http://www.dailydo.co.nz/garden/glowgravel495499?utm_source=Dunedin&utm_campaign=b90cc26652-Dunedin+2014-04-07&utm_medium=email&utm_term=0_4768a78a2f-b90cc26652-280945562
<axw> thumper: ah yeah, I forgot NewTailer starts immediately. thanks, start makes that clearer
<thumper> np
<davecheney> https://codereview.appspot.com/84860046
<davecheney> if anyone has time
<thumper> davecheney: lgtm
<davecheney> thumper: ta
<vladk> jam: I will be available at 10:10 for hangout, sorry
<dimitern> morning all
<thumper> hmm... dimitern is here...
<thumper> time to leave
<thumper> o/ dimitern
<dimitern> hey thumper
<dimitern> hey jam1
<dimitern> jam1, it seems my ssh key is missing from the bot - can't login
<jam1> dimitern: hmm... I'll go check
<dimitern> jam1, thanks
<jam1> dimitern: dimitern@kubrik is on the bot, are you sure you're connecting as the "ubuntu" user?
<dimitern> jam1, yes
<dimitern> jam1, https://pastebin.canonical.com/107836/
<jam1> dimitern: that is machine-0, I wouldn't expect you to be able to connect there, just machine 1 @ 118
<rogpeppe> mornin' all
<wallyworld_> fwereade: got time for a quick implementation discussion?
<voidspace> rebooting :-)
<vladk> dimitern: good morning
<dimitern> vladk, morning
<vladk> I need to get all MACs related to node via MAAS interface, I know http://paste.ubuntu.com/7216131/, but it does not return MAC addresses
<dimitern> vladk, I'm working on reading the node hardware list (xml output of lshw), which contains all ethernet cards and their interface names and mac addresses
<vladk> dimitern, cool, it is very convenient to know device names, instead of their MAC addresses
<dimitern> vladk, I'll propose it a bit later when I'm done and send you a link if you like
<vladk> can I expect something like 'NetworkInfo struct' before newCloudinitConfig call?
<dimitern> vladk, StartInstance will gather the networking info and return it, and will also populate machine config with the necessary info so newCloudinitConfig can prepare the scripts
<vladk> Why not get that information from the commissioning stage? I really need network information before cloudinit
<vladk> dimitern: ^
<dimitern> vladk, that's exactly where we're getting the info from
<dimitern> vladk, it's before cloudinit
<vladk> dimitern: now, I am trying to use mgz's GetNetworksList function that uses above API call, so it becomes useless
<mgz> morning all
<dimitern> morning mgz
<dimitern> vladk, we need the result of GetNetworksList to get the network details - also part of []NetworkInfo returned by StartInstance
<dimitern> vladk, so []NetworkInfo is populated by GetNetworksList and the new call that returns NIC details for a node
<vladk> mgz, morning
<vladk> dimitern, mgz: I fixed GetNetworksList: it assumed that VlanTag is a string, but it is a number or null in JSON, so I'm going to create a merge request
<dimitern> vladk, great, thanks!
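The fix vladk describes comes down to the JSON type used for the field; a sketch of decoding a vlan tag that may be a number or null (field and type names are illustrative, not gomaasapi's exact ones):

    package main

    import (
    	"encoding/json"
    	"fmt"
    )

    // Network decodes both a numeric and a null vlan_tag by using a
    // pointer; a string field here would fail on either input.
    type Network struct {
    	Name    string `json:"name"`
    	VlanTag *int   `json:"vlan_tag"` // nil when the JSON value is null
    }

    func main() {
    	for _, raw := range []string{
    		`{"name": "net1", "vlan_tag": 42}`,
    		`{"name": "net2", "vlan_tag": null}`,
    	} {
    		var n Network
    		if err := json.Unmarshal([]byte(raw), &n); err != nil {
    			fmt.Println("decode error:", err)
    			continue
    		}
    		if n.VlanTag == nil {
    			fmt.Printf("%s: untagged\n", n.Name)
    		} else {
    			fmt.Printf("%s: vlan %d\n", n.Name, *n.VlanTag)
    		}
    	}
    }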
<vladk> do I need to create a bug in LP or card in kanban?
<vladk> dimitern: ^
<mgz> vladk: a card wouldn't hurt
<jam1> rogpeppe: I made it back, care to join the hangout?
<jam1> morning mgz
<rogpeppe> jam1: standup hangout?
<jam1> rogpeppe: our 1:1 hangout
<rogpeppe> jam1: ha! 1-1!
<rogpeppe> jam1: joining
<dimitern> vladk, card is fine
<dimitern> vladk, you could add a bug as well, but in this case i don't think it's worth it
<fwereade> rogpeppe, would you take a look at https://codereview.appspot.com/84430044/ please? I don't quite see what's going on there
<jamespage> fwereade, quick question - what should the behaviour of upgrade-juju look like with a 1.18.0 client against a 1.17.7 environment?
<jamespage> fwereade, also noted this potential upgrade problem - bug 1303697
<_mup_> Bug #1303697: peer relation disappears during upgrade of juju <juju-core:New> <https://launchpad.net/bugs/1303697>
<fwereade> jamespage, the env should be upgraded to 1.18.0, does that not happen?
<jamespage> fwereade, without --version 1.18.0 it did not
<fwereade> jamespage, I *thought* we were upgrading from dev to stable without explicit args, but I may have missed something -- dimitern, can you comment?
<fwereade> jamespage, and that bug is weird, I will peer closer
<jamespage> fwereade, thanks
<dimitern> fwereade, the original logic was to try to upgrade to the next stable without --version
<dimitern> fwereade, (or current)
<dimitern> jamespage, did you use --upload-tools ?
<fwereade> dimitern, I thought so too, do you recall anyone working on that subsequently?
<mgz> jam: 1:1!
<jamespage> dimitern, nope
<dimitern> fwereade, not that i know of
<jamespage> dimitern, I saw the same with both maas (using streams.canonical.com) and openstack (using a local mirror of streams.canonical.com)
<jam> yep, be there in 1sec
<jam> mgz: ^^
<mgz> cool
<dimitern> jamespage, did you try --debug to see what's going on?
<jamespage> dimitern, I did not
<jamespage> dimitern, I did hop onto the bootstrap node and looked at the log - messages about it picking 1.17.7
<dimitern> jamespage, hmm... weird
<rogpeppe> fwereade: sorry, was in a call with john. looking.
<fwereade> jamespage, I don't suppose you have the unit agent log from lp:1303697 ?
<jamespage> fwereade, lemme check
<mgz> c'mon internet
<rogpeppe> fwereade: i'm not sure i see what's going on either
<rogpeppe> fwereade: in the comment i made in the previous review, i was talking about address order *within* a machine, not the order of machines
<jam> bug 1303697
<_mup_> Bug #1303697: peer relation disappears during upgrade of juju <juju-core:New> <https://launchpad.net/bugs/1303697>
<fwereade> jamespage, ok, I can see why the juju run could have failed
<fwereade> jamespage, in fact I suspect an ill-timed one could have panicked the unit agent, hence the second config-changed
<fwereade> jamespage, root cause remains unclear
<jamespage> fwereade, the call in the charm pretty much does the same thing in the config-changed hook
<fwereade> jamespage, yeah, but it turns out we start listening for juju run commands while the uniter is less than half initialized
<fwereade> jamespage, but I can't see a code path that would allow a config-changed hook to be run in those circumstances
<perrito666> good morning everyone
<jamespage> fwereade, it looks like it runs multiple times
<jamespage> I had to resolve it twice
<jamespage> fwereade, and I'm def not changing the config :-)
<fwereade> jamespage, config-changed runs every time the unit agent restarts
<mgz> hey perrito666
<jamespage> fwereade, ah
<fwereade> jamespage, and that juju-run could easily have bounced the agent, because the code is decidedly unsafe
<jamespage> fwereade, right - I see
<jamespage> fwereade, I'm less worried about the juju run failing and more worried that whenever I upgrade juju, keystone will hook error as it can't access the peer relation :-)
<fwereade> jamespage, indeed, that's why I'm still poking around
<jamespage> fwereade, OK - I'll get out of your hair then :-)
<fwereade> jamespage, I'd still love to see the agent log if you have it :)
<jamespage> fyi I've tested on openstack and local for trusty; just validating against the latest maas
<jamespage> fwereade, I'll repro that for you now
 * jamespage downgrades and pushes the car back up the hill
<fwereade> jamespage, I doubt that behaviour would vary by provider
<jamespage> ok
<fwereade> jamespage, I consider it a big deal anyway ;)
<jamespage> fwereade, I'm reproducing on openstack as that is quicker
<fwereade> jamespage, cool
<dimitern> fwereade, rogpeppe, mgz, jam, wallyworld, standup
<jam> rogpeppe: standup?
<jamespage> fwereade, unit log - http://paste.ubuntu.com/7216528/
<jamespage> fwereade, machine log - http://paste.ubuntu.com/7216527/
<jamespage> fwereade, and one new issue - http://paste.ubuntu.com/7216530/
<jamespage> fwereade, bug 1303735
<_mup_> Bug #1303735: private-address change to internal bridge post juju-upgrade <juju-core:New> <https://launchpad.net/bugs/1303735>
<fwereade> jamespage, thanks
<jamespage> fwereade, sorry :-)
<fwereade> jamespage, nothing to be sorry for, except on our side ;p
<jamespage> fwereade, I could not say whether the address change thing is a new problem
<jamespage> I don't spend that much time upgrading juju environments
<voidspac_> Plugging my iphone in appears to have locked up my computer
<voidspac_> all except audio!
<voidspac_> So I'll have to reboot and rejoin the hangout
<voidspac_> sorry
<rogpeppe> fwereade, jam: the client doesn't necessarily need to read from the master
<jam> rogpeppe: well, we always do Strong consistency, right?
<rogpeppe> fwereade, jam: i'm giving up. connection too crappy.
<tvansteenburgh> hey guys, wondering if this is something i should file a bug on: http://pastebin.ubuntu.com/7216718/
<hazmat> wallyworld, am now
<hazmat> er.. ping
<hazmat> re azure, does this mean the provisioning interface now knows the workload prior to machine allocation or is just fed constraints/parameters ?
<adeuring> could somebody please have a look here: https://codereview.appspot.com/84470053
<dimitern> smoser, i just got hit by bug 1303617
<_mup_> Bug #1303617: Latest curtin version prevents Juju from bootstrapping on MAAS <landscape> <curtin:New> <curtin (Ubuntu):Confirmed> <https://launchpad.net/bugs/1303617>
<dimitern> smoser, and since version 0.1.0~bzr121-0ubuntu1 is no longer in trusty I cannot downgrade to work around it
<rogpeppe> natefinch: you've got a LGTM on https://codereview.appspot.com/81980043/
<vladk> I found a reason: gomaasapi gives a panic in testing mode when GET /MAAS/api/1.0/networks/?node=xxx is called and no network is assigned to the node
<dimitern> vladk, yeah that's the issue, but I don't see panics when running maas tests otherwise
<dimitern> fwereade, mgz, vladk, perrito666, get a list of NICs with mac addresses in maas https://codereview.appspot.com/84850045
<vladk> dimitern: because you don't call GetNetworksList before environ.startNode()
<dimitern> vladk, startNode shouldn't panic - if there are no networks requested for the machine it shouldn't try to get them (or if it fails it should ignore it)
<natefinch> rogpeppe: thanks
<perrito666> dimitern: tx
<vladk> dimitern: it panics only in testing mode (testservice.go networksHandler() function)
<dimitern> vladk, take a look at TestGetNetworksList - it adds a network and then connects a node to it
<vladk> dimitern: from GetNetworksList I have (net_name, addr, mask, vlan, descr), from getInstanceNetworkInterfaces I have (mac_addr, iface_name)
<vladk> dimitern: I need to extract mapping between net_name and mac_addr via MAAS API
<dimitern> vladk, the macs will be there if the node is connected to the network
<dimitern> vladk, and you can get the networks with the macs
<dimitern> vladk, i.e. GetNetworksList needs to return mac addresses (when set)
<vladk> dimitern: what I really need is a multimap(iface_name->vlan)
<vladk> dimitern: GetNetworksList doesn't read mac addresses now
<vladk> dimitern: I may change it, do you know MAAS API to read MAC addresses?
<vladk> dimitern: as to TestGetNetworksList, this will work if I manually add a network to each instance before bootstrap.Bootstrap() or testing.AssertStartInstance()
<vladk> I think that panic should be changed to error code in gomaasapi
<dimitern> vladk, so I can get the macs from maas maas-root network list-connected-macs vlan0
<vladk> dimitern: this will be a very long way
<dimitern> vladk, with is equivalent to doing /api/1.0/networks/<name>?op=list-connected-macs
<dimitern> vladk, s/with/which/ ...(I hope)
<dimitern> list_connected_macs actually
<dimitern> vladk, it seems MAAS does not allow us to get a list of all networks along with their mac assignments - just for a single network, so we'll need one API call per network
<vladk> dimitern: long way: from GetNetworksList I get []networkNames, then for each networkName I read listConnectedMacs and search for my node in that list
<vladk> dimitern: we definitely need a separate API call for all this stuff
<dimitern> vladk, we could have a separate api, but we don't have it now, so we can work around it with multiple calls
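The workaround dimitern describes is one list_connected_macs request per network; a sketch of that loop against an assumed client interface standing in for gomaasapi:

    package main

    import "fmt"

    // maasClient abstracts the two calls discussed: listing networks and
    // listing the MACs connected to one of them. Hypothetical interface.
    type maasClient interface {
    	GetNetworksList() ([]string, error)
    	ListConnectedMACs(network string) ([]string, error)
    }

    // macsByNetwork works around the missing bulk API with one extra
    // request per network.
    func macsByNetwork(c maasClient) (map[string][]string, error) {
    	networks, err := c.GetNetworksList()
    	if err != nil {
    		return nil, err
    	}
    	result := make(map[string][]string)
    	for _, name := range networks {
    		macs, err := c.ListConnectedMACs(name)
    		if err != nil {
    			return nil, fmt.Errorf("cannot list MACs for %s: %v", name, err)
    		}
    		result[name] = macs
    	}
    	return result, nil
    }

    // fakeClient lets the sketch run without a real MAAS.
    type fakeClient struct{}

    func (fakeClient) GetNetworksList() ([]string, error) {
    	return []string{"vlan0", "vlan1"}, nil
    }

    func (fakeClient) ListConnectedMACs(network string) ([]string, error) {
    	return []string{"aa:bb:cc:dd:ee:ff"}, nil
    }

    func main() {
    	m, err := macsByNetwork(fakeClient{})
    	if err != nil {
    		fmt.Println(err)
    		return
    	}
    	fmt.Println(m)
    }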
 * dimitern needs to step out for a while
<wallyworld> hazmat: hi, just wanted a clarification. you said in an email that you wanted upload-tools to use the jujud binary and you had hardwired it to do so. AFAICT juju actually looks for the jujud executable in the current path and uses that if found. so could you clarify where you see the problem with the current behaviour? Have I missed something?
<hazmat> wallyworld, sorry that wasn't clear.. i mean mentally i've hardwired myself to use upload-tools
<hazmat> wallyworld, indeed the tool lookup by path works fine
<wallyworld> ah, ok :-) thanks
<wallyworld> pinned bootstrap using exact version match should help with that for public clouds
<vladk> dimitern: I'll take the task of changing GetNetworksList myself
<natefinch> rogpeppe: what work did you do on the HA branch after I left on Friday?  I forget exactly where we left it.
 * rogpeppe tries to remember
<rogpeppe> natefinch: i did this: https://codereview.appspot.com/84540044/
<rogpeppe> natefinch: (and possibly some more as well; i forget)
<rogpeppe> natefinch: over the weekend, i also tried to integrate the branches to see what might actually work
<rogpeppe> natefinch: there are a few things that we will need to do
<rogpeppe> natefinch: we need to apt-get install mongo inside EnsureMongoServer
<rogpeppe> natefinch: we need EnsureMongoServer to write the server secret files
<rogpeppe> natefinch: we need to add SystemIdentity to StateServingInfo
<natefinch> rogpeppe: that seems like stuff that can be in separate CLs, right?  Not needed for the single-server mode we're using right now
<rogpeppe> natefinch: the state server Initiate needs to add the machine tag to the replica set entry
<rogpeppe> natefinch: yeah, although it does make EnsureMongoServer make sense.
<axw> rogpeppe: doesn't cloud-init know the intended jobs? so it can do apt-get as usual?
<natefinch> rogpeppe: I really want to get the MA-HA branch into trunk ASAP so we don't have to keep maintaining a huge branch
<rogpeppe> natefinch: +1
<natefinch> rogpeppe: I honestly don't care if it makes sense today, as long as it doesn't break anything today
<rogpeppe> axw: i would much prefer to isolate all the mongo installation into EnsureMongoServer
<rogpeppe> axw: which means that cloud-init doesn't need to know anything at all about mongo
<axw> rogpeppe: fair enough
<axw> although, there may be trickiness around adding apt sources
<rogpeppe> axw: the apt sources *can* be added by cloud-init
<rogpeppe> axw: because they're needed for lxc too
<axw> yeah okay
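A skeleton of the isolation rogpeppe argues for, with installation, secrets, and startup all inside EnsureMongoServer so cloud-init carries no mongo knowledge. The helpers are placeholders and the whole thing is a sketch under those assumptions, not the real juju-core function:

    package main

    import (
    	"fmt"
    	"os/exec"
    )

    // EnsureMongoServer installs and starts mongo itself, so the caller
    // (and cloud-init) needs no mongo-specific logic.
    func EnsureMongoServer(dataDir string, port int) error {
    	if out, err := exec.Command("apt-get", "install", "-y", "mongodb-server").CombinedOutput(); err != nil {
    		return fmt.Errorf("cannot install mongo: %v (%s)", err, out)
    	}
    	if err := writeSharedSecret(dataDir); err != nil {
    		return err
    	}
    	return startMongo(dataDir, port)
    }

    // writeSharedSecret is a placeholder for writing the replica-set
    // keyfile under dataDir.
    func writeSharedSecret(dataDir string) error { return nil }

    // startMongo is a placeholder for writing a service job and starting it.
    func startMongo(dataDir string, port int) error {
    	fmt.Printf("would run: mongod --dbpath %s --port %d\n", dataDir, port)
    	return nil
    }

    func main() {
    	if err := EnsureMongoServer("/var/lib/juju/db", 37017); err != nil {
    		fmt.Println(err)
    	}
    }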
<natefinch> rogpeppe: you had a change to make the replicaset member host separate from the dial info... is that in your version of the MA-HA branch?
<rogpeppe> axw: BTW juju ssh is broken currently if $SHELL isn't sh-compatible
<rogpeppe> axw: the fix is trivial, but i haven't got around to it yet, sorry
<axw> oy vey
<axw> what's the issue?
<axw> or you didn't get to the bottom of it?
<rogpeppe> axw: ssh proxying uses $SHELL to execute the proxy command
<rogpeppe> natefinch: did you merge my version of the MA-HA branch?
<natefinch> rogpeppe: doing so now
<axw> rogpeppe: as in, ProxyCommand is executed using $SHELL?
<rogpeppe> natefinch: yes, my version of the branch adds a MemberHostPort member to InitiateMongoParams
<rogpeppe> axw: yes
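The trivial fix rogpeppe mentions is presumably to run the ProxyCommand under a known sh rather than the user's $SHELL; a sketch of that approach (illustrative, not juju's actual ssh code):

    package main

    import (
    	"fmt"
    	"os/exec"
    )

    // runProxyCommand executes the command with /bin/sh -c instead of
    // $SHELL, so a non-sh-compatible login shell (fish, csh) cannot
    // break the proxying.
    func runProxyCommand(proxyCommand string) error {
    	out, err := exec.Command("/bin/sh", "-c", proxyCommand).CombinedOutput()
    	if err != nil {
    		return fmt.Errorf("proxy command failed: %v (output: %s)", err, out)
    	}
    	fmt.Printf("%s", out)
    	return nil
    }

    func main() {
    	if err := runProxyCommand("echo connected via proxy"); err != nil {
    		fmt.Println(err)
    	}
    }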
<axw> rogpeppe: is that to separate dialing address from the member address?
<axw> I was just about to ask about that
<rogpeppe> axw: yes
<axw> cool
<rogpeppe> axw: because you need to dial localhost otherwise you don't get access to mongo
<axw> yep
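The shape of the split being described, as a sketch (field names follow the chat, not necessarily the final code): the dial address is localhost so the connection qualifies for mongo's localhost exception before auth exists, while MemberHostPort is what peers will use to reach the server.

    package main

    import "fmt"

    // InitiateMongoParams separates how we connect from how the replica
    // set advertises this member.
    type InitiateMongoParams struct {
    	DialAddrs      []string // e.g. "127.0.0.1:37017", for the localhost exception
    	MemberHostPort string   // e.g. "10.0.3.1:37017", written into the rs config
    }

    func main() {
    	p := InitiateMongoParams{
    		DialAddrs:      []string{"127.0.0.1:37017"},
    		MemberHostPort: "10.0.3.1:37017",
    	}
    	fmt.Printf("dial %v, advertise %s to peers\n", p.DialAddrs, p.MemberHostPort)
    }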
<axw> I spent most of Friday learning a bunch about the quirks around replica sets, localhost exception, etc.
<natefinch> axw: yeah, mongo replicaset stuff is like 100% quirks
<axw> hehe :)
 * natefinch wonders if he's the only one who types bzr bootstrap --debug and juju install ./...
<mgz> :)
<natefinch> I need one command that picks juju, go, or bzr based on the command and does the right thing
<natefinch> switch being the only tricky one between juju and bzr
<smoser> dimitern, fix is uploaded.
<smoser> dimitern, https://launchpad.net/ubuntu/+source/curtin/0.1.0~bzr125-0ubuntu1/+build/5886362
<smoser> and really sorry/embarrassed about that.
<smoser> dimitern, you just need to upgrade curtin on your maas system.  the actual fix is in curtin-common.
<hazmat> any ppc enablers around?
<hazmat> there are some client agent failures in the logs ref'd in https://bugs.launchpad.net/juju-core/+bug/1303787 that need attention
<_mup_> Bug #1303787: hook failures - nil pointer dereference <hooks> <local-provider> <ppc64el> <juju-core:Triaged> <https://launchpad.net/bugs/1303787>
 * hazmat falls back to email
<perrito666> fwereade: is there any written spec for what a placement directive can look like? I see a set of tests that assert that various values passed to machineornewcontainer yield true or false, but I am not sure how to interpret all of them
<fwereade> that google doc is all we have
<fwereade> the only ones implemented are lxc/kvm
<fwereade> and they all take <pseudo-provider>:<host-machine-id>
<fwereade> perrito666, ^^
<fwereade> perrito666, the juju-core/names package has the definition for a machine id
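A sketch of parsing the <pseudo-provider>:<host-machine-id> form fwereade describes (simplified; the real machine-id rules live in the juju-core/names package):

    package main

    import (
    	"fmt"
    	"strings"
    )

    // parsePlacement splits a --to directive such as "lxc:1" or "kvm:0".
    func parsePlacement(directive string) (container, machine string, err error) {
    	parts := strings.SplitN(directive, ":", 2)
    	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
    		return "", "", fmt.Errorf("invalid placement directive %q", directive)
    	}
    	switch parts[0] {
    	case "lxc", "kvm": // the only pseudo-providers implemented, per the chat
    		return parts[0], parts[1], nil
    	}
    	return "", "", fmt.Errorf("unknown container type %q", parts[0])
    }

    func main() {
    	for _, d := range []string{"lxc:1", "kvm:0", "bogus"} {
    		c, m, err := parsePlacement(d)
    		if err != nil {
    			fmt.Println(err)
    			continue
    		}
    		fmt.Printf("new %s container on machine %s\n", c, m)
    	}
    }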
<axw> perrito666: what are you working on? I'm starting to look at some placement stuff, just wondering if we're going to collide...
<perrito666> axw: well I am working on something that is tangential to it, I was trying to do network checks on a given instance when specified in --to
<perrito666> and well, trying to get the instance I tried to get the id for the instance
<perrito666> and one thing led to another
<axw> ah ok
<perrito666> axw: nothing coded yet
<axw> so not actually extending --to
<natefinch> haha... I just got an offer letter in email from some company I've never heard of in my gmail inbox.  Addressed to Nate Finch and everything.... not the first time I've gotten email for the wrong Nate Finch, but first time actual legal documents have been mailed to me that way.
<voidspace> natefinch: nice. Anything interesting?
<mgz> natefinch: what's your new job?
<natefinch> voidspace, mgz: http://centricconsulting.com/
<natefinch> Looks kinda boring, actually
<perrito666> natefinch: ah I get the fanmail of a famous chilean singer whose gmail acct is the same words as mine but reversed :p its fun
<perrito666> natefinch: just answer telling them that paychecks should be sent to your address too
<natefinch> Haha, nice
<nessita> hello! re-pasting a question I just posted in # juju
<nessita> 11:54 < nessita> hello everyone, I'm having an issue with juju deploy that is showing up since I updated juju-core to 1.18. Error is: "only charm store charm references are supported, with cs: schema" when deploying the solr-jetty charm
<nessita> 11:54 < nessita> more debug output in https://pastebin.canonical.com/107885/
<nessita> 11:54 < nessita> any ideas?
<nessita> 11:55 < nessita> perrito666, would you know who can help me with that ^?
<perrito666> nessita: no clue about, sorry
<nessita> perrito666, thanks
<nessita> perrito666, any idea what could help me debug further? I'm happy to file a bug, but I'd also need, ideally, a workaround or instructions to fix that
<rogpeppe> natefinch: are you in a hangout?
<natefinch> rogpeppe: nope
<mgz> nessita: can you rerun with --debug? also, can you verify the version of juju *deployed*, not just the local copy?
<nessita> mgz, hi! this is the --debug  output https://pastebin.canonical.com/107892/
<nessita> mgz, what do you mean with the juju version deployed?
<rogpeppe> natefinch: let's have a chat in a little bit - there's an issue that would benefit from a couple of people thinking about it, i think
<rogpeppe> natefinch: 10 minutes, maybe?
<natefinch> rogpeppe: sure
<mgz> nessita: as in, ssh to machine 0 and look at what juju it has, will be in the log in /var/log/juju/... I'd think
<nessita> mgz, would that be "juju ssh" or plain ssh?
<mgz> addCharmViaAPI looks correct to me, but something must be up
<mgz> nessita: `juju ssh 0` is easiest, but anything that works
<nessita> mgz, while I ssh in, this is machine0.log: https://pastebin.canonical.com/107893/
<mgz> ta
<nessita> $ juju ssh 0
<nessita> Permission denied (publickey,password).
<nessita> ERROR rc: 255
<mgz> nessita: can you look for references to your charm in the other juju logs, sibling to that (the version seems fine...)
<mgz> nessita: ah, yeah, juju ssh is still probably borked for the local provider
<mgz> so, just look at the logs in situ :)
<nessita> on it
<mgz> hadn't twigged it was local, so no version mismatch possibilities
<natefinch> anyone else see a problem destroying local environments?   I'm getting this on trunk:
<natefinch> $ juju destroy-environment local -y
<natefinch> ERROR exec ["stop" "--system" "juju-db-v2"]: exit status 1 (stop: Method "Get" with signature "ss" on interface "org.freedesktop.DBus.Properties" doesn't exist)
<nessita> mgz, no result when grepping for "solr" inside the ~/.juju/local/log folder
<jam> natefinch: that looks like your dbus is hosed, since I think it is trying to send a signal to stop a machine, and DBus doesn't think that interface exists.
<jam> Sounds (to me) like an incomplete dbus upgrade
<natefinch> jam: could be, I just updated yesterday and rebooted for it this morning
<natefinch> jam: I'll rerun update/upgrade
<jam> natefinch: I'm pretty sure we don't call out to dbus directly
<nessita> mgz, in the mean time I filed the bug LP: #1303880
<jam> that appears to be "stop" not talking to upstart correctly.
<jam> bug #1303880
<_mup_> Bug #1303880: After upgrade to 1.18, can not longer deploy the solr-jetty charm using local provider <juju-core:New> <https://launchpad.net/bugs/1303880>
<mgz> nessita: thanks
<mgz> the workaround is a simple downgrade I guess
<nessita> mgz, right. Let me know if I can get any extra debug info it may be useful
<natefinch> jam: update/upgrade didn't help, I'll try a reboot, see if that shakes anything out
<nessita> mgz, FYI, seems like the deploy of local charms is somehow broken, because noodles775 tried to deploy another charm (elasticsearch) and it failed too
<mgz> nessita: yeah, this is long before anything charm specific
 * nessita edits the bug summary
<mgz> nessita: can you try cding to the --repository location, and specifying just local:precise/CHARMNAME?
<nessita> mgz, trying
<mgz> pretty sure issue is after the cmd parsing and url expansion, but simple to verify
<nessita> mgz, just to verify, I should be in ./../.juju-repo not in ./../.juju-repo/precise, right? also, would this be the correct command?  juju deploy -e local local:precise/solr-jetty
<mgz> nessita: yup, that's it
<mgz> expecting the same error
<nessita> mgz, hum, I got Added charm "local:precise/solr-jetty-1" to the environment.
<mgz> ha, well, that works then
<mgz> next step: work out if it was the --repository or the charm url expansion that's borked
<nessita> mgz, anything I should do in that front?
<mgz> you can remove the service, then try either the cd and no --repository, or the local:CHARMNAME and see which works and which fails
<nessita> ack
<natefinch> rogpeppe: want to talk?  I have to go in 15 minutes for lunch
<rogpeppe> natefinch: https://plus.google.com/hangouts/_/canonical.com/juju-core-team?authuser=1
<nessita> seems like the charm url change makes it work:
<nessita> $ juju deploy -e local --repository=./../.juju-repo local:precise/solr-jetty solr-jetty
<nessita> Added charm "local:precise/solr-jetty-1" to the environment.
<nessita> (that was run from a location outside the repository folder)
<mgz> nessita: thanks
<nessita> np
<dimitern> vladk, i'll change https://codereview.appspot.com/84850045/ as you suggest
<dimitern> vladk, can I have an LGTM? :)
<vladk> dimitern: done
<dimitern> vladk, thanks!
<voidspace> rogpeppe: spectacularly failed to get anything useful committed today. Off to Krav maga, may have another stab on my return.
<voidspace> rogpeppe: in case I don't, branch is: https://code.launchpad.net/~mfoord/juju-core/wrapsingletonworkers/+merge/214208
<rogpeppe> voidspace: thanks
<voidspace> rogpeppe: rename done, only requires a test
 * voidspace hangs head
<voidspace> right, EOD folks
<voidspace> and EOW
<voidspace> off to PyCon tomorrow
<rogpeppe> voidspace: have fun!
<voidspace> back to work a week on Friday
<voidspace> rogpeppe: thanks, hope so
<voidspace> language summit first
<stokachu> sinzui: do you have a wiki page or a tentative eta for 1.18, we are blocked on bug 1299588
<_mup_> Bug #1299588: LXC permission denied issue with 1.17.7 <cloud-installer> <landscape> <lxc> <micro-cluster> <regression> <juju-core:Fix Committed by wallyworld> <juju-core 1.18:Fix Released by wallyworld> <juju-core (Ubuntu):Confirmed> <https://launchpad.net/bugs/1299588>
<sinzui> stokachu, 1.18 was released Saturday
<sinzui> stokachu, It is in the juju PPA, Ubuntu is packaging it now
<stokachu> ok must not be in trusty archive yet
<stokachu> gotcha
<stokachu> sinzui: thanks for the update
<sinzui> stokachu, yep, not in the archive yet. I am polling to check for the arm64 and ppc ports
<stokachu> sinzui: ok cool, i didnt see it in the proposed queue
<jam> sinzui: fwereade: jamespage bug #1303697 is this a release blocker bug?
<_mup_> Bug #1303697: peer relation disappears during upgrade of juju <juju-core:Triaged> <https://launchpad.net/bugs/1303697>
<jam> It feels like if this is how it worked in the past, then it wouldn't be a blocker
<jam> if we are changing behavior that used to work, then we should focus on it.
<sinzui> jam: I think it is. If not, then I would lower the importance to High
<jamespage> jam: I don't think so - its a pita
<fwereade> jam, sinzui: I fear it may be, it certainly should not have acted like that, I failed to figure it out earlier but will take another look now
<jamespage> jam: I can't say whether this used to work or not
<jam> fwereade: so if you feel we know we shouldn't have acted that way, then I'm ok with it being critical
<jamespage> this is a new use of the peer relation in the keystone charm
<sinzui> jam: actually, we have released. Maybe we want to target it to 1.18.1
<jam> sinzui: well this is targetted against 1.19
 * sinzui created the milestone a few hours ago just in case
<jam> but it should target 1.18.1 if it is actually Critical
<sinzui> okay jam, if we agree it is critical It goes to 1.18.1, if not we lower the importance.
<jam> sinzui: correct
<fwereade> jamespage, unless you upgraded the charm to add the peer relation immediately before upgrading juju, that would be a different story, but I'm pretty sure I saw the logs joining cluster:3 some time before
<jam> fwereade: sinzui, jamespage: bug #1303735 feels more like it should be Critical, blocking 1.19, and backported to 1.18.1
<_mup_> Bug #1303735: private-address change to internal bridge post juju-upgrade <openstack-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1303735>
<fwereade> jamespage, it would still be a sucky story but atm it looks like a regression to me
<fwereade> jam, that depends on whether it's a regression -- jamespage seemed uncertain, I don't have any specific input there
<jam> fwereade: well, we changed what private address we are reporting, from a routable-private address to a fully hidden one, (from what I can see)
<jam> fwereade: bug #1302205
<_mup_> Bug #1302205: manual provisioned systems stuck in pending on arm64 <add-machine> <hs-arm64> <manual-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1302205>
<jam> It feels like probably not a regression
<jam> just that we aren't supporting arm64 well yet
<jam> sinzui: I've been opening up a 1.19.1 for "things which we need for the next release, but don't have to block getting a dev snapshot along the way"
<jam> does that seem ok for you?
<sinzui> jam +1
<mramm2> For those of you who like positive feedback: http://www.reddit.com/r/Ubuntu/comments/22ehsz/ubuntu_maas_and_juju_wow_im_impressed_what_are/
<mramm2> they are loving the juju and MAAS
<mramm> also if you think it is useful stuff, please feel free to upvote it ;)
<jam> fwereade: bug #1208430, jamespage is this something we should be spending cycles on now?
<_mup_> Bug #1208430: mongodb runs as root user <mongodb> <juju-core:Triaged by natefinch> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1208430>
<jam> I don't think natefinch actually is working on it
<jam> (since we need to create a user, get mongo running as it, etc)
<fwereade> jam, yeah, agree re 1302205
<jamespage> jam: I think we could defer that for now
<natefinch> jam: yes, I am not working on that currently
 * natefinch just got back from a longer than expected lunch
<jam> fwereade: so bug #1303697. If we want it to be critical for 1.19, we need to assign it to someone. Care to nominate ?
<_mup_> Bug #1303697: peer relation disappears during upgrade of juju <juju-core:Triaged> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1303697>
<fwereade> jam, if dimitern has any spare cycles I think he did stuff around there semi-recently... my perception is that vlan is making reasonable progress?
<jam> fwereade: it is, but he's currently on the critical path to getting VLAN completed.
<jam> so if this should block 1.19.0, I'd rather give it to someone else
<fwereade> jam, indeed so -- I mention him only because he's the only person with recent experience there
<mramm> How about andrew, ian or tim?
<fwereade> jam, can we ... yeah, what mramm said :)
<jam> fwereade: mramm: other than I don't control those guys :)
<fwereade> jam, sure; they *are* doing important stuff, but also explicitly doing bugs as well
<fwereade> jam, (depending on how my evening goes) I will likely see if I can make it thumper's problem
<jam> fwereade: sgtm
<stokachu> sinzui: just saw it show up in proposed queue
<stokachu> sinzui: for all the archs
<sinzui> thank you stokachu
<stokachu> np
 * rogpeppe is done for the day
<rogpeppe> g'night all
<natefinch> night rog
<dimitern> mgz, fwereade, natefinch, in case you're still here I'd like a review on https://codereview.appspot.com/85060043
<natefinch> dimitern: I'll review later if I have time, really trying to get HA landed
<dimitern> natefinch, sure, np, if you can
 * dimitern reached eod
<perrito666> natefinch: o cmon, we all know you are watching a DVD :p
<natefinch> rofl
<natefinch> well, I mean, I am *now*.
<perrito666> natefinch: a long, long, long, time ago, debian solved that problem by having US and non-US repos, dunno how legal that was
<natefinch> perrito666: doesn't that still mean the US people are screwed? :)
<natefinch> (for some value of screwed that just requires a quick google... but still)
<perrito666> natefinch: as you can imagine, I never encountered said problem
<perrito666> :p
<natefinch> heh
<perrito666> natefinch: but hey, that solved the problem for all but a portion of the world
<natefinch> I wish our copyright laws weren't so draconian. It's really ridiculous that you need to pay for the right to play the DVD you already paid for, and that getting around what is basically just an encoding counts as breaking the law.
<perrito666> natefinch: Is 25 bucks too much for a dvd player? (I have no idea what is the cost of a windows one)
<natefinch> perrito666: windows comes with a dvd player, nothing to pay for (after you pay for windows)
<perrito666> natefinch: ahh true, I used one called power something, bundled with my discrete videocard
<perrito666> which, iirc, had some hardware decoding feats
<natefinch> ahh, yeah, probably Cyberlink's PowerDVD ... it gets bundled with some OEMs, and comes with some players etc etc but you don't really need it.  It used to be that you needed it more for burning DVDs, but eventually windows actually built that into the OS
<perrito666> natefinch: ah, well I switched from win to osx around 2004 and never came back from *nix, living on a country where you cant patent software has some good things :p
<natefinch> perrito666: and yes, $25 is too much for something that is a solved problem with open source software.... plus the reviews say the software isn't that great.
<natefinch> perrito666: don't even get me started on software patents.... hopefully in my lifetime they'll go away.
<perrito666> natefinch: you just need to move away :p this country has all the other flaws but we can watch dvds :p
<natefinch> haha
<perrito666> ironically we can legally use bittorrent too so that makes the DVD part kind of useless
<natefinch> I can watch DVDs too, I just have to break a law that no one enforces anyway
<natefinch> I can legally *use* bittorrent. :)
<perrito666> natefinch: heh
<natefinch> I just can't legally watch movies I haven't paid for.  So I don't legally watch them :)
<perrito666> heh, I can pay for movies but cannot legally get them delivered via regular mail without an equal chance of either getting them stolen or having to pay 50% import tax :p
<thumper> hazmat: hey
<thumper> hazmat: I'm just going to go make breakfast
<thumper> hazmat: but hit me up on PMs for demo issues
<thumper> I'll be back later
<hazmat> thedac, ack
<hazmat> thumper, ack
<waigani> morning all
<natefinch> morning
<waigani> I'm trying to work out why gccgo does not want to use the local package. "unexpected reference to package", "reference to undefined identifier 'local.Provider'":  http://pastebin.ubuntu.com/7218823/
<natefinch> waigani: is there a local variable clashing with an imported package name or something?
<waigani> natefinch: ah, true I'll have a hunt
<natefinch> waigani: that first error looks like you're trying to print out a variable called local.. but there's a package called local (you can't print out packages)
<waigani> natefinch: that is my newbie mistake. Thanks.
<natefinch> waigani: no problem, happens :)
<waigani> natefinch: though that debug line does at least confirm that it is a reference to a package, which rules out a variable clash right?
<natefinch> waigani: yeah.... it looks like the problem is that there is no variable called local.  There's a package called local.  Whatever variable you're intending to reference is either not defined or called something else
<waigani> natefinch: right. So the real question is why local.Provider is referencing an undefined identifier.
<natefinch> waigani: can you just pastebin the whole file? I can take a look
<waigani> natefinch: when I run "provider/local$ go test -gocheck.f TestOpenFailsWithProtectedDirectories" I get this: http://pastebin.ubuntu.com/7218823.
<waigani> natefinch: here is the first file in the error stack: http://pastebin.ubuntu.com/7218888/
<natefinch> waigani: the first error references a log line on line 36 that doesn't exist in that file.... which is weird.  try blowing away $GOPATH/pkg and then run the test again.  It's possible something weird got stuck in there (it's happened to a few people on the team lately)
<waigani> natefinch: sorry, my bad. I just took out my debug line, so it is referencing 37
<waigani> natefinch: valid, err := local.Provider.Validate(testConfig, nil)
<natefinch> waigani: local.Provider is defined in provider/local/export_test.go, in the var block at the top of the file.  Does that exist in your branch?
<waigani> natefinch: yes
<natefinch> That's the thing that it says is undefined, which is weird if it's there
<waigani> natefinch: I'm testing on gccgo on a ppc vm. I wonder if I need to make or build anything for gccgo to recognise the package?
<natefinch> it's a normal test file, it shouldn't have any special requirements except "running during go test"
<natefinch> waigani: it's still probably worth trying to wipe out your $GOPATH/pkg directory.  Some intermediate files can get stuck in there sometimes, and usually the symptom is symbols not reflecting what is in the code files, like this
<waigani> natefinch: okay I'll do that now
<waigani> natefinch: no luck :(
<natefinch> dang
<waigani> natefinch: here is my wip: https://codereview.appspot.com/84360043/
<waigani> you might get the same result on your machine if you run " go test --compiler=gccgo ..."
<natefinch> waigani: does this only fail with gccgo?
<waigani> natefinch: to the best of my knowledge, yes
<natefinch> ahh ok.   It must be a gccgo bug, then
<natefinch> yeah, it definitely passes on my machine with gc
<waigani> right, I hit another bug on my machine, but I think that is because of my perms
<waigani> as explained in my wip
<natefinch> So, let me explain what that code is doing... local is the package name, Provider is a public variable in that package, in a _test.go file, which means it's only compiled during tests.   What it's doing is exposing access to a private variable, by assigning it to a public variable (but since it's only available during tests, it's not hurting anything)
<waigani> natefinch: yep
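The pattern natefinch is describing, sketched with file boundaries in comments (names illustrative):

    // --- provider/local/local.go (normal build) ---
    package local

    // providerInstance is unexported: code outside the package can't see it.
    var providerInstance = "the real provider"

    // --- provider/local/export_test.go (compiled only by `go test`) ---
    package local

    // Provider re-exports the private value for the package's own tests;
    // ordinary builds never include _test.go files, so the symbol leaks
    // nowhere else.
    var Provider = providerInstance

    // --- provider/local/local_test.go (external test package) ---
    package local_test

    import (
    	"testing"

    	"launchpad.net/juju-core/provider/local"
    )

    func TestProviderVisible(t *testing.T) {
    	if local.Provider == "" {
    		t.Fatal("expected the re-exported provider")
    	}
    }

A compiler that skipped export_test.go during test compilation would produce exactly the "undefined identifier" error waigani sees, which is where natefinch's suspicion heads below.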
<natefinch> It seems like gccgo is somehow simply not compiling that file
<waigani> hmmm
<waigani> you're welcome to poke around the vm if you're keen
<waigani> natefinch: if you run with gccgo compiler on your machine, do you get the same error?
<natefinch> possibly because it has no test methods in it?  I'm not sure.  Try slapping a TestFoo(t *testing.T) {} method in provider/local/export_test.go
 * natefinch installs gccgo
<natefinch> huh, weird, no, it passes for me with gccgo 4.9 on your branch
<waigani> hmph
<natefinch> one of the prereqs tests fails, but I think that's an environment problem on my machine, not really a test failure
<waigani> natefinch: I've found with a few issues, that they are only reproducible on ppc arch
<natefinch> waigani: we're getting paid a lot of money for this, right? :)
<waigani> natefinch: to get juju working on ppc? I believe so.
<natefinch> heh yes
<waigani> slapping on a vm with vi is painfully slow
<natefinch> waigani: in provider/local/export_test.go  add this line to the imports:
<waigani> ah man, same error with the added test: "reference to undefined identifier 'testing.T'"
<natefinch> coretesting "testing"
<natefinch> and then make the test:
<natefinch> func TestFoo(t *coretesting.T) {
<natefinch> 	t.Error("you should see this")
<natefinch> }
<waigani> natefinch: will do
<natefinch> (sorry, the testing package that is imported in that file is our testing package, not the standard library's)
<waigani> right, we might be onto something ... fingers crossed
<natefinch> you should see something like this:
<natefinch> --- FAIL: TestFoo-8 (0.00 seconds)
<natefinch> 	export_test.go:60: you should see this
<natefinch> if you *don't see that, then it's not compiling that file
<waigani> :(
<waigani> natefinch: no luck
<natefinch> so, do you see that output?
<waigani> natefinch: exactly the same original output
<waigani> erroring on local.Provider
<natefinch> so, either it's not compiling that file, or something is stuck in the precompiled binaries somewhere.  If you comment out that first validate line (line 36/37), does it at least not report that same error anymore?  That would be a good way to check that it's actually picking up changes in the source code
<waigani> natefinch: it printed out my debug line, so yes it is picking up changes
<natefinch> ok, well that's good at least
<natefinch> I can't imagine why it's not compiling export_test.go .... that's bizarre
<waigani> let me try some other tests ...
 * fwereade is an idiot -- he just coded a fix that would work perfectly if he could retroactively apply it to 1.17.7 :-/
<natefinch> I gotta run, unfortunately.  It's past end of day for me.  Try renaming export_test.go to foo_test.go or something.  See if that does anything
<natefinch> fwereade: doh
<waigani> natefinch: thanks for your help man
<fwereade> and all the shops are closed, too
<jcw4> lbox propose insists on using sensible-browser for the oauth stuff, which in a non-gui environment doesn't work for me... any clues?
<fwereade> ... *and* it actually can't be fixed cleanly unless we can guarantee state servers upgrade before unit agents
<fwereade> dammit
<waigani> heh
<jcw4> fwiw, I just hacked lpad/oauth.go to immediately return "", nil to force out of band auth
<waigani> thumper: when I pull down my branch to test it on the ppc vm I put it here: ~/go/src/launchpad.net/mybranch
<thumper> otp
<waigani> thumper: otp?
<thumper> on the phone
<waigani> ooooh
<waigani> thumper: putting it there seems to have caused the error I was debugging all morning.
<waigani> thumper: when I merged ~/go/src/launchpad.net/mybranch with ~/go/src/launchpad.net/juju-core and ran tests I no longer got the error
<jcw4> any help for this one?  I have a cobzr branch of juju-core.  identical except for  a few revisions in worker/uniter/charm/.   go build ./... works in master.   fails in my branch.   ran godeps -u in both; dependencies.tsv is identical in both
<jcw4> all the failures are in the provider/azure/environ.go which was fixed in master by using godeps
<jcw4> looks like I had to switch to the working branch, do a go get -u ./..., then do a godeps -u dependencies.tsv, and then I was finally able to build
<sinzui> wallyworld, thumper I broke canonistack testing. probably by removing admin-secret and control-bucket from the config: http://pastebin.ubuntu.com/7219300/
<sinzui> ^ Do I need those keys? Do I need to put them back as they were?
<wallyworld> sinzui: control bucket is no longer needed in the yaml
<wallyworld> it will be generated on the fly
<sinzui> wallyworld, thumper if there is cruft left behind, I would prefer to nuke it
<sinzui> wallyworld, okay, I understood that much
<wallyworld> i think the same applies to admin secret
<wallyworld> so, you should just nuke what's there and start again
<wallyworld> it should then create a new control bucket and away it goes
<sinzui> wallyworld, HP is not broken, I removed admin-secret and public-bucket from it too
<wallyworld> hmmm. so you must still have the jenv file then
<wallyworld> cause when bootstrap first happens, the jenv file gets created and the yaml file is then redundant
<wallyworld> when you say removed from config, do you mean yaml file, jenv, or actual environ config using juju unset?
<sinzui> wallyworld, There is no jenv file
 * sinzui checks nova
 * wallyworld has to take something to the car, back in a minute
<sinzui> wallyworld, nova doesn't show any machines.
<wallyworld> sinzui: so you are trying to bootstrap a new env from scratch?
<sinzui> wallyworld, no, one that existed in the last round of tests
 * sinzui tries a rename
<wallyworld> but there is no jenv file
<wallyworld> hence it will create a new control bucket et al
<wallyworld> if the control bucket has been deleted from yaml
<sinzui> wallyworld, more interesting, new env name and same error.
 * sinzui looks at bigger diff
<wallyworld> does sound like a permissions issue
<wallyworld> if control bucket can't be created
<sinzui> wallyworld, I just restored the config and got nothing. Maybe I am having a panic attack...canonistack is actually tits up
<wallyworld> oh
<wallyworld> at least that explains it. canonistack does seem to be way overcommitted
<sinzui> wallyworld, I cannot bootstrap as myself it seems
<wallyworld> i'll try also
<sinzui> wallyworld, looks like the dashboard was setup for lcy02. It cannot list containers.
<wallyworld> hmm, ok
<davecheney> https://codereview.appspot.com/85100044
<davecheney> quick review
<davecheney> fixes test explosion if you're missing mongod
<davecheney> waigani: maybe one for you
<waigani> davecheney: why do you no longer need to zero set inst.addr etc?
<davecheney> waigani: 'cos we return an error
<waigani> davecheney: so why do you need to remove inst.dir ?
<davecheney> 'cos we're cleaning up our failure
<waigani> so why not remove addr etc to clean up the failure?
<davecheney> 'cos the policy is once you return an error
<davecheney> you cannot make any assumptions about the state of the value
<davecheney> honestly if this is going to be a sticking point
<davecheney> i'll put those lines back
<davecheney> it doesn't make any difference
#juju-dev 2014-04-08
<waigani> davecheney: I'm new as you know, so I'm asking to understand
<davecheney> sure
<davecheney> basically the rule is
<davecheney> if a function/method returns an error
<davecheney> you generally cannot make any assertions about the state of any other values it returns or the instance itself
<davecheney> *generally*
<davecheney> there are exceptions
<davecheney> so, there is no point in cleaning up the internal state of that instance as it is broke
<davecheney> there is value in cleaning up the files we poo'd on disk
<davecheney> because that is something we can do to make it possible to run the test again later
<waigani> ah okay
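davecheney's rule, as a runnable sketch (hypothetical types, not the actual container code): on error, undo external side effects like files on disk, but don't bother tidying the value's fields, since callers must not use it after a non-nil error.

    package main

    import (
    	"fmt"
    	"io/ioutil"
    	"os"
    )

    type instance struct {
    	dir  string
    	addr string
    }

    // start cleans up the directory it created when boot fails, but makes
    // no attempt to zero inst's fields: once an error is returned, the
    // value is dead to the caller anyway.
    func start(base string) (*instance, error) {
    	dir, err := ioutil.TempDir(base, "inst")
    	if err != nil {
    		return nil, err
    	}
    	inst := &instance{dir: dir}
    	if err := boot(inst); err != nil {
    		os.RemoveAll(dir) // external state: clean it so tests can re-run
    		return nil, err   // internal state: simply abandoned
    	}
    	return inst, nil
    }

    func boot(inst *instance) error {
    	return fmt.Errorf("mongod not found") // simulate the failure mode
    }

    func main() {
    	_, err := start(os.TempDir())
    	fmt.Println("start failed as expected:", err)
    }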
<davecheney> waigani: this ticket, https://bugs.launchpad.net/juju-core/+bug/1299969
<_mup_> Bug #1299969: launchpad.net/juju-core/provider/manual: ssh tests are not properly isolated <ppc64el> <juju-core:Triaged> <https://launchpad.net/bugs/1299969>
<davecheney> is a blocker on getting the common and manual provider tests working
<davecheney> do you think you could add it to your todo list ?
<waigani> davecheney: sure. I actually hit that problem and thought I was being stupid because I did not have keys on the vm
<davecheney> waigani: right, but the solution is not to add keys to the vm
<davecheney> that will just push the problem off onto someone elses' plate
<waigani> davecheney: yeah I get that now :)
<davecheney> the goal is we want to run these tests during the build on the lp build servers
<davecheney> which are scrupulously clean
<waigani> right
<davecheney> so any screwing with the environment beforehand will have to be expunged
<waigani> davecheney: provider/local is now passing :) proposing my branch now
<perrito666> sinzui: nite, question, am I supposed/allowed to close the bug and also, could you open the new one with the new bug info? (such as the exit of jenkins)
<davecheney> waigani: sweet
<sinzui> perrito666, I will do that
<davecheney> thumper: waigani wallyworld http://paste.ubuntu.com/7219526/
<davecheney> i'd like to log a bug about this
<sinzui> perrito666, The bug is fix committed. It cannot be closed until the milestone is released/a package is distributed to the users.
<davecheney> i see far too many of these errors in the log when working with the local provider
<davecheney> i think it's more than a harmless warning
 * sinzui reports the new bug
<thumper> davecheney: which bit exactly?
<perrito666> sinzui: tx
<wallyworld> yeah, doesn't seem right
<davecheney> thumper: 2014-04-08 00:35:07 DEBUG juju.worker.logger logger.go:45 reconfiguring logging from "<root>=DEBUG" to "<root>=WARNING;unit=DEBUG"
<davecheney> ^ what generates this line
<thumper> davecheney: the logging config worker
<thumper> davecheney: that is the default logging level
<davecheney> thumper: http://paste.ubuntu.com/7219530/
<thumper> that is the logging level specified in state
<davecheney> thumper: this is the repro
<davecheney> ok
<davecheney> /dev/sda        9.9G  7.6G  1.9G  81% /
<davecheney> 10gb vm isn't really enough to debug on ...
 * davecheney waves to axw 
<davecheney> welcome to daylight saving
<thumper> davecheney: do you have a question for me?
<axw> davecheney: howdy
<axw> no daylight savings over here
<davecheney> thumper: how can I set the logging back to debug everywhere ?
<davecheney> i'm trying to get more details on issue 1303787
<thumper> davecheney: "juju bootstrap --debug" or "juju bootstrap --logging-config='juju=debug'"
<thumper> davecheney: or... if you want more
<davecheney> thumper: can't change it after the fact ?
<thumper> davecheney: yes
<davecheney> config-set ?
<thumper> juju set-env logging-config=juju=debug
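For reference, those spec strings are loggo specifications; the logging config worker applies them roughly like this (a sketch; the reset-before-reapply step is an assumption about the desired behaviour, and the import path in this era was launchpad.net/loggo):

    package demo

    import "github.com/juju/loggo"

    // applySpec applies a logging specification such as
    // "<root>=WARNING;unit=DEBUG" to all loggers.
    func applySpec(spec string) error {
        loggo.ResetLoggers() // drop levels left over from a previous spec (assumed desirable)
        return loggo.ConfigureLoggers(spec)
    }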
<axw> oh ffs unity
<davecheney> thumper: thanks
<davecheney> ok, environment setup, now apparently we wait 7 hours ...
<thumper> davecheney: I have been talking about that issue
<thumper> davecheney: how much memory does your VM have?
<davecheney> 8 gb
<thumper> ok... perhaps that'll be fine
<thumper> 24 was bad, 16 was good
<davecheney> hmm, that is odd
<thumper> I'm wondering if there is a fundamental issue elsewhere
<thumper> as in, not us
<davecheney> more likely
<thumper> but yes, keep an eye on it
<davecheney> i don't doubt there are bugs in the compiler
<davecheney> that is why i need machine-0.log
<davecheney> all machines via rsyslog is eating the stack trace
<davecheney> thumper: could you make sure if there is any conversation about that issue that you cc me
<davecheney> i've been trying to get everyone to communicate via the issue
<thumper> the original machine that was a problem has now got 16gig of ram and a new kernel
<thumper> davecheney: it was all voice conversation
<davecheney> but am currently managing three independent email threads by three separate groups working the problem
<thumper> I was chatting with hazmat about it
<thumper> huh...
<thumper> that isn't fun
<davecheney> thumper: i've done more funner things
<thumper> :)
<davecheney> i know this issue is critical
<davecheney> but saying it's critical isn't enough
<davecheney> we need details
<axw> fairly trivial review anyone? deleting 1.16 compat code: https://codereview.appspot.com/84520046/
<tvansteenburgh> davecheney: re http://pastebin.ubuntu.com/7219597/
<tvansteenburgh> (re https://bugs.launchpad.net/juju-core/+bug/1303787)
<_mup_> Bug #1303787: hook failures - nil pointer dereference <hooks> <local-provider> <ppc64el> <juju-core:Incomplete by dave-cheney> <https://launchpad.net/bugs/1303787>
<tvansteenburgh> i'm guessing that doesn't help much :P
<thumper> tvansteenburgh: have you encountered issues since the vm was resized?
<davecheney> tvansteenburgh: the command you want is
<davecheney> dpkg -l | grep gccgo
<davecheney> sorry
<davecheney> dpkg -l | grep gccgo
<tvansteenburgh> thumper: honestly i don't know for sure as i haven't been on the machine today, but from convos on irc i've inferred that there are still probs
<davecheney> tvansteenburgh: can you get me the machine-0.log file from the machine
<davecheney> tvansteenburgh: or
<davecheney> better
<davecheney> can you do ssh-import-id dave-cheney
<davecheney> and I can do it myself
<tvansteenburgh> davecheney: dpkg -l | grep gccgo prints nothing
<tvansteenburgh> davecheney: imported your key
<davecheney> tvansteenburgh: is this on a ppc64el machine ?
<tvansteenburgh> yes
<davecheney> tvansteenburgh: this is very worrying
<davecheney> what you are telling me doesn't add up
<davecheney> the example from the weekend showed that juju was compiled locally
<davecheney> but you're saying that there is no compiler on this machine
<tvansteenburgh> don't shoot the newbie :)
<davecheney> tvansteenburgh: sorry mate
<tvansteenburgh> i didn't install juju on this machine so i'm not sure what to tell you
<tvansteenburgh> but keep in mind that juju was upgraded since the original bug report
<davecheney> tvansteenburgh: ok
<tvansteenburgh> so it's possible that it was from source before, and now pre-compiled binary
<davecheney> i'll check out the machine myself
<davecheney> tvansteenburgh: are you going to be doing the demo, or hazmat ?
<tvansteenburgh> hazmat
<davecheney> roger
<thumper> davecheney: no, hazmat ;-P
 * thumper snickers at his joke
 * hazmat hides
<davecheney> tvansteenburgh: what is the internal ip of wolfe-01
<davecheney> i have the wrong config
<tvansteenburgh>  10.245.66.193
<davecheney> ta
<davecheney> tvansteenburgh: thanks, i'm grabbing the log files now
<hazmat> thumper, local provider detects btrfs and just uses it if it's at /var/lib/lxc ?
<thumper> yup
<hazmat> thumper, awesome
 * hazmat gives it a whirl
<thumper> hazmat: will still take time to create the template first time around
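Detecting btrfs at a path, as described above, can be done with statfs; a minimal Linux-only sketch (the constant is the btrfs superblock magic from linux/magic.h, not a juju API):

    package demo

    import "syscall"

    const btrfsSuperMagic = 0x9123683e // BTRFS_SUPER_MAGIC

    // isBtrfs reports whether path lives on a btrfs filesystem.
    func isBtrfs(path string) (bool, error) {
        var buf syscall.Statfs_t
        if err := syscall.Statfs(path, &buf); err != nil {
            return false, err
        }
        return uint32(buf.Type) == btrfsSuperMagic, nil
    }

A caller would check isBtrfs("/var/lib/lxc") and pick the btrfs-backed container factory if it returns true.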
<thumper> wallyworld: got 10 minutes? I need help solving a tools issue
<wallyworld> sure
 * thumper starts a hangout
<thumper> https://plus.google.com/hangouts/_/7acpj3v9ejfcghmeigfq41jfc4?hl=en
<hazmat> thumper, understood, i updated the lxc cache image directly as well
<hazmat> well that's cute.. juju set-env random-string="abc"  just works..
<hazmat> and is retrievable via juju get-env
<davecheney> http://paste.ubuntu.com/7219672/
<davecheney> grep -c panic
<davecheney> eeerk
<davecheney> sinzui: where is the packaging branch for juju-core ?
<davecheney> tvansteenburgh: i'd take a punt and say wolfe-01 got nuked and recreated
<davecheney> that is why there is no compiler
<davecheney> nuke; recreate; [sudo] apt-get install juju-core
<tvansteenburgh> davecheney: yeah i think that /is/ what happened
<davecheney> tvansteenburgh: that makes sense then
<davecheney> jamespage: where is the packaging branch for juju-core ?
<waigani> thumper: wallyworld: axw: my internet connection is crapping out on me
<wallyworld> yay for NZ internet
<thumper> works for me
<hazmat> thumper, davecheney so good news.. i've run through the demo about a dozen times.. zero panics
<thumper> \o/
<hazmat> thumper, and using btrfs :-)
<thumper> on power?
<hazmat> thumper, yup
<thumper> awesome
<thumper> how do I find out which package provides an executable again?
<thumper>  wallyworld https://codereview.appspot.com/85220043
<wallyworld> looking
<thumper> now this works
<thumper> for some value of works
<thumper> it tries to create a trusty container and fails
<thumper> for other reasons
<thumper> I'm asking on #ubuntu-server for answers to those reasons
<davecheney> ubuntu@winton-02:~$ juju status
<davecheney> ERROR failed verification of local provider prerequisites: exec: "mongod": executable file not found in $PATH
<davecheney> MongoDB server must be installed to enable the local provider:
<davecheney> sudo apt-get install mongodb-server
<davecheney> still telling me to install mongodb-server
<davecheney> did that branch land ?
<thumper> not yet
<davecheney> ok
<davecheney> https://codereview.appspot.com/85150044
<davecheney> trivial fix
<jam> davecheney: LGTM, all the other files in that directory use the bson: syntax
<davecheney> jam: that is why i feel pretty confident that the fix is ok
<jam> davecheney: me too
<davecheney>      └─jujud─┬─lxc-create───lxc-ubuntu-clou───ubuntu-cloudimg───wget
<davecheney>      │       └─11*[{jujud}]
<davecheney> what the balls is it downloading
<davecheney> and is there a way to speed it up
<jam> davecheney: if you are using LXC, it has to download the Ubuntu image for the LXC instance
<davecheney> it's hung up trying to download some small releases text file from cdimages
<davecheney> ubuntu@ip-10-251-8-60:~$ juju deploy ubuntu --debug
<davecheney> 2014-04-08 05:54:20 INFO juju.cmd supercommand.go:296 running juju-1.19.0-trusty-amd64 [gccgo]
<davecheney> 2014-04-08 05:54:20 DEBUG juju api.go:189 trying cached API connection settings
<davecheney> 2014-04-08 05:54:20 INFO juju api.go:259 connecting to API addresses: [localhost:17070 10.0.3.1:17070]
<davecheney> 2014-04-08 05:54:20 INFO juju.state.api apiclient.go:194 dialing "wss://localhost:17070/"
<davecheney> 2014-04-08 05:54:20 INFO juju.state.api apiclient.go:141 connection established to "wss://localhost:17070/"
<davecheney> ^ hung
<davecheney> hang on
<davecheney> why is it talking to localhost ?
<davecheney> the api server is running on 10.0.3.1
<davecheney> axw: wasn't there a bug logged about this over the weekend ?
 * axw looks up
<axw> um
<axw> davecheney: localhost is the "public address" for the local provider's machine-0
<axw> 10.0.3.1 is the internal
<axw> davecheney: is there a problem with that?
<davecheney> axw: not sure
<davecheney> there will be when proxies are involved
<davecheney> we always say
<davecheney> no_proxy="10.0.3.1"
<davecheney> not no_proxy="localhost"
<axw> no_proxy=localhost only makes sense if you're outside the environment
<axw> i.e. using the CLI
<axw> I suppose we could manage the CLI's environment specifically for the local provider
<davecheney> axw: ok
<davecheney> nm
<davecheney> probably not the problem at the moment
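The no_proxy semantics in question can be checked against the standard library directly; a small sketch (hosts are from the discussion; note that Go's HTTP client never proxies "localhost" regardless of no_proxy, which is part of axw's point):

    package main

    import (
        "fmt"
        "net/http"
        "os"
    )

    func main() {
        // Set before any HTTP use: net/http caches the proxy environment.
        os.Setenv("http_proxy", "http://proxy.example:3128")
        os.Setenv("no_proxy", "10.0.3.1")

        for _, u := range []string{"http://10.0.3.1:17070", "http://10.0.3.2:17070"} {
            req, _ := http.NewRequest("GET", u, nil)
            proxy, _ := http.ProxyFromEnvironment(req)
            fmt.Println(u, "->", proxy) // nil means a direct connection
        }
    }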
<davecheney> 2014-04-08 06:07:56 ERROR juju.provisioner provisioner_task.go:438 cannot start instance for machine "3": error executing "lxc-start": command get_cgroup failed to receive response
<davecheney> any ideas ?
<davecheney> I wonder if it is related to https://bugs.launchpad.net/bugs/1304167
<_mup_> Bug #1304167: syntax error, trusty beta-2 cloud image <apparmor (Ubuntu):New> <https://launchpad.net/bugs/1304167>
<jam> davecheney: that sounds like nested-lxc issues, but I'm guessing you're not nesting?
<davecheney> nup
<davecheney> just rebooted
<davecheney> there was an apparmor snafu installing the lxc package
<davecheney> maybe that was the cause
<davecheney> http://paste.ubuntu.com/7220297/
<davecheney> nope, lxc is broken
<rogpeppe1> mornin' all
<rogpeppe1> i was just looking at this change: https://github.com/juju/testing/pull/3/files
<rogpeppe1> is there any way i can see the whole file diff on github?
<rogpeppe1> or do i have to pull it down to do that?
<fwereade> jamespage, jam: so, I think I've tracked down that relations-on-upgrade bug -- but I think (1) it only hits relations without other members, and (2) it's always existed
<fwereade> jamespage, were there any other units in that peer relation?
<fwereade> jamespage, jam: that's bug 1303697 ^^
<_mup_> Bug #1303697: peer relation disappears during upgrade of juju <juju-core:Triaged by fwereade> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1303697>
<axw> fwereade: I'm creating an instance.Placement structure for placement directives, but I'd like to leave SSH out of it because it only makes sense in add-machine (for now at least)
<axw> is that okay with you?
<axw> fwereade: http://paste.ubuntu.com/7220424/  -- currently adding a new method to state to add a machine with one of these
<fwereade> axw, I feel it's mildly suboptimal, because I think it *should* be reasonable to deploy --to ssh:blah -- but I can see how it would pull in a lot more work, so, fair enough
<fwereade> axw, about the kv pairs in environment placement... upsides and downsides
<axw> I'm pretty sure we can add it later if necessary
<fwereade> axw, and, hmm, I don't quite love the separation between ContainerPlacement/EnvironmentPlacement
<axw> fwereade: the idea was that a container placement can encapsulate any other, the others being terminals
<fwereade> axw, (1) I think the conception of lxc/kvm as pseudo-providers is quite nice; (2) I think that --to us-east-1a is nice, but that's not a kv pair
<axw> fwereade: so you could do lxc:maas:name=abc
<fwereade> axw, ha, that's an interesting thought
<fwereade> axw, I'm a little bit -1 on it -- I think it's ok that the domain of the lxc/kvm providers is restricted to existing machines
<fwereade> axw, I can see nice things about it but it also makes my spidey sense tingle a little
<axw> okay, I'll collapse them for now
<axw> fwereade: so just some kind of scope/provider and a value, and leave it to the environ to parse k=v if it wants to?
<fwereade> axw, yeah, I think that's the right separation of concerns
<axw> ok
<fwereade> axw, and it does leave the providers free to have us-east-1a for now, and later to allow zone=us-east-1a,somethingelse=somethingelse
<axw> yep, fair enough
<fwereade> axw, fwiw I think scope is a nice term for it
<axw> okey dokey well that simplifies the changes needed to state then
 * fwereade is getting really quite grumpy with the half-assed attempts to make the uniter work on multiple goroutines
<fwereade> axw, I don't suppose you know why we call u.proxy.SetEnvironmentValues() in updatePackageProxy, do you?
 * axw looks
<axw> fwereade: I think so the hooks run with the proxy env vars set?
<fwereade> axw, we explicitly pass the proxy settings in when we create a hook context
<axw> sorry, not really sure tbh. better off asking thumper - he implemented that I think
<fwereade> axw, np, cheers
<axw> fwereade: is this better? http://paste.ubuntu.com/7220551/
<axw> machine-id is a special case, because there isn't really a scope/provider/whatever
<axw> for empty scope, "juju add-machine" will fill it in with the current environment
<axw> so "juju add-machine --to maas-node-123" will implicitly have the current env's name as the placement scope
<fwereade> axw, I'm wondering whether it would be excessively evil to consider "0" or whatever to be in some special scope itself
<fwereade> axw, "--to 0" == --to "machine:0" as it were
<fwereade> axw, just trying to figure out the consequences of a collision between DWIM vs law of least surprise
<fwereade> axw, or juju:0, or something
<axw> fwereade: I suppose we could just have an instance.MachineScope constant, = "-" or some character that is not valid in an environment name
<dimitern> fwereade, hey
<fwereade> dimitern, heyhey
<dimitern> fwereade, https://codereview.appspot.com/85060043/diff/1/provider/maas/environ_whitebox_test.go#newcode537 re that comment
<fwereade> dimitern, oh yes
<dimitern> fwereade, what exported method did you mean?
<fwereade> dimitern, StartInstance
<dimitern> fwereade, but to do that we need to land a few things in gomaasapi
<fwereade> dimitern, expand please? sorry I'm not up on all the context there
<dimitern> fwereade, well, to test the new features we need them to be supported by the test server of gomaasapi
<dimitern> fwereade, that's lshw machine details and list_connected_macs for a network
<dimitern> fwereade, i wanted to save some time and land these changes (after testing them live on my local maas ofc) so we can have MVP, and later polish it
<fwereade> dimitern, ok, I see -- as long as there's a card for it, then :)
<dimitern> fwereade, sure, will add one now
<fwereade> axw, yeah, something like that maybe
<axw> fwereade: I'll go with that and see how it turns out.
<fwereade> axw, cheers
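A sketch of what the placement type and parsing being discussed might look like, with a sentinel machine scope along the lines axw suggests (the names and the "#" sentinel are illustrative, not a settled API):

    package demo

    import (
        "fmt"
        "strings"
        "unicode"
    )

    // MachineScope is a sentinel scope for "--to 0" style directives;
    // "#" cannot collide with an environment name.
    const MachineScope = "#"

    // Placement is a parsed directive such as "zone=us-east-1a",
    // "lxc:1" or "maas:name=abc".
    type Placement struct {
        Scope     string
        Directive string
    }

    func isMachineID(s string) bool {
        if s == "" {
            return false
        }
        for _, r := range s {
            if !unicode.IsDigit(r) {
                return false
            }
        }
        return true
    }

    // ParsePlacement splits "scope:value"; a bare machine id gets the
    // sentinel scope, and a bare value is left for the current
    // environment to interpret.
    func ParsePlacement(directive string) (*Placement, error) {
        if directive == "" {
            return nil, fmt.Errorf("empty placement directive")
        }
        if isMachineID(directive) {
            return &Placement{Scope: MachineScope, Directive: directive}, nil
        }
        if i := strings.IndexRune(directive, ':'); i >= 0 {
            return &Placement{Scope: directive[:i], Directive: directive[i+1:]}, nil
        }
        return &Placement{Directive: directive}, nil
    }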
<vladk> dimitern: morning
<dimitern> vladk, hey morning
<vladk> dimitern: I am working on linking GetNetworkInfo and extractInterfaces
<vladk> dimitern: I do not want both of us doing the same changes
<davecheney> anyone: if I do juju destroy-environment, does this remove /var/log/juju-$ENV ?
<rogpeppe> davecheney: i'm not sure it does. there might have been a discussion about this recently. jam? fwereade?
<jam> davecheney: thumper explicitly designed it so that destroy-env does *not* delete /var/log/juju* (though it does delete /var/lib).
<jam> /var/log will be deleted on another bootstrap
<rogpeppe> axw: i am thinking that "mongodb" in utils/apt.go, cloudArchivePackages, should be "mongodb-server". does that seem right to you?
<dimitern> vladk, ah, sorry, I misunderstood you perhaps
<davecheney> jam: hang on
<davecheney> you just said two things
<davecheney> and they conflicted
<dimitern> vladk, I have the branch that returns []NetworkInfo proposed already
<davecheney> you said /var/log/juju-ubuntu-local would not be deleted
<davecheney> then you said it would
<rogpeppe> anyone else know about what package we should be installing for mongodb?
<jam> davecheney: /var/log is not deleted during destroy, but /var/lib is
<dimitern> vladk, but we do need some changes in gomaasapi
<jam> davecheney: during bootstrap /var/log is cleaned up
<davecheney> jam: /var/log/juju-ubuntu-local is truncated
<davecheney> so the all machines log of my environment is only 5 mins long
<dimitern> vladk, to extend the test server, so it supports /version/ (capabilities), /network/<name>/?op=list_connected_macs and /nodes/<id>/?op=details (for the XML lshw data)
<axw> rogpeppe: sorry, just a moment
<vladk> dimitern: I see your last code review, do you have something more behind that?
<dimitern> vladk, I have 2 more - the provisioner to take []NetworkInfo and add NICs/networks in state
<axw> rogpeppe: yes, I think you're right
<dimitern> vladk, and the last one is the cloudinit scripts, where I believe you did some work on and we can do it together?
<rogpeppe> axw: but i don't see mongodb-server mentioned in http://reqorts.qa.ubuntu.com/reports/ubuntu-server/cloud-archive/cloud-tools_versions.html
<davecheney> jam: ok, thanks for confirming
<davecheney> arosales: you online ?
<rogpeppe> axw: but... it looks like that's what we're using in current bootstrap, so i guess it should work
<axw> rogpeppe: well, mongodb depends on mongodb-server. so we get what we need either way
<rogpeppe> axw: ah
<rogpeppe> axw: ok, trying that
<rogpeppe> axw: i've added mongodb-server to cloudArchivePackages
<axw> cool
<vladk> dimitern: we need one more change to gomaasapi, to not panic in testing mode when bootstrapping a node without networks
<vladk> dimitern: how can I see your code?
<dimitern> vladk, yeah, that's why we need to change gomaasapi
<dimitern> vladk, what code specifically?
<dimitern> fwereade, I replied to your comment https://codereview.appspot.com/85060043/ does it make sense?
<fwereade> dimitern, yeah, sgtm
<dimitern> fwereade, cool, submitting then
<vladk> dimitern: I can't work on cloudinit script before you land your changes on NIC->VLAN mapping
<dimitern> vladk, it's approved and on its way to land now
<natefinch> morning all
<natefinch> axw: thanks for doing some testing on my branch. Don't need to apologize that it works ;)
<rogpeppe> hmm, it occurs to me that since we don't need the tools until we connect to the instance, we could start the instance first, then concurrently upload the tools. that would make bootstrap --upload-tools about twice as quick for me.
<rogpeppe> as i wait for yet another bootstrap
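rogpeppe's overlap idea in sketch form, with hypothetical startInstance/uploadTools functions standing in for the real bootstrap steps:

    package demo

    // Hypothetical stand-ins for the real bootstrap steps.
    func startInstance() (inst string, err error) { return "i-123", nil }
    func uploadTools() (url string, err error)    { return "https://example/tools", nil }

    // bootstrap overlaps instance startup with the tools upload: the
    // tools are not needed until we connect to the instance, so the
    // two can proceed concurrently.
    func bootstrap() (string, string, error) {
        type result struct {
            url string
            err error
        }
        toolsc := make(chan result, 1)
        go func() {
            url, err := uploadTools()
            toolsc <- result{url, err}
        }()
        inst, err := startInstance()
        tools := <-toolsc // always drain, even on error, to avoid a leak
        if err != nil {
            return "", "", err
        }
        if tools.err != nil {
            return "", "", tools.err
        }
        return inst, tools.url, nil
    }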
<rogpeppe> fwereade: where does the machiner get the addresses from now?
<fwereade> rogpeppe, net.InterfaceAddrs
<fwereade> rogpeppe, and it sets them all with NetworkUnknown
<rogpeppe> fwereade: as an alternative, perhaps the uniter should wait until its machine has an address before starting
<fwereade> rogpeppe, I'm not sure exactly what handwaving we do with the local provider to make those addresses show up usefully
<fwereade> rogpeppe, yeah, maybe that's the way -- ask repeatedly for your own address until you have one, then carry on
<fwereade> rogpeppe, and I think we can do that with existing apis
<rogpeppe> fwereade: yeah
<natefinch> axw: not yet...
<rogpeppe> fwereade: istm that if you're the uniter, you really don't want to start until other units can talk to you
<axw> natefinch: cat ~/.juju/local/cloud-init-output.log and see if there's anything interesting there
<fwereade> rogpeppe, yeah -- and that used to work, but we moved stuff around
<fwereade> rogpeppe, although, hmm
<fwereade> rogpeppe, unaddressable containers should be fine, this might render them less fine
 * fwereade goes to dig and find out exactly how we collate addresses
<rogpeppe> fwereade: hmm, interesting
<axw> natefinch: also, you do have juju-mongodb installed right?
<rogpeppe> fwereade: i think we need to be able to explicitly mark a container as unaddressable
<fwereade> rogpeppe, yeah
<rogpeppe> fwereade: some special form of address might work
<fwereade> rogpeppe, and add a bunch of consistency stuff so we don't accidentally put workloads that need to relate into them etc etc
<rogpeppe> fwereade: yeah
<fwereade> rogpeppe, well, they'll still have some sort of address on the machine network at least
<natefinch> axw: yeah, there's nothing interesting in the cloud init output.... last thing is just "opening environment local"
<fwereade> rogpeppe, that'll all come if/when we do explicit proxy charm support I think
<natefinch> axw: I have mongodb built with SSL in the correct spot, but it's not specifically juju-mongodb
<rogpeppe> natefinch: what are you having problems with?
<axw> natefinch: mk. is mongo even running?
<natefinch> axw: mongo runs, yeah
<natefinch> rogpeppe: bootstrapping local
<natefinch> axw: maybe I should just install juju-local
<axw> wouldn't hurt :)
<natefinch> :)
<axw> natefinch: do you have an http proxy set? just wondering if that breaks something...
<axw> otherwise I am out of ideas
<natefinch> no proxy
<rogpeppe> natefinch: try putting some more debug statements in the dial code
<axw> gotta make dinner, bbl
<natefinch> well, installing juju-local let me bootstrap, but I can't destroy the environment now.  Sigh
<fwereade> wallyworld, I think we'll need to TOPIC this, but let me read further
<wallyworld> sure
<fwereade> wallyworld, yeah, I think we need to make the fallback logic provider-specific too
<fwereade> wallyworld, consider env:arch=i386, service:instance-type=t1.micro
<wallyworld> fwereade: i think it is, or?
<wallyworld> all providers except ec2 and openstack just use the current WithFallbacks as they don't yet support instance-type
<wallyworld> ec2 and openstack have specific logic
<fwereade> wallyworld, ah! ok, I misread something, let me keep going
<wallyworld> ok, no rush
<fwereade> wallyworld, we should probably move WithFallbacks off the Constraints type though
<wallyworld> sure
<wallyworld> that's easily done
<fwereade> wallyworld, and I think there's a subtlety wrt which values need to be masked when
<wallyworld> ok, if we can encapsulate the business rules, i'll implement them
<wallyworld> we have the means now that there's provider specific behaviour we can plug in
<fwereade> wallyworld, yeah -- did you get a chance to look at the python implementation by any chance?
<wallyworld> fwereade: yeah, but a lot of it was pretty much not that relevant
<fwereade> wallyworld, it's not a perfect model, but it does allow for defining conflicts between particular constraints
<fwereade> wallyworld_, as I was saying before we were so rudely interrupted
<fwereade> wallyworld, it's not a perfect model, but it does allow for defining conflicts between particular constraints
<fwereade> wallyworld_, ie attempting to define mem and instance-type together fails
<fwereade> wallyworld_, and defining instance-type in service constraints masks out only those env constraints that conflict with it
<wallyworld_> fwereade: yes, although i have been more strict
<dimitern> fwereade, mgz, vladk, perrito666, https://codereview.appspot.com/85380043 - a small necessary step before the provisioner can start adding networks/NICs
<fwereade> wallyworld_, I think it becomes too strict though
<wallyworld_> ie if the combined constraint has inst type and (mem or cpu-core or cpu-power etc), it ignores instance type
<fwereade> wallyworld_, it looked like it was also masking arch
<wallyworld_> fwereade: not for ec2
<natefinch> rogpeppe: btw, I landed the namespace branch
<rogpeppe> natefinch: thanks a lot
<fwereade> wallyworld_, ok, I just fail at reading then :)
<wallyworld_> fwereade: it masks for openstack but i can change that
<wallyworld_> if needed
<fwereade> wallyworld_, possibly all we need is conflict resolution within single-level constraints
<wallyworld_> fwereade: for ec2, if there's an arch, it checks that the instance type can support it
<natefinch> rogpeppe: merging into HA now
<rogpeppe> natefinch: cool
<wallyworld_> fwereade: i'll let you digest and we can topic
<wallyworld_> gotta put kid to bed
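The masking rule being converged on, as a sketch: a service-level instance-type overrides only the env-level fields it conflicts with (mem, cpu-cores) and leaves arch alone. Field names are pared down and illustrative:

    package demo

    // Constraints is a pared-down stand-in for the real type.
    type Constraints struct {
        Arch         *string
        Mem          *uint64
        CpuCores     *uint64
        InstanceType *string
    }

    // withFallbacks overlays service constraints on env constraints;
    // conflicting fields mask each other, non-conflicting ones fall through.
    func withFallbacks(service, env Constraints) Constraints {
        out := env
        if service.InstanceType != nil {
            out.InstanceType = service.InstanceType
            out.Mem, out.CpuCores = nil, nil // conflict: masked; arch kept
        }
        if service.Arch != nil {
            out.Arch = service.Arch
        }
        if service.Mem != nil {
            out.Mem, out.InstanceType = service.Mem, nil
        }
        if service.CpuCores != nil {
            out.CpuCores, out.InstanceType = service.CpuCores, nil
        }
        return out
    }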
<rogpeppe> natefinch: i'm just trying out a spike which tries to integrate everything to see if it all actually works, BTW
<axw-afk> wallyworld_: I just sent another update on placement directives
<natefinch> rogpeppe: seems like a good thing to do
<perrito666> dimitern: small?
<dimitern> perrito666, well, at least it's straightforward I hope :)
<mgz> :)
<wallyworld_> axw-afk: great, thanks :-)
<jam> dimitern: standup ?
<dimitern> jam, sorry, coming
<dimitern> fwereade, I still think this should land https://codereview.appspot.com/85380043, but I'll change https://codereview.appspot.com/85220044/ to implement the SetProvisionedWithNetworks API as agreed
<jam> rogpeppe: natefinch: I'm getting a strange failure trying to do this in a test case: 	apiMachine, err := s.State.AddMachine("quantal", state.JobManageEnviron)
<jam> if I do it in SetUpTest it works fine
<jam> but in TestFoo
<jam> it gives:"cannot add a new machine: state server jobs specified without calling EnsureAvailability"
<rogpeppe> jam: you can't do that if it's not the first machine
<rogpeppe> jam: only the bootstrap machine can be explicitly added with JobManageEnviron
<jam> rogpeppe: so even if the first machine *isn't* a JobManageEnviron
<rogpeppe> jam: yes
<rogpeppe> jam: to allow that would complicate the logic for no gain apart from in test code
<rogpeppe> jam: (and it's usually easy enough to work around the issue in tests)
<rogpeppe> jam: i still think that EnsureAvailability isn't a great idea in general, but this is one of the implications of it
<rogpeppe> jam: just wanted to run something by you for sanity checking
<jam> ?
<rogpeppe> jam: i thought that mgo.Strong implied that you'd always be talking to the mongo primary
<mgz> fwereade: do you have a mo?
<rogpeppe> jam: so IsMaster *should* always report that the current session is master
<rogpeppe> jam: at least that's what i assumed
<rogpeppe> jam: but it doesn't seem to be the case
<mgz> fwereade: your review for horacio's network verification branch, you say you want it done in state
<rogpeppe> jam: can you think of some reason that's not a reasonable assumption?
<mgz> fwereade: but that means the first time juju knows it can't do do what the user asked is in the provisioner
<jam> rogpeppe: which 'IsMaster' are we talking about?
<mgz> fwereade: which strikes me as pretty pantsy feedback
<rogpeppe> jam: replicaset.IsMaster (and the corresponding mongo api call)
<jam> rogpeppe: I'm digging
<rogpeppe> jam: me too
<jam> rogpeppe: so the one thing I see is that session.SetMode(consistency, refresh)
<jam> can take "consistency=Strong" but "refresh= false"
<jam> which means it will set slaveOk=false
<jam> but doesn't seem to actually unset the current connection
<rogpeppe> jam: we never actually call SetMode explicitly AFAIK
<jam> rogpeppe: DialWithInfo ends with (SetMode(Strong, true))
<jam> newSession takes consistency as a parameter
<jam> EnsureIndex calls Clone and then SetMode(Strong, false)
<rogpeppe> jam: oh, i think i see what might be going on
<rogpeppe> jam: replicaset itself calls SetConsistency
<rogpeppe> jam: SetMode
<rogpeppe> natefinch: why does replicaset.CurrentConfig set the session's consistency mode to Monotonic?
<jam> rogpeppe: I think if you want to dial a specific Mongo server
<jam> you have to use Monotonic
<jam> otherwise, whatever you read would actually go to the master
<rogpeppe> jam: hmm, yes.
<rogpeppe> jam: i think that it's wrong that it sets the argument session's mode though
<rogpeppe> jam: i've just changed it to clone the session first
<jam> rogpeppe: so the mgo way of doing it is that you always clone before changing stuff like that
<rogpeppe> jam: we'll see if that fixes the problem
<jam> rogpeppe: right
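The fix follows the usual mgo idiom: never change the mode of a session you were handed, clone it first. A sketch (import path as used at the time):

    package demo

    import "labix.org/v2/mgo"

    // withMonotonic runs f against a clone of session set to Monotonic
    // mode, which is needed to query the dialled server directly; the
    // caller's session mode is left untouched.
    func withMonotonic(session *mgo.Session, f func(*mgo.Session) error) error {
        s := session.Clone()
        defer s.Close()
        s.SetMode(mgo.Monotonic, true)
        return f(s)
    }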
<perrito666> fwereade: hey, I was discussing your input with mgz about the --to error in case of network incompatibility and we got some doubts
<fwereade> perrito666, heyhey
<fwereade> perrito666, go on
<fwereade> perrito666, mgz: ah I see
<fwereade> perrito666, mgz, not sure I follow
<fwereade> perrito666, mgz: if you're assigning a unit to a machine that doesn't exist, the network spec comes from the service, easy
<fwereade> perrito666, mgz: if you're assigning to a machine that exists but hasn't been provisioned, all you have to go on is whatever network spec the machine was created with
<fwereade> perrito666, mgz: if you're assigning to a provisioned machine, you can directly check include/exclude against the actual networks
<mgz> fwereade: sure, (apart from the last one, we don't yet populate that)
<fwereade> perrito666, mgz: in any case you should be able to abort at unit-assignment time, right?
<fwereade> mgz, indeed :)
<mgz> fwereade: the issue is if we're doing this in state, that won't trigger till in response to the user's api call, right?
<fwereade> perrito666, mgz: it sucks a bit that it's hard to do at unit *creation* time (or service creation for that matter)
<fwereade> mgz, yes
<fwereade> mgz, that's still plenty of time to send an error back down to the user, surely?
<mgz> fwereade: so, the api call will succeed, then the worker will go, "oh hey, I can't actually do that"... then the user needs to run status and see that nothing actually happened
<jam> rogpeppe: do you remember the earlier question about "is the address actually changing"? I'm seeing this in the local provider: updater.go:249 machine "0" has new addresses: [public:localhost local-cloud:10.0.3.1]
<jam> but that slice doesn't ever change
<fwereade> mgz, what worker? the assignment completes before the API call returns
<natefinch> rogpeppe, jam: thanks for fixing that.
<mgz> fwereade: ah, does it? that's the bit I'm not clear on.
<fwereade> mgz, it's in the juju package iirc -- we create unit, then assign to machine, and the result of all that gets sent over the api
<mgz> fwereade: basically perrito666 doesn't have enough to go on from your review to actually change the code how you want it done
<jam> fwereade: mgz: I'm sure it is done in the same call, but it is not atomic
<jam> "juju add-unit --to N" will fail
<fwereade> mgz, the sucky bit is that we don't generally check assignment sanity before creating the unit, so you can end up with unassigned units, which is rubbish
<jam> but leave you with a Unit in Pending status
<rogpeppe> jam: that is odd
<jam> fwereade: right
<jam> that happens with --to
<jam> so it isn't *worse* than with --network :)
<fwereade> jam, and also just in general with deploy and add-unit -- they're not transactions and can fail partway through
<jam> fwereade: yeah, it is just surprising when the command line args aren't valid, but it creates some stuff anyway
<fwereade> jam, most other things are transactional iirc (nothing is atomic :-/)
<fwereade> jam, stricter checking in the api is the only response I'm really comfortable with there at this stage
<fwereade> jam, no argument that we should have it
<fwereade> jam, but some stuff -- like auto assignment in particular -- is likely to be tricky to get right in a single transaction
<rogpeppe> good news: i just stopped the juju-db service running on machine 0 and juju status continues to work
<natefinch> nice :)
<fwereade> rogpeppe, sweet
<fwereade> let's ship it
<fwereade> ;p
<jam> rogpeppe: presumably in HA first, right ? :)
<rogpeppe> jam: yeah
<jamespage> fwereade, as the font of all knowledge - I've noticed that juju in our internal cloud picks i386 by default - is that intentional?
<jamespage> we did not hit this before because I was local tool building for amd64 only
<jamespage> now we sync from streams
<fwereade> jamespage, hmmmmmm, that is suboptimal, I think that is a casualty of arch-selection changes in response to ppc
<jamespage> fwereade, hmmmmm
<mgz> I thought we selected amd64 first...
<jamespage> fwereade, I have to force with --constraints now
<jamespage> fwereade, mgz: sounds like a bug to me
<fwereade> mgz, we certainly did once, and I *thought* we still did
<fwereade> jamespage, concur
 * jamespage goes to raise a bug
<mgz> maybe we had the constraint default to amd64, which was wrong (given non-intely clouds), and just lost it as a preference when fixing?
<jam> mgz: note we have a MaaS bug open for arm clouds related to that
<jam> apparently if you don't specify arch= we thought we were starting an amd64 there
<jam> fwereade: https://codereview.appspot.com/85450043 Upgrader returns version.Current if we haven't upgraded yet
<jam> tested using the local provider and, indeed, the API server upgrades first.
<jam> they all wake up in response to changing the agent-version in config, but they're all told to not do anything, and they all ask again when machine-0's agent restarts and they all reconnect
<rogpeppe> fwereade: do you know what happens about all-machines.log in the presence of multiple API servers, by any chance?
<fwereade> rogpeppe, hopefully you update the rsyslog config to sync to the other state-servers ;)
<jam> rogpeppe: that might explain the results where DesiredVersion was confusing you in the past
<jam> rogpeppe: the *unit* agents get the version their Machine is currently running.
<rogpeppe> jam: i'm not sure how that explains how DesiredVersion could go backwards
<jam> rogpeppe: DesiredVersion for the machine is 1.19.2, but until the machine upgrades DesiredVersion for the unit is 1.19.1
<jam> so it shouldn't go backwards
<jam> for a given agent
<jam> but it might look like it
<jam> if we aren't tweezing out the calls
<rogpeppe> jam: but it really did go backwards, because it caused the unit agent to break because of that
<fwereade> jam, LGTM
<jam> rogpeppe: thats why.... Old Unit agents *don't* watch their machine versions
<jam> so if the Unit agent upgrades before the machine agent
<jam> and slightly out of sync with the api
<jam> so you get upgraded Unit and API but *not* machine
<rogpeppe> jam: ah!
<rogpeppe> jam: brilliant, well done
<jamespage> fwereade, jam: bug 1304407
<_mup_> Bug #1304407: juju bootstrap on openstack cloud defaults to i386 <amd64> <apport-bug> <ec2-images> <trusty> <juju-core (Ubuntu):New> <https://launchpad.net/bugs/1304407>
<jam> jamespage: any chance that your juju client is i386 ?
<jamespage> jam: definately not
<jam> fwereade: so I think we know why bug #1299802 happened, but I'd be interested to brainstorm how we might fix it.
<_mup_> Bug #1299802: upgrade-juju 1.16.6 -> 1.18 (tip) fails <juju-core:Incomplete> <https://launchpad.net/bugs/1299802>
<fwereade> jam, oh yes?
<jam> fwereade: so in 1.18 the Unit agent gets the Machine agent's actual version as its DesiredVersion, right?
<jam> but in 1.16 the Unit agents just get the global desiredversion
<jam> which means that if the API server and the Unit agent upgrade before the Machine agent
<fwereade> jam, ah, yes, an old server will send the wrong info
<jam> fwereade: well, not an old server
<jam> but just an about-to-be-upgraded one
<fwereade> jam, indeed, a server running the old version
<jam> so technically the fix I just put up would help here, but we can't "put that genie back in the bottle" for existing 1.16.6 servers.
<jam> fwereade: refuse to downgrade would be the easiest fix
<fwereade> jam, indeed
<jam> fwereade: given we actually have 0 experience making downgrades work
<fwereade> jam, +1
<jam> fwereade: refresh rawMachine ? the issue is that there *isn't* a record in the DB for the raw and api server machine versions
<jam> and SetEnvironAgentVersion checks that everything has the same version
<jam> before it lets you set it to something else
<jam> and nil != version.Current
<jam> fwereade: (context is the patch you just reviewd)
<fwereade> jam, doh, I see, they are now the same machine
<fwereade> jam, forget I said anything
<fwereade> jam, s/now/not/
<jam> "not" right
<fwereade> jam, yeah
<fwereade> jam, sorry, half my brain is taken up trying to figure out wtf has happened to the uniter
<jam> fwereade: np, too many threads as always
<rogpeppe> jam, fwereade: refusing to downgrade is the easy answer - it's a useful sanity constraint anyway
<jam> rogpeppe: so I have lp:~jameinel/juju-core/1.18-refuse-downgrade-1299802, the question is testing it...
<rogpeppe> jam: i'm not sure that the log message should mention the bug. it might not be due to the bug.
<rogpeppe> jam: testing it shouldn't be too hard, i'd have thought
<rogpeppe> jam: it should sit nicely with the other upgrader tests
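The guard itself is small; a sketch of the shape with a pared-down version type (the real code would use juju's version.Number):

    package demo

    // Number is a pared-down stand-in for juju's version.Number.
    type Number struct{ Major, Minor, Patch int }

    func (v Number) less(w Number) bool {
        if v.Major != w.Major {
            return v.Major < w.Major
        }
        if v.Minor != w.Minor {
            return v.Minor < w.Minor
        }
        return v.Patch < w.Patch
    }

    // desiredVersion clamps what the upgrader will act on so that an
    // out-of-date API server can never push an agent backwards.
    func desiredVersion(current, reported Number) Number {
        if reported.less(current) {
            return current // refuse the downgrade
        }
        return reported
    }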
<dimitern> fwereade, jam, mgz, vladk, I still need a review on https://codereview.appspot.com/85380043 please
<rogpeppe> we need to decide how we want to represent environ-worker machines in status
<axw> fwereade: if you have any time today, I'd appreciate a glance over https://codereview.appspot.com/85040046. I can get a full review off someone tomorrow if you're okay with it in general
<axw> fwereade: this is placement for add-machine only at this stage; I will follow up with deploy and add-unit
<fwereade> dimitern, I thought we didn't need to send those errors over the API? I'm strongly -1 on ignoring them at state/api level -- we should catch them at apiserver level, or handle them at agent level, but I really don't like just ignoring them in the api client
<fwereade> dimitern, and tbh I'm getting less sure it's helpful, even at the state level -- surely we should be checking that every field is identical; and if they are it really doesn't seem worth an error, and if they aren't there's a clear fuckup that's more serious than just "already exists"
<dimitern> fwereade, actually with the next CL your concerns will be moot, because i'm removing both AddNetworks and AddNetworkInterfaces from state/api/provisioner
<fwereade> dimitern, haha, ok -- at the state level, though?
<rogpeppe> natefinch: how are you getting on with merging trunk into MA-HA ?
<dimitern> fwereade, i've added machine.SetProvisionedWithNetworks at both state and api level
<fwereade> dimitern, awesome -- it feels like that should completely replace this CL then?
<dimitern> fwereade, more or less, but AlreadyExistsError I think should stay
<dimitern> fwereade, I'm live testing it on maas and will propose soon
<fwereade> dimitern, until someone's consuming it I'm not 100% sure there
<mgz> fwereade: can you give perrito666 guidance on writing really complicated asserts with the mgo shizzle?
<fwereade> mgz, hah, yes, I can try
<mgz> because I'm pretty sure we have no docs for that, and that's what you're asking for with the review
<dimitern> fwereade, *sigh* it'll be a bit difficult to replace the current CL with it, but i'll try
<fwereade> mgz, perrito666: there's a bit in doc/hacking-state.txt, but indeed not a great deal
<fwereade> perrito666, 2 mins, would you start a hangout please?
<perrito666> certainly
<fwereade> dimitern, I don't follow what dependencies on that will continue to exist?
<mgz> perrito666: invitemetoo!
<perrito666> suddenly I am so popular, everyone wants to hang out with me :p
<dimitern> fwereade, well the error itself, which will still be returned in state AddNetwork/AddNetworkInterface, but will be ignored at the API level (i need it on the state level though for SetProvisionedWithNetworks)
<dimitern> fwereade, ahem.. "ignored" as in SetProvisionedWithNetworks will not fail if you try to add an existing network or interface
<fwereade> dimitern, indeed, I see that -- but I'm worried that in state we should either be checking every field -- in which case, ehh, why return any error if the state perfectly matches the effect of having called successfully
<fwereade> dimitern, sorry, s/either//
<fwereade> dimitern, s/in which case/if they match/
<perrito666> mgz: fwereade google says you are both not available
<fwereade> dimitern, and if they don't match that's probably a real error
<fwereade> perrito666, just paste the link, it'll work
<dimitern> fwereade, and how do you solve the case when not everything is the same, but the network name exists?
<fwereade> dimitern, something's all fucked up then, isn't it?
<mgz> perrito666: I'm making one
<dimitern> fwereade, or we could be trying to change it, which is not supported now
<fwereade> dimitern, yeah I wondered whether it's cleaner as an Update-style thing
<dimitern> fwereade, and in any case, this "ignore if it exists" is a temporary shortcut to the MVP, ideally, we want to discover and add all networks before deploy (probably at bootstrap time or shortly after)
<fwereade> dimitern, good point, well made
<dimitern> fwereade, that will allow us to have much better picture of the reality in state and make sanity checks much better
<fwereade> dimitern, ok, sgtm, AlreadyExists handling will smooth the transition
<dimitern> fwereade, yeah, thanks
<natefinch> rogpeppe: sorry, was helping with family stuff.  The merge is complete, fixed some merge problems, checking the tests now.  Looks like there's a few more compilation issues in the tests, hope to have it in a decent state shortly.
<rogpeppe> axw: ping
<axw> rogpeppe: yo
<rogpeppe> axw: did you have something to do with stateOpened in the machine agent?
<rogpeppe> axw: i'm just wondering what the justification for it is
<axw> rogpeppe: yeah I think I did that. broken is it?
<axw> ah
<rogpeppe> axw: 'fraid so
 * axw tries to remember
<rogpeppe> axw: i can see why you did it, but i'm pretty sure it's broken anyway
<rogpeppe> axw: (i thought so before, but a runtime panic in a live environment assures me that it really is)
<axw> ah yeah, so the upgrade steps for state servers could get a state connection
<rogpeppe> axw: yeah.
<rogpeppe> axw: i think that the solution is probably for the upgrader to make an independent connection to the state
<rogpeppe> axw: i'm just wondering if that might end up being awkward.
<axw> rogpeppe: that would probably be fine, but I honestly can't remember the finer details at the moment
<rogpeppe> axw: ok
<rogpeppe> axw: FWIW, the StateWorker can be called multiple times. the second time it gets called, it panics because it closes the stateOpened channel again
<axw> agh :(
<axw> sorry
<rogpeppe> axw: that's ok - i only saw it relatively recently, thought "must fix that some time" but didn't realise it was quite as bad as it was
<rogpeppe> axw: i've nearly fixed it, BTW
<natefinch> yay for tests.  Somehow dropped a codepath in the merge and a test detected it.
<axw> rogpeppe: cool, thank you. I'll check out what you did in the morning
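Closing an already-closed channel panics, which is what makes a second StateWorker call blow up; the usual guard is sync.Once. A sketch:

    package demo

    import "sync"

    // stateOpened signals that the state connection is available;
    // Opened may safely be called any number of times.
    type stateOpened struct {
        once sync.Once
        ch   chan struct{}
    }

    func newStateOpened() *stateOpened {
        return &stateOpened{ch: make(chan struct{})}
    }

    func (s *stateOpened) Opened() {
        s.once.Do(func() { close(s.ch) }) // second call is a no-op, not a panic
    }

    func (s *stateOpened) Wait() <-chan struct{} { return s.ch }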
<sinzui> fwereade, I see a report that says I will release https://launchpad.net/juju-core/+milestone/1.19.0 tomorrow. 1/3 of the targeted bugs will not be ready.
<sinzui> fwereade, jam: and then I see 1.19.1 for Friday, which is impossible
<natefinch> rogpeppe: all MA-HA code builds and tests all pass, doing live tests.
<rogpeppe> natefinch: great
<rogpeppe> natefinch: if you propose it, i'll have a last look and then hopefully it can go in
<sinzui> fwereade, jam: I can release 1.19.0 this week as it is, or we can push all non-essential bugs to 1.19.1 today. I think we already did push the non-essential bugs to 1.19.1 though
<natefinch> rogpeppe: https://codereview.appspot.com/72500043
<fwereade> sinzui, yeah, I don't think we can reasonably push that tomorrow
<fwereade> sinzui, we will have to see what gets done overnight wrt those bugs
<sinzui> fwereade, bug 1303583 is the only azure bug. The report I see defines azure availability sets as the definition  of 1.19.0
<_mup_> Bug #1303583: provider/azure: new test failure <gccgo> <juju-core:Triaged> <https://launchpad.net/bugs/1303583>
<rogpeppe> hmm, i'm seeing this when i lbox propose: error: Failed to load data for branch lp:juju-core: Get https://api.launchpad.net/devel/branches?url=lp%3Ajuju-core&ws.op=getByUrl: dial tcp 91.189.89.225:443: connection refused
<natefinch> hmm... canonical twiddling with their SSL certs right now perhaps?
<natefinch> FWIW, lbox worked ok for me 10 minutes ago
<natefinch> hm... apt-get update is failing on amazon
<natefinch>  Unable to connect to us-east-1.ec2.archive.ubuntu.com:http:
<natefinch> Unable to connect to ubuntu-cloud.archive.canonical.com:http:
<natefinch> Unable to connect to security.ubuntu.com:http: [IP: 91.189.92.200 80]
<natefinch> wonder if canonical's servers are getting slammed from everyone in the world doing an apt-get update today
<rogpeppe> natefinch: #is are dealing with it
<natefinch> rogpeppe: could you try bootstrapping local with my branch?  My local environment is still hosed, so I can't test it.  I'll try to fix it, but I don't want to gate HA on my stupid environment.
<rogpeppe> natefinch: will do
<natefinch> rogpeppe: of course, if juju can't apt-get update, we can't test in the cloud either
<rogpeppe> natefinch: true 'nuff
<natefinch> update worked this time, let's see what happens with upgrade
<natefinch> gah, it's not going to finish before I have to go.  Looks like it upgraded and installed everything just fine.  It's sitting at "Bootstrapping Juju machine agent", so it may or may not work (since that seems to be where we fall over a lot).
<rogpeppe> natefinch: reviewed
<natefinch> Gotta go pick up my daughter from preschool ,back in like half hour or 45 minutes.
<rogpeppe> natefinch: ok
<rogpeppe> fwereade, dimitern, mgz: i'd appreciate a review of this, if possible - it fixes a live panic i saw (and cleans some test code up slightly): https://codereview.appspot.com/85450044
<mgz> rogpeppe: that branch makes some sense to me
<mgz> would prefer if someone else looked at it too
<rogpeppe> mgz: thanks.
<dimitern> rogpeppe, looking
<rogpeppe> dimitern: ta!
<dimitern> rogpeppe, LGTM
<rogpeppe> dimitern: thanks
<stokachu> so im seeing an issue with a local provider running kvm as machines and deploying charms within those machines to lxc containers. problem is you can not access the lxc containers outside of the parent machine http://paste.ubuntu.com/7222158/
<stokachu> should there be some sort of tunneling setup to allow juju to be able to add-relations between those containers or be accessible outside of the machine hosting the containers?
<dimitern> fwereade, mgz, vladk, if anyone of you is still here, I'd appreciate a review on https://codereview.appspot.com/85220044/
<natefinch> rogpeppe: so, my bootstrap on amazon failed to open state.
<rogpeppe> natefinch: oh. i thought you'd tried that.
<natefinch> rogpeppe: that's what I was trying as I left, but it didn't finish before I had to go.
<rogpeppe> natefinch: have you looked at the logs on the bootstrap machine?:
<natefinch> rogpeppe: not yet, the bootstrap machine was destroyed when I got back.  I guess I should have suspended the client before I left
<rogpeppe> natefinch: well, just checked - the local provider seems to work ok with your branch
<natefinch> rogpeppe: thanks
<natefinch> rogpeppe: hmm no mongo running on
<natefinch> rogpeppe: on the bootstrap node in amazon
<rogpeppe> natefinch: have you looked at /var/log/upstart ?
<rogpeppe> natefinch: that's where mongod errors seem to go
<natefinch> rogpeppe: yeah, I thought it should be in rsyslog.log, but that file is almost empty
<rogpeppe> natefinch: so... do you see anything?
<natefinch> rogpeppe: /var/log/upstart/rsyslog.log contains only a single log line: Skipping profile in /etc/apparmor.d/disable: usr.sbin.rsyslogd
<rogpeppe> natefinch: there should be a juju-db.log file, i think
<marcoceppi> so, can someone jump on a hangout real quickly, 1.18 appears to break local charm deployments
<rogpeppe> marcoceppi: i'm afraid i'm too close to EOD
<natefinch> marcoceppi: I can jump on
<marcoceppi> natefinch: thanks, inviting
<marcoceppi> natefinch: https://plus.google.com/hangouts/_/7ecpje3khh45s9i32nccdngavk
<natefinch> rogpeppe: there's no upstart juju-db.conf either... something is wonky
<rogpeppe> natefinch: what does your cloud-init-output look like?
<natefinch> rogpeppe: sorry, gotta work on this call
<rogpeppe> natefinch: it all seems to work fine for me
<natefinch> rogpeppe: weird, ok.
<rogpeppe> natefinch: i have to go now. perhaps we could pair tomorrow on trying to get the tests written for the spiked code i've got that actually puts it all together.
<natefinch> rogpeppe: cool, yeah, that would be good.
<natefinch> rogpeppe: I'll try to figure out what's wrong with my environment
<natefinch> rogpeppe: if you've tested amazon and local, I'll make the tweaks you suggested and land HA, if you think that's ok?
<wwitzel3> natefinch: is your branch up to date, I have some spare time if you want me to kick the tires with it on maas as well
<rogpeppe> natefinch: sgtm
<rogpeppe> wwitzel3: that would be great
 * rogpeppe leaves
<rogpeppe> g'night all
<wwitzel3> rogpeppe: see ya rogpeppe
<rogpeppe> wwitzel3: have a great pycon
<natefinch> wwitzel3: that would be awesome.  Just a bootstrap and quick deploy would be great.  I have to figure out why my environment is borked
<wwitzel3> rogpeppe: thanks :)
<wwitzel3> natefinch: ok, I'll do that
<wwitzel3> natefinch: I'm having some weird issues connecting to cloud-images
<natefinch> wwitzel3: I know IS was having some problems with their servers.... likely due to heartbleed, one way or another
<wwitzel3> natefinch: yeah, I will keep trying
<marcoceppi> natefinch: fyi: http://askubuntu.com/a/445101/41
<natefinch> marcoceppi: so the only difference is specifying the series, it looks like?   Is this intended behavior or is it a bug?
<natefinch> (obviously the error message is terrible even if it is intended)
<marcoceppi> natefinch: it's still a bug, but not nearly as critical, see 1303880
<marcoceppi> bug 1303880
<marcoceppi> dear mup, where are you
<natefinch> bug #1303880
<_mup_> Bug #1303880: Juju 1.18.0, can not deploy local charms without series <juju-core:Triaged> <https://launchpad.net/bugs/1303880>
<_mup_> Bug #1303880: Juju 1.18.0, can not deploy local charms without series <juju-core:Triaged> <https://launchpad.net/bugs/1303880>
<natefinch> heh just slow
<sinzui> cmars, bug 1303880 relates to your recent changes to support charms with ambiguous series
<_mup_> Bug #1303880: Juju 1.18.0, can not deploy local charms without series <charms> <regression> <series> <juju-core:Triaged> <https://launchpad.net/bugs/1303880>
 * cmars looks
<sinzui> cmars, I will talk with others to discuss how serious this issue is. I would like to classify the issue as a documentation problem
<cmars> sinzui, got it. thanks. let me know the outcome
<sinzui> or a UI problem. The people would have fixed the issue themselves if the UI said a series/os-version must be specified when deploying a local charm
<natefinch> sinzui: grossly incorrect error message is essentially a bug.  At least it should be easy to fix
<perrito666> fwereade: hey, you most likely told me this before but, when checking networks against Machine doc, should I query for Addresses or MachineAddresses ? looks to me as if both should have the same content, but famous last words
<fwereade> perrito666, hey, you still around?
<perrito666> fwereade: always, I have no life
<bac> anyone having trouble using lbox today after launchpad got new keys?  i'm getting error: Get https://api.launchpad.net/devel/people/+me: x509: certificate signed by unknown authority
<bac> sinzui: ^^ any thoughts?
<sinzui> bac: I haven't used lbox since last Friday
<bac> sinzui: browsers can talk to https://launchpad just fine.  i'm not sure what lplib is doing
<bac> oh, lbox probably rolled its own access, it can't be using lplib
<sinzui> bac, lbox was the first user of a go-based port of lplib
<bac> sinzui: so a CA that python knows about that go doesn't?  /me grasps
<sinzui> natefinch, you landed something today? does lbox love you?
<natefinch> sinzui: it did this morning, but things may have changed in the last 5-ish hours
<natefinch> sinzui: let me see if I can run it now
<natefinch> sinzui, bac: works for me, re-proposing an already proposed branch with new changes.  Don't have a new branch to propose to try that, if it's different.
<bac> sinzui: i am trying to 'lbox submit'.  oddness.  also it looks like the certs aren't newly generated today.
<sinzui> bac, did you get lbox from code?
<sinzui> bac "go get launchpad.net/lbox"
<bac> sinzui: no.  i'll try that
<natefinch> sometimes I forget how awesome go get is, because I use it all the time, then I try to go build something else from source, and I'm like "oh yeah, this blows"
<marcoceppi> is there a max filesize for charms?
<perrito666> fwereade: you vanished
<fwereade> perrito666, hey, sorry, my internets went all funny for a bit
<perrito666> fwereade: branches you too? :p
<fwereade> perrito666, did you get my stuff about the networks from 30 mins ago?
<perrito666> fwereade: nope, after I answered, you timed out
<perrito666> well, not you, your irc client
<fwereade> perrito666, ha, I never saw your answer
<perrito666> <perrito666> fwereade: always, I have no life
<fwereade> perrito666, haha
<fwereade> perrito666, when you're around: there's a separate collection, called (IIRC) linkednetworks, that has requested machine and service networks, keyed on the respective entity's globalKey
<fwereade>  perrito666, that's where you get the unit's requested networks (from the service) and the networks the machine was itself requested to start with -- there will imminently be another collection of *actual* machine networks, but dimitern hasn't landed that yet
<fwereade> perrito666, if any of that is incomprehensible I can expand at arbitrary length on it all
<perrito666> ok so, for the moment, as we spoke, I should only check that the requested nets of the unit and the requested nets of the machine are a match
<fwereade> perrito666, yeah -- and for the transaction, assert no changes in either document
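In juju's state layer, "assert no changes" is usually expressed as a txn-revno assertion on the document; a sketch of one such op (collection and field names follow the pattern described above but are illustrative):

    package demo

    import (
        "labix.org/v2/mgo/bson"
        "labix.org/v2/mgo/txn"
    )

    // assertUnchanged builds a txn.Op that makes the transaction fail
    // if the document's txn-revno has moved since we read it.
    func assertUnchanged(collection, id string, revno int64) txn.Op {
        return txn.Op{
            C:      collection,
            Id:     id,
            Assert: bson.D{{"txn-revno", revno}},
        }
    }

One op per document the decision depended on -- here, the machine's and the service's requested-networks documents.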
<bac> hi marcoceppi, i'm trying to use lbox to submit the latest version of lp:~bac/charm-tools/cmd-line-server but it is not working for me due to launchpad certificate issues.  would you mind trying to land it?
<marcoceppi> bac: yeah, give me 30 mins
<perrito666> fwereade: please expand that last one a bit
<bac> marcoceppi: no rush.  thanks.
<fwereade> perrito666, hangout?
<perrito666> fwereade: sure, gimme a sec
<natefinch> ahh shit
<natefinch> thumper, sinzui: are we supposed to be supporting go 1.1?  Rog added a line that requires 1.2 :/
<thumper> natefinch: yes, AFAIK, we are still 1.1
<natefinch> dang
<sinzui> natefinch, I think we are.
 * sinzui checks cloud archive
<sinzui> natefinch, thumper the PPAs that build depend on Go 1.1.2
<thumper> thought so
<natefinch> yeah, me too, just hoped I was wrong
<thumper> also, what does gccgo support?
<natefinch> no idea
<thumper> natefinch: what was the line ? I'm curious
<natefinch> thumper: using sort.Stable to do a stable sort of instance addresses
<natefinch> thumper: to pick the one that was the most like what we wanted (cloud local, non-hostname)
<thumper> ah
<natefinch> I gotta run.  Unfortunately that means HA won't go in tonight.  But it should be easy to fix in the morning. That's the last blocker.
<mwhudson> gccgo in trusty supports go 1.2
<mwhudson> and we really don't care about <trusty for gccgo
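For what it's worth, this particular use of sort.Stable (pick the most-preferred address, earlier entries winning ties) can be done Go 1.1-compatibly in one pass; a sketch with an illustrative scoring function:

    package demo

    // addr is a pared-down stand-in for instance.Address.
    type addr struct {
        value      string
        cloudLocal bool
        hostname   bool
    }

    // score ranks addresses: cloud-local non-hostnames first.
    // The exact ordering is illustrative.
    func score(a addr) int {
        s := 0
        if a.cloudLocal {
            s += 2
        }
        if !a.hostname {
            s++
        }
        return s
    }

    // selectAddress returns the best address; strict > keeps earlier
    // entries on ties, the stability sort.Stable would otherwise give.
    func selectAddress(addrs []addr) (addr, bool) {
        var best addr
        ok := false
        for _, a := range addrs {
            if !ok || score(a) > score(best) {
                best, ok = a, true
            }
        }
        return best, ok
    }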
<bac> marcoceppi: ignore my earlier request.  lbox finally decided to play nicely.
<marcoceppi> Bac ack
<bac> in black
<waigani> hi all
<waigani> when I run test on trunk I get a mgo.QueryError
<waigani> "exception: cannot run map reduce without the js engine"
<mwhudson> juju-mongodb does not support map reduce
<mwhudson> so that's interesting :)
<waigani> right, if I use mongodb-server, no problem
<thumper> waigani: is that in the store tests?
<waigani> thumper: yep
<thumper> yeah, we need to skip that test if using juju-mongodb
<thumper> the store code should be moving out of core AFAIK
<thumper> the store doesn't use juju-mongodb
<waigani> rightio
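The skip thumper suggests is one line in gocheck; a sketch with a hypothetical probe for the JS engine:

    package demo

    import gc "launchpad.net/gocheck"

    type storeSuite struct{}

    // hasJSEngine is a hypothetical probe; juju-mongodb is built
    // without the JavaScript engine, so map-reduce fails on it.
    func hasJSEngine() bool { return false }

    func (s *storeSuite) TestCounters(c *gc.C) {
        if !hasJSEngine() {
            c.Skip("mongod lacks the js engine; map reduce unavailable")
        }
        // ... map-reduce based assertions ...
    }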
<davecheney> thumper: sinzui ubuntu@winton-02:/var/log/juju-ubuntu-local$ grep panic -c all-machines.log
<davecheney> 0
<thumper> \o/
<davecheney> 12 hours, no panic
<thumper> although, we do suck...
<davecheney> $ uname -a
<davecheney> Linux winton-02 3.13.0-8-generic #28-Ubuntu SMP Mon Feb 17 08:22:39 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
<thumper> I'm trying to get the new debug-log client to error on an old server
<davecheney> ^ magic kernel, accept no substitutes
<thumper> but it just blocks
<davecheney> :(
<thumper> and there is no way as the api client to know the version of the server
<thumper> and I can't add it to use it, because it needs to be there now
<thumper> so I can look from the future
<thumper> hmm...
<thumper> maybe I should add a method
<thumper> to ask for the remote version
<thumper> and if that fails, no debug-log for you
<thumper> it seems the websocket connection just hangs if the end point isn't there
<thumper> can't seem to get it to time out
<davecheney> thumper: hmm
<davecheney> that should be easy to fix
<davecheney> is there any log on the other side if you hit a non-existent endpoint ?
<davecheney> it probably doesn't get further than the rootMethod
<thumper> nope
<thumper> I'm adding a client call "Version"
<thumper> that returns version.Current
<thumper> we know that if you call a client end point that isn't there, then it doesn't work
<thumper> as it is the rpc layer
<thumper> but the websocket gets stuck.
<thumper> it is a bit shit, but necessary :-(
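A rough sketch of the probe thumper is describing; the Client shape, method name, and endpoint name are all guesses for illustration, not the eventual juju-core API:

    package sketch

    import "fmt"

    // Client wraps an RPC call function; in juju-core this would be the
    // api client, but a stub keeps the sketch self-contained.
    type Client struct {
        call func(method string, result interface{}) error
    }

    type versionResult struct {
        Version string
    }

    // ServerVersion asks the server what version it runs. Against an old
    // server the endpoint doesn't exist, the call errors, and the caller
    // can refuse to offer debug-log instead of hanging.
    func (c *Client) ServerVersion() (string, error) {
        var res versionResult
        if err := c.call("Client.Version", &res); err != nil {
            return "", fmt.Errorf("server has no Version endpoint: %v", err)
        }
        return res.Version, nil
    }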
<waigani> hi davecheney
<waigani> First pass at ssh test isolation: https://codereview.appspot.com/85710043
#juju-dev 2014-04-09
<waigani_> davecheney: is there a particular bug you'd like me to look at next or should I just grab one from the list?
<sinzui> wallyworld_, thumper: does any of this look familiar and do you have any advice? http://pastebin.ubuntu.com/7224152/
<wallyworld_> sinzui: and juju works fine on hp cloud?
<sinzui> wallyworld_, yes
<wallyworld_> hmmm
<sinzui> wallyworld_, I just confirmed that both configs use the same keys
<wallyworld_> sinzui: in the past, where container permissions have been wrong, the container has been created but subsequent reads failed. here we can't even create the container
<wallyworld_> it does seem to imply a canonistack swift issue
<sinzui> wallyworld_, noted. and in the past creation failures were race conditions. this says auth failure
<wallyworld_> sinzui: have you tried the other region?
<sinzui> wallyworld_, no
<wallyworld_> sometimes that can work
<wallyworld_> lcy01 vs lcy02
<sinzui> wallyworld_, but I just checked the canonistack dashboard for /both/ accounts. The container view shows an error
<wallyworld_> hmmm, ok
<wallyworld_> and no joy asking in #is?
<sinzui> wallyworld_, they officially defer to canonical support. I opened a ticket there 10 hours ago and no one will talk to me
<wallyworld_> :-(
<sinzui> I am tempted to send an email notifying canonistack that it will be desupported. Without working accounts, I cannot deliver the next juju to it
<wallyworld_> agreed
<wallyworld_> we do need an openstack deployment to test against though :-( besides hp cloud
<wallyworld_> thumper: this fixes bug 1304132 and also removes the log noise from the critical bug alexis emailed about https://codereview.appspot.com/85770043
<_mup_> Bug #1304132: nasty worrying output using local provider <ppc64el> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1304132>
<davecheney> arosales: hazmat email sent
<davecheney> with ppc segfault information
 * arosales looks
<hazmat> davecheney, k.. trying fresh on new ppc8.. orange box #2 vanquished
<davecheney> hazmat: that would be a good data point, i only have access to wolfe and winton, which are power7
<davecheney> hazmat: if you see panics on your power8 host, you should revert to that kernel I specified
<hazmat> davecheney, my p7 host has been good wolfe-02..  3.13-08
<hazmat> trying on stilton-5
<davecheney> hazmat: yes, that is the working kernel
<davecheney> it is pre the switch to 64k
<davecheney> pages
<arosales> dfc: so 3.13.0-08.28 is what we need correct?
<arosales> hazmat: are you running -08.28?
<davecheney> the .28 isn't the important bit
<davecheney> the -08, -18, -23 is
<davecheney> i've included a link to the old kernel in the archive
<hazmat> one moment.. switching tracks off maas
<arosales> dfc, gotcha, avoid -18 and -23
<hazmat> davecheney, so my p7 is -> 3.13.0-8-generic
<hazmat> davecheney, my p8 is -> 3.13.0-23-generic
<davecheney> hazmat: right
<hazmat> davecheney, so.. theory being that's an okay version? .. i'm gonna test and find out either way
<davecheney> hazmat: i've tested -8, -18 and -23
<davecheney> only -8, which was pre the 64k page switch, can run juju stably
<davecheney> the other kernels randomly kill juju processes with SEGVs
<hazmat> solid
<hazmat> davecheney, cool, thanks for tracking that down.. apparently i lucked into having at least one good demo p machine
<davecheney> hazmat: yeah me too
<davecheney> winton-02 is ooooooooooold
<davecheney> so it was running a very old kernel
<davecheney> but thumper bodie and timv hit problems
<mwhudson> oh man, 64k pages kill the gccgo runtime?
<mwhudson> somehow that's easy to believe
<davecheney> mwhudson: yup
<davecheney> mwhudson: tell me your thoughts
<davecheney> its signal related
<davecheney> somehow an invalid signal is generated, or created, or just pops into existence
<davecheney> the powerpc/kernel/signal_64.c doesn't know how to handle it, so it calls force_sigsegv
<davecheney> and the userland thinks it has hit a nil pointer exception and panics
<mwhudson> davecheney: well i think malloc.goc has a #define PAGE_BITS 12 in it
<mwhudson> o
<mwhudson> h
<mwhudson> that sounds pretty messed up
<davecheney> mwhudson: but why should that matter
<davecheney> 12 is < 16
<mwhudson> davecheney: dunno
<davecheney> but is a multiple
<davecheney> all that happens is if you call mmap(0, 4096) you get a 64k allocation
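As a quick sanity check on the page-size theory (a diagnostic aid, not something from the bug report), a process can print the page size the kernel gives it and compare against the 4k that PAGE_BITS 12 implies:

    package main

    import (
        "fmt"
        "os"
    )

    func main() {
        // os.Getpagesize reports the kernel page size this process sees.
        ps := os.Getpagesize()
        fmt.Printf("page size: %d bytes; matches 4k assumption: %v\n", ps, ps == 1<<12)
    }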
<davecheney> mwhudson: i'm logging this all in a bug now
<davecheney> then i have a juju test to fix
<davecheney> then i'll try to create a smaller reproduction case
<davecheney> there is additional debugging in that file
<davecheney> but it appears to be turned off in this build
<davecheney> maybe spinning a new kernel with it enabled is the next step
<thumper> wallyworld_: back from the gym now
<mwhudson> davecheney: an invalid signal number is generated?
<mwhudson> wow
<mwhudson> what is the userspace doing when this signal arrives?
<davecheney> chilling
<wallyworld_> thumper: ok. i have 2 fixes for that critical bug https://codereview.appspot.com/85770043 and https://codereview.appspot.com/85750045
<mwhudson> so it's an async signal?
<wallyworld_> not sure if more work is needed
<davecheney> [18519.444748] jujud[19277]: bad frame in setup_rt_frame:
<davecheney> 0000000000000000 nip 0000000000000000 lr 0000000000000000
<davecheney> [18519.673632] init: juju-agent-ubuntu-local main process (19220)
<davecheney> killed by SEGV signal
<davecheney> [18519.673651] init: juju-agent-ubuntu-local main process ended, respawning
<thumper> wallyworld_: so what was going wrong?
<wallyworld_> thumper: i'll get those landed and will have to either test or ask axw if there's anything else obvious that needs looking at
<wallyworld_> thumper: 2 things 1. instance poller noise due to it not ignoring unprovisioned machines
<thumper> +1 for that
<wallyworld_> 2. bad schema def for storage-port config attr on manual provider causing provisioner startup to fail
<wallyworld_> due to json serialisation issue
<wallyworld_> float64 vs int and all that
<wallyworld_> so those 2 fixes i did just by looking at logs
<wallyworld_> i had a look at the code to see if i could relate the fixes to the actual observed issue, but didn't get far enough
<wallyworld_> so i figured we could fire up some arm instances and test and/or ask axw  for input when he comes online
<axw> I am online
<axw> what input do you need?
<wallyworld_> \o/
<axw> I have LGTM'd your two fixes
<wallyworld_> axw: bug 1302205
<_mup_> Bug #1302205: manual provisioned systems stuck in pending on arm64 <add-machine> <hs-arm64> <manual-provider> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1302205>
<wallyworld_> ok :-)
<wallyworld_> i am not sure if my fixes are sufficient
<wallyworld_> they are needed, but is there more to be done
<hazmat> davecheney, confirmed btw re 23.. panic while doing nothing detected in the log
<axw> hrm
<wallyworld_> have you seen similar issues when developing the manual provider?
<axw> wallyworld_: nope
<wallyworld_> or maybe we just need to test with the fixes
<wallyworld_> could yet be an arm issue i guess
<mwhudson> davecheney: can you run my test program from https://sourceware.org/bugzilla/show_bug.cgi?id=16629 ?
<axw> looking at the logs now...
<davecheney> hazmat: i've even seen /usr/bin/go panic while running tests
<davecheney> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1304754
<_mup_> Bug #1304754: gccgo compiled binaries are killed by SEGV on 64k ppc64el kernels <linux (Ubuntu):New> <https://launchpad.net/bugs/1304754>
<wallyworld_> certainly that storage-port issue is pretty fatal
<davecheney> arosales: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1304754
<_mup_> Bug #1304754: gccgo compiled binaries are killed by SEGV on 64k ppc64el kernels <linux (Ubuntu):New> <https://launchpad.net/bugs/1304754>
<axw> wallyworld_: ah, the "close of closed channel" is something rogpeppe brought up last night
<axw> there's a bug in cmd/jujud
<axw> not sure if he fixed it yet...
<mwhudson> (although i can't see why 64k pages would matter here)
<wallyworld_> axw: is that in the machine 0 log attached to the bug?
<axw> wallyworld_: yeah
<wallyworld_> let me look
<axw> wallyworld_: https://codereview.appspot.com/85450044/
<axw> wallyworld_: I broke the machine agent when I allowed upgrade steps to get a state connection
<mwhudson> davecheney: also, it would sure be nice to follow the execution of handle_rt_signal64 with gdb
<wallyworld_> axw: does that mp fix the close channel issue?
<davecheney> mwhudson: way above my pay grade
<davecheney> i'm not even qualified for pointer arithmetic
<axw> wallyworld_: yeah
<arosales> davecheney: looks like the latest in the archives is -23
<davecheney> yup
<wallyworld_> axw: great. so i'll land my branches and we can re-test i guess
<axw> wallyworld_: sgtm
<axw> wallyworld_: it would be nice to silence "cannot get instance info for instance "manual:10.0.128.7": no instances found" too, but it's not critical
<wallyworld_> axw: i haven't looked into that one yet - what's the cause?
<mwhudson> i wonder if there is an arm64 kernel with 64k pages i can try with
<axw> wallyworld_: manually provisioned machines are not managed by the provider - they just should not be polled
<davecheney> mwhudson: that would be a good test
<davecheney> i tried to test using gccgo/amd64
<wallyworld_> axw: in that case i'll add some code to my first branch
<davecheney> but lxc was all fucked on amd64 yesterday
<wallyworld_> do both fixes in one go
<axw> wallyworld_: there's a state.Machine.IsManual method that'll help there
<mwhudson> i don't know anything about legacy architectures like amd64
<arosales> davecheney: do you have a link handy to the matching initrd for the -28 .deb you pointed at?
<hazmat> arosales, so i removed the other kernels .. sudo update-grub.. currently doing shutdown -r now .. to see if it worked ;-)
<hazmat> arosales_, removed via pkgs that is
<jcastro> where's this -28 kernel at, I don't see it in proposed?
<hazmat> jcastro, it's on the machines that barf.. ls /boot
<thumper> jcastro: o/
<thumper> jcastro: I have a version of debug-log on my machine that works with the local provider
<hazmat> arosales, don't do what i just suggested.. it doesn't like that ;-)
<jcastro> thumper, hey! we made a plugin, heh
<thumper> yeah, but it doesn't do filtering
 * thumper guesses
<jcastro> oooh
<hazmat> or replay or exclude/include by unit/machine or channel
<hazmat> jcastro, we have parity..
 * hazmat sheds a tear
<hazmat> for debug-log
<jcastro> heh
<jcastro> thumper, also, one thing we should talk about
<davecheney> arosales: not -28
<jcastro> is the debug-hooks <-> resolved --retry thing; it makes me cry
<arosales> hazmat, ack :-)
<davecheney> you want uname -a
<davecheney> Linux winton-02 3.13.0-8-generic #28-Ubuntu SMP Mon Feb 17 08:22:39 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
<jcastro> oh _8_
<davecheney> wget https://launchpad.net/ubuntu/+source/linux/3.13.0-8.28/+build/5602341/+files/linux-image-3.13.0-8-generic_3.13.0-8.28_ppc64el.deb
<jcastro> not 28
<arosales> davecheney, so the _only_ fix is to revert to -08?
<davecheney> jcastro: i'll forward you the email
<jcastro> davecheney, do I need an initrd for that?
<hazmat> arosales, atm yes
<davecheney> arosales: the only workaround I have at this time
<davecheney> jcastro: no
<jcastro> davecheney, thanks!
<thumper> jcastro: I don't understand
<jcastro> so I do debug-hooks
<jcastro> and in order to be able to fire off a hook to debug it
<thumper> also, know this
<thumper> I can only make one person happy at a time
<jcastro> I need to open a new terminal and do resolved --retry
<thumper> it isn't your turn
<thumper> it is hazmat's
<jcastro> <3
<jcastro> local log will keep me happy. :D
<thumper> yeah, I'm open to looking to fix it...
<arosales> davecheney, I am confused in your email you state, "workarounds: you should install this kernel
<arosales> wget https://launchpad.net/ubuntu/+source/linux/3.13.0-8.28/+build/5602341/+files/linux-image-3.13.0-8-generic_3.13.0-8.28_ppc64el.deb"
<arosales> davecheney, ah I should have said -8.28 not -28
<arosales> davecheney, gotcha
<arosales> which is a revert
<hazmat> thumper, and all of cts :-)
<arosales> sorry long day
 * arosales better just grab some dinner
<davecheney> arosales: yup, we're also lucky that -28.8 isn't a thing
<davecheney> both those numbers appear to be increasing
<davecheney>     c.Assert(err, gc.IsNil)
<davecheney> ... value *mgo.QueryError = &mgo.QueryError{Code:16149, Message:"exception: cannot run map reduce without the js engine", Assertion:false} ("exception: cannot run map reduce without the js engine")
<davecheney> store tests are failing again
<davecheney> i thought that the store tests wouldn't run unless we passed a flag ?
<davecheney> cmars: didn't you fix this ?
<cmars> davecheney, thought so, yes. is this trunk or 1.18?
<davecheney> cmars: trunk
<cmars> hmm
<cmars> davecheney, which test is it? is there a file & line #?
<davecheney> cmars: please hold
<davecheney>  go test launchpad.net/juju-core/store 2>&1 | pastebinit
<davecheney> Failed to contact the server: [Errno socket error] [Errno socket error] timed out
<davecheney> oh for fucks sake
<davecheney> does nothing work today ?
<davecheney> thumper: what is the env var to lower logging ?
<davecheney> JUJU_LOG= ?
<thumper> here's mine: JUJU_LOGGING_CONFIG=<root>=INFO; juju.container=TRACE; juju.provisioner=TRACE
<thumper> that is read by bootstrap
<davecheney> ta
<davecheney> hnn, that isn't it
<davecheney> rog had a different one
<davecheney> a flag to testing
<davecheney> -juju.log WARNING
<davecheney> cmars: http://paste.ubuntu.com/7224466/
<cmars> ok, thanks. looking
<davecheney> ta
<cmars> davecheney, i thought it had landed, but it hasn't
<cmars> https://code.launchpad.net/~cmars/juju-core/cs-mongo-tests/+merge/213563
<cmars> i think we can land it, if CI will support running the store tests with full mongodb tests
<cmars> davecheney, what do you think? will you take that as an action item?
<davecheney> cmars: no
<cmars> ok :)
<davecheney> i cannot take that as an action item
<cmars> i'll follow up w/curtis tmw then
<davecheney> cool
<cmars> cheers
<wallyworld_> arosales: do you have any doc or otherwise that tells me what i need to do to get access to some arm vms to test a fix for that bug
<dannf> wallyworld_: i can help w/ that
<wallyworld_> \o/
<dannf> wallyworld_: do you have an account on batuan?
<dannf> that's the gateway into our network - if you don't, you can ask for one in #is
<wallyworld_> yes, since i have logged onto power vms previously
<dannf> sweet
<wallyworld_> i think you know it's bug 1302205
<_mup_> Bug #1302205: manual provisioned systems stuck in pending on arm64 <add-machine> <hs-arm64> <manual-provider> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1302205>
<dannf> right
<wallyworld_> i don't think it's specifically an arm issue
<wallyworld_> but want to test on arm anyway
<dannf> yep - just a sec... wonder if someone just took our host down
<wallyworld_> there have been 3 branches which landed today or last night which should hopefully fix it
<dannf> wallyworld_: yeah, looks like it's in use debugging an unrelated issue. i'll send you an e-mail with access info and ping you (or have someone ping you) when it's ready
<wallyworld_> great thanks :-)
<thumper> davecheney: how can I make sure that the bufio.Scanner doesn't consume too much?
<thumper> davecheney: I have an io.ReadCloser
<thumper> davecheney: and I want to read up to the first new line, and no more
<thumper> the Scanner when I call scan reads 4k
<thumper> which consumes way more than I want
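One way to read up to the first newline and no further, given that bufio.Scanner buffers ahead, is to pull the reader a single byte at a time; a minimal sketch:

    package sketch

    import "io"

    // readLine consumes r one byte at a time, so nothing past the first
    // newline is read from the underlying stream.
    func readLine(r io.Reader) (string, error) {
        var line []byte
        buf := make([]byte, 1)
        for {
            n, err := r.Read(buf)
            if n > 0 {
                if buf[0] == '\n' {
                    return string(line), nil
                }
                line = append(line, buf[0])
            }
            if err != nil {
                return string(line), err
            }
        }
    }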
<dannf> wallyworld_: e-mail sent - system isn't quite ready yet, stay tuned..
<wallyworld_> ok
<arosales> dannf, thanks re wallyworld arm access question :-)
<dannf> np, very happy to have help w/ it :) two days and i've just managed to learn how to kinda read go :/
<dannf> wallyworld_: ok, have at it
<wallyworld_> dannf: awesome, firing up ssh now
<wallyworld_> dannf: "ssh 10.229.41.200" should work right?
<dannf> it should, but it's not working for me either - lemme ask
<thumper> is that because the ssh config can't work out where to proxy through?
<jcw4> thumper: fyi I filed a merge request for the changes you suggested yesterday about isolating the git tests
<thumper> jcw4: awesome, I saw that in my inbox
<thumper> jcw4: I'll take a look once I submit this change :-)
<jcw4> thumper: thanks; no rush... just excited about contributing ;)
<thumper> wallyworld_: fyi instance type constraint branch has conflicts
<wallyworld_> yes it does, fixed locally
<wallyworld_> still wip
<thumper> jcw4: did you push your changes?
<jcw4> yes; to a new branch
<jcw4> the last one was too messy
<thumper> :-)
<thumper> jcw4: ok, there is a resubmit option on the RHS of the merge proposal page, that includes a "start over"
<thumper> which would have marked the old as superseded
<thumper> but that's OK, I'll just reject the old one.
<jcw4> I see. Thanks
<thumper> jcw4: what happens when you change the LC_ALL to "C" ?
<jcw4> I was planning on doing that after running all the tests without it.
<jcw4> after they all passed I was too excited and forgot
<jcw4> testing now
<jcw4> thumper: worker/uniter/charm/... tests passed
<jcw4> I'll push that change too?
<thumper> move that patch env into the base git test suite
<thumper> with the other env patches
<thumper> jcw4: then you can delete the SetUpTest for GitDirSuite
<jcw4> cool, right
<thumper> as it won't be doing anything
<thumper> then yes, push that
<jcw4> thumper: the LoggingSuite TearDownTest(c) needs to be called in the GitSuite TearDownTest?  I'd add that back in if necessary
<thumper> jcw4: if the only line of the tear down is to upcall the tear down, then you can just delete it
<jcw4> thumper: okay; that's all there is.  tx
 * thumper EODs
<axw> wallyworld_: I've responded to your comments, but I'm now looking at HA
<wallyworld_> np
<wallyworld_> just wanted to get some thoughts down
<wallyworld_> i'm stuck on other things also
<yaguang> hi all,  I am using 1.16.6 stable juju-core to bootstrap an Openstack Havana cloud, but it fails with "can't find index.json"
<yaguang> it seems that juju is trying to find the meta file in the path  /streams/  but  swift has  tools/streams/
<dimitern> morning all
<dimitern> fwereade, can you take a look at this please ? https://codereview.appspot.com/85220044/
<bigjools> dimitern: howdy!  How's the vlan work coming along?
<dimitern> hey bigjools
<dimitern> bigjools, i'm in the final steps - cloudinit scripts that bring up network interfaces
<dimitern> bigjools, vladk is working on a few extensions to gomaasapi to allow us to unit test the new api calls
<dimitern> bigjools, capabilities; lshw dump of a node; networks?op=list_connected_macs
<bigjools> dimitern: nice, all going to make it for the release?  And any issues with maas I need to know about?
<dimitern> bigjools, but all these were live tested on my local maas using daily builds ppa
<dimitern> bigjools, we're aiming for feature completeness by friday, but should be ready before that
<bigjools> excellent
<dimitern> bigjools, bug 1303617 hit me after a recent upgrade and i can no longer use the fast installer (it fails at boot and doesn't recover), so i'm stuck with the slow one, which is tedious
<_mup_> Bug #1303617: pc-grub install path broken in curtin <landscape> <curtin:Fix Released by smoser> <curtin (Ubuntu):Fix Released> <https://launchpad.net/bugs/1303617>
<bigjools> dimitern: weird, I  did a fast install today and it was fine
<dimitern> hmm Fix Released - i'll try it now
<dimitern> bigjools, we have a few wishlist items for the maas api
<dimitern> bigjools, like the ability to see networks + connected macs in one place (either in GET node/system_id or in GET networks/(all))
<bigjools> dimitern: please file bugs
<dimitern> bigjools, will do
<bigjools> I will triage them as wishlist and we'll put them on the stack
<dimitern> bigjools, otherwise now we need to do several api calls at startinstance time to get all we need
<bigjools> ok we can optimise that
<jam> cmars: I'm not sure why your test failed, but it would seem that we could tell the landing bot to always run the mongojs tests
<jam> cmars: though because of that, I'd actually rather have the CI tests disable it than have it disabled by default.
<jam> experience has shown that ENV vars play nicer with go test than flags, because flags are only valid per package, and "go test ./..." tries to pass all flags to all packages.
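The env-var gate jam describes might look like this in a test file; the variable name JUJU_TEST_MONGOJS is hypothetical:

    package store_test

    import (
        "os"
        "testing"
    )

    // TestMapReduce only runs when the (hypothetical) JUJU_TEST_MONGOJS
    // variable is set, so "go test ./..." passes under juju-mongodb,
    // which lacks the JS engine.
    func TestMapReduce(t *testing.T) {
        if os.Getenv("JUJU_TEST_MONGOJS") == "" {
            t.Skip("set JUJU_TEST_MONGOJS=1 to run tests that need mongo's JS engine")
        }
        // ... map/reduce-dependent assertions would go here ...
    }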
<dimitern> bigjools, filed bug 1304857
<_mup_> Bug #1304857: API should report networks and connected macs in the response of a single node <api> <MAAS:New> <https://launchpad.net/bugs/1304857>
<dimitern> jam, fancy a review ? https://codereview.appspot.com/85220044/
<dimitern> bigjools, another question re gomaasapi - how do you feel about adding high level wrappers around common APIs? like having AcquireNode method that the provider calls, rather than constructing a URL internally? Other similar examples are ListNetworks, ListNodes, etc.?
<bigjools> dimitern: I can't remember much about gomaasapi
<dimitern> bigjools, :) well, that was just a thought
<bigjools> dimitern: one of the other guys will have an opinion I'm sure
<dimitern> bigjools, we'll ask them for reviews when changes are proposed
<fwereade> dimitern, I'm getting progressively more nervous about NetworkName vs making it clear that it's a provider-specific id like instance.Id
<dimitern> fwereade, can't we say yes, it's provider specific, but it's also used by juju to identify the network internally?
<fwereade> dimitern, it will be, indeed
<fwereade> dimitern, but we're going to want network names as well
<dimitern> fwereade, ok, how are we going to make it clearer?
<fwereade> dimitern, when openstack gives us network abcdef638746328756865198, and we call that NetworkName, what field will we use for the "private" name users will want to use
<fwereade> dimitern, or "my_network" or whatever
<dimitern> fwereade, openstack has labels for networks just the same
<fwereade> dimitern, and so does every provider ever?
<fwereade> dimitern, that's quite the prediction ;p
<dimitern> fwereade, i can't say that :P
<dimitern> fwereade, so tell me how to alleviate your nervousness about it? :)
<fwereade> dimitern, call it NetworkId :)
<fwereade> dimitern, you know -- we have machine ids, and instance ids, and they are not the same
<fwereade> dimitern, (and machine ids are machine names really, but hysterical raisins)
<dimitern> fwereade, so, basically change it everywhere from NetworkName to NetworkId ?
<dimitern> fwereade, I need a follow-up for that
<fwereade> dimitern, I'm more concerned about the API
<dimitern> fwereade, you're thinking about network tags?
<fwereade> dimitern, and that the terminology that's hard to change should be consistent with what we expect to do
<fwereade> dimitern, well, that was my first thought
<fwereade> dimitern, but then I realised that converting these names into tags would be completely wrong
<rogpeppe> mornin' all
<dimitern> morning rogpeppe
<fwereade> dimitern, because they're provider vocabulary, not juju vocabulary
<rogpeppe> dimitern: hiya
<dimitern> fwereade, ok, so then what?
<dimitern> fwereade, i'm trying to follow but can't see what's needed
<rogpeppe> axw: ping
<fwereade> dimitern, although -- wait, don't you use tags in the client api? I think we should...
<axw> rogpeppe: pong
<dimitern> fwereade, we use tags everywhere in the api
<dimitern> fwereade, but not for networks
<rogpeppe> axw: about removing JobManageEnviron:
 * fwereade grumbles
<rogpeppe> axw: the reason we don't want to remove JobManageEnviron from a voting state server is that when a machine hasn't got JobManageEnviron we allow it to be removed
<fwereade> dimitern, we don't identify machines in the client API by provider-specific instance id, and we shouldn't identify networks that way either
<rogpeppe> axw: and if that happens we could break the invariant that we only ever have an odd number of voting state servers
<rogpeppe> axw: or rather, an odd number of state servers that *want* to vote
<dimitern> fwereade, i agree, but the only way we can deal with networks so far is if we get them from the provider
<dimitern> fwereade, at provisioning time
<fwereade> dimitern, we *can* impose a requirement that network names match provider ids exactly, this is MVP after all
<dimitern> fwereade, i guess you're suggesting to require the user to add any networks to juju before being able to deploy with them
<dimitern> fwereade, and i can see how this is the way we wanna go eventually, but not for now
<fwereade> dimitern, well, mid-term, yes -- I'd expect --networks params to be validated
<axw> rogpeppe: okay
<axw> rogpeppe: lots to take in here, still figuring out how all the voting bits work.
<fwereade> dimitern, short-term, I want us to be clear on the distinction between juju vocabulary over the client API (tags) and provider vocabulary over the internal API (network ids)
<dimitern> fwereade, so let's make a plan - i land this last CL and make another one for s/NetworkName/NetworkId/ throughout, and then do  the cloudinit stuff
<rogpeppe> axw: thanks for taking a look. feel free to ask about whatever doesn't seem to make sense.
<fwereade> dimitern, internally I'm fine saying that network name == network id (for mvp at least)
<fwereade> dimitern, am I helping?
<dimitern> fwereade, how can we be clear about this? in the docs? where?
<rogpeppe> fwereade: it would be nice if network ids were distinguishable from machine ids and unit ids (which are both currently distinguishable from each other)
<rogpeppe> fwereade: and service names, of course
<dimitern> fwereade, yeah, that is how it's gonna be for now - we call it networkId, but we mean maas-specific name
<fwereade> rogpeppe, not really gonna happen, that's why we have tags
<rogpeppe> fwereade: yeah, fair enough
<fwereade> rogpeppe, although *probably* units/machines will be safe, but services won't ;p
<rogpeppe> fwereade: yeah
<fwereade> dimitern, so, in the SetProvisioned bits: it's a provider-specific network id, not a tag
<fwereade> dimitern, in the Client-facing IncludeNetworks/ExcludeNetworks bits, we should be using tags
<dimitern> fwereade, yes, you mean better doc comment
<fwereade> dimitern, internally we can just strip off the "network-" prefix and keep going mapping 1:1 with provider-specific network ids
<dimitern> fwereade, so juju deploy --networks=net1,net2 which goes over the API as network-net1, network-net2
<dimitern> fwereade, and for include/excludeNetworks in state we still use the ids, not tags as usual
<fwereade> dimitern, yeah, exactly -- and for now the stripped names have to map to internal provider ids, but we keep them distinct so it doesn't become confusing when we have to change over later
<dimitern> fwereade, ok, got it
<fwereade> dimitern, inside state you can even stick with a single field in the document doing both duties... but be very clear that the _id field is for the *juju* name, not the provider name
<dimitern> fwereade, better comments, ok
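The mapping being agreed here, sketched with hypothetical helpers: the client API wraps the juju network name in a "network-" prefix, and the internal side strips it again before treating the remainder as the provider-specific id.

    package sketch

    import (
        "fmt"
        "strings"
    )

    const networkTagPrefix = "network-"

    // networkTag turns a juju network name into its client-API tag.
    func networkTag(name string) string {
        return networkTagPrefix + name
    }

    // networkFromTag recovers the name, rejecting anything that is not
    // a network tag.
    func networkFromTag(tag string) (string, error) {
        if !strings.HasPrefix(tag, networkTagPrefix) {
            return "", fmt.Errorf("%q is not a network tag", tag)
        }
        return strings.TrimPrefix(tag, networkTagPrefix), nil
    }

So juju deploy --networks=net1,net2 travels over the client API as network-net1, network-net2 and is unwrapped again internally.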
<fwereade> dimitern, brilliant, thanks
<dimitern> fwereade, i'll try to remember all that :) will propose it some time later today
<fwereade> dimitern, I'm going through the CL now in case you hadn't realised ;p -- some more naming quibbles but otherwise looking sound I think
<dimitern> fwereade, great!
<fwereade> dimitern, reviewed
<fwereade> rogpeppe, btw, do you think you might have a spare cycle to look at https://bugs.launchpad.net/bugs/1303735 today? it looks a bit like something you might know about
<_mup_> Bug #1303735: private-address change to internal bridge post juju-upgrade <openstack-provider> <juju-core:Triaged> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1303735>
<rogpeppe> fwereade: looking
<fwereade> axw, did you see https://bugs.launchpad.net/bugs/1303583 ?
<_mup_> Bug #1303583: provider/azure: new test failure <gccgo> <juju-core:Triaged> <https://launchpad.net/bugs/1303583>
<axw> fwereade: I have, but haven't had time to look into it yet
<fwereade> axw, np, just wanted to make sure it was on your radar
<dimitern> fwereade, ta
<rogpeppe> fwereade: the issue is quite obscure to me - i can't see the exact problem that's being reported there
<fwereade> rogpeppe, AIUI it's a change in behaviour -- jamespage will be able to make it clear I think?
<rogpeppe> fwereade: right. it would be nice to know what's the expected behaviour there and how the reported logs differ
<jamespage> rogpeppe, I upgrade nova-compute nodes (which had the correct private-address) and the private-address switches to be the ip address of the internal bridge virbr0
<rogpeppe> jamespage: where can i see the result of that in the status? (or the logs?)
<jamespage> rogpeppe, in the bug report
<rogpeppe> jamespage: yeah, i was looking at the bug report
<jamespage> rogpeppe, the dns-names of all the nodes are the same
<jamespage> #err
<rogpeppe> jamespage: the status doesn't seem to show private addresses
<jamespage> rogpeppe, OK - public-address then
<rogpeppe> jamespage: ah, dns-name, sorry
<jamespage> rogpeppe, whatever happened it was wrong
<rogpeppe> jamespage: right, the public address. that really confused me.
<jamespage> rogpeppe, I'm not sure about the private-address tbh
<jamespage> rogpeppe, title changed
<rogpeppe> jamespage: thanks
<dimitern> fwereade, how about s/SetProvisionedWithNetworks/ProvisionInstance/ ?
<rogpeppe> jamespage: can you find out what addresses nova returns for the instance ids?
<jamespage> rogpeppe, not right now
<rogpeppe> jamespage: ok
<jamespage> but I can look again later
<rogpeppe> jamespage: i'm suspecting that nova is returning the libvirt bridge address as one of the addresses for an instance, and our logic happens to be picking it out
<jamespage> rogpeppe, hmm
<jamespage> rogpeppe, nova has no knowledge of that afaik
<jamespage> as in there is no agent in the instance that would let it know
<rogpeppe> jamespage: hmm
<rogpeppe> jamespage: ah, i see where it comes from
<rogpeppe> axw: i think this issue (#1303735) is to do with worker/machiner - setMachineAddresses is setting the libvirt bridge address without marking it as NetworkMachineLocal
<_mup_> Bug #1303735: public-address change to internal bridge post juju-upgrade <openstack-provider> <juju-core:Triaged> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1303735>
<rogpeppe> jamespage: so, i know what the issue is, but i'm not yet sure of the right way to fix it
<axw> rogpeppe: that would suggest that the openstack provider doesn't have any cloud-local addresses
<axw> is that expected?
<axw> rogpeppe: and you're right of course, it's not setting them to local - how would it know to do that?
<rogpeppe> axw: yeah
<rogpeppe> axw: i don't think it means the provider doesn't have any cloud-local addresses, as we're looking for public addresses here
<axw> rogpeppe: sorry, misread the bug
<axw> rogpeppe: I thought it was private
<rogpeppe> axw: it seems like state.mergedAddresses doesn't preserve ordering, which is perhaps a pity
<rogpeppe> jamespage: it would still be useful to see what addresses nova is returning for the instances
<rogpeppe> axw: i'm thinking that it might be possible for a machine to know which interfaces are private, but it might be quite os-specific
<axw> rogpeppe: ISTM that the best thing we could do is to prefer cloud-local over unknown
<axw> rogpeppe: indeed
<rogpeppe> axw: when asking for a public address?
<axw> (quite os specific)
<rogpeppe> axw: that seems wrong to me
<axw> rogpeppe: yeah, if there's no public address
<axw> is it less wrong to choose an unknown address that might be private (like this)?
<rogpeppe> axw: another possibility is to strictly order Machine.Addresses before Machine.MachineAddresses
<axw> the right thing of course is to classify things properly
<axw> rogpeppe: looking at instance.SelectPublicAddress, that won't work - it chooses the last cloud-local/unknown in the list
<axw> which is different to internal, for some reason
<rogpeppe> axw: that's definitely wrong if so
<axw> rogpeppe: perhaps it just needs to change to be like internal
<rogpeppe> axw: yes
<axw> (and preserve order)
<rogpeppe> axw: in fact, the implementation of internalAddressIndex and publicAddressIndex should probably be merged
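A sketch of the selection behaviour axw proposes (not juju-core's actual instance.SelectPublicAddress): scan once, prefer public over cloud-local over unknown, and on ties keep the first address seen, so slice order is honoured:

    package sketch

    // scope is a toy version of juju's address classification.
    type scope int

    const (
        scopeUnknown scope = iota
        scopeCloudLocal
        scopePublic
    )

    type address struct {
        value string
        scope scope
    }

    // selectPublic returns the best candidate for a public address. A
    // later address only wins with a strictly better scope, so the first
    // address of the winning class is kept.
    func selectPublic(addrs []address) (string, bool) {
        best := scope(-1)
        var chosen string
        for _, a := range addrs {
            if a.scope > best {
                best = a.scope
                chosen = a.value
            }
        }
        return chosen, best >= scopeUnknown
    }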
<rogpeppe> jamespage: when you have a moment, would you be able to run this go program on one of the openstack nodes in the juju env that exhibits this problem? http://play.golang.org/p/GH0261EIHH
<rogpeppe> axw: i'm thinking we might be able to make some deductions from the interface name
<jamespage> rogpeppe, OK - lemme finish up the upgrade testing I'm doing and I'll try again
<natefinch> morning all
<natefinch> rogpeppe: you around?
<rogpeppe> natefinch: yup
<rogpeppe> natefinch: just doing a review. will be with you shortly.
<natefinch> rogpeppe: sure
<axw> rogpeppe: sorry was afk. I suppose that would be better than what we have now
<rogpeppe> axw: preserving order, you mean?
<axw> rogpeppe: deducing classification
<rogpeppe> axw: yeah
<rogpeppe> axw: we should preserve order too, i think, so the addresses are in predictable order. currently we're shuffling them randomly, which isn't great
<natefinch> rogpeppe, axw: you guys talking about the sort.Stable address problem with replicaset addresses?
<axw> rogpeppe: provider addresses should certainly come before machine, but otherwise I think relying on order is a mistake
<axw> natefinch: no, something else entirely - choosing public addresses when there are only unknown/cloud-local
<rogpeppe> natefinch: no, we're talking about #1303735
<_mup_> Bug #1303735: public-address change to internal bridge post juju-upgrade <openstack-provider> <juju-core:Triaged> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1303735>
<rogpeppe> axw: mgz says that order is important
<natefinch> ahh ok
<axw> rogpeppe: if addresses really do have a priority, then I think that should be explicit
<rogpeppe> axw: and that a provider can return preferred addresses by putting them earlier in the addresses slice
<axw> ordering in a slice seems pretty subtle, easy to break
<rogpeppe> axw: that's the current design, FWIW
<axw> yeah, I get that - it needs to be fixed - just whining :)
<natefinch>  type AddressByPriority []Address
<natefinch> now it's explicit
<rogpeppe> natefinch: i think it's reasonable as is, actually.
<axw> rogpeppe: we should probably document that order is important on instance.Instance.Addresses
<rogpeppe> axw: it's not too hard to take care to preserve order. it would be nice if there was a function to help with merging address slices in the instance package
<rogpeppe> axw: definitely
<natefinch> rogpeppe: I'm not a huge fan of relying on order of a generic slice.  I guess we very rarely pass it around outside the provider, and if the provider interface makes it clear the order matters, then that's probably ok.
<rogpeppe> natefinch: we pass it around a lot actually
<rogpeppe> natefinch: i don't really see the problem - slices are inherently ordered
<natefinch> rogpeppe: yes, but that order usually doesn't matter.  And it's not clear it matters when some random function gets a list of addresses deep in the bowels of the code.
<rogpeppe> natefinch: huh? that order often/usually does matter!
<natefinch> I presume we got into this mess because we didn't realize the order of the slice matters
<rogpeppe> natefinch: e.g. []byte
<rogpeppe> natefinch: we definitely need to document that more
<rogpeppe> natefinch: but i think it's reasonable to have a convention that []Address is ordered
<rogpeppe> natefinch: otherwise we'd end up adding some kind of a priority field which would actually make things considerably harder
<natefinch> rogpeppe: I'm just not a fan of preventing bugs by following conventions that are likely only written down in one place in a huge codebase.  But I agree making the providers return a different type would be a hassle.
<rogpeppe> natefinch: it's not just making the providers return a different type - it's coordinating priorities. do you have some global definition of address priority levels? what do you do when you combine addresses from two different sources?
<rogpeppe> natefinch: all those issues fall out naturally if you assume that ordering matters in a slice
<rogpeppe> natefinch: we should definitely write down in a couple of places that order is significant
<natefinch> rogpeppe: I don't want to continue to argue it, since it's just stopping us from actually doing anything, but I think the answer is non-trivial no matter what we do.
<rogpeppe> natefinch: i don't think it's too hard actually. just preserve order when combining addresses.
<wallyworld_> mgz: have you nova booted an instance on hp cloud manually and then attempted to ssh into it? i've had no luck getting in via ssh
<natefinch> rogpeppe: I guess I don't know how to preserve order when merging two slices unless you know how they were sorted in the first place.
<rogpeppe> natefinch: trivial answer: just concatenate the slices
<jam> axw: I just got a "session already closed" panic on the bot. Doesn't your patch fix that?
<axw> my patch?
<jam> axw: the one that untwines StateWorker and APIWorker
<axw> jam: rogpeppe fixed a channel closed one
<rogpeppe> natefinch: more sophisticated answer: delete items in the second slice that exist in the first slice before concatenating them
<axw> jam: link?
<axw> jam: nm, found it
<natefinch> rogpeppe: how do you know the ones in the second slice are lower priority than all the ones in the first slice?
<jam> axw: heres' a link to the failure: https://code.launchpad.net/~jameinel/juju-core/go-vet-cleanup/+merge/214911
<rogpeppe> jam: yeah, my patch wasn't for a "session already closed" error
<axw> jam: that looks different
<rogpeppe> natefinch: you make that decision
<rogpeppe> natefinch: based on the origin of each slice
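rogpeppe's "more sophisticated answer" as code, with a toy address type standing in for juju-core's richer one: keep the first slice intact, then append only those addresses from the second slice that are not already present:

    package sketch

    type addr struct {
        value string
    }

    // merge preserves the order of both inputs and drops duplicates from
    // the second slice, so first-slice priority is kept.
    func merge(first, second []addr) []addr {
        seen := make(map[addr]bool, len(first))
        merged := make([]addr, 0, len(first)+len(second))
        for _, a := range first {
            seen[a] = true
            merged = append(merged, a)
        }
        for _, a := range second {
            if !seen[a] {
                merged = append(merged, a)
            }
        }
        return merged
    }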
<rogpeppe> jam: it may well be related to my patch though
<rogpeppe> jam: i'll have a look
<jam> axw: I thought you had a comment in IRC about breaking the machine agent because of the multiple connections during upgrade, which might be related, but maybe not directly.
<jam> this, in particular, looks like a Watcher that is trying to finish something while the connections are cleaning up.
<axw> jam: I did, and rogpeppe fixed it... I don't think it is related, but maybe rog will have a better idea
<jam> axw: rogpeppe: looking at state/watcher/watcher.go it looks like it could be a race condition. If we triggered tomb.Dying but also got the timeout in time.After(period), w.needSync will be checked without looking at tomb.Dying
<jam> hmm.. alternatively, on first entering the function, you also set needSync, but haven't looked at Dying yet (AFAICT)
<rogpeppe> jam: i don't think that should matter
<jam> the traceback says that it was happening in New()
<rogpeppe> jam: until the watcher's tomb is Dead, it's entitled to do anything it likes
<jam> though it doesn't go above that.
<rogpeppe> jam: i think it must be that we're not closing things down properly
<jam> rogpeppe: sure, it looks like we might have gotten a closed session while we were doing something else, and we're closing it concurrently with creating something new.. ?
<jam> rogpeppe: anyway, don't look too deeply on this, I was just trying to push out some of wwitzel's in-progress stuff while he was gone
<jam> it isn't critical work
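For reference, the loop shape under discussion, reduced to a plain done channel rather than the tomb package the real state/watcher code uses: as long as the shutdown channel is consulted on every iteration, a timer firing at the same moment as shutdown costs at most one extra sync before the loop exits:

    package sketch

    import "time"

    // loop syncs every period until done is closed. The select re-checks
    // done each time round, so shutdown is never starved by the timer.
    func loop(done <-chan struct{}, period time.Duration, sync func()) {
        for {
            select {
            case <-done:
                return
            case <-time.After(period):
                sync()
            }
        }
    }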
<natefinch> btw, rogpeppe: to land HA, we need to rework the sort.Stable of addresses.  sort.Stable is go 1.2, and we only require go 1.1.2  right now
<axw> natefinch: why do you need to stable sort?
<rogpeppe> natefinch: right - i saw that. all that selectPreferredStateServerAddress logic is about to go anyway
<rogpeppe> natefinch: i didn't suggest taking it out because i didn't want to perturb the branch any more
<rogpeppe> natefinch: i'd just delete all of that and use mongo.SelectPeerAddress instead
<rogpeppe> axw: we used a stable sort to preserve address order
<natefinch> rogpeppe: right, we just have to take it out since the bot can't compile it
<jamespage> rogpeppe, OK - this is from 12.04 - http://paste.ubuntu.com/7225600/
<axw> rogpeppe: yeah, just wondering what part of the address is being ignored for the sort.Sort not to be good enough
<jamespage> rogpeppe, however I think I saw the issue on 14.04 nodes - so doing it there as well.
<rogpeppe> jamespage: oh, one mo. i didn't include some crucial info.
<axw> cos if they're equal and we're considering all fields, surely we don't care
<rogpeppe> jamespage: this is more useful: http://play.golang.org/p/mmy9KhUy9T
<rogpeppe> axw: we weren't comparing all fields
<mgz> wallyworld_: yeah, you need to add your ssh key either through cloud-init or via nova though
<wallyworld_> mgz: i tried via nova using keypair-add
<mgz> right, with that... it didn't work?
<wallyworld_> i used the --pub-key option
<wallyworld_> yeah, didn't work
<wallyworld_> mgz: i'm trying to test the latest fixes to the manual provider that landed today
<jamespage> rogpeppe, http://paste.ubuntu.com/7225650/
<mgz> you can use `nova console-log` to see what's up if you supplied any cloud init bits
<wallyworld_> mgz: didn't supply any cloud init bits, was just assuming keypair-add would work
<wallyworld_> console log seemed to show some random key being used
<wallyworld_> not mine
<mgz> odd
<rogpeppe> jamespage: thanks
<mgz> wallyworld_: ah,
<rogpeppe> jamespage, mgz: do you think it would be reasonable to pattern match on the interface name to determine the class of address? (e.g. if it matches virbr* then assume it's machine-local)
<mgz> did you actually use `nova boot --key-name MYKEY` ?
<wallyworld_> yep
<rogpeppe> jamespage: i don't know how predictable interface names are in linux
<mgz> okay, I'm out of ideas then :P
<wallyworld_> mgz: the same name as i used for keypair-add
<wallyworld_> :-(
<mgz> wallyworld_: try supplying a key with cloud-init instead
<jamespage> rogpeppe, hmm
<mgz> 's a bit more work but should be fine
<wallyworld_> mgz: point me to some doc to tell me what to do?
<mgz> sec
<wallyworld_> or i can try with lxc i guess
<rogpeppe> jamespage: because i believe there are cases where we really do want to get the addresses off the local machine interfaces. but that's hard if we can't tell which ones are machine-local.
<mgz> wallyworld_: basically, make a text file with `#cloud-config\nssh_authorized_keys:\n  - ssh-rsa .... blah@blah\n`
<jamespage> rogpeppe, you can't safely make that assumption "if it matches virbr* then assume it's machine-local"
<mgz> see doc/examples/cloud-config-ssh-keys.txt in lp:cloud-init for an example
<wallyworld_> ta, ok
<mgz> then you can supply that file straight as --user-data to boot
<mgz> (no need to gzip as it's so small)
<wallyworld_> ok, i'll try that
<jamespage> rogpeppe, is it possible to limit juju to querying interfaces it's been told about or created itself?
<jamespage> rogpeppe, whitelist rather than blacklist
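jamespage's whitelist idea in sketch form; the prefix list (eth*, br*, em*) is a guess at a sensible set, and virbr0 matches none of them:

    package main

    import (
        "fmt"
        "net"
        "strings"
    )

    // whitelisted reports whether an interface name looks like one juju
    // was told about or created itself.
    func whitelisted(name string) bool {
        for _, prefix := range []string{"eth", "br", "em"} {
            if strings.HasPrefix(name, prefix) {
                return true
            }
        }
        return false
    }

    func main() {
        ifaces, err := net.Interfaces()
        if err != nil {
            fmt.Println(err)
            return
        }
        for _, iface := range ifaces {
            if !whitelisted(iface.Name) {
                continue // skips virbr0 and friends
            }
            addrs, err := iface.Addrs()
            if err != nil {
                continue
            }
            fmt.Println(iface.Name, addrs)
        }
    }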
<dimitern> jam, standup?
<mgz> jamespage: we'll nearly start doing that with maas now I think, as dimitern has started getting the network interfaces from the lshw that maas provides
<mgz> we should probably do something similar when we grow better networking support in other clouds
<dimitern> mgz, jamespage, we definitely will do that for other clouds, gradually as juju networking support grows
<dimitern> fwereade, updated and tested https://codereview.appspot.com/85220044/ - should be good to land
<fwereade> dimitern, cheers
<mattyw> fwereade, I've made the small change you asked for - just added a test and a small fix - happy for me to land it? https://codereview.appspot.com/83060049/
<jam1> dimitern: sorry I missed the ping. I completely spaced off the standup, and was on my other laptop.
<dimitern> jam1, we're still there, you can join if you like :)
<fwereade> mattyw, if I LGTMed with fixes you don't need to ask, but you can always ask for another review if you're not sure
<mattyw> fwereade, ok, I just added the test - and a fix I found while writing it so I'll approve it then, thanks
<fwereade> mattyw, cool
<jam1> fwereade, dimitern: https://code.launchpad.net/~jameinel/juju-core/1.18-refuse-downgrade-1299802/+merge/214878 needs a review
<dimitern> jam1, looking
<jam1> dimitern: thanks
<dimitern> jam1, LGTM
<rogpeppe> mgz, jamespage: i wonder if we could just add only addresses from eth* interfaces for the time being. that would probably cover the case that we care about most currently.
<rogpeppe> natefinch: hangout?
<axw> rogpeppe: is it expected we'll want to have non-voting replicaset members? is that why we have NoVote/WantsVote? or is that specifically for handling inaccessible members?
<rogpeppe> axw: yes - if a machine goes down, we don't know that it might just come back up again in a few moments, so we don't want to just destroy it or remove it immediately
<rogpeppe> axw: so we just mark it so that it doesn't want the vote
<rogpeppe> axw: also, we can have a machine with WantsVote=false and HasVote=true
<axw> ok
<natefinch> rogpeppe:  sure
<rogpeppe> axw: our main invariant is that the number of machines that *want* the vote must always be odd, and similarly the number of machines in the replica set configuration that *have* the vote must always be odd.
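The invariant as a toy check, with illustrative field names rather than juju-core's state types: any change to the set of state servers must leave the number that want the vote odd:

    package sketch

    // stateServer is an illustrative stand-in for a state server machine.
    type stateServer struct {
        id        string
        wantsVote bool
        hasVote   bool
    }

    // oddVoters reports whether the count of would-be voters is odd,
    // which is the invariant any add/remove operation must preserve.
    func oddVoters(servers []stateServer) bool {
        n := 0
        for _, s := range servers {
            if s.wantsVote {
                n++
            }
        }
        return n%2 == 1
    }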
<rogpeppe> natefinch: one mo, i've just been called to lunch
<natefinch> ok
 * rogpeppe lunches
 * natefinch breakfasts
 * perrito666 snacks after breakfast
<perrito666> we really need more names for eating occasions
 * axw pats his belly full of pizza
<jam1> perrito666: brunch is the breakfast + lunch meal
<jam1> second breakfast is the hobbit one (along with elevenses)
<axw> heh
<perrito666> jam1: I am in the hobbit one
<perrito666> I intend to lunch too
<perrito666> (and honestly, I might also eat something near eleven now that you mention it)
<jam1> perrito666: http://www.moviemistakes.com/film1778/quotes
<jam1> so, breakfast, second breakfast, elevenses, Lunch, Luncheon, Afternoon tea, dinner, supper, I'm not sure if there are more
<perrito666> that pretty much covers my day :)
<jam1> rogpeppe, natefinch: so how close are we to having a "juju ensure-state-availability" that we can play with ?
<rogpeppe> jam1: i've got a branch that seems to work
<rogpeppe> jam1: but it needs more tests
<jam1> rogpeppe: natefinch: I just noticed that we thought EnsureMongo could probably land (and be polished from there) yesterday, but it is still up for review.
<jam1> at least the comment yesterday was "if I get enough time before the kids wake up", which probably didn't happen, but certainly afterwards... ?
<rogpeppe> jam1: it's landing very soon
<rogpeppe> jam1: it used a go 1.2 feature which meant it couldn't land as was
<jam1> rogpeppe: if that wasn't said weeks ago, I would trust you :)
<jam1> rogpeppe: what was that? (I wasn't particularly aware of 1.2 incompatibilities)
<rogpeppe> jam1: it used sort.Stable, which is a go1.2 addition
<jam1> ah
<rogpeppe> jam1: it's been LGTM'd
<mattyw> is the landing bot awake?
<jam1> mattyw: it landed my stuff 10 min ago
<jam1> but I'll check it
<jam1> mattyw: do you have something that it isn't noticing?
<mattyw> jam1,  https://code.launchpad.net/~mattyw/juju-core/deploy-with-user-name/+merge/213962
<mattyw> jam, I guess there might be a queue?
<jam1> mattyw: you don't have a commit message set
<jam1> so the bot ignores it
<mattyw> jam, ah - of course, thanks
<jam1> mattyw: I copied your description
<mattyw> jam1, that's great thanks very much
<mattyw> jam, I'll try to remember for next time
<dimitern> fwereade, poke re https://codereview.appspot.com/85220044/
<rogpeppe> natefinch: i've got a dentist's appointment now. back in 30 mins
<jam1> mattyw: I can see the bot is processing your request.
<jam1> Note that we've had some intermittent failures with "Session already closed". If you see that, you can resubmit.
<mattyw> jam1, ok thanks
<sinzui> Hi jam, fwereade : I think this bug is describing unsupported behaviour of lxc nested in kvm: https://bugs.launchpad.net/juju-core/+bug/1304530
<_mup_> Bug #1304530: nested lxc's within a kvm machine are not accessible <addressability> <cloud-installer> <kvm> <local-provider> <lxc> <juju-core:New> <https://launchpad.net/bugs/1304530>
<mgz> sinzui: yeah, that's likely just a case of no one having tried it yet
<mgz> the local provider is already pretty crazy when it comes to addressing without adding nested containers in
<sinzui> mgz, I think stokachu has done something like that and it required esoteric magic to work
<mgz> if you manually fiddle with the network setup you could probably make it work
<mgz> it's not something we're looking to support for trusty though
<sinzui> mgz, CI hates trunk https://bugs.launchpad.net/juju-core/+bug/1305047
<_mup_> Bug #1305047: Unit tests fail on lp:juju-core r2588  <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1305047>
<mgz> sinzui: that's rogpeppe's bug
<mgz> rogpeppe: have you got a bug number for it?
<sinzui> Ah, silly me, stokachu is the reporter of the bug. So I think he has reached the dead end that thumper predicted
<fwereade> dimitern, rereviewed
<dimitern> fwereade, thanks
<rogpeppe> mgz, sinzui: i tried and failed to reproduce that problem
<sinzui> :(
<rogpeppe> sinzui: interestingly that panic is in a different test to the one that jam saw
<sinzui> rogpeppe, CI will run the tests 5 times before giving up. It tried for many revs and did many fails
<sinzui> rogpeppe, But Vi just got a pass http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/job/run-unit-tests-amd64-precise/
<sinzui> rogpeppe, trusty has the same bad record, but its passes happen in a better order to make CI happy: http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/job/run-unit-tests-amd64-trusty/
<mgz> sinzui: it seems like a pretty easy to hit race fail
<mgz> landing bot has jackpotted a number of times
<natefinch> rogpeppe:  you have dentist appointments that only take 30 minutes?  Damn, mine always take like an hour.
<natefinch> (and that's not including time to get there)
<rogpeppe> natefinch: the actual appointment was just a checkup - 10 minutes only; and the dentist is only a couple of minutes bike ride away
<natefinch> rogpeppe: that's cool.
<rogpeppe> hmm, i just saw another (probably unrelated) panic when testing
<rogpeppe> http://paste.ubuntu.com/7226267/
<mgz> rogpeppe: is the change you suspect just revertable?
<rogpeppe> mgz: probably
<rogpeppe> mgz: i'd like to know what's going on though
<mgz> if CI can hit the error this reliably, should be pretty easy to confirm blame or not
<jamespage> rogpeppe, does the same code get used in the MAAS provider? when using LXC containers, brX is also valid
<rogpeppe> jamespage: yes, the same code gets used in the MAAS provider
<rogpeppe> jamespage: perhaps we need provider-specific code to run in the client to get the addresses
<jamespage> rogpeppe, so in the MAAS provider the IP address is assigned to the bridge, not the physical interface
<jamespage> assuming LXC or KVM containers have been created
<jamespage> whitelisting eth* and br* might work OK
<jamespage> that said I've seen emX style entries as well with biosdevname
<rogpeppe> jamespage: i'm not familiar with the details of what scenarios we really need the machine-local address discovery for.
<rogpeppe> fwereade, mgz: ^
<mgz> jamespage: I'm suspicious of anything just running on the machines themselves
<jamespage> mgz, rogpeppe: do we know what this local discovery step is used for?
<rogpeppe> jamespage: i'm not sure. i'm guessing there are some places that we can't use the provider for discovery
<rogpeppe> jamespage: perhaps this is something that's there for manual provisioning only
<mgz> jamespage: this is all re bug 1303735 right?
<_mup_> Bug #1303735: public-address change to internal bridge post juju-upgrade <openstack-provider> <juju-core:Triaged> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1303735>
<jamespage> mgz, yes
<fwereade> rogpeppe, jamespage: it's the blasted local provider that needs local address discovery
<rogpeppe> ah, interesting, there's a race in the test between agent config and the apiaddressupdater
<fwereade> rogpeppe, jamespage: I can't immediately recall whether we have lxc-ls --fancy everywhere we might need it, which *could* let us work around that
<fwereade> rogpeppe, jujud tests?
<rogpeppe> fwereade: yeah
<fwereade> rogpeppe, those things suck
<fwereade> ;)
<rogpeppe> fwereade: it's only a race because we don't set the APIHostPorts in the state
<rogpeppe> fwereade: so the apiaddressupdater starts and immediately assigns no addresses to the agent config
<rogpeppe> fwereade: the test only works if the APIWorker grabs the APIInfo before it does that
<rogpeppe> fwereade: we should have valid APIHostPorts in the state, then there shouldn't be a problem
<fwereade> rogpeppe, right, makes sense
<rogpeppe> fwereade: i wondered about having an EnvironProvider method that allows us to ask a provider for local addresses
<rogpeppe> fwereade: then the local provider could implement it, but the other providers could just return nothing
<rogpeppe> fwereade: (but it would potentially allow us to move away from using the instancepoller for some providers, if we wanted to - hazmat thinks that's a good idea)
<fwereade> rogpeppe, honestly I think it's a matter of tuning the instancepoller more than it is a matter of dropping it
<fwereade> rogpeppe, we want to keep track of instance status as well
<fwereade> rogpeppe, I think there's something else
<fwereade> rogpeppe, oh, yeah, instance networks
<rogpeppe> fwereade: regardless, having a way for providers to add locally-sourced addresses to a machine seems like a reasonable idea
<fwereade> rogpeppe, yeah, I wouldn't object to making the MachineAddresses stuff smarter
<dimitern> fwereade, network ids and tags https://codereview.appspot.com/86010044 when you can take a look
<rogpeppe> fwereade: FWIW MachineAddresses isn't a great name - it doesn't really say why Machine.MachineAddresses is different from Machine.Addresses...
<fwereade> rogpeppe, agreed, it's an awful name
<rogpeppe> fwereade: LocalAddresses?
<rogpeppe> fwereade: LocallySourcedAddresses?
<fwereade> rogpeppe, the semantic payload there is not ideal either
<rogpeppe> (i don't like either of those, BTW)
<fwereade> rogpeppe, the latter is probably best
<natefinch> anyone know why I'm getting this when I run juju? WARNING unknown config field "proxy-ssh"
<fwereade> (best of a bad bunch)
<rogpeppe> fwereade: AgentProvidedAddresses ?
<fwereade> natefinch, hmm, that seems odd -- it's something axw added for azure, but I'm not sure where the error comes from
<natefinch> fwereade: I don't have proxy-ssh in my environments.yaml anywhere
<natefinch> fwereade: I think blowing away my old environments.yaml and making a new one helped
<vladk> dimitern: I ran lbox propose for gomaasapi, but no codereview on appspot was created, only on LP
<dimitern> vladk, did lbox give you any errors?
<fwereade> natefinch, would you take a few moments to dig into it sometime today please? many users will have old environments.yamls...
<vladk> dimitern: no, it just printed a link to LP:
<vladk> Proposal: https://code.launchpad.net/~klyachin/gomaasapi/101-testserver-extensions/+merge/214961
<dimitern> vladk, you could try running it again
<natefinch> fwereade: yeah, once HA lands, I'll be able to actually work on other things
<natefinch> fwereade: which will be as soon as I can run a few tests
 * fwereade cheers at natefinch
<natefinch> fwereade: this also seems to have fixed other problems I had been experiencing.  Definitely worth investigating
<natefinch> (luckily I kept around the old environments.yaml)
<natefinch> I wish we had an environment variable we could set that would effectively add --debug to every juju command line call
<vladk> dimitern: I specified -cr explicitly: https://codereview.appspot.com/86070043
<dimitern> vladk, ah, you know - gomaasapi doesn't have .lbox.check in the root dir I think
<dimitern> vladk, in juju-core we have .lbox containing the default args for lbox: "propose -cr -for lp:juju-core"
<mgz> vladk: you can always rerun lbox propose as many times as you want, so that's fine
<mgz> I always do `lbox propose -cr -v` out of long standing habit
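(For context: the .lbox file is a one-line defaults file in the branch root. A hypothetical equivalent for gomaasapi, mirroring the juju-core one dimitern quotes, would simply be:)

```
propose -cr -for lp:gomaasapi
```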
<jamespage> fwereade, are you guys aiming for a 1.18.1 release prior to the end of tomorrow?
<jamespage> (which co-incidentally is when Final Freeze kicks in)
<mgz> jamespage: we appear to have no fixed bugs in 1.18.1
<mgz> so I'd guess not.
<natefinch> gah, I still can't deploy stuff locally
<jamespage> mgz, well we have 24 hrs until tomorrow eod :-)
<mgz> jamespage: yeah, but all the bugs look hard... ;_;
<fwereade> jamespage, it's not impossible: I'm working on the first; I will ask axw to look at the second overnight; just asked for cmars' comments re third; not sure about 4th, I'll ask an australian to take a look; 5th appears unreproducible
<jamespage> fwereade, OK - thanks
<natefinch> ah hah..... lxc-ls seems broken.  I bet that's my problem
<jamespage> fwereade, it's not impossible to get a point release in after tomorrow
<fwereade> jamespage, sure, but I prefer to be a good citizen where practical
<rogpeppe> natefinch: i'm just proposing a branch to fix one of the machine agent panics. perhaps we could join up to move the HA stuff forward after that?
<dimitern> fwereade, updated https://codereview.appspot.com/85220044/ once more
<natefinch> rogpeppe: sure.  I fixed the port problem with the initiate address and removed the testing panics you mentioned in the review.  It works on amazon, but I seem to have an LXC problem on my local host, so I'm apt-getting and will reboot after to see if that fixes anything
<dimitern> fwereade, i really want to land that and  https://codereview.appspot.com/86010044/ today if i can
<rogpeppe> this fixes a cmd/jujud test crash: https://codereview.appspot.com/86080043/
<rogpeppe> fwereade, mgz, dimitern, natefinch: review appreciated
<rogpeppe> unfortunately it's not the one that people have been seeing on the 'bot and in CI
<mgz> ha, typed in the wrong id
<mgz> but I should actually review dimitern's branch, which is where I ended up :P
<natefinch> brb, gonna reboot now that I have upgraded, see if that fixes my lxc problems
<dimitern> rogpeppe, reviewed
<rogpeppe> dimitern: ta!
<rogpeppe> dimitern: in general i prefer to use a literal - if i use "nothing", then i have to check its value. i don't mind the slightly greater verbosity.
<rogpeppe> dimitern: at some point in the future i hope to see a "zero" builtin in Go that acts like nil except it represents the zero value for any type.
<dimitern> rogpeppe, I know you don't mind :) It's just my opinion
<dimitern> rogpeppe, yeah, that will be very handy
<rogpeppe> dimitern: FWIW, i think the code with naive literals reads slightly more easily - it's more directly obvious what the code is doing.
<fwereade> dimitern, https://codereview.appspot.com/86010044/ reviewed
<dimitern> fwereade, ta!
<fwereade> dimitern, you might not be so happy when you read it, we may need to discuss, I fear I have been unclear
<dimitern> rogpeppe, the first thing i'm doing when reading unfamiliar code and see a var/type/etc. i don't get, i immediately hit M-. in emacs, which invokes godef on the symbol and voila!
<rogpeppe> dimitern: i'm usually reading the code in codereview...
<rogpeppe> dimitern: but without the nothing declaration there's no need for any second look - it's immediately obvious on first scan
<rogpeppe> dimitern: which is why i prefer it more direct like that
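To illustrate rogpeppe's style point outside the branch under review, here is a standalone sketch with a made-up result type; the named zero value forces a second look, while the naive literal reads directly:

```go
package example

type result struct {
	name string
	err  error
}

// The style under discussion: a shared, named zero value...
var nothing = result{}

func viaNothing(ok bool) result {
	if !ok {
		return nothing // the reader must check what "nothing" is declared as
	}
	return result{name: "found"}
}

// ...versus the naive literal rogpeppe prefers.
func viaLiteral(ok bool) result {
	if !ok {
		return result{} // immediately obvious on first scan
	}
	return result{name: "found"}
}
```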
<fwereade> dimitern, the other one LGTM with trivials
<dimitern> fwereade, thanks, still reading the last review :)
<dimitern> fwereade, i can't really impose restrictions on what juju deems a valid network id until it's provider specific, can I ?
<natefinch> rogpeppe: lucky 13? https://codereview.appspot.com/72500043/   addressed the things you mentioned in the last review, and it tests ok live on local and amazon
<fwereade> dimitern, bugger, ofc you're right
<fwereade> dimitern, TODO it with a short explanation of why we're so lax? or, hmm, ask the maas guys what their restrictions on net names are?
<fwereade> dimitern, and slavishly copy those? :)
<dimitern> fwereade, and re params.Network having both Tag and Id as you suggest - I can do that and for now make sure both match always
<dimitern> fwereade, (except for the tag prefix ofc)
<dimitern> fwereade, I'll just look at the maas source
<dimitern> fwereade, re tags/names/ids - we can have in state and in the api + params all three and make tags work with names and keep name=id for now
<vladk> dimitern: I got LGTM from rvba. Could you give me the next task?
<dimitern> vladk, sorry, I have a few comments for your review
<dimitern> vladk, will submit in a minute
<fwereade> dimitern, yeah -- params.Network is just saying here's the net with this juju name, and this is what the provider calls it
<fwereade> dimitern, sounds like we're aligned
<fwereade> dimitern, thanks
<dimitern> fwereade, yep, thanks, will propose a bit later, if you're still here will ping you again :)
<dimitern> vladk, reviewed
<mgz> fwereade: your comments on dimitern's proposal confuse me
<fwereade> mgz, ha :)
<fwereade> mgz, it is perfectly possible that I am missing something
<rogpeppe> natefinch: get it approved!
<fwereade> mgz, would you expand a little?
<mgz> it seemed like we were deriving the tags from the cloud provider stuff, hence the no restrictions bar != "", rather than from being named by the user
<mgz> tag=what juju calls the network, id=what the cloud calls the network, name=junked due to being ambiguous
<natefinch> rogpeppe: it's going. I *just* merged and fixed a conflict, so it should just work.   fingers crossed.
<rogpeppe> natefinch: hangout?
<mgz> (and label=optional friendly id for network also from the cloud)
<mgz> I guess your review is saying we *should* be providing a way for the user to specify a tag, tied to a given id
<mgz> but without some juju cli network commands, I don't see how we add that
<mgz> fwereade: ^does that make sense of my confusion?
<fwereade> mgz, tag is purely API-level, it's not what juju calls things
<dimitern> mgz, for now network.ProviderId == network.Name in juju (both state and api) and tags are created from names
<mgz> fwereade: ugh, the name inside the tag then
<mgz> having tag be the magic decoration bit is annoying
<fwereade> mgz, we *will* have cli network commands, and I want what we have to day to fit in with what we will need to do soon
<fwereade> mgz, for now, maas is the only thing that has networks, and there's a perfect mapping between provider id and user name
<mgz> fwereade: so, dimitern's version seems to do that, by autonaming the... names inside the tags, and placing no restrictions on them
<mgz> so they can become user-specified later
<fwereade> mgz, but it also conflates names and ids in several places -- and if we don't make the kind of data clear now we will have the devil of a time once we have a name that does not match an id
<mgz> fwereade: well the conflating I saw was from that autonaming business... maybe there's some other bits I missed?
<dimitern> mgz, I wasn't clear about not just renaming Name to Id, but having both and using name for juju stuff and id for provider stuff
<fwereade> mgz, it seemed to me that it was using Id instead of name across the board
<mgz> okay, so we just need to be picky as hell about the naming... which will still be confusing even if we are due to too many things, too few names for names...
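A sketch of the shape being converged on here (field names are assumptions drawn from this conversation, not the final juju-core API): the API-level struct carries both the juju-facing tag and the provider's own id, kept equal apart from the tag prefix until user-specified names exist.

```go
package params

// Network describes a network over the API.
type Network struct {
	Tag string // API-level identifier built from the juju name, e.g. "network-storage"
	Id  string // what the provider (MAAS, for now) calls the network
}

// networkTag derives the tag from the juju name; since name == id for
// now, the two fields match except for the prefix.
func networkTag(name string) string {
	return "network-" + name
}
```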
<natefinch> rogpeppe: sorry, stepped out to get lunch.  I can hangout now, yeah
<rogpeppe> natefinch: ok, cool
<rogpeppe> natefinch: https://plus.google.com/hangouts/_/canonical.com/juju-core?authuser=1
<rogpeppe> natefinch: lp:~rogpeppe/juju-core/540-enable-HA
<vladk> :332
<natefinch> \o/  The proposal to merge lp:~natefinch/juju-core/030-MA-HA into lp:juju-core has been updated.   Status: Approved => Merged
<alexisb> natefinch, sweetness!
<natefinch> finally finally finally
<rick_h_> natefinch: if you get time want to chat about that for a couple of min
<natefinch> rick_h_: how urgent is it?  My day is pretty slammed
<rick_h_> natefinch: not at all, completely when you've got time
<rick_h_> and the time can even be 'let's catch up in vegas'
<natefinch> ha, ok
<rick_h_> just a heads up gui wants to catch up on HA to see what we can/should do from our end so we can have a plan
<natefinch> rick_h_: good enough... I'll shoot you an email about it.  We're not quite done with it, but this was a huge chunk that had taken way too long to get in
<rick_h_> natefinch: rgr, thanks
<jam1> hey guys, something in the test suite is now creating a directory and turning it into 666
<jam1> which means it is not executable or writeable
<jam1> which means the test suite is failing to clean it up
<jam1> a lot of: /tmp/jctest.LpP/gocheck-5577006791947779410/27/some-file
<jam1>  files
<jam1> fwereade: is that one of your FT tests ?
<natefinch> jam1: there's a lot of tests that use 0666
<jam1> natefinch: sure, but you don't normally change a *DIR* to 0666
<fwereade> jam, hmm, I didn't *think* I did that, but I can't swear to it
<jam1> they need 7 to be able to read the content
<natefinch> jam1: oh, directory, right
<natefinch> jam1: Tcharm/repo_test.go has a couple of those
<natefinch> s/Tcharm/charm.
<fwereade> jam1, hmm, I do have an 0644 in there, drivebying it now
<jam1> fwereade: sorry, it is 444 read only
<jam1> 6 would be rw
<fwereade> jam, none of them I think
<fwereade> ah-ha! yes I do
<fwereade> bugger
<jam1> well, it is all test number 27 ... ):
<jam1> :)
<jam1> not that *that* part helps
<jam1> fwereade: TestRemovedCreateFailure
<jam1> TestDirCreateFailure
<jam1> fwereade: so I think just adding a Chmod(777) so we can clean up afterwards would be nice
<fwereade> jam1, yeah, deferred chmods back to 0777
<jam1> I don't think it is causing the test suite to *fail*, but it is preventing rm-rf from cleaning up after itself
<fwereade> jam1, yep
<fwereade> jam1, sorry about that
<jam1> fwereade: np, I only noticed because the test suite is failing for other reasons
<jam1> and that shows up in the log
<jam1> dimitern's last patch just failed; some of it looks transient, and some looks like an error message changed.
<jam1> but at the *end* of that, it says "I couldn't clean up"
<jam1> but hey, root can do anything it wants...
<jam1> :)
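The shape of the fix fwereade describes, as a minimal sketch (suite and test names here are illustrative): drop the permissions to provoke the failure under test, and defer a chmod back to 0777 so rm -rf can clean up the temp directory afterwards.

```go
package somepkg_test

import (
	"os"

	gc "launchpad.net/gocheck"
)

type permSuite struct{}

var _ = gc.Suite(&permSuite{})

func (s *permSuite) TestDirCreateFailure(c *gc.C) {
	dir := c.MkDir()
	// Drop to read-only (the 444 jam1 saw) to provoke the create failure.
	c.Assert(os.Chmod(dir, 0444), gc.IsNil)
	// Restore sane permissions on the way out so cleanup can remove it.
	defer os.Chmod(dir, 0777)
	// ... exercise the code path that should fail to write into dir ...
}
```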
<natefinch> trivial code review anyone?  https://codereview.appspot.com/85970044
<natefinch> jam1: btw, did you see EnsureMongoServer finally landed?
<jam1> natefinch: I didn't. YAY \o/
<natefinch> right? :)  Super psyched
<natefinch> rogpeppe: https://codereview.appspot.com/85970044
<jam1> natefinch: lgtm
<fwereade> natefinch, woooot!
<natefinch> fwereade: thanks :)
 * natefinch has to see a man about some bees.  
 * rogpeppe is done for the day
<rogpeppe> might make it back in later for a little bit
<rogpeppe> g'night all
 * fwereade needs to go out for a while, would appreciate looks at https://codereview.appspot.com/85670046
<jam1> natefinch-afk: you forgot to set a commit message when you proposed your merge, I'll do it for you
<cmars> proposal for LP: #1303880 up, PTAL https://codereview.appspot.com/86130043
<_mup_> Bug #1303880: Juju 1.18.0, can not deploy local charms without series <charms> <regression> <series> <juju-core:In Progress by cmars> <https://launchpad.net/bugs/1303880>
<cmars> jam1, can you take a look at ^^
<sinzui> cmars, I need to update the bug and note that setting the default-series in the env is also a solution if you are opposed to typing the series when you deploy a charm
<sinzui> cmars, I am taking the regression tag off, now that we know the affected users are the edge cases we talked about. I think the solution is to show the right error message
<cmars> sinzui, that's a much easier fix :) please note the desired error message in the bug
<sinzui> I see you included lucid, but juju and charms don't run on it
<sinzui> cmars, I will think of a message right now
<sinzui> cmars, https://bugs.launchpad.net/juju-core/+bug/1303880/comments/6
<_mup_> Bug #1303880: Juju 1.18.0, can not deploy local charms without series <charms> <series> <juju-core:In Progress by cmars> <https://launchpad.net/bugs/1303880>
<sinzui> though I see I meant to write "release notes" in that comment
<cmars> sinzui, updated my proposal. can someone take a look, it's much smaller now :) https://codereview.appspot.com/86160043
<sinzui> cmars, I am not a reviewer but that size is nice.
<cmars> :)
<sinzui> where did the US go?
<sinzui> perrito666, can you review cmars's branch?
<rogpeppe1> anyone up for doing a review? https://codereview.appspot.com/86200043/
<bac> sinzui: have you tried using staging-tools or their kin lately?
<sinzui> bac, I don't even know what they are
<bac> sinzui: you created the branch :)
<bac> https://code.launchpad.net/~ce-orange-squad/charmworld/staging-tools
<sinzui> bac :) I have forgotten much
<sinzui> oh
<sinzui> bac.
<sinzui> you probably care about the RT report today
<bac> yes, maybe.  does it involve access to canonistack post hb?
<sinzui> bac: I used those tools several times a week. orangesquad and juju-qa cannot use swift. Juju is unusable
<sinzui> bac: nova is fine
<bac> sinzui: i'm seeing canonical-sshuttle dying, not being able to connect to canonistack
<bac> it all worked the last time i tried
 * sinzui tries
<sinzui> bac, I am connected, but what did I connect to? because it looks empty
<sinzui> bac: I think my jenv is bad. I am told the env is not bootstrapped
<bac> sinzui: you think you're on staging?
<thumper> davecheney: bugger... it seems like godeps doesn't update the hg branches
<thumper> ah, no it does, it just doesn't say that it does
<thumper> trivial review to just update the go.net library: https://codereview.appspot.com/86250043
<cory_fu> Has juju ssh to a machine number (juju ssh 0) been fixed yet for LXC?
<thumper> cory_fu: what do you mean?
<thumper> cory_fu: for the local provider?
<thumper> cory_fu: yes it works, except for machine 0 as that is the host
<thumper> unless your host actually has sshd running
<thumper> cmars: how goes https://bugs.launchpad.net/juju-core/+bug/1303880
<_mup_> Bug #1303880: Juju 1.18.0, can not deploy local charms without series <charms> <series> <juju-core:In Progress by cmars> <https://launchpad.net/bugs/1303880>
<cmars> thumper, i have a proposal, PTAL, https://codereview.appspot.com/86160043/
<thumper> ack
<dannf> wallyworld_: just to clarify - the branch you linked fixed that error, but wasn't the root cause, correct? just wondering if there's a fix i can/should verify
<thumper> dannf: perhaps more context would help :-)
<dannf> thumper: LP: #1302205
<_mup_> Bug #1302205: manual provisioned systems stuck in pending on arm64 <add-machine> <hs-arm64> <manual-provider> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1302205>
<thumper> dannf: it was my understanding that the fixes that wallyworld had were to fix the root cause
<thumper> dannf: are you still having issues?
<thumper> I know that wallyworld was trying to test yesterday
<thumper> but not sure on the final progress
<wallyworld> dannf: hi, i just logged on again after networking issues so can't see that backscroll, can i help?
<dannf> wallyworld: yeah - just curious if the branches you linked were root cause - i.e., if it is worth me retesting w/ them
<wallyworld> dannf: i committed fixes, but had trouble testing because i have to run from trunk to test and so can't use simplestreams to get the tools and building juju from source on the arm vms just hung
<wallyworld> so i couldn't get tools built to test with
<dannf> wallyworld: did you try building on the nova host? i've built there many times w/o a problem
<wallyworld> i installed gcc-go and couldn't get outgoing access to launchpad or github so just copied my source tarball across
<wallyworld> yeah, i think i built on the nova host
<wallyworld> i can try again
<wallyworld> actually
<wallyworld> i could get outgoing access via wget
<wallyworld> but go get just hung
<wallyworld> so i couldn't get the juju source in the normal way
<wallyworld> via vcs
<dannf> though building in the vms *should* work - if not, probably a bug
<rogpeppe1> a fairly trivial review if anyone wants to take a look: https://codereview.appspot.com/85600044
<dannf> wallyworld: we can ask IS to open access for us to certain things. surprised lp access was blocked
<wallyworld> dannf: i could wget to launchpad but "go get launchpad.net/juju-core" failed
<wallyworld> or hung
<dannf> ah - go get... never used that before
<cmars> thumper, thanks
<wallyworld> so i just copied across the source
<dannf> i'll investigate that and at least get a bug filed if needed
<wallyworld> dannf: go get uses bzr behind the scenes
<wallyworld> dannf: i have a meeting now but will ping back when done
<dannf> ack
<cmars> sinzui, fix for LP: #1303880 is landing in trunk. do you need it proposed to any branches?
<_mup_> Bug #1303880: Juju 1.18.0, can not deploy local charms without series <charms> <series> <juju-core:In Progress by cmars> <https://launchpad.net/bugs/1303880>
<sinzui> cmars, yes please lp:juju-core/1.18
<dannf> wallyworld: seems to be working for me. but slow. and no output. i just see the directory growing in size
<thumper> wallyworld: trivial review for you, although since it is needed on two branches, lbox isn't good at handling it
<thumper> https://code.launchpad.net/~thumper/juju-core/update-websocket-lib/+merge/215057
<thumper> and https://code.launchpad.net/~thumper/juju-core/update-websocket-lib/+merge/215046
<wallyworld> thumper: otp, will look soon
<thumper> ack
<thumper> cmars: I have approved the other branch
<thumper> cmars: although I did realise that there aren't any tests for the new error message
<thumper> cmars: is it hard to add one?
<thumper> cmars: also, lbox doesn't like submitting the same branch to multiple targets
<thumper> cmars: it is a bit too dumb
<cmars> thumper, i'll propose a test case for deploying local without series. might be after dinner
<thumper> cmars: ack
<dannf> wallyworld: and go get seems to have completed (/home/ubuntu/dannf/go)
<wallyworld> dannf: great :-) still in meeting, will check back soon
<dannf> wallyworld: np; i need to start up the grill, so responses will be latent
<davecheney> arosales: hazmat ping
<davecheney> arosales: hazmat do you have time for a quick G+ to talk about the demo
<arosales> davecheney: hello
<davecheney> arosales: hazmat lets take this to #eco
<arosales> ok
#juju-dev 2014-04-10
<wallyworld__> dannf: yeah, seems to be working for me now. i think it was just so slow yesterday that i gave up. i also tried building juju from src and i gave up after 30 minutes. maybe the vms are just slow
<davecheney> thumper: wallyworld__ on my ppc systems I see the provisioner constantly (every 300 sec) polling the charm store
<davecheney> this fails because FIREWALLS!
<davecheney> but i wonder why does it poll at all ?
<wallyworld__> davecheney: to see if charms are out of date
<wallyworld__> so that can be shown in status
<davecheney> wallyworld__: ok, so i should raise an RT to get access to charm store enabled
<davecheney> looks like the proxy is blocking it
<wallyworld__> yeah that would be good
<davecheney> wallyworld__: ill fix
<wallyworld__> davecheney: so when you run status, it says "hey you have mysql version 10 installed but version 12 is available"
<davecheney> ahh nice
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1305365
<_mup_> Bug #1305365: juju 1.18.0 environment unusable after bootstrap <juju-core:Triaged> <https://launchpad.net/bugs/1305365>
<davecheney> ^ what the hell
<davecheney> it bootstrapped and deployed fine
<davecheney> why does status have a whinge
<cmars> thumper, i'm trying to capture output that's been written with logger.Errorf (from within a coretesting.RunCommand). is there a way to hook into loggo like this?
<cmars> i've tried coretesting.Stdout(ctx), and Stderr, but nothing there
<thumper> cmars: hey
<cmars> hi
<thumper> cmars: chances are you are using a LoggingSuite
<thumper> it captures the logging
<thumper> it is in the gocheck *gc.C thingy
<thumper> as some test log
<thumper> normally we don't test logging
<thumper> what are you after exactly?
<thumper> hi axw
<axw> hey thumper
<thumper> axw: was asked to hit you up about a critical bug
<thumper> bug 1303735
<axw> ok
<_mup_> Bug #1303735: public-address change to internal bridge post juju-upgrade <openstack-provider> <juju-core:Triaged> <juju-core 1.18:Triaged> <https://launchpad.net/bugs/1303735>
<cmars> thumper, the recent error message I've added. i think that'll do it, thanks
<thumper> apparently you were talking with rob about it
<axw> ah, was chatting with rog about that last night...
<thumper> cmars: there are some new methods on the context object
<thumper> cmars: if you are testing the error
<thumper> then you should check the error response from Run
<thumper> cmars: testing.RunCommand returns a context and an error
<thumper> the error is what you want to be checking
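A minimal sketch of the approach thumper suggests (the exact RunCommand signature and the DeployCommand name are assumptions here): assert on the error returned from Run rather than trying to fish the message out of captured loggo output.

```go
func (s *deploySuite) TestDeployWithoutSeries(c *gc.C) {
	// coretesting.RunCommand returns the command's context and the
	// error from Run; the error carries the message the user would see.
	_, err := coretesting.RunCommand(c, &DeployCommand{}, "local:mycharm")
	c.Assert(err, gc.ErrorMatches, ".*series.*")
}
```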
<thumper> axw: do you know what the issue is there with the addresses
<thumper> ?
<thumper> I recall work going on there
<thumper> but I don't know exactly what changed
<axw> thumper: it looks like the openstack environment has no public addresses, only cloud-local and/or unknown; the SelectPublicAddress code chooses the last cloud-local/unknown address in the list... so it gets one of the unknown addresses that the machiner records
<thumper> hmm...
<thumper> so... how do we go about fixing it?
<axw> thumper: so, we should probably prefer the provider addresses over machine addresses if they're all cloud-local/unknown
<thumper> are we able to tell the difference
<thumper> ?
<axw> yes, they're recorded in separate lists
<axw> thumper: it would be ideal if the machiner could decide which addresses were machine-local, but it's not a straightforward thing to do
<axw> apart from localhost addresses of course...
<thumper> right
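A sketch of the preference axw proposes (scope constants as assumed from the instance package of the time; not the actual patch): take a genuinely public address from either list first, and only then fall back to cloud-local/unknown, preferring provider-recorded addresses over those the machiner recorded.

```go
// selectBestPublic picks the value to report as the public address.
func selectBestPublic(provider, machine []instance.Address) (string, bool) {
	// First pass: any genuinely public address wins.
	for _, addrs := range [][]instance.Address{provider, machine} {
		for _, a := range addrs {
			if a.NetworkScope == instance.NetworkPublic {
				return a.Value, true
			}
		}
	}
	// Fallback: cloud-local/unknown, provider addresses before machine ones.
	for _, addrs := range [][]instance.Address{provider, machine} {
		for _, a := range addrs {
			switch a.NetworkScope {
			case instance.NetworkCloudLocal, instance.NetworkUnknown:
				return a.Value, true
			}
		}
	}
	return "", false
}
```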
<waigani> thumper: davecheney has LGTMed https://codereview.appspot.com/85710043/ did you want to take a look before I land it?
<thumper> yes please
<waigani> ah, I should remove loggingSuite from environ_test.go
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1305386
<_mup_> Bug #1305386: state/apiserver: multiple data races <juju-core:Triaged> <https://launchpad.net/bugs/1305386>
<thumper> davecheney: what causes these and do you have any suggestions to fix?
<davecheney> thumper: not sure yet
<davecheney> i'm hoping it's something we're doing (sharing a mgo conn) not a driver bug
<davecheney> it's not clear if these are cosmetic or serious
<wallyworld__> thumper: awesome, found out why go install was hanging. trunk of code.google.com/p/go.crypto has changed and juju-core no longer compiles, but that causes go install to just hang unless you compile with the -v option
<wallyworld__> how good is that
<thumper> wut?
<thumper> hah
<wallyworld__> so we had better not update go.crypto
<thumper> why?
<wallyworld__> no idea
<thumper> wallyworld__: so you should run godeps, yeah?
<wallyworld__> seems so
<wallyworld__> but i mean wtf
<wallyworld__> it should have told me there was a compile error
<thumper> i would have thought so
<wallyworld__> i just thought it was slow cause other stuff has been slow
<davecheney> wallyworld__: is the compiler spinning ?
<davecheney> agl and hanwen landed that big branch today
<wallyworld__> davecheney: not sure, how do i tell?
<davecheney> i didn't see any discussion about it
<davecheney> top ?
<davecheney> that was rather unorthodox
<wallyworld__> davecheney: strangely, using the -v flag, which prints each package as it is compiled, printed out the errors also and then exited when done
<davecheney> wallyworld__: paste
<davecheney> ?
<wallyworld__> davecheney: https://pastebin.canonical.com/108166/ shows first attempt hanging, and then -v showing errors
<davecheney> wallyworld__: can you use pastebin it
<davecheney> my 2fa is downstairs
<wallyworld__> ok
<wallyworld__> http://pastebin.ubuntu.com/7229253/
<davecheney> wallyworld__: can't tell if a hang, or just took a long time
<davecheney> try
<davecheney> rm -rf $GOPATH/pkg
<wallyworld__> davecheney: it ran for over an hour before i hit ^C
<davecheney> wallyworld__: did you look at top ?
<wallyworld__> no, not at the time
<davecheney> bummer
<davecheney> as for tip of ssh being broken
<davecheney> yes, that is poor form
<davecheney> but we're not blameless here
<davecheney> the top of gomaasapi is unusable isn't it
<davecheney> or is it gwacl ?
<wallyworld__> not sure
<wallyworld__> davecheney: i ran it again and that time it exited with the compile errors even without the -v flag
<davecheney> i don't think -v would have any impact on a hang
<davecheney> -v is for cmd/go
<davecheney> which just forks the compiler
<davecheney> it's hard to say unless you can get it to happen again
<davecheney> i'd 100% believe go get hanging
<wallyworld__> i read in help that -v is for verbose
<davecheney> wallyworld__: yes, that is what it does
<davecheney> go get can get confused when hg/bzr/git fire off some program to deal with merge conflicts as the pty they run against isn't a terminal
<wallyworld__> ok
<wallyworld__> davecheney: right, after running godeps, install now appears to be spinning
<davecheney> which process is spinning
<davecheney> cmd/go
<wallyworld__> yep
<davecheney> or the compiler ?
<davecheney> kill cmd/go with SIGQUIT
<davecheney> capture the output
<davecheney> great
<davecheney> this is some pretty shit
<davecheney> 11th hour and everyting is breaking, including the toolchain
<davecheney> wonderful
<wallyworld__> davecheney: http://pastebin.ubuntu.com/7229265/
<wallyworld__> something about stack unavailable
<wallyworld__> nuking pkg fixes it
<davecheney> wallyworld__: whose machine is this?
<davecheney> its running gccgo
<wallyworld__> thumper: \o/ and now i have it compiled i can't test anyway because of bug 1304742  FML
<_mup_> Bug #1304742: version reports "armhf" on arm64 <arm64> <hs-arm64> <juju-core:Triaged> <https://launchpad.net/bugs/1304742>
<wallyworld__> i guess i should fix that
<thumper> yeah... guess so
 * wallyworld__ sighs heavily
 * axw joins in
<axw> HP cloud doesn't like my change
<davecheney> supportedArchitectures isn't an ordered list, right ? https://bugs.launchpad.net/juju-core/+bug/1305397
<_mup_> Bug #1305397: provider/common: test failure <gccgo> <ppc64el> <juju-core:Triaged> <https://launchpad.net/bugs/1305397>
 * davecheney goes to lunch
<davecheney> in the rain
<wallyworld__> davecheney: another one http://pastebin.ubuntu.com/7229318/
<wallyworld__> faaaark. only took 1000 attempts but finally got the go compiler to build without panicking
<davecheney> wallyworld__: eerk, that looks similar to the bug we see on 64k ppc64 kernels
<wallyworld__> davecheney: yuk. i'm not having much luck with juju per se either. machine 0 agent won't start properly
<davecheney> wallyworld__: more nil pointers ?
<wallyworld__> doesn't seem to get past starting the start server
<wallyworld__> not sure yet if it's because it can't see the db or something else
<davecheney> wallyworld__: using lxc ?
<wallyworld__> nope, arm vms
<wallyworld__> manual provisioning
<davecheney> ok
<davecheney> lxc is broken atm
<wallyworld__> :-(
<davecheney> wallyworld__: i know
<wallyworld__> i'm running from trunk and it seems like it's dying inside the ensure ha stuff but there's not enough logging to know for sure
<davecheney> :(
<davecheney> :emoji crying:
<wallyworld__> davecheney: i farking give up. takes 1000 retries to get a compile. then complains about missing libgo.so.5. so find that i need extra compile flags. 1000 retries later get new jujud. still libgo.so.5 missing. so i install deb package on target machine just to get that shared lib. rinse and repeat. still complains. sigh
<davecheney> wallyworld__: see https://docs.google.com/a/canonical.com/document/d/1m9R2n6LPLNLGjdopcNkQYVG8D5V4FTyvc1vvn-9ZifM/edit
<davecheney> at the bottom
<davecheney> the gb alias
<davecheney> wallyworld__: can you paste me anything intersting from dmesg
<wallyworld__> davecheney: i used those flags
<davecheney> i want to compare with the ppc64 problems
<davecheney> if it's complaining that libgo.so.5 is missing
<wallyworld__> binaries were about 30% bigger after
<davecheney> then it didn't work
<davecheney> wallyworld__: i think you should cut your losses
<wallyworld__> so then i tried installing the lib directly on the target machine
<davecheney> looks like juju doesn't work on arm64
<davecheney> i don't know if anyone is working on that atm
<wallyworld__> we're close
<wallyworld__> dmesg doesn't have anything that interesting
<davecheney> wallyworld__: ldd $(which jujud)
<wallyworld__> ubuntu@ms01a:~/juju/src/launchpad.net/juju-core/cmd/jujud$ ldd $(which jujud)
<wallyworld__>         linux-vdso.so.1 =>  (0x0000007fafbdd000)
<wallyworld__>         libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000007fafba7000)
<wallyworld__>         libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000007fafb07000)
<wallyworld__>         libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000007fafae4000)
<wallyworld__>         libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000007faf999000)
<wallyworld__>         /lib/ld-linux-aarch64.so.1 (0x0000005595496000)
<wallyworld__> hmm, so libgo.so.5 is missing
<wallyworld__> i built with
<wallyworld__> go install -a -v -gccgoflags -static-libgo launchpad.net/juju-core/...
<wallyworld__> ah missing =
<wallyworld__> fark
<davecheney> wallyworld__: nope
<davecheney> you got it right
<davecheney> if you got it wrong
<davecheney> there would be a line showing
<davecheney> libgo.so.5 => (missing)
<wallyworld__> ok
<davecheney> but you should probably use the =
<wallyworld__> used the =, same result
<rogpeppe2> mornin' all
<davecheney> in this case it will be
<davecheney> but if you wanted to pass a second option
<davecheney> you'd need = and to quote it
<davecheney> rogpeppe2: o/
<rogpeppe2> davecheney: yo!
<axw> morning rogpeppe2
<rogpeppe2> axw: hiya
<axw> rogpeppe2: I've got a CL for the addresses bug, about to propose (just finishing live test) - got time to have a look?
<rogpeppe2> axw: sure
<axw> rogpeppe2: I made the changes we talked about, but also changed instance.NewAddress to derive scope from the IPv4 address range
<axw> so 10.*, 192.168.*, etc. are recorded as cloud-local
<rogpeppe2> axw: that seems reasonable to me
<rogpeppe2> axw: although that's not really true in the case of this bug
<rogpeppe2> axw: the 192.168.* addresses were machine-local
<axw> rogpeppe2: I think that's okay, because we'll still choose provider cloud-local over machine cloud-local
<axw> machiner*
<rogpeppe2> axw: yeah, it should be ok
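A sketch of the derivation axw describes, using the standard library's net package (the instance scope constants are assumed, and this is not the CL itself): loopback maps to machine-local, the RFC 1918 ranges to cloud-local, and anything else is treated as public.

```go
import "net"

// deriveScope guesses an address's scope from well-known IPv4 ranges.
func deriveScope(value string) instance.NetworkScope {
	ip := net.ParseIP(value)
	if ip == nil {
		return instance.NetworkUnknown
	}
	if ip.IsLoopback() {
		return instance.NetworkMachineLocal
	}
	for _, cidr := range []string{"10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"} {
		if _, ipNet, err := net.ParseCIDR(cidr); err == nil && ipNet.Contains(ip) {
			return instance.NetworkCloudLocal
		}
	}
	return instance.NetworkPublic
}
```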
 * axw taps finger, waits for lbox
<wallyworld__> davecheney: success. i blew away *everything* and started again. bootstrapped manual provider on arm64. gotta run to soccer, but will add another machine later and deploy a charm
<axw> rogpeppe2: https://codereview.appspot.com/85590044/
<axw> mgz: would appreciate a review from you too, in case there's some subtlety I've missed
<axw> rogpeppe2: on a completely unrelated note, I was thinking about the EnsureAvailability task some more this morning. Is there a reason why we shouldn't move a non-voting state server back to voting if it becomes accessible again?
<rogpeppe2> axw: absolutely not.
<rogpeppe2> axw: i think we should do that
<axw> goodo
<axw> that's what I thought too
<rogpeppe2> axw: that's why we keep 'em around
<fwereade> anyone know anything about the replicaset failure in https://code.launchpad.net/~fwereade/juju-core/uniter-relation-states/+merge/215003 ?
<fwereade> are we seeing it a lot?
<davecheney> fwereade: wrt your previous comment
<davecheney> http://paste.ubuntu.com/7229693/
<davecheney> i am not sure if this is a real race, or an instrumentation error
<fwereade> davecheney, I will take a look at that, thanks
<fwereade> davecheney, although at first glance... sync.WaitGroup *ought* to be used that way, oughtn't it?
<davecheney> fwereade: you cannot call Add() while another goroutine calls Wait()
<davecheney> only Done()
<rogpeppe2> axw: reviewed
<axw> ta
<axw> rogpeppe2: heh, I kept writing IPv4 in my new code and then changed it to match the existing :/
<axw> I will fix it another time- this is going to need to be backported, don't want to create unnecessary hardship :)
<rogpeppe2> axw: yeah, it's definitely worth changing at some point
<rogpeppe2> axw: fair enough
<davecheney> http://golang.org/pkg/sync/#WaitGroup.Add
<rogpeppe2> fwereade: ah, i think i know what that might be to do with
<rogpeppe2> fwereade: i think perhaps the replicaset tests were relying on the fact that Set put the session into monotonic mode.
<axw> rogpeppe2: likewise for NewAddress arg order if you don't mind - that will create a lot of noise
<rogpeppe2> axw: ok
<rogpeppe2> fwereade: (i mean Initiate, not Set)
<fwereade> davecheney, indeed, I see; but I'm having some difficulty mapping it onto the code
<davecheney> fwereade: yes
<fwereade> davecheney, the code looks odd fwiw, I don't see why we'd wait twice
<davecheney> i'm having trouble understandig if that is a real race
<davecheney> or a bug in the race detector
<davecheney> i'd use 1.3, but there are heaps of bugs at tip
<fwereade> heh
<davecheney> as usual juju is leading the way in showing bugs in go tip
<fwereade> go us!
<rogpeppe2> fwereade: hmm, it doesn't look as if that is actually the case
<rogpeppe2> davecheney: it may well be a real race
<davecheney> rogpeppe2: \o/, i guess
<rogpeppe2> davecheney: something similar came up on golang-nuts recently
<davecheney> rogpeppe2: indeed
<davecheney> and the semantics may change
<davecheney> but under 1.2
<davecheney> the old semantics apply, you can't call Add(positive) after someone else has called Wait()
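A standalone illustration of the rule davecheney is citing from the sync docs: Add with a positive delta must complete before Wait starts; only Done may run concurrently with Wait.

```go
package main

import "sync"

func main() {
	var wg sync.WaitGroup
	// Safe: all Adds happen before Wait starts. The racy version would
	// call wg.Add(1) from here while another goroutine is already
	// blocked in wg.Wait().
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// ... work ...
		}()
	}
	wg.Wait()
}
```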
<fwereade> davecheney, hey, I think it is a race, but I'm still only at single-coffee levels of incisiveness
<davecheney> fwereade: based on that assessment, I shall raise a bug
<davecheney> i have found many other races
<davecheney> some in the mongo driver
<fwereade> davecheney, cheers, we can always close it if it isn't
<rogpeppe2> davecheney: looks like a trivial fix for that one at least
<rogpeppe2> davecheney: though i'm not sure it will actually fix the bug we're seeing
<davecheney> rogpeppe2: they normally are
<davecheney> true
<rogpeppe2> davecheney: it may do though, thinking about it
<rogpeppe2> davecheney: get that Add in!
<davecheney>  https://code.launchpad.net/~dave-cheney/juju-core/128-environs-sync-tempdir-prefix/+merge/215087
<davecheney> if anyone has two seconds
<davecheney> no
<davecheney> ignore, that one landed
<davecheney> https://code.launchpad.net/~dave-cheney/juju-core/127-fix-lp-1305397/+merge/215077
<davecheney> this one is also trivial
<fwereade> rogpeppe2, that failure in my MP has an *awful* lot of "attempting Set got error: replSetReconfig command must be sent to the current replica set primary." lines before the no-reachable-servers failure
<rogpeppe2> fwereade: yeah, i don't know why
<davecheney> rogpeppe2: worker/peergrouper/worker_test.go :318
<davecheney> i see test failures when the servers is not the first entry in expectedAPIHostPorts(3)
<rogpeppe2> davecheney: yeah, agreed, it's dubious. weren't we going to sort the servers slice?
<davecheney> rogpeppe2: i am trying
<davecheney> but how can I sort a set of []instance.APIPorts
<rogpeppe2> davecheney: fairly easily
<davecheney> do tell
<rogpeppe2> davecheney: it's not hard to define an ordering between two []instance.HostPort values
<rogpeppe2> davecheney: (compare each element in order)
<davecheney> rogpeppe2: that isn't what you told me to do
<rogpeppe2> davecheney: you can probably just compare address value
<davecheney> you said that []instance.HostPort is already sorted
<rogpeppe2> davecheney: it is
<davecheney> right
<rogpeppe2> davecheney: we're sorting a [][]instance.HostPort
<davecheney> so I need to sort a [][]instance.HostPort
<davecheney> yup
<rogpeppe2> davecheney: so to do that, you need to compare two []instance.HostPorts
<davecheney> rogpeppe2: ok, got it
<rogpeppe2> davecheney: cool
<rogpeppe2> anyone know how to find out what code a given CI test is actually running?
<rogpeppe2> (the CI test itself, that is)
<fwereade> rogpeppe2, I think it's lp:~juju-qa/juju-core/ci-cd-scripts2
<rogpeppe2> fwereade: i just found that, but it doesn't seem to have all the scripts in it (e.g. aws-upgrade)
<rogpeppe2> fwereade: but i'm not sure what the correspondence is between that branch and the test names we see in jenkins
<davecheney> rogpeppe2: https://codereview.appspot.com/86400043/
<rogpeppe2> darn, i've just realised that we really need to support upgrading to HA
<davecheney> your thoughts sir
<davecheney> opps
<davecheney> two secs
<rogpeppe2> davecheney: that doesn't look quite right
<rogpeppe2> davecheney: i don't think that Less function is commutative
<davecheney> https://codereview.appspot.com/86400043/
<davecheney> sorry, have another look
<rogpeppe2> davecheney: i think you want: if len(a) != len(b) { return len(a) < len(b) }
<davecheney> ok
<rogpeppe2> davecheney: in fact, i think the other test isn't right either
<davecheney> yeah, that was tricky
<rogpeppe2> davecheney: i think it needs to compare the values for equality, and only if they're not equal should it compare the less-ness of them
<davecheney> yeah, that makes sense
<rogpeppe2> davecheney: i've got that wrong before
<rogpeppe2> davecheney: so i always look closely at Less functions now :-)
<davecheney> rogpeppe2: this one is tricky because in our case the ports are always the same
<davecheney> so a < b is always false
<rogpeppe2> davecheney: i don't think it's that tricky
<rogpeppe2> davecheney: one mo, i'll paste a suggestion
<rogpeppe2> davecheney: reviewed (with suggested code in the review) https://codereview.appspot.com/86400043/
<davecheney> rogpeppe2: ta
<davecheney> rogpeppe: thanks, that is much more straightforward
<rogpeppe> davecheney: np
<rogpeppe> davecheney: just sent one other trivial comment
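Putting that exchange together, a sketch of the resulting Less over [][]instance.HostPort (HostPort field names are assumed): compare lengths first, then walk the elements, letting an unequal pair decide the ordering while equal pairs defer to the next.

```go
type hostPortsSlices [][]instance.HostPort

func (s hostPortsSlices) Len() int      { return len(s) }
func (s hostPortsSlices) Swap(i, j int) { s[i], s[j] = s[j], s[i] }

func (s hostPortsSlices) Less(i, j int) bool {
	a, b := s[i], s[j]
	if len(a) != len(b) {
		return len(a) < len(b)
	}
	for k := range a {
		// Equal elements defer to the next pair; only an unequal
		// pair decides the ordering.
		if a[k].Value != b[k].Value {
			return a[k].Value < b[k].Value
		}
		if a[k].Port != b[k].Port {
			return a[k].Port < b[k].Port
		}
	}
	return false // fully equal: not less
}
```

The test would then sort with sort.Sort(hostPortsSlices(servers)) on both sides before comparing.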
<axw> mgz: ping
<rogpeppe> afk
<davecheney> rogpeppe: et al, http://code.google.com/p/go/issues/detail?id=7749
<davecheney> juju really breaks go 1.3 at the moment
<mgz> axw: hey
<axw> mgz: hey, would you please take a look at https://codereview.appspot.com/85590044/ when you have a moment?
<axw> I've changed some address logic, and I'm a bit nervous about breaking everything :)
<mgz> :)
<mgz> axw: had planned the private range stuff, that looks fine, will have a proper poke through before the standup and see if any of the subtle logic bits have got lost
<axw> mgz: great, thanks
<mgz> axw: only thing that jumps out as a risk is canonistack and similar setups
<mgz> where we don't actually *have* a public address generally,
<mgz> but currently juju can lie, and return the NetworkUnknown 10. address which will work with sshuttle
<mgz> right, I need to change location now
<axw> mgz: I did live test with canonistack (and HP); the 10. address is now recorded as cloud-local, and usable as both public and internal
<rogpeppe> davecheney: :-(
<rogpeppe> davecheney: can you reproduce that reliably?
<natefinch> morning all
<axw> morning natefinch
<jam1> natefinch: good morning. extra early for you, isn't it ?
<natefinch> jam1: yeah, been trying to get up early to get some more work done on HA. Also, yesterday I lost a good bit of the working day due to some emergency beekeeping tasks that came up.
<perrito666> good morning
<jam> just mentioning, standup in 3 min
<thumper> natefinch: you have bees?
<thumper> axw: just wondering about 11.0.0.0/8
<thumper> axw: it is technically public
<thumper> axw: but actually private
<thumper> axw: I wonder if anyone tries to use it
<thumper> axw: it is the ipaddress of the us military group
<thumper> and is a disconnected internet type network
<thumper> I think the current code is good, and if someone is stupid enough to use this
<thumper> then it can be on their head :-)
<axw> thumper: that would be an interesting problem to have :)
 * axw thinks
<axw> it will be no worse than it is now, as cloud-local will be used for public/internal as well as unknown
<axw> oh wait
<axw> public.. hrm
<axw> well they would be inside the cloud anyway, so it would be fine I think
<axw> inside the cloud -> inside the network
<thumper> axw: I do like the 'less unknowns' :-)
<perrito666> hey fwereade this is a wip https://codereview.appspot.com/86430043 for some reason https://codereview.appspot.com/86430043/patch/1/10003 is getting empty unit networks, care to take a quick look? (I did not upload work for the machine not changed checks to avoid clutter on what I am trying to debug)
<natefinch> thumper: yeah, three hives. bees are awesome :)
<fwereade> perrito666, actually do you want to pop on, I'm not quite following everything there
<perrito666> pop?
<fwereade> perrito666, sorry, I mean, quickly reenter the team meeting hangout
<perrito666> I certainly can
<rogpeppe> i've got two branches up for review. would much appreciate if someone could take a look: https://codereview.appspot.com/86200043/ https://codereview.appspot.com/85600044/
<c7z> axw: properly going over address branch now, I see rog has already looked through it
<axw> c7z: I assume that's mgz; yes he has, just thought I'd get your thoughts too, because you did the original work I think?
<axw> bbs
<c7z> axw: yup, though there's a lot more complexity than the first version unfortunately
<natefinch> rogpeppe:  looking
<waigani> oh noooo, daylight savings I missed the meeting
<natefinch> hahaha
<waigani> I was like, why is there no one here???
<natefinch> yep, daylight savings is annoying
<waigani> sigh, reading notes
<jamespage> fwereade, sorry - bug 1305780
<_mup_> Bug #1305780: juju-backup command fails against trusty bootstrap node <juju-core:New> <https://launchpad.net/bugs/1305780>
<natefinch> waigani: didn't miss much
<waigani> hey, I'm in the notes!
<waigani> natefinch: okay, that's good. Well I'll get some sleep and be more productive tomorrow!
<waigani> night all
<natefinch> waigani: g'night
<fwereade> jamespage, I thought we had the tools used by backup/restore in juju-mongodb
<jamespage> fwereade, we do - but I suspect the fact they are not in the path is breaking things
<fwereade> jamespage, gaaaah ofc
<jamespage> fwereade, it works fine on 12.04
<jamespage> where that is the case
<c7z> axw: commented
<jamespage> fwereade, I suspect if I could get restore past "error: cannot restore bootstrap machine: cannot get public address of bootstrap machine: machine "0" has no public address"
<jamespage> then I would hit the same issue again on 14.04
<c7z> jamespage: https://codereview.appspot.com/85590044/
<c7z> one of your bugs is nearly fixed
<jamespage> c7z, \o/
<jamespage> wowser
<jamespage> that was complex
<jamespage> c7z, I better stop finding new ones :-)
<c7z> yeah, axw decided to start doing the right thing with deriving the network scope rather than piling on hacks
<rogpeppe> natefinch: "could this code get moved to the loop over entity.jobs below?"
<rogpeppe> natefinch: i don't think so
<rogpeppe> natefinch: because newSingularRunner can fail
<dimitern> jam, mgz, have the bot stopped landing stuff for gomaasapi?
<jam> dimitern: I'm pretty sure the bot never landed things for gomaasapi
<jam> only gwacl
<c7z> dimitern: it never did in the current iteration
<dimitern> really?
<c7z> dimitern: I manually landed the last bits, I can land anything else you guys need
<c7z> there was once a gomaasapi bot, but it wasn't ours, and it went away
<jam> dimitern: lp:~juju/gomaasapi/trunk
<c7z> end of fairytale
<natefinch> rogpeppe: yeah, that's a good point.  I guess if you need to make sure you fail early, that's valid.
<dimitern> vladk, there's your reason ^^ c7z is your man :)
<jam> the bot has never been in ~juju, IIRC
<natefinch> rogpeppe: not for this review, but that method really needs to be refactored. 110 lines is just too long.
<jam> dimitern: I intentionally didn't want to give the bot too much access to things that weren't its, nor give ~juju direct access to bits controlled by the bot
<rogpeppe> natefinch: yeah, it could be easily split up
 * jam is away for a bit
<dimitern> jam, yep, understandable concerns
 * rogpeppe has just acquired a new, unbroken phone
<rogpeppe> woo
<c7z> rogpeppe: what did you go for?
<c7z> also, if it doesn't have a smashed screen by vegas, I'll be disappointed
<rogpeppe> c7z: a samsung galaxy s4 active from ebay
<rogpeppe> c7z: :-)
<rogpeppe> c7z: main reasons were the fact that it is waterproof (i killed a previous phone from water damage) and it has a replaceable battery
<vladk> c7z: could you manually land my branch to gomaasapi: https://code.launchpad.net/~klyachin/gomaasapi/101-testserver-extensions/+merge/214961
<c7z> vladk: on it
<axw> c7z: just so I understand about the floating IP...
<axw> c7z: floating-ip would be stored as NetworkUnknown as well?
<axw> c7z: hence why we would take the last one?
<c7z> yeah, so, the original iteration of the code understood some network names as special
<c7z> but hp and some others were annoying in that they had a network named 'private'... but the (public) floating ip just got appended to that network, not added to a new one
 * rogpeppe just realises that c7z==mgz
<c7z> rogpeppe: sorry :P
<c7z> but I'm pretty sure that your code will actually pass that test (with less fiddling that you needed), because of the new scope detection code
<axw> c7z: okay, cool. yes I think it should work then
<axw> c7z: good catch on NetworkName. I'll fix that and land
<c7z> but yeah, the current/old version of openstack bits basically left everything as NetworkUnknown and used some hacks based on ordering to make the various previous cases work
<c7z> vladk: landed
<rogpeppe> natefinch: how's it going?
<natefinch> rogpeppe: sorry, helping my elder daughter, Lily, get ready for preschool.  Should be back in about 45 minutes, though not at full capacity for a little over an hour.
<rogpeppe> natefinch: ok. perhaps you could just push the branch you were working on last night?
<rogpeppe> natefinch: then i can move it forward
<natefinch> rogpeppe: cool, just pushed it here: lp:~natefinch/juju-core/041-moremongo
<rogpeppe> natefinch: thanks
<c7z> evilnickveitch: had a note that the environments.yaml config option for bug 1241674 isn't added to the 1.18 docs, what branch do I need to get to put it in?
<_mup_> Bug #1241674: juju-core broken with OpenStack Havana for tenants with multiple networks <cts-cloud-review> <openstack-provider> <juju-core:Fix Released by gz> <https://launchpad.net/bugs/1241674>
<evilnickveitch> c7z, it should go in the master branch for now, thanks!
<axw> c7z: sorry, will fix that test in a followup
<c7z> axw: I wasn't quite clear the first time around that the test should have just been s/127\./10\./
<axw> c7z: no worries, understood now - I think the bot's already running it though
<jam> anyone else having trouble getting to bazaar.launchpad.net ?
<c7z> axw: no problems, as I said, should pass that way as well
<c7z> jam: I just did, and it was transient
<jam> c7z: can you check if you can get to launchpad?
<jam> c7z: k, it is still failing for me... :(
<c7z> as in, failed to branch twice, pinged, worked, sshed, worked, branched... worked
<jam> c7z: I can't SSH or get to the HTTP page
<c7z> jam: apparrently lp app servers were seeing issues getting stuff through squid, it's still working for me at present
<jam> c7z: it just worked for me
<jam> dimitern: ping about SCP and extra arguments
<jam> you seem to have a patch that made it so that scp only supports *1* extra argument
<jam> and CI wants to use about 5 extra args
<c7z> when doing a local provider deploy for the first time, how can you track the image download progress?
<jam> c7z: iftop ?
<jam> I wish I knew a better way, I think it is controlled underneath lxc
<jam> (hidden from us)
<c7z> isn't that fun
<dimitern> jam, my changes to scp was that it can accept any number of extra args
<jam> dimitern: not in 1.18
<jam> dimitern: "juju scp 1:foo . -o "StrictHostKeyChecking: no"
<dimitern> jam, if it got changed later i don't know
<jam> complains that "-o" is unknown
<jam> dimitern: vs juju scp 1:foo . -o"StrictHostKeyChecking: no"
<jam> works
<jam> but it has to be *1* argument
<jam> dimitern: the line is "if i != len(c.Args) - 1"
<jam> sounds like it only accepts 1 extra argument, and is attributed to you (according to bzr annotate)
<dimitern> jam, hmm.. looking at the code I see the problem
<dimitern> jam, it was broken before - not being able to take more than 3 targets
<dimitern> jam, but i broke that it seems - the fix should be: once we start adding extraArgs we treat all the rest as extraArgs
<jam> so CI was trying to do:       if timeout 5m juju --show-log scp -e $ENV -- -o "StrictHostKeyChecking no" -o "UserKnownHostsFile /dev/null" -i $JUJU_HOME/staging-juju-rsa 0:/var/log/juju/all-machines.log $log_path; then
<jam> using "--"
<jam> which I don't quite see that we ever actually supported anyway
<jam> But they would like to have the target late, and extra args early
<jam> we can move that around
<jam> (I think)
<jam> but we should support more args
<dimitern> why targets late and extra args early?
<jam> dimitern: it is a natural way to write it, if you were writing "SCP" code.
<jam> as in, it is how *I* would write "scp ..."
<dimitern> it used to be documented that extra args are passed after --; that was never implemented
<jam> dimitern: I don't think we *have* to, because scp will let you pass them late.
<dimitern> jam, well, initially i made it so -- can be at any place (even between targets) to specify one or more -args, but it was rejected on the review
<dimitern> jam, yeah, passing them last is both easier to parse and not far from natural
<dimitern> jam, I think something like this should fix the issue http://paste.ubuntu.com/7230671/
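Not the contents of that paste, but a sketch of the "--" handling being converged on: everything before the separator is a juju target to expand, everything after is handed straight to scp.

```go
// splitArgs separates juju-level targets from raw scp arguments at "--".
func splitArgs(args []string) (targets, extra []string) {
	for i, arg := range args {
		if arg == "--" {
			return args[:i], args[i+1:]
		}
	}
	return args, nil
}
```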
<bodie_> I'm getting a weird error with cobzr.  I should be able to diff against trunk, right?
<axw> rogpeppe: https://codereview.appspot.com/86490043  -- hopefully I'm at least on track here...
<rogpeppe> axw: looking
<axw> rogpeppe: I'm about to sign off, so no particular rush
<c7z> evilnickveitch: https://code.launchpad.net/~gz/juju-core/update_config_openstack_1.18/+merge/215177
<evilnickveitch> c7z, thanks, i'll take a look
<axw> I'm off now, will check a bit later but would appreciate if someone could poke my MP if it fails again: https://code.launchpad.net/~axwalk/juju-core/lp1303735-fix-address-logic/+merge/215085
<axw> and I'll backport to 1.18 in the morning
<jam> dimitern: the problem is the signal to start adding things has a "if i != len(args) -1"
<jam> so we really just need a different check. I think actually implementing "--" would be the easiest thing, if we want to support multiple SCP targets
<dimitern> jam, -- sgtm too
<dimitern> jam, can't remember who was against it in the review ;)
<c7z> ...the code did use  --... which broke things apparently, hence the fix to not
<jam> c7z: well the CI guys had written "scp -- 0:foo ." which would be broken regardless
<jam> because that would pass an explicit "0:blah" to scp
<jam> We *could* just detect if any given argument had a possible Juju identifier at the start and map it
<jam> with potentially an escape character?
 * jam doesn't like escapes very much here
<dimitern> fwereade, are you around?
<mramm> fwereade: alexisb: I just put one of you on the hook for a 1.19 and 1.18 update at the cross-team meeting.
<alexisb> mramm ack
<jam> mramm: I think fwereade said he had a headache, so is off for a bit. which puts alexisb on the hook :)
<mramm> jam: cool
<mramm> I'm sure alexisb can handle it ;)
<jam> alexisb: here's my summary, I think cmars is already working on #1304770, but I don't have a feeling for why it isn't done already
<_mup_> Bug #1304770: store: tests do not pass with juju-mongodb <ppc64el> <juju-core:Triaged by cmars> <https://launchpad.net/bugs/1304770>
<fwereade> mramm, jam, alexisb: thanks, I just dragged myself up here to make sure there was someone there
 * fwereade goes back to bed
<jam> bug #1302205 was the one that wallyworld__ sent an email on. I don't think it should block 1.19.0 if we can't fix it in time
<_mup_> Bug #1302205: manual provisioned systems stuck in pending on arm64 <add-machine> <hs-arm64> <manual-provider> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1302205>
<jam> the toolchain is hard to work with there, so it isn't a *regression*
<mramm> we should definitely talk over that bug on the call
<jam> bug #1303697 I thought fwereade actually had a patch that I reviewed, so it should be just-about done as well
<_mup_> Bug #1303697: peer relation disappears during upgrade of juju <juju-core:In Progress by fwereade> <juju-core 1.18:In Progress by fwereade> <https://launchpad.net/bugs/1303697>
<jam> dimitern: do you have a status for bug #1304905? Presumably that is Critical because you wanted to fix the API before it becomes released?
<_mup_> Bug #1304905: Change NetworkName to NetworkId across codebase and use network tags in the API <api> <tech-debt> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1304905>
<jam> bug #1305386 is about "this might be a bug" being told to us by the go compiler, which is something we should try to address, but it isn't a bug that has seen actual live problems
<_mup_> Bug #1305386: state/apiserver: multiple data races <juju-core:Triaged> <mgo:New> <https://launchpad.net/bugs/1305386>
<dimitern> jam, i'm about to propose the CL that fixes it
<cmars> jam, alexisb, I have a fix for 1304770, which I can land, we'll just need to update CI to test the charm store tests that will become disabled by default
<jam> cmars: could we invert the logic? or provide it via an ENV var and just have the build process set that ENV var?
<jam> because the landing bot is happy to run the tests
<jam> and so is CI
<jam> it is more that the build process for trusty needs to be able to disable them
<jam> We could even just do "if version.Current.Series == 'trusty'" if we wanted.
<cmars> jam, i can certainly invert the logic
<dimitern> jam, mgz, rogpeppe, anyone.. I'd really appreciate a review on this https://codereview.appspot.com/86010044/, which also fixes bug 1304905
<_mup_> Bug #1304905: Change NetworkName to NetworkId across codebase and use network tags in the API <api> <tech-debt> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1304905>
<cmars> jam, would you prefer a switch or env var?
<rogpeppe> dimitern: ok, lookin
<rogpeppe> g
<cmars> or both :)
<dimitern> rogpeppe, cheers - it looks huge, but it's not, just touches a lot of things
<jam> cmars: *I* prefer an env var, because 'go test ./...' doesn't pass switches well, they have to be defined on all packages
<jam> while an ENV var can just be picked up while running.
<cmars> wfm
<natefinch> +1 for me too
<rogpeppe> jam: +1
<rogpeppe> jam: it would be nice to have a small set of well defined env vars though
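A sketch of the env-var approach being agreed here (the variable name is made up; the real one is whatever cmars's CL defines): the store test suite checks the environment at setup and skips itself, so the trusty package build can export the variable while the bot and CI leave it unset and run everything.

```go
func (s *StoreSuite) SetUpSuite(c *gc.C) {
	// Hypothetical variable name; set by the build process, not by CI.
	if os.Getenv("JUJU_NOTEST_MONGOJS") != "" {
		c.Skip("charm store JavaScript tests disabled by JUJU_NOTEST_MONGOJS")
	}
	// ... normal suite setup ...
}
```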
<rogpeppe> fwereade: do you recognise this last test failure? looks like a test wasn't updated. https://code.launchpad.net/~rogpeppe/juju-core/mfoord-wrapsingletonworkers/+merge/215035
<perrito666> rogpeppe: fwereade seems to be ill in bed :(
<rogpeppe> perrito666: ah
<alexisb> so jam just a general note... much of the status on bugs you are giving here is not actually in the bug, any reason for that?  Can we just ask folks to update bugs, seems like a pretty good common place to put status :)
<cmars> jam, updated the nomongojs option for tests, ptal https://codereview.appspot.com/82930043
<rogpeppe> dimitern: you've got a review
<dimitern> rogpeppe, thanks!
<rogpeppe> dimitern: yw
<dimitern> c7z, ping
<c7z> dimitern: hey, will be free in a min
<dimitern> c7z, rogpeppe reviewed my https://codereview.appspot.com/86010044/, but can you take a look as well please? I want to land it today, and I almost have the last bit working (cloudinit).
<jam1> fwereade: does your patch for https://bugs.launchpad.net/juju-core/+bug/1303697 actually fix the full bug? I felt like it did, so I went ahead and marked it Fix Committed since your branch is merged.
<_mup_> Bug #1303697: peer relation disappears during upgrade of juju <juju-core:Fix Committed by fwereade> <juju-core 1.18:In Progress by fwereade> <https://launchpad.net/bugs/1303697>
<jam1> fwereade: I should note that if you do "bzr commit --fixes lp:12345" then Tarmac will mark things as Fix Committed when it merges them.
<jam1> alexisb: so I updated some things, for things that "have a branch in progress ready to land" that information is captured in the bug, as long as people link their branches to the bugs. (after the Description is a section about Related Branches), see https://bugs.launchpad.net/juju-core/+bug/1303697
<_mup_> Bug #1303697: peer relation disappears during upgrade of juju <juju-core:Fix Committed by fwereade> <juju-core 1.18:In Progress by fwereade> <https://launchpad.net/bugs/1303697>
<jam1> some of the other bits I mentioned were just stuff that I worked out while reading over the current list of critical bugs, thus existed nowhere but in IRC at that moment :)
 * jam1 goes to take my son to bed
<cmars> can someone PTAL at my mongojs notest option proposal, https://codereview.appspot.com/82930043/?
<dannf> thus far i've only tried building juju from the deb package - what's the right mechanism for doing it w/ go gotten source?
<dannf> nm - see the README
<rogpeppe> natefinch: lp:~rogpeppe/juju-core/natefinch-041-moremongo
<natefinch> rogpeppe: func fakeCmd(path string) {
<natefinch> 	err := ioutil.WriteFile(path, []byte("#!/bin/bash --norc\nexit 0"), 0755)
<natefinch> 	if err != nil {
<natefinch> 		panic(err)
<natefinch> 	}
<natefinch> }
<alexisb> jam1, thanks!
<jam1> cmars: lgtm
<cmars> jam1, thanks!
<jam1> cmars: make sure to update the bug with how the build process can skip the tests if it needs to
<cmars> jam1, will do. i'm standing up an LXC to test it with an actual juju-mongodb right now, just to be sure. i usually develop against a stock mongodb-server
<jam1> cmars: I also wonder if there is an obvious place in a README or HACKING to describe env flags for the test suite like this
<cmars> jam1, CONTRIBUTING, in the Testing section
<cmars> i'll add a blurb
<alexisb> jam1, any objections to me moving the target for 1302205 to 19.1?
<jam1> alexisb: fine with me
<jam1> alexisb: it sounds like something that has a stakeholder, so it is important work, but not something that used to work that we broke
<alexisb> nor something that should hold up 19.0
<alexisb> I think wallyworld__ moved it because he had 3 fixes he wanted to land but there are still additional issues that need to be investigated
<alexisb> anyone else having issues with launchpad server?
<natefinch> alexisb: the juju-core homepage opens ok for me, but I haven't really been on it all day
<alexisb> natefinch, I got logged out and am having troubles with the login page
<natefinch> alexisb: haha, yeah, the login page looks borked: Something broke while generating the page. Please try again in a few minutes, and if the problem persists file a bug or contact customer support. Please quote OOPS-ID ['OOPS-a584d1910ee34d7d91342302107837b7']
<alexisb> worked this last time though
<alexisb> heh
<alexisb> well I am in now
<natefinch> yep, me too
<alexisb> interesting got same type of oops trying to get back into email
<jam1> alexisb: it has been flakey on and off today. You can probably go into #webops on irc.canonical.com and ask there when we have trouble like this. Apparently there is an issue with some of the Launchpad servers and some squid proxy machines
<alexisb> good pointer, thank you jam1
<natefinch> jam1: do we really need to support quantal?
<dimitern> https://codereview.appspot.com/86600043 - reviews appreciated (VLAN cloudinit network setup for MAAS)
<jam1> natefinch: is there a specific bug in quantal that we're trying to avoid?
<natefinch> jam1: lack of mongo
<jam1> natefinch: I don't have something that says "OMG, we must support Q for people". However, it sounds like more code than not to avoid it.
<jam1> natefinch: so, given that nobody ever bothered to port mongo to Q and nobody has complained, I think it's pretty clear where our support for it lies
<jam1> however, a *lot* of our test code uses Q as a "you're not actually running the test suite on this otherwise 'supported' version"
<natefinch> jam1: we have code that says "if you're running quantal, apt-add-repository ppa:juju/stable
<jam1> natefinch: so if we have mongo, what is the problem?
<natefinch> jam1: just more code to maintain, more special case tests to write
<natefinch> (I was writing the test to check that quantal did the add-apt-repository)
<jam1> natefinch: Quantal EOLs when Trusty is released, if I'm reading http://en.wikipedia.org/wiki/List_of_Ubuntu_releases#Ubuntu_12.10_.28Quantal_Quetzal.29 correctly
<jam1> natefinch: so... I'd rather limp along and support it for the next release
<jam1> but I wouldn't spend huge amounts of time on it.
<natefinch> jam1: ok, that's fair
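The special case natefinch is writing a test for amounts to a series switch in the generated package commands; this sketch uses invented helper names, not juju-core's actual cloudinit API:

package cloudinit_test

import "testing"

// packageCommands stands in for the real cloud-init package setup:
// quantal has no archive mongodb, so it must add the juju stable PPA.
func packageCommands(series string) []string {
	var cmds []string
	if series == "quantal" {
		cmds = append(cmds, "add-apt-repository ppa:juju/stable")
	}
	return append(cmds, "apt-get update")
}

func TestQuantalAddsPPA(t *testing.T) {
	cmds := packageCommands("quantal")
	if len(cmds) == 0 || cmds[0] != "add-apt-repository ppa:juju/stable" {
		t.Fatalf("quantal cloud-init should add the juju stable PPA, got %v", cmds)
	}
}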
 * rogpeppe has reached eofd
<rogpeppe> or eod even
<rogpeppe> g'night all
<jam1> anyone else able to "lbox propose" ? I get: error: Get https://api.launchpad.net/devel/people/+me: x509: certificate signed by unknown authority
<jam1> I have the feeling LP changed their SSL certs due to Heartbleed, and now LBOX is refusing to let us get our work done.
<cmars> jam1, i just got that same x509 error w/lbox, then retried and it was successful. strange
<jam1> cmars: maybe it depends what lp appserver you get
<jam1> as there are like 16 of them
<jam1> I did get it 2x in a row
<natefinch> jam1: it worked for me this morning. haven't tried since then, though I've heard others have had problems sporadically
<jam1> though I thought all the SSL stuff was handled in the Apache front end
<waigani> morning all
<alexisb> morning waigani
<davecheney> morning
<davecheney> waigani: thanks for trying to resubmit your branch
<davecheney> mwhudson: do you have time for some deb hand holding ?
<waigani> davecheney: no problem, looks like mongod problem again?
<davecheney> waigani: that bot is screwed
<waigani> hmmm
<davecheney> turns out our dogfood isn't fit for human consumption
<waigani> haha
<waigani> davecheney: fyi I'm working on this now: 1304767
<mwhudson> davecheney: probably
<waigani> https://bugs.launchpad.net/juju-core/+bug/1304767
<_mup_> Bug #1304767: test failure in cmd/juju <ppc64el> <juju-core:Triaged by waigani> <https://launchpad.net/bugs/1304767>
<davecheney> waigani: ok
<waigani> basically, I'll mock out a fake tarball so we are just testing tools uploading - not actually building
<davecheney> waigani: hmm
<davecheney> i don't think that is the right solution
<davecheney> it might be
<waigani> davecheney: open to suggestions
<davecheney> what I have found is that in many test cases they use the version.Current symbol
<davecheney> but the test then expects amd64 tools
<davecheney> i'd check that first
<waigani> davecheney: okay
<waigani> davecheney: wallyworld__ and thumper pointed me in the direction of mocking out the tools tarball
<davecheney> waigani: thumper knows what he is doing
<waigani> davecheney: I'll look into both :)
<cmars> davecheney, i've flipped the logic for mongojs tests and that's landed in trunk, re: 1304770
<cmars> davecheney, can you PTAL at landing this in 1.18: https://codereview.appspot.com/86650043/?
<davecheney> cmars: so be it
<davecheney> nobody will read that
<davecheney> and we'll get bug reports
<davecheney> but so be it
<davecheney> until the store moves
<davecheney> that is
<davecheney> *not very subtle hint*
<davecheney> it's not a big problem, nobody runs the tests but us
<cmars> sounds like a topic for discussion at the next sprint. for now, this is a way to unblock CI
<davecheney> cmars: roger
<davecheney> cmars: thanks for getting that fix in
<cmars> np
<davecheney> ffs
<davecheney> the bot is really screwed
<davecheney> every single change has to be landed several times before it sticks
<davecheney> http://paste.ubuntu.com/7232727/
<davecheney> mongo is blowing up :(
<davecheney> can anyone connect to the bot and see if there are like 1,000 mongod processes leaking around
<thumper> hazmat: where did you look for the source of juju-mongodb again?
<thumper> hazmat: I'm looking at bug 1302747
<_mup_> Bug #1302747: mongodb fails to start with local provider  <mongodb> <juju-core:Incomplete> <https://launchpad.net/bugs/1302747>
<wallyworld__> davecheney: not 1000, but maybe 5
<wallyworld__> i'll nuke them
<davecheney> wallyworld__: ta
<davecheney> i normally find a few dozen lurking around by the EOD
<hazmat> thumper, i went to github last time
<hazmat> thumper, mongodb upstream that is
<thumper> kk
<hazmat> anyone seen juju machines pending forever on aws?
<hazmat> http://paste.ubuntu.com/7232845/ machine 8 fwiw machine-0 log
<hazmat> pending in status for 20m
 * hazmat switches to stable branch tip
<cmars> https://codereview.appspot.com/86650043/, can I get a look (backport to 1.18)
<cmars> needs a LGTM
#juju-dev 2014-04-11
<thumper> machine-0: 2014-04-11 00:19:50 INFO juju.worker.instanceupdater updater.go:264 machine "0" has new addresses: [public:localhost local-cloud:10.0.3.1]
<thumper> localhost is public?
<thumper> also...
<thumper> $ juju destroy-environment local -y
<thumper> ERROR readdirent: no such file or directory
<thumper> where did that start coming from?
<davecheney> thumper: that is um, wrong
<davecheney> how does that even happen
<davecheney> NewAddress requires you to give an address scope
<thumper> davecheney: axw landed something in the last day that changed it
<davecheney> thumper: i've been thinking that Address.Scope should be private
<davecheney> then we can force every creation to go through a helper function
<thumper> davecheney: chat with axw when he starts as he has done a lot of this
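The helper-function idea davecheney is floating would look something like this; the type and constant names are illustrative assumptions:

package network

// Scope records how widely an address is reachable.
type Scope string

const (
	ScopePublic     Scope = "public"
	ScopeCloudLocal Scope = "local-cloud"
)

// Address keeps its scope unexported, so the only way to construct one
// is through NewAddress, which forces callers to state the scope.
type Address struct {
	Value string
	scope Scope
}

func NewAddress(value string, scope Scope) Address {
	return Address{Value: value, scope: scope}
}

func (a Address) Scope() Scope { return a.scope }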
<thumper> any one want to review the debug-log client hookup? https://codereview.appspot.com/85570044
<wallyworld__> cmars: you got your lgtm, sorry about delay. i've been head down landing some critical 1.18 fixes
<cmars> np, thanks wallyworld__
<wallyworld__> thumper: suppose so
<stokachu> if you remove juju-local juju-mongodb juju-core via apt-get with no bootstraped environment, the mongod process is still around
<stokachu> is that intentional?
<stokachu> this is 1.18 on trusty
<thumper> stokachu: depends, do you have mongodb-server package?
<stokachu> thumper: just juju-mongodb
<stokachu> and its pointing to /usr/lib/juju/bin/mongod
<thumper> probably unintentional
<stokachu> ok i'll probably file a bug because if you don't kill that process subsequent bootstraps will fail
<thumper> wallyworld__: I talked with rog last night about the error
<wallyworld__> ok
<thumper> wallyworld__: he suggested that I change the connection error to a more generic NotSupported error
<wallyworld__> agree that's better
 * wallyworld__ doesn't like inconsistency
 * thumper nods
<thumper> I have found another problem though
<thumper> but it is existing and elsewhere
<thumper> I'll land this then fix the bug
<wallyworld__> s/CodeIsNotImplemented/IsNotSupported :-)
<thumper> well...
<thumper> not implemented has a different meaning to not supported
<thumper> not implemented implies that one day you might
<thumper> but yes, agree in general
<wallyworld__> sure, but we are using this as a mechanism to detect running against older api servers
<wallyworld__> and falling back to 1dot16foo()
<wallyworld__> sinzui: you'll see the email, but i got into 1.18 both john's fixes, for scp and downgrades. hopefully that will allow CI to work again
<thumper> wallyworld__: do you want me to just use not implemented?
<thumper> I'd be ok with that
<wallyworld__> thumper: nah, let's go with the new error and promise hand on heart to port to using it everywhere appropriate :-)
<thumper> fft
<wallyworld__> the 1dot16 fallbacks will be disappearing anyway
<thumper> like that'll happen
<thumper> yeah, I also renamed it fallback from 1.16 to 1.18
<sinzui> wallyworld__, I am hopeful that lp:juju-core/1.18 r2267 will pass. The azure-deploy test is very ill. I think we need to review both the test and the azure cloud itself
<wallyworld__> ok, we can ask axw for input there perhaps
<axw> ?
<axw> azure is bad on 1.18?
<wallyworld__> axw: sinzui says there are issues with the azure CI test not working
<wallyworld__> i haven't looked yet, but perhaps we need to review what's being tested and how
<wallyworld__> to see where the issue is
<axw> is there a bug I can look at?
<axw> or build failure
<axw> CI build
<wallyworld__> http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/job/azure-deploy/
<axw> ta
<sinzui> wallyworld__, I reverted the changes that were made to help it pass
<wallyworld__> thanks for looking, i'm flat out right now landing stuff
<wallyworld__> sinzui: you mean the changes with regard to downgrades and scp changes?
<sinzui> wallyworld__, axw. We increased the timeouts and tried to change the tools-metadata-url from stable to testing to help tests pass. The efforts didn't help
<wallyworld__> i saw the metadata url bug
<axw> sinzui: what's going on? it's just hanging there?
<sinzui> wallyworld__, the downgrades restoration fixes 1.18.
<wallyworld__> \o/
<wallyworld__> sinzui: i'm porting to trunk now, conflict to solve but should be landed soon
<sinzui> In trunk no env will upgrade within 30 minutes
<axw> I see the last one failed in scp, but the current one is just stuck on bootstrap?
<sinzui> The scp is secondary, though it also acts as a compatibility test
<axw> 2014-04-11 01:20:06 ERROR juju.cmd supercommand.go:300 charm not found in "/var/lib/jenkins/repository": local:precise/dummy-source
<axw> wat
<sinzui> axw we tried replacing mysql and wordpress charms with charms that let us test just juju
<sinzui> axw, we started on it earlier this week, then rushed it into use when we hoped to remove the many bad starts that both of those charms have
<sinzui> the next run will use mysql and wordpress
<sinzui> In theory charm-testing is responsible for making sure that mysql and wordpress are sane.
<axw> ok
<axw> sinzui: which location do the azure tests run in?
<sinzui> US West...the only location that has ever worked
<axw> heh ok
<davecheney> sinzui: thumper http://paste.ubuntu.com/7233131/
<davecheney> getting closer
<davecheney> the ones marked failed > 600s
<davecheney> are actually timeouts
<davecheney> if the builder were faster (it was building gccgo at the same time)
<davecheney> they might have passed
<sinzui> well done
<davecheney> SHIT
<davecheney> is jesse around
<davecheney> provider/common is still complaining about missing keys
<axw> sinzui: I'm not convinced azure is entirely healthy, I'm getting errors I haven't seen before from the API
<axw> e.g. the storage API refusing connections
<axw> there's scheduled maintenance tomorrow on West US, I wonder if they started early on the "safe" parts
<axw> though now I've said that, azure-deploy just passed
<sinzui> axw, I see a lot of errors. The CI often retries. Azure is actually very healthy today. I only failed 4 out of 24 hours
<axw> ok
<axw> heh :)
<sinzui> Blessed: lp:juju-core/1.18 r2267
<axw> I have to backport another fix, but that's good to know
<axw> wallyworld__: I'm about to backport the network addresses change - you're not already doing that, right?
<wallyworld__> axw: no, i've been dealing with the other critical issues stopping CI from working
<wallyworld__> just porting to trunk now from 1.18
<axw> nps, thanks
<axw> ah ok
<wallyworld__> sinzui: does that mean you worked around the scp issues fixed by r2268?
<sinzui> wallyworld__, we didn't succeed. It was a lower priority than getting a pass
<wallyworld__> i'm not 100% sure, but maybe r2268 allows the original scripts to work?
<sinzui> wallyworld__, the scp issue only comes into play when the test fail
<wallyworld__> ah ok
<wallyworld__> anyways, it's merged into 1.18 and heading to trunk so if the tests fail again.... :-)
<sinzui> wallyworld__, http://ec2-54-84-137-170.compute-1.amazonaws.com:8080/job/canonistack-deploy-devel/ didn't capture logs from the last tests
<wallyworld__> yeah, that's r2267
<wallyworld__> r2268 should hopefully capture the logs
<sinzui> CI's normal and fallback credentials are broken in canonistack. we cannot test it until the swift authentication issue is fixed. So every canonistack test will fail
<wallyworld__> :-(
<thumper> anyone? https://codereview.appspot.com/85570045/
<sinzui> Blessed: lp:juju-core/1.18 r2270
<waigani> thumper: I put some garbage into jujud/agent.go then ran cmd/juju/bootstrap_test.go TestTest: passes locally, fails on vm
<thumper> \o/
<thumper> wallyworld__: don't worry about the juju command there...
<thumper> waigani: sorry that was for you
<thumper> waigani: but check the others
<thumper> waigani: although I do challenge that
<thumper> waigani: if you delete ~/go/bin/jujud and rerun the test
<thumper> waigani: what happens?
<waigani> thumper: passes
<thumper> waigani: what is the error on the vm?
<waigani> https://bugs.launchpad.net/juju-core/+bug/1304767
<_mup_> Bug #1304767: test failure in cmd/juju <ppc64el> <juju-core:In Progress by waigani> <https://launchpad.net/bugs/1304767>
<waigani> thumper: I'll paste full error, hang on
<thumper> waigani: also, run a make check to run all the tests
<waigani> thumper: http://pastebin.ubuntu.com/7233312/
<thumper> with a broken jujud you'll get a lot of failures
<thumper> that I don't think should happen
<thumper> heh
<thumper> ok, to test locally
<thumper> we should do something like this...
<waigani> is it something to do with there not being a candidate match for 14.04:ppc ?
<thumper> PatchValue(&version.Current.Series, "magic")
<thumper> make the series be something it can never be anywhere
<thumper> and you'll hit the same problem locally
<thumper> (I think)
<waigani> thumper: still passes
<thumper> hmm...
<thumper> it is something like that...
<thumper> patch it before the start of setup
<waigani> ah okay
<thumper> the conn suite will bootstrap in setup
<waigani> thumper: still passes
<thumper> it is something like that....
<thumper> play a bit and break it
<waigani> thumper: I'll keep debugging on the vm - slowly but surely
<waigani> thumper: okay, I'm good at breaking things :)
<waigani> I have to run and get my girl now
<waigani> I've cornered the bug, with a bit more testing I should get it tonight.
<thumper> waigani: found it
<thumper> we patch version.Current, but don't patch arch.HostArch
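The fix thumper describes depends on the host-arch lookup being a patchable package-level func variable; a self-contained sketch of the pattern, with invented names:

package upload_test

import "testing"

// hostArch stands in for arch.HostArch; declaring it as a func var is
// what makes it replaceable from tests, alongside patching version.Current.
var hostArch = func() string { return "ppc64" /* real detection here */ }

func TestToolsUploadSeesPatchedArch(t *testing.T) {
	orig := hostArch
	hostArch = func() string { return "amd64" }
	defer func() { hostArch = orig }()

	if got := hostArch(); got != "amd64" {
		t.Fatalf("expected the patched arch, got %q", got)
	}
}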
<sinzui> wallyworld__, https://bugs.launchpad.net/juju-core/+bug/1302205 bothers me. It is critical, but is not targeted to the current milestone. I can see you working on it. I want to move it to 1.19.0 to reflect how it is being treated
<_mup_> Bug #1302205: manual provisioned systems stuck in pending on arm64 <add-machine> <hs-arm64> <manual-provider> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1302205>
<thumper> sinzui: what is the plan for releasing 1.19?
<wallyworld__> sinzui: sure
<sinzui> thumper, if 1.19.0 gets a passing rev today/tomorrow I might release it. lp:juju-core r2587  was the last time 1.19.0 passed.
<sinzui> 1.18.1 has two passes today, so I think it is more likely to be released
 * thumper nods
<cmars> sinzui, i've marked 1303880 and 1295140 as fix-committed. fixes for these have landed in trunk & 1.18
<sinzui> thank you cmars
<axw> wallyworld__: backport for addresses fix, can you please review? https://codereview.appspot.com/86720043/
<wallyworld__> sure
<wallyworld__> axw: done, looks like a nice change
<axw> wallyworld__: cheers
<wallyworld__> davecheney: is this one you've seen before? http://pastebin.ubuntu.com/7233429/
<wallyworld__> dannf: you still online?
<sinzui> thumper: Maybe you can point a developer to work on bug 1306212. CI will fail trunk because of it
<_mup_> Bug #1306212: juju bootstrap fails with local provider <bootstrap> <ci> <local-provider> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1306212>
<wallyworld__> sinzui: make him fix it himself :-P
<wallyworld__> except, it may be HA related
<sinzui> wallyworld__, I don't make fixes after midnight. I do dangerous things when I am tired
<wallyworld__> sinzui: no, not you, thumper :-)
<wallyworld__> i didn't mean you to fix it
<wallyworld__> you need to go to bed!
<sinzui> I am very skeptical that any of the trunk upgrades will pass. They are all paused and approaching the 30-minute timeout
<wallyworld__> :-(
<wallyworld__> deployments look ok
<wallyworld__> at least 1.18 works so that can be released soon
<sinzui> I already downloaded the 1.18.1 tarball and win installer. I could release that tomorrow
<wallyworld__> sinzui: fix for bug 1303735 landing now though
<_mup_> Bug #1303735: public-address change to internal bridge post juju-upgrade <openstack-provider> <juju-core:Fix Committed by axwalk> <juju-core 1.18:In Progress by axwalk> <https://launchpad.net/bugs/1303735>
<sinzui> \o/
<wallyworld__> we should not release without that one
<wallyworld__> the only other one is backup failure
<wallyworld__> meh for 1.18.1 i reckon
<sinzui> wallyworld__, There is not much point fixing that backup bug because the restore bug is targeted for 1.19.0.
<wallyworld__> that's what i reckon also
<wallyworld__> hence, "meh" :-)
<wallyworld__> so we can retarget and release as soon as the final fix lands
<wallyworld__> sinzui: fuck off to bed, it's way too late for you to be online
<wallyworld__> :-)
<sinzui> I have more email explaining what blocks 1.19.0 and then I will sleep
<davecheney> wallyworld__: yes
<davecheney> it turns out that there is a bug in gccgo
<davecheney> things are only given small stacks
<davecheney> and when they run off the end, they crash
<wallyworld__> boom
<davecheney> dunno why some ppc machines are ok
<davecheney> i'm building a test gccgo now
<wallyworld__> ok
<davecheney> mwhudson: ping
<wallyworld__> davecheney: btw, arm64 works now \o/
<davecheney> wallyworld__: working on a compiler fix
<davecheney> wallyworld__: it's actually a build option
<wallyworld__> the issue i had was bug 1274558, transparent huge pages; there is a workaround
<davecheney> eek
<davecheney> autoconf is incorrectly turning on the -fsplit-stack option for gccgo
<wallyworld__> the issue caused go executables to hang etc etc
<davecheney> which is really only an intel thing
<davecheney> wallyworld__: can you grab the last dozen lines of dmesg
<davecheney> on that system
<wallyworld__> ok
<wallyworld__> davecheney: this is not a juju bug is it https://bugs.launchpad.net/juju-core/+bug/1300256
<_mup_> Bug #1300256: juju status results in unexpected fault address on arm64 using/ local provider <arm64> <hs-arm64> <local-provider> <status> <juju-core:Triaged> <https://launchpad.net/bugs/1300256>
<dimitern> morning all
<dimitern> fwereade, I hope you're feeling better today :), just as a reminder - my two VLAN CLs https://codereview.appspot.com/86010044/ and https://codereview.appspot.com/86600043/ when you can have a look
 * dimitern is away for 1h
<rogpeppe> mornin' all
<waigani> davecheney: turned out to be a one liner: https://codereview.appspot.com/86760043/
<waigani> morning rogpeppe
<rogpeppe> waigani: hiya
<waigani> Exciting Friday night here, coding in the kitchen
<axw> do we have a way to version API methods yet?
<rogpeppe> axw: renaming them is the only way currently
<rogpeppe> axw: or going the backwards-compatible route
<axw> rogpeppe: ok, thanks
<rogpeppe> axw: there are a couple of directions i'd like to go with it, but it's a sensitive issue
<rogpeppe> ha, i wondered what i'd done to break the cmd/jujud tests, but this happens in trunk (many times): [LOG] 36.85478 ERROR juju worker: exited "rsyslog": failed to write rsyslog certificates: cannot create temp file: open /var/log/juju/rsyslog-cert.pem824698669: no such file or directory
<rogpeppe> this is not great
<axw> rogpeppe: did you get a chance to look at the EnsureAvailability CL?
<rogpeppe> axw: i'm about half way through the review
<axw> ok
<rogpeppe> axw: sorry, will get back to it!
<axw> rogpeppe: nps
<rogpeppe> axw: there was one thing i was having difficulty working out
<rogpeppe> axw: i couldn't quite see whether it always preserves the invariant that the number of wants-vote machines is always odd
<axw> it should always be the parameter, and that is checked at the top
<axw> if one is taken out of VotingMachineIds, another will replace it
<axw> rogpeppe: I must admit it was a little mindbendy to me, so don't take my word for it ;)
<mwhudson> davecheney: pong
<davecheney> waigani: nice fix
<davecheney> yeah, that was what I suspected
<davecheney> we weren't specifying, so it fell through to 'this machine'
<davecheney> which didn't match the fixtures
<davecheney> waigani: interesting line wrapping, are you using a c64 ?
<waigani> davecheney: line wrapping? in the description you mean?
<rogpeppe> axw: i'm wondering if we should be doing all the stateserverinfo ids manipulation inside a single Update op
<davecheney> waigani: y
<axw> rogpeppe: what's the benefit?
<rogpeppe> axw: it means that other code can't see them in an intermediate state
<waigani> davecheney: lol - well I may be formatting out of nostalgia ;)
<axw> rogpeppe: all the ops are run in a single transaction though?
<waigani> davecheney: how are we looking now on ppc?
<rogpeppe> axw: read operations don't respect transactions
<davecheney> waigani: no complain', just sayin'
<davecheney> waigani: will know in < 300 seconds
<axw> rogpeppe: ah, I see
<davecheney> waigani: provider/common was whinging about missing ssh keys
<waigani> oh really?
<waigani> I can look into that if you like?
<davecheney> waigani: i'll know in a few mins
<rogpeppe> axw: the other thing i'm trying to persuade myself of is whether the $size asserts in maintainStateServersOps are still sufficient
<davecheney> hold tight
<axw> rogpeppe: I was thinking they should be changed to exact match - is there a use case for two concurrent callers?
<axw> I figure someone might want to have a cron job calling this
<rogpeppe> axw: well, we try to make everything work ok with concurrent callers
<rogpeppe> axw: i'm not sure that an exact match is possible to do though
<rogpeppe> axw: it might need to be a txn_revno assertion
<waigani> davecheney: I looked into 1262967. I suspect it is a similar problem. So far though, none of the tests are failing for me.
<davecheney> waigani: bootstrap_test.go:69: c.Assert(err, gc.IsNil)
<davecheney> ... value *errors.errorString = &errors.errorString{s:"no public ssh keys found"} ("no public ssh keys found")
<axw> rogpeppe: sure, I just meant whether we try to allow concurrent modifications or lock each other out entirely
<rogpeppe> axw: ah yes
<rogpeppe> axw: if we've got two concurrent callers both calling ensure-availability with different numbers, that's likely to be problematic anyway
<rogpeppe> axw: but if the voting server count hasn't changed, i think it should be fine to just let a concurrent call assume that the first one has worked, and return with success
<axw> rogpeppe: as we do now?
<axw> len(VotingMachineIds) == numStateServers?
<rogpeppe> axw: well, that was ok before, but isn't now, because we want to juggle available servers
<waigani> davecheney: I can't reproduce that?
<waigani> provider/common$ go test on ppc 31 tests pass?
<davecheney> waigani: rm -rf ~/.ssh :)
<waigani> ugh god facepalm
<axw> rogpeppe: it should still work - it checks if there are any machines being taken out of the voting set
<davecheney> waigani: it should be the same as your provider manual fix, right
<waigani> davecheney: is there a bug for that one?
<davecheney> the underlying cause is the same
<axw> rogpeppe: info is updated by updateAvailableStateServersOps if that's not clear
<waigani> davecheney: yep, looks very similar
<davecheney> waigani: the bug was for both I think
<axw> so VotingMachineIds out may be smaller than going in
<waigani> ooh right, okay I can fix and link it to the same bug?
<davecheney> yup
<waigani> sweet, will do
<rogpeppe> axw: currently, we just return nil if the number of voting machines hasn't changed. i'm not sure we can still do that.
<davecheney> mwhudson: i've found the cause of the juju crashes on ppc
<davecheney> in the gccgo deb
<davecheney> actually, i'll step back
<mwhudson> davecheney: oh?
<davecheney> mwhudson: basically, libgo tests to see if the compiler supports -fsplit-stack
<davecheney> which ppc says it does
<davecheney> but it lies
<mwhudson> ah
<davecheney> i think the same is true for arm64
<davecheney> i mean, you need gold to support it
<davecheney> and gold isn't even installed on ppc
<axw> rogpeppe: sorry, I don't understand why. if we remove a voting machine ID, then len(VotingMachineIds) < numStateServers. If we bring an available server back in, that count doesn't change, but we add a txn.Op to make the change in mongo
<mwhudson> davecheney: i'm pretty sure i checked that the configure script does not think arm64 supports split stacks
<davecheney> ok, that might be a different error
<davecheney> but it looks like ppc is saying it does
<davecheney> and so libgo configures itself accordingly
<davecheney> and so each goroutine has a small stack and runs off the end easily
<davecheney> checking whether -fsplit-stack is supported... no
<davecheney> hmm, that is odd
<rogpeppe> axw: i'm just saying that i don't think we can just do nothing at all if len(VotingMachineIds) == numStateServers
<rogpeppe> axw: (which is the current behaviour)
<axw> rogpeppe: ah, we should perform an assertion you mean?
<rogpeppe> axw: we'll still need to potentially decommission unavailable machines and add new ones if so
<axw> rogpeppe: yes, that's being done in my CL. possibly not very obviously
<rogpeppe> axw: yup
<axw> or maybe I just don't understand
<rogpeppe> axw: perhaps when you said "as we do now" you meant "as we do in my branch" ?
<axw> rogpeppe: "if the server count hasn't changed, i think it should be fine to just let a concurrent call assume that the first one has worked, and return with success"  -- do we not do that on trunk? maybe you meant server count hasn't changed in mongo, and availability hasn't changed
<axw> anyway. I think we're roughly on the same page
<rogpeppe> axw: yeah
<natefinch> morning all
<rogpeppe> natefinch: hiya
<rogpeppe> axw: i suppose the question now in my mind is: is it possible for EnsureAvailability to do any actions without changing the server id counts?
<rogpeppe> axw: hmm, i think it might be possible
<axw> morning natefinch
<rogpeppe> axw: so perhaps the best thing is to assert on txn_revno and assume that txn.ErrAborted means all's well
<axw> rogpeppe: yes, it will change the counts definitely. the check there at the moment says that the counts didn't change externally
<axw> rogpeppe: SGTM
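The txn-revno guard they settle on would be shaped roughly like this with mgo/txn; the collection, id, and field names are assumptions, not juju-core's actual identifiers:

package state

import (
	"gopkg.in/mgo.v2/bson"
	"gopkg.in/mgo.v2/txn"
)

// maintainVotersOps asserts on the state-servers doc's txn-revno, so a
// concurrent EnsureAvailability aborts instead of double-applying.
func maintainVotersOps(txnRevno int64, votingIds []string) []txn.Op {
	return []txn.Op{{
		C:      "stateServers",
		Id:     "e",
		Assert: bson.D{{"txn-revno", txnRevno}},
		Update: bson.D{{"$set", bson.D{{"votingmachineids", votingIds}}}},
	}}
}

On txn.ErrAborted the caller can then treat the work as already done by a concurrent call and return success.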
<davecheney> waigani: are you going to apply https://codereview.appspot.com/85710043/ to provider/common ?
<rogpeppe> axw: what if we've got the following scenario (x means unavailable): machineids: 1 2 3x 4; votingids: 1 2x 4; then we'll end up with something like: machineids: 1 2 4 5; voting ids: 1 4 5
<rogpeppe> axw: i think
<waigani> davecheney: what would be best? I can update that branch or create a new one?
<rogpeppe> axw: where the voting counts haven't changed, but the contents have
<davecheney> waigani: you need to create a new branch
<rogpeppe> axw: (and 5 is a newly commissioned machine)
<davecheney> this is the gift that lbox brings us
<waigani> davecheney: ah hehe okay
<waigani> nice timing, as I was just starting on the old branch
<waigani> you must have sensed it (or you've hacked into my computer)
<axw> rogpeppe: each of the promote/demote ops assert that they weren't changed too
<axw> rogpeppe: so if something else modified the contents concurrently, we'd still trip the asserts on the wantvote/hasvote fields
<rogpeppe> axw: ah, good point
<rogpeppe> axw: reviewed
<axw> rogpeppe: thanks
<axw> rogpeppe: ahhhh, now I see what you mean about voting count not changing
<axw> on ErrAborted
<rogpeppe> axw: cool
<waigani> davecheney: lboxing...
<davecheney> kk
<waigani> davecheney: https://codereview.appspot.com/86800043
<davecheney> waigani: reviewing
<axw> night all
<davecheney> it would be nice to get this in this evening
<waigani> davecheney: cool, I'll hang around until it lands
<davecheney> waigani: nah
<davecheney> i can land it myself
<davecheney> i have timezones on my side
<davecheney> it looks like there might be another bug in cmd/juju
<davecheney> tests
<davecheney> simple ordering one
<waigani> davecheney: I'm hitting this on the vm: fatal error: bad spsize in __go_go
<davecheney> eek
<waigani> yeah, I don't seem to be able to run my tests
<davecheney> waigani: $ go test ./provider/common/
<davecheney> ok      launchpad.net/juju-core/provider/common 24.926s
<davecheney> LGTM
<davecheney> ship it
<waigani> davecheney: sweet
<waigani> davecheney: cmd/juju$ go test
<waigani> OK: 226 passed, 2 skipped
<waigani> PASS
<davecheney> waigani: nice one
<davecheney> give yourself the evening off
<waigani> davecheney: thanks. If you see any bugs you want me to look at, send me an email for Monday.
<waigani> davecheney: I'll be keen to see how close we are to getting juju working on ppc :)
<davecheney> waigani: i think we're down to 3
<davecheney> cmd/juju
<davecheney> which looks easy
<davecheney> cmd/jujud which looks like a timeout
<davecheney> the joyent provider which is a timeout
<davecheney> and worker tests
<davecheney> which are a bug in the compiler i'm working on
<mgz> rogpeppe, dimitern: standup!
<davecheney> waigani: http://paste.ubuntu.com/7233131/
<davecheney> from a few hours ago
<davecheney> be suspicious of any tests that take 600 seconds
<davecheney> the watchdog kills it
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1306536
<_mup_> Bug #1306536: replicaset: mongodb crashes during test <juju-core:Triaged> <https://launchpad.net/bugs/1306536>
<davecheney> this is a real thing
<davecheney> if mongo shits itself during our CI
<davecheney> which it does, continually
<davecheney> what is it going to do in the field ?
<rogpeppe> natefinch: shall we hang out elsewhere?
<natefinch> rogpeppe: sure
<rogpeppe> natefinch: https://plus.google.com/hangouts/_/canonical.com/juju-HA?authuser=1
<natefinch> davecheney: we'll look into it. I've seen it occasionally.  not sure what causes it yet
<davecheney> natefinch: what version is the bot running
<davecheney> is it still running our old crack version of mongo we made 2 years ago ?
<rogpeppe> natefinch: bzr+ssh://bazaar.launchpad.net/~rogpeppe/juju-core/natefinch-041-moremongo/
<natefinch> davecheney: I don't think so, but I'm not sure
<davecheney> $ ~/bin/juju destroy-environment -y local
<davecheney> ERROR failed verification of local provider prerequisites:
<davecheney> juju-local must be installed to enable the local provider:
<davecheney> umm ....
<davecheney> so, i can't develop juju unless I have a conflicting juju binary installed ?
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1306544
<_mup_> Bug #1306544: developing juju requires juju-local  to be installed <juju-core:Triaged> <https://launchpad.net/bugs/1306544>
<natefinch> interesting way to put it, we just need you to have juju-mongodb
<davecheney> this is kind of a problem
<natefinch> it's been a huge problem for new developers.... but you're right, we should have a separate package to just deploy the db
<davecheney> you have to be super careful `juju` is the juju you mean
<natefinch> yeah, I think rog hit that
<natefinch> interesting... actually:
<natefinch> $ which juju
<natefinch> /usr/bin/juju
<davecheney> sad trombone
<rogpeppe> does anyone know anything about statecmd.MachineConfig ?
<rogpeppe> from the description, it looks like it just returns information, but AFAICS, it's actually responsible for setting up the initial password info on a machine too
<rogpeppe> and it seems kinda weird that it's living inside statecmd too
<rogpeppe> ah, i see, it's a 1.16 legacy
<mgz> evilnickveitch: do you want me to resubmit doc changes against github in order to get them landed?
<evilnickveitch> mgz, no, it's fine. I did a PR for the GH repo. Nobody has looked at it, so I will just merge it anyhow
<mgz> evilnickveitch: I can probably stamp it...
<mgz> evilnickveitch: too late, thanks!
<evilnickveitch> :)
<wallyworld__> fwereade: hiya, saw your comment on instance type constraints. i haven't done any work on it for a few days except for merging trunk and resolving conflicts as i've been doing the arm stuff and other work for 1.18.1 etc. i guess i should mark it back as wip. i'll be able to get more done next week
<natefinch> rogpeppe: back'
<fwereade> wallyworld__, no worries, I knew we'd talked about some of the stuff I mentioned, just wanted to make sure it was recorded
<wallyworld__> sure, ok
<fwereade> rogpeppe, IIRC it's primarily for the manual provider (and particularly for hazmat's convenience)
<rogpeppe> fwereade: i'm just pondering the best place to add mongo password setup for new state servers
<rogpeppe> natefinch: https://plus.google.com/hangouts/_/canonical.com/juju-HA?authuser=1
<fwereade> rogpeppe, preferably purely inside existing state servers, surely?
<rogpeppe> fwereade: sure
<rogpeppe> fwereade: i think it's best done along with the other password in the API call ProvisioningScript
<rogpeppe> fwereade: (FWIW, that's another bad name - it doesn't sound like it actually sets up the machine too)
<fwereade> rogpeppe, fair enough, so long as we don't put it in cloudinit
<rogpeppe> fwereade: definitely not - it couldn't go in cloudinit anyway
<rogpeppe> fwereade: BTW NewAPIAuthenticator seems to be fundamentally misguided - it only gets the state and API addresses once
<fwereade> rogpeppe, yeah, I know, it sucks
<rogpeppe> fwereade: i'm not sure whether to change it to fetch the addresses each time, or to add a watcher to the provisioner
<fwereade> rogpeppe, midpoint: get the addresses once per batch of machines
<rogpeppe> fwereade: i suppose AuthenticationProvider could do the watch too
<fwereade> rogpeppe, the intent was always to do it per-batch but I forget why it didn't happen
<fwereade> rogpeppe, I suspect there was some ugly interaction with container provisioners
<rogpeppe> fwereade: ah, i see. we should get the address in provisionerTask.startMachines
<perrito666> how cool would it be to have a special var that, when something other than nil is assigned to it, would automatically produce a panic or return err (error handling induced day dreaming)
<rogpeppe> fwereade: it's pretty awkward to refactor the code so it uses bulk API calls, BTW
<rogpeppe> fwereade: although it's definitely a place that it's worth it
<rogpeppe> fwereade: although actually just running all the startMachine calls concurrently would be a big win (and quite likely faster than using sequential bulk calls)
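The concurrent-start idea, shape only; Machine and startMachine here are stand-ins rather than the provisioner's real types:

package provisioner

import "sync"

type Machine struct{ Id string }

func startMachine(m Machine) error { return nil } // placeholder

// startMachines starts every machine in its own goroutine and reports
// the first failure; the buffered channel means no goroutine blocks.
func startMachines(machines []Machine) error {
	var wg sync.WaitGroup
	errs := make(chan error, len(machines))
	for _, m := range machines {
		wg.Add(1)
		go func(m Machine) {
			defer wg.Done()
			if err := startMachine(m); err != nil {
				errs <- err
			}
		}(m)
	}
	wg.Wait()
	close(errs)
	// Receiving on the closed channel yields the first buffered error,
	// or nil if every start succeeded.
	return <-errs
}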
<natefinch> perrito666: magic is pretty much anathema in Go. I prefer to have the code do exactly what it says it does, and no more.  Magic is how you get bugs.
<perrito666> natefinch: amen
<perrito666> natefinch: I guess I was more in search of syntactic sugar than magic
<natefinch> perrito666: also not something Go generally does :)   you can do this, though:
<natefinch> if err := foo(); err != nil {
<natefinch>     return err
<natefinch> }
<fwereade> rogpeppe, we do also want environ.StartInstances to mitigate rate limiting
<rogpeppe> fwereade: yes.
<perrito666> natefinch: yep, that is where I go whenever I can, although that is closer to syntactic saccharine than sugar
<fwereade> rogpeppe, and I'm well aware of just how much hassle that will be, but we can only put it off so long ;)
<rogpeppe> fwereade: although i'm not yet sure if rate limiting should really be done in the Environ itself
<fwereade> perrito666, also it keeps the scope of err nice and tight, which is often a benefit of itself
<fwereade> rogpeppe, I'm reasonably sure it should be, myself
<rogpeppe> fwereade: the problem is that not everything that makes provider calls necessarily shares the same Environ
<fwereade> rogpeppe, which is not to say that we shouldn't also have some stuff around instancepoller too
<fwereade> rogpeppe, most of them should, though
<fwereade> rogpeppe, and it's probably not that hard to arrange
<rogpeppe> fwereade: doesn't every worker create its own Environ?
<rogpeppe> fwereade: but, yeah, it's probably not too hard to pass an environ in to those workers that need one
<mattyw> fwereade, when you have a moment...
<rogpeppe> fwereade: except that it can change
<fwereade> rogpeppe, yeah, but it doesn't have to -- pass one in, have another worker responsible for updating it
<fwereade> mattyw, heyhey
<natefinch> anyone understand this log about wordpress not installing?   failed to fstat previous diversions file: No such file or directory    whole log: http://paste.ubuntu.com/7235032/
<sinzui> Hi jamespage: I have an idea to address bug 1304493 that I think was caused by republication of juju tools.
<_mup_> Bug #1304493: Juju tools 1.18.0 streams.canonical.com checksum mismatch <ci> <landscape> <packaging> <juju-core:Triaged> <https://launchpad.net/bugs/1304493>
<jamespage> sinzui, I think so yes
<jamespage> sinzui, that feels bad to me
<dimitern> rogpeppe, fwereade, updated https://codereview.appspot.com/86010044/
<dimitern> rogpeppe, fwereade, (networks stuff)
<sinzui> jamespage, I am not convinced it is the right solution since I do not know why streams.canonical.com got a different file size for the amd64 precise tools. I think the only package it could find is the one from the juju stable ppa
<jamespage> sinzui, one from the PPA, then one from the distro when I uploaded it I suspect
<sinzui> jamespage, you made one for precise?
<jamespage> sinzui, no
<sinzui> I thought you were just making trusty
<jamespage> oh - that really is odd then
<jamespage> sinzui, I am
<jamespage> sinzui, I don't understand then
<jamespage> we need utlemming I think
<sinzui> I will keep investigating
<sinzui> thank you jamespage
<rogpeppe> fwereade: can i run something by you?
<fwereade> rogpeppe, sure
<rogpeppe> fwereade: i *think* we can remove StateInfo from environs.MachineConfig and cloudinit.MachineConfig
<fwereade> rogpeppe, w00t!
<fwereade> rogpeppe, I've been wanting to do that for months :)
<rogpeppe> fwereade: basically, any agent that needs to connect to State can dial localhost
<rogpeppe> fwereade: that will need to change in the future if we want to have more API servers than mongo instances
<rogpeppe> fwereade: but even then, i don't think it needs to be in MachineConfig
<fwereade> rogpeppe, +100
<fwereade> rogpeppe, anything starting up at bootstrap time can use localhost, and anything starting up late can just grab the info over the api, anyway
<rogpeppe> fwereade: yup
<fwereade> rogpeppe, excellent
<sinzui> jamespage, I now think bug 1304493 is caused by different versions of gzip. I don't think this issue is about alternate packages.
<_mup_> Bug #1304493: Juju tools 1.18.0 streams.canonical.com checksum mismatch <ci> <landscape> <packaging> <juju-core:Triaged> <https://launchpad.net/bugs/1304493>
<jamespage> sinzui, ah - yes - deterministic zip creation
<natefinch> marcoceppi, hazmat: either of you recognize this error from installing a charm?  failed to fstat previous diversions file: No such file or directory   full log: http://paste.ubuntu.com/7235032/
<rogpeppe> fwereade: i think we'll remove state addresses from agent config too
<natefinch> seems to only happen when deploying locally
<fwereade> rogpeppe, sgtm I think
<rogpeppe> fwereade: and i'm also changing the agent config to store hostports to avoid the current impedance mismatch.
<rogpeppe> fwereade: then it's [][]instance.HostPort throughout
<fwereade> rogpeppe, cool
<marcoceppi> natefinch: that's an interesting error, I've not come across it before. What's the charm?
<natefinch> marcoceppi: that's wordpress
<natefinch> marcoceppi: different error for mysql, start fails
<natefinch> marcoceppi: mysql gets this: http://paste.ubuntu.com/7235297/
<jamespage> sinzui, are we likely to see a 1.18.1 release today?  as it will be critical bug fixes I can push that pre-release
<jamespage> and we're still in universe so meh
<sinzui> jamespage, the gzip bug is blocking, but I hope to be able to start the release in 2 hours
<alexisb> natefinch, fwereade ping
<natefinch> alexisb: howdy
<alexisb> hi natefinch happy friday
<alexisb> can I lean on you to take a look at the 4 new critical bugs for 1.19.0 and see if you can easily assign them to someone across the juju team?
<alexisb> looks like most are regressions
<natefinch> alexisb: I can look, for sure. What's the timeline on getting them fixed?  There's not a lot of working hours left for most juju devs
<alexisb> as soon as we can, but in regular working hours
<natefinch> alexisb: ok
<alexisb> we just need to try and unblock sinzui from releasing
<alexisb> sinzui, are there any bugs in 1.18 that need our assistance for the release jamespage needs?
<alexisb> and natefinch thank you!
<natefinch> alexisb: welcome
<alexisb> sinzui, our == juju-core
<rogpeppe> is hook output still labelled with HOOK in the logs?
<sinzui> alexisb, I am working on the one bug that blocks the release of 1.18.1.
<sinzui> alexisb, I am going to defer the backup bug because fixing it doesn't allow you to restore. The restore bug is targeted to 1.19.0
<alexisb> sinzui, ack
<natefinch> sinzui, marcoceppi, alexisb:  Btw, looks like wordpress and mysql both fail to deploy on trunk, at least using the local provider on trusty.  Getting some errors related to apparmor when installing wordpress and when starting mysql.
<alexisb> natefinch, that would make sense given one of the critical bugs deals with failing to bootstrap a local provider
<natefinch> alexisb: I can bootstrap ok, it's deploying that I have a problem with
<alexisb> ah ok, shows my ignorance :)
<fwereade> alexisb, heyhey, I'll take a look as well
<alexisb> fwereade, thanks
<natefinch> fwereade: check out my response to this bug at the bottom and let me know if you disagree: https://bugs.launchpad.net/juju-core/+bug/1208430
<_mup_> Bug #1208430: mongodb runs as root user <mongodb> <juju-core:Triaged> <juju-core (Ubuntu):Triaged> <https://launchpad.net/bugs/1208430>
<fwereade> natefinch, concur
<fwereade> natefinch, db access gives you the keys to the kingdom regardless
<fwereade> natefinch, have we closed the mongod ports externally yet?
<fwereade> natefinch, I'm guessing not, but I think we can now; right?
<fwereade> natefinch, at least on providers where we can do firewalls
<fwereade> natefinch, in fact it should be a matter of just not explicitly opening it any more
<natefinch> fwereade: roger says he thinks we keep them open still... but I think we can close them (and definitely should close them)
<fwereade> natefinch, close 'em :)
<fwereade> natefinch, ideally we'd close them even to traffic from other machines in the environment, but that's not practical yet
<natefinch> fwereade: yep
<sinzui> natefinch, then this is the real bug 1305280
<_mup_> Bug #1305280: juju command get_cgroup fails when creating new machines, local provider arm32  <armhf> <local-provider> <lxc> <packaging> <juju-core:Triaged> <apparmor (Ubuntu):New> <https://launchpad.net/bugs/1305280>
<sinzui> natefinch, The bug is in ubuntu, but I kept it open on juju-core to make it easy for me to track
<natefinch> sinzui: ahh, ok, thanks
<fwereade> natefinch, looking at those criticals I am suspicious that the other 3 are ha-/replicaset-related
<natefinch> sinzui: the armhf in the title seems misleading, since I'm definitely not running arm here
<sinzui> natefinch, yep. I thought it was a limited ubuntu ports/packaging issue
<natefinch> fwereade: yeah
<natefinch> fwereade:  this one seems like it just requires an update to the backup script to find the juju-db binaries: https://bugs.launchpad.net/juju-core/+bug/1305780
<_mup_> Bug #1305780: juju-backup command fails against trusty bootstrap node <backup-restore> <juju-core:Triaged> <https://launchpad.net/bugs/1305780>
<natefinch> oh, i guess that one is 1.18.1
<natefinch> fwereade: certainly the other ones are HA related
<sinzui> natefinch, I am going to move that bug to 1.19.0
<sinzui> natefinch, if we fix the backup bug, the user still hits the two restore bugs in 1.19.0
 * sinzui moves the bug to be with its friends
<natefinch> the more the merrier
<sinzui> natefinch, So I see one bug in progress for 1.18.1. It's my job to solve the gzip compression problem between different machines.
<sinzui> Juju-Core can focus on getting trunk releasable for Monday/tuesday
<fwereade> natefinch, yeah, I think they're not on the $PATH
<fwereade> natefinch, sorry, evening is happening around me a bit ;)
<natefinch> fwereade: seems like an easy fix at least
<fwereade> natefinch, yeah
<fwereade> btw, https://codereview.appspot.com/86910043 is up for review at long long last
<cjohnston> natefinch or fwereade any chance either of you could help debug an issue where trusty cant deploy precise instances using lxc?
<fwereade> cjohnston, IIRC thumper has started work on that -- is this new?
<sinzui> cjohnston, did you set default-series in your config... and is your failure the one with some charms, bug 1305280?
<_mup_> Bug #1305280: juju command get_cgroup fails when creating new machines, local provider arm32  <armhf> <local-provider> <lxc> <packaging> <juju-core:Triaged> <apparmor (Ubuntu):New> <https://launchpad.net/bugs/1305280>
<cjohnston> https://bugs.launchpad.net/juju-core/+bug/1306537 is the bug filed for it, yes default_series is set
<_mup_> Bug #1306537: LXC provider fails to provision precise instances from a trusty host <juju-core:Invalid> <https://launchpad.net/bugs/1306537>
<cjohnston> I don't see (error: error executing "lxc-start": command get_cgroup failed) in juju status, the instances just sit at pending for hours
<cjohnston> however deploying cs:trusty/ubuntu works
<sinzui> cjohnston, This issue may be fixed in trunk. The opposite scenario was recently fixed: bug 1302820
<_mup_> Bug #1302820: juju deploy --to lxc:0 cs:trusty/ubuntu creates precise container <landscape> <juju-core:Fix Committed by thumper> <https://launchpad.net/bugs/1302820>
<sinzui> cjohnston, 1.19.0 will be released next week, a day after trunk stabilises.
<rogpeppe>  i'm sure i used to be able to move my mouse when my machine was heavily loaded
<natefinch> rogpeppe: I've noticed that recently, too
<rogpeppe> and this IRC client has an amazing failure mode when i've been typing into it when the machine's heavily loaded
<natefinch> (recently = last few months at least)
<rogpeppe> every few keys typed, it ignores what i've typed and types something from the past instead
<rogpeppe> so just then, if i typed "ddddddddd", i'd get "d abled abled"
<rogpeppe> hmm, we've taken away support for 1.19 to work against 1.16 clients, right?
<natefinch> time to get a new irc client
<sinzui> rogpeppe, There is a commit in trunk that says that
<sinzui> and I removed the ability to package 1.16. it's dead to me
<rogpeppe> sinzui: so we don't care about upgrading from 1.16 then?
<rogpeppe> sinzui: which upgrade transition has been failing on trunk?
<sinzui> rogpeppe, we test stable to stable. 1.16 goes to 1.18. 1.18 can go to 1.19 or 1.20
<rogpeppe> sinzui: ok, cool
 * sinzui notes that juju docs should make that very clear
<rogpeppe> natefinch, dimitern, anyone else that's around: large but largely trivial: https://codereview.appspot.com/87010044
<natefinch> review for whoever: https://codereview.appspot.com/86920043/
<natefinch> rogpeppe: this is the machine 0 log for upgrading 1.18 to trunk
<natefinch> http://pastebin.ubuntu.com/7236187/
<natefinch> rogpeppe:  ubuntu@ec2-174-129-121-255.compute-1.amazonaws.com
<perrito666> how deep is a copy of an object in go?
<perrito666> for instance I copy a document that has fields such as []string slices and arrays inside
<perrito666> do all those get properly copied?
<natefinch> perrito666: it's hard to answer that without being flippant.  The obvious answer is, it copies everything.  But that would be confusing
<natefinch> perrito666: slices, channels, and maps are all implemented as pointers, so the pointers get copied but not what they point to
<natefinch> interfaces, too
<natefinch> arrays are actually consecutive memory like you'd expect, and when you copy them, you copy the whole dang thing
<natefinch> but slices are just pointers to parts of arrays, so when you copy them, you're only copying the pointer
<perrito666> natefinch: as I thought, if I want to "clone" an object, I need to create a sort of deepcopy
<natefinch> perrito666: depends on the value, but yes, some values take some extra work
<natefinch> perrito666: maps and slices are really the only thing you need to worry about (there's no real way to deep-copy an interface, and it doesn't really make sense to copy a channel)
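A runnable illustration of the shallow-copy behaviour natefinch describes:

package main

import "fmt"

type doc struct {
	Name string
	Tags []string // the slice header is copied; the backing array is shared
}

func main() {
	a := doc{Name: "a", Tags: []string{"x"}}

	b := a // plain copy: b.Tags still points at a's backing array
	b.Tags[0] = "mutated"
	fmt.Println(a.Tags[0]) // prints "mutated"

	c := a
	c.Tags = append([]string(nil), a.Tags...) // deep-copy the slice
	c.Tags[0] = "independent"
	fmt.Println(a.Tags[0]) // still "mutated": c no longer aliases a
}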
<natefinch> perrito666: I guess a good question would be - why do you need to clone something?
<perrito666> natefinch: thank you, btw, how many hours a day do you work? you are here when I arrive and when I leave
<perrito666> natefinch: refactoring a big piece of code of assignToMachine
<natefinch> perrito666: haha... not that much, really.  I have a lot of interruptions during the day due to having two small children.  I start work around 5:30 or 6am and end at 5pm, but there's usually a few hours of non-work in there.
<natefinch> ok, EOD for me
<perrito666> natefinch: have a nice weekend
<natefinch> perrito666: you too
#juju-dev 2014-04-13
<waigani> morning all
<thumper> morning
<waigani> morning thumper
<waigani> alexisb, morning. Our meeting is tomorrow morning, not today right? i.e. in about 24hrs
 * thumper trawls through the 100 odd emails
<waigani> does anyone use GDB for debugging Go?
<thumper> waigani: I can't remember if I tried and failed, or didn't get around to trying
<thumper> but no, not right now I don't
<waigani> thumper: there is a nice looking GoGDB plugin for sublime. But yeah, I can't get it working out of the box. So just wondering if it is worth the effort.
<waigani> chucking in copious amounts of debug logging does the trick for now ;)
<waigani> thumper: make check --compiler=gccgo on my local machine throws a bunch of linker errors. I've tried removing pkg. Any hints?
<thumper> waigani: likely to be an old local gccgo
<waigani> ah okay
<thumper> fwereade: I have my physio appt in 10 minutes
<thumper> fwereade: chat later I guess.
<waigani> thumper: I've hit a bit of a road block. Running tests on vm I get: http://pastebin.ubuntu.com/7246968. Running locally I get linker errors (gccgo upgrade didn't help).
 * thumper is back
<thumper> waigani: looking
<thumper> waigani: which tests are causing that error?
<waigani> thumper: good question, came up during make check. I'm guessing something in cmd/juju
<thumper> also, would help to have more context in the pastes :-)
<thumper> looks like a gccgo compiler bug
<thumper> as the code compiles fine with cgo
<thumper> but that isn't particularly helpful
<thumper> I know
<waigani> thumper: I'm not sure what more context to give. Error occurs during make check, everything passes up to that error. Line before is "ok  	launchpad.net/juju-core/cmd/envcmd	0.330s"
<thumper> even that context helps
<waigani> I'll to an apt-get update on the vm
<thumper> waigani: just that the pastebin didn't say anything about which package was having errors
<waigani> thumper: right, I should have put in that I don't know which package is throwing the error. Just ran tests in cmd/juju - all pass.
<thumper> hmm..
<thumper> in the vm?
<waigani> yep
<waigani> apt-get update; make check in vm ....
<waigani> same error, sigh
<waigani> thumper: I logged this https://bugs.launchpad.net/juju-core/+bug/1307241
<_mup_> Bug #1307241: Tests are not isolated from jujud <juju-core:New for waigani> <https://launchpad.net/bugs/1307241>
<waigani> should I focus on that and then see if dave has hints regarding the vm when he is on?
<waigani> thumper: ^
<thumper> waigani: yeah
<mwhudson> waigani: this is on ppc?  want me to try on arm64?
<waigani> mwhudson: yep, I'm on ppc. Sure, it would be interesting to see.
<mwhudson> waigani: what version of juju, what command line?
<waigani> mwhudson: juju version
<waigani> 1.19.0-trusty-ppc64
<mwhudson> waigani: how do i get that?  bzr or apt-get source?
<waigani> mwhudson: bzr, I just pulled the latest trunk and ran make check
<mwhudson> okidoke
<waigani> mwhudson: thanks
<mwhudson> go get-ing...
<mwhudson> i guess i need to run godeps after this?
<mwhudson> oh maybe make check handles that for me
 * mwhudson doesn't know what he's doing
<thumper> mwhudson: yeah, go get
<thumper> mwhudson: and, yes you will most likely need to go get launchpad.net/godeps, and run godeps -u requirements.tsv from the root of juju-core
 * thumper goes to make a coffee to stay awake
<mwhudson> exec: "hg": executable file not found in $PATH
<mwhudson> i guess i need to fix that too :-)
<waigani> mwhudson: https://docs.google.com/a/canonical.com/document/d/1m9R2n6LPLNLGjdopcNkQYVG8D5V4FTyvc1vvn-9ZifM/edit#heading=h.cbgzlmojri6s
<waigani> they are the setup notes for ppc, might be helpful?
<mwhudson> yeah, probably
<thumper> WT actual F
<thumper> os.RemoveAll is giving an error
<thumper> the error makes no sense
<thumper> and the RemoveAll worked
<davecheney> good morning
<davecheney> today is brought to you by the letters
<davecheney> ./configure
<thumper> davecheney: I've just sent an email to juju-dev that I'd appreciate your thougts on
<thumper> thoughts even
 * thumper reads a typo at the start of the email
 * thumper sighs
<thumper> I blame tiredness
<davecheney> thumper: ok
 * davecheney things
 * davecheney thinks
#juju-dev 2015-04-06
<dimitern> morning all
<dimitern> (just in case it's not just me today :)
<natefinch> dimitern: morning.  I'm here, but only just for a minute or two before bed
<dimitern> natefinch, hey, btw do you need another review on that ensure availability API branch?
<natefinch> dimitern: yes please.  Just pushed a bunch of changes.  There's a few comments from Andrew that I just can't get to before I fall over tonight, but mostly I think it's fine.
<natefinch> http://reviews.vapour.ws/r/1299/
<dimitern> natefinch, ok, np - looking
<natefinch> way past bedtime for me.  Good night
<dimitern> good night natefinch !
<dimitern> jam, I guess you're not here today
<jam> dimitern: I'm going to be a bit late for our 1:1, I have to take my son to a friends house. I'll ping you when I get back.
<dimitern> jam, sure, np
 * dimitern steps out for ~30m
<perrito666> morning
<jam> dimitern: sorry I forgot to ping you back. Do you have any pressing items to discuss? How are you feeling about prep for next week?
 * dimitern is back
<dimitern> jam, hey
<dimitern> jam, perhaps we can have a quick chat now if you're available?
<dimitern> morning perrito666
<dimitern> perrito666, aren't you off today?
<perrito666> dimitern: nope, I have been on vacations for two weeks
<perrito666> but everyone else seems to be off
<dimitern> perrito666, yeah :)
<natefinch> wwitzel3: standup?
<natefinch> dimitern: about utils_RandomPassword ... I intentionally use an unusual format w/ the underscore, since it's replacing utils.RandomPassword... I find it makes it easy to tell what function is being replaced by this variable when you're just looking at the code.  otherwise I could just call it randomPassword()... but I like having an easy way to tell what it's replacing without having to go to definition on it.
<dimitern> natefinch, well what's wrong with utilsRandomPassword ?:)
<natefinch> dimitern: it's still not as obvious that the first part is part of the package name... the underscore really separates it out, and it still looks a lot like the original namespaced form: var utils_RandomPassword = utils.RandomPassword
<natefinch> dimitern: but if people really don't like it, it's not a huge deal.
<dimitern> natefinch, I still rather not have underscores in code, but it that's the only thing remaining and others are ok with it, I'll accept that
<dimitern> s/but it/but if/
<natefinch> dimitern: heh... there's a ton of other stuff remaining :)
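Editor's note: a minimal sketch of the test-seam pattern natefinch is describing; only the `var utils_RandomPassword = utils.RandomPassword` convention comes from the discussion, and the package, function names, and import path are invented for illustration.

```go
// state.go -- shadow the dependency in a package-level variable so tests
// can swap it out; the underscore deliberately echoes the namespaced
// original, utils.RandomPassword.
package state

import "github.com/juju/utils"

var utils_RandomPassword = utils.RandomPassword

func newAgentPassword() (string, error) {
	return utils_RandomPassword()
}

// state_test.go -- a test replaces the seam and restores it afterwards.
package state

import "testing"

func TestNewAgentPassword(t *testing.T) {
	orig := utils_RandomPassword
	defer func() { utils_RandomPassword = orig }()
	utils_RandomPassword = func() (string, error) { return "fixed", nil }

	pw, err := newAgentPassword()
	if err != nil || pw != "fixed" {
		t.Fatalf("got %q, %v", pw, err)
	}
}
```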
<natefinch> dimitern:  I'm not sure what you meant when you said there was inconsistent indenting with the txn.Ops.  I just copied the other three functions that were formatted exactly like mine (as far as I can tell).
<dimitern> natefinch, in a call, will get back to you
<dimitern> natefinch, ah, so I thought my comment there made it clear,
<natefinch> dimitern: not to me, at least :)
<dimitern> natefinch, I think you either use []txn.Op{{\nC:XYZ,\n,...,\n}, {\n,C:UVT,\n...,\n}} or []txn.Op{\n{\nC:XYZ,\n...,\n},\n{\nC:UVT,\n...,\n}\n,}\n
<dimitern> natefinch, does it make more sense?
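Editor's note: rendered as real code, these are the two gofmt-stable layouts dimitern is contrasting; txn.Op is the mgo driver's transaction type (import path assumed), and the field values are invented.

```go
package demo

import "gopkg.in/mgo.v2/txn"

func buildOps(mID, uID string) []txn.Op {
	// Form 1: the elements run on from the slice literal's braces.
	first := []txn.Op{{
		C:  "machines",
		Id: mID,
	}, {
		C:  "units",
		Id: uID,
	}}
	_ = first

	// Form 2: each element opens and closes on its own line.
	return []txn.Op{
		{
			C:  "machines",
			Id: mID,
		},
		{
			C:  "units",
			Id: uID,
		},
	}
}
```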
<natefinch> dimitern: I think I'm doing that first one
<natefinch> or are you asking me to un-indent the stuff inside?
<dimitern> natefinch, sorry, let me have a look at the review again
<dimitern> natefinch, ugh, I've seen it now - it's fine, so please ignore my comment and I'll drop the issue
<natefinch> ok :)  good, I thought I was going crazy or something, which is entirely possible running on 4 hours sleep :)
<dimitern> :)
 * dimitern reached EOD
<cherylj> ericsnow: you around?
<ericsnow> cherylj: yep
<cherylj> ericsnow: I'm working on bug 1439447 that we talked about last Friday and I need to get some advice...  natefinch had an idea that we could specify a new config option to indicate whether or not to use the proxy when contacting the bootstrap node
<mup> Bug #1439447: tools download in cloud-init should not go through http[s]_proxy <cloud-installer> <landscape> <juju-core:In Progress by cherylj> <juju-core 1.23:In Progress by cherylj> <https://launchpad.net/bugs/1439447>
<cherylj> ericsnow: and I've been digging through things for a while and I'm wondering if this flag should be part of proxy.Settings since it does relate to how we use the proxy...
<cherylj> proxy.Settings is part of juju/utils
<cherylj> and it would break backwards compatibility, but we branched utils because of the license changes I made anyway...
<ericsnow> cherylj: I thought about that too
<ericsnow> cherylj: no_proxy already covers this sort of
<ericsnow> cherylj: if it supported IP ranges it might be good enough
<cherylj> ericsnow: yeah, and another thought I had was to just add the state server IPs to the no_proxy list when this new flag is set
<ericsnow> cherylj: yep
<cherylj> ericsnow: then if we add new state servers, or remove them, then we need to prune the list
<cherylj> ericsnow: and if the flag is unset, do we remove them?  What if the user had manually added them and we just don't know?
<ericsnow> cherylj: proxy.Settings could also grow a new field like "ProxyPrivateAddresses"
<cherylj> ericsnow: yeah, that was my thought
<ericsnow> with that setting we wouldn't need to add addresses to no_proxy
<cherylj> ericsnow: are there other places (other than this curl command) that we would need to change to check this?  I'm not sure if there are other places communication is happening with the state servers through the proxy
<ericsnow> cherylj: looks like several packages use proxy.Settings: utils/apt/apt.go and a few places in juju core
<cherylj> ericsnow: okay, I'll take a look at those places
<cherylj> ericsnow: so you think adding the new field to proxy.Settings sounds good?  I think it will be the clearest way to add this config option.
<ericsnow> cherylj: ideally the proxy.Settings methods would handle that new field properly by adding the appropriate IP address
<ericsnow> cherylj: however, without IP range support that probably isn't doable
<cherylj> ericsnow: I had also thought that in environs/config/config.go, where we return a new proxy.Settings, we could check there if this flag was set, and add the IP addresses to the no_proxy list before returning
<ericsnow> cherylj: definitely
<cherylj> I'm not sure if that list is used in any permanent way
<cherylj> or if ProxySettings() is called every time we need the proxy info
<ericsnow> cherylj: it would be nice, though, if we had an implicit solution; that way every use of proxy.Settings doesn't need to handle the new field
<cherylj> yeah
<ericsnow> cherylj: for now we could just add the IP address of each state server on a private IP address to Settings.NoProxy
<ericsnow> cherylj: or prepend "HTTP_PROXY= HTTPS_PROXY= " to that one curl command if the IP in the curl command is private
<ericsnow> cherylj: in the interest of solving the 1.23 bug
<ericsnow> cherylj: then open a bug against 1.24 for the better solution
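Editor's note: a minimal sketch of the interim fix under discussion, assuming the proxy.Settings type from the juju/utils repo (whose NoProxy field is a comma-separated string); the helper name is invented.

```go
package demo

import (
	"strings"

	"github.com/juju/utils/proxy"
)

// excludeStateServers appends each state server address to NoProxy so
// traffic to those hosts bypasses the configured proxy.
func excludeStateServers(s proxy.Settings, addrs []string) proxy.Settings {
	parts := make([]string, 0, len(addrs)+1)
	if s.NoProxy != "" {
		parts = append(parts, s.NoProxy)
	}
	parts = append(parts, addrs...)
	s.NoProxy = strings.Join(parts, ",")
	return s
}
```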
<cherylj> ericsnow: yeah, I wasn't sure what the timeframe looked like for this bug.
<ericsnow> cherylj: we could probably argue for deferring it to 1.23.1, but it would be nice to have a fix for 1.23.0. :)
<cherylj> ericsnow: it's easy to determine if we're not the bootstrap node, and we could prepend the "HTTP_PROXY= ..." in that case.  Do I need to check if the IP is also private?
<ericsnow> cherylj: it depends on whether or not cloudinit for a new HA instance pulls from another state server
<ericsnow> cherylj: I'm guessing we can simply key off of w.mcfg.Bootstrap
<cherylj> ericsnow: that's what I was thinking
<ericsnow> cherylj: we would need to check whether or not that is set for HA instances
<ericsnow> natefinch: do you know if MachineConfig.Bootstrap (as used by cloudinit) is set for HA instances?
<natefinch> ericsnow: I do not.   I'd have to go look it up
<ericsnow> natefinch: no worries
<natefinch> gotta run for a bit, back in an hour-ish.
<cherylj> ericsnow: I don't think that it is set.  Just did a little test with ensure-availability and it wasn't...
<cherylj> I gotta take a quick lunch break.  brb
<ericsnow> cherylj: k
<perrito666> brb
<natefinch> >     >     >> >>> On Apr 5, 2015, at 11:28 AM, Someone with an annoying plaintext email program replied....
<wwitzel3> lol
<cherylj> ericsnow: do you know how I can get the script output like what's included in that bug, to see the actual commands being run?
<ericsnow> cherylj: look for cloudinit.log on the new instance
<ericsnow> cherylj: I don't recall exactly where the cloudinit script is written
<cherylj> ok
 * perrito666 had to go replace his front door lock after he had to dismantle the door to get out of the house
<perrito666> mondays...
<mup> Bug #1433577 changed: Vivid unit tests need to pass <ci> <test-failure> <vivid> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1433577>
<mup> Bug #1435860 changed: certSuite.TestNewDefaultServer failure <ci> <regression> <test-failure> <unit-tests> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.22:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1435860>
<mup> Bug #1435974 changed: Copyright information is not available for some files <juju-core:Fix Released by cherylj> <juju-core 1.22:Fix Released by cherylj> <juju-core 1.23:Fix Released by cherylj> <https://launchpad.net/bugs/1435974>
<mup> Bug #1437040 changed: unit test failure: TestNewDefaultServer.N40_github_com_juju_juju_cert_test.certSuite <ci> <unit-tests> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.22:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1437040>
<natefinch> thumper: got a minute?
<thumper> natefinch: otp with alexisb
<natefinch> thumper: k
<mup> Bug #1439880 changed: Container's interfaces are all on private networks instead of host's eth0 network <lxc> <maas-provider> <network> <oil> <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1439880>
<thumper> natefinch: ok, here now
<thumper> natefinch: do I need coffee?
 * thumper goes to get breakfast
<axw_> katco: Internet just dropped out, I guess because of the weather. Just FYI, in case I don't turn up at the standup
<katco> axw: k ty
<katco> axw: since it's just us, we can probably just coordinate over irc. i have a few questions for you, but they can wait until later (after my dinner)
<katco> axw: i'll just catch up with you over irc after dinner
<axw> katco: :/  sorry
<axw> katco: we can try again if you like. otherwise may be best to email me
<thumper> axw, katco: seems like your team is as small as mine this week :)
<axw> cosy
<thumper> axw: makes for quick standup calls :)
 * thumper relocates for better design thinking space
<katco> thumper: lol yep
<katco> axw: i'll send you an email
#juju-dev 2015-04-07
<axw> katco: thanks, sorry about that. little bit of rain and the network goes out
<rick_h_> thumper: good time off?
<thumper> rick_h_: yeah, a real nice break
<thumper> rick_h_: didn't get all I wanted to get done done...
<thumper> but them's the breaks
<rick_h_> thumper: never do
<thumper> the kids have decided they like D&D
<rick_h_> lol
<thumper> and I'm kinda required as a DM
<thumper> which takes time...
<rick_h_> but of course
<katco> thumper: awesome!
<rick_h_> to do properly
<katco> axw: no worries at all
<rick_h_> thumper: have time today for me to bug you with a few notes on things?
<thumper> they were real frustrated that I had to read up on the campaign first
<rick_h_> lol, yea been a long time here
<rick_h_> I'd have to go reread a lot of stuff heh
<thumper> rick_h_: yeah, at a cafe now so I can focus on design without kids around
<thumper> the rules changed since I last played
<thumper> over 20 years ago
<axw> thumper: nice :)  I never had anyone interested in playing D&D growing up
<rick_h_> thumper: k, ping if you get free/bandwidth and if not will bring it up later but had some things come up friday/today figured I'd mention
<katco> axw: same here.... somehow i was a huge nerd and never played even once
<axw> steve jackson books were there for me though
<thumper> rick_h_: need to mention with voice? or irc would work?
<rick_h_> thumper: we can try irc if you want
<thumper> rick_h_: my mind is pretty good at reading your writing in your voice
<rick_h_> lol awesome
<thumper> can also pick up most of the sarcasm
<thumper> :)
<thumper> rick_h_: if we find it not working, we could do a call later
<rick_h_> thumper: here, PM, or other side?
<thumper> rick_h_: if it is sensitive PM or other
<thumper> otherwise, I'm fine here
<rick_h_> I don't think it is but do that just to be safe I guess
<thumper> sure
<axw> katco: it's JUJU_DEV_FEATURE_FLAGS
<axw> you're missing the "DEV"
<katco> axw: thx
<axw> katco: left a few more comments. all the panics need to go, and init needs to move, otherwise trivials/suggestions
<katco> axw: yeah i was going to ask about the panics... wasn't quite sure what to do there
<katco> axw: i think i ran into a circular reference last time i tried to address the registration? i'll let you know here in a sec
<axw> katco: there should be no references from storage->openstack, only the reverse
<katco> axw: i see my misunderstanding... i think i was trying to register it in storage/provider/ something or other
<katco> axw: thanks for the great reviews. fresh PR up.
<rick_h_> and like that you can ruin someone else's day bwuhahaha :)
<thumper> katco: it was suggested that without the _DEV_ bit, it would be too tempting for clients to try to use them in prod settings
<thumper> which they would do
<katco> thumper: i completely agree with that lol
<rick_h_> thumper: I forgot a dinner tomorrow, can I move out sync back a few hours?
<thumper> rick_h_: back as in later...?
<rick_h_> thumper: yes
<rick_h_> later in the day 3hrs
<thumper> 3.5?
<thumper> or is that too late?
<rick_h_> that's peachy
<thumper> ok
 * thumper moves
<rick_h_> aunt's b-day, not on my work calendar doh!
<thumper> :)
<axw> katco: thanks, LGTM with a few more small fixes
<katco> axw: k thx... getting the dummy charm to allocate some storage will be a milestone
<axw> katco: did you see the email I sent out, with the reference to the hacked version of postgresql?
<axw> katco: or did you just want to test with something a bit leaner?
<katco> axw: just wanted to start lean so i can iterate quickly
<axw> sure, SGTM
<katco> axw: i will move onto postgres afterwards since i know that's what we would like to demo with
<axw> good practice to write charms anyway :)
<katco> axw: haha yeah :)
<cherylj> hey axw, do you know how to get the output of the cloudinit script like what's in bug 1439447?
<mup> Bug #1439447: tools download in cloud-init should not go through http[s]_proxy <cloud-installer> <landscape> <juju-core:In Progress by cherylj> <juju-core 1.23:In Progress by cherylj> <https://launchpad.net/bugs/1439447>
<axw> cherylj: one moment, I think I know, just checking
<axw> cherylj: the instance's cloud-config is /var/lib/cloud/instance/cloud-config.txt
<axw> it's not exactly a shell script, but has each of the commands in a YAML file
<cherylj> axw: Awesome, thanks!!
<axw> nps
<axw> cherylj: BTW regarding your last comment on that bug: yes, there will almost certainly be people that want it either way. In particular, manually provisioned machines may need to go through a proxy
<axw> I think otherwise they'd be going direct tho
<thumper> axw: yeah... kinda hard to deal with that where we have a mixed environment
<thumper> unsure just yet
<cherylj> axw: Thanks, providing an option is proving to be a bit difficult to implement.
<axw> thumper cherylj: I was just thinking, don't we currently expect all nodes to communicate directly? for the API anyway?
<axw> so... maybe just disabling the proxy isn't a problem
<axw> I think even for manual we require things to be directly routable atm
<thumper> that was my reasoning too
<cherylj> axw, thumper:   cool, so I'll just disable the proxy when we're downloading tools for the non-bootstrap node
<natefinch> weird, just got disconnected
<thumper> natefinch: I just assumed you were done :)
<axw> cherylj: sounds good. I can't recall if there's any case where a non-bootstrap node downloads tools directly from the Internet, but if so then don't disable for that
<axw> cherylj: I don't *think* there is such a case though
<natefinch> thumper: not quite :)
<thumper> axw: we were looking at ONLY changing the curl command for acquiring tools
<cherylj> axw: yeah, I don't think there is
<thumper> axw: there is a command line option '--noproxy=*' that /should/ work
<axw> thumper: yep, SGTM
<thumper> cherylj: any luck getting it working from the machine not using cloudinit?
<natefinch> axw: did my last two comments go through, about the review?
<axw> natefinch: looking
<natefinch> axw: it was just 15 minutes ago... possibly I was already disconnected and the client didn't tell me
<axw> natefinch: last comment I saw from you was from 9.5h ago
<axw> ah nope
<natefinch> ok :)
<natefinch> axw:  I tried moving the set password stuff in the agent code directly but I couldn't find the right time in the startup sequence where it would work.  Did you have an idea of where is appropriate?  I tried just above where I'd put the converter code, at the beginning of the state worker, and pretty near the beginning of all the workers.   I got different errors for each, but unfortunately didn't copy all of them down.
<axw> natefinch: lemme see, one minute
<axw> natefinch: did you try modifyin OpenAPIState? add an "else" after the "if usedOldPassword", and call entity.SetPassword(info.Password) in it
<axw> modifying*
<axw> modifyin'
<natefinch> axw: nope, but that's a good idea
<axw> natefinch: I expect the StateWorker would bounce until that passes, but I think it should keep retrying?
<natefinch> axw: it should
<mup> Bug #1440940 was opened: xml/marshal.go:10:2: cannot find package "encoding" <ci> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1440940>
<axw> natefinch: initially I was thinking stateStarter could not start StateWorker until the API has connected, but I'm not sure if that'll cause problems with API server initialisation when there's only one server ...
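Editor's note: a paraphrase of axw's OpenAPIState suggestion with every type invented for illustration; the real function lives in cmd/jujud and is nothing like this compact.

```go
package demo

// apiInfo stands in for the real API connection info.
type apiInfo struct{ Password string }

// entity stands in for the agent's state entity.
type entity interface {
	SetPassword(password string) error
}

// afterConnect sketches the proposed "else" branch: when the connection
// did NOT fall back to the old password, re-assert the configured
// password so the agent's stored credentials stay valid.
func afterConnect(info apiInfo, e entity, usedOldPassword bool) error {
	if usedOldPassword {
		return nil // the existing recovery path handles this case
	}
	return e.SetPassword(info.Password)
}
```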
 * natefinch tries it.
<cherylj> thumper: It seems to be a problem using the * wildcard.  If I do --noproxy *, it still tries to go through the proxy.  But if I explicitly list the state server, it bypasses the proxy
<thumper> haha
<thumper> ha
<thumper> hmm
<thumper> I know what it is
<cherylj> oh?
<thumper> the whole command line is being pushed through bash
<thumper> * is expanded to all the filenames in the directory
<cherylj> I figured it was some substitution like that
<thumper> try wrapping it in quotes so bash doesn't expand it
<thumper> --noproxy "*"
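Editor's note: a runnable demonstration of the expansion thumper describes, using echo in place of curl so it is safe to try anywhere.

```go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// Unquoted, bash expands * to the filenames in the current directory...
	out, _ := exec.Command("/bin/bash", "-c", "echo curl --noproxy *").Output()
	fmt.Printf("unquoted: %s", out)

	// ...while a quoted "*" reaches the command as a literal asterisk,
	// which is what curl's --noproxy needs.
	out, _ = exec.Command("/bin/bash", "-c", `echo curl --noproxy "*"`).Output()
	fmt.Printf("quoted:   %s", out)
}
```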
<cherylj> that worked, yay!
<thumper> w00t
<cherylj> let me try it in the actual code now
<thumper> ok
<cherylj> thumper: it worked!  huzzah!
<thumper> awesome
<cherylj> I'm going to turn in for the evening and write the tests for the change in the morning.
<thumper> kk
<katco> axw: when i do juju storage pool, i don't see "cinder" as an option... what have i done wrong?
<katco> axw: juju storage pool list rather
<axw> katco: you won't see any pools; there's an implicit pool for each provider, but it's not listed
<axw> (perhaps we should change that)
<katco> Added charm "local:trusty/hello-world-charm-3" to the environment.
<katco> ERROR cannot add service "hello-world-charm": reading pool "cinder": settings not found
<katco>  
<axw> huh
<katco> that was from: juju deploy local:trusty/hello-world-charm --storage="foo=cinder,1MB"
<axw> katco: you *bootstrapped* with the feature flag enabled right?
<katco> yeah
<katco> export JUJU_DEV_FEATURE_FLAGS=storage
<katco> and then juju bootstrap --upload-tools
<katco> this is on canonistack
<katco> i'll try tearing it down and retry just in case
<axw> katco: that's weird, the error annotation implies that the error is not a NotFound error, but the cause suggests it is
<axw> katco: see storage/poolmanager/poolmanager.go
<katco> axw: k
<axw> katco: oh, there's a bug in provider/openstack/init.go  -- not sure if it's related
<axw> katco: there's an existing call to registry.RegisterEnvironStorageProviders
<axw> katco: remove the first one, which is saying that openstack supports no custom storage providers
<axw> hmm don't think it's related tho, looks like it'll accumulate
 * thumper heads home
<katco> axw: lol stale binary >.<
<axw> katco: :)
<katco> axw: so it's pending... it should eventually show up?
<axw> katco: yes, once the instance is created, the storage provisioner should try to create it
<axw> katco: we need to add proper status support to storage, it's a little bit difficult to debug what's going on at the moment
<katco> axw: agent is started, running hook config-changed
<katco> axw: and storage is still pending. hm.
<katco> axw: i'm getting some decent errors in debug-log at least
<katco> volume "0" not provisioned
<katco> getting storage source "cinder": requisite configuration was not set: auth-url not assigned
<katco> axw: ah i see... there are some config options i need to set to tell it how to authenticate to canonistack. where do i do that for storage? i.e. how does it get passed into VolumeSource(...)?
<axw> katco: hm, why are those pool config attributes? shouldn't they just come from the env config?
<axw> I think I glossed over that in my review
<axw> katco: sorry, in the VolumeSource method, it should be using environConfig to get the credentials to open a nova/cinder session
<katco> i'm getting them from *storage.Config
<katco> axw: ah k
<axw> katco: see ec2/ebs.go for inspiration if required
<katco> i shall seek inspiration :)
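Editor's note: an invented skeleton of the shape axw describes; none of these names are juju's. The point is the split: credentials for the nova/cinder session come from the environ config, while the pool's *storage.Config carries only storage-specific attributes.

```go
package demo

// environConfig stands in for the provider's environment config, where
// cloud credentials live.
type environConfig struct {
	AuthURL, Username, Password, TenantName string
}

// poolConfig stands in for *storage.Config: pool-level attributes only,
// nothing authentication-related.
type poolConfig map[string]string

type cinderVolumeSource struct {
	authURL string // a real implementation would hold an authenticated client
}

// volumeSource opens the session with env-level credentials and ignores
// the pool config for authentication purposes.
func volumeSource(env environConfig, _ poolConfig) *cinderVolumeSource {
	return &cinderVolumeSource{authURL: env.AuthURL}
}
```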
<katco> axw: fyi the whole provider/* vs. storage/provider/* screwed me up for a long time
<katco> axw: i kept wondering if you guys just hadn't checked in the ec2 provider
<axw> katco: sorry. the reason it lives in provider/ is because it's closely tied to the environ provider
<axw> non IaaS storage providers will go in storage/provider/
<katco> axw: totally my fault
<katco> axw: i need to turn in, but this looks like we're pretty close. thanks for the help :)
<axw> katco: cool :)  no worries, talk to you tomorrow
<katco> axw: have a good day!
<axw> cheers, good night
<natefinch> axw: btw, that suggestion to put it in OpenAPIState works great.
<axw> natefinch: awesome
<natefinch> axw: added some tests and finished the cleanup & suggestions you had.   Would love to have you re-review.  The tests aren't super thorough, but they're similar to what already existed, so I don't feel too bad (I do still feel bad, but I also need to get this committed).
<natefinch> going to bed.  Good night all.
<axw> natefinch: yep, just about to hit the button. thanks for the updates
<natefinch> axw: thanks for the review.  It's been a big help.
<axw> no worries
<dimitern> morning all
<voidspace> morning all
<dooferlad> dimitern: moeninf
<dooferlad> dimitern: morning rather
<dooferlad> ok, the coffee hasn't hit
<dimitern> voidspace, dooferlad, morning!
<voidspace> o/
<voidspace> coffee is a good idea
<TheMue> morning o/
<davecheney> o/ europe!
<TheMue> davecheney: heya, Mister Vienna
<dimitern> dooferlad, I've landed your branches, but we need to clean up a few bits
<voidspace> davecheney: what are you doing in Europe - just here early for the sprint or at a conf?
<dooferlad> dimitern: yea, I spotted that. Thanks.
<dimitern> voidspace, I've tried to land yours but one had conflicts after the first one landed
<voidspace> dimitern: yep, fixed and retried
<voidspace> dimitern: just now
<voidspace> dimitern: creating versions of those fixes for master
<dimitern> voidspace, sweet!
<voidspace> dimitern: first one applied cleanly and PR created
<voidspace> dimitern: second one the patch didn't apply cleanly, looking at now
<voidspace> dimitern: thanks for landing the other one
<dimitern> voidspace, I'd appreciate if you have a look at bug 1439880 to see if my analysis is correct
<mup> Bug #1439880: Container's interfaces are all on private networks instead of host's eth0 network <lxc> <maas-provider> <network> <oil> <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1439880>
<voidspace> oh god
<voidspace> dimitern: ok
<dimitern> voidspace, that's most likely the same issue (not found subnets: [])
<dimitern> which you fixed already
<voidspace> dimitern: ok, I hope so
<voidspace> dimitern: looking
<dimitern> TheMue, heya, any update on your maas issue?
<dimitern> voidspace, LGTM on http://reviews.vapour.ws/r/1391/ btw
<voidspace> dimitern: thanks
<TheMue> dimitern: I'm in dialog with rvba; lp issue is created. Now I at least want to propose my little fix that doesn't stop with an error but returns the default values instead
<voidspace> dimitern: yeah, those logs make it seem pretty likely that this is the same issue
<voidspace> dimitern: fetching an interface with no subnet and attempting to use that
<voidspace> dimitern: we'll see when he tries latest 1.23
<dimitern> voidspace, yep, sounds good
<dimitern> TheMue, ok, but please follow through with rvba on the issue
<TheMue> dimitern: yes, sure, just added a comment
<dimitern> voidspace, dooferlad, and BTW I'd appreciate a review on http://reviews.vapour.ws/r/1390/
<dooferlad> dimitern: *click*
<rvba> TheMue: the screenshot you pasted on bug 1439831 helped a great deal… but do you have access to the full console log?  It's very possible that the cause of the problem could be identified if we had the full log.
<mup> Bug #1439831: Missing lshw breaks cloudinit <MAAS:Incomplete> <https://launchpad.net/bugs/1439831>
<davecheney> voidspace: sprinting with aram for a week
<davecheney> to get some arm64 stuff done before the close of the change window
<voidspace> davecheney: cool
<TheMue> rvba: I really would like to. sadly I cannot do a cut'n'paste with the vmware console at that time. as far as I know the vmware tools have to be installed at a later time
<voidspace> davecheney: Vienna is on my list of "must visit sometime"
<dimitern> davecheney, hey, so arm64 support will be released with golang-gc 1.5 ?
<voidspace> dimitern: enjoy the sprint
<dimitern> I wish :)
<TheMue> rvba: but I'll see what I can do
<rvba> Thanks.
<dimitern> TheMue, have you tried using the vmware CLI tools to capture the console log of the VM ?
<dimitern> it should be possible
<voidspace> davecheney: enjoy the sprint
<TheMue> dimitern: they can only be installed once the OS is installed
<dimitern> TheMue, no, I don't mean the vmware tools that you install on the vm, but the command-line vmware client you can use from the host
<TheMue> dimitern: never heard of it, have to search for it
<dimitern> TheMue, vmware fusion seems to come with a vmrun command - https://www.vmware.com/pdf/vix162_vmrun_command.pdf which has some interesting features, like running a script inside the guest and copying a file from guest to host - do you have vmrun?
<TheMue> dimitern: just took a look, seems to be in the application package and I only have to set it up. sounds like a good way, indeed.
<davecheney> voidspace: it's nice here
<davecheney> very laid back
<davecheney> no forms
<davecheney> dimitern: yes, arm64 will ship in go 1.5
<davecheney> that's what aram and I are doing this week
<dimitern> davecheney, awesome!
<dimitern> davecheney, do you know anything about native ppc64 support in gc-go as well?
<davecheney> dimitern: same, ppc64 will ship in go 1.5
<dimitern> davecheney, \o/ great! I can't wait not to have to care about gccgo/ppc64 bugs :)
<davecheney> dimitern: i'm looking forward to your support when I fight for moving everyone up to 1.4 next week
<dimitern> davecheney, count on it! :)
<dimitern> dooferlad, thanks for the review btw
<dimitern> dooferlad, as you progress with the implementation of "space list" you'll need to add a type for Space in params/network (like I did for Subnet)
<dimitern> dooferlad, or actually, hold on.. you won't need that - you already have all the info (e.g. list of all spaces - like AllSpaces in cmd/juju/subnet), and the rest should come from a list of params.Subnet for each space's associated subnets
<dimitern> dooferlad, most of the rendering code (and tests) could be reused between space list and subnet list, but I'd suggest copying it first and when done
<dimitern> dooferlad, ...refactoring it to minimize duplication
<dimitern> dooferlad, voidspace, TheMue, last step of the subnets CLI, please take a look http://reviews.vapour.ws/r/1393/ (esp. proof-reading)
<lazyPower> o/ Good Morning - there's a member in #juju looking for which ports are required to be open on the juju state server. Does anyone happen to know these ports off the top of their heads?
<dimitern> lazyPower, 17070
<dimitern> tcp
<dimitern> lazyPower, that's the api server, not state server (mongo) which is 37070/tcp, but it's not accessible anyway
<lazyPower> ok, so just 17070 and 22 - the rest should be fine in state: closed?
<dimitern> lazyPower, most commonly, yeah
<lazyPower> right on, cheers dimitern :)
<dimitern> lazyPower, no prob :)
<jam> axw: katco: I've updated http://reviews.vapour.ws/r/1378/ for now I'm just commenting out the 1 non deterministic test. I'll try to work with william to get it tested again
<voidspace> dimitern: if I add issues and then hit "Ship It", does it automatically become "Fix it, then ship it"?
<voidspace> dimitern: as I can't specifically see an option for that
<voidspace> dimitern: anyway, you have a review
<jam> wow.... running "worker/uniter" tests causes the test suite to rebuild cmd/jujud/jujud which consumes about 700MB just for the 6l linker...
<jam> so much for running the test suite on a 1GB VM
<voidspace> dimitern: hmmm... no, it doesn't. Ah well. And I seem to have done it twice :-)
<wwitzel3> jam: so it rebuilds it everytime, even if there are no changes?
<jam> wwitzel3: I believe the issue probably has to do with JujuConnSuite building tools and "uploading" them to the environment as part of default setup. I don't quite understand why it always rebuilds jujud, but it is probably building everything in a temp dir (I would guess)
<wwitzel3> jam: hrmm, wonder if there is a way to avoid that. Rebuilding everything is not an insignificant amount of time and with our test suite already being slow.
<jam> wwitzel3: well I only noticed cause i did "go test -c" and then "./uniter.test & ./uniter.test" 10 times and my VM died to swapping :)
<wwitzel3> jam: heh
<voidspace> dimitern: in the "Juju Container Addressability" doc, is the greyed out section (from the bottom of page 5)
<voidspace> dimitern: there for historical reasons only, or do I need to go through that as well?
<dimitern> voidspace, it's historical only
<voidspace> dimitern: ok
<dimitern> voidspace, sorry, I just saw your earlier messages
<voidspace> dimitern: I deleted one out of date bullet point and I'm adding an additional one in the "common features" section about the addresser worker
<dimitern> voidspace, so AIUI "Fix it, then ship it" turns to "Ship it" once all issues are resolved/dropped
<dimitern> automatically
<voidspace> dimitern: right, but that wasn't the question
<voidspace> dimitern: it was how to post a "Fix it, then ship it" in the first place
<voidspace> dimitern: I posted a "Ship It" and it was just a "Ship It"
<voidspace> :-)
<dimitern> voidspace, ah :) well the "Fix it" part appears when you add any issues and tick the ship it box
<voidspace> dimitern: ah!
<voidspace> dimitern: thanks
<voidspace> I didn't post two "Ship It" reviews - wwitzel3 posted one within the same minute as me...
<voidspace> dimitern: hmmm... I have a horrible feeling
<voidspace> dimitern: we have the watcher, worker, apiserver ReleaseContainerAddresses method, api client method
<voidspace> dimitern: but it doesn't look *to me* like destroying a container calls the ReleaseContainerAddresses method
<voidspace> unless I'm missing something
<voidspace> I really thought I did that...
<voidspace> dimitern: the provisioner should be calling it in StopInstance
<dimitern> voidspace, how about adding a few ops in state around machine destruction?
<voidspace> dimitern: so when the machine is destroyed in state release the addresses
<voidspace> dimitern: then we don't need the api
<voidspace> and the logic is simpler
<voidspace> we have the machine ID, just find all IP addresses with that machine ID and mark Life as dead
<voidspace> no need for the unit agent to do it, so no need to check permissions
<voidspace> dimitern: I'll create an issue and a kanban card *sigh*
<dimitern> voidspace, thanks
<dimitern> voidspace, yes, that sounds like a better approach - the provisioner API RCA() can still stay (for now at least)
<voidspace> dimitern: heh, you just unassigned me from a three year old bug
<dimitern> voidspace, I think it's safer to do it in state though
<voidspace> dimitern: yep
<dimitern> voidspace, are you still working on it? :)
<voidspace> dimitern: I might have got round to it...
<dimitern> voidspace, well, in that case feel free to reassign yourself then :)
<voidspace> dimitern: I think I'll skip it...
<sinzui> mgz, dimitern Can either of you review http://reviews.vapour.ws/r/1394/
<mgz> sinzui: on it
<mgz> sinzui: lgtm
<sinzui> thank you mgz
<voidspace> aargh, spotify killed unity again
<voidspace> mouse no longer works, events blocked
<voidspace> I can alt-tab and type
<voidspace> but can't use a browser very well
<voidspace> will reboot and go on lunch, create tickets (and work on them) after that
<jam> sinzui: abentley: i see "inc-1.23-beta4" in the pipeline, and I'm trying to land the "remove leader-election flag for 1.23" as well.
<jam> It bounced once due to AddRemoveSet
<jam> but it should land
<sinzui> jam, yes, 1.23-beta3 was sent to the builders. we will release a 1.23-beta4 later this week or next if need be
<sinzui> dimitern, bug 1427814 should be High, or we should remove it from the milestone. We don't commit Medium bugs to deadlines.
<mup> Bug #1427814: juju bootstrap fails on maas with when the node has empty lshw output from commissioning <bootstrap> <maas> <maas-provider> <network> <juju-core:Triaged by themue> <juju-core 1.23:Triaged by themue> <https://launchpad.net/bugs/1427814>
<dimitern> sinzui, ok, I'll retriage it as high then
<sinzui> thank you dimitern
<mup> Bug #1427508 changed: cmd/jujud/agent: test failure <intermittent-failure> <tech-debt> <test-failure> <juju-core:Fix Released> <juju-core 1.23:Fix Released> <https://launchpad.net/bugs/1427508>
<mup> Bug #1438168 changed: juju 1.23 doesn't release IP addresses for containers <juju-core:Fix Released by mfoord> <https://launchpad.net/bugs/1438168>
<mup> Bug #1438683 changed: Containers stuck allocating, interface not up <add-machine> <cloud-installer> <landscape> <maas-provider> <network> <juju-core:Fix Released by mfoord> <https://launchpad.net/bugs/1438683>
<mup> Bug #1438820 changed: IP address life field upgrade step and addresser worker don't play well together <juju-core:Fix Released by mfoord> <https://launchpad.net/bugs/1438820>
<natefinch> ericsnow, wwitzel3: note the moved standup
<natefinch> (in an hour)
<ericsnow> natefinch: k
<perrito666> ericsnow: are you not in pycon?
 * perrito666 looks at the calendar and accuses it of lying
<ericsnow> perrito666: leaving for the airport in 4 hours
<perrito666> ahh :D makes sense
<perrito666> my calendar, given my tz, has shown me things in the wrong day before
<ericsnow> perrito666: the calendar's right, it just doesn't show today as partial :)
<wwitzel3> ahh, I see that now
<wwitzel3> hangout is the best max fan speed tool for Linux, even steam doesn't get it cranked like hangout
<perrito666> wwitzel3: get hardware accel for it
<ericsnow> natefinch: one-on-one?
<natefinch> ericsnow: I think we can skip it unless there's something you need
<ericsnow> natefinch: nah
<natefinch> ericsnow: cool.  See you in Nuremberg.  Have fun at pycon.
<ericsnow> dimitern: thanks for hopping onto that vmware networking thread
<voidspace> natefinch: can you remove me from the list of attendees to moonstone standups please
<voidspace> natefinch: I don't think I can do it...
<ericsnow> dimitern: I think we have a good enough plan going forward, but I want to be sure I'm understanding the juju networking model correctly
<voidspace> natefinch: I keep getting updated invitations to your standups :-)
<ericsnow> voidspace: were we that bad? <wink>
<voidspace> ericsnow: yep <wink>
<wwitzel3> lol
<voidspace> ericsnow: you at pycon yet?
<voidspace> :-)
<ericsnow> voidspace: flying out in a few hours
<voidspace> ericsnow: have fun, I'll miss everyone :-(
<ericsnow> voidspace: was only just told that the posters can be 4x8 (feet)
<wwitzel3> lol
<voidspace> hah
<ericsnow> voidspace: but it does allow me to fit more stuff on there :)
<voidspace> good
<wwitzel3> less is more for posters
<dimitern> ericsnow, sure, I'll dig into the discussion so far and respond (but most likely tomorrow)
<ericsnow> wwitzel3: pshaw
<ericsnow> dimitern: np, thanks!
<wwitzel3> people want to be drawn in by the poster; a core concept on the poster that is easy to follow will generate interest and questions
<voidspace> just have a picture of a cloud
<wwitzel3> a big diagram full of stuff = empty poster session
<ericsnow> wwitzel3, voidspace: apparently there are also accommodations for a laptop, so I may demo some stuff too
<ericsnow> wwitzel3: true
<ericsnow> voidspace: :)
<voidspace> ericsnow: I would have a predeployed local environment with gui
<wwitzel3> yeah, was just going to say, use the GUI :)
<ericsnow> voidspace: more homework? :)
<voidspace> heh, it's easy...
<ericsnow> voidspace: I'm probably going to set up openstack
<voidspace> wowzer :-)
<voidspace> so, deploy openstack with juju and then deploy to that openstack with juju
<ericsnow> voidspace: :)
<ericsnow> voidspace: TBH, I wasn't planning on using a laptop, but it's so tempting
<voidspace> being able to show a deployed environment through the gui is nice
<voidspace> showing the relationships
<ericsnow> voidspace: yep
<voidspace> gmail has decided, fairly reasonably in my opinion..., that all those blessed/cursed emails are spam...
<perrito666> voidspace: I added a filter
<voidspace> perrito666: I have a thunderbird filter - I use imap
<ericsnow> voidspace: the demo openstack page that lazyPower posted inspired me
<perrito666> voidspace: saying that these were not spam :p because I assumed they were going to be taken as such
 * lazyPower perks up
<voidspace> perrito666: but gmail is moving them to spam before I get to them
<lazyPower> o/
<perrito666> voidspace: you can nevertheless add a filter in gmail
<voidspace> perrito666: heh
<voidspace> perrito666: I could do...
<perrito666> voidspace: it's no more than 5 clicks
<voidspace> perrito666: I wish gmail would leave my damn email alone and let me handle it thank you very much
<perrito666> voidspace: well, I haven't looked in my canonical acct, but in my regular acct I get a lot of spam so I am fairly happy with gmail's filter
<perrito666> but, I use gmail as a client
<wwitzel3> voidspace: you can disable the spam filtering completely
<voidspace> yeah, I don't like web clients
<perrito666> voidspace: I like consistent clients :p so with that I get the same client on my phone, my tablet and my laptop
<voidspace> it's sucky - hey, but at least it's sucky everywhere!
<perrito666> voidspace: and I pretty much like the ui so I would really use a desktop client if it provided the same interface
<voidspace> right
<voidspace> I prefer the thunderbird ui
<perrito666> the idea of labels instead of folders is the killer feature for me
<voidspace> that's the killer un-feature for me!
<voidspace> I like folders dammit
<ericsnow> I don't mind the labels, but the filters drive me nuts
<ericsnow> you can't prioritize them
<ericsnow> (they *all* get run)
<wwitzel3> voidspace: if you create a filter for Has Words: is:spam and check never mark as spam, you essentially disable the spam filtering
<ericsnow> wwitzel3: nice :)
<wwitzel3> voidspace: I did that for my Canonical email after I missed a few that were sent to spam.
<perrito666> voidspace: I am a person with difficulties making choices so If I can put an email in many folders it solves my problem :p
<dimitern> TheMue, hey, any progress on the maas issue? Did you manage to use vmrun successfully?
<TheMue> dimitern: hehe, just wanted to ping you. vmrun does not help, it is only a CLI tool doing the same as the UI. when wanting to get into the running machine you need a user/password.
<natefinch> voidspace: ha, sorry, I can fix it.
<TheMue> dimitern: but I'm currently setting up a serial debugging console hoping to grab the output there
<TheMue> dimitern: found some docs at vmware and in forums on how to set it up
<dimitern> TheMue, that's an option yeah
<TheMue> dimitern: I'm not aware what I'll see there, but I've got almost no other idea
<TheMue> dimitern: one option I found too is to simply do a screen recording, creating a kind of console output film
<TheMue> dimitern: :)
<voidspace> wwitzel3: hah, nice - thanks
<TheMue> dimitern: if only this initial ubuntu user would have a password to login *sigh*
 * TheMue is frustrated
<voidspace> natefinch: thanks :-)
<dimitern> TheMue, if only cloud-init did its job :)
<perrito666> natefinch: standup?
<TheMue> dimitern: exactly, then a simple ssh would be no problem
<natefinch> sinzui: what's up with https://bugs.launchpad.net/juju-core/+bug/1440940  ??  It looks like it's a build environment problem, since it's not able to find the stdlib's encoding package.
<mup> Bug #1440940: xml/marshal.go:10:2: cannot find package "encoding" <ci> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1440940>
<sinzui> natefinch, I hope a developer can explain what is up so that I can fix it. We haven't changed packaging on that machine in months
<sinzui> natefinch, you think encoding is provided by a specific ubuntu package?
<natefinch> sinzui: encoding is a go package that is part of the go standard library
<voidspace> yeah, that's odd
<sinzui> natefinch, stilson-07 only had kernels in /usr/src
<natefinch> sinzui: seems like goroot is set incorrectly there
<natefinch> sinzui: $GOROOT seems to be set to /usr/
<sinzui> natefinch, what should goroot be on a machine that uses gccgo?
<sinzui> mgz can you and natefinch sort out what stilson-07 needs (that I presume stilson-08 has) to allow us to run unit tests? I am at a critical moment in a release that affects production streams
<natefinch> sinzui: no problem
<mgz> sure
<natefinch> sinzui, mgz:  goroot doesn't need to be set, the go tool figures it out on its own for the most part.
<mgz> natefinch: so, you say I unset that and it compiles?
<sinzui> mgz, since 1.23 and 1.22 work, I think something is different about the package or the Makefile
<natefinch> mgz: I am 100% sure there's no possible way that anything else could possibly be wrong
<mup> Bug #1441206 was opened: Container destruction doesn't mark IP addresses as Dead <juju-core:In Progress by mfoord> <https://launchpad.net/bugs/1441206>
 * natefinch hopes they get the sarcasm
<sinzui> mgz, tarball, not package. stilson-08 takes the same tarball and made packages.
<natefinch> mgz: unsetting it should work, but I can't be sure there's nothing else wrong
<mgz> natefinch: paste.ubuntu.com/10763144
<natefinch> mgz: what does `go env` return ?
<mgz> the reason trunk is broken and 1.23 is not is that the xml package from the stdlib was forked as a dep
<mgz> and that's what's not compiling
<mgz> sinzui: ^that change is on trunk only
<mgz> natefinch: GOROOT=/usr
<natefinch> mgz: it sounds like the installation of go is messed up
<mgz> I can purge it and reinstall
<mup> Bug #1441206 changed: Container destruction doesn't mark IP addresses as Dead <juju-core:In Progress by mfoord> <https://launchpad.net/bugs/1441206>
<mgz> natefinch: stilson-08 also says /usr for GOROOT through go env
<natefinch> mgz: how did you guys get Go installed on those machines?
<mgz> natefinch: apt-get install
<natefinch> mgz: oh, yeah, I guess that's actually the correct goroot...  I had forgotten how it works when you use apt-get
<rogpeppe1> natefinch: hiya
<natefinch> rogpeppe1: howdy
<rogpeppe1> natefinch: just wondering if you know how to quote an arbitrary argument passed to cmd.exe; we're trying to write a platform-general "open link in web browser" package
<natefinch> mgz: still, that means there should be a /usr/src/pkg/encoding/encoding.go
<rogpeppe1> natefinch: and i don't trust the implementation https://github.com/toqueteos/webbrowser/blob/master/webbrowser.go
<mgz> rogpeppe1: just look at the python module and do that?
<rogpeppe1> mgz: link?
<alexisb> dimitern, ping
<dimitern> alexisb, pong
<mgz> rogpeppe1: there's a function in subprocess.py that does quoting, and webbrowser does what you're doing
<rogpeppe1> mgz: it's not just normal quoting unfortunately
<rogpeppe1> mgz: it's quoting past cmd.exe, which is somewhat harder, i think
<mgz> rogpeppe1: `vi /usr/lib/python2.7/subprocess.py`
 * perrito666 suggests rogpeppe1 asks gsamfira 
<rogpeppe1> mgz: not really keen on that implementation
<natefinch> mgz: hmm... except that installing it locally doesn't do that either....
<rogpeppe1> mgz: it doesn't seem to respect user web browser preferences, although i may not have seen that bit
<mgz> rogpeppe1: /list2cmdline
<natefinch> mgz: ahh... when I install go from apt, it gives me goroot = /usr/lib/go
<mgz> rogpeppe1: it looks at the BROWSER envvar on nix, not sure if it looks up the registry setting or whatever on windows
<mup> Bug #1441206 was opened: Container destruction doesn't mark IP addresses as Dead <juju-core:In Progress by mfoord> <https://launchpad.net/bugs/1441206>
<rogpeppe1> mgz: isn't list2cmdline just for executing a command directly?
<mgz> I also don't like webbrowser much, but it'll cover a bunch of cases you don't think of otherwise
<rogpeppe1> mgz: whereas what we are thinking of doing is running cmd /c start $url
<rogpeppe1> mgz: and in that case there are a bunch of cmd.exe metachars that would need quoting (& being the most obvious)
<mgz> rogpeppe1: see also `if shell:` in _execute_child
<rogpeppe1> mgz: link?
<mgz> same file
<mgz> further down
<ericsnow> rogpeppe1: did you see the shell package in the utils repo?
<ericsnow> rogpeppe1: it has code for shquoting
<rogpeppe1> mgz: i don't see it in https://hg.python.org/cpython/file/2.7/Lib/webbrowser.py
<rogpeppe1> ericsnow: ah, in windows too, cool
<ericsnow> rogpeppe1: both powershell and cmd.exe, I believe
<ericsnow> rogpeppe1: powershell is easier, BTW
<natefinch> rogpeppe1:  wow, I was totally confused about what you meant by quoting
<natefinch> rogpeppe1: would just putting quotes around it not work?
<rogpeppe1> natefinch: probably not
<rogpeppe1> natefinch: because they'll get quoted by the Go syscall quoting mechanism, i think
<natefinch> rogpeppe1: Oh I see
<mgz> rogpeppe1: you were asking about the subprocess function, that file
<rogpeppe1> i think this (from winCmdEscapeMeta) has what i need: 	const meta = `()%!^"<>&|`
<natefinch> rogpeppe1: http://stackoverflow.com/questions/1327431/how-do-i-escape-ampersands-in-batch-files
<rogpeppe1> natefinch: yeah, i know about ampersands
<rogpeppe1> natefinch: it was all the other stuff that you can find in urls that i'm concerned about
<natefinch> rogpeppe1: right ok
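Editor's note: a naive sketch built around the meta set rogpeppe quoted, escaping each metacharacter with cmd.exe's ^ escape; real cmd quoting (the %var% expansion rules in particular) has more edge cases than this.

```go
package demo

import "strings"

// winCmdEscape prefixes each cmd.exe metacharacter with ^ so that, for
// example, an & inside a URL survives `cmd /c start <url>` instead of
// splitting the command in two.
func winCmdEscape(s string) string {
	const meta = `()%!^"<>&|`
	var b strings.Builder
	for _, r := range s {
		if strings.ContainsRune(meta, r) {
			b.WriteRune('^')
		}
		b.WriteRune(r)
	}
	return b.String()
}
```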
<natefinch> sinzui, mgz: can we mark #1440940 as not blocking trunk?  It's not a code issue that dev can fix.
<mup> Bug #1440940: xml/marshal.go:10:2: cannot find package "encoding" <ci> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1440940>
<natefinch> of that I'm 100% sure
<mgz> natefinch: have you successfully built on gccgo locally with trunk?
<natefinch> mgz: not recently, but I can give it a go
<natefinch> (heh)
<mgz> natefinch: I think unblocking trunk when no one knows if it actually builds or not on one of our supported platforms is probably not a good idea
<natefinch> mgz: I actually had thought we decided that errors with gccgo builds weren't going to block trunk, but maybe I misunderstood.   I am compiling with gccgo now, btw.
<sinzui> natefinch, I think the issue is that Ubuntu requires this to work and we need to prove it is a compiler issue that they manage, not a code issue in Juju
<natefinch> sinzui: well, it's certainly a compiler issue.  It can't find a package from the standard library.   FWIW, I just finished a build of trunk with gccgo with no issues
<sinzui> mgz: This atrociously long log shows gccgo did compile the ppc64el package on stilson-08
<natefinch> sinzui: my suggestion is to remove and reinstall golang on that machine... maybe the installation got corrupted somehow
<sinzui> natefinch, but isn't this about xml being forked? 1.22 and 1.23 build fine
<natefinch> sinzui: the error from the compiler clearly says that it can't find the encoding package from the standard library. it's not about the forked xml package... that's just the leaf-most package that happens to import encoding, and therefore shows the error.
<natefinch> sinzui: those other branches build fine on the same machine?
<sinzui> natefinch, yes
<mgz> natefinch: it really is about the change that introduced that, it's when it started breaking
<natefinch> sinzui: hmm.... so before, we didn't import encoding directly in our code at all, only indirectly through importing xml.  Now we do it directly.  I wonder if that somehow is triggering this issue.
<sinzui> neither stilson has a golang* package installed
<natefinch> sinzui: where is it getting the go tools from, then?
<sinzui> mgz, is xml defined in dependencies.tsv?
<sinzui> natefinch, I think they get it from gccgo
<mgz> sinzui: our forked version is, for trunk
<sinzui> mgz, yes, which is why I think Ubuntu can say the issue is in Juju's code
<natefinch> sinzui: the problem is that the standard library package "encoding" isn't there.  That's not a bug in juju's code.
<mgz> yeah, I wanted to grab dave or someone to work out what about the root package import is unhappy on gccgo
<mgz> natefinch: but all the encoding/blah imports are fine
<sinzui> mgz, stilson-08 will work because it is packaging...debian deps will ensure the fakeroot will get the source.
<natefinch> mgz, sinzui: can one of you try copying this program to a file on that machine and doing gccgo run <thatfile>?  http://play.golang.org/p/6b1LdO_13i
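Editor's note: the playground paste is not preserved in the log; a guessed equivalent of the 12-line reproducer is any program that imports the stdlib "encoding" package directly, e.g.:

```go
package main

import (
	"encoding"
	"fmt"
)

// Referencing the interface forces the import to resolve.
var _ encoding.TextMarshaler

func main() {
	fmt.Println("stdlib encoding import resolved")
}
```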
<sinzui> mgz stilson-07 doesn't have golang-go src and it hasn't needed it because I think the tarball gets all the golang-go packages needed. GOROOT is irrelevant for the tarball because the static link rules require we provide everything that ubuntu needs to audit
<sinzui> mgz, so I wonder if xml is being removed when the tarball is created because it isn't stated to be required
<mgz> natefinch: yeah, that does not compile
<mgz> sinzui: we shouldn't be tarring up stdlib bits
<mgz> sinzui: so, I'm pretty sure the tarball is just fine, it has the forked xml
<sinzui> mgz, then lets install std libs on stilson-07
<sinzui> mgz, they are not there and I don't think they ever were
<mgz> sinzui: okay, I think there is an actual bug here, but I'm fine just sticking golang on if that's enough
<natefinch> mgz, sinzui:  certainly if that 12 line program doesn't compile, it's not a juju issue.   Something is fubared with the environment.  It seems like installing the go compiler on the machine on which you intend to compile go code should be uncontroversial ;)
<natefinch> I have to run for a couple hours to do tax stuff.
<sinzui> natefinch, mgz, apt tells me that golang-src isn't available for ppc64el
<natefinch> lol
<natefinch> sinzui: it's pretty easy to build from source
<sinzui> natefinch, I really don't know what provided xml or encodings in the past
 * natefinch shrugs
<natefinch> I gotta run, sorry.    http://golang.org/doc/install/source
<sinzui> natefinch, the machine is setup using the juju Makefile so we don't create custom setups
<mup> Bug #1423936 changed: Juju backup fails when journal files are present <backup-restore> <cts> <juju-core:Fix Released by niedbalski> <juju-core 1.22:Fix Released by niedbalski> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1423936>
<mup> Bug #1436191 changed: gce: bootstrap instance has no network rule for API <firewall> <gce-provider> <juju-core:Fix Released by dimitern> <juju-core 1.23:Fix Released by dimitern> <https://launchpad.net/bugs/1436191>
<mup> Bug #1436390 changed: GCE provider config should support extracting auth info from JSON file <gce-provider> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1436390>
<mup> Bug #1436397 changed: map-order sensitive test in md/juju/storage needs to be fixed <map-order> <juju-core:Fix Released by anastasia-macmood> <https://launchpad.net/bugs/1436397>
<mup> Bug #1436415 changed: vivid local template container "juju-vivid-lxc-template" did not stop' <ci> <lxc> <tech-debt> <vivid> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1436415>
<mup> Bug #1436655 changed: gce provider should stop using deprecated zone europe-west1-a <gce-provider> <juju-core:Fix Released by wwitzel3> <juju-core 1.23:Fix Released by wwitzel3> <https://launchpad.net/bugs/1436655>
<mup> Bug #1436988 changed: juju backup/restore is upstart-specific <backup-restore> <systemd> <vivid> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1436988>
<mup> Bug #1437038 changed: 1.23b2 fails to get IP from MAAS for containers, falls back to lxcbr0 <addressability> <maas-provider> <network> <juju-core:Fix Released by mfoord> <juju-core 1.23:Fix Released by mfoord> <https://launchpad.net/bugs/1437038>
<mup> Bug #1437220 changed: gce provider often can't find its own instances <gce-provider> <observability> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1437220>
<mup> Bug #1437296 changed: apt-http-proxy being reset to bridge address <local-provider> <proxy> <juju-core:Fix Released by anastasia-macmood> <juju-core 1.22:Fix Released by anastasia-macmood> <juju-core 1.23:Fix Released by anastasia-macmood> <https://launchpad.net/bugs/1437296>
<mup> Bug #1437366 changed: MAX_ARGS is reached when calling relation-set <charm> <landscape> <relations> <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1437366>
<mup> Bug #1438748 changed: Use of /tmp/discover_init_system.sh is a security vulnerability. <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1438748>
<mup> Bug #1439398 changed: GCE low-level RemoveInstances fails if firewalls are not found <juju-core:Fix Released by ericsnowcurrently> <juju-core 1.23:Fix Released by ericsnowcurrently> <https://launchpad.net/bugs/1439398>
<mup> Bug #1439761 changed: AWS V4 signing does not work <ec2-provider> <juju-core:Fix Released by cox-katherine-e> <juju-core 1.23:Fix Released by cox-katherine-e> <https://launchpad.net/bugs/1439761>
<voidspace> right, I'm off
<voidspace> g'night all
<voidspace> EOD
<mup> Bug #1441302 was opened: Vivid unit tests are not reliable enough <test-failure> <vivid> <juju-core:Triaged> <https://launchpad.net/bugs/1441302>
<mup> Bug #1441319 was opened: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <lxc> <oil> <juju-core:Triaged> <https://launchpad.net/bugs/1441319>
<natefinch> sinzui: anything I can do to help with that gccgo problem?
<natefinch> wwitzel3: you going to be on the release call tonight?
<sinzui> natefinch, I updated the bug with what I learned. gccgo doesn't use goroot. The libgo5 package is still installed and the encoding package is there
<sinzui> natefinch, I am currently attempting a reinstall because I cannot think of anything else to do
<davecheney> which bug is this ?
<mgz> ah, dave is who I wanted
<natefinch> davecheney: https://bugs.launchpad.net/juju-core/+bug/1440940
<mup> Bug #1440940: xml/marshal.go:10:2: cannot find package "encoding" <ci> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1440940>
<davecheney> tada, i'm here in your timezone
<davecheney> natefinch: that bug is because someone wrote code with Go 1.4 features
<davecheney> (i think)
<davecheney> checking
 * perrito666 managed to organize a go meetup in his city tonight, the whole go community is coming... all 8 of them
<mgz> davecheney: so, it started happening because we copied the encoding/xml package into our namespace
<davecheney> why the heck did you do that ?
<davecheney> sounds like you solved a big problem by lighting a massive house fire
<davecheney> s/big/small
<mgz> "to make marshalling of namespaced attributes work correctly"
<davecheney> in short, we won't get that fix til next year
<davecheney> sorry
<mgz> anyway, for whatever reason, the import of just "encoding" from that package doesn't work on a vanilla install of gccgo on trusty, for reasons I don't understand
<davecheney> mgz: which commit was this ?
<mgz> but I believe it came in with the new charmstore api
<natefinch> davecheney: note that a simple gccgo run of this code fails on that machine (though it works fine on my machine): http://play.golang.org/p/6b1LdO_13i
<mgz> davecheney: note, a trivial script that imports encoding also fails
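[A trivial reproduction of the sort natefinch and mgz describe might look like the following; this is a hypothetical sketch, as the original playground snippet isn't preserved. On the affected gccgo install even this fails to compile with: cannot find package "encoding".]

    package main

    // Importing only the stdlib "encoding" package, which defines the
    // TextMarshaler/TextUnmarshaler interfaces and nothing else. On the
    // broken gccgo setup this import is what fails to resolve.
    import (
        "encoding"
        "fmt"
    )

    type version struct{ s string }

    // version implements encoding.TextMarshaler.
    func (v version) MarshalText() ([]byte, error) { return []byte(v.s), nil }

    func main() {
        var m encoding.TextMarshaler = version{"1.23-beta4"}
        b, _ := m.MarshalText()
        fmt.Println(string(b))
    }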
<natefinch> heh
<davecheney> maybe that package doesn't exist in gccgo
<mgz> (unless you have golang installed as well I believe)
<davecheney> mgz: if installing go as well as gccgo fixes the problem
<davecheney> that is extremely super bad
<mgz> I can confirm that on a fresh amd64 vm if that would be helpful
<davecheney> ok
<davecheney> in terms of getting a fix
<davecheney> you have to roll that back if you want to be able to release this week
<davecheney> getting a fix into gccgo-4.9 is impossible on that timeframe
<thumper> o/
<mgz> the other option maybe is... also forking the root encoding package into juju/
<katco> o/ thumper
<davecheney> mgz: this sounds like taking a bad situation and making it worse
<mgz> of course :)
<mgz> or copying in the TextMarshaler class to the forked package, that's all it uses
<wwitzel3> natefinch: yep
<natefinch> wwitzel3: cool, thank you.  My main concern is this CI bug (https://bugs.launchpad.net/juju-core/+bug/1440940) that is blocking me from getting my HA stuff in.   My position is that this is just an environmental problem, not a bug in our code, since they can reproduce the problem with trivial code that doesn't even use juju.
<mup> Bug #1440940: xml/marshal.go:10:2: cannot find package "encoding" <ci> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1440940>
<natefinch> I gotta run, sorry guys.  Good luck.
<davecheney> i thought we had a standing rule to revert any change which was blocking CI
<mgz> davecheney: yeah, so that hasn't happened in practice, and this was a complex one anyway
<mgz> the forking of the package happened a while before it was added as a dep to juju-core,
<mgz> and the fact we reuse ppc machines for packaging meant this wasn't caught immediately,
<mgz> and everyone whines if we actually block trunk on ppc issues
<mgz> so we don't do that any more, but we still need to actually *release* a working juju on ppc
<davecheney> this sounds like double speak
<davecheney> ppc either blocks releases or it doesn't
<davecheney> and from what i'm hearing, it doesn't
<katco> axw: do you mind if i ping you later for the standup?
<axw> katco: no problem, how much later? I may need to go back downstairs
<katco> axw: hour or so from now?
<axw> katco: okey dokey
<axw> school holidays atm, so I'm easy
<katco> axw: is now a good time?
#juju-dev 2015-04-08
<axw> katco: just eating breakfast, 5 mins please
<axw> katco: also, network has been going out still if it's not obvious...
<katco> axw: haha, ok i'll email my question... please take your time and enjoy your breakfast
<axw> katco: standup now?
<katco> axw: sure
<katco> axw: one sec sorry
<axw> katco: sure
<katco> axw: k omw
<thumper> I'm likely to drop off at some time as I move from ADSL to VDSL
<jw4> thumper: VDSL?
<rick_h_> very dsl, it's like super powered
<thumper> very fast
<rick_h_> :P
<jw4> haha
<jw4> nice
<rick_h_> more dsl than normal
<thumper> I'm currently getting about 14 Mbps down and less than 1 up
<thumper> VDSL should make it about 20/6
<rick_h_> nice
<jw4> sweet.
<thumper> slightly faster down, much faster up
<rick_h_> upgraded to 50/10 last week. Got sick of 1.5mb up
<thumper> and the same modem they have given me will work for fibre when I get it in six months or so
<jw4> I (briefly) considered moving to kansas city when google announced fiber there
<thumper> which will make it over 150 symmetric
 * rick_h_ dreams of 100 bidirectional
<jw4> wow
<thumper> well, you have to remember that this is within NZ
<thumper> push anything over the wet noodle to the states and it slows down
<rick_h_> lol
<jw4> yeah, when you cross the pacific you slow to a crawl I suppose
<rick_h_> thumper: yea, remember that all these 'api calls' are going to go through london across oceans heh
<katco> axw_: dummy charm is working; i'll try postgres tomorrow
<katco> axw_: latest changes are in the PR. still not 100% done, but could use a look if you're bored
<katco> good night everyone
<thumper> wow, that made a big difference
<thumper> locally, went from 14/0.8 Mbps to 33/9.5
<thumper> and between here and VA, I get 7.76/5.12 Mbps
<thumper> so a big improvement
<axw_> katco: sweet.
<axw_> thumper: :o nice
<axw_> welcome to gigatown
<axw_> ;)
<jam> axw: did you see the question on the mailing list about a missed upgrade step related to AvailZone ?
<jam> Do you know anything about it?
<axw> jam: I saw it, but I don't. I think Eric worked on that one
<jam> axw: eric is at pycon :(
<voidspace> morning all
<axw> voidspace: morning
<voidspace> axw: o/
<axw> jam: so, looks like if there's any instanceid in state that's invalid, the whole upgrade step will fail
<axw> not sure if that's the cause tho
<jam> axw: could that be triggered by containers?
<jam> ISTR that used to be invalid machine ids
<axw> jam: I don't think containers get an entry in instancedata, but not entirely sure
<axw> actually, they must, that's where the HW info and all that is set
<axw> I guess the instance ID is not something that MAAS will know about tho
<jam> I'm just spitballing. I believe there were bugs in the past because destroy-environment would send a request to MAAS to destroy the instances that were actually containers.
<axw> jam: yeah, I think you're right
<TheMue> voidspace: morning o/
<mup> Bug #1441478 was opened: state: availability zone upgrade fails if containers are present <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1441478>
<voidspace> TheMue: o/
<voidspace> no dimiterm (yet) ?
<TheMue> voidspace: hmm, he's offline on Friday, but today he should be here
<voidspace> yeah, that's what I thought
<voidspace> TheMue: dooferlad: in case I forget to mention it in standup, I'm off on Friday too
<voidspace> going to a conference with my Dad on Friday, Saturday - flying to Nuremberg Sunday
<TheMue> voidspace: dooferlad: based on current planning I'll be off tomorrow. organizing the new car, preparing my Go/Juju talk for the conference the week after the sprint, and wedding anniversary of my parents-in-law
<voidspace> TheMue: ok
<mup> Bug #1438683 was opened: Containers stuck allocating, interface not up <add-machine> <cloud-installer> <landscape> <maas-provider> <network> <juju-core:Fix Committed by mfoord> <juju-core 1.23:Fix Released by mfoord> <juju-core trunk:Fix Committed by mfoord> <https://launchpad.net/bugs/1438683>
<TheMue> Hehe, that's what juju is for: http://dilbert.com/strip/2015-04-08
<dimitern> I *really* hate how I'm forced to mock out 99% of a huge Environ interface just so I can test 2 method calls on it
<TheMue> dimitern: yeah, it would be better if it were a combination of smaller interfaces, like io.ReadWriter, and we only used the small ones instead of the combination
<TheMue> dimitern: so for a test one would only have to mock the corresponding smaller one
<dimitern> TheMue, exactly! - so for example in order to get an Environ from environ config, I need a registered provider which only needs an Open() method, returning environs.Something which then has methods to get a full-featured interface, e.g. GetZoned -> ZonedEnviron
<dimitern> or GetNetworking -> NetworkingEnviron
<dimitern> granted, using finer-grained smaller interfaces for specific environ features will incur some overhead at run-time (type-asserting against a "feature sub-interface" before calling "feature-specific methods")
<TheMue> dimitern: hmm, setting the smaller ones as parameter types is fine, but when Open() returns an Environment it always has to be a combination. but this at least could be done with a struct combining multiple smaller mocks
<dimitern> TheMue, why does it have to be a full Environ? Open can return a smaller "feature-facade" interface (e.g. GenericEnviron), which you can then use to access specific features (e.g. GenericEnviron.SupportZones() (ZonedEnviron, bool); SupportNetworking() (NetworkingEnviron, error))
<dimitern> the problem is embedding smaller feature-based interfaces into a bigger "all-in-one" Environ interface - that thing should die at some point soon
<TheMue> dimitern: oh, yes, that could be a good approach. it has to know those interfaces but doesn't implement them. yeah, like it
<davecheney> dimitern: use a type assertion
<davecheney> e := someenviron()
<davecheney> ze, ok := e.(ZonedEnviron)
<davecheney> if !ok { // bummer, doesn't support zones }
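[davecheney's snippet, expanded into a compilable sketch; the interface bodies here are illustrative stand-ins, since juju's real Environ and ZonedEnviron are much larger.]

    package main

    import "fmt"

    // Environ stands in for the big all-in-one provider interface.
    type Environ interface {
        Name() string
    }

    // ZonedEnviron is an optional feature sub-interface: only providers
    // that support availability zones implement it.
    type ZonedEnviron interface {
        Environ
        AvailabilityZones() ([]string, error)
    }

    type fakeEnv struct{}

    func (fakeEnv) Name() string { return "fake" }

    func main() {
        var e Environ = fakeEnv{}
        // Feature detection via type assertion, as suggested above.
        if ze, ok := e.(ZonedEnviron); ok {
            zones, _ := ze.AvailabilityZones()
            fmt.Println("zones:", zones)
        } else {
            fmt.Println("bummer, doesn't support zones")
        }
    }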
<dimitern> davecheney, I know, but I wanted to avoid the need to create something implementing Environ, just so I can call 2 methods on it
<dimitern> davecheney, I did find a relatively cruft-free solution though: embed Environ and only implement the methods I need to test
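[The embed trick dimitern mentions, sketched against the same illustrative interfaces as above: the stub embeds the interface to satisfy the full method set, and only the methods under test get bodies. Anything else panics on the nil embedded value, which usefully flags a test touching more than it claims to.]

    // stubEnviron satisfies ZonedEnviron without writing out every
    // method: the nil embedded interface supplies the method set, and
    // only AvailabilityZones has a real implementation.
    type stubEnviron struct {
        ZonedEnviron
        zones []string
    }

    func (s *stubEnviron) AvailabilityZones() ([]string, error) {
        return s.zones, nil
    }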
<davecheney> dimitern: what are the chances of having a serious discussion about refactoring the environ interface in nuremberg
<davecheney> as in, a discussion that leads to a resolution and work in the next cycle
<davecheney> not just another commitment to fix it, but someday
<niedbalski> katco, ping
<dimitern> davecheney, we already had an interesting discussion about it in Cape Town
<dimitern> davecheney, by interesting I mean with actual implementable steps and a rough roadmap
<dimitern> davecheney, so let's sit down, discuss it finally and get it done :)
<davecheney> dimitern: +1
<davecheney> dimitern: have you put it on the sprint discussion spreadsheet ?
<niedbalski> axw, seems that commit https://github.com/juju/juju/commit/a16c5c3fd534e9457965b61621cbc2aca00cd21b , adds the leader-election by default. Would it be possible to trigger a new -devel PPA build?
<dimitern> davecheney, if not I'll do it now, while still frustrated :)
<axw> niedbalski: best to talk with sinzui, mgz or abentley about that. I know nothing about the PPA builds
<niedbalski> axw, yeah, will have to wait , thanks
<davecheney> dimitern: that's best
<dimitern> davecheney, done, I've added you to the list of attendees for it
<davecheney> dimitern: are you sure you did ?
<davecheney> i don't see my name there
<dimitern> davecheney, which spreadsheet are you looking at?
<davecheney> dimitern: which one are you looking at ?
<davecheney> i'm looking at the correct one
<davecheney> and so is everyone else
<dimitern> davecheney, I've updated Juju MaaS Sprint Agenda - Nuremberg - April 2015
<davecheney> fucking wonderful
<davecheney> again we manage to have two planning documents for one event
<davecheney> let's open the champagne
<dimitern> :)
<davecheney> dimitern: you have updated a document nobody on juju has access to
<davecheney> dimitern: please give me the link to the document you are using
<dimitern> I think that one I've updated is used for scheduling based on the first one with the list of topics
<TheMue> I thought it's the Nuremberg Sprint Topics/Planing  Juju Core?
<dimitern> I'll update both now
<davecheney> dimitern: can we stop writing stuff down twice ?
<davecheney> https://docs.google.com/a/canonical.com/spreadsheets/d/1TrPuHrWvnHU-Ekzt9SLEoWVaSV-f87kHOuLXqzjHnCc/edit#gid=0
<TheMue> davecheney: +1
<davecheney> this is the document
<davecheney> any others should be deleted and merged into the correct document, https://docs.google.com/a/canonical.com/spreadsheets/d/1TrPuHrWvnHU-Ekzt9SLEoWVaSV-f87kHOuLXqzjHnCc/edit#gid=0
<dimitern> davecheney, updated both now
<davecheney> thanks
<alexisb> davecheney, I have to get used to you talking in this timezone ;)
<davecheney> then i'll change it up on you
<alexisb> jam, you still around?
<jam> alexisb: I am
<alexisb> heya
<alexisb> first off thank you for looking at lp 1441302
<alexisb> I will follow-up with the QA team today re the env
<alexisb> for the other bug you were looking at: https://bugs.launchpad.net/juju-core/+bug/1440737
<jam> that's also timing related, just potentially higher sensitivity I think
<alexisb> sounds like there are some test updates needed, do you think there is an actual bug in the code?
<davecheney> alexisb: why is a test failure marked as a private bug
<jam> both bugs look to be test related
<alexisb> davecheney, that is a very good question
<davecheney> it contains only information about a synthetic test run
<davecheney> and is not related to a security issue
<davecheney> fat finger issue ?
<alexisb> davecheney, I believe so, but i have not actually verified with the bug submitter
<davecheney> trust me, it's just a common garden test failure
<jam> davecheney: I think CI prefers to make them private because there have been bugs in the past where Juju would leak secrets to the log file
<jam> alexisb: ^^
<davecheney> jam: this is a test run
<davecheney> there are no secrets to leak
<alexisb> katco, ping
<wwitzel3> natefinch: 1 on 1?
<natefinch> wwitzel3: yep
<katco> alexisb: pong
<alexisb> katco, sorry one sec
<katco> alexisb: no worries
<rogpeppe1> mgz: how can i determine if a given repo is managed by the juju bot or not?
<mgz> rogpeppe1: from the github side? if it has hacker or bots as contributors
<mgz> *collaborators
<rogpeppe1> mgz: so juju/charm has Hackers under team collaborators, so that means i should use $$merge$$ there?
<mgz> rogpeppe1: see github.com/orgs/juju/teams/hackers/repositories and same but /bots/
<mgz> rogpeppe1: charm does not
<mgz> but yeah, if it does
<rogpeppe1> mgz: but it has hackers as collaborators...
<mgz> rogpeppe1: I wanted to do charm, but it doesn't build with tips of deps when I tried, and has no dependencies.tsv
<rogpeppe1> mgz: so... how can i determine... etc?
<mgz> rogpeppe1: see the pages above^
<rogpeppe1> mgz: juju/charm is in github.com/orgs/juju/teams/hackers/repositories
<alexisb> katco, can you take a look at this bug: https://bugs.launchpad.net/juju-core/+bug/1441319
<mup> Bug #1441319: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <ci> <lxc> <oil> <test-failure> <vivid> <juju-core:Triaged> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1441319>
<rogpeppe1> mgz: is it just bots that i should be looking at?
<alexisb> looks like you worked in 1.22
 * katco looking
<mgz> rogpeppe1: look at either, if it's in hackers you have to use the github merge button or similar, if it's in bots you need $$merge$$
<rogpeppe1> mgz: you could quite easily (i hope) add a dependencies.tsv file to juju/charm by updating deps to those found in juju-core or charmstore first, then using that
<mgz> rogpeppe1: I'd like to add more of these subrepos to the gated set, if you have any you want please say
<mgz> there are just a few requirements/restrictions over what's needed to build/test
<rogpeppe1> mgz: sure. some of them are gated by a rival bot
<rogpeppe1> mgz: (e.g. charmstore)
<mgz> rogpeppe1: the juju-gui bits are the same code (roughly) via a different jenkins
<mgz> both are in the bots group
<katco> alexisb: i remember this
<katco> mgz: would you be able to tell me what "lxc-start --version" returns on whatever machine runs "local-deploy-vivid-amd64 (non-voting)"?
<mgz> katco: sure, sec
<TheMue> need some assistance regarding LogMatches. my test shows http://paste.ubuntu.com/10773293/ as output and I don't know why it fails.
<mgz> katco: 1.1.0
<katco> mgz: ty kindly
<mgz> package version 1.1.0-0ubuntu1
<katco> thanks
<wwitzel3> TheMue: did you try printing out the length of each message and make sure it matches?
<wwitzel3> TheMue: sometimes there is some hidden character and I know the last check the LogMatches does is a length check
<TheMue> wwitzel3: to catch appended spaces? good idea
 * natefinch grumbles about tests that do string matching on errors
<TheMue> ah, that's why I hear it grumbling here
<TheMue> wwitzel3: hmm, both are the same
<wwitzel3> TheMue: :(
<TheMue> wwitzel3: got a good hint by dimitern, the parens need escaping
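[The gotcha TheMue hit: the expected log entries are matched as regular expressions, so literal parentheses read as a capture group unless escaped. A standalone illustration with the stdlib regexp package:]

    package main

    import (
        "fmt"
        "regexp"
    )

    func main() {
        msg := "failed to stop container (timeout)"

        // Unescaped parens form a group, so this does NOT match the
        // literal message: it would only match "... container timeout".
        fmt.Println(regexp.MustCompile(`^failed to stop container (timeout)$`).MatchString(msg)) // false

        // Escaping the parens (or using regexp.QuoteMeta) fixes it.
        fmt.Println(regexp.MustCompile(`^failed to stop container \(timeout\)$`).MatchString(msg)) // true
        fmt.Println(regexp.MustCompile("^" + regexp.QuoteMeta(msg) + "$").MatchString(msg))        // true
    }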
<natefinch> 17 stupid tests fail when you add more information to an error with errors.Annotatef.  Fantastic.
<TheMue> yeeeeeeehaw, it passes
<perrito666> natefinch:  isn't that what tests are supposed to do?
<natefinch> perrito666: no, we should be using a type system so that we know the right type of error is returned, not that the string the error serializes to is exactly the same
<natefinch> perrito666: the actual text of the error message doesn't really matter
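[natefinch's point, sketched with github.com/juju/errors, which juju already uses: annotating changes the message but not the cause, so a test asserting on the cause survives added context, while an exact-string comparison breaks.]

    package main

    import (
        "fmt"

        "github.com/juju/errors"
    )

    func findUnit(name string) error {
        return errors.NotFoundf("unit %q", name)
    }

    func main() {
        err := findUnit("wordpress/0")
        // Adding context rewrites the message...
        err = errors.Annotatef(err, "cannot deploy")

        fmt.Println(err) // cannot deploy: unit "wordpress/0" not found

        // ...so comparing err.Error() against the old exact string now
        // fails, while a check against the underlying cause still passes.
        fmt.Println(errors.IsNotFound(err)) // true
    }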
<sinzui> hi jam, dimitern, natefinch: can we say that bug 1258485 is fix committed in 1.23-beta4 now that leader elections are not behind a feature flag?
<mup> Bug #1258485: support leader election for charms <juju-core:Triaged> <cassandra (Juju Charms Collection):Triaged> <postgresql (Juju Charms Collection):Triaged> <https://launchpad.net/bugs/1258485>
<dimitern> sinzui, I think so, yeah
<sinzui> yeah
<sinzui> yay
<sinzui> \o/
<natefinch> gah, and now another test *is* checking types, but just printing out the string in the test failure method.  Geez, people.
<perrito666> man you really get angry with things badly done, you would die of a heart attack in my country :p
<natefinch> heh
<wwitzel3> lol
<katco> mgz: when we create an lxc container for cloning, we create a log file for the containers console output... would it be possible to pull that from the test machine?
<alexisb> thank you wwitzel3
<mgz> katco: I can see
<mgz> does it generally get removed by destroy-environment?
<katco> mgz: hm... yes probably
<katco> mgz: actually, it probably doesn't even exist. that's probably the issue
<dimitern> voidspace, dooferlad, TheMue, please have a look - http://reviews.vapour.ws/r/1399/ subnets api server-side facade
<dimitern> I think I managed to get a good balance between test coverage and the amount of unavoidable boilerplate I need to stub out environ/provider/state methods
<mgz> katco: I can add any files you want to the list of local logs to copy before destroy-environment
<mgz> but I'd expect *some* to already be present if it had worked at all, and there are none
<mgz> older failures have the jenv, all-machines, machine-0 and cloud-init-output
<katco> mgz: yeah on second thought i don't think it's necessary. it looks like if it exists, and we can read from it, the messages will be present in machine-#.log
<natefinch> man I hate tests that rely on the order of items in a slice
<wwitzel3> I think it is easier to keep track of what nate doesn't hate in tests
<wwitzel3> much shorter list :)
<jw4> wwitzel3: lol
<natefinch> lol
<lazyPower> o/ what's the trick to pass a port to juju add-machine when ssh is not on the standard port of 22?
<lazyPower> juju add-machine ssh:user@host -P 2222 doesn't seem to be doing it
<katco> mgz: ah wait, there are 2 log files: console.log & container.log... container.log would be useful to have i think. it contains stderr information from lxc-start
<katco> mgz: and it doesn't appear that the logged information would be anywhere else
<katco> mgz: actually, i'm wondering if this is desired, or is a dormant bug...
<katco> https://github.com/juju/juju/blob/master/container/lxc/clonetemplate.go#L282-L283
<katco> can i get another opinion on that? shouldn't the stderr of lxc-start go into the same place as console output? especially if we're only watching console output?
<mgz> katco: hmm, yeah, that could get added
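[What katco suggests could look like the following, sketched with the stdlib os/exec API rather than juju's actual container code: pointing the child's stderr at the same writer as the console output keeps the two streams together.]

    package main

    import (
        "log"
        "os"
        "os/exec"
    )

    func main() {
        f, err := os.Create("console.log")
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()

        cmd := exec.Command("lxc-start", "--version")
        cmd.Stdout = f
        cmd.Stderr = f // same writer: errors land next to the console output
        if err := cmd.Run(); err != nil {
            log.Printf("lxc-start: %v (details in console.log)", err)
        }
    }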
<mgz> katco: /var/lib/juju/containers/juju-*-lxc-template/*.log ?
<katco> mgz: looks right
<katco> mgz: i think i might need that to troubleshoot further. command line arguments to lxc-start look sane. i think it's erroring out, yet remains running? need to see stderr
<rogpeppe1> here's a change to allow deploying charms with authorization; any reviews appreciated: https://github.com/juju/juju/pull/2048/files
<katco> mgz: any chance we could add those log files and then rerun vivid?
<mgz> katco: on it
<mgz> just checking if need to change how elevation works
<katco> mgz: ty sir. i'm going to take this opportunity to fix lunch
<lazyPower> to anyone following along re: add-machine with a non-standard ssh port - it appears we don't support this - http://paste.ubuntu.com/10774659/
<lazyPower> unless i hear otherwise, i'll repost with a bug and bugger off :) thanks
<lazyPower> https://bugs.launchpad.net/juju-core/+bug/1441749
<lazyPower> cheers
<mup> Bug #1441749: Add-Machine does not support non-standard ssh port <juju-core:New> <https://launchpad.net/bugs/1441749>
<mup> Bug #1441749 was opened: Add-Machine does not support non-standard ssh port <juju-core:New> <https://launchpad.net/bugs/1441749>
<jam> alexisb: what time is team lead meeting tomorrow/tonight?
<jam> I think I see it in the afternoon. good for me.
<alexisb> jam, yep, back to the old time
<mgz> katco: only got the jenv, nothing else
<katco> mgz: that's odd...
<mgz> maybe collecting at the wrong point or something
<mgz> oh, I typoed
<mgz> >_<
<katco> haha
<mgz> katco: attempt #2
<katco> :)
<mgz> 20 mins
<katco> np
<voidspace> g'night all
<niedbalski> abentley, ping
<mgz> katco: finished, see #394
<katco> mgz: ty
<katco> mgz: i haven't quite figured this out yet... how do i get to 394? i don't see it on http://reports.vapour.ws/releases
<mgz> katco: I'm manually triggering on jenkins, so am in there
<katco> mgz: oh gotcha
<mgz> you should have a dev login for the jenkins site?
<mgz> I can also direct link you the files via data.vapour.ws
<katco> mgz: i do not have a jenkins login
<mgz> because it's not going through the standard job triggering process, reports won't pick up the new artifacts till it happens to run again for another reason
<katco> mgz: just point me at the files if you don't mind... no idea where to find that on jenkins either
<mgz> http://data.vapour.ws/juju-ci/products/version-2525/local-deploy-vivid-amd64/build-394/console.log.gz
<mgz> for those following along at home
<mgz> and container.log.gz same path, big file
<katco> container.log is what we're interested in, and it's HUGE because it contains the same error repeated thousands of times :)
<abentley> niedbalski: pong
<katco> mgz: any thoughts on whether this is applicable? https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/1296459
<mup> Bug #1296459: Upgrade from 2.8.0-0ubuntu38 to 2.8.95~2430-0ubuntu2 breaks LXC containers <apparmor (Ubuntu):Fix Released by tyhicks> <https://launchpad.net/bugs/1296459>
<katco> meh.. just noticed the date on that
<mgz> katco: not that directly, I'd expect
<mgz> fragile apparmour profiles are always a possible though
<katco> mgz: what version of apparmor is on that machine?
<katco> mgz: this is the 1st interesting error in the log:
<katco>       lxc-start 1426806676.710 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:183 - No such file or directory - failed to change apparmor profile to lxc-container-default
<mgz> 2.9.1-0ubuntu9
<mgz> no apparmor updates pending, there is an lxc though
<katco> worth a try i suppose
<katco> while i continue looking into this
<mup> Bug #1441808 was opened: juju units attempt to connect to rsyslog tls on tcp port 6514  but machine 0 never installs required rsyslogd-gnutls package <juju-core:New> <https://launchpad.net/bugs/1441808>
<mup> Bug #1441811 was opened: juju-1.23beta3 breaks glance <-> mysql relation when glance is hosted in a container <oil> <juju-core:New> <https://launchpad.net/bugs/1441811>
<mgz> dist-upgrading
<katco> mgz: is that machine cycled often?
<mup> Bug #1441826 was opened: deployer and quickstart are broken in 1.24-alpha1 <ci> <regression> <juju-ci-tools:Triaged> <juju-core:Triaged> <https://launchpad.net/bugs/1441826>
<mgz> katco: uptime says 6 days, cloudinit says the machine was created 15 mar
<mgz> katco: I can restart now if desired
<katco> mgz: it's a shot in the dark, but if you don't mind
<katco> mgz: i've had issues with lxc where i had to cycle cgmanager
<katco> mgz: and research shows that power-cycles have fixed various issues for others with these types of errors
<katco> mgz: who is our resident app armor expert?
<mgz> good question, maybe check in #ubuntu-server ?
<katco> mgz: freenode or internal?
<mgz> freenode, I guess there's an internal equiv
<katco> mgz: did you cycle that machine and trigger a new run
<katco> ?
<mgz> katco: yup
<katco> mgz: awesome ty
<mgz> looks much the same though...
<katco> mgz: it was worth a try
<katco> mgz: btw isn't it super late for you?
<mgz> late-ish, yeah :)
<perrito666> vmaas is a blessing, too bad about the amount of network magic I need to do to get it exported
<mup> Bug #1441808 changed: juju units attempt to connect to rsyslog tls on tcp port 6514  but machine 0 never installs required rsyslogd-gnutls package <logging> <juju-core:New> <https://launchpad.net/bugs/1441808>
<alexisb> team...
<alexisb> I have this bug from a stakeholder:
<alexisb> https://bugs.launchpad.net/juju-core/+bug/1287718
<mup> Bug #1287718: jujud on machine 0 stops listening to port 17070/tcp WSS api <cts> <cts-cloud-review> <mongodb> <state-server> <juju-core:Triaged> <https://launchpad.net/bugs/1287718>
<alexisb> it needs some love, if someone has cycles
<alexisb> I will find "volunteers" if no one speaks up :)
 * jw4 whistles while working really hard
<alexisb> :)
<katco> thumper: ping
<thumper> katco: hey
<thumper> katco: I was just about to go and run an errand
<thumper> katco: is this quick, or can we do it in a bit?
<katco> thumper: rq
<thumper> shoto
<thumper> or...
<thumper> shoot
<katco> thumper: i'm working on https://bugs.launchpad.net/juju-core/+bug/1441319
<mup> Bug #1441319: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <ci> <lxc> <oil> <test-failure> <vivid> <juju-core:Triaged by cox-katherine-e> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1441319>
<katco> thumper: going to have to hand off to you so i can work with axw tonight on some storage stuff
<thumper> yeah...
<katco> thumper: it's well triaged, and we're engaged with hallyn in #canonical to further diagnose
<katco> thumper: so hopefully by the time i hand it off you'll have an easy go
<thumper> ack
<katco> thumper: we can talk more when you have time... i should be on tonight
<katco> axw: daily question since it's just us: can we push the stand-up back an hour?
<axw> katco: no worries, may need to be a bit later still if that's okay, since my wife will be getting ready for work an hour from now
<katco> axw: that's perfectly fine
<katco> axw: i'll eat dinner and give my kiddo a bath and catch up with you in a bit :)
<katco> axw: enjoy your family as well!
<axw> cool, ttyl
<katco> ttyl
<thumper> katco: did you work it out? I think I know what the problem is
<thumper> katco: from not looking at the bug nor the code...
<thumper> so an educated guess
<thumper> hmm... reading the bug, seems to be different
<thumper> comment added
<mup> Bug #1441899 was opened: jujud-machine-0 handles mongo errors poorly (and fails to start after a juju upgrade gone wrong) <juju-core:New> <https://launchpad.net/bugs/1441899>
<mup> Bug #1441904 was opened: juju upgrade-juju goes into an infinite loop if apt-get fails for any reason <juju-core:New> <https://launchpad.net/bugs/1441904>
<katco> thumper: o/
<katco> thumper: so i think those errors were pulled from a vivid machine
<thumper> katco: hangout?
<katco> thumper: sure, if you don't mind me popping out for a few mins in the middle
<thumper> katco: that's fine
<katco> thumper: https://plus.google.com/hangouts/_/canonical.com/tanzanite-stand?authuser=0&hceid=Y2Fub25pY2FsLmNvbV9pYTY3ZTFhN2hqbTFlNnMzcjJsaWQ5bmhzNEBncm91cC5jYWxlbmRhci5nb29nbGUuY29t.q61hqsau8oh348d0dqmosuqilk
<mup> Bug #1441913 was opened: juju upgrade-juju failed to configure mongodb replicasets <juju-core:New> <https://launchpad.net/bugs/1441913>
#juju-dev 2015-04-09
<lazyPower> thumper: ping
<thumper> lazyPower: hey man
<lazyPower> hey thumper :) is there any way that i can query an action status in juju? i've run 3 items (1 ran, 2 queued)
<lazyPower> i'm realllllyyyy curious what's going on with those other 2 actions that queued and gave me zero feedback other than a queue with a hash.
<thumper> lazyPower: NFI sorry, jw4 around?
<jw4> yep
<lazyPower> \o/
<lazyPower> score, right people, right time
<jw4> lazyPower: juju action status ?
<lazyPower> how... did i miss this in juju action help?
 * lazyPower facepalms
<jw4> lazyPower: bad docs
<jw4> mea culpa :(
<jw4> lazyPower: btw... juju help actions is a very truncated primer
<lazyPower> it's a new feature, i forgive you jw4
<jw4> lazyPower: lol thanks
<lazyPower> we'll do better next time \o/
<mup> Bug #1441915 was opened: juju uses unsafe options to dpkg inappropriately <juju-core:New> <https://launchpad.net/bugs/1441915>
<katco> axw: standup?
<rick_h_> lazyPower: ping
<lazyPower> rick_h_: pong
<rick_h_> lazyPower: hate to be a bother man, but noticed that our link to the openstack bundle was broken and see you pushed it under -basic vs -base per https://pastebin.canonical.com/129202/
<lazyPower> yikes
<lazyPower> 1 sec, let me fix that
<rick_h_> lazyPower: working on the release notes email and wanted to call out the bundle move, any chance you've got time to repush it? and remember the name in the yaml has to match please
<rick_h_> sorry for not realizing it when you did it :(
<lazyPower> All good - glad we caught it before we broadcast a 404
<rick_h_> lazyPower: +1
<lazyPower> rick_h_: the bundle deploy command turns into:     juju quickstart bundle:openstack/openstack-base
<lazyPower> right?
<lazyPower> or is it openstack-base/openstack-base
<rick_h_> juju quickstart openstack-base
<rick_h_> that's it, just the same thing as the jujucharms.com url for promulgated bundles
<rick_h_> lazyPower: e.g. http://jujucharms.com/mongodb-cluster check the copy/paste command there
<lazyPower> ok openstack-base is pushed
<rick_h_> or even the one there now https://jujucharms.com/openstack-basic/
<katco> mgz: sinzui: merge incorrectly failed on goose: http://juju-ci.vapour.ws:8080/job/github-merge-goose/3/console
<rick_h_> lazyPower: <3 my hero
<lazyPower> rick_h_: do we want to wipe the openstack-basic LP repo and nuke the charmstore bin?
<rick_h_> lazyPower: will wait for it to ingest before I sent my email
<lazyPower> or have i created cruft :|
<rick_h_> lazyPower: yes please
<lazyPower> ack, on it now
<rick_h_> I'll blow it away once the branch is gone
<lazyPower> done, branch should 404 now
<axw> katco: back, ready now?
<katco> axw: yep
<rick_h_> lazyPower: can you double check that the url in the new bundle is right? lp:~charmers/charms/bundles/openstack-base/bundle (ends in /bundle?)
<lazyPower> fetches 34 revisions from bzr
<rick_h_> lazyPower: I think we only ingest /trunk to avoid everyone's branches in dev/progress
<lazyPower> that's a mismatch with what everything else that's a bundle is pushed to
<lazyPower> the /trunk is correct nomenclature with charms, but /bundle is what we've been using for bundles since i started.
<rick_h_> lazyPower: ah ok then coolio
<rick_h_> lazyPower: just double checking
<lazyPower> :)
<lazyPower> https://launchpad.net/~charmers/charms/bundles/mediawiki/bundle <- as verification
<rick_h_> yep, gotcha
<rick_h_> on that first page the only other bundle was /trunk so I got nervous
<axw> mgz: goose bot still doesn't want to merge stuff: http://juju-ci.vapour.ws:8080/job/github-merge-goose/3/console
<rick_h_> lazyPower: <3 https://jujucharms.com/openstack-base/
<axw> rick_h_: would you mind hitting the merge button on https://github.com/go-goose/goose/pull/6? the tests pass (see console above), but the lander mustn't be set up correctly because the merge failed again.
<rick_h_> axw: looking
<lazyPower> Email Deployed!
<rick_h_> axw: button hit
<axw> rick_h_: thanks :)
<rick_h_> np
<rick_h_> always happy to use my powers for evil
<katco> rick_h_: ty sir :)
<thumper> axw: you around?
<axw> thumper: I am
<axw> howdy
<thumper> axw: team meeting?
<axw> oops
<jam> thumper: hey, did you see my emails about potential changes to the JES spec? I haven't heard any replies from you.
<jam> I have to go walk the dog, but I'll be back in a bit
<thumper> jam: yeah...
<thumper> jam: we should have a hangout when you are back and have time
<thumper> jam: it was on my todo list to give you a comprehensive response
<jam> thumper: are you still there?
<thumper> yup
<jam> I'm up for a hangout, let me grab my coffee cup
<jam> heh, empty anyway :)
<rick_h_> jam: thumper just replied to the cli stuff fyi
<thumper> rick_h_: why are you still here?
<thumper> surely it is past your bed time :)
<rick_h_> thumper: because I can't sleep and I was clearing kanban and replying to bugs and found your email interesting :)
<rick_h_> at least more interesting than bug triage
<rick_h_> so party time!
<jam> rick_h_: I don't know the ascii art for blowing a noisemaker... :)
<rick_h_> jam: that's ok, I couldn't do it in animated gif form either so we'll be quiet party folk
<jam> thumper: https://plus.google.com/hangouts/_/gqw3kdy7c4nxrjs2a7adlggamea
<rick_h_> jam: thumper is that JES or uncommitted and worth me listening in on for fly on the wall info next week? or carry on with my bugs?
<jam> rick_h_: the chat is about JES I believe
<thumper> rick_h_: you are welcome
<jam> you're welcome if you're interested
<mup> Bug #1427814 changed: juju bootstrap fails on maas with when the node has empty lshw output from commissioning <bootstrap> <maas> <maas-provider> <network> <juju-core:Won't Fix> <juju-core 1.22:Won't Fix> <juju-core 1.23:Won't Fix> <https://launchpad.net/bugs/1427814>
<dimitern> dooferlad, hey there
<mup> Bug #1442012 was opened: persist iptables rules / routes for addressable containers across host reboots <addressability> <network> <juju-core:Triaged> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1442012>
<dooferlad> dimitern: hi
<dimitern> dooferlad, I'm reviewing your branch, but something more urgent came up
<dooferlad> dimitern: yea, the container stuff?
<dimitern> dooferlad, yeah - some experiments are needed - have a look at bug 1441811
<mup> Bug #1441811: juju-1.23beta3 breaks glance <-> mysql relation when glance is hosted in a container <oil> <juju-core:Triaged> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1441811>
<dimitern> dooferlad, I've added some comments with suggestions how to change the iptables rules we generate so we'll hopefully solve the issue
<dimitern> dooferlad, can you please try these (or others if you have a better idea) - the result we're seeking is that packets from a container-hosted charm arrive at another host with the container's IP as source, not its host's
<dooferlad> dimitern: sure
<dimitern> dooferlad, cheers! in the mean time I'll finish your review
<voidspace> dimitern: this only applies to new containers deployed with 1.23, the upgrade doesn't change existing containers network configuration - right?
<dimitern> voidspace, which does?
<voidspace> dimitern: that bug - the routing problems
<dimitern> voidspace, ah, yes
<dimitern> voidspace, but post-upgrade any new instance hosting containers will potentially have the issue
<voidspace> dimitern: yep
<voidspace> dimitern: so you can be in a "mixed" environment, with old style and new style containers
<dimitern> voidspace, oh most certainly :)
<dimitern> voidspace, we need to deal with this gracefully though
<mup> Bug #1442012 changed: persist iptables rules / routes for addressable containers across host reboots <addressability> <network> <juju-core:Triaged> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1442012>
<mup> Bug #1442012 was opened: persist iptables rules / routes for addressable containers across host reboots <addressability> <network> <juju-core:Triaged> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1442012>
<dimitern> voidspace, aren't those 2 cards in review merged btw?
<dooferlad> dimitern: iptraf seems to be a good tool to add to the collection. Shows source address of traffic, so just pinging a container is enough to see if the source address is the host or the container.
<dimitern> dooferlad, cool - how do you run it?
<dooferlad> dimitern: it is a console app. It is apt-getable as well.
<dimitern> dooferlad, I'll check it out
<dooferlad> dimitern: it looks like if we just don't add the current SNAT rule we are fine. Currently we have "-A POSTROUTING -o eth0 -j SNAT --to-source <host IP>"
<dooferlad> dimitern: without it, ping responses are from the container IP
<dooferlad> dimitern: just testing a small change
<dimitern> dooferlad, hmm we better check any proposed fix on both AWS and MAAS just to be sure
<dooferlad> dimitern: indeed!
<mup> Bug #1442046 was opened: Charm upgrades not easy due to versions not being clearly stated <cts> <juju-core:New> <https://launchpad.net/bugs/1442046>
<rogpeppe1> i just saw this panic when running github.com/juju/juju/cmd/jujud/agent tests: http://paste.ubuntu.com/10781542/
<rogpeppe1> it looks like a legit juju bug to me
<natefinch> rogpeppe1: heh I was just trying to fix that on my branch, assuming it was my own fault somehow
<dimitern> rogpeppe1, looks like a bug to me as well << axw
<natefinch> rogpeppe1: seems like ensureErr should just return nil if the thing it's given is nil... though there might be more to the bug than that... like why we're passing it something nil
<rogpeppe1> natefinch: it looks to me as if we're getting a closed channel on filesystemsChanges before the first environConfigChanges value has arrived
<rogpeppe1> natefinch: that would be my thought for a fix too
<dimitern> natefinch, nope, EnsureErr's reason for existence is to report watcher errors on failure, it should not be called when err != nil
<rogpeppe1> dimitern: it's being called with a nil Errer
<dimitern> rogpeppe1, yeah, but it shouldn't - seems to me like an omission in the loop using the watcher
<rogpeppe1> dimitern: hmm, you're right, it shouldn't. looking more closely, i don't see how we can possibly be getting a closed channel on filesystemsChanges when filesystemsWatcher is still nil
<dimitern> weird
<rogpeppe1> fwereade: i'm seeing sporadic failures in UniterSuite.TestUniterUpgradeConflicts too
<dooferlad> dimitern: What do I do to get a public IP address for an EC2 container?
<dimitern> dooferlad, there's no such thing yet
<dimitern> dooferlad, the public address of the host is used
<dooferlad> dimitern: ah, so what do we expect from an EC2 container? Being able to start a service in it, and access it using the host IP?
<dimitern> dooferlad, the public address is only needed for exposing stuff
<dimitern> dooferlad, but in AWS we should have the same behavior as MAAS (assuming the fix worked)
<dooferlad> dimitern: well, in MAAS I can create a container and ping it, getting the response back from the container IP address
<dimitern> dooferlad, i.e. other hosts (or containers on the same or other hosts) talking to a container-hosted charm should see the packets from the container's ip
<dooferlad> dimitern: which is private to the host at the moment, because it is on a bridge?
<dooferlad> unless a service has been exposed
<dimitern> dooferlad, what is private to the host?
<dooferlad> dimitern: to ask a better question, do we expect addresses of containers, attached to lxcbr0, to be accessible from other physical machines in the same VPC?
<dimitern> dooferlad, ok let's call 10.0.3.0/24 and 192.168.122.0/24 addresses "local container addresses" and the ones in the same range as their host "internal container addresses"
<dimitern> dooferlad, we expect other hosts (or containers on other hosts) to be able to connect to the internal container address and see connections originating from the same address
<thumper> hmm...
<thumper> noticed the check-in time at Nuremberg is 3pm
<thumper> I arrive around 7:30 am
<dooferlad> dimitern: great
<thumper> who is staying saturday night?
<dimitern> dooferlad, while the local container addresses are irrelevant (if possible)
<dimitern> thumper, I'll be there saturday afternoon
<thumper> huzzah
<thumper> I think I'll be half dead / asleep
<dimitern> dooferlad, "if possible" == I'd rather not have to deal with local container addresses wrt iptables rules (just their CIDR range)
<voidspace> dimitern: yes, good point - I'll move them now
<dimitern> voidspace, cheers!
<dooferlad> dimitern: OK, I started an EC2 machine with an LXC on it, juju status says its IP address is 10.0.3.82. I guess that is bad.
<dimitern> dooferlad, it is bad
<dimitern> dooferlad, it needs to be something like 172.x.x.x
<dimitern> dooferlad, check machine 0 logs around PrepareContainerInterfaceInfo (I hope you're logging at TRACE level)
<dooferlad> dimitern: "cannot allocate addresses: no interfaces available"
<dooferlad> logging at debug
<axw> dimitern: yeah I noticed that just earlier myself, filesystemsWatcher isn't being assigned
<axw> rogpeppe1: ^^
<axw> it's shadowed
<axw> in startWatchers
<axw> gtg catch a plane, see you in a few days
<rogpeppe1> axw: ah!
<rogpeppe1> axw: it's not the only one either
<rogpeppe1> dimitern: there's another bug there too
<dimitern> rogpeppe1, oh yeah?
<rogpeppe1> dimitern: when environConfigChanges is closed, we should do EnsureErr on environConfigWatcher, but it's doing it on volumesWatcher instead
<rogpeppe1> dimitern: so i think that's three easy-to-fix bugs :)
<dimitern> rogpeppe1, nice catch!
<rogpeppe1> dimitern: i saw an actual panic from that one too
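[A distilled, runnable version of the shadowing bug axw spotted; the names here are illustrative, not the actual worker code. ':=' inside the setup function declares a fresh local, so the variable the loop later reads stays nil.]

    package main

    import "fmt"

    type watcher struct{}

    func newWatcher() (*watcher, error) { return &watcher{}, nil }

    var filesystemsWatcher *watcher // what the event loop later reads

    func startWatchers() error {
        // BUG: ':=' declares a new local filesystemsWatcher that
        // shadows the outer one, which therefore stays nil.
        filesystemsWatcher, err := newWatcher()
        if err != nil {
            return err
        }
        _ = filesystemsWatcher
        return nil
    }

    func main() {
        if err := startWatchers(); err != nil {
            panic(err)
        }
        fmt.Println(filesystemsWatcher == nil) // true: never assigned

        // The fix: declare err separately so '=' assigns the outer variable.
        var err error
        filesystemsWatcher, err = newWatcher()
        if err != nil {
            panic(err)
        }
        fmt.Println(filesystemsWatcher == nil) // false
    }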
<dimitern> fwereade, hey, are you about?
<fwereade> dimitern, o/
<dimitern> fwereade, :) I hope you've sorted out your flights for nuremberg
<fwereade> dimitern, ha, yes
<dimitern> fwereade, cause you're still red on the logistics spreadsheet
<fwereade> ...shite, I think I forgot that bit
 * fwereade goes crawling off to try to sort that out
<rogpeppe1> mgz: would it be possible to fix this failure in the juju landing 'bot please? http://juju-ci.vapour.ws:8080/job/github-merge-juju/2834/console
<rogpeppe1> mgz: i think it's happening because code.google.com/p/go.net is no longer a dependency (a Good Thing)
<rogpeppe1> mgz: and for some reason the script is trying to remove charset/testdata
<rogpeppe1> is anyone else here able to fix the landing bot ?
<mgz> rogpeppe1: sure
<rogpeppe1> mgz: thanks
<jam> fwereade: ping, I had some questions about a leader-election test
<fwereade> jam, heyhey
<mgz> rogpeppe1: when you say no longer a dep, did it actually just move?
<rogpeppe1> mgz: yeah
<mgz> so, do we still need to remove that stuff but from a different path?
<rogpeppe1> mgz: to golang.org/x/net
<rogpeppe1> mgz: quite possibly.
<mgz> okay, I will do that
<rogpeppe1> mgz: do you know why it was removed anyway?
<jam> fwereade: I don't know if you saw http://reviews.vapour.ws/r/1378/ which basically just removes leader-elected as a feature flag (its just always enabled) per Mark's request.
<mgz> rogpeppe1: it's not properly licenced
<jam> I had a bit of cleanup (some stuff with how exceptions were getting wrapped and unwrapped)
<jam> but mostly it worked
<jam> except one test
<rogpeppe1> mgz: why should that matter in the build bot?
<jam> fwereade: specifically UnitDying test causes leader-settings-changed (presumably because the unit loses its leader status)
<jam> but leader-settings-changed isn't ordered vs db-relation-departed
<fwereade> jam, ah yeah, I saw the ship-it and didn't look further than that
<mgz> rogpeppe1: we use the same tarball creation script as the actual release, so we're testing the right stuff
<jam> fwereade: so http://reviews.vapour.ws/r/1378/diff/# line 67 is where I unwrapped the error
<jam> and 1034 is the test I commented out
<jam> axw made the comment "maybe leader-settings-changed should be suppressed when dying" which I had just thought of independently
<fwereade> jam, hmm, so re the unwrapping that surprises me a bit -- what is it that's wrapping an ErrDying in the first place? I usually think of that as a direct signal rather than something that bubbles through many layers
<jam> fwereade: isLeader
<fwereade> jam, not to say that it can't happen or that it's not good though
<jam> times out with ErrDying
<jam> and the upper layers do "errors.Trace()"
<fwereade> jam, generally I think we should be checking Cause(err) rather than err just about everywhere though
<fwereade> jam, it's the price we pay for tracing
<fwereade> jam, re leader-settings-changed while dying
<rogpeppe1> mgz: i'm surprised about the licensing - golang repos don't usually contain anything encumbered
<jam> fwereade: yeah, so maybe we want a helper that wraps tomb.Kill in tomb.Kill(errors.Cause(err))
<fwereade> jam, don't think so? it's site-specific
<fwereade> jam, frequently the context is just what the doctor ordered
<fwereade> jam, it's only for certain special values in any given case
<fwereade> jam, even if there are going to be some very common cases...
<jam> fwereade: you're right. It was more about signaling (singleton?) errors
<jam> the trace did help me actually find where the error was being generated
<jam> though interestingly enough
<fwereade> jam, cool :)
<jam> isLeader() doesn't return errors.Trace(ErrDying)
<jam> which would have actually gotten the line
<mgz> rogpeppe1: changed, try sending it through again
<rogpeppe1> mgz: trying
<fwereade> jam, heh, interesting point
<fwereade> jam, errors.Trace(tomb.ErrDying) is squicky at first sight, but I can't think of a good argument against it
<mgz> rogpeppe1: the html test cases are from the w3c, and their licence is non-free
<mup> Bug #1442132 was opened: [packaging juju-1.23] Issues met while working on debian/copyright file <juju-core:New> <https://launchpad.net/bugs/1442132>
<fwereade> jam, and that'd then require us to enforce cause-checking properly
<jam> fwereade: well one bit is that it makes it *obvious* that the errors are going to be wrapped and you need errors.Cause() before passing to tomb.
<jam> paraphrase jinx
<fwereade> jam, indeed :)
<mgz> ah, hm, their website currently claims dual licenced to 3-clause bsd, I wonder if that's a recent change
<fwereade> jam, so, yeah, I think you''re right
<mgz> their licence has an advertising clause
<fwereade> jam, the traces are good, cause-checking is generally necessary anyway, we should just trace special error values from the start
<rogpeppe1> mgz: it looks fairly free to me: http://www.w3.org/Consortium/Legal/2008/04-testsuite-copyright.html
<fwereade> jam, doesn't affect the most common errors.New/Errorf code, right?
<mgz> and no-modification
<jam> fwereade: so a helper to errors.Cause ErrDying for tomb is sane?
<rogpeppe1> mgz: is the problem this: "Neither the name of the W3C nor the names of its contributors may be used to endorse or promote products derived from this work without specific prior written permission."
<rogpeppe1> ?
<fwereade> jam, yeah, and we can just stick it in in place of all the x.tomb.Kill(x.loop()) calls
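[A sketch of the helper being discussed, assuming github.com/juju/errors and the gopkg.in/tomb.v1 API: unwrapping with errors.Cause before Kill keeps tomb's special-casing of ErrDying working even when the value was traced on the way up.]

    package main

    import (
        "fmt"

        "github.com/juju/errors"
        "gopkg.in/tomb.v1"
    )

    // kill unwraps a traced error to its cause before handing it to the
    // tomb, so special values like tomb.ErrDying still compare equal
    // even after an errors.Trace somewhere down the stack.
    func kill(t *tomb.Tomb, err error) {
        t.Kill(errors.Cause(err))
    }

    type worker struct{ tomb tomb.Tomb }

    func (w *worker) loop() error {
        <-w.tomb.Dying()
        // Deep call stacks tend to trace everything, including ErrDying:
        return errors.Trace(tomb.ErrDying)
    }

    func (w *worker) run() {
        defer w.tomb.Done()
        // In place of the usual w.tomb.Kill(w.loop()):
        kill(&w.tomb, w.loop())
    }

    func main() {
        w := &worker{}
        go w.run()
        w.tomb.Kill(nil) // ask the worker to stop
        <-w.tomb.Dead()
        fmt.Println(w.tomb.Err()) // <nil>, not a traced ErrDying
    }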
<mgz> rogpeppe1: and "No right to create modifications or derivatives of W3C documents is granted..."
<fwereade> jam, about LSC when dying
<rogpeppe1> mgz: i don't see that clause anywhere
<mgz> but by that current doc, we can just take 3-clause bsd, I'll check with rbasak
<fwereade> jam, we should look further into where that LSC is coming from -- I don't recall us abdicating leadership at that point
<fwereade> jam, off the top of my head, I'd suspect the filter of starting a fresh watcher maybe?
<fwereade> jam, assuming it is a legitimate hook, though, the arbitrary ordering is a feature not a bug
<jam> fwereade: right, I don't think we can prescribe an ordering.
<jam> fwereade: so we have checks that say "if as a result of this action, I'm no longer leader, trigger leader-settings-changed"
<fwereade> jam, so looking at AliveLoop/DyingLoop I am mainly concerned that they are more different than I would have hoped -- ie that we seem to stop reacting to all manner of hooks while dying
<rogpeppe1> mgz: same issue still: http://juju-ci.vapour.ws:8080/job/github-merge-juju/2835/console
<fwereade> jam, and I'm not completely sure that's correct -- I know it has on occasion irritated users that we don't handle charm upgrades while dying, for example
<fwereade> jam, and as a charmer you *don't know* whether you're dying
<mgz> rogpeppe1: doh, I changed the wrong machine
<fwereade> jam, so every difference between alive and dying is just arbitrarily varying behaviour from the POV of the charmer
<jam> fwereade: could it be uniter.go line 397 "before.Leader != after.Leader" ?
<fwereade> jam, quite likely, yes -- but I'm still not seeing what'd trigger it
<fwereade> jam, oops sorry wrong bit
<jam> fwereade: so a unit no longer being alive means it can't be elected to leader, right?
<dimitern> voidspace, dooferlad, Subnets API - AllSpaces() - please, take a look: reviews.vapour.ws/r/1403/
<jam> certainly we don't want a Dying unit to become the leader (I would think)
<fwereade> jam, after.Leader shouldn't have been set, should it?
<mgz> rogpeppe1: re-re-try
<fwereade> jam, completely independent currently
<rogpeppe1> mgz: reretrying
<jam> fwereade: so I haven't debugged here, but before.Leader should have been set, right? It was the only unit of a service, thus the leader, then it goes to dying
<jam> fwereade: we know the unit is not leader because it got leader-settings-changed not leader-elected
<fwereade> jam, yes, before.Leader should have been set, and unless we ran a resignLeadership op it should still be set
<fwereade> jam, apart from anything else
<fwereade> jam, a dying unit should not renounce leadership if it's the only unit
<jam> fwereade: so I don't *know* that its that code that's triggering it. I just know I'm seeing a leader-settings-changed in that test, and it feels a lot like something noticing its not leader anymore and thus queuing a leader-settings-changed event
<fwereade> jam, and the simplest way to implement that it to completely decouple leadership from life -- the only cost is that the next leader may be elected after a short delay
<fwereade> jam, I admit I am mostly occupied with a different structure in my head right now so I might easily be wrong somewhere
<jam> fwereade: meaning, if you were leader you have 30s after you die before we notice that you're no longer handling leadership ?
<fwereade> jam, yes
<jam> fwereade: what about the code that votes in a new leader, seems it should favor non-dying
<fwereade> jam, dropping it the moment the unit's set to Dead would be fine and good
<fwereade> jam, not convinced that makes much difference in the end? should the only remaining unit, dying but blocked, never be elected leader, and thus never (say) run service-level actions?
<jam> well I did say favor not never-elect
<fwereade> jam, true :)
<jam> but at the same time, if you're Dying I don't know whether leader stuff actually matters.
<fwereade> jam, but I can't see a robust way to do that and I'd rather do nothing than something inherently flaky
<jam> fairy 'nuff
<fwereade> jam, it still does, I think -- you're still likely to be responsible for a bunch of things even if the charm isn't aware of it
<fwereade> jam, given a service composed of N dying+blocked units, you still want one of them to be leader so it can aggregate statuses, run service actions, whatever
<fwereade> jam, and when a dying and non-blocked unit is elected, ehh, it'll be deposed soon enough, and we have to expect and tolerate leadership changes *anyway*
<jam> fwereade: so runOperation appears to be the only place that could possibly call WantLeaderSettingsEvents(true) - does that fit with you?
<fwereade> jam, I think so, yes
<jam> fwereade: I was just trying to figure out where to look to find out why we were deposed
<fwereade> jam, that's the one place that should see all the state changes we record between leader/minion
<fwereade> jam, seeing what operation set it to false in there should work
<jam> fwereade: and your thought is that it should be a ResignLeadership op ?
<fwereade> jam, well, I sorta think it *shouldn't* be --ie, yes, that is the only op that should; but I don't *think* we should get that op... should we?
<jam> fwereade: well, that's the only code that has "Leader = false" but potentially leadership.State{} also sets it to false
<jam> (setting by omission is one of my dislikes of go defaults)
<fwereade> jam, indeed, I would be suspicious of finding some op that wasn't using stateChange.apply
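[jam's aside about setting by omission, illustrated with a minimal stand-alone example: a Go composite literal that omits a field silently gives it the zero value, so Leader can become false without any assignment that grep would find.]

    package main

    import "fmt"

    // State is an illustrative stand-in for the struct under discussion.
    type State struct {
        Leader  bool
        Started bool
    }

    func main() {
        // A composite literal that omits a field silently zeroes it:
        // Leader ends up false "by omission".
        s := State{Started: true}
        fmt.Printf("%+v\n", s) // {Leader:false Started:true}
    }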
<dooferlad> dimitern: so the reason that EC2 containers aren't working is that provider/ec2/environ.go -> NetworkInterfaces -> ec2Client.NetworkInterfaces is not returning any interfaces.
<dooferlad> ...so we can't find out what the machine's IP address is, so we can't set up the container correctly
<dimitern> dooferlad, have you tried without your fix?
<dooferlad> dimitern: no, but I am not sure how it would make any difference.
<dooferlad> dimitern: can do now if you like.
<dimitern> dooferlad, please do
 * dimitern hopes we didn't break addressable containers on AWS
<jam> fwereade: so I do see "running operation resign leadership" being triggered, but sometimes the test doesn't fail...
<jam> is there a good way to figure out why we'd be running that op?
<dimitern> dooferlad, which branch are you using?
<dpb1> benji: yw
<dooferlad> dimitern: 1.23
<fwereade> jam, look in op_plumbling.go for the creator func and see where that's used
<fwereade> jam, should only be a couple of places
<fwereade> jam, and probably only one of them should be a plausible source given the test
<dimitern> dooferlad, ok, I'll try 1.23 tip here on AWS
<dooferlad> dimitern: I am already running
<jam> fwereade: modeAbideDyingLoop has newResignLeadershipOp()
<fwereade> jam, right, but I thought it only triggered when the tracker told us to
<jam> Refresh, DestroyAllSubordinates, SetDying, ResignLeadership
<jam> fwereade: ModeAbide has "<-u.f.UnitDying() return modeAbideDyingLoop"
<fwereade> jam, *dammit* sorry
<fwereade> jam, I do "resign leadership" as soon as we hit that loop
<fwereade> jam, it doesn't affect anything else, it's effectively just early warning for the charm that soon it won't be leader
<fwereade> jam, so I think the problem is that the other hooks are racing with <-UnitDying in modeAbideAliveLoop
<jam> fwereade: sure, so ordering means we may or may not get it before db-relation-broken and db-relation-dying etc.
<fwereade> jam, we should run it, you're absolutely correct, all my intimations that we shouldn't have been complete nonsense
<fwereade> jam, if the broken and dying were triggered by the unitdying too we'd be fine I think
<fwereade> jam, is this test one where we're dying *while* the remote units really are leaving the relation?
<fwereade> jam, if so, it's unorderable I fear
<fwereade> jam, collecting laura gtg bbs
<jam> fwereade: enjoy
<jam> yeah, I don't think it should have an order, the question is how to properly test it.
<natefinch> rogpeppe1, dimitern: did you guys figure out that panic w/ EnsureErr?
<rogpeppe1> natefinch: axw worked it out
<rogpeppe1> natefinch: i'm leaving it for one of you guys to fix (there are about 3 bugs there)
<natefinch> rogpeppe1: since it's blocking me from committing, I'm more than willing to fix it.
<rogpeppe1> natefinch: it's mostly a shadowed-variable bug
<rogpeppe1> natefinch: but there's one place that the wrong variable is used too
<rogpeppe1> natefinch: it's sporadic (i just managed to merge a PR)
<natefinch> rogpeppe1: saw it in scrollabck, and I see it in the code, looks straightforward enough... just remove a couple colons
<natefinch> rogpeppe1: what's the wrong variable?
<rogpeppe1> natefinch:
<rogpeppe1> 		case _, ok := <-environConfigChanges:
<rogpeppe1> 			if !ok {
<rogpeppe1> 				return watcher.EnsureErr(volumesWatcher)
<rogpeppe1> 			}
<rogpeppe1> natefinch: volumesWatcher should be environConfigWatcher
<natefinch> rogpeppe1: yep, ok, I see it
<dooferlad> dimitern: no change without the fix
<dimitern> dooferlad, well, something is wrong at your side, because I've just bootstrapped and deployed a container on AWS - with an address from the host's range
<dooferlad> dimitern: well, that's no good :-|
<dimitern> dooferlad, what AWS account are you using?
<dooferlad> dimitern: the canonical one I was given
<dimitern> dooferlad, is the env still alive?
<dooferlad> dimitern: yes
<dimitern> dooferlad, in us-east-1 ?
<dooferlad> dimitern: yes, ec2-54-159-20-216.compute-1.amazonaws.com
<dimitern> dooferlad, got it - the issue is there's no default VPC there
<dooferlad> dimitern: oh (*%&^*
<dooferlad> dimitern: what region should I target?
<dimitern> dooferlad, hmm.. no there is one actually
<dimitern> dooferlad, but why wasn't it used? I can see the instance is a classic EC2 one, not VPC
<dimitern> dooferlad, if you have full TRACE logs and --debug log from the bootstrap that might give us some pointers
<dooferlad> dimitern: http://paste.ubuntu.com/10782887/
<perrito666> wwitzel3: natefinch are we having standup?
<dimitern> dooferlad, and the machine-0 log?
<dooferlad> dimitern: http://paste.ubuntu.com/10782893/
<natefinch> perrito666: sorry, lost track of time
<natefinch> katco: you doing the cross team call?  Do you have the info?
<perrito666> I think my ears are shrinking
<perrito666> my earplugs are not as comfortable as they used to be
<katco> natefinch: yeah i'll be there... although the call keeps dropping for some reason
<mrpoor> hi!
 * dimitern steps out for ~1h
<natefinch> ahh sleeps in tests, the hallmark of true quality
<mgz> dimitern: gobot doesn't actually have push rights to go-goose
<mgz> *jujubot
<natefinch> anyone else seeing notifyWorkerSuite.TestCallSetUpAndTearDown  failing sometimes with it not having called setup?
<mgz> I'm not sure where it's falling down exactly though
<natefinch> mgz: did you do the "set membership to public" thing?   I don't seem to have rights to see the membership stuff, so I can't tell myself.
<mgz> natefinch: I did
<mgz> but I don't have perms to poke further
<natefinch> mgz: is this the first time we've tried this with something external to github.com/juju?
<mgz> natefinch: yup, but it's not really much different
<natefinch> wow, ok, I see what's wrong with the notifyworker tests..... it's a built-in race condition.  We're assuming a notifyworker running a goroutine will call Setup() before one of the tests checks that setup has been called.
<perrito666> is it just me, or is there no clear documentation on how to create a new hook?
<natefinch> perrito666: documentation is for the weak
<perrito666> or for those that need to implement a hook before monday
<natefinch> perrito666: sorry.... all the info I can see in the docs is here: https://github.com/juju/juju/blob/e751bc6d1b44ef71679b946417a3b5c7484673b2/doc/charms-in-action.txt
<natefinch> perrito666: which from a quick skim does not seem to really talk about making new hooks
<perrito666> yup, same here, I think I'll dig around a bit among the implemented ones, I foresee lots of jumping around :p
<natefinch> rogpeppe1: Care to do a super quick review of those changes you found, plus a couple other small test fixes? http://reviews.vapour.ws/r/1405/
<natefinch> or dimitern, or katco or anyone else ^^
<natefinch> jam: fwereade: ^^
<katco> natefinch: tal
<natefinch> katco: thanks
<fwereade> perrito666, what hook? :)
<mup> Bug #1442257 was opened: lxc network.mtu setting not set consistently across hosts <juju-core:New> <https://launchpad.net/bugs/1442257>
<fwereade> perrito666, I can try to hit the high points
<alexisb> wwitzel3, fwereade, katco, perrito666 I need a volunteer to help with a critical customer issue
<alexisb> someone have bandwidth to help the onsite team?
<fwereade> alexisb, I can jump in but perhaps not for very long
<alexisb> fwereade, pointed you to the chatter
<jam> natefinch: given notifyHandler is test code, is there any reason SetUp is done asynchronously in the first place?
<fwereade> alexisb, cheers
<jam> natefinch: reviewed
<natefinch> jam: the setup is the watcher method that gets called by the watcher code
<katco> natefinch: also reviewed... just a few questions
<katco> alexisb: fyi i think wwitzel3 is traveling atm
<natefinch> jam: it's this code, which is done from a goroutine spawned in NewStringsWorker: https://github.com/juju/juju/blob/master/worker/stringsworker.go#L64
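One hedged way to remove that kind of race, sketched below with made-up names rather than the actual notifyworker test: have the handler signal on a channel when SetUp runs, and have the test wait on that channel with a timeout instead of assuming the worker goroutine has already been scheduled.

    package main

    import (
        "fmt"
        "time"
    )

    type testHandler struct {
        setUpCalled chan struct{}
    }

    func (h *testHandler) SetUp() {
        close(h.setUpCalled) // signal the test rather than just flipping a bool
    }

    func main() {
        h := &testHandler{setUpCalled: make(chan struct{})}

        go h.SetUp() // stands in for the worker goroutine calling SetUp

        select {
        case <-h.setUpCalled:
            fmt.Println("SetUp observed")
        case <-time.After(5 * time.Second):
            fmt.Println("timed out waiting for SetUp")
        }
    }

The timeout keeps the test from hanging forever if SetUp genuinely never runs, while the channel removes the sleep-and-hope scheduling assumption.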
<natefinch> be back in a bit... gotta pick up my kid from preschool
<katco> launchpad question: if a bug is blocked while we're waiting on more information, do we mark it as incomplete?
<katco> i'm hesitant because it says "cannot be verified", but what i really mean is "verified, but need more information"
<mgz> katco: incomplete is fine
<katco> mgz: cool, ty
<sinzui> natefinch, how goes your branch for bug 1394755?
<mup> Bug #1394755: juju ensure-availability should be able to target existing machines <cloud-installer> <ha> <landscape> <juju-core:In Progress by natefinch> <https://launchpad.net/bugs/1394755>
<sinzui> voidspace, how goes the fix for bug 1441206?
<mup> Bug #1441206: Container destruction doesn't mark IP addresses as Dead <juju-core:In Progress by mfoord> <https://launchpad.net/bugs/1441206>
<katco> sinzui: natefinch is picking up his kiddo
<alexisb> sinzui, natefinch's fix is ready but he is seeing failing tests when he tries to merge
<alexisb> he is currently investigating
<katco> alexisb: i think that fix is under review too
<alexisb> voidspace, is done for the day
<alexisb> and fwereade you're a rock star, thank you for the help!
<sinzui> katco, thank you.
<fwereade> alexisb, I just turn up and frown at the bugs and they go away ;p
<perrito666> fwereade: you are chuck norris
<natefinch> back
<perrito666> front
<natefinch> katco: responded to your review... do you understand my response?
 * katco looking
 * natefinch doesn't want to just dismiss your review and commit without you actually reading it and agreeing I'm not crazy ;)
<katco> natefinch: ah yeah that makes sense
<natefinch> (about this)
<natefinch> katco: cool
<katco> natefinch: another place channels would have saved us some trouble :)
<natefinch> katco: yep, channels are pretty cool
 * natefinch really wishes reviewboard's markdown parser would auto-link urls
<perrito666> natefinch: I would not make the merging on a patch dependent to someone saying you are not crazy
<alexisb> :)
<voidspace> sinzui: yeah, sorry - EOD
<voidspace> sinzui: it won't be finished today and I'm off tomorrow
<voidspace> sinzui: will land in Nuremberg...
<alexisb> voidspace, that is ok, it will just go in a point release
<natefinch> perrito666: haha very funny
<alexisb> have a great weekend and we will see you next week voidspace
<sinzui> voidspace, thank you
<natefinch> I am somewhat unreasonably excited to see people again.... I think the fact that the last 4 months of my life have been dominated by a very tiny human being probably has something to do with that.
<alexisb> natefinch, I totally understand :)
<alexisb> though it only takes ~30 hours to start getting homesick
<alexisb> my husband says that he would really love a week away with adults and is jealous
<katco> alexisb: i have instructed my wife to send me at least 1 new picture of my daughter a day
<natefinch> heh yeah, my wife always tells me she's jealous that I get a week's holiday... no matter how much I tell her it's really work
<rick_h_> alexisb: heh, I'm getting pay back. One week after I get back she's gone for 8 days. Her longest trip ever.
<alexisb> the pictures really do help
<katco> i miss her when she's at school. i'm awful =|
<alexisb> yep
<rick_h_> natefinch: my in-laws still try to build me a travel itinerary "You have to go see xxx and yyy and zzz" :)
<natefinch> rick_h_: haha
<katco> rick_h_: lol i get that too!
<rick_h_> work, enjoy dinner...bar maybe? then bed
<natefinch> talking with the kids over hangouts helps, but there's no replacing hugs in person
<alexisb> rick_h_, natefinch my husband jokes that I missed all the fun in school so sprints are now my frat parties
<alexisb> natefinch, right now the hangouts are hard with jay because he cries for mommy
<katco> ack!
<alexisb> austin this last time was really really bad
<katco> that has to be heart wrenching
<alexisb> I actually stopped talking to him because it made it so hard on james
<alexisb> which killed me
<natefinch> alexisb: yeah, this one is going to be really bad for me, Zoë has become a huge Daddy's girl, and the first couple nights I'm sure it'll be hell trying to get her to go to bed.
<alexisb> :(
<natefinch> and by "for me", I mean "but much worse for my wife"
<rick_h_> natefinch: yea, it's rough when mom asks "come here and talk to daddy" and the response is "no, I don't want to talk to him" and it's your one chance to chat in however many days
<alexisb> the royal we
<rick_h_> takes time for them to figure out wtf is going on as well
<natefinch> rick_h_: that's rough
<rick_h_> we do phone recorded video swaps now more
<alexisb> rick_h_, that is a good idea
<rick_h_> helps with TZ diff, and I'll record something special, like I did from the top of table mountain last trip
<rick_h_> and it's async and mom can help record something when he's in a good mood
<rick_h_> go back/forth and helps be somewhere in the middle I think
<katco> natefinch: oh i forgot to tell you! we got a ladybug girl book for my daughter :) she's still a little young, but she likes looking at it :)
<natefinch> katco: awesome.. love those books :)
<alexisb> natefinch, has all the good skinning on the kids books
<alexisb> jay and I read "How does a dinosaur say goodnight" twice this morning
<katco> we also just got this book "dinotrucks" i think... it's hilarious
<katco> my nephew loves cars and stuff, so he loves it
<rick_h_> alexisb: if your boy gets into dragons pick up 'dragons love tacos'
<katco> alexisb: how do dinosaurs say good night? :)
<rick_h_> got it in vegas and still a fav
<alexisb> with a kiss and a hug and by tucking their tails in
<katco> aw
<alexisb> which I love to read to him because I get a kiss and a hug
<katco> my daughter is *just* starting to give hugs and it's the best thing ever
<voidspace> alexisb: sinzui: thanks - see you next week
<alexisb> cherylj, ping
<mup> Bug #1442308 was opened: 1.23 cannot deploy on vivid, but master can <ci> <local-provider> <lxc> <ubuntu-engineering> <vivid> <juju-core:Triaged> <https://launchpad.net/bugs/1442308>
<natefinch> alexisb: btw, my old co-worker just applied for that dev manager position.  I don't know who the hiring manager is for that, but hopefully they'll see her as a quality candidate, even if her prior technology skill set isn't an exact match.
<alexisb> natefinch, if you forward me her info I can ping the hiring manager
<cherylj> alexisb: what's up?
<alexisb> cherylj, the latest 1.23 bug may be fixed by one of your 1.24 fixes
<alexisb> can you take a quick peek at https://bugs.launchpad.net/juju-core/+bug/1442308
<mup> Bug #1442308: 1.23 cannot deploy on vivid, but master can <ci> <local-provider> <lxc> <ubuntu-engineering> <vivid> <juju-core:Triaged> <https://launchpad.net/bugs/1442308>
<cherylj> alexisb: sure
<alexisb> thanks
<cherylj> oh, the commit they're referencing is new functionality
<cherylj> and would not fix the problem
<cherylj> I'll look more
<natefinch> katco: realized I missed some other spots where we were manually creating that type, and so had to move the construction into a function... much cleaner now.. and it fixes some other spots that I didn't realize also needed it: http://reviews.vapour.ws/r/1405/
<katco> natefinch: tal
<natefinch> katco: thanks... I gotta run for a bit, but will try to merge later if it passes muster
<katco> k
<sinzui> cherylj, can you have a look at bug 1442308. Master likes vivid but 1.23 does not, and one of your commits might be the fix (though I cannot see how)
<mup> Bug #1442308: 1.23 cannot deploy on vivid, but master can <ci> <local-provider> <lxc> <ubuntu-engineering> <vivid> <juju-core:Triaged> <https://launchpad.net/bugs/1442308>
<cherylj> sinzui: Yeah, I'm looking at that now and I know for certain that my commit wouldn't come into play here...
<cherylj> sinzui: in the similar bug for trusty, it looks like they were able to grab the lxc log from /var/lib/juju/containers/juju-*-lxc-template/container.log.  How can I get that for this vivid failure?
<sinzui> cherylj, ha. it is still there from the last failure. let me get that to you before something cleans up
<cherylj> sinzui: thanks :)
<sinzui> cherylj, I can see that juju-vivid-lxc-template is still there from the last attempt too
<sinzui> cherylj, I attached the log
<cherylj> sinzui: thanks!  I'll take a look
<sinzui> cherylj, should I delete the current container log? Will that make future runs easier to diagnose?
<cherylj> sinzui: yes, go ahead
<sinzui> okay
<cherylj> sinzui: Do you know if you could grab that same log from the trusty failure?
<sinzui> cherylj, I cannot, trusty has never failed in CI
<sinzui> cherylj, only vivid fails
<cherylj> sinzui: oh, I guess I'm confused about the difference between bug 1442308 and 1441319
<mup> Bug #1442308: 1.23 cannot deploy on vivid, but master can <ci> <local-provider> <lxc> <ubuntu-engineering> <vivid> <juju-core:Triaged by cherylj> <https://launchpad.net/bugs/1442308>
<sinzui> cherylj, thumper asked me to separate the vivid issue from trusty. I reported a new bug because ubuntu-engineering want it fixed
<cherylj> sinzui: did you ever find a log in /var/lib/juju/containers/juju-trusty-lxc-template/container.log?
<cherylj> for bug 1441319
<mup> Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <ci> <lxc> <oil> <test-failure> <vivid> <juju-core:Incomplete by cox-katherine-e> <juju-core 1.23:Incomplete by cox-katherine-e> <https://launchpad.net/bugs/1441319>
<cherylj> I'm just confused because I'm comparing the two, but they're both referencing juju-vivid-lxc-template, even though failure message for 1441319 indicates juju-trusty-lxc-template
<sinzui> cherylj, as I said CI didn't fail, oil did and they have the log
<cherylj> ah, okay, I'll ping lmic
<sinzui> cherylj, I will hide my vivid comment from the other bug so that it is only about oil
<cherylj> sinzui: cool, thanks.  Would it be possible to get the /var/lib/juju/containers/juju-vivid-lxc-template/container.log from the successful master run?
<sinzui> cherylj, wasn't that what I just gave? or is the log reset at each bootstrap?
<sinzui> cherylj, the log is not saved, so all I could get is what was left of the machine and the size led me to think it was from many weeks of tries
<cherylj> sinzui: I think the large size is due to this "peer has disconnected" error being repeated numerous times.
<sinzui> understood
<sinzui> cherylj, I have mixed news. beta4 can bootstrap and deploy trusty charms. this is good, but you need to set default-series: trusty in your env to ensure juju doesn't try to use vivid versions of local charms
<sinzui> my bad news is I just ran out of disk.
 * sinzui cleans up
<cherylj> d'oh
<sinzui> cherylj, I can document a workaround for vivid deploying vivid charms. beta4 doesn't complete the vivid template. we can stop it ourselves, then destroy the env. creating and deploying again will work
<sinzui> so as long as you don't remove a working template, Juju is good. I am going to change CI to not delete my working template
<alexisb> thumper, ping
<thumper> alexisb: yaas?
<alexisb> do you mind joining the release call in 5 minutes
<alexisb> we are going to need to make some tough calls on 1.23 and I need a clear picture of the lxc issues on vivid
<thumper> alexisb: ok
<thumper> will be there
<thumper> link?
<alexisb> sent you the invite
<thumper> k
<thumper> sinzui: ok, where are these vivid machines?
<sinzui> thumper, I am going to add your key to ubuntu@vivid-slave-b.vapour.ws
<alexisb> katco, does morning or afternoon work better for you tomorrow?
<sinzui> thumper, try logging in.
<sinzui> thumper, I am disabling the one job on it so that CI won't use the machine
<thumper> sinzui: I'm in
<katco> alexisb: probably morning-ish
<sinzui> thumper, and it looks like CI already used the machine and confirmed that master does love vivid.
<thumper> sinzui: so master shuts down nicely, but 1.23 doesn't?
<sinzui> thumper, the juju on the machine is the latest beta which mostly works
<sinzui> thumper, yes
<thumper> sinzui: ok, all I need to do is go through all that is in master that isn't in 1.23 and look for systemd stuff I guess
<thumper> simple
<thumper> ...
<sinzui> thumper, but cherylj and I could not find a commit to correlate to the passes yesterday
 * thumper nods
<thumper> I'll start from the start
 * thumper sighs
<thumper> how do I just make a git branch refer to a particular head?
<thumper> upstream 1.23 here
<jw4> thumper you mean --set-upstream ?
<jw4> git branch --set-upstream?
<jw4> or do you mean check out a branch starting at origin/1.23
<thumper> git checkout -t upstream/1.23
<thumper> found that command
<thumper> that's what I wanted
<jw4> coolio
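For the record, the tracking-branch recipe thumper landed on (standard git, assuming a remote named upstream):

    git fetch upstream
    git checkout -t upstream/1.23    # creates a local "1.23" branch tracking upstream/1.23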
<thumper> WTF?  juju goes from 47 meg to 51 meg between 1.23 and 1.24
<thumper> sinzui: I can't even seem to get 1.23 to bootstrap a local provider on vivid
<sinzui> thumper I had just bootstrapped on that machine using local 30 minutes before our meeting
 * sinzui tries again
<thumper> sinzui: I'm currently trying to bootstrap
<thumper> the first time it failed with: 2015-04-09 23:05:35 ERROR juju.cmd supercommand.go:430 cannot initiate replica set: cannot dial mongo to initiate replicaset: no reachable servers
<sinzui> thumper, I see master passed, but the first two tries died quickly. are you getting an error immediately
<sinzui> oh...
<thumper> and interestingly (FSVO) it failed to remove the .jenv file or the datadir
<sinzui> thumper, I have seen that on this machine in the past, but not in the last few days
<thumper> so it thought it was still bootstrapped for the next try
<thumper> how long does the bootstrap process normally take?
<sinzui> thumper, I was using --force a lot on this machine trying to kill mongo
<sinzui> so I may not have seen this
<sinzui> thumper, just over a minute
<thumper> if bootstrap fails, it shouldn't leave cruft behind
<thumper> it is a bug if it does
<thumper> and it is
<thumper> which is a bug
 * thumper sadface
<thumper> at least it was my understanding that it was a bug
<thumper> perhaps the behaviour has changed to allow people to investigate
<thumper> no reachable servers... again
<thumper> takes 5 minutes to time out
 * thumper tries a third time
<thumper> oh... worked that time
<thumper> yay?
<thumper> can anyone give me the 2 minute overview of systemd commands?
<thumper> sinzui: do you know ^^?
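A two-minute systemd crib, for what it's worth (generic systemctl usage; "myunit" is a placeholder, not a real juju unit name):

    systemctl list-units --type=service    # what's running
    systemctl status myunit.service        # unit state plus recent log lines
    sudo systemctl start myunit.service    # likewise stop, restart, enable, disable
    journalctl -u myunit.service           # full logs for one unit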
<thumper> sinzui: just bootstrapped 1.23, deployed ubuntu charm, and destroyed the environment, all looked fine
<sinzui> thumper, no mongod left running?
<thumper> with 1.23 built locally from upstream/1.23 and copied across
<sinzui> It happened to me 3 times
<thumper> sinzui: not that I could tell...
<thumper> I didn't use --force
<sinzui> I saw a mongod running after the message.
<thumper> how?
<thumper> second bootstrap worked
<sinzui> and your 1.23 must be the same as mine because no one has committed
<sinzui> thumper,  ps ax | grep mongod
 * thumper looks
<sinzui> but if you didn't see an error, juju thinks it did everything right
<thumper> ok, just the one
<thumper> for the currently running
 * thumper destroys again
<thumper> and no mongo
<thumper> sinzui: hangout?
<sinzui> sure
<thumper> sinzui: https://plus.google.com/hangouts/_/canonical.com/vivid
#juju-dev 2015-04-10
<mup> Bug #1442493 was opened: Openstack services failing on 1 node while deploying using JUJU <juju-core:New> <https://launchpad.net/bugs/1442493>
<dooferlad> TheMue: morning! o/
<TheMue> dooferlad: morning o/
<dooferlad> TheMue: Could you review this for me? http://reviews.vapour.ws/r/1406/
<TheMue> dooferlad: so the sapphire group is complete for today ;)
<dooferlad> TheMue: yep!
<TheMue> dooferlad: yes, will do
<dooferlad> TheMue: thanks
<TheMue> dooferlad: uff, reading the bug takes longer than the fix :)
<dooferlad> TheMue: the effort of testing it on EC2 was larger than that by many times. MAAS just worked. Our lack of support for when there is no default VPC really screwed me over
<TheMue> dooferlad: do you know why this SNAT rule has been in there? so doesn't its removal crash something?
<dooferlad> TheMue: not that I have found.
<TheMue> dooferlad: hmm, ok. here I definitely need some good networking lessons to get more confident in understanding it
<dooferlad> TheMue: I am sure I can arrange that :-)
<TheMue> dooferlad: would be great. in my past I mostly have done typical business software, absolutely different area ;)
<TheMue> dooferlad: you've got your ship it
<dooferlad> TheMue: thanks!
<TheMue> yw
 * fwereade out for an hour or two
<dooferlad> TheMue: hangout?
<TheMue> dooferlad: coming, just had phone
<rogpeppe1> this PR removes the testing deps from the production juju code: http://reviews.vapour.ws/r/1407/
<rogpeppe1> reviews appreciated (it's pretty trivial)
<TheMue> rogpeppe1: will do after hangout
<rogpeppe1> TheMue: ta!
<dooferlad> TheMue: http://reviews.vapour.ws/r/1408/ is very similar to the last change :-)
<TheMue> *click*
<TheMue> dooferlad: thinking you're right, so ship it. alternatively %T is also valid, but no combination
<natefinch> dooferlad, TheMue: there's nothing wrong with %#T
<natefinch> http://play.golang.org/p/8XAOJVdeO2
<rogpeppe1> natefinch: why would you use %#T - it's identical to %T
<natefinch> rogpeppe1: because I always forget it's the same
<TheMue> natefinch: rogpeppe1: %#T is not documented, and %#v and %T behave differently: %#v also shows the values
<mup> Bug #1442541 was opened: hook name ommitted using juju-log <juju-core:New> <https://launchpad.net/bugs/1442541>
<TheMue> hmm, does vet complain about it?
<natefinch> TheMue: %T is the correct fix then.  I just got done fixing that error message so it *didn't* use %v.  It's doing type checking, so it should print out the type if it's incorrect, not the error's string, which doesn't tell you anything.
<natefinch> dooferlad: ^
<TheMue> natefinch: ic, so dooferlad, fix it, then ship it ;)
<natefinch> dooferlad: and thanks for fixing my go vet mistake
<dooferlad> natefinch: no problem - shame I didn't ask you what your intention was before the first review :-)
<natefinch> dooferlad: np :)
<TheMue> dooferlad: $$merge$$, not %%
<dooferlad> TheMue: darn it!
<TheMue> dooferlad: no wonder, after all this %... discussion
<rogpeppe1> natefinch: looks like go vet doesn't like the %#T
<rogpeppe1> mgz: ping
<rogpeppe1> natefinch: i think in the place you've used it, %#v would be a better bet, as if the test fails you really want to see the actual value there not just the type
<natefinch> rogpeppe1: very true
<rogpeppe1> natefinch: the test isn't doing type checking, it's doing value checking, BTW
<rogpeppe1> anyone have any idea how this build failure might be happening? http://juju-ci.vapour.ws:8080/job/github-merge-juju/2845/console
<rogpeppe1> Extant directories unknown:
<rogpeppe1>  gopkg.in/juju/charm.v5
<natefinch> rogpeppe1: regardless, printing out the string value of the error is quite a bit less than useful
<rogpeppe1> i'm presuming that's the reason for the build failure, not the vet message: apiserver/server_test.go:88: unrecognized printf flag for verb 'T': '#'
<rogpeppe1> natefinch: that's why i'd use %#v
<natefinch> rogpeppe1: yep
<rogpeppe1> natefinch: the string value is almost certainly going to be more useful than just the type though
<rogpeppe1> natefinch: which will usually be just *errors.Err
<natefinch> rogpeppe1: let's just agree that %#v is the correct fix :)
<rogpeppe1> natefinch: :)
<natefinch> dooferlad: is that code already committed?
<dooferlad> natefinch: yes
<dooferlad> natefinch: as %T
<dooferlad> I did wonder about %T: %#v...
<natefinch> dooferlad: %#v includes the type...%#v is basically %T with %v
<natefinch> well.. no, that's a bad description
<natefinch> dooferlad: %#v prints out the value as if it were go code to construct the value
<natefinch> dooferlad: http://play.golang.org/p/6HKt-7atpL
<natefinch> except of course, ignore the "type:" string on the beginning of each line, since that was just copied from the old text I had in there
<natefinch> fixed example: http://play.golang.org/p/FtEYRp1dJS
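Condensing the thread into one runnable comparison (my own example, not the juju test code): %v prints the error's message, %T its dynamic type, and %#v a Go-syntax representation that includes the type; %#T isn't a documented combination, and go vet flags it.

    package main

    import (
        "errors"
        "fmt"
    )

    func main() {
        err := errors.New("boom")
        fmt.Printf("%v\n", err)  // boom
        fmt.Printf("%T\n", err)  // *errors.errorString
        fmt.Printf("%#v\n", err) // &errors.errorString{s:"boom"}
    }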
<natefinch> I need to stop trying to explain things before 7am
<dooferlad> natefinch: no worries :-)
<dooferlad> natefinch: actually the merge job failed, so I can easily switch back to %#v if you like
<natefinch> dooferlad: yes please :)
<natefinch> rogpeppe1: about that error in the console... note that error message is looking for gopkg.in/juju/charm.v5, not gopkg.in/juju/charm.v5-unstable
<rogpeppe1> natefinch: yes, i think i know what's going on now
<rogpeppe1> natefinch: i've updated that branch to change everything to use charm.v5
<rogpeppe1> natefinch: which should fix the issue (although it makes that branch quite a bit bulkier)
<rogpeppe1> natefinch: i've also changed it to fix your %#T thing
<natefinch> rogpeppe1: I think dooferlad is making the %#v fix, though I suppose it can't hurt
<mgz> rogpeppe1: hey
<rogpeppe1> mgz: hiya
<rogpeppe1> mgz: too late! :)
<mgz> you're just loving breaking deps at the moment
<rogpeppe1> mgz: it's one of my favourite activities
<mgz> well, at least this one was an actual catch by the bot
<rogpeppe1> mgz: not really
<rogpeppe1> mgz: it's actually a bug in godeps
<rogpeppe1> mgz: because godeps uses go get to fetch new deps
<rogpeppe1> mgz: and there's no way to tell go get not to fetch recursively
<mgz> yeah, which just pulls in everything
<rogpeppe1> mgz: so really i think godeps needs to fork the go get vcs fetch functionality
<mgz> so, tells you to fix the imports, no?
<rogpeppe1> mgz: no, the imports are right
<mgz> rogpeppe1: I agree with that though, we need to not go get really
<rogpeppe1> mgz: the problem is that the tip of the repo has a different set of deps from the dep we wanted to use
<mgz> rogpeppe1: ah, that one again
<mgz> er... i can sort that for now I guess
<rogpeppe1> mgz: i'm fixing it by updating to use the latest deps
<mgz> I have a flip that just removes unknown deps rather than complaining, and assumes that if things build the godeps-stated deps were in fact the intended ones
<rogpeppe1> mgz: the branch was all about updating deps anyway
<mgz> fair enough
<rogpeppe1> mgz: so here's the branch, updated (and now huge 'cos of all the import path changes): http://reviews.vapour.ws/r/1407
<rogpeppe1> mgz: if i want to change the description of a PR, should I do it in reviews.vapour or in github ?
<mgz> rogpeppe1: well, reviews is the one people read
<rogpeppe1> mgz: i care mostly about the commit log message
<mgz> I guess change on github then
 * rogpeppe1 $$merge$$s
<rogpeppe> mgz: https://go-review.googlesource.com/#/c/8725/
<rogpeppe> mgz: it won't fix it for now but sometime in the future, i hope
<rogpeppe> mgz: (assuming it's accepted)
<mattyw> fwereade, rogpeppe is one or both of you packing dominion?
<fwereade> mattyw, I will surely pack a set or two
<fwereade> mattyw, hopefully so will rogpeppe or dimitern as well
<rogpeppe> fwereade: i definitely intend to bring some, probably seaside, prosperity, maybe intrigue too
<fwereade> rogpeppe, intrigue is probably my favourite -- hopefully dimitern will bring his :)
<fwereade> rogpeppe, any of mine you'd be particularly interested to see?
<dooferlad> TheMue: Another review for you... http://reviews.vapour.ws/r/1400/
<rogpeppe> fwereade: i haven't played many real life games with alchemy, and dark ages is always good
<fwereade> rogpeppe, I don't have alchemy actually -- I'll probably go with hinterlands/dark ages
<rogpeppe> fwereade: sgtm
<rogpeppe> fwereade: BTW i came across your recent dependency engine thing - cool stuff!
<fwereade> rogpeppe, glad you like it -- it borrows heavily from worker.Runner
<rogpeppe> fwereade: i like that it fits in with the worker framework too
<fwereade> rogpeppe, and I *suspect* will come closer still to it as I need to cover some of the more baroque cases I want in jujud
<fwereade> rogpeppe, yeah, absolutely, the worker approach has been fantastic -- it's just organising them clearly where we've really fallen down
<rogpeppe> fwereade: i still don't quite understand the overall motivation behind it mind
<fwereade> rogpeppe, I want to run something to take over some of the uniter's responsibilities when certain resources (like an api conn) are not available
<rogpeppe> fwereade: the output value thing could probably do with a little more documentation
<fwereade> rogpeppe, the thought of coordinating two distinct workers that ran at such very different levels made me cry
<fwereade> rogpeppe, noted, thanks
<rogpeppe> fwereade: i found the term "manifold" somewhat opaque; was this the meaning you had in mind "a manifold is a topological space that resembles Euclidean space near each point. More precisely, each point of an n-dimensional manifold has a neighbourhood that is homeomorphic to the Euclidean space of dimension n"
<rogpeppe> ?
<fwereade> rogpeppe, no, it's a mechanism with pipes going in and out of it
<rogpeppe> fwereade: ah "a chamber having several outlets through which a liquid or gas is distributed or gathered."
<rogpeppe> fwereade: not a term i'm familiar with
<rogpeppe> fwereade: (which isn't to say that it's not entirely appropriate!)
<fwereade> rogpeppe, I could swear I made a point of documenting that...
<fwereade> // Manifold defines the behaviour of a node in an Engine's dependency graph. It's
<fwereade> // named for the "device that connects multiple inputs or outputs" sense of the
<fwereade> // word.
<rogpeppe> fwereade: ha, i evidently skipped over that :)
<fwereade> rogpeppe, ;p
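A rough sketch of the shape being described -- the field names here are guesses for illustration, not necessarily fwereade's actual API: a Manifold is a named node in the dependency graph, declaring which workers it needs as inputs, how to start once they're available, and what it can expose to dependents.

    package main

    // Worker matches the general shape of juju's worker interface.
    type Worker interface {
        Kill()
        Wait() error
    }

    // GetResourceFunc lets a starting worker fetch one of its declared inputs.
    type GetResourceFunc func(name string, out interface{}) error

    // Manifold is a node in the graph: pipes in, pipes out, like its
    // plumbing namesake.
    type Manifold struct {
        Inputs []string                                  // names of workers this one depends on
        Start  func(get GetResourceFunc) (Worker, error) // start once inputs are available
        Output func(in Worker, out interface{}) error    // expose a resource to dependents
    }

    func main() {}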
<sinzui> katco, dooferlad : the commit that reverted the container SNAT rule in 1.23 may be causing aws bundle tests to fail exactly like bug 1441319
<mup> Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <lxc> <oil> <juju-core:Incomplete by cox-katherine-e> <https://launchpad.net/bugs/1441319>
 * sinzui tries to get container log
<dooferlad> sinzui: no, that is because we don't support container cloning.
<sinzui> dooferlad, this test passed on the previous version of 1.23 and we know the landscape bundle we are deploying has not changed
<TheMue> dooferlad: just seen your message, reviewing it now
<sinzui> this test passed on joyent and maas 1.7 and hp cloud
<dooferlad> sinzui: it was also reported before the SNAT rule change was reverted
<sinzui> dooferlad, not by CI and not with 1.23
<sinzui> dooferlad, Until your commit, CI has never seen this error. This test is voting, so 1.23 will not be blessed for a release. I need to get more information about what is wrong.
<dooferlad> sinzui: OK, the only thing that should have changed in terms of what other machines see, is that traffic from containers comes from the container's IP address, not the host machine's address. This used to be the case before rev b584fcb85b9bcce3dadba97d32fe50e8f3680e40
<dooferlad> on that commit we added an SNAT rule to modify all traffic leaving a container host to look as if it came from that host
<sinzui> dooferlad, yes, I agree, and this test was happy when that change was made. I don't know why this would appear. could there have been another change made in conjunction, for aws?
 * natefinch almost doesn't know what to do now that the HA code finally made it in.
<dooferlad> sinzui: so it worked OK on the 8th after the proxy change? That would seem more likely.
<sinzui> dooferlad, yep
<natefinch> sinzui: what would you say is the correct behavior for this bug? https://bugs.launchpad.net/juju-core/+bug/1441904   just failing?
<mup> Bug #1441904: juju upgrade-juju goes into an infinite loop if apt-get fails for any reason <canonical-is> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1441904>
<sinzui> natefinch, yes, say it failed
<natefinch> sinzui: wonder if we should retry a few times in case of network problems etc
<sinzui> natefinch, I thought all apt calls were retried 3 times. dimitern fixed an issue a few months ago where juju wasn't retrying
<natefinch> sinzui: hmm.. good question, I can check, I don't know offhand
<sinzui> katco, I added the container.log you requested on https://bugs.launchpad.net/juju-core/+bug/1441319
<mup> Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <lxc> <oil> <juju-core:Incomplete by cox-katherine-e> <https://launchpad.net/bugs/1441319>
<katco> sinzui: ty... that's from the machine that failed?
<sinzui> katco, no, larry's. the latest commit to 1.23 broke the aws deployer test in CI
<katco> sinzui: confused... isn't larry the one who reported the failure? and this is the one we're thinking is intermittent?
<sinzui> katco, yes, but now CI is affected. I am providing a log
<katco> sinzui: ah ok. so we're sure it's the same issue?
<sinzui> yep. we can reproduce it on demand with 1.23 tip deploying the landscape bundle to aws
<sinzui> I have 5 failures
<katco> sinzui: i thought larry's issue was an lxc container problem...
 * katco apologizes if she's being dense
<natefinch> rogpeppe, fwereade: FWIW, I have Alchemy, it's just not that great, so I kinda hesitate to lug it 3000 miles :)
<rogpeppe> natefinch: i've quite enjoyed it on occasion in Androminion
<sinzui> katco, oh, sorry, my error is not the same. I will report another bug.
<katco> sinzui: k, thanks for walking me around the block on that one
<dooferlad> it looks like that container is having problems because of apparmor. The peer has disconnected message is because https://github.com/dotcloud/lxc/blob/master/src/lxc/af_unix.c -> lxc_af_unix_rcv_credential is failing.
<katco> sinzui: still working on some caffeien here :p
<katco> caffeine even
<sinzui> katco, thank you for questioning me. I am low on caffeine.
<katco> sinzui: a toast!
<natefinch> rogpeppe: at least it's small.  I could probably leave the box at home and it would be less of a big deal.
<katco> dooferlad: are you refering to 1441319?
<dooferlad> katco: yes
<katco> dooferlad: which log on that bug?
<dooferlad> katco: https://bugs.launchpad.net/juju-core/+bug/1441319/+attachment/4371597/+files/container.log (the last attachment, from sinzui)
<mup> Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <lxc> <oil> <juju-core:Incomplete by cox-katherine-e> <https://launchpad.net/bugs/1441319>
<katco> dooferlad: ah ok. so that will be relavant to sinzui's new bug he's opening, not 1441319
<dooferlad> katco: yea, indeed
<katco> relevant even
<katco> jees with the spelling this morning
<alexisb> dooferlad, can you work with sinzui on a bug as a result of your latest commit
<dooferlad> alexisb: sure
<alexisb> and dooferlad I just finished reading the backscroll :)
<sinzui> dooferlad, katco sorry, more caffeine and CI's error is the same as oil's 'failed to retrieve the template to clone: template container
<sinzui>           "juju-trusty-lxc-template" did not stop'
<sinzui>         instance-id: pending so https://bugs.launchpad.net/juju-core/+bug/1441319 is the issue
<mup> Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <lxc> <oil> <juju-core:Incomplete by cox-katherine-e> <https://launchpad.net/bugs/1441319>
<katco> sinzui: that's actually great! repeatable!
<dooferlad> TheMue: thanks for that review. Fixes committed.
<natefinch> anyone have an opinion on the best way to get a PR merged into master into the 1.23 branch?  I could re-PR to 1.23, or I could cherry-pick from master to 1.23... not sure which way is "better" (or if there's a better third way)
<natefinch> mgz: ^  any thoughts?
<natefinch> (should say "to get a PR that was merged into master, into the 1.23 branch")
<natefinch> sinzui: for the record, we try apt installs *30* times, with 10 second delays between each.  Wowza.
<sinzui> natefinch, well we tried. juju can admit it failed
<katco> natefinch: i always like to cherry pick from master, but that's just my workflow
<katco> natefinch: if it's important that it be done fast for 1.23, land it there first and forward port
<natefinch> sinzui: yep, it shouldn't be a big deal to have the error from apt-get just tell the command to fail.  If it doesn't work after trying for 5 minutes, it's not going to work.
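A bounded-retry sketch of the behaviour being described (30 attempts, 10 seconds apart, then surface the failure instead of looping forever); the function names are illustrative, not juju's actual apt package:

    package main

    import (
        "fmt"
        "os/exec"
        "time"
    )

    // aptGetInstall runs apt-get once and wraps any failure with its output.
    func aptGetInstall(pkg string) error {
        out, err := exec.Command("apt-get", "install", "--yes", pkg).CombinedOutput()
        if err != nil {
            return fmt.Errorf("apt-get failed: %v: %s", err, out)
        }
        return nil
    }

    // installWithRetry retries a bounded number of times, then gives up and
    // returns the last error rather than retrying forever.
    func installWithRetry(pkg string, attempts int, delay time.Duration) error {
        var err error
        for i := 0; i < attempts; i++ {
            if err = aptGetInstall(pkg); err == nil {
                return nil
            }
            time.Sleep(delay)
        }
        return fmt.Errorf("giving up after %d attempts: %v", attempts, err)
    }

    func main() {
        if err := installWithRetry("some-package", 30, 10*time.Second); err != nil {
            fmt.Println(err)
        }
    }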
<dooferlad> katco: I thought forward porting was the new normal. Maybe just for Sapphire.
<natefinch> no, we're supposed to land on 1.23 first and forward port, from what I remember... I just did the wrong thing
<katco> dooferlad: could be?  i don't think it really matters... isn't it personal preference?
<natefinch> katco: I don't think it really matters, which is not the same as saying no one cares ;)
<katco> natefinch: hey. you paint that shed blue dammit.
<natefinch> I have an email from alexisb about making sure to land on stable and then forward port to master.... but it was only sent to me and wwitzel3, but I think that's The Way It's Supposed To Be™
<sinzui> natefinch, bug 1394755 says it is fixed in 1.23, but I don't see the commit. I do see the commit in master though
<mup> Bug #1394755: juju ensure-availability should be able to target existing machines <cloud-installer> <ha> <landscape> <juju-core:Fix Committed by natefinch> <https://launchpad.net/bugs/1394755>
<mup> Bug #1436863 was opened: test failure replicaset_test.go:52: SetUpTest.pN42_github_com_juju_juju_replicaset.MongoSuite <ci> <ppc64el> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1436863>
<natefinch> sinzui: working on that, sorry
<natefinch> katco: how do I cherry-pick the PR without listing out the 37 commits?  Picking just the merge commit says I'm missing -m, and using -m says it expects a numerical value, but I have no idea what the number should be
<natefinch> also probably doesn't help that I have merge commits from master into that branch... sigh
<sinzui> mgz, can your review http://reviews.vapour.ws/r/1409/
<katco> natefinch: sorry i was in a meeting
<katco> natefinch: i am the worst person to ask for advice on git, because i use an emacs mode which does it all for me
<natefinch> katco: haha no problem
<mgz> sinzui: on it
<natefinch> katco:  I think I'm getting it.  I just ended up doing a cherry pick of all ~34 non-merge commits
<mgz> sinzui: lgtm, hold till when you need it
<sinzui> thank you mgz
<alexisb> natefinch, "git help -a" should have what you are looking for
<alexisb> though I must admit that it has been a long time since i have done a kernel patch and used git
<natefinch> alexisb: git help doesn't always.
<katco> alexisb: you used to work on the linux kernel?
<alexisb> yes mam
<katco> alexisb: that's awesome!
<natefinch> alexisb: I thought i had it, but somehow my cherry pick left this branch in a bad state... some code referring to packages that don't exist.  Sigh.
<alexisb> it was fun, there are days I think about going back :)
<katco> alexisb: nooo! we need you!
<alexisb> katco, it has  been long enough I would have to really work to get back into it, my last public commit was in 2008 I believe
<natefinch> alexisb: I'm sure it's mostly the same ;)
<sinzui> natefinch, I have paused CI because I have a change merging now, and I don't want CI to be busy when your change arrives
<natefinch> sinzui: thanks, I guess.  I don't know that my change will make it in terribly soon, given the troubles I'm having with git, and I need to go pick up my daughter from preschool in 20 minutes
<sinzui> :(
<katco> natefinch: is it checked into 1.23?
<sinzui> natefinch, I assume CI will take 3 hours to test my version change.
<natefinch> katco: no, it's in master, that's the problem... I developed it off of master, merged master into my branch a couple times, and then PR'd into master
<katco> natefinch: so you're trying to backport?
<natefinch> katco: yes
<katco> natefinch: what's the commit hash?
<natefinch> katco: b228e89dd3ef9a3fe0f14b958db123e87a68bc48
<natefinch> katco: I presume there's some magic command line that'll do the right thing... but I'll be damned if I can figure it out.
<katco> natefinch: from what i can remember, it's not a single command
<katco> natefinch: you set cherry head, and upstream
<katco> natefinch: and then start picking commits
<alexisb> natefinch, katco if we have someone that can help out natefinch on this that would be great
<alexisb> we gotta get 1.23 out
<katco> alexisb: i am backporting now
<mup> Bug #1258485 changed: support leader election for charms <juju-core:Fix Released by jameinel> <cassandra (Juju Charms Collection):Triaged> <postgresql (Juju Charms Collection):Triaged> <https://launchpad.net/bugs/1258485>
<katco> natefinch: ptal: http://reviews.vapour.ws/r/1410/
<katco> natefinch: especially make sure i didn't mess up the merge
<natefinch> katco: hmm.... I think I must have given you the wrong hash.... that's not what needed to get ported... well, that may still need to get ported, but that's not all of it
<katco> natefinch: is there more than one commit?
<natefinch> katco: there's two different merges... one is the test fixes and one is the HA --to code
<katco> natefinch: k i'll pull the other commit too
<natefinch> katco: there's Merge pull request #1962 from natefinch/ha3  which contains 37 commits :/
<katco> natefinch: k tal
<mgz> it is possible to charrypick an entire merged branch with git, I don't recall the spelling off the top of my head
<natefinch> mgz: it's the spelling that I'm not sure of... plus the branch contains commits that are merges from master to that branch, which would contain changes that shouldn't go into 1.23 (presumably)
<mgz> natefinch: right, but the diff from merged to master commit vs previous commit to master should be the right thing
<mgz> so the dumb version is just take that diff and apply it, but cherry-pick does have a one-step of that
<natefinch> mgz: I would think so, but I'm always disappointed in what vcs figures out on its own
<mgz> git just doesn't record mainline, so talking about three-way merges in the commandline is more annoying
<natefinch> unfortunately, I gotta run to pick up my daughter.... I'll be back in like an hour.  That's about the best I can do.
<katco> gosh... it's going to be hard for me to merge this much in without knowing anything about what i'm merging
<mgz> katco: how conflicty is the `git cherry-pick -m <rev for parent> <merge rev>`?
<sinzui> natefinch, mgz, I think I had to use patch or format-patch once
<katco> mgz: 47 unmerged files
<mup> Bug #1442719 was opened: juju sync-tools fails <sync-tools> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1442719>
<katco> mgz: i (or rather emacs) ended up doing "git --no-pager -c core.preloadindex=true merge master~2 --no-commit"
<mgz> hm, I wonder if that's actually right without specifying the mainline?
<katco> mgz: i'm merging it into a branch off of 1.23 upstream
<mgz> maybe the ~2 just happens to be resolving right on sha1 alpha order
<katco> mgz: it is; that's emacs doing magic
<katco> mgz: i found the merge master hash, hit m and it does the right thing
<katco> mgz: i would hate doing all of this manually
<katco> mgz: which is to say via the command line
<mgz> well resolving conflicts is always going to need an editor
<mgz> you can use an OS if you want :P
<katco> mgz: haha
<katco> mgz: i'll show you this workflow in nuremberg. it's quite nice
<mgz> neat
<katco> mgz: merge dumps me into a buffer with all merge conflicts. pressing "e" on an unmerged file dumps me into a visual diff
<katco> mgz: and i hit "a" or "b" depending on which side i want. and can edit in 3rd pane at bottom
<katco> alexisb: i can't perform this backport. i don't know enough about the patch to perform the requisite merges.
<alexisb> katco, just fyi, the juju-core leadership team advocates the process of forward porting as juju has historically seen more issues with back porting than forward porting
<alexisb> katco, ack, we will have to wait for natefinch then
<katco> alexisb: ok, duly noted.
<alexisb> as wwitzel3  is traveling
<alexisb> mgz, sinzui if need be we will have to add the HA work in a point release
<alexisb> sinzui, mgz, dooferlad, where are we with the aws container failure?
<sinzui> alexisb, If Ubuntu deem the patch to be a feature, they will not accept it. natefinch, is the patch more like a bug than a feature?
<alexisb> sinzui, in many of the "bugs" we fix for stakeholders there is a fine line between feature and bug
<alexisb> I see your point, but thought we had decided this was a bug
<sinzui> alexisb, yes, and Ubuntu doesn't honour that line; they rejected our streams change in 1.20 and outright refused to package the backup/restore changes to 1.16.4
<alexisb> katco, get with natefinch when he is back and lets get this merged
<alexisb> sinzui, I see this impacting our ability to deliver by weds
<alexisb> sinzui, where do we stand on the aws failure mentioned this morning?
<sinzui> No progress.
<sinzui> alexisb, and this bug was just moved to juju-core https://bugs.launchpad.net/juju-core/+bug/1439535
<mup> Bug #1439535: 1.23-beta2 websocket incompatibility <juju-core:New> <python-jujuclient:Incomplete> <https://launchpad.net/bugs/1439535>
<sinzui> ^ There are contradictions about beta3 being affected
<alexisb> so sinzui adding bugs is not helping get the release out :)
<alexisb> so lets start with the bug from this morning, what does no progress mean?
<alexisb> is there no one from core looking at it?  is there something blocking us from debugging the issue?
<sinzui> alexisb, neither katco nor dooferlad has asked for more information from me after I provided the log that was requested.
<sinzui> alexisb, I am at a critical moment in the release, so I cannot let myself get too distracted
<alexisb> sinzui, that is fine, I understand
<alexisb> dooferlad, are you still around?
<dooferlad> alexisb: just. In the middle of feeding the baby
<alexisb> :) that is a fun task
<alexisb> I know it is close to your eod but have you had a chance to take a look at the logs sinzui provided
<dooferlad> alexisb: As I said above, the failure seems to be to do with LXC and apparmor having a bad interaction.
<dooferlad> alexisb: and I haven't touched code anywhere near that
<alexisb> dooferlad, ack, katco is this similar to the issue you looked at earlier this week?
<katco> alexisb: yeah... looks to be exactly the same
<alexisb> which we determined was not a juju issue, correct?
<sinzui> dooferlad, really. great. let me check to see if an apparmor change was delivered to trusty in the last day (and see if the aws mirrors are out of sync)
<katco> alexisb: we were leaning that way because we hadn't touched the code in that area; however, sinzui indicated that a commit caused this to be repeatable?
<katco> dooferlad: or was sinzui saying the change was your commit?
<sinzui> dooferlad, katco. apparmor did change last week:
<sinzui> 2.8.95~2430-0ubuntu5	release (main)	2014-04-04
<katco> sinzui: if we haven't done any commits that touch that area of code, i'm blaming that
<sinzui> We have testing from this week that says aws was fine, but let me see if I can find evidence that the mirrors are stale
<katco> sinzui: i apologize, i still don't understand the connection between aws and an lxc issue?
<sinzui> katco, I am sorry about the situation
<katco> sinzui: i'm assuming that i'm missing something; nothing to be sorry for
<dooferlad> OK all, see you in Germany!
<katco> dooferlad: see you there!
<natefinch> katco:  back
<rogpeppe> mgz: here's the start of the fix to godeps - copying a bunch of code from the go tool: https://codereview.appspot.com/223390043/
<katco> natefinch: hey, i can't backport that change... there are too many merge conflicts i don't understand
<natefinch> katco: I was worried about that
<mgz> rogpeppe: neat! I had a look at your get upstream change earlier, that seems like a good idea regardless
<katco> natefinch: try "git --no-pager -c core.preloadindex=true merge master~2 --no-commit" but tweak the ~2 to point to the correct merge hash for your branch of master
<natefinch> katco: man, how did I miss something so obvious?
<katco> natefinch: (shrugs) it's easy to get blinders on and think you have to cherry-pick
<perrito666> natefinch: obviously you dont use emacs, there most likely is a shortcut for that, smth like: Ctrl-M gnpccptmmnc2
<katco> perrito666: lol i put my cursor over the merge hash and pressed "m"
<katco> some of those command line flags are probably superfluous
 * katco somewhat embarrassingly just realized natefinch was joking.
<natefinch> katco: how do I figure out the number after ~?  That stuff confuses the hell out of me
<natefinch> katco: lol yes
<katco> natefinch: it's just an offset from head
<katco> natefinch: so however many commits back from head that merge hash is for you
<natefinch> katco: you say that, but sometimes ~2 brings in like 1000 changes if you've done a merge
<katco> natefinch: i pulled from trunk recently so it is likely the same, but i would 2x check
<alexisb> katco, I need to step out for a bit, but I will be back soon, can you please make sure that the latest summary gets captured here: https://bugs.launchpad.net/juju-core/+bug/1441319
<mup> Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <deployer> <lxc> <oil> <regression> <juju-core:Triaged by cox-katherine-e> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1441319>
<alexisb> so that we don't lose the conversation that happened on irc
<katco> alexisb: sure
<alexisb> thanks
<natefinch> katco: this keeps bringing in a ton of changes to files I know I haven't been anywhere near
<katco> natefinch: ok, i'll see if i can wizard up another command for you
<natefinch> katco: what I really need is to just whip up a patch file from the commit to master and then apply that to 1.23...
<katco> natefinch: so see on the PR, the merge has 2 parents? https://github.com/juju/juju/commit/b228e89dd3ef9a3fe0f14b958db123e87a68bc48
<katco> natefinch: which is the parent you don't want to cherry?
 * perrito666 has a ton of things to do and already lost 6 hours on bureaucracy and now I just discovered that while I was away the mailman delivered mail... before the rain
<natefinch> katco: both of those should be fine... though I don't know why those are parents per se... those are just the two previous commits I made
<natefinch> katco: I think the problem is that I merged master into my branch, which is evidently screwing up git, because it thinks I then want to merge that into 1.23, which I don't.
<cmars> natefinch, i've often found git diff start..merge | patch -p1 --merge works, when cherry-pick doesn't dwiw. maybe there's a way to tell cherry-pick how to pick a sensible parent, but i've not grasped it
<cmars> that's kind of dirty, but works when branches have diverged quite a bit
<katco> natefinch: so maybe do both: git cherry-pick b228e89dd3ef9a3fe0f14b958db123e87a68bc48 -m 1
<katco> natefinch: and then git cherry-pick b228e89dd3ef9a3fe0f14b958db123e87a68bc48 -m 2
<mgz> right, that's it
<mgz> whether it's -m 1 or 2 depends on alpha order of the parent shas
<mgz> you may want to squidge the commits anyway if using cherry-pick
<cmars> git-squidge should totally be a thing
<katco> cmars: lol
<natefinch> katco: so far, the first one looks promising
<natefinch> katco: and that second one brings in a bunch of crap that is totally wrong.  So maybe just the first one is correct
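For the record, -m doesn't depend on sha ordering: it names which of the merge's recorded parents to diff against, and parent 1 is the branch that was checked out when the merge was made (master, for a GitHub PR merge). So the sequence that worked here is roughly:

    git log -1 --format='%P' b228e89d    # list the merge's parents, in recorded order
    git checkout 1.23
    git cherry-pick -m 1 b228e89d        # apply the merge's diff relative to parent 1

That is why -m 1 picked up only the PR's changes, while -m 2 dragged in everything master had gained since the branch point.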
<natefinch> running tests... we'll see how it goes. It looks promising
<katco> natefinch: are you sure that has all of the changes you need?
<natefinch> katco: nope.  I should probably throw up a diff and make sure it includes everything
<natefinch> katco: looking good.  Files look correct.  Number of changes is correct
<katco> natefinch: woohoo!
<natefinch> well, it is missing my other test fixes,  but that's just a single commit which hopefully is trivial to cherry-pick
<katco> not bad
<natefinch> they're not strictly part of HA, but they fail reliably on my machine... maybe they don't on the bot though
<natefinch> I think the bot is running with GOMAXPROCS=1, which is likely the difference
<katco> ah yeah i run with that too... i learned quickly lol
<natefinch> katco: I think that backport you produced is actually the other half of stuff I'd like to get in
<katco> natefinch: lol i deleted the pr
<natefinch> katco: here's the review for my backport (really, I should just call it your backport ;)
<natefinch> http://reviews.vapour.ws/r/1414/
<katco> natefinch: ah jees... for the same reason i couldn't do the backport, this is going to be hard to review
<katco> natefinch: how confident are you that you pulled the right changes?
<mgz> I'm going to have a bunch of trivial changes to licence headers I'll need a rubber stamp on shortly
<mgz> ..meh, and dep bumps that's a bit more painful
<natefinch> katco: basically 100%.  I did a visual side by side diff of the changes for the original PR: https://github.com/juju/juju/pull/1962/files  vs. the backport PR: https://github.com/juju/juju/pull/2060/files
<natefinch> katco: the diff in the review in this case is not really useful.  You need to diff the two diffs :)
<katco> natefinch: k i'm going to rubber stamp it then...
<katco> natefinch: and tests are passing?
<mgz> oh, but I'm being called for food first
<natefinch> katco: they only fail in the way the ones on master fail for me... if that's any consolation.  Some of that is what I fixed in that other PR
<katco> k
<katco> you have been rubber stamped
<natefinch> katco: I promise to fall on the grenade if it blows up after merging.
<katco> lol
<natefinch> So, last night at 6pm, I realized that my last two pairs of jeans that were at all fit to wear had decided to simultaneously acquire large rips in the knees.... and given no time to actually go out shopping anywhere.. .I made my first purchase of jeans off amazon, to be delivered tomorrow, hopefully before I have to leave.
<katco> lol that is brave
<natefinch> I *think* they're a style that has fit well before... and men's sizes are usually regular enough that as long as I stay away from slim-fit and other nonsense, they'll just work.  But yes, it is a bit of a gamble.
<natefinch> I wouldn't have done it if they didn't explicitly say they have free returns.
<natefinch> katco: btw..... just noticed this works: https://patch-diff.githubusercontent.com/raw/juju/juju/pull/1962.patch
<sinzui> dooferlad, katco. I have confirmed that 1.22.1 and 1.23-beta4 can deploy the test bundle to aws, but the revision that reverted container networking cannot. I can report a separate bug if you think that will be less confusing.
<natefinch> katco: I got that from seeing this tip at the bottom of the PR page:  ProTip! Add .patch or .diff to the end of URLs for Git's plaintext views.
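One plausible use of that (standard git; the URL is the one from the chat): pull the PR down as a mailbox-format patch and apply it with a three-way merge, which can fall back on blob matching when the branches have diverged:

    curl -sL https://patch-diff.githubusercontent.com/raw/juju/juju/pull/1962.patch | git am -3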
<sinzui> dooferlad, The release notes for the container networking state it was only enabled for aws and maas. Could the revision be incomplete? Does something else need to change for aws?
<natefinch> anyway, I gotta run some errands, I'll check back in later to see if this has blown up or not
<natefinch> katco: I'll try to get you an email with info ASAP, but it might be late tonight.
<natefinch> (info about team lead stuff)
<katco> sinzui: can you explain the lxc + aws connection to me?
<sinzui> katco, bundle is deploying to two containers.
<katco> sinzui: ah ok, so it's deploying containers to an aws host?
<sinzui> yes. Two app servers in a containers, and haproxy directly on the host to round-robin the work load
<katco> sinzui: ah ok. thanks, that's the information i was missing
<sinzui> katco, let me update the bug with the command line I used. Any of us can run it. It isn't anything like the awkward CI tests
<katco> sinzui: awesome
<katco> sinzui: so to echo this back to make sure i understand
<katco> sinzui: 1.23-b4 works? where was the commit that breaks things? trunk?
<sinzui> katco, dooferlad's commit is the only change since 1.23-beta4 was released. We used cherylj's commit
<katco> sinzui: ah so it was to 1.23-beta4, it just hasn't been packaged (released) yet?
<sinzui> katco, it was released 20 minutes ago.
<katco> sinzui: so i'm confused (sorry)... you said 1.23-beta4 is working, but 1.23-beta4 includes dooferlad's commit, which is not working?
<sinzui> no
<sinzui> katco, we selected cherylj's as the official 1.23-beta4 because tip was broken.
<katco> ah ok
<katco> i understand now, thanks
<katco> and that commit repeatably works whereas dooferlad's commit repeatably fails
<katco> sinzui: is that https://github.com/juju/juju/commit/7e7bc9d3ad436cf25fa725f54334527cea9cb938 ?
<sinzui> katco, https://bugs.launchpad.net/juju-core/+bug/1441319/comments/11
<mup> Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <deployer> <lxc> <oil> <regression> <juju-core:Triaged by cox-katherine-e> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1441319>
<sinzui> yes, that is the commit that breaks things.
<katco> sinzui: well at first glance, it certainly looks to be directly related...
<sinzui> katco, Since the release notes say only maas and aws were supported by SNAT, maybe another change was made that also needs to be reverted
<lazyPower> Attention core developers - juju actions are friggin sweet. Thank you for this feature. that is all
<katco> lazyPower: hey you are pretty awesome.
<lazyPower> :D
<lazyPower> I've been prototyping actions for a couple days now. i want to go through and hyperbole all the charms with actions now
<katco> lazyPower: i want to work on a charm with you at the sprint
<lazyPower> actions ALL THE THINGS
<katco> lol
<lazyPower> katco: i wont be at n.burg :(
<katco> lazyPower: oh what the boo
<lazyPower> only a handful of eco peeps will be there, we got split up
<lazyPower> we're now under d. westervelt in the big division of juju teams
<katco> yeah heard about that
<katco> didn't realize you wouldn't be at the sprints though... for good?
<lazyPower> hopefully not
<lazyPower> but we'll see what the future brings
<lazyPower> marco, cory and antonio will be there however
<katco> cool
<lazyPower> also - i'm *always* up for a review/pairing session
<lazyPower> just hit me with a repo link and a timeblock and i can make time to pair
<katco> cool
<katco> wish i had more time to actually use juju and not just work on it :p
<katco> just have a small installation on my home network
<lazyPower> i've got - count it - 8 environments running now
<lazyPower> that i manage/assist
<katco> wow
<katco> that's pretty cool
<lazyPower> :D
<lazyPower> getting easier every day
<katco> my cousin works at a zoo which is understaffed
<lazyPower> keep landing great features and i'll keep spinning up the envs
<katco> and they need to stand up a bunch of standard windows things
<katco> pinged him to use juju
<lazyPower> oh man
<lazyPower> our windows support without maas/openstack is non-existent today
<katco> so bring them to maas! ;)
<lazyPower> +1 for that
<katco> sinzui: so this kind of blows any theory that the apparmor update was at all responsible?
<katco> sinzui: since cherylj's commit works fine
<sinzui> katco, yes. I also didn't see any issue with the aws mirrors to imply they are different from other mirrors
<katco> sinzui: ok. well, we'll have to ping alexisb for her opinion on what to do. i'm assuming this blocks v1.23 for the time being?
<sinzui> katco, this is very awkward, because dooferlad was reverting to unblock. so we cannot revert his change. We need to find another change that was made related to the SNAT
<katco> sinzui: do you know if anyone else was involved with that code?
<sinzui> I don't. I can only think to use annotate and log to find other parts of the code that were changed for snat.
<katco> sinzui: looks like possibly dimiter
<alexisb> katco, whats up?
<katco> alexisb: well, curtis has arguably proved that we have a commit in juju that is responsible for the failures
<alexisb> lazyPower, make sure to thank jw4 for actions
<katco> alexisb: it's made complicated by the fact that the commit was backing another commit out to fix another issue
<katco> alexisb: we probably need dimiter+team to make further changes to fix tip
<katco> 1.23 tip
<alexisb> well that means we have to wait till monday for those fixes
<katco> alexisb: that is the issue we needed your input on
<katco> alexisb: what does that mean for vivid?
<alexisb> it means we release vivid with a broken juju
<alexisb> so what "fix" was reverted that broke aws and why did we revert it?
<katco> sorry had to blow my nose >.<
<katco> sinzui: do you know the answer to that question?
<alexisb> the issue is that we have a known issue that affects oil and any openstack bundle really, and the current 1.23 blessed version does not have a fix for that issue
<katco> alexisb: here's the suspect commit: https://github.com/juju/juju/commit/7e7bc9d3ad436cf25fa725f54334527cea9cb938
<alexisb> which means the first 1.23.0 we are targeted to get released to vivid will break many of our stakeholders
<alexisb> but if we dont release then juju will just not work on vivid
<katco> alexisb: well that's not a good decision to have to make.
<alexisb> katco, so that commit is actually a fix for the issue I am referring to
<alexisb> I take it that is the one that is breaking aws
<katco> alexisb: yep.
<sinzui> alexisb, the bug doesn't really say we reverted the container network changes. It says we made a small change that addresses stakeholder concerns. the change just is not enough to restore AWS containers. maas containers are fixed
<katco> alexisb: can oil work around the bug by manually tweaking the nat tables on containers?
<alexisb> yeah I actually dont think it is reverting anything
<mgz> hmm, who reported the aws container issues?
<sinzui> alexisb, this is difficult. before the change this morning some charms were broken, now all aws containers are broken
<alexisb> mgz, you guys did :)
<alexisb> katco, I dont have answer to your question jamespage would have an answer
<alexisb> sinzui, yeah it sucks :)
<sinzui> mgz, https://bugs.launchpad.net/juju-core/+bug/1441319/comments/9 and beyond
<mup> Bug #1441319: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <deployer> <lxc> <oil> <regression> <juju-core:Triaged by cox-katherine-e> <juju-core 1.23:Triaged> <https://launchpad.net/bugs/1441319>
<alexisb> so if we know that maas is fixed, is there a way to trace the aws code path and figure out what is different that would affect the issue
<alexisb> katco, question for you ^^^
<katco> alexisb: it's possible, but not likely in the time we have with what resources are left
<alexisb> katco, that is totally fair
<alexisb> so the next question is, is it more important for the first release to have oil working or aws containers working?
<alexisb> and katco I need to follow-up re your question on work around
<mgz> so... I think containers on aws matters a fair bit for our testing, but I don't know of any customers deploying on aws
 * alexisb goes to jason hobbs
<mgz> as in, big paying ones, lots of people use it
<katco> alexisb: mgz: yeah, i'm thinking OIL is probably more important to get right out of the gate?
<alexisb> yeah I am thinking oil working is going to be the priority
<alexisb> katco, you beat me to it
<katco> will oil be utilizing vivid right away?
<mgz> almost certainly not
<sinzui> katco, not likely
<katco> hm
<katco> so the "crumple zones" we have are: workarounds, and adoption rate
<alexisb> but they do use the latest stable of juju
<katco> if we can understand those better, we could probably make a better decision
<sinzui> and they want leader elections
<katco> yeah
<katco> well, i guess i vote oil then.
<alexisb> yeah so katco we do have a pretty good understanding of those, oil is the right choice
<alexisb> it just sucks
<alexisb> given there is an obvious failure we can see in our testing
<katco> yeah =/
<alexisb> so sinzui ....
<katco> another lesson learned: releases should be "done" well before sprints and everyone takes off
<alexisb> lol
<sinzui> alexisb, there are now 3 regressions in https://bugs.launchpad.net/juju-core/+milestone/1.23.0
<alexisb> well we had planned for 3/27 but you know how that goes
<katco> lol yeah
<alexisb> sinzui, you are referring to all the ones marked critical
<sinzui> yes.
<alexisb> so lp 1439535 just came to us today
<alexisb> we think that 811 is fixed
<sinzui> alexisb, and I think we need to split one of them, because while the errors CI sees are the same as oil's in one of the bugs, we know they were using an older juju
<alexisb> and will not be able to verify that without a release
<sinzui> alexisb, we could say it is fixed, except the same charms listed in the bug are still broken on aws
<alexisb> sinzui, we should open a new bug, as what I really want to know is if oil is fixed for that bug
<alexisb> if it is then we need to figure out the path for aws
<alexisb> if it is not then we need to rethink the fix altogether
<sinzui> okay, I will shuffle the bugs
<alexisb> sinzui, thank you for all you work helping us sort all these issues out
<katco> sinzui: yeah curtis, you are amazing
<davecheney> katco: anything customer facing will be on the LTS releases, as that is all we will support and all that our support wing will support customers using
<davecheney> we don't have charms for the non LTS releases either
<sinzui> alexisb, katco, My son doesn't think so. I forgot to pick him up from school 45 minutes ago. I suck
<katco> sinzui: oh no :(
<katco> davecheney: ah ok ty for the added info
<davecheney> katco: and recently I found out that CTS doesn't even start recommending the new LTS version for 4-6 months after release
<alexisb> katco, davecheney is correct, however we do have a special circumstance in the case of 1.23 because the openstack team's next released set of charms will leverage Leader Elections
<davecheney> the first "stability release" or some other euphemism, strangely coinciding with the point release
<alexisb> so we will need 1.23 for that release at the end of the month
<davecheney> alexisb: what does that have to do with V ?
<katco> davecheney: i am familiar with that pattern from the MS world. maybe it spilled over somehow
<alexisb> davecheney, point taken, actually nothing except the release is aligned
<davecheney> alexisb: it's probably because to get 1.23 backported we always have to land a release in the current version
<davecheney> that's the way backports work apparently
<davecheney> if you'd like to know more
<davecheney> i'll be crying into my beer all next week about this
<davecheney> (try the fish)
<katco> davecheney: lol, i'll need to pick your brain to understand the situation better
<katco> davecheney: if you don't mind some annoying questions :)
<davecheney> katco: oh it's a turbulet tail
<katco> haha
<davecheney> full of technical intriegue and skullduggery
<davecheney> turbulet -- what kind of a word is that
<alexisb> davecheney, lol
<alexisb> davecheney, I like having you in this timezone
<davecheney> how do I even adjective
<katco> davecheney: i think you have a novel in you ;)
<davecheney> it's weird working in this timezone
<davecheney> i'm used to having the west coast available all morning
<davecheney> right now, they don't come online til I get back to the hotel
<mgz> katco: I am spamming reviewboard with licence header fixes for various juju subrepos at various versions of the branches - if you get a mo can you rubber stamp them?
<katco> mgz: sure
<mgz> cmd and testing*2 are there, charm*2 should be there shortly
<mgz> then I can do the dep update and juju*2
<mgz> katco: hm, actually charm is not under reviewboard, those are on https://github.com/juju/charm/pulls only
<mup> Bug #1442801 was opened: aws containers are broken in 1.23 <ci> <deployer> <ec2-provider> <lxc> <regression> <juju-core:Triaged> <juju-core 1.23:Triaged by cox-katherine-e> <https://launchpad.net/bugs/1442801>
<katco> mgz: k tal
<katco> mgz: i think i got all of them
<mgz> katco: thanks!
<katco> np
<mgz> wtf is date 2015-03-30T-9:15:59Z in dependencies.tsv about...
<mgz> I guess I'll just change that line as well
<katco> mgz: i don't know what that's used for, so i always just change it
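For illustration, a short Go sketch of why a stamp like "2015-03-30T-9:15:59Z" is malformed: RFC3339 requires a zero-padded two-digit hour, so time.Parse rejects it. (How godeps itself treats the field is not shown here.)

package main

import (
    "fmt"
    "time"
)

func main() {
    // The first stamp has "-9" where RFC3339 expects a two-digit hour,
    // so parsing fails; the corrected form parses cleanly.
    for _, ts := range []string{"2015-03-30T-9:15:59Z", "2015-03-30T09:15:59Z"} {
        _, err := time.Parse(time.RFC3339, ts)
        fmt.Printf("%q -> err=%v\n", ts, err)
    }
}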
<mgz> urk, there are non-trivial charm.v4 changes it seems
<mgz> hm nope, git is just weird, everything is fine
<katco> time to switch to my laptop to make sure everything is working properly
<mgz> katco: last bit http://reviews.vapour.ws/r/1417
<mgz> I may also do trunk while I'm here but less urgency on that
<katco> mgz: kathump
<katco> (sound of a rubber stamp)
<mgz> :D
<alexisb> katco, did you assign yourself to lp1442801
<katco> alexisb: no
<alexisb> katco, ok, I am going to go ahead and assign it to james
<katco> alexisb: ok
<mgz> if anyone has a mo, licence header changes for trunk too, http://reviews.vapour.ws/r/1418
#juju-dev 2015-04-11
 * perrito666 enjoying a 6hs layover
#juju-dev 2015-04-12
<jw4> is there a channel for conversations at nuremberg?
<davecheney> jw4: you're in it
 * davecheney shit eating grin
<jw4> davecheney, i don't want to hear about that while eating lunch :)
<jw4> Anyone else want a 40 euro lunch buffet... Quite tasty...
<davecheney> hmm, (╯°□°)╯︵ ┻━┻
<davecheney> jw4: you could have a 17 euro hamburger
<davecheney> % ls
<davecheney> "\"
<davecheney> fml
<davecheney> how am i going to delete that
<jw4> At least it wasn't your password
<davecheney> my password is tableflip
<jw4> Lol
<davecheney> it's hard to type on european keyboards
<jw4> Hehe.  So do we have a conference room today or just camping in our hotel rooms?
<davecheney> jw4: screw conference rooms, they have a bar
<jw4> Lol.. Is that where you are?
<jw4> Workaround fix for bug 1438489 : http://reviews.vapour.ws/r/1411/
<mup> Bug #1438489: juju stop responding after juju-upgrade <upgrade-juju> <juju-core:Triaged by johnweldon4> <https://launchpad.net/bugs/1438489>
<natefinch> anyone else finding the canonical (and hotel) internet to be just horrible? It seemed fine for a bit, then it just crapped out
<natefinch> can't come close to doing a hangout
<jw4> natefinch, seems okay for me, but I'm just connecting to the room internet, not the 'Canonical' AP
<jw4> I was able to do skype earlier, although sending video was spotty.
<jw4> fwiw, it seems that the Canonical wifi signal is dramatically dropping in and out using my WiFi analyzer app on my phone...
#juju-dev 2016-04-11
<thumper> wallyworld: quick call then
<thumper> wallyworld: 1:1 ?
<wallyworld> thumper: didn't you notice the sarcasm?
<thumper> no
<wallyworld> ok
<thumper> wallyworld: I actually thought that you were looking forward to working
<rick_h_> wallyworld: :p
<wallyworld> rick_h_: living the dream once again, wheeeeee
<rick_h_> hey, you know you missed it! can't miss the finish line
<rick_h_> wallyworld: ^
<wallyworld> rick_h_: indeed, wasn't my first choice to disappear this last little while
<anastasiamac> isn't it one of the relay strategy to put strongest sprinters last? :D
<rick_h_> hah
<rick_h_> good times while away wallyworld ?
<rick_h_> i'm heading out next week. leave it to the best of the best at the end :p
<wallyworld> rick_h_: i wish, spent the entire time working my fingers to the bone - ended up having to buy a chain saw to get everything done :-D
<rick_h_> wallyworld: oooh new tools ftw!
<wallyworld> rick_h_: you camping or something next week?
<wallyworld> yeah, new tools :-)
<rick_h_> wallyworld: i wish. snowing today
<wallyworld> i wish we had snow
<rick_h_> wife and i celebrating 10yrs in hawaii
<wallyworld> rick_h_: oh congrats
<rick_h_> so i'll be closer where i can keep an eye on you :)
<wallyworld> i'll wave from my balcony
<rick_h_> new camper is done may 11 so after these may sprints i'll live from the woods for a while
<wallyworld> with no wifi :-(
<rick_h_> nope have a booster antenna for the mifi device
<davecheney> fatal error: concurrent map read and map write
<davecheney> goroutine 1660 [running]:
<davecheney> runtime.throw(0x18945e0, 0x21)
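For context, a hedged sketch of the class of bug behind that crash: Go maps are not safe for unsynchronized concurrent access, and the runtime deliberately aborts when it detects a concurrent read and write. Guarding the map with a mutex, as below, is the usual fix; this is illustrative, not the actual gomanta code.

package main

import "sync"

// counter guards its map with a mutex; without the locking, concurrent
// calls to inc can trigger "fatal error: concurrent map read and map write".
type counter struct {
    mu sync.Mutex
    m  map[string]int
}

func (c *counter) inc(key string) {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.m[key]++
}

func main() {
    c := &counter{m: make(map[string]int)}
    var wg sync.WaitGroup
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            c.inc("hits")
        }()
    }
    wg.Wait()
}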
<davecheney> bzzt, cannot land my 1.25 fix without fixing gomanta
<davecheney> fairy nuf
<rick_h_> davecheney: that seems ungood
<davecheney> rick_h_: thumper cherylj menn0 http://reviews.vapour.ws/r/4507/
<davecheney> fix is here
<davecheney> i need to land that before I can land that other fix
<wallyworld> davecheney: fwiw, in master joyent doesn't use manta anymore, still may need to clean up some old tests
<davecheney> wallyworld: thanks for that
<davecheney> i need to land this on master to backport it
<davecheney> then i'll raise a tech debt card to remove the manta library as a dep
<davecheney> +1, less things to depend on, wheee!
<wallyworld> davecheney: sure, np. the change to not use manta only landed recently, still scrambling to clean everything up
<wallyworld> but yes, less deps is good. and we now don't need provider storage so long as the provider supports tagging
<menn0> cherylj, wallyworld, davecheney, sinzui: launchpad is down so the check blockers script is dying at the start of merge attempts
<wallyworld> really?
<davecheney> looks like i picked a bad day to quit sniffing glue
<wallyworld> lol
<blahdeblah> menn0 cherylj wallyworld sinzui: we're working on it
<wallyworld> ty
<wallyworld> did the hamster die
<menn0> blahdeblah: ok great
 * menn0 starts following @launchpadstatus
<davecheney> you must construct more pylons
<sinzui> menn0: I can comment out the check
<davecheney> sinzui: please
<menn0> sinzui: that would be good but it might not be enough. juju has deps on packages hosted on launchpad. godeps might complain.
<menn0> sinzui: it's worth a try though
<sinzui> menn0: bzr is working so disabling the check might be enough
<menn0> sinzui: great
<sinzui> menn0: http://juju-ci.vapour.ws:8080/view/Juju%20Ecosystem/job/github-merge-juju/7283/console is retrying now
<menn0> sinzui: looks good so far
<menn0> spoke too soon
<sinzui> :/
<sinzui> no I just got godeps
 * menn0 nods 
<menn0> seems to be stuck at godeps
<menn0> progress! :)
<menn0> sinzui: it seems to be moving... i'll keep an eye on it
<menn0> sinzui: thanks!
<sinzui> np menn0
<menn0> launchpad.net appears to be back now anyway
<mup> Bug #1568643 opened: RebootSuite times out if unrelated lxd containers exist <juju-core:New> <https://launchpad.net/bugs/1568643>
<alexisb> sinzui, I am pretty sure you are a machine that never sleeps
<alexisb> thanks for the 1.6 work
<sinzui> alexisb: I don't sleep well :)
<davecheney> wallyworld: https://github.com/juju/juju/pull/5065
<wallyworld> looking
<davecheney> ^ backport to 1.25 which unblocks my other change landing
<wallyworld> lgtm
<davecheney> danku!
<axw> anastasiamac: reviewed your branch
<anastasiamac> axw: thnx \o/
<menn0> grrr... why has lxd suddenly stopped working?
<alexisb> menn0, we have lots of lxd bugs going around atm
<alexisb> there are about 3 criticals against juju-core that are impacting lxd/lxd provider
<cherylj> axw: did you want to take one more look:  http://reviews.vapour.ws/r/4502/ ?
<menn0> alexisb: yep I know... lxd just stopped working for me in the middle of a complex test reproduction - annoying
<cherylj> menn0: what happened?
<axw> cherylj: looks good, thanks
<cherylj> thanks, axw!
<menn0> cherylj: I was deploying the dummy charm to various models under one controller
<cherylj> menn0: what were you seeing?
<menn0> the first machine came up as normal
<menn0> the others were stuck in pending
<menn0> no activity
<menn0> nothing in juju's logs about it
<menn0> nothing in the lxd logs that I could find
<cherylj> oh fun, I've seen that too
<menn0> the machines didn't even exist when I ran "lxc list"
<cherylj> menn0: do the machines exist in the database?
<cherylj> (mongo)
<menn0> yes
<menn0> they were showing in status so had to have been in the DB
<menn0> it's like juju didn't ask lxd to create the instances, or lxd didn't create them for some reason
<cherylj> ah, I've hit an issue where the services show up, but no machines to host them were ever created
<menn0> that sounds like what I just saw
<cherylj> I mean, they didn't even exist in mongo
<menn0> oh right... in my case the machines did show up in status
<menn0> so they must have been in the DB
<menn0> cherylj: ^
<davecheney> https://github.com/juju/juju/pull/5064#issuecomment-208119816
<davecheney> it's been a while since that one failed
<davecheney> good to see it's as unreliable as ever
<davecheney> cherylj: are you happy with the response to https://bugs.launchpad.net/bugs/1568602
<mup> Bug #1568602: Cannot build OS X client with stock Go 1.6 <ci> <go1.6> <osx> <packaging> <regression> <juju-ci-tools:Fix Released by sinzui> <juju-core:Invalid> <https://launchpad.net/bugs/1568602>
<davecheney> we see that report a lot
<davecheney> and it's always a crapped up go install
<davecheney> usually by not removing the old version before unpacking the new version
<cherylj> davecheney: thanks for looking at it.  sinzui mentioned he got it working now
<mup> Bug #1568602 changed: Cannot build OS X client with stock Go 1.6 <ci> <go1.6> <osx> <packaging> <regression> <juju-ci-tools:Fix Released by sinzui> <juju-core:Invalid> <https://launchpad.net/bugs/1568602>
<mup> Bug #1568654 opened: ec2: AllInstances and Instances improvement <juju-core:New> <https://launchpad.net/bugs/1568654>
<davecheney> cool, let me know
<davecheney> I can provide some suggestions for how to use the upstream tarball concurrently
<davecheney> if needed
<davecheney> it's pretty straight forward
<cherylj> not sure if we'll need that.  sinzui ?  ^^
<cherylj> wallyworld: can you take another look?  http://reviews.vapour.ws/r/4504/
<wallyworld_> sure
<davecheney> build times go up, build times go down. you cannot explain that
<cherylj> wallyworld: I had to move things around to avoid circular dependencies
<wallyworld_> cherylj: lgtm, ty
<cherylj> wallyworld_: thanks!
<sinzui> davecheney: getting it working was easy for me, but not for CI. You were right about more than one tool chain on the host. CI, in an effort to purge the env I set up, discovered the wrong go tool chain: it found the go I used to compile the go 1.6 we place in a special dir away from the system. All that is resolved now
<davecheney> okie dokes
<davecheney> this might be the one time that i actually say "you should set GOROOT"
<davecheney> but please, don't tell anyone I said that
<cherylj> Can I get another review?  http://reviews.vapour.ws/r/4510/
<cherylj> (pretty easy)
<davecheney> cherylj: LGTM, ship it
<cherylj> thanks, davecheney!
<cherylj> I'll have to get in line...  lots of merges lined up :)
<cherylj> menn0: model-migration merge completed!
<davecheney> boom!
<menn0> cherylj: *\o/*
<davecheney> wallyworld_: provider/joyent/local_test.go:
<davecheney> 10:     lm "github.com/joyent/gomanta/localservices/manta"
<davecheney> ^ j'accuse
<davecheney> still some code using gomanta
<davecheney> should I kill it with spite ?
<wallyworld_> davecheney: yes, please, all the manta stuff needs to go away
<wallyworld_> davecheney: 2.0 should not import gomanta at all
<davecheney> hulk smash!
<davecheney> so what does joyent use now ?
<menn0> wallyworld_ or axw: can I get some help with a charm storage / GridFS related issue?
<axw> menn0: sure, what's up?
<wallyworld_> depends what it is :-)
<menn0> so I fixed this: https://bugs.launchpad.net/juju-core/+bug/1541482
<mup> Bug #1541482: unable to download local: charm due to hash mismatch in multi-model deployment <2.0-count> <juju-release-support> <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1541482>
<menn0> it turns out the api server had a local cache of charms, which wasn't model specific
<menn0> it was easy to fix
<menn0> but fixing it has exposed another problem
<menn0> if you deploy a local charm to one model
<menn0> and then deploy the same charm to another model in the same controller, the unit in the 2nd model can't download the charm
<menn0> if the local charm is the same but with slight modification it works
<menn0> but not if it's exactly the same charm
<wallyworld_> menn0: was this cache added by something in charmrepo?
<menn0> i've added lots of debug logging and everything looks fine
<menn0> wallyworld_: no the cache is in the charm download API handler
<wallyworld_> ok, there's also one in charmrepo from memory
<menn0> wallyworld_: it uses a directory to download charms out of storage into
<menn0> wallyworld_: the cache assumes that the contents of a charm with a given charm URL is always the same
<menn0> wallyworld_: but that isn't true for local charms across models
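A tiny illustrative sketch of that cache-key pitfall (the types here are hypothetical, not the actual API server handler): keying only by charm URL conflates distinct local charms across models, while keying by (model UUID, charm URL) keeps them apart.

package cachedemo

// cacheKey includes the model UUID so two models' local charms that
// happen to share a charm URL never collide in the cache.
type cacheKey struct {
    ModelUUID string
    CharmURL  string
}

type charmCache map[cacheKey][]byte

func (c charmCache) get(modelUUID, curl string) ([]byte, bool) {
    data, ok := c[cacheKey{modelUUID, curl}]
    return data, ok
}

func (c charmCache) put(modelUUID, curl string, data []byte) {
    c[cacheKey{modelUUID, curl}] = data
}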
<axw> menn0 wallyworld_: I think the "binarystorage" index might not be model-specific, checking...
<wallyworld_> menn0: i'm not familiar with the code - why does the api server need a cache of stuff from mongo?
<menn0> wallyworld_: good question. the endpoint supports returning file listings and downloading particular files out of the charm archive
<menn0> wallyworld_: I guess the expectation is that there will be many API calls for the one charm archive
<wallyworld_> right, so a local cache adds very little that i can see
<wallyworld_> i guess so
<menn0> wallyworld_: at any rate, the cache isn't the problem here
<wallyworld_> is the index model specific?
<menn0> wallyworld_: the cache was hiding a deeper bug in charm storage
<axw> menn0: sorry had my wires crossed, that's just for tools...
<menn0> axw, wallyworld_: the way charms are stored looks fine but there's obviously a problem
 * menn0 prepares a paste
<wallyworld_> menn0: the charns collection should be model aware
<menn0> wallyworld_: yep it is
<menn0> wallyworld_: the problem isn't there I don't think
<wallyworld_> that's where the charm doc goes - the url is not model aware but that shouldn't matter
<menn0> wallyworld_: it's in the GridFS put/get code I think
<wallyworld_> there's a PutForBucket api
<wallyworld_> so long as we set the bucket id to be the model uuid that should be enough
<menn0> wallyworld_, axw : http://paste.ubuntu.com/15752687/
<menn0> wallyworld_, axw : that's the output from a bunch of debug logging I added
<menn0> wallyworld_, axw : does that help?
<menn0> so for the second charm post, the GridFS entry appears to get reused
<thumper> oh poo
<menn0> but during the download attempt for the second model it can't be found
<thumper> my maas update seemed to lose all the power settings for the machines
<wallyworld_> menn0: that looks like it's using the old blobstore
<wallyworld_> should be blobstore.v2
<wallyworld_> with PutForBucket
<menn0> this is blobstore.v2
<wallyworld_> not that it should matter i guess in terms of this bug
<wallyworld_> ah
<wallyworld_> some internal methods were not renamed
<wallyworld_> sigh
<mup> Bug #1568666 opened: provider/lxd: non-juju resource tags are ignored <juju-core:Triaged> <https://launchpad.net/bugs/1568666>
<mup> Bug #1568668 opened: landing bot uses go 1.2 for pre build checkout and go 1.6 for tets <juju-core:New> <https://launchpad.net/bugs/1568668>
<mup> Bug #1568669 opened: juju 2.0 must not depend on gomanta <juju-core:New for dave-cheney> <https://launchpad.net/bugs/1568669>
<wallyworld_> menn0: it doesn't make sense at first glance - the resource path i think from memory is the raw access to the blob - that should not be affected by what happens above with bucket paths etc
<wallyworld_> so the resource path should be there - unless something is deleting it
<wallyworld_> is it worth putting debug in the remove methods
<menn0> wallyworld_: ok, i'll try that. I have put some in some of the cleanup-on-error defers already but they're not firing
<menn0> wallyworld_: i'm also checking how things look in the DB
<wallyworld_> menn0: so to be sure i understand - the issue is that the second upload correctly creates resource metadata for the new model uuid etc, and it rightly shares a de-duped blob with the first upload, but the attempt to access that blob fails
<wallyworld_> and it is failing at the point at which the blob itself is retrieved
<menn0> wallyworld_: spot on. that's my best understanding at the moment
<menn0> yes
<wallyworld_> hmmm, so yeah, that blob should be there unless deleted
<wallyworld_> this should all be orthogonal to any bucket uuid stuff
<axw> yeah, it seems to be passing the right path to gridfs ..
<axw> based on the error
<menn0> looking at the storedResources collection and the blobstore db directly, everything looks ok
<wallyworld_> jeez, well that sucks
<menn0> i'll add more debug logging in the lookup side of things
<wallyworld_> at least the storage side looks correct - it is using the model uuid correctly
<wallyworld_> and de-duping across models
<menn0> wallyworld_: it certainly *looks* like it's doing the right thing
<wallyworld_> still could be a subtle issue i guess
<menn0> yep
<menn0> i've just added a lot more logging on the get side of things
<menn0> wallyworld_, axw: I think I've found it
<menn0> wallyworld_, axw: the GridFS instance is created with the model UUID as the prefix
<menn0> wallyworld_, axw: so even though the charm is being requested with the correct de-duped UUID
<menn0> wallyworld_, axw: the GridFS prefix prevents it being found
<wallyworld_> the namespace?
<menn0> wallyworld_: yes
<menn0> does that seem right?
<wallyworld_> so maybe the namespace should be the controller uuid
<wallyworld_> i think so
<wallyworld_> this was all done pre-multi model
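A sketch of the failure mode menn0 describes, using mgo's GridFS API: the prefix passed to Database.GridFS selects a separate files/chunks collection pair, so a blob written under one model's prefix is invisible when read back through another model's prefix, even with an identical de-duped path. The UUIDs and database name below are stand-ins, not juju's actual values.

package blobdemo

import mgo "gopkg.in/mgo.v2"

// demo writes a blob through one GridFS prefix and fails to read it
// through another, mirroring the cross-model charm download bug.
func demo(session *mgo.Session) error {
    db := session.DB("blobstore")

    // Store the blob via model A's namespace.
    fsA := db.GridFS("model-uuid-a")
    f, err := fsA.Create("shared/dedup/path")
    if err != nil {
        return err
    }
    f.Write([]byte("charm archive bytes"))
    f.Close()

    // Reading the same path via model B's namespace fails: the prefix
    // names a different fs.files/fs.chunks pair, so the blob is "not
    // found" even though the de-duped path is correct.
    fsB := db.GridFS("model-uuid-b")
    _, err = fsB.Open("shared/dedup/path") // -> mgo.ErrNotFound
    return err
}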
<arcticlight> Hi! Am I in the right place to ask about filing a bug against Juju? I'm kind of new to the whole FOSS community thing and I have *no idea* how to use Launchpad.
<wallyworld_> arcticlight: sure, ask away
<wallyworld_> and welcome
<wallyworld_> menn0: does each controller get a different uuid in HA?
<menn0> wallyworld_: naively fixing this by changing the charms code to use a statestorage with the controller's UUID will no doubt break existing deployments
<wallyworld_> i think so right?
<wallyworld_> menn0: it will break existing yes
<arcticlight> wallyworld_: It's actually really simple... Juju doesn't seem to depend on `curl` but goes ahead and uses it anyway. But on Ubuntu Server 14.04 LTS it's not installed by default, or at least it wasn't on my system (I have a fresh install) and `juju bootstrap` blew up. Thought I'd mention it
<menn0> wallyworld_: in HA the controller is really the whole cluster
<menn0> wallyworld_: there is only one controller model and one UUID
<menn0> wallyworld_: so that's not a problem
<wallyworld_> arcticlight: yes, it does use curl - to retrieve the tools binaries at bootstrap. i'm surprised 14.04 lts doesn't have it out of the box, but i'm not a packaging guy
<wallyworld_> arcticlight: thanks for mentioning, i'll followup with someone who knows more about ubuntu default packaging and see if we need to fix anything
<arcticlight> wallyworld_: Yeah. I fixed it by installing curl, but it did blow up on a fresh install. I figured someone should know so they can add a dependency on curl.
<arcticlight> wallyworld_: Welcome!
<wallyworld_> arcticlight: just ask if there's anything else you get stuck on. lots of folks here who can help
<wallyworld_> menn0: so the "easy" option is to use the controller uuid with the gridfs namespace. i'm ok with release notes telling people they need to re-bootstrap between betas, but that's IMHO
<arcticlight> wallyworld_: OK! Good to know. I've been fooling around with MAAS/Juju since around 2013. I'm just really shy lol~ Figured this was simple enough to just pop on and ask about tho
<wallyworld_> we don't bite
<wallyworld_> unless really provoked :-)
<menn0> wallyworld_: what about 1.25 systems?
<wallyworld_> menn0: we have sooooooo many things to fix with upgrades there
<wallyworld_> this is just one more
<wallyworld_> menn0: upgrading from 1.25 will be really, really difficult
<wallyworld_> already
<wallyworld_> menn0: i suspect we'll be considering migrayions :-)
<wallyworld_> migrations
<wallyworld_> might be easier in the long run
<menn0> wallyworld_: you realise that would mean backporting migrations to 1.25?
<wallyworld_> yes, or a form thereof
<wallyworld_> that will likely be so much easier than the alternative
<anastasiamac> jeez wallyworld_, u made thumper quit with all this upgrades talk
<menn0> wallyworld_: I've realised for some time that this was on the cards :-/
<wallyworld_> na, i just bit him hard
<wallyworld_> menn0: i think we all have :-) sort of a slowly increasing dread
<anastasiamac> dread != anticipation
<menn0> dread is more accurate :)
 * menn0 tries the naive fix
<anastasiamac> sure... but when u dread, problems become harder and motivation disappears... when u anticipate, besides being prepared, there is always something to look forward to...
<menn0> wallyworld_: were you proposing we pass the controller UUID to charm related calls to state/storage.NewStorage() or were you proposing that we use the controller UUID in the call that stateStorage makes to blobstore.NewGridFS()?
<axw> menn0: I think the latter
<wallyworld_> menn0: i think(?) just the latter is all we need
<wallyworld_> that would be my preference
<axw> so they're all sharing a common gridfs namespace, and we have model-specific catalogues that point into it
<wallyworld_> yup
<menn0> wallyworld_: cool that makes sense (the latter, not the former). I misunderstood what you meant the first time around and then realised it didn't make sense as I started to change all the calls to NewStorage()
<wallyworld_> oh sorry :-)
<wallyworld_> i should have been more clear
<menn0> wallyworld_: no it was my bad
<menn0> wallyworld_: use the controller UUID over a fixed value? the controller UUID isn't readily available in stateStorage
<wallyworld_> menn0: isn't there a helper method on state? you don't have access to that in stateStorage?
<menn0> wallyworld_: no it gets a MongoSession, not a State
<menn0> wallyworld_:  I can rejig things
<wallyworld_> where does it get the model uuid from then?
<menn0> wallyworld_: or use a fixed value "juju" or "state" or "jujustate"
<menn0> the model UUID is passed in with the MongoSession
<menn0> 2 args
<wallyworld_> so can't we change that to controller uuid?
<axw> menn0 wallyworld_: may as well just use a constant, it doesn't really need to be the controller UUID
<axw> it just needs to be the same for all views on the db
<wallyworld_> except if we want to share a mongo instance between controllers
<blahdeblah> Anyone know if there are moves afoot to introduce DNS as a Service support into either juju itself or the charm store?
<menn0> wallyworld: i've just tried using a fixed GridFS namespace ("juju") in state/storage.NewStorage() and that fixes the problem
<wallyworld_> menn0: great. if it's easy to get controller uuid in there that would be good
<menn0> wallyworld: your reasoning for that is to avoid any chance of collision in case of a mongodb instance being shared by multiple controllers?
<wallyworld_> menn0: yeah
<wallyworld_> we just don't know what might need to be supported in the future
<menn0> wallyworld_: but that would break anyway... our main db is called "juju"
<menn0> and the collection names in it are fixed
<wallyworld_> well, we use model uuids though
<wallyworld_> i guess "juju" is ok for gridfs
<menn0> wallyworld_: I was about to allow my mind to be changed :)
<wallyworld_> quick win for now
<wallyworld_> menn0: i guess there's pros and cons. maybe if it's easy to do....
<menn0> wallyworld_: if we were to do it then it would probably make sense to have NewStorage just take a *State
<menn0> it could then pull the model uuid, controller uuid and session off that
<menn0> sound ok?
<menn0> it looks like all the call sites have a *State
<wallyworld_> menn0: let's use a bespoke interface that just declares the state methods it needs
<wallyworld_> pass in the concrete state.State from the caller
<wallyworld_> but declare the NewGridFS() method to use a smaller interface
<wallyworld_> or just add a controller uuid param
<menn0> wallyworld_: ok sounds good. i'll pass in state but via an interface
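A minimal sketch of the shape being agreed on, with hypothetical names (storageBackend, NewStorage - not juju's actual API): state/storage takes a small bespoke interface rather than a concrete *state.State, and the GridFS namespace comes from the controller UUID so every model's catalogue points into one shared blob store.

package storagedemo

import mgo "gopkg.in/mgo.v2"

// storageBackend declares only what storage needs from state.
type storageBackend interface {
    ModelUUID() string
    ControllerUUID() string
    MongoSession() *mgo.Session
}

type stateStorage struct {
    modelUUID string
    gfs       *mgo.GridFS
}

// NewStorage keys the model-specific catalogue by model UUID, but opens
// GridFS under the controller UUID so all models in one controller share
// a single blob namespace and de-duped blobs stay reachable.
func NewStorage(st storageBackend) *stateStorage {
    return &stateStorage{
        modelUUID: st.ModelUUID(),
        gfs:       st.MongoSession().DB("blobstore").GridFS(st.ControllerUUID()),
    }
}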
<wallyworld_> blahdeblah: i don't think there's anything coming in core, but i thought the ecosystems/charm guys had something for juju itself they use
<menn0> wallyworld_: state doesn't currently have a ControllerUUID method but i'll add one
<wallyworld_> menn0: i though it did?
<menn0> wallyworld_: nope. there's ControllerModel and you can get it from there but that's messy
<menn0> wallyworld_: there's an unexported controllerTag field so I'll just expose the UUID via that
<wallyworld_> like we do for ModelUUID()
 * menn0 bids
<menn0> nods even
<blahdeblah> wallyworld_: thanks - will ping them
<menn0> wallyworld_: I just noticed that state/imagesstore uses a fixed "osimages" prefix
<wallyworld_> menn0: ah yes. i think the thinking there was that it is ok to share cached generic lxc images between controllers
<menn0> wallyworld_: and state.ToolStorage uses the model UUID
<wallyworld_> hmmm, so tools storage could break the same way maybe?
 * menn0 wonders if ToolsStorage could end up with the same problem where an entry is inaccessible
<menn0> haha
<menn0> yes I think so
<menn0> wallyworld_: it's probably hard to trigger at the moment
<wallyworld_> menn0: do we even need a namespace?
<menn0> wallyworld_: can you upload tools for a hosted machine
<menn0> ?
<wallyworld_> um
<wallyworld_> i think the controller stores all the tools tarballs for the hosted models, would need to check
<menn0> wallyworld_: I guess if you do juju upgrade-juju --upload-tools for 2 hosted models you might hit the same problem
<wallyworld_> not sure, but sounds plausible doesn't it
<menn0> if you used exactly the same tools
<wallyworld_> yes, that's the key - need to be the same
<menn0> wallyworld_: does upload-tools recompile the tools
<menn0> ?
<wallyworld_> depends - if it finds binaries in the path, then no
<wallyworld_> is my recollection
<menn0> wallyworld_: ok. if it did recompile then the odds of the binary being exactly the same are slim
<wallyworld_> yes
<wallyworld_> but let's not count on it
<menn0> wallyworld_: regarding whether we need a namespace, I think you need to provide something.
<menn0> wallyworld_: the docs say the convention when there's only one namespace required is to use "fs"
<menn0> wallyworld_: I think it still makes sense to use a namespace for different uses of gridfs
<wallyworld_> menn0: so maybe we just use a const "osimages", "tools", "charms" etc?
<menn0> wallyworld_: yep.
<menn0> wallyworld_: or we use the controller UUID...
<menn0> wallyworld_: no, a prefix per use seems better
<wallyworld_> menn0: i think, looking at the different usages, namespaces for images, charms, tools etc seem ok
<wallyworld_> we still differentiate by model uuid inside each namespace
<menn0> wallyworld_, axw: why do ToolsStorage and GUIStorage not use state.NewStorage()?
<menn0> what's different about binaryStorage?
<wallyworld_> menn0: not sure, i know someone renamed binary storage recently, but can't recall the details
<wallyworld_> i can't recall the specifics ottomh
<menn0> wallyworld_: ok. well I'll avoid fixing the world.
<wallyworld_> not in this pr :-)
<wallyworld_> next one
<menn0> wallyworld_: thanks for your help and input
 * menn0 is gone for now
<wallyworld_> menn0: np, didn't do much :-)
<wallyworld_> thanks for fixing
<axw> menn0: I think they probably could. tools storage was written before that IIRC
<mup> Bug #1566345 changed: kill-controller leaves instances with storage behind <juju-release-support> <kill-controller> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1566345>
<mup> Bug #1568715 opened: simplestreams needs to be updated for Azure Resource Manager <juju-core:Triaged> <https://launchpad.net/bugs/1568715>
<frobware> dimitern: ping; are you using/testing LXD containers on trusty?
<dimitern> frobware, hey, yes - I did, but mostly testing with xenial now
<frobware> dimitern: I saw you landed the /e/n/i change. It's only half the story though as we need to delete eth0.cfg.
<frobware> dimitern: my MAAS/LAN/OTHER seems borked at the moment as I'm having a lot of problems bootstrapping.
<dimitern> frobware, ok, I'll propose a fix that deletes eth0.cfg
<frobware> dimitern: I was trying to verify on trusty but ran into deploy problems.
<frobware> dimitern: I also wanted to understand the order a little better. Who wins doe 00-juju.cfg and eth0.cfg
<frobware> s/doe/for
<dimitern> frobware, I'm also working on sketching the next steps to fix network config setting properly
<frobware> dimitern: I might still see the issue where the MA needs bouncing on xenial before a container gets its correct network profile
<dimitern> frobware, well, on trusty you need to add "trusty-backports" to /e/a/sources.list and a-g update
<frobware> dimitern: tried that...
<dimitern> frobware, and then a-g install trusty-backports/lxd ?
<frobware> dimitern: question is, should that have been done by juju for trusty?
<dimitern> frobware, yes, i think so
<dimitern> frobware, wasn't jam doing something about that?
<dimitern> frobware, hmmm, or maybe IIRC trusty-backports are *supposedly enabled* by default, so those steps were unnecessary?
<dimitern> I haven't seen a cloud image of trusty where backports is on by default
<frobware> dimitern: right; just expected juju to do the business. Only switched to trusty because I had issues with xenial.
<frobware> dimitern: was going to do some testing before I land the `rm eth0.cfg' change. Wanted to understand the order of when 00-juju.cfg wins over eth0.cfg to see if we need to down/up all interfaces.
<dimitern> frobware, sure
<frobware> dimitern: I also wonder whether our rm should be a bit smarter. rm all but 00-juju.cfg.
<dimitern> frobware, alternatively, we can change /e/n/i/ to source interfaces.d/*.juju :)
<dimitern> or something like that
 * fwereade has a filthy cold, please excuse me dimitern voidspace et al
<dimitern> fwereade, get well soon! :)
<voidspace> fwereade: yep, recover quickly
 * TheMue dccs wishes to get well soon to fwereade 
<jam> dimitern: frobware: you shouldn't need to directly add trusty-backports as it should be *available* by default (but not active), and "juju bootstrap" should be doing "apt-get -t trusty-backports install lxd"
<jam> if it isn't then we have a bug we should be aware of
<jam> dimitern: I've done several bootstraps on Trusty images on AWS and they work.
<jam> dimitern: now, we've had bugs with *trusty* and "go-1.2" with official releases (juju-2.0-beta3) because go-1.2 doesn't work with LXD
<jam> so it tells you "unknown" or somesuch.
<jam> but Master or "juju bootstrap --upload-tools" should all work, and next release should be built with 1.6
<dimitern> jam, I've yet to see a trusty cloud image on maas that has trusty-backports in /e/apt/sources.list tbh
<jam> dimitern: so I'm not sure where it is enabled, but it does work.
<jam> have you tried just doing "apt-get -t trusty-backports install lxd" ?
<mgz> dimitern: it should be there, but not at a prio that installs packages unless explicitly selected
<jam> I'm bootstrapping now to confirm, but I have tested it in the past
<jam> hi mgz
<dimitern> jam, yes, and it says trusty-backports is unknown or something
<jam> dimitern: what cloud/what region/what version of juju?
<dimitern> jam, mgz, hmm ok I'm trying now again to confirm after updating all images
<dimitern> jam, maas/2.0 and juju from master tip
<dimitern> voidspace, I'm getting a lot of "Failed to power on node - Node could not be powered on: Failed talking to node's BMC: Unable to retrieve AMT version: 500 Can't connect to 10.14.0.11:16992 at /usr/bin/amttool line 126." errors with 2.0 maas
<dimitern> it works but unreliably
<dimitern> like 1-2 out of 5 power checks fail
<voidspace> dimitern: weird
<voidspace> dimitern: that really sounds like maas bug
<voidspace> dimitern: I don't hit that because I'm not setting the power type I guess
<voidspace> dimitern: or maybe it's a new version of amttool in xenial
<voidspace> or virsh
<voidspace> either way - maas problem
<dimitern> voidspace, indeed
<jam> dimitern: "sudo apt-get -t trusty-backports install lxd" works for me on a Trusty image created with 2.0.0-beta3 in AWS (even though Juju 2.0.0b3 wouldn't be able to talk to it with released tools)
<dimitern> jam, mgz here it is - fresh install of trusty - http://paste.ubuntu.com/15756105/
<dimitern> MAAS Version 2.0.0 (beta1+bzr4873)
<dimitern> jam, mgz, is it possible AWS uses different cloud images than MAAS ?
<mgz> dimitern: it does
<jam> dimitern: I'm sure they are different builds, as the root filesystem is different
<jam> but I thought they were supposed to be as much the same as possible.
<jam> dimitern: we need a bug on MaaS and Juju that tracks that
<jam> thanks for noticing.
<dimitern> well that's kinda crappy ux :/
<dimitern> jam, yeah
<dimitern> jam, looking at http://images.maas.io/query/released.latest.txt and http://images.maas.io/query/daily.latest.txt it looks like the trusty images MAAS is using are 2 years old
<jam> dimitern: that doesn't sound good
<dimitern> indeed
<perrito666> morning
<voidspace> babbageclunk: are you working on the MAAS2 version of maasObjectNetworkInterfaces?
<voidspace> babbageclunk: pretty sure you are - in which case I'll leave it as a TODO in my branch
<babbageclunk> voidspace: yes
<voidspace> babbageclunk: ok
<voidspace> babbageclunk: it will need wiring into my branch when done
<babbageclunk> voidspace: just pushing and creating a PR for the storage - spent longer than expected bashing my head on golang features.
<voidspace> babbageclunk: heh, welcome to go :-)
<perrito666> bbl (~1h)
<frobware> jam: trying again; got sidetracked by h/w
<babbageclunk> voidspace: https://github.com/juju/juju/pull/5070
<babbageclunk> voidspace: would you prefer I work on finishing off stop-instances or interfaces next?
<voidspace> babbageclunk: https://github.com/juju/juju/pull/5071
<voidspace> babbageclunk: well, StopInstances and then deploymentStatus get us closer to bootstrap
<voidspace> babbageclunk: bootstrap isn't actually blocked on interfaces yet
<voidspace> dimitern: frobware: two PRs for you
<voidspace> dimitern: frobware: http://reviews.vapour.ws/r/4513/
<babbageclunk> voidspace: ok, stop instances next then
<voidspace> babbageclunk: cool
<voidspace> babbageclunk: thanks
<voidspace> dimitern: frobware: http://reviews.vapour.ws/r/4514/
<dimitern> voidspace, looking
<dimitern> voidspace, both reviewed
 * dimitern steps out for ~1h
<voidspace> dimitern: thanks
<voidspace> dimitern:  that point about the comment is in a test that currently doesn't run (I changed the name to DONTTestAcquireNodeStorage...)
<voidspace> dimitern: because it needs the work that babbageclunk has done on storage
<voidspace> dimitern: so that will be updated next
<voidspace> dimitern: so effectively that whole test is commented out - and there *is* a TODO comment at the start of the test to explain why.
 * voidspace lunch
<mup> Bug #1568845 opened: help text for juju autoload-credentials needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1568845>
<dimitern> voidspace, ok,sgtm
<mup> Bug #1560457 changed: help text for juju bootstrap needs improving <juju-core:Triaged> <https://launchpad.net/bugs/1560457>
<mup> Bug #1568848 opened: help text for juju bootstrap needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1568848>
<mup> Bug #1568854 opened: help text for juju create-model needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1568854>
<mup> Bug #1568862 opened: help text for juju needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1568862>
<katco> morning all
<mgz> wotcha katco
<voidspace> dimitern: did you see the reply from babbageclunk on his PR
<voidspace> dimitern: he doesn't understand your review comment, and nor do I
<voidspace> we so missed an opportunity
<voidspace> our maas feature branch should have been called maaster
<babbageclunk> voidspace: amazing
<dimitern> voidspace, babbageclunk, sorry - I've posted a reply
<babbageclunk> voidspace: I mean, amaazing
<dimitern> the diff misled me I guess
<voidspace> baabageclunk
<voidspace> babbageclunk: I like babbagelunch by the way - nice
<babbageclunk> voidspace: thanks
<voidspace> *clunch even
<babbageclunk> dimitern: no worries!
<babbageclunk> frobware: are you ok with that panic if a test sets up the fakeController without files?
<frobware> babbageclunk: panic seems bad
<babbageclunk> frobware: what other way is there of failing the test?
<frobware> babbageclunk: sorry, sidetracked again.
<natefinch> panic is ok if it indicates a programmer error, especially during tests.  It's sort of a nice "don't do that!" as long as it happens right away with an obvious cause.
<frobware> babbageclunk: I replied in the review
<babbageclunk> I'll change it to return a generic error with a clear message indicating what the problem is.
<babbageclunk> frobware: cool, thanks!
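For illustration, the two flavours being weighed for a misconfigured test fake; fakeController and its field are stand-ins, not the code under review.

package fakes

import "errors"

type fakeController struct {
    files []string
}

// Panicking flavour: unmissable and fails fast on programmer error,
// but takes down the whole test process.
func (c *fakeController) mustFiles() []string {
    if len(c.files) == 0 {
        panic("fakeController: no files configured; set them in test setup")
    }
    return c.files
}

// Error flavour (what babbageclunk settled on): the test can surface a
// clear message and fail gracefully.
func (c *fakeController) getFiles() ([]string, error) {
    if len(c.files) == 0 {
        return nil, errors.New("fakeController: no files configured")
    }
    return c.files, nil
}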
<natefinch> cherylj: what's the color coding on the bug squad board?
<natefinch> cherylj: nvm... card type. I see
<cherylj> :)
<natefinch> cherylj: I was looking at priority for priority and they were all normal, which was confusing me
<cherylj> ah
<natefinch> cherylj: just grab a critical off the top somewhere, or do you have a suggestion?
<cherylj> natefinch: just pick one on the criticals that interest you :)
<katco> cherylj: ty :) all of moonstone will be on bug squad now
<frobware> jam: not sure if you're still about but I had no joy with add-machine lxd:0. See bug #1568895
<mup> Bug #1568895: Cannot add LXD containers in 2.0beta4 on trusty <juju-core:New> <https://launchpad.net/bugs/1568895>
<cherylj> cool, thanks katco.  If there are new bugs found in CI, I might need to redirect people :)
<katco> cherylj: we're at your direction
<cherylj> sinzui:  is master running in CI now?
<cherylj> katco: I'm actually wondering if I need to block things as we're trying to get a release out tomorrow
<cherylj> :(
<cherylj> I'd rather not, but we really really need a good run
<katco> cherylj: if we need stability then i'd agree =/
<sinzui> cherylj: 1.25 started about 45 minutes ago
<cherylj> sinzui: could you kill it and run master?
<sinzui> cherylj: I hesitate to. killing it means waiting for other things to complete and cleanup.
 * sinzui looks
<cherylj> sinzui: understood
<cherylj> katco: and technically, master is blocked from a different bug :)
<katco> cherylj: what bug is that?
<katco> cherylj: nm i see it
<cherylj> katco: one of many that I opened over the weekend for failures
<katco> cherylj: looks like fix committed
<katco> cherylj: by you ;p
<cherylj> oh snap, I forgot
<sinzui> cherylj: Can we compromise? it will take an hour for some of the current jobs to complete, but I can see we are halfway through the 1.25 tests. I can make some jobs not retry to get to master faster
<cherylj> sinzui: did you turn back on the block checker?
<cherylj> sinzui: yeah, just make them not retry
<sinzui> cherylj: I can enable it now
<cherylj> sinzui: thank you
<cherylj> katco: lp was having issues yesterday, so sinzui disabled the blocking bug check
<cherylj> so people should get bounced back again once sinzui re-enables if they're trying to merge
<cherylj> katco: did you guys have anything you needed to add to release notes?
<katco> cherylj: i don't think so, but i'll check with the team
<cherylj> thanks!
<katco> natefinch: standup time
<mup> Bug #1568895 opened: Cannot add LXD containers in 2.0beta4 on trusty <juju-core:New> <https://launchpad.net/bugs/1568895>
<natefinch> katco: oh, sorry, I expected it to be tonight... coming
<tych0> frobware: any luck on the networking stuff for lxd?
<frobware> tych0: the /e/n/i is partially fixed. need to delete eth0.cfg, but also wanted to understand if I can just do this carte-blanche in cloud-init.
<frobware> tych0: sidetracked by the fact that juju/lxd is not installing on trusty (for me at least)
<tych0> frobware: ah, i saw that bug. can you just use --series=xenial?
<frobware> tych0: yes, but had problems there so I thought... I'll just use trusty... and the rest is history
<frobware> tych0: so, yes. back to trusty... but also wary that deleting eth0.cfg might also break precise.
<frobware> tych0: correction, back to xenial...
<cherylj> alexisb: was the cached-images command one on your list to flatten?
<alexisb> cherylj, yes
<alexisb> cherylj, I am behind
<babbageclunk> voidspace: What does this mean? https://github.com/juju/juju/pull/5070
<frobware> voidspace: AA - did we say disable this across the board, or just for MAAS?
<cherylj> alexisb: no problem.  I'm asking for a friend ;)
<babbageclunk> voidspace: that I wasn't up to date with master?
<frobware> cherylj: is master blocked on bug #1568312
<mup> Bug #1568312: Juju should fallback to juju-mongodb after first failure to find juju-mongodb3.2 <blocker> <ci> <mongodb> <juju-core:Fix Committed by cherylj> <https://launchpad.net/bugs/1568312>
<frobware> cherylj: nevermind, mup answered my question
<cherylj> frobware: master really should be blocked with a stabilization bug
<cherylj> since we're trying to get a release out tomorrow
<frobware> cherylj: right. was trying to help answer babbageclunk's question above
<cherylj> ah, ok
<babbageclunk> frobware: Ah, is that bug  blocking my merge?
<frobware> babbageclunk: yes
<mgz> babbageclunk: we want to do a clean release before landing all the maas2 changes, with the current batch of breakage on master dealt with
<babbageclunk> mgz: makes sense - so should I hold off until a release is cut?
<frobware> babbageclunk: keep stacking those branches up. :)
<babbageclunk> frobware: :) I was so close!
<mgz> frobware: if you need to, create a fork under juju to land on
<mgz> frobware: whatever makes it easiest to coordinate your work
<frobware> voidspace, babbageclunk: I think we can keep stacking stuff for a day. thoughts? ^^
<mup> Bug #1568925 opened: Address Allocation feature flag still enabled for MAAS provider in Juju 2.0 <juju-core:New> <https://launchpad.net/bugs/1568925>
<babbageclunk> frobware, voidspace: yeah, no big problem for me.
<katco> cherylj: hey, for bug 1567170 what does "hosted model" mean? aren't all models hosted?
<mup> Bug #1567170: Disallow upgrading with --upload-tools for hosted models <upgrade-juju> <juju-core:Triaged by cox-katherine-e> <https://launchpad.net/bugs/1567170>
<cherylj> katco: non-admin model
<cherylj> sorry, was using old-terminology
<katco> cherylj: ahhh ok
<frobware> cherylj, katco: how does one, in development, update a model that is not ":admin"?
<katco> cherylj: so that's interesting... the binaries are different for non-admin models?
<frobware> cherylj, katco: I got caught by this friday, just ended up doing my upgrade-juju in the admin model
<cherylj> frobware: the idea is that you can do an upgrade to a published (in devel or stable) release
<cherylj> frobware: a later feature request is "upgrade my model to match what the state server is at"
<frobware> cherylj: every time I tried this I ran into "version mismatch". And each time I tried it bumped the version number
<mup> Bug #1568925 changed: Address Allocation feature flag still enabled for MAAS provider in Juju 2.0 <juju-core:New> <https://launchpad.net/bugs/1568925>
<mup> Bug #1568925 opened: Address Allocation feature flag still enabled for MAAS provider in Juju 2.0 <juju-core:New> <https://launchpad.net/bugs/1568925>
<bogdanteleaga> is it possible to get the tools from the state machine through http, instead of trying insecure https?
<voidspace> frobware: babbageclunk: heh, I got mine in just before the block
<babbageclunk> frobware, voidspace: Just checking I understand - the bug number jujubot's quoting is a red herring (the fix was merged early this morning), it's just blocking merges until the release is done?
<babbageclunk> voidspace: grrr! :)
<frobware> babbageclunk: guessing so. cherylj mentioned that there really should be a stabilisation bug
<babbageclunk> frobware: ok, got it - cool
<voidspace> babbageclunk: the block isn't removed by the jujubot until the bug is changed by QA to "fix released"
<voidspace> babbageclunk: "fix committed" (i.e. merged) is not sufficient to unblock as it hasn't been QA verified yet
<babbageclunk> voidspace: ah, thanks
<voidspace> babbageclunk: this block will probably be left in place (as frobware said) until the release is done
<babbageclunk> ooh, networking meeting
<natefinch> cherylj: How firm is the suggested wording on #1564622?
<mup> Bug #1564622: Suggest juju1 upon first use of juju2 if there is an existing JUJU_HOME dir <juju-release-support> <juju-core:Triaged> <https://launchpad.net/bugs/1564622>
<natefinch> cherylj: (if you know)
<rick_h_> natefinch: it's rough
<rick_h_> natefinch: please feel free to suggest something better. It was on the fly at a sprint
<ericsnow> tasdomas: PTAL http://reviews.vapour.ws/r/4515/
<natefinch> rick_h_: ok.  Sometimes it's hard to know what's set in stone and what is not.  I'll play with it and see what seems to work best.
<mup> Bug #1568943 opened: Juju 2.0-beta4 stabilization <blocker> <juju-core:Triaged> <https://launchpad.net/bugs/1568943>
<mup> Bug #1568944 opened: Failure when deploying on lxd models and model name contains space characters <juju-core:New> <https://launchpad.net/bugs/1568944>
<ericsnow> katco: that bug was more trivial than expected, so I'm going to pick up another (#1456916)
<natefinch> ericsnow: nice
<bogdanteleaga> can somebody not call HEAD on a state machine endpoint with tools?
<natefinch> no idea if we support HEAD or not... possibly not
<natefinch> well, wait, which endpoints? It's basically all websocket
<ericsnow> natefinch: tools, charms, backups, and resources all have HTTP endpoints
<natefinch> ahh, thus the reference to tools... misunderstood :)
<ericsnow> (plus a few others)
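Whether those HTTP endpoints answer HEAD depends on each handler, so the quickest check is to probe one. A minimal Go sketch follows; the controller address and tools path are made up for illustration, and certificate verification is skipped only because controllers serve a self-signed CA (never do that in production code):

    package main

    import (
        "crypto/tls"
        "fmt"
        "net/http"
    )

    func main() {
        // Ad-hoc probe only: the controller cert is self-signed,
        // so verification is skipped for this one-off check.
        client := &http.Client{
            Transport: &http.Transport{
                TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
            },
        }
        // Hypothetical controller address and tools path.
        resp, err := client.Head("https://10.0.0.1:17070/tools/2.0-beta4-xenial-amd64")
        if err != nil {
            fmt.Println("HEAD failed:", err)
            return
        }
        resp.Body.Close()
        fmt.Println("status:", resp.Status)
    }

A 200 (or a 405 Method Not Allowed) answers bogdanteleaga's question for that endpoint; HEAD support is per-handler, not a property of the server as a whole.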
<katco> ericsnow: sorry was otp... that's always a nice problem to have :)
<ericsnow> dooferlad: are you working on bug #1456916
<mgz> bug 1456916 looks like a pain to fix well
<mgz> there are lots of ways to try and fix it badly
<ericsnow> dooferlad: just noticed it's assigned to you in LP but not on the card in leankit
<frobware> babbageclunk: I'm still in the call if you wanted to chat
<babbageclunk> frobware: can't get back in for some reason
<frobware> babbageclunk: want to drop into the sapphire standup HO instead?
<babbageclunk> frobware: can't get there either.
<voidspace> babbageclunk: do you get a redirect loop?
<voidspace> babbageclunk: I've had that and had to force a log back in by going to something else canonical work related
<voidspace> something else on google I mean
<voidspace> like docs.google.com
<voidspace> which should then let you log back in without a redirect loop
<voidspace> but that may not be your problem at all
<babbageclunk> voidspace: yeah, might be something to do with SSO like Jay had.
<babbageclunk> Would make sense.
<ericsnow> natefinch: small patch: http://reviews.vapour.ws/r/4515/
<ericsnow> redir: thanks for the review :)
<redir> :)
<jam> frobware: lxd had a packaging bug in trusty for 2.0.0~rc9
<jam> frobware: that should be fixed as of 20 minutes ago; Stephan uploaded a bugfix
<frobware> jam: great, will try again
<frobware> jam: I'm using releases for my trusty images. is that enough or do I need to switch to daily?
<natefinch> ericsnow: lol wow
<ericsnow> natefinch: yep
<natefinch> ericsnow: awesome. ship it.
<ericsnow> natefinch: alas, master is blocked now
<natefinch> doh
<ericsnow> rogpeppe: re: bug #1566431, are you talking about different AWS accounts or different Juju accounts?
<mup> Bug #1566431: cloud-init cannot always use private ip address to fetch tools (ec2 provider) <juju-core:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1566431>
<ericsnow> rogpeppe: I'm looking at how to repro
<rogpeppe> ericsnow: different AWS accounts
<ericsnow> rogpeppe: figured :)
<rogpeppe> ericsnow: problem doesn't manifest in us-east, for some reason
<jam> frobware: it should be an archive change. I don't know how long it will take to propagate
<jam> now dimiter's finding about no 'trusty-backports' in maas images is a different bug
<natefinch> rick_h_: for #1564622 do we actually execute the requested action, or no? like, juju bootstrap is conceivably a valid action at that point.
<mup> Bug #1564622: Suggest juju1 upon first use of juju2 if there is an existing JUJU_HOME dir <juju-release-support> <juju-core:Triaged by natefinch> <https://launchpad.net/bugs/1564622>
<redir> so if master is blocked does that mean I should do something differently? or simply that my commits won't make it to master until unblocked?
<mgz> redir: you don't do $$merge$$ till unblock
<redir> mgz: that's what I was looking for:) Thanks.
<rick_h_> natefinch-lunch: sorry, was otp, looking
<rick_h_> natefinch-lunch: interesting, I guess you're right, there's a list of actions that are legit.
<rick_h_> natefinch-lunch: so if you've got juju1 home dir stuff, and you run your very first juju2 command...I think it'd be ok to block and output it once
<rick_h_> natefinch-lunch: and make them redo the command a second time if they knew they were looking for 2.0
<rick_h_> natefinch-lunch: hmm, but my suggested text falls over there doesn't it heh
<perrito666> mwhudson: hey, are you around?
<perrito666> mm, I would guess not yet
<ericsnow> rogpeppe: what did you do to get the provider to create instances under a different AWS account?
<rick_h_> natefinch-lunch: updated the bug with another round of thoughts. Let me know what you think
<cherylj> arosales: with your update to bug 1566420 - do your machines show up in the machine section?
<mup> Bug #1566420: lxd doesn't provision instances on first bootstrap in new xenial image <lxd> <juju-core:Triaged> <https://launchpad.net/bugs/1566420>
<cherylj> arosales: or just the services?
<arosales> cherylj: replying in #juju as the reporter is there
<alexisb> cherylj, perrito666 is looking for a task, can he just take any bug off the board or do you have a particular one that needs eyes?
<cherylj> perrito666: here's a quick one for you - bug 1568944
<mup> Bug #1568944: Failure when deploying on lxd models and model name contains space characters <juju-core:Triaged> <https://launchpad.net/bugs/1568944>
<perrito666> cherylj: tx a lot
<cherylj> perrito666: but remember that master is blocked for now :(
<perrito666> cherylj: I know
<cherylj> I feel like we should create a feature branch for people to merge into for now
<cherylj> until we unblock master
<perrito666> cherylj: nah, the merge back will be a nightmare
<cherylj> any thoughts (anyone)?
<perrito666> I do believe that we should do a branch for the release
<cherylj> perrito666: well, hopefully master won't change
<cherylj> ah, so the opposite :)
<cherylj> branch the other thing
<perrito666> exactly, so master keeps being master
<perrito666> and if fixes are required for the bless, you can merge just those
<cherylj> sinzui, mgz, abentley, any thoughts on creating a temporary branch for our 2.0-beta4 release?
<cherylj> will that cause problems when we have to release / tag?
<sinzui> cherylj: It won't affect the tag in git
<cherylj> it's just weird having people who can actually work bugs while we're trying to get ready for a release :)
<sinzui> cherylj: are people trying to merge post 2.0  features now?
<cherylj> perrito666: if you want to tackle some of the help text bugs too, I won't say no :)    https://bugs.launchpad.net/juju-core/+bugs?field.tag=helpdocs
<mgz> cherylj: it seems a little unnecessary from what I've seen of the pending prs
<abentley> cherylj: I think it is a good idea.  It's less friction for devs.  There is a risk that bug fixes won't get applied to the 2.0-beta4 branch, but I think that can be managed.
<mgz> not much chance of mass conflicts between them
<perrito666> cherylj: every time I write text for juju I end up sounding like tonto
<cherylj> perrito666: the text is there, you just need to copy it in :)
<cherylj> perrito666: the docs team has been busy writing up help text for all the commands
<perrito666> oh, then I could :p
<perrito666> I'll give the lxc one a try and then go to docs otherwise
<abentley> perrito666: I don't understand why you said the first case would be a nightmare.  Should be exactly as much effort as merging 2.0-beta4 into master.
<cherylj> sinzui: post 2.0-beta4 bugfixes
<perrito666> abentley: why would you merge beta4 into master?
<abentley> perrito666: Because it has bugfixes.
<perrito666> abentley: ideally bugfixes should do bug->master->beta
<perrito666> or the other way but just the one merge, not the whole branch
<abentley> perrito666: I am pretty sure that your policy is bug -> release -> master.
<perrito666> abentley: yes
<perrito666> sorry
<perrito666> we are using github a bit poorly which causes a lot of merge conflicts
<abentley> perrito666: If you want to do one merge per bugfix, you can, but it seems inefficient to me.
<perrito666> abentley: that is what we were doing for 1.24, 1.25, 1.x maintenance
<abentley> perrito666: Isn't that just to get bug fixes merged in a timely manner?
<rogpeppe> ericsnow: you need to create a model inside the controller that uses different creds
<rogpeppe> ericsnow: every model inside a controller has a different set of model attributes
<ericsnow> rogpeppe: I thought I tried that though
<rogpeppe> ericsnow: we could chat about this if you like
<ericsnow> rogpeppe: sure
<perrito666> cherylj: I would like your input on the lxc bug, I added a question/suggestion; in the meantime I'll go add some help texts :)
<mup> Bug #1563590 changed: 1.25.4, xenial, init script install error race <landscape> <juju-core:Invalid> <systemd (Ubuntu):In Progress by pitti> <https://launchpad.net/bugs/1563590>
<cherylj> perrito666: which lxc bug?  (sorry, been in calls)
<cherylj> oh ffs
<cherylj> seriously?
<cherylj> 2016-04-11 19:38:30 ERROR cmd supercommand.go:448 region "DFW" in cloud "rackspace" not found (expected one of ["dfw" "ord" "iad" "lon" "syd" "hkg"])
<rick_h_> cherylj: case is important?
<cherylj> rick_h_: shouldn't be
<rick_h_> cherylj: sorry, I missed the :/ on the end there
<cherylj> we changed it to be lower case because sabdfl asked us to, but to keep from regressing people, we should convert the region to lower case before trying to use it
<rick_h_> cherylj: quit talking sense :P
<cherylj> Bug reporting activated
<cherylj> ses_2: I also see this error coming out of CI:  Test recovery strategies.: error: unrecognized arguments: --charm-prefix=local:trusty/
<cherylj> http://reports.vapour.ws/releases/3881/job/functional-ha-recovery-rackspace/attempt/472
<frobware> cherylj, alexisb: the other known LXD issue - https://bugs.launchpad.net/juju-core/+bug/1568895
<mup> Bug #1568895: Cannot add LXD containers in 2.0beta4 on trusty <juju-core:New> <https://launchpad.net/bugs/1568895>
<frobware> cherylj, alexisb: but may be resolved by "tomorrow" if all I'm waiting on is a package update
<cherylj> frobware: I remember a conversation not too long ago where someone said backports was enabled by default??
<cherylj> frobware: maybe juju is overriding this somehow?
<frobware> cherylj: let me try just deploying trusty from my MAAS
 * cherylj too
<frobware> cherylj: https://bugs.launchpad.net/juju-core/+bug/1568895/comments/6
<mup> Bug #1568895: Cannot add LXD containers in 2.0beta4 on trusty <juju-core:New> <https://launchpad.net/bugs/1568895>
<mup> Bug #1569024 opened: Region names for rackspace should accept caps and lowercase <rackspace> <juju-core:Triaged> <https://launchpad.net/bugs/1569024>
<cherylj> frobware: looks like maybe we're overwriting it?
<cherylj> frobware: a fresh deploy through AWS shows it enabled:  http://paste.ubuntu.com/15767200/
<frobware> cherylj: so on MAAS only?
<cherylj> frobware: did you do that through  a maas deploy or juju?
<cherylj> frobware: this was not a juju provisioned machine
<frobware> cherylj: what's your /etc/cloud/build.info
<frobware> cherylj: neither was mine, just MAAS deployed
<cherylj> serial: 20160406
<cherylj> frobware: interesting
<frobware> cherylj: daily or releases image on AWS?
<cherylj> frobware: daily - I choose the latest from here:  https://cloud-images.ubuntu.com/locator/daily/
<frobware> cherylj: going to switch to daily, as I have two "releases" MAAS setups atm
<thumper> morning
<thumper> o/ frobware
<thumper> frobware: still working?
<frobware> thumper: only virtually
<frobware> thumper: haven't left standup yet :)
<frobware> thumper: still talking about.... CONTAINERS!
<mgz> frobware: contain yourself!
<frobware> Ctrl-D
<cherylj> ah fudge, should we allow people to specify regions in their cloud definitions that have caps?
<cherylj> or should we lowercase everything?
<cherylj> (like we do with users?)
<cherylj> thumper ^^  thoughts?
<mgz> I am confused by the bug in general
<thumper> ah...
<thumper> hmm
<thumper> personally I like lowercasing them
<thumper> but
<anastasiamac> cherylj: some regions must be caps because of the provider. axw made some changes in the area...
<thumper> I don't believe in forcing it if it could cause a problem
<thumper> anastasiamac: yeah... was thinking it might be something like that
<cherylj> so I guess people using rackspace need to just change to use lower case region names?
<cherylj> maybe we can handle that internally to the rackspace provider
 * cherylj looks
<ses_2> cherylj, I will fix that
<cherylj> thanks, ses_2
<frobware> cherylj: https://bugs.launchpad.net/maas/+bug/1554636 - interesting read if I'm just switching between daily and releases
<mup> Bug #1554636: maas serving old image to nodes <landscape> <MAAS:New> <MAAS 1.9:New> <https://launchpad.net/bugs/1554636>
<cherylj> frobware: interesting!  I noticed that my trusty images were old too
<cherylj> (from like 3 weeks ago)
<frobware> cherylj: https://bugs.launchpad.net/juju-core/+bug/1568895/comments/7
<mup> Bug #1568895: Cannot add LXD containers in 2.0beta4 on trusty <juju-core:New> <https://launchpad.net/bugs/1568895>
<frobware> cherylj: I don't see any difference with switching to daily images
<frobware> cherylj: neither have trusty-backports by default
<cherylj> I wonder if curtin screws with it
<frobware> cherylj: if I launch a container via lxc (locally on my desktop) then I do see backports listed; not all cloud images are created equally
<mwhudson> perrito666: hi!
<natefinch> katco: wallyworld: moonstone standup or tanzanite or ?
<wallyworld_> natefinch: nothing on the calendar so i was assuming it hadn't been rescheduled yet
<katco> wallyworld_: natefinch: correct hasn't been rescheduled. natefinch, remember i was going to discuss with wallyworld?
<natefinch> katco: oh, right.  ok
<frobware> cherylj: https://bugs.launchpad.net/juju-core/+bug/1568895/comments/8
<mup> Bug #1568895: Cannot add LXD containers in 2.0beta4 on trusty <juju-core:New> <https://launchpad.net/bugs/1568895>
 * frobware is done... hopefully the MAAS fairies will point me towards the URL of truth...
<mup> Bug #1569047 opened: juju2 beta 3: bootstrap warnings about interfaces <landscape> <juju-core:New> <https://launchpad.net/bugs/1569047>
<perrito666> mwhudson: hi, so, are we having problems bootstrapping or just testing?
<mwhudson> perrito666: i haven't tried bootstrapping, i don't actually know how to use juju :-p
<menn0> cherylj: https://bugs.launchpad.net/juju-core/+bug/1569054
<mup> Bug #1569054: GridFS namespace breaks charm and tools deduping across models <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1569054>
<menn0> cherylj: I've also created a card for it on the bug squad board
<perrito666> mwhudson: mmm, ok, have you attached the logs to the bug?
<perrito666> from the test I mean
<mup> Bug #1569054 opened: GridFS namespace breaks charm and tools deduping across models <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1569054>
<mwhudson> perrito666: not the most recent run probably, let me do that
<mwhudson> perrito666: https://bugs.launchpad.net/juju-core/+bug/1567708/comments/11
<mup> Bug #1567708: unit tests fail with mongodb 3.2 <juju-core:Triaged by hduran-8> <https://launchpad.net/bugs/1567708>
<perrito666> that was fast
<mup> Bug #1569054 changed: GridFS namespace breaks charm and tools deduping across models <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1569054>
<mwhudson> perrito666: that was from my run yesterday, i haven't run the tests today...
<perrito666> tis ok, nothing changed
<mup> Bug #1569054 opened: GridFS namespace breaks charm and tools deduping across models <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1569054>
<alexisb> thumper, I am running late
<thumper> alexisb: ok, just on a call with mpontillo
<alexisb> thumper, on the HO when you are ready, no rush
<katco> ericsnow: i thought we fixed this? bug 1567170
<mup> Bug #1567170: Disallow upgrading with --upload-tools for hosted models <upgrade-juju> <juju-core:In Progress by cox-katherine-e> <https://launchpad.net/bugs/1567170>
<katco> ericsnow: sorry wrong bug. bug 1545116
<mup> Bug #1545116: When I run "juju resources <service>" after a service is destroyed, resources are still listed. <2.0-count> <juju-release-support> <resources> <juju-core:Confirmed> <https://launchpad.net/bugs/1545116>
<ericsnow> katco: pretty sure we did
<menn0> wallyworld_, thumper, cherylj : I've gone for an even simpler fix for the gridfs namespace issue. it's ready now - just doing some manual testing.
<wallyworld_> menn0: awesome, ty
<alexisb> thumper, I dropped off the standup, please ping when you are available
<thumper> alexisb: ok, here now
<alexisb> lol of course
<mup> Bug #1569072 opened: juju2 bundle deploy help text out of date <landscape> <juju-core:New> <https://launchpad.net/bugs/1569072>
<perrito666> mwhudson: is there a way I can get my hands on this?
<mwhudson> perrito666: you mean test on s390x?
<mwhudson> perrito666: the failures occur on intel too though
<perrito666> mwhudson: ok, so to reproduce this you do what exactly?
<perrito666> sorry I asked you this already :)
<perrito666> just making sure I am doing things right
<mwhudson> perrito666: install the juju-mongodb3.2 from ppa:juju/experimental
<mwhudson> perrito666: run JUJU_MONGOD=/usr/lib/juju/mongo3.2/bin/mongod go test github.com/juju/juju/...
 * perrito666 is pretty sure he is about to break something on his desktop
<menn0> wallyworld_: http://reviews.vapour.ws/r/4523/
<wallyworld_> looking
<menn0> wallyworld_: I went with the approach of making the gridfs namespace the same as the DB name (as per osimages)
<menn0> wallyworld_: I came to the conclusion that anything more elaborate was just YAGNI
<wallyworld_> yep
<menn0> wallyworld_: tested with local and charm store deployments
<perrito666> mwhudson: I get juju-mongodb3 and juju-mongo3, which is a transitional package
<wallyworld_> menn0: lgtm
<menn0> wallyworld_: cheers
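A rough sketch of the namespace choice menn0 describes, using the mgo.v2 GridFS API; the package and function here are illustrative, not the actual juju blobstore code:

    package blobstore

    import mgo "gopkg.in/mgo.v2"

    // openBlobstore keys the GridFS prefix on the database name rather
    // than on a per-model value, so identical charm and tools blobs
    // uploaded by different models land in the same .files/.chunks
    // collections and dedupe, instead of being stored once per model.
    func openBlobstore(session *mgo.Session, dbName string) *mgo.GridFS {
        return session.DB(dbName).GridFS(dbName)
    }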
<mwhudson> perrito666: well you should check with sinzui i guess but i'm preeeeetttty sure juju is jumping from 2.6 to 3.2, not to 3
<perrito666> sinzui: care to shed some light on this?
<perrito666> wallyworld_: wb btw
<wallyworld_> it's 3.2
<perrito666> mm, there is a juju-mongodb3 package at version 3.2.0-0ubuntu0~juju8~ubuntu15.10.1~15.10
<mwhudson> perrito666: if you've been making juju compatible with 3.0 that would explain some things :-)
<wallyworld_> juju-mongodb3.2
<mwhudson> (presumably that work is a subset of the work to be compatible with 3.2 so not wasted effort entirely)
<perrito666> wallyworld_: mwhudson https://pastebin.canonical.com/153956/
<mwhudson> ah i bet my reply to this is "3.2 is only built for xenial"
<perrito666> mwhudson: good to know :p
<perrito666> ok, here goes my machine's stability
 * perrito666 upgrades to xenial
<mwhudson> ah yes, probably
<mwhudson> hah uh, or use a lxd or ec2 instance or something? :)
<mwhudson> i do need to upgrade myself soon
<wallyworld_> perrito666: juju-mongodb3.2 is not in the archives yet afaik
<wallyworld_> that's the whole issue that has been hanging around for several weeks
<wallyworld_> which is why my pr should not have been landed yet
<mwhudson> i am about to upload juju-mongodb3.2 RIGHT NOW!!!one
<perrito666> wallyworld_: your pr issue has been dealt with
<mwhudson> unfortunately it will then sit in NEW for a while
<perrito666> I am now trying to see mwhudson's issue
<wallyworld_> mwhudson: new or proposed for a while?
<wallyworld_> should be out of proposed for xenial pretty quickly?
<mwhudson> wallyworld_: NEW as in https://launchpad.net/ubuntu/xenial/+queue
 * mwhudson afk for 5
<menn0> thumper or wallyworld_: here's the fix for the original bug: http://reviews.vapour.ws/r/4524/
<perrito666> wallyworld_: axw anastasiamac i am trying to find my laptop for the standup be right there
<anastasiamac> perrito666: k
<wallyworld_> menn0: will look after standup
<menn0> wallyworld_: np
<mup> Bug #1569086 opened: Juju controller CA & TLS server keys are weak <juju-core:New> <https://launchpad.net/bugs/1569086>
<mup> Bug #1569047 changed: juju2 beta 3: bootstrap warnings about interfaces <landscape> <juju-core:New> <https://launchpad.net/bugs/1569047>
<ericsnow> axw: re: bug #1566431, unfortunately there's more to it than fiddling with the provisioner-side...tools stuff has to be tweaked as well
<mup> Bug #1566431: cloud-init cannot always use private ip address to fetch tools (ec2 provider) <juju-core:In Progress by ericsnowcurrently> <https://launchpad.net/bugs/1566431>
<axw> ericsnow: yeah, doesn't surprise me :/
<ericsnow> axw: looks like you had to deal with a related issue last year
<axw> ericsnow: oh? what was that?
<ericsnow> axw: https://github.com/juju/juju/blob/master/apiserver/common/tools.go#L359
<axw> ericsnow: ah that comment's just about HA really
<ericsnow> axw: the comment applies regardless, I think
<axw> ericsnow: yup
<ericsnow> axw: we do the separate "tools URL" thing for the sake of possibly using alternate tools servers, no?
<axw> ericsnow: we've got a very small window to make a breaking change, if we can fix it... :)
<ericsnow> axw: yeah :)
<axw> ericsnow: can't remember why, sorry
<ericsnow> axw: np
<axw> ericsnow: I think that is the case, just to encapsulate that logic for the several places where we need to get tools URLs
<axw> it may be redundant if we're just returning all the addresses
#juju-dev 2016-04-12
<mup> Bug #1569097 opened: jujud fails to start with "could not find a suitable binary for "0.0/mmapv1"" <blocker> <mongodb> <juju-core:Triaged> <https://launchpad.net/bugs/1569097>
<cherylj> wallyworld: could you or someone from your team take bug 1569097?
<mup> Bug #1569097: jujud fails to start with "could not find a suitable binary for "0.0/mmapv1"" <blocker> <mongodb> <juju-core:Triaged> <https://launchpad.net/bugs/1569097>
<wallyworld_> ok
<cherylj> thanks!
<wallyworld_> cherylj: part of the issue is the mongo stuff got merged too soon, so we'll need to look into how to deal with that. i'm still ramping up on the issues
<wallyworld_> cherylj: i also added a bug to the board - HA doesn't use bootstrap constraints
<cherylj> wallyworld_: I guess that was a miscommunication - we thought it was safe to merge because it had a fallback :/
<wallyworld_> cherylj: my PR didn't have a fallback - it expected mongo 3.2 to be in xenial
<cherylj> the above bug happened on trusty, if it makes a difference
<wallyworld_> anyways, all good, we'll fix
<wallyworld_> on trusty it was supposed to use mongo 2.4 stuff, hmmm, i'll need to check
<wallyworld_> i bootstrapped yesterday without issues, but that may have been on xenial, i'll need to check
<wallyworld_> i wonder if wily is also broken
<wallyworld_> cherylj: good news though - it's in the queue, so progress :-) https://launchpad.net/ubuntu/xenial/+queue
<cherylj> yay!
<cherylj> brb
<wallyworld_> cherylj: i reckon bug 1534627 should be high rather than medium, since it quite adversely affects stakeholder deployments
<mup> Bug #1534627: Destroyed models still show up in list-models <2.0-count> <conjure> <juju-release-support> <juju-core:Triaged> <https://launchpad.net/bugs/1534627>
<rick_h_> wallyworld_: +1 and the change on it is backward incompatible
<wallyworld_> yep, that too
<mup> Bug #1569106 opened: juju deploy  <service> --to lxd:0 does not work <conjure> <juju-core:New> <https://launchpad.net/bugs/1569106>
<perrito666> wallyworld_: hey, ruthere?
<wallyworld_> maybe
<wallyworld_> depends who's asking
<perrito666> I would make a taxes joke, but I have no idea what the Aussie IRS is called
<wallyworld_> ATO
<wallyworld_> australian tax office
<perrito666> will it kill you? like everything in australia?
<wallyworld_> it can
<wallyworld_> feeding it money helps
<perrito666> so lemme know when you can ho
<wallyworld_> anytime
<perrito666> k standup room?
<wallyworld_> ok
<mup> Bug #1569109 opened: Juju makes wrong network configuration when adding physical machine <juju-core:New> <https://launchpad.net/bugs/1569109>
<natefinch> evening folks
<alexisb> good evening all, see you in the morning
<thumper> bugger...
 * thumper sighs
<thumper> shelving all current work to pop the stack and fix other bits.
<natefinch> cherylj: is there something I should be working on to help unblock master?
<cherylj> natefinch: want to take a look at https://bugs.launchpad.net/juju-core/+bug/1564791 ?
<mup> Bug #1564791: 2.0-beta3: LXD provider, jujud architecture mismatch <blocker> <lxd> <juju-core:Triaged> <https://launchpad.net/bugs/1564791>
<cherylj> looks like an interesting one
<natefinch> not really ;)  But I will :)
<natefinch> cherylj: actually, it gets less bad toward the end of the bug :)
<mup> Bug #1569120 opened: wrong lxc bridge still used in juju beta4 <conjure> <juju-core:New> <https://launchpad.net/bugs/1569120>
<cherylj> axw: got a sec?
<axw> cherylj: yup?
<cherylj> axw: I'm looking at bug 1569024
<mup> Bug #1569024: Region names for rackspace should accept caps and lowercase <blocker> <rackspace> <juju-core:In Progress by cherylj> <https://launchpad.net/bugs/1569024>
<cherylj> and was thinking that for public clouds, we could strings.ToLower the region names
<cherylj> that way we don't mess with any user defined cloud regions
<cherylj> and maintain compatibility for rax
<axw> cherylj: gah, yeah, we should and I meant to do that
<axw> cherylj: on input, lower case
<cherylj> but just for public clouds, yes?
<cherylj> or for all?
<axw> cherylj: hrm. well, maybe not lowercase when we pass through, just compare case insensitive
 * axw looks at the code
<cherylj> ah, that works too
<cherylj> strings.EqualFold()
<cherylj> neato
<axw> cherylj: I *think* it's just a matter of changing "getRegion" in cmd/juju/commands/bootstrap.go
<axw> where we check region.Name ==
<axw> cherylj: also the set-default-region command
<cherylj> axw: yeah, I had some changes in there already, just wanted to verify what we should do
<cherylj> axw: so don't change the region, just do a case insensitive comparison?
<axw> cherylj: I think that's safest, yeah
<cherylj> axw: sounds good, thanks
<cherylj> natefinch: I have access to the arm hardware for that lxd bug.  Need me to forward it your way?
<natefinch> cherylj: yes please
<natefinch> cherylj: though it probably will be a matter of looking at the code and then thinking real hard.
<cherylj> break out the hamster
<cherylj> hey rcj, slumming it with the juju devs?
<natefinch> cherylj: my brain refuses to read arm64 ... every time it translates it into amd64, and I have to do a double take to make sure it says the right thing
<cherylj> natefinch: oh me too
<axw> menn0: when you're importing a model, will it be visible during import? will it be mutable while importing?
<axw> (import as in migration)
<menn0> axw: there's a migration-mode flag which will be set to "importing"
<menn0> axw: that blocks critical txns as well as preventing API logins for it
<menn0> axw: the former has been done but not the latter
<axw> menn0: ok, cool. but you'll still be able to see it in list-models?
<menn0> axw: I guess so, but we could make it so they didn't show up
<axw> menn0: I'm thinking it might make sense to have a status entry for models
<menn0> axw: that could be done
<axw> available, importing, destroying, archived
<axw> something like that
<menn0> axw: sounds useful
<axw> menn0: we need to be able to filter out Dead models in list-models, but I think we should show status of Alive vs. Dying
<axw> but a more descriptive status would be better
<axw> I'll look at adding that
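As a sketch, the status values axw lists might look like this in Go; the type and constant names are guesses for illustration, not the eventual implementation:

    package model

    // ModelStatus is a human-facing lifecycle label for a model.
    type ModelStatus string

    const (
        ModelAvailable  ModelStatus = "available"
        ModelImporting  ModelStatus = "importing"
        ModelDestroying ModelStatus = "destroying"
        ModelArchived   ModelStatus = "archived"
    )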
<natefinch> menn0: speaking of migrations... I added a field to charmDoc and tried to figure out if there was anything I needed to do for migration, but couldn't find code migrating charm stuff. What's up with charms and migration?
<menn0> natefinch: migration of charms and tools is still in-progress... there is code but it needs reworking and isn't plugged in to the process yet
<menn0> natefinch: for most collections there's tests that fail if fields are added
<menn0> natefinch: but probably not for charms yet
<menn0> natefinch: so just add your field for now and email thumper and me about it just to make sure
<natefinch> menn0: ok, cool, will do
<menn0> axw: you thinking this status would replace the migration-mode field?
<menn0> axw: or is the status a virtual concept only for the status API?
<axw> menn0: probably not, it's just for human consumption
<menn0> axw: kk
<menn0> axw: you know that there's already an environment-status (hopefully model-status) section which can optionally appear in the status output
<menn0> axw: perrito666 added it to support reporting that there's a tool upgrade available
<menn0> axw: model migration status will appear there too
<axw> menn0: ah ok, I'll check that out - thanks
<rcj> cherylj, what did I do?
<rcj> cherylj, I'm just here to remind everyone to use the 'daily' stream when running with the xenial series until it ships, otherwise you have a very stale experience.
<cherylj> heh
<rcj> I mean, that's not why I'm here, but I'll make that public service address whenever the opportunity presents itself.
<cherylj> this has been a CPC public service announcement
<rcj> cloud images, best consumed fresh
<rcj> also, I'm not in charge of any actual branding efforts
<cherylj> axw:  Can you do a quick review?  http://reviews.vapour.ws/r/4528/
<axw> cherylj: looking
<cherylj> natefinch: how's that arm bug coming?  (I'm curious because it's such a weird bug)
<cherylj> it's not a nag :)
<axw> cherylj: looks good, but can you please do set-default-region while you're there?
<cherylj> gah, I forgot you said that
<cherylj> yes
<axw> cherylj: thanks :)
<cherylj> axw: can you take another look?  http://reviews.vapour.ws/r/4528/
<cherylj> I had to do it a bit differently for set-default-region
<cherylj> so that we wrote out what was in the cloud region list, not what the user specified
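A minimal sketch of the lookup axw and cherylj settle on: match case-insensitively with strings.EqualFold, but return the canonical entry so what gets written out is the cloud definition's spelling, not the user's. The types and helper are illustrative, not the actual getRegion code:

    package cloud

    import "strings"

    // region stands in for an entry in a cloud definition's region list.
    type region struct {
        Name string
    }

    // findRegion matches input against the region list without regard
    // to case; returning the stored entry means user input "DFW"
    // resolves to the canonical "dfw".
    func findRegion(regions []region, input string) (region, bool) {
        for _, r := range regions {
            if strings.EqualFold(r.Name, input) {
                return r, true
            }
        }
        return region{}, false
    }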
<menn0> cherylj, thumper: what was the decision on where to land stuff while master is blocked?
<cherylj> bleh, I haven't done that.
 * thumper waits...
<thumper> cherylj: what was the decision?
<cherylj> it was a back and forth for a while, but the general consensus was "yeah, sure"
<axw> cherylj: sorry was afk, looking
<cherylj> menn0, thumper, since it's already tomorrow, I can go either way on a bug branch.
<cherylj> if either one of you wants to create one, go for it
<cherylj> I'm just waiting to land this rackspace fix so I can go to bed
<axw> cherylj: LGTM, thank you
<axw> sorry for keeping you from bed :(
<cherylj> it happens :)
<cherylj> thanks for the review!
<thumper> cherylj: was it acceptable to have a release branch?
<cherylj> thumper: I'd rather not do that at this point because I don't know if CI would run on it tonight (until the QA team wakes up)
<thumper> ah... good point
<menn0> thumper, cherylj: let's make a "next" branch
<thumper> ack
<thumper> next branch created
<menn0> the compression ratio achieved by lrzip is amazing but geez it's slow
 * menn0 has been waiting for almost 2 hours for a file to decompress
<mwhudson> menn0: two *hours*?
<mwhudson> menn0: seems unlikely the extra compression saved you two hours of download time...
<menn0> mwhudson: I agree but that's how the file came
<menn0> it's a 365MB file that's currently up to 11GB and climbing
<menn0> lrzip is even using every core and it's still taking this long
<thumper> wow
<menn0> mwhudson, thumper: just finished... a little over 2 hours. 365 MB to 14GB
<davecheney> what was in that giant file ?
<mwhudson> menn0: that is quite a ratio
<menn0> davecheney: DB dump from a broken system
<wallyworld_> axw: if you get a chance, here's a small mongo ha fix for beta4 http://reviews.vapour.ws/r/4529/
<axw> wallyworld_: ok, a little later, trying not to context switch right now
<axw> (unless it's urgent)
<wallyworld_> tis fine, whenever suits
<wallyworld_> nah, can wait
<wallyworld_> so long as it lands sometime today so CI can run
<wallyworld_> i could bug menn0 :-) if he is waiting for lrzip
<davecheney>         m.Server = httptest.NewServer(nil)
<davecheney>         c.Assert(m.Server, gc.NotNil)
<davecheney>         m.oldHandler = m.Server.Config.Handler
<davecheney> create a new server, then save the value of its handler ...
<davecheney> then restore the handler in the tear down
<davecheney> then the new test overwrites the value we just restored ...
<davecheney> wat
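For readers following along, a reconstruction of the pattern davecheney is complaining about, with the setup/teardown roles spelled out; the suite and method names are illustrative:

    package sketch

    import (
        "net/http"
        "net/http/httptest"
    )

    type suite struct {
        Server     *httptest.Server
        oldHandler http.Handler
    }

    func (s *suite) setUpTest() {
        s.Server = httptest.NewServer(nil)     // Config.Handler starts out nil
        s.oldHandler = s.Server.Config.Handler // ...so this saves that nil
    }

    func (s *suite) tearDownTest() {
        // Restores the saved handler, which the next test then
        // overwrites anyway; hence the "wat".
        s.Server.Config.Handler = s.oldHandler
    }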
<menn0> wallyworld_: i'm waiting for a long mgopurge run
<menn0> wallyworld_: i'll take a look
<wallyworld_> ty
<wallyworld_> is there a customer issue?
<menn0> wallyworld_: ship it
<wallyworld_> menn0: yay, tyvm
<menn0> wallyworld_: even though you're deleting a lot of my turd polishing :)
<wallyworld_> menn0: sorry :-)
<wallyworld_> less turds left now
<menn0> wallyworld_: actually hang on
 * wallyworld_ hangs
<menn0> wallyworld_: can't you be a bit more aggressive about test removal
<wallyworld_> possibly
<wallyworld_> i thought about removing the whole fakeensure stuff
<menn0> wallyworld_: some of those asserts you've removed were the point of those tests so I suspect the whole test can go
<menn0> that's what I was thinking too
<wallyworld_> yeah, had the same thought
<menn0> if it's not being used
<wallyworld_> i was trying to be a bit conservative
<wallyworld_> i'll take another look
<menn0> wallyworld_: it's really just TestMachineAgentUpgradeMongo
<menn0> and perhaps the fakeensuremongo
<wallyworld_> yep, i convinced myself that test remained useful
<wallyworld_> but seems not
<wallyworld_> menn0: yeah, a lot of extra code can just be deleted
<menn0> wallyworld_: excellent
<wallyworld_> peergroup is having a big haircut
<wallyworld_> peergrouper
<axw> wallyworld_: before you delete all that...
<wallyworld_> already gone :-)
<axw> is it still possible to promote machines to controllers with your changes?
<wallyworld_> axw: you mean ones which are not yet has-vote
<axw> wallyworld_: I mean "enable-ha --to 0,1,2"
<axw> where we transform a non-state-server into a state-server
<wallyworld_> i'll double check, i didn't test that explicitly
<dimitern> whew it finally worked !
<mup> Bug #1569196 opened: enable-ha with placement fails due to invalid JobManageNetworking <juju-core:Triaged> <https://launchpad.net/bugs/1569196>
<voidspace> morning everyone
<voidspace> back to the routine of the school-run this morning
<voidspace> *sigh*
<dimitern> morning voidspace
<voidspace> dimitern: so thumper broke my code *again* overnight :-)
<dimitern> voidspace: oh yeah? :)
<voidspace> dimitern: see here: https://docs.google.com/document/d/1YmbdGpP7Oy5uglOwqbRXf1k_7siaxfEpoWkshk5_oPo/edit?ts=56fb30ca#
<voidspace> dimitern: basically you were right about not_networks so he changed the allocate machine args again
<voidspace> dimitern: and I was just updating the code to work with master as it was yesterday :-)
<voidspace> it's not a big change - so not difficult
<dimitern> voidspace: cool :)
<thumper> voidspace: it is my mission in life to make your mornings miserable
<thumper> however, dimitern will like to hear that he was right
<voidspace> thumper: ah, that explains why you joined our standups!
<voidspace> thumper: :-)
<voidspace> thumper: hey, so gomaasapi now has its own dependencies.tsv
<thumper> voidspace: if you want to jump in the hangout now, we can chat; that way I can not work later
<thumper> voidspace: yeah, needed for the merge bot
<voidspace> thumper: sure
<voidspace> thumper: right, but the versions of its dependencies are different than the juju ones
<voidspace> thumper: I'll join the hangout
<voidspace> babbageclunk: you too?
<thumper> voidspace: shouldn't be off by much
<voidspace> babbageclunk: early hangout
<thumper> voidspace: probably just testing
<voidspace> all of them are now different I think
<babbageclunk> voidspace: sure
<voidspace> thumper: but everything still works
<voidspace> we just need to be careful
 * thumper nods
<dimitern> thumper: I told you ;)
<TheMue> morning
<dimitern> TheMue: \o
<TheMue> dimitern: hard at work on Juju 2 and also 16.04?
<dimitern> TheMue: oh yeah :)
<Alex____> Hi, I wonder if somebody could please point me to the right place for a quick question on BigData charms?
<TheMue> dimitern: how is J2 different from the J1.*? so many incompatible changes to change the major release number?
<dimitern> TheMue: a lot has changed, and some things in an incompatible way, check the release notes :)
<TheMue> dimitern: will do. still very interested in juju and always trying to place it in projects or give interested people a hint. many don't know about it.
<axw> fwereade_: the branch I put up is for 2.0, in which compatibility breaks are many and varied
<babbageclunk> Anyone know why building the next branch is failing?
<fwereade_> axw, oops, fair enough, I do default to unthinkingly-maintain-compat
<dimitern> frobware: managed to figure it out - erc-email-userid needs to match my nick for i.canonical.c to accept it along with the server password
<axw> fwereade_: and I thank you for it :)
<frobware> dimitern: no turning back now :)
<dimitern> frobware: indeed :)
<frobware> babbageclunk: guessing... did you run godeps -u ...
<babbageclunk> frobware: not locally - in the github-merge-juju Jenkins job.
<babbageclunk> frobware: http://juju-ci.vapour.ws:8080/job/github-merge-juju/7313/console
<babbageclunk> frobware: looks like lots of provider/lxd tests
<dimitern> babbageclunk: fwiw I see the same errors even after upgrading to xenial when running make check on master tip
<axw> fwereade_: responded to your other questions on RB, will look again tomorrow. thanks for the review
<dimitern> if anything it got worse - I only saw a couple of failures yesterday on wily
<babbageclunk> frobware: some of the failing builds under that are against master, some against next.
<babbageclunk> frobware, dimitern: I tried running provider/lxd tests for next locally and I don't see the failures (although I didn't run the full test suite).
<babbageclunk> dimitern: I'll try running make check
<dimitern> babbageclunk: I'll try next now so see if it's any better
<dimitern> but first I need to reboot..
<menn0> hi all
<babbageclunk> menn0: hi!
<menn0> babbageclunk: how's it?
<babbageclunk> Hey, I saw a build of yours failed with lots of lxd provider failures.
<babbageclunk> Did you work out why? A branch of mine had that just now too.
<menn0> babbageclunk: everyone's merges seem to be failing like that. I wonder if there's a problem with the test runner hosts.
<menn0> any QA people about?
<dooferlad> frobware: launchpad seems to have gone read only, so I can't put this in the bridged bond bug right now. The pre-up/post-down thing is a red herring. Even if you include them cloudinit hangs. Rebooting always works and cloudinit seems to finish happily.
<dooferlad> frobware: and I really need to get the proxy bug fix landed, so pausing on this for now.
<frobware> dooferlad: ack
<dooferlad> frobware: ah, bug just updated. Yay web services.
<frobware> dooferlad: really need to conclude on an investigation of replace ENI and reboot...
<babbageclunk> menn0: Running the full test suite locally (on juju/next) I get the same failures
<menn0> babbageclunk: interesting... so not the build hosts then
<menn0> babbageclunk: I'm just finishing something else up and then I'll try on my machine.
<babbageclunk> menn0: takes ages though so I haven't run the tests against master as well yet - I saw that cherylj has some failing runs against master with the same errors.
<frobware> dimitern: whoa! that's subtle...
<frobware> dimitern: we currently have 00-juju.cfg and eth0.cfg
<frobware> dimitern: which would/could/should give us 2 addresses on eth0
<frobware> dimitern: but because we specify a mac addr, the ifup via DHCP on eth0.cfg gives us the same IP addr
<frobware> dimitern: ok, that explains it (for me at least) :)
<menn0> babbageclunk: if you run just one of the tests that's failing in CI does it fail then? (that shouldn't take too long)
<babbageclunk> menn0: Yeah, it turns out just running ./provider/lxd fails.
<dimitern> frobware: interesting
<dimitern> frobware: and lucky I guess :)
<babbageclunk> menn0: But now I can't find a version where it doesn't fail.
 * menn0 runs those tests
<menn0> babbageclunk: they pass for me
<frobware> dimitern: I was trying to understand the behaviour. If I try this outside of juju the ifup on another foo.cfg (which also specifies eth0) will just add another IP addr to eth0.
<babbageclunk> menn0: is it safe to just do a checkout, godeps, then go test?
<menn0> babbageclunk: yep that should do it (as long as you have mongodb installed)
<menn0> babbageclunk: and I guess you probably need to have lxd installed for some tests too
<frobware> menn0, babbageclunk: isn't the underlying problem related to the configuration of lxdbr0?
<babbageclunk> menn0: ok, so it'll rebuild everything.
<frobware> or lack of
<menn0> frobware: sure... but why is it suddenly happening in CI and on babbageclunk's machine?
<babbageclunk> menn0, frobware: ok - I installed lxd last friday.
<menn0> babbageclunk: what does "lxc version" show?
<babbageclunk> 0.20
<babbageclunk> menn0: I'm on wily
<menn0> babbageclunk: I'm on vivid but I'm running 2.0.0.rc1
<dimitern> so I see exactly the same test failures on next as on master
<babbageclunk> Maybe I should upgrade to that.
<menn0> dimitern: yes, all recent merge attempts have had the same lxd/lxcbr0 problems
<babbageclunk> menn0: potentially that's also the problem on the build machine(s)
<menn0> babbageclunk: there's a PPA for the current lxd from the lxd/lxc team
<frobware> menn0, babbageclunk: to repro this just 'cd provider/lxd; go test'?
<menn0> frobware: I believe so
<babbageclunk> frobware: yup - might need a godeps in there too
 * menn0 prefers "go test ./provider/lxd" but whatever
<dimitern> frobware: same thing with running only provider/lxd tests
<frobware> menn0, babbageclunk: ok && not terribly helpful but OK: 77 passed, 1 skipped
<frobware> menn0, babbageclunk: however, I am _only_ at dd9828ec7003d1a6ec1fc4dbcb7e6d17467a21f0
<babbageclunk> menn0, frobware - ok, I'm going to add the ppa and upgrade.
<frobware> babbageclunk, menn0: or go back to dd9828ec and try there... it may be something more recent in master
<babbageclunk> menn0, frobware: then I guess if that fixes it, it's an indication that someone should do the same on the build hosts.
<dimitern> frobware: I suspect you did run `sudo dpkg-reconfigure -p medium lxd` as suggested by the tests?
<dimitern> otherwise how are you not seeing the failures..
<frobware> dimitern: nope, not medium. but I did reconfigure some time last week
<menn0> dimitern: I can't repro the problem, and I haven't run dpkg-reconfigure in a long time
<babbageclunk> frobware: I tried going back to find a bisect start point, but got back to last Monday and the tests were still failing.
<frobware> menn0: it was probably last tue/wed when I did the dpkg-reconfigure
<menn0> I haven't since I installed lxd (about 2 months ago?)
 * dimitern *facepalm*
<frobware> babbageclunk: my /etc/default/lxd-bridge config: http://pastebin.ubuntu.com/15784102/
<babbageclunk> anyone have the ppa handy?
<dimitern> I remember what I did - changed /e/default/lxd-bridge to not have IPv4 addresses as it was messing up my lxd multi-nic testing
 * frobware would like to kickstart/jumpstart all his machines every morning to avoid state...
<menn0> dimitern: but why is this also happening on the build hosts?
<frobware> menn0: which is why I was suggesting first go back to my current rev ^^ to see if it's just recent churn in master.
<babbageclunk> frobware: I'm on that rev - it's upstream/next and upstream/master (since no one's been able to land anything)
<frobware> oooohhh. I am at that rev. apologies...
<frobware> babbageclunk: my lxd package is:
<dimitern> menn0: not sure - perhaps when /e/d/lxd-bridge was introduced it did not have IPv4 config and CI machines haven't been updated since?
<frobware> $ apt-cache madison lxd
<frobware>        lxd | 2.0.0-0ubuntu2 | http://gb.archive.ubuntu.com/ubuntu xenial/main amd64 Packages
<dimitern> now all provider/lxd tests pass
<frobware> dimitern: to confirm, you're running xenial?
<dimitern> frobware:yep
<dimitern> however, p/lxd tests should NOT fail if anything like that on the machine happens - they should be properly isolated
<babbageclunk> Huh. lxc version still says 0.20, but the tests pass for me now.
<dimitern> babbageclunk: you're on wily?
<babbageclunk> dimitern: yup
<dimitern> babbageclunk: so I needed to add `deb http://ppa.launchpad.net/ubuntu-lxc/lxd-stable/ubuntu wily main` to /e/a/srcs.list to get lxd to work on wily
<babbageclunk> dimitern: yeah - I did the same, but via add-apt-repository for ppa:ubuntu-lxc/lxd-stable
<dimitern> babbageclunk: yeah - same thing, and then a-g update && a-g dist-upgrade
<dimitern> dist-upgrade if you already installed lxd I mean
<babbageclunk> ok, so how do we get the tests passing on the build machines?
<voidspace> ooh, down to two failures
<dimitern> mgz: ^^
<dimitern> mgz: istm the dpkg configure for lxd might have been skipped with the noninteractive frontend
<dimitern> mgz: it will be useful to keep the machine around when the merge job fails to see what's going on
<babbageclunk> dimitern: I'm running apt-get under ansible, so maybe it's being run noninteractively as well?
<dimitern> babbageclunk: well, what's in /etc/default/lxd-bridge ?
<babbageclunk> dimitern: http://pastebin.ubuntu.com/15784982/
<babbageclunk> dimitern: But the tests are passing now that I've upgraded.
<babbageclunk> dimitern: so maybe it was just that I added the ppa.
<dimitern> babbageclunk: yeah, it looks like the tests should still fail (?!) mine was very similar before I fixed it
<frobware> babbageclunk: looks like most of the config is empty
<voidspace> frobware: dimitern: babbageclunk http://reviews.vapour.ws/r/4535/
 * voidspace lurches to lunch
<dimitern> voidspace: looking
<voidspace> dimitern: thanks
<babbageclunk> frobware, dimitern, voidspace: http://reviews.vapour.ws/r/4536/
<babbageclunk> voidspace: looking at yours now.
<voidspace> babbageclunk: dimitern just reviewed it, but thanks
<babbageclunk> voidspace: yeah, I just saw that it's merging
<voidspace> it will fail
<voidspace> dammit
<voidspace> I missed off some test fixes - didn't push them
<babbageclunk> voidspace: probably should have a look anyway - I'm OCR tomorrow.
<voidspace> ooh
<babbageclunk> voidspace: Well, it was going to fail due to the lxd thing anyway, right? ;)
<voidspace> hah
<voidspace> babbageclunk: are all merges backed up on that
<babbageclunk> voidspace: there are 9 failures in a row on github-merge-juju that I think are provider/lxd ones.
<voidspace> babbageclunk: nice :-)
<voidspace> right
 * voidspace really goes on lunch
<wallyworld_> cherylj: a small one for a ha bug i found testing ha http://reviews.vapour.ws/r/4537/
<babbageclunk> voidspace: that AllocateMachine change is biting me too - I'll use a version that has Link.IPAddress() but not the AllocateMachine change until you've updated stuff.
<babbageclunk> wallyworld_: we've been having merge jobs fail on Jenkins because of LXD provider tests - do you know about that?
<wallyworld_> i saw my job fail, but don't know what's wrong with lxd
<rick_h_> babbageclunk: wallyworld_ cherylj and QA are looking into it I think
<wallyworld_> but i did see a bug where lxd behaves differently on trusty vs xenial with the bridge
<wallyworld_> i strongly suspect an upstream lxd issue
<babbageclunk> rick_h_: ok, thanks
<wallyworld_> babbageclunk: bug 1569120 may be related / relevant
<mup> Bug #1569120: wrong lxc bridge still used in juju beta4 <conjure> <juju-core:Incomplete> <https://launchpad.net/bugs/1569120>
<babbageclunk> rick_h_, wallyworld_: if it helps, I was getting the same failures on my machine (wily) until I added the PPA for lxd-stable and upgraded.
<mgz> we're using daily xenial images for the merge bot (which we have to, as the last one has too old an lxc)
<mgz> and there's a new lxd as of 2016-04-11 that's probably in today's image
<mgz> with various changes, bug 1548489
<mup> Bug #1548489: [FFe] Let's get LXD 2.0 final in Xenial <lxd (Ubuntu):Fix Released> <https://launchpad.net/bugs/1548489>
<mgz> so it's likely we just got broken again
<babbageclunk> mgz: ah, ok - thanks
<frobware> bug #1569361 makes it hard to iterate on fixing container bugs...
<mup> Bug #1569361: LXD containers fail to upgrade because the bridge config changes to a different IP address <network> <juju-core:New> <https://launchpad.net/bugs/1569361>
<cherylj> frobware:  :(
 * perrito666 gets budgeted for his next home internet... U$D450/5M
<mup> Bug #1569361 opened: LXD containers fail to upgrade because the bridge config changes to a different IP address <network> <juju-core:New> <https://launchpad.net/bugs/1569361>
<mgz> so... where are we actually at with lxd?
<mgz> our master doesn't work with their 2.0 - plus various other bugs?
<voidspace> babbageclunk: or you can merge my branch
<voidspace> babbageclunk: https://github.com/juju/juju/pull/5094/files
<babbageclunk> voidspace: yeah, but this was pretty easy and likely to require less explanation at review time.
<voidspace> babbageclunk: cool, that branch is ready to land though
<katco> morning all
<babbageclunk> voidspace: true - I'll need to merge it in eventually.
<babbageclunk> katco: o/
<wallyworld_>                          
<wallyworld_> \
<mgz> wallyworld_: your arm is falling off
<wallyworld_> pressed wrong key
<ericsnow> katco: rogpeppe1 is proposing a small API change in csclient.Client which would require a likewise small (isolated) change in core
<ericsnow> katco: any objections?
<katco> ericsnow: yeah saw the email... cherylj what would a change to core look like at this point? would it still go into rc1?
<cherylj> katco: it pulls in an updated dep, right?
 * ericsnow ignores wallyworld_ since he can't possibly be coherent at this point
<katco> cherylj: and a small change to core
<cherylj> katco: I'm going to say that should go into rc1.  (not what we're trying to release this week)
<katco> cherylj: that's fine
<katco> ericsnow: ok, no objections
<cherylj> so put it in the next branch that thumper created
<ericsnow> cherylj: FYI, it *is* a bug
<ericsnow> katco: k
<cherylj> yes, I know
<ericsnow> rogpeppe1: ^^^
<cherylj> ericsnow:  is there a bug opened?  the email I saw didn't mention one?
<ericsnow> cherylj: not yet, I expect
<rogpeppe1> cherylj: no, i didn't file a bug yet. will do.
<ericsnow> rogpeppe1: thanks
<cherylj> thanks rogpeppe1
<ericsnow> and thanks for noticing the bug :)
<katco> ericsnow: fix lands here: https://github.com/juju/juju/tree/next
<rogpeppe1> cherylj: not sure if i should file the bug against juju-core or charmrepo/csclient
<cherylj> rogpeppe1:  you can target to both
<rogpeppe1> cherylj: interesting. how would I do that?
<ericsnow> katco: is master for 2.0.1 now?
<ericsnow> rogpeppe1: "Also affects project"
<cherylj> rogpeppe1: Use "also affects project"
<katco> ericsnow: that is my understanding. cherylj, correct?
<cherylj> ericsnow: no, master is for beta4.  We didn't branch for the release last night because I didn't know if the branch would've been picked up for testing overnight
<cherylj> (but it didn't matter anyway because no merge jobs passed because of lxd)
<ericsnow> cherylj: so the fix for rogpeppe1's bug should go in master or next?
<katco> cherylj: what is the "next" branch for?
<rogpeppe1> cherylj: do i have to do that after submitting the bug? i don't see that option in the "new bug" page.
<cherylj> next is for rc1
<cherylj> when we release beta4, we will merge next into master
<ericsnow> cherylj: ah, okay
<mgz> it's 100% that someone is going to screw up targeting here
<katco> that seems... backwards
<cherylj> rogpeppe1: yes, after you create the bug you can target to a different project
<cherylj> katco: yes, I know, but we did it that way because we wanted to make sure master / whatever we're going to release got a CI run overnight and I didn't know if it would pick up a new branch
<cherylj> and it was way past EOD for the qa team
<katco> cherylj: our tooling T.T
<rogpeppe1> cherylj: it doesn't like the fact that there's no launchpad project for charmrepo (it's in github)
<cherylj> rogpeppe1: then just target to juju-core
<rogpeppe1> cherylj: i've created the bug. https://bugs.launchpad.net/juju-core/+bug/1569386
<mup> Bug #1569386: list resources will not work correctly <juju-core:New> <https://launchpad.net/bugs/1569386>
<cherylj> thanks!
<cherylj> guess I should create a 2.0 rc1 milestone
<cherylj> hey natefinch, any luck with bug 1564791?
<mup> Bug #1564791: 2.0-beta3: LXD provider, jujud architecture mismatch <blocker> <lxd> <juju-core:Triaged by natefinch> <https://launchpad.net/bugs/1564791>
<natefinch> cherylj: it's kind of a twisty maze of code getting passed around, but I have some suspicious lines I'm looking at. e.g.  if result.Arch == "" {result.Arch = "amd64"}
<mup> Bug #1569386 opened: list resources will not work correctly <juju-core:New> <https://launchpad.net/bugs/1569386>
<natefinch> ericsnow: just had a good idea about the bug 3 lines up... I think this is another case of needing to make our "local" provider special.  LXD has to always default to the arch of the host machine, but we have provider code that says that if you don't specify the arch, we default to amd64, which obviously fails to run on other arches.  I think we never see this in development, because we always use --upload-tools
<mup> Bug #3: Custom information for each translation team <feature> <iso-testing> <lp-translations> <Launchpad itself:Fix Released> <MTestZ:Invalid> <Ubuntu:Invalid> <mono (Ubuntu):Invalid> <https://launchpad.net/bugs/3>
<ericsnow> natefinch: yep
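A minimal sketch of the fix natefinch is describing: for a local provider like LXD, default a missing architecture to the host's rather than to amd64. The result struct and helper are hypothetical, and runtime.GOARCH stands in for juju's own host-arch helper:

    package sketch

    import "runtime"

    // result stands in for whatever struct carries the resolved tools
    // metadata; only the Arch field matters here.
    type result struct {
        Arch string
    }

    // defaultArch fills in a missing architecture. LXD containers share
    // the host's kernel, so the host arch is the only workable default
    // there; blanket-defaulting to "amd64" is what breaks arm64 hosts.
    func defaultArch(r *result, localProvider bool) {
        if r.Arch != "" {
            return
        }
        if localProvider {
            r.Arch = runtime.GOARCH
        } else {
            r.Arch = "amd64" // the old blanket default
        }
    }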
<katco> ericsnow: natefinch: standup time
<natefinch> cherylj: is there a card for https://bugs.launchpad.net/juju-core/+bug/1564791
<mup> Bug #1564791: 2.0-beta3: LXD provider, jujud architecture mismatch <blocker> <lxd> <juju-core:Triaged by natefinch> <https://launchpad.net/bugs/1564791>
<cherylj> natefinch: not yet, I can make one for you
<frobware> anybody else see bootstrap failures related to mongod not found in PATH?
<frobware> I have bootstrapped quite a few times today but it has failed twice in a row now
<frobware> see bug #1569408
<natefinch> cherylj: thanks
<mup> Bug #1569408: Failed to bootstrap because exec: "mongod": executable file not found in $PATH <juju-core:New> <https://launchpad.net/bugs/1569408>
<katco> cherylj: can redir land help text changes into the next branch?
<cherylj> katco: yes
<katco> cherylj: k ta
<mup> Bug #1569408 opened: Failed to bootstrap because exec: "mongod": executable file not found in $PATH <juju-core:New> <https://launchpad.net/bugs/1569408>
<redir> :)
<katco> redir: what's your launchpad id?
<redir> reedobrien
<redir> katco: ^
<katco> redir: ty
<dimitern> now everything is broken
<dimitern> maas cannot bootstrap due to missing mongod, aws can't add lxc containers as cloud-init sets a non-present locale en_US.UTF-8
<dimitern> and the locale is missing because apt-get update & upgrade are apparently required for xenial now
<frobware> dimitern, voidspace, tych0: PTAL @ https://github.com/juju/juju/pull/5099
<frobware> dimitern: I went back to trusty and added backports to sources.list -- working there. \o/
<dimitern> frobware: looking
<dimitern> frobware: I managed to get xenial to work as well by doing a-g up & upg & a-g install language-pack-en-base
<frobware> dimitern: I can no longer bootstrap with xenial...
<dimitern> frobware: on maas, I have the same issue - but I'm using AWS now to verify dropping address-allocation ff does not break something there
<frobware> dimitern: gotcha
<natefinch> lol, I now have 3 unkillable lxd environments
<dimitern> frobware: LGTM
<frobware> dimitern: ty
<natefinch> uh.... anyone know what this means?
<natefinch> $ juju bootstrap local-Apr-12 lxd --upload-tools
<natefinch> ERROR invalid config: no addresses match
<perrito666> throw some debug there?
<dimitern> natefinch: try --debug?
<natefinch> oh, maybe this is the lxd problem everyone's been having, that I avoided by just not using lxd for a while :/
<natefinch> 2016-04-12 16:04:14 DEBUG juju.cmd.juju.commands bootstrap.go:365 preparing controller with config: map[type:lxd name:admin uuid:0a58e9ef-099f-4cf8-8a48-2772cf8b5c05 controller-uuid:0a58e9ef-099f-4cf8-8a48-2772cf8b5c05]
<natefinch> 2016-04-12 16:04:14 ERROR cmd supercommand.go:448 invalid config: no addresses match
<dimitern> that's a new issue to me
<katco> natefinch: there's a good thread on that with rogpeppe1 and redir
<katco> natefinch: search email for that error message
<rogpeppe1> natefinch: i think the underlying cause is this: https://bugs.launchpad.net/juju-core/+bug/1567952
<mup> Bug #1567952: container/lxd: TestDetectSubnetLocal fails with link/none <juju-core:Triaged> <https://launchpad.net/bugs/1567952>
<cherylj> natefinch: you need to do the dpkg-reconfigure to set up the bridge, then service lxd restart
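Roughly the kind of check behind "no addresses match": look for a usable IPv4 address on the LXD bridge. A hedged sketch only; the bridge names and logic are illustrative, not the actual detection code:

    package sketch

    import (
        "fmt"
        "net"
    )

    // bridgeSubnet returns the first IPv4 subnet configured on the named
    // bridge, e.g. "lxdbr0" (or "lxcbr0" on older setups).
    func bridgeSubnet(name string) (*net.IPNet, error) {
        iface, err := net.InterfaceByName(name)
        if err != nil {
            return nil, err // bridge missing entirely
        }
        addrs, err := iface.Addrs()
        if err != nil {
            return nil, err
        }
        for _, a := range addrs {
            if ipnet, ok := a.(*net.IPNet); ok && ipnet.IP.To4() != nil {
                return ipnet, nil
            }
        }
        return nil, fmt.Errorf("no usable IPv4 address on %q", name)
    }

The dpkg-reconfigure step assigns the bridge an IPv4 subnet, which is why it makes this class of error go away.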
<cherylj> perrito666: do you have a minute?
<natefinch> cherylj: is this something that'll get fixed?  Or is this something special because we ran old versions of lxd, or?
<cherylj> natefinch: you should only have to do it once
<cherylj> but it's something that right now, you have to do every time for newly provisioned instances
<natefinch> cherylj: ew
<cherylj> yeah
<natefinch> cherylj: omg, this is so much worse than I expected
<cherylj> hahaha
<natefinch> seriously, an order of magnitude
<cherylj> yeah
<cherylj> it's *awesome*
<natefinch> I hope the only thing I have to change the default for is the name of the bridge
<frobware> cherylj: but at least I can run --upgrade-juju now with LXD containers... makes debugging a little quicker.
<natefinch> and lol still fails with the same error message
 * natefinch reboots just in case
<alexisb> natefinch, for master I was able to get it working by running lxd init and configuring the bridge and network that way
<natefinch> alexisb: ok
<natefinch> alexisb: oh, it doesn't want me to do that since I have existing containers, let me dump those
<bogdanteleaga> is there a way to replace the tools that are in state?
<bogdanteleaga> to deploy a machine with freshly built tools?
<alexisb> natefinch, yep you have to dump those, then I also removed my lxc bridge
<alexisb> not sure if that step was necessary, but that was my process
<natefinch> still thinks I have containers around, even though list says there aren't. Sigh.  Gotta run to lunch, will pick this up after.
<dimitern> so the missing juju-mongodb3.2 package on xenial broke AWS bootstrap as well as MAAS (with update/upgrade enabled)
<cherylj> perrito666: ping?
<cherylj> dimitern: yeah, I'm working on it
<cherylj> dimitern: well, the problem is now that it's there
<cherylj> and we're not looking in the right place for it
<dimitern> cherylj: oh, cheers then! :)
<perrito666> dimitern: what?
<cherylj> perrito666: hey, I've got mongo questions for you :)
<perrito666> dimitern: current master wont fail if the package is not there
<perrito666> cherylj: sure
<cherylj> perrito666: it does :(
<dimitern> perrito666: yeah? :)
<cherylj> wait
<dimitern> perrito666: sure
<cherylj> sorry
<cherylj> it fails if it *is* there
<cherylj> heh
<perrito666> cherylj: ok, ill need more details
<cherylj> I can has english
<cherylj> perrito666: can you HO?
<perrito666> cherylj: gimme a sec
<cherylj> perrito666: np, when you're ready:  https://plus.google.com/hangouts/_/canonical.com/mongo-fun?authuser=0
<mgz> that sounds fun
<perrito666> cherylj: mgz look, it's no fun adding bugs to stuff if you people are going around finding them
<cherylj> ha
<cherylj> so it *IS* sabotage?
<cherylj> hey mgz - about functional-container-networking
<mup> Bug #1569467 opened: backup-restore loses the hosted model <backup-restore> <ci> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1569467>
<cherylj> oh yeah, I was going to look at that ^^
<cherylj> good timing, mup
<cherylj> mgz: there was a change in juju ssh to default to not using the proxy
<cherylj> which breaks that test on AWS
<mgz> oh fun
<cherylj> mgz: but an easy fix.  Just use juju ssh --proxy=true
<cherylj> and backwards compatible to boot
<mgz> how long have we had the --proxy flag?
<mgz> I guess it doesn't matter too much, can just supply it always for 2.0
<cherylj> https://goo.gl/X0oQBt
<perrito666> cloud "lxd" not found, trying as a provider name  <--- such is my luck
<mgz> perrito666: that's an expected warning
<mgz> perrito666: it should still continue fine from there
<perrito666> mm I am getting the same error as nate, I wonder if the upgrade did something to my conf
<natefinch> so... lxc list returns an empty list, but lxd init says error: You have existing containers or images. lxd init requires an empty LXD.
 * natefinch reboots just in case
<natefinch> sigh
<natefinch> hey, that's a different error message
<natefinch> $ juju bootstrap local-apr-12 lxd --upload-tools
<natefinch> ERROR cannot find network interface "lxcbr0": route ip+net: no such network interface
<natefinch> ERROR invalid config: route ip+net: no such network interface
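(The "route ip+net: no such network interface" text in that paste is Go's standard library speaking: net.InterfaceByName wraps lookup failures in an OpError with Op "route" and Net "ip+net". A minimal reproduction, assuming only the stdlib:)

    package main

    import (
        "fmt"
        "net"
    )

    // Reproduces the error text above: looking up a bridge that does not
    // exist yields an error printed as
    // "route ip+net: no such network interface".
    func main() {
        if _, err := net.InterfaceByName("lxcbr0"); err != nil {
            fmt.Printf("cannot find network interface %q: %v\n", "lxcbr0", err)
        }
    }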
<alexisb> natefinch, it should not be looking for lxcbr0
<alexisb> lxdbr0
<alexisb> are you working off master?
<perrito666> natefinch: did you dpkg-reconfigure lxd ?
<perrito666> be sure to say yes to the ipv4 config
<natefinch> perrito666: yes, I did, but I changed lxdbr0 to lxcbr0... I guess that was not the right thing to do.
<perrito666> natefinch: I did too and am working now with lxd
<perrito666> bootstraping a xenial as we speak
<alexisb> natefinch, you have time for a hangout
<alexisb> we should be able to work through this
<natefinch> alexisb: definitely... I'd love to get past this
<alexisb> k, our 1x1 HO
<perrito666> lemme know if I can help you
<cherylj> ugh, I can't even get restore to work
<cherylj> the db just exits
<cherylj> like "see ya, suckers"
<cherylj> Apr 12 17:49:41 ubuntu mongod.37017[3194]: Tue Apr 12 17:49:41.356 [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends
<natefinch> perrito666: thanks, alexis is helping me out
<cherylj> I hit this during the last restore problem I was debugging
<cherylj> perrito666: did you ever see this restore problem?
<perrito666> cherylj: never, which is weird since I coded it
<cherylj> I hit this every time.  am I doing something wrong?  argh!
<natefinch> yay, alexisb fixed it for me :)
<alexisb> :)
<natefinch> ugh wow, I wonder if there's a network problem between me and wherever the images are hosted 'cause dayum this is a slow download
<alexisb> natefinch, they have been very slow
<alexisb> once the image is cached it is easy
<natefinch> yeah
<alexisb> you can always copy the image over and alias it with the tag
<alexisb> lxd will look for the tag and use it
<natefinch> alexisb: this will be done in 5 minutes or so, it's ok
<natefinch> gah... is juju ssh supposed to work?
<perrito666> natefinch: might fail in lxd
<perrito666> natefinch: just lxc list and ssh to the machine
<mup> Bug #1569490 opened: storage-get crashes on xenial (aws) <storage> <juju-core:New> <https://launchpad.net/bugs/1569490>
<natefinch> you just need to put -m <model> before the machine number for some reason
<natefinch> oh, I guess because if you put it after, it thinks that's the ssh command
<natefinch> blech
<natefinch> perrito666: works fine in lxd... just PEBCAK
<perrito666> lol I just ssh I am lazy
<natefinch> I juju ssh because I'm lazy :)
<natefinch> whelp, figured out why I always call kill-controller and not destroy-controller... I don't have to type out the pesky --destroy-all-models
<cherylj> yep
<cherylj> natefinch: are you trying to ssh to machine 0 just after a bootstrap?
<natefinch> cherylj: yes, but it was just a problem of spelling the command correctly, what with multiple models and stuff
<cherylj> ah, ok
<cherylj> ..... and now mongo3.2 has hit the mirrors for my region
<cherylj> yay
<natefinch> cherylj: like, it defaults to an empty model, but I wanted to ssh to the controller, so I had to specify the model, but if you put that after juju ssh 0 then it thinks it's a command....
<cherylj> yeah, that's totes annoying
<natefinch> and I happened to have already created a machine in the non-admin model, so juju ssh 0 still worked and tried to run -m admin as an ssh command, which gave a wacky error message
<redir> if unrelated tests fail in CI, do I need to resubmit the PR?
<natefinch> yes, if you think they're spurious and will go away
<natefinch> ...which is fairly common, unfortunately.  But if you're not sure, send a link and we can help
<redir> well the first failure is a termination worker timeout which is fine locally and I can't imagine that it would be related to helptext updates, so I'll resubmit in a bit.
<redir> second failure is because it can't untar juju-core_2.0-beta4.tar.gz...
<redir> which seems like a CI hiccup
<natefinch> sinzui: ^
<redir> I'll resubmit both after the queue shrinks.
<natefinch> redir: I wouldn't count on the queue shrinking, just sayin' :)
<cherylj> true dat
<natefinch> redir: yes, sounds like one-off failures, though the failure to untar is concerning
<redir> no such file/dir so prolly failed to DL in time.
<sinzui> redir: looks like a hiccup, the tar file didn't arrive on the testing instance.
<mgz> natefinch: looking at the log, we got ssh disconnected when scping the source to the ec2 test running machine
<mgz> ...sinzui won ;_;
<sinzui> oh, is this using the xenial ami?
<sinzui> mgz: We are testing with the xenial from last week.
<redir> yeah I see 'lost connection' above
<redir> I know exactly what will fix this for me.
<redir> Soup and/or sandwich
<cherylj> if only that were the answer to all problems.  sigh...
<mgz> mmm, soup
<redir_lunch> I guess it is just a work-around
<mup> Bug #1569529 opened: update-clouds strips "DO NOT EDIT" warning <ci> <update-clouds> <juju-core:Triaged> <https://launchpad.net/bugs/1569529>
<natefinch> sinzui, mgz: is there a trick to compiling for arm64?  GOARCH=arm64 go build github.com/juju/juju/cmd/juju returns errors from lxd about undefined functions
<sinzui> natefinch: You can compile on the actual host if you like. That is what we do
<natefinch> sinzui: I guess... cross compile *should* work and lets me edit in my local environment... but I guess I can copy my code up
<sinzui> natefinch: We cross compile windows. In the case of all builds, we use the release tarfile. The script that makes it double checks the deps and purges undocumented packages.
<sinzui> natefinch: The installed lxd packages can differ between archs in ubuntu.
<cherylj> sinzui: Okay, I actually got a restore to work.  Does the test kill the controller?  or use destroy-controller?
<natefinch> sinzui: yes, not the code, though... and I'm getting a compile error
<sinzui> cherylj: kill-controller.
<cherylj> thanks.
<cherylj> btw - the output makes it look like it's a status command that's failing:
<cherylj> ERROR:root:Command '('juju', '--show-log', 'show-status', '-m', 'functional-backup-restore', '--format', 'yaml')' returned non-zero exit status 1
<cherylj> it's just confusing for me
<cherylj> but anyway
<natefinch> sinzui: oh, it uses cgo, that's probably the problem
<sinzui> natefinch: I don't think arm64 golang-1.6 is using cgo. only the osx is using cgo to my knowledge
<natefinch> sinzui: no no, sorry, not being clear. The LXD code uses cgo, which complicates cross compilation
<sinzui> ah
<sinzui> yeah it does, natefinch. We had to set up a dedicated OS X builder because it does need cgo to link to the native crypto libs
<sinzui> natefinch: this long log shows the last build of arm64 for master http://reports.vapour.ws/releases/3881/job/build-binary-xenial-arm64/attempt/424
<natefinch> looks like the reason that you can cross compile windows is because the cgo stuff is all linux only.... what a PITA.
<natefinch> ..well, duh, of course the lxd stuff isn't compiled in Windows :)
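(A sketch of why the Windows cross-compile dodges the problem: when cgo-using LXD code sits behind a linux-only build constraint, a GOOS=windows build never compiles it, while a linux GOARCH=arm64 build still needs a working arm64 cgo toolchain. File, package, and function names here are hypothetical, not juju's actual layout:)

    // +build linux

    // Hypothetical illustration only: a file guarded like this is skipped
    // entirely when cross-compiling for GOOS=windows, but a GOARCH=arm64
    // linux build still has to run cgo for it.
    package lxdutil

    /*
    #include <stdlib.h>
    */
    import "C"

    import "unsafe"

    // RoundTrip pushes a Go string through C memory, purely to force a
    // cgo dependency for the example.
    func RoundTrip(s string) string {
        cs := C.CString(s)
        defer C.free(unsafe.Pointer(cs))
        return C.GoString(cs)
    }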
<natefinch> sinzui: the arm64 machine can't access github, can it? :/
<sinzui> natefinch: I just sent you an email with the ssh rules I use. The machine is on Canonical's network. It cannot see much
<natefinch> sinzui: yeah, I got the ssh config stuff from cherylj last night.  I guess tgz it is
<perrito666> cherylj: I got it fixed, ill make a pr, this goes against master?
<natefinch> What happens when I targz a brand new gopath with just juju in it: -rw-rw-r-- 1 nate nate 216M Apr 12 15:44 src.tar.gz
<natefinch> oh well, ship it.  Take longer to fix it than just push it up.  Yay for a decent upload speed.
<natefinch> 4.8 MB/s... I'll take it
<alexisb> natefinch, I was wondering if you had a minute to repay the favor from earlier :)
<alexisb> I am stuck on a test update that I am sure is a simple "how go works" type q
<bogdanteleaga> can I actually forcibly kill a controller using current master?
<bogdanteleaga> "kill-controller" seems to be stuck waiting
<alexisb> bogdanteleaga, you should be able to with kill-controller
<alexisb> if it is not working it is a bug
<bogdanteleaga> seems to be very happily stuck on "Waiting on 1 model, 2 machines, 3 services"
<bogdanteleaga> alexisb, but I might be able to help with the how go works thing :p
<redir> sinzui: got a second?
<sinzui> I do
<cherylj> perrito666: yes, against master
<cherylj> bogdanteleaga, alexisb if the model is not in a good state, kill controller can "hang"
<natefinch> alexisb: sorry, yes, I can help
<cherylj> bogdanteleaga: see bug 1566426
<mup> Bug #1566426: kill-controller should always work to bring down a controller <juju-release-support> <kill-controller> <juju-core:Triaged> <https://launchpad.net/bugs/1566426>
<bogdanteleaga> cherylj, yeah I turned off the controller, issued it again and it went straight to the provider
<cherylj> there's a "workaround" in there
<cherylj> yeah, that's the workaround :)
<alexisb> natefinch, back to the 1x1 hangout
<cherylj> while I have you here, bogdanteleaga, is bug 1516668 addressed by your action changes?
<mup> Bug #1516668: Switch juju-run to an API model (like actions) rather than SSH. <2.0> <2.0-count> <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1516668>
<bogdanteleaga> cherylj, yup
<cherylj> bogdanteleaga: and that landed, right?
<bogdanteleaga> cherylj, correct
<cherylj> yay, fix committed it is, then!
<bogdanteleaga> I think sometime last week
<redir> tx sinzui
<cherylj> bogdanteleaga: also bug 1470820 - now that we're at go 1.6, should this be done?
<mup> Bug #1470820: Remove github.com/gabriel-samfira/sys/windows once go 1.4 lands <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1470820>
<bogdanteleaga> cherylj, https://bugs.launchpad.net/juju-core/+bug/1426729
<mup> Bug #1426729: juju-run does not work on windows hosts <juju-agent> <run> <ssh> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1426729>
<bogdanteleaga> this too probably
<bogdanteleaga> cherylj, yeah I've talked with curtis about that one last week before the CI switch
<bogdanteleaga> however I'm still unsure
<bogdanteleaga> since the tests on windows get run using 1.2
<cherylj> bogdanteleaga: maybe something to look at for 2.1 then?
<mup> Bug #1545116 changed: When I run "juju resources <service>" after a service is destroyed, resources are still listed. <2.0-count> <juju-release-support> <resources> <juju-core:Fix Released by cox-katherine-e> <https://launchpad.net/bugs/1545116>
<bogdanteleaga> cherylj, I was about to say it shouldn't be that hard to get the tests passing on 1.6 until I saw the last email with the job
<bogdanteleaga> cherylj, we might have to push it further back I guess
<bogdanteleaga> any idea what's up with all the "no tools" test errors?
<cherylj> bogdanteleaga: do you have a job link you could send?
<cherylj> mgz: you still around?
<mgz> cherylj: yo
<bogdanteleaga> cherylj, http://reports.vapour.ws/releases/3881/job/run-unit-tests-centos7-amd64-go1_6/attempt/1
<bogdanteleaga> sorry
<bogdanteleaga> http://reports.vapour.ws/releases/3881/job/run-unit-tests-win2012-amd64-go1_6/attempt/1
<bogdanteleaga> this one
<cherylj> hey mgz could you help me figure out the juju commands that are run as part of the functional-backup-restore test?
<cherylj> I can't recreate using what I *think* is going on, and the job output is unhelpful
<cherylj> bogdanteleaga: let me take a look
<cherylj> bogdanteleaga: I *think* there is one place in the test suite I could change to fix a lot of those problems
<mgz> cherylj: sure, also refer to assess_recovery.py for the details
<perrito666> how can git not be able to fix a conflict where one commit has nothing and the other has something there....
<perrito666> cherylj: did anyone just land anything in master?
<mgz> cherylj: if you want, we can also rerun a CI job with --verbose for the explicit
<cherylj> mgz: would you be able to do that for this job?  It would be most helpful to see the output of the reboostrap
<bogdanteleaga> cherylj, sounds good, I don't understand how changing the go version can give that kind of error
<perrito666> cherylj: well a change from wallyworld_ has just landed that broke my patch and pseudo fixed the issue
<cherylj> hmm
<mgz> cherylj: backup-restore exactly, not one of the other variants?
<mup> Bug # changed: 1175580, 1235529, 1276403, 1279879, 1280949
<cherylj> mgz: yeah functional-backup-restore
<mgz> building
<perrito666> I really need a punching bag in my office
<cherylj> sounds like an idea for the next team sprint, perrito666
<cherylj> instead of tshirts
<cherylj> here's a punching bag!  (complete with juju logo)
<perrito666> cherylj: oh no need, in the sprint I can use wallyworld_  :p
<cherylj> lol
<wallyworld_> perrito666: wot you talking about?
<perrito666> wallyworld_: GO TO SLEEEEEEEEP
<TheMue> hehe
<perrito666> oh it's 6:30, tis ok
<wallyworld_> perrito666: i just woke up
<perrito666> wallyworld_: go breakfast?
<TheMue> punching bags w/o sand, otherwise it's hard to take as hand luggage on the plane
<perrito666> anyway we just clashed on a fix
<wallyworld_> perrito666: getMongoDumpPath still needs to be fixed
<TheMue> perrito666: I'll go to bed instead of wallyworld_. here it is almost 11pm now, so time is getting closer.
<cherylj> wallyworld_: no sts call today, btw
<wallyworld_> cherylj: yeah, saw, ty
<wallyworld_> i can actually have breakfast :-)
<perrito666> this inability to actually finish destroying controllers is beginning to get on my nerves
<perrito666> finally
<mgz> oh, what the pants. assess_recovery.py is one of our few jobs that doesn't use common args yet
<mup> Bug # changed: 1158187, 1280953, 1289619, 1374906
<mgz> cherylj: really rebuilding this time
<cherylj> I'm watching it now, mgz  :)
<mgz> hm, I want to make our wait loops nicer with --verbose
<natefinch> gah, I can't tell if I've fixed this bug, because --upload-tools hides it
<cherylj> yeah, what a pain  :(
<natefinch> cherylj: I gotta run to make dinner for the kids.  won't be back for a few hours until after they're in bed.
<cherylj> natefinch: can you push your changes somewhere?  maybe we could make a branch and test?
<natefinch> cherylj: here's a PR..I am honestly not super confident in the fix, since I was kind of running blind... and furthermore the tests in that package pass both before and after I made my change, which means they're not actually testing that
<natefinch> cherylj: https://github.com/juju/juju/pull/5116
<cherylj> :(
<cherylj> thanks, natefinch, we'll see what we can do
<natefinch> cherylj: I'll be back on likely in 3.5 hours.
<mgz> cherylj: run finished, 'INFO juju --show-log' search should get you all the commands
<mup> Bug #1554863 changed: juju bootstrap does not error on unknown or incorrect config values <2.0-count> <juju-release-support> <juju-core:Fix Released> <https://launchpad.net/bugs/1554863>
<cherylj> thanks mgz
<perrito666> cherylj: http://reviews.vapour.ws/r/4552/
<cmars> does lxd placement work? as in, should i be able to juju deploy xyz --to lxd:<machine-number> ?
<katco> cherylj: i need a new bug to work on. all of them seem to require a lot of context... any suggestions on what to pick up?
<cherylj> let me look
<cmars> does lxd placement work? as in, should i be able to juju deploy xyz --to lxd:<machine-number> ?
<cherylj> katco:  you can review perrito666's PR while I do that?  ^^ :)
<cmars> sorry, wrong window
<katco> cherylj: sure
<katco> perrito666: if you can review mine :) http://reviews.vapour.ws/r/4551/
<cmars> was up-arrow,enter-ing in a term
<perrito666> katco: just in case, check it in github too, I am not sure how well rb takes amends
<perrito666> katco: sure
<katco> perrito666: where's the test for this?
<perrito666> katco: mm, you are right, that did not break a test, lemme check that again
<perrito666> katco: ship it, but, I am curious, why this change?
<perrito666> this is going to make development testing incredibly hard
<katco> perrito666: just going on what the bug said. "the decision has been made"
<katco> perrito666: i was not part of that conversation
<perrito666> oh, ok, well It is time to resurrect my fake streams builder it seems
<perrito666> well of course I did not break any tests... there aren't tests for that, well, let's fix that
<katco> perrito666: :)
<perrito666> aaand of course, external tests
<katco> perrito666: what do you mean external tests?
<perrito666> package_test tests
<katco> perrito666: i think that's devs discretion and i actively avoid doing that
<katco> perrito666: because it just causes boilerplate churn
<perrito666> I believe Ill do regular unit tests
 * katco cheers
<perrito666> I am on your side, I was protesting that the existing ones are externals
<perrito666> I can't wait for this semester's discussion about internal vs external tests
<katco> perrito666: lol
 * perrito666 has it pretty much like one of the sprint events
 * redir is in a maze of twisty little passages, all alike
<katco> redir: please beware of the grue, we've only begun to get to know you.
<redir> :)
<perrito666> oh dont worry, if you find it just throw the status tests to it, that should keep it occupied a good half an hour
<mup> Bug #1565089 changed: create-model does not use the same config format as bootstrap <jujuqa> <juju-core:Fix Released> <https://launchpad.net/bugs/1565089>
<mup> Bug #1566303 changed: uniterV0Suite.TearDownTest: The handle is invalid <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Invalid> <juju-core 1.25:Fix Released by dave-cheney> <https://launchpad.net/bugs/1566303>
<mup> Bug #1339931 changed: Status panicks during juju-upgrade <panic> <status> <upgrade-juju> <juju-core:Fix Released> <https://launchpad.net/bugs/1339931>
<wallyworld_> perrito666: quick chat?
<perrito666> wallyworld_: sure
<perrito666> where?
<wallyworld_> standup
<bogdanteleaga> this is impressive http://classicprogrammerpaintings.tumblr.com/
<perrito666> wallyworld_: frozen
<perrito666> wallyworld_: cannot hear you, you are frozen
<perrito666> wallyworld_: you left me speaking alone
<anastasiamac> perrito666: wow.. i think something exciting just happened on our side... i was kicked out from quassel at least... maybe ian experiences fun too..
<wallyworld> perrito666: sorry, chrome ate all my memory :-(
<mup> Bug #1567690 changed: Can't push charm to my new LP home <juju-core:Invalid> <https://launchpad.net/bugs/1567690>
<thumper> ugh... struct equality again...
<thumper> what's valid?
<davecheney> struct equality ?
<thumper> nm
<thumper> interestingly...
<thumper> if args == StructType{} {
<thumper> return nil
<thumper> doesn't work
<thumper> but
<thumper> var empty StructType
<thumper> if args == empty {
<thumper> does
<thumper> hit this before, and no idea why Go doesn't like it
<bogdanteleaga> thumper, have you tried args == (StructType{})?
<thumper> no
<thumper> but I find that less readable
<thumper> so would probably go with empty var
<anastasiamac> wallyworld: ? :/
<bogdanteleaga> it breaks symmetry though :P
<davecheney> thumper: http://play.golang.org/p/C16rPMEAlO
<davecheney> it's a parsing ambiguity because the parser cannot tell where the struct literal ends and the block begins
<davecheney> ironically it can with this even more verbose version
<davecheney> http://play.golang.org/p/R_ui2oTlma
<davecheney> but, what you're trying to do smells bad
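(A runnable distillation of the ambiguity, same shape as davecheney's playground links: the parser takes the "{" after the composite literal's type as the start of the if body unless the literal is parenthesized:)

    package main

    import "fmt"

    type StructType struct {
        A int
        B string
    }

    func main() {
        var args StructType

        // Does not compile: the parser reads the "{" after StructType as
        // the start of the if body, not as a composite literal:
        //   if args == StructType{} { ... }

        // Parenthesizing the literal removes the ambiguity:
        if args == (StructType{}) {
            fmt.Println("args is the zero value")
        }

        // As does comparing against a declared zero value:
        var empty StructType
        if args == empty {
            fmt.Println("args is still the zero value")
        }
    }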
<wallyworld> katco: around?
<mwhudson> can i get some juju usage help?
<wallyworld> cherylj: katco: i think we got the PR for bug1567170 backwards
<mwhudson> i'm trying to test the juju-mongo-tools3.2 package i made
<mwhudson> so i need to try to make a backup
<mwhudson> i have an controller bootstrapped in ec2
<wallyworld> mwhudson: bootstrap with mongo 3.2 is broken at the moment
<mwhudson> but now i get
<mwhudson> (master *)mwhudson@aeglos:juju-mongo-tools3.2$ juju backups create
<mwhudson> ERROR backups are not supported for hosted models
<mwhudson> wallyworld: i merged perrito666's PR
<wallyworld> mwhudson: juju create-backup -m admin
<wallyworld> or first switch to admin
<wallyworld> juju switch admin
<wallyworld> when you bootstrap, you are switched to the hosted model
<mwhudson> ERROR while preparing for DB dump: mongodump not available: failed to get mongod path: exec: "mongod": executable file not found in $PATH
<mwhudson> win, i think
<mwhudson> wallyworld: now how do i log into the controller node?
<cherylj> wallyworld, katco I think you're right
<wallyworld> mwhudson: yeah, that's a bug i told horatio i found yesterday when doing a code read
<mwhudson> oh juju ssh 0
<mwhudson> oh right
<wallyworld> mwhudson: the mongodump path needs to be fixed
<wallyworld> i'll do a fix today
<mwhudson> wallyworld: well mongodump is not even installed
<wallyworld> mwhudson: that's because the mongotools package is not installed
<wallyworld> juju should depend on it
<mwhudson> wallyworld: because i haven't uploaded it yet :-)
<wallyworld> right :-)
<mwhudson> so i was going to install the package from the ppa
<wallyworld> mwhudson: but even when it is uploaded, juju will look in the mongo2.4 path :-(
<mwhudson> but you're saying that even that won't work, because the path is wrong?
<mwhudson> excellent
<wallyworld> i will do a fix this morning
<wallyworld> i only just saw it yesterday doing a code read by accident
<wallyworld> the backup code is not something i am 10000% familiar with
#juju-dev 2016-04-13
<wallyworld> cherylj: so, we'll need to get that fix for bug 1567170 fixed
<mup> Bug #1567170: Disallow upgrading with --upload-tools for hosted models <upgrade-juju> <juju-core:In Progress by cox-katherine-e> <https://launchpad.net/bugs/1567170>
<cherylj> yes, it landed in next, so we have some time
<wallyworld> ah phew ok
<mwhudson> wallyworld: so juju is looking for mongodump in /usr/lib/juju/bin/mongodump?
<wallyworld> mwhudson: from memory, yeah
<mwhudson> ERROR while creating backup archive: while dumping juju state database: error dumping databases: error executing "/usr/lib/juju/bin/mongodump": 2016-04-13T00:02:21.072+0000	error parsing command line options: --dbpath and related flags are not supported in 3.0 tools.; See http://dochub.mongodb.org/core/tools-dbpath-deprecated for more information; 2016-04-13T00:02:21.072+0000	try 'mongodump --help' for more information;
<mwhudson> sounds like juju needs moar fixing
<mwhudson> but well no reason not to upload my package
<wallyworld> mwhudson: yeah, juju is currently hard coded to look in the mongo 2.4 binary path for mongodump
<perrito666> wallyworld: I fixed my patch, can you look at it so I can merge it after dinner?
<wallyworld> perrito666: will do
<wallyworld> mwhudson: it should be a quick fix to sort out that mongodump path, i'll do that today
<mwhudson> wallyworld: that was after i copied the 3.2 tools to /usr/lib/juju/bin/
<mwhudson> wallyworld: looks like the command line needs to change too
<wallyworld> oh right, yes, i didn't read the error
<wallyworld> damn it
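(The shape of the extra fix being pointed at, as a rough sketch only — a hypothetical helper, not juju's backup code: 2.4-era mongodump could read the data directory via --dbpath, but the 3.0+ tools reject that flag, as in the error above, so the dump has to go through the running mongod instead:)

    package backups

    import (
        "fmt"
        "os/exec"
    )

    // dumpCmd builds the mongodump invocation. Hypothetical helper for
    // illustration only.
    func dumpCmd(mongodump, dbDir, host string, port int, outDir string) *exec.Cmd {
        if host == "" {
            // Legacy (mongodump 2.4) style: read the data files directly.
            return exec.Command(mongodump, "--dbpath", dbDir, "--out", outDir)
        }
        // 3.0+ style: dump over the wire from the running server.
        return exec.Command(mongodump,
            "--host", host,
            "--port", fmt.Sprint(port),
            "--out", outDir)
    }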
<mwhudson> BTW MIR for this is going to be a doozy
<wallyworld> yes
<mwhudson> but that's not today's problem either
<wallyworld> mwhudson: so we have today to get a blessed beta4 for inclusion in xenial main
<wallyworld> is my understanding
<mwhudson> wallyworld: well i guess Someone (tm) should get to work filing MIRs for the deps
<wallyworld> mwhudson: yeah, i must admit i have nfi about the process with all that
<wallyworld> it is being driven by others
<wallyworld> mwhudson: but i thought we had permission to put mongo packages needed for juju into main
<wallyworld> along with juju itself
<wallyworld> isn't that already approved?
<mwhudson> wallyworld: oh maybe
<wallyworld> i thought so but am out of the loop a little on all that
<menn0> wallyworld, axw: I'm thinking about picking up bug 1456916. this also relates to these old tickets: bug 892552 and 802117
<mup> Bug #892552: juju does not extract system ssh fingerprints <docs> <feature> <ssh> <pyjuju:Triaged> <juju-core:Triaged> <https://launchpad.net/bugs/892552>
<wallyworld> menn0: it will be a fair bit of work
<menn0> wallyworld, axw: to me, the easiest win seems to be to use per-model known hosts files for juju ssh and juju scp (instead of /dev/null)
<perrito666> wallyworld: why would we embed base suite?
<wallyworld> perrito666: because that suppresses logging output to console and other things
<perrito666> k
<menn0> wallyworld, axw: that doesn't help with the "first connect" problem but does prevent MITM after the first connect while avoiding the scary warnings when addresses are reused between models
<wallyworld> menn0: seems ok to me - so long as we can extract the fingerprints from cloud init, but i am not an expert
<menn0> I guess there's still the problem of killing a machine and recreating it in the same model which could lead to the same address being reused
<wallyworld> that is the issue
<wallyworld> one of
<menn0> wallyworld: yeah, one of
<menn0> wallyworld: ok so the ultimate solution would be for juju to extract the ssh key for each new machine from cloud-init and store it... and then have juju ssh and juju scp get told the key before connecting so it can be written into the client's known_hosts
<menn0> wallyworld: something like that?
<axw> menn0: yep
<wallyworld> menn0: +1
<wallyworld> menn0: i assume we pass a bespoke known hosts file to the client each time
<wallyworld> mwhudson: are we putting juju-mongo3.2 into trusty?
<wallyworld> cherylj: i think you did a fix so that if mongo 3.2 were not found, it would not retry? i'll need to retest, but it appeared not to work when i bootstrapped a trusty controller. took ages to bootstrap and cloud init was filled with mongo3.2 retries
<wallyworld> i may be able to get a fix done today if i can sort out some other issues
<axw> wallyworld: we don't pass any known_hosts, we pass /dev/null. that's part of the problem
<axw> I railed against that at the time it was done, but the change went in anyway
<wallyworld> sigh
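(A sketch of the per-model alternative menn0 floated above — hypothetical helper and path layout; the UUID is just the one from the earlier bootstrap log. The idea: scope the known_hosts file to the model instead of discarding host keys via /dev/null:)

    package main

    import (
        "fmt"
        "os"
        "path/filepath"
    )

    // sshArgs points OpenSSH at a known_hosts file scoped to the model,
    // so a host key change within a model is still caught. Hypothetical
    // helper for illustration only.
    func sshArgs(modelUUID, target string) []string {
        knownHosts := filepath.Join(os.Getenv("HOME"),
            ".local", "share", "juju", "ssh", modelUUID+".known_hosts")
        return []string{
            "-o", "UserKnownHostsFile=" + knownHosts,
            "-o", "StrictHostKeyChecking=ask",
            target,
        }
    }

    func main() {
        fmt.Println(sshArgs("0a58e9ef-099f-4cf8-8a48-2772cf8b5c05", "ubuntu@10.0.3.164"))
    }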
<wallyworld> axw: awesome, i removed juju-mongodb (only have mongo 3.2 installed) and tests fail \o/
<axw> wallyworld: :/
<axw> wallyworld: I'm guessing testing/mgo.go needs updating to look in the right spot
<wallyworld> yep
<wallyworld> i might worry about that later
<wallyworld> perrito666: btw, removing ensureAdminUser is fine; need to double check restore, but bootstrap etc all works as expected
<perrito666> sweeeet, this is a great week
<redir> see you tomorrow juju-dev
<cherylj> wallyworld: yeah, I did. The fix was in juju/utils, so you'll need to make sure your deps are up to date to pull it in
<wallyworld> ok, i'll double check
<wallyworld> cherylj: ah, i changed my scripts (long story) and my godeps one had an issue, utils just got updated, so i'll retry
<cherylj> ok, phew
<cherylj> :)
<thumper> menn0: if you get a minute https://github.com/juju/gomaasapi/pull/33
<wallyworld> mwhudson: when you upload the mongo 3.2 tools, the package will get installed automatically with mongo 3.2 db right?
 * thumper needs food
 * thumper goes to hunt in the kitchen
<wallyworld> mwhudson: i'm testing a fix for that mongodump issue (the path and the args) and have no tools available - can you provide me with the deb i can install manually?
<mup> Bug #1569632 opened: status: decide on statuses for migration <juju-core:Triaged> <https://launchpad.net/bugs/1569632>
<mup> Bug #1569632 changed: status: decide on statuses for migration <juju-core:Triaged> <https://launchpad.net/bugs/1569632>
<mwhudson> wallyworld: it's in ppa:juju/experimental
<wallyworld> ta
<mwhudson> wallyworld: juju-mongo-tools32
<mwhudson> wallyworld: juju-mongo-tools3.2
<mwhudson> rather
<wallyworld> mwhudson: yay, my fix worked, backups good now
<mwhudson> wallyworld: nice
<wallyworld> mwhudson: but we'll need that tools deb in the repos etc before beta4 goes out
<mwhudson> wallyworld: slangasek said he'd look at it today
<wallyworld> awesome
<mup> Bug #1569632 opened: status: decide on statuses for migration <juju-core:Triaged> <https://launchpad.net/bugs/1569632>
<alexisb> wallyworld, menn0, axw is one of you available to look at: https://bugs.launchpad.net/juju-core/+bug/1569467
<mup> Bug #1569467: backup-restore loses the hosted model <backup-restore> <blocker> <ci> <regression> <juju-core:Triaged by cherylj> <https://launchpad.net/bugs/1569467>
<cherylj> I suspect that it has come from the made-model-workers branch
<axw> alexisb: this is a known deficiency of backup/restore, meant to be fixed in 2.1 or later IIRC
<cherylj> but it's just a theory at this point
<cherylj> I don't think it is
<cherylj> I can do this test manually and it works
<cherylj> It's just failing in CI
<cherylj> for some reason
<axw> cherylj: oh? I was under the impression that backup/restore didn't work with hosted models
<cherylj> axw: the backup is being created from the admin model
<alexisb> axw, BR should work on the admin model
<axw> ok
<cherylj> the problem that's happening in CI is that the dummy hosted model that is created to rebootstrap with is not removed
<cherylj> and the existing models don't seem to get "hooked up"
<cherylj> I'm wondering if the restore is actually dying in the controller
<wallyworld> i'm working on some other backup fixes for mongo 3.2 (with tools), can look a bit later
<cherylj> and it doesn't complete
<cherylj> I think we need another run with the environment kept after the failure
<alexisb> thank you wallyworld we leave it in your capable hands
<cherylj> I'll update the bug with where I am with things
<cmars> anyone seen messages like this when a unit gets a hook error? https://paste.ubuntu.com/15803425/
<cmars> juju resolved --retry doesn't seem to get picked up unless i reboot the machine that it's happening on
<cmars> should i open a bug?
<natefinch> cmars: probably
<cmars> ok
<cmars> d'oh, i already did. ok, attached a full log to LP:#1566130
<mup> Bug #1566130: awaiting error resolution for "install" hook <juju-core:Triaged> <https://launchpad.net/bugs/1566130>
<natefinch> wallyworld, anastasiamac: I have a question about the imagemetadata api endpoint.  It has code to default the arch to amd64: https://github.com/juju/juju/blob/master/apiserver/imagemetadata/metadata.go#L184   ...when would that be triggered? Why would we ever get image metadata that doesn't specify an arch?  Isn't the arch like one of the most important pieces of information?
<anastasiamac> natefinch: hopefully never...
<anastasiamac> (as in hopefully nver triggered)
<natefinch> anastasiamac: the reason I ask is because of bug #1564791
<mup> Bug #1564791: 2.0-beta3: LXD provider, jujud architecture mismatch <blocker> <lxd> <juju-core:Triaged by natefinch> <https://launchpad.net/bugs/1564791>
<anastasiamac> natefinch: so when u reproduce the bug, u fall into this code?
<natefinch> anastasiamac: when using LXD, machines after the bootstrap machine on non-amd64 hosts are downloading amd64 juju binaries for some reason, which obviously don't work
<natefinch> anastasiamac: it's easy to reproduce the bug, it's very hard to add debugging changes, because it only happens if you *don't* use upload-tools.  I'm actually not sure how to test this with my own binary
<anastasiamac> natefinch: I *think* u will only have no arch by this stage if ur image metadata does not have arch... which would b weird
<wallyworld> natefinch: i think some early simplestream metadata may have omitted the arch, not sure now. but nowadays, it should never be ""
<natefinch> wallyworld: ok, this may be a red herring then.
<wallyworld> natefinch: well, depends on the simplestreams metadata
<wallyworld> if they generate metadata without an explicit arch, then boom
<natefinch> actually, looks like they're explicitly requesting the wrong tools, now that I re-read the logs in the bug
<natefinch> Attempt 1 to download tools from https://10.0.3.164:17070/tools/2.0-beta3-xenial-amd64
<natefinch> so, not really the fault of the streams that it gave them what they were asking for
<natefinch> I'm having a heck of a time trying to figure out how we determine the arch that eventually goes into that download URL, though
<wallyworld> natefinch: it should be the arch of the host on which the tools are to be run
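(The invariant wallyworld is stating, reduced to a tiny sketch — hypothetical helper; the URL shape is copied from the paste above. The arch segment must describe the machine that will run the binary, which is why defaulting it to amd64 anywhere is exactly this bug:)

    package main

    import (
        "fmt"
        "runtime"
    )

    // toolsURL assembles a tools download URL of the shape seen in the
    // bug report. Hypothetical helper for illustration only.
    func toolsURL(base, version, series, arch string) string {
        return fmt.Sprintf("%s/tools/%s-%s-%s", base, version, series, arch)
    }

    func main() {
        // On the machine itself, runtime.GOARCH is the safe source of truth.
        fmt.Println(toolsURL("https://10.0.3.164:17070", "2.0-beta3", "xenial", runtime.GOARCH))
    }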
<wallyworld> menn0: can i bother you for a 99% red one? http://reviews.vapour.ws/r/4562/
<menn0> wallyworld: looking
<menn0> wallyworld: best. change. ever.
<wallyworld> yay :-)
<natefinch> wallyworld: I know the arch should be the arch of the host on which it is run, but in this case, the bug is that it's not :)
<wallyworld> i need to read the bug
<natefinch> wallyworld: basically just that non-amd64 LXD environments are downloading amd64 tools for some reason
<mup> Bug #1554677 changed: Provider help topics need to be updated for 2.0 <2.0-count> <docteam> <juju-release-support> <juju-core:Invalid> <https://launchpad.net/bugs/1554677>
<mup> Bug #1569652 opened: help text for juju grant needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1569652>
<mup> Bug #1569654 opened: help text for juju revoke needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1569654>
<wallyworld> were they bootstrapped on amd64
<wallyworld> with --upload-tools
<natefinch> wallyworld: no. it's been seen on arm64 and s390x
<natefinch> wallyworld: upload-tools actually fixes the problem
<wallyworld> hmmm, not sure then, i'd have to debug
<wallyworld> you'd need to trace the simplestreams request and response
<natefinch> wallyworld: do you have tips on how to debug? Since upload-tools avoids the bug, I don't really know how to get my own jujud into the system, but still have it look to simplestreams to get tools for further machinoes
<wallyworld> i'd just hack the code somewhere, eg don't store the uploaded tools in state to force a simplestreams check or something
<wallyworld> not sure off hand, i'd need to read the code
<natefinch> wallyworld: ok, np
<mup> Bug #1567104 changed: Unable to connect to API <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1567104>
<terje> hi, I'm trying to understand when config.yaml is parsed when deploying a charm.
<terje> I'd like to use some values stored in config.yaml when my install, start and stop hooks are executed.
<terje> Do I have the right idea/
<terje> ?
<natefinch> terje: yep.... anything the user sets in the charm's config will be available to the charm code by having the charm call the config-get command, which will return any configuration set on the charm (defaulting to the values in config.yaml)
<natefinch> terje: https://jujucharms.com/docs/stable/authors-hook-environment#config-get
<terje> ok, great.
<cherylj> wallyworld: looks like perrito666's fix for the mongo path has some genuine test failures:  http://juju-ci.vapour.ws:8080/job/github-merge-juju/7344/console
<wallyworld> cherylj: already on it :-)
<cherylj> thanks :)
<wallyworld> a couple of missing "."
<terje> natefinch: is config-get a shell function, sourced prior to executing my hooks?
<terje> (I don't see a shell command on my system)
<natefinch> terje: yep, there's a bunch of shell commands that get installed with a charm.  They're called "hook tools" in juju terminology: https://jujucharms.com/docs/stable/authors-hook-environment#hook-tools
<natefinch> terje: they are in the path when a hook is executing
<terje> awesome, thank you.
<terje> ok
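(Since hook tools are plain executables on $PATH while a hook runs, a hook can be written in any language. A minimal Go hook calling config-get, where the "port" option is a hypothetical entry in the charm's config.yaml:)

    package main

    import (
        "encoding/json"
        "fmt"
        "log"
        "os/exec"
    )

    func main() {
        // config-get is on $PATH only during hook execution.
        out, err := exec.Command("config-get", "--format=json", "port").Output()
        if err != nil {
            log.Fatalf("config-get failed: %v", err)
        }
        var port float64
        if err := json.Unmarshal(out, &port); err != nil {
            log.Fatalf("unexpected config-get output: %v", err)
        }
        fmt.Printf("configured port: %v\n", port)
    }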
<wallyworld> axw: https://bugs.launchpad.net/bugs/1551779 is fixed right?
<mup> Bug #1551779: New azure provider ignores agent mirrors <2.0-count> <azure-provider> <jujuqa> <regression> <streams> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1551779>
<axw> wallyworld: no, didn't make the cut: https://github.com/juju/juju/pull/5068
<wallyworld> ok, np
<axw> it's ready to go whenever
<wallyworld> just saw the bug get changed, i thought it was fix committed
<wallyworld> axw: i have to go afk for 45 minutes or so, if you see any landing failures, could you please retry?
<axw> wallyworld: okey dokey
<wallyworld> ta
<mup> Bug #1568177 changed: configFunctionalSuite.TestUsingTCPRemote lxdclient no addresses match <ci> <lxd> <regression> <test-failure> <unit-tests> <juju-core:Invalid by sinzui> <https://launchpad.net/bugs/1568177>
<thumper> sweet
<thumper> well that was out of context...
<thumper> it was for this: <wallyworld> mwhudson: yay, my fix worked, backups good now
<mwhudson> :)
<thumper> irc window was scrolled
<natefinch> man, juju debug-log is sloooooooow
<menn0> wallyworld: http://reviews.vapour.ws/r/4563/ reviewed
<wallyworld> menn0: ty
<wallyworld> menn0: the previous PR was actually lgtm, i just fixed the tests, i'll see if i can pull the job
<wallyworld> or i can tweak those method names in a driveby next time
<menn0> wallyworld: it's really not that important
<menn0> wallyworld: don't pull the current job
<wallyworld> menn0: ok, i'll fix real soon in another branch
<menn0> wallyworld, axw: regarding SSH host key management... looks like there's 2 ways to tackle it
<menn0> you can have cloud-init post them to a URL
<menn0> or you can pre-generate them and have cloud-init install them for you
<menn0> I prefer the latter as it's simpler, but it does mean a little extra work for the server to do for every new machine
<wallyworld> menn0: we did explore having a listener on the cloud init url at one point (for provisioning status and errors), but it is more moving parts
<wallyworld> what's the cpu cost of generating a key?
<menn0> wallyworld: pre-generating them server side does seem better
<menn0> wallyworld: I will check
<axw> menn0: I don't think we can do option 1 without a lot of added complexity
<axw> as in, where do we post to
<axw> menn0: my intention was to generate client-side for bootstrap, then have the server generate a new key and publish it into state
<axw> menn0: then we'd query state for public keys when we want to SSH
<axw> no prompting needed for verifying fingerprints
<menn0> axw: I don't like the complexity of #1 either. we'd need a endpoint on the API server.
<menn0> axw: I think we're thinking the same thing.
<menn0> axw: what do you mean generate the key client side and then have the server generate a new key though?
<menn0> axw: what i'm thinking is that when a new machine is to be created, the host keys are generated on the server and inserted into state, and also passed to cloud-init
<menn0> when you pass the keys to cloud-init, ssh keys aren't generated on the machine and the ones that were passed are used instead
<menn0> axw: http://cloudinit.readthedocs.org/en/latest/topics/examples.html#configure-instances-ssh-keys
<menn0> axw: "ssh_keys"
<axw> menn0: sec, trying to find something. there's something about cloud-config not being completely secure. I think it might just be because anything on the machine can query the metadata
<menn0> axw: ok, that's worth following up
<axw> menn0: eh can't find it, I'm sure there's a bug about it tho. but basically on AWS and others, you can get the cloud-config metadata by GETting a statically defined URL
<axw> menn0: no ACLs other than being on the machine
<menn0> axw: ok that would be bad
<axw> so if you wanted to run something non-privileged on the machine, it could easily become privileged
<axw> menn0: so, what I was thinking is that we'd pass through ssh_keys, and then have the machine agent generate a new one on startup
<axw> menn0: the initial ssh_keys one is really only necessary for the bootstrap machine tho
<menn0> axw, wallyworld: FWIW generating both the RSA and DSA keys at the default (recommended) bit sizes takes less than 0.2s (combined) on my machine
<wallyworld> menn0: and for dense openstack deployments with containers, there may be several spun up at once
<axw> (and containers can access the metadata URL)
<wallyworld> i guess the host controller needing to run that many containers would cope
<axw> menn0: not sure if what I said about bootstrap & ssh_keys made any sense, let me know if a hangout would help
<menn0> axw: it mostly makes sense but a hangout would be helpful
<axw> menn0: ok just eating lunch, give me 15 mins or so
<menn0> axw: np
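(A sketch of the pre-generate-server-side option, assuming golang.org/x/crypto/ssh; cloud-init's ssh_keys block takes exactly this rsa_private/rsa_public pair per the docs linked above:)

    package main

    import (
        "crypto/rand"
        "crypto/rsa"
        "crypto/x509"
        "encoding/pem"
        "fmt"
        "log"

        "golang.org/x/crypto/ssh"
    )

    // Generate an RSA host key, keeping the public half for later
    // known_hosts use and handing both halves to cloud-init's ssh_keys.
    func main() {
        key, err := rsa.GenerateKey(rand.Reader, 2048)
        if err != nil {
            log.Fatal(err)
        }

        // Private half, PEM-encoded as cloud-init expects for rsa_private.
        privPEM := pem.EncodeToMemory(&pem.Block{
            Type:  "RSA PRIVATE KEY",
            Bytes: x509.MarshalPKCS1PrivateKey(key),
        })

        // Public half in the one-line authorized_keys/known_hosts format.
        pub, err := ssh.NewPublicKey(&key.PublicKey)
        if err != nil {
            log.Fatal(err)
        }
        pubLine := ssh.MarshalAuthorizedKey(pub)

        fmt.Printf("rsa_private: |\n%s\nrsa_public: %s", privPEM, pubLine)
    }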
<axw> wallyworld: sorry, got distracted and forgot to check your merge again
<wallyworld> axw: tis ok. there was a spurious failure, i resubmitted
<axw> would be good to have the CI bot sending messages in here
<menn0> axw: +1
<axw> menn0: https://plus.google.com/hangouts/_/canonical.com/juju-ssh-keys?authuser=1
<axw> if you're free now
<menn0> axw: coming!
<axw> menn0: wrong button
<axw> brb
<menn0> LOL
<axw> what have I done
<dimitern> yay! master can bootstrap on xenial again :)
<dimitern> there's another issue today though
<dimitern> 2016-04-13 06:12:08 ERROR juju.worker.dependency engine.go:526 "proxy-config-updater" manifold worker returned unexpected error: unknown watcher
<dimitern>  id (not found)
<wallyworld> axw: this shit just keeps popping up everywhere :-( http://reviews.vapour.ws/r/4564/
<axw> wallyworld: :/  looking
<wallyworld> ta
<axw> wallyworld: LGTM
<wallyworld> ty
<dimitern> wallyworld: hey, thanks for fixing that juju-mongodb3.2 issue!
<wallyworld> dimitern: np, one more to go, landing now, affects restore
<wallyworld> was a joint effort with horatio
<dimitern> wallyworld: I've found a new proxyupdater issue on master, which I'm testing a fix for - such a fix should qualify for landing on master, but first I'll file a bug report for it
<wallyworld> ok. we need to get a bless within 6 hours or so, maybe a bit more
<mup> Bug #1569725 opened: proxyupdater api facade does not set NotifyWatcherId in the result <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1569725>
<dimitern> frobware, voidspace, dooferlad: http://reviews.vapour.ws/r/4565/ fixes the bug above, please take a look
<frobware> dimitern: looking
<frobware> dimitern: apart from the test changes is the fix just in proxyupdater.go?
<frobware> dimitern: ah, I see it.
<dooferlad> dimitern: +1 from me.
 * dooferlad will be back a bit later
<dimitern> frobware: the fix is mostly in the api/proxyupdater changes
<frobware> dooferlad: are we doing 1:1 in 10?
<frobware> dooferlad: oh, it's 9:30...
<dooferlad> frobware: indeed
 * dooferlad really is going now
<dimitern> frobware: I've observed at last what you're describing - lxd containers coming up with 1 NIC (as reported by lxc info), but inside having multiple NICs; applying the machine profile we create directly on the container manually seems to fix this (lxc info shows 8 NICs as expected)
<frobware> dimitern: it depends... the output from lxc list (with netinfo) can dribble in
<frobware> dimitern: if you can get to the same stage then don't apply the profile manually, just sit on watch lxc list for a bit
<dimitern> frobware: so not seeing all nics there is ok as long as they actually work - that's what you're saying?
<dimitern> :)
<frobware> dimitern: there are two cases: 1) we don't get anything. 2) as you've described, but I'm not 100% sure this is because the info from lxc list can dribble in over time.
<frobware> dimitern: I want to change the way we apply the profile anyway. it can be applied to the container directly (need to expose some LXD api)
<dimitern> frobware: I've seen the dribbling you're talking about
<frobware> dimitern: want to sync for 30 mins?
<dimitern> frobware: don't you have another call?
<frobware> dimitern: would be interested in any results from LXD bashing
<frobware> dimitern: yep, but in 30mins
<dimitern> frobware: I'm not there yet (heavily testing lxd deployments w/ and w/o charms)
<frobware> dimitern: sure, but (gentle nudge) let's sync anyway. :)
<dimitern> frobware: so if you don't mind I'd like to get the AC removal well tested first and finish it, so I can focus on LXD testing
<dimitern> i.e after standup will be better
<frobware> dimitern: sure
<dimitern> frobware: cheers
<frobware> jam: ping
<jam> frobware: hi
<frobware> jam: I'm guessing my PR for detectSubnet() does the wrong thing...
<jam> frobware: so it's entirely possible that we're using it incorrectly, but the intent of detectSubnet is to find a subnet that is not in use
<jam> and the highest one we found +1 is not in use, but clearly the highest one we find *is* in use.
<frobware> jam: so then my fix is not a fix...
<frobware> jam: it would read better as detectSubnetNotInUse() :-)
<jam> frobware: renaming "detectSubnet" to findUnusedSubnet would probably be a step in figuring out what is wrong
<jam> frobware: jinx?
<jam> paraphrase jinx
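(What jam describes, reduced to a sketch — hypothetical helper, not the juju code under review: the highest in-use subnet plus one is free; returning the highest one itself is the bug:)

    package main

    import "fmt"

    // findUnusedSubnet takes the third octets of the 10.0.x.0/24 subnets
    // already in use and returns the highest one plus one, which is not
    // in use (unlike the highest one itself). The 10.0.x.0/24 scheme is
    // assumed here for illustration only.
    func findUnusedSubnet(used []int) (string, error) {
        max := -1
        for _, octet := range used {
            if octet > max {
                max = octet
            }
        }
        if max+1 > 255 {
            return "", fmt.Errorf("no free 10.0.x.0/24 subnet")
        }
        return fmt.Sprintf("10.0.%d.0/24", max+1), nil
    }

    func main() {
        subnet, err := findUnusedSubnet([]int{0, 3, 10})
        if err != nil {
            panic(err)
        }
        fmt.Println(subnet) // 10.0.11.0/24
    }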
<dimitern> 2016-04-13 08:04:42 DEBUG juju.cmd.juju.commands ssh.go:180 proxy-ssh is false
<frobware> jam: OK, I'll revert and look at that this morning.
<dimitern> wtf?! since when
<dimitern> can't juju ssh 1/lxd/0 because of that
<jam> dimitern: there was a big discussion about it recently
<dimitern> only juju ssh 0 works
<dimitern> jam: and it was decided proxy-ssh is bad?
<frobware> dimitern, jam: interesting. pretty sure I was ssh'ing to 0/lxd/0 yesterday
<jam> dimitern: the main motivation was that in multi-model controllers, proxy-ssh means you have to have ssh access to the API server.
<jam> which is very bad for security
<voidspace> dimitern: frobware: is "next" unblocked yet? (as in - do lxd tests pass again, can we land branches)
<jam> If I let you create a model on my controller, that doesn't mean I want you to have SSH access to root my API server.
<voidspace> ah, I see updates on master and next
<dimitern> jam: it depends whether 0/lxd/0 has a dns-name coming from a vlan (e.g. 10.100.19.0/24), which might only be accessible via the controller
<voidspace> so probably
<jam> frobware: dimitern: well aren't X/lxd/Y supposed to be on the routable subnets anyway?
<jam> dimitern: so for people that have complex networking setups, they can do the jumps themselves, which is slightly unfortunate, or they can pass "--proxy" to juju ssh
<frobware> jam: yes, though it's nominally easier (for me) to think of them as logical machines (i.e., 0/lxd/3) than to dig (ho) out its IP addr
<jam> but the default is that you can only use it if you are an admin on the controller.
<dimitern> jam: they are, but what we deem the machine's private address for ssh still uses the legacy scoped selection + sorting
<dimitern> nice! --proxy it is then
<dimitern> yay! works
<jam> dimitern: so for people that aren't admins "--proxy" will fail (which is why we set it to default false)
<jam> but if you have access to the API server its available for you
<dimitern> jam: yeah, this makes sense, but unfortunately also means we have to sort out the private address picking now
<jam> dimitern: well, we should generally fix that anyway, right ? :)
<dimitern> jam: indeed :)
<frobware> dimitern, jam: http://reviews.vapour.ws/r/4566/
<jam> frobware: shipit
<frobware> ty
<jam> 'this change is not incorrect' is confusing
<jam> but the change itself is good
<dimitern> frobware: LGTM
<frobware> jam: :) the change is incorrect (was in my head, but clearly not my fingers).
<dimitern> wallyworld: hey, is there a way to disable the model switching to "default" from "admin" after bootstrap?
<dimitern> it's a bit frustrating to do parallel bootstrap tests on different controllers
<dimitern> axw: ^^
<wallyworld> dimitern: no, that was the intended behaviour
<axw> dimitern: not at the moment. that shouldn't stop parallel bootstrap though
<wallyworld> we don't want users in general doing things with the admin model
<frobware> wallyworld: it's an odd experience first time round though
<dimitern> wallyworld, axw: makes sense, but it's a bit inconvenient to have to pass -m local.xxx:yyy each time.. I guess needs to be scripted
<wallyworld> frobware: why is it odd? you deploy workloads to hosted models
<wallyworld> dimitern: juju switch is your friend
<dimitern> the only reason I do it is because I don't want to wait yet another 5m after bootstrap to be able to add a container to a machine in the default model
<frobware> wallyworld: as a new user I bootstrap and then try and add-machine lxd:0 and it doesn't work. I guess most people deploy workloads.
<dimitern> wallyworld: switch doesn't work well with parallel bootstraps - each does a switch at the end
<frobware> wallyworld: heh, dimitern's comment is what I try to avoid too
<wallyworld> frobware: the default hosted model is empty that is true
<wallyworld> machine 0, the old so called bootstrap machine, is not something we want users to think about
<frobware> wallyworld: and I still don't understand how I can get upgrade-juju to work in the default model
<wallyworld> the controller is an internal detail
<dimitern> I almost wish to have a --use-admin-model flag to bootstrap :) i.e. I want you to not switch to "default" as I know what I'm doing
<axw> perhaps we could special case --default-model=admin? (pretty sure that is broken atm, now that I think of it)
<axw> so if you did "juju bootstrap -d admin", then you would have no secondary model
<dimitern> axw: that sounds exactly what I'd use
<wallyworld> frobware: i think you need to upgrade the controller tools first
<wallyworld> frobware: hosted models need to use tools compatible with their host
<frobware> wallyworld: ok, let me spin up a machine in the default model and report on my upgrade step. my experience may actually be a symptom of bug #1569361 as I'm generally only trying to use LXD containers
<mup> Bug #1569361: LXD containers fail to upgrade because the bridge config changes to a different IP address <network> <juju-core:In Progress by frobware> <https://launchpad.net/bugs/1569361>
<voidspace> babbageclunk: did your storage stuff land?
<voidspace> babbageclunk: I see that it didn't yet
<voidspace> babbageclunk: I have a branch with deployment status implemented for maas 2 - trying it with your storage stuff merged in
<voidspace> ok, it fails because a default zone is specified
<babbageclunk> voidspace: No, I was out by the time they got Jenkins unblocked. Do you know, is the next branch still going?
<voidspace> babbageclunk: stuff has landed on next - so I assume we're still using it
<babbageclunk> voidspace: How do I retry a merge - is it just $$retry$$
<voidspace> thumper: should we be targetting next or master
<babbageclunk> voidspace: cool, thanks.
<voidspace> babbageclunk: $$anything$$
<babbageclunk> voidspace: really? I thought there were different commands?
<voidspace> babbageclunk: nope
<voidspace> it's a regex match
<babbageclunk> voidspace: ok, fixing merge conflicts and kicking it off again.
<babbageclunk> voidspace: I've heard mention of $$jfdi$$ though - is that a special case?
<voidspace> babbageclunk: that is special
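(A guess at the bot's trigger, for illustration only — the actual Jenkins job's pattern isn't shown here: any $$...$$ token matches, so $$retry$$ and $$anything$$ are equivalent:)

    package main

    import (
        "fmt"
        "regexp"
    )

    // Assumed shape of the merge bot's trigger: any $$...$$ token in a
    // comment matches, with $$jfdi$$ presumably special-cased elsewhere.
    var trigger = regexp.MustCompile(`\$\$.*?\$\$`)

    func main() {
        for _, comment := range []string{"$$retry$$", "$$anything$$", "no trigger here"} {
            fmt.Printf("%-20q -> %v\n", comment, trigger.MatchString(comment))
        }
    }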
<voidspace> babbageclunk: so, *something* is adding a zone of "default" to my StartInstance call - which is causing the AllocateMachine call to fail
<voidspace> this didn't happen before I don't think
<voidspace> but I haven't found where it's added yet
<voidspace> ah no - the machine was deployed, so there actually were no machines available
<voidspace> the error message was just confusing!
<babbageclunk> voidspace: ok, that was an easy merge. Just running the tests and pushing now, then I'll queue the merge again.
<voidspace> ok, now I get: "ERROR could not record instance in provider-state: cannot record state instance-id: Not Found
<voidspace>  - 4y3h7p"
<voidspace> babbageclunk: so somewhere we're being inconsistent with what we use as an id
<voidspace> however, it looks like bootstrap continues
<voidspace> babbageclunk: this is with your storage branch and my branches with storage/constraints/deploymentStatus
<voidspace> cloud-init is running
<voidspace> dimitern: frobware: so with the latest of my branches (not yet tested) and babbageclunk's not-yet-merged storage branch
<voidspace> frobware: dimitern: bootstrap gets as far as deploying the node (maas status changes to Deployed)
<dimitern> voidspace, babbageclunk: great!
<voidspace> waiting to see if it gets any further - bootstrap has not yet returned. It has already reported one error, which will probably be fatal later but it hasn't stopped bootstrap.
<dimitern> I'll give maas2 bootstrap a try today then
<voidspace> ERROR failed to bootstrap model: bootstrap instance started but did not change to Deployed state: instance "4y3h7p" is started but not deployed
<voidspace> not true - it is Deployed
<voidspace> probably caused by:
<voidspace> ERROR could not record instance in provider-state: cannot record state instance-id: Not Found
<voidspace>  - 4y3h7p
<wallyworld> axw: off to soccer, i had to requeue that restore fix (pprof test error), if it fails again and you notice, could you requeue for me?
<axw> wallyworld: ok, gotta go make dinner shortly though
<wallyworld> np, only if you notice
<wallyworld> i'll check later
<wallyworld> looks like restore is f&cked
<wallyworld> errors setting restore status
<axw> :/
<wallyworld> txn assert error
<wallyworld> but Assert is empty
<wallyworld> need to diagnose
<axw> fwereade_: I have pulled Status and Life back out of params.Model, and made worker/undertaker responsible for setting model status. if you have time, PTAL
<fwereade_> axw, will do, tyvm
<fwereade_> axw, while they're not *immediately* relevant, did you have any thoughts re the philosophy-of-status essay in my reply?
<voidspace> babbageclunk: if you merge voidspace/maas2-deployment-status into your storage branch
<voidspace> babbageclunk: and then attempt to bootstrap
<voidspace> babbageclunk: you should see the "cannot record state instance-id" error
<voidspace> babbageclunk: it would be great to work on that
<voidspace> babbageclunk: isn't there also your interfaces work to complete? (a MAAS 2 version of maasObjectNetworkInterfaces)
<babbageclunk> voidspace: yup - I need to get stop-instances and interfaces up to date with next and do PRs for them.
<voidspace> babbageclunk: maas2Instance.volumes also needs implementing - and *may* be the cause of this failure (unlikely though)
<babbageclunk> voidspace: then I'll grab your branch and see if I can find the instance-id problem
<voidspace> babbageclunk: cool
<voidspace> dimitern: that system id is correct
<voidspace> dimitern: the deployed machine is 4y3h7p, the rack controller is 4y3h7n
<dimitern> voidspace: hmm, but does the provider-state file get created ok?
<voidspace> dimitern: so it's possible a bug in the way we fetch the instance (machine)
<voidspace> dimitern: no idea
<voidspace> dimitern: it needs investigating
<dimitern> voidspace: and then we get 404 trying to get instance 4y3h7p?
<voidspace> dimitern: but the id is correct
<voidspace> dimitern: well, that's what it looks like on the basis of no investigation beyond checking the system id of the machine
<axw> fwereade_: sorry, went afk. I do like the philosophy of separating collection, and summarisation/representation/visualisation. that's part of the reason why I removed the migration statuses from my PR
<dimitern> voidspace: maybe we're not passing agent-name and it doesn't find it..
<voidspace> dimitern: babbageclunk is going to look into it
<voidspace> dimitern: we are passing agent name
<dimitern> voidspace: ok
<voidspace> dimitern: well, at the juju level
<voidspace> it's possible gomaasapi screws things up :-)
<voidspace> but we can check that too
<axw> fwereade_: i.e. because migration and lifecycle are quite different, and so their statuses should probably be recorded separately
<axw> fwereade_: what I don't have a good idea about is how to represent all of that to the user. but at least we can make that call at the UI level
<babbageclunk> frobware, dimitern, frobware: StopInstances branch - http://reviews.vapour.ws/r/4567/
<babbageclunk> (menn0 told me the trick of removing changes from the branch it was chained off - sorry about the other times it wasn't done!)
<dimitern> babbageclunk: LGTM
<dimitern> babbageclunk: urm.. I meant the other one of yours, looking at the one above now
<mup> Bug #1569802 opened: add support for "decrement-container-interfaces-mtu" config option <juju-core:Triaged> <https://launchpad.net/bugs/1569802>
<babbageclunk> dimitern: awesome, thanks
<dimitern> babbageclunk: reviewed
<dimitern> mgz: aren't we using go 1.6 for merge gating now? just noticed "go version go1.2.1" still used by github-merge-juju
<babbageclunk> :( My maas2-storage merge run got a failure in worker/resumer tests.
<babbageclunk> They pass when I run them locally - anyone else seeing that?
<dimitern> babbageclunk: yeah, that's one of the flaky ones - seen it before, just $$retry$$
<dimitern> frobware, babbageclunk: fyi - https://github.com/juju/juju/pull/5130 if you have no objections, I suggest merging this
<babbageclunk> dimitern: makes sense to me.
<frobware> dimitern: I got a failed CI unit test run this morning for my LXD revert - do you know if this was already present in next?
<dimitern> frobware: which one?
<frobware> dimitern: http://juju-ci.vapour.ws:8080/job/github-merge-juju/7359
<dimitern> frobware: that looks like it's on next and also a mongo-related flakiness
<frobware> dimitern: yeah, submitted again
<mup> Bug #1555211 changed: Model name that "destroy-model" accepts doesn't match "list-models" output <2.0-count> <juju-release-support> <juju-core:Fix Released> <https://launchpad.net/bugs/1555211>
<TheMue> (late) morning
<voidspace> babbageclunk: did you make any progress tracking that issue?
<babbageclunk> voidspace: no, sorry - was making review changes and wrangling branches.
<babbageclunk> voidspace: I just got up to trying out your branch now, but when I try bootstrapping I get "ERROR Requested map, got <nil>."
<wallyworld> fwereade_: i am stupid, can i ask you to look at something?
<wallyworld> i sort of hope for an answer without having to dig too deeply and debug
<babbageclunk> voidspace: Ed Hope-Morley was just here.
<voidspace> babbageclunk: you need to set the MAAS2 feature flag
<voidspace> babbageclunk: ah, cool
<voidspace> babbageclunk: export JUJU_DEV_FEATURE_FLAGS=maas2
<babbageclunk> voidspace: thanks, was just asking that.
<voidspace> babbageclunk: otherwise it assumes maas 1 (we put the maas 2 work behind a feature flag)
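(The gating works roughly like this; a simplified sketch of the env-var mechanism only, not juju's actual featureflag package.)

    package main

    import (
        "fmt"
        "os"
        "strings"
    )

    // enabled reports whether a flag name appears in the comma-separated
    // JUJU_DEV_FEATURE_FLAGS environment variable.
    func enabled(flag string) bool {
        for _, f := range strings.Split(os.Getenv("JUJU_DEV_FEATURE_FLAGS"), ",") {
            if strings.TrimSpace(f) == flag {
                return true
            }
        }
        return false
    }

    func main() {
        // e.g. export JUJU_DEV_FEATURE_FLAGS=maas2
        fmt.Println("maas2 enabled:", enabled("maas2"))
    }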
<babbageclunk> voidspace: now I get a panic!
<babbageclunk> voidspace: nil-pointer dereference. Do I need to be on a later version of MAAS?
<babbageclunk> voidspace: http://pastebin.ubuntu.com/15809705/
<babbageclunk> voidspace: Oh, is that because the storage is nil? Does this branch have the storage changes in it?
<voidspace> babbageclunk: no - you need my branch plus storage
<voidspace> babbageclunk: is storage merged?
<babbageclunk> voidspace: ok
<babbageclunk> voidspace: No, grr - and the build for it just failed, I think because interfaces just got merged.
<babbageclunk> voidspace: 5th or 6th time's the charm!
<voidspace> :-)
<babbageclunk> voidspace: the fun bit is, I've still got a merge for stop-instances in the queue. That's based off storage, and I've updated it now so storage might get merged with that.
<babbageclunk> voidspace: What happens to the storage PR in that case? Will it just become merged automatically when all of its changes are already in the destination branch?
<babbageclunk> voidspace: Right, my head's feeling a bit explody, I'm going for a run.
<voidspace> babbageclunk: if you set stop-instances to merge and that is based off storage then yes - merging stop-instances will merge storage I think
<voidspace> babbageclunk: but as you want to merge both that doesn't matter
<voidspace> babbageclunk: not sure what will happen to the PR, it will *probably* get marked as merged
<babbageclunk> voidspace: yup, that's all good.
 * babbageclunk is actually going for a run now
<fwereade_> wallyworld, sorry, missed you, in meeting
<fwereade_> wallyworld, how can I help, if you haven't already solved it?
<wallyworld> fwereade_: there's a txn that returns Aborted - it appears to have just started to act up
<wallyworld> func (info *RestoreInfo) SetStatus(status RestoreStatus)
<wallyworld> in state/backups/restore.go
<wallyworld> the initial state being passed in is Pending
<wallyworld> and the txn fails, so restore aborts
<wallyworld> i can't see why off hand
<wallyworld> if i comment out the assert, it works
<wallyworld> initially, it's an empty assert for pending
<wallyworld> ah, not state/backups, just state
<fwereade_> wallyworld, hold on, ringing a bell...
<fwereade_> wallyworld, heh, there might be a few things going on
<wallyworld> it's such a simple txn
<fwereade_> wallyworld, the failure would seem to indicate that something else has already set the status away from pending
<wallyworld> fwereade_: not really
<wallyworld> for pending, the assert is empty
<wallyworld> ah, but initially there won't be a doc
<fwereade_> wallyworld, d'oh, yes
<wallyworld> so it should be Insert
<wallyworld> so how the fuck did this ever work
<wallyworld> the code is from 2014
<fwereade_> wallyworld, heh, I actually think I must have broken it
<wallyworld> maybe something else inserts a doc initially
<wallyworld> oh, when?
<fwereade_> wallyworld, creating a RestoreInfoSetter would implicitly set a status, which would trigger txn failures when two things created them at once
<fwereade_> wallyworld, especially infuriating when one of them was only creating the "setter" in order to get, grrmbl
<wallyworld> see https://bugs.launchpad.net/juju-core/+bug/1569467/comments/2
<mup> Bug #1569467: backup-restore loses the hosted model <backup-restore> <blocker> <ci> <regression> <juju-core:Triaged by cherylj> <https://launchpad.net/bugs/1569467>
<wallyworld> that comment has a timeline of sorts
<fwereade_> wallyworld, but, yeah, if there wasn't *something* else creating the restore status, I don't see how that could have ever worked
<wallyworld> fwereade_: a quick code search shows nothing now is creating an initial entry
<wallyworld> cherylj: see scrollback - that restore issue has been identified as due to a refactoring earlier - i don't think it can be fixed in time for beta4
<fwereade_> wallyworld, concur
<wallyworld> cherylj: we will just have to release note it and fix for the next release
<fwereade_> wallyworld, but, well, I made that change a long time ago and I'm 99% sure it was on MADE-model-workers pre-bless... maybe something else got mangled in a merge?
<wallyworld> fwereade_: or CI didn't pick it up
<wallyworld> because the reason it has started failing is due to newer multi-model tests
<fwereade_> wallyworld, right, but I'm with you in that I can't see how it could have worked at all
<wallyworld> where the hosted model is created with a name based on the test
<fwereade_> wallyworld, (by the way, fun feature of that file: const currentRestoreId = "current")
<wallyworld> fwereade_: one option maybe just to get it working is to comment out the restore status setting
<fwereade_> wallyworld, ha
<wallyworld> as in restore will work but status will be unknown; i'd have to check to see where status is used
<fwereade_> wallyworld, yeah... you know, I'm actually not sure it is even used
<wallyworld> fwereade_: looks like a watcher is used to set a restoring flag on the machine agent
<wallyworld> so we can block api calls
<wallyworld> so it sort of is needed i think
<fwereade_> wallyworld, yeah, sort of, even though that flag is not goroutine safe, and doesn't appear to have any way to dc existing connections
<fwereade_> wallyworld, wait a mo
<fwereade_> wallyworld, if the assert is empty
<fwereade_> wallyworld, I don't think that'd ErrAborted, would it?
<wallyworld> fwereade_: tl;dr; no quick fix for beta4 for inclusion in xenial
<wallyworld> given we need to get a CI bless asap now
<fwereade_> wallyworld, surely it would run and try to modify a document that wasn't there and fail silently?
<fwereade_> wallyworld, yeah, agreed
<wallyworld> fwereade_: that's what i thought, but it seems it does because the doc doesn't exist and Insert is not used
<fwereade_> wallyworld, oh, that scenario explicitly gives us ErrAborted?
<wallyworld> i'm going by what the logs report
<wallyworld> appears so
<wallyworld> according to the logs
<wallyworld> either way, it's broken
<wallyworld> fwereade_: i hacked up a version where for pending, the first status set call, i made the Assert nil, and that seemed to allow it to get further
<wallyworld> so in that case it may have silently failed, because the next status set call to restoring failed
<fwereade_> wallyworld, it'll still fail later though, won't it? FinishRestore will never succeed
<wallyworld> since that call did have an assert
<wallyworld> yep
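(To make the failure mode concrete: with mgo/txn, the first-ever status write needs an Insert op asserting the document is missing; whether a bare Update against a missing doc aborts loudly or fails silently is exactly what the thread is debating. A rough sketch, illustrative only and not the juju state code, using the "current" id mentioned above.)

    package restore

    import (
        "gopkg.in/mgo.v2/bson"
        "gopkg.in/mgo.v2/txn"
    )

    // setStatus sketches the two op shapes under discussion. Without the
    // Insert branch, the very first write targets a doc that does not
    // exist, which is how the restore status could never get set.
    func setStatus(runner *txn.Runner, docExists bool, status string) error {
        var op txn.Op
        if docExists {
            op = txn.Op{
                C:      "restoreInfo",
                Id:     "current", // the currentRestoreId noted above
                Assert: txn.DocExists, // or an assert on the previous status
                Update: bson.D{{"$set", bson.D{{"status", status}}}},
            }
        } else {
            op = txn.Op{
                C:      "restoreInfo",
                Id:     "current",
                Assert: txn.DocMissing,
                Insert: bson.D{{"status", status}},
            }
        }
        return runner.Run([]txn.Op{op}, "", nil)
    }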
<mup> Bug #1569898 opened: cmd/pprof: sporadic test failure <juju-core:New> <https://launchpad.net/bugs/1569898>
<wallyworld> so the restore nazi says "no restore for you in beta4" (with apologies to Seinfeld and the Soup Nazi)
<wallyworld> cherylj: when is the drop dead cutoff for beta4?
<cherylj> ha, yesterday
<cherylj> heh
<cherylj> we need to get a release out asap
<cherylj> this whole restore situation is confusing to be because it works for me on joyent
<cherylj> to me, not to be
<cherylj> heh
<cherylj> wallyworld: in your recreate, did you see anything weird for mongo in syslog?
<wallyworld> cherylj: didn't look in syslog - the error is quite apparent from the juju logs
<wallyworld> cherylj: are you sure it worked?
<wallyworld> it may have appeared to work
<wallyworld> it would look like it did from juju status
<wallyworld> but it would not have
<cherylj> wallyworld: I could run juju status on the hosted model and it showed me the right info (correct machine / service) and list-models showed the right info
<cherylj> I bootstrapped with a different model name for the default model
<cherylj> and that was right too
<wallyworld> hah ok, i did not see that in my testing, it failed deterministically, and the cause is now apparent from the code
<wallyworld> cherylj: did you use upload-tools
<wallyworld> your restore may have grabbed beta3 tools
<wallyworld> which may not have the problem
<cherylj> I did upload-tools
<cherylj> er wait
<cherylj> not for restore, you're right
<wallyworld> that may well be the reason
<cherylj> thank goodness there's some sense to the situation then
<cherylj> heh
<wallyworld> yeah, agreed, we needed to be able to explain it
<wallyworld> but sadly we will go to beta4 without restore working
<wallyworld> cherylj: i landed a few branches today to fix various issues with restore and mongo3.2; so all that should be ok now, but of course now that that is fixed, we see this other issue, although i did much of my testing on trusty containers for the latest failures
<mup> Bug #1569898 changed: cmd/pprof: sporadic test failure <juju-core:New> <https://launchpad.net/bugs/1569898>
<natefinch> wallyworld: do you know where I should be looking for the code I need to hack for upload-tools?
<mup> Bug #1569898 opened: cmd/pprof: sporadic test failure <juju-core:New> <https://launchpad.net/bugs/1569898>
<mup> Bug #1569914 opened: help text for juju show-controller needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1569914>
<wallyworld> natefinch: to do which bit?
<wallyworld> natefinch: did you see john's explanation?
<wallyworld> you  may not need to hack anything
<natefinch> wallyworld: I saw the text, but it doesn't really help me.
<wallyworld> doesn't it explain why amd64 is being chosen?
<wallyworld> the provider is not asking for the right arch
<wallyworld> i only skimmed it
<natefinch> wallyworld: it doesn't say where the code is that it's talking about. The provider is a ton of twisty code
<wallyworld> i have not looked at it
<natefinch> wallyworld: also, when he says images, does he mean the OS image or the tools image?
<wallyworld> os image
<natefinch> wallyworld: well, ok, I know that's incorrect
<natefinch> wallyworld: the alias of the OS image may be ubuntu-trusty, but the arch is correct... we can see that in lxd... and the fact that the OS image runs at all means it's the right arch
<natefinch> wallyworld: the problem is that cloud-init is requesting the wrong tools version
<natefinch> wallyworld: e.g. Attempt 1 to download tools from https://10.0.3.164:17070/tools/2.0-beta3-xenial-amd64...
<mup> Bug #1569914 changed: help text for juju show-controller needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1569914>
<wallyworld> right ok, so that last bit is the binary version string
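(That tail has the fixed <number>-<series>-<arch> shape; a toy parser just to show the format. juju's real parsing lives in its version package.)

    package main

    import (
        "fmt"
        "strings"
    )

    // splitBinary breaks a binary version string such as
    // "2.0-beta3-xenial-amd64" into number, series and arch, peeling
    // fields off the right because the number itself can contain dashes.
    func splitBinary(s string) (number, series, arch string) {
        parts := strings.Split(s, "-")
        if len(parts) < 3 {
            return s, "", ""
        }
        arch = parts[len(parts)-1]
        series = parts[len(parts)-2]
        number = strings.Join(parts[:len(parts)-2], "-")
        return
    }

    func main() {
        fmt.Println(splitBinary("2.0-beta3-xenial-amd64")) // 2.0-beta3 xenial amd64
    }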
<natefinch> wallyworld: what I have been trying to figure out is how we generate the URL to request.... it's just really hard to trace it back just by looking at the code, and I haven't been able to figure out how to use upload-tools and still let the code request stuff from streams
<wallyworld> i gave you instructions :-)
<wallyworld> which i have not tried, i think they will work
<natefinch> wallyworld: I don't know where any of that code lives
<natefinch> wallyworld: I spent a few hours trying to find that code last night
<wallyworld> i'd have to search for it to be exact. there's a ToolsStorage struct in state which is used to store the tools
<wallyworld> then i'd search for what increments the version Build field
<wallyworld> to see how to stop that being incremented
<wallyworld> you could also trace what sets the cloud init config
<wallyworld> to see where it is telling it to download tools of a certain arch
<natefinch> I have tried that, too.
<katco> wallyworld: cherylj: doh... sorry i inverted the fix to that bug
<wallyworld> katco: you had your glasses on back to front
<cherylj> heh
<katco> wallyworld: lol
<mup> Bug #1569914 opened: help text for juju show-controller needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1569914>
<wallyworld> natefinch: i just did a quick code search - did you look at func InstanceConfig() in instanceconfig.go ?
<wallyworld> that's where the cloud init tools are set up i think
<cherylj> natefinch: some (possibly) interesting factoids
<cherylj> 1 - This test had been passing on arm64 until the host was updated to xenial
<tych0> frobware: hi, what's the current state of lxd/juju?
<tych0> frobware: i'm finally done with some customer stuff and can get back to looking at it now
<tych0> (well, i have to fly back to denver, but i can look this afternoon/late evening)
<cherylj> 2 - the problem with s390 could be attributed to s390x not being a valid arch in beta3 (see bug 1554675)
<mup> Bug #1554675: Unable to bootstrap lxd provider on s390x <lxd> <s390x> <juju-core:Fix Committed by cherylj> <juju-core 1.25:Fix Released by cherylj> <https://launchpad.net/bugs/1554675>
<wallyworld> natefinch: so looking at that code, it appears the tools arch comes from the machine's recorded HardwareCharacteristics, which may be nil
<wallyworld> this is just a guess - i've not seen this code before, it's all been refactored
<wallyworld> so when the container is created, if the hardware characteristics do not include the arch, that could explain it
<wallyworld> but i'd need to trace it through
<wallyworld> you don't need to hack around upload tools to do that
<wallyworld> just add extra debugging
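(wallyworld's guess, sketched. The types here are trimmed-down stand-ins, though juju's HardwareCharacteristics really does record Arch as a nil-able *string.)

    package main

    import "fmt"

    // HardwareCharacteristics is reduced to the one field under
    // discussion; a nil Arch means the provider never recorded one.
    type HardwareCharacteristics struct {
        Arch *string
    }

    // toolsArch shows how a missing recorded arch can silently fall back
    // to a default, which would explain amd64 tools being requested for
    // non-amd64 hardware.
    func toolsArch(hc *HardwareCharacteristics, fallback string) string {
        if hc == nil || hc.Arch == nil {
            return fallback
        }
        return *hc.Arch
    }

    func main() {
        fmt.Println(toolsArch(nil, "amd64")) // amd64, even on arm64 hardware
    }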
<wallyworld> cherylj: natefinch: using upload tools would get around that invalid arch issue in beta3
<cherylj> wallyworld:  for the s390 case at least.  Don't know if it's the same for arm
<wallyworld> cherylj: arm should have been properly defined in juju since forever
<cherylj> yeah, I know :(
<cherylj> but, it may not make a difference
<cherylj> how could they have bootstrapped a s390x lxd provider anyway?
<cherylj> I don't know...  just spewing data points
<sinzui> wallyworld: cherylj: since agents use deb archs, "dpkg --print-architecture" on the client host will probably show the arch for the container
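(sinzui's check as a snippet, assuming a Debian-family host where dpkg is present.)

    package main

    import (
        "fmt"
        "os/exec"
        "strings"
    )

    func main() {
        // The deb architecture of the host is what a container on it
        // will report.
        out, err := exec.Command("dpkg", "--print-architecture").Output()
        if err != nil {
            panic(err)
        }
        fmt.Println(strings.TrimSpace(string(out))) // e.g. amd64, arm64, s390x
    }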
<babbageclunk> voidspace, dimitern: gah! Is allWatcherStateSuite.TestStateWatcherTwoModels another flaky test?
<cherylj> yes
<babbageclunk> :(
<babbageclunk> oh, cherylj, that probably wasn't to me, was it.
<cherylj> babbageclunk: it was, sorry :)
<babbageclunk> cherylj: d'oh
<babbageclunk> cherylj: ok, requeuing. Thanks!
<natefinch> wallyworld: where's the code that gets the arch from the machine?
<wallyworld> can't recall, i 'll look
<tych0> frobware: i just tried a deploy with master on GCE, looks like it's still broken with no network. i know you said you had a patch for this in progress, what's the state of it?
<katco> cherylj: really quick review for inverting that bug fix? http://reviews.vapour.ws/r/4571/diff/#
<cherylj> katco:  we should probably cancel the $$merge$$ request on my revert, then
<cherylj> sinzui: can we do that? ^^
<wallyworld> natefinch: i've traced it back from the state docs - there's a SetInstanceInfo() API on the machine facade called by the provisioner
<cherylj> it's still in the queue
<sinzui> cherylj: me tries
<wallyworld> natefinch: and that gets its info from StartInstance()
<katco> cherylj: whatever is easier
<sinzui> cherylj: your PR?
<wallyworld> natefinch: this is maas right?
<natefinch> wallyworld: lxd
<cherylj> sinzui: https://github.com/juju/juju/pull/5131
<natefinch> wallyworld: sounds like the problem may be that startinstance isn't setting the arch when it should be
<wallyworld> yes, which is what john said i think
<sinzui> cherylj: I don't see it in the queue to cancel
<sinzui> wow 2 hours ago
<cherylj> yeah, there's quite a backlog for merging
<sinzui> cherylj: revert-upload-tools is canceled
<cherylj> thanks, sinzui  :)
<katco> sinzui: ty!
<wallyworld> natefinch: looks like lxd does the right thing at a quick look
<wallyworld> but all this can be debugged
<natefinch> wallyworld: yeah, doing that now
<wallyworld> ok, good luck, i'm out of here for a few hours to zzzzzzzzz
<natefinch> wallyworld: good, talk to you tomorrow
<wallyworld> ttyl
<bogdanteleaga> sinzui, cherylj, can this get bumped to a higher priority? https://bugs.launchpad.net/juju-core/+bug/1567676
<mup> Bug #1567676: windows: networker tries to update invalid device and blocks machiner from working <juju-core:Triaged> <https://launchpad.net/bugs/1567676>
<bogdanteleaga> the machiner is broken
<cherylj> bogdanteleaga: sure I can target for rc1
<babbageclunk> voidspace: ok, so I see the power-settings clearing bug Tim was talking about, I think.
<voidspace> babbageclunk: heh, much better not to set the power :-)
<katco> cherylj: any suggestions on what to pick up next?
<cherylj> you can take bogdanteleaga's bug he just mentioned :)
<katco> cherylj: let me tal
<cherylj> I just added a card for it
<katco> cherylj: can i reproduce this on linux, or do i need a windows installation?
<cherylj> katco: I suspect it's windows only
<katco> cherylj: do we have any windows installations to test against?
<cherylj> katco: there is in CI.  sinzui, abentley - can katco access a MAAS that has windows images to deploy?
<sinzui> cherylj: katco sure. I can add katco to munna
<katco> sinzui: cherylj: ty... any documentation on how to utilize this?
<sinzui> katco: I have no documentation. These MAASes are all new to me
<katco> sinzui: ok
<katco> sinzui: do i need to be on a vpn or anything?
<sinzui> katco: cloud-city and the CI runs have configurations in them
<sinzui> katco: you will not; you will enter as the CI bot with all its privs
<katco> natefinch: standup time
<frobware> tych0: it's currently in master
<frobware> alexisb, tych0: is is possible to get GCE creds to poke around?
<dimitern> alexisb: ping
<alexisb> dimitern, omw
<dimitern> ok
<mup> Bug #1569948 opened: help text for juju list-machines needs improving <helpdocs> <juju-core:New> <https://launchpad.net/bugs/1569948>
<mup> Bug #1569949 opened: log spam: "skipping observed IPv6 address ..." <juju-core:New> <https://launchpad.net/bugs/1569949>
<natefinch> katco, ericsnow: and once I figure out how to do this dance with upload-tools, I'll write it down so others can benefit from it.
<katco> natefinch: sounds good
<ericsnow> natefinch: nice
<katco> ericsnow: natefinch: btw they're doing something to the sidewalk again -.- i'm hopeful they don't cut anything this time
<ericsnow> katco: buena suerte
<terje> I'm deploying a charm I'm working on to private openstack clouds. In doing so, I need a few things (endpoints, username/pw, tokens), etc.
<terje> Currently I put them in config.yaml and use config-get to access them when the charm deploys, but since they differ per install this isn't going to work
<terje> is there a better way to go about this?
<katco> terje: you might have more luck in #juju
<katco> terje: that's where the charming folks hang out, and i'm sure they've run into that problem before
<cherylj> perrito666: ping?
<perrito666> cherylj: pong
<natefinch> yeah, folks on this channel just aren't as charming
<cherylj> perrito666: looks like the mongo path changes broke windows unit tests:  http://paste.ubuntu.com/15813652/
<perrito666> cherylj: whaaa? I am pretty sure I +build !windows to that test
<mup_> Bug #1569963 opened: log spam:  image metadata, apiworkers manifold worker, KVM <juju-core:New> <https://launchpad.net/bugs/1569963>
<mup_> Bug #1569969 opened: No way to set credential for current model. <juju-core:New> <https://launchpad.net/bugs/1569969>
<perrito666> cherylj: there it is https://github.com/juju/juju/blob/master/mongo/internal_test.go#L4 I wonder what is going on
<perrito666> natefinch: do you know if build flags are valid for tests?
<terje> lulz, ok. thanks. There's never anyone active in #juju but I'll give it a shot.
<cherylj> perrito666: the failing test is  in state/backups/internal_test.go
<natefinch> perrito666: absolutely
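(For reference, the mechanism under discussion: standard Go build constraints, which apply to _test.go files exactly as to other files. File and test names here are invented.)

    // +build !windows

    package mongo_test

    import "testing"

    // Because of the constraint above, this entire file is excluded from
    // compilation on Windows, so its tests never run there.
    func TestLinuxOnlyPaths(t *testing.T) {
        t.Log("only compiled on non-windows platforms")
    }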
<cherylj> which does not have the windows build flag
<perrito666> cherylj: oh, I see, I did not touch that why would it be failing
<cherylj> (sorry, I cut that part off from the paste)
 * perrito666 goes check
<cherylj> here's the full paste:  http://paste.ubuntu.com/15813799/
<perrito666> cherylj: I see no state/backups/internal_test.go
<perrito666> cherylj: actually, grep says that test is not here
<cherylj> https://github.com/juju/juju/blob/master/state/backups/internal_test.go#L28
<perrito666> cherylj: ohhh, I see, Ian most likely merged this while my patch was on the queue
<perrito666> there it is
<terje> crickets in #juju ..
<katco> marcoceppi: anyone that can help terje out?
<marcoceppi> terje: katco hi, I'll answer in #juju
<katco> marcoceppi: thx dude
<terje> yea, thanks!
<katco`> gah... stupid construction! power keeps cutting out
<katco`> thank god the internet seems unaffected
<fwereade_> praise be to the internet
<katco`> may we all bask in its warm glow and loving embrace
<katco`> so sayeth we all
<natefinch> sinzui: we have 2.0beta3 in streams somewhere, right?
<natefinch> or mgz_ ^
<sinzui> natefinch: devel streals
<sinzui> devel streams
<natefinch> sinzui: what's the url for that? I'm trying to write some documentation and I can't actually find those tools in streams. I'm sure I'm just missing them somewhere
<sinzui> natefinch: agent-stream: devel
<sinzui> no need to set agent-metadata-url
<natefinch> sinzui: no no... what do I type into chrome?
<natefinch> sinzui: I just want to see what's available
<sinzui> natefinch: This is the official streams location https://streams.canonical.com/juju/tools/streams/v1/index2.json
<natefinch> sinzui: where's the actual list of available images?  I don't know how to parse that json
<sinzui> natefinch: images? oh, that would be cloud-images.ubuntu.com
<natefinch> sinzui: s/images/tools
<natefinch> sinzui: for example: https://streams.canonical.com/juju/tools/proposed/
<natefinch> sinzui: except, with 2.0beta3 in the list
<natefinch> sinzui: this is for human consumption, not machine
<sinzui> natefinch: that is obsolete from a long time ago
<natefinch> sinzui: somewhere there is a link to juju-2.0beta3-trusty-amd64.tgz .... can you please just find that url for me?
<sinzui> natefinch: https://streams.canonical.com/juju/tools/streams/v1/index2.json is what juju uses
<sinzui> natefinch: in that file is a path to the stream you want to use, "devel"
<sinzui> natefinch: a devel juju will then read https://streams.canonical.com/juju/tools/streams/v1/com.ubuntu.juju-devel-tools.json
<natefinch> sinzui: what is the base of the relative path that they give? e.g. "agent/2.0-beta3/juju-2.0-beta3-trusty-amd64.tgz" ?
<sinzui> natefinch: the agent-metadata-url used by Juju. Juju's default is https://streams.canonical.com/juju/tools
<mup> Bug #1569982 opened: pathsSuite.TestPathDefaultMongoExists failing on windows <blocker> <ci> <juju-core:Triaged by cherylj> <https://launchpad.net/bugs/1569982>
<natefinch> sinzui: ahh, thank you
<sinzui> natefinch: The directories are browsable on that server :) https://streams.canonical.com/juju/tools/agent/
<natefinch> sinzui: yes, that's the URL I was looking for
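(Since the index JSON is hard to eyeball, here is a small walker for it. The struct models only the fields needed, based on the simplestreams index format, so it is worth double-checking against the live file.)

    package main

    import (
        "encoding/json"
        "fmt"
        "net/http"
    )

    // index models just enough of the simplestreams index to list the
    // streams and the relative paths of their metadata files.
    type index struct {
        Index map[string]struct {
            Path string `json:"path"`
        } `json:"index"`
    }

    func main() {
        resp, err := http.Get("https://streams.canonical.com/juju/tools/streams/v1/index2.json")
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()

        var idx index
        if err := json.NewDecoder(resp.Body).Decode(&idx); err != nil {
            panic(err)
        }
        // Paths are relative to the agent-metadata-url base, which is
        // https://streams.canonical.com/juju/tools by default.
        for name, entry := range idx.Index {
            fmt.Println(name, "->", entry.Path)
        }
    }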
<sinzui> cherylj: bug 1559715 is about the model instances being left behind
<mup> Bug #1559715: restore-backup is unreliable <backup-restore> <ci> <destroy-controller> <destroy-environment> <regression> <juju-ci-tools:Triaged> <juju-core:Triaged> <https://launchpad.net/bugs/1559715>
<sinzui> cherylj: I will re-title it
<cherylj> perrito666: https://github.com/juju/juju/pull/5135
<cherylj> sinzui: yeah, good idea
<tych0> jam: frobware: https://github.com/juju/juju/pull/5136
<frobware> tych0: your testing for this problem is from the tip of master?
<tych0> frobware: yes
<frobware> tych0: ok, makes perfect sense
<tych0> frobware: although i haven't tested against "next", but it looked to me like they were the same
<frobware> tych0: the reason I was asking is because there is (or has been) a race in that part of the code
<tych0> right, i assume that is what you were working on?
<frobware> tych0: the "if len(networkConfig.Interfaces) > 0 " can happen very occasionally on MAAS
<tych0> you mean it is == 0 even on maas?
<frobware> tych0: or, put another way, it's possible for len == 0 even on MAAS...
<frobware> tych0: let me dig out the bug
<tych0> frobware: right, i'm not testing on mass, but gce
<tych0> and it is always broken for me
<frobware> tych0: bug #1564395
<mup> Bug #1564395: newly created LXD container has zero network devices <bootstrap> <conjure> <network> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1564395>
<tych0> that's what i was complaining about a few weeks ago
<frobware> tych0: can you send me the machine-0.log without your fix?
<mup> Bug #1566531 changed: Instances are left behind testing Juju 2 <ci> <destroy-environment> <ec2-provider> <jujuqa> <juju-ci-tools:Invalid> <juju-core:Fix Released> <https://launchpad.net/bugs/1566531>
<mup> Bug #1570009 opened: pathsSuite.TestPathDefaultMongoExists fails because of windows path <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1570009>
<tych0> frobware: sure
<tych0> (i'm on a plane right now so it might be slow :)
<frobware> tych0: planes are fast. :-D
<tych0> frobware: but i think without the above juju/lxd container type is basically broken everywhere, no?
<cherylj> can I get a quick review?  http://reviews.vapour.ws/r/4574/
<frobware> tych0: yes-ish. what's not clear is whether that's due to the bug ^^ -- ie., by the time we get to "your patch" it was going to be empty come what may
<tych0> frobware: http://paste.ubuntu.com/15815783/
<tych0> frobware: oh. does juju provide network config in all cases?
<tych0> frobware: i read somewhere in the code that it was optional
<tych0> maybe in the struct InstanceSpec comments or something
<frobware> tych0: checking...
<tych0> frobware: ah, it's on StartInstanceParams
 * babbageclunk is BACK!
<mup> Bug #1570009 changed: pathsSuite.TestPathDefaultMongoExists fails because of windows path <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1570009>
<frobware> tych0: followed up in RB
<cherylj> perrito666: review, pretty please?  http://reviews.vapour.ws/r/4574/
<frobware> cherylj: updated the release notes but will follow-up again tomorrow in the cold light of day...
<perrito666> cherylj: ship it
<cherylj> tyvm, frobware!
<tych0> frobware: i'm not sure i understand your comment
<tych0> frobware: since we don't create those devices at the moment, don't we want to use the default profile whether they're nil or not?
<mup> Bug #1305509 changed: state/watcher: possible data race in commonWatcher <race-condition> <juju-core:Fix Released> <https://launchpad.net/bugs/1305509>
<mup> Bug #1517747 changed: provider/joyent/gomanta: data race <2.0-count> <race-condition> <juju-core:Triaged> <https://launchpad.net/bugs/1517747>
<mup> Bug #1570031 opened: Cannot bootstrap MAAS2 since Juju2 does not use MAAS API V2 <juju-core:New> <https://launchpad.net/bugs/1570031>
<frobware> tych0: it's not clear whether we should immediately fall back to "default"
<frobware> tych0: I need to be able to test this.
<tych0> frobware: what other options are there?
<natefinch> I love the way in juju status that we try to save horizontal space by labelling the Instance ID column "INS-ID" ... but then the actual ID is juju-c9d1b54c-95ba-4046-8b16-06e16b02ada8-machine-0
<frobware> tych0: NetworkConfig.Device and NetworkConfig.NetworkType may have valid values
<tych0> frobware: right, but nothing in the code uses those right now right?
<frobware> tych0: in container/lxd.go no, but that could be an oversight having only tested this with MAAS
<frobware> tych0: I'm trying to understand from the POV of the other providers, particularly AWS with and without juju's address-allocation feature flag
<tych0> right. i'm not proposing a fix for that, i'm proposing a fix for something else: that there are no networks at all in containers on GCE or AWS
<frobware> tych0: so it wasn't clear to me this was broken on AWS.
<tych0> i think it's broken everywhere that doesn't provide network configuration, which includes AWS right?
<tych0> i haven't actually tested it there, i just heard someone (maybe you?) mentioned that GCE and AWS were similar in that respect
<frobware> tych0: just trying on AWS
<mup> Bug #1570031 changed: Cannot bootstrap MAAS2 since Juju2 does not use MAAS API V2 <juju-core:New> <https://launchpad.net/bugs/1570031>
<mup> Bug #1570035 opened: Race in api/watcher/watcher.go <ci> <race-condition> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1570035>
<cherylj> katco: ericsnow just opened up a bug with the same error of that windows bug you're working on
<cherylj> katco: and I'm assuming he's not on windows :)
<katco> cherylj: k
<cherylj> katco: so maybe there's a way to recreate without windows
<katco> cherylj: dunno... ericsnow, are you? ;p
<cherylj> bug 1569963
<mup> Bug #1569963: log spam:  image metadata, apiworkers manifold worker, KVM <juju-core:New> <https://launchpad.net/bugs/1569963>
<ericsnow> cherylj: not on windows :)
<ericsnow> katco: ^^^
<katco> cherylj: this looks like a completely different error?
<cherylj> machine-0: 2016-04-13 14:54:23 ERROR juju.worker.dependency engine.go:526 "apiworkers" manifold worker returned unexpected error: setting controller network config: cannot set controller provider network config: cannot set link-layer devices to machine "0": invalid device "unsupported0": Type "" not valid
<katco> cherylj: oh nm i see the message now
<cherylj> ok, thought I was going crazy
<cherylj> which is entirely possible
<katco> aren't we all? heh heh heh
<frobware> tych0: I'm using your tree and AWS seems busted (for add-machine lxd:0)
<tych0> frobware: oh? what's the symptom?
<frobware> tych0: bleh, scrap that. Warning != Error.
<tych0> frobware: ok, my battery is going to die. let me know what you figure out about aws
<frobware> tych0: just testing one more path...
 * frobware mulls over the fact that he is doing basic sanity bootstrap... that a machine and/or CI could do...
<frobware> tych0: not sure if you're still about but ... http://pastebin.ubuntu.com/15818135/
<frobware> tych0: that's your tree, bootstrapped as: JUJU_DEV_FEATURE_FLAGS=address-allocation juju bootstrap a1 aws --upload-tools
<frobware> tych0: I'll need to dig into this more tomorrow
<frobware> tych0: so, the problem above is because:
<frobware> ubuntu@ip-10-229-48-181:/var/log/lxd/juju-machine-0-lxd-0$ sudo lxc profile list
<frobware> docker
<frobware> tych0: there is no default profile
<mup> Bug #1568668 changed: landing bot uses go 1.2 for pre build checkout and go 1.6 for tets <juju-ci-tools:New> <https://launchpad.net/bugs/1568668>
<mup> Bug #1568668 opened: landing bot uses go 1.2 for pre build checkout and go 1.6 for tets <juju-ci-tools:New> <https://launchpad.net/bugs/1568668>
<katco> natefinch: how's your stuff going?
<natefinch> katco: I may have just found a key piece.... I still haven't figured out how to disable all the upload-tools stuff, but I just found a spot where we haven't set the arch when getting user data config (i.e. the thing that sets up cloudinit and which determines the tools we download)
<natefinch> katco: maybe... still not sure
<cherylj> katco: do you know if there was ever a plan to support nested lxd containers for the lxd provider (such that deploy --to lxd:# would work)?
<katco> cherylj: that's certainly a use-case
<cherylj> katco: but would we expect it to work now? (because we don't allow it because of the "can host containers" check)
<katco> cherylj: i don't think we specifically added support for that in the provider
<cherylj> ok
<katco> cherylj: so right now it doesn't work because of a security setting?
<rick_h_> cherylj: katco that's the profile issue
<cherylj> katco:  I think it's because of the check that we do to determine if a machine can host a container
<rick_h_> cherylj: katco where you have to use the docker profile to get that to work
<cherylj> the check is "are we a container" I think
<katco> rick_h_: cherylj: right, that's where i was going... can we tweak the profile to disable that security check
<katco> cherylj: oh, it's a juju thing, not lxd?
<cherylj> wait, I'm thinking of a different bug
<cherylj> maybe it's not that check that's failing
<cherylj> (sorry, I'm slowly losing my mind)
<cherylj> ah, yes it is the same error output:  "cannot add a new machine: machine 1 cannot host lxd containers"  but it doesn't fail at deploy time?  weird.
<cherylj> this is bug 1569106 for reference
<mup> Bug #1569106: juju deploy  <service> --to lxd:0 does not work with lxd provider <conjure> <lxd> <placement> <juju-core:New> <https://launchpad.net/bugs/1569106>
<mup> Bug #1568668 changed: landing bot uses go 1.2 for pre build checkout and go 1.6 for tets <juju-ci-tools:New> <https://launchpad.net/bugs/1568668>
<natefinch> katco or ericsnow: care to pair on this?  A fresh pair of eyes may be a big help
<ericsnow> natefinch: sure
<katco> natefinch: trying to get this windows test going
<ericsnow> natefinch: I need a break :)
<natefinch> ericsnow: cool
<natefinch> katco: np
<natefinch> katco: good luck
<tych0> frobware: wat
<tych0> frobware: how is there no default profile? that is odd.
<thumper> morning
<perrito666> hi thumper
<thumper> hey perrito666
 * perrito666 wonders what exactly he changed in his linkedin lately that provoked a wave of recruiters
<mup> Bug #1569072 changed: juju2 bundle deploy help text out of date <landscape> <juju-core:New> <https://launchpad.net/bugs/1569072>
<natefinch> katco, cherylj: pretty sure ericsnow and I figured out the problem with lxd on non-amd64.
<cherylj> yay!
<cherylj> natefinch: what was it?
<natefinch> cherylj: we were getting a list of valid tools, and not filtering it for valid architectures before picking one off the list.  It just happens to always be amd64 first... and just happens to be ignored during bootstrap
<cherylj> nice
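(The shape of that fix, sketched with stand-in types; the real change is in the lxd provider, in the PR natefinch links later.)

    package main

    import "fmt"

    // agentTool stands in for juju's tools metadata; only the field
    // relevant to the bug is kept.
    type agentTool struct {
        URL  string
        Arch string
    }

    // pickForArch filters the candidate list down to the target arch
    // before choosing, instead of taking the head of the list (which in
    // the bug happened to always be amd64).
    func pickForArch(available []agentTool, arch string) (agentTool, bool) {
        for _, t := range available {
            if t.Arch == arch {
                return t, true
            }
        }
        return agentTool{}, false
    }

    func main() {
        tools := []agentTool{
            {"tools-amd64.tgz", "amd64"},
            {"tools-arm64.tgz", "arm64"},
        }
        fmt.Println(pickForArch(tools, "arm64")) // {tools-arm64.tgz arm64} true
    }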
<mup> Bug #1570096 opened: No way to remove user; remove-user command is missing <juju-core:New> <https://launchpad.net/bugs/1570096>
<redir> that seems like a pretty solid bug
<thumper> wallyworld: meeting?
<wallyworld> thumper: on way, last meeting ran over
<perrito666> wallyworld: I am leaving for like an hour or so, if you change your mind regarding the standup please have anastasiamac_ or axw contact me via some communication protocol from this century, preferably one that rings my phone such as twitter
<wallyworld> perrito666: ok, will do
<anastasiamac_> wallyworld: axw: I proposed a quick fix for cloud lookup to include built-in providers consistently http://reviews.vapour.ws/r/4579/
<anastasiamac_> cherylj: if u don't want it in beta4 ^^^, I can re-propose against next with more testing around the area...
<thumper> trivial review for someone: https://github.com/juju/gomaasapi/pull/37
<thumper> axw: you on call?
<axw> thumper: I am
<axw> (currently reviewing in fact)
 * axw enqueues
<axw> thumper: can we not change gomaasapi to drop the empty-value ?op=
<thumper> axw: we *could*, but I'm more hesitant to change parts that are used already...
<thumper> it is almost certainly fine
<thumper> but still has me a bit squeamish
<axw> thumper: ok
<menn0> axw: quick state API question: http://paste.ubuntu.com/15822102/
<menn0> I think I prefer the latter
<wallyworld> natefinch: how goes the non amd64 lxd provider fix?
<ericsnow> wallyworld: we paired right before he had to go for dinner and it looks like we came up with a fix
<ericsnow> wallyworld: natefinch should be back in a little while to wrap it up
<wallyworld> ericsnow: oh, that is awesome
<wallyworld> what was the root cause?
<axw> menn0: sorry going to meeting, will look in a little while
<ericsnow> wallyworld: the list of tools that gets passed in is basically "all available" and amd64 is the first one
<menn0> axw: no rush
<ericsnow> wallyworld: the provider was simply using the first one in the list
<ericsnow> wallyworld: needed to filter for arch first
#juju-dev 2016-04-14
<wallyworld> ericsnow: has it been like that forever? i didn't think lxc had that issue?
<ericsnow> wallyworld: this was just the lxd provider
<wallyworld> ok, that makes sense since that provider was re-written from scratch
<wallyworld> i didn't think we had this issue in 1.25
<wallyworld> with lxc
<ericsnow> wallyworld: I'm guessing no-one had tried to use the LXD provider with non-amd64 before
<wallyworld> not till now :-)
<ericsnow> wallyworld: anyway, I gotta run; natefinch can fill in the rest :)
<wallyworld> ok, ty
<wallyworld> ttyl
<axw> menn0: if you were going the first route, I'd just supply a names.Tag rather than a GlobalEntity
<menn0> axw: yep, fair enough
<axw> menn0: I'm ambivalent though. I've found having the methods on state make it a bit easier to mock in tests
<axw> but at the same time, I don't like throwing it all on the one type
<menn0> that ship has sailed I think :)
<axw> menn0: :)
<menn0> but I guess I don't have to make it worse
<menn0> axw: you're right about mocking in tests though... given that we only have concrete Machines I'm setting myself up for difficult testing if I add methods to Machine
<menn0> difficult / dumb
<axw> menn0: some packages go to lengths to mock Machine out too, but it is a PITA
<menn0> axw: especially for something so simple
<axw> menn0: not sure if you've looked at state/volume.go or state/filesystem.go, but this is why I put everything at the top level
<axw> it made testing much easier
<axw> menn0: I ended up having to have some methods on Volume/Filesystem to support, e.g. StatusSetter/StatusGetter
<menn0> yep, makes sense
<rick_h_> axw: howdy, wanted to chat on the model statuses and get your thought on something
<rick_h_> axw: since the end state is archived, what about making destroying "archiving"?
<axw> rick_h_: it did occur to me. I would say it's not really archiving at that point. It's destroying the resources within the model, it's just the model docs that are archived
<rick_h_> axw: right, but it's working toward the archived state. And it's not used elsewhere because of that idea
<rick_h_> axw: I guess it's not quite 'factual' but as far as state transitions it goes through a few moving bits on the way to archived. Picking any one points out a bit of what's up?
<axw> rick_h_: so I kinda agree that you should go through "archiving" to get to "archived", but at the same time I don't think "archiving" implies that anything is being cleaned up
<rick_h_> axw: do you have a link to the bug there? I wanted to go back and refer to the list you had but I'm failing to see the bug
<axw> but more ... put away
<axw> 1 sec
<axw> rick_h_: https://bugs.launchpad.net/bugs/1534627. I've currently got it as "active", "destroying", "archived" in my branch
<mup> Bug #1534627: Destroyed models still show up in list-models <2.0-count> <conjure> <juju-release-support> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1534627>
<rick_h_> axw: so what caused us to keep it around by default anyway?
<rick_h_> axw: vs a cleanup all the way and removal?
<axw> rick_h_: I don't know, that predates my involvement. thumper?
<rick_h_> axw: I mean it seems useful, but it seems like the exception to the rule. A destroy-model --archive that has this seems like the cleaner way
<axw> it definitely is an exception to the rule
<rick_h_> I guess I assumed we had a good reason from some stakeholder but I'm coming up empty for making this the default behavior.
<rick_h_> wallyworld: ^
<wallyworld> rick_h_: predates me too
<thumper> hmm...
<axw> rick_h_: it doesn't seem useful to me, TBH. I'd prefer to just lean on external logging to keep records
<thumper> was there mostly so we don't suddenly error
<rick_h_> axw: with the rsyslog logging work going in I'm +1 to that idea
<thumper> because the way people watch an environment being destroyed is status
<thumper> if you had "watch juju status" running
<thumper> you would watch the services go down
<thumper> machines being removed
<rick_h_> thumper: oic, so this was so that you could see it die vs go away and then if it failed to destroy you had no good way to get at it?
<thumper> then "error: unknown environment"
<thumper> mostly so you could watch it die without errors
<thumper> and grab logs... maybe
<rick_h_> axw: can you investigate if we can just destroy in a clean way and do away with the kept around models at all please?
<rick_h_> axw: I feel like we're chasing the wrong end of the problem atm
<wallyworld> rick_h_: our general pattern has been to shepherd entities through dying -> dead. until they are reaped by a cleaner, they will still be there
<axw> rick_h_: sure
<rick_h_> wallyworld: right, so can we do that better vs keep things around like this?
<thumper> the 24 hours thing was entirely arbitrary
<thumper> and something I picked out of the air
<thumper> we could have a sensible default...
<thumper> of a much smaller number
<rick_h_> thumper: yea, I mean can we just watch it until it's dead cleanly and then remove it right away?
<wallyworld> rick_h_: oh, i didn't realise they were kept for 24hrs
<rick_h_> e.g. check every 5s or something?
<wallyworld> rick_h_: i thought they were removed immediately once dead, sorry
<rick_h_> I'm all for giving the user confidence/observability in what they're doing
<thumper> it is the client going "juju status"
<thumper> and us wanting to say "environment is dead"
<thumper> rather than "error, you suck"
<rick_h_> right, so at some point it fails that "model is not available" or the like
<thumper> eventually
<rick_h_> you asked for it to go away, what do you expect?
<thumper> right now, 24 hours later
<rick_h_> yea, that's gotta go imo
<thumper> this is the job of the undertaker
<rick_h_> if we can tell from status it's dead, then we should be able to reap it right then and there
<wallyworld> +1
<rick_h_> and if status says it's not dead, then we don't reap it and you can get at it and diagnose
<wallyworld> that's how i thought it worked
<rick_h_> heh no
<wallyworld> didn't realise models were special
<rick_h_> so we were looking to do all this renaming and such which :/
<rick_h_> little snowflakes in the wind :)
<thumper> oh shit
<thumper> I think I need to work out this juju 2.0 cli thing
 * thumper needs to add a maas cloud
<thumper> perhaps after a dog walk
 * rick_h_ grumbles
<rick_h_> thumper: on that please fix the whole maas as a cloud vs not needing credentials/etc kthx
<thumper> wat?
 * thumper has no idea about that
<axw> rick_h_: as in maas showing up in list-clouds?
<rick_h_> thumper: there's a thing in that maas setup that's confusing users
<thumper> maas is a cloud isn't it?
<axw> maas is a type of cloud
<rick_h_> axw: yea, and if you do add it and then you try to add-credential it tells you maas doesn't need credentials
<axw> wat
<rick_h_> but then users don't know how to use it since it's different
<thumper> wat?
 * redir goes eod
<redir> see you tomorrow juju-dev
<rick_h_> night redir
<axw> rick_h_: I think that might be covered by anastasiamac_'s branch, http://reviews.vapour.ws/r/4573/
<axw> night redir
<anastasiamac_> rick_h_: what axw said - it's being fixed ^^
<anastasiamac_> (landing, really)
<anastasiamac_> thumper: about cloud vs cloud type, see https://bugs.launchpad.net/juju-core/+bug/1564054
<mup> Bug #1564054: lxd, maas and manual do not make sense in list-clouds <juju-release-support> <juju-core:In Progress by anastasia-macmood> <https://launchpad.net/bugs/1564054>
<wallyworld> rick_h_: what do you mean? juju add-credential maas works fine
<rick_h_> wallyworld: hmm, a user was hitting it and gave me a pastbin the other day
<wallyworld> rick_h_: that would have been an old beta
<rick_h_> wallyworld: I'm trying to find it, they got an error out of juju along the lines of "xxx does not need a credential"
<wallyworld> this is all wip
<thumper> :)
<rick_h_> heh
<wallyworld> it used to be that
<wallyworld> we have been delivering things as they are finished in each beta
<rick_h_> ah ok
<rick_h_> never mind then
<wallyworld> add credentials came late in the beta cycle
<wallyworld> rick_h_: stakeholders want maas included in list clouds
<rick_h_> wallyworld: did we have plans for an add-cloud as well?
<wallyworld> that's why we did it; we know maas is not a cloud
<rick_h_> wallyworld: well, maas clouds need to be there. The trouble is the lack of an add-cloud command for now
<anastasiamac_> wallyworld: a user would hit a "boom" if they specify cloud as part of command arguments..
<rick_h_> wallyworld: yea, I was one of those, but mark is right
<axw> rick_h_: we have add-cloud, but not interactive
<rick_h_> anastasiamac_: wallyworld so I +1 going with Mark's feedback there
<rick_h_> axw: right, interactive is what I'm thinking
<axw> you just point at a YAML file
<rick_h_> axw: not for 2.0 but we should add it down the road
<wallyworld> rick_h_: will you break the news to adam?
<axw> it would be helpful
<rick_h_> axw: right, but folks are confused as to what goes in the clouds.yaml, the credentials, and the config
 * axw nods
<rick_h_> wallyworld: stokachu?
<wallyworld> yeah
<wallyworld> he needs maas in list clouds for his app
<rick_h_> wallyworld: heh ok, will do
<rick_h_> wallyworld: I'm still -1 on the maas:/ special thing where you don't have to add it to the clouds.yaml
<wallyworld> rick_h_: that was also well received and asked for by users :-)
<rick_h_> wallyworld: bah, it just adds a 'different way to do it' that's special and unique and causes confusion; folks have to look it up
<wallyworld> rick_h_: you mean like lxd :-P
<wallyworld> there's no lxd in clouds.yaml also
<rick_h_> wallyworld: heh
<thumper> ah crap...
<wallyworld> rick_h_: the use case driving this is - people want to know what they can put with juju bootstrap <controllername> <what goes here>
<thumper> I don't think my 1gb kvm maas instances are up to running juju are they?
<thumper> wallyworld: remember when I said the whole cloud credential spec would be a big pile of work?
<rick_h_> wallyworld: yea...but they can't do maas without config info (in this case the IP address) so we've changed the bootstrap command to make it work
<wallyworld> rick_h_: altered slightly yeah, but still intuitive imo
 * thumper goes to walk the dog and get away from the computer a bit
<wallyworld> thumper: oh i know
<wallyworld> i never doubted it
<wallyworld> we piled in so much stuff in such a short time, and delivered incrementally over several betas and got beat up when it wasn't all there beta 1 :-/
<rick_h_> wallyworld: no one got beat up :P
<rick_h_> wallyworld: I couldn't reach you!
<wallyworld> lol
<wallyworld> rick_h_: not beat up directly
<rick_h_> :P
<wallyworld> , more complaints :-)
<rick_h_> wallyworld: we just had an opinion that add-credential should have come first :P
<rick_h_> not complaints, opinions...I hear it's like something else everyone has :)
<wallyworld> rick_h_: sure, but it was harder to add and was icing - we needed the core functionality with credentials added by hand. having fancy add credentials with nothing working would not have been cool
<rick_h_> wallyworld: understand completely...after I stopped and thought about it :P
<wallyworld> :-)
<wallyworld> rick_h_: next time we'll start the work at the beginning of the cycle, not a month or 2 out :-)
<rick_h_> wallyworld: psh, don't go changing everything on me
<wallyworld> great, now i've got that song stuck in my head
<rick_h_> lol, glad to be of service
<wallyworld> i love you just the way you are.... ta de dum
<wallyworld> axw: got 5 minutes, standup?
<axw> wallyworld: sure, brt
<bradm> should juju2 be able to bootstrap against a private openstack cloud? I'm getting:
<bradm> 2016-04-14 01:30:25 ERROR cmd supercommand.go:448 failed to bootstrap model: model "admin" of type openstack does not support instances running on "amd64"
<bradm> my index.json does list 14.04 images...
<bradm> ah, when I don't use --upload-tools it gives:
<bradm> 2016-04-14 02:05:56 ERROR cmd supercommand.go:448 failed to bootstrap model: cannot start bootstrap instance: no "trusty" images in bootstack-canonistack-bos01 matching instance types [m1.small m1.medium m1.large m1.xlarge]
<thumper> wallyworld: ping
<wallyworld> wot
<wallyworld> thumper: ?
<thumper> wallyworld: got a few minutes to chat?
<wallyworld> sure
<wallyworld> bradm: juju 2 should work similar to juju 1 for private clouds
<wallyworld> it's all about getting the correct streams metadata which can be tricky
<bradm> wallyworld: just filed LP#1570162 about it
<wallyworld> but if it works for juju 1 it should work for juju 2
<wallyworld> ok, we'll look at the bug
<bradm> let me know if you need any more info about it
<thumper> wallyworld: 1:1 ?
<wallyworld> yup
<bradm> wallyworld: this is mitaka on trusty, and it has different arch compute nodes too
<bradm> grabbing some lunch, back in a while
<mup> Bug #1570162 opened: juju2 openstack private cloud cannot start bootstrap instance <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1570162>
<mwhudson> wallyworld: juju-mongo-tools3.2 just got accepted \o/
 * mwhudson runs away for a bit
<wallyworld> mwhudson: awesome
<menn0> axw: I was thinking that the machiner would send the SSH host key to the state server (via the machiner facade)
<axw> menn0: sounds fine to me
<menn0> axw: seem reasonable?
<natefinch> wallyworld: btw, this is the fix for the lxd arch bug that I'm pretty sure works: https://github.com/juju/juju/pull/5116/files. But I still can't quite fully test due to not being able to fully disable upload-tools.. I haven't figured out how to disable saving the tools (I've found a couple likely spots and commented out code, but to no avail).
<wallyworld> ok, i'll look at pr, thanks
<wallyworld> there's a toolsstorage that manages the tools in gridfs
<natefinch> wallyworld: I commented out this whole loop: https://github.com/juju/juju/blob/master/apiserver/tools.go#L262 which looks like the place where we store the tools after uploading them... but it didn't seem to change anything
<wallyworld> natefinch: why not just make storage.Add() itself a no op
<wallyworld> natefinch: anyway, here's what you wanted
<wallyworld> fetchAndCacheTools
<wallyworld> or maybe not, that is when we download then from streams
<wallyworld> it's in jujud/bootstrap.go
<wallyworld> 		logger.Debugf("Adding tools: %v", toolsVersion)
<wallyworld> 		if err := toolstorage.Add(bytes.NewReader(data), metadata); err != nil {
<wallyworld> i just searched for usages of toolsstorage.Add()
<wallyworld> there's only a few to check
<natefinch> wallyworld: thanks... I really wasn't sure what to search for
<wallyworld> just the .Add()
<wallyworld> and see what calls it
<wallyworld> as that's where tools are added to state
<natefinch> wallyworld: yes but, I didn't know it was called .Add
<wallyworld> but you commented it out so you must have ?
<wallyworld> in that place
<wallyworld> anyways, you'll need to talk to john about that todo
<natefinch> wallyworld: I looked in the server code for where we were handling the http post for tools upload
<wallyworld> as he is making lxd work on provisioned machines
<wallyworld> sure, and toolsstorage.Add() was right there
<natefinch> wallyworld: it never occurred to me that there would be code in the client that would be identical
<wallyworld> bootstrap is special
<natefinch> wallyworld: indeed
<wallyworld> it has to have logic built in as there's no server running yet
<wallyworld> so to recap, we'll need to get that todo sorted asap
<wallyworld> i think
<natefinch> not really
<natefinch> this is the lxd provider
<natefinch> we only support localhost
<wallyworld> actually, yeah maybe not
<wallyworld> it's only local yeah
<natefinch> some day we maybe might support a remote host. Today's not that day :)
<wallyworld> i was confusing the provisioner
<natefinch> yeah, it's confusing
<wallyworld> good that it's fixed, ty
<natefinch> well, I'll go test it, but I'm pretty certain that fixes it
<wallyworld> yeah, the code looks correct
<natefinch> there was a suspicious log message on boot that we were saving amd64 tools, even though we were bootstrapping with arm64... and that message changes to arm64 now... but I have to do the full test with upload-tools disabled to know for sure.
<wallyworld> natefinch: maybe not - cloud init gets tools via the controller which acts as a proxy
<wallyworld> so even if upload tools is used, you just trace requests into the controller
<wallyworld> i didn't think of that previously
<wallyworld> so just trace what the tools download handler gets
<natefinch> I'm building on the arm64 machine which is just about the slowest machine in existence, I'm pretty sure
<thumper> quick
<thumper> someone
<thumper> where is the cloudinit data stored
<thumper> on the cloud machine
<thumper> I need to get it off before the code kills the machine
<natefinch> dunno, sorry
<thumper> menn0: ^^?
<thumper> juju brought the machine up but can't ssh in
<thumper> axw: hey...
<thumper> this may be part your history
<thumper> 2016-04-14 03:33:01 DEBUG juju.provider.common bootstrap.go:328 connection attempt for 192.168.100.3 failed: /var/lib/juju/nonce.txt does not exist
<thumper> axw: we just bring up a machine with ssh keys
<thumper> and I'm guessing that file
<axw> thumper: yes, so we know we've connected to the right machine
<thumper> axw: but that file isn't there
<axw> thumper: something's not doing the right thing with cloud-init then I guess?
<thumper> axw: know where the cloud init file is?
<axw> thumper: /var/lib/cloud I think?
 * axw rummages
<thumper> I have user_data.txt from there
<thumper> but it is base64 encoded
<thumper> and decoded looks binary
<axw> thumper: there should be a plaintext file nearby
<axw> thumper: instance/cloud-config.txt maybe
<thumper> doesn't exist
<thumper> oh
<thumper> zero bytes
<axw> thumper: interesting, I don't think I've ever seen a zero-byte one. maybe part of the problem.
<thumper> hmm...
<thumper> I think I'll add logging to the creation of the user data
<mup> Bug #1570175 opened: juju2 kill-controller doesn't work when bootstrap server is unreachable <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1570175>
<thumper> axw: we are this |-| close to having maas2 boot
<axw> thumper: cool :)
<natefinch> ahh man, the fact we can't cross compile is killing me on this dumb arm64 machine
<natefinch> my laptop compiles juju in like 10 seconds.  This machine takes like 3 minutes
<menn0> thumper: sorry, missed your message... I don't know the answer either (cloud-init newbie)
<thumper> well, I've added some debugging code to the cloud init renderer
<thumper> which I found...
<menn0> thumper, axw: would the cloudinit logs give some clues? (/var/log/cloud-init.output and cloud-init.log)
<thumper> so I can catch it as it is yaml-ified, before gzip, base64
<thumper> ok, got the cloud init
<thumper> but it is kinda big...
 * thumper looks deeper
<axw> menn0 thumper: *is* there a /var/log/cloud-init-output.log?
<thumper> I'll look this time
<thumper> last time 10 minutes passed and machine was released
<thumper> - install -D -m 644 /dev/null '/var/lib/juju/nonce.txt'
<thumper> - printf '%s\n' 'user-admin:bootstrap' > '/var/lib/juju/nonce.txt'
<thumper> from the local userdata it is sending down
<thumper> not much of a nonce :)
<axw> thumper: heh yeah, but it's better than connecting to some random machine on your local network and running juju setup on it :)
<thumper> true
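A rough Go sketch of the check behind that "connection attempt failed" message: ssh in, read the nonce cloud-init wrote, and compare it with the expected value. The function shape and ssh invocation are illustrative assumptions, not juju's actual bootstrap code:

    package main

    import (
    	"fmt"
    	"os/exec"
    	"strings"
    )

    // checkNonce connects to a freshly provisioned machine and verifies
    // the nonce file, so we know we reached the machine we provisioned
    // rather than some random host answering on that address.
    func checkNonce(host, expected string) error {
    	out, err := exec.Command("ssh", "ubuntu@"+host,
    		"cat /var/lib/juju/nonce.txt").Output()
    	if err != nil {
    		return fmt.Errorf("/var/lib/juju/nonce.txt does not exist: %v", err)
    	}
    	if got := strings.TrimSpace(string(out)); got != expected {
    		return fmt.Errorf("machine reported nonce %q, want %q", got, expected)
    	}
    	return nil
    }

    func main() {
    	if err := checkNonce("192.168.100.3", "user-admin:bootstrap"); err != nil {
    		fmt.Println(err)
    	}
    }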
<thumper> 2016-04-14 03:47:40,431 - __init__.py[WARNING]: Unhandled non-multipart (text/x-not-multipart) userdata: 'H4sIAAAJbogA/+w7aXPbuJLf...'
<thumper> WTF...
<thumper> basically it stopped cloudinit
<axw> thumper: nice :)
<axw> thumper: so MAAS is just dropping it on the floor?
 * thumper shrugs
<thumper> I'll have to dig a bit more
<axw> thumper: people have complained about the nonce.txt file being missing before, I never could repro tho
<axw> maybe it depends on contents/padding/something
<natefinch> wallyworld: huzzah, success
<wallyworld> great
<wallyworld> after waiting ages for a compile
<natefinch> wallyworld: well, I kept getting stupid compile errors... finally realized I could just slap a return nil before the code I wanted to avoid
<wallyworld> thumper: anastasiamac_ committed a fix for that maas bug you saw with listing credentials
<bradm> wallyworld: hmm, I can bootstrap with juju2 beta 3 on canonistack-lcy01, but not on this new mitaka cloud
<wallyworld> bradm: it may not have the flavours set up correctly?
<wallyworld> juju uses those plus arch to determine the instance id
<wallyworld> and matches it all with simplestreams
<wallyworld> it all needs to match up correctly
<bradm> wallyworld: it has flavours, I'm not sure how you have to set them up, we've never done anything about mapping them before
<bradm> wallyworld: I can certainly boot instances, although that is done by specifying an image id
<wallyworld> right, it appears the image id selection algorithm fails due to reasons
<wallyworld> it would need investigation to see what's wrong with the set up
<bradm> I can see all 3 arches in the index.json
<wallyworld> i personally have not internalised the algorithm - would need to trace it all out and see what's needed where
<bradm> something must have changed with mitaka
<bradm> or juju2
<bradm> hm, thats a point, I should try juju1 on it
<wallyworld> bradm: also anastasiamac_ thinks there could be a related (same?) issue that is fixed in beta 4
<bradm> wallyworld: yeah, maybe I should just wait until beta 4 is out
<bradm> are we there yet?  are we there yet? ;)
<wallyworld> bradm: in maybe 24 hrs tops
<wallyworld> in time for xenial cut off
<bradm> wallyworld: awesome.  I'll try out juju1 on this to see how it goes, might narrow things down a bit.
<wallyworld> bradm: yes please, that data point would be helpful in case there's still an issue
<wallyworld> juju1 and 2 should behave the same AFAIK
<wallyworld> that bug mentioned above may well be an existing issue
<wallyworld> i don't know the details
<bradm> a newly released mitaka being deployed via unreleased charms and trying to use a beta juju on top of it?  nah, couldn't be any problems there. :)
<wallyworld> so we'll take it a step at a time: try juju1, wait till beta4
<wallyworld> of course not
<wallyworld> what could possibly go wrong
<bradm> its a house of unreleased cards
<bradm> but thats why we're doing it, to help bash out any bugs early
<bradm> wallyworld: aha, it fails in the same way
<wallyworld> \o/
<wallyworld> not juju2 :-)
<bradm> the index.json is pretty simple
<bradm> oho
<bradm> arch: x86_64
<bradm> why is that there
<natefinch> whelp, I have a PR up for that fix here: http://reviews.vapour.ws/r/4555/  I haven't been able to successfully change the tests such that they actually fail with the old code, but it's time for bed.
<bradm> wallyworld: hah, that was it.  for some reason the version of glance-simplestreams-sync we have was canonicalizing the arch from amd64 to x86_64.  now I get a different error. :)
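What bit bradm here, in miniature: simplestreams matching uses Ubuntu-style arch names, so a uname-style "x86_64" in the image metadata never matches "amd64". A standalone sketch of the normalisation (juju keeps a helper along these lines, NormaliseArch in juju/utils/arch, IIRC):

    package main

    import "fmt"

    // aliases maps kernel/uname arch names to the Ubuntu-style names
    // juju and simplestreams use.
    var aliases = map[string]string{
    	"x86_64":  "amd64",
    	"aarch64": "arm64",
    	"ppc64le": "ppc64el",
    }

    func normaliseArch(raw string) string {
    	if a, ok := aliases[raw]; ok {
    		return a
    	}
    	return raw
    }

    func main() {
    	fmt.Println(normaliseArch("x86_64")) // amd64
    }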
<wallyworld> bradm: progress!
<bradm> wallyworld: now it wants me to give it a network id
<wallyworld> what is "it"?
<bradm> the error from the bootstrap
<wallyworld> i haven't done much with networking, not sure
<bradm> Multiple possible networks found, use a Network ID to be more specific.
<bradm> that feels like a nova error, though.
<wallyworld> yes it does
<wallyworld> could be propagated back from start instance
<bradm> how do you tell juju2 what the default network is, like the network setting in juju1
<bradm> just trying to add it where you define the endpoint
<bradm> nope, no go :-/
<bradm> do I just mark the bug as invalid?
<wallyworld> bradm: juju has spaces and things. but if you are asking about config, instead of foo=bar in environments.yaml, you use --config "foo=bar" on the bootstrap cmd line
<wallyworld> juju2 i mean
<wallyworld> if it's a set up issue with openstack, then yeah, bug may be invalid
<bradm> wallyworld: thats got it
<bradm> wallyworld: at least, its trying to get an address now, so its much further than before.  looks like we need a way to set the default network
<wallyworld> that worked?
<wallyworld> ok
<wallyworld> that bit i am not sure of
<wallyworld> bradm: dimiter and andy are the folks to ask about that
<bradm> oh boy, not quite.  now it errored out with:
<bradm> error: flag provided but not defined: --model-config
<wallyworld> bradm: you need upload-tools if you are running from trunk
<bradm> wallyworld: that was with upload-tools...
<bradm> trying now without
<wallyworld> the error looks like it is because the jujud binary from tools is old
<bradm> it seems to be getting further without the upload tools
<bradm> oh, still fails.
<bradm> https://pastebin.canonical.com/154283/ <- error message
<bradm> maybe I should just wait for beta4
<menn0> axw: here's the state part and some of the API work for storing SSH host keys. http://reviews.vapour.ws/r/4586/
<axw> menn0: cool, looking
<wallyworld> bradm: do you have the latest code checked out? that could explain why upload tools give poor results. also that error - is the auth url correct? just a guess
<bradm> wallyworld: nope, I'm just using the beta3 version from the ppa
<wallyworld> bradm: in that case upload tools does nothing
<bradm> wallyworld: that can't be true, I saw very different things
<wallyworld> ok, so it grabs the first juju binary from the search path and uses that
<menn0> axw: next up... some routines in juju/utils for parsing host key files and generating known hosts files
<wallyworld> bradm: or builds from source, but you don't have source checked out
<bradm> wallyworld: and what auth url do you mean?  the endpoint in the clouds.yaml
<wallyworld> bradm: so there may be another juju in the path
<wallyworld> bradm: for openstack, yeas i think so
<bradm> wallyworld: there's no juju source on this box at all
<bradm> wallyworld: there is juju 1.25.5
<wallyworld> bradm: so in that case upload tools is using those binaries
<wallyworld> which explains a lot
<wallyworld> the --model-config unknown etc
<bradm> I used juju 1.25.5 to deploy openstack
<wallyworld> sure but you said you were using juju2 from a ppa
<bradm> yeah, juju2 from ppa, juju1 from ppa
<wallyworld> so if you use a juju2 client and upload tools picks the 1.25 binaries it will screw up
<wallyworld> don't use upload tools please :-)
<wallyworld> unless you are a developer and have source
<bradm> righto.
<bradm> I'm just trying different things to work out what its doing :)
<wallyworld> bradm: so now it may be that keystone is messing up
<wallyworld> bradm: i would love upload tools to be removed tbh
<wallyworld> it causes too many issues unless used under strict conditions
<bradm> I'm sure we've had to use it to fix things in the past
<bradm> but yeah, if its no longer of use
<wallyworld> with source yes, or custom binaries put in the right place
<wallyworld> it has a use, but you need to be careful
<bradm> wallyworld: I've been using juju since the python days, I'm sure I've got all sorts of redundant things in my brain about it :)
<wallyworld> and in this case, it caused "weird" errors until i found out you didn't have source code and had 2 versions etc
<wallyworld> bradm: understood. you deserve a medal :-)
<wallyworld> for consuming all of our bugs for so long
<mup> Bug #1570162 changed: juju2 openstack private cloud cannot start bootstrap instance <canonical-bootstack> <juju-core:Invalid> <https://launchpad.net/bugs/1570162>
<bradm> oh my
<bradm> that error message really is true, I can't reach the keystone IP from a VM
<wallyworld> at least juju is not lying :-)
<bradm> juju would never lie to us, would it?
<wallyworld> *never*
<bradm> missing a route.
<axw> menn0: code looks good, but I have a question about the structure of the keys
<axw> (in RB)
<menn0> axw: can the SSH server really have multiple host keys for a given algorithm?
<axw> menn0: I think so, but I'll test to make sure
<menn0> axw: I just checked the man page... I don't think it's possible
<axw> menn0: sshd_config? what part?
<menn0> axw: man sshd
<bradm> juju2 likes leaving secgroups around, just hit a quota limit
<menn0> axw: the "-h host_key_file" part and the FILE section
<menn0> FILES
<menn0> axw: there's one host key file per key/algorithm type
<axw> menn0: AFAIK they're just the default ones, referenced by /etc/ssh/sshd_config
<menn0> axw: the wording in man sshd_config is more vague about it
<menn0> axw: I guess it's safer to use []string (with no real downside)
<menn0> axw: good catch
<menn0> axw: in that case, I'll just store the key files in state verbatim (they're one line each)
<menn0> axw: and handle the parsing and reformatting in the client when it generates the bespoke known_hosts file
<axw> menn0: SGTM. FWIW, starting sshd with multiple RSA keys works fine
 * axw nods
<menn0> axw: you just tried it?
<axw> menn0: yep
<axw> menn0: BTW, there's a function that you can use to parse the public keys: https://godoc.org/golang.org/x/crypto/ssh#ParseAuthorizedKey
<menn0> axw: ok cool.. that's definitely the right approach then
<menn0> axw: good to know... I just found something in juju/utils/ssh which also does it :)
<axw> heh
<axw> can't have too many
<menn0> axw: maybe the keys should be parsed to a (type, keydata, comment) struct and sent and stored that way?
<bradm> oh, you can just run juju2 enable-ha ?
<bradm> er, can't just
<menn0> axw: rather than the raw string
<axw> menn0: *shrug* if it's easy enough to do without losing any info, maybe
<axw> I'm not sure it's worth the effort tho
<axw> IOW, authorized key format is already perfect information, so probably not worth destructuring at that point unless we think we're going to query on the individual fields
<menn0> axw: you're right
<menn0> axw: what threw me a little was that the known_hosts file doesn't include the comment field on my machine
<menn0> axw: so I was thinking it would need to be stripped
<menn0> axw: but looking at the docs, it's fine if it's there
<menn0> axw: []string it is
<axw> menn0: cool. a comment saying that it's in authorized_keys format would be helpful
<menn0> axw: yep, will add.
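The round trip below shows why storing the raw authorized_keys-format line loses nothing: type, key material and comment are all recoverable on demand via the ParseAuthorizedKey helper axw points to. The throwaway RSA key is generated in-process just to keep the sketch self-contained; on a controller the line would come from /etc/ssh/ssh_host_*_key.pub:

    package main

    import (
    	"crypto/rand"
    	"crypto/rsa"
    	"fmt"
    	"log"

    	"golang.org/x/crypto/ssh"
    )

    func main() {
    	priv, err := rsa.GenerateKey(rand.Reader, 2048)
    	if err != nil {
    		log.Fatal(err)
    	}
    	pub, err := ssh.NewPublicKey(&priv.PublicKey)
    	if err != nil {
    		log.Fatal(err)
    	}
    	line := ssh.MarshalAuthorizedKey(pub) // one authorized_keys-format line

    	key, comment, _, _, err := ssh.ParseAuthorizedKey(line)
    	if err != nil {
    		log.Fatal(err)
    	}
    	fmt.Println(key.Type(), comment)
    }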
<bradm> wallyworld: yeah, this network thing is going to be a blocker.  need a way to set it as a default somewhere.   just setup a multiuser env, tried to boot a VM and it errored out with the same thing about multiple networks
<wallyworld> bradm: i recall conversations in this area but not any specifics, not sure of the status
<wallyworld> aybe there's a solution already, i just don't know it
<wallyworld> axw_: ping
<axw_> wallyworld: pong
<wallyworld> axw_: stupid question of the day, i'll make a dick of myself i'm sure. can you look at line 71 of environ_broker.go in the lxd provider
<axw_> wallyworld: yep?
<wallyworld> should be finishInstanceConfig()
<wallyworld> the arg struct is passed by value
<wallyworld> so how will it ever work
<axw_> wallyworld: pretty sure InstanceConfig is a pointer
<axw_> yep
<wallyworld> ah right
<wallyworld> yes i missed that
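The confusion in miniature: the params struct really is copied when passed by value, but a pointer field inside it still points at the caller's data, so mutations stick. Illustrative types only, not the provider's real ones:

    package main

    import "fmt"

    type InstanceConfig struct{ Tools string }

    type StartInstanceParams struct {
    	InstanceConfig *InstanceConfig // pointer: shared across copies
    }

    // finishInstanceConfig receives a copy of params, but the copy's
    // InstanceConfig field aliases the caller's struct.
    func finishInstanceConfig(params StartInstanceParams) {
    	params.InstanceConfig.Tools = "2.0-beta4" // visible to the caller
    }

    func main() {
    	cfg := &InstanceConfig{}
    	finishInstanceConfig(StartInstanceParams{InstanceConfig: cfg})
    	fmt.Println(cfg.Tools) // 2.0-beta4
    }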
<axw_> well shit. I just bootstrapped and it panicked on the server due to "send on closed channel" in the systemd package
<wallyworld> yay
<menn0> axw_: please take another look at http://reviews.vapour.ws/r/4586/diff/
<menn0> axw_: no rush as I'm about to EOD. but if you could look before you finish that would be great
<axw_> menn0: no worries, have a nice evening
<menn0> axw_, wallyworld : it's feeling like this ssh host key handling issue is going to be easier to lick than it seemed (still a bit to do I realise)
<wallyworld> win
<axw_> wallyworld: is there an agenda for the sprint yet? formal or informal
<axw_> wallyworld: I mean, topic list we're compiling
<axw_> wallyworld: CI for storage really needs to happen
<wallyworld> axw_: sort of - right now it's digesting the roadmap wish list
<axw_> it's been broken in master since last year, in Malta...
<axw_> yep
<axw_> ok
<wallyworld> axw_: agreed about CI for storage. can we discuss in 1:1 tomorrow?
<axw_> wallyworld: sure
<mwhudson> wallyworld: is there any other packaging stuff juju is waiting on?
<mwhudson> other than juju itself ;-p
<wallyworld> not that i know of
<wallyworld> well do want mongo 3.2 in trusty and wily at some stage
<wallyworld> soon hopefully :-)
<mwhudson> wallyworld: somehow the deadlines on those don't seem so tight
<mwhudson> eg i guess i should backport go 1.6.1 to trusty...
<wallyworld> mwhudson: they are not as tight, but we do want a consistent mongo experience across series at some stage
<mwhudson> i guess we can find out if the packages build for a start
<wallyworld> axw_: could you look at http://reviews.vapour.ws/r/4587/ ? i want to land it because CI is failing with lxd on arm. i have taken nate's work and added tests
<wallyworld> i want to try and get this in for beta4
<axw_> wallyworld: looking
<wallyworld> ta
<bradm> I've filed LP#1570219 if anyone who knows more about the networking side of things could take a look, that'd be great, thanks.
<axw_> wallyworld: done
<wallyworld> ta
<mup> Bug #1570216 opened: juju2 not cleaning up nova secgroups with openstack provider <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1570216>
<mup> Bug #1570219 opened: juju2 openstack provider setting default network <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1570219>
<fwereade__> ashipika, cmars: responded to a couple of bits of review, went into detail with the problems with the idiosyncratic approach to workers; we should probably all talk about this live today
<ashipika> fwereade__:  thanks.. it's ok.. i was not going to land this PR as it is.. i still welcome any and all comments, but i'll be breaking it up into smaller manageable bits, i expect..
<fwereade__> ashipika, cool, thanks, let me know if you want to talk about any of it
<ashipika> fwereade__: i expect i will.. many times along the way :)  but this week we have our priorities elsewhere, so this might have to wait a bit.. anyways it needs to land by may, if i understand the timeline..
<fwereade__> ashipika, cool
<ashipika> fwereade__: and the thing is: there was no spec.. it was just "oh, this needs to work.. "
<fwereade__> ashipika, yeah, that was rather my reading of it
<ashipika> fwereade__: we kept asking for more details, but nothing came back.. other than DTAG wants rsyslog forwarding
<voidspace> babbageclunk: so, in my test for waitForNodeDeployment I'm seeing the same NotFound error
<voidspace> babbageclunk: so looks like a straightforward bug
<babbageclunk> voidspace: yay!
<babbageclunk> voidspace: with my stuff landed now, what should I pick up?
<voidspace> babbageclunk: maas2Instance.volumes would be good
<babbageclunk> voidspace: ok
<voidspace> babbageclunk: there's a list at the top of the status document of tasks
<voidspace> babbageclunk: you could take on fixing the behaviour when you run against MAAS2 without the feature flag
<voidspace> babbageclunk: currently it just panics
<voidspace> babbageclunk: instead we should detect MAAS 2 and exit with an error instead
<voidspace> babbageclunk: that's easy enough to do - might be good to get that in first
<voidspace> babbageclunk: so attempt to create the controller even without the feature flag
<voidspace> babbageclunk: if it succeeds then we're on MAAS 2
<voidspace> babbageclunk: if we don't have the feature flag error out with a NotSupported error
<babbageclunk> voidspace: yeah, I'll do that first.
<mwhudson> wallyworld: juju-mongo* stuff builds on trusty with a bit of flailing for -tools https://launchpad.net/~mwhudson/+archive/ubuntu/devirt/+packages/?field.series_filter=trusty
<voidspace> babbageclunk: ah no - my NotFound error is because my test doesn't give the fakeController.Machines method anything to return
<voidspace> babbageclunk: so not the same bug...
<voidspace> babbageclunk: ooh, see Tim's status update - he did some work for us
<voidspace> babbageclunk: and got bootstrap further
 * thumper wonders if he is in a different hangout to the others
<voidspace> thumper: morning
<voidspace> thumper: keen bean
<voidspace> babbageclunk: just added a new task to the list - use maas2NetworkInterfaces from StartInstance
<voidspace> babbageclunk: should be trivial, but needs a test as well
<voidspace> babbageclunk: (a test at the StartInstance level)
<voidspace> dimitern: frobware: babbageclunk: http://pastebin.ubuntu.com/15826362/
<voidspace> dimitern: frobware: babbageclunk: that's real progress
<dimitern> voidspace: awesome!
<frobware> voidspace: indeed
<dimitern> voidspace: I see you're possibly hitting the same ssh issue I had - my key is ssh-dss, apparently no longer considered secure
<dimitern> voidspace: but I found a way around it, if you need
<voidspace> dimitern: go ahead
<voidspace> dimitern: I thought this was the issue tim had with not being able to ssh in
<dimitern> might still be that, but try this:
<voidspace> dimitern: can you join us in #maas on canonical
<voidspace> dimitern: (as well)
<dimitern> voidspace: http://paste.ubuntu.com/15826416/
<dimitern> ah, I thought I'm there already
<dimitern> voidspace: so replace the IP ranges to match yours - the important bits are the last 4 sections
<voidspace> dimitern: thanks
<voidspace> dimitern: /etc/ssh/ssh_config
<dimitern> voidspace: ~/.ssh/config
<voidspace> dimitern: cool, thanks
<voidspace> dimitern: can you confirm that we gzip userdata for allenap in #maas on canonical irc
<dimitern> voidspace: looking
<voidspace> retrying
<voidspace> (the bootstrap I mean)
<mup> Bug #1570269 opened: state: ensure that Models are always paired with the correct State <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1570269>
<babbageclunk> voidspace: awesome
<thumper> voidspace: no, that is a different bug
<TheMue> morning
<thumper> 2016-04-14 03:57:54 ERROR cmd supercommand.go:448 failed to bootstrap model: waited for 10m0s without being able to connect: /var/lib/juju/nonce.txt does not exist
<thumper> o/ TheMue
 * thumper is outa here
<voidspace> thumper: o/
<TheMue> n8 thumper
<TheMue> :)
<mup> Bug #1570285 opened: worker/undertaker: update status with remaining resources <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1570285>
<babbageclunk> dimitern: how do I bootstrap to a different region in AWS? If I use --to zone=eu-west-1 I get this: http://pastebin.ubuntu.com/15827282/
<babbageclunk> dimitern: also, why do I always need to specify --upload-tools?
<mwhudson> babbageclunk: i don't know but eu-west-1 is a region, not a zone
<dimitern> babbageclunk: juju bootstrap <controller-name> aws/<region> ...
<babbageclunk> mwhudson: Ah, ok - I was following some old docs, I think
<dimitern> babbageclunk: since that changed I tend to keep this around for reference: http://paste.ubuntu.com/15127859/
<babbageclunk> dimitern: thanks
<babbageclunk> dimitern, mwhudson - also, juju help placement led me astray too (I guess there are people writing new help messages).
<dimitern> babbageclunk: docs can be improved indeed, but have you tried 'juju help bootstrap' ?
<babbageclunk> dimitern: yup, that was how I got to 'juju help placement'
<dimitern> babbageclunk: ah :) I see - "placement" is related to the --to argument
<frobware> dimitern: can you drop into the sapphire standup HO?
<dimitern> frobware: sure, just a sec
<frobware> thx
<babbageclunk> dimitern: Right - totally missed the bit at the top of the bootstrap help - was confused because the placement docs matched what I saw in the old docs on the web.
<babbageclunk> dimitern: thanks!
<voidspace> frobware: dimitern: babbageclunk: http://reviews.vapour.ws/r/4591/
<dimitern> voidspace: LGTM
<perrito666> bbl
<mup> Bug #1570368 opened: juju commands timeout while a bootstrap is in process <conjure> <juju-core:New> <https://launchpad.net/bugs/1570368>
<voidspace> dimitern: thanks
 * voidspace lunches
<mattyw> fwereade_, ping?
<mattyw> fwereade_, I have some questions if you can spare 5 minutes?
<babbageclunk> voidspace: what kind of error should I return for when the endpoint's MAAS 2 but the flag isn't set? errors.NotSupportedf?
<babbageclunk> voidspace: struggling to give an error message that makes sense with " not supported" stuck on the end.
<babbageclunk> voidspace: ok "unless the 'maas2' feature flag is set MAAS 2 is"
<dimitern> babbageclunk: for those cases there's also a errors.NewNotSupported(nil, fmt.Sprintf("fmt str", args,...)) you can use
<babbageclunk> dimitern: great, thanks¬
<babbageclunk> oops, !
<katco> morning all
<natefinch> morning katco
<katco> and actually need to reboot... brb
<voidspace> babbageclunk: just errors.New and a sensible error message of your choice will be fine
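The three options on the table, sketched with github.com/juju/errors: NotSupportedf always appends " not supported", NewNotSupported takes the message verbatim while still satisfying errors.IsNotSupported, and plain errors.New works if nothing needs to test the error's type:

    package main

    import (
    	"fmt"

    	"github.com/juju/errors"
    )

    func main() {
    	awkward := errors.NotSupportedf("MAAS 2 without the 'maas2' feature flag")
    	free := errors.NewNotSupported(nil,
    		"MAAS 2 is not supported unless the 'maas2' feature flag is set")

    	fmt.Println(awkward)                     // "... not supported" stuck on the end
    	fmt.Println(free)                        // message as written
    	fmt.Println(errors.IsNotSupported(free)) // true
    }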
<babbageclunk> voidspace: hello
<voidspace> babbageclunk: hello
<babbageclunk> voidspace: so, obviously that change was trivial
<voidspace> babbageclunk: cool
<babbageclunk> voidspace: but working out why the tests weren't failing already in the same way wasn't
<voidspace> babbageclunk: don't forget to update the status doc
<voidspace> babbageclunk: hah
<voidspace> babbageclunk: I added the feature flag to our tests at some point
<babbageclunk> voidspace: it turns out maas2 just returns some html when you ask for /1.0/version/
<babbageclunk> voidspace: rather than 404ing
<voidspace> babbageclunk: ah, it used to return null
<voidspace> babbageclunk: they've changed it
<voidspace> dimitern: babbageclunk: I'm leaving early today to go to a tattooist
<voidspace> dimitern: babbageclunk: then I'm coming back in again later
<babbageclunk> voidspace: well, it returns null when you parse it as json
<dimitern> voidspace: ok, have phun ;)
<voidspace> ah
<voidspace> dimitern: I will
<voidspace> babbageclunk: that makes sense
<voidspace> babbageclunk: well, not returning a 404 doesn't make sense
<voidspace> but there you go
<babbageclunk> voidspace: have a nice tattoo appointment!
<voidspace> babbageclunk: I'm sure I will, not going yet - but soonish
<babbageclunk> voidspace: won't forget the doc this time, sorry!
<voidspace> heh, np
<mup> Bug #1453805 opened: Juju takes more than 20 minutes to enable voting <blocker> <ci> <ensure-availability> <intermittent-failure> <regression> <juju-core:Triaged by menno.smits> <juju-core 1.23:Fix Released by menno.smits> <juju-core 1.24:Fix Released by menno.smits> <https://launchpad.net/bugs/1453805>
<babbageclunk> voidspace: still around?
<voidspace> dimitern: ping
<voidspace> babbageclunk: yes
<voidspace> babbageclunk: I might have found the bug
<voidspace> babbageclunk: gomaasapi does base64 encoding for us, and so do we
<babbageclunk> voidspace, dimitern, frobware: http://reviews.vapour.ws/r/4595/
<rick_h_> voidspace: do you know if frobware is around today?
<babbageclunk> voidspace: Oops
<voidspace> rick_h_: he was earlier, yes
<babbageclunk> voidspace: nice
<rick_h_> voidspace: k, ty
<babbageclunk> voidspace: Is there an easy way to explore the maas api?
<voidspace> babbageclunk: I use the CLI...
<babbageclunk> voidspace: ah, I keep forgetting about that.
<babbageclunk> voidspace: thanks
<babbageclunk> voidspace: the docs are singularly unhelpful.
<frobware> rick_h_: yep, here, but IRC dropping out a lot atm
<rick_h_> frobware: ah ok, I asked stokachu to shoot you an email on a potential network/bridge issue he was seeing last night
<katco> natefinch: standup time
<rick_h_> frobware: wanted to let you know I asked him to and I know you've been doing MAAS2/bug stuff but wanted to see if you or someone could poke at it and see if it's a bug or working as intended/etc
<alexisb> rick_h_, we should have him open a bug so we can get it on the squad board
<alexisb> that is where the full team is pulling priority bugs
<rick_h_> alexisb: rgr, the question was "is this a bug?" so just wanted to make sure first
<frobware> rick_h_: I semi-stalled on an answer to stokachu. tych0 is proposing a patch for the problems discussed in that email. I also owe tych0 a patch too.
<katco> natefinch: ping?
<rick_h_> frobware: ok, cool. Ignore me then.
<natefinch> katco: sorry
<natefinch> katco: lost track of time, coming
<voidspace> dimitern: babbageclunk: frobware: removing the extra base64 encode from gomaasapi fixes the issue Tim reported this morning
<voidspace> and now we die in a new way
<frobware> voidspace: that was base64 on base64 then?
<dimitern> voidspace: ah, good - too much encoding then :)
<voidspace> frobware: yep
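The "base64 on base64" bug in miniature: juju gzips and base64-encodes the userdata, and the extra encode wraps it again, so after one layer is stripped cloud-init holds base64 text instead of gzip bytes - plausibly the 'H4sIA...' warning thumper hit earlier, since H4sI is the gzip magic number in base64:

    package main

    import (
    	"bytes"
    	"compress/gzip"
    	"encoding/base64"
    	"fmt"
    )

    func main() {
    	var buf bytes.Buffer
    	gz := gzip.NewWriter(&buf)
    	gz.Write([]byte("#cloud-config\nruncmd: ['touch /var/lib/juju/nonce.txt']\n"))
    	gz.Close()

    	once := base64.StdEncoding.EncodeToString(buf.Bytes())
    	twice := base64.StdEncoding.EncodeToString([]byte(once)) // the extra encode

    	fmt.Println(once[:4])  // H4sI - what cloud-init can gunzip
    	fmt.Println(twice[:4]) // SDRz - what it actually received
    }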
<tych0> rick_h_: yeah, i know what the issues are with a bridge
<tych0> gonna send some patches today
<tych0> just need to catch up on email :)
<rick_h_> tych0: ok cool, thanks for all the help in figuring it out!
<voidspace> dimitern: where is that done - in the cloudinit package?
<dimitern> voidspace: in cloudconfig/providerinit IIRC
<voidspace> dimitern: you are correct
<voidspace> dimitern: I've asked Tim to fix it in gomaasapi
<dimitern> voidspace: sweet!
<tych0> rick_h_: sure, np
<voidspace> dimitern: do we propagate feature flags onto the juju controller machine?
<voidspace> we must do
<voidspace> however, the issue I'm seeing now kinda implies not
<dimitern> voidspace: yeah
<dimitern> voidspace: we do
<voidspace> ok, kinda hard to see where this "requested map got nil" comes from
<voidspace> I'm bootstrapping with debug to see
<voidspace> maybe Subnets
<dimitern> voidspace: this sounds like a GetMap() failed somewhere
<voidspace> heh, possibly from space discovery
<voidspace> dimitern: well yes...
<dimitern> but on a jsonobject, not a maasobject
<dimitern> i.e. while processing a response
<voidspace> that's the error message we usually get hitting a 1.0 endpoint against 2.0
<voidspace> but I'm trying to work out where
<voidspace> the error message is pointing me to a non-existent line in supercommand.go and --debug provided no extra information
<voidspace> although the debug line before it is immediately before a call to NewEnviron - which would report that error message if it thought the feature flag wasn't set
<voidspace> dimitern: where in juju are feature flags set on the controller machine
<voidspace> dimitern: if it's after we attempt to open an environ then we'll fail in this way
<voidspace> when running jujud on the controller
<dimitern> voidspace: let me check exactly
<dimitern> voidspace: cmd/jujud/main_nix.go
<voidspace> dimitern: I see a call to SetFlagsFromEnvironment in jujud/main_nix.go
<voidspace> dimitern: right, but what puts them in the environment - cloud init?
<dimitern> voidspace: no, they are part of the agent config we pass via the userdata
<dimitern> voidspace: check also cmd/jujud/agent/machine.go - in the beginning of Run()
<dimitern> you could grep for "developer feature flags enabled" in the logs
<voidspace> ok
<voidspace> dimitern: thanks
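What that plumbing looks like from Go, as a minimal sketch. The helper is the featureflag package from juju/utils, and JUJU_DEV_FEATURE_FLAGS is the variable juju reads, IIRC; if jujud on the controller never gets it into its environment before this runs, Enabled("maas2") stays false - which would produce exactly the bootstrap error voidspace hits just below:

    package main

    import (
    	"fmt"
    	"os"

    	"github.com/juju/utils/featureflag"
    )

    func main() {
    	// jujud does this once at startup (cmd/jujud/main_nix.go).
    	os.Setenv("JUJU_DEV_FEATURE_FLAGS", "maas2")
    	featureflag.SetFlagsFromEnvironment("JUJU_DEV_FEATURE_FLAGS")

    	fmt.Println(featureflag.Enabled("maas2")) // true
    }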
<alexisb> ericsnow, when you have a moment can you kindly review http://reviews.vapour.ws/r/4583/
<ericsnow> alexisb: will do
<dimitern> babbageclunk: you've got a review btw
<voidspace> dimitern: babbageclunk: as I suspected. With maas2 including babbageclunk's branch *and* my gomaasapi fix, bootstrap now dies with
<voidspace> 2016-04-14 16:03:57 ERROR cmd supercommand.go:448 MAAS 2 is not supported unless the 'maas2' feature flag is set
<voidspace> dimitern: babbageclunk: so the feature flag isn't being propagated correctly / early enough to the controller machine
<dimitern> voidspace: you're not seeing the log saying they're enabled?
<voidspace> dimitern: haven't checked yet - we haven't touched that code path though!
<voidspace> off to the tattooist
<voidspace> will look when I return
<dimitern> ok
<tych0> rick_h_: jam: frobware: https://github.com/juju/juju/pull/5164
<dimitern> tych0: agent.LxdBridge is never ever set
<dimitern> tych0: only agent.LxdBridge is
<dimitern> if that
<tych0> dimitern: sorry, i don't understand?
<dimitern> tych0: e.g. here https://github.com/juju/juju/pull/5164/files#diff-7db54798352f1e675c4e2ecba7bc349dR57
<dimitern> or the one below
<dimitern> in MaintainInstance
<tych0> dimitern: you said "agent.LxdBridge is never ever set only agent.LxdBridge is"
<frobware> tych0: btw, the merge should be into next
<dimitern> agent.LxcBridge is non-empty if explicitly set by a provider - MAAS and EC2 used to do that, but no longer do
<dimitern> tych0: sorry :)
<dimitern> so 'only agent.LxcBridge is'
<dimitern> but as I said, now agent.LxcBridge is no longer set and is always empty
<dimitern> the confusion comes from bad naming - agent.LxcBridge should've been called agent.ContainerBridge
<dimitern> frobware: gofmt breaks alignment when it finds a blank line
<tych0> dimitern: ok. so you're saying we should just delete that entirely and always use lxcbr0? or?
<tych0> i don't actually know where that comes from, i just figured it was configuration from the user
<dimitern> tych0: yes, I think that's correct
<tych0> dimitern: i guess i'm a little gunshy about making that change
<tych0> since i don't understand any of this very well :)
<dimitern> tych0: long, long ago there was a "network-bridge" setting you could use to override agent.LxcBridge, but it's long gone
<voidspace> babbageclunk: have you made much progress on volumes in gomaasapi?
<tych0> dimitern: ok. it seems like that should be part of a larger change to get rid of it everywhere else then i guess?
<voidspace> babbageclunk: it would be good to let Tim know where you got to in the status doc
<tych0> i can drop that patch if you think it doesn't matter though
<babbageclunk> voidspace: nope - struggling to understand how the current code works.
<dimitern> tych0: so now unless the provider populates ContainerBridgeName in the BootstrapParams passed to providercommon.Bootstrap(), agent.LxcBridge won't be set in the agent config
<babbageclunk> voidspace: It seems to rely on attrs that aren't in the 1.9 JSON.
<voidspace> babbageclunk: can you write it up in the doc - Tim can look at it or we can feature request the maas guys
<babbageclunk> voidspace: writing my own little test harness
<voidspace> babbageclunk: cool
<voidspace> babbageclunk: maybe there's another api to get the information
<mup> Bug #1570473 opened: juju lxd bridge detection fallback is not reliable <conjure> <juju-core:New> <https://launchpad.net/bugs/1570473>
<alexisb> ericsnow, or dimitern. or frobware : this is a high priority PR for review today: http://reviews.vapour.ws/r/4598/
<ericsnow> alexisb: k
<ericsnow> alexisb: already looking at it :)
<alexisb> sweet :)
<dimitern> alexisb: I've already added comments and discussed a few points with tych0
<dimitern> tych0: apart from using the always empty agent.LxdBridge (or agent.LxcBride) - LGTM
<tych0> dimitern: yeah. i guess i'm not super comfortable getting rid of that because i don't really know how it works
<tych0> it seems like if we want to get rid of it, we should get rid of it everywhere
<mup> Bug #1570473 changed: juju lxd bridge detection fallback is not reliable <conjure> <juju-core:New> <https://launchpad.net/bugs/1570473>
<dimitern> tych0: sgtm
<mup> Bug #1570473 opened: juju lxd bridge detection fallback is not reliable <conjure> <juju-core:New> <https://launchpad.net/bugs/1570473>
<ericsnow> tych0: FYI, ship-it
<ericsnow> tych0: (with one small comment)
<tych0> ericsnow: no, that constant isn't exported in the LXD package; i moved it to lxdclient because we needed it there
<ericsnow> tych0: sounds good
<tych0> how do i change the branch target?
<tych0> seems like i might need a new pr?
<ericsnow> tych0: of the PR?  yeah, make a new PR and link to the old review request
<tych0> ericsnow: ok, cool. and then i'm good to merge right away?
<ericsnow> tych0: yep
<tych0> ok, cool
<tych0> ericsnow: wait, next is older than master?
<ericsnow> tych0: no, though it may have temporarily diverged a little
<tych0> ok
<alexisb> tych0, remind me, what version of lxd did the switch to lxdbr0?
<alexisb> was it rc9??
<tych0> i think so
 * tych0 looks
<tych0> yeah
<tych0> rc9
<perrito666> mm, how long until its morning in nz?
 * perrito666 needs a hand from menn0
<alexisb> perrito666, you have about 2 hours
<perrito666> one of the fun things of this job :p one question I hardly thought I would be asking
<redir> bbiab
<alexisb> and tych0 do you have the link handy for your insights write-up on lxd init and bridge setup?
<alexisb> cheryl linked me to is yesterday but now I cant find it :)
<alexisb> tych0, nevermind
<alexisb> found it
<alexisb> sorry
<perrito666> agh why are the tests that take the longest the ones that always fail
<tych0> alexisb: cool, np
<perrito666> dimitern: voidspace can any of you make anything of the first error in https://pastebin.canonical.com/154358/  ?
<dimitern> perrito666: looks like map ordering issue?
<alexisb> katco, did you add channels to the release notes?
<katco> alexisb: no
<katco> alexisb: we didn't do the front-end work... did it not make it in there?
<alexisb> nope
<dimitern> perrito666: in any case, feel free to skip/ignore or just delete this test, as it's no longer relevant - uses state.NetworkInterface which must be removed (no longer used) - just haven't got there yet myself
<alexisb> katco, would you be up to adding soemthing?
<alexisb> we need it to release
<katco> alexisb: yeah adding now
<alexisb> thanks
<alexisb> heading under "Whats new for beta4" please
<katco> alexisb: do you think this should be an overview of channels, or simply a blurb stating that they exist
<alexisb> katco, I think an overview would be nice to have
<alexisb> so that people know
<alexisb> but it doesnt have to be overly detailed
<katco> alexisb: ok, i'm going to ping someone from the CS side of things as they are way more familiar
<alexisb> fair enough
<perrito666> dimitern: tx, Just making sure everything tests properly with mongo3 and was not sure if I should pay attention to that test
<katco> alexisb: are you fine with me linking to our already excellent documentation, and then providing info about juju's command line? https://jujucharms.com/docs/devel/authors-charm-store#entities-explained
<alexisb> katco, yes that is fine
<katco> alexisb: k
<perrito666> hey, I suddenly have to go for like an hour, ill be back later, mail me if you need anything
<voidspace> perrito666: no idea without digging into it, sorry
<perrito666> Voidspace no worries dimitern told me what I needed
<mup> Bug # changed: 1450299, 1538303, 1554675, 1556207, 1559099, 1560391, 1564694, 1567017, 1567020, 1568092, 1569982
<perrito666> I see some bugs changed
<katco> ericsnow: i'm in our 1:1 if you're ready
<alexisb> wallyworld, when you are in please ping me
<wallyworld> alexisb: give me 5
<alexisb> lol
<alexisb> it is not urgent
<katco> wallyworld: you are a robot, i'm convinced
<alexisb> katco, me too
<alexisb> convinced that wallyworld is a robot
<alexisb> fueled by hoity-toity coffee
<redir> wall-eworld
<alexisb> lol
<alexisb> that was good redir
<redir> :)
<wallyworld> alexisb: zup?
<alexisb> 1x1 HO
<wallyworld> ok
<mup> Bug # changed: 1426729, 1516668, 1524077, 1533262, 1537620, 1538735, 1543223, 1553272, 1554251, 1554687, 1555083, 1555248, 1556249, 1560201, 1560511, 1560520, 1560531, 1560595, 1560665, 1560667, 1563576, 1563615, 1563628, 1563762, 1563843, 1563845, 1563853, 1563923, 1563924, 1563927, 1563928,
<mup> 1563938, 1563958, 1564057, 1566237, 1566589, 1566628, 1567182, 1567228, 1567683, 1568312, 1568390, 1569024, 1569097, 1569196, 1569408, 1569725
<thumper> hi ho
<thumper> hi ho
 * thumper thinks he knows what's wrong with maas2 bootstrapping
<mup> Bug #1570594 opened: read access to admin model allows grant <docteam> <juju-core:New> <https://launchpad.net/bugs/1570594>
<alexisb> thumper, and it is your fault
<alexisb> the bootstrap issue
<thumper> alexisb: there is another one :)
<thumper> probably also my fault
<thumper> but from much earlier
<thumper> wallyworld: would love a chat when you have a minute
<thumper> damn...
<wallyworld> thumper: sure, after release standup
<thumper> wallyworld: s'ok, I think I've sorted it out
<thumper> code has changed from what I remembered it being
<thumper> and I was having to work through things
 * thumper crosses his fingers
<thumper> oh... getting close...
<thumper> fuck yeah!!!
<thumper> alexisb: bootstrap maas2 succeeded
<perrito666> I take you found it?
 * thumper tries deploy
<thumper> hmm...
<thumper> wat
<mgz_> so, the correct final step after dpkg-reconfigure lxd on a fresh xenial,
<mgz_> is `systemctl restart lxd`, right?
<thumper> can't deploy?
<thumper> wat?
<perrito666> mgz_: dpkg should do that for you
<perrito666> mgz_: in any case, if it doesnt, lxd-bridge
<mgz_> it does, but that doesn't create the bridge
<thumper> http://paste.ubuntu.com/15839571/
<perrito666> mgz_: lxd-bridge is the service to restart
<thumper> anyone had deploy issues?
<mgz_> okay
<thumper> trying to deploy ubuntu charm dies talking to charmstore
<perrito666> mgz_: did you instruct dpkg to create the ipv4 network?
<mgz_> yeah
<thumper> WAT? debug-log not supported?
<alexisb> \o/
<alexisb> thumper, that is freak'n awesome!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
<thumper> alexisb: except debug log isn't working
<wallyworld> perrito666: did you want to chat now?
<thumper> deploy isn't working
<thumper> and neither of these things are maas bugs
<wallyworld> thumper: that error looks like juju not charm store
<alexisb> heh baby steps
<alexisb> and at least you know enough about the bug to know that
<thumper> maybe it is maas's fault
<thumper> jujud is panicking
<alexisb> thumper, i jut deployed on lxd provider on latest next build
<thumper> nice
<menn0> thumper: I did think of this, and then forgot. Hooray for tests :) http://paste.ubuntu.com/15839652/
<thumper> :)
<thumper> alexisb: the failure was due to the maas provider needing to do subnet/space discovery the new way
<thumper> so I'll leave that for voidspace
<thumper> and get on to the filesystem bits of gomaasapi
<perrito666> wallyworld: going
<perrito666> wallyworld: ??
<wallyworld> perrito666: cpu 100%, hangout frozen
<alexisb> thumper == gomaasapi guy
<alexisb> ;)
 * thumper goes to make a coffee
 * thumper looks to see who is on-call reviewer
<thumper> ericsnow: still around?
<ericsnow> thumper: yep
<thumper> I'm just proposing a very simple branch that we need for maas2
<ericsnow> thumper: k
<thumper> damn it
<thumper> proposed agains master , not next
 * thumper redoes
<thumper> ericsnow: http://reviews.vapour.ws/r/4603/diff/#
<ericsnow> thumper: ah, bootstrap-state :)
<thumper> ericsnow: were you around when voidspace was having these issues?
<ericsnow> thumper: around but not involved
 * thumper nods
<ericsnow> thumper: Windows isn't a concern here, right?
<thumper> ericsnow: no, because we only bootstrap on ubuntu
<thumper> windows is currently workload only
<thumper> not apiserver
<ericsnow> katco: ^^^
<ericsnow> thumper: sounds good
<ericsnow> thumper: ship-it
<thumper> ericsnow: ta
<menn0> wallyworld: an old MongoDB HA bug has resurfaced (it's one of the blockers)
<wallyworld> menn0: which branch? next?
<wallyworld> bug number?
<menn0> wallyworld: yep on next
<wallyworld> menn0: it may be fixed in master
<wallyworld> we fixed a bunch of ha stuff for beta4
<mgz_> they will be reconverging shortly
<menn0> wallyworld: bug 1453805
<mup> Bug #1453805: Juju takes more than 20 minutes to enable voting <blocker> <ci> <ensure-availability> <intermittent-failure> <regression> <juju-core:Triaged by menno.smits> <juju-core 1.23:Fix Released by menno.smits> <juju-core 1.24:Fix Released by menno.smits> <https://launchpad.net/bugs/1453805>
<menn0> wallyworld: ok that's good to know
<wallyworld> oh i haven't seen that bug
<menn0> wallyworld: it's an old one that aaron reopened because the symptoms look the same
<menn0> wallyworld: what happens is that after enable-ha the new controller hosts come up and the agents can connect to MongoDB but then get disconnected
<menn0> wallyworld: we don't have the mongodb logs to confirm what's going on
<wallyworld> joy
<menn0> wallyworld: but off memory I think that can happen when the replicaset isn't ready yet
<menn0> wallyworld: it's intermittent, I can't replicate it
<wallyworld> sigh
<menn0> wallyworld, mgz_ : we really need those MongoDB logs to know what's happening
<wallyworld> if it's an existing bug why is it a regression?
<menn0> wallyworld: it was fixed in 1.24 and 1.25 and has now come back
<menn0> wallyworld: it could well be a completely different cause
<wallyworld> i'd say so because i don't think we messed with those bits
<wallyworld> but there were a lot of changes
<menn0> wallyworld: what about those changes to mongodb setup in the machine agent that you made? (all that deleted code)
<menn0> wallyworld: could that reorg have something to do with it?
<wallyworld> the deleted code was for pre ha environments where stuff wasn't set up yet for replication
<wallyworld> that setup is now done in bootstrap
<menn0> wallyworld: yeah... seems unlikely
<menn0> wallyworld: looking at the failures it's happening in master and next
<wallyworld> and i think next was branched before my changes
<wallyworld> but i did notice it took a while to transition to has-vote
<wallyworld> i just thought it was mongo behaving as normal, because well, you know, mongo is web scale
<menn0> wallyworld: not 20mins though right?
<wallyworld> not sure tbh
<wallyworld> maybe 5?
<redir> heading out for a while. I'll check back later this eve to see if things merged...
<wallyworld> let's hope so
<menn0> wallyworld: 5 is acceptable I think, 20+ is not
<wallyworld> even 5 seems unfortunate
<wallyworld> i mean, wtf is it doing
#juju-dev 2016-04-15
<voidspace> thumper: nice work
<voidspace> thumper: and I knew we'd hit subnets pretty soon
<thumper> voidspace: it is coming along nicely
<voidspace> thumper: that should be very straightforward to do
<thumper> voidspace: I'm doing blockdevice, filesystem, and partition in gomaasapi
<voidspace> thumper: anyway, goodnight - see you tomorrow if you're around
<voidspace> thumper: cool, thanks
<thumper> voidspace: night
<voidspace> o/
<wallyworld> axw_: does bug 1539684 ring any bells for you?
<mup> Bug #1539684: storage-get unable to access previously attached devices <canonical-bootstack> <storage> <juju-core:Triaged> <https://launchpad.net/bugs/1539684>
<sinzui> wallyworld: do you have a moment to review http://reviews.vapour.ws/r/4605/
<wallyworld> i do
<wallyworld> sinzui: this is a merge of next into master right?
<sinzui> wallyworld: yes, sorry. I seem to have missed a whole sentence.
<wallyworld> np, thought it was, just checking
<wallyworld> sinzui: i've eyeballed the changes, looks ok
<wallyworld> and i am landing
 * menn0 is back (had a visitor)
<menn0> wallyworld: this is what's in the logs of the controller hosts added after enable-ha:
<menn0> 2016-04-14 10:22:55 INFO juju.mongo open.go:125 dialled mongo successfully on address "127.0.0.1:37017"
<menn0> 2016-04-14 10:22:55 DEBUG juju.worker.dependency engine.go:479 "state" manifold worker stopped: cannot connect to mongodb: no reachable servers
<menn0> 2016-04-14 10:22:55 ERROR juju.worker.dependency engine.go:526 "state" manifold worker returned unexpected error: cannot connect to mongodb: no reachable servers
<wallyworld> menn0: yeah, mongo is great
<menn0> wallyworld: we really need those mongodb logs
 * menn0 updates the bug
<wallyworld> yes
<wallyworld> alexisb: you still working on bug 1506225 ?
<mup> Bug #1506225: Failed bootstrap does not clean up failed environment w/o --force and error message is unhelpful <bootstrap> <destroy-environment> <jujuqa> <juju-core:In Progress by alexis-bruemmer> <https://launchpad.net/bugs/1506225>
<alexisb> wallyworld, nope I haven't looked at it in ages
<wallyworld> ok, np, will update
<mup> Bug # opened: 1570651, 1570654, 1570657, 1570660
<thumper> once master has next merged
<thumper> we need to remove the next branch
<wallyworld> thumper: i am rebooting to try and fix my camera, will be a little late
<thumper> wallyworld: ack
<axw_> wallyworld: sorry didn't see message before. I think I've seen the storage bug in LP before, but haven't witnessed the bug first hand
<wallyworld> np
<wallyworld> axw_: i am just triaging bugs so was curious
<wallyworld> axw: running late, otp, be a minute or 2
<axw> wallyworld: sure, ping when ready
<axw> sinzui: can the beta4 stabilisation bug be closed now?
<sinzui> axw: soon, I need to bump the version. CI will reject all branches that claim to be 2.0-beta4
<axw> sinzui: ah ok
<sinzui> wallyworld: can you review http://reviews.vapour.ws/r/4606/
<wallyworld> sure
<wallyworld> sinzui: lgtm. i have been naughty and snuck in a landing prior to the stampede
<sinzui> wallyworld: I better make sure CI is paused then because it will fail that revision
<wallyworld> sinzui: sorry
<wallyworld> i can abort
<sinzui> wallyworld: no need
<sinzui> wallyworld: Since CI makes the versions we release, like agents, it assumes something terrible has happened if it is asked to make and test a version it has already made and released.
<wallyworld> ah
<sinzui> wallyworld: CI needs to prevent someone tampering with the agents in streams
<wallyworld> makes sense
<wallyworld> i was being impatient
<wallyworld> axw: ready now
<sinzui> wallyworld: all is fine. ci is paused, when all is merged I will unpause and remove the block
<wallyworld> tyvm
<sinzui> wallyworld: and everyone, looks like I need to find some disk space to fix merges.
<wallyworld> ah bollocks
<sinzui> why is mongo on this host
<mup> Bug #1568943 changed: Juju 2.0-beta4 stabilization <blocker> <juju-core:Fix Released> <https://launchpad.net/bugs/1568943>
<mup> Bug #1541536 changed: Deployer and Quickstart failed setting annotations because of socket or json parsing <api> <blocker> <ci> <deployer> <quickstart> <regression> <juju-core:Invalid> <juju-core 1.25:Fix Released> <https://launchpad.net/bugs/1541536>
<mup> Bug #1564791 changed: 2.0-beta3: LXD provider, jujud architecture mismatch <lxd> <juju-core:Fix Released by natefinch> <https://launchpad.net/bugs/1564791>
<mup> Bug #1541536 opened: Deployer and Quickstart failed setting annotations because of socket or json parsing <api> <blocker> <ci> <deployer> <quickstart> <regression> <juju-core:Invalid> <juju-core 1.25:Fix Released> <https://launchpad.net/bugs/1541536>
<mup> Bug #1564791 opened: 2.0-beta3: LXD provider, jujud architecture mismatch <lxd> <juju-core:Fix Released by natefinch> <https://launchpad.net/bugs/1564791>
<mup> Bug #1541536 changed: Deployer and Quickstart failed setting annotations because of socket or json parsing <api> <blocker> <ci> <deployer> <quickstart> <regression> <juju-core:Invalid> <juju-core 1.25:Fix Released> <https://launchpad.net/bugs/1541536>
<mup> Bug #1564791 changed: 2.0-beta3: LXD provider, jujud architecture mismatch <lxd> <juju-core:Fix Released by natefinch> <https://launchpad.net/bugs/1564791>
<natefinch> wallyworld: got a few minutes to talk?
<wallyworld> natefinch: sure, am outside, give me a sec to move
<natefinch> wallyworld: np
<wallyworld> natefinch:  https://plus.google.com/hangouts/_/canonical.com/tanzanite-stand
<bradm> should I be able to bootstrap juju2 on ppc64el?
<bradm> I get an error about "ERROR invalid constraint value: arch=ppc64el ; valid values are: [amd64 arm64]"
<frobware> bradm: I have done so in the past when looking at various pp64el bugs.
<bradm> ah, I see my issue now that I say that
<bradm> my images didn't sync right for some reason
<bradm> yup, booting fine after kicking the sync a bit harder
<axw> wallyworld: would you please take a look at the last diff here, http://reviews.vapour.ws/r/4533/diff/5-6/
<wallyworld> sure
<axw> wallyworld: the rest has been reviewed, the last rev is to remove models immediately
<wallyworld> awesome
<wallyworld> axw: a couple of questions
<axw> wallyworld: thanks
<axw> wallyworld: I'll add some more tests around the refcount, please see my reply to your first issue
<wallyworld> sure
<wallyworld> axw: ah sorry, i misread the command
<axw> wallyworld: the increment for controller model would have always silently failed, because it was being run before the document is created
<axw> without txn.DocExists
<axw> woohoo
<wallyworld> win!
<wallyworld> luckily we are adding a test :-)
<wallyworld> naughty thumper :-)
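What the refcount fix amounts to, sketched with gopkg.in/mgo.v2/txn: an $inc guarded by Assert: txn.DocExists aborts loudly when the document isn't there yet, instead of silently matching nothing. Collection and field names here are made up, not juju's schema:

    package main

    import (
    	"gopkg.in/mgo.v2"
    	"gopkg.in/mgo.v2/bson"
    	"gopkg.in/mgo.v2/txn"
    )

    // incRefOp increments a model's refcount, failing the transaction
    // (txn.ErrAborted) if the refcount doc hasn't been created yet.
    func incRefOp(modelUUID string) txn.Op {
    	return txn.Op{
    		C:      "modelrefcounts",
    		Id:     modelUUID,
    		Assert: txn.DocExists,
    		Update: bson.M{"$inc": bson.M{"refcount": 1}},
    	}
    }

    func main() {
    	session, err := mgo.Dial("localhost")
    	if err != nil {
    		panic(err)
    	}
    	defer session.Close()
    	runner := txn.NewRunner(session.DB("juju").C("txns"))
    	if err := runner.Run([]txn.Op{incRefOp("some-model-uuid")}, "", nil); err != nil {
    		panic(err) // txn.ErrAborted if the doc is missing
    	}
    }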
<voidspace> thumper-afk:  frobware: dimitern: a really difficult one to start the morning off http://reviews.vapour.ws/r/4611/
<frobware> voidspace: shipit
<voidspace> frobware: thanks :-)
<dimitern> voidspace: +1
<voidspace> dimitern: o/
<dimitern> frobware: it turned out surprisingly difficult to get rid of AC only in MAAS
<dimitern> it would've been a lot easier to get rid of all bits of AC
<frobware> dimitern: patience... :)
<dimitern> frobware: yeah, but still - I'm not quite happy with what I came up with eventually
<voidspace> dimitern: not just removing the branches where we check for the flag?
<dimitern> voidspace: removing the code in MAAS was easy, but convincing the provisioner/brokers to still work with and w/o AC on AWS, while not breaking MAAS at the same time, has proven to be nasty
<voidspace> dimitern: ah
<voidspace> dimitern: so sometimes use the flag but sometimes not
<voidspace> I see
<dimitern> the price we pay for dirty PoC-style hacks ..
<dimitern> voidspace: we do, but a lot of places in the code assume having the feature flag on is sufficient to use the legacy AC approach (iptables, allocateaddress, etc.)
<voidspace> dimitern: oh yes, I was just understanding - that does sound much harder than "just remove some code"
<dimitern> whereas now just the flag is not sufficient, as we also need to check if SupportsAddressAllocation returns true or NotSupported
<voidspace> frobware: have you been to Vision Express?
<frobware> not yet
<voidspace> ok
<frobware> voidspace: going to go around lunchtime
<voidspace> frobware: ah, cool
<voidspace> frobware: couldn't bear to miss our daily banter :-)
<frobware> voidspace: makes your eyes bleed :-D
<dimitern> frobware: but on the upside, I figured out how to untangle the contention around discoverspaces still going on when trying to add a container to the bootstrap node
<thumper-afk> babbageclunk, voidspace: meeting...
<frobware> voidspace: standup
<voidspace> kk
<frobware> dimitern, voidspace, dooferlad, babbageclunk: PTAL @ http://reviews.vapour.ws/r/4609/
<voidspace> MAAS2 bootstrap and deploy seem to work with my branch!!!
<mup> Bug #1570759 opened: apt-get install juju does not install /usr/bin/juju <juju-core:New> <https://launchpad.net/bugs/1570759>
<frobware> voidspace: congrats all round !!
<dimitern> frobware: you've got a review
<babbageclunk> dimitern: Do you think I can put a $$merge$$ on that gomaasapi PR?
<babbageclunk> https://github.com/juju/gomaasapi/pull/40
<babbageclunk> dimitern: I can see the extra bit that needs adding in controller.AllocateMachine after that.
<dimitern> babbageclunk: let me have a look
<babbageclunk> dimitern: you already did! :)
<dimitern> babbageclunk: ah, yes - I'm ok with landing this - esp. if the storage tests pass in maas?
<babbageclunk> dimitern: ?
<dimitern> babbageclunk: make check passed I presume?
<babbageclunk> dimitern: Ok - I'll check it out and run them
<dimitern> babbageclunk: cheers!
<dimitern> babbageclunk: :) I'm not trying to be difficult, but folks familiar with storage are not around, so hopefully it has good tests that can show possible regressions
<babbageclunk> dimitern: No, that makes sense
<babbageclunk> dimitern: I think its tests are mostly in the canned JSON mould, so there's not anything that would really show regressions, unfortunately.
<dimitern> babbageclunk: I see, well - we'll take what we can get
<babbageclunk> dimitern: :)
<mup> Bug #1570791 opened: ERROR wait: no child processes with juju run on ppc64el <juju-core:New> <https://launchpad.net/bugs/1570791>
<babbageclunk> dimitern: I think we're really the only clients of this part of the api, so if it's wrong it only hurts us anyway.
<dimitern> babbageclunk: yeah, the feature flag gives at least some peace of mind to fix stuff
<frobware> dimitern: thanks. got sidetracked by GCE, but can now bootstrap there too.
<mup> Bug #1570796 opened: container startup issue when juju network management disabled <juju-core:New> <https://launchpad.net/bugs/1570796>
<dimitern> frobware: nice!
<babbageclunk> voidspace, dimitern: trying to work out what the storage stuff will look like in the constraint map JSON coming back from the MAAS api.
<babbageclunk> Is there any way to specify storage constraints from the CLI?
<dimitern> babbageclunk: yeah, sure - although I haven't done it with MAAS 2.0 CLI
<dimitern> babbageclunk: in 1.0 I'd use `maas 19-root nodes acquire storage='...' dry_run=True verbose=True`
<dimitern> or something like that
<babbageclunk> dimitern: Ooh, dry_run is handy!
<dimitern> as for the format of the storage argument, have a look at how's it constructed in maas/constraints.go
<babbageclunk> dimitern: ok, will do - thanks
<babbageclunk> dimitern: I can get it to refuse to give me machines if I ask for too much storage, but it doesn't seem to matter what I put for networks (or not_networks), it always allocates me the node.
<babbageclunk> dimitern: Am I specifying this right?
<babbageclunk> networks=ip:192.168.200.5
<babbageclunk> dimitern: argh - looks like I'm being bitten by docs not being up to date again - I think this is interfaces now.
<babbageclunk> dimitern: ok, if I specify interfaces=default:space=0 that works.
<babbageclunk> dimitern: I don't really understand what that means, but it seems to do something.
<dimitern> babbageclunk: 'interfaces' takes a list of items, separated by ; - each item can have a "<label>:" and a list of key-value comma-separated attributes, like space=0 (0 is the ID; but the name should work as well there)
<dimitern> babbageclunk: so 'interfaces=default:space=0;admin:space=admin-api' means pick a machine with 2 or more NICs, which have addresses from the subnets in 'space-0' and 'admin-api'
<dimitern> babbageclunk: the not_networks on the other hand apply to the machine as a whole - i.e. not_networks=cidr:10.20.30.0/24 will mean none of the NICs on the machine have access to the subnet with cidr 10.20.30.0/24
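Pulling the above together into a single invocation - a sketch only: '19-root' is the profile name from dimitern's earlier example, and the space names, CIDR, and exact MAAS 1.x syntax are assumptions based on this discussion:

    # dry-run acquire combining a per-NIC 'interfaces' constraint with a
    # machine-wide 'not_networks' exclusion, as described above
    maas 19-root nodes acquire \
        'interfaces=default:space=0;admin:space=admin-api' \
        'not_networks=cidr:10.20.30.0/24' \
        dry_run=True verbose=True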
 * dimitern needs to step out for ~30m
<babbageclunk> dimitern: thanks, that helps a lot!
<babbageclunk> dimitern: And the labels in the interfaces specifications - what do they do?
<babbageclunk> dimitern: Oh - do those become the labels in the constraints_by_type map that come back? So you can tie the ids back to the constraints you specified?
<babbageclunk> dimitern: (I know you're not there, just asking while it occurs to me.)
<dimitern> babbageclunk: labels are user-defined; in juju we use binding names as labels (i.e. when you do `juju deploy mysql --bind 'server=db-space cluster=internal-api'`, juju will construct 'interfaces=server:space=42;cluster:space=62', assuming 42 and 62 are the maas provider ids of the 'db-space' and 'internal-api' spaces)
<dimitern> babbageclunk: I was almost out, but couldn't resist :)
<babbageclunk> dimitern: gotcha.
<dimitern> babbageclunk: anyway - bbs; we could get on a HO later if you want
<babbageclunk> dimitern: o/
<mup> Bug #1466514 opened: apiserver has a race in the using of the port number <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1466514>
 * dimitern is back
<perrito666> frobware: I just privmsgd you
<perrito666> and by frobware I meant fwereade
<mup> Bug #1547741 changed: Cannot build on armhf with go1.2 <2.0-count> <armhf> <packaging> <juju-core:Fix Released by gz> <https://launchpad.net/bugs/1547741>
<wallyworld> frankban: any chance of a small review to fix an issue with local charm series detection https://github.com/juju/bundlechanges/pull/21
<abentley> sinzui: I've got the updated public-clouds.yaml up, but I'm thinking I need to delete clouds.yaml before I'm done done.
<sinzui> abentley: agreed
<frankban> wallyworld: lgtm
<wallyworld> frankban: tyvm
<mup> Bug #1567594 changed: upgrade-gui command isn't in the juju tab complete <help> <ui> <juju-core:Invalid> <https://launchpad.net/bugs/1567594>
<mup> Bug #1567938 changed: juju bootstrap requires network ID as config option on command line although it's specified in clouds.yaml <config> <juju-core:Invalid> <https://launchpad.net/bugs/1567938>
<mup> Bug #1570883 opened: imageSuite.TestEnsureImageExistsCallbackIncludesSourceURL fails on centos go 1.6 <centos> <go1.6> <lxd> <juju-core:Triaged> <https://launchpad.net/bugs/1570883>
<babbageclunk> dimitern, voidspace: Can you please look at this? https://github.com/juju/gomaasapi/pull/41
<dimitern> babbageclunk: sure, looking
<babbageclunk> dimitern: thanks! Was pretty easy to implement by just following what thumper had already done, but took a while for me to understand the constraints and response structure first.
<abentley> sinzui, mgz_: I've got my keystone3 test passing using a fake juju client, but I think I'm at the stage where I need a real maas to test with.  Would it make sense to use parallel-maas17?
<dooferlad> frobware: I think you are in the best place to review http://reviews.vapour.ws/r/4613/
<sinzui> abentley: no maas 1.7 is unsupported and juju2 doesn't know 1.7 or 1.8
<abentley> sinzui: Should I wait for munna to be idle, then?
<sinzui> abentley: I have run small concurrent loads
<abentley> sinzui: I don't think the openstack bundle counts as a small load, does it?
<frobware> dooferlad: looking
<mgz_> abentley: munna would be best, but can you use 1.9 on finfolk as well?
<abentley> mgz_: I'm not picky.
<abentley> mgz_: Here's what I'm doing: https://pastebin.canonical.com/154458/
<dimitern> babbageclunk: LGTM
<babbageclunk> dimitern: sweet
<abentley> (Yes, I know I'm using the 2.0 endpoint)
<mgz_> abentley: that all looks reasonable
<abentley> mgz_: Okay, I'll try it on finfolk.
<sinzui> abentley: 1. no, and liberty has never deployed on our vmaas
<abentley> sinzui: Sorry, I don't understand.
<sinzui> abentley: some OS bundles don't deploy on our vmaas 1.9. You will learn if yours does and if it does, we gain new OS testing
<abentley> sinzui: Okay.
<perrito666> uh, if you have a bundle that deploys in vmaas I would like to have it too
 * fwereade just had an interesting new life experience: rescuing a cat from a tree. only one scratch drew blood.
<rogpeppe3> fwereade: :)
<rogpeppe> fwereade: you're now one step on the road to becoming a burly fireman
<sinzui> perrito666: I will pass it on when we have one
<perrito666> fwereade: hint for the future, if the cat is mature, just tip it and it will fall ok :p
<rogpeppe> fwereade: random question: i just did: "juju bootstrap --upload-tools ec2 aws"; go install github.com/juju/juju/...; juju upgrade-juju --upload-tools; should that have worked OK or is it something you shouldn't do?
<sinzui> perrito666: We have landscape and wikimedia bundles that work with native deploy on maas 1.9
<rogpeppe> fwereade: 'cos i tried to deploy a unit and it's failing because it can't fetch the tools ("no matching tools available")
<rogpeppe> anyone else know about --upload-tools vs upgrade-juju ?
<natefinch> rogpeppe: that should be fine. In fact, I think that's the only way you can upgrade after doing the initial upload-tools on bootstrap
<rogpeppe> natefinch: ok, well then it looks like a bug
<rogpeppe> natefinch: FWIW upgrade-juju appeared to work (zero exit status) but it did print something about "available-tools...\nbest version...\n" which confused me a bit
<frobware> dooferlad: ignore my github comments, doing it in RB
<dooferlad> frobware: ack
<natefinch> rogpeppe: I haven't upgraded a juju environment.... uh, possibly ever.  So, no idea what it's supposed to actually look like.
<rogpeppe> natefinch: i used to do it all the time - very useful if you don't wanna wait for ec2 instances to start up again
<rogpeppe> natefinch: pity it seems to be broken now
<abentley> mgz_: Do we have a config for maas19 on finfolk?
<sinzui> alexisb: bug 1570035 is causing master failures. We need an engineer to fix the test or juju.
<mup> Bug #1570035: Race in api/watcher/watcher.go <ci> <race-condition> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1570035>
<natefinch> rogpeppe: well, it really should work, so we should fix that
<ericsnow> rogpeppe: don't forget that you must use --upload-tools to upgrade the admin model and then separately upgrade any other models, explicitly providing the version of your newly upgraded tools
<katco> ericsnow: rogpeppe: you can no longer upgrade anything but the admin model with --upload-tools
<rogpeppe> katco: so if i use upgrade-juju on the admin model, it breaks all my models?
<ericsnow> rogpeppe: you can upgrade the others separately but not using --upload-tools
<katco> rogpeppe: it shouldn't? why do you think that's what would happen?
<rogpeppe> katco: that's what happened for me
<katco> rogpeppe: how did they break?
<rogpeppe> katco: i did upgrade-juju, then deployed a unit in the default model and it failed 'cos it couldn't get tools
<rogpeppe> katco: am just writing up a bug report now
<katco> rogpeppe: kk
<fwereade> rogpeppe, yeah, I thought that should work
<rogpeppe> katco: tbh i didn't know that different models in the same controller *could* have different sets of tools. seems like it might be easy to introduce problems there.
<katco> rogpeppe: well if you're using --upload-tools, you're off a production path anyway. all bets are off
<rogpeppe> katco: well, even if you're not
<katco> rogpeppe: i think different models having different agents satisfies the dev, test, prod use-case
<alexisb> sinzui, ack
<rogpeppe> katco, fwereade, ericsnow, natefinch: https://bugs.launchpad.net/juju-core/+bug/1570917
<mup> Bug #1570917: upgrade-juju: success but then deploy fails <juju-core:New> <https://launchpad.net/bugs/1570917>
<katco> rogpeppe: what's interesting is that every shop i've been a part of would test the entire installation of juju -- controller and all -- in a dev, test, prod
<katco> rogpeppe: ta for the bug report
<rogpeppe> katco: yes, because the controller is an integral part of the behaviour of the system
<katco> rogpeppe: yep
<rogpeppe> katco: you can't really usefully test a dev version without testing the server part too
<katco> rogpeppe: i suppose in a multi-user world, you could look at the controller as a separately managed thing
<katco> rogpeppe: to draw an analogy, it is like the OS upon which your app (another model) runs
<katco> rogpeppe: ops people would test the controller in a separate env. app-devs would use dev and test models to test their workloads
<abentley> mgz_, sinzui: http://10.0.30.100/MAAS/ is giving me "Service unavailable" even after a reboot.  I think it's forgotten that it was our parallel-maas19
<katco> rogpeppe: app-devs might not have access to nor care about the admin model
<katco> rogpeppe: uncharted waters here i suppose
<rogpeppe> katco: it depends what reasons one might have for using different juju versions
<katco> rogpeppe: btw upgrade-juju has a -m flag; no need to switch first
<rogpeppe> katco: ah
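The workflow katco and ericsnow describe, sketched as commands (the model names and version are placeholders, and the --agent-version spelling is an assumption for the 2.x client under discussion; flag names varied across the betas):

    # upload locally built agent binaries; only the admin model accepts --upload-tools
    juju upgrade-juju -m admin --upload-tools
    # other models are upgraded separately, naming the version just uploaded
    juju upgrade-juju -m default --agent-version 2.0-beta4.1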
<abentley> rogpeppe: If you're interested in an alternative to --upload-tools that uses actual simplestreams, I'd be happy to work with you.  I've got lots of experience with simplestreams.
<rogpeppe> abentley: tbh this is all overhead on top of what i'm trying to do atm
<rogpeppe> abentley: but i'd be interested to see your alternative
<katco> ericsnow: standup time
<abentley> rogpeppe: What I have right now is code that generates simplestreams json that can then be formatted as actual simplestreams metadata.  lp:juju-release-tools make_agent_json.py
<abentley> rogpeppe: I think it would be good to make this as convenient as --upload-tools, so that devs don't need to use --upload-tools any longer.
<rogpeppe> abentley: if it was, i'd use it
<rogpeppe> abentley: if only simplestreams was simple :)
<icey> how do we configure apt-http-proxy in juju2 now by default for a specific cloud?
<abentley> rogpeppe: If you're up for working with me, I'm happy to try.  I need dev input to get it to the point where it does what everyone needs.
<babbageclunk> frobware, dooferlad, dimitern: spaces sync? Am I in the wrong place, or is it not happening?
<rogpeppe> abentley: it's all a bit fraught around here currently, but i'll let you know if i get a few minutes free :)
<dooferlad> babbageclunk: I don't remember that meeting happening recently...
<babbageclunk> I mean, I just see meeting notifications and click on them, I don't know nothing.
<fwereade> sinzui, alexisb, I have to go out in a sec so I can't test in detail but I'm reasonably confident that http://paste.ubuntu.com/15850768/ will fix lp:1570035
<alexisb> fwereade, you still around?
<alexisb> lol
<fwereade> alexisb, ha :)
<alexisb> fwereade, you read my mind
<alexisb> I leave you to it
<alexisb> thanks!
<fwereade> that was serendipitous :)
<alexisb> I will assign the bug to you
<fwereade> alexisb, sinzui: yeah, I think that's crack actually, I misread something
<fwereade> alexisb, sinzui: will poke at it while I can and update with what I know
<alexisb> fwereade, assume it's assigned to you; if you can take it, great; if you need to hand it off please unassign yourself and send me a note
<fwereade> alexisb, will do
<voidspace> alexisb: I can bootstrap and deploy with MAAS2!
<voidspace> alexisb: actually, to be fair I *could*, I've completed the implementation of Subnets but not yet tested, and I'm about to try a manual check that I can *still* bootstrap and deploy...
<alexisb> SWEET!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
<voidspace> alexisb: if it works I'll point you at the branch
<alexisb> voidspace, awesome thank you
<mup> Bug #1067213 changed: race: concurrent deployments corrupt secret token <deploy> <race-condition> <juju-core:Fix Released> <https://launchpad.net/bugs/1067213>
<mup> Bug #1086236 changed: environs/ec2: concurrent deployments fail creating the s3 bucket. <ec2-provider> <race-condition> <test-needed> <juju-core:Fix Released> <https://launchpad.net/bugs/1086236>
<mup> Bug #1570917 opened: upgrade-juju: success but then deploy fails <juju-core:New> <https://launchpad.net/bugs/1570917>
<fwereade> alexisb, I have to stop; proposed a quick solution in the bug
<fwereade> and unassigned myself
<alexisb> fwereade, awesome thank you
<alexisb> katco, ^^^ given this is a blocker it is high priority for the team
<alexisb> well with katco out :)
<alexisb> natefinch, perrito666, ericsnow what are you guys up to atm??
<natefinch> alexisb: finishing up a bugfix.. should be proposing soon
<ericsnow> alexisb: knee deep in 2 bugs
<alexisb> natefinch, can you pick up the bug fwereade was working on
<alexisb> lp:1570035
<alexisb> once you are done
<natefinch> sure, I can pick up #1570035
<mup> Bug #1570035: Race in api/watcher/watcher.go <ci> <race-condition> <regression> <test-failure> <juju-core:In Progress> <https://launchpad.net/bugs/1570035>
<alexisb> awesome, thank you!
<lazyPower> http://paste.ubuntu.com/15851796/
<lazyPower> Has anyone seen a model controller refuse to serve a charm due to error code 400?
<lazyPower> if it were 401 i'd reasonably think it were auth based, (charms are everyone read), 404 not found, but 400? wat?
<natefinch> lazyPower: weird
<lazyPower> natefinch - indeed. i have an active env if that's helpful
<alexisb> lazyPower, not one i have seen
<alexisb> lazyPower, natefinch is busy
<lazyPower> ack
<alexisb> lazyPower, a bug is a good place to start
<alexisb> as always :)
 * alexisb changes location
<frobware> dooferlad: ping
<frobware> dooferlad: I just tried your latest change live against a node: http://pastebin.ubuntu.com/15852945/
<frobware> dooferlad: it could have been there on your first PR -- I didn't try live, it was just a review
<mup> Bug #1570963 opened: Model Controller refuses deployment with error code 400 <juju-core:New> <https://launchpad.net/bugs/1570963>
<perrito666> alexisb: delayed answer, having lunch
<frobware> dooferlad: fyi - https://bugs.launchpad.net/juju-core/+bug/1564397/comments/1
<mup> Bug #1564397: MAAS provider bridge script deletes /etc/network/if-up.d/ntpdate during bootstrap <bootstrap> <network> <juju-core:Triaged> <https://launchpad.net/bugs/1564397>
<mup> Bug #1570994 opened: deploy fails to download updated local charm <juju-core:New> <https://launchpad.net/bugs/1570994>
<bogdanteleaga> is there something out there for creating aliases between commands? I want command A = command B + some args
<natefinch> bogdanteleaga: you mean like command line aliases, or something else?
<bogdanteleaga> natefinch, yes
<bogdanteleaga> for example: "juju show-run-status" = "juju show-action-status --name juju-run"
<natefinch> bogdanteleaga: linux or windows?
<bogdanteleaga> natefinch, I don't think it matters
<natefinch> bogdanteleaga: it does if I tell you to put something in your bashrc :)
<bogdanteleaga> natefinch, hehe, there's PS aliases too
<bogdanteleaga> natefinch, I guess that's a solution, but I thought having it built in would be nicer
<natefinch> bogdanteleaga: actually... thumper was doing something with aliases
<bdx> what's going on, everyone? I'm currently having issues when trying to add ssh keys to my environment using the `juju add-ssh-keys` command - using juju2 beta4 ... any insight or recommendations here? thx
<natefinch> bogdanteleaga: I don't see anything in juju help about aliases though, so maybe it never made it in
<bdx> when I enter 'juju import-ssh-keys `cat id_rsa.pub` --model lxd-share', the command completes successfully, but following the command I don't see any new ssh keys in my environment ....
<bogdanteleaga> natefinch, yeah there's other kinds of aliases, but not ones that support args
<bogdanteleaga> natefinch, I think I'll just add it to run's docstring for now
<natefinch> bdx: try putting --model lxd-share before the call to cat
<bdx> natefinch: negatory
<natefinch> bdx: well, that's sort of good, it's not supposed to matter where the flag goes, but occasionally I've seen problems with specific commands.
<natefinch> bdx: not sure what's going on there.
<natefinch> bdx: file a bug, if you would, and include machine-0 logs if you can.
<bdx> natefinch: 'juju add-ssh-keys --model lxd-share `cat id_rsa.pub`' fails with an error every time too .... I can't seem to get the command to complete with success
<bdx> natefinch: alright .... do you experience the same behavior?
<natefinch> bdx: lemme give it a try
<natefinch> bdx: what error are you getting?  I'm getting "cannot add key <my email>: invalid ssh key: <my email>"
<bdx> natefinch: using 'juju add-ssh-keys --model lxd-share `cat id_rsa.pub`' - yea
<bdx> natefinch: does 'juju import-ssh-keys `cat id_rsa.pub`' work for you?
<natefinch> bdx: nope, still says invalid key ... man, that command needs a flag or something to take a filename
<natefinch> alexisb: I see your name on add_sshkeys.go .... how is that supposed to work?
<alexisb> natefinch, I flattened the command
<alexisb> when I tested it I just put the key on the cl
<natefinch> weird, I can't get it to accept my key
<bdx> natefinch: so I don't get the error with 'juju import-ssh-keys `cat id_rsa.pub`', but I also don't get a key added
<alexisb> natefinch, bdx none of this sounds good
<alexisb> bdx, can you please open a bug and I will get someone on it asap
<bdx> totally, omp
<alexisb> so bdx I am able to import a key
<alexisb> but I had to actually copy the key in the CL
<alexisb> just the key
<alexisb> not the full output of id_rsa.pub
<bdx> alexisb: nice! does that work for 'juju add-ssh-keys' too?
<bdx> https://github.com/juju/juju/issues/5187
<natefinch> interesting... I think it's something wrong with cat'ing the file
<natefinch> I got it
<natefinch> you gotta "`cat ~/.ssh/id_rsa.pub`"
<natefinch> note the extra quotes
<bdx> oooooh NICE!!!!
<bdx> natefinch: so ^ got `juju add-ssh-keys` to work, but `import-ssh-keys` still has no effect on anything
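natefinch's fix, spelled out, since the difference is easy to miss: without the double quotes the shell word-splits the expanded key, so each space-separated piece arrives as a separate argument.

    # fails: the unquoted `cat` expansion is split on spaces
    juju add-ssh-keys --model lxd-share `cat ~/.ssh/id_rsa.pub`
    # works: double quotes keep the whole key as one argument
    juju add-ssh-keys --model lxd-share "`cat ~/.ssh/id_rsa.pub`"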
<natefinch> bdx: I don't honestly know the difference between add and import
<bdx> natefinch: alright
<bdx> I updated the bug to address that too
<bdx> natefinch: thanks for your help
<natefinch> bdx: you're adding the key to the lxd-share model... are you also checking for the key in a machine from that model?
<bdx> natefinch: yes
<natefinch> bdx: ok, cool, just double checking :)
<natefinch> bdx: the new model stuff always screws me up
<bdx> natefinch: yea, it took me a minute to adjust too, good looking out!
<natefinch> huzzah, rogpeppe just deployed a charm with resources from the charmstore!
<natefinch> ericsnow, katco ^^
 * natefinch does a dance
<redir_lunch> bbiab
<ericsnow> natefinch: \o/
<mup> Bug #1571053 opened: container networking lxd 'Missing parent for bridged type nic' <ci> <lxd> <juju-core:Triaged> <https://launchpad.net/bugs/1571053>
<natefinch> alexisb: ping https://github.com/juju/juju/pull/5189
<mup> Bug #1571065 opened: Panic on bundle with local charm and no series <ci> <deploy> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1571065>
<natefinch> anyone got time for a quick review of a fix for a CI blocker? http://reviews.vapour.ws/r/4617/  ericsnow?
<ericsnow> natefinch: sure
<natefinch> ericsnow: I apologize in advance for the use of patching, export_test, etc.  I'm doing what fwereade_ recommended in his review of the bug earlier.
<ericsnow> natefinch: np :)
<ericsnow> natefinch: ship-it-ish
<mgz_> natefinch: you ran with -race locally with your fix?
<perrito666> wow ericsnow turned into ned flanders
<perrito666> :p
<natefinch> mgz_: indeed
<natefinch> $ go test -race -check.f=TestWatchForProxyConfigAndAPIHostPortChanges
<natefinch> OK: 1 passed
<natefinch> PASS
<natefinch> ok  	github.com/juju/juju/api/proxyupdater	1.165s
<mgz_> well, that may save us doing dodgy hacks to get the weekend tests running then
<mup> Bug #1571082 opened: autopkgtest lxd provider tests fail for 2.0 <lxd-provider> <packaging> <juju-core:Triaged> <juju-core (Ubuntu):New> <https://launchpad.net/bugs/1571082>
<redir> pretty sure this test is gonna timeout
<perrito666> redir: it is ugly when you are not sure whether a test is just slow or it's going to time out
<redir> pretty sure this usually doesn't take so long...usually
<redir> getting used to the ones that pause my life
<perrito666> I have two gopaths with different branches of juju so I can do something in the other one while running tests
<redir> speak of the devil...
<redir> hah
<redir> only two?
<redir> jk
<mgz_> anyone know how to configure the lxd provider to use daily images?
<alexisb> evening all, have a great weekend!
<perrito666> alexlist: you too
<perrito666> lool, too late
#juju-dev 2016-04-16
<redir> later #juju-dev
<redir> EOW
<mup> Bug #1571131 opened: juju add-ssh-key $(cat ~/.ssh/a-key.pub) needs quoting in helptext <helpdocs> <juju-core:Triaged by reedobrien> <https://launchpad.net/bugs/1571131>
<mup> Bug #1571254 opened: Can't deploy multiseries charms in bundles <juju-core:New> <https://launchpad.net/bugs/1571254>
<mup> Bug #1571254 changed: Can't deploy multiseries charms in bundles <juju-core:New> <https://launchpad.net/bugs/1571254>
<mup> Bug #1571254 opened: Can't deploy multiseries charms in bundles <juju-core:New> <https://launchpad.net/bugs/1571254>
#juju-dev 2016-04-17
<mup> Bug #1532932 changed: Unable to bootstrap the lxd provider on vivid <2.0-count> <adoption> <juju-release-support> <juju-core:Fix Released> <https://launchpad.net/bugs/1532932>
<mup> Bug #1571254 changed: Can't deploy multiseries charms in bundles <juju-core:Invalid> <https://launchpad.net/bugs/1571254>
<mup> Bug #1544850 changed: unit-test failure: cloudImageMetadataSuite.TestFindMetadata <ci> <intermittent-failure> <test-failure> <unit-tests> <juju-core:Expired> <https://launchpad.net/bugs/1544850>
<mup> Bug #1545216 changed: upgrade-mongo panic: juju home hasn't been initialized <ci> <mongodb> <test-failure> <upgrade-mongo> <juju-core:Expired> <https://launchpad.net/bugs/1545216>
<mup> Bug #1571254 opened: Can't deploy multiseries charms in bundles <juju-core:New> <https://launchpad.net/bugs/1571254>
<mup> Bug #1570963 changed: Model Controller refuses deployment with error code 400 <juju-core:New> <https://launchpad.net/bugs/1570963>
<menn0> wallyworld: morning
<wallyworld> hey
<menn0> wallyworld: is it a known thing that juju no longer works on vivid
<wallyworld> could be. vivid is out of support anyway
<menn0> wallyworld: with master I now get:
<menn0> $ juju bootstrap local lxd --config ~/canonical/juju-local.yaml --upload-tools
<menn0> ERROR cannot find network interface "lxdbr0": route ip+net: no such network interface
<menn0> ERROR invalid config: route ip+net: no such network interface
<menn0> and there's no current lxd packages for vivid
<wallyworld> not surprised
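A quick sanity check for the bridge error above, using plain iproute2 (nothing juju-specific; older lxc/lxd packages shipped lxcbr0 rather than lxdbr0):

    # does the bridge juju is looking for actually exist on the host?
    ip link show lxdbr0
    # list all interfaces to see what is really there
    ip link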
<menn0> time to upgrade my laptop I guess
<wallyworld> yep
<menn0> I was holding out until xenial was final :)
<wallyworld> i've been on xenial for a while, it works fine
<menn0> wallyworld: next question, I'm looking at this replicaset voting bug. the issue seems to be that the mongodb on the new controller hosts can't find the replicaset config.
<menn0> wallyworld: this familiar error from mongodb: replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
<menn0> wallyworld: the replicaset is fine on the bootstrap node though
<wallyworld> joy
<menn0> wallyworld: i'm trying to find where the new controller nodes get added to the replicaset and I haven't found it yet. any pointers?
<wallyworld> i have nfi sorry :-(
<wallyworld> oh wait
<wallyworld> in state somewhere i think there is code to manage that, let me check
<wallyworld> menn0: oh, and there's a peergrouper worker
<wallyworld> i think that worker responds to the changes in state
<wallyworld> menn0: addmachine.go in state has func (st *State) EnableHA
<wallyworld> that sets up the desired replicaset state
<wallyworld> and the worker then acts on it
<menn0> wallyworld: ok, thanks. I'll keep digging from there.
<wallyworld> sorry, that's all i know
#juju-dev 2017-04-10
<anastasiamac> axw: sure, i'll look \o/
<axw> anastasiamac: if you liked that PR, you'll love this one: https://github.com/juju/juju/pull/7216 ;)  if you have any time... it's kinda big
<anastasiamac> axw: i saw and got excited but it may need to wait until after lunch... if it's k
<axw> anastasiamac: no worries. finally going to look at 1.8 test failures this morning
<anastasiamac> axw: \o/
<anastasiamac> axw: i was going to say "unless u want to figure out why multiwatcher is not seeing changes"... but m not even convinced that this is an issue m looking at now... is there any way that transaction integrity could b violated in some way?
<anastasiamac> axw: maybe i need more coffee too..
<axw> anastasiamac: I don't know
<axw> veebers: are you aware that qa.jujucharms.com is unavailable?
<anastasiamac> axw: yes :) veebers is working on resurecting it.. apparently 1.25 is involved :D
<axw> anastasiamac: okey dokey
<veebers> axw: aye, looking. (as anastasiamac says)
<anastasiamac> axw: reviewed (could not resist) :D
<axw> anastasiamac: thanks :)
<axw> anastasiamac: interestingly, I have no applications in the modelEntityRefs collection for the model that's looping
<axw> anastasiamac: nor machines, volumes, filesystems
<anastasiamac> axw: i do occasionally... yeah, no machines/volumes/filesystems, but "ubuntu" as an app makes an appearance
<anastasiamac> axw: so the best i can say is that the application gets removed at/before/after the watcher is created... ?
<anastasiamac> axw: no changes to app collection come thru, but the first check to see if model is empty still believes that there is an app in entityrefs tables
<axw> anastasiamac: seems that restarting the controller agent resolved it in this case
<anastasiamac> axw: right
<anastasiamac> axw: because the check for model emptiness sees that there is nothing for apps in modelentityrefs, coz the "lingering" app was removed
<anastasiamac> axw: m re-bootstrapping with a status update to display the actual error that we get when we r infinitely waiting for changes :)
<anastasiamac> axw: it's https://github.com/juju/juju/blob/develop/state/model.go#L1047
<anastasiamac> axw: and because we only re-check if model is ready to die if/when we get changes, we'll never re-check again coz we have not seen any changes on application and machine collections
<axw> anastasiamac: because they're separate collections, it is theoretically possible that we'll observe the app removal before the ref removal
<axw> anastasiamac: I would've thought it'd happen too quickly, but it is theoretically possible
<mup> Bug #1615986 changed: Agents failing, blocked by upgrade but no upgrade performed <canonical-is> <juju:Invalid> <juju-core:Expired> <juju-core 1.25:Expired> <https://launchpad.net/bugs/1615986>
<mup> Bug #1661681 changed: Broken agent complaints about tomb: dying <juju-core:Expired> <https://launchpad.net/bugs/1661681>
<axw> anastasiamac: teeny weeny review? https://github.com/juju/juju/pull/7217
<anastasiamac> axw: will look in a sec... m just about to propose my changes and will need u to return a favor :D
<axw> anastasiamac: okey dokey
<veebers> Quick Q, how can I tell if a filesystem on an AWS instance is persistent? I need to backup the juju db so I can safely wipe it (to get replication started again)
<jam> axw: reviewed 7217
<veebers> I have limited '/' space, but see "/dev/xvdb       3.9G  8.1M  3.7G   1% /mnt" which would be enough for the back up. Can I be sure that's not going to disappear on me?
<axw> jam: thanks
<anastasiamac> jam: tyvm!
<jam> axw: I'm concerned that the proxy changes indicate a real problem that we need to understand
<veebers> axw: would you know? ^^
<axw> veebers: sorry I don't understand your first question about persistence
<veebers> axw: ah right, I'm concerned that if I move the backup to that filesystem (under /mnt), if the machine restarts will that filesystem be there again (or is it some semi-permanent FS that gets lost on reboot)
<veebers> axw (it's an aws question, not juju :-))
<veebers> I only ask because I saw this: https://forums.aws.amazon.com/thread.jspa?messageID=562254 where they say they rebooted and lost their /dev/xvdb
<axw> veebers: I see. IIRC the cloud images mount the ephemeral block device at /mnt. you should be able to confirm by looking in the ec2 console, and looking at the volumes associated with the instance
<axw> veebers: IOW, probably not persistent, but you can check that way
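One way to check from inside the instance, via the EC2 instance metadata service (the mapping names are AWS's, not juju's):

    # instance-store (ephemeral) devices are listed as ephemeral0, ephemeral1, ...
    curl -s http://169.254.169.254/latest/meta-data/block-device-mapping/
    # EBS-backed devices persist; ephemeralN contents survive a plain reboot
    # but are lost on stop/start, so they're no place to leave a backup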
<veebers> axw: ack will check. Thanks
<axw> jam: I've responded to your comments
<axw> anastasiamac: and I've responded to your comments on the vsphere PR, if you can please see my replies before I type $$billz$$
<anastasiamac> axw: sure thing, gimme a sec - almost there :D
<anastasiamac> axw: jam: PTAL https://github.com/juju/juju/pull/7218
<axw> anastasiamac: looking
<axw> anastasiamac: presumably you've tested, and you're not seeing failures anymore?
<anastasiamac> axw: thnx :) m happy with 7216
<anastasiamac> axw: yes, tested :) we also have CI tests that exercise different scenarios than the one i've used.. i cannot fail it any more... but if u r keen, feel free to run it too :D
<jam> axw: I have comments, I feel like we should probably just try to chat about them directly.
<jam> I have to start up a long-running test here, once I get it kicked off, I think a hangout/IRC chat is appropriate
<axw> jam: sure
<axw> anastasiamac: a few little things, main thing is to move the settings changes out
<anastasiamac> axw: thnx, i'll look :D
<jam> anastasiamac: i had a couple comments as well
<anastasiamac> jam: \o/
<anastasiamac> jam: axw: PR 7218 updated, manual tests pass... happy for me to land?
<axw> anastasiamac: yup, it's accepted
<anastasiamac> axw: awesome \o/
<anastasiamac> axw: jam: the bit that was a drive-by
<anastasiamac> https://github.com/juju/juju/pull/7219
<axw> jam: ready to chat yet?
<jam> axw: ah sorry, yeah, let's go ahead and do it, I'll interrupt my other interruptions :)
<jam> axw: https://hangouts.google.com/hangouts/_/canonical.com/jam-axw?authuser=1
<axw> jam: IRC is fine for me, then you can multi task
<axw> or that
<rogpeppe> anyone care for a review of some changes to some of the API testing code? https://github.com/juju/juju/pull/7222
<perrito666> hey annybody here?
<rick_h> perrito666: party
<perrito666> rick_h: hehey, do you happen to know if curtis or aaron are on?
<rick_h> perrito666: not sure, /me looks for abentley_ and sinzui (not here)
<abentley_> perrito666: Hi.
<perrito666> abentley_: hello :) I had not seen you with that underscore
<perrito666> abentley_: I am looking for someone with privileges to remove me from the canonical organization on github; juju is quite spammy to my email :)
<abentley_> perrito666: Are you sure that will have the desired effect?  AIUI, Juju is under the Juju team, not Canonical.
<rick_h> perrito666: let me know which org and I can pull it
<perrito666> rick_h: I did not know you had that kind of power ;)
<rick_h> perrito666: :)
<rick_h> perrito666: removed from both
<perrito666> rick_h: ah cool, now I can actually see notifications from my own projects :) cheers
<rick_h> perrito666: party on
<menn0> anastasiamac: while testing some stuff I kept running into the same issue you've been looking at (can't destroy model when there's an app with no units)
<menn0> anastasiamac: restarting the controller is an inconvenient workaround
<anastasiamac> menn0: on tip?
<menn0> thumper: ^^^
<anastasiamac> menn0: coz i've landed a fix yesterday
<menn0> anastasiamac: hmmm
<menn0> anastasiamac: not sure if this was with your fix or not
<menn0> anastasiamac: i'll try again
 * anastasiamac holding breath and looks over menn0's shoulder...
 * menn0 has also managed to get the juju client into a state where "juju models" segfaults
<menn0> will look into that soon
<anastasiamac> thumper: menn0: wallyworld_: axw: here is an interesting question: should retry-provisioning work on containers?
<menn0> anastasiamac: I would have thought so
<axw> anastasiamac: I don't see why not
<anastasiamac> menn0: I was going to look into juju models in general.. we have a lot of different failure scenarios for it.. but feel free to lend a hand :D
 * menn0 has no time for any of this
<anastasiamac> menn0: axw: k, I'll triage it as a wishlist
<axw> anastasiamac: seems like a bug to me.
<anastasiamac> menn0: don't worry about list models then :D i'll address (at some stage)... but really really want to know if u cannot destroy model still
<anastasiamac> axw: does it? do we have code that allows it to cater for containers already?
<wallyworld_> it wasn't designed to work with containers
<wallyworld_> but it could
<axw> anastasiamac: I don't know, but how is that relevant? lack of code does not mean it's not a bug
<wallyworld_> it was designed to handle transient cloud failures
<wallyworld_> containers should not have the same class of transient errors
<axw> wallyworld_: why? the host/container system can fail in transient ways too
<wallyworld_> "the same class of"
<wallyworld_> but yeah
<wallyworld_> there's no reason why the code can't be extended
<wallyworld_> i think there's an IsTransient method or something on an error?
<anastasiamac> axw: lack of code indicates an RFC - feature improvement: not a bug but wishlist: "fill this functional gap for me" :D
<wallyworld_> i can't recall if that's what is checked
<axw> anastasiamac: a design bug is still a bug.
<anastasiamac> axw: and when i say wishlist, it does not mean we should not address :)
#juju-dev 2017-04-11
<blahdeblah> Anyone able to interpret the output of jam's script from https://bugs.launchpad.net/juju/+bug/1680683/comments/3 for me?  https://pastebin.canonical.com/185461/  Looks like something went wrong, possibly because it was run on a non-primary node?
<mup> Bug #1680683: Poor "juju create-backup" performance <canonical-is> <juju:Incomplete> <https://launchpad.net/bugs/1680683>
<menn0> anastasiamac: looks like the "juju models" crash is related to a change in the prototype branch I'm working on.
<menn0> anastasiamac: probably nothing to worry about
<anastasiamac> menn0: what about destroy model? r u still seeing the failure? even with my fix :( ?
<menn0> anastasiamac: still testing that
 * menn0 is juggling 4 things at once
 * anastasiamac still holds breath then 
<menn0> anastasiamac: gah! i'm not even able to reproduce the problem without your fix now
<blahdeblah> ah, worked it out - needs to be run on the primary
<blahdeblah> anastasiamac: ^
<menn0> anastasiamac: i'll let you know if I see the problem with your fix in place, but let's assume it's all good for now :)
 * menn0 has to go
<anastasiamac> menn0: excellent, i'll breathe again then
<menn0> anastasiamac: it's actually fairly likely I was seeing it in a controller that didn't have your fix
<anastasiamac> menn0: awesome \o/ i'd be surprised if u'd see it again :D
<anastasiamac> (with my fix i mean)
<axw> babbageclunk: standup?
<menn0> anastasiamac: I like your confidence :)
<anastasiamac> menn0: if m not confident in mymself, what m doing here?...
 * anastasiamac sighs.. if only i could type
<menn0> true :)
<mup> Bug #1681287 changed: juju should retry failed update-status hooks <juju-core:Won't Fix> <https://launchpad.net/bugs/1681287>
<mup> Bug #1681287 opened: juju should retry failed update-status hooks <juju-core:Won't Fix> <https://launchpad.net/bugs/1681287>
<mup> Bug #1681287 changed: juju should retry failed update-status hooks <juju-core:Won't Fix> <https://launchpad.net/bugs/1681287>
<wallyworld> axw: if vsphere vms have extraConfig metadata that can hold tags, why do we still also need folders?
<axw> wallyworld: AFAIK you can't see the metadata in the UI
<axw> wallyworld: folders are the standard way of organising
<wallyworld> ok, so purely a user facing aide, that seems fine
<axw> wallyworld: it also makes things like DestroyController more efficient, since you don't need to trawl through all the VMs then
<axw> not a big deal, but it's something
<wallyworld> yeah, i noticed that
<wallyworld> btw, my isp is having issues atm, so i can't get to launchpad or streams.canonical.com :-(
<wallyworld> sigh, makes it hard to look into 2.2 bugs
<wallyworld> axw: pr lgtm, we'll need to ensure CI tests are updated etc
<axw> wallyworld: thanks. CI should add a test for migrate & kill-controller, I think everything else is covered
<wallyworld> axw: i was thinking specifically of verifying folder structure etc
<axw> wallyworld: ok. I don't know what we normally test in that regard, but seems reasonable
<wallyworld> yeah, especially since it's a user facing artefact we are relying on
<axw> wallyworld: I've made a small change so that the upgrade step no longer removes the old metadata. it causes an error when the value is empty
<axw> (despite what the docs suggest...)
<wallyworld> ok
<wallyworld> i doubt it will matter much in practice
<axw> wallyworld: just a little messy, no big deal
<axw> nobody sees it anyway
<wallyworld> yep
<mattyw> who's ready for a question?
<mattyw> wallyworld, you for example?
<ashipika> o/ ;)
<jam> anyone around that can look at https://github.com/juju/txn/pull/28
<jam> wpk: ^^ its not your area but maybe you'd like exposure :)
<jam> I was hoping wallyworld or axw might still be around
<axw> jam: I am but need to go pick kids up from my mother in law's
<jam> oh sure, play the mother-in-law card :)
<jam> axw: np
<axw> jam: if I get a chance I'll look later on
<jam> it is small
<jam> its a bug that hasn't landed in juju-core proper yet, cause I was testing it before landing
<jam> but lack of testing in juju/txn
<axw> jam: yeah that's trivial, I can review now
<axw> jam: LGTM
<joedborg> Quick juju question - will 2.2 have the same behaviour as 2.1.2 in terms of not automatically bridging interfaces?
<rick_h> joedborg: yes, the behavior will be the same as far as I'm aware. You'll need to specify interfaces you want setup on containers.
<joedborg> cheers rick_h!
<babbageclunk> axw: ping?
<babbageclunk> or menn0: ping?
<menn0> babbageclunk: hi (not working though)
<babbageclunk> menn0: just a really quick question, promise
<menn0> babbageclunk: ok
<babbageclunk> menn0: It seems like units on a machine could be bound to different spaces?
<babbageclunk> menn0: But then wouldn't they have different public addresses? Or is only one space able to have public addresses?
<babbageclunk> menn0: the code only supports (zero or) one public address for the machine and unit.PublicAddress() just delegates to the machine, so I guess I'm wrong.
<menn0> babbageclunk: AFAIK they could have different public addresses
<menn0> babbageclunk: i'm not sure, but I think the single public address was an overly simplistic idea which I think we're moving away from
<babbageclunk> menn0: Hmm.
<menn0> jam would know better
<babbageclunk> menn0: ok, thanks.
<menn0> machine/unit.PublicAddress() is probably problematic
<babbageclunk> menn0: I just realised jam would be the person to ask.
<menn0> and it's his afternoon so hopefully he's about or will be back soon
<babbageclunk> menn0: yeah, good call - sorry to distract you! Thanks for confirming that there's probably something fishy going on there, anyway.
<menn0> babbageclunk: no worries - don't work too late :)
<babbageclunk> menn0: no, I'm about to crash out (or get shanghaied by a baby)
<babbageclunk> jam: I'm probably going to have to drop soon, but if/when you see this any insights would be welcomed!
<jam> babbageclunk:
<jam> if you're around, I can join a HO or we can IRC
<jam> was just on a phone call
<babbageclunk> jam: hey - still here, let's hangout!
<jam> babbageclunk: https://hangouts.google.com/hangouts/_/canonical.com/afternoon-jam?authuser=1
<babbageclunk> jam: you froze
<jam> yeah, brb
<jam> trying to reconnect
<thumper> anastasiamac: coming...
<thumper> just restarting chrome
<babbageclunk> wallyworld: around?
<wallyworld> babbageclunk: i am, but am waiting to head into a meeting
<babbageclunk> wallyworld: ok, I'll grab you after.
<wallyworld> babbageclunk: was it a quick question?
<babbageclunk> wallyworld: maybe (or else it can be something to distract you in a boring meeting) - are you sure that a unit won't enter scope until it's assigned?
<babbageclunk> (to a machine)
<babbageclunk> wallyworld: there's code in the ingress address watcher that suggests otherwise, although I'm not sure whether that's defensive coding.
<wallyworld> babbageclunk: i thought i was sure yes
<wallyworld> it's defensive
<babbageclunk> ok - needing to handle that would make the stuff I'm doing quite a bit fiddlier (I think)
<wallyworld> yeah
<babbageclunk> wallyworld: oh, one more quick one - the IAWatcher maintains a set of addresses (as the current output) and the known map (unit -> address) in sync. I'm tempted to generate the former from the latter when it's needed - is there any problem with that?
<babbageclunk> wallyworld: I'll get the tests passing the way it is and do the change in a separate commit to see if that breaks anything.
<wallyworld> babbageclunk: not really, given the size of the lists etc, should be ok
<babbageclunk> cool cool
<babbageclunk> thanks
<wallyworld> nw
<cmars> is there something in juju that can tell me the OS name (not series) of a machine? like "ubuntu" instead of "trusty", "windows" instead of "win2008somethingsomething", etc.?
<cmars> by something i mean go function in juju/juju/...
<cmars> aha, juju/utils/series.GetOSFromSeries
<hml> anastasiamac: ty, i forgot about adding the pr to the bug
<anastasiamac> hml: nps :)
#juju-dev 2017-04-12
<cmars> hi there, can i get a review of https://github.com/juju/juju/pull/7228 ?
<wallyworld> cmars: looking
<wallyworld> cmars: reviewed with a wish to count unknown OS type also
<cmars> wallyworld, excellent suggestions, updated 7228, ptal
<wallyworld> cmars: looking
<cmars> bah
<cmars> i could collapse that increment
<cmars> one sec
<wallyworld> cmars: yeh, lgtm, feel free to land
<cmars> wallyworld, all right, thanks!
<anastasiamac> wallyworld: axw: jam: thumper: menn0: here is another recurrence of a wrong upgrade to a devel version when a released one is out: https://bugs.launchpad.net/juju/+bug/1681853
<mup> Bug #1681853: Juju Tools Upgrade from 2.0.2 to 2.1.2 fails: ERROR no matching tools available <juju:Incomplete> <https://launchpad.net/bugs/1681853>
<wallyworld> anastasiamac: that's a x-stream issue
<anastasiamac> wallyworld: my wording is poor - from 2.0.2 to 2.1.2.1 when 2.1.2 is expected
<babbageclunk> wallyworld: wanna talk about networking stuff?
<wallyworld> babbageclunk: sure, just in meeting, will finish real soon
<babbageclunk> wallyworld: okies
<axw> wallyworld: FYI on vsphere, apt update/dist-upgrade are being run
<wallyworld> axw: on xenial?
<axw> wallyworld: yes
<wallyworld> ok, thanks. i'll test on aws also
<wallyworld> might just be joyent
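The knobs that control this behaviour, assuming the 2.x model-config key names:

    # skip the apt dist-upgrade (and the apt update) normally run when provisioning
    juju model-config enable-os-upgrade=false
    juju model-config enable-os-refresh-update=false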
<wallyworld> babbageclunk: now?
<wallyworld> standup HO?
<babbageclunk> yup
<thumper> this bug is turning into a world of hurt
 * thumper is done for today
<menn0> jam: this PR puts the beings docs on a diet: https://github.com/juju/juju/pull/7230
<jam> menn0: looking
<jam> menn0: reviewed
<menn0> jam: thanks
<menn0> jam: I agree that iter is better. will change
<menn0> jam: and yes the proposed upgrade step and/or pruning logic will take care of old data
<menn0> jam: regarding the upgrade step, I'm not sure if we can safely do it as API connections are already up when the upgrade-steps worker runs.
<jam> menn0: we have db migration steps that sync between the controllers, don't we?
<jam> menn0: ISTR there was a place that triggered really early because of stuff like that
<menn0> jam: we do but the worker doesn't come up until there's an API connection
<menn0> jam: the state upgrade steps are now defined and run separately from the rest but still within the one worker
<menn0> jam: really we need 2 upgrade-steps workers
<jam> menn0: sounds like we do
<jam> pre-api, and post-api
<menn0> yep
<menn0> sigh
<menn0> I can't really spend much more time on this
<menn0> jam: maybe we just get the presence pruning done, negating the need for the nuke-from-orbit upgrade step
<menn0> although a "compact" call after that first big prune wouldn't go astray
<menn0> jam: ^^
<jam> menn0: presence pruning is generally useful, and I guess it would push out all the old data pretty quickly since everything would be new after an upgrade
<axw> anastasiamac: would you please review? https://github.com/juju/juju/pull/7231
<axw> or jam, wallyworld ^^
<jam> axw: lgtm
<axw> jam: ta
<wallyworld> axw: sorry, was out talking to anastasia
<axw> wallyworld: all good
<wpk> quick one: https://github.com/juju/utils/pull/271
<wallyworld> babbageclunk: any chance i can arm twist you for a review on that PR?
<babbageclunk> wallyworld: looking at it right now, sorry!
<wallyworld> np
<wallyworld> hml: your test skip pr is good to land
<hml> wallyworld: cool, tx
<babbageclunk> wallyworld: LGTM. What's JEM?
<babbageclunk> (I had thought you were saying JUMM in Strine.)
<wallyworld> babbageclunk: Juju Environment Manager (from memory)
<wallyworld> it's the controller proxy
<babbageclunk> Thanks - what does JIMM stand for again?
<wallyworld> can't recall exactly right now!
<wallyworld> thumper: got 30 seconds to jump back into release HO?
<thumper> first part: https://github.com/juju/cmd/pull/51
#juju-dev 2017-04-13
<thumper> anastasiamac: ^^^ that is the first part of my fix
<anastasiamac> thumper: looking ;)
<anastasiamac> thumper: reviewed :D
<axw> wallyworld: did the KVM/LXC image cache expire old images?
<axw> I'm guessing not, since that would require some smarts in the server - and it was just a proxy IIRC
<wallyworld> axw: no, was quite a simple implementation
<wallyworld> axw: babbageclunk : standup?
<babbageclunk> wallyworld: here's that bug: https://bugs.launchpad.net/juju/+bug/1679948
<mup> Bug #1679948: juju bootstrap is failing in the MAAS CI lab  <juju:New> <https://launchpad.net/bugs/1679948>
<wallyworld> TA
<axw> wallyworld hml: got a call from oracle about my trial... did I miss anything?
<axw> oh you're still there
<wallyworld> axw: no, just explaining how build agent etc works
<thumper> anastasiamac: thanks
<thumper> anastasiamac: next one coming up shortly
<anastasiamac> thumper: nps :D m looking forward to the sequel
<wallyworld> veebers: sorry, caught up in meeting, will miss the QA catchup i think
<veebers> wallyworld: nw
<thumper> anastasiamac: https://github.com/juju/juju/pull/7233, +251 −730 (as long as you ignore the 217 files bit)
<anastasiamac> thumper: wow... anything in particular to pay attention to? coz otherwise m going to hit the button with my eyes closed :D
 * thumper looks for the files of interest
<thumper> cmd/juju/commands/main.go
<anastasiamac> thumper: ack
<thumper> cmd/jujud/main.go
<thumper> cmd/plugins/juju-metadata/metadata.go
<thumper> that's it really
<thumper> everything else is dealing with using cmdtesting from the juju/cmd package rather than juju/juju/cmd
<thumper> and the rename of juju/juju/cmd/cmdtesting to juju/juju/cmd/cmdtest
<anastasiamac> thumper: ack. looking -tyvm!!
<thumper> The only thing left in the juju/juju/cmd/cmdtest now is some weird arse function about running a command with the dummy provider
<thumper> can't fix everything at once
<anastasiamac> thumper: agreed. lgtm'ed
<thumper> ta
<babbageclunk> wallyworld: Can you take a look at https://github.com/juju/juju/pull/7234
<wallyworld> sure, i just need a quick bit to eat
<babbageclunk> wallyworld: I need to pop out but after that can we chat about the changes needed for short-circuiting?
<babbageclunk> wallyworld: cool cool
<wallyworld> babbageclunk: sure, can do
<wallyworld> babbageclunk: left some comments, happy to chat whenever
<babbageclunk> awesome, thanks
<babbageclunk> looking now
<anastasiamac> menn0: if you happen to find urself with nothing to do \o/ PTAL at my update of https://github.com/juju/juju/wiki/MgoPurgeTool with everything u've mentioned
<mup> Bug #1492237 changed: juju state server mongod uses too much disk space <canonical-bootstack> <mongodb> <oil-2.0> <uosci> <juju:Won't Fix> <juju-core:Won't Fix> <https://launchpad.net/bugs/1492237>
<jam> anastasiamac: "runaway transactions on apihostports or machines" could be lumped into cleaning up txn references from all documents
<jam> at one point we had specific problems on a couple collections, but now we cleanup all of them
<jam> (the newest one is model documents)
<jam> I don't know if people care as much about the specific issues vs the category of issue, though
<anastasiamac> oh jam, that the wording I did not touch :)
<jam> Prunning I think is spelled Pruning
<anastasiamac> ah, that one would be mine - anything misspelt cannot b menn0 :)
<jam> I'm pretty sure pruning has to run on the primary
<jam> compact needs to be run on secondaries
<jam> we've also found that it isn't 100% compatible with Juju 1.18, though it is close
<anastasiamac> jam: yeah, that is why i said juju 1.25.x :)
<jam> mgopurge drops the db entirely for presence, and we can't do that, but you can drop the individual tables
<mup> Bug #1492237 opened: juju state server mongod uses too much disk space <canonical-bootstack> <mongodb> <oil-2.0> <uosci> <juju:Won't Fix> <juju-core:Won't Fix> <https://launchpad.net/bugs/1492237>
<menn0> anastasiamac: unfortunately I am famous for typos
<menn0> jam: pruning can run anywhere I think (all writes go to the primary), but it's most efficient to run on the primary
<anastasiamac> menn0: jam: fixed everything except does pruning need to/must run on primaries?
<anastasiamac> k. i'll add that as a note too :)
<jam> menn0: well, if it redirects all writes to the primary, are you running on a secondary? I'm not sure what guarantee we are opening the session in
<jam> (strongly consistent auto redirects all requests to the master)
<jam> which would mean that even if you run the script on another machine, it's still running on the primary
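A quick way to confirm which node you are on before pruning (standard mongo shell; the auth and SSL options juju's mongod needs are omitted for brevity):

    # rs.isMaster().ismaster is true only on the replica-set primary
    mongo --eval 'printjson(rs.isMaster())'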
<mup> Bug #1492237 changed: juju state server mongod uses too much disk space <canonical-bootstack> <mongodb> <oil-2.0> <uosci> <juju:Won't Fix> <juju-core:Won't Fix> <https://launchpad.net/bugs/1492237>
<jam> anastasiamac: why did you mark https://bugs.launchpad.net/juju/+bug/1492237 as Won't Fix vs Fix Committed for 2.2 ?
<mup> Bug #1492237: juju state server mongod uses too much disk space <canonical-bootstack> <mongodb> <oil-2.0> <uosci> <juju:Won't Fix> <juju-core:Won't Fix> <https://launchpad.net/bugs/1492237>
<jam> though I suppose the other aspects - "configurable log" and "statuseshistory" - would be the other 2 commits for 2.2
<jam> and then what to do about the charm blob store
<jam> esp. wrt JAAS cache where the admin of the db isn't the one whose data we are working with.
<jam> anastasiamac: coming to https://hangouts.google.com/hangouts/_/canonical.com/discuss-roadmap?authuser=1 ?
<menn0> anastasiamac: r u joining?
<anastasiamac> sorry sorry got distracted by imprtant ppl
<babbageclunk> wallyworld: take another look? https://github.com/juju/juju/pull/7234
<wallyworld> babbageclunk: will do, just in yet another meeting, *sigh*
<babbageclunk> wallyworld: ok, no worries :)
<babbageclunk> wallyworld: running a test against GCE now - wasn't sure whether rebooting the machine was enough - wouldn't enter scope also happen in that case anyway?
<wallyworld> babbageclunk: don't think so - scope entry is on the controller side
<babbageclunk> ah right
<wallyworld> babbageclunk: i'm multi-tasking - how did the GCE testing go?
<wallyworld> axw: quick review if you have time? https://github.com/juju/juju/pull/7235
<axw> wallyworld: sure
<wallyworld> ty
<wallyworld> bbiab after coffee run
<jam> anastasiamac: I see, so we're not "imprtant ppl"... :,(
<anastasiamac> jam: u r... but there is also a higher category of "very imprtant"
<anastasiamac> :D
<jam> anastasiamac: ah, I see. so priority inversion, unable to interrupt the 'imprtant' people to meet with the 'very imprtant' ones
<anastasiamac> :)
<babbageclunk> wallyworld: sorry, was afk.
<wallyworld> no wuckers
<babbageclunk> wallyworld: it looks like it didn't work because the bounced machine hasn't picked up its new public address - I guess a problem with the GCE provider?
<babbageclunk> wallyworld: :(
<wallyworld> babbageclunk: there's an instance address poller that runs but then backs off once the address is known. the agent is also supposed to report addresses from memory. i can't recall the specifics
<wallyworld> but that's not for this PR
<babbageclunk> wallyworld: hmm - I guess the agent wouldn't know that the machine's public address changed, right?
<babbageclunk> wallyworld: so it would have to be the instance poller.
<babbageclunk> wallyworld: Ok, I'll merge my PR anyway
<wallyworld> babbageclunk: there's supposed to be an address poller in the agent i thought
<wallyworld> but i could be wrong, will have to check
<babbageclunk> wallyworld: but it would need to talk to the provider to find out the public address.
<wallyworld> why?
<wallyworld> the agent would poll the nics
<wallyworld> on the machine
<wallyworld> i guess it depends on how things are configured
<wallyworld> maybe you're right
<wallyworld> maybe the agent start up should trigger the instance poller
<wallyworld> maybe that's already supposed to happen
<wallyworld> so many questions
<wallyworld> axw: i've pushed changes, too bad ParseDuration() doesn't support "days"
<axw> wallyworld: yeah. we could always add another function that extends the syntax, but at least this way it's more flexible
<axw> wallyworld: LGTM
<wallyworld> axw: ty
<wallyworld> will test again befor elanding
<wallyworld> babbageclunk: i've added a card to ensure we remember to check why an address change caused by rebooting an instance is not being picked up by juju
<babbageclunk> wallyworld: oh, it eventually did! Haven't had a chance to investigate - talking to the maas guy.
<wallyworld> babbageclunk: that matches my understanding then, whew. the instance poller will wake up and do it. we just need to be smarter about triggering it after a reboot
<mwhudson> axw: turns out that the TMPDIR-clearing antics are the dynamic loader's fault
<mwhudson> (because there is a setuid executable in the chain of invocations when you run a snap)
<axw> mwhudson: ah, right
<wpk> jam: order does matter if you want to have something to test against :)
<jam> wpk: I'm not worried about SortedValues as much as I am whether we need to preserve the *users* order
<jam> wpk: and I believe it doesn't matter, so shoving it into a Set is fine, and the SortedValues is appropriate
<wpk> jam: technically it's a set
<wpk> jam: also, that's what proxyupdater has been always doing
<jam> wpk: sgtm
<babbageclunk> jam: I'm chasing this bug 1679948 with a guy from the maas team
<mup> Bug #1679948: juju bootstrap is failing in the MAAS CI lab  <juju:New> <https://launchpad.net/bugs/1679948>
<babbageclunk> jam: The controller looks like it's running fine, but the client can't connect to the api - we just see the Forbidden message in the log
<babbageclunk> jam: connectivity seems fine, we can ssh from the client to the controller
<babbageclunk> jam: and in the logs on the controller we can see the various workers connecting to the API as well, so I'm not sure why the client can't
<jam> babbageclunk: are you able to wget against the controller?
<jam> is the URL a valid uuid for a model?
<babbageclunk> jam: gah, didn't think of just wgetting - I was trying to find a websocket client I could use
<jam> it just tells you if you can open the socket
<jam> you can also try "telnet host 17070" or whatever the exact prot is
<babbageclunk> jam: yeah, will get in touch and try that
<babbageclunk> jam: I guess telnetting to it would tell something even though it's not going to do the https handshake
<jam> babbageclunk: you'll see the SSL headers come over
<jam> babbageclunk: there is also "openssl s_client"
<jam> which is 'telnet but for SSL'
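Both probes, concretely (the controller address is a placeholder; 17070 is juju's default API port):

    CONTROLLER=10.0.30.2                            # placeholder address
    telnet "$CONTROLLER" 17070                      # does the TCP socket open at all?
    openssl s_client -connect "$CONTROLLER":17070   # does the TLS handshake complete?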
<babbageclunk> jam: found a websocket client code example that might be a useful test bed - just seeing how far I can get to pass in some JSON to login to the API and eg list the models.
<menn0> jam: here's a PR that addresses one of the items we were discussing in today's call
<menn0> https://github.com/juju/juju/pull/7237
<jam> menn0: looking. what is cloud 'dev' ?
<menn0> jam: just lxd with config for my local apt proxy and disabling auto apt upgrade
<menn0> jam: ala https://github.com/juju/juju/wiki/Faster-LXD#suggested-juju-config-for-lxd-deployments
<jam> sure, I have the same, just put it on 'lxd' itself
<jam> menn0: https://github.com/juju/juju/pull/7237#pullrequestreview-32605432 lgtm, though in *my* opinion, if we're going to have '--no-switch' then we really should have '--switch'
<jam> and I'm hesitant to add a short option
<jam> we could live with just landing what you requested
<jam> *proposed
<menn0> jam: I don't understand why we would have --switch if there's already --no-switch.
<menn0> jam: I was borderline on the short option but it seemed like the kind of thing that some people would want to do all the time
<jam> menn0: so if you have "--dont-do-foo" it seems useful to have a "--do-foo" to invert it
<jam> it may be that "--do-foo" is currently the default
<menn0> jam: but if that's the default....
<jam> but that doesn't let us change it
<jam> *i* like the ability to be explicit, especially for things like scripts
<menn0> if it's the default, most people won't use the option IMHO
<menn0> and we can't really change the default (until 3.x anyway) for compatibility reasons
<jam> menn0: it's personally a pet peeve when there is an option but you can't explicitly select a behavior, as then you have to know what the default behavior is, instead of just being able to say "this is the behavior I want"
<jam> that, however, is *my* opinion, so if you'd like a second opinion/consensus feel free to override/ask others
<menn0> fair enough, i'll ask around next week
<menn0> it's late and there's no particular rush to land this right now
<jam> menn0: for example, in 'mgopurge' we have -ssl=false as how you disable it, but you can specify -ssl=true
<menn0> yep, I guess so
<menn0> on the flip side we already have --no-gui and --no-browser-login
<menn0> on bootstrap
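(Aside: jam's point maps directly onto how Go boolean flags work — a single flag already supports both explicit forms, which is what makes scripts self-documenting. A minimal stdlib sketch, not the mgopurge source:)

```go
package main

import (
	"flag"
	"fmt"
)

func main() {
	// -ssl=true and -ssl=false are both valid invocations; note that
	// Go requires the "=value" form for boolean flags.
	useSSL := flag.Bool("ssl", true, "use SSL (pass -ssl=false to disable)")
	flag.Parse()
	fmt.Println("ssl enabled:", *useSSL)
}
```

So "prog", "prog -ssl=true" and "prog -ssl=false" all state the behaviour explicitly rather than relying on knowledge of the default.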
#juju-dev 2017-04-14
<wpk> Upgrading my laptop to 17.10, fingers crossed.
<wpk> yay, it's alive!
<tasdomas> hi, could I get a review of https://github.com/juju/juju/pull/7241 ?
<wpk> tasdomas: isn't it reviewed already?
<tasdomas> wpk, yeah, but another review wouldn't hurt (if you've got the time)
<mup> Bug #1682827 opened: Bootstrap on OpenStack Cloud fails <juju-core:New> <https://launchpad.net/bugs/1682827>
<petevg> I've got a (hopefully quick) question: what's the best on-the-fly way to get a complete list of Facades that the websocket api supports? I was relying on "facades" key in the "Login" response, but that only seems to return a partial list of facades.
<petevg> Also, hi all! Forgot to include a friendly greeting up front :-)
<petevg> After poking about some more, and running into dead ends, I filed a bug about the above: https://bugs.launchpad.net/juju/+bug/1682925
<mup> Bug #1682925: List of Facades in the Login response is incomplete <juju:New> <https://launchpad.net/bugs/1682925>
<petevg> I know that it's Friday afternoon after a really tough and interesting week, though, so I understand if it take a little while to get a response :-)
<hml> petevg: :-) it's also a holiday and 4 day weekend for a lot of folks.  many causes for a delay.  thank you for filing the bug so item doesn't get lost
<petevg> hml: that's right. I've been crunching to get this python-libjuju thing done, and forgot that it's a holiday weekend! Probably means that it's just about time for me to take a break and do a weekend myself. :-)
<petevg> have a good one.
<hml> you too
#juju-dev 2017-04-15
<mup> Bug #1683075 opened: Rsyslog constantly restarts after upgrade to 1.25.11 <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1683075>
#juju-dev 2018-04-09
<wallyworld> thumper: i updated https://github.com/juju/juju/pull/8549, plus there's this one https://github.com/juju/description/pull/36
 * thumper nods
<wallyworld> thumper: annnnd, merge conflicts fixed in https://github.com/juju/juju/pull/8546
<thumper> ack
<thumper> will look when I hit a natural break in what I'm doing
<wallyworld> babbageclunk: see this test failure in a CI check run; maybe related to what you're seeing http://ci.jujucharms.com/job/github-check-merge-juju/758/testReport/github/com_juju_juju_worker_peergrouper/TestPackage/
<wallyworld> might not be either
<babbageclunk> wallyworld: hmm, maybe - will take a look in a bit
<wallyworld> babbageclunk: didn't mean for you to fix, just another data point
<wallyworld> might provide a clue
<babbageclunk> wallyworld: yeah, could do!
<vino> hi
<thumper> o/ vino
<vino> HI Tim
<babbageclunk> Hi vino!
<vino> Hello
<vino> plz ignore my messages. I am just trying to see how IRC works.
<thumper> vino: Use '/me action' to show something like this
 * thumper headdesks
 * vino having lunch
 * thumper EODs
<wallyworld> manadart: hey, there's a consistent failure in TestErrorAndStatusForHASpaceWithNoAddressesAddrIPv6 stopping the bot from landing stuff. is this something you could do a quick fix for?
<wallyworld> i'm close to EOD
<manadart> wallyworld: I will look at it. It passes locally every time :(
<wallyworld> yeah, the number of times we have seen that
<wallyworld> here's a sample failure http://ci.jujucharms.com/job/github-check-merge-juju/760/testReport/github/com_juju_juju_worker_peergrouper/TestPackage/
<manadart> wallyworld: Thanks. Looking.
<wallyworld> tyvm
<manadart> wallyworld: Got a PR up. Made it consistent with other test manipulation of the machine docs.
<manadart> Still can not replicate locally. Will observe in-situ.
<jam> manadart: https://images.linuxcontainers.org
<jam> github.com/lxc/lxd/config.go DefaultRemotes
<jam> manadart: http://streams.canonical.com/juju/images/releases/streams/v1/
<jam> http://streams.canonical.com/juju/images/releases/
<manadart> jam: Can you review https://github.com/juju/juju/pull/8566 when it goes green?
<manadart> Landing that will allow others to make forward progress.
<manadart> That would be a great suite to use GoMock with since we are not using a real state.
<manadart> It has been thorny to negotiate.
<jam> manadart: reviewed and $$merge$$ d
<jam> manadart: any thing else for us to discuss or shall we skip the 1:1 ?
<manadart> jam: Safe to skip for my part.
<jam> sgtm
<manadart> jam: Should we be adding config for container-image-stream too, or assume that the user includes an exact URL with the metadata config?
<jam> manadart: I think we should follow image-metadata-url and have a container-image-stream as well
<manadart> jam: Ack. I've proceeded on that basis.
#juju-dev 2018-04-10
<wallyworld> anastasiamac: review done, lgtm with some comments, see what you think
<anastasiamac> \o/ ta
<anastasiamac> wallyworld: btw, we've agreed that m changing the tests to use new .SupportedLts() too, right? :D
<wallyworld> yeah
<anastasiamac> awesome
<wallyworld> babbageclunk: if you can find 15 minutes before your EOD, would love a review today so that i can land this before my caas meeting tomorrow https://github.com/juju/juju/pull/8568
<wallyworld> anastasiamac: lts pr lgtm, tyvm
<anastasiamac> \o/
<babbageclunk> wallyworld: yup, looking now
<wallyworld> yay
<thumper> wallyworld: are you free in about 25 minutes to talk through this white paper idea?
<wallyworld> sure
<thumper> cool
<babbageclunk> wallyworld: reviewed
<wallyworld> great ty!
<anastasiamac> wallyworld: so our approach of supportedlts in juju/juju/version introduces an import cycle... :( pondering where else to put it that makes sense...
<wallyworld> ah, awesome
<anastasiamac> yep... if testing.BaseSuite was not in juju/juju/testing directly, everything'd b peachy... for real
<anastasiamac> wallyworld: so i have a cunning plan :) but i think i want to run it by u...lemme know when u have a chance :)
<wallyworld> i am free
<babbageclunk> thumper/wallyworld - if you have a worker with a child worker (say it was added to the parent's catacomb in the Plan.Init list), can the child worker cause the parent to stop?
<thumper> I'm not sure
<thumper> babbageclunk: do you mean if it ends in an error?
<wallyworld> if it errors, the parent will error also
<wallyworld> unless
<wallyworld> you set up a custom runner
<wallyworld> and provide IsFatal
<thumper> the catacomb doesn't have a runner does it?
<babbageclunk> thumper: no it doesn't - oh nice, thanks.
<babbageclunk> I think the problem is that at the moment the raft-transport worker makes a streamLayer (which is another worker) but doesn't add it to the raft-transport catacomb.
<babbageclunk> So I'm fixing that now.
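(Aside: a rough sketch of the lifecycle being described, assuming the catacomb package of the era — import paths and names approximate. Workers passed in Plan.Init, or adopted later with Add, take the parent down with them when they fail; the raft-transport bug was a worker created but never adopted.)

```go
package sketch

import (
	"gopkg.in/juju/worker.v1"
	"gopkg.in/juju/worker.v1/catacomb"
)

type parent struct {
	catacomb catacomb.Catacomb
}

func newParent(child worker.Worker) (*parent, error) {
	p := &parent{}
	if err := catacomb.Invoke(catacomb.Plan{
		Site: &p.catacomb,
		Work: p.loop,
		// Adopted immediately: if child.Wait() returns an error,
		// the catacomb (and so the parent) dies with it.
		Init: []worker.Worker{child},
	}); err != nil {
		return nil, err
	}
	return p, nil
}

func (p *parent) loop() error {
	// Workers created during the loop must be adopted explicitly,
	// e.g. p.catacomb.Add(layer) - the missing step in raft-transport.
	<-p.catacomb.Dying()
	return p.catacomb.ErrDying()
}

func (p *parent) Kill()       { p.catacomb.Kill(nil) }
func (p *parent) Wait() error { return p.catacomb.Wait() }
```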
<jam> manadart: do you mind if we move the oracle review back by 30min? that gives me time to have lunch every day instead of trying to squeeze it in
<jam> I'm looking at the gomock stuff now
<manadart> jam: No problem.
<anastasiamac> wallyworld: 8567 was updated with func moved to juju/juju/juju as version... what do u think? it passes now, is ready for landing...
<wallyworld> looking
<wallyworld> anastasiamac: works for me
<anastasiamac> wallyworld: awesome \o/ i'll land it on 2.3
<jam> manadart: https://pastebin.canonical.com/p/wrNMVKHRXg/
<hml> balloons:  CI help please?  if i run the command out side of CI it works, inside it fails??  https://pastebin.canonical.com/p/4WJkdkGsFf/
<balloons> hml, looking
<balloons> hml, that looks more or less correct
<balloons> hml, I would put a try block around the call, expect this error, and fail if something else happens
<balloons> hml, subprocess isn't going to return cleanly because the remote just hangs up
<hml> balloons: I did have a try block around it… which just confirmed the timeout
<hml> balloons: what I don't understand is why it's timing out in one place, but  not the other
<hml> balloons: in both cases the reboot happened
<hml> balloons:  other things start failing then too… timing out
<balloons> hml, I'm not sure I'm seeing what you are.
<balloons> hml, your paste just shows me a calledprocesserror when you run the reboot command. Off the cuff, that makes sense to me to expect
<hml> balloons: why?
<balloons> hml, do you think the subprocess call is exiting cleanly?
<balloons> my guess is it just hangs up because the host goes away
<balloons> hence the error
<balloons> hml, we might be able to tweak your call so it exits clean
<hml> balloons:  hrm… perhaps.  so, after this happens, i also get a timeout trying to run  lsb_release -c on the host too
<hml> balloons: i was looking at that, then found the first timout of reboot
<balloons> hml, so I was thinking of something like 'sudo shutdown -r 0 && exit'
<balloons> hml, googling gave a slightly complex and interesting answer as well: 'nohup sudo halt &>/dev/null & exit'
<hml> balloons:  will try
<balloons> hml the goal is to tell the machine to reboot, but exit cleanly from the ssh command before it does so
<balloons> nohup shutdown -r now & exit .. all variants on the same idea :p
<hml> balloons: && exit helped, can do lsb_release afterâ¦ will try with try now
<balloons> hml, ohh.. so clean exit? So no need to catch an error then
<hml> balloons: okay
<balloons> make sense?
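(Aside: the shell trick above, wrapped in Go for CI-style use — an illustrative sketch with a placeholder host. The point is that nohup plus backgrounding lets the ssh session exit cleanly before the machine goes down.)

```go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	host := "ubuntu@10.0.0.5" // placeholder target
	// Detach the reboot so sshd can return a clean exit status
	// instead of the connection dying mid-command.
	cmd := exec.Command("ssh", host,
		"nohup sudo shutdown -r now >/dev/null 2>&1 & exit")
	if out, err := cmd.CombinedOutput(); err != nil {
		fmt.Printf("reboot request failed: %v (%s)\n", err, out)
		return
	}
	fmt.Println("reboot requested, ssh exited cleanly")
}
```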
<thumper> morning vino
<babbageclunk> hi vino!
<veebers> Hi vino o/
<babbageclunk> hey vino, you're in the staff directory now!
<balloons> hi vino!
<hml> w00t
<vino> Hi Everyone!
<wallyworld> babbageclunk: standup?
<veebers> babbageclunk: would you have a link handy for the wiki/readme section re: running the mongo client on a juju machine to query the db?
<babbageclunk> thumper: do you need to add vino to our team on Trello before she can see our board?
<veebers> babbageclunk: nvm I found it, I was blind.
<babbageclunk> veebers: sorry, was just finding it myself. I guess I didn't have it handy.
<veebers> ^_^
<babbageclunk> :)
<anastasiamac_> thumper: and while u r looking at team members on trello, mayb some ppl need to be removed? :D
<thumper> anastasiamac_: heh
<thumper> ok
<thumper> sorry was interviewing
<anastasiamac_> :) m not sorry u were :D
#juju-dev 2018-04-11
 * thumper shudders as he builds 1.25 for some testing
<thumper> babbageclunk: next one https://github.com/juju/juju/pull/8573
<thumper> wallyworld: or you could... ^^
 * thumper goes to make a coffee
<babbageclunk> yeah, share the love!
 * wallyworld gets coffee and then looks
<wallyworld> babbageclunk: actually, wanna chat briefly?
<wallyworld> i hear you're living the dream
<babbageclunk> wallyworld: yes please!
<babbageclunk> in standup?
<babbageclunk> so lonely
<babbageclunk> wallyworld: actually I dropped out because it screws up the timing in the test
<anastasiamac> wallyworld: thumper: PTAL   https://github.com/juju/juju/pull/8574
<babbageclunk> thumper: ok it gets weirder - it's not the manifolds, because they're never called (jujuconnsuite runs a minimal set of workers, but it doesn't run a dep engine). So it's actually something that gets imported by the worker/raft package (or worker/raft/rafttest)
<babbageclunk> on the way to working it out now.
<anastasiamac> thumper: also https://github.com/juju/juju/pull/8575 for deprecation on the flag
<thumper> babbageclunk: that's weird
<thumper> anastasiamac: check failed for first PR
<anastasiamac> thumper: yes, looking
<anastasiamac> thumper: but doubt it's me - i changed only tests, and that should not have affected the featuretests pkg at all.. mayb intermittent...
<wallyworld> babbageclunk: oh sorry, missed ping
<thumper> wallyworld: I also updated presence PR
<wallyworld> looking. stupid quassel doesn't beep at me anymore
<thumper> wallyworld: my quassel is struggling to beep too
<wallyworld> yeah, so i miss a lot of pings :-(
<wallyworld> especially when i'm knee deep in code etc
<thumper> why aren't you neck deep?
<thumper> what is wrong with you?
<thumper> wallyworld: I haven't squished commits yet so you can review the changes more easily
<thumper> wallyworld: I'll squish before getting the bot to merge
<wallyworld> ty
<wallyworld> i need to start doing that too
<thumper> I'm done for the day
<wallyworld> thumper: lgtm but from diff it appears there could be some newlines between methods in interface missing
<thumper> wallyworld: I've not got newlines between every call
<thumper> as I've grouped them
<thumper> do we really need newlines?
<wallyworld> i think that's the expected way to do it
<wallyworld> IIANM
<wallyworld> YMMV
<thumper> FFS
<thumper> fine
<thumper> wallyworld: some blank lines just for you
<wallyworld> \o/ ty, still lgtm, land that sucker
<thumper> you didn't approve
<thumper> oh, I see you did earlier
<thumper> thanks
<wallyworld> np
 * thumper squishes and pushes
<thumper> oh no!!!
<thumper> I had a mouthful of coffee left in my cup
<thumper> and it is now cold
<thumper> :(
<manadart> jam: As mentioned.
<manadart> PR for supplying image metadata url and stream to containers: https://github.com/juju/juju/pull/8578
<jam> manadart: reviewed
<manadart> jam: Ta.
<jam> manadart: I think I figured out why you can't use TearDownTest with gomock
<jam> the gc.C that is passed into TearDownTest is not the same object as the one passed into SetUpTest
<jam> manadart: and you probably can't assume it is the same as the one during the test itself. :(
<jam> manadart: I'm not positive about it. but if I just put a "c.Fail" in TearDownTest then I at least get "test fixture has paniced"
<jam> which would at least 'fail' the test
<jam> manadart: I *definitely* see that the object passed into SetUpTest is not the same object as passed into the Test itself.
<jam> manadart: interestingly, if the test itself fails an assertion *during* the test, then it will report whatever the SetUpTest object got as failure messages
<jam> but it can't fail on its own during TearDown. I'm a bit surprised at that, but it is what I'm seeing :(
<manadart> jam: That explains the behaviour I saw when testing it. If I panic'd the test I could see some of the satisfied/unsatisfied calls reporting, but not when running finish in teardown.
<manadart> jam: One option might be to write a TestReporter implementation, instantiate as a member of the suite in SetupTest, pass it to MockController. Then in TeardownTest we always know what we are getting and can Finish.
<jam> manadart: well potentially a TestReporter that is a thunk, and then you call "SetThunk" in each place. but I don't think it would really give the behavior you want
<jam> manadart: specifically, failures in assertions will get "Fixture has paniced()" rather than just treating it as a failed test.
<manadart> jam: Yes I thought "oh, but wait" after I typed that :)
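(Aside: the TestReporter idea sketched, under the caveat both of them note — it pins failures to whichever *gc.C the suite last saw, and gocheck may still not treat a teardown-time failure gracefully. Names are illustrative; gomock's TestReporter interface is just Errorf/Fatalf.)

```go
package sketch

import (
	"github.com/golang/mock/gomock"
	gc "gopkg.in/check.v1"
)

// suiteReporter satisfies gomock.TestReporter by forwarding to the
// *gc.C it currently holds.
type suiteReporter struct {
	c *gc.C
}

func (r *suiteReporter) Errorf(format string, args ...interface{}) {
	r.c.Errorf(format, args...)
}

func (r *suiteReporter) Fatalf(format string, args ...interface{}) {
	r.c.Fatalf(format, args...)
}

type mockSuite struct {
	reporter suiteReporter
	ctrl     *gomock.Controller
}

func (s *mockSuite) SetUpTest(c *gc.C) {
	s.reporter.c = c // repoint at the current test's checker
	s.ctrl = gomock.NewController(&s.reporter)
}

func (s *mockSuite) TearDownTest(c *gc.C) {
	s.ctrl.Finish() // unmet expectations report via the reporter
}
```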
<jam> balloons: can you hear me?
<babbageclunk> thumper or wallyworld: could you please review this? https://github.com/juju/juju/pull/8571 Sorted out that weird failure.
<thumper> babbageclunk: ack
<thumper> babbageclunk: what was it?
<wallyworld> sure soon, otp for a bit
<babbageclunk> thanks!
<thumper> babbageclunk: what is your response to jam's question?
<babbageclunk> It was the import thing - importing the FSM from rafttest meant that some test fixture stuff came in too, and something in that lot did something (I'm not sure what) at package init time that meant the hook commands ran slower.
<babbageclunk> Would have been easier to find if golang allowed unused imports I think.
<babbageclunk> thumper: not sure - need to think about it more carefully.
<thumper> babbageclunk: so the fix was what?
<babbageclunk> move the FSM out of rafttest into worker/raft.
<hml> short review anyone?  https://github.com/juju/juju/pull/8582
<hml> balloons: ^^
<hml> :-)
<balloons> :-) awesome heather!
<veebers> wallyworld, thumper tiny addition to the charmrepo.v3 change
<veebers> https://github.com/juju/charmrepo/pull/125
 * wallyworld still otp :-(
<veebers> ack
<veebers> hml: lgtm
<hml> veebers: ty
<babbageclunk> veebers: build's failing on yours?
<hml> wpk: why is timing a big issue here?  with the interfaces.py?
<hml> wpk: ifdown used to be before the interfaces.py was called
<veebers> babbageclunk: ah as per comment, test failure as charmstore.v5 still deps on charmrepo.v2 and thus fails. I could add charmrepo.v2 to charmrepo.v3 deps but that's mucky :-) Once this lands I'll propose a PR for charmstore.v5 (my branch will currently be the only thing depping on charmrepo.v3 right now)
<wpk> hml: the main concern is to wait with anything that's hard to revert for as long as possible, we've seen weird failure modes and IMHO it's impossible to avoid them all, but we could do as much as we can to 1. avoid them 2. make the machine boot if we f.up
<babbageclunk> veebers: oh sorry - didn't see that!
<veebers> hah I may have made it after you looked at the PR ^_^
<hml> wpk, veebers, balloons: will look at changing interfaces.py instead in the AM
<babbageclunk> that's my story and I'm sticking to it.
<wallyworld> babbageclunk: anastasiamac: standup?
<anastasiamac> m standing m standing \o/
#juju-dev 2018-04-12
<veebers> thumper, wallyworld (anyone) Tests are passing, ready for another review: https://github.com/juju/juju/pull/8498 Have been doing some manual testing as well
<wallyworld> veebers: awesome, just got out f all my meetings for a bit (until next one), will look
<wallyworld> 181 files! close to a record :-)
<wallyworld> i think babbageclunk has the recent record with his raft one
<wallyworld> veebers: remind me - the ultimate plan is to go to macaroon.v2 rather than unstable right, but we need to do this in stages
<thumper> wallyworld: there is no macaroon.v2
<thumper> it is all unstable
<thumper> my brain is almost exploding
<thumper> threading the presence stuff through hit model migrations
<veebers> wallyworld: initially we wanted to up deps but stopped short of macaroon v2-us, then we decided to get in line with what charmstore.v5 had (hence the macaroon-bakery.v2-us parts)
<wallyworld> veebers: review done, a few small things only
<veebers> wallyworld: awesome, I'll get them sorted posthaste
<wallyworld> let me know if anything is unclear
<thumper> babbageclunk: got a few minutes?
<wallyworld> babbageclunk: are you looking at thumper's comments?
<babbageclunk> wallyworld: yup
<babbageclunk> also thumper: yup
<babbageclunk> thumper: in 1:1?
<wallyworld> ok, i'll look at thumper's PR next
<thumper> babbageclunk: ack
<thumper> babbageclunk: waiting
<veebers> wallyworld: re: the comments, er, comment. Is the expectation something like "RootKey implements bakery.Storage.RootKey" or more: "RootKey returns the rootkey found in the bakery storage, used for making new macaroons." (same for Get)
<wallyworld> if the method comes from an interface we typically use the former
<veebers> wallyworld: cool, does it make sense to have the package in there? (i.e. bakery.) or is <inteface>.<method> the done thing? (i.e. Storage.Get)
<wallyworld> i think we mostly do not include the package name'
<thumper> babbageclunk: not too big, just 900 lines or so
<babbageclunk> thumper: hey you tricked me!
<wallyworld> thumper: review done
<thumper> wallyworld: awesome, thanks
<babbageclunk> Is anyone else running the pre-push git hook?
<veebers> wallyworld: can you clarify your comment "These need to operate on both the embedded suites"?
<veebers> babbageclunk: oh, no I'm not but I should be. Its just a matter of symlinking something isn't it?
<wallyworld> babbageclunk: i normally run it
<wallyworld> veebers: if a suite embeds > 1 other suite, you need to call the SetUp/TearDown funcs on all the embedded suites or else only one will get run
<wallyworld> the other embedded suites will not get setup/torndown
<babbageclunk> veebers: yup - see here https://github.com/juju/juju/blob/master/CONTRIBUTING.md#local-clone
<veebers> wallyworld: ah I see, I was only matching any setups with a teardown.
<veebers> wallyworld: does that mean I need to add a loggingsuite setuptest too? (or are its setupsuites enough)
<wallyworld> veebers: the embedded suites may not explicitly define a SetUp (for example) *today*, but if one were to be added....
<wallyworld> if loggingsuite doesn't have one at all, then no needed
<wallyworld> but if there's one by virtue of an embedded suite in loggingsuite
<veebers> wallyworld: loggingsuite has a setuptest, so I'll add it too
<wallyworld> great, yup
<wallyworld> and order of teardown should be opposite of setup
<veebers> ack
<veebers> wallyworld: does the same count for setup/teardown suite? (i.e. I have the looging one there, should I add the mgo)
<wallyworld> any embedded suite that has a setup/teardown needs to have those called by the outer suite
<wallyworld> otherwise go will be a single, arbitrary one to call and not call the others
<wallyworld> s/be/pick
<veebers> ah ok, gotcha
<wallyworld> same even goes for if you add logic to the outer setup and there's only one embedded suite
<wallyworld> still need to then call the embedded suite setup
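(Aside: the convention in code form — a sketch using two embedded suites from github.com/juju/testing as stand-ins. Without the explicit fan-out, Go's method promotion rules mean at most one embedded SetUpTest is reachable, so the others silently never run.)

```go
package sketch

import (
	jujutesting "github.com/juju/testing"
	gc "gopkg.in/check.v1"
)

type outerSuite struct {
	jujutesting.LoggingSuite
	jujutesting.MgoSuite
}

func (s *outerSuite) SetUpTest(c *gc.C) {
	// Call every embedded suite's setup explicitly.
	s.LoggingSuite.SetUpTest(c)
	s.MgoSuite.SetUpTest(c)
}

func (s *outerSuite) TearDownTest(c *gc.C) {
	// Tear down in the reverse order of setup.
	s.MgoSuite.TearDownTest(c)
	s.LoggingSuite.TearDownTest(c)
}
```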
<babbageclunk> thumper: are you happy with https://github.com/juju/juju/pull/8571 or do you want wallyworld to look at it too?
<veebers> wallyworld: re: the macaroon.New wrapper, would the function be New or NewMacaroon() to be clearer?
<wallyworld> veebers: NewMacaroon works i think as it will be in a testing package
<veebers> sweet, makes sense
<thumper> babbageclunk: all good
<babbageclunk> thumper: cool, thanks!
<thumper> jam: in our 1:1
<veebers> wallyworld: FYI have pushed pr review changes. Also I had a go at getting a failure when using a charm with resources, but couldn't. It's possible I don't fully understand the process required.
<jam> thumper: omw
<veebers> I published a charm to the store, I've deployed it (it got me to auth), I upgraded the controller +  model to the 2.4-b1 and attached a new resource to teh application but it all worked
<veebers> ah man, failling tests. I should have caught that locally :-\
<veebers> ah, I see why.
<veebers> babbageclunk: is there a nice way to "go test" but just build the test not run them (i.e. so I can catch compile errors)
<babbageclunk> veebers: go test -c
<babbageclunk> But I haven't really worked out a nice way to do that across all packages.
<veebers> babbageclunk: cool thanks. Huh that succeeds, but there are errors to be found :0\
<veebers> ah, just in juju/juju won't do much will it
<babbageclunk> It only builds the tests for the package you're in.
<veebers> aye :-)
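(Aside: a commonly used way to compile-check tests across the whole tree without running them is "go test -run '^$' ./..." — the regexp matches no test names, so every package's tests are built but none execute. Assumes a toolchain where the "./..." wildcard works with go test, which it did by this point.)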
<kelvinliu> sorry, got a question about current supported golang version, `make install-dependencies` will install golang-1.10 from snap but README.md shows it should be 1.9.  I am currently on `develop` branch
<kelvinliu> seems README needs to be updated.
<thumper> kelvinliu: yes
<wallyworld> it was 1.9 until recently
<kelvinliu> thx :)
<wallyworld> we should tell folks just to use Go from the snap
<kelvinliu> yes, snap is handy rather than installing manually
<wallyworld> it is, and it's easy to try rc of next go version etc
<wallyworld> so we can be sure juju will continue to work when next version is out
<kelvinliu> yes, agreed
<kelvinliu> wondering if I wanna update the README, how  do we manage branchs?
<wallyworld> sure, feel free to propose a change. vino had the same thought but i've given her a bug to work on so you could look at a readme change
<babbageclunk> jam: ping?
<thumper> night all
 * thumper goes to make dinner and have a glass of wine
<kelvinliu> yup thx
<kelvinliu> night thumper
<babbageclunk> kelvinliu: he never waits for responses. I was going to say it too. :(
<kelvinliu> haha
<anastasiamac> a very quick and easy review plz: https://github.com/juju/juju/pull/8584
<wallyworld> anastasiamac: looking
<anastasiamac> wallyworld: thnx
<wallyworld> anastasiamac: nice
<anastasiamac> m thinking it should mayb (!) say 'model cannot be upgraded to x.x.x while the controller is y.y.y: upgrade 'controller' model first'...
<anastasiamac> does it read better to u^^
<anastasiamac> wallyworld: ^^
<wallyworld> sure, sounds good to me
<anastasiamac> \o/
<anastasiamac> i'll re-phrase in a sec... m just a breath (or a few) away from proposing worker/manifold for cred check...
<wallyworld> jam: i had to make a fix to a recent juju/description commit which added some stuff for caas support, any chance of a review before you EOD? i need this to be able to add support for model import/export in juju itself https://github.com/juju/description/pull/37
<vino_> hi kelvin
<kelvinliu> hi vino
<vino_> hi.. did u manage to get details abt branches ? i closed freenode.
<kelvinliu> yes, https://github.com/juju/juju/blob/develop/CONTRIBUTING.md
<kelvinliu> just have a check CONTRIBUTING.md
<wallyworld> jam: thanks for review, i responded to comments if you are interested. i'll test with model export in juju before landing
<wallyworld> kelvinliu: i left a comment - let's change the readme to tell people ot use the snap
<kelvinliu> yes, good point, I m updating it now
<vino_> snap install of go ?
<kelvinliu> yes
<vino_> i had an issue installing the snap version - by default it installed 1.6 and that had issues with the juju installation
<wallyworld> kelvinliu: lgtm
<wallyworld> vino_: it should have installed the latest stable go release, not sure why it installed 1.6
<wallyworld> juju won't compile with go < 1.10
<vino_> yes. in 16.04
<wallyworld> hmm, no idea why off hand. i'm running bionic
<kelvinliu> i am running bionic as well
<kelvinliu> `make install-dependencies` will install all for u, vino
<vino_> so i tried upgrading it to 1.10 and eventually it allowed me to do it only with longsleep/golang-backports
<vino_> maybe i can recheck
<vino_> https://bugs.launchpad.net/juju/+bug/1763201
<mup> Bug #1763201: Build juju from source documentation requires update <usability> <juju:Triaged> <https://launchpad.net/bugs/1763201>
<wallyworld> vino_: you sure you used the snap?
<wallyworld> rather than the one from the archive?
<wallyworld> i'd expect 1.6 to be in the archive
<vino_> i did. but i can recheck again.
<wallyworld> the reference to backports implies archive
<vino_> i rechecked and snap installs 1.10 - i was wrong here.
<vino_> wallyworld : i rechecked and i couldn't reproduce how i got 1.6.2 where i faced issues to upgrade and referred to link :  https://github.com/golang/go/wiki/Ubuntu
<wallyworld> no worries. we should update that wiki page to refer to snap also
<vino_> It already does.
<vino_> https://github.com/juju/juju/blob/develop/README.md
<vino_> snap also or snap only ?
<wallyworld> vino_: i meant this one https://github.com/golang/go/wiki/Ubuntu
<wallyworld> oh never mind
<wallyworld> i didn't scroll down far enough
<wallyworld> it already mentions snaps
<vino_> I am not clear.
<vino_> apologies
<vino_> did u mean that juju README.md should also mention snap ?
<wallyworld> for the juju readme, IMO it should be snap ony
<wallyworld> since people on xenial etc would otherwise get an older version of go that won't work with juju
<vino_> ok snap only is correct. apt install info can be removed
<vino_> ok.
<wallyworld> yeah i think so
<anastasiamac> wallyworld: if u plan more insomnia or r just looking for bedtime reading, PTAL https://github.com/juju/juju/pull/8586 - cred worker/manifold... :D
<wallyworld> ok, will try and get to it after i finish this current PR
<anastasiamac> wallyworld: was kind of hoping u'd look at it at the earliest tomorrow :)
<wallyworld> can do
<vino_> wallyworld : i will update the bug i opened this morning https://bugs.launchpad.net/juju/+bug/1763201
<mup> Bug #1763201: Build juju from source documentation requires update <usability> <juju:Triaged> <https://launchpad.net/bugs/1763201>
<vino_> but just to update : both snap and apt-get installs of go install 1.10 and work fine for me to build and install juju.
<vino_> i will detail in the bug report
<vino_> snap install go --classic is fine
<vino_> only the next option needs to be updated with - sudo apt-get install golang-go
<vino_> hi Kelvin.. Can i assign that bug to you  as you are updating it ?
<kelvinliu> sure, just give it to me. I've already got a PR for it.  Vino
<vino_> i just saw that.
<vino_> i felt u can add apt-get as it works fine for me
<vino_> it installs latest go version.
<vino_> i have assigned it to u
<vino_> did u get mail notification kelvin ?
<vino_> https://bugs.launchpad.net/juju/+bug/1763201
<mup> Bug #1763201: Build juju from source documentation requires update <usability> <juju:Triaged by kelvinliu1976> <https://launchpad.net/bugs/1763201>
<jam> manadart: if its possible while you're in the area, bug #1753418 is one reason why our ci run isn't clear.
<mup> Bug #1753418: intermittent failure in kvmProvisionerSuite.TestKVMProvisionerObservesConfigChanges <intermittent-failure> <test-failure> <juju:Triaged> <https://launchpad.net/bugs/1753418>
<manadart> jam: Yep. Will take a look.
<jam> manadart: lower priority than finishing the work, but since you're working with KVM right now
<balloons> externalreality, are you ok with doing the change for the runtime panics on that bug to start?
<balloons> externalreality, I'll look quickly for ovh account
<externalreality> ballons, sounds good, will do.
<hml> balloons: subprocess.check_output is working… but have to change a lot of things over to use it…
<hml> unhappy it's not clear why yet
<balloons> OK, so you are unblocked?
<hml> balloons: currently -
<thumper> morning team
<rick_h_> thumper: morning and +1000 on the collapse of the juju vs -dev list/channels
<rick_h_> in case you were looking for votes. I've been saving up my paper ballots :P
<thumper> balloons: where are we with 1.25.14 build needs - go/1.8 etc
<thumper> rick_h_: heh, ack
<balloons> good morning thumper
<thumper> also status for 2.3.6?
<thumper> since we are co-opting the release call for whole team meeting :)
<balloons> thumper, yea.. We have 3 releases we need to do. 1.25, 2.3, and 2.4
<thumper> yeah
<thumper> 2.3.6 first, then 2.4 then 1.25
<balloons> thumper, 2.3.6 needs a call on whether we wait for the container-image-metadata or not
<thumper> we've waited long enough for the 1.25 build, another week or two won't kill them
<thumper> I don't think we need to wait
<thumper> I'm happy enough with that going into 2.4
<thumper> field are working around the issue now
<thumper> it isn't critical
<thumper> 2.3.6 is
<balloons> this needs to land then, and we should be good https://github.com/juju/juju/pull/8589
<thumper> balloons: is mongo 3.6 in bionic yet?
<balloons> thumper, it went into the queue a few hours ago. Needs approval, then it will land in proposed
 * thumper nods
 * thumper jumps in call 1 of 3
<veebers> Morning all o/
 * balloons tips hat
<wallyworld> kelvinliu: if you're online, come to team meeting?
<babbageclunk> morning everyone
<hml> veebers: if you would, i updated the pr from yesterday with the change
<veebers> hml: mean, you have a link?
<veebers> hah I have a Link, I hear him crying right now :-P
<hml> veebers: https://github.com/juju/juju/pull/8582
<hml> feeling paranoid - i manually did the upgrade series setups on a 2.3.6 config - then did juju upgrade-juju to get to 2.4-beta1 and the unit was still happy.  :-)
<veebers> hml: what's the difference to what the script used to be doing (with the .tmp that's different to using --output-file) don't both write a 'tmp' file and then mv it in place at the last second?
<hml> veebers: there are shades of "last second"
<hml> veebers: i could have just removed the os.rename and done the mv after ifdown
<hml> veebers: but since we can specify the input file, it's "nicer" to specify an output file too?
<veebers> hml: perhaps I don't fully understand the script . . . ah I see, right *that's* the important part here, not moving the interfaces file to bak before it's time
<veebers> I was focusing on the wrong part :-)
<hml> veebers: i changed so it was a copy to .bak - then the replacement was done very last we could
<veebers> hml: ack, that makes sense and obvious now that I see it
<veebers> hml: LGTM
<hml> veebers: TY!  :-)
<veebers> thumper: Seems it's really easy to add gometalinter to emacs (using spacemacs too): https://github.com/syl20bnr/spacemacs/tree/master/layers/%2Blang/go#pre-requisites
<babbageclunk> veebers: Also really CPU hungry
<babbageclunk> (Probably because of the size of the codebase)
<veebers> babbageclunk: ah right, I'm not so keen to add more CPU load to my emacs usage ^_^
<vino> C, c++
<vino> and python scripting
<vino> in cloud i used JAVA RestApi
<veebers> vino: might be wrong channel :-)
<vino> yes yes.
<veebers> vino, kelvinliu: Where you making changes to CONTRIBUTING.md? I think I want to make some updates too
<vino> i am still sleepy
<veebers> ^_^
<vino> i already generated a PR
<vino> in the Local Clone section.
<veebers> Is master even a branch we really care about now? I think CONTRIBUTING.md should mention develop, not master.
<vino> https://github.com/juju/juju/pull/8588
<veebers> vino: lol that's exactly the change I was going to make
<vino> Oh sorry.
<vino> i missed something and Kelvin did it yesterday in README>md
<veebers> vino: hah no it's good, you got to it first.
<vino> the go version
<veebers> vino: note, there is a typo in the PR title, it's a little long too (will be truncated in the git log)
<vino> done.
<vino> corrected it
<veebers> vino: you'll need to update the summary part too, it still has "…ection with correct path" as the first line
<vino> good eyes.
<hml> has anyone tried to bootstrap from a mac recently?  (i'm using a remote openstack cloud)
<wallyworld> babbageclunk: i might be a minute late for standup, so tell some jokes if i'm not there on time :-)
<babbageclunk> wallyworld: I'll do my best man, but if the audience are expecting you they'll still be disappointed
<hml> how do you build agents for a different processor?  trying to bootstrap 2.4-beta1 (develop) it complains about not having agent binaries, even if they are built during bootstrap.  Wrong OS?  and the 2.4beta1 binaries are not online yet
<babbageclunk> veebers: review plz? https://github.com/juju/juju/pull/8590
<veebers> babbageclunk: sure thing!
<babbageclunk> hml: Don't know, sorry! anastasiamac/balloons/thumper?
<babbageclunk> veebers: fanks
<hml> i was able to bootstrap with 2.3.5 at least?
<babbageclunk> I think once you have a controller you could build the binaries for the different arch and then use sync-tools to push them up, then upgrade?
<hml> hrm
<hml> get juju to build the binaries?
<hml> babbageclunk: how would you get the controller to build the binaries?
<babbageclunk> no, build them locally - I think by setting GOOS and GOARCH?
<babbageclunk> hml: Something like this https://dave.cheney.net/2015/08/22/cross-compilation-with-go-1-5
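(Aside: with Go 1.5+ this is just environment variables, e.g. "GOOS=linux GOARCH=amd64 go build github.com/juju/juju/cmd/jujud" — the package path is assumed here. Note that cgo is disabled by default when cross-compiling, which can itself change what builds.)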
<hml> haven't played with those before
<veebers> babbageclunk: "Thanks ants, Thants". lgtm
<babbageclunk> veebers: Just... look around you!
<veebers> hml: with 2.3.5 it would have pulled agent from streams, for 2.4-b1 you'll need to use the testing streams to do that
<veebers> babbageclunk: ^_^ that's the one
<hml> veebers: there's a testing streams?
<veebers> hml: for each build we generate a stream entry and put it in s3.
<hml> veebers: though that won't allow me to test my code easily?
<hml> veebers: how do you use it?
<veebers> hml: not if you're needing an agent you've just built :-\
<hml> veebers: sounds like i need to learn to use GOOS and GOARCH?
<hml> veebers: i was just curious
<veebers> hml: if you have a log of the upgrade series test too that would be useful
<hml> veebers: not sure if the logs i have now are useful?
<hml> veebers: i know where they are failing now…
<hml> veebers: if i find a new one, i'll forward along :-)
<veebers> hml: sweet :-)
<hml> veebers: eventually i'd be curious as to what broke which led me to using subprocess.check_output to run remote commands
<hml> some juju some not
<hml> veebers: it's not warm fuzzy that this is needed
<hml> but i want it done too :-)
 * hml waves goodnight!
<veebers> o/
<babbageclunk> bye hml
#juju-dev 2018-04-13
<anastasiamac> wallyworld: thumper: if u get a chance, PTAL https://github.com/juju/juju/pull/8586 - that's worker/manifold for cred watching
<wallyworld> anastasiamac: yeah, just finished meetings, starting my reviews now
<anastasiamac> wallyworld: \o/
<anastasiamac> babbageclunk: i haven't :D
<anastasiamac> babbageclunk: but i wonder if rick_h_ has :D
<rick_h_> What has Rick done now?
<rick_h_> I bootstrapped from a Mac today hml
<anastasiamac> rick_h_: \o/ i think hml eod'ed but it's good to know :) thnx !!
<rick_h_> anastasiamac: is that what you wanted to know? It wasn't babbageclunk  but seemed only question
<anastasiamac> rick_h_: yes, i think so :)
<wallyworld> anastasiamac: review done, a few little things to look at
<anastasiamac> ta
<thumper> babbageclunk, wallyworld: small(er) branch https://github.com/juju/juju/pull/8591 ??
 * thumper walks up to physio
<babbageclunk> thumper: looking
<vino> we follow gnu-style flag parsing for all the juju commands, correct?
<vino> some options don't seem to work that way, like model.
<vino> --model="default" or -m="default"
<vino> both should work
<vino> i do see the first one works fine but not the second
<vino> sorry -m "default"
<vino> my fault i didnt notice the '=' there
<vino> does anyone have an idea abt the --output option ?
<vino> it is supposed to write the output to file
<babbageclunk> vino: on which command?
<vino> juju run
<vino> this output option doesn't write to any file that i specify. not sure if i am missing something
<babbageclunk> I've never tried it, tbh
<vino> mmm... i will raise this as a bug.
<babbageclunk> yeah, I think you're right - doesn't work for me either
<thumper> what's this?
<vino> juju run command
<thumper> --model default, --model=default and -m default should all work
<vino> yes yes. i missed the'=' by mistake
<vino> i identified it myself
<vino> but we are talking abt the outout option
<vino> output*
<thumper> juju run --machine 0,1,2 juju-presence-report -o ~/sandbox/foo.yaml
<thumper> juju run --machine 0,1,2 juju-presence-report --output ~/sandbox/foo2.yaml
<thumper> both work for me
<thumper> what is your command line?
<babbageclunk> thumper: it looks like if you only specify one target (or specify --all and only have one machine, frex) it writes to stdout rather than writing through c.out
<thumper> ah...
<thumper> yeah...
 * thumper nods
<thumper> I remember writing that
<thumper> that should be fixed I guess
<vino> yes.
<thumper> vino: lucky you, you have found yourself another bug to fix :)
<thumper> vino: that one should be quite simple
<vino> yup.
<thumper> babbageclunk: did you want to talk about presence?
<babbageclunk> Oh no - sorry, got distracted and went back to my own stuff.
<babbageclunk> thumper: I read through it, looked good - I'm approving it now
<veebers> wallyworld: if we can get a +1 on this we can finally get it landed :-) https://github.com/juju/juju/pull/8498
<thumper> babbageclunk: sweet, I've just fixed the manifold tests, where I hadn't added the names
<babbageclunk> oh yeah, those ones always get me too.
<thumper> I'm not going back to the threading the presence code through the apiserver
<thumper> fun for the friday afternoon
<babbageclunk> oh man, it's annoying when you accidentally upgrade a controller to a version of jujud that runs but fails to start the API server
<wallyworld> thumper: i replied to your comment about the worker loop, see what you thunk
<thumper> wallyworld: how is it less LOC?
<wallyworld> no worker attributes to have to declare and assign to etc
<thumper> wallyworld: quick HO?
<wallyworld> the initial model cred is just a var in the loop
<wallyworld> sure
<wallyworld> veebers: i can't see the comment on the legacy index? do you agree we need it still?
<veebers> wallyworld: ah what, seems I was attempting to comment on the PR as a review, not in response to your question.
<veebers> wallyworld: have commented properly now
<veebers> but to save you a webpage reload: legacy macaroons will exist in a collection set up by a previous version of juju, so wouldn't those indices remain? I'm pretty sure they aren't removed by this change, and there is no need for it to exist from this point onwards as we're just using new Juju.
<wallyworld> veebers: i wasn't sure about the MongoIndices() behaviour - whether we wiped the existing indices and applied just the ones specified
<wallyworld> make check an upgraded system?
<wallyworld> to be sure?
<wallyworld> i'd hate for the index to get wiped
<veebers> wallyworld: I'm 99.9% certain it remains, as I tested things expiring yesterday. That being said it wasn't a specific test for that, I'll re-do now to move to 100% certainty
<wallyworld> veebers: maybe just a visual inspection of the indices
<wallyworld> then land that sucker
<wallyworld> then have a beer or six
<veebers> wallyworld: hah yep I'll spin up a current version and upgrade to this branch, checking the indices as I go
<wallyworld> great ty
<veebers> Reading mgo.session suggests that what I saw was expected, only creates new ones and doesn't remove them. Also doesn't update an existing index of the same name. Will do the check though, won't take long
<veebers> luckily we used expire-at previously and v2 uses expires, so there is no upgrade step required there ^_^
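(Aside: what veebers is relying on, sketched against mgo (gopkg.in/mgo.v2) — EnsureIndex creates an index if it is missing but never drops or rewrites an existing one in place, so old indices survive upgrades. Collection and field names here are placeholders.)

```go
package sketch

import (
	"time"

	mgo "gopkg.in/mgo.v2"
)

func ensureIndexes(session *mgo.Session) error {
	coll := session.DB("juju").C("someCollection") // placeholder
	// mgo only creates missing indexes; it does not silently alter
	// an index that already exists under the same key.
	return coll.EnsureIndex(mgo.Index{
		Key:         []string{"expires"},
		ExpireAfter: time.Minute, // illustrative TTL value
	})
}
```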
<veebers> wallyworld: FYI before and after upgrade: https://paste.ubuntu.com/p/Yx5kzHnfr7/
<wallyworld> looking
<wallyworld> veebers: great!
 * veebers $$merge$$s
<wallyworld> babbageclunk: first thing monday, can you look at this one? i need to land it before we ship 2.4 beta1 https://github.com/juju/juju/pull/8592
<babbageclunk> wallyworld: ok, will do
<wallyworld> ty
<wallyworld> it contains a data model change
<wallyworld> don't want to have to worry about upgrades
 * wallyworld afk to go buy coffee
<babbageclunk> oh bugger, just realised I landed that raft branch without squashing it
<veebers> https://github.com/juju/juju/pull/8498 -> Merge \o/
<veebers> oh bums, same here babbageclunk :-\
 * babbageclunk commiserates
<babbageclunk> hmm, wallyworld - if I want to make apiserver depend on raft, it can't start because the peergrouper (which raft depends on for apiserver addresses) depends on the upgrader which depends on the api-caller.
<wallyworld> joy
<wallyworld> i'm not sure how to solve that off hand
<babbageclunk> Neither. For this I technically don't need to... But to make the leadership API use it I will.
<babbageclunk> Ok, what about a new worker that both raft and API depend on that has methods to set and get the raft. If raft hasn't been set yet, get returns an error. Then the raft worker sets it when it finishes starting up.
<babbageclunk> Feels a bit baroque, but I think that would work?
<babbageclunk> wallyworld: ^
 * wallyworld ponders
<wallyworld> that could work i think
<wallyworld> worth spiking on in the absence of a better idea
<babbageclunk> The key thing is that although it kind of depends on it, the API server won't need the raft until someone actually calls an API that uses it?
<babbageclunk> yeah, I think so
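(Aside: a very rough sketch of the shim worker idea — a holder that both the raft worker and the apiserver can depend on, with Get failing until the raft worker calls Set. All names are hypothetical; only the hashicorp/raft type is real.)

```go
package sketch

import (
	"errors"
	"sync"

	"github.com/hashicorp/raft"
)

var ErrRaftNotReady = errors.New("raft not started yet")

type raftHolder struct {
	mu sync.Mutex
	r  *raft.Raft
}

// Set is called by the raft worker once it has finished starting.
func (h *raftHolder) Set(r *raft.Raft) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.r = r
}

// Get is called lazily by API handlers, which is why the apiserver
// can start before raft is up.
func (h *raftHolder) Get() (*raft.Raft, error) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.r == nil {
		return nil, ErrRaftNotReady
	}
	return h.r, nil
}
```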
<babbageclunk> wallyworld: is the reason the upgrade steps need to be idempotent because the different controllers could each try to run them?
<wallyworld> they are run every time any upgrade is done
<wallyworld> or they may half run and then the agent restarts etc
<babbageclunk> ok - but it looks like there's also no ifPrimaryController around them either. (This is good for me - otherwise asking for leadership would be needed straight away.)
<wallyworld> with your thought, i think only the master controller runs them
<wallyworld> oh
<wallyworld> maybe not then
<babbageclunk> oops - need to go feed the animals! Will check back later to see if you have any other brainwaves!
<wallyworld> coffee might help
<vino> the --output option works with --format specified.
<vino> so its not a bug :)
<vino> I am talking about juju run --output option
<anastasiamac> vino: wallyworld: the PRs need to have reviews before landing... no?
<vino> yes. he is doing it.
<vino> wallyworld is reviewing.
<anastasiamac> vino: ? m seeing $$merge$$ without a review. that's a landing...:D
<vino> sure.
<wallyworld> which PR?
<wallyworld> the contributing one?
<wallyworld> i thought chris reviewed it
<anastasiamac> m seeing it on 8593
<wallyworld> oops, ok, i'll stop the job thanks for picking that up :-)
<anastasiamac> nws
<anastasiamac> vino: very nice to see PRs up, btw :D
<wallyworld> yeah, 3rd day and already a couple of PRs :-)
<anastasiamac> \o/
<vino> Thanks everyone :) :)
 * wallyworld off to soccer, have a good weekend
<manadart> Just pushed what I think are the last changes to https://github.com/juju/juju/pull/8578.
<admcleod_> is it possible that the cloud-init juju use for lxd containers was updated in the last 24 hours?
<admcleod_> hmm nvm
<hml> morning
<balloons> manadart, KVM done as well, interesting
<balloons> Good morning HNL
<balloons> lol, I'm terrible at typing
<hml> ha
<manadart> balloons: Yes; we can discuss in 8 mins.
<hml> tab complete :-)
<frankban> jam: ping, do you have a minute?
<hml> juju devel version bootstrap from mac: if you specify --bootstrap-series xenial and --build-agent… juju builds an agent for the mac, but not for the bootstrap series of xenial.  interesting
<hml> cross compiling juju for linux/amd64 from mac is not as easy as it should be… some dependencies of dependencies are missing.  :-/
<balloons> hml, ugh. We used to do the opposite easily enough
<balloons> anyone want to have a quick look https://github.com/juju/juju/pull/8596?
<hml> balloons: blows up in LXD… trying to find a way forward now
 * hml looking
<hml> balloons: lgtm
<hml> something is definitely weird with cross compiling juju from mac - it's failing on a wrong version issue (mismatch of arguments) of a package not required in the ubuntu juju build environment
<balloons> hml.. wow. Did you hit a wall?
<hml> yup
<hml> balloons: i canât build the lxc client eitherâ¦ which i should be able to do on mac per the github
<hml> so perhaps i there is something else wrong or not right yet
<hml> ???
<balloons> hml, gotcha. Well the answer is to use ubuntu :-) Always the right answer
<hml> it's odd where it's failing though…
<hml> balloons: at least the reverse works?  i can cross compile on ubuntu for mac... and without the package that's failing the other way.  :-)
<balloons> hml, yea, reverse is easy. But we build natively now on mac
<hml> https://pastebin.ubuntu.com/p/GFtYqH5jdK/  <-- this is where i'm at
<hml> holy crap balloons, I think I just had a good complete total run with the new ci test!.
<balloons> WHA!?
<hml> I KNOW!  right
<balloons> hml, that is awesome
<balloons> Your thoughts on writing a test would be most appreciated as usual :-)
<hml> balloons: here's the log: http://paste.ubuntu.com/p/Qp2Nzxf2rM/
<hml> balloons: sure… will ponder things for the new folks
<hml> it took 1h45 to run at home.  bleh
<balloons> hml, a good feeling though to see it do all that work eh?
<hml> balloons: almostâ¦
<balloons> hml, yea.. that's amazingly long.  But perfect for something that clearly isn't trivial to do yourself
<hml> balloons:  it wasn't the whole thing…  but the long part that i'd been having trouble with
<hml> :-(
<hml> but the other part is already working
<hml> so stil a happy day
<balloons> hml, yea.. somehow despite my efforts I always give folks some real doozies to start with
<balloons> I guess all the easy stuff already has tests eh?
<hml> perhaps.. it shouldn't have been this hard to write
<balloons> I will agree on that
<hml> i'd still like to understand what is going on in the ci test "libraries" such that i couldn't get juju run or juju ssh working consistently
<balloons> hml, believe it or not the library used to be 3 or 4 times bigger than it is now. It's still way too big
<hml> wow
<balloons> as always.. it's MUCH better than it was. Ask chris if you don't believe me :-)
<hml> doing one last run with the other two parts while i clean up the commented out code and comment other stuff….
<hml> crossing fingers for real PR by EOD - if i didn't jinx myself
<balloons> I'm having fun making the release process better. Always something low-hanging to automate
<balloons> and of course everything works fine for me
<hml> of course
<hml> balloons: for your reviewing pleasure: https://github.com/juju/juju/pull/8563
<balloons> hml, yay
<hml> now to figure out how to get the job created
<balloons> thanks hml. I actually edited that file today when I was doing it myself
<balloons> that is, the readme on how to add a job
<hml> balloons: can you give me the link again?  i lost it :-(
<balloons> hml, feel free to tweak the document in general. It could use it I think. It kind of runs on
<hml> balloons: ack
<hml> always fun when a new person uses the docs. :-)
<balloons> hml, first glance is this looks really nice, especially for your first effort
<hml> balloons: what does "artifacts the job results" mean?
<balloons> hml, in the yaml or in general?
<hml> it's mentioned in the readme for creating a new test yaml
<balloons> it means it will retain copies of whatever build artifacts you specify. Otherwise they'll be gone
<hml> ack
<hml> balloons: running jenkins-jobs test locally to test the yaml, should a job be generated?
<balloons> I think it spits it out on the console
<balloons> I never really run test -- just push and try it
<hml> ah
<balloons> if it fails to parse, it will fail to create the job
<balloons> no harm in pushing it several times, running, and tweaking
<hml> it passed parsing
<balloons> make sense?
<hml> yup
<balloons> hml, reviewed, again very nice
<hml> balloons: ty
<balloons> hml, ready for another? :p
<hml> balloons: i'll take a small break for now.. but i have a feeling the backup restore test is in my future
<balloons> hml, I think that was a great warmup to it. To be fair, the test for backup / restore exists. Though, it's unclear if I'd advise you to use much, if any , of it
<hml> :-)
<balloons> a quicky for anyone https://github.com/juju/juju/pull/8597
<hml> balloons: how do those work?  i'm not sure what i'm looking at with the pr
<hml> :-)
<balloons> hml, I discovered one of the patches (look at the Makefile for reference on when we use it, and the README in patches folder for context) isn't being applied because it's .patch, not .diff
<hml> ah
<hml> how did we get away with âforgettingâ that patch?
<balloons> hml, it's another reason I'm not a fan of them. Easy to not include them
#juju-dev 2018-04-15
<veebers> Morning all o/
<vino> Morning :)
<veebers> Hey vino o/ How has your morning gone? I just had to call the insurance company because I broke the stovetop :-(
<vino> Hey Veebers! I am good. Awake from 5:am this morning.
<vino> Excellent its a good start for you :p
<vino> Looks like u had a fun weekend
<veebers> vino: hah that's earlier than me and I have a baby alarm clock ^_^
<veebers> thumper: going to propose http://paste.ubuntu.com/p/bPYQSpV5Kq/ to fix https://bugs.launchpad.net/juju/+bug/1745031 (finally).  Wanted to add a test for ValidateFileAttrValue (none currently), seems either I need to write actual files or mock os.Stat to do so. I presume mocking os.Stat would be preferred?
<mup> Bug #1745031: gce add credentials "Enter file" absolute path msg improvment <juju:Triaged by veebers> <https://launchpad.net/bugs/1745031>
<thumper> or writing some creds to a temporary file
<thumper> that would be preferable to mocking os.Stat
<veebers> thumper: the function in question only deals with file paths, doesn't care about the content
<thumper> write actual files
<thumper> ioutil.WriteFile
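(Aside: thumper's suggestion in test form — a sketch only: the suite name is invented, and ValidateFileAttrValue's signature is assumed from context and taken to be in scope of the package under test, so treat this as the shape of the test rather than the real thing.)

```go
package sketch

import (
	"io/ioutil"
	"path/filepath"

	gc "gopkg.in/check.v1"
)

type credentialsSuite struct{} // illustrative suite

func (s *credentialsSuite) TestValidateFileAttrValue(c *gc.C) {
	// Write a real file in a per-test temp dir instead of mocking os.Stat.
	path := filepath.Join(c.MkDir(), "creds.json")
	err := ioutil.WriteFile(path, []byte("{}"), 0644)
	c.Assert(err, gc.IsNil)

	// An absolute path to an existing file should validate.
	_, err = ValidateFileAttrValue(path)
	c.Assert(err, gc.IsNil)

	// A relative path should be rejected with a helpful message.
	_, err = ValidateFileAttrValue("creds.json")
	c.Assert(err, gc.NotNil)
}
```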
<veebers> thumper: also, juju add-credential google is happy with an empty json file, says: Credential "blah" added locally for cloud "google".
<veebers> thumper: ack, can do
<thumper> veebers: that seems like a different validation bug for credentials :)
<veebers> thumper: it does
<thumper> veebers: are you almost ready to land the simple fix from Jan now?
<thumper> heh
<veebers> thumper: hah yeah ^_^ it's been a journey
 * thumper goes back to his weaving of presence
<wallyworld> vino: standup?
