#juju-dev 2012-08-27
<davecheney> hello
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1038296/comments/5
<davecheney> ^ tricky one to chase down
<fwereade_> davecheney, heyhey
<fwereade_> davecheney, ahh, well caught
<davecheney> fixing it might be harder
<davecheney> i've just proposed a small patch that adds the changes to the test to get that debugging
<davecheney> this will have to be fixed sharpish, because while it happens infrequently in dummy
<davecheney> the race will happen almost 100% of the time on ec2
<mramm> ouch!
<davecheney> actually I should test that before throwing off wild assertions
<TheMue> morning
<davecheney> fscking launchpad, why can't I add a bug from the milestone screen !!
<TheMue> davecheney: *lol* e'body loves this tool
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1042066
<davecheney> ^ thoughts
<TheMue> davecheney: sounds reasonable. but i would like it configurable
<TheMue> davecheney: one mode works like today, to test functionality
<TheMue> davecheney: and the other one more randomly to behave like in the wild
<davecheney> TheMue: sure, we can do that
<davecheney> shit, https://bugs.launchpad.net/juju-core/+bug/1038296/comments/6
<davecheney> turns out, our recovery logic works, but this isn't awesome
<mramm> Well, it's almost 3am here, so I'm going to crash for a bit, see you all in the morning
<TheMue> cu, and have a good night
<davecheney> right - that is enough bugs logged for tonight
<davecheney> gotta get some sleep so I can fix 'em tomorrow
<TheMue> davecheney: enjoy your evening
<niemeyer> Hello all
<TheMue> niemeyer: hiya
<niemeyer> TheMue: Yo
<niemeyer> fwereade_: ping
<fwereade_> niemeyer, pong
<niemeyer> fwereade_: Yo
<fwereade_> niemeyer, sorry I missed you on friday
<niemeyer> fwereade_: Oh, np
<niemeyer> rogpeppe: Heya
<rogpeppe> yo!
<rogpeppe> public holiday today BTW, i'm not here :-)
<niemeyer> rogpeppe: Just reviewed the test refactoring branch.. nice stuff!
<niemeyer> rogpeppe: Ah, cool :-)
<rogpeppe> niemeyer: cool, thanks
<rogpeppe> niemeyer: did you think of a neater way to do what i was trying to do with gocheck?
<rogpeppe> niemeyer: i think the reassurance it gives is very useful - it's so easy to break things in subtle ways.
<niemeyer> rogpeppe: Agreed re. reassurance, and no I haven't looked yet
<rogpeppe> niemeyer: np
<niemeyer> fwereade_: Is it a holiday for you as well?
<niemeyer> mramm: Morning
<mramm> niemeyer: Morning
<rogpeppe> niemeyer: am spending the afternoon putting all our wedding photos online, and wondering about using juju to host it, for the crack - do you know of any decent open source photo website s/w?
<niemeyer> rogpeppe: Neat!
<niemeyer> rogpeppe: I used to use Gallery for that
<niemeyer> rogpeppe: http://gallery.menalto.com/
<rogpeppe> niemeyer: cheers! not entirely sure whether using juju is a good idea, as no storage management yet... we'll see.
<niemeyer> rogpeppe: It's somewhat easy to workaround that
<niemeyer> rogpeppe: It's actually neat as it forces you to think through, rather than just forgetting the data in a machine somewhere
<rogpeppe> niemeyer: what i don't want to do is spend hours uploading GB of photos and then lose 'em...
<niemeyer> rogpeppe: Right
<niemeyer> rogpeppe: FWIW, we use EBS for the disk on all instances, so this shouldn't happen in general
<niemeyer> rogpeppe: Then, I'd suggest using S3 as a back-and-forth media for quick re-deployments
<rogpeppe> niemeyer: ah, makes sense.
<rogpeppe> niemeyer: so have a big tar file in S3 and suck that down when starting a new unit.
<niemeyer> rogpeppe: Yeah
<niemeyer> rogpeppe: and via config you can easily send it back
<rogpeppe> niemeyer: interesting. how would that work?
<rogpeppe> niemeyer: set a config attr saying "please archive now" ?
<niemeyer> rogpeppe: Once we have the Go port nailed down, the stop hook should work too
<rogpeppe> niemeyer: yeah, *that* would be good.
<niemeyer> rogpeppe: Yeah, you can use a named config value
<niemeyer> rogpeppe: Whenever it changes, save it again
<rogpeppe> niemeyer: a bit of a hack, but yeah, that'd work ok.
<niemeyer> rogpeppe: Doesn't feel like a hack to me.. it's a straightforward convention with named backups
<rogpeppe> niemeyer: ah, if you don't overwrite backups, yeah.
<niemeyer> TheMue: go-mstate-life-file reviewed
<niemeyer> TheMue: ping
<TheMue> niemeyer: seen it, cheers
<TheMue> niemeyer: just tested google two step authentication and got troubles with all those apps who need an individual pw
 * TheMue has an LGTM overflow ;)
<niemeyer> TheMue: Good stuff indeed
<TheMue> niemeyer: thx
<TheMue> niemeyer: and mstate is real fun
<niemeyer> I'm mostly out of things to review, which would be a feat were it not for the fact that the pending one is a massive 1k+ branch
<niemeyer> I'll head to lunch now, though
<niemeyer> TheMue: Glad to hear it
<TheMue> niemeyer: enjoy your lunch, i'll go to archery now ;)
<niemeyer> fwereade_: it-lives has final review!
<niemeyer> a final
<niemeyer> Back in ~30mins
<rogpeppe> fwereade_: i wonder if you could do me a favour and check that the goamz live tests work ok for you. i get SignatureDoesNotMatch errors and i'm not sure if it's because i've just mucked up my aws account or because there's actually something wrong.
<rogpeppe> fwereade_: hmm, you're probably done for the day, of course!
<rogpeppe> niemeyer: ping
<niemeyer> rogpeppe: Hi
<niemeyer> Doc appointment took a bit longer than expected
 * niemeyer feels sorry for mramm
<fwereade_> niemeyer, tyvm for reviews :)
<niemeyer> fwereade_: My pleasure
<fwereade_> niemeyer, the ErrDying thing has caused rogpeppe's dislike for ErrorContextf to click, though, by ErrorContextf-ing the modes I can't (easily) actually return ErrDying, and *that* was what I was having a problem with
<fwereade_> niemeyer, I'll figure something out, though :)
<fwereade_> niemeyer, `x.tomb.Kill(nil); x.tomb.Kill(ErrDying)` still gives Error() of nil, right?
<niemeyer> fwereade_: Aha, that makes more sense, thanks
<niemeyer> fwereade_: Yeah
<fwereade_> niemeyer, ok, I'll come up with something clean, but maybe tomorrow
<niemeyer> fwereade_: Of course, thanks a lot
<fwereade_> niemeyer, cheers, gn
<fwereade_> davecheney, gm and gn :)
<niemeyer> fwereade_: Have a good night
<fwereade_> niemeyer, oh, just one question on the other review -- part of me wants to say that rather than "dropping good information on the floor" I should maybe actually explicitly require nil when setting charm state to Deployed
<niemeyer> fwereade_: What's wrong with preserving the URL?
<fwereade_> niemeyer, it's not so much that the information is *bad* but it is not a canonical source of truth
<niemeyer> fwereade_: There is an actual URL, right?
<niemeyer> fwereade_: Sure, but it still feels like we know what we've last put there
<niemeyer> fwereade_: That's what the info tells
<fwereade_> niemeyer, there's deployedURL, which is definitely correct, and then there's this; if we have and trust this, then we don't need the one inside the charm dir, do we?
<niemeyer> fwereade_: While this isn't a big deal, I don't see a reason to not have it either
<niemeyer> fwereade_: Makes sense, or am I missing something?
<fwereade_> niemeyer, either this, or the .juju-charm file, seems to be redundant
<fwereade_> niemeyer, I would be comfortable with either as the single source of truth
<fwereade_> niemeyer, having both just makes me nervous ;)
<niemeyer> fwereade_: It's not about having a single source of truth in this case.. they have different meanings
<niemeyer> fwereade_: We're not trusting on both
<fwereade_> niemeyer, hmm, true
<niemeyer> fwereade_: We know they mean different things and may be different
<niemeyer> fwereade_: and we know why
<niemeyer> fwereade_: We're also not even exposing that info ATM
<niemeyer> fwereade_: I'm suggesting that mainly because it enables us to change our minds
<fwereade_> niemeyer, ha, yeah, fair enough :)
<fwereade_> niemeyer, ok, sgtm
<fwereade_> niemeyer, cheers
<niemeyer> fwereade_: Cheers
<fwereade_> niemeyer, oh, and JujuConnSuite.Reset is so that I can write the uniter tests as a table, in which each starts from pristine state -- am I missing some obvious better way to do that?
<niemeyer> fwereade_: Hmm, interesting
<niemeyer> fwereade_: I guess it sounds reasonable..
<fwereade_> niemeyer, it's wanting to do stuff like this that underlies my half-belief that Fixture is a useful concept that usefully encompasses Suite and Test... but sadly *doesn't* handle both Suiteness and Testness at the same time, so the idea clearly needs more thought
<fwereade_> niemeyer, oh, and I'm not sure I explicitly thanked you: I forget which review it was where you suggested replacing table comments with summary fields and embracing zero values, but it really made table-driven testing click -- ie it now seems *better*, not just different, IYSWIM
<niemeyer> fwereade_: That's awesome!
<fwereade_> niemeyer, it's one of those things that now seems ridiculously obvious, it takes a bit of effort to remember the previous perspective :)
<niemeyer> fwereade_: Yeah, I totally know what you mean.. quite a few things take a while to click
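The table-driven style discussed above (a summary field instead of a comment per entry, and zero values standing for "nothing expected") might look roughly like the following. This is only a sketch: `parsePort`, `portTests`, and every field name are hypothetical and are not taken from the uniter tests.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort is a hypothetical function under test.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("invalid port %q", s)
	}
	if n < 1 || n > 65535 {
		return 0, fmt.Errorf("port %d out of range", n)
	}
	return n, nil
}

// portTests is the table. The summary field replaces a comment above each
// entry, and zero values carry meaning: an empty err means success is
// expected, and a zero port means no value is expected.
var portTests = []struct {
	summary string
	input   string
	port    int
	err     string
}{
	{summary: "plain port", input: "80", port: 80},
	{summary: "not a number", input: "http", err: `invalid port "http"`},
	{summary: "out of range", input: "70000", err: "port 70000 out of range"},
}

// runPortTests runs every entry and returns the summaries of the failures.
func runPortTests() []string {
	var failed []string
	for _, t := range portTests {
		port, err := parsePort(t.input)
		msg := ""
		if err != nil {
			msg = err.Error()
		}
		if port != t.port || msg != t.err {
			failed = append(failed, t.summary)
		}
	}
	return failed
}

func main() {
	fmt.Println("failed:", runPortTests())
}
```

Because the summary travels with the entry, a failure can be reported by name rather than by index into the table.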
 * fwereade_ threatens himself with a hammer, and reluctantly slinks off to bed; nn all :)
<niemeyer> fwereade_: Have a good night man :)
#juju-dev 2012-08-28
 * davecheney starts to salivate
<davecheney> http://shopap.lenovo.com/au/en/products/laptops/thinkpad/thinkpad-innovation?cid=EDM_20120828_ANZ_AU_CON_X1Carbon_ShortRange_SL&RRID=222186471&esrc=EPI2JANZ
<niemeyer> davecheney: Wow, sweet indeed
 * mramm2 has no love for my ISP today -- but they are sending out the second tech support person in 2 days tomorrow morning
<davecheney> hello, is there an mstate.Open ?
<davecheney> i can't find it, i must be dumb
<TheMue> good morning
<davecheney> TheMue: good morning
<davecheney> do you know if there is an mstate.Open method ?
<TheMue> hiya dave
<TheMue> davecheney: sorry, don't know if it already exist
<davecheney> TheMue: that is sad
<davecheney> is there an mstate.Info ?
<TheMue> davecheney: dunno too, i've just started with lifecycle and test completion
<TheMue> davecheney: so you just have to scan the code
<davecheney> TheMue: cool
<davecheney> thanks
<davecheney> i'll ask aram
<TheMue> davecheney: at least in trunk they don't exist
<davecheney> TheMue: does mongo have anything like the concept of zookeepers' fallover addresses ?
<TheMue> davecheney: and aram currently focusses on txn and watchers
<TheMue> davecheney: afaik the concept is different, see http://www.mongodb.org/display/DOCS/Sharding+and+Failover. but here i'm not deep enough into both systems.
<davecheney> TheMue: ta
<rogpeppe> fwereade, TheMue: morning
<fwereade> rogpeppe, heyhey
<TheMue> rogpeppe: hi, had a nice day off (spending your time on wedding photos)?
<TheMue> fwereade: hello
<rogpeppe> fwereade: any chance you could run a very brief live test for me? i can't work out if i've mucked up my amazon stuff or if our code has gone wrong
<fwereade> TheMue, heyhey :)
<rogpeppe> TheMue: yes thanks
<fwereade> rogpeppe, heh, ok, sure; in 5 mins?
<rogpeppe> fwereade: np
<fwereade> rogpeppe, ok, that wasn't 5 mins, but maybe it actually will be in 5 more mins -- would you let me know what I need to do now?
<rogpeppe> fwereade: go test -amazon -gocheck.vv launchpad.net/goaws/ec2
<rogpeppe> fwereade: (assuming you've got valid AWS_ environment variables set up)
<rogpeppe> oops
<rogpeppe> s/goaws/goamz/
<rogpeppe> oops again
<rogpeppe> fwereade: go test launchpad.net/goamz/ec2 -amazon -gocheck.vv
<rogpeppe> flags after packages, but only for go test :-)
<fwereade> rogpeppe, failures: http://paste.ubuntu.com/1171358/ (I managed to figure that bit out at least)
<rogpeppe> fwereade: interesting, but not the failures i was looking for :-)
<rogpeppe> fwereade: i'm seeing signature failures
<fwereade> rogpeppe, ha, sorry :(
<rogpeppe> fwereade: the weird thing is that the python juju works ok.
<rogpeppe> fwereade: time to make a minimal failing example, i think
<fwereade> rogpeppe, something's scratching at my mind about signed urls, I'll let you know if it turns into anything real
<rogpeppe> fwereade: ta. it's really odd - some test example code works.
<Aram> moin.
<TheMue> hi Aram
<rogpeppe> fwereade: one last test, just to make doubly sure: could you make sure you've got the latest goamz version (cd $GOPATH/src/launchpad.net/goamz; bzr pull) and run this program, with the auth details i gave you earlier substituted as appropriate.. http://paste.ubuntu.com/1171398/
<rogpeppe> fwereade: i'm finding it all a bit weird
<rogpeppe> fwereade: or just say bugger off if you're too busy... :-)
<fwereade> rogpeppe, huh, sorry, looks like I wasn't up to date :/
<fwereade> rogpeppe, and np at all I'm watching other tests atm, did something stupid :)
<fwereade> rogpeppe, bingo, signature does not match
<rogpeppe> phew
<rogpeppe> fwereade: and if you use your own credentials?
<fwereade> rogpeppe, doing that now
<fwereade> rogpeppe, yep, same failures
<fwereade> rogpeppe, sorry about that :(
<rogpeppe> fwereade: lovely thanks
<rogpeppe> fwereade: no, that's good!
<rogpeppe> fwereade: and when i reverted to revision 8, it all works
<fwereade> rogpeppe, nah, just sorry about the dumb version thing to begin with
<rogpeppe> fwereade: np, i'd've done the same
<rogpeppe> fwereade: the three line change between r8 and r9 is the culprit
<TheMue> Aram: just to be sure, the agreement has been that all reads on one or all entities with a lifecycle return them regardless of their life state, isn't it?
<Aram> TheMue: yes.
<TheMue> Aram: ok, i'm currently doing services and machines and will change that too where needed
<Aram> ok.
<rogpeppe> fwereade: i think i've found the problem
<fwereade> rogpeppe, oh yes?
<rogpeppe> fwereade: the urls don't end with a slash
<fwereade> rogpeppe, ha!
<fwereade> aaaaaaaaaaaaaaaaaaaaand we have a (rudimentary) Uniter merged :D
<rogpeppe> fwereade: right, now i can have some breakfast
<rogpeppe> fwereade: yay!
<rogpeppe> fwereade: i will eat muesli in celebratory mood!
<fwereade> rogpeppe, heh, I should probably do the same in a mo :)
<fwereade> sorry guys, need to pop out for a bit, bbs
<TheMue> lunchtime
<davecheney> biggup!
<fwereade> heya davecheney
<davecheney> has everyone got their UDS invite yet ?
<Aram> no
<davecheney> ffs, i told michelle that there was a problem, but she didn't believe me
<davecheney> eventbrite has this web bug they put on their email, so they claim it has been 'opened'
<davecheney> like that could ever be wrong
<Aram> hmm
<Aram> I do have an invite, now that you made me look
<davecheney> bwahahah
<Aram> 14 days ago, actually
<Aram> heh
<davecheney> sssh
<Aram> Ubuntu Developer Summit - R
<Aram> what's the 'R'?
<fwereade> Aram, letter after Q
<davecheney> it's the next letter after Q
<davecheney> jynx!
 * fwereade gesticulates wildly but emits no sound
<davecheney> fwereade: !
<fwereade> sorry guys 2 mins
<davecheney> at an old workplace, the IRC bot had a !jynx command
<davecheney> one guy spent far too much time teaching it levenshtein distances so it could figure out who to mute
<niemeyer> Yo!
<TheMue> hiya niemeyer
<davecheney> hey
<niemeyer> Anyone has the invites out yet
<niemeyer> ?
<davecheney> nup
<niemeyer> Sending
<niemeyer> rogpeppe?
<rogpeppe> niemeyer: yo!
<rogpeppe> niemeyer: meeting, i guess
 * rogpeppe goes to fetch the other computer
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1042604
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1042579
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1038296/comments/5
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1038296/comments/6
<davecheney> Whoop whoop!
<niemeyer> davecheney: I'm not sure if you want to talk a bit more about the problem
<niemeyer> davecheney: Do you wanna brainstorm on it for a moment?
<davecheney> niemeyer: sure
<niemeyer> davecheney: Cool, so I'll pick the code to follow alone
<niemeyer> along
<niemeyer> Wow.. interesting typo
<davecheney> is the hangout still active ?
<niemeyer> davecheney: Would you rather use G+? Cool.. I've sent it again
<davecheney> jynx! so have I
<davecheney> niemeyer: https://plus.google.com/hangouts/_/649cd97bfeb012132fd9f58aeaf998dfd90329b2
<davecheney> ^ does that work ?
<davecheney> niemeyer:         svc, err := s.Conn.AddService("test-service", charm)
<davecheney>         c.Assert(err, IsNil)
<davecheney>         err = svc.SetExposed()
<davecheney>         c.Assert(err, IsNil)
<davecheney>         units, err := s.Conn.AddUnits(svc, 1)
<davecheney>         c.Assert(err, IsNil)
<davecheney>         err = units[0].OpenPort("tcp", 999)
<davecheney>         c.Assert(err, IsNil)
<rog> davecheney, TheMue: i think the firewaller probably needs to be changed so that it only adds machines to the machineds map when the machine has an instance id.
<rog> davecheney: or, better perhaps, the firewaller could keep the set of current instances, and change it when a machine's instance id changes
<TheMue> rog: InstanceId() of machine already returns an error if unset and that is caught in firewaller line 202
<rog> TheMue: yes, but this is something that happens in the normal course of things. we don't want the provisioner to be restarted each time it happens.
<rog> TheMue: i'm not sure that retry logic is the right fit here either
<rog> TheMue: hmm, mind you perhaps retry is ok here, in a kinda hacky sort of way, because we know that the firewaller is in the same process, so it will be seeing the same changes and be reacting pretty quickly.
<TheMue> rog: if we don't have an instance id retrying to open them doesn't help ;) so indeed there may be a need for (a) watching for the instance id
<rog> TheMue: that's what i'm thinking
<rog> TheMue: (we already have a watcher that can do that)
<TheMue> rog: sadly i don't see the according error message in the log. here i'm wondering
<TheMue> rog: ah, found, looked too high
<rog> TheMue: looking at the code, i don't think it would be too hard.
<rog> ... maybe
<TheMue> rog: depends on the strategy
<TheMue> rog: wait there less or more good wrapped but blocking
<rog> TheMue: syntax error :-)
<TheMue> rog: or start a kind of async port opener in an extra goroutine
<TheMue> more or less
<rog> TheMue: i didn't understand that first sentence, but i don't think the latter is a good idea
<rog> TheMue: if we let ourselves be led by the model, a Machine's instance id can change at any time, and we should track that.
<TheMue> rog: 1st sentence is a kind of instanceId, err := machined.machine.WaitInstanceId()
<rog> TheMue: i don't think that's a great idea either
<TheMue> rog: no, because it blocks the fw
<rog> TheMue: i'm thinking we should maintain another map inside the Firewaller
<rog> TheMue: that maps machine id to instance id
<rog> TheMue: or to environs.Instance, better perhaps.
<rog> TheMue: there's something i'm trying to understand about the current firewaller; maybe you can explain
<TheMue> rog: i'll try
<rog> TheMue: it never calls Instance.Ports, so if the firewaller is restarted, how can it know what ports to close on an instance?
<TheMue> rog: afaik we once talked about it. i just pass the instance the ports i want to have opened or closed, regardless of whether they are already open or closed
<rog> TheMue: that's fine for opening ports (open is idempotent, and when we start, we assume no ports are open), but i think it fails for closing ports.
<davecheney> sorry lads, wasn't watching what you were writing
<davecheney> was talking to gustavo
<niemeyer> TheMue: Please leave that to davecheney
<TheMue> rog: why does it fail?
<TheMue> niemeyer: ok
<rog> TheMue: because when you start up, you need to close any ports that are currently open but that are not mentioned in the state
<rog> TheMue: but unless you call Instance.Ports, you can't know what those are
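The startup problem rog describes comes down to a set difference: ports that are currently open on the instance but are not mentioned in the state have to be closed. A minimal sketch, assuming the open ports have already been fetched from the provider (for example via something like Instance.Ports); the `Port` type and `portsToClose` are hypothetical and not the firewaller's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// Port is a hypothetical stand-in for a protocol/number pair.
type Port struct {
	Protocol string
	Number   int
}

// portsToClose returns the ports that are open on the instance but are
// not wanted according to the state, sorted for stable output.
func portsToClose(open, wanted []Port) []Port {
	want := make(map[Port]bool, len(wanted))
	for _, p := range wanted {
		want[p] = true
	}
	var stale []Port
	for _, p := range open {
		if !want[p] {
			stale = append(stale, p)
		}
	}
	sort.Slice(stale, func(i, j int) bool {
		if stale[i].Protocol != stale[j].Protocol {
			return stale[i].Protocol < stale[j].Protocol
		}
		return stale[i].Number < stale[j].Number
	})
	return stale
}

func main() {
	open := []Port{{"tcp", 80}, {"tcp", 999}}
	wanted := []Port{{"tcp", 80}}
	fmt.Println("close:", portsToClose(open, wanted))
}
```

Without the list of currently open ports there is nothing to subtract from, which is why opening ports can be done blindly but closing them cannot.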
<davecheney> ARGH!
<davecheney> why does LP not have a 'report a bug' link on the milestone page
<davecheney> I spend my life on that page and there is no bloody link to create a _new_ bug for this milestone
<rog> davecheney: think of daisies, la la la
<TheMue> niemeyer: is rog with this fw-startup-port-closing right? if so we should file a bug.
<davecheney> rog: please file a bug
<davecheney> rog: also, is there a bug for the AMZ breakage
<rog> davecheney: yes, someone else had already filed one
<davecheney> I have a shittone of 'doesn't work in XYZ region' bugs that I am working through
<rog> TheMue: interestingly, this was an issue that we didn't have in the code sketch that i originally proposed for the firewaller
<rog> TheMue: (well, perhaps... :-])
<TheMue> ;)
<rog> davecheney: i'll write a test that breaks first, then i'll file a bug :-)
<davecheney> rog, you've done this before :)
<rog> davecheney: yup, test fails as expected
<davecheney> niemeyer: https://bugs.launchpad.net/juju-core/+bug/1042717
<davecheney> does this capture (part) of the discussion we just had
<rog> davecheney: i think AllInstances only returns running instances
<rog> davecheney: (or pending)
<niemeyer> davecheney: As far as terminology goes, I suggest stopped vs. terminated
<niemeyer> davecheney: That's what EC2 uses
<davecheney> ya'all can edit the ticket, please have at it
<mramm2> sorry all
<davecheney> mramm2: did my text make it ?
<mramm2> yep
<mramm2> and my phone was right next to me
<davecheney> mramm2: if you wanna have a quick catchup now while everyone is online, lets do it
<mramm2> but I slept through it
<niemeyer> davecheney: Looks good
<niemeyer> davecheney: I was wondering a bit about (2)
<niemeyer> davecheney: Do we need it right now?
<rog> davecheney: what do you think we should do with a stopped machine? its agent will appear dead.
<rog> niemeyer: ^
<davecheney> rog: discuss with niemeyer, he dug up that corpse
<rog> hmm, yeah, it might be problematic if we find we're running two unit agents for the same unit
<rog> but i suppose that's a problem with a down network connection too
<niemeyer> TheMue: Sorry, I did read the discussion, but it's not clear to me what problem is being fixed, or why we should change something there
<davecheney> niemeyer: this one is a bit pithier
<rog> niemeyer: the problem is that the firewaller sees a new machine and tries to change its ports, but the machine hasn't yet been allocated an instance.
<niemeyer> davecheney: Which one?
<davecheney> niemeyer: https://bugs.launchpad.net/juju-core/+bug/1042721
<davecheney> paste fail
<rog> niemeyer: so the firewaller dies
<niemeyer> rog: I've discussed this with davecheney already.. there's zero reason to change ports for an instance that doesn't exist
<rog> niemeyer: indeed
<rog> niemeyer: which means the firewaller should watch each machine to see when its instance id changes, i think
<niemeyer> rog: I don't get the leap
<rog> niemeyer: when should the firewaller open the ports for a new instance?
<niemeyer> davecheney: +1
<davecheney> niemeyer: my only question is
<davecheney> which ports are we talking about, the ones in the state, or the ones in the security group of the provider ?
<niemeyer> rog: When it gets an open port watcher firing for an instance within it and its service was exposed, in either order
<niemeyer> davecheney: State
<rog> niemeyer: open port watchers fire for machines, not instances
<niemeyer> davecheney: We've agreed back then that StartInstance never gives back something with ports open
<niemeyer> rog: It fires for units, actually
<rog> niemeyer: sure
<davecheney> niemeyer: ok, then in the pathological case, all the units' machines have been replaced
<niemeyer> rog: Which live within an instance when they are running
<davecheney> and the service is now offline
<niemeyer> rog: No instance, no unit
<rog> niemeyer: but when you see that state change, you haven't necessarily got an instance to change the ports on
<niemeyer> rog: Impossible
<rog> niemeyer: really?
<niemeyer> rog: The unit lives within the instance.. if there's no instance, there's no uniter, and thus no open ports
<davecheney> niemeyer: ahh, and you just answered my question, when the instance is replaced, the new uniter will react to 'exposed' and do what it needs
<rog> niemeyer: ah...
<rog> niemeyer: i'd forgotten that rub
<rog> niemeyer: brilliant!
<TheMue> the missing piece
<rog> niemeyer: so the test is wrong.
<rog> it's all my fault :-)
<mramm2> I setup a hangout: https://plus.google.com/hangouts/_/0fbf3a66c1e6ee955123854516a78f947aa621cb
<niemeyer> davecheney: Yeah, or in install/start whatever.. it's free to run open-port in any hook
<mramm2> not required, but if you want to chat
<mramm2> feel free to join
<niemeyer> rog: Well, kind of.. as discussed with davecheney, the test is also right
<niemeyer> rog: It's just a different unit test
<niemeyer> rog: It shouldn't blow up in such a state
<davecheney> rog: it's wrong in the way a hat made of bacon is wrong
<rog> davecheney: are you dissing my bacon hat?
<rog> niemeyer: should it just ignore the error?
<rog> niemeyer: the firewaller, that is
<niemeyer> rog: Yeah, if the instance is gone, it can ignore it.. the provisioner will close ports in state, fire a new instance, assign to it, and the new uniter will open it back again
<rog> niemeyer, davecheney: so this should fix that particular test: http://paste.ubuntu.com/1171742/
<davecheney> niemeyer: me looks
<davecheney> niemeyer: that should _almost_ work
<rog> davecheney: almost?
<davecheney> there is a race between the call to dummy.StartInstance() returning and the instance id hitting the state
<davecheney> it is a much smaller window than currently exists
<rog> davecheney: shit yeah
<rog> davecheney: i knew about that before
<rog> davecheney: just forgotten it
 * rog is groggy today
<davecheney> rog: http://codereview.appspot.com/6482081/
<davecheney> your thoughts would be appreciated
<davecheney> note, setting a value higher than about 10ms will cause jujud tests to hang
<niemeyer> davecheney: Huh.. a good one to inspect later :)
<davecheney> rog: niemeyer : booohhh https://bugs.launchpad.net/bugs/1042545
<rog> davecheney: why bother putting the delay time in the environState if it's actually global?
<davecheney> rog: TheMue requested that we be able to change it
<rog> davecheney: ah
<davecheney> so in theory, we could change it by type asserting to dummy, then reaching in and changing the value on a per test basis
<rog> davecheney: not really
<rog> davecheney: it's all unexported
<rog> davecheney: i'd prefer a possible entry in the dummy environ configuration attributes.
<rog> davecheney: if we wanted an override
<davecheney> rog: good point
<niemeyer> rog: Yeah, that sounds sensible
<niemeyer> I've suggested a function in the dummy package, but an attribute setting seems even nicer
<niemeyer> environment setting
<rog> niemeyer, davecheney: i'm not sure about it though (an override that is)
<niemeyer> davecheney: That said, I still think there's value in having a flag to enable the delay globally
<rog> when would it be appropriate to use the override?
<niemeyer> davecheney: May be done with a command line flag, though
<niemeyer> davecheney: as suggested in the review
<rog> niemeyer: i think on balance i prefer the environment variable, as it means it's easy to run all tests with the delay enabled.
<niemeyer> rog: -dummy.delay 10s is just as easy
<niemeyer> I'd prefer to stay out of the business of env variables
<rog> niemeyer: i can't do: go test launchpad.net/juju-core/... -dummy.delay 10s
<niemeyer> For that purpose
<davecheney> niemeyer: that doesn't work when you do: go test launchpad.net/juju-core/...
<davecheney> what rog said
<niemeyer> Hmm.. okay
<Aram> we'll put it in a SOAP web service that the tests query using CORBA.
<davecheney> Aram: don't forget perl, you'll need lots of perl
<niemeyer> davecheney: Okay, so why all the fanciness? var delaySecs = os.Getenv("..."); func delay() { if delaySecs != "" { time.Sleep(...) } }?
<davecheney> niemeyer: TheMue requested that it be changeable
<rog> niemeyer: +1 (assuming we don't allow delay overriding with the config)
<niemeyer> davecheney: Ah, we have time.ParseDuration actually
<davecheney> which I now realise didn't work
<TheMue> Aram: i'm missing the JEE server in the middle receiving the requests and writing them via mqseries into an oracle where we could fetch them
<rog> TheMue: why do you want the delay overridable?
<niemeyer> Which you're already using
<davecheney> niemeyer: rog so if in principle you are in favor, I'll resubmit that CL tomorrow with something simpler
<davecheney> env var -> delay()
<niemeyer> davecheney: Yeah, definitely
<niemeyer> davecheney: we can also improve it in the future as we see necessary
<rog> davecheney: i think it's a good idea. i don't think we want a configurable delay - i can't think when it would ever be appropriate to use it.
<niemeyer> davecheney: Potentially approaching the mgo/txn's Chaos stuff
<rog> niemeyer: +1
<niemeyer> davecheney: But it's not worth it for now
<TheMue> rog: just has been a quick idea after dave mentioned the topic to test different ranges
<TheMue> rog: if the implementation concept now makes it useless it's fine for me
<davecheney> all: i agree, i just need a way to make dummy slow down to better simulate a real provider
<rog> davecheney: +1
<TheMue> davecheney: +1
<niemeyer> davecheney: +1
<davecheney> right, i'll try out your test suggestion tomorrow niemeyer
<niemeyer> davecheney: Cheers man
<davecheney> niemeyer: thanks for the discussion, i can see a straightforward way for the PA to implement those two tickets we discussed
<niemeyer> davecheney: Superb, thanks for finding the issue!  Glad to see these bugs being fleshed out.
<niemeyer> Okay, so reviews, and then presence
<niemeyer> TheMue: ping
<TheMue> niemeyer: pong
<niemeyer> TheMue: We'll need to fix the Kill methods.. they're changing the cached state to Dying irrespective of previous state
<niemeyer> TheMue: We shouldn't do Dying > Dead
<TheMue> niemeyer: ouch
<niemeyer> TheMue: Adding a comment
<TheMue> niemeyer: oh, yes, now i see it
<niemeyer> TheMue: It's fine to move on as they are for the moment, and fix all of them at once in a follow up
<TheMue> niemeyer: both new ones for service and machine are like the ones for unit and relation
<niemeyer> TheMue: Yeah, it's all good
<niemeyer> TheMue: We can have a new CL that follows up on that and fixes all three at once
<niemeyer> TheMue: (with a test!)
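One way to read the Kill fix under discussion is that the cached life value should only ever move forward: Kill takes Alive to Dying and leaves Dying and Dead untouched. The sketch below illustrates that reading only; the `Life` and `entity` types are hypothetical and do not reproduce the mstate documents.

```go
package main

import "fmt"

// Life is a hypothetical lifecycle value ordered Alive < Dying < Dead.
type Life int

const (
	Alive Life = iota
	Dying
	Dead
)

func (l Life) String() string {
	switch l {
	case Alive:
		return "alive"
	case Dying:
		return "dying"
	case Dead:
		return "dead"
	}
	return "unknown"
}

// entity stands in for a service, machine, unit, or relation with a
// cached life value.
type entity struct {
	life Life
}

// Kill moves an Alive entity to Dying. An entity that is already
// Dying or Dead is left as it is, so the value never moves backwards.
func (e *entity) Kill() {
	if e.life == Alive {
		e.life = Dying
	}
}

func main() {
	for _, start := range []Life{Alive, Dying, Dead} {
		e := entity{life: start}
		e.Kill()
		fmt.Printf("%v -> %v\n", start, e.life)
	}
}
```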
<rog> fwereade: could you just quickly give a high level overview of how constraints work? in particular, who solves the constraints? the PA or the client? I'm presuming the former, but just need to check.
<TheMue> niemeyer: when i've got your ok i'll do the followup
<TheMue> niemeyer: +1
<niemeyer> TheMue: Cool
<fwereade> rog, well, "solve" is frankly a bit of a strong word -- we basically match against a list of instance types, sort by cost, and pick the first
<rog> fwereade: yeah, but until that's done we don't know the architecture that's going to be used for the new unit, right?
<rog> fwereade: or series
<rog> fwereade: but that "solving" is done by the PA?
<fwereade> rog, ok, the series is known before we even start on constraints
<fwereade> rog, that's defined by the charm
<fwereade> rog, in every existing case, arch defaults to amd64 but can be set to i386 if desired
<fwereade> rog, but remember this is pretty ec2-specific
<rog> fwereade: yeah, i don't want to do anything provider-specific here
<rog> fwereade: so am i right that the PA does the matching?
<fwereade> rog, and -- well, *probably* it should be up to the PA, but at present no it is not
<rog> fwereade: ah, so the add-unit command works out the architecture etc then adds the new unit with those set?
<fwereade> rog, but I know niemeyer is -1 on this, and we hashed it out last UDS, so... yes, it is up to the PA
<fwereade> rog, er, IYSWIM
<fwereade> rog, yeah, that was what it did
<rog> fwereade: thanks, that's useful
<fwereade> rog, the reasons for it doing so are not especially interesting, and the actual choice procedure shouldn't really be any different
<rog> fwereade: it makes a difference for upgrading, interestingly
<rog> fwereade: i'm writing a little description of my current problem
<niemeyer> TheMue, Aram: ping
<Aram> pong
<fwereade> rog, ah, ok :)
<niemeyer> Aram, TheMue: Can we quickly talk about the last point here: https://codereview.appspot.com/6495043/
<TheMue> niemeyer: pong, sorry, have been afk for a moment
<TheMue> niemeyer: yes
<Aram> niemeyer: what point, this? mstate/service.go:45: s.doc.Life = Dying
<niemeyer> Aram: The last one in the review
<rog> niemeyer, fwereade: here's a description of my current upgrading difficulty: http://paste.ubuntu.com/1171829/
<TheMue> niemeyer: currently RemoveUnit() would run into an error if the units are not dead
<TheMue> niemeyer: but it has to be put into a complete txn later
<niemeyer> rog: Sorry, I'm covering a different issue right now
<rog> niemeyer: that's fine. but when you have a moment, i'm blocked on this.
<niemeyer> TheMue: That's unrelated to transactions
<niemeyer> TheMue: This is about the lifecycle behavior
<TheMue> niemeyer: otherwise we start to remove relations, remove units and may break before deleting the service
<Aram> niemeyer: I thought we had agreed that units should listen for their service and delete/die themselves.
<niemeyer> TheMue: RemoveService is abruptly *removing units from state*, despite whatever state they're in
<TheMue> niemeyer: ok, reducing it to lifecycle the solution should be to let all units die if Die() is called on a service
<TheMue> niemeyer: today it's not working this way
<niemeyer> TheMue: Of course it's not.. that's why we're doing the lifecycle stuff in the first place :)
<niemeyer> TheMue: Which is why I'm asking what's the plan
<niemeyer> Aram: Yes
<niemeyer> Aram: Not delete
<Aram> yes, only die.
<niemeyer> Aram: Kill themselves, actually
<niemeyer> Aram: and then die, you're right
<niemeyer> Aram: The deletion is the bit that is done outside
<niemeyer> (by the machine agent)
<Aram> niemeyer: cool, we're on the same page. it hasn't been done yet because we were lacking watchers, I haven't forgotten about it.
<niemeyer> Aram: Still, there's a problem in that RemoveService implementation.. doesn't look like that's what we want
<Aram> well no, now it isn't.
<Aram> but the plan is to change it after we have watchers.
<niemeyer> Aram: Okay, I'm wondering because we're changing it now to claim lifecycle integration,
<niemeyer> Aram: and it feels quite bogus from a lifecycle perspective
<Aram> nah, the claim is wrong. it's a step, but not the final step.
<Aram> there's more work to be done.
<niemeyer> Aram: Cheers
<TheMue> niemeyer: how should "kill themselves" happen?
<niemeyer> TheMue: The uniter will monitor the service
<niemeyer> TheMue: If the service gets to Dying, the unit kills itself
<niemeyer> TheMue: There's also a refcounter in the service to tell how many units are alive or dying (and not Dead yet)
<TheMue> niemeyer: so they don't kill themselves, the uniter kills them, ok
<niemeyer> TheMue: The service stays dying meanwhile
<niemeyer> TheMue: Heh.. the uniter is the implementation of the unit
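The lifecycle niemeyer outlines (service goes to Dying, each unit notices and kills itself, a reference count tracks units that are not Dead yet) can be sketched in Go as below. All types and method names here are hypothetical and do not mirror the real state or mstate API. The step where the service becomes Dead once the count reaches zero is an assumption, and removal from state (done outside, by the machine agent) is not modelled.

```go
package main

import "fmt"

// Life mirrors the Alive -> Dying -> Dead progression from the discussion.
type Life int

const (
	Alive Life = iota
	Dying
	Dead
)

func (l Life) String() string {
	return [...]string{"Alive", "Dying", "Dead"}[l]
}

// Service is a hypothetical stand-in for the state entity.
// Units counts the units that are Alive or Dying (not Dead yet).
type Service struct {
	Life  Life
	Units int
}

// Unit is a hypothetical stand-in for what the uniter drives.
type Unit struct {
	Life    Life
	Service *Service
}

// AddUnit registers a new live unit and bumps the reference count.
func (s *Service) AddUnit() *Unit {
	s.Units++
	return &Unit{Life: Alive, Service: s}
}

// Destroy only marks the service as Dying; it never touches units directly.
func (s *Service) Destroy() {
	if s.Life == Alive {
		s.Life = Dying
	}
}

// Observe is what the unit's watcher would call on a service change:
// a live unit starts dying once its service is no longer Alive.
func (u *Unit) Observe() {
	if u.Service.Life != Alive && u.Life == Alive {
		u.Life = Dying
	}
}

// Finish marks a dying unit Dead and releases its reference.
// Assumption: the service becomes Dead once no unit holds a reference.
func (u *Unit) Finish() {
	if u.Life != Dying {
		return
	}
	u.Life = Dead
	u.Service.Units--
	if u.Service.Life == Dying && u.Service.Units == 0 {
		u.Service.Life = Dead
	}
}

func main() {
	s := &Service{}
	u1, u2 := s.AddUnit(), s.AddUnit()
	s.Destroy()
	u1.Observe()
	u2.Observe()
	u1.Finish()
	fmt.Println("service:", s.Life, "units left:", s.Units)
	u2.Finish()
	fmt.Println("service:", s.Life, "units left:", s.Units)
}
```

The point of the sketch is that Destroy never removes units itself, which is the behaviour niemeyer finds missing in the current RemoveService.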
<niemeyer> fwereade: Btw, are you in sync with this ^^^
<niemeyer> fwereade: last 5 sentences
<TheMue> niemeyer: had been at the unit state, not at the unit implementation
<fwereade> niemeyer, had half an eye on it; looks pretty sensible to me
<fwereade> niemeyer, same model as relations, really
<niemeyer> fwereade: Yeah
<fwereade> niemeyer, one more thing to watch, maybe in a couple of places, which might be a little tedious
<fwereade> niemeyer, but well actually, no, I might only need to watch it when I'm started
<niemeyer> fwereade: Well, not really.. the service can die at any point
<niemeyer> fwereade: It's an entry for the "steady" select loop, I suppose
<fwereade> niemeyer, yeah, still thinking it through
<niemeyer> TheMue: Okay, you got a +1 on all the lifecycle stuff
<niemeyer> rog: So, what's up?
<TheMue> niemeyer: cheers
<rog> niemeyer: just booking flights 4 uds, 1 mo
<fwereade> niemeyer, I *think* that impending service death is not a good enough reason to interrupt anything else the unit is doing (including, say, waiting for hook error resolution)
<niemeyer> rog: np, reading the paste meanwhile
<niemeyer> fwereade: Uh, that's awkward
<niemeyer> fwereade: The guy said *kill the whole thing*
<niemeyer> fwereade: Why would we go "Oh, btw, I have a small issue here?"
<niemeyer> fwereade: Hmm
<niemeyer> fwereade: I'm trying to think of scenarios where we'd actually want to wait for the error to be resolved
<niemeyer> fwereade: Did you have something in mind?
<niemeyer> fwereade: Well, at the same time, it doesn't feel like a big deal to wait, to be honest
<niemeyer> fwereade: An argument could be enabling debugging of such issues
<niemeyer> fwereade: "juju resolved" would always enable the service to die either way, I suppose?
<niemeyer> fwereade: Sorry, clearly I'm brainstorming..
<fwereade> niemeyer, yeah, that was my thinking -- that in general we want smooth and steady shutdown of everything
<rog> niemeyer: flight booked
<rog> niemeyer: did the issue make sense to you?
<niemeyer> fwereade: Sounds sensible, sorry for the derail.. should have talked to a bear before
<fwereade> niemeyer, by my gut I'm -1 on any sudden-death mechanisms beyond remove-unit --force
<fwereade> niemeyer, haha np
<niemeyer> rog: Not yet.. digesting it still
<niemeyer> rog: Why would add-unit *guess* tools?
<rog> niemeyer: because it doesn't know what architecture the new unit is going to run on yet
<niemeyer> rog: Hmm
<niemeyer> rog: So that's not right
<rog> niemeyer: unless we say that constraints are solved by the client
<niemeyer> rog: We shouldn't set tools before that's decided
<rog> niemeyer: we don't
<niemeyer> rog: Well, if we *guess*, we do
<niemeyer> rog: If it's decided, we don't guess
<rog> niemeyer: that's only with solution 2
<rog> niemeyer: which i'm not keen on, but i thought it might be a possibility
<niemeyer> rog: Solution 1 feels a bit like a derail..
<rog> niemeyer: yeah, i'm not too keen on that either
<rog> niemeyer: which might mean that the whole proposed tools architecture is misguided
<niemeyer> rog: Heh
<niemeyer> rog: Let's keep the bby
<niemeyer> baby
<niemeyer> rog: I'm still digesting the issue, just a moment
<niemeyer> rog: You know what's interesting.. a unit may have to run on a different Ubuntu release than the machine agent that starts it
<niemeyer> rog: This is theoretically quite feasible
<rog> niemeyer: yes, that's true.
<niemeyer> rog: and probably practically too
<niemeyer> rog: For a constrained selection of series at least
<rog> niemeyer: oh... unit
<niemeyer> rog: So we need three details to be able to start a unit:
<rog> niemeyer: interesting. i thought the LXC stuff always used the same series as the main instance
<niemeyer> Sorry
<niemeyer> I actually meant
<niemeyer> rog: So we need three details to be able to assign tools to a unit:
<niemeyer> - The series
<niemeyer> - The version
<niemeyer> - The arch
<niemeyer> We can tell the series from the service
<niemeyer> The version should probably be inherited from the provisioning agent
<niemeyer> The arch must match the machine being deployed in
<niemeyer> rog: People deploy different series in *chroots*
<niemeyer> rog: LXC has better isolation than chroots even
<rog> niemeyer: uh huh. and i guess we can take advantage of that.
<niemeyer> rog: Yeah
<rog> niemeyer: the above stuff seems to imply that you think the PA should assign the tools to a unit
<rog> niemeyer: is that right
<rog> ?
<niemeyer> rog: No.. so far it's just brainstorm.. just trying to figure what comes from where, so we find the proper hook point
<rog> niemeyer: ok
<niemeyer> rog: It feels like there are two possible cases:
<niemeyer> rog: 1) Unit assigned to existing machine
<niemeyer> rog: 2) Unit assigned to undeployed machine
<rog> niemeyer: +1
<rog> off the top of my head, maybe the PA should assign units to machines, rather than doing it client-side. that would solve this issue, at any rate.
<niemeyer> rog: For (1), AssignToMachine may ensure the proper set of agent tools based on the machine agent tools, and potentially the service some day when we do support the distinction
<rog> niemeyer: i'm not sure that's true actually
<niemeyer> rog: Oh?
<rog> niemeyer: what if the machine agent gets upgraded in the meantime?
<rog> niemeyer: i suppose it comes down to what semantics we want from upgrade
<niemeyer> rog: Actually, hmm..
<rog> s/upgrade/upgrade-juju
<niemeyer> rog: What if ProposedTools() defaulted to the machine tools?
<niemeyer> rog: When the setting is missing entirely
<rog> niemeyer: interesting
<rog> niemeyer: what about machine proposed tools?
<niemeyer> rog: Meaning?
<rog> niemeyer: what does Machine.ProposedAgentTools default to?
<niemeyer> rog: That's an easy one.. we need tools to start the machine agent in the first place
<rog> niemeyer: but we don't need tools to create the Machine
<niemeyer> rog: Indeed, but we need tools to start the machine agent in the first place
<rog> niemeyer: sure. but i don't see how this gets us out of the race that i described
<niemeyer> rog: Well, there's no way to avoid it if we're allowing for anything concurrent to pick agent tools
<rog> niemeyer: solution 1 avoids the race, at some cost.
<niemeyer> rog: It doesn't..
<niemeyer> rog: Unless you sit down and wait for all upgrades to finish
<niemeyer> rog: Before continuing to upgrade
<rog> niemeyer: you have to sit down and wait for parent agent upgrades to finish, yeah
<niemeyer> rog: and even that has a race, if you assume that new parent agents may be starting
<rog> niemeyer: they'll be started by another agent, so we'll always be able to upgrade that first
<rog> niemeyer: essentially we percolate upgrades down from the root
<niemeyer> rog: That's a long derail
<rog> niemeyer: here's another possibility:
<niemeyer> rog: and complex too..
<rog> niemeyer: agents are responsible for ProposingTools on their children.
<rog> niemeyer: (not sure i like that much either)
<niemeyer> rog: Hmm
<niemeyer> rog: It sounds like we're introducing a lot of cost for the benefit of features that won't exist for quite a while..
<niemeyer> rog: I wish we had noticed that before :(
<rog> [15:18:34] <rog> niemeyer: which might mean that the whole proposed tools architecture is misguided
<rog> :-)
<rog> :-(
<niemeyer> rog: state.ProposedVersion() ? :-)
<rog> niemeyer: that has its own down sides, and i can't quite remember them right now...
<niemeyer> rog: Mainly we can't do selective upgrading, which is what I was referring to above
<rog> niemeyer: what happens if we don't have versions for every architecture we need?
<niemeyer> rog: We find the closest possible version available, and if there's none, we put an error in the state pointing out we can't deploy said resource
<niemeyer> rog: I think the version setting can actually be part of the config.Config type
<rog> niemeyer: what do you mean by "closest version"?
<rog> niemeyer: i think perhaps we should make it exact or nothing
<niemeyer> rog: $MAJOR.0.0 <= $CLOSEST_VERSION <= $MAJOR.$MINOR.$PATCH
<niemeyer> rog: That's unnecessary.. we have to handle compatibility within majors anyway.. there's no reason to prevent that from happening purposefully
<niemeyer> rog: This will be handy when there's an upgrade in one architecture but not in another
<rog> niemeyer: so nothing later than the proposed version.
<niemeyer> rog: Yeah
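The "closest version" rule agreed here (same major, nothing later than the proposed version) can be illustrated with a short Go sketch. Number, Less and closest are hypothetical names chosen for this example; the real version package may look quite different.

```go
package main

import "fmt"

// Number is a hypothetical major.minor.patch version.
type Number struct{ Major, Minor, Patch int }

func (n Number) String() string {
	return fmt.Sprintf("%d.%d.%d", n.Major, n.Minor, n.Patch)
}

// Less reports whether n sorts before o.
func (n Number) Less(o Number) bool {
	if n.Major != o.Major {
		return n.Major < o.Major
	}
	if n.Minor != o.Minor {
		return n.Minor < o.Minor
	}
	return n.Patch < o.Patch
}

// closest returns the highest available version v such that
// MAJOR.0.0 <= v <= proposed, where MAJOR is the proposed major.
// It reports false when no such version exists.
func closest(available []Number, proposed Number) (Number, bool) {
	var best Number
	found := false
	for _, v := range available {
		if v.Major != proposed.Major || proposed.Less(v) {
			continue // different major, or later than the proposed version
		}
		if !found || best.Less(v) {
			best, found = v, true
		}
	}
	return best, found
}

func main() {
	available := []Number{{1, 0, 0}, {1, 2, 3}, {1, 3, 0}, {2, 0, 0}}
	if v, ok := closest(available, Number{1, 2, 5}); ok {
		fmt.Println("run", v)
	}
	if _, ok := closest(available, Number{3, 0, 0}); !ok {
		fmt.Println("no compatible tools; an error would be recorded in the state")
	}
}
```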
<rog> niemeyer: state.ProposedVersion() (version.Number, error) right?
<niemeyer> rog: I was thinking that it'd be easier to have that in config.Config
<niemeyer> rog: and thus state.EnvironConfig()
<niemeyer> rog: So we can use existing infrastructure to deal with it
 * rog thinks
<niemeyer> rog: E.g. we already have env watches, already have means for reading and writing this setting, etc
<niemeyer> rog: There's one handy pre-req which is making state deal with config.Config rather than ConfigNode on EnvironConfig and the watch
<niemeyer> rog: Which is something I've been trying to do since Lisbon
<rog> niemeyer: this means that the providers would have to know about proposed tools, right?
<niemeyer> rog: Why?
<rog> niemeyer: because they're created with config.Config attributes, no?
<niemeyer> rog: Not that I see a problem upfront, but just wondering what you have in mind
<niemeyer> rog: No provider should break if config.Config has an attribute that it doesn't know about
<rog> niemeyer: ah, i didn't know that
<niemeyer> rog: The generic config.Config will handle it
<rog> niemeyer: another question occurs to me
<rog> niemeyer: when we do "juju upgrade-juju", how do we choose what version to propose in the state?
<rog> niemeyer: do we look through all the agents and see what architectures they're running, then choose the best version that is provided for all of them? or do we just choose the best version for any architecture?
<niemeyer> rog: We can use the functionality you've already put in place to find max(version with current major)
<rog> niemeyer: for which architecture?
<niemeyer> rog: I'd say any
<niemeyer> rog: If we use the logic previously mentioned, that'd be fine
<niemeyer> rog: Agents may simply not be able to catch up immediately
<rog> niemeyer: the functionality currently in place looks for tools for a given arch and series
<niemeyer> rog: But with the retry logic that should exist anyway (download may fail, etc), we'd catch that
<rog> niemeyer: but i could remove that restriction
<niemeyer> rog: Ah, I see.. this is still useful
<niemeyer> rog: We'll want to use that when within the agent figuring what to run
<rog> niemeyer: indeed
<rog> niemeyer: i could provide BestBinaryTools or something
<niemeyer> rog: We just need to be able to disable the flag
<niemeyer> rog: Yeah
<rog> niemeyer: i wonder, if an agent can't find the exact version, perhaps it should keep on polling the available tools until the version is available
<niemeyer> rog: I think it's fine to run something else that is closer to the proposed version
<rog> niemeyer: but what happens if we later upload the required version? how can we ask the agents to upgrade?
<niemeyer> rog: Reality will have that kind of scenario due to arch and series discrepancies
<niemeyer> rog: We shouldn't have to
<niemeyer> rog: The agent itself should note that it's still out of date in comparison to the proposed version
<rog> niemeyer: it'll know that, but what should it do about it?
<niemeyer> rog: It should check to see if the available tools are now available
<niemeyer> Erm
<niemeyer> rog: It should check to see if the proposed tools are now available
<rog> [16:11:52] <rog> niemeyer: i wonder, if an agent can't find the exact version, perhaps it should keep on polling the available tools until the version is availab.e
<rog> niemeyer: that's what i was suggesting.
<niemeyer> <niemeyer> rog: I think it's fine to run something else that is closer to the proposed version
<niemeyer> rog: That's my counterproposal :-)
<niemeyer> rog: The "exact" word in there is the disagreement
<niemeyer> rog: It should continue polling, but if it finds something closer, it should upgrade too
<niemeyer> rog: and continue polling
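The behaviour settled on here (keep polling, upgrade whenever something closer to the proposed version shows up, stop only at the proposed version) can be sketched as a single polling round in Go. Version, less and poll are hypothetical names for this illustration, and the sketch assumes the current version shares the proposed major.

```go
package main

import "fmt"

// Version is a hypothetical {major, minor, patch} triple.
type Version [3]int

// less reports whether a sorts before b.
func less(a, b Version) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// poll performs one polling round. It returns the version the agent should
// run after seeing the available tools, and whether it should keep polling.
func poll(current, proposed Version, available []Version) (Version, bool) {
	best := current
	for _, v := range available {
		if v[0] != proposed[0] || less(proposed, v) {
			continue // wrong major, or later than the proposed version
		}
		if less(best, v) {
			best = v // closer to the proposed version than what we run now
		}
	}
	return best, best != proposed
}

func main() {
	current, proposed := Version{1, 0, 0}, Version{1, 2, 0}
	// Each slice is what one polling round finds for this arch and series.
	rounds := [][]Version{
		{{1, 0, 0}, {1, 1, 0}},
		{{1, 0, 0}, {1, 1, 0}, {1, 2, 0}},
	}
	for _, available := range rounds {
		var again bool
		current, again = poll(current, proposed, available)
		fmt.Println("running", current, "keep polling:", again)
	}
}
```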
<rog> niemeyer: definitely.
<rog> niemeyer: ah, sorry, i thought you were disagreeing with the idea of polling
<niemeyer> rog: No, that's nice
 * rog hopes that he doesn't have to throw away *too* much code :-)
<niemeyer> rog: I was thinking about that as I went through, I *think* it's mostly ok
<rog> niemeyer: that's my inclination too
<niemeyer> rog: The whole upgrading logic is gold, just needs to watch something else
<rog> niemeyer: yeah
<rog> niemeyer: i think it's mainly the watchers in state
<niemeyer> rog: Interestingly, there's a bunch of code going away, which is nice
<niemeyer> TheMue: ping
<rog> niemeyer: at the expense of flexibility of course, but maybe we'd never really want to deliberately deploy different versions.
<TheMue> niemeyer: pong
<niemeyer> rog: I'm feeling better about this aspect, to be honest.. I prefer we make things more complex to implement the fancy scenarios when we do need it, than to have it complex by default
<niemeyer> TheMue: Heya
<niemeyer> TheMue: I think we might use your help on this one
<rog> niemeyer: yeah, i think i agree.
<niemeyer> TheMue: Not sure about rog's plan, so we have to brainstorm for a sec
<TheMue> niemeyer: ok, will read the last lines.
<niemeyer> TheMue: There's some work that is on my plate for a while, and I never got to it
<niemeyer> TheMue: No need, I'll explain
<rog> niemeyer: i'd rip out all the current ProposedAgentTools stuff from state
<TheMue> niemeyer: ok, listening mode = on
<rog> TheMue: ^
<niemeyer> TheMue: We need EnvironConfig() to return config.Config
<niemeyer> TheMue: and also the respective watcher
<niemeyer> TheMue: I think all the stars are now aligned for this to be relatively easy, but this is a blocker for rog
<niemeyer> TheMue: Would you mind to put that at the front of the queue, on CL for state, then another one for mstate?
<niemeyer> s/on CL/one CL/
<niemeyer> rog: Not sure if you agree, or if you'd like to do that yourself?
<rog> niemeyer, TheMue: that would be great
<niemeyer> TheMue: We'll need a counterpart for State.EnvironConfig: State.SetEnvironConfig
<niemeyer> TheMue: Since config.Config is read-only
<niemeyer> TheMue: But it all sounds quite straightforward
<TheMue> niemeyer: yes, first look seems so
<rog> TheMue: if you do state.SetEnvironConfig, i'll do the ProposedAgentTools stuff.
<niemeyer> TheMue: Can you help us on that, with some priority?
<TheMue> niemeyer: sure
<niemeyer> TheMue: Thanks a lot
<TheMue> niemeyer: yw
<niemeyer> TheMue: There are quite a few things that are touched by that (provisioner, etc), but I suspect it will be rather pleasing. This is the last piece of the puzzle of config.Config, so I expect it to fall into place correctly in all cases.
 * rog goes for a bite of lunch
<TheMue> niemeyer: so EnvironConfig() (*ConfigNode, *ConfigWatcher, error)?
<niemeyer> TheMue: Uh?
<TheMue> niemeyer: or two calls?
<niemeyer> TheMue: We already have an environ watcher
<niemeyer> TheMue: The idea is just to make EnvironConfig() and the respective watcher operate with config.Config, rather than ConfigNode
<niemeyer> TheMue: We have proper helpers for everything
<TheMue> niemeyer: aargh, read it wrong, sorry
<niemeyer> TheMue: np
<TheMue> niemeyer: already wondered
<niemeyer> TheMue: State.EnvironConfig and the watch will both use config.New, and SetEnvironConfig will use Config.AllAttrs
<TheMue> niemeyer: ok
<rog> back
<niemeyer> I'll head to lunch
<niemeyer> biab
<TheMue> rog: to get it right, state doesn't use a persisted config anymore? it is set in memory with SetEnvironConfig()?
<rog> TheMue: i'm not sure what you mean
<TheMue> rog: today the config that is returned by EnvironConfig() is fetched from ZK
<TheMue> rog: from the environment path
<TheMue> rog: ah, explaining helps
<TheMue> rog: it's just a different return type, source of the data stays the same
<rog> TheMue: +1
<TheMue> rog: had been confused for the moment due to the late jump into the discussion
<rog> TheMue: np
<fwereade> TheMue, rog, Aram: when one upgrades a charm, and gets a conflict, and marks it resolved, I don't see any way for us to verify whether or not the user has actually done anything sensible with the conflicted data; does this sound like a problem to you, or a "just don't be an idiot" situation?
<rog> fwereade: the latter
<SpamapS> conflict?
<SpamapS> how might a charm upgrade cause a "conflict" ?
<rog> fwereade: ^ you might wanna explain :-)
<fwereade> SpamapS, ah, sorry
<fwereade> SpamapS, short version: we're versioning charms
<fwereade> SpamapS, so we maintain a git repo of charms-used-by-this-unit, which just gets its contents overwritten neatly when we upgrade; and *then* we pull from that repo into the actual charm dir, which is itself a git repo
<fwereade> SpamapS, which then means that weird directory structure changes and the like will at least be *caught* when we try to upgrade
<SpamapS> nice to see we're abandoning bzr whole-heartedly :)
<fwereade> SpamapS, haha, I wanted to use bzr, but it has an ugly crash in the precise situation that prompted this idea
<SpamapS> fwereade: I feel like you can be way more heavy handed than this with a charm upgrade. If I say "give me version X" .. I don't mean "merge it into what I have now" .. I mean *X*
<SpamapS> also I feel like we need to (soon) make the charms readonly and enforce a data storage area for charms that want to write data.. but I keep forgetting to file a bug on that :p
<fwereade> SpamapS, ha, I would like that solution most of all, but my personal reading was that the writing-to-charm-dir genie was out of the bottle; is that completely wrong?
<SpamapS> its out of the version 1 bottle
<SpamapS> or rather, format: 1 bottle
<SpamapS> format: 2 is still unsettled...
 * fwereade makes very loud HMMMMM sounds
<SpamapS> (and has actually never been discussed on the public mailing list.. which is a HUGE problem)
<SpamapS> fwereade: why git tho. Why not just rsync?
<fwereade> SpamapS, (to briefly return to what you said before, the trouble is that without separate data storage we actually *are* always saying "merge with what I have")
<fwereade> SpamapS, er, because I didn't consider it, and when I thought "detect conflicts" I thought "VCS"
<SpamapS> see I don't think detecting conflicts is at all important
<SpamapS> applying delta tho.. that is..
<SpamapS> rsync would fail for this..
<SpamapS> and you're right, we are stuck w/ merging until we ditch the writable charm dir
<fwereade> SpamapS, ok; I was concerned that an un-thought-through upgrade from version X to version X+3, for example, could easily end up (say) blindly replacing a data dir with a file, and this seemed like unfriendly behaviour
<SpamapS> fwereade: that must be thought through in upgrade-charm, not juju
<fwereade> SpamapS, but by the time upgrade-charm runs, the damage is done
<SpamapS> fwereade: all I'd like to see is deltas applied sanely. VCS does make sense for that.
<SpamapS> fwereade: its not "damage" if the author does something stupid
<SpamapS> fwereade: authors *must* be cautious of exactly that situation.
<fwereade> SpamapS, (the other advantage, which feels like a nice help when writing/debugging charms, is that we can actually maintain a complete per-unit history of the charm dir)
<fwereade> SpamapS, in my mind this is indeed more targeted at authors than at users
<fwereade> SpamapS, if a user hits this situation the author has screwed up
<SpamapS> authors will have their own VCS
<SpamapS> fwereade: feels very much like "trying to do too much"
<SpamapS> What I really care about is that you try to apply all the delta from the shared base.. a straight "merge" problem.
<SpamapS> If I have created a 'data' dir in charm, and the new charm has a static data dir.. then yes, thats a conflict.
<fwereade> SpamapS, heh, I am not immune to this disease; I think niemeyer will be back from lunch soon, and I would like to involve him in this discussion
<fwereade> SpamapS, another benefit of merging is that (eg) deleted hooks actually get deleted
<SpamapS> fwereade: I think its fine to do as you've suggested. Merge.. report conflict when they happen. I get it now.
<fwereade> SpamapS, cool
<SpamapS> and I'm even wondering if thats actually more straight forward than trying to disallow writing
<fwereade> SpamapS, I think it is a pretty neat solution (which I totally can't take credit for)
<fwereade> SpamapS, although I was pleased with myself for then realising that we can commit charm dir state after every hook, which could be quite the debugging aid -- do you foresee any issues there?
<SpamapS> fwereade: no, I don't think it should be a problem...
<SpamapS> fwereade: some charms download lots of code into the charm dir on config-changed .. some even have git repos embedded.. so you have to be mindful of that.
<fwereade> SpamapS, ha, good wrinkle, hadn't thought of that
<fwereade> gents, I need to be off; I'll try to get back on later
<fwereade> take care all
<Aram> niemeyer: your email mentioned a ChangeLog function, but: http://paste.ubuntu.com/1172117/
<Aram> where is it? :).
<Aram> or did you mean something else?
<niemeyer> Aram: Yo
<Aram> hey.
<niemeyer> Aram: No, that was it really
<niemeyer> Aram: Are you not finding it after pull?
<Aram> as you can see, no.
<Aram> is my tip revision there correct?
<niemeyer> Aram: Let me check
<niemeyer> Aram: There's something awkward going on
<niemeyer> Aram: That revision is the one that introduced the ChangeLog function
<niemeyer> Aram: Try to run this: bzr diff -r 168..169
<TheMue> Aram: my godoc shows Runner.ChangeLog()
<niemeyer> Aram: Try to "bzr revert" perhaps
<niemeyer> Aram: To get the tree back in shape
<Aram> niemeyer: it's in the diff.
<Aram> I'll do a revert
<Aram> maybe this is the problem
<Aram> white:txn$ bzr st
<Aram> working tree is out of date, run 'bzr update'
<Aram> how did that happen though?
<Aram> I didn't make any changes to it.
<Aram> bzr revert didn't do anything, bzr update solved it though
<niemeyer> Aram: I can't tell how you got there, but there are a number of ways this can happen
<Aram> perhaps a go get -u did this?
<niemeyer> Aram: Quite possible
<niemeyer> Aram: Since it'll have to update backwards
<niemeyer> Aram: (the tag is not in the latest revision)
<niemeyer> fwereade: ping
<fwereade> niemeyer, pong
<niemeyer> fwereade: Do you have a moment for a call?
<niemeyer> fwereade: I know it's late for you, so "no" is a fine answer
<fwereade> niemeyer, just a sec...
<fwereade> niemeyer, yeah, can we keep it down to 15 mins or so though please?
<fwereade> niemeyer, shall I invite? you and..?
<niemeyer> fwereade: Let's go then
<niemeyer> fwereade: You and me, for the moment
<fwereade> niemeyer, sent
<rog> niemeyer: ping
<niemeyer> rog: Yo
<rog> niemeyer: i sent a response to your review
<rog> niemeyer: only point of contention is 0666 vs 0600
<rog> niemeyer: i vote for former as it's standard
<rog> niemeyer: and what threat are we protecting against?
<niemeyer> rog: 0600 is the usual for files that contain credentials.. if it doesn't work with that file mode, it's broken
<rog> niemeyer: ah, i see.
<niemeyer> rog: re. 40ms, awesome!
<rog> niemeyer: though i can't see how it could make any difference
<rog> niemeyer: 4ms is probably when there's nothing to remove. and it is using my SSD device
<rog> 40
<niemeyer> rog: Sure, if it makes no difference, then 0600 is fine
<rog> niemeyer: sure, ok.
<rog> niemeyer: it'd probably be even faster if i wasn't running it -gocheck.vv
<niemeyer> rog: Ah, most certainly
<rog> niemeyer: a very useful program BTW: http://paste.ubuntu.com/1172299/
<rog> niemeyer: i call it "timestamp"
<rog> niemeyer: so i did (in state) go test -gocheck.vv 2>&1 | timestamp
<rog> niemeyer: to find the timings
<niemeyer> rog: Curious
<niemeyer> rog: Clever, actually
<rog> niemeyer: sample output: http://paste.ubuntu.com/1172303/
<niemeyer> rog: This is awesome
<rog> niemeyer: it's incredibly useful sometimes
<niemeyer> rog: Well, curiously gocheck is also showing the timestamps in that one case
<rog> niemeyer: unfortunately gocheck's timestamps wrap
<rog> niemeyer: i've had a proposal in for ages to fix that
<niemeyer> rog: Where's it?
 * rog looks
<rog> niemeyer: https://codereview.appspot.com/5874049/
<niemeyer> Looking
<niemeyer> rog: I see.. I'd be glad to fix the wrapping, but requires some more thinking indeed
<niemeyer> rog: It should at least be a consistent unit and length
<niemeyer> rog: The length can of course vary on extremely long cases, but the sample output there is a bit awkward
<rog> niemeyer: yeah.
<rog> niemeyer: i've found that the output from "timestamp" works quite well.
<rog> niemeyer: but that's probably because i'm used to it!
<niemeyer> rog: What is it? min/sec/ms?
<rog> niemeyer: yeah
<niemeyer> rog: It looks reasonable to me as well actually..
<rog> niemeyer: and i guess it doesn't matter so much if it wraps after an hour
<niemeyer> rog: I'm actually more concerned on the lower side, but 1ms might be enough resolution
<rog> niemeyer: it could be 04:05.000 i suppose
<niemeyer> (to debug races)
<niemeyer> rog: =1
<niemeyer> +1
<rog> niemeyer: yeah. i think that less than 1ms and stuff like the latency of locks around the logging starts to have an effect.
<niemeyer> rog: mutexes should run well under that
<niemeyer> rog: As in, several orders of magnitude below it
<rog> niemeyer: true.
<rog> niemeyer: if something is within a millisecond, then it deserves a closer look. but if i'm trying to debug a race, it's generally sensitive to the scheduler and i'll use println not Printf.
 * rog is not sure that those two sentences are in any way related
<niemeyer> rog: :)
<niemeyer> rog: I've used gocheck's output to debug races quite successfully in the past
<rog> niemeyer: and sub-millisecond timing was important to that?
<rog> niemeyer: the other weird thing about the current log time stamps is that they don't start from zero...
<niemeyer> rog: It can help.. sometimes the timing tells how much apart the two events were
<niemeyer> rog: I can't recall how much the under ms helped, though
<rog> niemeyer: yeah. that was where my "if something is within a millisecond, then it deserves a closer look" statement came from.
<niemeyer> rog: Since it was just there, I was considering it without realizing
<niemeyer> rog: Well, that's too late
<niemeyer> rog: If you're debugging a race, "deserves a closer look" is exactly what the log is for
<rog> niemeyer: we could print microseconds too. i don't mind too much. i just want it not to wrap.
<niemeyer> rog: I think ms is fine to begin with, to be honest
<niemeyer> rog: If we ever miss resolution we can increase it
<rog> niemeyer: sounds good. milliseconds is nice and human-friendly :-)
<niemeyer> rog: I'd also do M:SS
<niemeyer> rog: Rather than MM
<niemeyer> rog: Or even SSS, I guess
<rog> niemeyer: i vote for M:SS
<niemeyer> rog: Works for me
<rog> niemeyer: i think that makes the units marginally more obvious
<niemeyer> rog: Yeah
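For illustration, a filter in the spirit of the "timestamp" tool, printing elapsed time as M:SS.mmm from zero, might look like the Go sketch below. This is not rog's program (the paste is not reproduced here) and not gocheck's code; formatMillis and stamp are hypothetical names, and the demo reads from a fixed string rather than standard input.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
	"time"
)

// formatMillis renders an elapsed time in milliseconds as M:SS.mmm.
// Minutes are not padded and do not wrap after an hour.
func formatMillis(ms int64) string {
	return fmt.Sprintf("%d:%02d.%03d", ms/60000, ms/1000%60, ms%1000)
}

// stamp copies r to w line by line, prefixing each line with the time
// elapsed since stamp was called, so timestamps start from zero.
func stamp(r io.Reader, w io.Writer) error {
	start := time.Now()
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		ms := time.Since(start).Milliseconds()
		if _, err := fmt.Fprintf(w, "%s %s\n", formatMillis(ms), sc.Text()); err != nil {
			return err
		}
	}
	return sc.Err()
}

func main() {
	// Demo input; a real filter would pass os.Stdin here instead.
	in := strings.NewReader("first line\nsecond line\n")
	if err := stamp(in, os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

With os.Stdin as the reader, such a filter would sit at the end of a pipeline like the one rog shows: go test -gocheck.vv 2>&1 | timestamp.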
<rog> niemeyer: i'll repropose the CL
<niemeyer> rog: Thanks a lot
<rog>  done
<rog> niemeyer: CL reproposed
<rog> i'm off now. see y'all tomorrow.
<mramm> Wow, 3 different Uverse technicians have been out to my house in the last 36 hours to try to fix my internets!
<mramm> and finally I think we have got it resolved.
<mramm> apparently "squirrels ate the wires"
<niemeyer> Haha
<niemeyer> mramm: That's great
<niemeyer> Aram: Btw, I was wondering that maybe we could have an unbounded log
<niemeyer> Aram: Rather than a capped collection
<niemeyer> Aram: The difference is pretty minimal either way
<niemeyer> Aram: So we can tweak this
#juju-dev 2012-08-29
<TheMue> morning
<fwereade_> TheMue, heyhey
<TheMue> fwereade_: heya
<davecheney> howdy
<rog> fwereade_, TheMue: hiya
<fwereade_> rog, heyhey
<TheMue> rog: hi
<TheMue> rog: impl of environ config is - almost - done, i'm currently changing the tests. here Config behaves differently than ConfigNode
<rog> TheMue: great! thanks for doing it.
<TheMue> rog: one question here
<TheMue> rog: is SetEnvironConfig intended to update or to replace the config?
<rog> TheMue: yes, i believe so
<rog> TheMue: for State anyway
<rog> TheMue: it's not intended to replace ConfigNode in general, i think
<TheMue> rog: no, only this one method. so the passed config replaces an existing one, ok.
<rog> TheMue: oops, sorry i answered the wrong question
<rog> TheMue: i *think* it should replace the config
<rog> TheMue: as it's usually set all at once. but i'm not so sure
<rog> TheMue: interesting question.
<TheMue> rog: i think so too, otherwise by updating you only can add more values but never remove them
<rog> TheMue: the issue i'm thinking is that if one client updates the environment settings and the other does an upgrade (setting the proposed version), then one or other of the changes may be lost.
<TheMue> rog: cfg(a, b, c).update(cfg(b, c, d)) => cfg(a, b', c', d)
<TheMue> rog: oh, yes, would be an issue too
<TheMue> rog: but the same would be in case of an update by two clients
<rog> TheMue: i'm not sure
<rog> TheMue: what i'm worried about is: c := cfg(a=1); {c.set(a=2)} in parallel with {c.set(b=1)}
<TheMue> rog: client a thinks b has to be b' and updates, a millisecond later client b thinks b (the old value he remembers) has to be b'' and updates too. the first update is lost.
<rog> TheMue: that's true, but in this case we're dealing with disjoint sets of attributes.
<TheMue_> shit, network again
<TheMue_> rog: ok, as long as keys are different an update mixing the value sets is ok. but updating the same key is still a problem.
<rog> TheMue: the environ config that we set when changing provider attributes doesn't hold the proposed version attribute.
<Aram> hello.
<rog> TheMue_: of course, but that's inevitable
<rog> Aram: hiya
<TheMue_> Aram: moin ;)
<rog> TheMue_: it's reasonable that someone might want to do an upgrade at the same time as updating the provider attributes
<rog> TheMue_: i'm becoming less convinced that it's appropriate to lump the proposed version in with the provider attributes
<TheMue_> rog: a kind of txn.exec(func(cfg)) would be nice, so the whole change based on the current config executes in a txn
<rog> TheMue_: actually, we do have that kind of thing. i think we might be ok. let me check.
<TheMue> rog: i have to leave for some time in a few minutes, so you've time to rethink it
<rog> TheMue: ok
<rog> TheMue: i think it depends on how the API to changing environment settings works
<rog> fwereade: do you know what mechanism the user will use to change provider settings in the state? (i'm presuming that the client is the only thing that will change them - is that right?)
<TheMue> rog: let's talk about it later
 * TheMue is now afk, bbl
<rog> TheMue: k
<fwereade> rog, um, I think there's an env-set command planned
<fwereade> rog, constraints will interact somehow; python already has set-constraints which allows for env-level settings
<rog> fwereade: that's good. i don't think the user would expect attributes they don't provide on the command line to be deleted.
<fwereade> rog, damn straight :)
<rog> fwereade: but the SetEnvironConfig method kind of implies that
<fwereade> rog, what's the justification for that method?
<fwereade> rog, surely all we ever want is some form of update?
<rog> fwereade: i *think* we already have something similar. but maybe not.
<rog> fwereade: i agree.
<fwereade> rog, what we originally had was just a ConfigNode, but I haven't been following the updates very closely
<rog> fwereade: yeah.
<rog> fwereade: we're going to replace that with a config.Config
<fwereade> rog, ok, indeed, that's a good thing to get
<rog> fwereade: but there's no way of setting attributes on a config.Config, so the interface will have to be SetEnvironConfig(*config.Config)
<fwereade> rog, my heart still says we want an UpdateEnvironConfig method
<rog> fwereade: and it seems perhaps a little surprising that if you do that, then no attributes will ever be deleted.
<fwereade> rog, what attributes might we want to delete?
<rog> fwereade: signature?
<rog> fwereade: i don't think we do
<fwereade> rog, hold on, a bell is ringing
<fwereade> rog, maybe SetEnvironConfig *is* ok, if we can trust ourselves to build a provider-valid config before we call it
<rog> fwereade: it's not ok if it does a replace not an update
<fwereade> rog, ha, concurrent updates
<rog> fwereade: yup
<fwereade> rog, hm, it could still use a confignode internally and work right
<rog> fwereade: not really. because we don't know what attributes to delete.
<fwereade> rog, what's the use case for deleting attributes?
<rog> fwereade: i think we should have UpdateEnvironConfig(*config.Config) and document that attributes are never deleted.
<rog> fwereade: 'cos yeah i don't see a use case for deleting attributes.
<fwereade> rog, sounds sane to me
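The update semantics agreed on above (TheMue's cfg(a, b, c).update(cfg(b, c, d)) => cfg(a, b', c', d), with attributes never deleted) can be sketched roughly as follows. This is a minimal illustration only: the attrs type and the update function are hypothetical stand-ins, not the real state or config.Config API.

```go
package main

import "fmt"

// attrs is a hypothetical stand-in for the environment attributes;
// the real state stores these differently.
type attrs map[string]interface{}

// update merges changes into current: keys present in changes are
// added or overwritten, and no existing key is ever deleted.
func update(current, changes attrs) attrs {
	merged := make(attrs, len(current)+len(changes))
	for k, v := range current {
		merged[k] = v
	}
	for k, v := range changes {
		merged[k] = v
	}
	return merged
}

func main() {
	cfg := attrs{"a": 1, "b": 2, "c": 3}
	// b and c are overwritten, d is added, a survives untouched.
	cfg = update(cfg, attrs{"b": 20, "c": 30, "d": 40})
	fmt.Println(cfg)
}
```

With a merge like this, two clients that touch disjoint keys (say, provider attributes and the proposed version) do not clobber each other, provided each merge is applied atomically against the current stored value. Two clients writing the same key still race, as noted in the discussion.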
<rog> fwereade: are you planning to add more methods to UniterSuite?
<rog> fwereade: more Test methods, that is
<fwereade> rog, no, I think I'll just be adding table entries
<rog> fwereade: currently we've got TestUniter only, but SetUpSuite writes stuff to VarDir. unfortunately JujuConnSuite sets up environs.VarDir on a per-test basis, not a per-suite basis
<rog> fwereade: actually, perhaps i'll just do the tools building in SetUpTest instead of SetUpSuite.
<rog> fwereade: it makes me wish for more flexible fixtures...
<fwereade> rog, yeah, I know the feeling
<fwereade> rog, I have a vague recollection of being aware of that and deciding it didn't matter -- I guess it does for what you're doing?
<rog> fwereade: i'm wondering if fixtures should all conform to interface{SetUp(c *gocheck.C); TearDown(c *gocheck.C); Reset(c *gocheck.C)}
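A rough sketch of what that fixture interface might look like in use. The C type here is a local stand-in for gocheck.C so that the example has no external dependencies, and recorder and runTable are hypothetical names invented for illustration.

```go
package main

import "fmt"

// C is a local stand-in for gocheck.C (an assumption, to keep the
// sketch self-contained).
type C struct{}

// Fixture is the interface shape suggested in the discussion.
type Fixture interface {
	SetUp(c *C)
	TearDown(c *C)
	Reset(c *C)
}

// recorder is a hypothetical fixture that only records which of its
// methods were called, in order.
type recorder struct{ calls []string }

func (r *recorder) SetUp(c *C)    { r.calls = append(r.calls, "SetUp") }
func (r *recorder) TearDown(c *C) { r.calls = append(r.calls, "TearDown") }
func (r *recorder) Reset(c *C)    { r.calls = append(r.calls, "Reset") }

// runTable sets the fixture up once, runs every table entry with a
// Reset after each one, and tears the fixture down at the end.
func runTable(c *C, f Fixture, tests []func(c *C)) {
	f.SetUp(c)
	defer f.TearDown(c)
	for _, t := range tests {
		t(c)
		f.Reset(c)
	}
}

func main() {
	rec := &recorder{}
	noop := func(c *C) {}
	runTable(&C{}, rec, []func(c *C){noop, noop})
	fmt.Println(rec.calls)
}
```

The point of the three-method shape is that expensive setup can happen once in SetUp while cheap per-test cleanup happens in Reset.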
<TheMue> re
<rog> fwereade: JujuConnSuite did not set environs.VarDir before
<TheMue> rog: i've just read that UpdateEnvironConfig() makes more sense? would be fine for me.
<fwereade> rog, ah, ok
<rog> TheMue: cool
<fwereade> rog, I do rather like that potential fixture style, yes
<rog> fwereade: i might run the idea past niemeyer some time
<rog> fwereade: actually, it seems that the above interface would not be sufficient for the use case in UniterSuite. the JujuConnSuite is reset for every test in the table, but that makes a new VarDir; whereas we want to keep the same jujuc executable around for all the tests.
<rog> fwereade: so actually i think that suite really does want to set VarDir itself.
<rog> fwereade: (took me a while to realise what was going on!)
<rog> note for the future: this is not a good idea: 	c.Logf("testlog: %q", c.GetTestLog())
<rog> i wondered why i was seeing sequences of 255 backslashes
<rog> 256 presumably
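One plausible explanation for the backslash runs, assuming each Logf call re-quoted a log that already contained the previously quoted output: every pass of %q doubles existing backslashes and adds one more in front of each quote, so a run of k backslashes before a quote becomes 2k+1. The sketch below uses strconv.Quote, which applies the same escaping as %q on a string. The helper names are made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// requote applies strconv.Quote to s n times, mimicking a log that
// keeps quoting its own earlier quoted output.
func requote(s string, n int) string {
	for i := 0; i < n; i++ {
		s = strconv.Quote(s)
	}
	return s
}

// longestBackslashRun returns the length of the longest run of
// consecutive backslashes in s.
func longestBackslashRun(s string) int {
	longest, run := 0, 0
	for _, r := range s {
		if r == '\\' {
			run++
			if run > longest {
				longest = run
			}
		} else {
			run = 0
		}
	}
	return longest
}

func main() {
	// Start from a single double quote and quote it repeatedly.
	for n := 1; n <= 8; n++ {
		fmt.Println(n, longestBackslashRun(requote(`"`, n)))
	}
}
```

The longest run after n rounds is 2^n - 1, so eight rounds of nesting give exactly 255 backslashes, which matches the number seen above.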
<fwereade> rog, huh, did I do that?
<rog> fwereade: no, i did, trying to debug a failing UniterSuite test
<fwereade> rog, ah, good, because I'm pretty sure I felt the temptation to the other day, was wondering if I'd blacked out and checked it in or something ;p
<rog> fwereade: i'm sure the cause is something simple, but currently i can't see the wood for trees
<rog> fwereade: do you have an instant gut feeling for what kind of thing might be causing this failure? http://paste.ubuntu.com/1173708/
<Aram> lsr | grep test | grep -v internal | grep -v confignode | grep -v life | xargs grep '(Update(Id)?)|(Find(Id)?)|(Insert(Id)?)|(Upsert(Id)?)'
<Aram> oops
<Aram> wrong focus
<rog> fwereade: it's almost certainly to do with something in the environment that hasn't been set up correctly.
<TheMue> rog: so, implementation and test in state are done, now check for not yet covered dependencies. had to fight with the schema for test ;)
<rog> TheMue: cool. i'm struggling trying to merge trunk into a LGTM'd branch :-(
<TheMue> rog: *lol* yeah, sometimes hard
<rog> TheMue: the uniter tests are broken somehow, and it's too subtle for me to see yet
<TheMue> rog: strange. no useful hints?
<Aram> TheMue: thanks for the work you did on mstate, I merged all txn stuff and pushed it.
<rog> TheMue: no clue yet. am delving to see what's actually going on in the tests.
<TheMue> Aram: yeah, took already a look. it's doing great steps into the right direction.
<fwereade> rog, sorry, I was having lunch
<niemeyer> Gooood morning
<rog> fwereade: np
<rog> niemeyer: hiya
<fwereade> rog, and, hum, I'd usually expect to see rather more in the logs than just that, looks almost like there's no uniter at all
<fwereade> niemeyer, heyhey
<fwereade> man, I think there's a thunderstorm on its way, I'm feeling ridiculously flat
<rog> fwereade: it's failing in startupError, doing waitUnit for hook failed just after createUniter{}
<fwereade> rog, yeah, I saw that much... and I didn't see any of the uniter logging at all in that paste
<rog> fwereade: i changed the tests so that everything gets run with this function; that way i can see substeps:
<rog> func step(c *C, ctx *context, s stepper) {
<rog> 	c.Logf("%#v", s)
<rog> 	s.step(c, ctx)
<rog> }
<fwereade> rog, ISTM like the uniter is failing before it even hits a mode
<rog> fwereade: hmm, maybe the LoggingSuite isn't being hooked in right
<fwereade> rog, otherwise I'd expect an "examining charm state..."
<fwereade> rog, possibly something is up with the tools?
<rog> fwereade: entirely possible
<fwereade> rog, NewUniter calls ensureFs which calls EnsureTools
<fwereade> rog, except hmm *that* surely ought to be caught :/
<fwereade> rog, yeah, startUniter checks that :/
<rog> fwereade: the logger seems to be hooked in ok
<rog> fwereade: here's the full test output BTW: http://paste.ubuntu.com/1173763/
<fwereade> rog, all I can say is that something is very up, because test 3 at *least* ought to be giving you log output
<rog> fwereade: what log messages should i be seeing?
 * rog has an idea
<rog> ha!
<TheMue> rog: is there a way to push an invalid value (not matching the schema) into config.Config?
<rog> TheMue: which schema?
<TheMue> rog: i have to invalidate it for a test
<fwereade> rog, something more like http://paste.ubuntu.com/1173766/
<TheMue> rog: environ config
<rog> fwereade: the log hooking is being disabled after Reset!
<fwereade> rog, ahhhhh
<TheMue> rog: i want to set name = 1, the provider test has done it before to see how the provider reacts
<fwereade> rog, then that would definitely explain it :)
<TheMue> rog: and the old usage of the config node allowed it to be so mean ;)
<rog> TheMue: you'll have to go behind the scenes, i'm afraid
<TheMue> rog: *sigh*
<rog> TheMue: kinda the point of config.Config is that it always matches the schema
<TheMue> rog: expected this answer, but hoped for an easier one :|
<rog> fwereade: all tests pass. phew. thanks for your help.
<TheMue> rog: need a kind of config.Evil(attrs) *Config, err
<fwereade> rog, phew indeed :)
<rog> fwereade: i'd let an extra line creep in when merging.
<rog> fwereade: which happened to be LoggingSuite.TearDownTest
<niemeyer> TheMue: Yo
<fwereade> rog, ouch :)
<niemeyer> TheMue: How's the EnvironConfig stuff?
<TheMue> niemeyer: implemented in state and tested there, now i'm changing the dependencies
<TheMue> niemeyer: hi btw
<TheMue> niemeyer: and for the provider test i have to write an invalid config
<niemeyer> TheMue: Superb
<TheMue> niemeyer: i think i have to go directly to ZK here *sigh*
<Aram> niemeyer: TheMue: I want to get rid completely of mgo in mstate_test. Tests should only rely on the public API and not implementation details. for tests that HAVE to peek in the DB, I'd rather move them to an internal_test.go file or something.
<niemeyer> Aram: +1
<TheMue> Aram: +1
<niemeyer> TheMue: FWIW, it's a good thing that you're finding it difficult to write down a broken config ;)
<TheMue> niemeyer: yeah, indeed *lol*
<TheMue> niemeyer: the usage of config.Config with coercing a schema makes it hard, and that's a good feature
<rog> fwereade: could you have a glance at the uniter test changes in this CL, please, before i submit: https://codereview.appspot.com/6484051
<fwereade> rog, they look fine to me :)
<rog> fwereade: thanks
<fwereade> rog, I'm feeling rather uncomfortable about writing tests which check git output directly, but I can't think of another way to get, say, the status/log of a git repo
<fwereade> rog, am I missing something obvious? :/
<rog> fwereade: i presume that's the only way, but i'm not familiar with git
<rog> fwereade: there may be some API
<fwereade> rog, hmm, seems that such things exist, but they look like a hassle
<fwereade> ;)
<niemeyer> Aram: How's the watcher going?
<Aram> niemeyer: working on it, have a good model drawn on paper, expecting very good results.
<niemeyer> Aram: Was there any issue with the proposal?
<Aram> no.
<Aram> at least I haven't found any.
<niemeyer> Aram: Cool, was just wondering if the model on paper was different
<rog> lunch
<rog> niemeyer: can you think of any reason to retain the AgentToolsWatcher?
<niemeyer> rog: No, I think EnvironConfig watcher should cover it
<rog> niemeyer: yeah, i think so. i was wondering if we might have reason to watch the current tools of an agent, but i don't think so. just checking.
<niemeyer> rog: I don't think so as well.. it's mostly informational
<fwereade_> niemeyer, I'm wiped out and need a break; if you have a moment, can I get a really casual pre-review on https://codereview.appspot.com/6488051 please? there's something not quite right about it:
<fwereade_> niemeyer, I'm starting to think I should drop charm.Manager entirely and move its functionality into Uniter itself
<niemeyer> fwereade_: Interesting, what's the motivation that makes you feel that way?
<fwereade_> niemeyer, (also the tests are screwed up, but I'm hoping fresh eyes after a lie down will fix that, so you shouldn't have to worry too much there...
<fwereade_> niemeyer, I can't quite put my finger on it... I'm sort of hoping you'll take one look and say "why are you doing X" and I'll say "er, I don't know", and we'll all live happily ever after
<niemeyer> fwereade_: LOL
<fwereade_> niemeyer, don't spend too much time on it, though
<niemeyer> fwereade_: Cool, will check it out
<fwereade_> niemeyer, cheers
<TheMue> niemeyer: most stuff for environ config is done, but provisioner (and here especially the tests) are a problem. dave also tested invalid environments. his impl today is relatively friendly and retries based on the changes delivered by the watcher. but the new watcher stops on invalid environments, because config.New() thankfully doesn't accept them.
<niemeyer> TheMue: There is an important distinction between an invalid config.Config and a config.Config considered invalid by a provider
<niemeyer> TheMue: The logic in the provisioner, and in fact in other workers too, should still consider a broken env and retry
<TheMue> niemeyer: config.New() checks the schema and returns an error if it is invalid
<niemeyer> TheMue: The firewaller should do that too, for example
<niemeyer> <niemeyer> TheMue: There is an important distinction between an invalid config.Config and a config.Config considered invalid by a provider
<TheMue> niemeyer: what should the watcher return in this case? a struct containing the config and an error, so that the receiver can check? would be easy, but leads to more changes in other places.
<niemeyer> TheMue: If the watcher receives an invalid *config.Config*, it can drop it, and wait for the next change
<niemeyer> TheMue: That said, the firewaller and the provisioner must still check for a config.Config considered invalid by the provider
<TheMue> niemeyer: ah, good idea
<niemeyer> TheMue: Because a valid config.Config is not necessarily a valid configuration for a specific provider
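The "drop it and wait for the next change" behaviour described above could look something like this. It is a sketch under assumptions: Config, newConfig, and watch are hypothetical stand-ins for config.Config, config.New, and the environ config watcher, and the schema check is reduced to "name must be a non-empty string" (echoing the name = 1 test case mentioned earlier).

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a stand-in for config.Config: once built, it is known to
// match the schema.
type Config struct{ name string }

// newConfig is a stand-in for config.New; it rejects attributes that
// do not match the (much simplified) schema.
func newConfig(attrs map[string]interface{}) (*Config, error) {
	name, ok := attrs["name"].(string)
	if !ok || name == "" {
		return nil, errors.New("name: expected non-empty string")
	}
	return &Config{name: name}, nil
}

// watch forwards every change that yields a valid Config and silently
// drops the rest, so one bad write does not stop the watcher.
func watch(raw <-chan map[string]interface{}, out chan<- *Config) {
	for attrs := range raw {
		cfg, err := newConfig(attrs)
		if err != nil {
			continue // invalid config: wait for the next change
		}
		out <- cfg
	}
	close(out)
}

func main() {
	raw := make(chan map[string]interface{}, 3)
	raw <- map[string]interface{}{"name": "good"}
	raw <- map[string]interface{}{"name": 1} // fails the schema
	raw <- map[string]interface{}{"name": "better"}
	close(raw)

	out := make(chan *Config, 3)
	watch(raw, out)
	for cfg := range out {
		fmt.Println(cfg.name)
	}
}
```

A consumer such as the provisioner or firewaller would still need its own check on each delivered Config, since passing the general schema says nothing about whether a specific provider accepts it.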
<TheMue> niemeyer: yeah, i don't change that check
<TheMue> so, it seems there's only one failing assert left. have to check why
<TheMue> ah, landed, only one well known fail due to my locale (compared bzr message of store is wrong)
<TheMue> dinnertime
<niemeyer> fwereade_: I think I spotted some of your dislike there
<niemeyer> TheMue: Enjoy!
<fwereade_> niemeyer, oh yes?
<niemeyer> fwereade_: Yeah, I've sent a review
<fwereade_> niemeyer, sweet, tyvm
<niemeyer> fwereade_: np!
 * niemeyer steps out for lunch
<fwereade_> niemeyer, yeah, good food for thought -- thanks :)
<rog> niemeyer: ping
<rog> fwereade_: ping
<niemeyer> rog: Pongus
<rog> niemeyer: just wanted to have a word about the strictness of provider config attributes
<niemeyer> rog: Ok
<rog> niemeyer: you said yesterday that they shouldn't be strict, but we had tests specifically checking for strictness. just wanted to check that it's ok to relax the checks.
<rog> niemeyer: i've proposed the CL anyway.
<niemeyer> rog: I'm not sure about what this is about
<rog> niemeyer: EnvironConfig
<niemeyer> rog: I don't recall saying anything about strictness
<rog> [16:05:25] <niemeyer> rog: No provider should break if config.Config has an attribute that it doesn't know about
<niemeyer> rog: Yeah, that's still the case
<rog> niemeyer: but at least one provider *was* breaking when it saw an unknown attribute
<rog> niemeyer: (and we had a test to check it)
<niemeyer> rog: I can't imagine how that would have happened.. they don't validate the config.Config internals
<rog> niemeyer: environs.Provider.SetConfig
<niemeyer> rog: ?
<rog> niemeyer: the provider sees all the attributes in the config
<rog> niemeyer: including (now) agent-version
<niemeyer> rog: I said something else
<niemeyer> rog: They don't validate the config.Config internals
<niemeyer> rog: What is breaking, more precisely/
<niemeyer> ?
<rog> niemeyer: your comment didn't mention internals.
<niemeyer> rog: config.New validates itself
<niemeyer> <niemeyer> rog: I can't imagine how that would have happened.. they don't validate the config.Config internals
<rog> niemeyer: they validate the attributes they get from the config
<rog> niemeyer: SetConfig was failing
<niemeyer> rog: What is breaking, more precisely?
<niemeyer> rog: file and line, please
<rog> niemeyer: well, the symptom i saw was that the provisioner failed to make a valid environment
<rog> niemeyer: the cause was:
<rog> niemeyer: environs/dummy/environs.go, checker.Coerce
<rog> niemeyer: (in Validate)
<niemeyer> rog: There are no general attributes in there
<niemeyer> rog: The file and line where an environment validates the general config.Config attributes, please?
<rog> niemeyer: ah, should config.Config know about agent-version?
<niemeyer> rog: Yes
<rog> niemeyer: ok
<rog> niemeyer: that makes sense.
<niemeyer> rog: Sweet
<rog> niemeyer: ok, thanks. i'll mark that branch as WIP
<niemeyer> rog: np
<rog> niemeyer: gotta go now. see you tomorrow.
<niemeyer> rog: have a good time
<rog> niemeyer: have fun
<niemeyer> rog: Thanks
#juju-dev 2012-08-30
<TheMue> morning
<fwereade_> heya TheMue
<fwereade_> heya wrtp
<wrtp> fwereade_: moanin'
<wrtp> TheMue: hiya
<TheMue> fwereade_, wrtp: hi
<Aram> hello.
<TheMue> Aram: hi
<Aram> so, the MongoDB tailable cursors are a piece of shit.
<TheMue> Aram: tailable?
<Aram> TheMue: http://www.mongodb.org/display/DOCS/Tailable+Cursors
<Aram> the thing I wanted to use to avoid polling.
<Aram> but it's a piece of crap. truth is MongoDB doesn't have this simple feature, and encapsulates server side polling inside a tailing API.
<Aram> but it's not a real, fast, delivery mechanism, in fact it's slower than polling.
<TheMue> Aram: ic, yes, would be nice for queues.
<Aram> if you starve the data from MongoDB it will sleep on the server side for about 10 seconds.
<Aram> it doesn't really block and wait for new data.
<Aram> it just creates a thread that wakes after 10 seconds to do a new query.
<TheMue> Aram: *sigh* instead of really pushing it to an open connection.
<Aram> I spent a day debugging why my watcher made tests take about one minute per test.
<Aram> yeah.
<niemeyer> Morning all!
<fwereade_> niemeyer, heyhey
<TheMue> niemeyer: hiya
<niemeyer> Reviews are clean.. I'll jump right into proposal implementation for presence.. got a nice design yesterday, which should scale to a reasonable level.
<niemeyer> fwereade_: Do we have any place that checks for liveness in-place with Alive instead of AliveW?
<niemeyer> fwereade_: Or, can you imagine why we'd do that?
<fwereade_> niemeyer, hmmm, I'm not sure we do
<fwereade_> niemeyer, Unit.AgentAlive
<niemeyer> fwereade_: Cool, I'll try to go with a more conservative design
<fwereade_> niemeyer, it's used in status
<niemeyer> fwereade_: Ah, good point, cheers
<fwereade_> niemeyer, (and Machine.AgentAlive too)
<niemeyer> fwereade_: Yeah, for the same reason I expect
<fwereade_> niemeyer, yeah
<wrtp> niemeyer: "
<wrtp> It'd be nice to avoid yet another place where the environment configuration is
<wrtp> manipulated magically. What's wrong with doing it when that configuration is
<wrtp> created?
<wrtp> "
<wrtp> niemeyer: i thought it was nice to ensure that the agent tools version was always correctly set
<niemeyer> wrtp: It is nice, but that's not the agent tools version
<wrtp> niemeyer: i could easily change things so that jujud bootstrap-state (and dummy.Environ.Bootstrap) is responsible for setting the version, but that leaves a small window where if you do: {juju bootstrap; juju upgrade} the upgrade might not see the agent version, so must fail
<wrtp> niemeyer: proposed agent tools version?
<niemeyer> wrtp: Exactly.. we have a setting for the current agent tools version, and it lives elsewhere
<wrtp> niemeyer: sorry, all this is talking about the proposed version, not the current version.
<niemeyer> wrtp: Exactly
<niemeyer> wrtp: That's why the suggested location seems awkward
<niemeyer> wrtp: There's such window, actually
<wrtp> niemeyer: so what should upgrade-juju do if it sees that the version has not yet been set? my inclination is for it just to fail, as it's a marginal case.
<niemeyer> wrtp: Why would the version not be set?
<niemeyer> wrtp: That configuration that you're manipulating in initialization is coming from somewhere
<wrtp> niemeyer: hmm, of course
<wrtp> niemeyer: doh!
<niemeyer> wrtp: :)
<niemeyer> I'll get some coffee and bbiab
<wrtp> niemeyer: should be better now: https://codereview.appspot.com/6494057
<TheMue> niemeyer: any good idea on how to invalidate the environment w/o writing direct to ZK? sadly JujuConnSuite embeds only ZkSuite to start ZK, but doesn't expose a ZK connection to use.
<wrtp> TheMue: can't you get the zk connection from the State?
<TheMue> wrtp: different package, the provisioner_test.
<niemeyer> wrtp: Sorry, I think we're still not understanding each other..
<wrtp> niemeyer: oh, darn
<wrtp> niemeyer: should agent-version be set when reading environments.yaml?
<wrtp> niemeyer: that seemed wrong to me
<niemeyer> wrtp: Please check out environs/ec2/ec2.go:245
<wrtp> niemeyer: in my branch?
<niemeyer> wrtp: In trunk
<wrtp> niemeyer: i thought about that too, but it didn't seem quite right somehow
<niemeyer> wrtp: See the tools there, two lines below it?
<niemeyer> wrtp: It's all there
<niemeyer> wrtp: What's wrong is that there's something generic being done in the provider
<niemeyer> wrtp: But that's pending even despite this change
<niemeyer> publicAttrs is really supposed to be generic
<wrtp> niemeyer: are you thinking of something like this: http://paste.ubuntu.com/1175776/
<wrtp> ?
<wrtp> niemeyer: or do you think the agent-version should be in the Config that the Environ was created with?
<niemeyer> wrtp: I'm thinking about something like environs.BootstrapConfig(e environ.Environ, tools *Tools)
<niemeyer> wrtp: That is mainly publicAttrs, plus the new agent version logic
<niemeyer> wrtp: The former, from your two options, except it'd be within publicAttrs, which is in fact BootstrapConfig
<wrtp> niemeyer: that seems like a reasonable thing to do. not entirely sure that it shouldn't be environs.BootstrapConfig(p EnvironProvider, cfg *config.Config, tools *Tools)
<wrtp> niemeyer: which gives the provider potential leave to manipulate the config before it goes out into the cloud. but... it probably doesn't matter much.
<niemeyer> wrtp: e is what's calling BootstrapConfig..
<wrtp> niemeyer: i realise that
<niemeyer> wrtp: It means it manipulate the result..
<niemeyer> it can
<wrtp> niemeyer: but all BootstrapConfig will use e for is to get its Config, no?
<wrtp> niemeyer: and the provider
<niemeyer> wrtp: e is the thing that "sends the config into the cloud"
<niemeyer> wrtp: It has the result of BootstrapConfig in its hands
<wrtp> niemeyer: oh, hold on - will BootstrapConfig actually call Environ.Bootstrap? no, it couldn't.
<wrtp> niemeyer: environs.BootstrapConfig(e environ.Environ, tools *Tools) *config.Config
<niemeyer> wrtp: No.. it will call e.Config()
<wrtp> presumably
<niemeyer> wrtp: and e.SecretAttrs
<niemeyer> e.Provider().SecretAttrs, actually
<wrtp> niemeyer: yes - SecretAttrs is a method on EnvironProvider
<niemeyer> wrtp: Yep, and Config is in e
<niemeyer> wrtp: It has all it needs
<wrtp> niemeyer: if all BootstrapConfig needs is the provider and the config, why not just pass those in explicitly
<wrtp> niemeyer: makes it easier to test, for one thing.
<niemeyer> wrtp: Sure.. that works too
<wrtp> niemeyer: cool.
<wrtp> niemeyer: that type sig makes it more obvious what it's doing too.
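For readers without the pastes, here is a guess at the general shape being discussed: take the config attributes, leave out whatever the provider reports as secret, and record the agent version from the tools. Everything below is an assumption for illustration. Plain maps stand in for config.Config, Tools has a bare string version, and the Provider interface is simplified.

```go
package main

import "fmt"

// Tools is a simplified stand-in for the real tools type.
type Tools struct{ Version string }

// Provider is a simplified stand-in for EnvironProvider; only the
// SecretAttrs part matters here.
type Provider interface {
	SecretAttrs(attrs map[string]interface{}) map[string]interface{}
}

// bootstrapConfig returns the attributes to send to the cloud: every
// non-secret attribute plus an agent-version taken from tools.
func bootstrapConfig(p Provider, attrs map[string]interface{}, tools *Tools) map[string]interface{} {
	secrets := p.SecretAttrs(attrs)
	public := make(map[string]interface{}, len(attrs)+1)
	for k, v := range attrs {
		if _, secret := secrets[k]; !secret {
			public[k] = v
		}
	}
	public["agent-version"] = tools.Version
	return public
}

// fakeProvider is a hypothetical provider that treats "secret-key" as
// its only secret attribute.
type fakeProvider struct{}

func (fakeProvider) SecretAttrs(attrs map[string]interface{}) map[string]interface{} {
	secrets := make(map[string]interface{})
	if v, ok := attrs["secret-key"]; ok {
		secrets["secret-key"] = v
	}
	return secrets
}

func main() {
	attrs := map[string]interface{}{"name": "test", "secret-key": "hunter2"}
	fmt.Println(bootstrapConfig(fakeProvider{}, attrs, &Tools{Version: "1.2.3"}))
}
```

Passing the provider and config explicitly, rather than an Environ, means a test only needs a fake provider like the one above.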
<TheMue> YEAH!
<TheMue> Found my missing function that helps me.
<wrtp> niemeyer: a quick question: ec2.environProvider.publicAttrs uses UnknownAttrs not AllAttrs. i *think* that's a bug - do you concur?
<niemeyer> wrtp: Looking
<niemeyer> wrtp: yeah, looks like a bug. Good catch
<wrtp> niemeyer: it would be caught by tests soon enough :-)
<wrtp> niemeyer: i'm a bit surprised it wasn't caught already
<niemeyer> wrtp: Yeah, you've probably saved Dave quite a bit of head-scratching
<niemeyer> wrtp: I've filed a bug about updateSecrets yesterday
<niemeyer> wrtp: It's still sending the full config ATM.. that's why the bug isn't visible yet
<wrtp> niemeyer: this is my initial stab at BootstrapConfig: http://paste.ubuntu.com/1175826/
<wrtp> s/providerInstance/p/ obviously :-)
<niemeyer> wrtp: Beautiful!
<fwereade_> GAAAAH it's testing.HTTPServer that occasionally hangs the Uniter tests
 * fwereade_ has a relieved
 * fwereade_ is going for a giggie
<fwereade_> ciggie
<niemeyer> fwereade_: Woohay :)
 * wrtp wants a cobzr branch feature that lists branches in time-last-modified order.
<Aram> it's easy to create a script that does this.
<wrtp> Aram: comparing times isn't that easy in a script, but yeah, i could write a program to do it.
<wrtp> Aram: actually, the timestamps used are easy to compare lexicographically, so not too bad.
<Aram> it would be great if it could print unix time or some other simple integer
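The lexicographic comparison works for any fixed-width, most-significant-field-first timestamp. A small sketch, assuming a "YYYY-MM-DD HH:MM:SS" format (the actual format printed by the tool is not shown in this log) and made-up branch names:

```go
package main

import (
	"fmt"
	"sort"
)

// branch pairs a branch name with its last-modified timestamp. The
// timestamp format is an assumption: fixed width, year first.
type branch struct {
	name     string
	modified string
}

// sortByModified orders branches oldest first using plain string
// comparison, which agrees with time order for this format.
func sortByModified(bs []branch) {
	sort.Slice(bs, func(i, j int) bool { return bs[i].modified < bs[j].modified })
}

func main() {
	bs := []branch{
		{"feature-b", "2012-08-30 14:02:11"},
		{"feature-a", "2012-08-29 09:15:00"},
		{"feature-c", "2012-08-30 08:45:30"},
	}
	sortByModified(bs)
	for _, b := range bs {
		fmt.Println(b.modified, b.name)
	}
}
```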
<niemeyer> All going well.. review queue is empty, lots of things being merged..
<niemeyer> Presence is well in progress too
<niemeyer> I'll step out for a slightly extended lunch (I expect ~1:30) to run some errands, and will continue on that
<wrtp> fwereade_, TheMue: do you think it should be an error if State.AddService is called twice with exactly the same arguments?
<fwereade_> wrtp, hmm, that's a bit philosophical ;p
<wrtp> i.e. a service with the same name and charm being created twice?
<Aram> yes.
<fwereade_> wrtp, yeah, I *think* it should be
<wrtp> fwereade_: currently it seems to create a new service with an identical name. i don't think that's right.
<fwereade_> wrtp, whoa, that definitely sounds screwed up
<TheMue> wrtp: I would say so.
<Aram> in mstate it good
<wrtp> fwereade_: FirewallerSuite.TestNotExposedService does it.
<Aram> it's
<wrtp> fwereade_: but i haven't actually checked what's happening - i'm just assuming from a brief look at the code
<wrtp> but the test passes, so it must be doing something
<fwereade_> wrtp, heh, sorry, I'm not familiar with that bit
<wrtp> TheMue: any particular reason that that test calls AddService("wordpress", s.charm) twice in the same test?
<fwereade_> wrtp, (the perspective that gives me pause is "well, at the end of the call there is a service named wordpress with the right charm, everyone should be happy")
<fwereade_> wrtp, if there are *two* services with the same name, yeah, that's just wrong
<wrtp> fwereade_: ahem, yeah, but service names should be unique
<TheMue> wrtp: where exactly do you see it? i see it only once.
<wrtp> i think there must be
<TheMue> wrtp: line 74
<wrtp> TheMue: oh drat! i'm just mistaking my own copy and pasted stuff for the original
 * wrtp hangs head
<wrtp> sorry, false alarm!
<TheMue> *ROFL*
<wrtp> could've sworn i checked
<fwereade_> haha, I'm pretty sure I've done worse :)
<TheMue> wrtp: btw, the environ config stuff is in
<wrtp> TheMue: brilliant, thanks!
<wrtp> TheMue: would appreciate if you could have a look at this. i hope that your tests are still fundamentally the same although i've shuffled things around a little. https://codereview.appspot.com/6499056
<TheMue> *click*
<TheMue> wrtp: LGTM, only one smaller note.
<wrtp> TheMue: thanks
<wrtp> TheMue: tbh i'm not sure it's even worth keeping that check around - it's just checking that StartInstance works, which we test in other places.
<wrtp> TheMue: it looks to me like it was a debugging remnant
<wrtp> TheMue: i left it in in case it was important - what do you think?
<TheMue> wrtp: Yeah, I think it's by Dave and it's just to get sure that the instance is started.
<wrtp> TheMue: ok
<wrtp> TheMue: i'll delete it i think
 * niemeyer waves
<wrtp> niemeyer: ping
<niemeyer> wrtp: yo
<wrtp> niemeyer: one little wrinkle to our upgrade scheme change
<niemeyer> wrtp: Uh oh :)
<wrtp> niemeyer: i think the dev flag has to be in the environ config
<wrtp> niemeyer: rather than an argument to upgrade-juju
<niemeyer> wrtp: Why's that?
<wrtp> niemeyer: because the agents need to know if they can upgrade to a dev version or not
<niemeyer> wrtp: Not that it sounds bad.. it actually sounds reasonable, but just curious
<niemeyer> wrtp: Ah, interesting
<wrtp> niemeyer: and that's the best (i think) way of communicating that
<wrtp> niemeyer: it actually works out quite well i think
<niemeyer> wrtp: Yeah, it sounds good
<wrtp> niemeyer: unfortunately all these changes mean i am unlikely to make tomorrow's deadline :-(
<niemeyer> wrtp: Well, it'd be awesome if we have that well aligned by the end of the day tomorrow at least
<wrtp> niemeyer: to prepare for the other change, i'm changing FindTools and BestTools to look like this:
<wrtp> niemeyer: http://paste.ubuntu.com/1176355/
<niemeyer> wrtp: That flag will likely be used for other stuff as well, btw
<wrtp> niemeyer: the "Highest" flag is useful because we don't want the version of the client to determine the highest version that can be deployed
<wrtp> niemeyer: and i'm happy to lose the bool tbh
<niemeyer> wrtp: Why would we not want that?
<wrtp> niemeyer: why would we? if there's a version later than my current client in the cloud, i'd want it to be used when i bootstrap, i think
<wrtp> niemeyer: it's the same semantics we had previously
<wrtp> niemeyer: the <= semantics are only important when upgrading, i think
<niemeyer> wrtp: Why? What's the logic?
<wrtp> niemeyer: if the logic *always* selects the highest version, then we can never downgrade.
<niemeyer> wrtp: Uh.. now I'm even more lost :)
<niemeyer> wrtp: Hmm
<wrtp> niemeyer: ok. so what we *did* have was that BestTools always selected the highest available version with the same major number.
<wrtp> niemeyer: now clients are using BestTools to choose a suitable set of tools when told to upgrade to a particular version number.
<wrtp> niemeyer: so we make sure (your idea) that they don't upgrade to a higher version number than requested.
<niemeyer> wrtp: The idea was also to ensure compatibility with the client, but I guess it doesn't matter.. we have to preserve it either way
<wrtp> niemeyer: but the old semantics are still applicable IMHO when initially deploying
<wrtp> niemeyer: yes
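The two selection modes described above (highest version with the same major number when bootstrapping, never above the requested version when upgrading) can be sketched like this. The Version type and bestVersion function are simplified, hypothetical stand-ins. The real BestTools signature is in the paste and is not reproduced here.

```go
package main

import "fmt"

// Version is a simplified stand-in for the real version type.
type Version struct{ Major, Minor, Patch int }

// Less reports whether v sorts before w.
func (v Version) Less(w Version) bool {
	if v.Major != w.Major {
		return v.Major < w.Major
	}
	if v.Minor != w.Minor {
		return v.Minor < w.Minor
	}
	return v.Patch < w.Patch
}

// bestVersion picks the highest available version with the same major
// number as want. Unless newest is set, versions above want are
// skipped, so an upgrade never overshoots the requested version.
func bestVersion(available []Version, want Version, newest bool) (Version, bool) {
	var best Version
	found := false
	for _, v := range available {
		if v.Major != want.Major {
			continue
		}
		if !newest && want.Less(v) {
			continue
		}
		if !found || best.Less(v) {
			best, found = v, true
		}
	}
	return best, found
}

func main() {
	avail := []Version{{1, 0, 0}, {1, 2, 0}, {1, 5, 0}, {2, 0, 0}}
	fmt.Println(bestVersion(avail, Version{1, 3, 0}, false)) // upgrade mode
	fmt.Println(bestVersion(avail, Version{1, 3, 0}, true))  // bootstrap mode
}
```

Because the upgrade mode caps the result at the requested version, asking for a version lower than the one currently running still selects something at or below it, which is what makes a downgrade possible.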
<niemeyer> wrtp: Okay.. can we please just tweak the flag names a bit.. Dev is too generic.. we need some kinds of prefix/suffix such as DevTools
<wrtp> niemeyer: sure, suggestions welcome. DevTools, HighestTools?
<wrtp> HighestVersion?
<niemeyer> wrtp: You mean DevVersion as well?
<wrtp> niemeyer: i hadn't, but actually that could work well
<niemeyer> wrtp: NewestVersion and NewestCompatVersion
<niemeyer> ?
<wrtp> niemeyer: the flags are orthogonal
<niemeyer> wrtp: Or VDev VNewest VNewestCompat
<niemeyer> wrtp: So VNewest and VCompat
<niemeyer> wrtp: (or NewestVersion and CompatVersion)
<wrtp> niemeyer: i'm not sure i like Compat
<niemeyer> wrtp: Suggestions/
<niemeyer> ?
<wrtp> niemeyer: because dev versions *are* compatible with non-dev versions
<wrtp> niemeyer: (or should be)
<wrtp> niemeyer: we already have Version.IsDev
<niemeyer> wrtp: Yep, the point of compat is that 4.0 is newer than 3.0
<wrtp> niemeyer: ah, currently it *always* chooses a compatible version, so that's not an option
<niemeyer> wrtp: Yes, but it is
<wrtp> niemeyer: i'd like to add that later, if i may
<niemeyer> wrtp: Sure, but we don't have to wait until later to fix the flag name
<wrtp> niemeyer: sure, but DevVersion, NewestVersion, and CompatVersion would work ok
<wrtp> niemeyer: as orthogonal flags
<wrtp> niemeyer: CompatVersion to be added later
<niemeyer> wrtp: Compat is the behavior you're introducing right now
<niemeyer> wrtp: The lack of Compat is what will come later
<wrtp> niemeyer: ah, so maybe we should make the flag opposite in meaning
<wrtp> niemeyer: AllowIncompatibleVersion :-)
<niemeyer> wrtp: I don't see why.. the three flags above look nice and are orthogonal as you suggest
<wrtp> niemeyer: if i add the flag now, then i have to add loads more tests, and i'd prefer not to for the time being. there are so few calls to BestVersion and FindTools that it's trivial to add CompatVersion later
<wrtp> niemeyer: (like about 4 calls)
<niemeyer> wrtp: Sorry, I don't get it.. you don't need any new tests in addition to what you'd have anyway
<niemeyer> wrtp: I'm not suggesting you change the logic, I'm suggesting we have the real flags we want since there's zero cost in that
<wrtp> niemeyer: yes, i have to test what happens when that flag is *not* specified
<niemeyer> wrtp: if flag not specified { panic("not yet") }
<niemeyer> wrtp: !?
<wrtp> niemeyer: ok, i'll do that
<niemeyer> wrtp: Cheers
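The flag scheme agreed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the flag names DevVersion, NewestVersion and CompatVersion come from the discussion, but the Version type, the less helper and the best function are hypothetical and are not the juju-core API.

```go
package main

import "fmt"

// Version is a simplified, hypothetical stand-in for a tools version.
type Version struct {
	Major, Minor, Patch int
	Dev                 bool
}

// less reports whether a is older than b.
func less(a, b Version) bool {
	if a.Major != b.Major {
		return a.Major < b.Major
	}
	if a.Minor != b.Minor {
		return a.Minor < b.Minor
	}
	return a.Patch < b.Patch
}

// Flags are orthogonal selection options, named after the discussion.
type Flags int

const (
	DevVersion    Flags = 1 << iota // allow development versions
	NewestVersion                   // allow versions above the requested one
	CompatVersion                   // only consider the same major version
)

// best picks the highest acceptable version from avail (hypothetical helper).
func best(avail []Version, want Version, flags Flags) (Version, bool) {
	if flags&CompatVersion == 0 {
		panic("not yet") // incompatible selection is deferred, as agreed
	}
	var found bool
	var b Version
	for _, v := range avail {
		if v.Major != want.Major {
			continue
		}
		if v.Dev && flags&DevVersion == 0 {
			continue
		}
		if flags&NewestVersion == 0 && less(want, v) {
			continue // without NewestVersion, never exceed the request
		}
		if !found || less(b, v) {
			b, found = v, true
		}
	}
	return b, found
}

func main() {
	avail := []Version{{1, 0, 0, false}, {1, 2, 0, false}, {1, 4, 0, false}, {1, 5, 0, true}, {2, 0, 0, false}}
	want := Version{Major: 1, Minor: 2}
	fmt.Println(best(avail, want, CompatVersion))
	fmt.Println(best(avail, want, CompatVersion|NewestVersion))
	fmt.Println(best(avail, want, CompatVersion|NewestVersion|DevVersion))
}
```

Keeping the flags as independent bits means the later "no Compat" behaviour only needs the panic replaced, without renaming anything at the call sites.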
<wrtp> niemeyer: https://codereview.appspot.com/6500052/
<wrtp> niemeyer: i apologise for the size - i haven't had time to split it into several CLs.
<wrtp> niemeyer: i have to go now. see you tomorrow!
<niemeyer> wrtp: np, I'll try to have it reviewed before the EOD
<niemeyer> wrtp: Have a good evening
<wrtp> niemeyer: that would be marvellous!
<mramm> sent an e-mail about the sprint
<mramm> please book travel as soon as you can (and definitely before the weekend!!!) to save costs
#juju-dev 2012-08-31
<davechen1y> niemeyer: you make a good point
<niemeyer> For context,
<niemeyer> <davechen1y> http://codereview.appspot.com/6497057/ is my proposal
<niemeyer> <niemeyer> I suspect it's a bit more involved than that
<niemeyer> <niemeyer> As the comment suggests, we should only push secrets when they are not yet set, and should push *only* secrets
<niemeyer> <davechen1y> should we take this to the channel ?
<niemeyer> <niemeyer> Sure
<davechen1y> niemeyer: what happens currently is AllAttrs - SecretAttrs are pushed on bootstrap
<davechen1y> so, with 6497057
<davechen1y> the set of attrs not pushed would only be the secrets
<davechen1y> but this is only by coincidence, not design
<niemeyer> davechen1y: Hmm.. I'm not sure I understand what that means
<niemeyer> davechen1y: That's what we have been aiming at
 * davechen1y finds code
<niemeyer> davechen1y: Also, it's not true that we currently push AllAttrs-SecretAttrs.. it will be so once wrtp's branch goes in with the fix, though
<davechen1y> niemeyer: hmm, there are unit tests that check this
<davechen1y> certainly up to the point that they are yaml encoded and passed to cloudinit.Config
<niemeyer> davechen1y: If we have unittests for this, they're broken or not being run
<niemeyer> davechen1y: https://codereview.appspot.com/6500052/diff/2001/environs/ec2/ec2.go
<niemeyer> davechen1y: Line 132 of the left-hand side
<davechen1y> niemeyer: http://codereview.appspot.com/6458161/patch/17001/15007
<niemeyer> davechen1y: Sure, it works because updateSecrets is broken, right?
<niemeyer> davechen1y: The configuration being tested there is after updateSecrets has set everything over again
<niemeyer> sent
<davechen1y> niemeyer: yes, you are correct
<davechen1y> niemeyer: http://codereview.appspot.com/6458161/diff/17001/state/state_test.go?column_width=80, rhs, line ~ 88
<davechen1y> that doesn't go through the conn
<davechen1y> so the problem must be the handoff to jujud bootstrap-state via cloudinit
<niemeyer> davechen1y: I'm not entirely sure about what you're trying to show me.. the test contains the full configuration..
<niemeyer> davechen1y: The problem is in the exact line I've shown you above, and wrtp's branch fixes it
<davechen1y> cool, i'll wait for it to land
<niemeyer> davechen1y: You can also use it as a pre-req if you'd like to build on it
<niemeyer> davechen1y: It's already approved
<davechen1y> sweet, will do
<niemeyer> OK: 1 passed
<niemeyer> PASS
<niemeyer> ok      launchpad.net/juju-core/mstate/presence 1.111s
<niemeyer> !!!!!
<niemeyer> On that note, I'm heading to bed as tomorrow will be an early day
 * davecheney opens the champagne
<niemeyer> davecheney: I wish I could enjoy it with you :)
<davecheney> TO LISBON!
<niemeyer> davecheney: Oh yeah!
<davecheney> MOAR PORT WINE
<niemeyer> SO TRUE
<niemeyer> 8)
<niemeyer> davecheney: Have a great working day, and will see you in a bit!
<davecheney> will do
 * niemeyer takes off
<davecheney> c'mon LP, you can do it
<davecheney> % lbox propose
<davecheney> error: Failed to load data for project "juju-core": Get https://api.launchpad.net/devel/juju-core: unexpected EOF
<wrtp> davecheney: mornin'
<davecheney> wrtp: howdy!
<davecheney> can you commit your bootstrap branch
<davecheney> so I can unfuck my cross branch that requires it
 * davecheney hates prereqs so much
<wrtp> davecheney: sorry for stepping on your toes with the provisioner fix - i'd done the fix before i checked the bug and found it was InProgress
<wrtp> davecheney: am just about to submit
<davecheney> wrtp: no apology needed
<davecheney> don't care, it's fixed :)
<wrtp> :-)
<davecheney> i have a secrets only branch to follow
<wrtp> davecheney: it was great that you'd added the last "happens in the wild" comment to the bug report
<wrtp> davecheney: 'cos otherwise i wouldn't have found the other firewaller bug
<davecheney> wrtp: also, nice use of the watcher inside the provisioner test
<wrtp> davecheney: it was a little more involved than perhaps it should be, but i hate fuckin' timeouts
<wrtp> davecheney: we have far too many tests that rely on waiting for arbitrary time periods
<davecheney> wrtp: 110% agree, see my whinge about https://bugs.launchpad.net/juju-core/+bug/1037421
<wrtp> davecheney: yeah.
<wrtp> davecheney: i wonder if we should standardise on using debug log messages to determine "actively doing nothing"
<wrtp> davecheney: because that's usually what we're testing when we timeout
<davecheney> wrtp: not sure, like it says in the ticket, i haven't looked into _why_ this happens at all
<wrtp> davecheney: presence may be different
<wrtp> davecheney: i looked at the timing of the presence tests BTW, and it's not that there's one test that's slow - there are many slow tests there.
<davecheney> wrtp: 0% cpu usage while it is waiting, i smell timeout
<wrtp> davecheney: definitely
<wrtp> davecheney: just doing final test after merging trunk, then will submit
<davecheney> kk
<wrtp> davecheney: the other thing to make tests faster: it would be great if all the tests could use the same state server.
<davecheney> wrtp: i can't see how we could do that in zk, but with mongo probably
<wrtp> davecheney: apparently it is possible in zk (you can do the equivalent of chroot) but mongo is more important now.
<wrtp> davecheney: submitted
<davecheney> wrtp: i guess with mgo it would be some sort of unique prefix on the table names
<wrtp> davecheney: i guess.
<wrtp> davecheney: BTW do you think that https://codereview.appspot.com/6499056/ is superficial enough to submit without niemeyer's LGTM?
<davecheney> wrtp: you have my permission to throw me under the bus if niemeyer objects
<davecheney> i've been submitting build fixes without waiting
<wrtp> davecheney: i didn't see any buses in lisbon
<davecheney> you never say LONGO BUS ?
<wrtp> davecheney: oh yeah, one thing: there was a test in firewaller_test that looked like a debugging remnant. see line 81 of https://codereview.appspot.com/6499056/diff/5001/worker/firewaller/firewaller_test.go
<wrtp> davecheney: i left it in, but was it there for a particular reason?
<davecheney> wrtp: dunno, i didn't write it (i think)
<wrtp> davecheney: actually i came from the airport on a bus. you may be in danger.
<wrtp> davecheney: frank thought you did actually
<davecheney>  s/say/saw
<wrtp> davecheney: that bit of it anyway
<davecheney> wrtp: fuck it, it's called refactoring
<wrtp> davecheney: yeah. i didn't want to leave out tests that actually checked functionality.
<wrtp> davecheney: but that just tests that the instance has been started, which we test elsewhere
<davecheney> # launchpad.net/juju-core/juju
<davecheney> ./conn.go:95: env.Get undefined (type *config.Config has no field or method Get)
<davecheney> ./conn.go:96: env.Set undefined (type *config.Config has no field or method Set)
<davecheney> ./conn.go:99: env.Write undefined (type *config.Config has no field or method Write)
<davecheney> sadface
<davecheney> i was using those
<wrtp> davecheney: we could add Get.
<wrtp> davecheney: i don't think the other two are appropriate though
<davecheney> wrtp: what CL changed this
<davecheney> i'll get my learn on
<wrtp> davecheney: in fact, i wonder if the lack of Write is a problem
<wrtp> davecheney: it's been going to happen for ages
<wrtp> davecheney: frank did it at niemeyer's request
<wrtp> davecheney: right, you're officially at risk
<davecheney> wrtp: jolly good
<davecheney> wrtp: any idea why this happens ?
<davecheney> ubuntu@server-15347:~/go/src/pkg/net/http$ godoc launchpad.net/juju-core/environs/ec2
<davecheney> PACKAGE
<davecheney> package ec2 import "launchpad.net/juju-core/environs/ec2"
<wrtp> interesting
<davecheney> tested from a clean checkout
<wrtp> davecheney: yes, happens for me too
<wrtp> davecheney: but works fine on dummy, for example
<davecheney> how odd
 * wrtp refuses to get distracted :-)
<davecheney> wrtp: i'll log a bug to figure out WTF that is happening
<davecheney> can I test the waters with this CL, http://codereview.appspot.com/6492065/
<davecheney> I was playing on canonistack today
<davecheney> it is very possible to do a three line install of juju from source from scratch
<davecheney> has anyone else run into this issue btw, https://bugs.launchpad.net/juju-core/+bug/1044164
<wrtp> davecheney: i think that's a good idea
<wrtp> davecheney: i think it's a pity that go get doesn't have a way of asking for testing deps too
<davecheney> wrtp: it's kind of a hack
<davecheney> but it works
<davecheney> and is reasonably self contained
<wrtp> davecheney: given the current limitations of go get, i think it's reasonable.
<wrtp> davecheney: i've fixed the issue you had above BTW
<wrtp> davecheney: (it happens all the time for me)
<davecheney> godoc ?
<davecheney> or AWS
<wrtp> davecheney: sig mismatch
<davecheney> do I need to update goamz ?
<wrtp> davecheney: i can't remember if i submitted or not
<wrtp> davecheney: hmm, maybe i never even proposed it
<wrtp> davecheney: ha, i never even committed the change!
<wrtp> davecheney: short story: goamz is totally broken currently
<wrtp> davecheney: fix: add / to the urls
 * davecheney facepalm
<davecheney> ah yes
<davecheney> bucket.s3.amazon.com is a host
<davecheney> bucket.s3.amazon.com/ is a url
<davecheney> or something like that
<wrtp> davecheney: it signs the path, which should be "/" but is ""
<davecheney> right
<davecheney> the root of a URL is always tricky
<davecheney> well it's not tricky
<davecheney> it's very simple
<davecheney> but somehow always catches things out
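The empty-path problem described above can be illustrated with a small sketch. It assumes a signer that signs the URL path and simply defaults an empty path to "/"; the signingPath helper and the example host are hypothetical, and this is not the actual goamz change.

```go
package main

import (
	"fmt"
	"net/url"
)

// signingPath returns the path a request signer should use for rawurl.
// An empty path is treated as the root, "/". (Hypothetical helper.)
func signingPath(rawurl string) (string, error) {
	u, err := url.Parse(rawurl)
	if err != nil {
		return "", err
	}
	if u.Path == "" {
		return "/", nil
	}
	return u.Path, nil
}

func main() {
	for _, s := range []string{
		"https://bucket.example.com",
		"https://bucket.example.com/",
		"https://bucket.example.com/key",
	} {
		p, err := signingPath(s)
		fmt.Printf("%q -> %q (err=%v)\n", s, p, err)
	}
}
```

With this normalisation the first two URLs sign the same path, so a missing trailing slash can no longer cause a signature mismatch.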
<wrtp> davecheney: URLs are fundamentally broken
<davecheney> wrtp: no argument there
<wrtp> davecheney: we just hobble around the brokenness
<wrtp> davecheney: any sequence of slashes should be a separator
<wrtp> davecheney: like in unix paths
<wrtp> davecheney: mind you, even in unix it's a bit broken: i think cat /etc/passwd/ should work
<davecheney> wrtp: not sure I can follow you there
<davecheney> is a collection
<davecheney>  /index is a resource
<davecheney>  / is a collection
<wrtp> davecheney: i don't understand that
<davecheney> put it this way
<davecheney>  / is a dir
<davecheney> actually never mind
<davecheney> i was talking cack
<wrtp> davecheney: the model becomes really simple if trailing slash (or several) make no semantic difference.
<wrtp> davecheney: with root being a special case of course, so that we get relative paths.
<davecheney> wrtp: what does cat /etc/ do ?
<wrtp> davecheney: depends on the system
<davecheney> :)
<davecheney> you've used solaris
<wrtp> davecheney: once upon a time it gave you the contents of the dir in binary format
<wrtp> davecheney: (it still does in plan 9, and i think that's ok)
<davecheney> wrtp: i kinda agree
<wrtp> davecheney: but the point is {operation /etc/} is the same as {operation /etc}
<davecheney> i'm not sure if I want /etc and /etc/ to *be* the same thing
<wrtp> davecheney: they are
<davecheney> well currently /etc/ isn't, it's an error
<wrtp> davecheney: no it's not - it's just fine
<wrtp> davecheney: ls /etc/
<wrtp> % ls /etc/ | md5sum
<wrtp> 910833b8b809292bd85de24a9536df74  -
<wrtp> % ls /etc | md5sum
<wrtp> 910833b8b809292bd85de24a9536df74  -
<davecheney> but then we get to the issue of ls /etc/passwd/
<wrtp> davecheney: /etc/passwd/ is just an error because something in the kernel is trying to be clever
<davecheney> i try not to be clever, too many things have gone wrong
<wrtp> davecheney: or probably because of the way the namec works internally
<wrtp> s/the/that/
<davecheney> hurry up ec2
<wrtp> if it's still called namec any more
<davecheney> i wanna propose and get to my friends leaving drinks
<davecheney> namec is the function ?
<wrtp> in plan 9 you could do cat /etc/passwd/ if /etc/passwd existed :-)
<wrtp> davecheney: yeah, it's the kernel name lookup function, or used to be
<wrtp> davecheney: probably not under linux
<davecheney> damn open sores
<wrtp> i may well be remembering very badly :-)
<wrtp> davecheney: my memory is indeed faulty. it's namei. http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/sys/sys/nami.c
<davecheney> later lads
<wrtp> davecheney: cheerio
<TheMue> morning
<Aram> moin.
<TheMue> Aram: hi
<TheMue> Aram: how are the watchers doing?
<Aram> I am fighting bugs.
<TheMue> Aram: if i can help you let me know.
<TheMue> Aram: i'm interested in how they are done now. so any paste is welcome.
<Aram> ok, let me bzr push
<wrtp> Aram, TheMue, fwereade__: morning
<Aram> yo.
<fwereade__> wrtp, hey
<TheMue> wrtp: good morning, too
<fwereade__> hey everyone else too :)
<TheMue> fwereade__: and also you ;)
<wrtp> i've actually been around for over 4 hours this morning, just forgot to say hello earlier!
<Aram> TheMue: lp:~aramh/juju-core/62-mstate-watchers-mux
<Aram> look in watchserver.go
<Aram> there are bugs, but that's the general idea
<TheMue> Aram: great, thx
<Aram> some types will change, e.g. will use a chan []genericDoc instead of chan genericDoc, some loops will change from m x n to n x m etc.
<Aram> but now I am debugging the thing first.
<Aram> ah, there will be a generic unmarshalling function using reflection
<Aram> rogpeppe: I forgot how can I run a single test.
<rogpeppe> Aram: -gocheck.f pattern
<Aram> rogpeppe: thanks, how do I find this information if I need it again?
<rogpeppe> Aram: run the test binary with -help
<Aram> ah yes, I was running go test itself with -help
<TheMue> for standard tests my editor scans the file and provides a shortcut to a menu where all or individual tests can be run. that's nice.
<rogpeppe> TheMue: do you remember talking about the possibility of using a single watcher for the EnvironConfig for both provisioner and firewaller?
<rogpeppe> TheMue: it looks like it should be ok to me, but i'm not entirely sure.
<Aram> btw, in the new watcher implementation, there is a single real watcher. a watch server demultiplexer, so to speak.
<Aram> it's only one of those per session.
<TheMue> rogpeppe: yeah, we once talked about it. i don't remember anymore why we chose watchers for each.
<rogpeppe> Aram: that sounds like a good move
<Aram> it gets the data and distributes it to interested parties
<rogpeppe> Aram: i was hoping for something like that
<rogpeppe> Aram: in this case though, it's more about saving code than watchers per se
<TheMue> rogpeppe: they should reference the same environ, so that one SetConfig() should be enough.
<rogpeppe> TheMue: exactly
<TheMue> rogpeppe: but there are TWO "should" in that sentence ;)
<rogpeppe> TheMue: i think i'll do it. shouldn't be hard. and it removes some moving parts (and hence tests) from the firewaller and the provisioner.
<TheMue> rogpeppe: +1
<rogpeppe> func NewProvisioner(environ environs.Environ, st *state.State) *Provisioner
<TheMue> rogpeppe: it simply sounds more logical
<rogpeppe> fwereade__, Aram: does that sound reasonable to you?
 * fwereade__ reads back for context
<rogpeppe> fwereade__, Aram: using a single shared environs.Environ for the various workers, and having a separate process watching it and updating it when necessary.
<fwereade__> rogpeppe, +1
<Aram> lgtm
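A rough sketch of the arrangement just agreed: one goroutine applies configuration changes to a single shared environ, and every worker holds a reference to that same environ. All types here (Config, Environ, Worker, updateEnviron) are hypothetical stand-ins, not the real environs or worker packages.

```go
package main

import (
	"fmt"
	"sync"
)

// Config is a hypothetical stand-in for an environment configuration.
type Config map[string]string

// Environ is a hypothetical stand-in for environs.Environ. It is guarded by
// a mutex because every worker shares the same value.
type Environ struct {
	mu  sync.Mutex
	cfg Config
}

// SetConfig replaces the current configuration.
func (e *Environ) SetConfig(c Config) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.cfg = c
}

// Get returns one configuration attribute.
func (e *Environ) Get(key string) string {
	e.mu.Lock()
	defer e.mu.Unlock()
	return e.cfg[key]
}

// updateEnviron applies every change to the shared environ and closes done
// once the changes channel has been closed.
func updateEnviron(e *Environ, changes <-chan Config, done chan<- struct{}) {
	for c := range changes {
		e.SetConfig(c)
	}
	close(done)
}

// Worker is a hypothetical worker, such as a provisioner or firewaller.
type Worker struct {
	Name    string
	Environ *Environ
}

func main() {
	env := &Environ{}
	workers := []Worker{{"provisioner", env}, {"firewaller", env}}

	changes := make(chan Config)
	done := make(chan struct{})
	go updateEnviron(env, changes, done)

	changes <- Config{"region": "us-east-1"}
	close(changes)
	<-done

	for _, w := range workers {
		fmt.Println(w.Name, w.Environ.Get("region"))
	}
}
```

The workers no longer contain any watching code of their own, which is the reduction in moving parts mentioned above.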
<TheMue> Aram: i'm no friend of the reflection based unmarshal idea. ok, it may save a few lines, but we only talk about four internal docs, so your actual way seems fine to me.
<Aram> yes! the fix I dreamt last night was correct!
<Aram> dreaming fixes has happened before, but not in a long time.
<Aram> I tried it in the morning, and thought it didn't work, but my fix was wrong because the zero value of interface{} is nil, not 0.
<Aram> but what I have dreamt was correct :).
<fwereade__> Aram, nice :)
<Aram> "panic: interface conversion: interface is int, not mstate.Life"
<Aram> TheMue: ^ ^ this is the reason I want to use reflection, not because it produces less code.
<Aram> but because the code is necessarily correct and doesn't break when you change the data model.
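The panic quoted above can be reproduced in miniature, along with one reflection-based way to avoid it. This is a sketch only: Life here is a local type standing in for mstate.Life, and toLife is a hypothetical helper, not the unmarshalling function Aram has in mind.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// Life is a local stand-in for a named integer type such as mstate.Life.
type Life int

// toLife converts a loosely typed value into Life by inspecting its kind,
// rather than asserting one specific dynamic type. (Hypothetical helper.)
func toLife(v interface{}) (Life, error) {
	if v == nil {
		return 0, errors.New("no value stored")
	}
	rv := reflect.ValueOf(v)
	switch rv.Kind() {
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
		return Life(rv.Int()), nil
	}
	return 0, fmt.Errorf("cannot convert %T to Life", v)
}

func main() {
	var unset interface{}
	fmt.Println(unset == nil) // the zero value of interface{} is nil, not 0

	var raw interface{} = 1 // what a decoder might hand back
	_, ok := raw.(Life)     // direct assertion fails: the dynamic type is int
	fmt.Println(ok)

	l, err := toLife(raw)
	fmt.Println(l, err)
}
```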
<fwereade__> conversely, I just spent all frickin' morning delving deeeeep into the guts of the charm manager, trying to figure out what's going on, and I *just* realised it was a poorly merged test :(
<rogpeppe> fwereade__: my fault?
<fwereade__> rogpeppe, nope, 100% mine
<rogpeppe> fwereade__: my commiserations
<rogpeppe> fwereade__: i hate it when that happens
<Aram> YES! everything works.
<rogpeppe> fwereade__: the bright side is you usually come out with a better understanding of everything going on
<Aram> fixed the damn thing.
<rogpeppe> fwereade__: and probably made some fixes while you were doing it
<fwereade__> rogpeppe, sadly in this case all I have done is come up with a series of ever-more-batshit theories about how git works, and I fear they may pollute my brain for a while ;)
<rogpeppe> fwereade__: oh well
<Aram> TheMue: you might want to pull again, to see working watchers.
<Aram> tests pass now
 * fwereade__ cheers at Aram
<Aram> it's by far the most complex piece of machinery in mstate, sadly.
<Aram> but I am happy with it from many perspectives.
<TheMue> Aram: great
<TheMue> Aram: and yes, the server is a little monster ;)
<rogpeppe> Aram: which piece is this?
<Aram> rogpeppe: lp:~aramh/juju-core/62-mstate-watchers-mux
<rogpeppe> Aram: i will have a look :-)
<Aram> watchserver.go is the real thing, watcher.go is just a quick hack to be able to validate the design.
<rogpeppe> environ watchers ripped out of firewaller and provisioner. all tests pass (some new tests needed though). am happy.
<TheMue> Aram: btw, I'm just doing the EnvironConfig change for mstate
<Aram> TheMue: what change?
<TheMue> rogpeppe: great., i hope https://codereview.appspot.com/6488057/ will then still run too
<TheMue> Aram: state.EnvironConfig now returns an environs/config.Config and there's an additional SetEnvironConfig
<Aram> ok.
<TheMue> Aram: test is already working here
<rogpeppe> have we got a meeting now?
<davecheney> waiting on gustavo and mramm
<rogpeppe> davecheney: just a quick query
<mramm> I'm on my way
<niemeyer> Morning all
<niemeyer> Do we have invites sent yet?
<mramm> not yet
<mramm> that i see
<davecheney> rogpeppe: mmm
<mramm> inviting everybody now
<rogpeppe> davecheney: i decided that it would be a good thing if the provisioner didn't listen for environ config changes itself
<rogpeppe> davecheney: so something else listens and changes the environ; everyone uses that environ
<rogpeppe> davecheney: does that sound reasonable to you?
<davecheney> rogpeppe: what's the reason for the change ?
<rogpeppe> davecheney: i was going to have to have exactly the same code in three places
<davecheney> it sounds like my beloved proxy environ idea :)
<rogpeppe> davecheney: no need for a proxy i think
<mramm> https://plus.google.com/hangouts/_/4ca5fa037ece34849caa79686808bc9e75cfae19
<Aram> ah, meeting today?
<rogpeppe> Aram: yup
<mramm> https://launchpad.net/juju-core/+milestone/1.3
<rogpeppe> fwereade__: meeting?
<fwereade__> rogpeppe, whoops, ty
<davecheney> Aram: https://wiki.canonical.com/UbuntuEngineering/Sprints/JujuCoreIntegration
 * Aram goes to lunch
<TheMue> lunchtime
<niemeyer>                 case change, ok := <-fw.environWatcher.Changes():
<niemeyer>                         err := fw.environ.SetConfig(change)
<rogpeppe> niemeyer: http://paste.ubuntu.com/1177699/
* ChanServ changed the topic of #juju-dev to: Milestone goals: 1) mstate in, state out 2) uniter working with simple charms; 3) upgrade-juju working
<davecheney> night guys
<niemeyer> rogpeppe: http://pastebin.ubuntu.com/1177736/
<niemeyer> rogpeppe: fw.environ = environ
<niemeyer> rogpeppe: worker.SetEnvironConfig(fw.environ, config)
<niemeyer> rogpeppe: https://codereview.appspot.com/6488057/diff/1/worker/provisioner/provisioner_test.go
<niemeyer> rogpeppe: case fw.environTest <- fw.environ: // do nothing
<niemeyer> rogpeppe: someworker.SetEnvironWatcher
<niemeyer> rogpeppe: someworker.SetEnvironObserver
<niemeyer> rogpeppe: someworker.SetEnvironObserver(ch)
<niemeyer> rogpeppe: case fw.environObserver <- fw.environ: // do nothing
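The observer idea in the fragments above relies on a Go property: a send on a nil channel blocks forever, so a select case that sends on a nil observer is simply never chosen in production, while a test can supply a real channel. A minimal sketch, in which the Worker type and its fields are hypothetical and a string stands in for the environ:

```go
package main

import "fmt"

// Worker is a hypothetical worker holding an environ (a string here).
type Worker struct {
	environ  string
	changes  chan string
	observer chan<- string // nil in production, so the send case never fires
	done     chan struct{}
}

// loop applies changes and, if an observer is set, offers the current
// environ to it. It returns when changes is closed.
func (w *Worker) loop() {
	defer close(w.done)
	for {
		select {
		case c, ok := <-w.changes:
			if !ok {
				return
			}
			w.environ = c
		case w.observer <- w.environ:
			// do nothing: a test has observed the current environ
		}
	}
}

func main() {
	obs := make(chan string)
	w := &Worker{environ: "initial", changes: make(chan string), observer: obs, done: make(chan struct{})}
	go w.loop()

	fmt.Println(<-obs)
	w.changes <- "updated"
	fmt.Println(<-obs)

	close(w.changes)
	<-w.done
}
```

A test can therefore wait for the exact moment the worker has picked up a new environ, without sleeping for an arbitrary period.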
<rogpeppe> niemeyer: http://paste.ubuntu.com/1177767/
<rogpeppe> niemeyer: perhaps you could mention to TheMue the thoughts you had on the worker tests, re: the above provisioner_test.go CL.
<niemeyer> rogpeppe: Will do
<niemeyer> Will get some breakfast first, though!
<rogpeppe> niemeyer: are we concerned about what might happen if an upgrade is done concurrently with an environment setting change?
<niemeyer> rogpeppe: In which sense?
<rogpeppe> niemeyer: would it be acceptable for the upgrade request to be lost, for example?
<rogpeppe> (in that case)
<niemeyer> rogpeppe: Are you alluding to the race in SetEnvironConfig?
<rogpeppe> niemeyer: yeah.
<niemeyer> rogpeppe: Not a problem we have to worry about for a while, IMO
<rogpeppe> niemeyer: well, the race that happens when you must do: EnvironConfig followed by SetEnvironConfig.
<niemeyer> rogpeppe: yeah
<rogpeppe> niemeyer: ok, i thought so, but just wanted to check explicitly
<rogpeppe> niemeyer: thanks
<niemeyer> rogpeppe: We should fix that at some point, probably by having a method that enables applying deltas safely, but this is a nice problem for us to have when things actually work
<niemeyer> rogpeppe: In practice, fiddling with the env is rare enough that it won't be an issue anytime soon
<rogpeppe> niemeyer: definite agreement there. and fiddling with the env *concurrently* is even rarer.
<niemeyer> rogpeppe: Yeah
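The race being discussed is the classic lost update: two clients each read the whole configuration, change one attribute, and write the whole thing back. A sketch of the problem and of the "apply deltas" style of fix, using a hypothetical in-memory ConfigStore rather than the real state API:

```go
package main

import (
	"fmt"
	"sync"
)

// ConfigStore is a hypothetical in-memory stand-in for the stored
// environment configuration.
type ConfigStore struct {
	mu    sync.Mutex
	attrs map[string]string
}

// Get returns a copy of all attributes.
func (s *ConfigStore) Get() map[string]string {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make(map[string]string, len(s.attrs))
	for k, v := range s.attrs {
		out[k] = v
	}
	return out
}

// Set replaces all attributes, discarding whatever was there before.
func (s *ConfigStore) Set(attrs map[string]string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.attrs = make(map[string]string, len(attrs))
	for k, v := range attrs {
		s.attrs[k] = v
	}
}

// Apply merges only the given delta, so unrelated attributes survive.
func (s *ConfigStore) Apply(delta map[string]string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.attrs == nil {
		s.attrs = make(map[string]string)
	}
	for k, v := range delta {
		s.attrs[k] = v
	}
}

func main() {
	// Get followed by Set: the upgrade request is lost.
	racy := &ConfigStore{}
	upgrade, setting := racy.Get(), racy.Get()
	upgrade["agent-version"] = "1.3"
	setting["region"] = "us-east-1"
	racy.Set(upgrade)
	racy.Set(setting)
	fmt.Println(racy.Get())

	// Deltas: both changes survive regardless of order.
	safe := &ConfigStore{}
	safe.Apply(map[string]string{"agent-version": "1.3"})
	safe.Apply(map[string]string{"region": "us-east-1"})
	fmt.Println(safe.Get())
}
```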
<fwereade__> niemeyer, before I polish this properly, can I get another pre-review sanity check on the charm upgrades please? https://codereview.appspot.com/6488062
<niemeyer> fwereade__: Sure
<niemeyer> fwereade__: I'll just finish something and will be with you
<fwereade__> niemeyer, cheers
<niemeyer> Flights booked
<niemeyer> fwereade__: Haven't reviewed it entirely, but sent some pre-review comments
<fwereade__> niemeyer, thanks, that's what I was after -- and sorry to keep bugging you with this one
<niemeyer> fwereade__: It still looks like you're fighting with the awkwardness of having the charm manager handling responsibilities which ought to be elsewhere
<fwereade__> niemeyer, it seems excessive to me to have to keep *another* state file around to allow the uniter to keep track of what operation it's performing
<fwereade__> niemeyer, it feels to me that being able to determine restart state purely from charm state and hook state is a good thing
<fwereade__> niemeyer, charm state + hook state + uniter state feels wrong -- especially since IMO charm state and hook state are useful simplifying features
<fwereade__> niemeyer, but once we're dealing with uniter state as well they actually become redundant and make life more complex
<fwereade__> niemeyer, or have I misunderstood something?
<fwereade__> niemeyer, whereas just making sure that the charm op and the hook op overlap STM like the clearest way to ensure that the appropriate ops are followed by appropriate hooks
<niemeyer> // charm directory, it returns ErrConflict. cbSuccess will be called before
<niemeyer> // status is set from st back to Deployed, to give clients the opportunity
<niemeyer> // to complete dependent actions.
<niemeyer> fwereade__: This is a hack
<niemeyer> fwereade__: This is a charm.Manager..
<niemeyer> fwereade__: It has deployed the charm
<niemeyer> fwereade__: There are no "dependent actions"
<niemeyer> fwereade__: It has done it
<niemeyer> fwereade__: No conflicts, no problems.. it's finished its job
<niemeyer> fwereade__: It doesn't depend on anything else to say it has finished its job either
<niemeyer> fwereade__: We can talk about different ways in which we can solve that, but first thing is to understand what we're doing and why
<fwereade__> niemeyer, ok -- what I am trying to do is ensure that a charm operation is always followed by the appropriate hook
<fwereade__> niemeyer, ISTM that the two ways to do that are to overlap the operations, so the charm op ends after the hook op has begun; or to add *another* layer of state that's specific to the uniter, which feels redundant
<niemeyer> fwereade__: So what you mean is that you'd rather do the whole upgrade again than acknowledge the fact we can tell the merging has been done?
<fwereade__> niemeyer, er, I don't think that's what I'm saying at all
<niemeyer> fwereade__: That's what's in the code
<niemeyer> fwereade__: If cbSuccess fails, we never set the status to deployed, despite the fact it *is deployed*
<fwereade__> niemeyer, so what? same thing happens if writing "Deployed" fails
<niemeyer> <niemeyer> fwereade__: So what you mean is that you'd rather do the whole upgrade again than acknowledge the fact we can tell the merging has been done?
<fwereade__> niemeyer, are you telling me I should be parsing git logs or something?
<niemeyer> fwereade__: No.. I'm telling that we shouldn't have a cbSuccess that prevents us from acknowledging the merge has been done, for purposes that are completely external to the concerns of the Manager
<niemeyer> fwereade__: Divide and conquer.. the Manager is doing a lot
<niemeyer> fwereade__: We should have a pleasant API to work with, that makes sense by itself
<niemeyer> fwereade__: I don't buy that we should be breaking abstractions and injecting that kind of weirdness because we can save a file from being written to disk elsewhere
<fwereade__> niemeyer, I am trying, but I don't see the middle ground between what I suggest and dropping independent charm and hook state entirely
<fwereade__> niemeyer, if I've got a super-state that tells me how to interpret charm and hook state, doesn't it end up noticeably simpler to simply write every operation out in just one place? ...but then we lose many of the benefits of separating out /charm and /hook
<fwereade__> niemeyer, or...... hum, I think I see
<fwereade__> niemeyer, the uniter state is basically independent of hook state and exists purely to manage the transition from charm op to associated hook?
<niemeyer> fwereade__: Possibly. I'm not even entirely sure (yet) we need an extra state, to be honest
<fwereade__> niemeyer, ok -- the problem I am trying to avoid is: write-charm; set-deployed; whoops-we-crashed-before-run-hook
<niemeyer> fwereade__: If the uniter decides it needs to upgrade, what happens within the uniter itself?
<niemeyer> fwereade__: Yeah, I kind of see it.. I just don't have as good a view on the details as you do
<niemeyer> fwereade__: The basic semantics I imagine for the manager, in isolation is: write-charm-and-url, set-deployed
<niemeyer> fwereade__: If there is a crash, the uniter should have a mechanism that tells it was going to do a charm upgrade.. it will try to write the charm again, and the manager will go "Ah, don't worry, that's what I have"
<niemeyer> fwereade__: What is that mechanism?
<fwereade__> niemeyer, the mechanism ATM is charm.ReadState
<niemeyer> fwereade__: Okay, maybe we can still keep it like that
<niemeyer> fwereade__: If there's really nothing else we need in terms of uniter information to make that decision
<fwereade__> niemeyer, well... ok -- but the fact that the charm is deployed is not in itself enough for the uniter to know that it has completed all relevant operations (ie running a ollowup hook)
<niemeyer> fwereade__: Yeah, there's a missing state
<fwereade__> niemeyer, so it's uniter state? or are you thinking of something else?
<fwereade__> niemeyer, or a missing state within an existing set of states?
<niemeyer> fwereade__: No, it may be fine to do that within the manager if that makes it simple and doesn't disturb the manager itself
<niemeyer> fwereade__: E.g.
<niemeyer> fwereade__: Installing, Installed, InstalledAck, Upgrading, Upgraded, UpgradedAck
<fwereade__> niemeyer, ah, ok, so the Manager is responsible or writing most o those states but the Uniter writes the Ack states afterwards?
<niemeyer> fwereade__: yeah
<niemeyer> fwereade__: Or even, Ack()s :-)
<fwereade__> niemeyer, honestly that feels like a step back from what I originally proposed -- it smears the charm-state-writing responsibility into 2 places instead of one, and that eels to me like it's harder to ollow?
<fwereade__> hm, my F key appears to be flaky, sorry
<niemeyer> fwereade__: I'd be glad to understand why you think that's the case
<fwereade__> niemeyer, well, beforehand I had a single func (changeCharm) that did all the writing of charm state
<niemeyer> fwereade__: :-)
<niemeyer> fwereade__: I hope that's not what you're proposing now :)
<fwereade__> niemeyer, feels analogous to hook state, which while it's in 2 funcs is at least all in the same package
<niemeyer> fwereade__: (as a single function of 300 lines might look bad)
<fwereade__> niemeyer, I'm talking specifically about the charm state file
<niemeyer> fwereade__: I feel like part of the reason we're having so much trouble is precisely that file
<niemeyer> fwereade__: I'd like to kill the idea that we ReadState and WriteState
<niemeyer> fwereade__: We have a Manager, that manipulates a charm directory
<niemeyer> fwereade__: Anyone that is not the Manager should talk to the Manager to change its state
<fwereade__> niemeyer, ah! if we can move all that state-writing into uniter itself, I'd be entirely happy
<niemeyer> fwereade__: LOL
<niemeyer> fwereade__: I was thinking the opposite :-)
<niemeyer> fwereade__: But, tell me about that idea
<fwereade__> niemeyer, fuzzily, I think I'm thinking that the charm state is actually uniter operation state
<niemeyer> fwereade__: Ok
<niemeyer> fwereade__: Makes sense
<niemeyer> fwereade__: What would the Manager look like in such a world?
<fwereade__> niemeyer, I think the manager would be more of a Deployer
<fwereade__> niemeyer, ...except maybe not quite...
<fwereade__> niemeyer, aside: when I originally moved the charm dir out of the manager, it really did feel like a step forward, but I ended up not going anywhere very good from there
<niemeyer> fwereade__: Yeah, the problem is drawing the line at the right place.. you probably were onto a good direction in some sense, but the line got blurred somehow
<niemeyer> fwereade__: The idea of a deployer makes sense to me..
<niemeyer> fwereade__: The only state we need within the deployer is probably the URL that is actually on disk (what we have as readURL)
<fwereade__> niemeyer, yeah, exactly
<fwereade__> niemeyer, ok, so assuming we were able to separate out a charm dir, a deployer, and a charm-operation-state file
<niemeyer> fwereade__: In the uniter state, I suspect we'll still need extra states, though
<niemeyer> fwereade__: Which is some of the cause of the discussion above
<niemeyer> fwereade__: Since we need to track "need to deploy => deployed and need to run hook => run hook"
<fwereade__> niemeyer, I *think* that works OK though -- once we're deployed, yay, we're deployed -- but the state is still the uniter's responsibility to recover on unexpected abort
<niemeyer> fwereade__: Yep
<niemeyer> fwereade__: So State, ReadState and WriteState are all gone..?
<fwereade__> niemeyer, well, I think they are divorced from Manager anyway
<niemeyer> fwereade__: Yep
<fwereade__> niemeyer, they still feel pretty "charm"y
<niemeyer> fwereade__: Not to me
<niemeyer> fwereade__: Anymore than the unit is charmy, anyway
<niemeyer> fwereade__: "I need to run hook", as an example
<fwereade__> niemeyer, I can see it both ways
<niemeyer> fwereade__: Is pretty unity :)
<fwereade__> niemeyer, yeah, indeed :)
<fwereade__> niemeyer, ok, so, as a rough sketch: charm.Deployer + charm.GitDir; then some unexported uniter state manipulation in changeCharm around Deployer operations?
<fwereade__> niemeyer, the readURL thing starts to feel pretty redundant now
<niemeyer> fwereade__: +1 except for the last sentence
<niemeyer> fwereade__: The URL has to be put as part of the deployed content if we want to be able to easily tell what's in the directory
<niemeyer> fwereade__: The uniter state can't track what's *actually* there (after facing a revert, or merge, or whatever)
<fwereade__> niemeyer, hmm, I'll think about it a bit more... ah, ok, we could quite easily have a *missing* operation state file meaning "deployed, check the charm dir"
<fwereade__> niemeyer, ok, I think maybe I have direction again
<fwereade__> niemeyer, cheers
<niemeyer> fwereade__: I don't see that as something entirely neat.. if we have state, we can easily name it..
<niemeyer> fwereade__: Having the same URL both in the uniter state and in the charm is a great sign of things being right :)
<fwereade__> niemeyer, ok, I think my resistance may come from not being able to figure out how I should react if they don't match :)
<niemeyer> fwereade__: They serve different purposes, anyway
<niemeyer> fwereade__: Example scenario:
<niemeyer> fwereade__: We're running an upgrade from revision 10 to 11
<niemeyer> fwereade__: uniter state says we're in revision 10, trying to install 11
<niemeyer> fwereade__: we ask the deployer to upgrade, and get a conflict
<niemeyer> fwereade__: unit goes into error mode
<niemeyer> fwereade__: user logs into the machine.. fiddles around, reverts
<niemeyer> fwereade__: user calls resolved
<niemeyer> fwereade__: uniter catches up, learns that directory has revision 10.. doesn't call upgrade hook, goes back into working mode
<niemeyer> fwereade__: End of story point.
<fwereade__> niemeyer, feels like the next step is actually "goto10"
<niemeyer> fwereade__: ?
<fwereade__> niemeyer, uniter returns to working mode; detects an upgrade, and...
<niemeyer> fwereade__: That's another story
<niemeyer> fwereade__: At the end of the story above, it gets back onto the main loop, and can do anything it usually would
<niemeyer> fwereade__: Including upgrading again, sure.. perhaps not even to the same version.
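The revert-and-resolve story above can be sketched in Go. This is only an illustration under assumptions: opState, afterResolved and the charm URLs are hypothetical names, not juju-core code, and running the upgrade hook when the directory matches the target revision is an assumed behaviour that the conversation does not spell out.

```go
package main

import "fmt"

// opState is a hypothetical operation-state record. Following the
// discussion, it exists only while an upgrade is in flight (nil otherwise).
type opState struct {
	From string // charm URL the unit was running
	To   string // charm URL the upgrade is trying to install
}

// afterResolved sketches what the uniter could do once the user calls
// resolved. deployedURL is the URL read back from the charm directory.
func afterResolved(op *opState, deployedURL string) string {
	switch {
	case op == nil:
		return "no operation pending"
	case deployedURL == op.To:
		// Assumption: the conflict was fixed in favour of the new revision.
		return "run upgrade hook"
	case deployedURL == op.From:
		// The user reverted, as in the story above.
		return "skip upgrade hook, resume working mode"
	default:
		return "unexpected charm in directory"
	}
}

func main() {
	// Illustrative URLs only.
	op := &opState{From: "cs:precise/wordpress-10", To: "cs:precise/wordpress-11"}
	fmt.Println(afterResolved(op, "cs:precise/wordpress-10"))
	fmt.Println(afterResolved(op, "cs:precise/wordpress-11"))
}
```

The decision depends on reading the URL back from the directory itself, which is the point niemeyer is making: the operation state alone cannot say what is on disk after a revert.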
<rogpeppe> niemeyer: this might be something like we discussed: https://codereview.appspot.com/6501072
<fwereade__> niemeyer, I don't see the value of that story, because what *will* happen is another upgrade, and we can easily and automatically handle a revert-and-upgrade-to-new-version ourselves, even from an error state
<fwereade__> niemeyer, without a separate actually-don't-upgrade-this mechanism, I don't see why a user would ever need or want to revert
<niemeyer> fwereade__: The value of that story is that the only way to make it possible is by knowing what's actually inside the directory
<niemeyer> fwereade__: Ah, maybe that's the disagreement.. I totally see why people (or I :-) would like to resolve a conflict, and then revert back to the previous state
<niemeyer> fwereade__: Because what was done was wrong
<niemeyer> fwereade__: Even then, we're diving into too much detail here
<fwereade__> niemeyer, are we talking about one specific unit, or a whole service?
<niemeyer> fwereade__: The core of my argument is that there's value in knowing what a given revision of the charm directory actually contains
<niemeyer> fwereade__: I was hoping that this wouldn't be very controversial
<fwereade__> niemeyer, neither keep-the-url-in-the-charm-dir nor keep-the-url-in-a-state-file are individually controversial in my mind
<niemeyer> rogpeppe: Cheers, will look
<fwereade__> niemeyer, keeping *both* of them still makes me nervous
<fwereade__> niemeyer, because I'm not sure that we can do anything useful with disagreements
<niemeyer> fwereade__: Okay, I'm happy to have it just within the charm directory while we don't figure a reason to have it outside
<fwereade__> niemeyer, hence my contention that what we're after is *operation* state, not charm state, and we shouldn't have a URL in operation state when we're not operating
<niemeyer> fwereade__: I don't know how you'll manage to tell what version you're supposed to upgrade to before the upgrade actually happens, but that's part of the "while we don't figure a reason" part
<niemeyer> fwereade__: Okay, that makes sense
<fwereade__> niemeyer, if I go one step further and say we shouldn't have *any* operation state when we're not operating, are you still comfortable?
<fwereade__> niemeyer, blast, sorry, brb
<niemeyer> fwereade__: Yeah, I'd definitely not complain if you made that work nicely
<fwereade__> niemeyer, cool, I'll have another crack at it
<fwereade__> niemeyer, sorry this bit's taking so long
<rogpeppe> flight to lisbon booked
<niemeyer> fwereade__: No problem at all, I appreciated that conversation
<niemeyer> rogpeppe: Reviewed, good stuff
<rogpeppe> niemeyer: thanks
<fwereade__> niemeyer, a pleasure, as always :)
<rogpeppe> niemeyer: i thought about Errer, but couldn't quite bring myself - it just looks like a spelling mistake. still, if you think it's ok...
<niemeyer> rogpeppe: It crossed my mind, but it still feels better than Errorer (which I'm probably unable to pronounce :-)
<rogpeppe> fwereade__: o fellow english speaker: what do you think: Errer or Errorer ? (as a name for interface{Err() error})
<rogpeppe> fwereade__: or some alternative?
<fwereade__> rogpeppe, Errer is an ugly word, but also seems like a legitimate "that which errs", so I think I'd go with that
<Aram> rename Err to Error and use the builtin error interface?
<rogpeppe> Aram: that would be kinda cool but... do we want all watchers to be errors ?
<Aram> rogpeppe: ^
<rogpeppe> Aram: it would also be a far reaching change
<rogpeppe> Aram: tomb, for instance, is an external package that uses Err()
<niemeyer> rogpeppe, Aram: Nope, Err is the proper name for that method, and it's actually a standard convention
<niemeyer> rogpeppe, Aram: Err is seen in things that report an "error" (the interface); they are not errors themselves
<rogpeppe> niemeyer: +1 having looked in the go source
<Aram> white:pkg$ pwd
<Aram> /home/aram/go/src/pkg
<Aram> white:pkg$ csh '\.Err\(' | wc -l
<Aram> 21
<Aram> it's only used in database and exp/html
<niemeyer> Aram: Yeah, but it's an agreed-to convention, and I actually pushed the patch prior to Go 1 to make it consistent
<rogpeppe> Aram: yeah - it wouldn't be right anyway because Error (error's method) returns a string not an error
<rogpeppe> Aram: which we definitely don't want to do
<Aram> actually the very few .Err() in the go tree return error, not string.
<Aram> so while I agree that changing to error is not a good idea, calling three usages of an Err in 400,000 LOCs hardly counts as convention, especially since our Err has a different signature :)
<niemeyer> rogpeppe: https://codereview.appspot.com/6501069/ reviewed too
<rogpeppe> niemeyer: ta!
<niemeyer> rogpeppe: Thank you!
<rogpeppe> niemeyer: i put the numbering in because it was so painful counting through 13 tests to find which one had failed.
<rogpeppe> niemeyer: the alternative is to put the comment in the struct
<rogpeppe> niemeyer: which is probably better, come to think of it
<niemeyer> rogpeppe: Yes, there are multiple ways to solve that.. putting the comment as a summary in the struct is generally a great one
<rogpeppe> niemeyer: i'll go with that then
<niemeyer> rogpeppe: Thanks!
<niemeyer> Stepping out for lunch
<Aram> yay, I can use chan []map[string]interface{} as a map key
 * Aram wished Put wasn't near Undo in acme.
<rogpeppe> Aram: i don't get bitten by that very much, though i wish Put didn't appear under the mouse when you get to the end of a load of Redos...
<Aram> yeah, that too
<rogpeppe> Aram: i have never thought of a good way to make it better though
<rogpeppe> yay, ec2 is fixed again!
<niemeyer> Aram: Gosh, do we need that as a map key? :)
<Aram> yeah...
<niemeyer> Aram: Why?
<Aram> it's not that bad, it's only chan []genericDoc, genericDoc is map[string]interface{} because it represents any mongodb document (like bson.M). we need it because this is how client watchers register with the watch server.
<Aram> the code is actually quite clear, even if the key is complex.
<Aram> the key is what it is because it is the way the server sends events to clients.
<Aram> clients expect slices of documents.
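Why a channel works as a map key: Go channel values are comparable whatever their element type, so chan []genericDoc is a valid key even though slices and maps are not. The sketch below assumes a hypothetical registry type; only the genericDoc name comes from the discussion, and this is not the watcher code being described.

```go
package main

import "fmt"

// genericDoc mirrors the name used in the discussion: an arbitrary document.
type genericDoc map[string]interface{}

// registry is a hypothetical stand-in for the watch server's bookkeeping.
// Each client is identified by the channel it receives events on.
type registry struct {
	clients map[chan []genericDoc]string
}

func newRegistry() *registry {
	return &registry{clients: make(map[chan []genericDoc]string)}
}

// register records which collection a client channel is interested in.
func (r *registry) register(ch chan []genericDoc, collection string) {
	r.clients[ch] = collection
}

// unregister forgets a client channel.
func (r *registry) unregister(ch chan []genericDoc) {
	delete(r.clients, ch)
}

func main() {
	r := newRegistry()
	a := make(chan []genericDoc, 1)
	b := make(chan []genericDoc, 1)
	r.register(a, "machines")
	r.register(b, "units")
	r.unregister(a)
	fmt.Println(len(r.clients), r.clients[b])
}
```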
<niemeyer> Aram: Hmm.. that sounds a bit suspect
<niemeyer> Aram: Why do we have a genericDoc in lower level watcher?
<niemeyer> Aram: That wasn't in the design we talked about
<Aram> what is a lower level watcher?
<niemeyer> Aram: The design we talked about, which I thought you've been working on
<Aram> I have.
<niemeyer> Aram: Well, that wasn't there
<niemeyer> Aram: We don't work with genericDocs in mstate so far
<niemeyer> Aram: The fact we can deal with actual values instead is quite convenient
<niemeyer> Aram: I'm concerned that what we talked about was simpler than what seems to be taking to get it in place
<Aram> there is a single entity doing all the queries for all the watchers. watchers have different documents, it has to use a map to be able to get data before converting it to a document.
<niemeyer> Aram: That's not what is in the design
<niemeyer> https://gist.github.com/e1453bb97426081a12e5
<niemeyer> Aram: This is the design
<Aram> "A Watcher can watch any number of documents and collection" that implies arbitrary documents, hence maps instead of structs.
<niemeyer> Aram: You don't have to agree with it, and it's fine to go in a different direction, but when I go over the trouble of doing that after we stumbled upon several issues, I do expect feedback before we change gears
<niemeyer> Aram: Where's genericDoc in this design?
<Aram> the fact that you didn't mention genericDoc doesn't mean I can't use it, you didn't mention error or string or int either. genericDoc is just a name I gave for a map, I can change it to bson.M if you want.
<Aram> because it is essentially bson.M, not only in representation, but in concept as well.
<niemeyer> Aram: I don't care about the name.. I care about the fact I can't see a "generic map" or however you wanna call it in this design
<niemeyer> Aram: We simply do not *load* the document
<Aram> surely someone loads documents as the events delivered by watchers are full entities that have the full document inside them.
<niemeyer> Aram: I'm talking about the design in this page, not about whatever you're implementing on top of it
<niemeyer> Aram: This is a building block that we need, and agreed upon
<niemeyer> Aram: You seem to be talking about stuff that is not in this design, despite the fact we don't have this integrated or up for review anywhere
<niemeyer> Aram: We seem stuck despite all my efforts, and I'd like to understand why
 * rogpeppe loves it when 90 lines of new test code all pass after one minor initial glitch.
<rogpeppe> somehow the initial glitch makes it even better, because you have positive verification that the test is actually running.
<niemeyer> rogpeppe: True
<niemeyer> rogpeppe: When I get a straight pass I'll generally tweak things a bit to make sure it's *really* passing :)
<rogpeppe> niemeyer: i often leave a "this cannot pass" piece of code in the test just to make sure
<rogpeppe> niemeyer: easier when you've got an ErrorMatches test
<niemeyer> rogpeppe: +1, done that before as well
<rogpeppe> 5 minutes and 21 seconds to upload the tools that time.
<rogpeppe> (running live upgrading test BTW)
<rogpeppe> niemeyer: friday's last gasp: https://codereview.appspot.com/6490067/
<rogpeppe> niemeyer: it passes live tests!
<rogpeppe> niemeyer: and that should get us roughly back to where we were...
<rogpeppe> and with that, i'm off to practice drinking port in preparation for Lisbon
<rogpeppe> fwereade__, mramm, niemeyer: have a great weekend!
<niemeyer> rogpeppe: Superb!
<niemeyer> rogpeppe: Thanks so much
<niemeyer> rogpeppe: and have a brilliant weekend too
<mramm> rogpeppe: have a fantastic weekend, and get good and practiced up on the port drinking!
<rogpeppe> mramm: darn, it's running out. need more supplies!
<mramm> heading to a late lunch, back in a bit
<niemeyer> PASS: presence_test.go:148: PresenceSuite.TestScale     1.978s
<niemeyer> 1000 pingers starting, pinging, half being killed, all of them observed.. 2 seconds. Woohay.
#juju-dev 2013-08-26
<davecheney> hey, does anyone know if juju ssh 1/lxc/0 works ?
<davecheney> can you ssh directly to a container
<bigjools> https://bugs.launchpad.net/gwacl/+bug/1216744
<_mup_> Bug #1216744: Destroying role instances always leaves a VHD file behind <Go Windows Azure Client Library:Triaged> <https://launchpad.net/bugs/1216744>
<bigjools> jtv2: so I explained azure to davecheney, Ian and axw and there was only medium swearing
<davecheney> he was very restrained
<davecheney> gold star
<jtv2> bigjools: tell them they're naïve fools and won't know what hit them.
<jtv2> Joking.
<bigjools> to be fair most of it was from Ian and it's hard to tell the difference
<bigjools> shhh he's here
 * davecheney is trying to figure out why juju status resolves an instance id to an Instance twice to talk to the state server
<bigjools> I wondered why you had that look on your face
<davecheney> bigjools: no, that is just my face
<bigjools> jtv2: can you check the vhds container in whatever storage account you were using for azure - I think there's a bug as I have about 20 29.3GB blobs left behind.
<davecheney> bzzt
<axw> bigjools: http://paste.ubuntu.com/6027561/
<bigjools> axw: https://bugs.launchpad.net/juju-core/+bug/1216768
<_mup_> Bug #1216768: Azure provider: Authentication error when using public tools <juju-core:New> <https://launchpad.net/bugs/1216768>
<bigjools> axw: another nice bug https://bugs.launchpad.net/gwacl/+bug/1216744
<_mup_> Bug #1216744: Destroying role instances always leaves a VHD file behind <Go Windows Azure Client Library:Triaged> <https://launchpad.net/bugs/1216744>
<davecheney> question about lxc
<davecheney> are local containers addressable on the local lan ?
<davecheney> ie, do they DHCP from the host's network
<davecheney> or are they bound to 10.x addresses
<MACscr> Im trying to use juju with local (aka, LXC). I already have lxc setup, but i did have to change the bridging in order for LXC instances to be accessible from other computers on the LAN. So instead of lxcbr0, i just have br0. juju doesnt seem to like that. Does juju only work when the bridge is NAT'ed?
<davecheney> MACscr: i don't think nat'ing is the issue here
<davecheney> you've changed the name of the interface
<davecheney> juju expects it to be called lxcbr0
<davecheney> you've shortened it to br0
<davecheney> i think if you fix that
<davecheney> it'll work
<MACscr> every tutorial states to use br0 when setting up the public access
<davecheney> MACscr: sure
<MACscr> but i guess i can try renaming it
<davecheney> but juju is different :)
<MACscr> right, its a shame its hard coded
<davecheney> MACscr: yup, it is hard coded at the moment
<davecheney> _BUT_ it is hard coded to the default
<MACscr> im 99% sure this change isnt going to work
<MACscr> as the juju probably wants to be its own private network so it knows what IP's its assigning, etc
<davecheney> MACscr: you're probably right
<MACscr> with dhcp, the ip isnt being assigned until its booted, not when its created
<davecheney> environ.go
<davecheney> 39:const lxcBridgeName = "lxcbr0"
<davecheney> at the moment, the name of the bridge interface is hard coded
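A minimal sketch of how a hard-coded bridge name could fall back to a default while still being overridable. The "bridge-name" key and the bridgeName function are hypothetical and are not an existing juju setting; only the "lxcbr0" default comes from the constant quoted above.

```go
package main

import "fmt"

// defaultBridge matches the constant quoted from environ.go.
const defaultBridge = "lxcbr0"

// bridgeName returns the bridge to use. The "bridge-name" key is a
// hypothetical setting used only for this illustration.
func bridgeName(cfg map[string]string) string {
	if name := cfg["bridge-name"]; name != "" {
		return name
	}
	return defaultBridge
}

func main() {
	fmt.Println(bridgeName(nil))
	fmt.Println(bridgeName(map[string]string{"bridge-name": "br0"}))
}
```

As the rest of the conversation notes, the name is only the first obstacle: making the name configurable would not by itself make containers addressable from outside the host.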
<MACscr> amazing that a developer would do that
<MACscr> are there any developers in here that know if NAT has to be used with LXC on the actual host?
<davecheney> they did that because it is the default name of the LXC interface
<davecheney> ubuntu@ip-10-248-6-186:~$ ifconfig lxcbr0
<davecheney> lxcbr0    Link encap:Ethernet  HWaddr ee:6f:ad:c0:96:91
<davecheney> ^ this is a fresh lxc install
<davecheney> i agree that it isn't as flexible as one would like
<MACscr> yes, i know its default, but obviously its meant to be reconfigured
<davecheney> but it does sound like a reasonable default
<MACscr> but the name doesnt matter, the issue is that with juju needing it to be that way, i highly doubt anything it creates using the local method will allow any of those instances to be accessible outside the host
<davecheney> MACscr: i think you are right
<davecheney> i was incorrect there
<davecheney> i was focusing only on the first problem of getting the lxc provider working
<MACscr> can any developers concur with that statement of mine? That seems to be a pretty huge limitation of LXC and pretty much means it can be only used for limited testing
<jam> fwereade: ping for when you're around. I had a question about "statecmd".
<fwereade> jam, heyhey
<jam> fwereade: I trust your return to Malta was met with great fanfare and celebration?
<fwereade> jam, haha
<fwereade> jam, it's nice to be back
<jam> fwereade: so I updated "juju add-unit" to use the API instead of a direct state connection which seems good. rog mentions here https://codereview.appspot.com/13212043/ that we should probably get rid of statecmd (since it was only for sharing implementation between the API and the CLI).
<jam> which is fine
<jam> but I'm looking at AddServiceUnits
<jam> and it does weird things like
<jam> NewConnFromState
<jam> so it can do Conn.AddUnits
<jam> but why do we need conn?
<fwereade> jam, for AddUnits I'm 90% sure we don't any more *except* that Deploy does (or might...), and it uses AddUnits internally
<jam> fwereade: well the implementation of AddUnits is on Conn
<fwereade> jam, the only reason for the conn in the first place is to access provider storage
<jam> so it *feels* like if I'm getting rid of statecmd/addunit.go I should be moving Conn.AddUnit somewhere else
<jam> but where?
<fwereade> jam, api.Client, I think, is the ideal
<jam> fwereade: we have api.Client.AddServiceUnits which is what I'm using. It is backed in the apiserver which uses state/statecmd/addunit
<jam> which uses Conn.AddUnit
<jam> my point is that it seems really strange on the server side
<jam> to get a NewConnFromState in order to AddUnit on it.
<fwereade> jam, sure; moving AddUnit up to client and sharing that implementation between AddServiceUnit and Deploy would be just fine as far as I'm aware
<fwereade> jam, it's just fiddly because Deploy might need Environ access to upload a local charm
<fwereade> jam, but we should be exposing that capability over the api regardless
<jam> fwereade: conn.AddUnits isn't called by anyone except tests and statecmd as near as my grep-fu tells me
<jam> conn.DeployService can call conn.AddUnits
<jam> I don't particularly mind conn having AddUnits, it just seems strange to need Conn in the apiserver.
<jam> Am I just wrong on that?
<jam> hi rogpeppe
<jam> I thought today was "Summer Bank Holiday" for you
<fwereade> jam, no, I think Conn is really inappropriate there
<jam> I know mgz is out.
<MACscr> sorry to interrupt, but i had a dev question earlier. Did anyone read it? DaveChaney sent me here
<rogpeppe> jam: it is indeed the August bank holiday today
<rogpeppe> jam: i am not here :-)
<jam> rogpeppe: I started a chat with fwereade about removing statecmd, I would be interested to hear your thoughts if your ephemeral self wants to chime in.
<jam> MACscr: mgz is probably the best person to chat with, and he is off today for the UK holiday. He should be around by this time tomorrow.
<jam> MACscr: thumper would  be another person to ask, but he is in AU so is usually gone by now.
<rogpeppe> jam: briefly, perhaps
<fwereade> MACscr, the RelationList question sent to juju-dev? sorry, just saw that
<jam> I guess if you were chatting with davecheney then thumper is in a similar timezone.
<MACscr> okie dokie. Thanks for the tips
<rogpeppe> fwereade: what are your thoughts?
<jam> fwereade: the one in IRC from a bit ago is about LXC needing lxcbr0
<jam> rogpeppe: so I'm happy to remove statecmd, but it revealed some other ugliness.
<jam> namely, it is using NewConnFromState inside the API server to call Conn.AddUnits.
<fwereade> rogpeppe, I'm +1, the cli and the gui should be using the exact same implementation
<jam> And he and I seem to agree that Conn shouldn't be used inside the API Server.
<fwereade> rogpeppe, it's really all about the conn nastiness
<rogpeppe> jam: +1 to that
<rogpeppe> fwereade: yeah, we need to eliminate that
<fwereade> jam, I don't really see the need for Conn at all once we're api-only
<jam> rogpeppe: but it is unclear to me where to move Conn.AddUnits to avoid duplication.
<rogpeppe> fwereade: in particular we need a way to upload charms
<rogpeppe> jam: what else uses Conn.AddUnits?
<jam> rogpeppe: we do, but "not yet" given that AddUnits doesn't actually upload anything
<jam> rogpeppe: Conn.DeployService
<jam> and statecmd.AddServiceUnit
<jam> and some tests
<MACscr> fwereade: im not too concerned about the name, thats an easy fix. My concern is about LXC by default being NAT only and i have a feeling juju will only work that way. Thus anything it creates with LXC wont be accessible outside the host.
<jam> and the builddb charm
<rogpeppe> jam: all that will be inside state/api eventually, right?
<rogpeppe> jam: or can be, at any rate
<rogpeppe> jam: if we choose it to be
<jam> rogpeppe: that is sort of my point. *Where* does the functionality in Conn.AddUnits go.
<rogpeppe> jam: into state/api, no?
<jam> rogpeppe: why would it go on the client side?
<rogpeppe> jam: sorry, doh
<rogpeppe> jam: into state/apiserver/client
<jam> rogpeppe: how does the current conn.DeployService call into that/
<jam> ?
<rogpeppe> jam: well the current conn.DeployService should also end up in state/apiserver
<jam> rogpeppe: does Conn have an API connection?
<jam> rogpeppe: I'd rather not have to move-the-whole-world to get one bit to land :)
<fwereade> MACscr, ah, sorry, I was thinking of the wrong question then; and mgz is the expert, but the answer is basically that juju *will* be handling container addressability, but will only be creating containers if the cloud has addresses available to assign to them
<rogpeppe> jam: perhaps it's not appropriate to move Conn.AddUnits until we can do tat
<rogpeppe> that
<jam> MACscr, fwereade: I think this is about local provider
<jam> as in your personal machine
<jam> no "cloud provider" availabel
<jam> available.
<MACscr> fwereade: correct. Local
<rogpeppe> jam: no, Conn doesn't have an API connection.
<jam> MACscr: how are IP addresses assigned?
<rogpeppe> jam: the point is that probably all the stuff that's talking directly to state inside the juju package can eventually move into state/apiserver
<jam> MACscr: that is the key point that I can see. As we have to play some games with lxcbr0, and I don't know how that works with third-party DHCP assigning addresses
<fwereade> MACscr, the local provider is explicitly a dev tool; we're not currently planning to handle addressability from outside
<MACscr> the way i have LXC setup now, it gets the IP's assigned through DHCP from my router and the bridge setup is called br0. LXC isnt setup that way by default
<jam> rogpeppe: k, for now, I'll get rid of statecmd, leave in NewConnFromState and file a bug about APIServer should not depend on Conn. Sound good?
<jam> fwereade: ^^
<MACscr> fwereade: ah, thats very disappointing. Very unrealistic to use a physical machine or full VM just for a single service
<jam> MACscr: 'juju deploy --to" or the future container work handle both of those.
<rogpeppe> jam: i don't think you can get rid of statecmd completely yet, no? just gradually whittle it down as you implement the various CLI commands in the API
<jam> rogpeppe: no, just getting rid of statecmd/addunit.go
<jam> right
<jam> I overstated earlier.
<rogpeppe> jam: SGTM
<rogpeppe> jam: it's just good to have an idea of where we're heading
<fwereade> MACscr, would you explain your use case a bit more? do you have a cloud we don't support, that you want to run a local provider on?
 * rogpeppe fades away again
<MACscr> fwereade: I have a management server that i want to run some LXC instances on. Also, when i deploy openstack, i want to put multiple services like ceph-mon and nova, etc, all on the same 3 physical servers
<MACscr> the only servers that will be separate will be ceph-osd and the compute nodes
<MACscr> keystone, etc, as well will be on the 3 nodes i just mentioned
<MACscr> and probably some other openstack services i dont know yet =P
<fwereade> MACscr, ok, I agree that it's nicer and cleaner to have your service units running in LXC containers, but you can colocate already without containers; you just don't get the benefits of isolation
<MACscr> oh, i wanted to install the juju-gui on my management node, that was my first and original request/problem
<jam> MACscr: so for 13.10 we are hoping to natively support containers inside machines provided by another cloud (it already mostly works when MaaS is providing the machines). Which might sort this out in a better way. Then you can use "juju deploy --to" for all of the openstack bits you want on the same physical machine.
<MACscr> so it seems that when juju has to deploy the gui, it requires that it runs on a separate server (no idea why)
<fwereade> MACscr, we're actively working on manual provisioning, which would let you set up your containers as you prefer, but that isn't ready today
<fwereade> MACscr, not at all, but service units get fresh machines by default
<fwereade> MACscr, `juju deploy juju-gui --to 0` should put it right where you need it
<MACscr> so 0 would pretty much be localhost?
<fwereade> oh, damn, sorry, fwereade gets it now
<fwereade> MACscr, on the local provider, I think that would *probably* work but is not recommended
<fwereade> MACscr, if you're dealing with actual hardware would you consider using maas for your environment?
<MACscr> fwereade: huh? Why would i need two management nodes just for using juju-gui?
<MACscr> one to deploy it and another one its deployed on?
<TheMue> good morning, just back from doc, now start the week ;)
<fwereade> MACscr, the local provider is not going to work with more than one physical node in play
<fwereade> MACscr, maas would require a maas server on top of the nodes it deploys, I think, if that's what you're saying?
<MACscr> fwereade: I understand that. Local means local only, thus not accessible outside that host. Which then means i cant even use juju-gui unless i have two physical hosts if i want to access it on the lan
<fwereade> MACscr, tell me when I say something wrong:
<fwereade> MACscr, you would like to use the local environment on a single remote machine
<MACscr> yes
<fwereade> MACscr, and control it from outside that machine
<MACscr> yes
<MACscr> without having to do port forwarding
<fwereade> MACscr, ok, on the understanding that this is outside design parameters, you *might* be able to deploy to machine 0 -- which will be the physical machine running the local environment -- and have it work transparently
<jam> fwereade: I'm pretty sure local provider doesn't let you do machine 0, but the maas provider has reasonable LXC support (at least with thumper's pending patch)
<jam> as in, the LXC nodes end up on br0
<jam> like MACscr is trying today.
<MACscr> though i could settle at this point just being able to have juju-gui on my management node
<jam> MACscr: doesn't "juju deploy juju-gui --to 0" do that?
<MACscr> i have no idea, i havent tried it
<fwereade> jam, damn, how does it stop you? the proper way of stopping you doesn't seem to be there, but maybe it's been hacked in somewhere I didn't expect?
<jam> fwereade: I could be wrong. I know it wasn't intended to work. :)
<jam> MACscr: so you're talking about deploying openstack infrastructure. You sound like you want to use "local provider" but have it put stuff on things that aren't on the local machine, is that correct?
<MACscr> nope, doesnt work, it tries to install it to an LXC container and thus gives me the 'error: net: no such interface'
<jam> (as in, how are you going to deploy the openstack compute nodes, and the ceph stuff that you didn't want on the same physical machine?)
<MACscr> jam: sorry guys, im not trying to confuse anyone. Lets forget openstack and just go simple for now
<fwereade> jam, gaaah, ok, it won't work
<MACscr> i have a single server running LXC on it and they are all setup to get DHCP from the router and are using br0. lxcbr0 doesnt exist as that was the bridge that was just used for private nating.
<MACscr> all i want to do is get juju-gui to work so i can deploy openstack to my physical servers and also setup a few amazon instances
<jam> MACscr: but what is juju-gui interfacing with? What are you using it to deploy?
<MACscr> id like to start simple and just get the amazon instances working, but i do want to use the gui to do that so i can start testing it
<jam> the amazon instances?
<fwereade> MACscr, then surely you can bootstrap on amazon and deploy the gui to machine 0 there?
<MACscr> jam: I guess that would be just amazon to start?
<jam> it sounds like you are mixing providers (some local, some amazon, maybe some openstack), which isn't something we support today.
<MACscr> fwereade: no way, i dont want to use up a amazon instance just for the gui. thats just plain stupid
<fwereade> MACscr, juju will not run a single environment across separate clouds
<jam> MACscr: juju bootstrap (creates the controller node), juju deploy juju-gui --to 0 puts the gui on the same node.
<MACscr> fwereade: i never said anything about a single environment. The amazon instances are for my dns servers and I guess maas would just be for openstack
<fwereade> MACscr, the juju gui currently administers a single environment
<MACscr> so i would have to deploy two of them?
<MACscr> or is that even possible
<MACscr> i guess i just dont understand why whatever system im running the juju commands from isnt the actual controller
<MACscr> thus why the need to setup a different one
<MACscr> for just the gui
<fwereade> MACscr, if you're using an actual cloud, you can/should deploy it to the controller node
<fwereade> MACscr, so if you want to run stuff on amazon, bootstrap there and deploy to 0 and you're fine
<jam> fwereade: with MaaS, could we run the juju bootstrap node on the MaaS controller node?
<jam> that's probably more an allenap question.
<fwereade> jam, I'm not sure about that but I wouldn't expect a maas controller to "provision" itself
<jam> fwereade: it seems really useful in the "small scale allow me to get stuff up and running with 3 machines" case.
<MACscr> jam: um, it makes sense for efficiency purposes
<fwereade> MACscr, if you have hardware, you should be able to use MAAS, but that does currently demand an additional node to control MAAS as far as I am aware
<MACscr> why in the world would the controller not be the same system that you are running the juju commands on?
<fwereade> MACscr, your particular use case *will* be satisfied when we have manual provisioning working; that's in progress
<jam> MACscr: you might be running juju commands from your laptop, controlling Amazon nodes.
<jam> the Amazon nodes need to be able to talk to the controller node
<jam> to coordinate activities.
<MACscr> jam: yes, so if im running it from my laptop, its the controller, this should have the gui on it
<jam> MACscr: running 'juju' is not the controller. The "controller" is a jujud process that is running for the other nodes to talk to.
<jam> amazon nodes wouldn't have routing back to your laptop (in the general case)
<MACscr> ok, now thats making sense, but wouldnt that only be needed when its initially setup? I mean, after they are deployed, the controller wouldnt need to be up anymore would it?
<MACscr> so i dont see why the amazon nodes wouldnt be able to talk to the laptop
<jam> MACscr: as you add units, put more stuff out, they trigger hooks being run on each machine
<jam> coordinated by the controller.
<jam> MACscr: my machines are all behind NAT and not directly accessible from a public space like Amazon.
<jam> And it certainly wouldn't stay at the same IP address as I move around.
<fwereade> MACscr, manual provisioning is in progress (aside: that *would* allow you to add aws machines to an environment, assuming everything *is* mutually addressable)
<jam> MACscr: juju doesn't just provision machines at one point in time and then disappear. It continually monitors the system to notice when machines leave, when new nodes get added and they need to be connected to the other nodes, etc.
<fwereade> MACscr, and will become useful to you in its second phase, in which we support manual provisioning of an environment controller (rather than just adding machines to an existing environment)
<MACscr> fwereade: ok, its starting to make more sense now. Not that i agree with its design, but im starting to understand it
<fwereade> MACscr, ;p
<MACscr> ok, so a controller that is setup can only manage one type of cloud at a time?
<MACscr> so if i deployed the gui to an lxc instance, it would only be able to deploy lxc instances?
<jam> MACscr: that is correct (at present).
<MACscr> and if wanted to deploy an amazon guests, i have to setup an amazon controller too?
<fwereade> MACscr, today, yes, but assuming your controller is accessible to ec2 instances, manual provisioning will allow you to include those in another environment
<MACscr> so even if i only want to deploy and manage 4 amazon instances, i actually have to have 5 of them? one for the controller?
<MACscr> well i dont really care what the future holds. I need to figure things out with currently available features =P
<MACscr> hope that didnt sound offensive, wasnt my intent
<fwereade> MACscr, if you're using the complete capacity of those nodes and *don't* want to run the juju controller on one of them then, yes, you'll need another node
<fwereade> MACscr, np, all I heard was "I am frustrated"
<fwereade> MACscr, otherwise you can bootstrap on amazon, giving you machine 0, and manually deploy some of your units onto that machine
<MACscr> yeah, its just frustrating when you though you found a tool that was going to be perfect for your particular need and then reality hits and find out its not as cool as you though it was
<MACscr> thought*
<MACscr> lol, i forgot the T twice
<MACscr> its 5am, im expected to make mistakes =P
<fwereade> MACscr, no worries, and I'm sorry to disappoint you
<fwereade> MACscr, was I clear about aws?
<MACscr> fwereade: yeah. I think so and I really appreciate both you and jam's time to help explain things to me
<fwereade> MACscr, for deploying openstack, we do recommend running on maas and that -- being an entirely separate thing -- does require its own hardware
<MACscr> fwereade: hmm, so i couldnt install maas on the management node i have now?
<MACscr> heck, maybe i could install that in an LXC guest on my management node?
<MACscr> im trying to limit what i install on the actual management host OS
<fwereade> MACscr, you might be able to install it on your laptop, but maas provisions bare metal from scratch, and it won't be able to do that to itself
<fwereade> MACscr, assuming everything's addressable, though, I don't see why you wouldn't be able to run maas locally and use it to provision your 3 physical machines
<fwereade> MACscr, that said, if you have anything running on those machines that you want to keep, don't do that
<fwereade> MACscr, maas will overwrite those machines with the ubuntu series juju asks for
<MACscr> sry, my mac got angry and i had to reboot. It has been about 3.5 weeks since the last one
<MACscr> fwereade: all machines or just ones i specify? I am assuming thats done through mac addresses or something like that?
<fwereade> MACscr, you register individual machines with the maas controller and it's then free to have its way with them
<fwereade> MACscr, just installing a maas controller shouldn't cause havoc in itself ;)
<MACscr> ok, but just to verify, the maas server can be its own controller. Correct?
<MACscr> so when it asks for the controller ip, i can just do 127.0.0.1?
<TheMue> fwereade: ping
<fwereade> TheMue, pong
<TheMue> fwereade: I'm just doing the changes for string options to be also "", while others like int options react with an error. we talked about it.
<fwereade> MACscr, sorry, didn't see: I'm afraid I don't know, #maas would be the place for someone who does
<fwereade> TheMue, ok
<MACscr> thanks again for your time
<TheMue> fwereade: but here I've seen that this also influences the reading of a config.yaml
<TheMue> fwereade: today "" here is the same as null
<MACscr> neat to see at least maas planned it right by allowing the gui to run on the same node as the controller =P
<TheMue> fwereade: but with the change "" would be treated as string while it is inappropriate for int, float etc
<TheMue> fwereade: what do you think about it?
<fwereade> TheMue, I don't think "" is a valid float -- AFAIK the "" handling was new to juju-core, so I think we're safe pulling it out
<TheMue> fwereade: got aware of it in ConfigSuite and there TestParseSettingsYAML
<TheMue> fwereade: that's good news. ;)
<fwereade> TheMue, except, hmm -- check the python
<TheMue> fwereade: argh
<TheMue> fwereade: I hoped you would not say this :D
<fwereade> TheMue, sorry :(
<TheMue> fwereade: hehe, np, alsready looking
<TheMue> already
 * fwereade thinks a moment
<fwereade> TheMue, even if it does work in python, I am sorely tempted to classify that as a bug
<TheMue> fwereade: after our discussion last week I would agree
<TheMue> fwereade: but I'll look a bit more to be sure if we have to note this in the release notes
<fwereade> TheMue, thanks
<TheMue> jam: https://codereview.appspot.com/12752044/
<natefinch> jam: do we have Juju icons and images I can use for the installer?
<jam> natefinch: I don't explicitly know of them, but I would look in the documentation branch at "lp:juju-core/docs"
<jam> natefinch: looking at it, though, juju.ubuntu.com doesn't seem to show the Juju logo (the inverted and upright juju symbol)
<jam> you can see it at https://jujucharms.com/
<natefinch> Yeah, that
<natefinch> that's unfortunate
<jam> natefinch: when he's around, maybe poke gary poster
<natefinch> charm store does though
<jam> gary_poster: ^^ do you have nice Juju icons?
<jam> fwereade: it seems like state/api.Client should really be an interface. I realize there is the "you should consume interfaces but return concrete types argument", but it seems really hard to break Client up into 50 one-function interfaces.
<jam> fwereade: bonus points, though, APIConn already has Close() which just calls self.State.Close()
<fwereade> jam, I'd be fine with that so long as the testing overlaps properly :)
<fwereade> jam, cool
<jam> fwereade: one bit I'm not super happy with NewAPIConn is the DialOpts. We currently pass it through a bunch of layers, but I was hoping to hide it for the CLI.
<jam> Is it ok to just use api.DefaultDialOpts ?
<fwereade> jam, I'm ok with that
<gary_poster> oh hey jam.  sorry, didn't see you.  Yeah I have svgs somewhere.  Is that what you need?
<gary_poster> jam, I was about to ask you if you mind if I share your email about the CLI API work.  That looks like exactly what I was hoping for, so far.
<jam> gary_poster: svgs sound good for nate.
<gary_poster> ack, looking
<jam> gary_poster: for the API stuff, I have one more step I'm putting together today, (hopefully)  but yeah you're welcome to share it.
<gary_poster> thanks jam!
<natefinch> gary_poster: an .ico file with multiple sizes would be ideal along with the svgs
<natefinch> gary_poster: I'm working on the windows installer for juju and want to make it look perty
<jam> natefinch: well, you can create one if you get an svg, right? That is what "scalable" is all about.
<natefinch> jam: yes, but if he already has one, that saves me some time wrestling with gimp :)
<gary_poster> natefinch, ack. :-) I have a favicon.ico but that's it, and that's probably not what you need, yeah?
<natefinch> gary_poster: I was looking at that one, it's likely just 16x16... I can make a multisize one from the SVG, that's ok.
<gary_poster> cool.  yeah it is 16x16
<jam> fwereade: *sad face*, APIConn.Close() exists, but api.Client().Close() does not. Time to start writing code.
<TheMue> back again
<TheMue> fwereade: could you repeat your last question here? i'm not sure if i got you right acoustically.
<fwereade> jam, there is a slight worry there in that Client is unique among api types in being legitimately allowed to close the underlying conn -- I'm starting to think maybe just returning an api.State is the best thing
<jam> fwereade: api.Client is a pretty unique beast, though. :)
<fwereade> jam, yeah, if it's commented it's probably fine
<jam> fwereade: so *today* conn.Environ from NewConnFromName is used for "juju status" to provide conn.Environ.Name(), "environment.go" to report "conn.Environ.Provider()", and "upgradejuju" (presumably to upload the tools so that it can ask to be used?)
<jam> I'm surprised not to see deployer in there.
<fwereade> jam, BTW, I don't necessarily expect delivery, but consideration would be worthwhile: the client API is entirely unbulky at the moment, and it would probably be good to make it more so
<jam> status also directly calls "fetchAllInstances".
<jam> fwereade: more focused on bulk-operations? Or have more extraneous stuff that is unused? :)
<jam> (define bulky)
 * fwereade deadpans "the former"
<jam> fwereade: you had commented on TheMue's unset changes, can you give it a look: https://codereview.appspot.com/12752044/ he seems to have responded to your requests
<fwereade> jam, AFAICS the deploy command uses NewConnFromName
<jam> fwereade: sure, but it uses conn.PutCharm which might touch self.Environ. It doesn't call conn.Environ stuff directly.
<jam> vs status et al
<jam> but a fair point. Conn itself is the object that is touching Environ
<jam> and we need to be careful when porting commands across.
<fwereade> jam, the whole Conn thing is kinda crack anyway because it's got an Environ with a config that's most likely wrong
<fwereade> jam, anyone actually *using* conn.Environ is playing with fire
<fwereade> jam, PutCharm is probably mostly ok
<jam> fwereade: well, it is one of those "this needs to be exposed in the api so we can use it there instead".
<fwereade> jam, because all it uses is the creds which don't change much, and the control-bucket, which is immutable
<jam> fwereade: well status accesses the Name, which isn't something we expose elsewhere.
<fwereade> jam, but the important point is that in general you cannot trust conn.Environ and would therefore, in an ideal world, never use it
<fwereade> jam, if you want to know something, get it from the API ;)
<jam> fwereade: do we have Client.Name() ?
<jam> or Client.EnvironmentName() ?
<jam> (eventually we want to put something like that into a Client.Status() sort of output.)
<jam> x
<fwereade> jam, we have EnvironmentInfo
<fwereade> jam, sure, but I expected status would be composed server-side and sent down in one go regardless
<jam> right, that was my Client.Status() as a single call
<fwereade> jam, cool
<jam> fwereade: https://codereview.appspot.com/12943045/ NewAPIClientFromName instead of NewAPIConnFromName if you want to give it a look
<fwereade> jam, cheers
<jam> fwereade: it would appear that api.State.Close is not re-entrant :( (It gives an error already closed if you call it again)
<jam> we had a test that Conn.Close() could be called multiple times
<jam> which I copied
<jam> but then found the test wasn't actually checking the error return
<jam> adding that to Conn.Close() didn't give any errors
<jam> but for APIClient.Close() which just calls c.State.Close() underneath returns errors...
<jam> fwereade: *should* Close be idempotent?
<fwereade> jam, it's somewhat useful for tests -- defer an assert that something closes without error -- but I guess it's at the point where errors aren't really useful, so logging and moving on at a lower level is probably reasonable
<jam> gary_poster: I poked the notes you shared with me a little bit. We changed one of the function names, etc.
<jam> fwereade: I'm fine with it returning an error. I'm just asking if "already closed" should be an error?
<jam> It isn't for state.State.Close() it *is* for api.State.Close()
<fwereade> jam, ah, I see
<fwereade> jam, er, dunno
<fwereade> jam, you're implementing the client, though... make it work however seems to make most sense there ;)
<jam> fwereade: it is being passed up from rpc.Conn.Close() where it checks "if conn.closing: return errors.New("already closed")".
<jam> I guess I can wait for roger to figure out what his thought was.
<jam> and since it is an untyped error, I'm not a big fan of detecting it in api.State.Close() and suppressing it: "if err.Error() == "already closed": return nil"
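The objection to matching on err.Error() can be illustrated with a small Go sketch. This is only an illustration under stated assumptions: ErrAlreadyClosed, lowLevelClose and closeQuietly are hypothetical names invented here, not identifiers from juju-core's rpc or api packages. It shows how an exported sentinel error lets a higher layer suppress exactly one condition without comparing strings.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrAlreadyClosed is a hypothetical sentinel error. The real rpc package
// discussed in the log returned an untyped errors.New value instead.
var ErrAlreadyClosed = errors.New("already closed")

// lowLevelClose stands in for a low-level Close call. The closed argument
// simulates whether the connection had been closed before this call.
func lowLevelClose(closed bool) error {
	if closed {
		return ErrAlreadyClosed
	}
	return nil
}

// closeQuietly suppresses only the sentinel and passes every other error
// through unchanged, so no string comparison is needed.
func closeQuietly(closed bool) error {
	err := lowLevelClose(closed)
	if errors.Is(err, ErrAlreadyClosed) {
		return nil
	}
	return err
}

func main() {
	fmt.Println(lowLevelClose(true)) // the raw sentinel error
	fmt.Println(closeQuietly(true))  // suppressed by the wrapper
}
```

Whether the condition should be suppressed at all is the open question in the conversation; the sketch only shows that a sentinel makes the choice cheap to implement at any layer.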
<gary_poster> perfect, thanks for updating the doc, jam.  I hoped that might happen, actually. :-)
<natefinch> jam: do we sign our installers to prevent the "publisher: unknown" dialog box from popping up during install?  If so, I'll need a canonical certificate I can use with the installer.
<TheMue> fwereade: so, went through the whole stuff in pyjuju from all sides and i'm pretty sure we can make the change. interestingly i found something like "set foo=@filename" where the value is read out of the file
<fwereade> TheMue, yeah, I noticed that the other day, I have no intention of implementing it unless we have to
<TheMue> fwereade: so far nobody seems to miss it ;)
<fwereade> TheMue, exactly :)
<sidnei> fwereade: oh, was about to file a bug on that friday :)
<sidnei> but just because we expected the functionality to be there from pyjuju
<fwereade> sidnei, oh bother
<fwereade> sidnei, if it's used, please file a bug
<sidnei> people that didn't know it exists just used juju set foo="$(<filename)"
<sidnei> which is slightly more annoying
<fwereade> sidnei, yeah, fair point -- my thinking was just that if nobody had noticed yet, hopefully nobody would ever notice ;)
<sidnei> can't tell for sure how many people were aware of it from pyjuju land
<sidnei> https://bugs.launchpad.net/juju-core/+bug/1216967
<_mup_> Bug #1216967: Missing @ syntax for including file config <juju-core:New> <https://launchpad.net/bugs/1216967>
<fwereade> sidnei, thanks
 * fwereade bbl
<jam> can anyone help me figure out why this gc.DeepEquals is failing?
<jam> http://paste.ubuntu.com/6029469/
<jam> is it different integer types in 'value' ?
<jam> value in Config is "int64{0}" which is supposedly getting sent over the wire, but coming back over the API it is probably going via JSON and ending up a float?
<jam> How much do we need the type to be matched exactly?
<jam> I'm guessing the test used to work because it talked directly to state and BSON saved the type
<jam> but in going via the API and JSON, it now ends up as a float64
<jam> *sigh*, confirmed
<natefinch> jam: I'm sad that it defaults to float and not integer when there's no decimal part
<jam> natefinch: json numbers are floats
<jam> I'm a bit surprised it doesn't print 0.0 with a . to make it obviously not an int.
<natefinch> numbers in javascript are floats. Numbers in json are text... in theory there's no reason why you couldn't parse 5555 into an int and not into a float... though I admit that's not really standard
<natefinch> it actually never occurred to me that numbers in json should be restricted to floats. Seems like an unnecessary restriction just because javascript doesn't have integers
<natefinch> floats make me sad. They
<natefinch> there's just too much funny business at values far from zero
<niemeyer> natefinch: You can easily do that with Go's json package
<niemeyer> natefinch: http://golang.org/pkg/encoding/json/#Decoder.UseNumber
<gary_poster> hey hazmat and mramm.  rick_h has discovered that upcoming chrome 30 does not approve of the websocket headers that pyjuju produces.  https://bugs.launchpad.net/juju-gui/+bug/1217011 .  Juju Core headers are fine.  First answer to http://stackoverflow.com/questions/11300694/chrome-20-websocket-handshake might describe cause, but this is preliminary.  Do we care
<_mup_> Bug #1217011: error connecting to rapi ws from chrome dev channel (30) <juju:New> <juju-gui:Triaged> <https://launchpad.net/bugs/1217011>
<gary_poster> ?
<natefinch> niemeyer: I'm actually not 100% sure what jam was talking about... but I'll forward that on to him, seems like that would be a good idea
<gary_poster> Remediation looks like it would be in pyJuju
<gary_poster> Risk would be that chrome 30 will come out within 2-4 weeks, and break GUI against pyjuju
<jam> natefinch: the issue is that we've lost the type information after round tripping. If you "fix" it to return an int, I can just use float64() on that param instead. I'm guessing round-tripping isn't important at that level, but it *could* be.
<natefinch> jam: yeah, it's kind of one of those json things where you have to know what you put in and how to interpret it on the way out
<natefinch> jam: it's specifically not self-describing
<hazmat> gary_poster, i'll have a look
<gary_poster> thanks hazmat
<thumper> morning
<natefinch> MOrning
<thumper> mramm: ping?
<mramm> thumper: pong
<thumper> call shortly?
<mramm> thumper: sure
<hazmat> gary_poster, rick_h fix pushed
<gary_poster> awesome thanks hazmat!
<rick_h> hazmat: awesome, to both rapi/pyjuju? Will there be a release of the old juju?
<hazmat> rick_h, the ws stuff only existed on the rapi branch, the update was applied there. no plans for new releases of pyjuju.
<hazmat> the 0.7 release branch is going to be in saucy but deprecated, primarily exists to support existing envs.
<rick_h> hazmat: right, but I duped the error talking to pyjuju on canonistack with the dev browser. Users could hit this in the next month or so as the version updates.
<hazmat> rick_h, the gui charm pulls from the branch to deploy its ws endpoint independent of the underlying pyjuju version
<hazmat> rick_h, ie. try deploying the gui now to reproduce, should work
<hazmat> rick_h, the issue will remain against existing gui instances unless they're updated
<rick_h> hazmat: ah ok. I had thought that the juju-gui WS endpoint was the juju node and rapi was only for development purposes.
<thumper> morning axw, DarrenS
 * thumper sighs
<thumper> tab complete fail
<axw> morning thumper
<axw> feeling better?
<thumper> morning davecheney
<thumper> yeah
<axw> cool.
<thumper> going through email tasks this morning
<axw> now I feel like shit :)  haven't slept well in a few nights
<thumper> :(
<thumper> that sucks
<thumper> I take lots of vit-c and multivitamins
<axw> always happens when I go away
<thumper> especially when on sprints
<axw> learnt about azure yesterday
<axw> and now I know what simplestreams is used for
<axw> and had lots of chats with davecheney
<axw> so otherwise winning
<thumper> cool
<davecheney> <neo>I know simplestreams</neo>
<davecheney> i also know crsn
<thumper> heh
<thumper> where's wally?
<davecheney> thumper: he's here
<thumper> axw: please poke ian for me
<axw> thumper: he just stepped out
<davecheney> thumper: he's asking the hotel for a new internet code
<axw> back now
<thumper> davecheney: yeah, I know, but I couldn't help that one
<wallyworld_> thumper: you rang?
<thumper> wallyworld_: I did, see PM
<wallyworld_> thumper: can you resend, my internet got disconnected after the code ran out
<thumper> wallyworld_: the PM?
<wallyworld_> yeah
<thumper> the email should be fine :0
<wallyworld_> ah email
<thumper> got that?
<thumper> wallyworld_: or the pm this time
<bigjools> *snort* <davecheney> i also know crsn
<davecheney> too soo ?
<davecheney> soon
#juju-dev 2013-08-27
<axw> thumper: what's your stance  on things like ctx/txn?
<thumper> axw: I strongly prefer what uncle Bob calls "Use Pronounceable Names"
<thumper> personally I use context
<thumper> and prefer transaction
<axw> thumper: ok. my reasoning is that, for things like ctx, the translation to context is automatic (for me)
<axw> so I pronounce it "context"
<thumper> but the point is, you need to translate
<thumper> there is a cognitive process there
<thumper> that takes ctx, and makes context
<axw> not consciously
<thumper> but it is there
<thumper> when I see it, I still see "see tee ex"
<davecheney> sure, so is translating source code into your native language
<davecheney> i think it's a very minimal cost
<davecheney> and we're trained to do it very effectively
<thumper> davecheney: but writing code is all about people reading it
<axw> if it's not the same for everyone that's fair enough
<thumper> axw: I agree that some don't have that
<thumper> but do you have a problem reading "context" as a variable?
<thumper> davecheney: computers are easier to train
<axw> thumper: only when it means the source line > 80 chars, or becomes a series of ugly lines
 * davecheney is sad that we can't apply common sense here
<thumper> axw: I agree, in which case the logic should be simplified
<thumper> computers are good at optimising
<thumper> davecheney: common sense isn't as common as we'd hope
<thumper> axw: I remember writing some code once, it was c++
<thumper> where I managed to use a single for_each function
<axw> thumper: btw I'm not advocating ctx necessarily, just playing devil's advocate
<thumper> in the end though, I rewrote it with a normal for loop that took two or three more lines
<thumper> because it was more readable
<thumper> one of my favourite programming mantras is:
<thumper> just because you can, doesn't mean you should
<thumper> however, smart C++ programmers would have been able to see what my extremely clever function was doing
<thumper> but it wasn't obvious
<thumper> I'd go for obvious over clever every time
<thumper> however, I feel that there are some on our team that prefer clever
<axw> thumper: agreed. I used to fall into the clever code trap, but try hard to steer away from it now. It doesn't scale well in teams, and I tend to forget what my code is meant to do
<thumper> agreed
 * axw remembers in horror at his former colleagues' naming conventions
<axw> or lack thereof
<axw> basically all the variables would be the first letter of the variable's compound word type
<axw> var atp AggregateTableProvider
<axw> fun times
<thumper> haha
<axw> bbl, lunch
<bigjools> thumper: txn reads perfectly well! :)
<bigjools> thumper: however when I see ctx my mind pronounces it as "cuntox"
<thumper> heh
<bigjools> blame big kev ...
<thumper> well, everyone is different
<bigjools> yeah
<bigjools> which is precisely why you are right
<thumper> the idea is to code for the majority
<kvt> poking through the state pkg, and noticing some odd items: a couple of the docs have counts, like serviceDoc has relation count, relations have unitcount.. is that just an optimization vs querying those?
<thumper> kvt: probably
<davecheney> bigjools: ping
<bigjools> davecheney: hail
 * thumper read that as 'fail'
<thumper> perhaps I have too much on my mind
<bigjools> understandable
<thumper> jam: I've added Dubai to my clock city list so I can know what time it is there.
<jam> thumper: :)
<thumper> jam: so you are 8 hours out from me
<thumper> jam: got time for a quick hangout?
<jam> thumper: right. I'm right at the end of your day, I believe. though it depends whether I wake up and poke around before officially "starting".
<jam> sure
 * thumper starts one
<thumper> https://plus.google.com/hangouts/_/f16d2fd28ebc5923a5ceb1a2e1ffc57f5b152404?hl=en
<thumper> jam: ^^
<jam> thumper: trying to connect, it rings on my phone where I don't want to connect
<jam> and isn't loading on the desktop where I do
<thumper> haha :(
<thumper> hmm..
<thumper> jam: did you want to set one up
<thumper> ?
<jam> let me make sure I'm signed into all of my accounts quickly
<jam> thumper: I'll call you back, it is just getting stuck at "joining video call".
<thumper> kk
<jam> calling you
<jam> https://plus.google.com/hangouts/_/f16d2fd28ebc5923a5ceb1a2e1ffc57f5b152404
<jam> thumper: ^^
<jam> I really don't prefer the "calling" style of hangout, vs just creating one and having people join it.
<thumper> night people
<davecheney> bigjools: 16:19 < kurt_> davecheney: do you know this error? error: cannot create bootstrap state file: gomaasapi: got error back from server: 400 BAD REQUEST
<davecheney> any ideas ?
<bigjools> davecheney: yes, maas was not accepting empty files.  There's an SRU about to go out for it
<bigjools> if desperate, use the daily PPA
<davecheney> ok, will tell 'em
<axw> bigjools: did you see my comment on the gwacl/vhd bug?
<bigjools> axw: from this morning?
<bigjools> if so yes, and I replied
<axw> oh
<axw> (yes)
<axw> bigjools: thanks. didn't get an update - I thought it would auto-subscribe me
<davecheney> axw: hang on, you are upset because LP _didn't_ spam you ?
<davecheney> you'll quickly change your tune
<bigjools> haha
<bigjools> most of the spam I get is because stupid people stupidly do stupid team nesting in teams that have stupid subscriptions
<bigjools> this is why I left ~uju as soon as I could
<bigjools> ~juju even
<davecheney> yes, that guy in china who subscribes me to bugs with the intel chipset
<davecheney> WTF
<jam> allenap: poke about https://code.launchpad.net/~allenap/juju-core/makefile-stuff/+merge/181113  I commented on it, but haven't heard a response from you yet.
<jam> axw: is there a reason https://code.launchpad.net/~axwalk/juju-core/testing-mgo-nounixsocket/+merge/181218 isn't landed?
<axw> jam: nope, I just forgot to. thanks - I'll do it now
<jam> axw: I was guessing so. I'm just going through activereviews and cleaning it out.
<axw> jam: looks like the bot's /tmp is full of gocheck dirs
<axw> are you able to clean that?
<jam> axw: if you're able to see it, you're able to clean it, but I'll do it
<axw> I can see it because my MP failed
<axw> tests failed, because /tmp/gocheck-blah exists
<axw> jam: gocheck doesn't set a seed for rand when it generates the temp dirs
<rogpeppe> mornin' all
<jam> morning rogpeppe
<rogpeppe> jam: hiya
<jam> rogpeppe: I have a couple more CLI API patches up that would appreciate a quick review.
<rogpeppe> jam: ok, will have a look
<jam> https://codereview.appspot.com/12744052/
<jam> and
<jam> https://codereview.appspot.com/12744053/
<rogpeppe> jam: BTW we can work around the int64 vs float64 thing if we want to, but given that the API is accessed from javascript, I'm not entirely sure we should.
<jam> rogpeppe: right. My concern is if someone out there is getting their settings as YAML and then parsing it into a struct that requires an int.
<jam> We do have a "type" object in that struct, though.
<jam> So the other option is that GetConfig/SetConfig start understanding the actual content of those messages
<jam> but I'd like to just punt on all that :)
<jam> rogpeppe: gustavo mentioned we can change the decode function to "UseNumber" but I'm not sure what that actually changes? If it doesn't have a decimal you get an int64?
<jam> It uses arbitrary precision integers?
<rogpeppe> jam: you can change it to get any type you want - one possibility is just to use a string.
<jam> rogpeppe: I tracked it down, it is "json.Number" which is actually a string, and then you "Number.Float64()"
<jam> it still wouldn't let us use DeepEquals easily
<jam> but it would let us get what we want later on.
<jam> But if we have to do that step
<jam> we can int64(floatValue)
<TheMue> jam: thx for review. setting an option to nil means unsetting it, so that the default is valid. and setting an option to "" will not be translated to nil anymore (as before). so for string options "" is a valid value while it's an error for other ones (like in pyjuju).
<jam> and it just matters when you have > 40bits of precision in your int74
<jam> int
<jam> TheMue: right that is where I thought we were going: Allow "" to be a valid setting, and use nil to indicate you want a default value. So your patch fits that just fine.
<rogpeppe> jam: exactly
<rogpeppe> jam: and that might actually be a problem
<TheMue> jam: yep
<jam> rogpeppe: you mean losing precision on ints?
<rogpeppe> jam: yes
<rogpeppe> jam: we might be causing subtle breakage in existing charm usage
<jam> rogpeppe: float64 has exact precision up to ~52 bits so it only matters if someone enters 2**61+1 that they care about that +1
<rogpeppe> jam: yes
<rogpeppe> jam: but that still might be important
<rogpeppe> jam: for instance someone might be storing bytes in there or something
<jam> rogpeppe: don't store bytes in a number type
<jam> I think that is a fine statement
<jam> we have strings
<rogpeppe> jam: indeed
<jam> and base64
<rogpeppe> jam: but it's still a possible issue
<jam> rogpeppe: I would be willing to take a patch that changed "juju set" to notice that you might lose precision and warn you
<jam> rogpeppe: I'm happy to deal with it when someone shows it is an actual problem.
<rogpeppe> jam: tbh i think it's ok to just say we only cope with 52 bits of precision
<jam> rogpeppe: I feel the same way.
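The precision limit being discussed can be checked directly. An IEEE 754 double has a 53-bit significand (52 stored bits plus an implicit leading one), so integers up to 2**53 survive a trip through float64 and larger ones may not. The survivesFloat64 helper below is a hypothetical name used only for this sketch.

```go
package main

import "fmt"

// survivesFloat64 reports whether n is unchanged after being converted to
// float64 and back, which is what a JSON round trip effectively does.
func survivesFloat64(n int64) bool {
	return int64(float64(n)) == n
}

func main() {
	fmt.Println(survivesFloat64(1 << 53))   // true: exactly representable
	fmt.Println(survivesFloat64(1<<53 + 1)) // false: rounds to 2**53
	fmt.Println(survivesFloat64(1<<61 + 1)) // false: the +1 is lost
}
```

A check of this kind is roughly what a "juju set" precision warning, as floated above, would need to perform on integer values.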
<rogpeppe> jam: in fact, we *could* deprecate the "int" type entirely
<jam> which is why I JFDI with float()
<rogpeppe> jam: +1
<rogpeppe> jam: those two CLs reference the same MP
<rogpeppe> jam: i think i just reviewed the one without the prereq
<rogpeppe> jam: so it probably counts as a review of both :-)
<jam> rogpeppe: yeah, I meant https://codereview.appspot.com/12943045/
<jam> rogpeppe: so for Client object. What fwereade_ really wanted is to avoid exposing "Environ" to the CLI code as much as we can (because we shouldn't need stuff like provider creds for almost everything we do).
<rogpeppe> jam: yeah, that seems reasonable
<jam> rogpeppe: beyond that, there isn't much that api.State provides CLI above "api.State.Client()".
<rogpeppe> jam: agreed
<jam> so I'd like to start with restricting it to just Client, and see if we can make it with that
<rogpeppe> jam: i think that sounds fine too
<jam> but I'm happy to fix the comment
<jam> rogpeppe: ah, back to you. It turns out that multiple Close of api.State actually isn't ok
<jam> underneath api.State.Close() just calls rpc.Conn.Close()
<jam> which returns: errors.New("already closing")
<jam> rogpeppe: should multiple Close calls be an error?
<rogpeppe> jam: it's not clear
<jam> rogpeppe: we had a test that state.State.Close() can be called multiple times (though like I copied, it didn't actually check the err return :)
<rogpeppe> jam: on network connections, i think the second close will return an error
<jam> you can add that, and it passes
<jam> it fails for api.State.Close()
<rogpeppe> jam: we could make rpc.Conn.Close return nil if it's already closing
<jam> rogpeppe: that would be my preference, but you wrote it and I wanted to check what your reasoning was
<rogpeppe> jam: i think i was principally following the original logic of net/rpc
<rogpeppe> jam: which returns an error in that case
<rogpeppe> jam: (if in doubt i followed net/rpc's conventions)
<jam> rogpeppe: right: http://golang.org/src/pkg/net/rpc/client.go?s=7213:7248#L262
<jam> though it returns a typed error
<jam> which we could suppress at a higher level if we wanted.
<rogpeppe> jam: that's true. but it's a typed error that actually means two possible things
<rogpeppe> jam: 1) you just closed it; 2) the server shut down on you
<jam> rogpeppe: sure, though if I'm calling Close() and that happened to be shutdown by the remote side, that doesn't sound like an error
<jam> so I don't know if you need to distinguish between the two cases in this path
<jam> in *my* opinion, Close() is one of those idempotent functions that you can call multiple times to make sure resources were cleaned up
<jam> rogpeppe: if you agree, I'm happy to change rpc.Conn and assert it higher in the stack.
<rogpeppe> jam: I think that sounds reasonable (presumably you mean "lower in the stack"?)
<jam> rogpeppe: well I mean assert it across the stack
<jam> so we assert api.Client.Close and APIConn.Close() and api.State.Close and rpc.Conn.Close()
<jam> They happen to be implemented in terms of each other, but it is nice to have the statement "Close() is idempotent at the various levels"
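One way the idempotent Close described above could look, as a sketch only. The conn type and its release hook are hypothetical stand-ins and do not reflect how juju's rpc.Conn is actually written. The first call frees the resource; later calls do nothing and return nil.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical connection wrapper, not juju's rpc.Conn.
type conn struct {
	mu      sync.Mutex
	closed  bool
	release func() error // hypothetical hook that frees the underlying resource
}

// Close frees the resource on the first call. Repeated calls are
// harmless: they skip the release step and return nil.
func (c *conn) Close() error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.closed {
		return nil
	}
	c.closed = true
	return c.release()
}

func main() {
	released := 0
	c := &conn{release: func() error { released++; return nil }}

	first := c.Close()
	second := c.Close()
	fmt.Println("first:", first, "second:", second, "released:", released)
}
```

A test asserting this at each layer only needs to call Close twice and check that both results are nil.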
<rogpeppe> jam: what's APIConn ?
<jam> rogpeppe: juju.APIConn
<rogpeppe> jam: oh doh!
<jam> Contains an api.State and api.Environ
<jam> rogpeppe: mirroring juju.Conn
<rogpeppe> jam: yes, that all seems reasonable
<jam> but *probably* going away
<jam> I didn't want to be ripping it out at this point, but we have api.Client() where functionality that needs a place like juju.Conn can go, there doesn't seem much point in having both, and we *don't* want most CLI things to need an object that has Environ on it.
<rogpeppe> jam: yes, i think it will probably go away, though there are one or two lingering bits
<jam> rogpeppe: hopefully most of it involves moving things into the API :)
<rogpeppe> jam: yeah, i'm very much +1 on putting everything behind api.Client
<rogpeppe> jam: it wasn't possible before, because we couldn't mix logic that involved both Environ and State in a State method.
<jam> rogpeppe: out of curiosity, why is rpc.Conn defined in "server.go" rather than "client.go" ?
<rogpeppe> jam: it could be either
<rogpeppe> jam: it's both the server and the client side
<rogpeppe> jam: it could have been in a separate file, i guess, but i don't think it's a big issue
<rogpeppe> mgz: morning!
<mgz> qmorning!
<jam> rogpeppe: no, it just wasn't where I was looking for it when dealing with connection on the client side. And the word "Conn" exists all over the place in our source code. :)
<jam> mgz: morning
<rogpeppe> jam: you should really use better tooling :-)
<jam> rogpeppe: we should use better names
<rogpeppe> jam: rpc.Conn seems just fine to me
<rogpeppe> jam: it's unambiguous
<jam> rogpeppe: except for net/rpc/Conn
<rogpeppe> jam: they're both connections
<jam> rogpeppe: but they both use "rpc.Conn" and are entirely different code.
<jam> even if they are "similar"
<rogpeppe> jam: ah, sorry, thought you were referring to net.Conn vs rpc.Conn
<jam> rogpeppe: there may not actually be a net/rpc/Conn, but punning 'rpc' with a stdlib package is also not great naming.
<rogpeppe> jam: in a large s/w environment, there are always going to be duplicate names. the package name spacing means that it's possible to reliably and quickly find the actually definition for any given name
<rogpeppe> jam: and there are tools that integrate with your editor that makes that easy to do
<rogpeppe> jam: (assuming you use emacs or vim :-])
<rogpeppe> s/actually/actual/
<rogpeppe> jam: do you think that all our names should be globally unique?
<jam> rogpeppe: I'm not sure about globally unique, but avoiding repeated 10 times would be nice
<jam> eg State
<jam> Conn
<jam> and Client
<rogpeppe> jam: i think that that's actually a benefit - those names are well known concepts, and they're unambiguous when used in context (rpc.Conn vs net.Conn; uniter.State vs machiner.State, etc)
<rogpeppe> jam: i guess it comes down to the "no needless repetition" thing
<rogpeppe> jam: which is a conventional approach in Go
<rogpeppe> jam: but does have some down sides, as you point out
<jam> rogpeppe: I *do* like not having var foobar foo.FooBar  = new foo.FooBar(fooing the bars)
<rogpeppe> jam: but you prefer foo.FooBar to foo.Bar ?
<jam> rogpeppe: I like it being really easy to translate "foo := SomethingReturningAFoo()" into what Foo that is returning.
<jam> and using "tools that are already installed on my system" rather than having to dig up "nonstandard but useful tools" from 3rd parties.
<rogpeppe> jam: presumably if it's foo.NewBar(), that's fairly obvious
<jam> rogpeppe: except it is "rpc.Conn" and that is actually net/rpc/Conn or foo/bar/baz/rpc/Conn
<jam> rogpeppe: so I have to look up what "foo" is actually being imported here.
<rogpeppe> jam: it's true that package identifiers aren't unique (we use "testing" many times, for example)
<jam> rogpeppe: and it is a source of confusion occasionally.
<rogpeppe> jam: yeah, it can be. but hopefully in situations where ambiguity might be a problem, we rename the identifiers appropriately.
<rogpeppe> jam: we only use net/rpc in one place in our tree, for example
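A small illustration of renaming package identifiers at the import site when two packages share a name. It uses the standard library's two rand packages as stand-ins, since a juju rpc package cannot be imported here; the local names crand and mrand and both helper functions are arbitrary choices for this sketch.

```go
package main

import (
	crand "crypto/rand"
	"fmt"
	mrand "math/rand"
)

// randomToken uses the cryptographic package under its local name crand.
func randomToken(n int) ([]byte, error) {
	buf := make([]byte, n)
	if _, err := crand.Read(buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// jitter uses the pseudo-random package under its local name mrand.
// The same seed always yields the same value.
func jitter(seed int64, max int) int {
	return mrand.New(mrand.NewSource(seed)).Intn(max)
}

func main() {
	tok, err := randomToken(8)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("token has %d bytes, jitter is %d\n", len(tok), jitter(1, 10))
}
```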
<jam> rogpeppe: so there is "it is unambiguous when you've paged in enough context" and there is "it is unambiguous when you read 3 lines of code"
<rogpeppe> jam: personally, i think it's reasonable to assume some contextual knowledge. but that might be just me.
<rogpeppe> jam: and i have to say that when browsing code i'm not familiar with, i use godef a lot
<jam> rogpeppe: in a team of 10+ having people look at code that they aren't as familiar with it is *useful* albeit not *required* to minimize the need for contextual knowledge.
<jam> rogpeppe: it is a goal (generally) to code in such a way that it is understandable to people who aren't as intelligent as myself (especially since *I* will not remember everything in 1 month either :)
<rogpeppe> jam: i totally agree with that
<rogpeppe> jam: but i don't think that the redundant naming helps greatly there
<rogpeppe> jam: in general, i find external Go code (using similar naming conventions to our current ones) is quite easy to browse and understand
<rogpeppe> jam: but again, i do use godef (and godoc.org) a lot
<jam> rogpeppe: one thing I would *really* like for godoc, is if you could click the Bar in "func Something() Bar" and have it take you to the definition.
<rogpeppe> jam: i think it does that, doesn't it?
<rogpeppe> jam: yeah, it does
<rogpeppe> jam: it linkifies all type names
<jam> rogpeppe: not in the one-line overview, and the "func Something" doesn't give you the arguments or the return type, but if you keep drilling down until you get to the grey-box with more stuff that does eventually get there.
<rogpeppe> jam: i might be looking at a different godoc.org to you
<rogpeppe> jam: in the one i'm looking at, it's two clicks
<rogpeppe> jam: (one to click on the one line overview, which takes you to the actual description, then another to click on the type name)
<jam> rogpeppe: and a bit of visual searching to find the right thing, but with practice not too bad. I'm not sure if it used to do that, or if the specific layout of the page made it hard for *me* to see it.
<rogpeppe> jam: it's quite possible. and the site does change - it might not have been so good in the past.
<jam> I think I would also like to see the overview of functions for a Struct when you go to the structs definition (like there is in the top level index).
<jam> AFAIK there isn't a way to jump to the place in the index
<jam> and the index mixes
<jam> things that return Foo
<jam> with methods on Foo itself
<jam> eg, http://godoc.org/net/rpc has Client
<jam> which has functions returning a Client
<jam> and functions on Client
<jam> but no concise "these are the methods on Client"
<rogpeppe> jam: yeah, it tries to classify factory funcs
<jam> I don't mind them being nearby
<jam> but it would be useful to have a simple "here are the methods" section
<rogpeppe> jam: that's the overview, i guess
<jam> in my head it would exist here: http://godoc.org/net/rpc#Client
<jam> rogpeppe: they aren't visually distinct from the "here are factory functions"
<jam> they are slightly different
<rogpeppe> jam: yeah, you have to look for the (client *Client)
<rogpeppe> jam: if you can think of a nice way of visually distinguishing them, i'm sure they'd be open to suggestions
<rogpeppe> jam: it's not something that had occurred to me
<jam> rogpeppe: Are all of the tests for "state/api/*" code over in state/apiserver?
<jam> I don't see any _test.go files in state/api
<jam> which is.. a bit surprising
<jam> (I realize you sort of need apiserver in order to test api code, but I would expect to find api tests in the api package)
<jam> rogpeppe: which might be why I put the APIClient.Close test in juju/*
<jam> (I'm only *just now* remembering that I saw some api.Client tests in apiserver a few days ago)
<jam> rogpeppe: as in, I don't have an immediately obvious place to put new tests that api.State.Close() and api.Client.Close() are idempotent.
<jam> rogpeppe: so how do you find where there would be tests for api.State?
<jam> rogpeppe: having a unique name instead of State
<jam> would make that pretty obvious (i would think)
<jam> so references of X not definitions of X
<jam> and some of those references might be in the package itself
<rogpeppe> jam: my original scheme for testing API functionality was to put all tests in the client.
<jam> rogpeppe: there are no tests that I see in state/api/*, and there is no state/api/client/ directory
<jam> rogpeppe: is it just that they were moved elsewhere?
<jam> rogpeppe: also, no tests for state/api.State object that I can immediately find
 * rogpeppe goes to look
<rogpeppe> jam: the tests that are now in state/apiserver/client did originally live in state/api, i believe
<rogpeppe> jam: yeah, that seems to be the case. they should really be in state/api/client, i think
<jam> rogpeppe: I'll file a techdebt bug, but not fix it right now.
<rogpeppe> jam: sgtm
<rogpeppe> jam: FWIW, i still think that testing all API stuff through the client makes sense, as there's so little actual logic in the RPC layer.
<rogpeppe> jam: and the way things are, we've got vast swathes of almost-but-not-quite identical tests at each layer
<rogpeppe> jam: i guess the main reason we can't do that is that we decided to expose stuff in the API server that we don't make available in the client.
<jam> rogpeppe: which I've already run into, though it was in the "and this isn't tested" category AFAICT
<jam> AddUnit supported ToMachineSpec but that wasn't exposed in api.Client
<rogpeppe> jam: ha, yes
<rogpeppe> jam: i have difficulty keeping track of what our test coverage is in this area
<rogpeppe> jam: because the tests are so overlapping
<jam> rogpeppe: so where do you think tests about api.State are today? (where should I add a test that calling api.State.Close() multiple times doesn't give an error)
<rogpeppe> jam: i think that tests on api.State should live with the other tests on api.State.
<rogpeppe> jam: it seems that that's currently (but wrongly) in state/apiserver/client
<rogpeppe> jam: so i'd add it to those, to be moved over at some later date
<fwereade_> rogpeppe, I was taking a casual look at Prepare... there's a 3.5 second sleep in jujud/machine.go
<rogpeppe> fwereade_: yes, i've removed that since.
<rogpeppe> fwereade_: but not re-proposed the branch
<rogpeppe> fwereade_: (that was me trying to reproduce a saw-it-once-only test failure)
<fwereade_> rogpeppe, :)
<fwereade_> rogpeppe, seems fine to me, go ahead and land it, if anything turns out to be a problem for the followup work we'll find out soon enough
<rogpeppe> fwereade_: i'm pretty sure it was nothing to do with my changes
<jam> fwereade_: did you have anything to add to the float64() vs int64() discussion about the  ServiceGet() api?
<fwereade_> jam, I'm fine accepting the json precision restrictions (well, I don't like it, but I don't think it's worth the effort of fixing)
<jam> fwereade_: well *JSON* supports arbitrary precision integers AFAICT
<jam> golang json.Unmarshal defaults to float64 though you can tweak it to return "type Number string"
<jam> but that would make testing it harder rather than easier AFAICT
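A sketch of the two decoding behaviours mentioned above, using only encoding/json from the standard library. The function names are hypothetical. By default a JSON number decoded into interface{} becomes a float64; with Decoder.UseNumber it arrives as a json.Number, which is a string type that can be parsed as an int64 afterwards.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// decodeDefault decodes a JSON number into interface{}, which
// encoding/json represents as a float64.
func decodeDefault(doc string) (float64, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(doc), &v); err != nil {
		return 0, err
	}
	f, ok := v.(float64)
	if !ok {
		return 0, fmt.Errorf("expected a number, got %T", v)
	}
	return f, nil
}

// decodeWithNumber asks the decoder to keep numbers as json.Number
// and then parses the digits as an int64.
func decodeWithNumber(doc string) (int64, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	dec.UseNumber()
	var v interface{}
	if err := dec.Decode(&v); err != nil {
		return 0, err
	}
	n, ok := v.(json.Number)
	if !ok {
		return 0, fmt.Errorf("expected a number, got %T", v)
	}
	return n.Int64()
}

func main() {
	const doc = "9007199254740993" // 2^53 + 1

	f, err := decodeDefault(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	n, err := decodeWithNumber(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("as float64: %.0f\n", f)
	fmt.Printf("as int64:   %d\n", n)
}
```

The cost is the one noted above: callers and tests then have to handle json.Number instead of a plain numeric type.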
<fwereade_> jam, shall we just call it a bug and move on?
<fwereade_> jam, unless you judge it simple enough to make it Right rightnow
<jam> fwereade_: my personal feeling is that 'juju get' will return similar-enough types that it probably isn't a problem, and I'm willing to deal with it when charms themselves really do need it
<fwereade_> jam, seconded
<jam> fwereade_: and I would say the test should be in state/api, except there are 0 tests in there :)
<jam> which is what I was just chatting about with roger
<fwereade_> jam, but let's document it as a bug regardless so it's easier to hunt it down and figure it out when it *does* happen
<rogpeppe> jam: presumably we have a potential issue even if we do use int64 because python would (presumably) use arbitrary-precision ints when needed
<jam> fwereade_: bug #1217282
<_mup_> Bug #1217282: api.Client tests should be in state/api not state/apiserver/client/ <tech-debt> <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1217282>
<fwereade_> jam, oh bollocks, they're all in apiserver
<jam> fwereade_: as yet I haven't actually found direct tests of api.State either.
<jam> rog mentions they might be in state/apiserver/client/* but I haven't actually found a suite that is focused on the api.State object.
<fwereade_> jam, oh yay
<natefinch> jam: I sent an email that got blocked by the message size limit on juju-dev.... the autoreply I got said it was awaiting moderator approval... are there actual moderators that might approve it, or should I just recreate it?
<jam> natefinch: I am not one of the moderators. Gustavo might be, but it is probably easier to just resubmit.
<jam> I believe you can reject the existing message yourself.
<natefinch> jam: yeah.... bah.  40k limit? What is this, 1998? :)
<jam> morning evilnickveitch
<jam> I don't remember you having the "evil" goatee, though.
<evilnickveitch> jam, hey! My facial hair is in a constant state of flux to confuse my enemies
<TheMue> fwereade_: in https://codereview.appspot.com/12752044/diff/8001/cmd/juju/get_test.go you made a comment regarding "default" in "juju get"
<rogpeppe> jam: i'd like to see tests in state/api focused on the functionality that's in that package. you're right, tests for api.State itself seem to have vanished. i'm pretty sure i wrote some originally.
<TheMue> fwereade_: could you explain your thoughts a bit more?
<TheMue> fwereade_: i haven't touched the logic of get here, only moved. but i could handle it in a third CL.
<rogpeppe> has anyone seen this when testing?
<rogpeppe> PANIC: machine_test.go:58: MachineSuite.TearDownTest
<rogpeppe> ... Panic: unauthorized db:presence ns:presence.presence.beings lock type:0 client:127.0.0.1 (PC=0x414321)
<fwereade_> TheMue, saying "the value is nil, and that's the default" is redundant/meaningless because a nil value is only possible when there's no default set
<fwereade_> TheMue, so a trivial CL stripping "default: true" out of settings where the value is nil would be cool
<TheMue> fwereade_: ok, will do
<fwereade_> bbiab
<jam> natefinch: you have https://code.launchpad.net/~natefinch/juju-core/005-azure-address/+merge/181117 still pending though it has my approval. just poking for you to think about landing it.
<natefinch> jam: thanks, meant to talk to you about that. It's LGTM pending live tests, but the values aren't actually used anywhere in the branches, so I sorta can't live test
<jam> natefinch: I meant "test against Azure"
<davecheney> natefinch: nice work in the win32 installer
<jam> as in actually running it live
<davecheney> i hope you get some feedback
<davecheney> but given the self selection population in the channel
<davecheney> please don't take it too hard if you don't get a lot of feedback
<natefinch> davecheney: my message got caught by the size limit on the mailing list, so no one but mods have seen it I suspect :)
<natefinch> jam:  I guess I can insert some temporary code to call instance.Addresses[]... my point was that nothing calls that method in the branch I'm on
<natefinch> at least as far as I could tell
<davecheney> natefinch: i saw it
<davecheney> i am not a nod
<davecheney> mod
<natefinch> davecheney: oh, ok, so maybe it was let through and there wasn't a message back to me about it... and since I don't think I get my own messages from the list... makes it hard to tell what's going on.
<davecheney> natefinch: you do
<davecheney> but gmail doesn't show them to you
<natefinch> davecheney: oohh... ok, that could be it
<davecheney> you can also check the listserv directly
<natefinch> Ahh yeah, I always forget I can do that
<natefinch> well, good.
 * TheMue => lunchtime
<jam> natefinch: talking about the windows installer. should we just make a 32 bit one and not worry about it? juju-the-client won't ever really benefit from >4GB addressable space. Though I realize go itself works better in 64-bit because of how it does virtual adress
<jam> addressing
<jam> juju-the-client isn't long lived enough to really matter.
<jam> when we get to 'jujud-the-daemon' we'll want to focus more on 64-bit
<natefinch> jam: yeah, that's probably a good point. There's very little benefit from 64 bit as you said. And 32 bit is more portable, which is a lot more valuable
<jam> (I originally started in 64-bit, and then went to 32-bit when we had C dependencies, because 64-bit mingw wasn't very good, and was thinking to switch back but maybe we just want to stay 32-bit for juju-the-client for now)
<davecheney> does anyone know where cloud-init writes the original text of the user data ?
<mgz> dave, sec
<davecheney> i can never find the bugger
<mgz> /var/lib/cloud/instance/user-data.txt
<davecheney> mgz: ta muchly
<davecheney> bloody hell, cloud-init has everything that opens and shuts
<davecheney> oh shit, its base64 encoded
<davecheney> well, that is one terminal i wont be using again
<natefinch> lol
<mgz> dave : in the same folder are the decoded bits
<davecheney> mgz: last time I c&p something blindly from you
<mgz> I included not cat :)
<natefinch> it's like the CLI equivalent of rick rolling :)
<fwereade_> guys, I'm afraid I won't make the standup, and I'll probably be working somewhat interrupted hours all week -- it emerges that the flat is actually literally falling apart and I need to find somewhere to live :/
<natefinch> fwereade_: wow, damn
<fwereade_> yeah, it kinda sucks
<natefinch> fwereade_: yeah, I can see how that could be less than optimal
<TheMue> fwereade_: iiirks, wishing you luck
<natefinch> fwereade_: yeah, good luck, definitely. That's a crappy situation to be in.  Do you own the flat, or are you renting?
<jam> natefinch: https://plus.google.com/hangouts/_/f497381ca4d154890227b3b35a85a985b894b471 standup
<jam> fwereade_: ^^
<rogpeppe> mramm: standout, if you want to
<rogpeppe> standup even
<mgz> standover
<jam> standarounder
<davecheney> stand to the right
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1202163/comments/4
<_mup_> Bug #1202163: openstack provider should have config option to ignore invalid certs <juju-core:Triaged> <https://launchpad.net/bugs/1202163>
<davecheney> jpds needs love on this one
<mgz> that's sadly not as easy to do in gojuju as it was in pyjuju
<davecheney> mgz: could we make it always on ?
<davecheney> mgz: sorry, was jumping to the solution
<mgz> as in, never check certs?
<davecheney> yes
<davecheney> i know how to create a client that doesn't check certs
<mgz> that seems like a bad thing from a security perspective :)
<davecheney> mgz: sure
<davecheney> is the issue in passing down the 'please don't check' flag
<davecheney> or implementing ' please don't check '
<mgz> passing through to all the points that use an endpoint seems like the hard part
<davecheney> mgz: what if it was at a per environment level ?
<mgz> I'm just assuming go has a flag in the http library somewhere for not checking certs
<davecheney> yup
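For reference, a sketch of the standard-library switch being alluded to: crypto/tls has an InsecureSkipVerify field on tls.Config, which net/http picks up through the transport. The helper names and the idea of driving it from a single boolean are assumptions for illustration, not juju's actual configuration plumbing.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// newHTTPClient returns a client that skips certificate verification
// only when skipVerify is true. Skipping verification removes protection
// against man-in-the-middle attacks, so it should be an explicit opt-in.
func newHTTPClient(skipVerify bool) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{InsecureSkipVerify: skipVerify},
		},
	}
}

// skipsVerification reports whether c was built to ignore invalid certs.
func skipsVerification(c *http.Client) bool {
	tr, ok := c.Transport.(*http.Transport)
	return ok && tr.TLSClientConfig != nil && tr.TLSClientConfig.InsecureSkipVerify
}

func main() {
	fmt.Println("strict client skips verification:", skipsVerification(newHTTPClient(false)))
	fmt.Println("lax client skips verification:   ", skipsVerification(newHTTPClient(true)))
}
```

Building the client is the easy half; the boolean still has to reach every place that constructs one.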
<mgz> davecheney: so, what happens in pyjuju, is we have the silly er... ssh-hostname-verify or something env config variable
<davecheney> yup
<mgz> that used to default to false, but got switched to true for openstack
<mgz> and that gets used by the openstack provider client,
 * davecheney remembers ssh-hostname-verify
<mgz> *and* also the curl for looking up the instance id on machine startup
<davecheney> cocking nora
<mgz> we should be able to do roughly that for gojuju... but we use the swift container for many more things, like tools,
<mgz> and without checking, I'm not totally certain all the places where we might access provider storage we actually have the environment config
<mgz> because some of that is now initiated from each machine, whereas it used to all be from the provisioner
<davecheney> i guess the 'just turn it off everywhere' option is not acceptable ?
<mgz> given the elling that made us flip the default for python, I think not
<jam> davecheney: I would think the self-signed cert thing is High + papercut vs Critical
<mgz> *yelling
<jam> Critical == block the next release until fixed
<davecheney> jam: feel free to change it
<jam> davecheney: just running it by you rather than having an edit war on LP :)
<davecheney> jam: don't care
<jam> natefinch: I sent what is hopefully a point-by-point discussion about the installer. Mostly it seems like "good enough for now".
<natefinch> jam: thanks, the feedback is very much appreciated
<jam> I'll check with Mramm if it is appropriate to upload it to juju-core proper today.
<jam> at least, if he shows up to my 1:1 this morning.
<mramm> jam: seems reasonable to upload it
<natefinch> mgz: I moved my cards around so they're up to date. I'll poke the red squad about azure creds so  I can do a sanity check on the azure code and land that today
<mgz> ace, and added an installer one, great
<natefinch-afk> going to be in and out some today, kids stuff.
<davecheney> TheMue: https://docs.google.com/a/canonical.com/document/d/1aEvcmxSJaj1i9zNjGy48yKF-SPlTFwW-NiKfoO_Ygo4/edit
<davecheney> could you please document juju unset in the release notes
<davecheney> ta
<TheMue> davecheney: yep, will do
<jam> natefinch-afk: quick comment, we probably won't release 1.13.2 as a windows installer because of the known bug against azure with public tools. (fixed in trunk)
<davecheney> also, 1.13.3 is coming at the end of the week
<jam> davecheney: right, so we'll probably wait for 1.13.3 to do a public -setup.exe
<davecheney> jam: and then we get to have the discusion about 1.14 again :)
<jam> davecheney: to be fair we have chatted about it a few times now :)
<natefinch-afk> davecheney: gotcha
<davecheney> jam: next time's the charm
<wallyworld> fwereade_: hiya, if you have time, would be great if you could look at this to ensure it matches our discussions. i hope to land it tomorrow so i can progress the simplestreams stuff with andrew https://codereview.appspot.com/13278043
<jam> TheMue: in talking with mramm he mentioned you'd likely be interested in being at the doc sprint on Friday. Is that still true?
<jam> I can add you to the invite list so you get a reminder.
<jam> hey wallyworld, how's your day been?
<wallyworld> busy :-)
<wallyworld> i now understand how horrible and hard charms are to write :-(
<wallyworld> we so need to fix that
<wallyworld> the mental model is difficult and there are so many ways to subtly shoot yourself in the foot
<davecheney> warning: juju may contain traces of footgun
<wallyworld> traces = copious quantities
<TheMue> jam: i'm interested in supporting the documentation, yes. i only have some troubles on this friday as i would have to leave at about 8pm local time.
<jam> TheMue: I don't think you are expected to be there the whole time
<jam> but if you can be there during your work day that is still a great advantage.
<TheMue> jam: yep, absolutely no prob. sounds fine to me
 * rogpeppe goes for lunch
<TheMue> hmmmpf
<TheMue> my software does not like me. first two missed tests i forgot but now a panic on the bot while i have a never seen failing test (but no panic) here. *sigh*
<TheMue> and now it passes again.
<TheMue> fu..
<TheMue> running the failing suite isolated it works, but with go test ./... it fails
<mgz> that is not uncommon
<TheMue> mgz: but also not nice
<fwereade_> TheMue, added a couple of comments to your review
<TheMue> fwereade_: thx
<TheMue> fwereade_: who's best to contact in the GUI team?
 * TheMue has to leave, bank appointment
<hazmat> fwereade_, the relationcount, unitcount on respective states (service, and relation) are primarily used for txn condition guards ?
<natefinch-afk> arosales, mgz: either of you guys know how to set up juju with azure? I don't see anything on juju.ubuntu.com
<mgz> not really, but you can just read the comment in the source/autogenerated config
<mgz> provider/azure/config.go
<natefinch-afk> mgz: cool, thanks
<natefinch-afk> man, the SSO for azure is buggy. It only works if I log in from particular websites
<natefinch-afk> evidently I should "contact your admin and report the following error: 80045C17"
<rogpeppe> hmm, how is it possible that provider.StartInstance and provider.StartBootstrapInstance have gone in without any tests?
<jamespage> anyone from the juju team care to join the 'delivering juju 2.0 into saucy' session?
<jamespage> rogpeppe, fwereade_, mramm ^^
<rogpeppe> jamespage: if that's happening now, i'm afraid i'm just reaching end of day, so can't without domestic consequences...
<jamespage> mgz, ^^
<rogpeppe> if anyone fancies a largish but pretty mechanical code review: https://codereview.appspot.com/13269045
<rogpeppe> natefinch: ^
<rogpeppe> and with that, i *must* leave!
<rogpeppe> g'night all
<natefinch> rogpeppe: I'll take a look.  Gnight
<kvt> surprising number of tests work on osx (+ mongodb w/ ssl) nice
<natefinch> kvt: that's awesome. not entirely surprising, but it would be great if we could get all the tests passing on OSX.  Ideally, we'd be able to run the tests on any platform juju supports.
<hazmat> natefinch, there are a couple of lxc tests that make it barf
<hazmat> the mongodb manual compilation  for ssl was a bit unfortunate/time consuming, required some patching of scons config files.
<natefinch> hazmat: yeah. lxc is kind of a problem child for cross compatibility.   Still, if we can isolate that with OS compilation restrictions, that's ok.
 * kvt nods
<kvt> kvt == hazmat + osx ;-)
<thumper> morning
<thumper> fwereade_: you around?
<fwereade_> thumper, heyhey
<thumper> fwereade_: got time for a chat
<thumper> ?
<fwereade_> thumper, sure, just a mo, would you start one please?
<thumper> sure
<bigjools> hullo
<thumper> hi bigjools
<bigjools> hey man
<mramm> thumper: bigjools: hey all
<bigjools> g'day mramm
<thumper> hi mramm
<davecheney> axw: hey
<davecheney> off da phone
<davecheney> gonna check out of my room and i'll see you in mitchel
<axw> davecheney: okey dokey. I'm down there now
<axw> davecheney: they took the wifi off when I mentioned we had a meeting room
<davecheney> fuckers
<axw> davecheney: ?
<axw> davecheney: I mean they didn't charge me
<mramm> davecheney: that is good at least
<davecheney> oh right
<davecheney> i thought they *turned it off*
<axw> hah :)  no sorry
<axw> hey mramm, how's it going?
<mramm> axw: pretty good
<mramm> busy with all kinds of stuff for the other teams the last couple of weeks
<mramm> seems like juju core land is running pretty well
<axw> cool :)
<sidnei> mramm: if i can get one bug in your prioritization radar, it'd be https://bugs.launchpad.net/juju-core/+bug/1190985 it's biting us madly :/
<_mup_> Bug #1190985: Confusing update-charm and deploy -u behavior <juju-core:Triaged> <https://launchpad.net/bugs/1190985>
#juju-dev 2013-08-28
<davecheney> thumper: can you join #juju on canonical ?
<davecheney> need to talk about some stuff
<davecheney> thumper: oh wait, i'm not in #juju
<davecheney> use #eco
<wallyworld__> davecheney: hi, doctor running late as usual. i've fixed william's issues with my bootstrap branch but wanted to understand your issues. i can't really see what the problem is. so long as tools are uploaded to public buckets when ppa is released, things just work. and stricter matching is only done on major.minor, so point releases can still be used
<davecheney> wallyworld__: when you say "public bucket" that only works for ec2
<davecheney> private openstack clouds act as if they have no tools and fall back to using s3 for _release_ (not devel) tools only
<davecheney> your change would prevent them accepting the final fallback toolset and their bootstrap will fail
<davecheney> i 100% agree that mixing toolsets is a mistake
<davecheney> but that genie is out of the bottle and people are using it now
<wallyworld__> davecheney: i just got called in, will catch up in a bit
<axw> bigjools: would you mind landing my gwacl branch? pretty sure I don't have rights to
<thumper> davecheney: you weren't there
<thumper> davecheney: was at the gym, sorry
<bigjools> axw: ah, I can fix that
<bigjools> axw: land away! :)
<davecheney> thumper: imma there
<davecheney> #eco on canonical
<thumper> oh
<axw> bigjools: thanks. next question: how? :)  I haven't landed without lbox before
<bigjools> axw: just change the status of the MP to "approved" and the bot does the rest
 * bigjools grumbles about lbox
<axw> bigjools: ah right, just didn't show up before cos I wasn't in the group. of course.
<axw> thanks!
<bigjools> yup
<davecheney> thumper: i might even bring forward 1.13.3 tagging to today
<davecheney> if I can do the upgrade testing
<davecheney> 'cos wallyworld wants to break everything
<davecheney> (wallyworld is sans internets, he'll be here soon)
<bigjools> axw: my gwacl bot seems to be dead... fixing
<axw> bigjools: thanks
<bigjools> axw: ah it's not
<bigjools> did you set a commit message?
<axw> I think so... checking
<bigjools> nope
<axw> dah
<axw> thanks
<axw> fixed
<bigjools> tarmac silently ignores MPs with no commit msg
<bigjools> bot runs every ten minutes
<bigjools> I need to get landing moved over to the one that jam runs
<axw> okey dokey, thanks bigjools
<thumper> davecheney: is this wallyworld's tool selection work?
<wallyworld> wot
<thumper> hi wallyworld
<wallyworld> hey
<thumper> wallyworld: davecheney says you want to break the wold
<thumper> world
<wallyworld> well, not really
<wallyworld> only for private clouds that choose not to set up a public bucket for tools
<axw> wallyworld: lp:~axwalk/juju-core/juju-metadata-generate-tools
<wallyworld> and rely on sync tools pulling down possibly outdated tools from s3
<thumper> wallyworld: so how is it breaking?
<wallyworld> now, juju will fall back to using any tools where the major version number matches
<wallyworld> eg 1.3 client will use 1.2 tools
<wallyworld> but the change now restricts the matching to require major.minor match
<wallyworld> patch updates are used though
<wallyworld> so 1.3.0 client will use 1.3.2 tools
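The stricter rule described above can be sketched as follows, assuming plain major.minor.patch version strings. The version type and both helpers are hypothetical and are not juju's version package.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version is a hypothetical major.minor.patch triple.
type version struct{ major, minor, patch int }

// parse turns a string such as "1.3.2" into a version.
func parse(s string) (version, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return version{}, fmt.Errorf("invalid version %q", s)
	}
	var n [3]int
	for i, p := range parts {
		v, err := strconv.Atoi(p)
		if err != nil {
			return version{}, fmt.Errorf("invalid version %q: %v", s, err)
		}
		n[i] = v
	}
	return version{n[0], n[1], n[2]}, nil
}

// compatible applies the stricter rule: major and minor must match,
// while the patch number is free to differ.
func compatible(client, tools string) bool {
	c, err := parse(client)
	if err != nil {
		return false
	}
	t, err := parse(tools)
	if err != nil {
		return false
	}
	return c.major == t.major && c.minor == t.minor
}

func main() {
	fmt.Println("1.3.0 client, 1.3.2 tools:", compatible("1.3.0", "1.3.2"))
	fmt.Println("1.3.0 client, 1.2.0 tools:", compatible("1.3.0", "1.2.0"))
}
```

Under the older behaviour the second pair would have been accepted, since only the major number had to match.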
<thumper> I don't think that is reasonable
<wallyworld> william does
<thumper> I'd say only use patch updates for non-dev releases
<thumper> we break things from point release to point release on dev envs
<wallyworld> we can't guarantee 1.28 will be compatible with 1.26
<wallyworld> it's not an issue in practice
<wallyworld> juju release = upload tools and then upload ppa
<wallyworld> so there are always tools
<wallyworld> for devs on release+1, they should be using --upload-tools
<wallyworld> thumper: william is the one driving this change
<thumper> sure
<thumper> I guess that's fine
<wallyworld> i'm +0 on it - i can see pros and cons
<davecheney> wallyworld: correction, juju release uploads tools for *known* clouds, the CPC clouds
<davecheney> there exists an unknown set of non public clouds
<davecheney> who currently (inadvertently) rely on the automatic sync-tools behavior that bootstrap does for them
<davecheney> i 100% agree this is wrong and bad
<davecheney> but if we take it away from them, they'll lynch us
<thumper> wallyworld: I suggest we skip the package review stuff with axw as you are sprinting
<wallyworld> thumper: sounds good to me
<thumper> wallyworld: please let axw know
<wallyworld> will do
<davecheney> 15:48 <jpds> error: invalid value "maas-name=mongodb.lab.boston.cts.canonical.com" for flag --constraints: unknown constraint "maas-name"
<davecheney> ^ i think i already know the answer
<davecheney> but what do I tell jpds ?
<davecheney> bigjools: can you respond please
<bigjools> davecheney: OTP
<davecheney> bigjools: when you are free
<bigjools> but looks like a juju error
<bigjools> and juju-core doesn't support maas flags
<davecheney> oh cock
<bigjools> yes James
<wallyworld_> thumper: bigjools: i got my coffee machine back! :-D
<bigjools> \o/
 * jtv cheers for wallyworld_ 
<wallyworld_> it's heating up right now :-D
<bigjools> and I ran out of beans
<jtv> Go over to wallyworld_'s right now - I hear he has working equipment
<bigjools> I heard the cord was cut
 * wallyworld_ is the coffee nazi - no beans for you
<wallyworld_> boom bomm ching
<bigjools> davecheney: wtf is "maas-name" anyway?
<davecheney> bigjools: constrains the units to the maas tag
<bigjools> that's maas-tag I thought
<bigjools> maas-tags even
<davecheney> https://juju.ubuntu.com/docs/charms-constraints.html
<bigjools> and only on pyjuju
<davecheney> docs says maas-name
<bigjools> maas docs say maas-tags....
<bigjools> haha
<bigjools> is he using juju-core or pyjuju?
 * davecheney starts to cry
<davecheney> juju-core
<bigjools> he's fucked then
<davecheney> how charming
<bigjools> until fwereade_ gets  provider-specific constraints in (I think --to was being talked about)
<bigjools> he's not going to be charming
<davecheney> jpds: short version, we haven't implemented provider specific constraints yet
<davecheney> the docs are wrong
<bigjools> they probably refer to pyjuju
<jpds> bigjools: That did exist in pyjuju.
<bigjools> exactly
<davecheney> jpds: fwiw, https://bugs.launchpad.net/juju-core/+bug/1170337
<_mup_> Bug #1170337: maas provider: missing support for maas-specific constraints <openstack> <juju-core:Triaged> <https://launchpad.net/bugs/1170337>
<davecheney> also, https://bugs.launchpad.net/juju-core/+bug/1170337
<_mup_> Bug #1170337: maas provider: missing support for maas-specific constraints <openstack> <juju-core:Triaged> <https://launchpad.net/bugs/1170337>
<davecheney> shit
<jpds> Keep calm.
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1217717
<_mup_> Bug #1217717: docs: charm constraints page refers to unsupported 'maas-name' constraint <juju-core:Triaged by evilnick> <https://launchpad.net/bugs/1217717>
<jam> axw: did we get gwacl updated on the build bot to match your "critical update gwacl" patch?
<jam> the bot doesn't (yet) actually look at dependency.tsv
<axw> jam: no, but it doesn't affect the juju-core tests (it's tested in gwacl)
<jam> axw: sure, but we probably actually want the bot to be building with the version of gwacl we are suggesting. I'll go update it.
<jam> (It also helps that if there was an accidental incompatibility break we would notice it.)
<axw> jam: fair enough, thank you
<jpds> Guys, my bootstrap node is dying.
<jpds>  /var/log/juju/machine-0.log keeps saying: juju environ.go:37 worker: loaded invalid environment configuration: maas-oauth: expected string, got nothing
<jpds> Which is a lie.
<fwereade_> jpds, do you know the juju version you used to bootstrap, and the jujud version you're running?
<jam> fwereade_: Good morning. I'm slightly concerned that we really do have a bug we need to deal with in ServiceGet. Specifically, the charm config types are described in the schema (description, type, value), and on the server side we are doing the work to put the data into the right type. But we lose that type when we transmit it via JSON.
<jpds> fwereade_: ppa:juju/devel 1.13.2.
<jam> fwereade_: so do we need to implement a pass over the data in the client side of ServiceGet?
<jpds> fwereade_: But yeah, I've just bootstrap'ed it and nothing works, not even status.
<fwereade_> jpds, would you find the jujud on the bootstrap node (/var/lib/juju/tools/machine-0?) and run `jujud version`, just to check, please
<fwereade_> jam, blech
<jpds> fwereade_: 1.13.2-precise-amd64
<fwereade_> jpds, hmm, would you pastebin me /var/log/cloud-init-output.log please?
<rogpeppe> mornin' all
<rogpeppe> jam: can you give a specific case when it would be a problem?
<fwereade_> jam, I can accept that's a bug, but I'm reluctant to fix ServiceGet which itself is pretty insane
<jam> rogpeppe: The fact that the metadata tells you what type you have, and the existing code spends the effort to return things of that type sound like the code has made some expectations.
<fwereade_> jam, ISTM that it's an unholy combination of service info and charm info, and that there is literally no possible justification for smooshing that information into the same API call
<rogpeppe> fwereade_: +1
<fwereade_> jam, if we didn't do that mixing, the only problem would be precision, right?
<rogpeppe> fwereade_: well, some code may rely on the fact that we expect an int64 for an int attribute value
<jam> fwereade_: so I can confirm the precision issue (I added a test that proves it). The other issue is that you could easily write code that does "var stuff int64 = result.Config["skill-level"]["value"]"
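The type and precision loss jam describes can be reproduced with nothing but `encoding/json`. This is a minimal sketch, assuming the value travels in a free-form `map[string]interface{}`; the helper names are hypothetical and this is not the ServiceGet code itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeConfig unmarshals a JSON object into a free-form map, the way
// an untyped config value would arrive after a trip through JSON.
func decodeConfig(data []byte) (map[string]interface{}, error) {
	var m map[string]interface{}
	err := json.Unmarshal(data, &m)
	return m, err
}

// isFloat64 and isInt64 report the dynamic type of a decoded value.
func isFloat64(v interface{}) bool { _, ok := v.(float64); return ok }
func isInt64(v interface{}) bool   { _, ok := v.(int64); return ok }

func main() {
	// 2^53 + 1 cannot be represented exactly as a float64.
	m, err := decodeConfig([]byte(`{"skill-level": 9007199254740993}`))
	if err != nil {
		panic(err)
	}
	v := m["skill-level"]
	fmt.Printf("type %T, int64? %v\n", v, isInt64(v))
	fmt.Printf("value %.0f\n", v)
}
```

Any caller that asserts the value to `int64` would fail here, because every JSON number decoded into an `interface{}` becomes a `float64` by default.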
<rogpeppe> jam: but i'd still like to see at least one place where this is actually a problem
<jam> rogpeppe: the main problem is that it is code that *we* didn't write.
<rogpeppe> jam: ah, you're worrying about 3rd party code using the API?
<jam> rogpeppe: I'm concerned that we have code that clearly spent a lot of time to get types right that we are throwing away
<jam> fwereade_, rogpeppe: I'm willing to just push it forward, file a bug and get on with life, but it was clearly some effort into doing it the way it is done today.
<jpds> fwereade_: /msg.
<rogpeppe> jam: when we could represent integers in the marshalled form of the data, it made sense to represent an int as an int
<rogpeppe> jam: but given that we're now using JSON, i don't see that is necessary.
<fwereade_> rogpeppe, I thought we *could* represent ints regardless
<fwereade_> rogpeppe, jam, I'm starting to be convinced here
<rogpeppe> fwereade_: we *could* but it's awkward to do
<fwereade_> rogpeppe, jam, although I still sorta lean towards a "ServiceGet is hopeless rubbish, but it's not just ServiceGet we need to worry about, we should actually fix our marshalling"
<rogpeppe> fwereade_: what's bad about ServiceGet, for the record?
<fwereade_> rogpeppe, we've already encountered this problem in environment config
<fwereade_> rogpeppe, it's mixing together charm stuff and service stuff for no clear reason
<jam> rogpeppe: the specific bad of ServiceGet is that it mixes the Charm.Config object with the Service.Constraints data
<jam> which should really be 2 different calls.
<jam> fwereade_, rogpeppe: bug #1217742, I'll document it in the code and reference that bug if we find it is really important to fix.
<_mup_> Bug #1217742: ServiceGet returns integer values as float64 <juju-core:Triaged> <https://launchpad.net/bugs/1217742>
<jam> fwereade_: most of our marshalling uses "structs" that will enforce types
<jam> ServiceGet is a bit special in that it is passing around user-data that is in a map[string]interface{}
<rogpeppe> fwereade_: i don't really mind the fact that it's one call where it could be two, if most API usage gets both things at the same time
<rogpeppe> fwereade_: i'm not sure it makes it hopeless rubbish
<rogpeppe> jam: yeah
<fwereade_> jam, I'm not so bothered about service config vs constraints, getting and (being able to) set all service info at once is fine by me really... the big problem is total asymmetry
<rogpeppe> jam: i'm trying to think of other places that pass around a free interface{}
<rogpeppe> jam: we could potentially just use UseNumber always
<fwereade_> rogpeppe, jam: we had that problem with ports in env config
<rogpeppe> hmm, why does Resolved return settings?
<fwereade_> rogpeppe, jam, we will probably have it again when unit agents need to use service config
<jam> rogpeppe: we could change it to UseNumber, but we'd still need to implement the custom unmarshalling in ServiceGet to cast the Number into the correct types.
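A rough sketch of the UseNumber route and the extra casting pass jam mentions. `Decoder.UseNumber` and `json.Number` are standard library; the `coerce` helper and the type names "int" and "float" are assumptions made for this illustration, not juju-core's charm schema.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// decodeWithNumbers decodes JSON but keeps numbers as json.Number
// instead of converting them to float64.
func decodeWithNumbers(data []byte) (map[string]interface{}, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.UseNumber()
	var m map[string]interface{}
	err := dec.Decode(&m)
	return m, err
}

// coerce converts a decoded value to the Go type named by typ.
// The type names ("int", "float") are assumptions for this sketch.
func coerce(v interface{}, typ string) (interface{}, error) {
	n, ok := v.(json.Number)
	if !ok {
		return v, nil // strings, booleans etc. pass through unchanged
	}
	switch typ {
	case "int":
		return n.Int64()
	case "float":
		return n.Float64()
	}
	return nil, fmt.Errorf("unexpected number for type %q", typ)
}

func main() {
	m, err := decodeWithNumbers([]byte(`{"skill-level": 9007199254740993}`))
	if err != nil {
		panic(err)
	}
	v, err := coerce(m["skill-level"], "int")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T %v\n", v, v)
}
```

The decoder alone is not enough: something on the client side still has to know the declared type of each setting in order to pick `Int64` or `Float64`.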
<fwereade_> rogpeppe, I have *no* freaking idea
<fwereade_> rogpeppe, because it seemed like a good idea at the time, regardless of layering/sanity concerns
<fwereade_> rogpeppe, just like ServiceGet ;p
<jam> fwereade_: ports aren't in typed structs?
<jam> I guess EnvironConfig isn't type
<jam> typed
<fwereade_> jam, environment config resists that sort of thing pretty hard
<rogpeppe> jam: we could just document that the Go API has changed
<rogpeppe> fwereade_: i was looking over some of the camlistore source at the weekend, and it's interesting to see how it manages configuration stuff. it's got quite a neat little type to manage it.
<fwereade_> rogpeppe, oh yes?
<rogpeppe> fwereade_: http://godoc.org/github.com/bradfitz/camlistore/pkg/jsonconfig
 * jam goes to grab lunch, will be back soon.
<rogpeppe> fwereade_: the cunning bit is Validate
<rogpeppe> fwereade_: in particular, lookForUnknownKeys works out which keys could be in the config by knowing what keys have been looked up previously
<rogpeppe> fwereade_: which is actually kinda neat if you think about it
<rogpeppe> fwereade_: i'm not that keen on the way it stores extra keys in the map though.
<rogpeppe> fwereade_: anyway, interesting to see how other code deals with similar issues.
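The "remember which keys were looked up, then complain about the rest" idea can be sketched like this. It is not the camlistore jsonconfig API; the `Config` type and its methods are hypothetical, and unlike the package rogpeppe describes, this version keeps the bookkeeping in a separate map rather than in the config map itself.

```go
package main

import (
	"fmt"
	"sort"
)

// Config is a hypothetical wrapper around free-form configuration that
// records every key the program asks for.
type Config struct {
	values map[string]interface{}
	seen   map[string]bool
}

// NewConfig wraps the given values.
func NewConfig(values map[string]interface{}) *Config {
	return &Config{values: values, seen: make(map[string]bool)}
}

// OptionalString returns the string stored under key, or def if it is
// missing or not a string, and records that key was looked up.
func (c *Config) OptionalString(key, def string) string {
	c.seen[key] = true
	if s, ok := c.values[key].(string); ok {
		return s
	}
	return def
}

// Validate returns an error naming every key that is present in the
// configuration but was never looked up.
func (c *Config) Validate() error {
	var unknown []string
	for k := range c.values {
		if !c.seen[k] {
			unknown = append(unknown, k)
		}
	}
	if len(unknown) == 0 {
		return nil
	}
	sort.Strings(unknown)
	return fmt.Errorf("unknown keys: %v", unknown)
}

func main() {
	c := NewConfig(map[string]interface{}{"name": "env", "typo-key": 1})
	fmt.Println(c.OptionalString("name", ""))
	fmt.Println(c.Validate())
}
```

The set of valid keys is never declared anywhere; it falls out of the lookups the code actually performs before calling `Validate`.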
<fwereade_> rogpeppe, thanks, I'll try to take a proper look shortly :)
<rogpeppe> a review of this would be much appreciated please. it's large but it's just moving code. https://codereview.appspot.com/13269045/
<fwereade_> jam, rogpeppe, I have a theory for why jpds's environment is broken
<rogpeppe> fwereade_: oh yes?
<fwereade_> jam, rogpeppe: the upgrader api is trying to look in the environment for tools before we've sent the environ config secrets over
<fwereade_> jam, rogpeppe: and for bonus points ServerError is panicking because schema.error_ is unhashable and can't be used with ServerError
<rogpeppe> fwereade_: hmm, the upgrader used to wait until it had got a valid environment config
<rogpeppe> fwereade_: i guess things will probably sort themselves out eventually
<fwereade_> jam, rogpeppe: leaving aside that it should *never* have had access to an environ config, yes ;p
<fwereade_> rogpeppe, not sure it is actually doing so
<rogpeppe> fwereade_: well, sure
<jpds> rogpeppe: I have a 43MB log file with a time range of 30 minutes - full of tracebacks. :)
<fwereade_> jpds, how does juju status fail exactly?
<jpds> fwereade_: It didn't return.
<jpds> I just bootstrap'ed, it installed and immediately started freaking out.
<rogpeppe> jpds: could you paste a representative example of some of the log file (including at least two iterations of the traceback)?
<jpds> rogpeppe: They're on https://chinstrap.canonical.com/~jpds/juju-debug/
<fwereade_> jpds, is it remotely possible that you killed the status before mongod was running on the server, and haven't run it since?
<rogpeppe> jpds: perhaps from the very start up until it starts to look repetitive
<jpds> rogpeppe: The full all-machines.log is there.
<TheMue> fwereade_: you made a comment on my last CL regarding the GUI team. whom can i ask for it best?
<jpds> fwereade_: I may have Ctrl-C'ed it - I remember I ran status before the machine came up..
<fwereade_> jpds, would you try juju status again please?
<fwereade_> jpds, when that can connect to mongo, the first thing it'll do is hand over any missing secrets
<rogpeppe> hmm, this looks highly suspicious: "panic: runtime error: hash of unhashable type schema.error_"
<fwereade_> jpds, this is still definitely a bug, but it might be easily resolved that way
<fwereade_> rogpeppe, yeah, that was what led me in that direction
<fwereade_> rogpeppe, regardless of tools returning an error, that nukes the whole process
<fwereade_> rogpeppe, if ServerError didn't panic, I think the upgrader would nicely fail and get retried
<jpds> fwereade_: The status is just sitting there and the log is just getting more errors.
<fwereade_> rogpeppe, oh except, wait, we're still using allFatal for some reason
<fwereade_> jpds, would you run it with --debug and see if anything interesting pops up?
<jpds> fwereade_: 2013-08-28 08:09:28 INFO juju.state open.go:68 opening state; mongo addresses: ["angha.maas:37017"]; entity ""
<jpds> And nothing.
<jpds> Log continues to complain about: maas-oauth: expected string, got nothing
<rogpeppe> fwereade_: hmm, schema should really return an error that's a pointer type
<jam> 678646
<fwereade_> rogpeppe, maybe so, but ServerError should also not be assuming that something's hashable just because it implements error
<jam> good thing those are one-time passwords :)
<rogpeppe> fwereade_: that is true too
<fwereade_> jam, haha
<rogpeppe> fwereade_: that's my fault entirely.
<jam> fwereade_: though I guess if you give away 2 of them, you can figure out the sequence, or something along those lines (given setting one up is start it, sync it, and enter 2 sequential values)
<fwereade_> rogpeppe, not to worry, we can fix it
<jam> rogpeppe: we should probably be decreasing "discarding action method" from INFO to at least DEBUG, fwiw
<fwereade_> jpds, can you definitely resolve angha.maas from where you're running?
<rogpeppe> jam: i have an old branch somewhere that would let us discard that message entirely
<rogpeppe> jam: at the least it could only log exported methods, because they're the ones where problems are likely to lie
<jpds> fwereade_: Oh, that's interesting; fixed DNS and now it works.
<fwereade_> jpds, because it's starting to look like that's the underlying problem -- the environ errors are an expected consequence of no-secrets-yet, and the panics are an unexpected one, but both will be sorted out once we hand over the secrets
<fwereade_> jpds, sweet
<jpds> And the server looks happier.
<jam> rogpeppe, fwereade_: so one thing we certainly want to fix, is ServerError(unknownError) shouldn't ever panic. How do we detect that a type is unhashable and can't be looked up in a map?
<fwereade_> jpds, secrets handover happens first time you connect to an environment
<jpds> I wouldn't normally expect it to fail like that on DNS though...
<jpds> fwereade_: I see.
<rogpeppe> jam: i think we could probably use a switch instead.
<rogpeppe> jam: alternatively we could recover from it
<fwereade_> rogpeppe, *is* there a way to detect it other than to try, panic, recover?
<rogpeppe> fwereade_: perhaps with reflect
<fwereade_> rogpeppe, eww ;p
<jpds> Maybe the node should just sit there until status is run? It's kind of failing up with log files otherwise...
<rogpeppe> fwereade_: actually i can't see an easy way even with reflect
<fwereade_> jpds, I think that if we fixed the ServerError panic things would look a lot better
<fwereade_> jpds, panic tracebacks are very heavyweight
<fwereade_> jpds, but there is indeed another bug to fix to prevent it just repeatedly erroring and trying again regardless
<fwereade_> jam, rogpeppe: I think *that* one would be fixed by doing a WaitForEnviron in Upgrader.Tools
<jam> rogpeppe: would the ,ok syntax work, or it still panics on non-hashable types
<fwereade_> jam, rogpeppe, jpds; I'd be a bit reluctant to make *everything* wait for bootstrap to complete
<rogpeppe> fwereade_: i'm not entirely sure about that
<rogpeppe> jam: the ,ok syntax would not work
<fwereade_> jam, rogpeppe, jpds: better for just the things that need an environment to wait really
<rogpeppe> jam: but a switch would
<jpds> sigh, my debug-log keeps filling up with "panic: runtime error: hash of unhashable type schema.error_" error still.
<fwereade_> rogpeppe, jam, a switch seems sane to me, just dodge the whole issue
<rogpeppe> fwereade_: yeah
<jam> rogpeppe: we have a switch just after it, but I don't quite know what a switch for unhashable types would look like
<jam> My google-fu for "golang unhashable" isn't very helpful
<fwereade_> jam, the point is just that a switch won't try to hash it
<rogpeppe> jam: comparing unhashable types is ok if they're different underlying types
<rogpeppe> jam: equality comparison compares the types first, then the value
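rogpeppe's point about comparison can be sketched as follows: `==` on interface values checks the dynamic types first, so comparing an unhashable error against a singleton of a different type is simply false. The `schemaError` type is a stand-in that assumes schema.error_ is unhashable because it holds a slice; the error values and codes are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// schemaError is a stand-in for an error type that cannot be hashed
// because it contains a slice (an assumption about schema.error_).
type schemaError struct {
	path []string
}

func (e schemaError) Error() string {
	return fmt.Sprintf("bad value at %v", e.path)
}

// Hypothetical singleton errors with associated codes.
var (
	errNotFound     = errors.New("not found")
	errUnauthorized = errors.New("unauthorized")
)

// codeFor maps known singleton errors to codes using == comparisons
// only, so no hashing of err ever takes place.
func codeFor(err error) string {
	switch err {
	case errNotFound:
		return "not found"
	case errUnauthorized:
		return "unauthorized access"
	}
	return ""
}

func main() {
	fmt.Printf("%q\n", codeFor(errNotFound))
	fmt.Printf("%q\n", codeFor(schemaError{path: []string{"maas-oauth"}}))
}
```

The comparison would only panic if both operands had the same non-comparable dynamic type, which cannot happen when the cases are all pointer-typed singletons.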
 * rogpeppe goes to make sure of chapter and verse
<jam> rogpeppe: so it is *usually* a good idea to avoid hardcoding a switch/series of if/else statements if you can put that information into a map so that it is O(1) lookup instead of O(n) and usually maps are easier to add data to at runtime for types that you may not have known about yet.
<jam> So I'd still like to keep the "we have a map for registered types," that we can look in
<fwereade_> jpds, that is... more upsetting
<fwereade_> jpds, is the server still generating those panics after a successful status, or is debug-log itself screwy?
<fwereade_> jpds, there was a bug with rsyslog resyncing state it didn't need to, and I think it was fixed, but I'm not sure when
<fwereade_> wallyworld_, ^^
<jam> fwereade_: I'm pretty sure that fix is in 1.13.2. You can confirm by checking /etc/rsyslogd/25-juju.log There should be a couple of lines with &~ inbetween them.
<jam> It fixed it in our testing at least.
<jam> And it only triggered when you had deployed a unit to machine-0
<jam> and we haven't gotten to the point of deploying yet.
<jpds> fwereade_: tail'ing the log on the bootstrap node to make sure debug-log isn't just playing catch up.
<fwereade_> jpds, look at machine-0.log, that wouldn't demonstrate the problem even if it were still there
<rogpeppe> jam: i was thinking of something like this: http://paste.ubuntu.com/6035835/
<rogpeppe> jam: but it will indeed be O(n)
<TheMue> fwereade_: ping
<jam> rogpeppe: and not extensible at runtime
<rogpeppe> jam: i think on balance i'm preferring the recovery approach
<rogpeppe> jam: is that something we'd want?
<jam> rogpeppe: that's where I'm getting to as well.
<jam> rogpeppe: it is often a nice pattern to be able to add things you didn't think about ahead of time as data rather than code.
<rogpeppe> jam: at runtime?
<jpds> fwereade_: Yeah, that looks OK, it's just playing ping pong.
<jam> rogpeppe: it would allow packages that define an error to register that error mapping into a code
<fwereade_> jpds, ok, cool
<jam> rogpeppe: then the places that define the error define their code
<rogpeppe> jam: they can do that anyway, by returning an error with a Code method
<jam> rather than having to have "common/errors.go" know about all possible errors.
<rogpeppe> jam: common/errors.go is really for legacy errors
<rogpeppe> jam: or at least, that was the intention
<fwereade_> TheMue, pong
<TheMue> fwereade_: seen my question above regarding the GUI team?
<rogpeppe> jam: i guess it comes down to dependencies
<TheMue> fwereade_: whom can i ask best to be sure about the change?
<fwereade_> TheMue, frankban seems to be right here and would be a good person
<TheMue> fwereade_: thanks
<rogpeppe> jam: and responsibility - we either have lots more things import state/api/params (for the error codes) or something has to be responsible for registering error types
<jam> rogpeppe: it is the Factory pattern which is often quite good, especially for mapping things of X to things of Y. It happens that go specifics mean it may not be great here.
<fwereade_> TheMue, but if he weren't you could probably just pop into #juju-gui and see who's around ;)
<TheMue> fwereade_: oh, didn't know about that channel. then it's more easy ;)
<TheMue> fwereade_: after that i'll take a look at the ssl issue. here i may come back to you with questions.
<jpds> How do I forcibly move a machine from dying to dead? (It was never alive and I didn't want it to be) - already terminate-machine'd it.
<rogpeppe> jam: i'm not quite sure what you mean there
<fwereade_> jpds, if it's dying, it should have 0 units, and the provisioner ought to clear it up for you...
<fwereade_> jpds, but we did also see a bug in which the provisioner *sometimes* misses a machine that needs cleanup
<fwereade_> jpds, so you may have encountered that
<jpds> It's still "Allocated to root" in MAAS.
<fwereade_> jpds, and in status it appears as dying, with an instance id?
<jpds> fwereade_: Yes.
<fwereade_> jpds, if the machine is still coming up you just have to wait for it to appear, and the machine agent will figure out that it's no longer wanted and remove itself
<jam> rogpeppe: ugh. we have no direct tests of common.ServerError, at least not in the common package.
<rogpeppe> jam: yeah, the test is in state/apiserver. it obviously wasn't moved over when we moved the code.
<fwereade_> jpds, there's room for us to improve that story though
<fwereade_> jpds, I imagine you wouldn't be mentioning it if it weren't annoying, so I'll write some bugs
<fwereade_> jpds, thanks
<jpds> fwereade_: Yeah, problem was the machine is a VM with no power settings, so it was never going to boot...
<fwereade_> jpds, ha, I see
<rogpeppe> jam, fwereade_: https://codereview.appspot.com/13336043
<jpds> fwereade_: Let me know if you file bugs so I can track them.
<fwereade_> rogpeppe, I'd prefer a code, ok sort of return really
<fwereade_> rogpeppe, it seems a little bit magic
<fwereade_> rogpeppe, and in-band signalling is just generally a Bad Thing ;)
<rogpeppe> fwereade_: you'd prefer something like this? http://paste.ubuntu.com/6035888/
<jam> rogpeppe: for the opposite method I'm submitting it now
<fwereade_> rogpeppe, yeah, I think so, if you have no objections
<jam> waiting for lbox to finish
<fwereade_> rogpeppe, the magic doesn't feel painful now
<rogpeppe> fwereade_: ok, fair enough
<jam> rogpeppe: I traced back to who introduced the singleton error map, which is quite a while ago (rev 957 or so)
<rogpeppe> jam: i did that
<jam> and lbox fails with "failed to load data"
<rogpeppe> jam: it's all my fault :-)
<jam> jpds: bug #1217760 for the bad panic bug
<_mup_> Bug #1217760: apiserver.common.ServerError needs to handle unhashable errors <juju-core:In Progress by jameinel> <https://launchpad.net/bugs/1217760>
<jpds> jam: Thanks.
<jam> rogpeppe: https://codereview.appspot.com/13338043
<jam> hi noodles775 (/wave)
<noodles775> jam: o/
<jam> rogpeppe: at this point, I don't *really* care which fix goes  in, though we'll want to associate whatever lands with the bug #
<rogpeppe> jam, fwereade_: alternative updated: https://codereview.appspot.com/13336043
<rogpeppe> jam, fwereade_: i don't mind much either, although i quite like the more compact nature of the switch. (and using recover in that kind of way has reasonable precedent)
<rogpeppe> s/switch/map lookup/
<fwereade_> jpds, https://bugs.launchpad.net/juju-core/+bug/1217781
<_mup_> Bug #1217781: machine destruction depends on machine agents <juju-core:New> <https://launchpad.net/bugs/1217781>
<jam> rogpeppe: I like the map lookup, the fact that you have to recover tells me it is something to be scared of
<jam> *but* the fact
<rogpeppe> jam: for a similar kind of use, see http://golang.org/src/pkg/os/exec/exec.go?s=5901:5926#L122
<rogpeppe> jam: i think it's reasonable
<jpds> fwereade_: Ta.
<jam> rogpeppe: I defer to a 3rd party for final decision about the tastefulness of a defer() {recover()}, it looks a lot like a "try: except:" in python, which you should almost always avoid.
<jam> as it suppresses all errors.
<rogpeppe> jam: there's only one error possible there
<jam> such as fwereade_^^
<rogpeppe> jam: your code could be a little more compact if you stored the singleton codes in a table and iterated over them
<rogpeppe> fwereade_: yeah, i defer to you too
<jam> rogpeppe: we have a switch today, might as well add some bits to it.
 * rogpeppe likes tables
<fwereade_> rogpeppe, jam, I was fine with a switch, but I thought we were leaning towards maps being nice? if we agree that maps are nice I'm fine with the recover
<jam> fwereade_: to be clear, we both have a solution to the problem and not enough motive to decide between, just pick one and we'll go about our lives in a merry fashion. https://codereview.appspot.com/13338043/ and https://codereview.appspot.com/13336043/
<jam> fwereade_: map is nice, recover is ugly, etc. :)
<fwereade_> jam, rogpeppe: rogpeppe wins by the highly scientific toss-a-coin method
<rogpeppe> lol
<fwereade_> jam, rogpeppe: a bit annoying that euros don't have heads on them, that gave me pause for a moment
<fwereade_> (jam, if I wanted to debate further, I'd contend that recover is more unfamiliar than intrinsically ugly, and that when the scope is so very narrow I don't see any problems with it)
<fwereade_> rogpeppe, jam: separately
<jam> fwereade_: having to write a function to do  a safe map lookup is... unfortunate.
<jam> rogpeppe: go for it
<fwereade_> rogpeppe, jam: I contend that ca-private-key is Just Another Secret
<fwereade_> rogpeppe, jam: and should be communicated just like all the other secrets
<jam> fwereade_: I take it this is a different discussion.
<rogpeppe> fwereade_: how can we do that?
<jam> fwereade_: we can't talk to the server over TLS until it has that secret
<rogpeppe> jam: exactly
<fwereade_> rogpeppe, jam: *ca*-private-key
<rogpeppe> fwereade_: um, do we even keep the ca private key around?
<fwereade_> rogpeppe, jam: the fact that the server private key is vulnerable is itself separate I think
<rogpeppe> fwereade_: i don't think we need to send the ca-private-key over the wire at all, do we?
<jam> fwereade_: it does feel like the server private key going into the system is a pretty big flaw in the overall security. The other option is that it would be generated on the fly, but then how do we validate it is correct?
<fwereade_> rogpeppe, we will for HA if we want to actually give each server its own cert
<rogpeppe> fwereade_: if we do, then yes
<jam> fwereade_: note also that it wouldn't be terrible to pass in some entropy to the machines we are starting (Dustin Kirkland has done some work around this area)
<rogpeppe> jam: i think the right solution is to send the private key in, but then have the server change its own key and send the new public key back on first connection.
<fwereade_> rogpeppe, jam: the server private key only becomes vulnerable once we're running user code, though
<fwereade_> rogpeppe, jam: so I think it's fine if we generate a new one once the ca-pk is up there
<fwereade_> rogpeppe, managing the restarting dance is going to be interesting though
<rogpeppe> jam: i think that's a good idea
<jam> fwereade_, rogpeppe: I think we need someone with real security background to actually do the modeling of what threats we are going to try and handle and which ones we aren't.
<fwereade_> rogpeppe, hmm, how do we revoke the bad one though?
<rogpeppe> jam: that sounds like a good idea
<jam> It is true that this handles the "services running on machine-0 (even in containers)" could get access to user data and see the private key that was initialized with.
<jam> But someone that gets that data and snoops the first connection
<jam> can get the ongoing security details
<jam> I think
<fwereade_> jam, we don't run user code until after the first connect
<fwereade_> jam, if the cloud itself is compromised it's out of scope
<jam> fwereade_: so I agree that we've closed the "charms running on machine-0" hole
<jam> fwereade_: but we haven't closed other cases
<rogpeppe> jam: if someone can MitM the initial data, we're stuffed, i think
<jam> and we might be able to if someone who actually knows this stuff thinks about it.
<jam> Dustin might be a great person to bounce ideas off of.
<rogpeppe> jam: it would be good to get a second opinion, definitely
<jam> rogpeppe: note that SSH does it via cloud-init-output. and the boot log.
<rogpeppe> jam: sorry, SSH does what?
<jam> rogpeppe: when cloud init finishes it writes out the fingerprint of the SSH key it generate
<jam> generated
<jam> so that the person who started the bootstrap
<jam> can read the log
<jam> and then when connecting knows that the fingerprint matches the generated value
<jam> rogpeppe, fwereade_: so we *could* just generate the server public and private key on first boot
<rogpeppe> jam: what's this guarding against?
<jam> in the cloud-init code
<jam> and then have the last bit of cloud-init report the public key
<jam> which we can then verify without having to do a "first connect" song and dance for that bit.
<rogpeppe> jam: compromise of the initial StartInstance data?
<jam> rogpeppe: having to upload the private key at all
<rogpeppe> jam: how do we know we're connecting to the same machine we thought we were starting?
<jam> rogpeppe: so rather than uploading something which gets replaced, we just have it generated in a location that we can read
<rogpeppe> jam: how do we verify that we're talking to the right location?
<jam> rogpeppe: isn't that where you ask the provider for the machine that you started?
<jam> as in, you started it, it gave you an instance id, and you're asking it to give you the IP for that machine as well as the log for that machine.
<fwereade_> jam, assuming you really get the address for that machine once it's (potentially) gone through DNS
<rogpeppe> jam: this presumably assumes that you can get a log from the machine without talking to the machine itself, right?
<jam> rogpeppe: you get the log from the provider
<rogpeppe> jam: can we assume that's available from al providers?
<rogpeppe> all
<fwereade_> jam, rogpeppe: in ec2 we originally decided against it because it makes bootstrap take longer, and yeah, I'm not sure we can depend on it everywhere
<jam> rogpeppe: openstack has "nova console-log"
<jam> I'm pretty sure EC2 has it
<fwereade_> jam, quite a lot of delay on ec2 at least
<jam> because it is how people boot a machine and add the ssh key to their known hosts without having to manually verify the ssh fingerprint
<jam> fwereade_: how does it make bootstrap much longer?
<jam> because of the time to generate a key?
<jam> fwereade_: because it hasn't actually made it any longer to get to a *usable* state
<jam> given that we have to wait for whatever song and dance to finish anyway.
<fwereade_> jam, literally just the wall clock time before `juju bootstrap` returns
<jam> fwereade_: if you have to wait for jujud to be started before you can connect, you have to wait for jujud to be started, it isn't much longer if you are waiting for the key to be generated at startup.
<jam> fwereade_: you don't have to wait for it, you ask for it at the same time you would have done it during first connect.
<fwereade_> jam, we're talking about different things
<fwereade_> jam, it was a user perception argument
<jam> fwereade_: to be fair, users have wanted us to wait for bootstrap to return for other reasons :)
<fwereade_> jam, "oo, that was quick, how nice"
<fwereade_> jam, yeah, indeed, I am not sure it holds much actual water
<jam> fwereade_: My argument is that it can be functionally identical to what we have today.
<jam> if we wait in bootstrap
<jam> or we wait in first connect
<jam> first connect already has to wait
<rogpeppe> jam: i think it's different
<jam> it just happens to *also* not have the public key for the TLS connection.
<jam> until the machine finishes first-startup and we read it from the console-log
<rogpeppe> jam: you can call bootstrap, it returns. then some time later you can connect quickly.
<fwereade_> jam, well, yeah, rogpeppe's use case has *some* utility
<rogpeppe> jam: what happens if your network connection goes down half way through bootstrap?
<jam> rogpeppe: except we can't today
<jam> because we are waiting for it to start up
<jam> and under the proposed situation
<jam> it is no different
<jam> just the "first connection" reads the console log first
<jam> rogpeppe: doesn't matter. Again, *Do it the way we do today*
<rogpeppe> fwereade_: i can certainly say that i've made use of that functionality
<jam> you just read the public key in a different way
<fwereade_> rogpeppe, likewise
<jam> instead of assuming you can write it up, just read it back
<jam> fwereade_: well, I've gone and gotten a coffee because I was waiting for it, but that doesn't mean it was *helpful* functionality :)
<rogpeppe> jam: i'd be slightly concerned that we start fundamentally relying on provider-specific security for the log data
<jam> rogpeppe: don't we already?
<fwereade_> jam, rogpeppe: the other possibility is to extend the hardware-characteristics hack
<jam> rogpeppe, fwereade_: we can always add some sort of "secret nonce" to the instance data that we validate on connect, but that doesn't have to be a giant cert file
<fwereade_> jam, rogpeppe: but that's probably also dependent on cloud-specific bits
<jam> fwereade_: don't we already have a bootstrap nonce to know that the machine we started is the one we asked for?
<jam> I realize some of that today is written into the DB for the server-side code to do the validation
<jam> but we could save that locally
<jam> and then have an actually unique bootstrap node nonce
<jam> instead of the hard-coded one
<fwereade_> jam, no -- the bootstrap nonce is just a hardcoded value *because* there's no risk of double-starting the bootstrap node
<fwereade_> jam, not to say it's a bad idea
<fwereade_> jam, it's probably good actually
<rogpeppe> jam: i'm not sure we do currently need to rely on provider-level security for anything other than starting an instance and making sure that the user data isn't compromised along the way.
<jam> fwereade_: well we have bug #1212177 today, which potentially gives an opening for double-bootstrapping machine-0
<_mup_> Bug #1212177: bootstrap detection stops too early <ui> <juju-core:Triaged> <https://launchpad.net/bugs/1212177>
<jam> rogpeppe: so what security is required in reading the log?
<jam> we aren't writing the private key there
<jam> (i don't think the client needs it)
<jam> just the public key
<rogpeppe> jam: we need to know definitively that we're reading the unaltered log from the correct machine
<fwereade_> jam, ha, I'd missed that one
<jam> rogpeppe: but if you write up the secret key, you definitely need to make sure you are connecting to the write IP address and someone didn't start another instance with the same private details.
<jam> write IP => right IP
<rogpeppe> jam: i think that if someone manages to intercept the initial user data and compromises the provider DNS lookup, i think we're stuffed any which way
<jam> rogpeppe: What I really want to get out of this is that there is more than one way to solve this, and none of us are security experts. So while we can put forth some ideas, if we really care about having good security we should ask someone who knows what the likely and unlikely threats are and see how our proposals stack up in practice.
<fwereade_> rogpeppe, jam: what if we were to create a bootstrap-only CA and finish bootstrap by sending back down the real one generated on the server?
<rogpeppe> fwereade_: that's pretty much what i had in mind
<jam> fwereade_: isn't that the same problem of bootstrap having to wait for the server to finish booting?
<jam> and in that case it is *bootstrap* waiting.
<rogpeppe> jam: i *think* that fwereade_ was thinking of doing that on first connection
<rogpeppe> jam: first connection == "finish bootstrap"
<fwereade_> jam, no, we do it as today: but we use the bootstrap CA only for the first connection, and as part of that we get the real one
<rogpeppe> jam: that's what i was thinking of anyway
<jam> fwereade_: CA vs cert
<jam> it doesn't sound like we need a bootstrap CA
<jam> just a cert
<jam> and then it generates a CA and a signed cert for that connection.
<jam> And then how we get that private CA to sign the keys for other HA nodes
<fwereade_> jam, can we revoke that effectively though?
<jam> fwereade_: so yes, it sounds like a workable solution, but I'd like someone who actually knows this stuff to actually validate our thoughts.
<jam> fwereade_: revoke what? The initial cert?
<fwereade_> jam, yeah; if we have completely separate CA certs we don't need to worry about the old one being misused
<jam> why do we care to revoke an ephemeral cert?
<fwereade_> jam, because if it's vulnerable then bad peoples might use it
<jam> after first connect, we regenerate everything anyway, and we stop recording that the original one was what we wanted to connect to.
<fwereade_> jam, ok, cool, I must have misread you
<jam> fwereade_: so a CA is one cert that is used to sign other certs so that you can check that the cert is valid without having to have the public key for every possible cert.
<jam> that sounds like *way* overkill for an initial cert that we are going to be replacing.
<jam> fwereade_: so my proposal is that on "bootstrap" we generate a cert and call it "bootstrap-cert". We then connect and only allow connections to someone whose public key matches that bootstrap cert.
<jam> And on the jujud side
<jam> it generates a new CA and a new cert for that machine.
<jam> and we do the "here is the new CA for all machines in this environment"
<fwereade_> jam, ah, got you, for bootstrap we just send up a cert and key, and we get *back* a CA cert that was used to sign the new server cert
<fwereade_> perfect
<jam> fwereade_: if we *do* generate stuff on server side, we should look into the entropy stuff that Dustin worked on.
<jam> which is to take client entropy and feed it into the startup
<fwereade_> jam, yes indeed
<rogpeppe> jam: it's a bit awkward to bypass CA certs entirely for TLS connection
<jam> because otherwise 0 entropy on newly started machines generate bad certs. :)
<fwereade_> jam, quite so
<rogpeppe> jam: we already generate stuff server side, so we already need entropy
<fwereade_> rogpeppe, jam, dustin was talking at iom about providing an entropy service
<rogpeppe> fwereade_: that sounds like a good plan.
<rogpeppe> fwereade_: but more is always better.
<fwereade_> rogpeppe, doesn't it
<jam> fwereade_: and I'm pretty sure that service is what he was talking about charming
<jam> and found it difficult.
<fwereade_> rogpeppe, but I'm not sure we can depend on that being available, so sending it up from the client will work today I guess
<jam> fwereade_: but regardless of that, he had an early thread talking about possibilities.
<jam> and one is to just call out to /dev/random on the client and write up ~100 bytes or so to the server
<jam> to seed its random number generator.
<rogpeppe> jam: yeah, that sounds like a good plan
<jam> http://blog.dustinkirkland.com/2012/10/entropy-or-lack-thereof-in-openstack.html and http://blog.dustinkirkland.com/2012/10/seed-devurandom-through-metadata-ec2.html
<rogpeppe> jam: it helps more if you're not starting juju from a cloud instance itself
<jam> http://blog.dustinkirkland.com/2012/10/seed-devurandom-through-metadata-ec2.html mentions that *today* you can seed /dev/urandom by just writing to it
<jam> and we already use cloud-init
<jam> so we just need to add
<jam> the AddFile(/dev/urandom, "$SOME_RANDOM_BYTES")
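A minimal sketch of the client-side half of that idea, assuming Go's crypto/rand as the local entropy source. The helper name newSeed is hypothetical, and how the bytes travel through cloud-init to /dev/urandom on the instance is deliberately left out.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// newSeed returns n bytes read from the client's system random source.
// The name is hypothetical; it is not a juju-core function.
func newSeed(n int) ([]byte, error) {
	seed := make([]byte, n)
	if _, err := rand.Read(seed); err != nil {
		return nil, err
	}
	return seed, nil
}

func main() {
	// Roughly 100 bytes, as suggested in the discussion.
	seed, err := newSeed(100)
	if err != nil {
		fmt.Println("cannot read entropy:", err)
		return
	}
	// Base64 keeps the seed safe to embed in text-based user data.
	// Writing it to /dev/urandom on the new instance is not shown here.
	fmt.Println(base64.StdEncoding.EncodeToString(seed))
}
```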
<wallyworld_> jpds: fwereade_: the rsync bug should have been fixed last release or the one before
<fwereade_> wallyworld_, just for units, or also for machine 0? it seemed like jpds was seeing repeats from machine-0.log
<jam> fwereade_: the bug that was filed was about adding a unit to machine-0 caused a loop
<jam> fwereade_: if we're seeing a different issue about cycling on itself, then it would need a different fix
<fwereade_> jam, repeated trash from the past was the symptom, though, so maybe it's the same thing happening
<fwereade_> jam, AddFile(/dev/urandom, "$SOME_RANDOM_BYTES") sounds good
<wallyworld_> fwereade_: should have been for everything afaik. i didn't do the fix myself (hence i didn't really test it)
<wallyworld_> if there might still be an issue it should be investigated
<jam> fwereade_: the fix landed August 20th, so it should definitely be in 1.13.2, so if that isn't fixing it, we need a different fix.
<rogpeppe> fwereade_: assuming that AddFile will work in that context, yeah
<jam> fwereade_: bug #1217808 so we don't forget about it.
<_mup_> Bug #1217808: juju should seed entropy into started instances <juju-core:Triaged> <https://launchpad.net/bugs/1217808>
<jam> rogpeppe: http://blog.dustinkirkland.com/2012/10/seed-devurandom-through-metadata-ec2.html
<fwereade_> jam, you just beat me there, thanks
<fwereade_> I'll do the debug-log spam one
<rogpeppe> jam: ah, cool
<jam> fwereade_: original bug for syslog is bug #1211147
<_mup_> Bug #1211147: Deploying service to bootstrap node causes debug-log to spew messages <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1211147>
<jam> fwereade_: bug triage clarification, easy vs bitesize. Easy means a core dev focusing on it shouldn't have much problem, and bitesize is for 3rd parties, or is it the other way around
<fwereade_> jam, bitesize means the actual change is small but you may need context to know what it is
<fwereade_> jam, easy means you don't need so much context but does not necessarily connote "small"
<jam> fwereade_: unfortunately that doesn't stick easily in one's head.
<fwereade_> jam, eh, I picked it because it seemed obvious, I have no specific attachment to those words
<fwereade_> jam, if there's something more standard that we should use to indicate those distinct characteristics I'm fine with that
<fwereade_> brb
<rogpeppe> fwereade_, jam: here's a possible way of doing things that doesn't rely on provider logs and does not require the API server to change its server key: http://paste.ubuntu.com/6036133/
<rogpeppe> jam: it uses a bootstrap key, but only for signing - it never actually needs to serve the connection using the bootstrap key
<jam> rogpeppe: given you can sign a key from multiple other certs, why not have the "bootstrap key" be effectively a CA, and then the newly generated CA and key itself are signed by the key passed in?
<rogpeppe> jam: ah yes! if i understand correctly, i think that's a great idea
<rogpeppe> jam: then for the first connection, we use the bootstrap cert as the only accepted CA, but then we change the accepted CA down one level.
<jam> rogpeppe: essentially it changes that first connect from using TLSInsecuryKey
<jam> right
<rogpeppe> jam: that's much nicer, thanks!
<rogpeppe> jam: and it means we just need one call in the API (CACert) and to change the initial connection code to change the CA cert to that
<jam> rogpeppe: and the server side to generate the new CA and cert and sign them, but it doesn't change a lot of the actual interchange
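The chaining idea discussed here can be sketched with Go's crypto/x509 package: a client-generated bootstrap CA signs the CA that the server generates, so the first connection can trust the new CA through the bootstrap one. This is only an illustration under assumptions: the helper names newCA and trustedBy, the certificate names, and the validity periods are all hypothetical and not taken from juju-core.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newCA creates a CA certificate named name. With a nil parent the
// certificate is self-signed; otherwise it is signed by parent/parentKey.
func newCA(name string, serial int64, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(serial),
		Subject:               pkix.Name{CommonName: name},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(24 * time.Hour),
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
		BasicConstraintsValid: true,
		IsCA:                  true,
	}
	signer, signerKey := tmpl, key
	if parent != nil {
		signer, signerKey = parent, parentKey
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return nil, nil, err
	}
	return cert, key, nil
}

// trustedBy reports whether cert verifies against root as the only trusted CA.
func trustedBy(cert, root *x509.Certificate) bool {
	pool := x509.NewCertPool()
	pool.AddCert(root)
	_, err := cert.Verify(x509.VerifyOptions{
		Roots:     pool,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	return err == nil
}

func main() {
	// Client side: the bootstrap CA is generated before bootstrap.
	bootstrapCA, bootstrapKey, err := newCA("bootstrap", 1, nil, nil)
	if err != nil {
		fmt.Println(err)
		return
	}
	// Server side: the environment CA is signed with the bootstrap key.
	envCA, _, err := newCA("environment", 2, bootstrapCA, bootstrapKey)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("environment CA chains to bootstrap CA:", trustedBy(envCA, bootstrapCA))
}
```

After the first connection the client would keep only the environment CA certificate and discard the bootstrap key, which this sketch does not model.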
<rogpeppe> jam: yeah.
<jam> rogpeppe: fwereade_: the bot seems to be stuffed right now, getting errors in CompareAndSwap, any ideas of what might fix it?
<rogpeppe> jam: link?
<jam> rogpeppe: recent failures trying to submit merge requests, one sec
<jam> rogpeppe: https://code.launchpad.net/~thumper/juju-core/container-address/+merge/182271
<fwereade_> rogpeppe, jam: nice
<fwereade_> jam, doesn't ring a bell
<mgz> I didn't poke anything to upset the bot recently...
<rogpeppe> jam: the problem looks as if notifyWorker.commonWatcher is nil
<jam> mgz: I don't trust you. I have my eyes on you :)
<fwereade_> allenap, ping
<rogpeppe> jam: ah, i've found part of the bug, at any rate
<jam> rogpeppe: what's that?
<rogpeppe> jam: worker/machiner/machiner.go:53
<rogpeppe> jam: it's the old statically typed nil gotcha
<jam> rogpeppe: I don't quite understand. An err that isn't actually nil?
<jam> rogpeppe: my line 53 is "m.Watch()"
<rogpeppe> jam: *return* m.Watch(), right?
<jam> rogpeppe: as in it should be return "m.Watch(), nil" ?
<rogpeppe> jam: no, it should be: w, err := m.Watch(); if err != nil { return nil, err}; return w, nil
<rogpeppe> jam: alternatively we could make m.Watch return api.NotifyWatcher, or make SetUp return *watcher.NotifyWatcher
<jam> rogpeppe: I thought that was when you return an err that is actually nil but then it gets perceived as not nil, is that correct?
<rogpeppe> jam: the problem is that SetUp is returning non-nil, non-nil
<jam> but the other bug is that we are returning an error and a nil pointer for *watcher.NotifyWatcher but that ends up in an interface that doesn't realize it is nil?
<jam> rogpeppe: because it is actually returning nil but that is being put into an interface which makes it look not-nil ?
<rogpeppe> jam: exactly
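The gotcha described in this exchange can be reproduced in a few lines. This is a sketch with stand-in types: NotifyWatcher, concreteWatcher, watch, setUpBuggy and setUpFixed are hypothetical names chosen to mirror the conversation, not the real juju-core code.

```go
package main

import (
	"errors"
	"fmt"
)

// NotifyWatcher is a stand-in interface, not the real juju-core type.
type NotifyWatcher interface {
	Stop() error
}

// concreteWatcher is a hypothetical concrete implementation.
type concreteWatcher struct{}

func (*concreteWatcher) Stop() error { return nil }

// watch mimics a call that returns a concrete pointer type and fails.
func watch() (*concreteWatcher, error) {
	return nil, errors.New("watch failed")
}

// setUpBuggy forwards both results directly. The nil *concreteWatcher is
// wrapped in a NotifyWatcher, so the caller sees a non-nil interface.
func setUpBuggy() (NotifyWatcher, error) {
	return watch()
}

// setUpFixed checks the error first and returns an untyped nil on failure.
func setUpFixed() (NotifyWatcher, error) {
	w, err := watch()
	if err != nil {
		return nil, err
	}
	return w, nil
}

func main() {
	w, err := setUpBuggy()
	fmt.Println(w == nil, err != nil) // false true
	w, err = setUpFixed()
	fmt.Println(w == nil, err != nil) // true true
}
```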
<rogpeppe> jam: i think api.Machine.Watch should return  api.NotifyWatcher
<rogpeppe> jam: i'll propose a fix
<jam> rogpeppe: that *might* be an import loop. At least it looks a lot like when I added the NotifyWatcher interfaces and fwereade_ asked me to return the concrete type that happens to implement the interface.
<rogpeppe> jam: yeah, it's an import loop, but i'll just move the interface definition into watcher itself
<rogpeppe> api/watcher, that is
<fwereade_> jam, rogpeppe: that nil-not-nil behaviour is starting to really get up my nose
<rogpeppe> fwereade_: yeah, it's occasionally very annoying
<jam> fwereade_: it is definitely a giant gotcha for people that don't think about it a lot
<jam> fwereade_: and it means you have to put lots of the "if err != nil" statements when it doesn't look like you actually need it.
<fwereade_> jam, and you only really do think about it when it hits you
<rogpeppe> fwereade_: i think it might be amenable to a go vet check actually
<rogpeppe> fwereade_: FWIW i proposed a very long time ago that there should only be one nil, but that has its own problems
<fwereade_> rogpeppe, yeah, I might have read that discussion
<jam> rogpeppe: as in interface{nil} is not possible? or that it == nil or ?
<rogpeppe> jam: that typeof((*Foo)(nil)) != *Foo
<rogpeppe> jam: in general, typeof(x) == typeof((interface{})(x))
<rogpeppe> jam: and "one nil to rule them all" would break that
<jam> rogpeppe: as in, that should just return "nil type" ?
<rogpeppe> jam: if you wanted to get rid of the problem, you'd make it, i think, so that interface{}(anyNil) == nil
<natefinch> if I could just easily ask an interface if the thing contained within it was nil, I'd be happy.   the <interface> == nil thing has bitten me a few times
<rogpeppe> natefinch: reflect.ValueOf(x).IsNil() :-)
<rogpeppe> natefinch: i agree though
<natefinch> That's why I said "easily"  I don't consider reflect to be easy :)
<rogpeppe> natefinch: :-)
<rogpeppe> natefinch: i'm not sure when it would ever be appropriate to do that though, through reflect or not.
<rogpeppe> natefinch: the underlying issue is that typed nil is just as valid a value as typed 0 or typed ""
<rogpeppe> natefinch: so the language is just being consistent
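A small sketch of the reflect-based check mentioned above. The helper name holdsNil and the myErr type are hypothetical. Note that reflect's IsNil panics for kinds that cannot be nil, so the helper guards on the kind first.

```go
package main

import (
	"fmt"
	"reflect"
)

// holdsNil reports whether x is a nil interface or an interface that
// wraps a nil pointer, map, slice, channel or function.
func holdsNil(x interface{}) bool {
	if x == nil {
		return true
	}
	v := reflect.ValueOf(x)
	switch v.Kind() {
	case reflect.Chan, reflect.Func, reflect.Interface, reflect.Map,
		reflect.Ptr, reflect.Slice, reflect.UnsafePointer:
		return v.IsNil()
	}
	// Values such as ints and strings can never be nil.
	return false
}

// myErr is a hypothetical pointer-receiver error type.
type myErr struct{}

func (*myErr) Error() string { return "boom" }

func main() {
	var p *myErr
	var err error = p
	fmt.Println(err == nil)    // false: the interface carries a type
	fmt.Println(holdsNil(err)) // true: the wrapped pointer is nil
	fmt.Println(holdsNil(42))  // false
}
```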
<jam> rogpeppe: but type Foo string; f := Foo(""); if f == "" {} works today ,doesn't it?
<jam> or do you have to do "if f == Foo("")" ?
<rogpeppe> jam: yes, you do
<natefinch> rogpeppe: I know... it's just annoying when you have a function that returns a pointer wrapped by an interface, and the pointer knows that it's nil and therefore not valid, but the interface doesn't know that
<natefinch> jam: they're different types and therefore can never be equal
<rogpeppe> natefinch: yes, that's the classic place to get the gotcha
<natefinch> jam:  even if Foo  is type Foo string
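As a point of reference for the comparison question, a minimal sketch using the Foo type from the exchange: an untyped constant such as "" is converted to the other operand's type, so comparing against the literal compiles, while comparing against a variable of type string needs an explicit conversion.

```go
package main

import "fmt"

// Foo is a named string type, as in the question.
type Foo string

func main() {
	f := Foo("")
	s := ""

	// The untyped constant "" is converted to Foo for the comparison.
	fmt.Println(f == "") // true

	// A typed string variable must be converted explicitly.
	fmt.Println(f == Foo(s)) // true

	// fmt.Println(f == s) // does not compile: mismatched types Foo and string
}
```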
<natefinch> rogpeppe: yep.  First time I hit it, it took me all day to figure out wtf was going on
<rogpeppe> natefinch: i do think it might be reasonable to get go vet to help here
<rogpeppe> natefinch: i think it does some type-base analysis already
<rogpeppe> based
<natefinch> rogpeppe: yeah, I've had some good help from go vet... though I haven't studied exactly what it finds
<jam> rogpeppe: for your earlier conversation, I think the newly proposed key details is http://paste.ubuntu.com/6036251/
<rogpeppe> natefinch: in particular, it could check if you're returning a statically typed value from a function that returns an interface and an error
<jam> natefinch: the place it has bitten a lot in juju-core is actually the "error" interface when you have a custom error type that is a pointer.
<natefinch> rogpeppe: could or does?
<rogpeppe> natefinch: could
<rogpeppe> natefinch: i'm pretty sure it doesn't currently
<natefinch> rogpeppe: ok
<jam> rogpeppe: does that matter? error itself is problematic because it is an interface
<rogpeppe> jam: i'm not sure how to avoid too many false positives for the error case itself
<rogpeppe> jam: the place i've seen the problem hit most subtly is when you do return f()
<natefinch> seems like the only reliable way to avoid the nil problem is to have "if foo == nil { return nil } else { return foo }" everywhere you return an interface :/
<rogpeppe> jam: from a function that returns an error
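The error-interface variant of the problem can be sketched as follows, assuming a hypothetical MyError type with a pointer receiver. The names validate, checkBuggy and checkFixed are illustrative only.

```go
package main

import "fmt"

// MyError is a hypothetical custom error type used through a pointer.
type MyError struct{ Msg string }

func (e *MyError) Error() string { return e.Msg }

// validate returns a concrete *MyError, which is nil on success.
func validate(ok bool) *MyError {
	if ok {
		return nil
	}
	return &MyError{Msg: "invalid"}
}

// checkBuggy does "return f()" from a function that returns error.
// Even on success the result is a non-nil error wrapping a nil *MyError.
func checkBuggy() error {
	return validate(true)
}

// checkFixed tests the concrete value and returns an untyped nil.
func checkFixed() error {
	if err := validate(true); err != nil {
		return err
	}
	return nil
}

func main() {
	fmt.Println(checkBuggy() == nil) // false
	fmt.Println(checkFixed() == nil) // true
}
```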
<jam> rogpeppe, natefinch: http://play.golang.org/p/fvTyl7V1EY
<jam> natefinch: that has been my experience
<rogpeppe> natefinch: well, anywhere you return a statically typed value that may be nil from an interface, yes
<jam> rogpeppe: "from an" or "as an" ?
<rogpeppe> jam: "as an", sorry, yeah
<rogpeppe> jam: or "from a function that's returning an"
<jam> rogpeppe: do you agree with http://paste.ubuntu.com/6036251/
<rogpeppe> jam: am looking
<natefinch> btw, I'm reading Clean Code, and it's coming across quite java-y...  especially the naming. createPluralMessageDependentParts?  This is from the *good* example? :/
<natefinch> not that I don't think there's a lot of good stuff... but I could tell from the first chapter it was written by a java guy
<jam> rogpeppe, natefinch: https://plus.google.com/hangouts/_/f497381ca4d154890227b3b35a85a985b894b471
<jam> TheMue: ^^
<fwereade_> TheMue, standup
<allenap> fwereade_: pong
<fwereade_> allenap, https://code.launchpad.net/~allenap/juju-core/makefile-stuff/+merge/181113 has LGTMs, does it need something else?
<rogpeppe> 	store CA cert locally for future use, discarding bootstrap key
<jam> fwereade_: sexy times, I think.
<rogpeppe> http://paste.ubuntu.com/6036298/
<allenap> fwereade_: Yeah, I need to fix something that jam brought up, then I'll land it. (I don't know why, but email from codereview.a.c goes straight to archive in Gmail and I don't see it; it's probably a bad filter, but it's the reason I haven't been very responsive to reviews on this branch.)
<fwereade_> allenap, no worries, thanks
<jam> rogpeppe: fwiw, your branch got bumped because of the nil stuff. I don't know if your branch *triggers* it, or it is a race condition. Other branches have landed since.
<rogpeppe> jam: thanks. i'll propose a fix first, i guess
<allenap> jam: Do you think it's worth always doing `go test -i $PROJECT/...` before running the tests for real? (wrt Makefile)
<rogpeppe> jam: does this look reasonable to you? http://paste.ubuntu.com/6036547/
<rogpeppe> fwereade_: ^
<rogpeppe> jam, fwereade_: a fix for the worker nil-in-interface problem: https://codereview.appspot.com/13326044
<rogpeppe> fwereade_: i'm not sure what you mean by your comment here: https://codereview.appspot.com/13269045/diff/3001/environs/testing/polling_test.go#oldcode80
<rogpeppe> fwereade_: what should have moved where?
<fwereade_> rogpeppe, *something* is all -- it just seemed to be an environ test entirely in terms of provider
<rogpeppe> fwereade_: it's a test for environs/testing.PatchAttemptStrategies, no?
<rogpeppe> fwereade_: and the test is in environs/testing, which seems reasonable to me
<fwereade_> rogpeppe, so maybe that's the thing that should move, or maybe it's fine as it is -- i subscribe to the latter really, but that one file made me pause more than the others
<rogpeppe> fwereade_: ok, thanks
<rogpeppe> fwereade_: FWIW i think we should keep the subdirectories of provider entirely for provider impls
<rogpeppe> fwereade_: (that's why i created it)
<rogpeppe> fwereade_: "all" is the only exception, which seems kinda ok
<fwereade_> rogpeppe, ehh, that kinda forces us to cram all the possible utility functions into one package
<fwereade_> rogpeppe, if it came to it I'd rather have provider basically empty and a util subdir
<fwereade_> rogpeppe, but we're making progress regardless
<fwereade_> rogpeppe, if the packages are themselves clean it's a hell of a lot easier to rearrange them
<rogpeppe> fwereade_: i know what you mean.
<rogpeppe> fwereade_: but we need somewhere to put the actual provider implementations, and i really don't want to mix that name space with the other utility packages again
<rogpeppe> fwereade_: we could have providerutils, i suppose
<natefinch> man, juju really needs better feedback... I can never tell if a command is hanging or if it's just working, but slowly
<natefinch> (it's pretty much always the latter, but as a user... I can't tell)
<natefinch> every time I do juju bootstrap, I immediately follow it with  juju status, and juju status hangs forever.  Tried with both azure and AWS.  Anyone seen that?
<rogpeppe> natefinch: i use --verbose or --debug generally
<natefinch> lemme retry with that... I always forget
<natefinch> this repeated every 1/3rd of a second: 2013-08-28 14:45:42 ERROR juju open.go:89 state: connection failed, will retry: dial tcp 137.116.116.136:37017: connection refused
<rogpeppe> jam: just looking at https://codereview.appspot.com/12949047
<rogpeppe> jam: i'm not sure we should always default to simplify
<natefinch> I think I've seen that before under similar circumstances... where it just keeps retrying for forever
<rogpeppe> natefinch: if that's happening, you need to ssh to the machine and look in the logs to see what's actually gone on
<natefinch> rogpeppe: huh ok... of course I tried juju ssh which fails with the same error.   Have to do it the hard way I guess
<rogpeppe> natefinch: yeah, juju ssh requires an environment to connect to
<natefinch> rogpeppe: I'm spoiled by  how easy juju makes things :)
 * rogpeppe goes for lunch
<jam> natefinch: if it is "forever" then it is a bug. if it is "a really long time" then it is just because instances take a long time to start up. For Azure I've heard it can be 10 minutes before cloud-init is all finished.
<jam> natefinch: the one thing I've seen over in the HP Cloud is that bootstrap can start and the machine only has a private IP address, and if 'status' sees that and thinks it should connect there, it will never connect (and the machine later gets a public address).
<jam> We *should* have HP Cloud such that we wait for a pub address before we continue (thanks to noodles775)
<natefinch> jam: I waited 10 minutes. I'd say that's close enough to forever :)
<jam> natefinch: You can look at whatever overview page Azure provides (I don't know it)
<jam> to make sure the machine is booting.
<natefinch> jam: yeah... I just killed it, I'm restarting the process with debugging so I can see what's going on
<jam> natefinch: and it is also possible something broke and the machine is up but jujud didn't get itself properly configured and running.
<jam> natefinch: this is one of our mistakes. By default juju gives no output, so all of *us* run with -v, which is slightly too much output, but we really need something more than 0
<natefinch> jam: I'll leave it going for longer this time, I might have killed it just before it finished booting up
<natefinch> jam: yeah, I have that in an email that I was going to send out.  There's not enough user feedback, especially for someone new to the product.
<jam> i believe fwereade_, thumper, and myself agreed on informative messages by default (no -v)
<fwereade_> +1
<jam> and a -q if you really want nothing.
<natefinch> jam: yeah, I much prefer that way
<jam> there is an argument that you shouldn't tell the user you just did what they asked you to do, but the delays involved mean you sit around wondering "what is going on"
<natefinch> I'd like to have bootstrap return the text from juju status, honestly.  I always type juju status after running bootstrap, and I can't be the only one
<jam> natefinch: that bit is that we don't want to wait around until status is ready. And was discussed *heavily*
<jam> But I personally agree that having bootstrap block your terminal is no worse than the next thing you want to do blocking your terminal
<natefinch> jam: well, so that's interesting... what information do we have immediately after bootstrap returns?  There has to be something that tells bootstrap "ok this was a success"
<jam> but bootstrap and deploy etc are all async by design
<natefinch> If bootstrap is async, why does it take 2 minutes to return?
<jam> natefinch: it doesn't on ec2, it does take a little while to process and get everything requested
<natefinch> hmm.. maybe my azure setup is just borked
<jam> but not 2 min
<jam> natefinch: azure has been reported to be slow
<natefinch> jam: ahh... juju status finally returned after 11 minutes of connection refused
<jam> natefinch: so there is definitely a "lets investigate and see if we can make it better" you can compare that with canonistack pretty easily
<natefinch> jam: with an error about no reachable servers
<jam> natefinch: no reachable servers happens if we don't see anything BUILDING within 10s or so
<natefinch> jam: maybe someone messed that up and made it 10 minutes?
<natefinch> jam: http://pastebin.ubuntu.com/6036957/
<natefinch> jam: ahh yeah, DefaultDialOpts() in state/open.go uses a 10 minute timeout
<jam> natefinch: that is the time we expect an instance to start in
<jam> vs the time we expect to see a machine *start* building
<natefinch> jam: it's not really ok to let a command take 10 minutes to return
<jam> natefinch: you asked for a command to wait for the instance to be started, it takes that long for it to start
<natefinch> jam I asked for the status. The status can be "not started yet, or perhaps in the process of starting"
<natefinch> jam: I never expect a command to wait for 10 minutes. Pretty much ever, unless I specifically tell it to
<natefinch> jam: I'd expect "timed out waiting for environment to finish bootstrapping, try again in a few minutes" after at most a 30 second timeout
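The behaviour asked for here, retrying for a bounded time and then returning a helpful message, could look roughly like the sketch below. The function waitFor, the errRefused value and the fake dial are hypothetical and are not how state/open.go is written.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errRefused stands in for a "connection refused" dial error.
var errRefused = errors.New("connection refused")

// waitFor calls attempt until it succeeds or timeout has elapsed,
// sleeping for delay between tries.
func waitFor(attempt func() error, timeout, delay time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := attempt()
		if err == nil {
			return nil
		}
		// Give up if the next try would start after the deadline.
		if time.Now().Add(delay).After(deadline) {
			return fmt.Errorf("timed out waiting for environment to finish bootstrapping, try again in a few minutes (last error: %v)", err)
		}
		time.Sleep(delay)
	}
}

func main() {
	tries := 0
	// A fake dial that succeeds on the third attempt.
	dial := func() error {
		tries++
		if tries < 3 {
			return errRefused
		}
		return nil
	}
	err := waitFor(dial, 30*time.Second, 10*time.Millisecond)
	fmt.Println(err, tries)
}
```

The 30 second figure comes from the suggestion above; a real client would probably make it configurable.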
<jam> natefinch: to be fair things other than azure don't take that long
<natefinch> jam: well, to be fair, we explicitly set the timeout at 10 minutes :)
<natefinch> jam: also, something must be broken since it's been 45 minutes and I still can't get status, even though the azure management page says the instance is up and running fine, and I can ping it at that IP address.  Weird.
<jam> natefinch: can you ssh in?
<jam> I don't doubt something might be broken
<jam> and cloud-init-output.log is the best way to find out
<natefinch> it might just be my configuration though
<natefinch> I'm trying to figure out how to ssh in without using juju ssh... not sure what key it thinks it's using
<jam> natefinch: it is just ssh to the host address
<jam> we just look it up for you with "juju ssh"
<TheMue> arrrgh, need help by a bzr-diff-patch-freak
<natefinch> jam: right, but juju ssh has the same connection problem
<TheMue> jam: maybe you can help me
<jam> natefinch: juju ssh can't work because we connect to the db to find the address.
<jam> natefinch: but if azure says it is up and running, grab the ip address and ssh into it
<jam> ssh ubuntu@host
<natefinch> Permission denied (publickey)
<TheMue> jam: question is here: how to apply a (reverting) patch, when one of the files has been renamed since then
<jam> TheMue: bzr merge -r AFTER..BEFORE ?
<jam> TheMue: so if rev 100 committed something; bzr merge -r 100..99
<TheMue> jam: ah, hmm, ok, i created a diff and tried to merge it with patch -p0 -R < mydiff.patch
<TheMue> jam: but will try it now that way, thanks
<jam> natefinch: I know there was a bug with cloud-init on azure not getting ssh access, but I thought it was just for units other than machine-0
<jam> you might check the launchpad bugs
<natefinch> jam: ok, thanks
<TheRealMue> jam: ah, great, worked. exactly what i wanted. thx again
<TheMue> jam: wow, you're fast
<natefinch> rogpeppe, jam, mgz, TheMue, fwereade_: anyone have an opinion on whether or not we should abstract away runtime.GOOS so we can test OS-specific code?   Specifically looking at a function I'm writing to get the user's "HOME" directory, which is different on linux vs. windows
<rogpeppe> natefinch: hmm, interesting question
<rogpeppe> natefinch: in the end, path/filepath is different on linux vs windows so we can't get away without having tests that might fail on one platform or another
<rogpeppe> natefinch: i suggest we concentrate all the OS-specific stuff in one place
<natefinch> rogpeppe: right, that's a good point
<rogpeppe> natefinch: (assuming we do have any code that needs tagging specifically for windows)
<TheMue> natefinch: regarding non-unix clients this sounds ok
<natefinch> rogpeppe: in theory most of the filepath stuff should abstract us away from the path differences, as long as we use the built-in separators and not hardcode slashes etc
<rogpeppe> natefinch: that's my thinking too
<rogpeppe> natefinch: the difficulty is when we have paths that can be either local or remote
<rogpeppe> natefinch: another approach is to keep all paths in portable form and to transform with filepath.FromSlash at the last moment possible
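A brief sketch of that approach, assuming a hypothetical helper localPath: paths stay slash-separated everywhere and are converted with filepath.FromSlash only when they are about to touch the local filesystem.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// localPath joins a slash-separated relative path onto root, converting
// separators to the local OS form only at this final step.
func localPath(root, portable string) string {
	return filepath.Join(root, filepath.FromSlash(portable))
}

func main() {
	// The portable form is what would be stored or sent to remote machines.
	portable := "environs/cloudinit/config.yaml" // hypothetical path
	fmt.Println(localPath("data", portable))
}
```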
<natefinch> rogpeppe: seems like we shouldn't be messing with specific paths too often... or are there places where that's done a lot?
<rogpeppe> natefinch: i'm thinking in particular of the stuff around environs/cloudinit
<rogpeppe> natefinch: but that's probably not too bad until we actually want to deploy on windows, which tbh would probably require a whole new init script, so not too bad.
<natefinch> rogpeppe: yeah, totally
<natefinch> rogpeppe: still not sure what the answer is - put windows-specific tests in a windows-specific file, so the tests will run on windows but not linux?  I wonder if they'll actually ever get run that way :/
<rogpeppe> natefinch: hopefully there won't be any windows-specific test necessary
<rogpeppe> s/test/tests/
<rogpeppe> natefinch: if we do need such tests, then they would need to be in a windows-specific file, yes
<natefinch> rogpeppe: "HOME" on windows is "HOMEDRIVE" + "HOMEPATH"  ... I'm writing a function to abstract away the difference so the rest of the code doesn't have to know... but I need a test for that function
<rogpeppe> natefinch: is it not possible to write in a portable way?
<rogpeppe> natefinch: even though it will only run under linux
<rogpeppe> windows even
<natefinch> rogpeppe: the problem is, that I have to switch on runtime.GOOS, so on Linux, the windows path never gets run
<natefinch> actually
<natefinch> rogpeppe: one exported function does the switch, calls two internal functions, one per OS... I can then test each internal function separately
<rogpeppe> natefinch: that was my thought too
<natefinch> rogpeppe: sometimes the obvious is too obvious ;)
<rogpeppe> natefinch: i think it's worth keeping away from build tags until we can definitely no longer avoid them
<natefinch> rogpeppe: yep
<rogpeppe> fwereade_, jam, natefinch, TheMue: small patch to testing.LoggingSuite: https://codereview.appspot.com/13351043
<natefinch> rogpeppe:  looking
<natefinch> rogpeppe: there's no way to redirect the log output to both the normal output and something you can test?
<rogpeppe> natefinch: well that wouldn't be testing the functionality i need to test, would it?
<natefinch> rogpeppe: oh, right :)
<natefinch> rogpeppe: if it were me, I'd remove the INFO part of the string matching... that seems like it's an internal detail to Infof, and not actually something this code cares about
<natefinch> rogpeppe: (in the tests, that is)
<rogpeppe> natefinch: actually, it's something that's specifically printed by the LoggingSuite code
<rogpeppe> natefinch: the error level, that is
<rogpeppe> natefinch: specifically this line: 	w.c.Output(3, fmt.Sprintf("%s %s %s", level, module, message))
<rogpeppe> natefinch: so i think it's worth leaving in
<natefinch> rogpeppe: oh yeah... ok, I didn't realize that was actually in this package.  That's good, then
<rogpeppe> natefinch: cool
<natefinch> LGTM'd
<TheMue> rogpeppe: LGTM by me too
<rogpeppe> TheMue, natefinch: thanks. landing.
<TheMue> rogpeppe: yw
<TheMue> so, i'm leaving, see you tomorrow
<natefinch> rogpeppe: arg.... was using this to get the drive off the path passed to my SetHome() method: http://golang.org/pkg/path/filepath/#VolumeName     but it only does the right thing on Windows
<rogpeppe> natefinch: ha, of course
<rogpeppe> natefinch: perhaps we really should go the build tag route
<rogpeppe> natefinch: and have windows-specific files and tests
<jam> natefinch: I would say write the code so that it looks at $HOMEPATH and $HOMEDRIVE from the env
<jam> and then write a test that forces them into the env
<jam> note that you need that anyway
<jam> because *my* homepath may not be on C:
<jam> and Users
<jam> vs Documents and Settings
<jam> vs whatever
<jam> natefinch: that code can run on any system.
<jam> natefinch: then I would write another test that doesn't poke at stuff, but asserts that we get a value for "FindHome()" or whatever we want to call it.
<jam> It won't really matter *what* that is, as long as it is valid.
<jam> And you *could* do a test that has a OS switch
<jam> so you test that we just use $HOME on linux, and when available on Windows, and $HOMEDRIVE\$HOMEPATH when $HOME isn't
<thumper> morning people
<thumper> rogpeppe: ping
<rogpeppe> thumper: hangout?
<thumper> rogpeppe: sounds good, got one handy?
<rogpeppe> thumper: usual standup hangout should do
<thumper> ok
<rogpeppe> thumper: https://plus.google.com/hangouts/_/f497381ca4d154890227b3b35a85a985b894b471?authuser=1
<natefinch> jam: the code is pretty simple, and does the right thing on the right OS, it's just hard to test it correctly on the wrong OS
<natefinch> jam: because the code uses stdlib functions that work differently on different OSes (like path.Join and filepath.VolumeName)
<natefinch> rogpeppe, jam: http://pastebin.ubuntu.com/6037498/
<natefinch> jam: notably, filepath.VolumeName returns an empty string on linux regardless, which means the setHomeWin function ends up setting the full path into HOMEPATH
<natefinch> jam: also, I disagree with looking at %HOME% on a Windows OS. Hopefully whoever set it would keep it in lockstep with HOMEPATH, but if not... that's really none of juju's business. HOMEPATH is what the OS itself uses as the user's home directory. Anything else is application-specific
<jam> natefinch: I use $HOME, I've used work machines that set $HOME to a shared network drive so it is shared across machines
<natefinch> jam: my point is... HOME has no meaning on Windows.  Any meaning is specific to the application that sets it up.  iTunes could set up HOME to mean the iTunes music library root or something.
<jam> natefinch: I fully disagree based on past experience
<jam> certainly *my* .ssh directory is in $HOME
<jam> I would say that if HOME is set, we should preferentially use it.
<natefinch> jam: I don't know how we can figure out which one the user wants us to use without asking them.  When I run a native windows application, I expect it to put stuff in my windows HOMEPATH.  If I happen to have $HOME set up somewhere else because I have cygwin installed... I'd be pretty annoyed that juju put stuff in there and not the windows default.  We have an application specific environment variable to override the OS default, it's JUJU_HOME.
<natefinch> jam: we have to be respectful of the defaults on each OS, so that the application feels like a native application, and not just a bad port.   That
<natefinch> think about it the other way.... would you want juju to put your .ssh folder under $HOMEPATH if it was set on linux?
<jam> natefinch: clearly not given that $HOMEDRIVE wouldn't be set
<jam> :)
<natefinch> I can go set it right now :)
<natefinch> we
<natefinch> we're not making an application to run under cygwin. We're making an application to run under windows.
<jam> natefinch: well, we're making an app to run under cmd.exe, which is a slightly different beast. There really isn't a "standard place to put your ssh keys on Windows". As such, if someone has cygwin installed, they likely *do* have ssh keys in $HOME/.ssh/* that would actually be useful for connecting to another machine.
<jam> it is possible that they also have putty somewhere, and we should try to find that as well
<jam> natefinch: *if* we had an explicit gui, then I would expect that to be much more windows-centric about where things are stored, and it would isolate itself and expect you to configure it via menus, and store its config in the Registry. But that isn't what we have.
<jam> (yet)
<jam> natefinch: I would accept that we shouldn't write there by default, but we shouldn't write to HOMEPATH either
<jam> we should be writing to APPDATA
<natefinch> jam: yes and no. .juju should definitely be in appdata... but .ssh is a different beast.  in theory it's supposed to be cross-application, though obviously that's not standard on windows
<jam> natefinch: it wouldn't be '.ssh' anyway if it was windows. (windows won't let you create a '.' dir in explorer)
<natefinch> jam: yes that's true (I hate that restriction in windows btw)
<natefinch> jam: the other problem with app data is that it's hard to find in explorer, and we expect people to edit the environment.yaml
<jam> natefinch: juju edit config to launch a default editor?
<jam> juju edit-config?
<natefinch> jam: I didn't know that existed :)
<jam> natefinch: it doesn't, but it could
<natefinch> jam: ahh. yeah, that would be very good
<natefinch> back in about 20 minutes, have to drop off a preschool application
 * rogpeppe is done for the day
<natefinch> jam: what do you think we should do for friday with this Windows stuff?  My inclination is to put the info in app data like it probably ought to be, and work on making it more accessible later
<thumper> blurgh...
<thumper> just had to clean up a long vomit stretch in the hall
<thumper> I wish the dog would stop eating the cat food
<thumper> who'd think it would upset her so much
<bigjools> good morning vietnam
<bigjools> thumper, I see your long vomit stretch from the dog, and raise you two projectile vomiting twins
<thumper> oh dear
<davecheney> morning
<bigjools> o/ davecheney
<davecheney> Ave Caesar
<bigjools> back in the rolling valleys of South Wales?
<davecheney> bigjools: it's grand
<bigjools> boyo
 * davecheney goes back to upgrade testing
#juju-dev 2013-08-29
<davecheney> thumper: more debug is needed here
<davecheney> 2013-08-29 01:42:25 DEBUG juju.worker.uniter.filter filter.go:289 got unit change
<davecheney> 2013-08-29 01:42:26 INFO juju.worker.upgrader upgrader.go:138 required tools: 1.13.2-precise-amd64
<davecheney> all the units are blocked on this line
<davecheney> no further output
<thumper> hmm...
<thumper> doesn't seem too helpful
<davecheney> this is in ap-southeast-2 as well
<bigjools> wallyworld__: how's the coffee machine?
<davecheney> so it shouldn't take more than a few seconds to get the tools
<thumper> I wonder where it is blocked
<davecheney> i'll hit it with SIGQUIT and hope stderr goes somewhere
<axw> wallyworld__: I've just pushed some changes to my image-metadata branch that fixes the marshalling so numbers aren't floats
<davecheney> thumper: http://paste.ubuntu.com/6038698/
<axw> wallyworld__: it's a bit gnarly though, you might want to review my changes to environs/simplestreams
<davecheney> thumper: right, it did the upgrade, but it didn't restart
<thumper> hmm...
<davecheney> hmm, maybe it worked
<davecheney> hard to tell from the output
<davecheney> thumper: ok, here is the issue
<davecheney> 2013-08-29 01:59:40 INFO juju runner.go:253 worker: start "upgrader"
<davecheney> 2013-08-29 01:59:43 DEBUG juju.worker.uniter.filter filter.go:289 got unit change
<davecheney> 2013-08-29 01:59:43 INFO juju.worker.upgrader upgrader.go:138 required tools: 1.13.3.1-precise-amd64
<davecheney> ^ this is the message from the upgrade _after_ it has restarted
<davecheney> oops 2013-08-29 02:03:48 INFO juju.worker.uniter context.go:234 HOOK Shutting down without a db
<davecheney> 2013-08-29 02:03:48 INFO juju.worker.uniter context.go:234 HOOK /var/lib/juju/agents/unit-mediawiki-0/charm/hooks/db-relation-departed: line 4: /var/lib/juju/agents/unit-mediawiki-0/charm/hooks/stop: No such file or directory
<davecheney> nope, charm bug
<wallyworld__> bigjools: awesome. i've just made a second one for today
<bigjools> wallyworld__: \o/
<wallyworld__> axw: thanks, i'll take a look
<axw> wallyworld__: thanks. also, I started using tools.Fetch to test... but it looks like that only returns a single version? is that right?
<axw> so I guess I'll just grab the files from storage and check their contents directly
<wallyworld__> axw: the Fetch method can return multiple but the constraint limits it. i've pushed a branch for review which allows looser matching and hence returns multiple
<axw> wallyworld__: ok, cool
<wallyworld__> to generate the metadata, i think the current best option is to grab from storage
<wallyworld__> we will/should revisit that i think
<wallyworld__> axw: it is hacky isn't it. perhaps we can make it better by working the construct logic into the unmarshall method
<axw> wallyworld__: moving it into which unmarshal method?
<wallyworld__> the json unmarshaller for the collection
<wallyworld__> so instead of shoving stuff into a "" key, we can call construct and fill out the collection directly
<axw> wallyworld__: the problem is that the item type isn't known there
<axw> wallyworld__: there's no way to convey context to the unmarshaller, except through a global :(
<wallyworld__> it is, sort of, since we figure out the call point
<axw> hm
<wallyworld__> so we can map that to type
<axw> I don't like any of the solutions :)
<wallyworld__> i'm being a bit hand wavey, but i think it is possible
<wallyworld__> i agree, they all suck
<axw> yeah I get you
<wallyworld__> i think Go's json unmarshalling sucks
<wallyworld__> it can't be extended quite right
<wallyworld__> at least we would be confining the hackery to a single unmarshall method
<wallyworld__> so the method is a black box that "just works" but is hacky inside
<wallyworld__> and i'd put it separately in a json.go file in the same package
<axw> hmmmm
<axw> I'll have a look later
<wallyworld__> ok
<axw> need to respond to some review comments
<wallyworld__> np
<axw> wallyworld__: so one way you could do it is this: have ParseCloudMetadata store itemType in a global, protected with a mutex (only one unmarshal at a time); have ItemsMap.UnmarshalJSON do the Callers check, and then use the global
<axw> is that what you were thinking?
<wallyworld__> axw: sort of. i was thinking there'd be a map of method name (as determined in the current code) -> type
<wallyworld__> so no mutex needed
<wallyworld__> just look up the map after calling method is determined
<axw> ah yeah ok
<axw> so they just register up front the function name -> reflect.Type
<wallyworld__> yep
<wallyworld__> still hacky, but it allows it to be kept under the rug
<wallyworld__> and isolated from the core business logic
<axw> thumper: can you please explain this to me (comment by William)? "Never use conn.Environ if you can possibly help it. It's basically never up to date."
<thumper> axw: sure
<thumper> the conn.Environ is the environment based purely on the parsing of the local config
<thumper> not the value in the bootstrap node
<thumper> when a machine is bootstrapped
<thumper> it initializes passwords etc
<thumper> so they are no longer the same
<axw> ah right
<thumper> also, if someone calls juju set-environment
<thumper> it doesn't modify the local
<thumper> only the bootstrap copy
<thumper> which then notifies all the workers
<thumper> does that make sense?
<axw> thumper: yes
<thumper> cool
<axw> I've seen the code that updates config from state
<axw> not sure about an Environ tho
<davecheney> trollololo, more bugs
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1218168
<_mup_> Bug #1218168: cmd/juju: upgrade-charm does not expand tilde in filepaths <papercut> <juju-core:Triaged> <https://launchpad.net/bugs/1218168>
 * thumper runs the tests knowing that they'll fail
 * thumper was very surprised to see them all pass
<thumper> was in the wrong pipe
 * thumper smiles at the failures
<thumper> seeing base64 encoded certs rather than yaml serialized []byte
<davecheney> thumper: so you fixed that huge turd of output in the cloud-init-output ?
<davecheney> bigjools: https://bugs.launchpad.net/maas/+bug/1218182
<_mup_> Bug #1218182: No way to put a node into "maintenance mode" <MAAS:New> <https://launchpad.net/bugs/1218182>
<davecheney> how can this be a thing
<davecheney> surely someone has asked for this already
<bigjools> just re-commission it
<bigjools> but yes generally that would be useful
<bigjools> we even have a node state for it that's not used yet
<davecheney> bigjools: can you recommission remotely ?
<davecheney> or does one need to get off ones ass ?
<bigjools> davecheney: just click on the ui button to do it
<bigjools> however
<bigjools> it'll do the usual cycle and may not fail if nothing is getting installed
<bigjools> we need to add commissioning tests
<bigjools> deletion is the best way of taking it out for now
<jam1> davecheney: ~ is expanded by your shell, and has different behaviors if you attach an argument. Try: "echo -f=~/" vs "echo -f ~/" For me the former doesn't expand, the latter does expand.
<davecheney> jam: we hit this in japan
<davecheney> it would be nice if there was a solution
<jam> davecheney: use --repository ~/foo/bar
<jam> it works
<jam> don't use --repository=~/foo/bar it doesn't
<davecheney> lucky(~/src/launchpad.net/juju-core) % juju upgrade-charm --repository=~/charms --switch local:mediawiki mediawiki
<jam> bash-ism
<davecheney> error: no repository found at "/home/dfc/src/launchpad.net/juju-core/~/charms"
<davecheney> sure
<davecheney> but can we expand it in the command
<axw> thumper: did you want to do another review of my manual provisioning changes, or are you happy with it? fwereade_ has LGTM'd
<thumper> axw: if you don't mind, I'll take a quick look
<thumper> but probably just before the meeting
 * thumper breaks for dinner, back later
<axw> thumper-dinner: not at all, I will go back to simplestreams stuff for now
<axw> jam: what is that gwacl/failing-test thing about?
<jam> axw: sorry for the noise. bigjools has asked that I set up gwacl under our tarmac bot, and I want to test that it both "successfully lands good patches" and "successfully fails bad patches"
<axw> jam: I was just curious, not bothered :)
<axw> thanks
<rogpeppe1> mornin' all
<jam> morning rogpeppe1
<rogpeppe> jam: hiya
<rogpeppe> jam: any idea what might be going on here? It stopped my merge last night and i can't reproduce it locally. https://code.launchpad.net/~rogpeppe/juju-core/376-factor-out-provider-utils/+merge/182465/comments/414178
<rogpeppe> fwereade_: ^
<jam> rogpeppe: the only thing that comes to mind is that the test suite is "leaking" information and the test case is trying to contact a mongodb running elsewhere.
<jam> (either a 'zombie' one that didn't get torn down properly or something else)
<rogpeppe> jam: that is a possibility i suppose, if the port selection logic isn't working
<rogpeppe> jam: i'm considering changing it actually
<rogpeppe> jam: currently it makes a socket, letting the system choose the port, then closes the socket, and trusts that it will be still ok to use in a short while
<rogpeppe> jam: we can't avoid some window, but i'm wondering if it might be better to pick a port at random, check that we can't dial it, then use that
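The port selection rogpeppe describes (let the system choose a port, close the socket, trust the port stays free for a moment) can be sketched roughly like this. findFreePort is a hypothetical name and this is not the test suite's actual code; the window between Close and the later bind by mongod is exactly the race being discussed.

```go
package main

import (
	"fmt"
	"net"
)

// findFreePort asks the kernel for an unused TCP port by listening on
// port 0, records the port it was given, and closes the listener again.
// Nothing stops another process from grabbing the port after Close.
func findFreePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := findFreePort()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("picked port", port)
}
```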
 * rogpeppe wishes the port name space was considerably larger
<jam> rogpeppe: could we pick a port but open it as reopenable and only close it after something else has grabbed it?
<jam> SO_REUSEADDR or whatever that param is
<rogpeppe> jam: i don't think SO_REUSEADDR works like that
<rogpeppe> jam: it only allows address reuse of unique local/remote pairs, AFAIR
<rogpeppe> jam: i don't think it allows you to bind two listeners to the same port
<jam> http://stackoverflow.com/questions/775638/using-so-reuseaddr-what-happens-to-previously-open-socket
<jam> rogpeppe: probably
<jam> I know you can bind them to the same port by doing fork magic
<rogpeppe> jam: you mean by inheriting the fd?
<jam> rogpeppe: right. it is how apache used to do it with their forking daemons.
<jam> one of the subprocesses "wins" the Accept request
<rogpeppe> jam: that's kind of a different thing - it's just sharing the already bound socket
<jam> rogpeppe: though to be fair, I don't have high expectations that it is actually the bug you are seeing.
<jam> rogpeppe: but I *have* been seeing a lot of zombie mongodb's this week.
<rogpeppe> jam: i went through a rash of "cannot bind to port" problems last week
<jam> rogpeppe: "SO_REUSEPORT... allows you to bind an arbitrary number of sockets to exactly the same source address and port as long as all prior bound sockets also had SO_REUSEPORT"
 * rogpeppe didn't know about REUSEPORT
<jam> but it looks like it may be a BSD flag
<rogpeppe> jam: but i don't think that helps us
<rogpeppe> jam: even if it was available
<rogpeppe> jam: because mongod won't be using that flag
<jam> rogpeppe: "Linux 3.9 added the option SO_REUSEPORT to Linux as well"
<jam> rogpeppe: it does
<jam> because *we* use that flag
<jam> and then the next person can bind without the flag
<jam> at least from what I read
<jam> I could be wrong
<rogpeppe> jam: i slightly doubt it. let's check, one mo
<jam> rogpeppe: the wording is a bit funny, so no guarantees
<jam> also, I think precise is older than kernel 3.9
<rogpeppe> jam: the wording sounds to me as if all binders to the port must use that option
<rogpeppe> jam: otherwise it's really quite dangerous
<rogpeppe> jam: because i might wish to bind a server to a port, but because the previous server on that port has used REUSEPORT, it allows us anyway, and then we get two different servers randomly sharing the same network address
<rogpeppe> jam: there's another possibility which won't rule out the failure but will make it more obvious what's happening
<rogpeppe> jam: which is to get the API server to send back the env UUID and have the client check that
<rogpeppe> jam: then at least we will definitively know when we're talking to an unexpected server
<jam> rogpeppe: except we've already gotten rejected by that point haven't we? Or are you saying we send it before login?
<rogpeppe> jam: we'd need to make it available even to non-logged-in clients though, and i'm not sure if that would be judged to be an unwanted information leak
<jam> haven't we validated the cert by this point?
<rogpeppe> jam: yeah, we have validated the cert, yes
<rogpeppe> jam: and the cert is randomly generated, so actually that's a good point
<rogpeppe> jam: and i *think* we use a secure mongo connection even in tests
<rogpeppe> jam: no, i don't think it can be a duplicate mongo problem
<jam> rogpeppe: I know that if you have a mongo that doesn't support TLS the test suite dies in horrible ways
<rogpeppe> hrmph, well it's managed to merge noe
<rogpeppe> now
<rogpeppe> that failure does concern me though. wtf can be going on, when we've got two concurrent sessions connected to the same API port, the second of which fails logging in with exactly the same creds as the first succeeded with?
<rogpeppe> fwereade_: i just noticed this comment of yours: "
<rogpeppe> Never use conn.Environ if you can possibly help it. It's basically never up to
<rogpeppe> date.
<rogpeppe> "
<rogpeppe> fwereade_: what does it mean for an Environ to be "up to date"?
<rogpeppe> fwereade_: ah, you mean that it might have out of date config attrs?
<jam> rogpeppe: he means that conn.Environ is read from local disk, but the source of truth is actually conn.State.Environ
<jam> or whatever the actual request is
<jam> rogpeppe: which is why for the CLI API stuff we went away from using APIConn and are trying to get away with just api.Client
<rogpeppe> jam: yeah
<rogpeppe> jam: (well, that's not the only reason for avoiding client use of Environ)
<jam> rogpeppe:  this one also failed earlier today, and left a mongodb running: https://code.launchpad.net/~allenap/juju-core/makefile-stuff/+merge/181113
<jam> a Watcher didn't err when it exited
<jam> I don't know why
<jam> I'm concerned we introduced some race conditions recently without realizing it.
<jam> Or the bot is just on a VM that is having neighbor issues, which triggers these less frequent problems.
<rogpeppe> jam: interesting, that also died in TestManageStateServesAPI
<rogpeppe> jam: (the same place i saw the problem)
<jam> rogpeppe: I don't quite see that in the 500 line panic, but I trust you did
<jam> and yes, we might just have a flakey test, that when it fails might leave a mongodb running.
<rogpeppe> jam: (to find which test is running, search for .Test in the stack trace)
<rogpeppe> jam: even if we have moribund mongodb's, that shouldn't be a problem AFAICS
<jam> rogpeppe: I'm saying the test is *causing* moribund mongodb's not that it is affected by them.
<rogpeppe> jam: ah, right, yeah
<jam> (test suite teardown is known to fail to tear down the mongodb it started)
<rogpeppe> jam: it probably happens when a goroutine panics without a recover
<rogpeppe> jam: (that happened in this case)
<jam> rogpeppe: I thought gocheck recovered from all panics in order to catch them cleanly and report errors from test cases ?
<jam> is this happening in a TearDownSuite or something where it isn't being caught?
<rogpeppe> jam: it can't recover from panics in goroutines that it didn't start
<jam> rogpeppe: ah, even if it starts the one that started it
<rogpeppe> jam: which is true in this case - the panic is in a watcher goroutine
<rogpeppe> jam: yes - you can't catch panics from many goroutines at once - that would be nastily asynchronous
<rogpeppe> jam: the usual solution is to put some cleanup code outside the main executable
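The point about goroutines can be illustrated with a small sketch: recover only has an effect inside the goroutine that panicked, so a test runner cannot catch a panic from a goroutine it did not start. The helper runRecovered is hypothetical and is not how gocheck or juju work; it only shows that the recover has to live inside the goroutine itself, with the result handed back over a channel.

```go
package main

import "fmt"

// runRecovered runs f in a new goroutine. The deferred recover sits inside
// that goroutine, which is the only place it can intercept f's panic.
func runRecovered(f func()) <-chan error {
	errc := make(chan error, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				errc <- fmt.Errorf("goroutine panicked: %v", r)
				return
			}
			errc <- nil
		}()
		f()
	}()
	return errc
}

func main() {
	err := <-runRecovered(func() { panic("watcher was stopped cleanly") })
	fmt.Println(err)
}
```

Without the deferred recover in the goroutine, the panic would take down the whole process, and any teardown (such as stopping a mongod started by the suite) would not run.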
<axw> wallyworld_: I came up with a better approach to the JSON problem. It's now very similar to how it was originally, but without the float problem
<wallyworld_> great :-)
<axw> and with no runtime callstack crap
<wallyworld_> hooray, i'll take a look in a bit
<rogpeppe> jam: i've found how that panic can happen, i think
<TheMue> rogpeppe: you've once discussed https://bugs.launchpad.net/juju-core/+bug/1202163 with hazmat
<_mup_> Bug #1202163: openstack provider should have config option to ignore invalid certs <papercut> <juju-core:Triaged by themue> <https://launchpad.net/bugs/1202163>
<mgz> okay, nearly reboot for meeting time
<mgz> TheMue: I'm probably the best person to ask about that bug
<TheMue> rogpeppe: could you give me a hint where this change has to be done
<TheMue> mgz: ok, so we'll continue after the meeting, thanks
<TheMue> mgz: that you can reboot now ;)
<rogpeppe> TheMue: it's probably something that needs a change in goose too
<rogpeppe> TheMue: basically we need some way to tell goose that it should ignore unknown certs
<rogpeppe> TheMue: within goose, it would need to set the TLS config on the https request that it makes to have InsecureSkipVerify=true
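As a hedged illustration of the TLS setting rogpeppe mentions, this is how an HTTP client with InsecureSkipVerify can be built using only the Go standard library. newHTTPClient is a hypothetical helper; how goose actually constructs its client and what its config option would be called are not shown in this discussion.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// newHTTPClient returns a client whose HTTPS requests skip certificate
// verification when skipVerify is true. This disables protection against
// man-in-the-middle attacks, so it should be an explicit opt-in.
func newHTTPClient(skipVerify bool) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{InsecureSkipVerify: skipVerify},
		},
	}
}

func main() {
	client := newHTTPClient(true)
	transport := client.Transport.(*http.Transport)
	fmt.Println(transport.TLSClientConfig.InsecureSkipVerify)
}
```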
<TheMue> rogpeppe: ah, this makes it more clear. already looked in our code as a first step, but not goose
<rogpeppe> TheMue: cool
<mgz> TheMue: posted a comment in the bug
<TheMue> mgz: thx
<thumper> davecheney: coming to the meeting?
 * fwereade_ goes to get breakfast
<jam> rogpeppe: you mention that you might have a solution for the TestManageStateServesAPI bug?
<rogpeppe> jam: for the panic(no error) problem anyway, i think, yes
<jam> fwereade_: what did you decide on: https://code.launchpad.net/~fwereade/juju-core/prepare-leave-scope/+merge/181065
<jam> the thread went long, and sort of ended on "maybe this is or isn't correct"
<rogpeppe> jam: i believe it's because the underlying State is being closed, which causes the state.Watcher to be stopped and return without an error
<rogpeppe> s/state\.Watcher/watcher.Watcher/
<jam> rogpeppe: so still just a race condition, right?
<rogpeppe> jam: not really
<jam> as in we expect the Watchers to be closed first
<jam> but in this case the State got closed.
<jam> rogpeppe: given it doesn't always fail it is clearly *some* sort of race condition :)
<rogpeppe> jam: hmm, there's definitely some racing involved, yes
<rogpeppe> jam: but i think it's wrong that the state watchers assume that because the underlying state has been closed they can panic
<rogpeppe> jam: you're probably right that the watchers should probably be closed nicely before the state is closed
<jam> rogpeppe: so I think the MustErr thing is because in production they want to expect that if they are shutting down, there is a reason for it.
<jam> rogpeppe: But I find it very strange to panic when you *don't* have an error :)
<jam> rogpeppe: there *might* be a case where we purposefully trigger something like that during upgrade to force the process to restart, but I'm not 100% sure how all that ties together.
<rogpeppe> jam: the usual reason for MustErr is that if you're the code solely in control of a watcher, you *know* that it can't have been stopped by something else
<rogpeppe> jam: so if it dies without an error there's something weird enough going on to warrant a panic
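A sketch of the MustErr idea as described here: the caller believes the watcher can only have stopped because of an error, so a nil error is treated as a programming bug and turned into a panic. The names mustErr, errer and fakeWatcher are hypothetical stand-ins; the real juju function may check more conditions than this.

```go
package main

import (
	"errors"
	"fmt"
)

// errer is anything that can report why it stopped.
type errer interface {
	Err() error
}

// fakeWatcher stands in for a stopped watcher in this sketch.
type fakeWatcher struct{ err error }

func (w fakeWatcher) Err() error { return w.err }

var errConnLost = errors.New("connection lost")

// mustErr returns the watcher's error and panics if there is none,
// on the assumption that a clean stop should have been impossible.
func mustErr(w errer) error {
	err := w.Err()
	if err == nil {
		panic("watcher was stopped cleanly")
	}
	return err
}

func main() {
	fmt.Println(mustErr(fakeWatcher{err: errConnLost}))
}
```

The failure discussed above corresponds to the nil branch: the underlying State was closed, the watcher stopped without an error, and the assumption behind the panic no longer held.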
<jam> fwereade_: thinking of upgrading. You had a comment about WatchAPIVersion (I think) being unhappy before we've gotten our env credentials set up. Is it just that you don't want that API call to return until we're ready to handle FindTools requests, or ?
<rogpeppe> jam, fwereade_: FWIW i don't think it's a good idea to make that call block indefinitely until there's a valid environ config
<rogpeppe> jam, fwereade_: because that could take forever and there's no way of interrupting that call once it's in progress
<rogpeppe> fwereade_:
<rogpeppe> jam, fwereade_: although having said that, i'm not sure i can think of a better alternative
<jam> rogpeppe: well, avoiding panic will be a good first step towards figuring out what is going wrong. :)
<rogpeppe> jam: indeed
<jam> rogpeppe: because I think fwereade_ said jpds managed to get 'juju status' to run and either we still have the log-replay bug, *or* it was still failing on something.
<jam> I was very surprised to see the err was a "schema.error_" which sounds like it is coming from somewhere else.
<rogpeppe> jam: that panic should have been fixed by the recent ServerError change
<rogpeppe> jam: that error comes when a schema doesn't match
<fwereade_> jam, it's Tools in particular
<fwereade_> jam, rogpeppe: I am relatively unbothered by blocking forever there
<fwereade_> jam, rogpeppe: and using the environ in the first place is somewhat suboptimal
<rogpeppe> fwereade_: yeah, i was just wondering about that - we're only using for agent-version, right?
<rogpeppe> s/using/using it/
<fwereade_> jam, rogpeppe: so at some point it will wither away regardless, in favour of a cache of simplestreams data
<jam> fwereade_: as in, WatchAPIVersion runs, finds stuff, and then we call FindTools but that blocks until we have creds?
<fwereade_> rogpeppe, nope
<fwereade_> rogpeppe, it's FindExactTools that's the problem
<fwereade_> rogpeppe, I'm 80% sure that we manage to extract agent-version, for the watcher, without creating an environ
<fwereade_> jam, yeah
<rogpeppe> fwereade_: oh, of course
<rogpeppe> fwereade_: i think i agree that we can block forever - it doesn't mean we can't still respond the Upgrader.Kill, as long as we run the Tools call in a new goroutine
<rogpeppe> s/respond the/respond to/
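The idea of running a possibly never-returning call in its own goroutine while still reacting to a kill request can be sketched as follows. callOrKill and errKilled are hypothetical names; the actual Upgrader and Tools call are not reproduced here, and a plain channel stands in for whatever kill mechanism the worker uses.

```go
package main

import (
	"errors"
	"fmt"
)

var errKilled = errors.New("killed while waiting")

// callOrKill runs blocking in a new goroutine and returns its result,
// or errKilled if kill is closed first. The result channel is buffered
// so the goroutine can still finish after the caller has given up.
func callOrKill(blocking func() string, kill <-chan struct{}) (string, error) {
	result := make(chan string, 1)
	go func() { result <- blocking() }()
	select {
	case r := <-result:
		return r, nil
	case <-kill:
		return "", errKilled
	}
}

func main() {
	// A call that never returns, with a kill request already pending.
	never := make(chan struct{})
	kill := make(chan struct{})
	close(kill)
	_, err := callOrKill(func() string { <-never; return "" }, kill)
	fmt.Println(err)

	// A call that returns promptly, with no kill request.
	tools, err := callOrKill(func() string { return "1.13.2-precise-amd64" }, make(chan struct{}))
	fmt.Println(tools, err)
}
```

Note the trade-off: the blocked goroutine is abandoned rather than interrupted, which is acceptable only if the process is about to exit or the call eventually returns.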
<jam> fwereade_: well one option would be to add one more api which is "State.DesiredAgentVersion()" rather than having it always be Tools with the URL to get the new tools.
<jam> fwereade_: because then it will go quiescent without having to search the provider.
<fwereade_> rogpeppe, WaitForEnviron requires a done chan or something already iirc
<jam> fwereade_: the change that worker/upgrader/upgrader.go did
<jam> is that on startup for every agent it must go scan the bucket
<fwereade_> jam, what's the utility there?
<rogpeppe> fwereade_: but we're talking server side here, right?
<jam> I believe we did it that way because it seemed silly to do 2 api requests
<jam> but given that 1 is expensive (must read the provider buckets)
<jam> and one is cheap (just from the db)
<jam> maybe it is worth splitting up
<fwereade_> jam, istm that it just complicates things without delivering any benefits?
<jam> fwereade_: it very specifically moves us closer to how we used to do it, and would avoid this bug
<jam> fwereade_: as the Upgrader would say "what version do you want me to be at?" and the response would match currentTools.version and we would go back to waiting on a change
<jam> fwereade_: and we don't need provider creds to get that answer.
<jam> rogpeppe: we got Unauthed access again: https://code.launchpad.net/~thumper/juju-core/container-address/+merge/182271
<jam> mgz: it has already been approved, but if you could look over https://code.launchpad.net/~thumper/juju-core/container-address/+merge/182271 to see how it fits with your Addresses stuff.
<rogpeppe> jam: that is *so weird*
<fwereade_> jam, ah, ok... maybe that leads to a nicer server-side implementation too?
<jam> rogpeppe: I agree, it is clearly a test that is an issue
<jam> fwereade_: it splits the bits nicely
<jam> fwereade_: and is probably really easy to implement
<fwereade_> jam, ok, consider it blessed, thanks :)
<jam> fwereade_: do we have a bug about failing during startup in 1.13.x ?
<mgz> jam: is this the one I read through the other day, looking...
<jam> fwereade_: also, this probably only happens when you are using a private bucket that requires creds.
<jam> mgz: if you read thumper's MP you didn't respond to it
<jam> fwereade_: it also means that we don't scan the provider every time *anything* in environconfig changes
<jam> since we don't have smarts about just noticing api version changes yet
<rogpeppe> jam: i'm not sure i see how DesiredAgentVersion would help
<rogpeppe> jam: we'd still need to wait for the environment config to be valid, wouldn't we?
<mgz> jam: right, didn't comment, still not completely sure on the implications (but as a hack, it seemed fine)
<rogpeppe> afk
<jam> rogpeppe: we write AgentVersion into the environment config during bootstrap. We don't yet have *provider* (eg EC2/MaaS/Openstack) credentials.
<jam> rogpeppe: the existing Tools() api requires searching the provider.Storage for the binary that matches the desired version
<jam> rogpeppe: because it returns the URL to download those tools.
<jam> which is a problem during bootstrap because we don't actually have the credentials to search Storage yet.
<jam> well, "first boot"
<rogpeppe> jam: ah yes, of course
<rogpeppe> jam: i think that separating the API calls makes a lot of sense actually
<rogpeppe> jam: then the relationship between the version watcher and the API call we're making is obvious
<fwereade_> jam, I'm feeling somewhat nauseous and it isn't going away, I'm going to lie down for a while
<jam> fwereade_: shame, I'm just about to propose the fix. Feel better soon.
<rogpeppe> fwereade_: hope it *does* go away
<fwereade_> jam, if I don't come back for the meetings please make my apologies
<fwereade_> jam, I will surely at least manage some reviewing later today though
<fwereade_> rogpeppe, cheers )
<jam> davecheney: if you are around, didn't you have a bug about seeing listing-the-storage-bucket over and over in the logs?
<jam> I think my fix will also address that
<jam> rogpeppe: if you have time https://codereview.appspot.com/13380043/
 * TheMue => lunchtime
<rogpeppe> jam: reviewed
<jam> rogpeppe: so I was following the example of "tools.Tools" which returns a pointer rather than a struct.
<jam> Is there a reason they should be different?
<rogpeppe> jam: just convention
<rogpeppe> jam: we always pass around *tools.Tools
<rogpeppe> jam: and version.Number
<rogpeppe> jam: if we want to change it, we should do it consistently across the code base
<jam> rogpeppe: what about the API itself then? I think it was using *tools so that it just puts nil there when it say, has an error.
<jam> it seems a little strange to serialize a bunch of 0's onto the wire when you have an error.
<rogpeppe> jam: omitempty might possibly work
<rogpeppe> jam: although it may not work on structs, come to think of it
<rogpeppe> jam: i don't mind if the serialisation struct uses a pointer, if that's what we need to lose it from the RPC result
<mgz> what is "empty" in the context of a struct... can you test for nilledness easily with go?
<mgz> zeroedness actually I guess
<jam> mgz: foo == Foo{}
<rogpeppe> mgz: yeah
<rogpeppe> mgz: but it's not that easy to check for using reflection, so it's quite probably it doesn't do that
<rogpeppe> probable even
<mgz> that makes sense
<rogpeppe> mgz: yeah, it doesn't do that
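A minimal sketch of the point rogpeppe and jam settle on above, assuming the standard encoding/json package is the serialiser in question (the log does not say which codec the RPC layer uses). The `Tools`, `resultByValue` and `resultByPointer` types are hypothetical stand-ins, not the real API structs. It shows that `omitempty` leaves a zero-valued struct on the wire but drops a nil pointer, and that a zero struct can be detected with jam's `foo == Foo{}` comparison.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Tools is a hypothetical stand-in for the real tools.Tools type.
type Tools struct {
	URL string `json:"url"`
}

// resultByValue holds the struct by value; omitempty has no effect on it.
type resultByValue struct {
	Tools Tools  `json:"tools,omitempty"`
	Error string `json:"error,omitempty"`
}

// resultByPointer holds a pointer; a nil pointer is omitted from the output.
type resultByPointer struct {
	Tools *Tools `json:"tools,omitempty"`
	Error string `json:"error,omitempty"`
}

// isZero reports whether t is the zero value, using the foo == Foo{} idiom.
func isZero(t Tools) bool {
	return t == Tools{}
}

// encode marshals v to JSON text and panics on failure (sketch only).
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(isZero(Tools{}))
	fmt.Println(encode(resultByValue{Error: "boom"}))   // zero struct still serialised
	fmt.Println(encode(resultByPointer{Error: "boom"})) // nil pointer dropped
}
```

With the pointer form an error result carries only the error field, which is the behaviour jam wanted to keep for the API result.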
<mgz> jam: any ideas why sync-tools --source is trying to list tools before doing anything?
<mgz> <http://paste.ubuntu.com/6039913/>
<mgz> on canonistack, so may involve simplestreams...
<jam> mgz: as in why it would be listing ~/go/bin or listing something upstream?
<jam> it normally lists the target tools
<jam> so it can find what tools need to be copied
<mgz> it's trying to list a swift container. it should just splat up the local stuff, no?
<jam> mgz: not when you already have stuff
<jam> it is "put stuff I don't have"
<jam> not "overwrite everything"
<jam> mgz: that way if I get interrupted after copying 3 things, I don't start from scratch again.
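The "put stuff I don't have" behaviour jam describes can be sketched as a set difference between the source listing and the target listing. This is only an illustration: `missingTools` is a hypothetical helper, not the sync-tools implementation, and the file names other than the 1.12.0 one are made up.

```go
package main

import "fmt"

// missingTools returns the names present in source but absent from target,
// preserving source order. Hypothetical helper for illustration only.
func missingTools(source, target []string) []string {
	have := make(map[string]bool, len(target))
	for _, name := range target {
		have[name] = true
	}
	var missing []string
	for _, name := range source {
		if !have[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	source := []string{
		"juju-1.12.0-precise-amd64.tgz",
		"juju-1.12.0-precise-i386.tgz",
		"juju-1.12.0-raring-amd64.tgz",
	}
	// Pretend an earlier, interrupted run already copied the first tarball.
	target := []string{"juju-1.12.0-precise-amd64.tgz"}
	fmt.Println(missingTools(source, target))
}
```

Listing the target first is what makes an interrupted copy resumable: a rerun only uploads what is still absent.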
<mgz> but... this is then a catch-22. have no tools, can't upload my local tools, because I have no tools?
<jam> not that it should fail if it can't find any target tools
<jam> mgz: I think you are misreading it. I think it is failing because it can't find any *source* tools to copy
<mgz> jam: that dir certainly contains jujud
<mgz> all: will miss standup, things to mention, filed a selection of goose bugs about issues related to rackspace
<mgz> and er... I'm being prodded
<jam> mgz: tools are usually a foo-series-arch-number.tar.gz sort of thing
<arosales> jam, https://bugs.launchpad.net/juju-core/+bug/1216768
<_mup_> Bug #1216768: Azure provider: Authentication error when using public tools <juju-core:Fix Committed by axwalk> <https://launchpad.net/bugs/1216768>
<jam> rogpeppe: https://plus.google.com/hangouts/_/f497381ca4d154890227b3b35a85a985b894b471 standup?
<arosales> jam, https://bugs.launchpad.net/juju-core/+bug/1218329 is the other Azure bug needed.
<_mup_> Bug #1218329: Update default environment.yaml for Azure to use Precise for default-series <juju-core:New> <https://launchpad.net/bugs/1218329>
<arosales> jam what method would you suggest to test trunk with a public tools? sync-tools to upload the current trunk tools to a public bucket?
<jam> arosales: you can create a different "public" bucket and sync-tools into it, I think.
<jam> and then configure your environment to treat that as the actual public bucket.
<arosales> jam, ok I'll give that a try today thanks.
<jam> arosales: for bug #1218329 you've confirmed we can switch to precise? because it is pretty trivial to change that one line.
<_mup_> Bug #1218329: Update default environment.yaml for Azure to use Precise for default-series <juju-core:New> <https://launchpad.net/bugs/1218329>
<arosales> jam, the other point I need to confirm is the image-stream.
<arosales> I need to confirm if we are calling the latest precise images with the fix as "released" or "daily" in simple streams and in Azure publication.
<arosales> jam, I also added a comment with that.
<jam> rogpeppe: I think I responded to all of your requests: https://codereview.appspot.com/13380043/
<jam> arosales: I won't be around tomorrow to make sure azure things are addressed before the cut of the release (usually happens on the weekend). You can probably poke some of the people who are in Europe during your morning (mgz, fwereade_, allenap come to mind)
<jam> hey mramm, we didn't think we'd see you today
<mramm> jam: I had the overnight flight to london
<mramm> just checking in at the hotel
<arosales> jam, I'll sync with davecheney and fwereade_ on it if a release is targeted this week
<jam> arosales: I fully expect at least 1.13.3 to be out this weekend. And we'd like to release it, test it, and possibly directly upgrade it to 1.14
<arosales> jam, ok, and if a stable release causes a lot of pain and you just go with devel just let me know, for azure documentation set up purposes.
<rogpeppe> jam: reviewed
<TheMue> jam: due to doc sprint tomorrow and now having discovered that it's a deeper goose change too i would like to see lp:1202163 reassigned to one of the goose team. what do you say?
<rogpeppe> jam, fwereade_: state.RelationUnitsWatcher doesn't seem to have any tests at all. do you know what's going on there?
<rogpeppe> hmm, looks like it was deleted in https://codereview.appspot.com/7198051/
<rogpeppe> https://bugs.launchpad.net/juju-core/+bug/1218362
<_mup_> Bug #1218362: state.RelationUnitWatcher is not tested <tech-debt> <juju-core:New> <https://launchpad.net/bugs/1218362>
 * rogpeppe just ran the coverage test tool for the first time
<rogpeppe> it works well
<natefinch> nice
<rogpeppe> 85% coverage of the state package
<natefinch> hey, 85% is really good
<rogpeppe> natefinch: it would be 86.9% if we actually tested RelationUnitWatcher
<rogpeppe> aw, "cannot use test profile flag with multiple packages"
<natefinch> rogpeppe: heh. I don't put a ton of stock in coverage, since covered doesn't necessarily mean tested... but not covered is definitely not tested
<rogpeppe> natefinch: that's my thought
<rogpeppe> natefinch: i was wondering about things that would have made it obvious that we'd lost test coverage
<natefinch> rogpeppe: so, only 15% definitely not tested, and 85% that is at least exercised, so that's pretty great
<rogpeppe> natefinch: (as happened in this CL in january: https://codereview.appspot.com/7198051/ )
<natefinch> rogpeppe: yeah, detecting when we lose coverage is a good idea
<rogpeppe> jam: if you're still around, this fixes the watcher panic: https://codereview.appspot.com/13386044/
<rogpeppe> fwereade_, natefinch, mgz: reviews appreciated
 * rogpeppe goes for some lunch
<natefinch> rogpeppe: I'll take a look
<arosales> any juju core folks around to attend vUDS session: http://summit.ubuntu.com/uds-1308/meeting/21899/servercloud-s-juju-new-user-ux/
<arosales> rogpeppe,  ^
 * arosales was going to bother fwereade but I don't see him online atm
<rogpeppe> arosales: he's sick atm, i think
<arosales> ah, sorry to hear
<rogpeppe> arosales: what time zone is that time in?
<arosales> rogpeppe, utc
<arosales> so roughly in half hour
<rogpeppe> arosales: ok, i'll be there
<arosales> rogpeppe, much appreciated, thank you
<rogpeppe> arosales: do i have to register as attending to join?
<arosales> rogpeppe, I don't think so
<arosales> rogpeppe, I'll post the hangout url in the channel here in a bit, and we can take it from there.
<rogpeppe> arosales: ah, it's a hangout - i was wondering where the video was
<rogpeppe> arosales: thanks
<arosales> rogpeppe, yup live hangout, but I haven't started it just yet
<arosales> still have about 20 minutes
<rogpeppe> natefinch: i've addressed your concerns, i hope: https://codereview.appspot.com/13386044
<rogpeppe> natefinch: or replied, at any rate :-)
<natefinch> rogpeppe: cool, I'll take a look.  Would like someone more familiar with the problem to give it the lgtm if possible, though
<rogpeppe> natefinch: yeah, i guess i'll have to wait until tomorrow unless mgz or jam are around
<mgz> the watcher change makes some sense to me, but I'm not sure we have more qualified than rog on the problem :)
<mgz> what's the implication of a one-value return from a channel, rather than the `, ok` form?
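mgz's question is not answered in the log, so here is a small sketch of the general Go semantics only, with no claim about what the watcher code in the CL actually does. The helper names are hypothetical. A one-value receive on a closed channel yields the element type's zero value, so the receiver cannot tell "closed" from "a zero was sent"; the `, ok` form reports the difference.

```go
package main

import "fmt"

// recvOne uses the one-value receive form.
func recvOne(ch <-chan int) int {
	return <-ch
}

// recvOK uses the two-value form; ok is false once the channel is closed
// and drained.
func recvOK(ch <-chan int) (int, bool) {
	v, ok := <-ch
	return v, ok
}

// closedChan returns a channel that has already been closed.
func closedChan() <-chan int {
	ch := make(chan int)
	close(ch)
	return ch
}

// zeroChan returns an open channel with a genuine zero value queued.
func zeroChan() <-chan int {
	ch := make(chan int, 1)
	ch <- 0
	return ch
}

func main() {
	fmt.Println(recvOne(closedChan()), recvOne(zeroChan())) // both look the same

	v, ok := recvOK(closedChan())
	fmt.Println(v, ok)
	v, ok = recvOK(zeroChan())
	fmt.Println(v, ok)
}
```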
<natefinch> rogpeppe: it sorta bugs me that Watcher is an interface without a Watch() method :p
<rogpeppe> please think of a better name!
<rogpeppe> natefinch: ^
<natefinch> rogpeppe: I would almost call it Stopper or Ender... it's the  Changes() method that really makes a watcher a watcher
<rogpeppe> natefinch: i originally thought about Worker
<natefinch> rogpeppe: yeah, I was thinking something like that originally
<natefinch> arosales: you should email the team ahead of time so we can plan on being around for these meetings :)
<arosales> natefinch, you guys didn't know about vUDS?
<arosales> natefinch, but noted I don't think I included juju-core folks in my uds reminders
<natefinch> arosales: reminders are much appreciated :) Also, I'm new, so I might not know things I'm supposed to know :)
<arosales> natefinch, for sure you get a pass :-)
<arosales> but these other veterans  . . . .
<arosales> natefinch, but I agree with you on reminders. I'll note for next time bother the juju-core folks :-)
<weblife> Juan Negron in here?
 * rogpeppe has reached eod
<rogpeppe> g'night all
<thumper> morning folks
 * thumper is at that WTF moment debugging
 * thumper grunts...
<thumper> found the source of my confusion
<thumper> simplified to this:
<thumper>     c.Assert(obtained, gc.DeepEquals, expected)
<thumper> ... obtained []uint8 = []byte{}
<thumper> ... expected []uint8 = []byte{}
<thumper> bigjools: the magic of gocheck  :)
 * thumper fixes 
<bigjools> thumper: wtf
<thumper> bigjools: I know, right?
<thumper> had me scratching my head for quite a while
<bigjools> what is it hiding?
<thumper> it is outputting something incorrectly
<thumper> here are the two lines prior
<thumper> 	obtained := []byte{}
<thumper> 	var expected []byte
<thumper> obtained is an empty slice
<thumper> expected is a nil slice
<davecheney> thumper: we need a SliceEquals checker
<thumper> what is shown here is just a part of the problem
<thumper> I'm checking an entire structure
<thumper> using deep equals
<thumper> the slice is just one part
<thumper> a key part of the problem is that it isn't showing []byte{nil}
<thumper> which is what *should* be shown for a nil slicke
<thumper> slice
<thumper> that would have made the problem completely obvious
<thumper> instead of hiding it
<thumper> oh...
<thumper> check this out:
<thumper> http://play.golang.org/p/LRDkBszMNa
<thumper> vs http://play.golang.org/p/iO2wpSUxeO
<thumper> a nil string slice is output as []string(nil), but a nil byte slice is shown as []byte{}
<thumper> however, a nil byte slice does not equal an empty byte slice
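A minimal sketch of the distinction thumper ran into, using reflect.DeepEqual directly on the assumption that gocheck's DeepEquals checker delegates to it. `sameSlice` and `describe` are hypothetical helpers. The sketch deliberately avoids relying on how fmt prints a nil byte slice, since that is exactly what is in dispute here, and instead makes nil-ness explicit.

```go
package main

import (
	"fmt"
	"reflect"
)

// sameSlice compares two byte slices the way a deep-equality checker would.
func sameSlice(a, b []byte) bool {
	return reflect.DeepEqual(a, b)
}

// describe spells out nil-ness explicitly instead of trusting the
// formatted representation.
func describe(b []byte) string {
	if b == nil {
		return "nil slice"
	}
	return fmt.Sprintf("non-nil slice, len %d", len(b))
}

func main() {
	obtained := []byte{} // empty but non-nil
	var expected []byte  // nil

	fmt.Println(sameSlice(obtained, expected)) // false: nil != empty under DeepEqual
	fmt.Println(describe(obtained))
	fmt.Println(describe(expected))
}
```

When a deep comparison of a larger structure fails with identical-looking output, checking each slice field against nil like this is one way to narrow it down.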
<thumper> davecheney: go bug?
<davecheney> nope
 * bigjools boggles
<thumper> davecheney: why?
<davecheney> otp
#juju-dev 2013-08-30
<davecheney> thumper: why does var b []byte != var b = []byte{} ?
<thumper> one is nil, one is not
<davecheney> right, but was that the question ?
<thumper> I think that either they should be equal, or be visually different
<thumper> no
<thumper> the question is why they have the same representation
<thumper> <thumper> a nil string slice is output as []string(nil), but a nil byte slice is shown as []byte{}
<thumper> empty string slice is []string{}, empty byte slice is []byte{}
<thumper> nil slice != empty slice
<thumper> so they shouldn't have the same representation
<thumper> made debugging very confusing
<davecheney> probably just how fmt decides to print them
<davecheney> thumper: if you're looking for someone to blame
<davecheney> blame gocheck
<thumper> davecheney: see the play things above
<thumper> they don't use gocheck
<thumper> if anything, I'll blame the "fmt" library
<thumper> hence, go bug?
<davecheney> certainly an inconsistency
<davecheney> i'd still like to blame gocheck for this
<davecheney> because the chances of changing it in the go runtime are approximately 0
<thumper> you can't just blame a different library because core won't change
<thumper> that's dumb
<davecheney> yup, i'm dumb
<thumper> surely it could be filed as a bug
<davecheney> absolutely
<thumper> and fixed during the next *we'll break things* type release
<thumper> to make things more consistent
<thumper> lunch time...
<davecheney> axw: ping
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1218329
<_mup_> Bug #1218329: Update default environment.yaml for Azure to use Precise for default-series <juju-core:Triaged> <https://launchpad.net/bugs/1218329>
<davecheney> do we need to hold off 1.13.3 for this bug ?
<wallyworld> davecheney: i think we should, but that's IMHO
<davecheney> wallyworld: ack
<wallyworld> davecheney: it's a trivial patch, so we can do it today once confirmation of the stream is done
<axw> davecheney: would you prefer if I hold off merging manual provisioning?
<davecheney> axw: what are the chances it'll bugger up azure
<davecheney> if it's > 5%, then yes
<axw> davecheney:  I would say 0%, but my thinking may be biased
<davecheney> :)
<davecheney> your call
<davecheney> https://canonical.leankit.com/Boards/View/103148069
<davecheney> i'm worried about all the stuff in review
<axw> davecheney: worried why?
<axw> worried that we need them in for 1.14?
<davecheney> three cards there that I wanted to see land
<davecheney> MP, azure stuff, and the bug fix for version formatting
<davecheney> there is nearly as much stuff in review as has landed
<axw> davecheney: I can and will get MP in right now. I'm confident it's not going to break anything else
<davecheney> axw: sgtm
<axw> dunno anything about the others I'm afraid
<davecheney> axw: for manually provisioning, this requires a working environment
<davecheney> ie, you cannot manually provision a bootstrap node
<axw> davecheney: correct
<davecheney> axw: if my bootstrap node is in amazon, can I manually provision a machine in my local lan ?
<axw> davecheney: yes, if it can route to the bootstrap node
<davecheney> hmm
<davecheney> this sounds icky
<davecheney> i'd like to not mention this in the release notes
<davecheney> or we'll get bollocked by jorge again
<axw> davecheney: fair enough, it is a bit raw
<davecheney> alternatively, if you want to write war and peace
<davecheney> go for it
<axw> nah leave it out for now
<davecheney> but I vote for putting a card in leankit to properly document it
<davecheney> and leaving it for 1.15.0
<axw> agreed
<davecheney> axw: https://bugs.launchpad.net/juju-core/+bug/1218329
<_mup_> Bug #1218329: Update default environment.yaml for Azure to use Precise for default-series <juju-core:Triaged> <https://launchpad.net/bugs/1218329>
<davecheney> do you have an environment to test this change?
<axw> davecheney: yep I'll give it a whirl
<davecheney> ta
<axw> azure called me up yesterday morning about the account I created ... that was slightly weird
<davecheney> axw: yeah, they'll call you a few times
<davecheney> i told the guy i was only using it for a test and had no interest in paying for their service
<davecheney> he sounded confused
<davecheney> and was still confused when he called me a week later and asked the same thing
<axw> lol
<axw> davecheney: so far, not so good... looks like the SSH keys aren't setup properly
<axw> also - I couldn't use released, had to use daily
<davecheney> axw: ok, play with it a bit more, otherwise move the ticket to 1.15.0
<axw> ok
<davecheney> once we have a 1.14 series we can backport fixes
<davecheney> but waiting on a working precise image could take longer than we have
<davecheney> so screw it
<davecheney> we can do a point release into 1.14
 * davecheney waves to robbiew 
<jcastro> davecheney: hang out with us!
<davecheney> jcastro: but it's my special time with ubu hulk
<jcastro> oh dang
<robbiew> davecheney: on my way...got distracted with other work stuff
<jcastro> ok, after?
<davecheney> jcastro: sure thing
<jcastro> we're just hanging out
<davecheney> jcastro: shiiit, what is this, a party ?
<jcastro> didn't know it was Thor <-> Hulk bonding time
<jcastro> davecheney: nah, did a live juju upgrade on discourse so we had to do it post-work and late
<davecheney> jcastro: DC characters need love too
<jcastro> and we had some feedback, but whatever, nothing that can't wait
<jcastro> from now on you are SOLMON GRUNDY
<robbiew> jcastro: wtf you doing up?
<jcastro> robbiew: upgrading services with less than a minute downtime? :)
<robbiew> nice
<jcastro> next time we're doing cascading upgrades ....
<jcastro> not to brag or anything
 * jcastro brags because he's backed by marcoceppi.
<davecheney> damnit!
<davecheney> This party is over...
<davecheney> But you can start a new one.
<davecheney> some party
<jcastro> davecheney: let's party tomorrow?
<wallyworld> axw: ping
<axw> wallyworld: hoy
<wallyworld> i just saw your mp for generating tools metadata and have a question or two
<wallyworld> 1. i'd like the functionality to generate the json separated out into a separate method, as i need to use it in another context
<wallyworld> i want to pass a list of ToolsMetadata and get back the index and product json
<wallyworld> i need this for tests
<axw> ok- I can move it to environs/tools?
<wallyworld> also, the Boilerplate() method can be updated to use this also
<wallyworld> instead of go templates
<axw> ok
<wallyworld> i think environs/tools sounds ok
<wallyworld> 2. i'm not sure why you are writing to a storage and not just to disk
<wallyworld> we don't need to generate the metadata into a public bucket
<axw> oops
<axw> :)
<wallyworld> we need to read tools from somewhere (currently all we have is the public bucket), but the metadata needs to go either to disk locally or uploaded somewhere
<axw> would people not want to host it in their private swift storage or whatever?
<wallyworld> so for now, just have the command take a dest dir (default to .juju like for image metadata)
<wallyworld> yes, private storage perhaps, but not public
<wallyworld> so you could use a storage abstraction as you have done
<axw> yeah I was thinking "public" as in shared within a private org
<axw> not the official public one
<wallyworld> ah ok
<axw> but anyway
<axw> it can go to disk and they can do that
<axw> manually
<wallyworld> well, the storage abstraction used works
<wallyworld> but use the env.Storage()
<wallyworld> not env.PublicStorage() to write to
<axw> okey dokey
<wallyworld> it would be great to land this today if we could as i have stuff queued up behind it - i have tests where i need to generate tools metadata and then read it back
<axw> will get onto it now
<wallyworld> thanks :-) it doesn't have to be perfect straight up cause we won't advertise it, it will mainly be for us internally
<wallyworld> axw: with the metadata generation, if the new method could be such that i get the json back as a string for index and product, that would be good. then there would be another new method to write those strings to files
<wallyworld> or a reader or whatever
<axw> wallyworld: nps, easy enough
<wallyworld> i just need to hold the data in memory for a http proxy class to use
<wallyworld> thanks
<axw> wallyworld: I think I just found a problem... paths should be relative to the base URL right? i.e. "juju-1.12.0-precise-amd64.tgz", and not "tools/juju-1.12.0-precise-amd64.tgz"?
<axw> since the base URL includes /tools/
<wallyworld> yes, paths should be relative to the base url
<wallyworld> i envisage tarballs will be in releases/
<wallyworld> ie http://juju.canonical.com/tools/releases
<axw> ok
<wallyworld> and metadata in http://juju.canonical.com/tools/streams/v1
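The relative-path point axw raises can be illustrated with the standard net/url package. This is a sketch under assumptions: `resolve` is a hypothetical helper, and the log does not say that the simplestreams code resolves paths this way. It only shows why a path that repeats the `tools/` prefix ends up pointing at the wrong place when the base URL already ends in `/tools/`.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolve joins a relative path onto a base URL using standard URL
// reference resolution. Hypothetical helper for illustration.
func resolve(base, rel string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	r, err := url.Parse(rel)
	if err != nil {
		return "", err
	}
	return b.ResolveReference(r).String(), nil
}

func main() {
	base := "http://juju.canonical.com/tools/" // trailing slash matters

	// Path relative to the base URL, as agreed in the discussion.
	good, _ := resolve(base, "releases/juju-1.12.0-precise-amd64.tgz")
	fmt.Println(good)

	// Repeating the prefix doubles it up.
	bad, _ := resolve(base, "tools/juju-1.12.0-precise-amd64.tgz")
	fmt.Println(bad)
}
```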
 * wallyworld has to go do school pickup
<axw> wallyworld: updated. I haven't yet added tests to environs/tools. Haven't yet updated boilerplate. It's still using storage, I will change that now.
<wallyworld> axw: ok, thanks, bbiab
<axw> cya
 * thumper slogs through fixing shit
<thumper> morning fwereade
<thumper> fwereade: around 7:30am?
<fwereade> thumper, heyhey
<fwereade> thumper, yeah
<thumper> got time for a quick chat before I EOD?
<thumper> and EOW
<fwereade> sure
<thumper> and EOM
<thumper> fwereade: https://plus.google.com/hangouts/_/ebc7ddf59bb207399b6ce7ab7ccd391a4fd9fede?hl=en
<thumper> night all
<rogpeppe> mornin' all
<wallyworld> axw: hi, i made some more comments. i hope they make sense
<axw> wallyworld: thanks, I'll have a look in a sec
<wallyworld> axw: np. i have to take the dog for a quick walk before he bites me, i'll check back in soon
<axw> hehe
<axw> no worries
<axw> wallyworld: `There's little point using an env to read the tools - the tools are readily
<axw> accessible directly from "https://juju-dist.s3.amazonaws.com/" (see synctools)`
<axw> environs/sync does use the env's storage?
<fwereade> axw, it's done in sync-tools -- there's a storage interface that talks to juju-dist directly over http
<axw> ah
<axw> fwereade: thanks
<fwereade> axw, and also allows for reading in from a dir, which I think wallyworld mentioned, but I barely skimmed that CL I'm afraid
<wallyworld> axw: sync uses the env storage to write to the private bucket, so the tools are available for that env
<axw> wallyworld: yeah I just missed the whole bit about newFileStorageReader/ec2.NewHTTPStorageReader
<wallyworld> axw: is anything unclear with my (brief) comments?
<axw> wallyworld: nope, makes sense
<axw> just updating and testing now
<wallyworld> cool, thanks
<wallyworld> the main business logic is to (somehow) get a list of tools and generate metadata
<axw> yup
<wallyworld> this will be used by sync tools and tests, plus the generate command for devs/prototyping
<axw> I've moved the fileStorageReader to environs/filestorage
<axw> will use that and the ec2 thing inside toolsmetadata
<axw> and update sync accordingly
<wallyworld> ok. i'm wondering if we want a "wget" reader also, one that takes a http url if you know what i mean
<wallyworld> maybe not
<wallyworld> let's wait till we need it
<wallyworld> i'm hoping medium term when we have the proper tools repository that we won't need the reader abstraction
<wallyworld> since we can access the official tools and metadata directly using wget
<wallyworld> and not via a s3 bucket
<axw> yeah, no need to list then
<axw> wallyworld: you want me to move ItemCollection.UnmarshalJSON to json.go? so, I should move all the structures there?
<wallyworld> so when we do that, we should be able to have an abstraction that takes a url, either file:// or http:// to get the tools etc
<axw> in environs/simplestreams
<wallyworld> axw: i 'd like all the json related stuff out separate
<axw> ok
<wallyworld> so as not to obscure the core simplestreams logic
<wallyworld> we did something similar in goose
<wallyworld> see goose.nova package
<rogpeppe> fwereade: ping
<fwereade> rogpeppe, pong
<rogpeppe> fwereade: i've been looking at the config stuff, and wondering about firewallmode==default
<rogpeppe> fwereade: the idea is that a given environment can choose its own default, right?
<fwereade> rogpeppe, I *think* so, yes
<rogpeppe> fwereade: i can't see how that can happen currently
<fwereade> rogpeppe, doesn't surprise me at all
<rogpeppe> fwereade: the code *looks* as if it's trying to go in that direction
<rogpeppe> fwereade: but swerves at the last moment
<fwereade> rogpeppe, haha
<fwereade> rogpeppe, if that's not what it's for, your guess is as good as mine
<rogpeppe> fwereade: in particular, in config.New:
<rogpeppe> 	// Default firewall mode is instance.
<rogpeppe> 	if c.FirewallMode() == FwDefault {
<rogpeppe> 		c.m["firewall-mode"] = string(FwInstance)
<rogpeppe> 	}
<fwereade> ha
<fwereade> rogpeppe, I think TheMue may have been involved there
<rogpeppe> fwereade: so AFAICS no provider will ever see FwDefault
<fwereade> rogpeppe, he might remember
<fwereade> rogpeppe, I think you're right
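A rough sketch of why no provider ever sees the default mode, and of what a provider-chosen default might look like. Everything here is hypothetical: the `FirewallMode` type, the constant values and both functions are illustrative stand-ins modelled on the snippet rogpeppe quotes, not the real config package.

```go
package main

import "fmt"

// FirewallMode and its values are assumed for illustration.
type FirewallMode string

const (
	FwDefault  FirewallMode = "default"
	FwInstance FirewallMode = "instance"
	FwGlobal   FirewallMode = "global"
)

// normalize mimics the quoted config.New behaviour: the default mode is
// rewritten to instance before any provider gets to look at it.
func normalize(mode FirewallMode) FirewallMode {
	if mode == FwDefault {
		return FwInstance
	}
	return mode
}

// providerMode sketches the alternative: the config keeps FwDefault and
// each provider substitutes its own preferred mode.
func providerMode(mode, providerDefault FirewallMode) FirewallMode {
	if mode == FwDefault {
		return providerDefault
	}
	return mode
}

func main() {
	fmt.Println(normalize(FwDefault))              // provider only ever sees "instance"
	fmt.Println(providerMode(FwDefault, FwGlobal)) // provider picks its own default
	fmt.Println(providerMode(FwInstance, FwGlobal))
}
```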
<rogpeppe> fwereade: i'm sorely tempted to lose the whole idea
<rogpeppe> fwereade: and just let a given provider ignore the firewall mode if it can't deal with it
 * fwereade shrugs, and doubts that's any worse than what we currently have
<fwereade> rogpeppe, the whole firewalling story is madness anyway IMO
<rogpeppe> fwereade: BTW i'm testing a branch which cleans up the config prior to moving forward with environment attr storage, which changes config.New to this: http://paste.ubuntu.com/6043287/
<fwereade> rogpeppe, it's probably not too many steps from sanity, but plotting those steps makes me confused
<rogpeppe> fwereade: i'm hoping/assuming you will wholeheartedly approve
<axw> wallyworld: there's a bunch of tests I'm not going to be able to complete today. if you wanted this stuff landed to unblock you, I can do so and follow up with tests on Monday
<wallyworld> axw: it's ok, i can wait. i'm currently modifying another upstream branch
<axw> wallyworld: okey dokey
<fwereade> rogpeppe, that looks pretty sexy to me :)
<wallyworld> we can get the next release out first anyway
<rogpeppe> fwereade: the next step is to pass a config.Defaulting flag to EnvironProvider.Prepare and EnvironProvider.Open
<wallyworld> axw: maybe push what you have and i'll take a look over the w/e
<rogpeppe> fwereade: to enable providers to be similarly strict about env vars
<fwereade> rogpeppe, nice
<axw> wallyworld: done
<wallyworld> thanks :-)
<wallyworld> will look a bit later
<axw> wallyworld: just missing tests on environs/filestorage (never were any explicit ones), and the new marshalling stuff in environs/tools
<axw> thanks
<fwereade> wallyworld, btw, I'm having some persistent difficulty mapping simplestreams cloud info <-> juju concepts -- would you be free for a chat about it after the standup?
<wallyworld> we are also blocked on IS getting the tools repository set up
<axw> wallyworld: I generated metadata for all the tools in S3 earlier, if you want the JSON files for anything...
<wallyworld> fwereade: sure. i'll be up to my 5th drink by then, should be fun :-)
<wallyworld> axw: sure, send them my way, thanks :-)
<fwereade> wallyworld, it'll probably help ;p
<wallyworld> fo :-)
<rogpeppe> everyone should try the go coverage tool to check test coverage - it's awesome!
<rogpeppe> for example, see this output: http://paste.ubuntu.com/6043486/ (save the text as html and open it in your browser)
<fwereade> wallyworld, btw, when you're around, I was wondering if you knew what "crsn" was meant to stand for :)
<wallyworld> fwereade: cloud region short name
<fwereade> rogpeppe, nice and shiny, indeed
<fwereade> rogpeppe, I have seen far worse coverage output ;p
<rogpeppe> fwereade: BTW, speaking of test coverage, do you know what happened to the tests for RelationUnitsWatcher ?
<rogpeppe> fwereade: there appear to be none at all
<fwereade> rogpeppe, I'm not that surprised, it's basically an implementation detail of RelationScopeWatcher
<rogpeppe> fwereade: it is used directly by the uniter though
<rogpeppe> fwereade: and none of its code is covered by the state tests (verified with the shiny new coverage tool)
<rogpeppe> fwereade: i suspect the test coverage loss was an inadvertent consequence of https://code.launchpad.net/~fwereade/juju-core/state-relationunit-move/+merge/144747
<fwereade> rogpeppe, sorry, meeting pushed that conversation out of my mind
<rogpeppe> fwereade: np
<fwereade> rogpeppe, looks like you're right, and I suck
<rogpeppe> fwereade: happens to all of us :-)
<fwereade> rogpeppe, indeed, but thanks for catching it
<fwereade> rogpeppe, I suspect I zoned briefly out of the distinction between Watch and WatchScope and, boom, splat
<rogpeppe> fwereade: i'm wondering if we could do something with the coverage tool to automatically detect loss of test coverage
<rogpeppe> fwereade: something to keep in mind anyway
<fwereade> rogpeppe, that would be very nice, yeah
<fwereade> rogpeppe, tech-debt bug?
<rogpeppe> fwereade: https://bugs.launchpad.net/bugs/1218362
<_mup_> Bug #1218362: state.RelationUnitWatcher is not tested <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1218362>
<fwereade> rogpeppe, ah, I meant for the coverage tool
<rogpeppe> fwereade: oh, i see
<rogpeppe> fwereade: yeah, i'll do that
<rogpeppe> fwereade: https://bugs.launchpad.net/juju-core/+bug/1218834
<_mup_> Bug #1218834: There's no way to easily detect loss of test coverage <tech-debt> <juju-core:New> <https://launchpad.net/bugs/1218834>
<fwereade> rogpeppe, standup
<rogpeppe> fwereade: "There is a problem with connecting to this video call. Try again in a few minutes"
<natefinch> evilnickveitch: it occurs to me that your document says to sync docs with the reality of 1.12 ... does that mean we shouldn't document stuff that is later than 1.12?  For example, windows support is not in 1.12
<mgz> natefinch: if we release 1.14, presumably we'd update that to read '1.14'
<evilnickveitch> natefinch, i think in this case we can document it and work out what notes and info we need to provide later. at some point in the future we will have different versions of the docs which will make this easier
<mgz> having a doc 'feature' branch for dev stuff that will be in next stable seems sane
<mgz> then we just merge that to doc trunk when we do the release
<evilnickveitch> mgz, indeed
<evilnickveitch> we need a good base to start from - i think we need to be at a point where we are not adding loads of stuff to the stable docs before we start doing that
<natefinch> mgz, fwereade, rogpeppe: just had a thought - should we refrain from creating a "local" environment section in environment.yaml on Windows, since it can't possibly work there?  Or maybe create it commented out, with a comment that it doesn't work on Windows.
<natefinch> TheMue: ^^^ since you're writing docs on the local provider - might be good to mention it doesn't work on Windows :)
<TheMue> natefinch: oh, yes, good hint, and it won't work on a mac
<natefinch> TheMue: yep
<TheMue> natefinch: only in a vmware image running linux ;)
<fwereade_> natefinch, good idea, thank you
<fwereade_> hmm, my smoothie has fermented
<TheMue> fwereade_: iirks
<marcoceppi> How often does the store update personal branches?
<marcoceppi> nvm, gui cache issue
<rogpeppe> fwereade_, mgz, natefinch, jam, TheMue: trivial CL, enabling some tests that were not hooked in: https://codereview.appspot.com/13428043/
<fwereade_> rogpeppe, I'll do that if you'll do https://codereview.appspot.com/13401045 :)
<fwereade_> rogpeppe, LGTM
<rogpeppe> fwereade_: likewise
<mgz> :
<fwereade_> rogpeppe, cheers
<mgz> nice backscratching :)
<TheMue> rogpeppe: LGTM from my side too
<rogpeppe> mgz: always good when you have an itch
<mramm> fwereade_: you around?
<fwereade_> mramm, yeah, what can i do for you?
<mramm> I am wondering if we can setup a "what does 2.0 mean" meeting for the beginning of next week
<fwereade_> mramm, sure, that sounds good
<fwereade_> mramm, a bit after standup on monday perhaps?
<mramm> I'm thinking about timelines, and thinking that we may want to manage the scope of a 2.0 release a bit more
<mramm> yea, that works for me
<fwereade_> mramm, that was put with admirable delicacy btw
<mramm> I will want to include Gary in that conversation
<hazmat> fwereade_, ping
<fwereade_> hazmat, pong
<hazmat> fwereade_, trying to debug the slowness that's been reported on the list
<hazmat> do you have a minute to discuss on g+?
<fwereade_> hazmat, thank you kindly :)
<fwereade_> hazmat, in a moment I think
<mgz> hazmat: a fair bit of it can just go from status, when it doesn't need to do anything other than query state (over the api)
<hazmat> mgz, right, i could make a fast status impl right now just using the allwatcher, not quite the same data. one of the issues in their status output is the amount of garbage they've got  in terms of machines
<hazmat> that are missing, pending, etc.
<hazmat> and helping them garbage collect
<mgz> right, there's always an issue inside the reported issue :)
<hazmat> mgz, status in the api is all nesc for correctness for deployer
<hazmat> since watch api is eventually consistent
<hazmat> i took a look at the sprint unfortunately the status code has several embedded panics
<hazmat> s/all/also
<hazmat> hmm.. bootstrapping on ec2 takes a very long time
<hazmat> for the command to complete
<fwereade_> bbl
<hazmat> is juju trunk supposed to build?
<hazmat> provider/ec2/ec2.go:131: inst.Instance.IPAddress undefined (type *ec2.Instance has no field or method IPAddress)
<rogpeppe> hazmat: i think you need to do go get -u launchpad.net/goamz/ec2
<hazmat> rogpeppe, thanks
<rogpeppe> fwereade_, natefinch, TheMue, anyone else: , here's a large but trivial CL, just moving test code around: https://codereview.appspot.com/13348049/
<rogpeppe> hazmat: for the record, there's no a file in the juju-core root (dependencies.tsv) that declares the revision numbers of dependencies
<rogpeppe> s/no a/now a/
<hazmat> hmm ec2 Storage().RemoveAll() takes 50s of 1m3s for destroy-environment
<rogpeppe> hazmat: hmm, how many tools were uploaded?
<hazmat> rogpeppe, 1
<rogpeppe> hazmat: that's really odd
<hazmat> rogpeppe, it was an empty environment, nothing deployed
<rogpeppe> hazmat: i mean, i know s3 is slow, but...
<hazmat> rogpeppe, do we even need to remove them?
<hazmat> rogpeppe, i mean minus provider-state, it's basically just cache
<rogpeppe> hazmat: it tries to remove all trace of an environment when it's destroyed
<hazmat> except security groups
<rogpeppe> hazmat: unfortunately that's not possible
<rogpeppe> hazmat: (i tried)
<hazmat> rogpeppe, right.. but it's not clear that we're adding significant value destroying the bucket vs the time it takes
<rogpeppe> hazmat: i've never seen it take anywhere near that long
<rogpeppe> hazmat: does this happen every time for you?
<hazmat> rogpeppe, atm yes, i'm still instrumenting, but about to switch out for a meeting
<hazmat> rogpeppe, bootstrap also takes quite a while
<rogpeppe> hazmat: with --upload-tools ?
<hazmat> rogpeppe, no
<rogpeppe> hazmat: hmm, it's not bad for me if i'm not uploading tools
<hazmat> rogpeppe, 2m9s for me
<hazmat> rogpeppe, define not bad?
<rogpeppe> hazmat: 26s for me
<hazmat> rogpeppe, which provider?
<rogpeppe> hazmat: ec2
<hazmat> rogpeppe, hmm.. region?
<rogpeppe> hazmat: us-east, i think
<rogpeppe> hazmat: yeah
<hazmat> i'm working against us-west-2 (us-east is literally my backyard)
<rogpeppe> hazmat: part of the slowness is timeouts for eventual consistency. it looks like if you haven't uploaded any tools, we time out, because the bucket doesn't exist and we poll just in case it was only created a moment ago
<rogpeppe> hazmat: that's 5s of my time
<rogpeppe> hazmat: there's another 10s gap where i'm not sure what it's doing
<hazmat> gotta run for a meeting..
<rogpeppe> hazmat: k
<hazmat> rogpeppe, it does look like the destroy time was an aberration, i'm getting closer to 30s averages now.
<rogpeppe> hazmat: ok. but that's still a bit rubbish
<rogpeppe> hazmat: i'd like to add trace messages that print all traffic to and from servers
<rogpeppe> hazmat: so we can see just what's going on
<hazmat> rogpeppe, here's my instrumentation on destroy fwiw http://pastebin.ubuntu.com/6045232/
<rogpeppe> hazmat: looks like there might be 15s worth of eventual-consistency waiting in there
<rogpeppe> hazmat: or maybe 20s
<hazmat> rogpeppe, should destroy-env care about eventual consistency?
<rogpeppe> hazmat: the difficulty is writing live tests that pass consistently
<rogpeppe> hazmat: if we've just created some storage, then try to delete it, the operation often fails
<rogpeppe> hazmat: unless we try for a while
<rogpeppe> hazmat: i frickin' hate it
<hazmat> right, but that's a test responsibility not a user experience
<hazmat> ?
<rogpeppe> hazmat: well, we're testing that the operations work
<hazmat> running away
<hazmat> to meeting ;-)
<rogpeppe> hazmat: :-)
<natefinch> ha, wow, it took me forever to figure out why some of my code kept saying "LICENSE" was not a file, when I clearly saw it right in the filesystem.  Finally realized the file was actually named LICENCE ... you crazy brits :)
 * natefinch just finally proposed his Windows changes for juju client, including the installer... now if only everyone else wasn't gone for the weekend :)
<rogpeppe> natefinch: i'll swap you a review: https://codereview.appspot.com/13355044
<rogpeppe> natefinch: :-)
<rogpeppe> natefinch: although i fear mine is probably a lot more work than yours to review...
<TheMue> so, guys, i'm stepping out, have a nice weekend
<evilnickveitch> TheMue, thanks for your efforts!
<TheMue> evilnickveitch: np, anytime again
<natefinch> rogpeppe: I'm at eod, but will look monday morning
<rogpeppe> natefinch: np
<rogpeppe> natefinch: i'm away next week BTW
<natefinch> oh yeah.  have fun!
<rogpeppe> natefinch: what's your CL, BTW?
<natefinch> rogpeppe: https://codereview.appspot.com/13079045/
<rogpeppe> natefinch: ta
<rogpeppe> natefinch: have a great weekend
<rogpeppe> natefinch: and a good week too!
<natefinch> rogpeppe:  you too
<rogpeppe> right, that's me for the week
<rogpeppe> g'night anyone that's still around :)
#juju-dev 2013-09-01
<thumper> mornin
<thumper> g
<bigjools> morning thumper
<thumper> o/
<bigjools> booked your flights yet thumper?
<thumper> ha, no
<thumper> need to do the initial paperwork and signoff right?
<bigjools> yeah
<bigjools> I did that bit
<bigjools> doesn't help having a travel provider 9 time zones away
<thumper> :)
<wallyworld_> i booked my own tickets
<davecheney> i'm going to start cutting 1.13.3 now
<wallyworld_> davecheney: excellent, i'll commit the bootstrap change when the release is done
<davecheney> wallyworld_: yup, i wanna unblock y'all as soon as possible
<davecheney> oh fuck
<davecheney> ubuntu@ip-10-248-63-202:~$ df -h
<davecheney> Filesystem      Size  Used Avail Use% Mounted on
<davecheney> /dev/xvda1      7.9G  7.8G     0 100% /
<davecheney> showstopped
<davecheney> showstopper
<thumper> :(
<davecheney> thumper: this is an upgrade problem
<thumper> davecheney: where is that?
<davecheney> this is a 1.12 env upgraded to 1.13
<thumper> is it logging?
<davecheney> yup
<thumper> what is it logging?
<davecheney> 1.12 did not contain logrotate
<davecheney> ubuntu@ip-10-248-63-202:~$ ll /var/log/juju/
<davecheney> total 3356424
<davecheney> drwxr-xr-x  2 root   root       4096 Aug 29 00:46 ./
<davecheney> drwxr-xr-x 12 root   root       4096 Sep  1 06:45 ../
<davecheney> -rw-r-----  1 syslog adm  3226935296 Sep  1 23:13 all-machines.log
<davecheney> -rw-r--r--  1 root   root       5508 Aug 29 00:46 juju-gui.log
<davecheney> -rw-r--r--  1 root   root  108191300 Sep  1 23:13 machine-0.log
<davecheney> -rw-r--r--  1 root   root  101818940 Sep  1 23:13 unit-juju-gui-0.log
<davecheney> thumper: wallyworld_ here is the problem
<davecheney> if you have a 1.13 environment, log rotate works
<davecheney> if you have a 1.12 and you upgrade
<davecheney> log rotate does not work
<wallyworld_> i didn't realise we had added log rotate
<davecheney> wallyworld_: uh, you were the one that did it ...
<thumper> haha
<wallyworld_> nope, i didn't add log rotate
<thumper> that is a metric fuck-ton of logging
<thumper> WTH is being logged?
<davecheney> wallyworld_: there are two bugs about log rotate
<wallyworld_> sure, one of the fixes by andrew was to tweak the rsyslog conf
<davecheney> i'm sure you closed at least one of them
<wallyworld_> but that is not log rotate
<davecheney> thumper: one of the uniters has gone mad
<wallyworld_> the fix i'm aware of was to stop rsyslog's processing of rules so it didn't keep adding stuff
<thumper> davecheney: gone mad in which way?
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1218616
<davecheney> thumper: appears to have been doing config-changed over and over again
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1191651
<davecheney> ^ wallyworld_ i was wrong
<davecheney> there are two bugs, both open
<wallyworld_> ok, phew, i thought i had missed something
<davecheney> hmmm, ok, this isn't a blocker
<davecheney> it's broken in 1.12 and 1.13.x
<davecheney> i hope you don't want to run your environment for more than 4 days
<davecheney> ...
<thumper> so this is more about a uniter going made
<thumper> mad
<thumper> and logging constantly?
<davecheney> thumper: counting
<hazmat> thumper, the api rpc stuff also logs constantly, though not runaway style
<davecheney> hazmat: it's not that
<davecheney> but it is annoying
<hazmat> yeah
<davecheney> that was turned off
<thumper> I'm working on a branch that should fix this
<davecheney> how did it get turned back on again ?
<thumper> I propose setting the log level of logging every message and response as "trace"
<thumper> for juju.api
<davecheney> +1
<davecheney> thumper: ubuntu@ip-10-248-63-202:~$ grep -c config-changed /var/log/juju/all-machines.log
<davecheney> 3181195
<davecheney> config-changed hook has gone ape shit
<thumper> hmm
<thumper> while amusing, not helpful
<davecheney> thumper: large log is large
<davecheney> i'm grabbing it now
<davecheney> not sure where to put 100 odd gig of logs
<thumper> yeah, we probably don't need all of it
<thumper> you could probably safely just grab the last few megs
<davecheney> thumper: wallyworld_ can one of you please merge those two bugs
<thumper> and that would be more than enough
<davecheney> and mark the remaining one as blocker for 1.13.4
<wallyworld_> davecheney: ok
<davecheney> wallyworld_: while technically it should be a blocker for 1.13.3
<davecheney> we need to release that today
<davecheney> and make a 1.14 branch
<davecheney> i'll start a backport document for bugs we have to backport
<wallyworld_> they are separate issues though
<wallyworld_> so no point marking as dupe
<davecheney> wallyworld_: ok, i don't really care, as long as it gets fixed
<davecheney> i know the one from kurt is really 'do debug-log another way'
<wallyworld_> sure
<davecheney> that won't get done this week
<davecheney> so maybe the other one from prodstack is the better choice
<thumper> hmm...
<thumper> it seems that we do now have rsyslog things added
<thumper> but I bet on upgrade, we don't add them
<davecheney> thumper: yup, that was what I thought the problem was initially
<thumper> I do wonder where it would specify size...
<wallyworld_> rsyslog can do log rotation: http://www.rsyslog.com/doc/log_rotation_fix_size.html
<wallyworld_> so we can add that for next release
<davecheney> wallyworld_: release is tagged, go nuts
<wallyworld_> davecheney: awesome, thanks
#juju-dev 2014-08-25
<perrito666> hey,anyone seen menn0?
<thumper> perrito666: he is on holiday this week
<perrito666> such is my luck
<perrito666> thank you thumper
<thumper> np
<davecheney> thumper: wallyworld something is screwed with the cmd/juju tests
<davecheney> 1,000's of goroutines all waiting on some semaphore to send the 404 message back to the client
<wallyworld> is this new? or more likely they've been screwed for a while
<davecheney> not new
<davecheney> but really getting worse
<davecheney> who knows
<davecheney> what spec is the machine you are booting for running CI tests ?
<axw> wallyworld: hey, was TestBootstrapNoTools failing?
<axw> ah, I see a bug #
<wallyworld> axw: yeah, on i386 and ppc64
<wallyworld> davecheney: it's a m3.large i think
<axw> sorry, should've run them all with i386 first
<wallyworld> np
<wallyworld> axw: i think that test is obsolete now anyway
<davecheney> wallyworld: ok, that shouldn't be penalised for running cpu intensive jobs
<davecheney> i know the smalls and mediums get heavily penalised actually using the cpu
<wallyworld> aws reduces their priority?
<davecheney> yup
<davecheney> spin up a small or a medium
<davecheney> run mpstat
<davecheney> and look at the %steal column
<davecheney> can be up to 80% on t1.smalls
<davecheney> the more cpu you use, the more you are penalised
<davecheney> so when you REALLY need to go fast, you go the slowest
<davecheney> yay, cloud
<davecheney> wallyworld: when cmd/juju does pass, it passed by the barest of margins
<davecheney> ok  github.com/juju/juju/cmd/juju  597.325s
<davecheney> 2.675 seconds and this build would have failed
<wallyworld> yep
<wallyworld> if i had a magic wand to fix the tests i'd wave it
<davecheney> go test -test.timeout=900s github.com/juju/juju/...
<davecheney> possibly the option needs to go at the end
<davecheney> but there is something horribly wrong with that cmd/juju test
<axw> yeah, it goes end-to-end when it doesn't need to
<davecheney> axw: do you know the name of the test off the top of your head
<davecheney> i noticed it's using io.Pipe, not os.Pipe
<davecheney> some buffering may help
<axw> davecheney: which test?
<davecheney> 11:23 < davecheney> but there is something horribly wrong with that cmd/juju test
<davecheney> 11:24 < axw> yeah, it goes end-to-end when it doesn't need to
<axw> I was referring to the entire package
<davecheney> :(
<wallyworld> davecheney: so many of our tests suck - they are not unit tests
<wallyworld> there's no easy fix
<davecheney> wallyworld: +1,000,000 to the 'not unit test' comment
<wallyworld> yep :-(
<wallyworld> doesn't help that mongo is *everywhere*
<axw> we don't even need the API server in the mix, let alone mongo
<axw> (for cmd/juju)
<wallyworld> yup, the mongo comment was a general lament
<perrito666> ok folks Ill go finish my sunday
<perrito666> btw if someone wants to review https://github.com/juju/juju/pull/530 which has only been reviewed by junior reviewers and is in need of being merged
<perrito666> cheers
<thumper> davecheney: https://github.com/juju/juju/pull/570
<davecheney> thumper: /me looks
<davecheney> thumper: this all looks rather uncontroversial
<thumper> davecheney: it should be :)
<waigani> thumper: what version should the upgrade step target (adding state users as environ users)
<thumper> waigani: 1.21.alpha1
<waigani> thumper: a 'get all users' function is not jumping out at me, does one exist (I've got admin)?
<thumper> waigani: I don't think there is one yet
<davecheney> waigani: i don't think one exists
<davecheney> we only have one user
<davecheney> so you sort of knew the answer
<waigani> in that case is it okay to just add admin to environ in the upgrade step?
<davecheney> as long as you don't call them "admin"
<waigani> sigh
<thumper> waigani: no... there is an example with the upgrade step
<thumper> waigani: already in the code that iterates through every user
<thumper> waigani: it updates the last connection thingy
<thumper> waigani: so just do something similar
<waigani> thumper: okay, I'll take a look
<thumper> davecheney: and I'm dealing with the admin name :)
<waigani> davecheney, thumper: what are we calling admin once it's added as an environment user?
<waigani> oh just saw your comment thumper, so I'll leave it admin for now?
<thumper> waigani: what do you mean?
<thumper> don't alter the existing user, just connect to the environment by adding an EnvUser
<waigani> thumper: just trying to grok davecheney's comment about admin name
 * thumper takes a deep breath and fixes state.Initialize
<davecheney> waigani: the initial user of an environment is "admin"
<davecheney> wallyworld__: thumper so, bad news
<davecheney> there is no stand out test in cmd/juju
<davecheney> they are all slow
<davecheney> well they are fast
<davecheney> milliseconds
<davecheney> but the setup for each test is a few seconds each
<wallyworld__> that's a common failing across most of our tests because they are not true unit tests
 * thumper agrees
<thumper> waigani: can't hear you
<waigani> thumper: shit sorry, hangon
<wallyworld__> axw: when you have a moment, a fairly small one https://github.com/juju/juju/pull/598
<axw> wallyworld__: yup, looking
<wallyworld__> axw: thanks, yeah, i'm only doing this for 1.20
<axw> wallyworld__: cool
<axw> sorry, missed simplification until after LGTM, but it doesn't really matter
<thumper> davecheney: your branch landed \o/
<thumper> mine is running now
<waigani> thumper: usersC.Find(nil).All(&users) finds one zero valued user, yet st.User("admin") finds the admin user?
<thumper> waigani: then it will be a different query to list them all
<thumper> perhaps you just have to specify something that matches all possible ones
<thumper> axw: do you have a few minutes to talk through a problem?
<axw> thumper: in a few mins, lunch is about to come out of the oven
<thumper> axw: ok, ping me when you have 10 minutes or so... I'll pause and do something else for now
<axw> thumper: ready
<thumper> axw: https://plus.google.com/hangouts/_/grprl36idkwixt2q2cpajgeip4a?authuser=1&hl=en
<wallyworld_> thumper: is menno around today?
<thumper> waigani: menno is on leave this week
<wallyworld_> ok, ta
<wallyworld_> he should have marked it on the calendar :-)
<thumper> yeah, I'll go add it
<davecheney> thumper: finally
<davecheney> looking into just getting a little more out of the cmd/juju tests now
<davecheney> alone they take 400s on my machine
<davecheney> running with others they can take > 600
<davecheney> wallyworld_: I have a fix for the slowness of cmd/juju
<davecheney> you probably won't like it
<wallyworld_> maybe :-)
<davecheney> wallyworld_: do you want me to tell you what i've done
<davecheney> or just send a PR
<wallyworld_> tl;dr; version?
<davecheney> i've moved some of the tests into another package
<wallyworld_> which package?
<davecheney> cmd/juju -> cmd/juju/test
<wallyworld_> doesn't that just move the problem?
<davecheney> the problem is the package takes > 600s to test
<davecheney> split the package up
<davecheney> each part takes less time
<wallyworld_> has this been causing failed builds?
<davecheney> yes
<davecheney> well there is the usual nonsense with the repl sets
<davecheney> but over the weekend cmd/juju has constantly been taking > 600 seconds
<davecheney> it takes 400s on my machine uncontended
<davecheney> and close to 550 with other tests running in parallel
<wallyworld_> we could simply tweak the test timeout until the tests can be fixed
<davecheney> i'm proposing my solution
<davecheney> you're welcome to nack it
<wallyworld_> ok
<davecheney> but in my experience raising timeouts only leads to raising timeouts again
<davecheney> and again
<davecheney> and again
<wallyworld_> well, we need to fix the test
<wallyworld_> moving them just messes up the code base
<wallyworld_> axw: remind me - to log onto juju's mongo, the password is recorded in the jenv file. yet mongo --ssl -u admin -p xxxx --port 37017 fails with an auth error
<davecheney> wallyworld_: i didn't say it was a perfect solution
<davecheney> but it solves the problem we have today
<davecheney> where it takes two days to land a branch
<wallyworld_> so does increasing the timeout
<wallyworld_> without any churn to the code
<davecheney> wallyworld_: i'll propose my solution, you can nack it
<wallyworld_> two days?
<axw> wallyworld_: try the hash of the password
<wallyworld_> axw: sha256?
<davecheney> wallyworld_: it took me friday, saturday and this morning to land my names branch
<davecheney> it failed 7 times
<axw> wallyworld_: I don't know what the hash type is, it's stored in the bootstrap agent's  conf file IIRC
<wallyworld_> davecheney: timeouts? i've never had any failure due to timeouts like that
<axw> wallyworld_: what are you trying to do?
<wallyworld_> axw: i want to log in and look at the collections
<davecheney> wallyworld_: this is what tim pinged you about this morning
<davecheney> go check the build dashboard
<axw> wallyworld_: https://github.com/kapilt/juju-dbinspect
<davecheney> dozens of red builds all from cmd/juju timing out
<wallyworld_> i see one red dot because of that
<wallyworld_> maybe the other timeouts caused -p 2 to run and those worked
<davecheney> wallyworld_: those are the timeouts i'm talking about
<davecheney> and the reason why the build times have gone from 18 minutes to an hour
<davecheney> in the last week
<wallyworld_> we need to identify the root cause and fix that
<wallyworld_> something must have changed to make the tests start running so long
<davecheney> i noted on this channel 2 months ago that the times for cmd/juju were growing
<davecheney> they have now passed the 10 minute mark
<davecheney> this is the result
<davecheney> every SINGLE test case in cmd/juju takes 3-4 seconds to setup
<davecheney> every time we add a new command or option
<davecheney> boom another 3-5 seconds gone
<wallyworld_> yep, so there's a systemic problem with those tests
<wallyworld_> moving stuff to a new package is simply rearranging the deck chairs on the titanic
<davecheney> wallyworld_: no argument there
<davecheney> but it has blocked landing anything for a week now
<wallyworld_> not blocked
<davecheney> (do you want to see the graph again)
<davecheney> yes blocked
<wallyworld_> landings have occurred
<davecheney> my change took all weekend to land
<wallyworld_> only one was blocked
<davecheney> the tests have gone from 18 minutes to an hour
<wallyworld_> only one red dot was attributable to this problem
<davecheney> because of this
<davecheney> so, instead of being able to land 3 changes per hour
<davecheney> we can land less than one
<davecheney> hows that for productivity ?
<wallyworld_> that's not blocked by definition
<davecheney> my change is downright horrible
<davecheney> but as nobody else is stepping up to address this issue
<wallyworld_> so we can work around it by running with -p 2 to start with
<davecheney> i stand by it
<wallyworld_> we need to draw a line in the sand and just fix the freaking tests
<wallyworld_> just as curtis has now started being firm about blocking landings for regressions, we also need to take a first stance
<wallyworld_> firm
<davecheney> fair enough
<davecheney> i don't know how to resolve this situation
<wallyworld_> someone needs to look into what happens at test startup and determine how to better mock out the backend
<wallyworld_> axw: i gotta go get my son and take him to doctor, will look at your branch when i get back
<axw> wallyworld_: cheers
<axw> it's a big one...
<wallyworld_> that's what she said
<dimitern> jam, hey, sorry I'll be 10m late for 1-1
<jam> dimitern: np
<jam> thanks for letting me know
<TheMue> morning
<jam> morning TheMue
<jam> mgz: if you're still doing reviews today: https://github.com/juju/juju/pull/601
<dimitern> jam, https://github.com/juju/juju/pull/601 LGTM
<wallyworld_> axw: i wanted to get another fix done for 1.20. i can look at your branch now. if you have time, maybe you could look at https://github.com/juju/juju/pull/602
<axw> wallyworld_: sure, looking
<wallyworld_> axw: i'm not 100% sure it will fix the issue - the fix is based on reading the code
<axw> wallyworld_: we shouldn't be starting the container provisioner until after the upgrades have finished, right?
<wallyworld_> axw: it starts after the upgrade steps worker yes, but the upgrade work starts in parallel with it
<wallyworld_> it's a bit confusing
<wallyworld_> the upgrade work listens for upgrade requests
<axw> wallyworld_: huh? how can it start after and in parallel?
<wallyworld_> the upgrade steps worker does the upgrades
<wallyworld_> there's 2 workers - upgrade and upgrade steps
<axw> wallyworld_: ah ok. after upgrade steps, in parallel with upgrader
<wallyworld_> yes, that's what it looked like
<wallyworld_> and if there were no upgrade steps, it's all a bit of a race to see who starts first
<perrito666> good morning everyone, I am OCR today along with mgz feel free to ask for reviews, Ill be taking a look at the queue anyway
<jam> wallyworld_: this looks like a spurious failure to me: http://juju-ci.vapour.ws:8080/job/github-merge-juju/411/console
<jam> I'll dig into it
<jam> but in case you wanted it around for reference
<wallyworld_> jam: thanks. it's on my todo list this week to get these documented. our tests need work
<jam> this isn't very helpful:
<jam> goroutine 17700 [running]:
<jam> 	goroutine running on other thread; stack unavailable
<jam> created by launchpad.net/gocheck.(*suiteRunner).forkCall
<jam> 	/home/ubuntu/juju-core_1.21-alpha1/src/launchpad.net/gocheck/gocheck.go:631 +0x23f
<jam> the only thing "running" is trying to fork something
<jam> wallyworld_: the second failure I don't fully understand, as it seems like maybe something didn't clean up in time and stayed bound to a port we thought we wanted to use in the next test
<wallyworld_> jam: it seems we have all sorts of isolation issues, plus issues with mongo startup failing at various times. sadly, many of our unit tests aren't really unit tests
<jam> dimitern: standup ?
<dimitern> jam, owm
<jam> TheMue: you weren't supposed to change your networking
<perrito666> is there anyone besides menno that is familiar with upgrade mode?
<dimitern> perrito666, wallyworld_ afaik, but he might be off already
<perrito666> tx dimitern
 * wallyworld_ is sorta here
<wallyworld_> menno did all of the upgrade mode stuff though
<wallyworld_> is there a specific question?
<dimitern> wallyworld_, I see, ok it seems my knowledge is a bit out of date :)
<wallyworld_> dimitern: i did the initial upgrade work, but tim's team has since taken it on
<perrito666> wallyworld_: not really I am implementing a "restore mode" and william told me to discuss with menno to make sure I dont reinvent the wheel
<wallyworld_> seems like there will be overlap there, or similar restrictions anyway
<dimitern> perrito666, wallyworld_, what's a "mode"? a runner that runs its workers with delay from the rest?
<dimitern> or more like a uniter mode
<perrito666> wallyworld_: there will certainly be
<perrito666> dimitern: it is an arbitrary term I believe
<dimitern> :) ah
<perrito666> in the case of upgrade
<perrito666> its a state of the API server where it rejects most requests
<wallyworld_> dimitern: at a high level, it means that the state server will reject connections while an upgrade is still running
<perrito666> with an error indicating its upgrading
<dimitern> wallyworld_, perrito666, I see, thanks guys!
<wallyworld_> there's so much to keep track of
<perrito666> dimitern: restore mode is to do something very similar
<dimitern> yeah, just keeping all the far and wide networking effort going pretty much leaves me in the dark about what's going on with the other teams :/
<wallyworld_> yup, same here
<JoshStrobl> What hooks would be the appropriate place to call open-port? I'm guessing config-changed, start, install, etc. and close-port should be called on ./stop?
<dimitern> JoshStrobl, in practice it doesn't matter - just do it (in a hook) before you need to use it; ideally both at config-changed time (close-port first, open-port after that)
<dimitern> s/at//
<dimitern> sorry, that should've been s/ideally both at/ideally at/
<JoshStrobl> dimitern, thanks :)
<dimitern> JoshStrobl, np :) and as for close-port - ideally in the stop hook
<JoshStrobl> dimitern, yea I figured that much :D
<dimitern> mgz, hey, it seems the merge bot has some issues today - multiple failures, timeouts..
<ppetraki>  /join ##cacheio badblocks
<ericsnow> mgz: you around?
<arosales> fyi bug https://bugs.launchpad.net/juju-core/+bug/1322705 still is not targeted.
<mup> Bug #1322705: juju help does not contain Joyent help information <joyent-provider> <ui> <juju-core:Triaged> <https://launchpad.net/bugs/1322705>
<arosales> I'll see if I can get a branch proposal out
<waigani_> thumper: standup?
<thumper> coming
<thumper> wallyworld: morning
<wallyworld> thumper: hey
<thumper> wallyworld: something to start your day :-) https://bugs.launchpad.net/juju-core/+bug/1361216
<mup> Bug #1361216: unit tests for all series and archs fail <ci> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1361216>
<wallyworld> sigh
<thumper> wallyworld: it isn't a sudden change
<thumper> wallyworld: it is JujuConnSuite, replica set, and cmd/juju tests
<thumper> three different issues
<wallyworld> yep
<wallyworld> i can't fix them all
<wallyworld> it needs a whole of team approach
<thumper> I think we could fix many by changing cmd/juju tests to use a mock
<thumper> or mocks
<wallyworld> you think?
<thumper> yes
<thumper> most of the cmd/juju tests do too much
<wallyworld> that was sarcasm
<thumper> they are end to end tests
<thumper> oh
<thumper> sorry
<thumper> hard to tell
<wallyworld> yeah
<wallyworld> i'm just pissed off our tests are so bad
<wallyworld> why oh why did anyone think it's a good idea to bring up the whole stack to run unit tests
<wallyworld> i guess the same reason why we interleave mongo throughout all of our business logic :-(
#juju-dev 2014-08-26
<wallyworld_> thumper: how long do the cmd/juju tests take for you? on my average laptop, they take about 160s. JujuConnSuite test set up is about 200ms for each
 * thumper runs
<thumper> ah...
<thumper> my state is somewhat broken right now
<wallyworld_> the bot seems way slower
<wallyworld_> i'm pretty sure we're using tmpfs
 * thumper scrolls way up
<thumper> ~170s
<wallyworld_> so that's well within the timeout limit
<thumper> we are on fairly high specced machines though
<wallyworld_> so yes, the tests are bad, but they should run
<davecheney> wallyworld_: on my laptop it was 400s
<wallyworld_> wow
<thumper> I have ssd
<wallyworld_> are you using tmpfs?
<davecheney> yes
<davecheney> core i5 thinkpad x220
<davecheney> there is a 3-4 second delay between each test run
<davecheney> i'm assuming that is setup/teardown
<wallyworld_> i don't see that
<wallyworld_> i see 200ms
<davecheney> don't look at the times
<davecheney> go test -gocheck.v
<wallyworld_> i did
<davecheney> well, your machine is the odd man out
<davecheney> it takes 400s on my machine
<davecheney> and 570+ in CI
<wallyworld_> thumper gets the same times as me
<davecheney> i only have 4 cores
<davecheney> just like CI
<wallyworld_> i have an 8 core i7
<wallyworld_> shouldn't be that much faster
<davecheney> yes, that is what I said
<wallyworld_> i'm not sure CI should be blocked on this - it's not a regression
<davecheney> i agree
<wallyworld_> i'll update the bug
<davecheney> thanks
<wallyworld_> maybe in the interim we can run CI on a larger instance type
<davecheney> maybe
<davecheney> the largest that ec2 offer is an 8 core
<davecheney> and in my testing
<davecheney> that is still slower than my 2 year old core i5
<davecheney> building gccgo as a test
<wallyworld_> davecheney: thumper: changing the instance type to "c3.2xlarge" (8 core, 16GB) seems to have helped the cmd/juju tests to pass. More data points needed, but -p 2 was faster than full parallelisation for this run. there was a spurious apiserver/client failure which caused -p 2 to run. interestingly, the -p 2 run was faster for many tests this time around
<wallyworld_> maybe we try running with -p 2 to start with for a bit
<perrito666> thumper: are you around?
<thumper> yeah...
<perrito666> thumper: don't sound so happy
<perrito666> :p
<thumper> I'm writing docs/bootstrapping.txt
<perrito666> thumper: my condolences
<thumper> after having to work it out for the umpteenth time...
<thumper> I thought I'd write it down
<perrito666> thumper: how savvy are you on menn0's "upgrade mode" ?
<thumper> ish
<thumper> whazzup?
<perrito666> thumper: I am trying to replicate the idea for restore and I would like some overview but apparently I missed menn0 this week
<perrito666> I worked out a decent part of it
<perrito666> ok guys brain shutting down, see you all or some of you tomorrow
<waigani_> cya perrito666
<waigani_> I'm getting a "state changing too quickly; try again soon" error
<davecheney> o_O
<waigani_> how soon? should I have a cup of tea?
<waigani_> I'm looping over a collection of state users and adding each one as an environ user via a new AddEnvironmentUser func - maybe I need to add them all as one transaction?
<waigani_> thumper: ^?
<thumper> waigani_: it is lying to you
<thumper> that is another error that returns when the assertion fails
<thumper> it tries a few times
<thumper> and then assumes that the assertion is due to someone else
<thumper> in dev, it is most likely you
<thumper> waigani_: did that help?
 * thumper goes to turn the coffee machine on
<waigani_> thumper: sorry I was afk
<axw> wallyworld__: I've got to fix up my bootstrap tools branch, it's incompatible with your changes to use /var/lib/juju in the containers
<axw> gah, not sure how I'm going to fix this...
<wallyworld__> axw: that's ok, i mounted that directory to avoid the need for the container to call out to http (if i am thinking of the right thing)
<wallyworld__> calling out to http on the host
<axw> wallyworld__: yes that is the one. in my branch, file:// is treated specially. it means read the file contents locally and then add to the cloud-init script
<wallyworld__> i did that change because people were seeing errors
<wallyworld__> this branch i think obsoletes the need for that
<wallyworld__> since the tools are copied into the container via ssh init
<axw> wallyworld__: only for bootstrap ,not for containers...
<wallyworld__> the containers will get from state server
<wallyworld__> we can handle retries at that point
<wallyworld__> it was only a short term quick fix
<axw> wallyworld__: different branch. I guess I can revert it temporarily... I'm checking if we can do better tho. we may be able to load the tools into the userdata for the local provider
<wallyworld__> could do. i just wanted a quick way to avoid the observed source of the errors
<wallyworld__> i knew it would be throwaway
<axw> wallyworld__: can you PTAL: https://github.com/juju/juju/pull/600/commits
<wallyworld__> sure
<axw> from the 5th commit on
<axw> wallyworld__: it looks like lxc doesn't have the same limit on userdata size, so if we need to we have the option of serialising the tools in there for all lxc containers
<wallyworld__> good to know
<wallyworld__> i think that would be useful
<wallyworld__> avoid networking calls back to the state server to get the tools
<wallyworld__> axw: supportedArchitecturesCount is just for testing?
<wallyworld__> ah nevermind
<wallyworld__> i thought it was in production code
 * thumper needs to work out how to poke inside of mongo
<thumper> also noticed some wonderful weirdness in our code
<thumper> the password hash of the admin-secret is used as the actual password for the admin user to mongo (which is "machine-0" btw)
 * thumper goes to make that coffee
<thumper> bugger, timer on the machine would have turned it off again by now
<thumper> ugh...
<thumper> spent the day teasing apart the layers of juju to work out where to put my change
<thumper> still not entirely sure... but closer
<stokachu> where does juju log its debugging output when attempting to bootstrap into openstack?
<stokachu> i ran with --debug but its just sitting at apt-get update http://paste.ubuntu.com/8146668/
<thumper> grr...
<thumper> stokachu: not sure sorry
<stokachu> no worries
 * thumper can't push because master is dirty 
<thumper> ericsnow: you need to fix your pre-push hooks
<davecheney> thumper: hang on
<davecheney> martin fixed that, twice
<davecheney> how did a change get past the bot
<thumper> wallyworld__: chat?
<thumper> davecheney: no idea
<wallyworld__> thumper: hiya, ok
<wallyworld__> onyx standup?
<thumper> https://plus.google.com/hangouts/_/g6ga27vzkwgy3dz4s7xly5sivia?authuser=1&hl=en
<wallyworld__> ok
<thumper> wallyworld__: https://github.com/juju/juju/pull/604
<thumper> davecheney: https://github.com/juju/juju/pull/605
 * davecheney looks
<davecheney> thumper: not lgtm
<davecheney> do not use NewXXTag in production code
<davecheney> unless you are 100% sure that the tag is valid
<thumper> davecheney: the line above it validates
<thumper> davecheney: how else should we do it?
<davecheney> names.ParseEnvironTag("environment-"+string)
<davecheney> if you're sure the tag is valid then LGTM
<davecheney> but this is a warning
<thumper> hmm...
<davecheney> honestly those NewXXTag functions shouldn't be in the package
<davecheney> they are a footgun
<thumper> davecheney: but they have to be created somewhere, right?
<davecheney> in most cases they come as strings on the wire
<davecheney> kind-id
<davecheney> so we use parse
<thumper> but something puts them on the wire
<davecheney> yup tag.String()
<thumper> but something creates the tag
<thumper> you have to have trust somewhere
<davecheney> no argument there
<davecheney> but you should look suspiciously at every case
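The distinction davecheney is drawing, an unchecked constructor versus a validating parser, can be sketched as follows. EnvironTag, newEnvironTag and parseEnvironTag here are simplified stand-ins written for illustration; they are not the names package's implementation, and the UUID pattern is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// EnvironTag is a simplified stand-in for an environment tag.
type EnvironTag struct{ uuid string }

// String renders the tag in the "kind-id" form that goes on the wire.
func (t EnvironTag) String() string { return "environment-" + t.uuid }

// validUUID is an assumed validity rule for this sketch.
var validUUID = regexp.MustCompile(
	`^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)

// newEnvironTag trusts its input completely: garbage in, garbage tag out.
func newEnvironTag(uuid string) EnvironTag { return EnvironTag{uuid: uuid} }

// parseEnvironTag checks both the kind prefix and the id before
// returning a tag, so callers must handle the error case.
func parseEnvironTag(s string) (EnvironTag, error) {
	const prefix = "environment-"
	if !strings.HasPrefix(s, prefix) {
		return EnvironTag{}, fmt.Errorf("%q is not an environment tag", s)
	}
	uuid := strings.TrimPrefix(s, prefix)
	if !validUUID.MatchString(uuid) {
		return EnvironTag{}, fmt.Errorf("%q is not a valid environment tag", s)
	}
	return EnvironTag{uuid: uuid}, nil
}

func main() {
	// The unchecked constructor happily wraps a bogus id.
	fmt.Println(newEnvironTag("bogus"))

	// The parser rejects the same input.
	_, err := parseEnvironTag("environment-bogus")
	fmt.Println(err)
}
```

Under this sketch, thumper's position amounts to calling newEnvironTag only where the id has already been validated on the line above.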
<thumper> the same method is used in state.Initialize
<thumper> we have a uuid
<thumper> then create an environment tag from it
<davecheney> sure
<davecheney> but you are arguing that using a dangerous weapon is ok because others have used it heaps of times in the past with nothing going wrong
<davecheney> past performance is no guarantee of future profit
<thumper> no, I am saying that this is one of the places where you use the dangerous weapon carefully
<davecheney> sure
<thumper> there are always places where we need to create tags with known data
<davecheney> yes
<davecheney> but I don't see any validation there
<davecheney> you just take what comes out of the jenv file
<thumper> no, this isn't the jenv
<davecheney> +ssInfo, err := st.StateServerInfo()
<thumper> right, st here is *state.State
<davecheney> ssInfo should have an envTag field or method
<davecheney> no, StateServerInfo is not a *state.State
<davecheney> it's some turd that got passed back from the api
<thumper> no, st is
<davecheney> ssInfo
<thumper> StateServerInfo is a method on *State
<davecheney> +st.environTag = names.NewEnvironTag(ssInfo.EnvUUID)
<davecheney> you have a LGTM with reservations
<davecheney> there is no value in grinding on this point
<thumper> right, here is some ickyness...
<thumper> which we can fix
<thumper> StateServerInfo is a POD structure
<thumper> Plain Old Data
<thumper> all public
<davecheney> yeah, this is the same POS that infests the state and the mongo packages
<davecheney> and binds them tightly to the _client_ api
 * thumper nods
<thumper> we should separate the serialization structure from the info structure
<davecheney> it's fine for it to be public fields
<davecheney> it's returned by value
<davecheney> we can't change the copy that state has
<thumper> but then adding a method that creates an environ tag from the uuid in the struct is meaningless
<thumper> as it gives you a false sense of security
<thumper> when there is none
<davecheney> obviously we'd remove the envUUID field
<thumper> nah... I just put it there
<davecheney> well shit
<thumper> this is all about our shitty data structures
<axw> wallyworld__: https://github.com/axw/juju/commit/b56a48d3bd760f9ab58ccada562dd663b1786a0d#commitcomment-7526685
<wallyworld__> rightio
<thumper> davecheney: let me ponder this env tag for a bit
<axw> can you please take a look at that and see if I'm making sense
<thumper> before I merge it
<thumper> I'd like to ensure that we do it right
<axw> wallyworld__: thanks. going to do some more testing before I land, and double check coverage
<wallyworld__> ok, it's got potential to break things does this branch
<wallyworld__> jam1: i got the bot "fixed" by throwing a larger instance at it - our tests are still horrible
<axw> wallyworld__: well it's pretty invasive so yeah... I have done some targeted testing with non-amd64 arch, will test the whole lot before attempting merge tho
<wallyworld__> ty
<dimitern> morning all
<jam1> wallyworld__:  :(
<axw> hazmat: in case this got lost in the noise of github activity: https://github.com/juju/juju/pull/596
<TheMue> morning
<jam> morning TheMue
<TheMue> jam: seen your mail, could you tell me a bit more about it?
<dimitern> morning jam, TheMue
<dimitern> jam, the meeting will start any minute now :)
<TheMue> dimitern: heya
<dimitern> jam, ping
<jam> dimitern: pong
<jam> sorry about the delay
<jam> dimitern: dang it
<jam> sorry, I missed it
<jam> I have to take the dog out now, will be back in 20 min or so
<dimitern> jam, no worries, i'll bring you up-to-speed at the standup
<jam> dimitern: speaking of which :)
<jam> TheMue: ^^
<dimitern> brt
<mattyw> dimitern, ping - when you have a moment
<dimitern> mattyw, a tentative pong (doing standup now)? :)
<hazmat> do actions use a hook context? (ie. long running)
<hazmat> bodie_, fwereade ^
<gsamfira> hazmat: hi. There does not seem to be any difference between Hooks and Actions aside from location (when it comes to running). https://github.com/juju/juju/blob/master/worker/uniter/context.go#L330
<hazmat> gsamfira, thanks
<gsamfira> glad I could help :)
<TheMue> eh, maybe I'm blind, but do we have a github.com/juju/juju/upstart?
<TheMue> cmd/jujud/machine.go imports it, but I cannot find it (neither can my compiler)
<TheMue> dimitern: you're around for a little crosscheck?
<TheMue> mattyw: ping
<mattyw> TheMue, hey there
<TheMue> mattyw: could you take a look please? it seems the repo has a problem
<alexisb> mgz, I am on the hangout whenever you are ready
<mattyw> TheMue, I certainly can but I might not be the best person for the job
<mgz> alexisb: thanks for the poke
<TheMue> mattyw: maybe I already found the checkin
<TheMue> so currently our master won't build, a package is missing
<dimitern> TheMue, now I'm here
<mattyw> TheMue, this one yeah? https://github.com/juju/juju/commit/1f7148c5e2ae9f68eb9f8b0c94f6c00b82ee4a18
<TheMue> dimitern: thx, but found it already.
<TheMue> mattyw: yeah, exactly
<dimitern> ok
<TheMue> dimitern: in jujud machine.go imports a non-existing package :(
<mattyw> TheMue, dimitern but it looks like the package isn't used either
<dimitern> TheMue, what?
<TheMue> mattyw: yep
<TheMue> mattyw: I'm only wondering how it passed the bot
<mattyw> TheMue, me too - who's the best person to ask about the bot?
<mattyw> TheMue, the tests run http://juju-ci.vapour.ws:8080/job/github-merge-juju/431/console
<mattyw> TheMue, but that error is in main - do tests on main get run?
<dimitern> mattyw, TheMue, is this about upstart?
<TheMue> mattyw: this seems to be the problem
<TheMue> dimitern: yep
<dimitern> TheMue, mattyw, it seems juju/upstart moved into juju/service/upstart
<dimitern> and perhaps wallyworld had goimports installed and juju/upstart code was in GOPATH before juju/service/upstart, and probably the same happened on the bot
<TheMue> dimitern: hmm, could be the reason
<dimitern> it happened in https://github.com/juju/juju/commit/190f98fcab118b5dce269e8c0021a563455fee39#diff-88ad1ca7d18fe89a76f6348caf6ddd42
<mattyw> dimitern, makes sense
<mattyw> dimitern, TheMue anyidea how we can stop this from happening next time?
<dimitern> mattyw, TheMue, it just happened so that code importing the old path was merged last
<dimitern> https://github.com/juju/juju/commit/880aaa83f1a474ef7856f1237c3781ab6a51dbfe
<TheMue> also upgrades isn't used in machine.go
<dimitern> mattyw, I'm not sure if that's the case, but if it is, then we should look into the bot and see how it does fetch dependencies, etc.
<mattyw> dimitern, where is the code for the bot?
<dimitern> it might be that ian added the import line manually, rather than using goimports
<dimitern> mattyw, mgz would know that I guess
<mgz> which bot bit?
<mgz> mattyw: you want to look at the make-release-tarball script in lp:juju-release-tools
<mattyw> mgz, ok thanks TheMue ^^
<TheMue> yep
<mgz> what was the symptom exactly? I'm a little confused from the log
<mgz> we had broken import that got past the landing?
<mgz> or didn't get past the landing but did get past the build?
<TheMue> mgz: cmd/jujud/machine.go imports packages it doesn't use and that don't exist
<mgz> I see, on trunk currently.
<dimitern> we should file a critical ci blocker bug for that
<dimitern> so nothing lands until it gets fixed
<mgz> and blame is on the last rev of master? or an earlier one?
<dimitern> mgz, last rev
<mattyw> tasdomas, dimitern in other news this is ready for more reviews when you have a moment https://github.com/juju/juju/pull/562
<dimitern> mattyw, will have a look shortly
<mgz> oh, I see
<mgz> go fmt passes...
<mgz> and the go build line has gone
<mgz> dimitern: my suggestion, I land a backout of the last rev
<mgz> add `go build ./...` back to the tarball script
<dimitern> mgz, there seems to be another issue
<mgz> reland the earlier rev and see that it borks?
<dimitern> mgz, ../../state/backups/metadata/metadata.go:10:2: cannot find package "github.com/juju/utils/filestorage" in any of:
<dimitern> 	/usr/lib/go/src/pkg/github.com/juju/utils/filestorage (from $GOROOT)
<dimitern> and it's not in dependencies.tsv as well
<mgz> also the same rev?
<mgz> if so, covered by the backout
<dimitern> mgz, let me check
<mgz> seems not..
<mgz> probably eric's change?
<dimitern> mgz, yes, on trunk
<dimitern> mgz, but it's not the same rev I think
<mgz> yeah, looks like that's ericsnowcurrently backups-storage
<dimitern> mgz, yep https://github.com/juju/juju/commit/f4da7f542947abb798da7da730a5482a029eee44
<mgz> so, two borked landings from the build line going... now, why was that removed...
<dimitern> mgz, so we're not even trying if it builds ? lol..
<mgz> well, not at the tarball stage
<dimitern> mgz, iirc there was a unit test for deps.tsv..
<mgz> we do before running the tests, and that's working for some reason
<dimitern> mgz, it takes like 10 secs - we should do it before running tests, not as late as tarball packaging time i think
<mgz> tar
<dimitern> ah, I see
<mgz> sorry,
<mgz> tarball build comes before tests
<dimitern> mgz, hmm.. I wonder why that is
<mgz> we get deps and make tarball on the main juju machine
<mgz> then send the tarball to a new instance to run the tests
<mgz> so, the bot *should* still be failing before we run tests, but from the logs it's not for some reason
<mgz> can see the line `go build github.com/juju/juju/...` in the landing console log, and it's not got the error
<mgz> a little worried it's not actually testing the right juju at present
<mgz> hm,
<mgz> I'm tempted to blame a godeps change
<mgz> nothing on the ci side has changed
<mgz> heh
<mgz> okay, got it
<dimitern> mgz, yep? what is it?
<TheMue> ah?
<mgz> for some reason, fixDetachedHead from cmd/go/vcs.go is now getting called, when it wasn't before
<mgz> and that does checkout master... overwriting the merge
<dimitern> yay! :D
<mgz> not sure *what* has triggered this, but can fix at least
<dimitern> lots of fun
<dimitern> godeps perhaps
<mgz> lets hope it was recent
<TheMue> strange kind of error
<dimitern> it depends on how it gets missing revisions from git - if it does not use fetch but pull it can happen
<mgz> because we've only been testing the current head, rather than the pending merge, for the last few landings at least
<TheMue> I've seen it first when testing the dummy provider. here I got it in github.com/juju/juju/environs/jujutest/livetests.go:124: build command 'go' failed ...
<dimitern> nope, scratch that - godeps uses git fetch, at least in lp:godeps trunk
<mgz> for now, I want to just back those two changes out
<mgz> and eat lunch...
<alexisb> jcw4, bodie_ we are on the hangout when you guys are ready
<jcw4> woo hoo
<jcw4> TheMue: #jujuskunkworks
<perrito666> hello everyone
<mgz> hey!
<mgz> anyone: pr #609
<mgz> gd
<mgz> urk
<mgz> gsamfira, perrito666: ^
<perrito666> mgz: on which repo?
<mgz> juju/juju
 * perrito666 receives confirmation from msdn of subscription... I wonder when did I subscribe
<perrito666> it was at least one month ago
<bodie_> so what's the deal with upstart and how far back do we have to roll back to get it to build?
<gsamfira> well, 2 options. there is one commit that was ported forward, and if we remove that one, it will build
<gsamfira> or, we can do a PR, that removes the extra imports and calls agentConfig.Tag().String() instead of agentConfig.Tag() in a couple of places
<gsamfira> and it will also build
<gsamfira> but I have not investigated the issues related to the second PR that is being reverted by https://github.com/juju/juju/pull/609
<gsamfira> mgz might have more info on that
<perrito666> mgz: btw, can https://bugs.launchpad.net/juju-core/+bug/1361721 be reproduced in something other than utopic?
<mup> Bug #1361721: MachineSuite.TestDyingMachine failing frequently <juju-core:Triaged> <https://launchpad.net/bugs/1361721>
<mgz> perrito666: that's the only job it's on I think, but it's been a dodgy testfor a while
<mgz> bodie_: I'm not sure, which upstart what?
<bodie_> the failing build on the latest master
<gsamfira> bodie_ : the upstart package was moved to the service package quite a while ago.
<jcw4> bodie_: mgz has a revert in the pipeline now
<mgz> bodie_: my pr should fix the failing build
<bodie_> ah, great
<gsamfira> if there is no other issue with the commits that are being reverted, the change to get it to build without reverting is about 4 lines. I am running the tests now. Should I let them finish and see if that fixes it?
<mgz> gsamfira: I want to just revert, because the tests were never run on those changes
<mgz> then fix the landing before putting in new code
<gsamfira> fair enough. As you wish. I am running the tests on that code now, with the fix. If you prefer to revert, its fine with me :). I was just offering an alternative that would be shorter
<ericsnow> perrito666: how's your morning go?
<perrito666> ericsnow: wonderful
<ericsnow> perrito666: glad to hear it
<perrito666> ericsnow: btw, one of your PRs has just been reverted, please contact mgz for more info
<ericsnow> perrito666: yeah, I saw :(
<ericsnow> mgz: how exactly was my patch failing?
<ericsnow> mgz: I'm guessing it's related to updating dependencies.tsv
<mgz> 16:02 < dimitern> mgz, ../../state/backups/metadata/metadata.go:10:2: cannot find package "github.com/juju/utils/filestorage" in any of:
<mgz> 16:02 < dimitern>     /usr/lib/go/src/pkg/github.com/juju/utils/filestorage (from $GOROOT)
<mgz> 16:02 < dimitern> and it's not in dependencies.tsv as well
<ericsnow> mgz: weird
<mgz> worth trying the merge again and building locally, see if it's actually okay
<ericsnow> mgz: github.com/juju/utils/filestorage has existed for some time and it's in the revision listed in dependencies.tsv
<ericsnow> mgz: at least as long as that revision didn't get reverted too
<perrito666> all: I just sent an email to juju-dev in the thread "getting rid of all-machines.log", your opinion will be greatly appreciated
<perrito666> ok good news is: I don't need utopic to fix 1.20 tests
<mgz> ace
<perrito666> sweet, 8G of ram really did the trick for test running
<perrito666> why cant I see builds before #545 for http://juju-ci.vapour.ws:8080/job/run-unit-tests-utopic-amd64/ ?
<perrito666> abentley: mgz jog anyone can tell since when has 1.20 -> http://juju-ci.vapour.ws:8080/job/run-unit-tests-utopic-amd64/ been broken? I know it failed for the last revision, but do we know if this is indeed something new
<perrito666> ?
<perrito666> sorry, shift too close to enter
<abentley> perrito666: The last revision that passed was eba6e37f
<perrito666> abentley: thanks a lot
<abentley> perrito666: r6dc9a588 was tested and failed, but I need to check to see whether it was the same failure mode.
<abentley> perrito666: Silly me, that's the candidate.
<perrito666> weird, gitk says r6dc9a588 is not part of 1.20
<abentley> perrito666: I'm sorry, the last to pass was eba6e37f.  The way jenkins displays this is confusing.
<perrito666> abentley: I know, dont worry, confuses me each time
<perrito666> mm, changes from eba6e37f to head of 1.20 contain a patch mgz just reverted from master
<wwitzel3> ping ericsnow, perrito666
<ericsnow> wwitzel3: hey
<wwitzel3> ericsnow, perrito666: got time for a quick meeting / standup?
<ericsnow> wwitzel3: sure
<wwitzel3> ok, going to moonstone
<ericsnow> wwitzel3: sorry, thought I had joined!
<perrito666> sorry was in another window, you guys still there?
<ericsnow> perrito666: nope
<ericsnow> perrito666: we didn't talk for long
<ericsnow> perrito666: just a quick recap for Wayne
<perrito666> well not much from me either I am fixing a bug in 1.20
<thumper> morning
<perrito666> thumper: morning
<perrito666> mgz: still here?
<mgz> perrito666: yup
<perrito666> mgz: ah nevermind I was just curious if git revert worked for you
<mgz> perrito666: it does, but is a little finickity
<perrito666> mgz: I tried git revert -m 1 hash
<perrito666> and got all kind of conflicts
<perrito666> I really expected it to be slightly smarter
<mgz> are you sure the -m was right?
<perrito666> I ... I am not sure, I guess it was not given the result, I do not feel the explanation for what -m does was meant to be understood
<perrito666> would anyone please https://github.com/juju/juju/pull/610
<perrito666> it fixes https://bugs.launchpad.net/juju-core/+bug/1361721
<mup> Bug #1361721: MachineSuite.TestDyingMachine failing frequently <juju-core:Triaged by hduran-8> <https://launchpad.net/bugs/1361721>
<mgz> well if it reverted the right stuff, even with conflict pain, presumably
<perrito666> mgz: I end up doing it by hand way easier
<perrito666> thumper: cmars you are the ocrs
<thumper> perrito666: no, that was yesterday :)
<perrito666> mgz: if you append .diff to the pr in ghub it will yield the patch in plain text
<perrito666> thumper: ah true, it is still today for me lol
<thumper> funny, it is still today for me too
<wallyworld_> perrito666: why does my branch break that test? the tests pass if run with reduced parallelism so it's likely coincidental that that commit is blamed
<perrito666> wallyworld_: well it is consistent when I run them and I believe mgz has the same results
<wallyworld_> they pass for me locally
<wallyworld_> and the bot or else it wouldn't have landed
<perrito666> wallyworld_: mm, strange, they fail here and in CI
<perrito666> wallyworld_: http://juju-ci.vapour.ws:8080/job/run-unit-tests-utopic-amd64/
<wallyworld_> they will likely pass if the tests are run with reduced parallelism
<wallyworld_> we have a number of tests that fail without -p 2
<wallyworld_> that test has also failed intermittently previously
 * perrito666 does
<wallyworld_> my branch changes machine agent startup to write the tools version earlier in the startup, so it is hard to see how that could affect that test
<wallyworld_> once machine agent is up and running, there's no difference
<wallyworld_> since it only fails on utopic, it is very likely to be a timing issue, which is an issue many tests have sadly
<wallyworld_> since we tend to use timeouts all over the place rather than channels and signals to coordinate
<perrito666> wallyworld_: I can reproduce it with trusty
<wallyworld_> with -p 2?
<perrito666> wallyworld_: I am running with p2
<perrito666> lets get coffee while we wait :p
<wallyworld_> i have had that test fail even before my branch landed
<mgz> wallyworld_: did you see the revert on trunk?
<mgz> I still havent fully resolved what changed to make the bot pass borked merges, but will have it fixed
<wallyworld_> mgz: haven't seen that revert yet, let me look
<wallyworld_> mgz: how did the backup pr break the tests? looks very self contained?
<mgz> wallyworld_: it may have actually been okay, but dimitern flagged it as dodgy as well
<mgz> the dep borked for him
<wallyworld_> the utils dep? how did it bork?
<mgz> lack of filestorage package
<mgz> may have just been a mistake, I told eric to reland if it built for him locally
<mgz> I just wanted to back out all suspect changes as the bot had not in fact been testing them
<perrito666> wallyworld_: tests fail with go test -test.parallel=2 github.com/juju/juju/...
<wallyworld_> perrito666: same test?
<arosales> wallyworld_, mgz abentley: added a comment to https://bugs.launchpad.net/juju-core/+bug/1361721
<mup> Bug #1361721: MachineSuite.TestDyingMachine failing frequently <juju-core:Triaged by hduran-8> <https://launchpad.net/bugs/1361721>
<perrito666> wallyworld_: exact same test
<perrito666> golang-go                                             2:1.2.1-2ubuntu1
<perrito666> wallyworld_: I can run any other sort of test for you if you want
<wallyworld_> arosales: i've only just SOD, but will look into the test and try and see what the issue is, and we'll get 1.20.6 out today
<wallyworld_> perrito666: thanks, i need to look at the test to see where it's failing
<perrito666> wallyworld_: exact same output as in jenkins
<wallyworld_> perrito666: and yet the test passed the bot
<mgz> wallyworld_: you mean, you don't love us all bugging you before breakfast? :)
<perrito666> wallyworld_: I recall mgz saying earlier that the bot was letting things pass
<wallyworld_> we have so many tests that fail due to subtle changes in timing due to different instance types etc
<wallyworld_> mgz: before breakfast is ok, not before coffee :-(
<perrito666> wallyworld_: let me run one more test and I might be able to give you more info
<wallyworld_> ok, thanks
<wallyworld> perrito666: the test passes for me - can you try running it in isolation?
<wallyworld> i'm running on an SSD, many of our tests pass more often with fast i/o
<perrito666> wallyworld: I am running in an ssd too
<perrito666> everything in this machine is ssdish
<perrito666> wallyworld: what do you mean in isolation
<wallyworld> go test -gocheck.f TestDyingMachine
<wallyworld> cd to the cmd/jujud package
<wallyworld> and just run that one test
<perrito666> wallyworld: running
<perrito666> it passes
<wallyworld> yup, so it's just another case of our tests being stupid :-(
<perrito666> wallyworld: well the tests were written by us so...
<perrito666> :(
<wallyworld> perrito666: agreed. there's sooo much that needs fixing
<perrito666> wallyworld: well I made a few tries and definitely I am not able to figure out why your patch triggers this failure
<wallyworld> perrito666: my patch doesn't - this test has failed several times in the past before my patch
<perrito666> wallyworld: let me rephrase
<wallyworld> i knew what you meant, sorry :-)
<perrito666> wallyworld: I believe that your patch somehow triggers our underlying test error, yet I cannot figure out why on the universe
 * perrito666 reruns
<wallyworld> and sadly, it seems to happy just on utopic on CI, on trusty elsewhere, and not for me at all
<wallyworld> s/happy/happen
<perrito666> that speaks so bad about the affected piece of code consistency
<wallyworld> perrito666: if you can get it to fail, maybe try increasing the poll timeout to see if that makes a difference, just to see if the agent will eventually die or is hung
<perrito666> wallyworld: I believe I tried
<perrito666> wallyworld: I find somehow interesting the various log entries that state that Open has been called without addresses
 * perrito666 swims in seas full of red herrings
<wallyworld> s/swims/drowns
 * perrito666 does as always in case of fish and starts the barbecue grill
<perrito666> wallyworld: I found it
<perrito666> MachineSuite is not properly isolated
<wallyworld> that and several other test suites :-(
<wallyworld> what particular issue did you find?
<perrito666> wallyworld: by removing the tests you added to machine_test the whole suite runs
<wallyworld> it works for me even with those tests
<perrito666> wallyworld: well I guess Ill have to be the one to find the bug then
<wallyworld> i'm looking into why the agent is not stopping - well it is stopping according to the logs, but the test doesn't see it
<wallyworld> trouble is there's not enough logging
<wallyworld> in the machine agent Run() method
<perrito666> wallyworld: what is primeAgent?
<wallyworld> that creates a machine and tools and sets up a machine agent
<perrito666> I have a hunch that at some point a machine is being shared
<wallyworld> perrito666: maybe, but the logs show the agent dying in response to the machine being marked as dead. it's just that the test doesn't find that out. and for some reason, the agent tries to start again
<wallyworld> perrito666: would be interesting, if you can get it to fail, to add logging around these lines at the end of machine agent's Run() method
<wallyworld> 	if err == worker.ErrTerminateAgent {
<wallyworld> 		err = a.uninstallAgent(agentConfig)
<wallyworld> 	}
<wallyworld> 	err = agentDone(err)
<wallyworld> 	a.tomb.Kill(err)
<wallyworld> 	return err
<perrito666> wallyworld: ok, going
<wallyworld> perrito666: so it looks like the logic in one of the runners is not detecting that the worker is dying, and is attempting to restart everything
<wallyworld> the agent itself correctly notices that the machine is dead, which is what the test is testing for, but the worker doesn't allow the agent to exit
<perrito666> so this actually is a bug
<wallyworld> well, it's a bug somewhere because the test fails when it shouldn't. but not sure where exactly
<wallyworld> i'm guessing it's in the worker/runner infrastructure
<thumper> davecheney: https://github.com/juju/juju/pull/605/files
<wallyworld> func (runner *runner) run() error {   <-- this function in worker/runner.go is noticing that the runner has been stopped but then attempts to restart because it doesn't know that it's deliberate
<davecheney> thumper: looking
<thumper> davecheney: ta
<thumper> waigani: https://github.com/juju/juju/pull/519 has a merge conflict with master
<perrito666> wallyworld: whoa
<perrito666> wallyworld: I am running it with some more logging
<waigani> thumper: thanks, looking
<wallyworld> perrito666: it appears the logs are missing the 'killing "api"' line which means that the api worker is not being killed like it should
<wallyworld> that's the worker that is then being erroneously restarted
<perrito666> wallyworld: good catch
<perrito666> looking
<wallyworld> perrito666: func killWorker(id string, info *workerInfo) {   <---- if this is not called, then info.start is not set to nil, so when the worker terminates, it will just be restarted again
<wallyworld> which is not what we want
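The restart rule wallyworld is describing can be reduced to a small sketch. workerInfo, killWorker and onWorkerDone below are simplified stand-ins written for illustration (the real killWorker in worker/runner.go takes an id as well, and onWorkerDone is a hypothetical name); the point is only that a nil start function is what tells the runner a stop was deliberate.

```go
package main

import "fmt"

// workerInfo is a simplified stand-in for the runner's per-worker record.
type workerInfo struct {
	start    func() error // nil means "do not restart this worker"
	restarts int
}

// killWorker marks the stop as deliberate by clearing start.
func killWorker(info *workerInfo) {
	info.start = nil
}

// onWorkerDone mimics what the runner loop does when a worker exits:
// it restarts the worker unless start has been cleared. It reports
// whether a restart was scheduled.
func onWorkerDone(info *workerInfo) bool {
	if info.start == nil {
		return false
	}
	info.restarts++
	return true
}

func main() {
	// A worker that exits without killWorker being called is restarted.
	crashed := &workerInfo{start: func() error { return nil }}
	fmt.Println("restart after unexpected exit:", onWorkerDone(crashed))

	// A worker that was killed first stays down.
	killed := &workerInfo{start: func() error { return nil }}
	killWorker(killed)
	fmt.Println("restart after kill:", onWorkerDone(killed))
}
```

In the failing run, the "api" worker follows the first path: it exits without ever being marked as killed, so the runner treats the exit as a crash and starts it again.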
<perrito666> wallyworld: I am quite close to call you
<wallyworld> i introduced a deliberate error into the test and compared the logs - it seems when it fails, the "deployer" worker is not killed as it should be
<wallyworld> hmmm, but that's because the deployer is not started
<thumper> davecheney: still happy with that? If so, I'll merge it (when landing unblocked)
<davecheney> thumper: lgtm
<davecheney> minor gripes
<davecheney> but lgtm
<thumper> what are the gripes?
 * thumper looks at that test
<davecheney> thumper: that's my only comment
<davecheney> everything else looks good
<perrito666> thumper: well in spanish is the plural for the flu
<davecheney> perrito666: lol
<perrito666> thumper: also your last name pronounced as read in spanish means penis :p </end of trivia>
<thumper> perrito666: yea, back to high school
<perrito666> thumper: actually It triggered a very weird look from my wife when I told her your name when I was chatting with you the other night
<perrito666> and then I realized
#juju-dev 2014-08-27
<thumper> davecheney: https://github.com/howbazaar/juju/commit/71719716c26571c98bdf6ce3119f0a8c5450cb59
 * thumper goes for a run
<davecheney> thumper: very nice
<davecheney> LGTM
<perrito666> wallyworld: https://pastebin.canonical.com/115910/
<davecheney> perrito666: that's a paddlin' for using that pastebin
<davecheney> now I have to get my ubikey to find out what you wrote
<perrito666> sorry chrome completed that one first
<bigjools> thumper's name in Spanish means penis?  Why have I not heard this revelation before.
<perrito666> bigjools: not really, his last name read in spanish, which is basically read all the letters
<bigjools> ah I see
<bigjools> still funny
<perrito666> bigjools: our phonetics make many english words fun
<bigjools> yeah it works in reverse too :)
<perrito666> it usually makes a big laugh in my work day by just reading the irc channel with my brain set to spanish
<bigjools> heh
<perrito666> mmpf the flakiness of this test really annoys me
<perrito666> wallyworld: http://pastebin.ubuntu.com/8154484/ second pass failed on the testDying test
<perrito666> third pass seems to have gone flawless so far and it is beyond failure point
<wallyworld> perrito666: the fix definitely is needed, so we can land that and cross our fingers
<perrito666> wallyworld: mm I just got a very interesting failure
<perrito666> wallyworld:  http://pastebin.ubuntu.com/8154509/
<wallyworld> i've  seen those before too
<wallyworld> more test badness that happens every so often
<perrito666> ok but regardless, the Dying test passed
<perrito666> 4th run
<wallyworld> good so far, i'll keep looking in case there's more than can be done
<perrito666> man I have seen thrillers with less suspense than this test run
<perrito666> wallyworld: I added a card for this issue, please make sure to move it to done when you finish later today
<wallyworld> perrito666: that wait we added is not enough - the start up does not account for the fact that the machine may be marked as dead after that. i will need to add some more code
<wallyworld> ok
<perrito666> wallyworld: we could use the channel for more than just a flag
<perrito666> wallyworld: ok the last pass re-triggered the other error but the dying error is no longer triggered
<perrito666> wallyworld: I do not believe there is a connection so it should be safe to add the wait as we did
<wallyworld> it's a bit more complicated - worker startup can fail because a machine is dead, and return an arbitrary error. this case does not trigger the agent to die. we only check for dead machines right at the start and not thereafter
<wallyworld> i need to add some logic to account for this
<perrito666> well my brain just turned into jelly so I will be out of here, I will reassign this to you, I will add a comment with what you just said, ok?
<perrito666> wallyworld: ?
<wallyworld> perrito666: i'll update the bug. thanks for your help
<wallyworld> have a good evening
<perrito666> wallyworld: tx
<perrito666> here have the paste to save you some work
<perrito666> The wait we added is not enough (using the provided channel) - the start up does not account for the fact that the machine may be marked as dead after that and return an arbitrary error. this case does not trigger the agent to die. we only check for dead machines right at the start and not thereafter.
<perrito666> Logic to account for this will be added by Ian who just took over.
<davecheney> anyone seen this before ?
<davecheney> http://paste.ubuntu.com/8154661/
<davecheney> ignore
<davecheney> i have local changes
<davecheney> and I broke some shit
<davecheney> question
<davecheney> when was juju ssh changed to always ssh via the bootstrap node ?
<axw_> late last year
<davecheney> shows how much I know
<davecheney> for all providers ?
<axw> oh sorry
<axw> I misread
<axw> can't remember, I think early this year though
<axw> yes, all providers (except local)
<davecheney> ok
<axw> it can be disabled
<davecheney> nah, it's ok
<davecheney> just hit a strange error message when sshing to a unit
<davecheney> talking about an internal address
<davecheney> perrito666: i replied to your email
<davecheney> i think you need to do more to make it clear that action is required from others, and make those others aware that the ball is in their court
<perrito666> thank you very much for taking the time davecheney
<wallyworld> perrito666: i got a fix up. i experimented with a few approaches, but what's there seems to work https://github.com/juju/juju/pull/612
<wallyworld> axw: changes pushed
<axw> wallyworld: looking
<wallyworld> thanks axw
<thumper-otp> wallyworld_: bugger... *os.PathError = &os.PathError{Op:"write", Path:"/mnt/tmp/gocheck-2703387474910584091/5/ssh", Err:0x1c} ("write /mnt/tmp/gocheck-2703387474910584091/5/ssh: no space left on device")
<thumper-otp> tests failed
<thumper-otp> no disk left
<wallyworld_> thumper: rerun
<thumper> wallyworld_: is it a dedicated instance?
<wallyworld_> i have seen that once before
<wallyworld_> it is an instance started new each time
<wallyworld_> not sure why it happens
<thumper> hmm... ok
<ericsnow> FYI, there's a chance I will have a reviewboard site up for demo tomorrow, hosted in the CI environment :)
<wallyworld_> ericsnow: you legend
<wallyworld__> axw: could you please take another look at this PR. i had to propagate the ErrDead back to the caller which is something i was trying to get away with not doing https://github.com/juju/juju/pull/612
<wallyworld__> but alas it is needed
<axw> sure, looking
<axw> wallyworld__: bleh, would be nice if we had an IsTerminateAgent predicate that knew about errgo
<wallyworld__> yeah. one step at a time :-)
<wallyworld__> i just want to get this little fucker landed so we can release 1.20.6
<axw> wallyworld__: given that it's "not found or dead", perhaps just use the existing CodeNotFound?
<wallyworld__> axw: thought about it, but i think Dead is a distinct error. i think the message in ErrDead() is broken tbh
<wallyworld__> i'm 50/50
<axw> wallyworld__: what is the error code returned without new mapping?
<wallyworld__> ""
<axw> heh, ok
<wallyworld__> as the error is not recognised
<axw> wallyworld__: lgtm
<wallyworld__> ty
<wallyworld__> i hope it works this time
<davecheney> wow, the tests take SOOOOOOOOOOOOO long to run
 * thumper sighs
<thumper> axw: where are the bootstrap logs written?
<thumper> axw: when bootstrapping?
<axw> thumper: cloud-init-output.log
<axw> /var/log/cloud-init-output.log
<axw> $root-dir/cloud-init-output.log for local
<thumper> ah, local is what I'm looking for
<thumper> axw: I don't suppose you have a handy bunch of tricks to poke around the inside of the mongo db?
<axw> thumper: https://github.com/kapilt/juju-dbinspect
<axw> other than that, fraid not
<thumper> nah, want a bit lower level than that
<thumper> thanks anyway
 * thumper goes to google
<davecheney> thumper: /usr/bin/hd
<thumper> hd?
 * axw prefers the oscilloscope
<thumper> screw you guys
 * thumper goes back to googl
<davecheney> thumper: echo "1" > /dev/tcp/localhost/17017
<axw> davecheney: I think I found why the cmd/juju tests slowed down so much
<axw> I made some changes to bootstrap to generate the system-identity sooner. we just need to pregenerate that for tests
<davecheney> axw: cool
<davecheney> what is the system-identity ?
<axw> davecheney: a private key owned by state servers for logging into other machines
<axw> davecheney: I might just move it into server side, shouldn't really be there I think. Anyhow, I'm looking into it
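[editor's sketch] One way to "pregenerate" an expensive key for tests is to create it once and hand the same value to every caller. The snippet below is only an illustration of that idea: `testIdentity` is a hypothetical helper, and the assumption that the identity is a 2048-bit RSA key is mine, not something stated in channel.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"fmt"
	"sync"
)

var (
	identityOnce sync.Once
	identityKey  *rsa.PrivateKey
	identityErr  error
)

// testIdentity returns a single shared private key. The expensive
// generation step runs at most once per process, however many test
// fixtures ask for it.
func testIdentity() (*rsa.PrivateKey, error) {
	identityOnce.Do(func() {
		identityKey, identityErr = rsa.GenerateKey(rand.Reader, 2048)
	})
	return identityKey, identityErr
}

func main() {
	first, err := testIdentity()
	if err != nil {
		fmt.Println("key generation failed:", err)
		return
	}
	second, _ := testIdentity()
	fmt.Println("same key reused:", first == second)
}
```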
<davecheney> right
<davecheney> i see
<davecheney> sheesh
<wallyworld__> davecheney: thanks for the syslog info - i had forgotten the reason
<davecheney> wallyworld__: syslog is a bit better in Go 1.3, but
<davecheney> a. we've standardized on 1.2
<davecheney> b. everyone has lost confidence in that package
<davecheney> axw: related LP 1356806
<davecheney> mup: ping
<mup> davecheney: I apologize, but I'm pretty strict about only responding to known commands.
<davecheney> mup: 1356806
<mup> davecheney: In-com-pre-hen-si-ble-ness.
<davecheney> #1356806
<mup> Bug #1356806: cmd/juju: juju takes 3 seconds to do nothing <juju-core:Triaged> <https://launchpad.net/bugs/1356806>
<axw> davecheney: thanks
<davecheney> axw: how did you find the problem
<axw> davecheney: ah that's a bit different
<axw> davecheney: go test -gocheck.vv
<davecheney> what does that show over -v ?
<axw> gives me timing for SetUpTest
<davecheney> ahh
<davecheney> then you were like "wtf is taking 3 seconds" ...
<davecheney> axw: can you please log an issue for this bug
<davecheney> i have a feeling that this will show up on arm and ppc64
<davecheney> until fixed
<davecheney> as they don't have a fast path for big.Rat
<axw> davecheney: will do
<thumper> wallyworld__: what is the magic to turn off apt update/upgrade with local?
<thumper> wallyworld__: actually more fucked up than that...
<thumper> wallyworld__: we are issuing apt-get update/upgrade  on the host machine for the local provider
<thumper> wallyworld__: we really shouldn't be...
<davecheney> thumper: search your email for a message from katco today
<thumper> ah... nada
<wallyworld__> thumper: the defaults should have stayed the same
<thumper> wallyworld__: they didn't
<davecheney> thumper: https://bugs.launchpad.net/bugs/1350493
<mup> Bug #1350493: 1.20.x local provider not running apt-get update <charms> <regression> <juju-core:Fix Committed by cox-katherine-e> <https://launchpad.net/bugs/1350493>
<davecheney> bottom of that
<thumper> davecheney: ta... however the more challenging issue is the apt-get update/upgrade on the host of the local
<wallyworld__> davecheney: right, the intent was to keep things how they used to be by default
<thumper> i.e. my machine
<wallyworld__> thumper: it was supposed to only run any apt commands inside the container once created. so seems like there's a bug there if that's not the case. should be an easy fix
<thumper> wallyworld__: I haven't tested a container
<wallyworld__> ie yes, it should not be touching host machine
<thumper> just bootstrapping local
 * thumper nods
<thumper> should be easy enough
<wallyworld__> yep, will be fixed overnight
<thumper> davecheney: any idea what user.Current().Username looks like on windows?
<davecheney> thumper: LOCALDOMAIN/admin
<thumper> davecheney: particularly wondering about spaces and capitals
<davecheney> ^ total guess
<davecheney> yes, it can have both
<thumper> bugger
<davecheney> oh boy
<davecheney> just read os/user/lookup_windows.go
 * thumper doesn't have the source
<davecheney> assume the worse
<davecheney> worst
<thumper> well, for me that means unicode
<thumper> and various bits of punctuation
<davecheney> yup, all that is allowed
<davecheney> _In_   PLSA_UNICODE_STRING Names,
<davecheney> trying to find the msdn docs
<davecheney> http://msdn.microsoft.com/en-us/library/windows/desktop/ms721799(v=vs.85).aspx
<davecheney> thumper: what were you planning on using that string for ?
<thumper> davecheney: the default admin user name, but it has to be squeezed through names.IsValidUser
<davecheney> no chance
<davecheney> sorry
<davecheney> for a start
<thumper> ok, so here is the plan :-)
<davecheney> windows calls it
<davecheney> Administrator
<davecheney> with a capital A
<thumper> grab the username
<thumper> if there is a slash, take the last part
<thumper> then lower case it
<thumper> replace spaces with dashes
<thumper> then validate check
<thumper> if invalid -> "admin"
<thumper> otherwise use it
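[editor's sketch] thumper's plan above could look roughly like the following. This is an illustration only: `validUser` is a made-up stand-in for names.IsValidUser (juju's real rule may differ), `defaultAdminName` is a hypothetical name, and treating both `/` and `\` as separators is an assumption, since the Windows format was only guessed at in channel.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// validUser is a hypothetical approximation of a user-name check;
// it is NOT the rule implemented by names.IsValidUser.
var validUser = regexp.MustCompile(`^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$`)

// defaultAdminName derives an admin user name from an OS user name and
// falls back to "admin" when the result does not validate.
func defaultAdminName(osUser string) string {
	name := osUser
	// If there is a slash (or backslash), take the last part.
	if i := strings.LastIndexAny(name, `/\`); i >= 0 {
		name = name[i+1:]
	}
	// Lower-case it and replace spaces with dashes.
	name = strings.ToLower(name)
	name = strings.Replace(name, " ", "-", -1)
	// Validate; anything that fails becomes "admin".
	if !validUser.MatchString(name) {
		return "admin"
	}
	return name
}

func main() {
	for _, u := range []string{`LOCALDOMAIN\Administrator`, "Bob Smith", "日本語"} {
		fmt.Printf("%q -> %q\n", u, defaultAdminName(u))
	}
}
```

Lower-casing is lossy: two OS names that differ only in case end up with the same derived name.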
<davecheney> windows usernames are case sensitive
<davecheney> Admin != admin
<davecheney> unlikely
<davecheney> but if you want to do it right
<axw> davecheney: ok  	github.com/juju/juju/cmd/juju	155.808s
<axw> down from 400-odd
 * axw proposes
<dimitern> axw, \o/
<davecheney> axw: nice one
<davecheney> axw: this is anything that embeds jujuconnsuite isn't it ?
<axw> davecheney: anything that bootstraps, so yep
<davecheney> son of a bitch
<davecheney> axw: nice one, this one should merge like a greased pig
<wallyworld__> oink
<davecheney> axw: ok  	github.com/juju/juju/cmd/juju	549.275s
<davecheney> still sucks in the cloud
<axw> ? :(
<axw> is that my PR?
<axw> nope
<wallyworld__> nope, that's mine i think
<davecheney> sorry, yes
<wallyworld__> mine is running now
<davecheney> Building https://github.com/wallyworld/juju.git revision agent-version-on-startup-maste
<bodie_> if anyone has any obvious review for https://github.com/juju/juju/pull/617 it would be appreciated while I wait to go over the panic with jcw4
<bodie_> thanks all :)
<TheMue> morning
<mattyw> morning all
<dimitern> anyone willing to review a short patch for the critical  https://bugs.launchpad.net/juju-core/+bug/1361374 (along with a few others) ? PTAL https://github.com/juju/juju/pull/618
<mup> Bug #1361374: maas provider assumes machine uses dhcp for eth0 <addressability> <maas-provider> <network> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1361374>
<TheMue> dimitern: looking
<dimitern> TheMue, thanks!
<TheMue> dimitern: LGTM, and leads me to postpone my PR for some minutes more. then I’ll integrate your changes into the providers environment capability
<dimitern> TheMue, tyvm
<gsamfira> morning all
<gsamfira> https://github.com/juju/juju/pull/620 <-- if anyone has some spare time, this PR makes wrench build on Windows
<TheMue> dimitern: does your latest PR build? I’ve got a problem that our dependencies.tsv looks ill. the kardianos/service uses packages that aren’t listed. so my build fails :/
<dimitern> TheMue, it does build - i've first tested that trunk builds as well, and built my branch to test it on MAAS
<dimitern> TheMue, run godeps, it should fix it
<TheMue> dimitern: I always run godeps first
<dimitern> TheMue, did you try go get -u -v bitbucket.org/kardianos/service ?
<dimitern> gsamfira, looking
<dimitern> gsamfira, morning btw :)
<gsamfira> dimitern Thanks :D
<TheMue> dimitern: yep, I’ve got a script for it. and I found it in …/pkg/…
<TheMue> strange
<dimitern> TheMue, the -v and -u flags are important to fetch the package's dependencies, then you run godeps again
<dimitern> gsamfira, I'm not sure how's the deal with contributors, but I think new files should have the copyright + license header comments, would that interfere with // +build !windows ? (i.e. does it have to be on the very first line?)
<gsamfira> nope...I always forget the copyright *sigh*
<gsamfira> lemme add that
<TheMue> dimitern: as I said, I *always* do (have a script for it)
<dimitern> gsamfira, cheers!
<dimitern> TheMue, ok, sorry, I wanted to make sure because I've been bitten by this lots of times :)
<TheMue> dimitern: that’s why I created a script :D
<dimitern> gsamfira, LGTM with some trivials
<TheMue> dimitern: hmm, maybe I found the reason, could be in a conflict of GOROOT and GOPATH *investigating*
<gsamfira> dimitern: changes made :). Thanks
<dimitern> gsamfira, cheers!
<TheMue> no :(
<TheMue> dimitern: do you have code.google.com/p/go.exp/fsnotify in your pkg dir?
<dimitern> TheMue, nope, just checked
<TheMue> dimitern: I got the error message “/home/themue/code/go/src/bitbucket.org/kardianos/service/config/config.go:14:2: cannot find package "code.google.com/p/go.exp/fsnotify" in any of: …”
<TheMue> dimitern: same for go.text
<TheMue> *grrrr*
<dimitern> TheMue, I don't have bitbucket.org/kardianos/service as well
<TheMue> dimitern: it is the last line of the tsv
<TheMue> dimitern: take a look at github directly
<TheMue> dimitern: https://github.com/juju/juju/blob/master/dependencies.tsv
<gsamfira> dimitern, TheMue : bitbucket.org/kardianos/service is only needed on Windows to enable jujud.exe as a service
<dimitern> gsamfira, right!
<gsamfira> TheMue, what errors are you seeing?
<TheMue> gsamfira: my run of godeps after a go get of all packages in dep.tsv works
<TheMue> gsamfira: but then I wonna do a go test …/.
<dimitern> TheMue, so I did "go get -u -v bitbucket.org/kardianos/service", which completed successfully, then I run my run-godeps script to update deps, and finally just to check godeps -u dependencies.tsv - no errors
<TheMue> gsamfira: and then I get - below others - the error message shown above
<TheMue> dimitern: yep, here too
<TheMue> dimitern: but now do a go test for all
<dimitern> TheMue, now I'm running go build ./... && go test ./... in the root dir
<gsamfira> running now
<gsamfira> so far so good
<gsamfira> TheMue: can you do a: cd $GOPATH/src/bitbucket.org/kardianos/service && hg log | head
<gsamfira> I'm curious if the revision is correct
<TheMue> ha! found it, really strange!
<TheMue> gsamfira: it’s all fine in the tsv
<gsamfira> yup, just wanted to make sure its fine in the repo as well :)
<gsamfira> TheMue, what was it?
<TheMue> gsamfira, dimitern: dunno why, but I typed “go test …/.” instead of “go test ./…”. and in this case you get this error, crazy.
<gsamfira> ahh :)). I thought it was a typo on irc :D
<gsamfira> well then, glad it worked :)
<dimitern> TheMue, all tests pass here
 * TheMue also has to configure that there’s no more crazy char replacement in his IRC client. nothing to do with this problem, but looks weird
<TheMue> dimitern: see above, strange behavior when using …/. instead of ./…
<dimitern> TheMue, ah, it happens :)
<TheMue> dimitern: but then I would have expected a different kind of error, really crazy :D
<gsamfira> dimitern: can you merge https://github.com/juju/juju/pull/620 if its fine?
<dimitern> gsamfira, will look shortly
<gsamfira> thanks! :)
<TheMue> dimitern: btw, how do you see my three dots …? they always seem to raise a bit up here in my client, drives me crazy
<axw> wallyworld_: are you around still?
<wallyworld_> hey
<axw> hey. I'm changing sync-tools to use the API
<axw> seems that --public does not make any sense when we change to no provider storage
<axw> do you concur?
<wallyworld_> i think --public was sort of obsolete a while back
<axw> metadata plugins should take care of that right?
<wallyworld_> pretty much. we'll just make sure we check the new implementation satisfies the common workflows
<axw> wallyworld_: so I was thinking we could just honour it if the user uses --local-dir
<axw> but otherwise it'd be ignored
<wallyworld_> that sounds reasonable, but i'd need to re-read the code to comment for sure
<wallyworld_> which i will do
<axw> thanks
<dimitern> gsamfira, submitted for merging
<gsamfira> thank you! :)
<dimitern> TheMue, they look different enough - I'm using monospace 11 in xchat
<TheMue> dimitern: I found the switch, ... and … ;)
<dimitern> TheMue, nice :)
<TheMue> dimitern: yeah, sometimes software tries to be more intelligent than you and behaves strange.
<TheMue> dimitern: now I'm resolving the merge conflict after your latest change :D
<dimitern> TheMue, sweet!
<TheMue> dimitern: currently, as a quick change, I'm using your disableNetworkManagement and my RequiresSafeNetworker in a combination. did I get your right that the capability should answer it directly (by taking a look at the configuration)?
<dimitern> TheMue, I think the capability should check disableNetworkManagement internally and return true when it is true, before doing other checks
<TheMue> dimitern: ok, so have to touch all providers again. :/ at least those are only a few methods :)
<dimitern> TheMue, sorry about the fuss :/
<TheMue> dimitern: no pro, simply makes sense
<TheMue> dimitern: oh, got logical problem now. the capability today tell if a safe or "unsafe" networker shall be used
<TheMue> dimitern: but it doesn't tell, if NO networker shall be used
<dimitern> TheMue, no, no - there is always a networker running, but if management is disabled we should run a safe one
<TheMue> dimitern: ah, here's my mistake, so if disableNetworkManagement is true we return true to run the safe one, got it
<dimitern> TheMue, yep
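[editor's sketch] The rule agreed above (a networker always runs; a disabled-management setting forces the safe one before any other check) might be expressed like this. The type and the `providerNeedsSafe` field are invented for illustration; only `disableNetworkManagement` and `RequiresSafeNetworker` are names taken from the conversation, and their real signatures are not shown there.

```go
package main

import "fmt"

// providerConfig is a hypothetical slice of provider configuration.
type providerConfig struct {
	// disableNetworkManagement mirrors the setting named in channel.
	disableNetworkManagement bool
	// providerNeedsSafe is an invented placeholder for whatever
	// provider-specific checks apply when management is enabled.
	providerNeedsSafe bool
}

// RequiresSafeNetworker reports whether the safe networker should be
// used. It never means "run no networker at all".
func (c providerConfig) RequiresSafeNetworker() bool {
	if c.disableNetworkManagement {
		// Management is disabled: always run the safe networker.
		return true
	}
	return c.providerNeedsSafe
}

func main() {
	cfg := providerConfig{disableNetworkManagement: true}
	fmt.Println("safe networker:", cfg.RequiresSafeNetworker())
}
```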
<perrito666> good morning everybody
<katco`> wallyworld_: brt, mic doesn't seem to be working
<wallyworld_> ok
<alexisb> perrito666, ping
<perrito666> alexisb: pong
<perrito666> set, match
<alexisb> :)
<alexisb> can you jump on a call to help with a technical question?
<perrito666> which call?
<alexisb> MAAS call, it is about the windows workload support in juju
<alexisb> getting hangout
<perrito666> oh, ok, could you giveme the url?
 * TheMue sometimes hates unit testing. no need to discuss how valuable it is. but sometime stuff is so interwoven ...
<bodie_> TheMue, I always appreciate unit testing and never question its utility ;)
<natefinch> morning all
<bodie_> hullo natef
<bodie_> tab
<wwitzel3> hey natefinch
<wwitzel3> natefinch: I'm in moonstone, ready when you are
<TheMue> bodie_: as I said, no discussion about how much it's worth, only sometimes the effort
<perrito666> natefinch: hey wb
<alexisb> natefinch, welcome back
<bodie_> TheMue, I was just kidding :) but I always realize how much it's worth when something I thought was bulletproof, isn't...
<TheMue> bodie_: but in my case it now also leads to a cleaner design using a mockable interface ;)
<TheMue> so, the refactored code build, but now I have to change the tests
<ericsnow> natefinch, perrito666, wwitzel3: standup?
<perrito666> ericsnow: going
<aznashwan> Hello, just throwing this out there: has anyone else noticed that we're still using the launchpad.net version of gocheck in all our tests when the repo's been moved to github for like 6 months now?
<abentley> We have updated a utopic machine and now the lxcbr0 interface is missing and we can't use the local/lxc provider.  Any ideas?
<abentley> The machine is our test slave for utopic.
<katco`> hey is anyone from onyx online?
<katco`> well maybe someone else can answer this then :)
<katco> thumper mentioned that a recent change is causing apt to be run on host machines, not just worker machines, i think when running local. just wanted to see if anyone had some more information on that?
<katco> i'm looking at all-machines.log, and my machine's /var/log/apt/history.log, and i'm not seeing evidence that it's being run, but i'm going on the assumption that i'm just not looking where thumper was
<hazmat> any folks with juju azure implementation knowledge on hand?
<alexisb> mgz_, do you happen to know anything about azure?
<mgz_> some, but probably only enough to be dangerous
<mgz_> what's the question?
<mgz_> hazmat: what's up?
<hazmat> mgz_, so we're setting a single affinity group for the entire env.. that seems inappropriate.. afaik the only reason we're doing that is to create a virtual network for private backplane communication between service instances
<hazmat> i wanted to confirm that's the only reason we're touching affinity groups
<mgz_> yeah, we need all machines able to talk to all other machines, which means they all need to be in the same affinity group, no?
<hazmat> because that underlying reason is no longer accurate, ie. regions/locations can be used when creating virtual nets now.
<mgz_> current networking work should relax that restriction eventually
<hazmat> mgz_, that's the issue, that's not true
<hazmat> it may have been at the time, but the apis have been updated
<mgz_> okay, sounds like a bug to raise then if we have new azure tools to make it work
<mgz_> what should we do instead? we still need some kind of affinity group policy, no?
<hazmat> and it impacts how the fabric internally maps instances to fault domains when they're in the same affinity group.. ie affinity group tries to say place me on the same rack, whereas we also want fault domains, which are their mapping to az.
<hazmat> mgz_, actually we don't
<mgz_> and juju can't currently do smart things like put different units of the same service in different groups I think
<hazmat> mgz_, we were only using affinity group (on the whole env) for the networking aspect
<mgz_> so, just dump the group entirely?
<hazmat> mgz_, azure internally will do fault domain placement for us
<mgz_> well, that doesn't sound too bad
<hazmat> for multiple role instances
<mgz_> hazmat: can you file a bug, with some of the details of the new scheme we should use instead?
<hazmat> mgz_, sure
<mgz_> thanks!
<hazmat> mgz_, just want to verify the fault domain mapping on our instances as we create at the moment.. i kinda of wish juju had commands for listing underlying cloud info on instances for admins
<mgz_> yeah, we should have an extended status thing really
<hazmat> mgz_, for manual provider plugins.. i've started to just have a separate command list-machines with provider specific details
<hazmat> a list-machines command in juju might be appropriate for this.. our nested yaml data structs on status are rather hard to interpret visually given their tendency to scroll off the screen... there's some status work though as part of pain points that may help
<bodie_> I have landed and merged the bugfix for intermittent Action related test failures.  is this "released" or "committed"?
<bodie_> https://bugs.launchpad.net/juju-core/+bug/1351371
<mup> Bug #1351371: state/apiserver/uniter/uniter_test fails TestPreexistingActions when Actions arrive out of order. <actions> <intermittent-failure> <juju-core:Fix Committed by binary132> <https://launchpad.net/bugs/1351371>
<perrito666> ppl I am a bit ill Ill go rest a moment and then come back
<bodie_> anyone, guidance on the bug status?  I assume it is "committed" because it is not yet in a release version of juju?
<bodie_> mgz_, natefinch, alexisb?
<natefinch> bodie_: committed
<alexisb> one sec bodie_ on a call
<natefinch> bodie_: I let sinzui et. al. change it to released
<bodie_> okay, cool.  thanks natefinch
<alexisb> bodie_, looks like you got your answer
<alexisb> thanks natefinch
<natefinch> ericsnow: LGTM'd your unrevert
<ericsnow> ah, cool
<ericsnow> natefinch: thanks
<ericsnow> natefinch: FYI, I've filed a bug against the postgresql charm, 1 against reviewboard, and what feels like 100 against the reviewboard charm. :)
<ericsnow> natefinch: the charm author has reached out to me. :)
<natefinch> ericsnow: awesome... maybe you can help fix up the reviewboard charm
<ericsnow> natefinch: it's not on the scale that the author will need much help
<ericsnow> natefinch: I did, however, write my first charm (for the OAuth reviewboard extension).
<ericsnow> wasn't as hard as I had expected
<ericsnow> and writing Python is always like going home :)
<natefinch> ericsnow: cool, glad you got to write a charm, and got to exercise those python skills
<natefinch> ericsnow: btw, you can upload to your own namespace in the charmstore, with zero overhead or review needed
<ericsnow> natefinch: good to know
<ericsnow> natefinch: I'll put that on my todo list :)
<natefinch> ericsnow: so like juju deploy c:~ericsnow/reviewboard
<natefinch> cs:~ericsnow that is
<TheMue> so, redesign after merging done, tomorrow morning some polishing. good night all
<thumper> katco: morning
<katco> thumper: howdy, thumper
<thumper> katco: so... here is what I was seeing yesterday...
<thumper> was testing the local provider to check some mongo changes I was doing
<thumper> juju bootstrap --debug
<thumper> for the local provider
<katco> thumper: ok
<thumper> this showed that the cloud init step was doing an apt-get update/upgrade
<thumper> you can see the results in $datadir/cloud-init-log
<thumper> or something like that
<katco> thumper: on the container? or on your machine?
<thumper> it seems that the cloud init script being generated for the local provider on bootstrap is doing the apt dance
<thumper> no containers are being created
<thumper> just bootstrapping
<thumper> which messes with the host machine
<katco> thumper: ah yes, ok. i was afraid i misunderstood ian.
<thumper> (oh how I'd like to fix that)
<thumper> this made bootstrapping the local provider take a lot longer than it needs to
<katco> thumper: please forgive my newbness :( doesn't bootstrapping create a container to be the state machine?
<thumper> katco: not for the local provider
<thumper> it should (in the future)
<thumper> but doesn't
<thumper> machine 0 for the local provider is the host
<katco> thumper: ahhh! there's the insight i was missing.
<thumper> the datadir is set to $JUJUHOME/<envname>/
<thumper> so for me, with a local called "local"
<thumper> the data dir is ~/.juju/local
<katco> thumper: ok, so a little background: we wanted the new config settings to trump all, but have sensible defaults
<katco> thumper: the defaults for local were to omit the apt commands only if lxc-clone was on
 * thumper nods
<natefinch> thumper: can you talk to dpb on canonical's #juju, he has some lxc issues and would like some help, and it's past EOD for me.
<thumper> katco: we need to special case the bootstrap node for local
<katco> thumper: but it sounds like there's a completely separate set of code i need to modify to omit the apt commands from local
<katco> thumper: yeah. ok i think ian tried to impart that :p
 * thumper nods
<katco> thumper: thank you so much, i'll start now. pretty good idea of where the modifications need to be made.
<thumper> bootstrap in provider/local/environ.go is already a little special
<thumper> kk
 * katco laughs
 * natefinch has to run
<katco> thumper: actually, still a little strange. right now my machine has a nonsensical apt proxy. wouldn't i be seeing failures in /var/log/apt/history.log?
<katco> thumper: or is that an incorrect metric?
<thumper> umm... apt proxy set where?
<katco> thumper: /etc/apt/apt.conf on my machine
<katco> thumper: i was trying to use that as a hammer to notify me if _anything_ was awry
<thumper> I'm not sure where apt logs problems
<thumper> but I would guess /var/log/apt ... something
<katco> thumper: (shrugs), i trust you. i'll just make the changes.
<katco> arg, my irc client crashed. missed any response after the "i trust you" comment
<thumper> katco: I didn't say anything :-)
<thumper> I was just basking in the glow of trust
<katco> hahaha
<katco> thumper: https://github.com/juju/juju/pull/621
<thumper> katco: otp now will look shortly
<katco> thumper: no problem, just don't want you blocked
<wallyworld_> katco: hey, i see you're working on the apt thing. did you want me to look at the other provisioning issue?
<katco> wallyworld_: thanks, but i think i almost have that wrapped up
<wallyworld_> \o/
<katco> wallyworld_: will hop back on that when this PR lands
<wallyworld_> sure, just wanted to be sure you weren't blocked
<katco> wallyworld_: i'm down to tests that look like they're actual failures due to new harvesting settings, instead of regressions
<wallyworld_> ok
<katco> wallyworld_: you could look at that PR to get it landed faster. that would unblock me :)
<wallyworld_> katco: the apt settings one? sure. i thought thumper was so didn't want to double up, but i can. i have a meeting now but will do so straight after
<katco> wallyworld_: oh, he'll probably get to it. he's on a call i think.
<wallyworld_> ok
<katco> wallyworld_: also, is FAIL: upgrade_test.go:324: UpgradeSuite.TestUpgradeStepsHostMachine a transient error?
<katco> just got it, hadn't in 2 previous runs >.<
<thumper> katco: it certainly shouldn't be
<thumper> katco: which failures?
<katco> thumper: upgrade_test.go:328:
<thumper> katco: which package?
<katco> cmd/jujud
<perrito666> wallyworld_: great job with the bug yesterday
<katco> thumper: running again...
<wallyworld_> katco: no, not that i know of
<wallyworld_> perrito666: finally got it nailed
<wallyworld_> perrito666: one down, 100s more to go :-(
<wallyworld_> maybe an exaggeration
<perrito666> wallyworld_: well you gotta start somewhere
<wallyworld_> perrito666: of course, after that i had to unrevert the erroneous revert of my other branch :-/
 * katco throws hands up
<katco> it passes
<perrito666> wallyworld_: and applied the patch for the fix in master too?
 * katco runs the entire suite again...
<wallyworld_> perrito666: yup
<wallyworld_> katco: welcome to juju testing :-)
<katco> wallyworld_: and a warm welcome it is ;)
<thumper> that is a terrible test
<wallyworld_> "warm"
<katco> wallyworld_: almost like being in hell
<thumper> and I think mostly unneeded
<thumper> I'd like wallyworld_'s opinion too
<thumper> I don't believe it should be testing what it is testing
<thumper> the upgrade steps should be tested elsewhere
<wallyworld_> thumper: otp with alexis, will look soon
<thumper> jujud should assert that it is running upgrades, but not that every step has run
<thumper> that's just unnecessary
<thumper> wallyworld_: I see lots of emails about juju merges accepted, but not rejected nor landed
<thumper> wallyworld_: any ideas?
<wallyworld_> thumper: will look soon, still otp
<alexisb> thumper, quick bugging Ian in our 1x1 ;)
<katco> wallyworld_: ian! ian! hey!
<thumper> wallyworld_: what ever you do, don't mention that thing to alexisb
<wallyworld_> fark off
<katco> haha
 * katco loves her team
<waigani> thumper: standup
<katco> thumper: so does this need to be a new concept? update behavior for bootstrap only?
<wallyworld_> thumper: yesterday I aborted several of my landings, perhaps that's what you are seeing
<thumper> wallyworld_: maybe
<waigani> thumper: magic!
<thumper> see :)
<wallyworld_> thumper: you happy with https://github.com/juju/juju/pull/621 ?
<thumper> no
<thumper> wallyworld_: looks like the values are being set for the entire environment
<thumper> wallyworld_: rather than just the bootstrap node
<katco> thumper: wallyworld_: is there an existing concept for settings for a single node or operation?
 * thumper takes a quick look
<thumper> katco: provider/local/environ.go:139
<wallyworld_> thumper: why aren't they needed for the entire environment? all machines which start should have those settings
<thumper> here we directly modify the machine config struct for the bootstrap node
<wallyworld_> oh wait
<thumper> line 167
<wallyworld_> i see what you mean
<thumper> both line 167/168 should be false, yes?
<wallyworld_> thumper: i think that's all that's needed isn't it - just set those 2 to false
<thumper> I think so
<thumper> to fix this issue anyway
<katco> hrm... i see what you mean. so https://github.com/katco-/juju/blob/local-apt-defaults/provider/local/environ.go#L162 would be false
<katco> b/c that will only affect the machine config for the state server?
<wallyworld_> katco: hmmm. see cloudcfg on line 184
<wallyworld_> that also sets the update bools
<katco> wallyworld_: right, but it pulls them from the machine config
<bodie_> final lynchpin for actions on the unit: https://github.com/juju/juju/pull/617 ready for review (415 and 520 depend on this)
<katco> i think setting the machine config to false would be ok, unless you think it should only set the cloud config
<wallyworld_> yes, it does, so yes setting mcfg to false should do it
<katco> running tests
<wallyworld_> you may need a new test for this issue specifically if there's not one there
<katco> wallyworld_: https://github.com/juju/juju/pull/621/files#diff-51c33284469e382ba300508359834cc6R210
<katco> wallyworld_: hrm. do we want a blanket false? or should we continue to check if the environment settings are set?
<katco> wallyworld_: the way it's coded now, even if i set the environment settings to true, the updates/upgrades won't be run
<wallyworld_> katco: maybe we should allow the user to override them - otherwise it's just another special case for people to get confused about
<wallyworld_> i can't see why they'd want to set them to true
<wallyworld_> but they might
<katco> wallyworld_: yeah, i agree. if i specifically tell juju to do something, i'd like it to happen :)
<wallyworld_> juju world-peace
<katco> haha
<katco> gotta charm it first
<wallyworld_> Error: Juju cannot do that, Hal
<katco> wallyworld_: while i'm waiting for tests to finish. remember how you were saying values would be cached on machine objs? if i do an EnsureDead, will a call to Life return the correct value?
<katco> wallyworld_: afterward i mean
<wallyworld_> katco: should do, let me check
<wallyworld_> katco: no, appears not :-( looks like you gotta call Refresh()
<katco> wallyworld_: how did you check that so fast?
<wallyworld_> state/api/machine/machiner.go
<katco> ah
<wallyworld_> i knew where to look, bitter experience
<katco> someday, i too, will be able to say i have bitter experience.
<katco> ;)
<wallyworld_> see the EnsureDead() method - doesn't look like it does any update on the struct's life value
<wallyworld_> oh, i was bitter before too :-)
<katco> haha
<wallyworld_> and twisted
<katco> it's wally's world and we're all just living in it ;)
<wallyworld_> glad you understand
<katco> rofl
<katco> i think that's one of the bugs right there. b/c i separated the ensuredead from placing it into the dead slice
<katco> previously it would ensuredead and then immediately fall into the dead list, despite it's comical assertions that it wasn't yet perished.
<katco> *its
<wallyworld_> yes, sounds plausible
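[Editor's note] The caching behaviour discussed above (EnsureDead not updating the locally held life value until Refresh is called) can be sketched roughly as follows. This is a minimal illustration, not juju's machiner code: the `server`, `Machine`, and `newMachine` names are hypothetical stand-ins, and the real API talks to a remote state server rather than a map.

```go
package main

import "fmt"

// Life is a hypothetical stand-in for the lifecycle value discussed above.
type Life string

const (
	Alive Life = "alive"
	Dead  Life = "dead"
)

// server stands in for the remote state; it is not juju's real API.
type server struct{ life map[string]Life }

// Machine caches the life value it saw when it was last fetched.
type Machine struct {
	id   string
	life Life
	srv  *server
}

// newMachine fetches the machine once and caches its life value.
func newMachine(srv *server, id string) *Machine {
	return &Machine{id: id, life: srv.life[id], srv: srv}
}

// Life returns the cached value without contacting the server.
func (m *Machine) Life() Life { return m.life }

// EnsureDead updates the server only; the local cache is left untouched.
func (m *Machine) EnsureDead() { m.srv.life[m.id] = Dead }

// Refresh re-reads the life value from the server into the cache.
func (m *Machine) Refresh() { m.life = m.srv.life[m.id] }

func main() {
	srv := &server{life: map[string]Life{"0": Alive}}
	m := newMachine(srv, "0")
	m.EnsureDead()
	fmt.Println(m.Life()) // still the cached value
	m.Refresh()
	fmt.Println(m.Life()) // now reflects the server
}
```

Under these assumptions, any code that sorts machines into a dead list right after EnsureDead needs a Refresh in between, or it will act on the stale value.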
#juju-dev 2014-08-28
<katco> wallyworld_: https://github.com/juju/juju/pull/621
<katco> ready for review
<wallyworld_> looking
<katco> wallyworld_: grabbing some supper, brb
<wallyworld_> ok
<katco> wallyworld_: back
<wallyworld_> katco: i left a couple of droppings in your PR
<wallyworld_> i think maybe the test coverage needs to be expanded a bit
<katco> wallyworld_: gah i'm flipping back and forth b/t branches too much. the reason i didn't use the val in that map is b/c with the harvesting stuff, what's in the map will be a string and we want an int. here it's totally fine though. thanks :p
<wallyworld_> :-)
<katco> wallyworld_: ready again. running tests on my machine
<wallyworld_> ok
<katco> wallyworld_: http://golang.org/doc/effective_go.html#redeclaration
<wallyworld_> katco: rightio. i thought that only applied to the first variable
<wallyworld_> i HATE := vs = soooo much. worst design decision EVER
<katco> wallyworld_: afaik, go doesn't do anything anywhere with regards to parameter ordering
<katco> wallyworld_: haha
<katco> wallyworld_: i don't mind it, but i do wish they would have standardized on new(...) vs :=
<wallyworld_> := vs = is the cause of so many bugs
<wallyworld_> and anyway, it's the fucking compiler's problem to sort out, not the programmer
<katco> wallyworld_: really? i haven't experienced that directly yet
<wallyworld_> we have in juju
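[Editor's note] The kind of bug being complained about here is usually accidental shadowing: `:=` in an inner scope declares fresh variables instead of assigning to the outer ones. The sketch below is illustrative only; `lookup`, `shadowed`, and `assigned` are made-up names, not juju code. It also shows the redeclaration rule from the Effective Go link: in the same scope, `:=` may reuse an existing variable as long as at least one variable on the left is new.

```go
package main

import (
	"errors"
	"fmt"
)

// lookup is a hypothetical helper that fails when ok is false.
func lookup(ok bool) (int, error) {
	if !ok {
		return 0, errors.New("lookup failed")
	}
	return 42, nil
}

// shadowed loses the error: := inside the block declares a NEW n and err
// that are scoped to the block, so the outer variables are never written.
func shadowed() (int, error) {
	var n int
	var err error
	if true {
		n, err := lookup(false)
		_, _ = n, err // used only so the inner variables compile
	}
	return n, err
}

// assigned uses plain = and therefore writes to the outer n and err.
func assigned() (int, error) {
	var n int
	var err error
	if true {
		n, err = lookup(false)
	}
	return n, err
}

func main() {
	a, err := lookup(true)
	b, err := lookup(true) // legal: b is new, err is reused in the same scope
	fmt.Println(a, b, err)

	_, err1 := shadowed()
	_, err2 := assigned()
	fmt.Println(err1 == nil, err2 != nil)
}
```

The compiler accepts both versions, which is why the shadowed form tends to surface only as a silently dropped error at run time.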
<katco> wallyworld_: any more feedback? i think i got everything/pushed
<wallyworld_> katco: yeah, just about to LGTM but you interrupted me :-P
<katco> wallyworld_: sorry oh supreme leader of wallyworld! ;)
<wallyworld_> have i told you today?
<katco> told me what?
<wallyworld_> fark off!
<katco> LOL
<katco> that'll be 2x i think today
<wallyworld_> there, now you've been told
<katco> haha
<katco> ok back to the harvesting stuff
<wallyworld_> thanks for fixing
<katco> thanks for the review
<katco> thumper: should be landing momentarily. sorry for the regression.
<wallyworld_> don't apologise to him, he will expect it every time now
<katco> i already told him i trusted him. i think i'm just off on the wrong foot with that guy
<hazmat> axw, ping
<hazmat> axw, was doing some research on azure earlier today and found some interesting info, wanted to run by you..
<wallyworld_> he's just a pussy cat really, roll him over and rub his belly and he's all good
<hazmat> axw, nutshell we don't need to associate the env to an affinity group anymore for the sole purpose of getting a vnet.. vnets can be associated to regions now
<wallyworld_> hazmat: i have a theory why provisioning is failing, but the log files don't contain the error message i would expect to see, so it's a guess. apt contention installing container dependencies
<wallyworld_> that explains the issue you raised, but doesn't explain the one where only one container out of several fails to start
<hazmat> wallyworld_, do we normally have errors we don't log?
<hazmat> wallyworld_, possibly.. i thought that was addressed already via retry?
<wallyworld_> hazmat: yes we do, and i'm not seeing them which is confusing me
<hazmat> wallyworld_, the issue is nothing else on that machine is installing anything
<hazmat> wallyworld_, the one other unit on the machine.. is the ubuntu charm.. aka do nothing
<hazmat> wallyworld_, its log is also in the tarball
<hazmat> wallyworld_, so apt contention with what..
<wallyworld_> hazmat: retry is only in 1.20.6
<hazmat> wallyworld_, ah.. fair enough.. this is .5
<hazmat> wallyworld_, but still curious as to what it would contend with
<wallyworld_> hazmat: ok, i didn't see what the unit was. but apt contention is the only thing that i can see right now that explains why the logging cuts off at the point it does
<wallyworld_> there may be another cause
<hazmat> wallyworld_, why are all the container watchers being killed on all the machines at the same time
<wallyworld_> but what's happening is that the code is calling the container setup,which calls apt, but then gets no further
<wallyworld_> hazmat: they are killed because they are no longer needed - they exist to set up container support for the machine and then they die
<wallyworld_> ie the apt stuff and set up to run lxc is done lazily
<wallyworld_> once the first lxc is asked for
<hazmat> wallyworld_, so it would be a container provisioner logic issue then
<wallyworld_> maybe, i can't explain why things just stop
<wallyworld_> there was an issue in 1.20.5 where the watcher was stopped twice
<wallyworld_> but i don't think that will cause this issue
<wallyworld_> i need to keep digging a bit
<hazmat> wallyworld_, ack
<wallyworld_> thanks for getting the logs
<hazmat> np
<wallyworld_> i'm keen to get 1.20.6 out there
<wallyworld_> so we can see how it behaves
<wallyworld_> lots of fixes in there
<wallyworld_> a CI issue with azure is holding things up
<hazmat> the --upload-tools issue on azure?
<wallyworld_> azure was broken for CI, might be fixed now
<wallyworld_> not sure, i just heard 2nd hand that the CI tests failed
<wallyworld_> they passed yesterday or the day before
<wallyworld_> but i haven't heard directly
<wallyworld_> we'll be pushing for a release tomorrow regardless
<wallyworld_> we have to get this 1.20.6 out and into the hands of landscape and other folks
<hazmat> wallyworld_, so does that mean there's some coordination between container watcher and container provisioner?
<axw> hazmat: I changed our vnet creation to use a "location" (region) a while back, but reverted it. not sure if it was a coincidence, but after that change there were a lot of problems with the vnet not being available
<axw> it would take >5 mins for the vnet to be accessible after creation
<hazmat> axw, interesting
<axw> IIRC the warning message that popped up in the azure console said it was only a problem with vnets created without an affinity group
<thumper> katco: awesome, ta
<katco> thumper: np
<axw> hazmat: did you see my PR for the docs on zones?
<hazmat> axw, i did, looked good
<axw> reading your doc now
<wallyworld_> hazmat: yes, the container watcher starts the provisioner when a new container is requested
<hazmat> ah ic now
<hazmat> wallyworld_, but why would container watcher killed be seen before apt-get install lxc if the watcher is responsible for installing the pre-reqs
<wallyworld_> hazmat: not sure. i think it asks the worker to die, but it won't do so until the current operation has finished ie the provisioner is started and then it exits
<wallyworld_> i'm not 100% across the worker infrastructure
<hazmat> wallyworld_, that's not apparent
<hazmat> wallyworld_, ie the logs where it's successful show it die and then the provisioner come up
<wallyworld_> hazmat: my understanding is that kill() marks the worker as dying, and it still needs the current loop invocation to finish, but i'm not sure
<wallyworld_> i'll look at reworking it, adding more logging also
<hazmat> ah ic.. it signals to stop itself before doing its actual work
<davecheney> thumper: https://github.com/juju/juju/pull/614
<davecheney> if you have a sec
 * thumper looks
<davecheney> this is the one from standup
<axw> waigani: you can stop reviewing https://github.com/juju/juju/pull/547, it's redundant
<axw> I already fixed the problem
<waigani> axw: ah so I see, thanks I missed that
<axw> hazmat: I don't have much to say on your doc, SGTM.
<axw> it would be nice if we could use this to enable colocation of services in azure
<axw> atm that's disallowed because we can't control which units communicate to which based on zone allocation
<hazmat> axw, hmm
<axw> (I still have no idea how it would work though)
<hazmat> axw, zone/fault domain in azure is a logical concept that's specific to azure service and its role instances.
<hazmat> axw, theoretically we could map to those when doing co-location, and pick the appropriate next instance (ie distribution group from co-located service's instances)
<axw> you'd also have to make sure you don't spread the two units across fault domains though
<axw> it's not enough to stick them in the same availability set
<axw> and then there's upgrade domains
<hazmat> axw, you do want to spread across fault domains.
<hazmat> axw, we don't actually use upgrade domains afaics
<axw> they are implicitly used
<hazmat> axw, does azure use upgrade domains under the hood?
<axw> when the machine is upgraded
<axw> i.e. regular maintenance
<hazmat> ic
<axw> that's my understanding anyway
<hazmat> my understanding was that it was tied to the app roll out of updates
<hazmat> but yeah.. underlying upgrades also makes sense
<axw> it is definitely tied to the app updates
<axw> I thought both tho
<axw> hazmat: re spreading across fault domains, I mean if you have a co-dependent app server & db, you surely don't want to spread them across fault domains
<axw> but multiple units of each, eys
<axw> yes*
<hazmat> axw, every ref ic to upgrade domain references app /deployment updates.. not iaas updates
<hazmat> axw, multiple units of each.. and you'd want spread.. single unit of each.. does it matter ;-)
<axw> hazmat: all I'm saying is the pairs need to be located in the same fault domain, otherwise you have a broken service if one goes down
<hazmat> axw, single unit of each and we don't really have any real notion of trying to keep it up.. fault domains are not global
<hazmat> they're service local logical
<hazmat> axw, ie. if they're co-located they're on the same vm.. so doesn't matter.. if they're separate services in azure
<hazmat> there is no guarantee that 0 == 0 between two services fault domains
<axw> right, they have to be in the same cloud service
<axw> it's a bit messy, forget I said anything :)
<axw> when they're in the same CS there's also issues of port collision
<hazmat> axw, so we'd use them as separate roles ?
<hazmat> within a service
<hazmat> axw, yeah
<hazmat> azure.. is special
<axw> yes, separate roles. I was thinking we could deploy a service and specify the cloud service name
<axw> (to be the same as an existing one)
<hazmat> woah.. now you're talking crazy.. semantic service names in an iaas console ? ;-)
<hazmat> i walk away for a few months to come back and remember how special it is.. i wrote up some code to verify the fault/upgrade domain thingy and its interaction with affinity groups. https://gist.github.com/kapilt/d326b853e4606f9203e9 i kinda of wish we had a list-machines to do iaas provider specific details
<hazmat> axw, oh.. nevermind not semantic
<hazmat> axw, we currently do separate roles per instance as well.. there's some messiness trying to treat azure as general compute
<axw> a (Virtual Machine) role is an instance
<axw> there's some other roles that aren't applicable to IaaS
<axw> web worker roles.. don't know much about them
<hazmat> axw, so why do we/they have roles and role_instance_list separately
<axw> nfi
<axw> I think it's to do with deployments
<axw> you can have prod/testing deployments
<axw> and switch them at runtime
<hazmat> yeah.. the slots and upgrades
<hazmat> and rollbacks
<axw> so you define a role and there's an instance for it in each deployment
<hazmat> ah.. ic.. that makes a certain sense.. logical from instantiation across prod vs staging
<axw> anyway, so what I was saying is we could do, say "juju deploy app --to cloudservice=mythingy" and "juju deploy db --to cloudservice=mythingy", then if you ensure each service has at least the same number of units as there are fault domains, then the units can self organise to talk to units in the same fault domain
<axw> there's still the issue of port collisions, but there's not much we can do about that. only matters for exposed services anyway
<davecheney> waigani: https://github.com/juju/juju/pull/622
<hazmat> axw, they're on the same machine w/ co-location.. so the port collision thing is immaterial to the provider.
<hazmat> axw, also matters for unexposed.. cause failure to bind
<axw> hazmat: not same machine, just same cloud service
<hazmat> axw, we don't control fault domain
<waigani> davecheney: looking :D
<axw> hazmat: no... but there are 2 fault domains and we allocate 2 units, I think Azure will spread them equally?
<axw> but if*
<hazmat> it will
<hazmat> axw, this is where the spec comes into play.. the charms can choose to self-organize that way if they choose.. via relation-get query to remote unit matching zone
<axw> hazmat: right, that was my point :)
<axw> I'm saying with your proposal, this is feasible
<hazmat> axw, aha.. finally i understand.. i should go to bed.. that took a while ;-)
<waigani> davecheney: that's awesome.
<waigani> davecheney: what's the -type d flag?
<waigani> help just says: -type [bcdpflsD]
<waigani> not very helpful...
<davecheney> waigani: man find
<davecheney> waigani: please review my comments to https://github.com/juju/juju/pull/617
<davecheney> waigani: please review my comments to https://github.com/juju/juju/pull/613
<thumper> waigani: did you want to update the envuser stuff now with the st.environTag, or as a followup?
<waigani> thumper: followup? I've got the todos in there so should be easy/quick
<thumper> kk
 * thumper keeps reviewing
<waigani> davecheney: I've got to do the school run,  I'll be back online in a bit
<davecheney> kk
<thumper> heh...
 * thumper squeezed (╯°□°)╯︵ ┻━┻ into a unit test
 * hazmat steps back from the unicode wizardry
<bodie_> davecheney, thinking about your concern with the empty ActionTag as a signal of non-action hook
<bodie_> davecheney, since I'm initializing the ActionTag with an empty value and only inserting a value (via api) if the hook was an Action, it seems to me like there would never be a case when it would not suffice as the switch
<davecheney> bodie_: then you never need to check ?
<bodie_> davecheney, well, the check is to consider whether it is an action (i.e. always has a tag value), or not in which case the value will always be empty
<davecheney> if it's empty then use something that can be nil
<davecheney> otherwise you'll get fucked by the subtle difference between var a names.ActionTag, and a = names.NewActionTag("")
<bodie_> I think the latter would only happen if the action didn't have an id, in which case we're fucked anyway
<wallyworld_> axw: a small one https://github.com/juju/juju/pull/623 if you have a moment
<bodie_> but, that error case should get caught by runHook
<axw> looking
<bodie_> i.e., the value is *always* going to either be empty = non-action, or non-empty = action, or already errored out when the id was mysteriously missing
<davecheney> bodie_: i don't like using the zero value like that
<bodie_> that is my feeling too
<davecheney> please make it a pointer or use the names.Tag interface
<bodie_> sounds like a plan
<bodie_> :)
<bodie_> thanks
<davecheney> cool
<davecheney> thanks
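[Editor's note] The concern raised above is that an empty tag value cannot reliably signal "this hook is not an action", because a zero value and a tag built from an empty id look alike. A rough sketch of the suggested alternative, using a pointer so that absence is an explicit nil, might look like this. `ActionTag`, `NewActionTag`, and `hookContext` here are simplified stand-ins written for this note; they are not the real `names` package or uniter types.

```go
package main

import "fmt"

// ActionTag is a hypothetical stand-in, not the real names.ActionTag.
type ActionTag struct{ id string }

// NewActionTag builds a tag from an id; an empty id is not rejected here.
func NewActionTag(id string) ActionTag { return ActionTag{id: id} }

// hookContext holds a pointer so that "no action" is an explicit nil
// instead of an ambiguous zero value.
type hookContext struct {
	actionTag *ActionTag
}

// isAction reports whether a tag was set at all, regardless of its content.
func (c *hookContext) isAction() bool { return c.actionTag != nil }

func main() {
	var zero ActionTag
	// In this stand-in the two values are indistinguishable, which is
	// exactly why the zero value makes a poor "not an action" signal.
	fmt.Println(zero == NewActionTag(""))

	ctx := hookContext{}
	fmt.Println(ctx.isAction())

	tag := NewActionTag("some-id")
	ctx.actionTag = &tag
	fmt.Println(ctx.isAction())
}
```

Holding the value behind an interface (the other option mentioned) gives the same nil-means-absent property without exposing a pointer to a concrete type.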
<axw> wallyworld_: that AddInt32 test looks like crack anyway
<axw> won't it always stop the last worker it added?
<wallyworld_> it doesn't add workers
<wallyworld_> it stops the container watcher once all supported container types have been initialised
<wallyworld_> lazy init of containers
<axw> ah, I see
<wallyworld_> i'm not sure it was bad how it was, but it's more logical to have it in a defer i think
<wallyworld_> axw: thanks. the defer is a hail mary. it *shouldn't* matter but the runner stuff is a bit mysterious. certainly early termination of the worker is one explanation for the logs i saw
<thumper> waigani: if you can't use the factory, just use the state methods to create users
<waigani> thumper: ok
<waigani> thumper: https://github.com/juju/juju/pull/553
<thumper> waigani: I'll look shortly, need to go make dinner
<waigani> thumper: np, I'll have to do the same soon - at ice skating right now
<axw> wallyworld_: can you please close https://github.com/juju/juju/pull/547?
<wallyworld_> sure
<axw> thanks
<bodie_> davecheney, addressed your points.  any response to https://github.com/juju/juju/pull/617#discussion_r16817826 when you have a sec?
<bodie_> davecheney, this code is the dep for a bunch of other stuff, so if I can get even a brief comment on that reply it would be really helpful to moving us forward
<bodie_> otherwise I believe others may hesitate to jump in on that topic
<bodie_> and since this is my 1:30 am, I don't have a lot of confidence I will get a chance to pester you again soon :)
<wallyworld_> axw: something to ponder with the tools work, not 100% relevant now but good to keep in mind https://bugs.launchpad.net/juju-core/+bug/1347984
<mup> Bug #1347984: container provisioner may choose bad tools <juju-core:Triaged> <https://launchpad.net/bugs/1347984>
<axw> wallyworld_: thanks
<axw> wallyworld_: you make a good point about "pending forever", but it's the same either way
<axw> perhaps when we fix that we can put a sensible timeout in place?
<wallyworld_> yeah, we do need to do something
<wallyworld_> we have work scheduled to improve this area
<TheMue> morning
<dimitern> morning
<TheMue> hmm, two tests running the whole story fail *checking*
<TheMue> dimitern: btw, your latest change led to a minor but nice redesign by my side
<dimitern> TheMue, oh yeah?
<TheMue> dimitern: yeah, using an interface answering the questions RequiresSafeNetworker() has, instead of adding more and more arguments
<dimitern> TheMue, cool!
<TheMue> dimitern: and, you may believe it or not, machiner.Machine implements this interface too :D like my mock type for the tests
<dimitern> TheMue, the IsManual thing?
<TheMue> dimitern: and the Id of the machine, all now fetched in one versioned doc, and the params separated from the in-memory storage
<TheMue> dimitern: John and I discussed about it these days
<dimitern> TheMue, yep, it is better like this, isn't it?
<TheMue> dimitern: yeah, I think so. params should simply be for transport. this also will make the implementation of versioning more simple
<dimitern> TheMue, that's the intent, yeah
<TheMue> dimitern: +1
<mattyw> davecheney, morning - thanks for the review
<TheMue> so, looks like I caught all failing tests due to the redesign. one final complete test and then PR :)
<davecheney> mattyw: no worries
<gsamfira> morning all
<natefinch> wwitzel3, ericsnow, team meeting?
<hazmat> do we have a stack trace dump signal handler on agents?
<hazmat> wallyworld_, was thinking that might have helped container debug
<wallyworld_> hazmat: no, would be nice though
<natefinch> hazmat: the stack trace should get output to stderr on a panic and thus go into the log
<natefinch> hazmat: or maybe you mean like give it a signal and it'll log the current stack trace?  We can do that easily
<hazmat> natefinch, the latter
<natefinch> but yeah, no, doesn't currently exist
<hazmat> given a hung/spun .. with no log output. nothing happening on syscalls (per strace).. it would be nice to see what's broken
 * hazmat files a bug
<natefinch> hazmat: what's your preferred signal?
<hazmat> natefinch, QUIT
<hazmat> natefinch, https://bugs.launchpad.net/juju-core/+bug/1362546
<mup> Bug #1362546: Need a way/signal handler to dump stack trace on agents <juju-core:New> <https://launchpad.net/bugs/1362546>
<hazmat> jam, i think i totally misunderstood the context of your email yesterday
<hazmat> re container density
<jam> hazmat: well, some of it was just testing that we can genuinely get container addressibility, and some of it was trying to see what we could do with it for scale testing.
<jam> natefinch: SIGQUIT is built into Go
<jam> to trigger a panic()
<jam> I've used it repeatedly
<jam> hazmat: I'm pretty sure you already can
<natefinch> jam: triggering a panic is different than just printing a stack trace though
<natefinch> jam: but that's a good point
<hazmat> jam, thanks x2
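[Editor's note] As noted in the exchange, the Go runtime already dumps goroutine stacks on SIGQUIT, but it exits the process when it does so. A handler that logs the stacks and keeps running could be sketched as below. This is only an illustration of the idea in the bug report, not what juju agents do: `allStacks` and `dumpOnSignal` are hypothetical names, SIGUSR1 is an arbitrary choice made so the default SIGQUIT behaviour is left alone, and the code is Unix-only.

```go
package main

import (
	"os"
	"os/signal"
	"runtime"
	"syscall"
)

// allStacks returns the stack traces of every goroutine, growing the
// buffer until the whole dump fits.
func allStacks() []byte {
	buf := make([]byte, 1<<16)
	for {
		n := runtime.Stack(buf, true)
		if n < len(buf) {
			return buf[:n]
		}
		buf = make([]byte, 2*len(buf))
	}
}

// dumpOnSignal writes all goroutine stacks to stderr each time sig
// arrives, without terminating the process.
func dumpOnSignal(sig os.Signal) {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, sig)
	go func() {
		for range ch {
			os.Stderr.Write(allStacks())
		}
	}()
}

func main() {
	dumpOnSignal(syscall.SIGUSR1)
	// A real agent would carry on with its work here; this sketch just
	// prints one dump so the output format is visible, then exits.
	os.Stderr.Write(allStacks())
}
```

With something like this installed, `kill -USR1 <pid>` against a hung agent would show where each goroutine is blocked while leaving the process alive.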
<wallyworld_> axw: katco: finish meeting, be there real soon
<wallyworld_> finishing
<jam> wallyworld_: is aggregateSuite.TestMultipleResponseHandling one of your intermittent tests?
<jam> because I just came across it
<jam> and it assumes that "go foo(); go bar()" will call foo before bar
<jam> which is *not* guaranteed.
<wallyworld_> jam: no. i will add it. what's the jenkins link?
<jam> wallyworld_: I just discovered it locally
<jam> wallyworld_: I'll try to just fix it, since i'm doing some tests there
<wallyworld_> jam: ok, thanks
<jam> I happened to have the system change ordering, or I wouldn't have noticed.
<jam> fortunately it is just a bug in the test, and not a more serious underlying issue
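[Editor's note] The point about `go foo(); go bar()` is that the scheduler is free to run the two goroutines in either order. When a test really needs a fixed order, the ordering has to be made explicit, for example with a channel. The following is a minimal sketch of that technique and not the actual fix to the aggregate test; `runInOrder` is a made-up name.

```go
package main

import "fmt"

// runInOrder starts two goroutines and forces the second to wait for the
// first by closing a channel, instead of relying on start order.
func runInOrder() []string {
	var order []string
	firstDone := make(chan struct{})
	allDone := make(chan struct{})

	go func() {
		order = append(order, "foo")
		close(firstDone) // everything above happens before the receive below
	}()
	go func() {
		<-firstDone
		order = append(order, "bar")
		close(allDone)
	}()

	<-allDone
	return order
}

func main() {
	fmt.Println(runInOrder())
}
```

The channel closes establish happens-before edges, so the shared slice is accessed without a data race and the result does not depend on scheduling.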
<perrito666> good morning everybody
<perrito666> natefinch: hey, did you get my email?
<natefinch> perrito666: yep, got it.
<perrito666> the cold medicine I took must be either made of unicorn powder or some illegal drug, these things work waaay too well
<natefinch> perrito666: heh.... pseudoephedrine is good stuff
<perrito666> heh, well that explains
<wwitzel3> ericsnow, natefinch: standup time :)
 * perrito666 notices that the only person actually standing up in those is wwitzel3 
<mattyw> apparently landing is blocked - is anyone currently working on https://bugs.launchpad.net/juju-core/+bug/1362636 ?
<mup> Bug #1362636: ppc64el compilation error <ci> <ppc64el> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1362636>
<mgz_> mattyw: not that I know of
<natefinch> mgz_, rogpeppe1, thumper, wallyworld_: do any of you know if we verify the SSL certificate of the state servers when agents connect to them?  I presume we do, but I don't actually know.
<natefinch> dimitern, TheMue ^^
<rogpeppe1> natefinch: we did originally, but at some point someone added InsecureSkipVerify i think.
<rogpeppe1> natefinch: i hope that's been removed now.
<mattyw> mgz_, curtis doesn't seem to be around - any idea how I can get started on looking into that?
<rogpeppe1> natefinch: actually it does look as if we correctly verify the SSL cert of the state servers now
<rogpeppe1> natefinch: look in state/api/apiclient.go
<natefinch> rogpeppe1: blech
<rogpeppe1> natefinch: what's the blech for?
<natefinch> rogpeppe1: oh, sorry, misread what you said
<natefinch> rogpeppe1: I can't really tell from the apiclient code if it's actually verifying the certs.  I see them being passed around, but I can't figure out where they're actually being checked.
<mgz_> 's just done in the go stdlib, no?
<rogpeppe1> natefinch: they're being checked by the websocket code
<rogpeppe1> natefinch: and by the fact that we use a wss: address
<rogpeppe1> natefinch: and we add a known root CA to the config
<natefinch> ahh, ok
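[Editor's note] For readers wondering what "add a known root CA to the config" amounts to in Go, here is a generic sketch using only the standard library. It is not the code in state/api/apiclient.go; `newTLSConfig` and the server name are illustrative. The key points are that the CA goes into `RootCAs` and `InsecureSkipVerify` is left at its default of false, so the TLS handshake itself rejects certificates that do not chain to that CA.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// newTLSConfig builds a client config that trusts only the given CA.
// Verification is performed by crypto/tls during the handshake.
func newTLSConfig(caPEM []byte, serverName string) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificates found in PEM data")
	}
	return &tls.Config{
		RootCAs:    pool,       // trust anchor used to verify the server
		ServerName: serverName, // name the server certificate must match
		// InsecureSkipVerify is deliberately left false.
	}, nil
}

func main() {
	// Garbage input is rejected up front rather than yielding a config
	// with an empty trust pool.
	_, err := newTLSConfig([]byte("not a certificate"), "state-server.example")
	fmt.Println(err)
}
```

This matches mgz_'s remark that the checking itself is done by the Go standard library once the config is handed to the connection.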
<natefinch> anyone know of a way to get gtalk inside gmail to make the desktop notification mail icon thingy turn blue?  Also, what is that thing called and how do I change its settings?  It doesn't seem to have any kind of menu on it.
<perrito666> I dont think you can do that
<perrito666> that is a part of unity iirc
<mattyw> does anyone know how I could try to run a ppc build of core? I'm trying to take a look at https://bugs.launchpad.net/juju-core/+bug/1362636
<mup> Bug #1362636: ppc64el compilation error <ci> <ppc64el> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1362636>
<bodie_> my full test always times out on my pc
<bodie_> is there some way to accelerate the tests, or increase the timeout?
<mattyw> arosales, ping?
<arosales> mattyw: hello
<rogpeppe1> natefinch: just had a brief glance through lumberjack.go
<rogpeppe1> natefinch: looks great in general
<rogpeppe1> natefinch: a few minor suggestions:
<rogpeppe1> natefinch: if you specified MaxAge as a time.Duration you wouldn't need the comment and your code would be simpler, and (i think) the API a little more obvious
<rogpeppe1> natefinch: similarly, if you specified the max size in int64 bytes, you wouldn't need to mock megabytes.
<rogpeppe1> natefinch: i think that rather than returning an error if a write is too big, you'd be best off just writing it anyway
<natefinch> rogpeppe1: v1 used bytes, but then in config files you have like size = 100000000 which is illegible and error prone... and no one cares about anything smaller than a megabyte anyway.
<rogpeppe1> natefinch: i don't see any particular reason why you sort the result of oldLogFiles
<natefinch> rogpeppe1: I really appreciate the feedback btw.
<natefinch> rogpeppe1: sorting the old logfiles may be a leftover from the v1 code.  I'll look at it again
<natefinch> rogpeppe1: I thought it was so I could determine which were the N newest and keep those
<rogpeppe1> natefinch: you just scan directly through the list. you *could* break, i suppose, but that would seem like severe premature optimisation...
<natefinch> rogpeppe1: they're likely returned in last modified order, which if someone modifies an old log file might mean its last modified date is newer than the contents.
<rogpeppe1> natefinch: i can't see how the order affects anything
<rogpeppe1> natefinch: oh, i see
<natefinch> rogpeppe1: maxbackups .... right
<rogpeppe1> natefinch: yeah
<rogpeppe1> natefinch: perhaps it would be better to put the sort just before the code that relies on it
<rogpeppe1> natefinch: rather than sorting in oldLogFiles
<natefinch> rogpeppe1: yeah that's probably more clear
<rogpeppe1> natefinch: then it's more obvious why the slicing logic works
<rogpeppe1> natefinch: trivial thing: i'd put the [:l.MaxBackups] before the [l.MaxBackups], just because it's slightly nicer to slice the start before the end
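[Editor's note] The suggestion to sort right before the code that depends on the order, and then slice the kept part before the removed part, could look roughly like this. The sketch is not taken from lumberjack: `logInfo`, `newLogInfo`, and `splitBackups` are hypothetical, and treating a non-positive `maxBackups` as "keep everything" is an assumption made for this example.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// logInfo is a hypothetical record for one rotated log file.
type logInfo struct {
	name      string
	timestamp time.Time
}

// newLogInfo builds a record from a Unix timestamp in seconds.
func newLogInfo(name string, unixSec int64) logInfo {
	return logInfo{name: name, timestamp: time.Unix(unixSec, 0)}
}

// splitBackups sorts newest-first immediately before slicing, so it is
// obvious why the first maxBackups entries are the ones to keep.
func splitBackups(files []logInfo, maxBackups int) (keep, remove []logInfo) {
	sorted := append([]logInfo(nil), files...) // leave the caller's slice alone
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].timestamp.After(sorted[j].timestamp)
	})
	if maxBackups <= 0 || maxBackups >= len(sorted) {
		return sorted, nil
	}
	return sorted[:maxBackups], sorted[maxBackups:]
}

func main() {
	files := []logInfo{
		newLogInfo("a.log", 1),
		newLogInfo("c.log", 3),
		newLogInfo("b.log", 2),
	}
	keep, remove := splitBackups(files, 2)
	fmt.Println(len(keep), len(remove))
}
```

Sorting on a timestamp carried in the record, rather than on directory order, also sidesteps the problem mentioned above of an old file having a misleading last-modified time.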
<rogpeppe1> natefinch: i'm not entirely sure about the conflation of actual Logger and the serialisability of the logger config
<rogpeppe1> natefinch: i *think* i'd be happier leaving all the serialisation stuff out, and leaving it for higher layers
<natefinch> rogpeppe1: I could see splitting out the config from the logger object itself, so people won't try to do wacky stuff like change values on the fly...
<rogpeppe1> natefinch: the thing that seems a little hooky to me is the "well we'll preguess yaml and json because we know about those formats" thing
<natefinch> rogpeppe1: yeah, that's true
<rogpeppe1> natefinch: i'd just leave the config as vanilla, i think, and if people outside the package want to massage it, they're free to
<rogpeppe1> natefinch: and specify age as time.Duration and size as bytes.
<rogpeppe1> natefinch: leaving it up to higher layers to decide about sensible formatting if need be (i'd like to see 4g, 32m, for example to specify sizes, but that's really out of the domain of lumberjack)
<rogpeppe1> natefinch: great package name, BTW
<rogpeppe1> natefinch: but i do see the other side of the coin
<rogpeppe1> too
<rogpeppe1> natefinch: it forces higher layers to know about all the lumberjack config details
<natefinch> yeah... I struggled with that
<rogpeppe1> natefinch: but then again, they probably will anyway - we'd probably use juju config attributes to specify some of this stuff
<rogpeppe1> natefinch: i *think* i tend towards the "not this package's concern" p.o.v.
<natefinch> rogpeppe1: yeah, easy deserialization definitely affected the API, that's why it's megabytes and days, not bytes and time.Duration
<natefinch> rogpeppe1: I think you're right, that it shouldn't be this package's concern
<natefinch> rogpeppe1: Thanks again for the review.  It's a big help having fresh eyes on it.
<rogpeppe1> natefinch: np. it's a nice package, thanks.
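[Editor's note] The review's main API suggestion is to keep the package config "vanilla" (a `time.Duration` for age, `int64` bytes for size) and let a higher layer turn human-friendly strings such as "32m" or "4g" into those values. A sketch of that split is below. None of it is lumberjack's actual API: `Config` and `parseSize` are hypothetical, the suffixes are assumed to be binary multiples, and overflow of very large values is not checked.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// Config is a hypothetical plain config: no serialisation tags, no units
// baked into field names.
type Config struct {
	MaxSize int64         // maximum log size in bytes
	MaxAge  time.Duration // maximum age of a rotated log
}

// parseSize converts strings such as "100", "32m" or "4g" into bytes.
// It lives outside the logging package, in whatever layer reads config.
func parseSize(s string) (int64, error) {
	s = strings.ToLower(strings.TrimSpace(s))
	if s == "" {
		return 0, fmt.Errorf("empty size")
	}
	mult := int64(1)
	switch s[len(s)-1] {
	case 'k':
		mult = 1 << 10
	case 'm':
		mult = 1 << 20
	case 'g':
		mult = 1 << 30
	}
	if mult != 1 {
		s = s[:len(s)-1] // drop the unit suffix before parsing the number
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid size %q: %v", s, err)
	}
	if n < 0 {
		return 0, fmt.Errorf("negative size %d", n)
	}
	return n * mult, nil
}

func main() {
	size, err := parseSize("32m")
	if err != nil {
		fmt.Println(err)
		return
	}
	cfg := Config{MaxSize: size, MaxAge: 28 * 24 * time.Hour}
	fmt.Println(cfg.MaxSize, cfg.MaxAge)
}
```

This keeps the illegible `size = 100000000` problem out of config files while leaving the logging package free of format-specific knowledge.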
<katco> hey need a quick opinion: i'm looking to document the new harvest mode behavior, and also the update/upgrade settings. are those better in their own individual documents, or embedded in another file (architectural-overview.txt)?
<katco> oops nevermind, just reviewed my notes. looks like juju/docs is the place to be
<marcoceppi> what's the URL to download the zip from charmstore?
<marcoceppi> having a hell of a time tracking it down in the code
<arosales> mattyw around?
<arosales> I think alexisb got the power machine worked out
<arosales> re mattyw
<alexisb> cmars, you have what you need with the power box?
<alexisb> arosales, mattyw is probably gone for the day
<arosales> alexisb: ack
<abentley> natefinch: it says On-call reviewer: see calendar.  What calendar?
<natefinch> abentley: that's the joke, which calendar keeps changing.... ask thumper, he's redoing it as of this morning.  I think it'll be on the juju core team calendar... which I doubt most people can see.
<abentley> natefinch: Could I ask you to do a review?  It's verra short.
<natefinch> abentley: I have 3 minutes, so we'll see how short
<abentley> https://github.com/juju/juju/pull/629
<natefinch> :)
<natefinch> abentley: LGTM'd
<abentley> natefinch: TY.
<thumper> waigani: https://github.com/juju/juju/pull/631
<waigani> thumper: https://github.com/juju/juju/pull/632
<waigani> ;)
<waigani> thumper: good catch, I didn't write that test
<thumper> I had it fail on me this morning
<waigani> I also didn't know you could use the || in an assert like that - makes sense
<waigani> thumper: CI blocker
<cmars> thumper, mattyw and I were unable to land anything today, there's a ppc64el build error blocking
<cmars> i got access to a ppc64 and about to try to reproduce
<cmars> is davecheney around today?
<thumper> cmars: he will be later
<thumper> cmars: he normally starts in just over an hour
<cmars> ok cool
<thumper> cmars: you can do it locally
<thumper> cmars: I have reproduced the compiler error on amd64
<thumper> state/apiserver/deployer$ go test -compiler gccgo
<cmars> thumper, ah, so its a general gccgo issue
<thumper> yep
<thumper> unlikely to be power specific
<thumper> should find out when it last passed, and what the change was
<cmars> git bisect might be helpful there
<thumper> damn, how do I get git log to show me the diff
<thumper> for revisions
<cmars> gitk might be best to browse that
<cmars> ugly but useful
<perrito666> yup or gitg, which is slightly less ugly but also less useful
<thumper> so git log won't show me a diff for the revision?
<perrito666> i dont think so, it should just tell you the commit message and some other metadata
<perrito666> thumper: what exactly are you trying to do?
<ericsnow> I've found qgit to be a lot nicer
<thumper> I want to look at the files changed for every commit
<thumper> I know what I'm looking for (ish), I just want to see the commits
<perrito666> thumper: apparently -p does that
<thumper> nope
<thumper> ah...
<thumper> hang on
<perrito666> --stat
<perrito666> that seems to produce a very useful output
<perrito666> I use that kind of output for pull and it is actually very informative
<cmars> looks like it is passing now. something must have landed to fix?
<thumper> cmars: that is the 1.20 branch
<cmars> oh
<perrito666> cmars: yes, that is certainly very confusing
<cmars> good grief, is there a way to see more build history for http://juju-ci.vapour.ws:8080/job/run-unit-tests-trusty-ppc64el/
<cmars> i know where it is on the filesystem... grr
<perrito666> cmars: jenkins is not actually finding it
<perrito666> I tried going to a previous job by hand and I get 404
<cmars> hmm
<thumper> hmm...
<thumper> ok, I have a commit I want to test
<thumper> how do I revert the tree to a particular commit?
<perrito666> thumper: you can use co or revert
<perrito666> sorry s/co/checkout
<cmars> not revert
 * thumper nods
<perrito666> aghh effing git commands
<perrito666> thumper: apologies I meant to say reset
<perrito666> which is like svn revert
<perrito666> and those always get mixed in my head
<thumper> why is git show <rev> for a commit not showing me the diff?
<thumper> --stat shows lots of files
<thumper> but no diff
<thumper> ok, definitely have the error
<perrito666> thumper: is a merge
<thumper> yes
<perrito666> thumper: you dont get diff on merges
<thumper> I want to see the diff as a result of the merge
<thumper> yes you do...
<thumper> grr
<thumper> dumb git
<perrito666> thumper: one of the lines from show say merge blah and bleh
<perrito666> git diff those two
<perrito666> thumper: let me correct myself, you should, git sucks
<cmars> thumper, i'm running an automatic bisect, will let you know how it goes
<thumper> doing that now
<thumper> cmars: I have the revision
<thumper> looking at the change
<cmars> oh cool
<thumper> 3ebb3a1edbccd8e6c4211b2f5b9e1fd6d518d82a
<perrito666> thumper: I presume that merge in internal terms for git adds actual git nodes doesnt do an actual merge of diffs
 * perrito666 never bothered to actually check how git internally works
<thumper> the problem is that the code is perfectly fine, just triggering a bug in gccgo
 * thumper sighs
<thumper> hmm, ok not that bit
<thumper> I have a bad feeling about this
<thumper> hahaha
<thumper> omg
 * thumper grunts
<waigani> well don't leave us hanging...
<alexisb> waigani, I was thinking the same thing
 * perrito666 eats popcorn and reads
<thumper> here is the code that was removed:
<thumper> -                               // TODO(dfc) comparing the two interfaces caused a compiler crash with
<thumper> -                               // gcc version 4.9.0 (Ubuntu 4.9.0-7ubuntu1). Work around the issue
<thumper> -                               // by comparing by string value.
<thumper> -                               if names.NewMachineTag(parentId).String() == authEntityTag {
<thumper> it was replaced by a line that compared two interfaces
<thumper> well
<thumper> one interface and one type
 * thumper pokes
<waigani> lol
<perrito666> I remember that one
<cmars> thumper, bisect tells me the breaking change is 41e8f0a7bf33d3b22a7ccf0949e988c834c4eeac
<cmars> and i confirm this with gccgo on 41e8f0a7bf33d3b22a7ccf0949e988c834c4eeac vs 41e8f0a7bf33d3b22a7ccf0949e988c834c4eeac~1
<thumper> cmars: didn't you trust me?
<cmars> i did, but i wanted to see my bisect work :)
<cmars> trust but verify? :)
 * perrito666 never saw "no" said so elegantly
<thumper> cmars: ok, I'll give you that
<cmars> funny thing is, it has a conditional *very* similar to the one you pasted up there
<thumper> oh this is so fucked
<cmars> for some value of 'this'
<thumper> ok I have a fix
 * thumper runs all apiserver tests with gccgo
<thumper> gccgo needs work
<alexisb> thumper, that is why we are investing in golang
 * thumper nods
<thumper> I was just about to say something about that
<thumper> beautiful day here today, want to take the dog for a walk around ross creek at lunch time
 * cmars misses beautiful dunedin now. 100F outside and all the grass is dead
<thumper> it is about ...
 * thumper calculates
<thumper> 50°F
<perrito666>  now that is something useful we can teach mup
<thumper> so quite cool
<cmars> i'd take it :)
<perrito666> cmars: interesting we had a couple of days like that a few days ago
<perrito666> the only issue is that we are in winter
<cmars> perrito666, aw man, that's not fair at all. sounds like our winters
<perrito666> but was an interesting change, it is quite hard to actually store summer clothes in winter
<thumper> cmars, perrito666: https://github.com/juju/juju/pull/633
<cmars> fixing the compiler would be best, but i wonder, if we could walk the AST to look for this bug ahead of time, to prevent this from breaking the build
<thumper> confirmed passes tests locally with gc and gccgo
<thumper> for apiserver at least
<cmars> it's comparing two interfaces that triggers the compiler bug?
<thumper> hang on
<thumper> I think I can simplify
<thumper> cmars: no, it appears to be one interface, and one concrete type
 * thumper pushing
<thumper> cmars: https://github.com/juju/juju/pull/633/files
<perrito666> thumper: isnt tag == authEntityTag  blowing?
<thumper> perrito666: no, because they are both interfaces
<thumper> it appears to be when one is a concrete type, and one is an interface
<thumper> where the concrete type implements the interface
<thumper> not the pointer to the concrete type
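The distinction described above can be sketched in Go. The `Tag` interface and `MachineTag` type here are hypothetical stand-ins, not the real juju `names` package. The sketch only shows the three comparison shapes under discussion: concrete value against interface, interface against interface, and the string workaround. All three are valid Go and agree under a conforming compiler; the sketch does not claim to reproduce the gccgo crash, which per the discussion also involved nested funcs and closures.

```go
package main

import "fmt"

// Tag is a hypothetical stand-in for an interface such as a juju tag.
type Tag interface {
	String() string
}

// MachineTag is a hypothetical concrete type. The value type (not the
// pointer) implements Tag, matching the case described in the chat.
type MachineTag struct {
	id string
}

func (t MachineTag) String() string { return "machine-" + t.id }

// sameMixed compares a concrete value directly with an interface value.
// This is the shape reported to trip the compiler bug.
func sameMixed(auth Tag, parentID string) bool {
	return MachineTag{id: parentID} == auth
}

// sameAsInterfaces converts the concrete value to the interface first,
// so both operands are interfaces.
func sameAsInterfaces(auth Tag, parentID string) bool {
	var parent Tag = MachineTag{id: parentID}
	return parent == auth
}

// sameByString is the older workaround: compare string forms instead.
func sameByString(auth Tag, parentID string) bool {
	return MachineTag{id: parentID}.String() == auth.String()
}

func main() {
	var auth Tag = MachineTag{id: "0"}
	fmt.Println(sameMixed(auth, "0"), sameAsInterfaces(auth, "0"), sameByString(auth, "0"))
}
```

Converting to the interface before comparing keeps the comparison semantics while avoiding the mixed form; the string comparison is coarser, since it relies on `String()` being unique per value.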
<perrito666> that is a good thing to mail to the list for people to keep an eye on it
<thumper> agreed
<perrito666> btw, isnt there a bug filed for that in gccgo? perhaps a reference to it in the comments would be useful so future maintainers can know when to remove the workaround
<cmars> ok, let me pull and restart the tests
 * thumper shrugs
<thumper> perrito666: I'll ask dave in the standup
<thumper> cmars: sorry this blocked you and matty so much today
<cmars> thumper, no problem. it's a good reminder to check gccgo locally. although, it's nice to have access to power8 now, in case we need it in the future
<alexisb> thumper, I pointed wallyworld to a spreadsheet today that says you have access to multiple power vms
<alexisb> but there was not access info
<alexisb> it would be nice to share the info with the whole team
<alexisb> https://docs.google.com/a/canonical.com/spreadsheets/d/1_y3BM1Fcxmc_niIMrNvqtrzOl23vrX1DdeoQQqTejbg/edit?usp=sharing
<thumper> I've forgotten mostly how to get there ... :)
<thumper> sure...
<thumper> however mostly gccgo problems can be caught locally
<alexisb> that way we can have power access in US timezones when there is an issue
<thumper> people just don't know how
<alexisb> well education would help too :)
<thumper> I included that in my email to the list
<alexisb> cool, thanks
<alexisb> cmars, thank you for driving help with that bug today!
<mwhudson> thumper: is there a gccgo bug report for that?  want me to file one?
<thumper> cmars: I'll check with dave if there is a bug fix
<thumper> mwhudson: oh hai
<thumper> mwhudson: I'll see if dave has done one already first
<mwhudson> ok
<thumper> would be good to get a minimal test case
<thumper> which I think I have a good grip on now
<mwhudson> yeah, that was going to be my next question :)
<thumper> waigani: with you in a sec
<thumper> waigani: just testing this bug
<waigani> thumper: okay, I'll just keep the hangout open in bg
<waigani> thumper: nice work on getting the bug!
<waigani> thumper: dave is here
<mwhudson> thumper: i don't see a fix flicking through gofrontend commits
<thumper> rogpeppe1: your suggestion works, and is less intrusive, ta
<rogpeppe1> thumper: cool, np
<thumper> rogpeppe1: we are looking at creating a simple reproduction of the error
<thumper> seems to be only with nested funcs and closure issues
<thumper> so... not simple
<rogpeppe1> thumper: ah
<waigani> what are the system-y tests - mentioned in the team lead minutes?
 * thumper takes the dog for a walk
<thumper> bbl
#juju-dev 2014-08-29
<wallyworld> katco: good evening, not that you should be still be working now, but i've commented on your PR. so you have something to look forward to when you wake up tomorrow :-)
<axw> wallyworld: would you be satisfied if I just doc commented syncToolsAPI? i.e. say that it's a subset of state/api/Client?
<axw> (thanks for review btw)
<wallyworld> axw: yep, sure
<wallyworld> just to assist the casual reader
<axw> cool
<axw> yup
<davecheney> thumper: http://paste.ubuntu.com/8174218/
<thumper> my fix at the end was very like that
<davecheney> ok, let me see if I can make a repro out of it
<thumper> wallyworld: that is a long email, will read it fully later, but... good effort
<wallyworld> thumper: yeah, it is long, sorry
<thumper> don't be sorry
<wallyworld> i'm not, well, that's what i say to the girls
<davecheney> wallyworld: that's a yellow card mate
<bodie_> davecheney, any word on pr 617?
<bodie_> just when you have a minute
<davecheney> lets keep the innuendo below the audible level
<davecheney> bodie_: i haven't looked at it
<davecheney> i'm not the on call reviewer today
<wallyworld> it's friday, gotta have a little fun
<bodie_> gotcha -- can you leave a comment or something to the effect that it's cool if someone else has a look then?  I don't want this thing to get trapped in limbo
<bodie_> I just have a bunch of stuff depending on it https://github.com/juju/juju/pull/617#discussion_r16817826
<bodie_> just since you'd left a negative comment
<thumper> wallyworld: I have to agree with davecheney on this...
<wallyworld> sigh
<thumper> sorry dude
<bodie_> and since we're on totally different timezones, and nobody else will respond to it since you'd left a comment
<thumper> there is a time and place, but this isn't it
<wallyworld> thumper: it's not like you haven't been "guilty" before :-P
<thumper> wallyworld: doesn't make it right
<thumper> I'm happy to be called out on it
<bigjools> I have to agree, wallyworld should be more careful
<bigjools> I mean, if you're going to try and make someone take offence, do it properly
 * bodie_ furiously scribbles notes
 * wallyworld makes a mental note who has a thin skin for next time
<davecheney> wallyworld: casual sexism isn't about having a thin skin
 * bodie_ compares notes with wallyworld and rebases his changelist
<wallyworld> it wasn't sexism, but anyway
<wallyworld> define sexism: prejudice, stereotyping, or discrimination, typically against women, on the basis of sex.
<wallyworld> anyways, let's move on
<bodie_> well, where were we.  oh yes.  so, I've noticed that when a core member on a 12 hour offset timezone comments on something, other team members are hesitant to add their two cents to discussion, and thus my review gets trapped in the equivalent of a HTTP syn timeout on a 12 hour latency window
<bodie_> I chatted about the PR with TheMue and alexis today and both indicated they wanted davecheney to be the one to respond to that comment on 617
<davecheney> thumper: http://paste.ubuntu.com/8174325/
<davecheney> repro for the gccgo bug
<davecheney> filing an issue upstream now
<bodie_> if someone could at least acknowledge what I'm saying, maybe I wouldn't wonder as much whether the cold shoulder has anything to do with wanting to keep us out of the loop
<thumper> I'd say more casual sexual harassment rather than sexism... however... moving along
<davecheney> https://code.google.com/p/go/issues/detail?id=8612
<bodie_> or, you know, you could post a three word reply on that pr, such as "someone else take this" and it would be totally cool
<bodie_> oops, that was 4...
<thumper> davecheney: cool, ta
<perrito666> cool email wallyworld, thanks for taking the time
<wallyworld> np
<bodie_> so I'm taking that as a "no, your PR is going to sit in limbo until I'm the next on call reviewer, and/or you're on my ignore list now"
<perrito666> I really need to go live a few weeks to the countries where you guys are from so I get to learn local english
 * perrito666 feels like the hispanic comic relief on an adam sandler movie half the time
<bodie_> LMAO
<bodie_> perrito666, you're cool by me, I never notice your accent ;)
<perrito666> bodie_: trust me I sound terrible when speaking for long periods of time, I am really bad at managing my air and I get really exasperated when trying to speak in english at the same speed than spanish
<davecheney> bodie_: nope, that's not it
<davecheney> i was pulled off onto another crash this morning
<davecheney> the OCR reviewer rotates daily to avoid anyone being on the critical path
<bodie_> I just don't like feeling like I'm talking into thin air at times
<bodie_> but I understand you guys are totally slammed
<bodie_> just frustrating trying to push our stuff through when nobody will respond to a thread because someone who's only online when it's 10pm where I live owns the review
<bodie_> perrito666, english was invented for people who mutter under their breath :P
<davecheney> bodie_: i think you need to raise that with alexis
<bodie_> yeah, I did this morning
<davecheney> bodie_: i've said the same to her in the past
<davecheney> i'm not sure what I can do
<bodie_> davecheney, she and TheMue both said they wanted the response on that thread to come from you
<davecheney> nah, that can't be how it works
<bodie_> you could just say something like "I'm content for someone else to look at this" or "bodie_, I'll be able to review it next thursday" or something other than... ya know, nothing
<davecheney> he told me that martin is responsible for landing your stuff
<davecheney> so, all i know is everyone knows a different story
<bodie_> TheMue has replaced martin for working with the actions team
<bodie_> I'm just trying to make things work, man :)
<bodie_> if we have a blocker, I need to figure out how to get around it
<davecheney> bodie_: yup, i hear you
<perrito666> ok, lets do this I have a decent overlap with all of you guys, mostly because I have no life and I hardly sleep, bodie_ could you privmsg me with a brief on what you need? davecheney could you tell me what alexis told you so I can try to unlock it tomorrow?
<bodie_> maybe you could either respond to that comment, or make a mention to them in #jujuskunkworks, either of which will take no more than 20 seconds
<bodie_> that way they won't feel like you own the thread
<bodie_> or, I can pass it along tomorrow, or I can wait for your response when you're able
<bodie_> I just want to know what to expect so I can be effective here
<bodie_> lol
<davecheney> thumper: good news
<davecheney> upstream have a fix for the gccgo crash
<davecheney> now comes the hard part ...
<thumper> getting it into ubuntu?
<davecheney> :P
<davecheney> thumper: https://code.google.com/p/gofrontend/source/detail?r=75739d377426
<davecheney> we have a fix
<davecheney> what's the best way to report this ?
<davecheney> an issue on ubuntu/gccgo ?
<thumper> I guess
<davecheney> right-o
<davecheney> i'll try that
<davecheney> might need you and sinzui to ride shotgun on this one
<thumper> who is in charge of gccgo?
<thumper> inside canonical?
<davecheney> dokku maybe ?
<davecheney> the lack of an owner probably explains why the last fix hasn't landed for 5 months
<mwhudson> thumper, davecheney: i have some gccgo patches i want in ubuntu too
<mwhudson> thumper, davecheney: i think the official process is "hound doko relentlessly"
<thumper> mwhudson: I suggest we push through alexisb to pat
<thumper> mwhudson: who does doku report to?
<mwhudson> thumper: slangasek i think
<mwhudson> btw, there is already a newer gccgo in trusty-proposed
<mwhudson> have you guys tried that?
<mwhudson> (it obviously doesn't fix this problem that just got fixed, but doko will probably ask...)
<davecheney> mwhudson: thumper https://bugs.launchpad.net/ubuntu/+source/gccgo-4.9/+bug/1362906
<mup> Bug #1362906:  internal compiler error: in comparison, at go/gofrontend/expressions.cc:6508 <gccgo-4.9 (Ubuntu):New> <https://launchpad.net/bugs/1362906>
<davecheney> mwhudson: nope, wasn't fixed until 30 minutes ago
<davecheney> mwhudson: pls mail thumper and I with your bug numbers
<davecheney> i'll add mine
<davecheney> and it'll be another thing on alexisb 's plate
<mwhudson> https://bugs.launchpad.net/ubuntu/+source/gcc-4.9/+bug/1361940
<mup> Bug #1361940: patches for cgo on arm64 <patch> <gcc-4.9 (Ubuntu):New> <https://launchpad.net/bugs/1361940>
<mwhudson> davecheney: this is all for cgo though, that probably shouldn't be on alexisb's plate?
<davecheney> if it doesn't affect juju or other canonical developed products
<davecheney> probably not
<mwhudson> docker docker docker
<davecheney> docker docker docker malkovich docker
<ericsnow> axw: do you mind if I add another 10 files and 500 lines to that backups PR <wink>
<bodie_> lol
<ericsnow> axw: but seriously, thanks (to you, davecheney, and waigani) for the reviews
<ericsnow> the rest of my backups patches are *much* smaller
<waigani> ericsnow: welcome
<ericsnow> who controls the juju account on github?
<axw> ericsnow: :)
<axw> nps
<axw> ericsnow: all the team leads are owners I believe
<ericsnow> axw: okay
<ericsnow> axw: I'll bug nate about it tomorrow
<ericsnow> axw: nps?
<axw> ericsnow: no problems
<ericsnow> axw: I figured as much
<wallyworld> axw: thanks, i was wondering about wording of that message. i wanted to explicitly call out that there could be an instance running; but as you say, i think mentioning destroy --force will be enough
<axw> cool
<wallyworld> axw: i think testing failure bug 1202039 is either fixed, or will be with your current work?
<mup> Bug #1202039: Provider tests are architecture-specific <arm64> <feature> <i386> <ppc64el> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1202039>
<axw> wallyworld: I think they're already working on i386 and ppc64 right? so it's fixed already?
<axw> CI runs the tests on i386 and ppc64 I'm pretty sure
<wallyworld> axw: yeah, that was my thinking too, i'll mark as fix committed
<wallyworld> just wanted to double check
<mattyw> morning all
<TheMue> morning
<TheMue> dimitern: ping
<dimitern> TheMue, hey
<TheMue> dimitern: as I as OCR cannot review myself would you mind to take a look at https://github.com/juju/juju/pull/626 ;)
<TheMue> dimitern: and morning btw :D
<dimitern> TheMue, sure :)
<TheMue> dimitern: thx
<axw> wallyworld: just noticed I cannot edit the bugs doc
<axw> would like to strike out a line
<wwitzel3> fwereade: ping
<wwitzel3> wallyworld: got time to chat about a test I want to try and fix?
<rogpeppe1> anyone know how to get a list of all charms? i used to use http://manage.jujucharms.com/api/2/charms?text= but it doesn't seem to produce many results now
<wwitzel3> rogpeppe1: manage.jujucharms.com is still how I know how to get a list of them
<rogpeppe1> wwitzel3: do you have a better URL than that?
<rogpeppe1> wwitzel3: i only get about 34 out of that
<wwitzel3> rogpeppe1: sounds like you're getting trusty only
<rogpeppe1> wwitzel3: actually 20 only
<wwitzel3> oh .. hrmm
<wwitzel3> there are 38 charms for trusty .. so if you're only getting 20
<rogpeppe1> wwitzel3: it looks like there's a limit kicking in, but adding limit=100, for example, doesn't give me any more
<rogpeppe1> wwitzel3: have you actually got a current-ish list, by any chance?
<wwitzel3> rogpeppe1: https://manage.jujucharms.com/charms/trusty
<wwitzel3> rogpeppe1: that's the HTML version if that helps anyway
<rogpeppe1> wwitzel3: it's a hassle (i'll have to manually scrape the html) but better than nowt
<wwitzel3> rogpeppe1: sadly it doesn't do an ajax request (that I can see) to get the results
<wwitzel3> rogpeppe1: was hoping it might be using the API to render that list
<rogpeppe1> wwitzel3: yeah
<rogpeppe1> wwitzel3: it's annoying because it *did* work once. i even wrote a little program to automate getting all charms.
<wwitzel3> rogpeppe1: yeah, according to the lp repo for that API it hasn't changed since 2013-10
<wwitzel3> rogpeppe1: but something must have :/
<rogpeppe1> wwitzel3: well, the last-modified time on my getcharms.go file is 2013-08, so the 2013-10 change might've broken it i guess
<rogpeppe1> wwitzel3: which lp repo is it, BTW?
<wwitzel3> rogpeppe1: also does this give you better results http://manage.jujucharms.com/api/3/search ? that isn't technically an unstable API, but if 2 stopped working for you
<wwitzel3> rogpeppe1: https://code.launchpad.net/charmworld
<rogpeppe1> wwitzel3: i only see 41 results there
<rogpeppe1> wwitzel3: (assessed by searching for "files": in the JSON :-))
<wwitzel3> rogpeppe1: http://bazaar.launchpad.net/~juju-gui-bot/charmworld/trunk/view/head:/charmworld/search.py#L141
<wwitzel3> rogpeppe1: I feel like someone else was running in to this 20 limit thing recently
<rogpeppe1> wwitzel3: i tried using limit=9999
<dimitern> TheMue, reviewed
<dimitern> TheMue, ping me if something is not clear please :)
<TheMue> dimitern: yeah, seen a lot of it (and commented some back). thanks a lot, will mark it as wip and continue
<dimitern> TheMue, cheers!
<wwitzel3> rogpeppe1: yeah, sorry I couldn't get it working, the code itself seems to respect limit, but then overrides it somewhere with that default
<rogpeppe1> wwitzel3: yeah, i couldn't see where that happens.
<rogpeppe1> wwitzel3: and the limit doesn't always appear to be 20 though
<dimitern> TheMue, about the dummy provider running "unsafe" networker - it certainly makes sense to use "safe" by default, but there should be a way to test the "unsafe" one if needed - we can isolate what the networker does from the system (config, interfaces), so it's fine
<TheMue> dimitern: is it ok to do this in a second, follow-up PR? this one already grows and grows. ;)
<wwitzel3> rogpeppe1: yeah, if you leave it off all together, it seems to be 41
<dimitern> TheMue, i'm not saying to add more tests, just to tweak the dummy provider a bit so RequiresSafeNetworker can return true or false, just by adding a useSafeNetworker bool in the dummy provider struct and a SetUseSafeNetworker(bool) method on the dummy provider to set that flag
<dimitern> TheMue, it's like 5-6 more lines :)
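The tweak suggested above could look roughly like the following Go sketch. The `dummyProvider` struct here is a hypothetical, stripped-down stand-in, and the exact signature of `RequiresSafeNetworker` in juju is an assumption; only the names `useSafeNetworker`, `SetUseSafeNetworker` and `RequiresSafeNetworker` come from the discussion.

```go
package main

import "fmt"

// dummyProvider is a hypothetical, minimal stand-in for the dummy
// provider; the real type carries far more state.
type dummyProvider struct {
	// useSafeNetworker selects which networker mode tests exercise.
	useSafeNetworker bool
}

// newDummyProvider defaults to the "safe" networker, as proposed.
func newDummyProvider() *dummyProvider {
	return &dummyProvider{useSafeNetworker: true}
}

// SetUseSafeNetworker lets a test opt in to the "unsafe" networker.
func (p *dummyProvider) SetUseSafeNetworker(safe bool) {
	p.useSafeNetworker = safe
}

// RequiresSafeNetworker reports the currently configured mode.
// The argument-free signature is an assumption made for this sketch.
func (p *dummyProvider) RequiresSafeNetworker() bool {
	return p.useSafeNetworker
}

func main() {
	p := newDummyProvider()
	fmt.Println(p.RequiresSafeNetworker())
	p.SetUseSafeNetworker(false)
	fmt.Println(p.RequiresSafeNetworker())
}
```

A test that flips the flag would normally restore it afterwards so later tests still see the safe default.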
<rogpeppe1> wwitzel3: actually no, with a more accurate count, it always seems to be 20
<rogpeppe1> wwitzel3: (counted with http://paste.ubuntu.com/8177535/)
<TheMue> dimitern: ah, ok, haven't been sure that in this case the "unsafe" (/me still not really happy with that term) doesn't make any harm to the own system
<dimitern> TheMue, should we do a quick standup?
<TheMue> dimitern: yes, already coming
<perrito666> wallyworld: is there any assigning system for the tests that need rework?
<wallyworld> perrito666: raise a bug and assign yourself to it. cross out the item in the doc so others can see there that that one is taken
<wallyworld> many failing tests already have bugs
<wallyworld> so if you go to raise a bug, it should tell you if there is one already
<wallyworld> include the test name in the bug description
<wallyworld> so that it has a better chance of finding any dupe
<perrito666> I guess that created bugs are more important than the ones not yet there
<wallyworld> not necessarily
<wallyworld> there's failures that people haven't created bugs for
<wallyworld> and i found today that bugs i looked at had already been fixed
<wallyworld> but the bugs were still open
<wallyworld> so pick a failure that you think you can fix - all fixes are good
<katco> wallyworld: interesting, i can't add an event to the team calendar =/
<wallyworld> katco: that's no good, that means you are not allowed to have the holiday
<katco> wallyworld: you are a funny one ;)
<wallyworld> oh, you think i was joking
<wallyworld> :-P
<katco> wallyworld: we're planning on going to our botanical garden's japanese festival :)
<wallyworld> cool, sounds fun
<katco> oldest and largest in the country!
<wallyworld> well, that's something i didn't know
<katco> http://www.missouribotanicalgarden.org/things-to-do/events/signature-events/japanese-festival.aspx
<katco> oops, says "one of" the oldest/largest
<wallyworld> ah, you americans always exaggerate
<wallyworld> world series baseball = usa plus canada
<perrito666> I am amazed on how US people actually measure things in order to be able to have "the largest" for almost anything :p
<katco> hey we are THE BEST at exaggerating. no one is better, anywhere.
<perrito666> wallyworld: its like in columbus times "the known world"
<katco> and me? i'm the most humble person alive.
<wallyworld> you too?
<wallyworld> we can't both be
<katco> hey i'm WAY more humble
<wallyworld> of course
<katco> lol
<wallyworld> you have reason to be
<katco> LOL
<katco> so nice to work with people who appreciate a dry sense of humor :)
<wallyworld> don't know what you mean
<perrito666> lets say each of you is the humblest in his/her timezone
<katco> what did you ask me yesterday? have i told you today?
<wallyworld> lol
<wallyworld> no you haven't
<katco> fark off! haha
<wallyworld> :-( you hate me
<perrito666> wallyworld: dont be like that, we all hate you, its because you are a manager
<wallyworld> perrito666: oh, that's a low blow
<katco> haha
<wallyworld> way to hurt a guy
<katco> no, wallyworld, i definitely don't hate you. i quite like working with you, axw, and everyone i've had the pleasure to work with so far
<katco> with the exception of mgz_, traitor!
<wallyworld> katco:  i was joking :-)
<perrito666> katco: nahh, dont piss off the guy that can lock your commits
<mgz_> ;_;
<katco> haha
<katco> mgz_: i kid, i kid!
 * perrito666 accidentally spills yerba all over his kb
<wallyworld> wwitzel3: sorry i missed your ping before, i was at soccer. did you still have a question?
<katco> need to cycle my server, brb
<perrito666> wallyworld: I cant write the doc either
<wallyworld> perrito666: i changed the permissions, have you tried refreshing?
<perrito666> tx
<natefinch> perrito666: you around?
<wwitzel3> wallyworld: yeah, I was going to try to fixup a test from the list
<perrito666> natefinch: I am, I thought you were not here
<natefinch> perrito666: yeah, sorry, busy morning.  I'm in the hangout on the calendar event
<wallyworld> wwitzel3: ok, or it can be any other one you may have come across that isn't there
<wallyworld> or it can be improving the cmd/juju tests to use mocks
<mattyw_> dimitern, do you have 10 minutes spare for a quick question, possibly via a hangout
<dimitern> mattyw_, sure, just give me 15m
<mattyw> dimitern, great, just ping when it's a good time
<perrito666> natefinch: interesting fact, cold medicine hinders my already low spoken english skills :p
<natefinch> perrito666: haha
<natefinch> perrito666: btw, it's "beats around the bush"
<perrito666> ahh well close enough
<natefinch> yeah, I knew what you meant
<perrito666> natefinch: it is hard to translate those, the spanish one is nothing like it
<natefinch> yeah... those kind of sayings are kind of crazy, because the actual words don't make any sense
<perrito666> literal translation of spanish one is "dodges the lump"
<perrito666> to give you an idea
<natefinch> rofl
<perrito666> of how far it is
<perrito666> omg, even though its not ergonomic I must say, thinkpad keyboard that emulates the one in the laptop is awesome
<perrito666> I wish I had found a newer one, I actually got one ibm branded :p
<natefinch> haha
<perrito666> I also wish it was bt, this one actually has a 2.5M cable :p
<katco> is there a place we could put consts representing config key-strings? seems silly and error-prone to repeat them all over the codebase.
<dimitern> katco, how about in environs/config ?
<katco> dimitern: that seems like a fine place
<katco> dimitern: could probably even put it in environs/config/config.go
<katco> dimitern: is this a small enough shift that i can just introduce this concept? :)
<dimitern> katco, go for it :)
<katco> dimitern: awesome, thanks for the input! :)
<marcoceppi_> hook execution question
<marcoceppi_> do hooks in a service group execute serially or is that only on the unit level?
<katco> boy documenting all public values seems silly sometimes
<katco> 	// ProvisionerHarvestModeKey stores the key for this setting.
<katco> 	ProvisionerHarvestModeKey = "provisioner-harvest-mode"
<perrito666> ahh tautological comments
<perrito666> katco: you could expand on what exactly that key means
<katco> i know... i understand wanting good documentation, but an absolute rule doesn't make sense
<katco> perrito666: that's already covered in the actual setting
<katco> perrito666: this is literally the key to get the setting in the map
<katco> i especially dislike repeating the name of the thing in the comment. hungarian notation all over again.
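A minimal Go sketch of the idea discussed above: define the key string once as a constant and look settings up through it. Only `ProvisionerHarvestModeKey` and its value come from the chat; the `harvestMode` helper, the `map[string]interface{}` attribute shape and the `"example-mode"` value are assumptions for illustration.

```go
package main

import "fmt"

const (
	// ProvisionerHarvestModeKey stores the key for this setting.
	ProvisionerHarvestModeKey = "provisioner-harvest-mode"
)

// harvestMode is a hypothetical helper that reads the setting from an
// attribute map. It reports false if the key is absent or not a string.
func harvestMode(attrs map[string]interface{}) (string, bool) {
	v, ok := attrs[ProvisionerHarvestModeKey].(string)
	return v, ok
}

func main() {
	// "example-mode" is a placeholder value, not a real harvest mode.
	attrs := map[string]interface{}{ProvisionerHarvestModeKey: "example-mode"}
	mode, ok := harvestMode(attrs)
	fmt.Println(mode, ok)
}
```

With the constant in place, a mistyped key becomes a compile error rather than a silently missing map entry.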
<bac> hi marcoceppi_
<marcoceppi_> hey bac
<bac> marcoceppi_: lazyPower_ has filed a bug regarding supporting tags as a constraint for MAAS.  you've seen it i think.
<marcoceppi_> bac: yup
<bac> marcoceppi_: in the definition tags are a comma-separated list.  until now we've marked bundles with constraints listed as a comma-separated as deprecated.  clean implementation of tags will require us to reject those deprecated bundles.
<bac> i.e. constraints: 'k1=v1 k2=v2 tags=a,b,c' is valid but 'k1=v1,k2=v2' is not
<marcoceppi_> bac: the error we're getting is tags is not a valid constraint
 * marcoceppi_ checks the bundle file
<bac> marcoceppi_: yes.  i'm trying to fix that.
<marcoceppi_> oh, I see
<marcoceppi_> one sec
<bac> marcoceppi_: to properly support tags will require not allowing comma-separation in the other constraint key value pairs
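The rule bac describes can be sketched in Go as follows. `parseConstraints` is a hypothetical function written for this illustration and is not the charm-proof or juju implementation: it assumes space-separated `key=value` pairs, treats only `tags` as a comma-separated list, and rejects commas anywhere else.

```go
package main

import (
	"fmt"
	"strings"
)

// parseConstraints is a hypothetical parser for a constraints string.
// Pairs are separated by whitespace; only "tags" may hold a
// comma-separated list of values.
func parseConstraints(s string) (map[string][]string, error) {
	out := make(map[string][]string)
	for _, field := range strings.Fields(s) {
		parts := strings.SplitN(field, "=", 2)
		if len(parts) != 2 || parts[0] == "" {
			return nil, fmt.Errorf("malformed constraint %q", field)
		}
		key, value := parts[0], parts[1]
		if key == "tags" {
			out[key] = strings.Split(value, ",")
			continue
		}
		// A comma in any other value means the deprecated
		// comma-separated pair form was used.
		if strings.Contains(value, ",") {
			return nil, fmt.Errorf("comma-separated constraints not supported: %q", field)
		}
		out[key] = []string{value}
	}
	return out, nil
}

func main() {
	fmt.Println(parseConstraints("k1=v1 k2=v2 tags=a,b,c"))
	fmt.Println(parseConstraints("k1=v1,k2=v2"))
}
```

Rejecting the old form outright keeps the parser unambiguous, at the cost of breaking any bundle still using it, which is the trade-off being weighed in the conversation.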
<marcoceppi_> bac: maybe constraints are all wrong
<bac> is it possible to run charm-proof against the set of all bundles to see if it is even an issue?
<marcoceppi_> maybe they should be a YAML list
<marcoceppi_> bac: I can download them all and see
<bac> marcoceppi_: that would be great
<marcoceppi_> constraints:
<marcoceppi_>   - key: v1
<marcoceppi_>   - key1: v2
<marcoceppi_> hazmat: ^^
<bac> marcoceppi_: that may be the best long term solution but i thought you needed a fix now.
<marcoceppi_> well, I'll look at all the bundles we have
<marcoceppi_> and see what each is doing for constraints
<bac> changing proof rules already touches way too many parts...
<TheMue> oh, bugfix by accident, nice. field of a struct has been prepared but never been set. now I needed it.
<perrito666> TheMue: lol
<TheMue> perrito666: yeah, gave me a panic during tests :D
<perrito666> TheMue: which one (bug)? just in case
<TheMue> perrito666: it never has been filed. it has been a field for checking the authorization of reading a machine
<TheMue> perrito666: the function (it's a function type) has been created, but the field never set
<TheMue> perrito666: and I now wanted to check if I can read => *bang*
<perrito666> ouch, I wonder if it's there on the bugs ian added to the doc which are not all reported
<TheMue> have to check
<natefinch> Anyone up for reviewing my log rotation change?  It's pretty small: https://github.com/juju/juju/pull/512
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1363130 1363143
<mattyw> natefinch, I'll take a look
<perrito666> mattyw: what is a vanity url?
<mattyw> github.com/perrito666/projectname
<mattyw> perrito666, A url with someones name in it
<mattyw> perrito666, I'm not happy with the name vanity url either :)
<mattyw> but that's what they're called in the mailing list
<mattyw> and who am I to go against consensus?
<perrito666> I would think a vanity url is something like beautifulpopleonly.com/perrito666/thewonderfulwonderfulman/
<perrito666> :p
<perrito666> now that is a vanity url
<jcw4> perrito666: can you get me one like that too?
<perrito666> :p
<mattyw> perrito666, I'd be happy with that one - at least make it obvious
<mattyw> perrito666, what I'm happy and not happy with is inconsistent and changes daily
<perrito666> we might want to rethink our mentor policy to not be 12 hours distance with our reviewers
<jcw4> perrito666: quit complaining... you're on here 24/7 anyway
<perrito666> jcw4: Id say around 20
<perrito666> but yes
<jcw4> :)
<perrito666> its the corners of the week which are a problem, my mentor will not be here until sunday (my sunday)
<jcw4> yeah, that's a bummer.
<mattyw> perrito666, how many corners does a week have? I thought they were circular?
<mattyw> they feel circular sometimes
<jcw4> I have to say though, working with folks all over the globe make it more than worth it for me :)
<perrito666> if weeks were circular in two days it would be last monday
<jcw4> spiral then
<perrito666> uff, this bug is a heavy thing to pinpoint, I am pretty sure I know what is going on, I am not so sure on the how part
<bodie_> my weeks have no corners, only hills and valleys
<natefinch> perrito666, mattyw: a vanity url, in the context of Go is a url which is not the same as the url to the repo.  So, for example, the package that gets imported when you do import "labix.org/v2/mgo"  is actually hosted on launchpad, so the real url is launchpad.net/mgo (or something like that).
<mattyw> natefinch, I didn't know that - thanks very much
<natefinch> mattyw, perrito666: Go supports this through the go tool which, if it doesn't recognize the url of the import path, will fall back to trying to get a web page from that URL, and it'll look for a header that looks something like <meta name="go-import" content="labix.org/v2/mgo bzr https://launchpad.net/mgo">
<natefinch> It's pretty nice because it means that if you control the domain, you can move your code from one host to another and not break anyone else's code that imports using the vanity url.
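A minimal Go sketch of the mechanism natefinch describes: a server on the vanity domain answers the go tool's request with an HTML page whose head carries a `go-import` meta tag. The helper names `goImportMeta` and `vanityHandler` are hypothetical, and the prefix, VCS and repository values are the ones quoted in the chat ("or something like that"), so treat them as illustrative.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// goImportMeta builds the meta tag the go tool looks for when it fetches
// an import path whose host it does not recognise.
func goImportMeta(importPrefix, vcs, repoRoot string) string {
	return fmt.Sprintf(`<meta name="go-import" content="%s %s %s">`, importPrefix, vcs, repoRoot)
}

// vanityHandler serves a tiny HTML page containing the meta tag.
func vanityHandler(importPrefix, vcs, repoRoot string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		fmt.Fprintf(w, "<html><head>%s</head><body></body></html>",
			goImportMeta(importPrefix, vcs, repoRoot))
	})
}

func main() {
	h := vanityHandler("labix.org/v2/mgo", "bzr", "https://launchpad.net/mgo")

	// The go tool appends ?go-get=1 to the URL it fetches.
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("GET", "/v2/mgo?go-get=1", nil))
	fmt.Println(rec.Body.String())
}
```

Because importers only ever see the vanity prefix, changing the third field of the tag is enough to move the code to another host.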
<mattyw> natefinch, now I know the proper definition of vanity url my comment probably makes no sense
<mattyw> natefinch, I was just wondering if accepting a number of dependencies that had team members names in was ok
<mattyw> has anyone else seen this error using the local provider on tip? http://paste.ubuntu.com/8180167/
<perrito666> mattyw: are you sure your code is up to date?
<mattyw> perrito666, it was tip
<perrito666> mattyw: looks a lot like someone made one of those automatic changes on the code and forgot to change a return somewhere
<mattyw> perrito666, I was trying to recreate this https://bugs.launchpad.net/juju-core/+bug/1363143 but I can't even get that far
<mup> Bug #1363143: local lxc deployments fail to create machines <ci> <local-provider> <lxc> <precise> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1363143>
<perrito666> mattyw: checking
<alexisb> perrito666, the mentorship matching is my fault, we did the best we could but most of the mature skills on the team are in +12 timezone, you are one of the few with potential overlap
<perrito666> alexisb: sorry, didn't mean to make you feel bad
<perrito666> mattyw: and this blows running bootstrap right?
<mattyw> perrito666, yeah
 * jcw4 wonders about relocating to UTC+12
<jcw4> ;
<jcw4> )
<jcw4> ;) even
<perrito666> mattyw: rm /pkg/* on your go path and then recompile
<alexisb> yes perrito666 I feel terrible, I may never recover from your slam ;)
<wwitzel3> lol
<natefinch> mattyw: btw, your comment is actually correct - gopkg.in/natefinch/lumberjack.v2 *is* a vanity url.  It redirects to github.com/natefinch/lumberjack at the v2 branch.
<perrito666> this will so cost me beer on the next sprint
<alexisb> perrito666, you should feel tremendously guilty on my behalf
<alexisb> lol totally
<alexisb> o wait
<alexisb> not thumber buys the beer remember
<mattyw> perrito666, same problem
<alexisb> no, thumber buys the beer, remember?
<natefinch> close
<natefinch> one more try?
<perrito666> we can levenstein from there
<natefinch> lol
<mattyw> perrito666, I'm about to EOD to be honest, maybe we can take a look at it on monday?
<alexisb> :)
<perrito666> mattyw: sure, ping me, honestly looks like a red herring
<natefinch> mattyw: btw, the reason it's not under juju is because it's my personal code written during off hours.  Otherwise, yes, absolutely, anything written during hours is Canonical's and belongs under github.com/juju
<mattyw> perrito666, ok cool
<natefinch> mattyw: in theory, the code is MIT licensed and anyone could just fork my repo to put the code under github.com/juju
<natefinch> but some might consider that rude
<perrito666> that would remind me the early days of google chrome
<perrito666> absolutely all of the deps from the project had been imported into the project svn
<perrito666> from here it was a 3 or 4 hs checkout
<natefinch> ouch
<natefinch> good old SVN
<mattyw> perrito666, I like red herrings
<natefinch> tasty with a little lemon and butter
<mattyw> perrito666, I might be able to work it out for myself once I've slept :)
<perrito666> I only ate herring once in my life, in Amsterdam, there was this kiosk next to the canal that made raw herring sandwiches
<perrito666> mattyw: try coffee, has pretty much the same effect
<mattyw> natefinch, my point is probably pretty academic as all the code is on github anyway
<natefinch> mattyw: it's a valid point... if it were canonical's code, it should be represented as such, and under canonical's control.  But it's not, so it's not :)
<mattyw> natefinch, I just don't want to end up in the situation where core relies on changes landing in your project but you want to take the project in another direction
<mattyw> natefinch, but we can cross that bridge if it comes - and there are many ways of fixing it
<natefinch> mattyw: that would be a perfectly valid reason to fork the project, and like I said, it's MIT, you can just do that.
<natefinch> mattyw: exactly
<natefinch> mattyw: especially for a package that's used in exactly one file... it's trivial to change.
<mattyw> folks - it's time for me to call it a day
<mattyw> have a good weekend all and I'll speak to you on monday
<wwitzel3> see ya mattyw
<TheMue> so, off for today, have a nice weekend
<perrito666> TheMue: likewise
<perrito666> I might be wrong here, but does this sound like a huge isolation bug to anyone else? https://github.com/juju/juju/blob/master/juju/testing/conn.go#L426
<natefinch> perrito666: certainly seems like something that should be exposed by MgoServer not something the suite is figuring out.
<perrito666> natefinch: I am a bit worried about the shared nature of the mongo server also
<perrito666> https://github.com/juju/testing/blob/master/mgo.go#L35
<natefinch> perrito666: yeah
<natefinch> perrito666: that very well could be why some of our tests are failing... they're running at the same time and then the streams cross and boom
<perrito666> The failure I am looking at (only have a few logs) seems to be triggered by the test assuming the server is alive and trying to connect but it gets auth fails, which makes me think its hitting someone else's server
<natefinch> hmm could be
<natefinch> perrito666: this may actually not be that bad
<natefinch> perrito666: each package runs tests as an independent process
<ericsnow> anyone have a little time to give a follow-up review?
<ericsnow> https://github.com/juju/juju/pull/606
<natefinch> ericsnow: is there a reason you need to hide the fields of the archive?
<ericsnow> natefinch: I guess not...I've been trying to make things private by default and avoid exposing them if possible
<natefinch> ericsnow: for things that are just data... don't bother.... and this is definitely just data
<natefinch> ericsnow: I'm used to the same pattern from past experience, too.  But honestly, if someone wants to change the filename after they make a new Archive.... who cares?
<natefinch> it simplifies the code a bunch not to hide the values - you don't need a constructor and don't need the getters
<ericsnow> natefinch: Archive is one of the types that I actually think might be used outside of juju, so I wanted to be extra careful there to keep the interface clear
<ericsnow> natefinch: those fields are just implementation details
<ericsnow> natefinch: it's the getters that expose the structure of the archive file
<natefinch> ericsnow: it's not really an implementation detail if you're just returning what you got passed in.  It's just data
<natefinch> ericsnow: unless you need the functions to fulfill an interface
<ericsnow> natefinch: yeah, the fields are not part of the interface I care about here.
<ericsnow> natefinch: so you make a good point
<natefinch> ericsnow: it's not that the way you have it is wrong, it's just somewhat more complicated than necessary.  Not that it's actually complicated at all, of course - the functions are trivial.
<ericsnow> natefinch: yeah, I've run into that several times and it's always tempting to just make fields/types public because it simplifies things so much
<ericsnow> natefinch: in some cases I did so because keeping them private wasn't worth it
<natefinch> ericsnow: I've come to the conclusion that the only time I need to make fields private is when there are interactions between different fields that would be invalidated if the fields were updated after construction.... like if you were generating the DBDumpDir just one time in the constructor.
<ericsnow> natefinch: yeah, the caching case
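The distinction natefinch draws can be sketched in Go. The field names and the `cachedArchive` type below are hypothetical and are not taken from the PR under review; only `Archive` and `DBDumpDir` are names mentioned in the discussion. The first type is plain data, so its fields are exported. The second computes a derived value once in its constructor, so its fields stay private to keep that value from going stale.

```go
package main

import (
	"fmt"
	"path"
)

// Archive is plain data: exported fields, no constructor, no getters.
// Changing Filename after creation breaks nothing.
type Archive struct {
	Filename string
	RootDir  string
}

// cachedArchive computes dumpDir once from rootDir. If rootDir were
// exported and later modified, dumpDir would silently be out of date,
// so both fields are kept private.
type cachedArchive struct {
	rootDir string
	dumpDir string
}

func newCachedArchive(rootDir string) *cachedArchive {
	return &cachedArchive{
		rootDir: rootDir,
		dumpDir: path.Join(rootDir, "dump"),
	}
}

// DBDumpDir returns the value cached at construction time.
func (a *cachedArchive) DBDumpDir() string { return a.dumpDir }

func main() {
	a := Archive{Filename: "backup.tar.gz", RootDir: "juju-backup"}
	a.Filename = "renamed.tar.gz" // harmless: nothing depends on the old value
	fmt.Println(a.Filename)

	c := newCachedArchive("juju-backup")
	fmt.Println(c.DBDumpDir())
}
```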
<bac> ping lazyPower_
<lazyPower_> hey bac
<lazyPower_> sorry i saw the pings while i was out and about checking on my phone
<lazyPower_> you're looking for an example bundle with tags as constraints yes?
<bac> yes i am
<bac> lazyPower_: any that are blocked
<lazyPower_> bac: i edited the bundles for promulgation
<bac> lazyPower_: so you took the tags out?
<lazyPower_> sure did. I pointed the resource to the charm store, and removed the tags as constraints - and left a note in the bzr log + MP that i had done so for promulgation.
<lazyPower_> https://code.launchpad.net/~kirkland/charms/bundles/transcode-cluster/bundle
<lazyPower_> this was the hottest culprit, as the orange-box bundle has tags to denote NFS and TRANSCODE should go on hardware / physical nodes
<bac> lazyPower_: ok.  i mean it is trivial for me to come up with test bundles but i thought i'd run my charm proof against the real ones.
<lazyPower_> bac: well that submission would be a great one to look at
<bac> oh, the famous transcoder...
<lazyPower_> famous? :D
<bac> lazyPower_: minimally.  :)
<bac> lazyPower_: that bundle is actually trivial wrt constraints.
<lazyPower_> indeed. very minor set of constraints - it was throwing the charmworld proofer through loops though
<bac> lazyPower_: i've got a charmworldlib fix that i'll put up for review soon.  it'll require a small code change to charmworld and a redeploy, which cannot happen until monday
<lazyPower_> when i run charm proof --offline  it passes.
<bac> lazyPower_: you're not currently blocked so monday should be fine, correct?
<perrito666> natefinch: yes, but that still leaves a whole package to race against
<perrito666> I mean, you cannot guarantee that a mgo server died properly in the last test
<natefinch> perrito666: well, the tests are run in series, not in parallel, so there's no concern as long as the setup and teardown do the right thing
<perrito666> natefinch: I believe the tear down might be failing
<perrito666> this is a heisenbug so I might try to make sure this is isolated and hope for the best
<bodie_> https://github.com/juju/juju/pull/617 should be done now
<bac> hi marcoceppi_, when you get a chance could you look at this charmworldlib branch for the constraints issue: https://codereview.appspot.com/131710044/ ?
<marcoceppi_> bac: I wonder if there's a limit to characters that can be in a tag/constraint
<marcoceppi_> nvm
<marcoceppi_> ignore what I'm saying
 * bac ignores you
 * bac listens again
<natefinch> ericsnow: is backups.create actually used anywhere?
<ericsnow> natefinch: not until the next patch
<natefinch> ericsnow: ahh ok
<ericsnow> natefinch: I linked it in the PR summary
<natefinch> ericsnow: aww... reading?! :/
<perrito666> aghhh I believe I foundiiiiiittttt
<natefinch> huzzah!
<perrito666> ish, someone might be re-setting the mgoserver but I cannot find who yet :p
#juju-dev 2014-08-30
<bogdanteleaga> hello, could I get another review for https://github.com/juju/testing/pull/29 ?
<bodie_> bogdanteleaga, the PathChecker is only for checking that the paths are the same, right?  it shouldn't matter whether there is a file there
<bodie_> I guess the problem is that "C:/Program Files" and "C:/Programming" have a name collision and could be C:/PROGRA~1 and C:/PROGRA~2
<bodie_> but can those really be considered the same *path*
<bodie_> it's possible that there is a problem with my concept of "path"
<bogdanteleaga> Umm, it uses os.samefile to check
<bogdanteleaga> So it would actually work
<bogdanteleaga> It's kind of a workaround for symlinks
<bogdanteleaga> Because they don't work on windows
<bogdanteleaga> And yeah it doesn't matter if there's a file there
<bogdanteleaga> It's multipurpose
<bogdanteleaga> If the path doesn't match it checks if it points to the same file
<bodie_> no, I know it would work in that case, the question is more whether the files should have to exist for the path to check correctly
<bodie_> i.e., maybe C:/PROGRA~1 matches both C:/Program Files and C:/Programming
<bogdanteleaga> It wouldn't
<bodie_> well it's not a one-to-one mapping, no, rather a many-to-many, so the concept of comparing those paths is intrinsically problematic unless you compare the underlying files
<bodie_> my thought is only that perhaps if you're comparing paths, you don't want the actual files to have to exist
<bodie_> but, it's just a thought :)
<bogdanteleaga> they don't have to
<bogdanteleaga> if they don't exist it exits
<bogdanteleaga> if you follow the logic it first looks at the strings themselves
<bogdanteleaga> and if they don't match then it looks if both of them point to actual files
<bodie_> I'm specifically looking at your comment on line 205
<bogdanteleaga> and only after that it checks whether the files are actually the same file
<bodie_> thus, the files don't have to exist _unless_ x
<bodie_> but, I'm saying C:/Programming could be a match for C:/PROGRA~1 even if neither file exists
<bodie_> or PROGRA~2
<bodie_> there's really no way of knowing
<bodie_> since it's many-to-many name mapping
<bodie_> I just don't know if the user will expect the files to have to exist in that case
<bogdanteleaga> they would not match because if they don't exist it would throw that specific error
<bogdanteleaga> the only case in which you would expect 2 paths to be the same if they contain different strings is the case in which they point to the same file
<bogdanteleaga> I don't understand exactly what you're getting at here
<bogdanteleaga> think of specific use cases
<bodie_> well what you're testing is whether the files match in those cases, not whether the paths match
<bodie_> and if the files don't exist, and the user is trying to check something about paths, maybe that could be problematic
<bodie_> however, I'm not truly certain about your intent, so I thought I would just mention it here instead of holding back your review
<bodie_> as I said -- just a thought :)
<bogdanteleaga> yeah I got that it's just a thought but I wanted to clarify it for you
<bogdanteleaga> the issue is that if the paths are not the same string-wise the only other possible case is a symlink or hardlink
<bodie_> hmm, okay
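The order of checks bogdanteleaga describes can be sketched in Go. `samePath` is a hypothetical function written for illustration; it is not the PathChecker from the juju/testing pull request. It assumes the three steps from the discussion: compare the strings, then require both paths to exist, then compare the underlying files with `os.SameFile`.

```go
package main

import (
	"fmt"
	"os"
)

// samePath is a hypothetical sketch, not the checker under review.
func samePath(a, b string) (bool, error) {
	// Step 1: identical strings match; nothing has to exist on disk.
	if a == b {
		return true, nil
	}
	// Step 2: the strings differ, so both paths must point at real files.
	infoA, err := os.Stat(a)
	if err != nil {
		return false, err
	}
	infoB, err := os.Stat(b)
	if err != nil {
		return false, err
	}
	// Step 3: different spellings match only if they name the same file.
	return os.SameFile(infoA, infoB), nil
}

func main() {
	fmt.Println(samePath("/no/such/path", "/no/such/path"))
	fmt.Println(samePath(".", "./."))
	fmt.Println(samePath("/no/such/path-a", "/no/such/path-b"))
}
```

This also shows bodie_'s concern: two different spellings of a path that does not exist produce an error rather than a match, because only existing files can be compared.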
<bogdanteleaga> it's probably only going to get used inside windows tests, but it's also *nix compatible
<bogdanteleaga> thanks for the review, i'll fix that in a bit
<bogdanteleaga> actually I had a test for that already
<bogdanteleaga> commented on github
#juju-dev 2014-08-31
<thumper> morning
<thumper> menn0: good break?
<menn0> thumper, yeah it was. not exactly relaxing, but fun :)
 * thumper sees around 450 emails to trawl through
<perrito666> menn0: hi, welcome back
<menn0> perrito666: hi, thanks
<perrito666> can I get a hold of you later for some question re upgrader worker?
<perrito666> I need to go out for a moment now
<menn0> perrito666, sure
<menn0> I have my team's standup in 10 mins but will be available after that or any time later
<perrito666> well I have to take my wife to mass so it will be at least an hour
<perrito666> :)
<menn0> thumper, waigani: stand up?
<waigani> thumper: can talk through environ user to apiserver?
<waigani> thumper: or is there a spec in the identity doc?
<thumper> waigani: I don't understand what your first question means
<waigani> thumper: sorry, typo, I meant to say adding the env user to the apiserver.
<thumper> I'm still not sure what you are asking
<waigani> thumper: so when a user connects to the api we want to add that user as an environ user?
<waigani> if they have not been added already
<thumper> waigani: no
<thumper> waigani: the command line will be something like "juju share jesse@local"
<thumper> which will be a client command to share the current environment with the user "jesse@local"
<thumper> the current user has to have access to log into the current environment
<thumper> so that is already handled
<thumper> a new method on the apiserver client interface
<thumper> to share an environment
<waigani> which gives "jesse@local" perms to connect to the apiserver?
<thumper> we want to add an envuser on the current environment for the user specified
<thumper> correct
<thumper> well
<waigani> got it
<thumper> gives jesse@local permission to access that environment
<thumper> (or will do when we tweak the authorization)
<waigani> what does that mean 'access that environment'?
<waigani> how much of the api can they use?
<thumper> right now, almost all of it
<thumper> well, as of right now, all of it
<thumper> my branch will be the precursor to changing that
<thumper> wallyworld: morning
#juju-dev 2015-08-24
<menn0> waigani: would you mind looking at: http://reviews.vapour.ws/r/2456/
<waigani> menn0: sure
<menn0> waigani: it's a fix for intermittent allenvwatcher test failures that i've noticed
<menn0> waigani: there's another problem to fix as well but this one issue is happening more often
<waigani> menn0: shipit
<menn0> waigani: thanks
<davecheney> i love the fact that all our upgrade and provisioner tests use "ppc64le" as the "i cannot possibly happen" operating system
<anastasiamac> davecheney: isn't it easier to support impossible operating systems? :D
<davecheney> we're basically using quantal-ppc64le as our "nope, cannot happen" sentinel
<mwhudson> davecheney: good think the ubuntu architecture is called ppc64el then
<mwhudson> *THING
<mwhudson> doh can't brain today
<davecheney> axw_: godeps: cannot get information on "/home/dfc/src/launchpad.net/gwacl": bzr revision-info has unexpected result "244.1.1 andrew.wilkins@canonical.com-20150810080433-4l0x47qsc16ure7c\n"
<davecheney> mwhudson: oh god
<davecheney> don't mention the war
<davecheney> or the worst joke in the world that continues to keep on giving
<davecheney> state tests now take 450+ seconds to pass
<davecheney> % go test ./state/...
<davecheney> ok      github.com/juju/juju/state      438.504s
<davecheney> golf clap
<davecheney> FAIL    github.com/juju/juju/state/leadership   600.006s
<davecheney> fucking fantastic
<mup> Bug #1487947 opened: provider/ec2: localServerSuite.TestNetworkInterfaces  <juju-core:New> <https://launchpad.net/bugs/1487947>
<mup> Bug #1487947 changed: provider/ec2: localServerSuite.TestNetworkInterfaces  <juju-core:New> <https://launchpad.net/bugs/1487947>
<dimitern> rogpeppe, hey, have you seen https://code.launchpad.net/~dimitern/godeps/fix-bzr-revision-info-parsing/+merge/268223 ?
<rogpeppe> dimitern: nope :)
<rogpeppe> dimitern: only just back from 2 weeks' holiday
<dimitern> rogpeppe, have a look when you have a moment :) it's just 2 characters change
<rogpeppe> dimitern: looking
<dimitern> dooferlad, TheMue, hey guys, it seems net-cli is blessed for merging in master
<rogpeppe> dimitern: interesting. out of interest, do you actually want to use non mainline revision numbers as dependencies?
<dimitern> rogpeppe, I suppose not, but at least when there is such a case now godeps refuses to work anymore
<rogpeppe> dimitern: yeah, i agree it shouldn't fail in that case
<dimitern> rogpeppe, maybe print a warning when there are dots in the bzr revision and go on?
<rogpeppe> dimitern: i think i'll just go with your fix actually. if you really want a non mainline revision, you can have one
<rogpeppe> dimitern: it should be obvious in the .tsv file anyway
<dimitern> rogpeppe, sounds good
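For context, the output that tripped godeps above has the form "revno revision-id", where a non-mainline revno contains dots (for example 244.1.1). A minimal Go sketch of a parser that accepts both forms follows. The pattern and the `parseRevisionInfo` helper are assumptions made for illustration; the actual two-character fix in the merge proposal is not shown in this log.

```go
package main

import (
	"fmt"
	"regexp"
)

// revInfoPat is a hypothetical pattern, not the one used by godeps.
// It accepts a mainline revno ("244") or a dotted, non-mainline one
// ("244.1.1"), followed by whitespace and the revision id.
var revInfoPat = regexp.MustCompile(`^([0-9]+(?:\.[0-9]+)*)\s+(\S+)\n?$`)

func parseRevisionInfo(out string) (revno, revid string, err error) {
	m := revInfoPat.FindStringSubmatch(out)
	if m == nil {
		return "", "", fmt.Errorf("bzr revision-info has unexpected result %q", out)
	}
	return m[1], m[2], nil
}

func main() {
	out := "244.1.1 andrew.wilkins@canonical.com-20150810080433-4l0x47qsc16ure7c\n"
	revno, revid, err := parseRevisionInfo(out)
	fmt.Println(revno, revid, err)
}
```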
<rogpeppe> dimitern: reviewed
<dimitern> rogpeppe, thanks!
<dimitern> rogpeppe, I think you should manually merge the MP, as there's no bot to pick it up
<rogpeppe> dimitern: merged (ooo, my bzr-fu is getting rusty!)
<dimitern> rogpeppe, :) mine too
<dooferlad> dimitern: great! Lets do it and go drink some beer!
 * dooferlad knows that TheMue would rather drink whisky
<TheMue> dooferlad: good idea, got time next week monday. wanna meet in London?
<dimitern> \o/
<TheMue> dooferlad: ah, you know me right
<dooferlad> TheMue: I think you mean tuesday. Well, maybe Monday evening...
<TheMue> dooferlad: yeah, Monday evening
<dimitern> dooferlad, voidspace, standup?
<voidspace> dimitern: trying, it's not letting me in :-/
<voidspace> restarting the browser
<dimitern> voidspace, oh boy :/ monday hangout issues
<voidspace> dimitern: not having used this machine for six weeks issues too...
<voidspace> 350mb of updates
<voidspace> including a new firefox - that may be the issue
<voidspace> that looks better
<voidspace> dimitern: ready
<dimitern> voidspace, omw
<dimitern> fwereade, hey, are you around today?
<TheMue> dimitern: btw, the provider I meant has been the vsphere one. it has a partial implementation of the networking interface
<dimitern> TheMue, ok, so for vsphere then SupportsSpaces will be false, but the other method will remain
<TheMue> dimitern: yep, did it that way
<dimitern> TheMue, +1
<redelmann> hi, im having some problem with juju-gui
<redelmann> machine-0 logs says:  error stopping *state.Multiwatcher resource: unit not found
<redelmann> everytime i load juju-gui on browser
<redelmann> and no icons appears
<redelmann> this is a juju bug or gui?
<redelmann> i open this issue: https://bugs.launchpad.net/juju-gui/+bug/1485249
<mup> Bug #1485249: Juju gui is not loading. <juju-gui:New> <https://launchpad.net/bugs/1485249>
<rogpeppe> anyone know about the openstack provider?
<urulama> dimitern: ^
<mup> Bug #1486106 changed: TestNetworkInterfaces fails on vivid/wily <ci> <ec2-provider> <intermittent-failure> <test-failure> <juju-core:Fix Released by dooferlad> <https://launchpad.net/bugs/1486106>
<rogpeppe> in particular at this moment I'm interested in understanding the different roles of the tenant-name (taken from env var OS_REGION_NAME) and region (no env var) attributes
<mgz> rogpeppe: you have that mixed up, tenant-name is taken from OS_TENANT_NAME
<mgz> it's a user-specific grouping of resources
<rogpeppe> mgz: so they are - not sure how i read that wrong
<mgz> whereas regions are a cloud-level separation
<rogpeppe> mgz: for some reason i'd copied/pasted the wrong env vars and didn't check back to the source
<rogpeppe> mgz: i did think it was potentially very confusing :)
<rogpeppe> mgz: one weird thing: why does openstack fall back to using the ec2 passwords?
<mgz> rogpeppe: there's a semi-compatible ec2 api, so many of the rcs set the same vars for ec2
<rogpeppe> mgz: it seems like a potential security leak to me
<rogpeppe> mgz: and i don't see any other values that do the same thing
<wwitzel3> fwereade: ping
<fwereade> wwitzel3, pong
<wwitzel3> fwereade: katco isn't feeling well, are you able to move up our discussion to now? so she can take off and get some rest
<fwereade> wwitzel3, sure, let's
<rogpeppe> mgz: and surely if it was using AWS_SECRET_ACCESS_KEY it ought to use AWS_ACCESS_KEY_ID too?
<wwitzel3> fwereade: we are in moonstone, https://plus.google.com/hangouts/_/canonical.com/moonstone?authuser=1
<mgz> rogpeppe: the names used are just based on what was looked at in practice by the clients
<mgz> and there's not really a security issue, the access key/secret key are not passed along unless you say you're using that authmode
<rogpeppe> mgz: do you think it might be reasonable to drop support for using AWS_SECRET_ACCESS_KEY and EC2_SECRET_KEYS for the openstack password?
<mgz> so you need to be pretty deliberate about wanting it
<rogpeppe> mgz: isn't userpass the default auth mode?
<rogpeppe> mgz: so ISTM that it'll send that password in the usual case
<mgz> and it's basically always going over https to somewhere that won't be logging your auth attempts
<mgz> you'd need to have an endpoint url manually set to something hostile, or dns poisoned
<rogpeppe> mgz: from the p.o.v. of the user that's not necessarily true - they don't know what's at the other end
<mgz> and also have accidentally sourced your ec2 creds
<mgz> I don't think the ec2 provider is robust against that
<mgz> we don't check that the ec2 endpoint is really amazon
<rogpeppe> mgz: i think we do, don't we
<rogpeppe> mgz: 'cos we use https
<rogpeppe> mgz: without using InsecureSkipVerify
<rogpeppe> ?
<mgz> that just verifies it has been signed by something in your chain
<mgz> it doesn't verify what it is.
<rogpeppe> mgz: well, same security as any website :)
<rogpeppe> mgz: and, yes, i'm well aware that https security is a farce
<mgz> the ec2 signing mechanism is better though, in not passing along the secret
<rogpeppe> mgz: but we all blithely go on assuming that verisign et al are somewhat reliable
<rogpeppe> mgz: i mean, fair enough i guess if people actually use AWS_SECRET_ACCESS_KEY for their openstack password, we don't wanna break that
<rogpeppe> mgz: it just triggers my Something Wrong Here sense.
<mgz> I think it's likely not really used any more, all the current rc things will at least include both, and generally only the OS_ versions
<mgz> just whether it's worth futzing with
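The fallback behaviour under discussion can be sketched in Go as a lookup over an ordered list of environment variable names. `firstEnv` is a hypothetical helper, not goose code. `OS_PASSWORD` is assumed here as the primary variable; the two fallbacks are the names mentioned in the conversation.

```go
package main

import (
	"fmt"
	"os"
)

// firstEnv returns the first name in names whose value is non-empty,
// together with that value. The getenv parameter makes the lookup
// testable without touching the real environment.
func firstEnv(getenv func(string) string, names ...string) (name, value string) {
	for _, n := range names {
		if v := getenv(n); v != "" {
			return n, v
		}
	}
	return "", ""
}

func main() {
	// Order matters: the OpenStack name wins, the EC2-style names are
	// only consulted when it is unset. OS_PASSWORD is an assumption.
	name, _ := firstEnv(os.Getenv, "OS_PASSWORD", "AWS_SECRET_ACCESS_KEY", "EC2_SECRET_KEYS")
	if name == "" {
		fmt.Println("no password variable set")
		return
	}
	fmt.Println("password taken from", name)
}
```

Returning the name alongside the value makes rogpeppe's concern visible: a caller can log or warn when a secret was picked up from an AWS-named variable rather than the OpenStack one.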
<rogpeppe> mgz: i'm moving the env vars into the openstack provider to do the environschema stuff, and it made me wonder...
<rogpeppe> mgz: i mean, maybe goose.v1/identity should export the env var names
<mup> Bug #1465404 changed: worker/provisioner: fail lxcBrokerSuite.TestStartInstanceLoopMountsDisallowed <intermittent-failure> <unit-tests> <juju-core:Fix Released by dooferlad> <https://launchpad.net/bugs/1465404>
<mup> Bug #1473209 changed: github.com/juju/juju/service/windows undefined: newConn <ci> <intermittent-failure> <test-failure> <unit-tests> <windows> <juju-core:Fix Released by bteleaga> <juju-core trunk:Fix Released by bteleaga> <https://launchpad.net/bugs/1473209>
<dimitern> fwereade, hey
<dimitern> fwereade, does this look in line with what we discussed? http://reviews.vapour.ws/r/2463/
<rogpeppe> mgz: this is what i've got: http://paste.ubuntu.com/12184056/
<rogpeppe> mgz: does that look reasonable to you?
<rogpeppe> mgz: actually, i should avoid copy/pasting those env vars
<mgz> rogpeppe: you have EC2_SECRET_KEYS ... with an S
<rogpeppe> mgz: that's what goose uses
<mgz> also, not really to do here, but we should add tenant-id already
<mgz> heh, in which case you probably can just delete it, as it's been wrong for a while then
<rogpeppe> mgz: :)
<rogpeppe> mgz: i suspect the same of AWS_SECRET_ACCESS_KEY too
<mup> Bug # changed: 1458717, 1461393, 1463641, 1467372, 1474788, 1478660, 1484617
<rogpeppe> mgz: but i'm actually thinking i don't want to risk breaking anything at all
<mgz> rogpeppe: no, that one is right, the old naming there was just dumb
<rogpeppe> mgz: i know it's right - i just think that using an AWS name for openstack creds is silly
<rogpeppe> mgz: particularly when it doesn't use AWS_ACCESS_KEY_ID for one of the user names
<mgz> it was the only way to use non-password auth on the old hp cloud
<mgz> as there aren't OS names for the different values
<rogpeppe> mgz: ah, makes sense
<rogpeppe> mgz: ok, here's a review for you: https://github.com/go-goose/goose/pull/14
<mup> Bug #1459064 changed: worker/leadership: data races in test and code <intermittent-failure> <tech-debt> <unit-tests> <juju-core:Fix Released by dave-cheney> <https://launchpad.net/bugs/1459064>
<dimitern> voidspace, dooferlad, TheMue, please have a look at these last 2 PRs we need to unblock merging net-cli into master: http://reviews.vapour.ws/r/2463/ and http://reviews.vapour.ws/r/2464/
<mup> Bug #1460893 changed: many unhandled assigned values <tech-debt> <juju-core:Fix Released by dave-cheney> <https://launchpad.net/bugs/1460893>
 * dimitern steps out, bbl
<frobware> hello all - new starter on the Juju Sapphire team today.
<mgz> rogpeppe: looks fine. as I hinted earlier, there are some bigger bits that could do with fixing in the os auth stuff, but nothing that blocks this change.
<mgz> frobware: hey there!
<rogpeppe> mgz: thanks
<mgz> frobware: another Andrew?
<frobware> mgz, tis the way...
<ericsnow> fwereade: we didn't mean to chase you away! :)
<rogpeppe> mgz: could you land my goose branch for me please? i'm not a member of goose collaborators currently
<redelmann> anyone know about juju-gui?
<redelmann> im trying to debug an issue
<natefinch> redelmann: probably better to ask in #juju
<redelmann> thank natefinch
<voidspace> dooferlad: I see you reviewed the first one, are you looking at the second too?
<mgz> rogpeppe: erm... can't remember exactly but the bot may actually just still check if you're a juju member
<dooferlad> voidspace: yes
<mgz> rogpeppe: added the comment anyway
<rogpeppe> mgz: thanks
<voidspace> dooferlad: ok, I'll leave it to you :-)
<voidspace> thanks
<dooferlad> voidspace: I haven't graduated yet. I suggest you look at the first one first - it seems to have been committed with the second as well.
<dooferlad> voidspace: then you know which bits to ignore :-)
<voidspace> ah
<voidspace> I did think the second had some extraneous stuff in it
<voidspace> dooferlad: dimitern: first looks good and it tells me which bits of the second I can ignore too
<voidspace> in which case the second looks trivial
<voidspace> dooferlad: although I missed the concern you raised, need to read the code agian!
<voidspace> *again
<mup> Bug #1488139 opened: juju should add nodes IPs to no-proxy list <juju-core:New> <https://launchpad.net/bugs/1488139>
<mup> Bug #1488139 changed: juju should add nodes IPs to no-proxy list <juju-core:New> <https://launchpad.net/bugs/1488139>
<mup> Bug #1488139 opened: juju should add nodes IPs to no-proxy list <juju-core:New> <https://launchpad.net/bugs/1488139>
<bodie_> is the answer to why godeps is used rather than godep a simple one?
<alexisb> welcome frobware !
<mgz> bodie_: godeps is older than godep
<bodie_> I see.  thanks mgz o/
<rogpeppe> mgz and anyone else: fancy a smallish review of a change to the openstack provider code? http://reviews.vapour.ws/r/2465/
<dimitern> katco & moonstone: fwereade texted me about an hour ago his power's out and wanted to convey his apologies
<dimitern> voidspace, dooferlad, thanks for the reviews guys!
<mgz> rogpeppe: looks fine in general, the descriptions are a bit all over the place though, do you want nitpicks?
<rogpeppe> mgz: specific suggestions would be great
<rogpeppe> mgz: i copied the descriptions from the boilerplate
<rogpeppe> mgz: much appreciated
<mgz> ah, so they're now in two places? that's a bit unfortunate.
<rogpeppe> mgz: when i get some extra time, the plan is to generate the boilerplate from the schema
<mgz> how much will you hit me over the head if I ask you about generating the boilerplate from the schema?
<mgz> eheh, you win.
<rogpeppe> :)
<rogpeppe> mgz: i'm also not liking my decision to go with partial sentences for the descriptions. using go-like full sentences seems like it's a better plan in the end.
<rogpeppe> mgz: but these things are fairly easy to change
<mgz> rogpeppe: review'd
<rogpeppe> mgz: ta v much
<rogpeppe> mgz: not sure about mentioning that control-bucket will be auto-generated - that's kinda true of every parameter that's optional.
<mgz> I'm a little antsy on that one as setting it can be actually harmful... but it can also just be required
<rogpeppe> when would it be required?
<rogpeppe> mgz: perhaps a comment in the description saying "(do not set this unless you know what you are doing)" might be appropriate
<mgz> rogpeppe: yeah, but leave it for now, don't need to futz with here.
<rogpeppe> mgz: k
<mgz> we need it for ci testing on canonistack, I *think* it's related to being able to cleanup reliably
<mup> Bug #1488166 opened: Service with just one unit left which doesn't think it's the leader <landscape> <leadership> <juju-core:New> <https://launchpad.net/bugs/1488166>
<perrito666> bbl groceries shopping
<natefinch> ericsnow: getting this error when running featuretests: charm not found in "/home/nate/src/github.com/juju/juju/testcharms/charm-repo": local:quantal/workload-actions
<ericsnow> natefinch: is it actually there (/home/nate/src/github.com/juju/juju/testcharms/charm-repo/quantal/workload-actions)?
<ericsnow> natefinch: perhaps still named "proc-actions"?
<ericsnow> natefinch: (which should be fixed)
<natefinch> ericsnow: doh, yep
<natefinch> I blame wwitzel3
<ericsnow> natefinch: :)
<natefinch> ericsnow: I'll submit a PR to rename them
<ericsnow> natefinch: sounds good
<wwitzel3> :(
<natefinch> I think I got whatever katco has.  I should remember to wear a mask to hangouts from now on.
<mup> Bug #1488245 opened: Recurring lxc issue: failed to retrieve the template to clone  <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1488245>
<davechen1y> ping http://reviews.vapour.ws/r/2458/
<perrito666> davechen1y: ship it, I am happy that that is being worked :)
<perrito666> there is only a typo in the description
<davechen1y> perrito666: thanks
#juju-dev 2015-08-25
<anastasiamac> proposed http://reviews.vapour.ws/r/2469/
<davechen1y> ping http://reviews.vapour.ws/r/2470/
<sinzui> waigani: davechen1y: can you review http://reviews.vapour.ws/r/2471/ to prepare for the 1.25-alpha1 release? This is a merge of net-cli into master.
<davechen1y> reviewed
<davechen1y> can someone review http://reviews.vapour.ws/r/2470/
<davechen1y> thanks
<anastasiamac> davechen1y: looking
<sinzui> thank you davechen1y
<anastasiamac> davechen1y: done ;)
<davechen1y> anastasiamac: thanks
<anastasiamac> davechen1y: replied :)
<mup> Bug #1421260 changed: juju 1.21.1 bootstrap timeout <bootstrap> <oil> <oil-bug-1372407> <juju-core:Expired> <https://launchpad.net/bugs/1421260>
<mup> Bug #1468365 changed: internal compiler error: fault <ci> <intermittent-failure> <juju-core:Expired> <juju-core 1.24:Expired> <https://launchpad.net/bugs/1468365>
<mup> Bug #1487727 changed: TestAddresserWorkerStopsWhenAddressDeallocationNotSupported fails on precise and windows <blocker> <ci> <intermittent-failure> <precise>
<mup> <regression> <unit-tests> <windows> <juju-core:Fix Released> <juju-core net-cli:Fix Released by dimitern> <https://launchpad.net/bugs/1487727>
<dimitern> who can review this http://reviews.vapour.ws/r/2475/ fix for bug 1455627, which is causing a lot of issues in CI ?
<mup> Bug #1455627: TestAgentConnectionDelaysShutdownWithPing fails <ci> <intermittent-failure> <lxc> <test-failure> <unit-tests> <windows> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1455627>
<dimitern> TheMue, voidspace ^^ ?
<voidspace> dimitern: can do, need to make a phone call first
<dimitern> voidspace, sure
<voidspace> dimitern: interesting technique
<voidspace> dimitern: looks ok to me
<TheMue> dimitern: doing it
<dimitern> voidspace, thanks :)
<voidspace> dimitern: so devices work is on master, not 1.24
<voidspace> dimitern: so it needs adding to the release notes
<voidspace> dimitern: I'll add something
<dimitern> voidspace, awesome, ta!
<dimitern> voidspace, TheMue, standup?
<voidspace> dimitern: omw
<voidspace> dimitern: I added release notes for devices, maybe too long
<dimitern> voidspace, tyvm! it's ok - it's still a draft, so we'll polish it for the release
<frobware> fwereade, I sent you a HO invite; feel free to move it to what works for you...
<fwereade> frobware, thanks, that should be fine, see you shortly :)
<mattyw> wallyworld, you in bed yet?
<mattyw> fwereade, I'd love a review of http://reviews.vapour.ws/r/2459/ when you have a moment
<mattyw> fwereade, I feel like I really need some advice about what I should be doing
<wallyworld> mattyw: hey
<mattyw> wallyworld, so my last pr actually broke loads of stuff, would you mind taking another look http://reviews.vapour.ws/r/2466/diff/#
<mattyw> wallyworld, I do miss you, but I'm glad we weren't in the same room for the batting collapse
<wallyworld> mattyw: yeah, i was sus about it, hence i added a card to audit what we have in that area even if the tests can pass. we still need to do that even once this lands
<wallyworld> mattyw: hard to gloat when the ashes are lost
<wallyworld> worst ashes series ever
<wallyworld> no test went for 5 days, each was one sided
 * rick_h_ is trying to figure out if you all are talking code or cricket :P
<wallyworld> rick_h_: cricket :-D
<wallyworld> should have used Test
<wallyworld> and Ashes
<rick_h_> lol
<mattyw> wallyworld, reckon that branch should land?
<mattyw> rick_h_, we'll come up with some sport banter in chicago if you like
<wallyworld> mattyw: if the tests pass, sure. we can iterate. i added a card to make sure we go back and audit everything
<mattyw> wallyworld, was the card on the uniter sprint board?
<wallyworld> mattyw: yeah
<wallyworld> mattyw: https://canonical.leankit.com/Boards/View/116558647/116837709
<wallyworld> i put a few words in the description
<mattyw> wallyworld, awesome, I'll take a look at that later I think
<mattyw> wallyworld, you really should be in bed though right?
<wallyworld> mattyw: yeah, meant to be on a swap day. but i have a meeting in an hour
<wallyworld> timezones are wonderful
<mattyw> wallyworld, fwereade ClearResolvedFlag() doesn't seem to be being called anywhere except in tests. And it doesn't exist on the operation.Callbacks interface anymore
<mattyw> ^^ any thoughts?
<mattyw> ashipika, ^^
<wallyworld> mattyw: it's gone
<wallyworld> not needed anymore
<mattyw> wallyworld, ok - so we can remove any reference to it
<mattyw> wallyworld, why is it not needed anymore?
<wallyworld> mattyw: yes, but any tests will need to be checked to ensure some sort of equivalent behaviour doesn't need to be tested, but mostly i expect they could be deleted. needs to be case by case
<wallyworld> mattyw: eg the RemoteState snapshot may need to be updated
<wallyworld> depends on the test etc
<mattyw> wallyworld, ack
<TheMue> dimitern: voidspace: frobware: maas hangout
<mattyw> wallyworld, regarding what we just discussed http://reviews.vapour.ws/r/2476/
<voidspace> TheMue: oh, dammit
<voidspace> TheMue: is it still on?
<TheMue> voidspace: yep
<voidspace> ok, omw
<wallyworld> mattyw: lgtm. but i still have a feeling we need to add different tests that test the new resolved model. there's several similar such things we need tests for too. so we'll need to check all that once we think we've got everything passing
<mattyw> wallyworld, ack - I'm still trying to get my head around this part of the code, shall I add a card for that?
<wallyworld> mattyw: yeah, it will be a mega card - we can break it up later - but good to have as a reminder
<wallyworld> mattyw: we can have a hangout to discuss this work maybe in a day or 2
<wallyworld> once we get the stuff we know about passing
<mattyw> wallyworld, I'll add something to the calendar, feel free to move it around as you see fit
<wallyworld> mattyw: sgtm, i'm too tired to do much else right now
<mattyw> wallyworld, ack np
<rogpeppe> ultra-trivial review anyone? (one line dependency change.) https://github.com/juju/juju/pull/3097
<rogpeppe> or http://reviews.vapour.ws/r/2477/
<dimitern> rogpeppe, ship it!
<rogpeppe> dimitern: ta!
<mup> Bug #1488523 opened: Azure provider attempts to reuse dying vnet <azure-provider> <bootstrap> <ci> <destroy-environment> <reliability> <repeatability> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1488523>
<mup> Bug #1488523 changed: Azure provider attempts to reuse dying vnet <azure-provider> <bootstrap> <ci> <destroy-environment> <reliability> <repeatability> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1488523>
<mup> Bug #1488523 opened: Azure provider attempts to reuse dying vnet <azure-provider> <bootstrap> <ci> <destroy-environment> <reliability> <repeatability> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1488523>
<mup> Bug #1488554 opened: wily + 1.24.5: Failed to execute operation: Unit name /etc/systemd/system/juju-clean-shutdown.service is not valid <amulet> <openstack> <uosci> <juju-core:Triaged> <https://launchpad.net/bugs/1488554>
<mup> Bug #1488554 changed: wily + 1.24.5: Failed to execute operation: Unit name /etc/systemd/system/juju-clean-shutdown.service is not valid <amulet> <openstack> <uosci> <juju-core:Triaged> <https://launchpad.net/bugs/1488554>
<mup> Bug #1488554 opened: wily + 1.24.5: Failed to execute operation: Unit name /etc/systemd/system/juju-clean-shutdown.service is not valid <amulet> <openstack> <uosci> <juju-core:Triaged> <https://launchpad.net/bugs/1488554>
<mup> Bug #1488554 changed: wily + 1.24.5: Failed to execute operation: Unit name /etc/systemd/system/juju-clean-shutdown.service is not valid <amulet> <openstack> <uosci> <juju-core:Triaged> <https://launchpad.net/bugs/1488554>
<mup> Bug #1488573 opened: Juju treats transient vnet failures as permanent <azure-provider> <bootstrap> <ci> <deploy> <intermittent-failure> <reliability> <repeatability> <juju-core:Triaged> <https://launchpad.net/bugs/1488573>
<mup> Bug #1488576 opened: TestAddresserWorkerStopsWhenAddressDeallocationNotSupported fails on pp64el <ci> <intermittent-failure> <ppc64el> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1488576>
<mup> Bug #1488581 opened: TestFindToolsExactInStorage fails for some archs <blocker> <ci> <i386> <ppc64el> <regression> <storage> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1488581>
<perrito666> anybody knows the whereabouts of the OCR?
<alexisb> natefinch-afk, ping ^^
<alexisb> wwitzel3, ping (for something unrelated)
<natefinch> 'allo... sorry, have been neglecting my duties
<perrito666> aghh afk
<perrito666> why people won't use the actual away feature on irc
<natefinch> perrito666: irc has features?
<perrito666> and why on earth my client does not realize that natefinch-afk is natefinch
<perrito666> you are using it wrong </steve jobs>
<natefinch> sinzui: what version of Go are we going to be building with for 1.26?
<sinzui> natefinch: it's not decided. We remain on 1.2 until we confirm 1.5 works everywhere and Ubuntu can put it everywhere
 * natefinch prays for 1.5
<natefinch> ericsnow: are there corresponding changes to juju-process-docker to support the changes in your patch for plugins to packages?
<ericsnow> natefinch: yep
<ericsnow> natefinch: http://reviews.vapour.ws/r/2426/
<davecheney> cherylj: http://reviews.vapour.ws/r/2482/
<davecheney> ship it
<wallyworld> cherylj: you are awesome for fixing that bug - was going to be something i did today and i looked to see you'd already done it :-)
<cherylj> thanks davecheney, wallyworld
<cherylj> wallyworld: regarding your review comment, I think we do need it.  My reasoning behind it is in the bug https://bugs.launchpad.net/juju-core/+bug/1488581/comments/2
<mup> Bug #1488581: TestFindToolsExactInStorage fails for some archs <blocker> <ci> <i386> <ppc64el> <regression> <storage> <unit-tests> <juju-core:In Progress by cherylj> <https://launchpad.net/bugs/1488581>
<wallyworld> cherylj: fair enough. perhaps then use arch.AMD64 instead of "amd64" to be consistent with the other lines in the test?
<cherylj> wallyworld: sure, I'll do that
<wallyworld> ty
<wallyworld> axw_: anastasiamac: perrito666: running late, will ping when ready, soon
<perrito666> k
<anastasiamac> wallyworld: k :D
<wallyworld> axw_: anastasiamac: perrito666: there now
<davecheney> cherylj: thanks for your fix
<davecheney> don't worry about adding those extra patch values
<davecheney> i'll take care of them with my change set
<davecheney> the goal is to remove version.Current.Arch
<davecheney> so they won't compile
<davecheney> and i'll find them and fix them
#juju-dev 2015-08-26
<menn0> i'm going to mark 1488581 as Fix Released. Thanks to cherylj's change, the broken test is now passing again in jenkins.
<menn0> bug 1488581
<mup> Bug #1488581: TestFindToolsExactInStorage fails for some archs <blocker> <ci> <i386> <ppc64el> <regression> <storage> <unit-tests> <juju-core:Fix Released by cherylj> <https://launchpad.net/bugs/1488581>
<anastasiamac> menn0: cherylj: tyvm!!
<menn0> master is unblocked again
<sinzui> master cannot be unblocked without a bless
<sinzui> PS I just forced a retest to help master get a bless.
<anastasiamac> sinzui: \o/ tyvm
<menn0> sinzui: sorry
<sinzui> you are forgiven menn0
<sinzui> only one job left, then I can sleep
<mup> Bug #1488581 changed: TestFindToolsExactInStorage fails for some archs <blocker> <ci> <i386> <ppc64el> <regression> <storage> <unit-tests> <juju-core:Fix Released by cherylj> <https://launchpad.net/bugs/1488581>
<axw_> wallyworld: the `storage: add volume status to "juju storage volume list"` card *was* done
<axw_> just checked, it's in master
<axw_> moving card from review to previous
<wallyworld> axw_: awesome, wanna add a sentence to the release notes?
<axw_> wallyworld: will do
<wallyworld> ty
<davecheney> wow
<davecheney> ./worker/... tests take > 40 minutes to run on ppc64
<axw_> davecheney: how much is uniter?
<davecheney> > 600 seconds
<davecheney> timed out
<axw_> heh
<davecheney> http://reviews.vapour.ws/r/2483/
<davecheney> this fixes the patch that cherylj landed earlier
<davecheney> not that hers was wrong
<davecheney> this is just the proper fix
<mup> Bug #1488777 opened: Add a wait command <juju-core:New> <https://launchpad.net/bugs/1488777>
<davechen1y> http://reviews.vapour.ws/r/2484/
<davechen1y> ping
<bogdanteleaga> is there a pretty way in golang to insert a slice of strings in the middle of another slice?
<bogdanteleaga> davechen1y ^^
<dimitern> bogdanteleaga, https://github.com/golang/go/wiki/SliceTricks
<bogdanteleaga> I guess it's a = append(a[:i], append(b, a[i:]...)...), wouldn't call it pretty tho
<dimitern> voidspace, TheMue, http://reviews.vapour.ws/r/2486/ please take a look - provisioning support for spaces constraints
<dimitern> bogdanteleaga, the beauty is within :D
<bogdanteleaga> haha
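[Editor's note] A minimal Go sketch of the slice insertion discussed above. The helper name insertAt is hypothetical (not from juju or the SliceTricks wiki); it builds a fresh slice so neither input is modified, and main also runs the one-liner quoted in the chat for comparison.

```go
package main

import (
	"fmt"
	"strings"
)

// insertAt returns a new slice with b spliced into a at index i.
// Neither input is modified. i must satisfy 0 <= i <= len(a).
func insertAt(a []string, i int, b []string) []string {
	out := make([]string, 0, len(a)+len(b))
	out = append(out, a[:i]...)
	out = append(out, b...)
	out = append(out, a[i:]...)
	return out
}

func main() {
	a := []string{"one", "four"}
	b := []string{"two", "three"}

	// Helper form: one allocation, a and b are left untouched.
	fmt.Println(strings.Join(insertAt(a, 1, b), " "))

	// One-liner form from the chat. The inner append is evaluated
	// first, so the tail a[i:] is copied before a[:i] is extended.
	i := 1
	a = append(a[:i], append(b, a[i:]...)...)
	fmt.Println(strings.Join(a, " "))
}
```

Both forms print the same joined result here; the helper is mostly a readability trade-off, as the one-liner can reuse the backing arrays of a or b when they have spare capacity.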
<voidspace> dimitern: looking
<voidspace> hmmm... coffee first
<voidspace> dimitern: the travel provider booked me tickets from London to Northampton, and then on the friday back from Northampton to London...
<voidspace> oops
<voidspace> dimitern: they're buying me new tickets now (for the sprint)
<dimitern> voidspace, ok, so all will be sorted out soon hopefully :)
<TheMue> dimitern: could you tell me why machineSubnetsAndZones() only uses includeSpaces[0]? (or am I looking at it wrong?)
<dimitern> TheMue, there is a comment about it
<dimitern> TheMue, for the MVP work we only use the first positive space and ignore the rest
<TheMue> dimitern: yeah, read it. but why? simply helping me to understand it. it's a new topic for me. ;)
<dimitern> TheMue, for simplicity
<dimitern> TheMue, we want a *minimal* viable product :)
<TheMue> dimitern: ok, sounds good :)
<dimitern> TheMue, once we can do deployments within a single space reliably, it shouldn't be much of a problem to expand this to multiple spaces
<TheMue> dimitern: thx
<TheMue> dimitern: you've got a review
<dimitern> TheMue, thanks!
<voidspace> dimitern: will be a couple of minutes late to standup - on the phone :-/
<dimitern> voidspace, sure
<voidspace> dimitern: lost connection - rejoining
<voidspace> dimitern: lost connection again - rejoining
<voidspace> dimitern: struggling to rejoin apparently
<voidspace> dimitern: dammit - connection lost again
<voidspace> dimitern: my suggestion is that we store a default, but when the address is requested we look at whether it is an exact match for scope & type
<voidspace> dimitern: if it isn't an exact match we check for a better match and if there is one we return that and change the default address
<voidspace> dimitern: so we always return the same one until / unless a better match is found - and then we always return that one
<voidspace> dimitern: so we *can* change address - but only if we were using a fallback scope or type
<voidspace> dimitern: does that sound like a reasonable algorithm?
<voidspace> I can add it as a comment on the bug to give Ed a chance to comment
<dimitern> voidspace, yes, that sounds good
<voidspace> dimitern: cool
<dimitern> voidspace, just to summarize the chat so far: we need a default address (per scope most likely), and a way to only select an exact match per scope
<voidspace> dimitern: where "per scope" is private / public
<dimitern> voidspace, when we don't have a default address yet, just pick the first one; later if we have more than 1 exact match, use the previously stored default
<voidspace> dimitern: but yes
<voidspace> dimitern: yes
<voidspace> dimitern: the algorithm looks like this
<voidspace> dimitern: first time requested use the first "match" using the current algorithm
<voidspace> dimitern: subsequent requests, check if the stored default is an exact match - if it is *always* use it
<voidspace> dimitern: if the default isn't an exact match check if an exact match is now available
<voidspace> dimitern: if not, use the current default (so we remain stable)
<voidspace> dimitern: if an exact match is available, store that as the default and use that
<voidspace> dimitern: subsequent requests will now always see the new default
<dimitern> voidspace, how can a previous default address both still exist and not be an exact match?
<voidspace> dimitern: because if there isn't an exact match on scope / type we allow fallbacks
<voidspace> dimitern: e.g. if preferipv6 is on we can return an ipv4 address if an ipv6 one isn't available
<dimitern> voidspace, ah, because the previous default might have been a fallback
<voidspace> dimitern: so the first part is "if an exact match isn't available still store the fallback as default"
<voidspace> dimitern: to ensure we always return the same one unless / until an exact match is available
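[Editor's note] The selection algorithm voidspace describes can be sketched roughly as below. This is an illustration under assumptions, not juju's implementation: the Address and Selector types and all field names are hypothetical, "the current algorithm" for the first pick is approximated as "first exact match, else the first address", and the behaviour when a stored default disappears from the list (re-select) is a guess the chat does not cover.

```go
package main

import "fmt"

// Address is a hypothetical, simplified stand-in for a machine address.
type Address struct {
	Value string
	Scope string // e.g. "public" or "private"
	Type  string // e.g. "ipv4" or "ipv6"
}

// Selector remembers the default address it last handed out.
type Selector struct {
	WantScope, WantType string

	def    Address
	hasDef bool
}

// exact reports whether a matches the wanted scope and type exactly.
func (s *Selector) exact(a Address) bool {
	return a.Scope == s.WantScope && a.Type == s.WantType
}

// firstExact returns the first exact match in addrs, if any.
func (s *Selector) firstExact(addrs []Address) (Address, bool) {
	for _, a := range addrs {
		if s.exact(a) {
			return a, true
		}
	}
	return Address{}, false
}

func contains(addrs []Address, want Address) bool {
	for _, a := range addrs {
		if a == want {
			return true
		}
	}
	return false
}

// Select returns the address to use, or false if addrs is empty.
func (s *Selector) Select(addrs []Address) (Address, bool) {
	if len(addrs) == 0 {
		return Address{}, false
	}
	stillThere := s.hasDef && contains(addrs, s.def)
	if stillThere && s.exact(s.def) {
		return s.def, true // an exact default is always kept
	}
	if a, ok := s.firstExact(addrs); ok {
		s.def, s.hasDef = a, true // promote the better match
		return a, true
	}
	if stillThere {
		return s.def, true // stay stable on the fallback
	}
	// First request, or the default vanished (assumption): fall back.
	s.def, s.hasDef = addrs[0], true
	return s.def, true
}

func main() {
	s := &Selector{WantScope: "public", WantType: "ipv6"}
	v4 := Address{"203.0.113.10", "public", "ipv4"}
	v6a := Address{"2001:db8::1", "public", "ipv6"}
	v6b := Address{"2001:db8::2", "public", "ipv6"}

	for _, addrs := range [][]Address{{v4}, {v4, v6a}, {v6b, v4, v6a}} {
		a, _ := s.Select(addrs)
		fmt.Println(a.Value)
	}
}
```

In the demo, the first call can only return the IPv4 fallback, the second switches to the newly available exact match, and the third keeps that stored default even though another exact match now appears earlier in the list.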
<dimitern> voidspace, that all sounds sane to me, but let's add it to the bug and ask dosaboy how does it sound to him
<dimitern> voidspace, also, I'd appreciate a review on my PR :)
<voidspace> dimitern: ok, I've written up my suggestion on the bug and will switch to your PR now
<dimitern> voidspace, cheers!
<rogpeppe> trivial review anyone (no logic changes, just a data change): https://github.com/juju/juju/pull/3110
<voidspace> dimitern: exactly duplicated test setup code in api and apiserver :-/
<dimitern> voidspace, it's not quite duplicated :)
<dimitern> voidspace, but fair point - it can be put in a common place
<dimitern> voidspace, state/testing ?
<voidspace> dimitern: sounds good.
<voidspace> dimitern: your call
<voidspace> dimitern: for *two* use cases it's barely worth it
<voidspace> dimitern: for three it would definitely be worth it
<voidspace> dimitern: I won't add it as an issue on the PR, just noting it
<dimitern> voidspace, the duplication was bugging me as well, but wanted to make it work first, then polish
<dimitern> voidspace, ok
<voidspace> s.checkStartInstanceCustom(c, m, "pork", s.defaultConstraints, nil, nil, nil, nil, false, nil, true)
<voidspace> nil, nil, nil, nil, false, nil, true
<voidspace> self documenting code be damned!
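[Editor's note] One common way to tame a call like the one quoted above is a parameter struct, so that only the non-zero arguments are spelled out and each one is named. The sketch below is purely illustrative: checkOpts, runCheck, and every field name are hypothetical and do not correspond to the real checkStartInstanceCustom signature, which is not shown in this log.

```go
package main

import "fmt"

// checkOpts is a hypothetical options struct. Zero values stand in for
// the nil/false arguments a positional call would have to spell out.
type checkOpts struct {
	Name    string
	Volumes []string
	Secure  bool
	Wait    bool
}

// runCheck is a placeholder for a test helper; it only reports what it
// was asked to do so the example stays self-contained.
func runCheck(o checkOpts) string {
	return fmt.Sprintf("name=%s volumes=%d secure=%t wait=%t",
		o.Name, len(o.Volumes), o.Secure, o.Wait)
}

func main() {
	// Only the fields that differ from their zero value are named.
	fmt.Println(runCheck(checkOpts{Name: "pork", Wait: true}))
}
```

The trade-off is one extra type per helper; in exchange, the call site documents itself and adding a new option does not touch existing callers.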
<voidspace> dimitern: ah, so this PR only adds support into the dummy provider
<voidspace> dimitern: EC2 will be a follow up
<voidspace> I was looking for where the new mapping was actually used!
<voidspace> anyway, looks ok to me - dependent on the EC2 support of course
<dimitern> voidspace, yeah, it had to include the dummy provider support
<dimitern> voidspace, thanks, will update with the suggestions and set it to land
<voidspace> cool
<perrito666> morning
 * fwereade wishes he'd learn to follow his own advice -- dependency.Engine would be much easier to test in pathological cases if I'd thought to write the algorithm without the concurrency
<perrito666> fwereade: you can tell your self you told you so
<TheMue> dimitern: btw, your testing stub for spaces is very tricky, but nice. needed some time to get it, now doing the final changes here to handle the enabling and disabling of SupportsSpaces
<dimitern> TheMue, awesome!
<TheMue> dimitern: but this leads to a more general proposal. each package should have a doc.go containing only the package statement but above a description about the intention and usage of the package. so it is visible e.g. on godoc.org as well as it is easy to jump into new packages
<dimitern> TheMue, this is unrelated, but I like it
<dimitern> TheMue, we should bring it up on the ML for discussion
<TheMue> dimitern: yep, will do so
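[Editor's note] A doc.go along the lines TheMue proposes might look like the fragment below. The package name and the wording of the comment are hypothetical placeholders; only the shape (a package comment directly above a lone package clause, which godoc picks up) is the point.

```go
// Package spaces is a hypothetical example package.
//
// The package comment states what the package is for and how it is
// meant to be used, for example which types are the entry points and
// which helpers exist only for tests. Tools such as godoc render this
// text as the package overview.
package spaces
```

Keeping this text in its own doc.go means the overview has one obvious home instead of being attached to whichever source file happens to sort first.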
<sinzui> cherylj: is master waiting for merges to release 1.25-alpha1? or are we waiting on the 3 untested revisions? http://reports.vapour.ws/releases/3008 is bless, but is 3 merges behind tip
<natefinch> ericsnow: thanks for the review of my untrack cmd code.  I responded to a bunch of stuff, but haven't started actually modifying the code yet.
<ericsnow> natefinch: cool
<dimitern> voidspace, hey, how goes the forward port btw?
<frobware> dimitern, do you have time to HO for build/test related stuff?
<dimitern> frobware, I do, just give me 10m ?
<frobware> dimitern, np
<sinzui> dimitern: per my question in channel earlier, is master waiting for merges to release 1.25-alpha1? or are we waiting on the 3 untested revisions? http://reports.vapour.ws/releases/3008 is bless, but is 3 merges behind tip
<sinzui> dimitern: also Lp wont let me update bug 1488576 at the moment. It is a rising problem in the test suite
<mup> Bug #1488576: TestAddresserWorkerStopsWhenAddressDeallocationNotSupported fails on pp64el <ci> <intermittent-failure> <ppc64el> <regression> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1488576>
<dimitern> sinzui, I have one PR which I'll set to land in 2 minutes
<dimitern> sinzui, re that bug - I vote to skip it on ppc64
<sinzui> dimitern: CI sees it everywhere, but on ppc64el most http://reports.vapour.ws/releases/issue/55d89569749a56415476f23b
<dimitern> sinzui, so if we disable it on ppc64, it should improve things for the time being?
<sinzui> yes, but we still risk failure on some other arch until we fix the test
<dimitern> sinzui, agreed, I'll look into it tomorrow
<sinzui> :)
<dimitern> sinzui, actually, I'm seeing failures even locally
<dimitern> sinzui, I'll disable it entirely then
<wwitzel3> ericsnow: found my bug :)
<dimitern> sinzui, then tomorrow will dig into it some more
<sinzui> dimitern: yeah. when I first found the bug, it was only on ppc64el. we now have several runs and see it everywhere. That is what I was trying to say in the bug when Lp was giving me timeouts
<wwitzel3> ericsnow: I didn't remove the Workloads list from the meta struct in the charm library and our Definitions call loops through meta.Workloads (which was empty)
<sinzui> dimitern: once your current PR is merged (assuming a bless) do we have all the features in place for a release?
<dimitern> sinzui, I have no others of mine
<dimitern> sinzui, and I believe nothing critical remains from our stuff
<sinzui> thank you dimitern
<dimitern> frobware, *whew* sorry for the delay, let's get in a HO now?
<frobware> dimitern, ok
<ericsnow> wwitzel3: ah, glad you caught that
<marcoceppi> hey core guys, there's an email on the main juju list with some issues around huge mongodb, odd errors, and 4+gb log files Could someone take a look?
<marcoceppi> hey core guys and gals*
<marcoceppi> hey core ppl*
<mgz> pc fail
<marcoceppi> actually trying to break old habits of calling everyone guys
<mgz> er... as in 'political correctness'
 * marcoceppi is not sloth
<mgz> it's good to try :)
<mgz> I basically think of guys as gender neutral these days, it's how it seems to be used
<mgz> I've seen women using it to address a mixed group, but I guess not an all-female group
<cherylj> I use it to address my lady-friends
 * katco thinks guys is fine and uses it to address my lady-friends as well
<natefinch> y'all
<natefinch> or all-y'all
<katco> natefinch: personally hate that colloquialism
<natefinch> katco: lol me too
<cherylj> I don't think I've ever used y'all un-ironically
<cherylj> and I'm from texas, y'all!
<katco> lol
<marcoceppi> I made someone mad over the weekend by using guys in my typical gender neutral fashion, so peeps is it from now on
<mgz> eheheh
<natefinch> cherylj: me either, but I'm a Yankee
<mgz> I use peeps ironically, except sometimes I forget the irony
<natefinch> heh
<marcoceppi> everything I say is ironic
<natefinch> even that!
<marcoceppi> anyways, core peeps, there's that email, I'm not sure what to tell him other than "try to upgrade your production openstack environment to use juju 1.24.5"
<natefinch> marcoceppi: looking
<marcoceppi> natefinch: thanks!
<natefinch> marcoceppi: first thought is that it might be networking problems.  timeouts usually mean networking
<sinzui> dimitern: cherylj: are you happy with master tip? Is this what you want to release? CI will test this soon: https://github.com/juju/juju/commits/master
<cherylj> sinzui: that works for me, but I don't know if there are other merges we're waiting on.
<cherylj> cmars, katco, are there things you're wanting in  1.25-alpha1?
<sinzui> ^ katco are you really hear to call the cut of 1.25-alpha1?
<sinzui> here
<mup> Bug #1489053 opened: rabbitmq-server charm fails on cluster-relation-changed when using openstack provider <juju-core:New> <https://launchpad.net/bugs/1489053>
<katco> sinzui: i'm here
<katco> cherylj: sinzui: sorry been out sick... moonstone didn't have anything for 1.25
<ionutbalutoiu> Hey! I have an idle juju machine with a charm deployed on it. Every time I reboot the machine, config-changed gets executed when the machine comes up. I presume this is the normal behaviour, right?
<natefinch> ionutbalutoiu: I believe so.  config-changed hooks get run a lot.  The code in them should be written with that in mind.
<voidspace> dimitern: oh yeah, meant to tell you - sorry
<voidspace> dimitern: it's already on master
<mup> Bug #1489053 changed: rabbitmq-server charm fails on cluster-relation-changed when using openstack provider <rabbitmq-server (Juju Charms Collection):New> <https://launchpad.net/bugs/1489053>
<voidspace> natefinch: for bugfixes, which branch should we target?
<voidspace> 1.24?
<voidspace> natefinch: nice haircut by the way
<natefinch> voidspace: thanks
<natefinch> voidspace: uh... bug fixes for what?
<voidspace> natefinch: I am working on a juju-core bug
<voidspace> natefinch: how far back are we doing bug fixes (what versions are we still supporting)?
<voidspace> the bug is in 1.20
<voidspace> but I don't think we're supporting that
<voidspace> katco:  do you know how far back are we doing bug fixes (what versions are we still supporting)?
<voidspace> katco: I'm working on a bug reported against 1.20, but I'm pretty sure we're not still releasing 1.20
<katco> voidspace: yeah i'm pretty sure we aren't releasing anything < 1.22. sinzui comments?
<sinzui> katco: we are not. also we have nothing scheduled to 1.22. only 1.24 has plans for a release
<sinzui> voidspace: ^
<voidspace> sinzui: thanks
<voidspace> katco: thanks too
<katco> voidspace: o/ sorry we don't get to talk much
<sinzui> PS, don't merge into 1.23, it won't pass without other backports
<voidspace> ok
<voidspace> katco: yeah :-/
<voidspace> katco: but o/ from across the water...
<katco> voidspace: :) how is the newbie family member doing?
<voidspace> katco: he's good. Keeping us busy, very demanding of attention - desperate to walk and eat but too young for either (5 months no teeth)
<voidspace> but very adorable of course
<voidspace> katco: how's yours?
<katco> aw glad to hear it :)
<katco> well she gave the family another lovely chest cold
<katco> so she's recovering, but overall doing good
<voidspace> he's lovely, much more sociable than Irina was - who still doesn't really like people very much :-)
<katco> started walking a few weeks ago
<voidspace> katco: heh
<voidspace> katco: ah, chaos time :-)
<katco> hah yeah
<katco> voidspace: isn't it weird how their personalities can come through at such a young age
<voidspace> yeah, lovely to watch personality develop
<voidspace> Irina is in the "why" and "what" phase - she asks about 428 times a day what a word means
<voidspace> which is cool
<voidspace> and inevitably every answer has another word she wants explained
<voidspace> so we end up in these long exhausting chains of explaining things
<voidspace> but it's great that she's so thirsty to learn
<katco> haha
<katco> yeah i bet that's neat
<katco> anxiously waiting for that stage
<voidspace> so far (aged four) Irina has got more and more fun as she grows
<katco> i am definitely a >= toddler person. not overly excited about babies
<katco> very excited to be able to communicate better
<voidspace> hehe - me too mostly
<voidspace> yep, it's *so* much better when you can explain things to them and they can understand
<katco> i was just thinking yesterday when she was not feeling great
<voidspace> we went on a big road trip round Europe - Irina was fine in the car, but you can't explain to a baby why they're in the car and how long it will be for
<cmars> cherylj, nothing specifically. i want 1.25 to branch :)
<katco> she was crying and i was wishing she could tell me what was wrong
<voidspace> yeah, that's hard
<katco> voidspace: haha so true
<mup> Bug #1489087 opened: certificate verify failed <juju-core:New> <https://launchpad.net/bugs/1489087>
<sinzui> katco: bug 1489131 jeopardises our release of 1.25-alpha1 tomorrow. I think cherylj fixed something similar to this yesterday.
<mup> Bug #1489131: BootstrapSuite.TestRunTests fails on non-amd64 archs <blocker> <ci> <ppc64el> <regression> <test-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1489131>
<katco> cherylj: ring any bells?
<cherylj> katco: let me take a look
<sinzui> cherylj: I would phrase this as: have you seen and fixed a mistake like this before?
<cherylj> sinzui: it looks like it's in the same vein as the blocker yesterday, but it's in a different spot.
<sinzui> cherylj: I wonder if someone cargo culted one of the areas you fixed before your fix merged
<cherylj> sinzui: I see davechen1y was in the offending file 2 days ago.
<cherylj> hmm, I'm puzzled why this wouldn't have been hit before...
<cherylj> it's not as a result of a recent change, as far as I can tell
<cherylj> wait, this could be due to Dave's changes yesterday
<cherylj> yeah, I think it is.    Either way, it's an easy fix.
<jcastro> https://bugs.launchpad.net/juju-core/+bug/1489142
<mup> Bug #1489142: cpu-power constraint conflicts with with instance-type when trying to launch a t2.medium <juju-core:New> <https://launchpad.net/bugs/1489142>
<jcastro> we really need help with this bug
<jcastro> it's currently blocking a demo we need to deliver tomorrow and this bug kind of reached out to bite us.
<arosales> critical issue here for Mark's demo tomorrow at Silicon Valley OpenStack ^
<katco> jcastro: arosales: TAL
<mup> Bug #1489131 opened: BootstrapSuite.TestRunTests fails on non-amd64 archs <blocker> <ci> <ppc64el> <regression> <test-failure> <unit-tests> <juju-core:In Progress by cherylj> <https://launchpad.net/bugs/1489131>
<mup> Bug #1489142 opened: cpu-power constraint conflicts with with instance-type when trying to launch a t2.medium <juju-core:New> <https://launchpad.net/bugs/1489142>
<katco> jcastro: is the behavior any different if you do add-machine first?
<jcastro> trying now
<jcastro> katco: no change, same result
<katco> jcastro: k, peeking around the code
<marcoceppi> http://paste.ubuntu.com/12201667/
<katco> jcastro: what version is this?
<jcastro> 1.24.5
<jcastro> katco: anything that just lets us deploy to a t2.medium, even if it's ugly, would do the trick
<marcoceppi> katco: even explicitly setting cpu-power=0 in the global constraints for the environment I still get no instance types in us-west-2 matching constraints "cpu-power=100
<marcoceppi>       instance-type=t2.medium"
<cherylj> Review request for the blocker if someone could take a look: http://reviews.vapour.ws/r/2492/
<cherylj> marcoceppi: what do you get when you run juju get-constraints?
<cherylj> you can unset all constraings with set-constraints "" I think
<cherylj> constraints, even
<cherylj> jcastro: can you unset the constraints with juju set-constraints "" ?
<jcastro> he's on the phone, one sec
<cherylj> kk
<marcoceppi> cherylj: it's empty when I first ran it
<cherylj> well that's interesting
<marcoceppi> cherylj: it's now cpu-power=0 which has no effect
<marcoceppi> cherylj: it's always been empty, fwiw :)
<cherylj> hmm, wonder where it's coming from, then
<mramm> { // General Purpose, 3rd generation.
<mramm> 		Name:     "t2.medium",
<mramm> 		Arches:   amd64,
<mramm> 		CpuCores: 2,
<mramm> 		Mem:      4096,
<mramm> 		// Burstable baseline is 40% (from http://aws.amazon.com/ec2/faqs/#burst)
<mramm> 		CpuPower: instances.CpuPower(40),
<mramm> 		VirtType: &paravirtual,
<mramm> 	},
<mramm> try setting it to less than 40
<marcoceppi> mramm: you can't
<mramm> ok
<marcoceppi> juju deploy cassandra --constraints="cpu-power=0 instance-type=t2.medium" -n 3
<marcoceppi> Added charm "cs:trusty/cassandra-3" to the environment.
<marcoceppi> ERROR ambiguous constraints: "cpu-power" overlaps with "instance-type"
<cherylj> and if you just do --constraints="instance-type=t2.medium", that doesn't work?
<katco> cherylj: see bug 1489142
<marcoceppi> cherylj: no, I get this error
<mup> Bug #1489142: cpu-power constraint conflicts with with instance-type when trying to launch a t2.medium <juju-core:New> <https://launchpad.net/bugs/1489142>
<marcoceppi> no instance types in us-west-2 matching constraints "cpu-power=100 instance-type=t2.medium"
<katco> marcoceppi: specifying both cpu-power and instance-type is the correct error
<marcoceppi> katco: cool, figured it was
<katco> marcoceppi: specifying instance-type is the mystery atm
<marcoceppi> I would have never tried if I didn't get the above error first :)
<katco> yeah hehe
<mramm> and you really need t2.medium?
<marcoceppi> katco cherylj mramm http://paste.ubuntu.com/12201761/
<jcastro> yes, the demo needs a t2.medium
<jcastro> can't be another instance type
<marcoceppi> mramm: it's the only amazon instance that matches the orangebox openstack demo of 2 cpus and 4 gb of ram
<natefinch> I wonder if CPU-power is specified as an environment level, and we're just combining them dumbly
<natefinch> environment-level constraint that is
<katco> https://github.com/juju/juju/blob/master/provider/ec2/image.go#L22
<cherylj> natefinch: that's what I thought too, but we should be able to see that with get-constraints which marcoceppi says is empty
<mramm> can you get a t2.medium by JUST using cpu-power=40
<marcoceppi> cherylj: well, there are global default constraints that never show up, i.e. cpu-power=100 cpu-cores=1 mem=1.5g, etc
<marcoceppi> mramm: trying that now
<katco> marcoceppi: mramm: yeah just going to suggest that
<marcoceppi> uh, i got this http://paste.ubuntu.com/12201785/
<mramm> ahh VPC... dreaded VPC
<marcoceppi> t2 instances don't have this requirement though
<marcoceppi> this is an m4 instance it tried to launch
<marcoceppi> my guess m4.large
<mramm> interesting
<katco> marcoceppi: try "arch=amd64 cpu-cores=2 mem=4096 cpu-power=40"
<katco> marcoceppi: (basically just specifying the things declared by default in a t2.medium)
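[Editor's sketch] The arithmetic behind the "no instance types ... matching constraints" error can be illustrated with a small Go example. It assumes that each constraint acts as a minimum and that a hidden default of cpu-power=100 is merged into the request; neither is confirmed in this log as juju's actual behaviour. The names instanceType, constraintSet and matches are hypothetical, and only the t2.medium figures come from the struct pasted above.

```go
package main

import "fmt"

// instanceType is a hypothetical, trimmed-down copy of the struct pasted in
// the channel; the values used in main are taken from that paste.
type instanceType struct {
	Name     string
	CpuCores uint64
	Mem      uint64 // presumably MiB
	CpuPower uint64
}

// constraintSet is a hypothetical request. An empty InstanceType means
// "not set"; zero numeric fields mean "no minimum".
type constraintSet struct {
	InstanceType string
	CpuCores     uint64
	Mem          uint64
	CpuPower     uint64
}

// matches treats every numeric constraint as a lower bound. This is an
// assumption made for illustration, not juju's confirmed algorithm.
func matches(it instanceType, c constraintSet) bool {
	if c.InstanceType != "" && it.Name != c.InstanceType {
		return false
	}
	return it.CpuCores >= c.CpuCores && it.Mem >= c.Mem && it.CpuPower >= c.CpuPower
}

func main() {
	t2 := instanceType{Name: "t2.medium", CpuCores: 2, Mem: 4096, CpuPower: 40}

	// instance-type plus an implicit cpu-power=100 floor: 40 < 100, no match.
	fmt.Println(matches(t2, constraintSet{InstanceType: "t2.medium", CpuPower: 100})) // false

	// The explicit values katco suggested: every minimum is met exactly.
	fmt.Println(matches(t2, constraintSet{CpuCores: 2, Mem: 4096, CpuPower: 40})) // true
}
```

Under these assumptions, any instance type whose CpuPower is below the implicit floor can never be selected by name alone, which would be consistent with the error text quoted earlier.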
<katco> ericsnow: wwitzel3: natefinch: all hands
<katco> perrito666: ping ^^
<wwitzel3> ahh, yep, one more line ....
<marcoceppi> if this doesn't work I'll spin the instances by hand and manual provider them in with ssh
<marcoceppi> katco: http://paste.ubuntu.com/12201804/
<mramm> "It's interesting to note that Amazon is making T2 instances VPC-only. This is a mechanism to move newer workloads to VPC by default."
<marcoceppi> wtf it didn't say that on the page I was looking at
<marcoceppi> okay, so how do I get a vpc juju thingy
<mramm> You must launch your T2 instances into a virtual private cloud (VPC); they are not supported on the EC2-Classic platform.
<wwitzel3> katco: what's the hangout? it isn't on my calendar
<mramm> directly from the amazon docs
<marcoceppi> foadthx amazon
<katco> wwitzel3: no hangout atm
<mramm> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-vpc.html#vpc-only-instance-types
<natefinch> katco: there's nothing in my calendar... what all hands?
<katco> natefinch: wwitzel3: the discussion happening in this chat right now
<natefinch> katco: oh, sorry, misunderstood.
<marcoceppi> okay, so we have the instance, it's just I need a vpc now
<katco> natefinch: wwitzel3: try and troubleshoot the first attempt in bug 1489142
<mup> Bug #1489142: cpu-power constraint conflicts with with instance-type when trying to launch a t2.medium <constraints> <juju-core:New> <https://launchpad.net/bugs/1489142>
<marcoceppi> and juju+vpc =...?
<marcoceppi> I'm guessing good to go no issues at all fire away, but I'll wait
<mramm> marcoceppi: do you have an account with default VPC?
<mramm> that is the easiest path forward
<mramm> IIRC
<marcoceppi> I don't think so? my account was created like 4 years ago
<mramm> yea, you'd need a newer account to use
<marcoceppi> I do have a new account though
<marcoceppi> that I have access to
<mramm> I'm not sure how much of VPC support has actually landed in 1.24
<natefinch> at best, we need to handle this case better.  Saying there's a conflict is a huge red herring
<jcastro> yeah, let's sort that after we unblock marco
<katco> natefinch: focus on the 1st use-case... the instance-type constraint doesn't seem to be working
<mramm> natefinch: agreed there is lots here to fix
<marcoceppi> Going to see if this new account has default vpc, in which case I can use the constraints katco gave to get unblocked
<mramm> marcoceppi: cool
<katco> marcoceppi: jcastro: here's the algo. for performing the selection: https://github.com/juju/juju/blob/master/environs/instances/instancetype.go#L95
<marcoceppi> mramm katco so with a default vpc, will all my instances just launch in it?
<katco> marcoceppi: i haven't looked at it recently, but i think i remember reading that
<marcoceppi> katco: seems so, bootstrap node seems to be in it
<mramm> katco: marcoceppi: I believe that is correct
<natefinch> didn't we used to display instance type in juju status?
<marcoceppi> for the love of
<marcoceppi> ugh
<marcoceppi> 'cannot run instances: Virtualization type ''hvm'' is required
<marcoceppi>       for instances of type ''t2.medium''. Ensure that you are using an AMI with virtualization
<marcoceppi>       type ''hvm''. For more information, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/virtualization_types.html
<marcoceppi>       (InvalidParameterCombination)'
<marcoceppi> looks like I'm just not going to win today
<marcoceppi> can I give juju an AMI?
<marcoceppi> I know the answer is "no", but like, can I actually do it?
<natefinch> marcoceppi: lol
<marcoceppi> I need this ami-afd5c09f for hvm apparently
<natefinch> marcoceppi: http://cdn.meme.am/instances/55580985.jpg
<jcastro> we're going to launch them by hand and then just add-machine via ssh
<natefinch> can you not just use a different machine size?
<natefinch> like... bigger?
<jcastro> no
<jcastro> the demo requires a t2.medium
<arosales> natefinch, we need to have a specific instance size for benchmark comparison
<bogdanteleaga> marcoceppi: yes, you could give juju a ami to use, but you need to modify some code in core
<arosales> bogdanteleaga, modifying code in core is a no-go for a demo tomorrow
<bogdanteleaga> amazon only accepts image metadata that's signed
<bogdanteleaga> that would be the problem here
<bogdanteleaga> if you could get that, it would also work
<natefinch> so, instance-type works. just not with t2.... so probably it's the VPC thing causing an error that we interpret incorrectly.
<jcastro> yeah
<katco> natefinch: wondering if instance-type is specified, should we even be setting a default cpu power? isn't that implied by instance-type?
<jcastro> when you do it with a new account it works because amazon defaults to vpc for users newer than dec 2013
<marcoceppi> katco: I say no, if the user supplied an instance-type, it should be the only thing that matters
<jcastro> once you get past that though it's trying to use the pv image instead of the hvm image
<natefinch> katco: yeah, instance-type is supposed to override everything
<natefinch> katco: my guess is that the code that adds CPU-Power is coming later in the call stack and appending it, without considering if instance-type is set
<natefinch> katco: although I don't think it's actually hurting us
<katco> natefinch: pushing patch now that addresses that
<natefinch> (aside from being confusing)
<bogdanteleaga> got a question about constraints since we're on the subject: is there a place to define default constraints/provider?
<bogdanteleaga> I found that at least on aws windows machines need 35GB
<katco> natefinch: well, i'm wondering if we didn't set cpu power to a default if the search algo could find a match
<natefinch> bogdanteleaga: any constraints set when you bootstrap become environment-wide constraints
<sinzui> katco: natefinch : do either of you have a minute to review cherylj's http://reviews.vapour.ws/r/2492/
<bogdanteleaga> natefinch, yes but ideally we'd have this the default for windows instances
<katco> sinzui: trying to resolve something blocking a demo tomorrow
<natefinch> katco: my guess is that amazon is returning "no instances" because you don't have the default VPC, and we're just assuming it's because of the constraints mentioned
<natefinch> katco: but certainly we shouldn't be adding the cpu-power when instance-type is set
<natefinch> katco: so that's a fix to make regardless
<katco> http://reviews.vapour.ws/r/2493/
<natefinch> looking
<natefinch> katco: logically backwards... should only be if instancetype *is* nil
<katco> natefinch: doh... damn sick head
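[Editor's sketch] The condition natefinch describes (apply the default cpu-power only when instance-type is nil) might look roughly like the Go below. This is not the patch under review; Constraints, defaultCpuPower and withDefaultCpuPower are hypothetical names, and the value 100 is taken from the default mentioned earlier in the channel.

```go
package main

import "fmt"

// Constraints is a hypothetical stand-in for juju's constraints value.
// A nil pointer means the user did not set that constraint.
type Constraints struct {
	InstanceType *string
	CpuPower     *uint64
}

// defaultCpuPower mirrors the cpu-power=100 default mentioned in the channel.
const defaultCpuPower uint64 = 100

// withDefaultCpuPower fills in the default only when neither instance-type
// nor cpu-power was supplied. The inverted check (adding the default when
// instance-type is set) is the logic error caught in review.
func withDefaultCpuPower(c Constraints) Constraints {
	if c.InstanceType == nil && c.CpuPower == nil {
		p := defaultCpuPower
		c.CpuPower = &p
	}
	return c
}

func main() {
	t := "t2.medium"
	byType := withDefaultCpuPower(Constraints{InstanceType: &t})
	fmt.Println(byType.CpuPower == nil) // true: instance-type suppresses the default

	plain := withDefaultCpuPower(Constraints{})
	fmt.Println(*plain.CpuPower) // 100
}
```

Using pointers keeps "unset" distinct from an explicit zero, so a user-supplied cpu-power=0 would be left alone rather than overwritten by the default.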
<Makyo> Having some problems with panics on CI - anyone else encountered similar? http://juju-ci.vapour.ws:8080/job/github-merge-juju/4494/console
<natefinch> katco: this is why we have reviews :)  (and also tests, but I understand not writing the test in a crunch)
<katco> natefinch: recheck
<natefinch> katco: got a typo in the comment there.... think you were typing in the wrong window
<katco> natefinch: again
<natefinch> katco: ship it!
<katco> marcoceppi: i don't suppose a 1.24.6 beta would help at all?
<jcastro> we're past the vpc problem for now
<jcastro> just wrestling with the hvm issue now
<katco> jcastro: k
<jcastro> t2.medium wants an ami that is built for hvm, but juju is handing it normal pv AMIs
<katco> natefinch: i'm going to wait to merge this until ian weighs in as well. it looks like he touched a lot of the constraints code
<jcastro> so .. the current plan is he just launched the instances by hand in the aws gui
<jcastro> and then is manually adding the machines via ssh
<jcastro> and then deploying cassandra onto that
<marcoceppi> wwitzel3: http://paste.ubuntu.com/12202091/
<natefinch> jcastro: manual provider should do well enough
<natefinch> jcastro: at least for last minute demo purposes
<jcastro> indeed
<sinzui> Makyo: the juju unittests are poorly isolated and timed. We often see many attempts needed to merge. Because of the poor isolation, it is possible a test change in your branch unhinged the dblogSuite.TearDownTest and dblogSuite.TestUnitAgentLogsGoToDBWithJES tests :(
<Makyo> sinzui: ack, thanks.  I'll do a bit more digging around what changed.
<marcoceppi> wwitzel3: https://bugs.launchpad.net/juju-core/+bug/1478156
<mup> Bug #1478156: tabular format does not give enough details about machine provisioning errors <charmers> <juju-core:Triaged> <https://launchpad.net/bugs/1478156>
<katco> wallyworld: hey can you TAL at http://reviews.vapour.ws/r/2493/diff/# and see if it seems sane?
<wallyworld> sure
<davechen1y> cherylj: thanks for the fix
<davechen1y> fyi, i am testing these, not just committing them blindly
<davechen1y> on ppc64
<davechen1y> but my ppc64 machine is so slow I cannot run all the tests
<davechen1y> so I have to work only in the areas I think are affected
<davechen1y> sorry i missed this one
<wallyworld> anastasiamac: standup?
#juju-dev 2015-08-27
<axw_> wallyworld: one thing we talked about after you'd gone on Friday was that remotestate should probably be renamed to "goal state" or "desired state", and we'll move all watcher-related things in there
<axw_> wallyworld: and that updating status should move into there
<axw_> wallyworld: I've just found that updating status is done as a separate channel in the loop, so I'll move that over... maybe not immediately, depends on how much it gets in the way of testing
<wallyworld> name change sounds good. not sure about adding in update status though
<wallyworld> i think polling update status should be a uniter responsibility
<wallyworld> maybe it could be in goal state; if stuff is changing and we are running hooks regularly, no need for polling a charm's status.
<wallyworld> but seems kinda orthogonal to goal state to me
<axw_> wallyworld: I just don't want there to be a lot of concurrency spread around again
<wallyworld> yeah, i was mulling over the same thought
<axw_> wallyworld: and I'd like to keep as much business logic out of the loop as possible
<rick_h_> wallyworld: got a sec to chat? can you ping when you do?
<wallyworld> rick_h_: sure, now?
<rick_h_> wallyworld: sure thing https://plus.google.com/hangouts/_/canonical.com/daily-standup?authuser=1 adjust authuser
<mup> Bug #1489215 opened: Output from metadata generate-image looks bad <juju-core:New> <https://launchpad.net/bugs/1489215>
<davechen1y> sinzui: wallyworld anyone https://bugs.launchpad.net/juju-core/+bug/1489218
<mup> Bug #1489218: go get cannot fetch launchpad.net/gomassapi <juju-core:New> <https://launchpad.net/bugs/1489218>
<davechen1y> can you reproduce this issue ?
<davechen1y> mwhudson: https://bugs.launchpad.net/juju-core/+bug/1489218
<davechen1y> is this related to the launchpad change that happened recently ?
<sinzui> davechen1y: I think someone renamed or disabled the project!
<mwhudson> davechen1y: gomassapi or gomaasapi?
<davechen1y> good point
<davechen1y> i think i fat fingered it trying to investigate https://bugs.launchpad.net/juju-core/+bug/1489217
<mup> Bug #1489217: godeps -u dependencies.tsv from scratch fails with Go 1.5 <juju-core:New> <https://launchpad.net/bugs/1489217>
<sinzui> :)
<mwhudson> oh good
<mwhudson> message is plenty confusing though
 * mwhudson disappears
<mup> Bug #1489217 opened: godeps -u dependencies.tsv from scratch fails with Go 1.5 <juju-core:New> <https://launchpad.net/bugs/1489217>
<wallyworld> davechen1y: looking, was otp
<wallyworld> davechen1y: oh, just read backscroll. ooops
<davechen1y> wallyworld: what to do
<davechen1y> ?
<davechen1y> something is screwed with that repo in a way we cannot detect with the build bot
<wallyworld> not sure, i'll ask in the launchpad channel if i can reproduce
<menn0> waigani: review please: http://reviews.vapour.ws/r/2496/
<wallyworld> davechen1y: so it's only a go 1.5 thing? i'm still on 1.4 and it's fine for me
<wallyworld> so what would there be to fix in launchpad
<mup> Bug #1489131 changed: BootstrapSuite.TestRunTests fails on non-amd64 archs <blocker> <ci> <ppc64el> <regression> <test-failure> <unit-tests> <juju-core:Fix Released by cherylj> <https://launchpad.net/bugs/1489131>
<mup> Bug #1489131 opened: BootstrapSuite.TestRunTests fails on non-amd64 archs <blocker> <ci> <ppc64el> <regression> <test-failure> <unit-tests> <juju-core:Fix Released by cherylj> <https://launchpad.net/bugs/1489131>
<mup> Bug #1489131 changed: BootstrapSuite.TestRunTests fails on non-amd64 archs <blocker> <ci> <ppc64el> <regression> <test-failure> <unit-tests> <juju-core:Fix Released by cherylj> <https://launchpad.net/bugs/1489131>
<waigani> menn0: done. I replied to your review also.
<davechen1y> wallyworld: i believe mwhudson made (or asked) for a change to happen in the Go tool that made launchpad less special
<davechen1y> so dependency resolution for lp repos has changed
<davechen1y> i don't know the details
<wallyworld> ah ok
<wallyworld> but the change was made in Go
<wallyworld> so Go needs to be fixed, not launchpad
<axw_> wallyworld: in my upgrade branch, I'm getting a failure at https://github.com/juju/juju/blob/maltese-falcon/worker/uniter/uniter_test.go#L346. in my branch, the config-changed doesn't happen because it was resolved without retry above
<axw_> wallyworld: seems that previously we would run another config-changed after start, even if we resolved without retry a previous config-changed
<axw_> wallyworld: seems wrong to me. what do you think?
<wallyworld> it does but i'd defer to william for a definitive answer. maybe tweak the test with a todo and land and then i can ask him tonight
<axw_> wallyworld: sounds good, thanks
<axw_> wallyworld: also fixing a bug in storage resolver, it's running storage-attached for unattached (but "alive") storage attachments
<davechen1y> https://github.com/juju/juju/pull/3119
<davechen1y> anyone ?
<dimitern> wallyworld, hey, still about?
<wallyworld> dimitern: hey. thanks for taking up those bugs, sorry for not saying good morning in the email :-)
<dimitern> wallyworld, oh :) no worries at all
<dimitern> wallyworld, the backport didn't happen, but it should today I hope
<wallyworld> no hurry with that - we are concentrating on 1.25 right at the moment
<dimitern> wallyworld, the other armhf issue - I have access to the lab and their maas, but need to investigate today
<wallyworld> was meant to go out today but there was a regression
<dimitern> wallyworld, oh? which one this time..
<wallyworld> a test failure due to arch != amd64
<dimitern> wallyworld, and btw you're perhaps the perfect person to ask - how to set up armhf tools/images/metadata/etc. locally?
<wallyworld> use juju metadata generate-images
<wallyworld> that will put metadata in a directory
<wallyworld> you specify the directory with -d
<dimitern> wallyworld, I guess I can use the 1.24.5 (was it?) release packages.. but no upgrade --upload-tools then, which is not good for debugging
<wallyworld> or else i think it uses ~/.juju
<dimitern> wallyworld, ah, ok will try this
<dimitern> wallyworld, so far istm to debug the issue I need to start another armhf node to build juju from source on it :)
<wallyworld> you can't generate arm tools though unless you're on arm of course
<wallyworld> assuming you used -d mydir
<wallyworld> then you bootstrap with
<dimitern> yeah
<wallyworld> juju bootstrap --metadata-source mydir
<wallyworld> the tools command is juju metadata generate-tools
<dimitern> it was asking if my agent-tools-url is set correctly btw
<wallyworld> it will do that if it can't find tools
<dimitern> right
<dimitern> but generate-tools will need already built arm binaries, right? could they be simply in $PATH?
<wallyworld> generate tools will build the tools from source if there's no binary in the path
<wallyworld> i think
<axw_> wallyworld: would you please review the final commit on https://github.com/juju/juju/pull/3105/commits?
<wallyworld> sure
<dimitern> good, so if I manage to build it from source on an arm machine, then do generate-images and generate-tools, I should be able to bootstrap a custom binary
<axw_> wallyworld: various bugs cropped up when I made my sweeping change for upgrades
<wallyworld> that's what tests are for :-)
<wallyworld> axw_: lgtm. i assume we delete the operation.Queued check in the leader resolved because that's handled externally. now that i think about it, i also seem to maybe recall that there could have been a blanket policy to always run config changed after a start hook just because, but i'll check with william
<axw_> wallyworld: you assume correctly. ok, thanks. we'll need to update resolvers if that's the case, so best done in a separate branch anyway
<wallyworld> yup
<wallyworld> axw_: school pickup time, but here's a trivial http://reviews.vapour.ws/r/2499/
<wallyworld> axw_: could you lgtm and merge for me while i'm out, ty
<axw_> wallyworld: sure
<wallyworld> axw_: thanks for landing. what bit you working on now?
<axw_> wallyworld: TestUniterUpgradeConflicts
<wallyworld> just as the board says, doh
<axw_> wallyworld: just about ready to land
<axw_> I mean propose
<wallyworld> i wonder if that will fix any of the other upgrade ones, maybenot
<axw_> ashipika: I had to make the remoteState.Life==Dead check for another test, so it's also in https://github.com/juju/juju/pull/3121
<axw_> ashipika: I didn't uncomment the unit test you're working on though, cos it still fails for some other reason
<ashipika> axw_: ack!
<axw_> wallyworld: UniterSuite.TestUniterUpgradeGitConflicts works with my branch, with a trivial change to expect leader-settings-changed
<wallyworld> axw_: great :-) assign yourself to the card
<wallyworld> luckily i had just phoned my niece for her birthday so hadn't started yet
<axw_> wallyworld: I'll check the other upgrade tests before sending a change
<axw_> cool
<wallyworld> sounds good, i'm hoping they may work if the changes to make them work are trivial
<axw_> yeah the other one is fixed too
<axw_> wallyworld: I'll just update my current PR
<wallyworld> yup
<ashipika> axw, wallyworld: can you assign/move corresponding cards on the uniter sprint board, as well?
<wallyworld> ashipika: we are :-)
<ashipika> wallyworld: excellent! :)
<wallyworld> getting there slowly
<wallyworld> and sometimes quickly
<wallyworld> axw_: i had a couple of questions on the review
<axw_> wallyworld: thanks for the review. please see my replies, let me know if they make sense
<wallyworld> looking
<wallyworld> axw_: let's get it landed
<axw_> wallyworld: ta
<axw_> wallyworld: any idea why we run actions in ModeTerminating?
<wallyworld> nope. i didn't realise we did. i'm curious as to why
<dimitern> dooferlad, morning
<dooferlad> hi
<dimitern> dooferlad, since we discussed the needed changes around provider/ec2, I've landed the foundation that should give the provider what it needs - subnets-to-zones map
<dimitern> dooferlad, so, slight change of plans, hopefully simplifying what's left
<dooferlad> dimitern: great. Just dealing with email then will take a look. I spotted that you had done something in that area.
<dimitern> dooferlad, ok
<voidspace> dimitern: I don't know if you saw my message yesterday
<voidspace> dimitern: that forward port is already on trunk
<voidspace> dimitern: is there an issue for me to close related to it?
<dimitern> voidspace, yeah, I've updated the card
<dimitern> voidspace, it's a backport to 1.24, not forward port to master
<dimitern> voidspace, sorry, my bad :/
<voidspace> dimitern: hah!
<voidspace> dimitern: ok, that's easy to do :-)
<dimitern> voidspace, yep ;)
<voidspace> dimitern: and for the IP address bug I'm targeting 1.24 and will forward port when it's done
<dimitern> voidspace, sounds good
<voidspace> dimitern: PR created, just running tests
<dimitern> voidspace, awesome!
<mup> Bug #1489346 opened: /var/lib/juju/db taking lots of disk space <juju-core:New> <https://launchpad.net/bugs/1489346>
<axw_> wallyworld: there's a bug in remotestate watcher. I'm going out in a little while to the school, will fix it later on. just letting you know in case anyone sees weird timing errors like http://juju-ci.vapour.ws:8080/job/github-merge-juju/4504/consoleFull
<voidspace> dimitern: thanks for the LGTM
<voidspace> dimitern: I still have the problem that running all cmd/juju tests times out on my main box
<voidspace> dimitern: going to run just the upgrade tests
<dimitern> voidspace, have you tried make check?
<dimitern> voidspace, dooferlad, standup
<voidspace> dimitern: ah, that does a full test run
<voidspace> dimitern: just kicked it off
<voidspace> omw
<mattyw> fwereade, ping?
<dooferlad> dimitern: having sound issues
<dooferlad> dimitern: hopefully there in a second
<wallyworld> axw_: ty
<fwereade> mattyw, pong, I observe I ought to be in my standup
<mattyw> fwereade, should be quick - we think a flowchart of hooks, and what is called when, would be useful. we think something existed but we're not sure if there is anything up to date
<mattyw> fwereade, asking if you knew of one
<fwereade> mattyw, I don't think all the recent hooks have been added
<fwereade> mattyw, but https://jujucharms.com/docs/devel/authors-charm-hooks should still be accurate for the hooks it references
<fwereade> mattyw, not a flowchart
<mattyw> fwereade, ok, I'm drawing one - we can see how useful it is after it's been drawn
<fwereade> mattyw, cool
<wallyworld> mattyw: or fwereade: a trivial one if you have a moment http://reviews.vapour.ws/r/2503/
<mattyw> wallyworld, I think that's the one I just did
<wallyworld> oh joy
<mattyw> wallyworld, yep - posted 6 minutes from now
<wallyworld> you did the same work?
<wallyworld> oh, a lgtm
<wallyworld> ty
<wallyworld> dimitern: have you seen this failure before? https://pastebin.canonical.com/138582/
<dimitern> wallyworld, yes
<dimitern> wallyworld, is it on master?
<wallyworld> dimitern: we are seeing that in our feature branch
<wallyworld> we merged master about 12 hours ago
<dimitern> wallyworld, yeah, dooferlad fixed it on master
<wallyworld> dimitern: ty, we will remerge master
<dimitern> wallyworld, it needed a bump of deps.tsv for amz.v3
<wallyworld> dimitern: ah, ty
<dimitern> wallyworld, but that fix was done ~6 days ago (http://reviews.vapour.ws/r/2434/diff/#) - I hope it's not a new regression
<wallyworld> dimitern: okay, we'll look into it - i thought we had the latest master
<wallyworld> fwereade: can you join us in the uniter meeting?
<wallyworld> https://plus.google.com/hangouts/_/canonical.com/discuss-missing
<TheMue> dimitern: PR is http://reviews.vapour.ws/r/2504/. had to make sure first not to merge the whole net-cli branch again, the first naive diff looked awful. :D
<dimitern> TheMue, ok, will look in a bit
<TheMue> dimitern: tia, just added 2nd card for the client side and CLI
<dimitern> TheMue, you've got a review
<TheMue> dimitern: thx
 * TheMue is afk for a moment, visiting daughter in hospital and bringing some sweets. bbiab
<perrito666> omg I hope she is ok
<wwitzel3> marcoceppi: how's things? any more core issues last night?
<marcoceppi> wwitzel3: m4.xlarge is not a "valid" instance-type on aws for 1.24.5
<marcoceppi> other than that, demo is setup
<wwitzel3> marcoceppi: doesn't appear to be a type in master either, glad to hear the demo is up and going
<marcoceppi> wwitzel3: I'll file a bu
<marcoceppi> g
<wwitzel3> ty
<natefinch> ideally, if you use instance-type as a constraint, we don't check it and just try it with amazon :/
<wwitzel3> marcoceppi: of all the people I know with the last name Ceppi, you're my favorite.
<natefinch> it doesn't matter if *juju* knows what the instance type means
<natefinch> man, all this indirection and layering is killing me
<katco> wwitzel3: natefinch: ty both for helping marco et. al. get the demo running
<natefinch> ericsnow, wwitzel3: where's the concrete type that implements Component?
<katco> natefinch: https://godoc.org/golang.org/x/tools/cmd/oracle
<katco> natefinch: know thy programming toolchain
<perrito666> no one knows their whole programming toolchain
<wwitzel3> natefinch: it is in component/all/workload.go
<katco> perrito666: you should at least be aware of the pointers into interesting spaces :)
<perrito666> katco: I am they have all been nicely integrated into go-vim :p
<natefinch> katco: yeah, I've tried to use the oracle before... couldn't figure it out.  Seems like it kinda requires editor integration, and I didn't have time to get that settled last time
<katco> perrito666: haha, same here for emacs
<katco> natefinch: alan is a heavy emacs user, for sure
<katco> natefinch: https://github.com/waigani/GoOracle
<perrito666> katco: that is in itself a problem though, sometimes I need to go figure how to call these things in the command line
<katco> perrito666: clearly you just need to move to an OS that is also an editor ;p
<perrito666> lol
<perrito666> I concede, you won this round
<katco> no, perrito666. emacs has won. it always wins. PLEASE HELP I CAN ONLY CONTROL MY EDITOR ONCE EVERY but emacs is great
<perrito666> loooolo
<natefinch> katco: yeah, I saw waigani's oracle integration... couldn't get it to work either
<perrito666> natefinch: I think I saw a sublime-go thing around
<natefinch> perrito666: there's a couple
<lazyPower> wwitzel3: Mornin o/ Whats the current state of the union on process management? Are we almost ready for me to have another go at the feature branch next week?
<katco> lazyPower: there's only a few user-facing things that have changed this iteration. what are you interested in?
<wwitzel3> lazyPower: yep, when we end this current iteration (Friday) we will have landed all of the major changes
<wwitzel3> lazyPower: there are still some features that we need to implement, but the interface for you will stop being a moving target
<lazyPower> katco: if I can fully test it without embedding the plugin bins in my charms :)
<lazyPower> end to end from a user perspective. I've got some new iteration charms that leverage charm composition and the reactive framework
<lazyPower> we're working on docs to land before the charmer summit and I'd love to spotlight the proc mgmt from a charming with containers perspective - as its a solid path forward for new charmers.
<lazyPower> I'm thinking i'll wind up running a workshop on it for anyone thats interested in getting hands on with the tech while we're at the summit
<katco> lazyPower: from a charmer perspective, i think wwitzel3 is correct. you should see much more stability :)
<wwitzel3> lazyPower: yeah, that's my personal goal too is to have something beta-able for the summit
<katco> lazyPower: there are still a few commands to be hashed out
<wwitzel3> yep ^
<lazyPower> Solid. Once that release lands i'll get my grubby mitts on it and do the second cycle of early feedback. Everything else was a bit natural given the model we're implementing works pretty well.
<wwitzel3> <3
<lazyPower> has core taken a look at the reactive framework example cory_fu wrote up using vanilla as an example?
<wwitzel3> I have, Cory had me do a proof of the docs before he talked about it on the last office hours.
<katco> lazyPower: i've talked to ben about it and given it a cursory look. looks awesome
<lazyPower> katco: wwitzel3: excellent. We'll have a few docker examples that we're going to re-tool to using the proc mgmt stuff once that hits (next week you say?) so having that precursory info really helps.
<katco> lazyPower: i am always a bit skeptical at our claim that you can charm in any language. all the cool stuff is in py
<wwitzel3> I'm pretty sure there is reactive hooks for bash too
<lazyPower> well, we also claim "integrate with any cm framework" when in reality we're giving you a root shell and then jerry-rig that language into juju w/ bash or python glue code
<lazyPower> but yeah, reactive works in python and bash
<lazyPower> and looks strikingly similar
<wwitzel3> yeah, not sure what form of hand wavey black magic happened to get them in to bash, but they are there
<katco> haha
<katco> that's interesting
<natefinch> what's this reactive stuff?
<wwitzel3> teh future
<lazyPower> natefinch: its a pattern to surface events and handling events in charms outside of just hook context.
<lazyPower> natefinch: one way to look at it - is a state machine in a state machine
<katco> https://www.youtube.com/watch?v=Vm8S331kUPQ
<natefinch> lazyPower: and its power is only exceeded by its mystery?
<lazyPower> natefinch: i think thats a byproduct of how new it is :)
 * perrito666 approves of quoting that movie
<lazyPower> let me fish up the docs. that video link is our office hours though and cory gives a pretty good walkthrough
<natefinch> perrito666: heh, wondered who would get that.  Love that movie.
<lazyPower> and my pop culture reference gland is on vacation today
<cory_fu> natefinch, lazyPower: http://52.0.202.26:8000/en/authors-charm-composing.html
<lazyPower> thanks cory_fu
<perrito666> lazyPower: it is a really bad movie that manages to drive the entire plot around a very cheap literary device
<wwitzel3> dangit! I was looking for that exact link
<wwitzel3> thanks a lot cory_fu
<cory_fu> There are actually three connected concepts: composing charms from layers (a pre-release step), using pre-built relation stubs, and using the reactive pattern (reacting to semantic states instead of hook contexts directly)
<cory_fu> Together, they are basically a framework, so there's an inherent amount of hand-waviness, but I think the end result is easy to follow.  Certainly more so than the services framework ever was
<natefinch> wow... that's um.  Something.
<natefinch> do you have a page written for idiots?
 * katco always shudders at the sight of php
<perrito666> o cmon, php in itself its not that bad (after 10 years) it just has tolerance for bad programmers
<katco> perrito666: i wrote lots of php in <= v4... it just seems like there are much better alternatives now. and yeah, for whatever reason it has a rep. for bad coding
<perrito666> katco: I believe it depends a lot on your business. there is a lot already done, and it allows you to get things working fast without requiring people to be savvy. I know companies that have big (properly written) chunks of their user facing bits in php because it allows them to find devs easily
<perrito666> in my city I know a good 100 ootb good php devs, 40 python (of which I had to train half) and 2 go (both working for canonical) :p
<mgz> version bump review please: http://reviews.vapour.ws/r/2505/
<mattyw> mgz, lgtm
<mgz> mattyw: ta!
<mup> Bug #1489477 opened: m4 instance types not supported on AWS <juju-core:New> <https://launchpad.net/bugs/1489477>
<mup> Bug #1489484 opened: Juju should support passing an AMI to deploy <juju-core:New> <https://launchpad.net/bugs/1489484>
<mup> Bug # changed: 1174610, 1265871, 1401368, 1447899, 1449210, 1450737, 1452649, 1455260, 1456265, 1456343, 1457122, 1459093, 1459775, 1464237, 1464392, 1467037, 1467379,
<mup> 1468153, 1468579, 1471771, 1472632, 1474614, 1475779, 1476895, 1477263, 1479546, 1481500, 1483082, 1483083, 1483086, 1484403, 1484789, 1486074, 1486640, 1486749
<frobware> dimitern, do you have some (HO) time to talk about network spaces...?
<dimitern> frobware, sure
<dimitern> frobware, standup HO ?
<frobware> dimitern, yep
<mup> Bug # opened: 1174610, 1265871, 1401368, 1447899, 1449210, 1450737, 1452649, 1455260, 1456265, 1456343, 1457122, 1459093, 1459775, 1464237, 1464392, 1467037, 1467379,
<mup> 1468153, 1468579, 1471771, 1472632, 1474614, 1475779, 1476895, 1477263, 1479546, 1481500, 1483082, 1483083, 1483086, 1484403, 1484789, 1486074, 1486640, 1486749
<mattyw> fwereade, ping?
<fwereade> mattyw, hey, am indeed back
<fwereade> mattyw, sorry I couldn't get back online
<fwereade> mattyw, I feel like we'd covered most of the high ground there though?
<mattyw> fwereade, no problem at all, I think we're ok
<mattyw> fwereade, I can't remember which side you were on about states being specific to each resolver (who is on whose side doesn't really matter)
<mattyw> we decided to make them specific
<fwereade> mattyw, excellent :)
<mattyw> fwereade, I'm just carrying on with the action resolver
<fwereade> mattyw, cool
<mattyw> fwereade, and the reason I'm pinging
<fwereade> mattyw, I still haven't sent you any comments, have I :(
<mattyw> fwereade, to have a sort of nebulous chat about it
<mattyw> fwereade, I'm fine for the moment but will hassle you later I think
<fwereade> mattyw, ok, cool
<natefinch> ericsnow: you around?
<ericsnow> natefinch: yep
<natefinch> ericsnow: in your latest comment, you say "we should be passing a full ID (e.g. "my-proc/some-id") to the hook context method"  ... but I don't think that's realistic.  The code calling the hook tool is only likely to know the workload name, not the ID.  So, like, workload-untrack myproc
<ericsnow> natefinch: I was referring to the method, not the command
<ericsnow> natefinch: at some point a bare name must be converted to a full ID
<natefinch> ericsnow: so maybe the right answer is to try to parse the argument in init, and if it has a name + id, then we're good.  If it doesn't, and it's just a name, try to look up the id for the name there, and fail.  Then run can just pass the id as-is
<natefinch> (er fail if the lookup fails)
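A minimal sketch of the argument handling natefinch describes above: accept either a full "name/id" or a bare name, and resolve a bare name to a full ID during init so that run can pass the ID along unchanged. The names resolveID, lookupFunc and mapLookup are hypothetical and are not taken from the juju code base; the "name/id" layout follows ericsnow's "my-proc/some-id" example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// lookupFunc resolves a bare workload name to its ID (hypothetical helper type).
type lookupFunc func(name string) (string, error)

// mapLookup builds a lookupFunc backed by an in-memory map, for illustration only.
func mapLookup(known map[string]string) lookupFunc {
	return func(name string) (string, error) {
		id, ok := known[name]
		if !ok {
			return "", errors.New("workload not tracked")
		}
		return id, nil
	}
}

// resolveID returns a full "name/id" identifier. A full ID is passed through
// as-is; a bare name is looked up, and a failed lookup is an error.
func resolveID(arg string, lookup lookupFunc) (string, error) {
	if arg == "" {
		return "", errors.New("missing workload name")
	}
	parts := strings.SplitN(arg, "/", 2)
	if len(parts) == 2 {
		if parts[0] == "" || parts[1] == "" {
			return "", fmt.Errorf("invalid workload ID %q", arg)
		}
		return arg, nil
	}
	id, err := lookup(arg)
	if err != nil {
		return "", fmt.Errorf("cannot resolve workload %q: %v", arg, err)
	}
	return arg + "/" + id, nil
}

func main() {
	lookup := mapLookup(map[string]string{"myproc": "some-id"})
	for _, arg := range []string{"myproc", "myproc/some-id", "other"} {
		full, err := resolveID(arg, lookup)
		fmt.Printf("arg=%q full=%q err=%v\n", arg, full, err)
	}
}
```

Doing the lookup once in init keeps the failure close to argument parsing, which is where a user would expect a "no such workload" error to surface.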
<ericsnow> natefinch: that sounds like what we've already been doing :)
<natefinch> ericsnow: maybe I didn't understand the code that was doing it, then.
<ericsnow> natefinch: well, I could be misunderstanding too :)
<fwereade> https://github.com/juju/juju/wiki/Managing-complexity -- comments and edits accepted gratefully
<fwereade> (some people have already commented and I haven't actually included all those, but I wanted to post it and not get further distracted)
<ericsnow> fwereade: thanks!
<perrito666> of course, I try to use the recommended editor by katco and among the possible key shortcuts there is one that, I kid you not, is described as: toggle the battery status
<katco> perrito666: core team call
<perrito666> well connection sucks, I tried to join but saw people move a bit then lost all
<wwitzel3> oh yay, my internet is back
<katco> wwitzel3: welcome back. juju is now rewritten in emacs lisp.
<Makyo> C-m C-x juju-mode
<wwitzel3> rewritten/assimilated
<wwitzel3> emacs will be skynet, when it decides the time is right
<katco> boy the emacs jokes are running heavy today. i had not intended that.
<perrito666> katco: lol
 * katco is also wondering if Makyo has a notification set up for emacs mentions
<Makyo> Alas, I'm a vim-er and an Atom-er, but I'm not about to pass up a good emacs joke if it comes up :)
<wwitzel3> "emacs at least it isn't acme"
<perrito666> to each its editor, I know people that work in either mcedit or gedit
<katco> wwitzel3: haha
<wwitzel3> without the banter about ones choice in editor, I'm lost .. don't ruin this for me perrito666
<katco> wwitzel3: all i'm saying is vimscript is awesome!
<perrito666> wwitzel3: life is about choices, you can bash people about their choice on about anything
<wwitzel3> perrito666: umm, but it is only fun to do it about choices that don't have any real impact or meaning
<ericsnow> cmars: does the uniter currently run any "subordinate" workers
<cmars> ericsnow, checking..
<ericsnow> cmars: as far as I can tell, it doesn't
<cmars> ericsnow, doesn't look like it
<ericsnow> cmars: I've realized that the workers we're running for workloads should be running under the uniter
<ericsnow> cmars: not directly under the unit agent
<ericsnow> cmars: maybe
<cmars> ericsnow, what do the workload workers need of the uniter?
<ericsnow> cmars: per the doc comment, "Uniter implements the capabilities of the unit agent."
<cmars> ericsnow, doc comment might be wrong now :)
<ericsnow> cmars: ha
<ericsnow> cmars: from what I can tell, the uniter is the worker that handles all the work of a unit
<ericsnow> cmars: workload [processes] are a part of that
<cmars> ericsnow, how?
<ericsnow> cmars: they are a part of the workload dictated by a charm
<cmars> ericsnow, right.. but it sounds like there's a worker associated with each process.. what does that worker need to do exactly?
<ericsnow> cmars: API calls and possibly persist some data locally
<ericsnow> cmars: plus, of course, interact with the workloads (e.g. docker)
<cmars> ericsnow, i'd see if they could be managed independently of the uniter worker if at all possible. uniter already does too much imo
<cmars> ericsnow, and then the questions to ask & answer are: what do these process workers need from the uniter? and then we can possibly encapsulate that into a resource that is managed by uniter and consumed by process workers
<ericsnow> cmars: per the uniter's manifold doc comment: "...We expect to break it up further in the coming weeks"
<cmars> ericsnow, see http://reviews.vapour.ws/r/2511/ for an example of this kind of interaction. in this PR, there's a resource that coordinates between uniter & consumers, so that we don't operate on a workload while it's upgrading, or not started
<ericsnow> cmars: under a consolidated (unit+machine) agent there will be multiple uniters per agent, right?
<cmars> ericsnow, i imagine you could do something similar to safely operate on a workload without getting entangled in the reactor loop of the uniter itself
<ericsnow> cmars: good point
<ericsnow> cmars: I definitely don't want to get involved in that :)
 * ericsnow shudders
<cmars> ericsnow, nice thing about this model is that it is so much more testable. mock out the resources and test each side of the interactions independently.. then move the end-to-end tests (conn suite) out to integration tests
<ericsnow> cmars: yay
<cmars> ericsnow, i think actions should come out as well.. though i think we'll need to get them working as they were before making that leap
<ericsnow> cmars: actions are tied into the uniter?
<cmars> ericsnow, currently, yes. they have their own operations & remote/local state resolver.
<sinzui> wallyworld: I will branch 1.25 from mgz's commit. No backports needed
<wallyworld> sinzui: that is most excellent, thank you
<sinzui> wallyworld: I will also prepare a branch to rename master 1.26-alpha1
<wallyworld> ok
<cmars> hey wallyworld, i'm looking at TestUniterUpgradeConflicts, seems to be failing for me in maltese-falcon branch
<wallyworld> cmars: i thought that one had been fixed yesterday
<cmars> what I can't figure out is, how the unit resolved state is supposed to go from ResolvedNoHooks to ResolvedNone
<wallyworld> we can fix that today if you want
<wallyworld> i need to see why it is not fixed when i thought it was
<wallyworld> unit resolved state is changed by the user in practice
<wallyworld> and then is propagated to the remote snapshot
<cmars> wallyworld, here's a paste of what i'm seeing: http://paste.ubuntu.com/12209761/
<cmars> wallyworld, two runs of the test, one with an induced panic during waitUnitAgent
<cmars> wallyworld, but it must be that the unit watcher isn't getting the update
<wallyworld> cmars: i just tried it locally and yeah, it now fails for me too, but was passing yesterday i'm sure. i can get it fixed today if you want
<cmars> wallyworld, thanks, that'd be great
<wallyworld> cmars: sure, i can't tell you right away why or when it broke
<wallyworld> cmars: with the skipped metrics related tests still in worker/uniter, what's the ETA on moving those?
<sinzui> wallyworld: can you review http://reviews.vapour.ws/r/2512/
<wallyworld> sure
<wallyworld> sinzui: lgtm, ty
<wallyworld> sinzui: do you still use thunderbird?
<sinzui> No, it is dead to me
<wallyworld> it just broke for me when i updated :-(
<wallyworld> sigh
<sinzui> wallyworld: I just use the canonical gmail. Yes I had an account issue. I have been with Canonical long enough that my identity didn't work with the standard account integration
<cmars> wallyworld, there's possibly a little bit left to move out of the context tests, checking
<sinzui> wallyworld: 1.25 branch exists and is open for business.
<wallyworld> sinzui: well after the update, it won't retrieve my email and asks for a password
<wallyworld> cmars: yesterday i relabelled the skipped metrics tests to Skip("maltese-falcon metrics"), there were about 7 maybe
<sinzui> wallyworld: oh, that is what I had issues with. authentication. I could use gmail interface though
<wallyworld> sinzui: so how did you solve it?
<sinzui> I use gmail
<wallyworld> damn, i might ask in #is
<perrito666> wallyworld: you might need to remove-add your account
<perrito666> wallyworld: it requires some love to work
<cmars> wallyworld, yep, i think we have comparable coverage for all of those & they can be removed. will comment the new tests in the review
<wallyworld> cmars: awesome, we are really close now i think. actions is the biggest one to finish
<cmars> wallyworld, that's great to hear. mattyw will be working on actions tmw, can I send him your way in case he gets stuck?
<wallyworld> cmars: sure, although william might be better as i've not had much to do with actions. but pleased to try and help
<cmars> wallyworld, ack, thanks
<jw4> cmars, wallyworld fwiw, I still watch this channel and *might* be able to help if there are actions / uniter type questions
<jw4> but william is ofc the best option
<wallyworld> jw4: hey! we are currently rewriting the uniter, so are in the middle of changing how the execution queue works
<jw4> yeah I've been noticing that chatter
<jw4> I doubt I can contribute much to that, but ...
<jw4> :)
<wallyworld> jw4: np, it's been fun. uniter is now (to me at least) much less complex now
<jw4> sweet
<jw4> It's long overdue I think
#juju-dev 2015-08-28
<axw_> wallyworld: I can't get TestUniterUpgradeConflicts to fail at all, and it's been passed by the landing bot...
<axw_> cmars: were you on the latest maltese-falcon when you saw it fail ^^ ?
<wallyworld> axw_: it failed for me when casey pinged me, but i think stuff may have landed since
<wallyworld> i'll retry
<wallyworld> in a bit
<axw_> wallyworld: ok, thanks
<cmars> axw_, yes, i'll confirm the revno in just a minute
<axw_> cmars: thanks
<cmars> axw_, yep, still getting a fail in TestUniterUpgradeConflicts at 6425ee2188
<axw_> cmars: is it intermittent? I've been running the test in a loop... :/
<cmars> axw_, it's as if the resolver doesn't get the unit state change between waitHooks & waitUnitAgent. very repeatable for me
<cmars> i haven't seen it pass at all today
<axw_> hrm
<cmars> running with go1.2.2 (was using 1.5 but got distracted by map ordering bugs, which i've set aside for later)
<axw_> I am using 1.4.2... I'll try that
<cmars> axw_, anything you'd like me to try, let me know.
<axw_> cmars: will do, ta
<axw_> cmars: still can't repro with 1.2.2. might be helpful if you could add some debug logging to uniter/resolver.go at the top, to print out the local and remote states, and pastebin a failure
<cmars> axw_, ok. in NextOp?
<axw_> yes, please
<axw_> I keep on doing that while debugging, should probably just leave it in and commit it
<cmars> axw_, http://paste.ubuntu.com/12211350/
<axw_> cmars: thanks
<axw_> cmars: would you please try applying http://paste.ubuntu.com/12211722/, and see whether it fixes the issue? there's a race between the watcher refreshing and us clearing the in-memory resolved mode
<cmars> axw_, still failing
<axw_> cmars: thanks. thinko, there's race there anyway
<axw_> cmars: so, current thinking: upgrade fails, becomes conflicted; agent is restarted and comes up *not* thinking it's conflicted; upgrade error is fixed, and resolved mode is set; agent proceeds to upgrade (setting the resolved mode was unnecessary for this), and because it doesn't think it was conflicted, doesn't clear the resolved mode
<cmars> axw_, what stores the "conflicted" state? is that just the state of the git repo (when git used to be used)?
<axw_> cmars: it's set in uniter.go when an upgrade op fails
<axw_> cmars: tho it looks like verifyWaitingUpgradeError is supposed to cater for this. anyway, gotta go out for lunch, will look again later
<cmars> axw_, ok cool. where is local state persisted?
<cmars> i think that's the issue.. if we lose the conflicted state
<menn0> waigani: if you feel inclined, could you please have a look at http://reviews.vapour.ws/r/2516/
<axw_> cmars: only some local state is persisted; conflicted is not
<axw_> cmars: what would happen in practice is that the uniter would come up, attempt to upgrade again and become conflicted again (unless the problem was fixed)
<axw_> cmars: found it... the SetAgentStatus is failing, and its error wasn't being checked
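A minimal sketch of the kind of bug axw_ found: a status update whose returned error is discarded, so a failure goes unnoticed and later state never matches what the test waits for. The statusSetter interface, its method signature and the clearConflict helper are hypothetical stand-ins, not the uniter's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// statusSetter is a hypothetical stand-in for whatever sets the agent status.
type statusSetter interface {
	SetAgentStatus(status, info string) error
}

// okSetter always succeeds; failingSetter always fails. Both exist only for the demo.
type okSetter struct{}

func (okSetter) SetAgentStatus(status, info string) error { return nil }

type failingSetter struct{}

func (failingSetter) SetAgentStatus(status, info string) error {
	return errors.New("connection is shut down")
}

// clearConflict sets the status and returns any failure to the caller
// instead of silently dropping it.
func clearConflict(s statusSetter) error {
	if err := s.SetAgentStatus("idle", ""); err != nil {
		return fmt.Errorf("cannot set agent status: %v", err)
	}
	return nil
}

func main() {
	fmt.Println("ok setter:", clearConflict(okSetter{}))
	fmt.Println("failing setter:", clearConflict(failingSetter{}))
}
```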
<TheMue> dimitern: ping. Yesterday evening I pushed http://reviews.vapour.ws/r/2504/ again. would you please take a look?
<dimitern> TheMue, sure, it's the next on my list :)
<TheMue> dimitern: great, thanks
<dimitern> TheMue, reviewed
<TheMue> dimitern: thx *click*
<dimitern> TheMue, I've tried to clarify the need for (bool, error) result on SupportsSpaces
<TheMue> dimitern: yep, I've already changed it
 * TheMue is typing with one hand, phonecall :/
<fwereade> mattyw, just added a card: "make uniter.RunCommands goroutine-safe"
<fwereade> mattyw, which contains a short note that might be relevant to actions
<fwereade> mattyw, let me know if it's useful
<fwereade> s/if/whether/
<TheMue> dimitern: so, phone call ended. I don't think an environs function should return a params type. the change of the standard not-support to params should be done inside the server-side API
<voidspace> TheMue: dimitern: dooferlad: frobware: I have a suggestion
<voidspace> TheMue: dimitern: dooferlad: frobware: why don't we postpone the retrospective/planning for next week
<voidspace> when we're face to face
<voidspace> so just have a normal standup today
<TheMue> voidspace: would be fine for me
<dimitern> voidspace, hmm, I'm not sure how are these two related :)
<dimitern> voidspace, we'll see each other f2f anyway, but frobware might be interested in how we're doing the retro/planning
<voidspace> dimitern: but planning is one of the things that really benefits from being face to face
<frobware> voidspace, agreed
<voidspace> dimitern: so we can still do a retrospective
<voidspace> but planning is one of the things we should be doing next week *anyway*
<voidspace> and it seems like a good way to take advantage of being face to face
<dimitern> voidspace, good point about the planning
<voidspace> dimitern: we could do the retrospective part this morning
<voidspace> as there is some benefit to it being fresh in our minds
<voidspace> but leave planning and estimation for face to face
<voidspace> frobware: standup?
<frobware> voidspace, yes, let's see what we want to do and when. :)
<voidspace> frobware: we're all in standup - there seems to be consensus that doing the retrospective now is good, but leave planning and estimation until next week
<frobware> voidspace, hmm. HO's tells me I'm in the hangout.
<dooferlad> frobware: https://plus.google.com/hangouts/_/canonical.com/sapphire
<voidspace> frobware: are you logged in with the right google account? I've made that mistake before and it's put me in an empty hangout
<voidspace> ah, hi
<frobware> voidspace, yep, that was it. ;)
<dimitern> voidspace, TheMue, dooferlad, guys, since we have a 1.25 branch now, just a reminder to target your PRs against master and then backport them to 1.25
<dimitern> (or vice versa - doesn't matter as long as there is a card for both and they're both done)
<voidspace> dimitern: thanks
 * dimitern will be back in ~1h
<dooferlad> dimitern: http://pastebin.ubuntu.com/12213276/
<dimitern> dooferlad, I suspect these are due to the removal of assignPrivateIPAddress
<dooferlad> dimitern: it is in export_test.go
<dimitern> dooferlad, even if you put it in export_test, in the code where it's used you need a var
<dooferlad> dimitern: same package
<dimitern> dooferlad, you can't patch a func
<dooferlad> ah
<dooferlad> dimitern: spot on
<dooferlad> dimitern: why not just change the function name then?
<dimitern> dooferlad,  :) cheers
<dooferlad> dimitern: nevermind
<dooferlad> dimitern: because you can't patch a func
<dooferlad> heh
<dimitern> dooferlad, yeah
<dimitern> dooferlad, so if you do var assignPrivateIPAddress = func(...) { .. }, then you can add var AssignPrivateIPAddress = &assign.. in export_test
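A minimal sketch of the pattern dimitern describes: a declared func cannot be replaced at run time, but a package-level variable holding a func value can, and export_test.go can expose a pointer to it. The signature of assignPrivateIPAddress, the setUpNetworking caller and the patch helper are assumptions made for illustration; only the variable name comes from the discussion above.

```go
package main

import "fmt"

// assignPrivateIPAddress is a variable holding a function, so tests can swap it.
// The signature and the returned address are placeholders.
var assignPrivateIPAddress = func(instanceID string) (string, error) {
	return "10.0.0.1", nil
}

// In export_test.go (same package, compiled only for tests) one would add:
//     var AssignPrivateIPAddress = &assignPrivateIPAddress

// setUpNetworking is a hypothetical caller; it always goes through the variable.
func setUpNetworking(instanceID string) (string, error) {
	return assignPrivateIPAddress(instanceID)
}

// patch replaces *target and returns a function that restores the old value.
func patch(target *func(string) (string, error), replacement func(string) (string, error)) (restore func()) {
	old := *target
	*target = replacement
	return func() { *target = old }
}

func main() {
	restore := patch(&assignPrivateIPAddress, func(string) (string, error) {
		return "192.0.2.7", nil
	})
	addr, _ := setUpNetworking("i-123")
	fmt.Println("patched:", addr)

	restore()
	addr, _ = setUpNetworking("i-123")
	fmt.Println("restored:", addr)
}
```

Renaming the function would not help, as dooferlad notes: the limitation is that a func declaration is not assignable, whatever it is called.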
<dimitern> ok, now I'm really away :)
<urulama> dimitern: hey, you got a sec to take a look at this https://github.com/go-goose/goose/pull/15
<urulama> dimitern: it'll land in "liberty" branch, but would like to get some feedback from you
<dimitern> urulama, sure, looking
<bogdanteleaga> sinzui, mgz_: is this a known failure? http://juju-ci.vapour.ws:8080/job/github-merge-juju-testing/14/console
<dimitern> urulama, reviewed
<urulama> dimitern: thanks. not sure that roles are part of v3 ... might have missed it though
<dimitern> urulama, it appears so, according to the official API docs
<urulama> dimitern: ah, right, domain + role name is globally unique across all domains
<dimitern> urulama, yeah, and as commented, in all cases where either ID or Name can be used, we should support both I think, if it's not a lot more work ofc
<sinzui> bogdanteleaga: While I am not familiar with the failures for this job, I would still retry it because mgo_test in juju is a common cause of intermittent failures
<dimitern> sinzui, bogdanteleaga, I was looking at this - it seems the failure only happened after the changes in the "testfix" branch
<bogdanteleaga> dimitern, yes, I did notice that but I cannot reproduce it locally
<rogpeppe> i've got a PR up for review here which will enable environment providers to generate the boilerplate YAML automatically from the environment schema. reviews appreciated: https://github.com/juju/environschema/pull/6
<dimitern> bogdanteleaga, different go versions and/or arch/os ?
<rogpeppe> mgz_: this is the kind of thing you were alluding to, i think
<bogdanteleaga> dimitern, all it seems to do for mongo is add one extra mongo path
<bogdanteleaga> dimitern, 1.2.1 amd64 on trusty
<bogdanteleaga> dimitern, I think that's what the box is using
<dimitern> bogdanteleaga, yeah, that change seemed innocent enough to me, but .. well
<bogdanteleaga> dimitern, it is weird because I ran them about 10 times locally and I got no failure
<bogdanteleaga> dimitern, while it failed 3 times consecutively on the ci
<dimitern> sinzui, can we tweak that job to run with TEST_LOGGING_CONFIG='<root>=TRACE' ?
<dimitern> sinzui, temporarily of course
<dimitern> bogdanteleaga, better logging will help - when it fails it seems to take ~1m each time
<bogdanteleaga> dimitern, sinzui, tried adding a mongod executable on the new path just in case
<bogdanteleaga> doesn't seem to change anything
<dimitern> what's more likely to be causing it is if JUJU_MONGOD is set in the job environ I think
<bogdanteleaga> yeah, but then that one becomes the first one and nothing should change
<dimitern> yeah.. something else is in play
<natefinch> I should really remember to take my own advice
<wwitzel3> katco: are we starting at 10 or 10:30?
<katco> wwitzel3: 10
<katco> ericsnow: natefinch: stand up
<mgz_> someone want to volunteer to back out the last change on master to unblock?
<mgz_> see bug 1489896
<mup> Bug #1489896: Juju cannot upgrade to 1.26-alpha1 <blocker> <ci> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1489896>
<mup> Bug #1489896 opened: Juju cannot upgrade to 1.26-alpha1 <blocker> <ci> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1489896>
<mgz_> I suggest it should be OCR job if no one else steps up
<mup> Bug #1489896 changed: Juju cannot upgrade to 1.26-alpha1 <blocker> <ci> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1489896>
<mup> Bug #1480310 changed: systemctl link request failed for service FOO: Unit name FOO is not valid. <blocker> <ci> <systemd> <wily> <juju-core:Fix Released> <systemd (Ubuntu):Fix Released by pitti> <https://launchpad.net/bugs/1480310>
<mup> Bug #1489896 opened: Juju cannot upgrade to 1.26-alpha1 <blocker> <ci> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1489896>
<mattyw> fwereade, ping?
<fwereade> mattyw, pong
<mattyw> fwereade, hey hey, would you have 5 minutes to talk about the action pr?
<fwereade> mattyw, oops, just saw mail
<fwereade> mattyw, sure; and the end result of that retry command is just what I wanted, yeah
<mattyw> fwereade, I'm going to be in chicago next week so wanted to chat before I end up in an awkward timezone
<fwereade> mattyw, good idea, thanks
<mattyw> fwereade, https://plus.google.com/hangouts/_/gtdt43qnnxyz7gcbbdn5clz4u4a?pqs=1&hl=en&authuser=0
<mgz> OCR or anyone else: review please to unblock master: http://reviews.vapour.ws/r/2523/
<natefinch> mgz: is that just a straight up revert?
<mgz> natefinch: yes
<mgz> natefinch: compare with http://reviews.vapour.ws/r/2250/
<voidspace> right, EOW
<voidspace> bye all
<voidspace> see some of you in London next week
<natefinch> mgz: ship it
<cmars> thanks natefinch.. i lacked context on that one, would have been a good part of my afternoon to review :)
<natefinch> cmars: it was just a revert, so all I did was make sure it looked pretty much like the prior commit.
<cmars> ah, right. first thing I saw was the size of the diff
<ericsnow> natefinch: in moonstone
<natefinch> ericsnow: got some time for a little more help?  I can't figure out how to get basecommand to get the ID from the args... I keep getting empty data for some reason
<ericsnow> natefinch: k
<natefinch> ericsnow: There's no destroy in the API.. how is destroy different than untrack?
<ericsnow> natefinch: destroy on the plugin (I misspoke)
<natefinch> oh ok
<natefinch> ericsnow, katco: landing that 3 pointer now... just waiting for the landing bot to take it.
<katco> natefinch: yay! good work! :D
<ericsnow> natefinch: sweet
<natefinch> gotta run, but I'll pop back in to double check it really landed for true
<natefinch-afk> ericsnow, katco: and, landed!
<katco> natefinch-afk: woohoo!
<katco> natefinch-afk: good work dude!
<bdx> core, dev: Does anyone here know how to add custom cloud-config to maas provisioning... i.e. curtin_userdata preseed or custom preseed?
#juju-dev 2015-08-29
<mup> Bug #1490075 opened: juju use lxcbr0 rather than juju-r0 <juju-core:New> <https://launchpad.net/bugs/1490075>
<mup> Bug #1490075 changed: juju use lxcbr0 rather than juju-r0 <juju-core:New> <https://launchpad.net/bugs/1490075>
<mup> Bug #1490075 opened: juju use lxcbr0 rather than juju-r0 <juju-core:New> <https://launchpad.net/bugs/1490075>
#juju-dev 2016-08-29
<veebers> wallyworld: I'm just about to head out for lunch, but I notice when running the grant/revoke test that the output of list-users seems to have changed again
<veebers> I don't see display-name for users that are added in the output
<wallyworld> display name is only displayed if there is one
<wallyworld> external users don't have a display name
<wallyworld> otherwise the output is filled with empty attributes
<wallyworld> the test should not be fragile though to break on such changes
<wallyworld> it needs simply to parse the json or yaml
<wallyworld> whether display name attribute is in the output should be immaterial
<wallyworld> the parsed data will be the same
<veebers> wallyworld: To be clear  it's expected to not show an empty display name?
<wallyworld> in which type of output? json? yaml? tabular?
<wallyworld> for json or yaml, it is immaterial whether empty display name is there or not, and the test should not depend on it
<wallyworld> the parsed data is the same either way
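A minimal sketch of wallyworld's point that a test should compare parsed data rather than raw output: once the JSON is decoded into a struct, an omitted display name and an empty one are indistinguishable. The userInfo struct, the "user-name" key and the sample payloads are assumptions made for illustration; only the "display-name" key comes from the discussion, and the real list-users output has more fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// userInfo is a hypothetical subset of one list-users entry.
type userInfo struct {
	UserName    string `json:"user-name"`
	DisplayName string `json:"display-name,omitempty"`
}

// parseUsers decodes a JSON array of user entries.
func parseUsers(data []byte) ([]userInfo, error) {
	var users []userInfo
	if err := json.Unmarshal(data, &users); err != nil {
		return nil, err
	}
	return users, nil
}

// sameUsers reports whether two JSON documents decode to the same users.
func sameUsers(a, b []byte) (bool, error) {
	ua, err := parseUsers(a)
	if err != nil {
		return false, err
	}
	ub, err := parseUsers(b)
	if err != nil {
		return false, err
	}
	return reflect.DeepEqual(ua, ub), nil
}

func main() {
	withEmpty := []byte(`[{"user-name":"bob","display-name":""}]`)
	omitted := []byte(`[{"user-name":"bob"}]`)
	same, err := sameUsers(withEmpty, omitted)
	fmt.Println(same, err)
}
```

A test written this way keeps passing whether or not the command emits empty attributes, which is the robustness being asked for here.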
<veebers> wallyworld: json output, currently the test compares predefined datastructures against the resulting output.
<veebers> I'm confident that this was working with the current assumptions (i.e. display-name: "") on Friday, is this a recent change?
<wallyworld> yeah, can't recall exactly when it landed
<veebers> wallyworld: Ok. I'll make the change to the test to not expect display-name when not set
<wallyworld> that will be good, thanks
<thumper> veebers: ping me when you are back
<veebers> thumper: hey o/
<thumper> veebers: hey ehy
<veebers> thumper: sorry I left for the gym later than originally intended
<thumper> that's fine
<thumper> hangout?
<veebers> sure thing
<veebers> thumper: which one/
<thumper> veebers: https://hangouts.google.com/hangouts/_/canonical.com/lxd?v=1471979335&clid=FD6B1EDF18B181C1&authuser=0
<veebers> omw
<veebers> wallyworld: would it take you less than 2 minutes to find the commit id for when empty display name was removed from juju output? Would be nice to include in the MP for this test :-)
<wallyworld> probably, i'll look
<wallyworld> veebers: it also omits last login time and date created if empty, but those will only be empty for external users. i assume the CI tests are not doing anything with external users yet (but they should at some point) edc0b5324b2a110eb838852f2c7bc5f26d627efd
<veebers> wallyworld: awesome thanks. Yeah already ignores login/date stuff. We have a task for the external testing (that needs some fleshing out)
<wallyworld> great
<thumper> menn0: with you in a minute
<menn0> ok
<wallyworld> menn0: you going to add QA steps to your PR?
<menn0> wallyworld: yes, working on that now
<wallyworld> ok, awesome
<menn0> wallyworld: turns out it's harder to QA than I realised
<wallyworld> i can imagine
<veebers> thumper: just saw the email response. I should be able to get the db dump, but not sure about access, that's more of a sinzui,balloons,tbaumann thing
<thumper> veebers: just replied, apt install tmate
<thumper> really cool
<veebers> thumper: it's not really my machine to install or give access to :-\
<veebers> thumper: although we can always ask for forgiveness
<menn0> wallyworld: trying to do the QA steps for this PR has made me realise that I probably need to do the prechecks before waiting for agents to report that they're in readonly mode
<wallyworld> hmmm, seems reasonable
<menn0> wallyworld: otherwise when you have machines that are being provisioned or down, they'll never report that they're in readonly mode and it takes 15 mins before the migration is aborted
<wallyworld> ouch
<menn0> wallyworld: so ignore my PR for now. I'm going to do another one which swaps the phases around
<wallyworld> i haven't read the PR yet :-)
<menn0> wallyworld: ok np. that PR won't actually change, but it'll be easier to QA once I swap the phases
<wallyworld> sgtm
<thumper> menn0: big problem was out of date lxd container
<thumper> had an old lxd package
<thumper> which blocked on things
<thumper> updating containers fixed the CI test
<menn0> thumper: how did you update the containers?
<thumper> apt update && apt upgrade
<wallyworld> thumper: why the fark wouldn't CI ensure it pulled the latest packages to test with ie what our customers are actually going to use?
<thumper> wallyworld: these are long standing lxd machines that are just stopped / started for the ci tests
<thumper> they just got out of date
<wallyworld> hmpf. they should be kept up to date :-)
<thumper> yes, they should
<natefinch> wallyworld: I have some questions about the certupdater, got a few minutes?
<wallyworld> ok
<natefinch> wallyworld: so, the certupdater has a watcher that watches the addresses of the machine it's on.... which seem to be different than the addresses in APIHostPorts.... shouldn't it watch APIHostPorts?
<wallyworld> not sure tbh. it needs to update the cert SAN list with whatever wget on the instance nodes connects back to
<wallyworld> i can't remember where wget gets its info from
<wallyworld> how is APIHostPorts different to the machine addresses?
<natefinch> wallyworld: the issue seems to be that the machine's list of addresses for manual provider doesn't include the cloud-local address, but the APIHostPorts does include that address
<natefinch> wallyworld: I guess the other question is.. where does the machine's address list get populated?
<wallyworld> i know there's an instance poller on clouds
<wallyworld> and i think there's somewhere else that queries any machine NICs
<wallyworld> natefinch: i looked at the code - it does use apihost ports initially
<natefinch> wallyworld: yeah, the apihostports initially don't have the cloud-local, but they get updated
<wallyworld> so why wouldn't the updated addresses get populated in the machine addresses
<natefinch> not sure if that's a race condition or what, but they get updated with cloud local later... but since the certupdater isn't watching apihostports, it doesn't get the new address
<natefinch> good question.  I don't know
<wallyworld> i think the assumption is that they would. i'd have to go through the code to see what's happening
<wallyworld> trace what is updating apihostports
<natefinch> ok, I'll try that and see why it's not also updating the machine addresses
<wallyworld> that's a good starting point
<natefinch> it's turtles all the way down :/
<thumper> menn0: http://pastebin.ubuntu.com/23106006/
 * thumper sighs
<thumper> menn0:  http://reviews.vapour.ws/r/5552/
<thumper> menn0: as discussed, moved the introspection worker out of the engine
<menn0> thumper: give me a sec
<thumper> no worries
<menn0> thumper: swap you: http://reviews.vapour.ws/r/5551/
<thumper> ok
<menn0> thumper: I was initially going to swap PRECHECK and QUIESCE but then it occurred to me that we didn't actually need both phases
<thumper> ok...
<menn0> thumper: the prechecks are now just done at the start of the QUIESCE phase, at the same time as the agents are going to readonly mode
<thumper> menn0: why all the phase changes from "PRECHECK" to "IMPORT"?
<thumper> in the first file
<menn0> thumper: because PRECHECK no longer exists. IMPORT is now next phase after QUIESCE
<thumper> ok
<menn0> thumper: so you didn't mean to show me juju-engine-report ?
<menn0> looks pretty useful
<thumper> menn0: well, that is in the review QA
<thumper> but pasted the wrong buffer
<menn0> got it
<menn0> thumper: review done. just little things.
<thumper> menn0: ta
<thumper> menn0: replied, would like your thoughts before I do much else
<menn0> thumper: looking
<menn0> thumper: where does the contents of introspectionWorkerBashFuncs get put by cloud-init?
<menn0> thumper: never mind... I checked myself
<menn0> thumper: replies done.
<thumper> ta
 * thumper is off now
<thumper> see y'all tomorrow
<menn0> wallyworld, thumper: presence doesn't appear to be working at all... I shut down a machine agent 20 mins ago and it still has a status of "started"
<wallyworld> awesome
<menn0> wallyworld: it works for unit agents but doesn't appear to work for machines
<menn0> I have a terrible theory...
 * menn0 checks
<wallyworld> axw: if you have a moment, not urgent, i have a PR which tweaks some bootstrap messages http://reviews.vapour.ws/r/5553/
<axw> wallyworld: looking
<axw> wallyworld: " - juju-b01da0-0 ted"   <- what is ted?
<wallyworld> axw: damn. there was some code to move the cursor to the start of the line and it looks like a message has changed in length
<wallyworld> i'll have to add padding or something
<wallyworld> looks like we just got lucky before
<wallyworld> i didn't even notice :-)
<wallyworld> axw: added padding to width of 40, seems reasonable to me as other messages printed are a bit more than that
<wallyworld> other messages during bootstrap i mean
<wallyworld> not other messages on that line that is overwritten
<axw> wallyworld: there are terminal escape codes for clearing lines, but I'm not sure about portability. seems reasonable.
<axw> (what you've done; LGTM)
<wallyworld> axw: yeah, i was afraid to mess with such things and screw up on windows etc
<wallyworld> or when we pipe etc
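[Editor's sketch] The stray "ted" axw spotted is what is left over when a shorter message is printed over a longer one after a carriage return. A minimal Go sketch of the padding approach wallyworld describes follows; the width of 40 comes from the conversation, while the `padLine` helper and the sample messages are hypothetical and not taken from the juju source.

```go
package main

import "fmt"

// padLine left-justifies msg in a field of the given width, so that
// printing it after a carriage return fully covers a longer earlier message
// (as long as that earlier message was no wider than the field).
func padLine(msg string, width int) string {
	return fmt.Sprintf("%-*s", width, msg)
}

func main() {
	// Hypothetical progress messages; the second is shorter than the first.
	first := " - machine-0 (waiting for address)"
	second := " - machine-0 (started)"

	// "\r" moves the cursor back to the start of the line without clearing it.
	fmt.Printf("\r%s", padLine(first, 40))
	fmt.Printf("\r%s\n", padLine(second, 40))
}
```

Padding avoids terminal escape sequences entirely, which matches the portability concern raised above about Windows and piped output; the trade-off is that a message longer than the chosen width would still leave residue.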
<perrito666> morning
<wallyworld> fwereade: you around?
<fwereade> wallyworld, o/
<wallyworld> hey
<wallyworld> fwereade: question for you
<fwereade> wallyworld, sure
<fwereade> wallyworld, (it's quiet today... uk bank holiday... maybe more?)
<wallyworld> i think you were starting to look at refactoring state a bit. perrito666 has an issue where there's the model manager facade which has a controller connection, hence its state has the controller model uuid. but being a model manager facade, it needs to operate on multiple models. so the facade needs to use a state with a given model uuid assigned to it so that the correct model uuid is automatically used with the doc ids. but state.ForModel()
<wallyworld> starts all the listeners etc. what's needed is just the newState() bit which constructs a state without all the other stuff just to allow collections to be correctly read and written
<wallyworld> does that make sense?
<fwereade> wallyworld, perrito666: I think the abstraction you are looking for is a state.Database
<perrito666> actually we should actually refactor some endpoints, the split of facades into controller or models was a bit careless, some facades needed a bit of split
<wallyworld> one option discussed with menno is to use the state pool
<wallyworld> i've not come across the state database, will need to look
<fwereade> wallyworld, perrito666: in particular if you sometimes want a collection to be global and sometimes not, I think the RTTD is to create databases with different schemas and use those
<wallyworld> all that's needed is a state object without all the listener stuff, but which does the model uuid thing
<fwereade> wallyworld, perrito666: yes
<fwereade> wallyworld, perrito666: that is what Database does
<wallyworld> ah ok, that's what we want then :-)
 * perrito666 goes reading
<perrito666> why didn't I know of this before?
<fwereade> wallyworld, perrito666: it's still internal to state but I'm 99% sure it's the piece closest to what you need
<wallyworld> if it does what i state above, then awesome
<fwereade> perrito666, not sure, but if you've used allcollections.go the Database is the thing whose operation/structure is defined by it
<fwereade> perrito666, well, the database, which happens to implement Database
<wallyworld> i'll leave you guys to it, need to go get my son from work
<fwereade> wallyworld, o/
 * wallyworld the match maker :-)
<fwereade> <3
<rock__> Hi. I have a juju openstack setup deployed on LXD. Unfortunately I did #juju logout --force. Now I am not able to see #juju status. I tried to do #juju login but it was asking for a username and password. Does it have default credentials?
<rock__> .
<rock__> .local/share/juju/accounts.yaml is empty
<rock__> How can I get my previous setup?
<rock__> please help me in this.
<mup> Bug #1617526 changed: cmd/juju: no help available for common flags <help> <ui> <juju:New> <https://launchpad.net/bugs/1617526>
<mup> Bug #1617602 changed: juju status <service-name> stuck <status> <juju:New> <juju-utils:New> <https://launchpad.net/bugs/1617602>
<mup> Bug #1617820 changed: JUJU fails in bootstrapping used with openstack <conjure-up:New> <juju:New> <https://launchpad.net/bugs/1617820>
<natefinch> standup for rick's juju core team standup thing  (do we have a standup?)
<natefinch> I mean, we do have a standup.. but it's kinda barren
<natefinch> oh, public holiday in the UK kinda decimates our standup
<beisner> hi juju devs - is there a way to recover the credentials if a user --forces logout before changing the initial account password?  (fyi rock__ ^)
<perrito666> beisner: I honestly don't know
<beisner> it seems like 'no' given the warning http://pastebin.ubuntu.com/23107425/
<rock__> beisner: I got the same as http://pastebin.ubuntu.com/23107425/. Then I ran #juju logout --force. From then onwards I am not able to see #juju status, and I am not able to do #juju login again because I lost the account password.
<mup> Bug #1617602 opened: juju status <service-name> stuck <status> <juju-core:Incomplete> <juju-utils:New> <https://launchpad.net/bugs/1617602>
<beisner> rock__, right, i think it is expected that you set the password to a known-value before logging out.
<rock__> beisner: So now we can't do anything, right?
<beisner> rock__, as far as i know, the creds are required in order to interact with the existing model
<rock__> beisner: OK. thank you.
<beisner> rock__, yw.  sorry i don't have a better answer on that.
<natefinch> There's probably a way to hack it if you can log into the controller machine and then twiddle with the database directly.  I don't have a concrete list of steps to do that, though
<rock__> beisner: It is Ok. You provided me good info.
<rock__> When I run #juju status. It was giving : ERROR No controller.  Please either create your own new controller using "juju bootstrap" or connect to another controller that you have been given access to using "juju register".
<redir> morning
<natefinch> mental note: bootstrapping to a pre-existing machine with manual is faster than even lxd
<thumper> morning folks
<katco> heya thumper
<thumper> katco: morning
<alexisb> morning thumper
<sinzui> ses_: https://hangouts.google.com/hangouts/_/canonical.com/curtis
<natefinch> thumper: is there a way to make loggo print out milliseconds in the timestamp?  it would help when using grep during busy periods in the log
<thumper> natefinch: with debug-log, yes
<thumper> --ms
<thumper> not for the default file writers
<thumper> but the logs in the db have full time precision
<thumper> and debug-log can now show that
<natefinch> thumper: ahh, that's too bad
<thumper> actually...
<thumper> I added an environment variable for the default logger
<thumper> which is to the terminal
<thumper> but not for the file writers
<natefinch> thumper: this is 1.25... looks like no --ms on debug-log there
<thumper> natefinch: export LOGGO_TIME_FORMAT="15:04:05.000"
<thumper> natefinch: see if that helps :)
 * natefinch just hacks the loggo source...
<redir> use the source nate
<katco> anyone have an opinion of where this should live? https://github.com/juju/juju/blob/master/cmd/juju/application/store.go#L155-L166
<katco> candidates i have found are: github.com/juju/juju/charmstore gopkg.in/juju/charmrepo
<thumper> katco: I think it would be nice to have some package that encapsulates juju's interaction with the charmstore
<thumper> but no idea where
<thumper> perhaps your first choice?
<katco> thumper: i think that might be github.com/juju/juju/charmstore?
<thumper> but I don't know what is in there now
<thumper> seems reasonable to me from a distance :)
<katco> thumper: i will consider that an affirmation from a verified expert whose word is ironclad and whom i can blame when this blows up.
<thumper> haha
<alexisb> thumper, do you have time to run through bugs with me?
<alexisb> or are you on other tasks
<thumper> alexisb: I'm trying to finish off a branch from yesterday
<thumper> if you can live without me
<alexisb> thumper, sure
<perrito666> k running a long test, bbl
<alexisb> redir, menn0 did the fix in mgo needed for this bug land yet? : https://bugs.launchpad.net/juju/+bug/1604817
<mup> Bug #1604817: Race in mgo Stats implementation <ci> <intermittent-failure> <race-condition> <regression> <unit-tests> <juju:In Progress by reedobrien> <https://launchpad.net/bugs/1604817>
<redir> alexisb: I haven't seen anything on the upstream PR
 * menn0 double checks
<menn0> alexisb: nope, no response on that one
<alexisb> k
<alexisb> wallyworld, is this still a valid bug: https://bugs.launchpad.net/juju/+bug/1612717
<mup> Bug #1612717: Pinger facade not implemented on controller websocket connection <juju:Triaged> <https://launchpad.net/bugs/1612717>
<wallyworld> not sure, haven't seen it
<wallyworld> probably it could be
<wallyworld> seems like fallout from rog's work
<alexisb> thanks wallyworld; following up with that team
<wallyworld> alexisb: thumper: are we having this meeting?
<alexisb> redir, menn0 ping
<redir> alexisb: ack
<redir> pong
<redir> whatever
<alexisb> redir, standup
<redir> doh!
<redir> brt
<wallyworld> redir: looking at your branch now
<redir> wallyworld: tx
<wallyworld> redir: lgtm with a couple of small things
<redir> wallyworld: tx
#juju-dev 2016-08-30
<redir> wallyworld: PTAL http://reviews.vapour.ws/r/5548 and make sure I understood you.
<wallyworld> ok
<wallyworld> redir: you sure you pushed the changes? i can't see a new diff
<redir> shit I need to rbt it
<redir> RB ate my original PR
<redir> wallyworld: ^^ rbt post -u run...
<wallyworld> ok
<wallyworld> axw: looking at the cloud facade - the credential apis take / return tags. but i thought our general principle was that tags are not to be exposed outside of the apis (wire format only). we pass in names etc and these are converted to tags inside api/*
<axw> wallyworld: feel free to change it. I like the type safety and documentation it brings
<wallyworld> axw: yeah, i get that too. i'll leave for now, but we should agree on a consistent approach
<wallyworld> axw: should raise as a tb topic
<wallyworld> axw: damn, and we also leak params structs for results where we return a slice of {*Value, *Error}
<axw> wallyworld: where?
<wallyworld> everywhere, eg
<wallyworld> func (c *Client) StorageDetails(tags []names.StorageTag) ([]params.StorageDetailsResult, error)
<axw> wallyworld: oh, yeah we do that in many places. I thought you meant in Cloud
<wallyworld> sigh, i'll just stick to current convention for now
<thumper> wallyworld, menn0: can you see this ? http://imgur.com/a/zmSPH
<wallyworld> yes
<thumper> good?
<menn0> nice
<wallyworld> except for colour blind people
<menn0> looks good to me... the red could have better contrast IMO but that comes down to your terminal color settings
<thumper> I can't help them...
<menn0> thumper: I'm not sure that "unknown" should be highlighted. it's not that unusual for a charm to not set workload status
<thumper> right, but if we mark it yellow, people will want to set it green, right?
<wallyworld> thumper: we can help them by choosing better colours, like jenkins does
<wallyworld> but red and green are nice i admit
<natefinch> use red for bad, blue for good.  add symbols to make it more clear - ✗ error   ✓ success
<natefinch> ? unknown
<thumper> no symbols with this change
<natefinch> red / blue then
<thumper> huh...
<thumper> interesting
<thumper> tried bright blue for the good color
<thumper> but watch -c doesn't show that
<thumper> weird
<thumper> less -R does
<thumper> I'm tempted to stay with green for now and see what response we get
<natefinch> 10% of men are color blind. That's probably 9% of our target market...  I personally know several color blind people.  It's just a bad idea, and sort of inexcusable in this day and age. It sucks the default linux tools don't cooperate, but that doesn't mean we shouldn't do the right thing anyway.
<natefinch> sounds like we might be hitting something like this: https://github.com/cloudfoundry/cli/issues/840
<natefinch> do we always output color, or do we do some terminal detection?
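[Editor's sketch] natefinch's question about terminal detection can be made concrete with a small Go example. This is an assumption-laden sketch, not how juju decides on colour: `isTerminal` uses a standard-library-only heuristic (a character device check, which also matches things like /dev/null), and `colorize` is a hypothetical helper that wraps text in an ANSI SGR escape sequence only when colour is enabled.

```go
package main

import (
	"fmt"
	"os"
)

// isTerminal is a rough heuristic: it reports whether f is a character
// device. Pipes and regular files are not, so colour is skipped for them.
func isTerminal(f *os.File) bool {
	info, err := f.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice != 0
}

// colorize wraps s in an ANSI SGR sequence (for example 32 for green,
// 34 for blue, 31 for red) when enabled, and returns s unchanged otherwise.
func colorize(s string, code int, enabled bool) string {
	if !enabled {
		return s
	}
	return fmt.Sprintf("\x1b[%dm%s\x1b[0m", code, s)
}

func main() {
	useColor := isTerminal(os.Stdout)
	fmt.Println(colorize("active", 32, useColor))
	fmt.Println(colorize("error", 31, useColor))
}
```

Keeping the decision in one boolean makes it easy to add an explicit override later, for cases such as `watch -c` or `less -R` where output is piped but the reader still wants colour.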
<wallyworld> axw: if you get bored of looking at juju heap dumps, no rush or anything, http://reviews.vapour.ws/r/5554/
<axw> wallyworld: will take a look after lunch
<wallyworld> sure, no hurry
<menn0> thumper: phew... source prechecks done. just 2 more target prechecks to go.
<menn0> doing some manual QA now
<menn0> veebers: the migration prechecks around juju versions, upgrades in progress and machine health have all landed
<menn0> veebers: and there's a bunch more on their way
<veebers> menn0: sweet. Tomorrow morning I might take some of your time to flesh out any remaining CI tests for the migration parts
<menn0> veebers: sounds good
<thumper> review up: http://reviews.vapour.ws/r/5555/
<thumper> natefinch: it was just the blue, watch showed red/green/yellow, thinking it is the high 8
<thumper> wallyworld, menn0: here is what blue for good looks like http://imgur.com/a/4kMVu
<wallyworld> thumper: ah, so you looked at what jenkins does :-)
<thumper> no
<thumper> nate suggested it
<thumper> wallyworld: review is ready
<wallyworld> thumper: that's what i suggested too, sigh
<thumper> it is a one line change to go from green -> blue
<thumper> ok, you too
<wallyworld> looking
 * thumper heading off to bjj, will check on review later
<menn0> thumper-afk: I prefer green (but only slightly)
<axw> wallyworld: how will the ListCredentials API be used?
<axw> wallyworld: just wondering what's the use if you can't see all of the attributes
<wallyworld> axw: the GUI will use it to load credential data to be displayed and to allow the user to paste in new secrets if they want to update their credentials. it still needs to be fleshed out
<wallyworld> so for openstack say, you can see your domain and tenant and username etc
<wallyworld> but not the secret
<wallyworld> i'm thinking we can always add in secrets if needed
<axw> wallyworld: yeah ok, I guess that's fine. maybe we could set the value to "" for those attributes at least, so you can tell that they're redacted rather than missing
<wallyworld> that's a fair point
<wallyworld> i'll do that
<axw> wallyworld: or a separate field with names of refacted attrs
<axw> redacted*
<axw> le shrug
<wallyworld> separate field might be ok too
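[Editor's sketch] The two options axw floats above, blanking secret values and listing the redacted attribute names separately, can be combined in one helper. The Go sketch below is illustrative only: `redactSecrets`, the attribute names, and the idea of passing the secret keys as a set are all hypothetical and do not reflect the actual ListCredentials API.

```go
package main

import (
	"fmt"
	"sort"
)

// redactSecrets returns a copy of attrs in which every attribute named in
// secretKeys has its value replaced by "", together with the sorted names
// of the attributes that were blanked.
func redactSecrets(attrs map[string]string, secretKeys map[string]bool) (map[string]string, []string) {
	out := make(map[string]string, len(attrs))
	var redacted []string
	for name, value := range attrs {
		if secretKeys[name] {
			out[name] = ""
			redacted = append(redacted, name)
			continue
		}
		out[name] = value
	}
	// Map iteration order is random, so sort for a stable result.
	sort.Strings(redacted)
	return out, redacted
}

func main() {
	// Hypothetical openstack-style attributes.
	attrs := map[string]string{
		"username":    "bob",
		"tenant-name": "demo",
		"password":    "hunter2",
	}
	visible, redacted := redactSecrets(attrs, map[string]bool{"password": true})
	fmt.Println(visible["username"], visible["tenant-name"], redacted)
}
```

Keeping the key present with an empty value lets a client tell "redacted" apart from "missing", and the separate list removes any ambiguity with attributes that are legitimately empty.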
<wallyworld> axw: so there's just the revoked issue to consider now - as explained in the rb comment, the api just sets the revoked flag; we need to decide how to best handle that in subsequent prs
<axw> wallyworld: ok, will look again in a moment
<wallyworld> no rush
<axw> wallyworld: I'm having second thoughts about delaying removal of revoked creds now, but we can change it in a follow up if needed
<axw> wallyworld: just thinking, in terms of security you probably want that gone from the db ASAP
<axw> rather than waiting around for something to stop referring to them
<wallyworld> axw: i was kinda thinking the same thing, and that goes along with not shipping secrets by default. the reason for a new type is that I wanted to keep cloud.Credential nice and clean, but i guess adding a redacted slice is ok
<wallyworld> axw: with removing them straight away, the idea is that a followup would implement the listener to immediately notify everything that those credentials are gone; that also allows them to be replaced immediately with the same name
<wallyworld> so i'm happy to revert to what was there originally
<wallyworld> axw: you +1 with s/Revoke/Remove in state ?
<axw> wallyworld: you don't need to add Redacted to cloud.Credential yet
<wallyworld> axw: yeah, came to the same conclusion
<wallyworld> have removed the new type
<TheMue> morning
<Mmike> Hi, lads. How do I get older version of a charm source? charm-pull-source seems to download only the most recent version. Or: where do I ask questions like this? :)
<perrito666> morning
<voidspace> perrito666: morning
<TheMue> ah, two old colleagues. already wondered where the European ones are.
<babbageclunk> fwereade: Thanks for the review - have you got a moment to talk about the Undertaker tests?
<perrito666> TheMue: I am not in europe :p
<fwereade> babbageclunk, let's
<rogpeppe> mgz: just went back to this after a while and found that it had failed, but it's not clear why. any idea? http://juju-ci.vapour.ws:8080/job/github-merge-juju/8952
<TheMue> perrito666: no, but at least voidspace should be
<babbageclunk> fwereade: Wanna hangout?
<TheMue> perrito666: has been so quiet in the channel this morning
<fwereade> babbageclunk, sgtm
<perrito666> TheMue: well it's like 8:30 for me :p
<babbageclunk> fwereade: https://hangouts.google.com/hangouts/_/canonical.com/core?hl=en&authuser=1
<mgz> rogpeppe: I'll take a look
<rogpeppe> mgz: ta
<rogpeppe> mgz: not that important, but it's useful to know how/why these things fail
<mgz> I agree
<rogpeppe> mgz: i sometimes wonder if we should auto-retry when an intermittent error happens that's not to do with the code being tested
<mgz> I'd like to surface those to github better at least
<rogpeppe> mgz: yeah
<rogpeppe> mgz: the main one is that if a merge has been blocked because of critical bugs, it would be good to retry when unblocked
<rogpeppe> mgz: i've actually thought of doing a little daemon that would do that for me
 * babbageclunk goes for a run.
<voidspace> TheMue: I'm still in Europe, yes
<voidspace> TheMue: morning o/
<TheMue> voidspace: o/ but it seems dimitern and frobware are on vacation
<voidspace> TheMue: ah, maybe. I don't see that in the calendar.
<TheMue> voidspace: I've only interpreted the silence that way *lol*
 * TheMue currently plays a bit with crypto packages for token signatures
<voidspace> TheMue: you might be right, I know frobware was going on holiday but I can't remember if he is due back yet or not.
<frobware> voidspace: I'm here...
<voidspace> frobware: hello o/
<frobware> o/
<TheMue> ah, hey frobware o/
<frobware> TheMue: hello
 * frobware runs for some lunch before standup
<babbageclunk> dimitern: Did you get everything sorted in the bug you guys were working on?
<dimitern> babbageclunk: more or less - I'm working on a good, well-tested fix now - will start proposing PRs soon :)
<babbageclunk> dimitern: nice one
<babbageclunk> Yay, thanks whoever added the build-parameters info for queued builds in Jenkins!
<babbageclunk> Now I can see how far back mine is. :(
<rogpeppe> tiny review: this fixes a couple of unreliable tests: https://github.com/juju/juju/pull/6119
<babbageclunk> rogpeppe: LGTM! Failing less often sounds nice.
<mgz> rogpeppe: your mp from earlier failed to merge again, in case you didn't see
<mgz> rogpeppe: not the same issue
<mgz> ah, you resent
<perrito666> Bbl doctors appointment
<katco> natefinch: dimitern: standup time
<katco> fwereade: standup time
<rogpeppe> mgz: now i wanna put my "fix unreliable tests" PR near the top of the queue so that the other PRs have more hope of landing...
<mgz> :
<mgz> D
<mgz> ow face cut in half
<mgz> we can cancel things above it in the queue that haven't started yet, if there are lots
<babbageclunk> mgz: Is build 8976 stuck? It's been running since 12:32.
<mgz> let's have a look
<mgz> hm, harder to see what's up with the new multi task thing
<fwereade> katco, oops, sorry, I was packing
<katco> fwereade: no worries; any updates on refcounting stuff?
<fwereade> katco, I may or may not land the cleanup change after the flight
<fwereade> katco, ...but I have no excuse not to land that one that was already reviewed
<fwereade> not *land* the cleanup change, but push it for review
 * fwereade fail english? that's unpossible
<katco> fwereade: haha
<fwereade> probably ought to be off now to avoid panic later
<fwereade> o/
<katco> fwereade: cool; yeah, figured the cleanup is the more important of the 2? the one you can land just sets the stage?
<katco> k tc fwereade
<fwereade> katco, yeah exactly
<fwereade> later all
<katco> fwereade: ta
<mgz> babbageclunk: I unstuck it
<babbageclunk> mgz \o/
<mgz> mattyw: your run failed - the hang was not your fault, but it also looks to have a map-order dependent test failure
<babbageclunk> mgz ok now cancel all the other ones except for mine.
<mgz> babbageclunk: :P
<mattyw> mgz, yeah - that was fixed an hour ago but I'm waiting for this to die so I can propose the fix
<mgz> ec2 wasn't giving us a machine and the script doesn't have a shorter timeout at the right point for that
<natefinch> dimitern: ok, back in the standup hangout?
<dimitern> natefinch: omw
<natefinch> dimitern: ssh ubuntu@104.196.124.147
<babbageclunk> dimitern: Assuming I've turned on security.nesting, I should be able to bootstrap to lxd inside a lxd container, right?
<babbageclunk> frobware: ^^
<frobware> babbageclunk: yes, been a while since I tried though
<frobware> babbageclunk: my profile: http://pastebin.ubuntu.com/23112208/
<babbageclunk> frobware: Someone over in #juju is getting an error with beta16 bootstrapping to lxd - I tried it and get the same error, with both lxd 2.1 and 2.0.4
<babbageclunk> frobware: They say it wasn't happening in beta15. Does it sound like that packaging problem you guys were talking about?
<frobware> babbageclunk: :(
<babbageclunk> frobware: I don't see it if I use my built-from-source juju, only in a container with a juju from ppa
<babbageclunk> frobware: Anyway, I'll log a bug about it.
<frobware> babbageclunk: ack. sidetracked atm :\
<babbageclunk> frobware: No worries, just wanted confirmation that it should work. Thanks!
<dimitern> babbageclunk: I haven't tried recently
<dimitern> I know we set security.nesting by default
<babbageclunk> dimitern: ok, thanks.
<frobware> babbageclunk: I just started a container OK
<frobware> babbageclunk: but I just built from tip of master
<babbageclunk> frobware: yeah, I get that as well - try with juju beta16 from the ppa
<frobware> babbageclunk: perhaps. I _need_ this bootstrap. :)
<babbageclunk> frobware: Tsk! They're cattle not pets! ;)
<frobware> babbageclunk: diff between tip and 16: 140 files changed, 4769 insertions(+), 1405 deletions
<babbageclunk> frobware: I'm just checking it's a beta16 thing by trying with beta15.
<babbageclunk> frobware: yeah, it works with beta15.
<redir> morning juju
<redir> juju-dev even
<frobware> babbageclunk: you have your bisect point
<babbageclunk> frobware: gah, sounds like fun!
<dimitern> frobware, babbageclunk, voidspace: I'd appreciate a review on this: http://reviews.vapour.ws/r/5559/, I tried to keep the changes straightforward - mostly added tests
<mgz> rogpeppe: urk your unreliable test fix branch didn't go through
<mgz> feature tests failed
<perrito666> it was unreliable
<perrito666> :p
<perrito666> sorry could not resist
<mgz> I'll requeue it later, as my branch before failed due to the one that branch fixes :)
<mgz> I think we have a bug for the feature tests already?
<mgz> sinzui: do you know? unit test failure of featuretests due to api connection refused
<sinzui> mgz: I definitely saw one recently, but I was also reviewing 18 months of data. I will look
<mgz> sinzui: thank you
<sinzui> mgz: http://reports.vapour.ws/releases/issue/578f7bf1749a567c7344833e is increasing in frequency
<sinzui> mgz: cmdStorageSuite.TestStorageAddToUnitSuccess has never failed in CI testing
<mgz> hm, I wonder if it's something new on juju-core-slave then
<rogpeppe> mgz: i know
<rogpeppe> mgz: i've reported a bunch of unreliable test bugs today
<mgz> rogpeppe: <3
<rogpeppe> mgz: including most recently this: https://bugs.launchpad.net/juju/+bug/1618560
<mup> Bug #1618560: worker/txnpruner: sporadic test failure <juju:New> <https://launchpad.net/bugs/1618560>
<rogpeppe> mgz: took me a little while to figure out how that could fail (i've added a comment)
<rogpeppe> mgz: a lot seem to be failing because they can't contact the API server. not sure what's going on there.
<rogpeppe> mgz: feel free to discard any dupes.
 * redir lunches
 * redir lunches in a minute after reboot
<alexisb> wallyworld, I will miss the sts call
<wallyworld> ok
<wallyworld> niedbalski: google hates me, be there as soon as hangouts starts working
<niedbalski> wallyworld, sure, np!
<thumper> oh FFS
<thumper> this test: BootstrapSuite.TestBootstrapProviderDetectRegions has blocked two landings
<thumper> intermittent failure due to ordering
<thumper> anyone fixing this yet?
<alexisb> https://bugs.launchpad.net/juju/+bug/1618582
<mup> Bug #1618582: BootstrapSuite.TestBootstrapProviderDetect(No)Regions fails the expected auth type due to misordering <intermittent-failure> <regression> <trusty> <unit-test> <xenial> <juju:Triaged by wallyworld> <https://launchpad.net/bugs/1618582>
<alexisb> ^^^ ian earned a regression
<alexisb> but you are welcome to fix it
<alexisb> thumper, ^^
<thumper> k
<thumper> alexisb: I'll grab it
<thumper> it is blocking me
<thumper> is there a card for it on the board?
<alexisb> thumper, nope
<thumper> found it
<thumper> quick review for someone: https://bugs.launchpad.net/juju/+bug/1618582
<mup> Bug #1618582: BootstrapSuite.TestBootstrapProviderDetect(No)Regions fails the expected auth type due to misordering <intermittent-failure> <regression> <trusty> <unit-test> <xenial> <juju:In Progress by thumper> <https://launchpad.net/bugs/1618582>
 * thumper sighs
<thumper> this one http://reviews.vapour.ws/r/5561/diff/#
<thumper> alexisb: want to review it?
<thumper> or menn0
<thumper> http://reviews.vapour.ws/r/5561/diff/#
<thumper> menn0: morning btw
<menn0> thumper: howdy
<menn0> alexisb, thumper: I'll review it. I'm OCR.
<thumper> menn0: review is a trivial fix for a test that is failing half the time
<perrito666> this database is a poorly written joke
<thumper> heh
<perrito666> our latest foe https://jira.mongodb.org/browse/SERVER-20729
<menn0> thumper: ship it
<thumper> ta
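[Editor's sketch] The intermittent failure discussed above is attributed to ordering, and mgz earlier mentioned a map-order dependent test. Go randomizes map iteration order, so any test that compares a list built by ranging over a map can pass or fail from run to run. The sketch below shows one common remedy, sorting before comparing; it is not the actual fix in review 5561, and the `sortedKeys` helper and auth type names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order, giving callers a
// deterministic sequence regardless of Go's randomized map iteration.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Hypothetical auth types keyed by name; ranging over this map directly
	// could yield the keys in a different order on each run.
	authTypes := map[string]string{
		"userpass":   "username and password",
		"access-key": "access key and secret",
	}
	fmt.Println(sortedKeys(authTypes))
}
```

An alternative in tests is an order-insensitive comparison, but sorting at the source also makes user-visible output stable.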
<menn0> perrito666: i'm not apologising for the db, but that ticket is closed and seems to refer to something dodgy the user was doing
<perrito666> menn0: we are being bitten by the error, but the cause is not the same
<menn0> perrito666: ah ok
<perrito666> apparently that can happen for a number of reasons, including but not limited to the db shutting down abnormally
<thumper> menn0: when can we have the external migration flag?
<thumper> prechecks seems to be taking longer than we thought
<menn0> thumper: it has... I ran into a lot of problems yesterday
<menn0> thumper: the latest changes are ready but I noticed a possible issue while QAing yesterday
<menn0> thumper: around unit status
<menn0> thumper: I can pause prechecks after my current PR is done (although there's only 2 checks left to implement after that actually, but they involve minor API work)
<thumper> I've pushed the external flag to beta 18
<thumper> so keep with prechecks
<menn0> ok
<thumper> menn0: thoughts? https://bugs.launchpad.net/juju/+bug/1606991
<mup> Bug #1606991: TestWaitMinionNeverBecomeMinion wrong minion <ci> <intermittent-failure> <regression> <unit-tests> <windows> <juju:Triaged> <https://launchpad.net/bugs/1606991>
<menn0> thumper: nothing to do with migrations if that's what you're thinking
<menn0> thumper: will just happened to use the word minion for this leadership test
<menn0> thumper: I have almost no clue about how leadership works
<menn0> thumper: but I can dig into it if you need me to
<thumper> menn0: txn pruner clock branch: http://reviews.vapour.ws/r/5562/
<thumper> menn0: ok, re bug above
<thumper> wallyworld: https://bugs.launchpad.net/juju/+bug/1603176 thoughts on this?
<mup> Bug #1603176: juju debug-log returns not logged-in error <debug-log> <login> <juju:Triaged> <https://launchpad.net/bugs/1603176>
<menn0> thumper: ok, just reviewing perrito666's PR atm, will do yours next
<thumper> I wonder if it is how we are passing creds through to the streaming apis
<wallyworld> hmmm
<wallyworld> off hand i am not sure, but it does seem suspicious that it's not for a normal rpc call
<menn0> perrito666: the diff on RB for PR 6104 doesn't match what's actually in the PR on Github
<menn0> perrito666: in the review UserAccessSpec has ObjectID, but the PR doesn't have that
<menn0> perrito666: sorry ObjectUUID, not ObjectID
<thumper> hmm... enable-ha onto two manual machines seems to not work
<thumper> can anyone else see the problem here? http://pastebin.ubuntu.com/23113867/
<thumper> both amusing and not at the same time
<menn0> thumper: it should be using a non localhost IP otherwise the other nodes can't connect?
<thumper> :)
<thumper> well.. it complains a few lines lower that it has multiple entries with the same _id
<thumper> I'm poking more
<menn0> thumper: regarding the txnpruner fix, it looks great but I don't think you need the started channel do you?
<thumper> menn0: kinda, otherwise the test doesn't know when to start advancing the clock
<thumper> unless you have other ideas
<menn0> I think I do...
 * menn0 checks the test clock
<menn0> I think the test clock has a feature to let the test know when After has been called
<menn0> thumper: yep, that's it. The testing clock can return a channel via the Alarms method which lets you know when After has been called
<menn0> thumper: you can just wait for that in the test
<menn0> it's exactly for this use case
<thumper> ah, cool
<thumper> please leave a note :)
<menn0> will do
<thumper> that was the only bit I was not happy with
<thumper> hmm...
 * thumper wonders how this works with ha lxd
 * thumper fires one up
<thumper> has anyone else got tab completion woes with juju?
<thumper> juju _juju_complete_2_0: command not found
<thumper> that was juju <tab>
<menn0> thumper: ship it
<menn0> thumper: it looks like the juju-2.0 package isn't installing /etc/bash_completion.d/juju-2.0
<thumper> I haven't installed the package
<thumper> I'm running from source
<menn0> thumper: you don't have the official juju-2.0 installed as well?
<thumper> nope
<thumper> don't think so
<menn0> if you installed from source how could bash know where to find the completion?
<thumper> probably should yes?
<thumper> I do have that installed
<menn0> hmmm, it actually works for me
<thumper> and I am missing  /etc/bash_completion.d/juju-2.0
<menn0> I have juju-2.0-beta17 installed
<thumper> hmm... I must be missing the ppa
<thumper> I have beta 15
<menn0> thumper: I actually just have /etc/bash_completion.d/juju-core and that seems to work
<thumper> and beta 15 doesn't install anything in /etc/bash_completion.d
<menn0> there's more stuff in /usr/share/bash-completion/juju*
<menn0> that's probably the problem then
 * menn0 checks where his juju package came from
<menn0> thumper: yeah, I have the stable PPA
<menn0> cd
<thumper> menn0: just the stable?
<menn0> thumper: yep
<thumper> stable doesn't have juju in it
<thumper> menn0: as in ppa:juju/stable
<thumper> interesting
<thumper> when using lxd in ha mode
<thumper> the same input generates a different result
<thumper> curious
<perrito666> menn0: sorry was at the market, what happened with gh and rb?
<menn0> perrito666: the diff on Github is not the same as the one on RB
<perrito666> meh, that sucks (that being rb)
<perrito666> menn0: so your review does not apply?
<menn0> perrito666: well most of it will still apply
<perrito666> menn0: tx for the heads up btw, ill take a look to see if I can make them be the same
<menn0> perrito666: the main thing I noticed is that state.UserAccessSpec doesn't have ObjectUUID in the pull request on Github
<menn0> perrito666: did you manually update the diff on RB?
<perrito666> menn0: nope
<menn0> weird
<perrito666> menn0: I dont even have the tools for that on my system
<menn0> axw: the maximum log buffer size would be around 100MB
<menn0> so not ridiculous
<axw> menn0: ah that's probably not it then, thanks
<thumper> rogpeppe: I fixed that bootstrap test already
<thumper> rogpeppe: you didn't mark in the bug that you were fixing it
<thumper> so I did
<perrito666> menn0: love your doorbell
#juju-dev 2016-08-31
<menn0> perrito666: haha
<perrito666> I have the exact same, but no one ever rings it since my office has a window outside and I see people coming
<perrito666> mm, GoRename in an interface will do the smart thing with its implementation....
<redir> perrito666: smart meaning what you want?
<perrito666> redir: what else?
<redir> :)
<redir> perrito666: the right thing?
<redir> I wish it updated comments.
<redir> I understand why it doesn't but I can wish.
<perrito666> ah the comments part is really annoying
<perrito666> I tried once GoRename on implementation instead of the interface, not nice
<redir> I also with GoTestFunc worked with check, or vice-versa
<redir> *also wish
<perrito666> menn0: funny, delegatos in spanish (with a space "dele gatos") means "give that person cats"
<menn0> perrito666: LOL!
<menn0> ok
<menn0> perrito666: ship it
<perrito666> menn0: tx a lot
<perrito666> well look at that, a 4k monitor costs 6X more in my country than in amazon....
 * perrito666 stays in HD
<menn0> perrito666: ping
<perrito666> menn0: pong
<menn0> perrito666: https://github.com/juju/juju/blame/master/apiserver/client/status.go#L853
<menn0> perrito666: should that really be || ?
<menn0> perrito666: I'm wondering if it's supposed to be &&
<perrito666> menn0: let me refresh my memory
<menn0> perrito666: sure
<menn0> perrito666: I'm not completely sure myself, but it looks suspicious
<perrito666> menn0: I am pretty sure I am blamed there for a move since I cant recall wth is going on there but, it makes sense because it might be possible to be in Maintenance but with message Installing
<perrito666> I would track MessageInstalling to determine that
<perrito666> I bet it is set when Agent is in Maintenance, this would be a good reason for such ||
<menn0> perrito666: actually, I've misread the code
<menn0> perrito666: ignore me
<perrito666> menn0: sure thing :p
<perrito666> ill go have dinner then return if you need a rubber duck
<redir> wallyworld: http://reviews.vapour.ws/r/5563/ or menn0 ...
<redir> bbiab
<wallyworld> ok
<redir> ~8pm or so PDT
<perrito666> redir: what's with US people and Time standards
<perrito666> menn0: need a hand?
<natefinch> perrito666: what, we have 6 time zones, which differ based on what time of year it is, thanks to daylight saving time. Not so bad, right?
<perrito666> natefinch: actually the problem is not how many you have, it's the names you use
<perrito666> much like with every other form of numeric representation apparently
<natefinch> lol
<natefinch> unless anyone has some special insight into https://bugs.launchpad.net/juju-core/1.25/+bug/1610880 - I'm just going to bail on it.  I've burned a week trying to figure this crap out, and now it's just decided to stop being reproducible :/
<mup> Bug #1610880: Downloading container templates fails in manual environment <juju-core 1.25:Triaged by natefinch> <https://launchpad.net/bugs/1610880>
<wallyworld> menn0: juju/agent/MigrateParams - there's a check for empty Model, but to me, model should never be empty right?
<menn0> perrito666: all good at the moment
<menn0> wallyworld: loading context
<wallyworld> menn0: also, we now are going to need to start explicitly passing around a controller UUID since that will not be the same as the controller model uuid anymore
<menn0> wallyworld: MigrateParams has nothing to do with migrations
<menn0> wallyworld: model migrations that is
<wallyworld> oh, ffs
<wallyworld> it for migrating older format files
<wallyworld> hmmm
<wallyworld> 2.0 will be clean slate
<perrito666> gnight all
<menn0> perrito666:  good night
<perrito666> wow, tests are especially unreliable today
<natefinch> thumper, wallyworld: either of you have thoughts on a bug I should tackle?  Looks like all the criticals under juju are assigned, except this one:  https://bugs.launchpad.net/juju/+bug/1614635
<mup> Bug #1614635: Deploy sometimes fails behind a proxy <landscape> <juju:Triaged by rharding> <https://launchpad.net/bugs/1614635>
<wallyworld> natefinch: anything landscape related is good to pick up
<thumper> ffs
<natefinch> wallyworld: I was hoping to avoid that one, just because it sounds setup-intensive.  I'd need a maas environment behind a firewall using a proxy...
<thumper> I'm giving up on the manual ha bug right now
<thumper> since you can never get into ha, it won't matter that you can't kill it
<natefinch> haha
<thumper> this is a rabbit hole
<thumper> and a lot more work than I thought
<thumper> so perhaps better to work on other more important bugs
<natefinch> thumper: exactly the reasoning I used when abandoning my manual bug.... only after following the rabbit hole way too deep.
<thumper> natefinch: what was your manual bug?
<wallyworld> natefinch: what about https://bugs.launchpad.net/juju/+bug/1617190
<mup> Bug #1617190: Logout required after failed login <juju:Triaged by alexis-bruemmer> <https://launchpad.net/bugs/1617190>
<natefinch> thumper: https://bugs.launchpad.net/juju-core/1.25/+bug/1610880
<mup> Bug #1610880: Downloading container templates fails in manual environment <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1610880>
<natefinch> thumper: it was 100% reproducible until today, when it became 100% unreproducible, which is when I threw in the towel
<thumper> haha
<wallyworld> thumper: manual ha will be important for system z etc
<wallyworld> but not this week
<natefinch> wallyworld: I could do that one, sure.  I wasn't sure if alexis was actually working on it
<wallyworld> natefinch: no, she assigns bugs to herself
<wallyworld> so she knows what to track
<natefinch> wallyworld: ok cool
<thumper> wallyworld: http://juju-ci.vapour.ws:8080/job/github-merge-juju/8993/artifact/artifacts/trusty-out.log this is failing often
<thumper> but intermittently
<thumper>  cmdControllerSuite.TestAddModelWithCloudAndRegion
<thumper> [LOG] 0:00.377 ERROR juju.worker.dependency "api-caller" manifold worker returned unexpected error: cannot open api: unable to connect to API: websocket.Dial wss://localhost:59081/model/deadbeef-0bad-400d-8000-4b1d0d06f00d/api: dial tcp 127.0.0.1:59081: getsockopt: connection refused
<thumper> not sure why it is failing to connect to the api server
<wallyworld> yeah, weird
<thumper> actually
<thumper> perhaps it failed the first time
<thumper> and retried
<thumper> but the error was still written out
<thumper> the expected string is in the obtained string
<thumper> but after an error line
<wallyworld> hah, that could be it
<thumper> fuck sticks
<thumper> menn0-afk: really afk?
<thumper> hmm, only 12 minutes, so probably
<thumper> anyway, remember the i/o timeout retry patch?
<thumper> well the rpc layer returns errors.Trace(err)
<thumper> so none of the errors had an error code, and were all fatal
<thumper> even the ones that said retry :)
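The failure mode thumper describes, where a wrapped error no longer looks like a coded error to the retry check, can be sketched like this. It is a minimal illustration, not the actual juju rpc code or the fix that was pushed: codedError, traced, naiveIsRetry and isRetry are hypothetical names, and the wrapper here is a stand-in for errors.Trace built on the Go standard library's Unwrap/errors.As support (which postdates this conversation).

```go
package main

import (
	"errors"
	"fmt"
)

// codedError is a hypothetical error type that carries an API error code.
type codedError struct {
	code string
	msg  string
}

func (e *codedError) Error() string { return e.msg }

// traced is a hypothetical stand-in for a tracing wrapper: it keeps the
// message of the underlying error but changes the outermost type.
type traced struct{ cause error }

func (t *traced) Error() string { return t.cause.Error() }
func (t *traced) Unwrap() error { return t.cause }

// naiveIsRetry inspects only the outermost error, so any wrapping hides
// the code and the error ends up being treated as fatal.
func naiveIsRetry(err error) bool {
	ce, ok := err.(*codedError)
	return ok && ce.code == "retry"
}

// isRetry walks the wrap chain before checking the code.
func isRetry(err error) bool {
	var ce *codedError
	return errors.As(err, &ce) && ce.code == "retry"
}

func main() {
	err := &traced{cause: &codedError{code: "retry", msg: "i/o timeout"}}
	fmt.Println(naiveIsRetry(err)) // false: the wrapper hides the code
	fmt.Println(isRetry(err))      // true: the code is found under the wrapper
}
```

Either the check has to look through the wrapper, as isRetry does, or the layer that adds the wrapper has to preserve the code.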
<thumper> anyway, identified, and fix QA
<thumper> then pushing
<thumper> wallyworld: cmdRegistrationSuite.TestAddUserAndRegister failing intermittently too
<natefinch> heh, katco fixed a bug just like that somewhere recently
<thumper> ...     "ERROR \"api-caller\" manifold worker returned unexpected error: cannot open api: unable to connect to API: websocket.Dial wss://localhost:50983/model/deadbeef-0bad-400d-8000-4b1d0d06f00d/api: dial tcp 127.0.0.1:50983: getsockopt: connection refused\n" +
<thumper> in the middle of expected output
<wallyworld> same root cause then
<thumper> probably worthwhile figuring out why we are getting this connection refused
<thumper> it is obviously new
<thumper> and causing many failures
<wallyworld> might just be slow back end
<thumper> part of the problem is the apiserver errors are showing in the client output
<thumper> that's just wrong
<thumper> but due to how our tests do everything...
<thumper> wallyworld: I can see why we are getting the errors
<thumper> [LOG] 0:00.168 INFO juju.apiserver listening on "[::]:60736"
<thumper> [LOG] 0:00.226 ERROR juju.worker.dependency "api-caller" manifold worker returned unexpected error: cannot open api: unable to connect to API: websocket.Dial wss://localhost:50983/model/deadbeef-0bad-400d-8000-4b1d0d06f00d/api: dial tcp 127.0.0.1:50983: getsockopt: connection refused
<thumper> different port
<wallyworld> oh, that race condition
<thumper> perhaps also ipv6 localhost
<wallyworld> that's been there for ages
<thumper> but port will do it
<wallyworld> apparently it's hard to fix
<wallyworld> and the window was considered small enough that we wouldn't see it, but i may be misremembering
<thumper> wallyworld: http://reviews.vapour.ws/r/5565/
<thumper> wallyworld: if you are busy, I could take another look
<wallyworld> i am busy, separating controller and model tag, what a mess, but i can look at pr
<wallyworld> thumper: lgtm
<thumper> pr is very small
<thumper> ta
<natefinch> has anyone done anything about not having to pass --config=bootstrap.yaml every time we bootstrap? I keep forgetting, so I end up bootstrapping without all the config options I normally want to use.
<thumper> these featuretests are failing almost every time now
<thumper> ffs
<natefinch> --debug isn't displayed as a flag anywhere anymore?
<natefinch> it still works... but how is anyone supposed to know it exists?
<natefinch> ahh, help global-options... which is like 3 levels deep in the help :/
<menn0> thumper: back
<thumper> menn0: got a few minutes to discuss this annoying intermittent failure?
<menn0> thumper: yep
<thumper> 1:1
<menn0> thumper: I used -check.list to see which tests run before TestAddModel
<menn0> thumper: passing this should run just those tests: -check.f '(meterStatusIntegrationSuite|UserSuite|debugLogDbSuite|syslogSuite|annotationsSuite|CloudAPISuite|apiEnvironmentSuite|BakeryStorageSuite|blockSuite|cmdControllerSuite.TestAddModel)'
<menn0> thumper: i've got the stress tester running now with that
<dimitern> wow I've got a full novel as a review from babbageclunk :)
<TheMue> morning dimitern
<dimitern> hey TheMue :)
<TheMue> found some time to lurk around here again ;)
<dimitern> TheMue: how's life? ;)
<TheMue> dimitern: fine, nice project here using a mix of JS frontend (mostly done by others), Go microservices (here I'm the evangelist), and couchdb as database
<TheMue> dimitern: only fighting with some fences of this bigger company, in total 12K worldwide
<TheMue> dimitern: so many old-fashioned processes I have to break for a new product
<dimitern> TheMue: sounds interesting :)
<TheMue> dimitern: yip, it is, and a good local team here
<TheMue> dimitern: only less interesting travels than with Canonical :D last tour has been to London again?
 * TheMue continues with his crypto stuff ...
<dimitern> TheMue: actually I was in Leiden most recently
<dimitern> TheMue: but I'm sure we'll be in London soon enough :)
<TheMue> dimitern: Netherlands, interesting. a new location.
<TheMue> dimitern: I'm missing San Francisco *sigh*
<dimitern> TheMue: yeah - was the first time for me there
<dimitern> TheMue: I'll be at the Juju Charmers Summit in Pasadena, CA in a couple of weeks
<TheMue> dimitern: quickly have to join *lol*
<dimitern> TheMue: here's a link: http://summit.juju.solutions/
<TheMue> dimitern: don't think my company would send me
<rogpeppe1> i see the get-controller-config command. is there any way of setting controller config values?
<rogpeppe1> axw, menn0: ^
<axw> wallyworld: ^
<axw> rogpeppe1: (I don't think so)
<rogpeppe1> axw: hmm
<axw> AFAICR, you can only set them at bootstrap time right now
<wallyworld> rogpeppe1: axw: juju set-model-defaults
<wallyworld> ah
<wallyworld> controller-config, that is immutable
<wallyworld> apiport, stateport etc
<wallyworld> controller uuid
<wallyworld> did you mean the model defaults?
<axw> wallyworld: for argument's sake, it would be nice to be able to toggle auditing-enabled
<wallyworld> axw: i thought there was CLI for that, but maybe not
<wallyworld> the auditing stuff is POC only
<wallyworld> needs to be cleaned up
<macgreagoir> babbageclunk: Following up on that lxd bootstrap failure stuff, I'm not seeing any issues with amd64 (trusty or xenial) but with ppc64le. Have you tested, by any chance?
 * macgreagoir should look for your bug...
<mup> Bug #1618798 opened: endpoint not used in lxd provider <juju-core:New> <https://launchpad.net/bugs/1618798>
<babbageclunk> macgreagoir: sorry, was afk - it's bug 1618636
<mup> Bug #1618636: Bootstrapping with beta16 on lxd gives "unable to connect" error <lxd> <juju:New> <https://launchpad.net/bugs/1618636>
<macgreagoir> babbageclunk: Cheers, let me compare notes with that...
<babbageclunk> It happens when I bootstrap inside a container - I did that just to sidestep the built-from-source juju binary, but I'll try reproducing it outside the container now.
<voidspace> menn0: you should try this: http://www.bbc.co.uk/news/world-europe-37228413
<babbageclunk> voidspace: gah!
<voidspace> babbageclunk: hardcore
<babbageclunk> voidspace: ah, it's fine, he's holding onto that stick for most of it.
<voidspace> babbageclunk: ah, fair point - I hadn't noticed that
<babbageclunk> voidspace: (I think it's a selfie stick.)
<voidspace> babbageclunk: :-)
<voidspace> babbageclunk: dilemna
<voidspace> babbageclunk: I'm working with CloudImageMetadata
<voidspace> babbageclunk: should the plural be CloudImageMetadatas
<voidspace> babbageclunk: ?
<voidspace> it just sounds awful
<voidspace> I'm being inconsistent at the moment - I have a type called cloudimagemetadata and then a collection type that I kind of have to call cloudimagemetadatas
<babbageclunk> Not as awful as dilemna
<macgreagoir> voidspace: Data is plural ;-)
<voidspace> macgreagoir: yes it is
<macgreagoir> :-)
<voidspace> macgreagoir: hence the dilemna
<babbageclunk> Yeah, should be CloudImageMetadatum
<voidspace> hah
<voidspace> it's not a datum it has data
<voidspace> at a higher level of abstraction its a datum I guess
<babbageclunk> CloudImageMetadataSet?
<macgreagoir> CloudImageMetadatabase :-D
<voidspace> babbageclunk: dilemna is valid
<voidspace> babbageclunk: and better
<voidspace> macgreagoir: hah
<voidspace> babbageclunk: maybe
<babbageclunk> you're invalid
<voidspace> babbageclunk: ableist
<babbageclunk> Does that really have an e? When I'm the boss it will not.
<voidspace> babbageclunk: your spelling is horrifical
<voidspace> babbageclunk: whenever there's an option you pick the most confusing one so as to seem superior
<voidspace> babbageclunk: *everyone* knows that
<voidspace> gah
<babbageclunk> Anyway, there are lots of infos in the codebase, but I don't think datas would be so easily tolerated.
<voidspace> babbageclunk: yeah, it's pretty horrible
<voidspace> infos isn't great
<babbageclunk> voidspace: I think dataset's alright, as long as one isn't too bothered by the implication that it should be a set.
<voidspace> babbageclunk: right, I think I agree
<voidspace> and there is no real set in go anyway
<babbageclunk> And really, if you're ok with dilemna, something purporting to be a set but then not really being one is the least of your problems.
<voidspace> :-)
<macgreagoir> voidspace: +1 metadataset, fwiw
<voidspace> macgreagoir: cool, thanks - sounds like it's decided
<babbageclunk> macgreagoir: I haven't managed to reproduce that bug outside a container. How can I determine whether the agent is being uploaded from my machine rather than pulled from a stream?
<babbageclunk> macgreagoir: (because I think that might be confounding things)
<macgreagoir> babbageclunk: Let me check my logs. I can repro on amd64 only in a nested container, like you.
<macgreagoir> babbageclunk: I see tools downloaded (in both archs). The error in ppc64le is clearly from the code that tries to use local as remote.
<macgreagoir> I guess that's a fair starting point for amd64 too.
<macgreagoir> The ppc64le bug is back on beta11.
<ashipika> mgz: ping?
<macgreagoir> babbageclunk: Just fyi, https://bugs.launchpad.net/juju/+bug/1618636/comments/1
<mup> Bug #1618636: Bootstrapping with beta16 on lxd gives "unable to connect" error <lxd> <juju:New> <https://launchpad.net/bugs/1618636>
<babbageclunk> macgreagoir: another interesting data point, I get much more output (from cloud-init, apt installs etc) from the bootstrap process in the container than I do from the one running on my host - do you know why that might be?
<macgreagoir> An interesting datum? :-)
<macgreagoir> babbageclunk: I'm trying to get a ppc64le build with some more logging output, but not succeeding yet.
 * babbageclunk lols
<mattyw> hey folks, how can one login to the models' mongo db now?
<babbageclunk> mattyw: the juju-mongodb package includes the mongo binary now, so you can run that connecting to 127.0.0.1:37017 on the controller. Get the password from the admin user from the agent.conf, and pass --sslAllowInvalidCertificates.
<mattyw> babbageclunk, I get errors about --sslAllowInvalidCertificates not being a valid arg
<babbageclunk> mattyw: You need to run the mongo from /usr/lib/juju/mongo3.2/bin - check the version?
<rogpeppe1> fwereade: i'm seeing a lot of this kind of test failure. any idea why we might be seeing more of it recently? http://juju-ci.vapour.ws:8080/job/github-merge-juju/9009/artifact/artifacts/trusty-out.log/*view*/
<babbageclunk> mattyw: Oh, you need to pass --ssl as well.
<babbageclunk> mattyw: I'm trying to do this now too, but I haven't gotten in yet.
<mattyw> babbageclunk, I think I wasn't using juju-mongodb, I'm updating my script now
<mattyw> babbageclunk, /usr/lib/juju/mongo3.2/bin/mongo: No such file or directory
<babbageclunk> mattyw: Really? On xenial?
<mattyw> babbageclunk, trusty
<babbageclunk> mattyw: that would do it - sorry, all of this has been mongo 3.2 advice.
<mattyw> babbageclunk, do we not install 3.2 on trusty as well?
<babbageclunk> mattyw: Not sure. Does the directory exist?
<mattyw> babbageclunk, it doesn't
<babbageclunk> mattyw: Then I think the mongo binary will be in /usr/lib/juju/bin instead, if it's anywhere.
<mattyw> babbageclunk, I see all the server side packages but none of the client ones
<babbageclunk> mattyw: Ok, in that case you should be able to install mongodb-clients
<macgreagoir> babbageclunk: Again, just fyi, I found a beta12 ppc64el deb. I don't see the lxd issue in this test.
<mattyw> babbageclunk, so this is the script I'm using http://paste.ubuntu.com/23115673/
<babbageclunk> macgreagoir: I don't see the lxd problem in beta15 either.
<mattyw> babbageclunk, must be something wrong with the last line - the args sent to mongo
<babbageclunk> mattyw: On trusty it's mongo 2.4, so the args are different
<mattyw> babbageclunk, any idea what they should be?
<macgreagoir> babbageclunk: I can't find or build a ppc beta15 to test :-/
<babbageclunk> mattyw: just trying it out now - had to bootstrap a trusty controller
<babbageclunk> mattyw: This works for me: mongo -u admin -p <password> 127.0.0.1:37017/juju  --authenticationDatabase admin --ssl
<mattyw> babbageclunk, sorry mate, how do I login to the controller, juju ssh -m default 0?
<babbageclunk> mattyw: juju ssh -m controller 0
<fwereade> rogpeppe1, those look like it's "just" a logging change... and that ideally we'd have the logging going somewhere else so we could inspect the command's contributions to stderr alone
<rogpeppe1> fwereade: it's a sporadic failure
<rogpeppe1> fwereade: do you think that the API connection failure is expected there then?
<fwereade> rogpeppe1, right -- most of the time we connect flawlessly, sometimes the odd connection failure should be retried transparently
<fwereade> rogpeppe1, that was my read
<rogpeppe1> fwereade: interesting. i'll go back and look at the other test failures in that light
<rogpeppe1> fwereade: do you think that perhaps the manifold worker shouldn't be reporting at ERROR level?
<rogpeppe1> fwereade: and... this is client-side, right? what's the worker doing logging its errors there?
<rogpeppe1> fwereade: or does the client side have workers too now?
<fwereade> rogpeppe1, this is a featuretest: it is indeed probable that it's just part of the appropriate chunk-of-controller
<rogpeppe1> fwereade: i guess we could paper over the issue by configuring the logger used for the command output to exclude all logs from juju.worker.*
 * fwereade winces a little, and worries how well that's going to work in practice... but logs and commands are tangly, so, well, I suppose it'd be expedient
<rogpeppe1> fwereade: FWIW this test failure and others similar has wasted me a lot of time yesterday. it's definitely worth some kind of fix :)
<rogpeppe1> fwereade: yeah, i'm not sure either.
<fwereade> rogpeppe1, yeah, definitely not arguing it doesn't need a solution, just pushing for a bit of find-the-ultimate-cause ;)
<rogpeppe1> fwereade: alternatively, maybe that manifold error doesn't justify ERROR status and could be logged at INFO level
<rogpeppe1> fwereade: after all, it's fairly run-of-the-mill to get errors there.
<fwereade> rogpeppe1, probably also reasonable, yeah
<fwereade> rogpeppe1, but similarly leaves us vulnerable to similar changes infecting these tests in future
<rogpeppe1> fwereade: yeah, i was thinking that too
<fwereade> rogpeppe1, mainly a matter of how-much-time-can-you-justify, I think :(
<rogpeppe1> fwereade: zero currently - i'm off on hols tomorrow :)
<rogpeppe1> fwereade: i'm just trying to remind myself how that output gets captured in tests
<rogpeppe1> fwereade: ha, interesting realisation: there's no way to get a logging source to totally shut up, because CRITICAL messages will always be logged.
<rogpeppe1> luckily nothing logs at CRITICAL level
<fwereade> rogpeppe1, two opposing bugs in perfect balance, eh
<babbageclunk> jhobbs: around?
<perrito666> fwereade: ping?
<babbageclunk> jhobbs: ping?
<frobware> dimitern: ping - can we sync before standup please?
<dimitern> frobware: ok, just a couple of minutes and I'll join standup HO
<frobware> dimitern: ack
<dimitern> frobware: oops, sorry
<katco> fwereade: mgz: standup time
<dimitern> babbageclunk: ping
<babbageclunk> dimitern: pong!
<dimitern> babbageclunk: hey, I've pushed some changes to http://reviews.vapour.ws/r/5559/
<babbageclunk> dimitern: ok, I'll take a look.
<dimitern> babbageclunk: ta!
<dimitern> voidspace, frobware: please take a look if you can as well ^^
<dimitern> babbageclunk, voidspace, frobware: sorry, I've realized I didn't actually push them :/ now I did
<dimitern> fwereade_: you've got a review
<fwereade_> dimitern, tyvm
<rogpeppe1> mgz: any idea why this failed? i can't see any errors. http://juju-ci.vapour.ws:8080/job/github-merge-juju/9014/
<dimitern> rogpeppe1: check trusty-err.log: state/modelmigration_test.go:29: undefined: state.ModelMigrationSpec
<macgreagoir> babbageclunk: Out of interest, your test for lxd bootstrap with a from-source binary, did you also have a jujud available in ./ ?
<voidspace> it compiles!
<rogpeppe1> dimitern: oh i didn't realise compiler errors were hidden in the other artifact
<voidspace> Doesn't *work*, but it compiles...
<rogpeppe1> dimitern: that's... unintuitive
<babbageclunk> macgreagoir: no, I was in ~, and mangled my path and gopath so that there were no from-source juju binaries around.
<babbageclunk> macgreagoir: But I've worked out my problem - we ripped out the bit of the lxd provider that tells LXD to listen to https.
<dimitern> rogpeppe1: it says where to look plainly in the console log :)
<macgreagoir> babbageclunk: OK, cool. One down :-)
<rogpeppe1> dimitern: it does? where does it say that?
<babbageclunk> macgreagoir: well, not quite sure where to put it back in, but yeah, the mystery's solved!
<dimitern> rogpeppe1: See /var/lib/jenkins/workspace/github-merge-juju/artifacts/trusty-err.log
<macgreagoir> babbageclunk: I can't reproduce my bug on ppc with tip.
<dimitern> rogpeppe1: ~10 lines from the bottom going up
<babbageclunk> macgreagoir: yay, it's fixed! ;)
<macgreagoir> :-D
<rogpeppe1> dimitern: i know that's where the error is
<rogpeppe1> dimitern: but i think it's unintuitive that most test failure output goes into trusty-out.log but compiler errors go into trusty-err.log
<rogpeppe1> dimitern: i think that both stderr and stdout from the go test should go into the same stream
<dimitern> rogpeppe1: ah, sorry - yeah, that's not so obvious
<babbageclunk> rogpeppe1: +1
<rock___>  Hi. I deployed OpenStack on LXD using https://github.com/openstack-charmers/openstack-on-lxd. I have created "cinder-storagedriver" charm . I pushed the created charm to public charm market place(charm store). So Using JUJU GUI when i was trying to deploy "cinder-storagedriver" charm by adding relation  to "cinder" it was throwing an error.
<rock___> ERROR: Relation biarca-openstack:juju-info to cinder:juju-info: cannot add relation   "biarca-openstack:juju-info cinder:juju-info" : principal and subordinate applications' series must match.
<rock___> But I am able to deploy my charm through the juju CLI, taking it from the charm store, and I can add the relation to cinder successfully. Everything works fine.
<dimitern> rock___: can you paste a link to the charm you pushed?
<marcoceppi> rock___: biarca-openstack needs to be the same operating system series as the charm deployed
<dimitern> rock___: also, how did you deploy it?
<marcoceppi> rock___: I imagine the GUI is pulling the wrong series of the charm
<rock___> https://jujucharms.com/u/siva9296/biarca/0
<rock___> my charm supports xenial, trusty and precise. The rest of the openstack deployment supports Xenial. juju status pasted info : http://paste.openstack.org/show/565199/
<rock___> As part of debugging, I deployed juju-gui with $juju deploy cs:juju-gui-134 on our setup. But it was showing the series as trusty even though we had chosen xenial.
<dimitern> rock___: that paste link does not open for me - can you retry pasting it with paste.ubuntu.com or some other service please?
<rock___> dimitern: juju status pasted info http://paste.openstack.org/show/565217/
<dimitern> rock___: ok, I opened that fine, but the charm urls are not there - please paste `juju status --format=yaml` for more details?
<rock___> marcoceppi: Yes. GUI is pulling the wrong series of the charm. When I deployed my charm using JUJU CLI. It worked fine.
<dimitern> rock___: if your config has 'default-series: trusty' that might be cause; alternatively, deploy biarca with --series xenial to be explicit about it
<dimitern> (it should complain if that's required, but it might be just picking the first entry in the list of series-es from metadata - trusty in this case)
<dimitern> babbageclunk, frobware, voidspace: a friendly review poke :)
<rock___> dimitern: Yes. In metadata.yaml trusty is the first in the list of series. But then how did it work fine when I deployed my charm through the juju CLI?
<babbageclunk> dimitern: I was still reviewing it! It takes a long time to type all of my complaining. ;)
<dimitern> babbageclunk: ah, sorry :) sure, take your time!
<dimitern> rock___: are you using the CLI from a trusty machine or a xenial one?
<rock___> dimitern: From a Xenial machine.
<babbageclunk> dimitern: Finished now.
<babbageclunk> dimitern: that approach is neat!
<dimitern> babbageclunk: sweet! thanks :)
<rock___> dimitern: juju status --format=yaml pasted info http://paste.openstack.org/show/565227/
<dimitern> rock___: thanks! there you have it - the gui is running as trusty
<dimitern> how did that happen I'm not sure.. but does it work if you deploy your charm from the CLI?
<rock___> dimitern: As part of debugging, I deployed juju-gui with $juju deploy cs:juju-gui-134 on our setup. But it showed the series as trusty even though we had chosen xenial.
<rock___> Actually, we don't need to deploy juju-gui separately. If we run "$sudo juju gui" it gives https://10.75.116.66:17070/gui/3d58eec3-a3ed-4430-8eb4-8f3ec7db7ea8/.
<dimitern> rock___: yeah, it sounds like it should refuse to deploy it, if it's multi-series charm and no series is given (it used to - perhaps something changed recently)
<rock___> dimitern: We manually deployed juju-gui to find out which series it would take.
<rock___> dimitern: We chose xenial but it took trusty.
<dimitern> rock___: but you should then use the full cs: url - e.g. cs:xenial/juju-gui-134
<dimitern> or instead, juju deploy cs:juju-gui-134 --series xenial
<dimitern> rock___: unless some config gets in the way - you can check if `juju model-config | grep -i trusty` returns anything
<voidspace> dimitern: you've had a review on 5559 - or do you want a review on something else?
<voidspace> dimitern: ah, I'm just slow
 * voidspace has now read the backscroll
<dimitern> voidspace: :)
<babbageclunk> voidspace: other people are allowed to review it too!
<voidspace> babbageclunk: frankly it's better for everyone if they do...
<frobware> dimitern: hey, sorry. was sidetracked with maas-2.1 and bridge_all=True
<rock___> dimitern:  I tried $sudo juju deploy cs:juju-gui-134 --series xenial.  pasted log for this is http://paste.openstack.org/show/565229/. sudo juju model-config | grep -i trusty returns nothing.
<babbageclunk> voidspace: I feel like you're saying the same thing as me but mean the opposite. :)
<voidspace> babbageclunk: uhm, that would assume I have even the faintest idea what I'm talking about
<voidspace> babbageclunk: a very dangerous assumption
<dimitern> rock___: oops, it seems you've got a panic there - sorry about that
<babbageclunk> voidspace: you know what they say about assumption
<voidspace> heh
<dimitern> rock___: it might be worth trying the next (last) beta16 to see if it's any better (and produces no panic)
<rock___> dimitern: So Juju 2.0 is the development channel. So the charms present in development channel were migrated to edge?
<dimitern> rock___: if it's still there, please file a bug with that output, link to the charm, and the version of juju!
<babbageclunk> jhobbs: ping?
<rock___> dimitern: So I have to upgrade to juju latest version[Juju 2.0-beta16] right?
<dimitern> rock___: I'd first try that to see if it was fixed between beta15 and beta16
<dimitern> rock___: if not - then please file a bug and we'll triage it
<jhobbs> hi babbageclunk
<babbageclunk> jhobbs: Hey! I'm trying to do some digging on https://bugs.launchpad.net/juju/+bug/1611159
<mup> Bug #1611159: model not successfully destroyed, and error on "juju list-models" <oil> <oil-2.0> <juju:Triaged by 2-xtian> <https://launchpad.net/bugs/1611159>
<babbageclunk> jhobbs: How can I reproduce it?
<jhobbs> well it's an automated test setup
<jhobbs> babbageclunk: i have a maas server with 12 machines commissioned in it, and 1 vm. i bootstrap with the VM as the controller node, and then jenkins adds a couple of models and deploys openstack to them, runs some tests, then destroys the models
<jhobbs> each model is handled by a separate worker in jenkins and done in parallel
<jhobbs> if i were going to try to reproduce it outside of oil i would use juju with a maas provider, one controller, and add and remove a bunch of models in parallel
<rock___> dimitern: ok. Thank you.
<redir> morning
<babbageclunk> jhobbs: ok, I'll try that. Do you feel like the bundles being deployed would make a difference? Or is it more about the models being added and removed?
<jhobbs> i really don't know
<jhobbs> it seems the failure is around models being created and destroyed
<jhobbs> but maybe other stuff going on at the same time contributes to that
<babbageclunk> jhobbs: Yeah, that makes sense.
<babbageclunk> jhobbs: Is this maas 2, or 1.9? I'll start trying to reproduce it with beta16 - are you working with that or another version in particular?
<jhobbs> babbageclunk: MAAS 2, and that was with beta 14
<jhobbs> i can try to reproduce again with 16, maybe later today, if i do, are there any settings i should enable for logging?
<frobware> dimitern: oooh bridges! http://178.62.20.154/~aim/bridges.png
<frobware> dimitern: minor quibble -  I can't actually login to the node
<dimitern> frobware: weeell - it's alpha1 :)
<babbageclunk> jhobbs: I think the logging you're using is good - mostly I want to be able to poke around in the system once it's in this state to try to work out what to look at next.
<frobware> dimitern: and, heh... all the things we've discovered you can't do.... and some of them are back again.
<frobware> dimitern: MTU issues -> http://pastebin.ubuntu.com/23116684/
<dimitern> frobware: we've been there, done that
<dimitern> frobware: but that's likely as much curtin's fault as maas'es
<frobware> dimitern: I'll raise this via email/bug with maas folks
<dimitern> frobware: +1
<frobware> dimitern: I need to EOD now - will pick this up in the morning.
<dimitern> frobware: ok, have a good one ;)
<dimitern> I'll be EODing soon as well
<mup> Bug #1618963 opened: Local provider can't deploy on xenial <juju-core:New> <https://launchpad.net/bugs/1618963>
<jcastro> anyone have opinions on: https://bugs.launchpad.net/juju/+bug/1618996
<mup> Bug #1618996: unable to specify manually added machines from bundle.yaml <juju:New> <https://launchpad.net/bugs/1618996>
<katco> jcastro: do you know if that worked in b15?
<katco> jcastro: or is this a new feature request?
<natefinch> jcastro: is the bundle supposed to declare the machines, too?  I haven't really dealt with bundles yet, but I'm sorta surprised that it would work without machines specified in the bundle itself
<jcastro> the bundle is declaring the machines
<katco> natefinch: there is a "To" field i think might allow you to place things on existing machines.
<jcastro> right
<katco> jcastro: i'm not sure if this is a regression or a new feature. could you check to see if it works on b15?
<katco> jcastro: also, is this bundle hand-crafted? wondering if the to directives are wrong: https://github.com/juju/charm/blob/v6-unstable/bundledata.go#L201
<katco> jcastro: i.e. maybe it should be "cinder/0" or "lxc:0", not "1"
<natefinch> also, does it work somewhere other than manual?
<jcastro> marcoceppi: ^^
<redir> Review anyone? http://reviews.vapour.ws/r/5569/
<marcoceppi> jcastro: what's the action item?
<alexisb> katco, can you help out redir
<katco> alexisb: yes in a bit
<redir> no major rush
 * katco just finished a true unit test for deploying a bundle :)
 * redir plays trumpet, pops champagne and passes a glass to katco
<katco> i haven't looked at the diff, but i think this will be a pain to review.
 * redir backs away slowly
<katco> i need to move some things around too
<redir> alexisb: you have some controllers created ?
<redir> anyone have a couple controllers created
<redir> just laying around...
<katco> redir: i have some bootstrapped
<redir> katco: what's the output of juju show-controllers?
<natefinch> hey, color, neat
<katco> redir: gah.... i am lying to you. when did i destroy that...
<natefinch> redir:
<natefinch> CONTROLLER  MODEL       USER         CLOUD/REGION
<natefinch> google*     default     admin@local  google/us-east1
<natefinch> localhost   controller  admin@local  lxd/localhost
<natefinch> lxd         default     admin@local  lxd/localhost
<katco> natefinch: ta
<natefinch> google is in blue
<redir> natefinch: that is show or list?
<redir> looks like show
<natefinch> redir: oh... I didn't realize there were two different commands
<natefinch> that's horrible
<katco> god me either...
<redir> which is that?
<redir> that you have the output from?
<katco> that is awful. who upon first use knows the difference between show and list?
<natefinch> that's "juju controllers"  which uh... is the same as list-controllers
 * katco face-palms hard
<redir> ok so what's the output of show-controllers?
<redir> just one entry?
<natefinch> yes, just one
<katco> juju show-controllers should be juju controllers --detail{s,ed}
<katco> or something
<natefinch> katco: +10000000000000000000000000000000
<katco> but not 2 commands with verbs that are synonymous
<redir> whelp
<alexisb> redir, yes I do
<redir> alexisb: natefinch answered my question
<redir> but it leaves me new questions
<natefinch> yeah, show controllers only ever shows one, as far as I can tell, the current one
<natefinch> which uh, makes it poorly named
<redir> natefinch: yes it appears to be an alias to show-controller
<natefinch> aaaaahhhhhhh
<katco> natefinch: juju controllers <name> could do the right thing
<redir> at least help for plural shows the help for singular
<natefinch> redir: yes that appears to be true
<katco> wait... so show-controller is just an alias for list-controllers?
<natefinch> no no
<redir> no
<natefinch> show-controllers is an alias for show-controller
<redir> show-controllers appears to be an alias for what natefinch said
<katco> ...why
<natefinch> except that every other command is very careful not to alias plurals... because show-xxx  is supposed to show exactly one, and list-xxx is supposed to show many
<katco> we should remove that alias. show-controller makes *more* sense to me at least
<alexisb> natefinch, redir, katco: the cli is consistent (or should be consistent) with "<somecommand>s" being the same as "list<command>s"
<redir> because strange attractors
<alexisb> if show-controller is aliased to controllers that is wrong
<katco> alexisb: show-controllers is aliased to show-controller (i think?)
<alexisb> ah yeah that needs to be cleaned up still
<redir> ding ding I think katco said the right thing
<katco> which makes no sense to me
<redir> and natefinch before that
<alexisb> we still have plurals where we shouldn't
<alexisb> list should be plural and show should be singular
<redir> so I just added agent version info to show-controllers
<redir> but it only shows me one
<alexisb> the original design didn't start that way and we haven't cleaned up yet
<alexisb> there is a bug open
<natefinch> oh man
<redir> and I am trying to understand if it should show more
<redir> Oh then I just added version to show-controller and plural is a vestigial alias
<redir> yes?
<natefinch> is there a bug open to hide aliases from juju show commands? cause, right now:
<natefinch> r$ juju help commands | wc -l
<natefinch> 168
<alexisb> redir, there should not be an alias for show controller
<redir> OK
<redir> alexisb: care to HO?
<alexisb> redir, sure
<jcastro> while we're at it, all the action CLI commands need a redo IMO
<redir> standup?
<alexisb> jcastro, adding to the pile will not help (also you will need to elaborate)
<jcastro> heh
<katco> jcastro: only we can criticize our CLI! we're retaking these complaints
<jcastro> yeah so, if you think of doing an action show-action-status and show-action-output break the flow
<jcastro> I can never remember them so I constantly have to refer to the docs
<natefinch> Our CLI: https://1.bp.blogspot.com/-ZM7ejcL9pk8/Vr4roZBEJsI/AAAAAAAACpY/oyyCEKiAs7A/s1600/TheSimpsons1218-1.jpg
<jcastro> but none of my complaints are 2.0 material I don't think
<katco> redir: bootstrapping your change now, only 1 minor comment for the review
<thumper> morning
<redir> katco: k tx
<redir> katco: and going to eradicate the alias for that too
<katco> redir: ship it
<redir> katco: tx
<menn0> wallyworld: here's the change to extract the unit status logic in the apiserver http://reviews.vapour.ws/r/5571/
<wallyworld> ok
<alexisb> thumper, axw ping
<alexisb> perrito666, ping
<axw> alexisb: pong, just joined
<perrito666> alexisb: pong
<alexisb> perrito666, standup
<perrito666> yay bug landed
#juju-dev 2016-09-01
<axw> wallyworld thumper menn0 anastasiamac: could someone please review http://reviews.vapour.ws/r/5572/, fixes the blocker
<axw> if it looks good, please land - I need to go help with kids
<thumper> axw: ack
<redir> two in a row perrito666!
<redir> ~warp speed~
<menn0> axw: looking
<wallyworld> menn0: axw: sorry, just got off a call, menn0 i can look at your PR in a minute
<alexisb> anastasiamac, ping
<menn0> wallyworld: cheers
<anastasiamac> alexisb: pong :)
<anastasiamac> alexisb: a-team standup?
<alexisb> sure
<wallyworld> menn0: in the PR, it looks like the "lost" processing has been omitted for machines?
<menn0> wallyworld: it was never there
<menn0> wallyworld: there should be no functional change in this PR
<wallyworld> ah, right, looks like there was some dumb no-op code there for machine status
<menn0> wallyworld: yep there was some dead code there. I meant to mention it in the PR description, sorry.
<wallyworld> no worries, i just had a reading comprehension problem
<wallyworld> menn0: +1, nice to see that refactoring
<axw> wallyworld: can you please look at http://reviews.vapour.ws/r/5572/ also
<wallyworld> ok
<wallyworld> axw: was the main issue adding back the EnableHTTPSListener call?
<axw> wallyworld: yes
<wallyworld> axw: would be nice to test on a virgin system, to repro the scenario that led to the bug, but maybe that's a bit difficult
<axw> wallyworld: I reproduced the issue by wiping out the core.https_address config from my lxd
<wallyworld> awesome, ok
<wallyworld> lgtm
<axw> wallyworld: sorry, should have included that in QA
<wallyworld> no worries
<axw> wallyworld: the CI machine is out of disk, and I need to head out for a bit. can't land my branch just yet
<wallyworld> damn
<redir> doh
<menn0> wallyworld: thanks for the review
<wallyworld> np
<menn0> gosh darn it... github.com/juju/charm/hooks has moved/gone
<menn0> thumper: so the migration export format only stores the charm URL against the application
<menn0> thumper: but a particular unit might be running a different charm version
<menn0> thumper: if a charm upgrade is in progress
<menn0> thumper: axw picked up that the prechecks don't prevent a migration if a charm upgrade is in progress
<menn0> thumper: we either need to extend the export format to store charm URL per unit
<menn0> thumper: (which is probably needed when the migration format gets used for backup and restore)
<menn0> thumper: or we block migrations during charm upgrade
<menn0> thumper: thoughts?
<anastasiamac> menn0: this bug has been assigned to u, but I suspect u were snowed under and haven't had a chance to look at.. bug 1453805... is it correct? m thinking to re-target it to next beta...
<mup> Bug #1453805: Juju takes more than 20 minutes to enable voting <ci> <ensure-availability> <intermittent-failure> <regression> <juju:Triaged by menno.smits> <juju-core 1.23:Fix Released by menno.smits> <juju-core 1.24:Fix Released by menno.smits> <https://launchpad.net/bugs/1453805>
<menn0> anastasiamac: yeah, snowed under ... retarget please
<anastasiamac> menn0: \o/
<thumper> menn0: precheck for charms
<thumper> menn0: we will have to update the format though I think
<menn0> thumper: ok, easy enough
<anastasiamac> thumper: menn0: wallyworld: axw: i thought I heard someone say during standup that they were working on fixing bundle deployment.. has bug 1555808 been addressed as part of that work?
<mup> Bug #1555808: Cannot deploy a dense openstack bundle with native deploy <2.0> <2.0-count> <bundles> <cdo-qa> <ci> <deployer> <juju-release-support> <jujuqa> <maas-provider> <juju:Triaged> <https://launchpad.net/bugs/1555808>
 * wallyworld doesn't know
<axw> likewise - I missed that
<thumper> anastasiamac: my fix may have addressed that
 * anastasiamac have troubles believing that there is something that wallyworld doesn't know :-P
<thumper> but it wasn't the specific fix
<anastasiamac> thumper: \o/ i'll get ppl to re-test :D tyvm!
<anastasiamac> veebers: do u know if there is a functional test for this ^^ bug?
<wallyworld> CI doesn't yet deploy openstack bundles
<wallyworld> to the best of my knowledge; it's a gap that needs to be filled
<wallyworld> we tend to rely on OIL
<anastasiamac> wallyworld: in my conversations with Torsten and Nicholas, there have been movements in this area... i'll confirm :D Thank you for answering Chris's question \o/
<wallyworld> i'd love to know how far away such movements are from something concrete
<veebers> anastasiamac: sorry missed the ping, seems wallyworld answered for me :-)
<anastasiamac> wallyworld: axw: i thought bug 1582021 was addressed :D
<mup> Bug #1582021: Juju loses track of current controller when destroying model <2.0> <bitesize> <destroy-model> <juju-release-support> <usability> <juju:Triaged> <https://launchpad.net/bugs/1582021>
<wallyworld> so did i, it could be a dupe
<anastasiamac> wallyworld: i love those \o/ i'll mark as fix committed :D
<axw> wallyworld: your PR is obese, too hard to see what's going on. can you please point out where the OpenParams.ControllerUUID is used?
<wallyworld> axw: yeah, sorry, i didn't know how to split it up
<wallyworld> it's used in the ec2 provider (and hopefully soon openstack) to construct the firewaller component
<wallyworld> the controller uuid is needed to manipulate security groups
<wallyworld> remember how in openstack we didn't have the controller uuid available and had to make that api call?
<wallyworld> we can now revert that change
<axw> wallyworld: why would we need controller-wide security groups? shouldn't they be model specific?
<axw> wallyworld: which API call?
<wallyworld> let me check the exact one
<wallyworld> axw: before i do that, here's a call that's made in the ec2 provider in current master: controllerSecurityGroups(controllerUUID)
<wallyworld> maybe the intent was for controller model uuid
<wallyworld> ah, no, i think controller uuid is to filter on tag
<wallyworld> we tag ec2 security groups with model and controller uuid
<axw> wallyworld: if we just move the security group creation to the Create method, like we were planning to, we don't need OpenParams.ControllerUUID
<axw> it already has the controller UUID there anyway...
<wallyworld> that may be a very valid point
<wallyworld> i can look at that and revert the controller uuid in open params
<axw> thanks
<axw> wallyworld: but before you do, what was the openstack case?
<axw> maybe it's needed there
<wallyworld> axw: i think with openstack, we can tag security groups, so need to construct the name as juju-controlleruuid-modeluuid
<wallyworld> let me check the code
<axw> wallyworld: assuming you mean s/can/can't/ -- yeah I think that's the case. probably same deal then, move to Create
<wallyworld> axw: it's to get all the security groups for instances to be stopped; i just checked and ec2 also makes an api call - i think openstack didn't and then we had to so it's become no worse than ec2; so i think we are ok to revert that controller uuid in open params
<axw> cool, sounds good
<wallyworld> at least it will make the pr smaller
<wallyworld> axw: i even now see the todo in ec2 environ create, sigh, forgot about that before
<axw> wallyworld: :)  is it even necessary at the moment though? the code that's there already has access to controller UUIDs AFAICS
<axw> wallyworld: I'd prefer to do that change in another PR if it's not necessary, because there are possible negative side effects
<axw> wallyworld: Create is currently called synchronously, when adding a model. so if we do anything slow in there, it means add-model is going to be slow
<wallyworld> axw: i found out that not all places had the controller uuid. but. all i need to do is create the firewaller struct in create
<wallyworld> and i can call it later when needed using the current code
<axw> wallyworld: they're not necessarily called on the same Environ
<axw> in fact, s/necessarily// :)
<anastasiamac> axw: do u know if bug 1607347 is related to some credential change?
<mup> Bug #1607347: Password for juju-gui not showing up after a change <cpe-sa> <juju:Triaged> <https://launchpad.net/bugs/1607347>
<wallyworld> axw: damn, a bit more to do - appears ec2 tests don't call Create()
<wallyworld> which is kinda important :-)
<wallyworld> axw: huh, looks like only prod code for all the providers calls create
<anastasiamac> wallyworld: axw: also was bug 1614010 addressed as part of recent works in he area?
<anastasiamac> the*
<wallyworld> anastasiamac: that one is invalid
<anastasiamac> wallyworld: could u plz update the bug and explain why it's invalid as part of the comment? ;D
<wallyworld> i may not have all the exact reasoning to hand; would be better to ask rick whose team did the work
<wallyworld> something along the lines of needing a separate juju home for each user
<wallyworld> i'm not sure i agree with it
<wallyworld> mick did the work i believe
<anastasiamac> wallyworld: axw: and what about bug 1607347?
<mup> Bug #1607347: Password for juju-gui not showing up after a change <cpe-sa> <juju:Triaged> <https://launchpad.net/bugs/1607347>
<axw> anastasiamac: we wipe the password off disk when you call change-user-password
<anastasiamac> axw: \o/ could you please add a comment to the bug?
<anastasiamac> wallyworld: do u know if ur recent work addressed bug 1617046 as a drive-by?
<mup> Bug #1617046: WARNING cannot read current model: current model for controller <name> not found <oil> <oil-2.0> <juju:Triaged> <https://launchpad.net/bugs/1617046>
<wallyworld> anastasiamac: possibly, not sure off hand, there's been activity in that area
<wallyworld> axw: it appears we only call env.Create() during a new model operation; it is not called at bootstrap. and it needs to be for ec2 at least so that the new firewaller is set up
<wallyworld> you agree that i need to add a create call at bootstrap?
<axw> wallyworld: we do not call Create on the same Environ that other methods are called with. so setting up a firewaller in that method will be pointless. you can/should create *security groups* in that method
<axw> wallyworld: but again: do we actually need to make that change in this PR? from what I could see, controllerUUID is *already* available where we were creating the security groups
<axw> i.e. in the StartInstances method
<wallyworld> axw: firewaller is misnamed perhaps (i copied off the openstack naming) - all it is is a component to do the security group thing
<wallyworld> axw: i'll check the code again - there were places where we did not have controller uuid
<axw> wallyworld: my point is that Create should not modify the Environ object - or if it does, you should not *expect* that it has done so from any other Environ method
<wallyworld> axw: that's fine - other environs are opened and created as needed
<wallyworld> and those other ones have the create called to set up things
<wallyworld> but for bootstrap, the bootstrap instance startup was failing because the security group bit had not been initialised because create wasn't called
<axw> wallyworld: right, Create is only called for hosted models
<wallyworld> i'll take another look to confirm that we always have controller uuid where needed
<wallyworld> axw: yeah, i think in an earlier incarnation i mixed up controller uuid and controller model uuid and so controller uuid is available where needed; i've reverted all the ec2 provider changes and open params ones, diff is 1 page of files less
<wallyworld> except i left in a create call, need to remove it
<wallyworld> done
<mup> Bug #1616832 changed: manual environment juju-db timeout <manual-provider> <juju:Triaged> <juju-core:Won't Fix> <juju-core 1.25:Incomplete> <https://launchpad.net/bugs/1616832>
<mup> Bug #1618798 changed: endpoint not used in lxd provider <juju:Triaged> <https://launchpad.net/bugs/1618798>
<mup> Bug # changed: 1492000, 1566414, 1584616, 1596597, 1612645, 1614364, 1614633
<wallyworld> axw: thanks for review - i was 50/50 on dropping the controller uuid from the login result. i couldn't see anything that used it. but i guess some callers may want to
<axw> wallyworld: seeing as you don't need to know it before hand, I think it's best to keep it
<wallyworld> ok, will revert
<voidspace> fwereade: ping - got a minute
<voidspace> fwereade: ?
<fwereade> voidspace, sure
<voidspace> fwereade: working on migration for cloudimagemetadata
<voidspace> fwereade: cloud image metadata stuff is in a sub-package of state
<voidspace> fwereade: the migration internal test checks migrated fields so needs access to cloudimagemetadataDoc
<voidspace> fwereade: but this isn't exported and the test is in the state package
<voidspace> fwereade: so the two options I can see are either export the doc or copy the migration test infrastructure into the sub-package
<voidspace> fwereade: can you think of another option?
<voidspace> exporting the doc seems like a bad idea
<fwereade> voidspace, thinking/spooling
<axw> wallyworld: I'm looking to see if we can fix the dummy provider now
<axw> looks diable
<axw> doable*
<wallyworld> axw: econtext switch, which bit?
<axw> wallyworld: getting rid of references to controller UUID from dummy provider's PrepareConfig
<wallyworld> ah ok. did you want me to land this current work first once i fix the comments?
<fwereade> voidspace, I'm thinking that the package boundary should be firm enough that it *should* be sufficient to check the data that's exported -- but, yeah, I'm drawing a bit of a blank on how to do that with certainty that we're not missing fields as we evolve
<axw> wallyworld: no, I don't want PrepareConfigParams getting worse than it is already
<wallyworld> ok
<fwereade> voidspace, is the test machinery at all suitable for extraction and reuse rather than copying?
<voidspace> fwereade: it's a pretty simple function (and a method)
<voidspace> fwereade: so could easily be movedd
<voidspace> *moved
<voidspace> fwereade: the heavy lifting is a 15 line reflection function getting the exported fields
<fwereade> voidspace, I'm gently leaning that way at the moment then
<voidspace> fwereade: I'll do that
<voidspace> fwereade: just need to find the right place for it to live
<voidspace> fwereade: and as it will be a single test in cloud image metadata I'll just inline the assert and move the reflection function somewhere reusable
<voidspace> fwereade: thanks
<fwereade> voidspace, cheers
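A minimal sketch of the kind of reflection helper voidspace describes, one that collects the exported field names of a struct so a test can assert that every field is accounted for. The names `exportedFields` and `imageMetadataDoc` are made up for illustration and are not taken from the juju code base.

```go
package main

import (
	"fmt"
	"reflect"
)

// imageMetadataDoc is a hypothetical stand-in for a state document type.
type imageMetadataDoc struct {
	Id      string
	Stream  string
	Region  string
	private int // unexported, so it is skipped below
}

// exportedFields returns the names of the exported fields of the struct
// (or pointer to struct) passed in, in declaration order. It returns nil
// for anything that is not a struct.
func exportedFields(v interface{}) []string {
	t := reflect.TypeOf(v)
	if t == nil {
		return nil
	}
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct {
		return nil
	}
	var names []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		// PkgPath is empty only for exported fields.
		if f.PkgPath == "" {
			names = append(names, f.Name)
		}
	}
	return names
}

func main() {
	fmt.Println(exportedFields(imageMetadataDoc{}))
}
```

A migration test could compare this list against the set of fields it knows how to migrate, so that a newly added field makes the test fail until it is handled.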
<rock_> Hi. How can I set bugs-url and homepage  for my newly developed charm? Please anyone respond to this.
<wallyworld> marcoceppi: ^^^^^^ you around to help?
<wallyworld> axw: i've pushed some changes, will retest, i can wait till you propose and land yours and then rebase
<axw> wallyworld: thanks, just gotta pull in your restore changes
<wallyworld> ok, will go feed dog etc
<perrito666> Morning
<axw> wallyworld: my WIP is here: https://github.com/juju/juju/pull/6137, seems to be an issue with leaking mongo. I need to go help with the kids, will try and be back later
<wallyworld> ok
<anastasiamac> babbageclunk: fwereade: do u have any idea what could be wrong in bug 1616832 or if there is a workaround for juju 1.x?
<mup> Bug #1616832: manual environment juju-db timeout <manual-provider> <juju:Triaged> <juju-core:Won't Fix> <juju-core 1.25:Incomplete> <https://launchpad.net/bugs/1616832>
<mup> Bug # changed: 1425808, 1468752, 1484105, 1496166, 1498642, 1500981, 1504602, 1506498, 1510651, 1510675, 1511103, 1511235, 1512875, 1513659, 1517474, 1517535, 1519403, 1519473, 1522409, 1525868, 1532085, 1534296, 1535678, 1539684, 1541228
<fwereade> anastasiamac, nothing really springs to mind... at least not without stopping mongo, running it locally without auth, and seeing what's actually in the admin database
<fwereade> anastasiamac, it does look like machine-0 has forgotten how to log in, but I don't have a mechanism for how that'd happen
<dimitern> frobware, babbageclunk: hey guys, I've updated http://reviews.vapour.ws/r/5559/ and still need +1 to land it
<frobware> jam: ^^ based on our current discussion we may want to look at this again
<anastasiamac> fwereade: \o/ thank you. could u plz comment in the bug with these thoughts and i'll follow up with Menno in the morning ;D
 * dimitern steps out for ~45m
<axw> wallyworld: red herring it seems, I had the same failure on master. running tests again and then will propose
<wallyworld> ok
<wallyworld> looked good from what i saw
<axw> wallyworld: if you're happy with https://github.com/juju/juju/pull/6137, please land and rebase off that
<wallyworld> ok
<wallyworld> axw: let me know if you're happy with the latest version of mine; i've already fixed the controller uuid thing
<axw> wallyworld: looking
<axw> wallyworld: LGTM with that field removed
<wallyworld> axw: awesome, ta. yours is almost done landing hopefully
<mup> Bug #1484105 opened: juju upgrade-charm returns ERROR state changing too quickly; try again soon <bug-squad> <upgrade-charm> <upgrade-juju> <juju-core:Incomplete> <https://launchpad.net/bugs/1484105>
<mup> Bug #1573136 changed: kill-controller is stuck, lots of "lease manager stopped" errors <juju:Fix Released by fwereade> <juju-core:Won't Fix> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1573136>
<babbageclunk> alexisb, wallyworld - Everyone cool for me to pause trying to reproduce bug 1611159 and work on maas2 instance tagging instead?
<mup> Bug #1611159: model not successfully destroyed, and error on "juju list-models" <oil> <oil-2.0> <juju:Triaged by 2-xtian> <https://launchpad.net/bugs/1611159>
<wallyworld> babbageclunk: big +1 from me :-)
<wallyworld> that bug is a rabbit hole
<wallyworld> you can come back to it
<babbageclunk> wallyworld: I'm definitely feeling that too.
<babbageclunk> wallyworld: Ok, I'll start on that unless alexisb says otherwise in the meantime.
<wallyworld> sgtm
<wallyworld> we discussed it briefly
<wallyworld> and it's good to reset your brain a bit as well
<babbageclunk> yeah, absolutely - letting your subconscious have a go at it in the background wh
<babbageclunk> rogue wh
<mup> Bug #1573136 opened: kill-controller is stuck, lots of "lease manager stopped" errors <juju:Fix Released by fwereade> <juju-core:Won't Fix> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1573136>
<alexisb> babbageclunk, see priv chat
<dimitern> jam, are you around?
<mup> Bug # changed: 1573136, 1595617, 1597318, 1598329, 1604988
<dimitern> babbageclunk: if you can have a look at http://reviews.vapour.ws/r/5559/ again, I'd appreciate a +1 :)
<mup> Bug # opened: 1595617, 1597318, 1598329, 1604988
<babbageclunk> dimitern: yup, lookin now
<babbageclunk> g
<mup> Bug # changed: 1595617, 1597318, 1598329, 1604988
<dimitern> babbageclunk: ta!
<babbageclunk> dimitern: LGTM!
<dimitern> babbageclunk: tyvm
<katco> voidspace: natefinch: standup time
<mgz> so, go 1.6
<mgz> I thought I'd consistently get random ordering in maps
<mgz> is that not actually the case?
<mgz> in specific,
<mgz> iterate over a map of three objects
<mgz> I'd expect a test expecting a specific order to fail most of the time
<mgz> is it not actually that random?
<frobware> mgz: make your map content bigger
<voidspace> katco: booger!
<voidspace> katco: I caught up and didn't notice the time, sorry
<frobware> mgz: MOAR
<voidspace> I'm going to start setting an alarm
<mgz> frobware: so, there's special handling for small maps? or something?
<dimitern> mgz: yes there is
<frobware> mgz: Just the hashing of your object may yield the same result each time
<dimitern> mgz: I can't remember the size, but it was ~10 entries
<frobware> mgz: s/object/keys
<mgz> aha, thanks guys
<dimitern> (from what I remember from gophercon, maps are internally implemented as linked arrays of fixed size - buckets, + overflow space)
<frobware> natefinch: do you want to set aside some time to install MAAS before I EOD in a few hours?
<dimitern> that's why using a map lookup vs slice iteration for small number of elements is not worth the overhead
<natefinch> frobware: yeah, will you have time in like 15 minutes?
<frobware> dimitern: let's do 30 - would that work?
<frobware> natefinch: ^^ (sorry!)
<dimitern> ha! :)
<frobware> dimitern: habits...
<natefinch> frobware: sounds good
<natefinch> mgz: randomish - https://play.golang.org/p/lzGvkm9VxC
<natefinch> mgz: it appears as though small maps are just occasionally randomized
<mgz> that is interesting, and pretty odd
<natefinch> mgz: pretty sure small maps are implemented as slices
<rock>  dimitern: Hi. as you said yesterday I upgraded to juju version 2.0-beta16. And also I deployed juju gui manually $juju deploy cs:juju-gui-134 --series xenial. It was deployed. But from gui I am not able to add relation between mycharm to cinder.ERROR: Relation biarca-openstack:juju-info to cinder:juju-info: cannot add relation   "biarca-openstack:juju-info cinder:juju-info" : principal and subordinate applications' series must match
<rock> I am getting the same issue. juju status pasted info http://paste.openstack.org/show/565721/
<natefinch> mgz: they add in fake random access so that when you get to a larger map that isn't just a slice, you won't get screwed by having code that relies on ordered iteration
<rock> dimitern: So I am thinking it might be a juju-gui bug.
<frobware> natefinch: you'll need a server ISO before we start
<natefinch> frobware: I can get that
<frobware> natefinch: xenial if you want MAAS 2.x
<natefinch> frobware: I wouldn't get anything else :)
<frobware> natefinch: it depends if you need to repro against MAAS 1.9 though - that's trusty only
<natefinch> well that's a PITA
<dimitern> rock: hey! did you also deploy your biarca charm with specified --series ?
<natefinch> frobware:This bug is 2.0, and that's likely to be where most bugs are in the future, I'd imagine
 * frobware chuckles... :)
<dimitern> natefinch, mgz: here's the gophercon talk that explains how maps are implemented, if you're interested :) https://github.com/gophercon/2016-talks/tree/master/KeithRandall-InsideTheMapImplementation
<mgz> it's a journey of fun discovery :)
<mgz> okay, I'm going to go ahead and change this delete implementation to preserve order
<katco> natefinch: frobware: andreas has offered up help in reproducing bug 1614635 simply
<mup> Bug #1614635: Deploy sometimes fails behind a proxy <landscape> <juju:Triaged by natefinch> <https://launchpad.net/bugs/1614635>
<katco> natefinch: frobware: please ping him if you run into trouble getting the environment set up
<natefinch> looks like almost exactly 75% of the time, iterating a small map will do so in order
<rock> dimitern: Yes.
<perrito666> natefinch: iirc express randomization was added to that
<natefinch> perrito666: yep.... I just thought it was interesting that it wasn't just always random
<rock> dimitern: Manually, using juju CLI I deployed juju-gui by specifying series as xenial. It worked. It was showing juju-gui series as Xenial.
<perrito666> natefinch: well that is the definition of random
<natefinch> perrito666: well, no... sometimes it decides to be random, and a lot of the time it decides not to be
<perrito666> its random
<natefinch> 50% of the time I iterate over a map of 5 items, it's exactly 0 1 2 3 4.... that's not random.  If it was random, that order would only come up 1/120 times
<natefinch> it's choosing to be random some of the time, and a bunch of the time it's choosing just to iterate in order
<perrito666> natefinch: I recall a random implementation in turbo C that would spit the same number sequence each time
<perrito666> it is random at random, how meta
<natefinch> yes... that probably helps keep the average runtime near constant
<natefinch> notably, the bigger the map, the less often it goes in order... 3 items it's 75% in order, 5 items, it's 50%  8 items it's 12%
<natefinch> ...those numbers still being hugely higher than pure random would indicate
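The percentages natefinch reports are consistent with a small map living in a single 8-slot bucket and iteration starting at a random slot offset: with 3 keys, 6 of the 8 starting offsets yield the in-order sequence (75%), with 5 keys 4 of 8 (50%), and with 8 keys 1 of 8 (12.5%). That reading is an assumption about the Go 1.6-era runtime, not something stated in the channel, and newer Go releases may behave differently. A minimal sketch for measuring it (the helper name inOrderRuns is made up for illustration):

```go
package main

import "fmt"

// inOrderRuns builds a map with keys 0..n-1 (inserted in ascending order)
// and reports how many of `trials` iterations visit the keys in that order.
func inOrderRuns(n, trials int) int {
	m := make(map[int]int, n)
	for i := 0; i < n; i++ {
		m[i] = i
	}
	count := 0
	for t := 0; t < trials; t++ {
		next, ordered := 0, true
		for k := range m {
			if k != next {
				ordered = false
				break
			}
			next++
		}
		if ordered {
			count++
		}
	}
	return count
}

func main() {
	const trials = 10000
	for _, n := range []int{3, 5, 8} {
		fmt.Printf("%d items: %d/%d iterations in order\n", n, inOrderRuns(n, trials), trials)
	}
}
```

The exact counts depend on the Go version and runtime, so a test should never rely on any of them.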
<frobware> natefinch, katco: do we still want to setup locally?
<katco> frobware: natefinch: andreas said the setup is easy; maas in vmware, add 1 node, boom
<katco> frobware: natefinch: so if it starts sounding any more complicated than that, i would say ping him
<natefinch> uh... well, maas in vmware is still ¯\_(ツ)_/¯ to me
<frobware> natefinch: want to start now?
<natefinch> frobware: yes :) Thank you :)
<frobware> natefinch: https://hangouts.google.com/hangouts/_/canonical.com/andrew-mcdermot?authuser=0
<dimitern> rock, right and your charm - was it showing xenial as well?
<katco> voidspace: sorry for delay, meetings... np about standup. how goes migrations?
<voidspace> katco: there's progress - it compiles, so mostly fixing tests, so not far off now. Probably tomorrow though.
<katco> voidspace: cool
<voidspace> katco: ran into trouble with packages - cloudimagemetadata doesn't live in the state package, so our usual import and test techniques don't work
<voidspace> katco: plus I did a grand rename to get rid of the abomination "cloudimagemetadatas" I originally used :-)
<katco> haha
<katco> what's it named now?
<voidspace> katco: some places I could simply use CloudImageMetadata, even though it returns a slice of Metadata
<voidspace> katco: and the places I couldn't I used cloudimagemetadataset
<voidspace> katco: which is not ideal (it's not a set) but still better
<natefinch> frobware: all installed
<frobware> natefinch: ok let's jump back in to the HO
<redir> morning
<mgz> heya redir
<redir> :)
<macgreagoir> g'day redir
<frobware> katco, natefinch: all done. working maas setup. \o/
<katco> frobware: ya! thanks a lot frobware
<redir> \o/
<redir> bbiab, going for a haircut.
<lazyPower> Hey core devs o/  do juju-gui bugs get targeted against http://launchpad.net/juju now that its moved into core?
<natefinch> lazyPower: no idea
<cmars> hatch, do you know? ^^
 * hatch looks
<cmars> i thought we just used github issues, but wasn't sure
<hatch> lazyPower: if it's a GUI bug it should be reported https://github.com/juju/juju-gui/issues unless it specifically has to do with the juju core cli
<lazyPower> well, i filed it here: https://bugs.launchpad.net/juju/+bug/1619389
<mup> Bug #1619389: juju-gui fails to parse exported bundle in beta-16 <juju:New> <https://launchpad.net/bugs/1619389>
<lazyPower> i'll xpost to juju-gui repo
<cmars> lazyPower, thanks
<hatch> lazyPower: it's a known bug
<hatch> and it's fixed in gui tip
<hatch> lazyPower: https://github.com/juju/juju-gui/issues/1937
<lazyPower> https://github.com/juju/juju-gui/issues/1966
<hatch> thanks lazyPower we're working on getting the next GUI out but we have some bigger tasks we're just wrapping up first
<lazyPower> no stress, magicaltrout found it so was just making sure we had it on the docket
<hatch> yup, and been fixed :)
<hatch> lazyPower: also to clarify, the GUI project has not been moved into core, only the ability to serve the GUI was added to core.
<lazyPower> you're now part of core, tl;dr
<hatch> lol definitely not
<hatch> or maybe I have been and I just don't know it yet!
<tvansteenburgh> can anyone give me an example of the 'directive' and 'scope' portions of a placement
<tvansteenburgh> like if i have lxc:7, is that all directive, and the default scope is the model?
<tvansteenburgh> basically trying to figure out how to convert a placement string into a Placement map
<katco> tvansteenburgh: try: https://github.com/juju/charm/blob/aece7b0e56c298641968239a7fa0b3466afa6ef5/bundledata.go#L626-L629
<tvansteenburgh> katco: thanks!
<katco> tvansteenburgh: np, happy hacking
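One plausible reading of a placement string such as lxc:7, not confirmed in the channel, is that the text before the first colon is the scope and the remainder is the directive. The sketch below shows only that split; the Placement type and splitPlacement helper are stand-ins for illustration, not juju's own definitions, and the handling of a string with no colon is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Placement is an illustrative stand-in for a scope/directive pair.
type Placement struct {
	Scope     string
	Directive string
}

// splitPlacement splits "scope:directive" on the first colon.
// A string without a colon is returned with an empty scope, leaving the
// caller to pick a default scope (assumption, not juju's documented rule).
func splitPlacement(s string) Placement {
	if i := strings.Index(s, ":"); i >= 0 {
		return Placement{Scope: s[:i], Directive: s[i+1:]}
	}
	return Placement{Directive: s}
}

func main() {
	fmt.Printf("%+v\n", splitPlacement("lxc:7"))
	fmt.Printf("%+v\n", splitPlacement("7"))
}
```

The code katco links to is the authoritative reference for how bundle placements are actually parsed.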
<mgz> I had to add a bunch of refactoring to make one of my tests reliable, can I request a re-review in a sec?
<mgz> natefinch: around still?
<natefinch> mgz: I exist in this reality.
<mgz> :D
<perrito666> ok, apparently its impossible to bootstrap juju from my ISP
<perrito666> in aws
 * perrito666 goes a bit closer
<mgz> natefinch: okay, I have pushed a new commit on http://reviews.vapour.ws/r/5545/
<mgz> natefinch: the base fix has been reviewed, but I did some refactoring to make the new testing not subject to map ordering problems
<mgz> natefinch: so, I'd appreciate an eye on that part
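A common way to make a test independent of map iteration order is to collect the keys, sort them, and iterate over the sorted slice. This is a generic sketch with made-up names and data, not the refactoring in review 5545:

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order, giving callers a
// deterministic iteration order regardless of how the runtime walks the map.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Hypothetical data, used only to demonstrate the ordering.
	units := map[string]int{"wordpress": 2, "mysql": 1, "haproxy": 3}
	for _, name := range sortedKeys(units) {
		fmt.Println(name, units[name])
	}
}
```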
<natefinch> mgz: looking
<natefinch> god I hate significant whitespace
<natefinch> whats the opposite of deploy in maas?
<natefinch> (unrelated to my previous statement)
<mgz> natefinch: I was wondering
<mgz> natefinch: `maas ENV machine delete` for 2, I think it's `... node delete` in 1
<natefinch> delete or release?  I was just looking at the UI... I had deployed a machine to test that my maas is working correctly.  And now I want to undo that....
<natefinch> maybe release is the opposite of commission... not that I know what commission means
<mgz> natefinch: repoke about review
<natefinch> mgz: oops, sorry, had started looking at it then got distracted by maas stuff
<natefinch> mgz: reviewed
<mgz> natefinch: ta!
<cmars> how do i connect to juju's mongodb on xenial?
<cmars> tried: /usr/lib/juju/mongo3.2/bin/mongo --ssl --username admin --password <value of oldpassword> --sslAllowInvalidCertificates localhost:37017 admin
<cmars> doesn't work
<thumper> menn0: ^^
<thumper> cmars: I think menn0 has a script somewhere
<menn0> I thought the packaging was being fixed to include the mongo3.2 client
 * menn0 checks
<cmars> menn0, it's there
<cmars> menn0, i just can't log in with it, maybe my cmdline is wrong?
<menn0> could be... let me check my script
<menn0> mongo 127.0.0.1:37017/juju --authenticationDatabase admin --ssl --sslAllowInvalidCertificates --username "admin" --password "$password"
<cmars> ah
<menn0> that's what my script uses
<cmars> menn0, ok, that got me in, thanks
<cmars> menn0, can I have a copy of that script?
<menn0> cmars: I just confirmed that it still works.
<cmars> just needed the --authenticationDatabase flag
<menn0> cmars: sure... let me revise it first though. It goes and installs the client from the mongodb.org PPA b/c we didn't use to include the client.
<menn0> that's unnecessary now
<menn0> cmars: https://gist.github.com/mjs/0d0b89356654de04adf4935860642c5f
<menn0> cmars: run it from your machine like this: juju-db <controller:model>
<menn0> where "model" is the controller model name
<menn0> thinking about it, it should probably just assume the name "controller"
<cmars> menn0, thanks!
<cmars> this is going in my tools repo..
<menn0> cmars: i've just improved it
<menn0> cmars: if you now run "juju-db" without args it assumes machine 0 in the "controller" model for the currently selected controller
<menn0> which is usually what you want
<cmars> menn0, perfect, even better
<wallyworld> thumper: menn0: you free for a chat? in s.o. h.o.?
<mup> Bug #1617602 changed: juju status <service-name> stuck <status> <juju:Triaged> <juju-core:Won't Fix> <juju-utils:New> <https://launchpad.net/bugs/1617602>
<katco> that feeling when you can iterate through a change because it has an actual unit test
<wallyworld> cmars: hey, fyi, thumper just landed a fix for all the recent landing issues. you will want to rebase your kpi branch to fix all the things
<wallyworld> katco: to be fair, there would have been a test wouldn't there? maybe more like an integration test, but still a test to run?
<katco> wallyworld: yes, but it would have taken significantly longer to run
<cmars> wallyworld, if we've passed CI and there are no conflicts, i think we ought to be able to just land as-is
<wallyworld> cmars: true. i was just letting you know in case you were doing any more development on that branch before landing
<cmars> wallyworld, no, it's ready to go i think
<cmars> just waiting for QA blessing
<wallyworld> cmars: sweet, in that case it *will* land first time now :-)
<cmars> awesome. i spent my afternoon retrying so many times..
<wallyworld> cmars: i just noticed you guys had a bunch of landing bot failures yesterday
<wallyworld> yeah
<wallyworld> thought you'd like to know it was fixed
<cmars> that's great. i was getting worried :)
<wallyworld> cmars: yeah, i blame thumper :-)
<thumper> wallyworld: http://reviews.vapour.ws/r/5580/
<menn0> wallyworld: sorry, I've been AFK
<menn0> wallyworld: still want to talk?
<perrito666> I am late for standup sorry ill be 10 late
<thumper> 10 what?
<thumper> ms?
<thumper> s
<thumper> hours?
<anastasiamac> years
<anastasiamac> ?
<perrito666> I'll let you guess
<katco> perrito666: if you're late by 10 years i'll buy you a beer in 5
<perrito666> let's not tell katco that it was minutes so she buys me a beer in 5
<katco> perrito666: i didn't specify the unit =|
<perrito666> I feel  cheated :p
<katco> perrito666: i have used your own logic against you! mua! muaha ha!
<perrito666> It was just me typing on the phone, you attribute me too much intelligence
#juju-dev 2016-09-02
<wallyworld> redir: thanks for doc, i may have missed it, was there a test to check that "source" of config value is updated dynamically? ie if it shows "controller" and I update the region value to match that of the controller, the source should now say "region"
<redir> erm, lemme look
<wallyworld> redir: also, i assigned you a nice card to pick up next :-)
<redir> wallyworld: how do you update the region value to match that of the controller?
<redir> 067118
<wallyworld> redir: you should be able to do all the development either with lxd or aws without teardown of server so dev cycle should be fast
<redir> wallyworld: would you care to HO for a minute on the QA/CI stuff? prolly faster
<wallyworld> sure
<wallyworld> standup
<menn0> anastasiamac: I've added a request for more information, with instructions, on that ticket
<anastasiamac> menn0: u r brilliant! tahnk you :D
<anastasiamac> thank you even
<anastasiamac> thumper: i've missed 3 days. did u have a chance to import/export workloads for model migration?
<anastasiamac> wallyworld: model config defaults: are cloud region default done? and command re-work?
<anastasiamac> commands*
<wallyworld> no
<wallyworld> there's cards for it on the board
<anastasiamac> q1 or q2 or both?
<wallyworld> region defaults done partly
<anastasiamac> k
<axw> anastasiamac: I'm back
<anastasiamac> veebers: r u available? m happy to chat without u if u r busy and pull u in if we can't live without :D
<veebers> anastasiamac: I'm available
<anastasiamac> veebers: pm-ed u
<thumper> anastasiamac: payloads?
<thumper> yes
<thumper> resources, no
<anastasiamac> k:D
<katco> axw: ping
<axw> katco: [ong
<axw> pong even
<katco> axw: hey, long time. o/
<axw> katco: indeed, how's it going ?
<katco> axw: kind of chaotic here. kiddo is probably coming down with something :(
<katco> axw: how is your family doing?
<axw> that's never fun
<katco> yeah, it's no fun
<axw> katco: not bad, nothing eventful as that :p
<katco> lol, well that is good to hear ;)
<katco> axw: i pushed up further refactor of the deploy command. i couldn't get the tests passing tonight, but wanted to get someone's eyes on it since i'll be out for the next 2 weeks
<katco> axw: it's http://reviews.vapour.ws/r/5582/ if you (or anyone else) get a chance today. i'm afraid it's quite long, but i think a lot of the diff is tests
<axw> katco: ok. pretty busy today, but I'll see what I can do
<katco> axw: no worries. i understand if you can't get to it
<katco> so is it getting warmer over there?
<axw> katco: not really that warm here yet
<axw> katco: not as cold as it was a few weeks ago, but still fairly chilly
<katco> lucky
<katco> we had a nice fallish day today
<natefinch> Someday fall will come here.  It's been unseasonably hot all summer.
<katco> well we're technically still in summer i suppose
<natefinch> I think we've had like one below average temperature day all summer.  New England... where all the days are above average
<katco> wondering how much of that is climate change too =/
<natefinch> well yeah. When the last like 15 months or something have all been record highs, globally, and 14 of the 15 hottest years have been since 2000 (or some numbers pretty close to those)
<mwhudson> perrito666, etc: juju-mongodb3.2 3.2.9 builds fine on trusty it seems https://launchpad.net/~mwhudson/+archive/ubuntu/devirt/+packages
<natefinch> so weird.... the X button (window close) in the upper left of my windows is disabled when windows are maximized
<natefinch> no, sorry, all those buttons (minimize and maximize too) are disabled
<thumper> menn0, veebers: who is ready for a call now?
<menn0> menn0: i'm around
<veebers> thumper: I'm available
<thumper> I'll go with menn0 first because I think it'll be shorter
<thumper> menn0: 1:1?
<menn0> thumper: ok
<thumper> veebers: now?
<wallyworld> mwhudson: that's awesome. maybe we can get juju-mongodb on trusty updated at some point?
<mwhudson> wallyworld: yeah, should be possible
<wallyworld> \o/
<veebers> thumper: in 5?
<thumper> veebers: sure
<veebers> thumper: actually I wanted to ask wallyworld something but he's just disconnected. Now is fine :-)
<thumper> ok
<thumper> veebers: https://hangouts.google.com/hangouts/_/canonical.com/manual-ci?authuser=0
<anastasiamac> thumper: menn0: axw: is it possible to use `juju scp` to copy between units directly?
<thumper> kinda
<thumper> goes via the host
<menn0> anastasiamac: yes
<thumper> but yes, I think
<menn0> juju scp -- -3 from:path/to/file to:
<menn0> something like that
<anastasiamac> \o/ thank you :D then we have a bug... but i'll target it ;D
<menn0> the -3 is important
<menn0> you can't copy directly between machines because they don't have SSH keys for each other
<menn0> anastasiamac: I even made sure there was an example of this in the help
<menn0> (last one)
<anastasiamac> menn0: and that's why u r brilliant \o/ tyvm
<anastasiamac> menn0: -3 is important because it's...?
<menn0> $ man scp | grep -- -3
<menn0>      -3      Copies between two remote hosts are transferred through the local host.  Without this option the data is copied directly between the two remote hosts.  Note that this option disables the progress meter.
<menn0> anastasiamac: ^
<anastasiamac> menn0: \o/
<menn0> anastasiamac: the client can connect to the 2 Juju managed hosts, but the hosts can't connect to each other, so -3 works around that
<anastasiamac> menn0: awesome! i'll add it to the comment in the bug... i hope it's a user error and we don't need to do anything! thank you
<veebers> wallyworld: you recently landed changes re: controller UUID handling right?
<veebers> wallyworld: would this change what appears in a log forwarded syslog? (I think maybe the CI test for log-forward needs updated as it might be looking for the wrong uuid in the syslog file)
<wallyworld> yes - controller uuid is now different to controller model uuid
<veebers> wallyworld: which uuid will get forwarded for log-forwarding (and also, how would one get that uuid used)
<wallyworld> i'll look. if the test assumed controller UUID = controller model UUID then it needs fixing
<wallyworld> those values happened to be the same before but it is wrong to assume that
<wallyworld> you know the uuid of the controller as it is the one you are connecting to
<wallyworld> let me look at the test output
<veebers> wallyworld: we were using the uuid from 'show-controller' json/yaml output under controllername -> details -> uuid
<wallyworld> that is the correct thing to do
<veebers> wallyworld: hmm, that would suggest that the ci test is doing the right thing but it fails. I'll dig deeper to see what's happening
<wallyworld> i'm just looking at the code
<wallyworld> veebers: it depends on what you are looking for - the rfc5424 appname contains the model uuid not the controller uuid
<wallyworld> it would have worked by coincidence before if you are searching using controller uuid
<wallyworld> we do record the controller uuid in syslog structured data
<veebers> wallyworld: For this test I use the UUID found as outlined before to check the logs of the rsync-sink to ensure expected output is landing there.
<wallyworld> veebers: you misunderstand - i am saying you are probably checking the wrong content
<veebers> wallyworld: oh
<wallyworld> we record the model uuid, you are checking for controller uuid
<wallyworld> we record the model uuid in the appname field
<veebers> wallyworld: right, so I'm using the wrong uuid
<wallyworld> only if the script is looking in appname
<veebers> wallyworld: how does one get the controller model uuid?
<wallyworld> if the script is looking in structured data then it is correct
<veebers> wallyworld: its checking content of syslog on the rsyslog syslog file
<wallyworld> right, but the syslog content contains lots of different fields :-)
<wallyworld> which field is the script looking for
<wallyworld> appname of the structured data
<wallyworld> *or
<veebers> wallyworld: the check is doing a regex grep on syslog similar to: ^[A-Z][a-z]{,2}\ +[0-9]+\ +[0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}\ machine-0.0518053c\-31cd\-4d49\-8316\-a504c6d78389\ jujud-machine-agent-0518053c\-31cd\-4d49\-8316\-a504\ .*$
<veebers> (this is a specific example from a test)
<wallyworld> right, so i guess it is grepping assuming it is parsing the appname field
<wallyworld> use show-model controller to see the controller model uuid
<veebers> wallyworld: ok, I'll give that a spin now
<wallyworld> fingers crossed, should be a simple fix hopefully :-)
<veebers> hopefully :-)
<wallyworld> and you just get that uuid instead
<veebers> cool, makes sense
<wallyworld> we just needed to see what was being searched for in the syslog output
<wallyworld> appname field or structured data
<wallyworld> and now the tests will be even better because a faulty assumption will be fixed :-)
<veebers> wallyworld: can you have structured data in the syslog log file (or am I misunderstanding again?)
<wallyworld> yes, it is text, and can be parsed into simple fields and structured data. i think the structured data is written out inside [] but i'd need to check that syntax
<wallyworld> it's all in the rfc standard
<wallyworld> we write the flat fields and structured data fields
<wallyworld> aooname is a standard scalar field
<wallyworld> appname
<wallyworld> we put "juju-"modeluuid in there i think
<wallyworld> and then things like controller uuid, model uuid as separate fields in structured data
<wallyworld> which is then flattened out into a line of text in the syslog file
<wallyworld> i'd have to read the rfc to get the exact syntax etc
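For reference, RFC 5424 lays a message out as `<PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA [MSG]`, with structured data written as bracketed elements of the form `[id key="value"]`. A minimal sketch of pulling the APP-NAME apart from the structured data follows. The sample line, the `example@32473` element ID and its parameter names are hypothetical, not juju's actual output, and a sink such as rsyslog may rewrite the line with its own template before it reaches a file.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRFC5424 returns the APP-NAME field and everything from
// STRUCTURED-DATA onward (structured data plus optional message).
// It only splits on spaces and does no validation of the fields.
func splitRFC5424(line string) (appName, rest string, ok bool) {
	// Fields: 0 <PRI>VERSION, 1 TIMESTAMP, 2 HOSTNAME, 3 APP-NAME,
	// 4 PROCID, 5 MSGID, 6 STRUCTURED-DATA [MSG].
	fields := strings.SplitN(line, " ", 7)
	if len(fields) < 7 {
		return "", "", false
	}
	return fields[3], fields[6], true
}

func main() {
	// Hypothetical message; UUIDs and element names are placeholders.
	line := `<14>1 2016-09-02T01:02:03Z machine-0 juju-model-uuid-here - - ` +
		`[example@32473 controller-uuid="controller-uuid-here"] agent started`
	app, rest, ok := splitRFC5424(line)
	if !ok {
		fmt.Println("not an RFC 5424 line")
		return
	}
	fmt.Println("appname:", app)
	fmt.Println("structured data and message:", rest)
}
```

A check that greps for a UUID therefore needs to know which of the two places it is matching, since they can carry different UUIDs.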
<veebers> wallyworld: ah cool, thanks for clarifying
<wallyworld> np
<thumper> wallyworld: http://reviews.vapour.ws/r/5584/
<thumper> wallyworld: just running test suite now
<thumper> wallyworld: much rationalisation on aliases
<wallyworld> ok
<wallyworld> thumper: you sure about add-units? we can add more than one at a time
<wallyworld> juju add-units -n 10
<thumper> wallyworld: that's what the doc says
<thumper> there is no alias for add-unit
<wallyworld> hmmpf
<thumper> the thing is, if you add plural for one, you should do for all
<thumper> then it just looks messy
<thumper> like you haven't decided
<thumper> be opinionated
<thumper> I went on changes based on the 2.0 command table
<thumper> just wondering whether we should have status or show-status be the alias
<thumper> doc says "status" is the alias
<thumper> but it also says "list-foo" should be the primary
<thumper> so it appears to be slightly out of date
<wallyworld> thumper: in that case, we are inconsistent. the PR changes to juju list-sshkeys. but we still have juju machines as the list machines example
<thumper> wallyworld: refresh
<thumper> I fixed that
<thumper> I tried to be consistent on all the aliases
<thumper> wallyworld: http://paste.ubuntu.com/23122921/
<thumper> that is all the aliases now
<wallyworld> thumper: lgtm with a trivial
<thumper> wallyworld: taking the dog for a quick walk while it is light
<thumper> bbl
<wallyworld> sure, ttyl
<thumper> pushed trivial fix and lined up for merging
<veebers> wallyworld: fyi that was the fix for the CI job, seems I had conflated the controller uuid and the controller model uuid. All sorted now
<wallyworld> veebers: yay, thank you. not just you who confused it. juju did also for about 18 months or more :-)
<veebers> heh :-) I've set the log-forward job to be non-voting for now until this fix can land in the test
<frobware> dimitern: you about?
<dimitern> frobware: yeah
<dimitern> frobware: aren't you out today?
<frobware> dimitern: for the "unconfigured vlan" bug I think I'm just going to deal with bridging unconfigured interfaces and leave the MAAS 2.1 issues for another commit
<frobware> dimitern: I am - need to leave in a bit
<dimitern> frobware: ok, sgtm
<frobware> dimitern: the script needs to change so that we can look at forward interfaces that we may (or may not) need to bridge. That's a departure from what we do today.
<dimitern> frobware: btw re the other bug with hostnames - I still think we should have that as an optional "native mode", even if later we do nss by default everywhere
<frobware> dimitern: I think the nss module is orthogonal to that bug.
<frobware> dimitern: if we probe/scan for open ssh ports on an undefined order that should still help without nss
<dimitern> frobware: that way we can work consistently across the board, but still satisfy specific deployments that need to use maas hostnames and have properly configured dns settings
<frobware> dimitern: from a juju client perspective (CLI, UI) we would always be dealing with ip addrs
<dimitern> frobware: so I'm proposing to add a maas-specific config flag that enables "put hostname on top of addresses list", otherwise it doesn't and just returns IPs
<dimitern> off by default
<frobware> dimitern: propose via email and let's discuss with jam, et al
<dimitern> and ivoks
<dimitern> ok
<frobware> dimitern: I think we should try and land "unconfigured vlans" soon - if we can.
<dimitern> frobware: agreed - I'm on it
<mup> Bug #1441319 changed: intermittent: failed to retrieve the template to clone: template container juju-trusty-lxc-template did not stop <canonical-bootstack> <cisco> <cpec> <deployer> <landscape> <lxc> <oil> <systemd> <upstart> <juju-core:Invalid> <https://launchpad.net/bugs/1441319>
<mup> Bug #1475509 changed: upgrade-charm --force behavior causes races <race-condition> <upgrade-charm> <juju:Triaged> <https://launchpad.net/bugs/1475509>
<mup> Bug #1531589 changed: debug-log does not work with local provider on xenial + 1.25.0 <debug-log> <local-provider> <xenial> <juju-core:Invalid> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1531589>
<mup> Bug #1475509 opened: upgrade-charm --force behavior causes races <race-condition> <upgrade-charm> <juju:Triaged> <https://launchpad.net/bugs/1475509>
<mup> Bug #1531589 opened: debug-log does not work with local provider on xenial + 1.25.0 <debug-log> <local-provider> <xenial> <juju-core:Invalid> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1531589>
<mup> Bug # changed: 1394223, 1475509, 1514456, 1531589, 1554088, 1613804, 1615106, 1615112, 1615118
<mup> Bug # opened: 1394223, 1514456, 1554088, 1613804, 1615106, 1615112, 1615118
<mup> Bug # changed: 1301367, 1394223, 1444576, 1446885, 1448308, 1514456, 1554088, 1613804, 1615106, 1615112, 1615118
<wallyworld> axw: here's a PR to handle the list controllers things http://reviews.vapour.ws/r/5585/  no rush to land, can do it next week
<axw> wallyworld: ok, will probably review on monday. still butting heads with auth
<wallyworld> axw: yeah, no worries, figured as much :-) just letting you know
<babbageclunk> wallyworld: At the moment I'm not exposing much of the tags functionality in gomaasapi - just enough to do what we need in the provider. Is that ok?
<babbageclunk> wallyworld: We can always flesh out the other parts later as needed.
<babbageclunk> wallyworld: (is my thinking)
<voidspace> babbageclunk: in
<voidspace> oops
<voidspace> babbageclunk: in general, only exposing what we need is the way we've gone
<voidspace> babbageclunk: so that sounds good
<voidspace> babbageclunk: gomaasapi itself is open source - so if anyone needs more they can add it themselves
<babbageclunk> voidspace: cool, thanks - might get you to look at the interface I'm thinking about for tags before I start implementing it if that's ok?
<voidspace> babbageclunk: sure
<voidspace> down to only seven test packages that fail or won't build
<voidspace> progress
<wallyworld> babbageclunk: sounds ok - just enough to satisfy our use case, but if there's a tasteful way to do it so that we can accommodate future requirements, that would be good. the main thing for juju is to be able to get and set arbitrary tags on machines and storage/volumes
<babbageclunk> wallyworld: ok - sounds good.
<babbageclunk> wallyworld: Just rereading your email - do we expose tagging to users somewhere? Or is it just something we use to find instances by custom tags internally.
<babbageclunk> ?
<wallyworld> babbageclunk: the primary use case is to allow the Environ ControllerInstances() method to correctly identify which machines are controllers - that eliminates the provider-state file requirement. But we also support the resource-tags attribute in config to allow users to specify their own tags as well. These are typically seen via cloud specific tools, like the aws console for example
<wallyworld> but juju could be asked to query and present them
<babbageclunk> wallyworld: Ah, ok - reading about resource-tags now.
<wallyworld> babbageclunk: what happens when juju provisions a resource is that it applies standard tag values (eg controller uuid, model uuid etc) and then adds to those any user specified values
<wallyworld> babbageclunk: so for example, model admins may set different tags which are then used for accounting or other purposes
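The tagging flow wallyworld describes (standard values plus user-specified ones) can be sketched as a simple map merge. The function name, the tag keys "juju-controller-uuid" and "juju-model-uuid", and the choice to let the standard tags win on a key clash are assumptions for illustration; the environs tags code is the authoritative version.

```go
package main

import "fmt"

// resourceTags merges user-specified tags with the standard identifying
// tags. Standard tags are written last, so a user value cannot override
// them (an assumed policy for this sketch).
func resourceTags(controllerUUID, modelUUID string, user map[string]string) map[string]string {
	tags := make(map[string]string, len(user)+2)
	for k, v := range user {
		tags[k] = v
	}
	tags["juju-controller-uuid"] = controllerUUID // assumed key name
	tags["juju-model-uuid"] = modelUUID           // assumed key name
	return tags
}

func main() {
	// Hypothetical user tags, such as a model admin might set for accounting.
	user := map[string]string{"cost-centre": "research"}
	fmt.Println(resourceTags("controller-uuid-here", "model-uuid-here", user))
}
```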
<babbageclunk> wallyworld: Ok, and then the point of those custom tags is that users can find them in the provider's UI (the maas one in this case)?
<wallyworld> babbageclunk: see usages of the environs.tags.ResourceTags() method
<wallyworld> yes
<wallyworld> maybe juju can display those in stats, i can't recall now exactly what the yaml contains
<wallyworld> *status
<babbageclunk> wallyworld: Ok.
<babbageclunk> wallyworld: When you talk about storage volumes, that would correspond to block devices in maas-land? They don't seem to be able to be tagged in the same way.
<wallyworld> babbageclunk: yeah, i'm not sure what maas supports there. in aws for example, we can tag ebs block volumes. but if maas doesn't support a similar concept that's fine. the main thing is to tag machines. but we may also query tags when killing the controller so that we clean up all resources. but that's only really relevant to clouds like aws where the storage is separate and is attached dynamically to a machine
<wallyworld> s/is/may be
<wallyworld> babbageclunk: you'll find StartInstance() in particular for provider/ec2 or provider/openstack will have code to collect all the required tags and then apply them to the newly started instance
<wallyworld> maas provider would do something similar
<wallyworld> and then ControllerInstances() for maas would be changed to not use provider-state, and instead query machines where controller-uuid tag is the relevant one
<wallyworld> and we'd move the Environ.Storage implementation off the maas 2.0 provider and retain it for use in 1.9, and the maas 2.0 work could be nice and clean
<wallyworld> because Environ.Storage is i think only used to access the provider-state file
<dimitern> babbageclunk, wallyworld, voidspace: hey guys, can you have a look at http://reviews.vapour.ws/r/5586/ ? it should be straightforward
<wallyworld> which we won't need on maas 2.0 now
<babbageclunk> wallyworld: ok, makes sense - thanks!
<babbageclunk> dimitern: Looking
<wallyworld> dimitern: is SysClassNetPath const necessary as an arg? will it ever be anything different?
<dimitern> babbageclunk: ta!
<dimitern> wallyworld: well, I wrote it initially to take just a name, but then I found it awkward to test without introducing global vars unnecessarily
<dimitern> wallyworld: it will only be called in a couple of places anyway
<wallyworld> dimitern: yeah, understood. i also wonder if at least the error from say reading the file should be checked. fair enough if it's a notexists. but what happens if you get an eof? maybe worth at least logging something before returning UnknownInterface. i'm a little wary of swallowing all errors
<dimitern> wallyworld: yeah, ok - I'll add Debug logging on error
<wallyworld> dimitern: you'll appreciate the log entry when something goes wrong and you're asked to diagnose and fix :-)
<dimitern> wallyworld: sure - I'm usually the one to add insane level of trace logging all over the place :)
<wallyworld> warranted in this case IMHO :-)
<dimitern> +1
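A minimal sketch of the error handling agreed on above: a missing file quietly yields the "unknown" result, while any other read error is logged before falling back. The function name, the `type` file it reads, and the `unknownInterface` value are assumptions for illustration, not the code under review. Taking the sysfs root as an argument, as dimitern describes, is what lets the example run against a temporary directory.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"os"
	"path/filepath"
	"strings"
)

// unknownInterface is a hypothetical sentinel standing in for the
// UnknownInterface value mentioned in the review.
const unknownInterface = "unknown"

// interfaceType reads <sysClassNetPath>/<name>/type and returns its
// trimmed contents. The file name is an assumption for this sketch.
func interfaceType(sysClassNetPath, name string) string {
	data, err := os.ReadFile(filepath.Join(sysClassNetPath, name, "type"))
	if errors.Is(err, os.ErrNotExist) {
		// Expected case: nothing to report.
		return unknownInterface
	}
	if err != nil {
		// Unexpected case: leave a trace instead of swallowing the error.
		log.Printf("DEBUG cannot read type of interface %q: %v", name, err)
		return unknownInterface
	}
	return strings.TrimSpace(string(data))
}

// must panics on error; it keeps the demo in main short.
func must(err error) {
	if err != nil {
		panic(err)
	}
}

func main() {
	// Build a throwaway directory that mimics the sysfs layout.
	dir, err := os.MkdirTemp("", "sysclassnet")
	must(err)
	defer os.RemoveAll(dir)

	must(os.MkdirAll(filepath.Join(dir, "eth0"), 0o755))
	must(os.WriteFile(filepath.Join(dir, "eth0", "type"), []byte("1\n"), 0o644))

	fmt.Println(interfaceType(dir, "eth0"))
	fmt.Println(interfaceType(dir, "missing"))
}
```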
<babbageclunk> wallyworld, voidspace - here's how I'm planning to add tags to the gomaasapi interface: https://github.com/juju/gomaasapi/compare/master...babbageclunk:tags?expand=1
<wallyworld> babbageclunk: i don't know the gomaasapi very well, but i generally prefer not to define interfaces for domain entities like Tags and then add verbs to them, like Delete() or Machines() or UpdateNodes(). those verbs I tend to add to a service type of interface
<wallyworld> the above really belong on Controller interface IMO
<wallyworld> DeleteTag(machine, tag)
<wallyworld> etc
<babbageclunk> wallyworld: I was trying to keep consistency with the other types - like Device.Delete or File.Delete.
<babbageclunk> wallyworld: (And to a lesser extent with the underlying API.)
<babbageclunk> dimitern: LGTM
<babbageclunk> dimitern: (Other than the logging that wallyworld mentioned)
<dimitern> babbageclunk: thanks! I've added logging and will push an update shortly
<babbageclunk> dimitern: cool
<wallyworld> babbageclunk: Device and Machine interfaces for example in gomaasapi - these tend to only expose methods relevant to the domain entity which they represent
<wallyworld> so given a Machine instance, you can query aspects of that machine. but you can't do things *with* that machine
<wallyworld> it seems the Controller interface performs that role
<wallyworld> eg AllocateMachine, ReleaseMachine etc
<wallyworld> there could be tag related methods on a machine I guess
<babbageclunk> wallyworld: No, for example Machine has Start and CreateDevice.
<wallyworld> right, but Device (which is analogous to tag) doesn't have methods that can be used to manipulate machines. a Device is tied to a machine, hence I can see why CreateDevice is there
<wallyworld> so a Machine interface could gain a Tagger interface, providing AddTag() DeleteTag() GetTag() on the Machine interface
<babbageclunk> wallyworld: Yeah, that makes sense.
<wallyworld> babbageclunk: and Controller already has a Machine() method - the MachineArgs would gain tags to filter on
<wallyworld> Machines()
<babbageclunk> wallyworld: That would make gomaasapi quite differently structured from the underlying API - do we think that's a problem?
<wallyworld> babbageclunk: and hopefully/maybe the StartInstance args used to start an instance in maas would have a tags attribute so you can start an instance and apply tags in one call without having to follow up with AddTag()
<wallyworld> doesn't the underlying api have a query machines method?
<wallyworld> where you pass filter attributes to query on for example?
<wallyworld> which is implemented by Controller.Machines()
<babbageclunk> wallyworld: It does, but it doesn't support tags.
<wallyworld> oh, damn
<babbageclunk> wallyworld: The tag querying is done from a separate place in the api - you couldn't search by tag AND mac addresses, for example.
<wallyworld> babbageclunk: so, we should construct the api to be the best it can be for juju and so it is consistent with our design goals; we can adapt that to the underlying maas api and then work with them to improve their api if it is not "tasteful"
<babbageclunk> (Well, without doing that filtering in the gomaasapi ourselves.)
<wallyworld> if the underlying api is not perfect, we should not build on that and expose it
<wallyworld> we should adapt to it so that if/when it improves, our api stays nice and clean
<wallyworld> that's IMHO
<babbageclunk> wallyworld: yeah, I see what you mean.
<wallyworld> there's nothing wrong with writing an abstraction layer that hides some dirty laundry :-)
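One way to picture the abstraction layer being argued for: if the underlying API can filter machines by MAC address and, separately, list which machines carry a tag, the wrapper can intersect the two results itself and expose a single combined query. Everything named here (`rawAPI`, `MachinesByMAC`, `SystemIDsWithTag`, `machinesByMACAndTag`) is hypothetical and is not taken from MAAS or gomaasapi.

```go
package main

import "fmt"

// machine is a pared-down stand-in for a MAAS machine record.
type machine struct {
	SystemID string
	MAC      string
}

// rawAPI models two separate low-level queries. Both method names are
// hypothetical and exist only for this sketch.
type rawAPI interface {
	MachinesByMAC(macs []string) ([]machine, error)
	SystemIDsWithTag(tag string) ([]string, error)
}

// machinesByMACAndTag offers the combined query by intersecting the two
// result sets on the client side.
func machinesByMACAndTag(api rawAPI, macs []string, tag string) ([]machine, error) {
	candidates, err := api.MachinesByMAC(macs)
	if err != nil {
		return nil, err
	}
	ids, err := api.SystemIDsWithTag(tag)
	if err != nil {
		return nil, err
	}
	tagged := make(map[string]bool, len(ids))
	for _, id := range ids {
		tagged[id] = true
	}
	var out []machine
	for _, m := range candidates {
		if tagged[m.SystemID] {
			out = append(out, m)
		}
	}
	return out, nil
}

// fakeAPI is an in-memory rawAPI used only to exercise the sketch.
type fakeAPI struct {
	machines []machine
	tags     map[string][]string // tag name -> system IDs
}

func (f fakeAPI) MachinesByMAC(macs []string) ([]machine, error) {
	want := make(map[string]bool, len(macs))
	for _, mac := range macs {
		want[mac] = true
	}
	var out []machine
	for _, m := range f.machines {
		if want[m.MAC] {
			out = append(out, m)
		}
	}
	return out, nil
}

func (f fakeAPI) SystemIDsWithTag(tag string) ([]string, error) {
	return f.tags[tag], nil
}

func main() {
	api := fakeAPI{
		machines: []machine{{"node-a", "aa:aa"}, {"node-b", "bb:bb"}, {"node-c", "cc:cc"}},
		tags:     map[string][]string{"controller": {"node-b", "node-c"}},
	}
	got, err := machinesByMACAndTag(api, []string{"aa:aa", "bb:bb"}, "controller")
	if err != nil {
		panic(err)
	}
	for _, m := range got {
		fmt.Println(m.SystemID)
	}
}
```

If the server later grows a combined filter, only the body of `machinesByMACAndTag` would need to change; callers keep the same signature.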
<babbageclunk> ok - I'm going to think about that while I go for a run. Thanks!
<wallyworld> babbageclunk: so yeah, think of a Tag like a Device; a machine instance has a Devices() method; it would also have a Tags() method. and just as there's CreateDevice(), also there will be a CreateTag() etc
<wallyworld> have fun
<wallyworld> or AddTag()
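The shape wallyworld proposes, a Machine that exposes its tags much as it exposes its devices, might look something like the following. The `Tagger` method set and the in-memory `memMachine` are assumptions for illustration only; the real gomaasapi interfaces may be organised differently.

```go
package main

import (
	"fmt"
	"sort"
)

// Tagger groups the tag operations suggested in the discussion.
// The method set is hypothetical, not gomaasapi's actual API.
type Tagger interface {
	Tags() []string
	AddTag(name string) error
	DeleteTag(name string) error
}

// Machine embeds Tagger, so tag operations sit on the entity they
// belong to, in the same way CreateDevice does.
type Machine interface {
	SystemID() string
	Tagger
}

// memMachine is an in-memory Machine used only to exercise the sketch.
type memMachine struct {
	id   string
	tags map[string]bool
}

func newMemMachine(id string) *memMachine {
	return &memMachine{id: id, tags: map[string]bool{}}
}

func (m *memMachine) SystemID() string { return m.id }

// Tags returns the machine's tags in sorted order.
func (m *memMachine) Tags() []string {
	out := make([]string, 0, len(m.tags))
	for t := range m.tags {
		out = append(out, t)
	}
	sort.Strings(out)
	return out
}

func (m *memMachine) AddTag(name string) error {
	if name == "" {
		return fmt.Errorf("empty tag name")
	}
	m.tags[name] = true
	return nil
}

func (m *memMachine) DeleteTag(name string) error {
	if !m.tags[name] {
		return fmt.Errorf("machine %s has no tag %q", m.id, name)
	}
	delete(m.tags, name)
	return nil
}

func main() {
	var m Machine = newMemMachine("node-a")
	for _, t := range []string{"juju-controller", "rack-1"} {
		if err := m.AddTag(t); err != nil {
			panic(err)
		}
	}
	if err := m.DeleteTag("rack-1"); err != nil {
		panic(err)
	}
	fmt.Println(m.SystemID(), m.Tags())
}
```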
<rock_> Hi. I want to deploy openstack using [MAAS+openstack-base-bundle]. Can anyone please give me the detailed hardware requirements and reference links for doing that?
<babbageclunk> wallyworld: still around?
<babbageclunk> wallyworld: Actually, I'm seeing a yoda-like apparition of you telling me how to handle the thing that occurred to me on the run, so that's fine.
<voidspace> down to two failing packages
<dimitern> voidspace: \o/
<dimitern> out of how many?
<natefinch> anyone familiar with configuring squid?
<natefinch> dimitern, voidspace, frankban, dooferlad: ^^
<natefinch> all I want is the simplest "just forward everything from everyone" ... which google seems to think I am unique in wanting
<dimitern> natefinch: I've been there :) let me check my maas proxy setup
<natefinch> dimitern: exactly what I need it for :)
<mup> Bug #1619682 opened: [azure] can't bootstrap in Central US : no OS images found for location <juju-core:New> <https://launchpad.net/bugs/1619682>
<dimitern> natefinch: here's my /var/lib/maas/maas-proxy.conf: http://paste.ubuntu.com/23124122/
<dimitern> natefinch: it's not exactly allow everything from everyone, but every segment defined in localnet
<natefinch> the problem with that is that then I have to edit it when things change... and if it breaks, I have to figure out what I messed up...  whereas if there were something like "mode transparent" that just did the right thing, I wouldn't have to wonder if juju is broken or my proxy is broken.
<natefinch> dimitern: I'll try to make yours work for me.  Thanks
<dimitern> natefinch: I hope it helps; fwiw I found transparent mode to be quite difficult to get working right
<natefinch> dimitern: not so transparent, eh?
<dimitern> natefinch: I found that using explicit http_proxy like http://10.20.20.2:8000 works best
<dimitern> with transparent mode you'll have tons of issues around TLS
<natefinch> dimitern: ahh yeah, that makes sense
<dimitern> natefinch: last time I tried to get transparent mode working with e.g. https://cloud-images.ubuntu.com/ took me a few hours and at the end it was still not usable
<natefinch> dimitern: well that makes testing this bug a lot harder
<dimitern> natefinch: you could do other things, like iptables + DNAT to the non-transparent proxy host:port
<natefinch> dimitern: that sounds much easier :/
<dimitern> natefinch: http://wiki.squid-cache.org/ConfigExamples/Intercept/LinuxDnat
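For comparison with the transparent setup, here is roughly what the explicit-proxy approach dimitern recommends looks like from a Go client. The helper names are made up for this sketch; the address is the example one quoted above. With an explicit proxy an HTTPS request is tunnelled through the proxy with CONNECT and TLS stays end to end, which avoids the certificate trouble that interception brings.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newProxiedClient returns an HTTP client that sends every request
// through the given forward proxy. The helper name is hypothetical.
func newProxiedClient(proxyAddr string) (*http.Client, error) {
	u, err := url.Parse(proxyAddr)
	if err != nil {
		return nil, err
	}
	if u.Scheme == "" || u.Host == "" {
		return nil, fmt.Errorf("proxy address %q needs a scheme and host", proxyAddr)
	}
	tr := &http.Transport{Proxy: http.ProxyURL(u)}
	return &http.Client{Transport: tr}, nil
}

// proxyFor reports which proxy the client would use for target, without
// making any network request. It returns "" if no proxy applies.
func proxyFor(client *http.Client, target string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	tr, ok := client.Transport.(*http.Transport)
	if !ok || tr.Proxy == nil {
		return "", nil
	}
	u, err := tr.Proxy(req)
	if err != nil || u == nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	client, err := newProxiedClient("http://10.20.20.2:8000")
	if err != nil {
		panic(err)
	}
	proxy, err := proxyFor(client, "https://cloud-images.ubuntu.com/")
	if err != nil {
		panic(err)
	}
	fmt.Println("requests go via", proxy)
}
```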
<rock_> dimitern: Hi. I pushed my charm to the charm store. I am trying to set the bug URL, and I am facing an issue. Pasted issue info: http://paste.openstack.org/show/566078/
<dimitern> rock_: hey, I'm not sure why you're getting this error - where's your charm's source ?
<macgreagoir> natefinch: I don't have a maas2 setup. Maybe first step is me installing something and seeing what it needs for you?
<rock_> dimitern: my charm source https://jujucharms.com/u/siva9296/kaminario-openstack
<dimitern> rock_: I mean is it on launchpad or github, etc. ?
<natefinch> macgreagoir: we could just screen share on my machine, would seem more efficient
<macgreagoir> natefinch: Aye, true enough. I mostly want to see what it has now as a proxy service, compared to 1.x.
<rock_> dimitern: my source is not on Github. And please tell me what it means [is it on launchpad]
<dimitern> rock_: e.g. the cinder charm (https://jujucharms.com/cinder/) has bugs-url=https://bugs.launchpad.net/charms/+source/cinder/+filebug but that's because its source is at https://code.launchpad.net/charms/+source/cinder
<rock_> dimitern: So How can I put my source on launchpad?
<dimitern> rock_: it doesn't have to be on launchpad - use whichever code hosting / bug tracker you're familiar with
<dimitern> rock_: have a look if this might help: https://jujucharms.com/docs/stable/charm-review-process
<dimitern> rock_: maybe someone from the charmers team might be able to help you better (or in #juju), like marcoceppi, jcastro, lazyPower
<rock_> dimitern: we don't have any bug tracker as of now. for pushing charm we used https://jujucharms.com/docs/2.0/authors-charm-store. And Used https://github.com/openstack-charmers/release-tools/blob/master/push-and-publish#L138 to set bug URL.
<rock_> dimitern: OK. Thank you. I will ask.
<dimitern> sure, np
<dimitern> katco: ping
<katco> dimitern: pong
<katco> dimitern: err i mean "route to katco not found because of some obscure networking setup affecting juju"
<dimitern> katco: so it was decided to go less verbose and no way to bump it back up even if you want all the details?
<dimitern> hehe
<katco> dimitern: yeah, that was what was suggested in the bug rick opened
<katco> dimitern: i wanted to log the messages to debug instead, but was getting late and couldn't find a hook to a logger with debug on it. i can certainly figure that out though
<dimitern> katco: ok
<mgz> natefinch: double checking, this issue is the dbus thing you fixed in 1.25.6? http://paste.ubuntu.com/23124253
<mup> Bug #1619682 changed: [azure] can't bootstrap in Central US : no OS images found for location <cloud-images:New> <juju-core:Invalid> <https://launchpad.net/bugs/1619682>
<dimitern> katco: you've got a review - though not terribly useful I'm afraid
<katco> dimitern: not a problem, tyvm for looking at it in the 1st place
<mgz> macgreagoir: so, I have an answer on how we've done charmstore testing
<macgreagoir> mgz: I'm interested! :-)
<mgz> we had a test that monkeyed with iptables after bootstrap on the state server
<mgz> so when deploy happened after it would go off and talk to the staging store
<mgz> but... we appear to have lost the test
<mgz> macgreagoir: aha, assess_cs_staging.py in juju-ci-tools
<mgz> macgreagoir: also, there seems to be a JUJU_CHARMSTORE envvar that should do something
<mgz> but I'm not sure where or what
<macgreagoir> mgz: Let me see if I can find that. Rings a bell from a code comment.
<katco> mgz: oh god please don't check something like that into juju's src tree =|
<natefinch> mgz: yes, exactly
<babbageclunk> voidspace: still around? What timezone are you in at the moment?
<redir> morning
<voidspace> babbageclunk: UK
<voidspace> babbageclunk: so yes
<voidspace> babbageclunk: although grabbing coffee - back in 5!
<babbageclunk> voidspace: okcool
<voidspace> babbageclunk: finethen
<babbageclunk> voidspace: pingmewhenyou'reback
<voidspace> babbageclunk: back
<babbageclunk> voidspace: Cool - can we hang out?
<voidspace> babbageclunk: ok
<voidspace> babbageclunk: standup one
<babbageclunk> voidspace: core, right?
<voidspace> babbageclunk: yep
<natefinch> gah, my eyes.... evidently debug log uses color now
<perrito666> lol
<perrito666> I am starting to think that someone actually adds weak points on purpose on mongo tooling
<rock_> dimitern: As we discussed before, this issue was not solved: a series mismatch between my charm and cinder while adding a relation from the Juju GUI. Pasted info: http://paste.openstack.org/show/566123/
<katco> hatch: ^^^
 * hatch pops in
<katco> hatch: looks like maybe the gui is choosing the wrong series to deploy for a charm
<hatch> yes I believe we've got an open PR atm for this fix
<hatch> one moment while I confirm
<hatch> ^ rock_
<voidspace> babbageclunk: ping
<babbageclunk> voidspace: pong
<hatch> rock_: this is indeed a GUI bug and I believe it will be resolved with the next GUI release next week cc/ katco
<katco> hatch: thanks for checking in!
<katco> rock_: i believe if this is critical for you to resolve, it's possible to deploy a beta version of the gui. is that right, hatch?
<voidspace> babbageclunk: PM sent
<hatch> katco: unfortunately the PR hasn't yet made it through the review & qa process
<hatch> I'm working my way through them now
<rock_> katco: Yes. It was critical. Because I have to test my charm in both ways[CLI and GUI]. I have critical deadline for my charm to deliver to the client.
<hatch> rock_: when is the deadline?
<hatch> assuming the PR is good, I will be able to get you a beta build of the GUI to test in a few hours
<rock_> hatch/rock: I have to test it and then deliver by next week[8/9/2016].
<hatch> ohh ok that's perfect then, we should have a new release of the GUI out by then as well
<rock_> hatch: Thank you. now please provide beta build of the GUI. i will test on it.
<hatch> regardless though I'll ping you when I have confirmed this bug fixed (or not) with a beta build of the GUI
<rock_> hatch: OK. Thank you for your support.
<hatch> anytime, glad to help
<rock_> Hi. I have a question. Suppose I have provisioned machines according to the openstack-base bundle requirements. Which node should I use as the main node to deploy the bundle? Once I deploy the bundle, how does it distribute service charms to those hardware machines?
<rock_> please provide me the network architecture of a [MAAS+openstack base bundle] based openstack setup.
<katco> rock_: that's really a better question for #juju. you could try pinging elmo there, but i think it's past his EOD
<rock_> katco; OK. Thank you.
 * redir lunches
<lazyPower> omg we have color in tabular status output on key fields?
<lazyPower> \o/
<natefinch> there was a lot of work lately put into colorizing things
<hatch> rock_: sorry, the fix for the issue you mentioned is not yet complete and will need some additional time to fix before it can be merged in
<hatch> rock_: a possible workaround is to go into the subordinates "Configuration" pane in the inspector and change it to match the series you've related to
<hatch> and then click deploy
<hatch> This may or may not work.
<hatch> However, if your charm works as expected when deployed via the CLI you can be confident that it will also work via the GUI (assuming the GUI is free of bugs ;) )
<natefinch> I can't tell if juju is screwed up or maas is screwed up, or if I've screwed up.
<hatch> Can anyone confirm for me that the Juju controller does not support local charms for the AddCharm api call?
<katco> hatch: pretty sure this is true. you need to call AddLocalCharm
<mbruzek> hi katco: I have a problem
<mbruzek> katco: I was hoping you could help me
<mbruzek> ERROR creating API connection: invalid entity name or password (unauthorized access)
<katco> mbruzek: ok, what's up?
<cmars> hatch, that is correct. apiserver/client.AddCharm calls AddCharmWithAuthorization which rejects any charm URL that isn't a cs: URL
<mbruzek> I did `juju log out`  and now I can not log back in
<katco> mbruzek: i think that's an open bug...
<mbruzek> katco: and I can not kill my controller
<mbruzek> katco: is there a work around?
<katco> mbruzek: https://bugs.launchpad.net/juju/+bug/1617190
<mup> Bug #1617190: Logout required after failed login <juju:Incomplete by natefinch> <https://launchpad.net/bugs/1617190>
<katco> mbruzek: i guess you need to log out again?
<mbruzek> katco: I can not get that to work, it never logs me out of the controller.
<mbruzek> $ juju logout
<mbruzek> Logged out. You are still logged into 1 controller.
<mbruzek> mbruzek@warhorse:~$ juju status
<mbruzek> ERROR no credentials provided
<katco> mbruzek: i'm trying to figure out where that error occurs
<katco> mbruzek: what version are you on?
<mbruzek> 2.0-beta16-xenial-amd64
<katco> mbruzek: can you run that command with --debug?
<mbruzek> katco: http://pastebin.ubuntu.com/23125739/
<katco> perrito666: are you still on?
<perrito666> katco: sort of
<katco> perrito666: can you help mbruzek? i don't know where this error comes from
 * perrito666 reads backlog
<mbruzek> perrito666: I seem to be in a weird state
<perrito666> mbruzek: can you ssh into the machine and check the log there? it will tell you which api endpoint rejected you. My best guess is that we have some wrong permission around some basic call
<mbruzek> ssh using ubuntu to the controller did not work.
<mbruzek> ssh ubuntu@54.67.11.148 -> Permission denied (publickey).
<katco> bradm: juju switch controller && juju ssh 0
<katco> bradm: oops sorry for the ping
<katco> mbruzek: ^^
<mbruzek> katco: juju switch controller -> ERROR getting account details for qualifying model name: account details for controller containers not found
<mbruzek> because I am not logged in?
<katco> i have no idea... i have done 0 with the new ACL stuff perrito666 helped put in
<mbruzek> katco: I was able to list-controllers
<perrito666> mbruzek: mm, you should try ssh -i ~/.local/share/juju/ssh/id_rsa ubuntu@yourip
<mbruzek> perrito666: I was able to juju switch controller amazon  and destroy the controller
<perrito666> or that :)
<mbruzek> perrito666: do you still want me to ssh ?
<perrito666> mbruzek: well you have nothing to ssh into now :p
<perrito666> no worries, if this is a bug it will happen again soon
<perrito666> ok, I am well past EOD/W and going to theatre, cuall
<mbruzek> bye and thanks perrito666
<katco> perrito666: ta
<mbruzek> katco: Thanks for walking me through it. I don't have a console for this aws account and there was no other way for me to kill that instance
<katco> mbruzek: no worries, sorry for the trouble/bug. speaking of, can you file a bug for that?
<mbruzek> Do you want me to update the bug you found?
<katco> mbruzek: no i'm not sure it's related
<katco> mbruzek: if it is we can always dupe it later
<hatch> katco: ohhh AddLocalCharm thanks
<katco> hatch: np
<katco> hatch: i am landing some tests which actually do a decent job of documenting the API calls necessary for deploying charms
<hatch> oh great - we just have a bug which throws an error for local charms but deploys fine - I narrowed it down to not needing the AddCharm call
<hatch> but we should likely instead swap AddLocalCharm
<hatch> although it doesn't appear necessary
<hatch> katco: so we're doing a POST to upload the actual charm then just a deploy and it works...is this just a fluke and we should instead call https://github.com/juju/juju/blob/master/api/client.go#L269
<katco> hatch: wait, i'm sorry. AddLocalCharm is a client-side call which just reads the charm from disk... uploading and then deploying is just fine
<hatch> ohh ok perfect thanks for confirming katco
<hatch> have a good weekend
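To summarise the AddCharm thread above as code: per cmars, the AddCharm API call only accepts cs: URLs, and per katco, a local charm is uploaded (the POST hatch mentions) and then deployed, with no AddCharm call needed. A client could branch on the URL schema along these lines. The function names and step labels are hypothetical and only illustrate the decision; they are not juju API calls.

```go
package main

import (
	"fmt"
	"strings"
)

// charmSchema returns the part of a charm URL before the first colon,
// for example "cs" or "local".
func charmSchema(charmURL string) (string, error) {
	i := strings.Index(charmURL, ":")
	if i <= 0 {
		return "", fmt.Errorf("charm URL %q has no schema", charmURL)
	}
	return charmURL[:i], nil
}

// deployPlan lists the steps a client would take for the given charm
// URL. The step labels are descriptive only, not real API names.
func deployPlan(charmURL string) ([]string, error) {
	schema, err := charmSchema(charmURL)
	if err != nil {
		return nil, err
	}
	switch schema {
	case "cs":
		// Charm store charms: the controller fetches the charm itself.
		return []string{"AddCharm", "Deploy"}, nil
	case "local":
		// Local charms: AddCharm would be rejected, so upload instead.
		return []string{"upload charm archive (POST)", "Deploy"}, nil
	}
	return nil, fmt.Errorf("unsupported charm schema %q", schema)
}

func main() {
	for _, u := range []string{"cs:trusty/elasticsearch-19", "local:trusty/mycharm-0"} {
		plan, err := deployPlan(u)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %s\n", u, strings.Join(plan, ", "))
	}
}
```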
#juju-dev 2016-09-03
<redir> Have a good weekend juju-dev
<rock__> Hi. I have a question. Suppose I have provisioned machines according to the openstack-base bundle requirements. Which node should I use as the main node to deploy the bundle? Once I deploy the bundle, how does it distribute service charms to those hardware machines?
<rock__> please provide me the network architecture of a [MAAS+openstack base bundle] based openstack setup.
<mup> Bug #1588041 opened: juju bootstrap with vsphere provider hangs with xenial <ci> <jujuqa> <landscape> <oil> <oil-2.0> <vsphere> <cloud-images:New> <juju:Triaged> <juju-core:Triaged> <https://launchpad.net/bugs/1588041>
#juju-dev 2016-09-04
<mup> Bug #1598752 changed: Juju-deploy fails: Juju cannot bootstrap because no tools are available for your environment <juju-core:Expired> <https://launchpad.net/bugs/1598752>
<mup> Bug #1588041 changed: juju bootstrap with vsphere provider hangs with xenial <ci> <jujuqa> <landscape> <oil> <oil-2.0> <vsphere> <cloud-images:New> <juju:Triaged> <https://launchpad.net/bugs/1588041>
<menn0> thumper: I got this up on Friday night and it still hasn't had a review: http://reviews.vapour.ws/r/5587/
<thumper> ok
<thumper> menn0: story with this one? http://reviews.vapour.ws/r/5413/
<menn0> thumper: regarding ^^^, the tarball always failed to build whenever I tried to merge it. will take another look.
<menn0> thumper: quick HO to discuss your review feedback?
<thumper> yeah
<thumper> gimmie 2m
<menn0> kk
<thumper> 1:1
#juju-dev 2017-08-28
<axw> babbageclunk: https://github.com/juju/juju/pull/7798, running QA now
<babbageclunk> axw: approved!
<axw> babbageclunk: thanks
 * babbageclunk goes for a run
<babbageclunk> axw: do you know any way of getting tool metadata from the API other than the upgrader facade? I think I'm going to add another method to the migrationtarget facade otherwise.
<babbageclunk> axw: I can't use the upgrader facade because it's locked down to only work for an agent.
<axw> babbageclunk: api.Client.FindTools ?
<axw> babbageclunk: or do you want the metadata given an agent tag?
<axw> babbageclunk: what are you trying to do?
<babbageclunk> axw: huh! no, just given version/series/arch.
<babbageclunk> axw: I need to update the exported model with the tools we've gotten from the target
<babbageclunk> axw: I was about to calculate SHA and size myself when wallyworld pointed out that we have it in the target.
<babbageclunk> axw: looking at api.Client.FindTools now
<babbageclunk> axw: yeah, that looks like what I need, thanks!
<axw> babbageclunk: cool beans
<thumper> externalreality: 1:1 ?
<wallyworld> hml: i'll be there in 5, running a touch late
<hml> wallyworld: ack
<anastasiamac> anyone keen on a quick review? https://github.com/juju/juju/pull/7804
#juju-dev 2017-08-29
<wallyworld> hml: standup?
<hml> wallyworld: I'm somewhere… did the link change?
<wallyworld> not that i know of
<wallyworld> it's different on friday
<mup> Bug #1709791 changed: juju deployed lxd falls back to lxdbr0 bridge when binding is specified <juju:New> <https://launchpad.net/bugs/1709791>
<mup> Bug #1709791 opened: juju deployed lxd falls back to lxdbr0 bridge when binding is specified <juju:New> <https://launchpad.net/bugs/1709791>
<mup> Bug #1709791 changed: juju deployed lxd falls back to lxdbr0 bridge when binding is specified <juju:New> <https://launchpad.net/bugs/1709791>
<anastasiamac> axw: jam: last PR in store saga - https://github.com/juju/juju/pull/7806
<anastasiamac> it's fairly straightforward, most pain is in tests changes - they had outdated, inconsistent data that was not separated api vs store
<anastasiamac> actually not the very last PR, the next (and last one, i promise!) will get model count from store rather than controller details... working on it now...
<axw> anastasiamac: looking
<mup> Bug #1709520 changed: juju-db spams syslog, fills disk <juju-core:Won't Fix> <https://launchpad.net/bugs/1709520>
<icey> any chance on an update about https://bugs.launchpad.net/juju/+bug/1684325 ?
<mup> Bug #1684325: customize-failure-domain has no effect when ceph-mon is deployed in a container <availability-zones> <bitesize> <containers> <cpec> <OpenStack ceph-mon charm:New> <juju:Triaged> <https://launchpad.net/bugs/1684325>
<rogpeppe> axw, wallyworld, anastasiamac: i'm after a couple of juju-related reviews if you have a moment. https://github.com/go-goose/goose/pull/55 and https://github.com/juju/juju/pull/7793
<anastasiamac> rogpeppe: o/ m at dinner and family, but I'll try to cycle back in 3-4hrs.. sorry for delays!
<rogpeppe> anastasiamac: np
<axw> rogpeppe: done
<rogpeppe> axw: tyvm
<bdx> any good tricks out there for getting stuck applications out of a model?
<bdx> I've always felt a '--force' arg would just be the perfect accommodation for removing applications ... I'm sure there is a reason why it doesn't exist
<rick_h> bdx: you can --force the machine number but not the application as it can't promise to be clean (hulk smashed, etc)
<bdx> got it
 * rick_h has done a bunch of juju remove-machine --force X today
<bdx> rick_h: lol those meds getting to you?
<bdx> making you destroy all the machines?
<rick_h> bdx: working on a charm :)
<bdx> :)
<rick_h> bdx: so whenever I bork it and it hangs in hook error land just blow it away and start another lxd container wheeee
<bdx> yeah .. that's a great dev hammer you have there
<bdx> my workflow for removing stuck applications is to
<bdx> 1) ssh into the machine and `sudo systemctl stop jujud-unit-mystuckapplication-<#>`, 2) `sudo rm -rf /var/lib/juju/agents/unit-myapplication-<#>` 3) `sudo systemctl restart jujud-machine-<#>`
<bdx> oooh step 3 should be
<bdx> rm /etc/systemd/system/jujud-unit-mystuckapplication.service
<bdx> then step 4 is to restart the machine agent
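Putting bdx's correction together, the manual workaround is four steps in this order: stop the unit agent, remove its agent directory, remove its systemd unit file, then restart the machine agent. The sketch below only builds and prints the command lines as a dry run. It assumes one consistent `unit-<application>-<number>` name throughout (the messages above vary slightly between steps), it assumes the unit file is named after the same service, and it adds `sudo` to the `rm` of the unit file. None of this is an official juju procedure.

```go
package main

import "fmt"

// cleanupCommands returns, in order, the shell commands for the manual
// workaround described in the chat. The naming scheme is an assumption:
// the service and directory are both derived from unit-<app>-<unit>.
func cleanupCommands(app string, unit, machine int) []string {
	unitName := fmt.Sprintf("unit-%s-%d", app, unit)
	return []string{
		// 1) stop the stuck unit agent
		fmt.Sprintf("sudo systemctl stop jujud-%s", unitName),
		// 2) remove the unit agent's directory
		fmt.Sprintf("sudo rm -rf /var/lib/juju/agents/%s", unitName),
		// 3) remove the unit's systemd service file (sudo added here)
		fmt.Sprintf("sudo rm /etc/systemd/system/jujud-%s.service", unitName),
		// 4) restart the machine agent
		fmt.Sprintf("sudo systemctl restart jujud-machine-%d", machine),
	}
}

func main() {
	// Dry run: print the plan instead of executing anything.
	for i, cmd := range cleanupCommands("mystuckapplication", 0, 3) {
		fmt.Printf("%d) %s\n", i+1, cmd)
	}
}
```

Printing rather than executing keeps the destructive `rm -rf` step under manual review.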
<rick_h> ugh
<bdx> yeah ... works though
<rick_h> fair enough
<bdx> rick_h: http://paste.ubuntu.com/25427233/
<bdx> ;(
<rick_h> bdx: double ;(
<rick_h> bdx: oh, did you try resolved --no-retry?
<rick_h> bdx: I find that gets me into trouble sometimes and using --no-retry will let a resolved actually work
<bdx> oooh I didn't know about --no-retry
<bdx> that's huge
<wallyworld> thumper: you coming back?
<rick_h> wallyworld: you passing thumper along when you're done?
<thumper> coming
<wallyworld> he's coming now
#juju-dev 2017-08-30
<wallyworld> axw: hml: babbageclunk: burton-aus: i'll be right back
<babbageclunk> wallyworld: aw man, I didn't see that you'd requested the review on GH - I guess that's close enough to voluntelling. I'll have a look this afternoon once I get charms uploading.
<wallyworld> babbageclunk: no worries, you are busy, totally understand
<thumper> FARK
<anastasiamac> thumper: ?
<thumper> just trying to look at how mongo counts records in a collection
<thumper> it is non-trivial
<babbageclunk> thumper: Can we use stats instead of counting? Might be accurate enough and much faster.
<thumper> what stats?
<thumper> I have a theory on the count
<babbageclunk> db.<collection>.stats()
<thumper> but need access to a big controller to test
<babbageclunk> https://docs.mongodb.com/manual/reference/method/db.collection.stats/
<thumper> babbageclunk: you would think that db.coll.find().count() would use that
<thumper> but perhaps it doesn't
<thumper> can we get access to the stats through mgo?
<babbageclunk> not sure
<babbageclunk> mmm, toasted sandwich for lunch
<wallyworld> axw: if you have a chance, here's a PR for review. sadly it doesn't reflect 2 file renames so counts the full contents towards the +/- count :-( https://github.com/juju/juju/pull/7808
<thumper> var stats bson.D
<thumper> session.DB("mydb").Run(bson.D{{"collStats", "mycollection"}}, &stats)
<thumper> fmt.Println("Stats:", stats)
<thumper> so... maybe
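Building on thumper's snippet, here is a rough sketch of pulling the document count out of a collStats result. To stay runnable without a MongoDB server it depends only on a small `runner` interface whose `Run(cmd, result)` method has the same shape as the `Run` call in the snippet above, and it uses plain maps where real code would more likely use `bson.D` or `bson.M`. The `collectionCount` and `fakeDB` names are invented for the example. The count reported by collStats comes from collection metadata, so it may only be "accurate enough", as babbageclunk suggests.

```go
package main

import (
	"errors"
	"fmt"
)

// runner is the single method this sketch needs from a database handle.
type runner interface {
	Run(cmd interface{}, result interface{}) error
}

// collectionCount runs collStats for the collection and returns the
// "count" field of the reply.
func collectionCount(db runner, collection string) (int64, error) {
	var stats map[string]interface{}
	cmd := map[string]interface{}{"collStats": collection}
	if err := db.Run(cmd, &stats); err != nil {
		return 0, err
	}
	// The numeric type of "count" depends on how the reply was decoded.
	switch n := stats["count"].(type) {
	case int:
		return int64(n), nil
	case int32:
		return int64(n), nil
	case int64:
		return n, nil
	case float64:
		return int64(n), nil
	}
	return 0, errors.New("collStats reply has no numeric count field")
}

// fakeDB is an in-memory runner used only to exercise the sketch.
type fakeDB struct {
	counts map[string]int
}

func (f fakeDB) Run(cmd interface{}, result interface{}) error {
	c, ok := cmd.(map[string]interface{})
	if !ok {
		return errors.New("unsupported command type")
	}
	name, _ := c["collStats"].(string)
	n, ok := f.counts[name]
	if !ok {
		return fmt.Errorf("collection %q not found", name)
	}
	out, ok := result.(*map[string]interface{})
	if !ok {
		return errors.New("unsupported result type")
	}
	*out = map[string]interface{}{"ns": "mydb." + name, "count": n}
	return nil
}

func main() {
	db := fakeDB{counts: map[string]int{"mycollection": 12345}}
	n, err := collectionCount(db, "mycollection")
	if err != nil {
		panic(err)
	}
	fmt.Println("documents:", n)
}
```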
<axw> wallyworld: ok, looking
<wallyworld> ta
<axw> wallyworld: is relationnetworks.go otherwise unchanged?
<wallyworld> axw: sadly not - there's some tweaks to accommodate the egress stuff - same core structs but with a direction attribute - "ingress" or "egress"
<wallyworld> pita the renames aren't better handled
<wallyworld> reviewboard did it better
<axw> wallyworld: perhaps do the rename and change in separate commits next time. anyway, not too big, reviewing...
<wallyworld> that would have worked, yeah
<axw> wallyworld: looks good
<wallyworld> tyvm
<thumper> anyone know what needs to be done to be able to run the mgo tests?
<babbageclunk> thumper: I've never got them running locally, but I think I had them running in travis against my github fork
<babbageclunk> thumper: that was pretty straightforward to set up
<babbageclunk> wallyworld: approved
<wallyworld> babbageclunk: yay, ty
<babbageclunk> axw: just realised I need something from 2.2.3 in the upgrade juju2 tree (client side of the machine sanity check) - do you think I should just spot patch that in from 2.2.3, or update the tree to get it?
<babbageclunk> axw: While typing that up I think I decided, I'll just update that client.
<babbageclunk> axw: thanks for the help!
<axw> babbageclunk: hah :)  FWIW I was going to say the same
<babbageclunk> ha, lucky!
<babbageclunk> axw: Can you take a look at https://github.com/juju/1.25-upgrade/pull/20 please?
<axw> babbageclunk: ok. just finishing up a PR, then will take a look
<babbageclunk> axw: thanks - no rush, I'm close to end of day so I'll look at comments tomorrow.
<axw> babbageclunk: okey dokey
<axw> jam: when you're free, can you please take a look at https://github.com/juju/juju/pull/7803?
<wallyworld> jam: wpk: you free now for a HO? or soon?
<jam> wallyworld: I was just about to make coffee and then I'm available, i'm not sure about wkp
<jam> wpk
<wallyworld> ok, just ping whenever
<jam> wallyworld: is it worth just you and I meeting? I feel like we're mostly in sync
<wallyworld> jam: we are, we can do it again tomorrow with everyone
<jam> I'm off tomorrow, but maybe I'll be around the house (local 'vacation')
<jam> Its a national holiday, vs a planned vacation
<wallyworld> or it can wait. we just need to ensure we're fully synced up and plan the next most important bit to do for network-get
<wpk> wallyworld: jam: sorry, I am now if you're available
<wallyworld> i can be in 10
<wpk> k
<wallyworld> wpk: jam: i am free now if you guys are
<jam> wallyworld: we're on the hangout that wpk linked on the other channel
<hml> is there a way to get juju upgrade-juju --build-agent to work on a model other than controller?
<thumper> hml: no
<thumper> hml: you need to upgrade the controller first
<thumper> then specify the controller version for the hosted model
<thumper> the hosted model then gets the same version as the controller
<hml> thumper: how do i do the 2nd bit "specify the controller version for the hosted model"?
<thumper> hml: ok, lets say you have upgraded the controller
<thumper> and do a status on the controller
<thumper> and it says version 2.2.3.2 say
<thumper> you then go
<thumper> juju upgrade-juju other-model --agent-version=2.2.3.2
<thumper> probably a -m in there too
<hml> thumper: okay, i'll give it a shot
<bdx> I'm experiencing some odd behavior using JAAS + AWS + network spaces
<thumper> bdx: what sort of odd behaviour?
<bdx> at a high level, things work (charms deploy successfully) if deployed in the default vpc, with no juju spaces defined
<bdx> on the other hand, if I create a model in the non default vpc, add some spaces, and deploy the exact same bundle except to the defined space I get odd failures throughout
<bdx> grabbing some material to support this omp
<bdx> it might be specific to the charm, in this case, elasticsearch
<bdx> ok
<bdx> this is making me feel a bit crazy
<bdx> but its entirely reproducible
<bdx> so
<bdx> I don't know what to say .... I think it might be a charm specific thing though and not juju or network related ..... I just don't get it
<bdx> using upstream elasticsearch in the charmstore (cs:trusty/elasticsearch-19)
<bdx> what happens is elasticsearch will be deployed and running on the machine deployed to the default vpc, and when elasticsearch deployed with a spaces constraint, the process just doesn't exist
<bdx> all the dirs are there, the binary is there
<bdx> this model https://pastebin.com/NaLcqBFy is configured to a default vpc, no spaces added, and no constraints used
<bdx> from that elasticsearch node, http://paste.ubuntu.com/25434884/
<bdx>  <- `ps aux | grep elasticsearch`
<bdx> we can see that elasticsearch is running
<bdx> next, a model created with a non-default vpc and spaces added, and all the machines deployed to the spaces
<bdx> http://paste.ubuntu.com/25434893/
<bdx> http://paste.ubuntu.com/25434894/
<bdx> here's the full yaml status http://paste.ubuntu.com/25434903/
<bdx> and now, same as before, I'll ssh into elasticsearch box and ps aux
<bdx> http://paste.ubuntu.com/25434906/
<bdx> boom
<bdx> java process for elasticsearch exists
<bdx> odd because the machine has elasticsearch and all the deps (java) installed
<bdx> lol
<bdx> I've been ramming my head on this for days now .... just thinking my fork of elasticsearch is borked
<bdx> lol
<bdx> ^^ important correction s/java process for elasticsearch exists/java process for elasticsearch ceases to exist/
<bdx> its quite baffling, really
<bdx> thumper: ^that kind
<babbageclunk> bdx: can you see logs for elasticsearch? It sounds like it might be trying to start and failing when you have space constraints.
<bdx> babbageclunk: I was just about to report back with that
<bdx> babbageclunk: the java process appears just for a second then disappears
<bdx> the logs are clean though
<bdx> because it looks as if it actually does start
<bdx> so there is no error being logged or anything
<bdx> I have this juju log for the failing instance http://paste.ubuntu.com/
<bdx> wow ... the logs are entirely different
<bdx> and this log for the working instance 2017-08-30 22:19:45 INFO juju.cmd supercommand.go:63 running jujud [2.2.2 gc go1.8]
<bdx> 2017-08-30 22:19:45 DEBUG juju.cmd supercommand.go:64   args: []string{"/var/lib/juju/tools/unit-elasticsearch-0/jujud", "unit", "--data-dir", "/var/lib/juju", "--unit-name", "elasticsearch/0", "--debug"}
<bdx> 2017-08-30 22:19:45 DEBUG juju.agent agent.go:543 read agent config, format "2.0"
<bdx> 2017-08-30 22:19:45 INFO juju.jujud unit.go:151 unit agent unit-elasticsearch-0 start (2.2.2 [gc])
<bdx> 2017-08-30 22:19:45 DEBUG juju.worker runner.go:319 start "api"
<bdx> oh no
<bdx> my bad - jeeze
<bdx> and this log for the working instance http://paste.ubuntu.com/25435003/
<thumper> weird
<bdx> possibly one of my models has different logging config
<bdx> geh
<bdx> yep
<bdx> logging-config was turned up on the model with spaces/non-working one
#juju-dev 2017-08-31
<axw_> babbageclunk: I just need to get some more context on that code before chatting
<babbageclunk> axw_: ok - the start agent code or the ebs stuff?
<axw_> babbageclunk: ebs
<babbageclunk> cool
<axw_> babbageclunk: for the ebs thing, I think you should just compare the pool's storage provider type to the model provider type. ec2 can have ebs, maas can have maas, etc.
<axw_> babbageclunk: rather than looking at the name of the pool, or treating ec2/ebs specially
<babbageclunk> axw_: ok, that sounds nicer.
<babbageclunk> axw_: so filter out any that don't match the model provider type
<axw_> babbageclunk: yes, but they don't all match exactly. you'll need a hard-coded table
<babbageclunk> axw_: Right.
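The filtering approach axw_ describes above could be sketched roughly as follows. This is a minimal illustration, not the code in the 1.25-upgrade tool: the names `COMPATIBLE_STORAGE` and `filter_pools` and the dict-shaped pools are hypothetical, and the table only holds the two pairings mentioned in the conversation (ec2 with ebs, maas with maas).

```python
# Hypothetical hard-coded table: model provider type -> storage provider
# types that make sense for it. Only the pairings named in the chat are
# listed; a real table would need more entries.
COMPATIBLE_STORAGE = {
    "ec2": {"ebs"},
    "maas": {"maas"},
}


def filter_pools(pools, model_provider):
    """Keep only pools whose storage provider type fits the model provider.

    The decision is made on the pool's provider type, never on its name.
    """
    allowed = COMPATIBLE_STORAGE.get(model_provider, set())
    return [pool for pool in pools if pool["provider"] in allowed]


pools = [
    {"name": "fast", "provider": "ebs"},
    {"name": "ebs-ssd", "provider": "ebs"},
    {"name": "ebs", "provider": "maas"},  # misleading name on purpose
]

for pool in filter_pools(pools, "ec2"):
    print(pool["name"])
```

Matching on the provider type rather than the pool name means a pool that happens to be called "ebs" but belongs to another provider is still filtered out.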
<axw_> babbageclunk: is there something else you wanted to chat about? we can grab another hangout
<babbageclunk> axw_: Just an opinion really - I was just looking in start-agents and it uses sudo service <agent> start, which should work on either, right? It's just a matter of working out which machines it needs to be run on post-agent-upgrade.
<babbageclunk> axw_: I'm tempted to use the saved machines that rollback-agents uses.
<babbageclunk> axw_: Does that sound reasonable or hacky to you?
<axw_> work on either what? started or stopped? xenial or trusty?
<axw_> babbageclunk: ^
<babbageclunk> axw_: oh sorry - 1.25 or 2, I guess. But also xenial or trusty. (I didn't realise service still worked on xenial.)
<axw_> babbageclunk: ah ok. yes, should be fine on both - names haven't changed AFAIK
<axw_> babbageclunk: using the saved list sounds fine
<babbageclunk> axw_: cool.
<babbageclunk> axw_: thanks - I'll make that change to the export code and land it.
<axw_> babbageclunk: sounds good, thanks
<bdx> thumper, babbageclunk: just to follow up the issue from earlier where the elasticsearch would start on a default vpc, and bounce when deployed to non-default vpc with a space constraint
<bdx> http://paste.ubuntu.com/25435596/
<bdx> ^ I find that in the elasticsearch.log
<bdx> strange I only get that when deployed to a space
<rick_h> bdx: what's the diff in the ifconfig of the two differently deployed setups?
<bdx> not-working: http://paste.ubuntu.com/25435632/
<bdx> working: http://paste.ubuntu.com/25435639/
<bdx> nothing... other than the hardware address and ip/ip6
<mup> Bug #1701481 changed: juju 1.25 leaks memory (1.25.11+) <canonical-bootstack> <sts> <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1701481>
<bdx> I made a bug with the maintainers of the elasticsearch charm ... here's the link https://bugs.launchpad.net/elasticsearch-charm/+bug/1714126
<mup> Bug #1714126: Elasticsearch won't start when deployed to a space <Elasticsearch Charm:New> <https://launchpad.net/bugs/1714126>
<mup> Bug #1701481 opened: juju 1.25 leaks memory (1.25.11+) <canonical-bootstack> <sts> <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1701481>
<mup> Bug #1701481 changed: juju 1.25 leaks memory (1.25.11+) <canonical-bootstack> <sts> <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1701481>
<thumper> wallyworld: when did we move from mongo 2.4 to mongo 3.2?
<wallyworld> hmmm, a while back. i think we did it for juju 2.0 actually
<wallyworld> as we were part way through upgrade steps for 2.4->2.6->3.2
<wallyworld> and we aborted since for 2.0 we didn't need that
<mup> Bug #1714130 opened: juju reports units in non-existent OpenStack availability zones <juju:Incomplete> <juju-core:New> <https://launchpad.net/bugs/1714130>
 * babbageclunk goes for a run
<babbageclunk> axw_: I've updated https://github.com/juju/1.25-upgrade/pull/20, can you take another look please?
<babbageclunk> axw_: took a while to get it tested - I had a successful import but when I tried to roll it back and remove the model from the target controller it killed the machines, even though I'd removed them from the model with --keep-instance beforehand. Not sure what's going on there.
<axw_> babbageclunk: LGTM
<babbageclunk> axw_: ta
<axw_> babbageclunk: I suspect "keep-instance" isn't removing the tags that we use to clean up
<babbageclunk> axw_: yeah, I think that's it too.
<babbageclunk> axw_: the upside is running through the upgrade steps fresh is pointing out places where I was expecting the toolsDir to exist and it doesn't yet.
<axw_> babbageclunk: for example?
<babbageclunk> axw_: well, at the moment it's only created in upgrade-agents, but import (which needs to run first since it uses the agent.conf to connect to state) was trying to download tools into it.
<axw_> okey dokey
<babbageclunk> axw_: ha, also running things without --debug is showing that I really need some helpful output when an import succeeds.
<axw_> juju/juju has finally hit 1000 stars
 * axw_ listens out for fireworks and jingling money sacks
<veebers> axw: hah ^_^
<thumper> heh
<wallyworld> axw: if you have a chance before eod, here's a PR, or else whenever https://github.com/juju/juju/pull/7813
<axw> wallyworld: ok
<wallyworld> ta, it's a followup from the one you looked at yesterday
<wallyworld> axw: thank you
<babbageclunk> yay, thanks for reviews axw!
<rick_h> kwmonroe: halp!
<kwmonroe> sup rick_h?
<rick_h> kwmonroe: I've got an INTERFACE_PATH defined and an updated grafana-source interface code there edited.
<rick_h> kwmonroe: now how do I tell charm build to use that locally checked out version of the interface to make sure the charm works with the changes?
<kwmonroe> i can tell where this is going.  you've mistaken me for cory_fu ;)
<rick_h> kwmonroe: yea, but you're all helpful and use this stuff :P
<rick_h> cory_fu: also halp then please kthx :) ^
<rick_h> kwmonroe: has abandoned me in my hour of need
<kwmonroe> well ok then... so, if you have INTERFACE_PATH and do not use "charm build --no-local-layers", then the build should just work to pick up your local interface changes
<kwmonroe> so rick_h, by default, charm build will prefer your local bits.  it's only when you specify "--no-local-layers" that it ignores included layers/interfaces.
<rick_h> kwmonroe: what files does that go into to check if the code is there?
<cory_fu> rick_h: As long as INTERFACE_PATH is set in your env to a directory containing a copy of that layer, named as the layer name, it should use it.
<cory_fu> rick_h: It looks like there's a bug where charm-build will tell you that it's using a local layer for regular layers, but not for interface layers
<cory_fu> rick_h: I opened an issue for it: https://github.com/juju/charm-tools/issues/340
<cory_fu> rick_h: So, if INTERFACE_PATH=/tmp/interfaces; and you want a local copy of interface:grafana-source, then it should be checked out in /tmp/interfaces/grafana-source
<rick_h> cory_fu: ok, where's the interface stuff built in the built charm? I want to grep/check the source there to see if it's in fact there
<cory_fu> hooks/relations/<interface-name>
<cory_fu> rick_h: ^
<rick_h> cory_fu: <3 ty I had a typo in my path and the output doesn't say it's doing anything differently but fixing my path and checking the source I see it updated now
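Since charm build gave no hint that the path was mistyped, a small check like the one below could catch it earlier. It is only a sketch under assumptions: it treats INTERFACE_PATH as a single directory (as in cory_fu's example), the helper names are hypothetical, and it does not call charm-tools at all. It just computes the two locations mentioned in the conversation.

```python
import os
from pathlib import Path


def local_interface_dir(name, env=None):
    """Return INTERFACE_PATH/<name> if that directory exists, else None.

    Assumes INTERFACE_PATH holds one directory, as in the chat example.
    """
    env = os.environ if env is None else env
    root = env.get("INTERFACE_PATH")
    if not root:
        return None
    candidate = Path(root) / name
    return candidate if candidate.is_dir() else None


def built_interface_dir(built_charm, name):
    """Where the interface code lands in a built charm, per the chat."""
    return Path(built_charm) / "hooks" / "relations" / name


if __name__ == "__main__":
    found = local_interface_dir("grafana-source")
    if found is None:
        print("no local copy found; check INTERFACE_PATH for typos")
    else:
        print("local interface copy:", found)
```

After building, comparing the files under `built_interface_dir(...)` with the local checkout confirms whether the edited interface was actually picked up.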
<beisner> hi all - what do i need to do to deploy series: artful with juju at this time?  i'm getting https://bugs.launchpad.net/juju/+bug/1714305 with agent-stream proposed.
<mup> Bug #1714305: artful: no matching agent binaries available <openstack> <uosci> <juju:New> <https://launchpad.net/bugs/1714305>
<rick_h> beisner: since there's no agents released it needs to be uploaded from the client which means it needs to be deployed from an artful machine?
<rick_h> beisner: balloons correct me here, I feel like there's got to be something I'm forgetting. ^
<beisner> ooo weird, that'll be a problem for us rick_h
<rick_h> beisner: yea, I mean agents are compiled binaries for the series, no release has had them with artful yet I don't think...so it's got to be supplied from the outside. I must be missing something if we've done any testing/etc yet.
<bdx> hey, did I see something coming over the wire about replacing creds?
<bdx> for a model
<bdx> or was that just for a controller
<stokachu> anyone seen failed to create relation egress networks: forbidden transaction: references unknown collection "relationNetworks" before?
<rick_h> bdx: it's in a controller working out how to hook things up
<stokachu> rick_h: ^ is that a mongo error?
<rick_h> stokachu: seems like it. I'd assume relationNetworks is in the CMR feature flag
<rick_h> stokachu: so maybe something landed that missed the feature flagging bit?
<stokachu> yea possible, this user is running from edge
<stokachu> rick_h: lol it is
<stokachu> https://travis-ci.org/conjure-up/conjure-up/builds/270515922#L866
<stokachu> ugh
<rick_h> stokachu: yea, so you could try turning on the feature flag and seeing if it works out and file a bug and let wallyworld know what's up
<rick_h>  export JUJU_DEV_FEATURE_FLAGS=cross-model
<rick_h> but that has to be done pre-bootstrap
<stokachu> ok
<stokachu> wallyworld: https://bugs.launchpad.net/juju/+bug/1714318
<mup> Bug #1714318: juju fails to bootstrap with "references unknown collection "relationNetworks"" <conjure> <juju:New> <https://launchpad.net/bugs/1714318>
<beisner> rick_h balloons - with artful in freeze, release just around the corner, we need to be able to deploy/test it.   other than deploying from a client on an artful machine, what options do we have?
<wallyworld> stokachu: rick_h: damn, will fix straight away this morning. it's almost time we turn off the cross model feature flag
<stokachu> wallyworld: \o/
<wallyworld> i just want to get the data model and api stable; we're almost there
<stokachu> wallyworld: can't wait for it :D
<stokachu> wallyworld: i also have this one https://bugs.launchpad.net/juju/+bug/1711019
<mup> Bug #1711019: vsphere: cache VMDKs in datastore to avoid repeated downloads and firewalled hosts <conjure> <juju:Triaged> <https://launchpad.net/bugs/1711019>
<stokachu> wallyworld: more people are running into where they dont like having the images downloaded to their local machine
<stokachu> and also the firewall issue
<stokachu> axw: ^
<wallyworld> stokachu: ok, we'll look into it
<stokachu> wallyworld: ty!
<veebers> Happy Birthday thomi!
<veebers> hah wrong channel sorry
<bdx> wtf http://paste.ubuntu.com/25441085/
<bdx> just added a model, deployed some stuff to it
<bdx> then I go to add another
<bdx> and WHOAMMI
<bdx> ERROR cannot obtain authorization to collect usage metrics: failed to authorize reseller plan
<cmars> bdx, we're in the middle of a db migration, it's the planned outage i emailed about on the ml
<bdx> oooOOooo
<bdx> :)
<cmars> bdx, sorry for the trouble. i'll let you know when it's back
<bdx> no worries
<bdx> cmars: thx
<hml> wallyworld: have a minute or two to review my pr?  config-changed hook one
<wallyworld> hml: yep, will do
<wallyworld> hml: initial comment - if we refresh and get life, we don't also need u.st.life(u.tag)
<wallyworld> hml: also, i think we can drop the facade version check as the controller is always updated first before the unit agents
<hml> wallyworld: okay
<wallyworld> hml: and probably we can drop the embedding of the common.Life stuff since it's no longer used
<wallyworld> on the client side anyway
<wallyworld> hml: just a few small things, otherwise looks good
<hml> wallyworld: thx - i'll get working on those
<wallyworld> hml: also, would be worth just another manual bootstrap and test just to be sure
<wallyworld> if not already done
<hml> wallyworld: I've done a couple - a couple more won't hurt
<cmars> bdx, hey, we should be all done with the maintenance, are you able to create models now?
<bdx> cmars: yea, good to go!
<cmars> bdx, \o/ ok, glad to hear
<bdx> cmars: what was the upgrade from -> to ?
<cmars> bdx, it was a VM migration to a different openstack VM host
<bdx> ahh nice
<bdx> I remember those days
<bdx> thank god for openstack
<hml> wallyworld: looks like common.Life is still used in the uniter, but i removed the version check
<wallyworld> we should look at that usage
<wallyworld> see if it should be replaced by refresh
<hml> wallyworld: application in uniter uses it, not just unit, but perhaps unit should use refresh instead
#juju-dev 2017-09-01
<hml> wallyworld: i'm not going to change unit to use refresh instead of common.Life - it's not strictly needed and we can't get rid of common.Life easily anyways.
<wallyworld> where is the life() call made?
<hml> wallyworld: st.Unit(tag) api/uniter/uniter.go ln 147 ish
<wallyworld> we should be calling refresh there
<wallyworld> why construct an incomplete unit
<wallyworld> when we can call refresh() instead of life() and get all the data
<hml> wallyworld: okay -
<wallyworld> but there are places in Application that call life() that we would retain
<wallyworld> as currently the application life mirrors the unit life
<wallyworld> babbageclunk: if you had time for a smallish mechanical review that would be great https://github.com/juju/juju/pull/7815
<babbageclunk> wallyworld: sure
<wallyworld> axw: were you free now for 1:1? or soon?
<axw> wallyworld: sure, can do now
<wallyworld> ok
<babbageclunk> axw: thanks for the review - can you look at https://github.com/juju/1.25-upgrade/pull/24 as well? :)
<axw> babbageclunk: yup, looking now
<babbageclunk> wallyworld: sorry, I totally got sidetracked on your review - back to it now
<babbageclunk> wallyworld: approved!
<babbageclunk> wallyworld: I'm popping out for a while, but I'll be back online tonight.
<wallyworld> babbageclunk: awesome, thank you for review
<wallyworld> axw: swap you reviews? https://github.com/juju/juju/pull/7818
<axw> wallyworld: sorry, I took jeremy to his friend's place. sure, but I think you're getting a raw deal ;p
<wallyworld> axw: no worries, thanks for review. the azure one looked sane; i can't give expert opinion on the implementation but testing seemed thorough
#juju-dev 2017-09-03
<babbageclunk> wallyworld: 1:1?
<wallyworld> babbageclunk: oh, sorry
<wallyworld> now?
<babbageclunk> no worries - yup yup
