#juju-dev 2012-10-01
<Guest7958> fwereade: mornin'
<fwereade> hi rog(?) ;p
<rogpeppe> fwereade: not sure why it does that. surely no one else is using "rogpeppe"...
<fwereade> rogpeppe, I *think* I saw your name change spontaneously at some point yesterday
<fwereade> rogpeppe, I have a couple of CLs up, and I would appreciate your thoughts
<rogpeppe> fwereade: i'm having a look right now
<fwereade> rogpeppe, sweet, tyvm
<rogpeppe> fwereade: i saw this:
<rogpeppe> [17:25:19] [Notice] -NickServ- rog__!~rog@ool-45754a58.dyn.optonline.net has released your nickname.
<rogpeppe> [17:25:20] *** You are now known as Guest7958.
<fwereade> rogpeppe, omg, it's that nickserv dude again! he's almost as bad as peer!
<fwereade> (sorry)
<TheMue> morning
<fwereade> heya TheMue
<TheMue> hey, hey, fwereade
<TheMue> ;)
<rogpeppe> TheMue: hiya
<TheMue> hi roger
<rogpeppe> fwereade: ping
<fwereade> rogpeppe, pong
<rogpeppe> fwereade: i'm looking at the filter code, and had a little question
<fwereade> rogpeppe, go for it
<rogpeppe> fwereade: why only receive on the want* channels when there's an event to send?
<rogpeppe> fwereade: would it be a problem to always wait on those channels?
<fwereade> rogpeppe, because receiving on the want channels will unblock the sending channel and we don't want them to send nils
<fwereade> rogpeppe, and we can't guarantee they won't until we've received a change
<rogpeppe> fwereade: but aren't the sending channels themselves nil until an event arrives?
<fwereade> rogpeppe, happily, all  the watchers guarantee us initial events, so it can't be blocked for a long time
<fwereade> rogpeppe, the action of a receive on a want channel is to un-nil the sending channel
<rogpeppe> fwereade: is that necessary?
<rogpeppe> fwereade: erm, yes, i see
<fwereade> rogpeppe, in the context of that code, the condition of wanting to send on an out chan is defined by it being non-nil
<fwereade> rogpeppe, it means that want* calls can all block for a short time until the filter is ready to deal with them, but the filter should be guaranteed to either become ready or fail, assuming the watchers themselves work as advertised
<fwereade> rogpeppe, and the failing will itself unblock the want calls
<fwereade> rogpeppe, and hence, all is sunshine and puppies, I think
<rogpeppe> fwereade: i'm definitely starting to see puppies in the sun
 * fwereade feels cheered
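For readers following along, the nil-channel pattern under discussion can be sketched roughly like this (illustrative names, not the actual filter code): a send or receive on a nil channel blocks forever, so a select case is switched on and off by nil-ing the channel variable, and the want case stays disabled until a first event has arrived.

```go
package main

import "fmt"

// filter relays the latest event to clients that ask for it.
// The names here are illustrative, not the real juju filter code.
func filter(events <-chan string, want <-chan struct{}, out chan<- string) {
	var (
		latest string
		wantCh <-chan struct{} // nil until an event has arrived
		outCh  chan<- string   // nil until a client has asked
	)
	for {
		select {
		case ev, ok := <-events:
			if !ok {
				return
			}
			latest = ev
			wantCh = want // now safe to service want requests
		case <-wantCh:
			outCh = out // enable the send case
		case outCh <- latest:
			outCh = nil // sent; wait to be asked again
		}
	}
}

func main() {
	events := make(chan string, 1)
	want := make(chan struct{})
	out := make(chan string)
	go filter(events, want, out)
	events <- "config-changed"
	want <- struct{}{}
	fmt.Println(<-out)
}
```

A `want` receive before the first event would otherwise un-nil `outCh` while `latest` is still empty, which is exactly the nil-send problem fwereade describes; keeping `wantCh` nil until an event arrives avoids it.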
<rogpeppe> fwereade: one other thing: i'm wondering if it's possible to make the dying status propagate without another step through the watcher
<fwereade> rogpeppe, I thought about that and decided that having One True Path for the information was the clearest thing I could do
<fwereade> rogpeppe, it would not be a problem to check-and-set the local life var, and close the channel immediately though
<rogpeppe> fwereade: yeah, i was wondering if there was an elegant way to do that
<fwereade> rogpeppe, it just feels more complex and prone to embarrassing screwup ;)
<rogpeppe> fwereade: i do know what you mean
<rogpeppe> fwereade: i guess it's just my premature-optimisation head saying "5 more seconds! 5 more seconds!"
<fwereade> rogpeppe, well, life could actually be a filter field and we could have a separate hey-the-unit-is-dying method
<fwereade> rogpeppe, it's a reasonable concern, any number of crazy operations could be kicked off in 5 seconds
<rogpeppe> fwereade: i was mainly thinking about responsiveness, but that too, yeah
<fwereade> rogpeppe, if the life-field-plus-method SGTY I can do that, it does feel more ruthlessly efficient in a pleasant sort of way
<fwereade> rogpeppe, but then... I dunno
<fwereade> rogpeppe, in terms of sheer simplicity of implementation it's a mere Sync away...
<rogpeppe> fwereade: true. i seem to remember it being considered bad practice to call Sync in a watcher though.
<rogpeppe> fwereade: but i definitely see the attraction
<fwereade> rogpeppe, leave some sort of comment for discussion, I imagine niemeyer will have input too ;)
<rogpeppe> fwereade: i'm sure of it :-)
<fwereade> rogpeppe, crack-check on another idea
<fwereade> rogpeppe, move most of cmd/jujuc/server -- ie the commands, HookContext, and associated types -- into worker/uniter/hook
<rogpeppe> fwereade: +1
<rogpeppe> fwereade: it's part of the worker after all, right?
<fwereade> rogpeppe, I'm quite certain I will do more with the idea but I feel I can't start until I've moved them and absorbed the impact
<fwereade> rogpeppe, I have an intuition that the commands themselves may want to move somewhere else
<fwereade> rogpeppe, but yeah
<rogpeppe> fwereade: let me just have a quick look at cmd/jujuc/server again
<rogpeppe> fwereade: this is where i come up against the "what things are actually intended to be exported?" thing
<rogpeppe> fwereade: looking at http://go.pkgdoc.org/launchpad.net/juju-core/cmd/jujuc/server - which functions are actually part of the package API and which are just there for testing purposes?
<fwereade> rogpeppe, some of RelationContext's methods may not technically need to be exported
<fwereade> rogpeppe, and if you want to get technical about it we *could* make all the commands internal-only, but I think that actually ossifies the package
<rogpeppe> fwereade: yeah, i was wondering about NewClosePortCommand and friends
<rogpeppe> fwereade: i was trying to work out when we'd need them
<fwereade> rogpeppe, ATM the only client is NewCommand or whatever it is
<rogpeppe> fwereade: i'm not sure what you mean by "ossifies" actually
<fwereade> rogpeppe, adds to the burden of moving parts of it as it becomes more apparent that they're different packages
<rogpeppe> fwereade: ok, so everything else other than NewCommand could be private?
<fwereade> rogpeppe, and HookContext and most if not all its methods
<rogpeppe> fwereade: i don't think that's a substantial problem - renaming is easy
<fwereade> rogpeppe, and RelationContext and several of its methods
<rogpeppe> fwereade: this is why i think we should have a more rigorous approach to private/public naming. the important parts of the API are obscured by all the bits that could be private.
<rogpeppe> fwereade: but regardless of that, i think it would sit well under worker/uniter
<Aram> moin.
<fwereade> rogpeppe, sorry, got a phone call
<rogpeppe> fwereade: np
<rogpeppe> fwereade: you've got a filter review
<fwereade> rogpeppe, tyvm
<rogpeppe> fwereade: what does it mean for a relation to be "broken" rather than "departed" (looking at the comment "TODO don't stop until all relations broken.")
<rogpeppe> ?
<fwereade> rogpeppe, has executed the -broken hook (and left the scope)
<fwereade> rogpeppe, ie when that relation will never again be touched by that unit
<rogpeppe> fwereade: ah, so "broken" for a relation is a kinda edge-triggered thing, rather than a steady-state broken like the unit broken thing?
<fwereade> rogpeppe, yeah, it's an awful name
<rogpeppe> fwereade: it certainly confused me :-)
<fwereade> rogpeppe, I'm not sure there's anyone it doesn't
<rogpeppe> fwereade: so the "broken" in this context is like a connection breaking - it doesn't imply anything going wrong at all?
<fwereade> rogpeppe, yeah, exactly
<rogpeppe> jeeze
<rogpeppe> fwereade: that's something the docs are going to need to be very careful about...
<rogpeppe> fwereade: just looking at modeContext - is there ever a mode that would *not* want to use modeContext?
<rogpeppe> davecheney: hiya
<fwereade> rogpeppe, basically no
<rogpeppe> fwereade: why not factor it out into the loop that's calling the modes then?
<davecheney> rogpeppe: hey
<fwereade> rogpeppe, because the bits that need to know the name (which include the error annotation) can't be used there, because you can't turn a Mode into a name
<rogpeppe> fwereade: hmm, the mode name might be a problem
<rogpeppe> fwereade: you could always do something like this, i suppose: http://paste.ubuntu.com/1253570/
<rogpeppe> fwereade: i do feel that modeContext feels a bit like a hack in the midst of nice clean code.
<fwereade> rogpeppe, I feel that modeContext enables a fair amount of that code's cleanliness
<rogpeppe> fwereade: if you could get the name from a mode, then you could still have that cleanliness without modeContext AFAICS
<fwereade> rogpeppe, although, hmm, actually the non-name bits would probably be better if they were stuck in the main loop
<rogpeppe> fwereade: i think so
<fwereade> rogpeppe, sgtm
<rogpeppe> woo hoo, just got a 50 pound royalty payment.
<fwereade> rogpeppe, nice!
<fwereade> rogpeppe, what for?
<rogpeppe> fwereade:  a tune i wrote some time ago
<rogpeppe> fwereade: someone put it on their album
<fwereade> rogpeppe, awesome :D
<TheMue> rogpeppe: +1
<rogpeppe> fwereade: it'll at least pay for my next bow re-hair...
<rogpeppe> fwereade: (which is long overdue as it happens)
<fwereade> rogpeppe, excellent :)
<TheMue> rogpeppe: where can we d/l your first album? ;)
<rogpeppe> TheMue: when i record it. that's predicted to be in NaN years.
<TheMue> rogpeppe: ok, i plan to get old, but not so old.:(
<TheMue> rogpeppe: different topic. anything special to know when doing ec2 tests (beside setting the environment with the keys)?
<rogpeppe> TheMue: you should set the test timeout, because it takes a while
<rogpeppe> TheMue: here's the script i use (i call it "livetest"):
<rogpeppe> go test -amazon -test.timeout 2h -gocheck.vv $* >[2=1] | timestamp
<rogpeppe> for bash, it would be:
<rogpeppe> go test -amazon -test.timeout 2h -gocheck.vv "$@" 2>&1 | timestamp
<TheMue> rogpeppe: great, thx
<rogpeppe> i find the timestamp bit very useful, as i can glance over at the running test and see if it's hanging up unreasonably or progressing normally
<TheMue> rogpeppe: if i do a "go test" in environ/ec2 and get an error that "juju-sample" is not found, what did i do wrong?
<rogpeppe> TheMue: hmm, good question, i'll have a look
<rogpeppe> TheMue: could you paste the output of go test -gocheck.vv ?
<TheMue> rogpeppe: sure, one moment
<TheMue> rogpeppe: http://paste.ubuntu.com/1253687/
<rogpeppe> TheMue: is this running live tests? on trunk?
<TheMue> rogpeppe: it's a fresh branch of trunk, just to make sure it's none of my changes
<rogpeppe> TheMue: this happen even with you're not running with the -amazon flag?
<rogpeppe> s/happen/happens/
<rogpeppe> s/with/when/ :-)
<TheMue> rogpeppe: yes, just a plain go test
<TheMue> rogpeppe: ah, tried it now with -amazon, looks different
<rogpeppe> TheMue: i'm wondering if you've got an old copy of the goamz package
<TheMue> rogpeppe: i'll force an update, to make sure
<rogpeppe> TheMue: my version of goamz is on revision 13
<rogpeppe> TheMue: and trunk environs/ec2 passes all tests for me
<rogpeppe> TheMue: (N.B. go get -u doesn't work with bzr repositories)
<TheMue> rogpeppe: uuh, i've got the rev 11
<TheMue> rogpeppe: thx, i'll update now and test again
<rogpeppe> TheMue: that should fix your problem
<Aram> rogpeppe: what GUI tool do you use for diff?
<rogpeppe> Aram: qbzr
<rogpeppe> Aram: it's not as good as codereview, but much more immediate :-)
<Aram> can it help with merges?
<rogpeppe> Aram: good morning, BTW!
<Aram> (as in, is it a three way diff tool?)
<Aram> moin.
<rogpeppe> Aram: no. there is a tool around that does, but i can never remember its name
<Aram> I'm fine with raw diffs when I need a diff, but merge conflicts really confuse me.
<Aram> so I'm searching for a better tool.
<TheMue> rogpeppe: thx a lot, you were right, rev 13 runs fine now
<rogpeppe> TheMue: cool
<TheMue> Aram: moin from here too
<Aram> hi.
<rogpeppe> fwereade: stupid bzr question: if i've got a full revision id, how can i make a branch from it?
<fwereade> rogpeppe, sorry, i have no idea
<rogpeppe> fwereade: thanks
<rogpeppe> ah! done it!
<rogpeppe> mramm: morning!
<mramm> morning
<mramm> How goes?
<TheMue> mramm: hiya
<Aram> hi.
<rogpeppe> mramm: pretty good, just about to propose the final piece of the upgrade logic (the --bump-version flag on upgrade-juju)
<mramm> awesome
<Aram> meh, I'm using p4merge, all these open alternatives suck.
<rogpeppe> Aram: i usually find that apart from very occasional cases, doing it inline is fine
<rogpeppe> fwereade: relatively trivial: https://codereview.appspot.com/6593053
<Aram> rogpeppe: my brain is simply not adapted for such a task.
<Aram> so I have to do it three times before it's right.
<fwereade> rogpeppe, sorry, just having lunch, I'll do it when I come back
<rogpeppe> Aram: i found it hard to start with, but then worked out a technique that seems to work
<rogpeppe> Aram: i edit the "this" parts by copying and pasting from the "source" parts.
<rogpeppe> Aram: then i delete the source parts
<rogpeppe> fwereade: np, enjoy!
<niemeyer> Good morning juju masters!
<TheMue> niemeyer: heya
<niemeyer> Hmmm.. why am I getting a BADSIG on us.achive
<niemeyer> archive
<TheMue> niemeyer: time for a short question?
<niemeyer> TheMue: Sure
<TheMue> niemeyer: fine, thx
<TheMue> niemeyer: when setting up a new machine we set the juju-group and a machine-individual group
<TheMue> niemeyer: in ec2
<niemeyer> TheMue: Hmm.. do we?
<TheMue> niemeyer: i hope i describe it right, i'll try to make it more understandable
<TheMue> niemeyer: there's a setUpGroup() for a machine-id, and in it a group with the group name ("juju-xyz") and a machine group ("juju-xyz-(machineid)") are set up
<TheMue> niemeyer: now i removed the machine group, so all machines share only one group and the ports are opened/closed on it per instance
<niemeyer> TheMue: Ah, okay, you're talking about *security* groups in EC2..
<TheMue> niemeyer: sorry, yes, i forgot to give more context
<niemeyer> TheMue: Ok
<TheMue> niemeyer: so now the question: shell the IP permissions for the juju-group stay the same (ssh for all plus all ports inside the group)?
<TheMue> shall
<niemeyer> TheMue: Yeah, sounds ok
<niemeyer> TheMue: Btw,
<TheMue> niemeyer: ok
<TheMue> niemeyer: yes?
<niemeyer> TheMue: I'm wondering if we should have that as an option rather than the default behavior
<niemeyer> TheMue: If we keep the firewaller behaving as-is, would it be feasible to have this behavior switchable within the env only?
<TheMue> niemeyer: would be fine to me as i prefer to start with as few open ports as possible
<TheMue> niemeyer: let's make it in two CLs. i start with the current default behavior for the juju-group and then, after your lgtm, look how we can make this switchable. ok?
<niemeyer> TheMue: Hmmm.. you mean breaking what's working so that we can then see how to fix it?
<TheMue> niemeyer: it seems i did not get your idea right.
<TheMue> niemeyer: could you please tell me what you mean by "switchable"?
<niemeyer> TheMue: I mean being able to select between current behavior and a new behavior
<TheMue> niemeyer: oh, understand (hopefully).
<TheMue> niemeyer: i reduced it to the default ports, but you talk about the groups, don't you?
<niemeyer> TheMue: I'm not sure about what "reducing to the default ports" means
<niemeyer> TheMue: And yes, I think we've both been talking about security groups
<TheMue> niemeyer: I thought about which ports to be open by default.
<niemeyer> TheMue: I don't think we want to change that
<niemeyer> TheMue: Do we?
<TheMue> niemeyer: no, but after your first sentence "i'm wondering …" i thought so.
<niemeyer> TheMue: Ah, ok.. no, I'm talking about the whole change
<TheMue> niemeyer: yes, now i got it. sorry i got it wrong in the first place.
<niemeyer> TheMue: It's all good
<TheMue> niemeyer: the firewaller relies on the interface for opening and closing ports. so the change imho can be done without changing the firewaller.
<TheMue> niemeyer: but the current implementation opens and closes on the machine group
<TheMue> niemeyer: and in future on the juju group
<TheMue> niemeyer: so i think this can be made configurable
<niemeyer> TheMue: Right.. so the first CL would be adding such a setting, without any further changes
<TheMue> niemeyer: ok, and i keep my current branch as a reference for the follow-up ;)
<TheMue> niemeyer: another question about security groups
<niemeyer> TheMue: Yeah, nothing is lost probably
<TheMue> niemeyer: if i change an ip permission for the juju group, this does not affect all instances, does it?
<niemeyer> TheMue: If you change a security group, it affects all instances to which the security group is attached
<TheMue> niemeyer: oh, so instead of a machine security group a service based security group could make more sense, or do i get it wrong?
<niemeyer> TheMue: Security groups can only be attached to a machine when the machine is created
<TheMue_> Shâ¦, disconnected
<niemeyer> TheMue: Security groups can only be attached to a machine when the machine is created
<TheMue> niemeyer: Thx, that has been the missing link to the chosen model. Just reading a lot about it to get a better understanding.
<niemeyer> TheMue: np
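The first CL niemeyer describes would just add the switch without changing behaviour. A minimal sketch of such a setting follows; the key name `global-firewall` is invented here for illustration, as the real name was still to be decided.

```go
package main

import "fmt"

// environConfig is a stand-in for the provider's environment
// configuration attributes.
type environConfig struct {
	attrs map[string]interface{}
}

// globalFirewall reports whether ports should be opened and closed on
// the shared juju group for the whole environment, rather than on
// per-machine security groups (the current behaviour). It defaults
// to false so existing environments keep working unchanged.
func (c *environConfig) globalFirewall() bool {
	v, ok := c.attrs["global-firewall"].(bool)
	return ok && v
}

func main() {
	defCfg := &environConfig{attrs: map[string]interface{}{}}
	optCfg := &environConfig{attrs: map[string]interface{}{"global-firewall": true}}
	fmt.Println(defCfg.globalFirewall(), optCfg.globalFirewall()) // → false true
}
```

Since the firewaller only talks to the ports-open/close interface, StartInstance can consult this flag to decide whether to create a per-machine group, leaving the firewaller itself untouched.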
<rogpeppe> TheMue, niemeyer: do you think it still makes sense to have two security groups? i'm thinking that we can make do with only one now.
<rogpeppe> niemeyer: --bump-version is back, BTW: https://codereview.appspot.com/6593053/
<niemeyer> rogpeppe: It depends a bit on the implementation
<niemeyer> rogpeppe: Would be best to have one, but given that it is switchable, I don't know what will end up being easiest
<niemeyer> TheMue: ^
<niemeyer> rogpeppe: Cheers
<rogpeppe> niemeyer: ah, i see.
<rogpeppe> niemeyer: yeah, it depends. if you want it switchable at runtime, there's no other option.
<TheMue> rogpeppe: they need something like gosecuritygroups, more lightweight than real groups. ;)
<rogpeppe> TheMue: all they need is to be able to add security groups after launch
<rogpeppe> s/add sec/modify sec/
<TheMue> rogpeppe: yes, that would help a lot
<niemeyer> rogpeppe: I'm not considering much runtime at this point
<niemeyer> rogpeppe, TheMue: I'm hoping we can keep the core issue in mind and have that as a trivial yet nice addition soon
<rogpeppe> niemeyer: it's interesting actually - we'd need to store the current mode in the environment somewhere - something we don't do anywhere. maybe an additional field in the bucket.
<niemeyer> rogpeppe: We don't need to do anything we're not doing currently. Just a trivial setting that defines the intended behavior
<rogpeppe> niemeyer: i mean, we'd need to do that regardless of switching at environment-creation-time or run-time.
<rogpeppe> niemeyer: StartInstance needs to know whether it needs to create a new security group or not.
<niemeyer> rogpeppe: StartInstance is in the environment, that has the setting to define its behavior
<rogpeppe> niemeyer: yes, assuming no one calls StartInstance from the client side, where the setting might be different from the one that the environment was bootstrapped with.
<niemeyer> rogpeppe: If we randomly use different settings client side and server side, we surely will have many issues
<rogpeppe> niemeyer: ok. i actually thought it would work pretty well currently with just the bare minimum of settings on client's side (after bootstrap of course)
<niemeyer> rogpeppe: default-series?
<rogpeppe> niemeyer: interesting. tbh i have to say i'm not entirely clear on where and how default-series is used.
<niemeyer> rogpeppe: Okay, so before we dive away, TheMue's issue is a trivial one as it stands. We must consistently use the environment settings stored in the environment no matter what.
<rogpeppe> niemeyer: +1
<TheMue> niemeyer: +1
<TheMue> lunchtime
<rogpeppe> niemeyer: FWIW in the only occurrence of default series being used that i can find, it's taken from the state env config, not the local config. i imagine there are other places where it will come in though.
<niemeyer> TheMue: Enjoy!
<niemeyer> rogpeppe: That seems to agree with what was just stated above
<rogpeppe> niemeyer: i probably misunderstood you, sorry. i thought you were saying that things could be mucked up by a client using different env config settings from what we bootstrapped with. i don't *think* we can, currently.
<niemeyer> rogpeppe: They cannot because we're doing exactly what I suggested, which is consistently using the environment settings stored in the environment.
<niemeyer> rogpeppe: This is not a coincidence.
<rogpeppe> niemeyer: ah, i see. someone *could* still muck things up by calling StartInstance directly, but none of our stuff will do that.
<niemeyer> rogpeppe: Of course, someone can screw up things arbitrarily by doing arbitrary things..
<niemeyer> rogpeppe: We give people all the tools to pull an environment configuration from a mailing list and run wild..
<niemeyer> rogpeppe: Just sent a question to the CL
<rogpeppe> niemeyer: i'm not sure what "interacting properly with the --version option" would be
<rogpeppe> niemeyer: currently the --version flag does not influence which version is uploaded
<rogpeppe> niemeyer: it seems like you might be suggesting that it should
<rogpeppe> niemeyer: tbh, i'm thinking that the tool upload functionality might be best off as an entirely separate subcommand
<niemeyer> rogpeppe: Hmm
<niemeyer> rogpeppe: Yeah, you're right
<niemeyer> rogpeppe: --version is about selecting which version to pick
<niemeyer> and if we're uploading we don't really have a choice
<niemeyer> rogpeppe: We should conflict these options explicitly then, I suppose?
<rogpeppe> niemeyer: we could do.
<rogpeppe> niemeyer: though i don't mind override semantics either
<niemeyer> rogpeppe: It's bogus I think.. it will pretend to be a different version than it actually is
<rogpeppe> niemeyer: i'm not sure that will happen.
<rogpeppe> niemeyer: s/will/can/
<niemeyer> rogpeppe: What does --version 10.11.12 --upload-tools --bump-version mean?
<rogpeppe> niemeyer: the given version is ignored, the tools are uploaded with a bumped version if necessary
<niemeyer> rogpeppe: There you go..
<rogpeppe> niemeyer: i'm not sure what you mean by "pretending to be a different version".
<rogpeppe> niemeyer: what's pretending?
<niemeyer> rogpeppe: Nothing is pretending, because you've ignored the option entirely
<rogpeppe> niemeyer: you can also do: --version 10.11.12 --upload-tools
<niemeyer> rogpeppe: Let's just error when the option means nothing
<rogpeppe> niemeyer: ok.
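The behaviour agreed here, erroring when the option means nothing, might look roughly like this (a sketch; the flag names follow the discussion, but the real upgrade-juju code may differ):

```go
package main

import (
	"errors"
	"fmt"
)

// validateUpgradeFlags rejects flag combinations in which --version
// would be silently ignored: when uploading tools the version is
// fixed by the current build, so a user-supplied version means
// nothing and pretending otherwise would be misleading.
func validateUpgradeFlags(version string, uploadTools bool) error {
	if uploadTools && version != "" {
		return errors.New("cannot specify --version with --upload-tools")
	}
	return nil
}

func main() {
	fmt.Println(validateUpgradeFlags("10.11.12", true)) // conflict: non-nil error
	fmt.Println(validateUpgradeFlags("", true))         // fine: <nil>
}
```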
<niemeyer> rogpeppe: Unrelated to the current CL, though.. LGTM
<rogpeppe> niemeyer: thanks
<rogpeppe> niemeyer: apart from that last wrinkle (and major-version upgrades, of course), that completes upgrading for the time being, i think.
<niemeyer> rogpeppe: Woohay!
<rogpeppe> niemeyer: i'm now moving on to fixing a few bugs that have been assigned to me (in particular some of the sporadic live test failures). other suggestions welcome.
<niemeyer> rogpeppe: We have a pretty high/critical one on the pipeline that could see your attention, actually
<niemeyer> rogpeppe: I originally invited Aram to look at it, but on second thought you probably have more context to do it quickly
<rogpeppe> niemeyer: ok. let me at it!
<niemeyer> rogpeppe: We're not game to start kicking out the ssh proxying hackery
<niemeyer> s/not/now
<rogpeppe> niemeyer: woo. do we think that go.net ssh is up to the task?
<niemeyer> rogpeppe: We don't need ssh at all, hopefully
<rogpeppe> niemeyer: even better!
<niemeyer> rogpeppe: The idea is to connect straight onto mongo
<rogpeppe> niemeyer: what shall we use for crypto?
<niemeyer> rogpeppe: It'll take a few steps to get there, though
<niemeyer> rogpeppe: Mongo can talk SSL
<rogpeppe> niemeyer: cool.
<niemeyer> rogpeppe: mgo can't right now, but I'll fix that before you get there, I'm hoping
<niemeyer> rogpeppe: As a very first step, we have to introduce the idea of a connection password for the agents
<rogpeppe> niemeyer: where does the password come from?
<niemeyer> rogpeppe: The agent passwords are automatically generated
<niemeyer> rogpeppe: For the machine agent, we have to hand the password off during the machine creation, so that the new agent can connect once it comes up
<niemeyer> rogpeppe: Unfortunately, we have to do a little trick once the agent comes up the first time, and replace the agent password with a new one that is created locally
<niemeyer> rogpeppe: Because the metadata that we send to the machine is visible to anyone within the machine itself
<rogpeppe> niemeyer: perhaps we could have a new agent that's responsible for exchanging one-time passwords for persistent passwords
<rogpeppe> (thinking aloud)
<niemeyer> rogpeppe: It's significantly simpler than that
<niemeyer> rogpeppe: It doesn't even have to be a one-time password
<niemeyer> rogpeppe: It's just a "Oh, is that the password I got during creation, okay.. here is a new one I want to use from now on."
<niemeyer> rogpeppe: As long as we do that before the machiner goes on to create units, it's all good
<rogpeppe> niemeyer: how do you get the new one?
<niemeyer> rogpeppe: Just create one and change it
<rogpeppe> niemeyer: ah, does SSL have provision for changing passwords remotely?
<niemeyer> rogpeppe: Huh.. wires crossed there
<niemeyer> rogpeppe: Forget SSL for the moment
<niemeyer> rogpeppe: This is juju state logic
<niemeyer> rogpeppe: and mongo auth
<rogpeppe> niemeyer: ok.
<rogpeppe> niemeyer: are we planning to store passwords inside the state?
<niemeyer> rogpeppe: Nope
<rogpeppe> niemeyer: sorry, ignore my bleating. i'll shut up until you've explained :-)
<niemeyer> rogpeppe: No, that's it really
<niemeyer> rogpeppe: That's the first step
<rogpeppe> niemeyer: could you outline the next steps, so i know where we're heading?
<niemeyer> rogpeppe: How far? :-)
<rogpeppe> niemeyer: until we've got equivalent functionality to today
<rogpeppe> niemeyer: but via direct connection to mongo
<niemeyer> rogpeppe: We can't drop SSH just yet, but that means we can then introduce SSL more or less easily, and drop it
<niemeyer> rogpeppe: This is the tricky bit for doing so
<niemeyer> rogpeppe: After that we need some means to define the admin password
<niemeyer> rogpeppe: Which is what that old admin-secret meant
<niemeyer> rogpeppe: So that we can connect straight to mongo
<rogpeppe> niemeyer: mongo's authorization is password-based, presumably
<niemeyer> rogpeppe: and then we just enable SSL
<niemeyer> rogpeppe: Yes, user/pass
<niemeyer> rogpeppe: In a future universe, we'll then introduce an HTTPS API to which everyone will talk to
<niemeyer> rogpeppe: Which maps more or less well to that whole model
<rogpeppe> niemeyer: and there's a mongo capability to add users and passwords, i guess
<niemeyer> rogpeppe: Yeah, that's already well mapped onto mgo
<rogpeppe> niemeyer: so we'd have one user per agent, each with its own password
<niemeyer> rogpeppe: Exactly
<rogpeppe> niemeyer: out of interest, which part of the API do you use to do that?
<rogpeppe> niemeyer: oh, i see
<niemeyer> rogpeppe: You mean in mgo?
<niemeyer> rogpeppe: http://go.pkgdoc.org/labix.org/v2/mgo#Database
 * rogpeppe grepped for password, not user
<rogpeppe> niemeyer: so can anyone change any user?
<niemeyer> rogpeppe: No.. we have to put some thinking on how that will work
<rogpeppe> niemeyer: it seems like the model we've got here is something like this: http://paste.ubuntu.com/1254028/
<niemeyer> rogpeppe: As you spotted above, we'd have one user per agent, each with its own password
<rogpeppe> niemeyer: i'm not quite sure how the MA would pass the password securely to other agents. maybe we should just put it in a file.
<rogpeppe> niemeyer: after all, if you're root you can get the password anyway.
<niemeyer> rogpeppe: We probably need it in a file either way, since we must know it after it changes
<rogpeppe> niemeyer: good point.
<rogpeppe> niemeyer: ok, if the above sketch looks right to you, i'm happy.
<niemeyer> rogpeppe: It doesn't because of what I mentioned above
<niemeyer> rogpeppe: As you spotted above, we'd have one user per agent, each with its own password
<niemeyer> rogpeppe: The sketch makes it feel like the password provided is the machine password
<rogpeppe> niemeyer: ok, so the MA creates a new mongo user for each agent it starts
<rogpeppe> niemeyer: and so, presumably, does the unit agent when it starts subordinates?
<niemeyer> rogpeppe: Yeah, that sounds right
<niemeyer> rogpeppe: This is a bit silly right now, if you think about it, because all those users have the same permissions
<niemeyer> rogpeppe: But it starts to make more sense if we consider that situation won't be like that soon
<rogpeppe> niemeyer: yeah, each agent could potentially delegate permissions.
<niemeyer> rogpeppe: Btw, when you say "the MA creates a new mongo user", it's not just a mongo user we're talking about.. these details should be abstracted behind our own API
<rogpeppe> niemeyer: yeah, because the MA knows nothing about mongo
<niemeyer> Right
<rogpeppe> niemeyer: would this be sufficient? state.State.AddUser(user, pass string) error
<niemeyer> rogpeppe: No..
<rogpeppe> niemeyer: and an extra field in state.Info
<niemeyer> rogpeppe: We have to think through the relationship between user and entity
<rogpeppe> niemeyer: a "capability" bitmask or set?
<niemeyer> rogpeppe: Sorry, I just meant that we don't want dangling users around without association to an entity, and we don't want to have to manipulate the needs of authentication and the needs of entity creation/etc independently
<niemeyer> rogpeppe: It should be clear from the API how these things take palce
<niemeyer> place
<rogpeppe> niemeyer: ok. by "entity" you mean some type or value in the state?
<rogpeppe> niemeyer: or an agent?
<niemeyer> rogpeppe: We've been using entity to refer to unit/service/machine
<niemeyer> rogpeppe: But in this case it's really the subset of those that have an agent ATM
<niemeyer> rogpeppe: So unit/machine
<rogpeppe> niemeyer: so the Machine entity is kinda superior to the Unit entity, because an MA can create users for UAs but not vice versa?
<niemeyer> rogpeppe: We won't have the low-level restrictions in place today.. we just need a proper API that makes sense in such a context
<rogpeppe> niemeyer: that's what i'm trying to imagine
<niemeyer> rogpeppe: We don't need to "create users" in general
<niemeyer> rogpeppe: We have proper unique keys for the agents
<niemeyer> rogpeppe: We shouldn't have to be manipulating users and passwords in our own API externally to the state model
<niemeyer> rogpeppe: This would add an implicit API on top of our own API that is probably not necessary
<rogpeppe> niemeyer: what about something like this? http://paste.ubuntu.com/1254071/
<niemeyer> rogpeppe: Yeah, that looks nice
<rogpeppe> niemeyer: and then a new "Password" field in state.Info
<rogpeppe> niemeyer: (which in actual fact incorporates both the mongodb username and password)
<niemeyer> rogpeppe: I think we can be a bit more explicit: // SetPassword returns a new random password the agent responsible for X should use to communicate with the state servers. Previous passwords are invalidated.
<rogpeppe> niemeyer: NewPassword?
<niemeyer> rogpeppe: NewFoo in general returns a new foo, without further consequences
<rogpeppe> niemeyer: yeah
<niemeyer> rogpeppe: ReplacePassword maybe
<rogpeppe> niemeyer: or ChangePassword
<niemeyer> rogpeppe: Sounds good
<niemeyer> rogpeppe: ChangePassword() (auth string, err error)
<niemeyer> rogpeppe: So we can have Auth in state.Info rather than password, as it may contain the username too
<niemeyer> rogpeppe: Although we never expose those details out of the state itself
<rogpeppe> niemeyer: sounds good
<niemeyer> rogpeppe: Hmm.. there's a small detail here, though.. we may need to split that up a bit into separate methods
<rogpeppe> niemeyer: i was wondering about this: http://paste.ubuntu.com/1254085/
<niemeyer> rogpeppe: Ah, no.. we should send the new password..
<niemeyer> rogpeppe: ChangePassword(password) (auth string, err error)
<niemeyer> rogpeppe: Otherwise there's a race
<niemeyer> rogpeppe: We need to persist the new password before we ask to change, otherwise the agent might be locked out
<rogpeppe> niemeyer: ah, interesting, yeah
<niemeyer> rogpeppe: Regarding the past, I'd prefer an explicit API instead
<niemeyer> rogpeppe: unit.ChangePassword.. machine.ChangePassword
<rogpeppe> niemeyer: ok
<niemeyer> rogpeppe: No need to guess where it makes sense and where it doesn't
<rogpeppe> niemeyer: sounds good
<rogpeppe> niemeyer: the nice thing about passing in a password is that the method becomes idempotent.
<niemeyer> rogpeppe: Indeed
<rogpeppe> niemeyer: i think SetPassword makes better sense now.
<niemeyer> rogpeppe: +1
<rogpeppe> // SetPassword sets the password the agent responsible for the machine
<rogpeppe> // should use to communicate with the state servers.  Previous passwords
<rogpeppe> // are invalidated.
<rogpeppe> func (m *Machine) ChangePassword() (password string, err error)
<niemeyer> func (m *Machine) SetPassword(password string) (auth string, err error)
<niemeyer> I suppose
<rogpeppe> niemeyer: yeah c&p error
<niemeyer> fwereade: I'm mid-way through the filter review.. the conversation here was good, so I didn't make much progress, but will continue after lunch
<fwereade> niemeyer, no worries
<niemeyer> rogpeppe: I'll step out for lunch and bbiab
<rogpeppe> niemeyer: ok cool
<fwereade> niemeyer, I may be off soon but will try to pop back on later
<niemeyer> fwereade: Sounds good, thanks for the nice progress
 * niemeyer => lunch
<fwereade> niemeyer, seems to be going ok :)
<fwereade> niemeyer, cheers
<TheMue> niemeyer: https://codereview.appspot.com/6596051/ as the first step is in
<TheMue> cu later
<niemeyer> mramm: Do we have a call in 20 mins?
<mramm> niemeyer: Yea the Cloud Consistency Call
<mramm> niemeyer: we will be going over the certified public cloud initiative, and some server stuff
<niemeyer> mramm: 'k
<mramm> niemeyer: not sure that your attendance is required, if you want to skip I can ping you if something important comes up
<niemeyer> mramm: Sounds good, I'll be around then
<rogpeppe> niemeyer: just wondering what's the best thing to do about removing users. i'm thinking that SetPassword("") removes the user for an entity (i can't see that we'll ever want to actually set a blank password)
<niemeyer> rogpeppe: Why would we want to remove a user in such a fashion?
<rogpeppe> niemeyer: if a unit gets removed, i think it makes sense to remove the unit's mgo user too
<niemeyer> rogpeppe: Exactly! :)
<rogpeppe> niemeyer: erk
<rogpeppe> niemeyer: of course.
<niemeyer> rogpeppe: Btw, over lunch I was thinking on our interface.. we'll stumble on problems regarding SetPassword returning auth, for the same reason that we need to pass the password in instead of just using the result
<niemeyer> rogpeppe: I suggest this instead:
<niemeyer> func (m *Machine) SetPassword(password string) error
<niemeyer> struct Info { .....; Principal interface{ SetPassword(string) }; Password string; }
<rogpeppe> niemeyer: interesting
<rogpeppe> niemeyer: i don't quite see why SetPassword is a problem though, assuming it's deterministic
<niemeyer> rogpeppe: The only change is that the auth result was nonsense
<rogpeppe> niemeyer: the Info change above is a pretty major thing.
<niemeyer> rogpeppe: Because?
<rogpeppe> niemeyer: currently it's a very simple set of info.
<niemeyer> rogpeppe: Ah, you're right.. we have a chicken and egg
<niemeyer> rogpeppe: Well, we'll need to save the principal as a string then
<rogpeppe> niemeyer: alternatively we could just have a Principal method on Machine and Unit
<niemeyer> rogpeppe: For?
<rogpeppe> niemeyer: we need to get the user name and password from somewhere.
<niemeyer> rogpeppe: That's not an alternative to the issue above, since we still must tell it what the principal is, and we can't get a password from state
<rogpeppe> niemeyer: what about Machine.Auth(password) ?
<niemeyer> rogpeppe: Sorry, feels like we're going backwards in our conclusions
<rogpeppe> niemeyer: i'm not sure i've concluded anything yet
<niemeyer> rogpeppe: We already have machine.SetPassword(password).. Auth(password) is no better
<rogpeppe> niemeyer: sorry, i'm thinking *in addition* to SetPassword
<niemeyer> rogpeppe: I'm a bit lost
<niemeyer> rogpeppe: What problem are you solving?
<rogpeppe> niemeyer: so we'd have SetPassword(password) error; and Auth(password) (auth string, err error)
<niemeyer> rogpeppe: Why?
<rogpeppe> niemeyer: so that we can generate the password, save it to disk, then if we crash we can regenerate the required auth info from the password only.
<niemeyer> rogpeppe: All of that seems unrelated to Machine.Auth(password)
<rogpeppe> niemeyer: Auth is maybe a bad name. AuthInfo, perhaps.
<niemeyer> rogpeppe: Seems like a complex way to do something simple
<rogpeppe> niemeyer: what's your suggestion?
<niemeyer> rogpeppe: The same.. Principal string
<rogpeppe> niemeyer: where does that come from?
<niemeyer> rogpeppe: We already have a nice key on PathKey.. we can rename it to something more sensible and reuse
<rogpeppe> niemeyer: i was going to suggest that
<rogpeppe> niemeyer: i still think that AgentName is a reasonable name for that method.
<rogpeppe> niemeyer: BTW there's one thing we haven't talked about
<rogpeppe> niemeyer: what does the client do?
<rogpeppe> niemeyer: in my implementation so far, i've implemented State.SetPassword
<niemeyer> rogpeppe: +1 on AgentName.. I can't come up with anything better right now
<rogpeppe> niemeyer:
<rogpeppe> // SetPassword sets the password the administrator
<rogpeppe> // should use to communicate with the state servers.
<rogpeppe> niemeyer: but actually maybe we don't care much about changing the admin password
<rogpeppe> niemeyer: and we could hardwire "admin" as a principal name
<niemeyer> rogpeppe: The thing I dislike about AgentName is that it blocks adding the same interface to other things we'll want to identify in the same manner
<rogpeppe> niemeyer: you mean, if we have principals that don't map directly to state objects?
<niemeyer> rogpeppe: No, I mean that the concept of having a unique and intelligible key for entities is very useful
<niemeyer> rogpeppe: EntityName perhaps
<rogpeppe> niemeyer: i wrote that, then deleted it
<rogpeppe> niemeyer: but it's not too bad, despite being a nasty mouthful.
<niemeyer> rogpeppe: This means we can have "service-foobar" too
<rogpeppe> niemeyer: +1 to EntityName
<rogpeppe> niemeyer: or even just Name...
<niemeyer> rogpeppe: -1, unless we refactor further
<niemeyer> rogpeppe: We already have that in some places
<rogpeppe> niemeyer: oh yeah, we already use it
<rogpeppe> niemeyer: darn
<niemeyer> rogpeppe: It's not a bad idea, though
<niemeyer> rogpeppe: We might move to Id() in the other fields
<rogpeppe> niemeyer: yeah, Id would work well for how unit.Name etc is used currently.
<rogpeppe> niemeyer: and it would fit well with Machine.Id too
<niemeyer> rogpeppe: unit.Id(), service.Id(), and then we have Name() with the full-blown name
<rogpeppe> niemeyer: +
<rogpeppe> 1
<rogpeppe> !
<niemeyer> :-)
<rogpeppe> niemeyer: ok, that shouldn't be hard to change.
<niemeyer> rogpeppe: We have to do that sooner rather than later, or we'll be stuck with the database schema
<rogpeppe> niemeyer: i'll do it now, if you think it's good
<niemeyer> rogpeppe: It sounds like a good move
<rogpeppe> niemeyer: what do you think about the client-initiated connection? Info.Principal == "admin" ?
<niemeyer> rogpeppe: Yeah
<niemeyer> rogpeppe: Your idea above regarding "admin" sounds good
<rogpeppe> niemeyer: that we have State.SetPassword ?
<niemeyer> rogpeppe: No, that we can hardcode the "admin" user for the moment
<rogpeppe> niemeyer: ok, sounds good
<niemeyer> rogpeppe: and that we can do without changing its password for a while
<niemeyer> rogpeppe: I think, but not sure
<rogpeppe> niemeyer: then "admin-password" as an environment setting makes total sense
<niemeyer> rogpeppe: Either way, SetAdminPassword would be ok
<rogpeppe> niemeyer: which would be a nice change :-)
<niemeyer> robbiew, mramm: Where is the meeting taking place today?
<robbiew> niemeyer: what meeting? cloud consistency?
<niemeyer> robbiew: Yeah
<robbiew> niemeyer: meh..you can skip it, unless you wanna hear an update on charm collection
<niemeyer> robbiew: Yeah, that's what I'm curious about
<robbiew> conf id 80589 63238
<niemeyer> robbiew: Cheers
<niemeyer> Holy crap
<niemeyer> 17 people..
<rogpeppe> niemeyer: first stage of Name change: https://codereview.appspot.com/6585053/
<niemeyer> rogpeppe: Looking
<rogpeppe> niemeyer: i'm off for the evening. see ya tomorrow...
<rogpeppe> night all!
<niemeyer> rogpeppe: Cool, thanks for the quick turnaround
<niemeyer> rogpeppe: Have a good evening
<rogpeppe> niemeyer: np
<rogpeppe> niemeyer: static typing FTW
<niemeyer> rogpeppe: Indeed :)
<niemeyer> I'll have a walk outside and come back shortly for more reviews
<fwereade> niemeyer, very briefly: my issue with using the uniter's existing unit/service is just that I don't want to share them with the filter goroutine
<fwereade> niemeyer, so I rather think I have to pay the cost of a refresh anyway
<fwereade> niemeyer, from that perspective, passing a single new watcher into the new goroutine, and leaving it to reconstruct everything it required from there, seemed rather neat
<fwereade> niemeyer, will be on and off, if you have a quick answer just dump it in the channel and I'll see it soon enough
<niemeyer> fwereade: I half agree
<niemeyer> fwereade: If you want to not share the service, that's certainly reasonable
<niemeyer> fwereade: But the points in the review still feel valid.. you can easily pick the service up from state when starting the filter
<niemeyer> fwereade: and watch it upfront
<niemeyer> fwereade: Or do I misunderstand your point?
#juju-dev 2012-10-02
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1059921
<davecheney> --format=json :(
<SpamapS> davecheney: I've been explicitly using --format=json to avoid formatting incompatibilities between go/python
<davecheney> SpamapS: yeah, i'm fixing the command line parser to handle this case
<davecheney> lucky(~/src/launchpad.net/juju-core) % juju bootstrap --upload-tools
<davecheney> lucky(~/src/launchpad.net/juju-core) % juju deploy mysql
<davecheney> error: no reachable servers
<davecheney> outstanding ... bootstrap didn't
<davecheney> aws said yes
<davecheney> but the console says that instance is stillborn
<davecheney> https://code.launchpad.net/~dave-cheney/gnuflag/001-parse-more-gnu-args/+merge/127419
<davecheney> ^ gnuflags doesn't have lbox setup
<rogpeppe> fwereade, davechen1y: mornin'
<davecheney> morning
<fwereade> rogpeppe, davecheney, heyhey
<davecheney> rogpeppe: i have a non CL patch for you to review
<fwereade> rogpeppe, davecheney: I'm thinking I might actually take my swap day, I fell asleep really early last night and would prefer not to waste the sudden urge to rest
<fwereade> rogpeppe, davecheney, *but* ofc I have reviews to respond to first :p
<davecheney> fwereade: sgtm - don't forget meeting this afternoon
<rogpeppe> fwereade: ok, have fun if so!
<fwereade> davecheney, good point, ty
<rogpeppe> davecheney: i was about to ask if you had any reviews you wanted me to look at
<davecheney> one to gnuflags
<davecheney> rogpeppe: https://code.launchpad.net/~dave-cheney/gnuflag/001-parse-more-gnu-args/+merge/127419
<davecheney> needed, basically, because all the charms expect it
<rogpeppe> davecheney: no codereview link?
<davecheney> rogpeppe: it doesn't make one
<davecheney> hang on
<rogpeppe> davecheney: what doesn't make one?
<davecheney> lbox
<davecheney> just found the option
<rogpeppe> davecheney: -cr
<davecheney> https://codereview.appspot.com/6588060
<davecheney> rogpeppe: why don't i need to do that on juju-core ?
<rogpeppe> davecheney: there's a .lbox file in its root
<rogpeppe> davecheney: with flags in
<davecheney> fairy nuf
<rogpeppe> davecheney: i'm on the review BTW
<rogpeppe> davecheney: i'm not sure it's as simple as your CL tries to make it (which is why i didn't do it before)
<rogpeppe> davecheney: you've got a review
<TheMue> good morning
<TheMue> hi fwereade
<fwereade> TheMue, heyhey, I'm not really here :)
<TheMue> oh, where are you?
<fwereade> TheMue, swap-daying
<fwereade> TheMue, I'm tired and it's my birthday :)
<TheMue> fwereade: ah, ic. so what do you do here? ;)
<TheMue> oh, happy birthday
<fwereade> TheMue, well, the most entertaining thing I can do for the next 10 mins is poke at how we implement relations and try to come up with something cleaner than what we have for the current use cases
<fwereade> TheMue, after that I'll go out for a while, but see you at the meeting
<TheMue> fwereade: so it's a half swap-day
<fwereade> TheMue, whatever :)
<TheMue> fwereade: :D
<rogpeppe> fwereade: hippy barday to you!
<rogpeppe> fwereade: BTW AgentName isn't happening, but we're renaming PathKey to... something else.
<Aram> happy birthday fwereade.
<rogpeppe> fwereade: my suggestion was Name, but that doesn't work because Service.Name already has implications. niemeyer suggested Key but that doesn't work because all kinds of things have keys.
<rogpeppe> fwereade: currently going with an earlier of niemeyer's suggestions: EntityName
<rogpeppe> fwereade: but i'm not greatly happy with that either - any suggestions welcome.
<Aram> meeting in 1 min or 61 min?
<Aram>  ow
<Aram> now
<niemeyer> Morning!
<mramm> team meeting hangout invite sent
<mramm> https://plus.google.com/hangouts/_/557e6f4f5f9de98be4769ec91fc82db48a75d356?authuser=0&hl=en-GB
<niemeyer> mramm: Thanks
<rogpeppe> fwereade: meeting...
<davecheney> rogpeppe: https://bugs.launchpad.net/mgo/+bug/1045678
<davecheney> ^ may be of interest
<davecheney> niemeyer: --format=json
<davecheney> gnuargs only supports --format json
<davecheney> niemeyer: all: https://bugs.launchpad.net/juju-core/+bug/1059921
<rogpeppe> davecheney: i'm happy to do the gnuflag change if you like
<rogpeppe> davecheney: i do bear responsibility for that package, after all
<davecheney> rogpeppe: if you want to take a crack over night, i would be most grateful
<rogpeppe> davecheney: ok
<davecheney> i also tried splitting on the = and pushing the remainder back into f.procArgs, which worked, but the TestFlagSetParse tests were not happy
<davecheney> and I couldn't figure out why
<niemeyer> davecheney, rogpeppe: What's the problem with storing the empty flag in procFlag?
<niemeyer> davecheney, rogpeppe: Sorry, I think I meant procArgs
<rogpeppe> niemeyer: i'm not sure i understand the question
<davecheney> niemeyer: that would probably work, but TestFlagSetParse always failed for reasons I didn't understand
<niemeyer> rogpeppe: What's the problem with doing the straightforward and just splitting --foo=bar and storing "bar" as usual?
<davecheney> niemeyer: that is what I tried first
<davecheney> pushing the remainder back into procArgs
<davecheney> but I couldn't make the test pass
<rogpeppe> niemeyer: what if bar is empty?
<niemeyer> rogpeppe: What's the problem with that/
<niemeyer> ?
<niemeyer> rogpeppe: --foo "" is fine
<niemeyer> rogpeppe: and so is --foo=
<fwereade> dammit -- did I completely miss it?
<niemeyer> fwereade: Happy birthday!
<fwereade> niemeyer, haha, cheers :)
<niemeyer> :-)
<davecheney> fwereade: happy annual celebration!
<fwereade> niemeyer, went out for a nice casual brunch with cath and then got screwed by the buses on the way back :/
<fwereade> niemeyer, the intent was there though :(
<fwereade> davecheney, cheers :)
<niemeyer> fwereade: No worries really, it was a nice and quick sync up
<fwereade> niemeyer, excellent
<rogpeppe> niemeyer: i have to remember the code properly, gimme a few moments
<niemeyer> rogpeppe: And btw, given your suggestion in the review, note that "--foo=bar" and "--foo =bar" mean different things
<rogpeppe> niemeyer: indeed, but i think that wouldn't be a problem if we did it the way i suggested
<rogpeppe> niemeyer: because we wouldn't put the latter argument into procFlag
<niemeyer> rogpeppe: Cool
<rogpeppe> niemeyer: the main immediate problem with storing an empty procFlag is that i want to be able to do --foo=false for boolean flags.
<rogpeppe> niemeyer: while still allowing --foo to mean the same as --foo=true
<niemeyer> rogpeppe: Sounds good, although I'm not familiar with the algorithm there to have an opinion
<rogpeppe> niemeyer: i'm nearly there. we'll see if the tests work :-)
<niemeyer> rogpeppe: Super :)
 * niemeyer gets some coffee
<fwereade> niemeyer, Aram: can we talk a little about relations and their creation, retrieval, and watchery?
<Aram> sure
<fwereade> Aram, the relation key is now a string with N service:relation pairs, in sort order, right?
<Aram> yes
<fwereade> Aram, this is interesting from my POV because we identify relations to hooks in a not-quite-neatly-fitting way
<Aram> it's also used in some watchers to identify that the relation belongs to a service.
<fwereade> Aram, as I see it we can either have an API that gets a relation by pre-existing ID number, rather than by key
<fwereade> Aram, which will be convenient for the hook tools
<fwereade> Aram, or we can try to do more interesting things with the key and do everything that way
<fwereade> Aram, niemeyer: and I'm rather hoping someone else has some thoughts on this because I can't tell which is better
<Aram> so, the hook cares about the id of the relation?
<fwereade> niemeyer, eg, do we think that anyone is depending in any way on the format of the JUJU_RELATION env vars?
<fwereade> Aram, the hook needs to differentiate between different relations on the same endpoint
<fwereade> Aram, there's a JUJU_RELATION which is, say, "db", and a JUJU_RELATION_ID which is, say "db:7"
<fwereade> Aram, niemeyer: if we were to specify, instead, a JUJU_RELATION_ID of "db:otherservice:otherrelation"
<fwereade> Aram, niemeyer: I think it should be pretty simple to transform back and forth into real relation keys
<fwereade> Aram, niemeyer: which would resolve the icky two-Relation-getter-methods-on-State thing
<fwereade> niemeyer, Aram: I am then *very* torn on the issue of AddRelation and what params it should take
<fwereade> niemeyer, Aram: on the one hand, it's very convenient to be able to make up any old relations we like
<fwereade> niemeyer, Aram: and we make wanton use of this ability in the tests
<fwereade> niemeyer, Aram: (though, a derail atm: there's some always-on relation we should always be able to handle, right? for subordinates?)
<fwereade> niemeyer, Aram: on the other, it's pretty clearly a Bad Thing to be able to create any nonsensical relation we like, we ought to be validating them against the charms, just like we should be on upgrade
<fwereade> niemeyer, Aram, so my instinct is drawn towards a fuzzy-name interface for both AddRelation and Relation: ie it accepts things like "wordpress mysql" and figures out the right answer without ambiguity
<fwereade> niemeyer, Aram: but which mostly wants and expects real keys for normal usage
 * niemeyer catches up
<Aram> hmm?
<Aram> how would it resolve the fuzzy naming?
<fwereade> niemeyer, Aram: do I sound like I'm on crack, and just making needless work? I *think* there's something important buried in there somewhere
<fwereade> Aram, not very fuzzy -- you have to specify the service names, but can leave the :relation bit off if there's no ambiguity
<niemeyer> fwereade: No, I think you're on a good track
<fwereade> Aram, the point is just that it allows for a single clean API that includes appropriate validation for all uses
<niemeyer> fwereade: I dislike a bit the way relations are referred to right now
<fwereade> Aram, it's definitely not a polished proposal just a feel that *something* is wrong
<fwereade> niemeyer, in what context?
<niemeyer> fwereade: In general
<fwereade> niemeyer, they're represented too many ways already, I think ;)
<niemeyer> fwereade: Oh, sory
<niemeyer> sorry
<niemeyer> fwereade: I mean the way we refer to them by a string that is a concatenation of the endpoints
<niemeyer> fwereade: It ended up as a convenient way, though
<fwereade> niemeyer, ha, I've been coming to rather like that -- if it is (or is trivially transformable to/from) the relation "names" that the user types/sees, I think it is a good thing
<niemeyer> fwereade: So, let me try to help on a few of your questions above
 * fwereade sits attentively
<niemeyer> fwereade: I don't think we should change the environment variable
<niemeyer> fwereade: It's breakage that sounds easy to avoid
<fwereade> niemeyer, ok, wholeheartedly accept that argument
<niemeyer> fwereade: It sounds fine to have a method that grabs the relation by its numeric id
<niemeyer> fwereade: I'm not sure I get the problem you describe referring to AddRelation, though
<fwereade> niemeyer, a client of the API is perfectly able to AddRelation(<wordpress:lol>, <mysql:giggle>)
<fwereade> niemeyer, and we will consider that a perfectly good relation, and join it, and just not run any hooks
<fwereade> niemeyer, IMO this is bad because if that relation *becomes* valid in future charms, to which we upgrade, we'll start running hooks in a weird weird state -- not installed, for a start
<rogpeppe> davecheney, niemeyer: https://codereview.appspot.com/6598052/
<rogpeppe> now for some lunch
<fwereade> niemeyer, so, we should really just not be allowed to create them, just as we don't create services referring to charms that don't exist
<Aram> fwereade: yeah, we should validate the relation, we didn't do so because we didn't have the means when I wrote that.
<fwereade> Aram, did we not? I thought charm.Meta was all we needed?
<fwereade> Aram, ok, but it's not just a matter of validating the endpoints
<niemeyer> fwereade: Curious.. :-)
<fwereade> Aram, it's a matter of specifying them in the first place, in add-relation, which requires that we do the validation
<niemeyer> fwereade: A bit of history.. https://codereview.appspot.com/6305067/ ;-)
<fwereade> niemeyer, ok, that is referring (I think) to a somewhat lower standard of validity
<fwereade> niemeyer, I think that in apprehending the charm-upgrade issue, my thoughts on this matter have taken a harder line
<niemeyer> fwereade: Yeah, your thoughts on it are well appreciated
<niemeyer> fwereade: So, hmm.. what prevents a relation from being added while an upgrade is in progress?
<fwereade> niemeyer, nothing yet
<fwereade> niemeyer, things are still changing under my feet enough that I don't have a clear picture of the final shape of the solution
<fwereade> niemeyer, I just know it's something that should be borne in mind
<niemeyer> fwereade: We should certainly validate the relation.. just trying to imagine how to do it properly
<fwereade> niemeyer, I don't see any way around getting the charms' Metas
<niemeyer> fwereade: We can add an assertion against the revision numbers of the involved services, I suppose
<fwereade> niemeyer, I think it's easier than that if we are careful with the lifetimes and with what upgrades we allow
<fwereade> niemeyer, block upgrades which would change non-Dead relations
<niemeyer> fwereade: Yeah, this is the basic need.. the next question is which charm
<niemeyer> fwereade: That's already an agreement I think, although we didn't put it in place yet
<fwereade> niemeyer, the current one of the service -- by blocking upgrades until incompatible relations are gone we can always use the service's current charm
<niemeyer> fwereade: But besides that, we must also prevent adding a relation after the decision that it was ok to upgrade happens
<fwereade> niemeyer, it becomes a matter of figuring out what races we have to deal with between client 1 adding relations and client 2 upgrading charms
<niemeyer> fwereade: Yeah, that's what I was referring to above
<niemeyer> <niemeyer> fwereade: We can add an assertion against the revision numbers of the involved services, I suppose
<fwereade> niemeyer, I am hoping that we will be able to express suitable asserts in the transactions
<niemeyer> fwereade: To prevent adding a relation to a service that holds an unknown charm
<fwereade> niemeyer, yeah, exactly
<niemeyer> fwereade: Cool, nothing sounds too bad so far
<fwereade> niemeyer, as long as we express the changes correctly on the client, we can find some way to keep not-yet-ok charms off to one side until the deployed charm can handle them
<fwereade> niemeyer, not really sure how to do that yet
<fwereade> niemeyer, doesn't feel likely to be a real dealbreaker
<niemeyer> fwereade: You mean not-yet-ok relations?
 * fwereade slaps self
<fwereade> niemeyer, yes, thank you
<niemeyer> fwereade: Cool, np
<niemeyer> fwereade: Yeah, agreed.. we have easy access to the current charm from the uniter
<fwereade> niemeyer, yeah, it feels quite neat
<fwereade> niemeyer, you had doubts about the appropriateness of the current "foo:bar baz:qux" key, though?
<fwereade> niemeyer, actually, that's in backward form :p
<niemeyer> fwereade: I have a tiny dislike for it.. but it's not a big deal I think
<fwereade> niemeyer, I *think* that the ease of transformation to/from user-speak is very nice
<niemeyer> fwereade: The only change I'll likely propose in the short term is to order the keys by role
<fwereade> niemeyer, +1
<niemeyer> fwereade: Otherwise we have "wordpress:db mysql:server" and "mysql:server blogger:foo"
<niemeyer> Which is not great
<fwereade> niemeyer, requirer then provider?
<niemeyer> fwereade: Yeah,  that sounds sensible
<fwereade> ok then -- since the ids-in-watchers change, I have no way to progress on relations without working on some of this stuff
<fwereade> niemeyer, ^^
<fwereade> niemeyer, I really don't think it is sensible for me to get sidetracked by the CLI-related bit when the uniter is still to be done, so I think I will just add a minimal RelationById(int) (*Relation, error) and change Relation() to take a key instead of endpoints
<fwereade> Aram, how are the watchers looking?
<fwereade> Aram, specifically ServiceRelationsWatcher :)
<Aram> nothing done on that front, there's one in review you can look at, it will definitely look very similar.
<fwereade> Aram, great, I will * try* to check that out this evening
<niemeyer> fwereade: Hmm
 * fwereade listens
<niemeyer> fwereade: Sorry, still pondering on the scheme you mention there
<niemeyer> fwereade: Having the relation taking a key instead of endpoints doesn't feel great
<niemeyer> fwereade: The way we have the current interface pretty consistent on that front feels like a nice win we have
<fwereade> niemeyer, hmm; I need to get relations both by key (because I presume that is how the watcher will deliver them) and by id (because that is how I express them to the hooks, which might pass them back to me)
<fwereade> niemeyer, getting relations, and watching relations, by key, seems natural to me
 * rogpeppe had never realised the subtle difference in Gnu flags between -u=foo and --u=foo
<fwereade> niemeyer, the intent is that at some stage we can change over AddRelation to take a string, just as GetRelation does, that does the appropriate inference of unambiguously underspecified endpoints
<fwereade> niemeyer, and sorts appropriately, I guess
<niemeyer> fwereade: :-)
<niemeyer> fwereade: Effectively manipulating the key onto a different key
<fwereade> niemeyer, and thereby as a second step get back a unified API, that is backward-compatible with the stuff I've been doing in the uniter
<fwereade> niemeyer, ok, the heart of it is this
<fwereade> niemeyer, I need to look them up in 2 different ways, neither of which is the same as the current way to get a relation
<niemeyer> fwereade: What is the real underlying needs you have?
<fwereade> niemeyer, if I just add two trivial RelationByKey and RelationById methods, I'm golden
<niemeyer> fwereade: -1 on relation by key
<niemeyer> fwereade: key is internal, and is a compilation of the current endpoint semantics
<niemeyer> fwereade: I'm pretty sure we'll regret if we change it
<fwereade> niemeyer, ah -- I had thought we were moving away from internal keys that differed from external names
<fwereade> niemeyer, but, ok, if it's not the thing that we use to look things up by, and to send in watchers, it's not much of a key, is it?
<niemeyer> fwereade: That's one of the reasons why I dislike the key mechanism. That said, I'm pretty sure exposing it as suggested will lead to many other issues that don't feel great to get into.
<niemeyer> fwereade: You just mentioned one of them.. we cannot add a relation by its key, because we have to manipulate the key first
<niemeyer> fwereade: It's a primary key used in the database.. there's no single place in our API that exposes this key
<niemeyer> fwereade: It's an implementation detail
<niemeyer> fwereade: To enable the API that we've agreed on so far
<niemeyer> fwereade: We can kill the idea of using it as a key if we change the API
<fwereade> niemeyer, ok, then ID remains the primary way of identifying relations?
<niemeyer> fwereade: It depends on the way you look at it
<niemeyer> fwereade: what does our API look like? Are we using id as a primary way to identify relations?
<fwereade> niemeyer, ok: SRW will send IDs, and I will call a State method that takes an ID when I want to see what relation it refers to
<fwereade> niemeyer, I have assumed ID to be the main way, yes, but little committed code depends on this
<niemeyer> fwereade: We can make the concept more first class
<niemeyer> fwereade: For instance, state.Relation(id) sounds fine..
<niemeyer> fwereade: Feels a lot more sensible than obtaining all the endpoints to grab the relation
<niemeyer> fwereade: We can then have e.g. state.EndpointsRelation(endpoints ...RelationEndpoint) (*Relation, error)
<fwereade> niemeyer, great, +1
<fwereade> Aram, in the absence of other feedback, then, I will probably move on to a SRW that generates IDs at some point tomorrow; please let me know if you suspect collisions
<fwereade> Aram, I imagine I will implement it along the lines of your proposal
<niemeyer> fwereade: This kills the inconsistency, I think, and makes state.Relation(relation.Id()) work, as usual for everything else
<fwereade> niemeyer, yeah, that is fine by me
<Aram> fwereade: I don't anticipate any collisions, everything should be smooth
<fwereade> niemeyer, and then we can have a state.InferEndpoints(eps ...string) ([]RelationEndpoint, error), that can be used by the CLI tools
<niemeyer> fwereade: Yeah, that sounds great
<fwereade> niemeyer, and as a bonus we can keep the AddRelation logic as it is for now and do that properly in concert with the upgrady bits
<niemeyer> fwereade: Cool
<niemeyer> fwereade: Btw, there's a great benefit of having the DB _id being the key as it is today
<niemeyer> fwereade: The service relations watcher doesn't have to load the document to tell what service it targets
<fwereade> niemeyer, +100
<fwereade> niemeyer, I *knew* there *was* something really good about it
<niemeyer> fwereade: Yeah, that was kind of an unanticipated benefit.. the real reason we changed was the transaction mechanism
<niemeyer> fwereade: Without this, we can't add a unique relation
<niemeyer> fwereade: Because we don't have anything to assert on
<fwereade> niemeyer, very neat :)
<niemeyer> fwereade: We can't assert on non-existence of something else that contains a given field
<niemeyer> fwereade: Do you have a moment to talk about https://codereview.appspot.com/6588053/diff/2001/worker/uniter/modes.go#newcode171
<niemeyer> fwereade: happy to postpone if you're in flow on something else
<niemeyer_> :(
<TheMue> hmm, the current test in environs/ec2 in trunk has one fail and one panic. anybody working on that?
<niemeyer> TheMue: What's the failure/panic?
<niemeyer> Hmm.. something in the last two revisions broke it
<TheMue> niemeyer: Fail is NoSuchBucket in TearDownTest
<niemeyer> Ah
<niemeyer> That's Dave's change
<TheMue> niemeyer: Panic is in localServerSuite.TestBootstrapInstanceUserDataAndState
<niemeyer> Let's see in the previous revision
<niemeyer> TheMue: Proposing
<TheMue> niemeyer: Pardon?
<niemeyer> TheMue: I've fixed it.. CL up in a moment
<niemeyer> TheMue: https://codereview.appspot.com/6596056
<TheMue> niemeyer: LGTM
<rogpeppe> niemeyer: PathKey->EntityName https://codereview.appspot.com/6593061/
<niemeyer> rogpeppe: LGTM
<rogpeppe> niemeyer: thanks!
<fwereade> niemeyer, I think I can for a mo :)
<niemeyer> rogpeppe: np, and thanks
<niemeyer> fwereade: Sorry, I just read your mail and didn't realize you were out before
<fwereade> niemeyer, nah, I was explicitly doing it because I realised I needed to and, meh, cath was out :)
<fwereade> niemeyer, while C&L are distracted together I'm more than happy to talk if it helps us progress
<fwereade> niemeyer, it just means I may disappear abruptly for hours at a time :)
<niemeyer> fwereade: I was just going to mention that the issue there was simpler than it sounded
<niemeyer> fwereade: I was questioning why we do the wantEvent+waitForConfig dance at all
<niemeyer> fwereade: Rather than just running the hook directly
<fwereade> niemeyer, yeah -- as it turned out I really did have a reason ;)
<fwereade> niemeyer, not well expressed in the comments though ;)
<fwereade> niemeyer, oh yeah, I meant to ask about the resolved business
<niemeyer> fwereade: Cool, I'm curious about why that is, but I can wait for the comment easily :)
<fwereade> niemeyer, in suggesting that no ResolvedNone event needs to be sent, I imagine that, to go with this, a wantResolved would just do nothing if the state was ResolvedNone
<fwereade> niemeyer, which makes "want" not quite right, maybe
<fwereade> niemeyer, or maybe "want" is just right :)
<fwereade> niemeyer, regardless, it would necessarily I think break the want-always-causes-a-send-as-soon-as-possible feature
<fwereade> niemeyer, the immediate consequences are generally good
<fwereade> niemeyer, but that property did seem like quite a nice one
<niemeyer> fwereade: Yeah, a slight change in perspective.. it does send an event immediately, as long as the current state is sensible
<fwereade> niemeyer, ha, actually, the way I expressed it above, it *doesn't* break it
<fwereade> niemeyer, ok, if you've already considered it from that perspective I'm very happy indeed to do it that way
<SpamapS> fwereade: aren't you supposed to be doing celebratory chores?
<niemeyer> fwereade: Yeah, I wish we could do the same for upgrades actually
<fwereade> SpamapS, did some, now I can relax again ;p
<fwereade> niemeyer, upgrades aren't so simple, the upgradiness of a given service charm differs depending on mode
<niemeyer> fwereade: You mean force vs. non-force?
<fwereade> niemeyer, and what it's to
<fwereade> niemeyer, X is not an interesting request when we're already upgrading to X even if it's forced
<fwereade> nie sorry gtg for a while again
<niemeyer> fwereade: Please do go :)
 * niemeyer => lunch
<TheMue> niemeyer: a final propose of https://codereview.appspot.com/6596051/ for the error message of illegal firewall modes
<rogpeppe> niemeyer: any idea why this doesn't work? http://paste.ubuntu.com/1256219/
<niemeyer> rogpeppe: Because you're authenticating against the wrong database
<rogpeppe> niemeyer: ah, i wondered if it might be something like that
<niemeyer> rogpeppe: The "/dbname" at the end of the URL does the trick
<niemeyer> rogpeppe: That said, I don't think we want to use it like that, FWIW
<niemeyer> rogpeppe: Login is a lot easier
<rogpeppe> niemeyer: i don't think so either
<rogpeppe> niemeyer: but i wanted something to verify that AddUser was actually doing something
<rogpeppe> niemeyer: and in fact, for non-SSL connections, perhaps we do want to do it this way
<rogpeppe> niemeyer: Login... hmm, i started doing it that way, then thought this was easier!
<rogpeppe> niemeyer: SetPassword in state: https://codereview.appspot.com/6587060/
<niemeyer> rogpeppe: Reviewed
<rogpeppe> niemeyer: thanks
<niemeyer> rogpeppe: np
<rogpeppe> niemeyer: i'm not entirely sure how i can check that the presence database is authenticated. currently there's no difference between an authenticated and an unauthenticated connection AFAIK
<niemeyer> rogpeppe: Perhaps we should fix that first then?
<rogpeppe> niemeyer: i'm not sure i can do that without running mongo in auth mode, which comes in the next CL
<niemeyer> rogpeppe: Perhaps we should do that first?
<rogpeppe> niemeyer: hmm, yeah, we could do that
<niemeyer> rogpeppe: As it is the CL has a bug that went unperceived because we had no tests.. feels like a good reason to step back
<rogpeppe> niemeyer: i'm not entirely sure i understand how the auth mode thing works - the mongodb docs seem pretty sketchy
<rogpeppe> niemeyer: i don't think the bug comment was published
<niemeyer> rogpeppe: ?
<rogpeppe> niemeyer: i don't see any comment in your review that points out a bug
<rogpeppe> niemeyer: (apart from "Just needs a test to ensure the bug is fixed")
<niemeyer> rogpeppe: Try to add the test then :-)
<rogpeppe> niemeyer: ha
<rogpeppe> niemeyer: i just looked at the comment in context :-)
<niemeyer> rogpeppe: In terms of --auth, just running with --auth enables secure mode.. it allows connecting locally to the database before a user is added
<niemeyer> rogpeppe: Which is quite handy for what we want
<rogpeppe> niemeyer: ok, i'll make another CL that runs the testing mongodb server in auth mode.
<rogpeppe> niemeyer: once a user is added, does it stop any further unauthenticated local connections?
<niemeyer> rogpeppe: I think all we'd need is running the test mgo in auth mode, and use SetAdminPassword on it
<niemeyer> rogpeppe: Right
<rogpeppe> niemeyer: i don't see a SetAdminPassword method in mgo
<niemeyer> rogpeppe: That's about juju :-)
<rogpeppe> niemeyer: but we can add a user in state.Initialize
<rogpeppe> niemeyer: i'd thought we might add a password argument to state.Initialize
<niemeyer> rogpeppe: and then how do we connect?
<niemeyer> rogpeppe: Well, sorry, that's irrelevant
<rogpeppe> niemeyer: with Open as usual, no?:
<niemeyer> rogpeppe: Yeah
<niemeyer> rogpeppe: Either way, Initialize can be made in terms of SetAdminPassword..
<rogpeppe> niemeyer: seems reasonable
<rogpeppe> niemeyer: BTW in environs/cloudinit, i'm planning to write the entity name and password to $dataDir/admin/mongodb-auth
<rogpeppe> niemeyer: then the agents will read that and put the info into the state.Info before connecting
<rogpeppe> niemeyer: i've gotta go and cook up some wild mushrooms for tea :-)
<niemeyer> rogpeppe: Sounds fine, except a "/auth" file sounds enough. We need to consider how to handle the multiple agents too. $dataDir is shared, IIRC
<rogpeppe> niemeyer: ok. i wondered if there might be a problem with people reading the length of the file, given that $dataDir is globally readable.
<rogpeppe> niemeyer: for multiple agents, i'd thought we'd use agents/$entityname/auth
<niemeyer> rogpeppe: Not sure.. agents/ was going away, I think.. we'll have containers/ instead IIRC
<rogpeppe> niemeyer: i think i'll fold in the Initialize changes into the current CL otherwise we've got a small chicken/egg problem .
<niemeyer> rogpeppe: We must also think about how to avoid having the admin password exposed..
<rogpeppe> niemeyer: well, the same directory that's used for the agent anyway
<niemeyer> rogpeppe: I don't see the chicken/egg problem
<rogpeppe> niemeyer: i think it's got to sit in a file. as long as it's in a 700 file in a 700 directory, i don't think there's a problem
<niemeyer> rogpeppe: I mean how to get the admin password to be set in a way that doesn't leak it
<niemeyer> rogpeppe: and at the same time is known by the client
<rogpeppe> niemeyer: Initialize relies on SetPassword (well SetAdminPassword which is almost exactly the same thing), and that's what's in the current CL
<niemeyer> rogpeppe: That's how we end up with big unfocused branches
<niemeyer> rogpeppe: It doesn't really on unit.SetPassword or machine.SetPassword
<niemeyer> rely
<rogpeppe> niemeyer: yes, but the testing will be exactly the same (they both share tests)
<rogpeppe> niemeyer: as will the implementation
<rogpeppe> niemeyer: and we'll need the changes from the current CL that do the login too
<niemeyer> rogpeppe: What happens if you login with an invalid user?
<rogpeppe> niemeyer: you get an "auth failed" error
<niemeyer> rogpeppe: s.Session.Login("admin", "foo")
<niemeyer> rogpeppe: This means one can trivially implement SetAdminPassword, and test it
<rogpeppe> niemeyer: ok, so that's stage 1. the next stage needs to have all the other login stuff in. and that's the bulk of it. SetAdminPassword is about two lines of code.
<niemeyer> rogpeppe: Without Open, or machine.SetPassword, or unit.SetPassword, or Initialize, etc
<niemeyer> rogpeppe: No, it doesn't
<niemeyer> rogpeppe: The next stage can have just the changes to state.Open, I believe
<niemeyer> rogpeppe: Without anything else again
<rogpeppe> niemeyer: that's what i meant
<niemeyer> rogpeppe: No unit.SetPassword, no machine.SetPassword, etc
<rogpeppe> niemeyer: i didn't mean those as login stuff.
<rogpeppe> niemeyer: anyway, it's all pretty small. i don't think there's a danger (currently) of the branch getting too unfocused
<niemeyer> rogpeppe: That's how it starts every time
<niemeyer> rogpeppe: and that's the direction that it was going again
<niemeyer> rogpeppe: It's extremely rewarding to have small branches quickly integrated
<rogpeppe> niemeyer: i will try
<rogpeppe> niemeyer: BTW your admin password question is salient
<rogpeppe> niemeyer: i don't know how we can get the admin password to the bootstrap machine without putting it in cloudinit
<rogpeppe> niemeyer: and hence making it available forever
<niemeyer> rogpeppe: I think we can use a hash trick on the original password
<niemeyer> rogpeppe: hash it together with a well known token, and use that as the first password.
<rogpeppe> niemeyer: yeah, i suppose we could use a hash designed for passwords.
<niemeyer> rogpeppe: On first connection, replace with the actual password
<niemeyer> rogpeppe: That's not the issue
<niemeyer> rogpeppe: If we use a hash designed for passwords as the password, we'd still have the same issue
<rogpeppe> niemeyer: i realise we have to change the password on first connection
<rogpeppe> niemeyer: but we also need to use a hash designed for passwords, i think
<niemeyer> rogpeppe: Not super worried about it, as long as it's a cryptographic hash
<rogpeppe> niemeyer: otherwise someone can reverse the hash and connect as admin
<niemeyer> rogpeppe: None of the traditional hashing algorithms are reversible
<rogpeppe> niemeyer: i think it's important actually, if we're concerned about security from the bootstrap machine
<rogpeppe> niemeyer: they don't need to be reversible if the password isn't so strong
<niemeyer> rogpeppe: In fact, hashes are not reversible by definition.. they may have other weaknesses, but not reversibility
<rogpeppe> niemeyer: i should've said "crack" not "reverse"
<niemeyer> rogpeppe: It doesn't matter in this context
<rogpeppe> niemeyer: because they can't make that many connections to mongodb?
<niemeyer> rogpeppe: Sorry, I don't understand what problem you're trying to solve
<rogpeppe> niemeyer: i'm imagining someone malicious on the bootstrap machine. they have the hash of the password from cloud-init; now they're trying to log in as admin
<niemeyer> rogpeppe: Okay.. they have a hash based on the password.. so what?
<rogpeppe> niemeyer: they can go through a few hundred billion passwords quite easily; when they find one that hashes to the same thing, they have the admin password.
<rogpeppe> niemeyer: hence we slow down that process by using a hash designed for that purpose.
<rogpeppe> niemeyer: i really must go; i'll pop my head in a little later.
<niemeyer> rogpeppe: Enjoy
<niemeyer> I've got a doc appointment.. back soon
<fwereade> niemeyer, ping
<niemeyer> fwereade: you're probably off by now, but I just sent a question in the review
<niemeyer> fwereade: Have seen your email too, and pondering.. will reply later
<davecheney> state_test.go:250: c.Assert(current, DeepEquals, initial)
<davecheney> ... obtained map[string]interface {} = map[string]interface {}{"firewall-mode":"default", "name":"test", "development":true, "authorized-keys":"i-am-a-key", "default-series":"precise", "type":"test"}
<davecheney> ... expected map[string]interface {} = map[string]interface {}{"type":"test", "name":"test", "development":true, "authorized-keys":"i-am-a-key", "default-series":"precise"}
<davecheney> ^ when was firewall-mode added ?
<davecheney> niemeyer: https://bugs.launchpad.net/juju-core/+bug/1060509
<davecheney> can anyone else reproduce this, i'm getting it on two systems
<niemeyer> davecheney: It was added today
<niemeyer> davecheney: Well, yesterday from your perspective
<niemeyer> davecheney: Frank probably didn't run the full test suite
<davecheney> https://codereview.appspot.com/6590064 has the fix
<niemeyer> davecheney: LGTM
<davecheney> ta
<davecheney> what happened with rogers gnuflag patch ?
<davecheney> ok, the mysql charm is failing because flavor doesn't have a valid default
<davecheney> niemeyer: it looks like charm setting's defaults are broken
#juju-dev 2012-10-03
<niemeyer> davecheney: Quite possibly.. config needs love
<davecheney> niemeyer: basically it looks like the config-get command is returning null to the hook
<davecheney> not the default value
<davecheney> looking at the tests in cmd/jujuc/server/config-get_test.go
<davecheney> it looks like that is the cause
<davecheney> ie, none of the default options in the dummy charm are being reported by config-get
<niemeyer> davecheney: It's state that needs to change, both to support multiple charms and to integrate default handling
<niemeyer> davecheney: jujuc is merely reporting what it finds
<davecheney> riiiiiiiiiight
<davecheney> but isn't that logic in the charm package ?
 * davecheney raises an issue
<niemeyer> davecheney: No, the charm package manages charm bundles, directories, etc, and reports the defaults in the charm itself correctly
<niemeyer> davecheney: But state needs to incorporate those details
<davecheney> right
<davecheney> is there an issue reported for this already ?
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1060537
<davecheney> ^ will close if it is a duplicate
<davecheney> niemeyer: sorry to keep asking about this
<davecheney> but are you sure this is a problem in the state package
<davecheney> looking at state/charm.go, it imports charm, and includes a *charm.Config in the document
<niemeyer> davecheney: Look at Service.Config
<davecheney> niemeyer: thank you
<niemeyer> davecheney: np
<davecheney> niemeyer: ah, right
<davecheney> it just returns a *state.ConfigNode with no consideration for any defaults provided by the charm
<davecheney> niemeyer: cmd/juju/get.go:end
<davecheney> if this logic is correct, should I move it into state ?
<niemeyer> davecheney: We should implement this together with the multi-config handling that we talked about in the sprint
<niemeyer> davecheney: Since this will mean we'll have the charm at hand, among other things
<davecheney> you will have to refresh my memory
<davecheney> or, if others understand this better, i will move on to something else
<niemeyer> davecheney: There's some coverage of the problem in the list IIRC
<davecheney> ok, will re-read the archive
<niemeyer> davecheney: I thought you had talked to William during the sprint about this as well
<davecheney> yes, but it is not ringing any bells
<davecheney> possibly i have a different name for the problem
<niemeyer> davecheney: Okay, there was some miscommunication then.. this is what I've been asking you about for the past couple of weeks whenever I ask "How is the config stuff going" :)
<davecheney> definitely a miscommunication
<davecheney> i was working on the user side juju get
<niemeyer> davecheney: Cool, no worries
<niemeyer> davecheney: The basic idea, as you can dig in the archives for background, is that when we upgrade, the configuration has to be upgraded as well
<davecheney> right, yes, https://lists.ubuntu.com/archives/juju-dev/2012-September/000195.html
<niemeyer> davecheney: Because the new charm may have different options
<davecheney> incompatible charm upgrades
<niemeyer> davecheney: Then, we have an issue: what do we do during the upgrade
<niemeyer> davecheney: This isn't an atomic operation.. old charms cannot deal with new config
<niemeyer> davecheney: new charms cannot deal with old config
<niemeyer> davecheney: and they're both running
<niemeyer> davechen1y: Makes sense?
<davechen1y> niemeyer: yes, i think i understand the problem
<davechen1y> but the solution looks complicated
<niemeyer> davechen1y: Yeah, it requires some consideration indeed
<niemeyer> davechen1y: I suggest just leaving it aside for the moment then, and continuing the good progress
<davechen1y> niemeyer: understood
<niemeyer> davechen1y: It'll probably require more synchronization than we'll have available in the next couple of weeks due to the timezone differences, so it seems to make more sense for you to focus on tasks you're comfortable with the approach and direction
<davechen1y> yes, i agree
<davechen1y> i don't want to tackle that
<davechen1y> you and william are in the best position to handle it
<niemeyer> davechen1y: Alright, time for some sleep.. have a good day there
<Aram> moin.
<davecheney> rogpeppe: https://codereview.appspot.com/6598052/
<davecheney> LGTM
<davecheney> i've been using it all day, and no charms have barfed on it
<rogpeppe> davecheney: cool, thanks. i wanted an extra look because i made some significant changes after your previous review
<rogpeppe> davecheney: i was kinda hoping for a review from niemeyer, but i think i'll submit anyway
<rogpeppe> davecheney: i'm not too happy about the difference between -u=foo and --u=foo but that's GNU flags for ya
<davecheney> rogpeppe: what is the difference between -u=foo and --u=foo ?
<davecheney> maybe i'm a bsd zealot, but -u=foo isn't a valid syntax
<rogpeppe> davecheney: it is
<davecheney> although I realise that plan9/flag treat them as identical
<rogpeppe> davecheney: -u associates the value "=foo" with the u flag
<davecheney> ahhm yes
<rogpeppe> davecheney: --u associates the value "foo"
<davecheney> dont you also introduce things like
<davecheney> -flag=5
<davecheney> which is -f -l -a -g=5
<rogpeppe> davecheney: yeah, that works
<davecheney> and also
<davecheney> -uwang
<rogpeppe> davecheney: that works too
<davecheney> which could be -u -w -a -n -g, and also -u => wang
<davecheney> this shit is crazy
<rogpeppe> davecheney: yeah, you can't tell without looking at the flag
<rogpeppe> davecheney: it's a real pity that flag parsing is context-sensitive
<rogpeppe> davecheney: (that's true for go's flag syntax too, of course)
<davecheney> rogpeppe: if you are happy with what exists now, then i'd +1 for comitting it now so I can hammer on folks to go get -u their installs
<davecheney> we're at the point that if you squint, the mysql charm is usable
<rogpeppe> davecheney: FWIW plan 9's flag package didn't allow multi-letter flags at all...
<rogpeppe> davecheney: cool!
<rogpeppe> davecheney:  ok, will submit
<rogpeppe> davecheney: gustavo can moan later if he wants
<Aram> yeah, I very much dislike multi letter flags.
<rogpeppe> Aram: i'm not sure.
<rogpeppe> Aram: it depends how many flags you've got
<rogpeppe> Aram: i very much dislike unix-style commands that have many single-letter flags
<rogpeppe> Aram: of course, i've got used to ls by now :-)
<davecheney> rogpeppe: Aram it is what it is, whatever the charms expect, we have to support
<Aram> I dislike commands that have too many flags as well.
<rogpeppe> davecheney: submitted
<davecheney> bam!
<Aram> davecheney: of course.
<davecheney> Aram: but i share your feelings
<davecheney> python commands tend to be very verbose in their input
<rogpeppe> davecheney: did you get a t-shirt?
<davecheney> rogpeppe: i did indeed
<davecheney> a lovely green one
<rogpeppe> davecheney: me too. very tasteful.
<davecheney> i think you've seen niemeyer model it
<davecheney> w00t
<rogpeppe> davecheney: ah, you're right. i thought i'd seen it before.
<davecheney> coming to a google io near you
<Aram> where do you get these tshirts?
<davecheney> Aram: i think adg sent them
<davecheney> a long time ago, back in march from memory, they asked for a shipping address
<davecheney> unless this was an unrelated act of generosity
<Aram> send to whom? to an elite cabal? :)
<davecheney> Aram: i don't know who was on the list, i'll find out
<Aram> heh, Microsoft released the source of the Z3 theorem prover under a license that can be characterized as: "We couldn't figure out how to make money with it, but if you do, give it to me."
<davecheney> Aram: rofl
<davecheney> the 'come in spinner' licence
<rogpeppe> Aram, fwereade: how important do you think it is that state.Initialize is idempotent?
<fwereade> rogpeppe, I think it's important but my only strong motivation is an obscure sense of aesthetics
<rogpeppe> Aram, fwereade: 'cos the changes i'm making will mean that the second time you call it, it will give an error
<fwereade> rogpeppe, that sounds fine too, at least it's explicit :)
<rogpeppe> fwereade: cool
<davecheney> rogpeppe: tell me more about this
<davecheney> we do have explicit tests that assert that this is ok
<davecheney> however I can't think of a real world case where this wouldn't be an error
<rogpeppe> davecheney: we're going to run mongodb in auth mode
<rogpeppe> davecheney: Initialize is responsible for setting up the first user (admin)
<davecheney> i think the 'it's ok to double init' logic was part of the 'wait for init' logic
<davecheney> which probably makes little sense
<rogpeppe> davecheney: subsequent connections will need to provide authorization
<davecheney> rogpeppe: right, so first in sets the key
<rogpeppe> davecheney: so if you call Initialize twice, the second call won't have the correct authorization info, so it'll fail
<rogpeppe> davecheney: yeah
<davecheney> after that it's password or GTFO
<rogpeppe> davecheney: yup
<davecheney> i think that is a reasonable restriction
<davecheney> +1
<rogpeppe> davecheney: cool
<davecheney> urgh - i just ended up writing a test harness for ssh by hand
<davecheney> http://codereview.appspot.com/6601043/
<davecheney> kind wish we had gocheck for go.crypto packages
<davecheney> /s/ssh/sshd
<davecheney> certainly {SetUp,TearDown}Test
<rogpeppe> davecheney: i don't really understand why sshtest is a package. it doesn't seem to export any symbols.
<davecheney> i want the func tests to be in another package
<davecheney> because, honestly, they are going to fail more often
<davecheney> because of various environmental issues
<rogpeppe> davecheney: won't they be run exactly as often because people will always do test ./... ?
<davecheney> it's mainly for the builders
<davecheney> you make a good point that it is uncommon to split tests across two packages
<rogpeppe> davecheney: it seems odd to me, tbh
<davecheney> rogpeppe: noted
<rogpeppe> davecheney: at the very least there should be a package comment that says "this package consists of tests only"
<rogpeppe> davecheney: and i think there's no point in having non-test files. they're just confusing. i looked for a while to try to find what the package was about.
<davecheney> rogpeppe: good call
<davecheney> i will make that change
<Aram> rogpeppe: I don't care if state.Initialize is idempotent. I wish it would be, but I wouldn't mind if it is not.
<rogpeppe> Aram: i don't see any situation in which we want to call Initialize twice on the same mongo (apart from if we've reset everything in tests)
<niemeyer> Morning all!
<fwereade> niemeyer, heyhey
<andrewsmedina> niemeyer: morning :D
<niemeyer> fwereade, andrewsmedina: Heyas
<rogpeppe> niemeyer: yo!
<rogpeppe> niemeyer: new SetAdminPassword CL: https://codereview.appspot.com/6586070/
<niemeyer> rogpeppe: Heya
<niemeyer> rogpeppe: Sent a review
<rogpeppe> niemeyer: thanks
<niemeyer> rogpeppe: np
<rogpeppe> niemeyer: s.State.Service won't work as a check because s.State now has admin perms. but i guess if i did the SetAdminPassword in a different connection, that might work.
<rogpeppe> niemeyer: alternatively, perhaps i could produce a better error message from state.Open
<rogpeppe> niemeyer: that depends on the mgo errors though
<niemeyer> rogpeppe: We can just try to talk to the database then.. I think s.Session will not be authenticated
<rogpeppe> niemeyer: yeah, that'll probably work.
<niemeyer> rogpeppe: Otherwise, hmm
<niemeyer> rogpeppe: It actually sounds like a good idea to have a sane user message in the open case
<niemeyer> rogpeppe: Since that's really what people will see
<rogpeppe> niemeyer: that's what i thought.
<rogpeppe> niemeyer: it depends if there's a reproducible error code though
<niemeyer> rogpeppe: We can do that without introducing new logic
<rogpeppe> niemeyer: yeah, i'd look at the Code in the error returned from the same operation
<niemeyer> rogpeppe: Cool.. I don't recall what the code is, but hopefully there is a well defined on
<niemeyer> e
<rogpeppe> niemeyer: i'll have a look. this page is not great: http://www.mongodb.org/display/DOCS/Error+Codes
<niemeyer> rogpeppe: Easiest is to simply enable mgo debugging while running that test
<niemeyer> rogpeppe: Or just print out the error, realy
<niemeyer> really
<rogpeppe> niemeyer: i was gonna print the error %#v
<rogpeppe> niemeyer: done. https://codereview.appspot.com/6586070
<niemeyer> rogpeppe: Sorry, forgot to ping back.. that's ready
<niemeyer> Hmm.. getting spam from the Corinthia hotel in Lisbon.. how unfortunate
<niemeyer> Aram: Watcher is so much better, thank you
<Aram> niemeyer: thanks.
<Aram> indeed it is
<Aram> I'm doing the SUW now
<Aram> and SRW afterwards
<rogpeppe> niemeyer: i think we might need two admin accounts
<rogpeppe> s/accounts/users/
<rogpeppe> niemeyer: because there's a difficulty with our proposed method of sending the hash of the password - the remote side never gets to know the actual admin secret
<niemeyer> rogpeppe: In a call, but will be with you in a moment
<rogpeppe> niemeyer: np
<niemeyer> rogpeppe: SO
<rogpeppe> niemeyer: SO
<niemeyer> :)
<rogpeppe> niemeyer: i have a possible solution
<niemeyer> rogpeppe: I suspect we can handle that nicely by setting the password on the first connection, as we vaguely alluded to yesterday
<rogpeppe> niemeyer: once we've done that, how does the server side connect to the state?
<rogpeppe> niemeyer: (it doesn't know the admin secret)
<niemeyer> rogpeppe: And it shouldn't, I believe
<rogpeppe> niemeyer: indeed
<rogpeppe> niemeyer: so the question remains
<niemeyer> rogpeppe: That's the case we covered before we went into the admin password
<rogpeppe> niemeyer: to put it more clearly, perhaps: when the bootstrap machine reboots, what password does the PA use to reconnect to the state?
<niemeyer> rogpeppe: The machine agent should use the machine password
<rogpeppe> niemeyer: ah, ok, so nothing on the bootstrap machine needs access to admin privileges
<rogpeppe> niemeyer: that makes sense.
<niemeyer> rogpeppe: Yes, its password should give access to everything it needs
<rogpeppe> niemeyer: so... how would you feel about jujud bootstrap-state being responsible for starting the initial machine agent?
<niemeyer> rogpeppe: Of course, nowadays that's all mostly silly since passwords give access to the whole database, but once we front the API with a server, we'll be able to cut down privileges
<rogpeppe> niemeyer: it's not *that* silly - the admin account does have more privileges than most
<niemeyer> rogpeppe: I'd feel like that's unnecessary and redundant.. we have to handle the case of a machine getting its password anyway
<rogpeppe> niemeyer: yes, but someone has to create the password for the new machine
<rogpeppe> niemeyer: and i think that should probably be bootstrap-state
<niemeyer> rogpeppe: Yes, that's a good task for bootstrap-state..
<niemeyer> rogpeppe: After all, it creates the machine in the first place
<rogpeppe> niemeyer: so if bootstrap-state is writing stuff specifically for the machine agent, it would seem reasonable for it to actually start the agent too
<rogpeppe> niemeyer: so we haven't got too much action-at-a-distance going on
<niemeyer> rogpeppe: Don't see the leap
<niemeyer> rogpeppe: bootstrap-state has to create a machine and set its password, because it's bootstrapping
<niemeyer> rogpeppe: Every single time a machine is started, we need to run the machine agent.. that's not special
<niemeyer> rogpeppe: No reason to put special code inside the bootstrap-state command to do things that are common
<rogpeppe> niemeyer: it has to do more than set its password - it has to write a file containing the machine agent's password to disk, i think
<niemeyer> rogpeppe: I don't think so.. the machine agent itself can write its own file to disk
<rogpeppe> niemeyer: how does the machine agent find out its own password?
<niemeyer> rogpeppe: Like it generally would anyway
<niemeyer> rogpeppe: How will it find its own password when we're creating a machine that is not the bootstrap machine?
<rogpeppe> niemeyer: the PA writes a cloudinit that creates the file holding a random password
<niemeyer> rogpeppe: Why are we creating a file?
<rogpeppe> niemeyer: because we want the MA to be able to restart and find its key again
<rogpeppe> niemeyer: and we don't want the key in the upstart file
<niemeyer> rogpeppe: The machine agent itself can create its own file
<niemeyer> rogpeppe: Why not?
<rogpeppe> niemeyer: i guess it's no harm. i was thinking that we didn't want to have the client create the initial MA key, but perhaps it's ok. we're actually only vulnerable if we run units on the bootstrap machine, which we can't do until the MA has changed its password and started work.
<niemeyer> rogpeppe: Exactly
<rogpeppe> niemeyer: so we pass the initial password as a flag to the MA, which is ignored if the key file already exists.
<niemeyer> rogpeppe: and we need to hand off the password anyway in the general case, since otherwise the machine won
<niemeyer> 't be able to talk back
<niemeyer> rogpeppe: Yeah, --initial-password
<niemeyer> rogpeppe: So explicitly something that won't be around for long
<rogpeppe> niemeyer: i'm thinking there's no point in doing the two-stage thing for unit agents, but what do you think?
<niemeyer> rogpeppe: I think it wouldn't hurt to do the same mechanism, and may have relevant semantics in specific cases
<rogpeppe> niemeyer: ok
<niemeyer> rogpeppe: For example, it means the principal unit loses access to the subordinate unit after it's started
<rogpeppe> niemeyer: interesting point, yes
<rogpeppe> niemeyer: although i'm not sure you lose access to a connection if its password is changed.
<niemeyer> rogpeppe: This aspect will only be relevant once we have a frontend API instead of direct connection to the database.. right now the passwords will be interchangeable
<rogpeppe> niemeyer: here's a sketch of my understanding of this: http://paste.ubuntu.com/1258227/
<niemeyer> rogpeppe: On bootstrap-state, I suppose we should set the admin password first thing
<niemeyer> rogpeppe: What's pbkdf and what's the benefit of using it?
<rogpeppe> niemeyer: http://go.pkgdoc.org/code.google.com/p/go.crypto/pbkdf2
<niemeyer> rogpeppe: 2) I think we only need one initial password to bootstrap-state and the machine agent
<rogpeppe> niemeyer: i'm not sure that works
<rogpeppe> niemeyer: the benefit of using pbkdf2 is that noone can crack the password even if they've got access to the cloudinit file
<rogpeppe> niemeyer: ha, yeah, maybe you're right about needing only one password
<niemeyer> rogpeppe: You mean it'll be more expensive to crack, okay
<rogpeppe> niemeyer: yeah, like unfeasible expensive, i believe
<niemeyer> rogpeppe: You're slowing down the cost count-times, specifically.. that's different from infeasible
<rogpeppe> niemeyer: yeah. but if each guess is slowed down enough (imagine 1 second per test) it becomes infeasible for the indefinite future.
<rogpeppe> niemeyer: well, ok 5 years :-)
<niemeyer> rogpeppe: Cool, +1
<rogpeppe> niemeyer: cool
<rogpeppe> niemeyer: this is my latest version of the sketch: http://paste.ubuntu.com/1258245/
<niemeyer> rogpeppe: Let's embed the function instead of importing the package
<rogpeppe> niemeyer: why not import the package? it's supported by the core Go team
<niemeyer> rogpeppe: Because it means another external import and repository in exchange for 25 lines of code
<rogpeppe> niemeyer: ok. how about launchpad.net/juju-core/go.crypto/pbkdf2 ?
<rogpeppe> niemeyer: so the origin is obvious
<rogpeppe> niemeyer: and we can easily import other packages from go.crypto as necessary
<niemeyer> rogpeppe: juju-core/thirdparty/pbkdf2?
<niemeyer> rogpeppe: If we start to depend on more stuff from go.crypto, that'd justify using the external package itself
<rogpeppe> niemeyer: i'd prefer to keep go.crypto in the path (particular as pbkdf2 is such a non-obvious name)
<rogpeppe> niemeyer: but i don't feel too strongly
<niemeyer> rogpeppe: There's content inside it indicating its origin
<rogpeppe> niemeyer: ok
<niemeyer> rogpeppe: That also defines the pattern.. if we have other similar needs, or depend on poorly maintained external packages, we just stick them in there
<rogpeppe> niemeyer: how about we put a comment at the top of the file // Original at code.google.com/p/go.crypto/pbkdf2
<rogpeppe> ?
<niemeyer> rogpeppe: +1
<rogpeppe> niemeyer: so, does my final version of the sketch LGTY?
<niemeyer> rogpeppe: Sorry, haven't seen it..
 * niemeyer looks
<niemeyer> <niemeyer> rogpeppe: On bootstrap-state, I suppose we should set the admin password first thing
<niemeyer> rogpeppe: ?
<rogpeppe> niemeyer: i think it's good to set the admin password as the last thing actually
<rogpeppe> niemeyer: so that people can't connect until we've set up all the state we want to
<rogpeppe> niemeyer: it gives us an atomic initialize again, without needing to try
<niemeyer> rogpeppe: That's why I think we should set it first.. ?
<niemeyer> rogpeppe: Ah, no, I see..
<niemeyer> rogpeppe: Sounds good
<rogpeppe> niemeyer: cool.
<rogpeppe> niemeyer: anything else?
<niemeyer> rogpeppe: Yeah, we'll need to retry on unauth, I suppose
<rogpeppe> niemeyer: i don't think you can connect at all until SetAdminPassword is called
<niemeyer> rogpeppe: Have you tested that?
<rogpeppe> niemeyer: yeah, although not formally
<niemeyer> rogpeppe: Formally!? :-)
<rogpeppe> niemeyer: i didn't write a test for it
<niemeyer> rogpeppe: Have you run mongo with auth and tried to connect?
<rogpeppe> niemeyer: yes
<niemeyer> rogpeppe: Okay
<niemeyer> rogpeppe: What happens?
<rogpeppe> niemeyer: i got connection refused forever
<rogpeppe> niemeyer: well, for 10 minutes until the connect timed out
<rogpeppe> niemeyer: i was wondering about the other way around actually: does mongodb connect fail immediately if it gets an unauth error?
<niemeyer> rogpeppe: Interesting.. just did a trivial test and it connects fine..
<rogpeppe> niemeyer: we don't want the first connection to take 10 minutes because it's constantly retrying after it's been told it's unauthorized
<rogpeppe> niemeyer: from another machine?
<niemeyer> rogpeppe: and it says it's unauthorized
<niemeyer>         "$err" : "unauthorized db:test ns:test.test lock type:0 client:192.168.157.104",
<rogpeppe> niemeyer: that *was* from a different machine, right?
<rogpeppe> niemeyer: maybe my diagnosis of the failure before was wrong.
<niemeyer> rogpeppe: If I connected to the local address, it would work, no?
<rogpeppe> niemeyer: the auth access is based on your source address, i believe
<rogpeppe> niemeyer: but, yeah, i'd expect it to fail if you connected to 192.168... rather than 127.1
<niemeyer> rogpeppe: It doesn't fail, and you get an unauthorized
<rogpeppe> niemeyer: hmm, i wonder what was going on in my failed test then
<niemeyer> rogpeppe: Were you testing on EC2?
<rogpeppe> niemeyer: yes.
<rogpeppe> niemeyer: i was just running the usual ec2 live tests
<niemeyer> rogpeppe: Did you open the security group port?
<rogpeppe> niemeyer: i removed the --auth flag from environs/cloudinit and the live tests worked fine
<niemeyer> rogpeppe: Sorry, I don't see how that's related to the issue
<niemeyer> rogpeppe: live tests work fine if you remove --auth, because they currently work fine.. (?)
<rogpeppe> niemeyer: sure. but adding --auth made the tests hang for 10 minutes trying to connect
<rogpeppe> niemeyer: it didn't fail immediately with an unauth error
<niemeyer> rogpeppe: If you want to test the behavior of --auth, just fire a mongodb and try to connect to it.. inferring behavior because live tests broke anyhow isn't a good way to verify behavior
<rogpeppe> niemeyer: true nuff
<rogpeppe> niemeyer: i'll bootstrap an instance and play with it
<niemeyer> rogpeppe: I can test this as easily as running mongod, and trying to connect to the machine ip
<niemeyer> rogpeppe: (the machine ip, not 127.0.0.1)
<rogpeppe> niemeyer: i've only got one machine :-)
<niemeyer> rogpeppe: This test only needs a single machine
<rogpeppe> niemeyer: actually it occurs to me that the external connection will be connecting from 127.1 to 127.1 anyway, because we're using ssh port forwarding
<niemeyer> rogpeppe: Besides that, the scheme looks good
<niemeyer> rogpeppe: Still wanna talk about the location for the password, but that's a trivial detail for later
<rogpeppe> niemeyer: ok, cool
<niemeyer> rogpeppe: Thanks for the write up.. nice to debate like this
<rogpeppe> niemeyer: np. i needed it to get the pieces straight in my head.
<niemeyer> I'm heading to lunch, biab
 * niemeyer respawns
<niemeyer> Aram: ping
<Aram> niemeyer: pong
<niemeyer> Aram: Yo
<Aram> hi
<niemeyer> Aram: Just sent one last round.. please let me know if it sounds reasonable
<Aram> ok, reading now
<niemeyer> Aram: Just sent a last note regarding the copying.. we don't need to touch the database as I suggested before
<rogpeppe> niemeyer: the old SetPassword CL is back, with better tests. i have confidence that it's really working now :-) https://codereview.appspot.com/6587060
<rogpeppe> i'm off for the night. see y'all tomorrow!
<niemeyer> rogpeppe: Thanks, and have a good evening!
<niemeyer> rogpeppe: Awesome tests
<Aram> niemeyer: changed the watcher.
<Aram> I'm off now
<niemeyer> Aram: Have a good evening
<Aram> thanks
<Aram> niemeyer: oops, I fucked up something
<niemeyer> Aram: Ah, Alive I suppose
<Aram> yeah.
<Aram> that breaks two tests
<Aram> hmm
<niemeyer> Aram: Yeah, the original was good
<Aram> I'll revert it
<niemeyer> Aram: LGTM with that reverted and assuming all the tests pass
<Aram> thanks
<niemeyer> Aram: Thanks!
<niemeyer> fwereade_: ping
<fss> niemeyer: ping
<niemeyer> fss: Yo
<fss> niemeyer: I'm adding support for some IAM actions on goamz, would you be interested in merging it? :)
<niemeyer> fss: Definitely
<fss> niemeyer: cool. I will split the changes that I've already made and send them to you in parts
<niemeyer> fss: Brilliant, thank you
<fwereade_> niemeyer, pong
<niemeyer> fwereade_: Heya
<fwereade_> niemeyer, heyhey
<niemeyer> fwereade_: Ended up just sending a note in the review so you can process at a better time
<fwereade_> niemeyer, lovely, cheers, bit distracted tonight :)
<niemeyer> fwereade_: Nice :)
<fwereade_> OOPS: 102 passed, 2 FAILED, 3 FIXTURE-PANICKED, 26 MISSED
<fwereade_> --- FAIL: TestPackage (22.84 seconds)
<fwereade_> FAIL
<fwereade_> FAIL	launchpad.net/juju-core/state	22.885s
<fwereade_> niemeyer, is this known? ^^
<fwereade_> niemeyer, the first panic is "need to login"
<fwereade_> niemeyer, that is on trunk, and matches what I get on my branch; my changes merge cleanly with whatever broke, AFAICT, so I don't think I'm making anything worse by submitting :)
<niemeyer> fwereade_: Oh nos
<niemeyer> fwereade_: Hold on
<fwereade_> niemeyer, sorry, I think it has landed already
<fwereade_> niemeyer, bad call? :(
<niemeyer> fwereade_: It's just good to confirm what is actually breaking, and whether your own branch passes cleanly without the breakage
<niemeyer> fwereade_: I'm investigating.. it's probably the admin password logic
<niemeyer> I wonder how rogpeppe didn't get this
<fwereade_> niemeyer, ah sorry -- ISTM that clean tests before merge + failures identical with trunk after merge = independent logic
<fwereade_> niemeyer, but I bet I could construct a suitably fiendish counterexample
<niemeyer> fwereade_: Just running tip with his changes reverted
<niemeyer> tests on tip
 * fwereade_ frets, hopefully baselessly
<niemeyer> fwereade_: Yeah, all good
<fwereade_> niemeyer, cool
<niemeyer> So, let me see if I can fix so I don't have to revert
<niemeyer> fwereade_: This may be a difference between our mongod version and rogpeppe's
<fwereade_> niemeyer, ahh
<niemeyer> Nasty
<niemeyer> I've heard there's some improvements coming around auth in 2.4.. I hope that's the case, because it's a seriously half-baked story ATM
<niemeyer> fwereade_: Fixed.. proposing in a mom
<niemeyer> fwereade_: https://codereview.appspot.com/6602048
 * fwereade_ looks
<fwereade_> niemeyer, LGTM
<niemeyer> fwereade_: Thanks
<fwereade_> niemeyer, np :)
 * niemeyer steps out to meet friends
#juju-dev 2012-10-04
<davecheney> what does bzr mean when it returns error code 3 ?
<davecheney> rogpeppe: i think i have a solution to make the ssh tests much cleaner
<davecheney> if you want it
<davecheney> -- but i know you're busy tearing that out
<rogpeppe> fwereade_: morning!
<fwereade_> rogpeppe, heyhey
<fwereade_> rogpeppe, how's it going?
<rogpeppe> fwereade_: sorry about trunk breakage last night
<rogpeppe> fwereade_: i didn't realise i had an old version of mongodb
<fwereade_> rogpeppe, no worries
<rogpeppe> fwereade_: not bad at all, thanks.
<rogpeppe> fwereade_: and you?
<fwereade_> rogpeppe, yeah, not bad :)
<TheMue> morning
<rogpeppe> fwereade_: i'm adding admin-secret to the environment configuration, and i *think* it should be part of config.Config, not specific to the ec2 config. it's an interesting case though, because it's *really* secret - it doesn't get pushed with the other secrets.
<rogpeppe> fwereade_: so i'm not sure whether to have a "ReallySecretAttrs" method in environs, or just to make a special case for admin-secret
 * fwereade_ thinks
<fwereade_> morning TheMue btw
<TheMue> fwereade_: heya
<fwereade_> rogpeppe, sorry, no strong feelings either way, just a sense of unease
<rogpeppe> fwereade_: yeah, special cases make me uncomfortable too
<rogpeppe> TheMue: hiya
<TheMue> rogpeppe: a wonderful morning to you too (even if it is raining here) ;)
<rogpeppe> TheMue: beautifully sunny this morning. (sun just up)
<TheMue> rogpeppe: we had a nice october so far until yesterday. since then mostly rain, too much rain
<fwereade_> rogpeppe, wow, massive filter change works as expected; simplifies uniter noticeably; but pushes the tests up over a minute :/
<rogpeppe> fwereade_: cool, but...
<rogpeppe> fwereade_: what's taking all the time in the tests, out of interest?
<fwereade_> rogpeppe, thus far I know not, I will do some poking around
<rogpeppe> fwereade_: worth doing, i think.
<rogpeppe> fwereade_: go test -gocheck.vv | timestamp is always a good start
<fwereade_> rogpeppe, some of it is probably attributable to just doing more work more often, but it's a bit of a big change for that I think
<rogpeppe> fwereade_: a small CL: https://codereview.appspot.com/6589072/
<rogpeppe> fwereade_: for context, here's the authentication sketch: http://paste.ubuntu.com/1259533/
<rogpeppe> davecheney: it would be good to have some feedback from you about the authentication scheme too, if poss
<davecheney> rogpeppe: sure thing
<davecheney> rogpeppe: did you see my comment about improving the ssh tests ?
<rogpeppe> davecheney: i did
<davecheney> although I realise the horse has bolted
<rogpeppe> davecheney: yeah, no point in flogging that one
<davecheney> rogpeppe: right-o
<fwereade_> rogpeppe, so --initial-password will be ignored if a password file exists
<rogpeppe> davecheney: i'm interested to know what your plan was though
<rogpeppe> fwereade_: yes
<davecheney> rogpeppe: http://codereview.appspot.com/6601043/diff/3011/ssh/sshtest/sshtest_unix_test.go
<davecheney> ~ line 150
<rogpeppe> fwereade_: well actually, it might not be
<davecheney> rogpeppe: which CL is the authn Cl ?
<rogpeppe> fwereade_: if we fail to connect with the password, we'll try initial-password
<davecheney> the paste ?
<rogpeppe> davecheney: yeah
<davecheney> kk
<rogpeppe> fwereade_: because we might have written the password file but failed to change it
<fwereade_> rogpeppe, ah, sensible
<davecheney> rogpeppe: i'm not sure how helpful this is
<davecheney> but the primary customer inside canonical is Elmo
<davecheney> so if he doesn't like the smell of this
<davecheney> irrespective of its other merits
<davecheney> it's game over
<davecheney> not saying he won't like it, or that what you have is not correct
<fwereade_> rogpeppe, CL LGTM, I will try to get myself into a suitably adversarial mode before tackling the auth overview
<rogpeppe> davecheney: i'm not sure who Elmo is
<davecheney> me neither
<davecheney> but I hear he's the big cheese of the internal sysadmin team
<rogpeppe> davecheney: ok, i'll make a write-up and put it on juju-dev
<davecheney> rogpeppe: that sounds like an excellent plan, then others can distribute as necessary
<davecheney> rogpeppe: my only comment of note, is the storing of the key on disk per machine agent
<davecheney> i don't have a solution to this
<rogpeppe> davecheney: indeed. we need to store it somewhere
<davecheney> only observe that others will see it as a potential loophole
<rogpeppe> davecheney: we must be able to connect after reboot
<davecheney> yeah,
<davecheney> is there a concept of differing levels of privilege ?
<rogpeppe> davecheney: ssh has exactly the same issue
<davecheney> rogpeppe: yup, it sure does
<rogpeppe> davecheney: in fact any autonomous agent must have the same issue
<rogpeppe> davecheney: the only solution is authenticated h/w
<rogpeppe> davecheney: which we don't have.
<davecheney> rogpeppe: and i'm not sure if that would actually solve the problem
<davecheney> the issue, as i understand it is
<davecheney> user X on machine Y can get root, then get whatever details they need to connect to the state, rip off the AWS keys ..
<davecheney> is that correct ?
<rogpeppe> davecheney: if there was a way of propagating a secret from bootstrap stage to the agent itself, then we could use the secret, then destroy it. then even if you were root, you couldn't get it.
<rogpeppe> davecheney: but of course the agent needs to keep the secret around, so even then we're vulnerable
<davecheney> rogpeppe: the secrets are in the /e document in the /e collection, right ?
<rogpeppe> davecheney: it's a pity everyone has root access
<rogpeppe> davecheney: no
<rogpeppe> davecheney: these secrets are not
<davecheney> rogpeppe: but the AWS creds we're trying to protect are
<rogpeppe> davecheney: yes, they are currently
<rogpeppe> davecheney: but they won't be when we use this scheme to leverage principal-specific access controls
<davecheney> so, if by some mech, the /e document could be protected from access by the machine agent, would that be a solution ?
<rogpeppe> davecheney: it's not a solution to malicious entities on the machine being able to impersonate the machine agent
<rogpeppe> davecheney: but that is necessary too, yes
<davecheney> rogpeppe: is there a spec for the security model ?
<rogpeppe> davecheney: no
<rogpeppe> davecheney: sigh
<davecheney> rogpeppe: i'm not sure how to proceed without this
<davecheney> at best you'll implement whatever is inside gustavo's head
<davecheney> and at worst, you won't
<davecheney> and neither case may be what customers want
<rogpeppe> davecheney: i see what we're doing here as a necessary prelude to implementing the final security model, which is unspecified
<rogpeppe> davecheney: we're adding the notion of a principal to the state info, which i think is always going to be necessary.
<davecheney> rogpeppe: i'm concerned it is equiv to starting to walk in an unspecified direction, without picking a destination
<davecheney> rogpeppe: please understand, i'm not having a go at your solution,
<rogpeppe> davecheney: ok, the basic security model, as i understand it is:
<davecheney> you know my pickyness for implementing security without a spec
<rogpeppe> davecheney: agents identify themselves to the state; the state allows agents to agent-specific things.
<rogpeppe> s/to agent/to do agent/
<davecheney> rogpeppe: but there is an unspoken requirement that agents cannot be impersonated
<rogpeppe> davecheney: yes... well, kinda.
<rogpeppe> davecheney: damage limitation
<rogpeppe> davecheney: we don't want any random non-root user on a machine to be able to impersonate that machine's machine agent.
<rogpeppe> davecheney: but if you're root, you're going to be able to do what you damn please
<davecheney> rogpeppe: then putting the per machine agent password in a 0600 file would work
<davecheney> but i think further consultation with the customer is needed
<rogpeppe> davecheney: yes, that's what we're doing
<davecheney> i'm pretty sure that someone is going to say 'but what if they get root'
<rogpeppe> davecheney: there's nothing we can do in that case.
<davecheney> yup
<rogpeppe> davecheney: the trickiness in the spec i pasted above is because there's no way of passing a secret to the initial machine agent that's not accessible by non-root users.
<rogpeppe> davecheney: hence --initial-password
<davecheney> but that is probably going to mean elmo rejects the idea, and we've done a lot of work for nothing
<rogpeppe> davecheney: he can't ask the impossible
<davecheney> he's a very powerful customer
<rogpeppe> davecheney: this is significantly better than what we had before - entities get permissions on a need-to-have basis.
<rogpeppe> davecheney: sure, but we're talking *impossible* here
<rogpeppe> davecheney: no one else will be able to do better
<davecheney> rogpeppe: i never said the customer was rational :)
<rogpeppe> davecheney: also, you won't be able to do much by impersonating a machine agent, even if you are root.
<rogpeppe> and malicious
<rogpeppe> davecheney: in fact the machine agent doesn't need to be able to write to the state at all
<rogpeppe> davecheney: you can do a little more by impersonating a unit agent, but again, not too bad, i think.
<rogpeppe> davecheney: this all assumes an entity-aware API of course.
<rogpeppe> davecheney: the thing i'm most concerned with is man-in-the-middle attacks. i don't see how we can protect against those unless we have some kind of key-distribution scheme.
<davecheney> rogpeppe: then why does the MA need a password at all ?
<davecheney> if we treat the machine and the machine agent as untrusted
<davecheney> you can ignore their credentials
<rogpeppe> davecheney: we want to partition machines
<davecheney> rogpeppe: can we trust the LXC security boundary ?
<rogpeppe> davecheney: and i don't think we can entirely assume that non-root users can always obtain root.
<rogpeppe> davecheney: apparently not. for root users within LXC anyway.
<davecheney> rogpeppe: if that is a working assumption, then the job is a lot easier
<rogpeppe> davecheney: i think our advice should be (as per usual) don't run untrusted stuff as root.
<davecheney> seconded
<rogpeppe> davecheney: this means we've essentially got a two-tier security model. primary layer: TLS-based authentication; secondary layer: entity name/password authentication
<rogpeppe> davecheney: we assume that a malicious root user can bypass the secondary layer, but not the primary layer.
<davecheney> rogpeppe: is the tls layer using client side certs ?
<rogpeppe> davecheney: it'd better!
<davecheney> otherwise it isn't a security authn mech :)
<rogpeppe> davecheney: indeed
<rogpeppe> davecheney: and vulnerable to man-in-the-middle too
<rogpeppe> davecheney: we need a way of securely passing a cert to a new machine
<rogpeppe> davecheney: ISTR that amz supports this
<rogpeppe> davecheney: dunno about others
<davecheney> rogpeppe: in the puppet model the client generates the cert and sends it to the server for signing
<rogpeppe> davecheney: how does the server know where the cert is coming from?
<davecheney> the admin is expected to manage that out of band
<davecheney> just like gpg
<rogpeppe> davecheney: we can't do that - it's all autonomous
<davecheney> ie, if you weren't expecting to see a cert request, then don't sign it
<davecheney> of course, most envs turn on automatic cert signing
<rogpeppe> davecheney: of course.
<rogpeppe> davecheney: i don't even see how an admin can know
<davecheney> security is hard, shall we go shopping ?
<rogpeppe> davecheney: ooh shiny
<davecheney> rogpeppe: generally you install machine x, install puppet, then tell it to join your puppet server
<davecheney> go to the server, accept the request
<davecheney> then profit
<rogpeppe> davecheney: what if someone got in there between the install and going to the server?
<rogpeppe> davecheney: of course, it may be improbable, but...
<davecheney> puppet assumes you control the security of your environment
<davecheney> it's a cfg management tool
<davecheney> and there is a reason juju exists :)
<davecheney> rogpeppe: my friends that work in hosting companies
<davecheney> run one puppet server per customer
<rogpeppe> davecheney: the way i did this when we had a similar thing was i installed manually on a machine, including a private key.
<davecheney> the idea of a single puppet instance for all customers is impractical
<rogpeppe> davecheney: then when the machine dials in, you *know* that it's the right machine.
<rogpeppe> davecheney: but that assumes a manual install, of course.
<davecheney> rogpeppe: yup
<rogpeppe> davecheney: which is why you have to assume that the cloud provider can do something similar
<davecheney> my friends in hosting companies use vlans and shit
<davecheney> to separate customers environments
<rogpeppe> davecheney: worst comes to worst you're vulnerable to mitm, but if the hosting co is compromised, you're fucked anyway
<rogpeppe> davecheney: "For Linux instances, you can provide an optional key pair ID in the launch request (created using the CreateKeyPair or ImportKeyPair operation). The instances will have access to the public key at boot. You can use this key to provide secure access to an instance of an image on a per-instance basis. Amazon EC2 public images use this feature to provide secure access without passwords."
<rogpeppe> davecheney: i *think* we can leverage that
<davecheney> rogpeppe: ooooooooooh
<davecheney> but, are we likely to end up in the same 'too many firewall groups' quagmire ?
<rogpeppe> davecheney: i don't think there should be a problem creating a keypair per machine, but i may well be wrong :-)
<davecheney> another stupid amazon limitation
<davecheney> and you'll probably get asked how to do it in the openstack/azure/hp world
<rogpeppe> davecheney: the main problem is that this mechanism is designed for allowing you to connect to a new machine securely, not the other way around.
<davecheney> mmm
<rogpeppe> davecheney: i think we'd need to create a cert based on a hash of the key pair's public key or something.
<rogpeppe> davechen1y: actually, i think we could probably go quite a long way by moving the environ config into a separate database.
<rogpeppe> davechen1y: then we could at least restrict access to the AWS keys without needing an API
<davechen1y> rogpeppe: yes
<davechen1y> which sounds like the juju-as-a-service plan
<davechen1y> customers have MA's, we run the PA
<rogpeppe> davechen1y: yeah, that's certainly part of it
<rogpeppe> davechen1y: by separate database, i didn't mean a separate mongo server, as it happens
<rogpeppe> davechen1y: i meant a separate mgo.Database
<rogpeppe> davechen1y: (we already use two)
<davechen1y> rogpeppe: yup
<davechen1y> i thought we did that already
<davechen1y> or is that just different connections, same db ?
<rogpeppe> davechen1y: same connection, different dbs
<rogpeppe> davechen1y: but each db has its own set of users
<rogpeppe> davechen1y: did you see this CL BTW? https://codereview.appspot.com/6587060
 * davechen1y looks
<davechen1y> rogpeppe: fwereade_ trivial: http://codereview.appspot.com/6601056/
<davechen1y> looking for a LGTM
 * fwereade_ looks
<rogpeppe> davechen1y: LGTM, although i suppose the question is: why --format and not something else?
<davechen1y> rogpeppe: this is the thing that broke
<davechen1y> and it's intended to get people to update their gnuflag instance
<rogpeppe> davechen1y: fair enough.
<rogpeppe> davechen1y: it's gotta be somewhere; why not there?
<rogpeppe> davechen1y: (rhetorical question)
<fwereade_> davechen1y, LGTM
<davechen1y> thanks folks
 * fwereade_ has made the uniter tests fast again by fixing a huge obvious repeated 500ms sleep
 * fwereade_ knows that has been there for ages
 * davechen1y applauds
 * rogpeppe applauds loudly
<rogpeppe> fwereade_: duration of uniter tests now?
<fwereade_> rogpeppe, back to ~45s
<rogpeppe> fwereade_: hmm, still slow then
<fwereade_> rogpeppe, well, yeah, the bit I fixed was pre-existing, so there's probably another 20s to be extracted somewhere, but it's really not obvious
<rogpeppe> fwereade_: 20s shorter would be much more reasonable... but i know the feeling.
<rogpeppe> davechen1y: here's a draft of the first part of a heads-up email: http://paste.ubuntu.com/1259631/
 * davechen1y reads
<davechen1y> i don't see why we need both certs and usernames/passwords
<davechen1y> certs can already be associated with a principal
<rogpeppe> davechen1y: in mongodb?
<rogpeppe> davechen1y: (i tend to agree - i feel that passwords are a bit retro)
<fwereade_> rogpeppe, davechen1y: I need to pop out for a while, but I have this: https://codereview.appspot.com/6588053
<fwereade_> rogpeppe, davechen1y: still WIP
<davechen1y> rogpeppe: not sure how mongo does it
<fwereade_> rogpeppe, davechen1y: but it incorporates some significant changes after niemeyer's suggestions in various places
<rogpeppe> fwereade_: will have another look
<davechen1y> but if you have a TLS cert, then you can request client authentication (ie, they need a sub cert signed by the same CA that issued your cert)
<fwereade_> rogpeppe, davechen1y: and if you have time to cast a quick eye over it for general sanity that would be great
<davechen1y> or the TLS handshake fails
<fwereade_> rogpeppe, issues like "all the filter tests are broken" are not what I'm looking for ;p
<fwereade_> bbiab
<rogpeppe> davechen1y: i'm wondering how/whether mongo converts client certs into mongo users
<davechen1y> sounds complicated
<davechen1y> i reckon drop it
<davechen1y> just use TLS for a secure channel to transmit creds over
<rogpeppe> davechen1y: so don't connect direct to mongo?
<rogpeppe> davechen1y: but use a forwarder?
<rogpeppe> davechen1y: i don't know if that's easy
<rogpeppe> (although at least the mongo client is written in Go)
<rogpeppe> oh why, oh why are certificates so horribly useless in this world?
<davechen1y> and hard, don't forget that
<rogpeppe> davechen1y: indeed
<rogpeppe> davechen1y: so unnecessary
<rogpeppe> davechen1y: it looks to me as if mongodb can't do client-certificate verification
<davechen1y> scratch that
<Aram> yo.
<rogpeppe> Aram: hiya
<davecheney> rogpeppe: fwereade_ : comments ? https://codereview.appspot.com/6591080
<rogpeppe> davecheney: looks reasonable to me, but i'm not really familiar with the issue, i'm afraid
<rogpeppe> davecheney: "I want to ask that it be accepted." - you can always just *ask*, y'know :-)
<Aram> rogpeppe: what mongodb version do you use?
<rogpeppe> Aram: i've just started using a different version
<rogpeppe> Aram: i was using... 2.0.3 i think
<Aram> aha.
<rogpeppe> Aram: now i've downloaded the version we use on ec2 and am using that
<Aram> I'm kind of nervous that behavior changes so often between versions so close to each other.
<rogpeppe> Aram: me too. it's a pretty shitty thing to get wrong.
<davecheney> slow cloud-init, is slow
<davecheney> Aram: rogpeppe: i think we should switch to using the version from the public bucket, exclusively
<rogpeppe> davecheney: and download it each time we run a test?
<davecheney> no, i'm sure there is a way to avoid that expense
<rogpeppe> davecheney: it's important that we be able to run tests on an aeroplane too.
<davecheney> rogpeppe: i think you're reading too much into my suggestion
<rogpeppe> davecheney: i suppose we could check the mongodb binaries into the repo
<davecheney> i'm just thinking of unpacking the version into somewhere inside the juju-tree
<davecheney> then just call that path directly
<davecheney> if it's there 'win'
<davecheney> if not, fail
 * davecheney wishes we had juju destroy-service
<rogpeppe> davecheney: is there no such command in the original juju?
<davecheney> yeah, but we don't have it in cmd/juju yet
<davecheney> makes testing harder :)
<rogpeppe> davecheney: if we don't check it into the repo (and i'm not sure we want to clutter the repo with an 8MB binary) i'm not sure that running from a different path buys us that much
<davecheney> rogpeppe: we don't have to check it in
<rogpeppe> davecheney: we'd be better off running mongod --version to check that we get the expected version
<rogpeppe> davecheney: if we don't check it in, then we have the same problem of possible version skew
<davecheney> just change the mongo tests to call to a specific path, one where we have already downloaded the mongodb version, rather than just calling any mongod in the path
<rogpeppe> ha, it looks like niemeyer didn't build mongod with ssl support
<Aram> I thought that building with SSL was the reason why we needed to build it.
<davecheney> rogpeppe: crap -- that was the _ENTIRE_ reason for cmd/builddb
<rogpeppe> Aram: me too
<rogpeppe> davecheney: try mongod --help 2>&1 | grep -i ssl
<davecheney> % juju deploy couchbase
<davecheney> error: cannot assign unit "cf-mongodb/0" to machine: cannot assign unit "cf-mongodb/0" to machine 8: duplicate key insert for unique index of capped collection
<rogpeppe> davecheney: hmm, it *looks* as if builddb builds it with ssl
<rogpeppe> i wish we prefixed our errors with "juju: " rather than "error: ". i've been meaning to fix that for ages.
<davecheney> here is some good news --> http://paste.ubuntu.com/1259806/
<rogpeppe> davecheney: woo!
<rogpeppe> fwereade_: check it out!
<davecheney> and most of them worked !
<rogpeppe> davecheney: that's almost like a real installation :-)
<davecheney> i don't think I can start any more machines
<davecheney> amazon will chide me
<rogpeppe> davecheney: the security description gets longer (still haven't got to the bit that was the whole point yet though!) http://paste.ubuntu.com/1259811/
<davecheney> the buildbot charm failures, something happened with apt, it wasn't us
<TheMue> davecheney: nice environment ;)
<davecheney> % juju ssh ceph/0 -- -t 'less /var/log/juju/unit*'
<davecheney> nice
<rogpeppe> davecheney: what does -t do?
<davecheney> tells ssh'd to allocate a pty
<davecheney> normally if you do ssh host /some/command
<davecheney> no pty is allocated
<rogpeppe> davecheney: ah of course
<davecheney> which sucks
<davecheney> ssh cmd does a lot of things for you
<rogpeppe> davecheney: no, that's a good thing :-)
<davecheney> yeah, but then you have to figure out why it is how it is
<rogpeppe> davecheney: for most commands a pty gets in the way
<davecheney> true
<davecheney> so, the ceph charm is flat broke, not our fault
<rogpeppe> davecheney: paste the log?
<davecheney> two secs
<davecheney> most of them are missing debs
<davecheney> rogpeppe: http://paste.ubuntu.com/1259818/ << ceph
<rogpeppe> davecheney: i wonder what radosgw is and if it was installed by default before.
<davecheney> 2012/10/04 11:35:31 JUJU HOOK ERROR: command: cluster-init: 10.190.42.228:8091, [Errno 111] Connection refused
<davecheney> 2012/10/04 11:35:32 JUJU HOOK + /opt/couchbase/bin/couchbase-cli bucket-create -c 10.190.42.228:8091 -u Administrator -p administrator --bucket=jienaigo --bucket-type=couchbase --bucket-password= --bucket-ramsize=1607 --bucket-replica=1
<davecheney> 2012/10/04 11:35:32 JUJU HOOK ERROR: command: bucket-create: 10.190.42.228:8091, [Errno 111] Connection refused
<davecheney> ^ couchbase
<davecheney> i wonder if some charms install the py juju tools by accident ...
<davecheney> rogpeppe: re your email
<davecheney> drop the bit about a machine cert
<davecheney> i thought we weren't/couldn't do that
<rogpeppe> davecheney: i think we have to do that, somehow.
<rogpeppe> davecheney: although we can't currently.
<davecheney> rogpeppe: yup, good point
<rogpeppe> davecheney: one way of doing it is to have a server that exchanges one-time tokens for certificate signing.
<rogpeppe> davecheney: then we can pass a one-time token into cloudinit
<rogpeppe> davecheney: the machine agent leverages that to get its own certificate signed.
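The token-for-certificate exchange rogpeppe sketches could be modelled like this. This is a hypothetical sketch of the scheme, not juju code: `tokenStore` and its methods are invented names, and a real implementation would sign an actual certificate request on redemption.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

// tokenStore models the one-time-token service: bootstrap issues a
// token into a new machine's cloudinit data; the machine agent later
// redeems it exactly once to get its own certificate signed.
type tokenStore struct {
	mu     sync.Mutex
	tokens map[string]bool
}

func newTokenStore() *tokenStore {
	return &tokenStore{tokens: make(map[string]bool)}
}

// Issue mints a fresh one-time token to embed in cloudinit.
func (s *tokenStore) Issue() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	tok := hex.EncodeToString(buf)
	s.mu.Lock()
	s.tokens[tok] = true
	s.mu.Unlock()
	return tok, nil
}

// Redeem consumes a token; a second redemption of the same token
// fails, so a leaked cloudinit script cannot be replayed to obtain
// extra certificates.
func (s *tokenStore) Redeem(tok string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if !s.tokens[tok] {
		return errors.New("invalid or already-used token")
	}
	delete(s.tokens, tok)
	return nil
}

func main() {
	s := newTokenStore()
	tok, _ := s.Issue()
	fmt.Println(s.Redeem(tok) == nil, s.Redeem(tok) == nil) // true false
}
```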
<davecheney> the couchdb charm requires a ppa which is broken
<rogpeppe> davecheney: i wonder what your duplicate key insert problem was about
<davecheney> mongo bug
<davecheney> happens a lot
<davecheney> niemeyer has raised a bug upstream
<davecheney> couchdb
<davecheney> 2012/10/04 11:37:39 JUJU HOOK  * Starting database server couchdb
<davecheney> 2012/10/04 11:37:39 JUJU HOOK    ...done.
<davecheney> 2012/10/04 11:37:39 JUJU hook failed: exit status 1
<davecheney> 2012/10/04 11:37:39 JUJU reading uniter state from disk...
<davecheney> nice
<Aram> yeah, mongodb is... less than stellar.
<rogpeppe> Aram: it's better than zk though
<davecheney> 2012/10/04 11:37:32 JUJU HOOK + sed -e 's/^STARTDISTCC=.*/STARTDISTCC="true"/' -i /etc/default/distcc
<Aram> the products are so different they can't be compared like that.
<davecheney> 2012/10/04 11:37:32 JUJU HOOK + '[' -x /usr/bin/open-port ']'
<davecheney> 2012/10/04 11:37:32 JUJU hook failed: exit status 1
<Aram> they solve a diferent problem.
<davecheney> ^ hard coded tools
<rogpeppe> Aram: you're probably right.
<rogpeppe> Aram: from my brief look at the mongo source earlier today, i wasn't enormously overjoyed.
<Aram> oh, it's bad, I had to look through it when solving various quirks and bugs.
<davecheney> 2012/10/04 11:39:15 JUJU HOOK ldconfig deferred processing now taking place
<davecheney> 2012/10/04 11:39:17 JUJU HOOK install: cannot stat `files/php/php_conf.d_apc.ini': No such file or directory
<davecheney> 2012/10/04 11:39:17 JUJU hook failed: exit status 1
<davecheney> 2012/10/04 11:39:17 JUJU reading uniter state from disk...
<davecheney> ^ drupal6 expects a file not owned by a deb that it installed
<rogpeppe> Aram: i'd forgotten that people still like ifdefs. ugh.
<davecheney> so, short summary, 19 charms, 9 working
<davecheney> none are our fault (directly)
<Aram> open source stuff is plagued by ifdefs.
<davecheney> someone please pass on to mramm and gustavo
<davecheney> i'm off to bed
<rogpeppe> davecheney: good work, man
<rogpeppe> davecheney: enjoy your rest
<davecheney> no, congratulations to all of you
<davecheney> with the exception of juju-log -l
<rogpeppe> davecheney: what was the issue with that?
<davecheney> i haven't found a charm that is broken because we are incompatible with py juju
<davecheney> rogpeppe: we didn't support -l $LEVEL
<rogpeppe> davecheney: ah
<rogpeppe> davecheney: well, gnuflag was broken too...
<davecheney> rogpeppe: https://codereview.appspot.com/6584069
<davecheney> % juju destroy-environment
<davecheney> error: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details. (SignatureDoesNotMatch)
<davecheney> FUCK YOU AWS
<davecheney> not tonight
<davecheney> not with 20 machines running
<davecheney> it always does this when I have more than a few machines running
<rogpeppe> davecheney: hmm, that might be our bug, i suppose
<rogpeppe> davecheney: just got to the aws console :-)
<rogpeppe> s/got/go/
<davecheney> right, i have ceased hemorrhaging money
<rogpeppe> davecheney: i hope you claim it back on expenses
<rogpeppe> davecheney: (i've been a bit crap at doing that recently)
<davecheney> night all
<rogpeppe> davecheney: but i had $200 bill last month, so it's worth doing
<bigjools> fwereade_: howdy, can I bug you for a bit please? got a vexing problem with juju
<fwereade_> bigjools, heyhey
<bigjools> how's it going?
<fwereade_> bigjools, ah, not bad thanks, not sure if I can *actually* remember anything about python but I'm game for a try
<bigjools> ah you're a Goer
<fwereade_> bigjools, that's what they tell me
<bigjools> :)
 * fwereade_ maintains a perfectly straight face
<bigjools> I'm trying to use juju to test deployment on maas and it's saying it did it, yet there's no attempt at all to start a machine
<bigjools> I can't see start_machine being called, not sure how to debug this and even why it started misbehaving
<fwereade_> bigjools, hmmmm -- you're bootstrapping?
<bigjools> after that
<bigjools> bootstrap is ok
<bigjools> just deploy going wrong
<fwereade_> bigjools, what does status tell you? is it that there "should be" a machine, but it's just not started? or that the machine never gets added to state in the first place?
<rogpeppe> bigjools: have you looked at the logs on the bootstrap machine?
<rogpeppe> bigjools: (something to do after fwereade_'s suggestion, possibly)
<fwereade_> bigjools, yeah, the start_machines will be called by the provisioning agent
<bigjools> fwereade_: I see no attempt at all to even try to start a machine, but status says it's waiting for it
<fwereade_> bigjools, assuming they're making it into state, which you can check with status, your best bet is the PA logs
<bigjools> rogpeppe: I didn't look there, will do so!
<bigjools> fwereade_: PA?
<fwereade_> bigjools, provisioning agent
<rogpeppe> bigjools: /var/log/juju/*.log i think
<fwereade_> bigjools, upstart job called something beginning with juju-pro IIRC
<bigjools> ok
<rogpeppe> fwereade_: here's the text of an email that i'm considering sending to juju-dev: http://paste.ubuntu.com/1259902/
 * fwereade_ reads
<bigjools> well, there's a traceback
<rogpeppe> bigjools: that sounds fairly indicative :-)
<bigjools> http://pastebin.ubuntu.com/1259905/
<rogpeppe> niemeyer: yo!
<niemeyer> rogpeppe: Heya!
<niemeyer> Hello all!
<fwereade_> bigjools, sorry, never seen that before
<bigjools> fwereade_: darn :(
<bigjools> fwereade_: happens on packaged or trunk version, I suspect some config problem but I can't work out what exactly from that traceback, it's not very helpful :(
<fwereade_> bigjools, indeed not :(
<rogpeppe> niemeyer: i was getting feedback from dfc this morning about the security stuff, and he suggested sensibly that we should make sure that our potential users (e.g. Elmo) are happy with the direction we're going, so i put an email together trying to explain things. i don't know whether it's actually worth sending, but it focused my mind helpfully anyway. http://paste.ubuntu.com/1259902/
<fss> niemeyer: i sent the cl yesterday https://codereview.appspot.com/6586073/ :)
<bigjools> fwereade_: I think I know what it is .... :/
<niemeyer> rogpeppe: I don't think it's worth sending because this is not our end goal
<fwereade_> bigjools, go on...
<rogpeppe> niemeyer: ok. i'm not sure what our end goal is then.
<bigjools> fwereade_: the ZK machine can't reach the maas server.  why that is, I don't know
<fwereade_> bigjools, oh, hell
<niemeyer> Oct 01 11:54:59 <niemeyer>      rogpeppe: In a future universe, we'll then introduce an HTTPS API to which everyone will talk to
<rogpeppe> niemeyer: that's what i talk about in the email
<rogpeppe> niemeyer: as an intro
<rogpeppe> niemeyer: then i say that these are steps in that direction
<bigjools> fwereade_: well, that "integer is required" error is the fundamental problem causing that in fact.  So still not closer to working it out.
<fwereade_> bigjools, oh, for real tracebacks in twisted
<niemeyer> rogpeppe: That's what I mean.. our focus is on implementing these steps. If you want to discuss future with James, that sounds great, but I suggest getting hold of him in two weeks and talking to him
<bigjools> fwereade_: quite :/
<bigjools> fwereade_: you have to turn on deferred debugging to get them
<fwereade_> bigjools, I don't think we expose a switch for that
<fwereade_> bigjools, I guess you can always hack at the code the PA runs :/
<niemeyer> rogpeppe: Meanwhile, I hope we *implement* the steps, rather than just discuss how a perfect future looks like
<bigjools> fwereade_: awesome :)
<rogpeppe> niemeyer: ok. i was thinking that it might be useful to know if we're stepping in the right direction, but if you think that's fine, i won't send anything.
<niemeyer> rogpeppe: I'm not sure I understand your concerns
<niemeyer> rogpeppe: If authenticating is a step in the right direction? Of course it is.. if transport security is a step in the right direction? Of course it is
<rogpeppe> niemeyer: it may be that we don't need anything of what we're doing now.
<niemeyer> rogpeppe: We're not doing anything fancy.. we're implementing what should be in place from day zero
<rogpeppe> niemeyer: we already authenticate and do transport security.
<niemeyer> rogpeppe: Transport security? Authentication?
<rogpeppe> ssh
<niemeyer> rogpeppe: Please read the code of our agents :)
<rogpeppe> niemeyer: ok, we don't currently use ssh intra-cloud, but we could.
<rogpeppe> niemeyer: we could do SSL security without any of the SetPassword stuff.
<niemeyer> rogpeppe: How do you put a client SSL certificate in place?
<rogpeppe> niemeyer: that's a question for us now too
<niemeyer> rogpeppe: Heh
<rogpeppe> niemeyer: perhaps we're not concerned with man-in-the-middle attacks though
<niemeyer> rogpeppe: What man-in-the-middle attacks?
<niemeyer> rogpeppe: Do you have man-in-the-middle attacks when you use a password on gmail?
<rogpeppe> niemeyer: yes, potentially.
<niemeyer> rogpeppe: No, you don't unless you ignore the security warnings from your browser
<niemeyer> rogpeppe: I'm happy to hear proposals that are better than the one I've explained. I'm not greatly interested in stopping progress to hunt for a proposal without clear articulation of what is the problem, the solution, and the way we'll get there in time.
<rogpeppe> niemeyer: i suppose that's what i was trying to articulate.
<niemeyer> rogpeppe: I haven't noticed that yet.. you just told me we already authenticate and do transport security
<rogpeppe> niemeyer: i wanted to put my sketch up on juju-dev to see if anyone could see obvious flaws in it, as we'll probably be using this model for a while. do you think that's a bad idea?
<niemeyer> rogpeppe: Yes, I personally think it is. I'd like to see progress being made instead of exposing a half-baked plan. This is not the end goal.. we'll *not* use database constraint to secure data.
<rogpeppe> niemeyer: ok, fair enough
<niemeyer> rogpeppe: It's up to you, though
<rogpeppe> niemeyer: changing the subject, what do we do about admin-secret vs the environment config?
<rogpeppe> niemeyer: it's secret, but we don't want to push it with the rest of the secrets
<niemeyer> rogpeppe: I personally don't mind that you're talking about it with people, of course. Feel free to do contact James, juju-dev, or whoever else.
<niemeyer> rogpeppe: I'll be doing pressure for progress, though.
<niemeyer> rogpeppe: I want to see code being merged that improves the situation.
<rogpeppe> niemeyer: i'm not going to delay things at all
<niemeyer> rogpeppe: Also, doing homework is good..
<niemeyer> """
<niemeyer> It is not clear to me whether it is possible to make MongoDB
<niemeyer> perform client-certificate verification; if it cannot, for the time being
<niemeyer> we will remain vulnerable to man-in-the-middle attacks within the cloud.
<niemeyer> """
<niemeyer> """
<niemeyer> Even within this interim
<niemeyer> model, we can significantly improve things by separating concerns within
<niemeyer> the database.  For example the environment configuration (containing
<niemeyer> the cloud access keys), the machines collection (allowing the creation
<niemeyer> of new instances), and the unit-related collections could each be in
<niemeyer> """
<niemeyer> We won't do that.
<rogpeppe> niemeyer: no?
<niemeyer> rogpeppe: No
<rogpeppe> niemeyer: do we have transaction that span machines and units?
<rogpeppe> transactions
<niemeyer> rogpeppe: Erm?
<rogpeppe> niemeyer: sorry, i jumped to conclusions. why won't we do that?
<rogpeppe> niemeyer: it seemed to me like keeping the environ config separate might be a cheap and easy way to make things more secure.
<niemeyer> rogpeppe: Cheap? How do you separate out everyone that needs the environment configuration?
<niemeyer> rogpeppe: Everybody uses it right now
<rogpeppe> niemeyer: hmm. i suppose we do need to read the private bucket.
<niemeyer> rogpeppe: How will MongoDB authentication permit you to do it in a single database, or alternatively how do you span multiple databases with transactions?
<niemeyer> rogpeppe: The solution is not to hack together such change.. the solution is to have a real API to which clients talk to, instead of communicating with the database
<rogpeppe> niemeyer: depends whether we need to span those things in a single transaction. i thought perhaps we did not. it's true i didn't check, though.
<rogpeppe> niemeyer: yeah that's true at least.
<rogpeppe> niemeyer: i did try to do my homework regarding mongod client-side certificate verification, but got lost in a) the source code b) the openssl docs.
<rogpeppe> niemeyer: please disregard the proposed email. i wanted feedback and i got it, thanks.
<rogpeppe> niemeyer: currently i am wondering whether to make admin-secret a special case, or to have a VerySecretAttrs method on EnvironProvider.
<rogpeppe> niemeyer: i'm tending towards the former, and putting admin-secret in config.Config.
<niemeyer> rogpeppe: Transport security and authentication, in place, working.. that's what we have to focus on for the moment. The need for the API is being strongly requested, and won't take long. What we're doing is a good step towards supporting it.
<rogpeppe> niemeyer: that's cool. i understand that now. let's move on.
<niemeyer> rogpeppe: Special case in which sense?
<rogpeppe> niemeyer: we don't want to push it to the state
<rogpeppe> niemeyer: at least i *think* we don't want to push it to the state
<rogpeppe> niemeyer: otherwise it makes a mockery of our careful password management
<niemeyer> rogpeppe: Hmm
<niemeyer> rogpeppe: Indeed
<niemeyer> rogpeppe: This will likely be somewhat boring, in fact..
<rogpeppe> niemeyer: yeah
<niemeyer> rogpeppe: Since we replace the local config with the remote one regularly
<niemeyer> rogpeppe: and the remote one won't have the secret
<rogpeppe> niemeyer: i'm tempted to make it a special case
<rogpeppe> niemeyer: and never push an attribute named "admin-secret"
<niemeyer> rogpeppe: It is a special case in either case.. I'm just wondering what that means in practice
<niemeyer> rogpeppe: Well, I guess we only need the password when connecting, so the regular replacement may not be much of an issue
<rogpeppe> niemeyer: i'm not sure which piece you're thinking of when you say "we replace the local config with the remote one regularly"
<rogpeppe> niemeyer: which "local" and which "remote"?
<niemeyer> rogpeppe: the one in memory vs. the one in the database
<rogpeppe> niemeyer: i don't think that's a problem - we'd never have the admin-secret attribute in the cloud
<niemeyer> rogpeppe: Exactly
<rogpeppe> niemeyer: ah, when *the client is* connecting. yeah.
<niemeyer> rogpeppe: Means we'll lose the local password from the configuration.. but I think that's ok
<rogpeppe> niemeyer: i don't think we *need* to remove the password from the client-side Config object, but perhaps that's not what you're thinking of
<niemeyer> rogpeppe: I'm thinking it is going to be removed even if we don't need it
<niemeyer> rogpeppe: Because we load the environment configuration from the remote side
<rogpeppe> niemeyer: we just have to change BootstrapConfig to remove admin-secret too
<rogpeppe> niemeyer: but it would seem a bit odd if SecretAttrs didn't return admin-secret actually
<bigjools> fwereade_: I restarted the PA and it made the error go away ... wtf!
<rogpeppe> niemeyer: so a better fix would be to remove admin-secret from the secrets within juju.Conn.updateSecrets
<niemeyer> rogpeppe: Not sure.. that's a setting for the provider itself
<rogpeppe> niemeyer: is it?
<rogpeppe> niemeyer: aren't we doing provider non-specific stuff with it?
<niemeyer> rogpeppe: EnvironProvider.SecretAttrs is
<rogpeppe> niemeyer: ah, yeah
<rogpeppe> niemeyer: in which case changing BootstrapConfig would seem better
<niemeyer> rogpeppe: Yeah, I think it's quite fitting
<niemeyer> rogpeppe: We should also add a panic to State.SetEnvironConfig
<niemeyer> rogpeppe: In case it ever sees an admin-secret
<rogpeppe> niemeyer: that seems reasonable
<niemeyer> rogpeppe: Or perhaps just an error.. I think we might reach the panic with "juju set admin-secret=foo"
<fwereade_> bigjools, grar
<rogpeppe> niemeyer: hmm yeah
<rogpeppe> niemeyer: i was thinking a panic seemed a bit harsh actually
<niemeyer> rogpeppe: +1
<rogpeppe> niemeyer: actually, we could make juju set admin-secret work if we wanted, i think
<niemeyer> rogpeppe: Yeah, sounds sane, but it's a different code path anyway
<rogpeppe> niemeyer: yeah
<rogpeppe> niemeyer: the error case would remain
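The guard being discussed could look like the following. This is a hypothetical sketch of the idea, not juju's SetEnvironConfig: `checkNoAdminSecret` is an invented helper, and it returns an error rather than panicking because, per the discussion, "juju set admin-secret=foo" could reach this code path.

```go
package main

import (
	"errors"
	"fmt"
)

// checkNoAdminSecret rejects any environment configuration that still
// carries the admin-secret attribute, which must never be written to
// state lest it undermine the password management scheme.
func checkNoAdminSecret(attrs map[string]interface{}) error {
	if _, ok := attrs["admin-secret"]; ok {
		return errors.New("admin-secret should never be written to state")
	}
	return nil
}

func main() {
	fmt.Println(checkNoAdminSecret(map[string]interface{}{"admin-secret": "foo"}) != nil)
	fmt.Println(checkNoAdminSecret(map[string]interface{}{"name": "test"}) == nil)
}
```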
<rogpeppe> niemeyer: oh yeah, small CL from this morning: https://codereview.appspot.com/6589072/
<TheMue> niemeyer: morning btw from me too
<TheMue> niemeyer: and also a CL regarding the firewall mode in EC2: https://codereview.appspot.com/6589073/
<niemeyer> TheMue: Heya
<niemeyer> rogpeppe: ping
<rogpeppe> niemeyer: pong
<niemeyer> rogpeppe: Have a moment for a call? An idea just crossed my mind
<rogpeppe> niemeyer: sure
<niemeyer> rogpeppe: I'm not sure if it's crack or not, or if it'd take longer or not, so would appreciate some brainstorm
<rogpeppe> niemeyer: i'll fetch the other computer, so it doesn't die half way through
<rogpeppe> niemeyer: one mo
<niemeyer> rogpeppe: Cool
<niemeyer> TheMue: Quick question before the review: is the *-global group needed? Couldn't we just use the one group?
<fwereade_> niemeyer, this should be a trivial: https://codereview.appspot.com/6607043
<TheMue> niemeyer: started that way too, but then I thought it could make sense for our machines to internally share one group, with the ports needed purely for machine-to-machine communication, while the global group contains the ports open to the public.
<niemeyer> TheMue: COol
<niemeyer> TheMue: And is there any practical benefits with that?
<TheMue> niemeyer: currently our firewall model doesn't use it, everything is done on the global group. but we should discuss it.
<TheMue> niemeyer: have been interrupted here, sorry
<TheMue> niemeyer: so we can distinguish between the sources
<niemeyer> TheMue: Cool, sounds ok
<TheMue> niemeyer: glad you like it
<rogpeppe> niemeyer: admin-secret: https://codereview.appspot.com/6587085
<niemeyer> rogpeppe: Cheers
<niemeyer> fwereade_: Is it just moving the package without any semantic changes?
<fwereade_> niemeyer, yes, + package comment tweaking
<niemeyer> fwereade_: Beautiful, LGTM
<fwereade_> niemeyer, cheers
<niemeyer> rogpeppe: Great stuff
<rogpeppe> niemeyer: thanks! unfortunately i accidentally made it dependent on a prereq, so perhaps you could have a look at that too (it's very small) https://codereview.appspot.com/6589072/
<niemeyer> rogpeppe: Looking
<niemeyer> rogpeppe: Hmm.. that's reviewed already
<rogpeppe> niemeyer: oh really? cool!
<niemeyer> Lunch.. biab
<niemeyer> fwereade_: Back on uniter here
<fwereade_> niemeyer, ah, cool, thanks
<niemeyer> fwereade_: Looking pretty good
<niemeyer> fwereade_: Looks like you've managed to get the upgrade decision entirely inside the filter
<fwereade_> niemeyer, yeah, I think it's reasonably clean
<fwereade_> niemeyer, and it does make for a very nice interface
<niemeyer> fwereade_: Very true
<niemeyer> fwereade_: I'm thinking through the relationship between ModeInstalling and the follow up continuation
<niemeyer> fwereade_: In terms of possible races given the different origins of the charm
<fwereade_> niemeyer, not sure I follow... this is rarely a good sign ;) what race?
<niemeyer> fwereade_: Well, that's what I'm trying to find :)
<fwereade_> niemeyer, ok, well, I don't *think* the changes matter re upgrading at all: at some single point in time we write the current service charm in an install op, and that is the charm that gets installed, full stop
<fwereade_> s/upgrading/installing/
<niemeyer> fwereade_: When a delta is taken from foo to bar, and then we pick current state from baz, that's always an eye-opener
<niemeyer> fwereade_: Okay, imagine this:
<niemeyer> 1) Unit starts up
<niemeyer> 2) filter goroutine starts up, and blocks
<niemeyer> 3) ModeInstalling runs, and picks charm C1
<niemeyer> 4) filter runs, and picks charm C2 as current charm
<niemeyer> fwereade_: What happens next?
<fwereade_> niemeyer, at some point after this we will be in a mode that needs charms, so it will ask for charm events relative to a baseline of C1, and (assuming appropriate forcing) get the *Charm corresponding to C2 next time it reads from charmEvents
<niemeyer> fwereade_: Why? C1 was never seen by the filter
<fwereade_> niemeyer, yeah, we tell the filter what charm events we're interested in
<niemeyer> fwereade_: Ah, maybe that's what I'm missing
<fwereade_> niemeyer, wantCharmEvent now takes something like (upgradeFrom *state.Charm, mustForce bool)
<niemeyer> fwereade_: Aha, makes sense, thanks
<niemeyer> fwereade_: Quite nice
<niemeyer> fwereade_: Very nice, in fact
<fwereade_> niemeyer, cheers :)
<niemeyer> fwereade_: Okay, so..
<niemeyer> fwereade_: New branch/logic/etc is *awesome*
<niemeyer> fwereade_: You did changes inside the filter that look like a great direction too
<niemeyer> fwereade_: The select statement is neat and tight
<fwereade_> niemeyer, it seemed worth trying again, and it seemed to work out ok this time :)
<niemeyer> fwereade_: There's one thing I think we can improve slightly, but it's just in terms of how to do exactly the same thing, rather than changing it
<fwereade_> niemeyer, great, improvements always welcome :)
<fwereade_> [in the background, a wail of shock: no! mum! I don't want to eat salad!]
<niemeyer> ROTFL
<niemeyer> fwereade_: These new closures are great in terms of naming and isolating logic, but they're begging to be real methods
<fwereade_> niemeyer, yeah, I felt the pull
<fwereade_> niemeyer, and then I felt like putting them on the same type as the chans would end up obscuring its purpose, and that a separate type wasn't quite right
<fwereade_> niemeyer, which would you favour?
<niemeyer> fwereade_: This, I suspect, will also reduce a bit the massive namespacing that we have within that one method
<fwereade_> niemeyer, OTOH the same type as the chans is clearly the right place, given that they manipulate them
<fwereade_> niemeyer, yeah, indeed
<fwereade_> niemeyer, incidentally, obscuring the field namespace was also a concern
<niemeyer> fwereade_: I'd put them in the same type.. they're a different category of methods, guaranteed
<fwereade_> niemeyer, if I'm going to have a busy namespace a single function scope sometimes seems like the right place ;p
<niemeyer> fwereade_: They are private methods, except everything is private at the moment because the whole type is private
<fwereade_> niemeyer, how would you feel about me exposing filter as Filter, and making the Events methods public?
<fwereade_> niemeyer, or just the methods even
<niemeyer> fwereade_: I was about to suggest the latter half only
<niemeyer> fwereade_: +1
<fwereade_> niemeyer, convergence :)
<niemeyer> fwereade_: ftw :)
<fwereade_> niemeyer, great
<niemeyer> fwereade_: The field names so far are very clear.. we have well known fields, plus want* and out*.. we can then share a few private things with proper names, and I suspect the methods will hide a few variables within their own scope
<fwereade_> niemeyer, yeah, I think it'll work out pretty nice
<fwereade_> niemeyer, it's just the duplication of out* names that bugs me
<niemeyer> fwereade_: Hmm, which duplication?
<fwereade_> niemeyer, the field that holds the real chan and the one that's the same but sometimes nil
<fwereade_> niemeyer, which is manipulated in those closures
<fwereade_> niemeyer, outCharm = nil; outCharm = f.outCharm
<niemeyer> fwereade_: Hmm.. that doesn't feel too bad to me
<fwereade_> niemeyer, well, I certainly can't think of good names for the pair
<fwereade_> niemeyer, the outCharm above will need to be a field, right?
<niemeyer> fwereade_: Ah, okay, I see
<fwereade_> niemeyer, f.maybeOutCharm = f.outCharm
<fwereade_> ;p
<niemeyer> fwereade_: f.outCharm and f.outCharmOn?
<fwereade_> niemeyer, f.outCharm = f.outCharmOn
<fwereade_> niemeyer,  I like it
<fwereade_> niemeyer, ty
<niemeyer> fwereade_: np
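The channel discipline under discussion can be sketched as below. The outCharm/outCharmOn names come from the naming exchange above; everything else is an illustrative model, not juju's actual filter. The send arm is disabled by keeping outCharm nil, because a send on a nil channel blocks forever, so that select case can never fire until a want re-enables it.

```go
package main

import "fmt"

type event struct{ url string }

type filter struct {
	wantCharm  chan struct{} // uniter asks for the next charm event
	outCharmOn chan event    // the real channel the uniter reads from
	outCharm   chan event    // nil until a want enables sending
	pending    event
	gotPending bool
}

func (f *filter) loop(changes <-chan event, done <-chan struct{}) {
	for {
		// Only service wants once an event exists, so an early want
		// can never cause a nil or stale event to be sent.
		var wants chan struct{}
		if f.gotPending {
			wants = f.wantCharm
		}
		select {
		case ch := <-changes:
			f.pending, f.gotPending = ch, true
		case <-wants:
			f.outCharm = f.outCharmOn // un-nil: enable the send arm
		case f.outCharm <- f.pending:
			f.outCharm = nil // sent: disable until the next want
		case <-done:
			return
		}
	}
}

func main() {
	f := &filter{wantCharm: make(chan struct{}), outCharmOn: make(chan event)}
	changes := make(chan event)
	done := make(chan struct{})
	go f.loop(changes, done)
	changes <- event{"cs:precise/c2-1"} // the watcher delivers a change
	f.wantCharm <- struct{}{}           // the uniter asks for charm events
	fmt.Println((<-f.outCharmOn).url)   // and receives the latest charm
	close(done)
}
```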
<rogpeppe> niemeyer: i'm thinking about putting some encoding of the admin-secret (secure hash or b64) into cloudinit, rather than the raw password, as using the raw password exposes us to awkward upstart quoting issues.
<rogpeppe> niemeyer: do you think that's reasonable?
<niemeyer> rogpeppe: I don't think we ever want the raw password in cloud init
<rogpeppe> niemeyer: good point. doh.
<rogpeppe> niemeyer: so we can assume that the Password in the StateInfo passed to cloud-init is always nicely formed.
<niemeyer> rogpeppe: As in, contains reasonable text? Yeah
<rogpeppe> niemeyer: yeah
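The encoding idea above could be sketched as follows. This is an illustration of the approach, not juju's actual scheme: `cloudinitPassword` is an invented name, and the point is only that a hashed, base64-encoded value is shell-safe, so it sidesteps the upstart quoting issues and keeps the raw password out of cloudinit.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// cloudinitPassword derives a shell-safe value from the admin secret
// rather than embedding the raw password in cloudinit. URL-safe
// base64 without padding uses only [A-Za-z0-9_-], which contains no
// shell metacharacters at all.
func cloudinitPassword(adminSecret string) string {
	sum := sha256.Sum256([]byte(adminSecret))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

func main() {
	// A secret full of quoting hazards yields a safe 43-char string.
	fmt.Println(cloudinitPassword(`p@ss"word with 'quotes'`))
}
```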
<rogpeppe> i'm off for the day. see y'all tomorrow.
<niemeyer> rogpeppe: Cheers man
<niemeyer> fwereade_: Sent some other minor comments on the review of that same branch
<niemeyer> fwereade_: Only one point there may need some further talking
<fwereade_> niemeyer, heyhey
<fwereade_> niemeyer, I'll take a look
<fwereade_> niemeyer, about the config events?
<niemeyer> fwereade_: Yeah
<fwereade_> niemeyer, it just means that if we get a really early wantConfig, we don't end up sending an event not corresponding to a real change when the initial watcher change shows up
<fwereade_> niemeyer, clearly it's not expressed very well :)
<niemeyer> fwereade_: I don't think I get it still
<fwereade_> niemeyer, and re "Upgrading?" -- yes, only ModeUpgrading checks for ErrConflict, because all ModeInstalling does is pull into a new empty dir, and is therefore unlikely to experience conflicts ;)
<niemeyer> fwereade_: How does it prevent anything.. the order is still unpredictable, and the block in the early select is the same block on the later select
<fwereade_> niemeyer, the order is not unpredictable: doing what I do guarantees that the config.Changes() chan will be read at least once before the want chan
<fwereade_> niemeyer, and that therefore it is impossible to (re?)activate the outConfig chan via a want *before* getting the initial event, which is actually an *un*requested resend of the "original" event
<fwereade_> niemeyer, yeah, the above is not clear
<fwereade_> niemeyer, consider without that block
<fwereade_> niemeyer, 1) start config watch
<fwereade_> niemeyer, 2) get a wantConfig
<fwereade_> niemeyer, 3) send a config event
<fwereade_> niemeyer, 4) get the initial config change
<fwereade_> niemeyer, 5) send a config event
<fwereade_> niemeyer, the above sequence STM to be possible -- two events from a single config after only one want
<niemeyer> fwereade_: Makes sense, thanks
<fwereade_> niemeyer, np, I'll try to make it clear in the code
<niemeyer> fwereade_: Thanks!
<fwereade_> niemeyer, did the Upgrading? explanation make sense?
<niemeyer> fwereade_: Does as well, thank you
<fwereade_> niemeyer, cool
<fwereade_> niemeyer, (just to check: you're ok with wantUpgradeEvent etc over wantCharmEvent?)
<fwereade_> niemeyer, that's what it actually *is* now after all :)
<niemeyer> fwereade_: Yeah, definitely
<niemeyer> fwereade_: Happy with it, actually
<fwereade_> niemeyer, sudden hare-brained idea, but since you're here:
<fwereade_> niemeyer, u.f.wantConfigEvent() is not really so bad
<fwereade_> niemeyer, I don't really think we get much from emebedding
<niemeyer> fwereade_: +1
<fwereade_> niemeyer, cool
<fwereade_> niemeyer, I should manage to propose again later tonight
<niemeyer> fwereade_: Superb
<niemeyer> fwereade_: I'll have to step out in a bit to sign routine bank papers, but will be back working later too
<fwereade_> niemeyer, have fun :)
<niemeyer> fwereade_: "fun" :)
 * niemeyer steps out
 * niemeyer is back
<fwereade_> niemeyer, https://codereview.appspot.com/6588053 reproposed
<niemeyer> fwereade_: Brilliant.. on the phone with mramm, but will be there soon
<fwereade_> niemeyer, cool, thanks
<niemeyer> fwereade_: Looking
<fwereade_> niemeyer, cheers
<niemeyer> fwereade_: done
<niemeyer> fwereade_: LGTM, with a few last suggestions for your consideration
<fwereade_> niemeyer, awesome :D
<fwereade_> niemeyer, I'll take a look
<fwereade_> niemeyer, I'm a bit uncomfortable about serviceCharm.force sometimes meaning "force" and sometimes meaning "must force" depending on the var... but the only reason I didn't succumb to its convenience was because I thought you wouldn't like it :)
<fwereade_> niemeyer, not sure I'll quite get that merged tonight, I'm thinking sleep sounds interesting once I get my current jujuc butchery building again
<niemeyer> fwereade_: I can definitely understand.. it took me some time to suggest variable names properly for this to not be awkward
<fwereade_> niemeyer, yeah, those names help a lot
<niemeyer> fwereade_: upgradeRequested.force and upgradeAvailable.force both sound about right, if not ideal
<fwereade_> niemeyer, definitely better than anything I could think of :)
<niemeyer> fwereade_: About sleep, I can only say have a great one! :-)
<niemeyer> davecheney: Morning!
<niemeyer> We've got an empty review queue again
<davecheney> niemeyer: nice one
<davecheney> i'll see what I can do about that
<davecheney> just trying to debug this atm
<davecheney> lucky(~/src/launchpad.net/juju-core) % juju destroy-environment
<davecheney> error: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details. (SignatureDoesNotMatch)
<niemeyer> davecheney: Uh oh ;)
<niemeyer> davecheney: Curious
<davecheney> ^ happens when I have more than a few machines running
<niemeyer> davecheney: Up-to-date goamz/ec2?
 * davecheney shrugs
<davecheney> it's the one i've been using since the last goamz fix
 * davecheney mumbles something about having to add tests to catch people using old pkg versions
<niemeyer> davecheney: Yeah, seems interesting
<davecheney> niemeyer: did you see the results from my charm testing last night
<davecheney> 9 out of 19 charms worked
<niemeyer> davecheney: Woah, I hadn't seen that
<davecheney> none failed because of compatibility issues between py and go
<davecheney> niemeyer: check the logs for ~12 hours ago
<davecheney> niemeyer: http://paste.ubuntu.com/1259806/
<niemeyer> davecheney: Woah
<niemeyer> This is *awesome*
<niemeyer> davecheney: Deserves a mail to juju@
<davecheney> i don't have it setup on this machine
<davecheney> i was going to do a bit more triage on the failing charms
<davecheney> niemeyer: https://bugs.launchpad.net/juju-core/+bug/1061941
<davecheney> bug report for aws destroy-environment failure
<niemeyer> davecheney: Can you post some more details of the problem there?
<niemeyer> davecheney: This is indeed a critical issue
<niemeyer> davecheney: and probably easy to solve too
<niemeyer> davecheney: There's a private debug flag within goamz/ec2
<niemeyer> davecheney: I'm working a bit on multi-config meanwhile
<davecheney> niemeyer: ok, i'll rebuild goamz now i have a test case
<davecheney> niemeyer: please review the irc logs from yesterday
<davecheney> re: the failed charms
<niemeyer> davecheney: Can you paste the section you'd like me to look at so I don't have to second guess?
<davecheney> niemeyer: http://irclogs.ubuntu.com/2012/10/04/%23juju-dev.html#t11:48
<davecheney> ^ are we missing a command ?
<davecheney> http://irclogs.ubuntu.com/2012/10/04/%23juju-dev.html#t11:57
<davecheney> ^ dud charm
<davecheney> http://irclogs.ubuntu.com/2012/10/04/%23juju-dev.html#t11:59
<davecheney> ^ hard coded tools path
<davecheney> http://irclogs.ubuntu.com/2012/10/04/%23juju-dev.html#t12:01
<davecheney> ^ unhygienic charm
<davecheney> and that was as far as I got before I realised destroy-environment wasn't working :)
<niemeyer> davecheney: Which command are we missing?
<davecheney> is bucket-create a juju command ?
<niemeyer> davecheney: It's a bit hard to read it all and guess what you think is going on
<davecheney> niemeyer: yeah, i'll do more triage today
<niemeyer> davecheney: " /opt/couchbase/bin/couchbase-cli bucket-create"
<niemeyer> davecheney: That's couchbase, not juju
<davecheney> right, then the tally is 9 out of 19 charms make it to started, none fail because of compatibility between us and py juju
<davecheney> niemeyer: http://jujucharms.com/charms/precise
<davecheney> ^ can i get this data in xml/json ?
<niemeyer> davecheney: The list of charms?
<davecheney> yup
<niemeyer> davecheney: Let me see..
<davecheney> just the charm names
<davecheney> niemeyer: thank you for your review on http://codereview.appspot.com/6591080/
<davecheney> i know it's not the right solution, but it lets me at least have a working deploy
<niemeyer> davecheney: np, the interim solution looks good
<niemeyer> davecheney: Doesn't corrupt the state, and easy to rollback
<davecheney> and it will be fairly obvious for others coming along later
<davecheney> niemeyer: what do you think about adding some documentation to the top level, INSTALL and CONTRIBUTE
<davecheney> also adding a scripts directory for some of the things like the stress test
<davecheney> and running gocov will need some scripting support
<niemeyer> davecheney: +1 on docs.. for the stress test, I'd prefer for it to have its own code base
<davecheney> even if it's just a bash script ?
<niemeyer> davecheney: Okay, that sounds fine to have in
<niemeyer> davecheney: Unless the bash script is doing while true; do go test; done
<niemeyer> ;)
<davecheney> well, there is a little bit more, where it sets a random GOMAXPROCS
#juju-dev 2012-10-05
<davecheney> niemeyer: just a quick double check https://codereview.appspot.com/6601056
<davecheney> niemeyer: do you want to review this goamz patch ? https://code.launchpad.net/~mirtchovski/goamz/ec2
<niemeyer> davecheney: LGTM on first, thanks
<davecheney> niemeyer: ta
<niemeyer> davecheney: Do you need something from the second?
<davecheney> what is the code review policy for goamz
<davecheney> the key change for that branch is adding DescribeImages
<davecheney> as the user (f2f in #go-nuts) has created their own AMIs
<davecheney> whereas we use the ubuntu cloud image finding service thing
<davecheney> niemeyer: does turning on debug in ec2 dump my secret key ?
<davecheney> ie, is it safe to post this whole debug log ?
<niemeyer> davecheney: I don't know..
<niemeyer> davecheney: There's a lot pending in goamz.. someone needs to dedicate some time to it
<niemeyer> davecheney: We also require people to sign the contribution agreement on external contributions to Canonical code bases
<niemeyer> davecheney: Has he signed?
<davecheney> niemeyer: i doubt he's signed anything
<davecheney> niemeyer: https://bugs.launchpad.net/juju-core/+bug/1061941
<davecheney> ^ debug in issue
<niemeyer> davecheney: Okay, can you figure why it is broken?
<davecheney> i'm sure it's the length of the URL, or some component of that
<davecheney> bisecting the problem now
<niemeyer> davecheney: That debug info doesn't say much.. the meat is usually inside sign.go
<niemeyer> davecheney: We had it broken sometime ago due to a change in the way path was handled
<niemeyer> davecheney: Ah, wait
<niemeyer> davecheney: It could be about the encoding of the + there
<davecheney> niemeyer: right o
<niemeyer> davecheney: Not how the signature has a %2B
<niemeyer> davecheney: Note
<davecheney> yup, that will probably screw it up
<davecheney> damn spaces
<davecheney> niemeyer: there is a certain number of instances where it does work for
<davecheney> i'm figureing out that number, it's like 3 or less I think
<niemeyer> davecheney: This may match the chances of getting a signature with a given shape
 * niemeyer => bed
<niemeyer> Night all
<rogpeppe> fwereade_: mornin'
<rogpeppe> davecheney: hiya
<TheMue> morning
<davecheney> morning gents
<fwereade_> rogpeppe, heyhey
<fwereade_> davecheney, heyhey
<fwereade_> TheMue, heyhey
<TheMue> hi all
<fwereade_> rogpeppe, I've been chatting to niemeyer about jujuc.HookContext
<rogpeppe> fwereade_: ok
<fwereade_> rogpeppe, we are agreed that it should be an interface, and nothing more specific has been said
<rogpeppe> fwereade_: interesting.
<fwereade_> rogpeppe, I have had a crazy idea, that the interface we *really* want may actually be a jujuc.Conn, with the same thinking behind it as juju.Conn
<rogpeppe> fwereade_: i think of it as quite concrete, but i'm interested where you end up
<rogpeppe> fwereade_: juju.Conn isn't an interface :-)
<rogpeppe> fwereade_: what different implementations of this interface do you envisage?
<fwereade_> rogpeppe, the purpose is to separate the server and the commands cleanly from all the crap in RelationContext that is in the wrong place
<fwereade_> rogpeppe, by making an interface and moving the existing implementation off into uniter we can start to collapse Relationer and RelationContext into something interesting
<fwereade_> rogpeppe, the fact that it will also be possible to write neater contexts in the jujuc tests is both a bonus and a hassle
<rogpeppe> fwereade_: i'm not sure i see the relationship with juju.Conn. doesn't the context need to know the context within which the commands work (i.e. the current unit, relation, etc) ?
<fwereade_> rogpeppe, well, yes, but I think we can expose all that clearly and usefully in terms of capabilities rather than things
<fwereade_> rogpeppe, no need for a *state.Unit in the Conn(?) has an OpenPort method, etc etc
<fwereade_> s/ in / if /
<fwereade_> rogpeppe, and then I thought "hmm, juju.Conn presents a convenient high-level interface with state"
<rogpeppe> fwereade_: what about commands that operate relative to the context?
<fwereade_> rogpeppe, "this situation has at least one interesting parallel"
<rogpeppe> fwereade_: e.g. unit-get, relation-get etc
<fwereade_> rogpeppe, they basically all do...
<rogpeppe> fwereade_: yeah, so it won't look much like juju.Conn then, i guess
<rogpeppe> fwereade_: but you could certainly provide primitives that look like the ones the commands require
<fwereade_> rogpeppe, yes, that is the commonality I am thinking of
<rogpeppe> fwereade_: but then, does that really save you much?
<rogpeppe> fwereade_: over just writing the commands, using the stuff inside the context
<fwereade_> rogpeppe, well, it's basically a straight move of code inside the context
<rogpeppe> fwereade_: i guess i don't really see where the pressure here is. could you explain what the problem is with the current structure?
<fwereade_> rogpeppe, primarily that relationy responsibilities are unhelpfully smeared across two modules, and I will be able to express things much more clearly
<fwereade_> rogpeppe, by combining RelationContext and Relationer with some of HookContext
 * rogpeppe is thinking
<fwereade_> rogpeppe, a server module that contains nothing but the server and the commands, and implements them in terms of an interface, will allow me the freedom to butcher the existing types and construct a magnificent assemblage from their parts :)
<rogpeppe> fwereade_: perhaps you could put together a sketch of how you think the types might work?
<rogpeppe> fwereade_: i guess i don't really see how an interface will work. who will implement the interface?
<fwereade_> rogpeppe, something in uniter whose true nature will only become apparent as i integrate relations
<fwereade_> rogpeppe, for now, the existing HookContext and RelationContext will move into uniter and be manipulated by Relationer as before
<fwereade_> rogpeppe, they'll just be tweaked to implement the interface
<rogpeppe> fwereade_: the difficulty i see is that you've got two kinds of context going on, and one is a superset of the other AFAICS
<fwereade_> rogpeppe, expand please
<rogpeppe> fwereade_: well, aren't the commands that you can execute in a relation hook more-or-less a superset of the commands you can execute in a non-relation hook?
<fwereade_> rogpeppe, no, they're exactly the same
<fwereade_> rogpeppe, you just might need to specify things that aren't present in a non-relation context
<rogpeppe> fwereade_: ok, so... i still don't see the interface thing. it's only useful to have an interface if you've got multiple implementations of it, in my view.
<rogpeppe> fwereade_: and i still don't see how that applies here
<rogpeppe> fwereade_: couldn't you just move HookContext into relationer? (with possible renaming/restructuring along the way)
<fwereade_> rogpeppe, no: import cycles
<fwereade_> rogpeppe, that *is* the right place for it
<rogpeppe> fwereade_: what's the import cycle there?
<fwereade_> rogpeppe, the uniter needs to import jujuc to do stuff with the context; jujuc needs to import uniter to know wtf it's been given
<fwereade_> rogpeppe, it could have its own package
<fwereade_> rogpeppe, and that package will still exhibit the original problem
<rogpeppe> fwereade_: if the context was in relationer, the uniter wouldn't need to import jujuc, no?
<fwereade_> rogpeppe, there is no relationer package
<fwereade_> rogpeppe, and I don't think there should be
<rogpeppe> fwereade_: i think we're getting closer to the crux of the issue here.
<fwereade_> rogpeppe, Relationer itself will almost certainly change beyond recognition
<fwereade_> rogpeppe, it is at heart a part of the uniter, and if it deserves to be a helper type then its shape will be very different
<rogpeppe> fwereade_: so could you put the context in uniter?
<rogpeppe> fwereade_: without having an import cycle?
<fwereade_> rogpeppe, yes, that is what I'm proposing; that is the situation with the import cycle
<fwereade_> rogpeppe, an interface STM to be a great way of doing so with minimal disruption
<rogpeppe> fwereade_: why does the uniter need to import jujuc?
<fwereade_> rogpeppe, so it can run a jujuc.Server
<fwereade_> rogpeppe, and supply it with contexts
<rogpeppe> fwereade_: ah yes, of course, that seems right
 * rogpeppe is still thinking
<fwereade_> rogpeppe, it is a situation that deserves lots of it, I have been thinking about this for some time :)
<fwereade_> rogpeppe, hmm, it crosses my mind that OpenPort/ClosePort are somewhat crackful, because they change global state directly
<rogpeppe> fwereade_: rather than wating until the hook has completed successfully?
<fwereade_> rogpeppe, my fuzzy feeling that the Context interface should not directly expose state at all has just grown firmer
<fwereade_> rogpeppe, yeah
<fwereade_> rogpeppe, also, config-get isn't using a frozen config
<rogpeppe> fwereade_: it probably should, i guess
<fwereade_> rogpeppe, (also we expose stuff like ConfigNodes which have *totally* inappropriate capabilities like Read and Write
<fwereade_> )
<rogpeppe> fwereade_: here's a perhaps crackful idea: could we obliterate *all* knowledge of units, charms etc from jujuc/server ?
<fwereade_> rogpeppe, not crackful at all
<rogpeppe> fwereade_: and have it simply provide a command-running and callback service
<fwereade_> rogpeppe, I've been skating around the issue due to the distracting potential crackfulness
<fwereade_> rogpeppe, hmm
<fwereade_> rogpeppe, I'd been thinking more in terms of it not referencing state directly in the interface at all
<rogpeppe> fwereade_: it seems to me that this is where the cyclic stuff is coming from, and that actually the *important* thing that jujuc/server is to run commands and allow commands to call back.
<fwereade_> rogpeppe, niemeyer and I are in rough agreement that server+commands is a cohesive chunk
<rogpeppe> s:jujuc/server:jujuc/server does:
<fwereade_> rogpeppe, I think he feels that way more strongly than me though
<rogpeppe> fwereade_: i think i'm more of the opinion that the commands are tied more closely to the uniter than the server mechanism
<rogpeppe> fwereade_: and i *think* that might help things straighten out
<rogpeppe> fwereade_: in fact it looks to me as if even now there's no essential reason for HookContext to live in jujuc/server
<fwereade_> rogpeppe, I was convinced otherwise :)
<rogpeppe> fwereade_: NewServer doesn't rely on uniter
<rogpeppe> AFAICS
<fwereade_> rogpeppe, I have been through this
<fwereade_> rogpeppe, I think we are in agreement that "pull server out on its own, it's the most obviously independent bit" would be a reasonable thing to do
<rogpeppe> fwereade_: yeah
<fwereade_> rogpeppe, however, I think that all that actually does is push a non-problematic part to one side
<rogpeppe> fwereade_: it seems to me that server+commands is *not* a cohesive unit - it's two things stuck together not-very-strongly-at-all.
<fwereade_> rogpeppe, well, they are united in their need for a HookContext
<rogpeppe> fwereade_: the server doesn't need a HookContext necessarily.
<fwereade_> rogpeppe, at least, it needs to "know" about them, even if it doesn't manipulate the type directly
<rogpeppe> fwereade_: what does it "know" about them?
<fwereade_> rogpeppe, ContextId is one of the Request fields
<fwereade_> rogpeppe, really
<fwereade_> rogpeppe, the location of Server is a derail
<fwereade_> rogpeppe, it does not contribute to solving the problem
<rogpeppe> fwereade_: i'm not so sure
<fwereade_> rogpeppe, it is trivial to extract it now and it will remain trivial to do so tomorrow
<rogpeppe> fwereade_: the main cyclic problem was between jujuc/server and worker/uniter, no?
<fwereade_> rogpeppe, the cyclic problem is HookContext, full stop
<fwereade_> rogpeppe, and that is a symptom of the relation ickiness more than anything else
<rogpeppe> fwereade_: suppose you had a worker/uniter/commands package and defined HookContext in there?
<rogpeppe> fwereade_: might that help?
<fwereade_> rogpeppe, then I *still* have the precise problem I have been trying to express all morning
<rogpeppe> fwereade_: sorry, please reiterate. it'll penetrate my thick skull eventually :-)
<fwereade_> rogpeppe, Relationer and RelationContext need to be in the same package so I can evolve them towards cohesive sanity without making a mess in other packages
<fwereade_> rogpeppe, once I have them next to one another I think it will be almost trivial
<fwereade_> rogpeppe, while they're distant any CL will be laughed out of the room for being monstrous, and quite rightly so :)
<rogpeppe> fwereade_: so... the problem is essentially that the commands need RelationContext and you don't want to put RelationContext in the same package as the commands.
<rogpeppe> fwereade_: (or vice versa)
<rogpeppe> fwereade_: because ISTM that an ultra-simple solution would just be to put the commands in uniter
<rogpeppe> fwereade_: it wouldn't make uniter too huge actually. the two packages combined (including the separable server stuff) amount to only ~2100 lines.
<rogpeppe> fwereade_: and perhaps this is the fundamental underlying semantic issue - the commands *are* tied closely to the uniter.
<fwereade_> rogpeppe, IMO not so tightly as to justify dumping them all in the same package
<fwereade_> rogpeppe, being able to express the interface without mentioning state has a benefit as well -- it keeps the jujuc package nice and hygienic, and unable to itself cause the write-on-error bugs we discussed above
<rogpeppe> fwereade_: it seems to me that they're really just functions that are called back into the uniter; they just happen to be expressed as commands.
<fwereade_> rogpeppe, well, yes; but *everything* under uniter is similarly closely tied to uniter, and I really don't think dumping it all in one package is a good idea
<rogpeppe> fwereade_: hmm, i may be coming around to your point of view :-
<rogpeppe> )
 * fwereade_ feels cheered :)
<rogpeppe> fwereade_: you'll need to be careful how you define the interface though; it might not be easy to avoid cycles or awkward copying type conversions.
<fwereade_> rogpeppe, I *think* I have something moderately sane coming through
<rogpeppe> fwereade_: FWIW i think that worker/uniter/commands is a better name than jujuc/server
<rogpeppe> or worker/uniter/cmd
<fwereade_> rogpeppe, the thing is you are right about the callbacky nature of the commands, and I think this will actually make that clearer
<fwereade_> rogpeppe, as soon as we drop jujuc I will be agitating for worker/uniter/tools
<rogpeppe> fwereade_: tools == commands ?
<fwereade_> rogpeppe, yeah, specifically borrowing terminology from elsewhere in the project
<fwereade_> rogpeppe, anyway, while cmd/jujuc exists, worker/uniter/jujuc it will be
<rogpeppe> fwereade_: are you planning to move cmd/jujuc to worker/uniter/jujuc ?
<fwereade_> rogpeppe, cmd/jujuc is the command, just about all the implementation is now in worker/uniter/jujuc; moved from cmd/jujuc/server
<rogpeppe> fwereade_: hmm. i'm not sure that "jujuc" is such a good name in that context. worker/uniter/server would be better. or worker/uniter/commands (or tools, though i'm not sure about borrowing that terminology)
<rogpeppe> anyway, that's just a naming issue
<fwereade_> rogpeppe, for now it is the implementation side of cmd/jujuc, and I think that remains the best name
<fwereade_> rogpeppe, when we fold jujuc into jujud that point will become open
<rogpeppe> fwereade_: ok
<rogpeppe> fwereade_: one last murmur: i think you're going to a lot of effort to avoid importing 500 lines of code into the uniter package. a concrete type presented to the commands could potentially be just as clear (as in not making State available) as an interface, i think.
<fwereade_> rogpeppe, it still feels like necessary scaffolding to allow me to make the change semi-peacefully
<rogpeppe> fwereade_: moving to using an interface feels like it'll be more disruptive, but i'm sure i don't understand the issues well enough.
<TheMue> fwereade_, rogpeppe: could you please take a short look at https://codereview.appspot.com/6589073/ ? i've got a strange error there when clicking on "View".
<fwereade_> TheMue, I too see a chunk mismatch
<rogpeppe> TheMue: i've seen that before. you can look at the raw diff instead: https://codereview.appspot.com/6589073/patch/6001/4003
<fwereade_> TheMue, I don't know how to resolve it I'm afraid
<rogpeppe> TheMue: i'm not sure what causes it to happen, i'm afraid.
<TheMue> es, the raw is fine, i've tested.
<TheMue> s/es/yes/
<TheMue> ok, thanks for your effort
<rogpeppe> fwereade_: i consistently get a uniter test failure in trunk: http://paste.ubuntu.com/1261625/
<fwereade_> rogpeppe, whoa
<rogpeppe> fwereade_: and i think jujuc was failing too, let me find the failure in the copious log output...
<rogpeppe> fwereade_: yeah, uniter/jujuc too: http://paste.ubuntu.com/1261631/
<fwereade_> rogpeppe, that one's because you haven't updated goyaml I think
<rogpeppe> fwereade_: ah, i didn't realise i needed to
<rogpeppe> fwereade_: yeah, that's the reason for that, thanks
<rogpeppe> fwereade_: that doesn't fix the other problem tho
<fwereade_> rogpeppe, I'm investigating
<rogpeppe> fwereade_: does it pass for you?
<fwereade_> rogpeppe, just got to running it, had to clean up a bit of mental state so it makes sense when I look back at it
<fwereade_> rogpeppe, yeah, works for me, which is odd
<fwereade_> rogpeppe, would you try logging the actual underlying error in unit.go:578 please?
<rogpeppe> fwereade_: will do
<rogpeppe> fwereade_: you mean uniter.go presumably?
<fwereade_> rogpeppe, no, state/unit.go, surely?
<rogpeppe> fwereade_: oh, sorry, i wasn't looking at the log...
<fwereade_> rogpeppe, np -- I think the copious logging is more of a blessing than a curse but I agree it can be a lot
<rogpeppe> fwereade_: it seems wrong that we're ignoring the error there anyway
 * fwereade_ agrees
<rogpeppe> fwereade_: [LOG] 10.99256 JUJU set private address error: duplicate key insert for unique index of capped collection
<rogpeppe> fwereade_: ha.
<rogpeppe> fwereade_: maybe i need to update my mgo package
<rogpeppe> fwereade_: that fixed it.
<rogpeppe> fwereade_: phew.
<rogpeppe> fwereade_: if you have a spare moment: https://codereview.appspot.com/6623047
<TheMue> rogpeppe: good hint, just upgraded it too
<rogpeppe> fwereade_, TheMue, davecheney: and another one: https://codereview.appspot.com/6612054/
<TheMue> *click*
<TheMue> rogpeppe: In the last one you have two helpers in agent_test.go exported. is there a special reason?
<rogpeppe> TheMue: i just moved them from another file
<rogpeppe> TheMue: they're exported to indicate they're intended to be used globally, not that it makes any difference in a test file.
<TheMue> rogpeppe: yes, i've seen, but they could be private, don't they?
<TheMue> rogpeppe: ah, ic
<rogpeppe> TheMue: yes, and so could all our suite types, but they're not. i don't care too much.
<TheMue> rogpeppe: ;)
<TheMue> rogpeppe: btw, what's "arble"?
<rogpeppe> TheMue: a word :-)
<rogpeppe> TheMue: a nonsense word that was made up by a friend at uni AFAIR.
<TheMue> rogpeppe: hehe, a foobar alternative
<rogpeppe> TheMue: yeah
<TheMue> rogpeppe: you've got a lgtm
<rogpeppe> TheMue: thanks
<TheMue> rogpeppe: yw
<rogpeppe> TheMue: i win? woo! :-)
<rogpeppe> TheMue: what's my prize?
<TheMue> rogpeppe: yes, and the prize is to make the alpha bundle
<rogpeppe> TheMue: booby!
<TheMue> rogpeppe: hehe
 * TheMue fights with headaches today, and that w/o alcohol. maybe that's the reason.
<rogpeppe> TheMue: i prescribe you a large 16 y.o. Talisker
<TheMue> rogpeppe: aaaaah, you know my medicine, good
<TheMue> rogpeppe: just asked my dealer for a 17 y.o. Balvenie DoubleWood, costs about 75 £
<TheMue> rogpeppe: i hope it's not too expensive here, taxes are lower
<rogpeppe> fairly trivial CL: https://codereview.appspot.com/6622047/
<TheMue> rogpeppe: is setting the pw to "" really removing it or setting it to ""?
<rogpeppe> TheMue: for AdminPassword, it's really removing it, as mentioned in the docs
<rogpeppe> lunch
<TheMue> ok thx
<rogpeppe> mramm: i thought the meeting was at 1.30
<mramm> rogpeppe: it is
<mramm> I started the hangout too soon
<rogpeppe> mramm: ah, np
<davecheney> does anyone have the hangout link ?
<davecheney> found it
<niemeyer> davecheney: yo!
<niemeyer> Hi all
<fwereade_> niemeyer, heyhey
<niemeyer> fwereade_: Yo
<rogpeppe> fwereade_: hangout?
<davecheney> fwereade_: https://plus.google.com/hangouts/_/ad44942cb79ac76c808c48efaec6b9da87275d6c?authuser=0&hl=en
<davecheney> !meeting
<davecheney> didn't expect that to work
<niemeyer> "hello dear, My Name is Dorise i see your email at golang.org i will like us to have a good friendship"
<niemeyer> WTF?
<davecheney> niemeyer: hands off, she's promised to me
<davecheney> CONTRIBUTORS for the win
<niemeyer> :-)
<davecheney> no
<davecheney> for the spam
<niemeyer> davecheney: Ah, makes sense
<TheMue> *ROFL*
<rogpeppe> davecheney: sorry mate, i got there first
<davecheney> rogpeppe: D comes before R
<davecheney> she emailed me first
<rogpeppe> :)
<rogpeppe> niemeyer: ping
<niemeyer> rogpeppe: Hi
<rogpeppe> niemeyer: i'm just wondering about the best place to do the admin secret hash logic
<rogpeppe> niemeyer: here are a couple of possibilities: http://paste.ubuntu.com/1261889/
<niemeyer> rogpeppe: Option 2 seems nicer overall
<rogpeppe> niemeyer: ok, cool. i was trending towards that, but thought there were args both ways.
<rogpeppe> niemeyer: it's kinda odd perhaps that each environment will need to implement the same hashing logic, but that's perhaps ok.
<rogpeppe> niemeyer: actually we could implement PasswordHash function in environs
<rogpeppe> niemeyer: that might work nicely actually; then neither juju nor ec2 need be dependent on the details of the hashing.
<niemeyer> rogpeppe: That's what I imagined too
<rogpeppe> niemeyer: cool
<rogpeppe> niemeyer: will do
<niemeyer> rogpeppe: Cheers!
<rogpeppe> niemeyer: i'm piling up a few small CLs BTW if you have a moment some time.
<rogpeppe> niemeyer: https://codereview.appspot.com/6623047, https://codereview.appspot.com/6612054/, https://codereview.appspot.com/6622047/
<niemeyer> rogpeppe: I'll do reviews in a moment
<rogpeppe> niemeyer: np
<niemeyer> fwereade_: Do you want to have that conversation now?
<fwereade_> niemeyer, http://paste.ubuntu.com/1261919/
<fwereade_> niemeyer, is a summary of my current thinking which hopefully draws on our shared context
<fwereade_> niemeyer, can I just go for a ciggie while you read before we get settled in? :)
<niemeyer> fwereade_: Certainly :)
<fwereade_> niemeyer, cheers, bbs
<niemeyer> I'll prepare some chimarrão at the same time
<fwereade_> niemeyer_, back
<niemeyer_> fwereade_: I'm here too, let's fire it off
<fwereade_> niemeyer_, starting a hangout then
<niemeyer_> fwereade_: Just did it
<fwereade_> niemeyer_, joining
<niemeyer> TheMue: https://codereview.appspot.com/6589073/diff/6001/environs/jujutest/livetests.go
<TheMue> niemeyer: yes, i already asked roger and william how this is possible. the raw diff is ok.
<niemeyer> TheMue: Yep.. have you tried to propose it again?
<TheMue> niemeyer: do you have any idea how to "repair" it?
<TheMue> niemeyer: not yet, will do. one moment.
<rogpeppe> niemeyer: when you've finished with fwereade_, i've encountered another interesting wrinkle which needs a moment's thought.
<niemeyer> rogpeppe: Just reviewing a branch and will be with you
<TheMue> niemeyer: strange, now lbox hangs here. *grmblx*
<niemeyer> TheMue: Hangs? Or is it working?
<TheMue> niemeyer: oh, had a 502
<niemeyer> TheMue: That's a Launchpad internal error
<TheMue> niemeyer: i'm trying it again right now
<TheMue> niemeyer: ah, it's there now
<fwereade_> niemeyer, hey, just one thing: naming of the HookContext interface?
<fwereade_> niemeyer, I would actually be pretty happy with a jujuc.Context interface
<rogpeppe> fwereade_: +1
<fwereade_> rogpeppe, cheers
<niemeyer> fwereade_: +1
<fwereade_> niemeyer, cheers :)
<niemeyer> TheMue: Sent a couple of comments
<niemeyer> TheMue: and LGTM assuming that tests pass on Amazon
<niemeyer> TheMue: Live tests, that is
<TheMue> niemeyer: just seen the notification, thx
<niemeyer> TheMue: I'd like to cover something about the next step
<niemeyer> TheMue: Do you have a minute?
<TheMue> niemeyer: yes
<niemeyer> TheMue: Okay, so
<niemeyer> TheMue: I believe that one of the reasons why the firewaller works well today is because we have a contract that when a machine comes up, it's coming up with all its ports closed
<niemeyer> TheMue: In other words, StartInstance always hands off machines with no ports open
<niemeyer> TheMue: This is of course being broken by the global scheme
<niemeyer> TheMue: Which is alright on itself, except for one detail
<niemeyer> TheMue: When a machine dies, what happens with its ports?
<niemeyer> TheMue: Are you still there?
<TheMue> niemeyer: Yes, thinking about it.
<niemeyer> TheMue: The answer is nothing.
<niemeyer> TheMue: And that's been fine so far
<niemeyer> TheMue: Because the group is only used by that one machine
<TheMue> niemeyer: We react in the sense of the lifecycle
<niemeyer> TheMue: Meaning?
<TheMue> niemeyer: if i get it right we close the ports of the unit.
<TheMue> niemeyer: but with a global group that's not good
<TheMue> niemeyer: it only should be closed if no unit needs that port anymore
<niemeyer> TheMue: When do we close the ports of the unit?
<TheMue> niemeyer: one moment, checking if i have seen it right. thought it is in flushMachine()
<niemeyer> TheMue: Which machine? It was removed
<niemeyer> TheMue: We may never have seen it
<niemeyer> TheMue: I mean, the removal
<niemeyer> How can we tell which ports on that global group are improperly opened
<TheMue> niemeyer: That's what i meant with a lifecycle change. This would be caught. But not a real hard death.
<niemeyer> TheMue: We cannot trust on luck
<TheMue> niemeyer: yes, i know. it was only a view on the status quo
<niemeyer> TheMue: Anyway, I'll leave you thinking about that problem.. we need a way to close those ports
<niemeyer> rogpeppe: SOrry, I have a call with flacoste right now.. I'll be with you after that
<rogpeppe> niemeyer: np
<TheMue> niemeyer: ok, so i know my next task ;)
<fwereade_> niemeyer, when you're off your call, https://codereview.appspot.com/6620054 isn't quite what we originally discussed but should still be pretty trivial
<niemeyer> fwereade_: Sounds good, I'll just be with rogpeppe for a while and will then jump on that
<rogpeppe> niemeyer: there's a problem with using a salted password
<fwereade_> niemeyer, cool, thanks
<rogpeppe> niemeyer: which is that we don't know the salt when we reconnect
<niemeyer> rogpeppe: Where is the salt put
<rogpeppe> niemeyer: we need to store the salt in the environment
<rogpeppe> niemeyer: we can return it in the state.Info
<rogpeppe> niemeyer: alternatively
<rogpeppe> niemeyer: we could not use a salted password at all and just say "use a long random password"
<niemeyer> rogpeppe: Just use a well known salt for now
<rogpeppe> niemeyer: what's the point?
<rogpeppe> niemeyer: well, i guess we can add salt later :-)
<rogpeppe> niemeyer: the changes are fairly small actually
<niemeyer> rogpeppe: The point is the same of having a salt
<niemeyer> rogpeppe: You can't attack a juju environment with a pre-made dictionary unless that dictionary was built specifically to attack a juju environment
<rogpeppe> niemeyer: there's no point in having a constant salt - it's equivalent to having no salt at all
<rogpeppe> niemeyer: i guess so
<niemeyer> rogpeppe: No, it's not equivalent
<rogpeppe> niemeyer: yeah, i see your point
<niemeyer> rogpeppe: If you use a salt, you force people to start building that dictionary today
<rogpeppe> niemeyer: ok, constant salt it is
<niemeyer> rogpeppe: If you use an algorithm that takes 1 second to compute the hash on a modern computer, even a short password will take many years to break
<rogpeppe> niemeyer: the password hash function i've just used takes about 0.2 seconds per password FWIW.
<niemeyer> rogpeppe: Increase the count
<rogpeppe> niemeyer: ok, will do. it'll increase the test time though :-)
<niemeyer> rogpeppe: Not if you make that a parameter
<rogpeppe> niemeyer: i'm not sure it should be a parameter. it could be a global variable changed by the test code though.
<rogpeppe> niemeyer: part of the point of putting it in a function is to hide the parameters, i think.
<niemeyer> rogpeppe: Or just keep it 0.2
<rogpeppe> niemeyer: i think 0.2 should be fine tbh
<niemeyer> rogpeppe: I think so as well.. we're overvaluing that problem.. we have a lot to do before that kind of thing becomes a real issue
<rogpeppe> niemeyer: and we can add a random salt quite easily at a later stage, and people can change their passwords.
<rogpeppe> niemeyer: yup
<niemeyer> rogpeppe: All of yours reviewed
<rogpeppe> niemeyer: brilliant, thanks
<niemeyer> fwereade_: I'll do yours first thing after lunch
<rogpeppe> niemeyer: i've just made the change you suggested to https://codereview.appspot.com/6623047/
 * niemeyer => food!
<niemeyer> rogpeppe: Cheers
<fwereade_> niemeyer, cheers, enjoy :)
 * niemeyer is back
<rogpeppe> niemeyer: a couple of comments on your reviews.
<rogpeppe> niemeyer: BTW i think it's better to always require that the entity name in the cloudinit's state-info matches the machine's entity name.
<niemeyer> rogpeppe: Thanks, reviewing fwereade_'s at the moment
<rogpeppe> niemeyer: np
<rogpeppe> niemeyer: thanks for the reviews
<niemeyer> rogpeppe: Doesn't seem reasonable to have a "machine-0" entity name with an admin password
<rogpeppe> niemeyer: it's the machine password too
<niemeyer> rogpeppe: As far as the caller is concerned, it's the admin password
<rogpeppe> niemeyer: ok, sounds reasonable
<niemeyer> rogpeppe: We just play cheap
<rogpeppe> niemeyer: ?
<niemeyer> rogpeppe: Nevermind.. you got it
<niemeyer> fwereade_: Sent suggestions on the interface.. please let me know how you feel about them
<niemeyer> rogpeppe: "i don't think the base64 padding will make any difference"
<niemeyer> rogpeppe: It's just silly to have "==" behind every password. Let's avoid that, please.
<rogpeppe> niemeyer: ok
<rogpeppe> niemeyer: sorry, i thought it was a security concern
<niemeyer> rogpeppe: Not a security concern.. just not nice
<rogpeppe> niemeyer: i'll check the padding
<niemeyer> rogpeppe: By tweaking the input to a proper size, you can avoid it with no pain or work
<rogpeppe> niemeyer: i agree
<niemeyer> rogpeppe: Either way, replied in the CL
<rogpeppe> niemeyer: thanks
<niemeyer> rogpeppe: np
<fwereade_> niemeyer, all look great except maybe Settings
<fwereade_> niemeyer, I'm afraid fixes/progress will trickle in over the w/e, I think I'm done for the night
<rogpeppe> fwereade_: have a great w/e!
<fwereade_> rogpeppe, and you :)
<niemeyer> fwereade_: Indeed, have a great one
<niemeyer> fwereade_: Settings is the interface of what we'll have as state.Settings
<niemeyer> fwereade_: No Russian novel programming
<niemeyer> :)
<niemeyer> http://www.johndcook.com/blog/2012/09/27/russian-novel-programming/
<rogpeppe> niemeyer: argh. the EntityName check breaks everything
<niemeyer> rogpeppe: Because the implementation is not finished, I suppose
<rogpeppe> niemeyer: because a lot of tests do this, for instance:
<rogpeppe> 	inst, err := t.Env.StartInstance(0, InvalidStateInfo, nil)
<rogpeppe> niemeyer: and now we need to craft a different StateInfo each time.
<rogpeppe> niemeyer: or perhaps have just two
<rogpeppe> niemeyer: oh, no
<rogpeppe> niemeyer: different each time
<niemeyer> rogpeppe: Why?
<rogpeppe> niemeyer: because we're starting a different machine each time
<rogpeppe> niemeyer: so we need a different entity name
<rogpeppe> niemeyer: that may well be the right thing to do, but it's quite a few changes
<niemeyer> rogpeppe: I don't think I understand the issue.. grepping for it
<rogpeppe> niemeyer: yeah sure. it's just more than the trivial change i thought it was going to be.
<niemeyer> rogpeppe: Why are we giving the instances an invalid state info?
<rogpeppe> niemeyer: this is in tests
<niemeyer> rogpeppe: The question too :)
<rogpeppe> niemeyer: where there's no state to connect to
<niemeyer> rogpeppe: I'm seeing this in LiveTests
<rogpeppe> niemeyer: for some tests we only care about the instances being created and destroyed.
<rogpeppe> niemeyer: we don't care what actually happens on the machine.
<niemeyer> rogpeppe: Still feels awkward, but okay.. that's not the time to fix it..
<rogpeppe> niemeyer: and ISTR that those tests were written before we had a dummy state server
<niemeyer> rogpeppe: Perhaps that's the reasoning
<rogpeppe> niemeyer: they were some of the earliest juju code actually
<niemeyer> rogpeppe: Either way, func InvalidStateInfo(machineId int) *StateInfo
<rogpeppe> niemeyer:
<rogpeppe> // InvalidStateInfo holds information about no state - it will always give
<rogpeppe> // an error when connected to. The machine id gives the
<rogpeppe> // machine id of the machine to be started
<rogpeppe> func InvalidStateInfo(machineId int) *state.Info {
<rogpeppe> snap!
<niemeyer> rogpeppe: Great minds think alike, as Jamu would say :-)
<niemeyer> rogpeppe: LGTM on bootstrap-state,  thanks
<rogpeppe> niemeyer: cool, thanks
<niemeyer> rogpeppe: LGTM on initial password too, thank you
<rogpeppe> niemeyer: cool, thanks
<niemeyer> rogpeppe: it'd be good to run all tests, if you haven't done it.. quite a few touchy changes
<rogpeppe> niemeyer: problem with all these small branches is keeping track of 'em all...
<rogpeppe> niemeyer: i'm doing it currently.
<niemeyer> rogpeppe: Thanks for that
<rogpeppe> niemeyer: perhaps i should be running them on all branches in parallel
<rogpeppe> niemeyer: i made the password changes: https://codereview.appspot.com/6615047
<niemeyer> rogpeppe: Looking
<niemeyer> rogpeppe: Why the big salt?
<rogpeppe> niemeyer: why not?
<niemeyer> rogpeppe: Because two bytes should be enough
<rogpeppe> niemeyer: ok, i'll use a smaller salt
<niemeyer> rogpeppe: Suggest just a trivial check on the content of the password.. like its length
<niemeyer> rogpeppe: LGTM otherwise, thanks for the changes
<rogpeppe> niemeyer: i was right to run the live tests... i had broken something. (i don't *think* it's in trunk)
<rogpeppe> niemeyer: i've got to go now, unfortunately.
<rogpeppe> niemeyer: have a great weekend.
<niemeyer> rogpeppe: Hmm.. I hope that wasn't a "I've broken trunk and I have to go" :)
<niemeyer> rogpeppe: have a great weekend :)
<rogpeppe> niemeyer: i don't think so
<rogpeppe> niemeyer: the thing that broke was the tests that were added in the most recent (unsubmitted) branch for entity name in cloudinit
<fss> niemeyer: ping
<niemeyer> fss: Yo
<fss> niemeyer: did you see this cl: https://codereview.appspot.com/6586073/?
<niemeyer> fss: I hadn't seen it, but this is looking very good in fact
<niemeyer> fss: Have you signed the contribution agreement yet?
<niemeyer> fss: Sorry, I don't know if I asked that before
<fss> niemeyer: I don't think so, where can I check this?
<niemeyer> fss: www.canonical.com/contribute
<niemeyer> fss: It's a straightforward, non-draconian agreement
<fss> 404
<fss> =/
<fss> I think I found it
<fss>  /contributors
<niemeyer> fss: Sorry
<fss> Should I put your name as Canonical Project Manager?
<niemeyer> fss: Yes please
<niemeyer> fss: I'm providing a review.. pretty simple stuff really. It's mostly ready for integration.
<fss> done, I've signed the cla
<fss> I've already finished the package, also created an iamtest package, I will send it to you in small parts (and adapt what is already done to your feedback)
<fss> I'm going home now, but I'll be online in 30 minutes
<fss> brb
<fss> niemeyer: I'm back
<niemeyer> fss: Welcome back.. was just looking at the changes
<fss> I forgot to add the integration tests suite, I'll send it tomorrow or later today
<niemeyer> fss: Thanks.. it's fine to do that in a new CL
<niemeyer> fss: I'm submitting this one
<fss> cool :-)
<niemeyer> fss: Hmm.. you'll probably want to name your future branches other than "goamz" :)
<fss> i was about to ask about that
<fss> how should I proceed with lbox after the cl gets merged
<niemeyer> fss: The name is usually after what's being done
<niemeyer> fss: add-iam or whatever
<fss> hmm, so I create a new branch for each cl?
<niemeyer> fss: The workflow is not enforced.. what I personally do is to grab the master again and pull from trunk
<niemeyer> fss: I use cobzr
<niemeyer> fss: That's right
<niemeyer> fss: This allows you to do things in parallel when you want, and to not mess up changes
<niemeyer> fss: Your first change is submitted
<niemeyer> fss: Congratulations! :)
<fss> niemeyer: \o/
<niemeyer> fss: Wanna do your first review too?  Something that will look familiar:
<niemeyer> Erm.. if only Launchpad would help
<fss> niemeyer: thanks for the orientation, I will start sending other methods and types tomorrow
<niemeyer> fss: https://codereview.appspot.com/6611053
<fss> niemeyer: cool :)
<niemeyer> fss: If you understand what's going on there, and are happy with it, just reply with LGTM there
<fss> I was just thinking about the order of Region fields
#juju-dev 2013-09-30
<thumper> hi davecheney
<thumper> davecheney: are you working today?
<thumper> davecheney: wallyworld is on holiday, and axw has a public holiday
<davecheney> thumper: is it a public holiday today ?
<davecheney> i'm terrible at these things
<thumper> davecheney: for WA I think
<davecheney> nup, i'm in NSW
<davecheney> i'll be here all week, try the fish
<thumper> davecheney: I have a number of small branches that fix saucy issues
<thumper> davecheney: https://codereview.appspot.com/14114043/
<thumper> davecheney: oh, just saw your review of it
<thumper> davecheney: and it does work on precise
<thumper> the other option is --session (for testing only)
<thumper> I checked this
<thumper> and tested on ec2
<thumper> davecheney: I have a golxc one, and another juju one coming
<davecheney> thumper: keep 'em coming
<thumper> davecheney: ack
<thumper> davecheney:  https://codereview.appspot.com/14114044/
<davecheney> thumper: LGTM
<davecheney> reviewed by email
<thumper> davecheney: I agree on the name, but that would be a bigger change just now
<thumper> I'm trying to keep 'em smallish
<thumper> davecheney: it seems it isn't just me failing with this error
<thumper> davecheney: the gobot is also failing
<thumper> davecheney: could I get you to run the tests on trunk to see if you get it?
<davecheney> sure, running trunk now
<thumper> ta
<davecheney> same,
<davecheney> [LOG] 36.85478 DEBUG juju.environs.simplestreams cannot load index "http://127.0.0.1:42617/peckham/private/tools/streams/v1/index.sjson": invalid URL "http://127.0.0.1:42617/peckham/private/tools/streams/v1/index.sjson" not found
<thumper> hmm...
<davecheney> http://paste.ubuntu.com/6173877/
<davecheney> broke
 * thumper wonders how it landed
<thumper> davecheney: https://code.launchpad.net/~thumper/golxc/nicer-destroy/+merge/188254
 * thumper now looks at the failing test
<thumper> davecheney: you are running raring?
 * thumper afk for a bit
<davecheney> thumper: yes sir
<_thumper_> jam: ping for when you start
<jam> thumper: pong
<thumper> jam: hangout? fire-fighting
<jam> sure
<thumper> jam: https://plus.google.com/hangouts/_/7e75017df572083de566b5fc04dab18866050eb4?hl=en
<thumper> jam: https://code.launchpad.net/~thumper/juju-core/revert-1901/+merge/188261
<thumper> jam: https://code.launchpad.net/~thumper/golxc/nicer-destroy/+merge/188254
<rogpeppe> mornin' all
<rogpeppe> fwereade: hiya
<fwereade> rogpeppe, heyhey
<rogpeppe> fwereade: looking for a review of https://codereview.appspot.com/14038045/ if you have a mo at some point. (joint work of mgz & i)
<TheMue> morning
<rogpeppe> TheMue: mornin'
<TheMue> rogpeppe: heya, need a short restart after update
<TheMue> so, back again
<fwereade> rogpeppe, reviewed, not sure if there's some reason to mix concerns that I'm not quite getting
<fwereade> jam, thank you for spotting the AccessDenied
<jam> fwereade: np
<fwereade> jam, am I right in thinking that mgz has keys to fix that?
<jam> fwereade: It is the ec2 bucket, I don't know who has keys. Dave does, probably curtis does
<jam> I don't
<dimitern> fwereade, hey
<fwereade> dimitern, heyhey
<rogpeppe> fwereade: the reason i thought it was good to put both the address updater and publisher in the same place is that they both need to respond to almost exactly the same information - it's trivial to do them both together
<fwereade> rogpeppe, no it's not
<dimitern> fwereade, https://codereview.appspot.com/14036045/ would you take a look please?
<rogpeppe> fwereade: and the publisher is actually thing i actually need out of this work
<fwereade> rogpeppe, why so? just having the info in state is good enough, surely?
<rogpeppe> fwereade: i wanted to avoid doing a scan through all machines every time someone logs in
<fwereade> rogpeppe, can't we just index by jobs if that turns out to be a cost worth worrying about?
<rogpeppe> fwereade: can you index by a set?
<fwereade> rogpeppe, unless you know for sure that you *can't*, mixing concerns like this is seriously premature optimization
<fwereade> rogpeppe, last resort, not first
<fwereade> rogpeppe, even if you do know
<fwereade> rogpeppe, there's nothing stopping a separate publisher task from working with the data collected here
<fwereade> rogpeppe, and pretending the two tasks are the same is just not helpful
<rogpeppe> fwereade: they seem to go together quite nicely to me
<fwereade> rogpeppe, if your type description says "X does Y. Also, it does Z" you really should be writing either two types, or a long comment detailing the justifications for doing so
<rogpeppe> fwereade: we'll also want this logic for publishing the provider stateinfo
<fwereade> rogpeppe, I'm not saying the logic is *bad* even
<fwereade> rogpeppe, just that the package is doing way too much
<fwereade> rogpeppe, (and when you do the provider state info, I worry you'd say it's "trivial" to add it to this type too, because the tasks go together "nicely"...)
<rogpeppe> fwereade: ok, i guess. i thought the publishing bit is a relatively small addition to the rest of the logic, which is concerned with knowing when addresses change.
<rogpeppe> fwereade: yes, i'd thought that this package could be concerned with all addressing stuff.
<rogpeppe> fwereade: in particular, i'd thought we'd have two places where we'd publish the current set of addresses
<rogpeppe> fwereade: in state and in the provider
<fwereade> rogpeppe, me too
<rogpeppe> fwereade: and that the same code can be responsible for both
<fwereade> rogpeppe, ISTM that that is more than enough reason to put that clever code in its own package
<rogpeppe> fwereade: it's not very clever code
<fwereade> rogpeppe, because then whoever needs to add functionality to it will *only* have that to deal with
<fwereade> rogpeppe, rather than having to understand all the address-updating stuff as well
<rogpeppe> fwereade: i think this is making life harder for ourselves again
<rogpeppe> fwereade: but if you insist
<fwereade> rogpeppe, I am not open to argument here
<fwereade> rogpeppe, the concerns are separate
<fwereade> rogpeppe, you've got the first one practically done, it seems
<fwereade> rogpeppe, it can go in as a worker and start making life easier immediately
<rogpeppe> fwereade: that was the plan
<jam> fwereade: I wanted to chat a bit about the ssl stuff, but after you're done with rog
<fwereade> rogpeppe, and then we can write another worker that might even be properly trivial
<fwereade> rogpeppe, and really easy to understand and change in isolation for the publish-to-environ case
<rogpeppe> fwereade: honestly, the publisher goroutine is really simple, and it won't be that simple when factored out as its own goroutine
<rogpeppe> s/goroutine/worker/
<rogpeppe> s/goroutine$/worker/ :-)
<rogpeppe> fwereade: because it'll have to duplicate a lot of stuff that this one is doing
<rogpeppe>  but we like duplication
<fwereade> rogpeppe, AFAICT the only actual point of overlap is watching all machines
<rogpeppe> fwereade: and the environ
<fwereade> rogpeppe, and that's not really appropriate to a publisher anyway, but it'll do in a pinch
<rogpeppe> fwereade: the publisher needs to watch all machines, no?
<fwereade> rogpeppe, aw man, I guess we're doing the mix-environ-watching-into-everything stuff again?:(
<fwereade> rogpeppe, depends
<fwereade> rogpeppe, would be nicer if we could just watch all the state servers
<rogpeppe> fwereade: in this case, we can have a separate environ watcher that sets a shared Environ, guarded by a mutex.
<fwereade> rogpeppe, is there any case we *couldn't* do that in?
<fwereade> rogpeppe, Environ is meant to be goroutine-safe, right?
<rogpeppe> fwereade: yeah
<rogpeppe> fwereade: i'm not sure - there might be some cases where we actually want to know when an environ has changed.
<fwereade> rogpeppe, I would hope not, surely?
<fwereade> rogpeppe, and if we do, that would seem to be the place for custom environ-watching code
<fwereade> rogpeppe, anyway axw has some investigation into that in his queue, I think
<fwereade> jam, ssl?
<jam> fwereade: so. I like smoser's idea to add the cert, mostly because it means I don't have to track down edge cases. It means I still need the code I've landed, because the initial *client* needs to have a way to connect.
<jam> However
<jam> fwereade: It is completely non-obvious how we get the Cert out of the connection.
<fwereade> jam, ha
<jam> I think we have to create a custom http.Transport object, that overrides Dial
<jam> so that when it connects
<jam> we can peek at the tls.Conn object
<jam> which has a ConnectionState call
<jam> that can have the certs in it
<jam> But the layer at which cloud-init sits
<jam> is about 5 abstractions away from the actual Conn
<jam> fwereade: and I'm wondering how terrible that is
<jam> The best I can think of is to have a global registry of hostname => Certs
<jam> and then create a custom Transport
<jam> well, custom Dial that adds those certs to the registry
<jam> and then if you have "ssl-hostname-verification: false" set
<jam> it still does what I've done today
<jam> but then at cloud-init time
<jam> it looks in the global registry
<jam> if there is a cert for auth-url
<jam> and if so, it puts it into cloud-init
<jam> The vagaries of "hostname => certificate " concern me
<fwereade> jam, I'm shuddering a little there
<jam> but it might be feasible
<jam> fwereade: the net.HTTP stuff doesn't expose any way to get access to the cached Conn objects
<jam> so I can't do it without overriding Dial and peeking at connection time
<fwereade> jam, that bit seems fine to me
<fwereade> jam, it's the global registry that freaks me out
<fwereade> jam, altogether too much action at a distance
<jam> fwereade: so we already have a custom Transport object
<jam> because we have to set tls.Config.InsecureSkipVerify = true
<jam> it isn't hard to inject a Dial there
<jam> though I'm not sure how to make that dial
<jam> have enough context
<jam> to be able to cache the connection information on the Goose object ?
<fwereade> jam, weeeell there are always ways... eg can we make the goose object itself supply the custom dial function?
<fwereade> jam, (I feel like the situation is symptomatic of too many globals, and that adding more is unlikely to bring us to a happy conclusion)
<jam> fwereade: so *today* we are using a shared HTTP Client
<jam> because that seems to be the recommend way
<rogpeppe> jam: could you just add the cert to /etc/ssl/certs/ca-certificates.crt ?
<jam> so that you get global connection pooling
<jam> rogpeppe: that is what cloud-init allows for you, the trick is *digging out* the certificate from the connection
<jam> fwereade: we could certainly just punt on all of that (though it is how net/http works), and go with a one http.Client per goose.Client, and then goose.Client asks for an http.Client that has *this custom Dial* func() that is actually an appropriate closure
<jam> fwereade: I'm not 100% sure how we do the juju-side of it.
<jam> Because of simplestreams
<jam> we might not need to
<jam> as in, we leave juju simplestreams as just ignoring the certificate, we teach goose how to grab the certificate, and then we teach juju bootstrap how to ask goose for what the cert is
<jam> note there is still a small problem that the "SWIFT" URL doesn't have to match the AUTH URL
<rogpeppe> jam: we can't make it a configuration option, so the user tells juju about their own self-signed cert?
<jam> rogpeppe: how do they get that cert
<rogpeppe> jam: off their provider, i suppose
<jam> rogpeppe: users don't really want to connect via Firefox, click on "I understand the warning" then "Download Certificate", copy and paste that into an environment.yaml file (.jenv)
<jam> rogpeppe: my point on the bug is: "ssl-hostname-verification: false" really easy for a user to type and understand
<jam> rogpeppe: go inspect this service over there to pull out its SSL certificate
<jam> rogpeppe: *completely* non-obvious
<fwereade> jam, +1
<jam> rogpeppe: right now, I'm looking at how I get "ssl-hostname-verification: false" to work for all our stuff that just downloads from a URL
<jam> cloud-init does
<jam> upgrader does
<jam> charmer does
<jam> etc
<jam> I either propagate ssl-hostname-verification = false into EnvironConfig
<jam> and teach the API
<jam> that for things that return a URL
<jam> they also return a "And you should ignore the certificate for this URL"
<jam> or I teach something like Bootstrap
<jam> to put "here is a new Cert for you to use"
<rogpeppe> or you find out the cert somehow, yeah
<fwereade> jam, I'm starting to feel gordian-knotty here -- I think you should probably just go with skipverify for now, because it delivers actual value to users who specifically say they want insecurity
<jam> to accept
<jam> fwereade: what really sucks is that we have the cert
<fwereade> jam, giving those users a bit of extra security is just a bonus
<jam> but it is over here on this object that is hidden between 3 interfaces and a type that doesn't expose its internal map
<rogpeppe> jam: this problem is almost all about when we're talking to storage, right?
<jam> rogpeppe: right
<jam> rogpeppe: there are 2 problems, but I feel like I've solved the first
<rogpeppe> jam: so, storage already exposes a URL method, yes?
<jam> we need to handle connecting to the Provider
<jam> and we need to handle Storage
<jam> rogpeppe: all the agents that aren't on machine-0 don't have a Provider connection
<jam> just a bunch of URLs
<jam> rogpeppe: which is why on Openstack the Storage() has to be a world-readable container
<jam> (swift version of s3 bucket)
<rogpeppe> jam: so if we've got a URL for a provider, can we can find out the certificates provided by that URL?
<jam> rogpeppe: per the work we've been doing to put everything into the API, we *really* don't want the Provider secrets on any machine but machine-0
<rogpeppe> jam: ISTM that that's a potential way of bypassing the abstraction layers
<jam> rogpeppe: as I've been saying, yes. You just connect to it, and then the tls.Conn object has a "ConnectionState" which has the certs. But *that* object is very hidden.
<fwereade> jam, I feel like the urge for a solution is going to cause us either to fuck proper layering hard, or to fiddle with quite a lot of code in order to pass certs around with all the urls we store
<jam> rogpeppe: so yes, we could make the API Server proxy for anything you want to download
<jam> but that is quite a bit bigger change.
<rogpeppe> jam: i'm not sure i was suggesting that.
<rogpeppe> jam: are you saying that it's not possible to use the net/http interface to make a connection and find out the certs at the other end, regardless of our code?
<jam> rogpeppe: so if we've done the work to extract the Certificate, then we we start an instance, we can tell cloud-init to add the certificate to the accepted certs store for that machine.
<jam> rogpeppe: net/http has a global shared Client that pools connections, and that map of address => connection is not exposed (that I can see)
<jam> rogpeppe: we *can* create an http.Client that uses a custom Dial
<fwereade> sorry, brb
<jam> and when we get a Dial attempt
<jam> we inspect if it is a tls.Conn
<jam> and if so
<jam> grab the certificate
<jam> but *where do we put it*
<jam> so that we can pull it out later when we get to cloud-init time
<rogpeppe> jam: can't we put it into the environ's config?
<jam> rogpeppe: http.Client is *intended* to be a global shared state
<jam> rogpeppe: how do we get it from Dial => environ config
<dimitern> fwereade, ping
<rogpeppe> jam: what i'm trying to suggest is that somewhere outside the provider, if we have insecureSkipVerify, we invent a storage request, try to dial its URL, extract the certificate, and save it in the provider (and possibly change the global http client too)
<rogpeppe> jam: so we don't have to wait until the provider does its own Dial
<rogpeppe> jam: we preempt it by doing our own first
<jam> rogpeppe: so we don't actually know where storage is until we've connected to the provider
<jam> rogpeppe: openstack uses a registry of URLs for where things like swift is at
<jam> vs how ec2 has "known urls" ahead of time.
<jam> rogpeppe: so you log in, then get back a list of "this is the URL to use for Swift"
<rogpeppe> jam: but that's in code that isn't hard to change to allow insecureSkipVerify, no?
<jam> rogpeppe: so that's already been done, but it also means we've already done the Dial, so it doesn't make a lot of sense to do it separately
<rogpeppe> jam: i don't mind a bit of inefficiency in this case
<rogpeppe> jam: it's only one extra http request, after all; i'm probably missing something though.
<jam> rogpeppe: so we have a fair number of abstractions about what URLs we are downloading from
<jam> there isn't Just One
<jam> we could probably do just Environ.Storage
<jam> (and assume that tools-url is going to match that)
<jam> though there are no guarantees to that effect
<rogpeppe> jam: isn't the whole reason for URL so that we can use it in shell scripts ?
<rogpeppe> jam: what other abstractions are you thinking about?
<jam> rogpeppe: you mean for env.Storage().URL ?
<rogpeppe> jam: yeah
<dimitern> fwereade, jam, updated https://codereview.appspot.com/14036045/
<jam> rogpeppe: so you're allowed to specify "tools-url" and "imagemetadata-url" which are just URL roots that we will use to look for image metadata and for tools metadata
<fwereade> dimitern, heyhey
<fwereade> dimitern, I will take a look
<rogpeppe> jam: is it ok to assume that they use certs signed by the same authority?
<rogpeppe> jam: or that if someone uses ssl-hostname-verification=false, that adding a cert from one of them will be good enough?
<fwereade> rogpeppe, that does not sound ideal to me
<jam> rogpeppe: so ian's design for simplestreams is that it can be any-old-http-server that you want, one of which might be swift/s3
<jam> rogpeppe: for the *immediate* use case, that might be ok
<jam> though auth-url and swift-url are different machines, I think
<jam> so if they are using self-signed, they might be different self signed.
<fwereade> jam, can we land it with just the existing disabling in place, and triage a bug for doing it better as wishlist or something? I feel like we're in danger of sacrificing better to best
<jam> fwereade: it won't work today with just what I've done so far
<jam> we can bootstrap
<jam> and with the cloud-init it will start
<jam> but Upgrader Uniter etc will still be broken
<jam> I can land this, and work on those
<jam> fwereade: but that is why I was tempted by smoser's idea
<rogpeppe> jam: smoser's idea is good *if* you know where to find the certificates to add
<fwereade> rogpeppe, well, yeah,but it's *that* problem that feels to me like an uncontainable horror
<jam> rogpeppe: so we can iterate over all the simplestreams DataSources and get all of their certs (if any) I suppose
<jam> ah, except the Sources
<jam> use their own connection
<jam> we just call Source.Fetch()
<jam> but we *do* have source.URL
<jam> so for _, source := range GetToolsSources(): customClient.Get(source.URL()) => drops the Cert somewhere we can get it
<rogpeppe> jam: can we not define our own global http client, and have everything in juju use it?
<fwereade> jam, but can we even be sure that a simplestreams file will only specify relative addresses for the actual downloads?
<rogpeppe> jam: hmm, that's not good either
<rogpeppe> fwereade: ha ha, good point
<jam> fwereade: that is part of the simplestreams spec
<fwereade> jam, ok, sweet, so long as someone's committed to that, I'd missed that
<jam> fwereade: so there is stuff about mirrors, but the design is that the index always gives relative paths, so when you mirror the data, you don't have to change it
<fwereade> jam, I think there's a disconnect there
<fwereade> jam, but it's not worth worrying about actually
<fwereade> jam, ech, or is it
<jam> fwereade: so when we go to cloud-images... it tells us where the data is for amazonaws, etc.
<fwereade> jam, anyway if it's in the spec this is moot
<jam> fwereade: I'm not 100% sure about how the tools stuff is going to go, we are intending that you mirror tools into a local index
<jam> fwereade: so I think we can avoid doing an HTTP get, we can iterate the sources, get the URL("") and then if the URL.Scheme == "https" do a tls.Dial and grab the cert out of there.
<jam> (if ssl-hostname-verify: false)
<jam> create a Set() of those certs, and then add them all to cloud-init
<fwereade> jam, ok... we're still left assuming that those certs won't change... is that ok?
<jam> we probably still need to use ssl-hostname-verification: false when talking to the Provider itself (maybe), or we include auth-url as one of the bits we want to add
<jam> fwereade: I think for the use case, it is fine. I was worried about that as well
<jam> fwereade: but I don't think people are going to change their self-signed certs and expect juju to upgrade in place
<jam> I think
<fwereade> jam, I guess it's the same problem as updating authorized-keys in essence anyway
<fwereade> jam, ie we could actually build the infrastructure to handle it if we had to
<jam> fwereade: well ssl-hostname-verification: false would just disable it always, right?
<jam> fwereade: we'd start managing certificates
<jam> which I would love to avoid
<jam> (oh, revoke that certificate, add this one), but I guess people want us to do that for authorized-keys as well
<fwereade> jam, that was my thought, yeah
<jam> fwereade: I think it is worthwhile to think how this interacts with the httpstorage proposal as well
<jam> (a storage url may not be available before bootstrap time ?)
<fwereade> jam, anything using it before bootstrap time will be able to get what it's looking for off the filesystem, won't it?
<jam> fwereade: I'm meaning httpstorage with the local provider
<jam> we're talking about it exposing https
<jam> and I guess we'll do something about accepting that cert
<jam> fwereade: I thought I saw an axw commit that said "we'll need to disable certs for this"
<fwereade> jam, I may have missed that bit... but the wouldn't the environment's CA cert be what we'd use/need there?
<jam> fwereade: I honestly don't know what the plan is, and axw is already gone for the day
<fwereade> jam, ok, fair enough
<jam> fwereade: I just know of it as yet-another HTTPS source we might need to worry about
<fwereade> jam, indeed, got you
<jam> I don't really know how we set ca-certs for those instances
<jam> I don't think we use cloud-init there
<jam> given we would have to run a metadata server on the users' machine
<jam> (I think)
<fwereade> jam, alternate tack again: the cost and complexity of adding and using a bool-returning api method for each of the facades is known to be small, and fulfils the current use case adequately if not admirably
<fwereade> jam, the cost and complexity of the alternatives stm to make it unlikely that we'll get an adequate implementation anywhere near as soon
<jam> dimitern: so for https://codereview.appspot.com/14036045/ couldn't we have something that takes a list of jobs and tells you if you need state access?
<fwereade> jam, no argument, the admirability ceiling of keeping track of the actual certs is much higher, but it feels wishlisty
<jam> that seems generic enough to work whether or not you're on the end of the API or directly on state
<jam> fwereade: well, it was also impacted by the fact that scott raised the question, and nobody else reviewed the proposal :)
<jam> fwereade: so I certainly considered the "add this cert manually" but the feeling of how to do the manual cert seemed terrible for users. So I explored this possibility of doing it for them
<jam> fwereade: but yeah, disabling it everywhere seems more directly straightforward
<jam> fwereade: and avoids the "oh you need 3 certs for the various services, etc"
<fwereade> jam, don't get me wrong, I am happy that you explored it, it's exactly the sort of thing I'd like us to be considering by default
<jam> fwereade: yeah, I felt it was worth discussing at least, I certainly wasn't committing to code it yet, but I did investigate to see what it would have taken.
<jam> If net/http could have exposed the existing conn I was pretty interested. Having to do it via inspecting Dial made me a bit sad.
<dimitern> jam, I originally thought to add a method on MachineJob to return true if the job needs state, but we need the same one on params.MachineJob and state.MachineJob
<fwereade> dimitern, that's a bit surprising
<fwereade> dimitern, where do we get a state.MachineJob when we're not connected to state?
<jam> fwereade: so there are 2 aspects (AIUI). One side needs to know if it should add a MongoPassword, and the other side needs to know if it should ensureStateWorker
<jam> dimitern: so I think your point is that we actually have 2 Job types
<jam> one that is exposed on the API
<jam> and one that is directly in state
<jam> and so 1 function wouldn't take both types of objects
<jam> shame they aren't just an Enum
<fwereade> jam, dimitern: there are a few places where types moved from state to api rather than being copied... what are the forces that led us otherwise here?
<dimitern> fwereade, the tricky part is the machine agent code
<dimitern> fwereade, there we have both a state connection and an api connection
<jam> dimitern: because we might run the env provisioner?
<dimitern> fwereade, and we can't have the latter until we know we can connect - i.e. not bootstrapping and we know our jobs
<jam> dimitern: I think fwereade's point is that why aren't they just params.MachineJob enums ?
<dimitern> jam, because of JobManageState, but also because of the firewaller
<dimitern> jam, the env provisioner also uses the api
<fwereade> dimitern, jam: more to the point, they're ints in state and strings in the api
<fwereade> bah
<jam> fwereade: good times
<dimitern> fwereade, all int consts are strings in the api - c.f. live
<fwereade> jam, dimitern: however, no reason not to keep the int storage and expose them in methods as params.Job, right?
<dimitern> fwereade, that's because json doesn't have true ints IIRC
<fwereade> dimitern, and we want it to be half-way comprehensible too :)
<jam> fwereade: right, in an API call it is nice to see "hosts-units" vs "1"
<jam> but that would have been a reason to put them as strings into the DB as well :)
<jam> dimitern: so I think the idea is that we would have state.JobHostsUnits only long enough to turn it into params.JobHostsUnits.
<jam> But that sounds like EOUTOFSCOPE for your patch
<jam> I'd probably still rather it be a function that takes a slice of jobs
<jam> rather than putting functions on what is otherwise a blob
<mgz> hey jam
<jam> mgz: mumble/hangout ?
<dimitern> jam, fwereade, if it has to be a helper taking a slice of jobs, we still need 2 helpers
<fwereade> dimitern, to be fair, if one of them is a wrapper that does the conversion and calls the other, that wouldn't be so bad, would it?
<rogpeppe> mgz: ping
<dimitern> fwereade, ok I suppose
<jam> dimitern: so I don't feel like we need tons of overengineering, but we did have logic that needed to be changed in several places to keep them in sync
<jam> it certainly felt like it should be centralized
<jam> not helped by "this is an if clause, this is a map index, etc"
<mgz> rogpeppe: hey
<rogpeppe> mgz: do you want to continue with the address worker at some point?
<mgz> rogpeppe: that would be good
<mgz> after standup?
<rogpeppe> mgz: sounds good
<fwereade> jam, rogpeppe: ISTM that provider.StartBootstrapInstance and provider.StartInstance are out of whack
<jam> fwereade: how so?
<fwereade> jam, rogpeppe: it looks like StartInstance is something you do *to* an environ, and should be in environs
<jam> fwereade: so I think the idea is that we have 90% common code between all implementations
<fwereade> jam, rogpeppe: but StartBootstrapInstance looks like it's a common implementation of Bootstrap
<jam> that set up the machine-config etc
<fwereade> jam, provider implementations call StartBootstrapInstance, but other code calls StartInstance
<fwereade> jam, rogpeppe: I thought the idea of the code in the provider package was to help people implement providers
<rogpeppe> fwereade: i think you're probably right
<jam> fwereade: I have a feeling it was exploring the bounce-and-bounce-back stuff. (Do we call Env.Foo() which calls something common, and then calls back on Env or do we just call something common, or...)
<jam> fwereade: I have the feeling someone found the balance different each time
<jam> but we should make them at least similar
<rogpeppe> fwereade: i mean, you're definitely right that the code in the provider package is to help people implement providers
<rogpeppe> fwereade: and i think you're probably right about StartInstance. i'm just looking around - i'm not that familiar with it
<rogpeppe> fwereade: (although i suppose i may have been responsible for putting it there!)
<fwereade> rogpeppe, jam: ehh, we progress uncertainly, but we do progress
<fwereade> rogpeppe, jam: they're certainly named very confusingly given the different domains though
<rogpeppe> fwereade: agreed
 * rogpeppe hates the .(T) magic strewn around seemingly at random
<fwereade> rogpeppe, jam: ok, I think I'll move StartInstance over to Environs in a mo
<rogpeppe> it feels exceedingly fragile to me
<jam> fwereade: it is also exceedingly unhelpful that env.StartInstance does already exist
<fwereade> jam, oh, wtf
<jam> and is called by provider.StartInstance IIRC
<fwereade> jam, ahh Environ.StartInstance, sorry
<jam> fwereade: "last line of StartInstance is broker.StartInstance"
<jam> and broker == environ
<jam> ("if env, ok := broker.(environs.Environ); ok")
<fwereade> jam, yep
<jam> fwereade: so the idea is that environ.StartInstance is hard to use because it needs this MachineConfig and Tools stuff
<jam> so we'll pull that out into a helper
<jam> oddly enough
<jam> StartBootstrapInstance also needs this tools list
<jam> but the env passes them in
<fwereade> jam, rogpeppe: I would agree that the type-checking in provider.StartInstance looks like madness and bullshit
<jam> though it comes from Bootstrap
<rogpeppe> jam: i don't really understand the "environ.StartInstance is hard to use because it needs this MachineConfig and Tools stuff" comment
<rogpeppe> jam: it's only used in one single place in the code
<rogpeppe> jam: so how does creating an extra layer help?
<jam> fwereade: and there is a bootstrap.Bootstrap that does the heavy lifting before calling env.Bootstrap which then calls provider.StartBootstrapInstance which then calls env.StartInstance
<rogpeppe> jam: that seems reasonable to me
<rogpeppe> jam: because Bootstrap is something that external code might want to do.
<jam> rogpeppe: so provider.StartInstance is the same "use a helper to then call the right values on the environ" that bootstrap.Bootstrap is
<fwereade> jam, rogpeppe: exactly
<jam> rogpeppe: note that I didn't write these, though maybe more should be caught in review.
<rogpeppe> jam: except that noone except the provisioner worker should ever be starting instances
<jam> standup time
<jam> fwereade: rogpeppe: https://plus.google.com/hangouts/_/8a92f5273abdde270a9fa8d3c6c19416568d4b6b
<fwereade> rogpeppe, ok, but I don't think the provisioner should really need to know or care about tools
<natefinch> So, am I wrong in thinking EC2 instances all start with a specific amount of disk space you get for free with the instance, which is always more than 8gigs?  "Instance Storage" here - http://aws.amazon.com/ec2/instance-types/#instance-details
<mgz> natefinch: all cloud providers tend to give you another volume for misc storage
<mgz> which is much larger than the root partition
<mgz> really charms should all be configured to use that for storagey things like databases where possible
<mgz> so the root can just be packages
<natefinch> when I tweaked the defaults in the code, I could get root storage up to what was stated in that table.  afaik, aws gives you other storage but it's ephemeral, goes away on reboot
<mgz> right... that's also a consideration
<natefinch> goes away, for me, means "treat like (really slow) in-memory storage"
<mgz> natefinch: I'm not certain stopping and starting an instance is supported in general by juju, and ephemeral storage should persist across a simple reboot
<natefinch> mgz: hmm.. I was under the impression that ephemeral storage wasn't reliable across a reboot, but I'm by no means an expert, and only read the docs once, a while ago.
<mgz> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html
<mgz> "The data in an instance store persists only during the lifetime of its associated instance. If an instance reboots (intentionally or unintentionally), data in the instance store persists. However, data on instance store volumes is lost under the following circumstances: *Failure of an underlying drive  *Stopping an Amazon EBS-backed instance *Terminating an instance"
<natefinch> mgz: thanks for the link. I guess I saw ephemeral and "temporary" and jumped to conclusions
<mgz> ah, excellent
<mgz> for whatever reason, sydney is vpc-only
<mgz> all the other regions have ec2-classic
<mgz> amusing that I have to test this on the server furthest from me, but hey
<natefinch> what's an ocean or two between friends, really?
<hazmat> jam, fwereade fwiw, i'm poking around the perms on the bucket.. the problem seems to be whatever is doing the uploads
<hazmat> needs to also explicitly do an access grant to public for read
<mgz> hazmat: yes, our code just defaults to creating private
<mgz> sinzui: do you need any help fixing up the ec2 perms, following up for the 1.15 release?
<sinzui> mgz, I don't think so. I used s3cmd to upload because it has bulk powers. I will try again using s3up for each file.
<mgz> sinzui, the issue is the 'directories' not the files I believe
<sinzui> oh, then I am still clueless.
 * sinzui reads the emails again
<hazmat> there are no directories
<hazmat> there are only objects
<mgz> there is a tools/releases object that's 19 bytes
<hazmat> odd
<sinzui> the paths then http://juju-dist.s3.amazonaws.com/
<hazmat> any objects uploaded must be done so with a grant that allows global read
<hazmat> sinzui, you're using s3cmd?
<hazmat> sinzui, is this code from juju-core/scripts?
<sinzui> mgz, again?  I saw and thought I fixed the 19 byte issue. It was caused when relative paths were passed to sync-tools. I switched to absolute local paths
<abentley> sinzui: ping for standup
<hazmat> so here's the perm map of that bucket http://paste.ubuntu.com/6175597/
<hazmat> sinzui, mgz  it looks like we can create a bucket policy directive for contained objects to default to read
<hazmat> investigating
<sinzui> hazmat I can fix this when I leave my current meeting.
<hazmat> sinzui, at this point i'm already in it
<sinzui> hazmat, then I thank you very much for helping
<rogpeppe> trivial code review anyone?  https://codereview.appspot.com/14123043
<dimitern> rogpeppe, looking
<dimitern> rogpeppe, lgtn
<dimitern> m even
<rogpeppe> dimitern: ta
<mgz> looks good to noone?
<mgz> that's pretty mean
<hazmat> sinzui, np.. policy set
 * rogpeppe hates that feeling when you *know* you've implemented something identical in the (possibly recent) past, but just can't remember where that was.
<TheMue> rogpeppe: it even gets worse if you don't even know it and some time later you discover that you've done it
<rogpeppe> TheMue: actually, i don't mind that as much
<rogpeppe> TheMue: it's that feeling of struggling to reproduce logic you can *almost* remember
<rogpeppe> fwereade, jam: delete cmd/builddb: https://codereview.appspot.com/14127043
 * rogpeppe goes for lunch
<hazmat> sinzui, one other regression vs 1.14.[0,1] is the production and upload of the armhf binaries
<sinzui> hazmat, yes, only Ubuntu is making them. I saw that
<hazmat> robbiew, non-LTS server distro support is 9 months? i.e. 12.10 is no longer supported?
<natefinch> fwereade,mgz, rogpeppe, TheMue, jam, dimitern, anyone else who cares -  I'm writing juju help constraints... anyone want some input?  I want to make sure there aren't any technical errors, and if you have formatting suggestions, that's cool too: https://docs.google.com/document/d/1sy4yDUp93FYPt205Muarr8ASiEaylyVSAuycq0OkBgY/edit?usp=sharing
<mgz> natefinch: will have a look
<fwereade> natefinch, I'll try to get to it later
<natefinch> fwereade:  no problem
<mgz> note I added some generic stuff to lp:juju/docs when we did the sprint
<mgz> er... or whatever the correct location is
<natefinch> mgz: I see some stuff on constraints under juju-core/doc/provisioning.txt
<mgz> natefinch: the juju.ubuntu.com docs is what I'm talking about
<natefinch> mgz: ahh, right
<natefinch> mgz: didn't occur to me to look there
<natefinch> mgz: definitely some stuff that needs adding to my docs
<mgz> those have:
<mgz> https://juju.ubuntu.com/docs/charms-constraints.html
<mgz> https://juju.ubuntu.com/docs/reference-constraints.html
<natefinch> all right, well, there's obviously a lot to add to my docs.  Probably not worth looking over what I have until I work in those pages
<natefinch> mgz: is this true?  "A value of 'any' explicitly unsets a constraint, and will cause it to be chosen completely arbitrarily."
<mgz> er, it's not great wording, but it's true that pyjuju unsets a contraint when given the string 'any'
<natefinch> mgz: yes, but does that apply to juju-core?
<mgz> nope.
<natefinch> badness
<mgz> pyjuju distinguishes between "any" which is no constraint, and "" which is use the default
<mgz> juju-core just has "" which can mean several different things
<natefinch> we need separate documentation for pyjuju and juju-core :/
<mgz> really, we just need to update any remaining bits to talk about juju-core behaviours, with notes added for where we break compat
<jam> mgz: you might check with TheMue he's been working in the area. So that "nil" can unset a value, etc.
<natefinch> *nod* also, a lot of that constraints page reads as release  notes, not documentation  "will be controlled with a new command"  "Please note that there are no changes to"
<jam> I think that is null for JSON and maybe "" for the commandline
<jam> ah, nm, "juju unset"
<mgz> TheMue: ^please also update natefinch and the docs when you land exciting constraint semantics changes :)
 * natefinch likes writing documentation, but it also means he's picky about it ;)
<TheMue> mgz: will/would do, but so far nothing regarding constraints
<jam> TheMue: ah, "juju unset" is all about config options for a charm, not constraints
<TheMue> jam: yep, exactly
<jam> mgz: is there a way to, say, unset the mem constraint?
<mgz> mem= and mem=0 both seem to have the same effect
<mgz> no way of saying "go back to the juju default"
<mgz> (returning references to uninitialised values still freaks me out in go...)
<natefinch> you mean copies of uninitialized values, right? :)
<mgz> (need to spend brainpower to remember uint64 means 0, not random memory)
<mgz> means? gets? summat.
<mgz> natefinch: that also doesn't help :)
<natefinch> the "everything is initialized to a zero value" is one of my favorite things about go.  Although, to be fair, most modern languages do about the same thing... C#, java, etc.
<natefinch> it's just go makes zero values more useful in many cases
<rogpeppe> hmm, provider/ec2 tests seem broken for me on trunk. anyone else see that?
<rogpeppe> i see this: http://paste.ubuntu.com/6176180/
<rogpeppe> fwereade, jam, dimitern, natefinch, TheMue: can you verify please?
<mgz> rogpeppe: yeah, I see that
<rogpeppe> mgz: hmm, i wonder how it could have got past the 'bot
<mgz> the test looks like it talking to the real s3 bucket
<mgz> so, presumably the bucket wasn't broken when it got run on the bot
<rogpeppe> mgz: what makes you think that it's talking to the real s3 bucket (not that i don't believe you)
<mgz> last few lines of the log have real urls
<rogpeppe> mgz: ha, good point
<mgz> lacking the new "/releases/" part
<jam> rogpeppe: mgz: this is because we *can* now read s3
<rogpeppe> jam: i think this must be relatively recent behaviour
<jam> rogpeppe: as in, kapil just fixed s3 about 2 hours ago
<rogpeppe> jam: rev 1901 introduced the problem
<jam> rogpeppe: known, we had a failure elsewhere in the test suite because 1.15.0 was uploaded, and the test suite failed because it saw but couldn't read the bucket (see Tim's patch earlier today), then Kapil fixed the s3 bucket to be readable, and another test fails
<rogpeppe> jam: but at rev 1900, provider/ec2 tests are extremely slow (51s), whereas relatively recently they only took 5s
<jam> rogpeppe: not specifically related
<rogpeppe> oh it's such a twisty maze
<rogpeppe> i really think the dynamic type conversions everywhere are a horrible mistake
<rogpeppe> there's no way to know by looking at provider.StartInstance what methods might be called on the broker parameter
<rogpeppe> jam: i'll disable that test for the time being, just so we can actually submit something
<jam> rogpeppe: please make sure to submit a Critical bug about it
<jam> test suite not being isolated is *bad* and will bite us again
<rogpeppe> jam: totally agree
<rogpeppe> jam: https://codereview.appspot.com/14123045
<rogpeppe> mgz, fwereade, dimitern: ^
<fwereade> rogpeppe, LGTM
<fwereade> rogpeppe, I am up to my elbows in it as we speak
<fwereade> rogpeppe, in short I think that provider.StartInstance is total madness
<mgz> rogpeppe: also lgtmed, note that you shouldn't mark that bug fixed when you land, as it's the one tracking the actual issue
<rogpeppe> fwereade: +1
<rogpeppe> mgz: hmm, good point
<mgz> (or you should reference a different bug in the skip message and close that one)
<rogpeppe> mgz: if i approve the branch, will it mark the bug as fixed?
<rogpeppe> mgz: (automatically)
<mgz> I think the bot may, when landing, but you can always revert
<rogpeppe> mgz: ok
<rogpeppe> fwereade: i would really really *really* like it if we could lose all the dynamic type coercions, so any interface values passed to provider functions document exactly what methods may be called on them.
<fwereade> rogpeppe, I'm not entirely convinced there
<natefinch> rogpeppe, fwereade: +1   otherwise the interface is a lie
<fwereade> rogpeppe, no argument that provider.StartInstance is abuse
<rogpeppe> fwereade: i can't see any advantage to the way things are done currently
<rogpeppe> fwereade: it breaks types-as-documentation, and it breaks encapsulation.
<rogpeppe> fwereade: and it's trivial to make it work conventionally.
<natefinch> (to be clear, I was +1 for roger's point)
<fwereade> rogpeppe, natefinch: ISTM that the reality is that we have environs that actually do expose different features, and the custom datasources are a valid application of the technique
<rogpeppe> fwereade: exposing a different feature is not a cause for exposing a different method
<rogpeppe> fwereade: environs that don't implement custom data sources can implement a method that returns no custom data sources
<rogpeppe> fwereade: and we can easily have a "nothing custom implemented here" empty provider type.
<rogpeppe> fwereade: which can be embedded to provide the default versions of the methods.
<rogpeppe> fwereade: so the cost to any given provider is at most one line
<fwereade> rogpeppe, natefinch: so we have a giant nothing-special one that is itself useless as documentation, because any method could be overridden? or a bunch of little nothing-special ones that are embedded individually (and still you can't say for sure whether they're overridden)?
<fwereade> rogpeppe, natefinch: ISTM that the idea that an interface specifies a minimum set of capabilities is somewhat endorsed by the language
<rogpeppe> fwereade: for a given Environ type it's easily possible to say what's overridden
<fwereade> rogpeppe, natefinch: if someone were to abuse that to, say, close a conn in a surprising fashion, that would be bad
<rogpeppe> fwereade: but what is awful is having functions that say they expect some interface, and then randomly assert some other interface type down in the depths of their implementation
<rogpeppe> fwereade: i am not endorsing that StartInstance take the giant interface type.
<fwereade> rogpeppe, natefinch: whereas clever copying things with ReadFrom are touted as somewhat awesome
<natefinch> fwereade: an interface defines both what could be called and what will not be called, by definition.  You can get around it, but it's bad form to do so
<fwereade> natefinch, so it's bad manners to accept an interface and not call its methods? agree in the abstract, think it's a bit fuzzier in practice
<rogpeppe> fwereade: the cleverness in ReadFrom is somewhat dubious - but excusable by virtue of the fact that the methods that it uses implement exactly the same functionality as the original interface methods.
<natefinch> fwereade: I mean it's bad manners to accept an interface and then call OTHER methods
<natefinch> fwereade: for exmaple, if something takes an io.Reader, checks for Close() and then calls Close.... that would be surprising
<rogpeppe> fwereade: i think that we should try to define interfaces that define useful subsets of the Environ type.
<fwereade> natefinch, indeed so :) and everyone agrees it sucks
<fwereade> rogpeppe, indeed so
<rogpeppe> fwereade: but i think that functions like provider.StartInstance should at least accept an interface type that defines a non-strict-superset of the methods that will be called
<fwereade> rogpeppe, natefinch: as it happens I think you're right about the custom data sources
<fwereade> rogpeppe, natefinch: because they all have the same goddamn implementation anyway ;p
<natefinch> heh
<fwereade> rogpeppe, natefinch: so, I dunno -- I'm not willing to excommunicate the technique, but it may well be the case that every single use of it that you can point to is unmitigated crack
<fwereade> rogpeppe, natefinch: in which case, eh, we shouldn't have any of them :)
<rogpeppe> fwereade: here's a reason that the io.Copy magic isn't bad - you can't break the behaviour of io.Copy by embedding an io.Reader in a custom struct type.
<rogpeppe> fwereade: but our code will break if you embed an Environ in something else.
<fwereade> rogpeppe, ok, that's a nice concrete reason
<rogpeppe> fwereade: what would you take as evidence that a particular use was *not* unmitigated crack?
<natefinch> rogpeppe, fwereade: my stance would be, any use like this should be a huge red flag, and much scrutiny put towards finding a better way.  My guess is that almost always, there will be a way that is much more clear, without a ton more work.
<fwereade> rogpeppe, I imagine it varies case-by-case, but I wasn't *that* hard to convince there, was I? ;)
<rogpeppe> fwereade: hrmph :-)
<natefinch> :)
<fwereade> rogpeppe, natefinch: my experience this afternoon might be leading me in your direction anyway
<fwereade> rogpeppe, natefinch: I can certainly agree that it's fair to treat it as a pungent code smell
<rogpeppe> hmm, we may want to disable go vet in .lbox.check in go1.2
<rogpeppe> it takes 30s on my machine
<natefinch> 30s seems ok if it only happens when you're proposing
 * fwereade called for dinner, back later
<natefinch> it's not *fun*, but it's pretty useful, and could easily be forgotten otherwise
<rogpeppe> natefinch: propose is already really slow
<rogpeppe> natefinch: and we've got the bot
<natefinch> rogpeppe: I was actually going to say, lbox propose is already slow, so what's another 30s? :)
<rogpeppe> natefinch: it gets in the way of my critical path
<rogpeppe> natefinch: i can't do anything when i'm running lbox propose
<natefinch> rogpeppe: did it get significantly slower in 1.2?
<rogpeppe> natefinch: yes
<rogpeppe> natefinch: or maybe only in tip, i'm not sure
<rogpeppe> natefinch: it now runs the type checker
<natefinch> rogpeppe: ahh hmm. when are we going to switch to 1.2?
<rogpeppe> natefinch: good question
<natefinch> rogpeppe: I'm sorta surprised, 30s is a long, long time.
<rogpeppe> natefinch: yup
<natefinch> rogpeppe: any idea if they plan on trying to make that perform better?  I haven't been keeping up on golang-nuts as closely as I should
<rogpeppe> natefinch: probably. i should bug adonovan
<sinzui> r1.15.0 continues to be cursed. Is there a command or url I can use to quickly locate an Ubuntu image for azure. The default image selected by Juju appears to be invalid now: http://pastebin.ubuntu.com/6176364/
<rogpeppe> sinzui: could you run that command with --debug please?
<rogpeppe> natefinch, fwereade, jam, mgz, dimitern: next stage in environment config info storage:
<rogpeppe> 2013-09-30 17:00:40 ERROR juju supercommand.go:282 command failed: cannot start bootstrap instance: POST request failed: BadRequest - The location or affinity group East US specified for source image b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-12_04_3-LTS-amd64-server-20130916.1-en-us-30GB is invalid. The source image must reside in same affinity group or location as specified for hosted service West US. (http code 400: Bad Request)
<rogpeppe> error: cannot start bootstrap instance: POST request failed: BadRequest - The location or affinity group East US specified for source image b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-12_04_3-LTS-amd64-server-20130916.1-en-us-30GB is invalid. The source image must reside in same affinity group or location as specified for hosted service West US. (http code 400: Bad Request)
<rogpeppe> oops!
<rogpeppe> https://codereview.appspot.com/14136043/
<rogpeppe> natefinch, fwereade, jam, mgz, dimitern: ^
<rogpeppe> please ignore the deleted builddb noise
<sinzui> rogpeppe, I am re-running with 1.15.0 and debug. the issue is the same with stable and unstable: http://pastebin.ubuntu.com/6176490/
 * sinzui tries an older image from August
<rogpeppe> sinzui: i suspect a problem with our simplestreams logic
<sinzui> older image does not work either :(. This did work last week
<sinzui> rogpeppe, possibly. Did you see that that my first paste was using 1.14.1
<rogpeppe> sinzui: try with revno 1900
<sinzui> 1.14.1 did run with azure last week for me
<rogpeppe> dimitern: have you run live tests on ec2 since you merged your mongo password changes?
<dimitern> rogpeppe, I haven't merged them yet
<dimitern> rogpeppe, still fooling around with some tests
<rogpeppe> dimitern: ah, ok that's good - i can't blame them for my current test failure :-)
<dimitern> rogpeppe, and, let's clear that out - 'cause I was wondering before
<sinzui> rogpeppe, same result for -r1900, 1.15.0, and 1.14.1 used default image selection and when I force an image. I suspect this is more azure than juju.
 * sinzui looks for victim to test azure
<rogpeppe> sinzui: quite possible
<dimitern> rogpeppe, by "ec2 live tests" do you mean bootstrapping an ec2 env and doing some deployments, etc. or running ec2 tests with --amazon? (or whatever it was)
<rogpeppe> dimitern: no, i mean cd provider/ec2; go test -test.timeout 1h -amazon
<rogpeppe> dimitern: the latter in other words, yes
<dimitern> rogpeppe, I've never done that actually
<rogpeppe> dimitern: heh
<dimitern> rogpeppe, I just fire up an ec2 env from my account and do some manual deploy/status/etc. tests on the console
 * rogpeppe leaves
<rogpeppe> might be back later for a bit
<thumper> morning
<natefinch> thumper:  morning... you're on early
<thumper> hi natefinch
<thumper> nah, the country is now UTC+13
<natefinch> oh, it was over this weekend, that's right
<thumper> I'm a little earlier than usual by about 30m
<natefinch> right
<thumper> because it is the school holidays
<thumper> and I have less to worry about
<natefinch> hah, I think I'd get in earlier when the kids had to go off, because there's more stuff to do in the morning
<natefinch> but then, I start work before anyone else is even awake in the house
<sidnei> hey guys, got a fancy one here. had an env running for a couple days, now i went back to look at it and one of the machines has the agent as down. looking at the logs, it's failing to log back in:
<sidnei> 2013-09-30 19:34:03 ERROR juju runner.go:211 worker: exited "state": cannot log in to juju database as "machine-6"
<thumper> natefinch: what time do you start?
<thumper> sidnei: have you upgraded?
<sidnei> obviously i can't 'juju terminate-machine' because it needs the unit to be removed before terminating it right?
<natefinch> thumper: I get up at 5:30am most days and start work between 6 and 6:30 depending on what else I have to do in the morning.  before the kids get up is the only quiet time I get :)
<sidnei> thumper: not intentionally, but maybe landscape auto-upgraded it for me
<thumper> natefinch: do you split your day?
<sidnei> although, i think it wouldn't come from a package, but from tools?
<thumper> sidnei: which environment?
<natefinch> thumper: basically.  I help the kids get up around 7:30-8:30 or so (later if there's nothing going on that day), and then help some more around lunch time.  So, less split, and more just interrupted ;)
<natefinch> sidnei: oh, we don't support environments being up for multiple days
<sidnei> natefinch: haha :)
<thumper> :P
<natefinch> :D
<sidnei> thumper: canonistack
<thumper> sidnei: I wonder if juju is installed there
<thumper> can you ssh into it?
<sidnei> thumper: neither 'juju' is installed, nor 'juju-core'. in fact, it doesn't even know about a juju-core package
<thumper> sidnei: ok, that's good
<thumper> sidnei: ya know, I've always felt a bit weird about how the password thing is handled
<sidnei> thumper: if you want to poke at it i can add your ssh key, im about to destroy the instance otherwise
<thumper> sidnei: no, destroy it...
 * thumper thinks
<thumper> well agents in lxc containers come back up
<sidnei> i guess i can't destroy it either
<thumper> so the file must persist properly I guess
<sidnei> otherwise the environment will go nuts
<thumper> sidnei: not yet, we don't have a --force or other mechanism for cleanup
<sidnei> maybe i can poke at mongo and get the creds out of it, then compare to what the machine thinks it should have
<sidnei> jujud has a timestamp of sept 10th btw
<natefinch> man, it bugs me that you can do juju add-machine lxc, but you can't do juju add-machine --constraints container=lxc  :/
<rogpeppe> thumper: hiya
<thumper> hi rogpeppe
 * thumper otp
<rogpeppe> thumper: otp = "off to play" ?
<thumper> on the phone
<natefinch> what does add-machine ssh do?
<fwereade> natefinch, manual provisioning
<fwereade> natefinch, ssh in and start running a machine agent
<natefinch> so it basically gives credentials and an IP address for the state server to connect to that machine and start it?
<natefinch> (and then do all our normal startup stuff)
<fwereade> natefinch, at the moment I think it's direct, and I'd need to read the code to tell you the exact sequence
<natefinch> fwereade: no problem... I was just looking at docs related to constraints and..... add-machine needs some help :)
<fwereade> natefinch, something like: ssh in, maybe ask for a sudo password, figure out the machine's series/hardware, inject machine into state, start agent, log out
<fwereade> natefinch, I bet
<fwereade> natefinch, thank you ever so for focusing on this
<natefinch> fwereade: I figure it's good to have the guy who already doesn't know what he's doing look at the docs and see if they make sense ;)
<fwereade> natefinch, hell yeah :)
<fwereade> natefinch, although, to be fair, I would not have described you thus :)
<thumper> fwereade: hey hey
<fwereade> thumper, heyhey
<fwereade> thumper, how's it going?
<thumper> good
<fwereade> sinzui, good evening
<sinzui> Hi fwereade
<fwereade> sinzui, how's azure? :)
<fwereade> sinzui, is there anything I can do to help?
<sinzui> fwereade, from an email I am typing:
<sinzui> * BAD: I cannot deploy to azure with 1.15.0. I can with 1.14.1.
<fwereade> sinzui, that statement is a model of clarity
<thumper> sinzui: what sort of error are you getting?
<sinzui> fwereade, http://pastebin.ubuntu.com/6177102/
<sinzui> ^ I have tried East US and West US. I have uploaded tools to both
<sinzui> I am confident that https://jujutools.blob.core.windows.net/juju-tools/tools is the tools-url
<thumper> sinzui: looks to be an image problem not a tools problem...
<sinzui> I agree!
<sinzui> I can deploy 1.14.1 and I don't expect a different image to be selected though
<thumper> hmm...
<thumper> sinzui: the metadata seems to be in a different format than juju is expecting
<thumper> sinzui: it looks to me that we are looking for: "com.ubuntu.cloud:server:12.04:amd64" but it has "com.ubuntu.cloud.daily:server:12.04:amd64"
<thumper> notice the .daily
<fwereade> sinzui, yeah, that ".daily" would seem to be the problem
<thumper> smoser: ping
<sinzui> I can change image-stream: maybe I think "release" doesn't get the extra daily
<fwereade> sinzui, but then that url does itself specify daily
<smoser> thumper, here.
<thumper> smoser: hey
<thumper> smoser: can you read the scrollback to the pastebin?
<thumper> smoser: we are having trouble with juju and azure
<thumper> smoser: and it seems we can't find the simple streams image data
<thumper> smoser: wondering if the format changed, or it could just be our code
<thumper> smoser: so really just checking on expectations right now
<thumper> fwereade, sinzui: http://cloud-images.ubuntu.com/daily/streams/v1/com.ubuntu.cloud:daily:azure.sjson is referenced from the azure group
<fwereade> thumper, huh, our code is looking a bit problematic itself actually... the azure code does weird things with daily vs ""
<smoser> thumper, http://pastebin.ubuntu.com/6177102/
<thumper> smoser: yeah
<fwereade> thumper, provider/azure/environ.go:932
<fwereade> thumper, provider/azure/environ.go:904
<fwereade> thumper, something does not add up
<thumper> fwereade: hmm...
<thumper> fwereade: getImageStream is never called, except in a test
<fwereade> thumper, ah good... I suppose
<thumper> heh
<thumper> smoser: I guess the real thing we want to check is that we should be looking for  "com.ubuntu.cloud.daily:server:12.04:amd64" not "com.ubuntu.cloud:server:12.04:amd64" in the index
<thumper> smoser: is that right?
<smoser> nice.
<smoser> no.
<thumper> fwereade: can you see where we build that string?
<smoser> you should no longer be looking for daily.
<smoser> that should work. but you shouldn't be looking for it.
<thumper> smoser: so... what should we be doing?
<smoser> there are released images on azure now.
<smoser> i said this on an email thread.
<thumper> fwereade: this might be it...
 * thumper sighs
<smoser> hm... odd.
<thumper> smoser, fwereade: perhaps this was dropped on the floor?
<fwereade> thumper, smoser: very possibly :(
<thumper> smoser: what's odd?
<fwereade> thumper, smoser: the azure provider does seem to get the stream from the config
<smoser> odd that you committed to juju the '.daily'
<smoser> http://paste.ubuntu.com/6177171/
<smoser> anyway, that is what you want.
<fwereade> sinzui, does your environ config specify image-stream by any chance?
<sinzui> No
<sinzui> I was pondering using it to force something other than daily
<thumper> fwereade: line 908
<thumper> fwereade: builds the image source list with the local storage, + the default (which is /daily)
<thumper> sinzui: can I get you to try something?
<fwereade> thumper, yeah, /releases is missing
<thumper> sinzui: azure/environ.go line 904
<thumper> change daily to releases
<thumper> sinzui: and try then
<smoser> are you possibly looking in the released stream for daily products ?
<smoser> as you wont find them.
<thumper> smoser: no, I don't think so
<thumper> we are looking in the daily for released
<thumper> AFAICT
<fwereade> thumper, agreed
<fwereade> thumper, given that we have image-stream configurable, we really ought to look in both, I think
<fwereade> thumper, ah, but can we?
<sinzui> "image-stream: releases" yields this error:  no OS images found for location "East US", series "precise", architectures ["amd64" "i386"]
<smoser> hm..
<smoser> i think on canonistack we're actually combining both streams
<fwereade> sinzui, is there a line just above with some product names?
<smoser> in which case if you were looking for either .daily or released, you'd find both.
<thumper> fwereade: where do you see the image stream be configurable?
<fwereade> thumper, azure/config.go:71
<fwereade> thumper, ha, :127
<sinzui> fwereade, this looks identical to the first pastebin except for times: http://pastebin.ubuntu.com/6177196/
<thumper> hmm
<thumper> sinzui: haha, it is appending .releases
<thumper> where I don't think it should
<thumper> sinzui: what did you change exactly?
<smoser> it should not.
<smoser> yeah.
<sinzui> thumper, I added
<sinzui> image-stream: releases
<smoser> product names are com.ubuntu.cloud:server:12.04:amd64 and com.ubuntu.cloud.daily:server:12.04:amd64
<sinzui> thumper, juju init added this
<sinzui> #image-stream: daily
<thumper> sinzui: but you made it "release" ?
<sinzui> thumper, I uncommented the line and made the value "releases"
<thumper> sinzui: ok, comment that out again
<thumper> fwereade: did you find the config line for the source url?
<thumper> sinzui: add this:
<thumper> image-metadata-url: http://cloud-images.ubuntu.com/releases
<thumper> fwereade: hmm environs/imagemetadata/simplestreams.go:24
<fwereade> thumper, hmm... but we never seem to check it
<sinzui> Oh, this is taking much longer
<thumper> fwereade: environs/imagemetadata/urls.go:39:
<sinzui> Boom. Up comes a state-server
<thumper> so the question is: why isn't it using the default that is there...
<thumper> I think I know
<fwereade> thumper, ohhhh... how do we handle falling back from one source to another? is it giving up before looking at the right one?
<thumper> correct
<thumper> I think
<fwereade> thumper, ah bollocks
<thumper> so azure says: try this daily one
<thumper> so it goes:
<thumper> config if set
<thumper> environ if provided
<thumper> then default (which is correct)
<thumper> but we seem to be giving up before we get to it
<thumper> I feel that this is incorrect error handling
<thumper> any error is bad
<thumper> we should have a specific error that we can check for to keep iterating through
<fwereade> thumper, I think I remember ian explaining that it was a behaviour-preservation thing... the original tools stuff would only fall back in the case of *no* tools
<thumper> this isn't tools
<thumper> this is images
<fwereade> thumper, indeed so, but they share underlying mechanisms
<thumper> hmm...
<thumper> that's a little poked
<fwereade> thumper, yep, I completely overlooked it at the time, I was thinking purely about tools
<thumper> so...
<thumper> where to from here?
<thumper> fwereade: can we split the lookup behaviour?
<thumper> add a policy to the method?
<fwereade> thumper, policy feels cleanest at first sight, doesn't it
 * thumper nods
<fwereade> thumper, there we have it documented: environs/imagemetadata/simplestreams.go:68
<fwereade> thumper, "the first location which has a file is the one used"
<thumper> hmm...
<thumper> sinzui: does this mean you can upgrade your azure env?
<sinzui> I am still waiting for the first juju status to complete
<sinzui> thumper, I can try, but I have not been able to upgrade hp or aws
<thumper> sinzui: omg slow
<thumper> hmm, what is the hp upgrade problem?
<thumper> sinzui: so amazon does upgrade?
<sinzui> azure is being routed through Somalia.
<sinzui> thumper, no.
<sinzui> I have not been able to upgrade any env
<fwereade> thumper, environs/simplestreams/simplestreams.go:444?
<sinzui> but since aws and hp are fast I can try again
<thumper> fwereade: seems spurious
<thumper> sinzui: have you tried since the bucket tools were made available?
<sinzui> yes.
<fwereade> thumper, there's the core of something sane though -- if you're looking for product X and you find an index for product Y you should certainly move on to the next one
<fwereade> thumper, if you find product X, but no examples of it that match what you're looking for, you should probably not
<fwereade> thumper, or at least it's arguable, I think
<thumper> fwereade: yeah... otp with mramm now
<sinzui> I have still not gotten a status back from azure. This is more than 30 minutes without feedback
<hazmat> sinzui, it takes a while but rarely more than 15m
<hazmat> interestingly bootstrap/destroy-env are basically synchronous there
<sinzui> yeah. I did three bootstraps of azure on 1.14.1 today
<hazmat> for valid reasons
<hazmat> sinzui, is the instance up from the azure console?
<hazmat> sinzui, are we tagging release in bzr?
<sinzui> Yes, I see an instance
<sinzui> hazmat, My first two status calls ended with "no reachable servers"
<sinzui> The third is in progress
<hazmat> sinzui, no reachable servers means no response on the api to get the ip
<hazmat> from the object storage instance id
 * hazmat tries on azure
<sinzui> thumper, I have a positive response from aws. It looks like it accepted the upgrade (http://pastebin.ubuntu.com/6177342/). But 10 minutes later I still see the agent versions are 1.14.1
<thumper> sinzui: would be interesting to get the entire log file back for analysis
<thumper> sinzui: the -all.log from the bootstrap node?
<thumper> sinzui: can you scp or pastebinit?
 * sinzui visits
<thumper> gary_poster: hey
<thumper> gary_poster: I'm now on saucy and having no issues with the local provider
<thumper> gary_poster: I'm wondering what is different on your machine
<thumper> gary_poster: can you think of any "non default" things you may have?
 * thumper afk for a bit
<hazmat> thumper, gary_poster the issues gary mentioned sound like a cgroups issue not a juju one
<hazmat> like the cgroup mount space isn't correct.. normally we're using cgroups lite afaicr
<thumper> hazmat: hmm... how does that get changed?
<hazmat> gary_poster, you around?
<sinzui> thumper, I reuploaded the aws tools using s3up, now I cannot see any public tools. I think it made things worse
<hazmat> thumper, you need the cgroup-lite package to get the cgroup sysfs automounted via upstart.. its been a while but i remember it from one time.
 * sinzui redeploys with s3cmd
<hazmat> sinzui, i changed the bucket policy to ignore whatever the client said.. the bucket is always public
<thumper> hazmat: however the clients inside aws can't see it
<hazmat> sinzui, since this has happened multiple times..  re private tools
<hazmat> huh
 * hazmat checks
<thumper> yeah
<sinzui> hazmat, I thought you did, but thumper reports others say they are still private
<thumper> hazmat: sinzui was trying an upgrade, and the clients listed all the tools, but no 1.15 ones
<hazmat> thumper, every link in jam's email worked for me
<thumper> hazmat: perhaps it is how they are being listed?
<hazmat> ie.. http://juju-dist.s3.amazonaws.com/tools/releases/juju-1.15.0-precise-amd64.tgz
<thumper> the underlying api?
<thumper> perhaps has nothing to do with the bucket settings
<hazmat> thumper, the listing is http://juju-dist.s3.amazonaws.com/
<thumper> the goamz bit
<hazmat> and it works fine too
<hazmat> not related, i overrode the default bucket policy which had keys default to private.. and switched it to public
<thumper> ah...
<thumper> I know
<hazmat> because this is like the second time in a week this issue has occurred.
<thumper> sinzui: the tools need to be in /tools as well for the 1.15 / 1.16 releases
<thumper> they need to be in two places
<thumper> otherwise 1.14 can't find them
<thumper> the new code puts them in tools/releases
<thumper> the old code is just looking in /tools
<hazmat> gotcha
<hazmat> so they need to be in both places for backwards compatibility.
<thumper> ack
<hazmat> sinzui, which scripts are you using .. the ones in lp:juju-core/scripts
<sinzui> no, it be broken, it cannot find the series, version, or tarball name
<hazmat> the key on tools/streams/v1/com.ubuntu.juju:released:tools.json appears to have whitespace around it.
<hazmat> not sure if that's real or just formatting oddity
<hazmat> might just be formatting
<sinzui> hazmat, I used this, but have run each upload by itself. http://pastebin.ubuntu.com/6177496/
<hazmat> ugh
<sinzui> It began as a fix to the script. I shouldn't be trusted to download each deb and then work out how to deploy to other clouds
<thumper> sinzui: you need to add: s3cmd put --acl-public ${DEST_DIST}/tools/releases/*.tgz s3://juju-dist/tools/
<thumper> sinzui: for at least the 1.15 and 1.16 versions
<thumper> sinzui: once everything is on 1.16, we don't need the legacy location
<thumper> sinzui: but otherwise the 1.14 machines can't see the new 1.15(16) tools
<sinzui> thumper, let me repeat that ...
<thumper> sinzui: did you want a hangout
<thumper> ?
<hazmat> sinzui, that should get cleaned up and added to trunk branch..
<sinzui> thumper, using v1.15.0 client, I cannot complete an upgrade to 1.15.0 server because the client didn't tell the server where the 1.15.0 servers are?
<hazmat> sinzui, the server looks up tools for an upgrade
<thumper> sinzui: correct
<sinzui> 2 minutes
<thumper> sinzui: the client looks, and stores the new version in state
<thumper> the agents go "oh, new version" and go looking
<thumper> using the 1.14 codebase
<thumper> which says "look in /tools"
<sinzui> k$ s3cmd info s3://juju-dist/tools/juju-1.15.0-precise-amd64.tgzs3://juju-dist/tools/juju-1.15.0-precise-amd64.tgz (object):
<sinzui>    File size: 3291685
<sinzui>    Last mod:  Mon, 30 Sep 2013 22:27:33 GMT
<sinzui>    MIME type: application/x-gzip; charset=binary
<sinzui>    MD5 sum:   10e6466f113e751fa66461d755c0149d
<sinzui>    ACL:       gustavoniemeyer: FULL_CONTROL
<sinzui>    ACL:       *anon*: READ
<sinzui>    URL:       http://juju-dist.s3.amazonaws.com/tools/juju-1.15.0-precise-amd64.tgz
<sinzui> They are there now
<thumper> sinzui: the agents should be able to upgrade now
<sinzui> thumper, and it did without me asking any more from it
<thumper> \o/
<sinzui> aws upgrades. I will replay this on hp (after I upload tools to tools/)
<thumper> so we need to do similar things for the other tools locations
<gary_poster> hazmat, thumper here for a sec.  anything I can check?
<sinzui> yep
<thumper> yay
<thumper> now I can go to the gym without feeling guilty about leaving a huge mess for a while
 * gary_poster runs away again
<sinzui> thumper, v1.15.0's sync-tools does not copy to tools/. I need to find another way to put the tgzs there
<thumper> hmm...
<thumper> we need something half manual
<hazmat> gary_poster, yes.. can you verify you have cgroup-lite installed
<hazmat> sinzui, the acl doesn't matter anymore
<hazmat> sinzui, the bucket policy will override any acl
<hazmat> you can upload private and its still publicly available
<hazmat> its been too accident prone relying on a variety of clients and lack of automation.
<sinzui> hazmat, rock. I will stop panicking . Thank you!
<sinzui> looks like the hp upgrade is accepted. I will wait for a few minutes
<sinzui> hp isn't upgraded yet
<sinzui> I have uploaded the tools to azure's tools/ and the listing looks correct
<sinzui> hazmat, thumper does juju support leaping upgrades of stable? eg 1.12 to 1.16?
 * sinzui hopes no
<hazmat> sinzui, at the moment.. yes.. you can even downgrade
<hazmat> sinzui, although..
<hazmat> sinzui, in this context its just going to look for the latest it can find in the location its knows about
<sinzui> hazmat, so aren't we committed to putting tgz files in /tools until we break compatibility (2.0)
<hazmat> maybe.. given that we have to coordinate two versions (client, server) and two tool locations..
#juju-dev 2013-10-01
<hazmat> arosales, interesting re affinity group bug
<hazmat> arosales, did red squad implement affinity groups recently? it wasn't an issue last week
<thumper> sinzui: no
<thumper> sinzui: only minor jumps of .2 at this stage
<thumper> sinzui: planning to fix this later
<thumper> sinzui: has the world stopped burning?
<hazmat> sinzui, i thought it looked for the latest compatible minor release
<hazmat> whoops thumper ^
<hazmat> looking at upgradejuju initVersions
<hazmat> thumper, per environ/tools.FindTools ... if minorVersion = -1, then only majorVersion is considered.
<thumper> hazmat: whether or not it looks for the latest tools is irrelevant, the code doesn't promise to work
<thumper> I have been told loudly and often that we only support single jumps at this stage
<thumper> and not from dev -> dev
<thumper> only from released -> current dev
<thumper> or old release -> current release
<thumper> or last release -> current release
<hazmat> well old_release -> current_release is the question.. and if the code doesn't prevent it, then the default behavior is effectively not supported
<hazmat> ie 1.12 upgrade will go straight to 1.16 or 1.18..
<thumper> :)
<thumper> hazmat: well, whether it does or not is a bit moot
<hazmat> either its supported and its fine, or its not supported, and we have default behavior which is potentially dangerous for prod envs.
<hazmat> thumper, then we can talk about downgrading ;-)
<thumper> well, I've been told it isn't supported, so we should check to see if it dies :)
<hazmat> thankfully a test is trivial with a downgrade..
<davecheney> downgrade ?
<davecheney> are you mad
<davecheney> the only way is up, baby
<hazmat> davecheney, i filed a bug and was told it was meant to work for downgrade..
<hazmat> but you can --version=previous_version
<hazmat> unfortunately can't go back to 1.12 since its not in simple streams..
<hazmat> which also sort of explains the upgrade problems
<gary_poster> hazmat, ack thanks.  I do have cgroup-lite installed.
<gary_poster> (1.8)
<hazmat> gary_poster, yeah.. i looked over the tracebacks again.. didn't seem related, not sure what's up there
<hazmat> gary_poster, it did seem possibly networking related, but unclear..
<hazmat> gary_poster, could you pastebin ifconfig from the host
<gary_poster> hazmat http://paste.ubuntu.com/6177815/
<hazmat> totally sane..
<hazmat> thumper, gary_poster i'd ask hallyn about this one
<gary_poster> hazmat, I was thinking the same.
<hazmat> sinzui, thumper so with deploys coming from the new streams structure, effectively the only version of juju known to trunk is 1.15 atm.. no real testing of multi-version leaps atm unless using an old client
<davecheney> hazmat: i don't think downgrades have ever been supported
<davecheney> i've certainly never heard that requirement, and it has never been a constraint
<hazmat> davecheney, fair enough.. this notion of 'support' only known to juju-core devs is interesting.. my concern is primarily what does the tool allow end users. if minor increment skipping isn't supported, why does the tool allow it.. much less do it by default.. if downgrading isn't supported, how about a warning at least.. if not remove the capability
<davecheney> hazmat: how can you downgrade ?
<davecheney> maybe this was added when I wasn't looking
<hazmat> davecheney, juju upgrade-juju --version=lower_version
<davecheney> meh, that is like --to, if you do that, all the safeties are removed
<hazmat> davecheney, https://bugs.launchpad.net/juju-core/+bug/1227991
<_mup_> Bug #1227991: upgrade-juju can downgrade.. and causes hook executions <juju-core:Opinion> <https://launchpad.net/bugs/1227991>
<davecheney> yup, certainly a footgun
<hazmat> davecheney,  --to doesn't remove safeties from juju, just the charms, and its quite safe for container spec
<hazmat> although honestly i like the capability.. what i dislike is a default behavior which only juju-core devs knows is not supported, namely skipping versions on upgrades.
<hazmat> ie juju upgrade-juju.. resulting in an unsupported behavior
 * hazmat files a bug for future ref
<thumper> ah ffs
<thumper> lxc has changed its behaviour for another thing
<thumper> fuckity fuck fuck
 * thumper looks at the docs
 * thumper goes to write another branch
<axw> thumper: what's the cut off date for 1.16.0?
<thumper> axw: soft cut off is thursday I think
<thumper> and then it gets harder :)
<axw> thumper: okey dokey, ta
<axw> public holiday came at a bad time :\
 * thumper shrugs
 * thumper goes to fix the lxc bug
<davecheney> thumper: sounds like the same story as last time
<davecheney> there is a date
<davecheney> but it's optional
<davecheney> depending on your frame of reference
<thumper> davecheney: I would say that we only want bug fixes after that
<thumper> and not try to drop features
<davecheney> drop == include
<davecheney> or drop == exclude ?
<davecheney> drop == land ?
<davecheney> why why did I make my admin-secret so secret
<davecheney> such a pain
<davecheney> y didn't i make it 'bob'
<thumper> not to land features
<davecheney> understood
 * thumper needs hacking music
<thumper> axw, davecheney: I'm confused
<thumper> I use os.Symlink to make a symlink
<thumper> but it doesn't set the symlink mode flag
<thumper> if I go os.Stat(path), fileinfo.Mode().IsRegular() it returns true
<thumper> why?
<davecheney> thumper: not sure
<davecheney> let me try
<thumper> I'm writing a checker for IsSymlink
<axw> also not sure..
<thumper> so in my test, I make a symlink, then test it
<thumper> and it says nah
 * thumper is confused
<thumper>    c.Assert(symlinkPath, jc.IsSymlink)
<thumper> ... obtained string = "/tmp/gocheck-5577006791947779410/8/a-symlink"
<thumper> ... /tmp/gocheck-5577006791947779410/8/a-symlink is not a symlink: &{name:a-symlink size:0 mode:384 modTime:{sec:63516189566 nsec:62250446 loc:0x6c71c0} sys:0xc2000975a0}, true
<hazmat> anyone seen this one.. https://bugs.launchpad.net/bugs/1233457
<_mup_> Bug #1233457: service with no units stuck in lifecycle dying  <cts> <juju-core:New> <https://launchpad.net/bugs/1233457>
<thumper> file, err := ioutil.TempFile(c.MkDir(), "")
<thumper> c.Assert(err, gc.IsNil)
<thumper> symlinkPath := filepath.Join(filepath.Dir(file.Name()), "a-symlink")
<thumper> err = os.Symlink(file.Name(), symlinkPath)
<thumper> c.Assert(err, gc.IsNil)
<hazmat> cts needs a fix, and afaics the only thing to do is munge the db..
<hazmat> how about those pastebins
<thumper> hazmat: sure...
<axw> thumper: I think you want os.Lstat
<thumper> ah...
<thumper> yeah
<thumper> Stat is checking the thing that is being pointed to right?
<axw> yup
<hazmat> thumper, python is the same btw
<hazmat> os.lstat vs os.stat
<thumper> yeah, that's it
<thumper> hazmat: no, not seen that before
<thumper> although maybe I have, but I don't recall
<thumper> we have some lifecycle issues
<hazmat> thumper, so what's the downside of just yanking that document from mongo..
<thumper> hazmat: unsure
<davecheney> hazmat: nobody has ever tried
<hazmat> afaics as long as its cleared out of units, relations, relation scopes, its fine.
<hazmat> thumper, probably some txn issues as well
<hazmat> yeah... that's the problem.. prod customer env.. doing the crazy.. i guess i need  a full db dump to verify this is sane..
<hazmat> but against a sample env, it does look okay..
<hazmat>  the txn observers complicate things, but nothing triggers on this observation
<hazmat> thumper, hmm.. let me ask a different way.. do you know what processes the cleanup of services in lifecycle:dying
<davecheney> would have to be the provisioner
<hazmat> ie i'm wondering if i could just bounce the bootstrap jujud to process it
<thumper> hazmat: no, there is a lifecycle doc written by fwereade in the tree, but no I don't really know
<davecheney> hazmat: try it
<davecheney> can't hurt
<hazmat> davecheney, why the provisioner?
<hazmat> its not related to the provider
<davecheney> hazmat: it's the only one left
<davecheney> unit agent moves the unit to dying
<davecheney> there is no service agent
<davecheney> so it must be the provisioner
<hazmat> davecheney, fair enough
<hazmat> davecheney, or at least a job on the machine agent
<hazmat> on bootstrap
<hazmat> ie could just as easily be api
<hazmat> or something else
<hazmat> davecheney, you reasonably sure that bouncing it won't perturb the system?
<davecheney> hazmat: it could be the machine agent
<davecheney> but that is a problem
<davecheney> because when the last machine is removed
<davecheney> who will remove the service ?
<davecheney> the machine agent moves the unit from dying to dead
<davecheney> that is all
<hazmat> hi julianwa
<julianwa> hi davecheney  hazmat
<davecheney> hazmat: yes
<davecheney> bouncing it is fine
<hazmat> julianwa, so the suggestion is juju ssh 0 ... and then
<davecheney> it is designed for this
<hazmat> julianwa, sudo service jujud-machine-0 restart
<julianwa> hazmat: ok, will try in few minutes
<hazmat> julianwa, that might resolve it as it forces the agent to process current state, shouldn't perturb the system at all
<davecheney> hazmat: the bootstrap agent is probably restarting all the time
<hazmat> julianwa, if that doesn't work.. then i'll have to fallback to the manual munge the db, which i'd rather avoid if possible, as its not supported.
<hazmat> davecheney, huh
<davecheney> we don't let the process die, 'cos that will cause upstart to put it in the sin bin
<davecheney> but the process does restart the worker frequently
<hazmat> davecheney, if that's the case then the persistent error here over days, would have already been resolved..
<davecheney> hazmat: yes
<davecheney> so a restart is unlikely to fix the problem
<davecheney> but it's also unlikely to make it any worse
<hazmat> davecheney, upstart will restart as long as its not a persistent suicide
<hazmat> davecheney, how often does juju kill workers.. my understanding is that it only does it on error in the worker
<hazmat> ie. its not a periodic process
<davecheney> hazmat: depends on the error
<davecheney> with maas, it's quite frequent
<davecheney> with other providers, it tends to be less frequent
<hazmat> davecheney, ie.. if its not a periodic process... then no error.. means no restart
<hazmat> so a manual restart still has value
<davecheney> hazmat: sure
<hazmat> davecheney, is the scope of the worker all the environ jobs
<davecheney> bottom line
<davecheney> nobody knows what will fix it
<davecheney> but a restart can't hurt
<hazmat> indeed
<hazmat> and its better than touching juju's db by hand.. which is the next step.. so worth a shot
<hazmat> aha
<thumper> hazmat: I'd highly recommend not hacking the juju db by hand
<hazmat> davecheney, thumper we have the underlying issue it looks like
<hazmat> jujud died perm
<hazmat> thumper, except when necessary :-) like migration
<thumper> hazmat: only if you mean pyjuju -> go juju
<davecheney> 12:28 < hazmat> jujud died perm
<davecheney> ^ can you please expand on this
<hazmat> davecheney, its not running
<davecheney> what does the log file say ?
<hazmat> thumper, ack on recommend, and yes that's the migration i'm working on... in this case service is pretty innocuous though, it has no agent manipulating state.
<hazmat> nor observers to trigger on txn, in fact it's the lack of triggers there that forces the manual approach
<julianwa> hazmat: already restart jujud-machine-0 on bootstrap node
<julianwa> hazmat: nothing changed
<hazmat> julianwa, did it stay up.. ie is it in the ps aux | grep juju   output
<hazmat> julianwa, could you copy/send /var/log/juju/machine-0.log for analysis
<davecheney> thumper:
<davecheney> 2013-10-01 01:03:32 DEBUG juju.rpc.jsoncodec codec.go:172 -> {"RequestId":9,"Response":{"Results":[{"StringsWatcherId":"4","Changes":null,"Error":null}]}}
<davecheney> 2013-10-01 01:03:32 DEBUG juju.worker.logger logger.go:45 reconfiguring logging from "<root>=DEBUG" to "<root>=WARNING"
<davecheney> ^ when did this land ?
<davecheney> my juju environment has stalled
<thumper> it hasn't stalled
<davecheney> and I'm not getting any output from the agents because we log nothing at WARNING level
<thumper> it just isn't giving info and debug
<thumper> if you want, you should bootstrap with debug
<davecheney> ok, is it possible to change this ?
<thumper> or read my big long email
<davecheney> ok, will re-read
<thumper> juju set-env logging-config='<root>=DEBUG'
<thumper> on the running environment
<thumper> will set it back
<thumper> juju set-env logging-config='<root>=INFO;juju=DEBUG;juju.rpc=INFO' will reduce noise
<thumper> or even
<thumper> juju set-env logging-config='juju=DEBUG;juju.rpc=INFO' will reduce noise
<hazmat> yummy
<hazmat> thumper, how bout a default of that
<thumper> export JUJU_LOGGING_CONFIG='juju=DEBUG;juju.rpc=INFO'
<hazmat> thumper, what's the current default?
<thumper> <root>=WARNING
<hazmat> k
<thumper> the logging config you bootstrap with is passed on
<davecheney> hazmat: thumper i'll do that
<davecheney> but i'm concerned that the debugging that I needed has probably been thrown away
<davecheney> i'll rebootstrap and try again
<thumper> davecheney, hazmat: perhaps we should change the default logging level if you are using a debug release
<thumper> as that is most likely to be developers
<thumper> who are more interested in logs
<thumper> well, debug info anyway
<davecheney> thumper: i think the default sohuld be export JUJU_LOGGING_CONFIG='juju=DEBUG;juju.rpc=INFO'
<julianwa> hazmat:  machine-0.log.tar.gz under chinstrap:/home/julianwa
<davecheney> the main problem is the log spam from the rpc code
<davecheney> everything else is crucial
<thumper> davecheney: alternatively, we set the rpc spam to TRACE
<thumper> so DEBUG just works
<davecheney> sure
<davecheney> i don't care about the specifics
<davecheney> everyone hates the rpc spam
<davecheney> all i know is I have an environment that is broken
<davecheney> and i cannot debug it because the logging that i needed was discarded
<davecheney> all the agents are idle
<davecheney> but i have no logs to debug the problem
<thumper> davecheney: I'll submit a fix tomorrow
 * davecheney destroys environment and trys again
<hazmat> julianwa, thanks
<hazmat> julianwa, that's odd i get unexpected EOF extracting
<hazmat> and output cuts off mid line
<hazmat> but it doesn't look like the agent was doing anything for a while
<hazmat> julianwa, when you restarted it, did the process stay up?
<thumper> hazmat: FWIW, I'm seeing upstart issues starting upstart for the local provider now
<thumper> it seems that upstart sees it very fast
<thumper> and starts it before we call start
<thumper> a race it seems between us going: are you running? no? ok lets start you, fail - already running
<julianwa> hazmat: what do you mean stay up?
<julianwa> hazmat:  did you mean process still have same pid?
<hazmat> julianwa, sure
<hazmat> julianwa, i'm trying to verify its running
<hazmat> its unclear from the log why it died, just want to make sure that it is in fact alive
<julianwa> hazmat: yes. it's running with new pid
<hazmat> julianwa, good, and status still has mysql dying...
<thumper> ah fark
<julianwa> right
<hazmat> julianwa,  so unto manual db work..
<hazmat> one moment
 * thumper headdesks
<hazmat> thumper, do you have an alternative?
 * thumper headdesks
 * thumper headdesks
<hazmat> julianwa, are you using juju-deployer?
<julianwa> hazmat: no
<hazmat> k
<thumper> hazmat: for some reason, my local provider is completely screwed
<hazmat> thumper, hmm.. works okay for me. what do you see? i still have a sleep in Install and a sudo in start
<hazmat> re upstart.go
<thumper> telling me that the environment isn't up when I can see that it is
<thumper> bootstrap worked
<thumper> status says "you aren't bootstrapped"
<hazmat> thumper, would you mind pastebin  status -v --debug
<hazmat> just curious what it looks like
<thumper> hazmat: -v isn't needed if you use --debug
<thumper> really
<hazmat> thumper, i used to assume that then someone told/showed me otherwise
<thumper> no, it's bollocks
<hazmat> davecheney, wasn't that you?
<hazmat> not important
<hazmat> thumper, yeah.. --debug would be interesting to see
<thumper> $ juju status --debug 2>&1 | pastebinit
<thumper> http://paste.ubuntu.com/6178119/
<hazmat> thumper, is your provider storage running?.. it's basically failure to resolve the juju db i'd assume
<hazmat> you're not even getting to state/open
<thumper> not very helpful eh?
<hazmat> so env/conn is the failure point
<hazmat> well maybe..
<hazmat> actually possibly before then
<thumper> hmm...
<thumper> for some reason it wasn't started
<thumper> but it said it was
<thumper> wtf
 * hazmat returns to db surgery
 * thumper is still confused
<thumper> seems to be some weirdness there
<davecheney> hazmat: i did recommend -v always
<davecheney> but the meaning of -v has changed recently
<thumper> it hasn't changed yet, but will one day soon
<thumper> or perhaps it did way back, and I changed it
<davecheney> -v and --debug used to be synonyms
<davecheney> not anymore
<davecheney> or something
<hazmat> julianwa, i need a sanity check before i give you the db-modification script.. could you run this http://paste.ubuntu.com/6178181/ from the client machine? you need to pass -e env_name and -o output_file.json .. and then copy the file over to chinstrap so i can verify what needs modification.
<hazmat> julianwa, you'll probably need to set up a virtualenv to get a recent version of the mongodb driver (pymongo) if the client is precise
<julianwa> hazmat: setup a virtualenv on the server?
<hazmat> julianwa, on the client
<hazmat> sorry net dropped for bit, battery died
<hazmat> julianwa, virtualenv --system-site-packages jujufix && source jujufix/bin/activate && easy_install pymongo
<hazmat> then python  script_from_pastebin -e env_name -o output.json
 * thumper off to do a demo of juju
<thumper> so I fully expect everything to fall in a heap
<hazmat> thumper, good luck
<thumper> cheers
<hazmat> julianwa, that script is read only, it just grabs a copy of the db into a json
<hazmat> mostly so i can inspect and verify all refs to 'mysql' before doing an update script
<julianwa> we don't have a separate client machine to create the virtualenv on; the client is the maas server.
<julianwa> hazmat: ^^
<hazmat> julianwa, that's fine
<hazmat> julianwa, you can setup the virtualenv there
<hazmat> the virtualenv will keep the pymongo package isolated
<hazmat> from the system
<hazmat> you do need to install python-virtualenv package
<hazmat> ie. on the maas server
<hazmat> and then you can remove the virtualenv dir when the work is done
<hazmat> julianwa, sorry i just got told you may not have outbound net access
<hazmat> might have to do Go for this..
<hazmat> except not sure i have time for that tonight
<hazmat> julianwa, k.. i'm done for the night. can pick up via email.
<julianwa> hatch: thanks will exec script and get back to you
<julianwa> sorry hazmat ^^
<hazmat> julianwa, no worries
<rogpeppe> mornin' all
<dimitern> morning
<axw> morning rogpeppe, dimitern
<rogpeppe> axw: hiya
<rogpeppe> axw: just glancing at  q
<rogpeppe> axw: https://codereview.appspot.com/14197043
<rogpeppe> axw: is the whole thing really on one line?
<axw> rogpeppe: yep, the base64 encoder doesn't add newlines if that's what you're thinking of
<rogpeppe> axw: hmm, that seems wrong - head should not have to read 10MB lines
<axw> rogpeppe: hmm yeah, I guess head is going to buffer all that isn't it.
<rogpeppe> axw: and... is there a pty at the other end of that ssh connection?
<rogpeppe> axw: indeed
<axw> rogpeppe: no pty allocated
<rogpeppe> axw: well at least we won't overflow the input buffer
<axw> rogpeppe: I can change it to a here doc.
<axw> hm
<axw> isn't that going to be the same?
<rogpeppe> axw: that would be better, but for other reasons
<axw> rogpeppe: other reasons being?
<rogpeppe> axw: it's robust when the input doesn't happen to be exactly the right length
<axw> rogpeppe: this will always be one line
<rogpeppe> axw: currently yes, but i think it should probably be split
<rogpeppe> axw: so that it can be streamed all the way
<rogpeppe> axw: it's slightly odd that encoding/base64 doesn't provide an option to do that
<axw> rogpeppe: indeed, I thought there was a mandated line length for some usages of base64...
<axw> rogpeppe: I was thinking about https://codereview.appspot.com/14011046/ again before; I *think*, once everything's behind the API, we should be able to solve this cleanly by adding something into the environ config
<axw> i.e. a "bootstrapped" flag, which is checked to see if ssh/http storage is used
<axw> initially false (while bootstrapping), then set to true before adding into state
<axw> then the API server will always load it out as true
<rogpeppe> axw: i wondered about that
<rogpeppe> axw: i'm not sure though
<rogpeppe> axw: currently i assume that Bootstrap doesn't change any config options that need storing locally
<rogpeppe> axw: how about something like this for splitting lines (untested): http://paste.ubuntu.com/6178727/ ?
<axw> rogpeppe: sure, but then what on the server-side? head still buffers, as does a here doc
<rogpeppe> axw: no, a here doc doesn't buffer (and neither does head, other than line-at-a-time)
<axw> er yeah of course, line buffered.. :)    ok, I thought the here doc was fully buffered
<axw> ok, I can make that change
<rogpeppe> axw: seems to work ok: http://play.golang.org/p/oYPyqeej5_
<dimitern> fwereade, ping
<fwereade> dimitern, pong
<rogpeppe> axw: their unbuffered nature is almost the entire reason that heredocs exist (otherwise you could just use single quotes); think of shell archives.
<dimitern> fwereade, I updated that passwords CL, if you can take a look? https://codereview.appspot.com/14036045/
<fwereade> dimitern, cheers
<axw> rogpeppe: thanks, I didn't think about that too hard
<dimitern> fwereade, ah, sorry didn't see you had comments recently - will look at them now
<fwereade> dimitern, they're basically a reprise of our discussions, thought I should record them
<dimitern> fwereade, I couldn't find a way to access the password from the agent conf, except if there's a method on the interface
<dimitern> fwereade, but now at least, there's no newPassword and I do read it back to get the saved value
<fwereade> dimitern, hmm, I see, don't love it but I guess that's why we had that Password method in the first place
<fwereade> dimitern, do you think there's any way we can make its test-only nature clearer?
<fwereade> dimitern, like if we added an unexported passwordForTests method to the interface, and then had an export_test func that called it?
<dimitern> fwereade, the problem is, I need it outside of the agent module, so export_test won't do
<fwereade> dimitern, jujud/bootstrap_test?
<dimitern> fwereade, also, originally the method was PasswordHash(), but that's not going to work in my case - I need the actual plain text password
<fwereade> dimitern, or more?
<fwereade> dimitern, yeah, I saw that, entirely unbothered there :)
<dimitern> fwereade, there and in machine_test, yeah
<dimitern> fwereade, hmm you know what?
<dimitern> fwereade, I do have OpenState and OpenAPI in config
<fwereade> dimitern, I don't see it in machine_test
<dimitern> fwereade, I can use these instead in assertCanConnectToState and the other
<fwereade> dimitern, hey, yeah, just load a config and check those work (or don't) as expected
<dimitern> fwereade, ok, and I'll revert to PasswordHash then
<fwereade> dimitern, and in bootstrap_test I think the try-to-load-a-config thing should work too
<dimitern> fwereade, you mean drop PasswordHash altogether?
<fwereade> dimitern, ISTM that we might then even be able to drop Password
<fwereade> yeah :)
<dimitern> fwereade, ok, will try now
<fwereade> dimitern, lovely, thanks
<fwereade> rvba, ping
<TheMue> fwereade: ping
<rvba> Hi fwereade
<fwereade> rvba, those docs STM to imply that you can and should and do use multiple API keys for the same user, and that that's the accepted way to use juju/maas, and all I'm looking for is a way to automate the ugly bit of the setup -- have I missed something?
<fwereade> TheMue, pong
<TheMue> fwereade: 1st, just found your mail regarding the unset documentation, missed it in the mail stream. aaaargh. will do it
<rvba> fwereade: right, but I think the bug is there even if you use multiple API keys (belonging to the same user).
<fwereade> rvba, are we sure? the python implementation doesn't look like it does anything very different
<TheMue> fwereade: 2nd, the refactoring is almost done (currently in testing), but i have to admit that using a struct doesn't really feel better than 3 arguments and 4 return values
<TheMue> fwereade: it sometimes even reads more ugly
<rvba> fwereade: I confess I haven't tested it in the field but I'm pretty sure.
<rvba> fwereade: I'll test it today.
<fwereade> rvba, tyvm
<TheMue> fwereade: and additionally i'm not happy with a behavior we already had with the CL before.
<fwereade> TheMue, ok, if it's a bad idea let's not do it :)
<fwereade> TheMue, ah, what's that one?
<TheMue> fwereade: that is that we can write Status(params.StatusFoo, "something", nil)
<TheMue> fwereade: but when reading the status data is not nil but an empty map
<TheMue> fwereade: i would like to change that in a way, that in case of an empty map in the db we return nil for data
<fwereade> TheMue, what's the distinction you're trying to preserve here?
<TheMue> fwereade: none ;) it only feels asynchronous, not my way of thinking
<TheMue> fwereade: but ok, i see a problem
<TheMue> fwereade: if i write an empty data a nil would be returned, so the same problem :(
<TheMue> fwereade: so forget my stammering here
<TheMue> fwereade: :D
<fwereade> TheMue, haha, np
<TheMue> fwereade: so i'll now do the doc and then free for the next job
<fwereade> TheMue, rogpeppe: I am wondering if it is feasible/useful to have TheMue start work on some of the environ Prepare methods?
<fwereade> TheMue, perfect, ty
<fwereade> TheMue, lining that up now :)
<TheMue> fwereade: great
<fwereade> hey, trunk is doing this for me: http://paste.ubuntu.com/6178801/
<fwereade> jam, rogpeppe, dimitern, TheMue: anyone else seeing it?
<jam> fwereade: I thought rogpeppe submitted a "skip this test" last night for that
<dimitern> fwereade, no, but haven't tried trunk yet
<rogpeppe> fwereade: once https://codereview.appspot.com/14136043/ has gone in, that might work ok actually
<fwereade> jam, fwiw it just failed on the bot, but only one of them did
<rogpeppe> fwereade: i'm still struggling through fixing test failures, but i don't think that should be an obstacle
<fwereade> rogpeppe, cool, that looked LGTMed already so I didn't examine closely
<fwereade> rogpeppe, ok, fantastic
<rogpeppe> fwereade: ok
<rogpeppe> fwereade: yes, i've seen that test failure
<dimitern> fwereade, updated https://codereview.appspot.com/14036045 again
<rogpeppe> fwereade, jam: that wasn't the one i submitted a Skip for
<rogpeppe> fwereade: looks like a problem with the code dialling out inappropriately
<dimitern> fwereade, sorry, some code left over, will repropose
<rogpeppe> fwereade: i was planning to do a global check that we're not doing that, by tweaking the net package so that it panics when resolving or dialling a non-localhost address
<fwereade> rogpeppe, do I recall that I convinced (or browbeat or ordered or something) you to handle admin-secret and other juju-level things in environs.Prepare?
<fwereade> rogpeppe, nice
<fwereade> rogpeppe, in which case TheMue should steer clear of the Prepare func itself and just work with the individual environ implementations
<rogpeppe> fwereade: yes
<rogpeppe> fwereade: well, yes to the latter anyway
<rogpeppe> fwereade: i'm not quite sure what you mean by "handle" in the above context
<fwereade> rogpeppe, insert values if none are present
<fwereade> rogpeppe, y'know, "prepare" them ;p
<rogpeppe> fwereade: yes, that should definitely happen
<rogpeppe> fwereade: the above CL does that for CA cert, and it would be easy to do for admin-secret too
<fwereade> rogpeppe, fantastic
<fwereade> rogpeppe, do we have any other global candidates?
<fwereade> rogpeppe, hey interesting wrinkle
<rogpeppe> fwereade: go on...?
<fwereade> rogpeppe, if state-port/api-port get filled in globally, the local provider won't be able to pick non-conflicting ones itself
<fwereade> rogpeppe, nbd
<rogpeppe> fwereade: hmm
<rogpeppe> fwereade: well, there's nothing stopping a given provider forcibly changing a config attribute, in fact, although that may be inadvisable
<fwereade> 25495131 - alexia
<fwereade> bugger
<rogpeppe> :-)
<fwereade> brb
<fwereade> b
<fwereade> dimitern, LGTM
<dimitern> fwereade, thanks
<dimitern> fwereade, fixing a few other minor things I found and landing
<dimitern> fwereade, after a live test ofc
<rogpeppe> fwereade: ah, sync.selectSourceStorage is the culprit
<rogpeppe> fwereade: we should set up sync.DefaultToolsLocation
<fwereade> rogpeppe, gaaah ofc (tyvm for looking into that, somehow it's already 3 levels down my stack)
<rogpeppe> fwereade: the "tweak net package" hack is working quite nicely
<fwereade> rogpeppe, excellent
<rogpeppe> fwereade: ah ha!
<rogpeppe> fwereade: we *do* set up sync.DefaultToolsLocation, but in this case we're acually invoking a different test binary (via TestRunMain) which doesn't tweak it
<fwereade> rogpeppe, gaah indeed
<fwereade> sorry bbs
<axw> rogpeppe: https://codereview.appspot.com/14197043
<axw> renamed your writer, I hope you don't mind ;)
<rogpeppe> axw: it was an arbitrary name :-)
<rogpeppe> axw: although i think Wrap on its own is a bit lacking context. LineWrapWriter would probably be better
<axw> good point
<rogpeppe> axw: (as many writers "wrap" another writer)
<axw> yup
<axw> I shall change it
<rogpeppe> axw: also, it's not quite as general as the doc comment implies - it wraps lines at lineLength *bytes* not characters
<rogpeppe> axw: so it can split utf chars inappropriately
<axw> yeah, true, I should specify bytes/ascii chars
<rogpeppe> axw: that's the reason i think it's not entirely appropriate for utils, as it's ok to use for wrapping base64, but not really in general
 * axw nods
<axw> rogpeppe: I very nearly left it in there... can move back
<rogpeppe> axw: making it do proper utf8 handling would be not entirely trivial and need some additional buffering, so i think that's probably best
<axw> rogpeppe: cool. btw, why did you add the additional bufio in there originally?
<rogpeppe> axw: because i didn't want to do lots of 1-byte writes
<rogpeppe> axw: it could slow things down quite a bit when transferring MBs
<axw> rogpeppe: ok. I would think that's the concern of the user
<axw> yeah
<axw> ok
<rogpeppe> axw: true; the user can always pass in a bufio.Writer
<rogpeppe> axw: which is probably a good thing anyway, as base64.Encoder does lots of small writes too
<axw> rogpeppe: yeah, I'll create one in the sshstorage.run command.
<dimitern> fwereade, live tests with the local provider pass nicely
<dimitern> fwereade, now trying ec2
<fwereade> dimitern, try an upgrade from 1.14 as well on general principles :)
<dimitern> fwereade, oh.. that might be a problem
<dimitern> fwereade, not sure how to get 1.14 first
<fwereade> dimitern, it's in ppa:juju/stable
<rogpeppe1> axw: are you sure that legitimate base64 output can't finish with the letters "EOF" ?
<axw> rogpeppe1: heh, good question. probably could mangle it so it is. /me fixes
<rogpeppe1> axw: that's why i suggested '@EOF' originally
<axw> ah
<axw> rogpeppe1: and my response about not understanding that was thinking it was special bash syntax
<axw> thanks, I'll use that
<rogpeppe1> axw: well, the quotes are significant
<rogpeppe1> axw: if the string is quoted, the shell won't scan for potential variable and `` expansions
<axw> rogpeppe1: yeah I thought the @ was doing something special in the context of a here-doc
<rogpeppe1> axw: ah no :-)
<axw> rogpeppe1: updated
<rogpeppe1> axw: i posted a review
<axw> off for dinner now, adieu
<axw> ta, I'll take a look later
<rogpeppe1> axw: i don't think you've re-proposed
<rogpeppe1> axw: enjoy
<rogpeppe1> can the x clipboard really not hold any more than 32K ?
<rogpeppe1> hmm, must be bug elsewhere
<rvba> fwereade: testing shows I was right (see the email I just sent).
<jam> dimitern: rogpeppe1 standup ?
<jam> https://plus.google.com/hangouts/_/cf9a1a494368b354ff8311d84b7e64aa52777ed0
<fwereade> rvba, crap, so all the docs are wrong? *does* pyjuju do something special to handle it?
<rvba> fwereade: something seems to be wrong with the docs indeed.  I didn't test with pyjuju, but I doubt it behaves differently.
<jam> fwereade: it looks like not only might you shutdown all of your own instances, you might shutdown another users if you have perms to do so
<fwereade> rvba, yeah, I didn't spot any obvious differences in the implementations
<mgz> geh. g+ is dropping too many packets for me...
<mgz> understanding robot is too hard... please internet
 * TheMue => lunch
<dimitern> fwereade, so summary of the live upgrade test
<dimitern> fwereade, I managed to upgrade, but I needed to do 3 manual steps: 1) copy the tools as discussed in control-bucket/tools; 2) create a new /var/lib/juju/tools/1.15.0-precise-amd64/downloaded-tools.txt with the proper json format (without this the MA upgraded and all tasks ran ok, except the lxc-provisioner stopped working); 3) replace the symlink in /var/lib/juju/tools/unit-mysql-0 to point to the 1.15.0.1 tools - now the UA works as well
<dimitern> (after 2) and 3) I did killall -9 jujud, just in case)
<dimitern> fwereade, and one final thing worth mentioning - destroy-environment called on 1.14 bootstrapped env, upgraded to 1.15, using the juju command from 1.15 produces this error: ERROR cannot read environment information: environment "amazon" not found (when I call the juju command from 1.14 it works)
<dimitern> fwereade, so should I land it anyway?
<rogpeppe1> fwereade: i found the mock charm store BTW. it still doesn't play well with InferRepository though.
<rogpeppe1> fwereade, jam: https://codereview.appspot.com/14200044
<jam> rogpeppe1: "broken config" is just a single quote char ?
<jam> Isn't that invalid yaml?
<rogpeppe1> jam: that's the point, yes
<rogpeppe1> jam: it's a more reliable way of triggering an error
<jam> rogpeppe1: while that is true, that seems (to me) to be exercising a very different code path
<jam> one is that if we have well-formed but missing data
<jam> and the other is malformed data
<rogpeppe1> jam: that's not the code path this test is interested in testing, if i read the test correctly.
<fwereade> jam, I think that's LGTM -- the only thing it's actually intended to test is that the arg parsing doesn't care about order of command vs --log-file
<fwereade> jam, I am a little sceptical about the value that test provides anyway
<rogpeppe1> fwereade: yeah, i seriously considered trashing it
<jam> rogpeppe1: so at least I would change the comment on "breakJuju" to say "with invalid configuration"
<fwereade> jam, seems like an awfully awkward way to run something a bit like the real binary
<dimitern> fwereade, have you seen my previous messages?
<jam> fwereade: you mean TestRunMain ?
<jam> it is very awkward, but used in quite a few places
<jam> since you do end up with a test binary
<jam> that you could then invoke
<jam> and I guess means you don't have to "go build" before you "go test"
<jam> I was a bit surprised we didn't do the "go build" route myself.
<fwereade> jam, I can't think of any good reason not to test the build product
<fwereade> jam, but the idea was that it should all be done in go test
<jam> fwereade: because when you just run "go test" there isn't a 'juju' build product
<fwereade> jam, ofc we *could* actually build the binary inside a go test test
<fwereade> jam, in fact in some places we do IIRC
<fwereade> jam, but that's kinda horrible as well
<rogpeppe1> fwereade, jam: i'd be happier if these kinds of test were actually shell scripts or similar, outside the purview of go test.
<fwereade> rogpeppe1, jam, +1
<rogpeppe1> fwereade: it seems to me that they don't add much value anyway - the amount of actual logic they're testing in general is just glue code and tiny
<fwereade> rogpeppe1, sure, but it would also be nice to run at least one test on our actual build output
<rogpeppe1> fwereade: "badrun" is very badly named
<rogpeppe1> fwereade: i agree
<fwereade> rogpeppe1, isn't it, I think it has mutated quite significantly over the years
<rogpeppe1> fwereade: we should have more tests that work on the final juju binary - in particular live tests that use it to deploy charms, etc
<fwereade> rogpeppe1, +100
<fwereade> rogpeppe1, I think that a few of those would be pretty good replacements for some of the live tests tbh
<rogpeppe1> fwereade: yeah
<rogpeppe1> fwereade: we really really need a charm-level regression testing suite
<fwereade> rogpeppe1, a bunch of standard charms that exercise our functionality, that we deploy in those tests? yeah
<rogpeppe1> fwereade: exactly
<rogpeppe1> fwereade: i started doing it, but came up against the issue of how we determine success or failure (we need to actually talk to the charm)
<rogpeppe1> fwereade: there are lots of possibilities and i got distracted while trying to decide on one :-)
<fwereade> rogpeppe1, yeah, that's a beguiling problem, I immediately started looking into space and pondering
<rogpeppe1> fwereade: :-)
<dimitern> fwereade, ping
<fwereade> dimitern, pong
<dimitern> fwereade, ^^
<fwereade> dimitern, ok, so that looks like "really rather broken" to me
<fwereade> dimitern, "downloaded-tools.txt" is complete nonsense, I fear
<dimitern> fwereade, it's hard to tell why it's needed
<fwereade> dimitern, and do we have any idea why the upgrader didn't run on the unit agent?
<fwereade> dimitern, it just looks like a not-properly-tested typo/thinko for downloaded-url.txt
<dimitern> fwereade, it ran, but kept restarting with that issue "empty size or checksum" for tools
<fwereade> dimitern, or maybe not
<dimitern> fwereade, downloaded-url.txt is just a url in a text file, while the downloaded-tools.txt has the url, size, sha256, and binary version string in a json-serialized format
<fwereade> dimitern, ohhhh right ok
<rogpeppe1> jam: your new logMatches changes are just great. i was struggling to see what's what in http://paste.ubuntu.com/6179372/; i merged your branch and got this: http://paste.ubuntu.com/6179376/
<jam> rogpeppe1: thanks, I think it is quite a bit nicer, if I had known I should have done it that way to start
<dimitern> fwereade, that brings me back to the question
<fwereade> dimitern, so we fucked the format to store a load of information that is not only unused but is literally worthless
 * fwereade sighs
<dimitern> fwereade, should I land my branch despite the upgrade issues, which are related to tools and not to my changes?
<dimitern> fwereade, yeah, pretty much istm
<rogpeppe1> jam: one thing i realise though - it talks about "line 8" but there's actually no way of seeing what was on the lines in question.
<fwereade> dimitern, ok so what you're saying is that once the actual tools are put in place the mongo-related implications of the change are fine
<dimitern> fwereade, in fact in the 1.15.0.1 tools dir there is indeed a downloaded-url.txt only
<fwereade> dimitern, ofc there is
<fwereade> dimitern, what version downloaded it, do you think? :)
<dimitern> fwereade, yes, the agents restart, and I can see the UA is using API only, and the MA on machine 1 is using only api
<fwereade> dimitern, ok then, please land that
<dimitern> fwereade, ah, right!
<dimitern> fwereade, ok, landing then
<fwereade> dimitern, rogpeppe1: what would it cost us to just rip out all that where-did-the-tools-come-from crap? if we care we can log it at upgrade time
<dimitern> fwereade, not sure, have to check, but it seems the lxc-provisioner and upgrader currently log tool-related errors
<rogpeppe1> fwereade: i guess we could do that. i *thought* it would be useful in the status, but if it's really causing problems...
<jam> dimitern: this error: "/var/lib/juju/tools/1.15.0-precise-amd64/downloaded-tools.txt" looks suspicious because I'm pretty sure it should look in /var/lib/juju/tools/downloaded-tools.txt
<dimitern> jam, there's no downloaded-tools.txt in /var/lib/juju/tools/
<dimitern> jam, but if there is one in the subdirs it finds it
<fwereade> jam, there had better not be one in there :)
<jam> fwereade: well it does say "$bin/downloaded-tools.txt"
<jam> I would have thought it would split the tools all into one file, but I see it is doing ">" not ">>"
<fwereade> rogpeppe1, jam: have we already released an API that demands we include that information? I think we have
<fwereade> awwfuck it
<jam> fwereade: needs what infor?
<fwereade> jam, all the stuff in a state.Tools which is completely irrelevant
<fwereade> jam, url, hash, size, etc
<fwereade> jam, I mean, I don't see how it's possibly useful to feed that info into juju from the agents
<fwereade> jam, the agents *got* that info from juju
<fwereade> jam, and I fear that SetAgentTools expects all that stuff now
<fwereade> rogpeppe1, jam, dimitern: do any of you recall seeing a panic in testing.TarGz?
<jam> fwereade: I have not recently seen a panic there
<rogpeppe1> fwereade: i don't think so
<fwereade> jam, rogpeppe1: dammit, I'm sure I saw one once, and I think mattyw is having trouble with one at the moment
<rogpeppe1> fwereade: i'd be interested to see a stack trace
<mattyw> rogpeppe1, I've got the full output of go test juju-core/.. if that's useful?
<rogpeppe1> mattyw: paste away
<jam> fwereade: back to the sha stuff, the only thing I can particularly remember is that Ian wanted to make sure the sha sum matched what we read in the simplestreams code. As you mention if we validate that it matches the expected value before we untar it, then we just use the expected value inside Juju and it doesn't help us to write it down again.
<mattyw> rogpeppe1, I've emailed you if that's ok, it'd be an enormous paste
<rogpeppe1> mattyw: ok
<rogpeppe1> mattyw: hmm, what version of Go are you running?
<axw> fwereade: re "It's that old wrong-environ problem again", I thought the issue was that an environment might be stale, and you need to SetConfig with config from state to do the right thing
<mattyw> rogpeppe1, 1.1
<axw> fwereade: so, you could SetPrechecker with the stale env, then SetConfig later to make it right
<axw> (before using the state object)
<rogpeppe1> mattyw: could you try with 1.1.2, please? i think that fixed a few GC issues. also, could you pull tip too?
<mattyw> rogpeppe1, tip of core and go 1.1.2?
<rogpeppe1> mattyw: yeah
<mattyw> rogpeppe1, will do
<rogpeppe1> ain't it great when a test fails because DeepEquals returns false, but the info printed out for each value is identical?
<gary_poster> hey fwereade.  when you have a chance, could you take a look at https://bugs.launchpad.net/juju-gui/+bug/1233462 and clarify expected/correct behavior please?  The CLI and GUI should be aligned on issues like this, I think.
<_mup_> Bug #1233462: gui does not allow multiple relations between wordpress and ha proxy <juju-gui:Triaged> <https://launchpad.net/bugs/1233462>
<mattyw> rogpeppe1, thanks for your help, upgrading to go 1.1.2 seemed to fix it, although I get these failures on tip http://paste.ubuntu.com/6179576/
<rogpeppe1> mattyw: i suspect those are current test-isolation issues
<mattyw> rogpeppe1, ok, I'm able to ignore them for my stuff anyway
<mattyw> rogpeppe1, thanks again for your help
<rogpeppe1> mattyw: np
<mgz> sinzui: you want initiating into the ways of sync-tools?
<sinzui> mgz I do
<mgz> so, confusing part #1: the landing bot and the cloud-wide tools account are the same one
<mgz> after that, it's dead easy
<mgz> source the bot creds, add a juju-env section something like:
<mgz>   canonistack-juju-tools:
<mgz>     type: openstack
<mgz>     admin-secret: <whatever, doesn't get used>
<mgz>     control-bucket: juju-dist
<mgz> then run `juju sync-tools -e canonistack-juju-tools`, being careful with which version of juju you're actually using, due to simplestreams stuff having changed of late
<mgz> if you don't have the bot creds, I will gpg enc them to you
<mgz> it's the account named "juju-tools-upload"
<mgz> it's useful to `swift list juju-dist` to check what's actually there afterwards
<mgz> also it rhymes
<fwereade> gary_poster, responded
<mgz> sinzui: do you need the creds from me?
<hazmat> fwereade, is this valid for a txn op?  "$not": "<_sre.SRE_Pattern object at 0x19276f8>"
<hazmat> that looks like an accidental serialization of a regex
<hazmat> object instead of pattern
<fwereade> hazmat, it does rather, doesn't it
<mgz> heh
<hazmat> fwereade, line 14309 from the dump fwiw
<gary_poster> great, thank you fwereade
<gary_poster> (and yes, fwereade, helpful. :-) )
<fwereade> hazmat, hmm, bson.RegEx, eh? that smells a little funny
<hazmat> gary_poster, that behavior is intended for the gui
 * hazmat vaguely remembers writing that code
<gary_poster> hazmat, you'd argue that the CLI and GUI should have different behavior
<hazmat> gary_poster, i'd argue 99% of charms are broken with a requires having multiple relations
<hazmat> gary_poster, see my conversation with dave cheney in #juju-gui last night
<hazmat> gary_poster, the gui should prevent people from shooting themselves and their services in the foot
<hazmat> that's the whole point of the relation dimming guides, to give contextual help to a user
<gary_poster> hazmat, IMO talk through it with fwereade.  CLI and GUI should be corresponding.  Need to be on call now, can return to convo later
<fwereade> hazmat, I concur, we should respect relation limits
<hazmat> fwereade, fair enough.. this was done in the absence of enforcement of those limits, and the reality that at the time only one extant charm had any support for multiple requirers, every other basically had undefined behavior for it
<hazmat> and even then that one charm required explicit service config for the scenario to work
<mgz> sinzui: emailed you the creds
<sinzui> thank you!
<fwereade> hazmat, yeah, I think the gui was right to restrict -- if there *is* a requires with a limit != 1, though, it should probably allow that one
<hazmat> fwereade, fair enough, although my intent at the time was that if they really needed that sort of thing, they could use the cli for it.
<hazmat> because frankly nothing supports it
<hazmat> in terms of charms
<hazmat> haproxy only does in the context of explicit service config, without that its entirely broken
<hazmat> gary_poster, commented on the bug
<hazmat> considering the complexity of fixing this and the zero gain in terms of real world usage, i'd rather kick this down the road for the gui. +1 on cli respecting limits and converging behavior with gui down the road.
<gary_poster> ack thx hazmat
 * fwereade taking a break, bbs
<dimitern> jam, ping
<dimitern> or mgz ?
<dimitern> ah, nevermind
<rogpeppe> hmm, anyone got any tips for debugging DNS issues? i'm seeing a 16s turnaround on some DNS requests
<rogpeppe> which *really* slows down some of the non-isolated tests ....
<dimitern> rogpeppe, you can use a local dns server perhaps?
<sinzui> thank you again mgz, canonistack is sorted.
<rogpeppe> dimitern: yeah, i guess i could try that
<arosales> hazmat, re the affinity bug. I think it may be related to how msft sets an affinity group.
<arosales> hazmat, still a hypothesis. I ran into this blog http://michaelwasham.com/2012/08/07/http-error-message-the-location-or-affinity-group-east-us-specified-for-source-image/
 * arosales still investigating 
<rogpeppe> dimitern: hmm, tried that (i used these instructions http://askubuntu.com/questions/264827/how-do-i-activate-a-local-caching-nameserver), but this program (http://paste.ubuntu.com/6179816/) still reliably takes between 15 and 20s to run.
<dimitern> rogpeppe, are you sure you're hitting your local dns?
<dimitern> rogpeppe, try host -v store.juju.ubuntu.com
<rogpeppe> dimitern: ah, i was using nslookup to test stuff
<dimitern> rogpeppe, host is way better in many ways
<rogpeppe> dimitern: hmm, the local name server returns in no time, but then it times out twice: http://paste.ubuntu.com/6179829/
<rogpeppe> dimitern: (that took about 15s to run)
<dimitern> rogpeppe, I'm running dnsmasq locally and it's pretty fast, and caching
<rogpeppe> dimitern: it seems to be caching ok, but for some reason the lookup is ignoring the cache
<dimitern> rogpeppe, take a look in your /etc/resolv.conf
<rogpeppe> dimitern: it just says "nameserver 127.0.1.1"
<dimitern> rogpeppe, mine does as well and dnsmasq is listening on :53
<rogpeppe> dimitern: i wonder if it's an issue with the ubuntu DNS record - i tried a few things under ubuntu.com and they all took ages, but google.com was quick
<dimitern> rogpeppe, they're quick here (ubuntu.com I mean)
<rogpeppe> dimitern: actually google.com tried contacting the actual google server too, it seems
<rogpeppe> dimitern: so maybe it's an issue with an ubuntu DNS server
<dimitern> rogpeppe, could be, but it might be on your side as well
<rogpeppe> dimitern: what does host -v store.juju.ubuntu.com print for you?
<dimitern> rogpeppe, http://paste.ubuntu.com/6179868/
<rogpeppe> dimitern: interesting - here's what i see http://paste.ubuntu.com/6179862/
<rogpeppe> dimitern: yours seems to be following the same logic, but mine times out the first time and is slow the second
<rogpeppe> dimitern: i did change my router/modem at the weekend, so it's probably related to that somehow, but i can't quite see how
<dimitern> rogpeppe, this is what I get for ubuntu.com http://paste.ubuntu.com/6179878/
<dimitern> rogpeppe, your router might be getting in the way as a preferred dns on the network and then not doing a good job and timing out?
<dimitern> rogpeppe, or perhaps there's an override somewhere in the settings?
<axw> fwereade: ping
<fwereade> axw, pong, I'm very sorry I haven't managed to review more of your branches today
<rogpeppe> dimitern: even if it was, why isn't my local dns cache working properly?
<axw> fwereade: nps, I just have a question
<axw> <axw> fwereade: so, you could SetPrechecker with the stale env, then SetConfig later to make it right
<axw> <axw> (before using the state object)
<axw> err
<axw> sorry, missed some context
<rogpeppe> dimitern: anyway, i'm spending too much time trying to fix this issue. thanks for the input.
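The timing program rogpeppe pasted is no longer available; a minimal Go sketch in the same spirit (the hostname is just an example) times repeated lookups, so a bypassed or timing-out resolver shows up as multi-second results where a working local cache answers in milliseconds:

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// timeLookup resolves host and reports how long the lookup took,
// making a slow or timing-out resolver hop easy to spot.
func timeLookup(host string) time.Duration {
	start := time.Now()
	addrs, err := net.LookupHost(host)
	elapsed := time.Since(start)
	fmt.Printf("%s: %v addrs=%v err=%v\n", host, elapsed, addrs, err)
	return elapsed
}

func main() {
	// A lookup served from a working local cache should come back in
	// milliseconds; repeated multi-second results suggest the resolver
	// in /etc/resolv.conf is being bypassed or is timing out upstream.
	for i := 0; i < 2; i++ {
		timeLookup("store.juju.ubuntu.com")
	}
}
```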
<axw> that was preceded by:
<axw> <axw> fwereade: re "It's that old wrong-environ problem again", I thought the issue was that an environment might be stale, and you need to SetConfig with config from state to do the right thing
<fwereade> axw, I think it is, yes
<axw> fwereade: so a PrecheckerSource would be unnecessary, no?
<fwereade> axw, well, I was thinking more in the context of a task that kept a shared env up to date
<axw> I envision the series of events like this: juju.Open gets Environ with stale config, and a state, and calls SetPrechecker. Later, someone updates the env's config with SetConfig
<fwereade> axw, but then we need to track that specific environ around and update it at the right time -- and there's no guarantee we can create an environ at the point we're creating state, is there?
<fwereade> axw, environs lack secrets until the first CLI connection
<fwereade> axw, I was thinking of a model in which there was a task that waited for a valid environ in state and only then returned it
<axw> fwereade: ok. and I was thinking that you could have an invalid Prechecker (just as you can have an invalid Environ - they're the same after all), and make it valid by updating the environ's config
<fwereade> axw, how do you propose to create this invalid environ? ;p
<fwereade> axw, it'd have to be nil, wouldn't it?
 * axw is confused
 * fwereade is too, a little, do you have a moment for a g+?
<axw> fwereade: for a short while, going to bed shortly
<axw> just gotta go get set up, brb
<axw> fwereade: https://plus.google.com/hangouts/_/eef843eef5b79f33bd5a7a2a4d50f20192103ac721?authuser=1&hl=en-GB
<rogpeppe> does anyone else see provider/ec2 tests fail (TestStartInstanceWithEmptyNonceFails fails for me) ?
<rogpeppe> on trunk
<rogpeppe> it looks like another isolation issue to me, but i can't see a specific bug for it
<rogpeppe> mgz: ^
<mgz> checking
 * mgz switches to trunk
<mgz> yeah, that fails for me
<gary_poster> If, in an ec2 environment, juju claims that the environment is bootstrapped but it is not actually (http://pastebin.ubuntu.com/6180035/), what do I do?  this is trunk, as of yesterday and today.
<mgz> probably the test passed when poorly isolated
<rogpeppe> mgz: one mo - i'll disconnect from the network and we'll see
<mgz> gary_poster: destroy-environment and start again?
<mgz> gary_poster: the more manual option is to poke around in the file store and see what it has
<mgz> using some s3 tool or other
<gary_poster> mgz that worked thanks :-)
<rogpeppe> mgz: well, it fails with the same error when the network is disconnected
<rogpeppe> mgz: i'm not sure how it got past the bot
<rogpeppe> what caused all these isolation-related issues to suddenly raise their head?
<mgz> changes to simplestreams
<rogpeppe> hmm
 * rogpeppe escapes to lunch
<rogpeppe> fwereade: preliminary proposal of the branch that's been in the offing for ages and ages: https://codereview.appspot.com/14207046
<rogpeppe> fwereade: all tests pass except the one that fails in trunk too
<rogpeppe> fwereade: (i don't know why that test passes on the 'bot)
<fwereade> rogpeppe, gaah, last time I ran trunk everything was passing :/
<dimitern> rogpeppe, are you sure it's not due to your specific router/dns setup?
<rogpeppe> dimitern: no :-)
<rogpeppe> dimitern: but it's a vanilla setup
<rogpeppe> dimitern: the DNS settings are automatically obtained from my ISP
<dimitern> rogpeppe, I mean is the failing test timing out on dns resolving?
<rogpeppe> dimitern: and... if the caching was working, why wouldn't the query return immediately it has got an answer?
<rogpeppe> dimitern: yes
<rogpeppe> dimitern: oh, no
<rogpeppe> dimitern: i don't think so
<rogpeppe> dimitern: mgz could reproduce the issue
<rogpeppe> dimitern: do provider/ec2 tests pass for you?
<dimitern> rogpeppe, they did when I landed my branch last time
<rogpeppe> dimitern: they failed for me with the network disconnected too
<rogpeppe> dimitern: that might be a good way of reproducing the issue, if it is an isolation problem
<dimitern> rogpeppe, yeah
<rogpeppe> dimitern: if you could bear to be without the internet for a minute or so, i'd appreciate it if you could try that
<dimitern> rogpeppe, not right now - in a few minutes, I'm testing a bugfix
<rogpeppe> dimitern: np
<rogpeppe> dimitern: whenever
<dimitern> rogpeppe, take a look at cmd/juju/destroyenvironment.go:46
<dimitern> rogpeppe, why's that?
<rogpeppe> dimitern: the "if !assumeYes" line?
<dimitern> rogpeppe, no, the if before that, with the _, err =
<rogpeppe> dimitern: hmm, i don't see that in my branch - gimme 5 minutes while my lbox propose runs and i'll have a look
<rogpeppe> dimitern: that line was a bit premature
<dimitern> rogpeppe, it's spurious
<rogpeppe> dimitern: not really
<dimitern> rogpeppe, why?
<rogpeppe> dimitern: if you haven't got info stored for an environment you won't be able to talk to that environment at all
<rogpeppe> dimitern: so that line gives a better error message to the user
<rogpeppe> dimitern: however...
<rogpeppe> dimitern: that line is now gone in https://codereview.appspot.com/14207046/
<rogpeppe> dimitern: (review appreciated BTW!)
<dimitern> rogpeppe, I can't see it gone
<dimitern> rogpeppe, and I was going to do that anyway
<rogpeppe> dimitern: actually it's gone in trunk already
<gary_poster> hazmat, fwereade, added comment #7 to https://bugs.launchpad.net/juju-gui/+bug/1233462 fwiw.  I welcome corrections, but I think what I said/did will be mostly OK to both of you.  Thank you again for your feedback on this.
<_mup_> Bug #1233462: gui does not allow multiple relations between wordpress and ha proxy <juju-gui:Triaged> <https://launchpad.net/bugs/1233462>
<dimitern> rogpeppe, oh? that's good - it must be very recently
<gary_poster> Already made correction/clarification: "The GUI also ignores the "limit" value, but simply does not currently allow more *requires* relations than 1."
<rogpeppe> dimitern: yeah, it merged in the last hour or so
<fwereade> gary_poster, that looks accurate and sane to me
<gary_poster> thank you very much fwereade, cool
<natefinch> I keep wanting to "correct" all the spots where I see containerised and other similar words using s where I'd use a z, and then I remember who I'm working for....
<sinzui> natefinch, that was done in Lp code, then we got a command from above. Canonical is a British company and it encourages British spelling.
<natefinch> sinzui: that's understandable. The language is called "English" after all, not "American" ;)
<sinzui> I went to high school in AU, University in US, and work for GB company. My spelling is permanently buggered
<sinzui> rogpeppe, dimitern: I may have gone mad. I used sync-tools --source --destination repeatedly from r1903 and made the tools. Today (possibly after installing the official 1.15.0 package) I cannot get that command to work
<sinzui> Since I have the tools and metadata it created, and logs, I know it worked for many days
<rogpeppe> sinzui: i'm afraid i don't understand all the new "simple"streams stuff
<rogpeppe> natefinch: i try to use american spelling in the source code
<sinzui> I noticed right away that I couldn't use the juju from the package, but I am sure juju from the tree continued to work....but not today
<rogpeppe> sinzui: i didn't know that canonical is a british company, although i knew its head office is in london
<sinzui> rogpeppe, I think it's corporate identity rather than legal identity.
<rogpeppe> sinzui: so... how does the command fail today?
<natefinch> rogpeppe: I think I've only seen one or two things that stuck out at me as british... and honestly, when those things pop up, they're things I didn't know we differed on.  Not a big deal, for sure.
<sinzui> rogpeppe, no, I reverted to r1903, which I used on Friday. The commands no longer work.
<rogpeppe> sinzui: you've done "go install ./..." presumably?
<sinzui> yes. --version is correct
<rogpeppe> sinzui: what errors do you get?
<rogpeppe> sinzui: "no longer work" is a hard place to start from :-)
<sinzui> yeah
<sinzui> I am surprised it is listing available tools when the --source --destination options are for building a tree that can be republished anywhere.
<rogpeppe> sinzui: i still haven't seen any error messages...
<sinzui> rogpeppe, This is the command I run. It is from a script I have played more than a dozen times over the weekend. sync-tools is run after the tgz files are created in new-tools/tools http://pastebin.ubuntu.com/6180604/
<rogpeppe> sinzui: i'd like to see that with --debug on too
<sinzui> rogpeppe, http://pastebin.ubuntu.com/6180622/
<sinzui> rogpeppe, "boing" is a new env I add. I wondered if the hp and aws envs were "tainted" because the tools were uploaded there.
<rogpeppe> sinzui: just to clarify: what do you expect this command to do?
<sinzui> rogpeppe, copy the tgz files to new-tools/juju-dist/tools/releases generate the json and copy that to new-tools/juju-dist/streams/v1/
<rogpeppe> sinzui: without touching any external provider, right? this should all work locally on your machine.
<sinzui> that's right
<rogpeppe> sinzui: (even if you weren't using the local provider)
<rogpeppe> sinzui: in fact, the fact that you have to specify a provider for this command is kinda superfluous, i guess
<sinzui> that is also right. I use hp and aws in previous calls (to be certain the data was identical)
<sinzui> This is the content of the hp/aws calls from a few days ago: http://pastebin.ubuntu.com/6180642/
<rogpeppe> sinzui: what does ls -lR /home/curtis/Work/new-tools print?
<sinzui> rogpeppe, http://pastebin.ubuntu.com/6180652/
<rogpeppe> sinzui: i don't see anything in /home/curtis/Work/new-tools/tools/releases
<rogpeppe> sinzui: which is where synctools looks for tools, AFAICS
<sinzui> rogpeppe, that is right! sync-tools --destination is to make releases/ and streams/v1/
<sinzui> I will make those dirs just to eliminate this scenario
<rogpeppe> sinzui: AFAICS the tools sync logic copies from $source/tools/releases
<sinzui> rogpeppe, thank you very much! indeed it does NOW....
<rogpeppe> sinzui: ah, cool
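The layout rogpeppe spotted can be sketched as a small Go check (the function names and the exact suffix are taken from the chat, not verified against juju-core's sync code): sync-tools reads tool tarballs from $source/tools/releases, so tarballs placed directly under new-tools/tools are invisible to it:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// toolsReleaseDir returns the directory the sync logic reads tool
// tarballs from, relative to the --source root. The "tools/releases"
// suffix matches what the chat found for this revision.
func toolsReleaseDir(source string) string {
	return filepath.Join(source, "tools", "releases")
}

// checkSource reports the failure mode sinzui hit: if the releases
// directory does not exist, sync-tools finds nothing to copy.
func checkSource(source string) error {
	dir := toolsReleaseDir(source)
	if _, err := os.Stat(dir); err != nil {
		return fmt.Errorf("no tools found: %s does not exist", dir)
	}
	return nil
}

func main() {
	fmt.Println(toolsReleaseDir("/home/curtis/Work/new-tools"))
}
```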
<sinzui> rogpeppe, sync-tools is a plugin isn't it and it obeys the rules of PATH
<rogpeppe> sinzui: a lot of this stuff has changed
<sinzui> ?
<rogpeppe> sinzui: possibly
 * rogpeppe hates the frickin' plugin thing
<sinzui> rogpeppe, I think the 1.14.1 rules were used when called GOPATH/bin/juju, and now I have 1.15.0 installed.
 * sinzui revises script
<rogpeppe> sinzui: it looks like the plugin logic respects $PATH, yes
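The plugin dispatch being discussed can be sketched roughly as follows (the helper name is hypothetical): an unrecognised subcommand is looked up as an executable named juju-<subcommand> on $PATH, so whichever installation appears first on PATH wins, which is how a 1.14.1 sync-tools could shadow the one built from the 1.15 tree:

```go
package main

import (
	"fmt"
	"os/exec"
)

// findPlugin locates a juju plugin the way the chat describes: an
// executable named "juju-<subcommand>" found on $PATH. PATH order
// decides which binary runs when several versions are installed.
func findPlugin(subcommand string) (string, error) {
	return exec.LookPath("juju-" + subcommand)
}

func main() {
	path, err := findPlugin("sync-tools")
	fmt.Println(path, err)
}
```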
 * rogpeppe is done for the day
<rogpeppe> g'night all
<rogpeppe> sinzui: glad you got it working!
<sinzui> goodnight and thank you again rogpeppe
<jcsackett> sinzui: do you have a moment to chat?
<sinzui> jcsackett, do
<jcsackett> sinzui: fantastic. i'll call you on g+.
<sinzui> jcsackett, is this a trap? I entered the room, but you aren't there
<jcsackett> sinzui, abentley: can one of you review https://code.launchpad.net/~jcsackett/charmworld/missing-qa-data-in-review-queue/+merge/188584
<abentley> jcsackett: Sure.
<jcsackett> abentley: thanks!
<abentley> jcsackett: Should you consider series as well as charm name when checking for QA?
<jcsackett> abentley: oh, good catch.
<jcsackett> yes, i should.
<thumper> morning folks
<thumper> sinzui: what's the status?
<thumper> gary_poster: ping
<sinzui> thumper, I am redeploying ALL streams because they are different when generated with 1.15 and 1.15 is installed. When I generated the streams on the weekend, I had 1.14.1 installed and I think its sync-tool plugin was used  for part of the streams :(
<thumper> :-|
<sinzui> thumper, fwereade: All tools are redeployed now. I will validate then restart the tests!
<gary_poster> thumper, hey
<thumper> sinzui: are you testing 1.15 deployments, or upgrades or both?
<thumper> gary_poster: hey, was giving a demo last night, showing off the juju-gui
<thumper> gary_poster: one key problem I had was the projector was 1024x768
<sinzui> thumper, both
<thumper> gary_poster: and lots of stuff was not visible
<thumper> sinzui: ack
<gary_poster> thumper, full ack :-(
<gary_poster> known problem
<thumper> gary_poster: is there a plan to have a zoom out for the entire interface?
<thumper> or perhaps this magic css fu I don't understand properly
<thumper> that changes more behaviour based on resolution
<gary_poster> thumper, no, UX intends bigger design change to address
<gary_poster> inspector grows to consume all of right hand side, and...other stuff they haven't worked out :-P
<thumper> gary_poster: ok, would be nice if it scaled well for projector demos with crappy resolution
<thumper> gary_poster: you could put that on the designers todo list :)
<gary_poster> thumper, it definitely is.  this is an important one
<thumper> gary_poster: cool, that's all
<thumper> gary_poster: although people were very interested in bundles
<gary_poster> ack thanks thumper
<thumper> and being able to save state from the gui
<thumper> s/although/also/
<thumper> I guess it reads ok either way
<gary_poster> :-) thumper, cool.  release today in progress gives you UX to save state from gui (was only keypress before).  bundle work getting very close but last bits of hooking things up take extra time, as is often the case.  was hoping for this week, but next week looks more likely
 * thumper nods
<sinzui> thumper, fwereade , azure is definitely walking! I will know in about 30 minutes if we can say it is running
<fwereade> sinzui, awesome news
<thumper> walking or working?
<thumper> sinzui: you bootstrapped ec2 ok?
 * thumper is having trouble with it
<sinzui> walking. I can status just the state-server within 15 minutes of bootstrap. I could not do that yesterday
<sinzui> thumper, I have not done ec2 yet. I am doing azure and hp
<thumper> sinzui: I have bootstrap succeeding, but no instance coming up
<sinzui> I will switch to ec2 then
<thumper> usually the management console is pretty on the ball with new instances
<thumper> but I'm not seeing anything
<thumper> us-east-1 (the default)
 * thumper tries ap-southeast-1
 * thumper tries ap-southeast-2
<thumper> not 1
<thumper> 2013-10-01 20:35:02 ERROR juju supercommand.go:282 cannot start bootstrap instance: cannot set up groups: cannot revoke security group: Source group ID missing. (MissingParameter)
<thumper> hmm...
<thumper> so much for aws having consistent apis
<sinzui> aws does appear to be slower doing a deploy
<sinzui> thumper, azure deploy PASS http://juju-test-release-azure-zw9r097xn7.cloudapp.net/
<sinzui> thumper aws deploy PASS http://ec2-23-22-68-40.compute-1.amazonaws.com/
<thumper> sinzui: hmm, are you using --upload tools?
<sinzui> No, I am not
<thumper> ah crap
<thumper> I think I'm looking at the wrong user
<sinzui> been there, done that this week
<thumper> haha
<thumper> I was
 * thumper feels like a dumb ass
<sinzui> thumper, HP deploy PASS http://15.185.254.245/
<sinzui> I can start the upgrade tests now. I think the messed up streams were the true cause of the HP upgrade failures.
 * thumper nods
<thumper> ok
<hazmat`> does anyone know what the purpose of the settingsref collection is?  afaics it's basically redundant info
<thumper> hazmat`: sorry, no idea
<hazmat`> no worries,  i'll ping fwereade tomorrow
<fwereade> hazmat`, heyhey
<hazmat`> fwereade, greetings, thought you might be done for the night.. just curious about the settingsref collection for the pymigration stuff.. namely what's the intent behind it
<fwereade> hazmat`, IIRC it allows us to clean up service config settings when they're no longer used
<hazmat`> afaics it's basically a count of settings users, but it seems to always be a key ref to a named service's settings with a unit count?
<hazmat`> fwereade, how's it any different than unit count?
<hazmat`> on the service doc
<fwereade> hazmat`, so, for each given charm a service runs, it has a (potentially) different config
<fwereade> hazmat`, the service always holds a ref to the service's current charm's version of the settings
<fwereade> hazmat`, units hold refs to the version suitable for the version of the charm they are currently running
<fwereade> hazmat`, when the refcount hits 0, we delete them
<fwereade> hazmat`, if we didn't have a refcount it would be hard to know when they were no longer used
<hazmat`> but it's not mvcc on the service config.. ie no previous copies or versions are held. the keys are all the service ref
<hazmat`> not versioned config refs
<fwereade> hazmat`, we can't assert things like "no doc exists with $value in $field" in the txn library
<hazmat`> fwereade, so the value should be unit_count+1
<fwereade> hazmat`, but we *do* hold older versions of the config, for use by those units that have not updated to a charm that is guaranteed to understand the current one
<fwereade> hazmat`, steady state, in general, yeah, I think so
<fwereade> hazmat`, service configs are keyed on service name + charm url, IIRC
<hazmat`> fwereade, cool i think that answers my question.. but re older versions of config.. how can they be held when the keys conflict
<fwereade> hazmat`, we've always got the charm against which they originally validated, and we use that to fill in defaults etc when presenting old configs to outdated units
<fwereade> hazmat`, if charm config changes we basically just drop old fields, add new ones as default, and reset to default any we can't make head or tail of (type changes)
<hazmat`> sure re defaults, but afaics you always have one actual service config, and a ref to the charm.. the older service config would have to be localdata on the unit.
<fwereade> hazmat`, the older service config stays there in state until all the units have upgraded
<hazmat`> fwereade, okay.. ah.. ic.. this is only in the context of upgrades, where you have both charm versions to merge against the current setting, doesn't apply to config changes, and multiple versions of applied config in state (ie juju set)
<fwereade> hazmat`, juju set only actually applies to the current one
<hazmat`> yup.. so not really multiple versions of config
<hazmat`> just synthesized for upgrades from multiple charms defaults
<fwereade> hazmat`, a given set of settings may well only half-validate against an older charm so we leave them frozen until they're culled
<fwereade> hazmat`, so yeah it's just a dumb cache
<hazmat`> fwereade, so the settingsref purpose is primarily garbage collection against extant refs
<fwereade> hazmat`, yeah
<hazmat`> fwereade, cool, thanks
<fwereade> hazmat`, anywhere you see a refcount in state the reason for it is basically isomorphic
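The refcounting scheme fwereade describes can be sketched in-memory (types and names here are illustrative; the real collection lives in mongodb and is manipulated through mgo/txn operations): each settings doc is keyed on service name plus charm URL, the service and each unit on that charm hold references, and the frozen settings are culled when the last reference is dropped:

```go
package main

import "fmt"

// settingsKey identifies one settings document: service configs are
// keyed on service name plus charm URL, so each charm version a
// service has run can keep its own frozen copy of the settings.
type settingsKey struct {
	service  string
	charmURL string
}

// refStore is an in-memory sketch of the settingsrefs collection: a
// count of users per settings doc. In state this is a separate
// refcount doc because the txn library cannot assert
// "no doc exists with $value in $field".
type refStore struct {
	refs     map[settingsKey]int
	settings map[settingsKey]map[string]interface{}
}

func newRefStore() *refStore {
	return &refStore{
		refs:     make(map[settingsKey]int),
		settings: make(map[settingsKey]map[string]interface{}),
	}
}

// incRef records another user of the settings doc, creating it on first use.
func (s *refStore) incRef(k settingsKey, cfg map[string]interface{}) {
	if s.refs[k] == 0 {
		s.settings[k] = cfg
	}
	s.refs[k]++
}

// decRef drops one user; when the count hits zero the frozen settings
// are garbage-collected, which is the whole point of the refcount.
func (s *refStore) decRef(k settingsKey) {
	s.refs[k]--
	if s.refs[k] <= 0 {
		delete(s.refs, k)
		delete(s.settings, k)
	}
}

func main() {
	st := newRefStore()
	k := settingsKey{"wordpress", "cs:precise/wordpress-1"}
	st.incRef(k, map[string]interface{}{"engine": "apache"})
	st.incRef(k, nil) // a unit still running the old charm
	st.decRef(k)
	fmt.Println(len(st.settings)) // old settings survive while referenced
	st.decRef(k)
	fmt.Println(len(st.settings)) // last ref gone: settings culled
}
```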
<hazmat`> fwereade, yeah.. txn guard behavior.. what's generally unclear is the actor / supervision tree for state
<hazmat`> mostly around the gc of docs, possibly b/c in several cases there are multiple answers. ie. service doc owned by last unit agent or cli vs. machine doc owned by machine or provisioner
<hazmat`> fwereade, thanks for the insights
<fwereade> hazmat`, the lifecycle docs in doc/ should be helpful there
<fwereade> hazmat`, lifecycles.txt:156 onwards
<fwereade> hazmat`, it has probably rotted in one or two places but it should still be largely accurate
<fwereade> hazmat`, in particular I did not think to update it when we fixed the "unit destruction depends on unit agents" bug
<thumper> fwereade: can I get you to run the ec2 tests for trunk?
<thumper> fwereade: it could be that saucy is causing this test to fail
<fwereade> hazmat`, in fact death-and-destruction.txt might be even closer
<hazmat`> fwereade, cool, i'll check them out
<fwereade> thumper, is there a particular one? I'm a bit up to my elbows in provider again here
<thumper> localLiveSuite.TestStartInstanceWithEmptyNonceFails
<thumper> same failure for openstack
<fwereade> thumper, both passing for me
 * thumper sighs
<fwereade> thumper, paste me the failures?
<thumper> fwereade: http://pastebin.ubuntu.com/6181433/
<thumper> fwereade: wrote to tools, but tried to read from different location?
<thumper> doesn't seem like a distro specific issue there
<thumper> ha, it isn't a tools problem but an image problem
<fwereade> thumper, still, it's surprising they're the only ones failing
<thumper> yeah
<fwereade> thumper, although, hmm, it is uploading version.Current there
<thumper> yeah, but that's the tools
<thumper> not the image
<thumper> it is complaining with the image search
<fwereade> thumper, I would not be too surprised to see we didn't include test image metadata for saucy though
<thumper> yeah, do you know where the test metadata is?
<fwereade> thumper, and specifying version.Current will force saucy where perhaps nothing else does
 * thumper nods
<thumper> fwereade: shall I try with "quantal", which is what other tests do?
<fwereade> thumper, that sounds sensible, yeah
<fwereade> thumper, heh, ec2 has a load in export_test
<fwereade> thumper, no mention of sauct
<fwereade> or even saucy
 * thumper nods
<thumper> I've modified the test to specify "quantal"
<thumper> lets see if it passes
<thumper> that looks like it fixed it
 * thumper proposes as a drive by in the other branch
<thumper> gah, fixed ec2 but openstack broke
 * thumper looks at image differences
<thumper> oh ffs
<thumper> too many non-constrained test variables
<thumper> arch should be set too
<sinzui> HP upgrade FAIL. Like before, the state server completed, but the units did not.. I have captured the log
<thumper> wow, lbox confused that merge
<thumper> fwereade: if you are still around https://codereview.appspot.com/14243043/
<fwereade> thumper, should we not be explicitly testing --system rather than just doing that shift-out-of-the-way lark in checkargs?
<thumper> fwereade: probably
<thumper> fwereade: do you know where the $(JUJU_HOME)/environments dir and associated .jenv files are created?
<thumper> fwereade: if you bootstrap the local provider with sudo as you need to, you can't do anything else because it is created 600 by root
<thumper> which includes status
<thumper> I assigned the bug to rog, but he hasn't done anything with it
<thumper> so local provider is still broken
<fwereade> thumper, it's all in environs/configstore, not sure offhand exactly where called from
<fwereade> environs.Prepare, probably
<thumper> hmm...
 * thumper will look after the gym
<fwereade> enjoy
<thumper> sinzui: please add to the todo list of live tests: local provider
<sinzui> I will
<thumper> sinzui: things fixed on trunk should target 1.15.1?
 * thumper -> gym
<sinzui> azure upgrade PASS
<dpb1> Hi all -- is this a known issue on a precise bootstrap node with the maas provider? # start juju-db
<dpb1> juju-db start/running, process 4473
<dpb1> root@qnc48:/var/log/upstart# cat juju-db.log
<dpb1> error command line: unknown option sslOnNormalPorts
<dpb1> use --help for help
<dpb1> seems like wrong version of mono or something
<dpb1> *mongo
<sinzui> aws upgrade PASS
#juju-dev 2013-10-02
<thumper-afk> dpb1: yes, you have the wrong mongodb
<thumper> sinzui: yay
<thumper> sinzui: I wonder if the hp upgrade failure was to do with the security group changes
<thumper> sinzui: it is the only thing that has changed and is different from azure/ec2
<thumper> sinzui: this occurred to me on the drive up the hill
<thumper> dpb1: can you do apt-cache policy mongodb?
 * thumper -> fud
<davecheney> dpb1: you do not have the correct versino of mongo installed
<davecheney> did ppa:juju/stable get added during cloud init
<davecheney> check
<davecheney> /var/log/cloudinit-output.log
<davecheney> please
<fwereade> aw ffs how is it 2:45
<axw> whoa, you're still awake :)
<fwereade> axw, didn't really intend to be, but I started programming
 * axw knows how that can be
<fwereade> axw, so I still haven't done your reviews, you may have to ask thumper
<axw> fwereade: okey dokey
<axw> fwereade: if I'm stuck, I'll start looking at the environ updater thing
<fwereade> axw, please quickly check with thumper & sinzui if there's anything we need to do with azure simplestreams fallbacks -- but I see an "azure upgrade PASS" above so you might not need to
<axw> fwereade: ok
<sinzui> fwereade, axw: azure is 100% happy when the image-metadata-url is set
<fwereade> sinzui, ah ok -- but still screwed if not?
 * thumper has some dots in the middle of his vision...
<sinzui> thats right
 * thumper tries to read around the floating dots
<thumper> I'm just looking at fixing the file permissions on the $(JUJU_HOME)/environments dir and files
<fwereade> axw, ok, I think unfucking the fallbacks is more important
<thumper> then I can look at the image-metadata-url
<fwereade> axw, but I actually must sleep
<axw> fwereade: no worries, I'll have a look
<fwereade> axw, thumper has context :)
<davecheney> dpb1: ping
<fwereade> axw, thumper: and I'd love a review of https://codereview.appspot.com/14254043 if you get a moment
 * thumper nods
<axw> fwereade: sure, will have a look later today
 * sinzui reboots to purge sshuttle non-sense
<hazmat> sinzui, iptables -F -X and --policy to reset
<sinzui> super. Thanks!
<hazmat> sinzui, http://ubuntuforums.org/showthread.php?t=1381516
<axw> sinzui: is there a bug for this image-metadata-url thing? I can't find anything in LP
<sinzui> axw: no. my bad.
<sinzui> axw I can report it now
<axw> sinzui: if you don't mind, I don't know what the issue is that's all
<axw> sinzui: I'm getting "The affinity group name is empty or was not specified. (http code 400: Bad Request)" when I try to bootstrap
<axw> is that the issue, or something new? :)
<sinzui> different
<sinzui> axw, When the azure provider goes looking for the os image, it bails early when it does not find a match in the "daily" stream set
<axw> ok
<sinzui> axw: this is all the info I have https://bugs.launchpad.net/juju/+bug/1233924
<_mup_> Bug #1233924: cannot bootstrap azure because no OS image found <azure> <bootstrap> <juju-1.15.0> <juju:Triaged> <https://launchpad.net/bugs/1233924>
<axw> sinzui: thanks!
<sinzui> axw, one moment, I need to move that bug to juju-core.
<axw> sinzui: I'm having trouble getting *anything* to work on azure at the moment, so may take me a little while to get to it...
<sinzui> axw: my setup is just this https://juju.ubuntu.com/docs/config-azure.html with the addition on tools-url: https://jujutools.blob.core.windows.net/juju-tools/tools
<axw> sinzui: likewise, tho I am using the Southeast Asia location
<axw> hmm maybe if I set tools-url..
<axw> nope
<sinzui> axw, I think you should try West US. The other bug you found is about not being able to deploy from any other region/affinity
<axw> sinzui: this was working before
<sinzui> Then I hope it will work again
<dpb1> davecheney:
<dpb1> sup?
<axw> sinzui: ok, reproduced the 1233924 on West US. I'll log a bug for the other one later, if I can reproduce it in a fresh env
<sinzui> fab
<davecheney> dpb1: you had questions ?
<davecheney> thumper: what the balls ?!
<thumper> what balls?
<davecheney> ip-10-240-27-224:2013-10-02 02:07:08 INFO juju.worker.uniter uniter.go:105 unit "w1/0" shutting down: tomb: dying
<davecheney> ip-10-240-27-224:2013-10-02 02:07:08 ERROR juju runner.go:211 worker: exited "uniter": permission denied
<davecheney> ip-10-240-27-224:2013-10-02 02:07:08 INFO juju runner.go:245 worker: restarting "uniter" in 3s
<davecheney> what the hell happened here
<davecheney> all I did was remove the w1 m1 relation
 * thumper shrugs
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1233936
<_mup_> Bug #1233936: worker/uniter: uniter restarts when relation removed <juju-core:New> <https://launchpad.net/bugs/1233936>
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1233938
<_mup_> Bug #1233938: cmd/juju: debug-log lines appear out of order <juju-core:Triaged> <https://launchpad.net/bugs/1233938>
<davecheney> aaand that is enough bugs for the morning
<thumper> davecheney: dude, rsyslog interleaving messages from remote clients may well not have them in strict time order
<thumper> davecheney: and, seriously, not critical
<thumper> critical means "down tools and fix right now"
<thumper> which this certainly isn't
<thumper> axw: https://codereview.appspot.com/14258043/
<thumper> axw: and lets skip the package review
<axw> thumper: sgtm
<thumper> axw: did you want to chat through the fallback mechanism?
<axw> thumper: I've got a fix for the azure thing, just writing a test
<thumper> axw: is this the "to maintain existing behaviour, don't fall back"?
<axw> thumper: uhh maybe we should chat, not sure what you mean
<thumper> :)
<axw> thumper: https://plus.google.com/hangouts/_/e2ed2f3559f1cd0314405f2303d3c222e83803c4?authuser=1&hl=en-GB
 * thumper -> coffee time
<axw> axw -> hammer time
<dpb1> error command line: unknown option sslOnNormalPorts
<dpb1> davecheney: getting that error when juju-db is starting on a maas node.  Wrong version of mono being used?
<dpb1> why do I keep doing that.
<dpb1> *mongo
<thumper> dpb1: yes, wrong mongo
<thumper> <davecheney> dpb1: you do not have the correct versino of mongo installed
<thumper> <davecheney> did ppa:juju/stable get added during cloud init
<thumper> <davecheney> check
<thumper> <davecheney> /var/log/cloudinit-output.log
<thumper> <davecheney> please
<davecheney> dpb1: can you provide /var/log/cloudinit-output.log
<davecheney> that will answer all the questions
<dpb1> thumper: how is that happening?
<dpb1> ya sec, checking
<davecheney> dpb1: apt-add-repository ppa:juju/stable is failing
<davecheney> ^ best guess
<dpb1> d'oh, right at the top
<dpb1> 2013-10-01 16:26:17,159 - cc_apt_update_upgrade.py[WARNING]: Source Error: ppa:juju/stable:add-apt-repository failed
<davecheney> dpb1: does this maas environment have access to the interwebs ?
<dpb1> kind of a sneaky little error there though
<dpb1> no... it's behind a firewall, this is for the OIL lab if you have heard of that effort
<dpb1> but I can selectively allow things through, and have.  probably something with a keyserver.
<davecheney> dpb1: you could try the apt-squid-proxy
<davecheney> if you do that
<davecheney> you need to configure it on the machine that initiates juju bootstrap
<davecheney> (the client)
<davecheney> which will automagically sniff the settings
<dpb1> davecheney: even keys?  I think keyserver.ubuntu.com still needs to be faked
<davecheney> dpb1: never had to do that
<dpb1> ok
<dpb1> I'll run it by is in the morning.  thanks, that gives me what I need to go on next. :)
<axw> thumper: https://codereview.appspot.com/14218044/ when you're free
<thumper> axw: ack
<thumper> axw:  https://codereview.appspot.com/14260043, and I'll review yours before getting back to the other.
<axw> thumper: okey dokey
<axw> thumper: I don't suppose you'll have time to review null provider stuff today?
<thumper> sure, after these two :)
<thumper> I HATE that we match log messages
<thumper> I think it is a FAIL on so many levels
<axw> thumper: I was going for expediency, but I do agree
<axw> I would prefer a dummy data-source that checked it was called
<thumper> axw: I understand
<thumper> not critical of this in particular, just that it exists
<thumper> axw: what needs the most urgent review?
<axw> thumper: https://code.launchpad.net/~axwalk/juju-core/null-provider-customsources/+merge/187978
<axw> thumper: then this: https://code.launchpad.net/~axwalk/juju-core/null-provider-storage-auth/+merge/187964
<thumper> ack
<axw> lunch, bbs
<thumper> axw: https://codereview.appspot.com/14258043/
<thumper> axw: looking at "Wire up authenticating httpstorage", if you are back, I'd like to chat, if not I'll continue with the review
<axw> thumper: I'm here
<thumper> axw: do we care that we are putting the auth token in the url?
<axw> thumper: shouldn't do, it's HTTPS
<thumper> always?
<axw> thumper: should be, let me just double check...
<axw> thumper: yep. only if you use ClientTLS, then it'll send an auth-key. and it'll only send it for modifying requests
<axw> thumper: modifying requests always go via HTTPS
<thumper> kk
<axw> thumper: did you want to hangout, or was that all you wanted to ask?
<thumper> I was going to ask about the boilerplate additions
<thumper> those are being moved into these extra env.jenv files in $(JUJU_HOME)/environments
<thumper> have you considered that?
<thumper> or is it deferred
<thumper> axw: also, for null provider, under what circumstances would the storage certs not be there?
<axw> thumper: I'm not really abreast of what I need to do for env.jenv
<thumper> axw: ok, we'll defer that bit
<axw> thumper: the storage certs are the same as those used for mongo
<axw> the CA cert is I mean
<thumper> axw: so surely by the time we want to use them, if they aren't there, we should be erroring?
<thumper> just thinking about the null provider Storage method
<axw> thumper: yeah, except Storage() doesn't return an error
<thumper> and then just use http if the certs aren't there
<axw> I could have it panic
<thumper> is it an error for them to be not set?
<axw> thumper: that's what it's doing now
<thumper> I think it is by the time it gets there
 * axw checks
<axw> thumper: it's not an error in config.Validate
<axw> bootstrap ensures there's one tho
 * thumper nods
<thumper> I think the validate bits are not errors for testing :)
<axw> ugh, I hate that
<axw> it's one thing to mock something out, but to change all behaviour to accommodate testing...
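The pattern settled on above — bootstrap guarantees the cert exists, so an accessor with no error result may treat its absence as a programming error — could be sketched roughly like this (all names are hypothetical, not the real juju-core null provider):

```go
package main

import (
	"errors"
	"fmt"
)

// environ is a hypothetical provider environ; caCert is guaranteed by
// bootstrap-time validation, not by config.Validate (which, as noted,
// allows it to be absent for testing).
type environ struct {
	caCert []byte
}

// newBootstrappedEnviron enforces the invariant up front, so that
// later accessors without an error return can rely on it.
func newBootstrappedEnviron(caCert []byte) (*environ, error) {
	if len(caCert) == 0 {
		return nil, errors.New("bootstrap: CA certificate not set")
	}
	return &environ{caCert: caCert}, nil
}

// Storage has no error result; a missing cert here means a bug
// upstream, so a panic (as discussed) is defensible.
func (e *environ) Storage() string {
	if len(e.caCert) == 0 {
		panic("environ.Storage: CA certificate unexpectedly missing")
	}
	return "https storage using CA cert"
}

func main() {
	env, err := newBootstrappedEnviron([]byte("pem-bytes"))
	if err != nil {
		panic(err)
	}
	fmt.Println(env.Storage()) // prints: https storage using CA cert
}
```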
<thumper> review done
 * thumper called for dinner
<axw> thumper: thanks!
<axw> if you're around later, I updated this: https://codereview.appspot.com/14218044/
<jam> axw: I think your fix is actually incorrect
<jam> what we *want* is to use the released stream by default
<jam> axw: so "provider/azure/environ.go" should be "http://cloud-images.ubuntu.com/released"
<axw> jam: ah, ok
<jam> axw: we might *also* need to do your fix
<jam> for whatever reason
<axw> well, either way my change makes the simplestreams code do what it's meant to do
<axw> I can update it to use released first tho, if that's what should be done
<jam> but the fix for bug #1233924 is that we should point to the released stream by default
<_mup_> Bug #1233924: cannot bootstrap azure because no OS image found <azure> <bootstrap> <juju-1.15.0> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1233924>
<jam> axw: I'm testing now, but I think this is fallout from before. We used to only have dailies for Azure
<jam> and then you would specify "image-stream: daily"
<jam> and it would use the daily cloud-url
<jam> but I think we just want to ignore that dailies exist
<jam> and if someone wants to try really hard they can set their imagemetadata-url to the daily stream
<axw> jam: ok. that sounds sensible
<jam> axw: it is a shame that they can't just set "image-stream: daily" and have it switch the metadata for you, but I think we can punt on that for now
<axw> jam: there's two different issues tho, so I'll create a new MP that fixes this in the proper way
<axw> jam: ah, you can't do that anymore?
<jam> axw: so there are 2 options
<jam> 1) we always only search cloud-images.ubuntu.com/released
<jam> 2) we also add a baseURL for cloud-images.ubuntu.com/daily
<jam> which 99% of the time won't have what you want
<jam> but *if* you set "image-stream: daily" then /released doesn't have what you want but /daily does
<jam> my above point is that (2) can be taken care of by a *user* setting imagemetadata-url: *and* image-stream
<jam> axw: does that make sense?
<axw> jam: yep
<jam> the reason they have to set 2 things, is that the files they need to *read* are in a different location, and the index key *in those files* is also different
<jam> I'd like to unify it, but I think it is something users won't really use
<jam> and not worth our time today
<jam> axw: have you bootstrapped on azure before?
<jam> I did just the 1-line fix to point at '/released'
<jam> and it has gotten to "finding products at path" ...
<jam> but it seems to be stuck
<jam> ah ffs
<jam> the URL is "/releases"
<jam> but the data inside the file is "released"
<axw> jam: yes I have bootstrapped before
<jam> http://cloud-images.ubuntu.com/releases/streams/v1/com.ubuntu.cloud:released:azure.json
<jam> notice the "releases" at the beginning and the "released" at the end
<axw> jam: I think you don't want to put "releases" on at all
<axw> no prefix
<axw> just .com/streams
<jam> axw: maybe, but looking at http://cloud-images.ubuntu.com/daily/
<jam> to differentiate it from http://cloud-images.ubuntu.com/releases/
<jam> axw: DefaultBaseURL = "http://cloud-images.ubuntu.com/releases"
<axw> hrm yeah...
<jam> axw: ah you know what, you might be right
<axw> I'm thinking of image-stream I think
<jam> that azure is providing an optional location
<jam> and we should be finding the real fallback for all clouds
<jam> axw: in which case, we probably just need your fix
<jam> as in, it is finding daily; it appears configured correctly, but doesn't have the data we want, so we need to keep searching.
<jam> I'm wondering if we just remove daily from baseURLs
<axw> jam: well, my fix does actually work. but I kinda get your point about not wanting to go to daily all the time
<axw> yeah that's what I was thinking
<jam> axw: I think it is silly to look at daily for most people
<jam> It isn't terribly harmful (with your fix :) but it isn't useful and adds roundtrips to bootstrap
<jam> axw: I'm trying to look at the logic just before what you changed
<jam> namely, it has "if err != nil && len(items) == 0 && ..."
<jam> how would we have items and an error?
<jam> that sure seems like it should be
<jam> if (err != nil || len(items) == 0)
<axw> jam: hmm yes, I think you're right
<jam> axw: anyway, it doesn't really change what you need to do
<jam> which is, simplestreams was expecting that it would get an error if nothing matched
<jam> axw: actually, we are supposed to
<jam> if len(matches) == 0 { return nil, newNoMatchingProductsError }
<axw> jam: not even; the function directly below GetMetadata (which calls it) has a comment saying that err == nil if no matches
<axw> that's internal stuff anyway
<jam> axw: ah, I see, it traps that error and says the higher up will go to the next, but the higher up doesn't because it has a "if err != nil { break }"
<axw> yup
<jam> axw: interestingly if we *did* return the error
<jam> then the higher up code would have continued properly
<axw> heh yes :)
<jam> axw: so I think it is a case of "ok, loop until we don't get an error, and the lowest level has an error if nothing matches" and forgetting that the intermediate step filters that error out
<jam> axw: so offhand, I would be fine with either fix
<jam> return the error
<jam> or trap for empty item list
<jam> I *think* we want to log that we didn't find anything, and just continue
<jam> and returning it as an error might cause us to double log it
<jam> but it might also give us a better final message when you don't run with "--debug"
<jam> axw: can you try both and see if there is a clear win?
<axw> jam: sure, I'll give it a shot
<axw> jam: hmm, if GetMetadata is changed to not return the error, it's going to require a lot of changes in callers I think.
<axw> seems reasonable for it to return an error if it doesn't find anything from any of the sources
<axw> oh but...
<axw> it's nil
<axw> never mind :)
<jam> axw: my point was, rather than putting "len(items) == 0" before breaking
<jam> change the function it calls to *not* filter out the noMatchingProductsError
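The fix jam describes — let the sentinel error propagate instead of filtering it out, so the caller's loop can keep searching — might look like this (simplified, hypothetical names; the real simplestreams code differs):

```go
package main

import (
	"errors"
	"fmt"
)

// errNoMatchingProducts is a sentinel meaning "this source was readable
// but held nothing that matched" — distinct from a real read failure.
var errNoMatchingProducts = errors.New("no matching products")

// getMaybeSignedMetadata stands in for the per-source lookup; here we
// pretend every signed lookup finds nothing, to show the fall-through.
func getMaybeSignedMetadata(source string, signed bool) ([]string, error) {
	if signed {
		return nil, errNoMatchingProducts
	}
	return []string{source + "/unsigned-item"}, nil
}

// getMetadata keeps searching on the sentinel, but stops on real errors
// — the key point being that the intermediate layer must *not* swallow
// the sentinel, or the outer loop can never continue.
func getMetadata(sources []string) ([]string, error) {
	for _, src := range sources {
		for _, signed := range []bool{true, false} {
			items, err := getMaybeSignedMetadata(src, signed)
			if err == errNoMatchingProducts {
				continue // try unsigned / the next source
			}
			if err != nil {
				return nil, err // genuine failure: stop
			}
			return items, nil
		}
	}
	return nil, errNoMatchingProducts
}

func main() {
	items, err := getMetadata([]string{"private-bucket", "cloud-images"})
	fmt.Println(items, err) // prints: [private-bucket/unsigned-item] <nil>
}
```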
<jam> axw: for context, rev 1878.1.1 was where Ian introduced GetMetadata
<jam> and inverted the logic about how to handle noMatchingProductsError
<jam> axw: it *used* to be that we would search across all data sources
<jam> for signed data
<jam> and then all data sources for unsigned data
<jam> axw: and note that environs/tools/simplestreams.go Fetch() has:
<jam> if (err != nil || len(items) == 0) && !onlySigned
<jam> axw: which is why it used to work
<jam> axw: have I succeeded in confusing you? Because I think I did make sense of it
<axw> jam: slightly ;)
<axw> so in other words, this logic used to be outside
<axw> and it used to check len(items) as well as err
<axw> when going to the next source
<axw> ?
<jam> axw: so it used to be we would return nil, [] so that we would try the unsigned metadata
<jam> but the len(items) == 0 is still only used for the signed/unsigned check and not the rest
<jam> axw: *however* the old code had the same bug
<jam> because err.noMatchingProductsError would break out of the search loop
<jam> so if it succeeded in reading any stream
<jam> but just didn't find anything
<jam> it would stop searching
<jam> (bzr revert -r 1878; and inspect environs/simplestreams/simplestreams.go line 419)
<axw> ta
<jam> axw: and I might understand why
<jam> axw: so imagine you put data into your private bucket
<jam> and then ask to bootstrap
<jam> do you want it to read that data, and then say "oh, I didn't find what you requested in your personal store, I'll just go read cloud-images.ubuntu.com"
<jam> or do you want it to fail right away with "your request couldn't be fulfilled with the data you provided"
<jam> this is the sort of "either way could be valid"
<axw> urgh, yeah I can see that
<jam> Either could be a WTF for the user
<jam> Why did you start something when I asked you to use my personal image
<jam> vs
<jam> Why is this failing to start.
<jam> So I'm *tempted* to just nuke the daily stream :)
<jam> And then talk about it in depth as a team
<axw> jam: yeah that's fair enough, I hadn't considered private cloud metadata
<jam> axw:  does that sound reasonable to you?
<jam> The person who thought about it the most is on vacation this week (Ian)
<axw> heh yeah :)
<jam> fwereade: are you around ?
<jam> axw: so I know the reason we wanted the shortcut-to-failure was because originally we had a different bug because of the Signed first then Unsigned
<jam> which is that users can only really upload unsigned metadata
<jam> and if we read all signed metadata first
<jam> then it would have fallen back to cloud-images.ubuntu.com (the only source for signed data) and users could never override it
<jam> we switched it to search each location signed then unsigned
<axw> yeah that bit makes sense
<jam> but we need to figure out what helps users the most
<jam> falling back so that when you ask for an armhf but have only configured amd64 in your images
<jam> we still give you the cloud-images instance
<jam> or not falling back so that we don't accidentally give you the non-customized image if your constraint accidentally doesn't match
<jam> axw: either way, current logic is just broken for Azure since it is injecting a source that won't ever have released data in it
<axw> jam: I think we should just do what you said; take out daily, and people can put in the stream if they really do want it
<axw> then we can revisit this when Ian comes back
<jam> axw: yeah, I still think we need to decide whether we want to fallback or not. but we can address it separately that way
<axw> I'll do that now
<jam> fwereade: when you get in, I'd like to G+ with you about simplestreams fallback-or-not I can give you context when you get in
<jam> axw: fwiw right now I think I'd rather fallback and give users something than just fail, so I'm happy you wrote the patch, but I want to make sure we understand it as a team
<axw> jam: no worries. I'll leave that MP there, and do a new one that just touches azure
<jam> axw: afaik Azure was the only one that ever actually supported the daily stream, and they only did because they had to (there were originally no full releases for azure)
<jam> axw: and I'm pretty sure on it, because Azure is the only one that supports "image-stream:" and you can't *read* the daily cloud images without that (because they use a different simplestreams key).
<jam> welcome back, btw
<axw> jam: my laptop just froze
<axw> ta
<axw> jam: last line in my log is from me, saying "jam: no worries..."
<jam> axw: so I'm thinking we just remove all the daily support from the Azure provider
<jam> none of the other ones ever supported it
<jam> because you have to have "image-stream:" in order to actually read the daily streams
<jam> and we *had* to do it back in the da
<jam> day
<jam> because there were only Daily Saucy images being published to Azure
<axw> jam: sgtm, that was just while saucy was in development afaik
<jam> now that we have official LTS releases
<axw> *nods*
<jam> we just treat Azure like the others
<axw> er 12.04.3 I mean
<axw> yep
<axw> jam: so does that mean taking out image-stream altogether?
<jam> axw: I think so, because it isn't a *juju* supported config
<jam> so the overhead of making sure it works for just one provider
<jam> seems a bit much
<jam> I can see a point to having it, but if we wanted it, we should have it everywhere
<axw> jam: can we defer that change? or are you concerned that people will rely on it in 1.16.0?
<jam> axw: defer away, I'm just thinking out loud
<jam> it is the natural follow on to removing daily from the search path
<rogpeppe> mornin' all
<rogpeppe> dimitern: i found out a little bit more about my dns name query problem BTW
<axw> morning rogpeppe
<rogpeppe> axw: hiya
<rogpeppe> axw: how much do you know about dns?
<rogpeppe> davecheney: ^
 * rogpeppe reboots
<axw> jam: https://codereview.appspot.com/14266043
<axw> <rogpeppe> axw: how much do you know about dns?
<axw> <axw> rogpeppe: not a heap, why?
<jam> axw: he was just rebooting just before you got back
<jam> he was saying something about having a dns name query problem
<axw_> grr, net keeps going up and down
<dimitern> rogpeppe, hey
<rogpeppe> dimitern: hiya
<jam> axw_: reviewed
<dimitern> rogpeppe, so what was it?
<rogpeppe> dimitern: the dns client is doing two lookups, one for an A (ipv4) record, which returns immediately, but another for an AAAA (ipv6) record, which always times out
<rogpeppe> dimitern: i'm not sure where the problem is - probably in my ISP's DNS server
<rogpeppe> dimitern: i can't currently work out a way of disabling the ipv6 request
<axw_> jam: ta
<dimitern> rogpeppe, I think you can tweak the sysctl.d or something to disable ipv6 on an interface
<rogpeppe> dimitern: i've already done that - it doesn't seem to make a difference
<jam> rogpeppe:  http://askubuntu.com/questions/32298/prefer-a-ipv4-dns-lookups-before-aaaaipv6-lookups ?
<jam> says you can tweak /etc/gai.conf
<jam> rogpeppe: are you using v6 on your local network ?
<rogpeppe> jam: no, i don't think so
<rogpeppe> jam: afaics the gai.conf suggestion will only affect the order of the requests, not whether AAAA requests are issued
<jam> rogpeppe: the link above mentions adding "options single-request" to /etc/resolv.conf
<jam> or disable ipv6 entirely (in a link from the one I gave)
<rogpeppe> jam: that's similar - it means both requests will be issued in series rather than parallel, i think
<rogpeppe> jam: and i've just disabled ipv6...
<jam> rogpeppe: but if the first succeeds will it do a second ?
<jam> anyway sure
<rogpeppe> jam: yes, i think
<rogpeppe> jam: i can see that the first is succeeding already (host -v prints an address almost immediately, then waits for 10-20s for the next request to time out)
<jam> rogpeppe: host isn't getaddrinfo IIRC, though I'll note if I do "host google.com" it has a valid ipv6 address
<jam> though I get v4 first
<jam> host arbash-meinel.com doesn't have ipv6 though it seems to pause for only 1s or so, not 10-20
<TheMue> fwereade: happy birthday
<dimitern> fwereade, yeah, dude!
<allenap> Does juju-core work with ARM okay? See comment #3 on https://bugs.launchpad.net/ubuntu/+source/maas/+bug/1233831
<_mup_> Bug #1233831: maas doesn't return zookeeper instances for newly provision environment <amd64> <apport-bug> <raring> <maas (Ubuntu):New> <https://launchpad.net/bugs/1233831>
<rogpeppe> jam: i see 10s pause if i use 127.0.1.1, 6.64s if i use my provider's DNS server directly, and 2.9s if i use google's dns service (8.8.8.8)
<jam> allenap: "zookeeper" would be python-juju
<dimitern> or a really old juju-core
<jam> allenap: though the log messages look a whole lot like juju-core
<jam> so the person might just be reporting it poorly
<dimitern> jam, juju-core used zookeeper for a while after the migration to go
<jam> allenap: so digging through the bug more I see the follow up posts. So we might-ish support it, but I don't think we were building the binaries for arm as of last week
<jam> someone was just requesting sinzui do an arm build for 1.15
<jam> so we might in the near future
<jam> hmm... I see armhf for 1.14.0
<fwereade> TheMue, dimitern: cheers
<fwereade> jam, heyhey
<fwereade> sorry I'm late, I was up rather late last night
<allenap> jam: The message is: stick with PyJuju on ARM for now?
<jam> tools/juju-1.14.1-saucy-armhf.tgz
<jam> fwereade: happy birthday (to continue the trend)
<jam> allenap: so we only have *saucy* builds with arm
<fwereade> jam, so, I would like to say "drop the daily datasource in azure and have done with it"
<jam> fwereade: right, we still have the open question about whether we want to fallback or not
<fwereade> jam, but I don't know enough about how/if it will react with the image-stream config setting, and other associated snippets of code inside that provider
<allenap> jam: I'll ask him to deploy saucy with 1.14.1 to see what happens. It may be he's okay with saucy in general. Thanks.
<jam> allenap: so I think we want to support it, but only having saucy builds is going to make things harder to debug rather than easier, I imagine
<jam> fwereade: so (1) we're going to just delete the azure daily stream lookup, no problem there we all agree
<jam> fwereade: but (2) if you have some data that gets found, but without matching, should we go to fallbacks?
<jam> eg, I upload a custom Precise image, set up the simplestreams for it, and then go "juju deploy cs:saucy/mongo"
<jam> fwereade: current implementation just fails because it finds your private-bucket simplestreams data and goes no further
<jam> fwereade: axw's patch would let us keep searching and find the Ubuntu stock images for saucy
<fwereade> jam, one possibility that crossed my mind was to give datasources a bool method meaning "if there's an index, fall back no further"
<jam> fwereade: which way seems less WTF for users?
<fwereade> jam, and so anything explicitly user-set will mask out everything else
<fwereade> jam, ie image-metadata-url, tools-url, and synced tools
<jam> fwereade: but is that better than just allowing them to override certain settings (augment cloud-images) rather than lose everything from there entierly
<axw> fwereade: I had the same thought too (a flag to say whether or not to keep going). But I sorta think daily shouldn't be in there by default anyway.
<jam> fwereade: also, I wouldn't mind chatting about the fact we *cannot* upgrade 1.14 => 1.15
<jam> fwereade: I saw you going WTF yesterday, so you might have a plan of attack
<jam> though I have one that I'm willing to try
<fwereade> jam, dimitern has been looking at the worst of the upgrade problems and I think has a plan of attack
<jam> fwereade: so download-tools.txt vs download-url.txt comes to mind, but beyond that ?
<fwereade> jam, there's a sort of nexus of suck around tools, SetAgentTools, and upgrader
<dimitern> fwereade, jam, yeah  - just filing a bug about it
<jam> dimitern: bug #1233934
<_mup_> Bug #1233934: upgrade from 1.14.1 to 1.15.0 fails <hp> <juju-1.15.0> <upgrade-juju> <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1233934>
<fwereade> jam, so yeah, those text files are part of the problem
<fwereade> jam, which I propose we handle by just never reading them
<fwereade> jam, and never storing any of the associated information in state
<fwereade> jam, write them out as we unpack, sure, why not
<jam> fwereade: well arguably we'd also like to cache the lookup in state, but that is pretty removed from the issue at hand.
<jam> so we *do* want FindTools to report them (hopefully it always can, as it seems to be a hard requirement right now if we actually want to validate the binaries we've downloaded)
<jam> But we could have a type FoundTools struct { Tools, SHA256, ...}
<jam> fwereade: as for reading downloaded-url.txt, that has been the indication of whether we have done a download
<fwereade> jam, but it's only there so ReadTools can hack together a *Tools
<jam> rather than statting the directory
<fwereade> jam, ok the fundamental thing is that SetAgentTools should not be SetAgentTools
<fwereade> jam, it should be SetAgentVersion
<fwereade> jam, so the upgrader should not be dicking around with ReadTools at all, ever
<dimitern> fwereade, jam, there it is - bug 1234035
<_mup_> Bug #1234035: upgrading from 1.14 to 1.15 fails (tools not found, upgrader) <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1234035>
<fwereade> jam, because the only useful piece of information is the version
<jam> fwereade: we need the Binary information
<jam> fwereade: so it has to be at least SetAgentBinary
<fwereade> jam, all the rest is just N copies of redundant data, one per agent, that's never even looked at
<jam> dimitern: isn't that just a dup of bug #1233934
<_mup_> Bug #1233934: upgrade from 1.14.1 to 1.15.0 fails <hp> <juju-1.15.0> <upgrade-juju> <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1233934>
<fwereade> jam, ehh, maybe, but really why bother denormalising it?
<fwereade> jam, we know hardware characteristics (ie arch) and machine series *anyway*
<jam> fwereade: we need the arch and series so that when you ask for "what tools do I need to upgrade to" we can point you to the right URL
<jam> fwereade: from where?
<jam> because we can search around for the machine the agent is running on ?
<dimitern> jam, maybe that one is a duplicate of mine, because it has more context inside?
<fwereade> jam, but yeah, it's convenient enough, I don't really have a quarrel with a version.Binary
<fwereade> jam, er, version.Current?
<fwereade> jam, the only other client of ReadTools is the lxc provisioner, which is a bit cracky itself, because we can get all the relevant information from the tools-for-this-machine API call
<jam> fwereade: so we have Upgrader.Tools() as an API, which needs to return a URL to download from, and preferably a Version so that we can make sure it actually changes something
<jam> we currently do the lookup just on Machine.Tag()
<jam> because we already had the desired version in EnvironConfig
<jam> but we needed the Arch and Series information
<fwereade> jam, yeah, which we have
<jam> We get that from whatever the agent previously recorded from an earlier SetTools call
<jam> (agent always calls SetTools on start)
<jam> before it asks if it should upgrade
<fwereade> jam, I know all this... and none of it has any reason to ReadTools because *all* that info is in version.Current
<jam> fwereade: right, except URL, which we don't need to pass around
<jam> we need it *from* Upgrader.Tools
<jam> to tell us what to download
<fwereade> jam, and we could even just send version.Current.Number because we *do* actually know series/arch, but that's a semantic change to a field so let's not go there
<fwereade> jam, ok, agreed
<fwereade> jam, so we've implemented this Tools method
<fwereade> jam, and we can drop ReadTools *completely* if we just use it in provisioner instead of messing around grubbing up our own data instead of asking the authoritative source
<jam> fwereade: so lxcBroker wants the full tools information because of ? It wants to give the instances identical tools with the same source?
<jam> (I'm pretty sure you can run a Saucy machine and have a Precise LXC on it)
<jam> fwereade: so I'm not 100% sure why lxcBroker needs tools at all
<fwereade> jam, well, at the moment, we're restricted to the same series
<jam> fwereade: why ?
<fwereade> jam, exactly because we had this readtools thing to mess about with, and *didn't* have an api method to talk to
<jam> fwereade: k, but even there do we actually use the URL?
<jam> if we are requiring the instances to use the file we have
<jam> we have it right here on disk
<jam> fwereade: anyway, sounds reasonable to change the SetAgentTools api
<jam> and then get rid of having to read the URLs everywhere
<fwereade> jam, right
<fwereade> jam, but saying "download it from the same place" seemed like the easiest way to get them there at the time
<fwereade> jam, fwiw "download from somewhere" will I think be the right answer in general anyway, once we have eg kvm machines that might not even be the same arch
<fwereade> jam, doesn't look like we actually enforce the series restriction in state though
<fwereade> jam, maybe that's ok, it's not really state's concern
<jam> fwereade: right, you *should* be able to switch series, even if you can't switch arch on containers (though I also know that you can run 32-bit LXC on 64-bit very easily and have done it many times)
<jam> you probably can't run 64-bit lxc on 32-bit host, or arm on i386, etc.
<fwereade> jam, yeah
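The agreed direction — persist only the agent's version.Binary rather than a full Tools blob with URL and checksum — might look roughly like this (hypothetical types; the real juju-core state package differs):

```go
package main

import "fmt"

// Binary mirrors the shape of juju's version.Binary: a number plus the
// series/arch needed to pick the right tools URL later.
type Binary struct {
	Number string
	Series string
	Arch   string
}

// machine keeps only the agent's running version; the URL is looked up
// on demand rather than denormalised into every agent document.
type machine struct {
	agentVersion *Binary
}

// SetAgentVersion replaces the old SetAgentTools: the version is the
// only piece of per-agent information worth persisting.
func (m *machine) SetAgentVersion(v Binary) {
	m.agentVersion = &v
}

func main() {
	m := &machine{}
	m.SetAgentVersion(Binary{Number: "1.16.0", Series: "precise", Arch: "amd64"})
	fmt.Println(m.agentVersion.Number) // prints: 1.16.0
}
```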
<jam> fwereade: anyway, this sounds like a great plan, and if dimitern is picking it up, I'm happy to let him do it and focus on other things.
<fwereade> dimitern, you ok with that?
<dimitern> fwereade, jam, cheers
<dimitern> I'll deal with it
<jam> fwereade: the nice effect is that it leaves us in a better place going forward, rather than "ok at least the upgrade works for 1.14 => 1.16"
<fwereade> jam, yeah, that's the hope, just strips away a little bit of unnecessary complexity
<jam> certainly good to step back and say "why do I need this field" rather than adding another kludge to handle the previous one
<fwereade> jam, the provisioner API will probably have to evolve a little as we progress, but that shouldn't hurt too much
<fwereade> jam, so, I am also mildly freaking out about this maas business
<jam> fwereade: which bit ? bootstrapping maas with juju to have juju on maas ? or some specific bug about juju and maas ?
<fwereade> jam, https://bugs.launchpad.net/gomaasapi/+bug/1222671
<_mup_> Bug #1222671: maas provider must only attempt to stop machines in the allocated state <cts-cloud-review> <Go MAAS API Library:Triaged> <https://launchpad.net/bugs/1222671>
<jam> ah, can we just do the name thing and not try to get it perfect ?
<fwereade> jam, so basically we have this stuff documented both in maas and juju, going back to python times
<fwereade> jam, "issue yourself a fresh maas api key for each juju environment and everything will be fine"
<fwereade> jam, apparently that is horsepoo
<fwereade> jam, rvba informs us that it doesn't work and I'm getting a strong impression that it never did
<fwereade> jam, which kinda leaves me WTFing a bit hopelessly
<fwereade> jam, but I guess it's not directly a 1.16 thing
<fwereade> jam, so I should probably be off to review some things that *are* 1.16y
<fwereade> jam, hell, I thought I'd published comments on https://codereview.appspot.com/13962043/
<fwereade> rogpeppe, can we land https://codereview.appspot.com/13968043/ ?
<rogpeppe> fwereade: will do
<rogpeppe> fwereade: i'm looking for a review of https://codereview.appspot.com/14207046/ BTW
<fwereade> rogpeppe, and https://codereview.appspot.com/14123043/ looks like it was blocked by a known intermittent failure so that might be worth another shot?
<fwereade> rogpeppe, looking at that now
<rogpeppe> fwereade: thanks
<fwereade> rogpeppe, LGTM essentially, bit of waffling in the CL, let me know your thoughts
<rogpeppe> fwereade: thanks. will do.
<jam> fwereade: I think you're right about FinishMachineConfig, I didn't really know it existed, and it feels a *bit* funny to have stuff not initialized until you "Finish" but it is absolutely a place that takes Config data and puts it into the MachineConfig (which New* don't)
<rogpeppe> fwereade: the ErrPrepared thing is a good point. it *can* currently leak out to the user in at least one circumstance, i think, which is when the user hasn't bootstrapped, invokes a juju command that expects a prepared environment, and the provider needs extra attributes created by Prepare.
<rogpeppe> fwereade: part of the problem is deciding on an appropriate error message in this case
<jam> fwereade: provisioner_task calls environs.NewMachineConfig but does *not* call environs.FinishMachineConfig ?
<fwereade> jam, FinishMachineConfig happens elsewhere, I think, just a mo...
<jam> fwereade: inside StartInstance
<jam> so... fairy nuff
<fwereade> jam, heh, I see, by hand every time :/
<jam> fwereade: probably why I didn't think about it in the first place
<fwereade> rogpeppe, surely we can infer non-bootstrappedness from non-preparedness?
<rogpeppe> fwereade: not really
<jam> fwereade: you may not have prepared it on your machine, but someone else bootstrapped already
<jam> I'm guessing
<rogpeppe> fwereade: bootstrapped-ness is somewhat orthogonal (actually, totally orthogonal currently)
<rogpeppe> fwereade: because you can be prepared without being bootstrapped
<jam> rogpeppe: fwereade's point is that you can't be bootstrapped without being prepared
<rogpeppe> fwereade: and currently, for backward compatibility, you can be bootstrapped without being prepared
<jam> but I think that is only if you are running on one machine.
<fwereade> rogpeppe, ok, got you
<rogpeppe> fwereade: in the future, i want to make it so that if there's no environment then you'll get an "environment not found" error or something
<fwereade> rogpeppe, ok, that one slightly nasty error message should not block it
<rogpeppe> fwereade: ok, thanks.
<rogpeppe> fwereade: i wonder if "incomplete environment configuration" might be a better message
<fwereade> rogpeppe, not sure it helps a lot tbh
<jam> fwereade: the other *really* nice thing about FinishMachineConfig is that I can actually test that setting the Config entry has an effect on the MachineConfig. The NewBootstrapConfig sort of stuff is not well isolated from actually starting an instance.
<jam> fwereade: so thanks for the pointer
<jam> makes me a lot happier there
<fwereade> jam, cheers :)
<dimitern> fwereade, fix done hopefully, live testing now
<fwereade> dimitern, <3
<natefinch> jam: did we decide on a review target? I think you mentioned something, but I don't remember an email about it, and searching for code review in my inbox is not helpful ;)
<jam> fwereade: https://codereview.appspot.com/13962043/ has been updated
<jam> natefinch: I didn't
<jam> with the release I decided lets focus on getting code done
<natefinch> jam: ok, cool
<fwereade> axw__, responded on https://codereview.appspot.com/14254043/ -- will be landing unless you still think it's wrong
 * TheMue => short lunch break
<fwereade> dimitern, actually if you have 5s would you take a quick look at https://codereview.appspot.com/14254043/ and let me know if my comment's obviously insane
<dimitern> fwereade, looking
<dimitern> fwereade, as I get it - it's better not to kill the environment if we fail to stop the instances?
<fwereade> dimitern, yeah, exactly, that's the original behaviour
<dimitern> fwereade, sgtm
<fwereade> dimitern, the env is not considered destroyed until its storage is trashed
<dimitern> fwereade, yeah, sounds sane; and (*drumroll*) there it is https://codereview.appspot.com/14231044
<jam> fwereade: fwiw, I think we want to just punt on the bootstrap-state file; we didn't really follow through with that, and now it gets us into "juju bootstrap" failing and leaving us in a state where we think it is bootstrapped and you have to destroy something that never existed.
<jam> fwereade: I also agree that if we fail to stop something we don't delete the storage based on how the if/return works
<jam> dimitern: don't we want SetAgentTools to actually strip the values? Or are you thinking to add a SetAgentVersion ?
<fwereade> jam, expand on the first comment a little please -- punt on what exactly?
<jam> fwereade: we added bootstrap-state so we could try to notice "you're running a python-juju env"
<jam> and not muck it up
<jam> fwereade: however, if you're running python-juju environment, you won't have a Cert file to talk to the state node anyway
<jam> so we fail really early
<jam> fwereade: so *I* think we should just nuke all the bootstrap-state code because it just complicates things for no gain
<fwereade> jam, except that we have hard dependencies on it
<fwereade> jam, LoadStateFromURL in jujud bootstrap
<jam> fwereade: I think all of those could be adjusted. We have provider-state rather than bootstrap-verify
<dimitern> jam, I don't plan on adding SetAgentVersion
<jam> fwereade: but notice that LoadState *doesn't* read the file
<jam> etc
<jam> I may be wrong on the naming
<jam> but we have 2 ways of "loading" and *one* checks the file, and the other *doesn't*
<jam> dimitern: in which case, I would actually strip the fields that we don't actually want in the DB
<jam> namely URL Size and SHA
<fwereade> dimitern, I like what jam's saying
<jam> we need Tools() to give the caller those values, but SetAgentTools doesn't need to record them.
<dimitern> jam, fwereade, we do want them, don't we?
<fwereade> dimitern, leave the API alone but make the SetAgentTools in state into SetAgentVersion?
<fwereade> dimitern, no
<dimitern> fwereade, why?
<jam> fwereade: it was something danilo had been working on, but he left and it didn't really get that polished, and other changes mean it is just like your appendix
<jam> cruft that is likely to get you in trouble :)
<fwereade> dimitern, because we do not want any of that crap in state
<fwereade> dimitern, full stop
<dimitern> fwereade, which crap?
<dimitern> fwereade, size, checksum or url?
<jam> dimitern, fwereade: sha size, URL
<fwereade> dimitern, anything except the agent's binary version
<dimitern> fwereade, so no verification of what we're trying to install?
<fwereade> dimitern, no verification of what we've *already* installed
<fwereade> dimitern, it's verified perfectly well before the code that sets it even runs
<fwereade> dimitern, if the verification worked, no need to set it
<dimitern> fwereade, so are we doing pre-install verification?
<dimitern> fwereade, we can't just remove the Tools method from the upgrader and change it to Version
<fwereade> dimitern, I never mentioned Tools
<fwereade> dimitern, we need to get that info in order to verify
<dimitern> fwereade, so only SetAgentTools -> SetAgentVersion
<dimitern> fwereade, in state and no changes in the API except for that?
<fwereade> dimitern, yes, and that just in state, leave the API alone as much as possible -- just ignore/drop fields, don't change names or anything
<dimitern> fwereade, I need to change the API to call SetAgentVersion instead of SetAgentTools at least
<dimitern> fwereade, but that's internal, won't change the interface
<fwereade> dimitern, yep, that's all good
<dimitern> fwereade, and I suppose that's another live ec2 tests after that
<fwereade> dimitern, probably, but I'll try to hit the rest of the review soon too
<dimitern> fwereade, ok
<fwereade> dimitern, ie I'll try to save you having to do more than one more round of live testing ;p
<dimitern> fwereade, ah well, that's nice :)
<jam> dimitern: right, so Tools doesn't change because we need a URL and when we download it we need to validate the SHA, but SetAgent*  doesn't need the URL or sha, etc. Arguably we should have different types
<jam> but we can live with 1 type and just ignore bits of it
<jam> fwereade: should we be adding "omitempty" flags to the Tools object ?
<dimitern> jam, machine and unit will have SetAgentVersion(version.Binary) methods instead of SetAgentTools
<dimitern> standup
<fwereade> jam, what, are we storing Toolses directly in state?
<dimitern> fwereade, we are storing coretools.Tools in state
<fwereade> jam, dimitern: can't we just make it a one-field struct?
<jam> mgz, fwereade, rogpeppe: https://plus.google.com/hangouts/_/3b7ffbb7710f75a160f5ae900736d7276d6b241f
<dimitern> fwereade, I'd rather not explode the complexity of this already not-so-small CL
<fwereade> dimitern, jam: or, ehh, keep the stupidity, just construct a Tools with just the version field
<jam> fwereade: so the issue is that we've been using it in the API and writing it to State
<jam> and they need to be a bit different now
<jam> fwereade: but that means we've written empty values into the DB unless we have omitempty, right?
<dimitern> it's ok if we just set the version
<jam> fwereade: anyway, standup we can live chat this after
<dimitern> we have omitempty only for the sha I think
<dimitern> we can make all the others except version omitempty as well
<jam> dimitern: +1
<fwereade> jam, I don't care what's in the DB, nobody has ever read anything from the db apart from the version
<fwereade> jam, it's straight-up crack putting external objects directly into the db *anyway*
<dimitern> fwereade, I did btw
<mgz> gahhh... deliver the packets google
<fwereade> mgz, I feel a bit better if it hates you too :/
<fwereade> dimitern, reviewed
<dimitern> fwereade, thanks
<fwereade> dimitern, I observe that ReadTools is now basically unused, but that can wait
<fwereade> dimitern, I am fairly heavily weirded out by what you do in Upgrader though
<jam> fwereade: going over https://launchpad.net/juju-core/+milestone/1.16.0 right now
<jam> are we actually intending to do all of them
<fwereade> dimitern, looks like you're getting the proposed tools and setting them as the current ones, this will mean you can't upgrade now
<jam> like bug #1089291
<_mup_> Bug #1089291: destroy-machine --force <juju-core:Triaged> <https://launchpad.net/bugs/1089291>
<fwereade> jam, that one? not a chance
<rogpeppe> fwereade: what do you think about the status of https://bugs.launchpad.net/juju-core/+bug/1089291?
<dimitern> fwereade, ReadTools is used in a few places, so I didn't want to remove it just yet
<_mup_> Bug #1089291: destroy-machine --force <juju-core:Triaged> <https://launchpad.net/bugs/1089291>
<dimitern> fwereade, not sure what you mean for the upgrader
<dimitern> fwereade, the upgrade works actually
<fwereade> dimitern, yeah, but no future upgrade ever will
<fwereade> dimitern, https://codereview.appspot.com/14231044/diff/1/worker/upgrader/upgrader.go#newcode101
<dimitern> fwereade, ok
<fwereade> rogpeppe, jam: that bug in particular is at least a few days' work I think
<rogpeppe> fwereade: thought it might be
<fwereade> rogpeppe, jam: https://bugs.launchpad.net/juju-core/+bug/1217781 is smaller and more important though
<rogpeppe> fwereade: we're going through https://launchpad.net/juju-core/+milestone/1.16.0 BTW
<_mup_> Bug #1217781: machine destruction depends on machine agents <cts> <cts-cloud-review> <juju-core:Triaged> <https://launchpad.net/bugs/1217781>
<rogpeppe> fwereade: that's not targeted to the milestone
<fwereade> rogpeppe, I think it's somewhat unrealistic regardless, but unquestionably a better candidate than 1089291
<fwereade> oh ffs
<rogpeppe> fwereade: gone again?
<fwereade> indeed
<fwereade> everything is working just fine, except for hangouts :/
<rogpeppe> gah
<rogpeppe> fwereade: finished now
<fwereade> heh
<rogpeppe> i see sporadic test failures on provider/null
<fwereade> rogpeppe, would you hand them over to axw_ for tonight or tomorrow or whatever "soon" is in his timezone please?
 * fwereade is grabbing a quick lunch
<rogpeppe> fwereade: will do
<jam> fwereade: what was the bug I was asking if we had (and didn't see) ? I can't remember now that I've gone through 10 other bugs.
<jam> fwereade: for "provider.Tools()" this is a bug that might be serious
<jam> fwereade: https://codereview.appspot.com/14231044/patch/1/1005
<jam> dimitern: ^^
<jam> specifically, for one of our releases Upgrader was calling Tools as well
<jam> but we didn't have Provider credentials yet
<jam> so it spins and keeps killing the API server.
<fwereade> jam, cloud-tools-archive
<jam> For Upgrader this was fixed with adding the DesiredVersion API call
<jam> fwereade: thanks
<fwereade> jam, surely that shouldn't be killing the whole api server though, just the failing task?
<fwereade> jam, was that when we were still using allFatal perhaps?
<jam> fwereade: my point being, "bouncing workers" seems like a bad thing to think is just-ok-to-do
<jam> and asking Tools before first "juju status" means it will keep bouncing
 * fwereade grabbing a bit more lunch
<jam> dimitern: ^^ Calling Tools requires us to have the Environment/Provider credentials enabled on the API server (aka, juju status has transferred the secret attrs).
<jam> I *believe* we are ok because of the call for WaitForEnviron
<jam> is that true?
<jam> or are we going to see LXC Provisioners start bouncing once they're up because they call Tools and it can't access the Storage to search for tools ?
<dimitern> jam, we have an environ before we call Tools
<dimitern> jam, that's WaitForEnviron's job
<jam> dimitern: not in upgrader
<jam> but yeah
<jam> I think Provisioner is ok
<jam> dimitern: so if you make the change William suggests here: https://codereview.appspot.com/14231044/patch/1/1015
<dimitern> jam, the upgrader won't call Tools - i'm changing it to set the current tools, as fwereade suggested
<jam> I think we're ok
<fwereade> dimitern, well, it and the provisioner will still both call Tools
<jam> dimitern: https://codereview.appspot.com/14231044/patch/1/1002 is still odd to me
<jam> fwereade: so Upgrader will definitely only call Tools after calling SetTools which is just fine. And It won't call Tools until it has called DesiredVersion
<jam> so we are generally ok
<jam> (if someone asks for "juju upgrade-juju" they should have connected and passed the secrets)
<jam> fwereade: there is a small race if Provisioner starts up first, sees we have a configured environment, and Upgrader hasn't finished calling SetTools first
<jam> I think
<fwereade> jam, ok, but provisioner *might* get in there -- indeed
<dimitern> jam, I commented on it, and it will be removed - rogpeppe already fixed it in trunk
<fwereade> jam, imo that doesn't matter enough to invalidate the technique
<fwereade> jam, that's almost the whole point of runner, so we can be resilient to this sort of thing
<jam> fwereade: so I feel like the whole thing is a bit incorrect, though.
<jam> We shouldn't be calling tools to *instantiate* the broker
<jam> we should have the lxc broker calling Tools when it is asked for a new instance.
<fwereade> jam, the broker should not be responsible for tools in the first place afaict
<jam> fwereade: My faith in non-bouncing runners is quite low
<jam> fwereade: Environs provide tools in the current layout
<jam> and LXC Broker is environ-lite
<jam> I guess
<fwereade> jam, broker is clearly not responsible for tools, because it *takes* possibleTools and doesn't look them up itself
<fwereade> jam, provisioner should be thinking "broker" not "environ"
<jam> fwereade: provider.StartInstance looks up tools for the broker
<fwereade> jam, afaict tools should be coming from the api in exactly the same way in both cases frankly
<jam> and uses either "environ.Environ.FindTools" or broker.Tools if it has the attribute
<fwereade> jam, that however is too big a change to make here and now
<fwereade> jam, I know what it does
<jam> fwereade: right, I wasn't going to block on it, but notice "this is really messed up" :)
<fwereade> jam, I'm just saying it's crap :)
<jam> fwereade: and has the advantage that we avoid a boot-time race condition
<fwereade> jam, yeah, I think it's tolerable
 * rogpeppe goes for lunch
<jam> fwereade: probably should be documented in a bug
<fwereade> jam, I'm just not quite sure what the bug is -- "lxc provisioner might bounce once or twice on upgrade"?
<jam> fwereade: lxc provisioner calls Tools before it actually tries to start an instance
<fwereade> jam, or possibly "lxc broker pretends to know about tools"
<fwereade> jam, hey, is there any reason to do *that* even?
<jam> fwereade: given we can just search for tools again, I don't see why.
<fwereade> jam, can we just drop lxc broker's Tools full stop?
<jam> but I don't know why the LXCBroker is playing games with tools
<jam> fwereade: so you do still need to find tools, and you don't have Env creds to search directly
<jam> fwereade: can I get a final LGTM on https://codereview.appspot.com/13962043/ ?
<jam> I think I addressed your comments
<fwereade> jam behind api, though, so irrelevant?
<fwereade> jam, looking
<jam> fwereade: if we change it to call the API, +1 from me
<fwereade> jam, that LGTM, thanks
<dimitern> fwereade, so we're changing SetTools of the upgrader API to take only a version and ignore the others as well ?
<dimitern> jam ^^
<jam> dimitern: version.Binary (aka with Arch and Series)
<jam> so version.Current content
<dimitern> jam, yeah, ok
<jam> axw_: what are you doing working? :)
<axw_> jam: just culling my inbox :)
<jam> axw_: and responding to IRC, and reviewing code, and ? :)
<axw_> hehe
<axw_> and now playing skyrim ;)
<jam> axw_: very fun game
<jam> PC?
<axw_> yup
<axw_> I don't play games all that often, but this one has me hooked for now
<jam> axw_: Steam claims I played it for 153 hours....
<axw_> jam: only 105 here ;)
<natefinch> man I wish I still had time for games.  Maybe once the kids are in school :)
<mgz> or when they leave home
<mgz> only a decade and a bit till freedom?
<natefinch> my kids are 0 and 2, so, like 18 years :)  Hey, at least graphics ought to be pretty awesome by then
<TheRealMue> natefinch: be promised it will get better
<fwereade> natefinch, I was playing a game just the other day!
<natefinch> TheRealMue: thanks.  I think it's pretty awesome, I just mourn for my free time, especially now that our 3 month old has decided 6am is a good time to get up (previously, I had been able to get up early to get some free time before the rest of the house woke up)
<fwereade> natefinch, iirc it was "dora the explorer saves the crystal kingdom", but hopefully laura's tastes will mature
<natefinch> fwereade: haha.  Yeah, I'm hoping to encourage good tastes in games, both digital and tabletop, since I enjoy both
<rogpeppe> fwereade: ha, i've discovered why the jujutest live tests are failing
<fwereade> rogpeppe, oh yes?
<rogpeppe> fwereade: the jujud that gets uploaded is a fake! (it's just a text file containing the text "jujud contents 1.15.1-precise-amd64")
<fwereade> rogpeppe, arrgh
<rogpeppe> fwereade: it was pretty awkward to find out though, because it seems that there's no default key pair used for new instances
<rogpeppe> fwereade: i used to be able to ssh into an instance even before cloudinit had finished
<rogpeppe> fwereade: so i had to detach the instance's volume and attach it to another instance so i could see what was going on
<fwereade> rogpeppe, that's odd, I didn't think we had any awareness of amazon keypairs
<fwereade> ever
<rogpeppe> fwereade: me neither. i wonder if an amazon default has changed.
<rogpeppe> fwereade: ah, looks like the live tests were broken here: https://codereview.appspot.com/14031043/diff/7001/environs/jujutest/livetests.go
<rogpeppe> fwereade: i *think* all those UploadFakeTools calls are spurious in a live test context, but i don't really understand why they were added.
<dimitern> so ian goes on holiday and all tools break loose :)
<rogpeppe> dimitern: it's not easy to review 4700 line diffs...
<fwereade> dimitern, ehh, less hell than I feared
<dimitern> :)
<dimitern> rogpeppe, any idea why I see this failing? http://paste.ubuntu.com/6183812/
<rogpeppe> dimitern: that test is new on me; just introduced in https://code.launchpad.net/~thumper/juju-core/environments-dir-permission/+merge/188754
<rogpeppe> dimitern: i'll have a look
<rogpeppe> dimitern: that's pretty weird
<rogpeppe> dimitern: what go version are you using?
<dimitern> rogpeppe, the one from the archive
<dimitern> rogpeppe, go version go1.1.1 linux/amd64
<rogpeppe> dimitern: from which archive?
<rogpeppe> dimitern: i.e. is it from a PPA? it's quite an odd error to get, because Current *is* (or should be) implemented on linux/amd64
<dimitern> rogpeppe, there http://paste.ubuntu.com/6183836/
<dimitern> rogpeppe, it's not from a PPA
<fwereade> rogpeppe, there may well be some opportunity to pull the tool-fiddling up to a higher level so we can upload real or fake as indicated by the, er, liveness of the test
<dimitern> rogpeppe, there's a golang bug which might be relevant https://code.google.com/p/go/issues/detail?id=6376
<rogpeppe> dimitern: that *shouldn't* be relevant as it won't have been cross-compiled
<dimitern> jamespage, do you know how the golang 1.1.1 package in main is compiled for raring/amd64? is it possible for it to be cross-compiled?
<rogpeppe> oh i hate it when apt-get tries to use the terminal
<dimitern> rogpeppe, looking here https://launchpad.net/ubuntu/saucy/amd64/golang-go-linux-amd64/2:1.1.1-3ubuntu3
<dimitern> rogpeppe, from the description one might assume it's actually cross-compiled on i386?
<rogpeppe> dimitern: BTW you've got two golang packages in the list you pasted - if i apt-get install golang, i get the second one, not the first
<rogpeppe> dimitern: (i.e. 1.0.2 not 1.1)
<rogpeppe> dimitern: if we are cross-compiling the standard go compiler, that's a potential problem
<dimitern> rogpeppe, yeah, but I forced the version I think
<rogpeppe> dimitern: ah, how do you force a version?
<dimitern> rogpeppe, ah, no sorry
<dimitern> rogpeppe, it seems i used jamespage's raring backports ppa: http://ppa.launchpad.net/james-page/golang-backports/ubuntu raring main
<rogpeppe> dimitern: right, i wondered
<jamespage> dimitern, please don't use that ppa
<jamespage> ppa:juju/golang
<jamespage> is the right one to use
<dimitern> rogpeppe, you force a version by setting it explicitly golang=2:1.1.1-3ubuntu3 .. something
<rogpeppe> dimitern: could you try what jamespage says and see if you still have a problem?
<jamespage> the packages in my ppa won't work
<dimitern> jamespage, rogpeppe, ok, will try
<jamespage> they cross compile in a way that is broken
<rogpeppe> jamespage: +1
<dimitern> aha!
<jamespage> the packages in juju/golang ppa do it right
<rogpeppe> jamespage: that looks like the issue dimitern was seeing
<dimitern> rogpeppe, ok it's fine now - using go version go1.1.2 linux/amd64 and the tests pass
<rogpeppe> dimitern: cool
<dimitern> jam, fwereade, updated https://codereview.appspot.com/14231044/ - live testing on ec2 in the mean time
<fwereade> dimitern, cheers
<jam> dimitern: I don't think we want to change the API do we ?
<jam> If you aren't changing the Tools struct, I don't see the point of changing the name of the API
<dimitern> jam, what do you mean?
<jam> change the state.SetAgentVersion but state/api/params/internal.go
<jam> https://codereview.appspot.com/14231044/patch/10001/11005
<dimitern> jam, the name of the type doesn't matter
<dimitern> jam, just the fields should be the same to be compatible
<jam> dimitern: fair enough, but I wouldn't change the type name when it still has a Tools object, we can change the type when we change the content
<dimitern> jam, that's why I put the DEPRECATE tag there
<dimitern> jam, it's ok we've done this before
<jam> dimitern: but *why* bother to change X when it does nothing when you have to change Y that actually does something and you can change X when you change Y
<dimitern> jam, because I wanted to clean up all refs to SetAgentTools in the code
<dimitern> jam, and I think it's properly documented, so it's not confusing
<jam> dimitern: so I think we wanted to change "tools.Tools" to add omitempty entries
<jam> and I'm still not sold on changing the name of the struct
<jam> but the rest LGTM
<dimitern> jam, I forgot the omitempty, will add that
<jam> I'll let fwereade and you discuss the other bits, I'm off for now
<dimitern> jam, thanks
<dimitern> bugger.. another random test failure http://paste.ubuntu.com/6183904/
<fwereade> jam, dimitern: I'd just as soon define (in params) `type Tools struct {Version version.Binary}` and use that -- seems equivalent, right?
<dimitern> that *evil* EnvironBootstrapStorager strikes again
<dimitern> fwereade, not really
<dimitern> fwereade, we do want to keep the logic with getting the url + size + sha from the environ, right?
<fwereade> dimitern, ehh, right
<fwereade> dimitern, type Version struct ;p
<dimitern> fwereade, we can change all that in 1.18 as we deprecate that stuff
<fwereade> dimitern, EntityVersion struct {Tag string, Tools Version}
<fwereade> dimitern, ok, I guess I don't see why we have future deprecation considerations
<fwereade> dimitern, all that any server requires today is Version, right?
<dimitern> fwereade, but that way we'll change the type of an API type's field
<dimitern> fwereade, I chose the most backwards-compatible approach I think
<fwereade> dimitern, if it's just a struct with compatible fields, what's the problem? it's the field names that screw us,not the types themselves
<dimitern> fwereade, but what if an older client sends fields that are missing from that type?
<fwereade> dimitern, dropped on the floor automatically, aren't they?
<dimitern> fwereade, we'll just ignore them I guess
<dimitern> fwereade, yeah, ok, will do
<fwereade> dimitern, lovely, thanks
<fwereade> rogpeppe, is this the one you saw? http://paste.ubuntu.com/6183904/
<rogpeppe> fwereade: yes
<rogpeppe> fwereade: and i know why it happens - i raised a bug
<rogpeppe> fwereade: https://bugs.launchpad.net/juju-core/+bug/1234125
<_mup_> Bug #1234125: provider/null: sporadic test failure <juju-core:New for axwalk> <https://launchpad.net/bugs/1234125>
<dimitern> rogpeppe, if it's sporadic, add an intermittent-failure tag (I did that now)
<fwereade> rogpeppe, nice, thanks
<dimitern> fwereade, EnvironVersion has to be type EnvironVersion version.Binary, right?
<dimitern> fwereade, no Tag to be seen
<dimitern> fwereade, and if it is, then I don't see why we need to even define a type - just use version.Binary
<fwereade> dimitern, we need a type to be where the Tools struct was, surely
<dimitern> fwereade, yes
<dimitern> fwereade, so if we redefine Tools as *version.Binary that should be enough?
<fwereade> dimitern, surely not, no
<dimitern> fwereade, ok, I'm confused now, why not?
<fwereade> dimitern, {Tag: "foo", Tools: {Version: "bar"}} != {Tag: "foo", Tools: "bar"}
<dimitern> fwereade, ah
<dimitern> fwereade, so we have type Version struct { Version version.Binary } and use that in the SetAgentVersion struct
<fwereade> dimitern, yeah
<fwereade> dimitern, although it would be great if we didn't continue the practice of verbing dumb structs
<fwereade> dimitern, afaict that data there is an AgentVersion, or possibly an EntityVersion -- not a *Set* anything
<dimitern> fwereade, there's already AgentVersionResult and Results
<rogpeppe> fwereade: i'm looking at the changes in SetUpSuite in https://codereview.appspot.com/14031043/diff/7001/provider/ec2/live_test.go and they look like crack but i don't know what the current status w.r.t. public buckets is
<dimitern> fwereade, don't you think it would be confusing to have AgentVersionResult, AgentVersionResults, AgentVersion and AgentsVersion ?
<rogpeppe> fwereade: the reason for the "verbed" structs in params is that params.X is supposed to mean "the parameters to the call X"
<fwereade> dimitern, and AgentVersionResult doesn't actually have an agent reference anywhere in it, it's just a VersionResult, surely?
<dimitern> fwereade, ok, will rename AgentVersionResult* to VersionResult*
<rogpeppe> fwereade: if we want to change that convention, i think we should do it all at once rather than changing some things not others
<fwereade> rogpeppe, I think that justification breaks down pretty quickly in practice, doesn't it? I haven't really spotted anything resembling consistency in params names thus far
<dimitern> fwereade, how about AgentToolsResult(s) ?
<fwereade> dimitern, ToolsResult? no agent mentioned in there...
<dimitern> fwereade, ok
<rogpeppe> fwereade: they might not be utterly consistent, but *most* of them abide by that convention AFAICS
<fwereade> dimitern, rogpeppe: sorry I must go, I think laura is crying about me not coming to see the birthday cake she's decorated for me and I am feeling like a Bad Person
<fwereade> bbs
<rogpeppe> fwereade: happy birthday, BTW!
<fwereade> rogpeppe, cheers
<fwereade> and cath has mollified laura so I have breathing space
 * fwereade still a bad person
<fwereade> dimitern, ok, reviewed with a few more thoughts, bbs
<dimitern> fwereade, cheers
<rogpeppe> fwereade, jam, mgz: it's my understanding that an environment's PublicStorage is now entirely ignored when checking for existing tools - is that right?
<jam> rogpeppe: PublicStorage (afaik) is completely ignored
<rogpeppe> fwereade, jam, mgz: so that the old logic on line 94 of https://codereview.appspot.com/14031043/diff/7001/provider/ec2/live_test.go can't be replaced decently
<rogpeppe> jam: the new code breaks any of the live tests that actually want to run the tools
<jam> rogpeppe: you can create a storage bucket, write tools to it, and set tools-url: in config
<abentley> sinzui: I believe we can disable auto-discovery in elasticsearch by setting discovery.zen.ping.multicast.enabled to False.  We would then enable discovery.zen.ping.unicast and use a peer relation to determine discovery.zen.ping.unicast.hosts.
<rogpeppe> jam: how does that help? what we need here is a fallback, not something that will be used if we've uploaded the actual tools
<abentley> sinzui: http://www.elasticsearch.org/guide/reference/modules/discovery/zen/
<rogpeppe> jam: ah, i see, you mean tools-url points to the simple-streams thingy
<rogpeppe> jam: presumably you'd need to generate simplestreams metadata too
<jam> rogpeppe: sync-tools / upload-tools generate all that stuff
<jam> so you need a bucket and then publish tools into it
<rogpeppe> jam: and from within Go ?
<TheMue> hmm, bzr tells me "This transport does not update the working tree of: bzr+ssh://bazaar.launchpad.net/~themue/juju-core/049-prepare-ec2/" but the branch can be seen in launchpad. anyone an idea?
<jam> rogpeppe: upload tools from within Go does create simplestreams metadata for the tools it publishes
<jam> TheMue: there shouldn't be a working tree for a bazaar.launchpad.net branch
<jam> so there shouldn't be any user files there
<rogpeppe> jam: ok, i'll have a look
<rogpeppe> jam: thanks
<TheMue> jam: ok, but why does bzr think i want to do so?
<jam> TheMue: I don't know, I don't see one here: http://bazaar.launchpad.net/~themue/juju-core/049-prepare-ec2/.bzr/
<jam> TheMue: I could probably do a better diagnose if you had more of the .bzr.log file (found in $HOME)
<jam> rogpeppe: sync.Upload
<rogpeppe> jam: ah, i was looking at tools.WriteMetadata
<mgz> TheMue: did you do something funny like `bzr pull -d REMOTE` rather than `bzr push REMOTE`?
<TheMue> jam: thx, i'll take a look in there
<jam> rogpeppe: sync.Upload does all the work for you, from what I can tell it is used by bootstrap
<TheMue> mgz: no, a standard push --remember ...
<rogpeppe> jam: in this case i don't want to build and upload the actual tools
<TheMue> mgz: and it is visible at https://code.launchpad.net/~themue/juju-core/049-prepare-ec2
<rogpeppe> jam: hmm, looks like more work than i want to do currently - i'll just go with even slower live tests for the time being
<jam> rogpeppe: can't you then do UploadFakeTools ?
<jam> there is a GenerateFakeTools somewhere
<jam> rogpeppe: provider/openstack/live_test.go
<jam> does:
<jam> t.metadataStorage = openstack.MetadataStorage(t.env)
<jam> envtesting.UploadFakeTools(c, t.metadataStorage)
<jam> rogpeppe: MetadataStorage is a testing thing that points at the location pointed to by the LiveTest config
<rogpeppe> jam: yes, i could probably do something like that, but i can't afford the hour or so it would take currently.
<rogpeppe> jam: because testing it is so slow
<rogpeppe> jam: and i'm hoping to push the addresser branch
<rogpeppe> jam: and i already have *something* that makes the live tests work
<rogpeppe> jam: i'll leave it as a todo
<jam> rogpeppe: fine with me
<mattyw> I'm a little confused by the new logging stuff, if I want to deploy a charm and have debug logging do I need to supply "--log-config=<root>=DEBUG" is that right?
<rogpeppe> mattyw: i'm afraid i'm not sure - it's new to me too
<rogpeppe> fwereade: would you be able to take another look at https://codereview.appspot.com/14038045/ please?
<rogpeppe> fwereade: it addresses your concerns i think
<rogpeppe> fwereade, jam: BTW i'm a little concerned about these changes: https://codereview.appspot.com/14258043/diff/6001/environs/configstore/disk.go
<rogpeppe> fwereade, jam: ISTM that they might not work on windows
<rogpeppe> natefinch: how much of our stuff is supposed to work under Windows?
<fwereade> rogpeppe, hell, that sounds very plausible
<rogpeppe> fwereade: i have a feeling that ensurePathOwnedByUser should be in a foo_unix.go file
<rogpeppe> fwereade: with a fallback for non-unix
<fwereade> rogpeppe, that sounds very likely to me to be correct
<natefinch> rogpeppe: define stuff?  The client builds under windows.  I'm not sure about the tests for the packages the client uses, though. And definitely, a lot of the code outside the client is linux-specific
<rogpeppe> natefinch: this is client stuff
<rogpeppe> natefinch: i suppose this stuff *might* work under windows, because os.Chown will still be there
<natefinch> rogpeppe: I'm taking a look at the code review
<rogpeppe> natefinch: thanks
<fwereade> natefinch, if you concur there's a problem, would you kick it over to thumper and cc me please?
<natefinch> fwereade: sure thing
<rogpeppe> fwereade, natefinch: i think that it'll probably work actually.
<rogpeppe> fwereade, natefinch: because SUDO_[GU]ID won't be set, so it won't ever try to do the os.Chown
<natefinch> rogpeppe: interesting... true, but is that still the correct behavior?  I'm not sure what uid == 0 means
<natefinch> (in the linux case)
<fwereade> natefinch, don't suppose you have a windows box handy to try it out on?
<natefinch> fwereade: sure do.
<fwereade> natefinch, great :)
<rogpeppe> natefinch: it means SudoCallerIds didn't find SUDO_UID and SUDO_GID i think
<dimitern> fwereade, updated https://codereview.appspot.com/14231044/ and the live upgrade from 1.14 to 1.15 passes
<dimitern> fwereade, but I decided to try changing version to 1.16.0 and try to upgrade to that, just in case
<dimitern> fwereade, but I'm running into issues, like phantom debug logs which are not where they say they are in the source and indeed nowhere to be seen
 * fwereade does not like the sound of *that* -- they're not 1.14ones, are they?
<dimitern> fwereade, http://paste.ubuntu.com/6184278/
<dimitern> fwereade, no
<dimitern> fwereade, in tools/list.go:107 there's indeed a loop to match tools in a slice/List but there's no log message there saying "trying to match tools..."
<dimitern> fwereade, neither in trunk, nor in 1.14
<fwereade> what version do you have installed locally?
<fwereade> because that "found existing jujud" is a bit scary to me
<dimitern> fwereade, I have my branch
<fwereade> dimitern, oh wait that's not so scary, I think, ok
<dimitern> fwereade, and I did rebuild jujuc and jujud after changing the version to 1.16.0
<dimitern> fwereade, the problem is from line 29 on in that paste - before that all seems fine i think
<fwereade> dimitern, delete pkg/linux_amd64 or whatever it is, maybe?
<dimitern> fwereade, I'll try
<rogpeppe> fwereade: any chance of a look at https://codereview.appspot.com/14038045/ please? it's a prereq if addressing is going to get done today
<rogpeppe> fwereade: (and you have gone over it before, so shouldn't be too onerous)
<dimitern> fwereade, ok, so the phantom message is gone, but I still get these on lines 30 and 31
<dimitern> fwereade, why is the filter empty i wonder?
<rogpeppe> mgz: it'd be nice if you could look too, as it's our branch: https://codereview.appspot.com/14038045/
<natefinch> rogpeppe, fwereade: uh, hmmm...   C:\Users\Nate>juju --debug bootstrap
<natefinch> 2013-10-02 15:29:05 DEBUG juju.environs.configstore disk.go:77 Making C:\Users\Nate\AppData\Roaming\Juju\environments
<natefinch> 2013-10-02 15:29:05 INFO juju.provider.ec2 ec2.go:215 preparing environment "amazon"
<natefinch> 2013-10-02 15:29:05 INFO juju.provider.ec2 ec2.go:193 opening environment "amazon"
<natefinch> 2013-10-02 15:29:05 ERROR juju supercommand.go:282 cannot create environment info "amazon": cannot rename new environment info file: rename C:\Users\Nate\AppDat
<natefinch> a\Roaming\Juju\environments\098297807 C:\Users\Nate\AppData\Roaming\Juju\environments\amazon.jenv: The process cannot access the file because it is being used b
<fwereade> dimitern, 1.16.0.*1* is dev
<natefinch> y another process.
<rogpeppe> natefinch: oh great
<rogpeppe> natefinch: i think i know the fix
<dimitern> fwereade, 1.16.0 is what I set - .1 is added by upload-tools
<rogpeppe> natefinch: yeah, we need to avoid the defer tmpFile.Close
<rogpeppe> natefinch: that's useful, thanks
<natefinch> rogpeppe: I figured it was something like that
<natefinch> rogpeppe: I don't think I actually tested the changes in that CL, though
<dimitern> fwereade, i'm trying 1.17.1 now
<dimitern> fwereade, now it works
<rogpeppe> natefinch: actually i think you probably have
<fwereade> dimitern, if you want 1.16.0 I think you'll need to generate tools metadata and sync *that*, won't you?
<dimitern> fwereade, so we can't use --upload-tools with even minor versions
<rogpeppe> natefinch: it would have failed earlier if they hadn't worked
<fwereade> dimitern, we can, but juju knows damn well that uploaded tools are not released ones ;p
<natefinch> rogpeppe: oh, ok.  It wasn't clear to me if I'd gotten to that code or not.  good.
<dimitern> fwereade, whew.. ok, 1.17 works and upgrades ok
<dimitern> fwereade, I'm not even trying to understand that whole complex process just now
<dimitern> fwereade, maybe tomorrow, once I start on minor version upgrades
<mgz> rogpeppe: sure, looking
<mgz> rogpeppe: while you're between things, what's the correct way to run amazon live tests right now?
<mgz> I get a panic when cdng into provider/ec2 and using -gocheck.f
<rogpeppe> mgz: i run them with "go test -amazon -test.timeout 2h"
<mgz> hm, probably can't do the cd
<rogpeppe> mgz: some of the live tests don't work in isolation
<rogpeppe> mgz: and many of the live tests are broken currently because they don't upload the tools correctly
<mgz> hm, indeed, not singling out one test to run does things
<rogpeppe> mgz: i have a fix that makes some of the live tests work, but breaks the normal tests. i want to do better, but i'm concentrating on addressupdater for the moment.
<dimitern> fwereade, btw https://codereview.appspot.com/14231044/ still waits final approval, now after live tests passed
<rogpeppe> mgz: which test were you trying to run and how did it fail?
<mgz> okay, the test lacks a prepare...
<mgz> panic on first line, which uses t.Env
<mgz> okay, that's an easy fix
<mgz> man, we really should just isolate by default
<mgz> sloppiness
<rogpeppe> natefinch, fwereade: fix configstore under Windows: https://codereview.appspot.com/14285043
<rogpeppe> mgz: the live tests are deliberately not isolated because we don't want them to take 4 hours to run
<rogpeppe> mgz: but i've been thinking we could do better
<mgz> rogpeppe: point there is they should have code to share stuff, not need code to not-share stuff
<natefinch> rogpeppe: looking
<rogpeppe> mgz: we could have one suite that just bootstraps once for the whole suite, and no tests in it destroy the environment
<rogpeppe> mgz: and another which isolates and lets the tests do whatever they like
<fwereade> rogpeppe, addresspublisher LGTM
<rogpeppe> fwereade: thanks
<fwereade> dimitern, rereviewing
<fwereade> natefinch, would you do rog's review please?
<natefinch> fwereade: was already looking :)
<fwereade> natefinch, have to do dimitern's and become coherent and professional before client call in 15 mins
<fwereade> natefinch, <3
<natefinch> fwereade: good luck ;)
<fwereade> dimitern, I don't understand what's incompatible about the name change -- it's not the API, it's just code
<dimitern> fwereade, you mean change only the client-side?
<dimitern> fwereade, SetTools is an API call
<fwereade> dimitern, it's just a method on an object as far at the provisioner knows
<dimitern> fwereade, when it's client-side, yes
<dimitern> fwereade, I thought you wanted me to change the SetTools server-side and client-side API calls to SetVersion
<dimitern> fwereade, to be compatible, we can just change the client-side
<natefinch> rogpeppe: I see the same pattern in utils.WriteYaml, but that's only called by the uniter, so shouldn't get run on Windows.  It's a good thing to keep in mind, though.
<rogpeppe> natefinch: yeah
<fwereade> dimitern, state object changes: good
<fwereade> dimitern, wire protocol changes: bad (except when analyzed carefully)
<fwereade> dimitern, state/api code changes for clarity: good, I think
<fwereade> dimitern, and the state/api interface change has nothing to do with compatibility
<fwereade> dimitern, just what we send on the wire
<dimitern> fwereade, yeah
<dimitern> fwereade, ok I'll change the client-side to be SetVersion, and keep the server-side as SetTools
<fwereade> dimitern, but, look, I'm not really bothered about the *actual* name of that method in state/api -- but essentially I think that everything marked DEPRECATE is just unnecessary
<fwereade> dimitern, I don't want to change the wire protocol further unless there's a behaviour benefit
<fwereade> dimitern, the suckiness of Version{Version: "foo"} is not worth the churn IMO
<dimitern> fwereade, how is it unnecessary?
<dimitern> fwereade, if we change these right away we're making it incompatible
<dimitern> fwereade, if we change it at a later release, it'll be ok
<fwereade> dimitern, I'm telling you *not to change them* unless there's a reason involving something being broken
<dimitern> fwereade, I'm not changing them
<fwereade> dimitern, our updating and compatibility story is, you may have observed, shit
<fwereade> dimitern, you're apparently saying "I will change them later"
<fwereade> dimitern, I don't think that's worth committing to
<fwereade> dimitern, lots of places the API is inelegant
<dimitern> fwereade, well it is, for clarity
<dimitern> fwereade, having Tools field which is actually just a version
<fwereade> dimitern, every such change carries a massive cost in compatibility
<dimitern> fwereade, I want to commit to improving the api over time
<dimitern> fwereade, without breaking anything now, hence the DEPRECATE tags
<dimitern> fwereade, g+?
<fwereade> dimitern, sorry, meeting started
<dimitern> fwereade, ok
<dimitern> fwereade, although please ping me when done
<mgz> oh gawd, this is amusing
<mgz> I've actually got the change needed for vpc down to one thing, as the real behaviour is quite a bit more liberal than docs and earlier experimentation implied
<mramm> awesome!
<natefinch> mgz: it's nice when things turn out easier than expected
<hazmat> mgz, nice
<hazmat> mgz, you mean the sec group id vs name distinction isn't black and white?
<mgz> nope, it *only* matters for filters, not general params, the online docs state it should matter for everything
<mgz> so, it's fewer extra API calls to fix for now
<mgz> the fun bit being we have zero test coverage for this inside juju-core...
<mgz> the one security group focused live test actually passed on default vpc as it didn't exercise the problem case
<mgz> ...or something else weird is going on with live tests
<mgz> ah, cute, I see
<TheMue> so, will leave now until monday. tomorrow national holiday
<TheMue> enjoy your evenings/weekend
<fwereade_> right, I might go and see my family for a bit -- does anyone need anything from me imminently?
<natefinch> fwereade_: it's your birthday, stop working! :)
<hazmat> adam_g, the problem with switching to lp:juju-deployer atm appears to be i don't have access to it. the branch needs to be owned by the group juju-deployers, atm it's owned by you, ie for comparison darwin series branch is owned by the group
<adam_g> hazmat, updated
<hazmat> adam_g, thanks
<hazmat> adam_g, not quite.. you have to push the branch to lp:~juju-deployers/juju-deployer/trunk and update the series branch.. branch is currently pointed to a  personal branch. lp:~gandelman-a/juju-deployer/trunk -
<adam_g> hazmat, one sec
<hazmat> jamespage, deployer 0.25 uploaded to pypi fwiw, no support yet for to = [1,2,3]
<adam_g> hazmat, look better?
<hazmat> adam_g, looks good, thanks
<hazmat> pushed latest darwin merges to that
<hazmat> ahasenack, could you update your recipe to point to trunk instead of darwin when you have a chance.
<ahasenack> hazmat: oh, sure, I must have forgotten
<ahasenack> no, wait, that was something else
<ahasenack> hazmat: so your "fork" was merged then
<hazmat> ahasenack, yes
<ahasenack> hazmat: is lp:juju-deployer enough, or do I need to specify the full path? Depends on how "development focus" is setup
<natefinch> Anyone else think it's a little weird that the help from the juju client includes how to install the juju client? :/
<hazmat> ahasenack, lp:juju-deployer is enough
<ahasenack> ok
<hazmat> ahasenack, thanks.. also fwiw, the version is now at 0.2.5
<ahasenack> on it
<jamespage> hazmat, thanks - uploaded to saucy
<mgz> yak #1 of the day: https://codereview.appspot.com/14302043
<sinzui> I see this file in a place I don't think it belongs
<sinzui> s3://juju-dist/tools/releases/juju-1.10.0-precise-amd64.tgz
<sinzui> ^ I think it is a test
<sinzui> ^ I want to delete it, or else I need to write a rule to never pull it down
<jam> sinzui: delete it, it isn't referenced in the released:tools.json file
<sinzui> jam, thank you
<fwereade_> jam, hey... we could actually delete all that public-bucket-related code now, couldn't we?
<jam> fwereade_: I would need to actually audit for it, but I believe so
<jam> fwereade_: I was looking around earlier and it seemed that way, but it is still mentioned in some tests, etc.
<fwereade_> jam, indeed
<jam> fwereade_: hopefully my patch lands, I'm off to sleep
<fwereade_> jam, tyvm, sleep well
<rogpeppe> mgz: i reviewed https://codereview.appspot.com/14302043/
<mgz> rogpeppe: thanks, rest incoming, was going to break up, but not *too* huge I think (though will take me a little longer to test out)
<rogpeppe> mgz: i've done the rest of the address updater BTW, but still lack most of the tests
<thumper> morning
<natefinch> thumper: morning
<thumper> natefinch: how are things going?
<natefinch> thumper: good good.  documenting stuff.
<thumper> fwereade_: anything I need to be aware of?
<thumper> natefinch: the maas tags?
<natefinch> thumper: and azure provider (and cleaning up some of the docs on the other providers)
 * thumper nods
<natefinch> thumper: tags are really pretty much already documented as a part of the constraints docs I did yesterday
<natefinch> oh, I also bugged the web guys to put a menu item for Docs on juju.ubuntu.com
<natefinch> well, I bugged Nick, and he bugged the web guys
<thumper> :)
<natefinch> thumper, mgz, rogpeppe, fwereade_: I noticed juju help aws is just about the only place we use the term aws, everywhere else we use ec2 or amazon.... in fact, I went through juju help amazon and juju help ec2 before going to juju help to find out what the right term was.
<thumper> hmm... we really want topic aliases
<rogpeppe> natefinch: ec2 would probably be a better term to use
<thumper> agreed, since it is the type we use in the config
<rogpeppe> yup
<natefinch> yeah, I was thinking ec2 being the displayed topic, and then having aliases for amazon and aws
<thumper> rogpeppe: do you know the current api only for non-bootstrap nodes is going?
<rogpeppe> thumper: i'm having difficulty parsing that question...
<thumper> rogpeppe: we are wanting agents on non-bootstrap nodes to not access state directly
<thumper> rogpeppe: do you know if this is done?
<rogpeppe> thumper: i think it is, yes, although it's possible the final branch hasn't actually landed yet
<thumper> rogpeppe: do you know the status of the CLI using the api?
<rogpeppe> thumper: not done
<rogpeppe> thumper: there's one command that uses the API
<thumper> is it status?
<rogpeppe> thumper: i wish
<rogpeppe> thumper: no, it's get
<thumper> ah...
<thumper> I heard that status is particularly shit with high latency to the cloud
<thumper> rogpeppe: have you started on the ha stuff?
<rogpeppe> thumper: no
<thumper> I'm looking at the address publisher and thinking that's what it is for
<thumper> is it not?
<rogpeppe> thumper: it's a necessary prereq, but it's also for the addressability stuff
 * thumper nods
<thumper> ok
<rogpeppe> thumper: and it's also for the API endpoint caching
<fwereade_> thumper, rogpeppe: fwiw, get and add-unit use the API
<rogpeppe> fwereade_: ah, i'd forgotten about add-unit
<mgz> well, this is a barrel of laughs
<thumper> mgz: what's that?
<arosales> thumper, whats your feeling on https://bugs.launchpad.net/ubuntu/+source/juju-core/+bug/1219879
<_mup_> Bug #1219879: [FFe] juju-core 1.16 for Ubuntu 13.10 <juju-core (Ubuntu):Confirmed> <https://launchpad.net/bugs/1219879>
<mgz> silly changes needed to goamz due to test server limitations/oddnesses with what the real stuff actually supports
<arosales> for US Oct 3rd
<thumper> arosales: I'm adding a comment
<arosales> https://launchpad.net/juju-core/+milestone/1.16.0 isn't looking good either
<thumper> arosales: heh, no
<arosales> thumper, is your comment going to address what is going to land feature wise per the bug description?
<thumper> arosales: https://launchpad.net/juju-core/+milestone/1.15.1 is better
<thumper> arosales: well, with 1.14, the local provider is broken on saucy
<thumper> arosales: 1.16 unfucks it
<arosales> nice to see some fix committed
<arosales> thumper, lol
<arosales> thumper, please add that to the release notes
<mgz> yak #2: https://codereview.appspot.com/14304043
<arosales> thumper, I can defer bug 1214178 to post 1.16
<_mup_> Bug #1214178: Azure provider configuration is difficult to configure <azure> <papercut> <juju-core:Triaged> <https://launchpad.net/bugs/1214178>
<arosales> thumper, I'll look for a couple others to retarget
<thumper> arosales: I have a feeling that most of those on 1.16 will need to be pushed off
<arosales> we'll need to hit some, but I agree most will have to be re-targetted
 * arosales just wanted to defer the low hanging fruit
<natefinch> thumper, mgz, anyone else: https://codereview.appspot.com/14207048/    just some doc changes.  Except for help_topics.go, most of it is just formatting changes, not content changes.
<fwereade_> rogpeppe, hey, what's the deal with verifyAllConfig in ec2?
 * rogpeppe looks
<rogpeppe> fwereade_: ah, ok
<fwereade_> rogpeppe, looks like it's already gone through validation before it's called
<rogpeppe> fwereade_: i'm coming to that conclusion
<rogpeppe> fwereade_: i think what i was aiming towards is that no defaults come into play at that point
<fwereade_> rogpeppe, I think it's too late for that
<rogpeppe> fwereade_: yes, we need a better SetConfig
<rogpeppe> fwereade_: i think we should leave those changes for another day
<fwereade_> rogpeppe, sorry, which? picking control-bucket?
<fwereade_> rogpeppe, I'm not seeing the issue, expand please
<rogpeppe> fwereade_: the issue i'm suggesting leaving alone for now is that providers should not pick default values except right at the beginning.
<fwereade_> rogpeppe, ah, ok, makes some sense; and rest easy, I have no intention of touching it :)
<rogpeppe> fwereade_: cool. just throw validateAllConfig away.
<mgz> geh, what's the idiom for returning an empty struct in an error case, without needing to type `return package.StructName{}, err` every time?
<rogpeppe> fwereade_, mgz: review would be much appreciated: https://codereview.appspot.com/14306043
<rogpeppe> mgz: var zero = package.StructName{}
<mgz> rog, perfect timing, just started livetest run so otherwise idle
<rogpeppe> mgz: lovely
<mgz> rog, just at the top of the func? guess it makes sense as I use it twice
<mgz> rogpeppe: while I read, see 14304043 for me :)
<rogpeppe> mgz: looking
<rogpeppe> mgz: reviewed
<mgz> I'm about half way :)
<rogpeppe> mgz: pretty good considering yours is about 1/16th the size :-)
<rogpeppe> mgz, fwereade_: do you think addressupdater should be part of JobManageState or JobManageEnviron ?
<fwereade_> rogpeppe, feels a bit statey to me
<rogpeppe> fwereade_: i started off with that, but just decided otherwise
<fwereade_> rogpeppe, but it's most closely connected with the provisioner, indeed
<rogpeppe> fwereade_: currently everything in ManageState could theoretically live without a valid Environ
<fwereade_> rogpeppe, nice theory :)
<rogpeppe> fwereade_: lol
<rogpeppe> fwereade_: that's my aim, eventually anyway
<fwereade_> rogpeppe, I think it's safe to say that whatever we pick we'll find out why it was wrong soon enough
<fwereade_> rogpeppe, follow your heart :)
<rogpeppe> fwereade_: ok will do...
<fwereade_> rogpeppe, (I take particular pleasure in that phrase as a simpsons echo... "hey, boss, should I shoot him gangland-style or execution-style?" "follow your heart"
<fwereade_> )
<rogpeppe> .
<rogpeppe> :-)
<hatch> hey all - in charm options are .'s a valid word delimeter? 'foo.bar.baz' vs 'foo-bar-baz'
<fwereade_> hatch, I would not personally stake money on .s working right
<hatch> fwereade_ ok there is an issue with the Hadoop charm, and the GUI because it uses .'s
<hatch> could we set a rule that they are invalid? :)
<fwereade_> hatch, hmm, is it just hadoop, do you know?
<hatch> fwereade_: I haven't been able to find another
<hatch> and I have been looking :)
<fwereade_> hatch, ok, then, I'm more than happy to call that invalid -- do you know what happens if you try to use those settings on the CLI?
<hatch> rumor has it it works
<hatch> I'm going to try it now
<hatch> it'll take a bit for it to deploy
<fwereade_> hatch, ok, it would be great if it did break somewhere so I could declare it to be poor input validation with a 100% clear conscience
<hatch> certainly - the fact it's an approved charm with them made me a little curious
<hatch> so once it spins up I'll see if I can set a config option
<fwereade_> hatch, would you also debug-hooks it please, and see whether you can config-get it correctly too?
<hatch> fwereade_: I've never done that before - would it be `juju debug-hooks hadoop/0 config-changed` ?
<fwereade_> hatch, that sounds right, but you can skip the hook name to debug everything iirc
<fwereade_> hatch, (do it in a separate terminal or it will be frustrating)
<hatch> will do
<rogpeppe> mgz: "
<rogpeppe> I assume this gets cleaned up at test end anyway, so doesn't need any of its
<rogpeppe> own?
<rogpeppe> "
<rogpeppe> mgz: i don't get that remark
<mgz> we're fiddling with something global-looking that gets passed in
<mgz> I'm just used to change-this-thing-for-testing functions coming with their own cleanup
<mgz> rogpeppe: do you know any way of switching region with the goamz live tests?
<rogpeppe> mgz: state is reset at test end, yeah
<rogpeppe> mgz: not sure.
<rogpeppe> mgz: i'm not sure where they take their region from
<mgz> yeah, I couldn't see at a glance
<mgz> I have a hack that works for the juju-core ones...
<mgz> hm, the live TestBootstrapAndDeploy test is failing for me... and I think causing other fallout
<mgz> guess I just propose without confidence of the live tests passing for now
<hatch> fwereade_: I am not able to set config options via the CLI with .'s FYI
<fwereade_> hatch, awesome! sounds like an input-validation bug in juju-core, and bad-config bug in the hadoop charm :)
<hatch> yup - ok thanks, I appreciate the help
<fwereade_> anyone free to review https://codereview.appspot.com/14307043 quickly?
<mgz> fwereade_: looking
<mgz> yaks shaved, payoff: https://codereview.appspot.com/14309043
<mgz> and day done
<rogpeppe> mgz: well done
<rogpeppe> mgz, fwereade_: this actually integrates the address updater: https://codereview.appspot.com/14251044
<rogpeppe> fwereade_: looking
<rogpeppe> fwereade_: reviewed
<rogpeppe> thumper: any chance of a review of https://codereview.appspot.com/14251044 ?
#juju-dev 2013-10-03
<rogpeppe> mgz: reviewed
<fwereade_> rogpeppe, in case you're there, my unease has crystallized -- how does addressupdater play with containers?
<rogpeppe> fwereade_: it doesn't currently
<fwereade_> rogpeppe, I guess it doesn't have to yet
<rogpeppe> fwereade_: we have to work out how we're going to do container addressing first
<rogpeppe> fwereade_: in case you missed it, i'm after a review of this, which actually integrates the address updater: https://codereview.appspot.com/14306043/
<fwereade_> rogpeppe, well, we know that one instance will have at least N addresses that need to be shared amongst the machine and its containers
<rogpeppe> fwereade_: i'm not sure who will be responsible for allocating container addresses
<rogpeppe> fwereade_: whatever happens, there has to be *something* like the address updater at the top level, i think
<fwereade_> rogpeppe, yeah, I think the trickiness it just going to be passing the extra addresses on to containers
<rogpeppe> fwereade_: yes
<fwereade_> rogpeppe, and that's orthogonal, so... LGTM
<fwereade_> rogpeppe, and you reviewed mgz's already, so I'm going to bed :)
<fwereade_> gn
<rogpeppe> fwereade_: gn
<rogpeppe> '
<thumper> fwereade_: you still up?
<thumper> geez
<thumper> rogpeppe: you heading to bed too?
<rogpeppe> thumper: i was hoping i might get the API client address caching done...
<thumper> rogpeppe: how much does it still have to do?
<rogpeppe> thumper: 1) it needs State.APIAddresses to return addresses from the state server machines rather than from mongo peers
<rogpeppe> thumper: 2) it needs the API login to call State.APIAddresses and return them as the result
<rogpeppe> thumper: 3) it needs some code to actually save the API endpoint returned from the API login
<rogpeppe> thumper: the first two are pretty trivial; the third requires a little more thought but should be easy enough.
<rogpeppe> thumper: all of them can actually be done independently
<thumper> and before you sleep?
<rogpeppe> thumper: erm, maybe i'm being a little optimistic :-)
<rogpeppe> thumper: i think i'll probably just land the address updater
 * thumper nods
<rogpeppe> thumper: it would be nice to change status so that we can actually see the new address info too...
<thumper> :)
<rogpeppe> right, i am doing No More
<rogpeppe> thumper: g'night
<thumper> night
<davecheney> sinzui: +1 on your change
 * rogpeppe just live bootstrapped with an environments.yaml entry that's simply: "envname": {"type": "ec2"}
<davecheney> rogpeppe: wowzers
<davecheney> talk about simple
<davecheney> rogpeppe: what is the next step, bootstrap with no envuronments.yaml and just some flags?
<rogpeppe> davecheney: it did use the conventional $AWS_ environment vars, so it's a bit of a cheat really
<rogpeppe> davecheney: i think that would be good, yeah
<davecheney> juju bootstrap -e rog -t ec2
<davecheney> creates ~/.juju/environments.yaml ?
<rogpeppe> davecheney: no need
<rogpeppe> davecheney: creates ~/.juju/environments/rog.jenv
 * davecheney twitches
<davecheney> prefixing everything with j sounds very 2002
<davecheney> :)
<rogpeppe> davecheney: look, there was a bikeshed CL specifically for that :-)
<rogpeppe> davecheney: you didn't weigh in so you're out
<rogpeppe> davecheney: FWIW i'm not that keen on .jenv either, but it was the only reasonable suggestion
<davecheney> rogpeppe: fair enough, my fault for not being involved
<davecheney> i'll shut up
<rogpeppe> davecheney: if you really have a better suggestion, i'm happy to hear it.
<rogpeppe> davecheney: it can be changed now; not so easily in the future.
<davecheney> rogpeppe: ignore my griping, i'm a chicken, not a pig
 * rogpeppe duly ignores
<rogpeppe> right, the address updater has landed. i'm going to bed.
<rogpeppe> davecheney: g'night.
 * thumper sighs heavily
<thumper> why is dummy provider so dumb
<thumper> is there any specific trick I need to know about dummy.Storage()?
<rogpeppe> thumper: what about it?
<thumper> I have a test hanging forever
<thumper> inside dummy bootstrap method, I'm trying to call common.SaveState
<thumper> so it behaves like a real environment
<thumper> for some other tests
<rogpeppe> thumper: dummy.Storage shouldn't be doing anything special
<thumper> but the test just hangs
<thumper> how can I tell where it is hanging?
<rogpeppe> thumper: ^\
<rogpeppe> thumper: i.e. SIGQUIT
<thumper> ok, but it is
<rogpeppe> thumper: that'll give you a stack dump
<thumper> is there a control key for that?
<rogpeppe> thumper: control-backslash
<thumper> ta
<rogpeppe> thumper: paste the stack trace; i have a suspicion what might be your problem
<thumper> rogpeppe: http://pastebin.ubuntu.com/6186255/
<rogpeppe> thumper: i think you're probably calling common.SaveState while the mutex is held
<thumper> rogpeppe: hopefully I waited long enough
<thumper> ah
<rogpeppe> thumper: yup, looks like that's the issue
<rogpeppe> thumper: (see goroutine 52)
 * thumper nods
<rogpeppe> right, i really *am* going to bed now
<thumper> rogpeppe: that was it, passes now
 * thumper runs all the tests again
<thumper> oh...
<thumper> test panic
 * thumper wondered what I did
<hatch> is there any way I can deploy a charm locally (lxc) from my launchpad branch?
<hatch> the branch is lp:~hatch/+junk/hadoop-charm-update
<hatch> but it says that is an invalid charm name
<thumper> hatch: I think you may need to have a copy locally
<thumper> axw: I got a test failure with the null provider tests
<thumper> axw: may be a timing thing
<axw> thumper: yeah, rog filed a bug
<hatch> thumper: yeah that's what it looks like in the docs....thanks for confirming
<axw> will look into it shortly
<thumper> axw: this one? environSuite.TestEnvironBootstrapStorager
<thumper> hatch: np
<axw> thumper: yup
<thumper> ok,
<axw> https://bugs.launchpad.net/bugs/1234125
<_mup_> Bug #1234125: provider/null: sporadic test failure <intermittent-failure> <juju-core:New for axwalk> <https://launchpad.net/bugs/1234125>
 * thumper sighs
<thumper> why do core-devs file "New" bugs
<thumper> it should at least be triaged
<thumper> stabby stabby
<thumper> wow
<thumper> found out why so many of our bootstrap tests are slow
<thumper> automagical upload is now rebuilding jujud every test :)
<thumper> why is it one hour jobs become four hour jobs
<sidnei> thumper: around?
<thumper> sidnei: kinda
<sidnei> thumper: just wondering if bootstrapping from 1.5.1 just built from trunk this morning it should be picking tools 1.4.1.1 or am i doing something wrong?
<sidnei> and by 1.4 and 1.5 i meant 1.14 and 1.15 of course
<davecheney> sidnei: as i understand it from 1.15.x onwards it will only be able to work if it finds an exact tools match
<sidnei> uhm, so this shouldn't have happened, if that's indeed correct
<sidnei> unless im missing some branch that was landed recently
<thumper> sidnei: I'd say it certainly shouldn't be
<sidnei> https://pastebin.canonical.com/98419/ fyi
<sidnei> let me paste that to paste.ubuntu.com
<sidnei> oh, oh. i think i know what the problem is
<sidnei> yeah, much better now. 'sudo juju bootstrap' was picking /usr/bin/juju obviously.
<thumper> davecheney, axw:  https://codereview.appspot.com/14279044/
<axw> thumper: looking
<thumper> sidnei: yea, I've picked up `sudo $(which juju) bootstrap' from axw
<thumper> before I'd hardcode the path
<sidnei> ah, nice one
 * davecheney looks
<thumper> it is a beautiful day here
<thumper> once my wife is back with kid from the doctor, we are going for a picnic
<thumper> \o/
<davecheney> thumper: +1 on that change
<thumper> and this? https://codereview.appspot.com/14321043
<davecheney> nope https://codereview.appspot.com/14279044/
<davecheney> is there another review ?
<thumper> yeah, the one I just said :)
<axw> I think there's a bug for this too
<thumper> there is
<thumper> linked to the branch
<axw> ah
<thumper> https://bugs.launchpad.net/juju-core/+bug/1216775
<_mup_> Bug #1216775: cmd/juju: local provider doesn't give a clear explanation when lxc is not configured correctly <papercut> <juju-core:Triaged by thumper> <https://launchpad.net/bugs/1216775>
<axw> thumper: yep, just expected a "Fixed #..."
<thumper> axw: I use the bzr --fixes lp:nnn
<thumper> rather than the lbox thingy I can't remember
<axw> okey dokey
<davecheney> 22:08 < thumper> davecheney, axw:  https://codereview.appspot.com/14279044/
<davecheney> that was the one I reviewed
<thumper> davecheney: yeah, I know
<thumper> davecheney: and thanks
<thumper> davecheney: I was giving you another :)
<thumper> help punished by requesting more help
<thumper> :)
<davecheney> cokc
<davecheney> fuck, the lag is bad here today
<davecheney> +1
<axw> thumper: https://codereview.appspot.com/14315044/ if you have the time
 * thumper looks
<thumper> axw: I don't get this: ( echo $* | grep -q touch ) && head -n 1 > /dev/null
<thumper> can you explain?
<axw> thumper: one sec
<axw> thumper: actually that's broken
<axw> thumper: my intention was to only expect input for the second bash, but that's just wrong
<axw> thumper: PTAL
 * thumper had to decode petal
<axw> sorry, Please Take Another Look :)
<thumper> I got it, it just took me a while
<thumper> my cat is attacking me, she wants food
<axw> thumper: nps, it can wait
<axw> the code.. not the cat
<axw> :)
<thumper> have to teach the cat some time
<thumper> fark
<thumper> exec 0<&-
<thumper> now that is cryptic
<thumper> even with the comment, I don't get it
<axw> hence the comment ;)
<axw> heh
<axw> can't say I grok the syntax either
<thumper> axw: do you have any other null provider things to get done?
<axw> thumper: nothing for 1.16
<davecheney> help
<thumper> davecheney: whazzup?
<davecheney> does anyone remember the issue for the charm dir being owned by root 0700 ?
<thumper> no
<thumper> well, I don't
<davecheney> there must be one
<davecheney> a few people lost their shit over it
<thumper> axw: you're branch failed
 * thumper sighs
<thumper> your
<axw> huh
<thumper> obviously you aren't a branch
<axw> I tested that
<davecheney> thumper: https://bugs.launchpad.net/juju-core/+bug/1226088
<_mup_> Bug #1226088: config-get fails with "open FORCE-VERSION: permission denied" <juju-core:Invalid> <https://launchpad.net/bugs/1226088>
<axw> heh
<davecheney> is a protuberance of this issue
<davecheney> but not the core issue
<davecheney> got it
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1205286
<_mup_> Bug #1205286: charm directory permissions now more restrictive <canonical-webops> <juju-core:Won't Fix> <postgresql (Juju Charms Collection):Fix Released> <postgresql-psql (Juju Charms Collection):Fix Released> <https://launchpad.net/bugs/1205286>
 * thumper back later
<rogpeppe> mornin' all
<fwereade_> heya rogpeppe
<rogpeppe> fwereade_: hiya
<axw_> jam: ping
<jam> hi rogpeppe fwereade_ and axw_
<jam> axw_: what's up?
<rogpeppe> jam: hiya
<axw_> jam: what do we want the cloud-tools pocket for?
<fwereade_> jam, heyhey
<jam> axw_: it holds backports of tools related stuff. AIUI juju-core itself will be in there, but so will LXC
<jam> axw_: I *think* we'll migrate to using Cloud Tools instead of ppa:juju/stable
<rogpeppe> axw_: small change to environs/sshstorage - i started commenting on your CL but then realised it was merged; https://codereview.appspot.com/14327043
<axw_> jam: ah ok. should I bother trying to get this in today? or have we already cut off?
<axw_> rogpeppe: looking
<axw_> oh, no comment left
<jam> the discussion says cut off, but we can still land it
<axw_> rogpeppe: is there a problem, or did I already fix it? :)
<axw_> rogpeppe: I made another fix in a further bzr push, didn't repropose
<rogpeppe> axw_: no real problem - just that if there were several lines of output, they wouldn't be joined with newlines
<rogpeppe> axw_: ah, if the change wasn't reproposed, i have no idea
<axw_> rogpeppe: agh, yeah, because I changed to use the scanner
<axw_> I'll fix in another, thanks
<rogpeppe> axw_: that CL fixes it
<axw_> oh sorry, that's yours
 * axw_ looks again
<rogpeppe> axw_: yeah
<axw_> rogpeppe: heh, I did actually split it out into a separate function, but reversed it to keep the EOF bits close together
<axw_> but... meh, no big deal
<rogpeppe> axw_: yeah, i see that; i think the separate function is still worth it though. i think the @EOF is distinctive enough really.
<rogpeppe> axw_: i suppose we could pass the @EOF to copyAsBase64 as a "terminator" argument
<axw_> rogpeppe: I think it's fine, that'd probably be overkill
<axw_> rogpeppe: lgtm, thanks
<rogpeppe> axw_: ta
 * fwereade_ breakfast
<axw_> rogpeppe: anything we can do about this in the short term? https://bugs.launchpad.net/bugs/1234534
<_mup_> Bug #1234534: local provider spams machine 0 log with "localInstance.Addresses not implemented" <juju-core:New> <https://launchpad.net/bugs/1234534>
<axw_> bbiab - making dinner
<jam> fwereade_: enjoy your food, but poke for when you get back
<rogpeppe> axw_: hmm interesting. i'll investigate.
<rogpeppe> axw_: there's a trivial fix
<rogpeppe> axw_: just remove that log statement :-)
<rogpeppe> axw_: but that leaves a slight problem
<rogpeppe> axw_: we shouldn't really be polling for address changes when the address can never change
<rogpeppe> axw_: i wonder if Addresses should return ErrUnimplemented; then the polling loop could just not bother anymore
<rogpeppe> fwereade_, dimitern, axw_: trivial CL to add a little bit of logging: https://codereview.appspot.com/14328043
<dimitern> rogpeppe, looking
<dimitern> rogpeppe, reviewed
<rogpeppe> dimitern: ta
<axw_> rogpeppe: would it be bad to just return an empty list of addresses from the local provider?
<rogpeppe> axw_: no, but we don't really want to be continuously polling those addresses
<axw_> rogpeppe: ah, it keeps checking if it has an empty list? ok
<rogpeppe> axw_: yes - it'll wait for the machine to get an address (currently it polls quite frequently at that stage, once a second)
<rogpeppe> dimitern: you're right about the %#v thing - it was a cop-out
<rogpeppe> dimitern: i'm wondering about a nice printed format for addresses
<rogpeppe> dimitern, mgz: how about something like this: public:12.3.5.6(networkname)
<rogpeppe> dimitern, mgz: where (networkname) is omitted if NetworkName is unset
<rogpeppe> dimitern, mgz: and i'd omitted the address type because it's almost always easily divinable from the contents of the address.
<dimitern> rogpeppe, that lgtm
<rogpeppe> mgz: i suspect that Address.AddressType could actually be omitted entirely and implemented as a method.
<jam> dimitern or rogpeppe: https://bugs.launchpad.net/juju-core/+bug/1233936
<_mup_> Bug #1233936: worker/uniter: uniter restarts when relation removed <juju-core:Triaged> <https://launchpad.net/bugs/1233936>
<jam> It looks like if you delete a relation
<jam> then when the agents see the Changed event
<jam> they get an EPERM trying to find out what changed
<jam> (rather than, say, an ENOTFOUND)
<jamespage> can someone spare me some time to debug a juju-core 1.14.1/openstack havana problem I'm seeing?
<rogpeppe> wonderful errors without context
<rogpeppe> jamespage: what's the issue?
<jamespage> rogpeppe, I can bootstrap an environment OK; but when I deploy services, extra instances are not being provisioned
<jam> jamespage: does "juju status" tell you anything informative?
<jamespage> 'pending'
<jam> often it should give errors for the units you've asked for
<jamespage> no errors
<dimitern> jam, seems your analysis of that bug is correct
<jam> dimitern: I'm pretty sure I know why it is failing, I'm not 100% sure what the correct behavior is
<jamespage> jam, rogpeppe: I see this in machine-0.log
<jamespage> http://paste.ubuntu.com/6187432/
<dimitern> jam, the problem is, before we get the relation we can't really decide whether the user is allowed to see it or not
<dimitern> jam, hence the ErrPerm there
<jam> dimitern: don't we encode the endpoints into the key itself?
<dimitern> jam, but I guess the correct behavior is, rather than trying to fix the API (which is correct in this case - at least consistent), we should fix the uniter not to die on ErrPerm there
<jam> so we could check if the agent is one piece of the relation that would-have-existed
<rogpeppe> jamespage: could you paste the whole log please?
<rogpeppe> jamespage: from the look of that last line it looks like it is actually trying to provision an instance
<jam> jamespage: so given this is "com.canonical.serverstack.serverstack:ubuntu:..." I'm guessing this is a custom data source
<dimitern> jam, "would-have-existed" for a remote entity (i.e. not our authenticated one), is meaningless, if we get NotFound from state
<jam> jamespage: it *looks* like it properly finds an image id to launch
<jam> candidate matches for products ["com.ubuntu.cloud:server:12.04:amd64"] are [0xc2004ac500]
<jamespage> jam: yes - thats the simplestreams sync of data into our testing cloud
<jam> (pointers don't help but the fact it isn't an empty list is a good sign)
<dimitern> jam, I mean we can't distinguish between "this is something you should be able to see" and not
<jam> dimitern: if "unit-0" has been validated, and asks about "relation-unit-0:unit-2" it doesn't seem like a problem to tell it ENOTFOUND
<jamespage> status output - http://paste.ubuntu.com/6187449/
<jam> dimitern: at least, I thought we changed relation-tags to hold the relation-keys which are crafted based on the unit endpoints involved
<dimitern> jam, relation tags are exactly as the relation keys in state - no more, no less
<dimitern> jam, informationwise
<jamespage> jam, rogpeppe: http://paste.ubuntu.com/6187453/
<jamespage> machine-0.log file
<jam> jamespage: the fact that we produce "openstack user data" sounds like we are trying to start an instance
<jam> So it feels like we are missing the next lines
<dimitern> jam, and just because unit 0 was validated doesn't mean "have access to any arbitrary service a relation tag might specify" imo
<jam> dimitern: if it has *unit-0* in the tag
<jam> then it has access to relations involving *unit-0*
<rogpeppe> jam: yes, that message is actually produced inside the openstack StartInstance method
<jamespage> jam: agreed - I see calls to the nova-api for flavors and stuff
<dimitern> jam, well, I guess, although not entirely convinced it should
<jamespage> but nothing related to actually starting an instance!
<rogpeppe> jamespage: i suppose it's possible that the startinstance request is blocking forever
<jam> dimitern: why would a unit *ever* not have access to a relation involving that unit ?
<dimitern> jam, if we check the relation tag before trying to get the relation from state and make sure one of the services is the same as our unit's service, then report NotFound instead of ErrPerm
<rogpeppe> jamespage: could you try something for me? kill -QUIT the jujud process
<jam> jamespage: so, looking at the code
<jam> the next thing it does
<jam> is try to set up a Security Group
<jam> it is possible you're out of security groups
<jam> but our error logging is terrible ?
<fwereade_> jam, heyhey
<jam> hey fwereade_
<rogpeppe> jamespage: that *should* produce a stack trace to the log file, showing where everything is
<dimitern> jam, how critical is this bug?
<rogpeppe> jam: if the StartInstance call fails, the provisioner *does* actually log the error immediately
<jamespage> rogpeppe, http://paste.ubuntu.com/6187464/
<jam> dimitern: it really depends, *I* would think that if a relation was deleted then we should fire a charm hook
<jam> so the charm can clean itself up
<jam> rogpeppe: but I'm wondering if we are hitting a failure in ensureSecurityGroup
<jam> which I don't see any log messages about
<rogpeppe> jamespage, jam: yes, looks like the StartInstance request is blocked forever
<dimitern> jam, that's a tall order - not many charms actually use that
<jamespage> rogpeppe, jam: log from nova api server - http://paste.ubuntu.com/6187468/
<jam> dimitern: if the appropriate behavior is "deleted relation, nothing happens" then it doesn't matter if the nothing-happens is done by restarting the agent or by just continuing the loop
<rogpeppe> jam: see goroutine 78 in that last paste
<dimitern> jam, although restarting the agent kinda sucks
<dimitern> jam, I'll look into it later today then
<jam> jamespage: it does, indeed, appear to be stalled  in an HTTP request
<fwereade_> jam, the uniter did itself clean up that relation completely, didn't it?
<jam> fwereade_: no mention of it in https://bugs.launchpad.net/juju-core/+bug/1233936
<_mup_> Bug #1233936: worker/uniter: uniter restarts when relation removed <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1233936>
<fwereade_> jam, looks like it's doing it to me
<jam> fwereade_: however that loop is trying to aggregate f.relationsChanged(ids)
<jam> fwereade_: if it can't get the id and returns immediately in the "range keys" loop
<jam> then it won't call f.relationsChanged
<jam> because the Uniter dies before it gets to call that
<jam> fwereade_: now, I could be completely wrong about where the error is originating, because we don't have any clue but where the error was finally logged
<fwereade_> jam, it doesn't need to call relationsChanged
<jam> fwereade_: but if you hit the line 350 because err != nil and err != ENOTFOUND then you won't call f.relationsChanged
<jam> fwereade_: so, am I wrong in believing that if you delete a relation it should trigger a charm hook ?
<fwereade_> jam, the hooks have all already run
<rogpeppe> jam: is that the entire nova server log? i'd expect to see something around 09:34 (assuming the clocks are vaguely in sync)
<fwereade_> jam, relation removal was triggered by the completion of the last hook essentially
<jam> rogpeppe: I think you mean jamespage ^^
<rogpeppe> jam: i do - very inconvenient juxtaposition of irc nicks :-)
<jam> fwereade_: so dave's original bug was that he deleted the relation from the client
<jamespage> rogpeppe, those were the calls I saw when I issued the QUIT
<jam> jamespage: do you know if nova logs when it starts a request or when it finishes them ?
<rogpeppe> jamespage: ah, i'd like to see what happened around the time the request was issued
<jam> jamespage: given the "time" field
<fwereade_> jam, if the filter sees a relation that doesn't exist, the uniter *by definition* has no knowledge of it
<jam> sounds like when it finishes them
<jam> jamespage: if something was hung, then you wouldn't have a nova log either
<jamespage> rogpeppe, full log - http://paste.ubuntu.com/6187481/
<fwereade_> jam, if the uniter knew about it, it'd be in scope
<fwereade_> jam, if it's in scope, the relation can't be removed
<jam> fwereade_: so what does "delete the relation from a client mean" ? just trigger the teardown of the process ?
<fwereade_> jam, destroy-relation will remove it straight off if no units are in scope; otherwise it sets dying and waits for the last departing unit to remove the relation as it does so
<fwereade_> jam, it clearly ran the hooks that need to be run before leaving scope
<jamespage> jam: not sure about the logging
<fwereade_> jam, and then the relation was somehow removed
<jamespage> but I do see several established connections on the API server from the bootstrap node
<fwereade_> jam, the overwhelming balance of prob is that the uniter really did leave scope
<jam> jamespage: .100 is the bootstrap node, right?
<jamespage> yes
<jam> fwereade_: so I think it boils down to: we used to get ENOTFOUND and now we get EPERM, is it best to return a nice ENOTFOUND if it looks like the uniter would have had access to that relation if it did exist.
<jam> fwereade_: I'm still skeptical that rebooting is "just ok"
<jam> as there is a step that would have happened if it had got ENOTFOUND
<jam> so, (1) I definitely think this should be fixed, but (2) I'd accept that it isn't Critical
<jam> fwereade_: but I actually wanted to chat about Uniter.CharmURL when we finish the other priority interrupts :)
<fwereade_> jam, rebooting is just fine
<fwereade_> jam, all the local uniter state for that relation is cleaned up before that can happen
<fwereade_> I think the right thing is for the client to handle ErrPerm explicitly, given that we basically always return perm rather than notfound
<jam> fwereade_: that seems *very* cavalier to me
<jam> perhaps a "rebooting is fine in this case because..."
<fwereade_> jam, rebooting should not happen, but it doesn't cause any harm
<jam> fwereade_: hence my "this should be fixed, but isn't Critical"
<axw_> jam, rogpeppe, dimitern: joining?
<axw_> mgz:
<jam> jamespage: so from what I can see we are making an attempt, it is possible that the Openstack server is telling us "too many requests, try again after X seconds" and we will retry up to 3 times for that
<jam> I would expect that to show up in the nova log, though.
<jamespage> jam, rogpeppe: that's odd - the two missing instances just started up
<rogpeppe> jamespage: ha ha
<jamespage> rogpeppe, yeah - but the lag was massive
<rogpeppe> jamespage: that's probably because killing the agent terminated the requests, then it retried when it restarted
<rogpeppe> jamespage: so i suspect that for some reason those requests were blocked forever - i don't know if it's our problem or nova's
<rogpeppe> jamespage: perhaps we should time-limit our requests
<jamespage> rogpeppe, maybe - I turned off ratelimiting
<jamespage> that might be why it started working but I'm uncertain
<jamespage> let me test that again
<jam> rogpeppe: I agree, we've had a bug reported that we would end up with huge amounts of "dead" connections because they never timed out
<arosales> fwereade_, thanks for the fix on bug https://bugs.launchpad.net/juju-core/+bug/1217781
<_mup_> Bug #1217781: machine destruction depends on machine agents <cts> <cts-cloud-review> <juju-core:Fix Committed> <https://launchpad.net/bugs/1217781>
<jamespage> jam, rogpeppe: hmm - so I turned rate-limiting back on and it's still fine
<jamespage> odd
<jam> jamespage: well we haven't done enough requests yet to cause a problem :)
<jam> it may not be rate limiting
<jam> but something in nova that starts a request and never finishes it
<jamespage> add-unit -n 16 worked just fine as well
<jamespage> jam: lots of variables - might be sucky networking on 12.04 precise LXC (the cloud-controller is running in juju managed LXC)
<arosales> natefinch, was the maas bug you were going to look into bug https://bugs.launchpad.net/gomaasapi/+bug/1222671 ?
<_mup_> Bug #1222671: maas provider must only attempt to stop machines in the allocated state <cts-cloud-review> <Go MAAS API Library:Triaged> <https://launchpad.net/bugs/1222671>
<natefinch> arosales: no, there's a bug about juju going out and finding nodes that are part of a different juju environment and shutting them down
<arosales> natefinch, ok
<arosales> natefinch, added a comment to your merge request
<arosales> https://code.launchpad.net/~natefinch/juju-core/018-azure-help/+merge/188936/comments/432802
<arosales> the hp cloud trailing "/" is important as that actually breaks config
<natefinch> arosales: you have to do "publish and mail comments" for me to see the comments.
<natefinch> arosales: I saw that bug you filed... is the slash supposed to be there or not?
<arosales> natefinch, sorry I am not following usual core process here :-/
<arosales> but you should see the comment in lp @ https://code.launchpad.net/~natefinch/juju-core/018-azure-help/+merge/188936
<arosales> natefinch, the  trailing slash is supposed to be there for hp.
<mramm> jam: you said you had two new bugs
<natefinch> arosales: ahh I see.. thanks.
<arosales> jam, fwereade_ can we quickly discuss https://bugs.launchpad.net/gomaasapi/+bug/1222671 ?
<_mup_> Bug #1222671: maas provider must only attempt to stop machines in the allocated state <cts-cloud-review> <Go MAAS API Library:Triaged> <https://launchpad.net/bugs/1222671>
<mramm> jam: can you target them to 1.15.1?
<jam> mramm: will do: bug #1234577
<_mup_> Bug #1234577: Uniter needs to support ssl-hostname-verification: false <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1234577>
<jam> and bug #1234576
<_mup_> Bug #1234576: Upgrader needs to support ssl-hostname-verification: false <juju-core:Triaged by jameinel> <https://launchpad.net/bugs/1234576>
<natefinch> arosales: changes made from your comments. Thanks for looking.
<arosales> natefinch, thanks
<arosales> thanks for getting that in too :-)
<jam> dimitern: https://bugs.launchpad.net/juju-core/+bug/1233451 you can dupe it to something else if you already had a bug
<_mup_> Bug #1233451: juju upgrade-juju results in unsupported behavior <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1233451>
<arosales> keeping the help menu up to date that is
<natefinch> arosales: no problem. It bugs me when there aren't good docs on tools.
<arosales> natefinch, yup and a big UX issue
<natefinch> arosales: I also bugged evilnick and the web guys to make the docs more prominent on juju.ubuntu.com since they're really hard to find even if you know they're there somewhere
<mgz> I wonder if we could fix bug 1222671 at the juju-core level, it seems a little similar to filtering the result of instances to machines in the building/running state
<mgz> not sure if the list maas gives us actually has enough to trim out non-allocated machines though
<mgz> bug 1222671
<_mup_> Bug #1222671: maas provider must only attempt to stop machines in the allocated state <cts-cloud-review> <Go MAAS API Library:Triaged> <juju-core:Triaged> <https://launchpad.net/bugs/1222671>
<arosales> natefinch, I will also follow up with the web team on that feedback.
<natefinch> arosales: I literally had to do a text search of the front page to convince myself there even was a link to the docs there
<arosales> ouch
<arosales> natefinch, a user has to go to Resources --> then docs
<jam> mgz: it does say "maas provider" which sounds juju-y,
<natefinch> davecheney: I called azure Windows Azure because that's what Microsoft calls it (even though it has nothing to do with windows): http://www.windowsazure.com/en-us/
<arosales> natefinch, that is the correct name branding
<rogpeppe> fwereade_, axw_, mgz, dimitern: here's the fix for that address polling problem: https://codereview.appspot.com/14337043/
<dimitern> rogpeppe, looking
<natefinch> arosales: Cool.  I just copied what they put on their website, figure they can't get mad at us for that :)
<rogpeppe> dimitern: thanks
<jam> rogpeppe: it doesn't seem like IsUnimplemented should be a Warning
<rogpeppe> jam: ?
<rogpeppe> jam: oh, i see
<jam> rogpeppe: https://codereview.appspot.com/14337043/patch/1/1007
<rogpeppe> jam: i think it's reasonable to see that single message
<jam> rogpeppe: on every boot, on etc, I really don't think we want a Warning
<rogpeppe> jam: won't you only see it in the machine agent log file?
<jam> rogpeppe: it will end up in debug-log
<dimitern> rogpeppe, reviewed
<rogpeppe> dimitern: ta
<jam> and every time the agent restarts
<jam> etc
<dimitern> jam, a little clarification about bug 1234576 ?
<_mup_> Bug #1234576: Upgrader needs to support ssl-hostname-verification: false <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1234576>
<rogpeppe> jam: ok, i'll make it not log anything in that case
<jam> rogpeppe: I would be ~ ok if it was INFO/DEBUG but Warning says that you might need to fix something, and this is explicitly unfixable
<jam> dimitern: Upgrader grabs a Tools URL and then does net/http/Get of that file
<rogpeppe> jam: fair enough
<jam> we need a way to have Upgrader realize that EnvironConfig.SSLHostnameVerification() == false
<rogpeppe> dimitern: would you be happier if it was 100 years?
<jam> dimitern: and then use utils.NonValidatingClient
<dimitern> rogpeppe, why poll at all in this case?
<rogpeppe> dimitern: because it doesn't complicate the code any more
<dimitern> jam, so basically the change is in fetchTools in the upgrader worker
<rogpeppe> dimitern: and we don't care if it does
<jam> dimitern: and some sort of API to get the environ setting into the worker
<dimitern> rogpeppe, it doesn't seem right
<jam> dimitern: I was going to change the Tools api to include that bit
<jam> but fwereade_ asked to make it a separate api call
<rogpeppe> dimitern: because...?
<jam> I don't really care
<dimitern> rogpeppe, it feels like we should stop polling and report an error there
<dimitern> jam, why is the API involved at all here?
<rogpeppe> dimitern: the machine poller has to stay around, unless we refactor most of the logic in that worker
<jam> dimitern: the Uniter worker doesn't have ENv creds
<jam> and doesn't have ENvironConfig
<jam> so it doesn't *know* if ssl-hostname-verification was set or not
<rogpeppe> dimitern: so it might as well just have a very long poll interval, i think
<dimitern> jam, ah, it's an environ config setting, ok, I get it now
<jam> dimitern: and essentially the same thing for Uniter downloading a charm
<rogpeppe> dimitern: which fixes this problem without making the code more complex for a case which is going to go away anyway
<jam> they don't have access to that config setting
<jam> and need it in one fashion or another
<dimitern> rogpeppe, doesn't it feel bad fixing it like that? :)
<rogpeppe> dimitern: no
<rogpeppe> dimitern: :-)
<rogpeppe> dimitern: it feels like a minimally invasive and perfectly sufficient change
<dimitern> jam, so some call shared by the uniter and upgrader, called SSLHostVerification bool ?
<rogpeppe> dimitern: and when the local provider gets an Addresses implementation, we just need to delete 4 lines of code.
<fwereade_> jam, dimitern: I think I'm sold on putting the ssl setting in the api calls that return urls
<fwereade_> jam, dimitern: we can just use the env setting now, and we're free to improve it as jam suggested at our leisure
<jam> dimitern: right, as mentioned I was going to put it into the existing API that they are already calling, but fwereade_ thought that was more risky
<rogpeppe> fwereade_: that seems good to me
<jam> fwereade_: makes me happy, though dimitern the Uniter charm one is using a StringsBoolResult
<jam> which is shared with GetP? call
<fwereade_> jam, dimitern: I thought I just backpedalled on that but I can't find where I typed it
<jam> which is what I wanted to talk with fwereade_ about earlier
<fwereade_> jam, dimitern: so: I'm fine putting a DisableSSLHostnameVerification bool into the existing results, I think
<dimitern> fwereade_, jam, so adding a field to the result of the Tools() upgrader API call (which will make it available to the provisioner as well), and a field to the CharmURL() uniter call
<jam> dimitern: that was the idea, yes
<jam> fwereade_: to be clear, the EnvironConfig setting is "SSLHostnameVerification"
<fwereade_> jam, dimitern: it can always match the env config setting for now, and it's all behind the api so it's possible to be more sophisticated in future
<dimitern> jam, camel case??
<jam> dimitern: I mean, inverted true/false
<jam> the config setting
<jam> you set it to "false" to disable
<jam> false => stop checking certs
<rogpeppe> dimitern: another possibility would be to have EnvironProvider implement a SupportsInstanceAddresses method, then avoid starting the worker at all if that returns false. that's more efficient but much more invasive.
<dimitern> ok
<fwereade_> jam, dimitern: and definitely not CharmURL
<dimitern> rogpeppe, but certainly feels more like the right thing to do
<dimitern> fwereade_, what?
<rogpeppe> dimitern: we'll only want to rip it out again later.
<fwereade_> jam, dimitern: CharmURL is completely irrelevant AFAICT
<fwereade_> jam, dimitern: why would we ever change that?
<rogpeppe> dimitern: there's no point in making all the code base more complex for this little temporary hack.
<dimitern> fwereade_, I don't know, just asking to figure out what needs changing
<dimitern> rogpeppe, why temporary? when is it going away?
<fwereade_> jam, dimitern: we're thinking of CharmArchiveURL
<jam> rogpeppe: dimitern: for polling, I could see some value in a say 1/hour message indicating that if the IP address were to change, we wouldn't notice
<fwereade_> jam, dimitern: CharmURL itself is completely different
<jam> fwereade_: so *shrug* I hadn't finished writing it. But whatever we actually download needs to change.
<fwereade_> jam, dimitern: obviously :-/
<rogpeppe> dimitern: when the local provider implements Instance.Addresses, which i hope will happen quite soon
<fwereade_> rogpeppe, you have any idea how to do that?
<dimitern> rogpeppe, ok, but please bug it and add a TODO about that
<dimitern> lest we forget later
<jam> rogpeppe: an infrequent message so when something does happen and the user goes WTF is kind of nice, but I'm happy with what you've done, (though it shouldn't be a Warning)
<fwereade_> jam, dimitern: so anyway -- the only other thing to be careful about is how a false value for that field will be interpreted, because that's what we'll always get returned from 1.14
<jam> rogpeppe: and I'm pretty strong on a JFDI to be done :)
<jam> fwereade_: fair point
<jam> and one that I would have gotten to, yes
<rogpeppe> fwereade_: well, the local provider implements DNSName and i am presuming that Addresses is going to supplant DNSName completely
<jam> fwereade_: I think I was leaning towards Disabled
<jam> but when I saw it just now I thought it should line up
<dimitern> fwereade_, jam, ok, it seems not CharmURL, but ArchiveURL is the one to patch in the uniter
<fwereade_> jam, dimitern: so we might need to invert meaning for sanity's sake, but today I fail boolean logic
<dimitern> fwereade_, right, "false" will mean do verification
<dimitern> fwereade_, so the field will be called SkipSSLHostnameVerification
<dimitern> or NoSSLHostnameVerification
<fwereade_> dimitern, let's call it Disable to match the setting it's mirroring maybe?
<rogpeppe> jam: it would be nice if you didn't get a log message every time it polls actually, although i'm not sure how to do that while preserving useful information.
<dimitern> fwereade_, ok DisableSSLHostnameVerification in ArchiveURL() and Tools()
<jam> fwereade_: so it *really* looks like CharmURL from everything I've traced through
<fwereade_> dimitern, actually I may be on crack, I have no idea
<fwereade_> jam, CharmURL is "cs:" or "local:"
<jam> fwereade_: ok, it turns out the object returned which has URL inside it is "live" and has a connection to the API
<dimitern> jam, fwereade_, uniter.charm.download() uses ArchiveURL()
<jam> it wasn't very clear that it wasn't just a blob of data
<rogpeppe> jam: we could log only if the message is different, but i suspect that some providers will include a bunch of other stuff in the error message including request ids etc which would make that not work
<jam> dimitern: well, getArchiveInfo("CharmArchiveURL")
<jam> dimitern: ah, but the API for it is ArchiveURL
<jam> gotcha
<dimitern> yep
<jam> it is hard to figure out what Charm object I'm looking at
<jam> across 3 levels or more
<fwereade_> rogpeppe, AFAICT the local provider Instance is completely fucked
<jam> as they are all *just called Charm*
<fwereade_> rogpeppe, and will never work
<fwereade_> rogpeppe, except for machine 0 :/
<rogpeppe> fwereade_: i didn't look at it too much
<rogpeppe> fwereade_: fucked how, exactly?
<jam> fwereade_: so in default Precise, you can't find the IP address for the lxc's you started, only the instances themselves can report it back. *however* the 12.04.03 update gives us better LXC tools and lxc-ls *does* give us the info
<jam> fwereade_: so I don't think we're just-fucked
<fwereade_> rogpeppe, getAddressForInterface("eth0") to get an address for some other container?
<jam> fwereade_: ^^ we change to use the lxc-* tools once we can assume we have the updated tools
<jam> which is at least *some* of what we want Cloud-tools archive for
<fwereade_> jam, indeed, that's cool
<fwereade_> jam, sooooo
<fwereade_> jam, rogpeppe: ...we'll need an address updater per machine agent, then?
<rogpeppe> fwereade_: don't they all share the same address space currently?
<fwereade_> rogpeppe, sure
<fwereade_> rogpeppe, but getting one's own eth0 is unlikely to help in determining the address of something completely distinct
<rogpeppe> fwereade_: so it's kinda fit for purpose *currently*...
<jam> rogpeppe: right, william is just remarking that the current implementation will never work to get addresses for another machine, but there are plans to change how we do it
<fwereade_> rogpeppe, by sheer ridiculous luck, yes, it works in the single situation it's used because it happens to run on the correct machine
<rogpeppe> fwereade_: i won't argue with that :-)
<yolanda> hi, i'm using juju-deployer to deploy a set of charms, but i find this error: error: cannot get latest charm revision: charm not found in "/home/yolanda/development/canonical-ci": local:precise/postgresql - shouldn't it be local:postgresql, not local:precise/postgresql?
<jam> yolanda: official charm locations have the series in them
<jam> local repos have a directory with the series
<jam> so $REPO/precise/postgresql would be the structure that you would do "juju deploy --local --repo $REPO postgresql"
<jam> and juju 'fills in' the default series (aka precise)
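The series-filling behaviour jam describes might be sketched as follows; `inferLocalCharmURL` is a hypothetical helper for illustration only, not juju-core's actual charm URL parser, and it only handles the local: scheme:

```go
package main

import (
	"fmt"
	"strings"
)

// inferLocalCharmURL fills in the default series when the user omits
// it: "local:postgresql" typed by hand becomes the fully-qualified
// "local:precise/postgresql". Hypothetical sketch, local: scheme only.
func inferLocalCharmURL(input, defaultSeries string) string {
	rest := strings.TrimPrefix(input, "local:")
	if strings.Contains(rest, "/") {
		return input // already fully qualified with a series
	}
	return fmt.Sprintf("local:%s/%s", defaultSeries, rest)
}

func main() {
	fmt.Println(inferLocalCharmURL("local:postgresql", "precise"))
	fmt.Println(inferLocalCharmURL("local:precise/postgresql", "precise"))
}
```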
<yolanda> jam, i know it, but i'm asking about juju-deployer, it embeds the precise into it
<yolanda> if i deploy locally with local:charm works, but juju-deployer is deploying that as local:precise/charm
<jam> yolanda: you can also "juju deploy precise/postgresql"
<jam> I think
<jam> yolanda: I'm pretty sure that is supposed to work
<yolanda> i don't have control about juju deploy commands using juju-deployer wrapper, it's automated
<yolanda> so my question is about deploying using juju-deployer wrapper, not manually using juju deploy, which works for me
<yolanda> jam ^
<fwereade_> yolanda, a charm url is not valid without a series
<jam> yolanda: and I'm saying, that shouldn't be the problem, because both syntaxes are supposed to be valid
<fwereade_> yolanda, local:postgresql is  shorthand for user input only
<fwereade_> yolanda, there is no such actual charm as local:postgresql
<yolanda> jam, fwereade, but then juju complains if i use local:precise/postgresql, and works if i use local:postgresql
<yolanda> as juju-deployer uses first syntax, it gives me error
<yolanda> if first url is set to be working, what can be stopping to work in my environment?
<fwereade_> yolanda, so the charm is in $REPO/precise/postgresql, right?
<yolanda> yes
<jam> yolanda: it isn't something like you changed default series and actually have $REPO/saucy/postgresql locally, right?
<yolanda> jam, no, series is set as precise
<yolanda> and i have $REPO/precise/postgresql charm there
<fwereade_> yolanda, and "juju deploy local:precise/postgresql" does not work, while "juju deploy local:postgresql" does?
<yolanda> fwereade_, sorry, tried manually now with local:postgresql and doesn't work also
<yolanda> but i have the charm in my local repo
<fwereade_> yolanda, what's the charm name in the metadata?
<yolanda> postgresql
<fwereade_> yolanda, if you run with --debug, do you see any "failed to load charm at" warnings?
<yolanda> let me try it
<yolanda> mm... 2013-10-03 12:01:48 WARNING juju repo.go:341 charm: failed to load charm at "/home/yolanda/development/canonical-ci/precise/postgresql": YAML error: line 6: found a tab character where an intendation space is expected
<yolanda> that should be a bug in postgres charm?
<jam> yolanda: looks like
<fwereade_> yolanda, sounds like
<fwereade_> :)
<yolanda> i can fix it and do an mp
<yolanda> fwereade. should i raise a bug?
<fwereade_> yolanda, sorry, against what? it looks like it's a charm problem, but equally juju could probably somehow do better on that front
<fwereade_> yolanda, local repos are pretty baroque
<rogpeppe> fwereade_: BTW rather than my don't-poll fix, why don't i just implement Addresses in the local provider to simply call the existing DNSName method?
<rogpeppe> fwereade_, mgz, dimitern: can you think of any down sides to the above?
<yolanda> fwereade, well, i was thinking in an MP for postgresql charm, but a bug against juju, to deal with malformed files... what do you think?
<mgz> rogpeppe: I nearly asked in the standup why not just implement it for local
<dimitern> rogpeppe, if it works by live testing, why not
<mgz> the hard case is containers in another provider, I didn't remember any local catches
<rogpeppe> mgz: i don't know why i didn't think of that, tbh
<jam> yolanda: if "juju deploy cs:postgresql" works, I would guess there is something other than a bug in the charm itself
<rogpeppe> right, i'll ditch that CL
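rogpeppe's proposal, implementing Addresses as a thin wrapper over the existing DNSName method, might look roughly like this; the interface and names below are assumptions modelled on the discussion, not juju-core's actual instance API:

```go
package main

import "fmt"

// dnsNamer mirrors the existing DNSName method mentioned above.
type dnsNamer interface {
	DNSName() (string, error)
}

// addresses wraps DNSName: one hostname becomes a one-element
// address list; any DNSName error is passed straight through.
func addresses(inst dnsNamer) ([]string, error) {
	name, err := inst.DNSName()
	if err != nil {
		return nil, err
	}
	return []string{name}, nil
}

// fakeInst is a stand-in instance for demonstration.
type fakeInst struct{ name string }

func (f fakeInst) DNSName() (string, error) { return f.name, nil }

func main() {
	addrs, _ := addresses(fakeInst{"10.0.3.1"})
	fmt.Println(addrs)
}
```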
<yolanda> jam, i branched lp:charms/postgresql, and config.yaml file has tabs instead of spaces, that's right
<yolanda> maybe it's not the same version as in charmstore?
<jam> yolanda: I see it here, I think: http://bazaar.launchpad.net/~charmers/charms/precise/postgresql/trunk/view/head:/config.yaml
<jam> It looks like someone's editor changed spaces to tabs
<jam> "helpfully"
<yolanda> do you want me to create the mp?
<jam> yolanda: and that change is in the very last commit to lp:charms/postgresql
<yolanda> or are you dealing with that?
<fwereade_> jam, also looks like whoever committed it didn't try deploying it before doing so
<jam> fwereade_: yep
<fwereade_> jam, yolanda: but I imagine the charm store just ignored it because it's invalid
<jam> fwereade_: stub merged richard's patch, but looks like he accidentally broke it
<mgz> yolanda: go ahead and fix and propose I'd say
<yolanda> ok
<fwereade_> jam, ahhh :)
<fwereade_> yolanda, so an MP against the postgres charm would be great
<yolanda> ok, doing it
<jam> fwereade_: it looks like the last change from Richard was to clean-up the description for one of the fields
<fwereade_> yolanda, re juju-core, the idea was that local repos would ignore things that aren't valid charms
<jam> and, naturally, that breaks everything
<jam> but was "just a comment fix"
<jam> so it wasn't tested
<fwereade_> yolanda, we have had trouble in the past in which one broken charm in a repo prevents anything being deployed from that repo
<fwereade_> jam, heh
<yolanda> fwereade, ideally if these changes can't go into cs, it will be ok, the problem will be only if using some launchpad branch
<fwereade_> yolanda, so I can't see a clear way forward that fixes your surprise without breaking things much worse
<fwereade_> yolanda, yeah
<jam> fwereade_: reporting the warning by default would help :)
<jam> (there was something that looks like what you requested, but it isn't actually valid)
<yolanda> fwereade, jam: https://code.launchpad.net/~yolanda.robla/charms/precise/postgresql/fix_tabs/+merge/189057
<jam> yolanda: lgtm, but Stub is the maintainer of that charm
<fwereade_> jam, there wasn't really anything that actually looked like you wanted though
<fwereade_> jam, directory name is irrelevant
<jam> I hopefully poked him in another window
<fwereade_> jam, hmm, ok, maybe it did, I guess we read the metadata without difficulty
<yolanda> jam, cool, i'll update manually in the meantime
<fwereade_> jam, actually showing that output by default would have been helpful though
<jam> yolanda: stub says he's landing it now
<yolanda> but yes, instead of reporting as "charm does not exist", maybe juju deploy could show some error about invalid charm or something like that
<fwereade_> yolanda, well, juju was 100% accurate, there was no such charm in the repo
<fwereade_> yolanda, I think that the error is correct, and that just warning about broken charms is the Right Thing to do
<fwereade_> yolanda, it feels like the worst bit is that the warning got swallowed by default
<dimitern> jam, 2013-10-03 12:22:39 ERROR juju supercommand.go:282 disabling ssh-hostname-verification is not supported
<dimitern> jam, how can I test it if I cannot disable it?
<jam> dimitern: so it should work for openstack, which is the one we care about
<yolanda> fwereade, if you enable debugging you can see it, but if not you aren't aware of the error
<jam> dimitern: or you could just comment out that config validation failure
<dimitern> jam, oh, I need to dust out my canonistack permissions
<jam> if you really want to test on EC2
<jam> dimitern: or hp
<jam> dimitern: fwiw I tested the previous steps by mv /usr/share/ca-certificates and then running commands
<jam> otherwise the cert is still valid so it wouldn't fail whether you had that flag or not
<fwereade_> yolanda, yeah, hiding important messages STM like a juju-core bug, please go ahead
<rogpeppe> is there any way to get sudo to preserve your existing $PATH ?
<fwereade_> yolanda, thanks
<fwereade_> -E
<jam> rogpeppe: sudo -E ?
<rogpeppe> jam: doesn't preserve $PATH, it seems
<dimitern> jam, /usr/share/ca-certificates where?
<jam> rogpeppe: "man sudo" says "PATH may be overridden by the security policy"
<fwereade_> rogpeppe, works for me, anyway
<jam> rogpeppe: so you could probably remove the Paths line from config if you wanted to avoid that
<jam> but you can't guarantee it
<jam> for $ARBITRARY_USER
<dimitern> jam, on your machine or ?
<jam> dimitern: so for "juju bootstrap" I did it on my machine, and then ssh'd into the started machine and did it there to test the line in cloud-init
<rogpeppe> jam: i'm just trying to work out a decent way of using the local provider when you're not using /usr/bin/juju
<dimitern> jam, ok, so I'll do it on machine 0 once it starts and restart the agent
<jam> dimitern: but honestly, if you are using utils.NonValidating that is *known* to work properly with non-validating certs.
<jam> rogpeppe: sure, thumper said earlier "sudo $(which juju) bootstrap"
<dimitern> jam, it's actually utils.GetNonValidatingClient()
<jam> dimitern: sure
<rogpeppe> jam: that doesn't work
<dimitern> ok
<rogpeppe> jam: i'm doing this currently: x=$PATH sudo -E sh -c 'export PATH=$x; juju bootstrap --debug'
<jam> dimitern: if you're using the one that the other stuff uses, I've tested that pretty well. You *can* write a test case for it
<rogpeppe> jam: which isn't ideal
<jam> using httptest.Server
<jam> dimitern: I don't have a test for cloud-init specifically, because we don't have any 3rd-party clouds that we have creds to that don't use valid certs
<jam> dimitern: but I *do* have a bunch of openstack localHTTPSServer tests
<jam> for the actual Provider interaction
<dimitern> jam, well, it seems to work ok
<rogpeppe> dimitern, jam, mgz: alternative fix to the logging spam problem: https://codereview.appspot.com/14339043
<rogpeppe> fwereade_: ^
<dimitern> rogpeppe, no tests?
<rogpeppe> dimitern: there are no tests for DNSName, (this is a thin wrapper around that), and I don't want to block this on adding appropriate testing to that
<dimitern> fwereade_, jam, https://codereview.appspot.com/14340043 - fix for bug 1234576 (one of the best ids so far!)
<_mup_> Bug #1234576: Upgrader needs to support ssl-hostname-verification: false <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1234576>
<rogpeppe> dimitern: i raised a bug
<dimitern> rogpeppe, ok, please live test it at least
<rogpeppe> dimitern: i have
<rogpeppe> dimitern: at least, i've verified that we don't get the log spam - there's no externally visible way currently to see the output of Addresses.
<dimitern> rogpeppe, reviewed
<dimitern> rogpeppe, why?
<rogpeppe> dimitern: because nothing uses it yet
<dimitern> rogpeppe, can't you log what addresses you get and compare them?
<dimitern> rogpeppe, not as part of the code, just for testing
<rogpeppe> dimitern: we could write a test that does that, yes
<rogpeppe> dimitern: and the addressupdater code tests that
<rogpeppe> dimitern: but you can't see what Addresses are attached to a machine by looking at juju status, for example
<dimitern> rogpeppe, I meant simply adding a log.Errorf("Addresses returns: %v", addresses) and bootstrapping a local environment
<dimitern> rogpeppe, and use lxc-ls or something to get the container addresses?
<mgz> dimitern: we run into the precise problem with that
<mgz> lxc-ls is useless on precise
<rogpeppe> dimitern: we're talking about two statements here. i believe that they work, and it doesn't actually matter if they don't. we need more testing in this area, but i don't think it matters at this moment.
<dimitern> rogpeppe, are you running precise?
<rogpeppe> dimitern: no
<dimitern> rogpeppe, ok then
<mgz> ah, you didn't mean doing that in the code?
<mgz> read through the conversation a bit too fast :)
<dimitern> I simply meant as a local live test
<dimitern> but whatever
<dimitern> seeing is better than believing alone
<yolanda> fwereade, jam: https://bugs.launchpad.net/juju-core/+bug/1234687
<_mup_> Bug #1234687: juju is hiding bugs in charms <juju-core:New> <https://launchpad.net/bugs/1234687>
<natefinch> mgz, rogpeppe, fwereade_, jam:  anyone want to finish up the review Dave started?  Just docs, but I'd like them in asap: https://codereview.appspot.com/14207048/
<mgz> natefinch: if no one beats me to it, I'll look after doing various code cleanup things on my branch
<natefinch> mgz: thanks
<rogpeppe> natefinch: any particular reason you added the extra newline before the Doc text in addmachine.go ?
<rogpeppe> natefinch: ha, it looks like it's stripped anyway
<natefinch> rogpeppe: it makes the text in-code more clear not to have it indented due to the variable assignment, and they produce the same output anyway... I'm trying to keep all the doc formatting the same
<natefinch> rogpeppe: yep
<fwereade_> rogpeppe, I am confused by https://codereview.appspot.com/14339043/
<rogpeppe> fwereade_: go on
<fwereade_> rogpeppe, didn't we agree that local.Instance.DNSName is no good unless it's run on the relevant instance?
<jam> dimitern: so one test we *could* add is to set up the local dummy service with an HTTPS Server and assert that the Upgrader is able to find the tools, do you think that is worthwhile?
<rogpeppe> fwereade_: well, Addresses is in exactly the same boat
<fwereade_> rogpeppe, I would *much* rather have the notimplemented hack than poke bad data into state
<rogpeppe> fwereade_: and it's never called for real
<rogpeppe> fwereade_: the bad data is already in state
<fwereade_> rogpeppe, how did it get there?
<rogpeppe> fwereade_: by calling Instance.DNSName, no?
<dimitern> jam, seems extreme
<fwereade_> rogpeppe, when did that go into state?
<jam> dimitern: well it is the only thing that we actually care about
<jam> you don't ever test that the boolean we return is actually acted upon
<fwereade_> rogpeppe, the only addresses we have hitherto stored (that I am aware of) have come from code running on the instances in question, as part of uniter setup
<dimitern> jam, I have no idea how to do that
<rogpeppe> fwereade_: good point, but... how is Addresses any different from DNSName ?
<rogpeppe> fwereade_: the result of Addresses isn't going into the state either
<rogpeppe> fwereade_: .... is it?
<fwereade_> rogpeppe, huh? isn't that precisely what addressupdater does?
<rogpeppe> fwereade_: ha, i see
<jam> dimitern: so in the interests of getting 1.15.1 out the door, I think we should probably just land it, have you implement the next one, land it, and then come back to fill in the tests
<rogpeppe> fwereade_: erm, aren't things using the result of instance.DNSName currently to decide where to connect to?
<jam> dimitern: but http://bazaar.launchpad.net/~go-bot/juju-core/trunk/view/head:/provider/openstack/local_test.go#L705 is what I set up for the Openstack HTTPS tests
<fwereade_> rogpeppe, yeah, but by sheer luck the only things that do are running somewhere where Instance.DNSName happens to be correct
<rogpeppe> fwereade_: i don't really mind storing the DNSName results in state - it's not as if they're permanent
<fwereade_> rogpeppe, they are *wrong*, and they will fuck everything up
<rogpeppe> fwereade_: any time we upgrade to make a better implementation, the addresses stored in state will change appropriately
<fwereade_> rogpeppe, asking a unit for its addresses asks the machine first
<fwereade_> rogpeppe, if you put bad data into machines, units start reporting the wrong addresses
<dimitern> jam, sgtm, will have a look after landing these two
<rogpeppe> fwereade_: hmm, so if there's no machine address, the uniter uses an EnvironProvider method to find the address of itself?
<fwereade_> rogpeppe, yeah
<fwereade_> rogpeppe, well, the unit always does that
<fwereade_> rogpeppe, but machine addresses are the canonical location
<fwereade_> rogpeppe, we just can't yet drop the unit address lookup
<fwereade_> rogpeppe, precisely because we can't get good addresses for the machines in all cases
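The lookup order fwereade_ describes, machine addresses as the canonical source with the unit's self-reported address kept only as a fallback, can be sketched like this (types and names are assumptions for illustration):

```go
package main

import "fmt"

// unitAddress prefers the machine's addresses; the address the unit
// recorded for itself is only used when the machine has none, which
// is why bad machine addresses would poison unit address reporting.
func unitAddress(machineAddrs []string, unitRecorded string) string {
	if len(machineAddrs) > 0 {
		return machineAddrs[0]
	}
	return unitRecorded
}

func main() {
	fmt.Println(unitAddress(nil, "10.0.3.7"))
	fmt.Println(unitAddress([]string{"ec2-1-2-3-4.example.com"}, "10.0.3.7"))
}
```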
<rogpeppe> fwereade_: does that mean the address updater cannot work in the local provider?
<fwereade_> rogpeppe, I thought you already knew that, and that was the reasoning behind the notimplemented thing :)
<fwereade_> rogpeppe, until we can use a new lxc-ls everywhere, yes
<rogpeppe> fwereade_: ah, i see - i hadn't realised that was the blocker
<mgz> gah, right we do need a working lxc-ls for the local provider
<rogpeppe> fwereade_: in which case,  there's https://codereview.appspot.com/14337043/ instead
<fwereade_> rogpeppe, that LGTM if it's really hard to abort the loop vs polling once per year
<rogpeppe> fwereade_: it would have to duplicate a lot of the loop's logic (reading on Dying, sending on died, reading on changed, exiting when dead); i don't really see the advantage of doing it.
<rogpeppe> fwereade_: unless, as i said above, we added some method to EnvironProvider, but that has tentacles and seems way overkill for killing a few log messages.
<jam> dimitern: step one LGTM
<fwereade_> rogpeppe, were it not for the tentacles, that would be my favoured solution
<fwereade_> rogpeppe, but I'm looking for mr right now, not mr right ;p
<jam> fwereade_: but the tentacles are the tasty part :)
<fwereade_> jam, I have a sudden recollection of laura over summer... playing peter pan with cuddly toys... "and cthooley can be tinkerbell"
<fwereade_> (my stepmother had a cuddly cthulu)
<mgz> that's pretty ace
<fwereade_> I thought so :)
<dimitern> jam, thanks
<rogpeppe> fwereade_: i had a sudden thought that if it found an unimplemented error, it could kill the whole worker (ensuring it doesn't restart), but it's not that straightforward to do, sadly.
<fwereade_> rogpeppe, ah, not to worry
<fwereade_> rogpeppe, maybe a short comment explaining why it's necessary would be a good idea though
<rogpeppe> fwereade_: i'm adding one
<fwereade_> rogpeppe, cheers
<rogpeppe> it wouldn't be *too* hard (just make a special error that is recognised by worker.Runner that says "i really want to quit without taking anything else down or being restarted"), but not for now.
<fwereade_> rogpeppe, agreed
<dimitern> jam, added bug 1234715 for that
<_mup_> Bug #1234715: Verify SSLHostnameVerification: false behavior with a test (upgrader, uniter) <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1234715>
<rogpeppe> dimitern, fwereade_: landing
<dimitern> rogpeppe, the "no log spam" thing?
<rogpeppe> dimitern: yeah
<dimitern> rogpeppe, sweet
<rogpeppe> dimitern: i changed unimplementedError to notImplementedError
<rogpeppe> dimitern: and added a comment about the 1y thing
<dimitern> rogpeppe, great, thanks
<dimitern> jam, fwereade_ this is the fix for bug 1234577 https://codereview.appspot.com/14337044
<_mup_> Bug #1234577: Uniter needs to support ssl-hostname-verification: false <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1234577>
<rogpeppe> fwereade_: i don't know if you saw last night, but i succeeded in bootstrapping a live ec2 environment with an environments.yaml entry that was just "myenv": {"type": "ec2"}
<rogpeppe> fwereade_: which is quite cool
<dimitern> rogpeppe, you have your AWS_* env vars set then
<rogpeppe> dimitern: yeah, it needed them of course
<fwereade_> rogpeppe, nice, I did that too, it was very satisfying :)
<jam> dimitern: reviewed LGTM
<mattyw> fwereade_, I'm hoping to point you in the direction of a merge proposal later for the id stuff (just the api and owner-get tool) would I be able to grab you today to talk about the next stage (user creation and deletion)
<mattyw> (I appreciate you're busy at the moment)
<fwereade_> mattyw, I have 20 mins until my next meeting and hadn't much expectation of accomplishing other things in the interim, so that's perfect :)
<fwereade_> mattyw, otherwise sometime later should also be fine
<mattyw> fwereade_, if now's good I'm ready to listen :)
<fwereade_> mattyw, I think there's an intermediate step
<fwereade_> mattyw, in which we explicitly set the admin user on the services she creates
<mgz> rogpeppe: don't know if you want to re-stamp the default vpc branch before I land, have made the changes you suggested
<fwereade_> mattyw, taking care to keep the services which don't have that field still working
<fwereade_> mattyw, at the state level, that involves adding a param to AddService
<fwereade_> mattyw, and at the apiserver level it involves extracting the entity tag from the connection and passing it into AddService
<fwereade_> mattyw, shouldn't otherwise hit the API at all I think
<dimitern> jam, cheers
<mattyw> fwereade_, so that would mean that any service that gets deployed would have the admin-user set as the owner of that service?
<mattyw> and no change to the command line args
<fwereade_> mattyw, yeah, that'd be the effect
<mattyw> fwereade_, and again - we'd just hardcode the user - so we'd add a parameter to AddService - but we'd also pass a hardcoded "user-admin" to it?
<fwereade_> mattyw, the api server knows who's connected to it
<fwereade_> mattyw, you can get the tag from ... uh, somewhere
<fwereade_> mattyw, so while there's only one user still it *will* always be user-admin
<fwereade_> mattyw, but it sets us up to do the right thing transparently when there are more users
<fwereade_> mattyw, adding users is easy, and deleting them is a bit more interesting, but I think that's something we can ignore for a little bit longer
<fwereade_> mattyw, sorry, popping out for a quick cig before 4:30
<fwereade_> mgz, rogpeppe, natefinch, dimitern: are your https://launchpad.net/juju-core/+milestone/1.15.1 bugs up to date?
<fwereade_> and would someone please take a look at axw's https://codereview.appspot.com/14329043/ and, if it checks out, land it for him please? (unless he's actively here?)
<mgz> fwereade_: yes, but I'm about to bot-propose mine now, so the bot will change status shortly
<mgz> can also pick up that cl if needed
<dimitern> fwereade_, yeah, except I'm not doing the upgrade one for now, until I land the other 2 fixes
<dimitern> and the bot is not being helpful
<mgz> dimitern: what's up with the bot?
<mgz> I'm just about to poke
<dimitern> mgz, it seems the random test failures have increased
<dimitern> mgz, and the variety of the packages that fail as well
<rogpeppe> hmm, the error you now get when destroying an already-destroyed environment is not great: http://paste.ubuntu.com/6188368/
<natefinch> wow, that's bad
<fwereade_> mramm, call is full
<mramm> :(
<mramm> sinzui: hey, is this in progress bug completed? https://bugs.launchpad.net/juju-core/+bug/1234456
<_mup_> Bug #1234456: release-public-tools.bash must be hacked to work with debs <juju-core:In Progress by sinzui> <https://launchpad.net/bugs/1234456>
<dimitern> mgz, did you update bot's goamz?
<mgz> dimitern: I did
<mgz> was the only poke I made in that case
<dimitern> ok
<mgz> but I can try and help you with other things
<sinzui> mramm, I need to land it
<sinzui> oh, and it didn't land
<mramm> ok
<mgz> so, 1.15.1 is going to be from r194X, right? when the last bit we want gets landed today?
<mgz> sinzui: can you link me the release notes doc so I can add a note?
<sinzui> mgz, https://docs.google.com/a/canonical.com/document/d/1o8YsLrQuadB1gbd5veJ3cpN_r2uozKwwTmIh1RmOVHM/edit#heading=h.h7wry0fbg197
<mgz> sinzui: ta!
<sinzui> mgz, can you give me a clue to resolve this error: http://pastebin.ubuntu.com/6188547/
<mgz> sinzui: to land things on juju core, we just mark the mp approved and the bot looks for that
<sinzui> mgz, I think it is https://code.launchpad.net/~sinzui/juju-core/release-public-tools-with-streams/+merge/188966
<mgz> aaron's lp:rvsubmit is a pretty bzr plugin to make it easy
<mgz> sinzui: +add a commit message
<sinzui> doh! I get hate mail from the lander we created for charmworld. I assumed it did the same
<mgz> yes, it's very confusing needing to use `lbox propose` but needing to NOT use `lbox submit`
<dimitern> well, the first 2 times maybe, but then you just forget about lbox submit
<dimitern> fwereade_, ping
<mgz> dimitern: till you then need to land something on goamz or the like, and rediscover it :)
<dimitern> mgz, yeah, although we can bring goamz under the bot as well :P
<fwereade_> dimitern, pong
<dimitern> fwereade_, re bug 1233451
<_mup_> Bug #1233451: juju upgrade-juju results in unsupported behavior <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1233451>
<dimitern> fwereade_, is this for today as well?
<dimitern> fwereade_, (i.e. 1.15.1)
<fwereade_> dimitern, I had not thought it was, I was a little surprised to see it there
<fwereade_> dimitern, it's only necessary if we have to do 1.20 before 2.0
<dimitern> fwereade_, yeah, because I definitely won't manage for today with it
<fwereade_> dimitern, and certainly shouldn't block today, indeed
<fwereade_> dimitern, I guess take it off the milestone
<dimitern> fwereade_, done
<rogpeppe> these kinds of sporadic failures are worrying; i have no idea what's going on here: https://code.launchpad.net/~rogpeppe/juju-core/436-addressupdater-log/+merge/189012/comments/432808/+download
<fwereade_> rogpeppe, the bright side I suppose is that the suite eventually recovers from the dirty socket problems
 * fwereade_ has to go out for a while -- would mgz, rogpeppe, dimitern, natefinch look after the remaining 1.15.1 bugs please?
<rogpeppe> fwereade_: yeah, i'm still live verifying stuff
<natefinch> fwereade_: sure thing
<fwereade_> thanks guys
<natefinch> rogpeppe or mgz:  if one of you can check out my docs changes, I can land them to get that one off the list: https://codereview.appspot.com/14207048/
<mgz> natefinch: right, really doing that now
<rogpeppe> natefinch: sorry, i got diverted while looking at them
<dimitern> sinzui, https://bugs.launchpad.net/juju-core/+bug/1234456 fix has landed, right?
<_mup_> Bug #1234456: release-public-tools.bash must be hacked to work with debs <juju-core:In Progress by sinzui> <https://launchpad.net/bugs/1234456>
<sinzui> dimitern, yes
<dimitern> sinzui, ok, marked as Fix Committed
<sinzui> oh, sorry, I thought gobot could do that
<dimitern> lbox does weird things sometimes, like changing milestones around
<dimitern> sinzui, it used to but it stopped a while ago I think
<mgz> natefinch: see review
<natefinch> mgz: initial newlines are stripped automatically already... it just makes the code cleaner when you have it all smashed up against the left side of the window.  And I ordered the providers alphabetically on Dave's suggestion... it seems more fair and an easier rule to follow.
<mgz> fair enough
<natefinch> mgz: I'll make those removals though, thanks.
<dimitern> approving axw's fix for bug 1234127
<_mup_> Bug #1234127: juju should enable cloud-tools pocket for Precise <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1234127>
<dimitern> that's the last one for 1.15.1 after nate's
<rogpeppe> mgz: reviewed
<mgz> rogpeppe: thank you. natefinch ^
<rogpeppe> mgz: ta
<rogpeppe> natefinch: thanks for doing that - it's really good to make docs more consistent
<natefinch> rogpeppe: np.  Happy to put my little bit of OCD to good use :)
<natefinch> rogpeppe: thanks for the review
<natefinch> so glad sublime text has alt-q to wrap the current paragraph to the column ruler.... it even preserves indents and line prefixes so you can use it on stuff that's commented out
<hazmat> arosales, so what's been missing in the azure provider affinity group discussion, is what changed from when it worked well for other regions in 1.14.1
<arosales> hazmat, the defaults were East
<arosales> and the tools were in east
<arosales> so not affinity issues
<arosales> issue is when you try to deploy in West
<hazmat> arosales, hmm.. ic, thanks
<rogpeppe> fwereade_: i'm wondering about the behaviour of juju when you try to use (not bootstrap or sync-tools) an environment that has no associated info. at the moment you'll get a config error if the environments.yaml file omits some prepare-time attributes, such as admin-secret.
<rogpeppe> fwereade_: i'm thinking that if there's no associated info and creating a new Environ directly from environments.yaml fails, that we should just return an "environment does not exist" error.
<rogpeppe> fwereade_: or perhaps "environment has not been created"
<rogpeppe> fwereade_: and that when everyone has .jenv files, we can just fail when there's no associated info, with the same error
<arosales> hazmat, I don't know if it is a juju thing that is setting the affinity group or Azure
<arosales> hazmat, I found this http://michaelwasham.com/2012/08/07/http-error-message-the-location-or-affinity-group-east-us-specified-for-source-image/
<arosales> but juju is only reading the public tools which I wouldn't think would cause an affinity group set, but I don't know.  That is the main question
<arosales> the juju tool read does happen first,
<hazmat> arosales, interesting that link does suggest a solution
<arosales> hazmat, basically upload tools to the same region you deploy in, or are you reading something different there?
<hazmat> although exactly where the issue is is not entirely clear either without further investigation
<arosales> hazmat, it also may be that my juju control bucket for storage is in east when I try to deploy in west . . .
<hazmat> arosales, i'll try digging deeper over the weekend if no one else has gotten to it. i did have azure working last week.
<hazmat> arosales, the azure provider does a very good job of cleaning up after itself
<hazmat> arosales, at the cost of being synchronous, including its tool bucket.
<arosales> hazmat, it does but it takes some time.
<arosales> hazmat, there is a new api for delete that we should work in some time.
<hazmat> arosales, it's hard to avoid given the api.. to kill the vnet, it has to wait on instance kills, etc. although ideally it's doing that in parallel with a wait for the group (for instance stop).
<arosales> hazmat, yup and I think msft tried to clean that code up.
<hazmat> arosales, did i read that right? msft cleaned up juju code? will wonders never cease.
<natefinch> rogpeppe, mgz, fwereade_, dimitern: just landed my  docs branch (well, set to approved, waiting for the bot).  Gotta run out for a doctor's appointment, but I'll be back later in the day.
<mgz> I'm going to be around but only with one eye here for the evening
<sinzui> rogpeppe, I see a GhostRevisionHasNoRevno error in gnuflag. I think you have the power to fix it: http://pastebin.ubuntu.com/6188999/
<rogpeppe> sinzui: interesting
 * rogpeppe doesn't know about ghost revisions
<sinzui> yeah, I haven't seen a ghost issue in years
<sinzui> I don't know the details, I think abentley, jam, and mgz do though
<rogpeppe> sinzui:
<rogpeppe> bzr fetch-ghosts
<rogpeppe> bzr: ERROR: unknown command "fetch-ghosts"
<abentley> rogpeppe: it's from bzrtools.
 * sinzui is updating the make tarball script to honour dependencies.tsv
<rogpeppe> abentley: thanks
<rogpeppe> sinzui: how can i tell if i've fixed the issue?
<sinzui> rogpeppe, this is how I found the issue
<sinzui> bzr checkout --lightweight -r 12 lp:gnuflag gnuflag
<rogpeppe> sinzui: i just tried that actually - it still failed
<sinzui> :/
<rogpeppe> sinzui: i'll try the second suggestion
<rogpeppe> sinzui, abentley: nope. nothing doing. i've tried all the combinations i can think of
<rogpeppe> sinzui: the bzr push always says "No new revisions or tags to push"
<rogpeppe> sinzui: which i presume means it's not making any changes
<sinzui> hmm. this is tricky
<rogpeppe> sinzui: should i fetch ghosts, then make a blank commit and try again?
<sinzui> push --overwrite says there is nothing to do?
<rogpeppe> sinzui: yes
<sinzui> I don't think fetch will help
<sinzui> I can change the release tarball script to not use lightweight checkouts. I think that will unblock me
<rogpeppe> sinzui: if i knew what the issue was, i could probably do something about it
<sinzui> abentley, ^ do you think the project's repo is corrupt?
<rogpeppe> sinzui: i could erase most of the history of the branch, i suppose, by rewinding it and applying a patch, and push --overwriting.
<sinzui> rogpeppe, me too.
 * sinzui looks at branch again
<rogpeppe> sinzui: i don't really want to lose the repo history though - it's quite useful sometimes
<sinzui> rogpeppe, I agree.
<sinzui> rogpeppe, I see from https://code.launchpad.net/~gnuflag/gnuflag/trunk that the branch we would normally use as the base of the stack is actually stacked on the original branch
<sinzui> ^ abentley could this issue be a manifestation of stacking, and will `bzr reconfigure --unstacked` address the issue
 * sinzui pulls the branch and tries
<rogpeppe> sinzui: apologies for my ignorance. what's a stacked branch?
<sinzui> rogpeppe, lp:~gophers/gnuflag/trunk
<rogpeppe> sinzui: that's the trunk branch, yes?
<rogpeppe> sinzui: what does it mean for it to be "stacked"?
<abentley> sinzui: No, the problem is that bzr doesn't normally push ghost revisions.  That's why you need to use the fetch-ghosts command, not "pull" or "push".
<sinzui> rogpeppe, stacking is a disappointing space optimisation used by Lp. the true trunk branch (yours) does not have all its revisions, so bzr will pull the remaining revs from ~gophers. Shared repos are better...but Lp doesn't know about them
<rogpeppe> sinzui: so the true trunk branch isn't lp:~gophers/gnuflag/trunk ?
<sinzui> actually it is...
<sinzui> rogpeppe, the project thinks ~gnuflag/gnuflag/trunk is the current focus of development. It will stack all new branches on that, so that all new branches will be based on true trunk. But true trunk doesn't have all its revisions, most are in lp:~gophers/gnuflag/trunk. This probably happened when the branches were switched
<sinzui> This is one of many disappointing "features" of stacking
<rogpeppe> sinzui: ah, i hadn't twigged to ~gnuflag vs ~gophers distinction
<rogpeppe> s/ to / the /
<rogpeppe> sinzui: well, if you work out a way of fixing it, let me know what to type and i'll type it :-)
<sinzui> okay.
 * rogpeppe has definitely reached eod
<rogpeppe> g'night all
<abentley> sinzui: If you manually stack ~gnuflag/gnuflag/trunk on ~gophers/gnuflag/trunk, that should fix it.  And if it does, I would expect it would stay fixed after reconfigure --unstacked.
<sinzui> abentley, I will try that
<abentley> sinzui: Stacking uses the database ids of branches, so renaming branches should not break stacking.
<sinzui> I just realised that -r revno is broken in this case, but -r revid works
<abentley> sinzui: I have scripted out basically the whole process of setting up for lxc.  Now setting it up on juju-gui-qa.  I love watching when scripts Just Work.
<sinzui> me too abentley
<thumper> morning
<thumper> morning natefinch
<natefinch> thumper: morning
<thumper> I was just looking at the 1.15.1 milestone
<thumper> and the only non-fix committed bug is the azure help
<thumper> how's that going?
<natefinch> thumper: it's committed... the bot just seems to not want to land it
<thumper> ?!
<natefinch> thumper: https://code.launchpad.net/~natefinch/juju-core/018-azure-help/+merge/188936
<natefinch> *shrug*
 * thumper looks
<thumper> natefinch: you need to set the commit message :)
<natefinch> sonofa
<thumper> heh
<natefinch> why can't it just take the description? :/
<thumper> there is a long and sordid history
<thumper> my answer to that is this:
<thumper> the description is what you write to help the reviewer, explaining the changes, what why and how
<thumper> the commit message is what the change is
<natefinch> Fair enough
<thumper> they have different audiences
<thumper> however, most people in the team just copy the description
<thumper> I often edit it
<natefinch> do I have to poke it in and out of approved, or will the bot notice the change to the commit message?
<natefinch> The commit message thing would bug me a lot less if the bot would just email me that I forgot the commit message, instead of silently failing to do anything
<thumper> the bot will notice
 * natefinch thinks that line sounds ominous
<natefinch> I have the root-disk constraint working on EC2, just need to get it to actually report back the correct disk size (currently it just hard-codes the 8G default)
<thumper> hmm..
<thumper> the bot failed
<thumper> but because it couldn't create a directory
 * thumper kicks it into approved again
<natefinch> I can retry... though not being able to create a directory doesn't sound like something that's likely to work a second time either
<natefinch> yeah, cool
<thumper> it could be a random clash
<thumper> if not, someone needs to log into the machine and clear out the /tmp dir
<thumper> from all the gocheck temp dirs
<thumper> I have a feeling that some are left behind if there are issues with teardown
<natefinch> ug
 * thumper nods
<thumper> yeah
<thumper> I have about 10 in my /tmp dir
<natefinch> I gotta run my daughter to a doctor's appointment, but I'll be back on after for a bit.
<thumper> ack
<thumper> fwereade_, mgz: anyone know where the landing bot instructions are?
<thumper> need to clean out the tmp dir
<thumper> bot is failing
<thumper> fwereade_: if you are around, I'd love to chat and rant
<natefinch-afk> thumper: how goes?
#juju-dev 2013-10-04
<thumper> natefinch-bed: night
<jam> sinzui: rogpeppe: generally you get the ghost revision doing a lightweight checkout of a stacked branch (with no actual local commits). The easier thing is "bzr reconfigure --unstacked lp:gnuflag"
<jam> I would expect a trunk branch to be unstacked anyway, but I can imagine the owner changed over time
<jam> go bot will mark things Fix Committed if you used "bzr commit --fixes lp:####" I believe
<jam> I guess I missed that you did eventually figure that out
<axw> morning jam
<jam> morning axw, I'm not "officially working" but if I can do anything to smooth the release I'll sneak some time this morning. Do you know of anything ?
<axw> jam: nope, was just saying morning :)
<axw> I'm not aware of any problems
<jam> axw: and a good afternoon to you
<axw> argh, something has broken the null provider
<davecheney> axw: does storage-auth-key have a default value ?
<axw> davecheney: no, it is in the boilerplate with a random value, but there's no default
<axw> it should be checked for existence
<axw> but isn't
<davecheney> how come you were able to comment it out
<davecheney> ?
<davecheney> oh, you just told me
<davecheney> well, there is your problem
<axw> bleh, I found another problem, and this one actually does need a code fix :(
<davecheney> dag
<davecheney> dang
<rogpeppe> mornin' all
<rogpeppe> jam: i didn't know about bzr commit --fixes
<rogpeppe> axw: what's the problem?
<axw> rogpeppe: https://bugs.launchpad.net/juju-core/+bug/1235100, https://bugs.launchpad.net/juju-core/+bug/1235102
<_mup_> Bug #1235100: null provider machine agent fails to initialise <juju-core:New> <https://launchpad.net/bugs/1235100>
<_mup_> Bug #1235102: null provider authenticating storage client fails <juju-core:New> <https://launchpad.net/bugs/1235102>
<axw> I made some changes in a review, and broke everything
<axw> didn't do a live test
<axw> (everything in the scope of the null provider only)
<rogpeppe> axw: the storage-auth-key panic looks like the null provider isn't checking its config properly
<axw> rogpeppe: correct
<axw> I know what all the issues are, they just need code changes
<axw> which obviously isn't good at this time :)
<rogpeppe> axw: well i'm happy to review
<davecheney> night all
<davecheney> i'll be online later
<axw> rogpeppe: cool, trying to get it all working now. I'll ping you a bit later
<axw> night davecheney
<rogpeppe> davecheney: see ya
<fwereade_> hey all
<fwereade_> I see no release, so I presume things are bad
<axw> fwereade_: not sure why there's no release, but things are bad with the null provider.. wouldn't think that would hold anything up tho
<axw> also, hi
<fwereade_> axw, heyhey
<fwereade_> axw, well, that is rather problematic tbh, is there anything I can do to help?
<axw> fwereade_: under control at the moment, just about to propose a couple of fixes
<fwereade_> axw, ok, cool
<axw> rogpeppe, fwereade_ https://codereview.appspot.com/14387043
<axw> one more to come
<rogpeppe> axw: just remind me please: when is the machine agent accessing the environ's Storage before the secrets are pushed?
<fwereade_> axw, sorry, I don't quite get it
<fwereade_> axw, yeah, what rogpeppe said
<axw> I think it was from StateInfo
<axw> one moment, I'll reproduce
<axw> rogpeppe, fwereade_ yep, environ.StateInfo gets called, which calls common.StateInfo, which does a common.LoadState
<fwereade_> axw, by what?
<axw> launchpad.net/juju-core/state/apiserver/common.(*Addresser).StateAddresses(0xc200263d00, 0x0, 0x0, 0x0, 0x0, ...)
<rogpeppe> axw: isn't the StateInfo passed to the machine agent in its config?
<rogpeppe> axw: i'm a bit surprised that the machine agent is calling StateInfo before it has a valid environ config
<axw> rogpeppe: as was I :)
<rogpeppe> axw: where is it being called?
<axw> rogpeppe: I'll pastebin the stacktrace, just a sec
<axw> rogpeppe: http://paste.ubuntu.com/6191212/
<rogpeppe> axw: could you paste the stack trace of the bootstrap machine agent at that point?
<axw> rogpeppe: ? that is from the bootstrap machine agent
<rogpeppe> axw:  i mean the stack trace of all the goroutines
<axw> ah yep, just a sec
<fwereade_> axw, so that's either deployer or provisioner making that call, right?
<fwereade_> axw, deployer shouldn't have any units to deploy until after the first connection
<dimitern> the bot was running out of space and there were way too many gocheck-* and test-mgo* dirs in /tmp/ so some tests were failing, I just cleaned it up
<fwereade_> axw, and provisioner shouldn't start until it's got a valid environ config
<fwereade_> dimitern, thanks
<axw> fwereade_: not sure tbh. I just bootstrapped and nothing else
<axw> so it shouldn't be trying to provision anything
<fwereade_> axw, sure, but the provisioner ought to be able to start up with an invalid environ config, and just sit and wait until it gets a valid one
<rogpeppe> fwereade_: agreed; i think it must be the provisioner
<axw> fwereade_: yeah, I thought it did wait today
<rogpeppe> axw: still waiting for that stack trace - it'll sort out whether it's the provisioner or the deployer
<axw> I was looking at it earlier, looked somewhat sane
<axw> rogpeppe: uploading now
<rogpeppe> perhaps it's a bug in WaitForEnviron
<fwereade_> axw, does a config without a storage-auth-key validate?
<fwereade_> axw, it looks like it does, and that's the problem,but maybe I'm misreading
<axw> fwereade_: it does at the moment. If nothing is meant to be able to call Storage until it has all the secrets, then it shouldn't validate
<axw> rogpeppe: attached machine-0.log to https://bugs.launchpad.net/juju-core/+bug/1235100
<_mup_> Bug #1235100: null provider machine agent fails to initialise <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1235100>
<fwereade_> axw, that schema.Omit is basically an explicit "I don't care if we don't have this, everything will work perfectly if it's not here"
<fwereade_> axw, and unless there's some explicit handling elsewhere to fail validation, there's your problem
<axw> fwereade_: yep, I understand that bit
<axw> (now)
<axw> ;)
<fwereade_> :)
<axw> nothing like a last minute bug to educate you
<axw> anyway
<fwereade_> oh yes indeed:)
<axw> I'll propose this other fix, then get back to this issue
<fwereade_> axw, cheers
<axw> fwereade_: https://codereview.appspot.com/14388043
<axw> this one's more straightforward
<fwereade_> axw, I am concerned at the lack of tests there...
<fwereade_> axw, I can't really see the impact by inspection
<axw> fwereade_: ok. I think I can come up with one
<fwereade_> axw, thanks
<frankban> hi, how can I create local certs in juju home for an environment?
<axw> fwereade_: updated, verified it failed before the fix
<axw> rogpeppe: I think it's triggered by the provisioner task calling NewAPIAuthenticator, which does st.StateAddresses
<axw> rogpeppe: the API server gets the env config from state, but the secrets aren't in there
<rogpeppe> axw: but it shouldn't be calling NewAPIAuthenticator until it's obtained a valid Environ
<rogpeppe> axw: which shouldn't be possible until the secrets are there
<axw> rogpeppe: ok, so is it because Validate isn't returning an error
<axw> ?
<rogpeppe> axw: that sounds right
<rogpeppe> axw: i think that's what fwereade_ was trying to say earlier
<axw> right, I didn't catch on
<axw> I'll fix that now. thanks rogpeppe and fwereade_
<fwereade_> frankban, if you started a new environment with current code, the certs are now written into BootstrapConfig in $JUJU_HOME/environments/<name>.jenv
<frankban> fwereade_: thanks, that's right. In trunk, if I add a new environment (e.g. a local one) in the yaml file, and then I try to bootstrap it, the certs seem not to be generated: http://pastebin.ubuntu.com/6191312/
<axw> I think that's a real bug with the local provider
<axw> it used to have to chown the cert files to the sudo caller
<axw> frankban: as a workaround, you can "touch" that file it's complaining about. would you mind logging a bug?
<axw> rogpeppe: updated https://codereview.appspot.com/14387043
<axw> frankban: never mind, I'll log it - I can reproduce it trivially
<frankban> axw: thanks!
<axw> https://bugs.launchpad.net/juju-core/+bug/1235130
<_mup_> Bug #1235130: local provider fails bootstrapping if legacy certificates are missing <juju-core:Confirmed> <https://launchpad.net/bugs/1235130>
<rogpeppe> frankban, axw: ah sorry about that - my fault, i had no idea there was other stuff in the system that relied on the old cert files
<rogpeppe> should be trivial to fix
<axw> rogpeppe: no worries, it's a bit specific to the local provider
<axw> yep
 * rogpeppe thinks that our approach to dealing with the running-as-root stuff is a bit wrong
<rogpeppe> i think we should run *something* as root, but drop privileges a.s.a.p.
<rogpeppe> then we wouldn't need the "chown it back to what we think it should have been" hacks all over the place
<axw> it's not great
<frankban> fwereade_, axw: so, IIUC, excluding that bug we could safely remove all the pem files in JUJU_HOME, correct?
<axw> frankban: I think it'll need to be migrated to the .jenv file first, is that right rogpeppe ?
<fwereade_> frankban, not for pre-existing environments
<rogpeppe> fwereade_: +1
<fwereade_> axw, frankban, rogpeppe: we don't have any migration code there but we probably should:(
<rogpeppe> it's a bit awkward at the moment
<axw> doh
<rogpeppe> fwereade_: i was wondering about that
 * axw undoes his current change ;)
<rogpeppe> fwereade_: it's not *strictly* necessary, but would enable us to have cleaner code
<rogpeppe> fwereade_: and better error messages
<rogpeppe> the error you get when trying to talk to an environment without associated environ info is terrible currently
<rogpeppe> fwereade_: how about "juju make-info -e someenv" ?
<fwereade_> rogpeppe, wait, so we can't keep using an existing environment?
<rogpeppe> fwereade_: sure we can
<fwereade_> rogpeppe, thought so
<fwereade_> rogpeppe, but then I don't follow what you just said
<rogpeppe> fwereade_: but the problem is that if you have a new-style environments.yaml, you get an error like this http://paste.ubuntu.com/6191378/
<rogpeppe> fwereade_: because the code thinks that if no info exists, it must try to make a valid environment from just the attrs in environments.yaml
<fwereade_> rogpeppe, ah, so that's trying to destroy twice?
<rogpeppe> fwereade_: it doesn't get that far
<rogpeppe> fwereade_: it fails trying to make a new Environ
<dimitern> rogpeppe, sounds like it's doing more than it should
<rogpeppe> fwereade_: (i tried to raise this with you yesterday, BTW :-])
<fwereade_> rogpeppe, but only when that environ doesn't exist?
<rogpeppe> dimitern: what do you think it should do?
<rogpeppe> fwereade_: only when the configstore info for that environ doesn't exist
<fwereade_> rogpeppe, sorry I either missed it or completely failed to understand
<dimitern> rogpeppe, check if there's an actual bootstrapped environment before trying to prepare one?
<rogpeppe> dimitern: this problem doesn't happen when Preparing
<rogpeppe> dimitern: it'll happen if you call juju status, for example
<rogpeppe> dimitern: or any command that tries the NewFromName an existing environment
<rogpeppe> s/the/to/
<dimitern> rogpeppe, or destroy-environment
<rogpeppe> dimitern: yeah - that was my original example
<fwereade_> rogpeppe, but only if that environment has been edited to become new-style..?
<rogpeppe> fwereade_: yes
<dimitern> rogpeppe, yeah, that seems just little too magical
<rogpeppe> fwereade_: the problem is our attempt to deal with the old-style environments.yaml
<fwereade_> rogpeppe, the release notes have:
<fwereade_> admin-secret is now chosen automatically if omitted from the configuration of a new environment.
<fwereade_> control-bucket is now chosen automatically if omitted from the configuration for new ec2 and openstack environments.
<dimitern> rogpeppe, the same thing happens when you upgrade from 1.14 - no new-style env info
<rogpeppe> dimitern: yes
<fwereade_> rogpeppe, dimitern: so, it's all about environs that were bootstrapped pre-1.15
<rogpeppe> what i would *like* to happen is that these commands just fail immediately if there's no associated environ info
<rogpeppe> fwereade_: yes
<fwereade_> rogpeppe, dimitern: and which have then been destructively edited in environments.yaml
<fwereade_> rogpeppe, dimitern: is the behaviour actually any worse than would happen in 1.14 if you broke environments.yaml?
<rogpeppe> fwereade_: hence my thoughts about a transitory "juju make-info" command
<rogpeppe> fwereade_: no, it's not about envs which have then been destructively edited
<rogpeppe> fwereade_: it's about envs which have *not* been destructively edited
<rogpeppe> fwereade_: that's what the code is currently trying to deal with
<rogpeppe> fwereade_: and that should work fine
<dimitern> fwereade_, not sure what you mean by destructively edited
<fwereade_> rogpeppe, what happened to the control-bucket in the example you pasted? looks like a destructive edit to me..?
<dimitern> fwereade_, if you remove the environment from env.yaml?
<rogpeppe> fwereade_: that's not an environment that was bootstrapped pre-1.15
<fwereade_> rogpeppe, ahhhhh bollocks ok
<fwereade_> rogpeppe, but then I don't see how make-info would help
<rogpeppe> fwereade_: make-info would create the environ info (with bootstrap config) from an environments.yaml file
<axw> another CL, to fix the local provider issue: https://codereview.appspot.com/14289044
<rogpeppe> fwereade_: so we could change the code to fail *always* if an appropriate .jenv file does not exists
<rogpeppe> exist
<frankban> fwereade_, axw, rogpeppe: thanks for your help. I believe you are already aware of this, but, while local envs now work properly in my raring box, I also see those errors: http://pastebin.ubuntu.com/6191399/
<fwereade_> aw ffs
<axw> frankban: yep, I believe rogpeppe has fixed that last night
<rogpeppe> frankban: that's with trunk?
<fwereade_> frankban, thanks
<rogpeppe> axw: i *thought* i fixed that
<axw> oh
<axw> :o
<frankban> rogpeppe: yes, revno 1948
<rogpeppe> frankban: but you're still seeing that problem?
<frankban> rogpeppe: yes
<axw> rogpeppe: I can confirm
<rogpeppe> bugger
 * rogpeppe goes to fix it
<fwereade_> rogpeppe, you did run the local provider with your fix, right?
<rogpeppe> fwereade_: i did, but perhaps not with the final combo
<fwereade_> rogpeppe, heh, I've fallen into that one a couple of times
<fwereade_> bad luck
<fwereade_> rogpeppe, but when scrambling for a release it's doubly important to be obsessive
<rogpeppe> fwereade_: yes
<rogpeppe> well, the bug is clear; i wonder why my test didn't find it
<fwereade_> rogpeppe, basically I am not happy with the make-info idea, but let's come back to it in a bit
<rogpeppe> fwereade_: yeah, it was a straw man; other ideas would be good.
<rogpeppe> fwereade_, axw, frankban: here's the (possible) fix - am just verifying it works with the local provider: https://codereview.appspot.com/14388044
<frankban> rogpeppe: cool thanks
<axw> rogpeppe: looking
<fwereade_> rogpeppe, LGTM if it works
<axw> ah already
<axw> rogpeppe, fwereade_: let me know if there's anything that needs explaining in my CLs... just going to make a cup of tea
<rogpeppe> fwereade_: it works
<fwereade_> rogpeppe, great
<rogpeppe> fwereade_: i'll land that and then fix the cert chown problem
<fwereade_> rogpeppe, axw has https://codereview.appspot.com/14289044/ up, take a look at that
<rogpeppe> fwereade_: great
<rogpeppe> fwereade_: that's exactly the change i was gonna make
<rogpeppe> fwereade_: so, back to make-info (or whatever); there's another possibility, that i'm less keen on because it's kinda wrong, but...
<rogpeppe> fwereade_: which is to change environs.NewFromName so that if the environment or config can't be made and there's no environ info, it just returns "environment does not exist", ignoring the config error
<fwereade_> rogpeppe, so, I think I'd be happier with that
<fwereade_> rogpeppe, but I expect you've thought through the problems more clearly than me
<rogpeppe> fwereade_: it always seems a bit bad to ignore arbitrary errors, but i think in this case it's ok - we're grudgingly allowing some legacy environments to be created, but ignoring them if they're invalid
<fwereade_> rogpeppe, yeah, that was my readingof it
<rogpeppe> fwereade_: ok, i'll propose something that does that
<rogpeppe> fwereade_: i think it's worth getting in if we can
<rogpeppe> fwereade_: because everyone will come across this problem
 * fwereade_ hates churn at this stage but thinks it's probably worse to not do it
<axw> fwereade_: updated https://codereview.appspot.com/14387043
<axw> not just a test update - there shouldn't have been a default set at all
<fwereade_> axw, LGTM assuming it works live ;)
<axw> it does indeed
<axw> thanks
 * axw tests them all together now
 * fwereade_ approves :)
<axw> all good
<axw> well
<axw> on the upside, now I know a bit more about how the machine agent works
<rogpeppe> fwereade_: suggestions for a good error message when there's no environment info for an environment?
<rogpeppe> fwereade_: i'm thinking of going with "environment does not exist"
<rogpeppe> fwereade_: but that's worryingly similar to "environment not found"
<rogpeppe> fwereade_: "environment not prepared" is more accurate, but probably confusing to the user
<fwereade_> rogpeppe, I'm kinda leaning towards "is not bootstrapped", but I'm fretting there's some way for that to be wrong
<rogpeppe> fwereade_: there is
<rogpeppe> fwereade_: because you can do sync-tools before bootstrap
<rogpeppe> fwereade_: bootstrapped status is largely orthogonal to prepared status
<rogpeppe> fwereade_: except that destroy-environment resets both of them
<fwereade_> rogpeppe, I seem to be unusually dense today -- but if we have no info, and no valid config, how *could* we have boostrapped?
<rogpeppe> fwereade_: we could have changed the config after bootstrapping
<rogpeppe> fwereade_: "is not bootstrapped" is logically accurate, i suppose
<fwereade_> rogpeppe, is the situation any worse than it would be if you broke the config after bootstrapping with 1.14?
<rogpeppe> fwereade_: it's possibly a little more misleading
<rogpeppe> fwereade_: saying "is not bootstrapped" kind of implies that the only way to rectify the situation is to bootstrap, but actually sync-tools will do the job too
<fwereade_> rogpeppe, ok, but why sync-tools without a view to a bootstrap? ;)
<rogpeppe> fwereade_: i guess my issue comes from the fact that we're saying "is not bootstrapped", when we actually have no idea if the user has bootstrapped or not.
<rogpeppe> fwereade_: but as an error message, i think it works ok
<rogpeppe> fwereade_: even if it's not *strictly* accurate
<rogpeppe> fwereade_: so i think i'll go with it
<fwereade_> rogpeppe, I think they either didn't bootstrap, of they broke their environment in a way we've never been able to fix
<fwereade_> rogpeppe, I say go with it :)
<rogpeppe> fwereade_: or they didn't sync-tools :-)
<rogpeppe> fwereade_: will do
<natefinch> morning all
<axw> morning natefinch
<natefinch> rogpeppe: fwiw, as long as juju bootstrap fixes "is not bootstrapped" I think that's fine.  I much prefer an error that makes it obvious how to fix it, even if it's not 100% accurate.
<rogpeppe> natefinch: cool
<axw> rogpeppe: re your comment about panicking: noted, but it's going thru the bot now
<rogpeppe> axw: np
<axw> I'll do that next time I touch it
<axw> fwereade_: should I be marking these all against 1.15.1, or is that actually cut and nobody announced it?
<fwereade_> axw, I don't think sinzui is up yet, so I lack info
<fwereade_> natefinch, heyhey, is there a problem with that doc branch or can we land it?
<natefinch> fwereade_: bot was messed up last night.  Tim was looking at it when I went to bed. /tmp was full or something
<natefinch> fwereade_: otherwise, yes, totally ready
<fwereade_> natefinch, ah, great, dimiter cleared that up this morning
<natefinch> fwereade_: I just poked it again, so the bot will land it
<axw> I'm off, have a nice weekend all
<fwereade_> axw, cheers, and you
<fwereade_> axw, thanks for all your help this week
<dimitern> rogpeppe, mgz, standup?
<rogpeppe> dimitern, fwereade_: https://codereview.appspot.com/14388044
<mgz> dimitern: ta
<fwereade_> oh ffs
<rogpeppe> fwereade_: https://codereview.appspot.com/14386044
<rogpeppe> fwereade_, dimitern, mgz: i need a review of the above, please
<fwereade_> rogpeppe, sorry, what's invalid about the info in the first test?
<rogpeppe> fwereade_: it hasn't got a state-id
<rogpeppe> fwereade_: which the dummy provider adds when preparing
<rogpeppe> fwereade_: i'll add a comment
<fwereade_> rogpeppe, cheers
<fwereade_> rogpeppe, another trivial or two, otherwise LGTM
<rogpeppe> fwereade_: i'm trying to understand https://codereview.appspot.com/14388043
<rogpeppe> fwereade_: why can't we just set the Location to the req.Host directly
<rogpeppe> ?
<rogpeppe> fwereade_: i don't see why we have to do all the port mangling
<rogpeppe> fwereade_: and why should being able to do a HEAD be predicated on running secure (ie. with a tls.Config)
<rogpeppe> ?
<fwereade_> rogpeppe, I'm sorry to say I made a trust call on that one -- please comment on the review. re secure HEAD -- I do not know why we even do anything http-only tbh, but it seemed out of scope
<rogpeppe> mgz, fwereade_: looks like the bot may be down - i approved this 50 minutes ago and it still hasn't made any progress: https://code.launchpad.net/~rogpeppe/juju-core/440-fix-addressupdater/+merge/189238
<rogpeppe> still no movement on https://code.launchpad.net/~rogpeppe/juju-core/440-fix-addressupdater/+merge/189238
 * rogpeppe goes to lunch
<rogpeppe> mgz: if you were able to investigate the 'bot problem, that would be great...
<mgz> rogpeppe: something does seem to be up...
<mgz> rogpeppe: answer to the mystery, btw
<rogpeppe> mgz: oh yes?
<mgz> there was a mongodb process from a test left around, which inherited the filehandle of our flock file
<mgz> so, that got left in a locked state, so the cronjob was not obtaining the lock or starting any new work
<mgz> there are like, a bunch of ways to make that more robust
<rogpeppe> mgz: ah, so it's merrily continuing on its way now?
<mgz> I huped the mongod process and things are happy again, your mp is being processed
<rogpeppe> mgz: let's hope one of the not-so-intermittent test failures doesn't strike again
<rogpeppe> merged!
<fwereade> natefinch, do you have any docs left to merge?
<natefinch> fwereade: nope,  mine are all in
<fwereade> natefinch, great, thanks
<fwereade> natefinch, bug up to date?
<natefinch> fwereade: oops, nope, I'll go poke it
<fwereade> natefinch, cheers
<rogpeppe> fwereade: do you agree that bootstrap.ConfigureBootstrapMachine should live in provider/common ?
<fwereade> rogpeppe, probably
<rogpeppe> fwereade: given that it's designed to be used as part of provider implementions, which bootstrap.Bootstrap is not
<fwereade> rogpeppe, wait, no
<fwereade> rogpeppe, nothing to do with a provider implementation
<rogpeppe> fwereade: well, kinda
<fwereade> rogpeppe, coincidentally called by the local provider
<rogpeppe> fwereade: a provider either arranges for jujud bootstrap to be called, or it does it itself
<fwereade> rogpeppe, I see no good reason not to use jujud bootstrap across the board
<rogpeppe> fwereade: yeah, the local provider should probably just exec it
<fwereade> rogpeppe, pretending that state manipulation is part of a provider implementation will not lead us along a happy path
<rogpeppe> fwereade: yeah, hmm
<fwereade> rogpeppe, it's not in a good place now, though
<rogpeppe> fwereade: yeah
<natefinch> fwereade: so, it sounds like Tim has a solution to the "juju destroys all MaaS nodes" bug but needs to verify it. So, I presume I don't need to work on that. I can try to finish up the EC2 root disk stuff... as I said in the standup, the constraint works, we're just not getting the value back in status
<fwereade> natefinch, ok, that's great, I'll sync up with him on sunday night and make sure, but fixing root-disk would be awesome
<fwereade> natefinch, thanks
<natefinch> fwereade: cool.   Do you have a good process for debugging what's going on with jujud and the agents?  I've mostly done work on the client, so not really sure how I'd test out code on the bootstrap node (other than repeatedly destroying/bootstrapping with the new code)
<fwereade> natefinch, keeping half an eye on debug-log, and meditating upon the infinite, basically
<fwereade> natefinch, but if it's something specific I might be more helpful?
<sinzui> fwereade, what is this milestone, https://launchpad.net/juju-core/+milestone/dev-docs
<natefinch> fwereade: heh... mostly just that there's information I expected to be there that isn't there. I guess looking at the code for goamz and seeing if it actually returns the hardware characteristics would help
<fwereade> natefinch, don't think so
<fwereade> natefinch, we hope to get instance-type data into simplestreams sometime
<fwereade> natefinch, but for now the best we can do is use the hardcoded tables
<natefinch> fwereade: if we allow people to set the disk space, I'd prefer we return no information rather than wrong information.
<fwereade> natefinch, yeah, I'd be inclined to take the RootDisk info out of instancetype.go
<fwereade> natefinch, and just always insert the value from the attached ebs volume
<fwereade> natefinch, might be worth checking with sidnei if he had any clever plans in that direction too
<natefinch> fwereade:  cool, ok
<rogpeppe> fwereade: how about folding most of ConfigureBootstrapMachine into agent.InitialStateConfiguration?
<rogpeppe> fwereade: (which I'd probably rename to something like InitializeState)
<fwereade> rogpeppe, sounds reasonable
<rogpeppe> fwereade: this came about because i was fiddling with provider/local bootstrap and looking at ConfigureBootstrapMachine, realised that it didn't have a test, then started writing one, then realised things could probably be simpler...
<fwereade> rogpeppe, I am more than happy to see that stuff tidied up
<rogpeppe> fwereade: i don't really like the way that InitialStateConfiguration is written as a free func that only works on a particular type either. It should really be part of the Config interface, but really I think the Config interface should be a concrete struct, although i know tim thinks differently.
<fwereade> rogpeppe, the cast to *configInternal does look like crack to me
<natefinch> fwereade: wow, that is total crack
<natefinch> fwereade: that'll panic if you pass any other implementation of that interface
<fwereade> natefinch, thankfully there aren't any others, but still
<natefinch> fwereade: still... I agree, it should be a method on the interface... certainly not a function that someone else can call and cause to panic by implementing some other Config
<mattyw> fwereade, regarding the comment here: https://codereview.appspot.com/14389043/diff/1/state/state.go#newcode693. Don't we need this function so we can get data into the megawatcher?
<fwereade> mattyw, that's got a *State, so we can get to it -- and I'm a bit twitchy about having different ways to do the same thing in state
<fwereade> mattyw, I don't think it's a blocker in any way, minor matter of taste
<fwereade> mattyw, (there are a couple of other similar methods lurking already, that IMO should be in another package and make use of the bits exported from state because that's all they use, but... it's really not a huge deal)
<mattyw> fwereade, ok, I might leave it like that for now but it's something to meditate on :)
<mattyw> fwereade, one more question: for this https://codereview.appspot.com/14389043/diff/1/worker/uniter/context.go#newcode145 should we pass it in to HookContext as a parameter to NewHookContext - or work it out inside NewHookContext?
<mattyw> (I guess passing it in makes it easier to test?)
<fwereade> mattyw, passing it in feels a bit easier
<mattyw> fwereade, sure
<sinzui> fwereade, someone? I see this error running make check from the tarball on saucy and precise: http://pastebin.ubuntu.com/6192615/
<mgz> ick
<fwereade> sinzui, oh *hell* the dirty sockets have shown up occasionally since forever -- is it consistent?
<mgz> sinzui: it's an intermittent failure, try make check again
<mgz> looking for bug#
<sinzui> I have run the test once on saucy, twice on precise, always on the same hardware
 * sinzui is running the branch on saucy now
<sinzui> mgz, I got the same failure again. Can I treat this as an issue local to me. My next step in the release is to build the deb locally to confirm the recipe
<mgz> sinzui: it seems worth continuing for the moment
<sinzui> okay
<fwereade> mgz, sinzui: agreed, cannot repro
 * sinzui just realised that Lana Del Rey's Dark Paradise has a sound similar to his IRC notification
<sinzui> mgz, can I set the tools-url to url outside of the cloud? I want to build a stream from my new deb, put it on people.canonical.com, then deploy to aws, hp, etc...
<mgz> sinzui: hm, wonder what the best way of testing is these days
<mgz> sinzui: so, my old way of doing this doesn't work any more, and I've not seen instructions from ian on exactly what we should be doing instead
<sinzui> right
<mgz> but you *can* generate your own simplestreams bits using the plugins commands and get juju to use those somehow
<sinzui> I just want to get my confidence a little higher about building all the debs
<sinzui> I will explore my options with tools-url
<fwereade> gents, I need to be away now
<fwereade> happy weekends
<fwereade> I will try to look in every now and then
<rogpeppe> sinzui: i just saw your post on juju-dev
<rogpeppe> sinzui: and replied - any chance you can use 1954, not 1953 ?
<mgz> rogpeppe: that doesn't seem like a crucial fix
<rogpeppe> mgz: it's something that users will see all the time
<rogpeppe> mgz: and so affect the general "nice feelingness" of the new release
<mgz> sure, but we can put it in 1.15.2
<rogpeppe> mgz: ok, i guess
<mgz> (or rather, 1.16.1)
<mgz> I would be very surprised if there aren't some more issues we turn up that need fixing on the version we'll be shipping for saucy
<sinzui> rogpeppe, I can switch to 1954
<sinzui> rogpeppe, mgz, I can move the tag now
<mgz> if sinzui is fine with bumping that's okay :)
<rogpeppe> sinzui: that would be great, thanks!
<sinzui> I am about to rebuild my deb anyway to get the version to match what the builders will do. I will switch the tag now.
<natefinch> mgz, fwereade, rogpeppe, dimitern:  goamz changes as a precursor to supporting root-disk on ec2: https://codereview.appspot.com/14374044
<rogpeppe> natefinch: cc niemeyer
<natefinch> niemeyer: scrub my back, I'll scrub yours?   https://codereview.appspot.com/14374044    added code so we can set root disk size when we request an EC2 image for supporting the root-disk constraint
<mgz> natefinch: looking
<niemeyer> natefinch: Hmm.. this diff doesn't look right?
<mgz> natefinch: looks like you need to merge trunk
<mgz> or you'll conflict with the stuff I landed this week
<natefinch> oh, dang, I hadn't updated goamz recently... ok
<mgz> I even sent an email ;_;
<natefinch> mgz: yeah, that's how I knew what you were talking about... I just forgot to actually do it
<niemeyer> natefinch: Hmm.. I actually implemented devices support before
<niemeyer> natefinch: I wonder if I did something silly and it never made it in
<natefinch> niemeyer: yeah, I saw that BlockDeviceMapping was there but not really put to use
<niemeyer> natefinch: No, I mean I've fixed that
<niemeyer> natefinch: Either way, not in trunk.. I've screwed it up indeed
<natefinch> niemeyer: ahh, that's a shame
<niemeyer> natefinch: https://codereview.appspot.com/9860044
<niemeyer> natefinch: It just never got reviewed/merged
<niemeyer> natefinch: That also looks more complete than your version
<niemeyer> natefinch: If you're happy with that, I'll submit
<natefinch> niemeyer: looks good to me
<natefinch> niemeyer: more complete is certainly better
<niemeyer> natefinch: That's in
<niemeyer> natefinch: Sorry for the double work
<natefinch> niemeyer: it was pretty trivial, and now I know the goamz code better :)
<rogpeppe> fwereade: here's a preliminary InitializeState proposal. It's still WIP (it needs more tests) but thought you might like to have a quick once-over for sanity checking: https://codereview.appspot.com/14395043/
<natefinch> mgz: you can do this one instead, the juju code to use the goamz disk stuff: https://codereview.appspot.com/14326044
<rogpeppe> fwereade: it feels a little bit cleaner than before, i think
<rogpeppe> fwereade: there is a new restriction that InitializeState must be called on the bootstrap machine's config, but i think that seems reasonable at that very specific stage in the proceedings.
<rogpeppe> fwereade: the main thing is that it consolidates more of the bootstrap logic in a single place.
<rogpeppe> and... that's me for the week
<rogpeppe> happy weekends all
<natefinch> rogpeppe: happy weekend :)
<rogpeppe> natefinch: you too
#juju-dev 2013-10-05
<jam>  a
#juju-dev 2014-09-29
 * thumper is now grumpy, sore and hungry
<thumper> specialist appointment for shoulder isn't until 26th of November
<davecheney> o_O
<menn0> thumper: bugger
<menn0> thumper: sounds very NHS-ish
<menn0> thumper, davecheney: so what are we doing about the standup? I have an errand to do so just trying to plan the rest of the day.
<thumper> menn0, davecheney: now?
<menn0> thumper, davecheney: works for me
<davecheney> sure, i'll see you in the hangout
<davecheney> menn0: one for you http://reviews.vapour.ws/r/120/diff/
<menn0> davecheney: cool. will look shortly.
<davecheney> kk
<menn0> davecheney: apparently the review request is private and I'm not allowed to look
<menn0> davecheney: I don't think you've hit publish yet
<davecheney> menn0: wut
<davecheney> is there a flag to rbt to say "yes, i'd actually like to review this "
<davecheney> done
<davecheney> menn0: try again please
<davecheney> menn0: i've got a few more changes like this that are trying to pull apart the multi watcher to make it more understandable (for me)
<menn0> davecheney: finished chatting to Tim now. looking at your change.
<menn0> davecheney: Ha ... I kept reading InfoId and "Infold"
<menn0> s/and/as/
<thumper> axw: sha'ping
<davecheney> rb is a useless sack of shit
<davecheney> if I do rbt post on a branch that has already been posted, it creates a new review
<davecheney> % rbt publish 120
<davecheney> ERROR: Error publishing review request (it may already be published): Object does not exist (HTTP 404, API Error 100)
<davecheney> why is this an error ?
<davecheney> can't the tool look at the status of the review and see it's been published
<davecheney> menn0: what is the command to upload a new diff to an existing review ?
<menn0> "rbt post -r <the review id>"
<davecheney> can rbt remember that an existing branch is attached to a review ?
<davecheney> ok, that review is totally screwed
<davecheney> just getting 500's
<davecheney> i'll make another one
<davecheney> http://reviews.vapour.ws/r/120/diff/#
<davecheney> should be correct now
<menn0> davecheney: review of 120 done. thumper needs to meta-review.
<davecheney> menn0: I have to remove that type because otherwise I cannot break the import loop between apiserver/params and state/multiwatcher
<davecheney> menn0: fwiw, i disagree with your comment
<davecheney> this type adds nothing in terms of type safety
<menn0> davecheney: I know you do, we've discussed this before :)
<davecheney> it leads people to think that f(id multiwatcher.InfoId) takes things of a specific type
<davecheney> (it doesnt')
<menn0> davecheney: I know it doesn't add type safety but it does add the other benefits I mentioned
<davecheney> and I am neutral on the readability argument
 * thumper looks
<menn0> davecheney: if it helps with the import loop breaking then let's do it. that's a bigger win.
<davecheney> kk
<menn0> davecheney: probably should have mentioned that reason in the commit message or review description
<davecheney> yes, my bad
 * menn0 is out for a bit. errands to run.
<davecheney> reviewboard is telling me I have -1 Open incoming reviews
<davecheney> now i have -2
<davecheney> anyone seen this failure ?
<davecheney> http://paste.ubuntu.com/8452331/
<davecheney> it's sporadic
<davecheney> http://reviews.vapour.ws/r/122/diff/
<davecheney> menn0: can you tell me how to do a dependent change with rbt ?
<davecheney> i'm guessing there is a flag for rbt post
<menn0> davecheney: I've never done it myself but I believe it hinges on the --parent option
<menn0> davecheney: re that failure - I have seen it before but it looks like an ordering problem that might be fixed by using SameContents
<davecheney> menn0: same
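The checker menn0 mentions (`jc.SameContents`, from the github.com/juju/testing/checkers package) asserts that two slices hold the same elements regardless of order, which is what an intermittently ordered test result needs. A dependency-free sketch of the same idea:

```go
package main

import (
	"fmt"
	"sort"
)

// sameContents reports whether a and b hold the same elements,
// ignoring order -- the property jc.SameContents checks in a
// gocheck suite.
func sameContents(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	as := append([]string(nil), a...)
	bs := append([]string(nil), b...)
	sort.Strings(as)
	sort.Strings(bs)
	for i := range as {
		if as[i] != bs[i] {
			return false
		}
	}
	return true
}

func main() {
	got := []string{"machine-1", "machine-0"}
	want := []string{"machine-0", "machine-1"}
	fmt.Println(sameContents(got, want)) // order no longer matters
}
```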
<davecheney> lp has forgotten my login
<davecheney> i'll log a bug when I go downstairs and get my key for 2fa
<menn0> davecheney: k
<davecheney> menn0: if you ahve time, http://reviews.vapour.ws/r/122/diff/
<davecheney> opens the door to my next fix which makes params.EntityId an interface
<menn0> davecheney: give me a little while
<davecheney> client_test.go:2448: c.Assert(client.AddCharm(curl), gc.IsNil, gc.Commentf("goroutine %d", index))
<davecheney> ... value *params.Error = &params.Error{"", "cannot add charm to storage: unexpected deletion of resource catalog entry with id \"47d6e7812099383e8266f590d37b7f5369ef43931aecc72a4b86674e7052f346a391648209b8e69ea5b649d2c0e86da0\": Resource not available because upload is not yet complete"}
<davecheney> ... goroutine 0
<davecheney> [LOG] 0:00.462 DEBUG juju.storage managed resource entry created with path "environs/90168e4c-2f10-4e9c-83c2-feedfacee5a9/charms/cs:precise/wordpress-3-082e539d-946e-44ef-8684-489fc4bcfc3f" -> "47d6e7812099383e8266f590d37b7f5369ef43931aecc72a4b86674e7052f346a391648209b8e69ea5b649d2c0e86da0"
<davecheney> more intermediate failures
<thumper> axw: ping
<thumper> menn0: where are we on these branches?
<thumper> davecheney: re 122 above, shipit
<menn0> thumper: just about to push the API server change again
<davecheney> thumper: ta
<menn0> thumper: then one more manual test of the env uuid unit change and that can be merged
<thumper> menn0: ok
<thumper> menn0: let me know when I need to look at that review again
<menn0> thumper: just manually testing that branch now
<thumper> kk
<menn0> thumper: ok. http://reviews.vapour.ws/r/119/
 * thumper looks
<davecheney> menn0: still LGTM
<davecheney> trying to auth as an arbitrary user is pretty gross
<davecheney> would you consider raising a ticket/card/smoke signal
<davecheney> to have a proper method added somewhere to do this ?
<thumper> menn0: one small thing
<thumper> menn0: now that we are treating any error as maintenance in progress
<menn0> davecheney: *nod* I'll create a ticket
<thumper> menn0: you don't really need to create a fake user tag
<thumper> menn0: you could just pass through the agent tag
<menn0> thumper: well not quite
<thumper> no?
<menn0> thumper: the login validator in jujud returns nil for the local machine
<thumper> ah...
 * thumper nods
<menn0> thumper: because the local machine is always allowed to login
<thumper> yeah...
<menn0> thumper: but if that breaks due to db migrations then we still want to know we're in upgrade mode
<thumper> got it
<thumper> please add that to the comment
<menn0> thumper: ok
<thumper> so the next person doesn't change it
<menn0> thumper: and as per davecheney I'll add a TODO referring to a ticket to get this done in a cleaner way
<davecheney> thanks
<davecheney> the next person will probably be me
<davecheney> and i'll go through trampling that
<davecheney> % echo $?
<davecheney> 0
<davecheney> sweet, sweet, tests
<thumper> number of tests run: 0
<menn0> thumper: do you want to see that change again or should I land it?
<menn0> hmm my env uuid for units branch is getting lots of merge conflicts with upstream ... for files I haven't touched
<menn0> I wonder what's up with that
<davecheney> thumper: menn0 i think i'm at the point that I have to stop nibbling around the edges and invert the dependenceies between state/multiwatcher and apiserver/params
<thumper> menn0: just land it
<menn0> thumper: doing it now
<thumper> davecheney: I'm ok with that
<thumper> menn0: NFI re conflicts
<menn0> thumper: it's ok. there were tiny unit env uuid changes in those files. lots of conflicts with recent ports work.
<menn0> thumper: i'm almost done resolving.
<thumper> ugh
<thumper> ok
 * thumper EODs
<thumper> dog walk time
<dimitern> morning all
<dimitern> tasdomas, as OCR, can you please review http://reviews.vapour.ws/r/117/ ?
<dimitern> fwereade, jam, TheMue, you might be interested as well ^^
<jam> morning dimitern, looking
<dimitern> jam, cheers!
<jam> dimitern: I'd like to see it tweaked, but I'm willing to discuss it
<dimitern> jam, thanks, I realized I've missed something
<jam> dimitern: what's that?
<dimitern> jam, we shouldn't be opening and closing ports like that at every hook commit
<dimitern> jam, even though there won't be an error as they will be opened or closed already
<jam> dimitern: do you commit hooks before they are done?
<jam> ah, this is pre-populated, not the stuff which is currently changing
<jam> dimitern: yeah
<dimitern> jam, it's not pre-populated because I realized it needn't be
<dimitern> jam, the uniter hook commands is the only way to open and close ports on *that* unit
<dimitern> jam, so we should cache what's opened and closed, to show it later with the opened-ports hook tool (in a follow-up), but use temp lists for pending ports, which are cleaned up at hook commit time
<jam> dimitern: sgtm
<tasdomas> morning
<dimitern> jam, updated - http://reviews.vapour.ws/r/117/diff/1-2/
<jam> dimitern: so why do we track "Closed" ports in the portRanges map, isn't everything closed that hasn't been opened ?
<jam> or this is both "things I've requested" and "things that were already there"
<dimitern> jam, I was thinking we can have both "opened-ports" and "closed-ports" hook tools in a follow-up, which will use the map
<dimitern> jam, it might be useful for charms to see what ports it requested to be closed as well as opened
<jam> dimitern: k, "opened" or "opening" ? I'm just trying to figure out what portRanges buys us above just openingPorts and closingPorts
<dimitern> jam, the portRanges map is not reset between hooks, unlike the slices
<dimitern> jam, and they need to be reset so we don't unnecessarily issue api calls on each hook commit for the port ranges already opened before
<TheMue> morning
<dimitern> morning TheMue
<tasdomas> dimitern, I've submitted a review
<jam> dimitern: am I *very* choppy ?
<jam> or just a little?
<dimitern> tasdomas, cheers, I've updated the review and some of your suggestions are implemented, can you take another look?
<tasdomas> dimitern, is there a need to add the OpeningPorts and ClosingPorts methods to the interface?
<dimitern> tasdomas, what do you suggest instead?
<tasdomas> dimitern, am I missing something or are they only used in tests?
<dimitern> tasdomas, for now yes, but this might change
<dimitern> tasdomas, but then again, it makes sense to only add them to export_test
<tasdomas> dimitern, exactly
<dimitern> as of today
<dimitern> fwereade, ping
<tasdomas> dimitern, I'm also a bit concerned about how conflicting operations will be handled
<tasdomas> dimitern, open(100-200) followed by a close(100-200) should probably result in a nop?
<tasdomas> dimitern, unless 100-200 was already open
<fwereade> dimitern, heyhey
<jam> tasdomas: I'd probably say "yes but that is a bit of an edge case in a buggy charm" so we're allowed to have it be slightly undefined behavior
<jam> because you'd have to determine that the first open() is actually the noop.
<jam> If we really want it, then we could make the total list of open + close ordered
<jam> so one slice with (true, 100-200), (false, 100-200), (true, 100-200) etc
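jam's ordered-slice idea can be sketched as replaying open/close operations in hook order so that an open followed by a close of the same range cancels out (simplified types, not the real uniter code):

```go
package main

import "fmt"

// portOp records one open or close request in hook order:
// one ordered slice instead of separate open and close lists.
type portOp struct {
	open bool
	rng  string // e.g. "100-200/tcp"; simplified to a string key
}

// resolve replays ops in order; an open followed by a close of the
// same range cancels out, leaving only the net effect to apply.
func resolve(ops []portOp) map[string]bool {
	net := make(map[string]bool)
	for _, op := range ops {
		if op.open {
			net[op.rng] = true
		} else {
			delete(net, op.rng)
		}
	}
	return net
}

func main() {
	ops := []portOp{
		{true, "100-200/tcp"},
		{false, "100-200/tcp"},
		{true, "8080-8080/tcp"},
	}
	fmt.Println(resolve(ops)) // only 8080-8080/tcp survives
}
```

As tasdomas notes, a complete version would also have to consult what was already open before the hook ran, so that closing a pre-existing range is not mistaken for a no-op.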
<jam> morning fwereade
<fwereade> jam, heyhey
<dimitern> fwereade, I'd appreciate it if you find some time to review my open/close-port sandboxing branch  http://reviews.vapour.ws/r/117/ - and read a bit of scrollback with comments from jam and tasdomas
<fwereade> dimitern, will do, but I have to do gsamfira's ones first of all
<dimitern> fwereade, sure
<dimitern> fwereade, it will be easier maybe if we do a quick g+ talk when you can
<fwereade> dimitern, ok, I had a quick look at that because it seemed smaller ;p
<fwereade> dimitern, the things I immediately wonder are:
<fwereade> dimitern, do we have some mechanism for checking against what's already opened/closed?
<fwereade> dimitern, and, how will we integrate this with per-relation port open/closes?
<dimitern> fwereade, well, that's why I suggested a g+, it will be easier talking than typing
<fwereade> dimitern, ok then :)
<dimitern> jam, tasdomas, ok, after a chat with fwereade I'll discard http://reviews.vapour.ws/r/117/ and do a prereq first that adds a method to uniter api to retrieve all machine ports and cache them in the uniter, so we can both check for conflicts and do sandboxing
<jam> dimitern: well, I would guess you can use most of what you already have, just an extra step at  the start?
<tasdomas> dimitern, understood - I was thinking about that solution, I'm just worried that there would still be a window present, where conflict could emerge
<dimitern> tasdomas, even with the extra checks against machine ports, there will still be a very minor possibility for open/closePorts to fail at finalize time, but that's OK, as we'll catch and handle most other cases
<tasdomas> dimitern, sounds good to me
<jam> TheMue: standup?
<TheMue> omw
<perrito666> morning
<rogpeppe> i just wanted to use juju with a local environment. it's not working... anyone got any idea of what might be happening here and how I might be able to fix it? http://paste.ubuntu.com/8454648/
<rogpeppe> hmm, perhaps it was as simple as just apt-get install lxc. somehow that must have got uninstalled at some point.
<rogpeppe> ha, it works now. phew.
<rogpeppe> those error messages were not great though.
<mgz> wow, that is a bit of a mess rogpeppe
<rogpeppe> mgz: mmm
<mgz> I was expecting the "must install juju-local" message
<dimitern> fwereade, jam, http://reviews.vapour.ws/r/123/ - uniter api changes, as discussed
<fwereade> dimitern, cheers, just popping out but back shortly
<dimitern> fwereade, sure, np
<jam> dimitern: my first thought is "why does *Uniter* have a Machine implementation", it seems the wrong thing for a uniter to think about.
<jam> maybe it needs to know the machine it is on isn't dying?
<dimitern> jam, because the ports are on the machine, not on a unit anymore
<dimitern> jam, and the only reason to have a Machine object is to be able to call AllPorts() on it
<jam> dimitern: so looking at the data structures, you expose MachinePortsResults which holds a slice of MachinePortsResult which holds a slice of MachinePortRange which uses a UnitTag and a PortRange
<jam> but nothing there uses the NetworkTag
<jam> And what do we do if more than one Unit asks for a similar port to be open?
<jam> Is it just a conflict ?
<dimitern> jam, the idea is to return all ports on the machine, regardless of network
<dimitern> jam, when more than one unit tries to open a conflicting range, we'll detect and not allow it to happen
<jam> dimitern: but especially in the case of ranges, saying "I want these open" doesn't actually have to be a problem does it?
<dimitern> jam, I'm not sure I follow - can you expand a bit?
<jam> if I say "i need this machine to expose 10-100, and someone else says 90-200", we could just open 10-200 and be done
<jam> I suppose it would be clearer for the charm to fail with "you can't actually use 90-100" so it doesn't try to configure the application to use them?
<jam> dimitern: ^^
<dimitern> jam, nope
<jam> dimitern: nope ? not sure what you are saying no to
<dimitern> jam, if unit 0 says open 10-100/tcp, that's fine; later unit 1 says open 90-100/tcp - there's a conflict and we won't allow it
<dimitern> jam, we can't have concurrently running open-port requests, as there's only one hook running at a time on the machine
<dimitern> jam, port ranges are bound to units that requested them, so 10-100/tcp + 90-200/tcp for different units != 10-200/tcp
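Because ranges are bound to the requesting unit, conflict detection reduces to an interval-overlap check against ranges held by other units. A hedged sketch (hypothetical helper, not the real state code):

```go
package main

import "fmt"

// unitRange ties a port range to the unit that opened it, matching
// the point above: 10-100/tcp and 90-200/tcp from different units
// conflict rather than merging into 10-200/tcp.
type unitRange struct {
	unit     string
	from, to int
}

// overlaps reports whether two inclusive ranges intersect.
func overlaps(a, b unitRange) bool {
	return a.from <= b.to && b.from <= a.to
}

// checkConflict returns an error if r overlaps a range held by a
// different unit; a unit's own existing ranges don't conflict.
func checkConflict(existing []unitRange, r unitRange) error {
	for _, e := range existing {
		if e.unit != r.unit && overlaps(e, r) {
			return fmt.Errorf("%s cannot open %d-%d: conflicts with %s's %d-%d",
				r.unit, r.from, r.to, e.unit, e.from, e.to)
		}
	}
	return nil
}

func main() {
	held := []unitRange{{"wordpress/0", 10, 100}}
	fmt.Println(checkConflict(held, unitRange{"mysql/1", 90, 200}))
	fmt.Println(checkConflict(held, unitRange{"wordpress/0", 90, 100}))
}
```

As jam observes, failing loudly here is kinder than silently widening the range, since the second charm would otherwise configure itself to use ports it cannot actually hold.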
<wwitzel3> perrito666, natefinch: standup
<perrito666> wwitzel3: going
<natefinch> wwitzel3: coming one sec, grabbing my coffee from the other room
<jam> natefinch: wwitzel3: did you see the email from menno about syslog and cert issues?
<wwitzel3> jam: yep, the one about log message spam?
<jam> wwitzel3: about not being able to connect because of an issue with "no IP SANs"
<jam> after upgrade
<wwitzel3> jam: looking at it now
<jam> wwitzel3: k, I'd chat with nate about it, because I think he discovered the golang issue for the HTTP API stuff
<jam> I think the issue is that whatever workaround we did for the API we need to do for syslog
<wwitzel3> jam: ack
<alexisb> fwereade, ping
<fwereade> alexisb, pong
<alexisb> hey there
<alexisb> do you mind if I reschedule our 1x1?
<alexisb> my little guy is still sleeping and I dont want to wake him to go to his nanny given he has been sick
<alexisb> so I will be on the road for our normally scheduled time
<alexisb> fwereade, ^^
<fwereade> alexisb, np
<alexisb> thanks
<alexisb> any particular day that works best for you?
<alexisb> fwereade, ^^
<fwereade> alexisb, tomorrow isn't great, how about weds?
<fwereade> alexisb, or thurs?
<alexisb> I can do a thursday morning, I will reschedule for them, thanks for being flexible fwereade !
<fwereade> alexisb, no worries, glad to be of service
<ericsnow> natefinch: FYI, I fixed that issue on http://reviews.vapour.ws/r/103/
<natefinch> ericsnow: can you print the name and the type of the value, instead of a generic string?  Would make debugging easier.
<ericsnow> natefinch: sure
<ericsnow> natefinch: you know, you can leave comments on the review :)
<natefinch> ericsnow: sometimes it's faster just to talk on irc, but I just left a message there too
<ericsnow> natefinch: thanks
<hazmat> fwereade, didn't you have some writing provider getting started docs? not seeing them in tree.. got  a partner thats interested
<dimitern> fwereade, TheMue, jam, tasdomas, re-proposed open(close)-port sandboxing for the uniter, please take a look http://reviews.vapour.ws/r/125/
<jcw4> mgz: this error looks suspiciously like a build script merge failure -
<jcw4> /var/lib/jenkins/juju-release-tools/make-release-tarball.bash: line 115: syntax error near unexpected token `<<<'
<jcw4> natefinch: do we have a UTC-0400 to -0700 timezone person who knows the build server stuff?
<jcw4> perrito666: surely I can count on at least you to be around :)
<mgz> that looks odd
<jcw4> whew
<jcw4> mgz: still not sure if that's the issue but the last couple builds failed with that error
<mgz> wassit on? a landing?
<mgz> okay, fixing.
<jcw4> yep
<jcw4> http://juju-ci.vapour.ws:8080/job/github-merge-juju/837/ and http://juju-ci.vapour.ws:8080/job/github-merge-juju/838/
<mgz> yup... conflictishy
<mgz> okay, should work now
<jcw4> wow.  thanks mgz
<mgz> I'm also going to make another change with some fixes, will notify when it's through
<jcw4> thanks mgz  should we start to land again or wait for your fixes?
<mgz> try a landing now if you're waiting
<jcw4> k
<bodie_> mgz++
<ericsnow> natefinch: could you have another look at 103?
<natefinch> ericsnow: ok
<ericsnow> natefinch: thanks
<natefinch> ericsnow: ship it
<ericsnow> natefinch: thanks
<arosales> thumper: fyi we got some new info @ https://bugs.launchpad.net/juju-core/+bug/1375268
<mup> Bug #1375268: Juju Panic'ing on MAAS Power8le Environment <bootstrap> <maas-provider> <ppc64el> <juju-core:Triaged> <https://launchpad.net/bugs/1375268>
<thumper> arosales: ta, looking
<arosales> thumper: natefinch had done some debugging to identify the panic, but didn't have anything conclusive
<thumper> arosales: I've read all that...
<arosales> thumper: let us know if you need any other info, or access
<thumper> access to the maas would be helpful
<thumper> and the instance that is having problems
<arosales> mbruzek: do you have docs you can send over to thumper on accessing the maas IBM system?
<thumper> the logs don't show any information for the stack frame that actually is the one panicing
<thumper> which is somewhat surprising to me
<thumper> arosales: mbruzek: info also to davecheney plz
<arosales> mbruzek: also be interesting to see if you have reproduced on the other maas environment
<mbruzek> thumper: arosales I will send that out right now.  Tim from IBM needs to make him an account.  We could jump in a hangout if you want to see my screen now.
<thumper> mbruzek: I have meetings starting in a minute for a while
<arosales> mbruzek: I guess the VPN is specific to individuals
<thumper> davecheney: are these the lines that indicate a bad compiler? juju[5386]: bad frame in setup_rt_frame: 0000000000000000 nip 0000000000000000 lr 0000000000000000
<davecheney> yes
<davecheney> thumper: hold fire
<davecheney> writing you a long email so that you too can know all there is to know
<perrito666> wow an email containing "all there is to know" must be insanely long
<perrito666> I would really dig doc strings on things like state/watcher.go
<davecheney> perrito666: brevity is not my strong suit
 * perrito666 imagines thumper getting an email with 42 as body text
<perrito666> I would also dig a trackball, I have too much junk on my desktop to be able to move the mouse :/
<jcw4> thumper: per our hangout last week: http://reviews.vapour.ws/r/127/
<thumper> jcw4: cheers, will look when I have the kids back from swimming :)
<jcw4> menn0: it's about EnvUUID so your feedback appreciated too :)
<jcw4> thx thumper
<menn0> jcw4: having a look
<menn0> jcw4: review done
<jcw4> menn0: much appreciated
<perrito666> menn0: tx for your mail it was enlightening
<menn0> perrito666: good! it was useful for me to write it. I learned a few things while making sure I was giving you correct information :)
<perrito666> sadly I cannot have a watcher since I am nuking the db in this process so I am looking for another way to signal my worker
<davecheney> http://reviews.vapour.ws/r/126/
<davecheney> what's going on here
<davecheney> i thought that backups were going to be streamed back to the client, not shoehorned into the api server
<perrito666> davecheney: as per fwereade design and I believe we agreed on that in Las Vegas, backups are stored in the state server and downloaded on demand
<perrito666> I am not sure if any of those things include shoehorning since I am having some difficulty picturing the meaning of it in my head :p
<davecheney> perrito666: yes
<davecheney> that is not in question
<ericsnow> davecheney: That patch provides all the boilerplate needed plus a very basic implementation of the data transfer
<davecheney> but downloading them via encoding them into json is bad
<ericsnow> davecheney: I agree sending a []bytes over the wire is not a valid solution
<davecheney> ericsnow: i thought the plan was to add some bulk download api
<davecheney> i thought that had happened
<ericsnow> davecheney: not that I'm aware
<davecheney> bummer
<ericsnow> davecheney: I agree that download should support bulk calls
<ericsnow> davecheney: upload as well (when we get to that)
#juju-dev 2014-09-30
<davecheney> I now have -3 open reviews in my RB incoming review queue
<perrito666> is that a minus?
<thumper> davecheney: those are your reviews I think
<thumper> haha
<thumper> if it is, that's funny
<davecheney> thumper: how many (negative) reviews do you have ?
<thumper> personal ones are positive, juju-team shows -3
<jcw4> davecheney: that seems like double entendre
<thumper> jcw4: I'm looking at yours now
<jcw4> thumper: cool... I hope it's not negative
 * jcw4 cracks himself up
<davecheney> yay software
 * thumper wonders what jcw4 will think of this...
 * jcw4 waits with anticipation
<jcw4> unless you hit publish in the next 2 minutes though my anticipation will have to stretch til after family supper.
<jcw4> ;)
<davecheney> the man knows how to negotiate
<jcw4> haha
 * thumper recoils from that idea after reading the code, and goes back to the review
 * jcw4 suspects recoil isn't usually used in context of a happy review
<thumper> it isn't too bad, just getting my head around this
<jcw4> thumper: you should know that you bang your head on your desk a lot in irc... my thin skin can't take too much of that m'kay?
<jcw4> :)
<thumper> jcw4: do we have a way to remove actions and their results yet?
<jcw4> thumper: no
<jcw4> thumper: we do have a forthcoming 'Cancel' API call
<jcw4> thumper: but that's not quite the same thing
<thumper> pretty sure I'm not going to want every call around for ever :)
<thumper> hmm...
<jcw4> thumper: +1... should that be handled with an API call (ArchiveActions) or within Jujud somehow?
<thumper> jcw4: probably some api call
 * jcw4 adds that to the actions api branch
 * jcw4 is off to supper
<thumper> oh FFS
 * thumper wanted to edit that
<thumper> wrong button
<axw> thumper: pong (it was a public holiday yesterday)
<thumper> axw: hey, guessed that in the end
<thumper> axw: got a few minutes?
<axw> sure
<thumper> axw: https://plus.google.com/hangouts/_/gvxqrkm74ije2z64ochcr5bbhia?hl=en
<sebas5384> hey there o/ :)
<sebas5384> i'm building a juju api client
<sebas5384> but i don't know why i only can connect to the socket ws://juju-gui:8001/ws
<sebas5384> which i suppose is being proxyed or something like that
<rick_h_> sebas5384: well the juju-gui talks to a proxy service on the charm
<sebas5384> because that socket is already knowing to which environment to communicate
<sebas5384> hmmm yeah
<rick_h_> sebas5384: it does that to provide services like bundle deployments which the juju api doesn't have at this time. So it intercepts some calls to do actions manually
<sebas5384> rick_h_: hi o/
<rick_h_> sebas5384: but most calls it just proxies through to the state server
<sebas5384> hmmm i was imagining something like that
<sebas5384> can you show me the code of that proxy ?
<sebas5384> I want to use it as an example of how to connect to the api state server
<rick_h_> sebas5384: yep, looking now, it's the 'guiserver' bit http://bazaar.launchpad.net/~juju-gui-charmers/charms/trusty/juju-gui/trunk/files/head:/server/guiserver/
<sebas5384> i'm taking a look at it
<rick_h_> sebas5384: http://bazaar.launchpad.net/~juju-gui-charmers/charms/trusty/juju-gui/trunk/view/head:/hooks/utils.py#L140 might be of interest to you
<sebas5384> hmmmm here it is:)
<sebas5384> http://bazaar.launchpad.net/~juju-gui-charmers/charms/trusty/juju-gui/trunk/view/head:/server/guiserver/handlers.py
<rick_h_> sebas5384: since it sounds like you really want the real ip addr of the state server
<sebas5384> yeah exactly!!
<rick_h_> sebas5384: but yea, there's some magic bits in the juju-gui charm so worth going through some of that.
<rick_h_> sebas5384: hope that helps
<sebas5384> definitely, rick_h_!
<sebas5384> thanks man! you are always helpful :)
<sebas5384> rick_h_ ++
<rick_h_> np, have fun. I'm out for the night.
<sebas5384> rick_h_: o/
<axw> thumper: I just saw your email from yesterday about the 403s; it *shouldn't* be using the proxy for uploading to provider storage (probably the no_proxy mask isn't quite right)
<axw> thumper: that wouldn't happen on master tho
<jcw4> thumper: responded to your comments... great review. thank you.
<thumper> jcw4: the old name or the new name?
<thumper> jcw4: if Name -> Id
<thumper> and Id() returns the localID
<thumper> all good
<jcw4> thumper: good
<jcw4> thumper: did you get your first question about old vs. new name answered?
<jcw4> oh... I think it all goes together
<thumper> it does :)
<jcw4> in which case I'm clear and that's cool
<thumper> good
<thumper> jcw4: names.Tag represents any tag
<jcw4> k
<thumper> jcw4: names.UnitTag is a concrete implementation of a tag
<jcw4> right
<thumper> since the design leans towards different receiver types
<jcw4> ah
<thumper> it makes sense to be a names.Tag rather than a UnitTag
<jcw4> got it
<thumper> we can then do type switches on it
<thumper> and know that it is well formed
<thumper> tags have become more than just API level
<thumper> they are typed key values
<jcw4> so if we use Tags instead of string ids for the receiver,  we make the parameter the general Tag type
<jcw4> thumper: okay, so it's okay to use them internally
<thumper> the document should keep a string value
<thumper> but it is internal
<thumper> the public interface to the Action should use a Tag
<thumper> IMO
<jcw4> +1
<thumper> jcw4: state.User now takes a names.UserTag arg
<jcw4> but the string repr. is the tag version instead of the internal Name() version
<thumper> so we know that it is valid
<jcw4> k
<thumper> jcw4: no, just store the Id() of the tag
<jcw4> right
<thumper> which is the same value you are storing now
<jcw4> hmm; I see in the case of Units yes
<jcw4> I was distracted because the internal representation of action* id's are different than the tag version
<sinzui> thumper, do you have a minute to review https://github.com/juju/juju/pull/862
 * thumper looks
<thumper> sinzui: do we have a good power build now?
<sinzui> thumper, we don't know yet. I have updated all the machines and sent 1.20.9 off to the builders to find out
<thumper> sinzui: ok, thanks for the info
<wwitzel3> jam: for bug #1375507 , should I be looking at going the NonValidatingClient route? should we generate new certs and leave the client as is?
<mup> Bug #1375507: rsyslog worker continuously restarts due to x509 error following upgrade <logging> <rsyslog> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1375507>
<wwitzel3> menn0: what provider did you use when you noticed the bug ^
<menn0> wwitzel3: local
<menn0> wwitzel3: it's probably not relevant but I can give you the script I was using to set up the 1.20 env
<wwitzel3> menn0: actually, that'd be helpful .. my local is giving me a bit of a different error
<menn0> ok
<menn0> wwitzel3: i'll email it to you now
<wwitzel3> thanks
<wwitzel3> menn0: also were you using 1.20 from the git tag? Or a pre-built?
<menn0> wwitzel3: pre-built - I noticed the problem coming from both 1.20.7 and 1.20.8 from the stable PPA
<wwitzel3> menn0: k
<menn0> wwitzel3: just sent the script
<menn0> wwitzel3: I was using it to test upgrades for the units collection migration to using env uuids in the _id
<menn0> wwitzel3: the rsyslog issue is just something I noticed incidentally
<menn0> wwitzel3: wait for all agent-states to be "started" before running "juju upgrade-juju --upload-tools" with the juju built from master
<wwitzel3> menn0: thanks again, I'll give it a whirl right now
<menn0> wwitzel3: ok
 * thumper feels much better after a few hours coding
 * thumper EODs
<thumper> jam: FWIW, just implemented 'juju api-info' with nice options
<thumper> jam: will propose tomorrow
<jam> thumper: sounds good
<axw> jam: are you still maintaining the old juju landing bot? I tried to land a contribution to gwacl, and it spat the dummy
<axw> `/bin/sh: 1: Syntax error: "&&" unexpected`
<jam> axw: mgz is probably more aware of that bot than I am.
<axw> okey dokey
<axw> thanks
<jam> axw: I'll see if I can find it
<jam> axw: can you link the MP ?
<axw> jam: https://code.launchpad.net/~mark-sheahan-ms/gwacl/cert-args/+merge/235889
<axw> I can merge manually if it's too much hassle, unit tests should be a pretty safe bet
<dimitern> morning all
<wwitzel3> dimitern: morning
<dimitern> wwitzel3, hey, isn't it way too late there?
<dimitern> :)
<jam> dimitern: isn't it a bit too early there :)
<jam> I think its only about midnight for wwitzel3
<dimitern> jam, 8:44, nice and early :)
<wwitzel3> dimitern: yeah, it isn't super late, but mostly I'm sick and not sleeping well, so figure I'd be productive since I was up anyway
<dimitern> wwitzel3, oh, get well soon then!
<wwitzel3> dimitern: my goal is to not be sick at the sprint, that would be awful
<dimitern> wwitzel3, yeah, I almost managed to get a cold in the past few days, but it's ok now
<wwitzel3> dimitern: I hung out with some friends who have kids on Saturday and felt it coming on Sunday .. I should know better than to leave the house :P
<dimitern> jam, can I bother you for a couple of early reviews? :) http://reviews.vapour.ws/r/123/ and http://reviews.vapour.ws/r/125/
<dimitern> wwitzel3, is it cold already in fl ?
<wwitzel3> dimitern: my friends tell me no, not yet, I'm up in NC
<wwitzel3> (north carolina)
<dimitern> wwitzel3, ah, right
<dimitern> fwereade, hey, I know it's awful early, but can you take a look as well when you can? ^^
<jam> dimitern: commented on http://reviews.vapour.ws/r/123/
<jam> fwereade: I'd like your input on the idea of "Does adding a method require us to bump the API version"
<jam> dimitern: and a comment on http://reviews.vapour.ws/r/125/
<dimitern> jam, thanks!
<dimitern> jam, well, it's the agent api, and if we needed to do that for the uniter api, we already missed the train a few times with the actions and metrics, etc.
<jam> dimitern: "we made mistakes in the past, so clearly we shouldn't stop making mistakes now"
<dimitern> jam, :)
<dimitern> jam, btw Tag() returning names.Tag is wrong
<dimitern> jam, it's only like that in state because of FindEntity() and things like that
<dimitern> jam, in other places, esp. in the api we should use proper tag types - return them and take them as args; that was one of davecheney's comments on one of my previous PRs
<dimitern> jam, introducing versions into the uniter api server facade, considering how big it is, will require massive changes to the PR
<jam> dimitern: grep -rnI "func.*Tag()" . shows an awful lack of common implementations
<dimitern> jam, yes, but we're moving towards better consistency in the implementations
<jam> dimitern: type  UniterV1 struct { Uniter } is not particularly large
<dimitern> jam, and testing both versions without much code duplication will need all test suites to be refactored
<jam> dimitern: so I agree we should move towards consistency, though (IMO) having a consistent "you can call Tag() on objects to get a Tag describing it" seems useful, and Go tends to require that if you ever want to put those objects into an Interface the method has to return identical types
<dimitern> jam, like what I did for the firewaller
<dimitern> jam, yeah, that kinda sucks, esp. having to do apiuniter.Unit(stateUnit.Tag().(names.UnitTag))
<jam> dimitern: I feel like the pain of Versioning is a pain we need to endure sooner rather than later, as there will always be "its a lot of work" for anything we want to do.
<jam> dimitern: I can live with us agreeing it isn't worth it
<jam> but we should discuss that with more than just 2 people
<dimitern> jam, much of the stir around serializing and deserializing tags for the api can be solved by making the tags handle that and use tags instead of strings in params/results
<dimitern> jam, so how about instead of adding a new version of the api in this PR, I add some fallback code in the uniter to handle IsCodeNotImplemented ?
<jam> dimitern: as stated, I don't want it to just be you and I that come up with what is good enough in this case. I want at least fwereade and maybe a wider agreement from the list.
<jam> dimitern: there is lots of ways we can reduce the work (like just using what you have right now which is even easier for you)
<jam> dimitern: but whatever we do is going to be precedent (like Metrics, etc), and I'd like the group to have decided what we are doing
<dimitern> jam, considering the rush to implement a lot of features in the past months, by different teams, even outside contributors, the mess will only get worse if we don't gate api changes with some process we agree upon
<dimitern> jam, ok, while waiting for consensus I'll switch to container addressability to prepare some tasks for breakdown and estimation
<axw> reviewboard's dead? :(
<dimitern> axw, works for me btw
<axw> dimitern: it started working again
<axw> guess someone restarted it ...
<perrito666> morning
<dimitern> morning perrito666
<dimitern> jam, standup?
<tasdomas> hi
<tasdomas> I'm having problems trying to bootstrap a juju environment on amazon using the latest master
<tasdomas> http://paste.ubuntu.com/8465277/
<perrito666> fwereade: ping me if you have a spare moment
<fwereade> perrito666, I have to be out in 10 mins, let's be quick
<perrito666> mattyw: remember to get approval from someone more senior than I
<mattyw> perrito666, thanks very much for the review
<mattyw> and I will
<perrito666> ericsnow: wwitzel3 I am in the noisiest place ever, I am bailing out
<perrito666> of the call
<ericsnow> perrito666: pity
<ericsnow> perrito666: I need to talk to you about backups at some point
<perrito666> ericsnow: well if you need to I can squeeze myself into a quieter place
<ericsnow> perrito666: it can wait
<perrito666> they seem to have these meeting booths
<perrito666> ericsnow: gimme a moment and Ill be there
<perrito666> cameraless as usual
<perrito666> ericsnow: I am in
<perrito666> ericsnow: wwitzel3 you both just froze
<perrito666> I guess itsme
<ericsnow> perrito666: yep
<perrito666> ericsnow: I am having some form of connection error to hangouts
<perrito666> :(
<perrito666> man rainy days really kill internets here
<perrito666> I have lost dns
<ericsnow> perrito666: :(
<perrito666> ericsnow: so, yes, the scp approach is far from perfect; if you are doing an upload/download mechanism, I am sure there are better things to use
<perrito666> ericsnow: I dont remember very well now but voidspace had done something for download
<ericsnow> perrito666: I'll see what I can come up with
<perrito666> maybe that can be used as a start point
<ericsnow> perrito666: yeah, that's what I'm doing
<ericsnow> perrito666: :)
<TheMue> oh, looks like my test error is a race. interesting, but ok, logically it made no sense
<tasdomas> could somebody take a look at http://reviews.vapour.ws/r/124/ and http://reviews.vapour.ws/r/131/ ? Much appreciated!
<perrito666> tasdomas: looking
<tasdomas> perrito666, thanks!
<perrito666> tasdomas: 131 has already been reviewed by me earlier
<tasdomas> perrito666, thanks
<perrito666> tasdomas: done
<tasdomas> perrito666, thanks
<mgz> perrito666: care to stamp a trivial dep change for me?
<mgz> perrito666: <https://github.com/juju/juju/pull/867>
<perrito666> mgz: looking
<perrito666> mgz: lgtm
<mgz> ta
<perrito666> brb lunch
<wwitzel3> 0
<wwitzel3> I guess that is the number of things I'm winning at today
<wwitzel3> is there an easy way to get rendered URL and params of a FacadeCall?
<wwitzel3> figure a well placed Debugf should give me that ..
<alexisb> perrito666, what is your twitter handle?
<alexisb> hmmm
<benji> Going down.
<perrito666> alexisb: extremely late ping, my twitter handle is: perrito666
<perrito666> no surprise there :p
<jcw4> thanks thumper
<thumper> np
<cmars> thumper, hangout? or should i reschedule +1 hour
<thumper> cmars: can we reschedule for 30min?
<cmars> thumper, np
<arosales> thumper: working on deploying to a power8 maas cluster from a amd64 client
<arosales> thumper: we tried to use --arch=ppc64|ppc64el|ppc64le with no luck
<arosales> looking at the juju help it does not list ppc* as a supported arch
<thumper> shouldn't that be constraints?
<arosales> thumper: do you know the recommended way to deploy from an x86 client to a ppc maas cluster?
<thumper> no I don't, sorry
<arosales> juju deploy --constraints arch=ppc*
<arosales> s/deploy/constraints/
<arosales> thumper can you confirm ppc64* is a valid arch?
<thumper> yes
<thumper> I think ppc64, ppc64el and ppc64le all work
<thumper> so 'juju bootstrap --constraints arch=ppc64' should work
 * thumper crosses fingers
<arosales> from x86 it fails miserably
<arosales> trying from ppc64el
<arosales> machine
<thumper> arosales: fails how? and bug plz?
<arosales> thumper: will file, just trying to get something to work.
<arosales> mbruzek: has the details, but he is trying to get 1.20.9 to test that hypothesis
<jcw4> thumper: still cleaning up in response to your other points, but I updated http://reviews.vapour.ws/r/127/ with a couple questions for you
 * thumper looks
<thumper> jcw4: don't see your answers
<jcw4> thumper: hmm; last two points you made
<jcw4> basically you suggested when searching for actions or actionresults by actionreceiver that we could just use the receiver name instead of using docID on the receiver name
<jcw4> I was asking 1) don't we need to still include the EnvUUID prefix in the filter
<jcw4> and 2) can we safely ignore using QuoteMeta on the combination of env uuid and receiver name
<thumper> 1) yes, but you can specify the uuid field
<thumper> 2) avoid regex
<jcw4> thumper: +1
<jcw4> thumper: so just reconstruct the prefix with uuid + receiver name
<jcw4> thumper: but can I search for matching actions by prefix without regex?  is there a simple prefix search method I can use in the bson package?
<thumper> jcw4: just look for the specific fields bson.D{{"env-uuid", envUUID}, {"receiver", receiver}}
<jcw4> thumper: doh
<jcw4> thumper: got it
<jcw4> thumper: thanks
<thumper> cmars: still around?
<davecheney> thumper: fuck
<davecheney> ignore my half-sent mail
<davecheney> PEBKAC
<thumper> ack
<wwitzel3> anyone know where the FacadeCall stuff gets distilled down in to just a URL with params? I'm trying to Debugf the URL for the rsyslog facade calls.
<cmars> menn0, can i trouble you to take a look at my versioned login API PR, https://github.com/juju/juju/pull/392 ?
<menn0> cmars: sorry, was having lunch. looking now.
<perrito666> hi
#juju-dev 2014-10-01
<menn0> perrito666: hi
<jcw4> thumper: one more revision on http://reviews.vapour.ws/r/127/.  Addressed all your feedback.
<thumper> jcw4: ack
<jcw4> thanks thumper
<davecheney> thumper: do you have 2 minutes for the ppc64 postgame ?
<davecheney> make that 10
<thumper> davecheney: sure, in a few minutes? just want to propose this branch
<davecheney> kk
<davecheney> i'll jump in the standup hangout
<davecheney> i'll see you there when you're ready
<thumper> menn0,davecheney: http://reviews.vapour.ws/r/133/diff/
<davecheney> thumper: worst, description, ever
<davecheney> thumper: LGTM, i added some nits
<davecheney> you can ignore them if they are inconvenient
<thumper> ta
<menn0> thumper: I'm still looking at other reviews
<davecheney> thumper: command looks good
<davecheney> i like the little touches like sorting unknown values in the error message
<davecheney> and printing a single return argument differently
<davecheney> this was clearly a labour of love
<davecheney> shit, winton-09 is completely screwed
<davecheney> i'm going to try a chroot to get back to a normal trusty install
<menn0> cmars: I'm done reviewing the login PR (got interrupted by a furniture delivery, sorry)
<menn0> cmars: looks pretty good except for what happened to the "maintenance in progress" handling which is now broken I think.
<thumper> davecheney: still needs a ship it :)
<thumper> davecheney: yeah, I like to care how it feels to run the command
<menn0> thumper: I'm looking at the command now
 * thumper frowns
<thumper> why isn't rb updating?
<thumper> menn0: is `rbt post -u` the right thing?
<menn0> thumper: it often doesn't. I think it just uses the hash of the topmost rev on your branch to find the review to update
<menn0> if you've rebased or added more commits I don't think it works
<menn0> use -r instead
<davecheney> thumper: rbt 0r NNN
<davecheney> thumper: rbt -r NNN
<davecheney> otherwise you get a new review
<thumper> yeah, I think I got a new review
<wwitzel3> menn0: that script worked great btw, was able to replicate the errors, still don't have a fix yet, but trying to get there.
<menn0> wwitzel3: well at least you can replicate it. that's half the battle :)
<wwitzel3> menn0: at this point it was way less than half the battle, lol ;)
<menn0> wwitzel3: :)
<menn0> thumper: I have lots of feedback on that PR... still going
<thumper> menn0: really?
<menn0> thumper: yep
<thumper> geez
<thumper> menn0: not sure what your review mentor is going to think :-)
<menn0> thumper: :)
<menn0> thumper: my issues are not huge but I think the code can be simplified a bit
<menn0> thumper: done
<thumper> ta
<thumper> menn0: one reason I decided not to start with an example showing everything is because the output is really long
<thumper> the cacert takes up approx 25 lines
<menn0> thumper: ok fair enough. Could you shorten the cacert output in the example?
<thumper> sure... how?
<thumper> ... maybe
<menn0> thumper: yeah or "start ... end"
<menn0> substituting in what the start and end actually look like
 * thumper nods
<davecheney> thumper: sniffing for getpagesize(3) didn't work
<davecheney> it turns out that libgo already asks the os for the page size when _allocating_ heap via mmap(2)
<davecheney> it just doesn't use the page size when returning memory via madvise(2) :(
<thumper> damn
<davecheney> so, there goes the easy route
<davecheney> thumper: it might be easier to write a program that compiled under gccgo
<davecheney> can introspect its own runtime
<davecheney> 'cos it's not really juju that's busted
<davecheney> its the libgo
<davecheney> ie, write a test program that exercises libgo
<davecheney> but that sounds counter productive
<thumper> are you able to ask it the right questions?
<davecheney> ie, if you have to install something to test if you have a shitty libgo
<davecheney> then why not make the program depend on the _good_ libgo
<davecheney> and by installing it
<davecheney> fix the problem
<thumper> :)
<davecheney> the test program would aggravate the libgo bug
<davecheney> allocate lots of random structs full of pointers
<davecheney> then aggressively try to free memory to agitate the scavenger
<davecheney> cute exercise
<davecheney> but probably tangential to giving the charmers a tool they can use
<thumper> :)
 * davecheney goes to make lunch
<cmars> calling it a night y'all. menn0, thanks for the review, let me know how it looks now if you get the chance.
<menn0> cmars: ok good night. I'll try to have another look before I EOD
<axw> ericsnow: you tested backup downloads with a validating TLS client? last time I checked, the certs we use didn't support it
<ericsnow> axw: I'm pretty sure the trick is to explicitly set the root CA
<ericsnow> axw: Michael Foord sorted it out right before he handed backups over to me
<ericsnow> axw: it's certainly conceivable that I've missed something though :)
<axw> hmm, thought I tested that. there was another error about IP SANs being missing, IIRC
<ericsnow> axw: I'll look into that first thing tomorrow
<axw> thanks
<ericsnow> axw: thanks for bringing it up :)
<axw> it matters for backups more than it does tools... it may matter for charms soon, though
<axw> nps
<ericsnow> axw: FWIW, I see value in consolidating the HTTP request code that tools, charms, and backups have in common at some point in the future
<axw> ericsnow: agreed
<axw> it grew a bit organically, needs some refactoring
<ericsnow> axw: I dabbled with it a month or two ago and revisited it today a tiny bit
<ericsnow> anyway, I'm EOD
<axw> ericsnow: night, see you in Brussels
<axw> (I'm leaving tonight)
<thumper> OMG, trying SO hard not to fix everything in this file...
 * thumper makes a note to come back later
<davecheney> thumper: did the best i could on a straightforward way to detect bad libgo's
<davecheney> given it was time boxed and we're probably going to fix it with the dpkg hammer
<thumper> ok, cool
<davecheney> i've also been able to demonstrate that all shipping versions of trusty are broken out of the box
<davecheney> which blows
<thumper> :-(
<thumper> (â¯Â°â¡Â°)â¯ï¸µ â»ââ»
<thumper> just some casual flipping
<thumper> why didn't I complain more before...
 * thumper blames himself
<wwitzel3> â¬ââ¬ï»¿ ã( ã-ãã)
<thumper> (â¯Â°â¡Â°)â¯ï¸µ â»ââ»
<wwitzel3> (â¯Â°â¡Â°)â¯ï¸µ â»ââ» ï¸µ â¯(Â°â¡Â° â¯)
<thumper> :)
<thumper> (â¯Â°ÐÂ°ï¼â¯ï¸µ /(.â¡ . \)
 * thumper EODs
<thumper> night all
<wwitzel3> o/ thumper
<TheMue> morning
<dimitern> jam, hey
<dimitern> jam, it seems while we're trying to reach a consensus on how agent apis need to change, others are still adding stuff to the uniter facade :)
<dimitern> jam, I might as well land my changes and at least finish with port ranges for now
<jam> dimitern: well, I was hoping people would actually join the conversation, lack of objection means we need to do the versioning work
<dimitern> jam, you mean enforce it somehow?
<jam> dimitern: I mean actually bump the version
<dimitern> jam, as it is now?
<jam> dimitern: as in we need to split it up and start a new version for new content. At least, that is what I recommended and nobody said "lets not do that"
<dimitern> jam, ok, I might as well do it, yeah
<fwereade> dimitern, first thing that springs to mind on the uniter.Machine stuff is -- do we really need to use the remote-object model here? just an AllMachinePorts (or something) method would seem to be all we really need here
<fwereade> dimitern, the only reason we have this *Unit style is because we didn't want to rewrite *everything* -- but I don't see a big win in writing new code in the same style
<dimitern> fwereade, ok, I can agree with this
<dimitern> fwereade, I was even thinking of having AllMachinePorts taking unit tags and returning all ports for each given unit tag's assigned machine
<dimitern> fwereade, the more painful issue is what to do about api versioning? I was going to refactor the first PR so it has UniterBaseAPI (having GetOwnerTag which is replaced by ServiceOwner in V1), UniterAPIV0 (with everything but the new methods - AssignedMachine and AllMachinePorts), and UniterAPIV1 { UniterAPIV0 } (having the new calls)
<dimitern> basically what I did for the FirewallerAPI before the ports migration upgrade step
<fwereade> dimitern, yeah, I think we really should be doing that, annoying though it may be
<dimitern> fwereade, ok, I'm on with it already; and re AllMachinePorts taking unit tags? With this, I won't need to add AssignedMachine separately as well
<fwereade> dimitern, that works for me, I think, but: gsamfira, are you doing an AssignedMachine call for reboot?
<fwereade> dimitern, gsamfira: because finding out what machine you're on can just be one call when we bring up a uniter -- that's not going to change for the lifetime of the process, *whatever* happens ;)
<dimitern> fwereade, well I'm calling it in NewHookContext, but I guess I could do that at uniter startup once
<gsamfira> fwereade: yep, doing the same. If I can skip it, it would be great
<fwereade> dimitern, gsamfira: ok, as long as one of you does it I'm easy :)
<fwereade> dimitern, if you're already on that path it might be easiest for you?
<dimitern> gsamfira, fwereade, yeah, I've just seen the AssignedMachine PR, well if mine lands before yours you can just use it :)
<dimitern> and vice versa
<dimitern> fwereade, right, so AllMachinePorts will take machine tags, as it is now, but it will be a top-level method
<fwereade> dimitern, perfect
<fwereade> dimitern, and would you make sure it includes per-relation ports? it's fine to have -1 for everything for now, but I'd prefer not to have to change the api when we introduce them
<dimitern> fwereade, it does not even include network tags now, just unit tags and port ranges lists
<dimitern> fwereade, per-relation ports are not in state yet, but when we later introduce them we can just add a few fields to the result
<dimitern> fwereade, the point of having AMP() is to have a cache of all ports, regardless of network or relation, so we can verify against them each time open(close)-port is called
<dimitern> although.. hmm.. yeah this only works because all ports are on the same network now
<fwereade> dimitern, yeah, exactly
<dimitern> fwereade, how about we return both []slice{UnitTag, NetworkTag, and PortRange} as result, and when per-relation ports are introduced, we add another api call to get the network of a relation endpoint?
<fwereade> dimitern, but don't we need to know the *relation* for which a port has been opened? and to be able to infer the network from the relation, not vice versa?
<dimitern> fwereade, I don't think it's the other way around
<dimitern> fwereade, port conflicts occur on networks, the relation is just a way to determine the network to use
<dimitern> fwereade, ah, so RelationNetwork() call can be added in the future, but right now I don't know for which relation a port was opened
<dimitern> fwereade, perhaps I can add a RelationTag in addition to the UnitTag and PortRange to the result, but just leave it blank for now and put a comment
<fwereade> dimitern, right, but opening the same port on two different relations is *not* necessarily a conflict, is it?
<dimitern> fwereade, only if it's the same network
<fwereade> dimitern, no, even if it's the same network it's not a conflict
<fwereade> dimitern, I think we can trust a charm to handle its own potential collisions
<fwereade> dimitern, but we need to track it to know when we *actually* need to open/close a port
<fwereade> dimitern, consider mysql serving db over two different relations
<fwereade> dimitern, we'll want to close the ports on one relation when that relation dies
<fwereade> dimitern, but actually keep the ports open on that *network*
<fwereade> dimitern, because there's another relation that wants them open
<dimitern> fwereade, yeah, but if these 2 relations are on the same network it is a conflict to try open-port 80-90/tcp
<dimitern> fwereade, what I don't get is how is that not a conflict
<fwereade> dimitern, but it can't be, can it?
<fwereade> dimitern, consider the mysql charm
<fwereade> dimitern, if they conflict, how can you have more than one relation with its server endpoint?
<dimitern> fwereade, wait, I'm trying to follow, but I still don't quite get it..
<dimitern> fwereade, say mysql:db and mysql:cluster are both bound to juju-public; mysql listens on 80-90/tcp for db and tries to do the same for cluster
<dimitern> fwereade, or you're saying in this case we let the charm handle this and trust it not to try opening 80-90/tcp for both relations?
<fwereade> dimitern, I'd been thinking more about having two separate relations with db -- but, yeah
<fwereade> dimitern, no, I'm saying that the charm must be allowed to open those ports for both its relations
<fwereade> dimitern, and that we open a port on a given network when there's one or more relations on that network that have declared they use that port
<fwereade> dimitern, and close it only when all relations on that network have closed it
<dimitern> fwereade, *now* I get it :)
<fwereade> dimitern, sorry unclear :)
 * fwereade lunch
<dimitern> fwereade, cheers!
<TheMue> dimitern: got a few seconds for me?
<dimitern> TheMue, sure
 * TheMue fights with a race condition since yesterday
<dimitern> TheMue, oh, what's the issue?
<TheMue> dimitern: a test, this one https://github.com/TheMue/juju/blob/networker-mode-based-on-agent-api/cmd/jujud/machine_test.go#L1042
<TheMue> dimitern: IMHO the problem lies in the way a.Run() runs in the background. first I had the patching and the channel declaration outside the loop, but then the test fails even more often
<dimitern> TheMue, looking
<TheMue> dimitern: thanks
<TheMue> dimitern: hoped that selecting on the unbuffered modeCh would be enough to ensure that only one of the tests runs at a time
<dimitern> TheMue, ISTM the default: case inside the patched func might be the problem
<TheMue> dimitern: aaargh, sure, "hey, I cannot send, no problem, will still continue"
<TheMue> dimitern: will check, thanks
<jam> TheMue: if you are going to do something like that, can you include a "<-time.After(testing.LongWait)" instead of default ?
<jam> the idea is that a test should fail rather than get hung
<jam> (it should be <-time.After(...): c.Fatalf())
<TheMue> jam: yep, like for the receiving below
<jam> yeah
<dimitern> TheMue, apart from that I have a few comments re naming and logging
<TheMue> dimitern: always happy about feedback
<dimitern> TheMue, i.e. please don't include test #%d: in the description, do it in the loop; s/descr/about/; s/disable/managedNetworking/ ?
<TheMue> dimitern: ok
<mattyw> umm, did review board just die?
<mattyw> nope - was just slooooow
<TheMue> dimitern, jam: no talk today? ;)
<fwereade> dimitern, just looking at that second review, I think what we discussed this morning applies re what we have to store -- even if we're doing everything with the implicit expose relation with id -1, I think the data changes follow quite naturally
<fwereade> dimitern, does it need a deeper look now, or can it wait until we've got that stuff in place?
<fwereade> dimitern, (not on the model side yet, I know)
<dimitern> fwereade, I'm almost done with the first PR changes - api versioning, and I'll propose it soon, but the second PR should be mostly the same (except for a few changes), so I'd appreciate if you look into it
<fwereade> dimitern, ok,cool
<TheMue> dimitern: btw, the waitStopped() usage has been the solution
<dimitern> TheMue, sweet!
<mattyw> ericsnow, ping?
<perrito666> mattyw: he usually arrives in the next half an hour
<mattyw> perrito666, ok - it's not urgent
<mattyw> perrito666, as you're normally happy to listen to my insanity....
<mattyw> perrito666, I think there's a problem with the review board clock: http://reviews.vapour.ws/r/138/
<mattyw> perrito666, look at the time the review was published and the time for the review comments
<perrito666> mattyw: my account -> settings -> my timezone
<mattyw> perrito666, got it
<perrito666> hey, has anyone installed 14.10?
<perrito666> I am tempted to break my machine before traveling :p
<TheMue> ping *
<perrito666> TheMue: I am not sure what to answer to that
<TheMue> anyone with a fresh checkout of master able to do a test for me?
<TheMue> perrito666: you answered, so I caught you *lol*
<TheMue> perrito666: has been a channel trap
<ericsnow> perrito666, wwitzel3: standup?
 * perrito666 deflects TheMue with "I dont have a fresh copy" :p
<perrito666> a couple of days old
 * TheMue grumbles and thinks about a new tactic
<TheMue> perrito666: just a run of go test in .../juju/worker
<TheMue> perrito666: I've got failing tests there and they don't look as they have anything to do with my changes. so I switched to a master branch and still have the error.
<perrito666> TheMue: upgrading
<TheMue> perrito666: thx
<perrito666> TheMue: running
<jcw4> TheMue: what is the error?
<dimitern> jam, TheMue, fwereade, if you're still around, please take a look at the updated, versioned uniter API http://reviews.vapour.ws/r/123/diff/2/
<fwereade> dimitern, cheers
<TheMue> jcw4: multiple tests expect a string "setup"
<TheMue> jcw4: oh, ic, may have to do with actions
<jcw4> TheMue: running now
<jcw4> TheMue: I often get errors in the peergrouper tests that seem to be only on my machine...
<perrito666> TheMue: FAIL: filter_test.go:293: FilterSuite.TestConfigEvents
<perrito666> TheMue: that is master with deps up to date
<TheMue> perrito666: oh, this error is unknown to me, strange
<jcw4> perrito666: that one is suspiciously close to actions stuff
<jcw4> I was seeing it too, but only sometimes, and it didn't happen on the build server
<jcw4> perrito666: are you using go 1.3.x
<jcw4> ?
<TheMue> perrito666: mine are in notifyWorkerSuite and stringsWorkerSuite
<jcw4> TheMue: can you pastebin the output?
<TheMue> jcw4: it's always s.actor.CheckActions(c, "setup") that fails here
<jcw4> TheMue: hmm; that isn't related to *our* Actions AFAIK
<TheMue> jcw4: http://paste.ubuntu.com/8472971/
<TheMue> jcw4: and it seems to be flaky, this time three fails, but also have seen four fails in other runs
<jcw4> TheMue: are you using go 1.3.x too?
<jcw4> I've seen tests behave more flaky in 1.3.x than the official 1.2.x that the build server uses
<jcw4> other than that my only test failures are in workerSuite.TestSetMembersErrorIsNotFatal
<jcw4> which I get often
<dimitern> fwereade, btw the diff is split in 2 http://reviews.vapour.ws/r/123/diff/2/ - second page (in case you were wondering like me :)
<TheMue> oh, 1.2.1, will update
<dimitern> fwereade, oops I meant http://reviews.vapour.ws/r/123/diff/2/?page=2
<jcw4> TheMue: I think 1.2.1 is the official version that's supported
<natefinch> morning everyone
<jcw4> hi natefinch :)
<dimitern> morning natefinch
<bodie_> morning
<TheMue> o/
<dimitern> fwereade, re opening and closing all the pending ports in one API call.. I'd rather leave this as is and do a follow-up later that introduces a FinalizeHookContext API call to do all changes in one call
<dimitern> it will be easier once the versioning code lands
<TheMue> jcw4: same failures with 1.3.3
<jcw4> TheMue: that's really strange
<TheMue> indeed
<jcw4> I don't see an obvious connection, but sometimes I have half a dozen borked mongo instances running after the tests, and if I killall of them and re-run my tests run fine
<alexisb> TheMue, I am on the hangout and ready when you are
<TheMue> alexisb: ouch, missed it, omw
<alexisb> no rush TheMue
<ericsnow> could I get a second pair of eyes on http://reviews.vapour.ws/r/132/ (katco was kind enough to review it first)
<katco> ericsnow: i am enjoying reading your code. very nice. can't claim i understand all of it, but it's readable :)
<ericsnow> katco: achievement unlocked
<jcw4> katco: +1, I reviewed and liked the code - didn't have any feedback to give other than nice.
<katco> ericsnow: lol
<jcw4> ericsnow: nice :)
<bodie_> rick_h_, you around?
<bodie_> we're nailing down actions api stuff and thinking forward to the golden spike
<rick_h_> bodie_: how goes?
<bodie_> rick_h_, goes well, I think we're pretty close to (or already have?) a first rendition API landed ( jcw4? ) and just need to hash out a few more details in the doc
<bodie_> rick_h_, I'm just thinking about the sprint and organizing the next steps
<rick_h_> bodie_: sure thing
<jcw4> yep, the API has landed
<jcw4> it's not implemented yet, but the outline is there
<rick_h_> cool
<jcw4> rick_h_: the one question I had...
<bodie_> rick_h_, TheMue was saying maybe we could arrange a meeting of spirits next week, so I got thinking about paving the way to a demo
<jcw4> It makes sense for information about available actions to be exposed through the CharmInfo
<rick_h_> bodie_: you guys at the sprint or calling in?
<jcw4> but IIRC that is all in YAML format
<bodie_> we'll be remote
<jcw4> rick_h_: we don't have any information for calling in
<rick_h_> jcw4: right, but remember we need json for the gui.
<jcw4> rick_h_: that was my question
<rick_h_> jcw4: bodie_ ok, just curious what you're looking at setting up
<rick_h_> jcw4: yea, so we don't currently have a JS yaml parser and json is the way to go for most things and we're still looking at trying to use jsonschema for the validation/ui part
<jcw4> rick_h_: we thought about a special purpose bridge between the charminfo that exposes the actions json info through the actions API as JSON
<jcw4> rick_h_: I'm not clear though if when you consume Juju API calls on the front end if it comes in as JSON anyway?  Sounds like not?
<rick_h_> so right now we do it over the websocket and get json payloads in there I believe?
<jcw4> thats what I was thinking
<jcw4> so wouldn't that automatically convert the charminfo stuff to the json format you need?
<jcw4> or am I missing a piece ?
<rick_h_> jcw4: no idea, I'd have to look at the core side to see how it's turning that data into json over the websocket
<jcw4> rick_h_: I'd like to dip my toe in the front end code myself
<rick_h_> jcw4: ok, np.
<jcw4> maybe I'll hit you up with some questions as I attempt to get started
<rick_h_> jcw4: if you want we can setup a hangout sometime and I can show you where our client code is, how we debug over the websocket, etc
<jcw4> that would be great
<rick_h_> jcw4: I don't know where that lives on the -core side. frankban and Makyo have more of an idea but hopefully easy to find
<rick_h_> jcw4: and in theory if you've got a build of juju you could test it out and load the gui/talk to it and get a feel for it
<bodie_> I believe the web client connects to the API seamlessly, so it must just marshal things as json
<jcw4> rick_h_: I think I can figure out the -core side.  Testing out the comms is what I want to attempt
<bodie_> i.e., the websocket stuff is already dealt with
<jcw4> rick_h_: the main repo is https://github.com/juju/juju-gui right?
<rick_h_> jcw4: correct
<rick_h_> jcw4: https://github.com/juju/juju-gui/blob/develop/app/store/env/go.js is our client
<jcw4> cool
<bodie_> rick_h_, what I have in my head is something really minimal: create a new action with some parameters, see what comes back.  we could spec up the calls for that, I think
<rick_h_> jcw4: bodie_ and chrome can let you watch all the frames come in/out of the websocket
<rick_h_> bodie_: sounds good to me
<bodie_> ooo, nice
<bodie_> rick_h_, perhaps using jeremy's json-schema ui builder
<bodie_> anyway, these are all just ideas as yet -- just thinking about where to take it from here :)
<rick_h_> bodie_: well you can try that out. Our big hurdle is that we need to write an integration between jeremy's stuff and our databinding layer we use for monitoring for changes to the environment
<rick_h_> bodie_: happy to help move it forward with a proof of concept. Let me know what you need for me to help
<rick_h_> bodie_: my plan next week is to ask for some time next cycle to add the actions feature to the gui in the next cycle of ubuntu work
<bodie_> rick_h_, okay, awesome.
<thumper> morning
<perrito666> thumper: morning
<urulama> morning, thumper :)
<thumper> katco: aargh... I had forgotten to publish my updates. thanks for the review, should have already addressed issues you raised, but I'll take another look
<perrito666> thumper: shame, do you not see the green bar reminding you to submit? green is the color of urgent and pending things, you should have seen it
<katco> thumper: doh! no worries, good to read through code :)
<thumper> ha... didn't look at it after doing the rbt command line stuff
<katco> i do that sometimes lol
<davecheney> alexisb: ping
<alexisb> crap davecheney sorry
<alexisb> was on another call and didnt get my calendar ping
<alexisb> hopping over
<davecheney> s'ok
#juju-dev 2014-10-02
<perrito666> hi again
<ericsnow_> CI has failed to build for me 3 times in a row: "35 repositories updated; 2 failed"
<ericsnow_> any thoughts?
<davecheney> godeps: cannot update "/var/lib/jenkins/workspace/github-merge-juju/tmp.uAmw1jccO3/RELEASE/src/bitbucket.org/kardianos/service": package bitbucket.org/kardianos/service: https://api.bitbucket.org/1.0/repositories/kardianos/service: 503 Service Unavailable
<davecheney> bitbucket is down
<ericsnow_> davecheney: ah
 * thumper needs a casual (╯°□°)╯︵ ┻━┻
 * perrito666 hangs the call and immediately falls asleep on the desk
<perrito666> see some of you tomorrow, cheers
<ericsnow> wwitzel3: did you look at Open in api/apiclient.go?
<ericsnow> wwitzel3: that's where the cert stuff happens on the client side
<ericsnow> wwitzel3: past that there isn't a whole lot you can plug into
<wwitzel3> ericsnow: looking now
<wwitzel3> ericsnow: yeah I started there but wasn't able to actually figure anything out; as far as I can tell, in that file, on line 231, we are already doing that exact tls config for the websocket.
<ericsnow> wwitzel3: maybe it's a websockets vs. normal HTTP thing then (as backups is using direct HTTP requests for download)
<wwitzel3> ericsnow: maybe, yeah, I will dig in to more in the morning, I'm going to take some medicine and try to get some extra zzz's tonight.
<wwitzel3> ericsnow: appreciate you taking a peek at it though
<ericsnow> wwitzel3: yeah, I'm EOD soon too
<ericsnow> wwitzel3: np
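The symmetry wwitzel3 and ericsnow are after can be sketched like this: a hypothetical helper (not juju's actual API) that builds an `http.Client` from the same CA material the websocket dial uses, so direct HTTP downloads such as backups verify against the same server certificate:

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
)

// newTLSClient is a hypothetical helper: it mirrors the kind of TLS config
// the websocket dial in api/apiclient.go sets up, so that plain HTTP
// requests trust the same CA as the websocket connection.
func newTLSClient(caPEM []byte) (*http.Client, error) {
	pool := x509.NewCertPool()
	if len(caPEM) > 0 && !pool.AppendCertsFromPEM(caPEM) {
		return nil, fmt.Errorf("invalid CA certificate")
	}
	cfg := &tls.Config{RootCAs: pool}
	return &http.Client{Transport: &http.Transport{TLSClientConfig: cfg}}, nil
}

func main() {
	client, err := newTLSClient(nil)
	fmt.Println(client != nil, err)
}
```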
<thumper> davecheney: care to leave a ship it? http://reviews.vapour.ws/r/133
<davecheney> thumper-afk: looking
<davecheney> done
<thumper-afk> davecheney: ta
<TheMue> morning
<jam> morning TheMue
<jam2> TheMue: you're gone tomorrow, right? I'm surprised to not see Dimiter here yet.
<jam2> fwereade: I'd like to chat with you about possible TXN designs.
<fwereade> jam2, heyhey
<TheMue> jam2: no, have been here yesterday, had my hangout with dimiter and missed you
<jam2> TheMue: yeah, sorry I missed the hangout, ended up getting distracted and missed the ping
<fwereade> jam2, sgtm -- juju-sapphire?
<jam2> fwereade: I'm in a cafe, but we can certainly give it a shot
<TheMue> jam2: I made my final changes to my upcoming PR in the evening and right now doing the last checks before proposal *yay*
<dimitern> jam2, hey
<dimitern> jam2, TheMue, http://reviews.vapour.ws/r/123/ - if you can have a look
<fwereade> jam2, did I fall over or did you?
<jam1> fwereade: I'm at a cafe that gives "free wifi" but only in 1 hour blocks
<jam1> so I just switched back to tethering my phone
<jam1> but I don't trust the 3G for a hangout
<jam1> fwereade: anyway, I think I've gotten enough to at least progress from here, thanks
<fwereade> jam1, cool
<dimitern> jam1, fwereade, hey, I really like to land this branch, if you can have a look and say it's fine - http://reviews.vapour.ws/r/123/
<TheMue> dimitern: will take a look now while my suite is running in the background ;)
<dimitern> TheMue, cheers!
<fwereade> dimitern, ah yeah sorry I got some of the way through the second page there yesterday
<dimitern> fwereade, np, was it looking ok?
<fwereade> dimitern, yeah, maybe a couple of minor quibbles
<fwereade> dimitern, just going through the rest now
<dimitern> fwereade, great, thanks!
<dimitern> fwereade, re doing all ports in a single api call in finalize.. if it's ok with you, now once we have versioning in the uniter api, it will be easier to add a FinalizeHookContext api call that does all in one
<fwereade> dimitern, yeah, agreed, we don't want that yet
<dimitern> but as a follow-up, once the rest of the port ranges PRs land
<dimitern> fwereade, cheers
<fwereade> dimitern, that's LGTM, but please bear all the quibbles in mind -- I think they do need work, but they don't have to be that PR
<fwereade> dimitern, the v0/v1 split is worth landing even if all the details aren't perfect
<dimitern> fwereade, sure, will do
<dimitern> fwereade, re "FWIW, I would be really happy to see this removed. I guess it's a derail right now, but it's completely pointless -- part of an early abortive metering thing." - can you expand on this one a bit please?
<dimitern> fwereade, specifically, what's wrong with calling ParseUserTag at the end of ServiceOwner?
<fwereade> dimitern, sorry, it's about ServiceOwner itself
<dimitern> fwereade, so you're saying we should get rid of it at some point?
<fwereade> dimitern, yeah
<dimitern> fwereade, that's fine, what I did is just fix it to be consistent with other api calls (permission checks, bulk ops, etc.)
<fwereade> that definitely doesn't have to be in this one
<fwereade> dimitern, it's just something I mention whenever I see it
<dimitern> fwereade, right :)
<dimitern> TheMue, thanks for the review as well
<TheMue> dimitern: yw ;)
<TheMue> dimitern: just proposing mine now
<TheMue> jam2: I have to say, while I'm happy with the way we found to handle API versioning, the original approach based on the EnvironmentCapabilities, as dimiter suggested, has a somewhat clearer logic.
<thumper> fwereade: hey, got a few minutes to chat?
<mgz> trivial review someone please! <http://reviews.vapour.ws/r/140/>
<dimitern> TheMue, you've got a review
<dimitern> mgz, looking
<TheMue> dimitern: just got the notification, thanks
<dimitern> mgz, LGTM
<mgz> dimitern: thanks!
<dimitern> jam, fwereade, reproposed ports sandboxing against master - http://reviews.vapour.ws/r/141/ please take a look
 * dimitern needs to step out for ~1h
<mgz> rogpeppe: when you have a mo <https://code.launchpad.net/~gz/godeps/print_project_not_dir/+merge/236860>
<rogpeppe> mgz: LGTM
<hazmat> jam, was that tokumx build i gave for 1.5 targeted towards a build dir / bin .. rebuilding 2.0.0 atm
<dimitern> jam, fwereade, review poke :)
<mgz> rogpeppe: if you add me to ~godeps-maintainers I'll go ahead and land
<hazmat> jam, tokumx 2.0 binary https://www.dropbox.com/s/f78gcixuerpbpjw/tokumx-2.0.0-linux-x86_64.tar.gz?dl=0
<hazmat> w/ ssl
<rogpeppe> mgz: done
<mgz> rogpeppe: landed, thanks!
<rogpeppe> mgz: definitely nicer error messages, thanks
<hazmat> jam, ping me if you've got a minute to chat toku, added some comments to the docs
<fwereade> jam, I don't seem to be able to change teams in github -- can you?
<mgz> fwereade: you should be able to, but I can if needed as well
<mgz> fwereade: (things related to the juju team at least)
<fwereade> mgz, yeah, I thought I could once, didn't seem to be happening today though
<fwereade> mgz, would you remove vlad from hackers, and add gabriel-samfira, please?
<mgz> sure thing.
<mgz> I'll also owner you.
<arosales> any folks seen this error before
<arosales> http://pastebin.ubuntu.com/8479146/
<arosales> is this a out of memory on the provider (maas) or juju bootstrap?
<mgz> gabriel has been sent an email, he'll need to accept to be added
<mgz> arosales: ...that's a good one
<mgz> arosales: I would say it, but I bet maas is misbehaving here as well
<mgz> arosales: check the maas logs?
<arosales> mgz: ya I see OperationalError: out of memory in maas logs
 * arosales sighs
<perrito666> natefinch: ericsnow ?
<ericsnow> perrito666: here
<perrito666> ericsnow: we seem to have the standup now
<perrito666> instead of later
<perrito666> :p
<mgz> gsamfira: twothings, pr707 was go fmt sad, <http://juju-ci.vapour.ws:8080/job/github-merge-juju/872/console>, and you need to join juju/hackers on github for $$merge$$ to be obeyed
<gsamfira> mgz: yup, fixed and pushed
<gsamfira> mgz: I think I have :)
<mgz> okay, should be all set then, let me see
<mgz> gsamfira: github seems to be lagging, juju has you as in the hackers team, but your profile doesn't have you in juju
<gsamfira> mgz: eventual consistency ftw :). No rush, it's fine
<mgz> gsamfira: ah, you need to set your membership of juju to public
<gsamfira> hmm..lemme see
<mgz> gsamfira: on the <https://github.com/orgs/juju/people?query=samfira> page should see a link
<gsamfira> ahh
<gsamfira> thanks!
<gsamfira> was looking for that miniscule link
<mgz> gsamfira: it's going
<gsamfira> worked :)
<gsamfira> mgz: thanks!
<perrito666> I really hate the local power company
<dimitern> fwereade, updated http://reviews.vapour.ws/r/141/diff/1-2/
<dimitern> fwereade, it should be good to land now
<dimitern> natefinch, hey, can you have a look as well?  http://reviews.vapour.ws/r/141/ I really need to land this
<perrito666> brb
<fwereade> dimitern, posted a quick question/concern, does it make sense?
<dimitern> fwereade, so, currently at the api level there's no relation given or needed when calling OpenPorts or ClosePorts
<dimitern> fwereade, but we can add it easily later in the params.EntityPortRange struct
<fwereade> dimitern, we can always fix up the api later, yeah
<fwereade> dimitern, it's keeping track at the context level that I'm mostly concerned about
<fwereade> dimitern, we half keep track and half not? because we have machinePorts that does know about relations, but pending that doesn't
<dimitern> fwereade, pending does know now, but the RelationTag field of PortRangeInfo is ignored for now
<fwereade> dimitern, but the keys don't know
<fwereade> dimitern, as it is that data structure can only hold one info per range
<dimitern> fwereade, that's correct, but isn't this how it should be - in the context of a relation hook?
<fwereade> dimitern, don't think so -- just because there's a default relation in play doesn't mean you're barred from changing others
<fwereade> dimitern, you just need to be explicit about it
<dimitern> fwereade, I agree, but later we'll add a [-r rel-id] argument to open-port, close-port, and opened-ports
<dimitern> fwereade, it doesn't make sense to do it now, because it's ignored
<fwereade> dimitern, mmm, I'd rather have the context-level stuff understand relations but always be using a magic null relation -- rather than have to rewrite that logic when we try to incorporate them
<fwereade> dimitern, I think it can even be written and tested sanely right now, you just need to pass a hardcoded not-really-a-relation into those methods
<fwereade> dimitern, then enabling it at the command level becomes trivial, instead of a hidden whoops-rewrite-logic thing
<dimitern> fwereade, so you're saying pendingPorts should be something like map[X]PortRangeInfo, where X is struct { Ports network.PortRange, RelationTag names.RelationTag } ?
<fwereade> dimitern, something like that, yeah
<fwereade> dimitern, I know it's kinda tedious
<fwereade> dimitern, but since we're writing it now we may as well get it right, rather than reworking it later
<dimitern> fwereade, ok, fair point
<fwereade> dimitern, awesome, tyvm
<dimitern> fwereade, and re your other question
<dimitern> fwereade, changing the map key should solve that as well I think
<fwereade> dimitern, yeah, I think so
<dimitern> fwereade, sweet! thanks, I will change it and repropose
<dimitern> fwereade, can you LGTM it conditionally, so I can land it after that?
<fwereade> dimitern, the trouble is it kinda feels like that's the heart of the CL
<fwereade> dimitern, give me an ETA on the changes and I'll set an alarm to come and look though
<dimitern> fwereade, I'll be done in 1h, will ping you
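The composite map key fwereade suggests relies on Go's rule that any struct whose fields are all comparable can itself be a map key. A minimal sketch, with simplified stand-ins for juju's `network.PortRange` and `names.RelationTag` (the real types differ):

```go
package main

import "fmt"

// Simplified stand-ins for juju's network.PortRange and names.RelationTag;
// this only illustrates the map-key shape fwereade proposed.
type PortRange struct {
	FromPort, ToPort int
	Protocol         string
}

type portKey struct {
	Ports       PortRange
	RelationTag string
}

type PortRangeInfo struct {
	ShouldOpen bool
}

func main() {
	pending := map[portKey]PortRangeInfo{}
	r := PortRange{8080, 8080, "tcp"}
	// The same range can now carry distinct info per relation, which a
	// plain PortRange key could not express.
	pending[portKey{r, "relation-db-0"}] = PortRangeInfo{ShouldOpen: true}
	pending[portKey{r, "relation-db-1"}] = PortRangeInfo{ShouldOpen: false}
	fmt.Println(len(pending)) // → 2
}
```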
<perrito666> if anyone has time http://reviews.vapour.ws/r/142/diff
<dimitern> fwereade, updated - http://reviews.vapour.ws/r/141/diff/2-3/
<dimitern> perrito666, "You don't have access to this review request."
<perrito666> dimitern: sorry I keep forgetting to publish these things
<perrito666> done
<dimitern> perrito666, cheers, looking
<dimitern> perrito666, reviewed
 * dimitern is way past eod now, time to go
<natefinch> ok, I think I might actually be able to do a little work now
<marcoceppi> natefinch: if on a server deployed with juju
<marcoceppi> I take down eth0 and bring up eth1 on a different network
<marcoceppi> will juju agent detect that change?
<marcoceppi> in 1.20.9
<natefinch> buh
<marcoceppi> I figured
<marcoceppi> (asking for a friend ;)
<natefinch> lhaha
<marcoceppi> natefinch: so, not really going to happen?
<natefinch> uh all the networking guys are off by now, so I'm not sure.  I'd be pretty sure that juju wouldn't figure it out on its own.
<marcoceppi> can I kick juju in the ass to figure it out?
<marcoceppi> that you know of
<marcoceppi> like restart the agent
<natefinch> marcoceppi: you can certainly try... I don't know what it'll do, honestly.
<katco> is there a way to specify an environment for any juju command? i.e. juju -e local (blah)
<natefinch> yes?
<natefinch> - e local?
<natefinch> er -e
<natefinch> that should work on anything that requires an environment
<katco> hm doesn't seem to work for me? "error: flag provided but not defined: -e"
<katco> so you have to specify after the subcmd?
<natefinch> might have to be after the command, I forget if it cares
<katco> hm. just trying to figure out if there's a way i can make sure i don't hose my actual juju environment while developing :p
<natefinch> katco: you can always rename the jenv for that environment
<natefinch> to something not .jenv
<natefinch> then it's untouchable
<katco> yeah but then i have to flip bits around every time i want to switch b/t developing and doing something with my env.
<katco> it's a good suggestion, but not ideal =/
<natefinch> katco: there's a request in to be able to "lock" an environment so you don't accidentally muck up a pristine environment, but no one has tried tackling it yet
<katco> ah cool
<katco> damn juju developers.
<katco> never get to the important stuff! ;p
<katco> oh hey natefinch where/when you flying out? wondering if i'll catch any canonical ppl on my flight
<natefinch> well, the clientside part is trivial, but you want the serverside to respect it too... but then older juju clients wouldn't be able to unlock the environment, so what then?  Maybe not a big deal.  But anyway... it's tricky.
<natefinch> katco: Saturday 7pm EST BOS-Heathrow-Brussels
<natefinch> British Airlines IIRC
<katco> ah... i'm in dc
<katco> i mean i have a layover there, then direct to brussels
<katco> oh for the days when lambert intl. airport was actually an international airport lol.
<natefinch> *nod* This was the best time for me, gives me the most time on Saturday with my family without being a ridiculously extra long flight for no reason
<natefinch> jej
<natefinch> hhe
<katco> yeah, i'm flying out sat. night
<katco> might see you at the brussels airport
<katco> get in at like 7am i think
<natefinch> I get in at 11am.... I wanted to get in at 8am, but it would have meant leaving home at like noon on saturday
<natefinch> gotta run, the kids are refusing to take a nap and my wife is feeling worse again.
<perrito666> I can never understand how, being a kid, I also refused to nap
<perrito666> and now I would give my life for a nap
<perrito666> katco: ericsnow some of you wants to add a review to http://reviews.vapour.ws/r/142/ ?
<ericsnow> perrito666: I'll take a look
<katco> perrito666: i'm trying to wrap up a change, i'll try and have a look when i'm done
<katco> perrito666: thanks for the ping
<perrito666> is it my perception or does git rebase hit the same conflict every time?
<perrito666> ericsnow: ta
<thumper> FYI, I'm going to be in and out a lot today as I prepare for the sprint next week
<thumper> need to be at the airport in 7 hours
<thumper> so need to pack etc.
<rogpeppe1> random network question: anyone know what this arp line means?
<rogpeppe1> ? (192.168.0.4) at (incomplete) on en1 ifscope [ethernet]
<rogpeppe1> i'm having LAN routing issues
<ericsnow> EOD a little early
<rogpeppe1> and i think that's one of the symptoms
#juju-dev 2014-10-03
<davecheney> alexisb: bauer is down again, :( RT 75536
<dimitern> fwereade, hey
<dimitern> fwereade, updated http://reviews.vapour.ws/r/141/diff/3-4/ if you can have a look?
<fwereade> dimitern, cheers
<fwereade> dimitern, on further inspection today, I'm worried that we really rather need to define how relation ports work at the context level
<fwereade> dimitern, otherwise when we start using ones that aren't -1 we *will* forget to test it all properly
<fwereade> dimitern, because anyone seeing the try* funcs will assume that relationId is a meaningful parameter
<dimitern> fwereade, I was hoping to discuss this in brussels
<fwereade> dimitern, tell you what
<dimitern> fwereade, yeah?
<fwereade> dimitern, take relationId out of the signatures of the try funcs
<fwereade> dimitern, we'll still need to rewrite them to take it into account properly
<dimitern> fwereade, and hardcode -1 inside?
<fwereade> dimitern, but it'll do what we need to today, and we won't be able to fool ourselves into believing the logic is more capable than it is
<fwereade> dimitern, yeah
<fwereade> dimitern, the first line can be "relationId := -1" even
<dimitern> fwereade, ok, will this be sufficient for now? I can also add a TODO comment about refactoring both funcs to take relation ids properly into account
<fwereade> dimitern, I think so, yeah
<dimitern> fwereade, cheers!
<fwereade> dimitern, given that, LGTM with trivials
<dimitern> fwereade, great, just finished with the changes and re-proposing, thanks again
<dimitern> fwereade, last step - a small branch that implements the "opened-ports" hook tool to return a list of ports or ranges opened by the unit: http://reviews.vapour.ws/r/144/
<fwereade> dimitern, hmm, I'm wondering how that will mesh with (again) per-relation ports
<fwereade> dimitern, there's an uncomfortable tension between implicit -1 and implicit current relation
<dimitern> fwereade, we can add [-r relid] argument later perhaps?
<fwereade> dimitern, yeah, the concern is that all the *other* -r flags assume current relation
<fwereade> dimitern, so there's a change in behaviour for charms that open/close/opened in a relation context
<fwereade> dimitern, OTOH it'll be exactly the same problem for open/close, there's nothing special about opened
<dimitern> fwereade, yeah, I think so
<fwereade> dimitern, I'm worrying a bit about output format though
<fwereade> dimitern, is a list sufficient?
<fwereade> dimitern, I guess it probably is
<fwereade> dimitern, we can have an --all flag with a map or something
<fwereade> dimitern, LGTM with --format yaml, --format json
<dimitern> fwereade, the format flag is common to all commands btw
<fwereade> dimitern, yeah, I expressed it insanely, but I'm mainly whining about the lack of tests
<fwereade> dimitern, doesn't need to be super-complex, just make sure they work
<dimitern> fwereade, sure, np
<dimitern> fwereade, thanks!
 * perrito666 bags ready
<perrito666> natefinch: going to the 1:1?
<natefinch> perrito666: yep
<perrito666> nate switched to standup
<sinzui> natefinch, perrito666 This issue is blocking the cloud installer. We need to understand if this is juju, maas, or both, and do we need a backported fix to 1.20, see bug 1376952
<mup> Bug #1376952: juju machine stuck on 'pending' with no error message: juju 1.20.9.1, maas provider <cloud-installer> <deploy> <maas-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1376952>
<stokachu> natefinch, perrito666 - mmcc_ is here if you have further questions
<mmcc_> yep, hi - the environment has been torn down but the systems are available to try to reproduce, and if anyone wants access to look themselves, that'd be fine
<sinzui> stokachu, does machine 1 exist? Can you get a copy of the /var/log/cloud-init-output.log to tell about what happened before juju's agent was started
<stokachu> mmcc_, ^
<mmcc_> sinzui: that machine is not up, and has been recommissioned a couple of times. I'll try to reproduce and see if I can get that log for you
<perrito666> the only strange thing I see and can somehow understand is that peergrouper complains that it cannot determine the replicaset
<perrito666> which leads me to think that the state server might be failing while trying to create the replicaset
<perrito666> natefinch: any input?
<natefinch> looking
<perrito666> natefinch: did you by any change got my package?
<natefinch> perrito666: I haven't gotten anything yet, I was actually going to ask about that.  Do you have tracking info on it?
<natefinch> perrito666: if you ever need to get anything here in a short period of time, I have Amazon Prime, so I can get stuff shipped 2nd day for free
<natefinch> perrito666: hell, I can get something delivered tomorrow morning for $5, if you really want :)
<perrito666> natefinch: lol :p I think this is not amazon delivering so it might take a lot longer, it's the size of a usb thumb so you can store it until the next sprint if you get it after :p I would say you can use it but it's only useful on thinkpads
<natefinch> haha
<natefinch> perrito666: ahh, yeah, I never buy anything that isn't shipped by amazon unless I'm desperate for that very specific thing... because you never know how long some schmoe is going to take to get it to you, and chances are he's going to send it the slowest, cheapest way possible
<perrito666> natefinch: well its a bit of a specific part so it was only available from that source
<natefinch> perrito666: yeah, figured. bummer.  Hope it makes it here.   It's probably being sent via US Post Office, which delivers today and tomorrow, so we might get lucky.
<perrito666> natefinch: well laptop spare parts (in this case is the bluetooth module) are almost always sent from china, what I don't understand is why someone made this laptop without the bluetooth module in the first place
<katco> i'm running into an ordering issue using gocheck; i'm trying to use SetUpTest, but that is precluding something else from setting up state... does this sound familiar to anyone?
<perrito666> I use a usb bluetooth dongle but I keep breaking those against things
<perrito666> katco: not following
<katco> perrito666: are you familiar with gocheck's SetUpTest functionality?
<katco> perrito666: it basically is a special function that will be run before every test in a suite
<katco> perrito666: the problem is, if i define it, then state in an embedded suite never gets set up
<natefinch> katco: you have to manually call sub-suite's setup functions too
<natefinch> remember, the methods on the struct you're defining override the ones on embedded structs, so you have to manually call the ones on the embedded structs
<katco> natefinch: ah that was exactly it
<katco> natefinch: dependency model i didn't fully understand
<natefinch> katco: yep.  happen to uh... a friend of mine. ;)
<katco> haha
<katco> in the perfect world in my brain, nothing is ever that obscure
<natefinch> yeah, it's really the fault of the magic in gocheck
<katco> i would have thought that gocheck would handle that for you
<katco> since it has all the suites which have been registered
<katco> ok, back to hacking. thanks perrito666, natefinch
<natefinch> well, it just looks for the top level method and calls it
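The shadowing natefinch describes is plain Go method promotion: defining `SetUpTest` on the outer suite hides the embedded suite's method, and gocheck's reflection only finds the outermost one. A stripped-down sketch without gocheck itself:

```go
package main

import "fmt"

// BaseSuite plays the role of an embedded gocheck suite (e.g. a state suite).
type BaseSuite struct {
	calls []string
}

func (s *BaseSuite) SetUpTest() { s.calls = append(s.calls, "base") }

type MySuite struct {
	BaseSuite
}

// MySuite's SetUpTest shadows the embedded one; gocheck only invokes the
// outermost method it finds, so the embedded setup must be chained by hand.
func (s *MySuite) SetUpTest() {
	s.BaseSuite.SetUpTest() // without this line, BaseSuite's setup never runs
	s.calls = append(s.calls, "mine")
}

func main() {
	var s MySuite
	s.SetUpTest()
	fmt.Println(s.calls) // → [base mine]
}
```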
<perrito666> ok bbl, going to lunch with the wife to compensate for 1 week drinking with friends in brussels
<natefinch> haha sounds fair ;)
<hazmat> cmars, mongo ++ :-)
<hazmat> cmars, er. gonzo ++
<cmars> hazmat, fear and loathing in nosql databases
<cmars> idk
<cmars> fun little side project i've been hacking on
<cmars> hazmat, https://github.com/cmars/gonzodb
<hazmat> cmars, yeah. i saw
<hazmat> cmars, reminds me of https://github.com/rick446/MongoTools/blob/master/mongotools/mim/mim.py
<hazmat> mongo in memory .. make unit tests FAST
<natefinch> hazmat: that would be amazing
<cmars> hazmat, that python version is impressive. even embeds JS. nice!
<hazmat> natefinch, cmars alexisb we should schedule a topic on this for next week
<alexisb> hazmat, I was not following the discussion, give me a minute and will do
<hazmat> alexisb, i'll add, just wanted to put it on your radar
<hazmat> actually we already have an agenda item that matches this topic
<hazmat>  Juju Testing: Better Merge Gating, Faster Testing
<wwitzel3> we used MIM at SourceForge for our testing
<mmcc__> sinzui, natefinch, perrito666 - trying to reproduce bug 1376952 just now, the machine came up correctly, so I will just go ahead and use it. if it breaks again, I'll update the bug. thanks for looking, and let me know if I can enable any extra logging or anything that'd help diagnose it next time
<mup> Bug #1376952: juju machine stuck on 'pending' with no error message: juju 1.20.9.1, maas provider <cloud-installer> <deploy> <maas-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1376952>
<natefinch> mmcc_: ok, thanks
#juju-dev 2014-10-04
<perrito666> afternoon
<jcw4> perrito666: or morning... whatever
<jcw4> :
<jcw4> :)
<perrito666> perrito666: currently it is 13:08 here
<perrito666> so Ill say afternoon
#juju-dev 2015-09-28
<davecheney> anastasiamac: i cannot test on windows
<davecheney> i don't have a windows machine
<davecheney> but I fixed the error reported by CI
<davecheney> which was the sole remaining use of version.Current.OS
<anastasiamac> davecheney: did u have a chance to look at this? is this worth a try? http://wiki.cloudbase.it/juju-testing
 * thumper wanders off to find some lunch
<davecheney> i'm fine with letting CI test this
<davecheney> if there is a problem, reverting it is a quick process
<axw> wallyworld: I'm going to be out tonight, so won't be able to make the retro sorry
<wallyworld> axw: sure, np
<axw> wallyworld: also I'm not working today, in case you'd forgotten
<wallyworld> you have a holiday anyway
<anastasiamac> axw: why r on irc?
<anastasiamac> axw: u*
<axw> wallyworld: summary: I spent most of last week working on various bugs, and trying to come up with a demo for storage for seattle (no fruit yet). blocked on azure because no SDK updates yet
<axw> anastasiamac: just to rub it in that I have a holiday ;)
<anastasiamac> axw: oh yes - thank you - appreciated :D
<wallyworld> thumper: you don't need +1 for backport, you can just land that can't you
<thumper> wallyworld: no, but I was really wanting some form of confirmation
<thumper> but I may just merge it
<wallyworld> i can look if you want
<wallyworld> thumper: lgtm, seems to be an extra unrelated test from the cherry pick that came across, doesn't matter
<mup> Bug #1500283 opened: apiserver: data race in machinePinger.Stop <juju-core:New> <https://launchpad.net/bugs/1500283>
<mup> Bug #1500283 changed: apiserver: data race in machinePinger.Stop <juju-core:New> <https://launchpad.net/bugs/1500283>
<perrito666> for those of you who are in the wrong side of the world
<perrito666> http://imgur.com/J0RNfTt
<perrito666> sorry for the quality my lens is not the best there is
<anastasiamac> perrito666: nice picture :P
<anastasiamac> perrito666: but everyone who is on the "wrong side of the world" is now asleep :P
<perrito666> anastasiamac: lol
<perrito666> its noon for you
<anastasiamac> :P yes just passed
<davecheney> http://reviews.vapour.ws/r/2764/
<davecheney> ^ quite urgent
<anastasiamac> davecheney: wrong channel... re-posting here
<anastasiamac> <anastasiamac> ok.. m pleading my ignorance... why changing from bool will fix a race? :D
<davecheney> we need to use atomic operations so the change to the value is visible across threads
<davecheney> and there is no atomic.LoadBool() in 1.2.1, so this is the closest
<anastasiamac> davecheney: ah!
<anastasiamac> davecheney: is it possible to add this info to codebase and/or PR?
<davecheney> sure, i'll amend the description
<anastasiamac> \o/
<davecheney> done
<wallyworld> davecheney: just saw this branch in when i pulled master "bugger-off-version-binary-os". that is awesome, most aussie branch name ever
<davecheney> wallyworld: :)
<mup> Bug #1465873 changed: Environment.Users does not take into consideration the current environment <juju-core:Fix Released by waigani> <https://launchpad.net/bugs/1465873>
<mup> Bug #1500298 opened: API message logging should have field blacklist <security> <tech-debt> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1500298>
<mup> Bug #1500298 changed: API message logging should have field blacklist <security> <tech-debt> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1500298>
<wallyworld> rogpeppe: heyya, can i get a review of a small charm.v6-unstable pr when you have time? https://github.com/juju/charm/pull/154
<wallyworld> rogpeppe: also, i've made a charmrepo.v2, since i also need to change other methods besides just adding a new interface. i'll push a deps update once the above pr lands, plus there's also a failing test that slipped through onto gh
<urulama> wallyworld: morning, he'll be here in a "few minutes"
<urulama> wallyworld: still a bit early for UK :)
<wallyworld> urulama: no worries :-)
<urulama> wallyworld: and welcome back, hope you had a great time off
<wallyworld> urulama: yeah, had a wedding to go to, was good
<voidspace> frobware: I saw the updated 1:1 time by the way
<voidspace> frobware: fine
<frobware> voidspace, great. sorry for the short notice. :)
<wallyworld> urulama: with mongo, do you know a way to detect if javascript is enabled via the session or some such? i have a bunch of charmrepo tests failing because i'm using juju-mongo without javascript engine. i can add a skip for those tests where mongo doesn't support javascript
<urulama> wallyworld: hm, isn't there "isJavascriptEnabled" or something like that?
<urulama> wallyworld: i can take a look, but after the call
<rogpeppe> wallyworld: have you actually found some instances of old charms that have a series field in?
<wallyworld> not sure, hence the question, but i'll go searching for a func name like that. ty
<wallyworld> rogpeppe: in a gz file
<wallyworld> which contained bundles
<wallyworld> rogpeppe: it has Series: ""
<wallyworld> so it needs to be handled
<rogpeppe> wallyworld: hmm, bummer
<wallyworld> yeah:-(
<rogpeppe> wallyworld: um, what's a charm metadata file doing in json format anyway?
<wallyworld> rogpeppe: nfi
<wallyworld> rogpeppe: see allcharms.json.gz in the charmrepo
<rogpeppe> wallyworld: so not a charm in the charm store then?
<wallyworld> no
<rogpeppe> wallyworld: i don't think you should take allcharms.json as representative
<rogpeppe> wallyworld: it was generated from actual charms that do not have series entries
<rogpeppe> wallyworld: for example, the first entry in that file is cs:precise/apache2-12
<wallyworld> rogpeppe: sure, but legacy charms won't
<rogpeppe> wallyworld: but if you get that entry from the charm store, it doesn't have a series field in its metadata
<wallyworld> rogpeppe: hmmm, ok. so an option is to change that json file then
<wallyworld> delete all the Series: ""
<wallyworld> rogpeppe: although, the code is safe, i think it's best to be robust
<wallyworld> read everything, write concisely
<wallyworld> who knows what local charms people will have
<wallyworld> i'd rather err on the side of caution
<rogpeppe> wallyworld: ok, i guess
<rogpeppe> wallyworld: but i'm not convinced about the approach in that PR
<rogpeppe> wallyworld: it means that the series will be lost when marshaling the metadata
<wallyworld> lost how? it's converted to a slice
<wallyworld> oh
<wallyworld> marshalling
<wallyworld> yes, but series as a string literal is deprecated
<wallyworld> s/deprecated/strongly discouraged
<wallyworld> i think we just need to handle reading older bundles
<rogpeppe> wallyworld: it'll be lost even if it's a slice, no?
<rogpeppe> wallyworld: anyway, i thought we were calling it SupportedSeries
<wallyworld> rogpeppe: oh, hmmm. i think you're right. in which case it's kinda moot
<wallyworld> rogpeppe: the spec says "series" though
<wallyworld> i guess we can change it
<wallyworld> will be a lot easier
<wallyworld> rogpeppe: i think i'll just change the spec and ask for forgiveness, and can repropose this pr if needed if the spec has to be changed back
<rogpeppe> wallyworld: i think you can call the field SupportedSeries in the Meta type
<wallyworld> yes
<wallyworld> and make the serialised field name "supported-series"
<rogpeppe> wallyworld: theoretically you can actually have two fields that both marshal to "series"
<rogpeppe> wallyworld: but actually better to use "supported-series" in the metadata
<wallyworld> yeah
<rogpeppe> wallyworld: it's a good name anyway - it reflects what it does
<wallyworld> rogpeppe: ok, i'll delete the current pr and repropose a charm.v6-unstable pr with SupportedSeries
<rogpeppe> wallyworld: thanks
<wallyworld> rogpeppe: ty too. btw, do you know how to detect if mongo javascript is enabled? i have a bunch of failing tests
<wallyworld> because i have juju-mongodb without js
<rogpeppe> wallyworld: i don't think we can rely on mongo-with-js
<wallyworld> and so charmrepo tests fail
<rogpeppe> wallyworld: because it's not supported on some architectures
<wallyworld> rogpeppe: there's about 4 tests which check that stats are updated which fail without js
<rogpeppe> wallyworld: ah yes, charmstore uses mongojs
<wallyworld> rogpeppe: yeah, so i'd like just to skip those tests if there's no js
<rogpeppe> wallyworld: perhaps just ignore those failures on your machine - CI will tell you if the tests really pass
<wallyworld> oh, ok :-) but i was hoping for something nicer :-)
<rogpeppe> wallyworld: i'm sure it wouldn't be hard to work out
<rogpeppe> wallyworld: just execute some js and see what happens
<wallyworld> rogpeppe: yeah i had a quick look but nothing jumped out, thought i'd ask just in case you knew off the top of your head
<wallyworld> there's an error i can just catch, but i'd rather detect it and skip
<wallyworld> anyways, it's no biggie
<rogpeppe> wallyworld: you could use Database.Run to run http://docs.mongodb.org/manual/reference/command/eval/#dbcmd.eval
<rogpeppe> wallyworld: but eval is deprecated, so i don't know how long that'll carry on working
<wallyworld> yeah, i saw that and came to the same conclusion
<urulama> wallyworld: this might help you with JS https://github.com/juju/charmstore/blob/v5-unstable/internal/storetesting/flag.go
<wallyworld> urulama: ty, will look after dinner
<wallyworld> rogpeppe: quick one to rename series https://github.com/juju/charm/pull/155
<rogpeppe> wallyworld: so we want to support "supported-series: trusty" as well as "supported-series: [trusty]" ?
<wallyworld> rogpeppe: ah, bollocks, will fix that, thanks. was rushing to get it done before dessert
<wallyworld> i'll check that we just want []string
<rogpeppe> wallyworld: tbh i'm not sure that categories and tags work as intended in that respect
 * rogpeppe checks
<wallyworld> rogpeppe: i just checked - actually, supported-series doesn't support string literals
<wallyworld> it requires a list
<wallyworld> so it's ok as is
<wallyworld> we now just ignore any series attribute
<wallyworld> as it is meaningless
<rogpeppe> wallyworld: you're right - i misread parseStringList
<rogpeppe> wallyworld: sorry for false alarm :)
<wallyworld> np :-)
<wallyworld> i didn't change parseStringList I don't think
<wallyworld> rogpeppe: thanks, after dinner, i'll propose that tweak to charmrepo to make the tests happy
<rogpeppe> wallyworld: no, i thought that parseStringList accepted a single string
<rogpeppe> wallyworld: i'd be interested to know what API-breaking changes you were thinking of for charmrepo, BTW
<rogpeppe> wallyworld: BTW, $$merge$$ works on the charm repository
<wallyworld> rogpeppe: InferRepository() returns Interface
<rogpeppe> wallyworld: so better to use that than the big green button :)
<wallyworld> rogpeppe: oh sorry, i thought there was no bot
<rogpeppe> wallyworld: it's relatively recent. no worries.
<wallyworld> rogpeppe: so even if I use a new SupportedSeries interface, that method will need to change also
<wallyworld> gotta run, dibber, bbiab
<wallyworld> dinner
<rogpeppe> wallyworld: enjoy
<wallyworld> rogpeppe: sorry, last one for a bit, it's small https://github.com/juju/charmrepo/pull/30
<wallyworld> rogpeppe: wtf, tests pass for me but github says checks fail due to a charmstore.v5 build error
<wallyworld> and i haven't updated charmstore.v5 dep for this pr
<dimitern> frobware, voidspace, dooferlad, TheMue, I'm back, we can start the planning HO
<frobware> dimitern, OK
<dooferlad> dimitern: on it
<dimitern> omw
<frobware> dimitern, shall we use the link from friday's meeting?
<dimitern> frobware, sgtm
 * dimitern ah.. monday hangouts..
<rogpeppe> wallyworld: you haven't added charmrepo.v2 to the deps
<wallyworld> ah
<dimitern> rejoining..
<rogpeppe> wallyworld: oh, sorry my mistake
<rogpeppe> wallyworld: this is targetted at v2 of course
<wallyworld> rogpeppe: it is. and with the flag stuff, uros pointed me at that code, it's straight from charmstore
<dimitern> voidspace, ping
<rogpeppe> wallyworld: in the charmstore, that code is in the storetesting package
<rogpeppe> wallyworld: which is there specifically for tests
<rogpeppe> wallyworld: we don't want to have that stuff in production code
<wallyworld> ok, i'll tweak it
<rogpeppe> wallyworld: thanks
<wallyworld> rogpeppe: there's a testing package in the top level, i'll move it to the top
<rogpeppe> wallyworld: i wonder whether you shouldn't just use jujutesting.MgoServer.WithoutV8
<wallyworld> could do, i wonder why that other code doesn't do that
<rogpeppe> wallyworld: although... i'm not convinced by the heuristics for deciding whether to set that
<rogpeppe> wallyworld: are you using /usr/lib/juju/bin/mongod ?
<wallyworld> yes
<rogpeppe> wallyworld: in which case that'll probably do the trick
<rogpeppe> wallyworld: it certainly seems like the right place to look at
<wallyworld> i'll try
<wallyworld> rogpeppe: although for consistency, may be nice to use the same code
<rogpeppe> wallyworld: i'll suggest changing charmstore actually. and possibly changing juju/testing too.
<wallyworld> ok
<rogpeppe> wallyworld: ISTM there should be one canonical place for this info
<wallyworld> yeah
<voidspace> dimitern: omw, sorry
<wallyworld> rogpeppe: just using that WithoutV8 flag seems to have worked
<rogpeppe> wallyworld: cool
<voidspace> dimitern: need to do authentication dance, will take a minute
<voidspace> dimitern: aaaand you're not in the standup hangout - got a link for me?
<wallyworld> rogpeppe: so jenkins still fails, should i just try doing another PR to update charmstore.v5-unstable to use charmrepo.v2?
<wallyworld> charmstore.v5 imports charmrepo.v1
<wallyworld> and the build of charmstore.v5 fails
<rogpeppe> wallyworld: if you run godeps -t ./... > dependencies.tsv, you'll probably end up with something that works
<rogpeppe> wallyworld: you'll still need charmrepo.v1 as a dependency, but that will be fixed in time
<wallyworld> makes sense, ta
<urulama> wallyworld, rogpeppe: we don't have jenkins jobs picking up charmrepo.v2 or charmstore.v5, only charmstore.v1 and cs.v5-unstable
<urulama> wallyworld, rogpeppe: if you're relying on our CI that is
<rogpeppe> urulama: it looks like the bot is picking up that branch ok
<urulama> hm
<urulama> it shouldn't :D
<rogpeppe> urulama: although it's possible it's ignoring the target branch i guess
<rogpeppe> urulama: well, it should really :)
<rogpeppe> urulama: but it might be doing it erroneously
<urulama> rogpeppe: ah, ok, the test bot, but the merge bot won't work
<rogpeppe> urulama: ah, ok
<rogpeppe> urulama: well, if the test bot says it passes, i don't mind clicking the big green button while we wait for CI to catch up
<rogpeppe> urulama: i'm wondering if we should be using v2-unstable though
<urulama> rogpeppe: v2-unstable would be following our protocol, yes
<wallyworld> rogpeppe: that worked
<rogpeppe> wallyworld: given that we probably haven't finished making breaking changes to charmrepo, could you retarget at v2-unstable please?
<rogpeppe> wallyworld: cool
<wallyworld> ok
<wallyworld> i'll merge this change first and then rename
<wallyworld> rogpeppe: assuming you give it a +1
<rogpeppe> wallyworld: i'd prefer not to create charmrepo.v2
<rogpeppe> wallyworld: as it gives the impression that it's stable
<wallyworld> ok, i'll update now before landing
<wallyworld> rogpeppe: ok, so v2-unstable created, and here's the new pr https://github.com/juju/charmrepo/pull/31
<rogpeppe> wallyworld: LGTM
<wallyworld> rogpeppe: awesome, ty
<wallyworld> now i can start the Interface changes
<wallyworld> rogpeppe: how long do i need to wait for the $$merge$$ on charmrepo to be picked up?
<rogpeppe> wallyworld: we use :shipit: instead
<wallyworld> ah, oops, ty
<rogpeppe> wallyworld: but it might not work, as i'm not sure the merge bot picks up merges for all targets
<rogpeppe> wallyworld: try it and see
<wallyworld> i've tried it, let's see
<wallyworld> and done
<voidspace> frobware: just realised I pick my daughter up from school at 3:15pm, can we move 1:1 to 4.00pm?
<frobware> voidspace, done
<voidspace> frobware: thanks
<katco> wwitzel3: ericsnow: let's just meet in moonstone for iteration planning
<wwitzel3> katco: rgr
<sinzui> mgz: can you review https://github.com/juju/juju/pull/3388
<mgz> sinzui: looking
<katco> natefinch: we're meeting in moonstone for iteration planning
<wwitzel3> katco: browser crashed, brb
<katco> wwitzel3: no worries
<sinzui> mgz: sorry http://reviews.vapour.ws/r/2771/
<mgz> sinzui: lgtm
<sinzui> thank you mgz
<mgz> hm, I think I will install ntpd on reviews. my comment is now 8 minutes from now.
<sinzui> mgz: yeah. I have installed ntpd on most of our machines because something eventually breaks when the instance lives more than a month
<frobware> dimitern, ping; do you have time for a quick HO regarding juju-br0?
<dimitern> frobware, sure
<frobware> dimitern, let's use the standup HO
<dimitern> frobware, ok, omw
<frobware> voidspace, can we delay 15 mins?
<frobware> voidspace, in a HO with dimitern and sean looking at the wily 4.2 bug
<voidspace> frobware: yup, np
<frobware> voidspace, as we're making some progress any objections if we reschedule for tomorrow?
<voidspace> frobware: no problem
<voidspace> frobware: maybe give me a chance to get my branches landed by the time of the network call (maybe...)
<voidspace> frobware: good that you're making progress
<TheMue> dimitern: Regarding constraints, is the section in the network model document still up-to-date?
<dimitern> TheMue, it should be - which part specifically?
<TheMue> dimitern: the meaning of the spaces constraint, to add it to the docs
<dimitern> TheMue, that part hasn't changed
<dimitern> TheMue, the format - comma-delimited list with ^ prefix for negatives
<TheMue> dimitern: ok, thx. only wanted to get sure as we had some discussions in London
<rogpeppe> anyone fancy a tiny review? https://github.com/juju/juju/pull/3390
<dooferlad> frobware, dimitern, TheMue: hangout
<mup> Bug #1499613 changed: Windows device path mismatch in volumeSuite.TestListVolumesStorageLocationBlockDevicePath <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Fix Released by cmars> <juju-core 1.25:Fix Released by cmars> <https://launchpad.net/bugs/1499613>
<dimitern> dooferlad, we're still in a call, will join soon
<voidspace> dimitern: frobware: eta?
<voidspace> dimitern: frobware: jay is looking at Andrew's bug but we can't really help without him, and we don't have anything else to talk about
<voidspace> so maybe we defer to email for that bug
<mup> Bug #1499613 opened: Windows device path mismatch in volumeSuite.TestListVolumesStorageLocationBlockDevicePath <blocker> <ci> <regression> <test-failure> <unit-tests> <windows> <juju-core:Fix Released by cmars> <juju-core 1.25:Fix Released by cmars> <https://launchpad.net/bugs/1499613>
<voidspace> dimitern: frobware: Jay has gone for breakfast :-)
<dimitern> voidspace, ah :/ too bad
<dimitern> voidspace, dooferlad, that 4.2 kernel bug seems like something is messed up with the hardware network stack - juju does what's expected
<voidspace> dimitern: ah...
<dimitern> i.e. ifdown eth1 looks like it worked, but the device is still up after that
<rogpeppe> here's an incremental change to apiserver/common, converting a macaroon discharge-required error with ServerError: http://reviews.vapour.ws/r/2774/
<voidspace> dimitern: Jay was very suspicious about the fact that it was receiving packets with itself as the source address
<dimitern> voidspace, yeah
<voidspace> dimitern: he suspected a loop somewhere in the network stack
<dimitern> voidspace, it looks like it yeah
<dimitern> frobware, I think I'll call it a day and resume with the demo prep first time tomorrow morning
<frobware> dimitern, agreed. been in a meeting since 9am....
<katco> ericsnow: wwitzel3: natefinch: be there in a sec
<katco> natefinch: ready?
<natefinch> katco: sorry, my wife is late coming home from picking up Lily, so I still have two kids...
<katco> natefinch: we're going to defer for half an hour; ericsnow is occupied until then anyway
<natefinch> katco: ok, great (sorta).
<voidspace> frobware: dooferlad: TheMue: if you have a chance: http://reviews.vapour.ws/r/2775/
<voidspace> tasdomas: you're OCR I believe, so if you get a chance http://reviews.vapour.ws/r/2775/
<thumper> mramm: ping?
<mup> Bug #1500613 opened: configstore should break fslock if time > few seconds <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1500613>
<mup> Bug #1500613 changed: configstore should break fslock if time > few seconds <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1500613>
<thumper> davechen1y: http://data.vapour.ws/juju-ci/products/version-3112/maas-1_8-deploy-trusty-amd64/build-884/machine-0.log.gz
<thumper> davechen1y: look at the panic there
<thumper> panic: cannot pass empty version to VersionSeries(), github.com/juju/utils/series/supportedseries.go:185
<thumper> utils/series looks like your recent stuff
<thumper> found this looking through the CI failure on http://reports.vapour.ws/releases/3112/job/maas-1_8-deploy-trusty-amd64/attempt/884
<thumper> which just failed on master
<wallyworld> axw: perrito666: anastasiamac: i'm finishing another meeting, will be 10 mins or so late
<axw> ok
<anastasiamac> wallyworld: sure \o/
<anastasiamac> Just noticed this in the build - + bzr whoami 'J. Random Hacker <jrandom@example.org>'
<anastasiamac> very funny
<perrito666> Could any of you join the hangout? Trying my connection axw anastasiamac
<axw> perrito666: coming
#juju-dev 2015-09-29
<perrito666> I hate something in my network but I am not sure what
<thumper> your computer?
<thumper> I had mine
<perrito666> It might be my isp, or perhaps my wlan card, but I am not sure which of those is the one I hate today
<sinzui> Hi wwitzel3 ericsnow: Can we get this reviewed and merged by tommorrow so that we can propose 1.25-beta1 for wily https://github.com/juju/charm/pull/157
 * sinzui not know the policy for reviews and merges in the charm project
<mup> Bug #1500676 opened: use-default-secgroup in environments.yaml not respected <cpec> <juju-core:New> <https://launchpad.net/bugs/1500676>
<axw> wallyworld: FYI - https://bugs.launchpad.net/juju-core/+bug/1500703
<mup> Bug #1500703: provider/gce: DeviceNames populated in volume attachment info are invalid <gce-provider> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1500703>
<wallyworld> ok
<axw> wallyworld: gce volumes don't work properly
<axw> wallyworld: working on a fix now
<wallyworld> sounds good, ty
<wallyworld> hopefully you'll get access to that maas setup soon too
<mup> Bug #1500703 opened: provider/gce: DeviceNames populated in volume attachment info are invalid <gce-provider> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1500703>
<mup> Bug #1500703 changed: provider/gce: DeviceNames populated in volume attachment info are invalid <gce-provider> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1500703>
<mup> Bug #1500721 opened: provider/ec2: volumes should be attached with name "xvdf" instead of "xvdf1" by default <ec2-provider> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1500721>
<rogpeppe> dimitern: oh great OCR of the day! :-) would you be able to do a review for me, by any chance please? http://reviews.vapour.ws/r/2774/
<dimitern> rogpeppe, sure thing :) looking
<rogpeppe> dimitern: ta muchly! :)
<voidspace> dimitern: if you get a chance... http://reviews.vapour.ws/r/2775/
<dimitern> frobware, sorry, omw ~2m
<frobware> dimitern, ok
<voidspace> dimitern: actually, extended the test and found a bug... fixing
<voidspace> dimitern: and done...
<axw> wallyworld: do you have a moment to review http://reviews.vapour.ws/r/2776/?
<wallyworld> axw: sure, about to eat, so i'll pick it up after if i don't get it done
<axw> wallyworld: thanks
<wallyworld> axw: where does "scsi-0Google_PersistentDisk" come from?
<wallyworld> is that something we made up?
<dimitern> voidspace, sweet! :)
<mup> Bug #1500760 opened: juju space/subnet subcommands need to respect -e env-name flag <network> <usability> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1500760>
<dimitern> voidspace, TheMue, standup?
<dimitern> jam, fwereade ^^
<rogpeppe> dimitern: any chance of that review? :)
<dimitern> rogpeppe, in standup now, will reply in ~20m
<rogpeppe> dimitern: ta
<voidspace> dimitern: omw
<dimitern> rogpeppe, LGTM
<rogpeppe> dimitern: tvm
<dimitern> voidspace, reviewed as well
<dimitern> frobware, how's it going so far with the bootstrapping etc?
<frobware> dimitern, tis bootstrapped
<frobware> dimitern, want to go through the motions over HO?
<dimitern> frobware, ok - joining the standup one
<frobware> dimitern, 10 mins - ok?
<dimitern> frobware, np
<mup> Bug #1500769 opened: provder/gce: default block source not set <gce-provider> <juju-core:In Progress by axwalk> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1500769>
<dimitern> dooferlad, just a reminder to add cards for the -e flag issue
<dimitern> :)
<dooferlad> sure
<frobware> dimitern, I'm back...
<dimitern> frobware, omw
<axw> wallyworld: good news, I got Ceph working on GCE with storage.
<wallyworld> farking awesome
<wallyworld> what sort of disks?
<axw> wallyworld: SSDs
<wallyworld> much to change in the charm?
<axw> wallyworld: the non-local kind
<wallyworld> btw, did you see my question above?
<axw> wallyworld: not a lot
<axw> wallyworld: no sorry, reading
<wallyworld> yeah, adding storage should not be a big deal
<axw> wallyworld: it's not made up, that's what the /dev/disk/by-id path looks like. it's not *documented* though, so it's a bit sketchy. hence the TODO, which involves changing mongo and apiserver, etc.
<wallyworld> np, thanks, i didn't quite fully grok it
<axw> wallyworld: there's two paths in /dev/disk/by-id: that one, and google-<device name>
<wallyworld> ah ok
<axw> wallyworld: the diskmanager publishes the undocumented one :)
<wallyworld> :-)
<axw> I better log a bug for that I think
<wallyworld> axw: should we fix that now lest we have migration issues later
<wallyworld> before 1.25 ships
<axw> wallyworld: I'll try to do that next
<axw> wallyworld: there's another bug I was about to fix, which is that the gce provider doesn't set a default volume source
<axw> wallyworld: I can leave that for now and fix this
<wallyworld> i think that would be wise, just imo
<axw> wallyworld: I'd prefer to get the quick fix in first though, if you don't mind
<wallyworld> sure, +1
<axw> in case I run out of time
<wallyworld> done
<axw> wallyworld: thanks
<TheMue> frobware: so, back from the doc, brain is - at least as much as possible ;) - ok, now continuing on the "online" docs (part of the cmd). want to propose it today
<frobware> TheMue, sounds good on all fronts... :)
<TheMue> frobware: yeah, absolutely. and then I can start with the user docs, already got the needed info from nick
<mup> Bug #1500803 opened: provider/gce: HardwareId should use the documented "google-" format <gce-provider> <juju-core:In Progress by axwalk> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1500803>
<frobware> dooferlad, do you run spaces tests on CI right now?
<dooferlad> frobware: we don't have any yet, so no
<dooferlad> frobware: that is what I am fixing :-)
<dooferlad> frobware: there are unit tests
<dooferlad> frobware: and those do run
<frobware> dooferlad, ok, that's good -- if you had said "yes" then dimitern and I might have been bemused
<frobware> dooferlad, because it's not clear at the moment which access-key/secret-key combo allows spaces to work
<dooferlad> frobware, dimitern: http://pastebin.ubuntu.com/12610778/
<dooferlad> frobware, dimitern: http://pastebin.ubuntu.com/12610786/
<frobware> dooferlad, interesting. that's what I was trying to do.
<frobware> dooferlad, as in my test charm for spaces was to use ubuntu
<dooferlad> frobware: well, I have it automated now. Will check in and you can use it.
<frobware> dooferlad, if you could bung me some instructions that would be appreciated
<dimitern> dooferlad, you need to use a different service name if you're deploying the same charm twice
<dimitern> dooferlad, i.e. juju deploy cs:trusty/ubuntu ubuntu1 --constraints spaces=dmz ; # or maybe even just: juju deploy ubuntu ubuntu2 --constraints...
<dooferlad> dimitern: so http://reviews.vapour.ws/r/2700/ isn't quite the perfect instructions then :p
<dooferlad> dimitern: will try that...
<dimitern> dooferlad, in the steps there I'm using 1 deploy followed by 1 add-unit
<dooferlad> dimitern: ok, ok, problem between keyboard and chair
<dimitern> :)
<wallyworld> rogpeppe: urulama: whenever you get time, here's a new charmrepo.v2 PR which adds a repo which sits on top of a given charm dir https://github.com/juju/charmrepo/pull/32
<urulama> wallyworld: kk, sometime during the day, will be ready before your morning
<rogpeppe> wallyworld: thanks, will look
<wallyworld> tyvm
<urulama> wallyworld: hah, not a small one :P
<wallyworld> nah sorry
<wallyworld> rogpeppe: urulama: it's set up to allow core to juju deploy with a dir path rather than a cs:foo or local:bar url
<rogpeppe> wallyworld: i wonder if it's right that the caller should have no control as to which kind of repository is used
<rogpeppe> wallyworld: for example if you're a server, you almost certainly don't want to be statting local paths
<rogpeppe> wallyworld: when resolving URLs
<wallyworld> rogpeppe: i don't quite follow, if you use the Infer method, the first checks are for cs: or local:. Only if neither of those does it assume a path
<wallyworld> this is used on the client when deploy or upgrade
<rogpeppe> wallyworld: hmm, doesn't ParseReference always return a schema of "cs" if none is specified
<rogpeppe> ?
<rogpeppe> wallyworld: so if you do "juju deploy wordpress" and there's a local wordpress directory that you want to deploy, i think it'll try to deploy cs:wordpress anyway, won't it?
<rogpeppe> wallyworld: is that the behaviour you want?
<wallyworld> rogpeppe: ah right so with paths with only single words (not foo/bar) it will assume cs:
<wallyworld> i'll have to change that
<wallyworld> try for path first
<wallyworld> i had it that way and then changed it
<dooferlad> dimitern: can you view the spaces and subnets in juju status?
<dimitern> dooferlad, no
<dooferlad> dimitern: bother
<dimitern> dooferlad, use space list for the time being - we'll add it to status at some point
<rogpeppe> wallyworld: have you got a few moments for a chat?
<wallyworld> rogpeppe: sure, can i ping you in say 10 minutes?
<rogpeppe> wallyworld: sure
<wallyworld> ty
<urulama> wallyworld, rogpeppe: do talk about the "supported-series" vs "series" as well, otherwise, we can do it tomorrow
<wallyworld> oh? i changed it to supported-series
<rogpeppe> wallyworld: eco guys don't like supported-series unfortunately
<wallyworld> oh damn
<wallyworld> i'll have to add back that parsing stuff
<rogpeppe> wallyworld: i've suggested an alternative
<wallyworld> in the doc?
<rick_h_> wallyworld: in the bug on juju/charm that was opened
<rick_h_> wallyworld: see https://github.com/juju/charm/issues/156
<wallyworld> oh, ok, i don't normally follow gh bugs
<rick_h_> wallyworld: :P
<wallyworld> lp forever
<wallyworld> rogpeppe: free now? https://plus.google.com/hangouts/_/canonical.com/tanzanite-stand
 * fwereade had very little sleep last night and is going to have a nap before his meeting in 2h
<dimitern> frobware, I found the problem
<mup> Bug #1500843 opened: Windows ftb due to unused import is diskmanager <blocker> <ci> <regression> <windows> <juju-core:In Progress by gz> <https://launchpad.net/bugs/1500843>
<dimitern> frobware, while I *still* don't quite get why it's working fine when using the root keys, the reason we're getting this rate limit error
<dimitern> frobware, as we saw in the logs is "InvalidGroup.NotFound" for the SG juju created for the machine
<dimitern> frobware, which is a "classic" SG - not attached to the VPC explicitly but
<mup> Bug #1500843 changed: Windows ftb due to unused import is diskmanager <blocker> <ci> <regression> <windows> <juju-core:In Progress by gz> <https://launchpad.net/bugs/1500843>
<dimitern> frobware, it might be easier if we get on a HO actually :)
<mgz> ocr: ^ want to review my branch?
<voidspace> dimitern: so I added the missing test and extracted the functions as you suggested
<voidspace> dimitern: switching one of the machines to have JobManageEnviron in the MachineTemplate caused AddMachines to return an empty slice of machines
<voidspace> dimitern: do you think it's worth me digging into that to work out why?
<frobware> dimitern, yep, HO's easier... :)
<voidspace> dimitern: ah, found it - "state server jobs specified but not allowed"
<voidspace> hmmm...
<dimitern> frobware, joining standup HO?
<frobware> dimitern, ok
<wallyworld> rogpeppe: here's that charm change https://github.com/juju/charm/pull/158
<urulama> wallyworld: i think he's out on lunch
<wallyworld> urulama: np, i'm about to go to sleep, so if it's +1, could you shipit for me
<urulama> wallyworld: will do
<wallyworld> ty
<frobware> dimitern, coming back... (wrong tab)
<mup> Bug #1246156 changed: plugins should receive environment to operate in unambigiously <juju-core:Triaged> <https://launchpad.net/bugs/1246156>
<mup> Bug #1246156 opened: plugins should receive environment to operate in unambigiously <juju-core:Triaged> <https://launchpad.net/bugs/1246156>
<voidspace> dimitern: ping
<dimitern> voidspace, pong
<voidspace> dimitern: upgrade step added to 1.25
<voidspace> dimitern: does it need adding to 1.26 (master) as well (i.e. to steps126)
<voidspace> dimitern: that would only be needed if someone upgraded from a 1.25 beta, which doesn't have the preferred field, to 1.26
<voidspace> without going to 1.25 final first (which would do the upgrade for them)
<voidspace> dimitern: I don't think that's supported, so I don't think it needs to be in steps126
<voidspace> dimitern: but would like confirmation
<voidspace> to be clear, it is (or will) be in steps124 and steps125 on master - I don't think it needs to be in steps126 as well
<dimitern> voidspace, hmm.. let me think for a moment
<dimitern> voidspace, I think since 1.25 is not released yet, we *shouldn't* need to add it to 1.26 steps
<voidspace> dimitern: that's my understanding
<voidspace> dimitern: they should upgrade to 1.25 "some released version" first
<voidspace> which would do the upgrade
<voidspace> then go to 1.26
<dimitern> voidspace, yeah - as far as users are concerned - it might screw us up a bit when testing upgrades etc. with upload-tools
<voidspace> dimitern: heh, if it does it's a few lines copy & paste to fix
<voidspace> dimitern: it's merged on 1.24, in the queue for 1.25 and master
<dimitern> voidspace, awesome!
<natefinch> modular
<natefinch> lol wrong window
<voidspace> how to kill a colleague without being found out
<voidspace> oops, wrong window...
<natefinch> rofl
<voidspace> ;-)
<voidspace> dimitern: heh, landed on 1.25 - master is blocked
<dimitern> voidspace, 1.25 is the important one anyway for us now :)
<voidspace> dimitern: yep, I've marked the bug as fix committed
<voidspace> dimitern: the blocker on master is trivial, so should be fixed soon anyway
<dimitern> voidspace, sweet!
<voidspace> dimitern: hah, so the ec2 provider implementation of NetworkInterfaces already fetches all associated subnets
<voidspace> dimitern: the only thing needed for SubnetInfo that it doesn't keep is the availability zone information
<voidspace> dimitern: so easy enough to add (somehow....)
<voidspace> dimitern: looking at whether extending network.InterfaceInfo to keep this info is sensible
<dimitern> frobware, *facepalm* .. and I was wondering why it's not working!
<dimitern> voidspace, I'm not sure adding AZ to InterfaceInfo is worth the effort
<dimitern> frobware, us-east-1 does not have a default VPC
<voidspace> dimitern: but I *need* it for the Subnets call
<voidspace> dimitern: if I add it to InterfaceInfo, then when Subnets(instId) is called I can just call NetworkInterfaces(instid) and build the SubnetInfo from that
<voidspace> dimitern: otherwise I call NetworkInterface(instId) (which internally calls the ec2 describe subnets for each subnet) - and then I have to call describe subnet *again* for each nic to get the full SubnetInfo
<voidspace> or I can add []AZ to InterfaceInfo
<dimitern> frobware, what a way to waste 2h..
<dimitern> voidspace, I see
<voidspace> dimitern: or I can find another way, which might be better than extending InterfaceInfo
<voidspace> a new method which is used by both NetworkInterfaces and Subnets
<dimitern> voidspace, ok then - I see no harm in adding AvailabilityZone to InterfaceInfo
<voidspace> it had better be a slice to support other providers where a subnet can span availability zones
<dimitern> voidspace, fair point about using a slice - I'll leave it to your judgment then
<frobware> dimitern, so what's the redux here?
<dimitern> frobware, look carefully before spending hours debugging :)
<dimitern> frobware, eu-central-1 is one of the few regions (if not the only one) on the shared aws account with a default VPC
<dimitern> frobware, I've added a few subnets to it, backed out my changes and I'm trying to recreate the demo there, will paste you some steps shortly
<frobware> dimitern, ok. but I guess a better error message is in order for the places/cases where this currently happens.
<dimitern> frobware, re the redux - default VPC is required for all this MVP functionality to work
<dimitern> frobware, yes, definitely - e.g. error out if spaces cannot be supported (due to a missing default VPC for example at least for now)
<voidspace> dimitern: which region do we have default vpc in?
<dimitern> voidspace, eu-central-1
<voidspace> cool, thanks
<voidspace> dimitern: doing some manual testing
<voidspace> about to do addressable containers (1.25)
<lazypower> ooo addressable containers \o/
<voidspace> heh
<voidspace> "ERROR environment destruction failed: destroying storage: destroying volumes: destroying "vol-fa3c3117": The volume 'vol-fa3c3117' does not exist. (InvalidVolume.NotFound)"
<voidspace> that's a new one on me. I didn't create any volumes.
<dimitern> voidspace, go in the EC2 web console and check the Volumes page
<voidspace> dimitern: presumably someone else's volume...
<voidspace> which is worrying that my call to destroy-environment would touch it...
<dimitern> and IT WORKS!
<voidspace> yay
<voidspace> whatever IT is...
<voidspace> spaces hopefully...
<dimitern> voidspace, frobware: it's still deploying some instances, but they all end up where they should and AZ distribution also works: http://paste.ubuntu.com/12613764/
<voidspace> dimitern: awesome, nice work
<mup> Bug #1464679 changed: juju status oneline format missing info <landscape> <status> <ui> <juju-core:Fix Released by waigani> <https://launchpad.net/bugs/1464679>
<dimitern> voidspace, frobware, here are the steps: http://paste.ubuntu.com/12613782/
<frobware> dimitern, presumably I should be able to replicate this with my local aws account as it would have been created with a default VPC
<dimitern> frobware, indeed - that's why it worked with my personal account but not with the shared one on us-east-1
<dimitern> frobware, and the mystery around root/non-root creds about the shared account - no mystery at all :) I actually tested with my account it turned out (I did set the shared acc keys but *then* overrode them in env. yaml)
<frobware> dimitern, if you don't have a default VPC (e.g. us-east-1) is there a way to specify one as part of the setup/demo?
<dimitern> frobware, not yet
<dimitern> frobware, because of the issues we were facing during the HO
<dimitern> (with a non-default VPC the behavior is "as prescribed" in the AWS API docs, no "EC2-Classic"-sorta compatible quirks - e.g. groupName vs groupId)
<frobware> dimitern, what happens if you don't do this step: for i in subnet-0fb97566 subnet-d27d91a9; do juju subnet add $i default; done
<dimitern> frobware, nothing - I just wanted to test the 3 cases: using default VPC subnets, non-default subnets with and without auto-public-IP set,
<dimitern> just for completeness sake
<frobware> dimitern, OK. Just wanted to check (and understand) that you could do it with just public/dmz.
<dimitern> frobware, yeah - surprisingly for me it works even in the last case (no public IPs) .. somewhat
<frobware> dimitern, so next steps are to try some charms... :)
<dimitern> frobware, the machine is up and can talk to the api server, you can even ssh into it (as we're jump-hosting via the bootstrap node), but it can't access the archive (dns works, but not connecting)
<dimitern> frobware, yep! :)
<dimitern> frobware, I'll leave the env running for tonight, checking the connectivity periodically with a watch script, just to make sure it works longer than 5m :)
<dimitern> and then it's time to get a drink \o/
<frobware> dimitern, yep, nice one!
<katco> natefinch: wwitzel3: ericsnow_afk: are all of you available for a chat
<katco> ?
<mup> Bug #1500981 opened: juju-db segfault while syncing with replicas <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1500981>
<ericsnow> katco: sure
<katco> natefinch: wwitzel3: ericsnow: need the whole team for this one
<katco> ericsnow: i'm in moonstone now if you have a sec though
<mup> Bug #1500996 opened: Juju says "bad JSON product data" for valid simplestreams <ci> <simplestreams> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1500996>
<perrito666> is there any charm for vivid?
<rick_h_> perrito666: https://jujucharms.com/q/vivid
<katco> natefinch: ping
<perrito666> rick_h_: thanks a lot
<rick_h_> perrito666: any time
<perrito666> what is the middle column supposed to be, series?
<rick_h_> perrito666: yes
<rick_h_> more interesting in a search for not a specific series: https://jujucharms.com/q/apache2
<perrito666> rick_h_: I tried the dropdown filter but only got the ubuntu charm
<rick_h_> perrito666: yea, we try not to promote the non-LTS too much
<katco> natefinch: ping
<rick_h_> perrito666: so if you ask for it it'll give it to you, but deploying workloads is best done on LTS
<perrito666> rick_h_: I used the filter to get the url parameter and then changed for vivid
<rick_h_> perrito666: ah, yea it probably had some other params there by default vs a normal search.
<rick_h_> perrito666: since by default in the /store it shows only promulgated (reviewed charms) and that is the only one
<perrito666> heh, yup, the strange thing is that it returned only one charm, but it was useful for what I was trying
<rick_h_> perrito666: anyway, a handful from there
<katco> natefinch: ping
<wallyworld> wwitzel3: bug 1491688 - where did you get up to? this issue seems familair - did we fix it one other time?
<mup> Bug #1491688: all-machine logging stopped, x509: certificate signed by unknown authority <landscape> <logging> <rsyslog> <sts> <juju-core:Triaged> <juju-core 1.24:Confirmed> <https://launchpad.net/bugs/1491688>
<wwitzel3> wallyworld: yeah, we fixed it a previous time, axw did, but it looks like it might have not made it in to other versions, I got to replication point, but didn't ever start really digging on it
 * thumper looks around carefully
<thumper> anything blowing up?
<wallyworld> wwitzel3: ty, i'll follow up with him
<alexisb> thumper, yes
<alexisb> we have two criticals:
<thumper> :-(
 * thumper sighs
<alexisb> thumper, you know you love bug squad duties
<thumper> except I was poking lxd with a stick
<thumper> among other things
<mwhudson> thumper: you can debug my ppc64le if you'd rather
<thumper> alexisb: where are these criticals?
<alexisb> thumper, lxd is a happening in moonstone hangout if you are interested
<thumper> yeah
<alexisb> I am sure wwitzel3 wouldn't mind if you joined the party
<thumper> I have some info re:lxd
<alexisb> criticals:
<alexisb> bug 1491688
<mup> Bug #1491688: all-machine logging stopped, x509: certificate signed by unknown authority <landscape> <logging> <rsyslog> <sts> <juju-core:Triaged> <juju-core 1.24:Confirmed> <https://launchpad.net/bugs/1491688>
<wallyworld> thumper: fwiw ^^^^^ i thought that one was fixed
<thumper> which ironically is only high
<alexisb> https://bugs.launchpad.net/juju-core/+bug/1335885
<mup> Bug #1335885: destroy-environment reports WARNING cannot delete security group <amulet> <cloud-installer> <destroy-environment> <landscape> <openstack-provider> <security> <uosci> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1335885>
<alexisb> thumper, it is happening on cisco site
<thumper> wallyworld: I thought it was fixed too
<wallyworld> thumper: andrew fixed it apparently, i'll check with him in an hour or so
<thumper> I wonder if it is client version vs. fixed version
<thumper> perhaps the fixed version hasn't made its way to the client site
<wallyworld> could be
<alexisb> thumper, that second one is now causing issues for openstack and sts
<thumper> m'kay
<thumper> wwitzel3: are you talking lxd now?
<wallyworld> thumper: wwitzel3: bug 1417875
<mup> Bug #1417875: ERROR juju.worker runner.go:219 exited "rsyslog": x509: certificate signed by unknown authority <canonical-bootstack> <logging> <regression> <juju-core:Fix
<mup> Released by wwitzel3> <juju-core 1.21:Fix Released by wwitzel3> <juju-core 1.22:Fix Released by wwitzel3> <https://launchpad.net/bugs/1417875>
<wallyworld> haven't read description yet, seems related
<wallyworld> fixed in 1.22
<wallyworld> thumper: wwitzel3: and then this one which seems more likely, fixed in 1.24 bug 1474614
<mup> Bug #1474614: rsyslog connections fail with certificate verification errors after upgrade to 1.24.2 <regression> <juju-core:Fix Released by axwalk> <juju-core 1.24:Fix Released by axwalk> <https://launchpad.net/bugs/1474614>
<wallyworld> so maybe above fix needs to be checked
<thumper> wallyworld: arse
<wallyworld> balls?
<wallyworld> thumper: ?
<thumper> wallyworld: the fixed but not fixed
<wallyworld> yeah, but could be different root cause
<wallyworld> not sure, haven't dug
<anastasiamac> menn0: tyvm for review!!!
<menn0> anastasiamac: no problemo
<mup> Bug #1501084 opened: hitting nova RAM quota results in over-general error <juju-core:New> <https://launchpad.net/bugs/1501084>
<mup> Bug #1500676 changed: use-default-secgroup in environments.yaml not respected <cpec> <juju-core:Invalid> <https://launchpad.net/bugs/1500676>
<mup> Bug #1500996 changed: Juju says "bad JSON product data" for valid simplestreams <ci> <simplestreams> <juju-core:Invalid> <juju-core 1.24:Invalid> <https://launchpad.net/bugs/1500996>
<mup> Bug #1501093 opened: unclear simplestreams error <ci> <simplestreams> <juju-core:Triaged> <https://launchpad.net/bugs/1501093>
<wwitzel3> thumper: you have some lxd info?
<beisner> hi wallyworld, thumper - thoughts on https://launchpad.net/bugs/1335885 ?   i'm hitting it with high frequency, bootstraps are failing in test automation.
<mup> Bug #1335885: destroy-environment reports WARNING cannot delete security group <amulet> <cloud-installer> <destroy-environment> <landscape> <openstack-provider> <security> <uosci> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1335885>
<beisner> ps, added summary comment
<beisner> 1.24.5 --> 1.24.6 seems to have exacerbated it
#juju-dev 2015-09-30
<wallyworld> beisner: i haven't looked sorry, thumper is on it afaik
<thumper> wwitzel3: yes
<thumper> wallyworld, beisner: I'll look this afternoon, just need to talk with menn0
<thumper> and write an email or two
 * thumper looks at bug now
<thumper> to start the thinking process
<beisner> thumper, awesome, much thanks.
<beisner> thumper, i've got a repro underway, which basically bootstraps and destroys in a loop, 1.24.6 & openstack provider.  --debug enabled, will capture and add to bug if i can catch it that way.
<thumper> beisner: awesome
<thumper> beisner: does it happen every time?
<thumper> or just some times?
<beisner> thumper, i've seen it > 5 times today in test automation.   that test ran 38 cycles.
<thumper> hmm... interesting
<thumper> beisner: and every time it is deleted at the end even though the warning says it wasn't
<thumper> ?
<beisner> thumper, pure conjecture:   it seems that the opportunity for that to race has always been present, and that something got better/faster.
<beisner> exposing it more frequently
 * thumper nods
<thumper> sleeps for the win?
<beisner> but yes, 100% of destroys result in the "couldn't delete that thing" msg
<thumper> oh...
<thumper> so the warning is always there, what was it that happened > 5 times?
<beisner> i don't control timing of amulet.  say it gets 10 jobs to run.  it bootstraps, deploys, execs tests, destroys, bootstraps, deploys, rinse and repeat.
<beisner> oh to clarify:  failed to bootstrap 5.   all 38 complain that they couldn't delete sec groups (but always have)
<thumper> the failure to bootstrap is what?
<thumper> is this the message you are trying to capture?
<beisner> no it's what i already logged in the bug
<beisner> what i'm trying to capture is the --debug output
<mgz> this is something of a well-known issue with the destroy code
<perrito666> zomg how can the local provider be so easy to break :(
<thumper> mgz: well known by whom?
<thumper> not me
<beisner> mgz - yep.  the failing to create sec group is new.
<mgz> we have a bunch of mitigation in the form of post-destroy cleanup
<mgz> thumper: the bug is from 2014-06
<beisner> mgz, until 1.24.6 it was just an annoyance.  now, it fails to delete Foo, then tries to create Foo, and fails to bootstrap, saying it couldn't create Foo.
<thumper> mgz: well that isn't particularly useful to us now...
<mgz> and anyone who does destroy-env immediately followed by bootstrap the same env on openstack will have seen it
<thumper> now we just look imcompetant
<mgz> thumper: so, the only real way to fix it is make destroy-environment take much longer
<thumper> sure
<thumper> which is the right thing surely
<thumper> make sure the freaking thing is dead
<mgz> cloud providers will frequently refuse to destroy resources that are associated with other resources in the process of being destroyed
 * thumper grumbles 
<mgz> so, kill a machine, you have to wait for some amount of time before it will let you delete the groups that were attached to it
<mgz> likewise block devices and so on
<mgz> one thing that is possible with openstack, and I think the new ec2 vpc sec groups, is remove the groups from the machines before killing the machines
<mgz> that way you can reliably wipe them straight away
<mgz> is a bunch more api calls though
<mgz> the other option is something more like what CI does to get juju reliable, which is before bootstrap, basically destroy-environment --force
<mgz> that's less elegant
<mgz> thumper: I guess we really want a different bug for beisner's issue, which is certainly a newer thing
<beisner> oh neat.  my bootstrap/destroy loop yielded something different:  http://paste.ubuntu.com/12620988/
<mgz> beisner: I know this is going to be annoying as you need the destroy cleanup race error first, but any idea if this started in a particular 1.24 minor version?
<beisner> mgz i believe 1.24.5 was solid
<beisner> would have to do some log digging to prove/disprove that observation though
<mgz> beisner: bug 1467331 bug 1500613
<mup> Bug #1467331: configstore lock should use flock where possible <charmers> <ci> <reliability> <repeatability> <juju-core:Triaged> <https://launchpad.net/bugs/1467331>
<mup> Bug #1500613: configstore should break fslock if time > few seconds <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1500613>
<beisner> ok so that repro is simple.   loop a deploy/bootstrap.  took 8 iterations to hit that.
<beisner> errr em.  rather, a bootstrap/destroy loop
<mgz> beisner: http://reports.vapour.ws/releases/rule/34 for us hitting that in ci
 * beisner wanders off, to return in a bit
<mgz> bug 1454323 is marked fixed but that was just to make the error less terrible and the followups are what I linked above
<mup> Bug #1454323: Mysterious env.lock held message <bootstrap> <ci> <destroy-environment> <repeatability> <ui> <juju-core:Fix Released by thumper> <juju-core 1.24:Fix Released by thumper> <https://launchpad.net/bugs/1454323>
<mgz> thumper: so, I don't think the juju code around adopting existing security groups with the same name has actually changed,
<mgz> see ensureGroup in provider/openstack/provider.go
<mgz> however, I think we hit the bad case of trying to create a group which is in the process of being deleted much more often with our storage code, and changes in newer openstacks
<beisner> mgz, thumper - added accidental findings to bug 1500613.   after hitting that lock issue, my enviro is borked.  how do i unlock? ;-)
<mup> Bug #1500613: configstore should break fslock if time > few seconds <amulet> <openstack-provider> <tech-debt> <uosci> <juju-core:Triaged> <https://launchpad.net/bugs/1500613>
<mgz> beisner: just delete the lock
<beisner> am i supposed to know where it is?
<beisner> oh look there.  it tells me.  ha
<mgz> :P
<beisner> so, not sure i can reliably catch the secgroup thing with this hopping out front so readily.
<mgz> beisner: you can just rm -rf the lock location in between each run
<beisner> not if i'm using another runner, such as bundletester or amulet
<beisner> oh you mean in the repro, yes i can
<mgz> yup
<thumper> beisner: I'm kinda surprised at how often this lock file problem is occurring
<thumper> it should just work and delete the file
<thumper> really weird that it isn't
<thumper> time to go make a coffee and look at this bug
<beisner> mgz, thumper, thanks.  i've got the repro looping for the secgroup race.  must sleep now.
<thumper> beisner: ack, thanks
<beisner> mgz, thumper - successfully repro'd bug 1335885 with the same loop, added new comment.  now i'm really closing my screen.  thx again.
<mup> Bug #1335885: destroy-environment reports WARNING cannot delete security group <amulet> <cloud-installer> <destroy-environment> <landscape> <openstack-provider> <security> <uosci> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1335885>
<thumper> beisner: thanks again
<beisner> thumper, yw, happy to help chase it.
<beisner> \o
<thumper> o/
<mup> Bug #1303787 changed: hook failures - nil pointer dereference <hooks> <local-provider> <ppc64el> <juju-core:Fix Released by dave-cheney> <https://launchpad.net/bugs/1303787>
 * thumper afk for a family thing
<thumper> will be back to finish bug
<wwitzel3> thumper: thanks for that email
<wallyworld> axw: small review please https://github.com/juju/charm/pull/159
<axw> wallyworld: looking
<wallyworld> axw: thanks for review, any idea for name? i don't like it either
<wallyworld> SeriesForCharm maybe
<axw> wallyworld: *shrug*  SelectSeries? not much more informative
<axw> wallyworld: sounds fine
<wallyworld> ok, ta
<axw> wallyworld: BTW, my point (regarding "any", "default", etc.) is that this function is not directly attached to the charm metadata. so the user has to ensure the order of supported series is maintained
<axw> wallyworld: which is why I'm saying not to use "any" when it's really "the first item"
<axw> (if that's true)
<wallyworld> ok, i'll reword
<wallyworld> it is the first
<mup> Bug #1501173 opened: apiserver/common/storagecommon: StorageAttachmentInfo returns without error even if block device doesn't exist <juju-core:Triaged by axwalk> <juju-core 1.25:Triaged by axwalk> <https://launchpad.net/bugs/1501173>
<mup> Bug #1501173 changed: apiserver/common/storagecommon: StorageAttachmentInfo returns without error even if block device doesn't exist <juju-core:Triaged by axwalk> <juju-core 1.25:Triaged by axwalk> <https://launchpad.net/bugs/1501173>
<thumper> wallyworld, axw, anastasiamac: http://reviews.vapour.ws/r/2789/
<thumper> I'd like to build a version to make available for these folks to try with
<thumper> to see if it does actually help
<axw> thumper: reviewed
<thumper> ta
<thumper> axw: yes, I'm wanting to get it live tested first
<thumper> though observing things, it appears that what happens is this:
<thumper> try to terminate all the machines
<thumper> emits warning saying security group in use
<thumper> finishes destroy, deletion of group works
<thumper> so the end result is that the user is warned that it couldn't be deleted, but it has gone
<thumper> alternatively it warns again, and doesn't delete it, next bootstrap fails
<thumper> but yes, I want to test it prior to landing
<thumper> as I'm taking a wild stab at the numbers
<axw> thumper: sure, sounds fine
 * thumper writes that on review board too :)
<thumper> ok, I'm done
<thumper> laters folks
<urulama> wallyworld: http://www.theguardian.com/travel/2013/may/25/top-10-live-music-venues-seattle :)
<wallyworld> :-)
<mup> Bug #1501203 opened: apiserver/storage/storagecommon: WatchStorageAttachment should filter block devices <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1501203>
<voidspace> dimitern: ping
<dimitern> voidspace, pong
<voidspace> dimitern: in environments.yaml I have an environment called "amazon-eu" which is type "ec2" and region "eu-central-1"
<voidspace> dimitern: yet when I bootstrap that environment I get a bootstrap machine in us-east-1
<voidspace> hmmm... it might be a yaml indentation issue
<voidspace> dammit
<dimitern> voidspace, check if you have EC2_REGION set in the env
<voidspace> dimitern: will do, thanks
<dimitern> voidspace, or EC2_URL
<TheMue> hmm, HO dislikes me
<voidspace> dimitern: that's set to: https://ec2-lcy01.canonistack.canonical.com:443/services/Cloud
<voidspace> :-)
<frobware> jam, fwereade: joining standup today?
<jam> omw
<voidspace> dimitern: dooferlad: gah, Subnets bug is on 1.25 as well as master
<voidspace> better retarget the work I'm doing and fix it in both places
<dimitern> voidspace, the addressable containers instId thing?
<voidspace> dimitern: yeah
<voidspace> I assumed it was just master, should have checked...
<voidspace> hah
<voidspace> the bug even says both
<voidspace> so it's a reading comprehension failure too... :-)
<dimitern> :)
<axw> fwereade: can you please have a glance at https://github.com/juju/juju/compare/master...axw:lp1500769-gce-default-block-source, and let me know if you're ok with this before I go much further?
<voidspace> axw: o/
<voidspace> axw: morning :-)
<axw> fwereade: basically, I'm sick of using Validate to upgrade config
<axw> voidspace: hiya, how's it?
<voidspace> axw: all is well, how's you?
<axw> voidspace: not too shabby. furious bug fixing before demo time at the sprint :)
<voidspace> axw: heh, right
<voidspace> axw: pretty much what our team is on as well...
<fwereade> axw, ack
<fwereade> axw, looks eminently sane to me
<fwereade> axw, thanks
<natefinch> fwereade: got a minute?
<axw> fwereade: thanks
<ashipika> juju bootstrap for amazon reports the following: https://ec2.us-east-1.amazonaws.com?Action=DescribeInstances&Filter.1.Name=instance-state-name&Filter.1.Value.1=pending&Filter.1.Value.2=running&Filter.2.Name=instance.group-id&Filter.2.Value.1=sg-05ae1a61&Timestamp=2015-09-30T10%3A18%3A37Z&Version=2014-10-01
<ashipika> any ideas anyone? ^
<natefinch> fwereade: gonna be out for a bit, but looking for tips on how to run workers during jujuconnsuite tests, since unit assignment is done in a worker now, a ton of tests fail due to units not getting assigned.
<axw> ashipika: is there an error missing from that line?
<ashipika> sorry.. copy&paste mistake... here's the error message: ERROR failed to bootstrap environment: cannot start bootstrap instance: Get https://ec2.us-east-1.amazonaws.com?Action=DescribeInstances&Filter.1.Name=instance-state-name&Filter.1.Value.1=pending&Filter.1.Value.2=running&Filter.2.Name=instance.group-id&Filter.2.Value.1=sg-05ae1a61&Timestamp=2015-09-30T10%3A18%3A37Z&Version=2014-10-01: dial tcp: lookup ec2.us-east-1.amazonaws.com on 1
<ashipika> axw ^
<axw> hrm
<ashipika> axw: latest master.. go 1.5.1
<axw> ashipika: looks like it's due to tagging
<axw> ashipika: that command should be retried though ...
<ashipika> axw: tagging?
<axw> ashipika: we tag the instance and its root disk after it starts
<axw> (can't do it while starting, which seems a bit brain dead)
<ashipika> axw: https://pastebin.canonical.com/140850/
<ashipika> axw: with --debug: https://pastebin.canonical.com/140853/
<axw> ashipika: erm actually that just looks like a host resolution error. can't tell more than that
<ashipika> axw: rebooting.. who knows.. might help
<ashipika> axw: did not help
<axw> ashipika: don't really know. it's attempting to resolve through DNS on localhost, is that intentional?
<axw> "on 127.0.1.1:53"
<ashipika> axw: i know... i saw that.. but cannot explain it
<axw> ashipika: don't know, sorry
<tasdomas> ashipika, ping ec2.us-east-1.amazonaws.com
<ashipika> tasdomas: yes, fails.. switched to eu-west-1 and it seems to be working
<tasdomas> ashipika, but what does it resolve to?
<ashipika> tasdomas: something must have messed up my resolv.conf, or sth
<rogpeppe> this PR adds macaroon authorization to the charms endpoint, and continues with some cleanup of the apiserver package too. reviews much appreciated, thanks! http://reviews.vapour.ws/r/2794/
<rogpeppe> wallyworld: i've reviewed https://github.com/juju/charmrepo/pull/32
<wallyworld> ty
<wallyworld> rogpeppe: i'm tired now and want to keep hacking on the juju side of things for a bit, but will come back to the charmrepo stuff tomorrow, thanks for looking
<rogpeppe> wallyworld: ok, cool
<wallyworld> rogpeppe: one thing - name in meta doesn't have to be same as directory
<wallyworld> so i'm not sure about your comment
<rogpeppe> wallyworld: yeah, but it's very confusing if it's not
<wallyworld> hmm, ok, i have test charms i have written where it doesn't match, so i guess i'm used to it
<frobware> dimitern, did you mention this morning that you got the spaces demo to work without having to have a public ip address? (Or perhaps I misheard you.)
<dimitern> frobware, yes, eventually - initially the machines in the subnets without auto-public-ip set were "pending", because they didn't manage to download some packages (no outbound access, just dns works)
<frobware> dimitern, aha. that's what I see.
<dimitern> frobware, so I presume after apt-get retried 30 or so times it gave up and cloud-init finished OK
<frobware> dimitern, so did you flick the switch for auto-public-ip on the subnets?
<frobware> dimitern, I'm not seeing a timeout though. machine state still in allocating.
<dimitern> frobware, no, but even if I did the flag is only honored when starting instances - not after they're running
<dimitern> frobware, is the instance running in the EC2 UI?
<frobware> dimitern, I was trying on my local account. HO?
<frobware> dimitern, yes instance is running
<dimitern> frobware, it might take 30m or so for apt-get retry script to give up I guess - I waited at least 30m with no change, but in the morning all machines showed up as started
<dimitern> frobware, and it "worked" I guess just because I was deploying the ubuntu charm (which was pre-fetched by the apiserver and then the isolated machine got it from there - as usual), which doesn't need anything from the internet - wordpress I suspect won't work
<frobware> dimitern, so in the real world how is this supposed to / going to work?  service in the "private" subnet will need access on provisioning, installing packages, et al
<frobware> dimitern, so I was deploying the ubuntu charm, like we were doing yesterday.
<dimitern> frobware, in the real world we can do things like setting up squid-deb-proxy for apt + another proxy + nat + forwarding etc. on machine 0 (or another "public" machine)
<dimitern> frobware, the ubuntu charm is useful only for really simple tests - for more "real-world-like" tests, we need charms like in that bundle - scalable, with relations, config, etc.
 * dimitern needs to eat something - bbiab
 * frobware also needs to eat something too.
<frobware> dimitern, when bootstrapping a node with two NICs is it possible to configure which NIC gets selected?
<voidspace> frobware: no
<voidspace> frobware: that's why we need spaces
<frobware> voidspace, :)
<voidspace> seriously :-)
<frobware> voidspace, ok ok ook okkkkk
<frobware> voidspace, I'm sold!
<voidspace> frobware: hah :-)
<frobware> voidspace, I manually provisioned a machine with two NICs
<voidspace> frobware: right
<frobware> voidspace, set 'bootstrap-host: 10.17.17.117' in my environment.yaml
<frobware> voidspace, then bootstrapped. Which indicates that the dns-name=10.17.17.117
<voidspace> frobware: so it's at least using the address you gave it
<frobware> voidspace, however, both mongod and jujud are listening on all interfaces
<voidspace> right
<frobware> voidspace, http://pastebin.ubuntu.com/12624701/
<frobware> voidspace, whereas I was trying to coerce it to listen on the single NIC only.
<voidspace> frobware: yep
<frobware> voidspace, OK answers my questions. thanks
<voidspace> frobware: not possible at the moment with a vanilla install
<dimitern> frobware, you mean in maas?
<frobware> dimitern, yes
<voidspace> AFAIK anyway...
<frobware> dimitern, well, no just maas
<dimitern> frobware, yeah - voidspace is correct actually :)
<voidspace> it does happen sometimes
<dimitern> frobware, one of the many goals of the model is giving you this sort of flexibility, while hiding the gruesome details :)
<voidspace> dimitern: frobware: I'm just bootstrapping an EC2 environment with my fix in place (for the ec2 Subnets issue) to see if it actually works...
<voidspace> it should do...
<TheMue> dimitern: btw, just recognized it. we still document the networks constraint as we still support this constraint. but shall I already remove it from the constraints documentation?
<cherylj> wwitzel3: ping?
<wwitzel3> cherylj: heya, in standup
<cherylj> wwitzel3: kk
<dimitern> TheMue, I think so, it should be dropped from the docs (as we're on that stage) and later from the code as well (I'm not too worried about this now)
<voidspace> well, the error is no longer in the logs - but the container still has a 10.0 address
<TheMue> dimitern: yep, feels better to me so too, thx
<dimitern> voidspace, with the address-allocation feature flag set?
<voidspace> I thought so...
<voidspace> godammit
<voidspace> must be a different shell window
<voidspace> *sigh*
<wwitzel3> cherylj: ping
<cherylj> wwitzel3: I heard that you've got experience using virtual MAAS?
<wwitzel3> cherylj: yep
<cherylj> wwitzel3: is there documentation somewhere on how to set that up?  What I've found seems to be out of date
<wwitzel3> cherylj: did you sacrifice a chicken? first step ;)
<wwitzel3> cherylj: yeah, one sec, I used the videos that Kirkland made, and they worked well for me
<cherylj> wwitzel3: no, no chicken.  I've got some pigeons around here.  Will that work?
<wwitzel3> cherylj: sorry, it was beisner who made them
<wwitzel3> cherylj: https://www.youtube.com/playlist?list=PLvn2jxYHUxFlxNmc1dAbw524aoPmHxNpC
<cherylj> wwitzel3: yay, thank you!
<wwitzel3> cherylj: I've referred to them a few times, I just follow his steps and it has always worked
<wwitzel3> cherylj: gl
<cherylj> wwitzel3: thank you :)
<natefinch> cherylj: just remember, wwitzel3 said it'll be really easy with absolutely no problems.
<cherylj> natefinch: so long as I remember the chicken
<wwitzel3> that's the key
<natefinch> cherylj: that must have been what I forgot when I was trying to do it at the sprint in Germany.
<natefinch> never did get it working
<dimitern> frobware, voidspace, I have a patched version of the gui which works and deploys the slightly modified bundle and respects spaces constraints!
<dimitern> (writing down all the steps and will send them later)
<voidspace> dimitern: awesome
<TheMue> dimitern: great, sounds cool
<voidspace> dimitern: yay, it worked this time
<voidspace> dimitern: it's done properly (subnetIds also honoured, as well as instId) - just needs some tests
<dimitern> voidspace, you're the man! :) great
<aisrael> How does one go about getting something backported to 1.24.x? i.e., this fixes juju with osx 10.11, which comes out today: https://github.com/juju/juju/pull/2969
<jcastro> sinzui: heya, El Cap just went gold today, IMO we should probably send a mail to the list telling people they should be fine with 1.25.x
<jcastro> any issues you think I should bring up?
<sinzui> jcastro: 1.24.6 in homebrew is fine. I delivered the patch to them personally
<jcastro> I saw that, that's why I wanted to mention it
<sinzui> jcastro: 1.25 is a beta
<jcastro> sinzui: I was meaning more like "this is the last time you'll have to care about this, future juju versions won't break on your beta OS."
<sinzui> jcastro: I WON'T say that until it is true
<jcastro> heh
<jcastro> ok, I can not say that then.
<sinzui> el capitan is hardcoded in 1.25. I read the code
<mup> Bug #1501381 opened: panic: cannot pass empty version to VersionSeries() <blocker> <ci> <intermittent-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1501381>
<alexisb> mgz, ^^^ is this bug in master? or all branches
<mgz> alexisb: master and feature branches off master
<alexisb> mgz, ok thanks
<mgz> alexisb: clarified the bug
<dimitern> voidspace, dooferlad, TheMue, frobware, you should've all received demo prep instructions
<TheMue> dimitern: +1, great, thanks
<mgz> alexisb: I'm not clear if it will only happen on maas, or if it's just our testing on maas that happens to hit this
<frobware> dimitern, received, queued (and not quite read). :)
<dimitern> TheMue, frobware, cheers :)
 * dimitern is outta here ;)
<frobware> dimitern, thanks; great to see the demo coming along :)
<dimitern> frobware, yeah - I'm happy we won't be the only team not showing interesting stuff :D
<natefinch> katco: you had mentioned enabling worked for the lease feature tests.... where is that code?  I can't find it
<natefinch> s/worked/workers/
<katco> natefinch: let me take a look
<katco> natefinch: err... looks like they were deleted?
<katco> natefinch: here: https://github.com/juju/juju/blob/1.22/featuretests/leadership_test.go
<natefinch> katco: lol, well, that explains why I couldn't find them :)
<katco> natefinch: don't forget to submit your sick leave
<natefinch> katco: oh yeah, I'll do that right now
<mup> Bug #1501398 opened: stateSuite setup fails on windows with WSARecv timeout <blocker> <ci> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1501398>
<frobware> voidspace, you still about? Regarding the multi-nic question from above: am I wrong in thinking that spaces should allow for: juju bootstrap --constraints mem,cpu,etc,spaces=my-network-with-nic-192.168.1.123
<mgz> hm, it's not possible to be in more than one hangout at once
<alexisb> mgz, ping
<mgz> alexisb: omw
<frobware> mgz, it's odd though - you would think computers should be good at multitasking. :)
<mgz> apparently not :)
<beisner> o/ hi mgz -  fyi, i pulled thumper's binaries, re-ran loop, hit that bootstrap fail.  updated @ bug 1335885
<mup> Bug #1335885: destroy-environment reports WARNING cannot delete security group <amulet> <cloud-installer> <destroy-environment> <landscape> <openstack-provider>
<mup> <security> <uosci> <juju-core:Triaged> <juju-core 1.24:In Progress by thumper> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1335885>
<alexisb> beisner, thanks for the update
<mgz> beisner: thanks.
<beisner> alexisb, mgz - yw.  thx for the focus on this.
<voidspace> frobware: still around
<voidspace> frobware: that question would be better directed to dimiter I think, but I don't see why that shouldn't work
<voidspace> frobware: hmmm... although thinking about it
<voidspace> frobware: our implementation of spaces is at the "juju model" level - which requires the state server to be in place
<voidspace> frobware: so making it work at bootstrap time will require making the client "spaces aware" (i.e. able to discover spaces and resolve constraints)
<voidspace> frobware: so it isn't going to work initially, would require specific work
<mup> Bug #1500843 changed: Windows ftb due to unused import is diskmanager <blocker> <ci> <regression> <windows> <juju-core:Fix Released by gz> <https://launchpad.net/bugs/1500843>
<mup> Bug #1501432 opened: BootstrapSuite tests fail on non-ubuntu platforms with no matching tools <blocker> <centos> <ci> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1501432>
<cherylj> thanks for the quick review, cmars
<cmars> cherylj, thanks for the bug fix
<mgz> cherylj: where will that error propagate to exactly?
<mgz> cherylj: I'm wondering if we're still not logging enough information to work out what the bad data actually is
<mgz> cherylj: (code change looks sensible regardless)
<cherylj> mgz: it will cause the image to be ignored when we update stored image metadata
<cherylj> mgz: I was thinking I should update the logging to indicate the ID of the ignored image
<mgz> cherylj: sounds good to me - can be a separate branch
<cherylj> mgz: I'm going to include it in the branch that updates dependencies.tsv
<mgz> cherylj: one thing that comes to mind from what you've found so far,
<mgz> our maas has a windows image which will obviously not have an ubuntu series
<mgz> how that would cause panics some of the times but not others though I have no idea, so may be unrelated
<cherylj> mgz: it shouldn't.  This panic was because we were trying to determine the version from a series of "" (empty string)
<cherylj> mgz: if there's some data in simple streams that just doesn't make sense, (like having nothing for the version), we should ignore it
<cherylj> erm, my previous comment should have been that we were trying to determine the series from an empty version
<cherylj> I had it backwards
<mup> Bug #918386 opened: config.yaml should have enum type  <charmers> <pyjuju:Triaged> <juju-core:New> <https://launchpad.net/bugs/918386>
<mup> Bug #918386 changed: config.yaml should have enum type  <charmers> <pyjuju:Triaged> <juju-core:New> <https://launchpad.net/bugs/918386>
<natefinch> arg.... I have a feeling the jujuconn tests are somehow mucking with the database in just the right way to break my worker
<mup> Bug #918386 opened: config.yaml should have enum type  <charmers> <pyjuju:Triaged> <juju-core:New> <https://launchpad.net/bugs/918386>
<mup> Bug #918386 changed: config.yaml should have enum type  <charmers> <pyjuju:Triaged> <juju-core:New> <https://launchpad.net/bugs/918386>
<mup> Bug #918386 opened: config.yaml should have enum type  <charmers> <pyjuju:Triaged> <juju-core:New> <https://launchpad.net/bugs/918386>
<natefinch> my country for SOME of our code to have unit tests... you know, so they don't break when totally f'ing unrelated code is changed.
<marcoceppi> wow, mup, calm down, the bug isn't that important
<natefinch> ahh, hmm I think I got it.  Interesting difference between a real environment and the test environment
<mup> Bug #1501475 opened: Status presents unnecessary MAAS API info for machines <juju-core:New> <https://launchpad.net/bugs/1501475>
<marcoceppi> why can't I bootstrap local as root? ERROR failed to bootstrap environment: bootstrapping a local environment must not be done as root
<natefinch> marcoceppi: I forget, but it messes up permissions of certain things, and probably puts things in the wrong directories.  Why would you want to, anyway?
<perrito666> natefinch: its not like local wont do that for you anyway
<marcoceppi> natefinch: because I'm in an LXC container as root and I want to bootstrap as the root user
<perrito666> marcoceppi: can you bootstrap local inside an lxc container?
<marcoceppi> perrito666: well, I was going to find out (it's a LXD container, so should work)
<perrito666> famous last words
<marcoceppi> worst case scenario it doesn't work
<marcoceppi> but stopping me because I'm root makes me sad
<natefinch> marcoceppi: looks like it doesn't ;)
<perrito666> marcoceppi: just adduser
<marcoceppi> I get that
<marcoceppi> but because of the way these mounts outside the system work I need to be root to access them anyways
<perrito666> marcoceppi: sudo?
<perrito666> but as a rule of thumb, any question matching with: "why .* local .*?" is answered with: because local provider sucks
<natefinch> +1
<marcoceppi> perrito666: I know how to work around this, I'm saying it's silly that juju would stop me as the root user, it should try to detect sudo vs root to discourage old local provider behaviour
<marcoceppi> also "lol its local so deal with it" isn't really a great answer
<marcoceppi> furthermore, local bootstraps in a LXD container
<marcoceppi> can we get a LXD provider now plz
<perrito666> marcoceppi: it was more in the tone of an apology than a mockery
<marcoceppi> I see
<perrito666> local provider is the number one cause of my screwing my work computer during the past year or so
<natefinch> marcoceppi: funny you should ask
<natefinch> marcoceppi: moonstone started work on an LXD provider as of today
<marcoceppi> perrito666: which is why I'm running it in a LXD container
<marcoceppi> natefinch: yes, 1000 times yes, I will happily test anything you throw at me
 * natefinch screenshots for later
<marcoceppi> I stand by my assertion! ;)
<perrito666> marcoceppi: I could totally use a brief howto for what you are doing
<marcoceppi> perrito666: well, if I could juju bootstrap local as root the howto would be way easier :P
<marcoceppi> perrito666: I'll write a blog
<perrito666> marcoceppi: tx
<thumper> beisner: that binary I created for you was me taking wild guesses at times, I'd like to tweak and get you to try again, keen?
<mup> Bug #1501490 opened: juju-local can't bootstrap as root user <juju-core:New> <https://launchpad.net/bugs/1501490>
<beisner> thumper, indeed
<beisner> thumper, i suspect 2s may not be enough, just based on observing nova compute, et al, after nova deleting an instance.
<thumper> beisner: how long do you think we need?
<beisner> thumper, i think it's variable, depending on the hardware, and load on that cloud
<beisner> thumper, how do we handle similar needs with other providers?
<beisner> ie. is there an existing max_wait / retry_interval approach in any other provider?
<beisner> thumper, i'll do a little ditty on serverstack to see if i can measure timing
<thumper> beisner: awesome
 * thumper otp
<thumper> beisner: we handle similar things in other clouds terribly IO
<thumper> IMO
<thumper> we should be treating many other cloud calls as retryable calls, but in most cases we don't
<beisner> thumper, ah i see.  so i think a max_wait and retry_sleep would work well.  it's a matter of how long you're comfortable blocking on destroy.
<thumper> beisner: you think having them configurable by config?
<beisner> thumper, i'd aim for a resilient default.  ie.  say ...  max_wait 30s, recheck every 1s or 2s.   but hold the line, i'm about to have data.
<beisner> ;-)
<beisner> bootstrap: http://paste.ubuntu.com/12626772/
<beisner> destroy: http://paste.ubuntu.com/12626773/
<beisner> nova instance: http://paste.ubuntu.com/12626774/
<beisner> nova secgroup: http://paste.ubuntu.com/12626775/
<beisner> thumper, ^ checking and timestamping nova secgroups and nova instances as fast as apis will allow, while bootstrapping and destroying
 * thumper looks
 * beisner too
<thumper> ok, so 2s is nowhere near enough
<thumper> beisner: let me build you one with 30s max :)
<beisner> thumper, a-ok.  i'll put together a timeline from those ^
<thumper> copying files now
<thumper> beisner: it appears to be as small as instant, but as large as 4s
<thumper> I'm doing 30s max with 1s retry
<thumper> *should* be solid enough
<thumper> getting about  702.1KB/s up to chinstrap
<thumper> beisner: the binaries are up, in the same place as before
<beisner> thumper, timeline @ https://bugs.launchpad.net/juju-core/+bug/1335885/comments/17
<mup> Bug #1335885: destroy-environment reports WARNING cannot delete security group <amulet> <cloud-installer> <destroy-environment> <landscape> <openstack-provider>
<mup> <security> <uosci> <juju-core:Triaged> <juju-core 1.24:In Progress by thumper> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1335885>
<beisner> thumper, ack, will pull bins
<beisner> thumper, fyi 3 iterations in.  seeing 3s, 11s, 5s between 'terminating instances' and 'command finished'  ... going to let that run.  i'm eod, but will prob check back in late evening.
<thumper> beisner: ok, cool
<beisner> thumper, thanks again!
<thumper> wallyworld: before I merge this openstack retry branch
<thumper> wallyworld: perhaps we should chat about exponential backoff?
<wallyworld> thumper: ok, give me a minute
<thumper> wallyworld: although, I'm tempted to land this and discuss the exponential backoff as part of a bigger picture provider retry system
<thumper> as I'm starting with the 1.24 branch
<wallyworld> sgtm
<thumper> k
 * thumper does that
<wallyworld> thumper: storageprovision/schedule.go
<wallyworld> is the storage solution
<wallyworld> that we can discuss moving to utils
<wallyworld> storageprovisioner/schedule.go i mean
<thumper> ack
<axw> fuuuuuuuuuuuuuuuuuu. sick of blocked master
#juju-dev 2015-10-01
<wallyworld> axw: a small one https://github.com/juju/charm/pull/160
<wallyworld> axw: ty, i've pushed up an additional change to use typed errors
<axw> wallyworld: ok, looking
<wallyworld> btw the original implementation was sort of on purpose because upstream handled the empty series stuff, but easier to push that down into charm
<axw> wallyworld: ok. still LGTM
<wallyworld> ty
<thumper> wallyworld: quick chat?
<wallyworld> sure
<thumper> 1:1 hangout
<wallyworld> axw: see http://reports.vapour.ws/releases/3124 - it suggests bug 1479546 may be a cause, could you take a look?
<mup> Bug #1479546: Storage provisioner timeouts spawning extra volumes on AWS <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1479546>
<axw> wallyworld: looking
<wallyworld> ty
<axw> wallyworld: as I responded in the bug, it's a different issue. but still an issue nonetheless
<axw> wallyworld: I guess we need to increase the timeout
<wallyworld> axw: ty, i didn't read the bug too closely
<axw> wallyworld: np, I'm just a bit annoyed there's a rule matching a bug which I explicitly stated is not the same. whatever
<wallyworld> axw: agreed. is there a bug for the new issue?
<axw> wallyworld: about to check and file if not
<wallyworld> ty
<axw> wallyworld: https://bugs.launchpad.net/juju-core/+bug/1501559
<mup> Bug #1501559: provider/ec2: bootstrap fails with "failed to bootstrap environment: cannot start bootstrap instance: tagging root disk: timed out waiting
<mup> for EBS volume to be associated" <bootstrap> <ec2-provider> <intermittent-failure> <juju-core:Triaged by axwalk> <https://launchpad.net/bugs/1501559>
<wallyworld> ty
<axw> wallyworld: there's not much to review. I'll have a look after I've reviewed perrito666's mongo branch
<wallyworld> axw: i'll just fix the series on the bug, plus i'll have a review for you real soon :-)
<axw> wallyworld: lucky me :)
<wallyworld> i know right
<mup> Bug #1501559 opened: provider/ec2: bootstrap fails with "failed to bootstrap environment: cannot start bootstrap instance: tagging root disk: timed out waiting
<mup> for EBS volume to be associated" <bootstrap> <ec2-provider> <intermittent-failure> <juju-core:Triaged by axwalk> <https://launchpad.net/bugs/1501559>
 * perrito666 feels summoned by axw
<mup> Bug #1501563 opened: Connection shutdown <ci> <test-failure> <juju-core:New> <https://launchpad.net/bugs/1501563>
<mup> Bug #1501563 changed: Connection shutdown <ci> <test-failure> <juju-core:New> <https://launchpad.net/bugs/1501563>
<mup> Bug #1501563 opened: Connection shutdown <ci> <test-failure> <juju-core:New> <https://launchpad.net/bugs/1501563>
<axw> perrito666: sorry (sorry again), no summoning intended
<axw> SORRY
<axw> wallyworld: and sorry about continuously setting the wrong milestones on bugs :/
<wallyworld> tis ok :-)
<axw> wallyworld: it'd be great if LP didn't let me do that... or was smarter about assigning series from milestones
<wallyworld> yes
<perrito666> axw: if you really want to troll me, add my name to a bug title and let mup do the rest
<axw> perrito666: ok. when you least expect it
<axw> could be 2am... could be 5am...
<perrito666> axw: could be, but I don't have notifications for IRC. I happened to be working, that's why I saw the notification
<axw> :)
<perrito666> I have too many troll friends to tie my phone to any sort of exploitable notifications
<natefinch> that sounds like a challenge
<natefinch> The worst thing that ever happened to me was when somehow my cell number got confused for a fax number.
<mup> Bug #1501563 changed: Connection shutdown <ci> <test-failure> <juju-core:New> <https://launchpad.net/bugs/1501563>
<mup> Bug #1501563 opened: Connection shutdown <ci> <test-failure> <juju-core:New> <https://launchpad.net/bugs/1501563>
<perrito666> natefinch: I am at phone safe distance
<natefinch> good lord, who wrote this crap?  Evidently if you pass "invalid" as a placement directive to the dummy provider, it'll return an error if you try to use that placement
<perrito666> natefinch: you can always git blame it :p
<natefinch> perrito666: I usually do, but honestly, it doesn't matter.. unless it's someone who has left the company, I can't really call them out on it :/
<natefinch> perrito666: https://github.com/juju/juju/blob/6e4a2cf80781a77934fcf559f3b7db88b4d9a271/provider/dummy/environs.go#L694
<mup> Bug #1501568 opened: TestRebootFromJujuRun Failed <ci> <intermittent-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1501568>
<perrito666> well you just asked who wrote it :p
<natefinch> perrito666: lol.... just a figure of speech, really
<perrito666> there is some humor value in "invalid placement is invalid"
<natefinch> yeah, I thought so
<natefinch> blame says axw, but looking at the commit, he just refactored the code someone else wrote.   I need like a blame navigator so I can drill down to who started the whole mess.
<axw> natefinch: I think I did write that crap. that's how most of the dummy provider works I think?
<axw> natefinch: what's the issue with it, and how would you test placement with the dummy provider differently?
<natefinch> axw: sorry for the harsh tone, just frustrated.  My problem is that it's a magic string that causes an error in the dummy provider, and the only way you can know that it is supposed to cause an error is to go read the sourcecode deep in the dummy provider.  I'd much rather have a setting that can be toggled with an obvious name and functionality I can immediately go read.
<perrito666> natefinch: you could also change the error string to be more informative
<axw> natefinch: no problem. fair enough, it could be more obvious. feel free to change it
<natefinch> axw: some context.... I'm here, trying to debug why this test is suddenly failing: https://github.com/juju/juju/blob/master/apiserver/service/service_test.go#L372
<perrito666> "%s is invalid, the only valid is blah"
<mup> Bug #1501568 changed: TestRebootFromJujuRun Failed <ci> <intermittent-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1501568>
<natefinch> perrito666: that would help... certainly it would make it more searchable.
<axw> natefinch: I see. is this in your unit assigner branch?
<natefinch> axw: yes, which, now that I know where this error originates from, I know why it's not getting triggered
<axw> natefinch: okey dokey
<natefinch> axw:  actually would like your input on this.  Now that the unit assignment is being done in a worker... this will never fail.  The assignment from the worker will fail, but that's obviously asynchronous.  Not sure what to do with this test.
<axw> natefinch: we should still be doing the prechecking, even if we don't do assignment
<natefinch> axw: hmm good point. I made sure we were still doing some of the more basic parsing, but was missing the precheck.  Cool, I'll hook that in.
<axw> natefinch: thanks, SGTM. if it's not obvious, precheck is "are these args obviously wrong". it can still fail asynchronously if that passes, and that's fine
<natefinch> axw: yeah, I figured, thanks.
<mup> Bug #1501568 opened: TestRebootFromJujuRun Failed <ci> <intermittent-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1501568>
<natefinch> heh, failing that test did reveal that the test code just panics if there's no error.... it calls Results[0].Error.Error() without checking if that first .Error is nil.
<natefinch> arg.... of course precheck instance doesn't actually take a real instance.Placement...
<natefinch> just some string that has to be in a magically correct format :/
<mup> Bug #1501569 opened: MachineSuite failed <ci> <intermittent-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1501569>
<beisner> o/ thumper
<beisner> so the retry on the openstack provider destroy is better (46 bootstrap/destroy iterations successful vs. 9)
<beisner> but then we hit a new one:  it tried to destroy twice, with a nil error:  http://paste.ubuntu.com/12629198/
<beisner> (full loop output if interested:  http://paste.ubuntu.com/12629219/)
<beisner> added timing stats to the bug, as well as that ^ info
<thumper> beisner: hey, looking at paste now
<beisner> thumper, kk.  see last bug comment too for min/max/avg timings observed.
<thumper> hmm... so now what?
<beisner> ha!
<beisner> ok, so i found a secret
<beisner> | 2d794379-3472-492b-9034-7c6d87727883 | juju-beis1-machine-0              | ERROR   | -          | NOSTATE     |
<beisner> that turned out to be an exercise in how well it handles an error in the cloud
<beisner> so, when the undercloud behaved, it looks like it works well (though 30s may still be pushing it)
<beisner> and when the undercloud misbehaves, that unexpected nil thing happens.  i'd expect it to fail differently.
<beisner> but basically, the instance never disappeared, instead, hit a resource issue.
<beisner> and errored by nova-compute
<thumper> hmm...
<thumper> so nova errored out trying to terminate the instance?
<beisner> no it errored trying to spawn it
<thumper> ah
<beisner> (neutron-api timed out as the undercloud was under fairly heavy load at that moment)
<thumper> new bug plz :)
<beisner> so, on the work done so far, suggestions:
<thumper> beisner: I think I'm about to crusade on better retries of cloud based errors across the board
<beisner> one iteration was 34s from 'terminate' to 'finished' in the juju destroy --debug output
<beisner> so, an increased max_wait may be in order
<beisner> 2nd suggestion:
<beisner> that's a while to wait on apparently nothing
<thumper> :-|
<beisner> user feedback (at least in --debug) while iterating, would be good
 * thumper nods
<beisner> avg 7s, min 3s ... seem like reasonable normal/best cases
 * thumper nods
<thumper> I can land additional tweaks to 1.24/1.25
<thumper> both have landed with the current fix I think
<thumper> I'm currently poking something else :)
<beisner> under load, ~30s seems like about as long as one should be expected to wait around on something to terminate.   but, as this shows, that may not be resilient with a production cloud under dynamic load changes.
<thumper> changing our fslock implementation on linux/macos to use flock and only use the rename dir on windows
<beisner> ++ for 1.24.x & 1.25  yes please
<beisner> one of the roles of this ci is to always consume whatever is in ppa:juju/stable, and run that against the released and the dev versions of the openstack charms
<beisner> plus, when we get the "stakeholders, please test proposed juju X" email, we flip a bit and run it all against that
<thumper> beisner: ack
<beisner> so, if i run on a fixed version, we'd lose that ability to flex
<thumper> beisner: we'll try our best to look after you :)
<beisner> ditto :)
<beisner> or at least feed info to ya
<thumper> :)
<thumper> that's appreciated
<lazypower> a thumper crusade :D i can't wait to read the commit stream on this one
<thumper> heh
<thumper> I'm getting pretty sick of Juju's inability to handle transient cloud errors
<lazypower> I feel your pain
<lazypower> thumper, have you seen how prominent they are on the public cloud space?
<thumper> not really, but I can guess
<beisner> thumper, thanks a ton.  plz lmk when/where i can get 1.24.x with the new goods ;-)
<lazypower> http://reports.vapour.ws/cloud-health/trends
<thumper> beisner: as in a released version of 1.24.x?
<thumper> lazypower: why does that graph go back in time to the right
<thumper> lazypower: that just looks weird
<lazypower> I'm not sure really
<thumper> lazypower: also, what does the red really mean?
<lazypower> red means it encountered an error that we didn't handle and retry provisioning
<beisner> so tldr from a non-juju-go-dev:   the opportunity for this race to occur has always been present (not waiting on secgroup delete before attempting to create another).   something got better/faster in 1.24.6, which removed just enough wait for a line to be crossed.  or at least that's how my brain has resolved it.  ;-)
<lazypower> thats what i understand anyway. Not necessarily the exact science
<wallyworld> axw: thanks for review, in this branch there's not yet any new series order of precedence stuff - the charm store repo does not yet return supported series
<beisner> thumper, ah of course.  i'll notice that for sure.  thx once again.
 * thumper nods
<thumper> kk
<axw> wallyworld: ah I see, I was wondering why that result was ignored. can you please add a TODO there
<wallyworld> ok
<wallyworld> axw: i've responded to some of the questions, working on updating some of the doc as requested
<axw> wallyworld: thanks, looking
<wallyworld> axw: so local repos. all charms are by convention/definition meant to be interpreted as single series only, because they are located in a directory named after the series. so if a charm author decides to modify the charm to declare supported series, we ignore it. if they want to have the charm be interpreted as multiseries, they move it or use the path syntax
<wallyworld> so for now, all the repo related code ignore supported series so the system behaves as today
<wallyworld> supported series will be used for charm store only once it supports it
<wallyworld> and local repo is considered deprecated
<axw> wallyworld: that's fine. when I wrote the comment I thought the default series precedence was implemented, since you had updated the doc
<axw> wallyworld: I think the default-series code I pointed at needs to change when it's implemented, regardless of local/cs
<axw> but I may be wrong. I don't know it well.
<axw> (and it doesn't matter until it's implemented)
<wallyworld_> damn, this kernel bug killing my networking is giving me the shits
<axw> [12:11:58] <axw> wallyworld: that's fine. when I wrote the comment I thought the default series precedence was implemented, since you had updated the doc
<axw> [12:12:29] <axw> wallyworld: I think the default-series code I pointed at needs to change when it's implemented, regardless of local/cs
<axw> [12:12:41] <axw> but I may be wrong. I don't know it well.
<axw> [12:12:54] <axw> (and it doesn't matter until it's implemented)
<axw> wallyworld_: also, LGTM
<wallyworld_> \o/
<wallyworld_> ty
<wallyworld_> i just need to get my charmrepo branch landed
<axw> wallyworld_: if you have time, could you review some of my branches? master is blocked, but I'd like to back-port to 1.25 while it's still unblocked
<wallyworld_> axw: yeah, was just about to do that
<axw> wallyworld_: thanks
<thumper> ah fark
<thumper> I suppose I should have grepped first...
<thumper> two hours down the drain attempting to change a base implementation only to find that people rely on existing behaviour...
<thumper> poo
<wallyworld_> axw: your reviews done, i'm afk for a bit
<axw> wallyworld_: thanks
<axw> waigani: the "move lastlogin and last connection to their own collections" upgrade step for 1.25 is in the master branch, but not in the 1.25 branch. intentional?
<mup> Bug #1501637 opened: provider/ec2: "iops" should be per-GiB in EBS pool config <ec2-provider> <juju-core:Triaged by axwalk> <juju-core 1.25:In Progress by axwalk> <https://launchpad.net/bugs/1501637>
<mup> Bug #1501642 opened: provider/ec2: min/max EBS volume sizes are wrong for SSD/IOPS <ec2-provider> <juju-core:Triaged by axwalk> <juju-core 1.25:In Progress by axwalk> <https://launchpad.net/bugs/1501642>
<rogpeppe> wallyworld: reviewed https://github.com/juju/charmrepo/pull/32
<wallyworld> ty
<axw> wallyworld: would you agree with changing https://bugs.launchpad.net/juju-core/+bug/1501637 for 1.25?
<mup> Bug #1501637: provider/ec2: "iops" should be per-GiB in EBS pool config <ec2-provider> <juju-core:Triaged by axwalk> <juju-core 1.25:In Progress by axwalk> <https://launchpad.net/bugs/1501637>
<axw> wallyworld: I mean, making the suggested change
<wallyworld> rogpeppe: maybe we should just remove charm.Reference as you say, but in a followup if that's ok
<rogpeppe> wallyworld: definitely - it's quite a big job
<wallyworld> axw: yes, best to do it before release
<axw> wallyworld: this demo prep has been enlightening ;)
<wallyworld> i bet
<wallyworld> dog food tastes awesome
<axw> wallyworld: I got the benchmark GUI working before. I'll send you a link when I've tested these changes and got it up again
<wallyworld> awesome
<dooferlad> dimitern, voidspace: hangout!
<voidspace> dooferlad: omw
<dooferlad> fwereade: hangout?
<fwereade> dooferlad, oops, ty
<dimitern> jam, HO?
<dooferlad> dimitern, frobware: hangout?
<dimitern> dooferlad, I think I'll skip it today
<dooferlad> dimitern: test for your demo passes :-) http://pastebin.ubuntu.com/12630924/
<dooferlad> dimitern: the last four lines are the good bit!
<axw> wallyworld: results are a little underwhelming, but here's the GUI: http://52.64.145.252/ (will be taking it down soon)
<wallyworld> looking
<axw> wallyworld: mysql-benchmark/2 is provisioned IOPS, mysql-benchmark/0 is SSD, mysql-benchmark/1 is magnetic
<axw> wallyworld: (I have made the suggestion that the GUI show related unit info on the screen)
<dimitern> dooferlad, awesome!
<dimitern> dooferlad, and the logs look nice as well
<axw> wallyworld: mysql-benchmark/2 is provisioned IOPS, mysql-benchmark/0 is SSD, mysql-benchmark/1 is magnetic
<axw> wallyworld: (I have made the suggestion that the GUI show related unit info on the screen)
<axw> (in case you got cut off before)
<dimitern> dooferlad, I think we should add a scaling step - i.e. add-unit mysql and mediawiki and check they end up in the same spaces, but different subnets?
<wallyworld> axw: too bad there aren't labels that can be set to show that detail on the summary
<dooferlad> dimitern: already doing that :-)
<dimitern> dooferlad, this will give us a guarantee that the AZ distribution works with spaces
<wallyworld> yes i did get cut off again :-(
<dimitern> dooferlad, great! cheers :)
<dooferlad> dimitern: just got sidetracked with trying to access haproxy, which even though it is exposed isn't responding.
<wallyworld> axw: without labels, it's impossible to easily see what benchmark ran on what
<axw> wallyworld: if I demo this, I'll deploy the benchmark charm once per mysql
<axw> wallyworld: and give them useful names
<wallyworld> sounds good, what cloud was it again?
<axw> wallyworld: AWS
<wallyworld> cool. can we do gce or azure as well?
<axw> wallyworld: yes, can do
<axw> wallyworld: we only do one disk type on each of them, so we'd be comparing multiple clouds rather than disk types
<wallyworld> that's fine, just to show storage run on those platforms
<axw> wallyworld: tearing it down now
<wallyworld> ok
<axw> wallyworld: if you're still working, another small one that will be helpful for the demo: https://github.com/juju/juju/pull/3417
<axw> if not, tomorrow
<wallyworld> sure
<wallyworld> axw: i already lgtm that one
<wallyworld> you meant 2087?
<axw> wallyworld: nope, http://reviews.vapour.ws/r/2807/
<wallyworld> yeah, was dyslexic
<wallyworld> axw: lgtm, ty
<axw> wallyworld: thanks
<dimitern> dooferlad, voidspace, frobware, TheMue, I managed to pre-patch the gui charm so it can be deployed from a local repo; just get http://people.canonical.com/~dimitern/spaces-demo-local-repo.tar.bz2, extract it and use $ juju deploy --repository ./repo local:trusty/juju-gui --to 0
<mup> Bug #1501709 opened: "juju deploy" does not validate volume/filesystem params <juju-core:Triaged by axwalk> <juju-core 1.25:Triaged by axwalk> <https://launchpad.net/bugs/1501709>
<mup> Bug #1501710 opened: worker/storageprovisioner: worker bounces upon finding invalid volume/filesystem params <juju-core:Triaged by axwalk> <juju-core 1.25:Triaged by axwalk> <https://launchpad.net/bugs/1501710>
<rogpeppe> would anyone be able to give this a review please? it's been up for review for more than a day now. http://reviews.vapour.ws/r/2794/
<rogpeppe> ericsnow, axw: ^
<mup> Bug #1501786 opened: juju cannot provision precise instances: need a repository as argument <blocker> <ci> <precise> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1501786>
<ericsnow> potentially a *much* faster "go test": https://github.com/rsc/gt
<ericsnow> thanks for pointing it out, natefinch!
<TheMue> dimitern: quick question, on command line a subnet can be created and directly added to a space, also a space can be direct created with and without subnets. is it possible to create a subnet without adding it to a space?
<natefinch> gt is frigging amazing.  I can basically just always run go test over the entire repo and not pay the price of all the tests that can't possibly have changed
<natefinch> ericsnow: btw, there's a -f flag to tell gt to force-rerun stuff
<rogpeppe> ericsnow: I see you're OCR; would you be able to do a review for me, by any chance? http://reviews.vapour.ws/r/2794/
<rogpeppe> natefinch: is that russ's tool?
<natefinch> rog yep
<katco> rogpeppe: it has a ship it doesn't it?
<rogpeppe> katco: i need a review from someone on juju-core
<katco> rogpeppe: ahh ok
<katco> rogpeppe: i'll tal
<rogpeppe> katco: ta!
<natefinch> rogpeppe: it's pretty great... it's even handy for our flaky tests, because you can cache the tests from when they pass ;)
<rogpeppe> natefinch: ha
<rogpeppe> natefinch: my problem with it was that i was often changing something on which lots of things depended
<rogpeppe> natefinch: e.g. if you're working on the charm package, it doesn't really help much
<natefinch> rogpeppe: right, it's useful for full-repo runs of juju/juju
<natefinch> rogpeppe: I like it because it means I run full tests more often... and if I'm modifying something that a ton of stuff is using, at least I'm more aware of that fact.
<rogpeppe> natefinch: yeah
<rogpeppe> natefinch: i should try using it again
<dimitern> TheMue, sorry, was in a call
<rogpeppe> natefinch: i'd like someone to seriously look at speeding the tests up though. i'm not convinced that it requires wholesale refactoring.
<TheMue> dimitern: no problem, found it
<dimitern> TheMue, so a subnet without a space is forbidden by the model
<TheMue> yep, seen it
<TheMue> ;)
<TheMue> dimitern: thx anyway
<natefinch> rogpeppe: yeah, I think we guessed that if we just cleared the db rather than restarting mongo, it would go faster
<rogpeppe> natefinch: a lot of the time is spent dialing mongo. providing a way to make a State from a mongo session might help a lot
<rogpeppe> natefinch: looking at the state tests, even when the test itself only take 0.01 seconds, the actual time taken is more like 0.25s - there's a quarter second overhead on every test
<natefinch> rogpeppe: yeah, I think one thing that would help a lot is if gocheck's time output included test setup/teardown and if it could output suite times, to include setup/teardown
<rogpeppe> natefinch: i suspect that if someone was given a week of time, they could make the tests run twice as fast.
<natefinch> rogpeppe: that would seem to be a no brainer for someone to work on
<rogpeppe> natefinch: +1
<cherylj> perrito666: ping?
<perrito666> cherylj: pong
<perrito666> good afternoon
<cherylj> perrito666: what's the story with https://github.com/juju/utils/pull/155 ?
<cherylj> perrito666: was it in response to a bug?
<cherylj> perrito666: good afternoon :)  Sorry to jump past the pleasantries :)
<perrito666> cherylj: np, I assume that broke something else?
<cherylj> perrito666: yeah bug 1501786
<mup> Bug #1501786: juju cannot provision precise instances: need a repository as argument <blocker> <ci> <precise> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1501786>
<perrito666> mmpf, I presume something changed between versions of apt-add-repository then
<perrito666> cherylj: its a bug found and fixed in place, when adding a ppa it would break by saying something similar
<perrito666> well actually when adding anything at all
<cherylj> on trusty?
<perrito666> vivid
<cherylj> ah
<perrito666> cherylj: you can revert it, it breaks nothing in production
<cherylj> perrito666: k, will do
<perrito666> cherylj: I'll devise a way to make that smarter
<cherylj> perrito666: cool, thanks.  I'll get on the bug, then :)
<katco> rogpeppe: lgtm
<rogpeppe> katco: tvm!
<katco> rogpeppe: sorry for the confusion
<rogpeppe> katco: np
<rogpeppe> katco: in general on our side we always require two reviews anyway, FWIW
<dimitern> frobware, i'm back
<frobware> dimitern, me too
<dimitern> frobware, shall we use the same HO?
<frobware> dimitern, ah... yes. off-by-1 for me... omw
<dimitern> :)
<beisner> mgz - are those openstack provider retry bits in the pkgs @ ppa:juju/devel ?
<mgz> the ones from tim yesterday?
<mgz> not yet.
<beisner> mgz - ack.  before i go drop custom bins onto 23 slaves, do you know an eta of when that might land there?
<katco> mgz: rackspace call is now if you're interested
<mgz> katco: omw
<cherylj> RB isn't picking up my PRs.  Can someone review https://github.com/juju/utils/pull/158 ?
<cherylj> it's for the new blocker
<perrito666> cherylj: ship it
<cherylj> perrito666: thanks!
<rogpeppe> katco: i wonder if you might want to take a look at this one - very similar to the previous one (but only test code this time) so should be quick: http://reviews.vapour.ws/r/2812/
<natefinch> katco: btw, all tests are passing on my branch, just adding a few more to cover things I realized I missed... but adding tests is easy, so it'll be ready for a PR in a bit.
<katco> natefinch: awesome!
<rogpeppe> ericsnow: any chance of a review? http://reviews.vapour.ws/r/2812/
<ericsnow> rogpeppe: katco has offered to handle at least some of the reviewing for me today
<rogpeppe> ericsnow: ok, cool
<ericsnow> katco: would you mind taking a look at this one?
<katco> ericsnow: rogpeppe: res, plan to
<katco> yes
<rogpeppe> ericsnow: i already had one review of katco; maybe asking for two in a day is pushing it :)
<rogpeppe> s/of/off/
<rogpeppe> katco: ta!
<ericsnow> rogpeppe: we're heads-down on some pre-Seattle work :)
<rogpeppe> ericsnow: me too :)
<rogpeppe> ericsnow: (that's what this is)
<ericsnow> rogpeppe: surprise, surprise lol
<cmars> hey rogpeppe i'll take a look
<rogpeppe> cmars: ta!
<rogpeppe> cmars: i'd much appreciate your thoughts anyway as you know the territory...
<cherylj> Can I get another review?  http://reviews.vapour.ws/r/2813/
<cherylj> again, for the blocker
<mgz> cherylj: juju code change looks fine to me, explain the utils revert?
<mgz> will it actually break something on vivid going back to the previous quoting?
<cherylj> mgz: no, I tested that
<mgz> cherylj: okay, lgtm
<katco> natefinch: running a little late
<natefinch> katco: ok, just working on the PR for my branch
<katco> natefinch: finish that up and ping me
<natefinch> katco: ok
<jcastro> alexisb: wow, creepy. Everyone in eco is like, happy with 1.25 other than the bugs filed already
<jcastro> mbruzek actually demo'ed at nagios conf with 1.25
<jcastro> aisrael has one bug he needs to reproduce before filing a bug
<jcastro> but overall I got thumbs up from everyone in our daily calls
<aisrael> jcastro: two possible bugs
<aisrael> I'll get that done tomorrow
<natefinch> katco: http://reviews.vapour.ws/r/2814/
<katco> natefinch: rock tal
<natefinch> katco: sorry it's big.
<natefinch> katco: oops, crap, I see some code I left commented out
<cherylj> wwitzel3!  I got virtual MAAS working without sacrificing a chicken!
<katco> natefinch: no worries, i can review while you fix :)
<wwitzel3> cherylj: yeah, that part isn't actually required, I should probably stop doing it
<cherylj> wwitzel3: haha.  Thanks for the links
<wwitzel3> cherylj: np, glad it worked
<natefinch> katco: gah, the merge with a more up to date master broke some tests... looking into them now.  I'm sure it's more of the same junk
<katco> natefinch: k, i'll keep with the review
<natefinch> katco: haha, no it was that code I had commented out... it actually should have been deleted, not uncommented.  Oops.
<katco> natefinch: fortunate :)
<katco> ericsnow: wwitzel3: natefinch: want to meet up?
<wwitzel3> katco: sure
<ericsnow> katco: sure
<natefinch> katco: yep
<natefinch> is there any way to stop systemd from spamming every single terminal I have?
<perrito666> natefinch: yes, upgrade to mongo 3 ;)
<ericsnow> ping
<ericsnow> pong
<natefinch> pong
<ericsnow> thumper: ping
<thumper> ericsnow: hey, otp just now
<katco> thumper: join moonstone whenever you're ready? https://plus.google.com/hangouts/_/canonical.com/moonstone?authuser=1
<katco> rick_h_: ping
<voidspace> ericsnow: ping
<ericsnow> voidspace: hey :)
<voidspace> ericsnow: hi
<voidspace> ericsnow: you're OCR today I believe :-)
<voidspace> if you have a chance:
<voidspace> http://reviews.vapour.ws/r/2816/
<voidspace> ericsnow: you got over jetlag yet?
<ericsnow> voidspace: I'll take a look
<ericsnow> voidspace: just barely
<ericsnow> voidspace: got hit by an awful sinus infection
<ericsnow> voidspace: feeling better and my body is finally back on track
<voidspace> ericsnow: :-(
<voidspace> good
<ericsnow> voidspace: LGTM
<voidspace> ericsnow: thanks
<voidspace> ericsnow: I wonder if 1.25 is blocked
<voidspace> master has been blocked for days
<ericsnow> voidspace: good luck :/
<voidspace> :-)
<voidspace> ericsnow: did you work on storage?
<ericsnow> voidspace: not at all
<voidspace> heh, can't blame you
<voidspace> can't destroy an environment (without force) because a volume I didn't create doesn't exist
<voidspace> it's a shared amazon account, so I think it's getting confused by other things in the account that it shouldn't be concerned with
<voidspace> ERROR environment destruction failed: destroying storage: destroying volumes: querying volume: The volume 'vol-45ac83a7' does not exist. (InvalidVolume.NotFound), destroying "vol-cbc0ce22": The volume 'vol-cbc0ce22' does not exist. (InvalidVolume.NotFound)
<voidspace> ah well, I'll look into it tomorrow
<voidspace> ericsnow: g'night
<ericsnow> voidspace: have a good one
<axw> rogpeppe: sorry, I ignored it because it had a shipit already
<perrito666> my kingdom for a big machine where to compile stuff
#juju-dev 2015-10-02
<rick_h_> katco: pong
<axw> let the merge wars commence
<axw> thumper wallyworld waigani anastasiamac: if you have things waiting, master just got unblocked
<wallyworld> \o/
<axw> naturally I started mine merging before I told you
<waigani> axw: lol, cheers
<mup> Bug #1501381 changed: panic: cannot pass empty version to VersionSeries() <blocker> <ci> <intermittent-failure> <juju-core:Fix Released by cherylj> <https://launchpad.net/bugs/1501381>
<mup> Bug #1501432 changed: BootstrapSuite tests fail on non-ubuntu platforms with no matching tools <blocker> <centos> <ci> <test-failure> <windows> <juju-core:Fix Released by cherylj> <https://launchpad.net/bugs/1501432>
<mup> Bug #1501786 changed: juju cannot provision precise instances: need a repository as argument <blocker> <ci> <precise> <regression> <juju-core:Fix Released by cherylj> <https://launchpad.net/bugs/1501786>
<rick_h_> perrito666: how big?
<perrito666> rick_h_: something that can compile mongo in less time than it takes launchpad. I am trying to fix the wily build of a mongo3 ppa and I don't have a wily machine nor intentions of breaking my laptop
<rick_h_> perrito666: what's your LP id?
<perrito666> rick_h_: hduran-8
<rick_h_> perrito666: oh bah, it's in use.
<perrito666> rick_h_: ??
<rick_h_> perrito666: well you can ssh to maas.jujugui.org and create an lxc container on that laptop and use it if you'd like, though it's not that fast
<rick_h_> perrito666: I thought we had an unused node in our guimaas the team uses
<perrito666> rick_h_: I appreciate it, ill ponder on it if I dont fall asleep on the kb before
<rick_h_> perrito666: but looks like the team put it to use
<rick_h_> perrito666: ok, well let me know. We've got hardware if it'll help. However you should be able to create a wily lxc and do the builds in there and keep the laptop safe
<natefinch> lol compiling mongo is super slow...  it took forever even on my quad core i7 with a beastly SSD.  That's C++ for you.
<rick_h_> perrito666: worst case can ask urulama to take over the QA environment for a bit and get full machines
<natefinch> I think it took like 45 minutes on my laptop
<rick_h_> natefinch: have a 6 core 32gb machine if the instructions are easy for me to dupe
<natefinch> rick_h_: I haven't done it in like a year, but IIRC, the directions were terrible
<rick_h_> natefinch: :/ well if it would help happy to try to get something better working. shoot, maybe just get a giant ec2 machine for a bit. but understand if it's just a pita
<perrito666> it is sort of a pita, especially the one I am trying to build, which is a debian package
<natefinch> rick_h_: it's a good lesson for why perrito666 shouldn't buy such namby pamby laptops
<rick_h_> lol
<rick_h_> laptops are for portability, many core desktops with lots of ram for real work :)
<perrito666> natefinch: my laptop kicks ass, I just don't want to install wily on it
<natefinch> haha
<rick_h_> perrito666: but surely in an lxc it's safe?
<perrito666> rick_h_: ill give it a shot, I hadn't thought of it
<natefinch> perrito666: just giving you shit :)
<perrito666> natefinch: so is mongo :p
<perrito666> I really need to buy some heavy lifting machine I am just too lazy to import it
<perrito666> and too cheap to pay the local price for what I can get here
<natefinch> perrito666: probably easier to just get us to stop using mongo
<natefinch> perrito666: I saw our new competitor nomad uses boltdb... which blows mongo out of the water as far as compile time and ease of setup.  Of course, it doesn't have replication, so they must be doing that themselves somehow.. but still.. I love boltdb... pure Go, man.  That's the way to do it.
<cmars> I like boltdb, i've used it in some side-projects and it's always been great to work with. I've never really pushed it in terms of concurrent txns though.
<mup> Bug #1502016 opened: Juju-quickstart error message lacks detailed error info <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1502016>
<TheMue> fwereade: Happy Birthday
<dimitern> fwereade, oh wow, happy birthday indeed! :)
<anastasiamac> fwereade: ¸¸♬·¯·♩¸¸♪·¯·♫¸¸Happy Birthday To You¸¸♬·¯·♩¸¸♪·¯·♫¸¸
<fwereade> TheMue, dimitern, anastasiamac: <3
 * fwereade is tired and is taking a pre-travel swap day to relax
<voidspace> fwereade: hippy bathday fella
<voidspace> fwereade: enjoy the rest and seattle
<fwereade> voidspace, cheers :)
<mfoord> frobware: dimitern: conflicts resolved and unit public address branch set to attempt to land on master (again...)
<dimitern> mfoord, awesome!
<mfoord> frobware: dimitern: ec2 subnets branch also landing on master
<dimitern> mfoord,
<dimitern> mfoord, and it did land on 1.25 alread?
<mfoord> dimitern: yup, already landed there
 * dimitern gaahh I seem to be eating letters today
<mfoord> :-)
<rogpeppe> mfoord: fancy a review? http://reviews.vapour.ws/r/2822
<mfoord> rogpeppe: sure, coffee first
<rogpeppe> mfoord: ta!
<mfoord> rogpeppe: the only wart is having assertErrorResponse return the failure for a single use case
<rogpeppe> mfoord: yeah, i had to think a bit about that
<rogpeppe> mfoord: if you've got a better suggestion... ?
<mfoord> rogpeppe: haha, not sure I do :-)
<mfoord> I'll have another look
<mfoord> the rest is straightforward
<rogpeppe> mfoord: thanks
<mfoord> it's a common pattern in error testing (or testing in general) - to have some detail that only needs checking once
<mfoord> rogpeppe: as doer is only used once, and it's a one line function, does it need to be a separate function?
<dimitern> mfoord, is a Doer constructed by a Doerer ?:D
<mfoord> dimitern: the doerer is the function that does the doer
<mfoord> and what that does, well...
<dimitern> is Do()es
<dimitern> it even
<mfoord> rogpeppe: LGTM with the minor proviso about (maybe) inlining doer
<rogpeppe> mfoord: thanks
<rogpeppe> mfoord: it's like that in the other tests
<mfoord> rogpeppe: ah right, fair enough
<rogpeppe> mfoord: it could be inlined, but it's like it is to make it trivial to make other similar tests
<mfoord> ok
<mfoord> fine, drop the issue then
<rogpeppe> mfoord: done :)
<rogpeppe> fwereade: ping
<TheMue> rogpeppe: he's swapping due to Seattle
<rogpeppe> TheMue: ah, np
<dimitern> TheMue, reviewed - quite a lot of comments I'm afraid, ping me if anything is unclear
<TheMue> dimitern: ok, and I appreciate any comments
<TheMue> dimitern: jfi, I'll fix it in the 1.25 branch first but mark the issues in your review. later we can forward-port the 1.25
<dimitern> TheMue, cheers, sounds good
<bogdanteleaga> niemeyer: is -run deprecated in gocheck in favor of check.f?
<TheMue> dimitern: great input, you speak juju networking fluently ;) changes are now visible in http://reviews.vapour.ws/r/2823/
<dimitern> TheMue, looks good, thanks! I'll leave frobware / voidspace to proof-read as discussed
<TheMue> dimitern: cheers
<frobware> TheMue, I have a few comments - just wondering whether you want to address dimiter's first before I add mine
<TheMue> frobware: I already have done and pushed it, see http://reviews.vapour.ws/r/2823/.
<TheMue> frobware: dimitern reviewed my master branch, this is the 1.25 one.
<frobware> TheMue, I've added some comments but will want to take another pass. Just adding and publishing now so you're not blocked. OK?
<TheMue> frobware: great, thanks
<TheMue> frobware: deploy.go:118, singular "space" or plural "spaces"? the latter sounds more natural to me.
<frobware> TheMue, You could drop this bit "but not of the cmd and the database space"
<TheMue> frobware: ok, thx
<frobware> TheMue, leaving it as "deploy 2 instances of haproxy on cloud instances inside subnets of the dmz space only"
<TheMue> frobware: short and clear, sounds even better ;)
<frobware> TheMue, also note I saw a reference to "carret" which should be "caret"
<TheMue> frobware: fixed
<niemeyer> bogdanteleaga: Not deprecated per se.. -run still has its usual meaning, but it's orthogonal to gocheck.. The latter has check.f for its own use
<bogdanteleaga> niemeyer: hmm, -run doesn't seem to work for me, while check.f does with the same simple regex
<mup> Bug #1502127 opened: TestNoUniterUpdateStatusHookInError never reached desired status <ci> <intermittent-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1502127>
<mup> Bug #1502130 opened: charm upload fails often on maas <charm> <ci> <maas-provider> <regression> <juju-core:Triaged> <https://launchpad.net/bugs/1502130>
<mup> Bug #1502139 opened: Panic in TestUpgradeJuju on wily and vivid <ci> <intermittent-failure> <panic> <unit-tests> <vivid> <wily> <juju-core:Triaged> <https://launchpad.net/bugs/1502139>
<mgz> what are you on mup...
<mup> Bug #1502139 opened: Panic in TestUpgradeJuju on wily and vivid <ci> <intermittent-failure> <panic> <unit-tests> <vivid> <wily> <juju-core:Triaged> <https://launchpad.net/bugs/1502139>
<niemeyer> bogdanteleaga: Probably because you're expecting -run to do something it doesn't do
<bogdanteleaga> niemeyer: the docs seem to say they do pretty much the same thing(-run and check.f), just that check.f takes suites as well
<mup> Bug # opened: 1502149, 1502153, 1502154, 1502158
<frobware> dimitern, I was just getting back to the demo. Which region did you create the subnets in?
<dimitern> frobware, eu-central-1
<TheMue> frobware: dimitern: done, now just swapping offices. if branch is ok I will immediately merge it into 1.25 and then port it to master
<dimitern> TheMue, cheers, will have one last look
<mup> Bug #1502140 opened: panic in lastlogin upgrade step <blocker> <ci> <regression> <upgrade-juju> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1502140>
<frobware> dimitern, TheMue: taking another look
<niemeyer> bogdanteleaga: Just that they are different, yes :-)
<mup> Bug #1502140 changed: panic in lastlogin upgrade step <blocker> <ci> <regression> <upgrade-juju> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1502140>
<frobware> dimitern, how do I get to eu-central? My choices are frankfurt or ireland.
<mgz> the 1.25 branch is blocked at present, see ^ that bug
<dimitern> frobware, eu-central-1 is in frankfurt :)
<frobware> dimitern, yeah. I noticed. <face palm>
<frobware> :)
<natefinch> fwereade: you around?
<natefinch> ericsnow: anything I need in the environments.yaml besides type: lxd?
<ericsnow> natefinch: nope
<katco> fwereade: hey can you give a high-level review of http://reviews.vapour.ws/r/2814 ? i'm doing a detailed review, but i want to make sure we're not doing anything wrong, conceptually
<katco> fwereade: i.e. i think it will need a +2 from you
<katco> natefinch: i see several places ericsnow's multi error patch would help :)
<katco> if only it were in the codebase.... le'sigh
<ericsnow> katco: what!!! someone isn't opposed!  ;)
<katco> ericsnow: hey, i've always thought it was a good idea
<katco> ericsnow: very common patterns
<ericsnow> katco: all providers use it (and not for necessarily API-related stuff)
<ericsnow> (it = the pattern)
<natefinch> definitely a common APIMultiError  would be good.... it's just proposed as being more general than it needs to be now
<ericsnow> katco: check out Environ.Instances in conjunction with ErrNoInstances and ErrPartialInstances
<ericsnow> natefinch: I wrote it in the first place to meet a non-API use case
<katco> natefinch: yeah not sure that's true... ericsnow ran into a place yesterday
<natefinch> it just assumes you have an id for each error, which is not always the case... but it would fit many cases, most of which are bulk calls in the API
<ericsnow> natefinch: the ID is neither required nor necessarily unique
<perrito666> bbl people
<mgz> natefinch: I don't like that retry function much
<mgz> means all those tests will have an extra 3s ceiling runtime?
<natefinch> mgz: we have to wait for the worker... I don't know how to get around that
<natefinch> mgz: and in theory it's up to 3 seconds... though in small scale tests I did find it took a couple seconds for the watcher to notify the worker and for the worker to do its job
<natefinch> mgz: I don't know why the worker/watcher cycle is so slow
<natefinch> mgz: I also don't know if that slowness is outside the norm.... are they usually instant?  I don't know the details of how the watchers and workers work.... but I know there's a lot of machinery between them.
<mgz> natefinch: yeah, i don't have a good answer on overall speed, but why not use LongAttempt?
<natefinch> mgz: I thought about using the attempt code, but it didn't really fit with the pattern I wanted, IIRC
<natefinch> mgz: I could certainly look at it again
<natefinch> mgz: on second thought, I could make that work.. I missed the fact that I could call HasNext()
<mgz> natefinch: that would make me happier, at least we're staying with the existing constants
<natefinch> mgz: makes me happier, too.
<mup> Bug #1502202 opened: toolsversionchecker worker failing <juju-core:New> <https://launchpad.net/bugs/1502202>
<natefinch> mgz: fixed :)
<mgz> natefinch: thanks!
<natefinch> mgz: thanks for getting me to look at the attempt code again.  I definitely prefer using something already written than handcrafting my own.
<aisrael> jcastro: alexisb: lp:1502202 is the first I found. I'm trying to recreate the second.
<mup> Bug #1502202 changed: toolsversionchecker worker failing <juju-core:New> <https://launchpad.net/bugs/1502202>
<aisrael> jcastro: alexisb: I can't recreate the second issue. I've been running 1.25 since alpha1 for daily work and it's been solid except for the previously mentioned issue. Kudos.
<katco> natefinch: that is some hairy code. lgtm, but i think we really need someone like fwereade to look at it
<fwereade> katco, natefinch: link? (no promises today though)
<katco> fwereade: http://reviews.vapour.ws/r/2814/
<fwereade> katco, cheers
<katco> fwereade: a high-level overview should suffice. just need to make sure we're conceptually doing the right thing.
<katco> fwereade: this is the combining txns you and natefinch have been discussing
<fwereade> katco, so I see -- awesome :)
<natefinch> fwereade: I have a couple concerns... I'm using raw ids across the API... I think that's a no-no, but not sure if I should be creating a new tag just for my new mini collection
<fwereade> natefinch, for watcher results it's the least wrong thing to do actually
<fwereade> natefinch, might be inclined to go with unit ids though
<natefinch> fwereade: that's certainly doable
<natefinch> fwereade: the other one is that this is a general worker, but maybe it should be a singular worker?  We don't want multiples of these things doing assignments of the same things at the same time
<fwereade> natefinch, (translating id->tag at apiserver Next() time has complications -- but that's where it should be fixed, and for every watcher at once)
<fwereade> natefinch, I'd favour making it safe when racing with itself
<fwereade> natefinch, singular is a bit yucky
<fwereade> natefinch, but you're probably right
<fwereade> natefinch, yeah, make it singular, we just need a less layer--breaky singular implementation
<natefinch> fwereade: yeah, I started to go down the singular route, but all the current examples seem to use state directly, which I know is a Bad Thing... and trying to get it working with the API was tricky
<fwereade> natefinch, yeah, it should just affect where you launch it in  the machine agent
<fwereade> natefinch, there's a singular wrapper func which at least keeps the evil out of its clients' hair
<fwereade> natefinch, gtg for now
<natefinch> fwereade: ok, thanks for the input
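fwereade's "safe when racing with itself" can be illustrated with an idempotent assignment: if two copies of the worker race, only one actually performs the operation and the duplicate is harmless. This is a hypothetical in-memory sketch using a compare-and-set; the real juju code would use mongo transaction asserts instead:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// unit is a hypothetical stand-in for something a worker assigns.
type unit struct {
	assigned atomic.Bool
}

// assign returns true only for the racer that actually performed
// the assignment; a losing duplicate is a harmless no-op.
func (u *unit) assign() bool {
	return u.assigned.CompareAndSwap(false, true)
}

func main() {
	var u unit
	var wins atomic.Int32
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ { // two rival copies of the worker
		wg.Add(1)
		go func() {
			defer wg.Done()
			if u.assign() {
				wins.Add(1)
			}
		}()
	}
	wg.Wait()
	fmt.Println("winners:", wins.Load()) // winners: 1
}
```

With this property, running the worker singular becomes an optimization rather than a correctness requirement, which is why fwereade prefers it.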
<natefinch> ericsnow: gah, I hate having methods on a type outside the file where that type is defined... when we get some time, can we maybe not do that?
<ericsnow> natefinch: yeah, we should split types like that up along the same line as the files (and compose them into the main type)
<ericsnow> natefinch: otherwise we end up with enormous files
<natefinch> ericsnow: well.... if it's done well, and not just done to get the methods out of the same file, I think that's ok.  But given that the environ.go file is only 147 lines right now, I think we have a long way to go before we need to start breaking things up.
<alexisb> frobware, you potentially around?
<perrito666> how do you people keep feature branches up to date?
<natefinch> perrito666: merge from master once in a while
<natefinch> perrito666: it's not fun
<perrito666> natefinch: was thinking on the details :)
<perrito666> pr?
<perrito666> direct merge?
<natefinch> perrito666: meh, just merge
<natefinch> rebase if you're feeling lucky
<perrito666> natefinch: and you push that to the feature branch directly?
<natefinch> perrito666: oh uh... hmm..
<natefinch> perrito666: I guess PR.. sorry, wasn't thinking
<natefinch> perrito666: though it's generally not something that can really get reviewed
<perrito666> mm, can I get permissions to push into a branch in our repo?
<mup> Bug #1502140 changed: panic in lastlogin upgrade step <blocker> <ci> <regression> <upgrade-juju> <juju-core:Incomplete> <juju-core 1.25:Triaged by cherylj> <https://launchpad.net/bugs/1502140>
<mup> Bug #1502140 opened: panic in lastlogin upgrade step <blocker> <ci> <regression> <upgrade-juju> <juju-core:Incomplete> <juju-core 1.25:Triaged by cherylj> <https://launchpad.net/bugs/1502140>
<mup> Bug #1502306 opened: cannot find package gopkg.in/yaml.v2 <blocker> <ci> <regression> <juju-core:Incomplete> <juju-core lxd-provider:Triaged> <https://launchpad.net/bugs/1502306>
<mup> Bug #1502314 opened: juju-deployer never completes landscape deployment to vmLDS <kanban-cross-team> <landscape> <falkor:Triaged> <juju-core:New> <https://launchpad.net/bugs/1502314>
<mup> Bug #1502314 changed: juju-deployer never completes landscape deployment to vmLDS <kanban-cross-team> <landscape> <falkor:Triaged> <juju-core:New> <https://launchpad.net/bugs/1502314>
<alexisb> aisrael, jcastro, I really appreciate the feedback on 1.25, that is great!
<beisner> alexisb, do we have an eta on the fix for the being ppa-consumable somewhere? https://bugs.launchpad.net/juju-core/+bug/1335885
<mup> Bug #1335885: destroy-environment reports WARNING cannot delete security group <amulet> <cloud-installer> <destroy-environment> <landscape> <openstack-provider> <security>
<mup> <uosci> <juju-core:In Progress> <juju-core 1.24:Fix Committed by thumper> <juju-core 1.25:In Progress by thumper> <https://launchpad.net/bugs/1335885>
<alexisb> heya beisner
<alexisb> we have a fix committed for that bug for 1.24, but we are currently working to get a blessed build for 1.24 on CI
<alexisb> beisner, eta for a 1.24.7 release is next week pending a blessed CI revision
<beisner> alexisb, woot, thanks
#juju-dev 2016-10-03
<frankban> axw: could you take a look at https://github.com/juju/juju/pull/6354 ? I think I made all the changes suggested, I'd like another +1 before I land it
<axw> frankban: looking
<frankban> axw: ty
<axw> frankban: I've just pushed one comment, but I'm still reviewing
<frankban> cool
<axw> frankban: done
<frankban> axw: thanks, good suggestions
<mup> Bug #1629817 opened: [arm64] bad cpu-cores detection <juju-core:New> <https://launchpad.net/bugs/1629817>
<voidspace> mgz: ping
<voidspace> mgz: hmmmm... unping
<voidspace> mgz: was going to ask for vsphere creds, but I see them in cloud-city as usual
<rogpeppe> anyone know if there's a way to list all the users that have been granted permissions on a model?
<perrito666> Morning all
<rogpeppe> perrito666: hiya
<rogpeppe> perrito666: do you know the answer to the question i just asked, by any chance?
<perrito666> Mmm there is iirc wallyworld implemented something like that, can't remember where though
<rogpeppe> perrito666: i see "juju users" but that tells me all users for a controller, not a model
<perrito666> rogpeppe: something with shares possibly?
<rogpeppe> perrito666: "juju commands | grep -i share" doesn't come up with anything
<voidspace> mgz: ping about vsphere
<voidspace> katco: natefinch: I am likely to be late for standup - daughter has to go to an emergency dentist's appt
<natefinch> voidspace: yikes. Hope she's ok
<voidspace> natefinch: just toothache, nothing serious
<voidspace> natefinch: I'm kind of glad - she still has her baby teeth and it's a good lesson as to why toothbrushing is necessary...
<natefinch> voidspace: fair enough :)
<perrito666> rogpeppe: sorry I hung up on you, bank made me stash my phone under threat of arrest
<natefinch> lol argentina
<perrito666> natefinch: we'll see who laughs after the wig-driven dude gets command of your country
<natefinch> perrito666: lol
<natefinch> perrito666: I was going to say, in my country you can get shot for holding a toy gun in the toy gun aisle of a toy store
<perrito666> in my country you can get shot for rand.Rand{}.int()
<natefinch> well, using insecure pseudo-random numbers is a capital offense
<perrito666> lol
<perrito666> natefinch: the reason phones are forbidden in banks here is that there was this mob of burglars who would tip off outsiders from inside the bank when someone withdrew large sums of cash
<perrito666> (you can see the level of footshooting put into the solution)
<natefinch> ahh, so, same reason we take off our shoes at the airport.  Because "this one time...."
<perrito666> yup
<perrito666> I really hope no one does anything odd with shoes in a bank here, or visiting the banks is going to be very funny
<natefinch> very japanese
<cholcombe> i need a hand setting a breakpoint on juju.  I've got gdb and dlv installed.
<cholcombe> cmd/juju/application/deploy.go deployCharm seems to be what is failing on me
<cholcombe> however gdb and dlv don't seem to know how to break on that function
<natefinch> gdb is dicey at best, but I've had good experience with delve via editor integration.  I haven't used it via the CLI
<cholcombe> natefinch, delve kinda blew me away when i did a funcs command.  over 100K lines of function def's
<cholcombe> natefinch, any idea how to point delve at the right breakpoint?
<natefinch> cholcombe: use editor integration, it makes things 1000x easier.  What editor do you use?
<cholcombe> natefinch, well i'm trying to debug the controller running jujud.  i haven't tried building juju and running locally yet.  maybe that's the way to go?
<natefinch> cholcombe: ahh...  delve does have remote debugging capability, which might still be the best way to go. For delve I'm pretty sure you have to build the binary with delve debugging enabled, but I'm not 100% sure
<cholcombe> gah
<rogpeppe> perrito666: np
<natefinch> cholcombe: I might be wrong about that... the docs only mention that if you build with delve it makes the binary "easier to debug"
<cholcombe> yeah
<cholcombe> i thought gdb could handle this also but it appears to not be able to find anything either
<cholcombe> i suspect the jujud binary was built without debugging flags
<natefinch> gdb is known to not work well with go
<cholcombe> oh ok
<cholcombe> the problem is i have a charm which juju for some reason refuses to deploy
<natefinch> cholcombe: the best way to debug these things is to start off with the logs..... what does juju do when you try to deploy it?
<cholcombe> natefinch, https://gist.github.com/cholcombe973/d0ab4865691d6996d0c3b20578f269ca
<cholcombe> charm proof passes and this charm looks like any other charm.  i'm not sure what juju's problem is
<cholcombe> i'm running 2.0-rc2
<cholcombe> it used to deploy but i don't remember which version it worked on.  probably after i updated to 2.0 beta's it started failing
<natefinch> cholcombe: that looks like it might be a bug
<natefinch> cholcombe: let me try here
<cholcombe> natefinch, yeah none of the beta's or rc's work on this charm.  i don't know what's special about it
<cholcombe> natefinch, i also pushed it to the charm store on --channel edge and that also fails
<cholcombe> natefinch, juju deploy cs:~xfactor973/gluster-charm-1 --channel edge
<cholcombe> natefinch, i did a charm grant everyone on that in case you want to try
<natefinch> cholcombe: crud, I just realized the time, I have to run an errand.  Can you try deploying another local charm and see if that works?  It seems unlikely that the contents of your charm would give a "not found" style error
<cholcombe> other local charms work perfectly fine
<cholcombe> there's something special about this one
<cholcombe> i've tried deleting everything in the repo that is unique and it didn't help
<natefinch> wacky, ok
<natefinch> try removing the dash in the name
<natefinch> I gotta run, maybe someone else can help.  Sorry.
<babbageclunk> Can anyone help me with a lxd problem?
<babbageclunk> I got bitten by this https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1629766 while trying to install something unrelated.
<mup> Bug #1629766: Syntax error in /usr/lib/lxd/upgrade-bridge <lxd (Ubuntu):Fix Committed> <https://launchpad.net/bugs/1629766>
<babbageclunk> So I did my best to totally remove lxd and the half-configured interfaces that the failed upgrade left behind.
<babbageclunk> But now I can't bootstrap to lxd - I get a message saying to run `sudo dpkg-reconfigure -p medium lxd`, but running it doesn't bring up the network config questions it used to.
<babbageclunk> It just says "Warning: Stopping lxd.service, but it can still be activated by:
<babbageclunk>   lxd.socket
<babbageclunk> " and then completes with returncode=0.
<rogpeppe> i've factored out some of the juju/worker code from juju/juju so that we can easily reuse it across projects. i made a few changes as i did so - reviews welcome: https://github.com/juju/worker/pull/1
<cholcombe> natefinch, i found the problem.  when i deleted the actions.yaml file it started deploying again.
<cholcombe> must be bad syntax or something.  juju really needs a better error message for that
<mup> Bug #1629912 opened: the provision network IP addresses have to be smaller than the ones on the public network <juju-core:New> <https://launchpad.net/bugs/1629912>
<mup> Bug #1629919 opened: destroy-controller fails and a kill-controller is required. <juju-core:New> <https://launchpad.net/bugs/1629919>
<alexisb> dooferlad, ping
<alexisb> or voidspace
<alexisb_> dooferlad or voidspace ping
<voidspace> alexisb_: pong
<alexisb_> heya voidspace
<alexisb_> just quick q for bug triaging
<alexisb_> https://bugs.launchpad.net/juju-core/+bug/1629912
<mup> Bug #1629912: the provision network IP addresses have to be smaller than the ones on the public network <juju-core:New> <https://launchpad.net/bugs/1629912>
<alexisb_> ^^^ looks like an ip ordering issue which is a known bug?
<voidspace> alexisb_: that's what it sounds like, although it's unclear
<voidspace> alexisb_: I didn't know we had an IP ordering bug!
<alexisb_> hmm, ok :)
<alexisb_> I will give it to rick for now then
<bdx_> hows it going all?
<bdx_> is there a way to specify what subnet juju bootstraps to?
<mup> Bug #1629912 changed: the provision network IP addresses have to be smaller than the ones on the public network <juju-core:New> <https://launchpad.net/bugs/1629912>
<hatch> is there a way to get the juju controller and model uuid from the cli?
<alexisb_> juju show-model
<alexisb_> juju show-controller
<alexisb_> is that what you were asking hatch?
<hatch> alexisb_: how embarrassing
 * hatch slowly backs away and closes the door on his way out
<hatch> :D
<alexisb_> :)
<perrito666> hatch: isn't this where your fellow countrymen say their most iconic catchphrase?
<hatch> perrito666: lol
<mup> Bug #1629951 opened: cannot specify subnet to  create controller in on bootstrap <juju-core:New> <https://launchpad.net/bugs/1629951>
<gQuigs>  new to Go.. how can I build/install juju 1.25?  the instructions seem geared towards trunk...
<natefinch> gQuigs: after getting the code, git checkout 1.25 (or juju-1.25.6 for the latest 1.25 release)
<natefinch> gQuigs: you can download/install 1.25 on many machines... you probably don't need to build it unless you want to hack on it
<gQuigs> natefinch: I need to test a 1.25.6 fix that might break stuff
<natefinch> gQuigs: valid :)
<gQuigs> a binary build of 1.25.6 would be fine, but I haven't found one
<natefinch> gQuigs: https://jujucharms.com/docs/1.25/reference-releases
<natefinch> gQuigs: it's hard to find if you don't know how to get there (and even if you do it's like 4 clicks)
<natefinch> gQuigs: scroll down to "Proposed"
<gQuigs> natefinch: I have actually found that, but the Proposed branch hasn't moved to include the fix I'm looking for (at least according to the PPA) - https://launchpad.net/~juju/+archive/ubuntu/proposed?field.series_filter=xenial
<natefinch> gQuigs: ahh, bummer
<gQuigs> but I never tried the centos (tar.tz) builds.. could they be newer?
<natefinch> I'm sure they're all built off the same codebase
<gQuigs> yup, confirmed that, all the same older version :(
<natefinch> gQuigs: building juju really isn't hard: http://pastebin.ubuntu.com/23271348/
<natefinch> gQuigs: lines 4-8 are just setup... after that, rebuilding just requires go install ./...  (which means build everything in this directory and all subdirectories)
<gQuigs> ah, looks like I got most of that, except for the go get launchpad.net/godeps
<gQuigs> let's see
<natefinch> ahh yeah, you need godeps... that sets all the repos juju depends on to the correct revisions
<natefinch> If you want to run the tests, you'll need to install mongo (yes, it's terrible, I'm sorry) ... but if you just want to build and run, you don't need it.
<gQuigs> well, it worked, but maybe I'm just blind, but how do I actually run it?
<natefinch> haha
<natefinch> so the binaries get put in $GOPATH/bin
<natefinch> juju and jujud
<gQuigs> yea, they didn't end up there...
<natefinch> oh uh hmm
<gQuigs> maybe I should try it again.. I had already tried to get juju installed...
<natefinch> alternatively, from the root of the repo, you can do go build ./cmd/juju and go build ./cmd/jujud and it'll put the binaries in the current directory
<gQuigs> natefinch: could this be a difference between go on trusty vs xenial?  (sorry forgot to mention I'm on trusty)
<gQuigs> ahh.. blah.. do I have to manually install go 1.6?
<natefinch> gQuigs: what does go version say?
<gQuigs> go version go1.2.1 linux/amd64
<natefinch> oh yeah, that won't work
<natefinch> I mean, it might work in 1.25
<natefinch> but it really should be 1.6
<gQuigs> yea, I just installed go1.6 manually
<gQuigs> will try again
<thumper> morning all
<alexisb_> morning thumper
<gQuigs> thanks a billion for your help nate!
<gQuigs> (it's working now)
<katco> thumper: morning tim. here, start the day with a fresh cup of easy review: https://github.com/juju/charm/pull/224
<thumper> katco: shipit
<katco> thumper: ta
<katco> thumper: will have a series of these soon... maybe only 1 more based off the pr you have already reviewed
<natefinch> gah, is there no way to juju login without it being interactive?
<natefinch> thumper: do you know about login/logout?  This code is a mystery to me...
<thumper> as in the CLI?
<thumper> no
<thumper> from the apiserver, yes
<natefinch> thumper: yeah. cli.... there's a bug, but I can't even really find where we're prompting for a password...
<gQuigs> heh.. agent-streams: local works :)
<alexisb_> thumper, I assume you are ditching the bug scrub call?
<thumper> yeah
<alexisb_> which is fine by me, means i can find some lunch
<thumper> ok
 * thumper recalls a bug about what he is looking at
<mup> Bug #1630029 changed: models should inherit vpc-id from controller  <juju-core:New> <https://launchpad.net/bugs/1630029>
<alexisb_> wallyworld, you around?
<wallyworld> alexisb_: yes, otp
<katco> wallyworld: ohai... thought you were on national holiday. we can still do our 1:1 if you like
<wallyworld> one sec
<perrito666> wallyworld: welcome back
<wallyworld> katco: ok, one sec
<wallyworld> alexisb_: sorry, free now, did you have a question?
<mup> Bug #1630029 opened: models should inherit vpc-id from controller  <juju-core:New> <https://launchpad.net/bugs/1630029>
<alexisb_> wallyworld, yes, thumper and I do: https://bluejeans.com/5036865018
<thumper> wallyworld:  https://bugs.launchpad.net/juju/+bug/1629089
<mup> Bug #1629089: API Login ACL response values differ from CLI <juju:In Progress by thumper> <https://launchpad.net/bugs/1629089>
<perrito666> axw: lemme know if https://github.com/juju/juju/pull/6321#pullrequestreview-2234618 suits what you asked so I can merge this
<axw> perrito666: LGTM, thanks
<perrito666> tx axw
#juju-dev 2016-10-04
<alexisb> wallyworld, looks like curtis did open a bug: https://bugs.launchpad.net/juju/+bug/1629985
<mup> Bug #1629985: TestNewModelConfig can't connect to the local LXD server (lxd 2.3) <ci> <lxd> <regression> <unit-tests> <xenial> <yakkety> <juju:Triaged by wallyworld> <https://launchpad.net/bugs/1629985>
<alexisb> I assigned it to you
<wallyworld> ok
<wallyworld> alexisb: but it should be assigned to tycho
<wallyworld> as it is his pr
<wallyworld> i assume he will be the one landing the fix
<alexisb> tych0, are you around to land you PR after wallyworld reviews?
<wallyworld> i am testing it and also reviewing
<thumper> https://github.com/juju/juju/pull/6372 anyone?
<thumper> I'm off to give the dog a quick walk while the sun is shining
<mup> Bug #1629113 opened: Juju deploy wordpress fails with MaaS <juju-core:New> <https://launchpad.net/bugs/1629113>
<menn0> thumper: remind me how to see the running workers/goroutines in a controller agent?
<menn0> using the developer-mode stuff
<thumper> juju-goroutines
<menn0> thumper: thanks
<wallyworld> axw: howdy, got time for a hangout about credentials?
<axw> wallyworld: yup, just give me a minute please
<wallyworld> sure
<axw> wallyworld: see you in 1:1?
<wallyworld> yup
<menn0> wallyworld: I *think* I've got a fix for the problem of Juju taking ages to recover when the primary controller node goes away
<wallyworld> oh stop it, now you're just teasing
<menn0> wallyworld: with the fix in place the cluster recovers (i.e. apiservers back up) in under 30s
<menn0> and a good chunk of that time is mongodb arranging a new primary
<wallyworld> wow, finally!
<wallyworld> menn0: was it a mgo fix?
<menn0> the fix is hacky at the moment so I won't be proposing it until tomorrow
<menn0> nope
<menn0> it's around HackLeadership and the handling of state in the apiserver
<wallyworld> joy
<menn0> the fix also means that the apiserver no longer needs its own copy of state
<menn0> i'm going to help out at home for a bit now and tidy up the fix later on this evening
<wallyworld> sgtm
<babbageclunk> Anyone else using lxd 2.3?
<babbageclunk> I'm getting weirdly slow juju bootstraps with it - not sure whether it's me or them.
<babbageclunk> mwhudson: ping?
<mwhudson> babbageclunk: mmf
 * babbageclunk isn't sure whether that means you're ready to receive or not.
<mwhudson> babbageclunk: it means i should be in bed but i'm not so be quick :)
<babbageclunk> mwhudson: Ok! :)
<babbageclunk> mwhudson: I'm trying to find a memory leak in Juju
<babbageclunk> mwhudson: I found something that purports to view heapdumps, but it needs a patch to Go.
<babbageclunk> mwhudson: I was looking at the patch before I tried building it myself and saw a TODO with your name on it.
<mwhudson> babbageclunk: ok
<mwhudson> heh
<mwhudson> uh oh
<mwhudson> where is it?
<babbageclunk> mwhudson: So I thought, "Hey, I wonder whether he knows anything about viewing heap dumps."
<babbageclunk> https://github.com/tombergan/goheapdump/blob/master/runtime.heapdump.go.patch
<mwhudson> oh
<babbageclunk> I've tried a couple of things so far but not managed to get anything to work.
<mwhudson> that's boring, basically if you have a go program used shared libraries you only get part of the info
<mwhudson> babbageclunk: sorry don't really know anything, i'm afraid
<babbageclunk> ok, thanks anyway! Sorry to keep you up!
<mwhudson> babbageclunk: i wonder why he hasn't sent that patch upstream
<mwhudson> babbageclunk: no worries, sorry i couldn't help
<babbageclunk> mwhudson: Well, I think it's something that's still in flux. Found it from here: https://github.com/golang/go/issues/16410
<mwhudson> babbageclunk: heh i guess i got all the mails for that issue but just blanked on them :-)
 * mwhudson goes to bed
<perrito666> morning all
<babbageclunk> perrito666: o/
<mup> Bug #1630123 opened: OpenStack base 45 not being deployed with Juju GUI <juju-core:New> <juju-gui:New> <https://launchpad.net/bugs/1630123>
<perrito666> how do I create a model with a different owner from the cli, does annyone know?
<natefinch> cmars, rogpeppe1: I'm having trouble understanding the proper behavior of logout/login got a minute to talk?
<cmars> natefinch, not at the moment. i may be free at 10:30 central
<natefinch> cmars: ok
<rogpeppe1> natefinch: what's up?
<natefinch> rogpeppe1: I'm confused as to the expected behavior of login/logout
<natefinch> rogpeppe1: if I log out of a GCE environment that I created, is juju supposed to prompt for a password when I log back in?
<rogpeppe1> natefinch: s/environment/model/ ?
<rogpeppe1> natefinch: this is without an identity-url configured, right?
<natefinch> rogpeppe1: yeah, I just bootstrapped with the normal defaults to google
<rogpeppe1> natefinch: what did the logout command print?
<natefinch> rogpeppe1: and yes, I guess it's logging into the modelk?  or the controller? I'm not sure
<natefinch> rogpeppe1: $ juju login
<natefinch> username: admin
<natefinch> You are now logged in to "gce" as "admin@local".
<rogpeppe1> natefinch: you always log into a controller, although you may choose an API connection specific to a model when you log in
<rogpeppe1> natefinch: what did the logout command print?
<natefinch> rogpeppe1: $ juju logout
<natefinch> Logged out. You are still logged into 1 controller.
<natefinch> (after changing my password)
<natefinch> rogpeppe1: the bug I'm working on says I should be prompted for my password when I log back in, but I'm not sure when that's the case and when it's not, since I know there's some macaroon stuff...
<natefinch> rogpeppe1: https://bugs.launchpad.net/bugs/1621375
<mup> Bug #1621375: "juju logout" should clear cookies for the controller <juju:In Progress by natefinch> <https://launchpad.net/bugs/1621375>
<rogpeppe1> natefinch: after logging out, what do you see in accounts.yaml ?
<natefinch> rogpeppe1:   gce:
<natefinch>     user: admin@local
<natefinch>     last-known-access: superuser
<rogpeppe1> natefinch: hmm, that's odd - it should have deleted the entry
<rogpeppe1> natefinch: although it doesn't have any password, which is also odd
<natefinch> rogpeppe1: where is the code that controls the password etc?  I can't seem to find it
<rogpeppe1> natefinch: what does "juju switch" print?
<rogpeppe1> natefinch: inside the jujuclient package
<natefinch> rogpeppe1: at the time, gce was my default controller... I have since messed with things
 * natefinch rebootstraps to do more testing
<rogpeppe1> natefinch: so if you do "juju switch gce; juju logout", do you still have a gce entry in your accounts.yaml ?
<rogpeppe1> natefinch: you shouldn't need to re-bootstrap
<rogpeppe1> natefinch: this is a client-side problem AFAICS
<natefinch> rogpeppe1: yes, I just couldn't find the client side code :)
<natefinch> rogpeppe1: also, I wasn't clear on when it should prompt and when it should not, based on macaroons and all that jazz
<natefinch> rogpeppe1: where's the code that produces the password prompt?  some naive greps under jujuclient isn't finding much
<rogpeppe1> natefinch: after it's told you it's logged in, can you actually use the connection (e.g. use juju status, juju list-models, etc?)
<natefinch> rogpeppe1: yep
<natefinch> rogpeppe1: just tried it
<rogpeppe1> natefinch: ISTR there's a "local login" macaroon created somewhere, and it looks as if logout is failing to delete it
<rogpeppe1> natefinch: axw is the man to talk to
<natefinch> rogpeppe1: k, thanks
<rogpeppe1> natefinch: it's almost certainly in the cookie jar
<rogpeppe1> natefinch: if you get logout to remove all cookies in the cookie jar associated with the controller, it'll probably work
<rogpeppe1> natefinch: FWIW the password prompting is hooked up in modelcmd.NewAPIContext (environschema.v1/form is the thing that actually does the prompting)
<rogpeppe1> natefinch: hmm, actually i'm not sure about that
<alexisb> babbageclunk, ping
<natefinch> rogpeppe1: ok, thanks, I'll look at that....
<natefinch> rogpeppe1: totally would not expect a function called NewAPIContext to do login stuff
<rogpeppe1> natefinch: actually, the user password stuff is done in api/authentication
<rogpeppe1> natefinch: in Visitor.VisitWebPage
<natefinch> rogpeppe1: yeah, I was just looking there.  Again.. VisitWebPage is not a function I would expect to prompt for a password via the command line :/
<mup> Bug #1630123 changed: OpenStack base 45 not being deployed with Juju GUI <juju-gui:Invalid> <https://launchpad.net/bugs/1630123>
<rogpeppe1> natefinch: yeah, history plays a role in that name
<rogpeppe1> natefinch: i think "Interact" might be a better name really
<rogpeppe1> natefinch: i'm going for lunch now
<natefinch> rogpeppe1: kk
<natefinch> rogpeppe1: thanks for the help
<rogpeppe1> natefinch: np
<voidspace> mgz: ping
<mgz> voidspace: yo
<voidspace> mgz: I need help with vsphere and the company vpn
<voidspace> mgz: ah, it's standup time
<mgz> I'm in standup for fast bandwidth options too
<voidspace> mgz: cool
<mgz> happy to hang on after
<perrito666> oh, look at that, you can't add-model for a user other than yourself when you have add-model
<perrito666> rick_h_: alexisb-afk is that the intended behavior?
<rick_h_> natefinch: ping for standup
<rick_h_> perrito666: huh?
<perrito666> rick_h_: lets say I grant you addmodel
<perrito666> you can juju addmodel mymodel
<perrito666> which makes you happy
<perrito666> but you can not juju add-model --owner=perrito hismodel
<perrito666> ok, I might get soon disconnected
<perrito666> guess who lives where that giant blob of red is going http://www.smn.gov.ar/vmsr/general.php?dir=YVcxaFoyVnVaWE12WVhKblpXNTBhVzVoYzJWamRHOXlhWHBoWkdFdmFXNW1MM05q
<alexisb-afk> perrito666, stay safe
<perrito666> alexisb-afk: I have a roof :p
<perrito666> but I don't have faith in my internet air link
<perrito666> rick_h_: ping me after standup so we discuss this plz
<rick_h_> perrito666: rgr
<perrito666> rick_h_: dont forget me
<babbageclunk> perrito666: mind talking through some architecture details with me while you wait for rick_h_? :)
<perrito666> babbageclunk: I was thinking of lunching but I can spare a few mins
<perrito666> what can I do for you?
<rick_h_> perrito666: sorry, otp having fun
<rick_h_> perrito666: will ping when free sorry
<babbageclunk> perrito666: Thanks! I've tracked down a memory leak to us keeping State instances in the api server statePool even after they're destroyed and never closing them.
<babbageclunk> perrito666: I mean, after the corresponding model is destroyed.
<babbageclunk> perrito666: And I'm trying to work out how to go about closing them when the model is destroyed.
<babbageclunk> perrito666: There doesn't seem to be any way that a state from the pool can be returned to the pool - can state objects be shared between goroutines/concurrent requests?
<perrito666> babbageclunk: doesn't state know best and figure out the model is gone?
<perrito666> babbageclunk: point me to the code
<babbageclunk> perrito666: Not as far as I can see - at least, the worker stopping done in State.Close() doesn't happen.
<babbageclunk> perrito666: finding links...
<babbageclunk> perrito666: This is the stuff that doesn't happen: https://github.com/juju/juju/blob/master/state/open.go#L522
<babbageclunk> perrito666: state pool: https://github.com/juju/juju/blob/master/state/pool.go
<babbageclunk> perrito666: Here's where it gets the state from the pool to serve a request: https://github.com/juju/juju/blob/master/apiserver/apiserver.go#L556
<perrito666> so, close is actually never happening?
<katco> babbageclunk: this is essentially a connection pool. so the connections to mongo won't be closed until close is called on the pool which (guessing here) will only happen when the process is torn down?
<babbageclunk> perrito666: I mean, the server will close the pool if it gets shut down, and that will close the states.
<katco> babbageclunk: i.e. it's ok that close isn't being called on the connections -- we want them to stay open
<katco> babbageclunk: what is the memory leak you're tracking down?
<perrito666> ill let you with katco for a moment as the oven is beeping
<babbageclunk> katco: lots of add-models/destroy-models in a row.
<babbageclunk> katco: bug 1625774
<mup> Bug #1625774: memory leak after repeated model creation/destruction <eda> <oil> <oil-2.0> <uosci> <juju:In Progress by 2-xtian> <https://launchpad.net/bugs/1625774>
<babbageclunk> perrito666: Thanks!
<katco> babbageclunk: ah i see the logic bug then
<babbageclunk> katco: It seems less like a connection pool and more like a cache.
<katco> babbageclunk: it's a connection pool (a cache of connections ;p)
<katco> babbageclunk: so close can only be called on the pool, not for a model
<katco> babbageclunk: so when you do add/destroy model a bunch of times, you get a lot of connections cached, but connections for the model you destroy hang around for the life of the pool
<babbageclunk> katco: :) Sure, but in others I've seen there's a concept of taking the connection out of the pool while it's in use and returning it.
<babbageclunk> katco: That's right.
<katco> babbageclunk: errr... i made a bad assumption. i wonder why we're not using sync.Pool for this
<katco> babbageclunk: i suppose bc we want multiple people using 1 connection
<babbageclunk> katco: Ok - that's the bit I wasn't sure about.
<katco> babbageclunk: you are correct: this is a cache and not a pool
<katco> babbageclunk: so basically all you need to do is make sure that the code-path that destroys the model closes the connection from the cache and removes it
<babbageclunk> katco: Ooh, I didn't know about sync.Pool. Neat.
<katco> babbageclunk: yep
<katco> babbageclunk: maybe not useful here. i suppose the value in the map here could be a sync.pool
<babbageclunk> katco: Yeah, that's what I was thinking - the bit I wasn't sure about was making sure that others weren't using the state when I close it.
<katco> babbageclunk: i imagine you'd need a refcount for that. i wonder if we could deduce a logically safe way though wherein a call to invalidate the cache is guaranteed to be the only consumer
<katco> babbageclunk: i.e. if you destroy the model, you could create a watcher that waits until the model is destroyed, and then invalidate the cache knowing nothing else could be using it?
<babbageclunk> katco: Yeah, that would make sense.
<katco> babbageclunk: but i don't know the specifics. that's going to require a deep-dive
<babbageclunk> katco: Also, maybe this is the source of some other weird bugs we see around destroy-model - old state objects still swimming in the pool after the model's gone away.
<katco> babbageclunk: or the refcount with a drain op. dunno which is cleanest/logically most correct
<natefinch> cmars: ping when you get the time about macaroons / logout etc
<katco> babbageclunk: like what?
<babbageclunk> katco: in the course of reproducing this I've gotten into states where I had models that would show up in list-models but couldn't be destroyed.
<cmars> natefinch, i'm available now
<katco> babbageclunk: =| i'm not sure open connections to state would cause that?
<babbageclunk> katco: I'm not sure it would either, just wondering.
<katco> babbageclunk: converge the code towards correctness and we'll get there ;)
<babbageclunk> katco: :)
<babbageclunk> katco: ok, so you think it's deliberate that the pool allows multiple requests to share states?
<natefinch> cmars: ok - https://hangouts.google.com/hangouts/_/canonical.com/core?pli=1&authuser=2
<katco> babbageclunk: probably; to limit the number of connections open. ironically to prevent what you're trying to solve
<katco> babbageclunk: i know we had a go at reducing memory leaks, and this was probably part of it
<katco> babbageclunk: if not, we should be using sync.Pool
<katco> babbageclunk: looks like this is over a year old though
<babbageclunk> katco: ok, doing some code archaeology to understand the history.
<babbageclunk> katco: Thanks for the help! I'll probably have more questions soon. :)
<katco> babbageclunk: hth! not an expert on this by any means, but looked like i understood what was happening
<babbageclunk> katco: might pick Menno's brains about it too - he's all through the blame for it.
<katco> babbageclunk: lol yep
<rick_h_> voidspace: is https://bugs.launchpad.net/juju/+bug/1616098 related at all to your investigation?
<mup> Bug #1616098: Juju 2.0 uses random IP for 'PUBLIC-ADDRESS' with MAAS 2.0 <4010> <cpec> <juju:Opinion by dimitern> <https://launchpad.net/bugs/1616098>
<voidspace> rick_h_: looking
<voidspace> rick_h_: I don't believe so
<voidspace> rick_h_: the vsphere issue - using fe80:: as an address, which is a link-local address - is a fairly specific problem I think
<voidspace> rick_h_: so that bug is a different one - partly impacted by my changes as addresses are now sorted (hence the "lower" address noted in Ante's latest comment)
<voidspace> rick_h_: there's currently no space awareness in the public address picking logic
<voidspace> rick_h_: but as we have to pick *one* address, it's not immediately clear to me how we should resolve that
<voidspace> rick_h_: so bug 1616098 is definitely related to the address picking logic I've worked on, the vsphere one is different
<mup> Bug #1616098: Juju 2.0 uses random IP for 'PUBLIC-ADDRESS' with MAAS 2.0 <4010> <cpec> <juju:Opinion by dimitern> <https://launchpad.net/bugs/1616098>
<rick_h_> voidspace: k, ty
<voidspace> rick_h_: the issue there is that the machine has multiple public IPs, some of which are not routable, and the address picking logic just has to pick one
<voidspace> rick_h_: that logic is currently space unaware - it needs to become space aware and pick an address from a space that the controller is in
<rick_h_> voidspace: rgr, have to think on it. It's on my list of stuff we don't seem to have a firm rulebook for
<voidspace> rick_h_: I think that rule "pick an IP from space the controller machine is in" works
<voidspace> rick_h_: then the public IP is routable from the controller, so we can ssh to it
<voidspace> *a space
<katco> can someone explain the difference of these two to me? https://github.com/juju/names/blob/v2/application.go#L13 https://github.com/juju/charm/blob/v6-unstable/url.go#L49
<rick_h_> the one diff is the explicit $ anchor at the end of the second.
<rick_h_> though not sure what effect that would have
<rick_h_> in practice
<katco> rick_h_: i'm not looking for a comparison of regexes :p
<rick_h_> katco: heh, ok then have to be more clear on "explain the diff" :)
<katco> rick_h_: aren't the two concepts equivalent? do we have 2 checks in different libs?
<katco> rick_h_: i think "name" in the charm lib is the same as "app" in our names lib? i think we should be centralizing checks in our names lib?
<rick_h_> katco: there is the work to make sure they're compatible. I'm not sure why names is not a dep in charm to share the logic.
<katco> rick_h_: it is actually a dep... not sure why it isn't just using the logic though
<katco> rick_h_: that makes it all the more puzzling =/
<rick_h_> katco: ok, then even more "don't know"
<katco> rick_h_: haha
<katco> rick_h_: would you consider "series" something else that should be validated in names?
<rick_h_> katco: I'd suggest asking rogpeppe1 ^ for any justification and if none then carry on
 * rogpeppe1 looks
<rick_h_> katco: hmm, maybe as far as reserved words go, but not sure how easy series are to pull in like that in a light weight way
<katco> rogpeppe1: ta
<katco> rick_h_: https://github.com/juju/charm/blob/v6-unstable/url.go#L48
<katco> rick_h_: i would just look to pull this same validation logic over
<rick_h_> katco: hmm, that's interesting. Since series are hard-coded, a regex to validate them seems interesting
<katco> rick_h_: juju proper does more validation against a fixed list. something else we might want to pull over into names
<katco> rick_h_: i think we go in circles on this stuff bc there's no clarity across the team at a high-level as to what libs are for. lots and lots of reimplementation of things
<rogpeppe1> katco: they match identical texts
<katco> rogpeppe1: sorry, can you restate?
<rogpeppe1> katco: the names one has a few non-grouping qualifiers
<rogpeppe1> katco: they're the same expression
<katco> rogpeppe1: right, so can we nix the one in charm and consolidate to names?
<rogpeppe1> katco: but the names one has ?: qualifiers to prevent subgroup matching
<katco> rogpeppe1: also, any opposition to moving the series check over?
<rogpeppe1> katco: i think it's probably fine for the charm package to use names.ApplicationSnippet instead of validName
<rogpeppe1> katco: the series check?
<katco> rogpeppe1: see the regex in charm right above the name regex
<katco> rogpeppe1: https://github.com/juju/charm/blob/v6-unstable/url.go#L48
<rogpeppe1> katco: does names need to know about series?
<katco> rogpeppe1: same question for schema. can we centralize all correctness logic to the names lib?
<katco> rogpeppe1: this is my fundamental question: is the names lib where we centralize correctness checks?
<rogpeppe1> katco: i'm wary of moving all "name" logic to juju/names. names are not all the same.
<katco> rogpeppe1: can you state what you feel the purpose of the name lib is?
<rogpeppe1> katco: i think that moving all checks for all name-like things to juju/names is a recipe for mixed concerns.
<rogpeppe1> katco: it's about names of juju entities in a juju controller
<katco> rogpeppe1: not names of charms themselves?
<rogpeppe1> katco: i think that charm urls could be considered separate, yes
<rogpeppe1> katco: where do we stop? do we put all the charm URL parsing code in names.v2 too?
<katco> rogpeppe1: if that's the case, then i think it's correct that the two are separate. how the controller spells charm entities is only coincidentally (and maybe not permanently) the same
<rogpeppe1> katco: yeah, actually, that's right
<katco> rogpeppe1: i wonder how we can make that justification clear...
<rogpeppe1> katco: an application name really is different from the name of a charm in the charm store
<rogpeppe1> katco: even though the default application name is taken from the charm name
<katco> rogpeppe1: yeah makes sense
<katco> rogpeppe1: ta for the clarity
<rogpeppe1> katco: tbh i still don't mind the charm package using ApplicationName for the constant
<katco> rogpeppe1: they're not logically coupled, so imo, i don't think it makes sense to artificially do so
<rogpeppe1> katco: fair enough, i pretty much concur
 * rick_h_ grabs lunchables
 * perrito666 learns that he must close the huge awning of his garage before the storm
<katco> perrito666: stay safe down there
<perrito666> I am safe, a bit wet though
<perrito666> I have one of these things http://mla-d2-p.mlstatic.com/toldo-vertical-para-balcon-toldos-de-lona-para-balcones-716221-MLA20740247242_052016-F.jpg?square=false
<perrito666> but its one big piece, a good 4m width
<perrito666> and was half deployed when the wind storm started
<perrito666> good news is I closed it before it was ripped apart
<perrito666> bad news is I am soaking wet
<perrito666> https://pbs.twimg.com/media/Ct8R1BPWAAAsAxb.jpg:large   <-- as you can see, the water level is at the first step of the house :p
<natefinch> yikes
<perrito666> no worries, I guessed this would happen so I built the house 1m above ground
<perrito666> I am a bit surprised I still have internet though
<natefinch> I live on a hill like 10m above the nearest low point.  If my house ever floods, water in the basement will be the least of my worries
<perrito666> build an ark
<natefinch> pretty much
 * rick_h_ goes to get boy from school
<perrito666> this city cannot stand rain http://www.lavoz.com.ar/ciudadanos/tormenta-en-cordoba-con-granizo-y-calles-anegadas?cx_level=catastrofe
<perrito666> that is 2 hrs of rain
<natefinch> yikes
<natefinch> gah, who decided that SetCookies(url, nil) would be a NOOP instead of just deleting all cookies?
<alexisb> perrito666, you around ?
<perrito666> alexisb: sorry here, Was trying to avoid a few tiles from flying from my roof
<alexisb> o boy
<alexisb> perrito666, do you need to be left alone?
<perrito666> alexisb: no, I decided that its the roofers problem, not mine
<natefinch> pretty sure there's only one answer to that question ;)
<perrito666> I am not climbing to the roof in the middle of a storm
<alexisb> hey perrito666 can you help tych0 with a question
<perrito666> sure I can
<alexisb> tych0, has a failed merge due to dep updates
<tych0> see https://github.com/juju/juju/pull/6367
<perrito666> tych0: sorry to hear that
<tych0> i'm trying to untangle it now
<alexisb> perrito666, do you know what he needs to do to move things forward?
<tych0> but supposing i can't, what's the policy on adding new deps?
<perrito666> tych0: afaik, adding new deps needs to go through the tech board
<rick_h_> tych0: they need to be reviewed. what's the new dep?
 * perrito666 reads the code
<rick_h_> tych0: it needs license review/etc
<perrito666> tych0: I dont see you adding a new dep
<tych0> rick_h_: https://github.com/inconshreveable/log15
<perrito666> tych0: where is that being added?
<rick_h_> tych0: why that dep vs the standard logging tools in core?
<rick_h_> tych0: that might cause issues
<tych0> ok, i'm hearing "no" then :)
<perrito666> that will cause issues most likely
<perrito666> especially since we are doing the logs to mongo thing
<rick_h_> perrito666: tych0 yea, and audit logs/etc. a new logging tool will make things complicated
<perrito666> tych0: also, I don't see in your code where this is added, I am confused
<tych0> perrito666: it's picked up in the github.com/lxc/lxd hash update
<rick_h_> tych0: oic, a dep of a dep?
<tych0> i didn't add it to juju explicitly, lxd added a dep mostly accidentally
<perrito666> .... that is different
<rick_h_> wallyworld: ^
<rick_h_> wallyworld: has opinions and is on the board and such
<alexisb> rick_h_, ian is hopefully sleeping
<perrito666> alexisb: nonsense, ian doesnt sleep
<rick_h_> alexisb: ah, was thinking the evening call was closer than it is
 * rick_h_ actually looks at clock
<alexisb> do we need to add the dep of the dep directly to juju
<tych0> (it's also possible that none of this is necessary if i can untangle it)
<rick_h_> alexisb: i hope it's not that bad, but it probably still does need license review and such
<alexisb> rick_h_, ack
<babbageclunk> alexisb: Is Menno around today? His name's on the StatePool which is what's hanging onto States.
<alexisb> babbageclunk, he is
<babbageclunk> cool cool
<alexisb> but it will be another 30 minutes before his start of day
<alexisb> babbageclunk, ^^
<perrito666> rick_h_: ideally lxc people should have checked that, and if we are licence compatible with them that should be enough; if they did not they might have tainted the code and are in deep s***t
<perrito666> tych0: do you have more info about that?
<perrito666> babbageclunk: heavy summoning skills
<babbageclunk> perrito666: ha ha
<tych0> perrito666: about what?
<rick_h_> perrito666: it's apache licensed, should be good
<babbageclunk> tych0: I invoked menn0.
<perrito666> tych0: what rick_h_ said
<babbageclunk> tych0: duh ignore me
<menn0> wat?
<tych0> perrito666: yes, we check licenses :)
<babbageclunk> menn0: morning!
<menn0> babbageclunk: howdy :)
<thumper> o/ babbageclunk
<menn0> thumper: o/
<babbageclunk> hey thumper
<babbageclunk> menn0: Actually, just need to walk Alice's sister back to her hotel - can I grab you in 15 mins?
<menn0> babbageclunk: sure
<babbageclunk> menn0: For context, I'm chasing bug 1625774 and it turns out it's state.StatePool hanging onto State instances. So I think it needs to grow a way to release them for models that have gone away.
<mup> Bug #1625774: memory leak after repeated model creation/destruction <eda> <oil> <oil-2.0> <uosci> <juju:In Progress by 2-xtian> <https://launchpad.net/bugs/1625774>
<babbageclunk> menn0: So I wanted to run my plan past you since you seemed to know about it.
<menn0> babbageclunk: I can see how statepool could be a problem in that case. let's talk when you're back.
<babbageclunk> menn0: Cool thx
<katco> ohai menn0
<menn0> katco: howdy
<natefinch> quick review anyone? https://github.com/juju/persistent-cookiejar/pull/16
<katco> natefinch: i'll review yours if you give me a quick review as well: https://github.com/juju/juju/pull/6378
<wallyworld> rick_h_: alexisb: tych0: our strong preference is to drop that other logging dep and migrate lxd to loggo (which now supports colour, which was why it wasn't used originally IIANM)
<tych0> and syslog support
<tych0> but that's a bigger job than we can really do :)
<tych0> anyway, i think i've unwound it
<babbageclunk> menn0: back (turns out Alice and Jayne needed to chat for a bit first). Now good?
<menn0> babbageclunk: yep, give me a few secs, can you set up a hangout/bluejeans?
<babbageclunk> don't know about bluejeans but hangout yes
<babbageclunk> menn0: https://hangouts.google.com/hangouts/_/canonical.com/xtian
<alexisb> wallyworld, failure!
 * alexisb grumbles and adds a tag to all the bugs
<wallyworld> alexisb: curtis will know better than me, we can ask him at standup
<wallyworld> alexisb: release call?
<alexisb> wallyworld, can you invite rick to the HO
<wallyworld> yep
<alexisb> wallyworld, running late
<thumper> wallyworld: https://github.com/juju/juju/pull/6360 and https://github.com/juju/juju/pull/6372
<wallyworld> thumper: will look after release call
<katco> alexisb: rick_h_: are we still expecting a oct. 7th release?
<wallyworld> thumper: we should probably check with marco or someone to ensure the --model-default syntax and behaviour is what they want
<thumper> marcoceppi: ping
<wallyworld> thumper: partly because andrew mentioned that in most cases there could be an argument that what you use with --config could reasonably be a model default. but my argument was that that's fine, except for that one case where it's not
<alexisb> katco no, the 13th
<marcoceppi> thumper wallyworld pong
<thumper> marcoceppi: bug 1628999 and the solution: https://github.com/juju/juju/pull/6360
<mup> Bug #1628999: Openstack network selection is not passed from the controller to the models <canonical-is> <juju:In Progress by thumper> <https://launchpad.net/bugs/1628999>
<marcoceppi> thumper: so --model-defaults is applied to controller as well?
<thumper> marcoceppi: yes
<marcoceppi> unless overridden in --config
<thumper> yes
<marcoceppi> thumper: well that sounds fucking fantastic
<marcoceppi> thumper: helps with a problem we had in Best Buy deployment of setting both --config on bootstrap, then model-config later
<marcoceppi> thumper: the only thing I think worth considering is changing --config to --model-config to match the rest of the verbiage, but that's a nit (at best)
<wallyworld> marcoceppi: you can also set up model defaults in clouds.yaml - this new work brings that capability directly to the CLI. there are use cases for both
<wallyworld> marcoceppi: in clouds.yaml, that's the only way to set up region defaults
<wallyworld> you still can't do that on the CLI
<axw> rogpeppe1 natefinch: there's a bug open about cookies not being deleted on logout
<wallyworld> axw: can you pop in a minute early to standup?
#juju-dev 2016-10-05
<natefinch> axw: the bug about cookies not getting deleted is what I'm working on :)
<alexisb>  menn0, axw, wallyworld, katco just sent a proposed tech board agenda item
<alexisb>  please let me know if you have questions/concerns about my request
<wallyworld> ok
<natefinch> axw: https://cdn.meme.am/instances/500x/72177725.jpg
<menn0> alexisb: ok
<menn0> axw: ping
<axw> menn0: pong
<menn0> axw: regarding https://github.com/juju/juju/blob/master/api/apiclient.go#L575
<menn0> axw: it looks like when the conn is closed another Ping call is attempted before the heartbeatmonitor can finish
<menn0> axw: should there be a short circuit in the <-s.closed case so it just does close(s.broken); return?
<menn0> axw: this isn't causing a specific issue, i'm just poking around in the area
 * axw looks
<axw> menn0: yeah, I think so
<menn0> axw: ok cool.
<menn0> axw: i'm also going to try out adding handling for the websocket conn's Dead channel
<alexisb> menn0, axw, wallyworld any questions before I log off>
<alexisb> ?
<axw> alexisb: sorry just got back, looking now
<wallyworld> alexisb: why are you guys voting for trump?
<axw> menn0: SGTM
<menn0> alexisb: I think your tech board item is clear enough
<alexisb> to amuse you wallyworld
<wallyworld> lol
<menn0> although it'll blow out the meeting beyond an hour :)
<alexisb> and because I cannot eat a turd sandwich
<alexisb> menn0, yes do the best you can
<alexisb> if we can get the start of a list I can work with folks individually on details
<menn0> axw: i'm seeing cases in my HA testing where the api-caller takes quite a while to notice that the apiserver has stopped because the conn is only pinged every ping timeout and I guess the ping attempt takes a while to time out too
<axw> alexisb: sounds good. I don't think we'll get it all done in one meeting of course, but we can make a start
<menn0> axw: checking the underlying websocket Dead might speed things up
<alexisb> axw, understood and thank you
<alexisb> alrighty all I am off for the night chat with you tomorrow
<menn0> alexisb: good night
<axw> good night
<axw> menn0: hrm, what channel are you talking about?
 * menn0 finds
<menn0> axw: sorry I said the websocket conn
<menn0> axw: I meant rpc.Conn - it has a Dead() method
<axw> menn0: ah right
<axw> wallyworld: is it https://github.com/juju/juju/pull/6373 that I'm reviewing?
 * axw was half asleep in standup
<wallyworld> axw: yeah, ta
<natefinch> axw, wallyworld:  any thoughts on this? https://github.com/juju/persistent-cookiejar/pull/16#pullrequestreview-2813363
<wallyworld> natefinch: i'm not too sure; i don't know a lot about cookies offhand
<menn0> thumper, wallyworld or axw: https://github.com/juju/juju/pull/6379
<wallyworld> menn0: i'll swap https://github.com/juju/names/pull/76
<natefinch> wallyworld: np
<axw> natefinch: sorry, likewise. I think you're going to have to interrogate rogpeppe1
<natefinch> axw: kk
<menn0> wallyworld: fair enough
<wallyworld> menn0: that one is a pre-req for a much bigger juju core change to remove @local
<wallyworld> which i've done but need to write upgrade steps, yay
<menn0> wallyworld: understood
<menn0> wallyworld: sorry, I just remembered I need QA steps on that
 * menn0 adds
<wallyworld> menn0: that's one thing i dislike about gh reviews, no qa steps section. plus all the other disadvantages. when are we going to go back to something better like reviewboard?
<menn0> wallyworld: we're still in our 2 week trial. after that we get everyone's opinions/votes and see where we land.
<babbageclunk> menn0: Oops couldn't sleep
<wallyworld> menn0: i've been away for 2 weeks, i was hoping it would be done and dusted :-)
<menn0> wallyworld: maybe the trial's over then? We'll have to check the date when we started.
<menn0> babbageclunk: haha :)
<menn0> babbageclunk: just b/c you can't sleep doesn't mean you have to work :)
<menn0> wallyworld: QA steps added
<wallyworld> ta
<wallyworld> that's another thing - now we are forced to pollute commit messages with QA steps
<wallyworld> since there's no separate section
<wallyworld> menn0: lgtm. i like simple fixes like that
<menn0> wallyworld: thanks - figuring out what was going on was anything but simple but the fix was straightforward
<wallyworld> menn0: it reminds me of that joke about the car mechanic
<menn0> wallyworld: and some of the other changes I'd like to do around this are a lot less simple :)
<menn0> wallyworld: your change LGTM. I had actually reviewed it a while ago but forgot to end the review.
<wallyworld> menn0: ta. am still writing the upgrade steps anyway
<babbageclunk> menn0: It doesn't look like the pool gets hit if I try to do something with a removed model, but I think that might be because the client store knows the model's gone away.
<babbageclunk> menn0: Or rather, it's because it tries to find the model by name first from the controller.
<babbageclunk> menn0: Is there a way to destroy the model from a different client so that the modeluuid is still in the client store?
<babbageclunk> menn0: I'm beginning to think I should do the refcounting anyway - relying on implementation details of the api server to ensure that the state isn't in use when we try to close it seems fragile.
<babbageclunk> menn0: And even if it does today there's no guarantee it will in the future.
<babbageclunk> menn0: Oops, out of battery - send me an email if you have strong feelings about the above?
<menn0> babbageclunk: that all sounds good to me
<thumper> menn0: do you still need a teddy bear?
<menn0> thumper: have been looking into other things but yes
<menn0> thumper: now?
<thumper> sure
<thumper> hangout?
<menn0> thumper: 1:1?
<thumper> there already
<menn0> ok
<wallyworld> axw: i think we can drop migrateLegacyAccounts() for accounts.yaml now right? i think it was for beta13->14?
<axw> probably, looking
<wallyworld> i'll be doing an implementation to migrate the @local stuff
<axw> wallyworld: yep, please drop what's there
<wallyworld> will do ta
<axw> wallyworld: it would be nice if we had some general client-side upgrade steps. maybe in some distant future
<wallyworld> yeah
<veebers> menn0: would you happen to know where I might be able to file a bug against mongostat that I got from http://repo.mongodb.org? The most recent release changes the output a little bit (unexpectedly)
<axw> wallyworld: gotta go pick up my car, will bbs
<wallyworld> ok
<wallyworld> the new one?
<axw> wallyworld: yeah just had to take it for its first service
<wallyworld> ok
<menn0> jam: tech board?
<jam> menn0: yep, just got 2fa'd, brt
<macgreagoir> rogpeppe1: This is to fix one you mentioned to me last week, in case you'd like to look ;-) https://github.com/juju/juju/pull/6381
<rogpeppe> macgreagoir: looking
<macgreagoir> Cheers
<rogpeppe> macgreagoir: LGTM - that was a hard one :)
<macgreagoir> :-D
<macgreagoir> If it were OpenStack, I'd get to ODS for free!
<babbageclunk> voidspace: ping?
<voidspace> babbageclunk: pong
<babbageclunk> voidspace: hey, do you know about the AllModelWatcher?
<voidspace> babbageclunk: not really, sorry
<voidspace> babbageclunk: don't think I've ever looked at it
<babbageclunk> Hmm.
<voidspace> babbageclunk: what's up with it, what do you need to know?
<babbageclunk> voidspace: Trying to change the StatePool so it closes the State objects it holds when the corresponding model goes away.
<voidspace> babbageclunk: ah :-)
<voidspace> babbageclunk: sounds like fun
<babbageclunk> voidspace: It's the source of a memory leak when someone does lots of add/destroy models.
<voidspace> right
<babbageclunk> voidspace: worryingly, I can see that the AllModelWatcher contains a StatePool, so adding an AllModelWatcher to the StatePool seems like a bad idea.
<voidspace> babbageclunk: so important for OIL but not for anyone in real life
<voidspace> hah, yes
<voidspace> babbageclunk: so maybe make the change in the AllModelWatcher calling back into the StatePool
<voidspace> babbageclunk: or can you have multiple state pools, not all associated with an AllModelWatcher
<babbageclunk> voidspace: yes that
<babbageclunk> voidspace: there's also one inside the API server.
<voidspace> babbageclunk: gah, you either need a registry or you need a new watcher
<babbageclunk> voidspace: Probably a new watcher - it looks like the AllModelWatcher hoovers up way more than this one would care about.
<voidspace> babbageclunk: a new watcher is more ju-thonic
<babbageclunk> voidspace: I only want to be notified when a model is removed.
<babbageclunk> jujuic?
<voidspace> heh
<voidspace> ModelRemovalWatcher
<babbageclunk> yeah, I guess.
<babbageclunk> voidspace: Hah, I guess that's State.WatchModels then
<voidspace> cool
<babbageclunk> voidspace: Thanks!
<babbageclunk> wallyworld: ping?
<wallyworld> hey
<babbageclunk> wallyworld: I'm trying to work out how to add a model watcher to the StatePool.
<wallyworld> i've not done anything with the state pool per se. i've done watchers, but on mgo collections
<voidspace> mgz: ping
<wallyworld> i can take a look at the code
<babbageclunk> wallyworld: Does that mean I need to make StatePool into a worker?
<babbageclunk> wallyworld: Thanks - with Will gone and Dimiter out I'm not sure who to bug about this stuff over here.
<wallyworld> babbageclunk: so the state pool manages a collection of states i assume. but that's not a db/mgo construct is it?
<babbageclunk> wallyworld: no, I don't think so (not sure what you mean though)
<wallyworld> babbageclunk: so normally our watchers are constructs that notify when something in the db changes
<wallyworld> eg adding to a collection
<babbageclunk> wallyworld: At the moment it's a cache of States, but there's no way to have the States get closed when the model is removed, so there's a leak.
<babbageclunk> wallyworld: Yeah, I *think* State.WatchModels will give me the signals I want.
<wallyworld> babbageclunk: dumb question not having seen the code - why can't we close state objects when they are removed from the cache?
<babbageclunk> wallyworld: At the moment nothing removes them. (Except when the pool itself is closed, at api server shutdown time.)
<wallyworld> oh, i see, a cache that is not managed correctly
<wallyworld> and you want to know when a model object is removed from state
<babbageclunk> only two hard problems in computering.
<babbageclunk> yup
<wallyworld> so we should have a model watcher on state
<wallyworld> but could we do something in the undertaker?
<wallyworld> do we have an existing process that deals with dying models?
<wallyworld> that we can hook in to?
<babbageclunk> wallyworld: hang on, looking at what the undertaker listens to.
<wallyworld> may be prudent just to do something as part of the undertaker workflow; but not sure without looking through the code to see the dying model workflow
<wallyworld> bbiab, got to go get kid from band practice
<perrito666> k, need to step out for the next ~4 hs, ill be available on mail
<mgz> voidspace: hey, can I help?
<voidspace> mgz: still struggling to connect to company VPN
<voidspace> mgz: Spads has been helping me in #is - I'm about to try US gateway
<voidspace> mgz: I get "Could not find source connection" in /var/log/syslog for UK gateway
<mgz> voidspace: gotcha
<mgz> if you don't get anywhere with the vpn, come back and poke me again
<voidspace> mgz: so thanks, I'll weep loudly and gnash my teeth if it doesn't work
<voidspace> heh, thanks
<mgz> because we have a machine in canonistack we use to let our ci setup jump through
<mgz> that is an alternative to getting vpn working
<natefinch> rogpeppe: can we talk cookies?
<rick_h_> mhilton: ^ have a sec for natefinch ?
<urulama> natefinch, rick_h_: rogpeppe is afk for lunch. he'll be back in 20min or so
<voidspace> rick_h_: grabbing coffee, with you in 5
<rick_h_> voidspace: stuck in another call, will be a bit late
<voidspace> rick_h_: cool
<voidspace> mgz: I get the same error with the US VPN gateway
<voidspace> mgz: "Could not find source connection"
<voidspace> mgz: and even though "us-mfoord VPN" is on in network manager it remains "Not connected"
<voidspace> mgz: can you help me diagnose?
<rogpeppe> natefinch: sure
<natefinch> rogpeppe: I know you said that the cookies don't need to round trip... except that they do with RemoveCookie
<natefinch> rogpeppe: thus - https://github.com/juju/persistent-cookiejar/pull/16
<rogpeppe> natefinch: hmm, i'd forgotten that RemoveCookie existed. let me have a look.
<mgz> voidspace: what's the final ip you're trying to get to?
<mgz> we may as well try the low-fi method
<voidspace> mgz: I'm trying to get access to vsphere
<voidspace> mgz: so 10.245.0.131 I think
<rogpeppe> natefinch: how are you intending to use it?
<mgz> voidspace: okay, this works for me
<mgz> (well, I get prompted for a password, but I presume you can get by that)
<voidspace> mgz: I have the vsphere creds from cloud-city
<natefinch> rogpeppe: when we log out of a controller, we need to remove the cookie we saved for the controller, to force the user to re-enter their password
<rogpeppe> natefinch: how are you planning on getting that cookie?
<rogpeppe> natefinch: scratch that, it's obvious :)
<natefinch> rogpeppe: I hope your obvious solution is the same as mine
<rogpeppe> natefinch: i think you'll need to use AllCookies
<rogpeppe> natefinch: and search through for any cookies with a matching host
<natefinch> rogpeppe: isn't that what Cookies(net.URL) is for?
<rogpeppe> natefinch: not really
<rogpeppe> natefinch: that method is for getting cookies to send as part of a web request
<rogpeppe> natefinch: i think it's best to keep to the standard implementation for that
<natefinch> rogpeppe: right, which is what we'd send up as authentication, which means those are the ones that we want to remove
<rogpeppe> natefinch: it might be nice to have a RemoveAll(url) method
<rogpeppe> natefinch: which would remove all cookies that would be sent to the given URL
<natefinch> how is that not this?
<natefinch> hopefully they'll recognize that yoyu
<natefinch> oops, wrong paste
<natefinch> for _, c := range jar.Cookies(st.cookieURL) {
<natefinch>     j.RemoveCookie(c)
<natefinch> }
<katco> natefinch: standup time
<rogpeppe> natefinch: because the Cookies method doesn't fill in the domain name
<natefinch> that's the BUG
<rogpeppe> natefinch: and that seems to be deliberate
<rogpeppe> natefinch: then raise it in golang.org/issue
<katco> voidspace: standup time
<voidspace> katco: omw
<natefinch> rogpeppe: it did seem deliberate, which is why I posted on golang-nuts about it, but no one has responded yet. I could go directly to filing an issue, might get a better response
<rogpeppe> natefinch: i think the answer is probably the answer i gave you - you don't need to encode the domain name when sending a cookie
<natefinch> rogpeppe: but it also prevents you from updating a cookie.  Being able to round trip is usually good practice, to prevent surprising behavior
<rogpeppe> natefinch: this document is relevant: String returns the serialization of the cookie for use in a Cookie header (if only Name and Value are set) or a Set-Cookie response header (if other fields are set). If c is nil or c.Name is invalid, the empty string is returned.
<rogpeppe> s/document/comment/
<rogpeppe> natefinch: so if we change the implementation of Cookies, we'll be breaking that
<mhilton> natefinch, rogpeppe: you seem to be having the discussion I was about to start. I'm on the side of the Cookies function should only fill in the values that get sent in an HTTP request. If you want different functionality then I think you want a different function.
<rogpeppe> mhilton: yup
<rogpeppe> natefinch: that's the doc comment on Cookie.String BTW
<natefinch> mhilton. rogpeppe: well, then RemoveCookie is useless, and we should remove it
<natefinch> or... nearly useless I guess
<rogpeppe> natefinch: it's not useless
<rogpeppe> natefinch: you can use it with AllCookies
<natefinch> well then allcookies is doing the wrong thing according to what you said above
<rogpeppe> natefinch: AllCookies isn't designed for sending web requests
<rogpeppe> natefinch: it's not part of the CookieJar interface
<natefinch> rogpeppe: it's a function called AllCookies on a thing that implements CookieJar
<rogpeppe> natefinch: yes
<rogpeppe> natefinch: and?
<natefinch> rogpeppe: if you think no one is ever going to use cookies from that list to send in a web request, you're more optimistic than most devs I've met
<rogpeppe> natefinch: FWIW i think both AllCookies and RemoveCookie could be better documented
<rogpeppe> natefinch: i'm optimistic. why would you do that? net/http uses the CookieJar implementation directly - there's no need to add cookies yourself.
<natefinch> I don't think GetCookies(u) | RemoveCookie(c) is an unreasonable thing to think will work.  The fact that it doesn't work cost me a few hours of confusion.
<natefinch> the old adage is to make it hard/impossible to do the wrong thing, and we're making it really easy to do the wrong thing.
<rogpeppe> natefinch: i think we've pointed out the reason why you can't have that
<rogpeppe> natefinch: i think the best answer is to improve the docs
<natefinch> rogpeppe: That's fine... but then RemoveCookie shouldn't be there.  It's misleading.  I think it would be better to remove RemoveCookie and AllCookies, and instead have RemoveAll(url) and RemoveAll(func(c cookie)) or something.
<rogpeppe> natefinch: AllCookies is useful for other reasons too
<natefinch> er func(cookie) bool
<rogpeppe> natefinch: and if you can see a cookie, it's useful to be able to remove it.
<natefinch> rogpeppe: the fact that it returns cookies in a different format than Cookies(url) is *going* to cause confusion and bugs.  It already has.
<rogpeppe> natefinch: well, it returns the same format that's used for SetCookies
<rogpeppe> natefinch: and for a Set-Cookie response header
<rogpeppe> natefinch: it should definitely be documented better, and I think we could easily have a RemoveAll(url) method
<rogpeppe> natefinch: i'm sorry you've had issues with it
<rogpeppe> natefinch: but i don't think the current semantics need to be changed
<natefinch> rogpeppe: it sounds like it's really just a problem with the design of http.Cookie and the CookieJar... Having String() return significantly different things based on what values are set seems like a bad design... which then makes GetCookies(u) have to work weirdly, and the whole thing cascades.
<rogpeppe> natefinch: maybe, but that's ancient history and we can't change it now.
<natefinch> rogpeppe: the best idea for now seems to be making a RemoveCookies(url) method - that seems like a good compromise
<rogpeppe> natefinch: yeah, I'd accept a PR to that effect
<rogpeppe> natefinch: for the time being you could just iterate through AllCookies as I suggested
<rogpeppe> natefinch: it's not like there's a performance issue with doing that.
<natefinch> rogpeppe: that's true.  Just seems to encode too much knowledge into the caller.
<rogpeppe> natefinch: we're not going to break it. relax :)
<natefinch> rogpeppe: it's not that I'm afraid we'll break it, it's just ugly... and if I have to write that code somewhere, I'd rather write it in the place that deals with cookies, especially since we may need to do the same thing somewhere else.
<rogpeppe> natefinch: i'd put it in logout.go
<rogpeppe> natefinch: i think it's reasonable that that code knows where the various auth information is stored
<natefinch> rogpeppe: we don't currently have an easy way to get the current controller url or the current cookie file... I had planned on putting Logout() on api.Connection, since it has the Login function, and it already has the cookie jar and the url.
<rogpeppe> natefinch: you can trivially get the current controller URL by calling store.ControllerByName
<rogpeppe> api.Connection doesn't have enough information to be able to log out
<natefinch> rogpeppe: it has the persistent cookie jar and the url we're using to log in.... does it need more than that?
<rogpeppe> natefinch: it doesn't necessarily have a persistent cookie jar
<natefinch> rogpeppe: when would it not?
<rogpeppe> natefinch: when someone makes a connection that's not using a persistent cookie jar. That's set up at a higher level, and api.Connection should not assume that it has one.
<rogpeppe> natefinch: sorry, my connection just dropped for some reason
<rogpeppe> natefinch: i think it's better that the code that's responsible for setting up the persistent cookiejar should be responsible for cleaning it up
<rogpeppe> natefinch: BTW you can trivially get the persistent cookie jar with logoutCommand.APIContext() in the Jar field
<rogpeppe> natefinch: FWIW persistent-cookiejar is a bit of a hack. it would be nice to replace it some time, and ISTM that introducing a dependency on it inside the API client code is not a good way to make that easier.
<rogpeppe> natefinch: in general, anything that needs a dynamic type coercion should be viewed with suspicion.
<natefinch> rogpeppe: that's fine.  I figured the parity with login would be good, but it's fine not to
<natefinch> rogpeppe: Oh yeah, absolutely.  I hate to do it
<redir> morning juju-dev
<babbageclunk> redir: morning!
<babbageclunk> redir: Hey, what happened with WaitAdvance?
<redir> babbageclunk: it landed on friday IIRC
<babbageclunk> redir: but no pushback or argybargy?
<redir> babbageclunk: not yet
<redir> babbageclunk: I'm just back from swapdays today
<babbageclunk> redir: oh right
<babbageclunk> redir: cool
<redir> babbageclunk: how's move preparations going?
<babbageclunk> redir: scary!
<redir> babbageclunk: I bet. That is a long move
<babbageclunk> redir: and getting really close! Lots of things to cancel and organise.
<redir> babbageclunk: we made a transcontinental move last year this time. I am so glad it went smoothly and we're done with it.
<redir> hope yours is as painless as can be
<babbageclunk> redir: yeah, I'm really looking forward to it being done. Thanks
<babbageclunk> !
<redir> kettle is whistling. bbiam
<alexisb> morning redir
<alexisb> redir, we should connect when you have a moment
<redir> alexisb: OK, going through email
<redir> I'll ping you when done?
<alexisb> redir, sure and I have a few meetings this morning as well
<redir> alexisb: I'll look for a slot
<redir> alexisb: wanna ping me after 10?
<alexisb> sure
<alexisb> redir,
<rogpeppe> i've just proposed some changes to the juju register command to allow registration of controller with officially signed certs. reviews appreciated. https://github.com/juju/juju/pull/6382
<babbageclunk> katco: ping?
<katco> babbageclunk: pong
<babbageclunk> katco: hey, I've been making some changes to StatePool to do refcounting.
<katco> babbageclunk: lucky you ;p how's it going?
<babbageclunk> katco: Would you mind taking a look at the current state of it?
<katco> babbageclunk: sure
<babbageclunk> katco: ok I think, although I've backed off from trying to make the StatePool watch models and remove items from itself.
<katco> babbageclunk: yeah i don't think the statepool would be responsible for that as it's just a cache. if we went that route it would be a separate watcher
<babbageclunk> katco: it's used in the API server and in WatchAllModels, and they use different types for life management.
<babbageclunk> katco: menn0 had suggested I do that, but I think I'd need to convert the API server to use a catacomb first, maybe?
<babbageclunk> katco: not really sure about that
<redir> rogpeppe: looking
<rogpeppe> redir: thanks!
<rogpeppe> redir: there's some fun stuff in there :)
<babbageclunk> katco: Ok, pushed - https://github.com/juju/juju/compare/master...babbageclunk:pool-memleak?expand=1
<katco> babbageclunk: tal. not sure about the catacomb question either. i think i need more context/concrete code to ponder over
<babbageclunk> katco: thanks! Sorry, I have to rush home now so can't chat about it, but I'll check comments/email about it and be back online later.
<voidspace> rick_h_: ok :-)
<voidspace> rick_h_: was just writing you an email... but you've answered it
<rick_h_> voidspace: :)
<alexisb> redir, ping
<redir> alexisb: po
<redir> ng even
<alexisb> :)
<redir> can't tab complete pong in weechat yet.
<redir> location alexisb ?
<alexisb> 1x1 HO
<alexisb> brt
<alexisb> https://hangouts.google.com/hangouts/_/canonical.com/alexis-bruemmer
<alexisb> redir, ^^
<rogpeppe> redir: get anywhere with that review? :)
<redir> rogpeppe: on hold in meeting bbiab
<rogpeppe> redir: np
<rogpeppe> redir: i'm about to stop for the day, but look forward to your comments in the morning.
<cmars> which collection has an application's config?
<redir> rogpeppe: cool
<redir> i mean: ack
<cmars> ah, it's `settings`
 * alexisb changes locations, back online in a few minutes
<rick_h_> voidspace: dooferlad ping, question in an irc channel about forcing juju traffic onto a specific address/interface.
<rick_h_> voidspace: dooferlad is there some method of forcing that?
<perrito666> I have put up a PR that mostly has tests https://github.com/juju/juju/pull/6384
<alexisb> perrito666, would this bug (https://bugs.launchpad.net/juju/+bug/1605714) potentially be fixed with this PR: https://github.com/juju/juju/pull/6321
<mup> Bug #1605714: juju2 beta11: LXD containers always pending on ppc64el systems <oil> <oil-2.0> <juju:Triaged by rharding> <https://launchpad.net/bugs/1605714>
<perrito666> alexisb: mm, I dont know if fixed, the containers might be in broken instead of fixed
<perrito666> of pending
<perrito666> but I am not sure what is causing it
<alexisb> ok
<mup> Bug #1630728 opened: user add/remove borked <juju-core:New> <https://launchpad.net/bugs/1630728>
<mup> Bug #1630737 opened: juju should use internal vpc network address space when connected to vpc  via vpn <juju-core:Incomplete> <https://launchpad.net/bugs/1630737>
<perrito666> rick_h_: I am starting to think we need a juju unremove-user
<thumper> perrito666: ew
<natefinch> lol
<natefinch> juju undo
<rick_h_> perrito666: heh, I'd think reactive maybe, but yea. that's always the way I guess
<perrito666> thumper: well our remove user does not really remove users (that is a feature) and users want to remove and then re-add, so
<rick_h_> perrito666: I think the thing is the why is it a normal workflow, and yea sometimes mistakes happen and users need a way out
<katco> thumper: hey quick q
<katco> thumper: https://github.com/juju/juju/pull/6377/
<katco> thumper: what is our standard way of returning friendly multi-line errors to users?
<katco> thumper: macgreagoir is creating a new error and not wrapping the old one; it seems like there should be a better way
<thumper> otp
<thumper> look shortly
<katco> thumper: no worries ta
<natefinch> katco: my thought was that any error from a cmd.Run method can basically be considered direct output to a user, and so we don't have to worry too much about the nitty gritty
<katco> natefinch: what about --debug where the stack and cause would be shown?
<natefinch> katco: does debug do that?  I didn't realize :)  I think  Wrap does the right thing, if you want a new message but keep the same stack
<katco> natefinch: yeah i use --debug all the time to track down the root of an error
<katco> natefinch: i think Wrap is the right thing here, but thumper has done a lot of work around user-friendly multi-line errs, so thought i'd check in
<natefinch> mna... I can't believe our logout tests don't actually log in first... sigh
<rogpeppe1> redir: thanks for the review. good point about the feature tests - i always forget that some tests are far removed from the package they're testing.
<rogpeppe1> natefinch: ha ha
<natefinch> rogpeppe1: details, right?
<rogpeppe1> natefinch: :)
<konobi> howdy. I'm having a problem with 4096bit SSH keys as reported in https://bugs.launchpad.net/juju/+bug/1543283 -- The bug is marked as fixed/released, but I'm seeing the same behaviour with 2.0-rc2-elcapitan-amd64
<mup> Bug #1543283: [Joyent] 4k ssh key can not be used: "cannot create credentials: An error occurred while parsing the key: asn1: structure error: length too large" <juju:Fix Released> <https://launchpad.net/bugs/1543283>
<konobi> (this is via juju-quickstart, btw)
<natefinch> konobi: does it happen if you don't use quickstart?
<konobi> not sure, what would be a way to validate?
<konobi> `ERROR detecting credentials for "joyent" cloud provider: credentials not found`
<konobi> along with `ERROR there was an issue examining the environment: cannot create credentials: An error occurred while parsing the key: asn1: structure error: tags don't match (16 vs {class:0 tag:14 length:30 isCompound:true}) {optional:false explicit:false application:false defaultValue:<nil> tag:<nil> stringType:0 timeType:0 set:false omitEmpty:false} pkcs1PrivateKey @2`
<natefinch> could be a bug, but might also be a problem with the format of the key in the credentials.yaml file
<konobi> well, with joyent/triton it uses http-signatures for credentials
<natefinch> konobi: ahh, is the key encrypted?
<natefinch> I betcha that's the problem
<konobi> it is... but my ssh-agent is running just fine
<konobi> in the meantime, unencrypting should work for now
<konobi> but yeah, i'd generally be using high strength ECDSA
<natefinch> I don't think juju itself supports ssh-agent.... I see some stuff about quickstart supporting it, but I'm honestly not sure how that works
<natefinch> asking on #juju might get you better help with quickstart.  The bug you saw was for juju, not quickstart, so it's possible quickstart has the same problem we already fixed in the core of juju
<konobi> it's the same errors as from that ticket
<alexisb> thumper, are you actively doing anything for htis bug: https://bugs.launchpad.net/juju/+bug/1625768
<mup> Bug #1625768: github.com/juju/juju/state go test timeout <ci> <intermittent-failure> <regression> <unit-tests> <juju:In Progress by thumper> <https://launchpad.net/bugs/1625768>
<thumper> alexisb: not any more, I sped up the slowest 3
<thumper> and that made things pass under the time limit again
<alexisb> yep
<alexisb> I am going to move it to high and to 2.1, as CI still sees it failing but not at a critical rate
<thumper> veebers: ping
<redir> anyone see this error when bootstrapping aws before? http://paste.ubuntu.com/23281590/
<veebers> thumper: pong, what's the haps? :-)
<thumper> katco: commented, and suggested wrap
<katco> thumper: ta, tal
<thumper> veebers: I have a fix (I think) for some windows and s390 test failures
<thumper> and would like to confirm
<thumper> can we get access to the machines, grab my branch and run the test to see?
<veebers> thumper: yep, we can do that.
<thumper> veebers: this issue http://reports.vapour.ws/releases/issue/57f1cae8749a5660b085163f
<redir> reboot brb
<konobi> huh, seems like it chooses id_rsa in my .ssh folder over the one it's configured with =0/
<redir> phtphtpht failed to bootstrap model: cannot start bootstrap instance: no "xenial" images in us-west-1 with arches [amd64]
<konobi> okay... it appears that juju just doesn't want to use the keys it was told to
<konobi> (for bootstrap that is)
<konobi> don't suppose folks who work on the joyent cloud provider backend are around at all?
<katco> thumper: is it OK if an error message is multi-line? should we instead print things out to stderr and return the standard formatted error?
<thumper> katco: I think it's fine, but I also think that it would make sense to get general agreement on approach for these types of responses, and then apply it everywhere
<thumper> I'm in two minds
<thumper> I can see benefits of both ways
<katco> thumper: it's just bugging me for some reason
<thumper> katco: so here's an alternative approach
<katco> thumper: i think because we throw away the original error (maybe ok) but then create a message that's unlike all other errors
<thumper> the common.PermissionError should take a command context
<thumper> and write out to that
<thumper> and then have the function return a silent error
<thumper> so it doesn't get emitted
<thumper> to be honest...
<thumper> if we are wanting to have standard multiline nice emitting of errors
<thumper> it makes sense to me for those functions to take the error and command context
<katco> thumper: (or just an io.Writer)
<thumper> that way you aren't wrapping or hiding
<thumper> agreed
<thumper> a writer would also be fine
<katco> thumper: i think that's what's bothering me: we're not trying to create a new error with a better message, we're really trying to print something nice out to the user
<thumper> katco: listen to that niggling
<katco> thumper: and i think we're conflating the two
<thumper> katco: it is bothering you for a reason
<thumper> trust it
<thumper> agreed
<thumper> I'll back you on this
<katco> thumper: as you said, maybe a function that take an io.Writer to write to and is called from the Run()s, and then do we still print the returned error? i think it would be ok?
<thumper> I think the 'nice writer func' should write all the error info
<thumper> and we convert the error into a silent one
<thumper> otherwise we'll get things emitted out of order
<thumper> which is why the error is being replaced with long text
<konobi> thumper: any idea if someone on IRC is familiar with the joyent stuff at all? I'm dealing with it now, but I'm also a former joyent engineer, so I'm much more familiar with how the api there works
<thumper> so the command infrastructure will write the full text
<thumper> konobi: not me sorry
<katco> thumper: ok, i can get behind that. function that writes out message to user + original error message and swallows the error?
<thumper> no
<thumper> it can't swallow it fully
<thumper> otherwise the command will have rc of 0
<katco> ah just errors.Wrap(err, "") ?
<thumper> rather than 1
<katco> yeah
<thumper> there is a silent error type in cmd
<thumper> which doesn't get emitted
<katco> we probably still need to wrap the error for the stacktrace in --debug
<thumper> yeah
<thumper> wrap the error with a silent one
<thumper> so errors.Cause is a silent error
<thumper> that should work
<katco> ok thanks for talking through that with me, i'll leave that feedback
<thumper> also...
<thumper> we may want to have an ansiterm.Writer rather than io.Writer
<thumper> so we could colorize
<thumper> just a thought
 * thumper thinks
<thumper> no
<katco> would we ever want to colorize that?
<katco> yagni principle
<thumper> well, ERROR is shown in red
<thumper> everywhere else
<thumper> because the cmd package logs the error
<thumper> it starts getting a little messy
<katco> thumper: ack
#juju-dev 2016-10-06
<axw> wallyworld: would you please stamp https://github.com/juju/juju/pull/6380?
<wallyworld> sure
<wallyworld> axw: what was the rationale for not doing server side filtering of the machines?
<axw> wallyworld: so we don't break backwards compat
<axw> wallyworld: we'll have existing deployments with juju-<long-UUID>-, so we either search with prefix "juju-" or we can't change the format
<wallyworld> oh right, because existing envs will use non-truncated
<axw> yup
<wallyworld> seems like the only solution, but it does seem icky
<wallyworld> axw: we take the last 6 chars of the uuid now. can't we we just do a machine filter on that? ie modify the filter regexp?
<axw> wallyworld: I guess we could do "juju-.*(last-6-chars).*"  -- but this way we have freedom to change the format later
<wallyworld> maybe the number of machines needing to be filtered client side is nothing to worry about
<axw> yeah, I don't think so
<wallyworld> i wonder what ec2 does in this area
<axw> wallyworld: ec2 allows you to filter on tags, so it's a bit different
<wallyworld> fair enough
<wallyworld> lgtm then
<axw> ta
<wallyworld> menn0: axw: 99% of this is s/@local// and s/Canonical()/Id() and s/old upgrade code//. The bit that needs attention is the upgrade steps. See if you get a chance today to look https://github.com/juju/juju/pull/6388
<menn0> wallyworld: will take a look soon
<axw> wallyworld: ok, just finishing off something and will look
<wallyworld> ta, no rush
 * redir goes EoD
<thumper> fuck, what? http://juju-ci.vapour.ws:8080/job/github-merge-juju/9427/artifact/artifacts/trusty-out.log
<thumper> panic: runtime error: invalid memory address or nil pointer dereference
<thumper> [signal 0xb code=0x1 addr=0x20 pc=0x8bf770]
<thumper> in mgo
<thumper> not seem that before
<thumper> menn0: got a few minutes?
<menn0> thumper: yep
<menn0> thumper: that mgo panic is new to me as well
<menn0> thumper: during logging by the looks
<axw> wallyworld: I'm going out shortly for lunch, will be out for a while. so I'll have to review properly later. I added a few comments
<wallyworld> ok, ta
<axw> wallyworld: if you get a chance, https://github.com/juju/juju/pull/6390. it looks bigger than it is. most of the diff is in auto-generated stuff
<wallyworld> sure
<menn0> wallyworld: I intentionally got rid of assertSteps and assertStateSteps
<menn0> wallyworld: you don't need them
<wallyworld> menn0: oh ok, those tests can be deleted?
<menn0> wallyworld: yep
<menn0> wallyworld: instead you use findStep
<menn0> wallyworld: this confirms that a given Step with a certain description exists for the specified version
<menn0> wallyworld: and then gives you the Step so that it can be tested
<menn0> wallyworld: kills 2 birds with one stone
<wallyworld> menn0: ok. in this case though, i am testing the step itself in state
<menn0> wallyworld: ok, well just have a test which calls findStep and have a comment explaining that it's tested elsewhere
<wallyworld> ok
<menn0> given that it's a state step you might need a findStateStep variant
<wallyworld> right. i saw that sort of thing was missing and assumed it would be those assert funcs
<menn0> I just hadn't implemented findStateStep b/c it wasn't needed yet
<menn0> I probably should have to make it clearer
<wallyworld> i wasn't across the changes, just went with what i knew :-)
<menn0> wallyworld: totally understandable - I should have made it more obvious
<wallyworld> glad i asked you to review :-)
<menn0> :-)
<menn0> wallyworld: review done
<wallyworld> tyvm, will look after i finish reviewing andrew's pr
<axw> wallyworld: can you please review https://github.com/go-amz/amz/pull/71, I need that for my ec2 change. looking at your PR again now
<wallyworld> sure
<axw> wallyworld: https://github.com/juju/juju/pull/6369 also needs a second review (sorry, feel free to pass if you're busy)
<wallyworld> tis ok
<wallyworld> axw: cert cleanup looks ok. will be good to get this fixed before release. chris is also working on a leak with state objects
<axw> wallyworld: cool, ta
<axw> wallyworld: reviewed
<wallyworld> ta
<wallyworld> axw: yeah, the error can only ever be notfound. bool is better
<wallyworld> axw: for some reason, doing another manual test - api returns errperm after upgrade, even though db is all properly migrated etc. maybe there's a macaroon issue and you need to logout before upgrading and then login again. not sure yet. pita to diagnose
<axw> wallyworld: hmm. try deleting your cookie jar and logging back in
<axw> wallyworld: the macaroon will have user@local in it
<wallyworld> axw: yeah, that was my suspicion
<wallyworld> but no luck with deleting cookie jar. something else is messing up
<wallyworld> worked fine another time damit
<rogpeppe1> axw: tyvm for the review
<axw> rogpeppe1: np
<urulama> wallyworld: try deleting ~/.go-cookies and ~/.local/share/juju/store-usso-token
<rogpeppe1> axw: v glad to hear about @local going away :)
<axw> me too
 * axw will be bbl
<wallyworld> urulama: turns out i was using an older copy of user tags which did not properly strip @local when parsing. seems to work now, without needing to delete any cookies
<urulama> cool
<rogpeppe1> a small change to make bootstrapping controllers with autocert a little simpler: https://github.com/juju/juju/pull/6391
<rogpeppe1> anyone up for reviewing the above? axw? wallyworld? voidspace?
<wallyworld> i can look
<rogpeppe1> wallyworld: ta!
<wallyworld> rogpeppe1: looks like a fairly simple change, lgtm
<rogpeppe1> wallyworld: ta
<voidspace> babbageclunk: ping
<babbageclunk> voidspace: pong
<voidspace> babbageclunk: do you know much about syctl.d?
<babbageclunk> voidspace: nup
<babbageclunk> voidspace: hth
<voidspace> babbageclunk: I'm looking at bug 1602192 which was assigned to you at some point
<mup> Bug #1602192: when starting many LXD containers, they start failing to boot with "Too many open files" <lxd> <juju:Triaged by rharding> <lxd (Ubuntu):Confirmed> <https://launchpad.net/bugs/1602192>
<voidspace> babbageclunk: yep, great help, thanks...
<babbageclunk> voidspace: ah right - that was called something very different when I first saw it.
<voidspace> babbageclunk: what about the juju cloud-init system and including a new file in it (specifically /etc/sysctl.d/10-juju.conf)
<voidspace> babbageclunk: do you know how to do that?
<voidspace> babbageclunk: I assume it's the cloudconfig package
<babbageclunk> voidspace: nope - I think it would need to be done on the host machine, right?
<voidspace> ah, right - yes
<voidspace> but that's still cloud-init, just not container init
<babbageclunk> voidspace: oh, but what about when someone's bootstrapping to lxd - don't they need the limits on their host machine?
<voidspace> ah, the lxd provider
<voidspace> yes, I don't even know if we can do that
<babbageclunk> voidspace: I think you need to pick rick_h_'s brains about whether he means that those limits should be set at juju install time and/or instance start time (for machines that could then host lxd containers)
<voidspace> the juju client probably shouldn't change global defaults on the machine you run it on
<babbageclunk> voidspace: no, that seems like a bad thing to do
<babbageclunk> voidspace: barring that, then yeah cloud-init seems like the right place to add a sysctl.d file
<voidspace> babbageclunk: I can look at doing it in cloud-init for machines that juju provisions and will talk to Rick about what to do with the lxd provider
<voidspace> babbageclunk: unfortunately the lxd provider seems like the major use case this affects
<babbageclunk> voidspace: yeah, that was certainly the original problem
<rick_h_> voidspace: babbageclunk jam had some opinions on this yesterday
<rick_h_> voidspace: babbageclunk I think we need to put that on hold while that works out atm tbh.
<voidspace> rick_h_: put the bug on hold or just fixing the host machine for the lxd provider?
<voidspace> rick_h_: we can still fix the issue for machines that juju provisions
<voidspace> unless jam has opinions on how that should be done too
<jam> rick_h_: voidspace: babbageclunk: so I replied to the email that rick_h_ forwarded to me. The *ideal* time to do it is at "bootstrap" time, because that is the only time that a client is actually asking to create containers.
<jam> however, Juju is running at user privilege then
<jam> the only time we have root is during "apt" time, but just because you are installing a juju client doesn't feel like a great time to consume kernel resources because you *might* bootstrap LXD
<jam> is it possible to just give better feedback about why something isn't starting, and point people toward how to fix it?
<voidspace> jam: so we can't do the right time and we shouldn't do the wrong time?
<voidspace> jam: working out what the problem was required some serious probing outside of juju - switching back to lxc rather than lxd was how it was worked out I think
<voidspace> jam: so I'm not sure it's "easy" from juju to tell why provisioning fails
<jam> voidspace: fundamentally it feels like it should be LXD's problem, as anyone who wants to create 20 containers is going to hit it, we just make it easier to do so.
<voidspace> jam: right
<jam> can you link the original bug?
<voidspace> bug 1602192
<mup> Bug #1602192: when starting many LXD containers, they start failing to boot with "Too many open files" <lxd> <juju:Triaged by rharding> <lxd (Ubuntu):Confirmed> <https://launchpad.net/bugs/1602192>
<voidspace> jam: see comment 29 (from Stephane in july) about an upstream fix
<jam> so the patch to tie it to a user namespace seems the ideal, as then each container gets X handles, and launching Y containers automatically gets you X*Y handles available.
<jam> I suppose if the answer is only "8x more consumption" and it gives us the headroom for 30-ish containers maybe thats sane...
<voidspace> jam: so reach out to Stephane to find the state of the upstream patches and leave the bug for the moment?
<jam> voidspace: from the conversation, the 'upstream' patch is likely to be many months out of acceptance.
<jam> it does feel like the most correct fix.
<jam> babbageclunk: do you know how many containers you could do with default settings?
<rick_h_> jam: folks were hitting around 8
<jam> I do believe my environments have all been touched so I'm not 100% sure what pristine is.
<rick_h_> jam: sorry, wrong bug
<jam> rick_h_: voidspace: what about having a script that we ship with juju which can create an appropriate /etc/sysctl.d/10-juju.conf file, and if you do "juju bootstrap ... lxd" we check for that file
<rick_h_> jam: voidspace I think that we have to be ready to react though quick. I'd like to suggest we get a patch ready for the local provider case and leave the cloud-init case and at least have it handy.
<jam> and give you a message about # of container limitations, and what "sudo bigger-inotify-limits.sh" you can run to fix it?
<voidspace> rick_h_: the local provider is the one that's problematic to fix
<rick_h_> jam: I just don't think that having a 10-20 limit on the local provider case is going to pass muster.
<jam> so we tie it to "juju bootstrap ... lxd"
<voidspace> rick_h_: we either have to do it at install time or use jam's idea
<jam> but we make it explicit, which also gives the user a pointer if they really need to go up to 50 containers, etc.
<voidspace> rick_h_: as "juju bootstrap" runs with user privileges and changing this on the host machine requires system privileges
<rick_h_> voidspace: I understand. imo we should just put it in at install time.
<rick_h_> voidspace: jam I understand, just really can't get past the fail/extra command to use lxd for 10-20
<rick_h_> voidspace: jam I guess I'd feel different if it was a 50+ thing
<voidspace> rick_h_: jam: I'm inclined to agree - it's a setting that I don't see a downside to changing
<jam> voidspace: if there wasn't a downside, then LXD would ship with it.
<voidspace> rick_h_: jam: however I know many users *might* feel differently about us changing their system settings
<jam> "each used inotify watch takes up 1kB of unswappable kernel memory"
<voidspace> jam: I don't think that necessarily follows - it's more likely to be caution about changing system settings
<rick_h_> bumping the limit will also allow any other user on the
<rick_h_> system to use a whole lot of kernel memory and still run you dry of
<rick_h_> inotify watches.
<voidspace> jam: if someone is trying to run 20 lxd containers they'll be fine with that - it's a necessary consequence
<rick_h_> ^^ is the "cost" which in a local provider case (folk's laptops/desktops) I don't feel is an issue to outweigh the gain
<jam> voidspace: I *absolutely* agree that if someone wants 20 containers they want it
<voidspace> it's not *ideal* that's for sure
<jam> but "apt install juju" is not "I want to run 20 containers on this machine"
<voidspace> yep, understood
<jam> rick_h_: again, that is why I'm trying to tie it to "juju bootstrap ... lxd" where someone is very close to saying they want 20 containers.
<rick_h_> jam: I understand. but we can't do that. So within our limits of influence atm, we need to be ready to do the right thing for juju users with lxd.
<rick_h_> if we can get lxd to carry the issue great, but with one week until yakkety release I'm not convinced we can get that to happen.
<rick_h_> jam: voidspace so I don't see any way around us carrying this as part of the juju install for the time being.
<rick_h_> voidspace: jam especially because this isn't just 20 containers at a time, but 20 across multiple models locally
<jam> rick_h_: if I read his recommended values correctly, that leaves us with 4GB of unswappable kernel memory, which sounds like a bad default.
<jam> again, that seems to be only the in-use ones, but it does mean a runaway process will cause real problems on your machine.
<jam> can we cut that by 1/4th ?
<jam> so instead of 4M default go to 1M default, which cuts it to a 1GB peak?
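The arithmetic behind jam's numbers can be checked directly, assuming the ~1 KiB of unswappable kernel memory per in-use inotify watch quoted later in the conversation (this is a sketch for illustration, not juju code; the helper name is made up):

```python
KIB_PER_WATCH = 1  # ~1 KiB of unswappable kernel memory per in-use inotify watch


def peak_watch_memory_gib(max_user_watches):
    """Worst-case kernel memory (GiB) if every allowed watch is in use."""
    return max_user_watches * KIB_PER_WATCH / (1024 * 1024)


# Stephane's suggested 4M watches would allow a 4 GiB peak of unswappable
# memory; quartering the limit to 1M caps the peak at 1 GiB instead.
```

Note this is a ceiling, not steady-state usage: only watches actually registered consume the memory, which is why a runaway process (rather than normal operation) is the real risk.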
<jam> can we test how many containers you can run with that cleanly?
<voidspace> jam: rick_h_: I can test that
<rick_h_> so if we cut that by 1/4 we should be looking at 2x our original bug 20-40?
<mup> Bug #20: Sort translatable packages by popcon popularity and nearness to completion <feature> <lp-translations> <Launchpad itself:Invalid> <https://launchpad.net/bugs/20>
 * rick_h_ has to run the boy to school
<voidspace> on my system both baloo and nepomuk have set the fs.inotify.max_user_watches to 524288
<jam> it seems that "apt install kde-runtime" creates a /etc/sysctl.d/30-baloo-inotify-limit.conf with 524288
<jam> voidspace: yeah, I just found the baloo one.
<jam> the doc I found says the default is 8192
<jam> 512K is >> 8k
<voidspace> wow, that's low
<jam> I'll try to see what a fresh image has in AWS
<jam> voidspace: my old EC2 IRC bot does, indeed, have 8192 by default.
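The values being compared here can be read straight out of /proc on any Linux host; a small sketch (the function name is illustrative) that would show either the 8192 built-in default or a 524288 override like baloo's:

```python
import os


def read_inotify_limits(base="/proc/sys/fs/inotify"):
    """Return the current inotify limits as a dict.

    Missing entries are skipped, so this degrades gracefully on hosts
    (or test fixtures) that lack some or all of the files.
    """
    limits = {}
    for name in ("max_user_watches", "max_user_instances", "max_queued_events"):
        path = os.path.join(base, name)
        if os.path.exists(path):
            with open(path) as f:
                limits[name] = int(f.read().strip())
    return limits
```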
<voidspace> jam: I'm upping the limit on my machine and seeing how many containers I can create
<jam> voidspace: yeah, unfortunately, I have the feeling that babbageclunk has the 512k version, not the built-in 8k version.
<jam> If it was just 8k lets us create 8 containers, then I'm happy to go 8k * 20
<voidspace> jam: baloo is part of kde, so it may well be the default for desktop users
<jam> voidspace: so, what is a good way to test that the container is getting provisioned without Juju? Something that runs via upstart and prints something?
<voidspace> jam: I was just going to do it with juju...
<voidspace> there are some scripts on that bug though
<voidspace> juju bootstrap localxd lxd --upload-tools
<voidspace> for i in {1..30}; do juju deploy ubuntu ubuntu$i; sleep 90; done
<voidspace> except without --upload-tools
<jam> voidspace: so that can tell you if juju came up and ran, I was trying to check it without Juju in the picture.
<voidspace> jam: there's some lxc reproducer scripts that check for success
<voidspace> https://bugs.launchpad.net/juju/+bug/1602192/+attachment/4700890/+files/go-lxc-run.sh
<mup> Bug #1602192: when starting many LXD containers, they start failing to boot with "Too many open files" <lxd> <juju:Triaged by rharding> <lxd (Ubuntu):Confirmed> <https://launchpad.net/bugs/1602192>
<jam> just doing "juju deploy ubuntu -n 30" is another way to just test containers.
<jam> voidspace: thanks, go-lxc-run is what I was looking for.
<voidspace> running it now
<jam> interesting, those require Xenial images because it is using systemctl, I wonder if Trusty would be different as a guest.
<jam> so I get 12 good containers with 512k max_user_inotify
<voidspace> hah, my machine is grinding to a halt...
<jam> interesting, it may be fs.max_user_instances that is failing first, vs max_user_watches
<jam> voidspace: 30 containers all at once can do that to you :)
<voidspace> I failed at 13
<voidspace> trying again
<voidspace> also setting max_user_instances and max_queued_events
<jam> voidspace: I failed at 13 with 512k and 128 max_user_instances, but at 131072/128 I also failed at 13
<jam> so I'm pretty sure it is the max_user_instances which blocks us at ~12 containers.
<voidspace> right
<jam> trying to dig up reasons to set/not set that one.
<jam> I found the 1KB kernel memory for a "user_watch" but nothing yet on the cost of a "user_instance"
<jam> voidspace: with 'go-run-lxc' I'm trying to set max_user_watches really low (32k) and see if I hit a limit first
<voidspace> cool
<voidspace> with max_user_instances set to 1024 (plus max_queued_events) as suggested by Rick I got to 16 before failing
<voidspace> which seems low
<voidspace> hard to tell why it failed though
<jam> even with 32k I still get 13 containers started. it seems there aren't too many user watches actually created (probably a lot more when juju is running, cause there are more upstart scripts)
<jam> voidspace: 'sudo slabtop' should tell you what kind of kernel memory you are using.
<jam> I haven't fully figured it out, but it might be interesting if we see Kernel mem bloating dramatically.
<jam> with no containers active, my kernel memory seems to sit at 512k
<jam> well, presumably a display bug as it says 865991.26K right now
<jam> but that is 865GB which is a bit more than my 16GB ram :)
<jam> ah, sorry, I meant 512MB
<jam> which is accurate.
<jam> voidspace: so, I'm +1 on 512*1024 default max_user_watches, as that is a standard used by other things in a normal desktop install, and doesn't seem to be the direct limiting factor in launching containers.
<jam> voidspace: I got to 18 successful containers with max_user_instances=1024 max_user_watches=32768
<voidspace> jam: right - did you dig up any reasons not to change max_user_instances
<voidspace> jam: cool - I'm trying again and am up to 12
<jam> kernel memory went up to 1.2GB
<jam> voidspace: how much mem do you have ?
<voidspace> jam: with 15 containers running it's at 600474.73K
<jam> interesting, mine was much higher
<voidspace> jam: still just over half a gig then
<jam> but how much total ram?
<jam> in the machine
<jam> I'm also on a Trusty kernel testing this, so some of that may vary
<voidspace> jam: 16GB in the machine
<jam> same as myself
<voidspace> xenial kernel
<jam> I'm also running a btrfs root disk, and btrfs_inode is actually my top consumer in slabtop
<voidspace> 18 containers up to 877611
<voidspace> so similar to yours I think
<voidspace> fluctuating
<voidspace> machine grinding to a halt again
<jam> at 17 containers it switches to 'dentry'
<jam> which is probably inotify stuff
<jam> interestingly, the script fails to cleanup when it hits 19 containers
<jam> 'unable to open database file'
<jam> sounds like a general FS limit
<voidspace> jam: mine died on 20 but cleaned up ok
<voidspace> jam: I have the settings suggested by rick_h_ in the bug, but it sounds like we don't need to touch user_watches
<voidspace> jam: I'll set that back to the default and try again
<jam> voidspace: correct. user_watches = 512k seems sane
<jam> playing around with user_instances myself.
<jam> and haven't touched max_queue yet
<voidspace> I'm grabbing coffee - so maybe it's just max_user_instances we need to change
<voidspace> I'll do some digging on that
<rick_h_> jam: voidspace so looks like the numbers we got suggested from stephane might not all need tweaking as much?
<jam> rick_h_: yeah, we don't need to multiply all numbers by 8, I'm also trying to get several data points so we know how much kernel memory is taken up by what settings and how many containers that yields.
<rick_h_> jam: gotcha ty
<jam> rick_h_: so at max_user_instances=256, I've hit a soft cap where "chmod" seems to be failing. Which means we have a different bottleneck.
<jam> thats at 19 containers.
<jam> How many do you consider "sane by default" which would mean we need to go poke something else.
<jam> I'm going to run the tests again with Juju in the loop, to confirm that we can get close to the 'ideal' limits of go-lxc-run.sh
<rick_h_> jam: honestly I'd hope for 40-50 ?
<rick_h_> not sure what other folks think
<jam> rick_h_: then we need to find what the next bottleneck is
<jam> cause it looks to be something like "max files open"
<jam> I get errors in "lxc delete" because it can't open the database.
<rick_h_> jam: isn't that what this bug is?
<rick_h_> https://bugs.launchpad.net/juju/+bug/1602192
<mup> Bug #1602192: when starting many LXD containers, they start failing to boot with "Too many open files" <lxd> <juju:Triaged by rharding> <lxd (Ubuntu):Confirmed> <https://launchpad.net/bugs/1602192>
<jam> rick_h_: that is inotify handles
<jam> and we can bump it up from the defaults, but that only moves me from 13 to 19 containers
<rick_h_> I see, different "too many open files" situation?
<jam> rick_h_: right.
<rick_h_> how does lxd do massive scale if there's limits hit like this? /me is confused
<rick_h_> tych0: how many containers do you all run in testing things? do much with > 20 on a host?
<jam> I didn't test max-queued-events yet, maybe that's the bottleneck
<babbageclunk> (Sorry everyone, catching up on the conversation now)
<voidspace> jam: with a raised max_queued_events I still had a limit of 20
<jam> voidspace: same here, something else is hitting a limit
<jam> I'm thinking something like max procs or max fs handles
<jam> but I can't tell
<jam> other things on my machine start failing with 'could not open file' if I have 19
<rick_h_> katco: ping for standup
<jam> rick_h_: voidspace: babbageclunk: So having played with it for a bit, I'm more comfortable with an /etc/sysctl.d/10-juju.conf that sets max_user_watches=512k and max_user_instances=256 but if we want to get to 50 instances we need to dig harder.
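The file jam is describing would look roughly like this (values are the ones he settled on above; the filename was proposed in this conversation, not something juju shipped at the time):

```
# /etc/sysctl.d/10-juju.conf (sketch)
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 256
```

Files in /etc/sysctl.d/ are applied at boot; `sudo sysctl -p /etc/sysctl.d/10-juju.conf` applies them immediately without a reboot.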
<jam> I can just barely get to 10 instances of 'ubuntu' from juju, and only 19 raw containers with any of the inotify settings, and processes start dying at that point.
<jam> (firefox/Term/etc crash)
<voidspace> jam: currently in standup and then collecting daughter - I'm doing some digging on "scaling lxc|d" as people *must* have done this before
<voidspace> jam: I've added that as a note to the bug just to track where we've got to
<jam> voidspace: https://launchpad.net/~raharper/+junk/density-check was something Dustin used to get 600+ containers on his system, but he didn't say what tuning he did around that.
 * voidspace looking
<tych0> rick_h_: we use busybox in our test suite, which doesn't run a lot of actual things inside the container
<tych0> rick_h_: but also,
<tych0> rick_h_: https://github.com/lxc/lxd/blob/master/doc/production-setup.md
<tych0> has a bunch of limits that we recommend bumping
<rick_h_> tych0: ah, interesting
<rick_h_> jam: voidspace ^
<rick_h_> tych0: hmm, ok. So ootb this limit of 19ish doesn't sound like we're doing something wrong?
<voidspace> tych0: thanks
<voidspace> rick_h_: jam: there's a bunch of things to tweak there - I'll play
<voidspace> collecting daughter from school first
<jam> voidspace: it feels a lot like i'm hitting max number of open files for 1 user
<voidspace> jam: that's /etc/security/limits.conf I guess
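For reference, raising the per-user open-file cap would mean entries along these lines in /etc/security/limits.conf (values here are hypothetical; limits.conf changes only take effect for new login sessions, which is why the conversation treats it as off-limits):

```
# /etc/security/limits.conf (sketch, illustrative values only)
#<domain>  <type>  <item>  <value>
ubuntu     soft    nofile  65536
ubuntu     hard    nofile  65536
```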
<rick_h_> voidspace: please ping when you're back
<jam> voidspace: yeah
<jam> sounds like changing that needs a system reboot
<jam> and doesn't sound like something we should really be poking at
<rick_h_> jam: yea, I think the 19 we run with and make sure we do a really solid job of error'ing cleanly and having this link from tych0 as something we're ready to point to after that
<jam> rick_h_: so with Juju in there, its 10
<rick_h_> ouch?!
<voidspace> gotta run - bbiab
<jam> because we run a lot more things than just empty containers
<jam> rick_h_: yeah, and that's 'ubuntu' charm
<rick_h_> yea, understand :/ just ouch
<tych0> so there has been talk in the past about namespacing some of this
<tych0> (in the kernel)
<tych0> perhaps we should talk about that more in bucharest
<rick_h_>  tych0 yea, sounds like we have to roll with what we can do for now, but it'll be a topic to chat about because 10 isn't great for a local juju experience
<jam> rick_h_: pointing users to docs for how to tweak settings seems a best-effort on our part for now
<katco> rick_h_: hey sorry about the standup... they're starting to close off roads for the debate on sunday. massive traffic
 * perrito666 imagines the debate like a street rap battle given the closed roads
<katco> perrito666: lol no, they're just ramping up security around the university where it's being held... or something. maybe it's just for parking, dunno
<rick_h_> katco: rgr
<rick_h_> voidspace: when you're back also want to check on https://bugs.launchpad.net/juju/+bug/1629452
<mup> Bug #1629452: [2.0 rc2]  IPV6 address used for public address for Xenial machines with vsphere as provider <oil> <oil-2.0> <vsphere> <juju:Triaged> <https://launchpad.net/bugs/1629452>
<perrito666> katco: you cant deny that rap battle style debate would be awesome
<voidspace> rick_h_: no useful progress on that one I'm afraid - I got stuck for a while on getting access to vsphere
<katco> perrito666: it would be yuuuuuge
<voidspace> rick_h_: I think I've solved that but switched to the lxd bug
<rick_h_> voidspace: ok, what's involved in solving it?
<rick_h_> voidspace: we're getting asked to get that to make the cut for 2.0 and I want to understand how big the ask is
<voidspace> rick_h_: I couldn't get the VPN to work - but using ssh config and the cloud-city key I should be able to get to it
<voidspace> rick_h_: I got as far as connecting, but refused access and now I have the right key I should have full access
<voidspace> rick_h_: so I can look at it
<rick_h_> voidspace: ok, I'm going to pull it back then and we'll try to get to it next.
<rick_h_> voidspace: k, but you have a hint at the root issue that needs fixing?
<voidspace> rick_h_: ah, solving the issue, not solving access
<voidspace> rick_h_: nothing tangible, but with some logging it should be easy enough to find the source if it's deterministic
<voidspace> which from the bug report it it
<voidspace> *it is
<rick_h_> voidspace: k, yea.
<rick_h_> voidspace: ok, will pull the card back in and let's see what we can come up with.
<rick_h_> voidspace: but for now, let's move forward with the small tweak for a 20% gain in containers and make sure our error'ing/logging is clear around the container limit
<rick_h_> to wrap up the current bug
<voidspace> rick_h_: I'm concerned about handling the error case
<voidspace> rick_h_: the error that surfaces to juju isn't related to the file issue - that's well underlying
<rick_h_> voidspace: the too many files info isn't coming from lxd but into the syslog or something?
<voidspace> rick_h_: I will try and see where it ends up and report back
<rick_h_> voidspace: k
<voidspace> I hadn't found it so far in my playing today
<voidspace> rick_h_: for getting a new file into the ubuntu juju package, do I need to bug the package maintainers with a patch rather than in juju-core?
<voidspace> I can't see deb related stuff in juju-core
<rick_h_> voidspace: check with mgz and sinzui please for that
<voidspace> rick_h_: yep
<sinzui> voidspace: mgz, balloons, and propose the Ubuntu packages. We can make changes as needed
<voidspace> sinzui: cool, thanks
<sinzui> voidspace: I think "I" was supposed to be in that last message. I am working on the yakkety package now
<voidspace> sinzui: I mentally interpolated it anyway...
<voidspace> sinzui: we need to add a new sysctl conf file for juju, shall I raise a specific issue for it or just email you (plural)?
<sinzui> voidspace: report a bug against juju-release-tools. We can track the point it is fixed
<voidspace> sinzui: thanks
<rogpeppe1> to anyone that's been working on removing hard time dependencies in juju-core, you should find this useful: https://github.com/juju/utils/pull/245
<rogpeppe1> reviews appreciated, please
<rogpeppe1> redir: i'm not sure if you were working on removing time dependency, but you might be interested to take a look: https://github.com/juju/utils/pull/245
<voidspace> jam: ping
<rick_h_> rogpeppe1: I think macgreagoir is doing some of that &
<rick_h_> not sure if he's available to peek
<rogpeppe1> rick_h_: thanks
<rogpeppe1> rick_h_: looks like redir definitely was too
<rick_h_> ok cool
<natefinch> rogpeppe1: when you get a minute, I updated that PR, btw: https://github.com/juju/persistent-cookiejar/pull/17
<rogpeppe1> natefinch: cool, thanks
<rogpeppe1> natefinch: you too might be interested in the retry package PR i mentioned above
<rogpeppe1> natefinch: i thought it was quite reasonable to pass a URL to RemoveAll
<rogpeppe1> natefinch: as it might be useful to remove all cookies under a particular path (e.g. our services store service-specific cookies under api.jujucharms.com/servicename/
<rogpeppe1> natefinch: but given that you don't need that functionality, i'm suggesting you just rename your method RemoveAllHost instead
<natefinch> rogpeppe1: sounds good to me
<rogpeppe1> natefinch: which can be a specialised version of RemoveAll if/when we implement that
<natefinch> rogpeppe1: yep, great.
<voidspace> sinzui: https://bugs.launchpad.net/juju-release-tools/+bug/1631038
<mup> Bug #1631038: Need /etc/sysctl.d/10-juju.conf <juju-release-tools:New> <https://launchpad.net/bugs/1631038>
<voidspace> sinzui: let me know if I should do more, like provide an actual file
<sinzui> voidspace: this is for the juju *client* on their localhost?
<voidspace> sinzui: yes, sorry
<sinzui> yep
<voidspace> sinzui: otherwise it would be a juju-core bug for cloud-init to create it
<sinzui> voidspace: I should have clicked through to the bug...I know it well
<sinzui> voidspace: Juju is also providing that when it sets up a jujud?
<voidspace> sinzui: alas, this isn't enough - it gets us up from ~10 to ~20 or so containers
<sinzui> voidspace: that is enough for me to test an openstack deployment though :)
<voidspace> sinzui: cool
<voidspace> coffee
<rogpeppe1> natefinch: BTW your cookiejar branch is named "master" which is probably not what you want
<natefinch> rogpeppe1: I was just noticing that
<rogpeppe1> natefinch: reviewed
<rogpeppe1> katco: i see a lot of bugs with your name on that this could help fixing... fancy a review? :) https://github.com/juju/utils/pull/245
<rogpeppe1> s/bugs/TODOs/
<katco> rogpeppe1: sure
<rogpeppe1> katco: ta!
<katco> rogpeppe1: hmmm how is this different enough from github.com/juju/retry?
<rogpeppe1> katco: ha, i didn't know about that
 * redir was going to mention katco since I recall her doing retry stuff recently
<rogpeppe1> katco: well for a start it keeps to the existing pattern
<katco> rogpeppe1: the bug todos you're probably seeing from me are referencing a bug to *consolidate* not create another retry mechanism haha
<rogpeppe1> katco: i think that having a loop is better than a callback
<katco> rogpeppe1: this would be i think the 4th or 5th way of doing retries in juju... bc there's so many this would definitely have to go through the tech board
<katco> rogpeppe1: i don't like our current retry package very much, personally
<rogpeppe1> katco: well, it's intended to be a straight replacement for utils.AttemptStrategy
<katco> rogpeppe1: we're meant to be consolidating everything to juju/retry
<rogpeppe1> katco: juju/retry looks pretty complicated to me
<katco> rogpeppe1: yeah i don't like it
<katco> rogpeppe1: but i already sent an email out about this a month or so ago, and this was the decision. so any new attempt at replacing it has to go through the tech review board
<katco> rogpeppe1: do you want me to plop it on the schedule?
<rogpeppe1> katco: just FWIW:
<rogpeppe1> % g -r retry.CallArgs | wc
<rogpeppe1>      15      85    1293
<rogpeppe1> % g -r utils.AttemptStrategy | wc
<rogpeppe1>      81     397    7065
<katco> rogpeppe1: you are attempting to convince me of something i already believe :) but it doesn't change the path forward unfortunately
<rogpeppe1> i.e. I think there's a lot of value in having a pluggable replacement for the existing mechanism that doesn't involve wholesale code rewriting
<rogpeppe1> katco: please plop it :)
<katco> rogpeppe1: will do! can you write up an email and send it to me? you might even be able to attend the meeting to make your case
<rogpeppe1> katco: ok will do
<katco> rogpeppe1: ta roger
<katco> rogpeppe1: yeah i really dislike juju/retry's callback methodology and little knobs and such. i prefer inline myself. i think i wrote all this in my email whenever that was
<rogpeppe1> katco: if you could review my code (and API) anyway, that would be great - then i can know whether it's worth continuing
<rogpeppe1> katco: FWIW i've been thinking about this for ages, but hadn't come to a decent understanding of how to support the existing API in the face of the stop thing.
<rogpeppe1> katco: and i just realised that it was actually OK for HasNext to block.
<katco> rogpeppe1: fyi, it's on the agenda: https://docs.google.com/document/d/13nmOm6ojX5UUNtwfrkqr1cR6eC5XDPtnhN5H6pFLfxo/edit
<rogpeppe1> katco: ta
<rogpeppe1> katco: as a little experiment, i replaced one use of juju/retry with the new package (functionally identical i think although there are no tests to check that sadly). http://paste.ubuntu.com/23285245/
<rogpeppe1> katco:  1 file changed, 17 insertions(+), 43 deletions(-)
<katco> rogpeppe1: less code makes me happy :)
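The two API shapes under debate can be sketched side by side. Juju's actual packages are Go (`utils.AttemptStrategy` vs `juju/retry`); this is a language-neutral Python sketch with made-up names, showing why rogpeppe prefers the caller-owned loop: the body can break, continue, or return freely instead of threading state through a callback:

```python
import time


class AttemptStrategy:
    """Loop-style retry in the spirit of utils.AttemptStrategy:
    the caller writes the loop body."""

    def __init__(self, total, delay):
        self.total, self.delay = total, delay

    def start(self):
        # Always yield at least once, then keep yielding (after a delay)
        # until the total duration is exhausted.
        deadline = time.monotonic() + self.total
        first = True
        while first or time.monotonic() < deadline:
            if not first:
                time.sleep(self.delay)
            first = False
            yield


def retry_call(func, attempts, delay):
    """Callback-style retry in the spirit of juju/retry:
    the helper owns the loop and invokes the callback."""
    for i in range(attempts):
        try:
            return func()
        except Exception:
            if i == attempts - 1:
                raise
        time.sleep(delay)
```

Usage of the loop style reads like ordinary code: `for _ in AttemptStrategy(5.0, 0.2).start(): ...` with a plain `break` on success, which is the "pluggable replacement without wholesale rewriting" argument.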
<alexisb> redir, ping
<alexisb> redir, when you are ready https://hangouts.google.com/hangouts/_/canonical.com/alexis-bruemme
<redir> alexisb: ack brt
<natefinch> rogpeppe1: you still around?
<rogpeppe1> natefinch: yup, but not for long
<natefinch> rogpeppe1: yep, figured.  Quick question on the cookie jar... I'm honestly not sure what the behavior should be.  Do you think we should exact match on the hostname?
<natefinch> rogpeppe1: I agree that foo.apple.com removing cookies for bar.apple.com is confusing
<natefinch> rogpeppe1: should removing apple.com remove cookies for foo.apple.com?  I don't know what is expected here.
<rogpeppe1> natefinch: i think an exact match would be more intuitive
<natefinch> rogpeppe1: fine by me.  Will do. Thanks.
<alexisb> hml, you around?
<hml> alexisb: good afternoon
<alexisb> heya :)
<alexisb> do you have a second for a quick call or hangout?
<hml> alexisb: sure, phone call would be better
<alexisb> number?
<hml> alexisb: 781.929.3679
<kwmonroe> hey juju-dev!  neiljerram noted something weird on #juju in rc3
<kwmonroe> <neiljerram>        UNIT                WORKLOAD  AGENT  MACHINE  PUBLIC-ADDRESS   PORTS  MESSAGE
<kwmonroe> <neiljerram>        calico-devstack/0*  unknown   idle   0        104.197.123.208
<kwmonroe> where does that * in the unit name come from?
<kwmonroe> i thought maybe it was truncating for length, but i deployed ubuntu with a long name in rc3 and didn't see it: http://paste.ubuntu.com/23286448/
<alexisb> kwmonroe, I believe that means leader now
<alexisb> thumper, ^^^
<thumper> yeah
<thumper> that's right
<kwmonroe> cool!  thx alexisb thumper.  there ya go neiljerram.  it denotes leadership... i didn't see it because the ubuntu charm doesn't have that concept.
<neiljerram> ok thanks, good to know
<kwmonroe> neiljerram: i'd be interested to know your output of 'juju status --format=yaml calico-devstack/0' shows it as well
<neiljerram> kwmonroe, I can't easily get yaml for the deployment with calico-devstack in it.  But in the other deployment that I just ran, with more units, yes, I do see this in the yaml:
<neiljerram>         leader: true
<kwmonroe> cool
<alexisb> hml, axw I will be a couple min late
<menn0> wallyworld: the new tools selection behaviour (no more --upload-tools) is nice but has one unfortunate side effect
<wallyworld> :-(
<menn0> wallyworld: if you're working on a feature and a new release arrives in the streams "juju bootstrap" stops using the tools you've just built
<menn0> wallyworld: it's just bitten me again
<axw> yeah, I get confused by that too
<wallyworld> menn0: yeah, you need to pull the latest source to get the new version
<menn0> I lost a bit of time figuring out why my QA wasn't working
<wallyworld> it's a small window but a pain none the less
<menn0> wallyworld: is the solution to stop using go install and just use --build-agent when testing stuff?
<wallyworld> yep
<menn0> I will try and change my habits and see how that works out
<menn0> wallyworld: I do like the new semantics overall, it's just this one thing
<wallyworld> menn0: here's a trivial cmd help text change for that users command we discussed yesterday https://github.com/juju/juju/pull/6392
<menn0> wallyworld: give me 2 mins
<wallyworld> no hurry
<babbageclunk> menn0: Got a moment for a quick chat before standup?
<menn0> babbageclunk: sure
<menn0> babbageclunk: where?
<babbageclunk> menn0: https://hangouts.google.com/hangouts/_/canonical.com/xtian
<thumper> haha
<thumper> fark!!!
<thumper> I think I have found this race
<thumper> geeze it is a doozy
<alexisb> babbageclunk, feel free to join us
<babbageclunk> alexisb: too sleepy - want to finish this test and crash
<alexisb> :) understood
<perrito666> alexisb: gah, now I am singing suses song
<alexisb> :)
<thumper> menn0 or wallyworld: https://github.com/juju/juju/pull/6397
<wallyworld> looking
<thumper> wallyworld: thanks
<wallyworld> sure
<wallyworld> thumper: here's a really trivial one https://github.com/juju/juju/pull/6392
<thumper> looking
<menn0> wallyworld: review done...
<wallyworld> ta
<thumper> done
<wallyworld> menn0: i didn't know about our summary being one line, i'll rework
<menn0> wallyworld: yeah, it's the line that's shown when you do "juju help commands"
<menn0> not sure what will happen with multiple lines
<wallyworld> menn0: fixed, plus also i did a quick driveby for another bug
<menn0> wallyworld: looking
#juju-dev 2016-10-07
<wallyworld> i removed the tabular text bit as all user commands like that are tabular
<wallyworld> no need to mention it every time imho
<menn0> wallyworld: LGTM. just needs a line wrapping fix.
<wallyworld> ta
<wallyworld> damn, i meant to delete that bit
<alexisb> alrigty all I am off for the day
<menn0> thumper or wallyworld: https://github.com/juju/juju/pull/6398
<wallyworld> looking
<babbageclunk> menn0: Ugh - I can't get my watcher to see a model I create in my test.
<menn0> babbageclunk: stink
<menn0> babbageclunk: do you want me to look at the code?
<babbageclunk> menn0: Yes please! just pushing it now.
<babbageclunk> menn0: After I create a new model here: https://github.com/juju/juju/commit/94f48e609342c26c494c36155fb0df21608f3624#diff-6734d4932d3aead490d5ec768a7df6b7R519
<babbageclunk> menn0: I never see it logged in the changes from here: https://github.com/juju/juju/commit/94f48e609342c26c494c36155fb0df21608f3624#diff-19f910c28876da8a8d94937e102b2ebeR686
<babbageclunk> menn0: More importantly, I never see it die after the model.Destroy.
<wallyworld> menn0: your pr looks really nice
<menn0> wallyworld: cheers... I got a bit obsessive :)
<wallyworld> that is a good thing :-)
<menn0> babbageclunk: i've been digging for a bit
<menn0> babbageclunk: what you've got looks reasonable
<menn0> still loooking
<menn0> also looking
<menn0> :)
<babbageclunk> Thanks!
<babbageclunk> I think I need to go to bed - maybe it'll be obvious (later) in the morning.
<babbageclunk> menn0: ^
<menn0> babbageclunk: ok, i'll email if I see anything
<menn0> babbageclunk: good night
<babbageclunk> night!
 * babbageclunk craches
<babbageclunk> ugh
<babbageclunk> dumb fingers
<menn0> wallyworld or axw: are the status values shown under machine-status in the yaml status output provider specific?
<menn0> wallyworld/axw: never mind I figured it out. they're not.
<wallyworld> menn0: we map provider specific codes onto those generic status values
<menn0> wallyworld: yep thanks. I figured out immediately after asking
<redir> wallyworld: needs more tests, but PTAL and I'll follow up tomorrow: https://github.com/juju/juju/pull/6399
<redir> or axw ^
<wallyworld> ok
<redir> wallyworld: expect typos... Just got things changed and the current tests updated. Haven't made a pass for spelling or grammar
<wallyworld> ok, np
<natefinch> axw: you around?  I have questions on login/logout tests
<axw> natefinch: yes, but about to go into a meeting
<natefinch> axw: ok, let me know when you're out
<axw> natefinch: finished
<natefinch> axw: want to do a hangout?  Might be easier
<axw> natefinch: sure
<axw> natefinch: https://plus.google.com/hangouts/_/canonical.com/andrew-nate&authuser=1
<axw> eek, that doesn't work
<natefinch> heh
<natefinch> https://hangouts.google.com/hangouts/_/canonical.com/core?pli=1&authuser=2
<menn0> wallyworld: follow on from my earlier PR: https://github.com/juju/juju/pull/6400
<wallyworld> menn0: ok, just eating will look real soon
<menn0> wallyworld: no rush
<wallyworld> menn0: swap you https://github.com/juju/juju/pull/6401, an easy one
<menn0> wallyworld: ok
<menn0> wallyworld: but Brokener is such a great name :)
 * wallyworld gets the sick bag
<menn0> Breakable :p
<wallyworld> much better
<menn0> CanHazBroken
<wallyworld> even betta!!!
<wallyworld> love that one
<wallyworld> why was that not a convention
<wallyworld> CanHazAddress is much better than Addresser
<menn0> wallyworld: I'll come up with something else
<menn0> wallyworld: I don't understand your PR :-/
<wallyworld> :-(
<menn0> wallyworld: well what came before it
<wallyworld> menn0: i like Breakable :-)
<menn0> wallyworld: i'll probably go with that actually
<wallyworld> is it the --show-secrets behaviour?
<menn0> wallyworld: well yes
<wallyworld> so, by default list-credentials will not print passwords etc
<wallyworld> it needs to query the provider to get the schema to know what attrs are secret
<wallyworld> to query the provider, it needs to look up the cloud details
<wallyworld> so if the cloud name is invalid, it can't get the credential schema
<wallyworld> so it errs not to print anything other than a message
<menn0> wallyworld: but why show creds for a removed cloud at all?
<wallyworld> because that entry is in the file
<wallyworld> and list-credentials needs to display the contents of the file IMHO
<wallyworld> maybe the user wants to add that cloud back later
<menn0> wallyworld: ok
<menn0> wallyworld: I'll suggest a simpler wording on the PR
<wallyworld> ok
<menn0> wallyworld: I think that's what threw me
<wallyworld> i think what's there gives the necessary info, so if it's just a wording tweak
<wallyworld> i wouldn't want to not convey that info
<menn0> how about:
<menn0> The following clouds no longer exists but credentials for them still exist.
<menn0> Use --show-secrets to display credentials these credentials.
<menn0> ...
<menn0> wallyworld: ^
<menn0> ignoring the doubled up word :-/
<wallyworld> sgtm
<menn0> ok
<menn0> wallyworld: done
<wallyworld> ta
<menn0> now to land this Brokener change....
<menn0> :-p
<blahdeblah> Who wouldn't want a Brokener?
<blahdeblah> wallyworld: I can't believe you don't like that name
<wallyworld> call me fussy :-)
<wallyworld> i really hate how Go appends "er" to everything
<wallyworld> sounds horrible most of the time
<blahdeblah> Yeah - I think it should have used "inator" instead
<wallyworld> or "CanHaz...."
<blahdeblah> A Brokeinator would totally pwn a Brokener or a CanHazBroken
<thumper> wallyworld, menn0, axw: https://github.com/juju/juju/pull/6402
<thumper> someone
<axw> looking
<wallyworld> anyone
<thumper> :P
<menn0> thumper: done
 * thumper off to jitz
<thumper> will land when I'm back
<thumper> after tweaking year
<menn0> wallyworld: another look at https://github.com/juju/juju/pull/6400 pls
<wallyworld> ok
<menn0> now with Breakable
<menn0> and I caught some tests that needed updating
<wallyworld> thank goodness
<wallyworld> menn0: still looks good, ta
<mup> Bug #1190985 changed: Confusing upgrade-charm and deploy -u behavior <deploy> <docs> <improvement> <upgrade-charm> <juju-core:Expired> <https://launchpad.net/bugs/1190985>
<mup> Bug #1521610 changed: Upgrade hung when moving from 1.18.4.3 to 1.24.7 <canonical-is> <juju-core:Expired> <https://launchpad.net/bugs/1521610>
<mup> Bug #1583683 changed: juju thinks it is upgrading after a reboot <juju-core:Expired> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1583683>
<mup> Bug #1190985 opened: Confusing upgrade-charm and deploy -u behavior <deploy> <docs> <improvement> <upgrade-charm> <juju-core:Expired> <https://launchpad.net/bugs/1190985>
<mup> Bug #1521610 opened: Upgrade hung when moving from 1.18.4.3 to 1.24.7 <canonical-is> <juju-core:Expired> <https://launchpad.net/bugs/1521610>
<mup> Bug #1583683 opened: juju thinks it is upgrading after a reboot <juju-core:Expired> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1583683>
<thumper> axw: I have a question about your review
<axw> thumper: yup?
<thumper> axw: I didn't change the time that RunHook took at all
<thumper> it was already returning cmd.Wait()
<thumper> all I do is save the output after the Wait call
<thumper> before we were discarding the output
<thumper> that's all
<axw> thumper: what I'm saying is that the client won't get to see the stdout/stderr of the hook until after it's finished
<thumper> ah
<thumper> hmm...
<axw> thumper: oh ... yeah, we weren't setting stdout/stderr
<axw> huh, how do we see the output
<thumper> isn't it the tmux session that is started?
<thumper> I'm not sure
<axw> thumper: ahhh yes
<axw> thumper: never mind me then
<thumper> how can I do QA on this
<thumper> I'd like to check before landing
<thumper> I don't think I've ever used debug-hooks in anger
<axw> thumper: umm. I guess create a charm with a config-changed hook that generates some stdout. then deploy it, debug-hooks, and force a config-changed
<axw> thumper: we don't actually run the hook at all ... we just trap and then you can run it yourself
<axw> thumper: so my review was garbage
<thumper> yeah...
<thumper> my problem is now that it is almost 7pm on friday
<thumper> and I have run out of fucks
<thumper> but I don't want to land something that could potentially screw things up
<thumper> so I need to test it
<wallyworld> axw: if you get a chance, could you look at this for me? after you recover from being hassled by thumper https://github.com/juju/juju/pull/6405
<axw> heh
<axw> wallyworld: sure
<wallyworld> ta
<axw> thumper: I think it's fine to land as is. a smoke test of debug-hooks would be good, but I think fine to skip in this case
<axw> thumper: I do think it would be a bit neater to pass Stdout/Stderr in as a param, but not fussed
<axw> (just so you don't have to reach in via export_test)
<thumper> I thought that RunHook might have been an interface function
<thumper> so signature mattered
<axw> thumper: it is. I meant pass in via the HookContext
<thumper> ah...
<thumper> ok
<thumper> yeah, that's better
<thumper> axw: reworked how to get the combined output, https://github.com/juju/juju/pull/6402
<axw> looking
<thumper> so no change to the code for non-test
<axw> thumper: looks better, thanks. approved
<thumper> axw: ta
<thumper> and on that note
<thumper> I'm off to start drinking
<thumper> later folks
<axw> enjoy
<wallyworld> axw: with error code, you never get a not found from that api - it is always ErrPerm if the model is not there
<axw> wallyworld: yeah I know, I'm just saying using that function would make it clearer why you're looking for that error
<axw> wallyworld: alternatively, please add a comment
<wallyworld> ok
<wallyworld> axw: also, i was trying to avoid having to add an api. ModelInfoResult does contain the number of machines in the model, but not number of applications. I could just print the number of machines
<wallyworld> or I guess I could add to the struct and the applications bit won't be there for rc1 and rc2
<axw> wallyworld: could we just make a call to FullStatus?
<wallyworld> we could. seems expensive though just to poll for info like that
<axw> wallyworld: on the client facade
<wallyworld> i'd prefer not to tbh
<axw> wallyworld: why? we want to know the model status...
<wallyworld> the thing we really want is for the command to block
<wallyworld> printing the number of machines and apps is a bonus
<wallyworld> imho
<wallyworld> i'm worried about calling full status every 2 seconds
<wallyworld> maybe it's nothing to worry about
<axw> wallyworld: what's your concern?
<axw> I don't think we should be concerned about getting status frequently
<wallyworld> sec
<wallyworld> status is fairly heavy weight last time i looked - it did a bit to gather all the info, and then all we would do is throw most of it away here simply to print a machine/app count
<wallyworld> if we do want to do that, we have a ModelInfo api with machine info
<wallyworld> we'd just need to add application count
<axw> wallyworld: I'm fine with that. and it would be ok if you only showed the machine count for an rc3 server
<wallyworld> on large models, status used to take minutes and minutes to run
<wallyworld> ok
<axw> wallyworld: that's a problem that we need to fix :)
<axw> if it's still a problem
<wallyworld> right, but for this bug fix, i'd rather use something lightweight. status might be fixed, but it is still heavy weight i think
<axw> yep, fair enough
<axw> wallyworld: sorry, but Controller.ModelStatus isn't suitable :(   it requires superuser
<axw> so you wouldn't be able to destroy your own models if you're non super user
<wallyworld> really? sigh
<wallyworld> axw: yeah, you're right. but i think we should change the perm on that method
<wallyworld> if you had permissions on the model, you should be able to get its status
<axw> wallyworld: fine by me. superuser || model-owner
<axw> or
<wallyworld> will have to fix after soccer
<axw> model admin or whatever
<wallyworld> at least i didn't need a new api
<axw> wallyworld: yeah. it does mean that rc3 would be brokenish, but I think that's ok
<wallyworld> i think so too
<wallyworld> we can release note it
<wallyworld> right, off to soccer, bbiab
<dooferlad> voidspace / babbageclunk: Small branch for review if you have the time: https://github.com/juju/juju/pull/6406
<babbageclunk> dooferlad: looking
<babbageclunk> dooferlad: lgtm
<babbageclunk> dooferlad: with a couple of comments
<macgreagoir> dooferlad: I guess there is no golang way to get the fs reserved space number, rather than a magic 90%. I won't look too hard if you already have. You might assume 5% in the fs, so 10% isn't a bad one.
<dooferlad>  macgreagoir: I am sticking to magic for now because it gets us out of the woods. It would be great to be able to pass in something on the CLI so the user can restrict / expand as wanted or do something based on monitoring the host.
<macgreagoir> tune2fs -l :-) but we don't want to assume that.
<macgreagoir> Assuming to reserve twice the fs default as reserved is not so bad.
<dooferlad> macgreagoir: I was talking about the Juju CLI
<dooferlad> macgreagoir: I know about the tune2fs thing and leaving some space on the host seems good too.
<macgreagoir> Oh, aye, I mean a way to get a non-magic number.
<voidspace> dooferlad: as part of provisioning a new machine (cloud-init) we reboot, right?
<dooferlad> voidspace: I don't think we reboot at the end of cloud-init
<voidspace> dooferlad: ah
<dooferlad> voidspace: pretty sure that cloud-init runs on first boot after install.
<voidspace> dooferlad: we'd like to up the open file limit, but that requires a reboot - unless we get into the image I guess
<dooferlad> voidspace: you can do that on the fly
<dooferlad> voidspace: just looking that up
<voidspace> dooferlad: not inotify user_watches or max_instances
<dooferlad> voidspace: sysctl -w fs.file-max=100000
<voidspace> dooferlad: ah, ok - interesting I'll try it
<voidspace> dooferlad: we've been looking at setting it in /etc/security/limits.conf which requires a reboot
<dooferlad> voidspace: I don't know which file in /etc/ wins on reboot. http://askubuntu.com/questions/594765/ubuntu-14-04-cant-get-past-4096-max-open-files-for-non-root-user has more info
<voidspace> dooferlad: thanks
<voidspace> appreciated
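The sysctl knobs voidspace and dooferlad are discussing can be read, and as root changed, at runtime without a reboot. A minimal sketch using standard Linux /proc paths (nothing Juju-specific here):

```shell
# Read the current limits (no root needed; these are standard Linux sysctls):
cat /proc/sys/fs/file-max
cat /proc/sys/fs/inotify/max_user_watches
cat /proc/sys/fs/inotify/max_user_instances

# As root, a runtime change takes effect immediately but is lost on reboot:
#   sysctl -w fs.file-max=100000
# To persist it, add "fs.file-max = 100000" to /etc/sysctl.conf (or a file
# under /etc/sysctl.d/) and run `sysctl -p`.
```

Note that /etc/security/limits.conf, mentioned above, is a separate mechanism: it sets per-process rlimits applied by PAM at session start, independent of these system-wide kernel sysctls.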
<voidspace> ooh, agent lost on 11 machines - I think I just hit the limit
<voidspace> past 20 containers though
<voidspace> ah no, they're back - just the machine grinding to a halt I think
<voidspace> I'm at 25
<perrito666> Morning
<voidspace> perrito666: morning!
<voidspace> at 30 containers I ran out of disk space
<rogpeppe1> this PR introduces a new field in api.Info, SNIHostName. reviews appreciated, thanks. https://github.com/juju/juju/pull/6407
<wallyworld> axw: i updated the api server to allow model admins or owners to make that status call. rc3 servers will not break the destroy as it will just print a message and continue without blocking
<perrito666> K doctor appointment bbl
<perrito666> well gotta love when doctors use old houses as practices.... LTE won't go through walls
<rick_h_> voidspace: ping for standup
<voidspace> rick_h_: omw
<rogpeppe> could someone review this please? i've got another one backed up behind it. https://github.com/juju/juju/pull/6407
<alexisb> perrito666, ping
<rick_h_> katco: can you peek at rogpeppe's link ^ please?
<katco> rick_h_: sure
<babbageclunk> voidspace, dooferlad, katco``: anyone feel like a review? https://github.com/juju/juju/pull/6408
<rick_h_> ty!
<babbageclunk> d'oh
<katco> rogpeppe: is this fixing a bug?
<alexisb> babbageclunk, that is why I was pinging perrito666 :)
<natefinch> rick_h_: btw, I forgot that updating the featuretests for the logout stuff was actually very easy, so I just did it.  landing now
<rick_h_> natefinch: <3
<rogpeppe> katco: no, it's an enhancement to make it easier to deal with controller with public certs
<natefinch> ahh crap, realized I need to get the change to persistent-cookiejar to land before the juju core stuff will work, oops.
<natefinch> rogpeppe: I made the suggested changes to persistent cookiejar, in a new PR from a better named branch: https://github.com/juju/persistent-cookiejar/pull/18/files
<rogpeppe> natefinch: ok, looking
<rogpeppe> natefinch: swap ya! https://github.com/juju/juju/pull/6407
<katco> rogpeppe: natefinch: i am also looking at this ^^
<rogpeppe> i need two reviews to land it :)
<natefinch> rogpeppe: deal
<rogpeppe> natefinch: i'm also working on getting people on board for general use of my new retry package API... (which I think is actually kinda cool :-]) https://github.com/juju/utils/pull/245
<natefinch> rogpeppe: we really need to move juju/utils/clock to a top level repo
<rogpeppe> natefinch: LGTM
<natefinch> rogpeppe: thanks
<rogpeppe> natefinch: yeah, probably
<rogpeppe> natefinch: although i don't think it's a big issue
<natefinch> rogpeppe: utils churn has been an issue in the past... backporting stuff in utils often requires one-off ugly branches and godeps.
<natefinch> rogpeppe: newWebsocketDialer0?
<natefinch> rogpeppe: are dialwebsocket and dialapi unchanged?  They moved, so it's hard to see the diff.
<rogpeppe> natefinch: they're pretty similar if not entirely unchanged. sorry about the move, but they really belong together.
<natefinch> oh, no, I see some diffs
<natefinch> mostly unchanged though
<rogpeppe> natefinch: i thought that "newWebsocketDialer0" vs newWebsocketDialer is better than the arbitrary newWebsocketDialer vs createWebsocketDialer
<rogpeppe> natefinch: i often use "0" as a "this is different but essentially the same thing" suffix
<katco> rogpeppe: review up
<rogpeppe> katco: ta!
<katco> i really dislike how comments break up the code... makes it really hard to read reviews
<rogpeppe> katco: "We need to instead take this in as a dependency to the call-chain." - does that mean every caller to NewServer has to manually wire that dependency up?
<katco> rogpeppe: that bubbles up to Open(...) doesn't it?
<rogpeppe> katco: oh yes, sorry, it does... every call to Open then
<katco> rogpeppe: it would fit nicely into DialOpts i think
<katco> rogpeppe: yes, that is what it means
<rogpeppe> katco: i really feel like it's very much an internal implementation detail
<katco> rogpeppe: if it were, then the code you're writing against it wouldn't have to patch a global variable
<rogpeppe> katco: it's an internal test
<katco> rogpeppe: tests are just code
<rogpeppe> katco: it's allowed to depend on internal implementation details
<rogpeppe> katco: for me, not exposing it is about encapsulation of implementation detail
<rogpeppe> katco: if we expose it, we can't change it any more
<katco> rogpeppe: sure we can, you just pass in something different
<rogpeppe> katco: clients become dependent on that particular detail of the implementation
<katco> rogpeppe: if it were an implementation detail, the code you've written wouldn't have to patch out a global variable
<rogpeppe> katco: ok, we can't change it without changing the clients
<katco> rogpeppe: yes, but the code you've written shows that different clients have different needs
<katco> rogpeppe: this is fundamental to IoC
<rogpeppe> katco: the test in question is a unit test. i think it's legimate for unit tests to depend on internal details that we would not want to expose outside the package.
<katco> rogpeppe: a test is just code, and in this case it's exposing a design flaw
<rogpeppe> katco: i don't think all tests should be external tests. that way lies hideously complex APIs
<katco> rogpeppe: i think that decision was already made
<rogpeppe> katco: really? *strictly* external tests?
<katco> rogpeppe: yeah
<rogpeppe> katco: i mean, i know we define tests as if they were external, but i didn't realise we weren't allowed to test any internal implementation detail directly
<katco> rogpeppe: i went through that pain and came out the other side agreeing with it a little more. i haven't run into a counter-example, but i don't doubt they're out there
<rogpeppe> katco: if that's really the case then i give up. i think it's a path to ruin.
<katco> rogpeppe: :( i don't know how to respond to that. i didn't like it either, but i pushed through
<rogpeppe> katco: so you really think we should add a field like this to api.DialOpts?
<rogpeppe> 	DialWebsocket func(cfg *websocket.Config, opts DialOpts) func(<-chan struct{}) (io.Closer, error)
<katco> rogpeppe: i am less sure about how it gets passed in, but DialOpts does look like the right spot as a place to specify "options to dial"?
<rogpeppe> katco: even though there are >70 instantiations of DialOpts in the code and they'll all need to change and websockets are just an implementation detail of the API that we might want to change in the future?
<katco> rogpeppe: how about this? create a 2nd function that calls the existing one with the correct function passed in?
<rogpeppe> katco: ?
<rogpeppe> katco: tbh rather than adding that i think it would be preferable to do a more integration-style test with a real server (but that's hard with code that relies on officially signed certs)
<katco> rogpeppe: i.e. rename Open => OpenWithWebsocketDialer(..., func()) and then modify Open to call OpenWithWebsocketDialer with the correct func?
<rogpeppe> katco: why would you want to export that function?
<katco> rogpeppe: integration tests can be done in the top level featuretests/ but not anywhere else
<rogpeppe> katco: that still exposes an implementation detail that really should be hidden inside the api package
<rogpeppe> katco: websockets are not essential to the api abstraction
<katco> rogpeppe: the code you have written shows that it is not an implementation detail
<rogpeppe> katco: you're saying that tests aren't allowed to test specific implementation details AFAICS
<rogpeppe> katco: and i disagree. tests of specific implementation details are often the ones that give you most confidence in the higher level code.
<katco> rogpeppe: they are certainly free to test the behavior, but if that behavior need be modified by patching out global variables, that's a design smell and proves that it's not an implementation detail
<rogpeppe> katco: you're saying they're free to check externally visible behaviour but nothing else.
<katco> rogpeppe: yes, the behavior of the thing you're testing, not how it's implemented
<rogpeppe> katco: i think there's room for both approaches. we often write tests knowing about the structure of the code we're testing.
<rogpeppe> katco: because otherwise you would never get decent test coverage and you'd write lots of redundant tests.
<rogpeppe> katco: i think this is that kind of case.
<rogpeppe> katco: patching out global variables is a smell, yes, but so is exposing abstraction-breaking details as part of your public API.
<rogpeppe> katco: as usual, there's a trade-off to be made.
<katco> rogpeppe: i've heard this argument a lot -- that it breaks encapsulation -- yes, by design it does. the code you've written has shown that it's a detail that should not be encapsulated
<rogpeppe> katco: you're assuming that a package's tests are exactly the same as any other client of the code.
<rogpeppe> katco: but as i've tried to say, they really aren't - they are necessarily privy to details of the code structure.
<katco> rogpeppe: not the same, but it is one of the first consumers
<rogpeppe> katco: that's very true, and it's great to write tests as much as possible to be strictly external.
<rogpeppe> katco: but i don't think that it's necessary to have a hard and fast rule there.
<rogpeppe> katco: in the end what really counts is the overally simplicity and maintainability of the code.
<rogpeppe> s/overally/overall/
<katco> rogpeppe: definitely
<rogpeppe> katco: and a big part of "simplicity" is simplicity of exported APIs.
<katco> rogpeppe: but, we tried the other way. our tests are extremely brittle, so a decision was made and we must abide by it
<katco> rogpeppe: no, that's "easy". simplicity is a different beast
<rogpeppe> katco: no, simple APIs are not "easy"
<rogpeppe> katco: FWIW making this an exposed external thing is likely to make code more brittle not less
<katco> rogpeppe: there are valid arguments against ioc, but i don't think lack of simplicity is one of them. lack of ease is i think
<rogpeppe> katco: ioc?
<katco> rogpeppe: inversion of control
<katco> rogpeppe: dependency injection
<katco> rogpeppe: please don't take this as dismissive, but i could sit here and enjoy this conversation all day, but i need to get to some other reviews/code. that's my review, i think we have decisions to back it. we can appeal to someone (?) if that helps
<rogpeppe> katco: for me, i have the most problem with exposing abstraction-breaking details in the API
<rogpeppe> katco: thanks for the review
<katco> rogpeppe: ta for the conversation
<perrito666> alexisb: sorry for the delay I was at the doctor being yelled at because I am fat and lazy
<alexisb> perrito666, :)
<alexisb> those darn doctors always concerned about your health
<alexisb> perrito666, I was going to ask you to review babbageclunk PR
<perrito666> alexisb: if it's still there I'll be glad to
<babbageclunk> perrito666: It's still there: https://github.com/juju/juju/pull/6408
<babbageclunk> alexisb: I haven't sorted out the http.Transport leaks yet - should I just make a binary now and push it to the OIL CI without a fix for that.
<babbageclunk> ?
<perrito666> babbageclunk: reviewed, lgtm but I would love to have a second pair of eyes there since it is a critical path
<babbageclunk> perrito666: Thanks - ok, I'll ping Menno about it, he knows that code.
<natefinch> sinzui: CI seems broken... it didn't post that my build had failed and it's not taking new builds, AFAICT.
<sinzui> natefinch: ci? I think you mean the bot lander.
<natefinch> sinzui: it's all CI to me
<natefinch> sinzui: but yes
<sinzui> the bot is a third-party tool that uses git. CI uses Launchpad, tarfiles and our machines.
<sinzui> natefinch: I am looking for the job the bot started; sometimes it fails so quickly it cannot call home. If so I will just requeue
<natefinch> sinzui: it should have failed to compile, I honestly didn't look at the output since I realize it would fail before I even saw it fail
<sinzui> oh, it doesn't compile
<sinzui> natefinch: yes, the compilation error prevented the call back. http://juju-ci.vapour.ws/view/Juju%20Ecosystem/job/github-merge-juju/9452/console. When this happens Add "Build failed: Tests failed" in a comment. That will let you requeue
<natefinch> oh neat, thanks
<sinzui> natefinch: I think I fixed this scenario. if Juju fails to build, it will call back. I will need to watch a few runs to be sure though
<natefinch> sinzui: even better.  Thanks again. always glad to help break things in new and interesting ways.
<sinzui> :)
<redir> smallish review anyone? https://github.com/juju/juju/pull/6399 PTAL...
<rick_h_> katco: ping, can I bug you please?
<katco> rick_h_: sure thing
<rick_h_> katco: can you join https://hangouts.google.com/hangouts/_/canonical.com/rick?authuser=1 please
<katco> rick_h_: this might help: https://github.com/juju/juju/blob/master/cloud/clouds.go#L24
<katco> rick_h_: we generate this file https://github.com/juju/juju/blob/master/cloud/fallback_public_cloud.go#L8
<katco> rick_h_: off the yaml file
<katco> rick_h_: so if go generate was not run when we built rc3, the file would be stale; not sure if i got the use-case/problem completely straight, but that seemed like a detail that might be important
<rick_h_> katco: yea, that's 'duplication' in the looks of what's in the code, but it's generated so that's less actual dupe than it seems I guess
<katco> rick_h_: right... the idea is to bring that information in at compile time rather than run-time by reading the yaml file
<rick_h_> katco: yea, just not sure if that's the peachiest thing imo. I understand folks think different there though so understand
<katco> rick_h_: moving down the chain of execution is usually better when you can do it. removes chance of errors at runtime, cost of reading every time, etc.
<natefinch> rick_h_: thoughts on what I should take next?
<rick_h_> natefinch: pick your favorite critical please
<rick_h_> natefinch: the streams one might be interesting but up to you
<natefinch> rick_h_: which one is the streams one?
<rick_h_> natefinch: generate-tools, second down on the right
<natefinch> rick_h_: I have no idea how the generate-tools stuff works, but I'm happy to look into it
 * redir lunches
<redir> must be friday afternoon:)
<alexisb> redir, heh
<redir> :)
<perrito666> god people, on irc so late on a long weekend friday? you have no life :p
<alexisb> :)
#juju-dev 2016-10-08
<redir> heh
<redir> almost eow
<perrito666> redir: go home
<redir> :)
<redir> `juju config <app>` should squirt out the config for that application right/
<redir> ?
<alexisb> redir, yes
<redir> hmmm, that doesn't seem to be happening on master
<redir> following https://github.com/juju/juju/pull/6205 which worked then, isn't working now. Unless I'm missing something.
<redir> I am using lxd instead of AWXS though
<alexisb> well redir that would be a bug
<alexisb> we should file one
<redir> right me looks at history
<redir> welp I was working on making --reset a string arg and that stuff, but it wasn't working. So now I'm trying to figure out when it stopped
<redir> but maybe my installation just needs to be tossed and reinstalled
<redir> shizzle
<redir> anyone here know if I can make lxc return just nams
<redir> names
<redir> I want to do something like for c in `lxc list --just-names`; do lxc restart name $c; done
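LXD can print just the container names via column and format flags (assumed flags: `lxc list -c n --format csv`, available in LXD 2.x). The loop below simulates that output so it runs without LXD installed:

```shell
# Real usage would be (assuming the flags above exist on your LXD version):
#   for c in $(lxc list -c n --format csv); do lxc restart "$c"; done
# Simulated container names stand in for the lxc output here:
names="juju-machine-0 juju-machine-1"
for c in $names; do
  echo "lxc restart $c"
done
```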
<redir> blew away work tree and rebuilt. config output is back:(
<redir> well that was an afternoon lost
<redir> and evening
<redir> me goes eow
#juju-dev 2016-10-09
<thumper> wallyworld, menn0: fyi, I'll miss the stand up today
<wallyworld> ok
<menn0> wallyworld: this brings back Ping: https://github.com/juju/juju/pull/6410
<wallyworld> looking
<menn0> wallyworld: adding IsBroken as a separate PR
<wallyworld> menn0: i think the jaas code has already been updated? do we need the Ping() or just IsBkoken?
<menn0> wallyworld: I don't know but given that it caused problems for them removing Ping may upset other projects too so I thought it best to bring it back at this stage
<menn0> wallyworld: the next PR will mark it as deprecated
<wallyworld> ok
<menn0> wallyworld: the preferred way to check a connection will be to use Broken() or IsBroken()
<wallyworld> indeed
<wallyworld> menn0: lgtm but a request to simplify the Ping() doc string
<menn0> wallyworld: ok thanks
 * thumper is sad looking at db contents
<thumper> so much duplication
<thumper> at least three different places we store the ca-cert
<thumper> and three places we store api-port and state-port
<anastasiamac> thumper: considering mongo is not really relational, i.e. a lot of denormalised data, duplication is not lethal.. i think we may just try to twist its arm too hard something
<anastasiamac> sometimes*
<wallyworld> menn0: axw: veebers: standup?
<veebers> wallyworld: sorry, just about to head out the door (I have existing appointments most days at this time :-\)
<wallyworld> no worries
#juju-dev 2017-10-02
<jamespage> hml: commented on https://bugs.launchpad.net/juju/+bug/1689683
<mup> Bug #1689683: Can't bootstrap openstack in some cases where compute AZ exist but not networking AZs <cpe-onsite> <new-york> <openstack-provider> <uosci> <usability> <juju:Triaged by hmlanigan> <https://launchpad.net/bugs/1689683>
<jamespage> maybe I'm missing something but I don't feel that you're trying to use the feature correctly
<hml> jamespage: agreed - we're not using the feature correctly - for that bug - we'd like to determine how we should use it.
<hml> jamespage: sounds like we should make no correlation between instance AZ and network AZ.  If using FIPs, and no external network specified, we should find an external network in the same network AZ as the network.
#juju-dev 2017-10-03
<hallyn> I have a juju client machine and vsphere server together behind a proxy.  so 'juju bootstrap vsphere vs1' should contact the vsphere server without proxy, but fetch over internet through proxy.  how do i tell juju to do that?
<hallyn> i've seen about per-model 'no-proxy', but can i do that for controller too?
<hallyn> hm, or does bootstrap bootstrap a model so i'm ok?
<hallyn> hm, nope
<hallyn> Hm, how does one change the hardware type for juju launched instances in juju+vsphere?
<hallyn> seriously there's an issue with the googlability of the docs :)  I always end up with docs listing commands that no longer exist like 'juju get-env'
<hallyn> ugh.  it's hard-coded in
<hallyn> which is a bug, technically, according to https://jujucharms.com/docs/2.1/help-vmware
<hallyn> that page says only hardware version 8 is required.
<hallyn> 10 requires esx 5.5.  8 is 5.0.  I have (mostly) 5.1
<hallyn> stokachu: ^
<hallyn> all right i'll email the guy who hardcoded that
#juju-dev 2017-10-04
<thumper> babbageclunk: how's things?
<babbageclunk> thumper: middling
<thumper> got time for a quick chat?
<babbageclunk> thumper: sure - 1:1?
<thumper> ack
<thumper> jam: ping
<jam> hi thumper
<thumper> jam: are you able to make our 1:1 tomorrow given NZ is now in DST?
<jam> sorry I missed tech board, time change threw me off
<jam> thumper: I will probably be slightly late because of bus drop off, but I think I can make it
<thumper> jam: ok
<wallyworld> jam: i'd love to be able to land this before we go to beta tomorrow if you had a chance to review https://github.com/juju/juju/pull/7906 ; it's an enhancement to the juju list-offers command
<jam> in standup now, will try to give it a look
<wallyworld> ok, thanks
<wallyworld> anastasiamac: can i have a review for a small PR? i'd like to land it before tomorrow's beta. it removes an unneeded status doc from the db https://github.com/juju/juju/pull/7907
<anastasiamac> wallyworld: of course u can... :D
<wallyworld> yay
<wallyworld> jam: thanks for review; i think you are right, it was confusing. i've updated the param names and help text - hopefully it is clearer now? could you PTAL?
<jam> wallyworld: what is read permission on an offer if not the right to consume it? You have the right to know the offer exists?
<wallyworld> correct
<wallyworld> it will show in search results
<wallyworld> we have read, consume, admin
<jam> wallyworld: lgtm
<wallyworld> jam: yay, tyvm
<wallyworld> thanks for asking for the clarifications
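The read/consume/admin levels discussed above could be sketched as a simple ordered hierarchy. This is a hypothetical illustration, not Juju's actual API; it assumes each level includes the rights of the levels below it, with read only granting discovery (the offer shows in search results), per wallyworld's description:

```go
package main

import "fmt"

// Access models the three offer permission levels mentioned in the
// discussion. Hypothetical sketch: assumes a strict hierarchy where
// admin implies consume, and consume implies read.
type Access int

const (
	ReadAccess    Access = iota // offer is visible, e.g. in search results
	ConsumeAccess               // may actually consume (relate to) the offer
	AdminAccess                 // may manage the offer
)

// Allows reports whether a granted level satisfies a required one.
func (a Access) Allows(required Access) bool {
	return a >= required
}

func main() {
	fmt.Println(ReadAccess.Allows(ConsumeAccess))  // false: read alone cannot consume
	fmt.Println(AdminAccess.Allows(ConsumeAccess)) // true: admin implies consume
}
```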
<wallyworld> thumper: we having release call?
<thumper> yeah
<mup> Bug #1721159 opened: partial sync when leader_id is empty, yielding inconsistent mirror <juju:New> <juju-core:New> <ubuntu-repository-cache (Juju Charms Collection):New> <https://launchpad.net/bugs/1721159>
<mup> Bug #1721159 changed: partial sync when leader_id is empty, yielding inconsistent mirror <juju:Incomplete> <juju-core:Won't Fix> <ubuntu-repository-cache (Juju Charms Collection):New> <https://launchpad.net/bugs/1721159>
<wpk> balloons: Do we have any juju CI tests on Artful?
<balloons> wpk, we do a deploy, and can target anything you wish
<mup> Bug #1721159 opened: partial sync when leader_id is empty, yielding inconsistent mirror <juju:Incomplete> <juju-core:Won't Fix> <ubuntu-repository-cache (Juju Charms Collection):New> <https://launchpad.net/bugs/1721159>
<wpk> balloons: with netplan merged I'd test artful as a host for container and as a container, in different combinations with xenial (artful on xenial, xenial on artful, artful on artful)
<balloons> wpk, we have some of what you are after
<balloons> artful on artful
<wpk> and it worked? (it shouldn't :)
<mup> Bug #1721159 changed: partial sync when leader_id is empty, yielding inconsistent mirror <juju:Incomplete> <juju-core:Won't Fix> <ubuntu-repository-cache (Juju Charms Collection):New> <https://launchpad.net/bugs/1721159>
#juju-dev 2017-10-05
<babbageclunk> thumper: can you review this please? https://github.com/juju/1.25-upgrade/pull/49
<thumper> ack
<babbageclunk> thumper: testing the final tweaks of the container systemd/upstart rewriting
<anastasiamac> wallyworld: thumper: PTAL - https://github.com/juju/juju/pull/7911
<wallyworld> ok
<anastasiamac> wallyworld: thumper: i think this is less confusing. tyvm :D
<anastasiamac> wallyworld: replied but m so happy u liking me fixing this one!
<wallyworld> looking
<wallyworld> just "2.3" could imply only "2.3" and not "2.3.minor"
<wallyworld> once we hit 2.3, there are no more betas etc
<wallyworld> just point releases
<anastasiamac> k, so just for clarity - 2 is major, 3 is minor, everything else is a "patch" :D
<anastasiamac> so maybe the phrasing can be along "2.3 point releases"?
<anastasiamac> wallyworld: ^^
<wallyworld> ok, i'm not married to point vs patch
<wallyworld> whatever sounds correct
<anastasiamac> wallyworld: or "this client can only bootstrap agents using any of the 2.3 Juju versions".... wordy but then avoids ambiguities...
<wallyworld> sgtm
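The major/minor/patch wording being hashed out above amounts to: a client matches any agent in the same major.minor series, whatever the patch component. A tiny sketch (the `Version` type and `SameSeries` helper are illustrative, not Juju's actual code):

```go
package main

import "fmt"

// Version is a hypothetical major.minor.patch triple, matching the
// terminology in the discussion: 2 is major, 3 is minor, and the rest
// is the patch (point-release) component.
type Version struct {
	Major, Minor, Patch int
}

// SameSeries reports whether two versions belong to the same
// major.minor series, e.g. any two 2.3.x point releases.
func SameSeries(client, agent Version) bool {
	return client.Major == agent.Major && client.Minor == agent.Minor
}

func main() {
	client := Version{Major: 2, Minor: 3, Patch: 0}
	fmt.Println(SameSeries(client, Version{2, 3, 7})) // true: a 2.3 point release
	fmt.Println(SameSeries(client, Version{2, 2, 4})) // false: different minor
}
```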
<thumper> wallyworld: I was thinking, we should skip our 1:1 in a few minutes and just have a joint one with jam later
<thumper> wallyworld: sound good?
<wallyworld> sgtm
<thumper> k
<wallyworld> anastasiamac: here's a micro PR to fix a test race https://github.com/juju/juju/pull/7912
<anastasiamac> wallyworld: looking
<wallyworld> ta
<anastasiamac> wallyworld: neat! i like explicit accessors and mutators by default :D
<wallyworld> that's your java background :-)
<anastasiamac> sh
<anastasiamac> wallyworld: or just pedantic "don't assume u know what i want, be explicit"
<anastasiamac> :D
<babbageclunk> thumper: what's a sensible default for when updated on a status record is nil? https://github.com/juju/1.25-upgrade/blob/master/juju1/state/migration_export.go#L1310
<babbageclunk> thumper: 0?
<thumper> hmm..
<thumper> no you don't want zero
<thumper> what does the rest of the record look like?
<babbageclunk> thumper: don't know, it's in IS
<babbageclunk> https://pastebin.canonical.com/199884/
<thumper> hmm..
<thumper> babbageclunk: perhaps defaulting it to the time of the export
<babbageclunk> thumper: yeah, that seems good to me.
<mup> Bug #1721555 opened: juju bootstrap node tries to reach manually added machines through private IPs instead of public <juju-core:New> <https://launchpad.net/bugs/1721555>
<mup> Bug #1721629 opened: No tools available via streams.canonical.com for 1.25.13 on i386 <juju-core:New> <https://launchpad.net/bugs/1721629>
<babbageclunk> thumper: take another look at https://github.com/juju/1.25-upgrade/pull/49 ? And also https://github.com/juju/1.25-upgrade/pull/50
<balloons> can I get a review for version bump? https://github.com/juju/juju/pull/7917
<mup> Bug #1721629 changed: No tools available via streams.canonical.com for 1.25.13 on i386 <juju:Triaged> <https://launchpad.net/bugs/1721629>
<babbageclunk> balloons: approved
<babbageclunk> thumper: hey thanks!
#juju-dev 2017-10-06
<anastasiamac> https://bugs.launchpad.net/juju/+bug/1690413
<mup> Bug #1690413: Add support for Google Cloud 'us-east4' region and future regions <juju:Triaged> <juju 2.2:Won't Fix> <https://launchpad.net/bugs/1690413>
<wallyworld> babbageclunk: i forgot to ask you about this one https://bugs.launchpad.net/juju/+bug/1717860
<mup> Bug #1717860: model migration from 2.2.4 to 2.2.4 fails <canonical-is> <new-york> <juju:Triaged by 2-xtian> <https://launchpad.net/bugs/1717860>
<wallyworld> i think we need to close it and raise a new bug if needed?
<wallyworld> the migration bit works IIANM
<babbageclunk> Yeah, I think so - there's something bad with performance afterwards, but that's something separate
<wallyworld> babbageclunk: did you mention you had seen the issue or knew of a possible cause, i can't quite recall?
<babbageclunk> wallyworld: For that one? No - I meant for migrations on his previously 1.25-upgraded models (missing settings that meant that the status history and action pruners were crashing). Not normal juju2-juju2 migrations
<wallyworld> ah ok
 * babbageclunk goes for a run
<thumper> o/ hazmat
<thumper> babbageclunk: I've updated the bundlechanges PR
<thumper> babbageclunk: not handling the diff case just now
<thumper> that is going to have to wait
<babbageclunk> thumper: makes sense - do you want me to approve the PR now?
<thumper> hang on
<thumper> I see an issue
<thumper> just pushed another fix to update the cmd package in it so CI is happy
<thumper> babbageclunk: now is good :)
<thumper> babbageclunk: also... I need to work out what to do about local charms
<thumper> working on the integration branch now
<babbageclunk> so hold off?
<thumper> um...
<thumper> perhaps a bit
<thumper> poo
<redir> how was NY?
