#juju-dev 2013-01-14
<jam> wallyworld__: welcome back! I hope you had a great break
<wallyworld__> jam: thanks, yes it was great. i hardly opened the computer at all. it was against the rules i was told by the better half
<jam> :)
<jam> did you travel at all?
<wallyworld__> jam: how was your break?
<wallyworld__> jam: yes, went north to mackay to visit a friend
<wallyworld__> went to the whitsunday islands etc, very nice
<jam> pretty good. No travelling, but took it easy, did lots of family stuff.
<wallyworld__> yeah, back to reality today :-(
<wallyworld__> jam: i have to go and get the car back from the repairer, may be a few minutes late for our call
<jam> wallyworld__: np
<wallyworld__> jam: hi, you free now?
<jam> wallyworld__: yeah, sorry I missed your message.
<TheMue> *: Morning.
<rogpeppe> fwereade_, jam, mgz, TheMue, wallyworld__: mornin' all
<jam> morning rogpeppe
<TheMue> rogpeppe: hiya
<wallyworld__> g'day
<fwereade_> rogpeppe, jam, TheMue, mgz, wallyworld__: mornings!
<rogpeppe> fwereade_: thinking about it last weekend, i definitely think that checking for state-fatal errors (and having workers be more independent) is the way to go
<fwereade_> rogpeppe, cool
<rogpeppe> fwereade_: i have an idea for how it might work actually
<fwereade_> rogpeppe, oh yes?
<rogpeppe> fwereade_: rather than making sure that errors need to be passed back preserving the "this was caused by a state-fatal problem" property, we could make the underlying operations on state check the error from mongo, and record a "was fatal" flag in the state if it found one
<rogpeppe> fwereade_: then we could just check the state to see if we should reconnect or not
<rogpeppe> fwereade_: when we get an error from a worker
<fwereade_> rogpeppe, do you recall what the problem was that prevented us from checking the state's liveness/sanity in the first place?
<fwereade_> rogpeppe, because I worry a little that setting broken state *everywhere* we return some mongo error may be somewhat burdensome and hard to do right
<rogpeppe> fwereade_: no, but i'm wary of another probe - particularly when we might have timeout errors
<fwereade_> rogpeppe, hm, reasonable, still not 100% sure that solution will cover us adequately
<rogpeppe> fwereade_: i don't think it's necessarily a problem - if we wrap Run we get a long way
<fwereade_> rogpeppe, mm, I'm more worried about the arbitrary gets of a doc here, a few fields there, etc
<rogpeppe> fwereade_: basically, i think we can wrap a few mgo methods to check the error, and i *think* that'll be sufficient
<fwereade_> rogpeppe, yeah, maybe we're restricted to something like  FindCount, FindOne, FindAll, FindFieldsOne, FindFieldsAll
<fwereade_> rogpeppe, interesting
<rogpeppe> fwereade_: yeah, that's the idea
<fwereade_> rogpeppe, tentative SGTM then
<rogpeppe> fwereade_: cool, we'll see.
<rogpeppe> fwereade_: just gotta reboot
<dimitern> jam: ping
<fwereade> rogpeppe, TheMue, dimitern, jam, mgz: btw I have a couple more doc drafts in review: https://codereview.appspot.com/7094052/ and https://codereview.appspot.com/7095049/
<fwereade> all lifecycle-related and mostly accurate
<fwereade> but the service destruction stuff is speculative -- it already exists in slightly different form, but needs to be reworked as agreed elsewhere
<rogpeppe> fwereade: will have a look in a bit. thanks.
<TheMue> fwereade: *click*
<aram> moin.
<TheMue> aram: moin
<fwereade> TheMue, what's the status of 006-state-retry-delay?
<TheMue> fwereade: I'll get an answer from Dave tomorrow about why he prefers a different solution.
<TheMue> fwereade: He has it on his todo but didn't find time for it today, so he sent me a little note.
<fwereade> TheMue, ok, cheers
<fwereade> rogpeppe, 181-mgo-show-log has been approved for a while, is it ok?
<rogpeppe> fwereade: hmm, i'd forgotten about that
<rogpeppe> fwereade: will submit
<fwereade> rogpeppe, cool
<rogpeppe> fwereade: thanks - i should really go through my outstanding CLs some time!
<niemeyer> Gooood morning all!
<dimitern> niemeyer: hey, morning :) happy new year and congrats on the baby!
<niemeyer> dimitern: Thanks, and happy new year too
<fwereade> niemeyer, heyhey!
<fwereade> niemeyer, happy new year and many congratulations!
<niemeyer> William!
<fwereade> niemeyer, you had any sleep yet? ;p
<niemeyer> fwereade: Kind of :-)
<fwereade> niemeyer, cool, treasure it ;)
<TheMue> niemeyer: Oh, hello, and a happy new year and grats to the new born child from my side too.
<niemeyer> fwereade: Ale is obviously suffering a bit more as every other hour he demands her attention
<fwereade> niemeyer, yeah, it can all get a bit much
<niemeyer> TheMue: Heya! Thanks!
<TheMue> niemeyer: Even if the first weeks are hard enjoy them. Little children are a precious wonder, and getting a smile opens the heart.
<niemeyer> TheMue: Yeah, we're definitely enjoying it a lot
<TheMue> niemeyer: For us this time is now so long ago, but still each day is different and interesting. Currently we're discussing ideas about jobs, and our older daughter now has a driver's licence (and we have no car anymore, she's using it all the time). :D
<rogpeppe> niemeyer: yo!
<rogpeppe> niemeyer: brilliant news, BTW, i was so happy when i saw your G+ announcement.
<niemeyer> rogpeppe: Hey man!
<niemeyer> rogpeppe: Thanks for sharing the joy
<hazmat> niemeyer, g'morning, and congratulations!
<niemeyer> hazmat: Morning, and thanks!
<TheMue> lunchtime
<rogpeppe> fwereade: https://codereview.appspot.com/7096054
<fwereade> rogpeppe, cheers
<rogpeppe> niemeyer: you might wanna take a look too - there's a new package that hopefully you won't consider too crackful :-)
<niemeyer> rogpeppe: I'll start from the bottom up
<rogpeppe> niemeyer: where's the bottom? :-)
<niemeyer> rogpeppe: I'll have a look at the changes done since I went out
<rogpeppe> niemeyer: good plan
<rogpeppe> fwereade: what's your intended audience for the docs? much of the death-and-destruction document seems to me like it might live more happily as comments inside the state package.
<fwereade> rogpeppe, yeah, I had that thought too
<rogpeppe> fwereade: in particular, talking about transactions seems like it's really very implementation-specific. i'd thought that the docs were to be higher level.
<fwereade> rogpeppe, the trouble with the state docs is that they are by their nature distributed across files and it's hard to explain things clearly that way
<rogpeppe> fwereade: perhaps an overview doc comment in state.go might be the way forward
<rogpeppe> fwereade: i'd like to see some docs on death and destruction that will still be valid after we've moved over to using the API
<fwereade> rogpeppe, but the real purpose of that CL is to get people either agreeing or disagreeing with what's there, because I feel like the state package is approaching maturity
<fwereade> rogpeppe, will those not still be valid?
<fwereade> rogpeppe, the same stuff will happen, but in a different place, surely?
<rogpeppe> fwereade: i'm just not sure how helpful it is to see the implementation details (e.g. "Removing a unit involves *two* transactions, defined as follows:") when what we're trying to do, i think, is document what we're aiming towards - the externally visible behaviour rather than the details of how we make that happen.
<rogpeppe> fwereade: i'm not saying that it's not good to have implementation details somewhere, but i *think* that closer to the source is better for those.
<fwereade> rogpeppe, I need at least one other person to analyze my plans at this level of detail
<rogpeppe> fwereade: i think that's a great plan, but i think that should be either an email to juju-dev or a doc comment in state. i still think it's worth having a document that summarises at a high level (and hopefully very simply) our approach to death and destruction.
<fwereade> rogpeppe, I think the line-by-line comments make a CL a good discussion point; much of it may go well in doc comments, but to evaluate it all properly you do kinda need to read it all and check it's internally consistent
<rogpeppe> fwereade: ok, i'll review it as if it were a doc comment
<fwereade> rogpeppe, last time I tried a braindump of this nature, doc/draft was suggested as an appropriate landing point
<rogpeppe> fwereade: well, if others differ, i'm happy to defer to them. this was just my initial reaction when going through the doc.
<fwereade> rogpeppe, cool -- I think there's a lot of value to having that information all collected in one place
<rogpeppe> fwereade: it does seem very implementation-specific to me, something that would live happily in state, as you don't need to know any of those details externally to state.
<rogpeppe> fwereade: hence my suggestion for putting it in state as a doc comment
<rogpeppe> fwereade: or is there more of a cross-package nature to the doc that i'm not seeing?
<fwereade> rogpeppe, it's not cross-package, and it's definitely all about state, but it's a complex-enough topic that I think distributing the docs across comments for various methods on various types will be counterproductive
<rogpeppe> fwereade: that's not what i'm suggesting
<rogpeppe> fwereade: i'm suggesting that we could have it as a single comment/.
<rogpeppe> fwereade: maybe even in its own file, doc.go
<fwereade> rogpeppe, I think that's a place for API docs, which are not quite the same thing
<fwereade> rogpeppe, but I may be wrong
<rogpeppe> fwereade: we have many comments about the internal implementation too; they're not visible with godoc, but they're still useful when working on state.
<rogpeppe> fwereade: this doc seems like it comes into a similar category
<fwereade> rogpeppe, perhaps so -- I'm basically treating doc/draft as a sort of holding area for things that I feel ought to be written down
<fwereade> rogpeppe, writing that stuff down was important for coming to understand it all better myself
<rogpeppe> fwereade: fair enough. i hadn't realised that - to my shame this is the first of your drafts i've looked at, and i was presuming it was all higher level stuff.
<fwereade> rogpeppe, no worries
<fwereade> rogpeppe, the original plan for the first stuff that went in there was "give the technical writer something to start with"
<rogpeppe> aram, TheMue, dimitern, jam, mgz: any chance of a second look at this CL? https://codereview.appspot.com/7096054
<dimitern> rogpeppe: looking
<rogpeppe> dimitern: ta!
<jam> rogpeppe: I'm *really* not a fan of global state like that mutex. Is it possible to put it struct local?
<jam> (*c = *nc) is a bit scary, but I sort of understand why.
<rogpeppe> jam: i agree in principle, but in this case, i really want to keep the struct with all-exported fields. also, i want it to work even if two agents are using different config structs that point to the same datadir/entity
<rogpeppe> jam: i *could* have a global mutex per dataDir/entity, but that would be overkill
<rogpeppe> jam: (and i'd still need a global mutex to guard access to that!)
<dimitern> rogpeppe: LGTM
<rogpeppe> dimitern: thanks
<dimitern> fwereade: +1 on https://codereview.appspot.com/7094052/
<fwereade> dimitern, tyvm
<rogpeppe> fwereade, niemeyer: i think we need a collection in the state that holds entity-name/password pairs (as distinct from mgo user/password pairs, currently used, which will eventually only be used by the API servers). i thought about calling it "users". how does that seem to you?
<niemeyer> rogpeppe: Hard to tell without background
<rogpeppe> niemeyer: ok
<fwereade> rogpeppe, seems to echo my own ill-formed thoughts but I don't think I can justify it to myself yet, would like to hear more
<rogpeppe> niemeyer: so... eventually, when everything possible is going through the API, we don't want a random agent to be able to connect directly to the state
<niemeyer> rogpeppe: Indeed
<rogpeppe> niemeyer: so our current approach of storing mongo user/passwords won't fly, so we need to store the passwords somewhere else.
<rogpeppe> niemeyer: and inside the state seems like the most reasonable (only possible?) approach
<niemeyer> rogpeppe: Ideally we should never store passwords themselves
<rogpeppe> niemeyer: sure, we'll store them hashed
<niemeyer> rogpeppe: s/them hashed/their hashes/ yes
<rogpeppe> niemeyer: i'm thinking that the current SetPassword will end up renamed to (perhaps) SetMongoPassword
<rogpeppe> niemeyer: and that SetPassword will operate on the mongo collection
<rogpeppe> niemeyer: then when the provisioner wants to allocate a new machine that's been assigned the APIServerJob job, it will additionally add a password for the new agent using SetMongoPassword
<rogpeppe> niemeyer: so that the new machine agent can access the state directly.
<niemeyer> rogpeppe: Not entirely sure about that
<niemeyer> rogpeppe: Who would use unit.SetMongoPassword?
<rogpeppe> niemeyer: the provisioner
<rogpeppe> niemeyer: (only)
<niemeyer> rogpeppe: Why?
<niemeyer> rogpeppe: The unit should never connect to the state
<niemeyer> rogpeppe: To mongodb, that is
<rogpeppe> niemeyer: i thought we had agreement that the API server would be just another job that the machine agent could run
<rogpeppe> niemeyer: (and i've been proceeding on the basis of that understanding)
<niemeyer> rogpeppe: That seems completely unrelated to what I just said
<rogpeppe> niemeyer: ah, sorry, i'd missed the "unit" bit
<rogpeppe> niemeyer: yes, we wouldn't have that method
<rogpeppe> niemeyer: just Machine.SetMongoPassword
<rogpeppe> niemeyer: although we'd leave Unit.SetMongoPassword around for the time being, until the unit no longer needs to connect directly to the state.
<niemeyer> rogpeppe: +1
<rogpeppe> niemeyer: cool, thanks
 * fwereade -> lunch
<rogpeppe> fwereade: enjoy!
<rogpeppe> jam: are you ok with this CL after my explanation? i won't submit unless you are. https://codereview.appspot.com/7096054
<hazmat> are the ostack and ec2 public containers global across accounts?
<rogpeppe> hazmat: yes
<rogpeppe> hazmat: well, the ec2 public containers are
<rogpeppe> hazmat: (that's kinda the point)
<hazmat> we're looking for a fallback location to publish cloud image streams/data; istm that those locations might be reasonable choices then, if they're global to the provider in question.
<dimitern> rogpeppe: but the OS ones seem account-dependant
<jam> rogpeppe: how about a comment that we use a global because what we are actually locking is the filesystem paths. Though arguably you need an on-disk lock for that.
<dimitern> rogpeppe: I can have the same container as another user and, using ACLs, both can be made public, with different URLs
<jam> rogpeppe: otherwise, I'm certainly not -1 on it. So you can land it.
<rogpeppe> jam: yeah, i'm assuming that we only care about concurrent use within a given agent, which means a lock inside the agent is sufficient
<rogpeppe> dimitern: hmm, how's the juju public-bucket stuff done then?
<rogpeppe> dimitern: ('cos there we really want to avoid uploading tools every time)
<rogpeppe> jam: but i'll add a comment. thanks for the feedback.
<dimitern> rogpeppe: well, wallyworld__ or mgz can tell you more details, but what we agreed upon is to have a public container (with ACLs) per region for the tools in OS
<dimitern> rogpeppe: and specify its full URL in environments.yaml
<rogpeppe> dimitern: that seems like what i'd expected. global across accounts?
<rogpeppe> dimitern: ah, so rather than specifying a bucket name, we specify the URL.
<rogpeppe> dimitern: that seems fine too.
<dimitern> rogpeppe: depends on how you define "global" - definitely they're not in a single namespace
<mgz> rogpeppe: the goal will be what we talked about in denmark, with a tool to mirror the tools from ec2 to your openstack deployment
<mgz> but we'll also support per-tenant tools, and that's what's going to work first.
<rogpeppe> dimitern: it still sounds like a reasonable place for hazmat's need, i'd guess though.
<rogpeppe> mgz: sounds good
<hazmat> dimitern, it's not a single namespace, but it is a url/container that can be referenced across accounts for tool access, which would suffice for also storing/publishing the image stream/data.
<dimitern> rogpeppe: in this sense "global", as in one for all environments running on canonistack region lcy01, yes they are global, but still a specific user/tenant account creates it
<rogpeppe> dimitern: true of ec2 too, although the bucket names themselves are global too.
<dimitern> hazmat: yeah, provided we have easy means (tools) to update and sync the tools in the global location (per region and per OS installation)
<rogpeppe> on ec2, that is
<dimitern> what I'm doing now is making juju bootstrap work (hopefully :) ) on canonistack with goose - using this global container url
<dimitern> mgz: is there a need for both public-bucket and public-bucket-url in the config?
<mgz> dimitern: public-bucket is not strictly needed, but we'd hardcode a string for it otherwise
<dimitern> mgz: not only is it not needed, I don't see it used - it's just confusing
<dimitern> mgz: maybe at least a TODO or something would be nice
<mgz> we'll use it for the uploading of tools
 * rogpeppe goes for some lunch
<dimitern> mgz: isn't that what the control bucket is used for?
<dimitern> mgz: I see it now - in provider.SetConfig() - but I'm confused - why do we need 2 buckets now?
<fwereade> dimitern, because we generally want people not to have to build and/or upload their own tools: they can get theirs from the public bucket
<fwereade> dimitern, but we also sometimes want to deploy our own tools, eg with juju bootstrap --upload-tools
<fwereade> dimitern, and so we need to use our own private bucket for those
<fwereade> dimitern, the control bucket also holds the charm bundles IIRC
<dimitern> fwereade: so the control bucket (private one) always has either the tools from the public bucket or the ones we upload with --upload-tools specifically
<fwereade> dimitern, I can't remember if we copy public tools into the private bucket; rogpeppe would probably know
<dimitern> fwereade: and if we don't we just copy the tools from the public bucket to the control bucket on bootstrap?
<dimitern> fwereade: I see
<fwereade> dimitern, I *think* so, let me check something
<fwereade> dimitern, environs/cloudinit/cloudinit.go:105
<fwereade>         fmt.Sprintf("wget --no-verbose -O - %s | tar xz -C $bin", shquote(cfg.Tools.URL)),
<fwereade> dimitern, hm sorry that's not so helpful
<fwereade> dimitern, rogpeppe will remember the details of how they're distributed
<dimitern> fwereade: I'm looking at cloudinit.go - so this happens after or during bootstrap?
<fwereade> dimitern, ok, look quickly at environs/ec2/ec2.go:234
<fwereade>     if uploadTools {
<fwereade> dimitern, tools comes from either PutTools or FindTools
<dimitern> fwereade: sorry, my router went berserk again
<fwereade> dimitern, np, so did mine
<fwereade> I didn't actually say much, what was the last thing you saw?
<dimitern> fwereade: I'm looking at cloudinit.go - so this happens after or during bootstrap?
<fwereade> dimitern, the cloudinit file is used to bootstrap each instance, whatever it will end up running, and one of the things you see it doing there
<fwereade> dimitern, is wgetting the tools from the *Tools that was passed in
<dimitern> fwereade: but how does it get called?
<fwereade> dimitern, that *Tools comes from environs/ec2/ec2.go:234...
<dimitern> fwereade: ah, now I see
<fwereade> dimitern, where it does either PutTools (to upload) or FindTools (to pick the right sort); when FindTools is subsequently used, it checks for PutToolsed tools before falling back to the public bucket
<fwereade> dimitern, but it doesn't do any copying into the control bucket that I can see except when it's uploading the local tools
<dimitern> fwereade: yeah, that's it then
<rogpeppe> back
<rogpeppe> fwereade: yup, that's right
<rogpeppe> mechanical rename CL; trivial if you agree with the rename. https://codereview.appspot.com/7098053
<rogpeppe> fwereade, TheMue, aram, niemeyer: ^
<TheMue> rogpeppe: *click*
<rogpeppe> niemeyer, fwereade: i was thinking that, rather than have a separate user/password collection, the best place to store the passwords would be in the entity docs themselves. does that make sense to you?
<niemeyer> rogpeppe: I still feel it's slightly awkward to expose Mongo at that level of the API, but this feels like a bike-sheddy discussion that won't drive us anywhere. +1 on moving forward with this and reducing impact later.
<rogpeppe> niemeyer: we could call it CoreDBPassword or something
<niemeyer> rogpeppe: It's fine as it is for now.. it'll be more useful to get rid of the method where possible ASAP
<rogpeppe> niemeyer: yeah
<niemeyer> rogpeppe: Regarding storing the password in the entity doc, sounds reasonable in principle (without looking at details)
<rogpeppe> niemeyer: the only issue that i can see is that the state open process will want a way to map from entity name to entity, so it can check the password
<rogpeppe> niemeyer: but if you're ok with that, i think there's a clear path
<niemeyer> "state open process"?
<rogpeppe> niemeyer: sorry, state.Open
<niemeyer> Ah
<niemeyer> rogpeppe: I think it's fine
<TheMue> rogpeppe: You've got a review
<niemeyer> rogpeppe: We'll eventually need that anyway to be able to constrain access on a per-entity basis
<rogpeppe> niemeyer: i'm thinking func (st *State) entity(entityName string) (entity, error); type entity interface {SetPassword(pass string) error; HashedPassword() string; possible other methods}
<rogpeppe> niemeyer: perhaps Entity instead of entity
<niemeyer> rogpeppe: Why do we need that?
<rogpeppe> niemeyer: that's a way to map from entity name to entity
<niemeyer> rogpeppe: I now perceive I don't understand what you meant
<rogpeppe> niemeyer: inside state.Open, we need a way to find the hashed password for an entity given an entity name
<niemeyer> rogpeppe: and why is that non-trivial?
<rogpeppe> niemeyer: i guess state.Open could dive directly into the entity doc
<rogpeppe> niemeyer: perhaps that would be better - is that what you're thinking?
<niemeyer> rogpeppe: Yeah, feels like simple switch statement
<rogpeppe> niemeyer: i was thinking of a simple switch statement actually: switch entityKind {case "unit": return st.Unit(name); case "machine": return st.Machine(name) }
<rogpeppe> niemeyer: and Unit and Machine will already implement the methods in the entity interface
<rogpeppe> niemeyer: so there's no need to spread any doc-specific code outside of the entities themselves
<niemeyer> rogpeppe: The only difference is the collection name.. no need to even load the entities
<niemeyer> rogpeppe: But whatever.. it's fine either way
<niemeyer> rogpeppe: Your implementation.. I agree with the idea
<rogpeppe> niemeyer: ok, i'll go with this for the time being. i'm not sure if the performance difference would be significant.
<fwereade> dimitern, or not? ;p
<dimitern> wow
<rogpeppe> TheMue: thanks!
<dimitern> i just got home and tried and it seems back on
<TheMue> rogpeppe: yw
<dimitern> fwereade: don't send it if you haven't, 10x
<fwereade> dimitern, already sent :)
<dimitern> fwereade: np, I'll reply
<fwereade> rogpeppe, https://codereview.appspot.com/7098053/ LGTM
<rogpeppe> fwereade: tyvm
<fwereade> rogpeppe, can I redraw your attention to https://codereview.appspot.com/7092044/ and https://codereview.appspot.com/7094045/ in particular please? ;)
<rogpeppe> fwereade: looking now, thanks for the reminder
<fwereade> rogpeppe, there are more, if you're of a mind, but I know you have to maintain balance ;)
<rogpeppe> fwereade: i think i'll do just those two for today if that's ok. perhaps i'll have a review morning tomorrow.
<fwereade> rogpeppe, sgtm
<rogpeppe> fwereade: ping
<fwereade> rogpeppe, pong
<rogpeppe> fwereade: i'm looking at this expression in your CL: ru.st.units.Find(selSubordinate).Select(lifeFields)
<fwereade> rogpeppe, oh yes
<rogpeppe> fwereade: can that ever result in more than one result, and if not, why not?
<fwereade> rogpeppe, I believe not, because of the check in the unit addition ops
<fwereade> rogpeppe, (well, it can, but only if you use AUST)
 * rogpeppe has forgotten what AUST stands for
<fwereade> rogpeppe, (the unit add op asserts that the principal has no subordinate unit with a name starting with "svcname/")
<fwereade> rogpeppe, AddUnitSubordinateTo
<rogpeppe> fwereade: of course, ta
<rogpeppe> fwereade: hmm
<rogpeppe> fwereade: can't the unit have several existing subordinate units?
<fwereade> rogpeppe, but only one of each service
<rogpeppe> fwereade: why does that make a difference?
<rogpeppe> fwereade: is the above Find selecting only subordinate units from one service?
<fwereade> rogpeppe, selSubordinate := D{{"service", serviceName}, {"principal", unitName}}
<fwereade> rogpeppe, (sorry line break)
<rogpeppe> fwereade: ok so (pardon my lack of understanding here) what's stopping RelatedEndpoints returning more than one end point?
<fwereade> rogpeppe, the fact that we can only create relations with 1 or 2 endpoints
<rogpeppe> fwereade: ah, ok.
<rogpeppe> fwereade: i guess i'm trying to see the code path that happens when the principal unit has more than one dying subordinate unit
<fwereade> rogpeppe, only one subordinate unit is ever relevant for a scope entry
<fwereade> rogpeppe, the unit of the other service in the relation
<rogpeppe> fwereade: i think this comes down to my fundamental difficulty in grasping exactly what a RelationUnit is :-)
<fwereade> rogpeppe, it controls a single unit's effects on a particular relation
<fwereade> rogpeppe, in fact, better
<fwereade> rogpeppe, it's the interface a relation exposes to a unit (roughly, with caveats and handwaving)
<rogpeppe> fwereade: and a relation is between two services only?
<fwereade> rogpeppe, two or one
<rogpeppe> fwereade: yeah
<fwereade> rogpeppe, but a subordinate relation will only be between 2
<rogpeppe> fwereade: ok, i'm starting to get it better now, thanks
<fwereade> rogpeppe, this may, one day, change, but I don't think it's yet on the horizon
<fwereade> rogpeppe, cool
<rogpeppe> fwereade: you've got a pair of reviews
<niemeyer> mramm: Is there a consistency meeting today?
<dimitern> niemeyer: the weekly (double) meeting is tomorrow as usual, if you refer to this one
<niemeyer> mramm: Nope, it's a different one
<niemeyer> robbiew: Meeting today?
<robbiew> niemeyer: nah...client sprint this week
<niemeyer> robbiew: Ah, that's why I thought the Austin sprint started on the 13th
<robbiew> :)
<rogpeppe> fairly simple CL: https://codereview.appspot.com/7105054
<rogpeppe> and that's me for the day. see y'all tomorrow!
<hazmat> niemeyer, will you be @ austin?
<niemeyer> hazmat: Nope
<niemeyer> rogpeppe: Have a good one
<hazmat> niemeyer, bummer, but understood
<niemeyer> hazmat: Cesarean and all.. wouldn't feel right
<hazmat> niemeyer, ouch.
#juju-dev 2013-01-15
<TheMue> davecheney, fwereade_, rogpeppe: Morning
<fwereade_> TheMue, everybody, heyhey
<rogpeppe> morning all!
<TheMue> rogpeppe: hiya
 * TheMue enjoys installing a fresh environment to run MAAS in KVM
<aram> moin.
<dimitern> morning!
<TheMue> aram, dimitern: hiya
<dimitern> TheMue: yo :)
<rogpeppe> fwereade_: you have reviews
<rogpeppe> fwereade_: i'd appreciate a look at https://codereview.appspot.com/7105054/
<fwereade_> rogpeppe, cool, thanks; I'll take a peek
<fwereade_> rogpeppe, LGTM
<rogpeppe> fwereade_: thanks!
<fwereade_> rogpeppe, I need to take a break for a bit... would you let http://paste.ubuntu.com/1533914/ roll around your mind for sanity for a bit?
<rogpeppe> fwereade_: looking
 * jam goes to grab a coffee, will be back for standup in 2 min
<jam> mgz: poke
<mgz> hey
<jam> for great mumbling!
<mgz> I'm on.
<fwereade_> niemeyer, heyhey
<niemeyer> fwereade_: How's tricks? :)
<fwereade_> niemeyer, good thanks; but a massive storm has just blown up here, somewhat to my surprise
<niemeyer> fwereade_: Ugh
<niemeyer> fwereade_: I guess you have good power rails there :)
<rogpeppe> niemeyer: hiya!
<fwereade_> niemeyer, it usually messes with the wifi more than the power but power cuts are not unheard of here
<fwereade_> holy crap, hail
<TheMue> lunchtime
<niemeyer> fwereade_: Wow
<fwereade_> niemeyer, and now it's just eerie silence
<dimitern> fwereade_: I'm always hoping for a good, massive hailstorm, just to see how bad it gets here, but alas - nothing worthy so far :D
<fwereade_> dimitern, yeah, that wasn't all that much to write home about on reflection
<fwereade_> dimitern, but still pretty unexpected, given the beautiful sunshine not half an hour before
<fwereade_> dimitern, I think I just heard thunder though
<dimitern> fwereade_: yeah, this here is like mountain weather - 5m and it's completely different
<fwereade_> rogpeppe, any thoughts re sanity of that paste?
<rogpeppe> fwereade_: what's the difference between Destroy and Remove?
<fwereade_> rogpeppe, Destroy is the method everything has; it might set Dying or it might remove
<fwereade_> rogpeppe, Remove only applies to units/machines because they're the only ones that follow the path we originally understood
<fwereade_> rogpeppe, Dying services and relations are removed as side-effects of either Unit.Remove or RelationUnit.LeaveScope
<fwereade_> rogpeppe, did you get a chance to take a look at death-and-destruction.txt?
<rogpeppe> fwereade_: yes, i did, although i wasn't sure how much is what we've got now, and how much is plans for the future.
<fwereade_> rogpeppe, everything except how services are done, which was agreed before xmas and which I'm working on now
<rogpeppe> fwereade_: BTW why might Remove of a unit also remove its service? don't we allow services with no units?
<fwereade_> rogpeppe, if the service is Dying, then the disappearing unit might be the last reference to the service
<rogpeppe> fwereade_: ah, of course
<rogpeppe> fwereade_: the paste looks reasonable to me
<rogpeppe> fwereade_: it'll be worth discussing this with niemeyer, since he's around
<fwereade_> rogpeppe, quite so; niemeyer, I'd like to have an API discussion with you when you have some free time
<niemeyer> fwereade_: Sure, any time
<niemeyer> fwereade_: Next meeting is in an hour
<fwereade_> niemeyer, I'd quite like a bite to eat before then -- could we do it right afterwards please?
<fwereade_> niemeyer, (or, hmm, will that hit your lunchtime?)
<niemeyer> fwereade_: Sounds good
<fwereade_> niemeyer, ok, cool, thanks :)
<niemeyer> fwereade_: It might, but it's okay either way.. worst case we do when I'm back
<rogpeppe> fairly simple CL if anyone cares to have a look:  https://codereview.appspot.com/7085062
<niemeyer> rogpeppe: Looking
<rogpeppe> niemeyer: thanks
<dimitern> i keep getting error: environment "canonistack" has an unknown provider type "openstack" - what should I do to fix this? I tried go installing cmd/juju to make sure it was compiled with the latest source, still no luck
<dimitern> rogpeppe, fwereade_- any hints? ^^
<rogpeppe> dimitern: juju needs to import the openstack provider
<rogpeppe> dimitern: import _ "launchpad.net/juju-core/environs/openstack"
<dimitern> rogpeppe: so we need that for all providers? even though they are registered?
<rogpeppe> dimitern: they can only register if they're imported
<rogpeppe> dimitern: (by at least one thing)
<dimitern> rogpeppe: I see, 10x
<jam> TheMue: https://wiki.canonical.com/Launchpad/MAASTesting I'm not sure if you knew of that one, but it should have some stuff about the Lenovo QA Lab for MaaS
<TheMue> thx
<jam> TheMue: I think rvba in #maas would probably be able to help out on questions/access information to the lab.
<jam> aram: ^^ probably you're interested in that one as well (if you haven't seen it already)
<niemeyer> fwereade_: Sorry, net hiccup
<fwereade_> niemeyer, np, hangout is still open
<niemeyer> fwereade_: Are you still in the same hangout?
<niemeyer> fwereade_: Joining
<rogpeppe> wallyworld__: shall we try to sort out the public storage thing?
<wallyworld__> rogpeppe: would love to but it's almost midnight here and i'm tired; i'll have another look tomorrow to try and see how what i've done may be different to how ec2 does it, and will ask you if i need to
<rogpeppe> wallyworld__: ok, np
<wallyworld__> thanks for asking
<rogpeppe> wallyworld__: FWIW in the ec2 provider, the public storage is implemented with exactly the same type as the private storage
<wallyworld__> i do that too
<wallyworld__> the public openstack storage uses a client connection that doesn't do any authentication
<rogpeppe> wallyworld__: it's probably down to the way the container object is initialised
<wallyworld__> the public storage requires that the container be created with the correct acl to allow world readable without auth being required
<wallyworld__> and so the reading bit works fine
<wallyworld__> ie someone who is authenticated can upload stuff to the container and then anyone can read it, authenticated or not
<wallyworld__>  however, using the public storage to try and write something eg fake tools in the tests, fails because no auth is used since the public storage uses a non authenticating client
<rogpeppe> wallyworld__: is the issue that you *can't* use authentication when connecting to a container that doesn't require it?
<wallyworld__> i could, but the public storage is initialised with a non authenticating client
<wallyworld__> openstack clients store the auth token when authentication occurs
<rogpeppe> wallyworld__: could you initialise it with an authenticating client?
<wallyworld__> and then put that token in each request
<wallyworld__> yes, but that would require authentication which the public containers must not require?
<rogpeppe> wallyworld__: i'm presuming the crux is in this code: http://paste.ubuntu.com/1534329/
<wallyworld__> the non public storage in the openstack provider does indeed use an authenticating client
<wallyworld__> yes
<wallyworld__> the public storage is initialised with a client which does not authenticate, so no credentials are used
<wallyworld__> used or required
<rogpeppe> wallyworld__: what is it about the publicStorageUnlocked initialisation that isn't authenticating?
<wallyworld__> publicBucketClient
<rogpeppe> wallyworld__: i see two occurrences of e.client(ecfg, authMethodCfg)
<rogpeppe> wallyworld__: ah, i may be looking at an old branch though
<wallyworld__> let me check the code
<wallyworld__> func (e *environ) publicClient(ecfg *environConfig) client.Client {
<wallyworld__> 	return client.NewPublicClient(ecfg.publicBucketURL(), nil)
<wallyworld__> }
<wallyworld__> around line 196
<wallyworld__> the public storage instance uses a goose client that is non authenticating as constructed above
<rogpeppe> wallyworld__: ah, i wasn't in sync
<wallyworld__> public bucket url above is like the public region for ec2
<wallyworld__> so a goose client can be created with credentials, or not
<wallyworld__> if credentials are provided, it authenticates
<rogpeppe> wallyworld__: so if you provide credentials for a container that doesn't require them, it fails?
<wallyworld__> there are separate constructors for each type - you cannot supply credentials for a public client
<wallyworld__> if i use curl to poke a container which is publicly readable but use credentials, i'm not sure what will happen
<wallyworld__> i think i read that there is a bug in that area and it will not work
<rogpeppe> wallyworld__: that's what i was wondering
<wallyworld__> i definitely read that there is a bug in that area, but am uncertain if it is fixed. i will need to test
<rogpeppe> wallyworld__: it seems a bit wrong that it can't ignore credentials when they're not required
<wallyworld__> indeed
<rogpeppe> wallyworld__: i wonder if it would be possible to work around that bug
<rogpeppe> wallyworld__: for instance, by detecting an error code that signifies the bug
<wallyworld__> not sure, but it does seem the idea of a non authenticating client needs to be rethought
<rogpeppe> wallyworld__: goose/client could certainly do with some API docs - i can't see how it's meant to work...
<fwereade_> rogpeppe, https://codereview.appspot.com/7058073 has changed a bit since I merged trunk in and might benefit from a second glance; TheMue, you've also started to review that one
<TheMue> fwereade_: Will take a look.
<fwereade_> TheMue, tyvm
<wallyworld__> rogpeppe: yes, it probably does need more doc
<wallyworld__> it's very openstack specific
<wallyworld__> to understand it you need to know about how openstack works
<rogpeppe> wallyworld__: if it's openstack specific, perhaps it should be hidden behind the swift, etc APIs
<rogpeppe> wallyworld__: rather than requiring clients to interact with it
<wallyworld__> the goose client is used by the goose swift code to provide the underlying transport/connectivity
<wallyworld__> a client needs to be created with credentials, but after that, it's opaque
<wallyworld__> it just needs to be passed to a swift instance so that it can be used
<rogpeppe> wallyworld__: perhaps the goose swift code should provide a function that encapsulates that, so the type isn't externally visible.
<dimitern> is there a command to show the complete effective environ config as seen by juju?
<wallyworld__> the same client is used by the nova and glance stuff too
<dimitern> after defaults, etc. are applied?
<rogpeppe> dimitern: you can call AllAttrs
<rogpeppe> wallyworld__: hmm, yes, i see the issue
<dimitern> rogpeppe: on what?
<rogpeppe> dimitern: config.Config
<rogpeppe> dimitern: what's your context?
<dimitern> rogpeppe: I want to dump the config after validation, before any cmd like bootstrap is executed
<rogpeppe> wallyworld__: if NewPublicClient and NewClient were merged, i'm not sure there would be a good reason for exposing Client.
<dimitern> rogpeppe: isn't there something like juju get --all ?
<wallyworld__> rogpeppe: the same client instance is shared by nova and swift instances
<rogpeppe> dimitern: ah, from the command line
<dimitern> rogpeppe: yeah, that seems easiest
<wallyworld__> ah, no it's not
<wallyworld__> a new client instance is created for each nova, swift instance
<wallyworld__> rogpeppe: i sort of think of a goose client as like a tcp socket - you may need to create one and then give it to things to use on your behalf but you don't need to know how a socket works
<wallyworld__> but i guess you are saying each thing should create its own socket
<rogpeppe> wallyworld__: well, i think that might possibly make for a more straightforward API, but i'd need to think about it
<rogpeppe> wallyworld__: in general though, whether openstack-specific or not, any exported API should be fully documented
<wallyworld__> the thing with merging NewClient and NewPublicClient is that go lacks default method params etc, so it gets messy
<wallyworld__> yeah, agree with doco - we've just run way short on time
<rogpeppe> wallyworld__: i don't think that's necessarily a problem. NewClient has quite a few args anyway; it could easily be a struct.
<wallyworld__> yes. i used that approach in some of the client internals
<rogpeppe> wallyworld__: type ClientParams struct {Creds ...; AuthMethod ...; Logger; etc}
<rogpeppe> wallyworld__: that's the go way to do default method params. works pretty well usually.
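The params-struct idiom rogpeppe suggests might look like this (all names here are illustrative, not goose's real API; zero values stand in for defaults, which is the usual Go substitute for keyword arguments):

```go
package main

import "fmt"

// ClientParams groups what would otherwise be a long argument list.
// Fields left at their zero value get defaults, so callers only set
// what they care about -- the Go analogue of default/keyword params.
type ClientParams struct {
	URL        string
	AuthMethod string // "" means no auth: the public-storage case
	Retries    int    // 0 means "use the default"
}

// NewClient builds a description of the client it would construct;
// a real implementation would return a client value instead.
func NewClient(p ClientParams) string {
	if p.Retries == 0 {
		p.Retries = 3 // apply the default
	}
	if p.AuthMethod == "" {
		return fmt.Sprintf("public client for %s (retries=%d)", p.URL, p.Retries)
	}
	return fmt.Sprintf("authenticating client for %s via %s (retries=%d)",
		p.URL, p.AuthMethod, p.Retries)
}

func main() {
	// Public client: only the URL is needed, everything else defaults.
	fmt.Println(NewClient(ClientParams{URL: "http://example.invalid"}))
	// Authenticating client: same constructor, one extra field.
	fmt.Println(NewClient(ClientParams{URL: "http://example.invalid", AuthMethod: "userpass"}))
}
```

This is also how NewClient and NewPublicClient could be merged into one constructor: the presence or absence of credentials in the struct selects the behaviour.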
<wallyworld__> yeah, i gotta get used to that. hard to forget all that python
<wallyworld__> it does seem to add overhead though
<wallyworld__> compared with just using default method or kw args
<wallyworld__> and that overhead is incurred for each and every call
<rogpeppe> wallyworld__: runtime overhead?
<wallyworld__> since the struct needs to be created each time
<wallyworld__> typing/verbosity overhead
<wallyworld__> code bloat
<rogpeppe> wallyworld__: there are advantages too though - you can create an object encapsulating the params and pass it around and manipulate it easily
<fwereade_> rogpeppe, TheMue, niemeyer: can anyone remember why we prevent SetExposed and ClearExposed when a service is Dying?
<wallyworld__> rogpeppe: swings and roundabouts as with anything :-)
<rogpeppe> fwereade_: it was a fairly arbitrary decision AFAIR
<dimitern> rogpeppe: so I guess there isn't a way to dump the config from the command line?
<rogpeppe> dimitern: you could write a program to do it in 3 minutes
<wallyworld__> rogpeppe: thanks for discussion, way past my bedtime, i'll look again tomorrow
<rogpeppe> wallyworld__: cheers!
<dimitern> rogpeppe: ok, so I need to dump AllAttrs from environ?
<rogpeppe> dimitern: i think you'd print the result of environs.NewFromName("").Config().AllAttrs()
<dimitern> rogpeppe: I'll try that, 10x
<fwereade_> rogpeppe, I don't think I can justify disallowing ClearExposed; SetExposed is more interesting, but I'm inclined to say it should be allowed for symmetry's sake
<rogpeppe> dimitern: assuming you're using the standard environments.yaml
<rogpeppe> fwereade_: i'm not sure the operations need to be symmetrical
<rogpeppe> fwereade_: we often allow closing-down operations when dying, but not adding operations
<fwereade_> rogpeppe, yeah, true
<rogpeppe> fwereade_: mind you, i always thought there should only be one call: SetExposed(bool)...
<fwereade_> rogpeppe, yeah, I can sympathize there
<niemeyer> fwereade_: I think either way is fine
<niemeyer> rogpeppe: Yeah, I recall that :)
<niemeyer> The reasoning, ironically, is precisely for symmetry..
<niemeyer> Other flags require more than a bool
<niemeyer> But I digress
<niemeyer> and will have lunch, in fact.. biab
<rogpeppe> fwereade_: just looking at https://codereview.appspot.com/7058073 - quite a bit has changed, but i don't see my comments addressed
<fwereade_> rogpeppe, oh, hell, I got completely distracted by the manual merge
<rogpeppe> fwereade_: np
<fwereade_> rogpeppe, I don't have any objections to any of them though
<rogpeppe> fwereade_: cool
<TheMue> fwereade_: you've got a review
<fwereade_> TheMue, ty
<rogpeppe> fwereade_: ping
<fwereade_> rogpeppe, pong
<rogpeppe> fwereade_: i'm starting to think that all public Dying methods are crackful (other than Tomb's, of course). thoughts?
<rogpeppe> fwereade_: the fact that something is dying is really an internal detail - what an external thing should care about is dead or not dead IMO
<fwereade_> rogpeppe, hmm -- I can't think of many places they're used but I'm not yet ready to commit to such a statement myself
<fwereade_> rogpeppe, wait, you mean on state entities? or on watchery things?
<rogpeppe> fwereade_: so for instance, the only place you use Uniter.Dying, you could (should, i think) use u.tomb.Dying instead
<rogpeppe> fwereade_: both
<fwereade_> rogpeppe, when is there a Dying method that isn't `return x.tomb.Dying()`?
<rogpeppe> fwereade_: here: http://paste.ubuntu.com/1534428/
<rogpeppe> fwereade_: oops, those are calls
<rogpeppe> fwereade_: i don't think anything should expose its internal tomb's dying state
<rogpeppe> fwereade_: what's it useful for?
<rogpeppe> fwereade_: it says "this object is starting to shut down", but that's not useful info when interacting with the object.
<rogpeppe> fwereade_: i think all those Dying methods should be Dead methods that return x.tomb.Dead()
<rogpeppe> fwereade_: for example, the comment here is just wrong:
<rogpeppe> // Dying returns a channel that signals a Firewaller exit.
<rogpeppe> func (fw *Firewaller) Dying() <-chan struct{} {
<rogpeppe> fwereade_: it doesn't signal a firewaller exit, but just that at some time in the future the firewaller hopefully will exit.
<rogpeppe> lunch
<rogpeppe> fwereade_: what was the last thing you saw me say before i said "lunch" ?
<fwereade_> rogpeppe, fwereade_: it says "this object is starting to shut down", but that's not useful info when interacting with the object.
<rogpeppe> ah
<rogpeppe> [14:50:46] <rogpeppe> fwereade_: i think all those Dying methods should be Dead methods that return x.tomb.Dead()
<rogpeppe> [14:53:40] <rogpeppe> fwereade_: for example, the comment here is just wrong:
<rogpeppe> [14:53:40] <rogpeppe> // Dying returns a channel that signals a Firewaller exit.
<rogpeppe> [14:53:40] <rogpeppe> func (fw *Firewaller) Dying() <-chan struct{} {
<rogpeppe> [14:54:06] <rogpeppe> fwereade_: it doesn't signal a firewaller exit, but just that at some time in the future the firewaller hopefully will exit.
<fwereade_> rogpeppe, that also sounds sane, probably, where there are places it's actually used
<fwereade_> rogpeppe, how common is that?
<rogpeppe> fwereade_: there's one use, in a test.
<fwereade_> rogpeppe, heh, for something that doesn't just have a .Wait()?
<rogpeppe> fwereade_: no, so that we can time out a wait
<rogpeppe> fwereade_: all the other uses of Dying other than on a tomb are in uniter, and those could all be u.tomb.Dying
<fwereade_> rogpeppe, consider me +1 then
<fwereade_> rogpeppe, not sure why I ever had u.Dying()
<rogpeppe> fwereade_:  i tell a lie, it's never used
<fwereade_> rogpeppe, cool
<rogpeppe> fwereade_: i thought f.Dying was on a firewaller (not looking at the file) but it was on a uniter.filter.
<fwereade_> rogpeppe, ah, yeah
<rogpeppe> niemeyer: does this sound reasonable to you: change all public Dying methods in juju-core into Dead methods?
<niemeyer> rogpeppe: Huh!?
<rogpeppe> niemeyer: there's no use for Dying, but there is a use for Dead
<rogpeppe> niemeyer: dying is an internal state
<niemeyer> rogpeppe: Sorry.. quite out of context here
<niemeyer> rogpeppe: The quick answer is no, it doesn't sound reasonable without further detail
<rogpeppe> niemeyer: that's fine. let me try to explain.
<rogpeppe> niemeyer: 1) nothing ever uses a Dying method currently (other than Tomb.Dying)
<rogpeppe> niemeyer: 2) ... except for one case in a test where it should be using Dead.
<rogpeppe> niemeyer: 3) i'd like to use a Dead method (on api.Server)
<rogpeppe> niemeyer: 4) i believe that "dying" cannot signify anything useful to a user of an object
<rogpeppe> niemeyer: because it's really a transitional state
<niemeyer> rogpeppe: What's the test that is broken? That may be an interesting case to look at
<niemeyer> rogpeppe: Not really.. Dying isn't transitional
<rogpeppe> niemeyer: here's the code: http://paste.ubuntu.com/1534705/
<rogpeppe> niemeyer: the bug is that it can wait 50ms for dying to happen, but then the transition to dead might take a lot longer
<niemeyer> rogpeppe: tomb.Dying and other similar channels fire once the thing is dying, and always from then on
<rogpeppe> niemeyer: i mean that dying is part of the transition to dead
<rogpeppe> niemeyer: i realise that the channel state is persistent
<niemeyer> rogpeppe: The point is that Dying, the method you're suggesting we remove, is not transitional.. it has well defined  behavior that is fire-once
<rogpeppe> niemeyer: can you think of a good use case for Dying?
<rogpeppe> niemeyer: as a publicly visible method
<niemeyer> rogpeppe: Given the dozens of cases we have with tomb.Dying, of course..
<rogpeppe> niemeyer: i'm not including tomb.Dying.
<rogpeppe> niemeyer: that's fine as is
<rogpeppe> niemeyer: it's the tomb's raison d'etre
<niemeyer> rogpeppe: Not really.. Tomb is only useful with the full behavioral pack it has
<niemeyer> rogpeppe: Including Dead, Kill, etc
<rogpeppe> niemeyer: but a tomb is an implementation detail of an object, which is why we have it as a non-exported field, rather than embedding it.
<niemeyer> rogpeppe: Wait
<rogpeppe> niemeyer: ok, true
<niemeyer> rogpeppe: So I'm looking at the paste
<niemeyer> rogpeppe: It's not clear how that proves Dying is bad
<rogpeppe> niemeyer: "dead not detected"
<rogpeppe> niemeyer: it's waiting for the object to die. but Dying does not signify that. it signifies that it will die (hopefully soon)
<niemeyer> rogpeppe: Sure.. it's a bit like saying Sleep is bad because it blocks
<rogpeppe> niemeyer: why would we ever want to wait until an object *starts* to clean itself up?
<rogpeppe> niemeyer: i think that we always want to wait until it *has* cleaned itself up
<niemeyer> rogpeppe: Always when?
<rogpeppe> niemeyer: (unless we're part of the object itself, and need to participate in the cleaning-up process)
<niemeyer> rogpeppe: It feels like we're trying to guess stuff up unnecessarily
<niemeyer> rogpeppe: All I see is one test poorly written
<niemeyer> rogpeppe: That doesn't say much about Dying
<rogpeppe> niemeyer: that's the *only* place that Dying is used (other than some places in uniter that could use the tomb Dying method)
<rogpeppe> niemeyer: i mean Dying-other-than-tomb.Dying there
<niemeyer> rogpeppe: Erm
<rogpeppe> niemeyer: the only place that one of the Dying methods defined in juju-core is used
<rogpeppe> niemeyer: to be more accurate
<niemeyer> rogpeppe: UnitDying?
<fwereade_> niemeyer, UnitDying is justa broadcast channel that's closed when the unit's Life is Dying -- has nothing to do with tombs
<rogpeppe> niemeyer: that's different. it's not about the object itself
<niemeyer> <rogpeppe> niemeyer: why would we ever want to wait until an object *starts* to clean itself up?
<rogpeppe> [16:35:40] <rogpeppe> niemeyer: (unless we're part of the object itself, and need to participate in the cleaning-up process)
<niemeyer> rogpeppe: UnitDying is not part of the object itself
<niemeyer> rogpeppe: Someone else watches it
<rogpeppe> in this case, the uniter could arguably be considered part of the unit "object"
<rogpeppe> niemeyer: i would not suggest changing UnitDying to UnitDead
<niemeyer> rogpeppe: Me neither.. Dying means what it means, and seems to be fine everywhere we've used it
<niemeyer> rogpeppe: If you use Dying when you mean Dead, it won't fly so well, though
<niemeyer> rogpeppe: Same thing if you use Dead when you mean Dying
<rogpeppe> niemeyer: which is... in one place, wrongly.
<niemeyer> rogpeppe: In one place if you special case-out every other place
<rogpeppe> niemeyer: i don't see why we expose Dying on Firewaller, for example
<rogpeppe> niemeyer: or Provisioner or Uniter
<rogpeppe> niemeyer: or uniter.filter
<rogpeppe> niemeyer: all those places would be much better served with a Dead method AFAICS
<niemeyer> rogpeppe: That's an easier to verify assertion
<niemeyer> rogpeppe: rather than a witch-hunting against Dying methods
 * niemeyer looks
<rogpeppe> niemeyer: those are the only Dying methods we define
<fwereade_> rogpeppe, don't all tasks have Wait() though anyway?
<niemeyer> rogpeppe: Heh
<rogpeppe> fwereade_: yes. but Dying is nicer than starting a new goroutine just to call wait and send on a channel
<rogpeppe> fwereade_: when we've got that channel available anyway
<rogpeppe> oops!
<rogpeppe> fwereade_: Dead!
<fwereade_> rogpeppe, haha :)
<fwereade_> rogpeppe, I'll be guided by you in that, you're deeply involved with the client ATM
<fwereade_> rogpeppe, AIUI that's what the client does at the moment, right?
<rogpeppe> fwereade_: which client?
<fwereade_> rogpeppe, jujud
<niemeyer> rogpeppe: So firewaller.Dying is never used?
<rogpeppe> niemeyer: indeed
<niemeyer> rogpeppe: Easy one.. just kill it
<rogpeppe> niemeyer: +1
<niemeyer> Huh
<rogpeppe> niemeyer: and the other never-used Dying methods too?
<niemeyer> Why are we even discussing this?
<rogpeppe> niemeyer: because i'm just about to create a Dead method
<rogpeppe> niemeyer: and was wondering about the precedent.
<niemeyer> rogpeppe: Why do we need one?
<niemeyer> rogpeppe: That's different
<niemeyer> rogpeppe: Removing an unused method is a pretty easy choice
<niemeyer> rogpeppe: No matter its name :)
<rogpeppe> niemeyer: we don't *need* one - we can always do go func() {c <- obj.Wait()}; but a Dead method seems nicer, since the channel is available anyway
<niemeyer> rogpeppe: Sure, if you need Dead just add it
<rogpeppe> niemeyer: ok, cool
<niemeyer> rogpeppe: That's a different conversation than "let's kill all Dying methods and replace them by Dead because that's always what we need", if you see what I mean
<rogpeppe> niemeyer: i guess so. i had hypothetical use cases in mind, not real ones :-)
<niemeyer> rogpeppe: There's no long term decision, or convention being established.. just add that one-liner you need and let's be happy :)
<rogpeppe> niemeyer: and i still think that an externally visible Dying method that just returns obj.tomb.Dying() is almost certainly going to be a mistake
<rogpeppe> hmm, weird codereview behaviour i haven't seen before: https://codereview.appspot.com/7101059/
<rogpeppe> fwereade_: do you see no diffs at all?
<rogpeppe> the diffs in launchpad look fine
<niemeyer> rogpeppe: No issue exists with that id (7101059)
<niemeyer> rogpeppe: That's the message I get
<rogpeppe> niemeyer: i've just deleted it, trying again.
<rogpeppe> niemeyer: but the next one has the same problem too
<rogpeppe> niemeyer: one mo
<niemeyer> rogpeppe: FWIW, I just got a weird behavior while using it too.. had to reload the CL *several* times to see comments I had just entered
<rogpeppe> niemeyer: https://codereview.appspot.com/7106052
<niemeyer> rogpeppe: And my comments disappeared again!
<niemeyer> rogpeppe: There's something funky going on in their side
<rogpeppe> niemeyer: i guess you could review it by looking at the launchpad page
<rogpeppe> niemeyer: there's nothing complicated going on
<niemeyer> rogpeppe: It's not okay for lbox/codereview to be broken like that
<rogpeppe> niemeyer: i've never seen this particular failure-mode before
<rogpeppe> niemeyer: i imagine there's a data warehouse gone down somewhere :-)
<niemeyer> rogpeppe: The changes to u.Dying => u.tomb.Dying seem undue
<niemeyer> rogpeppe: u.Dying seems perfectly okay there
<rogpeppe> niemeyer: why would something external to uniter find any use from using the Dying method?
<rogpeppe> niemeyer: the dying state is surely something internal to the implementation of Uniter?
<rogpeppe> niemeyer: as it is in all our other objects
<rogpeppe> fwereade_: what do you think?
<niemeyer> rogpeppe: Maybe
<niemeyer> rogpeppe: none of our other objects tend to touch their tombs from external functions, though
<rogpeppe> niemeyer: neither does uniter
<niemeyer> rogpeppe: See modes.go
<niemeyer> rogpeppe: I don't care enough, though
<rogpeppe> niemeyer: modes.go is still internal to the package, although it's true it defines functions not methods
<rogpeppe> niemeyer: but that's not unprecedented. in jujud, i pass tombs around to other functions.
<niemeyer> Whatever William says is fine by me
<niemeyer> rogpeppe: Yes, you do..
<niemeyer> rogpeppe: Other functions don't grab private tombs, though
<rogpeppe> niemeyer: the modes.go functions use other private Uniter fields too
<niemeyer> rogpeppe: I don't care, though.. it feels a bit of a red-herring change, and a bike-sheddy discussion
<rogpeppe> niemeyer: i don't think the tomb is much different
<niemeyer> rogpeppe: LGTM and let's focus on something more valuable
<rogpeppe> niemeyer: i didn't plan on spending more than 5 minutes on this change!
<niemeyer> rogpeppe: Zero would be a better number
<mramm> looks like I'm going to be without power for about 2 hours here this afternoon
<mramm> I'll be around though, so if you you need to contact me I will be checking e-mail periodically
<mramm> and will be available by phone
<niemeyer> rogpeppe: Regarding the yaml vs. json change, we can use yaml, and fix goyaml later to use !binary, that encodes as base64 by itself
<rogpeppe> niemeyer: i think if we use yaml we'll exceed the 16K cloudinit limit
<niemeyer> rogpeppe: "The server encountered an error and could not complete your request."
<niemeyer> rogpeppe: codereview is busted
<rogpeppe> niemeyer: seems like it
<niemeyer> rogpeppe: I don't understand
<niemeyer> rogpeppe: How does the agent configuration file written to disk relate to cloudinit?
<rogpeppe> niemeyer: see WriteCommands
<rogpeppe> niemeyer: it's also used to generate the cloudinit file
<rogpeppe> niemeyer: which is kinda the point of the package - it's a common place that knows how to write an agent configuration file
<rogpeppe> niemeyer: whether that's into a cloudinit file or an upstart script, or directly.
<rogpeppe> niemeyer: actually we don't use it in an upstart script
<niemeyer> rogpeppe: echo | gunzip?
<niemeyer> rogpeppe: I mean, the two things seem somewhat unrelated.. you can do whatever you want in these commands
<rogpeppe> niemeyer: that's a thought
<rogpeppe> niemeyer: except it would have to be echo | b64decode | gunzip
<rogpeppe> niemeyer: is there a base 64 decoder provided as standard in ubuntu?
<niemeyer> rogpeppe: Yeah:
<niemeyer> % dpkg -S /usr/bin/base64
<niemeyer> coreutils: /usr/bin/base64
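The pipeline being discussed can be exercised end to end (the payload below is a made-up placeholder, not a real agent config): gzip and base64-encode on the sending side, then `base64 -d | gunzip` on the instance, using only coreutils and gzip.

```shell
# Round-trip a config blob the way a cloudinit command might:
payload='{"state-servers": ["example.invalid:37017"]}'

# Sending side: compress, then base64 so the bytes survive as text.
encoded=$(printf '%s' "$payload" | gzip -c | base64 -w0)

# Instance side: decode and decompress; prints the original payload.
printf '%s' "$encoded" | base64 -d | gunzip
```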
<rogpeppe> niemeyer: cool, i'll do that then.
<rogpeppe> niemeyer: if that seems like a reasonable approach to you
<niemeyer> rogpeppe: Sounds okay.. even if we convert it to base64 inside goyaml itself (which sounds sane), we'll still be getting more compression that way
<niemeyer> rogpeppe: It would actually be nice to pass the whole cloudinit compressed instead
<niemeyer> But I don't recall if that's possible out of the box
<rogpeppe> niemeyer: yeah, and it will deal nicely with the two-identical certificates issue
<rogpeppe> niemeyer: yes, that would be good
<rogpeppe> niemeyer: maybe it does - i haven't actually experimented with the limits, just googled for them
<niemeyer> rogpeppe: I know it has a fancier mime multipart scheme
<niemeyer> Hah
<niemeyer> rogpeppe: https://help.ubuntu.com/community/CloudInit
<niemeyer> rogpeppe: "content found to be gzip compressed will be uncompressed"
<rogpeppe> niemeyer: cool!
<niemeyer> rogpeppe: We can actually just bundle the whole thing with gzip and send down the wire
<rogpeppe> niemeyer: that's just brilliant
<niemeyer> +1
<rogpeppe> niemeyer: i can relax a bit
<niemeyer> Lots of good stuff in the history of the past month so far
<niemeyer> It's great to have the CL links in the log, btw
<niemeyer> I hadn't realized how useful that'd be until now
<rogpeppe> niemeyer: isn't it, just
<rogpeppe> niemeyer: i'm glad you haven't found too many WTFs!
<rogpeppe> niemeyer: (or maybe you have, but are dwelling on the good things...)
<rogpeppe> niemeyer: i've gotta go
<rogpeppe> g'night all, see ya tomorrow
<niemeyer> rogpeppe: Nah, looks good so far :)
<niemeyer> rogpeppe: Have a goo dnight man
<rogpeppe> niemeyer: my dnights will indeed be all gooey
<rogpeppe> :-)
<niemeyer> rogpeppe1: Are you back from the goo dnight? :)
<hazmat> anyone else having issues with reitveld?
<niemeyer> hazmat: Yeah
<niemeyer> hazmat: Server seems busted ATM
<hazmat> rogpeppe1, gooey is sweet and crunchy ;-)
<niemeyer>   testing: print mongod log messages
<niemeyer> rogpeppe1: ^ nice
<niemeyer> I had seen that before too
<niemeyer> Or rather, s/seen/suffered from/
<hazmat> app engine, the most helpless way to fail at scale
<TheMue> niemeyer: Google has GAE troubles and rietveld runs on it.
<niemeyer> TheMue: Really?
<niemeyer> TheMue: Where's the news?
<TheMue> niemeyer: Yep, just read it.
<TheMue> niemeyer: One moment.
<TheMue> niemeyer: Hehe, even http://code.google.com/status/appengine fails.
<niemeyer> TheMue: Works here, but it does say the service is facing difficulties
<TheMue> niemeyer: See also https://twitter.com/app_engine/status/291243267239059456
<niemeyer> TheMue: Cheers
<hazmat> niemeyer, if you've got a moment.. there was an lbox question from orange squad .. an error they're getting on submit .. http://pastebin.ubuntu.com/1535263/
<hazmat> bzr info output http://pastebin.ubuntu.com/1535286/
<niemeyer> hazmat: "*** Bazaar has encountered an internal error."
<niemeyer> hazmat: Doesn't look like an lbox issue
<niemeyer> hazmat: The command line is pretty boring: arguments: ['/usr/bin/bzr', 'commit', '-F', '/tmp/commit-803116972', '--
<niemeyer>     author', 'Curtis Hovey <curtis.hovey@canonical.com>',
<niemeyer>     '/home/curtis/Work/charmworld/lbox-245970753/lbox']
<hazmat> niemeyer, noted thanks
<niemeyer> hazmat: I note it's a beta bzr version.. might be related
<niemeyer> (2.6b2)
<hazmat> niemeyer, it might also be related to plugins, several members of this team have various assorted plugins.. i'm wondering if lbox should do bzr --no-plugins when executing
<hazmat> several don't use it as a result due primarily to errors on submit
<niemeyer> hazmat: lbox doesn't use anything special about bzr
<niemeyer> hazmat: It's just running it via its command line
<niemeyer> hazmat: If plugins are broken with lbox, I'm curious about why is that a problem related to lbox
<hazmat> niemeyer, understood. but in this case the user can commit to trunk fine using bzr by itself..
<niemeyer> hazmat: Exactly, lbox *is* using bzr itself
<hazmat> niemeyer, just that lbox could potentially be more robust by disabling user plugins
<niemeyer> hazmat: indeed, but it would also prevent people from using plugins with lbox.. I don't yet understand why that's a good thing
<hazmat> hmm...
<niemeyer> hazmat: The one thing lbox uses that some people are not used to is lightweight checkouts
<niemeyer> hazmat: But that's a stock feature, which hopefully should work fine
<hazmat> at the moment i'm just trying to discover what's the issue, it was pointed out that the plugin disable can be done via env var. we're trying that now
<niemeyer> hazmat: Thanks, I'm also curious to know what's the actual issue
<hazmat> niemeyer, no change.. http://pastebin.ubuntu.com/1535318/  is there an option to have lbox keep the temp directory around?
<hazmat> hmm.. no such revision.. is that a common ancestor check?
<niemeyer> hazmat: Not yet, but it should be relatively simple to break at the right point by replacing the bzr executable and catching commit
<niemeyer> Hmm.. the branch checked message has a bug..
 * niemeyer fixes
<hazmat> niemeyer, fwiw 2.6b2 is the quantal bzr and the further analysis (no real answer) http://pastebin.ubuntu.com/1535610/
<fwereade_> davecheney, morning
<davecheney> fwereade_: hello
<fwereade_> davecheney, I'm about to sleep, but I have a vague feeling that you are *probably* least tainted by groupthink-related issues re state and lifecycle
<davecheney> fwereade_: that is a sound assumption
<fwereade_> davecheney, and therefore I would particularly appreciate your thoughts on https://codereview.appspot.com/7095059
<davecheney> fwereade_: will do
<fwereade_> davecheney, as a diff, it's a monster; but if you read doc/draft/death-and-destruction.txt first, I *hope* it will appear roughly sane to you
<davecheney> fwereade_: ok
<fwereade_> davecheney, ie I think/hope the operations are clear and adequately explained; but the near-equivalence between behaviours before and after is harder to discern
<davecheney> removing things is always the hardest
<davecheney> and tends to mess up pretty mental models
<fwereade_> davecheney, I *thiiiink* I'm converging on something that works ok for now
<davecheney> fwereade_: i can assure you that you are the one that knows the most about this problem
<fwereade_> davecheney, which is why it's so important I get sanity-checks from everyone else ;p
<fwereade_> davecheney, anyway, sleepytime :)
<doc_> is this strictly for juju developer questions?
<davecheney> doc_: if it's a general juju question you're probably likely to find better help in #Juju
<doc_> davecheney: Thanks. Tried there, nobody answering :-(
<davecheney> doc_: bad time of day
<davecheney> most juju devs are in the EU/US timezones
<doc_> tx. Will try again tomorrow then
<doc_> EST myself … which reminds me maybe I should go get some food :-)
<davecheney> doc_: consider also https://lists.ubuntu.com/mailman/listinfo/juju
#juju-dev 2013-01-16
<hazmat> davecheney, he asked to ask, but never actually asked re #juju
<davecheney> hazmat: well, that is two problems
 * hazmat is missing how to query store stats w/ usernames
<hazmat> aha username is last on stats
<TheMue> *: Morning all.
<fwereade_> TheMue, heyhey
<fwereade_> TheMue, do you have a bit of time for reviews today? if so: https://codereview.appspot.com/7095049/ should only require simple sanity checks; but https://codereview.appspot.com/7095059/ is complex and important, and I would especially appreciate reviews focusing on the logic expressed therein and the clarity with which it is done so
<TheMue> fwereade_: Sorry, not now. I'll drive into town to upgrade my notebook to run more test vms in a few moments and this afternoon I'm out of office as I told yesterday. But I'll take a look when I'm back from the city and this evening.
<fwereade_> TheMue, ok, no worries: enjoy :)
<TheMue> fwereade_: Thx.
<fwereade_> rogpeppe1, morning
<rogpeppe1> fwereade_: yo!
<rogpeppe1> fwereade_: i'm with you about dfc's remarks on 7095059.
<fwereade_> rogpeppe1, cool, good to know that if I'm crazy I'm not alone ;p
<rogpeppe1> fwereade_: i think i've seen a similar complaint from dave before actually, also misguided.
<fwereade_> rogpeppe1, yeah, I think so too
<rogpeppe1> wallyworld: morning
<rogpeppe1> wallyworld: so it *was* possible to use an authenticating client, i guess?
<rogpeppe1> fwereade_: could i have a post-facto LGTM on https://codereview.appspot.com/7106052/ for form's sake, please? gustavo stated it here on IRC but didn't reply to the CL.
<fwereade_> rogpeppe1, looking
<fwereade_> rogpeppe1, done
<rogpeppe1> fwereade_: thanks
<rogpeppe1> fwereade_: i think it's ok for the Dead methods to exist for the convenience of the tests, tbh - it's just a line of code
<fwereade_> rogpeppe1, yeah, I'm not *really* bothered about it :)
<rogpeppe1> fwereade_: and i'm going to be adding a Dead method to api.Server (used outside of tests), so there'll be good company
<fwereade_> rogpeppe1, cool :)
<fwereade_> wallyworld, fwiw I just reviewed https://codereview.appspot.com/7086055/ with basically 1 question -- if you have a quick response to that now you can probably get an LGTM by the time you start work tomorrow(?)
<aram> moin
<rogpeppe1> aram: hiya
<fwereade_> rogpeppe1, https://codereview.appspot.com/7137044 might be trivial, I would appreciate your opinion
<rogpeppe1> fwereade_: a 500 line diff trivial? i don't think so. (well, i'd need to spend a reasonable amount of time on it, so it's not trivial for me...)
<rogpeppe1> fwereade_: ^
<rogpeppe1> fwereade_: i'll have a look in a bit though
<fwereade_> rogpeppe1, fair enough -- maybe the term I was looking for is "mechanical"
<rogpeppe1> fwereade_: it might've been mechanical for you, but it's not easy for me to see the relationship between old and new, i'm afraid. i'll have to look quite hard to see that the new is equivalent to the old.
<fwereade_> rogpeppe1, np at all, I understand my perspective is skewed ;)
<rogpeppe1> fwereade_: sometimes it's easier to write than to read...
<fwereade_> rogpeppe1, heh, I have become very familiar with those InferEndpoints blocks
<fwereade_> niemeyer, morning
<niemeyer> fwereade_: Yo
<wallyworld> rogpeppe1: yeah, i figured it out
<rogpeppe1> wallyworld: cool.
<rogpeppe1> niemeyer: hiya
<niemeyer> rogpeppe1: Hey ho
<wallyworld> fwereade_: i had another look at the ec2 implementation and it uses auth with the public bucket, so i did the same
<fwereade_> wallyworld, ok, fair enough, thanks
<wallyworld> fwereade_: i initially thought as you did and messed up the first go at it
<wallyworld> fwereade_: so in essence, you need auth to write to the public bucket, but then can use wget etc without auth to get stuff from it
<fwereade_> wallyworld, ah ok, hmm, let me backpedal a mo
<fwereade_> wallyworld, in what circumstances is it sensible for a juju client to write to the public bucket?
<wallyworld> fwereade_: give me a few minutes, on a standup
<fwereade_> wallyworld, ah sorry
<wallyworld> fwereade_: np, i was the one who pinged you
<rogpeppe1> fwereade_: in normal circumstances, a juju client will never be able to write to the public bucket.
<rogpeppe1> fwereade_: but by using the normal authentication, we make it possible for the tests to write to the public bucket without adding any more code.
<niemeyer> rogpeppe1: Agreed. It's even safe to say that the real public bucket is never written to.
<niemeyer> (by juju)
<fwereade_> rogpeppe1, I'm not sure I generally approve of that approach -- what code are you actually saving? just not having to create a client with auth in the tests?
<niemeyer> fwereade_: How to test the public bucket if it's always empty?
<fwereade_> niemeyer, er, I'm not suggesting that the tests should not put stuff in the public bucket
<rogpeppe1> fwereade_: we'd need more code to create a bucket that's neither of the ones provided by the Environ
<fwereade_> niemeyer, I'm saying that it seems weird to give the provider authenticated access to something it should not need, just to save the effort of creating an authenticated client in the tests
<rogpeppe1> fwereade_: tbh, i'm not sure there's a good use case for the swift unauthenticated connection
<niemeyer> fwereade_: I'm probably out of context there
<rogpeppe1> fwereade_: why not always present your credentials. at worst they're ignored.
<niemeyer> fwereade_: I'm reading that as "It's weird to create authenticated access just to save the trouble of creating authenticated access", which makes no sense
<fwereade_> rogpeppe1, the "in the tests" bit is important
<niemeyer> fwereade_: That's probably the bit I'm missing.. what's that again?
<fwereade_> niemeyer, AIUI the contents of the public-bucket should be available to anyone, and should never be written to by juju
<niemeyer> fwereade_: That is the case, but not in tests
<fwereade_> niemeyer, right
<niemeyer> fwereade_: We write to a bucket in tests, and set it as the public bucket, because we need content in it
<fwereade_> niemeyer, yes, this is not hard or complicated
<niemeyer> fwereade_: That's what it seems to me as well
<fwereade_> niemeyer, I just don't understand why the *environ* needs to have an authed connection when there is never any real use case, and creating a separate authed connection -- to write with -- in the tests is trivial
<rogpeppe1> fwereade_: in goamz there's no way to create an unauthed connection. is that a problem?
<fwereade_> niemeyer, it feels like the justification is "meh, it's convenient" but IMO it's misleading to put test-only code into an implementation without heavily flagging it with comments
<niemeyer> fwereade_: Hmm
<niemeyer> fwereade_: I see
<fwereade_> rogpeppe1, well, at least it's behind a PublicStorage interface -- if we weren't casting back to Storage in the tests I probably wouldn't care how it was implemented
<niemeyer> fwereade_: But I'm not sure I agree.. let me reverse the question: what's the *problem* with the current implementation?
<fwereade_> niemeyer, the problem is that the openstack code became less clear -- it took me some time to understand that the change to an authed connection was entirely to accommodate the tests
<niemeyer> fwereade_: AFAIK, there's nothing wrong with downloading files from S3 while being identified
<niemeyer> fwereade_: That's a pretty different problem from what we've been discussing thus far
<fwereade_> niemeyer, yeah, this is true -- the context is entirely "why did this change add something that is only needed for the tests?"
<fwereade_> niemeyer, if "we did it that way before" is all the justification we need then I'll shut up
<niemeyer> fwereade_: No, I think you're right in your argument.. it's just the problem statement that was hard to grasp
<niemeyer> fwereade_: It's not about S3, or goamz, or the ec2 environ
<niemeyer> fwereade_: It's about openstack
<niemeyer> fwereade_: So you're saying just returning an authed connection with the public bucket is tricky?
<niemeyer> fwereade_: (with openstack)
<fwereade_> niemeyer, it seems like code that doesn't need to exist
<niemeyer> fwereade_: What's that code?
<niemeyer> fwereade_: Or where is it?
<fwereade_> niemeyer, https://codereview.appspot.com/7086055/
<fwereade_> niemeyer, actually https://codereview.appspot.com/7086055/diff/5001/environs/openstack/provider.go#newcode223
<fwereade_> niemeyer, the whole business of casting a PublicStorage to a Storage in the tests strikes me as a touch icky, because it doesn't seem like creating a separate connection in the tests would be very hard at all... I may be missing something?
<niemeyer> fwereade_: What's the code that shouldn't exist?
<fwereade_> niemeyer, the commented change that I linked?
<niemeyer> fwereade_: Are we talking about 5 or 6 characters?
<fwereade_> niemeyer, if you want to reduce it to that, don't worry, rubber-stamp LGTM
<niemeyer> fwereade_: I'm being silly.. I'm actually honestly trying to understand what you're arguing about.. when I hear "code that shouldn't exist" I think about maintenance
<niemeyer> fwereade_: That doesn't look like it
<niemeyer> fwereade_: If what you're arguing about is purely conceptual, I think this is actually correct
<niemeyer> fwereade_: For other reasons
<fwereade_> niemeyer, it is as simple as "why are we using a type with writey methods when they are not required?"
<niemeyer> fwereade_: Having an authenticated connection is not just about writing
<niemeyer> fwereade_: It's also about reading
<niemeyer> fwereade_: Think the case of private clouds
<rogpeppe1> niemeyer: that's a good point. it might be fine to have a public bucket that's only readable by the given user.
<rogpeppe1> niemeyer: but not writable
<rogpeppe1> niemeyer: assuming openstack has ACLs
<niemeyer> fwereade_: The only change I see in this CL is "use credentials when accessing the public bucket", with a trivial one-liner.. it doesn't look bad by itself on that ground
 * mgz catches up with the log on this topic
<fwereade_> niemeyer, ok, so, no default public-bucket for openstack?
<niemeyer> fwereade_: That said, if tests are requiring the public bucket interface to have write methods, it does sound sensible to fix the test to not require that.
<niemeyer> fwereade_: Wait what?
<niemeyer> fwereade_: I didn't see that coming.. what's the relation?
<fwereade_> niemeyer, well, if we're authenticating the connection, those credentials need to be recognised by the public bucket server... right?
<rogpeppe1> niemeyer: no public tests require that
<fwereade_> niemeyer, I may be on crack
<rogpeppe1> niemeyer: it's only in the provider-savvy test code that it makes that assumption
<fwereade_> niemeyer, it seems like a stretch to assume that your local credentials will work with whatever public-bucket we provide by default
<niemeyer> fwereade_: Will swift block a read of a publicly available file if a signature is provided?
<fwereade_> niemeyer, honestly I have no idea
<niemeyer> fwereade_: Ah, interesting.. you're thinking of the case across clouds I suppose
<fwereade_> niemeyer, I think the cross-cloud case is here today if we provide some sort of default public-bucket for openstack
<fwereade_> niemeyer, and I think we do want to
<fwereade_> niemeyer, even if we advise people to keep a local tools bucket to save bandwidth, not depend on us, etc, I think we also want a default "let's try juju with this openstack" to work without messing around like that
<niemeyer> fwereade_: Yeah, you mean default public bucket as in default storage in a random swift server for all openstack providers
<niemeyer> fwereade_: That's a good point
<fwereade_> niemeyer, yeah, I assumed we'd have to have something like that
<niemeyer> fwereade_: I doubt the signature would go by if the user doesn't exist, of course
<niemeyer> fwereade_: Sure, I just didn't have that context in mind
<fwereade_> niemeyer, np, it's an awful lot harder to communicate the context than it is the problem
<niemeyer> fwereade_: For supported clouds, we will have a default public bucket, in the cloud itself, so that's not an issue
<rogpeppe1> niemeyer: the other side of the coin is when there's a public bucket that isn't fully public, of course, just shared
<niemeyer> fwereade_: and that's what I had in mind when you said "real public bucket"
<fwereade_> niemeyer, ah yeah, I'd discarded those as "easy to deal with" and basically forgotten about them ;p
<niemeyer> fwereade_: Cool, I think the context is in sync now
<niemeyer> fwereade_: Agreed, it'd be best for the test to not enforce that need
<niemeyer> fwereade_: Since it should be easy to
<fwereade_> niemeyer, yeah, that was my read of the situation
<mgz> "public" is meaning too many different things in this conversation...
<niemeyer> fwereade_: We can be smarter about having credentials when operating in the same cloud in the future
<fwereade_> niemeyer, I imagine the thing to do there is clean up the public-bucket/public-bucket-url business
<niemeyer> fwereade_: But that's a scenario we don't have to test/code right now, to make things simpler
<fwereade_> niemeyer, ie if there's a public-bucket, it's assumed to be in the same cloud and accessible by name
<fwereade_> niemeyer, public-bucket-url has a default pointing to our super-public bucket (heh :/)
<fwereade_> niemeyer, and public-bucket-url is only used in the absence of public-bucket
<fwereade_> niemeyer, sane?
<niemeyer> fwereade_: Not sure.. the matter of "defaults" is that it depends on the cloud being run
<mgz> fwereade_: that's not really ideal from an openstack perspective
<fwereade_> niemeyer, agreed -- maybe that's a big derail
<niemeyer> fwereade_: We'll want juju itself to be able to figure when it can use a public bucket in the same cloud, and to fallback to that public-bucket-url when it doesn't know better
<niemeyer> fwereade_: So that's not really a matter of "what is set", but rather of "where am I"
<fwereade_> niemeyer, ok, yes, that makes sense
<fwereade_> mgz, I'm interested to hear your perspective on what I suggested
<fwereade_> mgz, because I'm not really comfortable with what I've seen of those settings, but that is almost certainly down to ignorance
<wallyworld> fwereade_: i'll let mgz fill you in on the fine details, i'm quite tired tonight and need some sleep soon. the openstack provider implementation wrt public buckets was done similar to the existing ec2 way of doing it and was done in such a way as to make the existing juju tests pass
<fwereade_> wallyworld, yep, np, enjoy your sleep :)
<wallyworld> ie a container was created for the public storage, and made readable by all, but the tests need to write to it, hence need auth
<wallyworld> the readable by all bit then allows wget etc to fetch the tools
<fwereade_> wallyworld, sure, I understand, I'm just saying there's no justification for auth in the implementation when it's only there so the tests can cast an interface to another and abuse it ;p
<wallyworld> fwereade_: yeah, i understand your concerns. i took the approach that the existing tests were all ok and i just needed to write code to make them pass aka TDD :-)
<fwereade_> wallyworld, I'm doing my very best to criticize the code alone, rather than your approach to it, and I'm sorry if it'snot coming across that way
<mgz> fwereade_: so, the main issue is that for unauthenticated read access, we just want a url
<wallyworld> fwereade_: oh no, don't apologise. i have a thick skin and in any case did not in any way take your concerns badly. i'm very glad you raised them
<mgz> to put stuff there, we need authentication and a bunch of extra details
<fwereade_> wallyworld, cool :)
<wallyworld> i love a good robust discussion
<wallyworld> especially when it comes to improving the code base
<mgz> the tests fake out the global bucket in order to test that, and write tools to a temporary test global bucket
<fwereade_> mgz, we are in agreement so far
<wallyworld> fwereade_: so whatever is decided, i'd love it to be from a holistic perspective and if changes are needed we fix the tests as a whole with possible changes to both ec2 and openstack providers if required. perhaps we can land my mp as is and follow up with a separate branch
<mgz> well, apart from the naming being a pain, and a lack of commands to propagate tools, I think we're happy
<fwereade_> wallyworld, I'm -0 on that but I will think more on it -- I think there's a real distinction between ec2 and openstack, because we don't need to take cross-env credentials into account
<fwereade_> wallyworld, mgz: if the client really will work when it tries to read a bucket with the wrong credentials, I think it's fine
<fwereade_> wallyworld, mgz: but if so it certainly deserves commenting
<rogpeppe> i'd like to know what actually happens if you use invalid credentials on an openstack node when reading from a publicly-accessible container.
<wallyworld> fwereade_: i thought we only needed the public container for wget and the like?
<mgz> the read should really not be trying to give a token at all.
<wallyworld> rogpeppe: i did read that there is/was a bug of some sort that raised a 401 when it shouldn't but i need to find the reference to that, and in any case, it might also be fixed
<wallyworld> anyways, i'll go sleep and catch up with things tomorrow. thanks for the interesting discussion
<fwereade_> wallyworld, mgz: if it works, and there's a comment pointing out that the credentials might be wrong but it doesn't matter, then I'd be ok with landing that as-is, along with a bug pointing out that the live tests make some slightly icky assumptions about the nature of provider storage and it would be nice if they didn't
<fwereade_> wallyworld, cheers
<fwereade_> wallyworld, I'll add some notes to the CL
<rogpeppe> niemeyer, fwereade_: simple change to goamz/ec2/ec2test to support compressed user data: https://codereview.appspot.com/7135045
<rogpeppe> niemeyer: i wonder if it might make sense to have ec2 always compress user data. i'm not sure.
<rogpeppe> niemeyer: in fact, scratch that, it should definitely not.
<niemeyer> rogpeppe: I think it should attempt to uncompress based on the magic instead of picking any errors as uncompressed
<rogpeppe> niemeyer: i wondered about that. gzip.NewReader reads the header - i could bail out if NewReader fails.
<rogpeppe> niemeyer: rather than duplicating the header magic
<niemeyer> rogpeppe: I still think it should simply read the magic
<niemeyer> rogpeppe: if string(data[:magicSize]) == "foo" { profit }
<niemeyer> rogpeppe: Pretty simple
<rogpeppe> niemeyer: i'm not sure though. user data is arbitrary - there's nothing to say that user data with a valid gzip header followed by random contents is necessarily invalid
<rogpeppe> niemeyer: that's a cloudinit thing, not a user data thing
<niemeyer> rogpeppe: user data is not arbitrary.. it's fine to explode if we don't reckon it in our own test server
<rogpeppe> niemeyer: i thought cloudinit was just one way of using user data
<niemeyer> rogpeppe: Hmm.. you're right actually
<niemeyer> rogpeppe: -1 on the change then
<niemeyer> rogpeppe: This is ec2 test server.. there's nothing about cloudinit there
<rogpeppe> niemeyer: yeah, i suppose so.
<rogpeppe> niemeyer: i was trying to avoid having environs/ec2 knowing about the compression scheme.
<niemeyer> rogpeppe: If it knows about interpreting a cloud-init oriented configuration, it should know about what cloud-init expects
<rogpeppe> niemeyer: the cloudinit package knows about that (i'd just added a RenderCompressed method)
<rogpeppe> niemeyer: but i guess i can either remove that, or just document that the compression format is gzip
<niemeyer> rogpeppe: If it doesn't know about interpreting a cloud-init oriented configuration, there's no reason for it to know about the compression
<niemeyer> rogpeppe: Seems self-exclusive.. you only have to uncompress if you care about the data
<niemeyer> rogpeppe: If you care about the data, it's natural to know its format
<rogpeppe> niemeyer: yeah, and the environs/ec2 tests know that the cloudinit data is in yaml format, so they may as well know it's compressed too.
<niemeyer> rogpeppe: +1
<rogpeppe> niemeyer: ok, here's the CL to compress the cloudinit data: https://codereview.appspot.com/7138044/
<rogpeppe> niemeyer: user data now down to 4984 bytes, much better
<niemeyer> rogpeppe: Reviewed
<rogpeppe> niemeyer: thanks. a reasonable point. will add trivial.Gzip and probably Gunzip too for symmetry
<niemeyer> rogpeppe: Super, thanks
<rogpeppe> niemeyer: updated accordingly:  https://codereview.appspot.com/7138044
<niemeyer> rogpeppe: Reviewed
<rogpeppe> niemeyer: submitted
<rogpeppe> niemeyer: thanks
<niemeyer> rogpeppe: np!
<niemeyer> fwereade_: Is it okay if I use state-service-api as a chance to review the whole death-and-destruction doc?
<rogpeppe> fwereade_, niemeyer: PTAL https://codereview.appspot.com/7085062
<rogpeppe> fwereade_: i'll do some reviews now. what's your highest priority?
<fwereade_> niemeyer, that works for me
<niemeyer> fwereade_: Cool
<fwereade_> rogpeppe, any of them really :)
<niemeyer> I'll step out for lunch though, as it's pretty late
<niemeyer> biab
<fwereade_> niemeyer, enjoy
<rogpeppe> fwereade_: what does EnterScope immediately followed by LeaveScope accomplish?
<fwereade_> rogpeppe, create the subordinate but get back to the expected starting state for the initial tests (ie nothing in scope)
<rogpeppe> fwereade_: so in the real system, what calls EnterScope to cause the subordinate to be created?
<fwereade_> rogpeppe, the principal unit agent
<rogpeppe> fwereade_: ah, so the subordinate agent starts already in scope, then leaves scope when it dies?
<fwereade_> rogpeppe, no -- the action of the principal entering scope causes the subordinate to be created; when the subordinate's unit agent is running it will enter scope, and at that point the two units will see each other
<rogpeppe> fwereade_: ah, i see now
<fwereade_> rogpeppe, (and fwiw: the principal and the subordinate each leave scope independently when they observe themselves, or the relation, to be Dying)
<rogpeppe> fwereade_: you've got a couple of reviews
<fwereade_> rogpeppe, cheers
<mramm> hazmat: rogpeppe: we still doing that API sync call?
<rogpeppe> mramm, hazmat: yes, i just noticed that. am ready if you are.
<mramm> I'm alone in the hangout, hanging out
<rogpeppe> mramm: i just joined then left :-)
<mramm> interesting
<fwereade_> rogpeppe, comments on https://codereview.appspot.com/7137044/ when you have a mo
<hazmat> mramm2, rogpeppe sorry i got double booked
<hazmat> totally missed it.. i'm in the hang out now if you've got time
<hazmat> mramm2, rogpeppe should i reschedule for tomorrow?
<rogpeppe> hazmat, mramm2: i'm ok if you are
<hazmat> rogpeppe, mramm2 let's do tomorrow.. i have another meeting starting now..
<rogpeppe> hazmat: ok, np
<hazmat> rogpeppe, thanks
<rogpeppe> fwereade_: replied
<fwereade_> rogpeppe, SGTM, thanks
<niemeyer> fwereade_: The first bug described in the death doc went a bit over my head
<niemeyer> fwereade_: Would appreciate some quick exchange on it when you have a moment
<fwereade_> niemeyer, ping -- I must be brief -- the issue is that the Uniter uses its relation state dir to record what relations it's joined, while actually it should be using... what relations it's joined
<niemeyer> fwereade_: Hey
<niemeyer> fwereade_: Hmm
<aram> holy shit, I've been scammed. I bought 12 SD cards for my raspberry pis, and I bought expensive ones that can do 30MB/sec, and they were all chinese pirate copies that can't even do 1MB/sec.
<fwereade_> niemeyer, so if an instance dies, and a UA comes up on a fresh instance, it won't leave dying relations because it doesn't know it's in them
<fwereade_> aram, ouch, crap :(
<niemeyer> fwereade_: Okay, the doc can probably be reworded a bit then
<niemeyer> fwereade_: It's trusting on logic we don't even have yet
<niemeyer> I think
<fwereade_> niemeyer, I *think* that the provisioner will start fresh instances
<fwereade_> niemeyer, but I'd have to check
<niemeyer> fwereade_: For an already deployed machine with running units? That'd be awkward
<niemeyer> fwereade_: I'd expect some clean up to take place in those cases
<niemeyer> fwereade_: With some kind of formal acknowledgement in the logic handling it that things went bad
<niemeyer> fwereade_: I don't recall us having that yet, at least
<fwereade_> niemeyer, ah, hmm, looks like we don't
<fwereade_> niemeyer, funny, could have sworn it went in quite early
<niemeyer> fwereade_: Either way, the point is valid.. it just seems worded in an involved way
<niemeyer> fwereade_: The core point, if I get it, is that if the known unit state is corrupted, it won't auto-recover
<fwereade_> niemeyer, yeah -- but I think it would recover ok if we didn't use local dirs as evidence of relation membership
<niemeyer> fwereade_: Or maybe not.. you offer solutions there as well.. my comprehension of the problem just seems to have made the wording more clear somehow
<fwereade_> niemeyer, haha
<rogpeppe> aram: presumably you bought from a reputable seller, so have some comeback... ?
<aram> yeah.
<rogpeppe> aram: bummer though
<aram> yeah.
<aram> btw, Plan 9 works really really well.
<rogpeppe> aram: on the pi?
<aram> Linux is comically slow.
<aram> yes.
<rogpeppe> aram: cool
<aram> though quake 3 runs really well because it is 3D.
<rogpeppe> aram: have you tried inferno?
<aram> not yet.
<rogpeppe> aram: what are you planning to do with your pis?
<aram> the first thing I want to do is implement a full TCP/IP stack over the GPIO ports. I'd like to learn more about TCP/IP this way.
<rogpeppe> aram: what's the point? can't you just implement a transport?
<rogpeppe> aram: assuming you're doing it under plan 9
<aram> yes, I'll do the transport at first, but I want to learn more about implementing TCP/IP in general.
<rogpeppe> i'm off for the night
<rogpeppe> see y'all tomorrow
<aram> cheers
#juju-dev 2013-01-17
<TheMue> Morning.
<rogpeppe> mornin' all
<fwereade_> rogpeppe, TheMue: heyhey
<rogpeppe> fwereade_: hiya
<TheMue> fwereade_, rogpeppe: Heyhey.
<rogpeppe> TheMue: yo!
<TheMue> Btw, TestAttemptTiming in trivial seems to have a failure. The final test loop traverses 'want' and then compares the wanted values with the wanted values. ;)
<TheMue> Found it yesterday when adding interval behaviours (static, linear, exponential).
<rogpeppe> TheMue: oops
<rogpeppe> TheMue: i'll have a look
<TheMue> rogpeppe: It'll be fixed with my CL.
<rogpeppe> TheMue: cool. i should've broken the test first!
<TheMue> rogpeppe: But the test itself is cool, I use the same pattern for the behaviour tests.
<rogpeppe> TheMue: yes, when it's fixed, the test still passes :-)
<TheMue> rogpeppe: +1
<TheMue> rogpeppe: Only change is that I now work with x * time.Millisecond etc. instead of simple fixed values.
<rogpeppe> TheMue: seems reasonable. athough i actually found the 0.5e9 idiom quite readable - ignoring the e9, it's just seconds.
<TheMue> rogpeppe: Indeed, keeping the time base in mind.
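(The TestAttemptTiming bug TheMue describes is the classic want-compared-with-want mistake; this sketch shows why such a loop passes regardless of what the code produced. Names and values here are illustrative, not the actual trivial package test.)

```go
package main

import "fmt"

// buggyCheck mirrors the broken test loop: it traverses want and
// compares the wanted values against themselves, so it never fails.
func buggyCheck(got, want []int64) bool {
	for i := range want {
		if want[i] != want[i] { // always equal: the bug
			return false
		}
	}
	return true
}

// fixedCheck compares the observed values against the expected ones.
func fixedCheck(got, want []int64) bool {
	if len(got) != len(want) {
		return false
	}
	for i := range want {
		if got[i] != want[i] {
			return false
		}
	}
	return true
}

func main() {
	got := []int64{0, 250, 750}   // observed attempt times (ms)
	want := []int64{0, 500, 1000} // expected attempt times (ms)
	fmt.Println(buggyCheck(got, want)) // → true, despite the mismatch
	fmt.Println(fixedCheck(got, want)) // → false
}
```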
<TheMue> rogpeppe, fwereade_: Anyone got a "301 response missing Location header" when doing juju status --debug ?
<rogpeppe> TheMue: i haven't used juju status --debug, i don't think.
<rogpeppe> TheMue: i'll give it a go
<TheMue> rogpeppe: Thx
<fwereade_> TheMue, don't think I've ever run it with --debug :)
<TheMue> rogpeppe: Last time it worked. I'm trying it on ap-southeast-1
<TheMue> fwereade_: Dave showed it in his bug, it logs the reconnection tries.
<rogpeppe> TheMue: you're trying to reproduce dave's bug?
<TheMue> rogpeppe: Yes, I once had a solution where he said he would like a different one. Now I'm trying it and get this error by EC2.
<TheMue> rogpeppe: It's not Daves bug, he only entered it: https://bugs.launchpad.net/juju-core/+bug/1084867
<_mup_> Bug #1084867: conn: conn connections should pause between retry attempts <juju-core:In Progress by themue> < https://launchpad.net/bugs/1084867 >
<fwereade_> TheMue, rogpeppe: btw, https://codereview.appspot.com/7095059/ has a couple of changes and some discussion on death-and-destruction.txt; I would be most grateful for your thoughts
<TheMue> fwereade_: Will take a look.
<fwereade_> TheMue, rogpeppe: also: https://codereview.appspot.com/7095049/ is a 50-line text document with one LGTM already, if anyone fancies taking a quick look at that
 * TheMue detects some kind of review beggin'. ;)
<rogpeppe> fwereade_: i think that using string consts rather than ints "because someone may want to log something sometime" is perhaps misguided. if i see a string, i generally assume it's because of some external constraint (e.g. it's going into a database or something). iota consts work well.
<rogpeppe> fwereade_: and they involve less code (runtime and bytes of source)
<fwereade_> rogpeppe, heh, when I see int constants I assume that overwhelming efficiency considerations demand the use of opaque identifiers ;p
<rogpeppe> fwereade_: naah, they're just the natural thing to use in Go
<fwereade_> TheMue, I'm trying not to overdo it ;p
<rogpeppe> fwereade_: they're only opaque if you print 'em (and even then, it's trivial to work out which one is which)
<TheMue> fwereade_: NP
<TheMue> rogpeppe: IMHO numbers are for calculations :)
<rogpeppe> TheMue: what's an if statement if not a calculation? :-)
<rogpeppe> fwereade_: when i queries "maintain the service" i thought it was a typo. doesn't endpointRemoveOps return the operations necessary to remove an endpoint?
<rogpeppe> fwereade_: "the operations that are needed to maintain the service" doesn't mean much to me
<TheMue> rogpeppe: It's a kind of consideration, like human beings do.
<fwereade_> rogpeppe, it's the operations that apply ot the given endpoint in the course of removing the relation
<fwereade_> rogpeppe, to the endpoint's service, rather
<rogpeppe> fwereade_: so do you mean: "returns the operations that are needed to maintain the service identified by the supplied endpoint when removing a relation" ?
<rogpeppe> fwereade_: i look at that comment, and it doesn't seem to tell me anything about what the returned operations are for
<rogpeppe> fwereade_: i can try to guess from the function name, but i'm not sure i'm guessing right
<fwereade_> rogpeppe, yeah, the "when removing..." will help, thanks
<fwereade_> rogpeppe, still not convinced that iota gains us anything beyond a few bytes, or that the gain is worth the cost in clarity
<rogpeppe> fwereade_: i don't see any clarity gain. only that you can print out a slightly nicer debug message if you're debugging this code.
<fwereade_> rogpeppe, that is what I am referring to
<rogpeppe> fwereade_: how is that "clarity"?
<rogpeppe> fwereade_: the code becomes slightly simpler
<rogpeppe> fwereade_: the executable slightly smaller. and if you're debugging that code, seeing 0, 1 or 2 in that context will probably be just fine.
<fwereade_> rogpeppe, it is clarity for the maintainer logging wtf goes on there, and not making him have to look up what 0, 1, or 2 mean
<rogpeppe> fwereade_: if i'm adding a debug print there, i know i'm printing a removeMode and it's utterly trivial to find the definition.
<fwereade_> rogpeppe, ISTM that you are repeatedly explaining why strings are superior
<fwereade_> rogpeppe, are you seriously concerned that "destroy-service" is unclear, or that the executable size will balloon unacceptably?
<fwereade_> rogpeppe, those are the negative consequences of using strings, and I don't really feel they carry much weight in our situation
<fwereade_> rogpeppe, the negative consequence of using an iota is that it becomes slightly harder to debug the code
<rogpeppe> fwereade_: i think it's a visceral reaction to unnecessary executable size bloat
<rogpeppe> fwereade_: plus the source becomes a little smaller too
<TheMue> rogpeppe: Do we have a source size problem? ;)
<rogpeppe> fwereade_: ok, i do realise that the gains are minute (42 bytes on executable size and 80 in the source), y'know, death by a thousand cuts. hmm, i'm not really sure why i get such a strong reaction.
<rogpeppe> TheMue: actually we do, but it won't be solved in this kinda way
<rogpeppe> TheMue: not source size, sorry, binary size, yeah
<TheMue> rogpeppe: Oh, we have a binary size problem? Good to know. In which way?
<rogpeppe> TheMue: it takes ages to upload the binaries.
<fwereade_> rogpeppe, I do understand the feeling but in my own mind this battle played out a long while ago and I went with "always use strings for constants unless there is an overriding reason not to"
 * TheMue comes from a world running fat binary 24/7
<TheMue> rogpeppe: OK, sounds reasonable.
<rogpeppe> TheMue: we can easily cut the executable size by at least a third, by combining jujuc and jujud
<rogpeppe> TheMue: it's on my todo list on the kanban...
<TheMue> rogpeppe: Fine, so it is covered.
<TheMue> rogpeppe: It's a different kind of problem than in my former jobs, where software is only deployed once per release over a local network.
<rogpeppe> fwereade_: in general i'd use int constants, and possibly add a String method if i needed to debug 'em.
<rogpeppe> fwereade_: i've been meaning to hack up a quick automatic String method generator for constants actually
<rogpeppe> fwereade_: then if you need to debug, you just run it to generate the necessary source code, and remove when done.
<fwereade_> rogpeppe, yeah, if something like that was easy to add I'd be fine with it -- but doing it by hand pushes me over to "just use strings" ;)
<fwereade_> rogpeppe, but I generally prefer to keep my source code modifications to an absolute minimum when debugging things
<fwereade_> rogpeppe, so I think it'd only really work for me if it were a code-generation step
<fwereade_> rogpeppe, that was always applied
<rogpeppe> fwereade_: i think it's probably my experience with the go core. there, you never get symbolic string constants - constants of that nature are always iota ints. and that works pretty well.
<fwereade_> rogpeppe, "always code as though the maintainer will be an angry lunatic who knows where you live" ;)
<rogpeppe> :-)
<fwereade_> rogpeppe, TheMue: addressed https://codereview.appspot.com/7137044 I think
<rogpeppe> fwereade_: LGTM
<rogpeppe> fwereade_: and on doc-entity-creation too
<fwereade_> rogpeppe, awesome, tyvm
<TheMue> Ah! It's working. One should use the EC2 console right.
<TheMue> rogpeppe, fwereade: https://codereview.appspot.com/6949044/ is in for review again.
<TheMue> fwereade: Now I'll continue with your reviews.
<fwereade> TheMue, thanks
<niemeyer> Morning all
<fwereade> TheMue, https://codereview.appspot.com/7137044/ should be quick and easy if you haven't started the others yet
<fwereade> niemeyer, heyhey
<TheMue> niemeyer: Hiya.
<fwereade> niemeyer, thanks for the comments on death-and-destruction, I think your points are now addressed or justified
<TheMue> fwereade: Started with the ...5059, but will now do this one first.
<fwereade> niemeyer, btw, you know all the stateful table-based watcher tests we have?
<niemeyer> fwereade: Cool, thanks, I'll have a look
<niemeyer> fwereade: Yeah
<fwereade> niemeyer, I think I've convinced myself that they're just entirely Bad, and should be replaced with literate inline tests (broken out into separate scenarios where possible)
<fwereade> niemeyer, and also that many of them are just testing way too much -- "add two services", "add three services", etc
<fwereade> niemeyer, general agreement?
<niemeyer> fwereade: Yeah, definitely
<fwereade> niemeyer, I don't want to spend too much time on it but I have to take an axe to machine_test.go to finally get rid of AddUnitSubordinateTo, so I think that at least will just get a full rewrite
<mgz> fwereade: was dimitern having more internet woes?
<fwereade> mgz, not that I know of
<niemeyer> fwereade: Ah, makes sense
<TheMue> fwereade: LGTM (and lunchtime ;) ).
<fwereade> TheMue, <3
<fwereade> mgz, although now I think of it he did mention having an engineer come round; shall I text him?
<mgz> fwereade: not urgent, but hopefully he gets online again :)
<fwereade> mgz, I'll ping him but will be off for lunch in a mo myself
<mgz> wallyworld: looks like bug 1091669
<_mup_> Bug #1091669: GET Public-readable container's object with another user credentials is raising 403(Forbidden) exception <OpenStack Object Storage (swift):New> < https://launchpad.net/bugs/1091669 >
<mgz> which looks like a dupe of bug 1020722
<_mup_> Bug #1020722: swift_auth middleware disallows access to public Swift URLs <OpenStack Object Storage (swift):In Progress by dan-prince> < https://launchpad.net/bugs/1020722 >
<mgz> jam: want to hop on mumble?
<rogpeppe> fwereade, TheMue, niemeyer: a small CL: https://codereview.appspot.com/7128047
<fwereade> rogpeppe, gtg lunch I'm afraid, will be slightly extended today
<rogpeppe> fwereade: np
<rogpeppe>  niemeyer: trivial? https://codereview.appspot.com/7137048
<niemeyer> rogpeppe: LGTM
<rogpeppe> niemeyer: thanks
<wallyworld> fwereade: g'day, i've pushed some changes to the mp we have been discussing, are you able to take another look?
<niemeyer> fwereade: Provided another partial review on https://codereview.appspot.com/7095059/
<niemeyer> fwereade: Hopefully not too off track..
<rogpeppe> niemeyer: there's a discussion about the juju-core API about to happen, if you fancy joining in
<rogpeppe> hazmat: is it happening?
<hazmat> rogpeppe, its happening
<hazmat> mramm, rogpeppe https://plus.google.com/hangouts/_/7eba46242c127e6b2d38ec9e3db2a8a5898dece9
<rogpeppe> hazmat: ok, joining now
<fwereade> niemeyer, responded
<niemeyer> fwereade: Cool, will see after the meeting, thank you
<fwereade> niemeyer, cheers
<niemeyer> fwereade: ping
<fwereade> niemeyer, pong
<niemeyer> fwereade: Hey
<niemeyer> fwereade: Just going over your review now
<TheMue> fwereade: Thx for the LGTM.
<fwereade> niemeyer, cool, thanks
<fwereade> TheMue, yw :)
<niemeyer> fwereade: Some of it is mainly doc it seems
<fwereade> niemeyer, yeah, clearly I've failed to explain endpointRemoveOps well
<niemeyer> fwereade: Some of it seems pretty subjective, requiring knowledge of the whole plan to extract the true meaning
<TheMue> Sh**, building a KVM VM for MAAS failed.
<fwereade> niemeyer, that has been on my mind; the trouble is I'm not sure how to do it without smearing multiple copies of the same information through half a dozen places
<niemeyer> fwereade: Imagine I say something like this: "Hey William, would you mind implementing a method that returns the operations that are needed to maintain global consistency with respect to the service identified by the supplied endpoint during the relation's removal?"
<niemeyer> fwereade: :)
<fwereade> niemeyer, of *course* I would -- what magnificent wordsmith created that wonderful prose!?
 * fwereade kids
<fwereade> niemeyer, it may be that I've picked the wrong structure for the code
<fwereade> niemeyer, hmm, a thought
<niemeyer> fwereade: I think it'd simplify matters in that specific case to drop the logic that checks mode and service name from within the method
<niemeyer> fwereade: Have you considered the use of (leavingScope bool, destroyService string) instead of mode?
<niemeyer> Or destroyingService, rather
<fwereade> niemeyer, destroyingService is misnamed in that case
<niemeyer> fwereade: You actually already opted for the latter in one case
<fwereade> niemeyer, when that var means "this is the service that's being destroyed", that is indeed what I named it
<fwereade> niemeyer, in endpointRemoveOps it means "the service that is relevant to the supplied mode"
<fwereade> niemeyer, which, in the case of modeLeaveScope, is the service of the departing unit (ie the service that does not need to be removed)
<fwereade> niemeyer, would it maybe help if mode and serviceName were packed into a struct with documentation right there?
<niemeyer> fwereade: Gosh.. it's worse than I thought then :-(
<niemeyer> fwereade: serviceName is a wildcard variable that refers to *a* service name.
<fwereade> niemeyer, I was not explicit that serviceName's meaning was modified by mode, but I feel that was expressed a touch robustly ;)
<fwereade> niemeyer, yes: a service name whose actual meaning is determined by mode
<niemeyer> fwereade: Yeah.. this is super confusing
<fwereade> niemeyer, the obvious answer is to duplicate the common operations in the various places we need to remove relations
<fwereade> niemeyer, I'm not quite sure that's an actual improvement though
<niemeyer> fwereade: I'm not suggesting duplication.. I'm just looking for ways to clarify what is actually going on there
<niemeyer> fwereade: I obviously got a few things wrong in an honest attempt to understand what was written
<fwereade> niemeyer, what do you think of the struct idea?
<niemeyer> fwereade: Yeah, that seems like an improvement for sure
<fwereade> niemeyer, I wondered about tying mode and serviceName together but worried that it would be dismissed as premature fancification :)
<fwereade> niemeyer, clearly it's not though
<niemeyer> fwereade: One of the core points is that it shouldn't be "serviceName"
<niemeyer> fwereade: It doesn't make sense to have leaveScope.serviceName, for example
<fwereade> niemeyer, even if it's always the name of a service?
<niemeyer> fwereade: and that's one of the issues
<fwereade> niemeyer, it identifies the service that cannot be removed as a result of the change
<niemeyer> fwereade: Yes, and what the heck that means again? :-)
<niemeyer> What's a service that "cannot be removed" as a result of which change?
<fwereade> niemeyer, I had hoped that death-and-destruction.txt made this stuff clearer than it managed to
<fwereade> niemeyer, when removing a relation during a destroy operation, we know that both services are Alive
<fwereade> niemeyer, when leaving scope, we don't
<fwereade> niemeyer, but when leaving scope we know that a unit is leaving scope; and hence we know its service has at least one unit; and hence we know we can't ever need to remove it
<fwereade> niemeyer, we have no such guarantees re the service on the other side of the relation, so we need to check its state to see whether it's Dying and only referenced by the relation being removed
<fwereade> niemeyer, this is not the biggest deal ever -- we could just check both services just the same when leaving scope -- but it's an extra roundtrip that seemed like a bad idea
<niemeyer> fwereade: Thanks for the explanation
<niemeyer> fwereade: I don't think I made myself clear in the previous point, sorry
<niemeyer> fwereade: The point is that there's a specific meaning for that service name
<niemeyer> fwereade: In the context of the caller
<rogpeppe> lunch
<niemeyer> fwereade: It doesn't make sense to call that leaveScope.theServiceThatCantBeRemoved
<niemeyer> fwereade: You see what I mean?
<niemeyer> fwereade: There's a lot of implicit knowledge that must be distilled into something tangible when we reach the code
<niemeyer> fwereade: The death-and-destruction document is great as reasoning for what and how things must be put in place
<niemeyer> fwereade: We just need to be careful to implement it in a way we'll still understand when we come back in two weeks
<fwereade> niemeyer, hmm: I thought we were talking about something like `type relationRemoval struct {mode removeMode; info string}`
<niemeyer> fwereade: Stuff like lack of docs, unnamed results, magic variables that mean completely different things based on caller context, etc, will make this process very painful
<niemeyer> fwereade: Hmm.. I don't think that helps much
<niemeyer> fwereade: You're just putting the exact same parameters in a struct
<niemeyer> fwereade: And what's "info string"?
<fwereade> niemeyer, or "payload", or whatever
<niemeyer> fwereade: What's that?
<niemeyer> fwereade: What's a payload of a relationRemoval?
<fwereade> niemeyer, its meaning depends on the value of the mode -- are you implacably opposed to this technique, or are you just pointing out that the names I have so far come up with are rubbish?
<niemeyer> fwereade: I think you're moving in good direction, but the underlying structure still needs some clarification.. I don't think the change is big, btw
<fwereade> niemeyer, sure -- I am as keen as you are to find ways to simplify this
<niemeyer> fwereade: Cool, just a sec
<fwereade> niemeyer, would you expand on your leaveScope.something idea?
<niemeyer> fwereade: Yeah, I'm just writing down some sample
<fwereade> niemeyer, (incidentally -- the unnamed result is sometimes useful but not always, and so it's left unnamed in the cases where it's not used; more than one person has flagged this, and I'm somewhat baffled by it)
<niemeyer> fwereade: Okay, I had another pass at the code, and here is another proposal
 * fwereade listens
<niemeyer> fwereade: As far as the local logic is concerned, and as long as I'm not missing something else (possibly I am), there are only two factors that distinguish what these methods are doing:
<niemeyer> fwereade: 1) A service is being destroyed
<niemeyer> fwereade: 2) A service is necessarily alive
<niemeyer> fwereade: Is that correct?
<fwereade> niemeyer, I think that is a reasonable way of looking at it, yes
<niemeyer> fwereade: Can we express this as f(survivingService, destroyingService string)?
<fwereade> niemeyer, I don't think it quite works
<niemeyer> fwereade: Okay, let's see then
<fwereade> niemeyer, I think the problem is in (2)
<fwereade> niemeyer, trying to express the cases in relevant terminology
<fwereade> niemeyer, actually no
<fwereade> niemeyer, thinking, don't wait for me to type ;)
<niemeyer> fwereade: Hehe :)
<fwereade> niemeyer, there are sometimes services we never want to modify; all other services always need to be decreffed; sometimes a decreffed service may need to be removed
<fwereade> niemeyer, so actually we *can*, but we need to fix the names
<fwereade> niemeyer, ignoreService, maybeDestroyService?
<fwereade> niemeyer, or perhaps destroyingService, leavingScopeUnit (ha, eww) is cleanest
<fwereade> niemeyer, departingUnit
<fwereade> (btw are we doing the kanban review?)
<fwereade> niemeyer, I worry that those won't make the call sites very clear though
<niemeyer> fwereade: Hmm, that looks good
<fwereade> niemeyer, definitely worthy of further investigation though
<niemeyer> fwereade: So what would departingUnit be?
<fwereade> niemeyer, the unit whose departure from scope caused the removal of the relation
<niemeyer> fwereade: The actual *Unit?
<fwereade> niemeyer, I was expecting a name
<niemeyer> fwereade: Can we have just that?
<fwereade> niemeyer, all I need from it is the service name
<fwereade> niemeyer, maybe cleaner to pass in the unit
<niemeyer> fwereade: So departingUnit is a service name? That's confusing.
<fwereade> niemeyer, no... it's a unit name, from which I will extract a service name
<niemeyer> fwereade: Might as well be the *Unit, if we have that in the call sites
<niemeyer> fwereade: (that need to provide it)
<fwereade> niemeyer, yeah, it's available
<niemeyer> fwereade: Cool, that'll be a lot clearer
<fwereade> niemeyer, ok, fantastic -- might even let me write removeOps such that I don't need endpointRemoveOps
<fwereade> maybe
<niemeyer> fwereade: Woot
<niemeyer> fwereade: Okay, a parenthesis just for a second now that we're good on that front:
<niemeyer> fwereade: <fwereade> niemeyer, (incidentally -- the unnamed result is sometimes useful but not always, and so it's left unnamed in the cases where it's not used; more than oneperson has flagged this, and I'm somewhat baffled by it)
<fwereade> niemeyer, ah yes
<niemeyer> fwereade: I *think* I was talking about something else, since hopefully it was uncontroversial
<niemeyer> fwereade: By unnamed result I meant in the method signature
<fwereade> niemeyer, ahhh ok, sorry, I completely misunderstood
<niemeyer> fwereade: func f() (A, B, bool, error) isn't great
<niemeyer> fwereade: Since the bool is completely unreadable
<niemeyer> fwereade: All one can tell is that there's something being returned that is true or false :)
<fwereade> niemeyer, I write doc comments with the expectation that people will read them ;p
<niemeyer> fwereade: Funny enough, the parameter was *also* undocumented in that case :-)
<niemeyer> fwereade: Well, or maybe it was and I just can't read
<niemeyer> fwereade: Yeah, "and whether"
<fwereade> niemeyer, I think we're talking about the bool documented as "whether those operations will lead to the relation's removal"
<niemeyer> fwereade: Cool, my bad
<fwereade> niemeyer, np
<niemeyer> fwereade: Still, the name would be useful for folks like me that can't read ;-)
<fwereade> niemeyer, good idea, cheers
<niemeyer> fwereade: I hope you're okay with the piecemeal reviews I've been providing
<fwereade> niemeyer, absolutely
<niemeyer> fwereade: The logic is so dense that I find tricky to sustain attention for too long
<fwereade> niemeyer, I am not unfamiliar with that feeling myself ;)
<fwereade> niemeyer, and it's useful besides
<fwereade> niemeyer, so np at all
<niemeyer> fwereade: Neat, thanks
<niemeyer> fwereade: I'll provide more pieces today
<fwereade> niemeyer, lovely, thanks
<niemeyer> Some food first, though!
<niemeyer> biab
<fwereade> niemeyer, enjoy
<rogpeppe> back
<rogpeppe> oops, did i miss the kanban meeting? or didn't it happen?
<TheMue> rogpeppe: it didn't happen, right now niemeyer is at lunch
<rogpeppe> TheMue: cool. i went for a lunchtime walk and totally forgot about it.
 * TheMue uses this time to shrink his vmware image, grown too large during testing for maas
<rogpeppe> TheMue: have you actually verified that your retry changes fix the problem?
<rogpeppe> TheMue: that is, have you tried running live tests with your branch?
<TheMue> rogpeppe: yep, on southeast and east with --debug, looks good
<rogpeppe> TheMue: did you run it with -gocheck.vv and see the attempts get more infrequent?
<rogpeppe> TheMue: i have a suspicion that the changes you've made won't change the usual redial case
<TheMue> rogpeppe: not that way, but i can do it once my image is up again
<rogpeppe> TheMue: the place you've changed (in juju.NewConn) is very rarely hit
<TheMue> rogpeppe: i have been able to see it like in the lp bug before, that is now gone
<TheMue> rogpeppe: Dave suggested it
<rogpeppe> TheMue: i'm pretty certain that the place that needs an exponential backoff is actually the mgo package
<TheMue> rogpeppe: there I've placed it before (but linear). you've seen Daves comments?
<rogpeppe> TheMue: looking
<rogpeppe> TheMue: i don't believe dave has it right
<rogpeppe> TheMue: the problem is that mgo.DialWithInfo continually retries with no delay
<TheMue> rogpeppe: yes, that's why I first placed it there
<rogpeppe> TheMue: i didn't see a CL on the mgo package
<rogpeppe> TheMue: can you point me to it?
<TheMue> rogpeppe: earlier changes in the same cl
<rogpeppe> TheMue: the correct fix would be in the mgo package itself
<TheMue> rogpeppe: so you say we have three nested retry loops?
<rogpeppe> TheMue: i think we've only got two, and one in the usual case.
<mramm> rogpeppe: I had a doctors appointment so I had to miss the kanban meeting
<rogpeppe> mramm: ah, nice coincidence!
<rogpeppe> TheMue: mgo has a redial loop and juju.Conn has a redial loop, but only in the ErrUnauthorized case, which only happens for as long as juju bootstrap takes to run, which is about 0.5s usually
<rogpeppe> TheMue: 60s is vast overkill there
<rogpeppe> TheMue: but it's there to cater for VM scheduling variance
<rogpeppe> TheMue: BTW it's nice to see an exponential backoff AttemptStrategy, but given that the usual point of exponential backoff is to avoid thundering herds, i'd be tempted to add a random element too.
<TheMue> rogpeppe: as own behaviour?
<rogpeppe> TheMue: ?
<TheMue> rogpeppe: the random element. do you think of a random time behaviour or a random factor inside the exponential behaviour? and if so, then in which way?
<rogpeppe> TheMue: i'd start with a random delay, then multiply that by a constant factor, adding or subtracting a smallish random delta each time
<rogpeppe> TheMue: you might want to google for best practice in this area though - the above is just a guess
<TheMue> rogpeppe: what's the motivation behind it?
<rogpeppe> TheMue: if a state server goes down and we've got 10000 clients, each client is going to see the server go down at the same time, then initiate the same redial loop.
<rogpeppe> TheMue: so we'll get a pattern of huge surges, when we want the redials spread out over time
<rogpeppe> TheMue: but as i said, juju isn't doing the redial in this case, so the right fix is in mgo, and you'll have to ask niemeyer what his preferred method is there.
<TheMue> rogpeppe: ok, randomize the initial delay and then also a random base number to get a more shallow or more steep exponential behavior, sounds good
<rogpeppe> TheMue: no, i think the exponent should probably be constant. but if we perturb the numbers each time, the different clients will tend to diverge (although i'd have to think hard if i wanted to show that mathematically :-])
<rogpeppe> TheMue: but for the time being, i think this CL isn't helping anything. i'd leave the exponential backoff alone for now, we don't need it.
<TheMue> rogpeppe: all with the same random seed based on the same utc. ;)
<rogpeppe> TheMue: i wouldn't seed based on the utc
<TheMue> rogpeppe: /me too, just a joke
<TheMue> rogpeppe: could you please enter your thoughts into the cl as a comment so that we can update Dave?
<rogpeppe> TheMue: will do
<TheMue> rogpeppe: great, tyvm
<rogpeppe> TheMue: here's a fun way of generating a seed without using time or importing crypto/rand: http://play.golang.org/p/Bd2cIFoJ-L
<niemeyer> fwereade: I didn't know "white lie" was an international term, btw :)
<fwereade> niemeyer, heh, nor did I
<niemeyer> fwereade: ping
<fwereade> niemeyer, pong
<niemeyer> fwereade: Yo
<fwereade> niemeyer, how goes?
<niemeyer> fwereade: Trying to understand the logic in Service.destroyOps that builds the relation removal ops
 * fwereade looks at it
<niemeyer> fwereade: It looks like the way in which the ops are prepared is only guaranteed by a count
<niemeyer> fwereade: But there's apparently no guarantee that the count, being the same, refers to the right relations
<fwereade> niemeyer,     asserts := D{{"life", Alive}, {"txn-revno", s.doc.TxnRevno}}
<fwereade> niemeyer, ah, no, that's not necessarily enough is it...
 * fwereade thinks
<niemeyer> fwereade: It might be
<niemeyer> fwereade: Since an inc or dec must touch it
<fwereade> niemeyer, yeah, I'm trying to remember ;)
<niemeyer> fwereade: But then what's the role of the relation count there?
<fwereade> niemeyer, not sure what the question is exactly, but let me try:
<dimitern> fwereade: can you take a look at this: https://codereview.appspot.com/7133043/ - if you think it's fine to land this it'll simplify my branch on bootstrapping
<fwereade> niemeyer, if getting the service's relations gets us the "wrong" number, we should definitely refresh and retry
<fwereade> dimitern, soon :)
<dimitern> fwereade: 10x
<fwereade> niemeyer, if we have the right number, that's enough reason to speculatively go ahead
<fwereade> niemeyer, but we can only determine whether it's ok to remove the service by counting the relations we need to remove and seeing whether it's the same as the relations we know we have
<fwereade> number of relations we know we have^^
<fwereade> niemeyer, if it isn't, then at least one relation is left Dying, and we can't remove the service until the relation is removed
<fwereade> niemeyer, not sure if I've covered what you're asking...
<niemeyer> fwereade: Yeah, I think so
<fwereade> niemeyer, still not sure about whether there's a hole, though, need to think about that separately
<niemeyer> fwereade: I'll recommend a comment in that location
<niemeyer> fwereade: and a second one on removeCount == RelationCount
<fwereade> niemeyer, ok, sgtm
<niemeyer> fwereade: Explaining how we can reach that state
<niemeyer> fwereade: This is the first one:
<niemeyer> / This is just an early bail out. The relations obtained may still be wrong,
<niemeyer> / and that'll be asserted by the txn-revno comparison below.
<niemeyer> Actually, I'll reword slightly, but that's the idea
<fwereade> niemeyer, (I have thus far been unable to construct a sequence of events that screws up the relations without also hitting txn-revno)
<fwereade> niemeyer, yep, sounds sane
<rogpeppe> dimitern: looking
<fwereade> (ty rogpeppe, sorry dimitern, still a bit distracted by this conversation)
<dimitern> fwereade: np :)
<dimitern> rogpeppe: 10x, and the other one closely related is https://codereview.appspot.com/7086055/ - I really'd like to have these landed soon
<niemeyer> fwereade:
<niemeyer> / Relations that aren't Alive won't have been removed because
<niemeyer> / they're already on their way to death by other means.
<niemeyer> / If that's the case, do not remove the service either,
<niemeyer> / and let it be dealt with together with the dying relations.
<niemeyer> fwereade: Is that it?
<fwereade> niemeyer, yeah, well put
<niemeyer> fwereade: Supa
<fwereade> niemeyer, just reproposed with nicer Relation.removeOps
<fwereade> niemeyer, endpointRemoveOps did disappear :)
<niemeyer> fwereade: Woohay!
<fwereade> niemeyer, btw, I think we can drop the txn-revno check on service destroy, it's an awfully heavy stick to use
<niemeyer> fwereade: Oh
<fwereade> niemeyer, we can just add a D{{"life", Dying}} assert for each relation that isn't removed
<niemeyer> fwereade: and how do we tell that there are no new relations?
<fwereade> niemeyer, in that case, for the relation count to match but the relations to be wrong, the missing relation is guaranteed to fail a txn
<rogpeppe> dimitern: both reviewed
<dimitern> rogpeppe: tyvm
<fwereade> niemeyer, (we can't hit that situation without an equal number of adds and removes)
<fwereade> niemeyer, (we also need to assert unitcount in that case ofc)
<niemeyer> fwereade: Well, yes.. and?
<niemeyer> fwereade: I mean, that's a real situation
<fwereade> niemeyer, right, and if we have an assert for the state of every initially-known relation we're fine, because any of those that are removed will fail an assert and we'll retry
<fwereade> niemeyer, right?
<niemeyer> fwereade: Sounds sane.. thinking
<fwereade> niemeyer, I'd really like to drop the txn-revno check, it changes at the drop of a hat ;p
<niemeyer> fwereade: Yeah, that should be sane.. assert on obtained relation count == service relation count, plus asserts on all individual relations not removed
<fwereade> niemeyer, cool
<niemeyer> fwereade: The only thing that could blow that scenario up is  a relation added without being perceived, and that can't happen in that case
<fwereade> niemeyer, yeah, I think so
<fwereade> niemeyer, (I'm *really* coming to appreciate mgo/txn :))
<niemeyer> fwereade: Neat, that's a relief :)
<niemeyer> fwereade: Btw, I've just sent the notes on service.go
<fwereade> niemeyer, lovely, thanks
<fwereade> niemeyer, let me know what you think of the relation.go changes :)
<niemeyer> fwereade: state/service.go:422 is the only thing that is perhaps worth of a sync up
<niemeyer> fwereade: Cool, will refresh and look again
<fwereade> niemeyer, just saw that one, thinking a mo
<fwereade> niemeyer, heh, let me hunt around a mo, I think that's addressed in the followup...
<fwereade> niemeyer, yeah, you're absolutely right, I forgot to backport that one
<fwereade> niemeyer, although actually I think it should maybe assert D{{"life", Dying}, {"unitcount", 1}, {"relationcount", 0}}
<fwereade> niemeyer, just because txn-revno is less stable than those
<niemeyer> fwereade: Which unit is that 1?
<niemeyer> fwereade: Well, I suppose there will be asserts on the unit itself on the same transaction
 * niemeyer looks again
<fwereade> niemeyer, that's the unit being removed
<fwereade> niemeyer, it's in removeUnitOps
<niemeyer> fwereade: Yeah, DocExists, cool
<fwereade> (heh, laura is asking cath for a "darky-light black pen")
<fwereade> (apparently, what she meant was a "brown pencil")
<niemeyer> fwereade: ROTFL
<niemeyer> fwereade: darky-light is a much nicer name than brown, FWIW
<niemeyer> :-)
<niemeyer> fwereade: The new removeOps is great
<fwereade> niemeyer, yeah, thanks for pressing me on that one
<niemeyer> fwereade: np
<niemeyer> fwereade: Okay, those were the big chunks it seems.. I'll do a pass on everything else now
<fwereade> niemeyer, sweet
<niemeyer> fwereade: The test changes makes it pretty obvious that the changes performed are an API improvement
<niemeyer> fwereade: Besides making things right
<fwereade> niemeyer, cool, glad you like it :)
<fwereade> niemeyer, bah, supper is on the table, I should stop
<fwereade> niemeyer, I'll pop back on later for a final check/propose if I can
<niemeyer> fwereade: Beautiful.. you'll probably have a LGTM with minor comments if any
<rogpeppe> i have to go now.
<niemeyer> rogpeppe: Have a good one
<rogpeppe> niemeyer: i think i've finally arrived at a good idea of how the API architecture will work
<rogpeppe> niemeyer: it'd be good to have a discussion about it to see if i'm on crack or not.
<rogpeppe> niemeyer: tomorrow, perhaps
<niemeyer> rogpeppe: Ah, great
<niemeyer> rogpeppe: Certainly up for the talk
<niemeyer> fwereade: Okay, reviewed everything
<rogpeppe> niemeyer: in particular, i think i finally understand how things can work with multiple state servers (both mongo and API)
<rogpeppe> niemeyer: so we have something to aim towards, even if we don't go all the way immediately
<niemeyer> fwereade: All good.. will just wait for the service.go repropose
<rogpeppe> g'night all
<niemeyer> rogpeppe: Cool, night!
<niemeyer> fwereade: There's one trivial on a test too, optional
<fwereade> niemeyer, good point re test, proposed, bbiab
<niemeyer> fwereade: Please ping when you're back
<niemeyer> fwereade: Just want to run a quick idea by you before we're over with this
<fwereade> niemeyer, back
<niemeyer> fwereade: I was wondering if we actually need the Dying relations at all
<fwereade> niemeyer, oh, yes, go on?
<niemeyer> I mean, the assert on the Dying relations
<niemeyer> fwereade: The alive case has an assert on {"relationcount", D{{"$gt", removeCount}}}, which means it must necessarily be alive anyway
<niemeyer> fwereade: Because it has relations to justify it, whether it was the relations we've seen or not
<niemeyer> fwereade: Oh, wait, nope.. that's not it
<fwereade> niemeyer, no -- the relation count only goes down when the relation is removed
<niemeyer> fwereade: If there are other relations, we must check those
<fwereade> niemeyer, yes, I think so
<niemeyer> fwereade: So the relationcount should be the one we've seen, shouldn't it?
<niemeyer> fwereade: Exactly, that is
<fwereade> niemeyer, yes, you're completely right
<fwereade> niemeyer, curse that crack
<fwereade> niemeyer, units can have $gt, relations must be exact
<niemeyer> fwereade: Cool
<niemeyer> fwereade: It's slightly suboptimal that there's a hidden requirement in that logic
<niemeyer> fwereade: destroyOps must necessarily have an assertion on the existence of the relation document for it to be reliable
<niemeyer> fwereade: If len(ops) > 0
<fwereade> niemeyer, I wondered about adding an assertDying bool param there
<fwereade> niemeyer, to get the relation to generate the assert itself
<fwereade> niemeyer, seemed slightly less good, especially since we're already using len(ops) == 0 as a test for already-dying elsewhere
<fwereade> niemeyer, I'd be open to arguments that errAlreadyDying would work just as well and maybe be clearer
<niemeyer> fwereade: Yeah, that'd be much better
<fwereade> niemeyer, ok, cool
<niemeyer> fwereade: The point is that we won't foolishly change destroyOps to not return that error without seeing the consequences
<niemeyer> fwereade: But it's easy to change what len(ops) == 0 means
<niemeyer> fwereade: Unintentionally
<fwereade> niemeyer, yep, consider me convinced
<fwereade> niemeyer, proposed again :)
<niemeyer> fwereade: Awesome, looking
<niemeyer> fwereade: There's a len(ops) == 0 in Service.Destroy.. does that have to change?
<niemeyer> fwereade: Or perhaps not change, but a new case added?
<fwereade> niemeyer, I didn't think of that, but you're quite right again
<niemeyer> fwereade: hmm.. I guess not
<niemeyer> fwereade: That's service.destroyOps, though
<fwereade> niemeyer, yeah, but the same arguments apply
<fwereade> niemeyer, I should just be explicit
<niemeyer> fwereade: Cool
<fwereade> niemeyer, I'll just make the errAlreadyDying message non-entity-specific
<niemeyer> fwereade: Sounds good
<niemeyer> fwereade: a pre LGTM on this
<niemeyer> fwereade: The branch seems ready, thanks a lot for all the work
<fwereade> niemeyer, sweet :D
<fwereade> niemeyer, tyvm
<davecheney> mramm: ping
<mramm> davecheney: pong
<mramm> I'm available in skype, and am in the google hangout
<davecheney> mramm: that google hangout is fine
<davecheney> but you still haven't invited me
<davecheney> just some other david cheney
#juju-dev 2013-01-18
<rogpeppe> fwereade, dimitern, jam1, wallyworld, mgz: morning!
<fwereade> rogpeppe, et al, heyhey
<wallyworld> g'day
<rogpeppe> fwereade: ping
<fwereade> rogpeppe, pong
<rogpeppe> fwereade: i wonder if you could have a look at some speculative stuff i've been working up before i run it past gustavo later
<fwereade> rogpeppe, sure
<rogpeppe> fwereade: it's to do with where we're aiming with the API server and how the machine agent will deal with that
<fwereade> rogpeppe, ok, cool
<rogpeppe> fwereade: here's some pseudocode: http://paste.ubuntu.com/1544082/
<rogpeppe> fwereade: the idea is to solve the problem of multiple API servers and state servers
<rogpeppe> fwereade: the main additions are the API server (obviously) and the publisher and locationer (provisional names) workers
<rogpeppe> fwereade: i also have a sketch for what i'm doing right now, which is a transitional step where we don't have multiple servers, but we are running an API server and connecting to it
<rogpeppe> fwereade: http://paste.ubuntu.com/1544098/
<fwereade> rogpeppe, still processing
<rogpeppe> fwereade: np
<rogpeppe> fwereade: i hope the pseudocode is reasonably clear
<fwereade> rogpeppe, yeah, I think so, I'm mainly just trying to get my head around the bootstrap process
<fwereade> rogpeppe, and how it fits in with davecheney's stater work
<rogpeppe> fwereade: i'm not sure i know about that
<rogpeppe> fwereade: what's the stater work?
<fwereade> rogpeppe, they're in review at the moment
<fwereade> rogpeppe, https://code.launchpad.net/~dave-cheney/juju-core/069-worker-stater-I/+merge/143629
<fwereade> rogpeppe, https://code.launchpad.net/~dave-cheney/juju-core/071-cmd-jujud-stater-I/+merge/143638
<fwereade> rogpeppe, I guess the heart of the problem is that you're working on the state API and he is working on HA state
<fwereade> rogpeppe, and they have a fair number of collision points...
<rogpeppe> fwereade: yeah
<rogpeppe> fwereade: it would be good to chat with dave at some point, but i haven't been online at the same time this year so far
<fwereade> rogpeppe, heh, I would suggest trying to arrange something
<fwereade> rogpeppe, communication by CL is ok for CLs but not so much for major design considerations
<fwereade> rogpeppe, but modulo bootstrap worries, and concerns about managing what machines will be running mongo (as opposed to just the API server) I think it's roughly sane
<rogpeppe> fwereade: bootstrap worries?
<rogpeppe> fwereade: yeah, i have entirely delegated the decisions about which machines run what to a higher level.
<fwereade> rogpeppe, I don't see how to fit it together with davecheney's stuff
<fwereade> rogpeppe, I'm on the fence as to whether state server maintenance should be separate from the machine agent
<rogpeppe> fwereade: i'm not sure there's a problem. from what i've seen so far, i think it'll fit in ok
<rogpeppe> fwereade: i think it's fine to be part of the MA
<rogpeppe> fwereade: (i'm assuming dfc's branch does that)
<fwereade> rogpeppe, yeah, me too, but that definitely collides badly with dave's direction
<fwereade> rogpeppe, nope
<fwereade> rogpeppe, IMO it's crack, but it's apparently pre-agreed with niemeyer
<rogpeppe> fwereade: oh, i see
<rogpeppe> fwereade: the question in my mind is how other things are going to know how to contact the state
<fwereade> rogpeppe, I also think that doing it in the MA will be fiddly, but still probably the right thing
<rogpeppe> fwereade: i don't think it should be *too* fiddly. i'll have a quick check to see how it fits into my plan.
<fwereade> rogpeppe, cool
<dimitern> rogpeppe, fwereade, wallyworld: morning :)
<rogpeppe> dimitern: hiya
<fwereade> dimitern, heyhey
<rogpeppe> fwereade: i think the stater worker fits in quite easily actually: http://paste.ubuntu.com/1544194/
<rogpeppe> fwereade: but i think i agree that it shouldn't be a new subcommand
<fwereade> rogpeppe, I don't see a stater involved in the bootstrap machine
<rogpeppe> fwereade: i'm not sure it needs to, although there's no harm if it does
<rogpeppe> fwereade: on a bootstrap machine, the state must be started manually
<fwereade> rogpeppe, I'm -1 on anything that makes machine 0 work differently from any other machine with the same responsibilities
<rogpeppe> fwereade: or probably bootstrap itself would do it
<fwereade> rogpeppe, we may need to be clever in bootstrap, but after that it should be just the same as any other
<rogpeppe> fwereade: machine 0 is fundamentally different, because there's no existing state to connect to
<rogpeppe> fwereade: but i agree in principle
<fwereade> rogpeppe, once there's state, if we treat it differently, I think we're Doing It Wrong
<rogpeppe> fwereade: ok, i think this might be better: http://paste.ubuntu.com/1544253/
<TheMue> Morning all.
<dimitern> TheMue: morning
<TheMue> dimitern: Hiya
<fwereade> TheMue, heyhey
<fwereade> rogpeppe, yeah, looks sane to me; lots of other details (do machine jobs change (I think they will have to), how do we deal with job changes (significant differences in task-running)) etc
<fwereade> rogpeppe, except, well, hmm
<rogpeppe> fwereade: yes, job changes are an interesting question.
<fwereade> rogpeppe, I guess I still don't quite know what a stater will be doing... watching for a job change and killing itself?
<fwereade> rogpeppe, (sorry, removing mongo and then killing itself?)
<rogpeppe> fwereade: yeah, when jobs can change
<rogpeppe> fwereade: in my sketch above, it does nothing after initial check-and-install
<fwereade> rogpeppe, indeed, that's what I'm waffling on about
<rogpeppe> fwereade: perhaps i should work out how we might deal with changing jobs. there are some interesting issues around that (perhaps we'd need to keep a count of api servers and state servers in the state, and disallow a job change if it will lose the last one of either)
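The guard rogpeppe sketches in words — keep a count of API/state servers and disallow a job change that would lose the last one — could be expressed like this. This is a sketch of the idea only; the function and error text are hypothetical, not juju-core API:

```go
package main

import (
	"errors"
	"fmt"
)

// removeStateServerJob refuses to drop the state-server job from a machine
// if doing so would leave the environment with zero state servers.
// stateServerCount is the current number of machines carrying that job.
func removeStateServerJob(stateServerCount int) error {
	if stateServerCount <= 1 {
		return errors.New("cannot remove the last state server")
	}
	return nil
}

func main() {
	fmt.Println(removeStateServerJob(1))
}
```

In a real transaction-based implementation this check would need an assertion on the count document so a concurrent removal can't race past it.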
<rogpeppe> wallyworld: i've just replied to https://codereview.appspot.com/7133043/; i'm interested if you think my suggestion is viable.
<fwereade> rogpeppe, yeah, I think so
<fwereade> rogpeppe, do you have a moment to revisit the destroy-* command plan?
<fwereade> rogpeppe, I'm pretty sure I convinced you of crack the other day
<rogpeppe> fwereade: lol
<fwereade> rogpeppe, but really all the possible approaches seem crackful one way or another
<rogpeppe> fwereade: what in particular was possibly crack?
<fwereade> rogpeppe, refusing if any unit's already Dying
<rogpeppe> fwereade: refusing what, sorry?
<fwereade> rogpeppe, to do anything :)
<fwereade> rogpeppe, it certainly doesn't fit well with the Destroy methods in state
<fwereade> rogpeppe, all of which are fine with dying/dead/removed entities
<fwereade> rogpeppe, that's the main problem for me
<rogpeppe> fwereade: are we talking about https://codereview.appspot.com/7138062/ here?
<fwereade> rogpeppe, not really, I'm doing destroy-machine -- with destroy-service I weakened and went with allow-if-exists
<rogpeppe> fwereade: or is this a general discussion, unrelated to any particular CL?
<fwereade> rogpeppe, but service for some reason only destroys one at a time
<fwereade> rogpeppe, (ok that is crazy too, but for another time)
<rogpeppe> fwereade: i'm a bit at sea here. are we talking about the "destroy gives an error if destroy has already been called on the enitity" decision?
<fwereade> rogpeppe, yes: from the perspective of the CLI
<rogpeppe> fwereade: (assuming there *was* such a decision, which i can't quite remember)
<fwereade> rogpeppe, such an approach is incoherent in state IMO
<fwereade> rogpeppe, "no error, it's being destroyed" makes a lot more sense at that level
<rogpeppe> fwereade: what if it's already been removed?
<fwereade> rogpeppe, no error -- the user had a *state.Whatever, it must have existed at some point, now it doesn't, everyone's happy
<rogpeppe> fwereade: not from the perspective of the CLI though
<fwereade> rogpeppe, well, yeah, the question is: what *is* correct from that perspective
<rogpeppe> fwereade: you'll get a different error if it's been removed from if it's been destroyed-but-not-removed
<rogpeppe> fwereade: yeah
<rogpeppe> fwereade: i'm ok with the rm analogy
<rogpeppe> fwereade: actually, the kill(1) analogy is probably stronger
<rogpeppe> fwereade: 'cos processes can take a while to go away
<rogpeppe> fwereade: so i think i'm fine if we get no error if the entity is Dying, but we do get an error if it's been removed.
<fwereade> rogpeppe, the philosophical issue here is then how we treat Dead
<rogpeppe> fwereade: ha
<fwereade> rogpeppe, in theory, Dead/removed are not states we should really be distinguishing between
<fwereade> rogpeppe, in practice, I don't see how we can avoid it
<fwereade> rogpeppe, and I can't tell whether that matters or not ;p
<rogpeppe> fwereade: yeah, i think i agree. perhaps we should go with "if it doesn't show up in status, you should get an error trying to talk to it"
<fwereade> rogpeppe, hmm, yeah, does status strip Dead things yet?
<fwereade> rogpeppe, (and is it even sane for it to do so?)
<rogpeppe> fwereade: i dunno
<rogpeppe> fwereade: but if we go with the above guiding principle, it doesn't matter either way, just that we're consistent in that respect
<fwereade> rogpeppe, if we strip (say) dead units, then the user won't be able to understand why they can't terminate a machine that appears to have no units
<fwereade> rogpeppe, or possibly they'll see the unit on the machine, but see it as a dangling pointer
<fwereade> rogpeppe, actually, dangling pointers in status cannot be avoided in general
<rogpeppe> fwereade: i'm ok with status showing dead things
<rogpeppe> fwereade: so then the implementation all becomes obvious, no subterfuge required
<fwereade> rogpeppe, ok, yeah, that SGTM
<fwereade> rogpeppe, so as long as an identified entity can be obtained from state, we allow a destroy, even if it's a no-op
<rogpeppe> fwereade: yup
<fwereade> rogpeppe, if any entity in the list is not found, bail
<rogpeppe> fwereade: what about the other entities on the list? do they get destroyed or not?
<fwereade> rogpeppe, (not that we can destroy them all txnly anyway, so we're still vulnerable to conns dropping half way through processing...)
<rogpeppe> fwereade: i think that we should go through trying to destroy all the entities, even if one fails
<fwereade> rogpeppe, I'm imagining we collect up all the identified entities that are Alive, error if any isn't there, and then just Destroy the ones we found
<rogpeppe> fwereade: when you say "error if any isn't there", you mean "log an error", not "return an error", yes?
<fwereade> rogpeppe, meh, maybe -- so we keep track of any such errors and then exit 1 if we hit one?
<rogpeppe> fwereade: yup
<rogpeppe> fwereade: that's consistent with rm and kill, for example
<fwereade> rogpeppe, yeah, ok, I think that's sanest then
<rogpeppe> fwereade: it means that if you've got a big list of entities to destroy, it's less fragile if one of those has just been removed.
<fwereade> rogpeppe, yeah, sgtm
<fwereade> rogpeppe, hmm... I guess if we can't tell whether state is broken, we have to just keep on trying and failing when the conn goes down half way through
<rogpeppe> fwereade: no, i think that if the error is not "entity doesn't exist", we should abort
<rogpeppe> fwereade: that's the approach we take in other places.
<fwereade> rogpeppe, (incidentally, destroy-unit won't scale sanely anyway... we're going to need something like `juju resize wordpress 27`)
<fwereade> rogpeppe, hmm
<rogpeppe> fwereade: i totally agree
<rogpeppe> fwereade: having a target number of units makes much more sense than managing individual units
<fwereade> rogpeppe, ok, so if someone added a JobManageEnviron to a machine and made it unkillable, that should stop all the other machines from being destroyed?
<rogpeppe> fwereade: no
<rogpeppe> fwereade: ok, so there are probably other errors to check for :-)
<fwereade> rogpeppe, that is not an "entity doesn't exist" error ;)
<rogpeppe> fwereade: i really want us to be able to check for state upness
<fwereade> rogpeppe, yeah, agreed
<rogpeppe> fwereade: i was thinking recently that the best place for it is probably in mgo itself.
<fwereade> rogpeppe, hmm, yeah, if that could reliably give us ErrConnDown or something that would be great
<rogpeppe> fwereade: as that's the place that is actually (potentially) in the knowledge of whether it can still talk to the state or not.
<rogpeppe> fwereade: i was thinking more of mgo.Database.Up() bool
<fwereade> rogpeppe, ah, yeah, nice
<rogpeppe> fwereade: which would return false if any operations had failed permanently
<rogpeppe> fwereade: the problem is that i'm not sure it's possible to tell that, because there's a cluster, and just because one operation has failed doesn't mean that a subsequent op will. i don't know enough about mongodb and mgo's client implementation to be able to tell how feasible this is.
<fwereade> rogpeppe, indeed
<fwereade> rogpeppe, nor do I
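The "state up-ness" check floated above (mgo.Database.Up() as an idea, not an existing API) might be approximated by pinging through whatever session abstraction is in hand. This is a self-contained sketch with a stubbed session; the pinger interface and stateUp helper are assumptions, not mgo's actual surface:

```go
package main

import (
	"errors"
	"fmt"
)

// pinger abstracts the underlying connection; a mongo session could satisfy
// a similar shape via a ping-style health check.
type pinger interface {
	Ping() error
}

// stateUp reports whether the state connection still answers.
func stateUp(p pinger) bool {
	return p.Ping() == nil
}

// deadSession models a connection that has failed permanently.
type deadSession struct{}

func (deadSession) Ping() error { return errors.New("no reachable servers") }

func main() {
	fmt.Println(stateUp(deadSession{}))
}
```

As rogpeppe notes, against a replica set a single failed operation doesn't prove the next will fail, so any such boolean is at best a heuristic.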
<fwereade> rogpeppe, hmm -- in the absence of such a thing,how about: (1) collect what we can, ignore not-Alive, log removed, return other errors directly; (2) destroy everything we collected, logging all errors; (3) return an error if we encountered any
<fwereade> rogpeppe, except, no, maybe not
<rogpeppe> fwereade: perhaps we should not try to be clever
<fwereade> rogpeppe, state.IsCannotDestroy(err)
<rogpeppe> fwereade: just make all the attempts we can, logging each error as it happens
<fwereade> rogpeppe, the only things that are actually ever undestroyable are machines
<rogpeppe> fwereade: that won't be true for long
<rogpeppe> fwereade: we'll be able to get permission-denied errors, i think
<fwereade> rogpeppe, ah, yes, cool
<rogpeppe> fwereade: i think if we go with the naive approach for now, it'll be easy to retrofit state-upness checking if/when it's available
<fwereade> rogpeppe, hmm, are the destroy-machine errors really special cases of permission errors? I think they might be
<rogpeppe> fwereade: i don't think so
<rogpeppe> fwereade: or... nah, i don't think so
<fwereade> rogpeppe, yeah, maybe it's not quite right
<rogpeppe> fwereade: it's not that you don't have permission, it's that you haven't arranged things correctly to enable the operation
<TheMue> rogpeppe: ping
<fwereade> rogpeppe, heh, I'd seen it as "*nobody* has permission to do that, are you crazy?" ;p
<rogpeppe> fwereade: lol
<rogpeppe> TheMue: ping
<rogpeppe> TheMue: pong
<rogpeppe> TheMue: pung
<rogpeppe> TheMue: peng
<TheMue> rogpeppe: pang
<TheMue> rogpeppe: Did you read Daves mail?
<rogpeppe> TheMue: yeah
<TheMue> rogpeppe: Thoughts?
<fwereade> rogpeppe, apparently "pung" is an obscene swedish word; this was pointed out when I used it in test data once
<rogpeppe> TheMue: i'm just writing a reply actually
<TheMue> rogpeppe: OK, thx.
<rogpeppe> fwereade: "pung" is also a term for a set of three of the same tile in mah-jong
<TheMue> rogpeppe: I'm a bit torn.
<fwereade> rogpeppe, when you say the "naive" approach, I'm not quite sure how naive you mean
<rogpeppe> TheMue: he's still off track - changing the delay in juju.Conn won't affect anything
<rogpeppe> fwereade: go through each thing, getting it, trying to destroy it, and log any errors. exit(1) if any error occurred.
<fwereade> rogpeppe, ok, sgtm
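The naive approach rogpeppe describes — try every entity, log each error as it happens, exit(1) if any occurred, rm/kill-style — can be sketched as below. The destroy stub is hypothetical; it stands in for the state Destroy methods under discussion:

```go
package main

import (
	"fmt"
	"os"
)

// destroy stands in for state.Machine.Destroy and friends; stubbed here so
// the sketch is self-contained.
func destroy(name string) error {
	if name == "missing" {
		return fmt.Errorf("entity %q does not exist", name)
	}
	return nil
}

// destroyAll attempts every entity even if some fail, logging each error,
// and returns 1 if any error occurred — consistent with rm and kill.
func destroyAll(names []string) int {
	code := 0
	for _, name := range names {
		if err := destroy(name); err != nil {
			fmt.Fprintf(os.Stderr, "cannot destroy %s: %v\n", name, err)
			code = 1
		}
	}
	return code
}

func main() {
	os.Exit(destroyAll(os.Args[1:]))
}
```

This keeps a long list of targets robust against one of them having just been removed, which is the fragility fwereade and rogpeppe are trying to avoid.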
<TheMue> rogpeppe: I have to try if the error still exists w/o a change, because I've been able to reproduce it last year and now, with both approaches, it's gone.
<rogpeppe> TheMue: what error?
<TheMue> rogpeppe: That one that has been the reason for that change. In the LP ticket. Or let's call it "behavior", those quick retries.
<rogpeppe> TheMue: which LP ticket?
<rogpeppe> TheMue: (the CL doesn't seem to reference it)
<TheMue> rogpeppe: Yes, missed it, but the kanban card. *lol* It's https://bugs.launchpad.net/juju-core/+bug/1084867.
<_mup_> Bug #1084867: conn: conn connections should pause between retry attempts <juju-core:In Progress by themue> < https://launchpad.net/bugs/1084867 >
<rogpeppe> TheMue: ah, i thought you were talking about an actual error
<rogpeppe> TheMue: how were you checking whether the behaviour had changed?
<TheMue> rogpeppe: Calling status with debug like shown before and after.
<rogpeppe> TheMue: i haven't seen any before and after - have you got a paste?
<TheMue> rogpeppe: No. In the tests without a change I had the same retries. Afterwards one retry sometimes has been enough and sometimes some more (with a different timing than before due to the different delay).
<TheMue> rogpeppe: But never again in such an amount.
<rogpeppe> TheMue: were you calling status immediately after bootstrap?
<rogpeppe> TheMue: (because that's the only time when retries are significant)
<TheMue> rogpeppe: Yes, as immediate as possible by typing or pasting the command.
<rogpeppe> TheMue: one mo, i'll try your branch and see if i see the same
<wallyworld> rogpeppe: hi, your suggestions are viable i think. my personal preference is still to run all the tests with the -local or -live flags since it will be a royal pita to have to comment out all the skips all the time
<TheMue> rogpeppe: Thanks
<rogpeppe> wallyworld: in practice, you'll probably just comment out the skips that you think will be fixed by the openstack changes you're making
<wallyworld> rogpeppe: that's part of the issue - it's hard to know what will be fixed each time
<rogpeppe> wallyworld: i think there's a significant advantage in being able to incrementally enable more tests
<wallyworld> since a small change to some of the provider stuff magically makes a bunch more tests pass
<rogpeppe> wallyworld: surely commenting out the skips is a single editor command - not a great hassle?
<wallyworld> sure, but it's still work to do, and changing code means there's always a possibility of accidentally committing something
<rogpeppe> wallyworld: we'll catch that!
<wallyworld> i guess my workflow the past few weeks has been very suited to having all the tests run with -live and seeing what improves with each change
<wallyworld> but if i can't change your mind then i'll put in the skips
<rogpeppe> wallyworld: i think that the tests that pass should be enabled by default in trunk, so that anyone running the tests will run them too.
<wallyworld> you mean the service double ones?
<rogpeppe> wallyworld: yeah
<wallyworld> part of the issue as well is that we seem to be making good progress (touch wood) and so there will be a lot of extra juju-core branches which simply uncomment skips, and all the overhead that involves getting reviews etc
<wallyworld> so it affects velocity
<dimitern> wallyworld: but these CLs uncommenting skipped tests should be trivial and easy to review
<wallyworld> sure, but the turnaround time is still 12-24 hours, and by that time, you have several branches queued up. but it seems your mind is made up so i will add the skips
<rogpeppe> wallyworld: i think any CL that just uncomments skipped tests could be counted as trivial and submitted immediately.
<rogpeppe> wallyworld: although i'd check that out with others to make sure that's ok
<wallyworld> \o/ ok!
<rogpeppe> TheMue: i'm trying your branch now, and i don't see any backoff. it's redialling just as fast as ever.
<wallyworld> i've had several beers (it's friday evening after all), so i will tackle the skips tomorrow :-)
<rogpeppe> TheMue: http://paste.ubuntu.com/1544692/
<rogpeppe> TheMue: shit, one mo, i'm trying the wrong branch!
<dimitern> wallyworld: thanks for submitting the other CL btw
<TheMue> rogpeppe: I already wondered, because yesterday I had a different behavior. But I'll test too.
<wallyworld> dimitern: you mean the test double fixes? not submitted yet, still some more stuff to fix, but soon
<dimitern> wallyworld: no the one about fake tools
<wallyworld> ah ok
<rogpeppe> TheMue: testing with your branch now, and still no change, i'm afraid
<rogpeppe> TheMue: can you paste me the output of juju status --debug showing exponential backoff happening on the retries?
<TheMue> rogpeppe: Aaargh, just seeing it here too. Seems yesterday the net has been too good, now I have multiple retries too and can confirm your observation of no exponential delay.
<rogpeppe> TheMue: ok, good. nice to know i'm not going bonkers :-)
<TheMue> rogpeppe: Let me try to add it to state.Open(), like in my first approach. Just as a test, because there we dial mongo.
<TheMue> rogpeppe: That would confirm your idea of mongo having it as a feature.
<rogpeppe> TheMue: that will work - just revert to your earlier revision
<TheMue> rogpeppe: Currently it tries to connect, wonderfully with increasing delays.
<TheMue> rogpeppe: Bing, and now it's in.
<rogpeppe> TheMue: cool
<niemeyer> Hi all
<TheMue> rogpeppe: And beside that I now can connect the web console of MAAS from this VM while it is installed in another one. Hehe!
<TheMue> niemeyer: Hello
<rogpeppe> niemeyer: yo!
<dimitern> niemeyer: hiya
<fwereade> niemeyer, morning
<niemeyer> Wow, hi all! :-)
<niemeyer> What a warm welcome
<rogpeppe> niemeyer: when you have a moment for discussion about the API archtecture, please let me know.
<TheMue> niemeyer: Matching to your weather (21° more than here).
<niemeyer> rogpeppe: Sounds good
<niemeyer> rogpeppe: Just give me a few minutes and will be with you
<rogpeppe> niemeyer: cool
<niemeyer> rogpeppe: Alright
<rogpeppe> niemeyer: okeydokey
<rogpeppe> niemeyer: i've only realised this morning how actively dave cheney has been working on the HA stuff, so there's some overlap here.
<fwereade> niemeyer_, resync on the question "is it sane/meaningful for a service whose charm has a peer relation to ever not be in a peer relation?"
<fwereade> niemeyer_, I believe the answer is no, but I wanted to check your opinion
<niemeyer_> fwereade: Hmm
<niemeyer_> fwereade: If I get what you mean, yes, it will always be in that relation
<fwereade> niemeyer_, that was (roughly) the justification given for blocking peer relation interactions in the command line
<niemeyer_> fwereade: "be" is a bit vague, though, so maybe it's worth expanding
<niemeyer_> fwereade: The relation may always be there, even if there are no counterpart units
<fwereade> niemeyer_, yeah, I'm not thinking about units
<niemeyer_> fwereade: Well, this raises a few extra ideas
<niemeyer_> fwereade: That maybe we just shouldn't even worry right now, for the benefit of progress
<fwereade> niemeyer_, I'm thinking of whether or not the existence of a such a relation doc is tied to the existence of the service doc
<niemeyer_> fwereade: I'll just put something on the back of your mind for awareness:
<niemeyer_> fwereade: It's quite reasonable to, some day, have peer relations with multiple services in it
<fwereade> niemeyer_, yeah, I am aware of this, and I don't *think* it affects this problem
<fwereade> niemeyer_, the reason is that I just picked up https://bugs.launchpad.net/juju-core/+bug/1072750
<_mup_> Bug #1072750: deploy does not add relations transactionally <juju-core:In Progress by fwereade> < https://launchpad.net/bugs/1072750 >
<niemeyer_> fwereade: Ah, okay
<fwereade> niemeyer_, we can (1) ignore the bug; (2) implement peer-relation-addition within service addition; (3) allow direct control of peer relations; (4) ???
<niemeyer_> fwereade: I'd vote for (2) if it's not too much trouble
<fwereade> niemeyer_, such was my instinct too
<niemeyer_> fwereade: and for (4) postpone the solution, if it is
<fwereade> niemeyer_, IMO that's just a nicer way of saying (1) ;)
<fwereade> niemeyer_, ah ok, I misinterpreted
<fwereade> niemeyer_, yeah, I'll see how annoying it feels
<niemeyer_> fwereade: It actually is.. it was just a different way of saying "The bug is relevant, but maybe not worth fixing before other critical things on our pipeline"
<niemeyer_> fwereade: (1) felt a bit like "We don't care." :-)
<fwereade> niemeyer_, yeah, it's slightly better than (1) because it says "we have a plan but we won't do it yet"
<fwereade> niemeyer_, followup question -- state API and CLI are inconsistent wrt peer relations
<fwereade> niemeyer_, I think that to do this cleanly we want (1) AddRelation to reject peer relations and (2) Relation.Destroy to reject direct attempts to destroy a peer relation
<fwereade> niemeyer_, (they'd still get destroyed but only when their service was)
<niemeyer_> fwereade: +1.. sounds reasonable with the status quo
<fwereade> niemeyer_, cool, thanks
<niemeyer_> fwereade: We may have to fix if we add what I suggested above, but that's totally fine
<fwereade> niemeyer_, I don't think it collides with multi-endpoint peer relations at all actually -- that will require new code but I *think* not invalidate any that exists
<fwereade> niemeyer_, logic, that is, not code
<rogpeppe> lunch
<rogpeppe> mramm: when are the southern-hemisphere kanban meetings?
<mramm> 17:30 eastern time
<mramm> 22:00 GMT
<mramm> 22:30 GMT
<mramm> rogpeppe: if you want I can throw you on the invite list so you know about them
<rogpeppe> mramm: please. i'll try to make it along some time - it would be useful to have some interaction with dave.
<mramm> rogpeppe: yea, that would be good
<niemeyer_> rogpeppe: Re. "with regard to 3), Write *is* checking that the configuration is correct unless
<niemeyer_> i'm missing something stupid."
<niemeyer_> rogpeppe: Maybe I misunderstood then.. why is Change calling Check before WRite?
<rogpeppe> niemeyer_: ha, good question. i can't remember. the point is moot now, 'cos i've deleted Change.
<niemeyer_> rogpeppe: Cool
<rogpeppe> niemeyer_: but Write does call Check still
<niemeyer_> rogpeppe: Cool.. so it was probably just unnecessary
<rogpeppe> niemeyer_: probs, yeah
<rogpeppe> trivial CL anyone: https://codereview.appspot.com/7128055
<TheMue> lunchtime
<niemeyer_> rogpeppe: Done
<rogpeppe> niemeyer_: thanks
<rogpeppe> "<unknown job constant: %d>" would probably be better
<rogpeppe> niemeyer_: it's not a constant when it's printed :-)
<niemeyer_> rogpeppe: Hm?
<rogpeppe> niemeyer_: <unknown job value %d> ?
<rogpeppe> niemeyer_: although i actually think i prefer the unknown value being in the same form as the known value
<niemeyer_> rogpeppe: I don't understand what you mean
<niemeyer_> rogpeppe: JobFooBar are constants
<niemeyer_> rogpeppe: If we get a *constant* we don't recognize, we say so
<rogpeppe> niemeyer_: we're not getting a constant we don't recognise. we're getting a value we don't recognise...
<rogpeppe> niemeyer_: just <unknown job %d> would be better perhaps
<niemeyer_> rogpeppe: 42 is not a valid job constant.. it's a perfectly reasonable value :-)
<niemeyer> rogpeppe: I won't bikeshed over this, though. It's a pretty straightforward issue.
<rogpeppe> niemeyer: indeed. i'll go with <unknown job %d>, if that's ok
<niemeyer> rogpeppe: Sure, whatever works
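The agreed "<unknown job %d>" formatting fits naturally in a String() method on the job type. The constant names below are assumptions for illustration; only the unknown-value format string comes from the discussion:

```go
package main

import "fmt"

// MachineJob mirrors the kind of job constants being discussed; the exact
// names here are hypothetical.
type MachineJob int

const (
	JobHostUnits MachineJob = iota + 1
	JobManageEnviron
)

// String renders known jobs by name and unrecognized values in the
// agreed "<unknown job %d>" form.
func (j MachineJob) String() string {
	switch j {
	case JobHostUnits:
		return "JobHostUnits"
	case JobManageEnviron:
		return "JobManageEnviron"
	}
	return fmt.Sprintf("<unknown job %d>", int(j))
}

func main() {
	fmt.Println(JobHostUnits, MachineJob(42))
}
```

Implementing fmt.Stringer this way means any log or error message that prints a job value gets the readable form for free.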
<rogpeppe> niemeyer: if you might be able to take another look at https://codereview.appspot.com/7085062/, that would be great too, thanks.
<rogpeppe> niemeyer: that's the CL that deletes the Change method
<rogpeppe> niemeyer: (as well as the CL's stated purpose...)
<niemeyer> rogpeppe: Cool, will check agian
<niemeyer> rogpeppe: LGTM
<rogpeppe> niemeyer: thanks
<rogpeppe> niemeyer: there's a trivial followup that i seem to have neglected to unWIP: https://codereview.appspot.com/7128047
<niemeyer> rogpeppe: LGTM, with a couple of points to ponder about
 * niemeyer => lunch, so he can type again
<rogpeppe> i'm heading off early today to try and beat the snow. will be leaving in about an hour.
<fwereade_> niemeyer, ping
<fwereade_> rogpeppe, take care
<rogpeppe> fwereade_: and you - have a good trip to austin...
<fwereade_> rogpeppe, cheers :)
<niemeyer> fwereade_: pongus
<fwereade_> niemeyer, heyhey
<fwereade_> niemeyer, so, I got a little distracted with another bug, because I had a silly approach to the peer-relation one
<fwereade_> niemeyer, and ISTM that we can hugely simplify AddRelation if we're explicit about it taking 2 endpoints only
<niemeyer> fwereade_: Hmm
<fwereade_> niemeyer, how insane do you consider that suggestion to be? :)
<niemeyer> fwereade_: It sounds to me like a significant refactoring to change something that is already working
<niemeyer> fwereade_: All the logic around relations handles an arbitrary number of endpoints
<niemeyer> fwereade_: Hardcoding a limit on that seems like a step backwards for no great reason
<fwereade_> niemeyer, are you free for a very quick call?
<niemeyer> fwereade_: Let's do it
<fwereade_> https://plus.google.com/hangouts/_/142892326e0f515d55e499b071db37d660b0ba7e?authuser=0&hl=en#
<fwereade_> niemeyer, ^^
<niemeyer> fwereade_: On the way
<rogpeppe> niemeyer: here's a CL that changes agent.Conf.StateInfo and APIINfo to pointer types, after your remark earlier: https://codereview.appspot.com/7124062/
<rogpeppe> have a great weekend all
<TheMue> rogpeppe: Enjoy the snow. ;)
<TheMue> rogpeppe: And have a great weekend too.
<rogpeppe> TheMue: thanks, you too
<niemeyer> rogpeppe: Have a good one!
<niemeyer> rogpeppe: Reviewed, btw
<rogpeppe> niemeyer: thanks. yeah, it was just a debugging remnant, and i wondered about and/or there too!
<rogpeppe> niemeyer: have a great weekend, see ya monday
<niemeyer> rogpeppe: Cheers!
#juju-dev 2014-01-13
<wallyworld_> thumper: these spurious failures have suddenly gotten a lot worse - i'm onto my third landing attempt
<thumper> yeah... :(
 * wallyworld_ -> doctor, bbiab
<thumper> wallyworld_: sorry I didn't get to the reviews, lots of intro stuff with waigani and then collecting the dog from the vet
<wallyworld_> no problemo
<thumper> wallyworld_: if you are still short tomorrow morning I can dedicate time to it
 * thumper off to make dinner
<wallyworld_> ok, hopefully not :-)
<dimitern> morning
<jam> morning dimitern, I'm not used to seeing you this early
<jam> did your Timezone change?
<dimitern> jam, yes, now +1h from before
<jam> makes sense
<dimitern> jam, our 1-1 is in 1h, right?
<jam> yep
<jam> does Google have your right TZ now that you've moved?
<jam> if it works for you, we can go earlier next week
<dimitern> it was a struggle, but I found how to adjust my TZ in google, yes
<dimitern> well, if it works for you as it is, i'd prefer to leave it, so that I can do some work in the morning on mondays before the meeting :)
<jam> I'd like to go earlier, because it gives a gap before the next call. but I don't  think we usually take the full hour anyway
<dimitern> ok then, let's try it provisionally for next week?
<jam> dimitern: I'm happy to split the difference, and go for 30min, that still gives me a gap, but gives you some time
<jam> dimitern: I'm there whenever you're around
<dimitern> jam, coming
<jam> dimitern: your stream just hung, can you hear me?
<rogpeppe> mornin' all
<jam> morning rogpeppe
<rogpeppe> jam: hiya
<jam> rogpeppe: poke about our 1:1 ?
<rogpeppe> ha!
<rogpeppe> one mo
<TheMue> fwereade: thx for feedback, that tag argument sounds reasonable to me, that's indeed a better way, yes.
<TheMue> fwereade: do you prefer a time based line dropping over a line number based? will change it then.
<fwereade> TheMue, yeah, it feels like maybe a general problem with watchers -- that they can be left running by a client that doesn't make regular next calls
<fwereade> TheMue, may be neater to separate out a solution to that and leave a line limit in place for this CL
<fwereade> TheMue, can you write a bug about the watchers being abandonable?
<TheMue> fwereade: yep, will do
<TheMue> fwereade: i think to keep the api stable there should be no argument for line limit (or time later) but a constant (or a general configurable value?)
<fwereade> TheMue, let's make it a generous constant for now
<TheMue> fwereade: ok
<mgz> mornin'
<jam> mgz: morning!
<jam> 1:1 ?
<mgz> jam: mumble?
<jam> joining now
<jam> mgz: standup?
<jam> rogpeppe: ^^
<niemeyer> Good morning jujuers
 * niemeyer is back from holidays
<dimitern> niemeyer, welcome back
<niemeyer> dimitern: Thanks!
<jam> natefinch: 1:1?
<natefinch> jam yep
 * dimitern lunch
<natefinch> fwereade: thoughts on git addall returning an error when nothing added for git 1.8.5 (used on trusty) where on earlier versions it did not error in the same situation?  Not sure if that's something we depend on in the code.
<fwereade> natefinch, ah yeah
<fwereade> natefinch, I'm not sure *exactly* why that test is there, but it has the ring of something that might be important
<fwereade> natefinch, might it be something to do with restarts at unexpected times?
<fwereade> natefinch, actually
<fwereade> natefinch, AddAll must surely be valid with no changes
<fwereade> natefinch, we do it at the end of every hook
<fwereade> natefinch, whether or not the charm wrote anything
<fwereade> natefinch, does that apply?
<natefinch> fwereade: yep, that makes it clear I need to swallow the error
<dimitern> fwereade, ping
<dimitern> fwereade, when you have some time perhaps we can go over the API document and verify if it makes sense (I'm done with what we discussed last week I think)
<natefinch> fwereade, rogpeppe: Code added to squelch addall error:  https://codereview.appspot.com/49470047
<dimitern> hazmat, ping
<dimitern> hazmat, i'm not sure what's not clear with the API specification goals - bulk operations are described, and ease of use from the client is not terribly reduced (not much more so than for agents)
<adeuring> could somebody please have a look here: https://code.launchpad.net/~adeuring/juju-core/1117173/+merge/201417 ?
<dimitern> adeuring, looking
<adeuring> dimitern: thanks!
<adeuring> dimitern: also this link https://codereview.appspot.com/51280045
<dimitern> adeuring, ah, nicer, thanks
<dimitern> adeuring, reviewed; please link the branch to bug 1117173
<_mup_> Bug #1117173: Validating config could have better errors <compat> <config> <docs> <improvement> <polish> <juju-core:Triaged by adeuring> <https://launchpad.net/bugs/1117173>
<adeuring> dimitern: thanks!
<rogpeppe> dimitern, fwereade, jam, natefinch: review anyone? it's just an upgrade to checkers.DeepEquals, but it's saved lots of time for me, so i'd like to integrate it. https://codereview.appspot.com/51480043/
<dimitern> rogpeppe, i'll take a look in m
<dimitern> 10 minutes
<rogpeppe> dimitern: thanks
<dimitern> rogpeppe, LGTM
<rogpeppe> dimitern: ta!
<rogpeppe> dimitern: sorry about the lack of specific tests, but this is just test code - we will see if there are problems. i don't want to spend any more time on it, if that's ok
<natefinch> rogpeppe: I can review it
<natefinch> rogpeppe: bah, missed that dimitern already did
<rogpeppe> natefinch: np, thanks anyway
<dimitern> rogpeppe, it's not absolutely necessary, but it would be nice - esp. to show off the better error messages ;)
<rogpeppe> dimitern: oh ok then :-)
<dimitern> rogpeppe, cheers!
<natefinch> rogpeppe: I'm worried about nil slice == empty slice.  I know it's handy a lot of the time, but sometimes they mean different things.
<jam> rogpeppe: natefinch: DeepSimilar ?
<natefinch> jam: DeepCloseEnough?
<rogpeppe> natefinch: if you can think of a concrete instance where we care, i'd like to know. it's caused us far too much pain in the past, and some of our tests are contorted because of it
<natefinch> rogpeppe: I know, I know. I have done the contortions, and it's a pain in the butt. I don't know of a specific case.
<natefinch> rogpeppe: maybe we can make a general guideline of "please don't do that" for relying on nil vs empty slices
<rogpeppe> natefinch: agreed
<natefinch> rogpeppe: and if you need it, well, then you're stuck with DeepEquals, and good luck
<rogpeppe> dimitern: i've added error messages to all the tests
<rogpeppe> natefinch: how close are you to landing https://codereview.appspot.com/49470047/ ?
<natefinch> rogpeppe: should be in the process thereof
<rogpeppe> natefinch: great
<natefinch> rogpeppe: foiled by my own flaky mongo tests..... I'll repropose.  Maybe I should be working with fwereade on getting tests to pass reliably. (also need to get the mongo Tag stuff in, which is also failing on the bot due to flaky tests)
<rogpeppe> natefinch: i can't get tests to pass on my machine currently
<natefinch> rogpeppe: dang...
<rogpeppe> natefinch: i think i might work on getting tests to pass reliably - it's taking too much of my time as it is
<natefinch> rogpeppe: definitely a problem.  Sorry for the delays it has caused... they pass really reliably on my machine :/
<natefinch> rogpeppe: the mongo replica set tests, at least
<rogpeppe> natefinch: the replicaset tests weren't the ones that were failing for me
<rogpeppe> natefinch: i'm getting lots of "Waiting for sockets to die"
<natefinch> rogpeppe: oh, well, nevermind then :)   The spurious failure I just had with the git branch was with the replica set tests....
<natefinch> rogpeppe: oh yeah, I see that all the time.  Annoying.
<rogpeppe> natefinch: i still can't repro it reliably
<natefinch> rogpeppe: me either... but it happens pretty frequently for me, like 30% of the time, maybe?
<jamespage> sinzui, I should have qualified my bug report re local provider terminate-machine - thats with 1.17.0
<sinzui> jamespage, it is always helpful. I will note that and add it to 1.17.1
<natefinch> rogpeppe: that fix to the git tests just landed btw
<natefinch> rogpeppe: make your life a little bit easier at least
<rogpeppe> natefinch: thanks
<rogpeppe> natefinch: how's the replicaset branch?
<natefinch> rogpeppe: just looking at it now.  there's a couple tests that failed last time on the bot that I'm trying to make more reliable
<rogpeppe> anyone fancy giving this a review? https://codereview.appspot.com/49920045/
<rogpeppe> natefinch: i think you said you'd have a look last night - did you have any unpublished comments?
<natefinch> rogpeppe: sorry,  I got halfway through.  I'll finish the rest of it right now
<rogpeppe> natefinch: np
<rogpeppe> natefinch: i'm thinking about renaming "candidate" to "wantsVote" BTW
<natefinch> rogpeppe:  that seems clearer to me, yeah
<rogpeppe> natefinch: consider it done
<rogpeppe> fwereade: ping
<natefinch> rogpeppe: there's no difference between make(map[string]string) and map[string]string{}, right?
<rogpeppe> natefinch: that's right
<natefinch> rogpeppe: I thought so, but then managed to convince myself I might be wrong.
<rogpeppe> natefinch: personally i think that when initialising an empty map, the former is marginally clearer because you see the "make" right at the start of the expression
<natefinch> rogpeppe: I like the latter because there's less garbage in the way of the type... the make and the parens just obscure the type for me. Also, it's how the rest of the types are initialized, and there's really no other reason for map[x]y to be in an assignment except if you're making a new one.  Plus it's just fewer characters.   But, I don't really care that much.
<rogpeppe> natefinch: i use make for making slices too
<natefinch> rogpeppe: I use make only when need to set length or cap
<rogpeppe> natefinch: me too
<rogpeppe> natefinch: otherwise just var x []int
<rogpeppe> natefinch: i've updated that CL if you want to take another look. https://codereview.appspot.com/49920045
<natefinch> rogpeppe: cool, looking
<rogpeppe> natefinch: one additional change i made was to take mongoPort out of peerGroupInfo, which makes it possible to have machines with different ports for testing
<natefinch> rogpeppe: nice
<natefinch> rogpeppe: I think I saw Dave had a min/max package that had specific functions for each numeric type... but I can't find it now
<rogpeppe> natefinch: i don't think that was seriously intended
<rogpeppe> natefinch: i couldn't care less about reimplementing min
<natefinch> rogpeppe: right
<rogpeppe> natefinch: hmm, those tests should not be passing
<natefinch> rogpeppe: that sounds bad
<rogpeppe> natefinch: it's not bad as such, but it makes me worried there might be a go bug
<thumper> hi natefinch, rogpeppe
<rogpeppe> thumper: yo!
<natefinch> o/ thumper
<rogpeppe> thumper: happy new year!
<thumper> I have a question for you two
<thumper> I was trying to help waigani with lbox yesterday
<thumper> new guy
<thumper> but we couldn't get it logging in right
<thumper> it has been too long since I set up mine
<thumper> is there a step we are missing?
<natefinch> thumper: I had to log in with my gmail account, couldn't do it with my canonical account for whatever reason.
<thumper> hmm...
<rogpeppe> natefinch: did it prompt for the password?
<natefinch> rogpeppe: yep
<thumper> waigani: meet rogpeppe and natefinch
<rogpeppe> natefinch: sorry, i meant to address that to thumper
<rogpeppe> waigani: hello there
<natefinch> rogpeppe: or rather, it does now.  I don't know about using my canonical address since that was 6 months ago at this point
<thumper> rogpeppe: yeah, the password prompt came up
<natefinch> waigani: howdy
<thumper> rogpeppe: but it kept saying it was wrong
<natefinch> thumper: I think that's what happened to me. I never did figure it out.
<rogpeppe> natefinch: i've got to go now, but please ponder this conundrum: if map iteration order is random, how come the map iteration over members at the end of desiredPeerGroup is always returning a slice in the same (expected) order?
<rogpeppe> natefinch: it's really weird, and it feels like it just might be a golang bug, although that would be most unexpected
<natefinch> rogpeppe: hrm... I had thought they purposely made map iteration random so people couldn't depend on it, even though there's really no reason for it to be random in the current implementation
<rogpeppe> natefinch: *exactly*
<rogpeppe> natefinch: so why are my tests passing reliably?
<rogpeppe> natefinch: (i made a very simple test to verify that it doesn't happen for all maps)
<rogpeppe> anyway, gotta go
<rogpeppe> g'night all
<natefinch> g/night
<thumper> natefinch: as far as ordering goes, the order you get is different from the order that is guaranteed
<thumper> if the language doesn't define an order
<thumper> the language implementation can change, and ordering will change, and the language lawyers can say "we did nothing wrong"
<natefinch> thumper: yes, that's why they made map iteration non-deterministic. However, roger was saying that his code reliably turns a map into the same slice, despite that fact.
<thumper> so for now it may just be a happy coincidence
<thumper> but we shouldn't rely on that fact
<natefinch> thumper: right, but his tests are relying on it, I think is the problem
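[The randomisation under discussion is easy to observe: the language spec leaves map iteration order unspecified, and the gc runtime deliberately randomises the starting point of each range loop so code can't depend on it. A small demo (the assertion about multiple orders holds on the gc toolchain with overwhelming probability, though it is not guaranteed by the spec):]

```go
package main

import "fmt"

func main() {
	m := map[string]int{
		"a": 0, "b": 1, "c": 2, "d": 3,
		"e": 4, "f": 5, "g": 6, "h": 7,
	}

	// Record every distinct key order produced across many range loops.
	seen := map[string]bool{}
	for i := 0; i < 1000; i++ {
		order := ""
		for k := range m {
			order += k
		}
		seen[order] = true
	}
	fmt.Println(len(seen) > 1) // gc: almost certainly more than one order
}
```

[So a test that reliably sees one order from a plain map range is suspicious; typically some other code path (e.g. an intervening sort) is imposing the order.]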
<thumper> natefinch: do all the tests pass on trusty now?
<thumper> natefinch: if they do, I may upgrade
<natefinch> thumper: not sure.  I'm on saucy still, just had the new git for whatever reason
<thumper> ah
<thumper> should ask rog I guess
<thumper> he is on trusty
<thumper> I think
<natefinch> thumper: yep, he is
<sinzui> hi thumper. I am not sure bug 1268689 is a Juju issue. It might be cloud-init, or possibly the cloud-archive packages. What say you?
<_mup_> Bug #1268689: juju on 12.04 can't install packages in lxc containers <juju-core:New> <https://launchpad.net/bugs/1268689>
<thumper> sinzui: otp, will look shortly
<thumper> sinzui: looked
<thumper> sinzui: it reminds me of an issue we have had before
<thumper> sinzui: where the default packages failed to install correctly
<thumper> normally an intermittent failure
<thumper> due to updates in the archive
<sinzui> thank you thumper.
<thumper> sinzui: can I get you to add ~waigani to the ~juju team plz?
<sinzui> thumper, done
<thumper> sinzui: ta
<fwereade> waigani, welcome
<fwereade> waigani, sorry I wasn't around yesterday
<fwereade> hi thumper
<waigani> fwereade: hello :)
<waigani> i've spent the morning deprogramming my OS X habits and getting used to Ubuntu
<waigani> installed on macbook pro - a few driver and card issues to work through. but it all seems to be working now.
<thumper> fwereade: hi, I'm just off to the gym
<fwereade> waigani, cool
#juju-dev 2014-01-14
<mbruzek> waigani on a scale of 1 to 10 how hard was Ubuntu to install on a Mac?
<waigani> hmmm, not too bad actually. I'd give it a 7
<waigani> the trackpad sucks though
<waigani> selecting, dragging and dropping - I have not sworn so much in a long time!
<waigani> I found this very helpful for the install https://help.ubuntu.com/community/MacBookPro11-1/Saucy
<dimitern> jam, ping
<dimitern> jam, if you haven't yet started working on bug 1268471 i'd like to pick it up
<_mup_> Bug #1268471: cache bootstrap address <bootstrap> <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1268471>
 * dimitern is away for 1h
<rogpeppe> anyone have an idea why my /boot partition might have filled up recently?
<axw> rogpeppe: usually for me that's because of kernel upgrades, and the old ones being kept around
<rogpeppe> axw: yeah, i'd just figured that out, thanks
<axw> are there lots of vmlinuz files?
<axw> ok
<jam> dimitern: I haven't started it yet
<rogpeppe> axw: i've just removed some old vmlinuz and initrd.img files
<rogpeppe> axw: hopefully no crucial ones
<rogpeppe> axw: i'm assuming that old versions will never be used again
<axw> rogpeppe: you should probably apt-get remove the packages
<rogpeppe> axw: ah
<axw> removing the packages updates grub too
<axw> rogpeppe: linux-image-<version>-generic
<axw> remove them for whichever ones you deleted :)
<rogpeppe> axw: how can i list all the packages i've got installed?
<rogpeppe> axw: i'd like to do <list all packages> | grep linux-image, just to make sure i'm doing the right thing
<axw> rogpeppe: dpkg --get-selections
<rogpeppe> axw: ha ha
<rogpeppe> axw: you can't do it with apt-get?
<rogpeppe> axw: fair enough
<axw> not sure actually
<rogpeppe> axw: much better! /boot down from 100% to 30%
<axw> :)
<rogpeppe> axw: one might think that the usual upgrade procedure would garbage collect every now and again
<rogpeppe> axw: thanks a lot BTW
<axw> rogpeppe: yeah, though you wouldn't want to upgrade and lose your old kernel if the new one is busted
<axw> rogpeppe: nps
<rogpeppe> axw: yeah, but keeping around 8 versions is probably slightly overkill...
<axw> I won't argue with that :)
<jam> axw: did something change with the test suite that it is trying to read /home/ubuntu/.ssh/authorized_keys ?
<jam> axw: https://code.launchpad.net/~rogpeppe/juju-core/479-desired-peer-group/+merge/201245 for the traceback
<jam> it happens that the machine has an ubuntu user
<jam> but the test suite is run as "tarmac"
<jam> so doesn't have rights
<axw> jam: that'd be the authenticationworker I think
<jam> and *IMO* shouldn't need them
<dimitern> jam, rogpeppe, wallyworld_, any idea how to work around this error while bootstrapping? bootstrap failed: cannot find bootstrap tools: XML syntax error on line 9: element <hr> closed by </body>
<jam> dimitern: wallyworld_ has a patch up for it, but also just "juju bootstrap --upload-tools" which I think you need anyway ?
<dimitern> jam, oh, silly me, ofc
<dimitern> thanks
<axw> jam: so, I'm pretty sure the error there is not due to the lack of access
<axw> it is very spammy though
<jam> axw: [LOG] 31.15149 ERROR juju.worker.authenticationworker reading ssh authorized keys for "machine-0": reading ssh authorised keys file: open /home/ubuntu/.ssh/authorized_keys: permission denied
<jam> that may not be why the test suite failed, true
<jam> but it does look like something is broken in our test infrastructure
<jam> which hides real stuff
<axw> jam: I don't suppose I can do multiple prereqs on a proposal, can I?
<jam> axw: just one
<axw> rats
<jam> you could merge them into another branch and use that as the prereq
<axw> doesn't matter, I'll just wait for the others to be approved
<axw> anyway: hooray, Windows can now bootstrap again (in my sandbox)
<rogpeppe> jam: i just saw that
<rogpeppe> jam: i suspect that something's not setting up things for the fake home correctly
<jam> rogpeppe: well the tests are run as Tarmac, so it should be reading a different home dir
<rogpeppe> jam: nothing should be trying to read from /home/ubuntu
<jam> I have the feeling authentication worker has hard-coded "ubuntu" user's .ssh/authorized-keys
<rogpeppe> jam: yeah
<axw> jam: it could be changed, but I wonder if it should be run at all in tests. I don't think it's a good idea for the test user's authorized_keys to be updated...
<rogpeppe> jam: it doesn't seem to hard code /home/ubuntu actually
<rogpeppe> jam: but even so, it shouldn't be looking at the real home directory
<axw> that's a good point
<jam> standup
<jam> fwereade: ^^
<mattyw> fwereade, hi there - the id meeting - I could move it one hour earlier - would that be better?
<mgz> wallyworld: lp:~gz/juju-core/1.16_ssl_verification_bootstrap_state_1268913 as an idea. would really like to write a test for it, but have few ideas.
<natefinch> jam, rogpeppe: this might be the problem: 6.8G	./test-mgo172948267
<rogpeppe> natefinch: probably
<rogpeppe> natefinch: but why is it so big?
<wallyworld_> mgz: let's see if it works first :-)
<natefinch> rogpeppe: I don't know. I'm going to try passing --oplog-size and see if I can trim it down. The help says the oplog defaults to 5% of disk space
<jam> rogpeppe, natefinch: note that I cleaned all of /tmp before we restarted the tarmac bot a couple of hours ago, so that is a 'new' run
<jam> natefinch: did you submit that one yourself
<jam> ?
<rogpeppe> natefinch: oh wow
<natefinch> jam: that's on my local machine
<rogpeppe> natefinch: perhaps we could make it zero...
<jam> rogpeppe: natefinch: mongo defaults are very much about "this will be a mongo machine"
<jam> which is very much against a test suite :)
<jam> natefinch: ah, good. I just checked the bot and I swapped the "used" and "available". The bot currently has 6.9GB free
<natefinch> jam: there's no reason why that can't be plenty.  I'll do some tests locally to see if the op log thing helps.
<natefinch> jam, rogpeppe: that did it.  oplogSize 100   (100MB) gets the directories down to 387M on my machine
<rogpeppe> natefinch: cool
<jam> natefinch: even 100 is rather large for the test suite. I don't know what we want in practice
<rogpeppe> natefinch: even 10 would seem like plenty
<jam> my understanding is you can't resize it
<natefinch> I'll try 10 and see if anything blows up
<jam> "The oplog exists internally as a capped collection, so you cannot modify its size in the course of normal operations. "
<jam> http://docs.mongodb.org/manual/tutorial/change-oplog-size/
<jam> so you can restart mongo to change the size (with some real hacks from what I can see)
<jam> but you can't just adjust it on the fly
<jam> anyway, 10MB should work, but will mean we have to do full syncs more often if a slave gets out of date
<jam> (more entries in the oplog than it can keep up with)
<natefinch> this is just for the tests.... and it seems to run fine with 10
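[The flag in question is mongod's `--oplogSize` (in megabytes), which caps the oplog instead of letting mongod take its default of about 5% of free disk. A sketch of how a test helper might build such an invocation; the paths, port, replica-set name, and helper name are illustrative, not juju's actual testing package:]

```go
package main

import (
	"fmt"
	"os/exec"
)

// testMongoCmd builds (but does not start) a mongod invocation for
// tests, with a small fixed oplog. Replica sets require an oplog,
// so the size can't simply be zero.
func testMongoCmd(dbDir string, port, oplogMB int) *exec.Cmd {
	return exec.Command("mongod",
		"--dbpath", dbDir,
		"--port", fmt.Sprint(port),
		"--replSet", "juju",
		"--oplogSize", fmt.Sprint(oplogMB),
	)
}

func main() {
	cmd := testMongoCmd("/tmp/test-mgo", 37017, 10)
	fmt.Println(cmd.Args[1:]) // Args[0] is the binary name
}
```

[Since the oplog is a capped collection, the size chosen at startup sticks for the life of that data directory, which is fine for throwaway test directories.]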
<natefinch> I feel somewhat morbid with all these error messages about waiting for sockets to die
<mgz> natefinch: the poor sockets... :)
<nate_finch> oh USB ethernet adapter.... why do you hate me so?
<frankban> hi juju-core devs, trying to bootstrap an azure env i get this: http://pastebin.ubuntu.com/6750245/ . It might be related to bug 1250007
<_mup_> Bug #1250007: Bootstrapping azure causes memory to fill <amd64> <apport-bug> <saucy> <juju-core:Incomplete> <juju-core (Ubuntu):New> <https://launchpad.net/bugs/1250007>
<mgz> frankban: looks like it could be, a recursive error or something
<mgz> frankban: you have the stack there, so add some printing and de-incomplete the bug
<mgz> hm, actually writing config, not printing an error
<frankban> mgz: an encoding error writing the jenv?
<mgz> frankban: add some debug stuff on top of a local 1.16.5 and find out
<natefinch> mgz: wonder if azure is bootstrapping a machine that's too small?  512mb RAM or something?
<mgz> natefinch: the OOM is on the local machine, no?
<natefinch> mgz: duh.  yeah.  That's weird
<mgz> as frankban can reproduce it, should be pretty trivial for him to narrow down the issue
<frankban> mgz: it seems to affect trunk too: data, err := goyaml.Marshal(info) in configstore/disk/Write seems to never return
<dimitern> fwereade, ping
<mgz> frankban: good, being a branch-only issue would be more annoying
<dimitern> fwereade, sorry, never mind
<frankban> mgz: it seems a failure marshalling  the azure management-certificate
<dimitern>  rogpeppe, jam, fwereade: a quick review? https://codereview.appspot.com/52050043 fixes bug 1268471
<_mup_> Bug #1268471: cache bootstrap address <bootstrap> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1268471>
<rogpeppe> dimitern: will look in a mo
<dimitern> rogpeppe, ta
<frankban> mgz, so maybe the bug is in the documentation: https://juju.ubuntu.com/docs/config-azure.html : the management-certificate-path is the pem file, not the cer one.
<frankban> mgz: the bootstrap node has been successfully created, but status fails: 2014-01-14 14:10:30 DEBUG juju.state open.go:88 connection failed, will retry: dial tcp 137.117.132.43:37017: connection refused
<frankban> mgz: and now it works... it has taken several minutes. thanks for your help mgz
<mgz> frankban: presumably because the client OOMed before actually finishing up everything
<mgz> or... okay, what did you change?
<frankban> mgz: nothing, it just takes a lot more time than I was used to with ec2. anyway, the problems I see here are two: 1) the documentation is ambiguous and 2) juju explodes very badly if you set up your azure environment with the wrong certificate file
<mgz> okay, so what did you actually change? the management-certificate setting?
<mgz> having any value that OOMs on serialisation just needs fixing
<frankban> mgz: the management-certificate-path setting
<mgz> frankban: can you record that in the bug?
<frankban> mgz: sure
<sinzui> hi natefinch. I have some go+win questions that you might help me answer
<natefinch> sinzui: sure
<rogpeppe> dimitern: argh, i've just discovered that this CL, which should have gone in a month ago, never made it in: https://codereview.appspot.com/37650048/
<sinzui> natefinch, I have set up a windows instance to build juju and the installer. There is an arch mismatch though
<dimitern> rogpeppe, looking
<dimitern> rogpeppe, ah yes
<rogpeppe> dimitern: i'm sorry, it probably means that the logic in your branch will need looking at again, but i'd like to get it in, if that's ok
<sinzui> natefinch, do we need 386? If so can I choose go 1.2 or go 1.1rc3?
<smoser> hey. is streams.canonical.com supposed to have some data ?
<sinzui> smoser, not yet
<dimitern> rogpeppe, get it in, I'll merge the changes and fix mine
<rogpeppe> dimitern: thanks.
<sinzui> smoser, http://streams.canonical.com/juju will have something soon though
<rogpeppe> dimitern: i was looking at the code and thinking "i'm sure i made this a bit simpler"..
<natefinch> sinzui: we build 386 for go because it's compatible with both x86 and x64, but the OS bitness doesn't matter
<dimitern> rogpeppe,  :)
<natefinch> sinzui: 1.2 should be fine
<sinzui> fab, thank you natefinch.
<smoser> k. thanks, sinzui . jamespage ^.
<natefinch> sinzui: np
<smoser> jamespage noticed that it was being hit from his juju.
<jamespage> sinzui, yeah - I was trying to deploy a saucy charm under the local provider (running on trusty)
<jamespage> #fail
<sinzui> jamespage, 1.16.x always checks streams, then fails over to aws. But also note that many charms won't deploy on saucy because they rely on a package from a ppa that doesn't exist.
<jamespage> sinzui, thats 1.17
<jamespage> sinzui, this charm should work on saucy
<frankban> mgz: commented on bug 1250007
<_mup_> Bug #1250007: Bootstrapping azure causes memory to fill <amd64> <apport-bug> <saucy> <juju-core:Incomplete> <juju-core (Ubuntu):New> <https://launchpad.net/bugs/1250007>
<sinzui> jamespage, you can always set the tools-metadata-url to the location you expect tools to be found
<sinzui> jamespage, which cloud? your own?
<jamespage> sinzui, local provider so yes
<dimitern> rogpeppe, but you can review mine anyway, at least as it is now maybe?
<rogpeppe> dimitern: yeah, i will
<rogpeppe> dimitern: i am still looking at your CL BTW, and trying to work out how to avoid apiConfigConnect calling prepareAPIInfo. i'm thinking that it's perhaps not quite right that prepareAPIInfo calls Environ.StateInfo, but i haven't quite worked out what it *should* do
<dimitern> rogpeppe, why avoid calling it?
<rogpeppe> dimitern: firstly, because it's really slow
<rogpeppe> dimitern: secondly, because if we've got an API connection, we can get the most up to date API server addresses by asking the API
<dimitern> rogpeppe, well, apiConfigConnect calls NewAPIConn, which does the same
<rogpeppe> dimitern: your code is invoking Environ.StateInfo twice, AFAICS
<dimitern> rogpeppe, yes
<dimitern> rogpeppe, but it's not really slow if we have the cache
<dimitern> rogpeppe, it's slow only once initially
<rogpeppe> dimitern: isn't this happening when we *don't* have the cache?
<dimitern> rogpeppe, I was thinking how to avoid calling it twice but couldn't find a way
<rogpeppe> dimitern: i don't see why apiConfigConnect needs to call NewAPIConn at all
<rogpeppe> dimitern: can't it just call apiOpen directly?
<rogpeppe> dimitern: we're discarding the APIConn type anyway
<dimitern> rogpeppe, ah, that's a good point
<rogpeppe> dimitern: that old branch of mine has landed, BTW
<dimitern> rogpeppe, ok, will merge mine
<dimitern> another small review anyone? https://codereview.appspot.com/52130043/ - fixes bug 1259925
<_mup_> Bug #1259925: juju destroy-environment does not delete the local charm cache <destroy-environment> <local-provider> <juju-core:In Progress by dimitern> <https://launchpad.net/bugs/1259925>
<mgz> dimitern: do you need the prereq reviewed first?
<dimitern> mgz, they're not really related
<dimitern> mgz, just in the same pipeline, and rogpeppe is already reviewing the prereq
<mgz> k
<mgz> I shall assume sanity then
<dimitern> :) ta
<mgz> dimitern: reviewed
<dimitern> mgz, cheers!
<rogpeppe> dimitern: reviewed
<dimitern> rogpeppe, thanks
<natefinch> rogpeppe: are there tests besides the ones involving git that *always* fail on trusty?
<rogpeppe> natefinch: i don't *think* so, but i'll just check again.
<natefinch> rogpeppe: FYI, QA just ran the test suite on trusty and it passed, so you don't have to bend over backwards trying to figure out if it's possible to pass on trusty
<rogpeppe> natefinch: i'm running the test suite continuously now. so far i've had 2/2 failures
<natefinch> rogpeppe: that's about the ratio I get on saucy :/
<rogpeppe> natefinch: one with the juju package failing with "Waiting for sockets to die", the other with that and minunits tests failing with "no reachable servers"
<natefinch> rogpeppe: yeah, I get those fairly often
<marcoceppi> Hey guys, maas question, where does juju sync-tools put the files on the maas master?
<mgz> marcoceppi: in the filestorage thing, which I think boils down to postgres blobs
<marcoceppi> mgz: ack, thanks!
<mgz> you can get at them using the maas cli thing... I think the bug with that got fixed
<rogpeppe> natefinch: 6/6 test failures so far
<rogpeppe> natefinch: test run times varying between 9 and 20 minutes
<natefinch> rogpeppe:  yeesh
<TheMue> Woohoo, it passes. *phew* Some more details tomorrow morning and I can push it again. *happy*
<rogpeppe> natefinch: the juju package fails every time. other packages i've seen fail: worker/minunitsworker state/apiserver/upgrader	worker/provisioner
<arges> hi. Are there simple instructions for 'seeding' a juju maas provider with juju tools? I'm imaginging first i need to mirror the tools directory, then 'juju sync-tools --source <path to tools>'. is there a better way to do this? and how do I mirror the tools easily?
<natefinch>  juju bootstrap --upload-tools should work: http://maas.ubuntu.com/docs/juju-quick-start.html
<natefinch> arges: ^
<arges> natefinch: well what i'm asking is, imagine i'm using juju from a machine that has s3 access blocked. how would i get tools on that machine so I can use juju
<arges> and its a maas enviornment
<mgz> args, no there should be simple instructions, but that sort of thing with sync-tools is the right idea
<mgz> *arges
<mgz> **kwarges
<arges> heh
<natefinch> lol
<arges> mgz: now the question is. how do I easily mirror juju tools
<arges> wget -m on the s3 url doesn't work. and I can download specific files, but i don't want to miss anything
<arges> i also thought of 'juju sync-tools -e lxc' and downloading tools locally, then syncing that to maas (which seems like a hack)
<mgz> arges: I think --local-dir, then rsync or whatever that to somewhere with maas access, then --source from there should work... but the streams.canonical.com being broken bug prevents me from actually testing that
<arges> mgz: --local-dir is an argument to which command?
<mgz> both to sync-tools
<arges> oh ok
<arges> mgz: i think --local-dir is in newer version of juju
<arges> its not in 1.16.5
<mgz> arges: that seems likely then...
<mgz> your trick with the local provider seems worth trying, or just getting the tar bits you need from the ec2 bucket and generating the metadata files, which then requires several more commands and is also not well documented, and probably only nice and usable on trunk...
<mgz> this stuff has been semi-broken for too long
<arges> mgz: yea the local provider trick works for now. I'll check out trunk when I have time
<rogpeppe> natefinch: 10/10 failures so far
<natefinch> rogpeppe: dang.  I'm 100% fail as well..... trying to figure out what's going on.  Given that a lot of it seems to have to do with the test mongo server, I wonder if I managed to break it in some odd way
<natefinch> rogpeppe: seems like it's probably one specific problem that is cascading across tests
<rogpeppe> natefinch: yeah, i think so
<rogpeppe> natefinch: i suspect some kind of race in mgo, but i dunno really
<rogpeppe> natefinch: i need to muster up the energy to dive in again, now that i can repro it on my machine for the first time reliably
<rogpeppe> i'm done for the day though
<rogpeppe> g'night all!
<natefinch> heh, running go test -race somehow froze my computer
<natefinch> ....twice
<natefinch> maybe I'll stop doing that
<sinzui> ha ha. The windows client for 1.17.1 gives up on the public address and attempts to connect on the private address
<natefinch> good luck with that
<sinzui> I think this is another day to hit the sauce
<natefinch> hot sauce?
<sinzui> natefinch, beer
<sinzui> Before developing in Windows, I drank 1 beer a month
<sinzui> Except for yesterday's hacking via python and ssh, I drink to numb the pain
<natefinch> sinzui: yeah, windows is pretty terrible.
<natefinch> So, I went to upgrade to trusty, and it runs through the whole huge thing.  At the end it says:
<natefinch> Upgrade complete
<natefinch> The upgrade has completed but there were errors during the upgrade
<natefinch> process.
<sinzui> natefinch, The good news is that CI can build the windows installer along with the release tarball. Running the client tests looks like an exercise in adapting bash and *nix-isms
<sinzui> natefinch, don't remove any packages
<natefinch> ..... so... am I on trusty?  did it work?  Were the errors fatal?  Do I need to do something to fix them?
<sinzui> natefinch, I have found that re-enabling the disabled archives (most still using saucy), then updating again fixes the issues
<natefinch> sinzui: cool, I'll give that a try.  Still don't get why they think it's ok to just disable a bunch of my archives :/
<sinzui> natefinch, yeah, that is a pain. They do it to ensure only tested/compatible packages are installed, but as developers, we are crippled. I once let upgrade remove the packages that came from other archives and that caused a chain of breakages.
 * sinzui won't do that again
<natefinch> brb rebooting
<marcoceppi> did you guys change the default behavior of destroy-environment from using the -e flag to a list of environments as parameters? Or has that how it's been?
<natefinch> well great, now I can't open "software & updates" :/
<_thumper_> morning
 * thumper sighs
<thumper> big email backlog...
<natefinch> sinzui: lsb_release -a says trusty, but the "About This Computer" window still says 13.04.... is that normal for being on the release branch, or did my mucked up upgrade screw things up?
<natefinch> sinzui: s/release branch/development branch/
<sinzui> natefinch, yes
<natefinch> sinzui: ok, phew
<thumper> natefinch: the about computer package gets updated later
<thumper> sometimes the login screen will show old version number too
<thumper> until that is updated
<natefinch> thumper: yeah, it does.  boo.
<thumper> natefinch: tests passing?
<natefinch> .....and that answers the question of whether go test -race still crashes my laptop
<thumper> haha
<sinzui> natefinch, do you have any insights into this error. I can run the script from powershell over rdp, but ssh errors on bootstrap command (but not the version command) http://pastebin.ubuntu.com/6752700/
<natefinch> sinzui: looks like it's looking for the juju home directory, which doesn't exist for some reason
<sinzui> JUJU_HOME is defined (at least is in powershell)
 * sinzui looks in ssh env
<natefinch> C:\Users\Administrator\AppData\Roaming\Juju  would be the default
<sinzui> natefinch, thanks for the clue. It isn't defined when I execute the script via ssh
<natefinch> sinzui: welcome
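[A plausible explanation for sinzui's symptom: variables set only in a PowerShell profile are per-session, so sshd sessions never see them. Setting JUJU_HOME machine-wide (using the default path natefinch gave) and then restarting sshd should make it visible over ssh; this is a guess at the fix, not a verified one:]

```
setx JUJU_HOME "C:\Users\Administrator\AppData\Roaming\Juju" /M
```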
<natefinch> thumper, wallyworld_:  This test consistently fails for me: TestStartInstanceWithDefaultSecurityGroup  (in provider/openstack/live_test.go)
<thumper> natefinch: is this run normally with make check?
<thumper> natefinch: if so, just passed for m,e
<natefinch> thumper: not sure what you mean about make check, I just did go test in that directory (also fails when running the full test suite)
<thumper> ok, works for me then
<natefinch> thumper: weird.  Fails for me every single time
<natefinch> live_test.go:248:
<natefinch>     c.Assert(defaultGroupFound, gc.Equals, useDefault)
<natefinch> ... obtained bool = false
<natefinch> ... expected bool = true
<natefinch> gotta go
<davecheney> sinzui: hold on, imma coming
#juju-dev 2014-01-15
<thumper> o/ wallyworld_
<wallyworld_> hello
<bigjools> halp: juju bootstrap says "environment already bootstrapped", juju status repeats: "ERROR TLS handshake failed: EOF" ad infinitum.
<bigjools> I see no running machines in my env with other tools
<bigjools> 1.16.5-saucy-amd64
<thumper> ?!
<thumper> bigjools: run destroy environment
<bigjools> thumper: no :)
<thumper> no?
<bigjools> I want to keep it - it's doing this on two envs
<bigjools> I can SSH into bootstrap node
<bigjools> the other env which has none I could destroy
<thumper> maas?
<bigjools> canonistack
<bigjools> so sorry let me be clearer.  The one env is genuinely empty so I've destroy-env'd it now.  The other is in use but juju status can't talk to it.
<thumper> bigjools: if you ssh into the bootstrap node of the machine that is running
<thumper> can you see if the machine agent is running?
<thumper> axw: hey there, where was that method that I need to implement for the local provider to get the addresses working properly?
<axw> thumper: containers/lxc/instance.go
<thumper> kk
<axw> method Addresses
<thumper> ta
<axw> np
<axw> thumper: I'm glad you moved RemoteResponse, because I was going to have to do it otherwise. It was causing a circular import from utils/ssh->cmd->environs/config->utils/ssh
<thumper> :)
<axw> which is why I'm going to have to revert my change of using JujuHomePath
<thumper> should be landing now
<axw> cool
<thumper> had one intermittent failure landing so far
<bigjools> thumper: will check shortly, on a call
<axw> they've been quite frequent lately :(
<thumper> bigjools: kk
<thumper> axw: yes they have
<thumper> axw: seems like a race condition somewhere
<thumper> axw: any idea how to track it down?
<axw> I'm sure there are multiple
<axw> -race may help, will likely take some days of sifting I'd say
<thumper> axw: the address updater only runs on a machine that has the job ManageEnviron
<thumper> axw: this isn't sufficient
<thumper> for containers
<axw> oh :(
<axw> maybe we should change that?
<thumper> we need to have something running on every machine
<axw> that's what I thought it did
<thumper> nope
<axw> thumper: ah, it assumes that the addresses are observable externally
<thumper> axw: also it is a state worker
<axw> hrm
<thumper> so not over the api
<sinzui> axw, did you see this behaviour yesterday: https://bugs.launchpad.net/juju-core/+bug/1269120
<_mup_> Bug #1269120: win client bootstrap fails because it uses private ip <bootstrap> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1269120>
<axw> sinzui: I did not
<axw> it should be checking them all
<sinzui> axw, then we can say the good news is the basic test I put together for CI is valuable :)
<thumper> check versions
<axw> sinzui: where's the rest of the log?
<axw> is there an error?
<sinzui> I don't have any more. The test terminates the machine before getting the logs. Well I cannot because of the IP issue. I can get more information tomorrow when I add the test to CI
<axw> sinzui: I mean the juju bootstrap stdout/stderr - is that all there was?
<sinzui> axw, actually, i can get the debug output in 30 minutes
<sinzui> axw, that was all there was from --show-log
<axw> mk
<axw> sinzui: does your windows box have openssh on it?
<axw> sinzui: because it's actually expected not to work at the moment :)
<axw> I haven't submitted my fix yet
<axw> I'm surprised that there's no error there though
<sinzui> axw. as a matter of fact, it does have openssh installed for the benefit of CI. The behaviour is the same from powershell though
<axw> sinzui: ok. I think it might be a good idea to exclude openssh from %PATH% for the tests, for a more standard user setup. may be worth having both, I suppose
<sinzui> axw, I know little about Windows... thank you for the recommendation
<thumper> wallyworld_: here is part two of the work you reviewed this morning -  https://codereview.appspot.com/52470044
<sinzui> axw, I learned today that I need to restart sshd each time I change the rules of user envs :)
<wallyworld_> ok, looking real soon
<axw> sinzui: nps. some people may have openssh/cygwin installed, but it's typical for people to use PuTTY on Windows
 * sinzui has 19.5 years of  linux development experience, but only 2 on windows
<axw> I was unfortunate to work across something like 5 OSes in my last job :)
<axw> the worst of all worlds
<axw> sinzui: I think it's probably not worth investigating further until my changes land
<axw> theoretically there should be no change if openssh is there
<axw> but theoretically it should work just as it does on Linux if it is there
<sinzui> axw, how long will juju wait during a bootstrap before realising that it has failed?
<axw> sinzui: 10 mins
<thumper> waigani_: around?
<waigani_> thumper: hello :)
<sinzui> wallyworld_, did r2201 change the behaviour? CI can no longer create simple streams, nor can we publish new releases. This is the command that did not generate files: http://pastebin.ubuntu.com/6753792/
<wallyworld_> looking
<wallyworld_> sinzui: it was not meant to change any behaviour. if it did i suck. i'll look at the command locally and see what's happening
<wallyworld_> it could be there's a code path there that doesn't generate metadata like it should
<wallyworld_> sync-tools needs tools tarballs and metadata now
<wallyworld_> since when it falls back to streams.canonical.com, it needs to use simplestreams to get the tools
<wallyworld_> hence it needs to use the same method for local tools as well
<wallyworld_> so if the source directory is missing metadata, you could generate it using juju metadata generate-tools
<wallyworld_> iow, just having a directory with tools tarballs is not sufficient, you also need simplestreams metadata
<wallyworld_> i can look at building that logic into the sync tools command if the source is local
<wallyworld_> does that rambling make sense?
<wallyworld_> as a short term fix, try $JUJU_EXEC metadata generate-tools -d $SOURCE
<bigjools> thumper: sigh.  I *was* sshed into the bootstrap node and my session froze.  Now, all the canonistack instances have vanished.  FFS.
<thumper> bigjools: :(
<bigjools> I now have to figure out how to redeploy my whole setup
<sinzui> wallyworld_, Since this is the process that makes data for streams.canonical.com, I think we need to land an immediate fix to calling metadata generate-tools. Do we not need to call sync-tools for 1.17.1 now?
<wallyworld_> sinzui: all you need to do to get metadata ready for streams.canonical.com is to use the generate metadata command. the sync-tools is more intended for folks to grab tools from streams.c.c so they can upload to their own cloud
<wallyworld_> so if you have tools tarballs, juju metadata generate-tools will produce the json ready to upload
<sinzui> wallyworld_, but we also need to create the directory structure too? (tools/releases/*.tgz) or does meta-data do that too
<sinzui> wallyworld_, we don't use sync-tools to deliver to the clouds. we are using it to put the tools and metadata in a directory that we sync to various clouds
<wallyworld_> generate metadata assumes the tools are in a <dir>/tools/releases and will put the metadata in <dir>/tools/streams
<wallyworld_> ah i see
<wallyworld_> but sync-tools needs the tarballs
<wallyworld_> so put the tarballs in a tools/releases dir and run generate-metadata and then you will have a dir structure ready to upload
<wallyworld_> since you will end up with <dir>/tools/releases and <dir>/tools/streams
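[Spelled out, the tree wallyworld_ describes looks like this; the tarball name is just an illustrative example:]

```
<dir>/tools/releases/juju-1.17.1-precise-amd64.tgz   <- put tools tarballs here first
<dir>/tools/streams/v1/...                           <- juju metadata generate-tools -d <dir> writes the json here
```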
<sinzui> wallyworld_, we make the tarballs and place them in a temp dir. We can make the dir structure, cp them to the releases dir, then run metadata to make the json... then carry on with signing.
<wallyworld_> that sounds ok. sorry about the change in behaviour, i didn't realise you guys were using sync-tools like that
<wallyworld_> sadly i had to change it to fix the other XML error issue
<wallyworld_> since the code used to rely on being able to list the file contents of a url
<wallyworld_> it worked for s3 buckets but not for an arbitrary http url
<sinzui> wallyworld_, I am glad to stop calling sync-tools. This is the script that is called after we make the package and before we publish to all the CPCs http://bazaar.launchpad.net/~juju-qa/juju-core/ci-cd-scripts2/view/head:/assemble-public-tools.bash
<sinzui> I think I can get this sorted out quickly
<wallyworld_> yeah, generate_streams() will be a lot more logical if it can just call a command to generate streams :-)
<rogpeppe> fwereade: well done for finding the missing Close. Not sure how we all missed that for so long.
<fwereade> rogpeppe, cheers
<rogpeppe> fwereade: what i don't understand though, is why it only failed *some* of the time
<fwereade> rogpeppe, I agree that's not clear -- the fact that removing the SetAdminMongoPassword helped is interesting though
<fwereade> rogpeppe, hazmat had a patch that removed that, that apparently worked for him
<rogpeppe> fwereade: yeah, we worked that out together
<rogpeppe> fwereade: completely missing the missing Close :-)
<fwereade> rogpeppe, yeah, those things can hide sometimes
<rogpeppe> fwereade: totally trivial review? https://codereview.appspot.com/52460046
<rogpeppe> or anyone else?
<fwereade> rogpeppe, LGTM
<rogpeppe> fwereade: ta
<rogpeppe> fwereade: please merge https://code.launchpad.net/~fwereade/juju-core/fix-unclosed-conn-test/+merge/201723 - i wanna use it!
<jam> fwereade: given it is a Close method, should we be using "gc.Check(err, gc.IsNil)" ?
<jam> that way we can continue cleaning up even if one of the many Close calls fails?
<jam> Assert will stop there, and fail to close the rest of the resources
<fwereade> jam, other defers will still run, won't they?
<jam> fwereade: defer I think will run? I'm not really sure. But you did change a "conn.Close(); conn.Close(); conn.Close()" section. I guess that is the same object, so it doesn't matter
<jam> and all the rest appear to be in defer
<fwereade> jam, well it's purportedly testing that multi closes work
<jam> good enough, then
<dimitern> rogpeppe, jam, re https://codereview.appspot.com/52050043 I mentioned in the description that updating api server addresses after connecting seems out of scope for this CL
<dimitern> rogpeppe, jam, it gets us more than we had before - cached API endpoints, which speed up the CLI, which is already a big win IMO
<dimitern> rogpeppe, jam, but the actual updating can come as a follow-up, can't it?
<rogpeppe> dimitern: yes, i agree, as i said in my review ISTR
<jam> dimitern: so I don't think the actual updating is going to look like what you've written, is my concern, which means redoing it
<jam> I like the from-config stuff, as that is not likely to change a lot
<rogpeppe> dimitern: the other thing that we should do is make sure that bootstrap saves the cached address
<dimitern> jam, i'm changing the CL now to accommodate rogpeppe's parallels.Try logic and will repropose shortly
<jam> dimitern: your structure requires us to have the updated addresses before we return from api.Open, but that seems unfortunate to delay waiting for another round trip
<dimitern> jam, i'm not really following you there - why before api.Open?
<rogpeppe> jam: i think this is at least an improvement
<jam> dimitern: because at the end of api.Open you call SetAPIEndpoints immediately
<rogpeppe> jam: as it caches the address we get from Environ.StateInfo
<dimitern> rogpeppe, yes exactly
<jam> dimitern: my suggestion is just not to do it from the api-from-environ case and only the api-from-config case (or whatever the exact names are)
<jam> because the from-environ isn't giving us anything, so just pass nil to be clear that we don't have any new information
<dimitern> jam, anyway i'd like you to take a look after i propose again, i'm testing live with EC2 now
<dimitern> jam, api-from-environ is the same as api-from-config
<jam> dimitern: my point is, there are 2 code paths, one returns the stuff it just read, the other goes to the Environ and pulls out info from state info and looks it up
<jam> the latter should be cached
<jam> the former already is
<dimitern> jam, got you
<dimitern> rogpeppe, problem is, with the new code I can't seem to be able to distinguish between "info connection failed, but config succeeded" and "both failed"
<jam> dimitern: if you can't actually connect, I don't think we should cache, should we?
<jam> mgz: poke for standup
<mgz> jam: there seems to be no one there...
<natefinch> mgz: may need to pop out and back in
<natefinch> mgz: I had similar problem at first
<mgz> well, this is annoying
<dimitern> rogpeppe, jam, I'd appreciate a second look at https://codereview.appspot.com/52050043/
<rogpeppe> dimitern: will do
<rogpeppe> dimitern: i still don't see any new tests
<dimitern> rogpeppe, I need your help for that I think
<rogpeppe> dimitern: ok
<dimitern> rogpeppe, it's tested live, but I have trouble figuring out how to set up the tests for the new functionality
<dimitern> rogpeppe, generally, we need to test that cached info gets used first and failing to connect with it fails back to using the environ, and finally updates the cache
<rogpeppe> dimitern: i *think* there's already code that checks that the cached info is used first
<rogpeppe> dimitern: (test code, that is)
<dimitern> rogpeppe, so what tests do you think we need to add?
<rogpeppe> dimitern: i think that the only new test needed is to test that the cache is updated
<dimitern> rogpeppe, ok, i'll look into it and prepare something, and paste it to you
<rogpeppe> dimitern: thanks
<TheMue> fwereade: next round of debug log is in
<TheMue> adeuring: seen your comments on rietveld, but no changes. didn't used lbox propose?
<adeuring> TheMue: argh. forgot it... done now.
<TheMue> adeuring: great, thx, will take a look
<adeuring> thanks
<dimitern> rogpeppe, ok, i'll look into it and prepare something, and paste it to you
<dimitern> rogpeppe, oops sorry
<dimitern> rogpeppe, almost done btw
<rogpeppe> dimitern: cool
<dimitern> rogpeppe, http://paste.ubuntu.com/6756196/ there it is - TestWithInfoOnly is updated to check the cache is not changed
<rogpeppe> dimitern: do you actually mean TestWithConfigAndNoInfo ?
<rogpeppe> dimitern: TestWithoutInfoAndConfigUpdatesCache sounds like there's no info *or* config
<dimitern> rogpeppe, yeah, I'll rename it, thanks (was wondering how to phrase it)
<natefinch> rogpeppe: Does this test pass for you?  localLiveSuite.TestStartInstanceWithDefaultSecurityGroup    It fails 100% of the time for me.
<natefinch> rogpeppe: under provider/openstack
<rogpeppe> natefinch: yeah, it passes for me
<rogpeppe> natefinch: have you done godeps -u ?
<natefinch> rogpeppe: not recently. I bet that's the problem
<rogpeppe> natefinch: yeah
<rogpeppe> natefinch: you'll have to 'go get -u' the packages it complains about
<rogpeppe> natefinch: (i should really make it work a bit better when the required deps aren't available locally)
<rogpeppe> dimitern: i'm not sure i see how the first test is making sure that the cache hasn't been updated
<dimitern> rogpeppe, should I check the modified time of the jenv file instead?
<rogpeppe> dimitern: owd
<rogpeppe> dimitern: (mistype)
<natefinch> rogpeppe: godeps: cannot update "/home/nate/code/src/launchpad.net/gomaasapi": bzr: ERROR: branch has no revision ian.booth@canonical.com-20131017011445-m1hmr0ap14osd7li
<natefinch> bzr update --revision only works for a revision in the branch history
<rogpeppe> natefinch: as i said, you'll need to run go get -iu
<rogpeppe> -u
<rogpeppe> natefinch: i.e. go get -u launchpad.net/gomaasapi/...
<rogpeppe> natefinch: unfortunately godeps only prints a single repo that's failed, so you'll probably need to do that several times
<rogpeppe> natefinch: for each repo that's out of date
<rogpeppe> dimitern: i wouldn't check the mtime
<rogpeppe> dimitern: configstore.Storage is an interface, so you can intercept the Write method.
<dimitern> rogpeppe, ah, good point - and a chance for me to use PatchValue
<rogpeppe> dimitern: no need to use PatchValue i think
<dimitern> rogpeppe, how then?
<dimitern> rogpeppe, and why not?
<rogpeppe> dimitern: you can just pass your custom store interface value into newAPIFromName
<rogpeppe> dimitern: (that's why it exists seperately from newAPIClient
<dimitern> rogpeppe, i'll try
<rogpeppe> dimitern: and NewAPIClientFromName)
<dimitern> rogpeppe, although the PatchValue approach seems cleaner
<rogpeppe> dimitern: what value would you patch?
<dimitern> rogpeppe, store.Write?
 * rogpeppe thinks that patching values is something to be avoided if possible
<dimitern> rogpeppe, or it only works for globals
<rogpeppe> dimitern: it only works for globals
<rogpeppe> dimitern: well, it only works for *values*
<rogpeppe> dimitern: you can't patch methods
<dimitern> rogpeppe, http://paste.ubuntu.com/6756292/ better?
<dimitern> rogpeppe1, updated the CL with your last review, reproposing now
<rogpeppe1> dimitern: ta
<dimitern> rogpeppe1, https://codereview.appspot.com/52050043/ - does it look ok to land now?
<rogpeppe1> dimitern: looking
<dimitern> mgz, ping
<dimitern> mgz, should we have a talk about networking, so I can be brought up to speed?
<dimitern> mgz, perhaps with fwereade as well?
<natefinch> I love it when I make a guess and it turns out to be right.  I had somehow munged my iptables in such a way as to prevent me from being able to print... resetting iptables fixed the problem.
<mgz> dimitern: SURE
<mgz> er, caps
<dimitern> :) sounds like you're too eager?
<fwereade> dimitern, mgz, ok, sgtm, I have half an hour
<dimitern> fwereade, so now then? i'll send a link
<natefinch> also rogpeppe1: thanks, updating stuff fixed my test failures.. I actually got them to pass on the first try. Amazing.
<rogpeppe1> natefinch: yay!
<dimitern> mgz, fwereade: https://plus.google.com/hangouts/_/calendar/bWFyay5yYW1tLWNocmlzdGVuc2VuQGNhbm9uaWNhbC5jb20.3tn7jebub5jn5mhuh5sf8acd70
<rogpeppe1> dimitern: reviewed
<dimitern> rogpeppe1, ta
<natefinch> man I hate it when foo --help bar doesn't return help about bar
<rogpeppe1> natefinch: ha yes
<rogpeppe1> natefinch: s3cmd being one example
<natefinch> mongod --replset has an optional seed list that you can append.... but I can't find what the format of the seed list is supposed to be
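[For the record, the form the mongod docs of that era gave is the set name, optionally followed by a slash and a comma-separated host:port seed list; the hosts and port below are made up:]

```
mongod --replSet juju/host1.example:37017,host2.example:37017
```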
<rogpeppe1> lunch
<rogpeppe1> fwereade: we're not planning to lose default-series entirely, are we?
<fwereade> rogpeppe1, I was hoping we could eventually tbh
<rogpeppe1> fwereade: if we do, then what should EnsureAvailability use when it starts new machines?
<fwereade> rogpeppe1, I think it uses something similar-but-different? default-series as controller of charm series should definitely not be depended upon long-term
<rogpeppe1> fwereade: i guess we just have series as an argument to EnsureAvailability
<fwereade> rogpeppe1, state-server-series perhaps? seems probably smart to deploy mongo across the same OSs where possible...
<rogpeppe1> fwereade: i'm not sure
<rogpeppe1> fwereade: i'm not sure we want to state that people *must* do that
<natefinch> fwereade, rogpeppe1:  seems like defaulting to latest LTS is probably a sane default.... do most people even really care what OS their servers are running?
<fwereade> rogpeppe1, natefinch: I think that would certainly default to latest-lts
<fwereade> rogpeppe1, natefinch: but I can imagine reasonable use cases -- certain charms require a different version, and you want to deploy them densely, so you want all your machines to be... unctuous, or whatever we may call it
<rogpeppe1> fwereade: yeah, i was thinking that too
<natefinch> rogpeppe1, fwereade: yes, but I would hope most charms run well on latest LTS
<rogpeppe1> natefinch: i doubt it
<fwereade> natefinch, the particular case we've seen is needing a newer kernel version
<rogpeppe1> natefinch: i suspect most charms will be on precise for a long time
<fwereade> rogpeppe1, not so sure, that's being actively worked on
<rogpeppe1> fwereade: i'll believe it when i see it :-)
<natefinch> rogpeppe1, fwereade: that's still one of the things that surprises me about ubuntu (and linux in general) - that stuff which worked on the OS 2 years ago is assumed to be broken on the latest version.
<natefinch> rogpeppe1, fwereade: not just assumed, but often is
<rogpeppe1> natefinch: i agree, but that's just something we have to work with
<rogpeppe1> natefinch: everybody assumes everything is utterly unportable
<rogpeppe1> natefinch: once upon a time, you could actually do things portably across unixes, let alone linuxes
<TheMue> natefinch: os/2? ah, i loved it. and scripting with rexx, even with ui (used watcom). editor has been spf/2 (i came from the mainframe at that time)
<natefinch> rogpeppe1: boggles my mind... coming from Windows where stuff written for XP 13 years ago still works on Windows 8
<natefinch> rogpeppe1: btw, that extra info from replicaset code finally landed
<rogpeppe1> natefinch: <o/
<rogpeppe1> natefinch: \o/ even :-)
<natefinch> rogpeppe1: only took two tries to pass the tests this time
<natefinch> rogpeppe1: have some time to talk about EnsureMongoServer, now that I can actually get back to that?
<rogpeppe1> natefinch: sure
<natefinch> rogpeppe1: so it is just a matter of rewriting the upstart job as appropriate?
<rogpeppe1> natefinch: yeah, and checking whether the upstart job is running already or not
<natefinch> rogpeppe1: don't we need to rewrite it even if one is running?  Thinking of upgrade and/or when the list of servers changes
<rogpeppe1> natefinch: i hope not. i don't want to have the list of servers inside the upstart file.
<natefinch> rogpeppe1: ahh, ok, I misunderstood some of the text.  Yeah, I think it's best not to have the list in the upstart file (and should be unnecessary)
<rogpeppe1> natefinch: i think the upstart job should probably just run a shell script that gets the server list from somewhere, and upgrades could upgrade that.
<natefinch> rogpeppe1: so, we already have upstart.MongoUpstartService ... is there anything else to do but just update that with --replSet juju?
<natefinch> rogpeppe1: I don't think we even really need the list of servers to start mongo
<rogpeppe1> natefinch: no?
<rogpeppe1> natefinch: how does it find out about its peers?
<natefinch> rogpeppe1: when you add it to the member list on the primary, magic happens, and it joins the group.  You don't have to directly tell the secondary about the rest of the servers (I think likely the primary pings it to let it know there's a replset in existence)
<rogpeppe1> natefinch: ah, of course!
<rogpeppe1> natefinch: because all servers connect directly to each other
<natefinch> rogpeppe1: right
<rogpeppe1> natefinch: in which case, i think you're right
<natefinch> rogpeppe1: well, cool.
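[The "magic" natefinch describes, as a rough mongo shell sketch; the host name is made up, and everything is driven from the primary:]

```
rs.initiate()                    // once, on the first member
rs.add("host2.example:37017")    // the primary then pushes the replset config to the new member
```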
<natefinch> rogpeppe1: we do still have to fix the upstart script on upgrade, though
<rogpeppe1> natefinch: yeah, the first time
<rogpeppe1> natefinch: (and of course if we want to change the mongo args, but that's another matter)
<natefinch> rogpeppe1: yeah I meant changing the args (to add --replSet juju).   Figured it's better just to always update the script when we update juju
<rogpeppe1> natefinch: seems reasonable
<rogpeppe1> natefinch: but i don't think we always want to restart the service, do we?
<natefinch> rogpeppe1: don't we restart the service by definition while upgrading?
<rogpeppe1> natefinch: i'm not sure. currently we don't restart any service. perhaps that's reasonable to do though.
<rogpeppe1> natefinch: (there are two services involved here, right?
<rogpeppe1> )
<natefinch> rogpeppe1: right, yeah, I was thinking about it incorrectly.
<natefinch> rogpeppe1: so, I'm not sure where or when we'd call the code to recreate the upstart script
<rogpeppe1> natefinch: in EnsureMongoServer?
<natefinch> rogpeppe1: well, yes.  I thought I might need to actually call that function from somewhere, though
<rogpeppe1> natefinch: yes, that function will be called from jujud
<rogpeppe1> natefinch: inside the machine agent logic
<rogpeppe1> natefinch: when the machine agent finds that it has a ManageState job
<natefinch> rogpeppe1: So you're saying you'll have the code to call it?
<rogpeppe1> natefinch: yeah - one of us will write it. EnsureMongoService is a primitive we'll use
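[A rough sketch of the kind of rewritten upstart job being discussed, assuming flags similar to what juju already generates; the paths and flags here are illustrative, not the actual MongoUpstartService output:]

```
# /etc/init/juju-db.conf (sketch)
description "juju state database"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
exec /usr/bin/mongod --dbpath /var/lib/juju/db --port 37017 --replSet juju
```

Note the server list stays out of the file, per the discussion above; only the set name is baked in.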
<natefinch> lunchtime for me
<hazmat> rogpeppe1, got a bug report against deployer in #juju .. http://paste.ubuntu.com/6757161/ .. its an error message from the watcher impl that the watcher is stopped
<hazmat> we're seeing it in a few different contexts, i'm just curious if this is normal behavior
<rogpeppe1> it probably means that the state watcher has been stopped :-)
<rogpeppe1> hazmat: can you reproduce it?
<hazmat> rogpeppe1, but why would the watcher be stopped outside of the client requesting it?
<rogpeppe1> hazmat: the watcher should only be stopped if the state is closed
<rogpeppe1> hazmat: i'd like to see a transcript of the API messages
<rogpeppe1> hazmat: a copy of machine-0.log would be really useful
<hazmat> rogpeppe1, ack, asking
<rogpeppe1> hazmat: i tell a lie. it can happen if either the watcher or the underlying state was closed
<hazmat> rogpeppe1, i've got the api server log.. do you have a chinstrap account?
<rogpeppe1> hazmat: i think so
<hazmat> rogpeppe1, its in  ~kapil/machine-0.log
<hazmat> rogpeppe1, yeah.. i'm thinking its client error, i don't recall the gui folks have ever complained about it, but i've seen a few reports against deployer
<rogpeppe1> hazmat: hmm, interesting.
<hazmat> rogpeppe1, anything of note there? it looks like stop is being called, but there's a lot of line noise.
<rogpeppe1> hazmat: i can't see Stop being called (by that client anyway)
<rogpeppe1> hazmat: i think the only interaction that client had with the API server is in the messages in ~rog/select.log on chinstrap
<rogpeppe1> hazmat: i can't currently see a way that it could be happening
<hazmat> rogpeppe1, hmm.. perhaps some isolation issue around multiple allwatchers?
<rogpeppe1> hazmat: that's what i'm looking for, but it looks pretty tight to me
<rogpeppe1> hazmat: it would help if there weren't two distinct errors that "state watcher was stopped" represents
<rogpeppe1> hazmat: (there's a TODO in the code to change one of them)
<rogpeppe1> hazmat: i don't see how it can happen, but there are a few places where better logging could help us. i'll fix that up so the next time it happens we'll have a bit more useful info.
<rogpeppe1> hazmat: i think it might not be coincidence that client [1] goes away at a similar time to client [1A] getting the "state watcher is stopped" message
<rogpeppe1> hazmat: but we don't log clients leaving, so i can't be sure
<hazmat> rogpeppe1, well when the watch error happens its going to kill the process which stops a separate control connection
<hazmat> rogpeppe1, fwiw this is the bug tracking https://bugs.launchpad.net/juju-core/+bug/1269519
<_mup_> Bug #1269519: Error on allwatcher api <juju-core:New> <juju-deployer:New> <https://launchpad.net/bugs/1269519>
<rogpeppe1> hazmat: ah, this is a python client which will be using a separate connection for each operation, yeah
<hazmat> rogpeppe1, no..
<hazmat> rogpeppe1, multiple operations on one connection, watches on separate connections
<hazmat> er.. optionally watches on separate connections
<rogpeppe1> hazmat: that's what i meant to say :-)
<hazmat> :-)
<rogpeppe1> hazmat: but multiple connections for a single client, anyway
<hazmat> yup
<thumper> morning
<natefinch> morning thumper
<thumper> morning natefinch
<natefinch> thumper: btw, problem I had last night was out of date dependencies.  Man, wish there was a better way to keep that from happening.
<thumper> natefinch: yeah...
<thumper> natefinch: also, I noticed that godeps doesn't fetch the remote branches
<thumper> it assumes that the revisions are there
<thumper> and just sets the working tree revision
<natefinch> thumper: roger was feeling bad about that this morning
<natefinch> thumper: in practice, what we really just need is a cron job to update those branches to head once a day
<thumper> I think it should make the branches that are there actually have a tip of what we depend on
<natefinch> thumper: we could just add a "juju" tag and update the tag as appropriate... then the aforementioned cron job could keep the local in sync with the tag
<natefinch> same idea, basically
<thumper> I don't think it is that hard..
<thumper> and we could have a simple make target that does the godep call
<thumper> make dep-update
<thumper> or something
<thumper> and include the ability for a quick check
<thumper> don't fetch, just check that the tip of each dependency matches the file
<thumper> that should be super fast
<thumper> and could be part of the default make targets
<thumper> I know some people aren't fans of makefiles
<thumper> but they are handy
 * thumper takes bug 1269363
<_mup_> Bug #1269363: local environment broken with root perms <local-provider> <ssh> <juju-core:In Progress by thumper> <https://launchpad.net/bugs/1269363>
<natefinch> thumper: got a sec?
<thumper> natefinch: sure
<natefinch> thumper: I need to find a place to rewrite the mongo upstart script, so we can add --replSet juju to the command line that we run, and then restart mongo
<natefinch> thumper: roger had said we should do it in the machine agent somewhere, which is fine... except that I'm not sure it has access to the right information to write the upstart script
<thumper> yeah...
<thumper> I've been thinking about that
<natefinch> upstart.MongoUpstartService() takes the mongo data directory and port
<thumper> as part of the upgrade stuff
<natefinch> cloudinit gets those from the MachineConfig, but I don't see a way for the machine agent to get to that info
<thumper> hmm...
<thumper> not sure...
<natefinch> seems like half of software development is just figuring out how to get information from here to over there
<thumper> for sure
<thumper> I'm in that situation right now too
<thumper> I know the problem, know what causes it,
<thumper> just fixing it right...
<natefinch> yep
<thumper> that's the hard bit
<natefinch> it would help if I was more familiar with the way all the code in this area interacts.  I guess now is the time to start figuring that out :)
<thumper> :)
<natefinch> well that's confusing..... there's an environs/cloudinit.go and an environs/cloudinit/cloudinit.go
<thumper> natefinch: but wait, there's more...
<thumper> there is cloudinit/cloudinit.go
<natefinch> wow, that is.... something else
<thumper> yes
<thumper> yes it is
<thumper> naming shit is hard
<natefinch> that's true
<natefinch> time for the old "I don't know where to put it, so just pick some place and let it shake out in the reviews"
<thumper> :)
<natefinch> I hate it when something as stupid as "append 'db' to the end of the path" turns into a whole pain in the ass of "well, now I need a central place to keep this logic"
<natefinch> which of course is like 80% of actual programming
<rogpeppe2> natefinch: why wouldn't the machine agent have the right info to rewrite the upstart script?
<rogpeppe2> thumper, natefinch: a review of this would be appreciated: https://codereview.appspot.com/52850043
 * thumper nods...
<natefinch> rogpeppe2: two things, one is that the mongo directory is "db" under the machine's data directory, but that code was only in MachineConfig.addMongoToBoot
<natefinch> rogpeppe2: the other thing is the mongo port
<rogpeppe> natefinch: i don't understand the first
<rogpeppe> natefinch: you can get the mongo port from state.EnvironConfig
<natefinch> rogpeppe: the first is just that there was a piece of code hidden away in cloudinit that needed to be put somewhere accessible to the rest of the world
<rogpeppe> natefinch: definitely. i want to move it out of cloudinit entirely
<rogpeppe> natefinch: i'm hoping that jujud init can start the mongo server itself rather than it being done in cloudinit
<natefinch> rogpeppe: not really sure how to get EnvironConfig from the machine agent either....
<rogpeppe> natefinch: it might require a new API call. let me check.
<natefinch> sorry, gotta run, realized it's EOD for me.  email me if you figure it out, rogpeppe, otherwise I'm sure I can figure it out... just didn't know if there was an obvious place where that info was that I wasn't seeing.
<rogpeppe> natefinch: it's available in the provisioner API, which is available to the machine agent, but i think it should be added to the machiner
<rogpeppe> oh, too late
<hazmat> rogpeppe, getting more reports of that same issue re stop watcher, in terms of helping to debug it..
<hazmat> just turn up the log level and hand over more logs?
<rogpeppe> hazmat: i've just proposed a CL that might help slightly in trying to narrow down the issue: https://codereview.appspot.com/52850043
<hazmat> rogpeppe, cool
 * thumper goes to check out some office space in town
<rogpeppe> axw: ping
#juju-dev 2014-01-16
<thumper> bugger
<thumper> followed closely by wat?
<axw> rogpeppe: pong... presumably you're asleep though
<axw> thumper: office space?
<thumper> axw: yeah, there are now four of us in Dunedin
<thumper> considering options around having an office
<axw> ah, cool
<thumper> axw: I'm looking at the bug wallyworld_ mentioned in the email
<wallyworld_> thumper: meeting?
<thumper> axw: it was the creation of ~/.juju/ssh
<thumper> wallyworld_:
<thumper> yeah
<thumper> coming
<axw> oh? :(
<axw> sorry about that
<thumper> axw: np, working on it
<axw> brb, restarting
 * axw loves his new SSD
<wallyworld_> axw: don't forget to add trim support, unless you are running trusty :-)
<axw> wallyworld_: yup thanks :)
<axw> not running trusty yet
<wallyworld_> i knew you'd know to do it :-)
<wallyworld_> i want to upgrade to trusty, just a bit scared
<axw> me too
<wallyworld_> maybe best to wait a few weeks
<axw> I was going to wait a few months ;)
<wallyworld_> i've found the first beta to be ok in the past
<axw> yeah maybe I'll get in on the beta
<thumper> I found that very amusing
<thumper> bootstrapped local
<thumper> did juju status
<thumper> it says "agent-state: down"
<thumper> I'm like, bollocks
<thumper> otherwise status wouldn't work
<thumper> axw: can I get you to review the changes to the ~/.juju/ssh work?
<axw> thumper: yes
<thumper> axw: cheers, just proposing now
<thumper> axw: https://codereview.appspot.com/52950043
<axw> I wish we didn't have to run with sudo :(
<thumper> me too
<thumper> maybe soon
<axw> thumper: reviewed
<thumper> ta
<thumper> axw: can we talk about this?
<axw> thumper: sure
<thumper> https://plus.google.com/hangouts/_/7ecpijqrs52ki6g2aveu0pajlg?hl=en
<axw> thumper: thanks, looks better now (to me anyway)
<thumper> axw: np
<axw> wallyworld_: did you add the MachineConfig API, or was that jam?
<axw> for manual provisioning
<wallyworld_> um
<wallyworld_> i can't recall doing it
<axw> ok
<wallyworld_> what does bzr annotate say?
<axw> wallyworld_: says you did it actually :)
<wallyworld_> which file?
<axw> state/api/client.go
<axw> wallyworld_: just wondering if there's a particular reason why Series was passed in, when that can be gotten from the state.Machine
<axw> wallyworld_: and if I'm allowed to break compatibility with 1.17.3
<axw> err
<axw> 1.17.0
<wallyworld_> can't recall exactly - may be because the old state based api had it as a param
<wallyworld_> and the new api replicated that behaviour
<wallyworld_> i'd have to look over the code again to remember. maybe add as a topic for discussion in the meeting tonight
<axw> it would've been like that because it was all in the same function, I think
<axw> mk
<axw> thanks
<wallyworld_> if i can get my current wip shit sorted, i'll look over the code
<axw> ok, ta
<wallyworld_> axw: the old 1.16 api seemed to require series to be passed in
<axw> wallyworld_: which 1.16 API is that?
<wallyworld_> environs/manual/provisioner.go recordMachineInState1dot16
<wallyworld_> and then the same series is passed to MachineConfig
<axw> wallyworld_: I'm talking about the bit *after* recording the machine in state
<wallyworld_> but i guess the series is known
<wallyworld_> yeah
<axw> yeah
<axw> :)
<wallyworld_> i see what you are talking about now
<axw> I'll put in an item for the meeting to make sure I can break it
<wallyworld_> so it makes sense to drop it
<wallyworld_> i think it will be ok because 1.17 is dev
<axw> arch is known in that case too, but I'm not sure I can remove that one. MachineConfig could conceivably be used in other contexts, and Arch isn't required like Series is
<axw> (right?)
<axw> yeah I figured it's okay to break
<axw> although, MachineConfig requires arch, so... maybe it's okay for it to assume it's set on the machine and just error out if not
<axw> I'll do that
<jamespage> davecheney, +1 on not shipping a fork btw
<jamespage> davecheney, just reading your email
 * fwereade needs to be out for a bit, be back by the meeting
<jam1> mgz: rogpeppe: I'm going to have to miss the weekly meeting today, it is my son's first day of Karate, so I have to be there to get him all sorted out.
<rogpeppe> jam1: ok
<rogpeppe> axw: i was just pinging you in case you might be up for a review of https://codereview.appspot.com/52850043/
<axw> rogpeppe: sure, will take a look now
<rogpeppe> axw: thanks
<axw> rogpeppe: lgtm
<rogpeppe> axw: thanks
<rogpeppe> axw: maybe you're right about logging join/leave at info level
<rogpeppe> axw: it would be easy to do, and "juju deploy service -n 1000" won't spam
<rogpeppe> axw: 'cos it only makes one connection
<axw> each machine agent will connect later won't it?
<rogpeppe> axw: ah, but i guess all the machines will make a conn, yeah
<axw> that's my only reservation
<rogpeppe> axw: 2000 lines isn't much though
<axw> yeah I guess so. I've found that sort of thing useful in the past, debugging when clients unexpectedly exit
<axw> tho I think we're at debug by default
<axw> fwereade: I would appreciate a look at this later, if you have time: https://codereview.appspot.com/53040043/
<axw> it's for hazmat
<rogpeppe> axw, fwereade, dimitern, mgz, wallyworld_: team meeting?
<dimitern> rogpeppe, we're all there
<wallyworld_> we're here
<fwereade> rogpeppe, we're having it ;p
<dimitern> rogpeppe, https://plus.google.com/hangouts/_/calendar/bWFyay5yYW1tLWNocmlzdGVuc2VuQGNhbm9uaWNhbC5jb20.sbtpoheo4q7i7atbvk9gtnb3cc
<rogpeppe> hmm, i must have clicked on the wrong link. i'm in https://plus.google.com/hangouts/_/calendar/bWFyay5yYW1tLWNocmlzdGVuc2VuQGNhbm9uaWNhbC5jb20.8sj9smn017584lljvp63djdnn8?authuser=1
<axw> rogpeppe: https://plus.google.com/hangouts/_/calendar/bWFyay5yYW1tLWNocmlzdGVuc2VuQGNhbm9uaWNhbC5jb20.sbtpoheo4q7i7atbvk9gtnb3cc?authuser=1
<rogpeppe> weird
<axw> rogpeppe: this is the CL mentioned before: https://codereview.appspot.com/53040043/
<rogpeppe> axw: thanks, will look
<axw> I will bbl if you need to ask me anything about it
<axw> ta
<wallyworld_> fwereade: i think i've addressed the outstanding issues with the juju status work - the revision update functionality and the status command presentation of data. if you get a chance to take a look, great. if not, i'll chase a review tomorrow
<rogpeppe> axw: so the idea behind this is that it means that you can provision machines that you can't ssh to, right?
<mgz> dimitern: have proposed a pretty trivial initial goose branch
 * fwereade makes no promises to wallyworld_ but thanks him for the reminder
<dimitern> mgz, looking
<axw> rogpeppe: yes, that is the idea
<dimitern> mgz, reviewed
<dimitern> mgz, i'm still looking at the networks spec doc and into the lxc/kvm brokers btw
<rogpeppe> axw: reviewed
<dimitern> noodles775, you around?
<noodles775> dimitern: yep, for a bit.
<dimitern> noodles775, re bug 1259925 you reproduced - can you do it reliably?
<_mup_> Bug #1259925: juju destroy-environment does not delete the local charm cache <destroy-environment> <local-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1259925>
<dimitern> noodles775, and if so, can you please leave a comment so I can try it myself and investigate?
<noodles775> dimitern: I've not tried since then - should I retry with 1.17.0 or trunk? (I actually tried trunk this morning but couldn't get juju status to return sensible info http://paste.ubuntu.com/6761458/)
<dimitern> noodles775, trunk would be preferable; have you made sure you rebuilt cmd/juju and cmd/jujud and then did bootstrap --upload-tools ? I was seeing similar status output yesterday before doing that
<dimitern> noodles775, also, if you can reproduce it again, can you make a mongo dump please? I'll help to see what's in the db
<noodles775> dimitern: no - I didn't do --upload-tools, I'll try that. Thanks.
<dimitern> axw, have you tested your branch against 1.16 api server?
<cgz> thanks for the goose review dimitern
<sparkiegeek> am I right in saying that there's no API for querying the state of a relation (e.g. hooks have fired and state comes from failures or not)?
<axw> dimitern: no, I haven't. I can do that tomorrow.
<axw> thanks rogpeppe
<dimitern> axw, I posted some review comments as well
<axw> cool, thanks dimitern
<dimitern> axw, cheers
<rogpeppe> sparkiegeek: you can find out the current status of a unit, which will show you if a relation has failed. what information are you after?
<sparkiegeek> rogpeppe: I want positive confirmation of success
<rogpeppe> sparkiegeek: confirmation that a relation has been successfully made?
<sparkiegeek> rogpeppe: exactly
<sparkiegeek> rogpeppe: where "success" = no hooks have failed :/
<sparkiegeek> rogpeppe: and more importantly, that they *have* been run
<rogpeppe> sparkiegeek: ah, yes. you can get the former, but i'm not sure there's any current way to verify that the relation-joined hook has successfully run
<rogpeppe> fwereade: ^ am i right there?
<axw> dimitern: "I'd like to see a test about using ProvisioningScript against 1.16 API server" -- do you just mean an interactive test, or do we have some way of doing that in unit tests?
<dimitern> axw, we have similar tests for other api calls - grep for 1dot16 methods and their associate tests
<fwereade> rogpeppe, sparkiegeek: that is correct -- we have work scheduled to flag when a unit's finished doing work, but you can't tell today
<sparkiegeek> fwereade: is there a bug number or blueprint you can point me at?
<rogpeppe> fwereade: i'm not sure that that will provide quite the functionality required
<rogpeppe> fwereade: a unit might have successfully joined the relation but still have work to do
<fwereade> sparkiegeek, I can't find one immediately, I'm writing one, will close it dupe if I can track it down
<rogpeppe> fwereade: (from relation changed events, for example)
<sparkiegeek> fwereade: great! Thanks a lot
<fwereade> rogpeppe, "finished doing work" ie completed all hooks, nothing scheduled for the future
<axw> dimitern: ah yep I understand now. I will add a test to environs/manual that tests the compat
<axw> dimitern: thanks
<dimitern> axw, ta!
<rogpeppe>  sparkiegeek: i'm wondering if what you really want here is a way for a charm to provide a positive indication that it's in a certain state
<rogpeppe> fwereade, sparkiegeek: i'm thinking this might be another use case for output variables
<rogpeppe> fwereade: the problem is that with juju run particularly, that state might never occur
<rogpeppe> fwereade: a relation's attributes could be constantly changing, so the charm might never reach that steady state
<rogpeppe> fwereade: but i suspect that sparkiegeek doesn't care about steady state as much as "this relation has been successfully joined"
<sparkiegeek> rogpeppe: that would work for me too, assuming the states cover the scenario of being deployed vs. being deployed and successfully related to $X
<sparkiegeek> rogpeppe: correct
<rogpeppe> sparkiegeek: the idea is that a charm could set an output variable to communicate something to the outside world. so the relation-joined hook could set foorelation-ready=true, for example (the variable name would be arbitrary)
<fwereade> rogpeppe, sparkiegeek: hmm, per-relation busy/idle flags?
<rogpeppe> fwereade: again, i don't think that quite hits the mark
<rogpeppe> fwereade: we don't care about busy vs idle but "are you in this state?"
<fwereade> rogpeppe, ok, please be very precise about what state we're hoping to expose and why, I may have missed something
<fwereade> rogpeppe, define a "successfully joined" relation
<rogpeppe> fwereade: my impression is that output variables would not be very hard to implement. can you think of something that makes them so?
<rogpeppe> fwereade: i don't think we can hope to make a canonical definition of that
<fwereade> rogpeppe, well I assumed it was a proxy for "what sparkiegeek really wants"
<rogpeppe> fwereade: hence i think that an output variable that lets the charm *say* "this thing has successfully happened", whether it's a relation successfully joined (by that charm's definition of success), or a web service being started, or whatever
<fwereade> rogpeppe, "all scheduled hooks have been run for this relation, and none of them failed" STM to match the desires stated above
<rogpeppe> fwereade: but that might never happen, in a legitimate scenario
<fwereade> rogpeppe, well, I don't see how you can say the relation has been "successfully joined" until that's completed
<fwereade> rogpeppe, it might always fall over next hook
<rogpeppe> fwereade: sure. that's why i wouldn't define "success" at the juju level at all
<rogpeppe> fwereade: i *think* that output variables provide sufficient tools for a charm to communicate "success" by whatever definition it chooses
<fwereade> rogpeppe, I'm reluctant to abandon the idea that it should be mediated by juju somehow
<rogpeppe> fwereade: how do you mean? you think there might be some general definition of "success" in this context?
<fwereade> rogpeppe, I think the one I'm proposing has some utility, yes
<rogpeppe> fwereade: ISTM that it's too low level and may be fragile if people come to rely on it extensively
<rogpeppe> fwereade: it's also going to be hard to implement, i suspect - how can you actually tell when a client has no more hooks to execute?
<fwereade> rogpeppe, "has this unit successfully responded to all known changes [in this relation ]without errors?" doesn't seem so hard to me
<rogpeppe> fwereade: what does "respond" mean in that context? changes with respect to what base state?
<fwereade> rogpeppe, "responded to" = "run all relevant hooks", right?
<TheMue> nate_finch: your tweet yesterday about 90% of programming is so damned right. i'm currently trying to find an elegant way to get the log dir, which is different for containers like lxc or kvm. and the information is wonderfully private. *sigh*
<rogpeppe> fwereade: so every time you run a hook, you store that in the state?
<rogpeppe> fwereade: ISTM that output variables would be considerably easier to implement, have less runtime costs, and give access to a very useful range of new capabilities.
<fwereade> rogpeppe, no
<fwereade> rogpeppe, the GUI team *is* asking for that, and I think I'm ok activating that for specific units, but I don't want a constant spew from all of them
<rogpeppe> fwereade: so you think output variables are a bad idea in general?
<fwereade> rogpeppe, no, I just don't think they solve the "wait for a unit to be ready" use case we already know we have
<nate_finch> TheMue: yeah, I was trying to get the mongo port from the machine config into the machine agent.  Should be easy, but it's not
<fwereade> rogpeppe, but that the report-busy-idle stuff *does* solve the are-we-ready-yet problem
<fwereade> rogpeppe, if we do it via output variables we fuck the gui
<rogpeppe> fwereade: there's no way in general to wait for a unit to be ready. a relation between A and B might have been successfully made, but it's quite possible that for a unit in A to be ready, B must have a made a relationship with C
<fwereade> rogpeppe, yes, you have to wait for the whole system to get to idle before you can be confident the whole thing will remain stable until perturbed
<rogpeppe> fwereade: even then, you can't do it
<rogpeppe> fwereade: units can asynchronously do stuff now, right?
<fwereade> rogpeppe, I consider juju-run to be a perturbation
<rogpeppe> fwereade: the system may very well never become idle
<fwereade> rogpeppe, I think that's a symptom of a problem with the system that's been constructed then
<rogpeppe> fwereade: not necessarily
<rogpeppe> fwereade: it might be perfectly reasonable for a charm to provide some update on a relation every few seconds
<rogpeppe> fwereade: i don't think we should focus too much on "idleness"
<rogpeppe> fwereade: i think we should be much more interested in "history"
<fwereade> rogpeppe, I am having some trouble imagining a use case there
<TheMue> nate_finch: sounds familiar ;)
<fwereade> rogpeppe, you use the relation settings to set up the channel over which the fast-changing info flows, surely
<rogpeppe> fwereade: for example: a service that's a gateway to another service with a bunch of servers that are changing. it has a relation attribute that contains that set of servers.
<rogpeppe> fwereade: it depends how fast-changing it is.
<rogpeppe> fwereade: if something's changing on the order of once every 10 seconds, i think it may be reasonable to use relation attributes
<rogpeppe> fwereade: and the point is: we've provided the capability, so people will do it *anyway*
<rogpeppe> fwereade: and it would be good to make our tools cope well when it happens
<fwereade> rogpeppe, I think that it's perfectly reasonable for us to report such a system as unstable then
<fwereade> rogpeppe, that's coping perfectly with a system that's unstable
<rogpeppe> fwereade: unstable is fine - we can still do useful work with an unstable system
<Beret> fwereade, how would that screw over the GUI?
<rogpeppe> fwereade: as an example of another possible approach, we could provide a "hook history", that lets us know which hooks have been executed in a given unit.
<rogpeppe> fwereade: when a hook is run twice, we overwrite that hook's previous entry in the history
<fwereade> Beret, if we let people use a completely unstructured channel to report working/not-working, the gui won't be able to show a useful distinction
<Beret> ah, I was assuming the output variable would have to be structured
<fwereade> Beret, busy/idle *can* be interpreted by the GUI and drawn as pale-green/solid-green or similar
<rogpeppe> fwereade: each entry in the history could have its own time stamp.
<sparkiegeek> or some egg timers/beachballs ;)
<fwereade> Beret, the "output variables" idea is that they be minimally structured -- it's the flipside of charm config, which is really a service's "input variables"
<rogpeppe> fwereade: the other thing is that a unit may look idle from a juju point of view, but be very busy internally
<rogpeppe> fwereade: we could easily impose some conventions on output variables
<rogpeppe> fwereade: to let charms show their status to the GUI in a standard way, for example
<fwereade> rogpeppe, the same sort of thing that worked so well with haproxy and private-address, for example?
<rogpeppe> fwereade: remind me
<fwereade> rogpeppe, haproxy has "host", everything else in the world uses "private-address"
<fwereade> rogpeppe, relying on convention is I think inadequate
<rogpeppe> fwereade: if you don't use the conventions, the GUI won't see you. seems reasonable to me.
<fwereade> rogpeppe, and if you accidentally use them, the gui acts weird
<rogpeppe> fwereade: you could even have the uniter set output variables actually
<rogpeppe> fwereade: (the history idea above could work that way)
<rogpeppe> fwereade: convention can work well
<rogpeppe> fwereade: i prefer a general mechanism with some conventions to building more and bigger Stuff
<dimitern> mgz, fwereade, meeting?
<dimitern> we can use this https://plus.google.com/hangouts/_/calendar/am9obi5tZWluZWxAY2Fub25pY2FsLmNvbQ.mf0d8r5pfb44m16v9b2n5i29ig
<cgz> dimitern: I didn't add a hangout or anything... let's use that
<jamespage> fwereade, any thoughts on that suggestion I made re juju letting charms know if units are alive?
<fwereade> jamespage, well, I am still not sure it'll really solve the problem
<fwereade> jamespage, I know the "idle" hook I mentioned is not a thing, but I'm still interested to know if it helps you
 * fwereade maybe missed a followup elsewhere..?
<natefinch> rogpeppe: WIP - https://codereview.appspot.com/53220043/   ignore the replicaset.go changes.. just some extraneous changes that got on the wrong branch.
<jamespage> fwereade, sorry - idle hook?
<jamespage> must has missed that bit
<rogpeppe> natefinch: reviewed
<natefinch> rogpeppe: thanks
<natefinch> rogpeppe: about restarting mongo every time the machine agent bounces... I had been assuming we'd only call this method when we knew the upstart script was either out of date or missing (i.e. on upgrade from previous versions, or when starting a new state server).  I think either we'll know when we need to call this, or if we aren't sure, we'll have to call it every time just in case.
<rogpeppe> natefinch: the idea was that we'd call it if when starting up and we find we have a ManageState job
<rogpeppe> natefinch: that's kind of the point
<rogpeppe> natefinch: in that case we need to start the mongo server if it's not already started
<natefinch> rogpeppe: right, it occurred to me after I said that, that the "Ensure" part of the name means that logic should be encapsulated
<rogpeppe> natefinch: yes please :-)
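[editor's note: the "Ensure" semantics agreed above, where the method is safe to call on every machine-agent start and only rewrites the job and bounces mongod when something actually changed, could be sketched like this. The `service` interface is a hypothetical stand-in, not juju's upstart package, and the in-memory fake exists only to make the sketch runnable.]

```go
package main

// Only rewrite the upstart job (and restart the daemon) when the
// installed job differs from what we want, so the call is idempotent.

import "fmt"

type service interface {
	InstalledConf() (string, bool) // current job text, and whether it exists
	Install(conf string) error     // write the job and (re)start the daemon
}

func ensureMongoService(svc service, wantConf string) (changed bool, err error) {
	if got, ok := svc.InstalledConf(); ok && got == wantConf {
		return false, nil // already up to date; nothing to do
	}
	return true, svc.Install(wantConf)
}

// fakeService keeps the job text in memory so the sketch is runnable.
type fakeService struct{ conf string }

func (f *fakeService) InstalledConf() (string, bool) { return f.conf, f.conf != "" }
func (f *fakeService) Install(conf string) error     { f.conf = conf; return nil }

func main() {
	svc := &fakeService{}
	changed, _ := ensureMongoService(svc, "exec mongod --replSet juju")
	fmt.Println("first call changed:", changed)
	changed, _ = ensureMongoService(svc, "exec mongod --replSet juju")
	fmt.Println("second call changed:", changed)
}
```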
<mattyw> rogpeppe, what happened to that charm lib you were writting in go?
<rogpeppe> mattyw: launchpad.net/juju-utils/gocharm
<rogpeppe> mattyw: um, maybe i got the path wrong there
<rogpeppe> mattyw: launchpad.net/juju-utils/cmd/gocharm
<mattyw> rogpeppe, in here? lp:juju-utils
<rogpeppe> mattyw: yeah
<rogpeppe> mattyw: for docs, see http://godoc.org/launchpad.net/juju-utils/cmd/gocharm
<rogpeppe> mattyw: and http://godoc.org/launchpad.net/juju-utils/hook
<rogpeppe> mattyw: the latter is the actual charm hook interface that you write code against
<rogpeppe> mattyw: the former just compiles the code and generates hook stubs
<mattyw> rogpeppe, looks awesome, will try to have a play with it over the next few days
<rogpeppe> mattyw: please do. i'd love any feedback at all.
<natefinch> oh local environment... why do you have to throw a wrench in everything?
<jcastro> hey sinzui
<sinzui> hi jcastro
<jcastro> you still get this right? https://bugs.launchpad.net/golxc/+bug/1238541
<jcastro> it's making life annoying, I was wondering maybe we can bug a core dev together today?
 * natefinch ducks
<_mup_> Bug #1238541: Local provider isn't usable after an old environment has been destroyed <intermittent-failure> <local-provider> <golxc:New> <https://launchpad.net/bugs/1238541>
<jcastro> natefinch, look at how easy it is! :)
<sinzui> hmm
<jcastro> natefinch, I can also just prod the list if you'd like
<sinzui> jcastro, I will get someone to investigate it. I thought bug 1269363 may have indirectly addressed the issue
<_mup_> Bug #1269363: local environment broken with root perms <local-provider> <ssh> <juju-core:Fix Committed by thumper> <https://launchpad.net/bugs/1269363>
<natefinch> jcastro: there's a branch with a fix already submitted.  I can probably just approve the branch
<jcastro> that would be swell, we're doing a bunch of bootstraps/teardowns as part of the audit and cleaning up the containers by hand gets old after a while, heh
<sinzui> bugger, I cannot manage the bugs in golxc
<sinzui> oh, it is not a part of juju-project. Either a mistake, or the project really isn't about juju
<natefinch> sinzui: I approved the fix and marked the bug as fix committed
<sinzui> thank you natefinch
<jcastro> natefinch, I owe you a beer, thanks!
<thumper> morning folks
<thumper> sometimes it feels like I never stop working
<thumper> that is what 11pm meetings do for you :-(
<natefinch> thumper: haha... I know the feeling... 5am meeting for me :)
 * thumper nods
<thumper> natefinch: got a minute to chat?
<natefinch> thumper: sure
<thumper> natefinch: I want to bounce an idea off someone
<natefinch> thumper: I can be your rubber duck
<thumper> natefinch: looking for more of a teddy bear https://plus.google.com/hangouts/_/7ecpi3me7dl01vto2l2n3368ns?hl=en
<natefinch> thumper: not sure if that's better or worse ;)
<hazmat> for juju datadir normally is just /var/lib/juju?
 * hazmat is trying to interpret some api params
<natefinch> hazmat: yeah
<hazmat> hmm
<natefinch> hazmat: I'm going to assume that hmm means everything is going perfectly. ;)
<hazmat> natefinch, magically delicious as always
<hatch> looks like `juju destroy-environment local` removes some files which causes `sudo juju destroy-environment local` to fail looking for those files putting it into a "corrupt" state
<hatch> has anyone noticed this before?
<sinzui> hatch, thumper fixed a similar issue to that yesterday
<hatch> sinzui oh awesome
<thumper> interesting...
<thumper> should fix that...
<hatch> :)
<hatch> my test machine isn't the most pristine environment so I always like to confirm with others before I file bugs haha
<wallyworld_> thumper: there's some stuff i'm keen to land which has been partially reviewed. how many beers/red wine would it take to get you to look and hopefully +1?
<thumper> wallyworld_: how long are they? the longer they are, the more wine it takes
<wallyworld_> not tooooo long
<wallyworld_> https://codereview.appspot.com/48880043/ https://codereview.appspot.com/49510043/ https://codereview.appspot.com/49500043/
<wallyworld_> :-D
<wallyworld_> not too short either
 * wallyworld_ goes to order a case of the finest French red
<thumper> wallyworld_: https://codereview.appspot.com/49510043/ can you respond to williams comments?
<wallyworld_> thumper: looking
<wallyworld_> thumper: yeah, i implemented that but didn't respond because i talked to him verbally and thought he'd do the last review
<wallyworld_> will comment
<thumper> wallyworld_: ta
<thumper> wallyworld_: I'll take a look again after the gym
<wallyworld_> thumper: ok, appreciate it thanks
<sinzui> thumper, wallyworld_ , Nate marked this branch approved to merge a few hours ago, but I don't think he realised that it is managed by ~juju, not gobot. Are either of you comfortable doing the merge and push? https://code.launchpad.net/~patrick-hetu/golxc/fix-1238541/+merge/200845
<wallyworld_> sure can do
<wallyworld_> will do now
<wallyworld_> sinzui: that should be done
<sinzui> thankyou wallyworld_ . Lp agrees
<wallyworld_> sinzui: how is the streams.c.c stuff going?
<sinzui> I learned that Ben was testing the deployment today. I hope it will be done tomorrow
<wallyworld_> \o/
#juju-dev 2014-01-17
<wallyworld_> thumper: i'm off to the cricket soon, i'll land any approved branches when i get back tonight
<thumper> ok
<wallyworld_>  ah good, i can land one now :-)
<wallyworld_> ah not quite, upstream still pending :-)
<dimitern> 622388
<dimitern> oops, morning
<rogpeppe2> fwereade: just checking: we only care about 1.16 -> 1.18 compatibility, not 1.17.x -> 1.18 compatibility, right?
<rogpeppe2> jam, dimitern: ^
<fwereade> rogpeppe2, I will be somewhat sad if it doesn't work, but they are explicitly dev releases... what needs to change?
<rogpeppe2> fwereade: some time after 1.16 i added code to create the stateServers collection, but it isn't maintained correctly and i'm adding another field to it
<rogpeppe2> fwereade: i started writing compatibility code for both cases, but started getting bogged down
<fwereade> rogpeppe2, I has a bit of a sad there
<fwereade> rogpeppe2, given that no previous version will have any data worth keeping, can you do a detect-and-trash-old-format when you create a State connection?
<rogpeppe2> fwereade: it's not that easy to do in a concurrent-safe way
<rogpeppe2> fwereade: although in all honesty the probability of something dodgy happening is very close to zero
<fwereade> rogpeppe2, I have a preference for "actually zero" -- what are the possible concurrent actions in play that might go wrong?
<fwereade> crap, I have to be afk some time, standup without me and I'll join if/when I can
<rogpeppe2> fwereade: ok
<dimitern> rogpeppe2, right
<dimitern> fwereade, wallyworld_, standup
<jamespage> sinzui, is the juju-core test suite offline? i.e. is it possible to run it in an offline build environment?
<dimitern> rogpeppe, how do you feel about EnvironWatcher mixin type (in state/apiserver/common/), with WatchForEnvironConfigChanges() and EnvironConfig() methods?
<rogpeppe> dimitern: SGTM
<dimitern> rogpeppe,  cool, thanks
<rogpeppe> fwereade: w.r.t. your query above, the sequence would look like: {scan state server machines; txn{remove doc; insert doc with found state server ids}}; if a client adds state servers just after doing that and there are two concurrent clients, the doc might be added without the newly added ids
<rogpeppe> fwereade: but that could only happen upgrading from 1.17 environments
<rogpeppe> fwereade: the alternative is to have different paths in State.createStateServers doc, one which asserts that the doc doesn't exist, the other which asserts that it exists with the expected revno
<fwereade> rogpeppe, without pre-existing multiple state servers, aren't we going to be down to a single state client anyway?
<rogpeppe> fwereade: yes, but i can't see quite how that changes matters
<fwereade> rogpeppe, well, no concurrent access, right?
<rogpeppe> fwereade: i think you can still get concurrent access even with a single client, can't you?
<rogpeppe> fwereade: unless txn serialises everything on a single client, which it may, i suppose
<fwereade> rogpeppe, if you do it at state.Open time, or just after, before everything else gets their hands on it
<rogpeppe> fwereade: ah
<rogpeppe> fwereade: yeah, i think that works, thanks
<fwereade> rogpeppe, awesome
<rogpeppe> fwereade: although...
<rogpeppe> fwereade: no, it's ok
<fwereade> you had me worried though :)
<rogpeppe> fwereade: good thing the only mongo connection is from one place in the single machine agent!
<fwereade> rogpeppe, isn't it
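The two-path approach rogpeppe describes for State.createStateServers — one transaction that asserts the doc doesn't exist yet, another that asserts it does — can be sketched as below. The struct mirrors the shape of an mgo/txn Op for illustration only; the collection and field names are assumptions, not juju's actual schema.

```go
package main

import "fmt"

// op stands in for gopkg.in/mgo.v2/txn's Op type; the Assert strings
// stand in for txn.DocMissing / txn.DocExists.
type op struct {
	C      string
	Id     string
	Assert string
	Ids    []string
}

// createOrUpdateOps sketches the two paths: on first creation, assert
// the stateServers doc is missing so a concurrent creator aborts; on
// update, assert it already exists (the real code would also assert
// the expected revno).
func createOrUpdateOps(exists bool, ids []string) op {
	if !exists {
		return op{C: "stateServers", Id: "e", Assert: "docMissing", Ids: ids}
	}
	return op{C: "stateServers", Id: "e", Assert: "docExists", Ids: ids}
}

func main() {
	fmt.Println(createOrUpdateOps(false, []string{"0"}).Assert)
	fmt.Println(createOrUpdateOps(true, []string{"0"}).Assert)
}
```

Running this at state.Open time, before other workers touch the connection, is what removes the concurrency concern discussed above.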
<rogpeppe> lunch
<sinzui> jamespage, There are one or two bugs reported that it does need to be online. It should run on an offline computer though
<wallyworld_> fwereade: hi, if you have a few minutes, i got all but one branch approved. tim looked at this last one and had some issues which i addressed and a question which i answered. i was using an update interval of 6 hours cause i thought 24 was too long. i can change it if you agree with tim. https://codereview.appspot.com/49500043/
<fwereade> wallyworld_, I'm inclined to go with 24h for now
<wallyworld_> ok, i'll change. i just thought it was too long
<wallyworld_> fwereade: updated
<dimitern> gary_poster, hazmat, cmars, api meeting?
<jamespage> sinzui, I'll stuff it in a ppa and see
<gary_poster> dimitern: having computer issues.  will be there asap and trying to find delegate
<mbruzek> rogpeppe, I updated the bug.  When you get back from lunch let me know if you need any more information.
<arges> hey guys, need help with getting juju-tools in a private cloud where s3 is blocked. is there a wiki on how to set this up with 1.16.5?
<jam1> arges: I don't have a ton of time, but you can look for "sync-tools --source" which can be a local directory, but you'll need to have downloaded the tools some other way
<arges> jam1: yea so here's what i did... and i'm not sure if there is a better way
<arges> jam1: juju sync-tools (in LXC environment)
<arges> copy the files from .juju/local/storage into maas environment
<arges> juju sync-tools --source=<Storage dir>?
<arges> juju bootstrap --source=<storage dir>
<jam> arges: sounds reasonable to me,
<arges> Ok... I'm open for alternatives if there are any
<mgz> arges: I think that's what we decided the other day was the best way pre-1.18?
<jam> arges: you could wget, or something along those lines , but I think sync-tools into LXC gives you a nice way that it will discover everything you need rather than you figuring it out
<arges> mgz: yup. just checking
<arges> jam: yea that's what i used originally, but i missed files
<dimitern> i'm giving up :/
<natefinch> Man I hate it when I go to change a constant and find it's been hard coded in like 8 other spots
<mgz> it should be a woho moment, you get to make 8 bits of code better :)
<natefinch> mgz: that too, it just means more work that I wasn't expecting to have to do :)
<natefinch> are the juju-backup and juju-restore plugins things that are actively supported and supposed to work?
<mgz> presumably.
<mgz> at least till we have a better story there
<mgz> and multiple stateservers still isn't a replacement for the ability to backup...
<natefinch> because they're like one big hardcoded list of assumptions that need to be kept up to date with the rest of the code by hand.
<rogpeppe1> mbruzek: thanks
<mbruzek> you are welcome rogpeppe1
<rogpeppe> pretty trivial review anyone? https://codereview.appspot.com/53750043
<natefinch> rogpeppe: sure
<rogpeppe> natefinch: thanks
<natefinch> rogpeppe: do you know if we currently support the user running more than one local environment at the same time?
<jamespage> sinzui, fwereade: fyi the uploads I did today for juju-core in trusty enable build with both go compilers
<rogpeppe> natefinch: i don't *think* we do, but i may be wrong there.
<jamespage> http://javacruft.wordpress.com/2014/01/17/call-for-testing-juju-and-gccgo/
<jamespage> jcastro, ^^
<sinzui> jamespage, \o/ Did you also test with a closed network?
<jamespage> sinzui, not yet - its on my list
<sinzui> jamespage, I have been tracking the packaging branch, I can pull your rules in when you ask me to. Oh, and 1.17.0 has a packaging change. I have not forwarded it to you since there may be more getting to 1.18.0
<natefinch> rogpeppe: reviewed
<rogpeppe> natefinch: thanks
<jamespage> sinzui, binaries right?
<sinzui> yes, one was renamed
<jamespage> sinzui, got that
<jamespage> sinzui, I was not going to upload 1.17 but I wanted some wider testing of the gccgo stuff
<sinzui> jamespage, ack
<jamespage> sinzui, oh - I had to pull in a commit for mgo (r257 I think) for gccgo compat
<jamespage> sinzui, can that be included in trunk if not done so already please
<sinzui> jamespage, I will report the issue now
<jamespage> sinzui, thanks
<rogpeppe> hmm, i can't destroy my local environment now :-\ lp:1270252
<rogpeppe> 1270252
<natefinch> rogpeppe: dang, that's annoying
<rogpeppe> fwereade: you're not around atm are you, by any chance?
<natefinch> rogpeppe: I have code in the machine agent to install the upstart job for mongo, but it's making the tests barf because they don't have rights to do that... do we have a common way to test that kind of code?
<rogpeppe> natefinch: good question.
<rogpeppe> natefinch: no, not really.
<natefinch> sudo go test? ;)
<rogpeppe> natefinch: no, we don't want to do that
<rogpeppe> natefinch: (although that's what docker does)
<natefinch> rogpeppe: I know, I was just joking
<rogpeppe> natefinch: i'd just rely on mocking in this kind of case
<natefinch> rogpeppe: ok, fair enough
<rogpeppe> natefinch: and it's the kind of thing that we could have live tests for, but the live tests seem to have languished
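The mocking approach rogpeppe suggests — hiding the privileged upstart work behind an interface so tests never need root — could look like this. All names here are illustrative, not juju's actual agent API.

```go
package main

import "fmt"

// ServiceInstaller abstracts the part of the machine agent that
// installs upstart jobs, so tests can substitute a fake.
type ServiceInstaller interface {
	Install(name, conf string) error
}

// upstartInstaller is the real implementation; it would write
// /etc/init/<name>.conf and call initctl, which requires root.
type upstartInstaller struct{}

func (upstartInstaller) Install(name, conf string) error {
	return fmt.Errorf("not implemented in this sketch")
}

// fakeInstaller records what would have been installed.
type fakeInstaller struct{ installed []string }

func (f *fakeInstaller) Install(name, conf string) error {
	f.installed = append(f.installed, name)
	return nil
}

// ensureMongoService is the code under test: it only talks to the
// interface, so unit tests never touch the real system.
func ensureMongoService(inst ServiceInstaller) error {
	return inst.Install("juju-db", "exec mongod --replSet juju")
}

func main() {
	fake := &fakeInstaller{}
	if err := ensureMongoService(fake); err != nil {
		panic(err)
	}
	fmt.Println(fake.installed[0])
}
```

Live tests against a real machine would then exercise the upstartInstaller path separately.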
<rogpeppe> natefinch: another fairly simple review? (fixes a critical bug) https://codereview.appspot.com/53810045
<natefinch> rogpeppe: sure
<rogpeppe> natefinch: thanks
<natefinch> rogpeppe: done
<rogpeppe> natefinch: ta!
<rogpeppe> what a lovely launchpad error: "milestone_link: Constraint not satisfied."
<natefinch> yuck
<hazmat> did something change wrt to mongodb 'admin' password? its still the same as admin-secret right?
<hazmat> i had a working auth mongodb auth setup that now results in auth fails error msg from mongo
<hazmat> natefinch, sound familiar?
<natefinch> hazmat: not at all familiar
<natefinch> hazmat: I highly doubt we changed the mongodb admin password, but I can double check
<natefinch> hazmat: yeah, still the admin secret
<hazmat> definitely feels like somethings changed about this
<hazmat> hmm.. admin-secret's no longer in environments.yaml.. i'm pulling it directly from jenv though
<hazmat> natefinch, so it looks like this has changed, the state server agents are basically autogenerating the password, previously that was a nonce that would update to the admin-secret subsequently
<hazmat> sigh.. maybe not.. still trying to figure out the delta
<natefinch> hazmat: ahh hmm... there's still some code in there about it being admin-secret, but maybe that gets overridden
<hazmat> natefinch, well more like it never gets called
<hazmat> natefinch, so i'm on the state server
<hazmat> natefinch, and looking at machine-0's agent.conf
<hazmat> the actual value that works to login is not the 'admin-secret' for the environment
<hazmat> but the 'oldpassword' from the agent.conf
<hazmat> a wee bit of fubar
<hazmat> the update to the new admin password is supposed to happen i think the first time the client accesses mongodb, but with the move to api for all client ops.. it never happens
<hazmat> where new admin-password is admin-secret
#juju-dev 2014-01-19
<thumper> wallyworld: https://codereview.appspot.com/53590044/
<thumper> https://www.destroyallsoftware.com/talks/useing-youre-types-good
<bigjools> hello sprinters in dunners
#juju-dev 2015-01-12
<thumper> menn0: ok
<menn0> thumper: thanks for the review!
<menn0> thumper: another one for you: http://reviews.vapour.ws/r/707/
<anastasiamac> thumper: thnx for reviewing my doc :-)
<thumper> menn0: ack, doing another right now
<menn0> thumper: np.
<axw> wallyworld_ thumper: responded to http://reviews.vapour.ws/r/699/, PTAL when you can
<wallyworld_> sure, looking
<mattyw> morning all
<mattyw> thumper, morning/ afternoon, you still around?
<thumper> axw: done
<thumper> mattyw: yep, and reviewed
<axw> cheers
<axw> trunk is blocked again :~(
<mattyw> thumper, cool - is that an LGTM or a vote of confidence?
<thumper> :)
<thumper> merge it
<mattyw> thumper, I don't have permission - but if you do you're welcome to
<thumper> ah, ok
<thumper> mattyw: done
<mattyw> thumper, that's great service
<mattyw> thumper, thanks very much
<wallyworld_> axw: i think i may be running into an issue with maas 1.7 - the node is green/ready, and i try and bootstrap. i expect it to go blue/allocated and then i can use the virsh console to start manually. however, juju bootstrap exits with  an error FORBIDDEN (You are not allowed to start up this node.)
<wallyworld_> maybe maas is complaining there's no power mgmt options set
<axw> wallyworld_: I don't think I've used 1.7, only 1.6...
<axw> can't say that error sounds familiar
<wallyworld_> bollocks
<wallyworld_> i had a look at the python used to control the virsh console and it appears deficient in that it neglects to pass the --quiet flag, so login is deemed to fail
<wallyworld_> so there's clearly some things they need to fix to get start/stop working
<thumper> axw: rereviewed the storage branch
<axw> thumper: thanks
<rick_h_> thumper: got a sec to chat?
<thumper> rick_h_: sure
<rick_h_> thumper: https://plus.google.com/hangouts/_/canonical.com/daily-standup?authuser=1
<wallyworld> rvba: hi there, did you have a few minutes for a question or 2 about maas?
<axw> wallyworld: how are you going with maas?
<axw> need a hand with anything?
<wallyworld> axw: well, tl;dr; it kinda sucks. there's a gomaasapi commit after the latest one in dependencies.tsv that fails unit tests in juju. so i have to get that fixed prior to being able to land my gomaas mods to then propose my juju fix. i have also reverse engineered the api somewhat to figure out how to fix the deployment status issue, but need to confirm a couple of things as the doco is lacking. i'm just finishing adding the new api
<wallyworld> to the gomaas testserver so i can then use that in juju unit tests
<wallyworld> and maas plus kvm in 1.7 is broken
<wallyworld> so hard to test live
<axw> oy :(
<wallyworld> thanks for asking :-)
<wallyworld> axw: also, the testserver was somewhat, shall we say, retarded in how it was written wrt handling constraints, so i had to fix that also
<dimitern> axw, wallyworld, rogpeppe1, mgz, hey guys do you mind going here https://github.com/orgs/go-amz/people and setting yourself as publicly visible?
<dimitern> davecheney, ^^
<rogpeppe1> dimitern: done
<dimitern> rogpeppe1, cheers!
<davecheney> dimitern: done
<dimitern> axw, btw you should be unblocked now wrt storage work on goamz - api version upgraded to latest, so if you import gopkg.in/amz.v2 you should be all set
<dimitern> davecheney, thanks!
<axw> dimitern: I noticed, thank you
<jamespage> alexisb, fwereade: just bringing this feature bug to your attention - https://bugs.launchpad.net/juju-core/+bug/1409639 - hopefully that's not to much of a surprise
<mup> Bug #1409639: juju needs to support systemd for >= vivid <juju-core:New> <https://launchpad.net/bugs/1409639>
<fwereade> jamespage, ha, yes
<voidspace> dimitern: I don't know if you want to have a look at this http://reviews.vapour.ws/r/704/
<dimitern> voidspace, sure, I'll have a look in a bit
<dimitern> voidspace, reviewed
<voidspace> dimitern: thanks
<wallyworld> voidspace: hey there
<voidspace> wallyworld: hey, hi
<wallyworld> voidspace: quick one - the last comment to gomaasapi (rev 59) doesn't work with juju unit tests. i have changes i need  to make to gomaasapi but can't until those issues are fixed. related to nodegroup changes
<wallyworld> juju deps still refers to rev 58 so juju is not broken at this stage
<voidspace> wallyworld: right, I have a branch that fixes the maas test that is broken
<voidspace> wallyworld: I think the code is actually broken
<wallyworld> \o/
<voidspace> wallyworld: http://reviews.vapour.ws/r/704/
<wallyworld> oh, great, a pr already
<wallyworld> and you have a ship it
<wallyworld> that will unblock me tomorrow, so i'm happy
<voidspace> cool :-)
<voidspace> wallyworld: branch ready to land, but trunk blocked...
<wallyworld> voidspace: ah, :-(
<voidspace> wallyworld: the fix you need to get tests passing with latest gomaasapi is change line 183 of maas/environ.go to
<voidspace> if err != nil || len(bootImages) == 0 {
<voidspace> (add a check for empty bootImages)
<wallyworld> voidspace: i made that same change locally
<voidspace> cool
<wallyworld> so it must be right :-)
<voidspace> I'll try and land this asap
<voidspace> hah
<voidspace> I defer to your  wisdom
<voidspace> as always
<wallyworld> barf
<wallyworld> voidspace: i checked critical bug from irc topic, fix for regression was committed over w/e so hopefully it will be verified soon as fix released to unblock
<voidspace> wallyworld: great, thanks
<mgz> the fix is in, last run over the weekend failed on maas, I triggered a rebuild in case it's happy now
<voidspace> hmmm
<mgz> wallyworld: I'm not sure about the fix though, only changes the test, so the api still returns stuff in an undefined order? would prefer if the api return was just sorted
<anastasiamac> mgz: why would u prefer to see return sorted?
<mgz> anastasiamac: because it's an api - I like having stuff we're exposing for other people deterministic, just so they can't also make a mistake in their code and assume that it's in insert-order or whatever
<mgz> there are places where that's quite important, and many more where it's really not
<anastasiamac> fwereade: wallyworld: ^^ would the order of returned annotations from bulk call matter (on get)?
<fwereade> anastasiamac, yes please, preserve the order they were sent in
<fwereade> anastasiamac, * the requests were sent in
<anastasiamac> mgz: the test was failing not on returned order. it was setting annotations and m checking if the collection that m about to throw over the wire contains elements I was after
<anastasiamac> fwereade: mgz: will double check that m returning collection from get in the same order as the request :)
<anastasiamac> mgz: thnx for pick up
<fwereade> anastasiamac, doh, I see, I think; so no, I don't think that order matters, users that need to sort results by alpha or something can do so trivially, it's not necessary that we do that sort of work
<mgz> anastasiamac: fair enough
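The resolution above — bulk-call results returned positionally, matching the order of the requests rather than sorted — can be sketched like this. The types are simplified stand-ins, not juju's actual params structs.

```go
package main

import "fmt"

// result pairs a requested tag with its annotation value.
type result struct {
	Tag   string
	Value string
}

// getAnnotations returns one result per requested tag, in request
// order, regardless of map iteration order inside the implementation.
func getAnnotations(store map[string]string, tags []string) []result {
	results := make([]result, len(tags)) // same length, same order as request
	for i, tag := range tags {
		results[i] = result{Tag: tag, Value: store[tag]}
	}
	return results
}

func main() {
	store := map[string]string{"machine-0": "a", "machine-1": "b"}
	out := getAnnotations(store, []string{"machine-1", "machine-0"})
	fmt.Println(out[0].Tag, out[1].Tag)
}
```

Because result i always corresponds to request i, callers never need to assume insert order or alphabetical order, which is the property fwereade asks for.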
<perrito666> good morning everyone, sorry for the lateness
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs:  None
<dimitern> trunk is unblocked
<perrito666> dimitern: how did you fix it_
<perrito666> ?
<dimitern> perrito666, anastasiamac fixed it
<dimitern> perrito666, see https://github.com/juju/juju/commit/66bf4163b9580a1052193bbe8934dc21a4384f11
<perrito666> dimitern: ah I saw that on friday
<perrito666> I thought this was about abentley's mail (or perhaps his mail was about this)?
<dimitern> perrito666, ah, the uniter one - no, no fix for that yet - it's intermittent
<wwitzel3> ericsnow: ping
<perrito666> hey wwitzel3 still of for the standup right?
<wwitzel3> perrito666: ping
<perrito666> wwitzel3: pong
<wwitzel3> perrito666: just checking if you're coming for standup?
<perrito666> wwitzel3: I thought I got an email from you saying no standup
<perrito666> :p
<perrito666> brt
<dimitern> voidspace, https://github.com/juju/juju/pull/1382 got merged already btw :)
<voidspace> dimitern: yeah, I saw
<voidspace> dimitern: thanks :-)
<dimitern> voidspace, np
<voidspace> dimitern: I only saw *after* I tried to merge again of course...
<dimitern> voidspace, I thought so yeah :)
<ericsnow> wwitzel3: you back?
<jam> wwitzel3: as OCR, can you try to make sure someone is at least investigating Aaron's 2 reported bugs about CI cursing current trunk?
<jam> (it doesn't have to be you, I just want to make sure someone is following up on it)
<wwitzel3> jam: will do, thanks for the ping
<voidspace> g'night all
<voidspace> EOD
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs:  1409827
<thumper> morning
<perrito666> hi thumper
<thumper> hey perrito666
<thumper> perrito666: all good in your neck of the woods?
<perrito666> thumper: sure, we seem to be doing better than most parts of the world according to the official news, albeit a bit more hot :p
<thumper> :)
<perrito666> abentley: do we know when did this break? #1409827 that is
<mup> Bug #1409827: TestSetMembersErrorIsNotFatal fails on ppc <ci> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1409827>
<abentley> perrito666: 66bf4163b9580a1052193bbe8934dc21a4384f11 did not exhibit the issue: http://reports.vapour.ws/releases/2219
<perrito666> abentley: tx
<perrito666> bbl
<wwitzel3> ericsnow: I'm back, looking at the regression
<rick_h_> thumper: ping, we're trying to get a backport of frankban's polish into 1.22, we've filed a bug and need a review on a backport PR I'm told? https://github.com/juju/juju/pull/1388
<ericsnow> wwitzel3: k
<wwitzel3> thumper: ping, noticed your comment on the regression are you running the PPC test suite?
<wwitzel3> thumper: I'm trying to debug that issue now, but not sure how to run the tests.
<wwitzel3> ahh nice, I can just use a ppc vm on qemu .. that seems to work
<anastasiamac> wwitzel3: thnx for the review!!
<wwitzel3> anastasiamac: np
<anastasiamac> wwitzel3: I have a question - PTAL if u get a chance b4 :)
<anastasiamac> wwitzel3: b4 eod even :)
<wwitzel3> anastasiamac: yep, you have a response, let me know if that makes sense
<anastasiamac> wwitzel3: i do? :) sorry i must be looking in the wrong place - where can I find it?
<wwitzel3> anastasiamac: oops, sorry, was still a draft .. my fault
<katco> how are unit tags generally marshalled/unmarshalled?
<katco> service tags are handled automatically because their member-variables are public; unit has a private member-variable
<anastasiamac> wwitzel3: yep. thnx :) that's what I thought I understood :P
<anastasiamac> wwitzel3: really appreciate ur thoughts and input! thnx
<wwitzel3> anastasiamac: you're welcome
<thumper> wwitzel3: do you get the same issue just running with the gccgo compiler on the local machine?
<thumper> wwitzel3: I'm fairly sure it is a timing problem
<thumper> wwitzel3: and I think the test should be rewritten to not need time.Sleep
<thumper> cherylj: I'm back if you have time to talk now
<wwitzel3> thumper: ok, let me try compiling with gccgo.
<wwitzel3> thumper: thanks for the hints
<cherylj> thumper, now works for me
<thumper> cherylj: we'll just use the standup hangout
<cherylj> ok
<wwitzel3> thumper: it passes locally for me when using gccgo
<thumper> wwitzel3: otp now
<menn0> thumper, waigani: environments watcher here: http://reviews.vapour.ws/r/711/
<waigani> menn0: lgtm
<menn0> waigani: cheers
<menn0> thumper: when you have a moment can we discuss the machine agent worker stuff?
<thumper> menn0: in 10 minutes?
<menn0> thumper: sounds good
#juju-dev 2015-01-13
<katco> anastasiamac: http://reviews.vapour.ws/r/713/ when you have a moment :)
<katco> anastasiamac: conceptually rather simple
<anastasiamac> katco sorry with alexis. brb
<katco> anastasiamac: no rush at all! ty!
<wallyworld_> axw: could you please take a look at https://code.launchpad.net/~wallyworld/gomaasapi/testservice-additions/+merge/246237 when you get a chance
<wallyworld_> thumper: got 2 minutes?
<rick_h_> wallyworld_: anastasiamac reply inbound, please let me know if any of it doesn't make sense or you want to chat on a hangout or anything to help clear up my questions/concerns
<anastasiamac> rick_h_: thnx! will look :)
<axw> wallyworld_: looking
<wallyworld_> ty
<katco> wallyworld_: ty for the review; comments
<wallyworld_> sure, np
<axw> wallyworld_: do you know when ?op=deployment_status was added? since forever? (I can't see a mention in the MAAS 1.5 API docs)
<katco> wallyworld_: just so there's no ambiguity: i left comments to your comments :)
<wallyworld_> axw: i believe mid 2014, so 1.7 i *think*
<axw> yeah, 1.7's docs are the first time it's mentioned
<wallyworld_> maybe even latish 2014
<anastasiamac> katco: reviewed too :)
<katco> anastasiamac: ty!
<anastasiamac> katco: it was a pleasure and I learn something new eveytime I read ur code!
<axw> wallyworld_: reviewed
<wallyworld_> ty
<thumper> wallyworld_: hey
<thumper> wallyworld_: just eating, catch up in a few minutes?
<wallyworld_> thumper: i'll ping you in 15? just in a meeting
<thumper> wallyworld_: kk
<axw> dimitern: thanks very much for landing my branch
<wallyworld_> thumper: free now, meet our 1:1?
<thumper> sure
<axw> anastasiamac: can you please take a look at https://github.com/juju/juju/pull/1394?
<anastasiamac> axw: of course!
<axw> the RB bot seems not to have picked it up
<anastasiamac> axw: the bot is on lunch? :) it must b on BNE time. I'll look :)
<anastasiamac> axw: reviewed :)
<axw> anastasiamac: thanks
<axw> it's pretty trivial, but I guess I should follow protocol
<axw> wallyworld_: ^^ can you please review the review when you have a moment
<wallyworld_> sure
<wallyworld_> axw: lgtm, thanks for review board fixes also
<axw> thanks
<thumper> axw: addressed all your review comments: http://reviews.vapour.ws/r/688/diff/#
<axw> thumper: I saw, thanks
<axw> thumper: just noticed an import in the wrong block in environmentmanager.go
<thumper> really?
 * thumper looks
<axw> juju/juju/version
<thumper> fixed
<thumper> axw: isn't the bot blocked?
<axw> thumper: master was unblocked overnight, dunno about now
<thumper> there is an open critical bug above
<thumper> wwitzel3 was looking at it
<axw> ah sou
<thumper> a power64 failure
<wallyworld_> axw: a small fix to maas test server, sigh
<wallyworld_> https://code.launchpad.net/~wallyworld/gomaasapi/testservice-numeric-field-fix/+merge/246250
<axw> looking
<axw> wallyworld_: what triggered the error?
<axw> wallyworld_: I see tests updated, but no new test that would trigger the problem before?
<wallyworld_> axw: on the juju side, the start instance code was pulling out mem from the jsonobject map and assuming a float. so i had to modify the juju unit tests to construct test server nodes with "memory": 8192 instead of "memory": "8192". this then triggered the testserver breakage
<axw> I see
<axw> hence the assertion in the last line of the diff
<axw> ok
<axw> wallyworld_: approved
<wallyworld_> yeah, basically making testserver behaviour match what we appear to get with real maas
<wallyworld_> ty
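The number-vs-string mismatch above (real MAAS returns "memory" as a JSON number, the old test server returned a string) is the classic failure mode when decoding into interface{}. A tolerant parser could accept both; the field name mirrors the discussion, but the function is a hypothetical sketch, not gomaasapi's API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// parseMem extracts the "memory" field from a MAAS-style node JSON
// blob, accepting either a JSON number or a quoted numeric string.
func parseMem(raw []byte) (float64, error) {
	var node map[string]interface{}
	if err := json.Unmarshal(raw, &node); err != nil {
		return 0, err
	}
	switch v := node["memory"].(type) {
	case float64: // JSON numbers decode to float64 by default
		return v, nil
	case string:
		return strconv.ParseFloat(v, 64)
	default:
		return 0, fmt.Errorf("unexpected type %T for memory", v)
	}
}

func main() {
	a, _ := parseMem([]byte(`{"memory": 8192}`))
	b, _ := parseMem([]byte(`{"memory": "8192"}`))
	fmt.Println(a, b)
}
```

The fix wallyworld landed instead made the test server emit the numeric form, so it matches what real MAAS returns — usually the better choice than making the client tolerant.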
<wallyworld_> axw: and here's a one line juju fix (plus tests) http://reviews.vapour.ws/r/718/
<wallyworld_> thumper: that critical bug is assigned to wayne, is anyone on your team looking at it per chance? you made an earlier comment on the bug, so i reckon last person who touches it, owns it :-)
<thumper> no, not looking at it
<axw> wallyworld_: reviewed
<wallyworld_> great, ty. nfi how that slipped through previously :-(
<wwitzel3> yeah I was looking at it, never figured it out
<wwitzel3> I forgot to unassign it when I hit EOD
<dimitern> axw, no worries
<axw> dimitern: thanks for the review
<dimitern> axw, np :)
<dimitern> axw, I assume you've run the live tests as well, right?
<axw> dimitern: yes
<dimitern> axw, +1
<dimitern> axw, btw with the diskmanager disable we won't see anymore errors like ERROR juju.worker.diskmanager lsblk.go:105 error checking if "sr0" is in use: open /dev/sr0: no medium found ?
<dimitern> axw, perhaps it shouldn't even try on a local environment, when it's enabled again
<axw> dimitern: that error is news to me, but you are correct, it won't do that anymore
<dimitern> axw, I was meaning to file a bug about it, but I keep forgetting
<axw> dimitern: we will/want to eventually have support for local, though it'll require more work for LXC
<dimitern> axw, right, fair point
<axw> dimitern: probably doesn't matter now, we can worry about it when it gets enabled without the feature flag and local has been looked into
<dimitern> axw, yeah - I should've done the same about the networker - feature flag is better than dealing with upgrades and machine jobs set correctly
 * dimitern needs to go apply for a SA visa - most likely back in time for standup
<axw> wallyworld: re the disk parameters error message, I'd prefer not to change it because I want to keep the disk name in the error message. the "name" is just a number, so having "disk" before that gives context
<wallyworld> axw: oh, i just meant the text description bit, the "to be created" part I think can go
<axw> wallyworld: "cannot get parameters for disk %q" then?
<wallyworld> yeah, i think so, more generic and doesn't add superfluous context
<axw> fair enough
<axw> wallyworld: I don't really understand your other question
<axw> wallyworld: what test are you looking for?
<wallyworld> when a machine is created without disk params, that the get getDiskInfo() method or whatever it is called returns empty
<wallyworld> a machineWithDisks was created specially to check that the params were recorded
<wallyworld> but maybe other non machineWithDisk machines could be checked to ensure there is no info
<wallyworld> just need to do it once somewhere
<axw> wallyworld: yep, adding
<axw> thanks
<wallyworld> sure, sorry if it was too pedantic
<axw> nope, that's fine
<wallyworld> fwereade: can i ping you later for our 1:1 - i have soccer tonight
<rvba> Hi wallyworld… it seems I missed your ping from yesterday… do you still need me?
<TheMue> dimitern: morning. had a bad night and feeling weak now. so I'll step back into bed.
<TheMue> dimitern: will have irc and mail open, so I later see what's happening
<TheMue> dimitern: I sent you the mails we talked about in cc
<voidspace> morning all
<dimitern> TheMue, sure, I hope you get well soon! :/
<dimitern> morning voidspace
<voidspace> o/
<dimitern> voidspace, you know what - let's split it up like this - i'll start from one end - network.InterfaceConfig, using it to render the lxc config, etc. while you start from the api call we need before startinstance
<dimitern> voidspace, and we'll integrate it in the middle somewhere
<dimitern> voidspace, how does this sound?
<perrito666> morning
<wallyworld> rvba: hey there, would love a quick chat if you are free
<rvba> wallyworld: sure.  Hangout?
<wallyworld> https://plus.google.com/hangouts/_/canonical.com/tanzanite-stand
<voidspace> dimitern: oops, missed your message
<voidspace> dimitern: but yes, that sounds good to me
<dimitern> voidspace, great, I'm already on it
<voidspace> cool
<voidspace> dimitern: name for the new API call?
<voidspace> dimitern: ContainerInterfaceInfo ?
<voidspace> dimitern: taking a slice of machine ids and returning a slice of InterfaceConfig
<dimitern> voidspace, yeah, sounds good
<voidspace> or slice of mappings of interface name to InterfaceConfig
<dimitern> voidspace, although..
<voidspace> if the InterfaceConfig includes name then strictly a mapping is redundant
<dimitern> voidspace, while doing the changes around the network package I realized we don't need network.InterfaceConfig
<voidspace> especially as they'll only have one entry initially
<voidspace> dimitern: just extend SubnetInfo ?
<dimitern> voidspace, we might use network.Info, rename it to InterfaceInfo and extend it to include what we need
<voidspace> InterfaceInfo or SubnetInfo
<dimitern> voidspace, SubnetInfo is very basic, doesn't contain NIC-specific settings, like whether it's auto-start, extra config (e.g. pre-up/post-down rules, routes, etc.)
<voidspace> dimitern: ok
<voidspace> hmm...
<voidspace> although as we fleshed it out earlier we didn't discuss including that info anyway
<dimitern> voidspace, so let's say the ProvisionerAPI.ContainerInterfaceInfo(ids) []result struct { *error, []results }
<voidspace> dimitern: ok, that's enough to be getting on with
<voidspace> I can hassle you for details as I go
<dimitern> voidspace, yeah, but as long as I'm changing it I want to include the necessary fields so we can model: physical interfaces, vlans (both of these we can, but partially), bridges, static addresses, etc.
<voidspace> dimitern: ok
<dimitern> voidspace, sure, I hope to propose it soon
<voidspace> dimitern: I'll be going on lunch before the MAAS call, so I'll have made a start by then, but probably only a start
<dimitern> voidspace, no worries
<voidspace> dimitern: params.Entities as the arg type?
<voidspace> and check each of the entities maps to a container machine
<dimitern> voidspace, that's right, cheers
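The bulk-call shape dimitern and voidspace settle on — params.Entities in, one result per entity out, each carrying its own error — could be sketched as below. The types are simplified stand-ins for juju's params package, and the hard-coded addresses are placeholders.

```go
package main

import "fmt"

// Entity and Entities mirror the shape of params.Entities.
type Entity struct{ Tag string }
type Entities struct{ Entities []Entity }

// InterfaceInfo holds the per-NIC settings discussed above.
type InterfaceInfo struct{ MACAddress, Address string }

// InterfaceInfoResult follows the {*error, []results} struct dimitern
// describes: each entity gets its own error slot.
type InterfaceInfoResult struct {
	Error  error
	Result []InterfaceInfo
}

type InterfaceInfoResults struct{ Results []InterfaceInfoResult }

// ContainerInterfaceInfo returns one result per requested entity, in
// request order. A real implementation would verify each entity is a
// container machine and allocate or look up its addresses.
func ContainerInterfaceInfo(args Entities) InterfaceInfoResults {
	results := make([]InterfaceInfoResult, len(args.Entities))
	for i, e := range args.Entities {
		if e.Tag == "" {
			results[i].Error = fmt.Errorf("invalid tag")
			continue
		}
		results[i].Result = []InterfaceInfo{{
			MACAddress: "00:16:3e:00:00:01",
			Address:    "10.0.3.2",
		}}
	}
	return InterfaceInfoResults{Results: results}
}

func main() {
	out := ContainerInterfaceInfo(Entities{Entities: []Entity{{Tag: "machine-0-lxc-0"}}})
	fmt.Println(len(out.Results), out.Results[0].Error == nil)
}
```

Per-entity errors let the provisioner keep working on the containers that succeeded, which is why the bulk API returns a result slice rather than failing the whole call.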
<perrito666> mm, CI still locked?
<dimitern> voidspace, http://reviews.vapour.ws/r/719/ - there it is; still live testing, but just as a precaution - I don't expect anything to break
<dimitern> voidspace, ping for maas meeting
<voidspace> dimitern: omw
<voidspace> dimitern: LGTM on your PR
<dimitern> voidspace, thanks!
<dimitern> voidspace, so did it get clearer as you looked at the changes?
<dimitern> voidspace, I mean the network.InterfaceInfo and how we'll used it around StartInstance?
<voidspace> dimitern: yeah, I think it's pretty clear
<voidspace> dimitern: what I'm not clear on is in the ProvisionerAPI how I get to the environ to do the address allocation
<voidspace> dimitern: we have a state though
<dimitern> voidspace, you can always create an environ from config
<voidspace> dimitern: is that the *right* thing to do?
<dimitern> voidspace, I can't think of another way actually
<voidspace> dimitern: ok, cool
<voidspace> thanks
<dimitern> voidspace, we "create" an instance of the Environ interface, not an actual environment
<dimitern> voidspace, np
<voidspace> right
<voidspace> coffee!
<perrito666> wwitzel3: ericsnow and I are feeling lonely at the standup :p
<bodie_> any chance we can get https://github.com/juju/juju/pull/1399 merged?  it's arguably something that should not be in a tagged version
<bodie_> just a spurious file in project root
<bodie_> i.e. nails on the chalkboard of my mind
<dimitern> oh boy, what a fat panic - bug 1410320
<mup> Bug #1410320: juju status --format summary panics with unresolvable IPs <cmdline> <network> <panic> <status> <juju-core:Triaged> <https://launchpad.net/bugs/1410320>
<perrito666> wha?
<dimitern> katco, hey, you might want to have a look at that ^^
<perrito666> btw, is anyone looking at the current blocker?
<dimitern> perrito666, wasn't wwitzel3 doing this?
<perrito666> dimitern: I cannot get a hold of wayne, that is why I ask
<perrito666> he is assigned
<dimitern> perrito666, ah, I see
<dimitern> perrito666, well, I'll have a look to see if it's just a map ordering ppc issue that can be fixed easily
<perrito666> its not, thumper and I took a look at it yesterday
 * perrito666 tries to separate unit and agent as entities and cries a little bit
<voidspace> rebooting due to spotify killing all window chrome
<voidspace> rebooting via the command line...
<voidspace> BRB
<dimitern> wow
<dimitern> perrito666, the whole TestSetMembersErrorIsNotFatal does not make much sense to me
<dimitern> rogpeppe1, hey, can you give a hand to figure out why this test is failing on ppc ? https://bugs.launchpad.net/juju-core/+bug/1409827
<mup> Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged by wwitzel3> <juju-core 1.22:Triaged> <https://launchpad.net/bugs/1409827>
<voidspace> and back
<rogpeppe1> dimitern: i'll try to have a look later
<perrito666> voidspace: google chrome?
<voidspace> perrito666: no, unity chrome
<perrito666> ah
<voidspace> as in - all windows stop responding to mouse clicks
<dimitern> rogpeppe1, cheers
<voidspace> some handler in spotify doesn't return and unity blocks
<perrito666> voidspace: yep, but browser
<voidspace> perrito666: no
<perrito666> voidspace: ah you are actually using spotify app
<voidspace> perrito666: yep
<dimitern> sinzui, hey, can you clarify which merge by frankban you refer to in https://bugs.launchpad.net/juju-core/+bug/1409827/comments/8 - I can see one landing just after voidspace's maas branch
<mup> Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged> <juju-core 1.22:Triaged> <https://launchpad.net/bugs/1409827>
<voidspace> dimitern: is this something I should be concerned about / look at?
<voidspace> dimitern: might I have killed CI?
<wwitzel3> I was looking at that yesterday, but never got a suitable testing environment setup.
<dimitern> voidspace, unlikely :) your change has nothing to do with replica sets or peergrouper
<voidspace> hah
<voidspace> I shall remain blissfully unaware then...
<wwitzel3> perrito666: I told you Sunday night I was going to be missing standup and out during the AM eastern time. You're bad at this.
<ericsnow> wwitzel3: lol
<dimitern> bad, bad perrito666
<perrito666> wwitzel3: that was today :p
<dimitern> ;)
<perrito666> wwitzel3: sorry, since I saw you assigned to the bug I assumed you were immersed in it
<perrito666> wwitzel3: I do my best to mimic nate's scheduling ability
<perrito666> :p
<sinzui> dimitern, the bug was changed to non-critical, unblocking CI, which was wrong given we still have a failing test that needs to pass
<dimitern> sinzui, I agree
<wwitzel3> dimitern, rogpeppe1: thumper suggested I try using gccgo, which I did, but I couldn't replicate the bug with just that.
<dimitern> sinzui, just trying to figure out comment #8
<wwitzel3> and i was never able to successfully get a ppc64el vm up and going
<perrito666> bbl lunch
<sinzui> dimitern, 1.22 was created from the previous commit. wallyworld merged the suspect commit into 1.22 as a pre-requisite for his branch...and then introduced the same bug into the other line of development. When we reported this bug, we were testing both voidspace's and frankban's commits. Since 1.22 failed without frankban's change, the only suspect is commit 782e9cd
<dimitern> thanks sinzui
<dimitern> jw4, hey, you mentioned you can reproduce that bug relatively easily? ^^
<dimitern> i have an inkling..
<voidspace> dimitern: do share
<dimitern> at first glance the changes introduced by TheMue with https://github.com/juju/juju/commit/d5fd5e032b6593b53d03244773b9c7ac65805fd0#diff-1 *might* cause patched timeouts to get unpatched incorrectly
<dimitern> s.PatchValue shouldn't be used in a loop or in any case more than once
<dimitern> once per test, that is
<mgz> dimitern: oh, interesting
<dimitern> mgz, yeah, I had an issue with that before - for loops and such, gitjujutesting.PatchValue() should be used, which returns a Restorer that can be called at the end of the loop
<dimitern> I'm not saying this is the issue (other tests in there are written similarly and should fail intermittently the same way), but it might be related
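dimitern's point about s.PatchValue in loops can be sketched in isolation. The `patch` helper below is a hypothetical stand-in for gitjujutesting.PatchValue, not the real helper; the property that matters is that it hands back a restore func the caller invokes at the end of each iteration, instead of deferring every restore to suite teardown, where repeated patches of the same target can unwind in a surprising order.

```go
package main

import "fmt"

var timeout = 30 // package-level value a test might patch

// patch is a hypothetical stand-in for gitjujutesting.PatchValue: it swaps
// in a value and returns a Restorer-style func, so a loop can restore at
// the end of each iteration instead of stacking suite-level cleanups.
func patch[T any](dest *T, value T) (restore func()) {
	old := *dest
	*dest = value
	return func() { *dest = old }
}

func main() {
	for i := 1; i <= 3; i++ {
		restore := patch(&timeout, i*10)
		fmt.Println("patched:", timeout)
		restore() // undo inside the loop; s.PatchValue would wait for teardown
	}
	fmt.Println("restored:", timeout)
}
```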
<katco> dimitern: hey just saw your message, i'll look into it, thanks
<dimitern> katco, np
<katco> dimitern: it looks like i gave a good warning but then ignored my own warning :)
<dimitern> katco, ha :) good catch then
<bodie_> dimitern, mgz, sinzui -- any of you know whether / how / whom to talk to about getting a single file single line PR through?  we have a stupid foo.yaml file sitting in our project root
<dimitern> bodie_, I can have a look
<bodie_> dimitern, not certain how we didn't catch that in review but there it is.  https://github.com/juju/juju/pull/1399
<mgz> bodie_: you can just get it reviewed then force it through
<bodie_> mgz, okay, cool.  it's reviewed, and I can just land it then?  I thought we were frozen
<mgz> I'm fine with a trivial change on trunk - give me ten mins though, I'll confer and land it if agreed. what's the pr #?
<bodie_> mgz, 1399
<mgz> ta
<dimitern> bodie_, LGTM, once ci is unblocked please land it
<bodie_> thanks :)
<mgz> bodie_: am going to leave ci idle for now rather than landing your change through the block, in case we want to try out a change to the test along the lines of dimiter's proposed patch on ci to see if it helps
<dimitern> mgz, what proposed patch?
<voidspace> dimitern: ping
<dimitern> voidspace, pong
<voidspace> dimitern: so the new api method
<voidspace> dimitern: for every machine tag requested it should allocate an address, set it on the machine, construct an interface info result
<voidspace> dimitern: will this api only be called for machines that support address allocation?
<voidspace> dimitern: (that information can be added to the ManagerConfig)
<mgz> dimitern: a speculative future proposed patch
<voidspace> dimitern: SupportAddressAllocation takes a netId - so should we call environ.Subnets(), then pick a network, check environ.SupportAddressAllocation(netId) (continuing until we find one and erroring if we don't)
<dimitern> voidspace, so, the environ supports address allocation or not strictly speaking
<voidspace> dimitern: allocating an address on the first network we find that supports address allocation
<voidspace> dimitern: it is implemented as taking a netId
<voidspace> dimitern: but ok, fair enough (and yes!)
<dimitern> mgz, unfortunately I won't be around long enough to try a fix :/
<voidspace> dimitern: but are we assuming this api will only be called where the environ does support address allocation
<voidspace> dimitern: and how do I pick the right subnet?
<dimitern> voidspace, the apiserver should check the given machine tag is a container tag - also it should be allowed only to pass containers which have the agent machine as parent
<voidspace> dimitern: I'm checking ContainerType
<dimitern> voidspace, i think the code path is roughly speaking like this: 1) if env.SupportAddressAllocation() { <case1> } else { return nil and some error - e.g. NotSupportedf }
<voidspace> dimitern: do you know how I find the agent machine - or should I hunt around?
<dimitern> voidspace, sure - that's the tag used at login
<voidspace> cool and cool
<voidspace> yes, check address allocation is supported or error
<voidspace> great
<dimitern> voidspace, like authorizer.GetAuthTag() ...
<dimitern> voidspace, then, for <case1>, let me think a bit
<voidspace> :-)
<dimitern> voidspace, you need a netId to pass to SupportAddressAllocation
<voidspace> yep
<dimitern> voidspace, but that's ignored for now, right?
<voidspace> as far as I know
<voidspace> I'd have to check
<voidspace> for example, for MaaS we *could* check if the specified netId has a static range
<voidspace> but we don't yet
<dimitern> voidspace, right, so we should keep this in mind, but for now let's ignore it
<voidspace> sure
<dimitern> voidspace, we need to allocate an address from the same subnet as the container's host's primary NIC
<dimitern> (what a mouthful)
<perrito666> back
<voidspace> dimitern: ok...
<dimitern> voidspace, so what *i think* we need is to get the subnets for that host, then pick a reasonable one
<voidspace> dimitern: the one containing the primary IP of the host for example
<dimitern> voidspace, right
<voidspace> so long as the state addresses for the machine are correct that should work
<dimitern> voidspace, all this is assuming a bit too much, but until we have per-service-endpoint bindings in place we could improvise
<dimitern> voidspace, strictly speaking (as per the model) we should create 1 container NIC per host NIC
<voidspace> right
<voidspace> but not yet
<dimitern> voidspace, but that's too much for now
<dimitern> voidspace, yeah
<dimitern> voidspace, let's decide which subnet later, for now just pick the first allocatable one
<dimitern> voidspace, and please add a comment this is temporary
<dimitern> voidspace, it should allow us to do an end-to-end simple test with 1 subnet
<voidspace> dimitern: hah, ok
<voidspace> that's easy enough
<voidspace> thanks
<dimitern> voidspace, cool
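The code path sketched in the exchange above (bail out unless the environ supports address allocation, then pick the first allocatable subnet as a temporary stop-gap) might look roughly like this. All type and function names here are illustrative, not the real juju environs/apiserver API:

```go
package main

import (
	"errors"
	"fmt"
)

// environ is a hypothetical stand-in for the real Environ interface.
type environ struct{ supportsAlloc bool }

func (e environ) SupportsAddressAllocation(netID string) (bool, error) {
	return e.supportsAlloc, nil // netID is ignored for now, per the discussion
}

type subnet struct {
	cidr        string
	allocatable bool
}

var errNotSupported = errors.New("address allocation not supported")

// allocateFor sketches the flow dimitern outlines: error out unless the
// environ supports allocation, then pick the first allocatable subnet.
// Picking the subnet of the host's primary NIC comes later; first-allocatable
// is explicitly temporary until per-service endpoint bindings exist.
func allocateFor(env environ, subnets []subnet) (string, error) {
	ok, err := env.SupportsAddressAllocation("")
	if err != nil {
		return "", err
	}
	if !ok {
		return "", errNotSupported
	}
	for _, s := range subnets {
		if s.allocatable {
			return s.cidr, nil
		}
	}
	return "", errors.New("no allocatable subnet found")
}

func main() {
	subnets := []subnet{{"10.0.0.0/24", false}, {"10.0.1.0/24", true}}
	cidr, err := allocateFor(environ{supportsAlloc: true}, subnets)
	fmt.Println(cidr, err)
}
```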
 * dimitern should be going now
<voidspace> dimitern: o/
<jw4> dimitern: sorry - was at the gym... Yes I can repro consistently.  I'll check with katco and your github repo and see if you have any proposed fixes I can test
<jw4> katco: are you picking up what dimitern was working on with the peergrouper test failures?
<katco> jw4: sorry, this is the first i'm hearing about this
<jw4> katco: no worries - I just thought I saw you respond to dimitern when he was talking about it
<katco> jw4: are you perhaps thinking of the bug dimitern pointed me at? https://bugs.launchpad.net/juju-core/+bug/1410320
<mup> Bug #1410320: juju status --format summary panics with unresolvable IPs <cmdline> <network> <panic> <status> <juju-core:Triaged by cox-katherine-e> <https://launchpad.net/bugs/1410320>
<jw4> hmm; if that was the bug he was talking about I'm confused
<mgz> there were a couple of conversatons happening at once :)
<katco> jw4: i think you... mgz has it :)
<jw4> katco: :)
<katco> jw4: i would continue with your assumptions, but i don't think i was ever involved in the peergrouper stuff
<jw4> kk
<mgz> dimitern mentioned that the patch in the test at issue in bug 1409827 was dodgy, but didn't have a proposed fix
<mup> Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged> <juju-core 1.22:Triaged> <https://launchpad.net/bugs/1409827>
<jw4> mgz: yeah that was the one.  I have been able to consistently repro that one on my box since august
<jw4> mgz: I just ignored it since the CI wasn't getting it, and other folks weren't seeing it
<jw4> (maybe even before august, that's just how far back my test run captures go)
<mgz> jw4: you have a talent for getting some of these inconsistent failures...
<jw4> mgz: lol
 * jw4 whispers:  I'm using a hyper-v ubuntu VM on top of -- windows 8.1 --- (gasp)
 * jw4 notes the stunned silence with some chagrin
<mgz> jw4: I barely even giggled
<jw4> now I'm crestfallen instead of chagrined
<mgz> does support the general theory of timing relatedness, can't imagine that's the smoothest setup
<jw4> yeah.  It looks like no-one is assigned to it anymore.  I'll investigate a bit
<voidspace> g'night all
<voidspace> EOD
<perrito666> wallyworld: drop me a ping whenever you return please
<bodie_> mgz, still around?  I'm having a mental merge conflict between your and dimitern's directives about pr 1399 :P
<thumper> cmars: hey there
<thumper> cmars: are you ok if we skip today's call?
<thumper> cmars: I've got a heap to do
<perrito666> wallyworld: ping me whenever you are here
<thumper> cmars: oh, noticed that you had declined today's meeting anyway, so we're all good :)
<menn0> katco: ping?
<katco> menn0: hey what's up?
<menn0> katco: I'm in the process of making some machine agent workers run per environment (in a multi-env Juju server)
<menn0> katco: is the lease manager a global thing or a per environment thing
<menn0> katco: i'm guessing the former, but want to be sure
<katco> menn0: per environment
<katco> menn0: the way to think about it is it has a 1:1 relation with a state server
<menn0> katco: ok, but the state server is shared by multiple environments...
<katco> menn0: maybe i'm misunderstanding the word "environment"
<menn0> katco: i'll give some background
<katco> menn0: if the state server is shared, then the lease server will be shared
<menn0> so we're almost at a place where one state server (or set of replicated state servers) will be able to support multiple, independent Juju environments
<katco> menn0: ah gotcha... i think i know what this is in reference to
<menn0> so if you already have a state server up, you can issue a command to add another environment which can then have its own machines, services, charms etc
<katco> menn0: since the leases are stored in mongo, and the lease server is running on the state server, i think the answer is "global" in your case
<katco> s/lease server/lease manager/g
<katco> menn0: the spec has a runtime-components diagram which should help
<menn0> katco: ok great.
<menn0> katco: I had forgotten about that diagram. I'll take another look just to make sure I've got it straight.
<menn0> katco: thanks!
<katco> menn0: please feel free to ping me with any follow-up questions if it's at all unclear
<menn0> katco: grr... I can't find the spec, only the text source
<katco> menn0: np one sec
<katco> menn0: https://drive.google.com/open?id=0B24olKDYt1DQa1piUFhSMGpuWjA&authuser=1
<menn0> katco: thanks. searching for "lease", "lease service" etc on google drive doesn't find it.
<katco> menn0: i wonder if i can tag documents
<katco> menn0: strange, the title is lease-service.org.pdf
<menn0> katco: yeah I know. pretty crap.
<menn0> katco: this is a bit of an aside, but we need to make sure that lease service use cases that are env specific are isolated from each other (e.g. presence)
<menn0> katco: I guess we just need to make sure sensible ids - which include the env UUID - are used
<katco> menn0: the lease service itself is completely agnostic to what it's asked to store
<katco> menn0: it provides a namespace functionality, so that can easily be used for presencer etc.
<menn0> katco: yep I see that. I'm just thinking out loud about what it's going to be used for.
<katco> menn0: your comments are appreciated :)
<menn0> katco: all good. i'm now sure that the lease service worker just needs to run once, not per env.
<menn0> katco: and that's what I need to know right now.
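menn0's earlier aside, that env-specific lease uses (like presence) must stay isolated when one lease manager serves many environments, boils down to namespacing keys with the environment UUID. A hypothetical helper, not the real lease API:

```go
package main

import "fmt"

// leaseKey is an illustrative helper: when a single, global lease manager
// serves multiple environments, callers keep per-environment uses isolated
// by building ids that include the env UUID plus a namespace.
func leaseKey(envUUID, namespace, id string) string {
	return fmt.Sprintf("%s:%s:%s", envUUID, namespace, id)
}

func main() {
	// Two environments claiming the "same" presence lease never collide,
	// because their env UUIDs differ.
	fmt.Println(leaseKey("env-uuid-1", "presence", "unit-wordpress-0"))
	fmt.Println(leaseKey("env-uuid-2", "presence", "unit-wordpress-0"))
}
```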
<katco> menn0: good to hear :) you might also be interested: i pulled MachineAgent into its own exportable package, so you can now write unit tests against them
<katco> menn0: without spinning up a jujud
<menn0> katco: yep, I'd already seen that. very helpful - thanks!
<katco> :)
<menn0> katco: at some point I'd like to unpick the upgrade-steps work and extract it to the worker part of the tree for similar reasons. it's currently quite coupled to the machine agent.
<katco> menn0: right; i think the agent shouldn't know anything about what it's running. it should just loop over a slice of functions it got passed
<katco> menn0: i'm currently spending my fridays picking apart jujud/*
<menn0> katco: well hopefully what I'm about to do doesn't upset your efforts
<menn0> katco: i'm about to reorganise how some workers get started
<katco> menn0: thank you for your concern. i wouldn't worry about it though... your work should take precedence
<katco> menn0: mine will take awhile i think
<menn0> katco: i'm hoping that my changes will also leave the machine agent code a little clearer than it was
 * katco cheers on menn0 
 * thumper groans
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs:  1409827 1410556
<wallyworld> perrito666: hey, just about to have short meeting, will ping you soon
<perrito666> wallyworld: going out for a walk if I dont answer ill ping you back :p
<wallyworld> sure :-)
<thumper> oh fark
<thumper> menn0: unit agents don't run upgrades do they?
<thumper> menn0: I need to upgrade the unit agents config files...
<thumper> wallyworld: did you know about this failure? http://juju-ci.vapour.ws:8080/job/run-unit-tests-precise-i386/1239/console
<thumper> wallyworld: I recall you doing some maas constraints stuff
<wallyworld> thumper: in a meeting now with sinzui etc discussing such things
<thumper> wallyworld: kk
<wallyworld> thumper: we have 4 unit test failures blocking CI. we'll look into the MAAS i386 one. can someone on your team look at bug 1236471?
<mup> Bug #1236471: Sporadic test failure w/ bot inside Uniter: FilterSuite.TestUnitRemoval <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1236471>
<wallyworld> that was reported in Oct 2013 so is an old failure
<wallyworld> sorta dormant till now
 * thumper sighs
<thumper> wallyworld: who is looking at the power one?
<wallyworld> which one are you thinking of?
<wallyworld> bug 1410556 ?
<mup> Bug #1410556: TestStartInstanceUnmetConstraints fails on 386 and ppc64el <ci> <i386> <ppc64el> <regression> <test-failure> <juju-core:Triaged> <juju-core 1.22:Triaged> <https://launchpad.net/bugs/1410556>
<thumper> maybe...
 * thumper looks
<wallyworld> that one i'll look at
<wallyworld> i copied an existing test and added a constraint to start instance so i have nfi why it fails
<thumper> https://bugs.launchpad.net/juju-core/+bug/1409827
<mup> Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1409827>
<wallyworld> i was going to ping nate about that one
<thumper> wallyworld: nate is on paternity leave no?
<wallyworld> ah yes
<wallyworld> he was going to fix the replicaset related "unit" tests
<wallyworld> but i don't think that's been done yet
<wallyworld> any mongo related peer group / replica set tests are horrible
<wallyworld> if you wanted to fix that one instead.....
<wallyworld> we just need to start dividing up the work as it's got to the point where i won't be able to do it all in a timely fashion, as i'm also fixing functional issues for 1.22
<thumper> wallyworld: menno was going to look at bug 1409827
<mup> Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1409827>
<wallyworld> \o/ that would be great
<wallyworld> ty
<thumper> I'll take the intermittent failure after I've finished fixing this branch
<wallyworld> sure, i'll do the maas i386 one
<wallyworld> and there's another different i386 one also
<wallyworld> bug 1408762
<mup> Bug #1408762: --constraints option is ignored on MaaS provider <constraints> <maas-provider> <juju-core:In Progress by wallyworld> <juju-core 1.22:Fix Committed by wallyworld> <https://launchpad.net/bugs/1408762>
<wallyworld> ah no
<wallyworld> pingerSuite tests consistently fail on trusty 386
<wallyworld> whatever the right number is
<wallyworld> i'll see if my team can get that one fixed
<wallyworld> sinzui: bug 1410556 shouldn't apply to 1.33 yet as the change has only been committed to 1.22
<mup> Bug #1410556: TestStartInstanceUnmetConstraints fails on 386 and ppc64el <ci> <i386> <ppc64el> <regression> <test-failure> <juju-core:Triaged> <juju-core 1.22:Triaged> <https://launchpad.net/bugs/1410556>
<sinzui> wallyworld, I will fix that
<wallyworld> sinzui: i just did
<sinzui> :)
<wallyworld> just letting you know
<perrito666> wallyworld: back
<wallyworld> perrito666: give me 2 minutes
<perrito666> np
<wallyworld> perrito666: did you want a hangout? https://plus.google.com/hangouts/_/canonical.com/tanzanite-stand
<wallyworld> sinzui: fix for bug 1410556 just merged into 1.22
<mup> Bug #1410556: TestStartInstanceUnmetConstraints fails on 386 and ppc64el <ci> <i386> <ppc64el> <regression> <test-failure> <juju-core 1.22:In Progress by wallyworld> <https://launchpad.net/bugs/1410556>
#juju-dev 2015-01-14
<perrito666> wallyworld: btw I began using quassel, it sucks that they dont support ubuntu indicator for messages
<wallyworld> yeah, maybe there's a plugin, not sure
<perrito666> doesnt seem to implement plugins
<perrito666> but having the server in amazon works better than bip
<thumper> wallyworld: re bug 1236471
<mup> Bug #1236471: Sporadic test failure w/ bot inside Uniter: FilterSuite.TestUnitRemoval <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1236471>
<thumper> wallyworld: it seems to be just utopic
<thumper> is there someone running utopic that could test it?
<wallyworld> oh joy
<wallyworld> i can
<thumper> works fine here
<wallyworld> sigh
<wallyworld> i'll try it now
<thumper> wallyworld: although a key indicator to the problem could be the comment on line 111
<wallyworld> thumper: works fine for me on utopic
<thumper> wallyworld: reading the code...
<thumper> wallyworld: it seems that sometimes destroy removes the unit, and sometimes it just sets it to dying
<thumper> smells like a race condition
<thumper> the filter that we check dies seems to not find the unit
<wallyworld> yeah, sadly
<thumper> so it feels like in that case, the doc was removed
<thumper> as opposed to just set to dying
<thumper> wallyworld: reading the comment in state/unit.go Destroy method
<wallyworld> that makes sense having now read the code
<thumper> wallyworld: it seems that once destroy has been called, we can't depend on it being there
<thumper> so the test looks fucked
<thumper> or the filter is screwed
<thumper> as it expects it to be there
<thumper> and it has gone
<thumper> perhaps the test should handle the missing unit error
<thumper> I bet what is happening
<thumper> is that the remove is being executed before the filter go routine starts
<wallyworld> thumper: this comment
<wallyworld> / Ensure we get a signal on f.Dead()
<wallyworld> seems to imply the test expects the unit to be Dead
<wallyworld> but it's either still dying or already removed as you say
<thumper> seems like the test needs to ensure that the filter go routine has started
<thumper> and it probably doesn't
<thumper> nope
<thumper> it doesn't
<thumper> there is the race
<thumper> filter.go line 281 will be where it errors
 * thumper grunts
<thumper> wallyworld: perhaps the best way to fix this is to have NewFilter not return until the goroutine for the loop has hit a ready state?
<thumper> wallyworld: thoughts?
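thumper's proposal, that the constructor should not return until the loop goroutine is actually running, can be sketched with a ready channel. The names are illustrative, not the real worker/uniter filter API:

```go
package main

import "fmt"

// filter sketches a worker whose constructor synchronizes with its loop
// goroutine, so a caller can never race ahead of the loop's startup.
type filter struct {
	dead chan struct{}
}

func newFilter() *filter {
	f := &filter{dead: make(chan struct{})}
	ready := make(chan struct{})
	go f.loop(ready)
	<-ready // block until the goroutine has started before returning
	return f
}

func (f *filter) loop(ready chan struct{}) {
	close(ready) // signal the constructor before doing any real work
	// ... watch the unit here; close f.dead when it is removed ...
	close(f.dead)
}

func main() {
	f := newFilter() // loop is guaranteed running once this returns
	<-f.dead
	fmt.Println("filter loop observed unit removal")
}
```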
<wallyworld> thumper: i'm just reading the filter code and in various places there's this: IsCodeNotFoundOrCodeUnauthorized
<wallyworld> so it seems that the filter can expect the unit to not be there
<wallyworld> and in that case, it returns an ErrTerminateAgent which is what the test wants
<wallyworld> do you see what i mean?
<wallyworld> in the loop()
<wallyworld> oh, i just read what you posted above
<wallyworld> let me look at that
<menn0> wallyworld, thumper: fyi I have made little progress with bug 1409827
<mup> Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1409827>
<thumper> wallyworld: but you can see from the bug, the error is a "not found" one
<menn0> wallyworld, thumper: mainly due to distractions at home
<wallyworld> menn0: i hate those tests
<thumper> wallyworld: because it is a params.Error not an rpc.Error
<menn0> wallyworld, thumper: I am questioning why it's a CI blocker though. it fails far less often than the other ones you're currently looking at
<menn0> wallyworld, thumper: I can only find one recent failure
<wallyworld> menn0: maybe sinzui can answer
<menn0> wallyworld, thumper, sinzui: it obviously needs to be dealt with, I just don't know if it's CI blocker material
<menn0> wallyworld, thumper, sinzui: continuing with it now anyway
<wallyworld> ty
<thumper> wallyworld: I think I know why it isn't being caught
<wallyworld> thumper: so the test log shows the CharmURL() api call failing with a params.NotFound error but this error should have been caught in the defer and changed to ErrTerminateAgent
<thumper> wallyworld: because if any of the errors are traced or wrapped, then they don't match the call
<thumper> maybe...
<wallyworld> thumper: because ?
<wallyworld> if err, _ := err.(rpc.ErrorCoder); err != nil {
<wallyworld> err could be wrapped?
<thumper> I'm thinking it is a possibility
<thumper> however...
<thumper> no...
<thumper> because the resulting error is a params.Error
<thumper> wallyworld: I don't see how this could fail... :(
<wallyworld> sigh, me either, but we might be missing something subtle not being familiar with the code
<wallyworld> thumper: if i comment out the deferred error handling to convert a not found to an ErrTerminateAgent, i can get it to fail just like in the test
<wallyworld> so that does suggest that that error handling being used is critical to making the test pass, and i can't see how the error could be escaping
<thumper> looking at the bug test log
<thumper> [LOG] 67.77366 DEBUG juju.rpc.jsoncodec -> {"RequestId":12,"Response":
<thumper> indicates that line 320 is the return statement in question
<thumper> but I agree, can't see why it wasn't changed
<wallyworld> thumper: still, i do think we need an errors.Cause() instead of a straight cast to rpc.Error in the params error code stuff
<wallyworld> rpc.ErrorCoder i mean
<thumper> while I generally agree
<thumper> I'm trying to work out this failure
<thumper> and this isn't it...
<thumper> api/uniter/unit.go line 446
 * thumper thinks...
<thumper> hang on...
<wallyworld> sure, that was a general comment
<wallyworld> not for this fix
<thumper> fucker
<thumper> ...
<thumper> damn it
 * thumper looks deeper
<wallyworld> result.Error won't be caught maybe
<wallyworld> i saw that line before and assumed it would be caught
<wallyworld> oh wait, yes it will
<thumper> wallyworld: I need a reference to a recent failure
<thumper> result.Error is a pointer
<thumper> and the *params.Error does match the interface
<thumper> we need to see the recent failure
<thumper> looking at modern code for an old failure is a waste of time
<thumper> too much can change
<wallyworld> thumper: attached to the bug
<wallyworld> https://launchpadlibrarian.net/190620335/filter-failure.log
<thumper> wallyworld: the 19th of November isn't recent
<wallyworld> thumper: looks like a wrapping issue
<thumper> ah, that one is a wrapping issue
<wallyworld> obtained *errors.Err = &errors.Err
<thumper> haha
<thumper> yeah, that is one
<wallyworld> so maybe we should do my previous suggestion
<thumper> I'll fix this
<wallyworld> awesome
<menn0> wallyworld, thumper: I think I see a potential data race relating to bug 1409827
<mup> Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1409827>
<menn0> mustNext in worker/peergrouper/worker_test.go
<menn0> the return value is set in a separate goroutine
<menn0> isn't that a no-no?
<menn0> thumper: ^
<menn0> might not be the source of the test failures but looks fishy
<wallyworld> menn0: just in meeting, will look soon
<axw> katco: standup?
<katco> axw: shoot sorry
<thumper> menn0: /me looks
<thumper> menn0: which line?
<menn0> thumper: from 513 onwards
<menn0> thumper: the return value "val" is assigned directly from the goroutine
<menn0> thumper: the more I think about it the more I don't think this is the test failure (because if nil was returned we'd see a different kind of failure)
<menn0> thumper: but i'm clearing it up anyway
<thumper> you could be right
<thumper> but we aren't seeing "timeout waiting" are we?
<thumper> menn0: it is wrong, but not the source I think
<thumper> wallyworld: this bug isn't a critical blocker, but done anyway...
<wallyworld> thumper: in meeting, will look in a sec
<thumper> wallyworld: https://bugs.launchpad.net/juju-core/+bug/1236471
<mup> Bug #1236471: Sporadic test failure w/ bot inside Uniter: FilterSuite.TestUnitRemoval <test-failure> <juju-core:In Progress by thumper> <https://launchpad.net/bugs/1236471>
<thumper> gah, wrong paste: http://reviews.vapour.ws/r/723/diff/#
<thumper> axw: test added
<axw> thumper: thanks
 * thumper wonders if the bot will pick this up
<wallyworld> thumper: yay, bot got your change. so i think the SetMembers test is now the only remaining blocker
<axw> menn0: I took a look at the test yesterday, it just looked like time sensitivity to me
<axw> menn0: if you double that last sleep, it'll fail each time
<axw> the sleep durations do not leave a lot of margin for error/jitter/whatever
<menn0> axw: totally agree. i'm figuring out how to rewrite it to not be so fragile.
<menn0> axw: in fact, checking that retries happen with exponential backoff seems somewhat unnecessary
<menn0> axw: what do you think about just ensuring that retries are occurring and leaving it at that?
<thumper> wallyworld: branch landed, bug updated to fix released
<thumper> wallyworld: as it was just a test failure bug
<wallyworld> thumper: awesome, thanks
<axw> menn0: seems fine to me
<axw> menn0: it's a bit of an overkill test
<ericsnow> wwitzel3: the patch to fix PortSet was pretty simple
<ericsnow> wwitzel3: not that it affects us, but Intersection had the same problem
<ericsnow> wwitzel3: anyway, I'm EOD :)
<menn0> axw: I have a fix but thought I'd run one other possibility by you
<menn0> axw: you don't think there's a possibility that the issue is to do with the way the count variable is being handled?
<menn0> axw: it's being updated in another goroutine
<axw> menn0: moment
<menn0> axw: davecheney tells me that there's no guarantees about how updates will be seen by other goroutines
<axw> menn0: yeah, I think that's wrong. it should just return the value on the channel...
<axw> menn0: I don't think that's the cause of the failure tho
<axw> well, it could be but I think the time sensitivity is more likely
<menn0> axw: so do i but i thought i'd run it past you
<menn0> axw: anyway, i've removed all the fragile timing checks... i'll have that up for review shortly
<axw> cool
<axw> menn0: maybe just change the chan bool to a chan struct{bool, interface{}} while you're there? :)
<menn0> axw: oh i've done that
<axw> great
<menn0> axw: but what I meant is the function passed to setErrorFuncFor in TestSetMembersErrorIsNotFatal
<menn0> axw: it updates the voyeur with a integer
<axw> menn0: oh.. looking
<menn0> axw: actually... never mind
<menn0> axw: that's fine
<menn0> axw: i'm looking too closely
<menn0> axw: the variable is only used and updated from one goroutine so there's no issue
<axw> yup
<menn0> axw: http://reviews.vapour.ws/r/725/
<axw> looking
<menn0> axw: the channel change is a separate PR
<menn0> axw: here's the other one: http://reviews.vapour.ws/r/726/
<menn0> axw: all ok?
<axw> menn0: reviewed, I'd prefer if we got rid of any explicit sleeps
<axw> lemme know what you think - maybe I'm being too pedantic
<axw> menn0: alternatively just get rid of the sleep in the test, since mustNext will wait up to LongWait anyway
<menn0> axw: I guess I was wanting to see multiple retries
<menn0> axw: but that can be done with multiple mustNext calls
<menn0> so i'll do that
<axw> menn0: hence the loop in my code, but yes, multiple mustNexts will do that too
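The mustNext fix axw and menn0 converge on, sending the value over the channel rather than assigning a shared variable from the watcher goroutine, might look like this sketch. The function shape and the five-second stand-in for coretesting.LongWait are assumptions, not the real test helper:

```go
package main

import (
	"fmt"
	"time"
)

// result bundles the value with its ok flag so both cross the goroutine
// boundary on the channel, instead of a racy shared-variable assignment.
type result struct {
	val interface{}
	ok  bool
}

// mustNext waits for the next value from a producer, with a timeout.
func mustNext(next func() (interface{}, bool)) (interface{}, bool) {
	ch := make(chan result, 1)
	go func() {
		v, ok := next()
		ch <- result{v, ok} // channel send establishes happens-before
	}()
	select {
	case r := <-ch:
		return r.val, r.ok
	case <-time.After(5 * time.Second): // stand-in for LongWait
		panic("timeout waiting for value")
	}
}

func main() {
	v, ok := mustNext(func() (interface{}, bool) { return 42, true })
	fmt.Println(v, ok)
}
```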
<waigani> ericsnow: is http://reviews.vapour.ws/r/724 really that big (about 40 files, 5000 lines changed) or has RB gotten confused?
 * thumper headdesks
<ericsnow> waigani: oops, no it's like 15 lines :)
<thumper> yay string constants
<thumper> go on, search the codebase for "90168e4c-2f10-4e9c-83c2-feedfacee5a9"
 * thumper fixes
<waigani> ericsnow: phew. I thought, "there goes my afternoon..."
<waigani> whoha, that's a lot of feedface
<ericsnow> waigani: what you saw is the GCE provider patch (minus +/- 1500 lines of tests we're still writing)
<ericsnow> waigani: we'll be splitting that up into multiple review requests though :)
<menn0> axw: http://reviews.vapour.ws/r/725/ updated
<waigani> ericsnow: that would be good, otherwise it's a hell of a patch to review!
<ericsnow> waigani: :)
<axw> menn0: lgtm, thanks
<menn0> axw: sweet. thanks.
<menn0> thumper, wallyworld : fix for bug 1409827 merging. is the policy that I can mark the ticket as Fix Released once it's in b/c it's a test only fix?
<mup> Bug #1409827: TestSetMembersErrorIsNotFatal fails <ci> <intermittent-failure> <regression> <test-failure> <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1409827>
<thumper> menn0: I think so
<wallyworld> menn0: not sure, what tim said
<menn0> wallyworld: good answer :-p
<wallyworld> yep :-)
<thumper> waigani: do you remember where we generate the uuid for new environments?
<thumper> waigani: I remember we moved it...
<thumper> but can't remember where
<waigani> thumper: I think it's actually in util?
<thumper> waigani: well, where we assign it into the environ config
<waigani> thumper: will look in a sec, hangon
<thumper> environs.ensureUUID
<thumper> which is in environs.prepare
<thumper> hmm...
<thumper> ok
<waigani> thumper: sorry, did you want me to hunt for anything else now?
<thumper> waigani: nope
<thumper> got it now
<waigani> cool
<thumper> damn... how long do these tests take to run?
<thumper> geez
<anastasiamac> waigani: wallyworld: had to change http://reviews.vapour.ws/r/722/
<anastasiamac> waigani: wallyworld: could u PTAL again?
 * thumper taps his fingers
<anastasiamac> waigani: wallyworld: Get at apiserver now takes params.Entities
<anastasiamac> waigani: wallyworld: everything else should be the same
<anastasiamac> waigani: wallyworld: in fact, it's a revert to original rather than a change...
<waigani> anastasiamac: just read review, yep that makes sense to me
<anastasiamac> waigani: awesome - so I'll keep ur shipit :)
<waigani> anastasiamac: sure :)
<wallyworld> anastasiamac: +1
<thumper> can someone tell me how long the tests should take in worker/provisioner plz?
<anastasiamac> thumper: my last run
<anastasiamac> thumper: github.com/juju/juju/worker/provisioner 56.706s
<thumper> anastasiamac: ta
<anastasiamac> wallyworld: thnx!!
<menn0> merges are unblocked people
<anastasiamac> menn0: yes there r already changes queued up in jenkins :)
<anastasiamac> menn0: axw: thnx for unblocking it :)
<menn0> anastasiamac: all those queued merges may or may not be mine
 * menn0 ducks
<anastasiamac> menn0: and here i was thinking to start the collection for a case of scotch...
<menn0> ha
<thumper> fark
<thumper> fark fark fark
<thumper> bitten by this same fucking issue again
<thumper> FYI, machine config only has APIInfo structure set for state server machines
<thumper> hmm...
<thumper> no
<thumper> damn
<menn0> thumper: i'm trying to get my head around how an API connection for a new environment will be opened
<menn0> thumper: the password will always be the same right? it's stored on the user, not the envuser.
<menn0> thumper: actually... how do passwords for machines work?
<thumper> yes, machines have passwords
<thumper> menn0: but that bit isn't working right now
<menn0> thumper: fixes in your branch?
<thumper> menn0: I've opened a huge pile of worms with this env uuid in the agent config stuff
<thumper> broken heaps
<thumper> and slowly untangling
<thumper> but I'm being called away
<thumper> to walk the dog
<thumper> so I'm done
<thumper> for today
<menn0> that sucks
<menn0> ok
<menn0> i will ignore that part for now
<axw> anastasiamac: looks like your branch failed on the bot, but I think it might be an infrastructure issue
<axw> take a look, you can probably just retry it
<anastasiamac> axw: thnx will look
<anastasiamac> is there a comment I can send to bot to not try to $$merge$$?
<anastasiamac> like cancel a merge?
<wallyworld> anastasiamac: you need to have credentials, i can cancel an in progress landing if you want
<anastasiamac> wallyworld: ic
<wallyworld> axw: if you have a few minutes, i'd appreciate a review of http://reviews.vapour.ws/r/727/
<anastasiamac> wallyworld: no big deal i have queued a couple of branches and noticed too late that one of them has an unresolved conflict
<axw> wallyworld: looking
<anastasiamac> wallyworld: it'll just fail.. but thnx :)
<wallyworld> ok
<wallyworld> anastasiamac: annotations-tags?
<anastasiamac> wallyworld: no sync-tools
<anastasiamac> wallyworld: annotations tags are about to be backported to 1.22 ;)
<anastasiamac> wallyworld: since they've merged well :P
<jam> anastasiamac: wallyworld: aren't we in feature freeze for 1.22?
<anastasiamac> jam: this is not a feature that's new to 1.22
<anastasiamac> jam: it's kind of a bug... that needs to be fixed in 1.22
<anastasiamac> jam: trying to get ckient signature right to avoid conflicts later
<anastasiamac> jam: client*
<anastasiamac> jam: s/conflicts/headaches
<jam> k
<anastasiamac> jam: thnx for checking :)
<axw> wallyworld: reviewed
<wallyworld> ty looking
<axw> grrrrrr, shitty tests
<anastasiamac> wallyworld: cherry picked annotations change http://reviews.vapour.ws/r/728/
<wallyworld> axw: i introduced a new bootstrap method to avoid churn on the other providers.  i can quite see how to wrap the finaliser though such that the instance id is available to it, since it's called from bootstrap/bootstrap.go with a machine cfg without the id and is only filled in inside the closure
<wallyworld> s/can/can't
<axw> wallyworld: maybe I'm wrong, gimme a sec
<axw> wallyworld: yep, you're right, sorry
<wallyworld> axw: sure, np. you happy with the sig change to avoid churn?
<axw> wallyworld: yes that'd be good, thanks
<wallyworld> axw: to be clear, I made the sig change to avoid updating all the other providers. but you didn't like it
<wallyworld> also, the way i have it avoids the boilerplate error checking
<wallyworld> that would otherwise have to be introduced
<axw> wallyworld: where would there be extra boilerplate?
<wallyworld> axw: what's inside the new Bootstrap() func - those 5 lines or so
<axw> wallyworld: I don't follow. the existing code hasn't changed much, and the callers of Bootstrap still need to check an error
<wallyworld> since i would think we'd want to retain the environs.Bootstrap interface method
<wallyworld> they just return common.Bootstrap() directly since the signature matches that of environs.Bootstrap
<axw> wallyworld: ah yeah, that would need to change
<axw> ok
<axw> forget it
<wallyworld> ok, i'll make the err change though
<axw> wallyworld: keep it as is. I'll comment on the branch
<wallyworld> ok, ta
<wallyworld> axw: also, i started looking at bug 1384259. it seems cloud init is directly running the various apt commands it is configured with , and something else on the machine locks apt and then cloud init is sad. but i haven't dug any deeper. not sure if you have any ideas
<mup> Bug #1384259: race condition running apt in bootstrap <bootstrap> <ci> <oil> <race-condition> <juju-core:Triaged> <juju-core 1.22:Triaged> <https://launchpad.net/bugs/1384259>
<wallyworld> not sure if we want to wrap the cloud init apt commands with a retry
<axw> wallyworld: apt is called from the ssh script, not cloud-init (on bootstrap only)
<wallyworld> ah doh, yeah. i saw cloud init text in the log on the bug
<axw> we could lock... I don't *think* cloud-init does anything like that though
<wallyworld> not sure off hand how to solve this one, need to dig into it some more. any suggestions welcome.
<axw> wallyworld: only thing I can think of is to add a script that waits for any apt or dpkg processes to stop running before we do anything
<wallyworld> yuk, but may have no choice :-(
<wallyworld> i've updated the pr too
<axw> wallyworld: it'd be good to know what it's conflicting with, that might give us a better approach
<wallyworld> i'll ask on the bug
<axw> wallyworld: lgtm
<wallyworld> ty
<wallyworld> axw: in doing that branch i lost so much time due to not realising maas gave back instance ids that were different from the system ids to be passed over the api. sigh. i now know
<axw> wallyworld: :(   any idea why we're using the resource_uri instead?
<wallyworld> nope :-(
<wallyworld> i found a helper function someone wrote to convert
<wallyworld> so it must have been a deliberate decision
<axw> whee, finally. you can now provision ec2 instances with volumes
<wallyworld> axw: whoot! fantastic
<axw> wallyworld: I forgot to ask before: is there a way we can flag some manual testing as being  required for the next release?
<axw> (e.g. ensuring MAAS 1.7 deployments work well, for non-bootstrap machines)
<wallyworld> axw: yes, i plan on raising this with curtis tomorrow
<axw> wallyworld: ok, cool
<axw> wallyworld: but in general, should we be using launchpad bugs or what...?
<wallyworld> for recording testing notes? using lp bugs seems reasonable
<wallyworld> this close to release, i wanted to do it more explicitly
<axw> wallyworld: not so much a note, as "we should not release unless we know this thing has been tested"
<wallyworld> yeah that, sorry, was using the term generically
<axw> ok
<wallyworld> but we don't have a documented process AFAIK to flag critical testing issues
<axw> wallyworld: in my previous job we used to create tasks for every new feature and major bug fix that would block a release. they'd generally need to be done by someone other than the implementer. we had the luxury of having big, dedicated testing teams though :)
<wallyworld> axw: yeah, we had something similar previously for me also
<wallyworld> we just need to make sure that wes the release manager and the QA team are informed, and that other stakeholders are brought in as needed to help test
<dimitern> axw, hey, are you still around?
<axw> dimitern: heya, yes I am
<TheMue> morning *yawn*
<axw> morning
<dimitern> axw, a quick storage question: do we plan to mount devices in lxc containers?
<dimitern> morning TheMue
<axw> dimitern: we want to be able to, yes. it's going to require some changes to lxc templates to allow mounting and so on
<axw> dimitern: why do you ask?
<dimitern> axw, because due to the networking work I plan to make lxc config file templates more flexible
<dimitern> axw, and this should also help for storage
<axw> dimitern: I see, yes, that will be helpful
<dimitern> axw, sweet, I'll let you know when my changes are in then
<axw> dimitern: thanks very much. probably won't be getting to lxc for a little while yet, but that'll be much appreciated
<axw> wallyworld: ^^ dimitern is helping with storage now ;)
<dimitern> wallyworld, axw, :D more like side-effecting it
<dimitern> wallyworld, axw, can any of you have a look at a small goamz PR? https://github.com/go-amz/amz/pull/16 thanks!
<voidspace> dimitern: ping
<voidspace> dimitern: cannot use parent (type names.Tag) as type names.MachineTag in function argument: need type assertion
<dimitern> voidspace, hmm
<dimitern> voidspace, yeah?
<voidspace> dimitern: just getting the code
<voidspace> dimitern: I want to know if it's safe to just do the conversion
<voidspace> dimitern: if I actually have the right tag
<voidspace> dimitern: just finding the place where I get the tag and where I'm using it
<dimitern> voidspace, it is safe if you actually have a names.MachineTag
<voidspace> parent := p.authorizer.GetAuthTag()
<voidspace> parentTag, err := names.ParseMachineTag(parent)
<voidspace> parentMachine, err := p.getMachine(canAccess, parentTag)
<voidspace> ah
<voidspace> now the error is
<voidspace>  cannot use parent (type names.Tag) as type string in function argument
<voidspace> dimitern: so just convert then...
<dimitern> voidspace, wait a sec
<voidspace> dimitern: is the result of GetAuthTag() the machine tag?
<dimitern> voidspace, GetAuthTag does return names.Tag, but if authorizer.AuthMachineAgent() is true then it's safe to cast it
<voidspace> if it isn't true we shouldn't be running...
<voidspace> so I should check I guess
<dimitern> voidspace, yeah - have a look at NewProvisionerAPI in apiserver
<dimitern> voidspace, the very first check is if !authorizer.AuthMachineAgent() && !authorizer.AuthEnvironManager() { return nil, common.ErrPerm }
<dimitern> voidspace, actually, the getAuthFunc defined there is just what you need
<dimitern> voidspace, it already checks parent/child relationship
<voidspace> ah
<voidspace> and I'm using that later anyway
<voidspace> so maybe I don't need a separate check
<voidspace> I'll look at that, thanks
<dimitern> voidspace, yes, *I think* you can just use that getAuthFunc to validate the passed tag
<voidspace> cool, thanks
<dimitern> voidspace, standup?
<voidspace> dimitern: oops, sorry
<dimitern> voidspace, I have a cunning plan :)
<dimitern> voidspace, you can't tag IPs, but you can tag NICs
<dimitern> voidspace, e.g. we can add tags like "juju:machine-id=<id>", "juju:<mid>:address:<#>=<ip>" to the NIC after calling RunInstances
<dimitern> voidspace, so each time we call AttachPrivateIpAddress successfully, we also add a tag "juju:0/lxc/0:address:0"="" (we don't know the address yet), but then when listing all instance IPs we use the tags to decide which goes where
<dimitern> voidspace, and the instance updater can set "juju:0/lxc/0:address:0"="<some yet-unassigned ip>" as a tag and also in state
<dimitern> anyway.. just thinking out loud - tags can be pretty powerful way of adding metadata accessible via aws api even if apiserver dies/cannot be reached, we can use the tags to intelligently cleanup dependent resources
<TheMue> dimitern: to stay with the golang naming conventions I would call it AttachPrivateIPAddress()
<perrito666> morning
<dimitern> TheMue, in goamz it's called AssignPrivateIPAddresses actually
<TheMue> perrito666: heya and good morning
<dimitern> morning perrito666
<TheMue> dimitern: +1
<TheMue> dimitern: just took a look at net package ;)
<dimitern> TheMue, yeah :)
<perrito666> dimitern: TheMue any of you knows what is the status of blocking bugs?
<dimitern> perrito666, all resolved
<perrito666> dimitern: and merged?
<dimitern> perrito666, for now at least, so no longer blocked
 * perrito666 looks at the topic hoping it will disappear
<perrito666> wallyworld: do you not sleep?
<wallyworld> sometimes
<wallyworld> like you can talk :-)
<perrito666> heh fair
<perrito666> anyway your mail makes sense, that is why i added a unit ptr as a member of the unitagent, we can use tag from there
<wallyworld> perrito666: i don't think it makes sense to embed the whole unit into unitagent - i thought we talked about having unitagent very lightweight, just doing status get/set
<voidspace> dimitern: that's a terrible abuse of tags :-D
<dimitern> voidspace, :) oh, I'm just getting started
<voidspace> :-)
<perrito666> wallyworld: I did not embed it, its just a member
<perrito666> wallyworld: you might need some sleep and a couple of drinks
<wallyworld> perrito666: the latter is taken care of :-)
<perrito666> lol
<wallyworld> but i'm still not sure about even referencing unit
<wallyworld> we don't need all that baggage inside UnitAgent struct, which for now is just about get/set status
<perrito666> I am all ears about Tag then :p
<wallyworld> we could invent a new one eg "unitagent-foo/0", or easier, just have SetAgentStatus pass the "unit" tag across the wire and the method knows how to deal with it
<wallyworld> the latter seems best, but maybe i'm missing something
<perrito666> I am not sure of the implication of the latter, I guess it could work
<wallyworld> i think it will be ok, but would need to start coding to see where it ends up
<perrito666> well, that is what I am for
<perrito666> and tonight I have meetings at 11pm and 00 (its 9am) so I seem to have time ahead of me
<voidspace> dimitern: hmmm, I bet you can't set tags atomically though
<dimitern> voidspace, well it would appear so.. although you can't set tags on instance/NIC/etc. creation according to the docs, you *can* launch an instance via the AWS web console and add tags to it
<dimitern> voidspace, i've enabled the cloudtrail API logging and experimenting now to see how they do it
<voidspace> cool
<perrito666> mm, on a machine from scratch here and our tests seem to expect a /usr/lib/juju/bin/mongod
<perrito666> that is sort of wrong for the tests isn't it?
<wwitzel3> perrito666: short answer, yes ;)
<jw4> backport PR to remove accidentally added file from 1.22: http://reviews.vapour.ws/r/731/
<jw4> OCR PTAL ^^
<jw4> :)
<dimitern> jw4, ship it! :)
<jw4> dimitern: :)
<perrito666> that is so close to occipital
<jw4> perrito666: http://en.wikipedia.org/wiki/Occipital_bone ?
<perrito666> true
<perrito666> the OCRPTAL bone
<jw4> hehe
 * jw4 just got it
<TheMue> o/
<perrito666> OCR PTAL http://reviews.vapour.ws/r/732/
<dimitern> perrito666, we should just start using "occipital" :D
<perrito666> dimitern: yes, it was very hard to resist the temptation
<katco> one-line change and test; blocking 1.22; up for review: http://reviews.vapour.ws/r/733/
<dimitern> katco, wow!
<dimitern> katco, a return is all it takes?
<katco> dimitern: i told you i had already thought of the possibility, but i ignored my own warning ;)
<dimitern> katco, hehe - you've got a review
<katco> dimitern: ty, i'll add the bug#
<katco> dimitern: would you be able to do a quick test of the code on your environment? or has the opportunity passed?
<dimitern> katco, sure, let me pull your branch
<katco> dimitern: ty so much :)
<dimitern> katco, np - it's bootstrapping now
<katco> dimitern: cool ty again
<dimitern> katco, ok, so no panic, just a few warnings about dns resolving - http://paste.ubuntu.com/9749392/
<katco> dimitern: that's expected; looks good, yes?
<dimitern> katco, yes, however isn't the warning message a bit misleading?
<katco> dimitern: how so?
<dimitern> katco, "Status may be incorrect" ?
<katco> dimitern: well, it's showing that you're running on no subnets and utilizing no ports
<dimitern> katco, got it, right
<dimitern> katco, lgtm then
<katco> dimitern: ty for all the help; finding it, reporting it, everything :)
<dimitern> katco, np, thanks for fixing it so quickly :)
<katco> dimitern: it's much easier to troubleshoot/fix something when you know (almost) everything about it :)
<katco> dimitern: and the fact that i could write a unit test sped up the process as well
<dimitern> katco, exactly!
<voidspace> dimitern: ping
<voidspace> dimitern: you still around?
<dimitern> voidspace, yep
<voidspace> dimitern: state.State supports adding a subnet that doesn't yet exist in state or fetching one that already exists
<voidspace> dimitern: what I *want* is "get me this subnet - adding it if it doesn't exist"
<voidspace> dimitern: better to do that in a State method or just hand code the logic
<dimitern> voidspace, too many "states" :)
<voidspace> hah
<dimitern> voidspace, does not exist in which state?
<dimitern> voidspace, ah, sorry
<voidspace> the stored state
<dimitern> voidspace, got you
<voidspace> mongo I guess
<voidspace> I mean, I know it's mongo
<voidspace> but I guess that's a better way of saying it...
<dimitern> voidspace, right - we can change AddSubnet to AddOrUpdateSubnet perhaps?
<voidspace> dimitern: ok
<voidspace> gah, and there's network.SubnetInfo plus state.SubnetInfo
<voidspace> I have a network.SubnetInfo, I need a state.SubnetInfo
<dimitern> voidspace, let me have a look
<voidspace> dimitern: I wrote the code, I only have myself to blame
<dimitern> voidspace, right, so the unfortunate duplication is on purpose
<voidspace> dimitern: I'm ok
<katco> dimitern: backport of same change to v1.22: http://reviews.vapour.ws/r/734/
<dimitern> voidspace, state shouldn't depend on other packages, the same applies to params
<dimitern> katco, looking
<voidspace> dimitern: although state does depend on network anyway I believe
<dimitern> katco, ship it! :)
<katco> dimitern: woot! grats on quick turn around on this :)
<dimitern> voidspace, well it does for network.Address I think
<dimitern> katco, well I've seen it before lol
<katco> dimitern: i mean the whole bug :)
<katco> dimitern: wouldn't have gotten resolved, nor so quickly w/o your help
<dimitern> katco, ah, yeah - one of the fastest fixes lately
<dimitern> katco, np, glad to help
<dimitern> voidspace, so.. the state documents shouldn't depend on things outside of state, which might change out-of-band and lead to docs getting serialized differently
<voidspace> dimitern: fair enough
<dimitern> voidspace, we're not entirely depend-less, but let's not make it worse :)
<dimitern> voidspace, as for params - same issue - serialization; we shouldn't change the on-the-wire format of the api incompatibly
<voidspace> I'm aware of that one
<voidspace> for state I don't think it's a *genuine* issue though as we populate a subnetDoc from the SubnetInfo
<dimitern> voidspace, sorry :/
<voidspace> so we're safe from "out of band changes" anyway
<voidspace> as we already have a layer of indirection for the actual serialisation
<dimitern> voidspace, yeah, that's right
<voidspace> adding SubnetInfo is *two layers* of indirection
<voidspace> :-p
<dimitern> voidspace, we should consult fwereade here I think
<voidspace> dimitern: let me work with the code and see how it feels - I'll just write a "caster function" I guess
<dimitern> voidspace, because not depending on packages for the sake of stable serialization format for mongo docs is one thing, but no dependencies at all might be too much
<voidspace> ok
<voidspace> and Subnet representation (network package) is a low level dependency not a structural dependency
<dimitern> voidspace, I think so, yes
<voidspace> dimitern: late ping
<dimitern> voidspace, yeah? i'm around on and off
<voidspace> dimitern: you added network.InterfaceInfo recently, with the intention it be used by the ProviderAPI api?
<dimitern> voidspace, not over the wire though - there's a params.NetworkInfo for that
<voidspace> dimitern: I have subnet info and ip address and am wondering how I get the extra information if that's what I'm required to
<voidspace> dimitern: from the subnet CIDR I'll have to fetch the NIC info
<voidspace> dimitern: it doesn't look like there's a provider method for this (that I can see), can I assume state will have it correctly?
<voidspace> for the host machine
<dimitern> voidspace, sorry, what extra info?
<voidspace> DeviceIndex, MACAddress, NetworkName, InterfaceName
<voidspace> etc...
<voidspace> everything on NetworkInfo that isn't on SubnetInfo
<dimitern> voidspace, hmm
<voidspace> let me double check there's no environ.Interfaces
<dimitern> voidspace, from state you mean?
<voidspace> dimitern: I call environ.Subnets() which returns []network.SubnetInfo
<voidspace> dimitern: there is an interface collection in state though I *believe*
<voidspace> dimitern: maybe a problem for tomorrow as it's late for me too
<voidspace> dimitern: I thought you might know easily... :-)
<dimitern> voidspace, yeah - let's call it a day :) I'm a bit dumb now I'm afraid
<voidspace> dimitern: it's even later for you than it is for me! Goodnight, see you tomorrow.
<voidspace> and goodnight everyone
<thumper> morning folks
<thumper> geez... you make one small thing required and suddenly heaps of tests break...
<perrito666> thumper: what did you break while we where not looking?
<thumper> perrito666: I'm needing to add the environment uuid to the agent config
<thumper> perrito666: otherwise all the machine and unit agents don't know which environment to connect to
<thumper> perrito666: but that opened a world of hurt
<thumper> that I've spent about five hours unpicking
<thumper> I think I'm almost there
<thumper> then I need to write an upgrade step
<perrito666> I am sure you said "this should be easy" before starting, that usually complicates things
<thumper> I think I did
<thumper> I expected it to be a few hours
<thumper> not days
<perrito666> well, you should never jinx it
<thumper> and I seem to have made the provisioner tests never exit
<thumper> ...
<thumper> not sure how that happened
 * thumper looks up and sees two open critical bugs
<thumper> WTF
<thumper> ok... topic is wrong
<perrito666> thumper: yep, I don't know why it's not back to none
<perrito666> build is unblocked
<perrito666> now, this is unexpected, there is an mtv channel that actually has music
<thumper> haha
<thumper> menn0: bot is unblocked, land your pending ones if you haven't already
 * thumper makes a sad face
<thumper> just found the most horrible fragile test
<thumper> but don't have time to fix it right now
<perrito666> thumper: which is?
<thumper> func (*cloudinitSuite) TestWindowsCloudInit(c *gc.C) {
<perrito666> ah, oops
<thumper> no shit, doing an equality test with a 850 line string
<thumper> any change in any cloundinit stuff means the string has to change
<menn0> thumper: I landed them all yesterday (I unblocked the bot and got them all in before telling anyone else :-0)
<menn0> thumper: that sounds wonderful
<menn0> thumper: (that test)
<thumper> tech debt item: all cloudinit tests are awful and fragile
<thumper> ah ha...
<thumper> I think I found the culprit
<wwitzel3> ericsnow: ping
<ericsnow> wwitzel3: hey
<menn0> waigani: Ship It!
<waigani> menn0: sweet, thanks
<menn0> ericsnow: I just reviewed your Attempt PR again (Ship It if you like)
<ericsnow> menn0: thanks
<thumper> by joves I think I may have fixed all the test failures
<thumper> ...
 * thumper runs full suite again
<thumper> menn0: 31 files changed, 185 insertions(+), 87 deletions(-) to get the tests passing on requiring environ uuid
<thumper> menn0: do you have a few minutes to chat? I need to talk through an issue
<thumper> although I think I know the answer
<thumper> menn0, waigani_: beware with upgrade steps landing since 1.22 was branched, we should have 1.23 upgrade steps now
<waigani_> thumper: right, noted
<thumper> we should check any that have landed since the branch (if any)
<thumper> I was just thinking of this now as I'm about to write an upgrade step
<menn0> thumper: hi, sorry just noticed this. was deep in thought. chat now?
<thumper> 2 minutes, booking a shuttle for tomorrow
<thumper> menn0: standup hangout?
<menn0> thumper: see you there
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs:  1411024
<sinzui> thumper, wallyworld_ can you get someone to look into the windows regression reported in bug 1411024
<mup> Bug #1411024: Win client/agent cannot bug built because of backup deps <ci> <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1411024>
<davecheney> thumper: is there an agenda for our feb sprint ?
<davecheney> menn0: so what i'm hearing is "no, don't install the latest kernel update if you want wifi to work"
<perrito666> sinzui: I am the culprit, Ill fix it
#juju-dev 2015-01-15
<thumper> davecheney: not yet... mostly knowledge transfer for cherylj, although I do have one problem I'm putting off today that would be good to do
<cherylj> yessss, give me ALL THE KNOWLEDGE!
 * perrito666 wants to know new zealand and thinks thumper should adopt him into his team temporarily
<thumper> perrito666: happy to :)
<thumper> perrito666: and... Air NZ now doing direct flights to Argentina
<thumper> probably has something to do with rugby and the pumas :)
<perrito666> thumper: price in bars of gold?
<perrito666> last time I checked direct flight was somewhere in the 20k usd :p
<thumper> probably going to get cheaper :)
<perrito666> sweet
 * perrito666 packs cooking gear
<anastasiamac> perrito666: cooking gear?..
<perrito666> anastasiamac: I am sure there is something nice to cook in new zealand and I dislike cooking without my own knife :p
<perrito666> so who is OCR? http://reviews.vapour.ws/r/736/ this unblocks the critical bug
 * perrito666 installs go cross compile to avoid having to go for a windows every time
<perrito666> waigani: ?
<menn0> davecheney: actually, wifi works, it just took an unusually long time connect after suspend. possibly nothing to do with the latest kernel.
<perrito666> menn0: I have heard more people complain of things like that with latest kernel
<davecheney> thumper: i plan to monopolise mwhudson 's time and grind through as many bugs as we can on our port
<davecheney> perrito666: reviewed
<davecheney> fix it, then ship it
<perrito666> davecheney: td
<perrito666> tx
<perrito666> aghh I keep forgetting we are in 2015
<davecheney> 2015, all year
<anastasiamac> perrito666: aah... that would explain high prices on travel to nz
<anastasiamac> perrito666: if everyone form argentine brings their ***cough*** cooking knives
<anastasiamac> perrito666: u do know that Maories r peaceful pppl? not easily provoked? nor known for their aggression? .. :P
<perrito666> anastasiamac: I dont know if you noticed this but there is no call where thumper doesn't have a bruise
<thumper> perrito666: ha
<anastasiamac> perrito666: rofl
<perrito666> can we make the bot oblivious to the diff between - and _ ??
<ericsnow> perrito666: I gave you a review on that restore fix
<ericsnow> perrito666: is there a reason why you can't just add a restore_windows.go?
<perrito666> right now time is pressing
<perrito666> ericsnow: anyway, it was faster to stub Restore than to stub all of the methods that fail
<wallyworld_> anastasiamac: standup?
<perrito666> ericsnow: to be honest I am a bit curious on why do we build any of that in windows
<davecheney> thumper: i'm not having anyluck getting someone on #is to look at that RT
<davecheney> ericsnow: can you please close/delete this review http://reviews.vapour.ws/r/621/
<davecheney> it's dead
<ericsnow> davecheney: sure
<ericsnow> davecheney: the PR should be closed too
<thumper> davecheney: did as much poking as I could...
<thumper> menn0: I have a bad feeling about this
<menn0> thumper: about what?
<thumper> menn0: someone elses upgrade step
<thumper> that I don't think it does what they thought it did
<thumper> I cribbed someone else's setp
<thumper> step
<thumper> and did things slightly differently
<thumper> particularly about the assertions
<thumper> perhaps it is working by side-effect
 * thumper shrugs
<menn0> thumper: which one?
<thumper> menn0: the one in agentconfig.go
<thumper> I'm also changing agent config
<thumper> but I'm trying to see where it actually gets written to disk
<thumper> and it isn't during the step
<thumper> hmm... only have a ConfigSetter
<thumper> there must be a reason why we don't write in the upgrade steps
<menn0> thumper: not sure... looks like it's Dimiter's work
<thumper> menn0: ah ha
<thumper> menn0: the changes to the agent config are saved at the end of the upgrade steps
 * thumper fixes his tests
<menn0> thumper: of course.
<menn0> thumper: the steps are run inside a ChangeConfig
<thumper> yep...
<wallyworld_> thumper: core meeting?
<wallyworld_> menn0: also?
<wallyworld_> davecheney: ditto
<perrito666> anastasiamac: I trust you noticed the coloring in thumper 's right eye
<thumper> perrito666: you are imagining things
<thumper> not been hit for ages
<thumper> as I've had to let my shoulder heal
<perrito666> thumper: you live in a dangerous place, you have orcs for god sake :p
<perrito666> I have seen it on tv
<anastasiamac> perrito666: orcs r seasonal
<perrito666> lol
<anastasiamac> perrito666: only from Boxing Day onwards
<thumper> menn0: fyi https://github.com/juju/juju/pull/1417/files
<thumper> anyone game? http://reviews.vapour.ws/r/738/
<thumper> wow, three pages of diff
<thumper> $ git diff master | wc -l
<thumper> 1792
<thumper> sorry reviewer
<thumper> most of that is real necessary
<thumper> some of it isn't...
<thumper> but most of it is
<menn0> thumper: i'm having a look
<thumper> menn0: how did you test upgrades before when the upgrade is listed against 1.23.0 and the version is 1.23-alpha1?
<thumper> menn0: wondering because I have a running 1.22 environment and I thought I might as well test the upgrade
<thumper> but since the upgrade steps are listed against 1.23.0... I don't think they'll run
<wallyworld> axw: blake suggested in the maas doc using "/dev/disk/by-id/<foo>" as the block device identifier. I think this would work? Can you comment in the doc to confirm?
<menn0> thumper: I changed things a while back so that upgrade steps for the final version are run when you upgrade to any alpha or beta of that version
<menn0> thumper: the steps will run
<thumper> menn0: oh nice
<thumper> menn0: thanks
<menn0> thumper: it was wallyworld's suggestion. that way we can test steps while working towards a final version.
<thumper> makes sense
<menn0> thumper: Fix it and ship it
<thumper> menn0: confirmed that starting a new environment has all agents dialing the env uuid version
<thumper> menn0: I wish we had a setting that could say "please restart after you have finished upgrading"
<thumper> menn0: thanks for the timely review of this massive piece of work
<thumper> menn0: but I think it should wait until monday to land
<thumper> I still need to test upgrades and a few tweaks from the review, but running out of time today, need to pack
<mattyw> davecheney, you still around?
<davecheney> mattyw: ping
<axw> wallyworld: will do.
<wallyworld> ty
<wallyworld_> jam: did we ever get a definitive yes/no on whether Openstack is in scope for storage? It's missing completely from the Provider Specific Tactics section of the spec, but is a line item in the "in scope" section
<dimitern> jam, axw, PTAL http://reviews.vapour.ws/r/740/ - migrating goamz imports from LP to GH
 * dimitern needs to step out for ~1h
<axw> dimitern: lgtm
<dimitern> axw, thanks!
<axw> wallyworld_: can't read your pgpmail
<wallyworld_> oh, ffs, stupid thunderbird
<axw> wallyworld_: you have question, do you mean for 1:1 tomorrow?
<axw> questions*
<axw> wallyworld_: if you're around, https://github.com/juju/juju/pull/1420
<axw> bot hasn't picked it up for some reason ...
 * fwereade out for a bit
<dimitern> TheMue, standup?
<TheMue> dimitern: morning and sorry for missing the hangout
<TheMue> dimitern: third bad night in a row, had to change clothes several times
<TheMue> dimitern: at least in the morning hours I found some sleep
<TheMue> dimitern: so will now work on the couch reading all the networking information and ask you each time there's something unclear :)
<dimitern> TheMue, hey, morning!
<dimitern> TheMue, sorry to hear that :/ I hope you'd start feeling better!
<dimitern> TheMue, sure - sgtm, there's a lot to read :)
<TheMue> dimitern: I hoped so too, I seldom have something for more than one day. so I already canceled being at the Oldenburg English Club today :(
<TheMue> dimitern: yeah, I thought that's a good opportunity, have to make the best out of the situation
<mgz> perrito666: I have reopened bug 1411024 because the mac client still does not build with your change from last night
<mup> Bug #1411024: Win client/agent cannot bug built because of backup deps <ci> <regression> <windows> <juju-core:Triaged by hduran-8> <https://launchpad.net/bugs/1411024>
<dimitern> TheMue, +1
<dimitern> TheMue, I wish you a quick recovery in any case
<dimitern> mgz, ha!@
<TheMue> dimitern: thanks, I wish it too. *lol*
<dimitern> mgz, I though it might be a problem as in the fix the "_linux" suffix was used, rather than _unix
<mgz> dimitern: well, it should be an easy fix to the fix then :)
<dimitern> mgz, however I'm not sure whether we support backups on mac os?
<mgz> dimitern: I think we clearly don't need to at the moment
<mgz> there's probably no real reason it shouldn't work, but we just need the client to build for now :)
<dimitern> mgz, yeah, but do you know if it *ever* worked on mac?
<dimitern> mgz, because if it didn't the fix is a lot easier
<mgz> dimitern: I suspect the old shell script hack would have, if anyone ever tried
<mgz> it's certainly not something we signed up for supporting, or have been verifying works
<mgz> (the backup/restore test jobs only use the client under ubuntu)
<dimitern> mgz, cool, that's all the confirmation I need for now :)
<dimitern> mgz, https://github.com/juju/juju/pull/1421 - that's the fix
<dimitern> mgz, can you re-run the failing jobs after it lands manually to speed-up the unblocking?
<dimitern> can anyone review this please? ^^ axw, fwereade, voidspace, ?
<voidspace> dimitern: looking
<dimitern> voidspace, ta!
<voidspace> dimitern: LGTM
<dimitern> for some reason a RB diff wasn't generated
<fwereade> dimitern, LGTM too, I'm just a bit too slow
<dimitern> voidspace, thank you, setting to merge to unblock trunk
<mgz> dimitern: it should go straight through - the build jobs are early on as well
<dimitern> mgz, cool!
<perrito666> mgz: tx
<perrito666> dimitern: and tx to you too
<dimitern> perrito666, np, I'm opening a beer tab ;)
<perrito666> you won it :p will have to wait until apri though
<dimitern> hehe it's closer than you think ;)
<perrito666> dimitern: we can go have beer and sausages with sauerkraut
<dimitern> perrito666, great idea! we should definitely put it on the agenda
<perrito666> well, it is on mine :p
<mgz> ci is building the new rev
<dimitern> mgz, it would be very useful if on pages like this http://reports.vapour.ws/releases/2232 "Last build" links *both* to the job in jenkins and the console log output (ideally not as application/octet-stream but text/plain, so it can be viewed in the browser directly)
<mgz> dimitern: yeah, I'm working on tidying those last bits up now
<dimitern> mgz, you rock!
<dimitern> voidspace, please, take a look at the stub NetworkInterfaces - http://reviews.vapour.ws/r/742/
<fwereade> wallyworld_, don't suppose you're around? wanted to ask about when we set StatusInstalling
<wallyworld_> fwereade: give me 5
<wallyworld_> fwereade: when ModeInstalling is called and also when ModeConfigChanged is called and state is not Starting
<fwereade> wallyworld_, right, am confused about setting it at ModeConfigChanged time
<wallyworld_> maybe that's not necessary - i think it was a s/Installed/Installing thing
<fwereade> wallyworld_, ok cool
<wallyworld_> but if we do get config changed and we are not set up yet
<wallyworld_> then installing seems reasonable
<wallyworld_> maybe the check just needs to be expanded
<fwereade> wallyworld_, then we should surely still be in Installing state? ah-*ha*, we might have gone into hook error on install
<wallyworld_> yup
<fwereade> wallyworld_, I don't think it should be a config-changed-time check though, I think it's part of committing an installed state
<fwereade> wallyworld_, does that sound sane?
<wallyworld_> i think so, but am a little tired so will think properly tomorrow
<fwereade> wallyworld_, or, hmm
<fwereade> wallyworld_, no, actually, I think you're right
<wallyworld_> which bit?
<fwereade> wallyworld_, setting it at the start of config-changed
<wallyworld_> that did seem reasonable
<wallyworld_> at the time
<fwereade> wallyworld_, it was judicious and forward-thinking, I appreciate it
<wallyworld_> not sure if i put it there or adapted it or whatever
<fwereade> wallyworld_, fwiw I think you set StatusInstalling in the wrong part of ModeInstalling but I'm touching that as I go
<wallyworld_> could well do - i am a uniter novice
<wallyworld_> be gentle with me, it's my first time
<wallyworld_> doing stuff with uniter
<fwereade> wallyworld_, no worries, it's just that ModeInstalling isn't the mode func, it *returns* the mode func
<wallyworld_> yes, you are right. maybe i was thinkin that if it was at the point of setting up for install, it should be put in installing
<wallyworld_> ie transition from allocating to installing sooner
<wallyworld_> as by then clearly allocation had been done
<fwereade> wallyworld_, anyway you don't need to do anything, I just wanted to check up/rubber duck what was going on
<wallyworld_> np, ty
<fwereade> wallyworld_, I dunno, I think the moment we start looking up the charm is a fine time to set the status
<fwereade> wallyworld_, btw did we get an upgrading status?
<wallyworld_> hmmm, no
<wallyworld_> we should probs add that, after checking with john
<fwereade> wallyworld_, we can manage fine without it but I think it's cleaner
<wallyworld_> i think i agree
<fwereade> wallyworld_, cool, I will assume that'll be the approach if I bump up against it
<wallyworld_> ok
<dimitern> mgz, got a link to the win/mac build jobs?
<dimitern> mgz, ah, found them from the last curse message
<perrito666> dimitern: any luck?
<dimitern> perrito666, well - so far so good
<axw> if anyone's got time for a quick review, that'd be great... may have slipped by unnoticed since the RB bot didn't pick it up: https://github.com/juju/juju/pull/1420
<dimitern> perrito666, only the "revision results" job has to succeed
<dimitern> perrito666, but afaics the win/mac builds passed
<voidspace> dimitern: looks good
<voidspace> dimitern: still reading
<voidspace> dimitern: why the ++ changed to --
<dimitern> voidspace, great
<voidspace> dimitern: why the ++ changed to -- ?
<dimitern> voidspace, well, it seemed more appropriate - releasing an address should decrement the maxAddr as it could be useful for tests like Allocate, Release, Allocate
<voidspace> ah, ok
<voidspace> at least you haven't claimed to do live testing on this one...
<voidspace> Dummy implementation looks good/useful
<voidspace> dimitern: if the -- change made no difference to any tests, or any code anywhere, then I'm sceptical it's needed at all...
<voidspace> dimitern: other than that, LGTM
<voidspace> dimitern: well, let me look at the dummy implementation one more time before I sign off
<voidspace> lots of TODOs added as well
<dimitern> voidspace, I agree we shouldn't plan too far ahead for what tests we *might* need
<voidspace> dimitern: I dislike/distrust code written for imaginery future use cases.
<dimitern> voidspace, that s/++/--/ seems like a good idea, but I don't mind reverting it if you disagree - it's just a detail anyway
<dimitern> voidspace, +1, although I need to get better at judging this :)
<voidspace> sure
 * dimitern bbiab
<voidspace> balancing future planning with that is difficult
<voidspace> no easy answers
<voidspace> dimitern: yep, LGTM
<voidspace> dimitern: my mortgage guy is due here any minute and I'm going to make coffee
<anastasiamac> fwereade: o/
<fwereade> anastasiamac, heyhey
<anastasiamac> fwereade: how r u?
<fwereade> anastasiamac, good thanks, and you?
<anastasiamac> fwereade: m good but weary
<anastasiamac> could I run smth by u?
<fwereade> anastasiamac, ofc
<anastasiamac> fwereade: about annotations...
<fwereade> anastasiamac, go on
<anastasiamac> fwereade: to meet kapil's requirments, we need to list charms
<anastasiamac> fwereade: I've added API call to current client
<anastasiamac> fwereade: bumping it's version but m not 100% sure it's good
<anastasiamac> fwereade: not sure if 1 call warrants a separate API either
<fwereade> anastasiamac, my gut would say to have a new facade
<fwereade> anastasiamac, hot on its heels will, I suspect, be requirements that we delete charms
<fwereade> anastasiamac, etc etc
<anastasiamac> fwereade: excellent! will change tomorrow :)
<fwereade> anastasiamac, and check the current api, there's a charm-info (or something) call
<anastasiamac> fwereade: shall i move it?
<fwereade> anastasiamac, which would be happy on that facade I think
<fwereade> anastasiamac, yes please
<fwereade> anastasiamac, which is to say don't *move* it, leave it where it is too, and then we don't need a version change on client
<anastasiamac> fwereade: purrrfect :) wallyworld_ and I were wondering but u've cleared it all up ;)
<fwereade> anastasiamac, cool
<anastasiamac> fwereade: thnx for ur time and insight :)
<fwereade> anastasiamac, np
<dimitern> anastasiamac, you've got a review on the ListCharms PR
<dimitern> wtf?! ERROR juju.cmd supercommand.go:411 failed to bootstrap environment: dial tcp 54.88.153.132:22: ConnectEx tcp: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
<dimitern> mgz, is the job itself flaky? http://juju-ci.vapour.ws:8080/job/win-client-deploy/1298/console
<mgz> dimitern: sinzui has been wrangling with it
<dimitern> mgz, cool
<sinzui> dimitern, ssh on that machine sucks
<dimitern> sinzui, :/
<dimitern> well, I hope there's still a chance to unblock master later today
<mgz> dimitern: build-osx-client passed, so your fix was fine
<dimitern> voidspace, thanks for the review
 * dimitern sings the happy dance ;)
<dimitern> or dances the happy song lol
<mgz> eheheh
<dimitern> BLESS!
<dimitern> oh joy
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs:  None
<mgz> dimitern: yeay! unblocked
<perrito666> dimitern: thank youuuuu
<dimitern> perrito666, mgz, cheers guys!
<katco> wwitzel3: ericsnow: did either of you do a write-up on how you broke up your branch? i'm coming up on a large-ish change.
<ericsnow> katco: we haven't gotten there yet :)
<katco> ericsnow: doh!
<katco> i suppose there must be something out there on google when the time comes
<ericsnow> katco: but the idea I had is to remove everything I don't want in the review and commit that on the branch and then create the PR (and review request)
<ericsnow> katco: then add back in the stuff for the next patch and commit in a new branch (and use rbt --parent)
<ericsnow> katco: then repeat until it's back up to where it was before you started splitting things up
<ericsnow> katco: that way you don't lose all your git history
<katco> ericsnow: i think my general hunch was the inverse: create a new branch and begin merging in certain files
<ericsnow> katco: where to split things up is a different question
<ericsnow> katco: though keeping each review request a manageable size is one key objective
<katco> ericsnow: yeah, that's the only reason i am doing it
<ericsnow> katco: the problem with that is the loss of history
<katco> ericsnow: very true
<ericsnow> katco: right now our branch has over 100 commits in it (I think)
<katco> ericsnow: wow that's awesome
<ericsnow> katco: and most of that is valuable information
<wwitzel3> yeah, I'd like to avoid losing our commit history
<ericsnow> katco: wwitzel3 and I have taken the approach of keeping commits focused
<katco> ericsnow: i aspire to be that diligent, but i always get trapped in the "gosh can this really be an atomic commit"
<katco> ericsnow: as in, could this functionality stand alone
<ericsnow> wwitzel3: one alternative is to manually generate the patches and create the review requests manually and then merge the whole branch when *all* the requests get a ship-it
<ericsnow> wwitzel3: but that seems like more work
<katco> ericsnow: i think i need to shift my thinking on that somehow
<ericsnow> katco: sometimes you can and sometimes you can't :)
<ericsnow> katco: usually, though, you can :)
<katco> ericsnow: :)
<ericsnow> katco: try using the "-p" option to git add
<wwitzel3> yeah -p for the win
<ericsnow> katco: we've used it to great effect when we've made changes that we want to split into multiple commits
<wwitzel3> ericsnow: yeah, honestly, I think we could probably, in most cases just do file and file_test as a commit.
<katco> ericsnow: ah cool, so you're picking out bits of files?
<wwitzel3> ericsnow: not commit, review
<wwitzel3> katco: yeah, we just select the bits that make up an atomic commit
<wwitzel3> katco: I've found it much easier to do when you have all of the context, instead of trying to do with with forsight
<wwitzel3> ericsnow: anyeay, I agree it is more work, but I think we've done a good enough job of file seperate that it is probably the best approach
<katco> wwitzel3: i agree. i cannot with any reliability land slices of the stack without having first coded the entire stack. there's just too much back and forth as the design emerges.
<perrito666> does anyone know how to prevent google chrome from popping those notifications before meetings??
<dimitern> perrito666, in the calendar - edit the event and uncheck reminders
<perrito666> dimitern: What I actually want is to stop chrome from breaking the 4th wall
<dimitern> perrito666, Notifications -> [ Pop-up ] [ 10] [minutes] (x)
<dimitern> perrito666, whaa? :)
<perrito666> dimitern: I dont want to be able to pop notifications into my window manager
<perrito666> I mean chrome
<perrito666> There is a small bell that shows in the indicator and a pop up thing tells me that I have a meeting, I dont want chrome to do that , it is invasive and also clashes with my ubuntu calendar
<dimitern> perrito666, aah!
<dimitern> perrito666, those .. I knew how at some point and managed to turn them off, but I forgot how
<perrito666> it is not obvious in the ui
<perrito666> rick_h_: there is a nice lady talking in yourbehalf in the call
<voidspace> dimitern: decorator fail : http://maas.ubuntu.com/docs/api.html
<dimitern> voidspace, I can't see anything unusual :)
<voidspace> dimitern: :-)
<dimitern> voidspace, all my branches merged, finally
<voidspace> dimitern: great
<voidspace> dimitern: I got sidelined into mortgage applications - didn't make as much progress as I should have
<voidspace> dimitern: but for once I have nothing planned this evening so might do some more in front of the TV
<voidspace> dimitern: I got the signatures changed, but not much on the implementation for MAAS / EC2
<dimitern> voidspace, no worries, I wanted to unblock you before I go today
<voidspace> dimitern: easy stuff though
<voidspace> dimitern: yeah, thanks
<voidspace> ah, and the new goamz branch
<voidspace> cool
<dimitern> voidspace, yep - shiny and squeaky
<voidspace> Nice :-)
<dimitern> :)
<voidspace> dimitern: so, I already had booked a "geek dinner" in London of the Thursday that we're together in London
<dimitern> ok, I'll be off then - a long day that was
<voidspace> dimitern: you'd be welcome
<voidspace> dimitern: go have a break
<dimitern> voidspace, nice!
<voidspace> yeah, saves me a trip if I'll already be in London...
<dimitern> voidspace, awesome :)
<voidspace> I'm signing off for now
<voidspace> going jogging
<voidspace> g'night all
<dimitern> voidspace, enjoy
<rick_h_> perrito666: who da what?
<rick_h_> perrito666: nice lady?
<perrito666> rick_h_: on the call we where hearing an operator voice
<rick_h_> ah lovely
<rick_h_> what a mess
<sinzui> perrito666, do you have a few minutes to review http://reviews.vapour.ws/r/744/
<sinzui> abentley, jog: I have my lull. I will review abentley's branch first
<jog> sinzui, that's fine, abentley, said he'd look at mine
<perrito666> katco: hey, what was the magic invocation to get different output formats in juju?
<perrito666> and which version is that in?
<katco> perrito666: juju status --format
<katco> perrito666: juju help status will give you all the options
<katco> perrito666: and i believe it's in >= 1.22
<anastasiamac> didmitern: r u still around?
<perrito666> katco: tx a lot
<katco> perrito666: np, enjoy :)
<sinzui> perrito666, do you have a minute to review  http://reviews.vapour.ws/r/744/
<perrito666> sinzui: sure
<perrito666> anyone local provider savvy?
#juju-dev 2015-01-16
<perrito666> wallyworld: hey, yes I think i got it
<wallyworld> great :-) hope i made sense with my ramblings
<wallyworld> you need sleep, so we'll catch up tomorrow
<perrito666> you did, I think I stayed as close as possible :p
<wallyworld> no problems, we can see how it turns out and adjust if the idea is too insane
<perrito666> wallyworld: ow please sleep is a poor replacement for coffee
<perrito666> wallyworld: ok i have updated now its backwards compatible and consistent
<perrito666> if you go for a setstatus it will use the agent
<perrito666> and if you call explicitly you get the right ones
<wallyworld> perrito666: awesome ty, will look after my cuurent meeting and provide feedback, you need to sleep :-)
<perrito666> now yes I need to sleep
<perrito666> cheers man
<axw_> wallyworld: updated the diagram, do you want to take a look and see if it's what you had in mind?
<wallyworld> sure, ty, looking
<wallyworld> axw: looks good ty
<axw> ta
<wallyworld> axw: if you have a moment, could you please look at a small review for a 1.22 bug http://reviews.vapour.ws/r/748/
<axw> wallyworld: sure
<axw> wallyworld: reviewed
<axw> need to make lunch... bbs
<wallyworld> tyvm
<wallyworld> axw: so by the looks, Datastore in code is what's in the spec as Storage Provider, and Datastore.Kind is analogous to Storage Provider type?
<wallyworld> also, i did find in the spec --disks for add-machine
<axw> wallyworld: no, Datastore is an early name I came up with for storage instance
<axw> I may just rename it to StorageInstance ...
<wallyworld> ah right ok
<axw> Kind is block|filesystem|object
<axw> :q
<axw> oops
<wallyworld> and a StorageInstance will reference the provider type / pool from which it came
<axw> wallyworld: correct
<wallyworld> thanks axw, i need to buy some glasses
<axw> wallyworld: nps. to be fair, someone may want the string; I don't think the backend should be in the business of rendering though
<axw> its job is to tell the user about changes in the model
<axw> wallyworld: there's a followup to change the API right?
<wallyworld> yes, but the megawatcher already does some rendering i think
<wallyworld> does there need to be an api change?
<axw> wallyworld: I'm sure you'll need to change something in apiserver/params at least...
<wallyworld> let me look
<axw> wallyworld: sorry, I assumed you were splitting up the branch
<axw> I believe you need to update AllWatcher, which uses MegaWatcher
<wallyworld> i've not ever done much with megawatcher so am not sure of all the artefacts
<axw> wallyworld: re rendering, perhaps you should double check with fwereade what he wants it to do
<wallyworld> axw: looked in params, it seems the api layer just json marshals the struct directly
<axw> huh, ok
<wallyworld> but i can see a params_test that will fail
<axw> so it does
<wallyworld> i'll wait for the failure and fix the params_test
<wallyworld> i think i had it at theback of my mind that i just needed to change the machineinfo directly but couldn't remember why
<axw> wallyworld: any opinion on the name of the thing I was talking about before? StorageInstance? StorageUnit? (it's currently "Storage" in state, which has too many meanings...)
<wallyworld> um
<axw> the thing being an instantiation of a store declared in a charm
<wallyworld> maybe StorageInstance just to make it not Storage and we can search/replace later if a better name comes up
<axw> sounds good
<dimitern> o/
<axw> morning dimitern
<axw> wallyworld: just sent you an email with a UML diagram - please take a quick look when you have a moment to see if it's what you had in mind
<dimitern> hey axw
<wallyworld> axw: sure, ty
 * dimitern is amazed
<dimitern> no blockers todat yay!
<axw> :)
<axw> and here I am with nothing to land
<dimitern> just in time for the 5 branches I had in mind :)
<axw> woot
<dimitern> aww ;)
<dimitern> hmm.. after all the live testing my ec2 bill grew a bit - $0.32 now hah
<wallyworld> axw: thanks yeah, that's something we can start with and evolve over time
<axw> wallyworld: cool
<anastasiamac> dimitern: thnx for review on list charms
<anastasiamac> dimitern: m moving them into a separate client but have taken ur feedback on board :0
<dimitern> anastasiamac, cheers :)
<dimitern> anastasiamac, I'd like to have a look when you propose it please
<anastasiamac> dimitern: thnx! m hoping later on tonite - got my hadns full of kids atm
<anastasiamac> dimitern: hands that is
<dimitern> anastasiamac, np
<anastasiamac> dimitern: committed updates with new facade
<anastasiamac> dimitern: i have copied CharmInfo only in this PR
<anastasiamac> dimitern: I'd like to land it b4 copying the rest of charm stuff from old client
<anastasiamac> dimitern: like AddLocalCHarm, AddCahrm and ResolveCharm
<anastasiamac> dimitern: this way old client's dealings with charm will be "deprecated" :)
<dimitern> anastasiamac, good plan, but we shouldn't remove them from the old facade
<anastasiamac> dimitern: agreed.. Hence "copy" :)
<dimitern> anastasiamac, right :)
<anastasiamac> dimitern: thnx for double-checking! it's amazing to work with ppl that r diligent and care!
<dimitern> anastasiamac, oh c'mon :) thanks
<anastasiamac> dimitern: :D
<Muntaner> good morning, I'm quite new to Juju and having problems
<Muntaner> am I in the right place to ask for some help?
<dimitern> Muntaner, hey, welcome!
<dimitern> Muntaner, You can ask questions either here or in #juju
<Muntaner> I asked in Juju, if it's ok I can copypaste my question here
<dimitern> Muntaner, This channel is more about Juju development, while #juju is more for users, charmers, etc.
<dimitern> Muntaner, no need - I'll have a look
<Muntaner> ok, thanks :)
<TheMue> morning
<dimitern> morning TheMue
<dimitern> TheMue, feeling better today?
<TheMue> dimitern: heya
<TheMue> dimitern: in IT I would "project not yet done, but it's getting better with each iteration" ;)
<TheMue> would say
<dimitern> TheMue, glad to hear it :)
<TheMue> dimitern: yeah, me too. today I'm working from a convenient place in the living room, having my cat beside me. better than alone in the home office.
<dimitern> TheMue, :) nice
<TheMue> dimitern: yep
 * TheMue looks around
<dimitern> TheMue, voidspace, sorry guys - omw
<voidspace> dimitern: o/
<perrito666> good morning
<anastasiamac> perrito666: o/
<wwitzel3> perrito666: feeling better today?
<perrito666> wwitzel3: yes tx a lot, was a state of mind more than of body and I really didn't want to share my frustration in negative ways
<perrito666> my grand mother used to say that you should not do hangouts when angry
<wwitzel3> perrito666: yea, I can understand that
<wwitzel3> perrito666: haha
<perrito666> since she died before google rose above the search engine I might be missquoting her :p
<dimitern> TheMue, voidspace, ping
<TheMue> dimitern: pong
<dimitern> TheMue, hey - we have approval for the sprint (check your mail) - please submit the form
<TheMue> dimitern: just seen Sarahs mail :)
<dimitern> TheMue, :)
<wwitzel3> ericsnow, perrito666 ping
<perrito666> wwitzel3: pong?
<wwitzel3> oh i guess I'm a minute early
<perrito666> says my clock I still have 1min
<ericsnow> perrito666, wwitzel3: I'm going to the upstart-systemd-migration call
<TheMue> dimitern: my flights are bad for the times we're planning, cannot come early enough or have to leavy too early. I would would prefer coming on Monday during morning and leave on Saturday then. what do you say?
<dimitern> TheMue, if there are no suitable flights - yeah, but I'd still wait to hear from the travel agent to be sure
<dimitern> TheMue, she might have better options to offer
<TheMue> dimitern: sure, but my usual portal is very good, checking lots of flight. I have also to do one hop
<ericsnow> perrito666, wwitzel3: I did wrap up the tests for instance and conn_availzones
<dimitern> TheMue, which airport you usually take - hamburg or bremen?
<TheMue> dimitern: bremen, hamburg is badly to reach
<TheMue> dimitern: but maybe this time it's worth it, will take a look. from here to hamburg it's about 2:30h (with train and sub).
<voidspace> dimitern: cool, thanks
<bodie_> is there an api for hook tools or is the only way to use them to invoke them from the shell?
<bodie_> e.g. relation-set
<bodie_> I think it would be nice to be able to use them programmatically from within a hook or action
<ericsnow> we only run one jujud per instance, right?
<bodie_> I believe so
<ericsnow> my understanding is that we *used* to run one jujud per agent (or something like that)
<sinzui> ericsnow, 1 jujud machine per instance
<bodie_> you could use a local http service (is that insane?)
<bodie_> then that could be wrapped for different languages
<bodie_> or, rpc server as we do for api/apiserver
<bodie_> but that might be overcomplicated
<ericsnow> sinzui: do you mean 1 jujud *per* machine per instance?
<sinzui> ericsnow, 1 jujud as a machine agent per instance. 0-* jujud unit agents per machine.
<ericsnow> sinzui: so there may be more than one jujud running?
<sinzui> ericsnow, add-machine will give you a machine with just 1 jujud running on it. deploy will usually give 2 jujuds running as machine and unit.
<ericsnow> sinzui: ah
<sinzui> ericsnow, and with subordinates, many more juju unit agens get added
<ericsnow> sinzui: but there is only one upstart job for jujud (or one for each jujud that runs)?
<ericsnow> sinzui: so one jujud per machine agent
<sinzui> ericsnow, there is an upstate job for the machine, and for each service running
<bodie_> jujud_ip=os.Getenv("JUJUD_IP")
<bodie_> perhaps
<ericsnow> sinzui: ah, okay, that's just what I needed to know
<ericsnow> sinzui: thanks :)
<sinzui> ericsnow, this is what I see on my apache3 machine with has a lot on it
<sinzui> http://pastebin.ubuntu.com/9762266/
 * perrito666 notices he forgot to add something to his patch
<sinzui> Hi voidspace ericsnow perrito666 TheMue dimitern wwitzel3: can anyone help with these two issues that block 1.21 an 1.22? https://launchpad.net/juju-core/+milestone/1.21-rc1
<TheMue> sinzui: taking a look
<ericsnow> sinzui: perfect
<ericsnow> dist-upgrades don't happen on running instances, right? (other than as part of the initial start-instance stuff)
<ericsnow> since everything in juju is series driven :)
<sinzui> ericsnow, correct. Deploy a new service to get the new os, not upgrade an existing one
<dimitern> sinzui, i have a clue re bug 1403738
<mup> Bug #1403738: upgrade tests fail on multiple substrates with revision 24c1b80d <ci> <regression> <upgrade-juju> <juju-core:Fix Released by menno.smits> <https://launchpad.net/bugs/1403738>
<dimitern> sinzui, I think the fix for that bug wasn't backported to 1.21 and hence current blocker - bug 1411502
<mup> Bug #1411502: ERROR upgrade in progress - Juju functionality is limited <openstack> <uosci> <juju-core:Triaged> <juju-core 1.21:Triaged> <juju-core 1.22:Triaged> <https://launchpad.net/bugs/1411502>
<sinzui> ah, thank you dimitern
<dimitern> sinzui, yep - I've just checked the source of 1.21 - no backport in sight
<alexisb> dimitern, can you delegate the two tasks you highlighted in mail for the 1.21 bugs
<alexisb> to some us based folks so we can get them landed today if possible
<dimitern> alexisb, two tasks? I haven't actually had time to look at the other bug, just the upgrade one
<alexisb> well the backport tasks
<dimitern> alexisb, sure
<dimitern> ericsnow, wwitzel3, can any of you find some time to backport https://github.com/juju/juju/pull/1343 to 1.21, so we can unblock people affected by bug 1411502 please?
<mup> Bug #1411502: ERROR upgrade in progress - Juju functionality is limited <openstack> <uosci> <juju-core:Triaged> <juju-core 1.21:Triaged> <juju-core 1.22:Triaged> <https://launchpad.net/bugs/1411502>
<ericsnow> dimitern: if wwitzel3 can't, I'll do it
<voidspace> dimitern: ping
<dimitern> ericsnow, thanks! I'll assign it to you - please sync-up with wwitzel3 and reassign if needed
<voidspace> dimitern: did you think that goamz already supported the DescribeNetworkInterfaces call?
<voidspace> dimitern: it doesn't look to me like it does
<voidspace> dimitern: I can implement ec2 filtering without it
<dimitern> voidspace, it does, but the filtering by instance-id is not implemented by the ec2test server
<voidspace> dimitern: ah, I can't see it in the goamz source code
<voidspace> dimitern: just filtering by network name might be *easier* anyway
<dimitern> voidspace, well, I started doing it for ec2 (the Environ.NetworkInterfaces() implementation) and realized I can't test it without the filtering
<dimitern> voidspace, and resorted to using Instances instead and the NetworkInterfaces from there
<dimitern> voidspace, unfortunately there were too many distractions today to finish it :/
<voidspace> dimitern: ok
<voidspace> dimitern: I have a basically done implementation *without* using DescribeNetworkInterfaces
<voidspace> just filtering on the provided network IDs
<voidspace> but I have an oddly failing test - possibly due to the test server
<voidspace> just investigating
<dimitern> voidspace, cheers - i need to step out now though
<dimitern> happy weekends y'all ;)
<voidspace> dimitern: o/ have a good weekend
<perrito666> ericsnow: have a moment for a completely non work related question?
<ericsnow> perrito666: sure
<perrito666> ericsnow: priv
<voidspace> EOW
<voidspace> happy weekend everyone
<ericsnow> sinzui: does bug #1411502 actually apply to 1.22?
<mup> Bug #1411502: ERROR upgrade in progress - Juju functionality is limited <openstack> <uosci> <juju-core:Triaged> <juju-core 1.21:In Progress by ericsnowcurrently> <juju-core 1.22:Triaged> <https://launchpad.net/bugs/1411502>
<ericsnow> sinzui: from what dimitern indicated, this should be fixed there already (with 1403738)
<sinzui> ericsnow, I think the change is in 1.22 I believe 1.22 was master at the time
<sinzui> ericsnow, If you agree, I will remove the task
<ericsnow> sinzui: k
<ericsnow> sinzui: I have a patch up for review that backports the 1.22 fix
<sinzui> ericsnow, okay. I will remove the 1.22 since the change comes from 1.22
<alexisb> ericsnow, perrito666 do we have an idea on this bug:
<alexisb> https://bugs.launchpad.net/juju-core/+bug/1410876
<alexisb> ??
<mup> Bug #1410876: Error executing lxc-clone: lxc_container: utils.c: mkdir_p 220 Not a directory - Could not destroy  snapshot %s - failed to allocate a pty; Insufficent privileges to control  juju-trusty-lxc-template <lxc> <oil> <trusty> <juju-core:Triaged> <juju-core 1.21:Triaged> <juju-core
<mup> 1.22:Triaged> <https://launchpad.net/bugs/1410876>
<ericsnow> alexisb: I haven't looked at that one
<ericsnow> wwitzel3: ^^^ ?
<perrito666> alexisb: I haven't either
<alexisb> sinzui, lp 1410876 is a blocker for 1.21 right?
<perrito666> alexisb: I am not sure I even know how to reproduce it
<sinzui> alexisb, I think so since it was found in openstack testing
<alexisb> well if we need repro steps to make progress we need to make that request in the bug
<perrito666> sinzui: do you know how to reproduce this or should I ask for more info?
<sinzui> perrito666, We will need more info. We obviously never saw this in our testing
<perrito666> sinzui: ah, i assumed since you re-classified it you knew how to make this happen
<sinzui> perrito666, No the stakeholder makes it important. it was found in oil testing
<perrito666> aghh I hate whatever shortcut my client has that allows me to close this

<perrito666> ericsnow: wwitzel3 did you guys know that monday apparently is a holiday there?
<ericsnow> perrito666: yep
<perrito666> ericsnow: wwitzel3 then all we said was happening on monday will be happening on Tu, right? :p
<perrito666> I am going to be sooo alone in here on monday
<ericsnow> perrito666: yep
<wwitzel3> back, ericsnow I haven't looked at that one
<wwitzel3> perrito666: yeah, Mon = Tue ;)
#juju-dev 2015-01-18
<menn0> ericsnow: ping
<menn0> thumper: here's the correct fix to that upgrade ticket: http://reviews.vapour.ws/r/753/
<menn0> thumper: I need to forward port this to 1.22 and master as well
<thumper> ok, thanks
<thumper> will look
<thumper> menn0: shipit
<menn0> thumper: cheers
<menn0> thumper, waigani: the EnvWorkerManager PR is ready for review here: http://reviews.vapour.ws/r/754/
#juju-dev 2016-01-18
<davecheney> thumper: i remember why I lost interest in using ppc and went back to intel for testing juju with go 1.5
<davecheney> ppc is _SOOOO SLOW_
<davecheney> which is partly the immaturity of the compiler
<davecheney> and mainly because the vm's that we have are underpowered
<mwhudson> this is not a problem with the s390x port, at least :-)
<mup> Bug #1534307 changed: juju-metadata plugin tests are currently being skipped on Windows <juju-core:Invalid by wallyworld> <https://launchpad.net/bugs/1534307>
<mup> Bug #1535165 opened: Unable to create hosted environments with MAAS provider <juju-core:Triaged> <https://launchpad.net/bugs/1535165>
<mup> Bug #1534238 changed: juju debug-log fails with 1.26alpha3 and lxd <juju-core:New> <https://launchpad.net/bugs/1534238>
<davecheney> mwhudson: omg rockne is sloooow
<davecheney> did you say that s390 was fast
 * thumper is done
<thumper> laters
<menn0> fwereade: howdy. I've taken care of most of your review comments for http://reviews.vapour.ws/r/3541/ and the code is much better for it. thanks!
<menn0> fwereade: can you please answer my one query about one of your suggestions some time during your day today? (the second remaining open issue)
<frobware> voidspace, http://reviews.vapour.ws/r/3550/ - when you have a moment. I'll also reciprocate with your outstanding review.
<voidspace> frobware: mine's been done, but thanks
<voidspace> frobware: will look shortly, wrestling with merged tests at the moment :-/
<frobware> voidspace, ooops... sorry!
<frobware> voidspace, standup?
<voidspace> frobware: omw
<frobware> voidspace, http://reports.vapour.ws/releases
<blahdeblah> Hi all - hopefully quick Q: I have a subordinate charm which needs to know the public-address of every primary charm with which it and its peers are related during the update-status hook. i.e. Every subordinate needs to have a full list of the primaries.
<blahdeblah> What is the right way to achieve this?
<blahdeblah> Do I need to gather the public-address from the related primary on each subordinate (during the relation-joined hook), then send that data across the peer relation and store it for use during the update-status hook?
<blahdeblah> This feels like a really good way to get inconsistent data on each subordinate, but I couldn't think of another way to do it.
<blahdeblah> OK - I guess that wasn't a quick Q.  I'm heading off for the evening; hopefully someone can think of a suggestion better than mine.
<perrito666> good morning all
<frobware> perrito666, morning
<voidspace> frobware: heh, found the magic combination
<voidspace> frobware: of the correct versions of bundlechanges, charm.v6, charmrepo.v2 and charmstore.v5
<voidspace> frobware: for everything to build
<voidspace> let's see if tests pass
<frobware> voidspace, which version won?
<voidspace> frobware: some are maas-spaces and some are master
<voidspace> I should have just done a blame and taken the most recent of each
<voidspace> I still get test failures
<voidspace> ah, because I removed a line
<voidspace> I'll add it back
<voidspace> frobware: the trouble was that it took time to work out *which* dependencies were the problem
<voidspace> frobware: and tests pass!
<frobware> voidspace, great
<frobware> voidspace, which version of charm-store won?
<voidspace> frobware: I think charmstore is master
<voidspace> frobware: but charm is maas-spaces
<voidspace> frobware: so this is the PR, it's only 8000 lines of diff
<voidspace> frobware: https://github.com/juju/juju/pull/4139/files
<voidspace> frobware: if you can check it in the next hour or so we can land it
<voidspace> ;-)
<frobware> voidspace, at line 1...
 * frobware is now at lunch. :)
<voidspace> :-D
<frobware> voidspace, will take a look later today. still trying to find out how to get, or why we get, multiple dns-nameserver entries.
<voidspace> frobware: I'll check it first anyway
<voidspace> frobware: more eyes needed, definitely though - but I want to look through the maas changes on master
<bogdanteleaga> is there some documentation on how to use the new controller/environment setup?
<frobware> bogdanteleaga, heh, good question. I've been bouncing between maas-spaces and master today, and getting confused killing environments.
<mup> Bug #1535328 opened: TestUniterRelationErrors fails <ci> <intermittent-failure> <test-failure> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1535328>
<cherylj> bogdanteleaga: https://jujucharms.com/docs/master/wip-systems
<cherylj> but that hasn't been updated with the new terminology and commands
<cherylj> systems are now called controllers
<cherylj> and all the commands are flat, like juju kill-controller rather than juju system kill
<cherylj> evilnick: how long would it take to update that page?  https://jujucharms.com/docs/master/wip-systems
<cherylj> evilnick: I could take a pass at updating the commands and terminology tonight
<evilnick> cherylj, if you wanted to do that it would be most helpful
<evilnick> it isn't something we have been working on, or near the top of the list for juju stuff
<evilnick> (in terms of when we do it, not how important it is)
<cherylj> evilnick: yeah, I'll take a pass at updating it tonight
<bogdanteleaga> cherylj: thanks
<frobware> voidspace, we might as well pass up on the network interlock as jay may not be there (US holiday)
<frobware> voidspace, http://reviews.vapour.ws/r/3564/ - this is a forward port from 1.25 to master.
<perrito666> bbl
<voidspace> frobware: oops
<voidspace> frobware: will look in a bit
<voidspace> (and yes on the network interlock - I forgot it was on!)
<frobware> beisner, your last png looks nothing like my setup. interesting...
<beisner> hi frobware - just in for a bit, on us holiday, little one is napping tho and i just can't help but work a little.
<beisner> frobware, how about this one?  are you set to "DHCP and DNS" ?  https://launchpadlibrarian.net/234527105/maas-cluster-controller-interface-detail.png
<frobware> beisner, do you have 10-15 to HO?
<beisner> yah, sec...  lemme grab headset
<frobware> beisner, https://plus.google.com/hangouts/_/canonical.com/juju-sapphire
#juju-dev 2016-01-19
<thumper> beautiful.... http://reports.vapour.ws/releases/3520/job/run-unit-tests-xenial-amd64/attempt/353
<thumper> fatal error: concurrent map writes
<davecheney> that's awesome, that check caught something
<davecheney> i'm pretty sure that race is already tracked in a bug
<davecheney> thumper: before the break we were talking about moving gomassapi to github
<davecheney> in fact i think at the time I was going to address the races in that package
<davecheney> and that prompted the move to gb
<davecheney> and that prompted the move to gh
<davecheney> thumper i sent you an email about this on 1 dec last year
<davecheney> following up it looks like martin made the switch https://github.com/juju/gomaasapi
<davecheney> i'll send a PR
<davecheney> and get stuck into those races
<davecheney> oh, and it looks like gomaasapi is being used
<davecheney> that happened on the 5th ?
<davecheney> ok, time to fix that race
<thumper> I did notice that martin did add it before christmas
<thumper> davecheney: did you add the concurrent map write fatal?
<blahdeblah> Anyone see my Q from last night about getting list of all primaries on every instance of a subordinate?  Is here the right place to ask, or would I be better taking it to #juju?
<thumper> blahdeblah: what is the rationale for this?
<blahdeblah> I'm working on a subordinate to automatically publish unit public addresses in DNS with no additional code or user intervention.
<blahdeblah> (And automatically drop units out of DNS if they fail status checks when update-status hook is called, but that's not really important here...)
<davecheney> thumper: that was added to 1.6
<davecheney> i'm surprised it made it into the version in xenial
<davecheney> my guess is xenial is using some 1.6 beta
<thumper> blahdeblah: sorry, can't help with that, but yes, either here or #juju
<blahdeblah> Any suggestions about whether I'm on the right track or not (i.e. there isn't a way to query all primaries directly from a single subordinate unit)?
<davecheney> thumper: https://github.com/juju/gomaasapi/pull/3
<davecheney> this repo doesn't appear to have gb or a bot
<thumper> ?
<davecheney> gb ? i mean rb
<davecheney> actual change is here, https://github.com/juju/gomaasapi/pull/3/files#diff-cef8a1746ed33f0946c8dc5017058f15R599
 * thumper sees if it has a bot
<davecheney> thumper: interesting
<davecheney> there is a bot, but no reviewboard
<davecheney> can't say I mind
 * thumper shrugs
<davecheney> thumper: https://github.com/juju/juju/pull/4146
<davecheney> passes stress.bash on my machine
<thumper> done
<davecheney> i'm sort of surprised that fix worked
<davecheney> but there is only one way in to the maas api
<davecheney> so sticking a lock right there did the job
<davecheney> which is good
<davecheney> 'cos doing it properly was going to be days of work
<thumper> :)
<cherylj> Can I get a quick review?  http://reviews.vapour.ws/r/3562/
<cherylj> davecheney: did that PR fix one of the intermittent failures we're seeing on wily / xenial?
<davecheney> cherylj: i cannot say for sure
<davecheney> if the failure wasn't related to maas
<davecheney> then probably not
<cherylj> davecheney: okay.  Please don't land anything else to master until we actually get a blessed revision and cut an alpha1
<cherylj> unless it fixes one of the failing tests :)
<davecheney> cherylj: sorry, thumper approved it
<thumper> cherylj: this fix was to fix the curse
<thumper> on master
<davecheney> if it fails to merge I won't retry
<cherylj> thumper, davecheney yeah, if it's to address a current test failure, that's fine
<cherylj> thanks for doing that
<cherylj> it wasn't clear that's what it was doing
<mwhudson> oh heh 1.6~beta2 is in xenial
<mwhudson> who uploaded that?
<davecheney> ¯\_(ツ)_/¯
<mwhudson> oh xnox
<cherylj> fatal error: concurrent map writes?  oh I see gomaasapi in the stack now.
<mwhudson> which means i should update golang-race-detector-runtime too i guess
<cherylj> looks like the other master failures were joyent issues
<davecheney> joyent provider also has long standing races
<cherylj> davecheney: and we had to move to another (more crowded) region recently
<thumper> huh... who woulda thunk it
<thumper> marshalling (1<<64-1) as uint64 value becomes 0xffffffffffffffff, and unmarshalling works into uint64
<thumper> but other smaller values are 'int'
 * thumper goes to write a schema hole to push this through
<thumper> anyone? http://reviews.vapour.ws/r/3568/
<thumper> davecheney: ^^ ?
<davecheney> thumper: responded
<thumper> ta
<davecheney> you can tighten up the switch statement some
<davecheney> other than that, lgtm
<thumper> davecheney: I don't see it
<thumper> ah
<thumper> GH
<thumper> davecheney: yeah, nicer on one line
<davecheney> and you move the case that used to fall through the switch into the switch body
<davecheney> i like methods that end with a switch statement
<davecheney> you know once you get into the switch, it can only hit one of the cases
 * thumper tries to beat the bot
 * thumper wonders if there is a bot on juju/schema
<davecheney> then you don't have to remember, hmm, this switch case doesn't do anything ... oh, i have to keep reading the file
<davecheney> Lovely
<thumper> did for both Uint and Time
 * thumper manually merges
<thumper> davecheney: before I manually merge, want to add a mark to GH?
 * thumper departs
<mup> Bug #1460175 changed: apiserver_test authhttp_test SetUpTest.debugLogSuite failed <intermittent-failure> <ppc64el> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1460175>
<voidspace> frobware: you got an LGTM by the way
<frobware> voidspace, much obliged.
<frobware> dooferlad, I will pick up the VLAN bugs/issues on 1.25.
<dooferlad> frobware: thanks! I will get on with the demo page then.
<frobware> dooferlad, could you pick up the '@' for '=' first.
<dooferlad> frobware: sure.
<frobware> dooferlad, we also need to help voidspace out with the merge of master into maas-spaces.
<dooferlad> frobware: I gave him a +1. Guess we can discuss in the standup
<frobware> dooferlad, yep
<voidspace> omw
<voidspace> frobware: dooferlad: standup
<frobware> voidspace, you tried merging into master. http://reviews.vapour.ws/r/3560/
<mup> Bug #1535678 opened: The MAAS bridge script only works for debian based interfaces(5) files. <maas-provider> <network> <juju-core:New> <https://launchpad.net/bugs/1535678>
<mup> Bug #1535678 changed: The MAAS bridge script only works for debian based interfaces(5) files. <maas-provider> <network> <juju-core:New> <https://launchpad.net/bugs/1535678>
<mup> Bug #1535678 opened: The MAAS bridge script only works for debian based interfaces(5) files. <maas-provider> <network> <juju-core:New> <https://launchpad.net/bugs/1535678>
<voidspace> frobware: hah
<voidspace> frobware: did I set up the branches the wrong way round
<frobware> voidspace, I would say so
<voidspace> frobware: oops
<frobware> voidspace, close...
<voidspace> frobware: will close and recreate PR
<frobware> voidspace, dooferlad: http://reviews.vapour.ws/r/3570/ - thx
<voidspace> frobware: dooferlad: attempt two (onto the right branch!) http://reviews.vapour.ws/r/3571/
<dooferlad> voidspace: if it is the same as before, but with the right branch target, it already has a +1 from me.
<voidspace> dooferlad: it is
<voidspace> frobware: looking at yours
<dooferlad> voidspace: +1 it is then
<voidspace> dooferlad: thanks, I've hit $$merge$$ on it
<dooferlad> voidspace, frobware: https://github.com/juju/juju/pull/4147
<frobware> dooferlad, LGTM except for jam's comment.
<frobware> (or observation)
<voidspace> frobware: add-juju-bridge.py:is_active, if an option is 'bond-master' then you return false
<voidspace> frobware: that's a separate, unrelated, change to the vlan bridging change - right?
<frobware> voidspace, nope.
<voidspace> frobware: the big change is that in _bridge_vlan you no longer check active interfaces
<voidspace> frobware: ah, ok
<voidspace> frobware: what's bond-master?
<frobware> voidspace, bond raw devices now become special too
<voidspace> frobware: what's a bond raw device?
<frobware> voidspace, it is to protect these stanzas of this form: http://pastebin.ubuntu.com/14574966/
<voidspace> frobware: ok, what are they?
<voidspace> frobware: why do we not bridge them?
<frobware> voidspace, look at this example - http://pastebin.ubuntu.com/14574970/
<frobware> voidspace, we have a bond but the bond has two vlans on (top of) it.
<voidspace> frobware: if it's manual wouldn't it be safe anyway?
<frobware> voidspace, the correct bridging should be http://pastebin.ubuntu.com/14574972/
<frobware> voidspace, no because bond0 becomes bridged. bond0 => br-bond0, so in those cases we shouldn't create bridges for eth0/eth1 (which are the bond raw devices).
<frobware> voidspace, if we did bridge eth0/1 in that example, you would get br-bond0 and br-eth0 (and br-eth1).
<voidspace> frobware: ok, thanks
<frobware> voidspace, did I confuse things? does it make sense?
<frobware> voidspace, it's too long since I did the actual change. :)
<voidspace> frobware: I'm just looking at the original code to see how we now treat that differently
<voidspace> frobware: as far as I can see, eth0 and eth1 in that example are manual, so they would both have been treated as inactive before anyway - even without checking for bond-master
<frobware> voidspace, agreed.
<voidspace> frobware: that bond-master change (specifically) doesn't make any difference for the examples you showed
<voidspace> they would only change behaviour for  a bond-master that was dhcp / static
<voidspace> frobware: and that bond-master change isn't tested as far as I can see
<voidspace> frobware: other than that LGTM
<frobware> voidspace, test is in the case that uses networkDHCPWithBondInitial
<frobware> voidspace, ah, you mean if the bond0 config is static?
<voidspace> frobware: I mean a test case where is_active would have returned true before the change but returns false now
<voidspace> frobware: so yeah, a bond-master config that is either static or dhcp
<frobware> voidspace, and we're talking for the raw eth0/eth1 device?
<voidspace> frobware: well, whatever sort of stanza that code is intended to be for
<voidspace> so yes I think
<voidspace> but a realistic example in the test would make the intent of the code clearer
<frobware> voidspace, I'm now beginning to wonder if that's a case we can run into. a bond's raw devices should always be manual, therefore !active.
<voidspace> right
<voidspace> that was my half suspicion, if it's not a case we can ever hit then the code doesn't need to be there
<voidspace> with networking I don't mind it being there "just in case" though, but in that case there should be a test
<voidspace> maybe with a comment that it's not a likely case and this is defensive code
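For readers without access to the pastebins, the situation under discussion looks roughly like the following interfaces(5) fragment (illustrative, not the actual pastebin content): the bond's raw devices are `manual` stanzas carrying `bond-master`, so the bridging script should bridge bond0 and leave eth0/eth1 alone.

```
# Raw devices: manual + bond-master, so an is_active()-style check
# treats them as inactive and the script must not bridge them.
auto eth0
iface eth0 inet manual
    bond-master bond0

auto eth1
iface eth1 inet manual
    bond-master bond0

# bond0 is what gets bridged: bond0 -> br-bond0. Bridging the raw
# devices too would wrongly yield br-eth0 and br-eth1 as well.
auto bond0
iface bond0 inet static
    address 10.0.0.2
    netmask 255.255.255.0
    bond-slaves none
    bond-mode active-backup
```

As voidspace observes, since a bond's raw devices should always be `manual` (and therefore already inactive), the explicit `bond-master` check may be purely defensive.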
<frobware> voidspace, you know what would be better here... if we moved the tests to python and then looked at the coverage.
<voidspace> frobware: heh, indeed
<mup> Bug #1465307 changed: 1.24.0: Lots of "agent is lost, sorry!" messages <landscape> <regression> <juju-core:Fix Released> <https://launchpad.net/bugs/1465307>
<jamespage> mgz, blimey - so retries was intentional!
<mgz> jamespage: yeah, though I'm not sure how well publicised it was, didn't get any responses on the list
<mup> Bug #1465307 opened: 1.24.0: Lots of "agent is lost, sorry!" messages <landscape> <regression> <juju-core:Incomplete> <https://launchpad.net/bugs/1465307>
<jamespage> mgz, well it was -dev rather than straight up juju
<jamespage> mgz, I responded in the bug and the ML - we should turn this off...
<mgz> I thought I had responded at the time, but I guess I just poked bogdan about it on irc
<mup> Bug #1465307 changed: 1.24.0: Lots of "agent is lost, sorry!" messages <landscape> <regression> <juju-core:Incomplete> <https://launchpad.net/bugs/1465307>
<mup> Bug #1465307 opened: 1.24.0: Lots of "agent is lost, sorry!" messages <landscape> <regression> <juju-core:Incomplete> <https://launchpad.net/bugs/1465307>
<mup> Bug #1465307 changed: 1.24.0: Lots of "agent is lost, sorry!" messages <landscape> <regression> <juju-core:Fix Released> <https://launchpad.net/bugs/1465307>
<mup> Bug #1465307 opened: 1.24.0: Lots of "agent is lost, sorry!" messages <landscape> <regression> <juju-core:Fix Released> <https://launchpad.net/bugs/1465307>
<bogdanteleaga> jamespage: fwiw, from what I remember we had an attempt at making it opt-in, but then we decided it was a good idea to always do it
<bogdanteleaga> I was waiting for somebody to get hit by it for a while :p
<bogdanteleaga> I think the reasoning was that hooks are supposed to be idempotent anyway, so it wouldn't hurt to always do it
<jamespage> bogdanteleaga, hook idempotency is important and a good feature, but if somethings failed, it breaks out of that assumption I think
<bogdanteleaga> jamespage, I think the assumption is they're still idempotent even if they fail (as in they will fail in the same way based on some condition)
<bogdanteleaga> how does the opt-in thing sound?
<perrito666> bogdanteleaga: opt-in?
<bogdanteleaga> I was thinking about having a option in the charm to specify whether the author wants retries or not
<perrito666> bogdanteleaga: that is a problem
<perrito666> you see, the retry is because of juju, not the charm
<cherylj> bogdanteleaga: we could also put it behind a config option
<bogdanteleaga> perrito666: not sure I follow
<bogdanteleaga> cherylj, you're talking about environ config?
<cherylj> bogdanteleaga: yeah.  like juju set-env auto-hook-retry=true
<perrito666> bogdanteleaga: let's say the hook is the update-status hook, and the juju state server is for some reason not available, let's say upgrading; we don't want to have a charm in failed status just because the state server happened to be upgrading
<bogdanteleaga> cherylj, yup, but why not make the granularity at charm level?
<bogdanteleaga> perrito666, won't that happen regardless of retries?
<perrito666> bogdanteleaga: if the hook is idempotent, that would not be a problem :p
<bogdanteleaga> perrito666, aren't we working with the assumption that they are?
<katco> natefinch: happy tuesday :)
<perrito666> that looks extremely like the prelude to bad news :p
<perrito666> I hate making changes in status
<natefinch> katco: happy tuesday
<katco> natefinch: just wanted to check in and see if you needed anything for your card (or anything else)
<natefinch> katco: nope, think I'm good.  Card seems pretty straight forward.
<katco> natefinch: ok, cool! also sent you a meeting invite for this afternoon. 15:15, and then we have the meeting at 15:30
<natefinch> katco: cool, I'll go read up for that, so I'm ready. what's the format, are we all going to talk to him at the same time?
<katco> natefinch: we'll discuss @ 15:15
<natefinch> katco: ok
<voidspace> frobware: dooferlad: mega-monster-master-merge landed
<voidspace> frobware: dooferlad: do we have openstack meeting now?
<dooferlad> voidspace: will ping you if anyone turns up
<voidspace> dooferlad: ok
<voidspace> hmm... so far changes on master to provider/maas/environs.go *are* on the merged maas-spaces
<voidspace> that's encouraging
<voidspace> ok, so found the first "lost" changes
<katco> natefinch: also, ericsnow needed a review on this: http://reviews.vapour.ws/r/3551/diff/#
<natefinch> katco: thanks for bringing that to my attention, I hadn't gone to look at reviewboard this morning
<perrito666> downloading lxd images in a bar, the rest of the people using the wifi must be hating me
<perrito666> a lot
<lazypower> perrito666 chaos_wolf.png
<lazypower> perrito666 http://i.imgur.com/p2jAdlk.jpg
<perrito666> lazypower: actually I was drinking coffee and having lunch
<perrito666> but given what they charged me, I presume the bw was more than paid :p
<lazypower> perrito666 thats fair. I feel that way about panera
<mup> Bug #1535838 opened: Juju won't let me use some instance types on AWS <juju-core:New> <https://launchpad.net/bugs/1535838>
<beisner> cherylj, w/1.25.3 deb:  first bootstrap attempt failed (in a different way).  the machine's e/n/i looks good, and it is still reachable.  but `dpkg -l | grep juju` is empty.  error was:
<beisner> failed to bootstrap environment: bootstrap instance started but did not change to Deployed state: instance "/MAAS/api/1.0/nodes/node-d4692494-8228-11e4-8078-d4bed9a84493/" is started but not deployed
<beisner> i've got another bootstrap underway to see if i get the same.  logs, collected too.
<mup> Bug #1535838 changed: Juju won't let me use some instance types on AWS <juju-core:New> <https://launchpad.net/bugs/1535838>
<beisner> from the maas perspective, machine is "deployed" without error
<mup> Bug #1535838 opened: Juju won't let me use some instance types on AWS <juju-core:New> <https://launchpad.net/bugs/1535838>
<mgz> beisner: did you use  --upload-tools?
<beisner> mgz, indeed.  got a new run going now.
<cmars> natefinch, question about windows workloads.. if I juju ssh to a windows workload, would I have access to powershell commands, or just cmd.exe ?
<cmars> bogdanteleaga, was going to ask you as well ^^, though it's probably past your eod
<bogdanteleaga> I can do quick questions
<bogdanteleaga> you can't ssh to a windows machine now
<bogdanteleaga> we need winrm for that kind of functionality
<bogdanteleaga> and it's not implemented yet
<bogdanteleaga> best thing you can do is rdp into it, or winrm manually
<mgz> see bug 1426729 bug 1426730
<mup> Bug #1426729: juju-run does not work on windows hosts <juju-agent> <run> <ssh> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1426729>
<mup> Bug #1426730: juju debug-hooks does not work on windows <charm> <juju-agent> <ssh> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1426730>
<beisner> mgz, cherylj - 2nd and 3rd bootstraps a-ok.  now deploying some misc stuff to another unit and containers.  will holler back.
<cherylj> thanks, beisner!
<beisner> yw, cherylj !
<cmars> bogdanteleaga, mgz ok, thanks!
<natefinch> ericsnow: what's the point of the extra level of indirection of the serialization struct in this review? http://reviews.vapour.ws/r/3551/diff/#
<ericsnow> natefinch: consistent serialization in the various places where we need to
<natefinch> ericsnow: This feels like it's mixing concerns.  The formatted stuff is getting mixed into the api<->non-api conversions...
<ericsnow> katco, natefinch: quick review please: https://github.com/juju/utils/pull/188
<ericsnow> natefinch: how so?
<ericsnow> katco: with that utils patch + my current branch, resource-get is working! :)
<katco> ericsnow: yes!
 * katco does a happy dance
<natefinch> ericsnow: awesome
<katco> ericsnow: when i have more time, i'd love to hear what bugs you ran into
<ericsnow> katco: just read through the commit history of the branch :)
<katco> ericsnow: good call
<natefinch> ericsnow: re: the serialized stuff: the logic to format a charmresource now lives outside the formatter, which is bad.  It should have all of that self-contained so that it's obvious that a change to that code changes the outputted text.  In addition, you've split the conversion into two steps, x -> String, string -> y, whereas before it was x -> y, making it clear that the conversion from x to y needs to be maintained (and making it easier to see
<natefinch> where the data is coming from).
<natefinch> ericsnow: it feels like going too far with DRY. The conversions happen to be similar,  but they aren't required to be identical, and by merging it all into a single thing, you're tying things together too tightly.
<ericsnow> natefinch: I don't see how anything is being tied too tightly to anything else
<ericsnow> natefinch: across our implementation we have a consistent pattern of serialization
<ericsnow> natefinch: I factored that out so that it is more maintainable
<natefinch> ericsnow: it was already maintainable, because they were confined to single functions that did exactly what we needed to maintain, convert X into Y.  Now everything is in two functions, convert X into serialized, convert serialized into Y.
<natefinch> ericsnow: and we don't even use the fingerprint field from serialized for the formatter, which is a red flag
<ericsnow> natefinch: how is that a red flag?
<natefinch> ericsnow: it shows that the formats are only coincidentally the same.... and for example, that field is not the same.
<natefinch> ericsnow: and the code for IsPlaceholder is only used in the formatter, but the logic exists way far away in the serialized code
<natefinch> ericsnow: I could maybe be convinced of it being useful for api->resource/charmresource, but I really think it's inappropriate to put formatter logic anywhere but in the formatter.
<beisner> cherylj, mgz - ok, some basic throw-it-at-the-wall success:  http://pastebin.ubuntu.com/14577040/  :-)    one odd thing, eth3 (unused) is getting some unexpected definitions in e/n/i.
<beisner> though that is not impacting this particular deploy
<thumper> morning folks
<perrito666> thumper: morning
<katco> natefinch: moonstone
<cherylj> frobware, dooferlad, if you're still around, can you checkout beisner's comment above?
<cherylj> sinzui: there are a couple other patches that we want in alpha1 that weren't tested in this blessed run.  Can we run master again?
<cherylj> sinzui: it pulls in the MAAS fixes that we talked about in the standup
<sinzui> cherylj: CI saw the commits and has already started a new test of master
<mup> Bug #1535891 opened: Feature request: Custom/user definable cloud-init user-data <juju-core:New> <https://launchpad.net/bugs/1535891>
<menn0> perrito666: ping
<perrito666> menn0: pong
<menn0> perrito666: I'm looking at your XDG compatible PR
<menn0> perrito666: what do you mean by "location for juju config files is not .config/juju or XDG_CONFIG_HOME/juju"
<perrito666> menn0: lemme go check
<perrito666> menn0: there, corrected, apologies
<menn0> perrito666: it seems like the change is making it so the default juju home is indeed ~/.config/juju
<menn0> perrito666: thanks
<perrito666> menn0: it follows the xdg standard
<mup> Bug #1535891 changed: Feature request: Custom/user definable cloud-init user-data <juju-core:Opinion> <https://launchpad.net/bugs/1535891>
<menn0> perrito666: yep I get it. The "not" was confusing :)
<perrito666> it is either XDG_CONFIG_HOME/juju or ~/.config/juju, depending on whether the first is defined; it usually isn't
<perrito666> the standard dictates that the latter is the default value
<perrito666> extra context  -> http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
<menn0> perrito666: thanks. I'm up to speed with that but it's good to have the link to the spec.
<perrito666> menn0: tx for reviewing that, it is a boring long change
<frobware> cherylj, I answered in the bug. make sense?
<cherylj> frobware: thanks for that update.  It is not clear to me whether or not you'd be comfortable releasing what we have.
<cherylj> frobware: should we wait for your additional testing?  or are we good to go?
<frobware> cherylj, I can try now, but I may curtail if it goes too long.
<cherylj> frobware: so then we should wait?  :)
<frobware> cherylj, I don't think we can get into the situation I envisage as we would always have at least one NIC up. If there are no NICs up (at all!) bootstrap is highly unlikely. :)
<cherylj> frobware: heh okay.  Let me know your results when you EOD and we'll make the call to ship it or not
<beisner> frobware, cherylj - ++comment @ "that bug"
<frobware> beisner, I'm struggling to understand why (in my own head) the additional nameserver entries become a problem?
<frobware> beisner, the indentation is confusing on the dns- entries added by maas/curtin. They belong to an interface.
<beisner> frobware, they *might* not.  it's just not consistent.  (why just eth3?  what happened to eth2 and eth1 if eth3 got that treatment?)  and it's new behavior from 1.25.0, 1.25.1 and 1.25.2
<frobware> mpontillo, ping - I have some questions regarding the rendering of /e/n/i from MAAS/curtin...
<beisner> so i'm just raising a flag that this could smell of some future trouble.
<frobware> beisner, to me it appears that MAAS/curtin will always add a dns-nameserver entry to /e/n/i. The way that it does it makes it appear that they're top-level stanzas whereas they should be attached to the iface.
<frobware> beisner, they _are_ attached to the iface. In your case 'auto eth3'.
<beisner> http://pastebin.ubuntu.com/14577099/
<beisner> frobware, ^ exactly.   there is no longer a global dns-nameservers setting.  it's getting sucked into eth3's stanza.
<frobware> beisner, I would agree that the association to eth3 is new, but then again that's because we were parsing it badly.
<beisner> whereas without juju, there is a top-level definition of dns-nameservers.
<frobware> beisner, I think the problem is that there's no notion of top-level dns-* stanzas.
<frobware> beisner, if you read $(man interfaces) there's no mention of dns-*
<frobware> beisner, if you read $(man resolvconf) there is the notion that an interface can have a dns-* option.
<frobware> beisner, and, AFAICT, there are few mentions of nameservers in the ifupdown package. http://pastebin.ubuntu.com/14577985/
<mup> Bug #1535916 opened: juju upgrade-charm should recognize the force flag <adoption> <compatibility> <juju-core:New> <https://launchpad.net/bugs/1535916>
<beisner> frobware, indeed.  everything i see also says dns-nameserver belongs inside the iface stanza.   so is it a bug that maas has been placing it at the top level all along?
<frobware> beisner, I think the issue is that the indentation makes it appear that dns-* options are top-level stanzas.
<beisner> frobware, to be clear, top-level dns-nameserver does work in an e/n/i file, regardless of presence in the man pages.
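For reference, a minimal /etc/network/interfaces sketch of the two placements under discussion (addresses and names are illustrative, not taken from the bug):

```
# Top-level placement: works in practice via resolvconf, even though
# $(man interfaces) never documents dns-* options at this level.
dns-nameservers 10.0.0.2

auto eth0
iface eth0 inet static
    address 10.0.0.10/24
    gateway 10.0.0.1
    # Per-interface placement: the form $(man resolvconf) describes.
    # Indentation is cosmetic; the association comes from appearing
    # after the iface stanza, not from leading whitespace.
    dns-nameservers 10.0.0.2
    dns-search example.internal
```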
<menn0> perrito666: review done. I don't think it's quite there yet.
<beisner> frobware, oh heck.   dns-search dellstack   gets dropped from top-level, AND it doesn't pull into eth0.   instead it's in eth3, which isn't even up.  i'm reasonably confident this is not desired.
<frobware> beisner, does eth3 have an auto stanza?
<frobware> beisner, if so, you can verify its UPness with: ip -d link
<beisner> frobware, top is before juju, bottom is after juju.  http://pastebin.ubuntu.com/14577099/
<beisner> frobware, eth2 and eth3 aren't even cabled up on these boxes.
<frobware> beisner, can you cut+paste the output from that state using ip -d link
<beisner> frobware, ip -d link:  http://pastebin.ubuntu.com/14578067/
<beisner> i've got to depart, EOD, but can be back online in ~+4hrs.
<frobware> beisner, cherylj: see discussion on #maas
<cherylj> frobware: I'm following :)
<frobware> cherylj, given the differing opinions, do we want to hold fire here? It seems we need to nail this down once and for all. :(
<cherylj> beisner: is it okay if we take another day or so to figure this all out?  are you guys working with something stable now?
<cherylj> I think I missed him
<cherylj> frobware: we can pick this back up tomorrow.
<cherylj> frobware: but please do add an update to the bug with a current plan before you head off to bed
<perrito666> menn0: tx a lot
<perrito666> menn0: I agree a lot with your review, tx
<frobware> cherylj, what's interesting is that if `dns-*' is a top-level stanza this is pretty much how we were previously interpreting it.
<cherylj> frobware: so what's your thought now?  to ask maas / curtin to create juju-br0?
<frobware> cherylj, too much of a change right here, right now. but it's something to discuss for ongoing releases.
<cherylj> frobware: okay.  For 1.25.3, do you need to change what you have now?
<frobware> cherylj, no, don't believe so. but I just updated the bug. it's getting late for me to summarise what we have & do.
<mup> Bug #1336473 changed: Support new t2 instance types on AWS <ec2-provider> <juju-core:Fix Released> <https://launchpad.net/bugs/1336473>
<mup> Bug #1336473 opened: Support new t2 instance types on AWS <ec2-provider> <juju-core:Fix Released> <https://launchpad.net/bugs/1336473>
<axw> perrito666: FYI, your messages are coming up in the hangouts thingy in gmail/inbox
<perrito666> axw: was the writing in another chat a joke, or was I actually on the wrong chat?
<axw> perrito666: as opposed to the hangout video window
<axw> perrito666: I guess because you're using your tablet or phone or whatever?
<perrito666> mm that sucks
<perrito666> axw: exactly, hangouts finds new ways of sucking
<anastasiamac> perrito666: u typed messages?
<perrito666> anastasiamac: I did
<anastasiamac> perrito666: i did not see any ;(
<perrito666> apparently there is no way to access the video chat from the mobile app
<wallyworld> axw: anastasiamac: will be afk for a bit, electrician on way to my house
<anastasiamac> wallyworld: \o/
<anastasiamac> wallyworld: i've pm-ed u too
<axw> wallyworld: see you later, minus arm and leg
<anastasiamac> :D
<mpontillo> frobware: sorry, I was out for an appt - back now. what's up?
#juju-dev 2016-01-20
<cherylj> hey perrito666, did wallyworld ping you yet about opening bugs for providers with known JES issues?  (this is the destroying multiple environments with the same name issue that was brought up in Oakland)
<cherylj> mpontillo: you missed the fun :)  most of it was over in #maas
<cherylj> mpontillo: it was regarding bug 1534795, if you're interested in catching up
<mup> Bug #1534795: unit loses network connectivity during bootstrap: juju 1.25.2 + maas 1.9 <maas-provider> <uosci> <juju-core:Fix Committed by frobware> <juju-core 1.25:Fix Committed by frobware> <https://launchpad.net/bugs/1534795>
<perrito666> cherylj: nope, in which provider?
<cherylj> perrito666: all of them with known issues :)
<cherylj> perrito666: the idea is we open bugs so we can point to them as known issues for 2.0-alpha1
<mpontillo> cherylj: thanks; I read through the bug. will wait to hear frobware's findings
<menn0> thumper: for model migrations, I need to store the connection and auth details for the target controller
<menn0> thumper: will suggested I made the API that creates the document which holds this stuff take something similar to api.Info
<menn0> thumper: and I'm tempted to take a tag instead of a username, in case we end up using machine accounts at some stage
<menn0> thumper: is storing a tag in the DB a no-no?
<thumper> hmm...
 * menn0 wonders about going back to just accepting usernames
<thumper> we haven't so far, but I think that this case may be fine
<menn0> it does simplify some other aspects
<thumper> for storing credentials
<menn0> thumper: it's more flexible if any tag is allowed
<thumper> agreed
<menn0> i'll do that then
<mup> Bug #1536025 opened: LXD provider unable to setup configuration correctly <bouncing> <lxd-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1536025>
<anastasiamac> jam: imagemetadata worker patch is ready to go... waiting for master to unblock... this should fix the above lxd difficulty \o/
<jam> anastasiamac: so is that bug a duplicate then?
<jam> well, not that one, the previous one
<jam> bug #1536019
<mup> Bug #1536019: LXD provider causes imagemedata worker to bounce <bouncing> <imagemetadata> <lxd> <juju-core:Triaged> <https://launchpad.net/bugs/1536019>
<anastasiamac> jam: i think so... for some providers, this worker should not run... this is what this patch does :D
<jam> anastasiamac: well, fwiw, i'd love it if we knew about the images that were used for LXD so we didn't have to state in "environments.yaml" that you need to run lxd-images import
<jam> anastasiamac: we could do the import ourselves, and know when we need to update the base image *and* the templated ones (if we go that far)
<anastasiamac> jam: sounds reasonable \o/
<frobware> dooferlad, ping. sync today?
<dooferlad> frobware: sorry, yes, still able to talk?
<frobware> yes
<dooferlad> frobware: http://pactcoffee.com/spread/JAMES-CMOCKA@cp
<dooferlad> frobware: you know, just in case
<TheMue> hehe, the coffee freak. morning btw
<frobware> dooferlad, heh. fancy a quick HO, re dns-nameservers. :)
<dooferlad> morning TheMue!
<dooferlad> frobware: sure
<jam> anastasiamac: wallyworld: did we end up leaving deep debugging code in? I'm seeing a lot of "STORAGE FRONT-END" and "STORAGE BACK-END" code triggering.
<jam> but the content doesn't seem useful
<jam> as it is debugging a bunch of pointer objects
<wallyworld> jam: not on purpose
<anastasiamac> jam: looks like it
<jam> facadeCaller: base.facadeCaller{facadeName:"Block", bestVersion: 1, caller: *pointer, closer: *pointer)
<wallyworld> must have slipped through review
<jam> wallyworld: anastasiamac: k. prob should be fixed for at least alpha2, not sure if we've missed alpha1
<wallyworld> the hope is we got a bless this morning
<wallyworld> will make alpha2 for sure
<jam> I did see a bless
<frobware> cherylj, beisner: am in the process of writing a conclusion to #1534795. the TL;DR is I don't believe what we've done is broken.
<mup> Bug #1534795: unit loses network connectivity during bootstrap: juju 1.25.2 + maas 1.9 <maas-provider> <uosci> <juju-core:Fix Committed by frobware> <juju-core 1.25:Fix Committed by frobware> <https://launchpad.net/bugs/1534795>
<cherylj> frobware: awesome, thank you!
<alexisb> fwereade, jam, voidspace meeting
<voidspace> alexisb: omw
<TheMue> perrito666: answered your question about that drink called Ayran
<perrito666> TheMue: you drank that?
<TheMue> perrito666: yes, very refreshing. yogurt and milk with a little bit salt and then beat until frothy
<perrito666> I should try
<TheMue> perrito666: it's a national drink in Turkey
<TheMue> perrito666: they had no beer there, so I asked for something authentic
<voidspace> dooferlad: tell me more about this ethernet thing you talk of...
<dooferlad> voidspace: It is like WiFi,  but with wires.
<voidspace> dooferlad: my wifi never works either... :-)
<dooferlad> voidspace: :-)
<voidspace> not entirely true...
 * voidspace lurches to lunch
<mup> Bug #1533750 changed: 2.0-alpha1 stabilization <juju-core:Invalid> <https://launchpad.net/bugs/1533750>
<mup> Bug #1533750 opened: 2.0-alpha1 stabilization <juju-core:Invalid> <https://launchpad.net/bugs/1533750>
<cherylj> Can I  get a quick review?  http://reviews.vapour.ws/r/3578/
<frobware> cherylj, beisner: my observations are in https://bugs.launchpad.net/juju-core/+bug/1534795/comments/31
<mup> Bug #1534795: unit loses network connectivity during bootstrap: juju 1.25.2 + maas 1.9 <maas-provider> <uosci> <juju-core:Fix Committed by frobware> <juju-core 1.25:Fix Committed by frobware> <https://launchpad.net/bugs/1534795>
<mup> Bug #1536215 opened: 1.25.0: deployment times out - system is deployed successfully by maas 1.9 but juju state never transitions from pending <oil> <juju-core:New> <MAAS:New> <https://launchpad.net/bugs/1536215>
<mup> Bug #1536230 opened: backups create help text is wrong about downloading <docs> <papercut> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1536230>
<frobware> beisner, ping
<mup> Bug # changed: 1497312, 1517611, 1525280, 1527349, 1533469, 1533751, 1533792, 1533849, 1536215
<lazypower> abentley sinzui  - have we tested a 1.20 => 1.25 upgrade?
<sinzui> lazypower: yes it does work.
<lazypower> awesome, i recall 1.22 had some problem child days, so this is great news
<abentley> lazypower: Yes, we test an upgrade from 1.20 for everything: http://reports.vapour.ws/releases/3522/job/aws-upgrade-20-trusty-amd64/attempt/508
<lazypower> thanks for the confirmation sinzui
<lazypower> awesome, thanks abentley sinzui  - you just calmed a dusty old charmers fears :D
<abentley> lazypower: Though the script exits early if the target is 2.0+ because upgrades from 1.20 -> 2.0 are not supported-- you have to go through 1.25
<lazypower> abentley makes total sense. this is for a user upgrading from 1.20 => 1.25
<beisner> frobware, pong.  in a mtg atm.
<frobware> beisner, ack. ping me when you have some time; would like to discuss your concerns.
<mup> Bug # changed: 1390585, 1486553, 1527681, 1528217, 1531064, 1534353, 1534795
<mup> Bug #1536215 opened: 1.25.0: deployment times out - system is deployed successfully by maas 1.9 but juju state never transitions from pending <oil> <juju-core:New> <MAAS:Invalid> <https://launchpad.net/bugs/1536215>
<beisner> frobware, so i think the lowest common denominator in the topic is this:  bug 1536262   ...and that is why we see what we see in 1.25.3.  thoughts?
<mup> Bug #1536262: dns-* e/n/i are misplaced <uosci> <MAAS:New> <https://launchpad.net/bugs/1536262>
<beisner> hi cherylj, frobware, if we can get 1.25.3 into ppa:juju/proposed, that will unblock our openstack CI metal lab.  is that something that is planned in the next day or so?
<cherylj> beisner: yes, we're aiming for today.
<cherylj> beisner: we just wanted all the dust to settle on the questions around DNS and MAAS
<beisner> cherylj, frobware - sweet.  yeah so there is still dust but given a sudden realization that indentation in the e/n/i file is meaningless, it's old dust.
<ericsnow> natefinch: is this what you meant? http://reviews.vapour.ws/r/3579/
<natefinch> ericsnow: yep :)
<ericsnow> natefinch: thanks for pointing that out :)
<natefinch> ericsnow: no problem.  I've done the exact same thing a half a dozen times... write a whole bunch of code and then realize half of it is covered in the stdlib.
<mup> Bug #1536324 opened: juju backups restore times out trying to connect to API server <backup-restore> <ci> <juju-core:Triaged> <https://launchpad.net/bugs/1536324>
<mup> Bug #1536324 changed: juju backups restore times out trying to connect to API server <backup-restore> <ci> <juju-core:Invalid> <https://launchpad.net/bugs/1536324>
<mup> Bug # opened: 1536333, 1536336, 1536337, 1536340
<mup> Bug #1536345 opened: backups create and backups restore defaults clash <backup-restore> <ci> <papercut> <juju-core:Triaged> <https://launchpad.net/bugs/1536345>
<thumper> was the LXD provider available in 1.25 behind a feature flag?
<thumper> or was it just 1.26/2.0 ?
<thumper> katco: ^^?
<katco> thumper: just 1.26/2.0
<thumper> ta
<katco> thumper: np. specifically 1.26-alpha2
<thumper> k,
<menn0_> thumper: I very much doubt that machine-dep-engine is going to merge this week
<menn0> thumper: there are unblessed changes in master which conflict
<menn0> thumper: so I have to wait for them to get blessed and then merge and fix the conflicts and then get machine-dep-engine blessed
<menn0> thumper: there aren't enough days left in the week for that
<thumper> ugh
<menn0> thumper: I might merge machine-dep-engine right up to latest master in the hope that it'll get blessed. It might save someone less familiar with the changes in the feature branch from having to deal with the conflicts. They look non-trivial.
<menn0> (next week)
 * thumper needs moar coffee
<perrito666> OT: is any of you in the go slack channel? I filled the form to get the invite but didn't get one :(
<natefinch> perrito666: I can ping the maintainers about it
<cherylj> hey davecheney, are you still looking for some bugs to take?
<natefinch> perrito666: https://gophersinvite.herokuapp.com/
 * thumper stops procrastinating and dives into the code
<perrito666> natefinch: oh, that changed
<natefinch> perrito666: evidently
<perrito666> natefinch-afk: tx btw, I am in
<mup> Bug #1536378 opened: resolver loop error in maas-1_8-OS-deployer test <juju-core:Incomplete> <juju-core machine-dep-engine:Triaged> <https://launchpad.net/bugs/1536378>
<mup> Bug #1536378 changed: resolver loop error in maas-1_8-OS-deployer test <juju-core:Incomplete> <juju-core machine-dep-engine:Triaged> <https://launchpad.net/bugs/1536378>
<thumper> WTF!!!!
<thumper> I thought that we stopped panicking on unknown series?
<thumper> I'm using version.MustParseBinary with "3.4.5-zesty-amd64"
<thumper> and it panics!
<perrito666> mm, I thought it too
<thumper> that's pretty disappointing
<davecheney> thumper: i can work on that if you like
<davecheney> but if you use the Must form, it will panic
<davecheney> that's why the Must form exists
<davecheney> if you don't want a panic, don't call mustParseBinary
<thumper> I didn't expect ParseBinary to return an error on unknown series
<thumper> that's fine, just gone with "trusty"
<davecheney> thumper: what should it do if the series is unknown ?
<thumper> davecheney: wrt work, cherylj had a particular bug I think
<thumper> davecheney: not sure...
<davecheney> it would be unwise to return something like "2.0.5-unknown-unknown"
<thumper> I thought it would have just created a binary with a series it didn't know about
<thumper> like "zesty"
<davecheney> 'cos lots of callers expect that any version value is accurate
<thumper> and not find tools etc
<davecheney> hmm, that makes sense
<davecheney> that would fit with the notion of Parse
<thumper> however that isn't important just now
<thumper> I'm working around it
<davecheney> it just parses the form provided
<davecheney> if you want something else, ParseValidSeries ?
<davecheney> ^ guess
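The Parse/MustParse split davecheney describes is a standard Go convention. A minimal sketch with a hypothetical `Binary` type (not juju's actual version API); note this version keeps an unknown series like "zesty" rather than rejecting it, which is the behaviour thumper expected:

```go
package main

import (
	"fmt"
	"strings"
)

// Binary is a hypothetical stand-in for a parsed agent version such
// as "3.4.5-zesty-amd64"; juju's real type carries more structure.
type Binary struct {
	Number, Series, Arch string
}

// ParseBinary returns an error on malformed input, leaving the
// caller free to decide how to handle it. It does not validate
// that the series is known.
func ParseBinary(s string) (Binary, error) {
	parts := strings.SplitN(s, "-", 3)
	if len(parts) != 3 {
		return Binary{}, fmt.Errorf("invalid binary version %q", s)
	}
	return Binary{parts[0], parts[1], parts[2]}, nil
}

// MustParseBinary is for inputs known to be good at compile time
// (constants, tests); by convention the Must form panics on error.
func MustParseBinary(s string) Binary {
	b, err := ParseBinary(s)
	if err != nil {
		panic(err)
	}
	return b
}

func main() {
	b := MustParseBinary("3.4.5-zesty-amd64")
	fmt.Println(b.Series) // zesty
}
```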
<natefinch-afk> davecheney, thumper, katco: I notice I'm invited to this core leads call in 5 minutes.... is that on purpose?
<thumper> natefinch-afk: did you go to the earlier one today?
<perrito666> same question
<katco> natefinch-afk: it's for if you wish to ask questions about the sprint
<natefinch-afk> katco: oh
<thumper> alexis combined normal weekly team call with team lead call
<thumper> for debrief
<davecheney> natefinch-afk: thanks for the reminder
<katco> natefinch-afk: coming?
<natefinch-afk> katco: I can come for a bit
#juju-dev 2016-01-21
<mup> Bug #1536425 opened: Juju metadata cannot make streams with 1.x and 2.x agents <metadata> <regression> <streams> <juju-core:Triaged> <https://launchpad.net/bugs/1536425>
<mup> Bug #1536426 opened: Juju metadata cannot make streams with 1.x and 2.x agents <metadata> <regression> <streams> <juju-core:Triaged> <https://launchpad.net/bugs/1536426>
<thumper> davecheney: what is the simplest way to create a copy of []string?
<thumper> is it:  target := source[:]
<thumper> ?
<thumper> or do you go:
<thumper> target := make([]string, len(source))
<thumper> copy(target, source)
<thumper> seems that the first is good enough
<davecheney> thumper: do you want to copy the string header value, or the contents of the string
<davecheney> the former is easy, just assign
<davecheney> the latter requires a copy if you want the backing array of the source and copy to be different
<thumper> I want a change in source to not change target
<davecheney> then you need to copy the contents of the slice
<thumper> davecheney: the [:] makes a copy right?
<thumper> davecheney: on another take... guess what?
<thumper> fslock is fubared
<thumper> menn0 is annotating a bug
<natefinch-afk> thumper: [x:y] is the slicing operator, which never copies the backing array
<thumper> natefinch: http://play.golang.org/p/TJlPFkr9wD
<thumper> natefinch: hmm...
 * thumper pokes
<thumper> ok, figured it out
<natefinch> append makes you a new backing array there
<thumper> yeah, figured that bit out
<thumper> ta
<davecheney> [:] copies the struct value, which still points to the same backing array
<davecheney> var s []string
<davecheney> x := s[:]
<davecheney> is the same as
<davecheney> x := x
<thumper> oh wallyworld
<davecheney> x := s
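The aliasing davecheney describes can be demonstrated directly; `copyStrings` is a hypothetical helper showing the make+copy idiom thumper asked about:

```go
package main

import "fmt"

// copyStrings returns an independent copy of src: mutating src
// afterwards does not affect the result.
func copyStrings(src []string) []string {
	dst := make([]string, len(src))
	copy(dst, src)
	return dst
}

func main() {
	src := []string{"a", "b"}

	alias := src[:] // slicing copies the header only; same backing array
	cp := copyStrings(src)

	src[0] = "x"
	fmt.Println(alias[0]) // x  (alias sees the change)
	fmt.Println(cp[0])    // a  (independent copy does not)
}
```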
<wallyworld> yo
<thumper> wallyworld: I have a master blocker to give you
<wallyworld> oh joy
<thumper> because I'm busy with model migrations :)
<wallyworld> sigh, i'm busy too :-(
<menn0> thumper, wallyworld: https://bugs.launchpad.net/juju-core/+bug/1536378
<mup> Bug #1536378: resolver loop error in maas-1_8-OS-deployer test <juju-core:Incomplete> <juju-core machine-dep-engine:Triaged> <https://launchpad.net/bugs/1536378>
<menn0> updating the bug with new info now
<thumper> wallyworld: I know, everyone is busy
<thumper> but I've been given the "go away" sign for my door
<thumper> davecheney: surprise surprise the alive stuff in fslock doesn't work
<wallyworld> remind me to "thank" alexis
<thumper> wallyworld: she did give you the choice remember
<wallyworld> choice of wot?
<thumper> me or axw_
<axw_> ?
<wallyworld> to do the model migrations work i think he means
<thumper> axw_: it was who was going to work on model migrations
<axw_> ah, I had no idea :)
<wallyworld> we were going to do CMR
<thumper> yeah, wallyworld sheltered you
 * axw_ retreats back into cave
<wallyworld> that was before CMR go pulled
<thumper> like a good manager
 * rick_h_ hands out umbrellas to everyone
<davecheney> thumper: i think I just won some kind of bet
<davecheney> thumper: will you let me rip it out now ?
<thumper> I was happy to have it ripped out before
<thumper> who wanted it kept?
<thumper> I forget
<davecheney> thumper: you did
<axw> there was some concern about shared filesystems I think?
<davecheney> well, we put it to juju-dev and there was the usual round of non-committal 'well, maybe we need it'
<axw> we don't even need to worry about that when we do XDG properly
<davecheney> combined with the usual bike shedding on how one _could_ implement that feature
<axw> davecheney: I seem to recall a twitter poll... ;)
<davecheney> indeed
<thumper> ugh
<thumper> fark
<thumper> I'd like to see it replaced with something else that works on windows / linux / macos
<thumper> that unlocks if the process dies
<thumper> I don't care about the underlying implementation
<davecheney> I can do that 100% reliably on linux
<natefinch> I can do that 100% reliably on windows
<thumper> create some interface that we can hide behind
<davecheney> osx would have to be a tcp socket listening on localhost
<thumper> that has both lock, and lockWithTimeout
<davecheney> lock is just lockWithTimeout(MAXINT)
<wallyworld> np
<thumper> davecheney: JFDI
<mup> Bug # opened: 1536445, 1536446, 1536447, 1536448
<davecheney> kk
<davecheney> thumper: https://bugs.launchpad.net/juju-core/+bug/1465317/comments/12
<mup> Bug #1465317: Wily osx win: panic: osVersion reported an error: Could not determine series <osx> <packaging> <wily> <windows> <juju-core:Triaged> <juju-core 1.24:Won't Fix> <juju-core 1.25:Triaged by dave-cheney> <juju-release-tools:Fix Released by sinzui> <https://launchpad.net/bugs/1465317>
<natefinch> davecheney: I was looking into that code in Oakland, since I was factoring out the version stuff to live outside of juju/juju for use by the charm repo (for MinJujuVersion).  IIRC, the places where we actually care that a series is "known" or not are very limited... and usually it would be fine to fail later (e.g. when we can't find a matching tools stream).
<davecheney> natefinch: yeah, i think so too
<davecheney> i think it's reasonably safe to add an "unknown" series
<natefinch> I agree
<natefinch> as long as it's really unknown... if it's Zany Zebra and it's just not in our list, if it still comes across as "zany", then I think that's ok.  if zany gets turned into unknown, then we're basically back where we started.
<thumper> I'd rather we keep it zany
<thumper> and just not find tools etc
<natefinch> exactly
<natefinch> ...until tools appear, and then it just magically works
<natefinch> and we stop getting these bugs every time a new ubuntu or windows or osx version comes out
<davecheney> thumper: we can do that for linux
<davecheney> but for windows and OSX there is no tool to map their internal release name to a name
<davecheney> what is the series of OSX 10.20.0 ?
<davecheney> maybe we just say "10.20.0"
<davecheney> it won't match any tools
<davecheney> but you cannot bootstrap to OSX, so that's not a big deal
<menn0> wallyworld, thumper, davecheney: bug 1536378 updated
<mup> Bug #1536378: fslock staleness checks are broken <blocker> <juju-core:Triaged> <https://launchpad.net/bugs/1536378>
<menn0> now creating another bug with all the other stuff that's broken in fslock
<wallyworld> menn0: so is that a second blocker?
<menn0> wallyworld: the others are less likely to be hit, probably not worth blocking for
<wallyworld> menn0: oh, reading the comment in the maas blocker implicates fslock?
<natefinch> davecheney: we actually get a string name for windows... I guess we just thought it was too long, so we made a map to shorter names... nothing says we couldn't change that and just use the real name for the tools.. it's not like they're something a user should ever see.
<menn0> wallyworld: yep I believe the fslock bug is the reason for the MAAS CI test failure
<menn0> wallyworld: I'm amazed that not more is broken
<wallyworld> so thumper introduces a bug and flocks it off to me \0/ i owe him a beer
<wallyworld> actually i take it back, someone else introduced the bug, not thumper. damn, can't blame him now
<davecheney> oh dear
<davecheney> juju still has two different types of file lock
<davecheney> juju/utils/fslock, and juju/juju/utils/filelock
<davecheney> fslock.Lock calls lock.clean before starting to lock the lock
<davecheney> fslock.LockWithTimeout does not call lock.clean
<davecheney> one of those is wrong
<davecheney> i cannot tell which
<natefinch> and this is why man invented the coin toss
<natefinch> ....also unit tests
<davecheney>         logger.Warningf("breaking configstore lock, lock dir: %s", filepath.Join(dir, lockName))
<davecheney>         logger.Warningf("  lock holder message: %s", lock.Message())
<davecheney> multi line logging ? srsly
<natefinch> rofl
<wallyworld> menn0_: with logging to mongo, do we still need logsink.log?
<wallyworld> davecheney: are you fixing the blocker? i was going to start looking but don't want to double up
<davecheney> yup, i'm looking at the lock type
 * davecheney scrubs for surgery
<wallyworld> awesome, ty
<thumper> wallyworld: yes
<thumper> wallyworld: logsink.log is the file that we use if mongo goes awol
<wallyworld> thumper: makes sense, ty. i have a branch with rsyslog all removed
<menn0_> wallyworld: logsink.log was created as part of the logging to mongo
<thumper> \o/
 * thumper is done for the day
<wallyworld> later
<menn0_> davecheney: i'm not sure what you're thinking but I wonder if now is the time to change fslock to use OS primitives for locking (as already discussed on juju-dev) instead of the current mess
<menn0_> davecheney: all the current complexity and bugs are because it's doing it wrong
<menn0_> davecheney, wallyworld: bug 1536378 describes the other fslock issues discovered today
<mup> Bug #1536378: fslock staleness checks are broken <blocker> <juju-core:Triaged> <https://launchpad.net/bugs/1536378>
<menn0_> davecheney, wallyworld: sorry bug 1536461
<mup> Bug #1536461: fslock bugs <juju-core:New> <https://launchpad.net/bugs/1536461>
<wallyworld> awesome, more bugs \o/
<wallyworld> menn0_: i see a bunch of work was done to fslock since it was originally written - obviously not well reviewed
<mup> Bug #1536461 opened: fslock bugs <juju-core:New> <https://launchpad.net/bugs/1536461>
<davecheney> menn0_: wallyworld thumper, minimal fix https://github.com/juju/utils/pull/190
<wallyworld> awesome
<menn0_> davecheney: I think the retry loop might have been a way to paper over the race created by the RemoveAll in NewLock (see bug 1536461)
<mup> Bug #1536461: fslock bugs <juju-core:Triaged> <https://launchpad.net/bugs/1536461>
<menn0_> davecheney: as long as that race is there, the loop might be necessary
<davecheney> menn0_: there are lots and lots of problems in that type
<menn0_> davecheney: agreed
<davecheney> for example, Lock calls clean which does the isAlive/BreakLock dance
<davecheney> LockWithTimeout does not call clean
<davecheney> which looks like an oversight
<davecheney> the "if your pid matches the pid of the lock" check is also a bug
<menn0_> davecheney: yes, that's one of the problems I mention in bug 1536461 (you should look there)
<mup> Bug #1536461: fslock bugs <juju-core:Triaged> <https://launchpad.net/bugs/1536461>
<wallyworld> davecheney: i made a comment, not sure if you agree
<menn0_> davecheney: I'm not sure that the "if PID == lock.PID" is necessarily wrong
<menn0_> davecheney: this function is about deciding if the PID which owns the lock is alive
<davecheney> if two workers inside the same process are relying on this lock to stop them walking over each other
<davecheney> it is
<davecheney> we probably don't have that code path today
<davecheney> but it's just a matter of time
<menn0_> davecheney: not about whether the lock is held
<davecheney> the logic of "make lock, then lock it" is insane
<davecheney> you shouldn't be able to create a lock unless you hold it
<menn0_> davecheney: isAlive is about whether a process is alive and keeping the current lock "held"
<menn0_> davecheney: yep I agree with that
<davecheney> menn0_: I added a long rebuttal on why I think retrying is pretty pointless
<davecheney> wallyworld: http://reviews.vapour.ws/r/3585/
<davecheney> any more comments
<wallyworld> looking
<davecheney> ta
<wallyworld> davecheney: reviewed, but if we close the current blocker there should be a bug opened for the proper fix
<davecheney> wallyworld: menn0 has created a placeholder bug listing all the problems
<davecheney> wallyworld: 1536461
<wallyworld> ok, ty
<wallyworld> i was just concerned this specific bug would be closed for the quick fix and not followed up
<wallyworld> as part of a longer term fix
<davecheney> wallyworld: yup, a long term fix is to stop using pid files on disk
<davecheney> which is my plan
<wallyworld> indeed
<davecheney> wallyworld: can you please re review
<davecheney> i got scared and rolled back most of my change
<wallyworld> ok
<davecheney> now the patch just loops like the logic originally did
<davecheney> but actually loops
<davecheney> i think this is the safer change
<wallyworld> yep, +1
<wallyworld> that was my main issue with the first changes
<axw> wallyworld: https://github.com/juju/juju/pull/4162  -- refactored credentials types, added list-credentials
<wallyworld> yay
<wallyworld> axw: will look in about 15
<axw> wallyworld: next will be extending providers to return schemas, need to tidy that up. probably will be a big one
<axw> wallyworld: thanks, no great rush
<davecheney> wallyworld: CI bot is fucked up, it cannot tell time anymore
<davecheney> http://paste.ubuntu.com/14588528/
<wallyworld> sigh
<davecheney> tried twice in a row
<wallyworld> we'll have to try again i guess and let curtis know
<mup> Bug #1536477 opened: utils/debugstatus: test failure <juju-core:New> <https://launchpad.net/bugs/1536477>
<axw> wallyworld: so we're definitely removing the old azure from 2.0?
<wallyworld> axw: yep
<axw> wallyworld: if so, I'm going to do that now, because it'll make my job easier in the cloud credentials branch
<axw> k
<wallyworld> axw: we're supporting 1.25 for 2 years
<axw> wallyworld: *nods*
<axw> wallyworld: hmm, CI probably isn't ready for it though
<mup> Bug #1536477 changed: utils/debugstatus: test failure <juju-core:New> <https://launchpad.net/bugs/1536477>
<axw> I'll just put it up for review anyway, then land when CI is ready
<wallyworld> axw: yes, i am having conversations about CI for the api rename branch. i've added that to the list of topics.
<axw> wallyworld: ta
<davecheney> wallyworld: https://github.com/juju/juju/pull/4163
<davecheney> fix to juju/juju
<wallyworld> ok
<anastasiamac> the whole of juju/juju?
<wallyworld> if only
<wallyworld> lgtm
<mup> Bug #1536480 opened: Disallow deploying multiple units to the same container by default <juju-core:New> <https://launchpad.net/bugs/1536480>
<wallyworld> axw: reviewed, have to head to soccer in about 45 minutes, will look again when i get back if not before
<axw> wallyworld: ok ta
<axw> wallyworld: addressed some but not all comments, PTAL when you can
<wallyworld> sure
<wallyworld> axw: responded
<axw> wallyworld: PTAL
<wallyworld> ok
<wallyworld> axw: done, ty. now off to soccer
<axw> wallyworld: later, enjoy
<Muntaner> hello to everyone...I have a problem with a bootstrap, is this the right channel?
<Muntaner> I'm trying to bootstrap juju with two maas machines, that actually are two VMs in VMWare...  I installed the MAAS controller on another machine, and the bootstrap seems to go well, it installs the cloud OS onto the first of the two machines, but when it comes to the MongoDB replica stuff, it fails. I have the logs from this situation at hand
<TheMue> Muntaner: maybe you better ask on #juju. here it's more development of juju
<Muntaner> ok TheMue, sorry : )
<TheMue> Muntaner: yw, no need to apologise. it's only a hint.
<voidspace> jam: stdup?
<voidspace> dooferlad: frobware: small PR http://reviews.vapour.ws/r/3590/
<mup> Bug #1536587 opened: juju's MAAS bridge script is echoed to the console by cloud-init during bootstrap <bootstrap> <network> <juju-core:New> <https://launchpad.net/bugs/1536587>
<mup> Bug #1536587 changed: juju's MAAS bridge script is echoed to the console by cloud-init during bootstrap <bootstrap> <network> <juju-core:New> <https://launchpad.net/bugs/1536587>
<rogpeppe> anyone fancy a review of a new (to juju/utils) URL utility function? https://github.com/juju/utils/pull/191
<mgz> I'll bite.
<rogpeppe> mgz: ta!
<rogpeppe> mgz: if you can think of any unusual corner cases that aren't covered in the test table, i'd like to know
<mgz> hm, there's nothing actually urly about this
<rogpeppe> mgz: urls are an obtuse thing!
<mgz> doesn't do anything with schemes
<rogpeppe> mgz: URL paths
<rogpeppe> mgz: still part of URLs
<rogpeppe> mgz: the RFC is definitely pertinent here
<mgz> rogpeppe: this is basically os.path.relpath though
<rogpeppe> mgz: yeah, except it's URL-specific
<rogpeppe> mgz: os.path.relpath wouldn't work here
<mgz> I'm not seeing any case where it actually differs
<rogpeppe> mgz: os.path.relpath doesn't treat trailing slashes as significant
<mgz> I mean, that's not a reason we don't need this, if go doesn't have it
<mgz> aha, yeah, that is true
<mgz> which if these tests cover that...
<mgz> rogpeppe: does it ever make sense for base to not start with a / ?
<rogpeppe> mgz: not in the use case that we've been using it for, but it's possible
<mgz> also both have to be normalised already (as in, not contain .././)
<rogpeppe> mgz: yeah, that's true. perhaps i should normalise them with path.Clean
<mgz> asserting both are abs paths seems good for the moment
<mgz> rogpeppe: lgtm
<rogpeppe> mgz: thanks
<rogpeppe> mgz: tbh i don't know of any http muxers that allow .. or . in the middle of paths.
<rogpeppe> mgz: although... actually i think http.ServeMux returns a redirect if the path isn't clean, which kinda avoids the issue
<mgz> I think just stating the obvious in the function doc comment should be fine
<mgz> there are few cases in code where you'll actually deal with non-normalised paths and they tend to be obvious
<rogpeppe> mgz: yeah
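The helper rogpeppe and mgz are discussing (a URL-path analogue of os.path.relpath where trailing slashes are significant) can be sketched roughly like this; `relativeURLPath` is a standalone illustration of the behaviour described in the chat, not the actual juju/utils implementation, and it assumes both inputs are cleaned absolute paths, as mgz suggests asserting:

```go
package main

import (
	"fmt"
	"strings"
)

// relativeURLPath returns a relative path p such that resolving p
// against base yields target. Unlike filepath-style relpath helpers,
// it treats a trailing slash as significant: "/a/b/" names a
// directory, while "/a/b" names an entry inside "/a/". Both
// arguments are assumed to be cleaned absolute paths (no "." or
// ".." elements), per the review discussion.
func relativeURLPath(base, target string) string {
	// The last element of base only counts as a directory if base
	// ends in a slash; otherwise it is dropped, as URL resolution
	// resolves relative references against the enclosing directory.
	baseParts := strings.Split(base, "/")
	baseParts = baseParts[:len(baseParts)-1]

	targetParts := strings.Split(target, "/")

	// Strip the common prefix of directory elements.
	i := 0
	for i < len(baseParts) && i < len(targetParts) && baseParts[i] == targetParts[i] {
		i++
	}
	// Climb out of the remaining base directories, then descend
	// into the remaining target elements.
	var parts []string
	for range baseParts[i:] {
		parts = append(parts, "..")
	}
	parts = append(parts, targetParts[i:]...)
	if len(parts) == 0 {
		return "."
	}
	return strings.Join(parts, "/")
}

func main() {
	fmt.Println(relativeURLPath("/a/b/c", "/a/b/d")) // d
	fmt.Println(relativeURLPath("/a/b/", "/a/x"))    // ../x
	// Trailing slash matters: same target, different base form.
	fmt.Println(relativeURLPath("/a/b", "/a/b/c")) // b/c
}
```

This is the difference mgz eventually conceded: with `os.path.relpath` semantics the last two calls would not distinguish `/a/b` from `/a/b/`.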
<frobware> voidspace, dooferlad: can we take another pass through this - http://reviews.vapour.ws/r/3570/. I've only just got back to it...
<voidspace> frobware: looking
<voidspace> frobware: your unique option filtering, is that tested?
<frobware> voidspace, since I posted I'm going to revert that change.
<voidspace> frobware: unique_options should really be a set not a list
<voidspace> frobware: hah
<frobware> voidspace, set doesn't guarantee original order.
<voidspace> frobware: true enough
<frobware> voidspace, I could switch to ordered set.
<frobware> voidspace, but I now recall why I started making it unique and that should be a separate change.
<voidspace> frobware: ok
<frobware> voidspace, in my experiments yesterday with dns-nameserver we need to collapse duplicate entries to a single line, so this is not appropriate.
<voidspace> frobware: for small "sets" a list is fine
<voidspace> for long lists testing membership is slow
<frobware> voidspace, I want as few changes to the original /e/n/i as possible.
<frobware> voidspace, makes it clear what butchering "we" did
<voidspace> frobware: good call
<frobware> voidspace, understood the perf. call, but we're dealing with < 10 entries in general.
<frobware> voidspace, of all the things... I don't think we have a performance problem. :)
<voidspace> :-)
<voidspace> frobware: is the new diff ready for review?
<frobware> voidspace, yep. pushed.
<voidspace> frobware: LGTM
<frobware> voidspace, thx
<perrito666> axw: wait, are we dropping $JUJU_HOME?
<frobware> mgz, ping
<perrito666> wow testing is too different from the version we are using, anyone knows why?
<mgz> frobware: hey
<frobware> mgz, so we monster-merged master into maas-spaces and I was looking at when maas-spaces was last blessed, which turns out to be ~39 days ago. Is and will maas-spaces be tested in CI? Do we have to do anything to trigger this, et al? Would really like to ensure our merge into maas-spaces will eventually get back into a blessed state so we can merge into master.
<mgz> frobware: I can get it queued today
<frobware> mgz, great. thanks.
<mgz> we're still suffering from shortage of maas capacity
<sinzui> mgz: maas 1.9 is ill. Nodes are not coming up. Releasing the held nodes and restarting did not fix the issue. I need more coffee before I can fix this
<mgz> sinzui: will have a look
<beisner> sinzui, mgz, maas trusty daily image woes.  #maas @ c
<sinzui> :/
<sinzui> thank you beisner
<beisner> sinzui, yw i think  ;-)
<perrito666> jam: ?
<jam> perrito666: ?
<natefinch> it's a punctuation fight!  Go!!
<perrito666> I have a doubt (2 actually) regarding juju2 supporting xdg
<perrito666> natefinch: please, I have stuff like ¿
<perrito666> jam: are we dropping JUJU_HOME ?
<jam> I would think that JUJU_HOME supersedes any other setting
<sinzui> mgz: Can you review https://code.launchpad.net/~sinzui/juju-ci-tools/unit-test-install-deps/+merge/283450
<perrito666> jam: I thought so too, but axw made a few comments in my PR which made me doubt
<jam> link?
<perrito666> the second is regarding XDG_DATA_HOME are we honoring that?
<perrito666> jam: the second is regarding XDG_DATA_HOME are we honoring that
<jam> perrito666: I don't think we have the concept of a set of DATA files. Isn't that usually things for local installs of resources that you consume?
<jam> given .local/share
<jam> which matches /usr/share, right?
<perrito666> jam: well it is a bit of a blurry concept I think, I have seen people not be consistent on that one
<perrito666> that is why I did a straight replace
<jam> so, *today* I don't believe we consume anything from /usr/share
<perrito666> but if we are going to deem certain things as "data" there is more thought to be put into this
<jam> the closest we have is /usr/lib/juju/ where we put things like mongo
<jam> but I don't think we have any plans to read that out of a users home dir
<jam> I'd think we'd just support XDG_CONFIG_HOME for this
<perrito666> works for me
<perrito666> jam: tx for your help
<jam> perrito666: happy to
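The precedence jam describes (an explicitly set JUJU_HOME wins; otherwise fall back to XDG_CONFIG_HOME, and then to the XDG spec's ~/.config default) might look like the following sketch. The `jujuConfigDir` helper and its signature are hypothetical, not juju's actual osenv code:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// jujuConfigDir is a hypothetical sketch of the precedence discussed
// above: an explicit JUJU_HOME supersedes any other setting;
// otherwise the XDG base-directory spec applies, with
// XDG_CONFIG_HOME defaulting to ~/.config. getenv is injected so the
// lookup is testable; real code would pass os.Getenv.
func jujuConfigDir(home string, getenv func(string) string) string {
	if d := getenv("JUJU_HOME"); d != "" {
		return d
	}
	if d := getenv("XDG_CONFIG_HOME"); d != "" {
		return filepath.Join(d, "juju")
	}
	return filepath.Join(home, ".config", "juju")
}

func main() {
	env := map[string]string{"XDG_CONFIG_HOME": "/tmp/cfg"}
	getenv := func(k string) string { return env[k] }
	fmt.Println(jujuConfigDir("/home/bob", getenv)) // /tmp/cfg/juju
}
```

Note that, per jam's point, nothing here consults XDG_DATA_HOME: juju has no user-level data files in the /usr/share sense, so only the config directory is XDG-aware in this sketch.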
<natefinch> anyone have ideas on how I can store (and more importantly) update a map[string]string in mongo?
<perrito666> natefinch: you just do it?
<natefinch> perrito666: update is the tricky one.. is there an operation to just update a single entry in a map in the DB?
<perrito666> natefinch: you seem to be confusing mongo with a serious db :p
<natefinch> perrito666: heh
<natefinch> perrito666: I'm trying to avoid having to read the map out and then write it back
<perrito666> nope, you are doomed to it
<perrito666> and in the middle the incertitude of race conditions :p
<natefinch> pleh
<perrito666> jokes apart, I don't think there is a db that allows you to do a partial update of a piece of data
<perrito666> your smallest unit is a doc's field and you want to make a change smaller than that
<natefinch> perrito666: I just didn't know if there was some mongo hackery, since a map is just like an object, if I could do foo["bar"] = "baz"
<natefinch> perrito666: looks like you can do it, so like {$set: {"foo.bar": "baz"} }
<perrito666> what? :\
<perrito666> that surprises me
<natefinch> the map is really an "embedded document" in the field
<perrito666> can you do that with other types?
<perrito666> natefinch: ah, I would not expect maps to be treated as documents
<natefinch> perrito666: in mongo (and json) map == object/document
<perrito666> natefinch: I am curious about how we are serializing/de-serializing that
<natefinch> perrito666: that's what I'm trying to figure out... probably map[string]interface{}
<perrito666> that whole process depends a lot on us being nice people :p
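The dotted-path `$set` that natefinch found can be mimicked in memory to show why it touches only one entry of an embedded document (which is how a `map[string]string` is stored in mongo). `applyDottedSet` is an illustrative stand-in; the real mgo call appears only in a comment:

```go
package main

import (
	"fmt"
	"strings"
)

// applyDottedSet mimics mongo's {$set: {"foo.bar": "baz"}} semantics
// on an in-memory document: a dotted path addresses a single entry
// of an embedded document, so only that entry changes and the rest
// of the map is left untouched. No read-modify-write of the whole
// map is needed.
func applyDottedSet(doc map[string]interface{}, path string, value interface{}) {
	parts := strings.Split(path, ".")
	// Walk (or create) the embedded documents down to the parent
	// of the final key.
	for _, p := range parts[:len(parts)-1] {
		child, ok := doc[p].(map[string]interface{})
		if !ok {
			child = map[string]interface{}{}
			doc[p] = child
		}
		doc = child
	}
	doc[parts[len(parts)-1]] = value
}

func main() {
	doc := map[string]interface{}{
		"foo": map[string]interface{}{"bar": "old", "keep": "me"},
	}
	// With mgo the server-side equivalent would be roughly:
	//   err := coll.UpdateId(id, bson.M{"$set": bson.M{"foo.bar": "baz"}})
	applyDottedSet(doc, "foo.bar", "baz")
	fmt.Println(doc["foo"].(map[string]interface{})["bar"])  // baz
	fmt.Println(doc["foo"].(map[string]interface{})["keep"]) // me
}
```

Because mongo applies the `$set` server-side and atomically for a single document, this also sidesteps the read-then-write race perrito666 was joking about.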
<natefinch> ericsnow: it occurs to me, we have no unique id for a resource :/
<natefinch> ericsnow: the unique id is service/resourceName/resourceRevision ... and revision is either the revision or the timestamp, depending on the type of resource :/
<perrito666> removing all the things that say "remove in 2.0" is a feeling equivalent to peeling the film off a new screen
<frobware> cherylj, ping
<cherylj> hey frobware, what's up
<frobware> cherylj, I think I deleted the wrong meeting - or it's disappeared from my cal. Are we meeting now?
<cherylj> frobware: nope, in 30 minutes
<cherylj> frobware: but given your email, we can postpone / cancel if you're swamped
<frobware> cherylj, ah. got it. sorry for the noise. coffee is wearing off.
<katco> perrito666: must be enjoyable
<katco> perrito666: glad you have the honor
<frobware> cherylj, can we? I have stuff for 1.25 and maas-spaces pending, plus other stuff...
<cherylj> frobware: yeah, no prob.  I can cancel and send some comments / ideas in an email
<frobware> cherylj, also need to reflect on the nature and timing as I mentioned via email
<frobware> cherylj, any thoughts on when a 1.25.4 would be? I was starting to look at #1532167 again.
<mup> Bug #1532167: maas bridge script handles VLAN NICs incorrectly <addressability> <maas-provider> <network> <juju-core:Triaged> <juju-core 1.25:Triaged by frobware> <https://launchpad.net/bugs/1532167>
<cherylj> frobware: ha, that bug just came up in the cross team call.
<frobware> cherylj, not terribly surprised...
<cherylj> frobware: I'm going to say 2 - 4 weeks, to be realistic
<frobware> cherylj, ok. I'm asking because we have way better support for bond, vlans, vlans on bond in maas-spaces (in the bridge script). so trying to decide whether patching what's in 1.25 or taking wholesale changes from maas-spaces might be better.
<frobware> cherylj, can they work without a fix for 2-4 weeks?
<frobware> cherylj, tentatively sometime next week
<beisner> o/  frobware, thanks again for all your work on the ifaces fun.   much appreciated.
<frobware> beisner, np. it's great we have a better understanding of how it all plumbs together now.
<lazypower> cherylj how long until 2.0-alpha1 lands in the -dev ppa?
<lazypower> cherylj i ask as i just encountered a bug that says it was 1.25 fix-released, but it doesnt appear to be in practice: http://pad.lv/1512371
<cherylj> lazypower: should be very soon.  sinzui ^^ ?
<lazypower> so i just had the developer upgrade to 1.26-alpha3, this will be jarring if they get a shiny 2.0-alpha1 later today :P
<cherylj> lazypower: which 1.25 were you using?
<sinzui> lazypower: that bug is fixed in the proposed juju. We are waiting for confirmation juju just works before completing the release.
<cherylj> lazypower:  that bug isn't fixed in any stable 1.25.  The versions which have that fix are still in proposed
<lazypower> ah ok
<lazypower> well, here's to having immediate external testing of 2.0 when it lands
 * lazypower cheers
<cherylj> :)
<lazypower> i have a meeting with them in 30, i'll put a little bug in their ear they're going to be on 2.0 in the next day or so.
<lazypower> should we encounter critical punches, i can have them revert to the proposed ppa and that should get them back in stable w/ a maas 1.9 fix?
<cherylj> lazypower: yeah.  1.25.3 was just put into proposed, and that has a lot of maas networking fixes
<lazypower> ok, bueno. Thanks for the background info
<cherylj> lazypower: let me know if they run into issues with 2.0.  I'd love to hear feedback
<lazypower> will do
<natefinch> ericsnow: is it too evil if I just reuse SetResource and pass it an ID that already has the unit tag embedded in it?
<ericsnow> natefinch: borderline evil :)
<natefinch> ericsnow: ahh, better: I can add a possibly-empty unit field to set-resource
<natefinch> ericsnow: that way I don't have to encode the ID outside the DB code
<ericsnow> natefinch: there's enough distinction that it may be better to at least have 2 separate methods that call the same underlying (unexported) method
<natefinch> ericsnow: yeah, ok... I was trying to be lazy... too lazy in this case, you're right :)
<ericsnow> natefinch: :)
<natefinch> man I hate that Tags have a String() string and an Id() string .... which one am I supposed to use when?  Why are there two string representations?
<natefinch> ericsnow: why do we pass the id and the resource itself separately to persist.SetResource?  Wouldn't it be better to just pass the resource and let the function take the id from the name on the resource?
<ericsnow> natefinch: not sure
<natefinch> ericsnow: removing it would prevent people from doing evil things like I was planning to do ;)
<ericsnow> natefinch: re: tags, use Tag.Id() (and I'd recommend passing that ID around rather than the tag)
<natefinch> gah, I hate passing around raw strings. Then you never know what they're really supposed to be, and so you have to traverse back up the tree to figure out what string is actually being set...
<voidspace> frobware: that's sneaky, assigning that card to me...
<natefinch> if it's always the results of a UnitTag.Id().. why not just pass the UnitTag?
<natefinch> otherwise, you're just encoding way high up in the stack what the DB id format is
<frobware> voidspace, sorry, was on a mission to ensure the iteration was fully loaded...
<frobware> voidspace, want to chat about it?
<frobware> voidspace, it was really because post your monster merge I wanted to keep an eye out for any fallout
<ericsnow> natefinch: tags are a wire format type (my understanding), not suitable for use outside the API
<ericsnow> natefinch: regarding that "id" parameter, I expect it is an artifact of copying (or at least following the example of) the equivalent code from payloads
<natefinch> ericsnow: I'll think about submitting a separate patch to strip it out, but not that important right now
<ericsnow> natefinch: in that case the payload ID wasn't derived from the payload name (or any other info)
<mup> Bug #1536728 opened: Juju's MAAS bridging script needs to de-duplicate dns-* iface options <maas-provider> <network> <juju-core:New for frobware> <https://launchpad.net/bugs/1536728>
<ericsnow> natefinch: see payload/state/unit.go (Track)
<ericsnow> natefinch: yeah, that should be fine since we are just using the resource name for the ID
<natefinch> ericsnow: I generally frown on anyone way outside the DB layer deciding what the ID for the item in the DB should be.
<voidspace> frobware: fallout, as in it might start passing? :-)
<frobware> voidspace, ping me when it does. :-D
<ericsnow> natefinch: the ID (in the case of payloads) is determined by state
<voidspace> frobware: I need to get the "api shut off" done on maas-spaces, then I can take a look at it
<ericsnow> natefinch: that is not the same thing as the doc ID (which is contained strictly within the persistence layer)
<frobware> voidspace, yep fine. again, I just didn't want the CI build to pass through the cracks.
<voidspace> frobware: sure :-)
<voidspace> frobware: just finished checking the differences between master and maas-spaces for potential missing stuff
<frobware> voidspace, golden? ;)
<voidspace> frobware: a *lot* of files renamed on master, and a lot of new files on maas-spaces
<voidspace> frobware: nothing problematic though (I'm pretty sure)
<frobware> voidspace, great. thanks for the due diligence.
<voidspace> frobware: looking at default gateway on the new NetworkInterfaces implementation too
<voidspace> as that doesn't exist on master and the fix there on applies to the legacy version
<voidspace> frobware: ah, however the new one does set a GatewayAddress
<mup> Bug #1536728 changed: Juju's MAAS bridging script needs to de-duplicate dns-* iface options <maas-provider> <network> <juju-core:New for frobware> <https://launchpad.net/bugs/1536728>
<voidspace> frobware: yep the new one takes it from the interface subnet
<voidspace> so that's cool
<voidspace> done and dusted
<frobware> voidspace, great
<voidspace> frobware: as an experiment I just did another merge of master
<voidspace> frobware: there are conflicts in bridgescript_test.go
<voidspace> frobware: want me to resolve and push so we can track master
<voidspace> ?
<frobware> voidspace, so how did that happen? In your branch? or mine which merged sometime earlier?
<voidspace> frobware: I did a fresh checkout of maas-spaces and merged upstream/master into it
<frobware> voidspace, does it look like the conflicts came from this https://bugs.launchpad.net/juju-core/+bug/1534795
<mup> Bug #1534795: unit loses network connectivity during bootstrap: juju 1.25.2 + maas 1.9 <maas-provider> <uosci> <juju-core:Fix Released by frobware> <juju-core 1.25:Fix Released by frobware> <https://launchpad.net/bugs/1534795>
<voidspace> frobware: is there a linked github branch?
<voidspace> frobware: HEAD has a new "pre-up" in one of the tests
<frobware> voidspace, https://bugs.launchpad.net/juju-core/+bug/1534795/comments/12
<voidspace> frobware: cool, thanks - looking
<voidspace> frobware: yes
<voidspace> frobware: do you want to resolve it? Would only take a minute and there's already quite a bunch of changes on master we should pull in
<frobware> voidspace, swamped. and need to leave in 19 mins...
<voidspace> frobware: ok
<frobware> voidspace, can do this AM tomorrow if it helps.
<voidspace> frobware: on maas-spaces runScript takes a new isBond parameter - and I'm not sure what that should be on the new test from master
<voidspace> probably false though
<voidspace> frobware: I'll try and resolve, if I get bogged down I'll abandon
<frobware> voidspace, at first blush I would say that's the wrong way round.
<frobware> voidspace, the isBond should only be relevant on master and 1.25.
<voidspace> you might be right
<voidspace> frobware: multiple conflicts in the test scripts themselves, and my head is too tired to work them out
<voidspace> it can wait until tomorrow morning I think
<voidspace> natefinch: ping
<natefinch> voidspace: pong
<voidspace> natefinch: asking in case you know...
<voidspace> natefinch: the API is switched off during bootstrap until a certain point
<voidspace> natefinch: I need to ensure that the API remains off until a worker (that fires on bootstrap) has completed
<voidspace> natefinch: do you know where I would find the API switching on/off code?
<voidspace> or the machinery/mechanism we use to do that
<voidspace> if not I will go spelunking in cmd/jujud
<natefinch> voidspace: I don't know specifically, but I would start looking at cmd/jujud/agent/machine.go, since that's where all the workers are, and some of them are specified as API workers, which require the API to be on... hopefully you could work backward to figure out where the API gets turned on
<voidspace> natefinch: that's a good clue, thanks
<natefinch> voidspace: there's an APIWorker method that is probably what you want to look at... i.e. what starts that worker
<voidspace> natefinch: cool
<natefinch> voidspace: so maybe I do know where to look, after all ;)
<voidspace> natefinch: you usually do :-)
<voidspace> natefinch: appreciated
<arosales> hello, I recall a developer environments.yaml key for GCE to use a specific image, but I don't remember the exact key
<arosales> does that sound familiar to anyone?
<cherylj> arosales: I thought there used to be something like that for ec2, but hasn't been there for a while
<arosales> hm, I thought I saw one for azure and ice . .  .
<arosales> s/ice/azure
<natefinch> pretty lonely in this juju core team meeting all by myself
<perrito666> natefinch: I believe that last night's meeting was a replacement for this one
<arosales> is "force-image-name" still recognized by azure?
<natefinch> perrito666: ahh
<natefinch> arosales: for the legacy azure provider, yes
<natefinch> arosales: I don't know how that might relate to the new azure provider
<arosales> natefinch: thanks, do you know if a similar option is available for GCE?
<arosales> or AWS?
<natefinch> arosales: definitely not for GCE
<natefinch> arosales: not for aws either
<arosales> natefinch: ok, thanks
<natefinch> arosales: it's something we've wanted to implement for forever, it just never was a high enough priority (and none of the devs were annoyed enough to just do it themselves)
<arosales> natefinch: understood, thanks for the info
<natefinch> ericsnow: you know something should be a function not a method when..... id := Persistence{}.resourceID("spam", serviceID, "")
<ericsnow> natefinch: likely another artifact from payloads
<katco> natefinch: meeting running long, have to cancel
<natefinch> katco: ok
<natefinch> ericsnow: how's the get resource stuff going?
<ericsnow> natefinch: slow (still not feeling great)
<natefinch> ericsnow: understandable
<natefinch> ericsnow: anything I can do to help?  My code is like 10% dependent on your branch.
<ericsnow> natefinch: not a ton (mostly a lot of mechanical cleanup)
<natefinch> ericsnow: fair enough
<katco> ericsnow: going to cancel our 1:1 as well; we're both not feeling great
<ericsnow> katco: k
<katco> ericsnow: natefinch: sorry bout that
<perrito666> katco: ericsnow you can get together and drink chicken soup during the standup
<perrito666> *1:1
<katco> perrito666: i need to find a vegetarian replacement for chicken soup
<perrito666> katco: I am pretty sure that those powder based chicken soups are far from any chicken :p
<katco> lol
<perrito666> sometime ago I bought soy hot dog sausage, surprisingly good
<katco> perrito666: i have discovered that the key to being vegetarian is not trying to find meat replacements, but to just cook delicious meals that just don't call for meat
<perrito666> which makes you think how much of the flavour on the original ones is influenced by the actual ingredients
<perrito666> sure thing, I cook a lot without meat in summer and that is good
<natefinch> yeah, you can get a long way with just salt, oil, onions, garlic, and peppers of various sorts.
<natefinch> and spices of course
<menn0-afk> cherylj: I just checked a recent failure of the MAAS OS deployer CI job for master and I also see the hook execution lock being incorrectly broken there
<menn0-afk> cherylj: the following run (with the fix I think) passed
<menn0-afk> davecheney: ^
<perrito666> menn0: morning, could you re-review the xdg branch?
<menn0> perrito666: I will. I may not be able to do it for a couple of hours though. is that ok?
<menn0> perrito666: FWIW I had a quick look yesterday and it looked good
<menn0> perrito666: I'll look more closely today.
<perrito666> menn0: no hurry at all
<perrito666> menn0: I am eod
<menn0> perrito666: ok cool. I'll review it before your next working day.
<cherylj> menn0: This test run of master doesn't have davecheney's fixes in it:  http://reports.vapour.ws/releases/3529
<perrito666> menn0: tx
<cherylj> menn0: is that the one you were looking at?
<menn0> cherylj: yep that's one
<menn0> cherylj: it got lucky then :)
<cherylj> menn0: yeah :)  Are you rebasing machine-dep-engine?
<menn0> yep... about to hit merge on it
 * thumper puts his head down to work
<menn0> cherylj: http://juju-ci.vapour.ws:8080/job/github-merge-juju/6057/
<natefinch> ericsnow: ahh, hmm... seems like stub implementations should not be calling errors.Trace on the errors they return.. it obscures whether or not we're returning the correct error
<ericsnow> natefinch: use errors.Cause()
<natefinch> ericsnow: but then I can't tell if my code is incorrectly wrapping the error, or if it's just the stub doing that
<cherylj> hey perrito666, I opened bug 1536792 to reference in the 2.0-alpha1 release notes as a "known issue".  I know you committed fixes after the revision we're going to release.  Was there other work needed to address that issue in the various providers?
<mup> Bug #1536792: Some providers release wrong resources when destroying hosted models <juju-core:Fix Committed> <https://launchpad.net/bugs/1536792>
<perrito666> cherylj: I actually am tasked with opening bugs for the changes I committed so we track them
<natefinch> ericsnow: for example, my function should just pass through the error... but I can't test that it does that, because the stub is wrapping the error I give it.
<cherylj> perrito666: we can just use that one bug and describe which providers release what resources
<cherylj> perrito666: I opened it now for you since we're very very close to actually releasing alpha1 :)
<natefinch> ericsnow: if I call errors.Cause, I can't tell if I'm just unwrapping what the stub wrapped, or if I'm also unwrapping what my code (incorrectly) wrapped
<cherylj> perrito666: and I want to make sure it's included in the release notes.
<ericsnow> natefinch: then perhaps use a stub that doesn't trace errors?
<perrito666> cherylj: great, is a comment the best way to note which are the providers ?
<perrito666> we should have provider tags
<cherylj> perrito666: you can do that, or modify the original description.
<natefinch> ericsnow: just seems like a general good practice - don't wrap errors in stubs.  but yes, in this instance, I can make a different stub that doesn't wrap
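natefinch's testing problem can be shown with a toy stand-in for juju/errors' `Trace`/`Cause` (deliberately simplified, not the real package): once the stub traces, `Cause` recovers the original error, but it cannot tell whether an extra trace was added by the stub or, incorrectly, by the code under test.

```go
package main

import (
	"errors"
	"fmt"
)

// traced is a minimal model of an annotated error: it wraps a cause
// with location info, the way juju/errors' Trace does.
type traced struct {
	cause error
	where string
}

func (t *traced) Error() string { return t.where + ": " + t.cause.Error() }

// trace wraps err; cause unwraps any number of trace layers to
// recover the original error, like juju/errors' Cause.
func trace(err error, where string) error { return &traced{cause: err, where: where} }

func cause(err error) error {
	for {
		t, ok := err.(*traced)
		if !ok {
			return err
		}
		err = t.cause
	}
}

var errBoom = errors.New("boom")

// stub returns its configured error wrapped in a trace, as the stub
// under discussion does.
func stub() error { return trace(errBoom, "stub") }

// passThrough is the code under test: it is supposed to return the
// stub's error unmodified.
func passThrough() error { return stub() }

func main() {
	err := passThrough()
	// Unwrapping succeeds either way, so the test cannot detect an
	// unwanted extra wrap added by passThrough: a stub that traces
	// obscures exactly what the caller did to the error.
	fmt.Println(cause(err) == errBoom) // true
	fmt.Println(err == errBoom)        // false: somebody wrapped it, but who?
}
```

Hence ericsnow's suggestion: use a stub that returns its error untraced, so any wrapping observed in the test is attributable to the code under test.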
<perrito666> oh, look at that, I have permissions
<perrito666> there you go
<perrito666> EOD, cheers  all
<cherylj> thanks, perrito666!
<natefinch> ericsnow: https://i.imgflip.com/xr6ig.jpg
<cherylj> natefinch: lol!
<natefinch> :D
<natefinch> cherylj: perrito666 suggested I turn it into a meme to soften the blow :)
<cherylj> I like it
<natefinch> from now on I'm giving all my bad news via memes
<cherylj> natefinch is GGG:  https://imgflip.com/i/xr9f3
<mup> Bug #1536792 opened: Some providers release wrong resources when destroying hosted models <juju-core:Fix Committed> <https://launchpad.net/bugs/1536792>
<natefinch> cherylj: http://goo.gl/MMZ2eq
<cherylj> ha
<cherylj> so rcj doesn't follow memes much, so I often make a reference to one and he stares at me blankly.
<cherylj> or says something like "is that a meme thing?"
<cherylj> Sometimes it's like I'm married to a member of AARP
<natefinch> rofl
<mup> Bug #1536792 changed: Some providers release wrong resources when destroying hosted models <juju-core:Fix Committed> <https://launchpad.net/bugs/1536792>
<natefinch> my wife said "on fleek" the other day, and I had no idea what she was talking about.
<cherylj> oh that's new to me
<cherylj> I'm so out of touch!
<cherylj> imgur you have failed me!
<natefinch> my wife loves Biden memes, so it came up from this: http://napturalnicole.com/wp-content/uploads/2015/01/IMG_6731.jpg
<cherylj> haha!
<natefinch> evidently it means "on point" as in "really well done"
<cherylj> can I get a review?  http://reviews.vapour.ws/r/3599/
<cherylj> natefinch: I googled it
<natefinch> cherylj: me too :)
<natefinch> cherylj: I'm wondering if an equally valid, perhaps more correct fix for the failing test is to simply delete that line
<natefinch> cherylj: or at least, don't check the content type
<mwhudson> is anyone working on the fact that github.com/juju/juju/cmd/jujud/agent fails intermittently with go 1.5?
<cherylj> natefinch-afk: I'm not convinced that removing the check is correct.  I think we *do* want to make sure that the correct content type is generated, but should accept that the correct type can be either javascript or x-javascript
<bdx> core, dev: Are there plans for storage using openstack provider?
<cherylj> natefinch-afk: the content type is generated by the code being exercised, so I think it's a valid thing to check.  It can just vary based on centos vs. ubuntu
<cherylj> bdx, we do support storage for openstack:  https://jujucharms.com/docs/devel/storage
<cherylj> bdx: is that what you're looking for?
<bdx> cherylj: yea
<bdx> cherylj: I'm not sure you sent the correct link ...
<cherylj> bdx: maybe I misunderstood what you're looking for?
<bdx> cherylj: can you point out where the openstack provider portion is in that link .... oooh ... I found it .. "The OpenStack/Cinder provider does not currently have any configuration."
<cherylj> bdx: yeah, there's no openstack specific storage configuration yet
<bdx> cherylj: ok, should I feature request it? do you know if its on the roadmap?
<cherylj> bdx: do you know specifically what you'd like to be able to specify about storage on openstack?
<bdx> cherylj: availability_zone, type, size
<cherylj> bdx: yeah, you can open a bug to request those config options and I'll add it to our list of requests we track
<bdx> cherylj: thats awesome! thanks!
<bdx> arosales:^^
<bdx> cherylj: https://bugs.launchpad.net/juju-core/+bug/1536819
<mup> Bug #1536819: Feature Request: Storage support for openstack provider <juju-core:New> <https://launchpad.net/bugs/1536819>
<cherylj> thanks, bdx.  I'll add it to our listing here:  https://github.com/juju/juju/wiki/Feature-Requests
<mup> Bug #1536819 opened: Feature Request: Storage support for openstack provider <juju-core:New> <https://launchpad.net/bugs/1536819>
<arosales> got another xenial question :-)
<arosales> I feel like it should be possible to bootstrap a xenial model and services if I set default-series to xenial, image-stream to daily and agent-stream to devel
<arosales> but I get unable to find matching tools :-/
<arosales> http://paste.ubuntu.com/14593295/
<arosales> against aws
<arosales> is it because my desktop has to be Xenial as well?
<arosales> bdx: ya the juju-core folks rock.  thanks cherylj
<arosales> bdx: seems very reasonable as we tackle storage in openstack, which gets a lot of usage :-)
<davecheney> menno-afk: sweet
<arosales> --upload-tools also ends the same way:   http://paste.ubuntu.com/14593344/
<arosales> it seems like it should work, but wanted to get folks' thoughts here on it.
<wallyworld> axw: perrito666: standup?
<mup> Bug #1536838 opened: state/lease: package in wrong location <juju-core:New> <https://launchpad.net/bugs/1536838>
#juju-dev 2016-01-22
<thumper> wallyworld: I can chat earlier if you can
<wallyworld> thumper: sure,now?
<thumper> yeah
<davecheney> func (b blocks) add(block block)
<davecheney> hmmm
<davecheney> thumper: how do I set debugging level to trace on an environment
<davecheney> (you can tell it's been a long time since I did this)
<thumper> davecheney: a running environment?
<davecheney> i haven't deployed yet
<davecheney> actually, don't worry
<davecheney> i know the line I want to see
<thumper> juju set-env logging-config=juju=trace
<davecheney> i'll just make it an error
<thumper> you can set an environment variable
<davecheney> then I don't have to deal with trillions of lines of trace
<thumper> export JUJU_LOGGING_CONFIG=juju=debug,juju.something.specific=trace
<natefinch-afk> logging-config: "<root>=DEBUG;juju.worker.leadership=WARNING;juju.trace=WARNING;juju.worker.peergrouper=INFO;juju.worker=ERROR;juju.apiserver=TRACE"
<natefinch-afk> (in environments.yaml)
<thumper> that is another way
<davecheney> i'll just hack the error to report at ERROR level
<thumper> :)
<natefinch> also works :)
<davecheney> i don't want to have to deploy twice if I screw it up
<davecheney> all I want to see is the specific line failing, then I can check my fix
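The logging-config strings above are semicolon-separated `module=level` pairs. A minimal sketch of how such a string decomposes (hypothetical helper for illustration; the real parsing lives in juju's loggo dependency):

```go
package main

import (
	"fmt"
	"strings"
)

// parseLoggingConfig is a hypothetical stand-in showing how a string like
// "juju=DEBUG;juju.worker=TRACE" breaks down into per-module log levels.
func parseLoggingConfig(spec string) map[string]string {
	levels := make(map[string]string)
	for _, part := range strings.Split(spec, ";") {
		// Each part is module=LEVEL; malformed parts are skipped.
		if kv := strings.SplitN(part, "=", 2); len(kv) == 2 {
			levels[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1])
		}
	}
	return levels
}

func main() {
	cfg := parseLoggingConfig("juju=DEBUG;juju.worker.peergrouper=TRACE")
	fmt.Println(cfg["juju"], cfg["juju.worker.peergrouper"]) // DEBUG TRACE
}
```

The more specific module name wins in loggo's hierarchy, which is why `juju=debug,juju.something.specific=trace` keeps the noise down while tracing one package.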
<mup> Bug #1536885 opened: github.com/juju/juju/cmd/jujud/agent fails frequently with go 1.5 <juju-core:New> <https://launchpad.net/bugs/1536885>
<davecheney> lucky(~/devel/swift-test) % juju bootstrap -v
<davecheney> Bootstrapping environment "ap-southeast-2"
<davecheney> Bootstrap failed, destroying environment
<davecheney> ERROR failed to bootstrap environment: cannot read product data, invalid URL "https://juju-dist.s3.amazonaws.com/tools/streams/v1/com.ubuntu.juju:devel:tools.sjson" not found
<menn0> thumper: if you have a chance could you take a look at http://reviews.vapour.ws/r/3541
<menn0> thumper: it's quite different from when you last looked at it.
<menn0> thumper: one more issue from Will to address and then it's ready (pending any further review feedback)
<thumper> k
<cherylj> hey natefinch, regarding your review, I still think it makes sense to test for content type since that's something generated in the functionality we're testing.  It just happens it can be one of two things, depending on where the tests are running
<lazypower> davecheney I just ran into that myself
<lazypower> davecheney however, it doesn't appear ot be holding anything up. I still have services coming online... but i did bootstrap with --upload-tools
<natefinch> cherylj: yeah, sorry, saw your comment earlier
<natefinch> cherylj: do you know why we return two different content types?
<davecheney> lazypower: same
<davecheney> http://paste.ubuntu.com/14595278/
<davecheney> ^ uhhh
<cherylj> natefinch: yeah, I had asked sinzui about it.  something about the package used by centos handling javascript differently.  Sorry it's late and now I don't completely remember what he said
<cherylj> but it seemed satisfactory at the time :D
<mup> Bug #1536819 changed: Feature Request: Storage support for openstack provider <juju-core:Invalid> <https://launchpad.net/bugs/1536819>
<mup> Bug #1536819 opened: Feature Request: Storage support for openstack provider <juju-core:Invalid> <https://launchpad.net/bugs/1536819>
<davecheney> where does the leadership manager worker run ?
<davecheney> is it on machine 0 ?
<davecheney> or is it on every machine that is a component of a leadership charm
<thumper> davecheney: probably managed by the "master" state server machine
<thumper> which will start with machine 0
<davecheney> ok
 * thumper has had enough of today
<davecheney> i have a fix for the bug
<thumper> awesome
<davecheney> and it appears to work
<thumper> time to start drinking
<davecheney> bootstrapping now
<thumper> hazaah
<thumper> good luck
<davecheney> but I'm having trouble writing up the description as how the lease stuff works is not entirely clear in my head
<davecheney> but if the lease manager runs on machine 0
<davecheney> not the actual machine holding the lease
<davecheney> that makes sense
<axw> wallyworld: how does changing the name to ModelTag stop things from compiling?
<wallyworld> # github.com/juju/names
<wallyworld> ./model.go:27: ModelTag is not a type
<wallyworld> nfi
<axw> wallyworld: that's because you've got a variable called ModelTag
<wallyworld> oh ffs
<wallyworld> sigh, ignore me
<wallyworld> too many concurrent things
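The compile error above ("ModelTag is not a type") comes from a package-level variable shadowing the type of the same name. A minimal illustration with hypothetical names (not the actual juju/names source):

```go
package main

import "fmt"

// ModelTag is a type.
type ModelTag struct{ id string }

// If we also declared a package-level variable with the same name:
//
//     var ModelTag = ModelTag{id: "default"} // error: ModelTag is not a type
//
// the variable would shadow the type for every later use in the package.
// A differently-named variable avoids the clash:
var defaultModelTag = ModelTag{id: "default"}

func main() {
	fmt.Println(defaultModelTag.id) // default
}
```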
<axw> wallyworld: just going to have something to eat, will you be available for 1:1 soonish?
<wallyworld> sure
<axw> ok, bbs
<davecheney> soooo, leadership doesn't use a watcher ...
<davecheney> http://reviews.vapour.ws/r/3605/
<davecheney> for review, still testing
<mup> Bug #1535916 changed: juju upgrade-charm should recognize the force flag <adoption> <compatibility> <juju-core:Invalid> <https://launchpad.net/bugs/1535916>
<axw> wallyworld: ready now?
<wallyworld> axw: ok, give me 5
<axw> wallyworld: sure, ping when you're there
<davecheney> http://paste.ubuntu.com/14595610/
<davecheney> well shit
<davecheney> and now AWS has locked me out for rate limit abuse
<wallyworld> axw: there now
<frobware> voidspace, fyi maas-spaces is cursed - http://reports.vapour.ws/releases/3530
<frobware> voidspace, btw. I didn't look at the detail, just noted the failure message in my inbox
<voidspace> frobware: ok
<voidspace> frobware: looking at a master merge last night, merging the test code itself is straightforward - but the test *scripts* are not so trivial (at least to me)
<voidspace> frobware: you'd probably find it a lot easier
<frobware> voidspace, I'm in the middle of the 3 bugs right now....
<voidspace> frobware: ok, the longer we leave it the worse it gets though
<frobware> voidspace, can we HO and look at the detail?
<voidspace> frobware: let's do it after standup, I still need coffee
<voidspace> maas-spaces hasn't been blessed since December 9th!
<frobware> voidspace, I don't want to leave it. equally, I don't want to stash what I'm currently doing.
<voidspace> not as bad as it could be...
<frobware> voidspace, yep!
<voidspace> centos & windows unit tests and go-1.5 on trusty failures
<voidspace> looks like the only *voting* failure is the windows one
<voidspace> metadata failures, not code we've touched!? (I don't think)
<voidspace> I'll look at the other curses and see if they're consistent
<voidspace> also in the middle of the api stuff
<voidspace> frobware: most of the curses prior to the *most* recent are "stale version"
<voidspace> I'll have to ask mgz what that means
<voidspace> frobware: those windows tests passed on master though (even though master is also cursed)
<frobware> voidspace, right. and this is why I recently said somewhere a sample size of BLESSED runs should be > 1. This failure may be transient.
<voidspace> frobware: right
<voidspace> frobware: many of the tests are still pending on that run
<voidspace> frobware: we'll need another run to see if that windows failure is genuine
<voidspace> dooferlad: frobware: standup?
<frobware> voidspace, ah, friday. wrong link. omw...
<voidspace> mgz: ping
<frobware> voidspace, this one? http://reviews.vapour.ws/r/3564/
<voidspace> dooferlad: frobware: huge again!
<voidspace> dooferlad: frobware: http://reviews.vapour.ws/r/3609/
<voidspace> running tests
<frobware> voidspace, this is motivation for us to clean up and get our stuff back into master.
<voidspace> yep
<voidspace> mind you, a big chunk of that diff is the legacy azure provider being removed
<voidspace> frobware: dooferlad: clean unit test run on the merged branch
<frobware> I'm trying to submit a PR but the subsequent RB is always empty (http://reviews.vapour.ws/r/3612/) -- anybody else see this behaviour?
<frobware> voidspace, I was trying to run the unit tests on your -2 merge branch. Currently fails for me.
<frobware> voidspace, dooferlad: http://reviews.vapour.ws/r/3614/
<frobware> voidspace, will take another look at your merge -2 branch now.
<frobware> voidspace, unit test pass for me in your merge -2 branch.
<voidspace> frobware: cool
<mgz> voidspace: when did you guys last merge master?
<mgz> and did you merge a blessed rev or just what happened to be tip?
<voidspace> mgz: well, we merged very recently - but there's already a stack more changes
<voidspace> mgz: and we just merged tip
<voidspace> mgz: we just got a test run (cursed - windows unit tests failed in code unrelated to our changes)
<voidspace> mgz: but *before* we merged master there was a long gap between us and master
<voidspace> mgz: and all the test runs came up with "stale version"
<mgz> voidspace: I'm asking because the windows test failur was only very briefly on master
<mgz> and is fixed on tip
<voidspace> mgz: ah, cool
<voidspace> mgz: we're about to merge again
<voidspace> mgz: and would appreciate a new CI run when that's done
<voidspace> dooferlad: frobware: can I have a +1 on my merge branch?
<frobware> voidspace, done.
<frobware> voidspace, could you take a look at http://reviews.vapour.ws/r/3614/
<mup> Bug #1537082 opened: Cannot use devel/proposed agent streams with daily stream for xenial <bootstrap> <streams> <juju-core:Triaged> <juju-release-tools:Triaged> <https://launchpad.net/bugs/1537082>
<beisner> frobware, cherylj, mgz - 24hrs into exercising 1.25.3 @ uosci, smooth sailing :-)
<cherylj> beisner: yay!!
<voidspace> frobware: looking
<voidspace> frobware: ooh, O(n^2) append function
<voidspace> ;-)
<voidspace> frobware: LGTM, the Python is straightforward
<frobware> voidspace, yep. the alternative is an ordered set. or dict. and importing collections. can do, but don't think we have perf issues.
<frobware> voidspace, wary of importing anything that does not work on series >= precise.
<voidspace> frobware: it's fine, it won't be an issue
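The trade-off being joked about above, an O(n^2) membership-scan append versus a set-backed one, sketched in Go with hypothetical helpers (the reviewed script itself is Python):

```go
package main

import "fmt"

// appendUnique preserves insertion order but is O(n^2) overall:
// every insert scans the whole slice for a duplicate.
func appendUnique(xs []string, x string) []string {
	for _, v := range xs {
		if v == x {
			return xs
		}
	}
	return append(xs, x)
}

// appendUniqueSet keeps order in O(1) amortized per insert by
// tracking membership in a side map.
func appendUniqueSet(xs []string, seen map[string]bool, x string) []string {
	if seen[x] {
		return xs
	}
	seen[x] = true
	return append(xs, x)
}

func main() {
	var a []string
	for _, s := range []string{"eth0", "eth1", "eth0"} {
		a = appendUnique(a, s)
	}
	fmt.Println(a) // [eth0 eth1]
}
```

For the small inputs a test script handles, the O(n^2) version is fine, which was frobware's point.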
<voidspace> mgz: will we get a new maas-spaces CI run automatically, or do you have to kick it off for us?
<voidspace> mgz: frobware: because the latest master merge has landed and we'd like a new one please :-)
<frobware> voidspace, yep looking forward to monday starting out blessed! :-D
<voidspace> hehe, we can hope
<frobware> voidspace, I even bootstrapped with your latest merge
<mgz> voidspace: it will happen, but I'll manually bump
<natefinch> katco: good question on using tags...
<katco> natefinch: if i remember correctly they're supposed to be constrained to the api portion
<katco> natefinch: and strings are fine elsewhere
<natefinch> katco: I think you're right, though I really dislike the policy behind that second statement...
<natefinch> katco: I think we have a ton of places in the code where string actually means "this is an ID of an object and must be able to be parsed back into a tag" except there's nothing actually indicating or enforcing that
<natefinch> except perhaps having id on the end of the variable name :/
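The "string that must parse back into a tag" invariant natefinch describes can be made explicit with a typed wrapper. A hedged sketch with a hypothetical MachineTag (the real tag types live in github.com/juju/names):

```go
package main

import (
	"fmt"
	"strings"
)

// MachineTag is a hypothetical stand-in for the tag types in juju/names.
// The point: an ID should round-trip through a typed tag rather than be
// passed around as a bare string with "Id" in the variable name.
type MachineTag struct{ ID string }

func (t MachineTag) String() string { return "machine-" + t.ID }

// ParseMachineTag rejects strings that are not machine tags, enforcing
// what a bare string parameter never could.
func ParseMachineTag(s string) (MachineTag, error) {
	const prefix = "machine-"
	if !strings.HasPrefix(s, prefix) {
		return MachineTag{}, fmt.Errorf("%q is not a machine tag", s)
	}
	return MachineTag{ID: strings.TrimPrefix(s, prefix)}, nil
}

func main() {
	tag, err := ParseMachineTag("machine-0")
	fmt.Println(tag.ID, err) // 0 <nil>
	_, err = ParseMachineTag("unit-mysql-0")
	fmt.Println(err != nil) // true
}
```

Functions taking a `MachineTag` instead of a `string` then get the "this is a valid machine ID" guarantee from the type system.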
 * perrito666 is under an AC and still a little hot... sometimes I really hate this country
<natefinch> perrito666: it's -8°C here... (not inside, of course)
<perrito666> natefinch: last night, at 1AM it was 30C here
<frobware> voidspace, interestingly my PR caused a test failure. http://juju-ci.vapour.ws:8080/job/github-merge-juju/6070/console
<mup> Bug #1537152 opened: Can't re-create a model <juju-core:New> <https://launchpad.net/bugs/1537152>
<voidspace> frobware: that looks spurious, retry
<voidspace> frobware: I've had that before
<mup> Bug #1537152 changed: Can't re-create a model <juju-core:New> <https://launchpad.net/bugs/1537152>
<mup> Bug #1537152 opened: Can't re-create a model <juju-core:New> <https://launchpad.net/bugs/1537152>
<mup> Bug #1537153 opened: juju deploy --config option ignored when deploying a bundle <juju-core:New> <https://launchpad.net/bugs/1537153>
<bdx> charmers, SOS --> https://bugs.launchpad.net/charm-helpers/+bug/1534819
<mup> Bug #1534819: FetchHandler charmhelpers.fetch.giturl.GitUrlFetchHandler not found <Charm Helpers:New> <https://launchpad.net/bugs/1534819>
<natefinch> katco: aww man, I missed that you could request to view resources for a specific unit, not just a service... didn't notice it until I was reading your user stories.
<katco> natefinch: not surprising... i think that is a fairly new addition (maybe in capetown)
<natefinch> katco: revision history says it's been there since December 17th... I must have just glazed over it somehow
<natefinch> katco: definitely the nice thing about having a more explicit, verbose set of user stories - harder to miss stuff
<natefinch> as opposed to missing "[/unit]" in a huge document :)
<katco> natefinch: yeah maybe that's what it is. alexisb called it out specifically in her notes, and as i was reviewing the spec to make the user stories i called it out specifically
<natefinch> katco: what happens if you do juju resources --verbose mydb/0 ?
<katco> natefinch: undefined as of yet
<katco> natefinch: error
<mup> Bug #1536819 opened: Feature Request: Storage support for openstack provider-specific config parameters <juju-core:Triaged> <https://launchpad.net/bugs/1536819>
<perrito666> "blah is a controller, use destroy-controller" is perhaps the most annoying error we have
<natefinch> oh, I'm sure we have worse errors than that
<perrito666> that one is annoying
<natefinch> what was that in response to you doing?
<perrito666> you know what it is, you know what I want to do with it, dont get semantic on me
<perrito666> natefinch: destroy-environment blah
<natefinch> destroy controller sounds like a command you should never have to type
<natefinch> we need a better name for what we now call environment
<natefinch> I thought controller meant the state server(s)
<natefinch> bleh
<natefinch> perrito666: maybe we just need a destroy blah command
<mup> Bug #1536215 changed: 1.25.0: deployment times out - system is deployed successfully by maas 1.9 but juju state never transitions from pending <oil> <juju-core:Invalid> <MAAS:Invalid> <https://launchpad.net/bugs/1536215>
#juju-dev 2016-01-24
<mup> Bug #1454919 changed: destroy-environment should cleanup all .jenvs connecting to environ <juju-core:Expired> <juju-core 1.24:Incomplete by waigani> <https://launchpad.net/bugs/1454919>
<mup> Bug #1454919 opened: destroy-environment should cleanup all .jenvs connecting to environ <juju-core:Expired> <juju-core 1.24:Incomplete by waigani> <https://launchpad.net/bugs/1454919>
<mup> Bug #1496184 changed: juju bootstrap on armhf/keystone hangs   juju version 1.24.5 <armhf> <bootstrap> <bug-squad> <juju-core:Invalid> <juju-core 1.24:Invalid> <juju-core 1.25:Invalid> <https://launchpad.net/bugs/1496184>
#juju-dev 2017-01-16
<nurfet> hey guys, which branch I should create PR to for a new provider, develop?
<thumper> anastasiamac: https://github.com/juju/juju/pull/6804
<thumper> nurfet: yes
 * anastasiamac looking
<anastasiamac> thumper: +1...
<thumper> ta
<rick_h> thumper: ping
<thumper> rick_h: hey
<rick_h> thumper: hey, did you want to catch up today at all?
<thumper> sure
<rick_h> thumper: what's your schedule look like?
<thumper> clear
<rick_h> thumper: k, https://hangouts.google.com/hangouts/_/canonical.com/rick?authuser=1
<thumper> oh ffs
<thumper> fuckity fuck...
 * thumper considers this race
<thumper> babbageclunk: around?
<thumper> god this will be horrible, but necessary...
<thumper> geez
<anastasiamac> thumper: sounds like fun \o/ what have u got?
<thumper> a race condition
<thumper> StartSync()
<thumper> is used throughout the tests
<thumper> to get watchers to get values
<anastasiamac> i've seen these and added a few of my own
<anastasiamac> some tests do not run well without them
<thumper> but places assume that it is sufficient
<anastasiamac> right
<thumper> in the s390x tests, there are calls to change, startsync, more change, start sync
<anastasiamac> i think the bigger issue (and probably better approach) is to address our watchers
<thumper> now the assumption is that first change comes independently of the second
<thumper> but you can't ensure that
<thumper> because the start sync just says "hey, please start a sync", it doesn't wait for it, or even be sure it has started
<thumper> I'm beginning to think that the test is just bollocks
<thumper> this failure http://reports.vapour.ws/releases/4711/job/run-unit-tests-xenial-s390x/attempt/903#highlight
<anastasiamac> i'd say we should seriously consider all intermittently failing tests and what values they bring
<anastasiamac> most of them need to be re-designed
<thumper> func (s *watcherSuite) TestWatchUnitsKeepsEvents(c *gc.C) {
<anastasiamac> or actually made into unit test
<anastasiamac> yeah, that's one... but there are a lot of these sprinkled thru
<anastasiamac> just look for StartSync in test files
<thumper> oh I know exactly what is happening
<thumper> but it's all fake
<thumper> and meaningless
<thumper> the test is asserting a false proposition
<anastasiamac> nice
 * thumper fixes test
<thumper> it is bollocks
<thumper> the assertion that the events are separate is needless
<thumper> and impossible to ensure without hoop jumping
<thumper> so why bother
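The StartSync semantics described above, a non-blocking request that may coalesce with one already pending, can be sketched as follows (hypothetical syncer, not the actual juju implementation):

```go
package main

import "fmt"

// syncer models a worker that syncs on request. StartSync is
// fire-and-forget: it returns before the sync has run, and never
// waits for, or even confirms, that a sync has started.
type syncer struct{ requests chan struct{} }

func (s *syncer) StartSync() {
	select {
	case s.requests <- struct{}{}:
	default:
		// A request is already pending: this one coalesces with it,
		// so "change, StartSync, change, StartSync" may be observed
		// by the watcher as a single combined event.
	}
}

func main() {
	s := &syncer{requests: make(chan struct{}, 1)}
	s.StartSync() // after change 1
	s.StartSync() // after change 2: coalesced with the pending request
	fmt.Println(len(s.requests)) // 1
}
```

This is why a test asserting that two changes arrive as two separate watcher events is asserting something the API never promised.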
<thumper> anastasiamac: https://github.com/juju/juju/pull/6806
 * anastasiamac looking
<thumper> anastasiamac: the strings watcher test helper calls start sync before any assert
<thumper> the extra ones aren't necessary
<anastasiamac> thumper: there was at least another test where it was necessary... let me dig it up just for ur perusal :D
<anastasiamac> thumper: https://github.com/juju/juju/pull/6608
<anastasiamac> in that instance, adding additional syncs eliminated the race :(
<thumper> anastasiamac: yeah, using watchertest.StringsWatcherC would have saved you all that
<thumper> rather than statetest
<thumper> see watchertest/strings.go
<thumper> watchertest.NewStringsWatcherC(c, w, s.BackingState.StartSync)
<thumper> pass a "pre-assert" function in
<thumper> in this case, the StartSync on the BackingState
<thumper> that way you don't have to sprinkle the code with startSync calls before every assert
<thumper> because it does it for you
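The pre-assert pattern thumper describes, where the checker runs a caller-supplied function before every assertion, roughly looks like this (hypothetical names modeled on watchertest, not the actual source):

```go
package main

import "fmt"

// stringsWatcherC is a sketch of a watcher test helper that calls a
// caller-supplied preAssert hook (e.g. BackingState.StartSync) before
// every assertion, so tests need not sprinkle StartSync calls around.
type stringsWatcherC struct {
	preAssert func()
	events    chan []string
}

func newStringsWatcherC(pre func(), events chan []string) *stringsWatcherC {
	return &stringsWatcherC{preAssert: pre, events: events}
}

// AssertChange runs the hook, then checks the next event.
func (w *stringsWatcherC) AssertChange(want ...string) error {
	w.preAssert()
	got := <-w.events
	if fmt.Sprint(got) != fmt.Sprint(want) {
		return fmt.Errorf("got %v, want %v", got, want)
	}
	return nil
}

func main() {
	synced := 0
	events := make(chan []string, 1)
	events <- []string{"unit/0"}
	w := newStringsWatcherC(func() { synced++ }, events)
	err := w.AssertChange("unit/0")
	fmt.Println(err, synced) // <nil> 1
}
```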
<anastasiamac> thumper: k. i'll circle back to it tomorrow... have a killer migraine atm :(
<perrito666> Morning
<gsamfira_> Hello folks. Anyone have time to review a 5 line PR? :) https://github.com/juju/utils/pull/260
<perrito666> gsamfira: hi dude, long time no see
<gsamfira> perrito666: yeah, it's been a while :D
<perrito666> looking at the patch
<gsamfira> thanks!
<perrito666> gsamfira: I am keen to approve the patch even though you missed the QA steps :p but I would love an explanation of what those two windowses are?
<gsamfira> It's been a while, I am not familiar with the new QA steps :D. I may need a crash course :P
<gsamfira> One is Hyper-V server 2016. The free version of windows that just gives you the hypervisor and nothing more
<gsamfira> the other is the Windows storage server. That one does not really add a new series. Just enables detection for that particular version
<perrito666> gsamfira: sure, we sent the new rules over a mail or some other not really good communication form so I cant easily point you to them (I know, we need to fix that)
<perrito666> gsamfira: so the first step is to re-propose to develop instead of master :D
<gsamfira> I didn't see a develop branch in juju/utils
<gsamfira> would be happy to do it
<perrito666> ahhh utils
<perrito666> I forgot we moved that
<perrito666> :p
 * perrito666 makes a note about having develop on utils
<perrito666> gsamfira: ship it
<gsamfira> thanks!
<junaidali> Hi guys, I have a failed machine (a controller instance in HA), I have run 'juju enable-ha' to ensure HA again but now I can't remove the failed machine
<junaidali> it is giving error 'ERROR no machines were destroyed: machine <machine number> is required by the model'
<junaidali> Any idea, what I'm missing here? Juju HA is already ensured.
<perrito666> junaidali: hey, I assume this is juju 2.x?
<junaidali> perrito666: yes
<perrito666> mm, could you post in a pastebin your "juju status -m controller --format=yaml" ?
<junaidali> perrito666: my bad, I removed two instances from my HA (out of three, need to restore from backup now), can't access controller right now
<perrito666> junaidali: ah I see
<perrito666> well ping me if you need any help
<perrito666> ill be on this channel all day
<junaidali> perrito666: Thanks, did you want to check the member-status for that machine?
<junaidali> controller-member-status*
<perrito666> junaidali: yes, also I was curious what juju thought the statuses for those machines were
<junaidali> I remember, the value was 'no-vote'. I will let you know when I restore the controller and re-enable HA
<perrito666> tx a lot
<perrito666> bbl
<thumper> I hate intermittent failures
<babbageclunk> thumper: yeah they suck
<thumper> hmm actually I think this is another case of slow race run with complex cert sometimes takes too long
<thumper> babbageclunk: https://github.com/juju/juju/pull/6809
<babbageclunk> thumper: looking
<thumper> babbageclunk: many of these timeout type failures are due to expectations that the server starts quickly
<thumper> on the race build, on the CI machine, generating a CA cert can take 4s and I have seen the 2048-bit server cert take 10s
<babbageclunk> thumper: LGTM
<thumper> ta
<redir> I sometimes hate intermittent failures
<redir> More often I hate intermittent success
<anastasiamac> :D
<thumper> ha
<redir> I am a bit stuck running into a bug https://bugzilla.redhat.com/show_bug.cgi?id=1325085
<redir> but dannf is off since it is a holiday here today.
<redir> so I can't ask if he got past that via a workaround.
<redir> So now might be a good time to have a run and a think on it.
<redir> the issue is theoretically fixed in libvirt 1.3.3, but xenial ships with 1.3.1
<redir> also it seems that we can't expect kvm to work on arm64 on trusty
<redir> :/
<redir> bbiab
<thumper> redir: if that is the case, then we should blacklist it in the code
<babbageclunk> thumper: https://github.com/juju/juju/pull/6810 - bit of a weird one.
 * thumper looks
<babbageclunk> thumper: Ok, I'll add the names. Do think it would be better as MoveInstancesToController(controllerUUID string, ids ...instance.Id)?
<babbageclunk> thumper: Now that I've typed that out I don't think it's better.
<thumper> babbageclunk: I'd keep consistency with the other methods
<perrito666> morning babbageclunk thumper ... redir? arent you in US?
<redir> perrito666: yes, but planning to swap the holiday for later
 * redir really goes for a run now
<babbageclunk> sure buddy
<redir> heh
<redir> thumper: you mean bl arm64 or trusty on arm64?
<thumper> redir: trusty on arm64
<redir> k
<redir> let me run it by dannf tomorrow
<anastasiamac> thumper: babbageclunk: veebers: wallyworld: redir: perrito666: axw: i've added an HO to standup meeting invite... could we plz try it today instead of ...?
<redir> they both work for me
<anastasiamac> redir: i'd take it as a 'yes' ;)
<babbageclunk> anastasiamac: fine by me
<anastasiamac> babbageclunk: \o/
<babbageclunk> thumper: could you take a look at https://github.com/juju/juju/pull/6813?
<thumper> babbageclunk: ack
<thumper> babbageclunk: were we tagging instances with controller uuid in gce before?
 * thumper needs to go kick everyone in the house off the internet for the team hangout
<babbageclunk> thumper: yup - look in ControllerInstances for example.
<Miguel_Ubuntu> Trying to install Juju on a MAAS server. The bootstrap fails. There is a similar report on Ubuntu Solutions Engineering on github, but I cannot use the suggestion. It is the top issue.
<thumper> wallyworld, anastasiamac: standup hangout?
<thumper> perrito666: hangout change?
<wallyworld> what's the hangout name?
<perrito666> thumper: sure, fire up a hangout
<anastasiamac> HO is in standup invite in calendar
#juju-dev 2017-01-17
<anastasiamac> thumper: babbageclunk: can we cap log collection in mongodb? bug 1656430
<mup> Bug #1656430: juju logs should be a capped collection in mongodb <sts> <juju:New> <https://launchpad.net/bugs/1656430>
<babbageclunk> anastasiamac, thumper: I don't think mongo allows deleting from a capped collection, so the pruner couldn't work. I'm not sure whether that would cause any knock-on problems.
<anastasiamac> love the error msg \o/ - "Something wicked happened" :D
<anastasiamac> babbageclunk: thnx!
<anastasiamac> babbageclunk: but interestingly, would we need a pruner if the log collection was capped anyway?..
<thumper> anastasiamac: the problem with a capped collection is one noisy model can remove logs for another model
<thumper> but...
<thumper> perhaps there are ways.
<anastasiamac> thumper: yep. marked the bug as invalid... but thought it was interesting thought-flexing exercise :D
<redir> collection per model
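Thumper's objection, that a single capped collection is a shared ring buffer so a noisy model evicts another model's entries, in a self-contained sketch (an in-memory stand-in, not mongo):

```go
package main

import "fmt"

// cappedLog is an in-memory stand-in for a mongo capped collection:
// a fixed-size ring buffer where the oldest entry is evicted on
// overflow, regardless of which model wrote it.
type entry struct{ model, msg string }

type cappedLog struct {
	max     int
	entries []entry
}

func (c *cappedLog) insert(e entry) {
	c.entries = append(c.entries, e)
	if len(c.entries) > c.max {
		// Oldest entry dropped, whatever model it belongs to.
		c.entries = c.entries[1:]
	}
}

func main() {
	log := &cappedLog{max: 3}
	log.insert(entry{"quiet", "started"})
	for i := 0; i < 3; i++ {
		log.insert(entry{"noisy", fmt.Sprintf("msg %d", i)})
	}
	fmt.Println(log.entries[0].model) // noisy: the quiet model's entry is gone
}
```

A per-model cap (as the re-filed bug heads toward) keeps one model's noise from erasing another's history, which a single shared cap cannot.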
<thumper> anastasiamac: I reopened that bug but with a different subject
<thumper> Miguel_Ubuntu: what is the problem you are hitting?
<anastasiamac> thumper: k
<redir> bbl after dinner and stuff
<redir> night
<junaidali> perrito666: output of juju status -> http://paste.ubuntu.com/23814800/ after restore
<junaidali> I'm unable to remove machines with 'down' status and 'no-vote' as controller-member-status
<jam> wallyworld: ping. IIRC your team worked on "show-status-log". I'm trying to debug some provisioning errors, and I see the messages show up while the provisioner is trying
<jam> but as soon as the provisioner gives up
<jam> the machine goes to "pending", ""
<jam> (empty message)
<jam> and "juju show-status-history" only shows the "pending", "" entry
<jam> none of the ones that have been giving me "failed because of X" messages.
<jam> axw: ^^ in case you know something about it, too
<perrito666> jam: I can help you with that, just give me a couple of mins to discover where I left my glasses when I woke up
<perrito666> jam: you mean the status history is being re-written?
<perrito666> aaand I just realized you asked this like an hour ago
<jam> perrito666: I mean that I'm testing what happens when provisioning fails. And if I watch "juju show-machine X" I can see the messages about failing and will retry
<jam> perrito666: but as soon as the provisioner decides that its done trying
<jam> perrito666: the message ends up as ""
<jam> which is rather unhelpful
<jam> looking at the code in the Provisioner
<jam> I see it wanting to call setErrorStatus which should set the machine into an Error state, instead of a Pending state
<jam> and have a nice looking message there.
<jam> well, at least an informative one.
<perrito666> jam: that is rather odd, and how is show-status-log involved here?
<jam> perrito666: it also has only 1 entry "pending", ""
<jam> which doesn't match the 5+ messages we just set about "could not do what you wanted"
<jam> perrito666: and I was hoping  to, you know, *see* the reason why it had been failing.
<jam> perrito666: mongodb has the same content as "juju status" and "juju show-machine" and "juju show-status-log" which means at least the reporting layer isn't lying
<perrito666> jam: mmmm, interesting
<perrito666> so I can make an educated guess: iirc there used to be this rule: "you can't set error without the data field being populated"; if that's still in place, the status set for error might be failing
<perrito666> status history setting happens inside set status and is non-guaranteed and non-blocking, so even if setting status history fails, set status might succeed; so it's not that breaking it, or it should not be
<jam> perrito666: hm. it seems to be setting the values on the agent's message
<jam> perrito666: https://pastebin.canonical.com/176150/
<perrito666> checking
<jam> so while the values are there for something
<jam> it isn't the machine object
<jam> perrito666: is there a way to get the history for the "juju-status" portion of 'show-machine' ?
<jam> perrito666: After debugging a bit more I do see a 'statuseshistory' entry for something
<jam> "globalkey" : "m#1/lxd/6#instance", isn't interesting, but "globalkey" : "m#1/lxd/6" is
<jam> perrito666: it looks like we added code so that if a machine isn't Stopped or Pending, then it overwrites the value of the juju-status field with "agent not communicating"
<jam> however, it doesn't handle if the status is Error
<jam> perrito666: juju show-status-log --type juju-machine XXX is what I wanted
<perrito666> jam: sorry someone was at the door back to you
<perrito666> jam: mm that is iirc what cherylj added and most likely I finished, which is a proper "hardware" status
<jam> perrito666: "what I wanted" meaning that's where the actual information *is* but it was quite confusing to find.
<perrito666> we hold status for agent which is juju agent and "instance" which is the underlying status of the actual hardware
<jam> and the fact that we were setting a field which defaults to being overridden
<perrito666> jam: now, that was a bad design decision, I wonder why we did that
<jam> perrito666: I think the idea is that you can't trust the status if the agent isn't communicating/you want to let the user know that the status is stale.
<jam> perrito666: but I think it fundamentally is just "we should be setting InstanceStatus" during provisioning
<jam> not Status
<perrito666> ahhh indeed, but that should not override error
<jam> perrito666: it *doesn't* override Pending or Stopped
<jam> but that is the only check
<jam> I don't know whether Error was just not thought of
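The suppression rule under discussion, replacing the stored status with "agent not communicating" unless it is Pending or Stopped, with Error added as the proposed fix, sketched here with hypothetical names (not the actual juju source):

```go
package main

import "fmt"

// status is a sketch of the machine agent status values in question.
type status string

const (
	pending status = "pending"
	stopped status = "stopped"
	errored status = "error"
	running status = "running"
)

// reported models the rule jam describes: when the agent is down, the
// stored status is overridden unless it is in the allow-list. Including
// errored in that list (the proposed fix) lets a provisioning failure's
// message survive instead of collapsing to an empty "pending".
func reported(stored status, msg string, agentAlive bool) (status, string) {
	if agentAlive {
		return stored, msg
	}
	switch stored {
	case pending, stopped, errored: // errored is the proposed addition
		return stored, msg
	default:
		return "down", "agent is not communicating with the server"
	}
}

func main() {
	st, msg := reported(errored, "failed to create instance", false)
	fmt.Println(st, msg)
}
```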
<perrito666> jam: btw, instance status is set during provisioning iirc
<jam> perrito666: *if* you call apiserver.provisioner...machine.SetInstanceStatus it will call that and machine.SetStatus
<jam> perrito666: however, the provisioner code itself *only* calls SetStatus, *not* SetInstanceStatus
<jam> perrito666: when there is a provisioning failure
<jam> perrito666: unless the machine.SetStatus client-side is actually calling SetInstanceStatus
<jam> perrito666: however, i'm not seeing any history in "juju show-status-log --type machine 1/lxd/3"
<perrito666> jam: but there is an instancestatuspoller
<jam> perrito666: this is container provisioning stuff that I'm specifically focused on.
<jam> perrito666: but I'm pretty sure the maas status messages also end up in "juju-status" not "machine-status"
<jam> I could be wrong there
<jam> its been a while
<perrito666> jam: mmm, odd, I wonder if the filter is ok... try getting all types and see if your global key shows (I dont recall the actual syntax for this)
<perrito666> I am pretty sure there is a thing called instancesomethingpoller that populates the instance status
<jam> perrito666: so m#1/lxd/6/ is interesting m#1/lxd/6#instance is not
<jam> for the purposes of seeing "failed to create instance"
<jam> sort of thing
<perrito666> jam: I see, we need to polish that then
<jam> perrito666: well, its what I'm working on *right now*, fortunately :)
<perrito666> jam: would it be too much of a hassle to ask you to put up a bug with that info pointed in my direction?
<perrito666> ahhh
<jam> perrito666: bug #1650252
<mup> Bug #1650252: juju add-machine lxd:N --constraints INVALID does not show provisioning error <lxd> <observability> <provisioning> <ui> <juju:Triaged> <https://launchpad.net/bugs/1650252>
<perrito666> I thought you were working on something else and got hit by this issue
<jam> perrito666: I got hit by this issue when I refuse to start an LXD instance because of a misconfiguration, and no error is shown to the user.
<perrito666> gotcha, I believe there is another ux pain point there where instance status is not getting the right info
<jam> perrito666: so do you think that if juju-status is in Error it should not suppress the message when the agent is not alive?
<perrito666> jam: I am unsure if that is the right place to show that error
<perrito666> I mean, its not an error from the agent
<perrito666> we are posting "there should be an agent here, but we could not give you one"
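The suppression behaviour jam describes above can be sketched roughly as follows. This is an illustrative reconstruction, not juju's actual code: the names and shape are assumptions, but it captures the check discussed, where Pending and Stopped survive an agent going away while Error (at the time) did not.

```go
package main

import "fmt"

// Status is a simplified stand-in for juju's agent status values.
type Status string

const (
	StatusPending Status = "pending"
	StatusStopped Status = "stopped"
	StatusError   Status = "error"
	StatusStarted Status = "started"
)

// displayedAgentStatus sketches the override discussed above: when the agent
// is unreachable, the last-reported status is normally replaced with a stale
// "down" marker, but a provisioning Error (like Pending and Stopped) arguably
// should survive that override so the user still sees the failure.
func displayedAgentStatus(last Status, agentAlive bool) Status {
	if agentAlive {
		return last
	}
	switch last {
	case StatusPending, StatusStopped, StatusError: // Error added per the discussion
		return last
	}
	return Status("down")
}

func main() {
	fmt.Println(displayedAgentStatus(StatusError, false))   // provisioning error survives
	fmt.Println(displayedAgentStatus(StatusStarted, false)) // stale status shows as down
}
```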
<perrito666> why is it that memory leaks never come up when one needs them :p
<natefinch> voidspace: your mic is not working
<voidspace> natefinch: thanks
<redir> pong
<redir> oops
 * thumper sighs
<thumper> more freaking intermittent failures
 * thumper picks one
<redir> :|
<thumper> freaking peergrouper tests...
<thumper> http://reports.vapour.ws/releases/issue/5617dbc6749a562f5cdd8efc
 * thumper dives on it
 * perrito666 tried to get mongo to accept 0.25G as a way of expressing 256M
<menn0> babbageclunk: bug 1569632 is done right?
<mup> Bug #1569632: indicate "migrating" in show-model status output <juju:Triaged by 2-xtian> <https://launchpad.net/bugs/1569632>
<perrito666> ghaaaaaaaaaaaaaa, this only became a float in 3.4
 * perrito666 cries on the floor
<perrito666> are we getting mongo 3.4 rsn?
<menn0> perrito666: we probably should
<perrito666> menn0: yup, especially because until then the WiredTiger cache is bound to take 1G as the minimum possible parameter
<menn0> perrito666: really?
<perrito666> menn0: well the command line param does not support floats until 3.4
<perrito666> so we can let it choose, but that defaults to half the RAM minus 1G
<menn0> perrito666: and it doesn't take a unit/
<menn0> ?
<perrito666> nooope
<perrito666> technically it does
<perrito666> its /var/lib/juju/init/juju-db/juju-db.service
<perrito666> --wiredTigerCacheSizeGB
<perrito666> there
<perrito666> so, it takes one unit :) GB
<menn0> perrito666: well that just sucks
<perrito666> after standup I'll glog my upload by deploying a huge bundle and see how this new setting bodes (even if I ask for 1G it will be better than allowing it to grow at will)
<perrito666> s/glog/clog
<menn0> perrito666: regardless, it might be worth starting the ball rolling for moving to mongodb 3.4
<perrito666> yup, I just need to try and remember who was the packager
<menn0> perrito666: was it mwhudson ?
<perrito666> yes, tx
<perrito666> sorry I am a bit distracted today
<perrito666> mwhudson: hello, you might remember me from, lets upgrade to mongo 3.1 and lets upgrade to mongo 3.2
<perrito666> mwhudson: lets upgrade to mongo 3..4
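The constraint being worked around here is that mongod's `--wiredTigerCacheSizeGB` flag takes only gigabytes and accepts fractional values only from MongoDB 3.4 on, so a 256M cache cannot be expressed on earlier versions. A small sketch of how a caller might format the argument; the helper name and round-up policy are assumptions for illustration:

```go
package main

import "fmt"

// cacheSizeGBArg formats a desired cache size (in MB) as an argument for
// mongod's --wiredTigerCacheSizeGB flag. MongoDB 3.4+ accepts fractional
// gigabytes; older versions only accept whole GB, so we round up to avoid
// asking for a zero-sized cache.
func cacheSizeGBArg(wantMB uint64, supportsFloat bool) string {
	if supportsFloat { // MongoDB 3.4+: fractional GB is allowed
		return fmt.Sprintf("%g", float64(wantMB)/1024)
	}
	// pre-3.4: whole gigabytes only, rounded up
	return fmt.Sprintf("%d", (wantMB+1023)/1024)
}

func main() {
	fmt.Println(cacheSizeGBArg(256, true))  // 0.25
	fmt.Println(cacheSizeGBArg(256, false)) // 1
}
```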
<menn0> thumper: are you available for a quick hangout?
<thumper> sure
<thumper> menn0: 1:1 hangout?
<menn0> thumper: yep
<wallyworld> thumper: when you are free, PTAL at PR 6815, issues fixed
<wallyworld> thanks for review
<perrito666> so, hangouts or bluejeans?
<anastasiamac> perrito666: ho
#juju-dev 2017-01-18
 * thumper has a meeting with a bank person...
<thumper> afk for a bit
 * perrito666 afk for the night, c u all
<babbageclunk> menn0: review plz? https://github.com/juju/juju/pull/6827
 * redir eods
<axw> wallyworld: can you please see my comments on https://github.com/juju/juju/pull/6795
<wallyworld> sure
<wallyworld> axw: commented
<wallyworld> see if you think extracting assertDestroy is worthwhile
<axw> wallyworld: ta
<wallyworld> thumper: back?
<thumper> yeah
<thumper> trying to work out why this fucking test is failing
<wallyworld> can you PTAL at that PR?
<wallyworld> i've fixed the things
<thumper> yeah
<wallyworld> ta. sorry
<thumper> wallyworld: lgtm
<wallyworld> great tyvm
<wallyworld> axw: if you get a chance at some stage, would love a review on https://github.com/juju/juju/pull/6819. Once that's in, cmr is good enough (behind the flag) to share out a bit for feedback
<axw> wallyworld: okey dokey, will look shortly
<wallyworld> no rush
<thumper> ugh...
<thumper> pretty sure this timing issue is due to clock.After(0) and the testing clock
<thumper> hmm... no?
<babbageclunk> thumper or someone else? I think menn0's busy. Could you take a look at https://github.com/juju/juju/pull/6827 plzthx?
<babbageclunk> axw, anastasiamac? ^
<axw> babbageclunk: will do, just reviewing wallyworld's one atm
<babbageclunk> axw: ok thanks!
<anastasiamac> babbageclunk: since axw will b looking, I'll skip :D
<axw> babbageclunk: reviewed
<babbageclunk> anastasiamac: eminently sensible :)
<babbageclunk> axw: thanks!
<anastasiamac> babbageclunk: i'd question both :0 but nice that u think so
<anastasiamac> axw: babbageclunk: i like "adopt"... makes it sound almost humanitarian and benevolent
<axw> anastasiamac: :p
<axw> can't have those instances roaming the streets
<anastasiamac> unfed, unclothed...
<axw> pick a pocket or two
<babbageclunk> axw, anastasiamac: I think I'm ok with AdoptInstances (although in this case it's a bit more like "here, you take them!")
<babbageclunk> which, who hasn't felt like that sometimes?
<anastasiamac> fostering does not have the same 'share the love" feel
<anastasiamac> babbageclunk: i didnt, until recently
<axw> foistering? :)
<axw> I need to take a break from punning, working up a sweat here. lunch time
<babbageclunk> axw: That's some solid work!
<menn0> babbageclunk: did you still need a review?
<babbageclunk> menn0: no thanks, axw gave me a better review anyway!
<babbageclunk> ;)
<menn0> babbageclunk: ouch!
 * menn0 will remember to be nasty on his next review for babbageclunk 
 * babbageclunk d'ohs
<babbageclunk> anastasiamac: Want to look at this one? https://github.com/juju/juju/pull/6829 (or axw)
<axw> babbageclunk: reviewed
<babbageclunk> axw: thanks! Yes, SetOwnerData will preserve keys that aren't set - I've updated the comment.
<axw> babbageclunk: thanks
<wallyworld> axw: can i get comment access to that storage doc?
<axw> wallyworld: sorry, one sec
<axw> wallyworld: try now
<wallyworld> ta
<axw> wallyworld: there's a question from james beedy on your PR if you haven't seen
<wallyworld> axw: not yet, been out talking to anastasia and looking at specs
<axw> wallyworld: I think I've addressed all of your comments in the doc, thanks for the review/comments. Would appreciate another look - I'm heading out shortly, so it can wait till tomorrow
<rogpeppe> axw: hiya
<rogpeppe> axw, wallyworld: do you know if the juju master branch is used any more?
<anastasiamac> rogpeppe: it's meant to b releasable (for next release) but is not at this stage
<anastasiamac> rogpeppe: why?
<anastasiamac> rogpeppe: if u have smth for 2.2, it goes into develop, otherwise each release has it's branch 2.0, 2.1, etc
<rogpeppe> anastasiamac: the last commit was in october
<rogpeppe> anastasiamac: i thought it would be more up to date than the version branches
<anastasiamac> rogpeppe: yes, there is a new dev process in place; we've been releasing from staging
<anastasiamac> rogpeppe: master is kind of "reserved"; we need to promote to it from develop>staging>master once we have a bless and passing tests
<jam> perrito666: if you're around I have another 'juju show-status-log' question. Specifically, the last step is putting the machine into an error status, and that doesn't seem to be shown in the status log
<jam> only the 'pending' entries are
<jam> perrito666: also wide entries in 'message' cause all lines to wrap
<jam> perrito666: https://bugs.launchpad.net/juju/+bug/1657383
<mup> Bug #1657383: juju show-status-log wraps all entries if one is too wide <show-status-log> <ui> <juju:Triaged> <https://launchpad.net/bugs/1657383>
<jam> perrito666: maybe its a throttling thing, but I see 4 messages in 'juju debug-log' but only 2 in 'juju show-status-log' https://pastebin.canonical.com/176285/
<jam> wallyworld: ^^
<rogpeppe> this PR fixes the logging on API connect so that it doesn't print misleading errors: https://github.com/juju/juju/pull/6830
<rogpeppe> jam: I think you reviewed the change that started printing the errors - you might want to take a look at this ^
<rogpeppe> macgreagoir: likewise ^
<jam> rogpeppe: your !a.HasNext() means that certError may be a nil object in the next print statement
<jam> in the Debugf
<jam> ah, its a bool not an actual error
<rogpeppe> jam: perhaps i should rename it isCertErr
<jam> rogpeppe: that might help, but I'm also wondering about the format string.
<jam> error dialing (certificate error: true): blah
<jam> vs
<jam> error dialing (certificate error: false): blah
<jam> I'm not sure if there is actual value in knowing about the certificate error status
<rogpeppe> jam: yeah, probably not. i just left it there because that PR seemed to find it useful
<rogpeppe> jam: it should be fairly obvious from the error message tbh
<jam> rogpeppe: right. I'd like us to actually look at that instead
<jam> rogpeppe: and then the comment needs updating
<rogpeppe> jam: i'll remove that, which makes the code exactly what it was before, i think
<jam> eg: // we won't retry anymore. Either this is a certificate error, or we're out of attempts
<rogpeppe> jam: before PR #6620
<jam> rogpeppe: that would indicate that the errors *aren't* clear.
<rogpeppe> jam: sorry, what would indicate that?
<rogpeppe> jam: AFAIK all those x509 errors include the string "x509" in them
<jam> rogpeppe: PR 6620 would indicate that it was added because the error messages weren't clear
<rogpeppe> jam: the thing is that nothing outside of that code really cares whether it's a cert error or not
<rogpeppe> jam: the diagnosis is only used so we can abort early
<rogpeppe> jam: so unless you're specifically debugging that code, it's probably not worth making a deal of it
<rogpeppe> jam: the new "will retry" message should make it more obvious when a certificate-like error is being inappropriately retried
<jam> rogpeppe: so if it was helpful for context of someone working around there, it is likely to be helpful to someone trying to help someone else who's having a problem.
<rogpeppe> jam: so do you now think that showing whether it's a cert error is worth doing?
<jam> rogpeppe: I don't think the Annotation difference is necessary, but I think we want to keep something about certificate in the logged message, yes
<rogpeppe> jam: so what would you suggest to change in the current PR?
<jam> rogpeppe: would "bad certificate" be clearer than "certificate error: true" ?
<rogpeppe> jam: i think that could be misleading
<rogpeppe> jam: the error string should say what the actual issue is - the important thing from the p.o.v. of that message is whether it was classified as a cert error, i think
<jam> rogpeppe: btw, you missed cfg.Location on the new Debugf
<rogpeppe> jam: good catch, fixed
<jam> rogpeppe: I take it you didn't actually run it to see the messages were helpful.
<jam> example output helps in this situation
<jam> I'm happy with the change to Debugf
<jam> I'm not 100% sure how to succinctly convey whether it was a certificate error.
<jam> the rest LGTM
<rogpeppe> jam: tbh i think that if you're trying to debug it, it'll be obvious in almost all cases whether it was a cert error - if it wasn't a cert error, it'll retry
<rogpeppe> jam: unless it's timed out, which is unlikely to happen immediately
<jam> rogpeppe: so, as *I* haven't been trying to debug it, but macgreagoir did, I'd be interested to hear his thoughts
<rogpeppe> jam: here's some sample output: http://paste.ubuntu.com/23821416/
<jam> rogpeppe: I have to say "certificate error true" doesn't help much when the error is ... cannot validate certificate...
<jam> rogpeppe: its also redundant to put the cfg.Location when that's also in the error
<rogpeppe> jam: it might when the error is "x509: failed to load system roots and no roots provided"
<jam> I wonder if that is normally true
<rogpeppe> jam: looks like it is
<jam> rogpeppe: as you say, x509 probably gives enough context, and I *am* concerned about having errors that don't feel like garbage to read. Always embedding the Location makes it just look like nobody has ever read any of our debug output
<jam> rogpeppe: is that always true for x509, or true always? Like will plain Dial also have the URL?
<rogpeppe> jam: looks like it's true for all websocket dials
<macgreagoir> rogpeppe jam: I can't remember the bug I was debugging when I separated the x509 errors. It seemed useful at the time, so I left it in. If it happens not to be useful, please roll it out.
<jam> hi macgreagoir. thanks. all the ones that I've seen so far also have that information in the error, so it feels redundant.
<jam> macgreagoir: though the problem with errors is that its hard to know if you've seen 'all' of them
<jam> rogpeppe: I'm tempted to take out the cfg.Location, though because we're dialing multiple locations, if you ever get one without that context you're lost
<jam> (what dial failed)
<rogpeppe> jam: one possibility would be to remove the websocket.DialError wrapper, then always add the context
<rogpeppe> jam: but that's probably more logic than i'd ideally want to add for a fairly superficial thing
<jam> rogpeppe: so on one hand its 'not a big deal', on the other it is potentially one of your first interactions, which is why you're dropping it to debug anyway
<rogpeppe> jam: websocket.DialConfig always returns a *websocket.DialError
<rogpeppe> jam: output now looks like this: http://paste.ubuntu.com/23821568/
<rogpeppe> jam: i've updated the PR
<wallyworld> jam: sorry, missed your msg, was out to dinner. my recollection is that throttling could account for the observed behaviour, if it is not smart enough to account for message content as well as status value itself. without looking into it, i'd have to ask horatio
<jam> wallyworld: I'm wondering if we should be throttling at all, given we're starting to make active use of it ourselves.
<jam> wallyworld: anyawy, I'm happy to talk to perrito666 instead, he just needs to wake up. :)
<jam> wallyworld: I hope you enjoyed your dinner.
<wallyworld> yeah, throttling was only ever meant to kill repetitive status-update hook messages. it was a quick implementation that could use some rework AFAIR
<wallyworld> i did, we went to a Jamie Oliver restaurant
<wallyworld> i think there's been a few issues with the throttling, there's a bug or three still
<wallyworld> jam: btw, thanks for comments on cmr spec. i'll garden a bit and put out for comment once i land one more pr
<jam> wallyworld: happy to.
<jam> ping me after you've done it, hopefully I can look with fresh eyes
<wallyworld> will do, i'll update overnight or first thing tomorrow more likely. i have a little cleanup on the pr to auto expose and i want to land that
<wallyworld> i don't mind if the spec has questions still to resolve, just want to get it ready for consumption so to speak
<jam> sure
<perrito666> jam: sorry I am here now
<perrito666> jam: you have a few mins until I decide to leave for a coffee and some sort of bakery product which will take about 10 mins then
<perrito666> jam: ill leave you with an answer at least, throtling was, as wallyworld said, something implemented in a hurry and could use some re-work
<rogpeppe> jam: where's the accepted place to report juju bugs these days? still launchpad?
<rogpeppe> anyone?
<rogpeppe> ha, given there are only two issues on github.com/juju/juju, i guess so
<rogpeppe> unless All The Bugs Have Been Fixed
<rick_h> rogpeppe: yes, launchpad.net/juju
<rogpeppe> rick_h: thanks
<rogpeppe> rick_h: just filed https://bugs.launchpad.net/juju/+bug/1657448
<mup> Bug #1657448: provider/azure: adding credential produces error when service principal exists <juju:New> <https://launchpad.net/bugs/1657448>
<wallyworld> rogpeppe: andrew will be ecstatic :-) i'll let him know
 * perrito666 imagines wallyworld compiling a static version of andrew
<wallyworld> one can never have too many of those
<perrito666> wallyworld: aren't you past your bedtime?
<wallyworld> yeah
<wallyworld> stuff to do
<perrito666> wallyworld: do you need any of us to tuck you in ?
<wallyworld> yes, and a goodnight story
<perrito666> ill read you a user story for bed
<rogpeppe> wallyworld: thanks
<wallyworld> rogpeppe: it's probably not even his bug, but mentally I s/azure/andrew :-)
<rogpeppe> wallyworld: it looks like it's a bug in the azure client lib
<rogpeppe> wallyworld: i filed a bug there too
<wallyworld> sgtm, ty
<rogpeppe> wallyworld: MS loves byte-order marks
<wallyworld> i bet
<perrito666> uhhh, that is a cool internal idea, lets rename azure provider andrew provide internally
<rogpeppe> wallyworld: have you used the azure provider much?
<wallyworld> not much, just a smoke test or two
<natefinch> voidspace, katco, rick_h, frobware, macgreagoir: if there's a standup, I need to miss it to clear the snow in our driveway so we can get the kids to school.
<katco> natefinch: k
<katco> natefinch: happy exercise
<natefinch> :)
<frobware> merge jobs seem to be failing with either 'go' not found or 'lxd' not found. any ideas?
<frobware> http://juju-ci.vapour.ws:8080/job/github-merge-juju/10057/
<perrito666> wow sounds like corrupt server
<perrito666> bbl, have people doing some fixes at the house and need to shut down the power for a moment
<redir> woohoo nested arm64 booted
<jam> perrito666: ping
<jam> redir: nested?
<redir> jam
<redir> yes
<jam> what do you mean by nested arm64?
<jam> kvm in kvm ?
<redir> but just by fiddling with the domainXML
<redir> jam yes
<jam> ah, not something like running a virtualized arm board on an arm board.
<redir> yes with kvm acceleration instead of emulation
<redir> like we do for amd64
<redir> it's apparently a bit different
<redir> and it appears it will only work on xenial
<redir> not trusty
<redir> :(
<jam> redir: is it a kernel thing, such that HWE kernel would work on trusty, or is it userspace thing and the tools aren't backported (or HWE doesn't have it, either)
<redir> jam AFAIU the qemu-efi package requires 15.04 minimum
<jam> perrito666: when you have power again, can you have a look at https://github.com/juju/juju/pull/6828 I ended up needing to tweak some of the InstanceStatus stuff and realized we were missing some test coverage.
<redir> I have some notes I'll add to the wiki.
<jam> perrito666: Also, the different states for InstanceStatus make me wonder if I should be setting something different in the worker
<jam> namely Running vs Started and Provisioning vs Pending.
<jam> perrito666: also, the SetInstanceStatus API accepts all the other values (Pending, etc). Should it be validating the values being set and rejecting ones that aren't in the right set?
<redir> jam you got a sec?
<jam> redir: sort of, I'm not officially working right now, but if you have a Q
<redir> I have a q about the bridge used on arm
<redir> should be real quick
<jam> sure
<redir> https://hangouts.google.com/hangouts/_/canonical.com/john-reed
<redir> thanks jam
<mup> Bug #1550821 opened: TestSetsAndUpdatesMembers timed out <ci> <go1.5> <regression> <trusty> <wily> <juju-core:Triaged> <juju-core 1.25:Fix Released> <https://launchpad.net/bugs/1550821>
<mup> Bug #1550821 changed: TestSetsAndUpdatesMembers timed out <ci> <go1.5> <regression> <trusty> <wily> <juju-core:Triaged> <juju-core 1.25:Fix Released> <https://launchpad.net/bugs/1550821>
<mup> Bug #1550821 opened: TestSetsAndUpdatesMembers timed out <ci> <go1.5> <regression> <trusty> <wily> <juju:Triaged by thumper> <juju-core:Fix Released> <juju-core 1.25:Fix Released> <https://launchpad.net/bugs/1550821>
 * redir lunches
<mup> Bug #1550821 changed: TestSetsAndUpdatesMembers timed out <ci> <go1.5> <regression> <trusty> <wily> <juju:Triaged by thumper> <juju-core:Fix Released> <juju-core 1.25:Fix Released> <https://launchpad.net/bugs/1550821>
<perrito666> jam: pong, why do you always ping me when I am away?
<perrito666> jam: ill check your PR
<perrito666> jam: LGTM your changes seem sane enough
<anastasiamac> thumper: since u were in cert's area, would u have a sec to look at https://github.com/juju/juju/pull/6831
<thumper> anastasiamac: later... in a deep dive now
<anastasiamac> thumper: sure \o/ just flagging for ur perusal
<anastasiamac> thumper: this is a companion PR:  https://github.com/juju/utils/pull/261
<thumper> ack
<sinzui> abentley: can you review https://code.launchpad.net/~sinzui/juju-reports/ignore-missing-tasks/+merge/315076
<abentley> sinzui: r=me with some comments.
<sinzui> abentley: thank you.
<perrito666> brb
<menn0> veebers: so no perfscale meeting
<menn0> ?
<veebers> menn0: hey, sorry I sent an email about re-scheduling it for next week
<menn0> veebers: I vaguely remember that. just thought I'd still seen the meeting on the calendar this morning.
<menn0> ignore me :)
<veebers> menn0: sorry about that, I only just moved the cal event :-P
<menn0> all good... I was away at a meeting with an accountant and am just catching up.
<babbageclunk> axw: ping?
<anastasiamac> babbageclunk: it's 7am for axw... just saying...
<babbageclunk> anastasiamac: yeah, I keep forgetting. I'm out this afternoon though.
<perrito666> anastasiamac: I get up at 6:00 :p
<anastasiamac> babbageclunk: u could leave a msg here, m sure he'll read \o/
<anastasiamac> perrito666: ah... we all are up early (and stay up late) but r u functioning and working at that time?
<perrito666> anastasiamac: I never function :p
<perrito666> btw, wasnt this like our previous standup time?
<perrito666> we could go back there
<anastasiamac> perrito666: it was :) and we could :) let's discuss today
<anastasiamac> perrito666: altho nz'ers might b at or near their lunch ...
<axw> babbageclunk: pong
<babbageclunk> oh hey axw, sorry for timezone confusion
<babbageclunk> axw: I'm doing AdoptInstances for ec2, and I think I need to update tags for volumes as well as for instances?
<babbageclunk> axw: thumper suggested you'd know about that?
<axw> babbageclunk: yes, that is correct
<axw> babbageclunk: several of the providers have support for storage, and use tagging (where possible) similarly to how it's used for tracking instances
<axw> babbageclunk: there's a separate interface for dealing with volumes/filesystems
<axw> babbageclunk: theoretically filesystems have the same issue, though there are no cloud filesystem implementations yet
<babbageclunk> axw: I can see that ec2 CreateTags will work on volume IDs as well - what's the best way to get all the volume IDs for a set of instance IDs?
<axw> babbageclunk: volumes aren't necessarily attached to an instance. they can be floating
<babbageclunk> axw: Oh, right - so it would need to happen higher up? The migration master would call AdoptVolumes as well as calling AdoptInstances?
<axw> babbageclunk: yes. and AdoptFilesystems
<babbageclunk> axw: ok - and what interface should those live on? environs.Environ or somewhere else?
<axw> babbageclunk: storage/{Filesystem,Volume}Source I think would be most appropriate
<babbageclunk> axw: ok, thanks! that probably gives me enough to start with.
<axw> babbageclunk: np, let me know if you get stuck
<babbageclunk> axw: I almost certainly will!
<babbageclunk> (get stuck I mean)
<axw> :p
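The "adopt" operation discussed above amounts to re-tagging provider resources with the new controller's UUID after a model migration, covering volumes separately from instances since volumes can float unattached. A minimal sketch of that idea; the types and tag key are assumptions modelled on the conversation, not juju's actual environs/storage APIs.

```go
package main

import "fmt"

// tagged is a stand-in for any provider resource (instance, volume,
// filesystem) that carries key/value tags.
type tagged struct{ tags map[string]string }

// adopt re-points each resource's controller tag at the new controller,
// leaving all other owner data untouched.
func adopt(resources []*tagged, controllerUUID string) {
	for _, r := range resources {
		r.tags["juju-controller-uuid"] = controllerUUID
	}
}

func main() {
	vol := &tagged{tags: map[string]string{"juju-controller-uuid": "old", "name": "vol-0"}}
	inst := &tagged{tags: map[string]string{"juju-controller-uuid": "old"}}
	// volumes and instances are adopted alike, even if the volume is unattached
	adopt([]*tagged{vol, inst}, "new-uuid")
	fmt.Println(vol.tags["juju-controller-uuid"], vol.tags["name"])
}
```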
#juju-dev 2017-01-19
<babbageclunk> thumper: hey, what was your idea about volumes?
<thumper> babbageclunk: oh yeah
<thumper> babbageclunk: jump back in standup
<babbageclunk> ok
<perrito666> things that cannot be tested in lxd are really evil
<redir> like kvm
<perrito666> k, EOD finally
<thumper> menn0: https://github.com/juju/juju/pull/6833
<menn0> thumper: looking
<thumper> menn0: ta
<menn0> thumper: done
<thumper> menn0: ta
<anastasiamac> ha.. mup has quit!
<mup> Bug #1442493 changed: Openstack services failing on 1 node while deploying using JUJU <deploy> <openstack> <juju-core:Fix Released> <openstack (Juju Charms Collection):New> <https://launchpad.net/bugs/1442493>
<mup> Bug #1484105 changed: juju upgrade-charm returns ERROR state changing too quickly; try again soon <bug-squad> <canonical-is> <upgrade-charm> <upgrade-juju> <juju-core:Fix Released> <https://launchpad.net/bugs/1484105>
<redir> time to make dinner
 * redir eods
<axw> anastasiamac_ wallyworld thumper menn0: I have a couple of PRs that I'd appreciate reviews on, when you have the time: https://github.com/juju/juju/pull/6818 and https://github.com/juju/juju/pull/6826
<thumper> axw: what do you see as the type of things a provider upgrade would do?
<wallyworld> ok, will look shortly
<thumper> axw: and why not part of the standard upgrade ?
<axw> thumper: what do you mean "not part of the standard upgrade"? they are run as upgrade steps as normal, they're just defined in the provider code instead of in the upgrades package
<axw> thumper: the azure-specific PR has an upgrade step that adds a resource to non-controller resource groups. so, very provider-specific things
<thumper> ah...
<thumper> ok, I think I have the concept now
<menn0> axw: that upgrade short circuit helped us avoid a lot of annoyance and confusion from users
<menn0> axw: i'm a bit worried about it not being there
<axw> menn0: what was the cause of annoyance/confusion?
<menn0> axw: before that was added we got a lot of bug reports from people seeing messages about juju upgrading when they knew it wasn't
<axw> menn0: how about we just check if upgraded-to-version == version.current?
<axw> and short circuit on that
<axw> otherwise it really *is* doing an upgrade, there just happens to be no upgrade steps
<anastasiamac_> thumper: didn't u do something recently with megawatcher?...
<menn0> axw: that might be ok
<menn0> axw: except for alpha and beta versions where you might want upgrade steps to always run (i.e. steps added during development without a version bump)
<menn0> axw: it's probably ok though
<axw> menn0: when you upgrade in that case you would be getting a +1 to build though, so it would still trigger
<menn0> axw: yeah, that should be ok
<axw> okey dokey, will make that change now
<axw> menn0: I've pushed a new commit that releases the lock immediately if upgraded-to == version.Current
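The short-circuit agreed above is simple: skip the upgrade machinery (and the user-visible "upgrading" status) when the recorded upgraded-to version already equals the running version, while development builds still trigger because they bump the build number. A sketch, with an illustrative Version type rather than juju's actual version package:

```go
package main

import "fmt"

// Version is a simplified stand-in for a juju agent version; dev builds
// differ only in the Build component.
type Version struct {
	Major, Minor, Patch, Build int
}

// upgradeNeeded reports whether the upgrade steps machinery should run at
// all: only when the recorded upgraded-to version differs from the version
// currently running.
func upgradeNeeded(upgradedTo, current Version) bool {
	return upgradedTo != current
}

func main() {
	v := Version{2, 1, 0, 0}
	fmt.Println(upgradeNeeded(v, v))                   // false: release the lock immediately
	fmt.Println(upgradeNeeded(v, Version{2, 1, 0, 1})) // true: a dev build bump still triggers
}
```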
<thumper> anastasiamac_: don't think so
<menn0> axw: looking
<anastasiamac_> thumper: thnx. so bug 1585361 probably still stands... m going to have to verify manually...
<mup> Bug #1585361: megawatcher delta is missing data (service & relation) <juju:Triaged> <https://launchpad.net/bugs/1585361>
<menn0> axw: ship it
<menn0> axw: the first one anyway
<axw> menn0: tyvm
<anastasiamac_> axw: wallyworld: does bug 1657187 sound like a wish to u? or would it b of help to the project-that-should-not-be-named?
<mup> Bug #1657187: Get new token for existing user <cwr-ci> <matrix> <juju:New> <https://launchpad.net/bugs/1657187>
<wallyworld> doesn't look like it's applicable to 2.1 at all
<anastasiamac_> it's not for 2.1.. m wondering if it's a wishlist item or <PM>ed
<anastasiamac_> wallyworld: is bug 1564026 still a thing?
<mup> Bug #1564026: Unable to upgrade hosted model with --upload-tools after upgrading controller with --upload-tools <juju-release-support> <upgrade-juju> <juju:Triaged> <https://launchpad.net/bugs/1564026>
<wallyworld> upload tools is gone but i don't know if the core problem is fixed
<wallyworld> needs to be tried out
<axw> anastasiamac_: https://bugs.launchpad.net/juju/+bug/1657187 is a bug IMO
<mup> Bug #1657187: Get new token for existing user <cwr-ci> <matrix> <juju:Triaged> <https://launchpad.net/bugs/1657187>
<anastasiamac_> axw: ok
<axw> anastasiamac_: at least part of it
<axw> anastasiamac_: I'm not sure about the ability to register a user on multiple machines, that sounds wishlisty to me. but the fact that you can lose the token and not be able to restart the registration process is a problem
<anastasiamac_> menn0: talking about MM bugs... should we try to address bug 1650009 for 2.1?
<mup> Bug #1650009: Model Migration fails w/ migration already in progress <juju:Triaged by menno.smits> <https://launchpad.net/bugs/1650009>
<anastasiamac_> axw: k. i was hoping to avoid it... but i'll break up the bug into a wishlist item and a bug \o/ thnx :D
<axw> anastasiamac_: thanks
<menn0> anastasiamac_: i'm pretty sure that's already been fixed as part of other MM fixes before xmas
<menn0> anastasiamac_: leave it for 2.1 and i'll confirm
<anastasiamac_> menn0: k \o/ It'd b helpful if there is a PR that u could point the bug to as well..
<anastasiamac_> menn0: u know, in ur spare time :D
<menn0> anastasiamac_: I suspect this overlaps with other bugs fixed
<menn0> anastasiamac_: I will try
<anastasiamac_> menn0: i realise there may not be a dedicated PR.. but if we can pinpoint to any that included this fix as a "drive-by", it'd b incredible!
<menn0> anastasiamac_: understood. i'll try and pin down the fix when I dig in to it
<anastasiamac_> menn0: tyvm
<axw> wallyworld: thanks for the review. I'll look at adding a test that asserts that StartInstance waits for the deployment. might be a bit of a PITA because of clocks, we'll see
<wallyworld> ok, thanks that would be good
<wallyworld> if possible
<anastasiamac_> menn0: sorry... what about bug 1611404? has it had a drive-by fix too?
<mup> Bug #1611404: failed migration leaves model unkillable <model-migration> <juju:Triaged by menno.smits> <https://launchpad.net/bugs/1611404>
<menn0> anastasiamac_: well, the stuck migration issue which led to this ticket being created was fixed a long time ago
<menn0> anastasiamac_: but I would like to go further with it and provide some escape hatch in case of a migration bug, hence the ticket still being open
<menn0> anastasiamac_: not required for 2.1
<anastasiamac_> menn0: it does not have a milestone. just marked as High... so there is that :)
<menn0> anastasiamac_: just leave it that way I guess
<menn0> anastasiamac_: thanks for digging through all of this
<anastasiamac_> menn0: yep. leaving.
<axw> wallyworld: take a quick glance at https://github.com/juju/juju/pull/6826/commits/69a36cc17bf8d57b97e7dede31f8fb075c7c3f46 please?
<wallyworld> ok
<wallyworld> axw: thanks for adding the test
<axw> wallyworld: thanks. just doing a final QA then will land
<rogpeppe> axw: i've reviewed  https://github.com/juju/juju/pull/6835 and 6836
<axw> rogpeppe: thanks
<perrito666> sort of good morning
<voidspace> perrito666: o/ morning :-)
<junaidali> Hi perrito666: there?
<perrito666> junaidali: hi, I am now
<junaidali> perrito666: a few days back, I talked about the issue regarding not being able to delete machines from juju status in controller model that are down
<junaidali> I'm able to reproduce the issue. here is my juju status 'http://paste.ubuntu.com/23827603/'
 * perrito666 checks
<perrito666> junaidali: mm, it would seem the intances never get provisioned
<perrito666> what does maas have to say about that?
<junaidali> Actually, it did provision. I released the other nodes from MAAS to test Juju HA
<junaidali> it is the same environment that I told you about, I have restored the controller from backup as it was down
<junaidali> http://paste.ubuntu.com/23827623/, this is another test environment with three Juju controllers, having same issue
<junaidali> not able to remove the nodes in down state
<perrito666> junaidali: mm, please file a bug including as many log files as you can, especially from agents on the machines, as I suspect something is broken there and therefore juju thinks machines are down
<junaidali> sorry perrito666 for the late reply, will file a bug. Thanks for looking into it
<perrito666> junaidali: np, tx for dealing with this and sorry for the bug
<perrito666> bbl, gotta buy some lunchables
<rogpeppe> does anyone know how log gathering happens in juju 2? does it all happen through port 17070 (api port) ?
<perrito666> rogpeppe: i believe so
<rogpeppe> perrito666: thanks
<rick_h> rogpeppe: yes
<rogpeppe> rick_h: cool
<redir> morning juju-dev
<perrito666> redir: morning
<redir> perrito666: :)
<alexisb> heya redir and perrito666
<alexisb> just fyi, I am back today if you want to chat
<alexisb> just ping me
<redir> alexisb o/
<menn0> morning all
<perrito666> menn0: morning
 * rick_h goes to get the boy from school
<alexisb> morning menn0
<menn0> alexisb: o/
<redir> g'day
 * TheMue greets his former colleagues
<menn0> TheMue: howdy
<thumper> TheMue: hey
 * redir goes for a haircut
<redir> bbiab
<TheMue> Always interested in how Juju evolves. Only sad about the latest news.
 * natefinch also greets his former colleagues ;)
<TheMue> natefinch: o/
<perrito666> someone with some level of linux-fu I would appreciate a review of https://github.com/juju/juju/pull/6838
<perrito666> ok, ill be back later
<alexisb> thumper, if you are around you should join the bug scrub call in 20 minutes
<thumper> yeah
<anastasiamac> alexisb: thumper: \o/
<alexisb> morning anastasiamac
<anastasiamac> alexisb: <3
<menn0> axw or thumper: https://github.com/juju/juju/pull/6839
 * thumper looks
 * menn0 remembers it's too early for axw
<thumper> menn0: done
<thumper> just a few suggestions
<menn0> thumper: thanks
<menn0> thumper: good feedback. I'll address soon.
<thumper> ack
#juju-dev 2017-01-20
<redir> building juju on an emulated system is slooooooooow
<redir> woohoo ubuntu deployed to kvm on arm64
<redir> and mysql
<axw> wallyworld: thanks for the review. I'm going to have to go back to the drawing board tho. I forgot, normal cloud-init runs the script through /bin/sh, which is dash on ubuntu. and that named fd thing is a bashism
<axw> le sigh
<wallyworld> ah
<wallyworld> damn
<wallyworld> i didn't realise that
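[editor's note] The gotcha above is that cloud-init runs user scripts through /bin/sh, which is dash on Ubuntu, so bash-only constructs like named file descriptors fail at runtime. As an illustrative Go sketch (not actual Juju code), a generated script can at least be syntax-checked against the POSIX shell with `sh -n` before being shipped:

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// shSyntaxOK feeds a script to `sh -n`, which parses it without
// executing anything. This catches outright syntax errors under the
// POSIX shell, though semantic bashisms (such as the named-fd
// construct mentioned above) can still slip through unnoticed.
func shSyntaxOK(script string) bool {
	cmd := exec.Command("sh", "-n")
	cmd.Stdin = strings.NewReader(script)
	return cmd.Run() == nil
}

func main() {
	fmt.Println(shSyntaxOK("echo hello\n"))          // parses fine everywhere
	fmt.Println(shSyntaxOK("echo \"unterminated\n")) // unterminated quote: parse error
}
```

For a more thorough lint, the `checkbashisms` tool from the devscripts package flags bash-only idioms that `sh -n` accepts.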
<junaidali> Hi guys, should I file bugs for Juju on Launchpad or on GitHub?
<junaidali> I have filed a bug on lp https://bugs.launchpad.net/juju/+bug/1658033, should I create an issue on GitHub as well?
<mup> Bug #1658033: Juju HA - Unable to remove controller machines in 'down' state <juju:New> <https://launchpad.net/bugs/1658033>
<perrito666> Morning
<rick_h> junaidali: no, the LP bug is great ty
<mup> Bug #1658100 opened: Deployment of a large bundle fails or hoggs the system <juju-core:New> <https://launchpad.net/bugs/1658100>
<mup> Bug #1658100 changed: Deployment of a large bundle fails or hoggs the system <juju-core:New> <https://launchpad.net/bugs/1658100>
<mup> Bug #1658100 opened: Deployment of a large bundle fails or hoggs the system <juju-core:New> <https://launchpad.net/bugs/1658100>
<alexisb> perrito666, ping
<perrito666> alexisb: pong
<perrito666> k peeps eow
#juju-dev 2017-01-22
<thumper> anyone around who can review?
<thumper> https://github.com/juju/juju/pull/6851
<redir> looking
<redir> LGTM
<thumper> redir: ta
<redir> good find
<redir> np
<thumper> well, looking through the uploaded info for the bug
<thumper> seemed obvious that we were leaking, and just went through the codebase looking for all cases
 * redir wonders if there's a static analysis tool for that
<thumper> wallyworld: ping
<wallyworld> yo
<thumper> wallyworld: can you please enable the pre-push hook?
<thumper> wallyworld: you are committing code that doesn't pass go vet
<thumper> apiserver/remoterelations/remoterelations.go:344: and 357
<wallyworld> ok. but our landing bot hsould pick that up
<thumper> yes, but it doesn't
<wallyworld> sigh
<nurfet> guys, when juju ssh into a machine which user does it use by default?
<axw> nurfet: the "ubuntu" user
<nurfet> axw: thanks. how can I change the user during bootstraping?
<axw> nurfet: are you using the manual provider?
<nurfet> no, I am developing a new public cloud provider
#juju-dev 2018-01-15
<axw> babbageclunk: standup? it's just us tho
<babbageclunk> axw: I thought it wasn't for another hour? I'm happy to skip if you are - still just working on audit logging + and making controller config updatable.
<axw> babbageclunk: oh what, indeed
<babbageclunk> (hope you didn't get up specially!)
<axw> babbageclunk: a little bit, but I'll survive
<axw> babbageclunk: let's skip. I just worked on resource leak investigation, dqlite stuff today
<babbageclunk> ok cool - the dqlite stuff sounds pretty neat, was reading a bit about it
 * babbageclunk goes for a run in that case
<tasdomas> morning, juju, could I please get a review?
<tasdomas> https://github.com/juju/juju/pull/8262
<anastasiamac> tasdomas: i've looked at it the other day :)
<anastasiamac> tasdomas: cannot comment on python and acceptance tests
<anastasiamac> tasdomas: go code looks ok
<tasdomas> anastasiamac, thanks
<anastasiamac> tasdomas: but either way, u already have couple of approvals...
<anastasiamac> tasdomas: i was going to ask chris when he is back from holidays to look
<tasdomas> anastasiamac, yeah I thought I'd give somebody from core a chance to take a look at this
<anastasiamac> tasdomas: at acceptance tests side but i *think* u r good to go..
<tasdomas> anastasiamac, thanks
<anastasiamac> tasdomas: k, could we leave til APAC tomorrow?
<anastasiamac> tasdomas: i'll ask in our morning ;)
<tasdomas> anastasiamac, sure
<tasdomas> anastasiamac, thank you
<anastasiamac> tasdomas: tyvm for ur patience!!!
<tasdomas> np ;-]
<anastasiamac> tasdomas: just to confirm.. develop is heading for 2.4-b1.. r u k with that timeline?
<anastasiamac> oh do u need newer version of romulus earlier?..
<anastasiamac> s/oh/or
<tasdomas> anastasiamac, it would be great for this change to go out with the next release
<tasdomas> anastasiamac, when is that:?
<anastasiamac> tasdomas: next release is a point one - 2.3.2
<anastasiamac> tasdomas: it's a point release...
<anastasiamac> tasdomas: so dunno if u really want ur changes in point release.. it will come out as soon as build farms r back on, so mayb this week or next at the latest
<anastasiamac> tasdomas: if u really really really want it in point release, i.e. in 2.3.2, u'd need to re-target ur PR to 2.3 branch; land it there once u'd get a +1 from Chris; and then forward-port to develop ;)
<anastasiamac> tasdomas: (sounds daunting but m happy to help if u need it)
<anastasiamac> tasdomas: u might want to get a +1 from Tim to get this into 2.3.2 as well...
<tasdomas> anastasiamac, thanks, I think I'll do that
<anastasiamac> tasdomas: \o/
 * anastasiamac eods
#juju-dev 2018-01-16
<axw> veebers: do you have an example you can show me with the tests that fail due to lack of "virsh"? I ran all the tests yesterday locally, after removing libvirt-bin
<axw> (and they passed)
<axw> veebers: also, latest s390x run on 2.3 looks much better. thanks for picking up that error: http://qa.jujucharms.com/releases/6156/job/run-unit-tests-xenial-s390x/attempt/2313
<axw> still a few errors, but not completely hosed
<veebers> axw: can do, one moment while I grab the right url. awesome, great to see the s390x improved
<veebers> axw: you'll need the vpn connected: http://10.125.0.203:8080/job/RunUnittests-amd64/182/testReport/ (github.com/juju/juju/container/kvm.Test)
<axw> veebers: thanks
<veebers> axw, babbageclunk would it be fair to say to run the unit tests you need lxd installed and setup (i.e. you need to be in the group etc.), that's out of the scope of the test itself
<axw> veebers: I'm not sure if there are any tests that run lxd things directly. wouldn't surprise me
<axw> veebers: not sure what you mean about scope sorry
<veebers> axw: I ask because there are test failures occurring and I'm pretty sure it's due to them being run within fresh lxd containers each time now (i.e. no long running setup, installs)
<axw> veebers: IMO, if you've run "make install-dependencies", then you should be able to run the tests. I don't know what the reality is though
<veebers> axw: install-deps doesn't install lxd :-\ so if it's actually needed, the tests are assuming it's installed and set up. I'll dig further and pester you later on :-)
<axw> okey dokey
<axw> veebers: FYI: https://github.com/juju/juju/pull/8292. skips live tests when virsh is missing. I had other errors when trying to run them - pretty sure nobody is running them, since they only work when run as root
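[editor's note] The skip-when-missing pattern in that PR boils down to probing PATH for the required binary before running live tests. A minimal, hypothetical Go sketch (helper names are illustrative, not Juju's actual ones):

```go
package main

import (
	"fmt"
	"os/exec"
)

// haveTool reports whether an executable can be found on PATH.
// A live-test suite can call this during setup and skip the whole
// suite when the required binary (virsh here) is not installed,
// rather than failing on machines without libvirt.
func haveTool(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

func main() {
	if !haveTool("virsh") {
		fmt.Println("virsh not found; skipping live KVM tests")
		return
	}
	fmt.Println("virsh present; running live KVM tests")
}
```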
<axw> anastasiamac: thanks
<veebers> axw: Interesting, I'm just looking now that the unit tests in lxd containers get run as root (and changing that so it's ubuntu user instead)
<anastasiamac> axw: nws, was curious :D
<axw> veebers: sounds good, that will also fix the kvm test issue
<veebers> axw: appears it will fix the lxd test errors too
<axw> veebers: cool :)
<thumper> wpk_: ping
<tasdomas> morning
<tasdomas> anastasiamac, ping?
<anastasiamac> tasdomas: ? :D
<anastasiamac> tasdomas: i saw u landing ur PR in 2.3, so assumed that u do not need any other reviews... generally, straightforward ports r self-approved
<tasdomas> anastasiamac, does https://github.com/juju/juju/pull/8290 look alright as a forward port?
 * anastasiamac looking
<anastasiamac> tasdomas: i *think* so :) lgtm'ed
<tasdomas> anastasiamac, thanks
<kjackal> Hi, is there support for availability zones in openstack?
<axw> kjackal: yes there is. is something not working for you?
<kjackal> axw: we are supposed to use openstack as a cloud provider
<kjackal> this openstack deployment has 3 availability zones
<kjackal> is it possible to use a constraint to request certain applications to be spread across all availability zones?
<kjackal> is there a zone constraint
<kjackal> axw: ^
<axw> kjackal: all applications will (should) be spread across AZs without you having to do anything extra
<axw> i.e. "juju deploy -n 3 foo" should give you a unit in each AZ
<kjackal> ok, can I request a certain availability zone for a unit?
<axw> there is no zone constraint. there is zone *placement*, if you want to assign specific units to a zone
<kjackal> ok, where can I read about this?
<kjackal> juju docs?
<axw> kjackal: https://jujucharms.com/docs/2.3/charms-deploying looks to be the most appropriate link
<axw> kjackal: see "--to zone="
<kjackal> axw: ok, cool
<kjackal> axw: is the "--to" available from a bundle as well?
<axw> kjackal: I don't think so
<kjackal> axw: hm... it gives me an "invalid placement" error
<axw> kjackal: what command did you use?
<kjackal> axw: placement works from command line but I need it to be part of a bundle
<kjackal> so in the "to:" of the bundle I tried "zone=nova" or just "nova"
<axw> kjackal: that doesn't make for a very portable bundle. what do you think about supporting --to on the CLI for bundles, so you'd do something like "juju deploy bundle.yaml --to ubuntu/0:zone=nova --to ubuntu/1:zone=super"
<axw> kjackal: anyway, probably best to file a bug/feature request on launchpad - I'm past EOD, going to head off in a moment
<kjackal_bot> axw: thank you for your help
<kjackal_bot> yes you are right, the bundle would not be portable. It would be too esoteric
<thumper> babbageclunk: giving audit logging demo tomorrow morning here
<thumper> babbageclunk: what is the current status of the outstanding items we had on audit logging?
<thumper> ah.. 1am, you won't see this
<balloons> hml, ping
#juju-dev 2018-01-17
<axw> babbageclunk: standup?
<babbageclunk> oh sorry, omw
<axw> agprado_: I wonder how many people they had off to the side ready to take that shark down :)
<axw> seems a bit crazy to me
<thumper> morning team
<thumper> who is around?
<blahdeblah> no one here but us testers! :-)
<babbageclunk> thumper: I am around!
<babbageclunk> also
<babbageclunk> But I am a mere release automaton
<thumper> babbageclunk: I hear there were some release issues
<thumper> what is the current status?
<babbageclunk> It's progressing apace - I'm just gpg signing a windows installer
<babbageclunk> thumper: did the audit log demo go ok?
<thumper> babbageclunk: it is this morning
<babbageclunk> oh, timezones are confusing
<thumper> what was the verify upgrade problem?
<thumper> was it the proposed issue?
<babbageclunk> yup
 * thumper nods
<thumper> we should fix that
<babbageclunk> yup
<thumper> babbageclunk: quick audit log feedback
<thumper> babbageclunk: for the "who" field, can we use the tag.Id() rather than the string value?
<thumper> that would remove the user- prefix
<thumper> given that we only record information from users
<babbageclunk> Sure - I thought at the time that it was good future-proofing just in case, but it's really not needed is it.
<thumper> babbageclunk: next question
<thumper> how do you update the exclude methods?
<thumper> given that it is an array?
<babbageclunk> Well, at the moment there's no way to update it - updating controller config isn't landed yet.
<babbageclunk> But I'd envisaged overwriting it each time.
<thumper> but most of our other commands accept a single string value
<babbageclunk> They accept a yaml value
<thumper> will we read a yaml file?
<babbageclunk> Oh you mean, what's the syntax to set it?
<babbageclunk> audit-log-exclude-methods=[Facade.Method,Other.Method...]
<thumper> yeah
<thumper> I'm pretty sure there is a way to specify a file...
<thumper> other places allow something-yaml=@filename.yaml
<thumper> we should just look into that
<thumper> because it is unlikely that they will want to put it all on the command line
<babbageclunk> Not sure - I haven't changed how we specify it at all
<babbageclunk> If we accept that for other config that'll work
<thumper> I'm also wondering whether we should exclude error responses if all the errors are nil
 * thumper nods
<thumper> but I'll gather feedback at the demo today
<thumper> and let you know
<babbageclunk> My thinking there is that we could have specified a number of things, it's useful to know which ones failed.
<babbageclunk> (in a bulk operation)
<thumper> sure, but it none failed, should we write something out?
<thumper> I agree that showing which of the bulk failed is good
<thumper> but if the command succeeds completely, it is a "boring" line
<thumper> don't stress about it just yet
<thumper> I'll get feedback
<babbageclunk> I don't really like omitting the message if there's no errors - makes it hard to distinguish between a success and a truncated file.
<babbageclunk> But no biggie
<thumper> sure
<thumper> sounds reasonable
<axw> babbageclunk: your email did go through, in case you're still wondering
<babbageclunk> cool thanks - I was
<babbageclunk> anastasiamac only gets them after my bedtime so she couldn't help
<anastasiamac> babbageclunk: :( i helped a little... but thank you, axw :D
<babbageclunk> Oh, sorry anastasiamac - you were very helpful for other stuff but not for that specific delivery question!
<anastasiamac> babbageclunk: :)
<axw> wallyworld: I'm reviewing your PR from the beginning again, given all the new context. thanks for your patience
<freyes> hi, in the devel stream there is no entry for juju 2.4 -> https://streams.canonical.com/juju/tools/streams/v1/com.ubuntu.juju-devel-tools.json
<kjackal> hi, we ran into some trouble when bootstrapping a controller on openstack. We get an error "failed to get details for serverId" followed by an "Authentication response not received..."
<kjackal> how can we approach this?
<kjackal> seems it comes from openstack provider.go:1366
<hml> babbageclunk: ping
<babbageclunk> hml: hey hey
<hml> babbageclunk: do you have a few minutes?
<babbageclunk> sure
<babbageclunk> on a hangout?
<hml> babbageclunk: standup HO?
<hml> :-)
 * babbageclunk pauses the sea shanties
#juju-dev 2018-01-18
<babbageclunk> axw: do you think I need approval on a merge? If I do could you review this? https://github.com/juju/juju/pull/8300
<axw> babbageclunk: only if there were non-trivial merge conflicts IMO
<axw> babbageclunk: you said you had to fix a merge issue?
<babbageclunk> axw: yeah - there were some audit log pieces that had been missed moving it from the machine agent to the apiserver worker
<babbageclunk> axw: definitely non-trivial conflicts.
<axw> I'll take a glance over it
<babbageclunk> Thanks!
<axw> babbageclunk: LGTM
<babbageclunk> Cheers!
<kjackal_bot> Hello, we are behind a proxy and when doing juju deploy ssl authentication is failing
<kjackal_bot> is there a way to bypass this?
<hml> in our testing packages - is there an unordered version of DeepEquals?
<hml> i found jc.SameContents - but it doesn't work on maps
<anastasiamac> hml: there is a SameContents
<hml> :-)
<anastasiamac> :)
<hml> is there a map version of SameContent?
<anastasiamac> hml: i don't think so... off memory, u might need to compare length and then pull indiv keys and elements by hand to compare...
<hml>  anastasiamac okay - i was hoping there might be an easy way.  :-)
<anastasiamac> hml: most of the time, when map comparison is needed, u might b k in unit test to just have a map with one element?..
 * anastasiamac looking
<admcleod> who can help with this? "can't get info for image 'juju/bionic/s390x': not found"
<anastasiamac> hml: yeah, cannot find anything unless u want to compare by hand...
<hml> anastasiamac: thx for looking - i need to do this a few times, time to write something.  :-)
<anastasiamac> admcleod: we'd need more info than that - what substrate u r using, what is bootstrap command, etc.. mayb file a bug and someone will have a look?
<admcleod> anastasiamac: well i could do, i think its just a builder for that specific image though
<anastasiamac> admcleod: sure, but what version of juju, etc... also means of reproducing, like steps, would help
<admcleod> anastasiamac: ok - first ill make sure theres no image, if there is ill file a bug
<babbageclunk> hml: I think DeepEquals works on maps?
<babbageclunk> (Sorry, this is very late info, just happened to be reading.)
<hml> babbageclunk: it does - but my map order keeps changing - DeepEquals doesn't like that
<hml> :-/
<babbageclunk> hml: ? I don't understand - you should be able to hand DeepEquals two maps and if they have the same set of k/v pairs it will pass.
<babbageclunk> The ordering shouldn't matter.
<hml> babbageclunk: double checking
<hml> babbageclunk: order matters with DeepEquals, but not with SameContents - the trouble is that SameContents doesn't work with maps
<hml> only slices
<babbageclunk> I might be mistaken - I don't think order matters with DeepEquals if you pass it maps. The reason I say that is that SameContents is implemented by converting the values into maps and then using reflect.DeepEqual on the maps.
<babbageclunk> hml: ^
<hml> babbageclunk: getting out my magnifying glass.  :-)
<babbageclunk> I'm doubting myself now - running a toy example to check.
<hml> babbageclunk: i think i've been asking the wrong question then... so DeepEquals is matching out of order. However one of the map values is a slice, which has the same contents but different order - so failing to match that
<babbageclunk> Ohhhhh.
<hml> babbageclunk: https://pastebin.canonical.com/207917/ <-- my eye killing match trying
<hml> babbageclunk: or eye crossings
<babbageclunk> Stink - you want a DeepEquals-alike that does SameContents on the children.
<babbageclunk> let's talk after team meeting!
#juju-dev 2018-01-19
<jam> hi all
#juju-dev 2018-01-21
<rogpeppe> jam: i just created a couple of PRs you might be interested in - something I've been wanting to do for years: https://github.com/juju/juju/pull/8304 https://github.com/juju/juju/pull/8305
