#juju-dev 2012-10-15
<davecheney> fwereade: thank you for your review
<fwereade> davecheney, np, the tweaks should be straightforward :)
<davecheney> fwereade: already done
<fwereade> davecheney, cool
<davecheney> fwereade: could I press you to expand on this statement
<davecheney> FWIW, I maintain that user-config is not the same thing as internal-config, and
<davecheney> trying to pretend that they are causes us nothing but pain. Not sure that topic
<davecheney> is really open for discussion, though...
<davecheney> i too feel the sting of too many battles over configuration
<fwereade> davecheney, I was trying to argue for it when we were originally having the config, and lost... but I'm still not quite sure *why* I lost, and so I can't figure out whether this situation qualifies as a reason to reopen the discussion (and indeed I'm not really sure whether I even want to...)
<davecheney> fwereade: so this is a difference between the environments configuration (passwords and shit that the agents need)
<davecheney> vs the pure minimum of config as a connection string
<fwereade> davecheney, I hadn't looked at it exactly that way -- in my eyes the problem is simply that we're trying to make a single parser do different things in different contexts
<davecheney> fwereade: well, IMO the wrinkle is not in the parser
<davecheney> but the way BootstrapConfig fucks with the config, then tries to parse it again
<TheMue> morning
<davecheney> hello
<fwereade> davecheney, ISTM that the only reason we're messing around with the config is so that we can pretend that the two situations are the same
 * davecheney nods
<fwereade> davecheney, and that that is the cause of the ugliness you reference
<fwereade> davecheney, so yeah, not exactly the parser itself, but a decision made in the parser's intellectual scope, if you like
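The split davecheney is arguing for could be sketched like this (hypothetical types, not juju's actual ones): derive the agent-side config from the user-side config once, rather than mutating one map and parsing it a second time as BootstrapConfig does:

```go
package main

import "fmt"

// UserConfig is what a person writes in environments.yaml.
// Illustrative shape only.
type UserConfig struct {
	Name        string
	AdminSecret string // meaningful only to the client
}

// AgentConfig is the pure minimum an agent needs -- effectively a
// connection string -- derived once from UserConfig at bootstrap time.
type AgentConfig struct {
	Name      string
	StateAddr string
}

// DeriveAgentConfig converts user config to agent config explicitly,
// instead of rewriting a shared map and re-parsing it.
func DeriveAgentConfig(u UserConfig, stateAddr string) AgentConfig {
	return AgentConfig{Name: u.Name, StateAddr: stateAddr}
}

func main() {
	u := UserConfig{Name: "sample", AdminSecret: "sekrit"}
	a := DeriveAgentConfig(u, "localhost:37017")
	// AgentConfig has no secret field, so it cannot leak to the agents.
	fmt.Println(a.Name, a.StateAddr)
}
```

The point of the two distinct types is that the compiler, not a parser re-run, enforces which fields exist on each side.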
<fwereade> TheMue, heyhey
<TheMue> fwereade: heya, just reading your review
<fwereade> TheMue, probably all these things were covered in a prereq, but I didn't see that I'm afraid
<TheMue> fwereade: no, they are not
<TheMue> fwereade: one simply is a typo (c'n'p error) - the "dummy" in the ec2 config. i'll change it
<TheMue> fwereade: the "default" is still in discussion and may be changed later. but i still prefer the word "default", so that one explicitly states in the config that they want the default behavior, instead of just leaving it empty
<TheMue> fwereade: could you please describe your question regarding the change of the mode more? do you mean a change at runtime?
<fwereade> TheMue, feels wrong to me to have 2 names for the same state, but mileages may reasonably vary I guess
<fwereade> TheMue, yeah
<fwereade> TheMue, there's handling somewhere for things like attempts to change the env type, we can probably just do something similar there
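The kind of check fwereade alludes to might look like this sketch (illustrative names and keys; juju's real validation lives elsewhere): reject changes to attributes that must stay fixed after bootstrap:

```go
package main

import "fmt"

// immutableKeys lists attributes that may not change once the
// environment exists; "firewall-mode" here is an assumption for
// illustration, alongside the env "type" mentioned above.
var immutableKeys = []string{"type", "firewall-mode"}

// validateChange rejects any attempt to alter an immutable attribute.
func validateChange(oldAttrs, newAttrs map[string]string) error {
	for _, key := range immutableKeys {
		if oldAttrs[key] != newAttrs[key] {
			return fmt.Errorf("cannot change %s from %q to %q",
				key, oldAttrs[key], newAttrs[key])
		}
	}
	return nil
}

func main() {
	oldAttrs := map[string]string{"type": "ec2", "firewall-mode": "instance"}
	newAttrs := map[string]string{"type": "ec2", "firewall-mode": "global"}
	fmt.Println(validateChange(oldAttrs, newAttrs))
}
```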
<TheMue> fwereade: we don't have two names, we only have a default mode if somebody writes nothing. but how would you otherwise explicitly define that you want the default mode?
<fwereade> TheMue, honestly I don't think this should be user-configurable at all
<fwereade> TheMue, it feels like we're repeating the placement mistake
<TheMue> fwereade: environments like ec2 have different modes, later maybe also a third one (one group per service). how to control that?
<fwereade> TheMue, ask the environ what it prefers?
<TheMue> fwereade: the environment prefers nothing, it depends on your cloud size and services
<fwereade> TheMue, in that case, surely, it shouldn't have a default value: it's an important decision that we should require an explicit value for
<fwereade> TheMue, but... well, look, we'll be dealing with a whole bunch of environments soon enough
<fwereade> TheMue, some of them have no concept of security groups in the first place, eg MAAS (as far as I know, not very up to date with progress)
<fwereade> TheMue, for those envs we will actually have to implement a local firewaller in every machine agent anyway, I think
<TheMue> fwereade: yes, that's why niemeyer wanted a default, because you today can't tell if instance and global are ok for each environment. default is start, more modes to come.
<TheMue> fwereade: exactly
<TheMue> fwereade: but more important is your thought about changing it at runtime
<fwereade> TheMue, ok, if we implement local per-machine firewalling with iptables or something
<fwereade> TheMue, why would we need "global" or "instance" modes at all?
<fwereade> <fwereade> TheMue, ok, if we implement local per-machine firewalling with iptables or something
<fwereade>  TheMue, why would we need "global" or "instance" modes at all?
<fwereade> TheMue, envs like MAAS will require that we add local firewalling, I think
<fwereade> TheMue_ ^^
<TheMue_> fwereade: yes, but other envs maybe not
<fwereade> TheMue_, if we have local firewalling implemented for just one env we can just do that everywhere
<TheMue_> fwereade: niemeyer wanted to keep it more abstract, so we later can support multiple envs with the same configuration mechanism
<fwereade> TheMue_, and just have a thin layer of the global handling in the environments that need it on top of that
<TheMue_> fwereade: you now change the whole firewalling concept
<TheMue_> fwereade: please let's do it step by step
<fwereade> TheMue_, in that situation global-firewalling is just Wrong
<fwereade> TheMue_, how so? AIUI I merely bring up what everyone has always known had to be done from the beginning
<fwereade> TheMue, I'm not demanding specific action :)
<fwereade> TheMue, but I am worried that we're adding a piece of totally transitory dev-relevant-only config
<fwereade> TheMue, that we will end up releasing and have to go through all the tediousness of deprecating
<TheMue> fwereade: what's wrong with security groups for the ec2 env?
<TheMue> fwereade: especially, if the user configures abstract modes like instance or global, the change from sec groups to iptables is transparent for the user
<fwereade> TheMue, they scale like shit, and ISTM that the global firewaller provides so flimsy a veil of kinda-security that , while it may allow us to scale, will not be used in any serious deployments
<fwereade> TheMue, what is the purpose of the global firewaller if not to paper over the security group issues?
<TheMue> fwereade: using them today lets us start, while a change to a different implementation later is transparent for the user. so we can release early, meeting today's demands, but still have a comfortable solution for the future
<fwereade> TheMue, today's demands are "match python"; I thought python had no global firewaller
<fwereade> TheMue, the most transparent thing is to do the best we can for the user given our capabilities
<fwereade> TheMue, if we want to target, right, now, users who want either scalability or security, but not both, then I guess we have no choice
<Guest2629> TheMue, fwereade, davecheney: mornin'!
<TheMue> fwereade: afaik it's just a simple temporary solution which later is easy to change. how fast could you set up iptables inclusive testing?
<TheMue> Guest2629: morning
<fwereade> TheMue, but if we're asking them to make a choice like *that* I think we should be upfront about it and have no default
<fwereade> rogpeppe, heyhey
<TheMue> fwereade: could you add this with your reasons to the review so that we could discuss it with niemeyer later. i'm not good as a man in the middle.
<fwereade> TheMue, clearly the decision to write the global firewaller has already been made, but when I review the code I lack the context behind the decisions and need to ask questions
<TheMue> fwereade: yes, for sure
<fwereade> TheMue, I will expend my review, thanks
<TheMue> fwereade: but i'm not very good in repeating your questions later to niemeyer. it's better when you ask him too
<fwereade> er extend
<fwereade> TheMue, absolutely, np :)
<TheMue> fwereade: as long as this topic is in move changes are more simple than later
<rogpeppe> davecheney: ping
<davecheney> rogpeppe: ack
<rogpeppe> davecheney: i see your issue with my admin-secret fix, but i think that moving the admin-secret check earlier in the Bootstrap function would fix that.
<davecheney> rogpeppe: fair call
<davecheney> what you got, ie the blowup at cloud init level
<rogpeppe> davecheney: admin-secret can't be a required field in the config, unfortunately AFAICS
<davecheney> was where i started
<davecheney> i got most of the way with making admin-secret required
<davecheney> but BootstrapConfig screwed me
<rogpeppe> davecheney: the fundamental issue is that we don't want to push admin-secret into the state.
<davecheney> rogpeppe: what about just overwriting the secret with 'your secret isn't here'
<davecheney> config.New would be happy
<davecheney> and we _know_ we're never putting the real value into the state
<rogpeppe> davecheney: i don't see how that's better than just not requiring the admin secret.
<davecheney> rogpeppe: i would argue the opposite
<davecheney> but not strenuously
<rogpeppe> davecheney: it's like we're saying, "yeah it's required, but only means something in some cases"
<davecheney> rogpeppe: can't we leave it out of config/config
<davecheney> and apply it at environs/ec2/config.go
<davecheney> it is already in there
<davecheney> but specified as schema.Omit
<rogpeppe> davecheney: no
<rogpeppe> davecheney: because we need to be able to create an environment without admin-secret, i think
<rogpeppe> davecheney: or... maybe that is a possibility, let me think
<davecheney> rogpeppe: yeah, can't we do that just by changing environs/ec2/config.go ?
<davecheney> rogpeppe: it would make admin-secret an environ specific value
<davecheney> unknownattrs, or whatever
<rogpeppe> davecheney: no, we do need to be able to create an environment without admin-secret
<davecheney> rogpeppe: i don't understand that sentence
<davecheney> an environment == ec2 ?
<rogpeppe> davecheney: environs == environs.Environ
<rogpeppe> s/environs/environment/
<davecheney> yeah, that is easy
<davecheney> remove the admin-secret atom from environs/config/config.go
<rogpeppe> davecheney: but can we then open an ec2 environment with that config?
<davecheney> rogpeppe: depends if it is a valid ec2 config
<rogpeppe> davecheney: i thought you were suggesting that the ec2 config required admin-secret
<davecheney> rogpeppe: indeed i am
<rogpeppe> davecheney: so when an agent opens the environment, it must make up an admin secret?
<davecheney> rogpeppe: NFI
<davecheney> i dont know how it works on the agents
<rogpeppe> davecheney: it works ok. agents don't need admin-secret
<rogpeppe> davecheney: only the client has the admin secret
<davecheney> yeah, they would use environs/config/config.go
<davecheney> which only requires a generic config
<davecheney> so removing admin-secret from e/c/config.go is a valid suggestion
<rogpeppe> davecheney: they've still got to get it through environs/ec2/config.go, no?
<davecheney> rogpeppe: don't they do that by getting the provider, then the provider can validate the config ?
<rogpeppe> davecheney: i'm not sure i understand. where are you suggesting we make the check for admin-secret's presence?
<davecheney> environs/ec2/config.go
<davecheney> it is already there
<davecheney> but is optional atm
<rogpeppe> davecheney: so what happens when we don't have an admin secret?
<davecheney> not a valid config
<rogpeppe> davecheney: then how can an agent create an Environ?
<davecheney> well, you got me there
<davecheney> i guess they are screwed
<davecheney> and we are back to sq one
<rogpeppe> davecheney: that's the reason i decided that the best place to put the check was where admin-secret is actually needed - in Bootstrap
<davecheney> rogpeppe: fair enough
<davecheney> best of a bad situation
<davecheney> oh
<rogpeppe> davecheney: yeah. i'm very happy to entertain other possibilities, but i haven't seen a better one yet
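rogpeppe's placement could be sketched roughly as follows (hypothetical types and names, not juju's actual API): check the secret at the top of Bootstrap, before any provider state is written, so a failure leaves nothing that needs destroy-environment:

```go
package main

import (
	"errors"
	"fmt"
)

// environConfig stands in for the real config type.
type environConfig struct {
	attrs map[string]string
}

func (c environConfig) AdminSecret() string { return c.attrs["admin-secret"] }

var errNoAdminSecret = errors.New("environment has no admin-secret")

// bootstrap validates admin-secret before touching any provider
// resources, so a missing secret fails cleanly up front.
func bootstrap(cfg environConfig, uploadState func() error) error {
	if cfg.AdminSecret() == "" {
		return errNoAdminSecret
	}
	return uploadState()
}

func main() {
	cfg := environConfig{attrs: map[string]string{}}
	err := bootstrap(cfg, func() error { return nil })
	fmt.Println(err) // fails before any state would be written
}
```

This keeps admin-secret optional at the config level (agents can still open the environment) while making it mandatory at the one point it is actually needed.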
<davecheney> what about if admin secret in ec2 worked the same way as authorised-keys in e/c/config.go ?
<davecheney> ie, if it were blank, the values are sourced from somewhere else
<davecheney> (else == wherever the agents stash their secret value)
<rogpeppe> davecheney: so you can't stash your secret in environments.yaml?
<davecheney> rogpeppe: i believe authorised-keys has a fallback mech
<davecheney> yaml first, disk second
<rogpeppe> davecheney: yeah it does.
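That fallback pattern, sketched with illustrative names (not the actual juju function), looks roughly like: use the yaml value if present, else read a file on disk:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// fallbackValue sketches the authorised-keys-style mechanism being
// discussed: prefer the value written in environments.yaml; if it is
// blank, fall back to a file on disk.
func fallbackValue(fromYAML, path string) (string, error) {
	if fromYAML != "" {
		return fromYAML, nil
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(data)), nil
}

func main() {
	f, err := os.CreateTemp("", "secret")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	f.WriteString("on-disk-secret\n")
	f.Close()

	v, _ := fallbackValue("yaml-secret", f.Name())
	fmt.Println(v) // the yaml value wins when present
	v, _ = fallbackValue("", f.Name())
	fmt.Println(v) // otherwise the on-disk value is used
}
```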
<rogpeppe> davecheney: but... how does this help?
<davecheney> does it solve the problem of e/ec2/config.go being able to require admin-secret, but allow it to come from several locations?
<rogpeppe> davecheney: i think we've got a fundamental issue here - in some places in the system, we don't *have* an admin secret
<davecheney> agreed
<rogpeppe> davecheney: ... whether it's coming from another location or not
<davecheney> i think this is the point where fwereade says he told us so
<rogpeppe> davecheney: i agree with fwereade's point, but i don't mind this solution *too* much
<davecheney> rogpeppe: although, it does bail out after writing the data to the s3 store
<davecheney> which means you have to use juju destroy-environment before trying again
<rogpeppe> davecheney: yeah, that's a bug
<rogpeppe> davecheney: as it suggested earlier, the check should come further up
<rogpeppe> s/it/i
<rogpeppe> davecheney: i'll fix that
<davecheney> sweet
<davecheney> afk for a while
<rogpeppe> davecheney: afk?
<fwereade> rogpeppe, away from keyboard
<rogpeppe> fwereade: thanks
<TheMue> fwereade: thx for your notes. both sound very good to me.
<fwereade> TheMue, cool, cheers :)
<Aram> moin.
<TheMue> Aram: moin moin
<TheMue> *: Always the same question. *sigh* When having a chain of CLs which shall be updated to trunk, then after merge only commit and push, or also lbox propose of already submitted prerequisites?
<TheMue> rogpeppe, fwereade: any hint ^^?
<fwereade> TheMue, I generally don't repropose unless there were conflicts
<TheMue> fwereade: so only commit and push? i twice got it wrong in the beginning and had too much code in the reviews
<fwereade> TheMue, it seems to work ok for me
<fwereade> TheMue, I only get too much code when I forget a -req
<TheMue> fwereade: thx, i'll make a note to keep it in mind next time. ;)
<Aram> TheMue: no need to repropose before submitting.
<Aram> just make sure you merge trunk.
<Aram> so you merge the prereq
<niemeyer> Morning jujuers!
<fwereade> niemeyer, heyhey
<niemeyer> fwereade: Heya
<niemeyer> fwereade: Good weekend?
<fwereade> niemeyer, very nice thanks -- saw cath's cousins and their new baby :)
<niemeyer> fwereade: Oh, nice
<fwereade> niemeyer, and also stacked up a few reviews which will, I hope, lead to interesting conversations if not necessarily immediate progress
<niemeyer> fwereade: /me is all sensitive about "new babies" :-)
<fwereade> niemeyer, he's about 4 months old and smiley and dribbly and cute
<niemeyer> fwereade: LOL
<TheMue> niemeyer: morning and thx for your comment
<niemeyer> TheMue: Heya, np
 * TheMue 's small baby - she is now 16 yrs old ;) - has her first archery tournament this afternoon. we'll all watch her.
<niemeyer> TheMue: Wow, very nice. Good luck there!
<fss> niemeyer: morning :-)
<niemeyer> fss: Heya
<TheMue> niemeyer: thank you, I will tell her when she's back from school. just had to watch our older one leaving home with her driving teacher. one of the last lessons before driving test.
<fwereade> late lunch, bbiab
<niemeyer> fwereade: Enjoy
<niemeyer> fwereade: https://codereview.appspot.com/6651060/ is waiting for you
<fwereade> niemeyer, cheers
<niemeyer> Aram: ping
<fss> niemeyer: ping
<niemeyer> fss: hi
<fss> did you see that iamtest cl?
<niemeyer> fss: Still churning CLs pushed over the holiday and/or weekend
<fss> niemeyer: oh, right :-) please let me know if I can help with anything
<niemeyer> fss: Cheers!
<niemeyer> fwereade_: ping
<fwereade_> niemeyer, pong
<niemeyer> fwereade_: Yo
<niemeyer> fwereade_: Just started looking at relation-lifecycles
<fwereade_> niemeyer, oh yes
<niemeyer> fwereade_: The N=100k indeed feels troublesome there
<fwereade_> niemeyer, does my wolly justification for an end-run around txn hold any water?
<fwereade_> s/wolly/wooly/
<niemeyer> fwereade_: I'm thinking too.. in principle it sounds doable
<niemeyer> fwereade_: But even better would be to have these settings collected in a more distributed way
<niemeyer> fwereade_: Even if we do RemoveAll, 100k is still a lot
<fwereade_> niemeyer, not really sure where it makes sense to distribute it to -- a CA that just does a bit of work every few seconds might do the trick I guess :)
<niemeyer> fwereade_: Yeah, but that sounds worryingly hand-wavy
<niemeyer> fwereade_: We've decided to leave settings around, and that seems to work well, but is there any real point where we can say settings for a unit leaving aren't relevant anymore?
<fwereade_> niemeyer, I don't really think we can at the unit level, can we? We could make subordinate relations be cleaned up incrementally by their own units
<fwereade_> niemeyer, but that doesn't help with the big ugly case
<niemeyer> fwereade_: Right
<fwereade_> niemeyer, the CA case doesn't seem too bad to me though -- it's just one extra op to create a document with the relation's ID (or I guess just directly its scope key prefix)
<fwereade_> niemeyer, the CA then just removes a few matching docs at a time until it runs out, and then deletes the "droppings" doc
<niemeyer> fwereade_: The extra doc for?
<fwereade_> niemeyer, or adds it into an existing doc, sorry, haven't considered tradeoffs here
<niemeyer> fwereade_: Ah, I see
<fwereade_> niemeyer, brb phone
<niemeyer> fwereade_: Cool, thinking meanwhile
<niemeyer> fwereade_: Yeah, you're right.. the CA may be a good idea after all
<fwereade_> niemeyer, however I'd prefer not to switch focus to that just now
<fwereade_> niemeyer, do you think that an interim solution of either monster-transaction, or remove-all-settings-before-transaction, would be plausible?
<niemeyer> fwereade_: Not greatly comfortable with either
 * fwereade_ is crestfallen
 * niemeyer looks in the dictionary to see what fwereade_ is
<fwereade_> niemeyer, how about deleting no settings, but creating a droppings doc for future use?
 * niemeyer laughs at the "-- said of a horse." part of the explanation
 * fwereade_ doesn't know which dictionary niemeyer has
<niemeyer> fwereade_: That was from gcide
<fwereade_> niemeyer, ah nice :)
<niemeyer> fwereade_: Sounds good regarding injecting doc into cleanups collection
<fwereade_> niemeyer, great
<fwereade_> niemeyer, and that's a nicer name than droppings
<niemeyer> fwereade_: {Id: bson.ObjectId, Type: "settings", Prefix: "r#..."}
<fwereade_> niemeyer, LGTM
<niemeyer> fwereade_: /Type/Kind/ perhaps
<fwereade_> niemeyer, +1
<niemeyer> fwereade_: We'll need the CA soon enough anyway
<niemeyer> fwereade_: To recover lost transactions
<fwereade_> niemeyer, indeed
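The cleanup-doc idea agreed above could be sketched like this (an in-memory stand-in for the mgo collection; field names follow niemeyer's sketch, but the code itself is illustrative): the worker removes a few matching settings per pass until none remain, then drops the cleanup doc:

```go
package main

import (
	"fmt"
	"strings"
)

// cleanupDoc mirrors the shape discussed: a record telling a future
// cleanup worker which settings documents to remove. Id would be a
// bson.ObjectId with mgo; a string stands in here.
type cleanupDoc struct {
	Id     string
	Kind   string // e.g. "settings"
	Prefix string // e.g. "r#42#" -- the relation's scope key prefix
}

// runCleanup removes at most batch matching keys per call and reports
// how many were removed; the worker keeps calling until it gets 0,
// then deletes the cleanup doc itself.
func runCleanup(store map[string]string, doc cleanupDoc, batch int) int {
	removed := 0
	for key := range store {
		if removed == batch {
			break
		}
		if strings.HasPrefix(key, doc.Prefix) {
			delete(store, key)
			removed++
		}
	}
	return removed
}

func main() {
	store := map[string]string{
		"r#42#u/0": "{}", "r#42#u/1": "{}", "r#7#u/0": "{}",
	}
	doc := cleanupDoc{Id: "1", Kind: "settings", Prefix: "r#42#"}
	for runCleanup(store, doc, 1) > 0 {
	}
	fmt.Println(len(store)) // only the unrelated r#7 settings remain: 1
}
```

Creating the relation only has to insert one such doc in the removal transaction, which is why it avoids the 100k-operation problem.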
<niemeyer> I'll step out for lunch.. back in a bit to continue on reviews
<fwereade_> wrtp, Aram, niemeyer: is http://paste.ubuntu.com/1281389/ expected/familiar? just merged...
<wrtp> fwereade_: i haven't seen that before
<wrtp> fwereade_: ISTM that's probably a consequence of recent fw changes, renaming "default" to ""
<fwereade_> wrtp, yeah, I guess we will see a fix tomorrow
<niemeyer> I guess no cloud consistency call today..
<niemeyer> fwereade_: Can you paste the link again please?
<fwereade_> niemeyer, http://paste.ubuntu.com/1281389/
<niemeyer> fwereade_: Cheers
<niemeyer> fwereade_: I have a try at fixing it quickly
<niemeyer> So we don't stick with a broken trunk
<niemeyer> robbiew: Cloud consistency call today?
<robbiew> niemeyer: no idea...I know I won't be attending (at ODS)
<robbiew> niemeyer: I would say "no"
<robbiew> ...as I don't know who will be around to attend
<niemeyer> robbiew: Cheers
<robbiew> ;)
<wrtp> i'm off for the day. see y'all tomorrow.
<niemeyer> wrtp: Enjoy the evening
<niemeyer> fwereade: https://codereview.appspot.com/6699044 .. nothing exciting, luckily
<fwereade_> niemeyer, ping
<fwereade_> niemeyer, re service-lifecycles EnterScope txns -- I suspect the logic *is* faulty, but the mistake is not, I think, in the inability of a unit to re-enter a scope -- I think that is a feature
<fwereade_> niemeyer, thoughts?
<niemeyer> fwereade_: Yo
<fwereade_> niemeyer, heyhey
<niemeyer> fwereade_: In principle I don't see a great reason to allow it, but at the same time it feels bad to have faulty logic that doesn't really bring any benefit and misbehaves in such cases
<fwereade_> niemeyer, to expand: when a unit leaves the scope, it has run the relation-broken hook and represented to the charm that its participation in the relation is entirely done
<niemeyer> fwereade_: Sure, but in theory it could happily re-enter the scope by reporting to the charm that it's entering again
<fwereade> niemeyer, I am 100% with you on the faulty logic, I will rework and explain nicely, unless you feel there is something deeply wrong with the approach -- I think that it is possible to determine the various situations cleanly
<fwereade> niemeyer, which feeds into ErrScopeDying
<fwereade> niemeyer, which should maybe be ErrScopeClosed -- indicating that either the unit has already left and can't come back, or just that the scope is not open to new members...
<fwereade> niemeyer, ...but in either case, the correct response for the uniter is to ignore the error and all further references to that relation
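The uniter-side handling fwereade describes might look like this sketch (hypothetical API and sentinel name, not juju's actual one): a closed scope is not an error to surface, just a signal to forget the relation:

```go
package main

import (
	"errors"
	"fmt"
)

// ErrScopeClosed is the sentinel discussed above: the scope is not
// open to new members, whether the unit already left or the relation
// is going away.
var ErrScopeClosed = errors.New("relation scope closed")

// enterScope stands in for the state-side operation.
func enterScope(open bool) error {
	if !open {
		return ErrScopeClosed
	}
	return nil
}

// react shows the uniter-side response: treat a closed scope as an
// instruction to drop all further references to the relation.
func react(err error) string {
	switch {
	case err == nil:
		return "joined"
	case errors.Is(err, ErrScopeClosed):
		return "forget relation"
	default:
		return "error"
	}
}

func main() {
	fmt.Println(react(enterScope(false)))
}
```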
<niemeyer> fwereade: Is there anything that misbehaves today if we enter the scope again, as far as the the state/watchers/etc are concerned?
<fwereade> niemeyer, I don't think so, no
<niemeyer> fwereade: So I don't see yet the motivation to be actively preventing the unit from entering the scope. It feels like a property we can easily explore for good reasons in the future, and don't have to think much right now
<fwereade> niemeyer, the intent is that a unit "re-entering" a scope should basically do nothing except possibly update the settings if the private address has changed
<fwereade> niemeyer, that's from the API perspective
<niemeyer> fwereade: Yep, sounds fine
<niemeyer> fwereade: I'm talking about leaving and entering, though
<niemeyer> fwereade: I don't see a reason to actively disallow it
<fwereade_> niemeyer, gaah, sorry
<fwereade_> niemeyer, I saw/said:
<fwereade_> <niemeyer> fwereade: Yep, sounds fine
<fwereade_> <fwereade> niemeyer, from the model perspective I think it is sensible to make this part of the model agree with the part which thinks that relation-broken is the dividing line between "in the relation" and "left, never ever coming back"
<fwereade_> niemeyer, I would favour consistency for the time being and consider experimentation later
<niemeyer> fwereade> niemeyer, that's from the API perspective
<niemeyer> <niemeyer> fwereade: Yep, sounds fine
<niemeyer> <niemeyer> fwereade: I'm talking about leaving and entering, though
<niemeyer> <niemeyer> fwereade: I don't see a reason to actively disallow it
<niemeyer> fwereade_: I don't see the reason to build into the model that "left, never ever coming back"
<fwereade_> niemeyer, I understand that to be the meaning of the relation-broken hook
<fwereade_> niemeyer, I had a feeling we did state somewhere that it would be called once per relation and it would be the last thing called
<fwereade_> niemeyer, if that is the case (er, I'll see if I can find anywhere...) then I think it's best to be consistent
<niemeyer> fwereade_: We're getting into shady areas
<niemeyer> fwereade_: We never discussed what happens when you start a unit in a different machine, for example
<niemeyer> fwereade_: Do we Leave and then Join from the other machine? Etc
<niemeyer> fwereade_: I'm not suggesting we should specify any of that right now, though
<niemeyer> fwereade_: I'm just suggesting we don't start hardcoding unnecessary behavior that closes doors we haven't talked about yet
<fwereade_> niemeyer, I shall meditate upon this :)
<niemeyer> fwereade_: Hehe :)
<fwereade_> niemeyer, hm, actually, on https://juju.ubuntu.com/docs/charm.html it says "Runs when a relation which had at least one other relation hook run for it (successfully or not) is now unavailable"
<fwereade_> niemeyer, that is not really a good description IMO but it does if anything carry connotations of potentially being rerunnable
<niemeyer> fwereade_: Overall, the lifecycle branches look like going in a pretty good direction.. the leave scope one has a few things we'll have to discuss too to make sure we understand what's up, but it's all getting together nicely
<fwereade_> niemeyer, yeah, I thought there'd probably be a bit of discussion for each of them
<fwereade_> niemeyer, the one that sits on top of them all does make me happy, though
<niemeyer> fwereade_: It looks like a surprisingly small amount of logic, thankfully
<fwereade_> niemeyer, yeah, it's not too bad
<fwereade_> niemeyer, there may be a simplification waiting to get out but I haven't got round to implementing it against that tree yet so it could yet surprise me
<niemeyer> fwereade_: Cool
<niemeyer> fwereade_: I'll step out to do some end-of-afternoon errands.. back later
#juju-dev 2012-10-16
<davecheney> who is supposed to call unit.UnassignFromMachine() ?
<davecheney> is it the UA on the way out the door ?
<davecheney> the MA on observing the death of the UA ?
<davecheney> is juju remove-unit ?
<niemeyer> davecheney: Nobody calls it at the moment.. the unit dies and should be removed
<niemeyer> davecheney: And morning!
<davecheney> hey hey
<davecheney> for the moment, i stuck it in juju remove-unit
<davecheney> but that may not be correct
<davecheney> niemeyer: correct my logic: the MA is responsible for removing units that have reached dead from the state
<niemeyer> davecheney: Right.. remove-unit should not unassign or remove
<davecheney> niemeyer: right, then we have a problem
<davecheney> niemeyer: https://bugs.launchpad.net/juju-core/+bug/1067127
<niemeyer> davecheney: Not entirely surprising
<niemeyer> davecheney: We're just now getting to the end of the watcher support for lifecycle, unit dying, etc
<davecheney> niemeyer: cool
<davecheney> so, i can add unassignfrommachine to remove-unit
<niemeyer> davecheney: That said, it would be useful to understand what's the missing spot there
<niemeyer> davecheney: I bet it's something simple
<niemeyer> davecheney: But I can't tell if it's on anyone's plate yet
<niemeyer> davecheney: No, that's not the way to go
<davecheney> niemeyer: lets talk about it this evening
<niemeyer> davecheney: The unit doesn't have to be unassigned, ever
<niemeyer> davecheney: Because it's being removed with the assignment
<davecheney> niemeyer: can you say that another way
<davecheney> niemeyer: currently we have two actions, EnsureDying and UnassignFromMachine
<niemeyer> davecheney: The machiner should remove the unit once it's dead
<niemeyer> davecheney: That's probably the missing link
<davecheney> niemeyer: i agree
<niemeyer> davecheney: So there's no point in unassigning it
 * davecheney reads worker/uniter
<davecheney> worker/machiner
<niemeyer> davecheney: It should be removed, and then its assignment is gone, whatever it was
<davecheney> i agree, the machiner should be responsible for calling Unassign
<davecheney> it is a Machine after all :)
<niemeyer> davecheney: Nobody has to call unassign :)
<davecheney> niemeyer: well, then there is a bug
<davecheney> see above
<davecheney> when you say 'nobody has to call unassign'
<davecheney> you mean, no person, ie, nobody typing juju remove-unit ?
<niemeyer> <niemeyer> davecheney: It should be removed, and then its assignment is gone, whatever it was
<niemeyer> <niemeyer> davecheney: The machiner should remove the unit once it's dead
<niemeyer> <niemeyer> davecheney: That's probably the missing link
<davecheney> niemeyer: ok, thanks, understood
<niemeyer> davecheney: No unassignment in the picture
<wrtp> davecheney, fwereade: morning!
<fwereade> wrtp, davecheney, heyhey
<davecheney> morning
<davecheney> 70 working charms
<davecheney> whoo hoo!
<TheMue> morning
<TheMue> davecheney: cheers
<TheMue> davecheney: those are very good news
<fwereade> TheMue, morning
<TheMue> fwereade: hiya
<wrtp> fwereade: i'm not sure about the idea of making all hooks in a container mutually exclusive
<fwereade> wrtp, go on
<wrtp> fwereade: it seems a bit like unwarranted interaction between independent charms
<wrtp> fwereade: for instance, one charm might be very slow to execute certain hooks, making another one slow to react
<wrtp> fwereade: in fact, if one charm's hook hangs up for a while, it would lock out all other charms in the same container, which seems... dubious
<wrtp> fwereade: isn't apt-get supposed to work if run concurrently with itself?
<fwereade> wrtp, you don't think that, say, a subordinate might try to make concurrent tweaks to a setting file being changed by the principal?
<wrtp> fwereade: i think that would be extremely dodgy behaviour
<wrtp> fwereade: just because it runs in the same container doesn't mean a subordinate has a right to delve into the inner workings of its principal
<wrtp> fwereade: if it's a setting file not in the charm directory, then it's not so dodgy, but it's fundamentally flawed if there's no locking, because the principal might not be making the change in a hook context.
<fwereade> wrtp, yeah I'd been thinking of the second case
<fwereade> wrtp, but please expand on the changes that might be made outside a hook context
<wrtp> fwereade: it's perfectly reasonable that a charm might trigger some changes in a hook that don't execute synchronously with the hook
<wrtp> fwereade: for instance, it might have started a local server that manages some changes for it, and the hook might just be telling that server about the changes.
<wrtp> fwereade: that's an implementation detail, and not a technique we should preclude
<wrtp> fwereade: my view is that charms should be views as independent concurrent entities
<fwereade> wrtp, I dunno, I still have a strong instinct that the Right Thing is to explicitly declare that all activity outside either a hook context or an error state (in which you're expected to ssh in) is unsafe
<wrtp> s/views/viewed/
<wrtp> fwereade: huh?
<wrtp> fwereade: so you can't run a server?
<wrtp> fwereade: isn't that kinda the whole point?
<fwereade> wrtp, there is juju activity and service activity
<fwereade> wrtp, the service does what it does, regardless of juju
<wrtp> fwereade: it seemed to me like you were talking about the subordinate mucking with service setting files
<fwereade> wrtp, yes, but not stuff inside the charm directory... the settings of the actual service
<wrtp> fwereade: ok, so that's service activity, right?
<fwereade> wrtp, ensuring a logging section has particular content, or something
<fwereade> wrtp, that is juju activity... acting on the service's config
<wrtp> fwereade: i think it's a grey area
<fwereade> wrtp, IME it is rare for services to wantonly change their own config files at arbitrary times
<wrtp> fwereade: "rare" isn't good enough.
<wrtp> fwereade: we want charms to be able to work with one another regardless of how they're implemented
<wrtp> fwereade: and it seems to me like it's perfectly reasonable for a charm to start a service which happens to manage another service.
<fwereade> wrtp, how does allowing parallel hook execution do anything except make it harder for charms to work reliably together?
<wrtp> fwereade: it means the failure of one charm is independent on the failure of another
<wrtp> s/on the/of the/
<wrtp> fwereade: and in that sense, it makes it easier for charms to work reliably together
<fwereade> wrtp, sorry, I may be slow today: how is hook failure relevant?
<fwereade> wrtp, having them execute in parallel makes it *more* likely that hooks will fail due to inappropriately parallelised operations
<wrtp> fwereade: if i write a command in a hook that happens to hang for a long time (say 15 minutes, trying to download something), then that should not block out any other charms
<wrtp> fwereade: i think that if you write a subordinate charm, it's your responsibility to make it work correctly when other things are executing concurrently.
<fwereade> wrtp, and if you write a principal charm, it's also your responsibility to know everything that every subordinate charm might do so you can implement your side of the locking correctly?
<wrtp> fwereade: no. i think that kind of subordinate behaviour is... insubordinate :-)
<wrtp> fwereade: i think that we should not think of subordinates as ways to muck directly with the operations of other charms.
<wrtp> fwereade: if you want that, then you should change the other charms directly.
<wrtp> fwereade: ISTM that if you've got two things concurrently changing the same settings file (whether running in mutually exclusive hooks or not) then it's a recipe for trouble.
<fwereade> wrtp, the point is to eliminate the concurrency...
<fwereade> wrtp, by mandating that if you want to make a change you must do it in a hook, and serialising hook executions across all units, we do that
<fwereade> wrtp, the other drawbacks may indeed sink the idea
<fwereade> wrtp, but I'm pretty sure that doing this gives us a much lower chance of weird and hard-to-repro interactions
<wrtp> fwereade: yeah, but if you're a principal and you change a settings file, you might be warranted in expecting that it's the same when the next hook is called.
<wrtp> fwereade: for instance you might just *write* the settings file, rather than read it in and modify it
<fwereade> wrtp, (it also depends on adding juju-run so that we *can* run commands in a hook context at arbitrary times)
<Aram> moin.
<wrtp> Aram: hiya
<fwereade> Aram, heyhey
<fwereade> wrtp, well, it is true that a hook never knows what hook (if any) ran last
<wrtp> fwereade: i don't think we should be making it easier to write the kind of charms that this would facilitate
<fwereade> wrtp, what's the solution to the apt issues then?
<wrtp> fwereade: so is it true that apt does not work if called concurrently?
<fwereade> wrtp, it appears to be
<wrtp> fwereade: i would not be averse to providing an *explicit* way to get mutual exclusion across charms in a container
<wrtp> fwereade: so you could do, e.g.: juju-acquire; apt-get...; juju-release
<wrtp> fwereade: last thing i saw: [10:04:39] <fwereade> wrtp, it appears to be
<wrtp> fwereade: it would be better if apt-get was fixed though - that seems to be the root of this suggestion.
<fwereade_> wrtp, sorry -- I was composing something like "that sounds potentially good, please suggest it in a reply"
<fwereade_> wrtp, I still find it hard to believe that apt-get is the only possible legitimate source of concurrency issues
<wrtp> fwereade_: of course it's not - but we're in a timesharing environment - it's all concurrent and people need to deal with that.
<fwereade_> wrtp, and if it's not then everybody has to carve out their own exceptions for the things they know and care about
<fwereade_> wrtp, and I have a strong suspicion that everyone will figure out that the Best Practice is to grab the lock at the start of the hook and release it at the end
<wrtp> fwereade_: i think that trying to pretend that in this fluffy juju world everything is sequential and lovely, is going to create systems that are very fragile
<wrtp> fwereade_: that may well be true for install hooks at any rate
<wrtp> fwereade_: i'm not so sure about other hooks
<fwereade_> wrtp, I am not trying to "pretend" anything... I am saying we can implement things one way, or another way, and that I think one way might be good. you seem to be asserting that even if we do things sequentially they still won't be sequential
<fwereade_> wrtp, it's not about pretending it's about making a choice
<wrtp> fwereade_: yeah. i'm saying that hook sequencing doesn't necessarily make the actions of a charm sequential
<fwereade_> wrtp, if we only pretend to make choices I agree we'll be screwed there ;)
<fwereade_> wrtp, wait, you have some idea that any hook knows anything about what happened before it was run?
<wrtp> fwereade_: i think it's reasonable for a charm to assume ownership of some system file.
<fwereade_> wrtp, ok, but that still implies nothing about what the last hook to modify that file was, at any given time
<wrtp> fwereade_: it means that you know that whatever change was made, it was made by your hooks
<fwereade_> wrtp, I don't remotely care about hook *ordering* in this context... is that the perspective you're considering?
<wrtp> fwereade_: no, not at all
<fwereade_> wrtp, wait, you were just telling me that "rare" isn't good enough when considering the possibility of, say, a service changing its own config... ISTM that it follows that we must have some magical system which is safe from any and all concurrent modifications (or, really, that every charm author has to build compatible pieces of such a system)
<fwereade_> wrtp, or we have a simple solution, which is, don't run two hooks at a time
<wrtp> fwereade_: or... don't have one charm that modifies the same things as another. keep out of each others' hair.
<fwereade_> wrtp, so, no apt then
<fwereade_> wrtp, and nothing else that doesn't like concurrent modifications
<wrtp> fwereade_: apt needs to be fixed. or we need to provide a workaround for that, in particular.
<fwereade_> wrtp, *or* a vast distributed multi-author locking algorithm using new hook commands
<wrtp> fwereade_: "vast distributed multi-author" ??
<fwereade_> wrtp, every single charm author has to do the locking dance right
<wrtp> fwereade_: only if you're changing something that others might change concurrently.
<wrtp> fwereade_: i think this all comes down to how we see the role of subordinates
<fwereade_> wrtp, requiring that charm authors have perfect precognition doesn't strike me as helpful ;p
<wrtp> fwereade_: have you looked at what subordinate charms are out there now, and whether any potentially suffer from these issues? (ignore apt-get issues for the moment)
<fwereade_> wrtp, no, because these sorts of issues are by their very nature subtle and hidden
<wrtp> fwereade_: i'm not so sure. i think it should be fairly evident if a subordinate is doing stuff that may interact badly with a principal.
<fwereade_> wrtp, I think that if it were that clear, everybody would have spotted the apt problem and worked around it in every single charm
<wrtp> fwereade_: i assume that hardly anyone uses subordinates yet tbh
<wrtp> fwereade_: i don't mean evident from behaviour, but evident from what the purpose of the subordinate charm is
<fwereade_> wrtp, right -- my position is that the reason that apt is the only problem we've seen is likely to be because we don't use many yet
<fwereade_> wrtp, IMO it is consistent with the general feel of juju to make it easier, not harder, for charms to play together
<wrtp> fwereade_: IMO it's also consistent with juju to make independent components that have independent failure modes
<fwereade_> wrtp, we provide a consistent snapshot of remote state in a hook context -- why mess that up by explicitly encouraging inconsistency in local state?
<wrtp> fwereade_: because we *can't* provide a consistent snapshot of local state?
<fwereade_> wrtp, and yet you seem to consider that adding a class of subtle and hard-to-detect concurrency-based failures is consistent with this goal
<fwereade_> wrtp, we can either have a hook which is equivalent to logging into a machine yourself, or logging into a machine with N concurrent administrators
<fwereade_> wrtp, all making changes at the same time
<fwereade_> wrtp, I don't see how the second scenario is more robust
<wrtp> fwereade_: if one of those N concurrent adminstrators hangs for ages, the others can continue uninterrupted. i think that's a very useful property.
<fwereade_> wrtp, I think that's a very situation-specific property and not worth introducing this class of bug for
<wrtp> fwereade_: it means that if i decide to install a subordinate charm, the principal service can carry on regardless.
<fwereade_> wrtp, I feel if we ever do something like this it should be a release/reacquire pair around the long-running operations
<fwereade_> wrtp, making people have to lock by default seems really unhelpful to me
<wrtp> fwereade_: tbh, i'm very surprised that apt-get doesn't work concurrently by default. i haven't managed to find any bug reports so far.
<wrtp> fwereade_: it seems to take out file locks
<fwereade_> wrtp, so plausibly 2 things are installing things with overlapping dependencies?
<wrtp> fwereade_: we could always provide a version of apt-get that *is* exclusive...
<fwereade_> wrtp, I dunno, it feels to me like we'll end up with a bunch of special cases sooner or later
<fwereade_> wrtp, can we take it to the lists for further discussion? need to pop out to baker before it closes
<wrtp> fwereade_: sure, i'll try and draft a reply
<niemeyer> Good morning!
<davecheney> hello
<TheMue> hi
<niemeyer> Anyone has the calls active already, or should I?
<TheMue> niemeyer: feel free to start, imho none has done it yet
<niemeyer> Cool, starting it up
<niemeyer> https://plus.google.com/hangouts/_/2a0ee8de20f9362c47ab06b9b5635551d4959416?authuser=0&hl=en
<davecheney> no camera today
<davecheney> not sure why
<davecheney> the mac says it can see the device
<davecheney> but no green light :(
<niemeyer> wrtp: ping
<wrtp> niemeyer: pong
<niemeyer> wrtp: Party time
<wrtp> niemeyer: am just sorting out the hangout laptop
<Aram> I hate this technical shit
<wrtp> lunch
<wrtp> back
<niemeyer> fwereade_: Sent a more carefully considered comment on the lock-stepping issue
<mramm> How are folks doing this fine morning?
<niemeyer> fwereade_: ping
<niemeyer> mramm: Heya
<mramm> I'm about to go over Mark S's open stack design summit keynote with him (and kapil and clint)
<niemeyer> mramm: All good 'round here
<niemeyer> mramm: Brilliant, good luck there
<mramm> I think we have a really good story to tell around openstack upgrades thanks to the cloud archive
<mramm> and the look and feel of the juju gui is impressive
<niemeyer> fwereade_: When you're back and you have a moment, I'd appreciate talking a bit about https://codereview.appspot.com/6687043
<niemeyer> fwereade_: Both about the logic in EnterScope, and about the fact the CL seems to include things I've reviewed elsewhere
<niemeyer> mramm: What's the cloud archive?
<niemeyer> mramm: Good to hear re. GUI
<mramm> it's just a package archive
<mramm> with all the new stuff, backported and tested against the LTS
<niemeyer> mramm: LOL
<niemeyer> mramm: So we managed to stick the word "cloud" on package archives? ;-)
<mramm> it's all "cloud" stuff in the archive
<mramm> yes
<mramm> gotta make things cloudy
<fwereade_> niemeyer, pong, sorry I missed you
<niemeyer> wrtp: no problem
<niemeyer> Erm
<niemeyer> fwereade_: no problem
<fwereade_> niemeyer, haha
<niemeyer> fwereade_, wrtp: I was actually about to ask something else
<wrtp> niemeyer: go on
<fwereade_> niemeyer, 043 was meant to be a prereq for 046, i didn't realise I'd skipped it until yesterday
<niemeyer> fwereade_, wrtp: I think it'd make sense to have the interface of juju.Conn exposing at least similar functionality to what we have in the command line
<wrtp> niemeyer: are you talking about your Deploy bug?
<fwereade_> niemeyer, yes, I think I like that idea
<niemeyer> No, I'm talking about https://codereview.appspot.com/6700048
<niemeyer> fwereade_, wrtp: We've been going back and forth on what we have in juju.Conn, and the state we're in right now is quite neat
<wrtp> niemeyer: ah, i think that's a tricky one.
<niemeyer> fwereade_, wrtp: But the decision to put something there or not is a bit ad-hoc at the moment
<wrtp> niemeyer: i *do* think it's perhaps a bit confusing that RemoveUnit in Conn isn't anything like RemoveUnit in State.
<niemeyer> wrtp: Agreed, and I have a proposal: DestroyUnit
<wrtp> niemeyer: in state?
<niemeyer> wrtp: No, in juju.Conn
<niemeyer> Ideally that'd be the name of the command-line thing as well, but that's too late
<wrtp> niemeyer: hmm
<niemeyer> We do have destroy-service and destroy-environment, though
<fwereade_> niemeyer, honestly I would prefer us to change Add/Remove in state to Create/Delete, say, and save the meanings of those verbs for the user-facing add/removes
<wrtp> niemeyer: Destroy sounds more drastic than Remove tbh
<niemeyer> fwereade_: Remove vs. Delete feels awkward
<niemeyer> wrtp: It's meant to be drastic
<wrtp> niemeyer: ah, i thought the command-line remove-unit just set dying.
<fwereade_> niemeyer, well, the trouble is we have this awkward remove-unit verb, which doesn't really mean remove at all
<niemeyer> wrtp: and dying does what? :)
<wrtp> niemeyer: then again, i suppose... yeah
<niemeyer> fwereade_: We can obsolete the command, and have destroy-unit
<niemeyer> fwereade_: (supporting the old name, of course)
<fwereade_> niemeyer, I'm -0.5 on the add/destroy pairing but it doesn't seem all that bad
<wrtp> fwereade_: we already have add-service, destroy-service, no?
<niemeyer> wrtp: We don't have add-service, yet
<niemeyer> wrtp: We may, some day..
<wrtp> niemeyer: good point
<niemeyer> We do have AddService, though, so the pairing is already there in some fashion at least
<fwereade_> wrtp, we also have terminate-machine rather than destroy-machine
<niemeyer> I quite like destroy precisely because it's drastic, and because it avoids the add/remove/dying conflict
<niemeyer> fwereade_: +1 on destroy-machine too
<fwereade_> wrtp, niemeyer: and in general I am in favour of making the commands more consistent
<wrtp> fwereade_: +1 too
<niemeyer> fwereade_: destroy-service, destroy-unit, destroy-machine, destroy-environment..
<niemeyer> I'm happy with that, at least
<wrtp> destroy for destructive actions seems good
<fwereade_> wrtp, niemeyer: any quibbles I may have over the precise verb are drowned out by my approval for consistency
<wrtp> niemeyer: sounds like a plan
<fwereade_> niemeyer, wrtp: destroy-relation
<niemeyer> wrtp, fwereade_: Awesome, let's document and move in that direction
<niemeyer> I'll add a comment to Dave's branch
<niemeyer> fwereade_: +1
<fwereade_> niemeyer, great, thanks
<wrtp> fwereade_: i don't mind about remove-relation actually - it doesn't feel like that much of a destructive operation.
<niemeyer> wrtp: It actually is
<fwereade_> wrtp, strong disagreement
 * niemeyer has to take the door.. biab
<wrtp> fwereade_: ok, cool
<fwereade_> niemeyer, re the review -- if I were you I'd just drop that one, you've seen it all already in the one without the prereq
<fwereade_> niemeyer, I will try to figure out exactly where I am and whether I've introduced anything that deserves a test, then I should have the fixed one-you've-seen ready to repropose soon
<wrtp> interesting; this test didn't *fail*, but it did take over 2 minutes to execute on my machine: http://paste.ubuntu.com/1283102/
<wrtp> i'm not sure if i'm being pathological there or not
<niemeyer> wrtp: Would be very useful to know where the time is being spent
<niemeyer> fwereade_: Awesome, thanks
<wrtp> niemeyer: i'm looking into it right now.
<niemeyer> fwereade_: Can we speak about EnterScope when have a moment?
<fwereade_> niemeyer, any time
<fwereade_> niemeyer, now?
<niemeyer> fwereade_: Let's do it
<wrtp> niemeyer: quick check before you do that
<niemeyer> wrtp: Sure
<wrtp> niemeyer: should there be any watcher stuff running in a normal state unit test?
<wrtp> niemeyer: (i'm seeing hundreds of "watcher: got changelog document" debug msgs)
<niemeyer> wrtp: The underlying watcher starts on state opening
<wrtp> niemeyer: ah
<niemeyer> wrtp: If you're creating hundreds of machines, that's expected
<wrtp> niemeyer: i see 4600 such messages initially
<fwereade_> niemeyer, actually, I'm just proposing -wip
<fwereade_> niemeyer, not quite sure it's ready, have ended up a bit confused by the branches
<fwereade_> niemeyer, but it does have an alternative approach to EnterScope
<fwereade_> niemeyer, that I am not quite sure whether I should do as it is, or loop over repeatedly until I get so many aborteds that I give up
<niemeyer> wrtp: You'll get as many messages as changes
<fwereade_> niemeyer, https://codereview.appspot.com/6678046
<niemeyer> fwereade_: Cool
<niemeyer> fwereade_: Invite sent
<wrtp> hmm, i see the problem, i think
<niemeyer> wrtp: Found it?
<wrtp> niemeyer: the problem is that all the goroutines try to assign to the same unused machine at once, but only one succeeds; then they all try with the next one etc etc
<wrtp> niemeyer: i think i've got a solution
<wrtp> niemeyer: i'm not far off trying it out
<wrtp> niemeyer: my solution is to read in batches, and then try to assign to each machine in the batch in a random order.
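wrtp's batch-and-shuffle idea might look roughly like this sketch; `assignFromBatch` and `tryAssign` are hypothetical stand-ins for the transactional assignment in state:

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

var errTaken = errors.New("machine already assigned")

// assignFromBatch tries candidate machines in random order, so that
// concurrent assigners fan out over the batch instead of all fighting
// for the first machine. tryAssign stands in for the transactional
// assignment, which fails with errTaken if someone else got there first.
func assignFromBatch(batch []int, tryAssign func(id int) error) (int, error) {
	ids := append([]int(nil), batch...)
	rand.Shuffle(len(ids), func(i, j int) { ids[i], ids[j] = ids[j], ids[i] })
	for _, id := range ids {
		if err := tryAssign(id); err == nil {
			return id, nil
		} else if !errors.Is(err, errTaken) {
			return 0, err
		}
	}
	return 0, errors.New("no unused machine in batch")
}

func main() {
	taken := map[int]bool{0: true, 1: true}
	id, err := assignFromBatch([]int{0, 1, 2, 3}, func(id int) error {
		if taken[id] {
			return errTaken
		}
		taken[id] = true
		return nil
	})
	fmt.Println(id >= 2, err == nil)
}
```

As niemeyer notes below, this reduces the contention rather than eliminating it: two assigners can still shuffle onto the same machine and one must retry.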
<niemeyer> wrtp: What about going back to the approach we had before?
<wrtp> niemeyer: which was?
<niemeyer> wrtp: Create a machine and assign to it
<wrtp> niemeyer: what if we don't need to create a machine?
<wrtp> niemeyer: this is in AssignToUnusedMachine, which doesn't create machines
<niemeyer> wrtp: My understanding is that we had an approach to allocate machines that was simple, and worked deterministically
<wrtp> niemeyer: and the approach we used before is inherently racy if someone else *is* using AssignToUnusedMachine
<wrtp> niemeyer: that's fine (modulo raciness), but that doesn't fix the issue i'm seeing in this test (which we may, of course, decide is pathological and not worth fixing)
<niemeyer> wrtp: The only bad case was that if someone created a machine specifically for a service while someone else attempted to pick a random machine, the random one could pick the machine just allocated for the specific service
<wrtp> niemeyer: so in that case we should loop, right?
<niemeyer> wrtp: I'm not sure
<wrtp> niemeyer: actually, we *do* create a machine and then assign the unit to that machine
<wrtp> niemeyer: and that's the cause of the bug that dfc is seeing (i now realise)
<niemeyer> wrtp: Indeed, sounds plausible
<wrtp> niemeyer: in the case i'm dealing with currently, we have a big pool of machines already created, all unused, and we're trying to allocate a load of units over them.
<wrtp> niemeyer: that seems like a reasonable scenario actually.
<niemeyer> wrtp: Agreed
<wrtp> niemeyer: so i think it's worth trying to make that work ok.
<niemeyer> wrtp: +1
<wrtp> niemeyer: so... do you think my proposed solution is reasonable?
<niemeyer> wrtp: It seems to reduce the issue, but still feels racy and brute-forcing
<wrtp> niemeyer: alternatives are: - read *all* the machines, then choose them in random order; - add a random value to the machine doc and get the results in a random order
<wrtp> niemeyer: yeah, i know what you mean
<wrtp> niemeyer: there's probably a way of doing it nicely, though i haven't come up with one yet
<niemeyer> wrtp: I think we could introduce the concept of a lease
<wrtp> niemeyer: interesting way forward, go on.
<niemeyer> wrtp: When a machine is created, the lease time is set to, say, 30 minutes
<niemeyer> wrtp: AssignToUnused never picks up machines that are within the lease time
<wrtp> niemeyer: that doesn't solve the big-pool-of-already-created-machines problem AFAICS
<wrtp> niemeyer: which is, admittedly, a different issue
<niemeyer> wrtp: Hmm, good point
<niemeyer> wrtp: You know what.. I think we shouldn't do anything right now other than retrying
<wrtp> niemeyer: and ignore the time issue?
<niemeyer> wrtp: Yeah
<wrtp> niemeyer: the random-selection-from-batch isn't much code and will help the problem a lot
<niemeyer> wrtp: It makes the code more complex and bug-prone for a pretty unlikely scenario
<wrtp> niemeyer: ok. it's really not that complex, though, but i see what you're saying.
<niemeyer> wrtp: I recall you saying that before spending a couple of days on the last round on unit assignment too :-)
<wrtp> niemeyer: i've already written this code :-)
<wrtp> niemeyer: and it's just an optimisation that fairly obviously doesn't affect correctness.
<niemeyer> wrtp: I don't think it's worth it.. it's increasing complexity and the load of the system in exchange for a reduction in the chance of conflicts in non-usual scenarios
<niemeyer> wrtp: We'll still have conflicts, and we still have to deal with the problem
<niemeyer> wrtp: People adding 200 machines in general will do add-machine -n 200
<niemeyer> wrtp: and we should be able to not blow our own logic out with conflicts in those cases
<wrtp> niemeyer: i'm thinking of remove-service followed by add-service
<niemeyer> wrtp: Ok?
<wrtp> niemeyer: sure.
<wrtp> niemeyer: i'll scale back my test code :-)
<niemeyer> wrtp: Sorry, I was asking what you were thinking
<niemeyer> wrtp: What about remove-service followed by add-service?
<wrtp> niemeyer: if someone does remove-service, then two add-services concurrently, they'll see this issue.
<wrtp> niemeyer: that doesn't seem that unusual a scenario
<wrtp> niemeyer: i mean two "deploy -n 100"s of course
<wrtp> niemeyer: assuming the original service had 200 units.
<niemeyer> wrtp: If someone does destroy-service, they'll put units to die.. if they run add-service twice immediately, they'll create two new machines
<niemeyer> wrtp: What's the problem with that?
<wrtp> niemeyer: if someone does destroy-service, then waits, the machines lie idle with no units after a while, yes?
<niemeyer> wrtp: Sorry, what's the scenario again?  Different scenarios are not "of course" the same
<wrtp> niemeyer: here's the scenario i'm thinking of:
<wrtp> juju deploy -n 200 somecharm; juju remove-service somecharm; sleep 10000; juju deploy -n 100 othercharm & juju deploy -n 100 anothercharm
<niemeyer> wrtp: I don't understand why we're talking about deploy + remove-service
<niemeyer> wrtp: What's the difference between that and add-machine -n 200?
<wrtp> niemeyer: because that leaves a load of machines allocated but unused, no?
<niemeyer> wrtp: add-machine -n 200?
<wrtp> [15:53:33] <niemeyer> wrtp: People adding 200 machines in general will do add-machine -n 200
<niemeyer> wrtp: Yes, what's the difference?
<wrtp> niemeyer: but they are more likely to remove a service and add another one, i think
<niemeyer> wrtp: Doesn't matter to the allocation algorithm, does it?
<wrtp> niemeyer: "juju deploy -n 200 foo" doesn't have the issue
<wrtp> niemeyer: if the machines are not currently allocated
<niemeyer> wrtp: Agreed.. that's why I'm saying the whole problem is not important..
<niemeyer> wrtp: I still don't get what you're trying to say with deploy+remove-service+sleep
<niemeyer> wrtp: Isn't that an expensive way to say add-machine -n 200?
<wrtp> niemeyer: i'm trying to show a moderately plausible scenario that would exhibit the pathological behaviour we're seeing here.
<wrtp> niemeyer: yeah, sure.
<niemeyer> wrtp: Okay, phew..
<niemeyer> wrtp: So how is add-machine -n 200 + deploy -n 200 an issue?
<wrtp> niemeyer: it's only an issue if you've got two concurrent deploys.
<niemeyer> wrtp: Okay, so we should just ensure that these cases actually work by retrying, until we sort a real solution out in the future that actually prevents the conflict
<wrtp> niemeyer: sounds reasonable.
<wrtp> niemeyer: AssignToUnusedMachine does currently retry as it stands actually.
<niemeyer> wrtp: So how is Dave stumbling upon issues?
<wrtp> niemeyer: the problem is in AssignUnit, but there's a trivial fix, i think
<niemeyer> wrtp: Cool
<wrtp> niemeyer: currently AssignUnused calls Unit.AssignToMachine(m) but it should call Unit.assignToMachine(m, true)
<wrtp> niemeyer: yeah, i was surprised when my test didn't fail.
<niemeyer> wrtp: I'm not sure this solves the issue
<wrtp> niemeyer: no?
<wrtp> niemeyer: i *think* it solves the case of AssignUnit racing against itself
<wrtp> niemeyer: it doesn't solve the problem of AssignUnit racing against AssignToUnusedMachine
<wrtp> niemeyer: if we want to solve that, we'll need to loop, i think.
<wrtp> niemeyer: (but that's not the problem that dave is seeing)
<wrtp> niemeyer: erk
<wrtp> niemeyer: no, you're right
<wrtp> niemeyer: i'm thinking of something like this: http://paste.ubuntu.com/1283247/
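The retry-until-settled loop the two converge on later might look like this sketch; both callbacks are hypothetical stand-ins for the real state operations (the linked paste is not reproduced here):

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errNoUnused  = errors.New("no unused machines")
	errContended = errors.New("machine was taken concurrently")
)

// assignUnit sketches the retry loop under discussion: keep trying to
// grab an unused machine; if the attempt was merely contended, go
// around again; if none are left at all, create a fresh machine.
func assignUnit(pickUnused func() (int, error), create func() int) int {
	for {
		id, err := pickUnused()
		if err == nil {
			return id
		}
		if errors.Is(err, errNoUnused) {
			return create()
		}
		// errContended: someone else won the race; retry.
	}
}

func main() {
	attempts := 0
	id := assignUnit(func() (int, error) {
		attempts++
		if attempts < 3 {
			return 0, errContended
		}
		return 0, errNoUnused
	}, func() int { return 42 })
	fmt.Println(id, attempts)
}
```

This is correct but potentially inefficient under heavy contention, which is exactly the trade-off accepted at the end of the thread.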
<niemeyer> wrtp: This doesn't feel great.. allocating a machine and having it immediately stolen is pretty awkward
<niemeyer> wrtp: If we want to solve this stuff for real, I suggest two different fronts:
<wrtp> niemeyer: AddMachineWithUnit ?
<niemeyer> 1) Introduce a lease time on AddMachine that prevents someone else from picking it up non-explicitly
<niemeyer> 2) Do a variant of your suggestion that picks the highest and the smallest id of all unused machines, and picks the first one >= a random id in the middle
<niemeyer> wrtp: -1 I think.. this would mean we'll have to do a bunch of transaction merging that right now are totally independent
<wrtp> niemeyer: ok
<wrtp> niemeyer: do we have a way of getting an agreed global time for leaseholding?
<wrtp> niemeyer: presumably the presence stuff does that currently
<wrtp> niemeyer: hmm, maybe mongo provides access to the current time
<niemeyer> wrtp: Yeah
<niemeyer> wrtp: Although, ideally we'd not even load that time
<wrtp> niemeyer: i'm thinking we shouldn't need to
<niemeyer> wrtp: if the machine is created with a bson.MongoTimestamp, that's automatically set
<niemeyer> wrtp: It needs to be the second field, though, IIRC
<wrtp> niemeyer: weird
<niemeyer> wrtp: Yeah, it's a bit of an internal time type
<niemeyer> wrtp: it'd be nicer to use a normal time, actually
<niemeyer> I don't recall if there's a way to create it with "now", though
 * niemeyer checks
<niemeyer> Nothing great
<niemeyer> I talked to Eliot before about $now.. I think it'll come, but doesn't exist yet
<wrtp> niemeyer: ah
<niemeyer> Anyway, will think further about that over lunch
<wrtp> niemeyer: cool
<wrtp> niemeyer: for the time being, perhaps it's best just to do the loop?
<wrtp> niemeyer: as it's a quick fix for a current bug
<niemeyer> wrtp: Yeah, that's what I think we should do
<niemeyer> wrtp: The real solution is involved and will steal our time
<wrtp> niemeyer: agreed
<wrtp> niemeyer: also, what we will have will be correct, just not very efficient.
<niemeyer> wrtp: Yeah, but those are edge cases really.. the cheap answer is "don't allocate tons of machines and then do tons of assignments in parallel"
<niemeyer> wrtp: Which isn't hard to avoid
<wrtp> niemeyer: yeah
<wrtp> niemeyer: concurrent deploys are inefficient. we can live with that for the time being.
<niemeyer> Cool, lunch time.. biab
<niemeyer> wrtp: Concurrent deploys with spare machines, specifically
<wrtp> niemeyer: all concurrent deploys will suffer from the someone-stole-my-new-machine problem, i think.
<wrtp> niemeyer: this seems to work ok. https://codereview.appspot.com/6713045
<niemeyer> wrtp: Looking
<niemeyer> wrtp: Nice
<fwereade_> niemeyer, hmm, is it OK to go from Dying straight to removed without passing through Dead?
<niemeyer> wrtp: How long does it take to run?
<fwereade_> niemeyer, blast sorry can't talk now
<wrtp> niemeyer: <2s
<wrtp> niemeyer: one mo, i'll check
<wrtp> niemeyer: 0.753s to run the state tests with just that one test.
<niemeyer> wrtp: Beautiful, thanks
<niemeyer> wrtp: LGTM
<wrtp> niemeyer: thanks
<wrtp> niemeyer: it was surprisingly difficult to provoke the race before applying the fix
<niemeyer> fwereade_: If nothing else, I see how it might be okay in cases we have tight control on
<niemeyer> fwereade_: We can talk more once you're back
<niemeyer> wrtp: Those are very useful tests to hold
<wrtp> niemeyer: agreed
<wrtp> niemeyer: any chance of getting some feedback on https://codereview.appspot.com/6653050/ ?
<niemeyer> wrtp: I was reviewing that when I stopped to review your request here
<wrtp> niemeyer: ah brilliant, thanks!
<niemeyer> wrtp: Why does it reset the admin password on tear down of ConnSuite?
<fwereade_> niemeyer, well, it was what we were discussing earlier... that it seemed sensible for the last unit to leave a relation scope to be the one to finally remove it, and that we should do it in a transaction
<wrtp> niemeyer: because every time we connect to the state, the admin password gets set.
<niemeyer> wrtp: Where's that done?
<wrtp> niemeyer: in juju.NewConn
<wrtp> niemeyer: actually, in Bootstrap
<wrtp> niemeyer: and then juju.NewConn resets it, as is usual
<niemeyer> wrtp: // because the state might have been reset
<niemeyer> // by the test independently of JujuConnSuite.
<niemeyer> wrtp: Is that done when the password change fails?
<niemeyer> wrtp: I mean, where do we reset and put it in such a non-working state
<wrtp> niemeyer: one mo, i'll check the code again
<niemeyer> wrtp: Cheers
<wrtp> niemeyer: ah, yes, it's when we have tests that Bootstrap then Destroy
<wrtp> niemeyer: any test that does a Destroy will cause the SetAdminPassword call to fail
<niemeyer> wrtp: Hmm..
<niemeyer> wrtp: I'm pondering about what it means.. won't the follow up tear down fail too?
<wrtp> niemeyer: no, it doesn't. can't quite remember why though, let me check again.
<niemeyer> wrtp: It feels a bit wild
<niemeyer> wrtp: You've just worked on that and can't remember.. neither of us will have any idea of that stuff in a bit :(
<wrtp> niemeyer: i know what you mean
<wrtp> niemeyer: the interaction between JujuConnSuite, MgoSuite and the dummy environ isn't ideal
<niemeyer> wrtp: What actually fails if we don't reset the password there?
<wrtp> niemeyer: lots of tests need the server to be restarted then
<wrtp> niemeyer: nothing fails - tests just get slower
<niemeyer> wrtp: That's good
<wrtp> niemeyer: when someone calls Environ.Destroy, it calls mgo.Reset, but the JujuConn.State variable remains pointing to the old connection.
<niemeyer> wrtp: Right
<niemeyer> wrtp: Okay, I'll just suggest a comment there
<wrtp> niemeyer: sounds good
<niemeyer> / Bootstrap will set the admin password, and render non-authorized use
<niemeyer> / impossible. s.State may still hold the right password, so try to reset
<niemeyer> / the password so that the MgoSuite soft-resetting works. If that fails,
<niemeyer> / it will still work, but it will take a while since it has to kill the whole
<niemeyer> / database and start over.
<niemeyer> Ah, will add a note about when it happens too
<niemeyer> wrtp: LGTM
<niemeyer> wrtp: Pleasantly straightforward
<wrtp> niemeyer: great, thanks
<wrtp> niemeyer: yeah, when i realised that all tests were going to need to connect with authorisation, i thought the changes would be worse than they ended up.
<niemeyer> fwereade_: I see
<niemeyer> fwereade_: I think it sounds reasonable in that case
<niemeyer> fwereade_: Is there anyone else that might be responsible for taking the unit from dying => dead => remove?
<wrtp> right, submitted. good way to end the day. see y'all tomorrow.
<niemeyer> wrtp: Indeed, have a good one!
<fwereade__> niemeyer, it's the relation I'm pondering taking directly Dying->gone
<fwereade__> niemeyer, I *think* it's ok, because the last thing to be doing anything with it should be that last relation
<niemeyer> fwereade__: Makes sense.. have you seen my comments on it?
<fwereade__> niemeyer, sorry, I'm not sure which comments.. I haven't seen any comments of yours less than ~ 1 day old on the CLs I'm thinking of
<niemeyer> fwereade__: My apologies, I meant on IRC, right above
<niemeyer> <niemeyer> fwereade_: I see
<niemeyer> <niemeyer> fwereade_: I think it sounds reasonable in that case
<niemeyer> <niemeyer> fwereade_: Is there anyone else that might be responsible for taking the unit from dying => dead => remove?
<fwereade__> niemeyer, it's not the unit, it's the relation
<niemeyer> fwereade__: Ah, sorry, yes, s/unit/relation
<fwereade__> niemeyer, the other thing that might have to do it is the client, if no units are in the relation yet
<fwereade__> niemeyer, I'm actually starting to feel less keen on the idea
<fwereade__> niemeyer, I'm starting to think that it would be better to set it to Dead and add a cleanup doc for it
<fwereade__> niemeyer, we can do it in one transaction but don't have to get overly clever
<niemeyer> fwereade__: What's the benefit?
<fwereade__> niemeyer, we get (1) consistent lifecycle progress and (2) a single transaction that the unit agent uses to wash its hands of a dying relation
<niemeyer> fwereade__: Actually, hmm
<niemeyer> fwereade__: Well, before we derail..
<niemeyer> fwereade__: Both don't look like very strong points.. we're exchanging simple and deterministic termination for a hand-off of responsibility
<niemeyer> fwereade__: There's perhaps an alternative that might offer a middle ground solving some of your concerns, though
<fwereade__> niemeyer, (the big one is "LeaveScope will be less complicated")
<fwereade__> niemeyer, but go on please
<niemeyer> fwereade__: Sorry, I spoke too soon, I think the idea would introduce further races down the road
<fwereade__> niemeyer, ha, no worries
<fwereade__> niemeyer, anyway I'm hardly married to the idea, I'll take it round the block another time and try to simplify a bit more
<niemeyer> fwereade__: I don't see any simple alternatives..
<niemeyer> fwereade__: Adding a cleanup document would mean persisting service and associated relations for an undetermined amount of time
<niemeyer> fwereade__: Even in the good cases
<fwereade__> niemeyer, really? a Dead relation has decreffed its service... I don't think there's anything blocking service removal at that point
<fwereade__> niemeyer, that's almost the whole point of it being dead
<fwereade__> niemeyer, if anything else is reacting in any way to a dead relation I think they're doing it wrong
<niemeyer> fwereade__: It would be the first time we're keeping dead stuff around referencing data that does not exist
<niemeyer> fwereade__: This feels pretty bad
<niemeyer> fwereade__: Find(relation)... oh, sorry, your service is gone
<niemeyer> fwereade__: Worse.. Find(relation).. oh, look, your service is live, again, but it's a different service!
<niemeyer> fwereade__: The purpose of Dead as we always covered was to implement clean termination, not to leave old unattended data around
<fwereade__> niemeyer, fair enough, as I said I'm happy to take it round the block again -- I had seen it as just one more piece of garbage in the same vein as all its unit settings, but mileage clearly varies
<fwereade__> niemeyer, just to sync up on perspective: would you agree that we should, where possible, be making all related state changes in a single transaction, and only falling back to a CA when dictated by potentially large N?
<niemeyer> fwereade__: Settings have no lifecycle support, and an explicit free pass in the case of relation unit settings because we do want to keep them up after scope-leaving for reasons we discussed
<fwereade__> niemeyer, yep, ok, I am not actually arguing for it any more, I think I have gone into socratic mode largely for my own benefit
<niemeyer> fwereade__: Regarding CA use, yes, it feels like a last resort we should use when that's clearly the best way forward
<niemeyer> fwereade__: Again, it sounds sensible in the case of settings precisely because we have loose control of when to remove
<fwereade__> niemeyer, thanks, but in fact I have a more general statement: we should be making state changes as single transactions where possible, and exceptions need very strong justifications
<fwereade__> niemeyer, because I am suddenly fretting about interrupted deploys
<niemeyer> fwereade__: I'm finding it a bit hard to agree with the statement as open as it is because I'm not entirely sure what I'd be agreeing with
<fwereade__> niemeyer, it feels like maybe we should actually be adding the service, with N units and all its peer relations, in one go
<niemeyer> fwereade__: The end goal is clear, though: our logic should continue to work reliably even when things explode in the middle
<fwereade__> niemeyer, ok, thank you, that is a much better statement of the sentiment I am trying to express
<niemeyer> fwereade__: In some cases, we may be forced to build up a single transaction
<niemeyer> fwereade__: In other cases, it may be fine to do separate operations because they will independently work correctly even if there's in-between breakage
<niemeyer> fwereade__: and then, just to put the sugar in all of that, we have to remember that our transaction mechanism is read-committed
<niemeyer> fwereade__: We can see mid-state
<niemeyer> fwereade__: even if we have some nice guarantees that it should complete eventually
<fwereade__> niemeyer, I have been trying to keep that at the forefront of my mind but I bet there are some consequences I've missed somewhere ;)
<niemeyer> fwereade__: That's great to know
<niemeyer> fwereade__: We most likely have issues here and there, but if nothing else we've been double-checking
<fwereade__> niemeyer, so, to again consider specifically the extended LeaveScope I'm looking at now
<niemeyer> fwereade__: I often consider the order in which the operations are done, and the effect it has on the watcher side, for example
<niemeyer> fwereade__: Ok
<fwereade__> niemeyer, (huh, that is not something I had properly considered... are they ordered as the input []Op?)
<niemeyer> fwereade__: Yep
<niemeyer> fwereade__: We've been getting it right, I think :)
<niemeyer> fwereade__: E.g. add to principals after unit is in
<niemeyer> fwereade__: I like to think it's not a coincidence :-)
<fwereade__> niemeyer, cool, but that's one of those totally unexamined assumptions I think I've been making, but could easily casually break in pursuit of aesthetically pleasing code layout or something ;)
<fwereade__> niemeyer, good to be reminded
<niemeyer> fwereade__: So, LeaveScope
<niemeyer> fwereade__: What do you think?
<fwereade__> niemeyer, you know, I'm not sure any more, I need to write some code :/
<fwereade__> niemeyer, thank you, though, this has helped some things to fall into place
<niemeyer> fwereade__: Okay, since we're here with state loaded in our minds, this is my vague understanding of what we probably need:
<niemeyer> 1) If the relation is live, run a transaction doing the simplest thing, asserting that the relation is still alive
<niemeyer> 2) If there are > 1 units in the relation we're observing, run a transaction asserting that this is not the last unit, and just pop out the scope
<niemeyer> 3) If there is exactly 1 unit remaining, or 2 was aborted, remove relation and scope, unasserted
<fwereade__> niemeyer, yeah, that matches my understanding
<niemeyer> Actually, sorry, (3) has to assert the scope doc exists
<niemeyer> Otherwise we may havoc the system in some edge cases
<niemeyer> fwereade__: ^
<fwereade__> niemeyer, it was the refcount checks I had been thinking of when you said unasserted
<fwereade__> niemeyer, but then actually, hmm: by (3) a failed assertion should be reason enough to blow up, unless a refresh reveals that someone else already deleted it... right?
<niemeyer> fwereade__: Right
<fwereade__> niemeyer, we can't do anything sophisticated with the knowledge, it's always going to be an error: may as well assert for everything even in (3)
<fwereade__> niemeyer, at least we fail earlier if state does somehow become corrupt
<niemeyer> fwereade__: Hmm
<niemeyer> fwereade__: Sounds like the opposite
<niemeyer> fwereade__: If we assert what we care about, we can tell how to act
<niemeyer> fwereade__: If we assert just on existence of the scope doc, which is the only thing we care about, we know exactly what happened if it fails
<niemeyer> fwereade__: Even if we don't load anything else
<niemeyer> fwereade__: We don't care about refcounts, in theory
<niemeyer> fwereade__: If it's 1, there's only 1.. if someone removed that one, and it wasn't us, that's okay too
<niemeyer> fwereade__: 1 should never become 2 unless we have a significant bug
<niemeyer> fwereade__: Makes sense?
<fwereade__> niemeyer, so if we assert lots, fail early, and recover if it turns out that the relation was removed by someone else, I think we're fine, and in the case of such a significant bug at least we haven't made *more* nonsensical changes to the system ;)
<niemeyer> fwereade__: My point is that we don't have to "recover if it turns out ..."
<fwereade__> niemeyer, yeah, fair enough, I see that side too
<niemeyer> fwereade__: Otherwise, agreed regarding assert lots
<niemeyer> fwereade__: In fact, we're doing in-memory Life = Dead, which sounds pretty dangerous in that place
<niemeyer> fwereade__: We need to make sure to not use an in-memory value we got from elsewhere in that > 1 logic.
<fwereade__> niemeyer, sorry, I am suddenly at sea
<niemeyer> fwereade__: EnsureDead does in-memory .doc.Life = Dead
<niemeyer> fwereade__: Picking a count and life from an external relation doc and saying "Oh, if it's dying, it surely has no more than 1 unit" will bite
<niemeyer> fwereade__: Because someone else may have inc'd before it became Dying
<fwereade__> niemeyer, I may have misunderstood: but my reading was that it would be ok to use the in-memory values to pick a transaction to start off with, but that we should refresh on ErrAborted
<fwereade__> niemeyer, and use those values to figure out what to do next
<niemeyer> fwereade__: It depends on how you build the logic really
<fwereade__> niemeyer, I think I know roughly what I'm doing... time will tell :)
<niemeyer> fwereade__: If we load a value from the database that says life=dying and units=1, you don't have to run a transaction that says >1 because you know it'll fail
<niemeyer> fwereade__: If you have a value in memory you got from elsewhere that says the same thing, you can't trust it
<fwereade__> niemeyer, yes, this is true, there are inferences I can draw once it's known to be dying
<niemeyer> fwereade__: That was the only point I was making in the last few lines
<fwereade__> niemeyer, cool
<niemeyer> fwereade__: I'll go outside to exercise a tad while I can.. back later
<fwereade__> niemeyer, enjoy
<niemeyer> fwereade__: Have a good evening in case we don't catch up
<fwereade__> niemeyer, and you :)
<niemeyer> fwereade__: Cheers
#juju-dev 2012-10-17
<TheMue> morning
<fwereade__> TheMue, heyhey
<TheMue> fwereade__: hello
<fwereade__> TheMue, just in case you have perspective on this: ISTM that there is nothing preventing a charm from publishing multiple relations with the same name, so long as they have different roles... do you know if this is intentional?
<TheMue> fwereade__: sorry, dunno. not yet gone deeper into relation constraints
<fwereade__> TheMue, no worries :)
<TheMue> fwereade__: phew ;)
<fwereade__> TheMue, it's just that I saw you at the exact moment I was deciding I didn't know the answer myself :)
<TheMue> fwereade__: yeah, I would have liked to give you the missing link
<wrtp> fwereade__, TheMue: mornin'
<TheMue> wrtp: bonjour
<Aram> moin.
<TheMue> Aram: moin moin
<fwereade__> Aram, wrtp, TheMue: does anyone know why the juju-info relation has global scope?
<fwereade__> bcsaller, actually, if you're around, then you will be sure to know: ^^
<TheMue> fwereade__: sorry again, but it seems i have to take a deeper look at relations to get a better understanding
<wrtp> fwereade__: that does seem a bit odd
<wrtp> fwereade__: isn't the whole point of juju-info to support subordinates' relationship with their principal?
<fwereade__> wrtp, that had indeed been my conception of it
<fwereade__> wrtp, hmm, I'm suddenly wondering whether we should be occupying the juju-* namespace for hooks as well as relations
<fwereade__> wrtp, otherwise people could implement hooks for the provider side of juju-info
<wrtp> fwereade__: hmm, interesting
<fwereade__> wrtp, which becomes icky if those hooks suddenly appear during a charm upgrade, with a bunch of hooks already "run"
<wrtp> fwereade__: i'm not quite sure i understand
<fwereade__> wrtp, it might not be a big deal, I'm not quite sure
<fwereade__> wrtp, I can't figure out whether there's any value in implementing provider-side hooks for juju-info
<wrtp> fwereade__: i don't think we can occupy the juju-* name space entirely, otherwise you won't be able to implement requirer-side hooks for it
<wrtp> fwereade__: hmm, i can see that there *might* be
<fwereade__> wrtp, ha, just realised
<wrtp> fwereade__: although it's perhaps dodgy.
<fwereade__> wrtp, even if there is value it's fundamentally dodgy anyway
<fwereade__> wrtp, if we have multiple relations with the same name then the "what hook do we run" question becomes, I think, insoluble
<wrtp> fwereade__: why would we have multiple relations with the same name?
<fwereade__> wrtp, we already do... everything implicitly provides juju-info, and some things also require it
<wrtp> fwereade__: doesn't a principal provide exactly one juju-info relation?
<fwereade__> wrtp, yes, but so does a subordinate
<fwereade__> wrtp, everything provides exactly one
<wrtp> fwereade__: ah, so you've got the same relation in provider and requirer roles
<fwereade__> wrtp, everything requires one or zero
<fwereade__> wrtp, yeah, essentially, but they're not really the same relation
<wrtp> fwereade__: yeah sorry,
<wrtp> fwereade__: i meant the same relation name
<fwereade__> wrtp, cool
<wrtp> fwereade__: so it makes sense that you can't specify a juju-info hook
<fwereade__> wrtp, no, but you have to
<fwereade__> wrtp, otherwise the whole relation is worthless
<wrtp> fwereade__: because if you're requiring a juju-info hook, you'd name the relation something different, no?
<wrtp> fwereade__: tbh i haven't yet seen a good use for the juju-info relation
<fwereade__> wrtp, I haven't done an exhaustive check but I've seen at least two charms with a require named juju-info
<wrtp> fwereade__: did you see what they used it for?
<fwereade__> wrtp, here is one one example http://jujucharms.com/~david-duffey/precise/ddclient/hooks/juju-info-relation-joined
<fwereade__> wrtp, here is another: http://jujucharms.com/~ted/precise/application-start/hooks/juju-info-relation-joined
<wrtp> fwereade__: it's interesting that they both specify scope: container
<wrtp> fwereade__: i suppose that actually it makes sense to allow non-locally-scoped juju-info relations, because requirers can get local scope if they want
<fwereade__> wrtp, I'm not sure we get anything out of that ability though
<fwereade__> wrtp, I suspect it just misleads
<fwereade__> wrtp, I *think* that what I would like is:
<wrtp> fwereade__: it provides the ability for one service to watch any other service in a generic way
<fwereade__> wrtp, every subordinate implicitly requires it; every principal implicitly provides it; write all the hooks you want; don't allow cross-role name collisions
<wrtp> fwereade__: how many of those statements are true today?
<fwereade__> wrtp, just about none :/
<wrtp> fwereade__: i'm not sure about the first one
<fwereade__> wrtp, except possibly for write-all-the-hooks-you-want, but it's worthless due to name collisions
<wrtp> fwereade__: why should every subordinate implicitly require it?
<fwereade__> wrtp, the subordinates don't have to do anything with it, just as principals don't have to do anything with the one they implicitly provide
<wrtp> fwereade__: i thought *all* services provided a juju-info relation
<wrtp> fwereade__: it kinda makes sense to me that that might be the case
<fwereade__> wrtp, yeah, I think that subordinates shouldn't
<wrtp> fwereade__: why not?
<wrtp> fwereade__: i think it could make sense to think of juju-info as something entirely orthogonal to the subordinate-principal relationship
<fwereade__> wrtp, because we have charms that both provide and require juju-info, and that's obviously crack
<fwereade__> wrtp, that's a nice idea, but I can't see how to make it work
<fwereade__> wrtp, (without changing existing charms...)
<fwereade__> wrtp, ok, alternative that might work
<wrtp> fwereade__: i'm not sure. if we say that all charms implicitly provide juju-info, and juju-info relation hooks refer to the requirer side, i think that *might* work
<fwereade__> wrtp, everything implicitly provides juju-info unless it explicitly requires juju-info
<fwereade__> wrtp, apart from anything else, surely the fact that we can have unresolvably ambiguous relation specs is enough to sink the idea that name collisions are ok
<wrtp> fwereade__: one possible way would be to move in a backwardly-incompatible way and prohibit people declaring a relation with the name "juju-info"
<wrtp> fwereade__: given that very few people seem to use this functionality so far, that might not be so bad
<wrtp> fwereade__: and then all the problems go away
<fwereade__> wrtp, could do, I guess -- but that feels like a punt on the big issue, which is that we appear to allow relation name collisions in charms
<wrtp> fwereade__: you can declare several relations with the same name?
<wrtp> fwereade__: that does seem like crack
<fwereade__> wrtp, yeah
<wrtp> fwereade__: i wonder if any charms do that
<wrtp> fwereade__: (other than the ones that declare juju-info, of course)
<fwereade__> wrtp, any with a juju-info relation do
<fwereade__> wrtp, ;p
<fwereade__> wrtp, I hope not :)
<wrtp> fwereade__: i think it makes sense to reserve all relations with the prefix "juju-"
<wrtp> fwereade__: or actually...
<wrtp> fwereade__: yeah, it makes sense to do that; then we can add more in the future without breaking charms.
<wrtp> fwereade__: but i think people could still write juju-info hooks
<wrtp> fwereade__: i don't really see why that's a problem yet
<fwereade__> wrtp, they are actually all meant to be reserved, I think :/
<wrtp> fwereade__: if we insist that noone can redeclare juju-info, then we don't have any ambiguity
<wrtp> fwereade__: in which case that ddclient charm is bogus, right?
<wrtp> fwereade__: (the other one is ok, as it doesn't name the relation "juju-info")
<fwereade__> wrtp, hell, the hook naming totally threw me there
<fwereade__> wrtp, yeah, that sounds sensible to me
<fwereade__> wrtp, I could swear I'd seen code explicitly forbidding explicit juju-* relations
<wrtp> fwereade__: we don't know that that charm actually works...
<fwereade__> wrtp, haha :/ I had hoped that charm store entry procedures were a little stricter, but I guess we already know they aren't all perfect
<fwereade__> wrtp, can you think of any reason not to insert the implicit relation in ReadMeta?
<wrtp> fwereade__: that's a good question
<wrtp> fwereade__: my gut says it's probably not a great idea, but...
<fwereade__> wrtp, yeah, I too feel discomfort, but I don't think any other place works
<wrtp> fwereade__: can't it be done in state, when the charm is added?
<wrtp> fwereade__: i think the charm package should prohibit the declaration of relations with a juju- prefix, but i'm not sure it should actively declare them.
<wrtp> fwereade__: that seems to me like something that's appropriate for the system implementation to do.
<wrtp> fwereade__: i'm kind of thinking that at some point in the future, we might see charms being used in different contexts, some of which might have a different set of implicitly declared juju- relations.
<fwereade__> wrtp, you may be right... I guess it's just that I'd prefer that the unit agent be able to determine whether a potential relation is valid by reading its own charm instead of having to hit remote state for it
<fwereade__> wrtp, actually that's no big deal, it's just a method that takes a charm.Charm
<fwereade__> wrtp, yeah, sgtm, cheers
<wrtp> fwereade__: cool
<wrtp> fwereade__: gustavo may well think differently, of course!
<fwereade__> wrtp, I understand that caveat to be implicit in everything I hear
<fwereade__> wrtp, or think ;)
<wrtp> lol
<fwereade__> wrtp, hm, the python explicitly disallows *provides* with a juju-* name or interface
<wrtp> fwereade__: i think it should disallow requires too
<fwereade__> wrtp, ISTM that we should disallow any juju-* name (or just "juju")
<fwereade__> wrtp, but I'm not sure we have any reason to mess with the interface
<wrtp> fwereade__: any juju- name
<fwereade__> wrtp, so "juju" should be allowed?
<wrtp> fwereade__: yeah, why not?
<fwereade__> wrtp, because it is misleading to see a "juju-info-relation-joined" hook
<wrtp> fwereade__: i'm not sure i see why
<fwereade__> wrtp, I'm almost tempted to say that `starts with "juju"` should be the criterion
<wrtp> fwereade__: why is juju-info-relation-joined misleading, even if there is a juju relation?
<fwereade__> wrtp, hold on, it's crack anyway
<fwereade__> wrtp, application-start has one of those
<fwereade__> wrtp, but it'll never get run afaict
<wrtp> fwereade__: why not?
<fwereade__> wrtp, the only explicit relation it has is juju; it may run if it uses its implicit juju-info relation
<wrtp> fwereade__: if you have a juju-info hook, then it must refer to the provider side, right? (assuming you're not allowed to explicitly declare a juju-info relation)
<wrtp> fwereade__: so there's no ambiguity and... no problem?
<Aram> wrtp: btw, regarding our discussion about " and =: = and == were in 10th edition unix, it just caught my eye yesterday :).
<fwereade__> wrtp, guess so... still worried about the inconsistencies on charm upgrade when we add such hooks
<Aram> also, the rc manual in the 10th edition manual mentions plan 9; I wonder when plan 9 development started.
<wrtp> Aram: yeah, i think i used "=" first; then i added "==" (without knowing about the 10th edition version); then i moved to rc and started using - and -- instead
<wrtp> fwereade__: what kind of inconsistencies are you thinking about?
<wrtp> fwereade__: it would be easy to disallow juju-info hooks in the first instance, anyway
<wrtp> Aram: i remember seeing plan 9 around just after i'd left university, around '91. charles had built his own clone of it (called Orbit) before it was made publicly available.
<Aram> really? did he make any code available?
<wrtp> Aram: i might have a pirated copy on magtape somewhere :-)
<wrtp> Aram: he got the whole comp sci year using it for a year or so
<fwereade__> wrtp, most relations won't even be joined if the charm doesn't explicitly provide them; but the juju-info relation will be processed as usual
<fwereade__> wrtp, if we upgrade to a version that does provide hooks, it won't see the expected events
<fwereade__> wrtp, it will see departs of units that never joined, etc
<wrtp> fwereade__: i see.
<fwereade__> wrtp, this *might* not be a big deal in practice but it involves dropping guarantees that STM to be useful
<wrtp> fwereade__: i think that that's probably good enough reason to disallow implementation of juju-* hooks
<wrtp> fwereade__: it kinda makes sense that the system is providing all interactions with juju- relations.
<wrtp> fwereade__: one could see it as implicitly declaring the necessary hooks
<wrtp> fwereade__: so it doesn't make sense for a charm to redefine them
<fwereade__> wrtp, yeah, this sounds reasonable
<fwereade__> wrtp, that then means that we should disallow "juju" as a relation name as well, otherwise we won't be able to write juju-relation-joined hooks
<fwereade__> s/otherwise/because/
<wrtp> fwereade__: i'm not sure i follow
<fwereade__> wrtp, disallow implementation of juju-* hooks => juju-relation-joined not ok => cannot meaningfully declare a "juju" relation
<wrtp> fwereade__: i don't get the first inference. we disallow implementations of hooks for relations with a juju- prefix. but the relation name in juju-relation-joined is "juju" not "juju-"
<fwereade__> wrtp, ah, I thought you were advocating the simpler "no hook can start with `juju-`" rule
<fwereade__> wrtp, which honestly I would prefer
<wrtp> fwereade__: i think it's easier to think in terms of relation names
<wrtp> fwereade__: (HasPrefix(name, "juju-") || name == "juju") is more complex and arbitrary, IMHO, than just the HasPrefix test
<fwereade__> wrtp, I think specifying that no files in charm/hooks can start with "juju-" is the absolute simplest way to get what we need
<wrtp> fwereade__: and i don't see that we'll ever need to define a "juju" relation.
<fwereade__> wrtp, why should we allow others to?
<wrtp> fwereade__: because it's a convenient name to use for the requirer side of a juju-info relation?
<fwereade__> wrtp, doesn't this presuppose that we won't have other juju- relations?
<wrtp> fwereade__: i don't think so.
<fwereade__> wrtp, it's only a good name so long as juju-info is the only juju- one available
<wrtp> fwereade__: a better name would probably be principal-info, right enough
<wrtp> fwereade__: tbh i wouldn't mind if we just reserved all juju* names.
<fwereade__> niemeyer, heyhey
<wrtp> niemeyer: yo!
<fwereade__> wrtp, that's what I'd prefer
<niemeyer> Good morning!
<fwereade__> niemeyer, I was wondering, did you see the prereq of the CL you approved last night? I have a suspicion it may have got caught in the miscommunication
<niemeyer> fwereade__: Let me check
<fwereade__> niemeyer, https://codereview.appspot.com/6678046/
<fwereade__> niemeyer, you have reviewed it once but that was a while ago
<niemeyer> fwereade__: Ah, sorry.. I didn't realize it was a pre-req.. I reviewed the other one in an attempt to unblock you, but I has fail
<fwereade__> niemeyer, np at all, I have other things I have been thinking about usefully
<fwereade__> niemeyer, I will definitely want a call about charms and relations at some point
<niemeyer> fwereade__: I'm game at any time
<fwereade__> niemeyer, it's just that I'm getting peckish and will probably disappear for lunch soonish ;p
<niemeyer> fwereade__: Sounds like a good plan :-)
<fwereade__> niemeyer, ok, I will be back in a little while, hopefully the various issues will settle into some sort of order in my mind :)
<wrtp> niemeyer: a bootstrap can "succeed" several times, because the only way it can find out if the environment is already bootstrapped is by reading the environment's storage, which might not be available because of eventual consistency.
<wrtp> niemeyer: it's possible that moving to using tags might help this issue, but i'm not sure.
<niemeyer> wrtp: We shouldn't allow for that to happen
<niemeyer> wrtp: Otherwise it's ad-hoc
<wrtp> niemeyer: do you have an idea for how we might prevent it?
<wrtp> niemeyer: we could put a sleep at the start of Bootstrap, i suppose
<niemeyer> wrtp: We can read the information we just wrote, for example
<niemeyer> wrtp: Before confirming it worked
<wrtp> fwereade__: how does that help?
<wrtp> niemeyer: how does that help?
<wrtp> fwereade__: (didn't mean to address that to you, sorry)
<wrtp> niemeyer: BTW could we have a chat about TLS certs some time today?
<niemeyer> wrtp: Well, the idea is ensuring data we just wrote is visible
<wrtp> niemeyer: doing that does not ensure that, unfortunately
<wrtp> niemeyer: a read can succeed and then fail
<wrtp> niemeyer: (there's one live test that prints a *sigh* message when that happens, and i see it reasonably often)
<wrtp> niemeyer: it would probably reduce the frequency of Bootstrap succeeding twice in succession, but i don't think it would stop it entirely.
<niemeyer> wrtp: It's not just about Bootstrap.. this isn't right either:
<niemeyer>                 r, err = storage.Get(name)
<niemeyer>                 if err == nil {
<niemeyer>                         break
<niemeyer>                 }
<wrtp> niemeyer: i was just looking at that
<wrtp> niemeyer: yes, that's wrong.
<wrtp> niemeyer: storage.Get does the retry anyway, so it's unnecessary for the tests to do that
<niemeyer> wrtp: Eventual consistency sucks to deal with, and we should take the bullet to offer people a sane API
<wrtp> niemeyer: i agree.
<wrtp> niemeyer: but in some cases i'm not sure it's possible
<niemeyer> wrtp: If accessing data is *entirely* random, then I can't see why people use S3
<wrtp> niemeyer: it's not entirely random. it works... eventually.
<niemeyer> wrtp: Yep, so we should wait until it does
<wrtp> niemeyer: for the record, i see the *sigh* message relatively often.
<wrtp> niemeyer: we do!
<wrtp> niemeyer: but in these cases, the request succeeds, so there's nothing to wait for
<niemeyer> wrtp: Obviously a request succeeding doesn't seem to mean much
<wrtp> niemeyer: if we delete a file, we can sometimes fetch it even after the delete request. that request is succeeding, but it shouldn't. it will fail eventually though, but the thing doing the Get can't know that it was expected to fail and hence retry.
<niemeyer> wrtp: I understand what eventual consistency means, and I also understand that S3 has it
<wrtp> niemeyer: i'm not sure how you expect me to fix the problem while removing the loops in jujutest then
<niemeyer> wrtp: I'm thinking as well
<niemeyer> wrtp: So, I think the issue is on loading, rather than on writing
<niemeyer> wrtp: It'd suck to wait a long time for things to show up when we're writing sequentially several charms or whatever
<niemeyer> wrtp: But it seems that in general files we read from S3 are in well known locations, that should be there
<wrtp> niemeyer: i don't yet see where you're going with this
<niemeyer> wrtp: Bootstrap doesn't tell you that there was a previous bootstrap because it fails to get the previous file
<niemeyer> wrtp: We already have a retry strategy on Get.. why is it not working?
<wrtp> niemeyer: look at the start of the Bootstrap method
<wrtp> niemeyer: we fail when the Get *succeeds*
<niemeyer> <niemeyer> wrtp: Bootstrap doesn't tell you that there was a previous bootstrap because it fails to get the previous file
<wrtp> niemeyer: yeah, i was wrong there
<wrtp> niemeyer: it can only know if there was a previous bootstrap if it can get the file which was created by the previous bootstrap.
<niemeyer> Yep
<wrtp> niemeyer: actually, Bootstrap should do the loop itself
<wrtp> niemeyer: because it knows that it's expecting the file not to be there.
<wrtp> niemeyer: that would erase the jujutest bootstrap loop at any rate, if not the others.
<wrtp> niemeyer: it means Bootstrap will always take 5 seconds when the environment is already bootstrapped, but that might not be a problem.
<niemeyer> wrtp: Hmm
<niemeyer> wrtp: Why would it do the loop itself?
<niemeyer> wrtp: Get does the loop already
<wrtp> niemeyer: Get only loops if the file is not found
<niemeyer> wrtp: We just have to improve Get until it works more reliably
<niemeyer> wrtp: Yes, and that's what we need
<wrtp> niemeyer: in this case the file *is* found, so it won't loop
<wrtp> niemeyer: and that's a problem for Bootstrap, because it gives an error when the file is found.
<niemeyer> wrtp: If the file is found, Bootstrap can fail
<wrtp> niemeyer: exactly.
<niemeyer> wrtp: Immediately
<niemeyer> wrtp: I mean it *can* fail
<wrtp> niemeyer: that's a problem if we do a Destroy followed by a Bootstrap.
<niemeyer> wrtp: Destroy kills the whole bucket
<wrtp> niemeyer: yes, but we might still be able to fetch that bucket even after it's been killed
<wrtp> niemeyer: which will cause Bootstrap to fail inappropriately.
<niemeyer> wrtp: Okay, so perhaps we should make Destroy more reliable instead
<niemeyer> wrtp: Because that's the weak link
<wrtp> niemeyer: so we make Destroy always take 5 seconds?
<niemeyer> wrtp: We can have slightly more deterministic logic by having it retry until it can no longer see the file in a few different tries
<niemeyer> wrtp: That means we don't slow down tests much
<wrtp> niemeyer: that doesn't necessarily mean that a subsequent fetch won't still succeed
<niemeyer> wrtp: Heh
<niemeyer> wrtp: It won't ever mean that, whatever we do
<niemeyer> wrtp: Even hours later S3 can still pull off a TA-DA! moment
<niemeyer> wrtp: But we can at least try to bring some sanity
<wrtp> niemeyer: yeah
<wrtp> niemeyer: but it means we still need the Bootstrap loop if we're to make the tests reliable
<niemeyer> wrtp: I don't see why.. bootstrap uses loadState which uses Get
<wrtp> niemeyer: ... which can succeed even after the bucket has been destroyed and we've verified that we get a 404 error.
<niemeyer> wrtp: We've just said we'd improve Destroy?
<niemeyer> wrtp: What has to change in Bootstrap?
<wrtp> niemeyer: that's the "verified that we get a 404 error" bit
<wrtp> niemeyer: nothing has to change in Bootstrap. i'm talking about the loop in the test that calls  Bootstrap.
<wrtp> niemeyer: (the one that triggered this discussion)
<niemeyer> wrtp: Erm..
<niemeyer> wrtp: Yes.. I'm a bit lost now.. all we've been saying above is to make things more reliable
<niemeyer> wrtp: If we still have to loop in tests, the whole point is moot
<wrtp> niemeyer: yes, but it won't be that reliable, even with the change above.
<niemeyer> wrtp: Why?
<wrtp> niemeyer: because verifying that the Get fails after Destroy doesn't verify that the Get called in Bootstrap won't subsequently succeed.
<wrtp> niemeyer: personally, i *think* the loop in the test is the lesser evil. much better would be if we could talk to *something* in amz that was fully consistent.
<niemeyer> wrtp: If the Get sequentially fails several times at Destroy, and then Bootstrap can see the file again, too bad.. let the test blow
<niemeyer> wrtp: Much better is if we don't use S3 at all.. that's where we have to go
<wrtp> niemeyer: how many times do we try? what if this causes the test to fail 10% of the time?
<niemeyer> wrtp: We improve it until it's 1% or 0.01%
<wrtp> niemeyer: i'm not entirely sure that avoiding S3 will help here.
<niemeyer> wrtp: Erm?
<niemeyer> wrtp: Everything we've been talking about for the past hour is about S3
<wrtp> niemeyer: what's the alternative? tags? surely they'll suffer from a similar problem?
<niemeyer> wrtp: An internal storage
<wrtp> niemeyer: how does that help with Bootstrap?
<niemeyer> wrtp: True..
<niemeyer> wrtp: Either way, let's not derail
<niemeyer> wrtp: The test is great.. it's showing the API is flaky.. there's nothing to fix there
<wrtp> niemeyer: what about the other tests that loop?
<niemeyer> wrtp: Same thing.. we have a test that shows storage.Get failing, and you say it fails often
<niemeyer> wrtp: Sounds to me like we should improve storage.Get too
<wrtp> niemeyer: by causing it to fetch several times?
<wrtp> niemeyer: it does already retry on error
<niemeyer> wrtp: Or perhaps try more often
<niemeyer> wrtp: We should have a live test specifically for it, showing whether it works reasonably well or not
<wrtp> niemeyer: ISTM that these tests are doing something that people won't be doing much in normal usage - we don't really care if something succeeds when it shouldn't most of the time.
<niemeyer> wrtp: ISTM that we don't know how people will be using it
<niemeyer> wrtp: destroy+bootstrap is not uncommon at all, for example
<wrtp> niemeyer: i'm not sure what the Get test you're suggesting would do, and how Get might be improved. i don't think we want to retry when Get succeeds. nor, probably, do we want to slow down every Put or Delete by doing a sequence of Gets after it.
<niemeyer> wrtp: It would verify how reliable Get looks like when we put something
<niemeyer> wrtp: We want to solve the instability you see in the test by fixing the API, rather than by looping in the test
<niemeyer> wrtp: We want to avoid looping in the test, because it shows very unstable behavior in the API itself.. we don't want that to be our high-water mark for all the providers.
<wrtp> niemeyer: i'd like to fix the API (and we *have* fixed it when things fail), but i'm not sure how we can fix it without slowing everything down to full-eventual-consistency pace.
<niemeyer> wrtp: I've just explained
<wrtp> niemeyer: i fully agree that it's unfortunate
<wrtp> niemeyer: ok, so suppose i write the test and it shows that in 40% of cases, a Get succeeds when it should not.
<wrtp> niemeyer: what then?
<niemeyer> wrtp: Then that's REALLY BAD isn't it!?
<niemeyer> wrtp: If Destroy+Bootstrap fails 40% of the times, we suck I think
<wrtp> niemeyer: that's S3 for you :-(
<niemeyer> wrtp: Nope.. that's juju developers for you
<niemeyer> wrtp: S3 seems to work fine for a lot of people
<niemeyer> wrtp: storage.Get succeeding is fine
<wrtp> niemeyer: we can easily make Destroy+Bootstrap fully reliable by making Bootstrap wait for eventual consistency to resolve before returning an error, as Get does.
<niemeyer> wrtp: It's the other case that is a lot more rare: there are very few spots where we care about content being *deleted*
<wrtp> niemeyer: Get("nonexistent-thing") will currently always take 5 seconds.
<niemeyer> wrtp: I don't see how that's relevant
<wrtp> niemeyer: i agree. but that's what these tests are testing.
<niemeyer> wrtp: Bootstrap uses loadState, which uses Get.. Bootstrap is fine as it is
<wrtp> niemeyer: our current technique for avoiding eventual consistency issues is to try to avoid errors.
<wrtp> niemeyer: so, if we're about to return an error, we try a few times to make sure that we've *really* got an error.
<wrtp> niemeyer: ISTM that Bootstrap fits into that category.
<niemeyer> wrtp: Sorry.. can we get to actual use cases so we can start to funnel the conversation towards agreement?
<niemeyer> wrtp: We're talking for more than an hour, so it's time to reach some conclusions
<niemeyer> wrtp: Destroy+Bootstrap fails and we want it to work..
<niemeyer> wrtp: We can make that more reliable by having Destroy wait until the file at least looks gone
<wrtp> niemeyer: my suggestion is to make Bootstrap not fail until the bootstrap bucket definitely isn't disappearing
<niemeyer> wrtp: I don't know what that means
<niemeyer> wrtp: Probably because there are three negatives
<wrtp> niemeyer: your suggestion is *more* reliable, but it still isn't reliable. i believe my solution is reliable (well, as reliable as our Get heuristics)
<wrtp> niemeyer: this is what i suggest for the start of the Bootstrap method: http://paste.ubuntu.com/1285011/
<niemeyer> wrtp: We're almost in agreement
<wrtp> niemeyer: the down side is that a failed Bootstrap gets slower. but that's what happens with a failed Get too, so comparable cases, i think.
<niemeyer> wrtp: The difference is that you're doing the verification that should be done in Destroy within Bootstrap
<niemeyer> wrtp: That's effectively what this is doing.. it's waiting until Destroy actually takes place.. we can do that in Destroy itself
<wrtp> niemeyer: i'm not sure we can do that and ensure that the subsequent Bootstrap will succeed, without waiting for the full eventual consistency timeout.
<wrtp> niemeyer: and perhaps that's actually best - we'd just delay destroy-environment for a while.
<niemeyer> wrtp: That's true
<niemeyer> wrtp: Okay, sounds good
<wrtp> niemeyer: so we add a sleep to Destroy?
<niemeyer> wrtp: What? :)
<wrtp> niemeyer: or... what sounds good to you?
<niemeyer> wrtp: Your proposal..
<wrtp> niemeyer: ok, cool.
<niemeyer> wrtp: I think storage.Get likely deserves some improvement too to sort out that issue you see frequently
<wrtp> niemeyer: i'll still need to leave the loops in some of the other jujutest tests, i think.
<niemeyer> wrtp: I'd prefer to nail down problems rather than looping
<niemeyer> wrtp: For all the reasons we already covered
<niemeyer> wrtp: We're nailing one of them
<wrtp> niemeyer: i only see it frequently because the test is checking Get after Delete, which i don't think is something that we care too much about in real-world code.
<niemeyer> wrtp: Let's solve that first one, and then we see the others
<niemeyer> wrtp: So let's not do that
<wrtp> niemeyer: remove that test?
<niemeyer> wrtp: Well, I suppose this is testing Delete?
<wrtp> niemeyer: yes
<wrtp> niemeyer: there's also one in the test that tests List, i think.
<niemeyer> wrtp: List is a tough one, because we don't know what we're waiting for..
<wrtp> niemeyer: in fact it's all in TestFile
<wrtp> niemeyer: yup
<niemeyer> wrtp: I guess those loops for the storage method itself are fine, at least for now
<niemeyer> wrtp: It's the environment interaction we care the most about
<wrtp> niemeyer: yeah
<wrtp> niemeyer: i'm happy to have Bootstrap work a little more reliably.
<niemeyer> wrtp: Me too
<niemeyer> wrtp: I think storage.Get may need some tweaking too
<niemeyer> wrtp: But we'll see
<wrtp> niemeyer: what kind of tweak are you thinking of?
<niemeyer> wrtp: Perhaps just reducing the time between retries
<niemeyer> wrtp: So we don't increase the overall time further but improve the method a tad
<wrtp> niemeyer: i'm not sure that helps here. we're talking about the case where it doesn't retry at all, because the Get succeeds.
<wrtp> niemeyer: we could retry even when the fetch succeeds, but i *think* that's unnecessary.
<niemeyer> wrtp: I was considering something else, but it doesn't really matter right now.. sorry for the noise.
<wrtp> niemeyer: np
<wrtp> niemeyer: i'm sorry i'm bad at explaining things!
<niemeyer> wrtp: I don't think you're bad at explaining things.
<wrtp> niemeyer: it felt like i wasn't explaining things well above, but we got there!
<niemeyer> wrtp: I don't think it was a problem in the explanation..
<niemeyer> <wrtp> niemeyer: nothing has to change in Bootstrap. i'm talking about the loop in the test that calls  Bootstrap.
<niemeyer> wrtp: That was 1h ago..
<niemeyer> wrtp: It took 1h for us to agree to solve the actual problem without looping in tests.
<wrtp> that was in response to talking about making Destroy try the get, i think
<fwereade__> niemeyer, take a look at http://paste.ubuntu.com/1285056/ -- I think it roughly covers the areas I'm thinking about
<niemeyer> wrtp: <wrtp> niemeyer: i'd like to fix the API (and we *have* fixed it when things fail), but i'm not sure how we can fix it without slowing everything down to full-eventual-consistency pace.
<niemeyer> wrtp: Etc etc
<wrtp> niemeyer: to be fair, i had suggested the current solution some time before my "nothing has to change" remark
<wrtp> niemeyer: and our API is still broken (Bootstrap is less so now, happily), and we can't fix it
<niemeyer> wrtp: Nevermind.. You rock.
<niemeyer> fwereade__: Checking it out
 * wrtp wishes he did
<niemeyer> fwereade__: 0) that seems crazy indeed
<niemeyer> fwereade__: The name is precisely the way in which the charm uniquely identifies the relation
<fwereade__> niemeyer, cool
<wrtp> niemeyer: this might do it: https://codereview.appspot.com/6696043/
<niemeyer> fwereade__: 1a) +1
<wrtp> niemeyer: (waiting on live tests to complete a few times before i believe it)
<niemeyer> fwereade__: 1b) "but allowing freedom to implement any interfaces": I don't think we should allow people to implement juju-* interfaces
<fwereade__> niemeyer, they have to implement it to talk to juju-info
<fwereade__> niemeyer, (whose interface is also juju-info)
<niemeyer> fwereade__: Okay, that's requiring the interface, specifically
<niemeyer> fwereade__: Although, you could say that means implementing it too
<niemeyer> fwereade__: So my bad in the wording
<fwereade__> niemeyer, and honestly I don't see any reason to *stop* people from implementing non-juju-namespaced relations that happen to be acceptable substitutes
<niemeyer> fwereade__: What I meant is that we shouldn't allow people to provide juju-info
<fwereade__> niemeyer, it'd be kinda dumb but harmless I think
<fwereade__> niemeyer, ah, not harmless then?
<niemeyer> fwereade__: No, juju-* is reserved
<niemeyer> fwereade__: We shouldn't break people's charms if we decide to implement juju-mama tomorrow
<niemeyer> fwereade__: If we don't reserve it, we can
<niemeyer> We can break, I mean
<fwereade__> niemeyer, ok, sgtm
<fwereade__> niemeyer, so: you can't call a relation juju or juju-*; you can't provide an interface called juju or juju-*; anything else is ok?
<niemeyer> fwereade__: +1
<niemeyer> fwereade__: 1c) Charm metadata is definitely not the place to insert implicit relations
<niemeyer> fwereade__: Otherwise they'd not be implicit
<niemeyer> fwereade__: If we inserted, that would affect several places we don't want to touch
<niemeyer> fwereade__: e.g. the store
<niemeyer> fwereade__: and it'd also mean that old charms don't get new implicit relations
<fwereade__> niemeyer, that applies to the metadata file... when about the Meta type?
 * fwereade__ regards "when about" with horror
<niemeyer> fwereade__: LOL
<niemeyer> fwereade__: I was about to put that in my vocabulary.. you have to watch out when you talk to me
<fwereade__> niemeyer, (actually, derail for now, it will be relevant again soon)
<fwereade__> niemeyer, nah, just an incompetent edit
<niemeyer> fwereade__: Meta reflects the metadata
<niemeyer> fwereade__: It's actually stored in the store
<niemeyer> fwereade__: a terrible evil will fall upon us if we hack that nice piece of immutable information
<fwereade__> niemeyer, right, cool, the "doesn't feel right" is accurate
<niemeyer> fwereade__: Okay, in sync, so going to 2
<niemeyer> fwereade__: 2a) +1.. I think that was covered in 1b
<fwereade__> niemeyer, yeah, there is some overlap
<niemeyer> fwereade__: I mean, +1 to the (no)
<fwereade__> niemeyer, cool
<niemeyer> fwereade__: 2b) Perhaps nothing.. let's handle them silently within the uniter if it reaches it, and blow up at the door when bundling/loading the charm
<niemeyer> So the uniter doesn't have to know about these conventions
<fwereade__> niemeyer, in code terms, I was thinking of just refusing to load such abominations
<fwereade__> ;)
<niemeyer> fwereade__: That's what I meant.. we don't have to special case in the uniter, because we don't bundle them
<fwereade__> niemeyer, cool, perfect
<niemeyer> fwereade__: 2c) Yes, people should certainly be able to run a hook for juju-info.. but it's not clear if that's the question you asked
<fwereade__> niemeyer, oh ok -- I was expecting to be able to relate *to* juju-info, and run hooks in response to a counterpart's juju-info, but I was not expecting charmers to implement, say, juju-info-relation-joined
<fwereade__> niemeyer, what does that tell you other than "a subordinate exists"?
<fwereade__> niemeyer, except it doesn't even tell you that
<fwereade__> niemeyer, because juju-info has global scope
<fwereade__> niemeyer, it just means "something you don't know about is in a relation with you"
<niemeyer> fwereade__: Okay, let's break that apart
<fwereade__> niemeyer, and I can't figure out how that information is useful to anyone
<niemeyer> fwereade__: juju-info-relation-joined means the relation *name* is juju-info, which is.. hmm.. interesting
<niemeyer> fwereade__: We said before we'd disallow it, but I'm not sure we should, now that I think of it
<fwereade__> niemeyer, I have one concern there
<niemeyer> fwereade__: First, because it doesn't really matter.. that's the user-provided name for the relation
<niemeyer> fwereade__: We don't really care, I think
<niemeyer> fwereade__: Second, because we offer convenient shortcut notation in charms that make relation-name == relation-interface
<niemeyer> fwereade__: So, I think relation *names* that have juju* sound okay, in principle. What do you think?
<fwereade__> niemeyer, the specific thing I want to block is an author who responds to changes he shouldn't know about, by doing things in response to the implicit juju-info relation changing
<niemeyer> fwereade__: Before we derail, does the above sound sane?
<fwereade__> niemeyer, blocking juju entirely seemed like the simplest and clearest way to ensure the situation didn't come up in future
<niemeyer> fwereade__: is there a reason to prevent a relation *name* from being named juju*
<niemeyer> fwereade__: Blocking a relation name doesn't do anything.. it has absolutely no semantics attached to it
<niemeyer> fwereade__: Other than being an identifier
<fwereade__> niemeyer, I thought it was just a reserved-for-future-use deal
 * fwereade__ is typing, er, "dynamically" today
<niemeyer> fwereade__: The point I'm considering is that: a) It's useful to have it because we have syntax that makes relation-name == relation-interface convenient, and we have relation interfaces called juju-*;
<niemeyer> fwereade__: b) There's nothing to reserve. It's an identifier without any semantics attached to it other than being a way to reference the relation by the local charm author
<fwereade__> niemeyer, ok, we have one relation interface called juju-info, at this stage, that *everything* provides
<niemeyer> fwereade__: interface != name
<fwereade__> niemeyer, if we do a shortcut "juju-info" relation then the name will collide
<niemeyer> fwereade__: Everything I said above is about the relation name, not the interface
<fwereade__> s/interface //
<fwereade__> niemeyer, we can't allow juju-info because the name collides with the implicit relation
<niemeyer> fwereade__: If a name collides, it will blow up.. that's a separate constraint that should necessarily be enforced for every relation, regardless of its name
<fwereade__> niemeyer, if we allow juju-* relation names we are setting ourselves up for future collisions
<niemeyer> fwereade__: I don't see how that's relevant at all
<niemeyer> fwereade__: Or perhaps, more clearly, I don't see how we'd be setting ourselves up for future collisions
<niemeyer> fwereade__: Relation names are a local namespace
<fwereade__> niemeyer, ISTM that we will not be free to pick sensible names for future implicit relations, because other charms might already be using those names for something different
<niemeyer> fwereade__: I don't think that's true
<niemeyer> fwereade__: Can you provide a short example? It'll elucidate the point
<fwereade__> niemeyer, imagine we wanted to introduce juju-info, but charms already existed which use that relation name... what is the appropriate course of action? break the charms, or just call it "juju-info-no-really-this-is-the-official-one"? ;p
<fwereade__> niemeyer, I don't know what other implicit relations we may want to introduce
<niemeyer> fwereade__: We introduce the *interface* juju-info, and absolutely nothing breaks at all
<fwereade__> niemeyer, so what do we name the relation then?
<niemeyer> fwereade__: We don't.. it's not our name.. we don't care
<fwereade__> niemeyer, it *is* our name
<fwereade__> niemeyer, we implicitly provide juju-info
<niemeyer> fwereade__: Ahh, I see. You're wondering about the provider side
<niemeyer> fwereade__: It's a good point.. hmm
<fwereade__> niemeyer, I'm a little bewildered - I *think* I'm just reiterating that names should be unique within a charm -- across roles
<niemeyer> fwereade__: across roles?
<fwereade__> niemeyer, a provider called foo and a requirer called foo?
<niemeyer> fwereade__: Ah, no no.. that's definitely not ok
<niemeyer> fwereade__: Must be unique
<niemeyer> fwereade__: I just never thought about the charm referencing the provider relation locally
<niemeyer> fwereade__: The implicit one, that is
<niemeyer> fwereade__: Again, it's a good point.. it just escaped me
<fwereade__> niemeyer, yeah, I only came across it this morning
<niemeyer> fwereade__: So here is a suggestion that preserves my original intention, and I think sorts out the problem you bring: we do allow juju* relation names, as long as they match the interface name
<niemeyer> fwereade__: In other words, it's fine for a requirer juju-info relation to be named juju-info
 * fwereade__ thinks a bit
<fwereade__> niemeyer, I'm not sure that helps
<niemeyer> fwereade__: Ah, indeed
<fwereade__> niemeyer, what does `svc1:juju-info` refer to?
<niemeyer> fwereade__: requirer and provider would conflict
<niemeyer> Damn
<niemeyer> fwereade__: Okay, +1 on not allowing it.. it's clearly non trivial and I'm digging a hole
<fwereade__> niemeyer, cool, cheers
<niemeyer> fwereade__: Okay, another point on your comment: juju-info has global scope
<niemeyer> fwereade__: I don't think that's the case
<niemeyer> fwereade__: The *relation* has whatever scope the requirer gives it
<niemeyer> fwereade__: Otherwise we'd not be able to use juju-info for the very reason it was created
<fwereade__> niemeyer, ok, yes -- what I mean is that it *allows for* globally scoped relations, and I'm not sure it has any value in that instance
<fwereade__> niemeyer, I am almost certainly missing something here though
<niemeyer> fwereade__: It provides the ip address of the related unit
<niemeyer> fwereade__: So there's *some* value
<fwereade__> niemeyer, yeah, I guess we could have an automated try-to-hack-this-box charm we could relate to everything :)
<fwereade__> niemeyer, ok, objections withdrawn :)
<niemeyer> fwereade__: That said, I'm not sure about why this is relevant to be honest.. it's just another relation, that can be global or container, and behaves like anything else in that sense
<niemeyer> fwereade__: Am I missing something?
<fwereade__> niemeyer, I think it's just a derail, sorry :)
<niemeyer> fwereade__: Indeed, but given that we've derailed.. can we agree it's just like any other relation in that sense?
<niemeyer> fwereade__: I just want to make sure I'm not missing yet another aspect
<fwereade__> niemeyer, yes, certainly
<niemeyer> fwereade__: Cool, thanks
<niemeyer> fwereade__: (to me there's value in things not being special :-)
<niemeyer> So, where were we..
<niemeyer> 2c was last I think
<niemeyer> Ah, and we didn't yet answer it
<niemeyer> fwereade__: People should be able to write requirer hooks for juju-info relations
<fwereade__> niemeyer, my feeling still leans toward "no" -- the idea of juju-info, in particular, is that it's what you relate to when the other charm doesn't know anything about you
<fwereade__> niemeyer, ok, you mean relations with the juju-info interface, not the name, right?
<niemeyer> fwereade__: Yes
<fwereade__> niemeyer, I have no arguments there
<fwereade__> niemeyer, but that will not be a hook called juju-info-anything
<niemeyer> fwereade__: -1 as well on juju-info *provider* hooks
<fwereade__> niemeyer, it might be called, say, principal-info-something
<niemeyer> fwereade__: I don't think the name of the hook matters much
<niemeyer> fwereade__: It could be called foo-relation-joined and still be a juju-info hook
<fwereade__> niemeyer, ok, I am thinking from the name perspective here, because those are the things that can lead to collisions
<niemeyer> fwereade__: We've already decided on the name
<niemeyer> fwereade__: I'm talking about relation hooks that respond to relations with the juju-info interface
<fwereade__> niemeyer, let me restate what I'm wondering
<fwereade__> niemeyer, given the relation naming restrictions, would it be ok for us to just declare the whole juju namespace in the hooks dir out of bounds?
<fwereade__> niemeyer, if we *don't* then we run the risk of one of those names matching a future hook and firing as a hook when it's not expected
<fwereade__> niemeyer, that is always a risk with any file in that dir to be fair
<niemeyer> fwereade__: That doesn't look like 2c
<fwereade__> niemeyer, ok, smaller question
<fwereade__> niemeyer, is it ever sane to have a file named "juju-info-relation-joined" in your hooks dir?
<niemeyer> fwereade__: It does the exact same thing as a file named can-I-has-catz
<fwereade__> niemeyer, how do we stop it from running? special-casing in the uniter?
<niemeyer> fwereade__: Why would we have to special case?
<niemeyer> fwereade__: It does absolutely nothing.. there's nothing interesting about that name
<fwereade__> niemeyer, it will be executed when the provider joins the relation, won't it?
<fwereade__> niemeyer, unless we explicitly avoid running all such hooks
<niemeyer> fwereade__: Aha, okay..
<niemeyer> <niemeyer> fwereade__: -1 as well on juju-info *provider* hooks
<niemeyer> fwereade__: So you want to avoid these hooks by scanning the directory and banning everything hooks/juju*?
<fwereade__> niemeyer, essentially, yes
<fwereade__> niemeyer, or possibly explicitly ignoring them in the uniter
<fwereade__> niemeyer, either way smacks of icky special-casing, not sure which is worse
<niemeyer> fwereade__: That sounds like a good idea
<niemeyer> fwereade__: I think we should do better, in fact
<niemeyer> fwereade__: Ban *all* unwanted files from hooks/*
<niemeyer> fwereade__: We can't do that in the short term, but we should start warning people about that asap
<fwereade__> niemeyer, hmmmmmmm a lot of people use a bunch of symlinks to a single implementation file
<niemeyer> fwereade__: That works, as long as the implementation file is one of the hooks
<fwereade__> niemeyer, can't remember offhand, but I'm not sure it is
<niemeyer> fwereade__: Understood.. I'm sure it'll break what people do right now
<fwereade__> niemeyer, (in general, in the cases I've seen...)
<niemeyer> fwereade__: Which is why we cannot push in the short term, but can start warning asap
<fwereade__> niemeyer, ok, regardless this is a direction statement of which I approve, I will add some deprecation warnings
<fwereade__> niemeyer, and also explicitly skip juju-* "hooks" in the uniter, I guess?
<niemeyer> fwereade__: It avoids the future-hooks issue, and establishes a good convention on which the juju* ban works
<niemeyer> fwereade__: I'd prefer to explicitly forbid *those* hooks right away
<niemeyer> fwereade__: The uniter is too late
<niemeyer> fwereade__: It'll blow people up way down the pipeline
<fwereade__> niemeyer, ok, SGTM -- error on Read hooks/juju-*; warning on Read hooks/anything-not-referenced-elsewhere?
<niemeyer> fwereade__: +1
<fwereade__> niemeyer, ok, great
<niemeyer> fwereade__: Wait, hmm..
<niemeyer> fwereade__: There are forward compatibility issues I think
<niemeyer> fwereade__: It'd mean an old juju version cannot deploy any charms that expose a new hook implementation
<fwereade__> niemeyer, hmm
<niemeyer> fwereade__: We don't have to solve that now
<niemeyer> fwereade__: Let's agree on what seems clear: no hooks/juju*
<fwereade__> niemeyer, perfect
<niemeyer> Awesome
<niemeyer> 2c) Check
<niemeyer> 2d) Block them when reading?
<fwereade__> niemeyer, +1
<niemeyer> Cool
<fwereade__> niemeyer, 2e, 2f covered
<niemeyer> 2e) Definitely not
<fwereade__> niemeyer, no, block when reading
<niemeyer> Cool
<niemeyer> 3
<niemeyer> 3a) Yes
<fwereade__> niemeyer, note that we should be careful about charm store releases once we've tightened these up
<niemeyer> fwereade__: Yeah, I think it's okay
<niemeyer> fwereade__: But we'll see
<fwereade__> niemeyer, at least the 2 footnoted charms will be refused, and probably others
<niemeyer> fwereade__: 2b) I don't understand the point
<niemeyer> fwereade__: Probably lost on "charm URLs match are for"
<fwereade__> niemeyer, er, I apparently cannot haz craggar
<fwereade__> er, grammer :/
<fwereade__> GAAAH
 * fwereade__ takes a deep breath
<niemeyer> :)
<fwereade__> niemeyer, ISTM that the right way to accomplish 3a is to Assert that the services' charms are the same charms we used to determine that the endpoint list was OK
<fwereade__> niemeyer, it shouldn't in itself be controversial, it's just setting up 3c
<niemeyer> fwereade__: I don't think that's the case
<fwereade__> niemeyer, cool, go on
<niemeyer> fwereade__: I think I see what you mean, actually, and agree
<fwereade__> niemeyer, ah, ok, great
<niemeyer> fwereade__: We need to assert that the charm still has the relation
<niemeyer> fwereade__: There's a red-herring in that description regarding the charm URL not having changed, which isn't the case, but it doesn't matter
<fwereade__> niemeyer, please explain, I don't see it
<niemeyer> fwereade__: "charm URLs match are for the
<niemeyer>     charms from which those endpoints have been determined"
<niemeyer> fwereade__: The determination of the endpoints is done at time T1.. the adding of relation is done at T2
<fwereade__> niemeyer, ISTM that the services' charm urls (and lifes) are exactly what we need to assert are what we expect
<fwereade__> niemeyer, agree
<niemeyer> fwereade__: Things may change in between.. the important fact is that at T2 the relation is still sane
<fwereade__> niemeyer, ok, indeed, we probably do want to retry the validation in that case
<niemeyer> fwereade__: Cool, in sync again
<niemeyer> So, 3b check
<fwereade__> niemeyer, yep
<niemeyer> fwereade__: 3c) Doesn't seem entirely the case, for the reasons described
<fwereade__> niemeyer, for the first attempt, it does seem wrong to redo the work we just did
<niemeyer> fwereade__: I'm not entirely comfortable with changing what an Endpoint is because of that minor need
<niemeyer> fwereade__: and endpoint is a high-level description of one side of the relation
<niemeyer> fwereade__: It's not bound to any charm
<fwereade__> niemeyer, yes, but we can see very neatly whether or not it will apply to a given charm
<Aram> uniter tests fail because of:
<Aram> [LOG] 4.30775 JUJU git command failed: exit status 1
<Aram> path: /tmp/gocheck-894385949183117216/0/agents/unit-u-0/charm
<Aram> args: []string{"pull", "/tmp/gocheck-894385949183117216/0/agents/unit-u-0/state/deployer/current"}
<Aram> M	.juju-charm
<Aram> U	data
<Aram> M	hooks/start
<Aram> A	ignore
<Aram> M	revision
<Aram> anybody seen this?
<Aram> this is from trunk
<fwereade__> niemeyer, this is actually maybe the point at which things will be clearest if you read to the end and then ask questions again
<fwereade__> Aram, grar, would you paste me the full test output?
<Aram> ok
<fwereade__> Aram, I suspect my unracing in a particular step was not as good as it might have been
<fwereade__> niemeyer, it's definitely the part I am least certain about
<niemeyer> fwereade__: Yeah, I understand.. it seems to do exactly what 3c suggests
<niemeyer> fwereade__: IOW, associate an endpoint with a charm
<niemeyer> fwereade__: This makes the model more complex in a few different ways
<fwereade__> niemeyer, *optionally* do so, but yes
<niemeyer> fwereade__: Not optionally.. always.. except we now have *two* endpoint types, so we must be qualifying them
<niemeyer> fwereade__: All seems like changing something clear into something that requires further effort to understand and communicate
<niemeyer> fwereade__: The need you expose seems relevant
<niemeyer> fwereade__: But I think we can solve the need without refactoring what an endpoint means
<niemeyer> fwereade__: In the short term, it sounds a lot easier to move forward by simply having AddRelation fail if the endpoints don't exist at the transaction time
<fwereade__> niemeyer, ok...
<fwereade__> niemeyer, the proposal also I think incorporates an additional perspective that I didn't mention
<niemeyer> fwereade__: That's the common denominator of all of the points and interface changes, I think, so having that encapsulated under that one operation feels like a significant win
<fwereade__> niemeyer, which is that I think I will be wanting to validate endpoints against *deployed* charms
<niemeyer> fwereade__: Sounds sane
<fwereade__> niemeyer, I think that endpoint matching/checking go very well together with charms
<fwereade__> niemeyer, without having to involve state
<niemeyer> fwereade__: charm.HasEndpoint(endpoint)?
<fwereade__> niemeyer, nearly -- a container-scoped one should match a global one that the charm declares
<fwereade__> niemeyer, actually yeah HasEndpoint covers that
<fwereade__> niemeyer, the thinking is generally that this block of data, excluding the service name, appears to be relevant and useful in the charm package on its own
<niemeyer> fwereade__: I don't think so, unless you change what an endpoint is
<niemeyer> fwereade__: The concept of "endpoint" began its life as that "service:relationname" we use in the command line
<niemeyer> fwereade__: To represent one side of the relation in an unambiguous way
<fwereade__> niemeyer, hm, fair enough -- maybe I'm asking for a rethink of charm.Relation instead?
<fwereade__> niemeyer, which lacks Role and something else
<niemeyer> fwereade__: I don't know.. I don't know what we're after
<fwereade__> niemeyer, I think it would be very useful to have a type, regardless of its name, which I could use with charms to determine what charms can talk to what other charms
<fwereade__> niemeyer, at the moment RelationEndpoint roughly gives us that, but the service name is a distraction from that POV
<niemeyer> fwereade__: Can we cover the problem we're trying to solve first?
<fwereade__> niemeyer, I would like to be able to manipulate implementations of charm.Charm in such a way as to be able to easily ask questions about what can relate to what
<niemeyer> fwereade__: This is still talking about the API you want.. can we talk about the *problem* first?
<Aram> fwereade__: http://paste.ubuntu.com/1285248/
<Aram> fwereade__: only one test failure in that log, if I run it without piping in a file I get more errors, if I pipe in a file I get only one error, or no failure at all.
<fwereade__> niemeyer, the problem is charm upgrades
<niemeyer> fwereade__: Okay.. what happens with charm upgrades?
<niemeyer> fwereade__: The uniter has a single local charm that is deployed, and an upgrade candidate
<niemeyer> fwereade__: What do we have to do about that?
<fwereade__> niemeyer, I'm more thinking of a charm that has *not* yet been upgraded; detecting a relation that was added to the service since the service was upgraded, but which the local charm does not know about
<fwereade__> niemeyer, I don't think the uniter needs to worry about upgrades
<fwereade__> niemeyer, the idea is that we don't even allow upgrades if they would break relations that currently exist
<fwereade__> niemeyer, but that, when we see a relation, we have no guarantee it's not for a newer version of the charm, to which we have not yet upgraded
<fwereade__> niemeyer, and so by comparing the ... endpoint ... of the relation against the current charm we can know we shouldn't actually enter
<fwereade__> niemeyer, so I would like a Thing representing the relation-from-charm-perspective, or endpoint-but-not-about-service, or whatever we call it -- that Name/interface/Role/Scope quartet
<niemeyer> fwereade__: Okay.. so the goal is knowing that the established relation is not in fact known to the current charm, even though it's known to the service
<fwereade__> niemeyer, yes
<niemeyer> fwereade__: Awesome, thanks
<fwereade__> niemeyer, (and also, taking a charm and determining what ... endpoints ... it exposes)
<niemeyer> fwereade__: So how do we guarantee that?
<fwereade__> niemeyer, sorry about the vocab, it's just that endpoint STM to be the closest word we have to the thing I'm trying to describe
<niemeyer> fwereade__: Let's please not lose it
<niemeyer> fwereade__: We have a well defined meaning for endpoint
<fwereade__> niemeyer, yep, ok
<niemeyer> fwereade__: Charms don't have endpoints.. they have relation names to uniquely identify the relation
<fwereade__> niemeyer, the other word is "relation", which is unhelpful in this context because the name relation is more tightly bound to state in my mind than it is to charm
<fwereade__> niemeyer, but it may be what we're after
<fwereade__> niemeyer, Peers/Provides/Requires are all map[string]Relation
<niemeyer> fwereade__: Interesting
<niemeyer> fwereade__: Okay
<fwereade__> niemeyer, the field and the key supply those 2 pieces of information
<niemeyer> fwereade__: It's certainly not ideal, and I don't blame you for the confusion at all
<niemeyer> fwereade__: I'm not clear enough myself
<fwereade__> niemeyer, but the Relation type itself feels like it would be more useful if it included them
<niemeyer> fwereade__: Back to the point, though
<fwereade__> niemeyer, possible restatement: the unit of compatibility is the charm, not the service, and I am currently very interested in compatibility
<niemeyer> fwereade__: How do we tell if a charm supports a relation that is compatible with what the state service is telling us?
<niemeyer> fwereade__: What's the info that tells us whether we're good to go or not?
<fwereade__> niemeyer, maybe interface and role is all we *need* there?
<niemeyer> fwereade__: We must take the name in consideration too
<fwereade__> niemeyer, ha, yes indeed, they need to match
<niemeyer> fwereade__: Imagine cache-db vs. data-db
<fwereade__> niemeyer, I'm not clear on scope for some reason
<fwereade__> niemeyer, I don't think that's an issue of charm compatibility, I think that is a service-level thing
<niemeyer> fwereade__: I think it matters too
<niemeyer> fwereade__: If nothing else, because it means what is *established* needs to change
<fwereade__> niemeyer, ah ok at runtime?
<niemeyer> fwereade__: Yes, I'm still keeping the problem statement in mind:
<niemeyer> <niemeyer> fwereade__: Okay.. so the goal is knowing that the established relation is not in fact known to the current charm, even though it's known to the service
<fwereade__> niemeyer, feels like we just shouldn't allow upgrades that change the meanings of established relations at all
<niemeyer> fwereade__: +1
<fwereade__> niemeyer, (determining that is another use for the capability I am touting)
<niemeyer> fwereade__: And by established we mean non-Dead/removed
<fwereade__> niemeyer, yep
<niemeyer> Awesome
<niemeyer> fwereade__: So, problem solved?
<niemeyer> fwereade__: Hmm.. no
<niemeyer> fwereade__: Because we can still establish a new relation
<fwereade__> niemeyer, dunno -- you seem to be implicitly agreeing that the Name/Iface/Role/Scope quartet may be a sensible data type for several situations, but I'm not actually sure you are
<fwereade__> niemeyer, ah yeah -- we can establish a new relation while old charms are still not upgraded
<niemeyer> fwereade__: Uh.. I'm not agreeing or disagreeing with that
<niemeyer> fwereade__: I'm still trying to see what is the problem we have and solve it
<niemeyer> fwereade__: So what we use to determine if a relation is known to the current charm, so that it may be established, is its <name,iface,role,scope>... okay
<fwereade__> niemeyer, the problem is that we need to do compatibility-checking things with charms in several situations: determining possible endpoints; determining valid charm upgrades; ensuring sanity in an edge case on the uniter
<niemeyer> fwereade__: I'm still trying to solve one single problem
<fwereade__> niemeyer, I am looking for APIs that make sense for all these needs, which feel like different applications of what should be the same tools
<niemeyer> fwereade__: Sorry, I'm slow..
<niemeyer> fwereade__: I cannot cover all needs at once.. we just determined that there should be no API changes at all for adding a relation
<niemeyer> fwereade__: Now there's another case that I'm trying to follow up with you
<niemeyer> <niemeyer> fwereade__: So what we use to determine if a relation is known to the current charm, so that it may be established, is its <name,iface,role,scope>... okay
<fwereade__> niemeyer, sorry, wait, I'm not sure I did agree with that 100% -- I agreed that RelationEndpoint is the right data to keep in state, not that it's the best way to express our desires to AddRelation
<fwereade__> niemeyer, I'm really not trying to be difficult here
<niemeyer> fwereade__: That's not how it feels
<fwereade__> niemeyer, the difficulty is in relations that do not correspond to those declared in the charm
<niemeyer> fwereade__: Sounds good.. I was trying to get the context for that statement, and just now I think I'm understanding what you're trying to do
<niemeyer> fwereade__: When we add a relation, we do so against the latest charm
<fwereade__> niemeyer, sorry, I'm still feeling my way around it myself
<niemeyer> fwereade__: that is associated with the services
<niemeyer> fwereade__: We prevent upgrades that modify established relations
<niemeyer> fwereade__: Which gives us a guarantee that any unit that has an *established* relation, has a charm with that specific relation matching that of the tip charm associated with the service
<niemeyer> fwereade__: Which is comfortable and good and sane
<fwereade__> niemeyer, yes, agreed
<niemeyer> fwereade__: What we cannot guarantee, though, is that when we establish a *new* relation, that relation may be established by all running units that are not up-to-date with the latest charm
<fwereade__> niemeyer, with a slight wrinkle in the case of the provider relation named juju-info, which is not declared explicitly by any charm, and it's not clear where it comes from
<fwereade__> niemeyer, yes, agreed
<niemeyer> fwereade__: The important fact here, then, is that a *new* relation that was established may be *incompatible* with the charm the uniter is running, and compatibility as we agreed is determined by <name,iface,role,scope>
<fwereade__> niemeyer, yes, agreed
<niemeyer> fwereade__: Awesome
<niemeyer> fwereade__: Another important factor that is a consequence of some of these agreements, is that once we observe a relation in state, we have the guarantee that as long as that relation exists, it will remain compatible with the tip charm, whatever that charm is
<fwereade__> niemeyer, yes indeed
<fwereade__> niemeyer, (I *think* that actually *some* scope changes are safe... if the only existing relation is itself a downscoped global relation, it's safe for the underlying relation to go from global to container)
<fwereade__> niemeyer, but that is just a detail
<niemeyer> fwereade__: The problem feels straightforward then..
<niemeyer> fwereade__: endpoint := relation.Endpoint(service) ; if endpoint supported by local charm { move on }
<fwereade__> niemeyer, agreed...
<fwereade__> niemeyer, the code I am trying to discuss is the " if endpoint supported by local charm" bit
<fwereade__> niemeyer, which is the same sort of question I will be asking about various "endpoints" and charms in several situations
<niemeyer> fwereade__: Sorry, I thought that was part of the coverage above
<niemeyer> fwereade__: ... and compatibility as we agreed is determined by <name,iface,role,scope>
<niemeyer> fwereade__: endpoint.SupportedBy(charm) comparing <name,iface,role,scope>?
<fwereade__> niemeyer, ok, which we do not have collected into a handy type, other than RelationEndpoint, which has an extraneous service name -- if it didn't have that, it would live very happily as a type in the charm package
<niemeyer> fwereade__: It's not extraneous.. it's part of the definition of what an endpoint is since we first used the endpoint term
<niemeyer> fwereade__: and by charm I really mean *state.Charm
<niemeyer> fwereade__: Or at least the charm interface
<fwereade__> niemeyer, why *state.Charm? all we *should* need to answer the important questions is charm.Charm
<niemeyer> fwereade__: Which might work with both
<fwereade__> niemeyer,
<fwereade__> niemeyer, yeah, in fact it's just Meta() I think
<niemeyer> fwereade__: Cool.. charm.Charm still feels good, though
<fwereade__> niemeyer, it does!
<niemeyer> fwereade__: For docs, if nothing else
<fwereade__> niemeyer, and being able to answer these sorts of questions in the charm package feels good too :)
<niemeyer> fwereade__: Feels irrelevant to any problem stated so far
<fwereade__> niemeyer, all the data is already available in the charm package... all the questions are purely about charm relation compatibility, divorced from service... it's just that we don't have a handy type in the charm package expressing it
<fwereade__> niemeyer, this seems strange to me
<niemeyer> fwereade__: There are only so many battles we can win at a time
<fwereade__> niemeyer, when I consider what that type should look like, I see that it bears notable similarities to -- but is not the same as -- RE
<niemeyer> fwereade__: If all we need is a trivial method on state.RelationEndpoint to say whether it is compatible with a charm or not, I don't see why we need to break down the concept of endpoint we have today, split types, change the charm package, etc etc
<fwereade__> niemeyer, and a way to get all those endpoints from charms
<niemeyer> fwereade__: Why?
<niemeyer> fwereade__: I haven't seen that need so far
<fwereade__> niemeyer, add-relation foo bar
<niemeyer> fwereade__: yes?
<fwereade__> niemeyer, we need to go through all possible combinations to determine whether or not it's ambiguous
<niemeyer> fwereade__: Yes, and that's in state
<niemeyer> fwereade__: Because all of the data this applies against is in state
<niemeyer> fwereade__: endpoints, err := state.InferEndpoints("foo", "bar")
<niemeyer> fwereade__: relation, err := state.AddRelation(endpoints)
<niemeyer> fwereade__: Sorry, I'm starving.. will be back soon
<fwereade__> niemeyer, so at the very least we need to get the service twice and the charm twice for every service in a relation?
<fwereade__> niemeyer, np, might be a little longer myself
<fwereade__> niemeyer, ttyl
 * wrtp is off for the evening. night all.
<niemeyer> wrtp: Night
<wrtp> niemeyer: please can we have a discussion about tls certs tomorrow...
<niemeyer> wrtp: Of course
<fwereade__> niemeyer, tyvm for reviews
<fwereade__> niemeyer, sorry that one was still a bit scrappy
<fwereade__> niemeyer, and fwiw I don't feel a strong need to pursue the not-actually-Endpoints discussion -- in initial sketches the code seemed to be telling me it wanted to do that, but there's nothing fundamentally wrong with keeping the functionality in state, and I will try to cast it from my mind ;) :-)
<fwereade__> niemeyer, ty for valuable discussion earlier :)
<niemeyer> fwereade__: It was my pleasure
<niemeyer> fwereade__: Also excited about the rest of the code
<niemeyer> fwereade__: Lots of good stuff, and not much to change
<fwereade__> niemeyer, excellent :)
<fwereade__> niemeyer, a thought: is a container-scoped peer relation ever valid?
<niemeyer> fwereade__: Wow..
 * niemeyer thinks
<fwereade__> niemeyer, maybe with multi-endpoint peers it would be... although it sort of makes my brain hurt a little
<fwereade__> niemeyer, it actually sounds icky enough that I'd like to forbid it, and we can consider the implications at a later date if anyone wants to use it for anything
<fwereade__> niemeyer, I can't imagine any use for a container-scoped peer relation, even there, that isn't better addressed as a separate pro/req relation
<fwereade__> niemeyer, ofc, my imagination is limited ;)
<niemeyer> fwereade__: I think it'd be pretty bizarre
<niemeyer> fwereade__: The only way it could happen is to have two subordinates of the same type
<niemeyer> fwereade__: Since naturally we cannot have two principals in the same container
<fwereade__> niemeyer, which I guess is not impossible, but pretty damn weird
<niemeyer> fwereade__: Yeah
<niemeyer> fwereade__: I'm not sure if we should prevent it actively, though
<niemeyer> fwereade__: If we have to special case it to prevent, I'd say let's do nothing
<niemeyer> fwereade__: If we have to special case to support, I'd say let's not support it
<fwereade__> niemeyer, sgtm, ++less-code
<fwereade__> niemeyer, btw, I was also just wondering if we really want to scan the hooks dir every time we read a charm... maybe it would be cleaner to check at bundle time?
<fwereade__> niemeyer, sorry, context is for disallowing juju* files in the hooks dir
<fwereade__> niemeyer, assuming that every charm we run has passed through a bundling stage does not seem unreasonable
<fwereade__> niemeyer, but maybe it's not sensible, they are just zip files
<niemeyer> fwereade__: Hmm
<niemeyer> fwereade__: Yeah
<fwereade__> niemeyer, scan-every-read then?
<niemeyer> fwereade__: Sounds fine
<fwereade__> niemeyer, yeah, could be worse ;p
<niemeyer> fwereade__: True :)
<fwereade__> niemeyer, I'm going to bed, but https://codereview.appspot.com/6713057/ should be nice and simple
<niemeyer> fwereade__: Thanks much
#juju-dev 2012-10-18
<davecheney> niemeyer: got a better fix for signing bug
<davecheney> will propose in a few minutes
<davecheney> niemeyer: did you fix lbox so it doesn't mess up the milestone of a bug ?
<davecheney> if so, thank you :)
<niemeyer> davecheney: Yo
<niemeyer> davecheney: Morning
<niemeyer> davecheney: I didn't yet, sorry
<davecheney> niemeyer: weird, i wonder why lbox propose didn't screw up the milestone on https://bugs.launchpad.net/juju-core/+bug/1061941
<davecheney> maybe because it's a different project
<niemeyer> davecheney: Perhaps the milestone picking logic is actually right
<niemeyer> davecheney: What I did was to deactivate 1.3 a while ago
<davecheney> \o/ !
<niemeyer> davecheney: Bed time here
<niemeyer> davecheney: Have a good time
 * niemeyer => zzzzzZZZ
<wrtp> fwereade__: morning
<fwereade__> wrtp, heyhey
<wrtp> davecheney: hiya
<wrtp> davecheney: if you have a moment, this evening, it would be great if you could try to provoke that test failure - i'd like to get that CL submitted.
<davecheney> wrtp: that is the thing
<davecheney> the test never fails
<wrtp> davecheney: have you tried raising the iteration count?
<wrtp> davecheney: (did you see my reply BTW?)
<davecheney> i think i may not have
<davecheney> basically i reverted state.go
<davecheney> and the test still passed
<davecheney> so, what is the test testing ?
<wrtp> davecheney: i only replied an hour ago
<wrtp> davecheney: it's testing a race condition
<wrtp> davecheney: read my reply for details
<davecheney> ok, i'll go upstairs and read it
<wrtp> davecheney: ta
<fwereade__> guys, I need to take a somewhat extended lunch; https://codereview.appspot.com/6734046 is pretty trivial if anyone is bored ;p
<wrtp> fwereade__: reviewed
<Aram> moin.
<wrtp> Aram: hiya
<wrtp> this fails for me: go test labix.org/v2/mgo
<wrtp> could someone else try it, and let me know if the test passes, please.
<Aram> one min
<wrtp> Aram: thanks
<Aram> wrtp: failed
<wrtp> Aram: thanks
<Aram> wrtp: I remember now, actually.
<Aram> it never passed for me
<Aram> I remember that error from when I started using mgo
<Aram> it never worked right here
<wrtp> Aram: it's strange because it looks like that js is ok
<wrtp> Aram: that is, i can't see how rs1a could be undefined.
<wrtp> Aram: ah, i see
<Aram> yeah
<wrtp> Aram: maybe the server ports have changed or something
<Aram> what mongodb version do you have?
<Aram> this is 2.0.4
<wrtp> Aram: the one we're using in ec2, i think
<wrtp> Aram: 2.2.0
<wrtp> Aram: hmm, i think it might be failing because i don't have a "supervisorctl" command
<wrtp> (whatever that's supposed to do)
<Aram> yeah, I don't have that either
<wrtp> Aram: hmm, that doesn't seem to have fixed it
<niemeyer> Moooorning
<fwereade__> niemeyer, heyhey
<wrtp> niemeyer: yo!
<wrtp> niemeyer: any hints as to how i can get mgo tests passing? it seems like it's something to do with supervisord (which wasn't installed), but i've failed to get it working even when it is.
<niemeyer> wrtp: You have to run make startdb
<wrtp> niemeyer: ah!
<niemeyer> wrtp: It sets up the 12 servers, etc
<niemeyer> wrtp: It doesn't restart them across runs
<wrtp> niemeyer: right - i see now. that was not so obvious :-)
<niemeyer> wrtp: Anything you think is broken there?
<wrtp> niemeyer: i just wanted to check that i could connect to mongodb with Go's TLS, so i made some changes to allow that and wanted to verify that everything was still working
<wrtp> niemeyer: i do get one test failure, even after running startdb, BTW
<niemeyer> wrtp: I will work on that on my own time
<wrtp> niemeyer: it looks like session.DatabaseNames doesn't always return the names in sorted order
<wrtp> niemeyer: ok. you don't want to accept it as a contribution?
<niemeyer> wrtp: Is juju distributing TLS certificates properly?  I'll happily take that as a contribution. :-)
<wrtp> niemeyer: i need to talk with you about that
<wrtp> niemeyer: i didn't want to build an implementation without deciding what to do first
<niemeyer> wrtp: Understood.. the point is that I'm happy to use my own time to work on mgo as I've been doing since day one. We don't have to spend our time to do that.
<wrtp> niemeyer: that's ok.
<wrtp> niemeyer: it was only 10 minutes' work
<niemeyer> wrtp: Cool, but it wouldn't be to implement it properly
<wrtp> niemeyer: no? it's true i'm probably unaware of the subtleties
<wrtp> niemeyer: i thought something like this was probably sufficient: http://paste.ubuntu.com/1286877/
<niemeyer> wrtp: Quite possibly
<niemeyer> wrtp: That's what Dave suggested too
<wrtp> niemeyer: here's my half-baked plan for key distribution BTW: http://paste.ubuntu.com/1286895/
<niemeyer> fwereade__: I think there was some misunderstanding on the Cleanup review
<niemeyer> fwereade__: But perhaps it's for the best
<fwereade__> niemeyer, oh, sorry, perhaps there was
<fwereade__> niemeyer, what did I miss?
<niemeyer> fwereade__: The suggestion was to run RemoveAll, but not to scan through all documents pending of cleanups at once
<fwereade__> niemeyer, yes -- that changed because it appeared to make the code simpler, and I couldn't think of a serious drawback...
<fwereade__> niemeyer, given that I was looping until I found a valid cleanup doc *anyway*, I thought it was easier to just keep on looping :)
<wrtp> i've got to reboot. back in a mo.
<niemeyer> fwereade__: It may only load the system at once a bit more, but it sounds worth trying
<fwereade__> niemeyer, yeah, it seemed to follow the spirit of your other suggestions :)
<niemeyer> fwereade__: I think the cleanups doc is fine to be deleted out of a transaction too, btw
<fwereade__> niemeyer, I thought "better safe than sorry" there, on the basis that it's a new feature whose context is yet to be completely determined
<niemeyer> fwereade__: We did its intended action before getting there, and the action is idempotent, so any mixup should be fine
<fwereade__> niemeyer, true enough, happy to change
<niemeyer> fwereade__: For clarity, the dangers we can incur by not using a transaction, in that specific case, is seeing the change being done again
<niemeyer> fwereade__: Because if the transaction failed on the insert side before being finished, someone else will attempt to apply the transaction again, and may end up reinserting the doc
<niemeyer> fwereade__: After that transaction is finished, that doc is never seen in a transaction again
<niemeyer> fwereade__: So the worst that can happen is Cleanup running the same cleanup twice, which is not a big deal due to its nature
<niemeyer> fwereade__: I suspect that will be true in general for cleanups
<fwereade__> niemeyer, hope so :)
<niemeyer> fwereade__: If you want me, I'll be in my bunker
<fwereade__> haha
<wrtp> niemeyer: any thoughts on my key distribution sketch above?
<niemeyer> wrtp: I started reviewing a branch before you posted, and I haven't yet finished
<wrtp> niemeyer: ok, np
<wrtp> niemeyer: i wasn't sure you noticed the paste.
<niemeyer> fwereade__: LGTM, thanks a lot for the rounds
<niemeyer> wrtp: np
<niemeyer> wrtp: SO..
<rogpeppe> niemeyer: so...
<niemeyer> rogpeppe: We don't need --sslOnNormalPorts.. we don't even use normal ports
<rogpeppe> niemeyer: ok. that just came from a web page. i'm not fully cognizant of the implications of all the mongod flags :-)
<rogpeppe> niemeyer: (i thought perhaps it might allow non-ssl connections without that flag)
<niemeyer> rogpeppe: We want --ssl, I think
<niemeyer> rogpeppe: I don't have a lot of experience with ssl on mongo myself, to be honest
<niemeyer> rogpeppe: 10gen plays a trick I don't find so nice
<rogpeppe> niemeyer: i was going from this page: http://docs.mongodb.org/manual/administration/ssl/
<niemeyer> rogpeppe: Their own packages don't have ssl built-in.. you have to pay for a subscription
<rogpeppe> niemeyer: yeah, that's why you had to do your own build, right?
<niemeyer> All things considered, I guess it's a fine way to stay in business, though
<niemeyer> rogpeppe: Yeah
<rogpeppe> niemeyer: if people can't be bothered to do a build, then it's their problem i guess.
<rogpeppe> niemeyer: (you didn't have that easy a time of it AFAIR though!)
<niemeyer> Hmm.. "Add the 'ssl=True' parameter to a PyMongo".. I'll have to investigate what's the standard way drivers are handling that
<niemeyer> "mongodb://localhost/?ssl=true&sslverifycertificate=false"
<niemeyer> interesting..
<niemeyer> Anyway, sorry, derail
<niemeyer> rogpeppe: Only because it takes forever.. otherwise it was somewhat okay
<niemeyer> rogpeppe: in your sketch, I'm not entirely sure of what lines 12-15 mean
<rogpeppe> niemeyer: we've got to make the public key available somewhere outside of the state, right?
<rogpeppe> niemeyer: and the server itself must generate the public key, i think
<niemeyer> rogpeppe: Well, we can only generate the public key by generating the pair
<rogpeppe> niemeyer: exactly
<niemeyer> rogpeppe: I'm not sure we want to do that at the server side, though
<niemeyer> rogpeppe: It means the client has to trust the server blindly, at least once
<rogpeppe> niemeyer: and we don't want to make the private key available in cloudinit
<rogpeppe> niemeyer: there's another way around that, i think, which is orthogonal to this
<rogpeppe> niemeyer: which is to have a client-specified value that's put into the state.
<niemeyer> rogpeppe: That's a bit related to that conversation we had the other day regarding push vs. pull
<niemeyer> rogpeppe: Erm.. sorry
<niemeyer> rogpeppe: That was confusing
<niemeyer> rogpeppe: The conversation regarding bootstrapping via ssh rather than via cloud-init
<rogpeppe> niemeyer: i remember the conversation
<niemeyer> rogpeppe: Or perhaps via both, but pushing the changes via ssh
<rogpeppe> niemeyer: don't we have the same problem there. we must trust the server blindly the first time.
<niemeyer> rogpeppe: Let me look at how we bootstrap today for a sec.. just a moment
<niemeyer> rogpeppe: Okay
<niemeyer> rogpeppe: I guess the problem we have is pretty much the same as the problem we just solved with the password
<rogpeppe> niemeyer: if we could write to environments.yaml, our problem would be simpler
<niemeyer> rogpeppe: Seems like a red-herring
<niemeyer> rogpeppe: How did we solve the problem of initializing the password?
<rogpeppe> niemeyer: we changed it
<niemeyer> rogpeppe: Yep
<rogpeppe> niemeyer: but if the password is specified in environments.yaml, we can't do that.
<niemeyer> rogpeppe: Hmm.. I lost you
<niemeyer> rogpeppe: The password *is* specified in environments.yaml, and we do that
<niemeyer> rogpeppe: environments.yaml seems like a red-herring
<rogpeppe> niemeyer: perhaps i'm misunderstanding you. the password in environments.yaml is so we can authenticate ourselves to the server. what we're trying to do here is authenticate the server to the client, no?
<niemeyer> rogpeppe: The problem we have is exactly the same.. we have a private key that we want the server to hold without exposing to everybody that can read cloud-init
<rogpeppe> niemeyer: ok... so who generates the private key?
<niemeyer> rogpeppe: We do
<niemeyer> rogpeppe: The client
<rogpeppe> niemeyer: how does that help?
<rogpeppe> niemeyer: i suppose it means we don't have to wait for the public key to become available
<niemeyer> rogpeppe: It's the only way for the client to be able to tell it's talking to the right server
<rogpeppe> niemeyer: i don't see how it helps
<niemeyer> rogpeppe: Hmm.. expand?
<rogpeppe> niemeyer: when we connect for the first time after bootstrap, how do we tell if we're talking to the right server?
<niemeyer> rogpeppe: When you connect to your bank in your browser, how can your browser tell it is talking to the right endpoint?
<niemeyer> rogpeppe: (or at least hope so ;)
<rogpeppe> niemeyer: because your browser has stored a public key of the bank.
<niemeyer> rogpeppe: Almost, but close enough
<rogpeppe> niemeyer: so where do we store the public key that we generate at bootstrap time?
<rogpeppe> niemeyer: ah, i suppose we can use our local key to sign a certificate for the server
<rogpeppe> niemeyer: and then use our local key as the root CA
<rogpeppe> niemeyer: i'm not sure that works though
<rogpeppe> niemeyer: hmm, maybe.
<niemeyer> rogpeppe: Why?
<rogpeppe> niemeyer: i guess we'd need another field in environments.yaml
<rogpeppe> niemeyer: to hold the original signer's certificate
<rogpeppe> niemeyer: we could overload authorized-keys, i suppose
<niemeyer> rogpeppe: Uh.. why the obsession about environments.yaml?
<niemeyer> rogpeppe: and how's authorized-keys related to the problem?
<niemeyer> rogpeppe: This is the ssh key, and it works well
<niemeyer> rogpeppe: No need to fiddle there
<rogpeppe> [15:09:02] <rogpeppe> niemeyer: when we connect for the first time after bootstrap, how do we tell if we're talking to the right server?
<niemeyer> <niemeyer> rogpeppe: environments.yaml seems like a red-herring
<niemeyer> :)
<rogpeppe> :-)
<rogpeppe> niemeyer: i'm not sure how to answer my question above
<niemeyer> rogpeppe: You've already answered it?
<rogpeppe> niemeyer: my answer only works if we know what key signed the server's cert
<niemeyer> rogpeppe: Of half of it, anyway
<rogpeppe> niemeyer: and that's something that needs to be specified in environments.yaml AFAICS
<niemeyer> rogpeppe: Exactly, that's what we're handing off the key to the server
<niemeyer> s/what/why
<niemeyer> rogpeppe: LOL
<niemeyer> rogpeppe: We can do whatever we want.. yes, it may be in environments.yaml, it may be in ~/.juju/envname.cert, or whatever
<rogpeppe> niemeyer: ok, so we've handed off the key to the server. now we connect again, and the server presents a certificate. how do we tell if that certificate is valid?
<rogpeppe> niemeyer: ok, sorry. i was using environments.yaml as a place-holder for "somewhere with client configuration information". of course it could be stored elsewhere too.
<niemeyer> rogpeppe: You've already answered it:
<niemeyer> <rogpeppe> niemeyer: ah, i suppose we can use our local key to sign a certificate for the server
<niemeyer> <rogpeppe> niemeyer: and then use our local key as the root CA
<rogpeppe> niemeyer: the reason for the possible authorized-keys red-herring is that we already have a local key, which we use to connect with SSH. we could use that very same key to sign the initial certificate.
<rogpeppe> niemeyer: rather than requiring everyone to generate another key just for juju
<niemeyer> rogpeppe: Nope.. different worlds
<rogpeppe> niemeyer: ok
<rogpeppe> niemeyer: so... in this proposed world, we have our own key distribution problem at the client side.
<rogpeppe> niemeyer: everyone that needs to bootstrap an environment must have a private key signed by the root key
<niemeyer> rogpeppe: This seems to put a simple problem in somewhat of a dark perspective
<niemeyer> rogpeppe: I'd put it more lightly as
<niemeyer> rogpeppe: When we bootstrap, we sign the key we hand off to the server
<rogpeppe> niemeyer: i suppose i'm slightly concerned at the change in juju usage implied here. before, you could just bootstrap anywhere, and connect to the environment from any other place as long as you had an authorized ssh key.
<niemeyer> rogpeppe: That feels like an entirely different context than the one we're talking about
<rogpeppe> niemeyer: now you can only bootstrap if you've got a private key that is derived from the root key.
<rogpeppe> niemeyer: (the root key being the one mentioned in the other environments.yaml files, so that they can connect)
<niemeyer> rogpeppe: Sorry, I think either you see something I don't, or you're creating a picture that I'm not perceiving
<niemeyer> rogpeppe: Generate key, sign, send to server, profit..
<niemeyer> rogpeppe: Where's the pain coming from?
<rogpeppe> niemeyer: connecting again
<niemeyer> rogpeppe: yes?
<rogpeppe> niemeyer: if i'm connecting from a different host, i've got to verify that the bootstrapped environment's key was signed by the original key
<rogpeppe> niemeyer: so i need to know about the original key
<niemeyer> rogpeppe: Yes you do need to know about the server certificate if you want to validate the server is who we think it is
<niemeyer> rogpeppe: That's called authentication
<niemeyer> rogpeppe: Something we conveniently ignore right now
<rogpeppe> niemeyer: we're actually verifying the client that did the bootstrap in this case.
<rogpeppe> i think
<niemeyer> rogpeppe: ?
<rogpeppe> niemeyer: that's where the chain of trust starts
<rogpeppe> niemeyer: which seems good, actually.
<rogpeppe> niemeyer: hmm, i guess we'll have to issue the initial certificate with a very limited lifespan.
<rogpeppe> niemeyer: no that doesn't work
<rogpeppe> niemeyer: i'm not sure how we'll do the interaction to sign the new key when first connecting to the state.
<niemeyer> rogpeppe: We can create a temporary cert to sign the trashed private key
<rogpeppe> niemeyer: the server does that?
<niemeyer> rogpeppe: We cannot create the certificate in the server because then we cannot talk to the server and know we're talking to the right server
<rogpeppe> niemeyer: perhaps we need to run a little server that the client connects to, reads the new public key from the server, and sends back a signed cert
<rogpeppe> niemeyer: indeed
<rogpeppe> niemeyer: but we can (and should) generate a new private key on the server, and get that signed
<niemeyer> rogpeppe: We can generate the private key on the client and sign it on the client where we have the certificate
<rogpeppe> niemeyer: how do we send it to the server?
<rogpeppe> niemeyer: we could put it in the state, i guess, but that seems wrong to me
<niemeyer> rogpeppe: Why?
<rogpeppe> niemeyer: for the same reason we don't put admin-password in the state
<niemeyer> rogpeppe: One of the reasons we don't put it in the state is because nobody needs it
<rogpeppe> niemeyer: i thought it was a security issue too - we don't want anyone with access to the state to be able to impersonate the state server.
<rogpeppe> s/thought it was/think it might be/
<rogpeppe> niemeyer: of course, they can only do that if they subvert DNS too
<niemeyer> rogpeppe: Give me access to the state without a private key, and I'll dig the key out of the state server in a short while
<rogpeppe> niemeyer: that's true of all our passwords too, of course. (with the exception of admin-password itself)
<niemeyer> rogpeppe: No, you cannot figure the password actually, but that's a derail
<niemeyer> rogpeppe: The points are:
<niemeyer> 1) We're consciously not handling authorization anywhere yet
<niemeyer> 2) Anyone with access to the state *today* can do whatever they want
<niemeyer> 3) That includes getting access to the state server
<niemeyer> 4) We're walking towards having a true API that manages authorization
<niemeyer> 5) With authorization, we can prevent people from having access to things they shouldn't
<niemeyer> 6) Even in the future, anyone that gets access to a state server *database* will have the private key for the server anyway, no matter if the key sits within the state or not
<niemeyer> That's it, I think
<rogpeppe> niemeyer: i suppose it just *feels* more secure to have the server generate its own private key. but you're probably right that it's unnecessary.
<niemeyer> rogpeppe: Well, I won't argue about that. If you know why it's more secure I'd be interested though.
<rogpeppe> niemeyer: it means we're not leveraging future security on current security so much. once a state server has obtained an identity, it can keep it, knowing that no-one else has its private key. that said, i can't think of any attacks in particular that this enables.
<niemeyer> rogpeppe: Yeah, I don't see where the "more secure" bits went in
<niemeyer> rogpeppe: And I see a bunch of issues to solve, such as how to distribute keys to replica sets, and how to ensure the key isn't lost
<wrtp> hmm, don't know why i was kicked off then
<wrtp> niemeyer: last thing i saw: [16:04:15] <niemeyer> rogpeppe: And I see a bunch of issues to solve, such as how to distribute keys to replica sets, and how to ensure the key isn't lost
<niemeyer> wrtp: Yeah, that was it
<wrtp> niemeyer: what was the last thing you saw me say?
<niemeyer> wrtp: Nothing after that
<wrtp> [16:06:29] <rogpeppe> niemeyer: here's a thought: is it actually possible to change the server's key after it has started running?
<wrtp> [16:06:52] <rogpeppe> niemeyer: or will we need to shut down mongod and restart it pointing to the new private key
<wrtp> ?
<niemeyer> wrtp: Short term, I suppose we have to restart it
<wrtp> niemeyer: and there's another issue too - how do we ensure that the original certificate (in cloud-init) isn't valid forever, while ensuring we can connect an arbitrary time after bootstrap
<wrtp> ?
<niemeyer> Oct 18 11:40:53 <niemeyer>      rogpeppe: We can create a temporary cert to sign the trashed private key
<wrtp> niemeyer: i'm not sure quite what you mean by that
<niemeyer> wrtp: Temporary cert?
<niemeyer> wrtp: What part?
<wrtp> niemeyer: yeah
<niemeyer> wrtp: Sorry, can you explain what you don't understand?
<wrtp> niemeyer: do you mean a certificate with a short expiry time?
<niemeyer> wrtp: No.. I mean creating a certificate that is itself temporary
<wrtp> niemeyer: i'm not sure what that means
<wrtp> niemeyer: stored temporarily?
<niemeyer> wrtp: I don't know how to explain that more simply
<niemeyer> wrtp: Yes
<wrtp> niemeyer: but the cert is in the cloud-init data; how can it be stored only temporarily?
<niemeyer> wrtp: No, the cert is local
<niemeyer> wrtp: priv and pub keys are in cloud-init
<wrtp> niemeyer: don't we need to pass a cert in the cloud-init too?
<wrtp> niemeyer: because we need to sign it with the client's private key
<niemeyer> wrtp: Hm?
<wrtp> niemeyer: isn't that the basis of our security here?
<wrtp> niemeyer: how else do we know that the certificate is valid when we're connecting to the state?
<niemeyer> wrtp: Because we have the certificate with which the private key was signed
<niemeyer> Erm
<niemeyer> The public key
<wrtp> niemeyer: the server has that certificate?
<niemeyer> wrtp: Not necessarily, although we should probably send the public part of it as a sensible thing to do
<niemeyer> wrtp: That's not what the security is based on, though
<wrtp> niemeyer: oh, i *think* i start to see.
<niemeyer> wrtp: A private key is used to sign stuff, and to encrypt stuff for people to have the public key
<wrtp> niemeyer: i know that.
<niemeyer> wrtp: A public key is signed by a private part of the certificate so that people holding the public part of the certificate can attest that whoever handed off the public key was signed by the private part of the certificate
<wrtp> niemeyer: i do know how pk auth works, i think
<niemeyer> wrtp: My apologies, but that's exactly what I've been explaining
<wrtp> niemeyer: i'm just not entirely sure what configuration you envisage here
<wrtp> niemeyer: i think i might see. i was thinking that the private key would be generated at bootstrap time.
<niemeyer> wrtp: and it will
<niemeyer> Oct 18 11:30:45 <niemeyer>      rogpeppe: Generate key, sign, send to server, profit..
<wrtp> niemeyer: (sorry, i'm feeling particularly dense today)
<wrtp> niemeyer: i thought that the server side of the TLS connection needed the signed certificate
<niemeyer> wrtp: The server side needs a public key signed by the local certificate and a private key
<wrtp> niemeyer: hmm, that's what i thought
<wrtp> niemeyer: so that {public key signed by the local certificate} must go into the cloud-init data, right?
<niemeyer> wrtp: The server side needs a public key signed by the local certificate and a private key
<niemeyer> wrtp: Both need to go
<wrtp> niemeyer: yeah, sure, the private key too
<wrtp> how does that tally with this: [16:15:22] <niemeyer> wrtp: No, the cert is local
<niemeyer> wrtp: "signed by the local certificate"?
<wrtp> niemeyer: isn't a public key signed by the local certificate a certificate itself?
<wrtp> niemeyer: i'm probably getting my terminology wrong, sorry.
<niemeyer> wrtp: It can be if its respective private key is used to sign something else.. but I don't think that matters?
<wrtp> niemeyer: (i was thinking that certificates can't sign things, only private keys can)
<niemeyer> niemeyer> wrtp: A public key is signed by a private part of the certificate so that people holding the public part of the certificate can attest that whoever handed off the public key was signed by the private part of the certificate
<wrtp> niemeyer: private-part-of-certificate == private-key, no?
<wrtp> gah!
<niemeyer> wrtp: I suggest reading a bit about PKI..
<wrtp> niemeyer: i have read a certain amount, but obviously not the right bits...
<niemeyer> wrtp: http://www.openssl.org/docs/HOWTO/certificates.txt
<wrtp> niemeyer: that seems to agree with what i thought
<wrtp> niemeyer: i.e. the private key is not part of the cert
<niemeyer> wrtp: I'm sure.. we're debating this for ages just because we all understand things and are in deep agreement :)
<wrtp> niemeyer: i'm sorry. but part of my confusion here is about terminology. when you say "public key signed by the local certificate", i find it confusing, because a certificate can't sign things.
<wrtp> niemeyer: so i'm *trying* to get around to asking how we can limit the lifetime of the certificate we send to the server in the cloud-init data.
<wrtp> niemeyer: i'm sorry we seem to be having difficulty here.
<niemeyer> wrtp: Do you understand what a root certificate is?
<wrtp> niemeyer: i believe so
<niemeyer> wrtp: Okay.. what does it do?
<wrtp> niemeyer: it verifies that some root private key has some attributes
<niemeyer> wrtp: Hmm.. no..
<niemeyer> wrtp: It signs things
<niemeyer> wrtp: and there is both a private part of it, and a public part of it
<wrtp> niemeyer: the public part of it *is* the certificate, i believe
<wrtp> niemeyer: the private key is just a private key, that can be used to sign any number of certs
<niemeyer> wrtp: Nope.. it's a pair
<wrtp> niemeyer: if i want to sign a certificate request, i can use this command:
<wrtp> openssl x509 -req -days 10000 -signkey $keyfile -out $certfile -in $reqfile
<wrtp> niemeyer: that produces a certificate signed by the specified key file.
<wrtp> niemeyer: there's no inherent pairing
<niemeyer> wrtp: There are all those fancy names, but in the end it's good old asymmetric crypto
<wrtp> niemeyer: indeed
<wrtp> niemeyer: the different thing about a root cert, i suppose, is that it's self-signed.
<niemeyer> wrtp: So, you use the private part of the key pair that some call certificate, some call whatever they want, to sign things
<wrtp> niemeyer: yup
<niemeyer> wrtp: The root CA is the public part of it
<wrtp> niemeyer: yup
<niemeyer> wrtp: The thing we see and can check
<wrtp> niemeyer: yup. but we also need to check the intermediate certificate too, right?
<niemeyer> wrtp: The private part of the root CA is what signs the public keys we use in the server that you can also call certificate if you want
<niemeyer> wrtp: If there is an intermediate certificate, yes
<niemeyer> wrtp: Then the private key of the root CA signed the public key of the intermediate certificate, and the private key of the intermediate certificate signs the server public key or public certificate if you please
<wrtp> niemeyer: in this case, there must be an intermediate certificate AFAICS because we don't send the root certificate to the cloud
<wrtp> niemeyer: ok, in that sense we probably won't have an intermediate certificate. it's that last bit i'm talking about.
<niemeyer> wrtp: The two last sentences contradict each other
<wrtp> niemeyer: i realised that i meant a different thing to you by "intermediate certificate"
<wrtp> niemeyer: hence "in that sense"
<niemeyer> wrtp: Intermediate certificate in general is unambiguous
<wrtp> niemeyer: in this case AFAICS, we've got two certificates in the first instance: the root cert and the cert that we give to the bootstrap instance.
<niemeyer> http://en.wikipedia.org/wiki/Intermediate_certificate_authorities
<niemeyer> wrtp: Yes
<wrtp> niemeyer: so the latter cert goes in the cloud-init data, right?
<niemeyer> wrtp: Yes, we have to send the private key and the public key for the server in cloud-init
<wrtp> niemeyer: it's more than just the public key - it's the *certificate* (which includes expiry time, possible server name etc) too, right?
<niemeyer> wrtp: We can call it shit if you want..
<niemeyer> wrtp: Sorry.. I'm getting slightly tired by now
<wrtp> niemeyer: i'm sorry, that's a big difference
<wrtp> niemeyer: because a certificate is valid for a while
<wrtp> niemeyer: so even if we change the private key, that certificate will still be valid
<wrtp> niemeyer: so we *could* make the expiry time short
<wrtp> niemeyer: but that would mean that if we didn't connect soon after bootstrap, then it'll expire and we'll lose access to our newly bootstrapped state
<niemeyer> wrtp: I suggest we use a temporary certificate to sign the key instead, but you already knew that 1.5h ago
<wrtp> niemeyer: who signs the temporary cert?
<niemeyer> wrtp: Nobody.. the temporary CA cert is the thing that signs the server cert
<wrtp> niemeyer: ah... so you're suggesting making a temporary root certificate when bootstrapping?
<wrtp> niemeyer: and storing that somewhere on the client's machine
<niemeyer> wrtp: http://en.wikipedia.org/wiki/Public_key_certificate
<niemeyer> wrtp: Yes
<wrtp> niemeyer: okaaay
<wrtp> niemeyer: that's what i was trying to get at by saying (ages ago) that it would be nice if we could write to environments.yaml.
<niemeyer> wrtp: ROTFL
<niemeyer> wrtp: Sorry.. this is getting somewhat sad :)
<niemeyer> wrtp: I'll step out for lunch
<wrtp> niemeyer: enjoy. sorry for my obtuseness.
<TheMue> so, CL simplified and re-proposed. time to step out for dinner.
<wrtp> niemeyer: this how i understand the current state of affairs (slightly simplified): http://paste.ubuntu.com/1287249/
<wrtp> niemeyer: s/current state of affairs/currently proposed scheme/
<wrtp> niemeyer: the simplification is that i'm not sure it's particularly useful to generate a certificate on the client side; all we need to verify is the public key of the server, so it can generate its own certificate.
<wrtp> s/certificate/temporary certificate/
<wrtp> niemeyer: for the record that page above (http://en.wikipedia.org/wiki/Public_key_certificate) contains nothing that implies that a certificate has a private part, or can be used to sign things. i believe my confusion was understandable.
<niemeyer> wrtp: It contains the fact that a certificate has *the public key* signed by *the CA*, but I really don't want to discuss this anymore
<niemeyer> wrtp: If you think it doesn't have a private counterpart, I won't force you into understanding it
<wrtp> niemeyer: me neither. i'm hoping that my paste above will move things forward.
<wrtp> niemeyer: (of course it has a private counterpart, the signer, but that's not part of the certificate)
<niemeyer> wrtp: "contains nothing that implies that a certificate has a private part"
<wrtp> niemeyer: it's not a part of the certificate. anyway, moving on.
<niemeyer> wrtp: Changing the point until it becomes reasonable makes for a very painful conversation
<wrtp> niemeyer: a certificate contains no private data!
<niemeyer> <niemeyer> wrtp: "contains nothing that implies that a certificate has a private part"
<niemeyer> wrtp: That's what you said
<niemeyer> wrtp: That's what I responded to
<wrtp> niemeyer: counterpart != part
<niemeyer> wrtp: For my own sanity I'll stop talking for the moment
<wrtp> niemeyer: ok
<wrtp> niemeyer: i'm sorry this is difficult. i don't mean to be awkward, but when we're talking about this stuff, it helps to have some agreed terminology.
<wrtp> niemeyer: we should have moved to G+ hours ago
<wrtp> niemeyer: i'd very much like some reaction to my proposal above, as i plan to start implementing it tomorrow morning, if it looks ok.
<niemeyer> wrtp: Will check
<niemeyer> wrtp:
<niemeyer> on initial server:
<niemeyer> 	get tempKey from cloud-init data
<niemeyer> 	tempCert = generate a self-signed certificate using tempKey
<wrtp> niemeyer: is that bad?
<niemeyer> wrtp: Just send the key/cert signed by the temporary local CA for the server to use
<wrtp> niemeyer: what's the point of generating two keys when one will do?
<wrtp> niemeyer: (i'm perfectly willing to accept that there may be a good reason, but i couldn't think of one)
<niemeyer> wrtp: What are you going to put on the root CA field?
<wrtp> niemeyer: the certificate itself.
<wrtp> niemeyer: sorry, the temporary key itself
<wrtp> niemeyer: it's self-signed
<niemeyer> wrtp: Bingo.. it's a certificate, not a private key
<wrtp> niemeyer: the server can generate its own certificate
<wrtp> niemeyer: i don't *think* the server needs to share a CA with the client, but i may be wrong.
<niemeyer> wrtp: Why are you generating the same certificate twice?
<wrtp> niemeyer: we only generate it once
<wrtp> niemeyer: on the server.
<wrtp> niemeyer: the client just generates the private/public key pair
<niemeyer> wrtp: and what do you put on the local root CA field?
<wrtp> niemeyer: we don't need one
<niemeyer> wrtp: Argh
<niemeyer> wrtp: It's been three hours, and we're still stumbling upon basics of PKI
<wrtp> niemeyer: we can do TLS authentication with no CAs
<niemeyer> wrtp: Gosh
<wrtp> niemeyer: i have Go code that does that, in fact.
<niemeyer> wrtp: Okay, please teach me about how that works
<wrtp> niemeyer: we use the usual diffie-hellman key exchange, which verifies the public key at the other end
<wrtp> niemeyer: then we check that public key matches the one we expect
<niemeyer> wrtp:
<niemeyer>     // RootCAs defines the set of root certificate authorities
<niemeyer>     // that clients use when verifying server certificates.
<niemeyer>     // If RootCAs is nil, TLS uses the host's root CA set.
<niemeyer>     RootCAs *x509.CertPool
<niemeyer> wrtp: That's in the tls.Config type
<wrtp> niemeyer: yes
<wrtp> niemeyer: it's optional.
<niemeyer> wrtp: What do you suggest we do with this?
<wrtp> niemeyer: it'll work with an empty cert pool
<niemeyer> wrtp: Yes, and what happens if you don't use it?
<niemeyer> wrtp: What happens in that case?
<niemeyer> wrtp: (hint: it's in the comment)
<wrtp> niemeyer: you don't verify that the certificate is signed
<wrtp> niemeyer: however...
<niemeyer> wrtp: Wrong
<wrtp> niemeyer: you can easily check the exact certificate
<niemeyer> wrtp: It uses the host CA pool
<niemeyer> wrtp: Which contains root CAs, which the key is not signed against
<wrtp> niemeyer: you can use the ConnectionState method to find out the public key at the other end
<niemeyer> wrtp: There are many things we could do, but that's not what we'll do. What we'll do is use plain and well known SSL, as intended.
<wrtp> niemeyer: ok. it seems a bit unnecessary (checking the public key directly is just as strong), but fair enough.
<niemeyer> wrtp: Sorry, I'll stop discussing that now. Let's talk next week.
<wrtp> niemeyer: ok
<wrtp> niemeyer: assuming we generate a certificate too, does the rest look ok?
<niemeyer> wrtp: I really mean it, sorry about that. We've been discussing this for about 3h without progress. Next week we talk live with a whiteboard and sort it out.
<wrtp> niemeyer: ok
<wrtp> :-(
<niemeyer> wrtp: I suggest picking stuff you'd enjoy working on tomorrow as a brain break
<wrtp> niemeyer: i was thinking *this* might be quite fun :-)
<wrtp> niemeyer: for the record: http://paste.ubuntu.com/1287384/
<wrtp> i'm done for the evening
<wrtp> night all
<niemeyer> wrtp: Have a good evening
<wrtp> niemeyer: and you. v sorry about the wasted time this afternoon.
<niemeyer> wrtp: All good, next week will be a lot easier
<wrtp> niemeyer: indeed
#juju-dev 2012-10-19
<fwereade__> morning wrtp
<wrtp> fwereade__: hiya
<TheMue> morning
<wrtp> TheMue: mornin'
<TheMue> wrtp: heya
<fwereade__> TheMue, heyhey
<TheMue> fwereade__: hi
<fwereade__> wrtp, TheMue: I think https://codereview.appspot.com/6737050 is a trivial
<TheMue> fwereade__: LGTM
<fwereade__> TheMue, thanks
<wrtp> fwereade__: looking
<wrtp> fwereade__: why can't we use the same log file for output and log messages?
<wrtp> fwereade__: in fact, we could just send log messages to stdout, maybe
<fwereade__> wrtp, I dunno, I feel that it's a different class of message... anything going to out is evidence that we've royally screwed up somewhere
<wrtp> fwereade__: i'm somewhat -1 on adding another log file to monitor
<fwereade__> wrtp, well, it was always there in python, and it saved us hours of debugging there, and would have saved it for us the other day too
<wrtp> fwereade__: i'm definitely not -1 on using Out
<fwereade__> wrtp, in fact I'm pretty sure I LGTMed your container code only on the condition that you added it :/
<fwereade__> wrtp, but, meh, easy to miss :)
<wrtp> fwereade__: the nice thing about having output go to the same file as log messages is that you can see the output in context
<wrtp> fwereade__: given that, probably, the only output we're going to see is a panic stack trace, i think that's useful.
<fwereade__> wrtp, or errors from before logging is set up, or failure to even launch the process
<fwereade__> wrtp, it's a backstop for should-never-happen errors
<wrtp> fwereade__: true too. but it's all good to see in the context of the other logging messages (for instance the ones that were logged on the previous run)
<fwereade__> wrtp, are you suggesting replacing --log-file with an Out then?
<fwereade__> wrtp, or using both to write to the same place?
<wrtp> fwereade__: if we just remove the --log-file flag, will the Out field cause the stderr log msgs to go to the log file?
<fwereade__> wrtp, think so, haven't tried
<wrtp> fwereade__: yeah, it looks like it will
<fwereade__> wrtp, but does make it somewhat tricky to add log rotation in future, I think
<wrtp> fwereade__: that's an interesting point.
<fwereade__> wrtp, and that does feel like something we will need sooner or later
<wrtp> fwereade__: we could potentially implement log rotation by restarting the agent, i suppose
<wrtp> fwereade__: if it's sufficiently infrequent, that's probably no problem
<fwereade__> wrtp, that approach feels a bit lumpen to me, but maybe it's a matter of taste
<wrtp> fwereade__: yeah, i know what you mean
<fwereade__> wrtp, anyway, I think we have passed the "this is not trivial" milestone in this discussion
<wrtp> fwereade__: sorry
<fwereade__> wrtp, I'm fine punting that decision to niemeyer -- I don't *really* care how we get the panics, just that we do somehow get them :)
<fwereade__> wrtp, np at all
<fwereade__> wrtp, whatever we implement should be pretty simple in the end
<wrtp> fwereade__: yeah
 * wrtp wishes upstart had a reference manual
<Aram> moin.
<Aram> davecheney: any chance of recording from the gophers' meeting?
<TheMue> Aram: moin, moin
<davecheney> Aram: sorry, there was no recording
<davecheney> but the slides are online
<Aram> pity.
<davecheney> talks.golang.org
<Aram> yeah, seen them.
<davecheney> sorry, very low fi
<davecheney> we didn't have it at the swanky google offices
<fwereade__> hmm, how does anyone else feel about s/RelationEndpoint/Endpoint/ ?
<wrtp> fwereade__: in state, presumably?
<fwereade__> wrtp, yeah
<fwereade__> wrtp, Endpoint only means one thing, just like Unit, and we don't call Unit ServiceUnit
<fwereade__> wrtp, (RelationUnit is, I think, different, because it really is a combination of a Relation and a Unit)
<wrtp> fwereade__: mixed feelings. on the one hand, i think "yeah, great idea - shorter name". one the other hand, the state name space is quite crowded.
<fwereade__> wrtp, expand on the second bit -- don't see how a simple name change affects the crowdedness
<wrtp> fwereade__: it's just that it's not always obvious which names relate to which things.
<wrtp> fwereade__: but on balance i'm probably +1
<fwereade__> wrtp, ok, I'll probably run it past niemeyer when he arrives
<fwereade__> wrtp, cheers :)
<fwereade__> later all, lunchtime
<davecheney> http://codereview.appspot.com/6734043/
<dimitern> hey, this is fwereade in the wrong place... am I meant to be in a meeting
<dimitern> (I went to fix cath's aunt's toilet over lunch, and it took a bit longer than expected, but dimiter's house is on the way home...)
<davecheney> meeting ?
<davecheney> not that i know of
<wrtp> dimitern: neither me
<dimitern> wrtp: well, it's dimitern again, fwereade is on the way home now :)
<wrtp> dimitern: :-)
<wrtp> dimitern: nice meetin' ya :-)
<wrtp> (if this could actually be called "meeting"...)
<dimitern> wrtp: likewise :)
<fwereade__> hey again all
<fwereade__> I just had a sudden hey-wait-we-sometimes-meet-on-fridays-I-hope-it's-not-today moment
<davecheney> fwereade__: yeah, blah blah, airplanes, blah blah, uds
<davecheney> during this season, regular concerns are suspended
<fwereade__> davecheney, haha :)
<niemeyer> Hello all!
<niemeyer> I'll step out to run some errands I have to sort before Copenhagen.. back later.
<fss> niemeyer: morning :-)
<hazmat> niemeyer_, g'morning, had a question re the bzr revid that the store could serve up (namely how to access it), pls ping me when you're around
<niemeyer_> hazmat: Heya
<niemeyer_> hazmat: It's already implemented, but it's not yet deployed
<niemeyer_> hazmat: We have to sync up with mthaddon to get it out
<niemeyer_> hazmat: It's just another field on that usual charm doc info
<hazmat> niemeyer_, ah.. okay. i was wondering about that. will it just appear in the store data for a given charm alongside revision and sha?
<hazmat> cool
<niemeyer_> info doc
<hazmat> perfect, then the browser will just pick it up
<niemeyer_> hazmat: "digest", specifically
<niemeyer_> hazmat: The generic term is because it doesn't really matter for the store.. it's whatever the revision control labels it
<hazmat> sounds good, i just wanted to make sure it was in the info doc.
<wrtp> time to stop for the day.
<wrtp> have a great weekend everyone, see y'all monday or in copenhagen!
<hazmat> wrtp, cheers
#juju-dev 2012-10-20
<otubo> hello guys, I just added the Juju PPA, installed and updated the  ~/.juju/environments.yaml with my credentials, but I'm having this particular issue when I try to 'juju bootstrap' -
<otubo> otubo@yoda ~ $ juju bootstrap
<otubo> 2012-10-19 22:59:04,071 INFO Bootstrapping environment 'sample' (origin: distro type: ec2)...
<otubo> 2012-10-19 22:59:04,116 ERROR [('PEM routines', 'PEM_read_bio', 'no start line')]
<otubo> Does anyone know if this is a known issue?
<hazmat> otubo, more suitable for #juju
<hazmat> otubo, lots of juju people are currently traveling fwiw.. i'll try and address in the other channel
<hazmat> except you're not there..
<otubo> hazmat, sorry, I was away, I'll join #juju, thanks! :)
<otubo> hazmat, did you send my messages to #juju? Should I post there?
<hazmat> otubo, already done
#juju-dev 2012-10-21
 * hazmat arrives at hotel
<davecheney> 21:52  * hazmat arrives at hotel
<davecheney> 21:52 -!- hazmat [~hazmat@plone/hazmat] has quit [Excess Flood]
<davecheney> ^ lol
#juju-dev 2013-10-14
<axw> hey thumper, have a nice break?
<axw> anything I can help with on the maas thing?
<thumper> axw: yeah, nice to get that mental break
<thumper> axw: I'm trying to work out why maas is having bootstrap issues, and ec2 and openstack not
<axw> thumper: I guess I won't be of use without maas knowledge, but give me a yell if there's something I can do.
<thumper> sure
<bigjools> this is a bit of a critical problem
<thumper> bigjools: right
<bigjools> thumper: forwarding you another email
<thumper> bigjools: do you have any logs of a failed bootstrap?
<bigjools> no, I am going to start poking soon
<thumper> in particular, I want everything related to the cloud-init
<bigjools> did rvb's email not cover it?
<bigjools> sent you two emails
<bigjools> there's two problems I see at the moment
<thumper> bigjools: the pastebin (http://paste.ubuntu.com/6226420/) had two lines but I don't know where they're from
<bigjools> 1. bootstrap failing as per rvb
<bigjools> it's the QA lab environment
<bigjools> 2. destroy-env gets its knickers in a twist
<bigjools> I reckon the pastebin shows an invalid command line quite frankly
<bigjools> --constraints is not getting any arg
<thumper> locally, it gets '' if not set
<thumper> I'm not sure how it ends up getting missed on the command line
<thumper> I can submit a fix *right now* for the empty one
<thumper> but I was really wanting to know why we didn't see issues on the ec2/openstack side
<bigjools> NFI
<thumper> yeah
<thumper> me neither
 * thumper bootstraps an ec2
<bigjools> something is causing --constraints to get passed in with no arg, but I don't know what would do that
<bigjools> passed in to jujud that is
<thumper> hmm...
<thumper> well,
<thumper> when we generate the cloudinit
<thumper> it makes" --constraints ''   " if empty
<thumper> so
 * thumper shrugs
<bigjools> ok I am going to recreate locally
<thumper> why is the version in lp:juju-core/1.16 1.15?
<bigjools> ?
<bigjools> I have a sneaking suspicion that the tools are mismatched somewhere
<thumper> nah
<thumper> well...
<thumper> actually
<thumper> hmm...
<thumper> ah...
<thumper> nope, my fault there
 * thumper bootstraps again
<thumper> bigjools: got time for a hangout?
<thumper> bigjools: is your shitty connection up to it?
<bigjools> yeah it's ok now
<bigjools> I finally plugged in my ADSL filter and it's better :)
<thumper> haha
<thumper> bigjools: hangout?
<bigjools> yea just call me
<thumper> kk
<axw> thumper: re maas vs. ec2/openstack, I *suspect* it's due to https://codereview.appspot.com/13802045/patch/36001/37002
<axw> see IsEmpty
 * thumper looks
<axw> and parseTags
<axw> bit of a wild guess tho
<axw> hmm tags= shouldn't be in the string tho, so maybe not
<thumper> this isn't about tags, but constraints
<axw> thumper: yeah... there's tags in constraints now. was thinking it might've affected whether the constraints are considered empty or not
<bigjools> thumper: https://bugs.launchpad.net/juju-core/+bug/1239496
<_mup_> Bug #1239496: Packaged 1.16 juju has oauth errors talking to maas <juju-core:New> <https://launchpad.net/bugs/1239496>
<bigjools> rather critical
<bigjools> thumper: http://pastebin.ubuntu.com/6234274/
<bigjools> wallyworld_'s fault you say?
<wallyworld_> wot
<bigjools> hang on
<wallyworld_> you are trying to bootstrap using a 1.9 client?
<bigjools> trying to work out what TF I am building
<bigjools> I rebuilt and it said I am using 1.9
<wallyworld_> pebkac
<bigjools> trunk branch moved ...
<bigjools> old checkout
<thumper> axw: you could look at this https://codereview.appspot.com/14494054
<axw> thumper: will do
<thumper> hang on
<thumper> that's bollocks
<thumper> WTF went on there then
<thumper> I branched from 1.16
<thumper> and proposed to trunk
<bigjools> ok oauth in latest core is bollixed
<bigjools> https://bugs.launchpad.net/juju-core/+bug/1239496
<_mup_> Bug #1239496: 1.16 juju and above has oauth errors talking to maas <juju-core:New> <https://launchpad.net/bugs/1239496>
<davecheney> bigjools: which series are you running
<davecheney> i wonder if this is 'compiled against an old versino of go' again
<bigjools> saucy
<davecheney> hmm
<davecheney> nope, not that then
<thumper> davecheney: bigjools says the package version is poked
<bigjools> package in saucy itself is bollixed
<davecheney> bigjools: ok
<davecheney> what if you build from the 1.16.0 tarball
<davecheney> download it
<davecheney> set your GOPATH to the root of the tar
<davecheney> go install ./...
<davecheney> that is the litmus test
<bigjools> where is it again?
<davecheney> gotta go to the 1.16 series
<davecheney> https://launchpad.net/juju-core/1.16/1.16.0/+download/juju-core_1.16.0.tar.gz
<bigjools> ta
<thumper> axw: https://codereview.appspot.com/14494054/ diff is now right
<axw> thumper: ta, reading
<thumper> but still need to target one against 1.16
<axw> ok
<axw> bigjools: did you also notice that the realm is different in the working/non-working tests?
<bigjools> axw: yes
<bigjools> maas has a log entry saying "invalid consumer"
<thumper> axw: is the realm used in anyway to change the header values?
<bigjools> src/code.google.com/p/go.net/ipv6/sockopt_rfc3542_linux.go:33: too many errors
<bigjools> nice
<axw> thumper: not that I've noticed so far
<axw> thumper: I mean, apart from the "realm" part in the HTTP header
<axw> one has realm="", one has realm="MAAS+API"
<bigjools> davecheney: still got a 401 with that tarball
<axw> which is URL-encoded (in code it's "MAAS API")
<bigjools> davecheney: it prompts another question - why did Mark's package work
 * bigjools grabs food
<davecheney> bigjools: is he using saucy ?
<axw> thumper: lgtm
<thumper> axw: ta
<bigjools> davecheney: yes from what I can tell, his juju --version showed 1.16.0-saucy-amd64
<davecheney> ok
<davecheney> just checking
<davecheney> in case he was running precise
<bigjools> thumper: did you end up uploading a new binary for me?
<thumper> bigjools: yes, I copied the one I built into the ~/tim dir
<bigjools> ok ta let's try it
<bigjools> it gets the 401 too
<thumper> wtf?
 * thumper goes to do some bzr magic to see the changes to the maas provider between 1.14 and 1.16
<thumper> bigjools: could it be gomaasapi?
<thumper> davecheney: do we know which revisions of gomaasapi were used for 1.14?
<thumper> davecheney: we have tarballs, right?
<davecheney> thumper: it's encoded in the release-tarball script on the 1.14 branch
<bigjools> thumper: not sure it changed.  Oauth stuff is handled in Go's library
<davecheney> there will also be a tag on gomaasapi
 * thumper grabs both release tarballs
<thumper> bigjools: wondering why sabdfl didn't hit that problem...
<bigjools> thumper: exactly
<bigjools> thumper: nor in the lab
<bigjools> is it only on my box?  if so why does an old version work
<thumper> bigjools: yes, very weird
<thumper> nothing oauth related has changed in gomaasapi nor provider/maas in the 1.14.1 -> 1.16 change
<thumper> we need another maas to test against
<bigjools> I was gonna say
<bigjools> easy for you to do that
<bigjools> you don't need nodes
<thumper> bigjools: how?
<bigjools> run it from a branch or a package; all you need to do is "juju destroy-environment"
<bigjools> and if buggered you get a 401
<bigjools> you can even do it on canonistack
<thumper> say what?
<thumper> axw: and now the 1.16 branch version, with other bug references: https://codereview.appspot.com/14516056/
<axw> thumper: just lgtm'd
<thumper> ta
<axw> thanks for logging the bugs
<thumper> np
<bigjools> juju is now an easier interface to openstack than the other command line tools
<davecheney> :)
<davecheney> juju deploy -n 1000 ubuntu
<bigjools> well even for adding arbitrary instances
<bigjools> I just use bootstrap and abuse it, or add-machine
<bigjools> davecheney: on that note, how do I force a particular distroseries with add-machine?
<bigjools> oh --series ... duh
<bigjools> davecheney: so for some odd reason that oauth problem is only on my maas test box at home
<davecheney> does oauth use the hostname of the server ?
<bigjools> no idea
<bigjools> the server is called "maas" so if that's causing problems I think we can all quit now
 * davecheney packs his bags
<bigjools> axw, davecheney: if you can think of anything, anything at all, why the Auth header would be screwy on only one system.... I would be rather grateful.
<axw> bigjools: what's the diff in Auth header with the same juju, in the two envs?
<bigjools> axw: https://bugs.launchpad.net/juju-core/+bug/1239496
<_mup_> Bug #1239496: 1.16 juju and above has oauth errors talking to maas <juju-core:New> <https://launchpad.net/bugs/1239496>
<bigjools> older juju binaries work fine
<bigjools> so something that is compiled into it, possibly from the stdlib, is behaving differently because of some environmental difference
<axw> bigjools: do you have a capture of the auth header from an older juju?
<bigjools> I'll get one hang on
<bigjools> oh crap overwrote the old binary
<bigjools> I'll grab a tarball and rebuild
<axw> bigjools: you didn't happen to edit environments.yaml after bootstrapping, did you?
<bigjools> nope
<axw> k
<axw> just asking because environments.yaml gets cached in these .jenv files now
<bigjools> O_o
<axw> e.g. ~/.juju/environments/maas.jenv
<bigjools> what's that for?
<axw> so we can auto generate stuff like admin-secret
<axw> not sure what else tbh
<axw> did you say you got this error when you did a destroy-environment, without first bootstrapping?
<axw> or did I imagine that
<bigjools> yeah
<bigjools> an easy way to test
<bigjools> it tries to list nodes and gets a 401
<bigjools> but any command gets the same response
<bigjools> axw, davecheney: so I can recreate with the 1.15 release, 1.13 does not have  the problem, let me just find the header
<bigjools> axw: I updated the bug
<axw> ta
<axw> bigjools: the only thing that jumps out at me is the non-empty realm in 1.16
<axw> both others have realm=""
<bigjools> axw: see the latest one
<axw> ok
<bigjools> axw: ah sorry ignore the realm=""
<bigjools> I posted the wrong request on the comment
<axw> ok, no worries
<bigjools> the ones that don't work use completely the wrong token
<bigjools> axw: how do I clear that jenv?
<bigjools> axw: because I can see from examining it the oauth is different ...
<axw> bigjools: it gets created when you bootstrap
<bigjools> so it looks like I did change the environments.yaml at some point
<axw> ok, that would explain it
<bigjools> this is a bug in juju IMO - I should be able to change the credentials
<bigjools> if my maas server was compromised, I need to change them
<axw> yeah, I'm not sure what the supported method of doing that is
<axw> you'd need to update the state db I think, via set-environment
<axw> whatever's in the .jenv should just be used for connecting to the API server or state server I think.
<bigjools> I hacked the jenv
<bigjools> works that way
<bigjools> not ideal if you don't know about it :/
<axw> yeah, I was banging my head against that for a while when dev/testing the null provider :)
<axw> perhaps we should check contents of environments.yaml and .jenv, and make sure there's no changes
<bigjools> good idea
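axw's proposed check might be sketched like this (hypothetical file names, paths, and keys; the real files are `~/.juju/environments.yaml` and the per-environment `~/.juju/environments/<env>.jenv` mentioned above):

```shell
# Sketch: warn when the user-edited environments.yaml has drifted from the
# cached .jenv, since juju reads the cached copy, not the file you edited.
demo=$(mktemp -d)
mkdir -p "$demo/.juju/environments"
# Stand-ins for the real files, deliberately out of sync:
printf 'maas-oauth: old-consumer:old-token:old-secret\n' > "$demo/.juju/environments/maas.jenv"
printf 'maas-oauth: new-consumer:new-token:new-secret\n' > "$demo/environments-maas.yaml"
if ! diff -q "$demo/environments-maas.yaml" "$demo/.juju/environments/maas.jenv" >/dev/null; then
  drift=yes
  echo "warning: environments.yaml differs from cached maas.jenv; the .jenv wins"
fi
```

This is exactly the trap bigjools hit: rotating the maas-oauth credentials in environments.yaml does nothing once the stale copy is cached.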
<axw> I'll log a new bug
<bigjools> thank you
<bigjools> and thanks for the help
<axw> no worries
<rvba> thumper-afk: I'm testing your fix for the empty constraints stuff in the lab now.
<bigjools> I am re-creating that here as well
<rvba> thedac: Looks like the fix for that worked (no empty --constraints any more) but jujud is still not starting properly: http://paste.ubuntu.com/6234820/
<rvba> arg, sorry thedac, I meant thumper-afk.
<bigjools> --upload-tools seems to cause bootstrap to break, at least on maas
<bigjools> http://paste.ubuntu.com/6234893/
<davecheney> bigjools:  i don't see anything in that log that says it is actually uploading anything
<davecheney> that would be the problem
<rvba> fwiw, I've this error (EOF) happening from time to time, sometimes when fetching the tools from aws.  But it was always transient.  Any idea?
<bigjools> davecheney: the problem is that bootstrap has not done anything!
<bigjools> getting that EOF error every time
 * thumper looks at rvba's pastebin
<rogpeppe> mornin'
<rogpeppe> thumper: hiya
<rvba> thumper: see my most recent email to the ML to have all the logs.
<rvba> Morning rogpeppe.
<thumper> rvba: ack
<thumper> hi rogpeppe
<rogpeppe> rvba: yo!
<davecheney> i think that eof indicates it ran off the end of whatever it was asking maas to list
<bigjools> https://bugs.launchpad.net/juju-core/+bug/1239558
<_mup_> Bug #1239558: --upload-tools failure preventing bootstrap completing <juju-core:New> <https://launchpad.net/bugs/1239558>
<thumper> rvba: do you have access to the userdata.txt for the cloud image boot?
<thumper> rvba: on ec2 found it in /var/cloud/... somewhere
<thumper> sorry
<thumper> /var/lib/cloud
<rvba> thumper: sure, let me get that for you…
<davecheney> /var/lib/cloud/instance/
<davecheney> one of those is the text
<thumper> rvba: also, log for mongod
<thumper> rvba: the one that is the juju one
 * thumper looks for the init script name
<bigjools> eek, why are tools for quantal getting uploaded by default
<rvba> /var/lib/cloud/instance/cloud-config.txt: http://paste.ubuntu.com/6234904/
<rvba> /var/log/mongodb/ is empty (!)
 * bigjools heading out to eat, back later
<rvba> ps aux | grep mongo → zero
<thumper> WTF?
<thumper> rvba: it seems that we have jujud trying to start before mongod, that would certainly be an error :)
<rvba> http://paste.ubuntu.com/6234911/
<rvba> Mongodb is not installed
<thumper> rvba: what about mongodb-server
<rvba> 1:2.0.4-1ubuntu2.1 is installed
<thumper> hmm..
<thumper> that is the wrong version
<rvba>  sudo service mongodb status → mongodb stop/waiting
<thumper> the cloud-init has removed the stable ppa that has the mongodb-server package
<thumper> rvba: what is the install source of the mongodb server package
<thumper> ?
<rvba> thumper: http://paste.ubuntu.com/6234922/
<thumper> rvba: the juju mongo service is "juju-db"
<rvba> arg, one sec…
<rvba> thumper: http://paste.ubuntu.com/6234924/
<rvba> thumper: ah ok
<thumper> rvba: but it will be trying to start with ssl
<thumper> and failing
<rvba> $ sudo service juju-db status → juju-db stop/waiting
<davecheney> rvba: did you see my reply ?
<davecheney> Get:17 http://archive.ubuntu.com/ubuntu/ precise-updates/universe mongodb-clients amd64 1:2.0.4-1ubuntu2.1 [16.5 MB]
<davecheney> Get:18 http://archive.ubuntu.com/ubuntu/ precise-updates/universe mongodb-server amd64 1:2.0.4-1ubuntu2.1 [4167 kB]
<davecheney> ^ ppa:juju/stable is missing from this bootstrap node
<davecheney> it will never bootstrap properly
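The failure mode above could be caught with a version sanity check like the following sketch. The required version is an assumption (the juju stable PPA / cloud-tools pocket is said above to carry an SSL-enabled mongodb newer than precise's 2.0.4); on a real node `installed` would come from `dpkg-query -W -f '${Version}' mongodb-server`:

```shell
# Sketch: flag the "wrong mongodb-server" bootstrap failure early.
required="1:2.2.4"                 # assumed minimum with SSL support
installed="1:2.0.4-1ubuntu2.1"     # the archive version seen in this log
oldest=$(printf '%s\n' "$required" "$installed" | sort -V | head -n1)
if [ "$oldest" != "$required" ]; then
  tooold=yes
  echo "mongodb-server $installed predates $required: juju-db cannot start with --ssl"
fi
```

When the check fires, the symptom matches this log exactly: juju-db is stop/waiting, and every client connection fails.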
<thumper> rvba: and it seems to be logging to syslog
<thumper> davecheney: agreed
<rvba> davecheney: ah ok
 * thumper looks at the 1.16 cloud-init work
<davecheney> did someone try to remove ppa:juju/stable from our cloud-init scripts again ?
<davecheney> i recall a change recently
<thumper> yeah, me too
<thumper> that was to add the cloud archive
<thumper> I wonder if they removed the old
 * thumper greps the bzr logs
 * davecheney flips a table
<davecheney> jamespage: is the mongodb update available in the cloud tools pocket ?
<thumper> ok found the change
<thumper> axw: ping
<axw> thumper: yo
<thumper> hey
<thumper> axw: r1948 where you added cloud-tools
<thumper> you modified the apt-repository bit for the ppa/stable
 * axw brings up the diff
<davecheney> thumper: axw just get the mongo depdency installed into the cloud-tools pocket
<davecheney> that'll fix it today
<davecheney> no release required
<axw> last I checked, mongo was in cloud-tools
<jamespage> davecheney, its already in the cloud-tools pocket
<jamespage> but the cloud archive is just for 12.04
<thumper> axw: http://paste.ubuntu.com/6234904/
<thumper> axw: the cloud-init userdata doesn't add ppa:juju/stable
<thumper> axw: also, it uses mongodb-server
<thumper> not mongo
<TheMue> morning
<thumper> hmm...
<thumper> mongodb-server is there
<axw> thumper: I tested this live
<axw> it's in  both ppa and pocket
 * thumper wonders what was booted with maas
<thumper> jamespage: it appears to be precise
 * thumper ponders
<thumper> ah...
<thumper> here it goes
<thumper> W: Failed to fetch http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/precise-updates/cloud-tools/main/binary-amd64/Packages  403  Forbidden
<thumper> E: Some index files failed to download. They have been ignored, or old ones used instead.
<axw> yeah just saw that
<jamespage> thumper, some sort of proxy blocking access?
<axw> what's up with that?
<thumper> rvba: can the machine see it?
 * rvba tests.
<thumper> jamespage: qa lab is firewalled off I think
<thumper> rvba: you'll need to get a hole blasted to that location
<rvba> thumper: that machine can access the file all right.
<rvba> (i.e. wget http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/precise-updates/cloud-tools/main/binary-amd64/Packages works)
<thumper> rvba: can you do an update, upgrade?
<thumper> intermittent blockage?
<rvba> hum, apt-get update → Failed to fetch http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/precise-updates/cloud-tools/main/binary-amd64/Packages
<thumper> yeah... that'll be it :)
<thumper> jamespage: do you know what has changed with the cloud-init process?
 * rvba wonders why wget works.
<thumper> something has changed to how it executes the scripts
<thumper> in particular the --constraints '' used to work
<thumper> and now it doesn't
<thumper> the empty two-single-quote param was getting dropped somewhere
<thumper> hmm...
<thumper> or perhaps...
<thumper> it is just how it is logged out
<thumper> and copying that got a different problem
<thumper> that would kinda explain it
<jamespage> thumper, sorry - no I don't
<rvba> Here it goes, apt is configured to use the region's squid as a proxy and that is failing to fetch things from ubuntu-cloud.archive.canonical.com.
<thumper> jamespage: doesn't matter, I think I explained it to myself
<rvba> thumper: okay, I fixed the proxy in the lab, let's try again to bootstrap a node…
<rvba> (with your empty constraints fix)
<thumper> rvba: I'd be curious to see it without my fix
<thumper> but lets try with first :)
<rvba> thumper: node got bootstrapped okay.
<bigjools> I recreated it without any fix
<thumper> rvba: \o/
<thumper> bigjools: did you say you got one bootstrapping without the fix?
<bigjools> thumper: I recreated the bug
<bigjools> ie. fail
<thumper> oh
<bigjools> I can retry with the fix
<thumper> bigjools: recreated the bug locally?
<rvba> thumper: testing again without your fix, just to confirm.
<bigjools> thumper: yes
<thumper> bigjools: is it the same? no access to the cloud-tools ?
<thumper> or the bad command line parsing?
<thumper> I'm confused as to how the command-line parsing is different
<bigjools> thumper: the command line thing with the --constraint
<thumper> really?
<thumper> wow
<thumper> how does the cloud-init get run in the maas environment?
<bigjools> something is arsed in the core
<bigjools> it runs as part of the image
<bigjools> takes params from kernel cmd line
<bigjools> then downloads stuff from the metadata service
<thumper> bigjools: no, the cloud-init data is good
<thumper> bigjools: something has changed in how it is run
<thumper> bigjools: for the failing bootstrap with the constraint error, can I see the cloud-init userdata file?
<bigjools> thumper: we're on our call, hang on
<rogpeppe> thumper: what's the problem here?
<thumper> rogpeppe: it seems that the maas execution of the cloud-init user data was dropping an empty string param
<thumper> rogpeppe: causing the jujud bootstrap-state command to fail
<thumper> rogpeppe: I'm gathering data
<thumper> rogpeppe: landed a fix for it by not passing constraints where there are none
<rogpeppe> thumper: this doesn't happen on ec2?
<thumper> but I'd like to know why it is happening
<thumper> no, ec2 is fine
<rogpeppe> thumper: hmm, odd.
<thumper> yeah
<rvba> thumper: the node bootstrapped okay *without* your fix this time… I'm pretty confused…  maybe the wrong juju was used somehow…
<thumper> hmm…
<thumper> even more curious about bigjools's cloud-init issue now
<rogpeppe> it's a pity that set -x doesn't print quoted commands correctly
<thumper> yeah
<rogpeppe> thumper: still, you should see an extra space in the log
<rogpeppe> thumper: if the command was really there
<rogpeppe> thumper: actually, another possible fix: use --constraints=foo rather than --constraints foo
<rogpeppe> thumper: well, it would be interesting to see if that made any difference, anyway
 * thumper shrugs
<thumper> slightly cleaner not to pass through nothing IMO
<rogpeppe> thumper: and therefore whether it was a quoting issue or not
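rogpeppe's distinction between `--constraints foo` and `--constraints=foo` can be demonstrated with a tiny argument-counting function (a sketch; `count_args` is a made-up stand-in for jujud's flag parsing):

```shell
# Demo: '' is a real argument, but only in the two-token form. The inline
# --constraints='' form is a single token and survives naive re-logging.
count_args() { echo "$#"; }
two=$(count_args --constraints '')    # flag plus a separate empty argument
one=$(count_args --constraints='')    # one flag=value token
echo "separate form: $two args, inline form: $one arg"
```

So if some layer re-splits or re-echoes the command line, the inline form is the safer spelling, which is exactly rogpeppe's suggestion.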
 * thumper nods
<thumper> I'm hesitant to think it is a quoting issue
<thumper> I'm thinking user error, but want more data :)
<rogpeppe> thumper: why do you think the problem was a dropped empty string param?
<thumper> rogpeppe: because I didn't read the email properly
<thumper> :)
<rogpeppe> thumper: heh
<thumper> perhaps not enough coffee
<thumper> or alcohol
<thumper> or something
<rogpeppe> thumper: what kind of user error do you think might have caused the problem?
<bigjools> thumper: ok sorry I destroyed-env before you asked :(
<thumper> :(
<thumper> bigjools: do it again :)
<bigjools> can do, will leave it running while I deal with kids bedtime
<thumper> ta
<bigjools> bloody annoying that I get that EOF when using --upload-tools
<bigjools> can you fix that too, kthxbai
<thumper> wallyworld_: ^^
<bigjools> at least is there a way to stop it downloading the world's tools?
<bigjools> I only need one set
<wallyworld_> reading backscroll, what's the issue?
<wallyworld_> EOF uploading tools?
<wallyworld_> that usually means the source tools can't be found
<wallyworld_> which means the compile failed perhaps
<wallyworld_> i'd need to see some logs etc
<bigjools> wallyworld_: https://bugs.launchpad.net/juju-core/+bug/1239558
<_mup_> Bug #1239558: --upload-tools failure preventing bootstrap completing <juju-core:New> <https://launchpad.net/bugs/1239558>
<bigjools> thumper: ok where's the file you want?
<wallyworld_> bigjools: it's not upload tools per se - it's the maas storage provider returning an EOF when a file is requested
<wallyworld_> sorry, i mean a list() is performed
<wallyworld_> the expected behaviour for ec2/openstack etc is to return an empty list if there are no such files
<wallyworld_> i guess maas is different :-(
<bigjools> \o/
<bigjools> maas itself returns an empty list
<bigjools> so I guess the provider needs changing
<wallyworld_> i'll have to look a little closer into it
<wallyworld_> upload-tools hasn't changed really
<wallyworld_> i wonder how it worked before
<bigjools> pixie dust
<wallyworld_> the main change is that upload-tools runs automatically if tools cannot be found
<bigjools> and tears drawn from the cheeks of juju developers
<wallyworld_> bigjools: you running this against your local maas?
<bigjools> wallyworld_: Yes. I just wanted to avoid the long wait while it downloads 8 sets of tools
<bigjools> thumper: speak to me!
<wallyworld_> bigjools: maybe you can slip me some creds and i'll try against that too?
<bigjools> slip you what?
<wallyworld_> whatever creds i need to add to my env.yaml
<wallyworld_> so i can try bootstrapping on your maas
<bigjools> that's not going to help, you need ssh access to my server
 * thumper looks up at bigjools
<thumper> bigjools: the user data from /var/lib/cloud
<bigjools> which I can give you, I'll add your ssh key in one moment once I've given thumper my load
<thumper> I don't know exactly where it is
<wallyworld_> bigjools: so i need to ssh in and run juju from there?
<bigjools> wallyworld_: yep
<wallyworld_> ok, so any debugging, i need to checkout juju src etc without my lovely ide :-(
<bigjools> thumper: the actual data has this:
<bigjools> --constraints '' --debug
<thumper> right
<bigjools> thumper: so I call cloud-init bug
<thumper> bigjools: but cloud-init fails when running it?
<bigjools> however I suggest you use --constraints=''
<bigjools> thumper: yeah the '' is lost
<bigjools> wallyworld_: fraid so
<thumper> bigjools: I just don't set --constraints now if empty
<wallyworld_> ok
<bigjools> wallyworld_: ha apparently I already gave you access once before
<wallyworld_> lookslike i never used it :-)
<wallyworld_> bigjools: are there live tests for the maas storage provider (save me looking up the code)
<wallyworld_> bigjools: cause i need to discover why it's giving an EOF
<bigjools> wallyworld_: define live
<bigjools> it's all against a crappy mock IIRC
<wallyworld_> tests which are run against a live maas instance rather than a test double
<bigjools> only place is the QA lab where we run stuff daily
<wallyworld_> bollocks
<wallyworld_> looks like i'll need to figure out how to get that all set up
<wallyworld_> or i may just experiment on your maas
<thumper> bigjools: can I get you to email me the cloud-init and the log output for the cloud-init?
<bigjools> thumper: one sec
<bigjools> thumper: logs are in your "tim" directory on my server
<thumper> bigjools: ta
<bigjools> thumper: I want to destroy env now, ok?
<bigjools> log and cloud-data is there, I mean
<thumper> bigjools: gimmie a minute
<thumper> bigjools: you don't have a problem with --constraints either
<thumper> bigjools: your problem is: 2013-10-14 08:58:18,294 - cc_apt_update_upgrade.py[WARNING]: Source Error: deb http://ubuntu-cloud.archive.canonical.com/ubuntu precise-updates/cloud-tools main:failed to get key from keyserver.ubuntu.com
<thumper> bigjools: when the commandline is echoed in the log file
<thumper> bigjools: it doesn't quote the parameters
<thumper> bigjools: not a cloud-init bug
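The logging trap thumper describes is easy to reproduce: echoing `"$@"` (or `"$*"`) without re-quoting renders an empty argument as nothing but an extra space, so the log looks like the `''` was dropped even though it was passed. A minimal sketch (the command name is just the one from this log, not a real invocation):

```shell
# Simulate the echoed cloud-init command line.
set -- jujud bootstrap-state --constraints '' --debug
logged="$*"      # what an unquoted log line shows: note the double space
argc=$#          # ...yet the empty argument is still really there
echo "log: $logged"
echo "argc: $argc"
```

Copy-pasting `$logged` back into a shell then genuinely loses the argument, which is how both rvba and bigjools were misled.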
<bigjools> hahaha
<rvba> thumper: yeah, the --constraints stuff is not a bug… I thought it was because I copied the output I found in cloudinit's log.
<thumper> rvba: bigjools copied you too
<bigjools> thumper: so if the keyserver is down, we can't deploy ...
<bigjools> \m/
<thumper> bigjools: seems a bit weird huh
<thumper> I wonder if there is a way to encode the key in cloud-init
<bigjools> my nodes don't have internet access BTW
<bigjools> can just talk to the proxy on the maas host
<bigjools> thumper: that was only a warning, it's not fatal
<bigjools> thumper: I still think the error is the --constraints
<bigjools> because of the error text
<bigjools> thumper: also the connection failed thing
<thumper> bigjools: no
<thumper> bigjools: it isn't
<bigjools> mongo is not running
<thumper> correct
<thumper> because you have the wrong version of mongodb-server
<bigjools> that seems to be fairly fatal
<thumper> so it isn't running with ssl
<thumper> so the client can't connect
<bigjools> hurray!
<bigjools> why is it wrong?
<thumper> because it didn't get the mongodb-server from the cloud-tools archive
<bigjools> this is all installed from the archive....
<thumper> it got the default one, which is old
<bigjools> I thought the ssl-based on was in the ubuntu archive now?
<bigjools> one*
<thumper> it is, for saucy and raring
<thumper> but not precise
<thumper> hence the cloud-tools archive
<thumper> which it didn't use
<thumper> because it wasn't authorized
<thumper> anyway
<thumper> I have my iom-maas group access now
<thumper> will test in the morning
<thumper> it is 22:35 here now, and I'm going to have a cuppa
<thumper> night all
<bigjools> so bootstrapping on precise is basically borked
<bigjools> rvba: ^
<rvba> bigjools: I didn't get the logs… can you send them to me?
<bigjools> rvba: no need - read scrollback
<rvba> bigjools: I just got a node bootstrapped with precise in the lab.
<bigjools> !
<bigjools> rvba: in cloud-init-output.log, where is it downloading mongodb-server from ?
<bigjools> rvba: I see Get:18 http://archive.ubuntu.com/ubuntu/ precise-updates/universe mongodb-server amd64 1:2.0.4-1ubuntu2.1 [4167 kB]
<rvba> bigjools: No, it was downloading it from the right place (I don't have the logs handy, I'm recreating the env right now).
<bigjools> rvba: so it means that the cloud-archive was not added to cloud-init....
<davecheney> bigjools: what does /etc/apt/sources.list.d/ say ?
<bigjools> no cloud archive
<davecheney> well, shit
<rvba> bigjools: can you share the cloudinit config?
<bigjools> and I see this in cloud-data
<davecheney> is that 'cos of the proxy issue ?
<bigjools> apt_sources:
<bigjools> - source: deb http://ubuntu-cloud.archive.canonical.com/ubuntu precise-updates/cloud-tools
<bigjools> Oct 14 08:58:18 gxx4a [CLOUDINIT] cc_apt_update_upgrade.py[WARNING]: Source Error: deb http://ubuntu-cloud.archive.canonical.com/ubuntu precise-updates/cloud-tools main:failed to get key from keyserver.ubuntu.com
<bigjools> not sure if relevant
<davecheney> bigjools: yup
<rvba> It is!
<bigjools> not necessarily
<davecheney> you need to setup the proxy to allow keyserver.ubuntu.com
<bigjools> it can continue w/o the key
<davecheney> it could
<davecheney> but it doesn't
<bigjools> but I guess you fixed this already rvba?
<rvba> bigjools: In the lab, I fixed it yes.
<bigjools> rvba: squid-deb-proxy?
<rvba> bigjools: yes.
<bigjools> diff please? :)
<bigjools> rvba: we need to fix this in the release packaging
<rvba> bigjools: hang on, this is only related to the lab's config.
<rvba> bigjools: I fixed the config of the lab's proxy.
<rvba> bigjools: I'm not talking about the proxy installed on the region, I'm talking about the proxy (installed on the lab's machine) that the region's proxy is setup to use.
<rvba> bigjools: does that make sense.
<rvba> bigjools: in short, my issue was completely specific to the lab's setup.
<bigjools> rvba: we also have a proxy on the region that will disallow access to both the keyserver and the cloud archive
<rvba> bigjools: well, apparently not because I didn't have to tweak that at all.
<bigjools> actually the latter is ok because of ".archive.canonical.com"
 * bigjools bootstraps again
<axw> bigjools: is there a session agenda somewhere I can see?
<axw> for next week
<axw> (just saw your email)
 * TheMue => lunch
 * rogpeppe is now really quite damp
<mgz> rogpeppe: you were going *on* the roof?
<rogpeppe> mgz: just out in the rain to look at the roof with the potential contractor
<rogpeppe> mgz: it was absolutely chucking it down
<rogpeppe> prompted by thumper's remarks this morning, a change to environs/cloudinit tests: https://codereview.appspot.com/14665043
<rogpeppe> mgz, TheMue, wallyworld, davecheney: ^
<rogpeppe> mattyw: did you get any further with your issues?
<TheMue> rogpeppe: nice one. one question: in case of multiple non-continuous lines I'm interested in, can I use patterns there?
<rogpeppe> TheMue: i'm not sure what you mean
<TheMue> rogpeppe: e.g. I'm expecting foo \n something uninteresting \n bar \n
<TheMue> rogpeppe: foo and bar is interesting, but not the part in the middle
<rogpeppe> TheMue: yes, sure, that works
<TheMue> rogpeppe: ah, fine, expected nothing else, but wanted to get sure
<rogpeppe> TheMue: (i tried to say that in the comment on inexactMatch, but i guess i didn't succeed)
<rogpeppe> TheMue: the logic in the function should make it fairly clear that that works too
<rogpeppe> TheMue: assertScriptMatch, that is
<rogpeppe> TheMue: thanks for the review, anyway
<TheMue> rogpeppe: yeah, seen the func and thought I mostly got it, but not in total
<TheMue> rogpeppe: missing some comments ;)
<rogpeppe> mgz: ping
<rogpeppe> mgz: ping
<mgz> rogpeppe: hey
<rogpeppe> mgz: fancy doing some stuff on addresses?
<mgz> rogpeppe: lets, give me five then I'll join the hangout
<rogpeppe> mgz: tell you what, gimme 15 mins and i'll have a bite to eat first
<mgz> rogpeppe: sure
<mattyw> rogpeppe, same again, http://paste.ubuntu.com/6235841/ I was looking at other stuff this morning, looking again now
<rogpeppe> mattyw: i'm just about to start on some other stuff, but it would be good to find out more about what's going on.
<rogpeppe> mgz: https://plus.google.com/hangouts/_/b9aefae7756c493cf05ba17f092adfe125b6305d?authuser=1
<mgz> rogpeppe: ta
<rogpeppe> mattyw: perhaps tomorrow morning?
<mattyw> rogpeppe, sounds good, I'm going to see if I can get it going on my main machine so I can carry on, but I'll leave that vm as it is and sure, let's look tomorrow morning
<mattyw> rogpeppe, is william on holiday?
<mgz> mattyw: for a few days, yeah
<mattyw> rogpeppe, trying it out on my main machine I get similar errors - not exactly the same, but lots of sockets left open
<rogpeppe> mattyw: that's, erm, good
<mattyw> rogpeppe, it's certainly "information" but I don't know what it means :)
<hazmat> anyone ever tried juju with gccgo?
<rogpeppe> hazmat: i haven't tried gccgo period...
<rogpeppe> hazmat: i'd like to know what happens :-)
<hazmat> rogpeppe, yeah.. it seems  more than a little flakey
<hazmat> rogpeppe, i'm trying with just a single go file, and the resulting exec is tossing up errors
<hazmat> the question came up wrt to juju portability
<rogpeppe> hazmat: hmm. what errors are you seeinG?
<sidnei> hazmat: is that related to the arm thread?
<hazmat> sidnei, not arm
<sidnei> k
<hazmat> rogpeppe, http://pastebin.ubuntu.com/6236943/
<hazmat> looks like it got to upstream.. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57194
<sidnei> hazmat: first hit on google: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57194
<sidnei> aha
<hazmat> :-)
<hazmat> looks like it barfs on compiling juju (gccgo 4.8) http://paste.ubuntu.com/6236975/
<hazmat> some sort of inter-package dep issue
<rogpeppe> hazmat: how are you trying to compile juju there?
<hazmat> are we still targeting go 1.0? or have we moved on to go1.1
<hazmat> rogpeppe, go install -compiler=gccgo launchpad.net/juju-core/...
<rogpeppe> hazmat: we've moved on to 1.1.2
<rogpeppe> hazmat: where did you get your gcc from
<hazmat> rogpeppe, saucy pkg .. apt-get install gccgo == 4.8.1
<rogpeppe> hazmat: i think you'll need to use 4.8.2
<hazmat> rogpeppe, it feels like it might be a tool issue.. but i'll give it a whirl
<rogpeppe> hazmat: you'll probably need to compile from source
<rogpeppe> hazmat: 4.8.1 doesn't support go1.1
<hazmat> rogpeppe, sure.. but its only the inter-pkg links that barfs.. 1.1 has prelim support
<rogpeppe> hazmat: (i just confirmed that with iant on #go-nuts)
<hazmat> in go 4.8.1
<hazmat> k
<hazmat> rogpeppe, i'll give it a whirl, thanks
<hazmat> rogpeppe, is 4.8.2 released?
<rogpeppe> hazmat: very soon, apparently
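Given rogpeppe's point that 4.8.1 lacks go1.1 support, a build script could gate on the compiler version before attempting the multi-package compile. A sketch (the versions are the ones from this conversation; on a real box `have` would come from `gccgo -dumpversion`):

```shell
# Sketch: refuse the gccgo build below the assumed minimum version.
need="4.8.2"
have="4.8.1"   # stand-in for: have=$(gccgo -dumpversion)
if [ "$(printf '%s\n' "$need" "$have" | sort -V | head -n1)" = "$need" ]; then
  go install -compiler=gccgo launchpad.net/juju-core/...
else
  ok=no
  echo "gccgo $have lacks go1.1 support; need $need or later"
fi
```

Failing loudly here beats hazmat's experience of chasing linker and inter-package lookup errors after the fact.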
 * rogpeppe wonders if he's got enough resources on his laptop to build gcc
<rogpeppe>  /me is done for the day
<rogpeppe> g'night all
<rogpeppe> mgz: quite a few test errors from that branch (not really surprising)
<thumper> mramm2: do we have a chat today?
<mramm2> I can chat on the phone or skype
<mramm2> or on IRC
<mramm2> but am traveling today
<mramm2> (driving to florida -- national holiday here)
<thumper> ah
<thumper> well, I don't have anything too urgent I suppose
<mramm2> I am currently not driving
<thumper> I didn't think you would be
<mramm2> my only big thing is making sure we are ready to go for the NYC and SF sprints
<thumper> not on irc :)
<mramm2> haha
 * thumper nods
<thumper> I've been looking into the maas issues
<thumper> we have one clear deficiency
<thumper> and that is if we fail during startup of the bootstrap node
<thumper> we have no error reporting mechanism
<thumper> like not getting the right mongodb-server
<thumper> which means we can never status
<thumper> nor destroy
<hazmat> davecheney, ping
<hazmat> davecheney, trying out gccgo (per roger got 4.8.2) but the fundamental issue is the tooling seems to be broken for multi-package compilations
<hazmat> aha.. got it
<hazmat> needed  a pkg symlink
<davecheney> hazmat: i don't know what you're doing
<davecheney> but i don't think a symlink is the solution
<davecheney> hazmat: rogpeppe related, http://gcc.gnu.org/ml/gcc/2013-10/msg00085.html
<hazmat> davecheney, i'm using that fwiw
<davecheney> cool
<hazmat> davecheney, gcc took forever to compile
<hazmat> davecheney, my underlying issue was the way inter package lookups were being done
<hazmat> for linking purposes
<hazmat> default compilation was dropping them into $GO_HOME/lib/gccgo but lookup had an architecture suffix
<hazmat> a simple symlinked fixed it
<hazmat> running through the core test suite atm
<davecheney> i remember seeing a patch about that recently
<hazmat> davecheney, i'm on golang trunk also re tooling
<hazmat> getting an env to bootstrap on this might be a bit tricky.. haven't tried the static compilation options yet
<davecheney> yup, from what I know
<davecheney> gccgo binaries need libgo.a on the target
<thumper> time to go hit something
 * thumper -> gym
<hazmat> davecheney, it can be statically linked but running into some issues with that
<hazmat> davecheney, ala -gccgoflags '-static -static-libgo'
<bigjools> thumper (or anyone): how do you deal with bootstrap failures on other providers then?
<davecheney> bigjools: we don
<davecheney> dont
<bigjools> cloud-init's failure to add the cloud-archive and then silently install the wrong mongo is pretty shite
<bigjools> davecheney: ok
<davecheney> bigjools: i think this would be the 4th time it's broken
<davecheney> it can break for other reasons, including when cloud-init is setup properly
<davecheney> rvba hit one with his proxy
<davecheney> if the mongo we needed was SRU'd into precise
<davecheney> this would solve the problem
<bigjools> davecheney: indeed.  I've no idea why my setup won't install the cloud archive - it can't get to the keyserver for some reason.  Perhaps the apt proxy doesn't get used for that.
 * bigjools experiments...again
<bigjools> wallyworld_: are you using my maas server?
<wallyworld_> not yet
<hazmat> davecheney, bigjools we should be doing the cloud-archive by hand per cloud-archive instructions instead of using the ppa facility of cloudinit
<hazmat> that will fail more obviously
<hazmat> and also remove duplicate work against manual provider for the cloud archive install
<hazmat> since its not using cloudinit
<davecheney> 1276734
<davecheney> lp#1276734
<hazmat> bug 1276734
<davecheney> no mup ?
<hazmat> mup.. where'd you go
<davecheney> canonistack strikes again
<bigjools> wallyworld_: ok let me know when you might as I am doing some experimentation on it
<wallyworld_> ok
<kurt_> Hi Guys
<kurt_> I discovered bug 1276734 was actually not fixed in 1.16.0 as it said it was.  I noted this in the bug tracker, but no one has responded.  What can I do about this?
<bigjools> hazmat: it's using the apt sources directive of cloud-init - what do you mean "by hand"?
<kurt_> Oh, I see its the topic du jour already
<hazmat> bigjools, use the script part to install the archive
<bigjools> add-apt-repo?
<hazmat> bigjools, yeah
<hazmat> bigjools, see cloud archive install instructions
<bigjools> ok ta
<hazmat> its not just a ppa
<hazmat> np
#juju-dev 2013-10-15
<thumper> perhaps I should stop sparring the week before a gathering
<thumper> I wonder if this facial bruising will be gone by Sunday
<thumper> I should learn not to block punches with my face
<wallyworld_> bigjools: is it ok for me to use your maas? are you still getting that eof issue?
<bigjools> wallyworld_: I am, but hang on I am just completing a test
<bigjools> 10 mins
<wallyworld_> ok
<wallyworld_> rightio
<thumper> wallyworld_: have you used the garage maas before?
<wallyworld_> nope
<wallyworld_> i've not used *ant* maas
<wallyworld_> any
<wallyworld_> how's the face?
<bigjools> wallyworld_: server is all yours
<wallyworld_> bigjools: ta. how do i get an oauth key for the env.yaml?
<bigjools> wallyworld_: it's all set up
<bigjools> just log in and bootstrap
<wallyworld_> ok
<wallyworld_> bigjools: i'm scared to look inside the folder called "backdoor-image". shudder
<bigjools> wallyworld_: left just for you
<wallyworld_> \o/
<axw> thumper: do you have/use any bzr plugins to get diff summaries? (file names with +/- lines)
<thumper> axw: bzr diff | diffstat
<thumper> that's all I use
<axw> ah, didn't know diffstat
<axw> thanks
<thumper> doesn't give full summary per file
<thumper> although it may have options
<axw> thumper: that'll do nicely for me, thanks
<thumper> wallyworld_: face is a little bruised, that's all
<wallyworld_> walked into a door
<bigjools> thumper: nothing to do with the missus?
<axw> boxing?
<thumper> sparring at boxing, yes
<thumper> the guy today is very fast
<thumper> top 5 in the country and under 20
<thumper> so much younger and faster
<thumper> but you don't learn unless you fight those better than you
<bigjools> your nick gets better and better
<bigjools> or possibly more ironic
<thumper> heh
<thumper> I have an interview with Rachel, Caitlin and her new principal later this afternoon
<thumper> slight shiner to go in with
<thumper> I should just act all meek
<thumper> and flinch when Rachel looks at me :)
<thumper> school pickup time
 * thumper walks a block
<thumper> I want to test out my maas change to make sure that the allocated change is right
<thumper> hence the email about the garage maas
<wallyworld_> bigjools: this maas eof thing is giving me the shits. it's failing doing a bog standard httpClient.Do(request) inside the gomaasapi client.go, complaining that "can't write HTTP request on broken connection"
<wallyworld_> so something is closing the maas client's connection
<thumper> I know what it is
<wallyworld_> but i don't know what
<thumper> pick me
<thumper> pick me
<thumper> pick me
<thumper> pick me
<wallyworld_> ok!
<thumper> wallyworld_: quick hangout?
<wallyworld_> sure
<thumper> https://plus.google.com/hangouts/_/ca28925e2921123581a410b465ff00dda5e3c11c?hl=en
<bigjools> wallyworld_: I blame Go
<wallyworld_> bigjools: it's all fooked
<bigjools> \o/
<wallyworld_> i just can't see what's wrong
<bigjools> wallyworld_: try emulating with a simple curl request
 * bigjools looks in maas log as well
<wallyworld_> bigjools: i think i did that and still got an error. oh, i did it using wget
<bigjools> wallyworld_: what time did you try last?
<bigjools> you need auth header remember
<wallyworld_> bigjools: looks like remote end is closing the connection
<thumper> and as it happens, I don't know
<thumper> different problem
<wallyworld_> bigjools: ah yes. can you remind me of the syntax
<bigjools> wallyworld_: I can't :)
<wallyworld_> fat lot of good you are then
<bigjools> wallyworld_: what request is it making when it gets the EOF?
<bigjools> FWIW I get the EOF sometimes when not using --upload-tools
<wallyworld_> bigjools: http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=tools
<wallyworld_> or there abouts
<wallyworld_> i had commented out the prefix to try something
<thumper> bigjools: we blame maas
<bigjools> 10.0.0.9 - - [15/Oct/2013:11:15:17 +1000] "GET /MAAS/api/1.0/files/?op=list&prefix= HTTP/1.1" 200 807 "-" "Go 1.1 package http"
<bigjools> maas is doing just fine it seems --^
<wallyworld_> bigjools: why does this command return an error then: response, err := httpClient.Do(request)
<wallyworld_> it's effectively a straight http get call
<wallyworld_> which fails
<bigjools> it did not do this until recently and maas has not changed around this call
<wallyworld_> the code has not changed
<bigjools> can you dump the request and response?  I'll add the same to the maas log
<wallyworld_> it's a simple list request
<wallyworld_> bigjools: there is no response cause the request call errors
<wallyworld_> &http.Request{Method:"GET", URL:(*url.URL)(0xc200230850), Proto:"HTTP/1.1", ProtoMajor:1, ProtoMinor:1, Header:http.Header{"Authorization":[]string{"OAuth oauth_signature_method=\"PLAINTEXT\", oauth_version=\"1.0\", realm=\"MAAS+API\", oauth_consumer_key=\"M2X2ZeCSNVWer6AEHc\", oauth_token=\"v3jKFjma2gZkhasdQR\", oauth_signature=\"%26yVgdrUAVWKCRsxNXGUEzyTrTaHYebAmH\", oauth_timestamp=\"1381806551\", oauth_nonce=\"38010820\""}},
<wallyworld_> Body:io.ReadCloser(nil), ContentLength:0, TransferEncoding:[]string(nil), Close:false, Host:"10.0.0.9:80", Form:url.Values(nil), PostForm:url.Values(nil), MultipartForm:(*multipart.Form)(nil), Trailer:http.Header(nil), RemoteAddr:"", RequestURI:"", TLS:(*tls.ConnectionState)(nil)}
<wallyworld_> is the request
<bigjools> I'll try again and dump the maas log, hang on
<bigjools> wallyworld_: ok that request isn't even hitting maas
<wallyworld_> \o/
<bigjools> Go is getting the EOF alllll on its own
<wallyworld_> huzar!
<wallyworld_> thumper: http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=
<davecheney> wallyworld_: can you paste a curl -v of that url?
<bigjools> wallyworld_: the last request that maas sees is /MAAS/api/1.0/files/provider-state/
<wallyworld_> davecheney: i'll have to look up the syntax - i can't recall how to construct a curl request with the headers etc
<bigjools> curl -H "Authorization: OAuth <ACCESS_TOKEN>" http://www.example.com
<wallyworld_> ta
<wallyworld_> bigjools: so i pasted in the entire oauth access token from the env.yaml file as the ACCESS_TOKEN, right?
<bigjools> no
<bigjools> something like:
<bigjools> Authorization: OAuth oauth_signature_method="PLAINTEXT", oauth_version="1.0", realm="MAAS+API", oauth_consumer_key="M2X2ZeCSNVWer6AEHc", oauth_token="v3jKFjma2gZkhasdQR", oauth_signature="%26yVgdrUAVWKCRsxNXGUEzyTrTaHYebAmH", oauth_timestamp="1381807262", oauth_nonce="2814807
<wallyworld_> ok
<bigjools> it might get a bit tricky actually
<bigjools> you can always re-create the request with maas-cli
<bigjools> which works fine, BTW
<wallyworld_> does the maas-cli log the request?
<bigjools> "maas-cli maas files list prefix=tools-"
<bigjools> yeah I can see it in the maas log
<wallyworld_> maas-cli maas files list prefix=releases/tools/juju- also works
<bigjools> right
<wallyworld_> so why is Go's http call failing
<bigjools> there are no polite answers to that
<bigjools> as a data point I can re-create this another way
<wallyworld_> bigjools: do you have the output from maas-cli that you can give to davecheney who wanted to see a curl -v output?
<bigjools> if I bootstrap without --upload-tools, destroy and then bootstrap without --upload-tools again, you get the EOF
<bigjools> yes
<bigjools> http://paste.ubuntu.com/6238905/
<bigjools> that's the host side
<bigjools> but amounts to the same thing
<wallyworld_> davecheney: is that ^^^^ sufficient? it sure seems like a http get request via other means works, but the same request sent from Go's http.Client fails
<davecheney> wallyworld_: nah, i was hoping to see the curl output
<davecheney> i know how to interpret that
<wallyworld_> ok, i'll see if i can get it
<bigjools> given that the request from Go isn't hitting maas, I suspect you need to look at why the Go internals are misbehaving
<bigjools> no amount of curl log is going to help with that
<wallyworld_> yeah
<wallyworld_> davecheney: that's the pastebin https://pastebin.canonical.com/99027/
<wallyworld_> but
<wallyworld_> that worked
<wallyworld_> the Go code doesn't
<wallyworld_> the values in the pastebin were harvested by printing the http.Request struct from Go
<wallyworld_> so the request is formed with the right header values etc
<wallyworld_> but it doesn't reach the server side, and instead the Do() method returns an EOF
<wallyworld_> or a "can't write HTTP request on broken connection" error
<wallyworld_> wtf
<wallyworld_> bigjools: how recently did you upgrade to Go 1.1.2? could that be an issue?
<bigjools> wallyworld_: how can that affect anything?  I am using a packaged juju binary on this box
<wallyworld_> i only run Go 1.1.1 on mine. if it's a Go internal issue, the version may be relevant
<davecheney> bigjools: you could have 1.0.x
<davecheney> that is known to be broke
<davecheney> that was all the difficulty we had back in July
<bigjools> I still don't know how that can affect a binary built by someone else
<wallyworld_> davecheney: go version says 1.1.2 on his maas server
<davecheney> also, does maas use your forked http code ?
<bigjools> my maas server Go version is irrelevant
<wallyworld_> bigjools: i compiled juju from source on your box
<wallyworld_> to add in debugging
<bigjools> sure, but given this is also a problem with the packaged juju ...
<davecheney> bigjools: ooooooooooh dear
<davecheney> i wonder if the host that built juju was using 1.1.2
<bigjools> if it's on saucy, quite likely
<davecheney> ok
<davecheney> that is known to work
<wallyworld_> except for this http request :-(
<bigjools> is it re-using objects or otherwise similarly being stupid?
<wallyworld_> a new http request is created each time, as is a new http.Client object
<bigjools> oh google you chunk of crap
<bigjools> silently rendering the word EOF as just "of" in my search is stupid
<bigjools> http://stackoverflow.com/questions/17714494/golang-http-request-results-in-eof-errors-when-making-multiple-requests-successi
<bigjools> "You need to set Req.Close to true"
<wallyworld_> just saw that
<bigjools> is it doing that?
<wallyworld_> nope, this is maas code
<bigjools> is gomaasapi doing that?
<wallyworld_> i don't think so, but i wonder if we do it *anywhere*
 * bigjools knows not of golang esoterics
<wallyworld_> a quick code search seems to indicate we don't
<wallyworld_> only in the forked gwacl code
<wallyworld_> so why the fuck is this not failing elsewhere if it is a problem
<davecheney> maas is the only provider that uses OAUTH ?
<davecheney> ^ guess
<bigjools> I think so
<bigjools> but why is it failing like this now as well?
<bigjools> 1.1.2?
<bigjools> wallyworld_: this is easy to test out
<wallyworld_> yeah, doing it now
<bigjools> ok - I was also but noticed the file changed under my feet :)
<wallyworld_> bigjools: davecheney: adding in req.Close = true after creating the req seems to have worked. but wtf. we don't do that anywhere else in juju-core
<bigjools> \o/
<wallyworld_> and this is in the gomaasapi library
<wallyworld_> so why just maas and not goose or goaws etc
<bigjools> mystery
 * bigjools 's work here is done
<bigjools> thanks for helping out wallyworld_
<wallyworld_> np. but i'm pissed at it all. not sure where to direct my rage
<bigjools> Rob Pike? :)
<wallyworld_> i blame Go
<wallyworld_> i mean if req.Close is required, why not just default to that?
<bigjools> indeed
<wallyworld_> and why is it not consistent?
<wallyworld_> do we just change maasapi?
<wallyworld_> or do we try and change *all* the other places we create requests
<bigjools> something to do with HTTP1.1 I guess
<bigjools> the document is singularly unhelpful
<bigjools>         // Close indicates whether to close the connection after
<bigjools>         // replying to this request.
<bigjools> no hint of why you'd want to do that
<wallyworld_> yeah, and the error seems like the opposite of that
<wallyworld_> bigjools: how long should juju status take to come back on your maas setup?
<bigjools> about 10 minutes
<wallyworld_> me taps fingers on desk waiting, waiting
<bigjools> my theory is that maas already closed the connection and it didn't detect that until trying to send on the same one again
<bigjools> think yourself lucky, it used to be 20 minutes, the fast installer cut the time in half
<wallyworld_> wonder why it didn't fail before now though
<bigjools> and 8 of the 10 minutes is cloud-init and a reboot
<wallyworld_> did maas change?
<bigjools> I reckon davecheney may be right when he said it could be go 1.1.2
<bigjools> maas hasn't changed here - it's using the exact same stuff
<bigjools> Apache frontend
<wallyworld_> and 1.1.4 shipped on Go 1.1.1?
<wallyworld_> 1.14 i mean
<bigjools> no idea
<bigjools> when did 1.1.2 hit the archive?
<wallyworld_> no idea here either
<bigjools> could build 1.1.6 on go 1.1.1 and see
<bigjools> 1.16 even
<wallyworld_> go on then, i dare you
<wallyworld_> and juju stat came back ok
<wallyworld_> i'll revert my debugging
<bigjools> so stackoverflow, I can't upvote a solution because I don't have enough reputation, but I *can* edit the answer to oblivion.  *HEADDESK*
<wallyworld_> funny, i searched for and found that page also, but my google didn't "fix" the EOF spelling
<bigjools> I used +"EOF"
<wallyworld_> i didn't
<bigjools> sigh
<bigjools> Go seems to be a minefield where there's an actual mine wherever you step
<wallyworld_> yep :-(
<wallyworld_> AND no version control :0(
<wallyworld_> what could possibly go wrong
<wallyworld_> bigjools: you should have a working binary in your ~ubuntu/go/bin directory
<bigjools> hurrah!
<wallyworld_> update PATH quick!
<bigjools> no need I can just run that binary
<wallyworld_> there's also a bootstrapped env now
<bigjools> huzzah
<wallyworld_> not if you want to upload tools
<bigjools> oh
<wallyworld_> i think it needs to be in your path, or is that for local provider
<bigjools> one last bitchslap
<bigjools> default gopath is that env anyway
<wallyworld_> i can't recall, it may only be for local provider
<wallyworld_> ok, see how it goes
<bigjools> hello jtv
<jtv> Hi bigjools
<bigjools> you're not on the internal irc
<jtv> Reconnecting
<bigjools> davecheney: you guys planning another release to go out with 13.10?
<davecheney> bigjools: nope
<davecheney> 1.1.2 has been in saucy since september
<bigjools> davecheney: juju release I mean
<davecheney> bigjools: i think so
<davecheney> i don't know the details
<davecheney> thumper and sinzui probably know
<bigjools> thanks
<axw> wallyworld_: if you have some time this week, would you mind having a look over https://codereview.appspot.com/14527043/
<axw> there's some changes to simplestreams metadata merging in there
<wallyworld_> axw: sure, looking now. was out buying dinner after school pickup
<axw> wallyworld_: thanks
<wallyworld_> axw: just had a quick read of some of the comments. i agree in theory the resolve can go away eventually. but my view is we need it now while we transition and have to cope with older metadata etc
<wallyworld_> lenient on what's read, strict on what's written and all that
<axw> wallyworld_: yep, sounds fair enough
<wallyworld_> i reckon we can get rid of it for 1.18
<wallyworld_> so maybe if this is going into trunk we don't need it
<wallyworld_> or the 1.16 backport has it, land in trunk, and then follow up with a trunk branch to remove it
<wallyworld_> davecheney: from your recollection, was juju 1.14 done with go 1.1.1?
<axw> wallyworld_: yeah this isn't going into 1.16
<wallyworld_> hmmm. maybe we can/should get rid of it then
<wallyworld_> reduce complexity
<wallyworld_> all metadata should be good for 1.16
<wallyworld_> and if it's not, we don't want to propagate the issue
<wallyworld_> bigjools: i think there's going to be a 1.18 for saucy
<axw> 1.18? not 1.16.1?
<axw> do we not do stable updates like that?
<wallyworld_> axw: i *thought* it was going to be 1.18, based off 1.17 trunk
<axw> mmkay
<wallyworld_> too many bugs etc we are fixing in trunk
<wallyworld_> i could be wrong
<wallyworld_> but that's what i heard
<axw> okey dokey
<wallyworld_> ot thought i heard
<axw> wallyworld_: in that case, I'd rather hold off on changing the resolve logic for this CL
<wallyworld_> ok
<axw> oh but... we don't support going multiple versions do we
<axw> upgrading
<wallyworld_> axw: with WriteMetadata stuff, i have done something similar now for images metadata. but unlike for tools, i pulled the WriteXXX methods out into generate.go instead of stuffing into simplestreams.go
<axw> ok
<wallyworld_> i think this is the link, i have 3 of the fuckers https://codereview.appspot.com/14663043/
<wallyworld_> generate.go is a new file
<wallyworld_> axw: we support going from 1.14->1,16 and 1.16->1.18
<wallyworld_> but not 1.14->1.18
<axw> right, so doesn't matter either way then
<wallyworld_> axw: it's extra work for you, but if the resolve bit were to be removed prior to committing to trunk, the code comes out cleaner
<wallyworld_> i looked at the merge proposal, it look ok
<axw> wallyworld_: yeah that's cool, it'll definitely clean it up - I'll do that
<axw> thanks for checking over it
<axw> I'm reviewing your CL now, looking good so far
<wallyworld_> np, thanks for fixing it :-)
<wallyworld_> axw: thanks :-) if you are a masochist, i have 3 related ones. the one you are looking at is the last of 3
<axw> are they prereqs?
<wallyworld_> yeah
<wallyworld_> https://codereview.appspot.com/14502059/ is first, and https://codereview.appspot.com/14540055/ is second
<axw> not a masochist, but I'll do what I can ;)
<wallyworld_> the end result is that creating image metadata for private clouds is *much* easier
<wallyworld_> thanks :-)
<axw> cool
<wallyworld_> with this work, the user can run the image metadata generate plugin over and over to build up their metadata
<wallyworld_> for different series, arches etc
<wallyworld_> before, the tool was just a prototype and just did one image and overwrote it each time
<bigjools> wallyworld_: ok ta
<wallyworld_> rogpeppe: can you recall - did we release juju 1.14 using Go 1.1.1?
<rogpeppe> wallyworld_: hmm, not sure
<wallyworld_> rogpeppe: ok. there's a bug i'm fixing in gomaasapi which i want to blame on the go 1.1.2 upgrade
<wallyworld_> cause nothing else makes sense
<rogpeppe> wallyworld_: is that the EOF problem?
<wallyworld_> yeah
<wallyworld_> req from http.NewRequest needs to have req.Close = true all of a sudden
<wallyworld_> but only in gomaasapi
<wallyworld_> nowhere else
<rogpeppe> wallyworld_: FWIW if Close didn't default to false, you'd almost never get any connection reuse
<wallyworld_> and no gomaasapi code has changed
<rogpeppe> wallyworld_: i wonder how long connections are kept around before they're dumped
<wallyworld_> not sure
<rogpeppe> wallyworld_: it might be the maas server dumping old http connections
<wallyworld_> but this happened during bootstrap, so < 1 second
<rogpeppe> wallyworld_: ah, that seems wrong
<wallyworld_> why did it only show up now?
<wallyworld_> and not before
<rogpeppe> wallyworld_: good question; does it happen reliably?
<wallyworld_> yep
<wallyworld_> only in 1.16 juju-core
<wallyworld_> not 1.14 on same maas box
<rogpeppe> wallyworld_: have you tried 1.14 compiled with go 1.1.2 ?
<bigjools> rogpeppe: the front end is Apache, so whatever Apache is doing...
<rogpeppe> bigjools: hmm, seems unlikely then
<wallyworld_> rogpeppe: no
<rogpeppe> wallyworld_: that would be a good way to confirm or deny your suspicions
<wallyworld_> yeah
<wallyworld_> rogpeppe: so will this have adverse performance impact for gomaasapi wrt connection reuse?
<wallyworld_> if Close is always set to True
<wallyworld_> which it needs to be to make it work
<rogpeppe> wallyworld_: possibly
<rogpeppe> wallyworld_: these are https connections, right?
<wallyworld_> i think so
<wallyworld_> ah no
<wallyworld_> http
<wallyworld_> eg http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=
<wallyworld_> bbiab
<bigjools> I'd take bad performance over no performance :)
<rogpeppe> bigjools: it would be nice to know what's going on though
<bigjools> indeed
<bigjools> could be a Go difference from 1.1.1 to 1.1.2
<rogpeppe> bigjools: it is possible
<rogpeppe> bigjools: that should be easy enough to confirm
<bigjools> aye
 * rogpeppe is reading through the Go http.Transport logic
<bigjools> have you guys done the environment uuid yet?
<bigjools> rvba: I reckon that is still the best solution BTW, as you said the other one is seriously abusing tags
<rvba> If the environment uuid is not available yet, would it make sense to use a hash of the hash of the admin-secret as an identifier?
<rogpeppe> rvba, bigjools: what's the issue here?
<rvba> rogpeppe: we want to fix https://bugs.launchpad.net/maas/+bug/1239488
<_mup_> Bug #1239488: Juju api client cannot distinguish between environments <MAAS:Triaged> <https://launchpad.net/bugs/1239488>
<rvba> rogpeppe: so we need a way to flag nodes to account for the fact that they belong to a juju environment.
<rvba> rogpeppe: we thought about two solutions… and we think solution 2 is best:
<rvba> = 3. Fix by storing the API key used to acquire nodes =
<rvba> ???
<rvba> Cons: harder to pull a key if compromised
<rvba> rogpeppe: arg, wrong paste, sorry.
<rvba> rogpeppe: http://pad.ubuntu.com/DnNONX6kFB
<rogpeppe> rvba: i don't quite see how the UUID is an *alternative* to using tags
<rogpeppe> rvba: won't you need to tag with the UUID?
<rvba> rogpeppe: no, we need the UUID in both cases.
 * rogpeppe goes to get his SSO key
<rogpeppe> rvba: so are you thinking of solution 2 here?
<rvba> rogpeppe: yeah
<rogpeppe> rvba: i've always wanted the env UUID to be generated when the environment is first created
<rogpeppe> rvba: it should actually be quite a simple change now
<rogpeppe> rvba: it can be done at Prepare time
<rogpeppe> rvba: and would be passed through in the environ config
<rvba> rogpeppe: that would be great.
<rvba> rogpeppe: we need to fix this problem today/tomorrow morning… is that something (the UUID thingy) that could be done… like… now? ;)
<rogpeppe> rvba: probably not that quickly - it involves changes in a few places. however...
<rogpeppe> rvba: we could make a change in the maas provider only
<rogpeppe> rvba: to make it generate its own uuid, to be replaced with the environment uuid at some point in the future
<rvba> rogpeppe: changing the uuid will imply dealing with already deployed environments.
<rogpeppe> rvba: there's actually no particular need for it to be the *actual* environment UUID, is there?
<rvba> rogpeppe: no, we just need an identifier, specific to each environment.
<rogpeppe> rvba: ok, so here's a possible way forward:
<wallyworld_> bigjools: so we don't overlap, were you going to compile juju 1.14 with go 1.1.2 on your maas box?
<rogpeppe> rvba: change maas's EnvironProvider.Prepare so that it generates its own UUID and stores it in the environ's configuration
<rogpeppe> rvba: then change StartInstance and Bootstrap to tag the instance with that tag
<rvba> (plus change Instances() to pass that tag to MAAS when listing instances)
<rogpeppe> rvba: yeah
<rogpeppe> rvba: oh yeah, it's agent_name, not tag, also :-)
<rvba> Right, that's a detail :)
<rvba> rogpeppe: isn't EnvironProvider.Prepare called every time juju is run?
<rogpeppe> rvba: nop
<rogpeppe> e
<rogpeppe> rvba: it's called just once for a given environment
<rvba> I mean, I'm not sure I see how the uuid would be created once and persisted.
<rogpeppe> rvba: after Prepare is called, all the config attributes for that environment are stored in the .jenv file (as BootstrapAttrs)
<rvba> rogpeppe: I see… so if you manually get rid of the .jenv you'll be in trouble then.
<rvba> rogpeppe: what about using hash(admin-secret) ?
<rogpeppe> rvba: i don't like that idea
<rogpeppe> rvba: there's nothing guaranteeing that admin-secret is unique
<rogpeppe> rvba: you'll be in trouble if you manually get rid of the .jenv file anyway
<rvba> rogpeppe: all right :).  Then this seems like the best option.
<TheMue> morning
<rvba> bigjools: I updated the plan with rogpeppe's idea.
<rogpeppe> rvba: for an example of a Prepare method that adds an attribute, take a look at openstack's environProvider.Prepare method
<rvba> rogpeppe: okay, thanks.
<rogpeppe> rvba: although that method allows the control-bucket to be overridden in environments.yaml, and i'm not sure you'd want to allow that for the uuid
<rvba> Probably not.
<rogpeppe> rvba: you'd probably want to make the method return an error if the uuid is already specified
<bigjools> wallyworld_: no, go for it
<rogpeppe> afk
<bigjools> rvba: ok.  Existing deployments will be a problem IMO
<rvba> bigjools: true.  That's the only remaining problem.
<bigjools> rvba: and it's a hard one
<bigjools> rvba: although we could get it to work if juju only generates its uuid on bootstrapping
<rvba> bigjools: one things we could do is detect that we don't have generated a UUID and that the env is already bootstraped, and in this case use '' as the agent name,
<bigjools> and existing deployments can stay without the uuid
<rvba> thing*
<bigjools> brb
<rogpeppe> rvba: if you've upgraded a legacy environment, the uuid in the config will be unset
<rogpeppe> rvba: so that case should be easy to cater for
<rogpeppe> rvba: although it's perhaps a problem that upgrading to the new juju won't fix the current problem for existing envs
<rvba> rogpeppe: I don't really think we have a choice here.
<rogpeppe> rvba: is the plan to make the agent_name dynamically changeable, or will it only be specifiable when an instance is started?
<rvba> rogpeppe: I don't see why we should make it changeable.
<rogpeppe> rvba: if it was, it might be possible to fix existing deployments
<rogpeppe> axw_: you've got a review  https://codereview.appspot.com/14430064/
<wallyworld_> rogpeppe: sadly, juju-core does not appear to have been tagged for the 1.14 release. neither has goose. and other dependencies like gomaasapi have no tags at all :-( i can get the juju-core source for 1.14 because there's a series branch but for the dependencies i can't :-(
<wallyworld_> so a bit hard to recompile 1.14 :-(
<rogpeppe> wallyworld_: i think all the deps are in https://launchpad.net/juju-core/1.14/1.14.1/+download/juju-core_1.14.1.tar.gz
<wallyworld_> rogpeppe: ah ok. we have a src tarball
<wallyworld_> still not really best practice :-(
<rvba> Thanks a lot for your help rogpeppe, we are starting to work on a fix for our bug, we will probably come to you during the day for advice/reviews :).
<rogpeppe> rvba: np
<wallyworld_> rogpeppe: juju 1.14 and go 1.1.2 seems to work. i'm not sure why. juju 1.14 uses the old tools lookup code, but still does similar things eg storage.List() etc
<rogpeppe> wallyworld_: ok, so i guess we can't blame the upgrade to go 1.1.2
<wallyworld_> i am reticent just to set Close = true without knowing why
<wallyworld_> guess not
<rogpeppe> wallyworld_: i agree
<rogpeppe> wallyworld_: what request is it getting the EOF on
<rogpeppe> ?
<wallyworld_> a storage List()
<wallyworld_> http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=tools/releases/juju-
<wallyworld_> well, with the prefix quoted
<wallyworld_> i unquoted it to paste
<wallyworld_> the gomaasapi logic creates a new request, creates a new http client and then does a client.Do(req)
<wallyworld_> not much to fail
<wallyworld_> a curl done manually of the same thing works
<rogpeppe> wallyworld_: it might be worth changing the code to log all requests and responses, to see how they've changed between the releases
<wallyworld_> yeah
<wallyworld_> i just can't see why a simple http req fails
<wallyworld_> rogpeppe: here's a pastebin of a manual curl to invoke the req that fails using httpClient.Do(req)
<wallyworld_> https://pastebin.canonical.com/99027/
<rogpeppe> wallyworld_: well, it'll be reusing a socket from a previous request
<wallyworld_> sure
<rogpeppe> wallyworld_: it is possible that the maas server doesn't like that
<wallyworld_> but why now?
<rogpeppe> wallyworld_: and i don't know if you can make curl do that
<wallyworld_> wouldn't sockets always be reused
<rogpeppe> wallyworld_: yeah, but perhaps we're making more requests now
<wallyworld_> we would be
<wallyworld_> since tools look up check for metadata etc
<wallyworld_> and before it didn't
<rogpeppe> wallyworld_: exactly - so our access pattern has changed, and perhaps that's triggering some bug/feature on the server
<wallyworld_> hard to believe the server could be that fragile
<wallyworld_> it's just an apache http server
 * rogpeppe doesn't find it that hard to believe...
<wallyworld_> rogpeppe: so, worst case, we may have to set Close = true for mass
<wallyworld_> just to get something working for release
<wallyworld_> juju doesn't really hammer the connection anyway, right?
<rogpeppe> wallyworld_: i'm not sure we know the worst case now - it may be that setting Close is just papering over a bug which will re-emerge later in some form
<wallyworld_> sure, depends if we can find the root cause
<wallyworld_> i really don't know where to start. the maas logs show the req isn't even getting through
<rogpeppe> wallyworld_: that's interesting in itself
<wallyworld_> so if it never arrives, it might be getting lost inside the Go http client lib, or maybe apache is discarding it
<wallyworld_> i can ask bigjools to check the apache logs
<rogpeppe> wallyworld_: that would be good, to start with
<wallyworld_> or i can check
<wallyworld_> ok, so i can see the legacy tools request
<wallyworld_> i'll fire up 1.16
<rogpeppe> wallyworld_: how long into the bootstrap do we see the EOF response?
<wallyworld_> rogpeppe: very near the start - when it is syncing tools
<rogpeppe> wallyworld_: i'm still wondering if it might be a stale-connection timeout issue
<rogpeppe> wallyworld_: how near? (in seconds)
<wallyworld_> um. 2?
<wallyworld_> 5?
<wallyworld_> not sure
<wallyworld_> i'll fire up 1.16 and see
<wallyworld_> rogpeppe: ffs . it worked that time
<rogpeppe> wallyworld_: ok, so that's interesting too
<wallyworld_> let me check something
<rogpeppe> wallyworld_: i suppose that means that it's possible that it wasn't the Close=true addition that caused it to succeed last time
<wallyworld_> rogpeppe: oh wait, i'm an idiot
<wallyworld_> i compiled a juju version with close=true for jools to use
<wallyworld_> and it's still in the path
<wallyworld_> i'll revert and try again
<rogpeppe> wallyworld_: ha
<rogpeppe> wallyworld_: assuming you manage to reproduce the problem, i'd like to check one thing - in gomaasapi/client.go, i'd like to change the "return nil, err" after httpClient.Do to return a more distinctive error, so we can be sure that the EOF is coming from that
<wallyworld_> rogpeppe: i've already logged that
<wallyworld_> and it is coming from the Do()
<wallyworld_> rogpeppe: error happens after about 7 seconds
<rogpeppe> wallyworld_: have there been several successful requests before the one that failed?
<wallyworld_> rogpeppe: log is full. let me clear it and i'll retry. too hard to tell
<rogpeppe> wallyworld_: i'd be interested to see the log actually
<wallyworld_> thought so :-)
<wallyworld_> will pastebin
<rogpeppe> ta
<wallyworld_> rogpeppe: actually, i deleted the log and there is no new one. it seems like the first http get fails
<wallyworld_> which is what i think the tools look up is
<wallyworld_> ie build tools locally, then look to see what's in target bucket, boom
<rogpeppe> wallyworld_: hmm, that is interesting. i'm surprised that the Close=true change makes a difference then
<wallyworld_> yeah
<wallyworld_> so the log is the apache log
<wallyworld_> so the request never leaves the client
<wallyworld_> or apache eats it
<wallyworld_> i can't see apache doing that
<wallyworld_> no new errors logged
<rogpeppe> wallyworld_: can you change gomaas API dispatchRequest to log every time it makes a request?
<wallyworld_> i can
<wallyworld_> i gotta attend to a couple of things. i'll do it between now and standup
<rogpeppe> wallyworld_: thanks
<wallyworld_> rogpeppe: there is one other request
<wallyworld_> ------ GET http://10.0.0.9:80/MAAS/api/1.0/files/provider-state/
<wallyworld_> ------ GET http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=tools%2Freleases%2Fjuju-
<wallyworld_> ERROR Get http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=tools%2Freleases%2Fjuju-: EOF
<wallyworld_> the provider-state lookup
<wallyworld_> which is not a List() but a storage Get() I think
<rogpeppe> wallyworld_: and you're saying the first request isn't in the apache log?
<wallyworld_> rogpeppe: seems not. i deleted the access.log file from /var/log/apache2 and it hasn't created a new one
<wallyworld_> that doesn't seem right
<rogpeppe> wallyworld_: when you say "deleted", did you just truncate the file, or remove it? perhaps apache is just trying to append but not create?
<wallyworld_> yeah, let me touch it and try again
<wallyworld_> nope
<wallyworld_> empty
<wallyworld_> wtf
<rogpeppe> wallyworld_: maybe logging just isn't configured for that apache?
<rogpeppe> wallyworld_: or... are you sure it's actually going through that apache?
<rogpeppe> wallyworld_: if you do a curl request, does it show up in the log?
<wallyworld_> trying a few things
<wallyworld_> i ran juju 1.14 and no log either
<wallyworld_> perhaps apache didn't like its log file going away
<wallyworld_> i'll restart it
<rogpeppe> wallyworld_: it's possible, yeah
<rogpeppe> wallyworld_: perhaps it was still trying to write to the old (removed, but still there in the fs) log file
<wallyworld_> yeah
<wallyworld_> rogpeppe: i had started a juju env using 1.14. i destroyed using 1.16. it logged about 10 requests and did it no problem
<wallyworld_> so it's only bootstrap
<rogpeppe> wallyworld_: hmm
<wallyworld_> rogpeppe:  and yes, the provider state lookup is logged
<wallyworld_> 10.0.0.9 - - [15/Oct/2013:20:14:58 +1000] "GET /MAAS/api/1.0/files/provider-state/ HTTP/1.1" 404 233 "-" "Go 1.1 package http"
<wallyworld_> but NOT the tools list
<wallyworld_> which is the next http get
<rogpeppe> wallyworld_: could you paste the log of the requests it made when destroying the 1.16 env using 1.14?
<wallyworld_> o
<wallyworld_> k
<rogpeppe> sorry, i mean the other way around
<rogpeppe> wallyworld_: destroying the 1.14 env using 1.16
<wallyworld_> yep
<wallyworld_> rogpeppe: https://pastebin.canonical.com/99039/ that's the whole log. it has the destroy followed by bootstrap
<wallyworld_> bootstrap at 14:58
<wallyworld_> destroy ends at 14:48
<rogpeppe> wallyworld_: hmm, and that contains a successful list request too - exactly the same request that fails in Bootstrap
<rogpeppe> wallyworld_: well, *almost*
<wallyworld_> yep
<wallyworld_> go fogure
<wallyworld_> figure even
<rogpeppe> wallyworld_: i wonder if we can try to repro this with a very simple example, rather than using bootstrap
<rogpeppe> wallyworld_: we know that there are only two requests and the second one fails
<wallyworld_> that's pretty simple
<wallyworld_> in itself
<rogpeppe> wallyworld_: so we could simply create a maas Environ and try to read its provider-state file and then list the tools
<rogpeppe> wallyworld_: with a 7 second gap between them
<wallyworld_> yes. but what would that tell us
<wallyworld_> that we don't already know
<rogpeppe> wallyworld_: if that failed, that would be very useful
<rogpeppe> wallyworld_: because we then have a very simple example that we can reduce further to see what's actually going on
<rogpeppe> wallyworld_: if that succeeds then we know something else is up
<wallyworld_> true
<rogpeppe> wallyworld_: it should only take 5 minutes to write
<wallyworld_> yes
<wallyworld_> i'll try and do it after standup if i am still awake, otherwise tomorrow
<rogpeppe> wallyworld_: one mo, i'll do it if you like
<rogpeppe> wallyworld_: try this (substitute "my-maas-environ-name" with your env name): http://paste.ubuntu.com/6239995/
<rogpeppe> rvba: i'm a bit confused by the maas storage handling - how are the storages from different juju environments in the same MAAS kept separate from one another?
<wallyworld_> rogpeppe: sorry, was afk. looking
<wallyworld_> ubuntu@maas:~/go/src/launchpad.net/juju-core$ go run masstest.go
<wallyworld_> ------ GET http://10.0.0.9:80/MAAS/api/1.0/files/provider-state/
<wallyworld_> 2013/10/15 20:44:31 get provider-state: file 'provider-state' not found not found
<wallyworld_> ------ GET http://10.0.0.9:80/MAAS/api/1.0/files/?op=list&prefix=tools%2Freleases%2Fjuju-
<wallyworld_> 2013/10/15 20:44:38 list tools ok
<TheMue> rogpeppe: standup?
<rogpeppe> TheMue: "the video call is full"
<rogpeppe> mgz: standup?
<mgz> on my way
<wallyworld_> fwereade: ??
<rogpeppe> wallyworld_: http://paste.ubuntu.com/6240109/
<wallyworld_> rogpeppe: new version ran fine
 * TheMue => lunch
<rogpeppe> wallyworld_: could you try the original version with a 15 second delay?
<wallyworld_> ok
<wallyworld_> rogpeppe: except bigjools just shot it down
<wallyworld_> shut
<rogpeppe> wallyworld_: oh
<wallyworld_> yeah :-(
<rvba> rogpeppe: care to have a look at https://codereview.appspot.com/14696043/ ?
<rogpeppe> rvba: looking
<wallyworld_> rogpeppe: i got the server powered up again. i added the 15s delay. all passed
<rogpeppe> wallyworld_: hmm
<wallyworld_> rogpeppe: i can add your ssh key
<wallyworld_> rogpeppe: done, you should be able to ssh in now
<rogpeppe> wallyworld_: great, thanks
<wallyworld_> the ~ubuntu/go dir is where the 1.16 source lives
<wallyworld_> i put the test go src you gave me in the juju-core root dir
<rogpeppe> wallyworld_: ok, i'm investigating now
<wallyworld_> rogpeppe: awesome. i'll try and hang around a bit. if you are finished and i'm not around, can you sudo poweroff?
<fwereade> rogpeppe, wallyworld_: thanks for looking into this -- I will try to pop back on and off, and I should be able to be around properly this evening if I can help you at all
<rogpeppe> fwereade: thanks
<wallyworld_> fwereade: np. we've spent a long time on it and at every turn it gets more confusing. we will kick ourselves when the issue is found i'm sure
<allenap> rogpeppe: When does EnvironProvider.Prepare() get called? I only want to set the environment UUID when bootstrapping an environment; existing environments should not have a UUID.
<allenap> rogpeppe: Hello too :)
<rogpeppe> allenap: hi :)
<rogpeppe> allenap: it gets called only if there's no .jenv file in ~/.juju/environments for that environment
<rogpeppe> allenap: and only if you call juju bootstrap or juju sync-tools
<rogpeppe> allenap: so that should be exactly what you need, i hope
<allenap> rogpeppe: That sounds about right, thanks :)
<allenap> rvba: Do you happen to know why maasEnvironConfig.attrs exists? environs.config.Config has two maps, one for known and one for unknown config. Does that not suffice, or was that not around at the time?
<rogpeppe> allenap: perhaps it's to save the map being copied every time an attr is used
<allenap> rogpeppe: I'm a little rusty :) Can you explain why that would happen?
<rogpeppe> allenap: environ.Config makes a copy of the map that it returns, to prevent mutation of the immutable Config value.
<rvba> allenap: no idea… by the looks of it, it just copies the pattern used in the openstack provider… maybe jtv (if he's around) will knwo.
<rvba> know*
<allenap> rogpeppe: Ah yes, I see, ta.
<rogpeppe> wallyworld_: FWIW, these are the (only) requests i see. the final request is the one that fails (it fails because it sees EOF on the persistent connection) http://paste.ubuntu.com/6240523/
<wallyworld_> rogpeppe: that looks about right. the simplestreams code uses a different http client. maybe that is related. the http client used by juju-core code has the ssl support. the http client in gomaas doesn't
<wallyworld_> i'm not sure what is done with the http transport in juju-core though
<rogpeppe> wallyworld_: they're both using the same transport
<wallyworld_> rogpeppe: so at bootstrap is when the different http clients are used, whereas destroy just uses the gomaasapi one
<wallyworld_> surely not a coincidence?
<wallyworld_> rogpeppe: although, did you run bootstrap with --upload-tools?
<rogpeppe> wallyworld_: no
<wallyworld_> cause using --upload-tools doesn't first do a simplestreams search
<wallyworld_> and that still fails
<rogpeppe> wallyworld_: ah, i'll try that
<rogpeppe> wallyworld_: http.Client isn't stateful unless it's given a custom RoundTripper
<rogpeppe> wallyworld_: so it shouldn't make any difference if plain http.Get is used vs creating a new client with http.Client{}
<wallyworld_> i'm just clutching at straws, trying to think of what's different
<wallyworld_> sinzui: hi, did you see the streams repository is sort of set up?
<sinzui> wallyworld_, I did not
<wallyworld_> sinzui: ticket 63925
<wallyworld_> it talks about the server sawo which i'm not familiar with
<wallyworld_> sinzui: i need to change the url embedded in juju-core, and the generated simplestreams metadata and tools need to be uploaded
<sinzui> I saw that and updated a bug over my weekend
<wallyworld_> or the release scripts tweaked accordingly
<wallyworld_> i can easily do the juju-core side ahead of time
<wallyworld_> although if the tools were uploaded i could drop the legacy aws fallback at the same time
<wallyworld_> rogpeppe: i fell asleep on the couch before so i had better get myself off to bed for real. maybe you can drop me a quick note with any progress and i'll pick up tomorrow? also don't forget to poweroff the server
<rogpeppe> wallyworld_: how should i do that?
<rogpeppe> wallyworld_: just sudo shutdown?
<wallyworld_> sudo shutdown now
<wallyworld_> or does sudo poweroff work? not sure
<rogpeppe> wallyworld_: i'm making progress BTW
<wallyworld_> oh?
<rogpeppe> wallyworld_: i have a suspicion of what might be happening
<wallyworld_> oh great, now you've got me curious
<rogpeppe> wallyworld_: it looks like MAAS *is* closing the connection after 5s
<rogpeppe> wallyworld_: but somehow we're seeing that in the wrong place; not sure quite why yet
<wallyworld_> hmmm. ok. would be great to know the sequencing of things to explain why it happens now and not before
<rogpeppe> wallyworld_: it *may* be to do with someone not closing an http request body. i need more investigation
<wallyworld_> if maas is going to do that, maybe we do need the close=true on the juju side?
<rogpeppe> wallyworld_: the http client *should* work ok even with that
<wallyworld_> ok. good luck and thanks. talk tomorrow
<rogpeppe> wallyworld_: but i guess there's always going to be a race.
<rogpeppe> wallyworld_: okeydokey. sweet dreams :-)
<wallyworld_> i'll ask jools tomorrow about the maas aspect of it
<jamespage> fwereade, sinzui: any critical bugs for 1.16.0 that I need to push in pre-release of saucy?
 * sinzui looks
<sinzui> jamespage, I don't see any criticals that can be pushed
<jamespage> sinzui, coolio
 * jamespage puts his feet up for a bit then
<mgz> jamespage: I have one funny report on a bug fixed for 1.16
<mgz> that I want to (un)confirm
 * jamespage takes his feet off the desk and listens
<mgz> jamespage: see last two comments on bug 1236734
<_mup_> Bug #1236734: juju 1.15.1 polls maas API continually <juju-core:Fix Released by gz> <https://launchpad.net/bugs/1236734>
<mgz> I'm pretty sure he just has the old version still
<jamespage> mgz, oh - I can confirm that has good
<jamespage> gone
<mgz> excellent.
<jamespage> the load on the serverstack maas server went from 2.0 to 0.04 post upgrade
<rogpeppe> jamespage: excellent news - i was a bit concerned my fix hadn't
<mgz> I'll comment on the bug and see if I can help the guy.
<jamespage> which is worrying - as it would indicate that the performance of the storage on MAAS is not great - I only have 6 physical and 8 lxc service units in the deployment
<jamespage> mgz, maybe he forgot todo juju upgrade-juju
<jamespage> fwiw that took a very long time to complete
<jamespage> <jamespage> which is worrying - as it would indicate that the performance of the storage on MAAS is not great - I only have 6 physical and 8 lxc service units in the deployment
<mgz> hm, storage performance, or just general api being really slow?
<mgz> django is a fair bit of overhead on top of postgres
 * sinzui thinks there is a space in dependencies.tsv that breaks goamz
<mgz> gah, and it's me...
<rogpeppe> sinzui: ah, i was wondering what the problem there might be
<mgz> sinzui: I'll fix
<mgz> should have thought of that when it was falling over yesterday rog...
<mgz> my bad.
<rogpeppe> mgz: i should've thought of it too
<rogpeppe> sinzui: nice catch
<mgz> fix proposed for rubber stamping. now, did I backport that too...
<natefinch> howdy all.  Sorry to miss the standup this morning.
<mgz> natefinch: blame columbus!
<natefinch> mgz: that bastard
<mattyw> fwereade, you about?
<mgz> ah, actually, it wasn't my one that borked, so is just trunk
<mgz> blame say nate :)
<mgz> natefinch: https://codereview.appspot.com/14701043
<TheMue> natefinch: heya
<TheMue> rogpeppe: could you help me with the provisioner?
<natefinch> mgz: dang, sorry
<rogpeppe> TheMue: sure; gimme a minute
<TheMue> rogpeppe: or even a level higher, I don't know if it is the provisioner
<TheMue> rogpeppe: thx
<natefinch> mgz: who uses tab separated values, anyway?  Why not csv like normal people? :)
 * natefinch grumbles about invisible, variable width characters
<rogpeppe> TheMue: what's the issue?
<rogpeppe> natefinch: because commas are more common in normal values
<TheMue> rogpeppe: my question is: in case of a dying machine (e.g. by shutdown -fh now), is there a mechanism that updates the state on it?
<rogpeppe> TheMue: i'm not sure
 * rogpeppe goes to look
<rogpeppe> TheMue: i can't see anything that polls instances for their status, no
<TheMue> rogpeppe: yes, that's my impression too
<TheMue> rogpeppe: thx
<rogpeppe> TheMue: np
<TheMue> rogpeppe: CTS reports a customer who turns down machines (wonder why they do it instead of removing units), and then in status they see the machine as down but the unit as alive
<TheMue> rogpeppe: that makes me wonder
<rogpeppe> TheMue: i have a feeling that the address updater worker might be a good place to put this functionality (i guess it might be renamed to "machineupdater" if that happened)
<rogpeppe> mgz: does that make sense to you?
<mgz> seems reasonable, though I'm not sure it's the right fix for the problem of machines just going away
<TheMue> mgz, rogpeppe: I'm trying to get more details from CTS
<rogpeppe> mgz: yeah, that would probably need another fix too, in the provisioner. although...
<rogpeppe> mgz: we'd need to decide what we want to do about that
<rogpeppe> mgz: do we want to automatically start another unit?
<mgz> that was the pyjuju behaviour, roughly
<mgz> not sure if it's what we want though
<rogpeppe> mgz: indeed
<rogpeppe> mgz: we probably do if minunits is set
<TheMue> mgz: it gets clearer now, they use juju to deploy openstack on maas, including the hacluster charm. then they stop nodes for failover tests
<fwereade> TheMue, rogpeppe, mgz: we dropped that behaviour, the only time anyone noticed it was when they didn't want it
<fwereade> TheMue, it is surprising that the unit is not reported as down if it's not running
<rogpeppe> fwereade: yeah, i thought so, although with SetMinUnits, we should perhaps rethink that
<rogpeppe> fwereade: because that expresses a clear intent, i think
<fwereade> TheMue, look at the api server, maybe it's not noticing that the unit's connection is gone and is hence not killing the presence bit
<rogpeppe> fwereade: ah of course, i'd forgotten about the presence thing
<fwereade> rogpeppe, wrt units uncautious agreement; wrt machines uncautious disagreement ;p
<fwereade> er machines *cautious* disagreement
<rogpeppe> fwereade: agreed
<fwereade> rogpeppe, essentially I need to write some access-revocation stuff for agents we force destroy
<sinzui> How permanent is the theme-oil bug tag? Is there a better tag to represent the bugs? server?, hyperscale? cloud-server?
<fwereade> rogpeppe, if I have that I can be sure that removed units aren't coming back and are safe to replace without compounding confusion
<rogpeppe> fwereade: i guess so, although i suppose it's not a huge problem
<rogpeppe> fwereade: ah yes, of course
<fwereade> rogpeppe, I am still twitchy about presence nodes as an arbiter of true existence
<rogpeppe> fwereade: although...
<rogpeppe> fwereade: we won't be recreating the same units, will we?
<rogpeppe> fwereade: or the same machines to run them, come to that
<fwereade> rogpeppe, I've seen a couple of `down (alive)` reports that have always cleared themselves up before I've been able to figure them out
<fwereade> rogpeppe, so basically I'm just nervous about our ability to tell for sure what's in the system
<rogpeppe> fwereade: yeah
<fwereade> rogpeppe, and I don't want things coming back to life once we think they're dead
<rogpeppe> fwereade: if we've destroyed their state entities, how can they?
<rogpeppe> fwereade: or perhaps that's what you're thinking of when you say "access revocation"
<fwereade> rogpeppe, I'm just worried about the potential for races, with things coming back and managing to do something while we think they're gone
<fwereade> rogpeppe, access revocation == telling things they're dead immediately, even if they're not quite dead in state yet
<rogpeppe> fwereade: we can't just kill 'em dead in state immediately?
 * rogpeppe is still going down the maas EOF bug rabbit hole
<fwereade> rogpeppe, not sure we can while keeping sane guarantees
<fwereade> rogpeppe, (tyvm for keeping on at that)
<fwereade> rogpeppe, think of the relations a subordinate has joined, when that subordinate is in a container in a machine that itself has a unit running directly
<fwereade> rogpeppe, even calculating the right transaction for smoothly cleaning up the top-level machine kinda makes me cry
<rogpeppe> fwereade: there's something really weird happening, which is causing the remote http connection to be dropped *just* before we make a GET request
<fwereade> rogpeppe, I would like a quick "cut all these off" to be followed by a more measured cleanup of all the various agents potentially in play
<rogpeppe> fwereade: yeah, seems reasonable on second thoughts
<natefinch> fwereade: do you know how far Tim got in looking at MaaS?  I was looking into a couple of the bugs, but was having trouble with my maas environment (as usual).   If he's progressing, it might be better for me to work on something else
<kurt_> Hi all - I asked about bug 1236734 yesterday and didn't get a response.  This is listed as fixed in 1.16.0, but it's not.
<_mup_> Bug #1236734: juju 1.15.1 polls maas API continually <juju-core:Fix Released by gz> <https://launchpad.net/bugs/1236734>
<kurt_> Any chance we could get some attention on this?
<fwereade> natefinch, I saw rvba and bigjools talking with rogpeppe this morning
<rogpeppe> natefinch: which MaaS problem?
<fwereade> kurt_, mgz was going to follow up with that -- last I heard it was confirmed fixed by jamespage
<rvba> fwereade: we talked about something different, we are in the process of fixing bug 1239488.
<_mup_> Bug #1239488: Juju api client cannot distinguish between environments <MAAS:Triaged> <https://launchpad.net/bugs/1239488>
<natefinch> rogpeppe: the couple I was looking at was destroying non-allocated machines and destroying machines from outside the juju environment
<fwereade> kurt_, are all your agents running 1.16?
<kurt_> fwereade: it's not. I have 1.16 installed and am still having the issue
<fwereade> rvba, I *think* that's what natefinch was thinking of
<rvba> natefinch: that's the bug I just mentioned.
<rogpeppe> natefinch: ah, yeah, there's some stuff being done for that
<natefinch> rogpeppe, rvba, fwereade:  that sounds like a great fix for the bug
<kurt_> fwereade: how can I bring up the agent inof?
<kurt_> info rather
<fwereade> kurt_, juju status
<fwereade> kurt_, including agent-version
<natefinch> sorry, brb, diaper duty
<rogpeppe> natefinch: there's a CL here: https://codereview.appspot.com/14696043/
<fwereade> kurt_, you may need to sync-tools before an upgrade-juju sees the latest tools to fix the problem
<kurt_> Oh - good point - how can I update the agent? just sync-tools?
<jamespage> fwereade, mramm2: any call outs you would like to add for juju-core in https://wiki.ubuntu.com/SaucySalamander/ReleaseNotes#Ubuntu_Server ?
<fwereade> kurt_, sync-tools should get them in place for you; upgrade-juju will actually upgrade the agents
<fwereade> kurt_, we didn't quite get the global tools source in place in time for it to be 1-step on maas :(
<kurt_> oh - so even if I followed the correct upgrade process, I still would have run into this?
<kurt_> sorry, I know this is the dev list and not real appropriate for this discussion
<fwereade> kurt_, no worries :) and... I guess it depends on definitions, because upgrade-juju can't upgrade further than it knows; but it's not necessarily entirely obvious that you need to sync tools before a maas environment will see them
<kurt_> interesting, right.  I'm trying to blog some stuff, so I'd like to capture that
<fwereade> kurt_, the bug report is appreciated all the same -- it helps make it clear to us that we still need to smooth the process
<fwereade> kurt_, *hopefully* you just hit a bad window, because we've got some infrastructure just coming online now to make it more transparent (unless you're actually cut off from the internet entirely -- if you're isolated, syncing will always be required)
<kurt_> I'm not isolate
<kurt_> isolated - full connectivity to internet on MaaS
<kurt_> and juju
<fwereade> kurt_, but, indeed, I don't think you'll see the benefits until 1.18 -- sorry for the infelicity
<mgz> kurt_: I responded on the bug today after checking with others, you might not have seen the comment unless you subscribed to the bug
<kurt_> so when upgrading juju - can you outline what the fool proof process should be?
<kurt_> (as to avoid this situation in the future) :D
<kurt_> mgz: thanks
<kurt_> I didn't see that yet
<fwereade> kurt_, for most users: `juju upgrade-juju`; for those using maas, or those who have manually synced tools in the past, you'll need a `juju sync-tools` first
<kurt_> fwereade: I believe that I did the juju upgrade via apt-get upgrade after adding the ppa
<fwereade> kurt_, we will hopefully be able to cut out that step for those not in isolated environments, but it's not there today
<fwereade> kurt_, ha, sorry, I was focused on the agent side of things: yes, you should also apt-get upgrade your client; and you should probably prefer to upgrade the client first, because that way round has been exercised more, but either way round should be fine in practice
<kurt_> so the upgrade of juju can be thought of at 2 levels, right? 1. the juju binary 2. the agents - is that a reasonable assumption?
<kurt_> oh there are the tools too
<kurt_> !
<fwereade> kurt_, yeah, both need to be done -- but 1.14/1.16 binaries and agents should interoperate happily
<fwereade> kurt_, the tools are the agent binaries
<kurt_> i was previously on .15 :D
<fwereade> kurt_, just a quirk of terminology
<kurt_> 1.15.1 that is
<kurt_> I C
<fwereade> kurt_, then that *should* actually work too, but we don't make guarantees for odd-numbered minor versions
<fwereade> kurt_, we have no intention of breaking upgrades ofc... but 1.15 *did* have upgrade troubles
<kurt_> 1.15.1 had some fixes I needed previously - but yeah I get that - but that's why I was asking about the best way to flip bits between dev and stable stuff.
<fwereade> kurt_, think of 1.odd versions as canaries to make sure we don't mess up the 1.even path
<kurt_> ok
 * rogpeppe gets some lunch
<TheMue> fwereade: hmm, thx for the hint, found the machinePinger in apiserver/admin.go, but funnily it's used nowhere
<natefinch> rogpeppe, fwereade:  ec2 instance type constraints.... would love to have some input: https://codereview.appspot.com/14523052/
<mgz> natefinch: I was leaving that one for next week
<mgz> if you read through the mail archives, you'll find various arguments, and just having a value that's ignored depending on provider and generally has uncertain meaning isn't really an improvement on not supporting it at all
<allenap> fwereade: Hi William, please can you add me to ~juju? Or please mark https://code.launchpad.net/~allenap/juju-core/maas-environment-uuid/+merge/191146 as Approved :)
<mgz> allenap: did you do the rename rogpeppe mentioned?
<allenap> mgz: In a subsequent branch I changed it to maas-environment-uuid then back again when I read his comment.
<mgz> so, we want another branch generating it in Prepare for all providers?
<allenap> mgz: I guess so. I have added the code to return an error if it's set in environments.yaml, but it's still in the MAAS provider for now.
<allenap> mgz: I have a very specific problem to solve :)
<rogpeppe> wow, this is incredibly weird
<mgz> allenap: submitted
<allenap> mgz: Ta.
<sinzui> fwereade, hazmat, thumper, You will be getting email about this bug https://bugs.launchpad.net/juju-core/+bug/1232304
<_mup_> Bug #1232304: consider tuning git setup for juju-core, and document caveats <canonical-webops> <doc> <feature> <pes> <juju-core:Triaged> <https://launchpad.net/bugs/1232304>
 * TheMue is stepping out for dinner, cu
<mthaddon> sinzui: hi, can you give me any context on that bug (1232304)? it's a real problem for us at the moment - means some environments can't run upgrade-charm without significant effort
<allenap> mgz: If you're itching to do a review, https://codereview.appspot.com/14644045 is scratching at the door waiting to be let into your heart.
 * rogpeppe has gone up a blind alley and needs to stop and make curry.
<rogpeppe> g'night all
<sinzui> mthaddon, I have asked for input from the engineers.
<allenap> nn rogpeppe
<rogpeppe> allenap: cheerio
<mgz> allenap: may have a chance to go through that briefly
<natefinch> morning thumper.  How goes?
<thumper> natefinch: morning, good
<thumper> can tell that today is going to get a little broken up
<thumper> taking the car for a service in just under an hour
<thumper> and will work from a cafe across the road while that is happening
<natefinch> Wish I could do that, my mechanic is sorta in the middle of nowhere
 * natefinch doesn't really want to choose mechanics based on amenities within walking distance, however....
<thumper> :)
<wallyworld_> fwereade: you around? i was hoping for a hand off email about the maas http connection issue. do you know the status?
<fwereade> wallyworld_, back soon -- rogpeppe was thinking for a while that it *was* an unclosed request body, but then I saw him saying blind alley :(
<wallyworld_> ok, thanks
<thumper> is the go bot set up to land on 1.16?
<wallyworld_> thumper: yes
<thumper> wallyworld_: so why isn't it
<thumper> ?
<wallyworld_> um.
<wallyworld_> it did for me last week, no problems
<thumper> https://code.launchpad.net/~thumper/juju-core/bootstrap-state-no-constraints/+merge/190852
<wallyworld_> i'll log in and check
<thumper> approved, commit message
<thumper> not getting merged
<wallyworld_> thumper: https://pastebin.canonical.com/99089/
<thumper> wtf?
<wallyworld_> thumper: wtf is a ghost revision?
<thumper> a revision referenced as a parent that isn't in the repository
<thumper> most likely, a stacking issue
<thumper> but launchpad can see it
<thumper> so NFI
<wallyworld_> maybe do a new branch, cherry pick changes, re-propose?
<thumper> ugh
<thumper> wallyworld_: I know what it is
<wallyworld_> do tell
<thumper> wallyworld_: no I don't
 * thumper thought he did
<wallyworld_> nothing to see here, move on
<thumper> yes I do
<thumper> maybe
<thumper> argh
<thumper> it does have to do with stacking
<thumper> what I did was start with a 1.16 branch
<thumper> and then moved back 3 revisions to be what was currently merged with trunk
<thumper> that was the base revision for that branch
<thumper> when pushed to launchpad, it gets stacked on trunk
<thumper> trunk should have those revisions from 1.16
<thumper> so now I'm confused again
<thumper> bzr is doing something weird
 * thumper wonders if he can unstack it
 * thumper pokes
<wallyworld_> nothing seems to be going right this week
<thumper> :)
<thumper> looking forward to next week?
<thumper> because that'll be awesome
<thumper> :-)
<thumper> well, that branch says it is reconfiguring as unstacked, we'll see if it actually works
<thumper> bzr reconfigure --unstacked lp:~thumper/juju-core/bootstrap-state-no-constraints
<thumper> in case you cared
<wallyworld_> cool, i have not done that before
<wallyworld_> yes, next week will be great
<thumper> wallyworld_: do you need any reviews?
<wallyworld_> thumper: i have a couple actually
<wallyworld_> there's a pipeline of 3
<wallyworld_> andrew has done the last
<wallyworld_> https://codereview.appspot.com/14663043/ and https://codereview.appspot.com/14540055/ are the others
<thumper> paste the links here and I can go through them
<wallyworld_> thanks :-)
<wallyworld_> the last is https://codereview.appspot.com/14502059/ fwiw
<wallyworld_> i've been looking at the http maas issue so haven't yet really looked at the review i do have, will do that today
 * thumper nods
 * thumper starts on the first one
<thumper> looks big
<thumper> btw, unstacking that branch worked
<thumper> should get merged in now
<thumper> hopefully
<fwereade> wallyworld_, do we have docs for sync-tools --source somewhere
<fwereade> ?
<fwereade> thumper, hi, long time
<wallyworld_> fwereade: not sure. i haven't really read any of our docs
<thumper> fwereade: hey
<thumper> just noticed the provisioner tests timed out on go-bot for the 1.16 branch
<thumper> anyone seen this before?
<wallyworld_> fwereade: do we have any docs for sync-tools at all?
<wallyworld_> thumper: i get random timeouts from the bot semi-regularly
<fwereade> wallyworld_, hell, possibly not :/
<bigjools> wallyworld_: how'd you get on with the http problem?
<wallyworld_> bigjools: hi, long story, lots of dead ends. but it looks like maas is closing the http connection and then juju complains about it
<bigjools> wallyworld_: and juju just needs to deal with that I guess
<wallyworld_> still need to figure out root cause
<bigjools> it might be getting closed because it's not using http/1.1?
<bigjools> or the keep-alive header is not getting set?
 * bigjools stabs in the dark
<wallyworld_> not sure, will check. but none of that has supposedly changed
<wallyworld_> bigjools: so all of the http requests to maas go via the apache server?
<bigjools> did you work out if Go 1.1.1 made a difference?
<wallyworld_> bigjools: no difference
<bigjools> wallyworld_: yes, maas runs as a wsgi container in Apache
<wallyworld_> compiled 1.14 with go 1.1.2 and it worked
<bigjools> !
<bigjools> so something changed in between 1.14 and 1.16.... fun
<wallyworld_> nothing obvious though
<wallyworld_> and the EOF at the client is really due to the server closing the connection
<wallyworld_> but what triggers the closure
<wallyworld_> that is the question
<bigjools> you could do something silly like compile 1.16 with the version of gomaasapi from 1.14
<bigjools> just to eliminate that
<bigjools> or confirm it
<wallyworld_> could do. a very quick look at gomaasapi showed no significant changes, or nothing obvious in the http request dispatch side of things
<wallyworld_> i think though the 1.14 gomaasapi won't work with 1.16? will need to check i guess
<wallyworld_> bigjools: you using the maas server this morning?
<bigjools> wallyworld_:  no
<bigjools> I am weeping into my coffee
<wallyworld_> i'll poke around some more then if that's ok
<bigjools> because of this multi-environ crap
<wallyworld_> weeping?
<bigjools> np
<wallyworld_> ah, the users getting mixed up issue?
<bigjools> yeah
<bigjools> I maintain it's a juju self-inflicted problem but thumper disagrees with me
<wallyworld_> what would he know, right?
<thumper> not a lot
<bigjools> at least we can disagree in a more violent manner in person next week :)
<wallyworld_> i'll remember to pack the popcorn
<wallyworld_> or maybe jelly
 * bigjools will secretly put one fewer shots in thumper's coffee
 * thumper packs the boxing gloves
<bigjools> heh
<bigjools> you still 120kg thumper?
<wallyworld_> of pure muscle
<rvba> Dibs on the front row seats!
<thumper> just under 97kg now
<bigjools> gosh
<thumper> never was 120
<thumper> 111 maybe
<bigjools> :)
<wallyworld_> wasting away to nothing
<bigjools> so rvba has come up with the great idea of using juju's uuids in the filenames we send to maas
<bigjools> rvba: of course we have to make sure the tools don't get those uuids in the names I think
<wallyworld_> yep, that would be bad
<rvba> bigjools: well, we just need to handle that transparently in the provider's code.
<wallyworld_> not as simple as that
<wallyworld_> you mean send the tools with the uuids?
<wallyworld_> and translate in the provider?
<wallyworld_> or strip out the uuid and send the tools unchanged?
<rvba> juju itself will ask to a file with name="zzz" but the provider will just fetch the file named uuid+"zzz"
<rvba> Same when storing files.
<wallyworld_> hmm. but then we will have multiple copies of the same tools tarballs
<rvba> The only trick is the anonymously accessible files… but that will also work transparently I think, because in this case we simply use generated ids.
<wallyworld_> can it be path-based? ie don't use uuids for certain paths?
<wallyworld_> so stuff under tools/releases is left alone
<rvba> wallyworld_: well, yes, that could be an improvement.
<wallyworld_> and tools/streams
<wallyworld_> or just /tools in general
<wallyworld_> that would save duplicating the same tools tarballs and metadata over and over again for each user
<bigjools> so leave tools/ alone but add uuid/ to everything else
<bigjools> WFM
<wallyworld_> yeah, think that will work
<rvba> Yeah, totally possible.  But again, I think this should be done as a second step.
<wallyworld_> why 2nd? i think it is flawed to needlessly duplicate lots of data
<bigjools> rvba: so I don't want to change the filestorage object willy-nilly because of this
<bigjools> it needs to be done at the provider level
<rvba> Absolutely.
<rvba> There is actually very little to be done.
<rvba> But that's the beauty of it.  It's just a fix to the provider.
<rvba> All in provider/maas/storage.go
<rvba> Basically, we just need to translate filename <--> uuid+filename everywhere.
<wallyworld_> and also the output from list
<rvba> Yep
<rvba> storage.List(stor, "")
<rvba> Becomes storage.List(stor, uuid)
<wallyworld_> it will be trivial to add the /tools exclusion
<rvba> bigjools: this is really too good to be true!
<bigjools> rvba: I disagree that it can only go there because it will affect the tools
<bigjools> I think it needs to go in some of its callsites
<wallyworld_> tools are access via storage
<rvba> Exactly.
<wallyworld_> so it should work, so long as tools are excluded from uuid mangling
<bigjools> via that storage object no?
<wallyworld_> yes
<bigjools> special casing tools in there is crazy
<wallyworld_> agreed, i don't care where it's done
<wallyworld_> but we shouldn't duplicate tools
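The translation rvba describes, plus the tools/ exclusion wallyworld_ is arguing for, can be sketched roughly as below. The function names and the uuid value are made up for illustration; the real change would live in provider/maas/storage.go:

```go
package main

import (
	"fmt"
	"strings"
)

// maasName maps a juju storage name to the name used on the MAAS side,
// prefixing everything except tools, which are shared between environments.
func maasName(uuid, name string) string {
	if strings.HasPrefix(name, "tools/") {
		return name // no per-environment prefix, avoids duplicating tarballs
	}
	return uuid + "-" + name
}

// jujuName reverses the mapping, e.g. when translating List output.
func jujuName(uuid, name string) string {
	return strings.TrimPrefix(name, uuid+"-")
}

func main() {
	const uuid = "c20bb0ae" // hypothetical per-environment identifier
	fmt.Println(maasName(uuid, "tools/releases/juju-1.16.0.tgz"))
	fmt.Println(maasName(uuid, "provider-state"))
	fmt.Println(jujuName(uuid, maasName(uuid, "provider-state")))
}
```

Listing would then become the prefix query mentioned above (storage.List(stor, uuid) instead of storage.List(stor, "")), with the prefix stripped from results.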
<bigjools> I'll sort the call sites out later
<bigjools> right now I need breakfast and an inbox cleanout
<rvba> Again, I suggest dealing with the tools second.
<bigjools> or is that an outbox cleanout
<rvba> Because we need to implement this and test it.
<wallyworld_> but not in a flawed way
<rvba> bigjools: so, I'll leave the lab running so that you can test this
<wallyworld_> tools duplication not good
<bigjools> rvba: I can test it locally
<rvba> not good indeed.  But not fatal.
<bigjools> my maas setup is working fine here
<wallyworld_> too much scope for hidden gotchas
<rvba> bigjools: all right.
<bigjools> I'd rather do it properly from the start, it's no big deal.
<wallyworld_> bigjools: except i'm playing with it
<bigjools> rvba: thanks
<bigjools> wallyworld_: easily fixed :)
<wallyworld_> bastard
<bigjools> mwahahaha
<bigjools> you have a while anyway
<rvba> bigjools: don't forget to base your work on Gavin's branch: ~allenap/juju-core/maas-environment-uuid-use
<bigjools> doubt I'll get to the QA stage until this afternoon
<bigjools> rvba: affirmative, thanks
<bigjools> is he finished coding?
<rvba> Yeah, it's up for review.
 * bigjools looks at wallyworld_
<wallyworld_> yeeees?
<rvba> bigjools: btw, thumper reviewed it.
<bigjools> \o/
<rvba> He has an interesting point about backward compatibility.  It should be fine (I think extra arguments like agent_name will be ignored by earlier version of MAAS) but it's worth a test.
<bigjools> rvba: it will be fine and a desirable outcome.  It leaves the functionality as-is
<bigjools> I'll try to land it later
<rvba> All I'm saying is that it's worth a test.
<bigjools> absolutely
<rvba> And on that note, I think I'll call it a night.
<rvba> I'll be available to do some QA tomorrow morning if we still have time.
<wallyworld_> davecheney: ping
<davecheney> wallyworld_: ack
<wallyworld_> i have tweaked a go file from the std lib (net/http/transport.go). how do i force that to compile so that a go build -a ./... from juju-core uses my mods?
<wallyworld_> i tried go build in the /usr/lib/go/pkg/net/http dir
<davecheney> wallyworld_: you should, 1 remove the packaged go
<davecheney> 2. download go from source
<davecheney> the packaged go cannot be modified
<davecheney> it's too mutated
<wallyworld_> davecheney: this is on jool's maas server which roger played with last night and he made changes
<davecheney> http://golang.org/doc/install/source
<wallyworld_> i'm trying to revert them
<davecheney> if they are to the packaged version
<davecheney> remove the package
<davecheney> make sure /var/lib/go is empty
<davecheney> reinstall the package?
<davecheney> would that work ?
<wallyworld_> i think so. i assume go 1.1.2 is in the repos?
<wallyworld_> although if roger made his changes stick, i should be able to as well?
<wallyworld_> i wonder what command he ran?
<davecheney> wallyworld_: i don't think I really understand the problem
<davecheney> or the solution you are pursuing
<wallyworld_> ok, so roger modified transport.go
<wallyworld_> to add extra logging
<davecheney> wallyworld_: you could try, as root
<davecheney> cd /usr/lib/go/pkg/net/http
<davecheney> go install -x
<wallyworld_> and he did so by editing /usr/lib/go/pkg/net/http/transport.go directly
<davecheney> i don't know if that will work
<davecheney> never tried it
<wallyworld_> will try
<wallyworld_> davecheney: that looks like it worked. the timestamp on the .a file has now changed
<wallyworld_> thanks :-)
<davecheney> when screwing with this
<davecheney> i recommend -x and -v flags
<davecheney> so you can see when things are changing
<davecheney> or more important, when they are not 'cos the tool thinks everything is up to date
<wallyworld_> ok. i've not used -x before
<wallyworld_> yeah. i thought go build -a would have done the job
<wallyworld_> but it seems that's not for the std libs, only deps of current project
<davecheney> -a has problems
<wallyworld_> don't we all
<davecheney> https://groups.google.com/d/msgid/golang-nuts/9120d622-e8a8-451f-941e-34899ae0a457%40googlegroups.com
#juju-dev 2013-10-16
<wallyworld_> bigjools: wtf. bootstrap on your maas box works now
<wallyworld_> bigjools: so roger and i were debugging and trying things. then the server got shutdown. now, it seems it has just started working. i tried with and without all the debug logging
<wallyworld_> maybe the power cycle on the server helped, not sure
<bigjools> wallyworld_: dafuq
<thumper> haha
<wallyworld_> bigjools: so looks like we have wasted 2 man days when we should just have listened to Roy and Moss and "turned it off and on again"
<bigjools> \o/
<bigjools> I still think it's a bug
<wallyworld_> well, not much juju can do if maas/apache closes the connection
<wallyworld_> from underneath it
<wallyworld_> i guess it could retry
<wallyworld_> but wtf
<wallyworld_> that's the only place that needs such logic
<bigjools> network code should never ever assume connections will stay open
<wallyworld_> that's for the http lib to worry about
<bigjools> yes
<bigjools> I think the Go http lib is a little crazy
<bigjools> exposing that Close setting is one thing, but requiring it before it can cope with the other end closing it is a bug
<wallyworld_> hard to argue that i think
<bigjools> anyway can someone land this please, I am not in the juju team any more: https://code.launchpad.net/~allenap/juju-core/maas-environment-uuid-use/+merge/191249
<wallyworld_> i'll add you
<bigjools> please don't :)
<wallyworld_> what's it worth to you
<bigjools> coffee and lunch at the Tavern?
<wallyworld_> tempting
<bigjools> lower latency to my maas server?
<wallyworld_> who knows
<bigjools> I'm slobbing it on the outdoor sofas today
<wallyworld_> bigjools: so that branch, does it duplicate tools?
<bigjools> not started it yet
<bigjools> I do not intend to duplicate them
<wallyworld_> ah, sorry, i thought the one you wanted landed was it
<bigjools> no, it's gavin's agent_name fixes
<wallyworld_> i should have read the description
<wallyworld_> bigjools: do you intend to propose this against 1.16 too?
<bigjools> failing the tavern, $5 big mac? :)
<wallyworld_> or just land in trunk?
<bigjools> honestly NFI what's best
<bigjools> I am not familiar with the release plans
<wallyworld_> thumper: when is 1.18 due out?
 * thumper shrugs
<bigjools> someone said yesterday that there's another release for saucy
<wallyworld_> there is
<wallyworld_> 1.18 i think
<bigjools> so trunk then
<wallyworld_> bigjools: so gavin's fix, how critical is it
<bigjools> very
<bigjools> and the one I am about to do
<wallyworld_> i wonder if we need a 1.16.1 then
<wallyworld_> fwereade: any idea on release plans as per the backscroll?
<bigjools> are you going to approve the MP then?
<wallyworld_> yeah, sorry saw something shiny and got distracted
<wallyworld_> should be in the bot now
<bigjools> heh
<bigjools> thanks
<wallyworld_> bigjools: about lunch, is there a place to buy decent coffee beans out your way?
<bigjools> wallyworld_: the little bean in Kenmore
<wallyworld_> that's more out my way :-)
<bigjools> it's on the way :)
<bigjools> that's the nearest
<bigjools> though there's a new coffee shop coming apparently \o/
<wallyworld_> i need coffee. i could drive out to you and get beans also. kill two birds with the one stone
<bigjools> poyfekt
<bigjools> remember that the cafe closed down, you have to go to the smaller place on the other side of the road now
<wallyworld_> what time?
<wallyworld_> yes
<thumper> wallyworld_: I have a feeling that we'll only be able to put 1.16 point releases directly into saucy
<bigjools> any time you want
<bigjools> however
<thumper> but continue as normal with trunk
<bigjools> I have a call from 12-1
<thumper> and the ppa
<wallyworld_> bigjools: i'll leave soon i guess
<bigjools> sure
<wallyworld_> thumper: that sucks
<thumper> wallyworld_: that's working with distro
<wallyworld_> i thought 1.18 was going into saucy
<bigjools> they will only take cherry picks into saucy now
<thumper> wallyworld_: unlikely at this stage
<wallyworld_> cause i've done a bunch of stuff in trunk
<wallyworld_> assuming it would be in saucy
<wallyworld_> this is very bad
<wallyworld_> 1.16 is not ready
<wallyworld_> there's still the tools repository to do
<wallyworld_> and the ongoing maas stuff
<wallyworld_> and lots of other tooling stuff
<wallyworld_> if we are forced to cherry pick stuff, it will be like the whole fucking cherry tree
<wallyworld_> bigjools: leaving now
<bigjools> wallyworld_: righto
<bigjools> every time I leave the juju code base for a while and then come back to work on it, I struggle to get everything compiling.  I presume this is because of mismatched dependencies.  What's the best way of dealing with this?
<bigjools> or, I suspect, branches moving and Go has the bug of using the wrong url for a branch :/
 * thumper nods
<thumper> that is one
<bigjools> yeah goamz moved it seems
<thumper> apparently jam had a proposal to get golang to use lp: urls for launchpad
<thumper> no interest
<bigjools> oh dear
<davecheney> bigjools: yup, if the owner of goamz has moved
<davecheney> the go get'd branch is probably pointing at the wrong place
<bigjools> indeed it was
<davecheney> bigjools: niemeyer added support for bzr to go get
<davecheney> if you can show me what is wrong, i can try to get it fixed
<bigjools> thumper can explain it better than me
<bigjools> but the upshot is that it needs to pull from lp:project
<bigjools> not the actual branch url
<thumper> davecheney: when the go tool resolves bzr branches to launchpad
<thumper> it expands the project name into the full http url with unique name
<thumper> this is very slow
<bigjools> the old url is http://bazaar.launchpad.net/~gophers/goamz/trunk/
<thumper> most LP users have their lp identity set in bzr
<bigjools> the new url is bzr+ssh://bazaar.launchpad.net/+branch/goamz/
<thumper> which means lp: urls resolve to bzr+ssh
<bigjools> the latter is owner-agnostic
<thumper> if you don't, lp urls resolve to http
<thumper> so lp: is better
<thumper> also
<thumper> bzr+ssh://bazaar.launchpad.net/+branch/project
<thumper> always resolves to the development focus trunk of the project
<thumper> even if the owner changes
<thumper> but go get will turn "launchpad.net/loggo" into http://bazaar.launchpad.net/~thumper/loggo/trunk
<thumper> instead of bzr+ssh://bazaar.launchpad.net/+branch/loggo
<thumper> if go get passed "lp:loggo" to bzr
<thumper> bzr translates to the best it knows
<thumper> which is bzr+ssh if it has your id
<thumper> and http if not
<davecheney> thumper: i'm pretty sure the choice of http is deliberate
<thumper> davecheney: deliberate and stupid
<thumper> IMO
<davecheney> fair
<thumper> it is a choice made by someone who doesn't understand the bzr tool
<thumper> and when jam suggested a patch to golang, they ignored it
 * davecheney has no comment
<thumper> even though he is probably the best person to make such a suggestion
 * thumper goes back to reviewing wallyworld's branch
<wallyworld> \o/
 * thumper needs to go pick up the car from the garage
<thumper> bbs
<thumper> axw: how are you doin?
<axw> thumper: heya
<axw> not too shabby
<axw> working on fixing null provider bugs
<axw> the apt repo one's a bit of a pain, need to extract the key from the keyserver... cloud-init would normally take care of that
 * thumper nods
<thumper> there isn't a handy command we can use?
<thumper> doesn't add-apt-repository download the key?
<axw> thumper: only for ppas
<thumper> bummer
<axw> I'm looking at the cloud-archive case
 * thumper nods
 * thumper goes to pick up the wife
<thumper> geez
<thumper> broken day
<thumper> bbs
<axw> davecheney: are you aware of any tools for looking for unused functions/vars/types/etc.?
<axw> or, how can I identify all functions that are only ever used in tests
<davecheney> axw: I think there is a mode for go vet in 1.2
<davecheney> and kamil kissel has written a tool
<axw> davecheney: thanks, I'll take a look
<wallyworld> thumper: ping
<wallyworld> thumper: i did some fixes for axw's review in the wrong branch in the pipeline. i'm fixing now so ignore the new diff in your review.
<thumper> ok
<wallyworld> thumper: what i did do though is reply to your comments on both merge proposals. i'll fix the issues like gc.HasLen etc but there's also a few things i've replied back to
 * thumper nods
<thumper> wallyworld: I feel I may pop down to harvey norman to look at the coffee machine
<thumper> really need one that doesn't make me angry
<wallyworld> yes indeed
<wallyworld> get the dual boiler!
<wallyworld> thumper: do all tests really need to extend loggingsuite even if they don't require the base functionality
<wallyworld> seems like a waste
<thumper> wallyworld: the logging suite captures the logging
<thumper> without it, tests become noisy
<thumper> if someone decides to add logging somewhere
<thumper> that is in the testing path
<wallyworld> fair point
<thumper> that's all it really does
<wallyworld> i guess we should fix all existing test suites at some point then
 * thumper nods
<wallyworld> thumper: if you still have any spare bandwidth left today, i've done fixes for those 2 mp's
<axw_> wallyworld: when you have a moment, I've updated https://codereview.appspot.com/14527043/
<wallyworld> sure
<axw_> wallyworld: sync no longer resolves metadata, but "juju metadata generate-tools" will still
<wallyworld> axw_: i think everything should calc the sha etc, drop the option to allow it not to be done. the sha256 and size is absolutely needed for sync tools
<wallyworld> and generate metadata is typically run using local files so it can always be done for that as well
<axw_> wallyworld: it's only for existing tools with no metadata - I thought the conclusion was that it would be okay after 1.16?
<wallyworld> after 1.16, there should be no metadata without size/checksum
<wallyworld> so if this mp is to go into trunk, then drop the resolve option altogether
<wallyworld> imo
<wallyworld> always just do the checksum/size
<axw_> wallyworld: as in, behave as if the call were specified with fetch/resolve==true all the time?
<wallyworld> yeah
<axw_> what's the point if there's no metadata without size/checksum?
<wallyworld> the fetch=true tells the command to read the tarball data to do the size/checksum when the metadata is generated
<wallyworld> and that's what we always want now
<wallyworld> since we don't want to produce metadata without size/checksum
<wallyworld> so fetch=false should be verboten
<wallyworld> make sense? or am i missing something?
<axw_> wallyworld: with the change, metadata is still being generated with size/hash. It's populated when the tools are copied to storage
<axw_> wallyworld: the only thing that's affected is tools that are in storage, but either don't have metadata, or have metadata without size or hash
<wallyworld> so - if i have some tarballs locally, and i just want to generate metadata json, and not copy the tarballs anywhere - that's what the generate-metadata command does - that should always happen with size/hash
<wallyworld> and even if the tarballs are not local, ie on a cloud, the same applies
<wallyworld> the generate-metadata command should always produce json with size/hash now
<axw_> wallyworld: yes, it does and will continue to do so with this change
<axw_> generate-tools only
<wallyworld> as of 1.16, there should be no metadata without size/hash
<wallyworld> so i'm not sure if your comment above holds?
<axw_> right
<wallyworld> this one i mean
<wallyworld> [15:02:48] <axw_> wallyworld: the only thing that's affected is tools that are in storage, but either don't have metadata, or have metadata without size or hash
<axw_> right, so my point is - the change won't break anything :)
<wallyworld> sure, but why cater for a forbidden scenario
<axw_> as in, it only affects a scenario that won't occur
<wallyworld> it just complicates the code base
<axw_> I'm explicitly not catering for it now
<wallyworld> but there's still the fetch option etc
<wallyworld> that is no longer needed
<wallyworld> fetch=true always
<axw_> sorry, wallyworld are you talking just about the metadata plugin?
<axw_> as in, get rid of the command line option and have *that* always fetch?
<wallyworld> i was just looking quite narrowly at the diff in the code review
<wallyworld> and saw the option to resolve or not still there
<axw_> wallyworld: yeah, that's *only* in the plugin now.
<wallyworld> i do think we should always fetch, but we can do that as a separate mp
<axw_> I can make it always do it
<wallyworld> sorry, my brain hadn't made the distinction of what was where when reading the diff
<axw_> can I just confirm that it's okay *not* to resolve metadata for syncing?
<wallyworld> we do need to resolve for syncing
<axw_> when I say resolve metadata, I mean fill in size/hash
<wallyworld> cause we may have new tools
<wallyworld> that need to be copied
<axw_> wallyworld: heh, I mean for existing tools
<axw_> sorry
<axw_> not for newly copied ones
<axw_> newly copied ones will always get it, there's no option to disable it
<wallyworld> ok, i think it's reasonable, in trunk, to assume existing tools will have size/hash
<wallyworld> agree?
<axw_> okay, cool
<axw_> yes
<wallyworld> my brain hurts :-)
<axw_> sorry :)
<wallyworld> not your fault
<axw_> wallyworld: and I'll update the generate-tools command to always fetch
<wallyworld> ok, that would be great. i like leaving less legacy / tech-debt :-)
<wallyworld> thanks :-)
<axw_> wallyworld: updated
<wallyworld> looking
<wallyworld> axw_: looks good, land that fucker :-)
<axw_> sweet, thanks
<wallyworld> thank you for making it all work :-)
<axw_> heh nps
<bigjools> pls to be reviewerating https://code.launchpad.net/~julian-edwards/juju-core/maas-uuid-file-prefix/+merge/191336
<bigjools> sorry no Blofeld
<fwereade> bigjools, so how's the environment-uuid config field hooked up to the actual environment uuid?
<bigjools> fwereade: don't know the details, allenap did that already
<axw_> fwereade: the UUID is allocated randomly, at prepare time
<axw_> so... pointing at the same env requires sharing the UUID
<fwereade> axw_, bigjools: looks like that's not an environment UUID at all, it's just some made-up shit :/
<axw_> yeah
 * fwereade sighs deeply
<fwereade> bigjools, your branch looks fine
<bigjools> fwereade: ok thanks
<bigjools> and wow are you working late or in a different TZ?
<fwereade> bigjools, early
<fwereade> bigjools, flying to the US later today
<bigjools> fwereade: it calls utils.NewUUID() in gavin's branch
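For context, utils.NewUUID() produces a random (version 4) UUID, which is why two environments prepared separately can never share one by accident. A simplified sketch of the idea, not juju-core's actual implementation:

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUID is roughly what utils.NewUUID() does: 16 random bytes with the
// RFC 4122 version and variant bits set. Simplified sketch, not juju's code.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4 (random)
	b[8] = (b[8] & 0x3f) | 0x80 // variant 10
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	u, err := newUUID()
	if err != nil {
		panic(err)
	}
	fmt.Println(u) // random each run
}
```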
<fwereade> bigjools, need to go and see laura for a bit though, might not be back
<bigjools> so what is the somemadeup-shit you're talking about?
<bigjools> hoho
<fwereade> bigjools, the problem is the overwhelming bugfuck insanity of naming that thing "environment-uuid" when we already have an "environment uuid" that is not at all the same thing
<fwereade> bigjools, how to write unmaintainable code vol 1, ch 1, page 1
<fwereade> bigjools, but I cannot deal with this now, I might be back shortly
<bigjools> it's very easy to criticise
<bigjools> but at least it got done
<bigjools> so do you guys still need two +1s or can I land on one now?
<axw_> bigjools: just one
<bigjools> thanks axw_
<bigjools> axw: can you approve it please, I am not in the juju team so I can't do it
<axw> bigjools: sure
<bigjools> thank you sir
<fwereade> bigjools, ok, I did not express myself in a helpful way and I apologise for that
<fwereade> bigjools, but I think it really is a problem that some environments now have two UUIDs and there's no clear distinction between them
<fwereade> bigjools, would it be possible to do a quick branch that just s/environment-uuid/maas-agent-name/ and eliminates this source of confusion?
<bigjools> fwereade: I'm not sure where Gavin received his advice from, but I believe it was mostly under the direction of someone in the core team and that whoever it was had a plan to resolve this
<fwereade> bigjools, yeah, I just read the review :(
<bigjools> sadly this is what happens when stuff needs to go in quickly before a release
<fwereade> bigjools, yeah, I would kinda like to figure out how the api-key fiction got created and then propagated so widely in the first place
<fwereade> bigjools, it never even crossed my mind that it was completely made up
<fwereade> bigjools, because it's persisted all the way through back from python days
<fwereade> bigjools, and we never even had a maas environment to check against for such a long time
<rogpeppe> mornin' all
<rogpeppe> fwereade: the environment-uuid thing is all my fault
<rogpeppe> fwereade: i don't really see what harm it can cause tbh
<fwereade> rogpeppe, heyhey, I saw the review, and I think I see the reasoning... but ISTM that now we have two "environment uuids" for maas environments, and I don't see how we're ever going to be able to pull them back together
<rogpeppe> fwereade: they don't join up
<rogpeppe> fwereade: the environment-uuid in the config doesn't make it anywhere else, does it?
<fwereade> rogpeppe, then why do they have the same name? it looked like it was justified on the strength of being step 1 towards picking one at prepare time rather than bootstrap time
<fwereade> rogpeppe, which would be great, if we did it
<fwereade> rogpeppe, but now we have an environment config with one value, used by some parts of the system, and an environment doc with another used by different parts of the system
<fwereade> rogpeppe, and to imagine that never the twain shall meet strikes me as... optimistic
<rogpeppe> fwereade: well, currently maas has a private attribute called environment-uuid; the environment uuid in state doesn't come from or go into the config
<rogpeppe> fwereade: given that state.Initialize takes an environ config, we can easily change that at a later stage to put the environ-uuid from that into the current uuid doc
<rogpeppe> fwereade: and likewise we can easily change environs.Prepare to create it
<rogpeppe> fwereade: and when we do that, i *think* everything will just work, and the maas environ-uuid will then join up with the state uuid
<rogpeppe> wallyworld_: after sleeping on it, i *think* i know what's going on with the maas EOF bug
<fwereade> rogpeppe, that's fine for new environments, but existing environments will need to keep both around
<rogpeppe> fwereade: is that a problem?
<fwereade> rogpeppe, I think so, yes, because there is no longer a singular concept of environment uuid
<fwereade> rogpeppe, and I don't see how an existing environment can ever be brought in line
<rogpeppe> fwereade: is that a problem?
<fwereade> rogpeppe, well, yes, because an environment uuid is the only thing we have for globally identifying an environment
<fwereade> rogpeppe, and the last thing I want is to have to respond to bug reports by saying "ah, yes, it doesn't work because you should have used the *other* environment uuid"
<rogpeppe> fwereade: is it any worse than if maas created a new attribute, for example maas-machine-identifier ?
<fwereade> rogpeppe, yes, I think it is much worse
<fwereade> rogpeppe, a new identifier would have been great
<fwereade> rogpeppe, I thought I even saw you advocating that yesterday morning as I rushed by, and I thought "ah cool, everything's under control"
<rogpeppe> fwereade: i advocated one or the other
<rogpeppe> fwereade: i quite liked the idea of just using environment-uuid, because i *don't* think there's a great problem currently - the maas attribute is not really visible to the user
<fwereade> rogpeppe, you think nobody looking at the environ config is going to be fooled?
<fwereade> rogpeppe, the environ config is most certainly visible
<fwereade> rogpeppe, it's *more* visible to the user than the one in the environ doc
<rogpeppe> fwereade: i actually think that fixing it properly is going to be quite a small change.
<fwereade> rogpeppe, what do we do about all the environments that have two uuids then?
<rogpeppe> fwereade: we just need to change environs/config to add UUID, change environs.Prepare to create it and change state.Initialize to use it
<fwereade> rogpeppe, apart from the fact that we have to carry code FOREVER to handle the fact that sometimes they're different
<rogpeppe> fwereade: really?
<rogpeppe> fwereade: what code would we need?
<fwereade> rogpeppe, code to figure out which one is "meant" at any given time
<fwereade> rogpeppe, as it is today we will be starting envs with two uuids
<fwereade> rogpeppe, both of which are exposed to external systems
<rogpeppe> fwereade: the other side of the coin is that in the future, we *would* like maas to use the environ uuid to tag its machines
<fwereade> rogpeppe, and which we therefore cannot change
<rogpeppe> fwereade: and if we don't make it use environ-uuid, it will forever use some other identifier
<rogpeppe> fwereade: well, some other attribute anyway
<rogpeppe> fwereade: because it could still take its value from environ-uuid
<fwereade> rogpeppe, yeah, that would be nice, we would be able to derive the differently-named attribute from the real uuid if a legacy one were not already set
<fwereade> rogpeppe, bigjools: is there *any* way we can get this fixed without releasing in this state?
<rogpeppe> fwereade: well, it's just a naming issue right?
<rogpeppe> fwereade: so we just need to change the name
<fwereade> rogpeppe, yeah, but I am out of the loop and have no idea what timelines etc are in play
<bigjools> fwereade: we are at the mercy of the release managers in ubuntu
<fwereade> rogpeppe, if you can fix it, or ask someone else to, in time to not release with it in place, please please do so... but I have about half an hour to get up, pack, and catch a taxi to the airport
<bigjools> this is a major flaw in juju and maas and really needs to at least be a zero-day fix
<bigjools> so there is time to change it I think
<rogpeppe> fwereade: ok. how about i just fix it properly? i *think* it's quite a small change, though i may be wrong
<fwereade> rogpeppe, if you were to use environment-uuid in InitializeState that would be fine with me too
<bigjools> but one of you needs to do it AFAIC because my engineers have done enough already
<rogpeppe> fwereade: i'll give it a go
<fwereade> rogpeppe, can you do that please? and coordinate with jamespage I guess? tyvm
<rogpeppe> fwereade: i know what's going on with the MAAS bootstrap EOF bug BTW, i'm pretty sure
<rogpeppe> fwereade: it's a very interesting conjunction of issues
<wallyworld_> rogpeppe: hi
<rogpeppe> wallyworld_: hiya
<wallyworld_> sorry, i was out getting my prescription filled before i go away
<rogpeppe> pwd
<wallyworld_> rogpeppe: a reboot of the server fixed everything
<rogpeppe> wallyworld_: of the MAAS server?
<wallyworld_> yep
<wallyworld_> i think juju's http is flawed
<rogpeppe> wallyworld_: i don't believe the problem is fixed
<wallyworld_> it should cope with disappearing connections
<wallyworld_> any networking stack needs to be robust
<wallyworld_> to connections going away
<rogpeppe> wallyworld_: i think the real problem is an underlying problem with the http protocol itself
<wallyworld_> sure, but the http lib needs to hide that
<rogpeppe> wallyworld_: i'm not entirely sure whether it's possible
<wallyworld_> http libs from python et al do
<rogpeppe> wallyworld_: i wonder how they cope with this race:
<rogpeppe> wallyworld_: you use an existing connection and send a request, but the remote end drops the connection before it reads your request
<frankban> hi juju devs: is it safe to use ~/.juju/current-environment as a reliable way to retrieve the current default env name? or should we just consider it an internal detail?
<rogpeppe> wallyworld_: then it looks like you're getting EOF in response to your request
<bigjools> why is a request data object dealing with protocols?
<rogpeppe> bigjools: ?
<bigjools> request has a Close on it
<bigjools> seems odd
<rogpeppe> bigjools: it's an http header
<wallyworld_> frankban: the value in that file can be overridden by JUJU_ENV i think
<wallyworld_> frankban: so i would not rely on it
<rogpeppe> wallyworld_: when the above scenario happens, should the http client resend the http request on a new connection (possibly duplicating side-effects) or just return the error?
<wallyworld_> not sure. i'd like to know how other libs handle it
<rogpeppe> wallyworld_: me too
<frankban> wallyworld_: sure, I am trying to implement this logic: if JUJU_ENV is set, use it, otherwise, retrieve the default env as set by "juju switch". So my question is: how to reliably grab that value in the second code path?
<wallyworld_> but i've never seen this sort of behaviour elsewhere
<rogpeppe> wallyworld_: the thing is, it's usually a race with a very narrow window
<rogpeppe> wallyworld_: but in this case, an unfortunate set of circumstances conspire to make it happen every time
<bigjools> rogpeppe: in that case I'd expect the transport to deal with headers that affect its operation
<wallyworld_> frankban: what if juju switch has not been called yet?
<rogpeppe> bigjools: where should the user be able to tell the http package whether connections should be reused or not?
<wallyworld_> not on the request object that is for sure :-)
<frankban> wallyworld_: it's ok, we tried and failed, and we have no default value.
<frankban> wallyworld_: the last chance could be looking for environments.yaml[default] actually
<wallyworld_> frankban: is this a python script or something?
<frankban> wallyworld_: yes it is
<rogpeppe> wallyworld_: the reason (i'm pretty sure, though i haven't had time this morning to verify) why we were seeing the problem every time, is that just before we send the request that fails, we do some very cpu-intensive operations for more than 5 seconds
<wallyworld_> frankban: so i think the order juju-core checks is: juju_env, juju switch file, env.yaml
<wallyworld_> frankban: so if you do that, you should be ok
<frankban> wallyworld_: so, in order: JUJU_ENV -> juju switch -> environments.yaml[default] -> error "please specify an env name".
<wallyworld_> frankban: i think so
<rogpeppe> wallyworld_: and that meant that the goroutine that usually sees the remote connection being dropped was not being scheduled in that time
<frankban> heh
<bigjools> rogpeppe: I'd have a higher level function on the transport rather than exposing protocol details on a request object
<rogpeppe> bigjools: the transport is actually lower level here, no?
<rogpeppe> bigjools: and most http clients don't see it
<bigjools> rogpeppe: not in that sense, I mean a function on the transport to say whether to do it or not.  manipulating headers is low-level
<fwereade> rogpeppe, hey, change of heart -- please *don't* use environment-uuid, just change the name to something maas-specific
<frankban> wallyworld_: yes my question is about the "juju switch" part: parsing the output seems fragile, and I was wondering if  ~/.juju/current-environment is considered an internal detail. anyway, implementing something like "juju switch --format json" could be a good idea
<fwereade> rogpeppe, I'm not convinced we have properly thought through the issues witrh setting it early
<fwereade> rogpeppe, and I don't want maas/juju collisions
<rogpeppe> fwereade: really?
<bigjools> fwereade: I chatted to wallyworld_ about this earlier and we concluded that its akin to a private bucket name
<fwereade> rogpeppe, really really
<rogpeppe> fwereade: ok - i've almost done it, BTW
<wallyworld_> frankban: i just checked. the checks are done in the order specified
<fwereade> rogpeppe, just call it maas-agent-name or something
<jamespage> fwereade, rogpeppe: what do I need to know about?
<fwereade> rogpeppe, sorry, but I just want to avoid adding more little threads connecting different bits
<fwereade> rogpeppe, that at least betrays its actual usage
<fwereade> rogpeppe, then, as a followup, when we have done early-set-uuid properly
<fwereade> rogpeppe, we can make maas-agent-name derive therefrom if unset
<wallyworld_> frankban: the current-environment file just has a single line with the env name. ideally i agree about the output bit you suggest. but i can't see it changing
<rogpeppe> fwereade: actually, we shouldn't allow maas-agent-name to be set explicitly
<fwereade> rogpeppe, it will already be set explicitly in environments we upgrade, ok?
<frankban> wallyworld_: ok, so it's ok to use the current-environment file, correct?
<rogpeppe> fwereade: i don't *think* so
<rogpeppe> fwereade: i think the only time it could be explicitly set is if someone specifies it in their environments.yaml
<fwereade> rogpeppe, it *will* because we will set it *now* and it will need to persist in the env
<wallyworld_> frankban: well, given there's nothing else, then yes. but i think we need a command to print the current env name to hide the details
<fwereade> rogpeppe, when we upgrade those subsequently we will need to deal with it
<fwereade> rogpeppe, I don't much care how we react to its presence in Prepare, cutting it off there doesn't seem crazy
<wallyworld_> frankban: we can whip something up next week at the sprint perhaps
<axw> wallyworld_, frankban "juju switch" shows you the current env
<rogpeppe> fwereade: i think that's fine - maas will just always use maas-agent-name
<rogpeppe> fwereade: but at Prepare time, it can derive maas-agent-name from environ-uuid
<wallyworld_> frankban: oh, we already do it it seems
<wallyworld_> that i didn't realise, sorry for the noise
<frankban> axw, wallyworld_: yes "juju switch" is already there, but it seems to me fragile to parse the output, that's why I was suggesting something like "juju switch --format json"
<axw> frankban: ah sorry, I missed that
<frankban> axw: the current output is: Current environment: "ec2"
<fwereade> rogpeppe, agreed I think
<wallyworld_> frankban: almost json :-)
<rogpeppe> fwereade: it does mean that we'll need the maas-agent-name attribute indefinitely in the future, which is what i was hoping to avoid
<axw> frankban: agreed, I think we should have a machine readable output mode
<wallyworld_> remove the space, add {}
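Pending a machine-readable output flag, a script has to parse that human-oriented line, which is exactly why frankban calls the approach fragile. A minimal Go sketch, assuming the `Current environment: "ec2"` format quoted above (trim trailing whitespace from the command output before passing it in):

```go
package main

import (
	"fmt"
	"regexp"
)

// switchOutput matches the human-oriented output of "juju switch",
// e.g. `Current environment: "ec2"`.
var switchOutput = regexp.MustCompile(`^Current environment: "(.+)"$`)

// parseSwitch extracts the environment name, or "" if the output
// doesn't have the expected shape (e.g. the format changed).
func parseSwitch(out string) string {
	if m := switchOutput.FindStringSubmatch(out); m != nil {
		return m[1]
	}
	return ""
}

func main() {
	fmt.Println(parseSwitch(`Current environment: "ec2"`))
}
```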
<fwereade> rogpeppe, just keep environment-uuid out of there for now, I fear the tentacles' reach
<wallyworld_> axw: i agree too
<rogpeppe> fwereade: but i see your point about having two things called "environment-uuid" being confusing too
<frankban> wallyworld_, axw: do you want me to file a bug about "juju switch --format machine-readable"?
 * fwereade has to go right now, thanks guys
<wallyworld_> yes
<rogpeppe> fwereade: safe journeys
<axw> later fwereade, see you next week
<fwereade> cheers
<axw> frankban: yes please
 * fwereade has to shut down an env before he flies actually
<rogpeppe> axw: do you know much about the http protocol?
<axw> rogpeppe: I know enough to be dangerous, but maybe not intimately enough... why?
<rogpeppe> axw: just wondering how most http clients deal with the inherent race involved in reusing connections when the client might drop the connection at any moment
<TheMue> morning
<frankban> axw, wallyworld_: it seems there is one already: bug 1193244
<_mup_> Bug #1193244: juju env could be friendlier to scripts <improvement> <juju-core:In Progress by themue> <https://launchpad.net/bugs/1193244>
<axw> cool
<axw> good morning TheMue
<wallyworld_> frankban: we'll fix that for sure
<TheMue> axw: oh, morning, came in at the right moment?
<axw> heh :)
<frankban> wallyworld_: great, thanks
<wallyworld_> frankban: well, TheMue will :-)
<frankban> :-)
<TheMue> wallyworld_: yeah
<axw> rogpeppe: sorry, I know the protocol well enough, but not how most clients work in that regard
<TheMue> frankban: seen my last comment there? regarding the way to control the output?
<frankban> TheMue: --raw sounds good
<rogpeppe> axw: we came across an issue that triggered that race reliably
<wallyworld_> rogpeppe: isn't it the server that drops the connection rather than the client?
<rogpeppe> axw: so we'd see the connection drop at *almost exactly* the same moment we make a request
<rogpeppe> wallyworld_: yes, but the Go client wasn't *seeing* the connection being dropped because it was busy doing other things
<wallyworld_> rogpeppe: sure, but i was referring to your comment above that the client dropped the connection
<wallyworld_> just clarifying
<rogpeppe> wallyworld_: ha, yes
<rogpeppe> wallyworld_: i meant server there
<wallyworld_> :-)
<TheMue> frankban: fine, then I will do it that way
<frankban> TheMue: thanks!
<axw> rogpeppe: tbh, this seems like something the Go stdlib should handle.
<axw> it's suggested that you reuse clients, and that it's safe to do so
<rogpeppe> axw: agreed. but i'm not quite sure how it can do so reliably
<wallyworld_> axw: that's what me and bigjools think too
<rogpeppe> axw: istm that this is an inherent race in the http protocol
<rogpeppe> axw: and i'm not sure how it can be dealt with other than just trying to reduce the window for the race
<axw> hmmm
<dimitern> morning all
<rogpeppe> axw: (the window could certainly be smaller in the Go http client)
<rogpeppe> dimitern: yo!
<axw> morning dimitern
<rogpeppe> axw: actually, on reading of the rfc 2616, it looks like the Go http client is just wrong here
<rogpeppe> "Client software SHOULD reopen the transport connection and retransmit the aborted sequence of requests without user interaction so long as the request sequence is idempotent (see section 9.1.2)."
<axw> yeah, but "so long as the request sequence is idempotent"
<axw> I was thinking that, but how do you guarantee idempotency?
<axw> that's an application level thing
<rogpeppe> axw: yeah, but it defines GET, HEAD, PUT and DELETE as being idempotent
<rogpeppe> axw: i guess that means you still have a potential problem for POST
 * axw wonders how many applications are not idempotent for those methods ;)
<rogpeppe> axw: http://tools.ietf.org/html/rfc2616#section-9.1.2
<rogpeppe> axw: i wonder that too
<axw> rogpeppe: anyway, "wrong" is maybe too harsh for not implementing an RFC "SHOULD"
<axw> but it would certainly be useful for an option at least
<rogpeppe> axw: yeah
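The opt-in retry policy being discussed could be sketched like this in modern Go. `doWithRetry` is a hypothetical wrapper, not part of net/http, and it only retries the methods RFC 2616 section 9.1.2 declares idempotent, so POST is never resent automatically:

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
)

// idempotent reports whether RFC 2616 section 9.1.2 defines the method
// as idempotent: GET, HEAD, PUT, DELETE, plus OPTIONS and TRACE.
func idempotent(method string) bool {
	switch method {
	case "GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE":
		return true
	}
	return false
}

// doWithRetry retries a request once if the server dropped a reused
// connection (seen as EOF), but only when the RFC says a retry is safe.
// It assumes the request body is nil or rewindable, as with a plain GET.
func doWithRetry(c *http.Client, req *http.Request) (*http.Response, error) {
	resp, err := c.Do(req)
	if err != nil && errors.Is(err, io.EOF) && idempotent(req.Method) {
		resp, err = c.Do(req)
	}
	return resp, err
}

func main() {
	fmt.Println(idempotent("GET"), idempotent("POST"))
}
```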
<TheMue> dimitern: morning, how has PyCon been?
<dimitern> TheMue, ah, it was interesting and not too long :)
<TheMue> dimitern: I somehow left Python a few years ago, so would almost have to relearn it. ;)
<rogpeppe> axw, wallyworld_: there's an outstanding Go issue for this actually: https://code.google.com/p/go/issues/detail?id=4677
<axw> rogpeppe: heh, nice, same conclusion :)
<jamespage> fwereade, rogpeppe: what do I need to know about? (hint: release of saucy is tomorrow - if I need to upload it needs to be today otherwise I'm in SRU territory)
<wallyworld_> rogpeppe: fat lot of good that is - it's not going to be fixed
<rogpeppe> jamespage: i need to make a small naming fix to the maas provider
<rogpeppe> wallyworld_: it's not going to be fixed for 1.2, yeah
<wallyworld_> so go's http is broken and there's nothing we can do about it. oh joy
<wallyworld_> wtf
<rogpeppe> wallyworld_: we can set the Close field
<wallyworld_> and remove connection reuse?
<rogpeppe> wallyworld_: or change the transport to allow no idle connections
<wallyworld_> why oh why does Go get so much so wrong
<rogpeppe> wallyworld_: yeah
<wallyworld_> version control, proper http stack etc etc
<wallyworld_> it's not like comp sci is a new field of science
 * wallyworld_ is very frustrated
<axw> I like this comment from the Chromium dev: "* If you pipeline requests and get a transport error, we pray that HEADs and GETs are actually idempotent and retry."
<axw> wallyworld_: I'd actually prefer that this not be enabled by default, because it's generally not safe to assume GETs are idempotent
<bigjools> don't fret wallyworld_, you get the pleasure of my company on Saturday for 15 hours
<axw> opt-in would be good
<wallyworld_> oh joy
<wallyworld_> axw: sure, but we don't know if it's needed or not at any time
<bigjools> GET is supposed to be idempotent, are there really crackful websites that are not?
<wallyworld_> +1 to that
<axw> bigjools: supposed to be, sure, but people do all sorts of wrong things in their web applications
<bigjools> then that's their tough titty
<wallyworld_> axw: that's no excuse for not assuming the spec holds
<axw> not to mention HTTP servers and proxies that don't follow protocols
<bigjools> tools should not behave as stupidly
<wallyworld_> no excuse
<wallyworld_> the wrong implementations should get fixed
<wallyworld_> not the clients
<bigjools> just look at the havoc IE6 created
<wallyworld_> or else the problem will never be fixed
<wallyworld_> ie6 is a great example
<wallyworld_> axw: rogpeppe: so you'd think a robust http lib would be the cornerstone requirement of any new language. and yet they say not fixed for 1.2????? what else could be more important
<wallyworld_> TLS is another example
<wallyworld_> they refused to accept the need for it - so we are forced to fork
<rogpeppe> wallyworld_: i trust Adam Langley when he says that tls renegotiation is badly broken
<wallyworld_> well it works for us
<wallyworld_> or if broken, just fix it already
<wallyworld_> and give us a feature complete http stack
<wallyworld_> not like it's important or anything
<rogpeppe> wallyworld_: there's always a tension when trying to write clean software that's dealing with crappy standards
<wallyworld_> sure, but it's not like Python, C++ etc etc etc etc weren't around to learn from
<wallyworld_> solved problems
<wallyworld_> meanwhile Juju breaks and we deal with the fallout
<wallyworld_> customers don't care that Go is deficient, they blame Juju and us
<wallyworld_> and we stuff Juju full of all of these hacks and workarounds to paper over Go's cracks
<jamespage> rogpeppe, bug reference?
<jamespage> I'd like to raise ubuntu tasks now so they get noticed in the right places
<rogpeppe> jamespage: i'll just file a bug for it
<rogpeppe> jamespage: https://bugs.launchpad.net/juju-core/+bug/1240423
<_mup_> Bug #1240423: provider/maas: environment-uuid is the wrong name to use for the machine disambiguation tag <juju-core:New> <https://launchpad.net/bugs/1240423>
<jamespage> rogpeppe, is that linked to bug 1229275
<_mup_> Bug #1229275: [maas] juju destroy-environment also destroys nodes that are not controlled by juju <maas> <theme-oil> <juju:Triaged> <juju-core:In Progress by thumper> <maas (Ubuntu):Triaged> <https://launchpad.net/bugs/1229275>
<rogpeppe> jamespage: yes
<jamespage> rogpeppe, so I'm right in saying the juju-core 1.16.0 with MAAS in Saucy will exhibit this problem?
<rogpeppe> jamespage: the above bug? yes, i believe so, unless https://code.launchpad.net/~allenap/juju-core/maas-environment-uuid/+merge/191146 has landed in 1.16 yet
<jamespage> rogpeppe, so I'm going to need that as a fix for 1.16.0 as well
<rogpeppe> jamespage: yes
<jamespage> :-)
<rogpeppe> jamespage: i hope to provide one in the next hour or so
<jamespage> rogpeppe, lovely - thanks!
<rogpeppe> jamespage: i've changed the code - i just need to test in various places
<rogpeppe> allenap: i'd appreciate a review of this please: https://codereview.appspot.com/14741045/
<dimitern> rogpeppe, mgz, TheMue, others - standup
<rogpeppe> dimitern, mgz, TheMue: ^ (this urgently needs to go in BTW)
<TheMue> oh
<rogpeppe> bigjools: ^
<natefinch> rogpeppe: the standup link seems not to be working?
<rogpeppe> natefinch: it works for me
<rogpeppe> natefinch: https://plus.google.com/hangouts/_/calendar/am9obi5tZWluZWxAY2Fub25pY2FsLmNvbQ.mf0d8r5pfb44m16v9b2n5i29ig?authuser=1
 * TheMue => lunch
<rogpeppe> mgz, natefinch, dimitern, TheMue: PLEASE could someone review this? https://codereview.appspot.com/14741045/
<natefinch> rogpeppe: it's already open in my browser :)
<rogpeppe> natefinch: thanks
<mgz> rogpeppe: whoops, didn't hit publish
<mgz> rogpeppe: HIT IT.
<mgz> whoops caps, but kinda funny.
<natefinch> rofl
<rogpeppe> mgz: i thought that was deliberate
<natefinch> rogpeppe: me too
<rogpeppe> anyone around here know about the new maas agent_name semantics?
<rogpeppe> mgz: i've just realised that the branch might be wrong
<mgz> hm, in what way?
<natefinch> mgz: I couldn't help but hear it like this: http://www.youtube.com/watch?v=erbL_BxITHw
<rogpeppe> mgz: because it still filters by agent_name even when there's no maas-instance-uuid
<mgz> I didn't closely review the first branch, so may be missing some subtleties
<rogpeppe> mgz: i don't know what the agent_name filtering semantics are though
<rogpeppe> allenap: ping
<mgz> the filtering is all our code, no?
<mgz> yeah, the filter is just our standard stuff
<rogpeppe> mgz: um, it looks like the filter is passed directly to the maas API GET request
<mgz> which does also raise the back-compat question...
<mgz> rogpeppe: hm, I see what you mean
<rogpeppe> mgz: so i don't know what will happen if you pass an agent_name="" filter
<rogpeppe> mgz: it *might* match all instances, or it might not.
<rvba> rogpeppe: it will match all instances
<mgz> well, that's not something you've changed in this branch
<rogpeppe> mgz: no it isn't
<rogpeppe> rvba: ok, that's good
<rvba> It is good indeed :).
<rogpeppe> rvba: and there's no problem having a blank agent name in the acquire params?
<rvba> rogpeppe: that's fine too (I was sure of it but I still tested it this morning.)
<rvba> I mean, what I tested was a version of juju using agent_name with a version of MAAS which doesn't know about it.
<rvba> not exactly what you asked
<rogpeppe> rvba: yeah, true
<rvba> But using a blank agent name is the same as not providing one.
<rogpeppe> rvba: it would be nice to make sure. or, alternatively, a less risky strategy is just to lose the agent_name params if the agent name is blank
<rogpeppe> rvba: so you can't filter instances that have blank agent names?
<rvba> Like I said, as far as MAAS is concerned, this is the same.
<rvba> No.
<rvba> err
<rvba> Yes you can do that.  But you have to know that a blank agent name is the default.
<rogpeppe> rvba: ah, so using a blank agent name isn't *quite* the same as not providing one, then?
<rvba> It is exactly the same.
<rogpeppe> rvba: so how would you distinguish between a) asking for all instances regardless of agent_name and b) asking for any instances which happen to have a blank agent_name ?
<rvba> rogpeppe: no, I was wrong, when listing instances, it's not the same.
<rvba> Sorry for the confusion.
<rogpeppe> rvba: ok, cool. i actually think that's better for our purposes
<rogpeppe> rvba: as it means that existing maas juju deployments won't see new environments
<rogpeppe> rvba: as long as they've been upgraded
<rvba> Right.
<rvba> But you can have only one of these "old" environments.
<TheMue> frankban: ping
<rogpeppe> rvba: yeah - that's a current restriction though
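The "less risky strategy" rogpeppe floated above, dropping the `agent_name` parameter entirely when it is blank, is a small guard on the query values. `acquireParams` is an illustrative helper, not the MAAS provider's real code:

```go
package main

import (
	"fmt"
	"net/url"
)

// acquireParams builds the MAAS acquire/list query, adding agent_name
// only when it is non-empty, so an older MAAS server (or an upgraded
// environment with no agent name yet) sees exactly the request shape
// it always saw.
func acquireParams(agentName string) url.Values {
	params := url.Values{}
	if agentName != "" {
		params.Add("agent_name", agentName)
	}
	return params
}

func main() {
	fmt.Println(acquireParams("").Encode())         // ""
	fmt.Println(acquireParams("juju-123").Encode()) // "agent_name=juju-123"
}
```

This sidesteps the question of what a blank `agent_name=` filter matches server-side, which is the ambiguity rvba had to test for.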
<frankban> TheMue: pong
<rogpeppe> rvba: i'd appreciate it a lot if you could take a glance through this before i merge: https://codereview.appspot.com/14741045
<TheMue> frankban: just seen in the channel log that you talked about a json output of env/switch
<rogpeppe> mgz: i've renamed maas-instance-uuid to maas-agent-name throughout as it seemed to make more sense once i read through a bit more of the code
<TheMue> frankban: I understood raw as being simply the env name(s)
<TheMue> frankban: in case of json I would prefer a different flag than --raw
<TheMue> frankban: what do you say?
<frankban> TheMue: since "juju switch" will always return just a single string, --raw seems reasonable. I was thinking about --format just for symmetry with "juju api-endpoints", but I don't think that's required
<frankban> TheMue: what is --raw supposed to return if no default env is set? just an empty string?
<TheMue> frankban: it is <not specified>, at least today
<rvba> rogpeppe: looking now
<rogpeppe> rvba: thanks
<frankban> TheMue: hum... perhaps an empty string is better? I guess spaces and <> are not allowed in env names, so that could be ok, but a user still has to parse the return value, or just compare with "<not specified>".
<TheMue> frankban: I prefer the "<not specified>" as it is more explicit than just an empty string
<frankban> TheMue: the retcode is in both cases, right?
<frankban> TheMue: is 0 I mean
<TheMue> frankban: yep
<frankban> TheMue: if you are ok with users comparing against "<not specified>" then it's all good. But then we must ensure that string will not change in the future. On the other hand, if --raw is the machine friendly command, I am not sure it has to be explicit for humans, but as said, I am +1 on your plan
<natefinch> frankban: when would a script ever need to check the output of juju switch? Couldn't it just always do juju switch <foo> or -e <foo> each time?   Not saying no one will ever do it, it just seems like an unnecessary step
<rogpeppe> hmm, bzr's "merge specific revisions" logic seems not to work very well
<rogpeppe> natefinch: i agree
<abentley> rogpeppe: how do you mean?
<rogpeppe> abentley: i just had some conflicts with some very weird diffs in
<abentley> rogpeppe: Well, if you think it's bzr's fault, give me steps to reproduce, and I can have a look.  I did write most of that code.
<frankban> natefinch: assume you have a script that needs to bootstrap an environment. that script can either 1) force the user to pass a "-e" parameter or 2) make that parameter optional. In the second case, we still want to ensure an environment is ready to be bootstrapped, and we might also want to grab the env name. So we check JUJU_ENV, then we ask to juju switch, and finally we look inside environments.yaml[default]
<rogpeppe> abentley: here's an example: http://paste.ubuntu.com/6245564/
<abentley> natefinch: A script would need to check the output of "juju switch" in order to determine what the current environment is.  For example, I have a script that runs "nova" using the credentials of the current juju environment.
<rogpeppe> abentley: note the line sitting in the middle of nowhere - it actually came from a test function in the merge-source that was almost entirely lost
<rogpeppe> abentley: ah, sorry, you're talking about juju switch, not bzr :-)
<natefinch> abentley: the script can tell if *an* environment is chosen, but not necessarily the right one.  Better to just always switch to the right one first
<rogpeppe> perhaps what we need is a command that prints the current env name
<natefinch> brb screaming baby
<TheMue> frankban: in case of a set JUJU_ENV the command juju env returns it
<rogpeppe> so scripts aren't trying to second-guess JUJU_ENV, juju switch, environments.yaml juju logic
<abentley> natefinch: The right one is the current one.  The best way to determine that is to ask juju what the current one is.
<mattyw> rogpeppe, is there a canonical example of writing a go client to connect to the api somewhere in the juju source?
<rogpeppe> mattyw: i don't think so.
<abentley> natefinch: One way to do that is to run "juju switch" with no env specified.
<frankban> TheMue: yes, and also if current-environment is missing, juju switch seems to return the default in envs.yaml.
<rogpeppe> mattyw: there's a command in launchpad.net/juju-utils that does, but it's quite possible it doesn't compile currently
<abentley> rogpeppe: But I bet if you look at THIS and OTHER, that line was preserved in both.
 * TheMue will come back in a few moments
<rogpeppe> abentley: OTHER and THIS were both as i expected
<rogpeppe> abentley: but usually i expect the merge to contain all the lines from both
<rogpeppe> abentley: in this case almost an entire function had gone missing, leaving me wondering what else might have gone
<abentley> rogpeppe: That's not what merge does.  It attempts to apply changes from both, which is both insertion and deletion.
<rogpeppe> abentley: i don't believe the source branch deleted anything in that place
<rogpeppe> abentley: note that this isn't a straight branch merge i'm talking about here
<rogpeppe> abentley: i did: merge -r1984..1985 trunk
<abentley> rogpeppe: Okay, and trunk here is juju-core?
<rogpeppe> abentley: to try to bring some trunk changes into 1.16
<rogpeppe> abentley: yeah
<rogpeppe> abentley: you could duplicate the problem yourself easily, if you're interested
<abentley> rogpeppe: And you're merging into which branch?
<rogpeppe> abentley: into lp:juju-core/1.16
<natefinch> abentley: (sorry, had to step out for a second) I'm confused as to why the script needs to know the current environment name, if it just assumes whatever is current is the right one.
<abentley> natefinch: Because it needs to parse environments.yaml and determine what the correct values are for OS_REGION_NAME, OS_USERNAME, OS_PASSWORD, OS_TENANT_NAME and OS_AUTH_URL
<abentley> natefinch: That varies depending on whether the current env uses my personal credentials or my team credentials.
<natefinch> abentley: why is the script parsing environments.yaml?  That's for configuring juju... not configuring the script
<natefinch> abentley: oh wait, you mean, it's going to set the environment variables based on what environment is set in juju?
<abentley> natefinch: Right.
<natefinch> abentley: that seems like... a bad idea.  If you're going to store the credentials in the script, why not store them in environments.yaml?  Or do we not support those particular values in environments.yaml?
<abentley> I'm not storing the credentials in the script, I'm getting  them from environments.yaml.
<natefinch> abentley: I'm confused again.  If stuff is stored in environments.yaml already, why do you need to set them as environment variables?  Won't juju pick them up from environments.yaml itself?
<rogpeppe> rvba, natefinch, TheMue, wallyworld_: this CL merges the maas changes into 1.16 https://codereview.appspot.com/14746044
<abentley> natefinch: The script runs "nova", not "juju".
<natefinch> abentley: ahh, that's what I was misunderstanding.
<rogpeppe> rvba, natefinch, TheMue, wallyworld_: i've done it as a single branch to save an hour or so of committing overhead; i hope that's ok
<natefinch> abentley: I see you already said that, but I guess I missed it.
<rogpeppe> natefinch: do you have access to a maas environment that you could run up a quick live test of that branch on, by any chance?
<natefinch> rogpeppe: I wish I did... the virtual maas environment I had set up got nuked somehow
<rogpeppe> natefinch: hmm
<rogpeppe> anyone got a maas environment available?
<abentley> natefinch: I have another script that uses the current juju environment (by default) to run sshuttle.  Again, it uses "switch" to determine the current environment.
<rogpeppe> abentley: could you clarify for me why the script needs to know the name of the current environment?
<rogpeppe> abentley: (i'm not saying you haven't got a good use case, but i'm interested in what it is)
<rogpeppe> if i wasn't clear, i would really like a review of this CL please - it *needs* to land today. https://codereview.appspot.com/14746044
<rogpeppe> rvba, axw, natefinch, TheMue, wallyworld_: ^
<rvba> rogpeppe: ah ok, I didn't get it had to be reviewed again.
<rogpeppe> rvba: well, i think it probably should be, as this is the actual change that's landing on 1.16, and i've had to do some manual merge conflict resolution
<rvba> okay
<rogpeppe> rvba: thanks
<TheMue> rogpeppe: just starting review
<TheMue> rogpeppe: reviewd
<rogpeppe> TheMue: thanks
<TheMue> eh, reviewed
<rogpeppe> TheMue: those are changes that should be made in trunk - i'm not making them independently in this review
 * rogpeppe grabs a bit to eat
<rogpeppe> bite
<TheMue> rogpeppe: ok
<abentley> rogpeppe: My use case for the first script is to use nova to manipulate the instances of my current juju environment, especially "nova status" and "nova add-floating-ip".
<abentley> rogpeppe: The use case for the second script is to be able to access the current environment using an SSH tunnel, since the bootstrap node has a private IP.  I use the current environment name to determine which "juju-*-machine-0" to ssh into for my tunnel.
<rogpeppe> abentley: so this is useful only because the openstack provider uses the environment name to name some of its resources, right?
<rogpeppe> abentley: s/this/switch printing the current environment name/
<abentley> rogpeppe: Yes, the second script is useful because the instance name can be predicted from the environment name.  The first script is useful regardless.
<rogpeppe> abentley: we'll probably change this behaviour in the future, BTW - the environment name doesn't make for a very good unique key.
<abentley> rogpeppe: In the context of a given nova account, I think there's some sense in requiring unique environment names.  Things can be very confusing otherwise.
<rogpeppe> abentley: we are moving towards the idea of using the environ UUID
<rogpeppe> abentley: which will be generated at bootstrap time
<rogpeppe> abentley: so if you've destroyed an environment but for some reason some instances are still around, there's then no possibility of crosstalk with a newly bootstrapped environment, even if it happens to have the same name
<rogpeppe> abentley: is that the kind of thing you think might be confusing?
<abentley> rogpeppe: That seems okay, as long as it's not ephemeral.  Writing essential info to *.jenv alone makes sharing accounts painful.
<rogpeppe> abentley: we already write essential info to .jenv alone
<abentley> rogpeppe: I know,  and it's evil.
<rogpeppe> abentley: to share accounts, you'll need to share the .jenv file
<rogpeppe> abentley: (but you won't need to share anything else at all)
<rogpeppe> abentley: i don't see any other way that we can have local state
<abentley> rogpeppe: And then someone tears down the environment and re-bootstraps, and everyone's broken.
<abentley> rogpeppe: Write it to environments.yaml instead of .jenv.
<abentley> rogpeppe: Then everyone with the same config can access the same environment.
<rogpeppe> abentley: we can't without losing comments etc
<abentley> rogpeppe: Losing comments is better than breaking other team members.
<abentley> rogpeppe: And perhaps there is a yaml parser out there that doesn't lose comments.  There certainly are for ini files.
<rogpeppe> abentley: most YAML parser out there can barely parse YAML :-)
<rogpeppe> parsers
<rogpeppe> abentley: it's not just comments - there are many ways to format YAML
<rogpeppe> abentley: i'm afraid the only way to do it is to share the .jenv files (possibly in a shared filesystem)
<abentley> rogpeppe: No, that's not acceptable as long as .jenv files are deleted by destroy-environment.
<rogpeppe> abentley: if you destroy an environment and then bootstrap it again, it's actually not the same environment any more
<abentley> rogpeppe: The same people need to be able to access it, though.
<abentley> rogpeppe: Why can't you store all the .jenv-unique data in swift/s3?
<rogpeppe> abentley: because some of that data is used to work out which swift/s3 bucket to talk to
<rogpeppe> abentley: for instance, control-bucket is now automatically generated
<abentley> rogpeppe: but if I specify control-bucket in environments.yaml, it's respected?
<rogpeppe> abentley: yes, currently
<rogpeppe> abentley: i agree that this is a very significant change in behaviour BTW, and i understand your point of view
<abentley> rogpeppe: It is very discouraging to see this.  It seems like juju is going to get worse and worse from the perspective of teams.
<rogpeppe> abentley: i think the way of the future is to have a service that stores .jenv files
<rogpeppe> abentley: (the code is actually written with that in mind)
<abentley> rogpeppe: I don't think we need a new service.  Just a swift/s3 bucket that doesn't change.  One per provider would be fine.
<rogpeppe> abentley: s3 isn't great for shared write access
<rogpeppe> abentley: i.e. you can't change anything atomically AFAIK
<abentley> rogpeppe: Okay, but I don't want to live with a worse-and-worse juju for a long time because the jenv sharing service is more complex than necessary.
<abentley> and therefore takes more time to implement.
<rogpeppe> abentley: essentially all it needs to do is implement the Storage interface defined in environs/configstore/interface.go
<rogpeppe> abentley: there's some thinking to be done as to how to implement the encryption
<rogpeppe> abentley: (i.e. do you let the server see the plaintext contents of the .jenv files?)
<rogpeppe> abentley: but i don't anticipate it being more than a week's worth of work for the server and maybe another week to integrate it into juju-core.
<abentley> rogpeppe: Is there a plan to split environments.yaml into multiple files, or is that a misunderstanding of *.jenv files?
<rogpeppe> abentley: there's a plan to lose environments.yaml entirely
<abentley> rogpeppe: What replaces it?
<rogpeppe> abentley: another config file based around somewhat different abstractions
<rogpeppe> abentley: fwereade has a better idea than me - he thrashed out some of the details with bdfl
<abentley> rogpeppe: For teams, it would be nice if each environment had its own config, because some environments are team environments and some are personal environments.  It's easier to maintain the team environments if they're not intermixed with personal ones.
<abentley> s/config/config file/
<rogpeppe> abentley: personally, i'd be happy if the environments.yaml file became simply a URL to the config-file storage server and probably a key for that too
<rogpeppe> abentley: then we'll be able to operate more coherently in a team environment
<abentley> rogpeppe: I'd be unhappy, because then it wouldn't have the openstack credentials, and my first script wouldn't work.
<rogpeppe> abentley: your script could use the server to fetch the openstack credentials
<abentley> rogpeppe: Don't I need credentials in order to use the server?
<rogpeppe> abentley: not provider-specific credentials
<abentley> rogpeppe: Doesn't that make it harder for private clouds?  I would think you'd want it to be a per-provider service at least.
<rogpeppe> abentley: (if we do it right, a single config storage server could easily serve the needs of thousands of clients - the demands aren't high)
<rogpeppe> abentley: for private clouds, you would indeed need to run a server somewhere.
<rogpeppe> abentley: or have a file-based mechanism that you could use too
<rogpeppe> abentley: i'm thinking off the top of my head here BTW - none of this stuff has been discussed yet in the team
<w7z> yeay, finally internet back
<sinzui> rogpeppe, Thank you for fixing the gnuflag branch. I will remove the workarounds from the scripts
<abentley> rogpeppe: About the merge, the source deletes 4 lines from environprovider_test.go: http://pastebin.ubuntu.com/6246103/
<abentley> rogpeppe: If you look at the diff of provider/maas/environprovider_test.go (with conflicts and everything), you'll see that no function was deleted.  What happened is that trunk altered a function that doesn't exist in 1.16.
<abentley> rogpeppe: That's why there's a conflict with just one line showing, because that's the only line of the non-existent function that trunk wanted to change.
<rogpeppe> abentley: hmm, which function was altered that doesn't exist in 1.16?
<abentley> rogpeppe: TestUnknownAttrsContainEnvironmentUUID
<rogpeppe> abentley: hmm, interesting. i think i must have got the merge command wrong. i thought that "bzr merge -c 1234" was equivalent to "bzr merge -r 1234", but i'm guessing that it's actually equivalent to "bzr merge -r 1233..1234"
 * rogpeppe is now worried that more stuff has been lost
<abentley> rogpeppe: The latter is correct.
<abentley> rogpeppe: You may have actually wanted bzr merge -r 1984..1985?
<abentley> rogpeppe: I mean you may have actually wanted bzr merge -r 1983..1985?
<rogpeppe> abentley: yes
<abentley> rogpeppe: It may help to know that the -r parameters affect diff the same way as merge.
<rogpeppe> abentley: yeah, i'd thought they were slightly different
<abentley> rogpeppe: So if you do diff -r 1983..1985, you'll see all of the changes that merge will attempt to integrate.
<rogpeppe> abentley: ok, it seems we're lucky in this case
<rogpeppe> abentley: the only changes that were made in 1984 were overwritten by changes made in a later branch that i also merged
 * rogpeppe slaps own wrist and considers himself once-bitten
<rogpeppe> abentley: thanks for solving the mystery for me
<abentley> rogpeppe: You're welcome.
<rogpeppe> if anyone's been following it, i now have a tiny demo program of the net/http problem that has been causing us problems: http://paste.ubuntu.com/6246231/
<rogpeppe> i'm thinking this might be the cause of bugs like this: https://bugs.launchpad.net/juju-core/+bug/1228255
<_mup_> Bug #1228255: Live bootstrap tests fail on canonistack <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1228255>
<rogpeppe> see my comment on https://code.google.com/p/go/issues/detail?id=4677
<rogpeppe> rebooting
<natefinch> rogpeppe: interesting about the http thing, though I think your bug report has a typo: "It succeeds when run with GOMAXPROCS>0."
<natefinch> rogpeppe: pretty sure GOMAXPROCS is always >0   ;)
<TheMue> natefinch: as long as your system doesn't block
<rogpeppe> natefinch: oops
<rogpeppe> natefinch: correct
<rogpeppe> natefinch: corrected
<TheMue> so, have a good night guys
<TheMue> cu tomorrow
<natefinch> TheMue: g'night
<rogpeppe> i'm also off now
<rogpeppe> g'night all
<natefinch> g'night!
<thumper> abentley: how's tricks?
<abentley> thumper: I'm doing fine.  How are you?
<thumper> abentley: pretty good
<thumper> abentley: I was wondering about your bzr plugin for lbox
<thumper> abentley: did you have one?
<thumper> well, Rietveld
<thumper> I guess
<abentley> thumper: Yes, since Rietveld was updating Launchpad and juju-gui has tarmac, it really just sets the commit message and marks the MP approved.
<gary_poster> thumper, abentley http://jujugui.wordpress.com/2013/05/24/thanks-to-diogo-matsubara-well-be-migrating-to/
<smoser> https://bugs.launchpad.net/juju-core/+bug/1240667
<thumper> abentley: ah, but not for submitting?
<_mup_> Bug #1240667: Version of django in cloud-tools conflicts with horizon:grizzly <ubuntu-cloud-archive:Confirmed> <juju-core:Confirmed> <https://launchpad.net/bugs/1240667>
<thumper> well, proposing
<thumper> smoser: is that for maas?
<smoser> its really only solvable in juju i think
<abentley> thumper: No, I didn't do one for proposing.  Is lbox propose giving you grief?
<smoser> not specific to maas
<thumper> abentley: regularly
<thumper> smoser: but juju doesn't use django
<gary_poster> thumper, API server is falling over (it closes the socket).  that usually means there's some juju error to look at.  could you direct me towards the probably pertinent logs?
<gary_poster> oh wait
<gary_poster> machine 0
<abentley> thumper: Any chance of switching to MPs?
<thumper> gary_poster: probably the api logs on machine-0
<thumper> abentley: only when we get inline commenting
<thumper> abentley: I've raised this before
<smoser> django is in the cloud archive for maas, yes. but juju added the cloud archive to the node.
<smoser> which is where the problem came from.
<thumper> lbox diff generation isn't as good as LPs
<thumper> smoser: ah
<thumper> smoser: that's annoying
<abentley> thumper: Wish I had a quick solution for you, but I guess you'll have to roll your own if you want to propose on Rietveld.
<thumper> I thought you had one that dealt with pipelines
<thumper> I often forget the -req bit for lbox
<smoser> thumper, yeah. its a problem there for django, but it could be a problem for anything really.
<smoser> and could expose itself with lxc. or even mongodb.
 * thumper nods
<thumper> smoser: any plan or ideas?
<smoser> none that i like.
<abentley> thumper: lp-propose handles pipelines.  I don't know of a Rietveld equivalent.  I can dig up the plugin that generates diffs if you like.
<thumper> abentley: nah
<thumper> abentley: probably not going to get time to look at it
<smoser> when does juju need to enable the cloud archive ?
<thumper> too much else on
<smoser> see my comments there, am I correct?
 * thumper looks at the bug
<smoser> if so, then the most reasonable solution is only to use cloud archive on bootstrap node or "un-containerized" node (that would then want to create containers with lxc)
<thumper> smoser: yeah, your comments are right
<thumper> I wish people didn't use the packaged django
<thumper> virtualenvs are the way to go with python IMO
<smoser> oh yeah, of course, installing random stuff from untrusted sources on the internet that can change is always the best way to build software.
 * smoser realizes that argument might not seem ridiculous here
<natefinch> smoser: haha
<gary_poster> thumper or any sympathetic soul, there is no api log that I see on machine 0.  I see /var/log/juju/machine-0.log.  In it, http://pastebin.ubuntu.com/6247538/ seems to show that the AllWatcher is falling over.  Any thoughts on where to look next?
<thumper> the machine-0.log has all the api stuff in it
<thumper> gary_poster: can you post the error?
<thumper> pastebin
<thumper> or something
 * thumper sighs
<thumper> I see it there
 * thumper clicks
<thumper> it is all organge
<thumper> orange
<hazmat> that looks like a client close of connection
<gary_poster> organge?:
<thumper> I'm used to seeing the hyperlinks as blue
<thumper> dumb client
<gary_poster> hazmat, nope.
<hazmat> gary_poster, do you have the corresponding browser trace?
<hazmat> websocket trace from the client that is
<gary_poster> hazmat, yes.  Connection Close Frame
<hazmat> gary_poster, do you have a way to reproduce?
<thumper> hmm...
<gary_poster> hazmat, yes, but it is expensive.  Deploy this and go to GUI.  https://raw.github.com/paulczar/charm-championship/master/monitoringstack.sh
<hazmat> ah.. that one.
<thumper> gary_poster: what is the gui asking for?
<thumper> when it is falling over?
<gary_poster> thumper, that log is as close to a record of that as we have.  in request 5, we ask the all watcher to give us the next output.  this is the first such request, I am pretty sure, so it will be the full system status.  We then ask for various other things, and get successful responses (such as line 3, which is our 16th request to juju in this connection) but then on line 5 (and 4?) of the pastebin, juju says that the
<gary_poster>  AllWatcher Next request has an error...because the "state watcher was stopped".  By whom? I'd love to know.  That appears to be the death knell for the whole connection.
<gary_poster> The correlation is by the "RequestId": line 5 is a reaction to line 1
<thumper> gary_poster: the whole "all watcher" is a part I've not yet delved into, and I gather a rather complicated beast
<gary_poster> thumper, ack.  this may be a big deal.  it might explain some other reports I've heard.
<thumper> gary_poster: probably needs lots of debugging added to it to find out what is going on
 * thumper sighs
<thumper> big deals always happen at the last moment, no?
<gary_poster> thumper, ack.  I'm checking with another source to see if they can dupe in another situation
<gary_poster> yeah :-/
<thumper> gary_poster: if you can get me something simpler to generate the problem with it'd be appreciated
<thumper> gary_poster: let me finish off the reviews I'm in the middle of
<thumper> and then I'll start poking
<gary_poster> thumper, heh, I'd love to.  I think I need some hint on cause before I can come up with a smaller case.
<thumper> gary_poster: does that monitoringstack deploy list always cause the problem?
<thumper> is it really reproducible?
<thumper> do I have to open the gui?
<thumper> or does it just happen?
<gary_poster> thumper, so far, yes, it is reproducible.  API connection falls over in about 4.35 seconds, from the perspective of the client.
<thumper> gary_poster: how far through the script does it get before it falls over?
<gary_poster> thumper, and yes, you go to the environment with the GUI and log in
<thumper> gary_poster: does the system appear stable prior?
<gary_poster> thumper, the script sets up an environment.  then you go to the gui, and just look at it, and the connection falls over
 * thumper nods
<thumper> ok
<gary_poster> thumper, other than the API connection, everything seems fine
<thumper> gary_poster: how long are you around for?
<gary_poster> thumper, and other than the AllWatcher Next, within that shining 4 seconds of connectivity, other replies in the connection seem to be behaving fine
<gary_poster> thumper, my EoD is in 6 minutes, and I should probably go not too soon after that.  last night was already a long one.
<thumper> gary_poster: ack
<thumper> gary_poster: do we have a bug yet?
<gary_poster> thumper, no.  I will file one before I leave, or, if I get confirmation from the reporter, add juju core to an existing gui bug.
<gary_poster> (add the dupe instructions I have so far)
<gary_poster> and add
<gary_poster> such as they are
<thumper> ok, ta
<sinzui> thumper, Can you take 10 minutes to advise/comment on this bug https://bugs.launchpad.net/juju-core/+bug/1232304
<_mup_> Bug #1232304: consider tuning git setup for juju-core, and document caveats <canonical-webops> <doc> <feature> <pes> <juju-core:Triaged> <https://launchpad.net/bugs/1232304>
<thumper> sinzui: I can try
<natefinch> sinzui, thumper:  any process that relies on storing binary blobs in git is flawed
<sinzui> natefinch, agreed
<sinzui> I really think there is a process problem in the bug. If we tune git, how much more can we scale before we hit the next problem?
<natefinch> sinzui: tuning git is not the solution. Not storing blobs in git is the solution.   Is it a matter of charm authors doing something wrong, or a problem of juju doing something wrong?  I don't know the charm upgrade code at all.
<natefinch> and by "wrong" I can also mean "we haven't given them a better way"
<thumper> natefinch: care to comment on that bug?
<thumper> natefinch: I know nothing about git
 * thumper will try to replicate gary_poster's bug on the local provider to save AWS bills
<gary_poster> +1
 * gary_poster needs to try and get local working on this machine.  lost > a day on it so been putting it off
<thumper> gary_poster: we could look next week :)
<gary_poster> thumper, it's my desktop.  laptop was fine last I checked
<natefinch> thumper: all I know is that when you store binary in git, and then submit a change that changes the binary.... it doesn't store the diff of the two binaries, it just stores the two binaries.  So, if you have a 200 meg zip file and you add one thing to it and re-commit, you now have two 200-meg zip files in the repo.
<thumper> gary_poster: ah
<natefinch> thumper: and whenever you get code from git, you get the WHOLE REPO.  There's no getting a specific branch to reduce the amount you have to download.
<thumper> WTF... my local provider won't bootstrap...
<thumper> does no-one test this shit?
 * thumper rages
<natefinch> thumper: anyway, I'll comment on the bug, but since I don't know the upgrade charm code, I'm not sure where the problem lies
<natefinch> thumper: I get ERROR juju supercommand.go:282 Get http://10.0.3.1:8040/provider-state: dial tcp 10.0.3.1:8040: connection refused
<thumper> natefinch: me too
 * thumper sighs heavily
 * thumper wonders if 1.16 works
<thumper> it used to
<thumper> no.
<thumper> that fails too
<thumper> WTF!!!!
 * thumper headdesks
<natefinch> it's pretty epically bad if local bootstrap doesn't work in 1.16
<thumper> true that
 * thumper files critical bug
<gary_poster> thumper, I did not triage https://bugs.launchpad.net/juju-core/+bug/1240708 as critical yet but I filed it
<_mup_> Bug #1240708: API server falls over repeatably during AllWatcher Next, killing GUI <juju-core:New> <https://launchpad.net/bugs/1240708>
<thumper> gary_poster: sorry, but have to fix the local provider first
<gary_poster> thumper, completely understood
<thumper> gary_poster: it is broken, AGAIN
<gary_poster> :-(
<thumper> yeah
<natefinch> thumper: I guess we don't have tests that actually try bootstrapping local?  Seems like, of all the providers, that one should be the most thoroughly tested.
<thumper> natefinch: it needs root
<thumper> so we can't do it in unit tests
<thumper> natefinch: we plan to have it work in the qa lab
<thumper> but that is still being set up
<thumper> at least I think I know how to get this working
<gary_poster> thumper I have to run.  I will check back later briefly
 * thumper puts on his debugging music nice and loud
<thumper> gary_poster: ack
<natefinch> thumper: making people type in a sudo password when running the tests seems like a worthwhile pain in order to get full testing on the local provider
<thumper> natefinch: perhaps...
 * thumper goes to fix the problem
<natefinch> thumper: good luck
<davecheney> or add themselves to wheel
<davecheney> or sudoers
<davecheney> or something to make it automagic
<natefinch> indeed
<davecheney> but is probably going to be a non starter for CI
 * natefinch is at EOD
<thumper> well that was easy
<natefinch> thumper: ha! awesome
<davecheney> what is the plan for getting all these post 1.16 fixes into Saucy
<thumper> davecheney: 1.16.1
<thumper> I have no other answer at this stage
<davecheney> roger
<thumper> davecheney: if you insist, we could get roger to do everything
<thumper> :P
 * thumper is frustrated with lbox and how it reuses merged merge proposals
<thumper> I never want that
<thumper> davecheney: you may have re-approved the old one, I'm resubmitting
<thumper> https://codereview.appspot.com/14573046
<davecheney> thumper: LGTM. Looks like the same
<thumper> davecheney: it is :)
<thumper> davecheney: it was me leaving a clean trail behind me
<thumper> davecheney: once it lands in trunk, I'll submit for 1.16
<thumper> I made sure I branched off a common ancestry revision
<thumper> so I don't need to cherry pick
<thumper> davecheney: do you handle the packaging for juju?
<thumper> davecheney: in main world, we had seb and didier
<thumper> davecheney: who we'd pass a patch to in situations like this and they'd rebuild the deb
<davecheney> thumper: that honor belongs to sinzui
<thumper> sinzui: ping
<sinzui> hi thumper
<thumper> sinzui: got time for a hangout?
<sinzui> yes
<thumper> sinzui: need bandwidth
<thumper> sinzui: https://plus.google.com/hangouts/_/95c73d1f1d77129b8096dd279bf17d654e856cda?hl=en
<wallyworld_> sinzui: hi
<sinzui> hi wallyworld_
<wallyworld_> you otp?
<thumper> wallyworld_: he is soon :)
<davecheney> please form an orderly line
<wallyworld_> ok, sinzui maybe you can ping me when done
<sinzui> wallyworld_, you're up
<wallyworld_> sinzui: i was just wondering the plan for getting tools and metadata onto streams.canonical.com now that it has been commissioned
<sinzui> I plan to test it for 1.17
<sinzui> 1.18 will make it available to everyone
<wallyworld_> sinzui: ok. i will need to make changes to juju-core to update the url etc
<wallyworld_> when were you planning to start?
<sinzui> Did you see the bug I updated about that?
<wallyworld_> ah no
<sinzui> streams.canonical.com/juju/tools
<wallyworld_> i mustn't be subscribed to the bug
<wallyworld_> yeah, that url looks fine
<wallyworld_> do you have a bug # handy?
<sinzui> ^ We share the host with cloud images. We made an executive decision that juju-dist was too long
 * sinzui looks
<wallyworld_> ok, it will take a little work to retool everything, but it's only software
<sinzui> wallyworld_, https://bugs.launchpad.net/juju-core/+bug/1220965
<_mup_> Bug #1220965: add official tools repository to metadata search <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1220965>
<sinzui> wallyworld_, really? you need juju-dist?
<wallyworld_> wtf, i even reported that bug
<sinzui> wallyworld_, I read EVERY bug over the weekend. Even yesterday I was dizzy from too much information
<wallyworld_> sinzui: actually, no. i was not thinking straight. juju-dist is assumed as the container name for cloud storage (ie private bucket)
<wallyworld_> i'll need to check my mail filters to see where emails from that bug went
<wallyworld_> anyways, that url will be fine
<wallyworld_> i'll update juju-core with the necessary changes
<wallyworld_> will be done this week i expect
<sinzui> wallyworld_, If it is a problem to match the path /juju/tools/, I can discuss the issue with Ben
<wallyworld_> nah, should be fine
<wallyworld_> will be fine, i was just caffeine deprived
<sinzui> :/
<sinzui> Drink
<wallyworld_> will do :-)
<wallyworld_> sinzui: can you let me know when stuff has been uploaded? how hard is it to just copy across the tools from s3 or wherever?
<sinzui> Not sure yet
<wallyworld_> ok
<sinzui> at the moment, I assemble all the tools to call sync-tools and make metadata. Since I have a cache from aws of the historic tools, I can build the tree in about 5 minutes. the release-public-tools script then deploys to all CPCs taking 15 minutes
<sinzui> wallyworld_, So the new process is pull/sync from streams.canonical.com, then 15 minute publication to all CPCs
<wallyworld_> the ticket talks about syncing from sawo or something like that
<wallyworld_> my question above was more aimed at getting some initial metadata up on streams.c.c so i could test
<sinzui> that's right. I have no experience uploading to it, and don't know how long it takes from that server to the new server
<thumper> gui won't install locally
<thumper> grr
<sinzui> thumper, dir permissions?
<wallyworld_> sinzui: i'll leave you alone in peace to work your magic and you can let me know when there's some news :-)
<gary_poster> thumper, what is error?
<gary_poster> thumper, for the API server falling over, more info.  From comment I added in bug: "We verified that the other bug has the same behavior (linked as dupe). Apparently, then, the charms are likely unrelated, because the other scenario is of an openstack bundle. This may simply be scale--and not very much of it."
<thumper> gary_poster: http://pastebin.ubuntu.com/6248091/
<thumper> gary_poster: logged in, and all at latest revision
<thumper> gary_poster: in fact, every charm failed to install
<gary_poster> thumper, yeah, "apt-get -y install python-apt python-launchpadlib python-tempita" failed, which doesn't seem like a gui issue
<thumper> yeah...
<thumper> hang on
 * thumper pastebins
<thumper> http://pastebin.ubuntu.com/6248095/
<thumper> seems to be a cloud-init issue
<thumper> which is why all failed
<gary_poster> :-( :-( :-(
<sidnei> thumper: yeah, it's all broken :/
<thumper> sidnei: any idea?
<sidnei> thumper: it's an issue with procps and lxc, we've been tracking it all day
<gary_poster> I have to run
<thumper> gary_poster: ack
<thumper> sidnei: oh, didn't know
<sidnei> thumper: it started happening yesterday because procps landed in precise-updates
<thumper> ah
<sidnei> bug #1157643
<_mup_> Bug #1157643: procps fail to start <failed> <patch> <procps> <start> <procps (Ubuntu):Confirmed> <https://launchpad.net/bugs/1157643>
<thumper> sidnei: are they going to roll it back?
<sidnei> thumper: nope, that doesn't help unfortunately
<thumper> why?
<sidnei> thumper: the problem is that procps has been broken in lxc since ever, but since it's installed by default and init doesn't block on it failing it went unnoticed
<sidnei> thumper: it only became an issue when calling the postinst script from dpkg
<thumper> haha
<thumper> oops
<sidnei> so reverting won't help
<sidnei> thumper: there's a patch in the bug, but it needs to go into sru and all that i guess
 * thumper nods
<sidnei> until then, no workie lxc :/
<thumper> oh well, I guess using aws to test this other failure is what we need then
<thumper> davecheney: this one is for 1.16 https://codereview.appspot.com/14439067/
 * thumper self approves
<thumper> huh - Rietveld ignores my own LGTMs :)
<sidnei> thumper: there :)
<bigjools> o/
<thumper> o/
<thumper> gym time
<wallyworld_> thumper: don't forget my 2 branches :-)
<thumper> wallyworld_: oh yeah... after gym
<wallyworld_> thumper: ok :-) also i think we need a 1.18 in saucy
<wallyworld_> but we can discuss later
<thumper> wallyworld_: it won't happen dude
<thumper> yes, lets
<wallyworld_> well, stuff will be broken
<wallyworld_> no tools repository
<wallyworld_> unless we back port lots :-(
<wallyworld_> i wonder why it's so hard to get our own software into our own distro
<davecheney> wallyworld_: i couldn't agree more strongly
<wallyworld_> davecheney: yeah :-( there's a lot going into trunk right now
<wallyworld_> we either want juju to work or we don't
<rogpeppe1> ooh, i *so* nearly just tried to buy a camera from this site. then i looked at the terms and conditions... http://www.pcmshop.info/index.php?route=information/information&information_id=5
<rogpeppe1> mark v cheney ftw
<davecheney> rogpeppe1: well, I didn't want you to have to fall on your sword this sprint
<davecheney> i figured it was my time
<wallyworld_> rogpeppe1: where in goamz did you see req.Close = true? i can't seem to find it
<wallyworld_> ah never mind, found it
#juju-dev 2013-10-17
<bigjools> wallyworld_: haha I see you're fixing the http bug in the other providers then
<wallyworld_> yeah
<wallyworld_> bigjools: turns out the goamz lib sets Close = true
<bigjools> fancy that
<thumper> rogpeppe1: don't suppose you are still around?
<wallyworld_> bigjools: what's the process for landing gomaasapi changes? looks like landing bot is not set up for that project. do i need to merge to trunk manually?
<bigjools> it should be owned by the bot I thought
<bigjools> hmmm maybe it's still running in the qa lab
<wallyworld_> doesn't appear to be
<bigjools> yes I think it's a bot in the lab
<bigjools> needs to be migrated
<wallyworld_> bigjools: can you land it for me then?
<wallyworld_> https://code.launchpad.net/~wallyworld/gomaasapi/fix-request-eof/+merge/191537
<bigjools> just set it approved
<wallyworld_> ah ok
<thumper> bigjools: I wonder if it still works since I changed the owner...
<bigjools> yeah :)
<thumper> wallyworld_: if that doesn't work, merge and push to trunk directly
<thumper> \o/
<wallyworld_> alright
<wallyworld_> bigjools: ffs :-( can you please add me to ~maas-maintainers
<bigjools> how';s that gonna help?
<wallyworld_> There was a problem validating some authors of the branch. Authors must be either one of the listed Launchpad users, or a member of one of the listed teams on Launchpad.
<wallyworld_> Persons or Teams:
<wallyworld_>     maas-maintainers
<bigjools> oh ffs
<bigjools> let me see if I can remove that
<wallyworld_> ta
<bigjools> oh fuck sake this is all fucked
<bigjools> wallyworld_: merge manually
<wallyworld_> ok
<bigjools> it's using jenkins tests on quantal and raring ffs
<wallyworld_> \o/
<axw> thumper: did I break the local provider with the bootstrap storager changes?
 * thumper nods
<axw> sorry :(
<thumper> or it could have been the prepare work
 * axw looks closer
<thumper> not sure exactly where the problem lies
<axw> mk
<thumper> axw: it's fixed now
<axw> I didn't think it was broken, it's optional
<thumper> neither did I
<thumper> something changed in bootstrap, so it tried to do some stuff earlier
<thumper> I do remember changing some bootstrap stuff
<thumper> too
<axw> ok. I'll have a dig, I'd like to know why it broke
<thumper> it'd be ironic if I broke it :)
<axw> hehe
<thumper> you know what...
<thumper> I think it may have been me
<thumper> with an early fail if already bootstrapped
<thumper> heh
<axw> ahhh
<axw> that makes sense
<thumper> it does the check after the enable bootstrap storage
<thumper> although I thought I did that before my other lxc fixes
 * thumper shrugs
<thumper> the log file indicates other stuff happening around prepare
<axw> well anyway, it makes sense to have an EnableBootstrapStorage for local too
<thumper> it does
<axw> sorry, I didn't think to add it before
<thumper> found out that lxc is broken right now with precise anyway
<thumper> so we can't really use it
<axw> :(
<thumper> some other bug
<thumper> leaving apt in a bad state
<thumper> charm install hooks fail
<thumper> anyway
<thumper> I'm supposed to be looking at a problem for gary around the all watcher
<thumper> so I'll have to fire up lots of machines in ec2
<davecheney> thumper: or containers ?
<thumper> davecheney: or containers what?
<davecheney> for gary's allwacher problem ?
<sidnei> davecheney: lxc is broken because of a procps update that landed in precise-updates
<davecheney> *sadface*
 * thumper nods
<thumper> davecheney, axw: I know that in C++ you should never modify a map while you are iterating over it, how does go handle this?
<axw> thumper: iirc it's not completely specified. I'll have to look it up
<thumper> in C++ it can segfault
<thumper> normally considered bad form modifying a container you are iterating
<thumper> although can be done if careful
<axw> yeah I'm all too familiar with those bugs ;)
<thumper> the C++ language is very careful about what invalidates iterators
<axw> spent many hours debugging other people's ugly C++, not to mention my own
<thumper> I'm just looking inside the multiwatcher code
<thumper> where it is responding to things
<thumper> under load when there are likely to be multiple requests pending
<thumper> the code iterates over the response map, and deletes things from the map as it is iterating
<axw> thumper: "The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next.
<axw> If the map is nil, the number of iterations is 0."
<thumper> it just raises my suspicions
<axw> ignore the first sentence
<axw> that's just saying iteration order is random
<thumper> hmm...
<thumper> what if you remove the current element?
<axw> yeah, that's the "not fully specified" bit
<axw> I can look it up in the runtime, but that won't help if it's not specified
 * thumper nods
<thumper> we are keeping a map of singly linked lists
<thumper> it has been forever since I've had to manage that myself
 * thumper considers a race condition
<thumper> arg... gotta go get a child
<thumper> back
<thumper> I have a horrible feeling about this
<axw> ?
<davecheney> ?
<thumper> hmm...
<thumper> can't seem to reproduce the problem
<wallyworld_> bigjools: pretty please https://code.launchpad.net/~wallyworld/gwacl/fix-request-eof/+merge/191545
<bigjools> wallyworld_: done
<wallyworld_> ta
 * thumper gets back to look at wallyworld_'s branches
<wallyworld_> \o/
<thumper> wallyworld_: hangout?
<wallyworld_> ok
<thumper> https://plus.google.com/hangouts/_/7912539c41441cc8e5dc4021f78f9c48c48c9e69?hl=en
<wallyworld_> thumper: https://codereview.appspot.com/14769043
<bigjools> huh, nothing on ubuntu.com front page about the release ...
<wallyworld_> i'm too scared to try it
<davecheney> bigjools: release is tomorrow
<bigjools> davecheney: no it's today...
<davecheney> stilll yesterday in the states,
<davecheney> barely today in the UK
<bigjools> meh
<bigjools> anyway there's normally a countdown and at least *some* mention of it
<davecheney> good point
<rogpeppe1> mornin' all
<rogpeppe1> thumper, axw: it's fine (and well defined) to remove the current element of a map when iterating over it
<axw> rogpeppe1: morning. well defined where?
<rogpeppe1> axw: "If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced."
<rogpeppe1> axw: i.e. yes, it's ok to remove an entry, and no, you won't see that entry again
<rogpeppe1> axw: nowhere does it say you cannot delete an entry from a map during iteration
<axw> rogpeppe1: not saying you can't do something is not the same as saying you can do it :)
<axw> rogpeppe1: not really sure how that sentence says you can safely remove the current element either
<axw> it's talking about past elements
<rogpeppe1> axw: in the spec, not saying you can't do something *is* equivalent to saying you can do it
<rogpeppe1> axw: and i know from past discussions that that is indeed the case here
<axw> rogpeppe1: that sounds slightly scary from an implementer's perspective, but okay :)
<rogpeppe1> axw: i'm actually a bit surprised if even you don't know this - some stronger wording in the spec is perhaps called for, although ISTR that's been pushed back on in golang-dev before this
<axw> rogpeppe1: I know it works, I just didn't know if it was guaranteed in the spec
<rogpeppe1> axw: basically the spec says (unqualified)  "The built-in function delete removes the element with key k from a map m."
<rogpeppe1> axw: the fact that it's unqualified means you can do it anywhere (memory model permitting of course)
<rogpeppe1> axw: otherwise you'd need loads of qualifiers everywhere
<rogpeppe1> axw: however... this is indeed an unusual property in this case and could probably do with an additional sentence.
<rogpeppe1> axw: i should have said: in the spec, not saying you can't do something *is* equivalent to saying you can do it *if* there's something *saying* you can do it
<axw> rogpeppe1: the fact that this has come up on golang-nuts (and in this channel ;)) before adds merit to that
<axw> rogpeppe1: you may be interested to know, I started to rewrite parts of llgo to use adonovan's ssa package
<axw> should be a bit more robust now :)
<axw> well, when it's working. I'll have to stop warping the definition of interfaces to get it to work
<rogpeppe1> axw: cool!
<rogpeppe1> axw: how were you warping the definition of interfaces?
<axw> rogpeppe1: llgo represents them as {runtime-type, value, method1, method2, ...}
<axw> ugly hack
<axw> should be done inside the runtime
<axw> rogpeppe1: when you have a moment, can you please look at this: https://bugs.launchpad.net/juju-core/+bug/1239550
<_mup_> Bug #1239550: juju should warn if .jenv doesn't match environments.yaml <config> <tech-debt> <ui> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1239550>
<axw> well actually, the summary says it all
<axw> rogpeppe1: bigjools and I were debugging an issue where he'd changed the oauth token in his environments.yaml, but it wasn't taking effect
<TheMue> morning
<axw> turns out he'd already prepared, and then changed it
<axw> morning TheMue
<TheMue> axw: hiya
<axw> rogpeppe1: that aside, it's a bigger problem for the null provider, which doesn't implement destroy-environment (yet)
<wallyworld_> rogpeppe1: i've landed 2 branches; i have 2 more to go to fix the http issue https://codereview.appspot.com/14769043/ https://codereview.appspot.com/14668044/
<rogpeppe1> wallyworld_: did you see my response on https://codereview.appspot.com/14769043/ ?
<rogpeppe1> wallyworld_: i think it's possible to be much less invasive with this fix
<rogpeppe1> axw: sorry, only just saw your remark above
<axw> rogpeppe1: nps
<rogpeppe1> axw: in general it has always been the case that changing something in your environments.yaml won't have any effect on an already-bootstrapped environment
<thumper> rogpeppe1: https://bugs.launchpad.net/juju-core/+bug/1240708
<_mup_> Bug #1240708: API server falls over repeatably during AllWatcher Next, killing GUI <juju-core:New> <juju-gui:Triaged> <https://launchpad.net/bugs/1240708>
<thumper> rogpeppe1: I've been looking at that bug a little today
<thumper> rogpeppe1: all I have been able to discern so far is that somewhere, someone is calling Stop on the watcher
<rogpeppe1> thumper: i hadn't seen that bug
 * rogpeppe1 has a look
<axw> rogpeppe1: hmm, what about if bootstrap failed? or someone did sync-tools and not bootstrap yet? in the past, the environments.yaml change would take effect
<thumper> rogpeppe1: yeah, filed by gary while you slept
<axw> thumper: are we still going to have a meeting in a bit?
<thumper> axw: aye
<thumper> that is why I'm back :)
<axw> makes sense :)
<axw> thumper: when are you flying out?
<rogpeppe> api client crashed.
<rogpeppe> irc client, even
<thumper> axw: sunday morning (from here), sunday night AKL -> SFO
<thumper> arrive 11:15am SFO
<axw> sameish
<axw> I arrive a bit later I think
<rogpeppe> and it doesn't seem to be notifying me when someone mentions my nickname any more, which is why i didn't see earlier remarks
<axw> rogpeppe: what about when you don't have a 1 at the end of your name?
<rogpeppe> axw: just testing - could you mention my irc nick, please?
<rogpeppe> huh, weird
<rogpeppe> this is the first time i've ever had a problem with this IRC app
<axw> rogpeppe: testing
<rogpeppe> axw: thanls
<rogpeppe> axw: thanks even
<axw> np
<rogpeppe> axw: still buggered
<rogpeppe> axw: if i bring up the "configure notifications" panel, the whole app halts when i click Ok
<axw> rogpeppe: in case you missed all this:
<axw> <rogpeppe1> thumper: i hadn't seen that bug
<axw> * rogpeppe1 has a look
<axw> <axw> rogpeppe1: hmm, what about if bootstrap failed? or someone did sync-tools and not bootstrap yet? in the past, the environments.yaml change would take effect
<axw> <thumper> rogpeppe1: yeah, filed by gary while you slept
<wallyworld_> rogpeppe: i just saw your comments, thanks. so setting DisableKeepAlives=false has exactly the same effect as Close=true?
<rogpeppe> wallyworld_: yes
<wallyworld_> ok
<thumper> wouldn't DisableKeepAlives=true be the same as Close=true?
<rogpeppe> thumper: yes
<thumper> double negative would be like "enable keep alives"
<wallyworld_> whatever :-)
<rogpeppe> thumper: yeah i missed the =false thing above
<wallyworld_> i knew what i meant :-)
<thumper> wallyworld_: no one else did
<rogpeppe> thumper: i did
<wallyworld_> rogpeppe did :-)
 * thumper ignores them both
<rogpeppe> axw: i think that if bootstrap fails and we've prepared the environment in that step, that we should destroy the environment there and then
<rogpeppe> axw: the sync-tools-then-bootstrap case is interesting
<axw> rogpeppe: ideally, but you could still end up with a stale .jenv file if juju died half-way through
<rogpeppe> axw: true - in general if you want to change attributes you need to destroy-environment first
<rogpeppe> axw: warning about mismatched attributes may be a reasonable thing to do
<rogpeppe> axw: we'd also want to check that a provider doesn't override any of the attributes at Prepare time (currently that's allowed)
<axw> rogpeppe: I was thinking of adding in a provider.Validate call there, passing in .jenv/env.yaml as args
<axw> rogpeppe: ?
<axw> don't understand that bit
<axw> rogpeppe: meeting is on
<TheMue> rogpeppe: meeting
<natefinch> rogpeppe: about the destroy-environment change I was proposing... I agree that you need to be able to script it.  I just want it to require a little more thought than just throwing a -y on the end, if we're talking about taking down an entire network of services and destroying all the user's data
<natefinch> rogpeppe: how about juju destroy-environment --confirm destroy-my_env_name  ?
<thumper> rogpeppe: confirmed just maas provider changed for the agent_name
<rogpeppe> natefinch: if -y is good enough for fsck, it's good enough for me :-)
<rogpeppe> thumper: yeah, i also confirmed that
<rogpeppe> natefinch: how long do you have to make a name before it's hard enough to type?
<rogpeppe> natefinch: that's the reason it's called "destroy-environment" rather than just "destroy"
<natefinch> rogpeppe: yes, but the point is that people could destroy the *wrong* environment.  That's why I want the environment name in the confirmation.
<natefinch> rogpeppe: also, fsck will only fuck up one computer.  destroy-environnent will fuck up potentially dozens or more
<rogpeppe> natefinch: i dunno - have we actually encountered cases where this has been a problem?
<rogpeppe> natefinch: a nicer solution would be to be able to deliberately "lock" an environment so that it's not possible to destroy until explicitly unlocked
<natefinch> rogpeppe: then people have to remember to lock it
<rogpeppe> natefinch: sure - you lock it if it's important to you
<natefinch> rogpeppe: if you even realize that function exists
<rogpeppe> natefinch: for many people destroying an environment isn't that big a deal - you can deploy everything again quite easily
<thumper> natefinch: -y or --yes-I-know-what-Im-doing
<thumper> :)
<rogpeppe> natefinch: personally, i find even the current behaviour of destroy-environment to be annoying
<natefinch> rogpeppe: which is more important, 10 keystrokes, or helping the users not shoot themselves in the foot?
<rogpeppe> natefinch: that's a false dichotomy
<natefinch> rogpeppe: this is the bug I was basing the work on: https://bugs.launchpad.net/juju-core/+bug/1057665
<_mup_> Bug #1057665: juju destroy-environment is terrifying; please provide an option to neuter it <canonical-webops> <destroy-environment> <juju:Fix Committed by hazmat> <juju-core:In Progress by natefinch> <https://launchpad.net/bugs/1057665>
<natefinch> brb sorry
<rogpeppe> natefinch: that bug report makes a similar suggestion to mine above
<natefinch> rogpeppe: yes, but this solution is simpler and more effective.
<rogpeppe> natefinch: i don't believe it's more effective
<frankban> thumper: looking at bug 1240708: it could be a problem on the gui server side (aborting the connection)
<_mup_> Bug #1240708: API server falls over repeatably during AllWatcher Next, killing GUI <juju-core:New> <juju-gui:Triaged> <https://launchpad.net/bugs/1240708>
<rogpeppe> natefinch: although it is simpler, i agree
<frankban> thumper: working on it
<rogpeppe> frankban: it certainly looked as if the API server saw a connection drop
<natefinch> rogpeppe: it's more effective because it's automatic.  Otherwise it's like having to turn on the airbag in your car.
<rogpeppe> natefinch: i don't believe that making people type more makes an effective barrier - people will type *anything* without thinking
<rogpeppe> natefinch: we've already got a written warning that mentions the name of the environment, and a user prompt
<natefinch> rogpeppe: it's not just more, it's the fact that it's environment specific.  You can't think you're destroying the test env when you're actually destroying production
<natefinch> rogpeppe: no one reads
<natefinch> rogpeppe: especially after the first couples times
<rogpeppe> natefinch: we could potentially make it so that the -e flag is mandatory for destroy-environment
<natefinch> rogpeppe: I had thought of that. That would be fine with me.
<natefinch> rogpeppe: I'd be happy to do both, make the lock command and require -e.
<rogpeppe> natefinch: (i'll just make a script, jujudestroy, which finds out the current env name and passes that as the -e arg :-])
<natefinch> rogpeppe: you're welcome to do that. Anyone can script a gun to shoot themselves in the foot... I just don't want juju to hand them one loaded with the safety off, in an ankle holster
<rogpeppe> natefinch: the lock/nodestroy command is interesting, as it's not clear what we should do if you can't connect to the environment
<rogpeppe> natefinch: if the environment instances have been destroyed, for example, we should still be able to destroy the environment
<sinzui> natefinch, Did this branch also get merged into to trunk? https://code.launchpad.net/~natefinch/juju-core/fix-win-bootstrap/+merge/190461
<gary_poster> hey rogpeppe, I'm here (in mtgs but here) to discuss bug 1240708.  Whether it is an issue in core (as I diagnosed) or gui (which is possible but seems unlikely to me), it's critical for gui
<_mup_> Bug #1240708: API server falls over repeatably during AllWatcher Next, killing GUI <juju-core:New> <juju-gui:Triaged> <https://launchpad.net/bugs/1240708>
<natefinch> rogpeppe: that's a good point.  Kinda hard to keep a central lock on the environment, when you might not be able to connect to the environment
<rogpeppe> gary_poster: frankban seems to think it might be a problem on the gui server
<natefinch> sinzui: hmm... lemme check
<gary_poster> rogpeppe, ah! ok, will check with him thx
<natefinch> sinzui: not in trunk.  good catch, I'll move it over
<sinzui> https://bugs.launchpad.net/juju-core/+bug/1240927
<_mup_> Bug #1240927: os.rename does not wotk with windows <bootstrap> <windows> <juju-core:In Progress by natefinch> <juju-core 1.16:New> <https://launchpad.net/bugs/1240927>
<sinzui> ^ natefinch your bug
<TheMue> rogpeppe: just coming back from lunch and reading the log here I've seen your statement that destroying an environment isn't a big deal because everything can be deployed quite easy again
<TheMue> rogpeppe: but how about lost data in that case
<TheMue> rogpeppe: think of our typical blog example and kick your blog entries of the past two years -> ouch
<TheMue> and that's one of the most harmless examples
<jpds> Is juju aware of neutron security groups?
<mgz> geh, hopefully internet is stable and alive for now
<rogpeppe> mgz: i wondered why we didn't see you. has your internet been down?
<mgz> yeah, went screwy for most of yesterday afternoon
<mgz> seems okay now though at least
<sinzui> rogpeppe, I see your merge into 1.16 branch. I updated two bugs that I think you will also want to update: https://bugs.launchpad.net/juju-core/+bug/1229275 and https://bugs.launchpad.net/juju-core/+bug/1081247
<_mup_> Bug #1229275: [maas] juju destroy-environment also destroys nodes that are not controlled by juju <maas> <theme-oil> <juju:Triaged> <juju-core:In Progress by thumper> <juju-core 1.16:In Progress by rogpeppe> <juju-core (Ubuntu):Triaged> <maas (Ubuntu):Triaged> <juju-core (Ubuntu Saucy):Triaged> <maas (Ubuntu Saucy):Triaged> <https://launchpad.net/bugs/1229275>
<_mup_> Bug #1081247: maas provider releases all nodes it did not allocate [does not play well with others] <maas> <juju:Triaged> <juju-core:In Progress by julian-edwards> <juju-core 1.16:In Progress by rogpeppe> <MAAS:Invalid> <https://launchpad.net/bugs/1081247>
<sinzui> rogpeppe, Are the bugs fix committed or in progress
<rogpeppe> sinzui: the fixes are committed
<sinzui> fab
<rogpeppe> aargh, i wish i knew why my IRC client had suddenly stopped notifying me
<sinzui> rogpeppe, any more bugs that need backporting?
<natefinch> rogpeppe: I had the same problem with empathy not notifying me anymore
<rogpeppe> sinzui: there are varying opinions on that
<mgz> rogpeppe: but nothing else we urgently need asap after release, right?
<mgz> just the question over whether most of what's on trunk should actually get into saucy too at some point
<rogpeppe> mgz: i *think* so
<sinzui> mgz I am looking for bugs that need to be backported before the 1.17.0 release. 1.17.0+ will be made available to saucy like our other series
<mgz> sinzui: right.
<rogpeppe> mgz: fancy spending an hour on the addressing stuff?
<mgz> rogpeppe: sounds like a plan
<sinzui> mgz, rogpeppe I really don't know if I should begin blessing 1.16 tip now. I see this bug that stakeholders may want backported https://bugs.launchpad.net/gomaasapi/+bug/1222671
<_mup_> Bug #1222671: Using the same maas user in different juju environments causes them to clash <cts-cloud-review> <maas> <Go MAAS API Library:Fix Committed> <juju-core:Fix Committed by thumper> <https://launchpad.net/bugs/1222671>
<rogpeppe> mgz: https://plus.google.com/hangouts/_/calendar/am9obi5tZWluZWxAY2Fub25pY2FsLmNvbQ.mf0d8r5pfb44m16v9b2n5i29ig?authuser=1
<mgz> sinzui: I'm not sure what the status of that is, it required several changes to both juju and maas
<sinzui> oh good. I think that disqualifies the bug
<mgz> without both parts back, it doesn't make much sense to include the juju changes
<sinzui> I believe SRUs must be limited to fixes to just the package
<TheMue> off for today, cu tomorrow
<mgz> rogpeppe: thanks, see you tomorrow
<rogpeppe> mgz: ha, i pressed ^R in my web browser, not ^T
<rogpeppe> mgz: so lost the connection
<rogpeppe> mgz: cheers!
<mgz> guessed that :)
<rogpeppe> g'night all
<narindergupta> Hi all, I am trying to test MAAS/juju from cloud-tools on precise but am stuck at a known issue where mongodb with ssl was not installed, and mongodb listens on 27017 instead of 37017. Since I am using a proxy I cannot add the mongodb PPA. If someone can tell me the PPA name for mongodb with ssl support for precise, I can manually try to update mongodb and get the bootstrap node working with juju-core?
<jpds> Guys, how do I force juju to forget about a install hook lock?
<jpds> I deployed service A, then service B, realized that I didn't need A, remove-unit'ed it, but now B's not moving because it still thinks A exists.
<jpds> Oh, all on the same machine.
<natefinch> jpds: sorry, I don't know, and this is kind of a dark time of the day for juju devs.  Thumper should be coming online in the next hour or so.
<jpds> natefinch: No worries.
<wallyworld_> thumper: a trivial goose review https://codereview.appspot.com/14769043
<wallyworld_> and a small juju-core one https://codereview.appspot.com/14668044/
<thumper> +!
<thumper> or 1
<wallyworld_> you mean lgtm?
<wallyworld_> thanks :-)
<thumper> done
<thumper> I was going to be helping at the daughter's school's sports day
<thumper> but rain cancelled it after two hours
<thumper> so home agani
<thumper> again
<thumper> final trip prep...
<thumper> and stuff
<thumper> will be on and off irc periodically
<wallyworld_> me too
<wallyworld_> thanks
<wallyworld_> thumper: what do you mean multiple times? init() is only called once isn't it?
<thumper> wallyworld_: but init() will be called for gwacl, goose, juju...
<thumper> each one replacing the default in the http lib
<wallyworld_> that is true
<wallyworld_> but i'm not sure what can be done
<thumper> but not really a big deal
<wallyworld_> yeah
 * thumper shrugs
<thumper> I wouldn't bother
<wallyworld_> ok, will land :-)
 * thumper takes a wet kid to school for the afternoon
<thumper> she doesn't wanna go
<thumper> so I'm being a mean dad
<wallyworld_> hah
#juju-dev 2013-10-18
<wallyworld_> thumper: turns out overwriting the transport did cause issues as it nuked any alternate protocol handlers already registered
<wallyworld_> so i did a quick fix https://codereview.appspot.com/14840043
<thumper> heh
<axw> wallyworld_: how did the tests pass?
<axw> or has the original change not landed
<wallyworld_> axw: goose tests passed just fine but juju-core tests failed
<axw> ah right
<axw> this is in goose, got it
<wallyworld_> yeah. my original solution worked but roger made me change it
<thumper> school run
<wallyworld_> thumper: gwacl and gomaasapi - there the req.Close attribute was set instead, as all http requests were dispatched via a helper function
<wallyworld_> i used that solution with goose originally, but it was more changes as we use http directly there
<wallyworld_> so i needed to add the dispatch helper
<wallyworld_> but roger suggested using the keep alive attr instead
<bigjools> wallyworld_: checked in yet?
<wallyworld_> bigjools: nope. but i've already reserved my seats
<bigjools> wallyworld_: just reserved mine.  fucking agent had put me in a bulkhead seat with restricted legroom ...
<wallyworld_> i think i got 50
<wallyworld_> from memory
<bigjools> row 39
<wallyworld_> so i can throw things at your head
<bigjools> 17A on the la->sfo
<wallyworld_> i forget what i got for that one, 8 i think
<bigjools> we can change at the gate so we can hold hands
<wallyworld_> nah, just wait till we drive over the golden gate into the sunset
<bigjools> wallyworld_: http://tinyurl.com/kj3zltm
<adam_g> uhm so
<adam_g> is multi-tenancy supposed to work at this point /w MAAS?
<bigjools> adam_g: not *quite* - there is an SRU waiting
<adam_g> bigjools, bug #? anything in proposed i can test?
<bigjools> it's all tested
<adam_g> maas (1.4+bzr1693+dfsg-0ubuntu2.1) saucy-proposed; urgency=low
<adam_g>  ?
<bigjools> oh that kind of testing
<bigjools> I don't see that in proposed yet
<adam_g> hmph still in queue
<adam_g>   * debian/patches/99_fix_juju_multienv_lp1239488: Allows juju to distinguish
<adam_g>     between different environments, actually fixing the MAAS side of multiple
<adam_g>     juju environment support. (LP: #1239488)
<_mup_> Bug #1239488: [SRU] Juju api client cannot distinguish between environments <MAAS:Fix Released by julian-edwards> <maas (Ubuntu):Triaged> <maas (Ubuntu Saucy):Triaged> <https://launchpad.net/bugs/1239488>
<bigjools> that is the one
<adam_g> still anything that needs to be fixed on the juju side?
<bigjools> ah it's in the upload queue
<bigjools> there was a fix for juju but I don't know its status
<bigjools> thumper: ?
<thumper> landed afaik, but not released yet
<thumper> has been landed on 1.16 branch for 1.16.1, but again, don't know the status
<bigjools> wallyworld_: seems like cartridge razors are ok in hand luggage
<wallyworld_> oh, surprising
<bigjools> wallyworld_: it also occurred to me that we ought to go shopping for gadgetry
<wallyworld_> well why not
<wallyworld_> i don't need anything but need != want :-)
<bigjools> exactly
<bigjools> I am sure we can take the mustang via Fry's etc :)
<thumper> :)
 * thumper is looking for his US cables
<wallyworld_> bigjools: the saucy archives should be updated with the final release by now, right?
<wallyworld_> "update-manager -d" shows a splash screen saying it's still a beta release
<bigjools> wallyworld_: yeah release has happened
<bigjools> apt-get update should make that go away
<wallyworld_> update-manager -d does do an update
<wallyworld_> it forces your current release to be up-to-date
<bigjools> wallyworld_: yes but you're still running with the old update manager at that point
<bigjools> or potentially older
<wallyworld_> i would have thought it would have fetched the latest splash info
<wallyworld_> so i just ignore that message?
<bigjools> yeah
<wallyworld_> ok, but i reckon it's a poor user experience. it should show info about the release you are upgrading to, not older info
<bigjools> yup
<wallyworld_> cause i reasonably thought the mirrors i was using hadn't been synced yet
<wallyworld_> based on the info on the splash screen
<bigjools> that's also possible
<wallyworld_> so then i switch to the canonical archives and same result
<bigjools> then we should declare it to be buggered
<TheMue> morning
<TheMue> frankban: ping
<frankban> TheMue: hey
<TheMue> frankban: hi
<TheMue> frankban: based on my first proposal rogpeppe had a good idea for env/switch
<TheMue> frankban: see the last comment on https://code.launchpad.net/~themue/juju-core/053-env-more-script-friendly/+merge/191640
<TheMue> frankban: if that is fine for you too I would change my proposal
<frankban> TheMue: so, --raw by default and an error exit code if no default env is configured. totally +1
<TheMue> frankban: fine, then I'll note it there and the issue and change it this morning
<frankban> TheMue: great, thank you!
<TheMue> frankban: yw
<TheMue> rogpeppe: ping
<rogpeppe> TheMue: pong
<TheMue> rogpeppe: ah, hiya
<rogpeppe> TheMue: sorry, my IRC client has stopped notifying me when someone mentions my name
<rogpeppe> TheMue: it's most annoying
<rogpeppe> TheMue: hiya, BTW
<TheMue> rogpeppe: as you may have seen frankban and I agreed on your proposal
<rogpeppe> TheMue: cool
<TheMue> rogpeppe: one question for "juju env --list"
<TheMue> rogpeppe: in that way it only lists all names
<TheMue> rogpeppe: but additionally you can pass a name to switch too
<TheMue> rogpeppe: how would you act in that case and let the output look like?
<rogpeppe> TheMue: i had no idea that "env" was a synonym for "switch"
<TheMue> rogpeppe: yeah, it is
<rogpeppe> TheMue: i think "juju switch --list foo" should probably give an error
<rogpeppe> TheMue: at some point in the future, when environments may be held remotely, we could potentially use it to implement search functionality but for now that's not needed.
<TheMue> rogpeppe: ah, fine, that's my idea too. I dislike the combination of switching and listing in one call
<TheMue> rogpeppe: it's so "hey, please show me the environments. and by the way you can also switch it" :/
<rogpeppe> TheMue: yeah
<TheMue> rogpeppe: currently cleaning up the tests, everything simpler now :)
<rogpeppe> TheMue: i hoped it might be
<rogpeppe> mgz: standup?
<rogpeppe> wallyworld_: ^
<rogpeppe> dimitern: you still connected?
<dimitern> gah!
<dimitern> my connection died at the hangout exactly as yesterday
<dimitern> and i can't seem to be able to join again
<mgz> hm can't join again? what error?
 * TheMue => lunch
<dimitern> my machine behaves somewhat erratically perhaps it's time for a reboot
<rogpeppe> anyone up for doing a review of this? https://codereview.appspot.com/14619045/
<rogpeppe> dimitern, TheMue, natefinch: ^
<natefinch> rogpeppe: sure thing
<natefinch> rogpeppe: there are some comments about this code being temporary.  How temporary is this code?   Just want to know so I can dial in the amount of nitpicking ;)
<natefinch> rogpeppe: (in state/apiserver/common/addresses.go)
<natefinch> rogpeppe: note, my problem is not with your changes, but some minor stuff with the code that was there that could do with a little cleanup
<mgz_> hm, didn't log out at home
<abentley> sinzui: dude, you indented with tabs!  Are you feeling okay?
<sinzui> Obviously no
 * sinzui will fix that
<abentley> sinzui: When using 'find' with wildcards, I think it's best to quote the wildcard.
<sinzui> abentley, I think I want the scripts to look for credentials and configs in JUJU_HOME. I don't think we want to force .juju or $HOME
<sinzui> abentley, yes, I did that twice,
<abentley> sinzui: you missed it in archive_tools and retrieve_packages.
<abentley> sinzui: I agree about $JUJU_HOME.
<sinzui> abentley, The last hours broke my head. I was testing what happens when non-required data is missing in steps and find lots of errors that killed the script
<abentley> sinzui: Oh, I see.
<sinzui> s/find/found/
<sinzui> abentley, reassembling with existing tools (no debs) was very bad. I will review the scripts with fresh eyes. Though yours are clearly fresh
<abentley> sinzui: It's a shame that s3cmd won't accept environment variables, because I could extend "jnova" to work with all providers and I think that would be neat.
<jpds> Anyone know what is going on here? http://pastebin.ubuntu.com/6257151/
<jpds> Ah, fixed it.
<rogpeppe> is there any way to get apt-get to downgrade a package to a specific version?
<rogpeppe> (still trying to fix my IRC client issue
<rogpeppe> )
<jpds> rogpeppe: apt-cache policy <package>
<jpds> rogpeppe: Take the earlier version number and: sudo apt-get install <package>=<version>.
<rogpeppe> jpds: thanks. hmm, looks like nothing's changed in a while, and there don't seem to be any earlier version numbers.
<rogpeppe> % apt-cache showpkg konversation
<rogpeppe> Package: konversation
<rogpeppe> Versions:
<rogpeppe> 1.5~rc1+git20130415-0ubuntu1 (/var/lib/apt/lists/gb.archive.ubuntu.com_ubuntu_dists_raring_universe_binary-amd64_Packages) (/var/lib/dpkg/status)
<rogpeppe>  Description Language:
<rogpeppe>                  File: /var/lib/apt/lists/gb.archive.ubuntu.com_ubuntu_dists_raring_universe_binary-amd64_Packages
<rogpeppe>                   MD5: 529965a53c80f878568781c6a205d5f5
<rogpeppe>  Description Language: en
<rogpeppe>                  File: /var/lib/apt/lists/gb.archive.ubuntu.com_ubuntu_dists_raring_universe_i18n_Translation-en
<rogpeppe>                   MD5: 529965a53c80f878568781c6a205d5f5
<rogpeppe> Reverse Depends:
<rogpeppe>   konversation:i386,konversation
<rogpeppe>   kubuntu-full,konversation
<rogpeppe>   konversation-dbg,konversation 1.5~rc1+git20130415-0ubuntu1
<rogpeppe>   konversation-data,konversation 1.3~beta1-2
<rogpeppe>   konversation-data,konversation 1.3~beta1-2
<rogpeppe>   konversation-data,konversation 1.5~rc1+git20130415-0ubuntu1
<rogpeppe> Dependencies:
<rogpeppe> 1.5~rc1+git20130415-0ubuntu1 - kde-runtime (0 (null)) kdepim-runtime (0 (null)) libc6 (2 2.14) libkabc4 (2 4:4.4.3) libkde3support4 (2 4:4.4.3) libkdecore5 (2 4:4.5.85) libkdeui5 (2 4:4.7.0) libkemoticons4 (2 4:4.4.95) libkidletime4 (2 4:4.4.95) libkio5 (2 4:4.5.85) libknotifyconfig4 (2 4:4.4.3) libkparts4 (2 4:4.4.3) libphonon4 (2 4:4.2.0) libqca2 (2 2.0.2) libqt4-dbus (2 4:4.7) libqt4-network (2 4:4.7) libqt4-qt3support (2 4:4.7) libqt4-
<rogpeppe> xml (2 4:4.7) libqtcore4 (2 4:4.8.0) libqtgui4 (2 4:4.8.0) libsolid4 (2 4:4.4.3) libstdc++6 (2 4.1.1) phonon (0 (null)) konversation-data (5 1.5~rc1+git20130415-0ubuntu1) konversation:i386 (0 (null))
<rogpeppe> Provides:
<rogpeppe> 1.5~rc1+git20130415-0ubuntu1 - irc
<rogpeppe> Reverse Provides:
<rogpeppe> frick
<rogpeppe> http://paste.ubuntu.com/6257210/
<rogpeppe> argh, everything is broken
<jpds> policy, not showpkg.
<rogpeppe> jpds: sorry, that was just the previous contents of my paste buffer
<rogpeppe> jpds: the paste points to the intended thing
<rogpeppe> jpds: unfortunately DNS lookups take about 10 seconds on this machine at the moment, so my pastebin script hadn't run quickly enough
<jpds> Yeah, so no way to downgrade without going to launchpad and downloading an earlier .deb file.
<rogpeppe> jpds: well, the binary hasn't changed in months, so i guess it must be something that's gone wrong somewhere in my machine
<rogpeppe> :-(
<rogpeppe> i might try reinstalling the app i suppose
<rogpeppe> natefinch: could you mention my nickname please?
<rogpeppe> or anyone
<jpds> rogpeppe: Hi.
<rogpeppe> jpds: thanks
<jpds> Folks, I'm trying to deploy on openstack and I'm getting this error: error info: {"badRequest": {"message": "Multiple possible networks found, use a Network ID to be more specific.", "code": 400}}
<rogpeppe> well bugger me backwards with a spade, it worked
<jpds> Where do I specify a network ID?
<rogpeppe> jpds: mgz might be a good one to ask
<jpds> mgz_: â ?
 * rogpeppe goes for some lunch
<mgz_> jpds: hm...
<jpds> mgz_: I do have two networks in openstack, the shared ext_net and my own tenent's one.
<mgz_> yeah, this is somewhat of a problem if that's an error case, as this is from boot, right?
<mgz_> does nova boot also complain if you don't specify a network?
<jpds> mgz_: Yes, exact same message.
<mgz_> jpds: seems mostly like a nova configuration issue then...
<jpds> mgz_: No.
<mgz_> there's not really anything reasonable juju could do here, the best would be list all networks and arbitrarily select one, which still sucks
<jpds> You could specify a network to .juju in the environments.
<mgz_> yeah, because more manual configuration is exactly what we want
<mgz_> (that is an option, but it doesn't seem ideal)
<mgz_> (would much prefer nova having a default network selection)
<jpds> mgz_: Well, it's one extra flag to the boot option: http://people.canonical.com/~jpds/nova-boot.png
<jpds> Maybe make it part of the imagemetadata.json?
<mgz_> it's not at all related to images
<TheMue> rogpeppe: after a short discussion we'll roll back to env/switch with flag --raw
<sinzui> rogpeppe, mgz natefinch did we hard code ubuntu series in Juju? Looks like sync-tools cannot do a release
<sinzui> ERROR invalid series "trusty"
<sinzui> yep, we did hard code.
<abentley> sinzui: There is a syntax error in "assemble-public-tools": generate_streams does not work, because it does "for $tool in" instead of "for tool in".  I do not understand why this syntax error doesn't abort the script.
<sinzui> abentley, me neither. Just fixed that BTW in my scripts
<sinzui> abentley, I had to pause to deal with this seen in that very function: https://bugs.launchpad.net/juju-core/+bug/1241666
<_mup_> Bug #1241666: Cannot creaste simple streams for Ubuntu trusty series <build> <juju-core:In Progress by sinzui> <https://launchpad.net/bugs/1241666>
<abentley> sinzui: Glad you caught that before a release.
<abentley> sinzui: Also, I think it's good hygiene to use "set -eu", not just "set -e".  It does mean you have to use ${foo:-} in some places where $foo would otherwise suffice.
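A minimal sketch of the hygiene abentley recommends (variable names are illustrative, not from the actual script): with `set -u`, any reference to an unset variable aborts the script, so optional variables need a `${foo:-}`-style default.

```shell
#!/bin/bash
# -e: abort on any failed command; -u: abort on any unset variable.
set -eu

# Without the :- default, referencing an unset variable under -u would
# kill the script with an "unbound variable" error. With it, we get a
# safe fallback. UNSET_OPTIONAL is assumed to be unset here.
optional="${UNSET_OPTIONAL:-fallback}"
echo "$optional"   # prints "fallback"
```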
<sinzui> abentley, I just pushed my changes minus the JUJU_HOME change we discussed this morning
<sinzui> abentley, okay
 * sinzui makes juju releasable
<abentley> sinzui: I don't see the tool fix in the changes you just pushed.
<sinzui> bugger, I switched
<sinzui> abentley, now? revno 2008
<abentley> sinzui: Yes, that's got it.
<mgz_> sinzui: some of the code updates from distro-info, but the bit that's breaking you may not
<abentley> sinzui: I just pushed a tweak to set usage.
<mgz_> sinzui: the bit you finger in the bug at least does have the update code, are you sure your ubuntu.csv has trusty?
<sinzui> mgz, it does not, yet
<sinzui> I got updates 2 hours ago
<mgz_> so, it's not a juju bug, it's an ubuntu bug :)
<jamespage> mgz_, logged bug  1241674 for the multiple tenant networks issue jpds described above
<_mup_> Bug #1241674: juju-core broken with OpenStack Havana for tenants with multiple networks <juju-core:New> <https://launchpad.net/bugs/1241674>
<mgz_> jamespage: thanks
<jamespage> we can have a fight as to where the problem actually lies
<mgz_> lets :)
<jamespage> :-)
 * jamespage fists up
<jamespage> lol
<mgz_> juju really needs some selection criteria for networks if it's going to start explicitly passing one in
<abentley> sinzui: I'm getting an empty added_tools, which is giving me an empty $tool in the loop, which makes rm unhappy.
<abentley> sinzui: http://162.213.35.28/job/juju-core-ci/38/console
<sinzui> abentley, Must have broken this morning. It was a happy loop last night
<abentley> sinzui: I'm unfamiliar with that syntax.  It concerns me that the loop executes for an empty string.
<sinzui> mgz, does juju-core have a max line length for go code? I need to tell my editor to STFU
<mgz_> sinzui: no, we try to keep it sane
<sinzui> abentley, me too. I think we need to look-before-we-leap.
<mgz_> but some go syntax stuff doesn't really sit nicely with hard line length limits
<abentley> sinzui: Is bash really so bad?  normally an empty input array means an each loop gets skipped.
<mgz_> I still aim for less than 80, but with tabs and some function definitions you pretty much always end up going over that as wrapping would be worse
<sinzui> abentley, I thought the same. Have I mentioned I hate bash today?
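A small bash sketch of the pitfall abentley and sinzui are chasing (names are illustrative — this is one plausible way a loop can see an empty `$tool`, not the actual generate_streams code): a quoted empty *string* still expands to one empty word, so the loop body runs once, while a quoted empty *array* expansion produces no words at all.

```shell
#!/bin/bash
# An empty string, e.g. from a command substitution that matched nothing.
added_tools=""

# Quoted empty string: the loop runs ONCE with tool="", which is what
# makes a later `rm $tool` unhappy.
runs_string=0
for tool in "$added_tools"; do
    runs_string=$((runs_string + 1))
done

# A genuinely empty array, expanded as "${arr[@]}": zero iterations,
# the loop body is skipped as abentley expects.
added=()
runs_array=0
for tool in "${added[@]}"; do
    runs_array=$((runs_array + 1))
done

echo "$runs_string $runs_array"   # prints "1 0"
```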
<mgz_> right, I'm transfering back home again, will look at any pending reviews when I'm in
<sinzui> mgz. okay. I will set no max length, and let common sense rule
<sinzui> thanks mgz
<sinzui> bugger! I've got mgo errors again. Since saucy is released I suspect it is me and not the code
<abentley> sinzui: Have I mentioned that the heredoc trick works equally well with python?
<sinzui> abentley, no, but I have used it myself
<rogpeppe> TheMue: how come?
<TheMue> rogpeppe: see discussion on juju-gui
<TheMue> rogpeppe: oops, just seeing that the proposal is now somehow faulty
<TheMue> rogpeppe: I've done a revert and then changed the latest wishes
<TheMue> rogpeppe: now the proposal shows too many files :/
<TheMue> rogpeppe: I think I'll simply close this one, take my two changed files and create a new branch :(
<sinzui> jamespage, do I need to upgrade to trusty to get a /usr/share/distro-info/ubuntu.csv that knows about trusty?
<jamespage> sinzui, no - that will be SRU'ed
<jamespage> like right now
<jamespage> (I see it in -proposed)
<sinzui> jamespage, fab. I worried I and CI/CD needed to hack that file to do releases
<abentley> sinzui: I think I have a fix: http://pastebin.ubuntu.com/6258199/
<sinzui> ah
<abentley> sinzui: That line noise at 17 is apparently the way you determine the length of the array.  So it's foo[len(foo)] = bar, or foo.append(bar) in saner languages.
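The idiom abentley decodes, alongside the clearer modern equivalent — a small illustrative snippet (the variable name is made up):

```shell
#!/bin/bash
foo=()

# The "line noise" form: ${#foo[@]} is the array's length, so assigning
# to index ${#foo[@]} writes to the next free slot -- i.e. append.
foo[${#foo[@]}]="first"
foo[${#foo[@]}]="second"

# Modern bash spells the same thing much more readably:
foo+=("third")

echo "${#foo[@]} ${foo[2]}"   # prints "3 third"
```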
<sinzui> +1 abentley
<abentley> sinzui: Pushed.
<rogpeppe> TheMue: here's probably more appropriate
<rogpeppe> TheMue: when you reverted the first time, you reverted the *entire tree* and you'd already merged trunk
<rogpeppe> TheMue: so you've managed to revert the changes in trunk that happened since the revision you reverted to
<TheMue> rogpeppe: oh, now I've seen your reply here
<rogpeppe> wow, verifying a public key pair takes 40 *milliseconds* on my machine
<rogpeppe> i was wondering why juju switch was so slow, and that's the reason
<TheMue> rogpeppe: slow? I have no experience to compare. is it done so often?
<TheMue> mgz: one CL to review => https://codereview.appspot.com/15080044/
<rogpeppe> TheMue: not particularly, but i saw a noticeable delay when running it
<rogpeppe> TheMue: it took 0.25s to run on my machine
<TheMue> rogpeppe: ah, ok
<gary_poster> hey, does anyone have a chance to help paulczar, who is trying to get a charm championship entry finished up, in #juju with what appears to be a juju bug/fragility in https://bugs.launchpad.net/juju-deployer/+bug/1241721 (see comment #2: "agent-state-info: '(error: invalid URL "http://cloud-images.ubuntu.com/releases/streams/v1/index.sjson" not found)'"?
<_mup_> Bug #1241721: juju-deployer never finishes <juju-deployer:New> <https://launchpad.net/bugs/1241721>
<TheMue> rogpeppe: the CL above is the now fresh and correct one
<TheMue> so, guys, I'm off now, well see on Monday in SFO
<TheMue> *wave*
<rogpeppe> i'm also off
<rogpeppe> g'night all
<mgz> gary_poster: responded, I suspect just ec2 falkeiness
<mgz> *flakiness
<gary_poster> thank you very much mgz
<mgz> or something
<gary_poster> yeah, I figured.  arguably fragility it would be nice to be able to handle, but probably reasonable to put that off for another day
<mgz> yeah, it's hard to see where our robustness is falling down exactly, as we also seem to have not logged the failure from provisioning (assuming there was one)
<sinzui> mgz, if you have time, can you review https://codereview.appspot.com/15120043
<mgz> sinzui: just saw that looking for TheMue's one :)
<mgz> tarty would have been a very silly series name :)
<sinzui> did I write that again?
<sinzui> I guess I can expect the same when Unctuous Uakari is not announced
<adam_g> mgz, is it possible to use the ec2 provider with local simplestream data? ie, against a private openstack via ec2 api?
<sinzui> adam_g, I don't fully understand your question, but I can confirm that the tools-url has to be to the same cloud. Eg. I cannot set the tools-url to a location I have built test tools, then use them with the cloud I am testing
<sinzui> adam_g, I have instead uploaded tools and metadata to each cloud, but placed them in a non-standard location and pointed the tools-url to pick them up
<smoser> sinzui, tools-url and simplestreams data url are separate, right?
<smoser> adam_g, is interested in providing simplestreams data url.
<adam_g> tools might be an issue too
<adam_g> i'm interested in using juju against a private openstack cloud via the EC2 API, probably with no internet access
<adam_g> i'd need to specify the AMI ID of the glance image somehow, through a custom simplestream, in the same way i would have done with default-image-id using py juju
<adam_g> and a custom tools-url, i guess
<sinzui> smoser, They /might be/. Juju seems to conflate simplestreams with tools. I don't know if it thinks simplestreams for images is different from simplestreams for tools
<sinzui> hmm
 * sinzui looks at old notes
<sinzui> adam_g, when azure simplestreams was broken in 1.15.0, I could force it to find the correct images doing this:
<sinzui> image-metadata-url: http://cloud-images.ubuntu.com/releases
<adam_g> sinzui, so i'd need to somehow fake the streams there for AWS, and point to images in my cloud?
<smoser> well. only the client actually *needs* the data. hopefully that can be a url like file://
<smoser> i am pretty sure its checking signatures.
<smoser> but maybe any signing key would be ok
<smoser> sstream-mirror can allow you to mirror the http://cloud-images.ubuntu.com/releases data to a local directory
<adam_g> smoser, if only it were that easy.. i need to create a stream of VMDKs :)
<sinzui> file:/// might work. This bug indicates they do work https://bugs.launchpad.net/juju-core/+bug/1223752
<_mup_> Bug #1223752: environs/simplestreams/simplestreams.go leaks test:// and file:// URLs into the http.DefaultClient <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1223752>
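The override sinzui describes would live under the environment's entry in ~/.juju/environments.yaml; a sketch only — the environment name and the tools URL are placeholders, and image-metadata-url is the key sinzui quotes above:

```yaml
# Sketch of an environments.yaml entry using explicit simplestreams
# locations; URLs other than the cloud-images one are placeholders.
environments:
  my-openstack:
    type: openstack
    # Force juju to read image simplestreams data from a known location:
    image-metadata-url: http://cloud-images.ubuntu.com/releases
    # Custom, non-standard tools location (hypothetical endpoint):
    tools-url: https://my-cloud.example.com/juju-tools
```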
<smoser> adam_g, it's not significantly more difficult.
<smoser> adam_g, for openstack, we actually do this.
<smoser> on canonistack
<smoser> and it supports a "hook" to repack the thing it uploads
<smoser> ie, other than adding glance metadata information, i think your use case fits fairly easily into
<smoser> http://bazaar.launchpad.net/~smoser/simplestreams/example-sync/view/head:/cstack-mirror
<adam_g> smoser, unfortunately they cant just be 'repacked'
<smoser> sure they can
<adam_g> smoser, oh?
<smoser> what is "repacked"
<adam_g> converted from a qcow2 image to something that will actually boot on a vmware cluster
<smoser> how are you getting what you have?
<adam_g> ben is using some proprietary tools that come with vmware workstation to convert them on a windows system
<adam_g> atm i have a single precise vmdk that works, and  i'd like to make that available alongside a standard precise image and available to a local cloud via juju
<adam_g> but jeez, even getting juju to talk to my local cloud endpoints is no longer as trivial as setting them in my environments.yaml :|
<smoser> have you tried using vbox convert ?
<smoser> do you know that that fails ?
<adam_g> smoser, no, i haven't
<adam_g> smoser, i have something that works and would like to make that available to my cloud. i'd prefer not to waste another 4 days wrestling with VMDK images.
<smoser> so use the example-sync above, and for "repack" do 'cp some-other-file TARGET-FILE'
<smoser> or just hack the glance upload to do nothing
<smoser> and just return 'your-uuid-here'
<adam_g> smoser, are those synced images then available via an EC2 stream as well?
<smoser> yes
<smoser> what do you mean ec2 ?
<smoser> you want image ids ?
<smoser> in ami-abcfde format?
<mgz> yeah, if he wants to use the ec2 api he'll need that
<mgz> you can still specify your own simplestreams stuff with the plugin bits
<adam_g> smoser, yea
<adam_g> mgz, the endpoints are contained in some stream data too?
<smoser> adam_g, well, then just instead of returning the uuid return the ami-id
<smoser> ami-ids are a PITA
<smoser> that's why they're not implemented in that example-sync
<smoser> i really wanted to do it.
<smoser> but it's difficult because you can't actually say "give me the ami-id for this uuid" anywhere
<smoser> you'd have to crawl all images, and then match on name
<smoser> and that's not actually even guaranteed
<natefinch> Is there documentation on what objects the MaaS API returns?  I see docs on calling the REST API endpoints, but not on what they return
<natefinch> smoser, rvba, anyone else? ^^
<smoser> i dont know about doc
<natefinch> smoser: you said setting up virtual maas on my local machine was probably a bad idea?
<smoser> i would just do it on an instance somewhere
<smoser> (ie canonistack)
<smoser> it does all sorts of stuff that i wouldn't want to deal with
<smoser> ie, remember how it whacks /etc/resolv.conf ?
<natefinch> smoser: fair enough
<natefinch> smoser: right
<smoser> this is the juju charm mentality
<smoser> just do whatever you want to the root.
<smoser> but that doesn't sit so well with "i want my laptop to work"
<natefinch> heh right
<rogpeppe> right, i'm off to bed. taxi arrives in 4 hours.
<rogpeppe> see y'all in sf
<rogpeppe> natefinch: i'm kinda hoping you might have got something through the post :-)
#juju-dev 2013-10-20
<rogpeppe> sleep shmeep
#juju-dev 2014-10-13
<stokachu> anyone ever run into https://bugs.launchpad.net/juju-core/+bug/1380337?
<mup> Bug #1380337: adding machine after destroying another fails <add-machine> <cloud-installer> <destroy-machine> <juju-core:New> <https://launchpad.net/bugs/1380337>
<rogpeppe2> minor update to the juju errors package to use the new location for errgo: https://github.com/juju/errors/pull/8
<rogpeppe2> reviews appreciated
<TheMue> rogpeppe2: LGTM
<rogpeppe2> TheMue: thanks!
<TheMue> rogpeppe2: yw
<rogpeppe2> anyone know anything about the bitbucket.org/kardianos/osext and bitbucket.org/kardianos/service dependencies?
<rogpeppe2> they don't appear to be used by anything
<jcw4> rogpeppe2: aren't they windows dependencies?
<rogpeppe2> jcw4: possibly, but i haven't found anything that uses them
<jcw4> rogpeppe2: I see... it's only in cmd/jujud/main_windows.go that it's even imported
<jcw4> osext is a dependency of service
<rogpeppe2> jcw4: ah, thanks
<rogpeppe2> jcw4: i don't know why i didn't find that
<jcw4> rogpeppe2: I just remember the second level dependency issue when I bumped into the same question recently
<rogpeppe2> jcw4: it's a pity that the dependencies aren't findable by godeps. i should fix that.
<jcw4> yeah. since it's only in the _windows.go files go needs a little prodding to find them I guess
<jcw4> (when you're not on windows)
<rogpeppe2> jcw4: yeah. godeps should try with different build flags.
<bodie_> did someone ping me or is my IRC client highlighting this channel for no reason?  sigh
<jcw4> rogpeppe2 I'm seeing this error in builds:
<jcw4> Extant directories unknown:
<jcw4>  gopkg.in/errgo.v1
<jcw4> for example: http://juju-ci.vapour.ws:8080/job/github-merge-juju/914/console
<jcw4> doesn't make sense to me because nothing in juju master refers to that yet, so it must be a transitive dependency
<jcw4> Also, I can't find any transitive dependencies on that yet either.
<jcw4> mgz: I think it's a build script issue... do you know why it's trying to pull in gopkg.in/errgo.v1 since that isn't in the dependencies.tsv, and doesn't seem to be a transitive dependency either
<thumper> jcw4: is it from the juju/charms package?
<rick_h_> jcw4: that was a branch that landed today
<rick_h_> jcw4: make sure to update, I saw it go by in the email
<jcw4> rick_h_: I am fully updated to juju master
<jcw4> thumper: I'm trying to decipher what "it" is :)
<rick_h_> jcw4: looking
<jcw4> this error is when the build script is initially pulling down dependencies
 * thumper sighs
<thumper> found an intermittent test failure in worker/provisioner
<jcw4> thumper: I've seen worker/provisioner fail a couple times intermittently, but I see worker/peergrouper fail fairly often locally
<rick_h_> jcw4: https://github.com/juju/errors/pull/8 is the pr I was thinking of
 * thumper sighs heavily
<jcw4> rick_h_: yeah, that was why I pinged rogpeppe initially
 * thumper is going to *fix* juju/errors
<jcw4> YAY
<jcw4> what baffles me is why the build script is even pulling it at all
<jcw4> juju-core doesn't reference it yet
<jcw4> and I couldn't find any dependencies of juju-core that referenced it either
<rick_h_> like thumper said, probably juju/charm
<thumper> I was told that errgo wasn't going to sneak into dependent packages
<jcw4> thumper, rick_h_ I don't see how it could have
<thumper> and it was only going to be used outside of juju/errors in the charmstore
<rick_h_> thumper: jcw4 ok, justcccc ccbgjgvcveghhillduiklnblheujreenhtsnkubdrcjf
<rick_h_> bah gueccccsccbgjgvscuneviintrdikfgbvegt
<rick_h_> idhvdfvudkdkjb
<rick_h_> double bah
<thumper> wat?
<rick_h_> called rick sitting in the hotel bumping his yubikey
<thumper> rick_h_: you ok?
<jcw4> haha
<rick_h_> no, I'm still in brussels :P
<thumper> rick_h_: you lucky fella
<jcw4> rick_h_: did you go see the Atomium?
<thumper> rick_h_: I was looking at the schedule for the first few days
<thumper> $ git diff master | wc -l
<thumper> 1731
<thumper> hmm
<thumper> this branch isn't going to break up well
<urulama> thumper: errgo should stay within charmstore limits, yes.
 * thumper puts hacking hat back on and ignores all y'all for a while
<thumper> ah crap...
<thumper> new struct from a merge has broken my work...
 * thumper works around it
<rick_h_> chase them down and make them eat pie
<jcw4> wait, before committing to that course of action... which struct?
<jcw4> rick_h_: isn't it LATE in Brussels?
<jcw4> 11pm or so?
<rick_h_> jcw4: yes
<thumper> jcw4: not you
 * jcw4 wonders if the initial pull for imports uses the HEAD revision of juju/errors to find child imports
<urulama> thumper: btw, let me know if errgo somehow sneaks into the code ...
<jcw4> thumper: whew
<thumper> jcw4: was cmars's branch for login v2
<thumper> urulama: ack
<thumper> urulama: you home?
<urulama> thumper: was having fun trying to QA all the "landed" branches from last week ... could have missed some though
<urulama> thumper: i am
<urulama> thumper: my flight was about 1h 20min ;)
<thumper> urulama: I have a branch for juju/errors that needs updating based on our conversations in nuremberg
<thumper> urulama: I got home yesterday just before lunch
<thumper> urulama: working today/tomorrow taking thu/fri off in lieu
<thumper> but today is finishing off what I started pre-sprint
<urulama> thumper: enjoy :)
<thumper> \o/
<urulama> thumper: seen that errors branch, looks nice
<thumper> I'll see if I can get to it tomorrow
<thumper> or maybe this afternoon if I can fix all the test failures I caused
<thumper> by changing a bucketload of API calls
<urulama> that's always a good thing! :D
 * thumper is full steam ahead on finishing identity bits
<thumper> OOPS: 268 passed, 6 FAILED, 5 PANICKED
<thumper> need to fix the failed and panicked
<jcw4> yikes
<thumper> :)
<jcw4> :)
<jcw4> where is the code that actually runs the CI build?  I don't think it's the jenkins-github-lander repo
<urulama> don't panic
<thumper> urulama: thanks for that :)
<thumper> jcw4: not entirely sure sorry
<jcw4> thanks thumper you can ignore me while you're dealing with your panic attacks
<jcw4> :)
<thumper> cheers buddy
 * thumper goes back to it
<jcw4> I bet its https://launchpad.net/juju-ci-tools
<mgz> jcw4: rog borked it
<jcw4> mgz: yay :)
<mgz> with this change: https://github.com/juju/errors/pull/8
<jcw4> mgz: it seems like the build script might be getting imports from the HEAD revisions of first level imports?
<jcw4> mgz: right, but that's just on the errors repo, and juju-core wasn't updated to use that revision in the dependencies.tsv
<mgz> yeah, something's up
<mgz> godeps likely doesn't do exactly the right thing
<jcw4> mgz: yeah, interesting
<mgz> I'll have a poke around
<thumper> mgz: just revert the merge for now
<thumper> mgz: poke around later
<mgz> jcw4: the repo is just getting lp:juju-release-tools and running `./make-release-tarball.bash master`
<thumper> mgz: this is how we deal with broken builds
<thumper> dvcs FTW
<mgz> thumper: fair enough
<jcw4> thumper: thanks!
<jcw4> mgz: thanks!
<jcw4> mgz: I see the problem I think
<jcw4> compare_dependencies in check_dependencies.py doesn't seem to like any packages that are not accounted for in the dependencies.tsv file
<jcw4> I bet godeps is just pulling down second level imports before updating first level imports to the specified revision
<jcw4> which would bring gopkg.in/errgo.v1 into the pristine GOPATH, and then compare_dependencies complains
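A toy reconstruction of the check jcw4 is describing — not the actual compare_dependencies code, just its shape: directories present in the pristine GOPATH that dependencies.tsv does not account for get reported as unknown. The package lists here are illustrative.

```shell
#!/bin/bash
# Packages pinned in dependencies.tsv (illustrative subset):
known="github.com/juju/errors
github.com/juju/loggo"

# Directories actually present in the pristine GOPATH after go get -d
# followed tip-of-master imports (errgo.v1 snuck in transitively):
extant="github.com/juju/errors
github.com/juju/loggo
gopkg.in/errgo.v1"

# comm -13 on sorted input prints lines only in the second set,
# i.e. the directories the dependency file knows nothing about.
unknown="$(comm -13 <(sort <<<"$known") <(sort <<<"$extant"))"
echo "Extant directories unknown: $unknown"
```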
<jcw4> mgz: thumper, rogpeppe2 -- I reverted https://github.com/juju/errors/pull/8 so that juju-core will build again.
<jcw4> rogpeppe2: sorry! Hopefully we can figure out the build process so that we can land your change again
<thumper> jcw4: kk
#juju-dev 2014-10-14
 * thumper tries to remember menno's git trick
<dimitern> morning all
<dimitern> any willing reviewers around?
<dimitern> I need a review on http://reviews.vapour.ws/r/167/ (req by hazmat) and this trivial one http://reviews.vapour.ws/r/168/
<hazmat> g'morning
<rogpeppe2> two trivial PRs: one to update depdendencies.tsv (https://github.com/juju/juju/pull/906), the other to fix a sporadic test failure i just saw (https://github.com/juju/juju/pull/907)
<rogpeppe2> dimitern: ^
<dimitern> rogpeppe2, looking in a moment
<rogpeppe> dimitern: ta
<dimitern> rogpeppe, both LGTM, second one with a question
<dimitern> rogpeppe, can you review those two of mine I posted earlier? ^^
<rogpeppe> dimitern: yes, that character is produced by Duration.String
<rogpeppe> dimitern: that's why the test was failing
<dimitern> rogpeppe, fancy :)
<rogpeppe> dimitern: can i still merge with $$merge$$ or do i now have to go through vapour.ws ?
<dimitern> rogpeppe, you can *only* merge the PR, RB is only for reviewing
<rogpeppe> dimitern: ah, cool
<dimitern> rogpeppe, so can you? :)
<rogpeppe> dimitern: am doing. a bit of an unequal swap, if I may say so :)
<dimitern> rogpeppe, well, at least the trivial one, not the FwNone one if you prefer
<dimitern> hazmat, hey
<dimitern> hazmat, have a look at http://reviews.vapour.ws/r/167/ if you want - it's what your asked for I think
<hazmat> dimitern, did we decide anything last week re going back to github?
<hazmat> for reviews
<dimitern> hazmat, no, in fact we decided to automate the RB workflow by creating RB reviews and closing them as PRs appear
<dimitern> ..and get merged
<hazmat> dimitern, cool, i'm still in meetings atm. i'll take a look though
<dimitern> hazmat, cheers
<rogpeppe> dimitern: reviewed the trivial one
<dimitern> rogpeppe, tyvm
<mattyw> morning all
<TheMue> morning btw
<voidspace> TheMue: morning
<voidspace> restarting computer
<voidspace> brb :-)
<dimitern> morning TheMue, voidspace, mattyw
<dimitern> :)
<voidspace> dimitern: morning
<mattyw> dimitern, morning
<TheMue> voidspace: done any steps regarding the installation of MAAS on gremlin? there is a controller running, but I don't know user and/or pw
<TheMue> voidspace: looks like a fresh installation, the boot images are not yet imported (according to a message shown when trying to log in)
<voidspace> TheMue: not me - probably elmo :-)
<voidspace> TheMue: I think elmo talked to jam about it
<voidspace> TheMue: jam is not around at the moment it seems
<voidspace> TheMue: I'm going to install locally on my machine
<TheMue> voidspace: ok, I'll try to see if the superuser is created via cli
<voidspace> TheMue: but into a kvm image (probably bridged)
<voidspace> TheMue: cool
<TheMue> voidspace: kvm inside parallels? or extra machine?
<rogpeppe> dimitern: interesting, it appears that the utf-8 microsecond symbol has appeared after go 1.3 (in tip)
<dimitern> rogpeppe, so it fails earlier
<voidspace> TheMue: on my desktop - which is native ubuntu
<dimitern> rogpeppe, how about adding "u" to the [charset] in the regex?
<rogpeppe> dimitern: i'm about to do that
<TheMue> voidspace: ok
<voidspace> TheMue: with plenty cpu / ram / hard drive
<voidspace> TheMue: unlike my laptop....
<TheMue> voidspace: ah, you are the one who took the orange box with him
<voidspace> haha, I wish
<TheMue> yeah, nice toy
<voidspace> TheMue: did you create a MaaS superuser on gremlin?
<voidspace> TheMue: ah, I see you did
<voidspace> or someone did
<TheMue> voidspace: someone
<TheMue> voidspace: not me
<voidspace> TheMue: it's the same user(name) we had before
<voidspace> TheMue: it was probably still elmo - I told him what username and password we used
<TheMue> voidspace: interesting
<TheMue> aaah
<voidspace> he wasn't impressed with our high-security password :-)
<TheMue> hehe
<TheMue> absolutely
<TheMue> that's a typical superuser problem, instead of having an administrator group and adding individual users to it
<voidspace> right
<TheMue> when it's MY machine it's no problem, but when sharing it ...
<voidspace> TheMue: interesting - I added a new VM (kvm instance) and it has automatically shown up in the MaaS cluster
<voidspace> I need to add power details
<voidspace> at least when I manually started it it came up
<TheMue> voidspace: nice
<voidspace> TheMue: what's the format for power address and power id?
<voidspace> I can look it up
<TheMue> voidspace: hmm, here it may be enough to add the maas user to the libvirt group
<voidspace> yep, needed anyway I think
<TheMue> voidspace: http://www.teale.de/tealeg/computing/cloud/kvm_maas_juju_openstack.html#sec-5-4
<voidspace> done
<voidspace> TheMue: we still need to generate a key for the maas user I think
<TheMue> voidspace: seems so
<voidspace> hmmm... although it's running on the host
<voidspace> but it will still use ssh+qemu I think
<TheMue> voidspace: btw, did you start the boot image import? otherwise I'll do it now
<voidspace> I didn't
<TheMue> voidspace: ok, I start it
<voidspace> I'm downloading a utopic iso image locally
<voidspace> my internet connection is horrible :-/
<TheMue> voidspace: hmm, and here I've got troubles with the boot images link, I get a "not found"
<TheMue> voidspace: ah, the link misses the "MAAS" part of the URI, interesting
<voidspace> TheMue:  sudo -u maas virsh -c qemu+ssh://maas-admin@localhost/system list --all
<rogpeppe> two trivial PRs for review: https://github.com/juju/utils/pull/44 and https://github.com/juju/charm/pull/62
<rogpeppe> anyone know what can cause an "Extant directories unknown" error in the 'bot ?
<rogpeppe> http://juju-ci.vapour.ws:8080/job/github-merge-juju/920/console
<rogpeppe> ah, i know.
<rogpeppe> well, perhaps.
 * rogpeppe tries something.
<luca> ?
<voidspace> TheMue: can you see the node?
<voidspace> TheMue: I see one node with the status Ready
<voidspace> and I need coffee
<TheMue> voidspace: strange, have the headline "0 nodes in gremlin MAAS" here in the UI
<voidspace> TheMue: how odd
<TheMue> voidspace: I'm logged in with our admin user
<voidspace> TheMue: I'm logged in as maas-admin
<voidspace> maas-test-controller - 1 node in this MAAS
<voidspace> our node is called "useless-weight"
<voidspace> which is a pretty cool name
<TheMue> virsh list --all shows me maas-test-controller running and maas-test-1 shut off
<voidspace> that's correct
<voidspace> the node is not running
<TheMue> but I don't see any "useless-weight"
<TheMue> only 0 nodes :,(
<voidspace> I've logged out and back in again and I still see it
<voidspace> that's very odd
<jcw4> rogpeppe: I figured out what was causing that Extant directories issue
<voidspace> TheMue: now I am home my power supply works fine...
<voidspace> which is good, but also odd
<TheMue> voidspace: oh, one moment, do you logged into the maas controller on gremlin or inside the image maas-test-controller?
<voidspace> TheMue: I am logged into the maas controller (web view) on gremlin
<voidspace> My local url being http://localhost:8080/MAAS/
<voidspace> and I'm running ssh forwarding
<voidspace> ssh gremlin -L 8080:10.124.0.11:80
<dimitern> voidspace, you could run a vpn connection to access gremlin's internal ip directly - that's what I use
<TheMue> voidspace: ah, so we're using different controllers
<voidspace> dimitern: this is easier
<voidspace> TheMue: ah, yes
<voidspace> maas-controller image is a different maas
<voidspace> maas-test-controller
<voidspace> heh, that's why
<voidspace> TheMue: I'm using the MaaS that's installed *on gremlin*
<TheMue> voidspace: me too
<TheMue> voidspace: but then you don't need port forwarding, you can use gremlin/MAAS directly
<voidspace> TheMue: if I use the VPN ?
<voidspace> this works and is easy enough
<TheMue> voidspace: the forwarding was only needed for a MAAS controller inside a KVM
<TheMue> voidspace: I start the vpn and then can talk to port 80 on gremlin directly
<voidspace> right, I'm not using a vpn
<voidspace> so I need the port forwarding
<TheMue> voidspace: yeah, but you're talking to .11 (that's the KVM I installed on Friday), not to .10 (which is gremlin itself running a controller too)
<voidspace> oh really!
<TheMue> :D
<voidspace> hah
<voidspace> so the kvm image appeared there automatically
<voidspace> I didn't manually add the node
<voidspace> how funny
<voidspace> I just copied the port forwarding from my laptop
<voidspace> I didn't check the ip address
<TheMue> *lol*
<TheMue> so I still would like to know which node you see if only one KVM instance is running, the controller itself *stunning*
<voidspac_> TheMue: installing MaaS locally into a KVM image has seized the keyboard / mouse
<voidspac_> TheMue: and the key combination it wants to release it doesn't appear to exist on my keyboard (I've remapped some of the keys...)
<voidspac_> TheMue: so even when only the controller is running the cluster knows about the other node
<voidspac_> TheMue: it shows it as in a Ready state
<voidspac_> TheMue: I'm going on lunch
<voidspac_> back in a bit
<dimitern> waigani, hey, are you around for a review?
<dimitern> http://reviews.vapour.ws/r/167/
<waigani> hey dimitern, sure, just about to head for lunch though
<dimitern> waigani, sure, np - when you can, have a look please :)
<rogpeppe> jcw4: oh yes?
<jcw4> rogpeppe: it's a godeps issue I believe
<rogpeppe> jcw4: entirely probable...
<alexisb> jcw4, it is before 6am for you, thats dedication
<jcw4> (coupled with a juju-release-tools weirdness)  lol
<jcw4> alexisb: gotta maximise overlap
<jcw4> and right back at you alexisb
<jcw4> :)
<jcw4> rogpeppe: godeps seems to follow second level imports at the HEAD revision of first level imports
<jcw4> rogpeppe: before updating the first level imports to the dependencies.tsv revision
<rogpeppe> jcw4: when updating, godeps doesn't follow imports at all AFAIR
<alexisb> jcw4, I am in Brussels so it is normal time for me
<jcw4> rogpeppe: yeah, I think it does
<jcw4> alexisb: whew
<jcw4> :)
<jcw4> rogpeppe: it does a go get -d ....
<jcw4> which seems to be a little recursive
<rogpeppe> jcw4: go get -d shouldn't follow imports
<jcw4> hmm
<rogpeppe> jcw4: (that's the point of the -d flag)
<jcw4> rogpeppe: when I was testing yesterday it seemed to
<rogpeppe> jcw4: ah, you're right
<rogpeppe> jcw4: i misread/misremembered what that flag did
<jcw4> rogpeppe: of course, since the juju-release-tools check requires that no *extra* packages are downloaded, it causes that Extand directories error
<jcw4> Extant even
<rogpeppe> hmm, i wonder what the best way is of just fetching a repo without its dependencies
<jcw4> rogpeppe: I can't think of a way to work around this other than to relax the juju-release-tools check
<mgz> rogpeppe, jcw4: I should probably email the list explaining,
<mgz> but I think we can just fix
<jcw4> mgz: I'm on the edge of my seat waiting for the rest of that comment
<mgz> the issue is go get pulls in tip of master and all dependencies of tip of master, which somewhat conflicts with the concept of having clear and well defined deps
<jcw4> mgz: right
<mgz> so, can either not use go get in godeps, or make the tarball creation delete extranious things go get went and got and trust the build to blow up if they did actually matter
<rogpeppe> mgz: yeah. i think godeps should probably avoid using go get
<jcw4> well I'm guessing we're more likely to 'fix' godeps than go get
<mgz> rogpeppe: I had a brief go at that last night, it's a little annoying
<rogpeppe> mgz: yeah, there's a fair amount of logic in the go tool that would need duplication
<mgz> but is possible to add a Clone() to VCS and do our own mapping of how to get launchpad.net/ github.com/ etc
<rogpeppe> jcw4: well, go get is working as advertised, so it doesn't really need fixing
<mgz> the dep collection is at least already there and could be reused
<rogpeppe> jcw4: except i guess to add another flag to specify that deps should not be downloaded
<mgz> yeah, that's sort of what we want
<jcw4> rogpeppe: I wonder if we used tags instead of head revisions in the import url
<mgz> jcw4: we still have the problem that go get always gets master
<rogpeppe> jcw4: go get does what it does. no way to specify tags to go get
<mgz> not tags
<jcw4> rogpeppe, mgz I thought if the import path ended with .something go get looked for a something tag to check out
<rogpeppe> jcw4: i don't think so
<jcw4> hrm - I would expect go get would want to be more deterministic too
<mgz> rogpeppe: I think, as a fix for now, I'm going to delete unknown dirs, and trust that things will break
<rogpeppe> mgz: i'm not sure that's quite in the spirit of that check
<rogpeppe> mgz: isn't the point of that check to make sure that we include godeps info for all our deps?
<mgz> yes
<rogpeppe> mgz: ah, i see
<rogpeppe> mgz: sgtm
<rogpeppe> mgz: if we're really dependent on it, then it should break
<jcw4> +1
<mgz> ick... the script is so nice and safe at present
<mgz> making it rmtree stuff is a little scary
<jcw4> mgz: lol
<jcw4> mgz, rogpeppe would using gopkg.in-like urls for dependencies avoid this issue?
<rogpeppe> jcw4: no
<mgz> nah
<jcw4> :(
<rogpeppe> mgz: please let me know when i can try to re-land that branch...
<mgz> rogpeppe: will do
<mgz> rogpeppe: done change, writing some tests before proposing/landing
<rogpeppe> mgz: did you manage to get those changes in?
<marcoceppi_> what's the configuration option to disable apt-get upgrade on add machine and is it in 1.20 ?
<jrwren> marcoceppi_: enable-os-upgrade: false    i'm pretty sure its not in 1.20.
 * marcoceppi_ raises fists to the sky, falls to his knees, and yells NOOOOOOOOOOOOOOOO
<jrwren> marcoceppi_: There is master. It can be your friend.
<marcoceppi_> sure, run a demo using tip of tip, what could go wrong
<mbruzek> marcoceppi_: https://bugs.launchpad.net/juju-core/+bug/1350493
<mup> Bug #1350493: 1.20.x local provider not running apt-get update <charms> <regression> <juju-core:Fix Released by cox-katherine-e> <https://launchpad.net/bugs/1350493>
<thumper> morning folks
<jcw4> hi thumper
<jcw4> waigani: I have a question about state localID and docID
<waigani> jcw4: go for it
<jcw4> :)
<thumper> waigani: still working?
 * thumper looks around for fwereade
<jcw4> does it make sense to do a prefix check on docID like there is in localID?
<waigani> thumper: i need to talk to you about the watcher and localID/docID
<jcw4> I'd like to be able to safely pass an environment prefixed id to docID
<waigani> oh right
<jcw4> I could just do docID(localID(id))
<jcw4> but ...
<waigani> no, ugly
<jcw4> yep
<waigani> why do you need to docID() if it already has the prefix?
<waigani> i'm guessing because you are not sure if it has it or not
<jcw4> waigani: I have a method that might be called with either
<jcw4> yep
<waigani> i would argue that the code would be easier to read if you did the check in that method
<jcw4> waigani: okay, that's plausible
<thumper> waigani: did you want a hangout?
<jcw4> tx waigani
<waigani> that way, it is clear that the method handles an ambiguous id
<jcw4> good point
<waigani> which, i for one, would be glad to spot when reading your method
<jcw4> interesting
<jcw4> so... if the method today does docID(id)
<waigani> then I'd assume that id is a localID
<waigani> no further checking needed
<jcw4> okay.  Thanks again
<waigani> i.e. do the prefix check in your method
<waigani> thumper: okay
<jcw4> yeah
<waigani> thumper: hangout channel?
<thumper> waigani: you mean standup hangout?
<waigani> lol - yeah, sigh
<waigani> it's late
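The pattern waigani is recommending can be sketched like this: the method that may receive either form does the prefix check itself, so callers never need the ugly docID(localID(id)) dance. The names and UUID here are illustrative, not the real state-package code:

```go
package main

import (
	"fmt"
	"strings"
)

// envUUID stands in for the environment UUID used to prefix doc IDs.
const envUUID = "env-1234"

// ensureDocID accepts either a bare local ID or an already-prefixed doc ID,
// making it explicit that this method handles an ambiguous id.
func ensureDocID(id string) string {
	if strings.HasPrefix(id, envUUID+":") {
		return id // already a doc ID
	}
	return envUUID + ":" + id
}

func main() {
	fmt.Println(ensureDocID("u#wordpress/0"))          // env-1234:u#wordpress/0
	fmt.Println(ensureDocID("env-1234:u#wordpress/0")) // env-1234:u#wordpress/0
}
```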
<thumper> cmars: hey, you around?
#juju-dev 2014-10-15
<bodie_> review would be appreciated.  small apiserver addition https://github.com/juju/juju/pull/909
<thumper> axw: o/
<axw> thumper: \o
<axw> glad to see you're alive after friday night
<thumper> heh
<thumper> I should learn to pace myself better
<thumper> but I do like mojitos
<axw> :) they weren't bad at all
<thumper> $ juju user list --show-disabled
<thumper> NAME     DISPLAY NAME  DATE CREATED    LAST CONNECTION
<thumper> admin    admin         33 minutes ago  just now
<thumper> foo      Foo Bar       20 seconds ago  not connected yet (disabled)
<thumper> thumper  Tim Penhey    31 minutes ago  not connected yet
<thumper> getting there...
<thumper> axw: I don't suppose you could look at https://bugs.launchpad.net/juju-core/+bug/1380337 could you?
<mup> Bug #1380337: adding machine after destroying another fails <add-machine> <cloud-installer> <destroy-machine> <juju-core:New> <https://launchpad.net/bugs/1380337>
<axw> looking
<wallyworld_> thumper: axw: i've already looked into that bug - it only affects 1.20, and it's a dup
<thumper> wallyworld_: oh hai
<axw> wallyworld_: okey dokey. do you want me to continue looking, or you know the root cause already?
<wallyworld_> o/
<wallyworld_> axw: i have an idea, just needs to be fixed
<axw> wallyworld_: oh, is this the tools bug you mentioned on the hangout?
<wallyworld_> yeah
<axw> ok
<wallyworld_> axw: what's the status of bug 1360605 which is marked as In Progress
<mup> Bug #1360605: support maas zones for automatic az placement <constraints> <maas-provider> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1360605>
<axw> wallyworld_: needs review of a gomaasapi branch
<axw> then it should be good to merge
<wallyworld_> oh yeah, that's right
<wallyworld_> i'll do that
<thumper> well that's confusing
<thumper> time.Time.Add takes a duration and returns a time
<thumper> time.Time.Sub takes a time and returns a duration
<wallyworld_> awesome
<thumper> WT actual F
<thumper> ./user_info_test.go:144: constant -2.1 truncated to integer
<thumper> I have to put a f there?
<thumper> stabby stabby
<thumper> -2.1e1 is fine
<thumper> -2.1e0 gets turned into an int
<thumper> wtf
<thumper> ah...
<thumper> used with int64 and *
<thumper> FFS!!!!
<thumper> debug-log is broken with the local provider again since all-machines.log doesn't exist
<mwhudson> thumper: -2.1e1 is an integer, isn't it?  given that constant calculations have infinite precision
<thumper> mwhudson: according to the language spec it is a float constant,
<thumper> mwhudson: the issue here is that it was being used in a compile time multiplication of two constants
<thumper> so the float was being co-erced into an int
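What thumper hit follows from Go's untyped-constant rules: a float constant may be used in an integer context only if its value is exactly representable as an integer, and multiplying a constant by a typed int64 value forces that conversion at compile time. A minimal reproduction (variable names are made up, not the test's actual code):

```go
package main

import "fmt"

func main() {
	var n int64 = 10

	// -2.1e1 is the untyped float constant -21, which is exactly
	// representable as an integer, so it converts to int64 implicitly:
	a := -2.1e1 * n // a == -210, type int64

	// -2.1 has a fractional part, so the same expression fails to compile:
	// b := -2.1 * n // error: constant -2.1 truncated to integer

	// Doing the arithmetic in float64 and converting afterwards works:
	c := int64(-2.1 * float64(n)) // c == -21

	fmt.Println(a, c)
}
```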
<wallyworld_> axw: i think bug 1356886 can be closed now, right?
<mup> Bug #1356886: failed add-machine ssh:  leaves behind garbage in state <14.10> <add-machine> <manual-provider> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1356886>
<axw> wallyworld_: kinda. it will clean up eventually now. it would be nice if we removed it immediately, but it's not that simple
<axw> I think we can close it
<thumper> wallyworld_: what was the lxc issue?
<wallyworld_> thumper: looks like destroy-machine removes stuff under <local-root>/storage which includes tools required to start new machines
<wallyworld_> thumper: that debug-log issue sounds like a stop the line regression doesn't it
<thumper> wallyworld_: hmm (re: lxc), and I'd say so w.r.t. debug-log
<thumper> wallyworld_: not sure when it stopped working
<thumper> been a while since I've had a local environment running
<thumper> where I cared about the lgs
<thumper> logs
<wallyworld_> wayne needs to fix it, since i think he may have broken it
<wallyworld_> we can stop the line later when he comes on :-)
<thumper> haha
<axw> wallyworld_: PTAL https://code.launchpad.net/~axwalk/gomaasapi/maas-testserver-zones/+merge/236447
<wallyworld_> sure
<wallyworld_> axw: what's the go with the agent_name handling?
<axw> wallyworld_: replied to your comment inline
<wallyworld_> hmmm, didn't see the reply, let me look again
<axw> hrm, doesn't show up after I pushed again.. one sec
<axw> wallyworld_: Correct. From https://maas.ubuntu.com/docs1.5/api.html#nodes:
<axw> "param agent_name:
<axw>  	An optional agent name to attach to the acquired node."
<axw> (under the "acquire" operation notes; for "list" it's a filter value)
<axw> (in reply to your question about agent_name being a side effect)
<wallyworld_> ah, there's a link you need to click to show the diff comments
<axw> where's that?
<axw> ah, I see
<wallyworld_> hmm, not sure i like that api design, but it is what it is
<wallyworld_> axw: +1, land away
<axw> wallyworld_: thanks
<wallyworld_> np
 * thumper sighs deeply
<thumper> wallyworld_: take a look in #juju
<wallyworld_> ok
<thumper> wallyworld_: you'd think this was tested right?
<wallyworld_> thumper: i would have thought so, but i haven't been involved with HA so much so am not sure
<wallyworld_> HA has been problematic at times though I seem to recall
<dimitern> morning all
<jam> morning dimitern
<dimitern> hey, jam!
<alexisb> wallyworld_, you still around?
<wallyworld_> alexisb: hi
<alexisb> hey there wallyworld_
<wallyworld_> having fun?
<alexisb> wallyworld_, totally!
<alexisb> ;)
<waigani> fwereade: hey you about?
<fwereade> waigani, heyhey
<waigani> fwereade: hello, could I talk something over with you: watchers + localID/docID
<fwereade> waigani, sure, can I have a quick ciggie while you start a hangout please?
<fwereade> waigani, or I could do irc if you'd rather
<waigani> fwereade: for sure, I'll start a hangout
<waigani> fwereade: those things will kill you btw
<waigani> fwereade: https://plus.google.com/hangouts/_/g2voow4a3pvpvu6jwsnyrqlwv4a?hl=en
<waigani> fwereade: my computer is spazzing out, hang on
<fwereade> waigani, np, I'll stay in that one, just say if you need to start a new one
<waigani> fwereade: https://plus.google.com/hangouts/_/g2wu75d5xxebxhb3pw6jwusynya?hl=en
<mattyw> morning all
<waigani> fwereade: can we have another quick chat please?
<waigani> fwereade: https://plus.google.com/hangouts/_/gqitbaigxm45nkusctyi7lh5uya?hl=en
<fwereade> waigani, sorry, there now
<fwereade> waigani, shall I start a new one?
 * fwereade does
 * fwereade bbs
<voidspace> ok, off to a baby scan - hopefully on my return we know if the baby is a boy or a girl
<voidspace> biab
<marcoceppi_> enable-os-upgrade: false is one of the greatest kept secrets of 1.21
<wesleymason> Can I beg someone to give some love^Wreviewtime to https://github.com/juju/juju/pull/689 ?
<voidspace> hmmm.... screwed my network onfiguration :-)
<voidspace> *configuration
<voidspace> so, on my laptop for the moment...
<w7z> the joys of developing networking for juju?
<voidspace> w7z: yep :-)
<voidspace> w7z: created a bridge for MaaS and it worked fine until I rebooted and lost all network
<voidspace> w7z: I thought it was going a bit too well...
<w7z> >_<
<mfoord> and back...
<w7z> wesleymason: the branch looks fine, I'd nitpick the error message but don't want to delay you further
<wwitzel3> alexisb: yep
<TheMue> mfoord: a result of the scan?
<mfoord> TheMue: boy :-)
<TheMue> mfoord: yay, grats. am I right, your second one, isn't it?
<mfoord> TheMue: that's right - first one is a girl
<mfoord> so we'll have one of each colour :-)
<TheMue> mfoord: hehe, yeah, well done
<TheMue> dimitern: take a look here: http://paste.ubuntu.com/8565153/
<TheMue> dimitern: the action guys get this error
<jcw4> it's when running ./worker/firewaller tests on master
<dimitern> TheMue, looking
<TheMue> dimitern: thanks
<dimitern> TheMue, gah... it seems like a timeout issue - i.e. the worker stops *before* entering the loop() and returning the error.. I'll have a look and propose a fix
<dimitern> jcw4, is it readily reproducible?
<jcw4> dimitern: for me it seems to be every time
<dimitern> jcw4, sorry about the trouble, but that's actually great to test the fix :)
<jcw4> dimitern: I can pull a branch from you to test any fixes if you like
<dimitern> jcw4, will you be around for ~1h more?
<dimitern> maybe less even
<jcw4> yep; beginning of my day
<dimitern> jcw4, ah, sweet, will ping you when I propose the branch
<jcw4> cool
<katco> menn0: ha, we just finished the same review
<katco> menn0: poor thumper now has 37 comments on his PR :p
<menn0> katco: whoops
<menn0> katco: hopefully we didn't overlap too much
<katco> menn0: hehe i'm sure we did on things like docstrings
<menn0> menn0: well that'll teach him :)
<menn0> katco: I'm currently reviewing bogdanteleaga's PRs (on Github) so let's not both do those
<menn0> katco: we should probably co-ordinate for the rest of the day as well :)
<katco> menn0: lol ok
<menn0> katco: it's not normally a problem because our time zones don't overlap much but I'm working from the London office today
<katco> menn0: it looks like review board is taken care of
<katco> menn0: ah i see
<menn0> menn0: my bad. I should have thought about this.
<katco> menn0: it's not a big deal. i'm familiarizing myself with code as well, so it's not wasted effort.
<dimitern> TheMue, jcw4|afk - the fix is ready for review -  http://reviews.vapour.ws/r/172/ -- if you can have a look and approve it (other reviewers are welcome as well), I'll merge it tomorrow morning
 * dimitern reached eod
<TheMue> dimitern: looking
<mattyw> calling it a day everyone, take care all
<jrwren_> can I haz digital ocean provider? https://www.digitalocean.com/community/tutorials/an-introduction-to-droplet-metadata
<bodie_> there's an unofficial one, fwiw
<bodie_> https://github.com/kapilt/juju-digitalocean
<bodie_> uses manual provider sadly
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1381626 1381632
<sinzui> katco, there are two kinds  of unit-tests failures introduced in the last few days, bug 1381626  and bug 1381632. Can you help get developers to look into them?
<mup> Bug #1381626: TestActionFail fails on ppc64el <ci> <ppc64el> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1381626>
<mup> Bug #1381632: firewaller_test.go TestStopsAfterGettingMode fails on utopic <ci> <regression> <test-failure> <utopic> <juju-core:Triaged> <https://launchpad.net/bugs/1381632>
<jcw4> sinzui, katco fwiw I'm investigating that too
<sinzui> jcw4, thank you for helping
<jcw4> All three of the common errors are related to a permission error
<jcw4> somewhere between lines 1409 and 1416 of apiserver/uniter/uniter_base.go
<jcw4> I'm suspecting a Tag issue
<jcw4> sinzui: well, I feel on the hook since my commit was cited
<jcw4> :)
<jcw4> katco: to be clear I'm investigating the first one (ActionFail) right now; I haven't looked at the Firewaller one, but I'm 90% sure that dimitern has a PR up that will fix that one
<sinzui> jcw4, I don't believe permission issues are arch related...I don't know why a single arch would start failing. I am very puzzled
<jcw4> sinzui: agreed
<jcw4> katco: (sinzui) this was the PR that I think will fix bug 1381632 http://reviews.vapour.ws/r/172/
<mup> Bug #1381632: firewaller_test.go TestStopsAfterGettingMode fails on utopic <ci> <regression> <test-failure> <utopic> <juju-core:Triaged> <https://launchpad.net/bugs/1381632>
<jcw4> I was getting the same error and that PR branch fixed it for me
<jcw4> TheMue, sinzui since http://reviews.vapour.ws/r/172/ is approved to :shipit: can we $$merge$$ it now, or wait til dimitern does it tomorrow?
<sinzui> jcw4, the two regressions will block merges until they are fixed
<jcw4> sinzui: I think this PR *is* a fix for one of those regressions
<jcw4> so maybe I should $$fixes-LP-1381632$$ it
<sinzui> jcw4, oh, then anyone add fixes-1381632 to the merge. You don't need "$$"
<jcw4> cool
<katco> sorry was eating lunch
<jcw4> sinzui: it looks like I triggered that merge prematurely - I'm still confident it's the right fix, but dimitern hadn't finished the full test suite check on that branch
<jcw4> katco: 'sok I'm getting excited about my lunch pretty soon here too
<jcw4> :)
<katco> lol
<katco> sinzui: just reading through the backlog... it looks like you could use help investigating 1381626?
<jcw4> katco: yeah - that's the one with my merge implicated, but I don't see a direct correlation
<katco> jcw4: ok i'll start investigating
<jcw4> ta
<sinzui> katco, I suspect ppc is vulnerable to failures in that part of the tests, something made the tests go from intermittent to always fail. As this is trusty, I cannot see a correlation with "permission denied"
<katco> sinzui: interesting that it would begin breaking all of the sudden
<katco> sinzui: it must be one of the changes, unless something has changed with the scripts you guys use
<jcw4> my change that triggered it only renamed the params.ActionItem to params.Action struct
<jcw4> the structure of the struct is identical, just the name changed
<sinzui> katco, we have not changed the tests since their last pass.
<katco> sinzui: ok, well i'll keep digging in
<sinzui> katco, We see warnings that imply the machine's clock is behind?
<sinzui> tar: juju-core_1.21-alpha2/src/gopkg.in/yaml.v1/README.md: time stamp 2014-10-15 17:50:59 is 44.413476891 s in the future
<katco> sinzui: interesting, but i don't know why that would cause an issue as long as the relative time remains stable during testing
<sinzui> katco, yep
<jcw4> katco: the source of the error is here: https://github.com/juju/juju/blob/master/apiserver/uniter/uniter_base.go#L1410-L1415
<jcw4> I wish I'd made two slightly different errors, or else added logging here
<jcw4> all three of the Action related test failures come back to that block
<katco> jcw4: thank you. has anyone run a git bisect?
<jcw4> katco: I can't even repro the failure locally
<katco> jcw4: ah ok. i might have some tricks. please hold (elevator music)
<jcw4> haha
<jcw4> katco: one of those tricks doesn't include access to a ppc64 machine does it?
<katco> jcw4: lol no, although i keep wondering if that might be handy since we seem to see these with some regularity
<jcw4> yeah
<katco> jcw4: fails for me on tip: go test -compiler=gccgo github.com/juju/juju/api/uniter/...
<katco> jcw4: but looks like it might be failing in a different spot
<sinzui> bugger
<sinzui> jcw4, thank you for finding the PR, it indeed fixed the utopic tests.
<jcw4> yay!
<jcw4> about the last comment, not katco's comment: bugger to that too
<sinzui> Now I see a compilation error in ppc64el tests.
<jcw4> :(
<katco> sinzui: do you have handy a commit point where this was working?
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1381626 1381671
<sinzui> katco, commit 1a7538d6 was the last passed
<katco> sinzui: ty
<jcw4> nice; I repro the same error with --compiler=gccgo
 * jcw4 goes for long anticipated lunch
<katco> sinzui: something to think about while i investigate: do we run tests with gccgo to gate merges?
<sinzui> katco, no, but we agreed to do it this cycle: https://docs.google.com/a/canonical.com/document/d/1WMeul2xZNOE1vxjj5Vb8Itl-LsGNZeBQurnv76kseGE/edit#heading=h.vxbalgtua4zd
<katco> ah cool
<katco> jcw4|nomnom: hey when you get back, i could use some help debugging your change. i'm not that familiar with facades yet, but it looks like tag parsing might be to blame? not sure.
<katco> wallyworld: are you there yet?
<wallyworld> katco: hi, finishing coffee :-)
<katco> wallyworld: yum :)
<katco> wallyworld: i need to EOD, but didn't want to wait until stand-up to hand this off: https://bugs.launchpad.net/juju-core/+bug/1381626
<mup> Bug #1381626: TestActionFail fails on ppc64el <ci> <ppc64el> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1381626>
<wallyworld> ok, let me look
<katco> wallyworld: i'm pretty sure someone who knows facades better will be able to hop right to the issue since i've done the leg-work
<wallyworld> katco: ty, will look into it
<katco> wallyworld: hopefully you have it solved b/f standup... i'd love to know what it is
<katco> wallyworld: but anyway, time to make dinner :) cya in a bit
<wallyworld> ok
<jcw4|nomnom> katco: sorry
<jcw4> I'll keep digging too
<katco> jcw4: thx check out the lp bug, i added what info i had there
<jcw4> katco: yep, just reading that now.  I'll update if I find anything more
<jcw4> katco: you're probably gone but the issue was actually in the names package
<jcw4> anyone around to give me a code review on https://github.com/juju/names/pull/29 ?
<jcw4> its needed to fix a blocking bug in CI
<jcw4> wallyworld: I figured out that bug that katco handed off to you.  Its a serialization issue
<jcw4> wallyworld: I have a PR thats the first part of the fix for that https://github.com/juju/names/pull/29
<jcw4> wallyworld: are you able to review that for me?
<wallyworld> jcw4: great thanks, will do, just otp, will look real soon
<jcw4> cool, ta
<wallyworld> jcw4: reviewed. i'm surprised we haven't run across this before. i would have thought there would be other cases where we compose from unexported structs. maybe not
<jcw4> wallyworld: I know
<jcw4> wallyworld: also other Tags in the names package have unexported fields
<jcw4> wallyworld: so my guess is they're not being serialized yet, even though we've decided to use Tags in the API instead of strings
<wallyworld> jcw4: fields but not embedded structs
<jcw4> wallyworld: right.
<wallyworld> stupid gccgo i guess
<jcw4>  hehe
<wallyworld> jcw4: maybe you can add a comment to the code to explain why you needed to export it
<jcw4> wallyworld: will do
#juju-dev 2014-10-16
<jcw4> wallyworld: this is just a dependencies.tsv change... I don't think it needs a reviewboard diff? https://github.com/juju/juju/pull/912
<jcw4> can you LGTM?
<wallyworld> sure, looking
<wallyworld> done
<jcw4> ta
<anastasiamac> axw: ian says there is a coverage tool and you might be able to assist... what and how can I run it?
<axw> anastasiamac: you can just run "go test -cover" and get coverage statistics. you can also do "go test -coverprofile=/tmp/cover.out" and it'll write a file that you can then translate to HTML with "go tool cover -html=/tmp/cover.out"
<axw> anastasiamac: I wrote https://github.com/axw/gocov ages ago, but it's mostly redundant now
<wallyworld> axw: ah, that;s what I wasn't sure of - whether yours was still relevant
<anastasiamac> axw: thnx :-)
<axw> nps
<axw> wallyworld: depends on how much you like the command line :)
 * wallyworld loves the command line
<wallyworld> axw: is it just me or is canonicaladmin down?
<wallyworld> jam: you sure 1379802 is critical? it's been like that for ages, only affects local, and is not that common a use case
<axw> wallyworld: down for me too
<jam1> wallyworld: if it has been open forever then it isn't a huge issue, but having "destroy-machine" completely bork up local seems pretty serious to me
<wallyworld> axw: ok, thanks
<wallyworld> jam: the bug hasn't been open, but the bug in code has been there a while without anyone raising the issue
<wallyworld> critical is fine, i intended to fix it asap anyway
<wallyworld> just need to find time
<anastasiamac> axw: Andrew! First code review of almost my code :-)
<wwitzel3> wallyworld: my logs tell me you blamed me for something yesterday, but I couldn't figure out what :)
<wallyworld> wwitzel3: um, global warming?
<wallyworld> wwitzel3: i think i said you were working on log rotation
<wallyworld> and rsyslog and a few bug fixes
<wwitzel3> wallyworld: ahh, then that is true, though all of the critical ones I know about are merged to master.
<wwitzel3> wallyworld: fixes for them that is
<wallyworld> wwitzel3: and log rotation is fully done now?
<wwitzel3> wallyworld: yep, there are still some issues if you have a lot of servers .. jam1 found them during scale testing. Too many connections results in logging to all-machines being missed.
<wallyworld> wwitzel3: ah right, i knew there was still one issue
<wallyworld> wwitzel3: customers were asking about the status, so i just let alexis know we had some stuff in 1.21, to be done for alpha2
<wwitzel3> wallyworld: rgr, the overall experience is greatly improved and afaik doesn't have any of the show stoppers for your average user that existed previously
<wallyworld> \o/ thanks
<dimitern> morning all
<mattyw> morning everyone
<allomov> hey, all.
<allomov> quick question. where I can take tools-metadata for latest version of juju (build from sources) ?
<allomov> *where can I take .. :)
<voidspace> dimitern: so we haven't estimated all the cards then?
<voidspace> dimitern: looks like there are good descriptions on them though, thanks
<dimitern> voidspace, yes - I didn't manage to finish all of them, but that's ok as a few are already workable
<voidspace> dimitern: shall I start on some of the easier ones
<dimitern> voidspace, cheers!
<voidspace> dimitern: and is the "claim_sticky_ip_address" the static address allocation support we need in MaaS
<voidspace> dimitern: or is that a different api?
<dimitern> voidspace, the api to use in maas should be linked from the card description
<voidspace> dimitern: there's a ticket to *check* if the api supports the capability/
<voidspace> dimitern: that links to the capabilities api
<voidspace> dimitern: the ticket to use that api references the first ticket (check the capability) but not the api to use
<voidspace> (The second ticket is "maas: implement SupportAddressAllocation Environ capability")
<voidspace> dimitern: I'll start with "environs,state: Introduce new environment capability - SupportAddressAllocation."
<voidspace> dimitern: since that's trivial
<voidspace> dimitern: then look to add the check to the maas provider
<dimitern> voidspace, that sgtm
<allomov> I've found that there is juju-metadata binary which helps to generate tools and image info
<allomov> but bootstrap still fails because it can't find tools there
<allomov> ok then, will use the stable version
<dimitern> voidspace, standup?
<voidspace> dimitern: I was already omw
<perrito666> morning all
<wallyworld> fwereade: hiya
<fwereade> wallyworld, heyhey
<fwereade> wallyworld, isn't it the middle of the night for you by now?
<wallyworld> fwereade: i've finished the hopefully last mods to the health status doc - could you take a quick look before i publicise?
<wallyworld> fwereade: kinda
<fwereade> wallyworld, certainly
<wallyworld> ty
<wallyworld> fwereade: a big assumption is that we are asking that feature flags be done to prevent older charms from breaking in weird ways
<fwereade> wallyworld, I think we ended up backpedalling on that to required-juju-version
<fwereade> natefinch, remind me? ^^
<wallyworld> ah, could be
<natefinch> fwereade, wallyworld: yep, we're doing min-juju-version now
<wallyworld> natefinch: ok, ta, i'll fix the doc
<jcw4> sinzui: thanks
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: see calendar | Open critical bugs: 1381671
<perrito666> wwitzel3: natefinch ericsnow stdup?
<wwitzel3> perrito666: yep
<wwitzel3> perrito666: I thought I was about to fix this test and didn't want to break stride
<voidspace> dimitern: you've specified that SupportsAddressAllocation returns a bool
<voidspace> dimitern: should that be (bool, error) ?
<voidspace> dimitern: what if the underlying api call fails (for example)
<voidspace> dimitern: and is "address allocation" actually "static address allocation", or really just "any address allocation"?
<voidspace> dimitern: I know asking two questions at the same time is asking for trouble...
<bodie_> hey, I'm writing some api client tests and wondering whether we're faking mongo tests yet
<bodie_> is there anywhere I can look for more about that, or should I just write an integrating test?
<dimitern> voidspace, sorry, just saw your questions
 * katco doctor's appt. bbiab
<dimitern> voidspace, so, SupportsAddressAllocation can return (bool, error) if needed, yes - take a look at other capabilities
<dimitern> voidspace, and yes on the second question - jam suggested to shorten the name
<voidspace> dimitern: cool, I'll use the unshortened version in the docstring
<voidspace> dimitern: and, as there's an underlying api call I think we ought to be able to return an error
<voidspace> rather than just assuming false when we're actually screwed
<dimitern> voidspace, sure, sgtm
<dimitern> we can also just log errors and return false still, but as you wish (i.e. erroring in that capability is more of a warning imo)
<voidspace> dimitern: an error would indicate we can't reach the api to ask about the capability
<voidspace> dimitern: that seems more like a fatal error
<voidspace> I'll bow to your wisdom on it
<voidspace> at the moment I'm just returning false and there is no api call
<voidspace> and no consumer of this new method
<voidspace> so it's academic until the next step
<dimitern> voidspace, fair point, yes
<dimitern> voidspace, i'll leave it to your judgment :)
<bac> hi abentley, sinzui: regarding bug 1379397 i've landed a change and deployed it to our qa instance.  do you want to test against it before i deploy to production?
<mup> Bug #1379397: Bundle queries timeout <chamers> <charmworld:Fix Committed by bac> <https://launchpad.net/bugs/1379397>
 * sinzui tries
<bac> jcastro: is the cloud cross team call happening today?
<abentley> bac: Okay, one sec.
<sinzui> bac, sorry, my head is slow today, which machine do we test the url on?
<bac> sinzui, abentley: i added the 'start' parameter, a zero-based index
<bac> sinzui: http://qa.manage.jujucharms.com/api/3/search?text=bundle&type=approved&doctype=bundle&start=10
<jcw4> bodie_: I guess it was hazmat and cmars talking about in memory mongo: http://irclogs.ubuntu.com/2014/10/03/%23juju-dev.html#t15:56
<jcastro> bac, no topic yet, I'll go ahead and cancel, thanks for the reminder
<abentley> bac: I have to say, all I want is bundle information.  I think including charms in the output is what's making it slow and unreliable.
<bac> abentley: that would require a separate endpoint or parameter.
<cmars> bodie_, jcw4 in-memory mongo has been a bit of an exploratory diversion of mine, but I think the Right Way to test api is with api-level mocks
<bac> abentley: as charmworld is being phased out can you live with this until it is replaced?
<jcw4> cmars: makes sense
<bac> jcastro: np.  saves me from showing up to an empty hangout, again.  :)
<jcw4> cmars: in-memory mongo would sure be nice for our tests though
<jcw4> :)
<jcw4> sinzui: is bug 1381671 consistently reproducible? the error output looks more like a compiler issue with generated libs maybe caching or conflicting somehow
<mup> Bug #1381671: reboot tests fail to build on gccgo <ci> <gccgo> <reboot> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1381671>
<jcw4> sinzui: I can't repro it with gccgo, although I don't have access to ppc64el machine
<cmars> jcw4, what if we ran mongod out of /dev/shm? that'd be in-memory too :)
<jcw4> cmars: very interesting
<sinzui> jcw4, the output is consistent in the tests. maybe we need to compare versions of gccgo again
<jcw4> sinzui: hmm - what is the official gccgo version for juju?
<sinzui> jcw4, there isn't one. It is the version shipped with trusty, which has changed about 4 times this year
<jcw4> heh
<jcw4> okay
<jcw4> sinzui: on my machine - gccgo (Ubuntu 4.9.1-0ubuntu1) 4.9.1
<sinzui> that looks right
<mgz> simply backing out the change would be one approach
<abentley> bac: start is itself a new parameter.  Paging means that we have to update our code to handle the possible inconsistencies that paging will produce.  I can understand needing paging when there are thousands of results, but this query produces less than 70 entries.
<mgz> but I'm curious about gccgo versions, as it's much the same as an issue dave cheney fixed upstream
<bac> abentley: yes, start is a new parameter.  it is the one we proposed on the bug and i went on to implement.
<jcw4> sinzui, mgz is it normal for the _test folder to contain duplicate .o files in gccgo?
<jcw4> it seems that libreboot.a is being generated in the normal place first and then in a subfolder under _test and it's those two versions that are conflicting
<sinzui> jcw4, it is, but there is also a long history of gccgo not managing the linking correctly. There is a kind of art to write a test that gccgo likes.
<jcw4> i see
<bodie_> cmars, I see :)
<bodie_> cmars, that's what I ended up settling on anyway
<abentley> bac: It didn't seem like a proposal.  It seemed like you had decided what to do, without consulting us: "We've added work items to add an offset to the charmworld search api"
<mgz> jcw4: see bug 1289067 and linked things for more context
<mup> Bug #1289067: arm64 multiple definition of `launchpad.net_juju_core_cmd._.0 <arm64> <ppc64el> <test-failure> <juju-core:Fix Released by dave-cheney> <gccgo-go (Ubuntu):Fix Released by james-page> <gccgo-go (Ubuntu Trusty):Fix Released by james-page> <https://launchpad.net/bugs/1289067>
<jcw4> mgz: ta - I was just grepping for gccgo issues
<bac> abentley, but you commented on the proposal in the bug.  you said "how will i specify the start", we answered, and considered that tacit approval for the approach.
<abentley> bac: I was pointing out that rick's rationale for lowering the bug was impossible to implement.   I didn't mean to endorse the solution.
<bac> abentley: clearly i'm not neglecting your needs or i wouldn't have started this conversation.
<bodie_> fwereade, meant to hit you up earlier -- I didn't get your email til late last night
<bodie_> fwereade, did I get you what you needed?
<fwereade> bodie_, no worries, I could swear I'd replied?
<fwereade> bodie_, but, thanks, it was very useful
<bodie_> ah, just checking before I checked my email, which I did just see your reply in
<bodie_> okay, cool
<bac> sinzui, abentley: i've got a call right now.  can we talk later?
<abentley> bac: Sure.
<sinzui> jcw4, mgz, stilson-08 that ran the recent tests is on gccgo 4:4.9-1ubuntu6, which is the current version for trusty. The version hasn't changed in months
<jcw4> hmm; maybe I'll be able to repro if I install that version
<mgz> sinzui: yup, I double checked that as well
<jcw4> sinzui: but that error does seem like a very similar issue to the one davecheney fixed
<mgz> I'm reasonably certain it's not the same bug, we certainly ran with the latest package, which includes that fix, and would have expected far more breakage if it was the same as reported before
<mgz> just seems somewhat related
<jcw4> hmm
<jcw4> that *is* the same version I'm running; --version just reports it slightly differently
<mgz> I can repro on the machine that's running the ppc64el tests
<mgz> but not my armhf machine or a amd64 cloud one
<jcw4> hmm; well I'll let someone else figure it out I guess :)
<mgz> is this not just like bug 1378716
<mup> Bug #1378716: ppc64el unittests have expected failures, we don't see real failures <arm64> <ci> <ppc64el> <regression> <test-failure> <juju-core:Fix Released by anastasia-macmood> <https://launchpad.net/bugs/1378716>
<mgz> meh, we need those logs somewhere though
<jcw4> was the fix for that to just ignore tests we knew were failing under false pretences?
<mgz> more relevantly bug 1365480
<mup> Bug #1365480: new compiler breaks ppc64el unit tests in many ways <ci> <ppc64el> <regression> <juju-core:Fix Released by dave-cheney> <gcc-4.9 (Ubuntu):Invalid by dave-cheney> <https://launchpad.net/bugs/1365480>
<mgz> probably not good to lump in, as it seems to be ppc64 only and linker error not failures
<mgz> well, I have a hack-around fix, if that's preferable to a back out
<bac> sinzui, abentley: acknowledging the misunderstanding, will the currently implemented pagination solution work for you, even if you consider it less than ideal?
<abentley> bac: Yes.
<bac> abentley: do you want time to exercise it on QA before i request it be deployed to production?
<abentley> bac: otp
<voidspace> dimitern: what do you think about testing all of these stub methods?
<voidspace> dimitern: there isn't an obvious place to test all of them, some providers have obvious places though
<voidspace> dimitern: I think I'd rather test them than not
<dimitern> voidspace, just a sec
<dimitern> voidspace, it seems there are both per-provider tests and a state/environcapability_test.go, but how each one is tested is inconsistent
<voidspace> dimitern: on a call
<dimitern> voidspace, np
<dimitern> voidspace, at the very least I think there should be a TestSupportAddressAllocation in each provider, see TestSupportNetworks for examples
<voidspace> dimitern: yeah, I don't think every provider has the TestSupportNetworks
<voidspace> I'll add tests for those that do - and then work on the others
<dimitern> voidspace, they do actually
<dimitern> voidspace, but most of them are a one-liner like: c.Assert(s.env.SupportNetworks(), jc.IsFalse)
<dimitern> voidspace, one more thing - the dummy provider will need some extra care
<voidspace> dimitern: ok, I have touched dummy provider
<dimitern> voidspace, i.e. making it possible to customize whether it claims to support addresses or not
<voidspace> dimitern: dummy/environs.go has gained the method
<voidspace> dimitern: ah, right
<dimitern> voidspace, for testing
<voidspace> dimitern: so that will come in the second pass
<voidspace> dimitern: understood
<dimitern> voidspace, sure, np
 * katco is back
<jcw4> and the crowd goes wild
<jcw4> :)
<katco> ha
<voidspace> to be fair, the crowd was pretty wild already
<jcw4> true true
<jcw4> but hey
<jcw4> everyone needs a little encouragement
<voidspace> :-)
<voidspace> apparently putting a nose on my smilies marks me as old :-(
<jcw4> lol
<mgz> ( ͡° ͜ʖ ͡°)
<jcw4> wow, haven't seen that one... that's really old looking
<jcw4> wait
<Spads> â¹(â¢Â¿â¢)âº
<jcw4> on my mac that looks different than my pc
<mgz> it may also be a test of your irc client's unicode fonts
<jcw4> my pc that looks like bags under the eyes
<jcw4> on my mac looks like heavy eyeliner
<mgz> okay, I am going to big-hammer bug 1381671 to unblock trunk
<mup> Bug #1381671: reboot tests fail to build on gccgo <ci> <gccgo> <reboot> <regression> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1381671>
<mgz> gsamfira: what's the copyright status for your juju-core contributions?
<gsamfira> mgz: cloudbase has signed the Canonical agreement as a company, and so have I as an individual
<gsamfira> this was done back in december when we started contributing on MaaS
<mgz> gsamfira: so, I can add the standard headers to api/reboot.go and api/reboot_test.go then?
<mgz> (I'm touching the file anyway)
<gsamfira> sure, and I would really appreciate if you added both Canonical and Cloudbase headers
<natefinch> mgz: we agreed that files that are mostly contributed by cloudbase should get both copyrights (canonical and cloudbase)
<gsamfira> https://github.com/juju/juju/blob/master/juju/names/names.go#L1 <-- example here
<mgz> gsamfira: so "Copyright 2014 Cloudbase <...?...>, Canonical Ltd."
<mgz> ta
<mgz> I shall do that
<gsamfira> thank you! :)
<gsamfira> I always forget the damn headers...
<gsamfira> there must be a sublime plugin for that
<gsamfira> there is! :)))
<mgz> jcw4, gsamfira: http://reviews.vapour.ws/r/176/
<jcw4> mgz: ... waitin for the hammer to fall ♫
<gsamfira> mgz: perfect
<jcw4> LGTM
<gsamfira> LGTM
<mgz> ...life would be a little easier if reviewboard+firefox didn't segfault my X server reliably...
<bodie_> I've mostly abandoned my faith in firefox, sadly
<gsamfira> whoa :). Nvidia? :P
<mgz> gsamfira: no, armhf chromebook, not quite sure which recent package update started this, but haven't taken the time to find out more yet
<mgz> okay, I am going to go ahead and land
<gsamfira> must be a HW feature.  [ $browser != 'chrome' ] && troll
<mgz> :)
<abentley> bac: I'm in the middle of other stuff, so I think you should just go ahead and deploy.
<mattyw> night everyone
<voidspace> night all
<wwitzel3> o/
<wwitzel3> I'm down to one failing test .. better than four :)
<natefinch> wwitzel3: nice! :)
<bodie_> http://reviews.vapour.ws/r/178/ ready for review -- a bit of Actions API
<wwitzel3> ahh validation .. I love it when a test exposes a bug
<perrito666> wwitzel3: and now the famous "how was this working before" part
<natefinch> wwitzel3: team meeting?
<bodie_> ping rick_h_ re. actions -- what's the outlook for this sprint
<bodie_> ?
<rick_h_> bodie_: so the outlook from the GUI team is we're doing an added service bar for release next week, then debug log for 2-3wks and I'm hoping there's a way to move to actions after that with at least a dev juju release
<rick_h_> bodie_: is that what you're curious about or something else?
<rick_h_> bodie_: now that's to get started, we've got some UX bits to figure out and integration of jsonschema/data binding issues on our end that will make it take a chunk of time
<bodie_> ok, cool
<rick_h_> but I'd love to release the added services bar in Oct, debug log in Nov, and then actions Dec/Jan? (holiday break so let's just say Janish)
<rick_h_> or whenever it's actually released in core
<bodie_> should we be pushing forward on a gui demo?  just didn't want to work at cross purposes if you guys were going to be putting in time on that soon, or figure out where our overlap would be
<rick_h_> well, my understanding is you have an api now but it's not fully working? I guess maybe we should chat and sync up on where you're at
<bodie_> sounds good
<rick_h_> bodie_: getting into the GUI with it will take some skill, it's not a dirt simple integration unfortunately
<bodie_> yeah, that's kind of what I was thinking -- I'm not averse to diving in, but I'm also not an Ember developer :) .... although it does sound quite exciting
<rick_h_> bodie_: you want to setup a call for next week?
<bodie_> sure, that sounds good, I think jcw4 has more insight into where we stand with the apiserver, but I think it's mostly nailed down
<rick_h_> bodie_: you all targeting 1.22 right now?
<bodie_> that's the release for this sprint?
<bodie_> I'm kind of out of the loop with juju planning
<jcw4> rick_h_: we're not clear on the roadmap - is that published anywhere?
<jcw4> rick_h_: the v0 actions api is pretty well in with a little cleanup needed
<rick_h_> bodie_: jcw4 oh, fully functional/callable?
<jcw4> si
<rick_h_> cool, I thought it was a bit more stubbed than that atm
<bodie_> got fleshed out last week and this :)
<rick_h_> heh, /me was afk at a sprint and missed a chunk it seems
<jcw4> :)
<rick_h_> ok, let's chat next week. If it helps we can swap actions/debug log order
<rick_h_> so I could have a couple of folks working on the GUI integration in a week ish
<jcw4> ooh
 * jcw4 claps excitedly
<rick_h_> ok, so let's get a call together next week and I'll start to work on an updated plan that swaps that out
<bodie_> excellent
<rick_h_> jcw4: bodie_ and you guys should chat with sinzui/alexisb and check if you're targeting 1.21 or 1.22
<jcw4> rick_h_: cool ; is there a published roadmap anywhere?
<rick_h_> so we have a good sense of what release we want to tie to
<rick_h_> jcw4: well I've been out of the loop so not sure
<jcw4> k
<rick_h_> I'll shoot an email
<jcw4> ta
<rick_h_> and copy you all in
<rick_h_> jcw4: can you pm me email addresses please? Sitting in an airport and too lazy to look up the emails atm :/
<jcw4> hehe
<rick_h_> <3
<bodie_> woot, glad we could figure out something good :)
<rick_h_> cool, so writing up email on release stuff and if you can setup a call with your times overlapping next week we'll move it forward
<benji> rick_h_: It's not email season for three more weeks.  The game warden will get you.
<rick_h_> ruh roh
 * rick_h_ is in trouble :P
<rick_h_> jcw4: bodie_ email sent with you copied. Let me know if I travel typo'd
<bodie_> rick_h_, got the email
<bodie_> rick_h_, did you use my binary132@gmail.com address or bodie@synapsegarden.net?
<rick_h_> bodie_: second one per jcw4
<jcw4> bodie_: I told rick_h_ I always use binary132 but I thought you preferred synapsegarden
<bodie_> sounds good, just didn't know if others would know who it was, but I suppose it's pretty obvious, come to think of it :)
 * perrito666 works in front of the Tv to see the launch of the first Argentinian Satellite
<jcw4> perrito666: cool!
<bigjools> how do I bootstrap on a machine in a particular maas zone?
<rick_h_> bigjools: using the --to to pick a machine in a zone?
<bigjools> rick_h_: if you know machine names that works
<bigjools> but not if you only know the zone
<rick_h_> bigjools: and assume you can't use tags either to help target with constraints?
<bigjools> yeah that's the only thing left
<bigjools> it's ok, this is my own rig so it's not a problem, I was just wondering for future reference
<rick_h_> there's new zone work going on, but I don't know the details of that. You'd have to chat with john and company
<rick_h_> it's always talked about as 'automatic' so I think it'd be post-bootstrap and more of an add-unit thing but might be worth bringing up while it's WIP
<rick_h_> and maas zones are part of that work/discussion
#juju-dev 2014-10-17
<bigjools> wallyworld: https://bugs.launchpad.net/juju-core/+bug/1382276
<mup> Bug #1382276: Says environment is bootstrapped when it's not <juju-core:New> <https://launchpad.net/bugs/1382276>
<wallyworld> katco: hi, can you pop onto the standup a minute early?
<katco> wallyworld: good timing :)
<katco> wallyworld: (i'm there already)
<mwhudson> what is GOARCH on ppc64le?
<wallyworld_> bigjools: https://bugs.launchpad.net/juju-core/+bug/1382329
<mup> Bug #1382329: juju 1.21-alpha1, unable to bootstrap to maas with --constraints arch=armhf / arm64  <hs-arm64> <juju-core:New> <https://launchpad.net/bugs/1382329>
<mattyw> morning all
<voidspace> morning all
<dimitern> morning
<voidspace> dimitern: I won't be around for standup *again* today I'm afraid
<voidspace> dimitern: dentist appointment
<voidspace> dimitern: I shouldn't be out too long
<dimitern> voidspace, no worries
<voidspace> dimitern: I organised it especially to avoid clashing with moonstone standups
<voidspace> dimitern: and then I changed team :-)
<voidspace> dimitern: I have a branch which I think is ready for review - it's trivial though (as discussed yesterday)
<dimitern> voidspace, :) well, I'll be standing up by myself then hehe
<voidspace> dimitern: ah yes
<voidspace> dimitern: no jam on Fridays and Frank is gardening
<voidspace> :-)
<voidspace> dimitern: enjoy
<voidspace> dimitern: don't talk for too long
<menn0> morning voidspace
<voidspace> menn0: morning
<voidspace> menn0: or time-appropriate-greeting anyway...
<menn0> voidspace: well I'm in your timezone so morning is just fine
<menn0> voidspace: this is my last day working from the London office
<voidspace> menn0: you still in London?
<voidspace> menn0: ah
<voidspace> menn0: how's it been?
<voidspace> menn0: if you find Evan Dandrea, say hi to him for me
<menn0> voidspace: good!
<voidspace> menn0: I guess a bunch of them are still in Brussels
<menn0> Jesse has been here too so we've been working together but I'm on my own today
<voidspace> ah, cool
<menn0> I'm seated amongst the phone team. it's been interesting seeing what they're up to.
<voidspace> have you been able to play with any hardware?
<voidspace> Tell them to test the "edge swipe" when the phone has a case on it (the type of case that has a lip round the screen)
<voidspace> well, maybe suggest it if you have the chance...
<voidspace> and they've probably thought of that anyway
<menn0> voidspace: yeah the head of the phone team gave me a demo of the phone yesterday
<menn0> voidspace: it's quite nice. amazing what they've managed to do in such a short time.
<voidspace> great
<voidspace> don't let them poach you! ;-)
<fwereade> tasdomas, ping
<tasdomas> fwereade, pong
<fwereade> tasdomas, just came across the collectMetricsSignal thing, couple of thoughts
<fwereade> tasdomas, (1) shouldn't we continue to collect metrics while dying?
<fwereade> tasdomas, (2) shouldn't we set it up once, and trust it to keep firing, rather than recreating it every time through the loop
<fwereade> tasdomas, hmm, re (2), I guess we actually want to reset the timer once we trigger it, so we do need to create it repeatedly, so don't worry about that one
<tasdomas> fwereade, recreation happens because we need the signal to fire as soon as possible after the unit has started, and time.Ticker did not let me do that
<tasdomas> fwereade, I don't think we need to collect metrics while the unit is dying
<tasdomas> fwereade, I will discuss this with cmars, but I'm pretty sure that's the way we want it (at least for now)
<fwereade> tasdomas, ok, please do bring it up, a unit could potentially be dying but still delivering value for a long time
<fwereade> tasdomas, shame not to have the numbers available
<tasdomas> fwereade, numbers?
<fwereade> tasdomas, the metrics we collect
<fwereade> tasdomas, it's just weird to have a potentially-long gap at the end of its lifetime
<fwereade> tasdomas, not the end of the world ofc
<tasdomas> fwereade, well, it seems to me that a dying unit is a very unreliable thing to meter
<fwereade> tasdomas, yeah, maybe it's only useful if we also record when the unit started dying
<menn0> fwereade: ping
<fwereade> menn0, pong
<menn0> fwereade: I'm hitting road-block after road-block trying to get the machine env uuid upgrades to work
<menn0> there are many things in the upgrade infrastructure itself that care about machines and instanceData
<fwereade> menn0, crap -- more things happening in the process of getting a state connection than we anticipated?
<fwereade> ahh, ok
<menn0> the hack for getting the state connection is ok and worked
<menn0> but then getting the upgrades to work is another story
<menn0> the upgrades running functionality wants both a *state.State and an *api.State
<menn0> I've implemented various hacks to get the API working well enough
<menn0> but am now hitting problems with the upgrade sync stuff itself
<fwereade> hmm, I was wondering if we could make sure we ran the state-requiring upgrades first
<menn0> that's what I'm thinking
<menn0> treat these machine and related collection upgrades as a special case
<fwereade> minimising the window of yuckiness seems sane
<fwereade> and we can/do control the order of upgrades anyway, right?
<menn0> we do
<menn0> but currently we don't get to running any of them
<menn0> and I understand why
<menn0> but fixing it is getting super hacky
<menn0> I'm thinking about implementing something near the top of jujud's Run that gets a *state.State and runs selected upgrade steps manually
<menn0> if it's the master machine agent
<menn0> those steps would be removed from the list of steps for 1.21
<menn0> fwereade: does that sound reasonable to you?
<fwereade> in essence, yes, I think so
<fwereade> menn0, can we make the subsequent upgrade steps *not* take a *state.State then?
<fwereade> menn0, I'd be happiest if we could completely separate state changes from all the others, and always run the state ones first
<menn0> fwereade: ok. upgrade steps for state servers only get *state.State now if running on a state server (obviously)
<menn0> fwereade: what about upgrade steps that aren't DB migrations but might need to use state?
<fwereade> menn0, I'm hoping we don't have any, because philosophically speaking we shouldn't
<fwereade> menn0, I am prepared to be surprised/saddened though
<menn0> fwereade: I'm looking at what we have now
<menn0> fwereade: upgrade steps have been given a *State if running on a state server since before I started working on this stuff
<menn0> fwereade: well updateRsyslogPort is pretty special. it opens up its own private *State
 * fwereade twitches gently
<fwereade> what's it doing that can't/shouldn't be done over the api?
<menn0> fwereade: and then there's processDeprecatedEnvSettings which uses the provided *State
<fwereade> ditto
<menn0> fwereade: it doesn't look like it. it needs to change something which is normally read-only
<fwereade> equally I'd be fine calling both those things data model changes
<fwereade> any reason not to run those in the first, needs-state batch?
<fwereade> and all the others in a second, needs-api batch?
<menn0> fwereade: I think that sounds reasonable
<menn0> fwereade: I'll have to think about how this affects the current upgrade sync functionality and the upgrade mechanics generally
<fwereade> menn0, cool, thanks
<fwereade> menn0, let me know what the next showstopper is ;p
<menn0> fwereade: will do :)
<menn0> fwereade: sigh. this might require some reorganisation of the upgrade-steps worker
<perrito666> man headbanging in closed spaces is dangerous
 * perrito666 rubs forehead
<perrito666> morning all
<jcw4> morning perrito666
<abentley> sinzui: Have you had a chance to look at https://code.launchpad.net/~abentley/juju-ci-tools/industrial-test/+merge/238621 or my review of https://code.launchpad.net/~sinzui/juju-release-tools/validate-streams/+merge/237975 yet?
<sinzui> abentley, yes, sorry about the delays. You will see an email soon about the 1.20.10 problem
<abentley> sinzui: Ah, gotcha.
<menn0> fwereade: sigh. without machine and instanceData having been migrated, EnsureUpgradeInfo doesn't work. it's really feeling like these specific migrations are a special case that may need to be handled as such.
<menn0> fwereade: I'm going to try that out and see how it looks.
<fwereade> menn0, ick
<fwereade> menn0, yeah, go for it
<menn0> once I have it done I'll push it to my repo so you can have a look. I don't think it'll take long.
<natefinch> perrito666: 1:1?
<perrito666> going
<menn0> fwereade: I was thinking something like this: https://github.com/mjs/juju/commit/b30135af56baa344efda84e54c4cc9c90b367d32
<menn0> fwereade: with this the upgrade from 1.20 works
<fwereade> menn0, it's a relief that that works
<menn0> fwereade: I think this isn't quite right in terms of handling master changes during upgrades
<menn0> fwereade: but I can fix that
<fwereade> menn0, can you estimate the cost of splitting upgrade steps up so we can separate state-needing ones from api-needing ones, and run them in that order using upgrade machinery?
<fwereade> menn0, if I wave my hands vigorously, it feels like it shouldn't be *fundamentally* hard, but I can believe it'd be a hassle
<fwereade> menn0, ie the upgrade machinery itself would need some work
<menn0> fwereade: sure. I've already thought somewhat about this today.
<menn0> fwereade: by "run them in that order using upgrade machinery" do you mean inside the upgrade-steps worker? (or at least managed by the UpgradeInfo system)
<fwereade> menn0, let's say the latter for now, given that I assume the former wants an api connection as well
<menn0> fwereade: actually... quick hangout? I want to make sure we're on the same page.
<fwereade> menn0, sure
<menn0> fwereade: https://plus.google.com/hangouts/_/gxxkhbn6ou7drvwcgmo3hd7wjea
<menn0> fwereade: missed you
<bodie_> fwereade, quick question re. actions -- we want to land the whole CLI at once, at the very end, right?
<fwereade> bodie_, I think so, yeah -- but if you can come up with a smaller and self-consistent subset that's also fine
<bodie_> hmm, ok
<bodie_> fwereade, jcw4 and I have been thinking we should open CLI PRs against our juju-actions repo individually, and get them vetted, so we don't have a huge PR to pass all at once
<fwereade> bodie_, +100 to that
<bodie_> okay, cool
<bodie_> we can figure out what the subset is from there, maybe
<bodie_> thanks fwereade!
<mattyw> fwereade, are you done for the day? or do you have 5 minutes for an "in theory" discussion?
<voidspace> happy weekend everyone
<hazmat> anyone know what happens when two sides disagree on scope:container?
<natefinch> a wormhole in space?
<natefinch> no I don't know
<perrito666> hazmat: interesting question to ask on a friday afternoon after a sprint
 * perrito666 was 10 mins trying to figure out why his select was not working... he wrote switch instead
<natefinch> haha, I've done that
<perrito666> sounds like something that could be a compiler error "too many type mismatches on your switch, you most likely are an airhead trying to use select"
<natefinch> heh perhaps
<perrito666> is there any doc for mup? there is no README on the repo.. or any other piece of non-code information for that matter
<natefinch> mup: help
<mup> natefinch: Run "help <cmdname>" for details on: bug, echo, help, infer, poke, run, sendraw, sms
 * perrito666 sighs and git clones
<natefinch> who the heck ever thought yaml was a GOOD idea :/.
<perrito666> what did that bad format do to you now?
<perrito666> your best bet is to actually write an editor for it
<natefinch> I'm trying to set min version in the charm metadata.yaml .... so like minversion: 1.20    .. but of course then yaml thinks that's a float, not a string, so I have to put it in quotes like minversion: "1.20"  except then my code is saying \"1.20\" is not a valid version.... which seems to indicate it's getting the quotes as part of the string for some reason
<perrito666> mmm, I believe I saw gustavo and someone else discuss that particular issue not long ago
<natefinch> I think the actual reason is that our version parsing stuff always wants the micro version... so you have to say 1.20.0
<natefinch> which is annoying
<perrito666> mmmm, true
<perrito666> which is not so bad in itself because that way you omit the alpha/beta version
<natefinch> anyway, gotta run
<natefinch> have a nice weekend everyone
<perrito666> u2
#juju-dev 2014-10-18
<jcw4> pre-push script improvement to not run the full validation when the push is a branch delete or when there are no changes: http://reviews.vapour.ws/r/197/
#juju-dev 2014-10-19
<thumper> morning
<davecheney> moin moin
<davecheney> thumper: i have no idea when the standup is now
<davecheney> thanks to timezones
<davecheney> can we just do it now ?
<thumper> davecheney: hey
<thumper> davecheney: sure, not sure who else is actually around
<davecheney> i'll jump in the hangout
<thumper> I know menno isn't back until Thu
<davecheney> if it's just you and me
<davecheney> then i'll at least be quick
<thumper> :)
<davecheney> mwhudson: ping
<davecheney> mwhudson: http://paste.ubuntu.com/8594439/
<davecheney> gcc / clang appear to change their as(1) behaviour depending on the _case_ of the file extension
<mwhudson> hah uh
<mwhudson> davecheney: looks like 32 bit though, all that stuff has fallen out of my brain
<davecheney> i guess it's part of using gcc as a frontend
<davecheney> do you know the difference between .s and .S
<mwhudson> i do not, i'm afraid
<mwhudson> davecheney: ah google suggests that .S goes through the preprocessor
<davecheney> it looks like .S goes through a preprocessor
<davecheney> yes
<davecheney> just figured that out with -c
<davecheney> just figured that out with -v
<davecheney> ok, i should be able to figure it out from here
<mwhudson> which even sort of makes sense (in hindsight) looking at your pastebin
 * mwhudson lunches
<davecheney> makes sense in retrospect should be the tagline for gcc
#juju-dev 2015-10-12
<dooferlad> voidspace: hangout?
<voidspace> dooferlad: omw
<mgz> angleterrettes, someone free to look at my collection of branches changing yaml deps? see bug 1504821 comments
<mup> Bug #1504821: Please switch dependency from gopkg.in/yaml.v1 to gopkg.in/yaml.v2 <juju-core:In Progress by gz> <https://launchpad.net/bugs/1504821>
<mgz> rogpeppe: maybe if you're about? ^
<rogpeppe> mgz: will do
<rogpeppe> mgz: we've found that upgrade problem BTW
<mgz> rogpeppe: ace, both halves?
<rogpeppe> mgz: yeah
<rogpeppe> mgz: not easy to fix though
<rogpeppe> mgz: we're having a hangout about it now if you want to come and join the fun :)
<mgz> ooh, ooh, invite me!
<rogpeppe> mgz: https://plus.google.com/hangouts/_/canonical.com/gogogo?authuser=1
<rogpeppe> (or authuser=0)
<mattyw> mgz, I'm actually semi through that bug (#1504821) for the chicago cubs branch - shall I carry it on or are you happy to carry on?
<mup> Bug #1504821: Please switch dependency from gopkg.in/yaml.v1 to gopkg.in/yaml.v2 <juju-core:In Progress by gz> <https://launchpad.net/bugs/1504821>
<voidspace> rogpeppe: I need to create a tempdir for a test and then remove it on test completion
<voidspace> rogpeppe: I can do that with ioutil
<voidspace> rogpeppe: but IIRC there is a test suite method to do it, and I can't seem to find it
<voidspace> rogpeppe: do you know it?
<mgz> mattyw: I think our work will be complementary
<mattyw> mgz, I'm only doing it in core - not any of the deps
<mattyw> mgz, if you're happy for me to carry on I'll carry on :)
<voidspace> mattyw: mgz: ^^ do you know of a test suite method for creating a temp dir for a test?
<mgz> mattyw: right, I started at doing all the deps
<voidspace> I'm sure there is one, I just can't find it
<mgz> voidspace: I think we have a few...
<mattyw> voidspace, I don't remember the name - but it's in gocheck
<voidspace> mgz: mattyw: heh, thanks
<mattyw> voidspace, https://godoc.org/gopkg.in/check.v1#C.MkDir
<mattyw> voidspace, is that what you're after?
<voidspace> mattyw: awesome, thanks
<mattyw> voidspace, you're very welcome
<voidspace> mattyw: just looking
<voidspace> I expect it's exactly what I'm after
<voidspace> mattyw: yep
<voidspace> ah, after the *suite* finishes running
<voidspace> well, for the suite would be fine
<voidspace> just need to modify the test
<voidspace> looking at our test suite, we use that method "per test" anyway
<voidspace> I guess it doesn't matter
<mattyw> voidspace, I was sure there was another one - I'm just having a look
<mattyw> mgz, just to confirm then - migrate to yaml.v2 in master first right?
<mgz> mattyw: yes, and your life will be a little easier when some of these other dep change branches land
<mattyw> mgz, awesome stuff, thanks
<mattyw> mgz, I'll be starting that in the next few minutes
<wwitzel3> katco: ping
<mattyw> mgz, ping?
<mgz> mattyw: hey
<mattyw> mgz, hey, you might be able to ignore me now actually
<mattyw> mgz, was looking at this https://github.com/juju/httprequest/pull/35/files
<mattyw> mgz, the json body is because juju/testing was updated it looks like
<mgz> mattyw: and the go 1.2 vs later thing
<mattyw> mgz, I see
<mattyw> mgz, LGTM
<mgz> I could also have updated testing to always set content-type for post, even if no content has been supplied
<mgz> but this seems fine: it makes the test more realistic and avoids the golang bug
<mattyw> mgz, any idea what's happened here? https://github.com/juju/cmd/pull/22
<mattyw> mgz, again ignore me
<mattyw> mgz, I thought it had been accepted but not merged
<mgz> mattyw: I had to do futzing with the gating jobs
<mgz> because I made them hostile to dependency changes
<mattyw> mgz, 2 days ago - were you doing it saturday?
<mgz> in my defense, it was tipping it down outside the hotel all morning
<mattyw> mgz, ah yes - at the sprint
<mgz> rogpeppe: is the intention of godeps -t to get all testing deps, or just those of the given package?
<rogpeppe> mgz: all testing deps of the named packages
<rogpeppe> mgz: so it won't get testing deps of deps that aren't in the original package list
<rogpeppe> mgz: (because I found that's almost never what you want)
<mgz> rogpeppe: yup, that sounds sane to me.
<mgz> so, with the current stuff landed, plus charm which I'll have a quick look at again, should be able to do the remaining charm* bits
<rogpeppe> mgz: tbh I'd prefer it now if -t was the default
<rogpeppe> mgz: i don't think i've ever not wanted to update testing deps too
<mgz> rogpeppe: yeah, that surprised me when I was updating the other day, forgot about the nuance
<wwitzel3> katco: ping
<mgz> (^was waiting on natefinch for lumberjack landing, and it's US holiday today, but it's only a dep for the tests, not the project, so other dep updates are fine)
<mgz> wwitzel3: was she swapping today? or just holidaying?
<wwitzel3> mgz: thought she was online today, but maybe not, yeah
<mup> Bug #1456659 opened: Juju bootstraps a machine but cannot SSH to it <ci> <intermittent-failure> <juju-core:Triaged> <juju-core 1.23:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1456659>
<wwitzel3> :q
<rogpeppe> mgz: i've proposed https://github.com/juju/juju/pull/3485 which should fix the bug, although i haven't had time to actually try the upgrade for real yet
<mgz> rogpeppe: I'm fine with merging it after review and letting CI find out if it doesn't blow things up any more
<mup> Bug #1505309 opened: apiserver: pinger can access state before upgrade has completed <juju-core:New> <https://launchpad.net/bugs/1505309>
<mup> Bug #1505309 changed: apiserver: pinger can access state before upgrade has completed <juju-core:New> <https://launchpad.net/bugs/1505309>
<mup> Bug #1505435 opened: option to enable --unsafe-caching for uvt-kvm <juju-core:New> <https://launchpad.net/bugs/1505435>
#juju-dev 2015-10-13
<mup> Bug #1505460 opened: Image consistency with containers can be broken <juju-core:New> <https://launchpad.net/bugs/1505460>
<davecheney> Get:1 http://ports.ubuntu.com/ubuntu-ports/ trusty-updates/main linux-image-3.19.0-30-generic arm64 3.19.0-30.34~14.04.1 [51.2 MB]
<davecheney> 16% [1 linux-image-3.19.0-30-generic 10.0 MB/51.2 MB 20%]    56.8 kB/s 15min 1s^Z
<davecheney> wheee
<davecheney> cloud scale
<mup> Bug #1505504 opened: error when destroying current environment in a multi-environment scenario <juju-core:New> <https://launchpad.net/bugs/1505504>
<rogpeppe> mgz: it looks like reviewboard has stopped adding the "Review request" comment to github PR titles
<rogpeppe> mgz: (i just had to add one manually to https://github.com/juju/juju/pull/3485 and the two before it haven't been labeled)
<mgz> rogpeppe: yeah, I broke eric the other day, will fix now and poke him again later to sort out
 * rogpeppe hopes eric can be fixed
<dooferlad> voidspace: hangout time
<dooferlad> jam: hangout?
<voidspace> frobware: I've turned it into a PR so it can be reviewed, but  I won't land it until we've confirmed it actually fixes the bug!
<voidspace> frobware: https://github.com/juju/juju/pull/3489
<voidspace> frobware: bootstrapping against 1.9 now
<voidspace> bug updated as well
<frobware> voidspace, thanks - bbiab (rebooting)...
<voidspace> frobware: hmmm... two failed bootstraps
<voidspace> frobware: failing with 2015-10-13 10:00:33 WARNING juju.replicaset replicaset.go:98 Initiate: fetching replication status failed: cannot get replica set status: can't get local.system.replset config from self or any seed (EMPTYCONFIG)
<voidspace> frobware: switched to 1.25 and trying to see if it fails in the same way
<frobware> voidspace, worked first time for me - using your branch
<voidspace> odd :-)
<voidspace> it could just be dumb luck
<frobware> voidspace, the times I see this I force a really clean build...
<voidspace> frobware: that's certainly not the failure I would expect
<voidspace> frobware: after I've tried vanilla 1.25 I'll rebuild and do a clean attempt
<voidspace> I would expect everything after the ifdown/ifup to fail
<voidspace> frobware: yeah, vanilla 1.25 fails much harder
<voidspace> frobware: so I'll call that a win ;-)
<voidspace> (fails earlier)
<frobware> voidspace, still going through the motions with multiple NICs
<voidspace> frobware: kk
<voidspace> in the meantime
<voidspace> dooferlad: TheMue: fancy a review? http://reviews.vapour.ws/r/2882/
<voidspace> frobware: ah, a godeps made some changes, maybe that was it
<voidspace> trying yet again
<frobware> voidspace, pretty sure that's why/when I see the replicaset issue.
<frobware> voidspace, posted a few comments and still doing some manual testing.
<voidspace> frobware: thanks
<voidspace> frobware: we can't create unit tests for machines with multiple nics  I don't think
<voidspace> frobware: as the tests really just do text manipulation
<frobware> voidspace, I was thinking more of the case where we have multiple entries in the expected input/output tests.
<voidspace> frobware: we use the primary interface as found from the "ip route list exact 0/0" command
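[Editor's aside: the primary-interface detection voidspace mentions boils down to parsing the default route for the name after the `dev` keyword. A minimal sketch of the idea, with the route line canned so it is self-contained; a live system would run `ip route list exact 0/0` itself, and the real script's parsing may differ:]

```shell
#!/bin/sh
# Canned output of `ip route list exact 0/0`; the interface name is the
# field following the "dev" keyword.
route_line="default via 10.0.4.1 dev eth0"
PRIMARY_IFACE=$(echo "$route_line" | awk '{for (i = 1; i < NF; i++) if ($i == "dev") print $(i + 1)}')
echo "$PRIMARY_IFACE"
# -> eth0
```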
<frobware> voidspace, so maybe I misread the test - let me take another look
<voidspace> frobware: nice improvement in those bash functions though
<voidspace> pushing now
<voidspace> frobware: can I drop that issue?
<frobware> voidspace, line 219 in environ_test.go - can we not simulate in there that there might be multiple NICs?
<voidspace> frobware: that's the script we generate - which is the same for multiple nics
<voidspace> frobware: we don't simulate the machine at all
<voidspace> that script has to *work* on machines with multiple nics - but is itself unchanged
<frobware> voidspace, that's just expected output though
<voidspace> that's the script itself
<voidspace> the output of generating the script
<frobware> voidspace, to me that's just test input/output, no?
<voidspace> frobware: well yes
<voidspace> frobware: but that input / output will be the same on a machine with multiple nics
<voidspace> we do not simulate the machine
<frobware> voidspace, so does it make sense to have a test that has multiple ethN entries?
<voidspace> we only generate the script and check the generated output matches what we expect
<voidspace> frobware: the script will never have multiple ethN entries
<voidspace> *running* the script on a machine with multiple nics will do something
<voidspace> but the unit tests never run that part of the code
<voidspace> frobware: see how that script doesn't mention ethN at all - but sets and uses $PRIMARY_IFACE
<voidspace> frobware: or are you talking about networkStaticInitial versus networkStaticFinal
<frobware> voidspace, sure, but the stuff I thought we were talking about was just text matching/replacement
<voidspace> frobware: and adding extra ethN entries in the canned /etc/network/interfaces ?
<voidspace> frobware: well it is
<frobware> voidspace, exactly
<voidspace> ah
<voidspace> frobware: I can add an extra entry and we can test it is left untouched
<voidspace> some value in that I guess
<frobware> voidspace, agreed
<voidspace> I'll add an extra test
<frobware> voidspace, I think there should be two: 1 with eth0, 1 with eth...ethN.
<voidspace> frobware: well I'll add one extra with an eth1
<voidspace> frobware: if eth1 is untouched then adding ...ethN adds no extra value
<voidspace> we just want to test that "not the primary interface" is left intact
<voidspace> frobware: sorry for the misunderstanding :-)
<mup> Bug #1505617 opened: juju backup intermittent failure <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1505617>
<voidspace> frobware: additional test pushed
<frobware_> voidspace, thanks
<frobware_> voidspace, mgz: do we have an easy or existing mechanism to build a ppa for a proposed fix like this?
<voidspace> frobware_: I don't know, I don't think deploying a ppa is "simple"
<frobware_> voidspace, heh, no. I've always used fpm when I wanted to do this.
<voidspace> frobware_: are you using trusty or vivid?
<voidspace> frobware_: on trusty I can bootstrap but then get the replset failure - reliably it seems
<frobware_> voidspace, trusty
<voidspace> frobware_: I'm trying changing the images to see if that helps
<voidspace> probably a different mongo package
<voidspace> but if it works for you it shouldn't be that
<frobware_> voidspace, to rebuild I use: alias rebuild-juju='[ -f $PWD/juju/api.go ] && { set -e; nukegopkg; make clean; godeps -u dependencies.tsv; make install; }'
<mgz> frobware_: people ask for ppas, but generally that's not actually useful for juju
<mgz> we're better off giving them some built binaries
<voidspace> mgz: frobware_ : in this case it's Mark Shuttleworth asking for a ppa...
<mgz> yeah, well that's always a fun conversation
<voidspace> oh no its not
<voidspace> <voidspace> frobware: additional test pushed
<voidspace> <mup> Bug #
<mgz> voidspace: you killed him
<voidspace> https://bugs.launchpad.net/juju-core/+bug/1494476
<mup> Bug #1494476: MAAS provider with MAAS 1.9 - /etc/network/interfaces "auto eth0" gets removed and bridge is not setup <addressability> <lxc> <maas-provider>
<mup> <network> <juju-core:Triaged by mfoord> <juju-core 1.24:Triaged by mfoord> <juju-core 1.25:In Progress by mfoord> <https://launchpad.net/bugs/1494476>
<voidspace> it's andreserl asking
<voidspace> prebuild binaries should be enough
<mgz> yup.
<rogpeppe> mgz, anyone else: reviews much appreciated for this PR which fixes a critical issue: http://reviews.vapour.ws/r/2878/diff/#
<mgz> rogpeppe: I'll +1, looked at it yesterday
<rogpeppe> mgz: it's changed quite a bit since then
<rogpeppe> mgz: yesterday's didn't actually fix the problem
<rogpeppe> mgz: because we actually need to be able to log in
<rogpeppe> mgz: and the Login method i had fetched the environment config, which failed
<rogpeppe> mgz: so i've refactored things so agent login can be done without doing that
<mgz> ;_;
<rogpeppe> mgz: (which incidentally did simplify things a bit which was nice)
<mup> Bug #1505648 opened: meter-status worker spins if charm doesn't have meter-status-changed hook <logging> <manifold> <uniter> <juju-core:Triaged> <https://launchpad.net/bugs/1505648>
<voidspace> frobware: shall I land this branch
<frobware> voidspace, I was just about try / validate on 1.8
<frobware> voidspace, give me 30 mins. OK?
<voidspace> frobware: sure
<frobware> voidspace, I ran into the replicaset issue too
<mattyw> mgz, ping?
<voidspace> frobware: 1.8 or 1.9?
<voidspace> or both :-(
<frobware> voidspace, 1.8. third time lucky?? now
<voidspace> frobware: did you work out how to get ssh access?
<frobware> voidspace, backdoor access?
<voidspace> yeah
<frobware> voidspace, nope :)
<voidspace> I'll try hacking cloud init config and ssh into the machine after a failed bootstrap
<voidspace> the machine is still running
<voidspace> frobware: I'm going to have lunch first
<frobware> voidspace, me too. though the patch seems fine for the combos I have tried on 1.8 and 1.9
<voidspace> frobware: if it works sometimes that implies a timing issue
<voidspace> yeah, odd :-/
<frobware> voidspace, and I tried 1.9 way more
<voidspace> annoying
<perrito666> belated good morning all
<mgz> mattyw: hey, what's up?
<mattyw> mgz, hey hey - sorry you can ignore me
<mattyw> mgz, I found some docs I could read instead of pestering others :)
<mgz> :)
<mup> Bug #1505617 changed: juju backup intermittent failure <backup-restore> <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1505617>
<mgz> bogdanteleaga: wotcha, you around? need to bother you about centos testing
<mgz> ericsnow: also poké, when you have a mo, to talk about reviewboard github perms
<ericsnow> mgz: k (OTP)
<rogpeppe> mgz: did you forget to publish your review of http://reviews.vapour.ws/r/2878/ ?
<mgz> rogpeppe: yeah, need to finish that
<rogpeppe> mgz: thanks
<mgz> rogpeppe: okay, the thing I am hung up on, and why I didn't finish looking through the code yesterday,
<mgz> is once we've started the server, what's to stop something connecting before all the upgrades have actually happened
<mgz> the ordering around this hurts my head
<rogpeppe> mgz: the machine agent has logic in it to stop that
<rogpeppe> mgz: and i agree, it's a pain to reason about and should really be fixed at a more fundamental level
<rogpeppe> mgz: but this at least works around the issue for now and is no worse than what we had before
<rogpeppe> mgz: i think that probably mongo schema upgrades should be treated entirely separately from the other upgrades
<rogpeppe> mgz: but that's too big a change for me to do right now (and it's not really my responsibility to fix old juju tech debt tbh)
<mgz> rogpeppe: I agree this is better, will plus one and give in to the fact I don't comprehend all of cmd/jujud/agent/
<rogpeppe> mgz: y'know, looking at the code, i'm no longer convinced that it does
<mgz> ehehe ;_;
<rogpeppe> mgz: but again, it's no different to how it was before
<mgz> rogpeppe: you have reviews.
<rogpeppe> mgz: ta!
<voidspace> frobware: this is the /etc/network/interfaces I have
<voidspace> juju-dev
<voidspace> * gberginc has quit (Quit: -sigh-)
<voidspace> frobware: http://pastebin.ubuntu.com/12773634/
<voidspace> a bit weird
<voidspace> ah
<voidspace> auto eth0:1
<voidspace> has been replaced with "auto juju-br0:1"
<voidspace> funnily enough it occurred to me that the sed we were using would clobber interface names where the primary interface was *part* of another interface
<voidspace> dammit
<voidspace> ok, easy enough to fix and test I think
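[Editor's aside: the clobbering voidspace found, and the end-of-line anchoring that avoids it, can be demonstrated in a couple of lines. This is a simplified illustration only; the real script's sed expressions are more involved:]

```shell
#!/bin/sh
PRIMARY_IFACE=eth0
stanzas='auto eth0
auto eth0:1'

# Unanchored substitution: also mangles the alias into "auto juju-br0:1".
echo "$stanzas" | sed "s/${PRIMARY_IFACE}/juju-br0/"

# Anchored to end-of-line: only the whole-interface reference is rewritten;
# "auto eth0:1" survives intact.
echo "$stanzas" | sed "s/${PRIMARY_IFACE}[[:space:]]*\$/juju-br0/"
```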
<mattyw> mgz, here's my yaml migration pr http://reviews.vapour.ws/r/2883/
<voidspace> dooferlad: ping
<mgz> mattyw: thanks!
<mgz> mattyw: anything particularly fun?
<mattyw> mgz, there was one place where yaml setters and getters were changed and I think that reduced line count
<mattyw> mgz, and also there was some validation that just wasn't needed anymore
<mattyw> mgz, https://github.com/juju/juju/pull/3490/files#diff-992668ccf77c55eabded974107a15a3bL94
<frobware> voidspace, how did you get it into that state?
<mgz> mattyw: is it worth wrapping errors from marshal funcs, or will we get enough context anyway?
<voidspace> frobware: that's after a fresh bootstrap
<voidspace> frobware: for some reason it has an "auto eth0:1" in it initially
<frobware> voidspace, I did not run into that case
<voidspace> frobware: were you able to ssh in?
<frobware> voidspace, this through adding additional NICs via MAAS?
<voidspace> frobware: that's ssh'ing into the failed bootstrap node
<frobware> voidspace, ah. without your change?
<voidspace> frobware: the sed code (pre-existing I might add) badly mangles "auto eth0:1" into "auto juju-br0:1"
<voidspace> frobware: no, that's with my change
<mattyw> mgz, I think we get enough context - but opinions may vary
<voidspace> frobware: so I can fix that problem and see if it fixes the replicaset issue
<frobware> voidspace, I never saw that generated with multiple NICs.
<voidspace> just bootstrapping now to see if I've fixed it
<mattyw> mgz, for example for long sequences or maps you don't get much feedback
<voidspace> frobware: even when bootstrap failed?
<mattyw> mgz, but in the cases I've changed you're likely to know what was input anyway
<frobware> voidspace, the only time I saw bootstrap fail (on 1.9) was when I was not running your branch
<frobware> voidspace, btw - I'm currently running 1.25 in a bootstrap/destroy loop and have not seen the replicaset failure.
<voidspace> frobware: with my fix in place I still have an eth0:1, but it isn't mangled by sed
<voidspace> so maybe it will work this time
<voidspace> bootstrap still in progress
<frobware> voidspace, it's not clear to me how you got :1 added
<voidspace> frobware: that's the *initial* /etc/network/interfaces that cloud init modifies
<voidspace> so it comes from maas
<frobware> voidspace, have not seen that.
<voidspace> it has a different address!
<voidspace> the machine has two addresses
<frobware> voidspace, two addresses, 1 NIC?
<voidspace> yep
<frobware> voidspace, aha. Thought you were getting this on 2 NICs- confused.
<voidspace> I have no idea why it is there
<voidspace> but it is, and juju is no longer mangling it
<voidspace> frobware: so the fix is good I think
<voidspace> frobware: I'll push shortly with an extra test
<voidspace> well, if the bootstrap works that is...
<voidspace> nope, same problem
<voidspace> so I think this additional interface in /etc/network/interfaces must be the problem
<voidspace> I wonder if bootstrapping to a different node, not screwed up, would help
<voidspace> I've acquired that node to force maas to pick a different one for bootstrap
<voidspace> trying again
<frobware> voidspace, and the new node has an alias too?
<voidspace> frobware: don't know - will see shortly
<frobware> voidspace, I just added one to my node... previously I was only concentrating on two NICs.
<voidspace> frobware: as far as I can *see* the alias (if that's what it is) isn't defined in virt-manager
<voidspace> no idea where it came from
<frobware> voidspace, I added the alias in the MAAS screen
<frobware> voidspace, so I see eth0 and eth0:1 each with their own addrs
<voidspace> ah, ok - looking at the node in maas
<voidspace> ah yes, it's there
<voidspace> frobware: the other nodes don't have aliases
<voidspace> so that's the problem
<frobware> voidspace, but it is a legitimate combo.
<voidspace> frobware: well, it is
<voidspace> frobware: but it's an existing problem
<voidspace> let's fix the reported bug
<voidspace> and file a new one for the alias
<voidspace> as the current bug is urgent
<frobware> voidspace, fair enough
<frobware> voidspace, but I think we would have to get the new bug fix into 1.25 as well
<voidspace> that would depend on priority of that bug :-)
<voidspace> I also don't immediately know the right fix
<voidspace> I think we would probably need to add bridge_ports to all aliases too
<voidspace> frobware: I'll leave the fix to not *mangle* aliases in
<voidspace> as it's more correct
<voidspace> so I'll write the extra test
<voidspace> but that scenario still doesn't work
<voidspace> I assume because the alias isn't routable and mongo is picking that address
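[Editor's aside: for context on what the generated script produces, after the rewrite the primary NIC ends up enslaved to a juju-br0 bridge via bridge_ports, while with the fix under discussion alias stanzas are left alone. An illustrative sketch, not juju's exact output; the addresses are made up:]

```text
# /etc/network/interfaces after bridging (sketch only)
auto juju-br0
iface juju-br0 inet dhcp
    bridge_ports eth0

# alias stanza left untouched by the end-of-line-anchored sed
auto eth0:1
iface eth0:1 inet static
    address 10.0.0.5/24
```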
<frobware> voidspace, agreed to not mangling aliases
<voidspace> cool
<voidspace> frobware: bootstrap just worked
<voidspace> so that was it
<frobware> voidspace, if it's obvious I'm inclined to make sure it works with aliases too
<voidspace> frobware: it's not obvious to me
<voidspace> frobware: but it's *probably* not hard
<voidspace> let's land this and I can work on it alongside
<frobware> voidspace, ack
<frobware> voidspace, whilst testing your bits I may have made some progress with addressable containers and macvlan...
<voidspace> frobware: latest version pushed
<voidspace> frobware: cool
<voidspace> frobware: if you fancy a quick look over and giving me a ShipIt I'll land it
<frobware> voidspace, looking and trying now...
<voidspace> and then I'll do back/forward ports
<voidspace> although dimitern said he has a simpler fix for 1.24
<voidspace> not sure what that is
<frobware> voidspace, heh - the node I'm using is entitled 'elegant-belief'. :)
<frobware> voidspace, I think the simpler fix is just 2 invocations of sed. one to replace, one to insert the bridge bits.
<voidspace> ah, ok
<voidspace> maas node names are lovely
<voidspace> I'm on imaginative-hose
<mattyw> I wonder which meaning of hose they had in mind
<voidspace> heh
<voidspace> all of them
<frobware> voidspace, I wonder whether as debug we should $(cat /e/n/i) before and after our mods. At least the former may show up before we nuke any access...
<voidspace> frobware: to the client?
<frobware> voidspace, wherever debug goes
<voidspace> frobware: I don't think there is any debug output from cloud init
<voidspace> frobware: we could save a copy before mangling it
<frobware> voidspace, but this is juju doing this bit
<voidspace> frobware: this is cloud init
<voidspace> frobware: this is a bash script we send to the machine using cloud init
<frobware> voidspace, oh. I thought juju did this in place
<voidspace> no, that's why we have to use bash
<voidspace> it's not the juju executable doing this
<voidspace> it's bash code executed as part of machine setup
<frobware> voidspace, so it could still be logged.
<voidspace> frobware: using syslog?
<voidspace> it's before there's a juju log file
<frobware> voidspace, cloud-init can log stuff too
<voidspace> ok
<frobware> voidspace, just to be clear - your latest change is not alias aware, correct?
<voidspace> frobware: it has changes to not mangle aliases
<voidspace> but it doesn't work with aliases
<voidspace> so basically correct, yes
<frobware> voidspace, so no bootstrap possible with aliases (eth0:1)?
<voidspace> frobware: that's correct, but it's also correct with trunk
<voidspace> the new code is marginally better
<voidspace> as trunk code will mangle aliases
<frobware> voidspace, because it does bootstrap OK for me with aliases
<voidspace> frobware: no replicaset issue?
<frobware> voidspace, nope. here with aliases: http://pastebin.ubuntu.com/12773981/
<voidspace> frobware: deleting the alias solved the replicaset problem for me
<voidspace> frobware: using my branch?
<voidspace> frobware: right, I get that output - but bootstrap fails
<voidspace> it may be intermittent
<voidspace> it probably depends which address of the two that mongo reports first
<frobware> voidspace, 7a13a40365df2aa5f13572955783a8ac60fb0dd6 Don't mangle aliases
<voidspace> frobware: cool
<voidspace> frobware: I'll add an alias and try again
<voidspace> frobware: in the meantime, care to give me a ShipIt?
<voidspace> right, bootstrap in progress
<frobware> voidspace, I added one more alias and it no longer bootstraps for me.
<voidspace> heh
<voidspace> frobware: really we would want mongo to use the primary interface not the alias
<voidspace> so I wonder if that's the real problem here
<frobware> voidspace, so the script output ends up in /var/lib/juju/nonce.txt - so we could add the cat before/after to see what we did...
<frobware> voidspace, correction in /var/log/cloud-init-output.log
<voidspace> ah, right
<voidspace> we don't need after, as that's what's in /e/n/i anyway
<voidspace> just before
<voidspace> why the $(cat ...) ?
<voidspace> why not just "cat ..."
<frobware> voidspace, the $(...) was just me delimiting in the written prose...
<voidspace> ah :-)
<frobware> voidspace, the after is a sanity check if the cloud-init is logged to a console and e/n/i is borked. there's then a chance we could see why.
<voidspace> ok...
<frobware> voidspace, so your most recent change does handle the alias -- or so it seems to me.
<voidspace> frobware: ok, that's good
<voidspace> frobware: we can topic it in tomorrow's standup
<frobware> voidspace, added a shipit - do you plan to post any binaries?
<voidspace> frobware: I can create some I guess
<voidspace> wonder where they should go
<frobware> voidspace, http://people.canonical.com/~voidspace
<voidspace> frobware: ah
<voidspace> never used that...
<voidspace> I'll look into it
<voidspace> heh, 404
<frobware> voidspace, nor me. I just saw dimiter publish some stuff there the other day... not sure I can login atm.
<frobware> voidspace, I think you first need to create public_html
<voidspace> right
<voidspace> I'll look into it in a bit - some exercise first
<voidspace> I've set my branch to land on 1.25
<mgz> voidspace: you can (with vpn or chinstrap) scp to lillypilly.canonical.com:~/public_html
<mgz> though it likes scri
<voidspace> mgz: thanks
<mgz> ..screwing up file perms so you have to ssh in to make apache like it anyway
<voidspace> heh, right
<natefinch> ericsnow: anything I can do to help?
<ericsnow> natefinch: I don't think so
<ericsnow> natefinch: you could look through the TODOs in that PR
<natefinch> ericsnow: the LXD PR?
<frobware> voidspace, you still there?
<frobware> voidspace, this worked for me: ssh -i ~/.ssh/id_launchpad_rsa people.canonical.com
<ericsnow> natefinch: yeah, that would help, but I was thinking of the list-payloads one (which probably doesn't actually have many TODOs)
<voidspace> frobware: cool, thanks
<frobware> voidspace, you'll need to be on the VPN
<natefinch> ericsnow: sure, I can look at that one. Probably have a better base of knowledge for it anyway.
<voidspace> frobware: bootstrap with an alias just worked for me too
<frobware> voidspace, it seems a nice by-product of the \s* sed/grep match
<voidspace> yeah
<voidspace> not mangling things is generally good...
<frobware> s/nice/natural <shrug>
<voidspace> :)
<frobware> voidspace, we're just waiting for that to merge in 1.25, correct?
<voidspace> yep
<voidspace> then porting to master
<frobware> voidspace, if so could you update the bug report so at least everybody knows current status. thx.
<voidspace> and maybe 1.24 depending on if dimiter prefers his fix
<voidspace> yep
<voidspace> frobware: updated, really going now
<frobware> voidspace, me too. thanks!
<natefinch> ericsnow: I don't really see any super important todos that need to get done for that code.  Mostly just nice-to-haves and/or open questions.
<ericsnow> natefinch: k
<natefinch> (that code = payloads list)
<ericsnow> natefinch: perhaps catalog what tests are missing?
<natefinch> ericsnow: good idea
<natefinch> ericsnow: I started doing the list as another review of the code, but would it be more useful in textual form?  So we could put it in a card and anyone could work on it/
<ericsnow> natefinch: sgtm
<katco> ericsnow: wwitzel3: time to start again
<ericsnow> wwitzel3: FYI, my list-payloads patch works now
<thumper> dooferlad: ping
<cherylj> Hey davechen1y, are you actively working on bug 1465317
<mup> Bug #1465317: Wily osx win: panic: osVersion reported an error: Could not determine series <osx> <packaging> <wily> <windows> <juju-core:In Progress by dave-cheney> <juju-core 1.24:Triaged> <juju-core 1.25:Triaged> <juju-release-tools:Fix Released by sinzui> <https://launchpad.net/bugs/1465317>
<thumper> cherylj: what's the concern with http://reviews.vapour.ws/r/2868/
<cherylj> thumper: look at http://reviews.vapour.ws/r/2854/
<thumper> looking there now actually
 * thumper thinks
<wwitzel3> ericsnow: awesome, will check it out now
<mattyw> thumper, while we happen to be in the same place (irc) at the same time (now) have you seen rog' email on juju-dev about the user names local domain stuff, and do you have any problem with it?
<thumper> no, not seen it yet
<thumper> so no comment
<mattyw> thumper, ok, I figured that might be the case, just wanted to check
<mattyw> thumper, it's not particularly urgent
 * thumper headdesks
<thumper> two hours at work and it has started already
<thumper> FFS
<mgz> no, no, the poor desk
<mgz> mattyw: reviewed your branch earlier btw
<mattyw> mgz, I saw, thanks very much
<mattyw> mgz, I'm saving the work until tomorrow
<mattyw> mgz, something to look forward to
<mgz> :D
<thumper> cherylj: you were right to be concerned about the flslock changes
 * thumper brings up a juju/utils branch
<mgz> thumper: my argument was we just want to do what bzr used to, which is `kill -0 pid`
<mgz> which approximately works everywhere
<thumper> what is that?
<mgz> just check if the pid in the lock is alive
<mgz> no fancy stuff
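[Editor's aside: the `kill -0` check mgz describes delivers no signal at all; signal 0 only asks the kernel whether the pid exists and may be signalled. A sketch of the bzr-style stale-lock test:]

```shell
#!/bin/sh
# Start a short-lived stand-in for the lock-holding process.
sleep 10 &
pid=$!

# Signal 0: succeeds iff the pid exists (and we may signal it).
if kill -0 "$pid" 2>/dev/null; then
    echo "lock holder $pid is alive"
fi

kill "$pid"
wait "$pid" 2>/dev/null || true

# Once the process is reaped, kill -0 fails with ESRCH.
if ! kill -0 "$pid" 2>/dev/null; then
    echo "lock holder $pid is gone; lock is stale"
fi
```

One caveat: `kill -0` fails with EPERM for a live process owned by another user, so a robust checker treats "permission denied" as alive, not stale.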
<thumper> kill works on windows?
<thumper> I do agree, no fancy stuff
<mgz> there's a very similar equivalent with TerminateProcess
<thumper> surely there is a way to get Go to tell us about a pid
<thumper> implicit breaking of a lock is bad IMO
<thumper> there are better ways
<mgz> I agree just using a better lock primitive would be better regardless
<thumper> http://stackoverflow.com/questions/15204162/check-if-a-process-exists-in-go-way
<thumper> mgz: this does what you said the Go way
<mgz> thumper: looks like we'd actually want x/sys/unix Kill
<thumper> no, because windows
<mgz> I have doubts process.Signal does the right thing on windows
<thumper> I'll write a test and check :-)
 * thumper goes to walk the dog to clear thinking
<wallyworld> perrito666: anastasiamac: running a bit late
<anastasiamac> wallyworld: k. ping when ready :D
<perrito666> wallyworld: ack
<davechen1y> is anyone on call reviewer today ?
<perrito666> davechen1y: I still am
<perrito666> at least in my tz
 * perrito666 shifted his day a bit to the right
<davechen1y> http://reviews.vapour.ws/r/2881/
<davechen1y> http://reviews.vapour.ws/r/2880/
<davechen1y> if you have time please
<perrito666> davechen1y: syre
<perrito666> sure
<wallyworld> anastasiamac: here now
#juju-dev 2015-10-14
<perrito666> davechen1y: the second one was intended to be a fix it and ship it but rb disagreed with me
<perrito666> davechen1y: http://reviews.vapour.ws/r/2880/ the first issue is more a question of personal curiosity than an issue
 * thumper headdesks
 * thumper takes a breath
<thumper> FFS
<anastasiamac> thumper: ?
<thumper> just something that is SO broken
 * thumper submits a reversal
<anastasiamac> thumper: is ur desk titanium? :D
<anastasiamac> or better yet - foam?
<thumper> alright git folks
<thumper> ...
 * thumper works it out
<anastasiamac> thumper: git 42 ./...?
<thumper> reviewer: http://reviews.vapour.ws/r/2886/
<thumper> plz
<thumper> davechen1y: ^^?
<thumper> wallyworld: ping
<wallyworld> yo
<thumper> wallyworld: I need a sanity check
<thumper> quick call?
<wallyworld> sure
<thumper> 1:1
<mup> Bug #1505866 opened: TestRenderNetworkInterfacesScript* tests fail on windows <blocker> <ci> <regression> <test-failure> <windows> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1505866>
<perrito666> thumper: this is just a revert?
<thumper> perrito666: yes
<perrito666> thumper: does it require juju-core fixes to go along?
<thumper> perrito666: no
<perrito666> thumper: then ship it
<perrito666> your comment seems to be on par with the code for what I can see
<perrito666> anyone else needs a review before I go to sleep and you are out of reviewer until its the morning wherever katco lives?
<perrito666> k timeout
<perrito666> bye
<thumper> wallyworld: here's one http://reviews.vapour.ws/r/2887/
<wallyworld> ok
<thumper> wallyworld: just pushing a branch based on what we talked about
<wallyworld> ok
<thumper> wallyworld: http://reviews.vapour.ws/r/2889/
<wallyworld> ok
<thumper> wallyworld: :-( the check means passing the gc.C object all the way through
 * thumper hacks
<wallyworld> thumper: oh. was a suggestion, but i think worthwhile, if you agree
<thumper> I'm just trying it
<thumper> wallyworld: I think this is what you are after, good enough?
<wallyworld> looking
<thumper> wallyworld: most of the attempt strategies are in tests right now
<thumper> and it makes the code less clear
<thumper> IMO
<thumper> also
<wallyworld> fastclock change looks ok
<thumper> the duplication of the unlock is so we have decent logging
<thumper> no point saying we are retrying and then not
<wallyworld> i realise that, but you log after each failed attempt
<thumper> by having the unlock first, then retry loops, better information logging
<thumper> but then you have to check if you are the last one
<thumper> and not log that one
<thumper> I went for this as I thought it was more clear
<wallyworld> ok
<wallyworld> feel free to drop any
<thumper> ok, I'll try to be nice
<thumper> wallyworld: in the end I did everything you wanted
<thumper> just like always
<thumper> :)
<wallyworld> thumper: aw, shucks
<wallyworld> but you didn't have to
<thumper> is there anyone looking at the 1.25 block?
 * thumper shrugs
<thumper> it's fine
<wallyworld> eg the attempt strategy
<wallyworld> and you made a good point about the loop
<thumper> Attempt has a HasNext
<thumper> just used that
<wallyworld> ok
<thumper> internally it is overkill for what is needed
<wallyworld> will be good to see how this improves it
<thumper> but as a UI, it is fine
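[Editor's aside: the logging shape being discussed — announce a retry only when another attempt actually remains, the HasNext-style check — can be sketched as follows. Shell is used purely as illustration; juju's actual code uses a Go attempt strategy:]

```shell
#!/bin/sh
# Hypothetical retry loop: log "retrying" only when there is a next attempt,
# and "giving up" on the last one, so the log never promises a retry
# that never happens.
attempts=3
try=0
flaky_op() {            # stand-in operation that fails twice, then succeeds
    try=$((try + 1))
    [ "$try" -ge 3 ]
}

i=1
while [ "$i" -le "$attempts" ]; do
    if flaky_op; then
        echo "succeeded on attempt $i"
        break
    fi
    if [ "$i" -lt "$attempts" ]; then   # HasNext-style check
        echo "attempt $i failed; retrying"
    else
        echo "attempt $i failed; giving up"
    fi
    i=$((i + 1))
done
```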
<thumper> ugh
<thumper> bug 1505866
<mup> Bug #1505866: TestRenderNetworkInterfacesScript* tests fail on windows <blocker> <ci> <regression> <test-failure> <windows> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1505866>
 * thumper EOD
<thumper> laters
<mup> Bug #1505902 opened: MAAS 1.9: "storage-default-block-source" is not allowed in bootstrap <bootstrap> <ci> <maas> <juju-core:Triaged> <juju-core 1.26:Triaged> <https://launchpad.net/bugs/1505902>
<dimitern> morning all
<frobware> dimitern, hey - can we catch up after the sprint?
<frobware> dimitern, s/sprint/standup/
<dimitern> frobware, hey, yes - I've just accepted it
<dimitern> frobware, upgrading my hardware maas here to 1.9 to test voidspace's fix and mine
<voidspace> dimitern: cool, I'm porting mine to master
<voidspace> dimitern: you're doing 1.24 differently apparently?
<voidspace> dimitern: mine's landed on 1.25
<voidspace> dimitern: including a fix for interface alias mangling
<dimitern> voidspace, I had some comments yes - mostly about taking a simpler approach with the script
<dimitern> voidspace, that last one I couldn't quite get
<voidspace> dimitern: you mean understand?
<dimitern> voidspace, yep
<voidspace> so maas creates aliases like "eth0:1"
<voidspace> dimitern: and if the primary interface is "eth0" then some of the sed manipulations that change "eth0" to "juju-br0"
<dimitern> voidspace, ah I see, so the script needs to replace the whole thing incl. :1 with juju-br09
<dimitern> juju-br0 even
<voidspace> will change some references to "eth0:1" to "juju-br0:1"
<voidspace> yep
<voidspace> which broke mongo
<dimitern> voidspace, hmm good catch though!
<voidspace> so I fixed that by improving the sed regex to check for "${PRIMARY_IFACE}\s*$"
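The anchoring fix voidspace describes can be sketched like this (a hypothetical reduction, not the actual juju cloud-init script; GNU sed's `\s` extension is assumed):

```shell
# Hypothetical reduction of the alias-mangling problem: an unanchored
# substitution also rewrites the "eth0:1" alias line.
PRIMARY_IFACE=eth0

# Unanchored: the alias gets mangled into "juju-br0:1".
echo "auto eth0:1" | sed "s/${PRIMARY_IFACE}/juju-br0/"
# Anchored on end-of-line with \s*$: the alias line is left intact...
echo "auto eth0:1" | sed "s/${PRIMARY_IFACE}\s*$/juju-br0/"
# ...while the bare interface is still renamed.
echo "auto eth0" | sed "s/${PRIMARY_IFACE}\s*$/juju-br0/"
```

The end-of-line anchor works because an alias name always continues with `:N`, so it can never end the line where the bare interface name would.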
<dimitern> voidspace, so did you manage to reproduce the issue on 1.9 before your fix?
<voidspace> dimitern: yep
<dimitern> voidspace, I'm not sure that thing with the aliases was part of the original issue - as you perhaps added 2 IPs to the eth0 NIC, and maas renders those as aliases
<voidspace> dimitern: I didn't change the aliases - I left them intact. I changed the script to not touch them.
<voidspace> dimitern: I don't think it was part of the original issue
<voidspace> dimitern: but it showed up when testing - so I had to work out why it was happening
<voidspace> somehow I'd got an alias for a nic on one node
<frankban> dimitern: you might be interested in http://reviews.vapour.ws/r/2892/ (support service constraints in bundle deployment). if you could take a look it would be great
<voidspace> not sure how
<dimitern> frankban, awesome! will have a look, thanks!
<frankban> dimitern: ty
<voidspace> dimitern: I don't know if you saw - my PR splits the part of the script that does text manipulation on /e/n/i into a separate script
<voidspace> dimitern: so it can be unit tested
 * dimitern has a shiny new hw maas with 1.9.0alpha4 now :)
<voidspace> dimitern: so adding new tests / changing the script is easy
<dimitern> voidspace, sounds good
<dimitern> voidspace, my alternative solution was to do something like sed -i s/${PRIMARY}/{{.Bridge}}/;/iface ${PRIMARY} inet/a\    bridge_ports: ${PRIMARY}" && append "iface ${PRIMARY} inet manual"
<dimitern> frankban, is guibundles branched off of master?
<voidspace> dimitern: that looks pretty much equivalent yeah
<frankban> dimitern: yes, last merged master was a blessed one from the week before seattle. will have to merge it again soon
<dimitern> voidspace, it's shorter but perhaps more dangerous, as it does replace all "eth0" with "juju-br0" indiscriminately
<voidspace> dimitern: yep, it would need fixing to not mangle aliases
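A cleaned-up sketch of dimitern's alternative (hypothetical reconstruction of the garbled one-liner above, with file paths and stanza layout assumed; as both note, it still rewrites aliases indiscriminately):

```shell
# Hypothetical sketch of the "rename, bridge, re-add as manual" approach.
# Caveat from the discussion: the \b-delimited match still rewrites
# aliases like "eth0:1", so it would need the same anchoring fix.
PRIMARY=eth0
BRIDGE=juju-br0
ENI=$(mktemp)
cat > "$ENI" <<EOF
auto $PRIMARY
iface $PRIMARY inet dhcp
EOF
# Rename the stanza to the bridge and give it the NIC as a port:
sed -i -e "s/\b${PRIMARY}\b/${BRIDGE}/g" \
       -e "/iface ${BRIDGE} inet/a bridge_ports ${PRIMARY}" "$ENI"
# Re-add the physical NIC with no address of its own:
printf '\nauto %s\niface %s inet manual\n' "$PRIMARY" "$PRIMARY" >> "$ENI"
cat "$ENI"
```

This yields a `juju-br0` stanza carrying the old `eth0` config plus `bridge_ports eth0`, with `eth0` itself demoted to `inet manual`.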
<dimitern> frankban, ok, I've LGTM-ed it - will you backport the same to 1.25?
<voidspace> dimitern: you can run the tests I wrote against it
<dimitern> voidspace, yeah, I'll now test first your fix then mine
<frankban> dimitern: I was going to ask about what should I do to include this work in 1.26
<frankban> dimitern: so I'll do whatever required
<frankban> dimitern: oh, and thanks for the review!
<dimitern> frankban, well, master (1.26) is not quite up-to-date with 1.25 as I've realized last week, there are a few fixes needed in provider/ec2, but I'll take care of that
<voidspace> dimitern: I'm porting mine to master - if you have a simplification we could do that as a patch on top of what I've already done
<dimitern> voidspace, sounds good - we can have a simpler solution for 1.26 and one that works for 1.24 and 1.25 (even if they're different - shouldn't be a big deal)
<voidspace> dimitern: if you go to an individual node view in maas 1.9 you can add an interface alias and check your fix works with that
<dimitern> voidspace, I'm doing just that now - added 2 aliases and set eth0 (the only NIC the NUC has, unfortunately) to static, 1 alias to auto assign, and 1 to static
<mgz> voidspace: bug 1505949
<mup> Bug #1505949: TestRenderNetworkInterfacesScript* fail on windows <blocker> <ci> <maas-provider> <regression> <test-failure> <windows> <juju-core:Incomplete> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1505949>
<dimitern> mgz, d'oh :) voidspace - it should be skipped in windows
<voidspace> mgz: pthpthpthpth
<mgz> it's easy to fix at least :)
<dimitern> that looked like an interface name with biosdevnames=1 :)
<voidspace> mgz: dimitern: ok, on it
<dimitern> voidspace, cheers - please add a card for it etc.
<voidspace> mgz: it's not in master yet - so I'll land a fix for 1.25 and make sure the fix is in the port to master
<voidspace> dimitern: yep
<mgz> voidspace: also not in 1.24 it seems, a different version of the maas fix landed there?
<voidspace> mgz: not yet
<voidspace> mgz: shortly...
<mgz> okay, thanks!
<mattyw> mgz, regarding http://reviews.vapour.ws/r/2883. I guess your comment is in response to my use of return err in the UnmarshalYaml function?
<mgz> mattyw: yeah, as we're now returning err rather than false from the marshalling funcs, I just wondered if the errors are clearer with trace
<mattyw> mgz, I'll look again, but I think you're right, I think trace would be better, thanks for pointing it out
<mgz> or something like `spaces, err = parseYamlStrings("spaces", val); if err != nil { return errors.Annotate("failed to parse spaces", err) }` type thing
<mgz> don't think we need to go overboard, but may be nice for the complex functions
<mattyw> mgz, it's useful in constraints because there are a few steps to that
<mattyw> mgz, but in version it's best to leave it as it is I think
<mgz> mattyw: yes, that makes sense to me
<mattyw> mgz, just pushing those changes, do you want to take another look?
<voidspace> mgz: dimitern: http://reviews.vapour.ws/r/2893/
<mgz> mattyw: reshipit
<mattyw> mgz, it's going in, thanks very much
<mgz> voidspace: looks fine
<dimitern> voidspace, reviewed
<mgz> dimitern: I'm not sure about your comment
<mgz> really the skipifbug is for stuff we're skipping until we fix the bug
<voidspace> dimitern: it's not a windows bug
<mgz> I don't intend to implement /bin/sh on windows
<dimitern> voidspace, mgz, I see - well, then ignore my comment :)
<voidspace> heh :-)
<dimitern> fwereade, standup?
<mattyw> mgz, anything more I can do to help with yaml.v2 stuff?
<mgz> mattyw: can you look at my charm branch change?
<mgz> rog suggested expanding test coverage on it, so suggestions on that would be helpful
<mattyw> mgz, looking now, there seems to be some controversy about using the latest juju/utils, any idea what's going on there?
<mattyw> mgz, you seem to change it in that pr
<mgz> mattyw: it's fine if we rebump to now latest
<mgz> and it's not code that charm uses
<mattyw> mgz, any idea what that controversy is
<mattyw> mgz, also, any idea what kind of expanded test coverage rog had in mind, it seems ok to me, worrying that I'm missing something
<mgz> see github.com/juju/utils/pull {159,162,163}
<mgz> mattyw: I think he had in mind that the YAMLMarshal tests didn't tell anyone that storage wasn't being serialised
<mgz> and don't say if the old redundant vars should be
<mgz> hm, and this test just failed, did it do that for me last time... aha, no, it's after I rebased on latest
<mgz> well, that's a starting point
<mgz> I'll bump utils and do this
<mgz> ooo, I'm silly
<mgz> I did the rebase wrong, someone had added json attrs to Meta, not yaml ones
<mgz> ...three lots of these is going to be painful
<mgz> mattyw: so, my basic issue is I'm not quite sure what the intended non-bson serialisation of Meta is exactly
<mattyw> mgz, we can get rid of the meta.SetYaml because yaml.v2 does it all for us right?...
<mattyw> mgz, but type Meta is missing the yaml tags, which I think would be useful
<mgz> mattyw: right, it needs them
<mgz> mattyw: I had added them but the rebase conflicted and I dropped my changes
<mattyw> mgz, ah ok
<mgz> mattyw: but I don't quite understand https://github.com/juju/charm/pull/158
<mgz> mattyw: do you know what is using json serialisation of Meta? there are no tests for that in meta_test.go
<mgz> or was that just a mistake and yaml was intended?
<mgz> (it's what I assumed on the conflict...)
<mattyw> mgz, hmm, I'm looking through master for changes to see if it's used there
<mattyw> mgz, it's also possible it's used in one of the feature branches?
<mattyw> mgz, or a mistake?
<mgz> I think it's just a mistake that happened to work
<mgz> because what mattered was the field rename back, and the serialisation tags are ignored... and GetYAML actually did the serialisation
<mgz> ...hm, comment in that branch makes it sound intentional
<mgz> but the pr discussion then makes no sense
<mgz> mattyw: I'm just going to make the yaml serialisation do exactly what it did before
<mgz> and let someone else deal with this confusion
<voidspace> mgz: the windows bug was never committed to master
<voidspace> mgz: can I delete juju-core from that issue
<voidspace> dimitern: http://reviews.vapour.ws/r/2894/
<mattyw> mgz, I'd say that's a wise choice
<mgz> voidspace: I prefer if it's just Invalid and present, otherwise makes lp search even more confused
<voidspace> mgz: the branch has landed on 1.25 (thanks for kicking the PR again) and I've marked it as fix committed
<voidspace> mgz: ok
<voidspace> mgz: I have a branch for master that incorporates the windows skipping
<voidspace> dimitern: shall I backport mine to 1.24, or would you rather land yours there?
<dimitern> voidspace, do yours
<dimitern> voidspace, I'd like to simplify the master
<dimitern> s/master/fix for master/
<mup> Bug #1505866 changed: TestRenderNetworkInterfacesScript* tests fail on windows <blocker> <ci> <regression> <test-failure> <windows> <juju-core:Invalid> <juju-core 1.25:Fix Committed by mfoord> <https://launchpad.net/bugs/1505866>
<voidspace> dimitern: ok
<mgz> mattyw: okay, this is terrifying
<mgz> three different serialisations with slightly different behaviours
<mgz> like, is it intentional or a typo that Tags is json:"tag"
<mattyw> mgz, do we have tests in place for the three different behaviours?
<mgz> I see nothing anywhere using json
<mgz> mattyw: I ran `:%s/ json:\([^ ]\+\)//g` and no tests fail
<mgz> (seriously vim, I always forget which regexp chars need \)
<mattyw> mgz, so, luckily for me I'll be buggering off to lunch soon ;)
<mattyw> mgz, but
<mattyw> mgz, I'd try to break the behaviour and see if any tests in charm or core fail
<mattyw> mgz, and then try to get at least simple test coverage for the 3 behaviours in the charm package
<mattyw> mgz, and then cover it in todo's and bug numbers to sort it out
<mattyw> (if you want my opinion)
<mgz> mattyw: I have fixed my branch, pushing now
<mgz> I'll send a message to list about whether the json stuff was actually intended
<voidspace> dimitern: so for 1.24 I'll just land it
<voidspace> dimitern: for master I'll wait for your review with a suggested simpler approach
<dimitern> voidspace, sgtm, sorry - still in a call
<voidspace> dimitern: np
<mattyw> mgz, taking a long lunch, I'll look at the charm pr when I get back
<rogpeppe> mgz: to get a "bug fix" PR past the bot, where should the "fixes-xxxx" thing go?
<rogpeppe> mgz: in the subject of the PR or the body?
<rogpeppe> mgz: and are the square brackets and quotes mandatory?
<mgz> fixes-xx in comment on github, no other characters required
<rogpeppe> mgz: ah, in a comment!
<rogpeppe> mgz: i tried subject and in the body in 3 forms!
<mgz> rogpeppe: unrelated, I would like to land https://github.com/juju/charm/pull/164 even though the serialisation is insane, and fix the test coverage/intentions in a follow up. the v2 changes to yaml have exactly the same behaviour as before, it's just everything else is nuts
<mgz> rogpeppe: it should probably check the body of the commit for bug mentions too, it's nice to have them there anyway
<rogpeppe> mgz: i'd really like it if you added some serialization tests in a previous branch (without mgo.v2) then we can see how the behaviour changes
<rogpeppe> mgz: otherwise we have no way of knowing what it has changed
<mgz> rogpeppe: so, what I'm changing (the yaml) is covered by testing
<mgz> rogpeppe: what's not covered (the new stuff, storage etc) I've now omitted from serialisation, which is what the code was doing before
<mgz> and the lack of tests of any json serialisation is another thing altogether
<mgz> rogpeppe: so, what I anticipate, is a follow-up branch that adds Storage and PayloadClass and testing for those
<rogpeppe> mgz:  it would be nice to add some tests that have some omitted storage serialisation
<mgz> rogpeppe: and another follow-up that either deletes the json or actually tests it
<rogpeppe> mgz: so that it's easy to see that the followup fixes that
<rogpeppe> mgz: i don't think you can delete the json.
<mgz> okay, will throw one in for that
<rogpeppe> mgz: thanks
<rogpeppe> mgz: at least one API already serialises the charm metadata as JSON
<mgz> well, in the last 16 days it changed to serialising Tags as "tag" rather than "tags"
<mgz> rogpeppe: I give up, just going to have to fix this ;_;
<rogpeppe> oh really? that's bad.
<mgz> I prefer when touching code doesn't mean you have to sort out the bugs of everyone before you
<rogpeppe> mgz: me too
<dimitern> voidspace, I'm having trouble with your fix for 1.25
<dimitern> voidspace, here's what /e/n/i looks like after the juju-br0 script has run: http://paste.ubuntu.com/12780287/
<dimitern> voidspace, here's how it looked when deploying the node via MAAS (no juju involved): http://paste.ubuntu.com/12780294/
<dimitern> voidspace, notice the aliases are not handled ok and there's also no lo device
<dimitern> voidspace, sorry, just to be sure I've deployed another node from maas, configured the same way and lo is missing, but the rest is as expected: http://paste.ubuntu.com/12780344/
<mgz> voidspace: so I have maybe-bad news
<perrito666> man we really need to find a prettier way to include bash into our code
<mgz> voidspace: the networking for lxc on our wily slave seems to be screwed, not sure why yet
<mgz> voidspace: but could be a leak from running local-deploy on that machine with the 1.24 version of the maas networking change
<mgz> or manual deploy
<mgz> 1.25 did not have this problem
<mgz> voidspace: is it possible our real /etc/network/interfaces got stomped?
<dimitern> mgz, what do you see?
<dimitern> mgz, I think voidspace's fix does not do the right thing with /e/n/i
<mgz> just no interfaces other than loopback visible to a non-juju managed lxc container
<mgz> and whoops, I think I just borked networking to the box entirely when poking things ;_;
<jam> mgz: ouch. borking networking is definitely something that hurts, as you use it to fix the thing
<frobware> dimitern, voidspace: I also see the lack of loopback
<dimitern> mgz, hmm that sounds like an issue with /var/lib/lxc/<container>/config
<dimitern> frobware, yeah, I can confirm this on both vivid and trusty, but this is how maas (curtin?) renders /e/n/i
<frobware> dimitern, you mentioned this morning about cloud-init not getting an interface with just a deploy from MAAS... I think?
<dimitern> frobware, on one of the machines, yes - which has a single eth0 configured in maas to DHCP
<frobware> dimitern, so I think I see the same. eth0 is static, with static aliases for eth0:1 and eth0:2
<frobware> dimitern, cloud-init eventually completes post its 120s timeout
<mup> Bug #1506044 opened: juju.worker.diskmanager goes into ERROR loop inside a container <logging> <storage> <juju-core:Triaged> <https://launchpad.net/bugs/1506044>
<dimitern> frobware, and no lo defined at all?
<frobware> dimitern, and although there is no 'auto lo - iface lo inet loopback' entry, after rebooting the machine I do see the loopback device configured.
<dimitern> frobware, ok, so then the lack of loopback perhaps is not an issue?
<frobware> dimitern, I can certainly `telnet 127.0.0.1 22' and it connects so there's something bound and listening...
<dimitern> frobware, right
<dimitern> frobware, I'm experimenting with the ways to deal with juju-br0 where there are aliases
<frobware> dimitern, I'm just wondering if the whole thing could be simpler.
<frobware> dimitern, the aliases seems to work OK, no?
<dimitern> frobware, I managed to get both working: s/eth0/juju-br0/ plus adding "iface eth0 inet manual" and "bridge_ports eth0" to juju-br0's config; AND just replacing "eth0" -> "juju-br0", leaving "eth0:0" and "eth0:1" intact
<dimitern> frobware, the only issue is in the second case I end up having duplicate routes: 192.168.1.0/24 dev eth0 and juju-br0
<dimitern> frobware, voidspace, ok, I'm convinced now the 1.9 fix is not working for me, I'm going back to my original simpler fix and tweaking it to make it work
<frobware> dimitern, there's also the possibility of introducing 'ip link add name juju-br0 type bridge'
<dimitern> frobware, however.. I've just realized we might be having a lot bigger issue here
<dimitern> frobware, yeah, I'll check this too
<dimitern> frobware, the bigger issue is simply, juju-br0 is *supposed* to get its address via DHCP
<dimitern> frobware, and hence the containers on it will do the same
<dimitern> frobware, when we have a statically configured eth0 (or juju-br0) we won't be able to just do dhcp on the container side
<dimitern> frobware, in fact this seems so serious to me that we need to have a chat with the maas team how to resolve this
<dimitern> it won't just work
<dimitern> without dhcp
<dimitern> it might work ok if we had a solid story around addressable containers (with static IPs)
<frobware> dimitern, there's nothing preventing the containers getting DHCP
<frobware> dimitern, MAAS still runs DHCP, no?
<dimitern> frobware, I've experimented with a vivid vbox vm so far, but the case is the same more or less - single eth0 on the vm, configured as bridged to the host's eth0 (which is statically configured, but can also dhcp to the router and get an IP)
<frobware> dimitern, voidspace: so I think this line is broken in that script: "    sed -i "s/auto ${PRIMARY_IFACE}//" {{.Config}}"
<frobware> dimitern, voidspace: this is why you get ":1", ":2" entries
<dimitern> frobware, yeah
<frobware> dimitern, voidspace: this might be better expressed as s/auto ${PRIMARY_IFACE}[^:]//
<voidspace> hmmm
<voidspace> dimitern: frobware: just been on lunch, reading scrollback
<voidspace> doesn't sound good
<voidspace> mgz: ouch
<voidspace> mgz: I only touched maas networking not local
<jam> dooferlad: dimitern: priorities call?
<dimitern> jam, omw, sorry
<mgz> voidspace: it may not be related, but I made it harder to find out by borking the machine
<voidspace> mgz: :-(
<voidspace> frobware: dimitern: so there's an additional missing "\s*$" in the unAuto function - this is what kills the aliases when you have dhcp
<voidspace> dimitern: frobware: note that this is also technically a pre-existing problem - not a new problem with my patch
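The unAuto fix voidspace describes can be sketched the same way (hypothetical reduction of the script line frobware quoted above; GNU sed's `\s` is assumed):

```shell
# Hypothetical reduction of the unAuto fix: drop the "auto eth0" line
# without also eating "auto eth0:1" alias lines.
PRIMARY_IFACE=eth0
# Unanchored: the alias line loses its interface name too.
echo "auto eth0:1" | sed "s/auto ${PRIMARY_IFACE}//"
# Anchored on end-of-line: the alias line survives...
echo "auto eth0:1" | sed "s/auto ${PRIMARY_IFACE}\s*$//"
# ...and only the bare "auto eth0" line is emptied.
echo "auto eth0"   | sed "s/auto ${PRIMARY_IFACE}\s*$//"
```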
<voidspace> lack of loopback is odd
<mattyw> mgz, done https://github.com/juju/charm/pull/164
<mgz> mattyw: I assume we just want the latet version of yaml.v2 everywhere?
<alexisb> voidspace, thank you for getting the fix for 1494476 committed!
<mattyw> mgz, I guess so
<voidspace> alexisb: no problem
<voidspace> alexisb: there might be some more work to be done on it though - at least one scenario it doesn't work on, maybe two
<voidspace> alexisb: to be fair these are really pre-existing problems that allowing maas 1.9 to work at all has uncovered
<alexisb> voidspace, can we please make sure to open bugs for the scenarios that do not work
<voidspace> alexisb: ok
<voidspace> alexisb: hoping to get them done today and not leave any known bugs
<alexisb> voidspace, that works too :)
<alexisb> voidspace, keep me updated
<voidspace> alexisb: ok
<voidspace> dimitern: this code is only used when we don't have addressable containers, right? so containers won't be getting an address via dhcp
<voidspace> that was my understanding.
<voidspace> ah, maybe I'm misunderstanding
<dimitern> voidspace, yes
<dimitern> voidspace, but so far this hinged upon the primary NIC getting its IP via DHCP
<perrito666> sinzui: can you tell me more about https://bugs.launchpad.net/juju-core/+bug/1403689 ?
<mup> Bug #1403689: Server should handle tools of unknown or unsupported series <tech-debt> <upgrade-juju> <upload-tools> <juju-core:Triaged> <juju-core 1.22:Triaged> <juju-core 1.24:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1403689>
<perrito666> it seems like there was some discussion around it between thumper and you
<sinzui> perrito666: there are several cases where users cannot upgrade because their old server doesn't know about wily or windows or centos. Why should the server care if there are wily agents in the streams? It wants a precise or trusty agent and they are in the streams
<sinzui> perrito666: Tim suggested that the server should pull the agents without judging the series/os they are for.
<perrito666> I tend to agree
<sinzui> perrito666: some users have synced agents to their local disk, deleted the wily and windows agents, then published private streams to complete an upgrade. The state server could have ignored them.
<perrito666> why is that targetted to 1.22?
<perrito666> ok I am taking it, ill ignore the 1.22 milestone for now
<sinzui> perrito666: the bug is old and 1.22 users are affected
<sinzui> +1 perrito666
<perrito666> oh so this actually needs to be backported eventually ?
<perrito666> sinzui:  do we have any of the failure reports associated?
<sinzui> perrito666: not from CI. I think the affected users were on precise or trusty 1.22.6/8 with stale distro-info
<voidspace> frobware: dimitern: so I don't see how the loopback is being removed
<voidspace> frobware: dimitern: do you have the /e/n/i before it gets modified?
<frobware> voidspace, there's a PB far above ^^^ :)  http://paste.ubuntu.com/12780294/
<dimitern> voidspace, the loopback is not pre-rendered by curtin/maas
<voidspace> frobware: I saw that - I thought that was *after*
<voidspace> dimitern: so maas doesn't render loopback sometimes *ouch*
<voidspace> dimitern: that sounds like a maas bug
<frobware> voidspace, I see lo in that PB ^^^
<voidspace> frobware: there are two PB
<voidspace> maybe three
<voidspace> frobware: oh, you did post the link
<dimitern> voidspace, indeed - I haven't seen it render /e/n/i with lo in it when there's per-node network config specifically set; however during deployment, if you quickly ssh in you'll see something like "auto lo...auto eth0 ...dhcp" (configured by cloud-init-dyn-network something)
<frobware> TheMue, moved our 1:1 for tomorrow - apologies for the short notice.
<voidspace> frobware: we don't remove loopback for that
<TheMue> frobware: no problem, feel free
<voidspace> frobware: it's only the alias issue which was fixed for static but not dhcp
<voidspace> frobware: so I think the missing loopback is maas fault not ours
<frobware> voidspace, based on the script it's not clear how it is removed.
<frobware> voidspace, but it did exist at some point. See: /etc/network/interfaces.old
<frobware> voidspace, contents are here: http://pastebin.ubuntu.com/12781804/
<voidspace> frobware: when I run that input (the first pastebin here) through our script, loopback is not removed :-)
<voidspace> frobware: dimitern thinks that maas renders it without loopback and then adds it
<voidspace> frobware: so maybe there's a race condition
<voidspace> sounds like we need to talk to the maas guys about it
<frobware> voidspace, If I just deploy I do get lo.
<voidspace> frobware: I've not seen it without
<voidspace> frobware: here's a fix for the DHCP with aliases issue
<frobware> voidspace, so what's the plan here? The lo disappearing is odd. Fixing the alias thing I agree is new.
<voidspace> http://reviews.vapour.ws/r/2896/
<frobware> voidspace, heh - beat me to it...
<voidspace> frobware: I'm 99% certain *we're* not screwing loopback in this code
<voidspace> frobware: we *might* be overwriting MAAS as it adds it (a race condition because both maas/curtin and us are modifying /e/n/i)
<voidspace> frobware: if that's the case we either need to poll and wait for loopback to appear, or talk to the maas guys about a better fix
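The "poll and wait" option could look roughly like this (purely a hypothetical sketch against a temp file, simulating the suspected race rather than touching the real /e/n/i or cloud-init script):

```shell
# Hypothetical sketch: wait (bounded) for maas/curtin to finish writing
# the loopback stanza before we start rewriting the file.
ENI=$(mktemp)
echo "auto eth0" > "$ENI"
# Simulate maas/curtin appending lo a moment later:
( sleep 1; printf 'auto lo\niface lo inet loopback\n' >> "$ENI" ) &
# Poll for the loopback stanza, giving up after ~20s.
for i in $(seq 1 20); do
    grep -q '^auto lo' "$ENI" && break
    sleep 1
done
wait
grep -q '^auto lo' "$ENI" && echo "loopback present, safe to rewrite"
```

Talking to the maas team remains the better fix, since polling only papers over the race.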
<frobware> voidspace, it might be easier to do this substitution in perl using non-greedy matches.
<voidspace> frobware: so probably raise a bug for that problem
<voidspace> frobware: I think that regex is fine
<voidspace> frobware: and is perl preinstalled? cloudinit is executed before package installation I understand
<voidspace> so we can't depend on anything not in a base ubuntu install
<voidspace> we could use Python as that is in a base install
<voidspace> but dimitern wasn't keen on doing that...
<voidspace> and if we're going to not use bash I'd rather use Python :-)
<voidspace> pretty sure it has non greedy matches too
<frobware> voidspace, so here's a thing... can we temporarily test with writing /e/n/i to some other place and do a few runs to see if we somehow(??) remove lo?
<voidspace> frobware: well - we have a test harness that does that
<voidspace> frobware: you can give it whatever input you want
<frobware> voidspace, I was more interested in the dynamic nature then the pre-canned test input, but sure...
<voidspace> frobware: well, the only bit that's different is the code that sets PRIMARY_IFACE
<voidspace> but even if that turns out empty I can't see any way that code could remove the loopback entries
<frobware> voidspace, agreed.
<frobware> voidspace, for 99% of the time. :)
<voidspace> well, running the same script multiple times on the same input won't make any difference - the part that modifies /e/n/i is deterministic
<voidspace> if there was a bug we'd need to determine the right input to feed it
<voidspace> frobware: *grrr*
<voidspace> frobware: you sure you want to work on the networking team?
<voidspace> frobware: :-)
<frobware> voidspace, OS bugs are harder. :)
<voidspace> heh
<frobware> voidspace, invariably there's no actual machine to debug...
<frobware> voidspace, dimitern: see recent conversation on #maas
<frobware> voidspace, we will always get lo whether it's there or not
<voidspace> ah, I can join #maas - won't be able to see past conversation
<voidspace> frobware: dimitern left
<voidspace> I've joined #maas
<wwitzel3> ericsnow: pig
<wwitzel3> lol
<wwitzel3> ericsnow: ping
<voidspace> wwitzel3: how rude...
<wwitzel3> ;)
<ericsnow> wwitzel3: hey
<voidspace> wwitzel3: ericsnow may not be the prettiest member of juju but that's plain rude
<frobware> voidspace: http://pastebin.ubuntu.com/12782022/
<ericsnow> lol
<voidspace> frobware: thanks, reading
<voidspace> frobware: ah
<voidspace> frobware: so if it's not there it's not a problem
<wwitzel3> ericsnow: hey, so the existing SetStatus stuff actually maps pretty well to what we want, I'm just going to gut the PluginStatus / StateStatus / CombinedStatus stuff and pass in a string.
<voidspace> frobware: so we can ignore that issue?
<wwitzel3> ericsnow: my question is, in the list command, what status are you showing in that column?
<frobware> voidspace, yep. maybe chasing ghosts here.
<voidspace> well - I've put up a PR for 1.25 fixing dhcp with aliases
<voidspace> I'll backport to 1.24 and roll into the branch for master which is still awaiting a review from dimitern I believe
<wwitzel3> ericsnow: is it the workload.Details.Status or workload.Status? .. it doesn't really matter since I'm updating both items in the doc
<wwitzel3> ericsnow: but I'm curious which the list command is using (I guess I could go look)
<voidspace> frobware: perrito666 has approved the 1.25 PR so I'll crack on with that
<voidspace> frobware: oh wait, wrong PR - he *will* look at it, hasn't yet
<ericsnow> wwitzel3: Payload.Status (which comes from Workload.Status.State)
<wwitzel3> ericsnow: ok, perfect
<frobware> voidspace, maybe we should add the cat /e/n/i as we talked about yesterday. at least that way we would know what it is to start with.
<voidspace> frobware: ok
<mup> Bug #1506121 opened: cloud-images query format deprecated, lxc should use simplestreams <juju-core:New> <https://launchpad.net/bugs/1506121>
<voidspace> frobware: doing that, including it in this PR
<voidspace> frobware: but it will require manual testing
<voidspace> frobware: that's pushed here if you want to take a look: http://reviews.vapour.ws/r/2896/
<voidspace> frobware: I'm just trying it out
<frobware> voidspace, ack
<voidspace> frobware: logging with cat seems to work fine
<voidspace> frobware: I already have a +1 on the branch so I'm landing it
<katco> ericsnow: natefinch: wwitzel3: http://reviews.vapour.ws/r/2898/
<frobware> voidspace, here's some sample output from my quick test: http://pastebin.ubuntu.com/12782441/
<natefinch> katco: looking
<ericsnow> katco: lgtm
<katco> ericsnow: ty
<voidspace> frobware: looks good to me
<natefinch> katco: Ditto to mr speedy up there
<katco> natefinch: ty
<frobware> voidspace, I'm heading out but seems all OK to me.
<voidspace> frobware: cool
<voidspace> frobware: see you later
<rogpeppe> mgz: any possibility we could get CI running on the chicago-cubs branch soonish?
<rogpeppe> mgz: the bug has been fixed now (or *should* have been)
<rogpeppe> sinzui: ^
<mgz> rogpeppe: yup, will hopefully happen by tomorrow
<rogpeppe> mgz: thanks
<mgz> we have lots of people doing work, and I slowed things down by booming the wily slave
<rogpeppe> mgz: when can we remove the critical status of the bug? will that happen automatically if it gets blessed?
<rogpeppe> mgz: i'm concerned that it might take a week before we can land anything
<alexisb> wwitzel3, ping
<wwitzel3> alexisb: pong
<alexisb> heya wwitzel3, I was just curious if you had a build of the lxd provider branch I could pull down and play with
<wwitzel3> alexisb: https://github.com/wwitzel3/juju/tree/skullcrusher has lxd and the deploy branches merged in to it right now
<wwitzel3> alexisb: ericsnow might have a more up-to-date one than that .. that is the same one from the sprint.
<alexisb> thank you wwitzel3
<ericsnow> alexisb: try the lxd-provider feature branch of github.com/juju/juju
<katco> wwitzel3: hey the link from your email to docs isn't working
<mup> Bug #1505902 changed: MAAS 1.9: "storage-default-block-source" is not allowed in bootstrap <bootstrap> <ci> <maas> <juju-core:Invalid> <juju-core 1.26:Invalid> <https://launchpad.net/bugs/1505902>
<katco> wwitzel3: looking at your pr too... why is it so large? seems like there's a lot of registration code in there that should have already existed in the feature branch already?
<natefinch> katco: looks like it includes all of eric's stuff, too
<natefinch> katco: like the list-payloads stuff
<katco> natefinch: still, much of that registration stuff should have already been in the feature branch
<natefinch> katco:  we hadn't implemented the CLI commands yet, so there wasn't the CLI command registration code
<alexisb> ericsnow, does your lxd branch have bundle support?
<ericsnow> alexisb: what do you mean "bundle"?  charm bundles?
<ericsnow> alexisb: if it's something extra then it probably doesn't have it :)
<alexisb> :) ok
<alexisb> nws :)
<alexisb> ericsnow, just fyi, the bundle support I am referring to is the "guibundle" branch that adds support for deploying bundles in core
<ericsnow> alexisb: not included then
<wwitzel3> katco: I had merged eric's branch in to test the charm with both commands, but forgot to backout the merge when I pushed up a fix
<wwitzel3> katco: I resolved it, it is back to normal
<katco> wwitzel3: k
<mup> Bug #1506225 opened: Failed bootstrap does not clean up failed environment <juju-core:New> <https://launchpad.net/bugs/1506225>
<mup> Bug #1506225 changed: Failed bootstrap does not clean up failed environment <juju-core:New> <https://launchpad.net/bugs/1506225>
<wallyworld> anastasiamac: standup?
<anastasiamac> wallyworld: m standin' :D
#juju-dev 2015-10-15
<jog> wallyworld, are you able to give me permission to tag github.com/juju/juju.git ?
<wallyworld> sure
<jog> my git user is 'jogeo'
<wallyworld> jog: sorry, browser crashed. does it work now?
<jog> wallyworld, yup thanks!
<wallyworld> np
<wwitzel3> anything interesting come out of the meeting?
<anastasiamac> wwitzel3: u mean team meeting?
<wwitzel3> anastasiamac: yeah
<wwitzel3> ericsnow: ping, you happen to still be around?
<anastasiamac> wwitzel3: well, we just caught up on new terminology (and what the implications are for code, docs, blogs), restructure of core and spec contents/processes
<anastasiamac> wwitzel3: it was short and quick :D
<anastasiamac> wwitzel3: as in brief :P
<wwitzel3> is there a listing of the new terminology?
<anastasiamac> environment = model
<anastasiamac> service = application
<anastasiamac> JES = controller
<anastasiamac> tools = agents
<anastasiamac> workload = payload
<anastasiamac> m adding this to the meeting minutes
<wwitzel3> anastasiamac: great, thank you
<anastasiamac> better to confirm with rick_h_ too :D
<anastasiamac> wwitzel3: i may have heard relation = link (but it could b jetlag too)
<rick_h_> anastasiamac: confirm what?
<rick_h_> anastasiamac: wwitzel3 terminology looks good
<wwitzel3> thanks
<rick_h_> relation stays, just talked a lot as forming a link
<natefinch> anastasiamac: can you explain the environment = model thing?  Environment is kind of overloaded, but when I think environment, I think like my production juju environment on AWS or my testing juju environment on MAAS
<rick_h_> natefinch: the thinking is that juju models the working system and a controller of several models seems ok
<rick_h_> natefinch: there's some more, can chat on how the discussions went the last two weeks.
<natefinch> rick_h_: sure, just wondering how we actually talk about these things... I can say "juju deploy adds a new service to the model".... but I can't really say "a machine just went down in our production model"
<rick_h_> natefinch: interesting use case. I'd skip that by saying models have names in a controller and you'd actually say "a machine died in production-analytics"
<rick_h_> but admit I'm cheating
<natefinch> heh ok :)
<natefinch> rick_h_: seems like there's really two things - there's juju's model of the world, stored in mongo, which is the model... and then there's the reality attached to that model.
<rick_h_> natefinch: which do you talk to and interact with via the cli?
<rick_h_> natefinch: I think you are right in that distinction though.
<natefinch> rick_h_: depends on the command.... things like deploy and add machine really just change the model, which the controller then attempts to modify reality to match
<natefinch> rick_h_: but then if you're doing juju status - you're showing both the model and reality
<rick_h_> natefinch: yea, but then again it's the model's view of reality.
<jam> wallyworld: you still have a 'juju-charm-demo' instance running. is that intentional?
<wallyworld> jam: no, it is not, i thought i had killed them all. damn
<jam> wallyworld: it started something like 25hrs ago
<jam> so roughly yesterday
<wallyworld> hmmm, i don't recall starting it but i must have
<wallyworld> killed now
<anastasiamac> rick_h_: thank you for the explanation - it would be gr8 to have the terminology discussion with Nick and Peter present to ensure that docs reflect our world :)
<mup> Bug #1506338 opened: state/leadership: sporadic test timeout <juju-core:New> <https://launchpad.net/bugs/1506338>
<jam> domas, fwereade: I just filed https://bugs.launchpad.net/juju-core/+bug/1506353
<mup> Bug #1506353: leadership resolver still too noisy <logging> <uniter> <juju-core:Triaged> <https://launchpad.net/bugs/1506353>
<jam> it seems the new dependency engine is waking up every 30s or so and logging that there is essentially "nothing to do"
<jam> which seems ok, until you have 100s of units doing it
<mup> Bug #1506353 opened: leadership resolver still too noisy <logging> <uniter> <juju-core:Triaged> <https://launchpad.net/bugs/1506353>
<mattyw> wallyworld, ping?
<wallyworld> mattyw: hey, just otp with fwereade , can i ping you soon?
<mattyw> wallyworld,
<dimitern> frobware, ping
<dimitern> dooferlad, voidspace, fwereade
 * dooferlad is AFK for ~25 mins
<fwereade> axw, wallyworld: offhand, non-urgent, but in case you see it before sleep: what would be causing the resolver loop to wake up every 30s?
<mattyw> mgz, ping?
 * dooferlad is back
<perrito666> morning
<voidspace> perrito666: 'ning
<TheMue> perrito666: 'ng
<TheMue> voidspace: next step only 'g? ;)
<voidspace> TheMue: :-)
<mattyw> mgz, there was some code in core in cmd/status that hadn't been moved onto yaml.v2. It was still working because juju/cmd was still on yaml.v1, but when you update the dep it breaks. I'm fixing that one up; it doesn't look like there will be more, but obv' we'll have to look out for it
<mup> Bug #1506460 opened: On juju-br0 interface creation the inet6 addresses of the original interface are lost <ipv6> <juju-core:New> <https://launchpad.net/bugs/1506460>
<dimitern> voidspace, hey
<dimitern> voidspace, I managed to confirm your fix on 1.25 works with a few different net configs for nodes in maas 1.9
<dimitern> voidspace, I'm trying a few more confs now (so far tested lxc and kvm connectivity)
<voidspace> dimitern: cool
<voidspace> dimitern: I have containers working with 1.25 and the fix for master
<voidspace> dimitern: I set the master branch to land but tests failed
<voidspace> looking at it now
<voidspace> cmd_juju_subnet_test panicked
<dimitern> voidspace, so I'm having issues with my local maas network setup I suspect, as it *always* takes ~5m to deploy a node (w/ or w/o juju) due to cloud-init waiting for 10 then 120s for the network to come up (even though it's statically configured)
<voidspace> bah
<dimitern> voidspace, hmm - what was the issue?
<voidspace> ah, no reachable servers
<voidspace> that happens from time to time - can happen to any test
<voidspace> will retry
<mup> Bug #1503740 changed: Storage should be behind a feature flag in 1.24 <storage> <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1503740>
<dimitern> right
<voidspace> it's not good, but it's not my immediate problem...
<voidspace> I'm going on lunch
<voidspace> bbiab
<dimitern> also there's the bug above, just filed - when we have inet6 config in /e/n/i we should be handling it properly
<mup> Bug #1503740 opened: Storage should be behind a feature flag in 1.24 <storage> <juju-core:Fix Released by axwalk> <https://launchpad.net/bugs/1503740>
<mup> Bug #1226307 changed: juju-core lazily get tools if from public bucket <bootstrap> <performance> <juju-core:Fix Released> <https://launchpad.net/bugs/1226307>
<frobware> dimitern, I see the same slow deployment time. This is what I thought you reported yesterday.
<dimitern> frobware, I suspect the issue might be due to having 1 NIC with 2 aliases all on the same subnet
<frobware> dimitern, agreed. I haven't been scientific enough in recording which does/does not work...
<dimitern> frobware, voidspace, also, I just did a test with one of the NUCs having 2 NICs bonded in bond0
<dimitern> script needs to handle this case better - connectivity was lost, it doesn't work
<frobware> dimitern, can you please raise a bug for this explicit case
<mup> Bug #1226307 opened: juju-core lazily get tools if from public bucket <bootstrap> <performance> <juju-core:Fix Released> <https://launchpad.net/bugs/1226307>
<dimitern> frobware, sure
<dimitern> frobware, what was that ip link command to add the bridge?
<dimitern> frobware, ip link add link juju-br0 type bridge doesn't seem to work
<natefinch> fwereade: I fixed a bunch of the issues you had with the unit assigner worker.. some of the comments I responded to without fixing, as a couple of them would have been quite large refactors.  Would appreciate if you could take a look  http://reviews.vapour.ws/r/2814/
<frobware> dimitern,  sudo ip link add name juju-br0 type bridge
<dimitern> frobware, right, thanks!
<dimitern> frobware, I might have found a workaround the bond issue
<dimitern> frobware, using that ip link add + ip link set up seems to do the trick without loss of connectivity
<dimitern> double checking now..
<frobware> dimitern, would be worth verifying the UP status of the link
<dimitern> frobware, I've added dumping of all routes, links, and addresses at the end of the script
<frobware> dimitern, using `ip addr' ?
<frobware> dimitern, or ifconfig?
<dimitern> frobware, all three with iprotue2
<dimitern> iproute2 even
<wwitzel3> ericsnow: ping
<wwitzel3> katco: I might be a couple minutes late to standup
<katco> wwitzel3: np we can wait
<wwitzel3> anastasiamac: yeahback
<wwitzel3> oops
<wwitzel3> back
<mup> Bug #1506498 opened: juju-br0 not configured properly with maas 1.9 on machines with 2 bonded NICs <juju-core:New> <https://launchpad.net/bugs/1506498>
<dimitern> frobware, so in addition to adding and upping the bridge, we need to wait for both the bond and the bridge to become ready, pinging the default GW
<dimitern> frobware, filed bug 1506498
<mup> Bug #1506498: juju-br0 not configured properly with maas 1.9 on machines with 2 bonded NICs <juju-core:New> <https://launchpad.net/bugs/1506498>
<mup> Bug #1506498 changed: juju-br0 not configured properly with maas 1.9 on machines with 2 bonded NICs <juju-core:New> <https://launchpad.net/bugs/1506498>
 * dimitern needs to step out, will continue a bit later with the tests
<rogpeppe> nice bikeshed opportunity anyone! i want to factor writeServerFile out of cmd/juju/user so that it can be used by cmd/juju/system too. but... where should the new location be?
<rogpeppe> possibilities considered so far: cmd/juju/system, cmd/juju/common, environs/configstore
<rogpeppe> fwereade, dimitern, jam: suggestions?
<rogpeppe> surely *someone* must have an opinion! dimitern, cmars, natefinch, mgz... ?
<mgz> rogpeppe: I really don't :)
<rogpeppe> mgz: bah!
<mgz> well, I think "common" sucks as a concept, but we have it
<rogpeppe> mgz: FWIW i don't like any of the above suggestions
<rogpeppe> mgz: i agree totally. i don't want to make it worse
<rogpeppe> mgz: currently i'm thinking of a new package, maybe github.com/juju/juju/serverfile
<rogpeppe> or perhaps cmd/juju/serverfile
<rogpeppe> mattyw: nice bikeshed opportunity anyone! i want to factor writeServerFile out of cmd/juju/user so that it can be used by cmd/juju/system too. but... where should the new location be?
<rogpeppe> mattyw: [15:46:25] <rogpeppe> possibilities considered so far: cmd/juju/system, cmd/juju/common, environs/configstore
<natefinch> rogpeppe: nope, no real opinion. If it's just for cmd/juju, keep it under there somewhere
<rogpeppe> natefinch: well, it's potentially something that someone external to juju might want to use
<mattyw> rogpeppe, I take offence at the assumption that I always enjoy a bikeshed
<mattyw> rogpeppe, so the punishment is, whatever I say you have to agree with 100%
<cmars> rogpeppe, i probably wouldn't import something out of juju/juju though, because it's such a huge checkout from github
<rogpeppe> mattyw: not a chance. the way of bikesheds is: noone gives a toss until a decision is made, then everyone disagrees with it
<rogpeppe> cmars: i don't understand
<cmars> rogpeppe, wait, are we talking juju/juju/cmd/juju or cmd/juju ?
<rogpeppe> cmars: the former
<cmars> er, juju/cmd. ok
<mattyw> rogpeppe, I'd put it in environs/server_file.go for now
<rogpeppe> mattyw: i'm not sure. isn't environs a bit overburdened already?
 * rogpeppe checks again
<mattyw> it is, it's the best of the three options presented
<mattyw> I could also go with environs/serverfile/write.go or something
<rogpeppe> mattyw: yeah, that might be reasonable
<rogpeppe> mattyw: although it's not really much to do with environs
<mattyw> rogpeppe, also you shouldn't miss this chance to rename it writeControllerFile
<rogpeppe> mattyw: ha
<rogpeppe> mattyw: if there's gonna be some renaming, let's do it consistently all at once across the code base please
<mattyw> rogpeppe, since when has that been a think?
<mattyw> thing
<mgz> the thing takes a *cmd.Context; does moving it out of cmd make sense?
<rogpeppe> mattyw: it seems like the only sane approach to me
<mgz> I have not internalised our import graph, but pretty sure environs generally gets imported by cmd not the other way around
 * mattyw looks again
<rogpeppe> mgz: i don't think it should take a cmd.Context
<rogpeppe> mgz: or an EndpointProvider come to that
<mgz> fair enough, if refactored to be a different interface have wider options
<mattyw> those are good points, it should basically be given the structure and where to put it, and it writes it there I'd imagine
<rogpeppe> mattyw, mgz: i'm thinking of something like: http://paste.ubuntu.com/12790951/
<rogpeppe> mattyw: and given that it refers to environs/configstore, perhaps environs/serverfile would work best
<mattyw> rogpeppe, I'd be +1 all of that except for the incomprehensible lack of godocs in the paste ;)
<mgz> that does seem pretty reasonable. with filename being an abspath.
<rogpeppe> mgz: yup
<mattyw> mgz, while you're here: juju/juju/cmd/status had mention of GetYAML and was still using yaml.v1 from github.com/juju/cmd
<mattyw> mgz, I started the work to move it over - does that conflict with anything you're doing?
<mgz> mattyw: nope, just missed it in the grep
<mgz> problem with packages that have go get as their gating mechanism
<mattyw> mgz, ok - was sort of hoping I'd get a yeah and I've fixed all the problems
<mattyw> mgz, but ok
<rogpeppe> mgz, mattyw: actually, another possibility is juju/juju/envcmd which is where ServerFile (the serialisation format) is defined
<rogpeppe> mattyw: did you find out what was calling GetYAML ?
<mattyw> rogpeppe, it was using github.com/juju/cmd which was still on yaml.v1
<rogpeppe> mattyw: ha!
<mattyw> rogpeppe, that could make sense - I suppose it is quite cmd related
<rogpeppe> mattyw: if in doubt, use showdeps -a | grep yaml.v1
<rogpeppe> mattyw: i suppose :-\
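For context on the migration being discussed: yaml.v1 let types customise marshalling via a `GetYAML() (tag string, value interface{})` method, while yaml.v2 replaced that with the `Marshaler` interface's `MarshalYAML() (interface{}, error)`. The sketch below shows the shape of the change with a hypothetical `serviceStatus` type and a toy dispatcher standing in for the encoder (stdlib-only, no yaml import).

```go
package main

import "fmt"

// yaml.v1's getter interface: types implemented GetYAML to
// customise marshalling (gopkg.in/yaml.v1).
type v1Getter interface {
	GetYAML() (tag string, value interface{})
}

// yaml.v2's replacement: the Marshaler interface
// (gopkg.in/yaml.v2).
type v2Marshaler interface {
	MarshalYAML() (interface{}, error)
}

// serviceStatus is a hypothetical type mid-migration.
type serviceStatus struct {
	Name string
	Err  error
}

// MarshalYAML is the yaml.v2 form: return the value to encode
// and an error, rather than a tag/value pair.
func (s serviceStatus) MarshalYAML() (interface{}, error) {
	if s.Err != nil {
		return nil, s.Err
	}
	return map[string]string{"name": s.Name}, nil
}

// describe mimics an encoder's dispatch: prefer the v2
// interface, fall back to v1 for unmigrated types.
func describe(v interface{}) string {
	switch m := v.(type) {
	case v2Marshaler:
		out, err := m.MarshalYAML()
		if err != nil {
			return "error: " + err.Error()
		}
		return fmt.Sprintf("v2: %v", out)
	case v1Getter:
		tag, val := m.GetYAML()
		return fmt.Sprintf("v1 (%s): %v", tag, val)
	}
	return fmt.Sprintf("plain: %v", v)
}

func main() {
	fmt.Println(describe(serviceStatus{Name: "wordpress"}))
}
```

Types left implementing only GetYAML keep building but are silently marshalled as plain structs by yaml.v2, which is why the breakage at the top of this thread only surfaced once the juju/cmd dependency was updated.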
<mgz> mattyw: hm, did my subnet yaml branch not land? I'm now confused by the state of master
<mattyw> rogpeppe, well it's the cmd's that use it right?
<mattyw> mgz, subnet?
<rogpeppe> mattyw: currently, yes
<mattyw> mgz, cmd/subnet?
<mgz> ah, I am confused
<rogpeppe> mattyw: although it's a format that others might want to parse
<mattyw> rogpeppe, so it sort of makes sense
<mgz> I removed redundant junk in spaces
<mgz> missed the same in subnets
<rogpeppe> mattyw: e.g. if you were implementing a juju client in python
<mgz> mattyw: unless you've started already on status, I'll do it
<mattyw> mgz, I've started - but am stuck
<mattyw> mgz, I'll push what I have so far so you can take a look if you like
<mgz> mattyw: I'll have a look, touched this code before so I should remember what's going on
<mattyw> mgz, but a couple of tests are failing - "service-status" isn't appearing in the right places
<mattyw> mgz, one moment
<mgz> mattyw: meanwhile, http://reviews.vapour.ws/r/2911/
<mattyw> mgz, I migrated those, sure they're not needed?
<mgz> mattyw: yup
<mgz> I should have seen them when I killed the other ones
<mgz> it's noop copy code
<mattyw> mgz, cool, I suspect it might be the same in the other stuff I'm doing
<mattyw> mgz, but I haven't got it worked out in my mind yet
<mgz> status is the one place we really do fancy stuff
<mattyw> (still waiting to push)
<mgz> but people have looked at it for yaml examples and been confused as to what's actually needed
<mgz> mattyw: I evilly turned off my hook a while back and have yet to reenable it
<mattyw> mgz, St Peter doesn't take kindly to that kind of thing
<mattyw> mgz, here it is http://reviews.vapour.ws/r/2912/
<mgz> mattyw: ta!
<mgz> mattyw: is the famous and much copied rog comment actually still true?
<mgz> or can you inline declare types in the methods?
<mattyw> mgz, I don't know - I was hoping to get tests passing, then see if I can remove it and still have the tests pass
<mgz> :D
<mgz> mattyw: I'll poke this after standup
<mattyw> mgz, rog was kind enough to explain it to me this morning - but it didn't help my understanding, I was hoping to talk to him after ods to go over it in a bit more detail
<wwitzel3> ericsnow, natefinch: aaand again
<ericsnow> wwitzel3: ouch
<wwitzel3> it is odd, it is a clean shutdown, but it is like it is happening in the background .. syslog shows systemd shutting things down before it dies
<mgz> bogdanteleaga: if you have a mo, can you look at http://reviews.vapour.ws/r/2860 for the centos changes
<mgz> katco: and you got a feature test for your last minute feature right? :P
<katco> mgz: (whistles)... we had one at some point
<katco> mgz: i think it got lost in the shuffle
<natefinch> mgz: we're happy to write one for next friday
<bogdanteleaga> mgz: looks sane to me, did you test it on centos?
<natefinch> fwereade: reviewed your uniter logging PR - http://reviews.vapour.ws/r/2913/
<mgz> bogdanteleaga: I do not know how at present, jog is working on centos on our maas, was wondering if you had a working public cloud setup anywhere
<bogdanteleaga> mgz: we didn't try it out in public clouds yet, only maas
<bogdanteleaga> mgz: have you tried http://wiki.cloudbase.it/juju-centos ?
<mgz> bogdanteleaga: I assume that's basically what jog followed, he had fun with nic and maas 1.9 though
<mgz> anyway, we should have that as part of our revision testing soon enough, then I can rerun the branch with centos
<jog> mgz, 1.9 has support for importing CentOS images (i.e. it automatically picks them up from the daily image streams), so I did not have to generate my own images or manually add them.
<jog> https://maas.ubuntu.com/docs/changelog.html#alpha2
<mgz> jog: did you file any bugs for your troubles?
<jog> There is a bug for the LVM issue already
<jog> the second issue was related to having multiple NICs and the MAAS DHCP listening on the second NIC. However, CentOS only configured the first... I've not filed a bug about that yet.
<jog> the bug for LVM is 1499558 and also discussed in the release notes I linked to above
<wwitzel3> ericsnow: pushed up my status-set changes
<ericsnow> wwitzel3: cool
<natefinch> ericsnow, wwitzel3: back, sorry, the furnace guy was here
<wwitzel3> ericsnow: pushed an update
<ericsnow> wwitzel3: thanks!
<natefinch> wwitzel3, ericsnow: what can I do to help?  Just review?
<katco> natefinch: that's what i'm doing
<wwitzel3> ericsnow: pushed update to unregister as well, added a todo and the checkempty
<katco> wwitzel3: gah... push file renames in separate PR
<wwitzel3> just look at it on GitHub, since it handles it just fine
<wwitzel3> :P
<katco> wwitzel3: doing that actually... doesn't do diff b/t 2 files though
<wwitzel3> odd it should, I used git rename
<katco> wwitzel3: lemme know what i'm doing wrong, but don't see it here: https://github.com/juju/juju/pull/3517/files?diff=split
<wwitzel3> yeah, that is weird, since the commit history is file renamed without change .. then my changes
<natefinch> man, git sucks, bzr and launchpad handle this just fine
<wwitzel3> katco: yeah, that is odd, it is confused about something .. you can just look at the commit here https://github.com/wwitzel3/juju/commit/06ef4664272c5110fe1198ac778a28fba7e73ee4?diff=split
<wwitzel3> katco: i broke it up in to a rename commit and a change commit
 * natefinch shakes off the possession by wallyworld
<wwitzel3> so that this wouldn't happen .. lol
<wwitzel3> and it happened anyway
<katco> wwitzel3: much better
<katco> wwitzel3: +1 to change
<mgz> ericsnow: sorry, should have claimed that mattyw yaml review, he put it up incomplete so I could resolve a conundrum for him
<ericsnow> mgz: np :)
<mgz> I just found a mo though and have the fix
<mattyw> ericsnow, thanks anyway, sorry I should have said something in the review
<ericsnow> mattyw: my question about the bugs vs. yaml.v2 still stand though :)
<mgz> mattyw: see my review for answer
<mgz> ericsnow: we think we can remove most of the weird workaround stuff, only part I'm not sure on is the gccgo behaviour
<ericsnow> mgz: gccgo just keeps on giving
<mgz> it's just the fun of working on multiple platforms in general
<mgz> you get twice the number of platform bugs
<natefinch> wwitzel3: a few very minor review comments: http://reviews.vapour.ws/r/2901/
<mattyw> ericsnow, mgz thanks guys, I'm not actually reading any of the comments, I'm saving it as a surprise
<ericsnow> mattyw: lol
<mgz> mattyw: a breakfast excitement
<natefinch> ericsnow: that status list formatter thing takes a compatVersion, but doesn't use it..... is that correct?  and if so, why does it need that value?
<ericsnow> natefinch: I followed the lead of status on that one
<natefinch> ericsnow: of taking in the value and not using it? :/  :)
<ericsnow> natefinch: haha
<wwitzel3> woohoo, new keyboard is here
<wwitzel3> my wrists are free again!
<katco> wwitzel3: i expect 50.02% more productivity
<wwitzel3> lol, sure ;)
<natefinch> wwitzel3: sweet... my code keyboard is coming tomorrow.  Can't wait.
<wwitzel3> natefinch: I looked at those but I didn't see an adjustable version
<natefinch> wwitzel3: adjustable how?
<wwitzel3> natefinch: width and height and tilt
<natefinch> wwitzel3: yeah, it's not super adjustable... I think it can be tilted, but at the end of the day, it's not really any more adjustable than the rectangular keyboard that you can get for $10, it's true.
<wwitzel3> natefinch: yeah, I don't know how to type on a regular keyboard anymore
<natefinch> wwitzel3: haha
<natefinch> wwitzel3: I have a friend who uses the Kinesis... I imagine that must be even worse
<wwitzel3> natefinch: these last two days on the laptop keyboard have been awful .. anytime we have sprints and people watch me type, I can only imagine what people think
<wwitzel3> the one I just got is the Goldtouch V2, which is adjustable 0-30 degrees on the horiz and vert
<wwitzel3> it doesn't have a num pad, but I have an external one floating around somewhere
<natefinch> haha.. yeah, for me, it's the touchpad that is horrible. I can type on the laptop ok, but man.... the touchpad makes me feel like I'm 110 years old
<katco> ericsnow: wwitzel3: we have a testable branch yet?
<wwitzel3> katco: status-set is landing right now, not sure where list is at
<ericsnow> katco: the only thing I have left is to address a few more comments from natefinch
<katco> wwitzel3: ericsnow: awesome!
<aisrael> Who's familiar with jes? I'm trying to use it for a spike, but the devel docs aren't helping me. `juju system create-environment test` throws 'error: no system specified'
<mup> Bug #1506649 opened: On juju upgrade the security group lost ports for the exposed services <juju-core:New> <https://launchpad.net/bugs/1506649>
<mup> Bug #1284982 changed: juju destroy-environment exits with strange error <juju-core:Fix Released by thumper> <https://launchpad.net/bugs/1284982>
<fwereade> aisrael, I should be asleep, but thumper or menn0 should be able to help
<mup> Bug #1506666 opened: juju bootstrap fails in virtualbox <juju-core:New> <https://launchpad.net/bugs/1506666>
<menn0> aisrael: howdy
<menn0> aisrael: let me remind myself how this hangs together
<menn0> aisrael: which juju version are you using?
<aisrael> menn0: 1.25-beta1. I'm chatting with thumper about it now, though, but thanks!
<menn0> aisrael: ok, you're in good hands then
<mgz> probably broken lxc package
<mgz> see if update gets you a new one.
<menn0> mgz: hey hey
<menn0> mgz: do you know if there's likely to be another 1.24 release?
#juju-dev 2015-10-16
<mup> Bug #1506680 opened: `juju environments` fails due to missing ~/.juju/current-environment <juju-core:New> <https://launchpad.net/bugs/1506680>
<davechen1y> thumper: final version.Current PR is up for review
<thumper> wallyworld: heading out for a quick dog walk before the rain comes in
<wallyworld> ok
<wallyworld> ping when you are free
<thumper> wallyworld: there now
<wallyworld> ok
<natefinch> ericsnow: how goes?
<rogpeppe> mgz: ping
<wallyworld> fwereade: i reverted the change to omit series, could you PTAL http://reviews.vapour.ws/r/2888/
<fwereade> wallyworld, cheers
<fwereade> wallyworld, LGTM, thanks
<wallyworld> tyvm
<dooferlad> fwereade, jam: hangout?
<voidspace> dimitern: looks like they took a different approach in maas - they didn't go with device_type in the end
<voidspace> dimitern: so that card can be deleted
<voidspace> dimitern: https://bugs.launchpad.net/maas/+bug/1490637
<mup> Bug #1490637: Devices with parents should not show on devices tab <MAAS:Fix Committed by blake-rouse> <https://launchpad.net/bugs/1490637>
<voidspace> dimitern: they just decided to use the presence of a parent instead
<voidspace> dimitern: so I can delete my branch....
<dooferlad> TheMue: so, the lock problem.
<dimitern> voidspace, it sounds like a mess that they need to fix :)
<voidspace> dimitern: yeah, weird
<voidspace> dimitern: I've added a comment to the issue and will ping them about it
<TheMue> dooferlad: yes, let's unlock it
 * dooferlad groans
<TheMue> dooferlad: the scope is always only the local node?
<dimitern> voidspace, cheers
<dooferlad> TheMue: yes
<dooferlad> https://github.com/golang/go/issues/2307 is background on the Windows socket behaviour
<TheMue> dooferlad: so surely a state based solution is nice, but not needed?
<dooferlad> TheMue: using state would fix the problem, yes
<dooferlad> TheMue: it is our single source of truth
<TheMue> dooferlad: as we just said, state would help, but not solve it if a process/node dies and the lock stays in state
<dooferlad> TheMue: state would notice that the connection died though, right? That could be an event to unlock.
<dooferlad> TheMue: actually, it looks like the socket code on Windows should only allow one connection now.
<dooferlad> TheMue:
<TheMue> dooferlad: a, you would check, if a connection between a node and state exists? not the data in state in a collection?
<TheMue> dooferlad: I thought the whole time of a lock collection with records containing node, handle, and lease time
<dooferlad> TheMue: you would store the lock data in the collection, but you could validate it by checking whether the thing that created the lock still exists
<TheMue> dooferlad: and processes use their node and handle to check if the lease time has passed or not
<dooferlad> TheMue: but I don't want to do anything lock related in state; adding more logic to state isn't worth it unless we have a good future use case.
<dooferlad> And we don't have a lease time. There is no need.
<TheMue> dooferlad: how would you validate if the creator still exists?
<dooferlad> We have connections to state open to update state from machines.
<dooferlad> We just see if the connection that created the lock is still there
<dooferlad> but, it involves a network, so no, don't do that
<TheMue> dooferlad: many distributed solutions I've looked at use lease times. we could create this lock as a util type, it will find its usage ;)
<dooferlad> Doing this right over a network turns into a distributed consensus algorithm
<dooferlad> at which point we have implemented etcd to replace a file lock
<dooferlad> so lets not
<dooferlad> On Linux we could use /tmp to ensure our locks don't exist across reboots (yay!) and on Windows we can use the file attributes  FILE_ATTRIBUTE_TEMPORARY and FILE_FLAG_DELETE_ON_CLOSE (see https://en.wikipedia.org/wiki/List_of_RAM_drive_software#Native) to achieve about the same thing
<dooferlad> that solves all my problems.
<dooferlad> apart from delete_on_close doesn't look right
<dooferlad> arg
<dooferlad> https://github.com/natefinch/npipe maybe is the best way to go on Windows
<TheMue> dooferlad: hmmm, ok, not yet really sure
<dooferlad> TheMue: maybe I just move the file locks to /tmp on Linux and leave the rest. It would be enough to get the MAAS and EC2 tests passing and since we don't do container init on Windows (or run those tests) then problem solved.
<dooferlad> TheMue: the need for the lock then goes away once we are down to 1 juju process and we all celebrate.
<dooferlad> TheMue: (well, when that happens)
 * dooferlad needs a cup of tea. Back in 5
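A minimal sketch of the /tmp approach dooferlad settles on: acquire an advisory lock by exclusively creating a file under the system temp dir, which on typical Linux setups is cleared at boot, so stale locks do not survive a reboot. `tryLock` and the lock name are hypothetical; this is not juju's actual lock implementation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// tryLock acquires an advisory lock by exclusively creating a
// file under the system temp dir. O_EXCL makes creation fail if
// the file already exists, so only one holder succeeds. Because
// /tmp is typically cleared at boot on Linux, locks do not
// persist across reboots -- the property discussed above.
func tryLock(name string) (release func(), err error) {
	path := filepath.Join(os.TempDir(), name+".lock")
	f, err := os.OpenFile(path, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0600)
	if err != nil {
		return nil, fmt.Errorf("lock %q held: %v", name, err)
	}
	f.Close()
	return func() { os.Remove(path) }, nil
}

func main() {
	release, err := tryLock("container-init")
	if err != nil {
		fmt.Println("could not lock:", err)
		return
	}
	defer release()

	// A second attempt while the lock is held must fail.
	if _, err := tryLock("container-init"); err != nil {
		fmt.Println("second acquire blocked, as expected")
	}
}
```

Note this only coordinates processes on one machine, per the earlier point that anything network-wide turns into a distributed consensus problem; a crashed holder still leaves the file behind until reboot, which is the gap lease times (or the npipe approach on Windows) would close.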
<fwereade> mattyw, tasdomas: as people who've used the dependency engine: I have a use case for injecting an engine into itself, and am having naming troubles
<mattyw> fwereade, ah naming, one of the solved problems
<fwereade> mattyw, tasdomas: currently exposed as InceptionManifold, which is not really accurate, but is a useful indicator of the presence of weirdness
<fwereade> mattyw, tasdomas: a more accurate name would be OuroborosManifold but that's, er, a bit obscure
<fwereade> mattyw, tasdomas: thoughts appreciated :)
<tasdomas> fwereade, RecursiveManifold ?
<mattyw> fwereade, -1 to inception
<mattyw> tasdomas, fwereade, I was just typing exactly that!
<tasdomas> fwereade, I don't really care for names as long as they are not misleading
<tasdomas> fwereade, a rose by any other name..
<fwereade> tasdomas, there's quite a lot of subtlety embedded in "so long as they are not misleading" ;)
<tasdomas> fwereade, well, in my case misleading would be calling it SimpleManifold
<fwereade> tasdomas, it *is* actually pretty simple ;p
<fwereade> tasdomas, (but point taken all the same)
<tasdomas> fwereade, there's no such thing as a simple manifold
<tasdomas> ;-]
 * fwereade has a sad now
<fwereade> ;p
<fwereade> hm, maybe it would be best to just have an `Engine.InstallSelf(name string) error` method? and unexport the manifold itself except for tests?
<fwereade> ...but then the InstallSelf tests end up yucky and arms-lengthy in a bad way
<fwereade> SelfManifold?
<fwereade> mattyw, tasdomas: ^^
<mattyw> fwereade, self is fine
<mattyw> fwereade, but document it
<fwereade> mattyw, cheers
<fwereade> mattyw, http://reviews.vapour.ws/r/2916/diff/# -- adequately documented?
<mattyw> fwereade, a proper review will come later
<mattyw> fwereade, but for now the documentation seems fine
<mattyw> fwereade, and given the actual implementation i really like SelfManifold as a name
<fwereade> mattyw, cool, thanks :D
<fwereade> mattyw, don't feel obliged to do a full review, the implementation remains unchanged from when thumper LGTMed it
<mattyw> fwereade, ok cool, good enough for me
<fwereade> mattyw, tyvm
<mattyw> fwereade, I'm having one of those days where I'm chasing shiny things instead of doing the stuff I'm supposed to
<fwereade> mattyw, haha, sorry to contribute to that
<mattyw> fwereade, no problem - shiny == awesome
<mattyw> mgz, are you around?
<mgz> mattyw: hey
<mattyw> mgz, just pushing an update to the yaml branch
<mattyw> mgz, it looks like we can get rid of that weirdness
<mattyw> mgz, but I'm not full of confidence as I still don't really understand why it was there
<mgz> mattyw: jw4's comment is bug 1436871
<mup> Bug #1436871: ppc64 gccgo fails to build cmd/juju <ci> <ppc64el> <regression> <juju-core:Fix Released by johnweldon4> <https://launchpad.net/bugs/1436871>
<mgz> if it builds with both go and the version of yaml we're using, and gccgo, there's nothing left to worry about
<mattyw> mgz, can I test that somehow?
<mgz> yeh, or I can. install gccgo on trusty and build with -compiler gccgo
<mattyw> mgz, it's here if you want to try it
<mattyw> mgz, I'll see if I'm setup to do it as well
<mgz> seems to build
<mgz> of course, looking at the bug it did for me last time too, but I *think* this was a short-lived bug in the packages
<mgz> we'll find out for certain when it gets merged anyway
<mattyw> mgz, maybe I could channel rogpeppe
 * rogpeppe 's ghost appears
<rogpeppe> mattyw: that bug resulted in some quite weird things happening
<rogpeppe> mattyw: best to definitively check that it works properly before renaming those types
<mgz> spooky!
<mattyw> rogpeppe, how would I check that?
<mattyw> rogpeppe, are you the ghost of bugs yet to come?
<rogpeppe> mattyw: you'd write a little piece of test code
<mattyw> rogpeppe, test code - that's an odd concept
<rogpeppe> mattyw: more like the ghost of your fetid past
<rogpeppe> mattyw: i'm not thinking of an actual test
<rogpeppe> mattyw: just a standalone program to check current behaviour
<rogpeppe> mattyw: tell you what, i'll do one if you want
<mattyw> rogpeppe, make sure you send me the code, and show me how to run it
<mattyw> rogpeppe, it's the only way I'll learn
<rogpeppe> mattyw: ironically it appears that the yaml bug was actually fixed 2 hours before i committed that juju comment
<mup> Bug #1506865 opened: Failed to retry tools download after EOF <bootstrap> <reliability> <retry> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1506865>
<mup> Bug #1506869 opened: TestNewServerDoesNotAccessState api connection failure <i386> <test-failure> <unit-tests> <windows> <juju-core:Incomplete> <juju-core chicago-cubs:Triaged> <https://launchpad.net/bugs/1506869>
<rogpeppe> mattyw: here's my test code
<rogpeppe> mattyw: http://paste.ubuntu.com/12798941/
<mattyw> rogpeppe, thanks very much
<rogpeppe> mattyw: the yaml commit that fixed the bug was 4d6bb54d8acc91e147763cea066cff0b89437e90
<mattyw> rogpeppe, you're welcome to review the pr if you want
<mattyw> rogpeppe, http://reviews.vapour.ws/r/2912/
<rogpeppe> mattyw: this was the important piece of the diff: http://paste.ubuntu.com/12798949/
<rogpeppe> mattyw: have you run the status tests on that branch?
<mattyw> rogpeppe, I have
<rogpeppe> mattyw: hmm, weird, i wonder why you don't get an infinite loop
<mattyw> rogpeppe, I've not run cmd/juju/... yet - in hangouts
<rogpeppe> mattyw: try running the status tests in cmd/juju/status
<mattyw> rogpeppe, I've run all of those
<rogpeppe> mattyw: interesting. I think that exposes a bug in goyaml actually
<rogpeppe> mattyw: what would you expect this to print? http://paste.ubuntu.com/12799013/
<mattyw> rogpeppe, shall I read it and guess or you want me to run it?
<mattyw> rogpeppe, (in our planning meeting now so sort of distracted I'm afraid, many apologies)
<rogpeppe> mattyw: first guess, then run
<mattyw> rogpeppe, I guessed and ran
<mattyw> rogpeppe, and now I'm confused
<rogpeppe> mattyw: the answer it gave is the reason your change doesn't cause infinite recursion
<rogpeppe> mattyw: https://github.com/go-yaml/yaml/issues/134
<rogpeppe> mattyw: reviewed http://reviews.vapour.ws/r/2912/
<mattyw> rogpeppe, thanks, see martins previous comment regarding omitempty
<rogpeppe> mattyw: ah, ok i see about the omitempty. i guess it's best not to mess with the output.
<mup> Bug #1506881 opened: 1.22 client cannot talk to chicago-cubs server <api> <regression> <juju-core:Incomplete> <juju-core chicago-cubs:Triaged> <https://launchpad.net/bugs/1506881>
<mattyw> rogpeppe, mgz and again http://reviews.vapour.ws/r/2912
<mattyw> rogpeppe, I'll be back in 10 though
<dimitern> voidspace, ping
<dimitern> voidspace, if you're still around, you might be interested in what I came up with for the juju-br0 script
<dimitern> voidspace, it turned out not quite as simple as I imagined (~200 lines of bash), but handles properly the IPv4/IPv6/both cases as well as aliases, and it's more resilient towards failures (reverts changes back so you can ssh later and diagnose)
<dimitern> voidspace, anyway - here it is:  http://paste.ubuntu.com/12799841/ (still testing a few cases though)
 * dimitern really wants at this point to do all these steps in Go and inject a simple binary that does all and can be easily tested in isolation
<dooferlad> dimitern: I am amazed we don't insist on Python being installed for these things :-|
<dimitern> dooferlad, it's about time :)
<dimitern> dooferlad, the issue in this specific case is the script is run very early, before packages are installed (well, we could change the cloud images to include python always - it might be so already in fact)
<dooferlad> dimitern: indeed. I would check that.
<dimitern> dooferlad, yeah - good idea - check a pristine trusty amd64 image deployed from maas as well as "lxc-create -t ubuntu-cloud" - as lxc images might be different (same applies to KVM I guess, but I suspect they use the same images as maas)
 * dimitern pats himself on the back :D in all those ~200 lines of bash (in a go string) - so far only 2 syntax issues (apart from a panic from unmatched "}" in the template)
<mgz> mattyw: you still seem to conflict with my (landed) change that scrubs subnets
<mattyw> mgz, let me wave my rebase wand
<dooferlad> mattyw: you can get a wand for that? Ohh.
<mattyw> mgz, should be better now
<mattyw> dooferlad, wand didn't work, had to use keyboard
<dooferlad> mattyw: boo
<mattyw> tough crowd
<mgz> mattyw: shipit
<mgz> if it blows up on ppc we'll find out and can add back the hack
<mgz> yak sack
<mup> Bug # changed: 1335885, 1373516, 1435283, 1465844, 1478156, 1489477, 1492066, 1493123, 1494070, 1494542, 1495338, 1496639, 1496750, 1497094, 1498746, 1500613
<voidspace> dimitern: pong
<voidspace> dimitern: wow
<natefinch> sweet, just got my new code keyboard.  Gonna take a little getting used to, but the difference from my old keyboard is crazy.
<katco`> natefinch: i expect 50.02% more productivity from you.
<natefinch> katco`: of course ;)
<perrito666> katco`: which part of the .el for emacs is the one that makes the editor remind me that exported functions should have docs?
<katco`> perrito666: lol dunno if golint does that or not
<natefinch> golint does
<perrito666> katco`: something is doing it in the conf you passed to me :p I am just trying to understand lol
<katco`> perrito666: flycheck does things on the fly, so it's probably just running golint every time the buffer changes or something
<katco`> perrito666: i'm sure it has some kind of smarter algo, but that's the gist of it
<natefinch> katco`: is there anything to do to get the payloads stuff in?  ericsnow says it merges just fine... do we do a PR, or what?  There's no process written down in the wiki.
<katco`> natefinch: the CI test to bless it is currently 50% complete
<natefinch> katco`: sweet
<katco`> natefinch: then i think we just propose a merge against 1.25 and do the merge
<natefinch> wow, glad I got the "quiet" mechanical keyboard.  Geez.
<perrito666> natefinch: lol, I thought you had one of those before
<natefinch> perrito666: nope.  My last keyboard was expensive only because it was wireless and solar powered, not due to the quality of the typing experience.
<perrito666> solar powered, really? what a strange feature for a kb
<natefinch> perrito666: never need to charge or change batteries
<jcastro> natefinch: I have a Code!
<perrito666> natefinch: my kb never sees the light of day
<perrito666> the window is behind two monitors
<natefinch> jcastro: yeah, just got mine, love it.  the lights are so perty
<perrito666> natefinch: pic
<natefinch> perrito666: http://imgur.com/c7H33c0
 * natefinch carefully crops out his messy desk
<perrito666> sure, all of us that have very neat desks will be incredibly offended by it :p
<perrito666> that is one sexy keyboard
<perrito666> I want it
<natefinch> it is wicked sexy... didn't even realize it had media keys until I saw them in that picture ;)
<mup> Bug #1506994 opened: deploy unit can fail after creating a machine but before assigning unit to it <juju-core:New> <https://launchpad.net/bugs/1506994>
<natefinch> keyboard backlight has adjustable brightness, too, which is nice - it goes pretty bright.
<mup> Bug # changed: 1463053, 1463922, 1469844, 1471657, 1471936, 1487191, 1499573
#juju-dev 2015-10-17
<katco`> ox
#juju-dev 2015-10-18
<mup> Bug #1507345 opened: add-machine for centos aborts in cloud-init at firewalld  <add-machine> <centos> <cloud-init> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1507345>
<mup> Bug #1507359 opened: MAAS 1.9 image names and Juju streams disagree <add-machine> <centos> <maas-provider> <streams> <juju-core:Triaged> <juju-core 1.24:Triaged> <MAAS:New> <https://launchpad.net/bugs/1507359>
#juju-dev 2016-10-17
<veebers> wallyworld: you have a moment? Would like help deciphering an error message I'm seeing in the failed tests.
<wallyworld> ok
<veebers> wallyworld: I'm seeing:"cannot add unit to application "memcached": getting instance types: listing VM sizes: azure.ServicePrincipalToken#Refresh:" for 3 tests, is it possible that it's the azure side that's failed or juju?
<veebers> wallyworld: the logs are at:
<veebers> http://reports.vapour.ws/releases/4490/job/azure-arm-deploy-bundle-lxd/attempt/687
<veebers> well, that's one of them at least
<wallyworld> not too sure about azure, i'll look
<veebers> wallyworld: thank you
<wallyworld> veebers: the error in the logs seems to be x509: certificate signed by unknown authority
<wallyworld> not sure off hand the root cause; azure deploys seem to work in general
<veebers> wallyworld: ok thanks, I'll keep looking
<wallyworld> veebers: andrew would know probably but he's out till thursday. seems strange though it's just one test out of many. there may be some difference in test set up or something
<anastasiamac> veebers: what u r describing may be related to an existing bug 1629759?...
<mup> Bug #1629759: azure subscriptions api throws x509 error "certificate signed by unknown authority" <arm64> <bootstrap> <ci> <deploy> <juju:Triaged> <https://launchpad.net/bugs/1629759>
<veebers> anastasiamac: awesome thanks, I'll check that bug is similar and file the issue away etc.
<anastasiamac> veebers: \o/
<veebers> wallyworld: ack, I'll file or +1 the bug at this point
<mup> Bug #1633554 changed: juju ssh uses old/invalid known_hosts data <juju-core:Invalid> <https://launchpad.net/bugs/1633554>
<anastasiamac> thumper: or anyone: trivial https://github.com/juju/juju/pull/6457
<thumper> lgtm
<anastasiamac> \o/
<anastasiamac> wallyworld: PPTAL https://github.com/juju/juju/pull/6458 ? should b trivial :D
<wallyworld> ok
<anastasiamac> wallyworld: and another :D https://github.com/juju/juju/pull/6459
<wallyworld> what bug is that?
<anastasiamac> bug 1633607
<mup> Bug #1633607: juju grant help text puts examples before valid inputs <juju:In Progress by anastasia-macmood> <https://launchpad.net/bugs/1633607>
<anastasiamac> i'll add it to the pr.. sorry.. it's in the name but not in prose :D
<wallyworld> ah, i just didn't see it
<frobware> axw: ping
<hoenir> could anyone take a look on this PR https://github.com/juju/juju/pull/6414 ?
<babbageclunk> menn0: ping?
<menn0> babbageclunk: pong
<dooferlad> frobware / dimitern / voidspace / babbageclunk: https://github.com/juju/juju/pull/6460 is short and has a nice big positive impact.
<babbageclunk> menn0: hey, just going through the redirect after migration task
<dooferlad> review please!
 * dooferlad goes for coffee
<voidspace> dooferlad: looking
<dooferlad> voidspace: thanks
<dimitern> dooferlad: looking
<frobware> dooferlad: does this work with LXD on Y?
<dimitern> btw I still need a review on https://github.com/juju/juju/pull/6454 please << dooferlad, voidspace, babbageclunk, frobware
<frobware> dooferlad: or, put another way, is the domain name still the same in LXD 2.4?
<voidspace> dooferlad: LGTM
<babbageclunk> menn0: So basically Login just needs to return a CodeRedirect if there's a successful migration, and RedirectInfo returns the addresses and cert from the TargetInfo?
<voidspace> dimitern: I will look at it in a bit
<menn0> babbageclunk: that's what I was thinking
<babbageclunk> menn0: Cool - seems pretty straightforward!
<menn0> babbageclunk: at a hand-wavey level anyway
<dimitern> voidspace: cheers!
<menn0> babbageclunk: yep, hopefully. I was very glad when I saw rog doing the initial work for this and what he did was very similar to what I was thinking
<menn0> babbageclunk: I assigned you a card for this on a a-team board btw
<babbageclunk> menn0: Yeah, saw an email about that - I'll move it to in-progress then
<menn0> babbageclunk: sweet
<frobware> dimitern: taking a look
<frobware> dimitern: personally I think selecting the fastest based on duration is a step too far. why? it so depends on what the machines are doing _at_ that _moment_ in time. And that may be totally different when you come to initiate the SSH connection.
<frobware> dimitern: as a first pass do we not just care for something that is listening?
<dimitern> frobware: we do care for those we can connect to, and if one takes >100ms while another takes <10ms, we should pick the latter
<frobware> dimitern: this is all noise, IMO.
<frobware> dimitern: the numbers are noise 100 vs 10?
<dimitern> frobware: sorry, I'm not sure I follow :/
<frobware> dimitern: if you open and stare at a document for 5 mins, would it matter that it took 100ms or 10ms to open? If you initiate and do stuff over an SSH connection for 5 mins, would it make any difference if the initial connection took 100ms or 10ms? IMO, nope.
<dimitern> frobware: ok, I see that, but I'm not sure why do you think picking the lowest-latency endpoint is wrong
<frobware> dimitern: it's code that we don't need. to test. to document. or grok.
<dimitern> frobware: what do you suggest instead?
<frobware> dimitern: probe for a listening connection. if it is "open" just record that fact. I don't think it matters how long it took. there should/would be an overall timeout on the overall probe anyway.
<frobware> dimitern: in terms of latency, your sample size is 1.
<frobware> dimitern: which is not huge. :)
<dimitern> frobware: not necessarily - you could have multiple equally good endpoints
<dimitern> frobware: but then it shouldn't matter I guess
<frobware> dimitern: I think less is more in this case. all we want to know is if the addr/port is reachable.
<dimitern> frobware: so you're saying "quit trying as soon as you connect successfully" ?
<frobware> dimitern: if there was more than one answer would we connect on more than one addr:port? If not, then isn't the first enough?
<frobware> dimitern: that was the approach I was playing with. http://178.62.20.154/~aim/portscan.go
<dimitern> frobware: the way it's currently implemented, I think we'll only connect to one endpoint at a time
<frobware> dimitern: right. so first open port would be good enough. true?
<dimitern> frobware: but otherwise, with a goroutine-per-hostport it's possible to connect to more than one endpoint
<dimitern> frobware: yeah
<frobware> dimitern: but would we do that as part of bootstrap?
<frobware> dimitern: previously we would blindly connect to the first addr:port. Now we at least verify there's something on the other end.
<frobware> dimitern: my take on commits. It should be so obviously simple that I would be nuts not to accept it...
<dimitern> frobware: ok, fair enough - will change that and push
<frobware> dimitern: appreciated. ty
<babbageclunk> oops, missed this discussion, I might have reviewed an obsolete PR. Hopefully there are still some useful comments in it dimitern!
<voidspace> dimitern: ping
<dimitern> babbageclunk: thanks! I'm changing it to simplify the logic
<dimitern> voidspace: pong
<babbageclunk> dimitern: :)
<voidspace> dimitern: do you know how to get juju bootstrap to use ssh (particularly a specific ssh key)?
<voidspace> dimitern: I have the juju-ci ssh key and can ssh to the machine I want access to
<voidspace> dimitern: but I can't bootstrap
<dimitern> voidspace: that's a manual bootstrap?
<voidspace> dimitern: is it?
<dooferlad> frobware, dimitern: re-ping https://github.com/juju/juju/pull/6460
<voidspace> dimitern: no, I just want "juju bootstrap vsphere vsphere" to work
<voidspace> dimitern: I have the vsphere credentials and configuration
<dimitern> voidspace: ah, ok
<dooferlad> frobware: I haven't tried a public cloud. I put some notes in about detecting the correct domain in a follow up.
<voidspace> dimitern: mgz says I should be able to do this just using the juju-ci ssh key, but spelunking through the juju-release-tools to work out the right incantation is painful
<voidspace> dimitern: I can see the code paths that bootstrap, but working out exactly what arguments / environment it is using will take a pen and paper and some time...
<dimitern> voidspace: then I think it's up to ~/.ssh/config - if you have a Host section with IdentityFile and Username, etc. set up correctly and you can do 'ssh <IP>' OK (without any args)
<voidspace> dimitern: and I thought you might "just know"...
<frobware> dooferlad: yep, looking
<voidspace> dimitern: I have that setup
<frobware> dooferlad: I think we need to try at least AWS
<voidspace> dimitern: so it *should* "just work" if my ssh config is correct?
<voidspace> dimitern: I wonder if I have some conflicting rules
<dimitern> voidspace: I expect so
<dimitern> voidspace: as juju's ssh during bootstrap tries effectively `ssh <IP>:22`
<voidspace> dimitern: I definitely *can* ssh to the IP I want
<voidspace> but the bootstrap times out
<dimitern> voidspace: without specifying username and key?
<voidspace> dimitern: correct
<dimitern> voidspace: hmm - well, try bootstrap --debug ?
<voidspace> dimitern: ah, it's trying *my* ssh key - not the juju-ci one
<dimitern> voidspace: is the juju-ci ssh key in ~/.ssh ?
<dimitern> voidspace: it should try all keys it finds (IIRC)
<voidspace> it is
<voidspace> I terminated that one when I saw the wrong key - I'll try again
<voidspace> nope, doesn't seem to try more than the one key
<mgz> voidspace: if you just go to the box, su to jenkins
<mgz> and do the steps as written in the job
<mgz> that's all you need
<mgz> so, set JUJU_HOME=~jenkins/cloud-city and so on
<mgz> to just repo something on a ci machine, you should do exactly what ci does
<voidspace> mgz: I want to be able to make a custom build - bootstrap and debug - change and repeat
<mgz> you don't need to do your own config unless you're trying to do something else
<voidspace> mgz: so I *really* want to do it from my box, not from the CI box if possible
<mgz> you can just scp a new binary onto the box
<voidspace> mgz: but I think at this point I have to give up, and do that
<voidspace> ok
<voidspace> mgz: dimitern: thanks
<voidspace> frustrating, but such is life
<voidspace> mgz: by the way, deploy_job.py has no reason to exist (juju-ci-tools) - just add that main stanza to deploy_stack.py
<mgz> yeah, there's a bunch of minor things that could be cleaned up with the older scripts
<voidspace> mgz: ping
<mgz> voidspace: hey
<frobware> dooferlad: LGTM as long as we have a sanity bootstrap test for AWS and MAAS. As you mention, should be a change to the LXD provider only.
<frobware> dooferlad: I do wonder whether we should query for the domain as it is the LXD provider.
<frobware> dooferlad: otherwise at some point this will bite us
<frobware> dooferlad: so, a little contradictory to our call. :/
<mup> Bug #1420996 changed: Default secgroup reset periodically to allow 0.0.0.0/0 for 22, 17070, 37017 <canonical-is> <juju-core:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1420996>
<mup> Bug #1559062 changed: ensure-availability not safe from adding 7+ nodes, can't remove stale ones <ensure-availability> <juju-core:Won't Fix> <https://launchpad.net/bugs/1559062>
<hackedbellini> hi guys! I'm trying to bootstrap a lxd environment but I'm having a problem. It seems that it bootstraps fine, the lxd container gets online, but at some point it tries to get a client, but it sends the request to my gateway machine and not the machine that I'm running juju itself
<hackedbellini> here are the logs: http://pastebin.ubuntu.com/23326644/
<hackedbellini> this is my /etc/network/interfaces http://pastebin.ubuntu.com/23326622/ and this is my /etc/default/lxd-bridge http://pastebin.ubuntu.com/23326614/
<hackedbellini> as you can see, the machine that I'm trying to bootstrap has an ip of 192.168.99.3. There's another machine on the network, the gateway, that has 192.168.99.4, but it has nothing to do with juju. In fact, the only place that I put its ip was on /etc/network/interfaces file. So I'm very confused on why it is trying to do that request on that machine
<hackedbellini> am I doing something wrong? I saw the same problem reported in this comment, but I don't think it is related to the bug it was reported: https://bugs.launchpad.net/juju/+bug/1547268/comments/19
<mup> Bug #1547268: Can't bootstrap environment after latest lxd upgrade   <2.0-count> <juju:Triaged by rharding> <https://launchpad.net/bugs/1547268>
<mup> Bug # changed: 1379882, 1383492, 1384350, 1384353, 1384359, 1384363, 1384565, 1384572, 1384725, 1384803, 1385098, 1387751, 1388983, 1389656, 1390525, 1391387, 1391966, 1392340, 1393879, 1393932, 1396159, 1399310, 1399322, 1399604, 1399606, 1400193, 1400207, 1403032, 1403674, 1408307, 1412720,
<mup> 1413653, 1414021, 1414890, 1416406, 1418325, 1419715, 1423278, 1423548, 1424837, 1425137, 1425680, 1429864, 1433244, 1435543, 1435999, 1438258, 1438721, 1438798, 1441749, 1442541, 1444537, 1445053, 1447390, 1449617, 1451616, 1452393, 1452490, 1454627, 1454661, 1455284, 1455646, 1457895, 1458447,
<mup> 1458452, 1459082, 1459148, 1459761, 1461358, 1461534, 1461889, 1461954, 1466272, 1466951, 1467590, 1468187, 1468855, 1474386, 1474892, 1475341, 1475425, 1475641, 1477709, 1480942, 1483987, 1484718, 1484833, 1485017, 1485145, 1490947, 1490977, 1491451, 1492396, 1498518, 1498575, 1498859, 1499332,
<mup> 1503039, 1503449, 1508392, 1508595, 1512191, 1516077, 1524089, 1536838, 1557302, 1568654, 1593855, 1611379
<rogpeppe1> This fixes an intermittent test failure in juju; reviews appreciated, thanks: https://github.com/juju/juju/pull/6461
<mup> Bug # changed: 1282025, 1282211, 1284167, 1284734, 1285082, 1285422, 1285685, 1287662, 1287665, 1287669, 1288302, 1288900, 1288950, 1290654, 1290727, 1292344, 1293686, 1295682, 1296515, 1297094, 1298120, 1298141, 1299027, 1300637
<frobware> hackedbellini: what's in your LXD profile? Run: lxd profile show default
<mup> Bug # changed: 833064, 862418, 945862, 998238, 1009687, 1015637, 1026422, 1029976, 1037027, 1037045, 1043076, 1050245, 1057650, 1057652, 1083558, 1086670, 1089297, 1089298, 1089304, 1096840, 1097018, 1100076, 1122135, 1122889, 1129218, 1129219, 1130771, 1131409, 1137902, 1144355, 1149889, 1156654,
<mup> 1161919, 1163983, 1164601, 1168754, 1169773, 1170419, 1172811, 1174190, 1176740, 1185143, 1191418, 1194869, 1197365, 1197372, 1199888, 1201117, 1202175, 1204851, 1206759, 1208292, 1209044, 1209155, 1209313, 1210035, 1210076, 1210449, 1210484, 1211498, 1212177, 1212936, 1214178, 1214949, 1214952,
<mup> 1214954, 1214957, 1214959, 1214961, 1214967, 1214968, 1215052, 1215252, 1215402, 1215777, 1217282, 1217508, 1217742, 1217768, 1217860, 1218167, 1218176, 1218834, 1219630, 1220260, 1220705, 1220816, 1221834, 1223325, 1224057, 1224368, 1224456, 1224492, 1224515, 1226460, 1226652, 1226786, 1227142,
<mup> 1227586, 1228241, 1228311, 1229199, 1229506, 1229883, 1230053, 1230131, 1230289, 1230370, 1230702, 1231526, 1231551, 1233371, 1233938, 1234217, 1234687, 1234715, 1237304, 1237518, 1238938, 1239368, 1239509, 1239908, 1240461, 1242237, 1243811, 1244760, 1245649, 1247688, 1248839, 1250104, 1250115,
<mup> 1250153, 1250965, 1251118, 1251697, 1252781, 1253056, 1253704, 1254237, 1255502, 1256053, 1256852, 1257758, 1257975, 1258116, 1258644, 1258889, 1259490, 1261324, 1262175, 1266476, 1266729, 1267387, 1267518, 1267541, 1268550, 1269014, 1270543, 1274455, 1277048, 1277116, 1277139, 1278734, 1279986,
<mup> 1300882, 1301352, 1302119, 1302561, 1303455, 1303942, 1304863, 1305385, 1307101, 1308088, 1308491, 1309434, 1309449, 1310453, 1311976, 1312068, 1312173, 1312201, 1313793, 1314754, 1316174, 1316593, 1316602, 1317917, 1319346, 1320080, 1322705, 1322747, 1324904, 1328151, 1329483, 1331694, 1336542,
<mup> 1340483, 1340836, 1341264, 1345848, 1347994, 1358227, 1359573, 1364847, 1364866, 1365623, 1366027, 1366793, 1368362, 1368981, 1372566, 1372759, 1374159, 1445093, 1445658, 1446159, 1446168, 1447235, 1449613, 1449633, 1450729, 1453297, 1453890, 1455089, 1455368, 1455445, 1455703, 1455840, 1456258,
<mup> 1456398, 1456703, 1456728, 1456857, 1457089, 1458758, 1459327, 1459610, 1461339, 1461605, 1461961, 1462097, 1462423, 1462874, 1464235, 1467873, 1467964, 1468359, 1468637, 1496188, 1496221, 1496975, 1497653, 1497788, 1498511, 1498577, 1499400, 1499570, 1501084, 1501093, 1501203, 1502016, 1505309,
<mup> 1505435, 1505460, 1506865, 1506866, 1506994, 1509099, 1510333, 1511543, 1511944, 1514616, 1516698, 1517391, 1517428, 1519081, 1519848, 1521017, 1523837, 1524021, 1527595, 1530992, 1533790, 1534289, 1537586, 1537731, 1540580, 1546492, 1546795, 1550033, 1554060, 1554807, 1564018, 1565196, 1567179,
<mup> 1568161, 1572022, 1583409, 1585750, 1590958, 1599129, 1620830, 1620832
<frobware> hackedbellini: also, using the defaults I'm surprised to see the .3 for LXD_IPV4_ADDR="192.168.99.3". This would normally be .1
<mup> Bug # changed: 1391941, 1392229, 1392379, 1392407, 1392684, 1392810, 1392876, 1393452, 1393883, 1393892, 1394668, 1395900, 1396096, 1396474, 1396862, 1397171, 1398055, 1399303, 1399506, 1399722, 1399730, 1468639, 1468756, 1469318, 1469731, 1470345, 1471138, 1471237, 1474411, 1475386, 1475635,
<mup> 1477281, 1477712, 1478706, 1478762, 1478934, 1478936, 1479278, 1479653, 1481366, 1482876, 1482939, 1483525, 1483932, 1484177, 1484930, 1489217, 1489484, 1490665, 1491608, 1492232, 1492598, 1494782, 1494831, 1495952, 1496127
<perrito666> wow
<hackedbellini> frobware: lxd profile? How can I see it?
<frobware> hackedbellini: $ lxd profile show default
<frobware> hackedbellini: lxc profile show default
<frobware> hackedbellini: not lxd, lxc
<hackedbellini> frobware: ahh ok, here:
<hackedbellini> https://www.irccloud.com/pastebin/eVgPQz7R/
<frobware> hackedbellini: looks fine
<hackedbellini> and the .3 is because this machine should have this ip. We have 3 machines here, one that serves nfs, one that provides dhcp and acts as the network firewall (the .4) and this one that runs our services. We were using juju 1.x in it for some time with this bridge configuration without any problems
<frobware> hackedbellini: did you run `sudo dpkg-reconfigure lxd` and accept the defaults. Note: you must not have IPv6 enabled in your lxd-bridge setup.
<hackedbellini> frobware: what do you mean by defaults?
<frobware> hackedbellini: if you run `sudo dpkg-reconfigure lxd` it will ask you questions about your IPv4 (and 6) network setup
<frobware> hackedbellini: in general, you can just accept the defaults with the proviso you should not accept any IPv6 setup
<hackedbellini> frobware: yeah, but the defaults means "no bridge"? Because I need the bridge on this machine
<frobware> hackedbellini: ah, good point. because you're bridged on br0. (the .3)
<hackedbellini> frobware: yeah. But answering your question, I did not accept IPv6
<frobware> hackedbellini: what files are in /etc/network/interfaces.d/
<hackedbellini> frobware: nothing, just /etc/network/interfaces (http://pastebin.ubuntu.com/23326622/)
<frobware> hackedbellini: and the output of `ifconfig -a` and `ip route` ?
<hackedbellini> https://www.irccloud.com/pastebin/DFGS6yFn/
<hackedbellini> frobware: ^
<frobware> hackedbellini: thanks. going to try and reproduce this.
<hackedbellini> frobware: thank you so much! Please tell me if you need more info and/or want me to test anything else
<frobware> hackedbellini: one more thing: sudo lsof -i:8443
<hackedbellini> https://www.irccloud.com/pastebin/eE1VSqxN/
<mup> Bug # changed: 1332049, 1332221, 1332545, 1333496, 1334390, 1336313, 1336353, 1337318, 1337804, 1340077, 1340133, 1340184, 1340749, 1342738, 1343318, 1343569, 1345440, 1345541, 1349908, 1350008, 1351426, 1353239, 1353482, 1353571, 1354039, 1355216, 1356806, 1356857, 1358376, 1358474, 1359925,
<mup> 1360607, 1361365, 1361723, 1361759, 1363183, 1364013, 1365604, 1365621, 1365633, 1391941, 1392229, 1392379, 1392407, 1392684, 1392810, 1392876, 1393452, 1393883, 1393892, 1394668, 1395900, 1396096, 1396474, 1396862, 1397171, 1398055, 1399303, 1399506, 1399722, 1399730, 1400559, 1400782, 1401568,
<mup> 1402763, 1403955, 1408191, 1408467, 1408472, 1408848, 1409104, 1409381, 1409746, 1409806, 1409856, 1412020, 1412917, 1413052, 1413424, 1414027, 1414710, 1417874, 1418608, 1419864, 1421262, 1423364, 1423626, 1424901, 1425245, 1425506, 1425930, 1426127, 1426217, 1426458, 1426730, 1426940, 1427510,
<mup> 1427770, 1428893, 1429353, 1430220, 1430839, 1430943, 1431401, 1433589, 1434246, 1436766, 1436961, 1438590, 1440445, 1441899, 1441915, 1442046, 1445078, 1468639, 1468756, 1469318, 1469731, 1470345, 1471138, 1471237, 1474411, 1475386, 1475635, 1477281, 1477712, 1478706, 1478762, 1478934, 1478936,
<mup> 1479278, 1479653, 1481366, 1482876, 1482939, 1483525, 1483932, 1484177, 1484930, 1489217, 1489484, 1490665, 1491608, 1492232, 1492598, 1494782, 1494831, 1495952, 1496127
<voidspace> mgz: ping
<natefinch> anastasiamac: I blame you for this ^ :)
<anastasiamac> natefinch: u blame me for closing bugs? ;D
<frobware> hackedbellini: can you try the following for me:
<frobware> hackedbellini: juju bootstrap lxd lxd
<anastasiamac> natefinch: believe me, i'd rather not have any (would save me pain of some hate mail)
<frobware> hackedbellini: and in another window
<frobware> hackedbellini: lxc list
<frobware> hackedbellini: lxc exec <juju-name-from-lxc-list> bash
<hackedbellini> frobware: "bootstrap lxc" or "bootstrap localhost"?
<frobware> hackedbellini: cat /etc/network/interfaces.d/50-cloud-init.cfg
<frobware> hackedbellini: doesn't matter - whatever bootstrap command you are already using. I just want to bootstrap and then poke around in the image that lxd/juju creates.
<hackedbellini> https://www.irccloud.com/pastebin/vPi7Gpvg/
<hackedbellini> frobware: this is the result:
<hackedbellini> $ lxc exec juju-147208-0 bash
<hackedbellini> root@juju-147208-0:~# cat /etc/network/interfaces.d/50-cloud-init.cfg
<hackedbellini> # This file is generated from information provided by
<hackedbellini> # the datasource.  Changes to it will not persist across an instance.
<hackedbellini> # To disable cloud-init's network configuration capabilities, write a file
<hackedbellini> # /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
<hackedbellini> # network: {config: disabled}
<hackedbellini> auto lo
<hackedbellini> iface lo inet loopback
<hackedbellini> auto eth0
<hackedbellini> iface eth0 inet dhcp
<hackedbellini> oops, sorry, was going to pastebin it but selected the wrong option
<frobware> hackedbellini: interesting, because mine says 'inet manual'
<frobware> hackedbellini: which explains why my bootstrap will never complete. (it never gets an IP address)
<hackedbellini> frobware: so you are unable to reproduce the issue?
<voidspace> mgz: successfully bootstrapping to vsphere from the jenkins machine
 * frobware wonders how this can be
<voidspace> mgz: using 2.0-beta17 which is the installed version on that box, but still bootstrapping a custom build should be straightforward from here
<frobware> hackedbellini: no. but there is most definitely an issue where we have 'iface eth0 inet manual'
<voidspace> mgz: thanks for your help and patience
<hackedbellini> frobware: is there any workaround I can do here? I need to setup this to get our services up ASAP
<rick_h_> voidspace: katco natefinch mgz ping for standup
<rick_h_> dimitern: ^
<kjackal> hey dimitern, first of all thank you for making sure local provider works! I still have a couple of questions. Can you spare a few minutes?
<voidspace> rick_h_: omw
<frobware> hackedbellini: in a conf. call - will work through this again in about 30 mins
<hackedbellini> frobware: ok, thanks!
<natefinch> rick_h_: btw, show-controller only shows the current user even for admins
<rick_h_> natefinch: ah, ok. That's the gap.
<rick_h_> natefinch: k, let's punt this then please
<frobware> hackedbellini: is your client Mac OS X (sierra) or Linux?
<hackedbellini> frobware: Linux. Ubuntu Xenial to be more specific
<frobware> hackedbellini: thx
<hackedbellini> frobware: going to launch now, brb in about 30 mins
<frobware> hackedbellini: ack
<rick_h_> frobware: dooferlad http://jeffreifman.com/2016/10/01/fix-macos-sierra-upgrade-breaking-ssh-keys/ reads a lot more clearly
<rick_h_> natefinch: can you check that juju creates/uses RSA keys please as a quick card?
<rick_h_> natefinch: looking to work out if that link above is an issue with Juju
<rick_h_> natefinch: frobware dooferlad I'm going to guess we've not been using DSA keys for a long time.
<dimitern> kjackal: hey! can you send me those questions by mail please? too many things to fix today.. sorry :/
<kjackal> yeah, no problem dimitern, I am trying to sort them out; let's hope i will not send you any email :)
<dimitern> kjackal: ;) np, I'll have a look later this evening
<natefinch> rick_h_:  sure
<voidspace> rick_h_: so I can repro the IPv6 bug on vsphere by the way
<voidspace> rick_h_: preparing more logging
<hackedbellini> frobware: I'm back
<frobware> hackedbellini: on the phone still
<hackedbellini> frobware: np :)
<perrito666> lazyPower: this is still a problem? https://bugs.launchpad.net/juju/+bug/1426728 ?
<mup> Bug #1426728: Windows Charms are unable to raise error state <juju-agent> <oil> <oil-2.0> <windows> <juju:Triaged> <https://launchpad.net/bugs/1426728>
<lazyPower> perrito666 tbh i dont know, that issue is quite old now. I haven't deployed a windows charm in some time
 * perrito666 is fishing for a bug to work on
<voidspace> so what I'm seeing is my logging never being hit and therefore machine addresses never being set!
<natefinch> rick_h_: GenerateKey makes a 2048 bit RSA no-passphrase SSH capable key
<natefinch> rick_h_: that's what we use for the bootstrap server
<natefinch> rick_h_: and I verified that the comment matches the implementation ;)
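The key generation natefinch verified can be sketched with the Go standard library alone. This `generateKey` is an illustrative stand-in, not juju's actual helper (which also emits OpenSSH-format public keys); it only shows the 2048-bit, no-passphrase RSA generation step being discussed:

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"fmt"
)

// generateKey produces a 2048-bit RSA private key with no passphrase,
// mirroring what the chat describes. Hypothetical stand-in for juju's helper.
func generateKey() (*rsa.PrivateKey, error) {
	return rsa.GenerateKey(rand.Reader, 2048)
}

func main() {
	key, err := generateKey()
	if err != nil {
		panic(err)
	}
	fmt.Println(key.N.BitLen()) // bit length of the modulus
}
```

Since the key is generated without encryption, no passphrase is ever involved; any passphrase handling would have to be added at PEM-encoding time.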
<rick_h_> natefinch: ty
<natefinch> rick_h_: now what?
<rick_h_> natefinch: look at the interactive add-cloud and where we left off.
<rick_h_> natefinch: maybe chat with katco back on the schema updates, i think she was out during some of that.
<natefinch> rick_h_: ahh, cool, yeah, I'd like to get that done
<lazyPower> perrito666 - is this expected behavior when upgrading from rc-3 to 2.0.0? http://paste.ubuntu.com/23339291/
<lazyPower> i don't think its a problem, just double checking that i haven't managed to botch my controller by upgrading from rc's to 2.0 proper
<lazyPower> seems fine, disregard the question.
<frobware> hackedbellini: purged my system of LXD and other stale stuff. Get different result now.
<frobware> hackedbellini: in fact, I believe I see your error
<frobware> hackedbellini: http://pastebin.ubuntu.com/23339436/
<marcoceppi> natefinch: I'm trying to update juju.fail since 2.0 is out, is 2.1 now "master"?
<hackedbellini> frobware: yeah it looks like my error!
<natefinch> marcoceppi: hasn't been updated
<katco> marcoceppi: https://github.com/juju/juju/blob/master/version/version.go#L22
<katco> marcoceppi: https://github.com/juju/juju/blob/staging/version/version.go#L22
<natefinch> marcoceppi: oh.. hmm
<katco> marcoceppi: i think staging is what you want to track
<natefinch> yeah
<natefinch> just figured that out
<katco> marcoceppi: master will remain last released version so that we can do hotfixes
<natefinch> marcoceppi: master lags behind staging, staging is what master used to be
<marcoceppi> so I should track staging for blocking bugs?
<marcoceppi> or is juju.fail even needed anymore?
<frobware> OMG
<natefinch> not sure.... rick_h_ ? ^
<natefinch> I think we must still have blocking bugs at some point... but we PR into development, so if anything is blocked, that's what would get blocked
<frobware> hackedbellini: I think I may have found out why.
<katco> marcoceppi: i'm not sure tbh. i think so bc we can once again have a concept of "blocking" for the staging branch; i.e. changes to develop can't land into staging if it's not blessed
<marcoceppi> gotchya, I'll try to sync up with the qa/release team on how blocking bugs are decided so I can update the site
<natefinch> New workflow is branch from staging, PR into development.  if/when development passes CI, it's merged into staging.  releases are cut from staging to master
<hackedbellini> frobware: really? Why? Is there a fix/workaround?
<frobware> hackedbellini: so I'm now back to no address on second bootstrap. still investigating.
<perrito666> Brb lunch
<frobware> hackedbellini: OK, I see the problem now. https://github.com/juju/juju/blob/staging/provider/lxd/environ_raw.go#L155
<frobware> hackedbellini: on your bridged configuration this is going to be whatever your DNS deems the gateway (IIRC, .4 in your case). there is no LXD listening there.
<frobware> hackedbellini: the only unsatisfactory solution I have right now is reconfigure and use a NAT'd lxdbr0. :(
<frobware> rick_h_: ^^
<rick_h_> frobware: :/ can we get the list of addresses that lxd is listening on and error/note when that's not legit?
<frobware> rick_h_: I think it's more fundamentally broken than that. in the case where you do not use a NAT'd lxd bridge we need to understand the network config and not return the gateway address (presumably associated with the DHCP lease).
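The failure mode frobware describes is that the provider derives an address (the DHCP gateway on a bridged setup) and assumes a local LXD is listening there, when that address actually belongs to a different machine. A defensive check would be to verify the candidate address is bound to one of the host's own interfaces first. This `isLocalAddress` is a hypothetical helper for illustration only, not the actual `provider/lxd/environ_raw.go` code:

```go
package main

import (
	"fmt"
	"net"
)

// isLocalAddress reports whether addr is bound to one of this host's own
// network interfaces. Hypothetical helper: if the derived LXD endpoint fails
// this check (e.g. it is the LAN gateway), no local LXD can be listening there.
func isLocalAddress(addr string) bool {
	addrs, err := net.InterfaceAddrs()
	if err != nil {
		return false
	}
	for _, a := range addrs {
		if ipnet, ok := a.(*net.IPNet); ok && ipnet.IP.String() == addr {
			return true
		}
	}
	return false
}

func main() {
	// Loopback is bound locally on any normally configured host.
	fmt.Println(isLocalAddress("127.0.0.1"))
}
```

On hackedbellini's setup the derived gateway (the .4) would fail this check, which is exactly the "no LXD listening there" symptom above.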
<hackedbellini> frobware: hrm that is sad :(. What do you mean by "NAT'd lxdbr0"?
<rick_h_> frobware: is this a case where we can say it's not supported? I mean if you don't have a NAT'd lxd bridge doesn't that have impact on getting agents/etc?
<rick_h_> frobware: or am I mis-understanding things
<rick_h_> frobware: oh nvm, non-nat'd as in dhcp so it's on the main netwokr
<frobware> hackedbellini: if you run dpkg-reconfigure you can say, "my bridge is lxdbr0", and when asked to configure IPv4 subnet, say "yes". This will detect an unused subnet and your containers will typically end up on 10.x.x.x/24.
<frobware> rick_h_: correct, it's on your local LAN
<frobware> rick_h_: this mode (use my hosts network bridge) is akin to the addressable containers we have on MAAS. The containers (nicely) end up on your local LAN.
<rick_h_> frobware: right, gotcha. why does this not go boom on maas? because they're on a maas provider host vs being the provider?
<frobware> rick_h_: yep
<frobware> rick_h_: juju/provider/lxd/environ_raw.go
<frobware> rick_h_: this code was changed recently
<frobware> rick_h_: https://github.com/juju/juju/pull/6078
<frobware> rick_h_: though it's not really clear whether it made it better/worse. I suspect many people don't use this setup by default. Clearly we don't in CI.
<dimitern> frobware: ping
<frobware> dimitern: need to EOD. quick?
<dimitern> frobware: I think I did all we discussed on https://github.com/juju/juju/pull/6454
<dimitern> frobware: if you can +1 it, I'll set it to land
<dimitern> frobware: can wait till tomorrow ofc
<frobware> dimitern: I would like to cast a fresh eye in the morning. ok?
<dimitern> frobware: sure, np
<hackedbellini> frobware: I see. But I won't be able to access my containers on my lan, right?
<frobware> hackedbellini: not directly
<frobware> :(
<hackedbellini> frobware: can I at least change it later after the bootstrap?
<hackedbellini> if not this is a _huge_ problem for me
<frobware> hackedbellini: thinking... you could add another NIC to your profile which DHCP's off your LAN...
<hackedbellini> frobware: hrmm, how can I do that?
<frobware> hackedbellini: let me futz for 5 mins or so... :)
<hackedbellini> frobware: k =P
<frobware> hackedbellini: how much manual hackery could you put up with?
<hackedbellini> frobware: as much as needed, as long as this works :)
<perrito666> lazyPower: you are not upgrading to 2.0 but to whatever your local working copy of the repo is seems to me
<frobware> hackedbellini: my /etc/default/lxd-bridge - http://pastebin.ubuntu.com/23339686/
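The pastebin content is not preserved in this log. For reference, a typical NAT'd `/etc/default/lxd-bridge` as autogenerated by `dpkg-reconfigure lxd` in this era looked roughly like the fragment below; the addresses here are illustrative examples, not frobware's actual values:

```shell
# /etc/default/lxd-bridge (illustrative reconstruction; values are examples)
USE_LXD_BRIDGE="true"
LXD_BRIDGE="lxdbr0"
LXD_IPV4_ADDR="10.0.8.1"
LXD_IPV4_NETMASK="255.255.255.0"
LXD_IPV4_NETWORK="10.0.8.0/24"
LXD_IPV4_DHCP_RANGE="10.0.8.2,10.0.8.254"
LXD_IPV4_DHCP_MAX="252"
LXD_IPV4_NAT="true"
LXD_IPV6_PROXY="false"
```

The file is sourced as shell variables by the lxd-bridge service; `LXD_IPV4_NAT="true"` is the part that makes this a NAT'd bridge rather than one on the LAN.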
<hackedbellini> frobware: you want me to try to use it?
<frobware> hackedbellini: I don't have any quick solution atm
<frobware> hackedbellini: I was about to paste some more stuff, but it won't work with juju
<hackedbellini> frobware: I see. Let me explain how things are setup here so maybe you can suggest an alternative solution for me:
<hackedbellini> That gateway pc that provides dhcp (the .4) has apache running on it. All accesses to our services will go into it first, and apache will do a reverse proxy to the service in the container
<hackedbellini> but the containers are running inside a second machine, the .3, the one that I'm trying to bootstrap juju
<hackedbellini> so, that is the reason that the containers need to have access to my lan
<frobware> hackedbellini: the containers are doing DHCP, correct?
<hackedbellini> or more specifically, my lan has to have access to the containers
<hackedbellini> yeah
<cory_fu> I'm hitting a strange issue with the 2.0 GA release and lxd.  xenial instances come up fine, but trusty units get stuck in 'waiting for machine' even though the lxd image is up and running.  If I try to 'juju ssh' to the machine number or unit name, it says that the keys for that unit are not found, if that's useful.  I don't see anything regarding that unit in the debug-log
<cory_fu> This only started after upgrading to the GA
<cory_fu> Could someone please offer some advice?
<rick_h_> cory_fu: can you check for anything interesting in the lxc logs?
<rick_h_> cory_fu: do they have network interfaces that can talk to the controller?
<frobware> hackedbellini: so this may work...
<frobware> hackedbellini: are you still about?
<hackedbellini> frobware: so lets try it :)
<frobware> hackedbellini: http://pastebin.ubuntu.com/23339861/ - a new LXD profile
<cory_fu> rick_h_: They do have network interfaces and networking seems to be working from what I can tell
<cory_fu> rick_h_: Any particular lxc log I should be looking in?
<frobware> hackedbellini: lxc profile create fix-up
<frobware> hackedbellini: use that content, ^^ substituting your bridge name (IIRC, br0)
<rick_h_> perrito666: going to be a min late
<perrito666> rick_h_: no worries
<perrito666> It would feel very weird if one of my meetings started on time these days
<rick_h_> perrito666: it's getting them to end on time that's hard :P
<hackedbellini> frobware: and then I try to bootstrap using that profile?
<frobware> hackedbellini: not quite. a little more involved unfortunately.
<frobware> hackedbellini: you need to use the lxd-bridge config I gave you earlier
<frobware> hackedbellini: sudo service stop lxd
<frobware> hackedbellini: sudo service lxd stop <--- CORRECTION
<frobware> hackedbellini: sudo service lxd-bridge stop
<frobware> hackedbellini: copy in my /etc/default/lxd-bridge
<frobware> hackedbellini: there's no magic here - this one was autogenerated by dpkg-reconfigure lxd
<frobware> hackedbellini: then start the lxd and lxd-bridge services
<frobware> hackedbellini: kill any juju bootstraps before you do
<hackedbellini> frobware: done!
<frobware> hackedbellini: lxc profile list <-- does that list our fix-up profile?
<hackedbellini> yep!
<frobware> hackedbellini: ok, I'll repeat my steps and cut+paste them
<frobware> hackedbellini: juju bootstrap lxd lxd
<frobware> hackedbellini: that will first use the internal 10.67.x/24 network
<frobware> hackedbellini: juju switch controller
<cory_fu> rick_h_: Actually, while the trusty image has eth0 and external networking appears to work, it doesn't have lxdbr0
<frobware> hackedbellini: juju ssh status
<hackedbellini> frobware: juju ssh status doesn't resolve. I imagine you mean just "juju status"?
<rick_h_> perrito666: omw
<hackedbellini> https://www.irccloud.com/pastebin/Ae6uBE82/
<frobware> hackedbellini: yay
<frobware> hackedbellini: and, juju ssh 0 works?
<hackedbellini> frobware: it works!
<frobware> hackedbellini: we're now going to apply the profile and copy a new .cfg for eth1 into the container.
<frobware> hackedbellini: what's in your: lxc profile show fix-up
<hackedbellini> frobware: here:
<hackedbellini> https://www.irccloud.com/pastebin/BCLY2tWs/
<frobware> hackedbellini: create this on your host/desktop: http://pastebin.ubuntu.com/23339928/
<frobware> hackedbellini: run `lxc list` to find the juju container name
<frobware> hackedbellini: lxc profile apply juju-2c7439-0 default,juju-controller,fix-up
<frobware> hackedbellini: juju switch controller
<frobware> hackedbellini: juju scp ~/lan.cfg 0:
<frobware> hackedbellini: juju ssh 0
<frobware> hackedbellini: sudo cp lan.cfg /etc/network/interfaces.d/
<frobware> hackedbellini: sudo reboot
<frobware> hackedbellini: wait 10s. then `lxc list`. You should see your LAN turn up.
<frobware> hackedbellini: http://pastebin.ubuntu.com/23339951/
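The `lan.cfg` frobware pasted is not preserved in this log. A plausible minimal reconstruction, assuming the extra NIC added by the fix-up profile shows up as `eth1` inside the container, would be:

```
# /etc/network/interfaces.d/lan.cfg (hypothetical reconstruction; the actual
# pastebin content is not preserved here)
# Bring up the profile's extra NIC via DHCP from the LAN.
auto eth1
iface eth1 inet dhcp
```

Dropped into `/etc/network/interfaces.d/` as in the steps above, this would make the container request a LAN lease on its second interface at boot, which is why the LAN address "turns up" in `lxc list` after the reboot.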
<hackedbellini> frobware: it works! :)
<frobware> hackedbellini: Huzzah! \o/
<hackedbellini> frobware: so, for now, I need to do that for each container that I need external access, right?
 * frobware says, don't try this at home folks... :)
<frobware> hackedbellini: yes. though the controller is arguably special.
<hackedbellini> frobware: hahaha yeah! When do you think I'll be able to use my lxd-bridge config directly?
<frobware> hackedbellini: if you do `lxc profile list` you'll see there's another profile juju-default'
<frobware> hackedbellini: but they look identical to me
<hackedbellini> hrmm interesting
<frobware> hackedbellini: are you now in a state you can make progress?
<hackedbellini> frobware: yeah finally! Thank you very much for this
<frobware> hackedbellini: I will raise this as a bug - no immediate answer as for when it will be fixed, but 2.0.1 IMO
<hackedbellini> frobware: great! Please, if you can, ping me with the bug link when you open it so I can subscribe to it
<frobware> rick_h_: a devious workaround ^^ for wanting to use your own bridge with the LXD provider.
 * frobware needs to EOD for real now. :-D
<hackedbellini> frobware: one last question. I'm migrating some stuff from the old juju installation (1.x) to this new machine. The problem is that the machine failed very hard, but I have a backup of it. Where can I find the charm configurations that I was using?
<redir> great work frobware
<natefinch> man, I wish we were using 1.7 with the standard testing framework... the new subtests are wicked slick.
<redir> natefinch: +1
<natefinch> I bet one could revamp gocheck to use subtests
<redir> pretty easy review https://github.com/juju/juju/pull/6462 PTAL
<perrito666> natefinch: the issue would be to find that one :p
<perrito666> redir: looking
<natefinch> perrito666: I *was* very careful to use a non-specific pronoun there :)
<redir> tx perrito666
<perrito666> redir lgtm now you need the tests to pass
<redir> perrito666: tx. hopefully the QA checks out;)
 * redir lunches
<alexisb> wallyworld, ping, when you come online
<wallyworld> alexisb: hey
<perrito666> alexisb: ?
<alexisb> wallyworld, perrito666, hey sorry guys, will have to wait when I get back
<perrito666> cliffhanger
<mup> Bug #1634289 opened: new AWS region: us-east-2 <juju-core:New> <https://launchpad.net/bugs/1634289>
<thumper> poo
<perrito666> very mature
<thumper> wallyworld: seems that the blobstore changed an index between b7 and now
<anastasiamac> thumper: "juju wait" plugin is not part of our codebase... do u know whose it is?
<thumper> no, but lazyPower probably does
<thumper> hmm...
<wallyworld> thumper: it could have done but i don't recall
<thumper> well... fuck
<thumper> both b7 and master use the same hash for juju/blobstore
<thumper> but I'm getting this:
<thumper> index with name: files_id_1_n_1 already exists with different options
<thumper> gopkg.in/juju/blobstore.v2/gridfs.go:69: failed to flush data
<thumper> gopkg.in/juju/blobstore.v2/managedstorage.go:259: cannot add resource "buckets/1b69fe07-8ce5-40b2-8e08-24eaf5c91e41/tools/2.0.0-xenial-amd64-3ce84f9af0e163f5188d1355dc2dc9fb43a1e992d58dc8af874b98bcccc3c0da" to store at storage path "a76d021e-505a-468a-878f-0ce3a4e9e918"
<thumper> index with name
<thumper> hmm...
<wallyworld> thumper: i was not aware of anything but i know folks poked around a little in that area, but am not across why and what
<thumper> I'm following things down the rabbit hole into mgo
<thumper> it is possible that the gridfs impl there changed
<thumper> wallyworld: clucking bell
<thumper> gridfs file index changed
<thumper> between b7 and rc versions of mgo
<thumper> and it seems it has no way to migrate...
<thumper> the code just says "Ensure Index", and has the unique flag set
<thumper> old version didn't have unique
<thumper> and the result is: sorry
<thumper> already exists with other options
<thumper> so ensure index fails
<thumper> write file to gridfs fails
<thumper> bugger
<wallyworld> thumper: what about deleting the index
<thumper> I may have to manually delete the index
<thumper> yes
<thumper> just investigating
<wallyworld> and then letting it be recreated
<thumper> seems a bit fucked up though
<thumper> that the mgo gridfs doesn't manage its own upgrades
<thumper> something to be aware of
<thumper> makes me wonder what other time bombs we have hidden
<mup> Bug #1634289 changed: new AWS region: us-east-2 <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1634289>
<perrito666> alexisb: ping me when/if you are back
<redir> oh look a bless on develop
<anastasiamac> redir: \o/
<alexisb> perrito666, pong
<perrito666> alexisb: pong
<perrito666> what did you need?
<perrito666> (the ping part was so the computer would make a noise)
<alexisb> I wanted to check in, do you have a few minutes to meet?
<perrito666> alexisb: certainly
<alexisb> perrito666, we can meet in the morning if that is better, if you are eod
<perrito666> pick the communication form
<perrito666> alexisb: I am never eod
<alexisb> lol
<alexisb> that is not good
<alexisb> perrito666, will meet you in our 1x1
<wallyworld> anastasiamac: a small one https://github.com/juju/juju/pull/6463
#juju-dev 2016-10-18
<anastasiamac> wallyworld: looking :D
<anastasiamac> wallyworld: there is nothing to do with regards to pricelist and stuff that axw did before holidays?..
<wallyworld> anastasiamac: otp, sec
<wallyworld> anastasiamac: that pricelist stuff is orthogonal
<wallyworld> you can bootstrap just fine
<anastasiamac> wallyworld: lgtm'ed
<wallyworld> ty
<mup> Bug #1634289 opened: new AWS region: us-east-2 <juju:In Progress by wallyworld> <juju-core:Triaged by alexis-bruemmer> <https://launchpad.net/bugs/1634289>
<anastasiamac> wallyworld: i seem to recall u had a cunning plan to specify what machine gets picked from maas based on some user desire... where is it at?
<thumper> time to take the dog for a wander
<anastasiamac> alexisb: if u have a sec, I'd love ur advice on something
<wallyworld> anastasiamac: sorry, was otp. we have had the ability to use maas agent as a placement directive. that's been there for a while. not sure if that's what you mean
<anastasiamac> wallyworld: https://bugs.launchpad.net/juju-core/+bug/1345440
<mup> Bug #1345440: add-machine does not check for duplicates <add-machine> <maas-provider> <juju-core:Won't Fix> <https://launchpad.net/bugs/1345440>
<wallyworld> anastasiamac: right, that's what I think the agent name placement directive is used for
<wallyworld> but i'm not 100%
<wallyworld> would need to look at the code
<anastasiamac> wallyworld: i know it's an old bug, but if u could comment on it with what you know currently, i'd really really appreciate it
<anastasiamac> (like coffee-worthy appreciate \o/)
<wallyworld> ok, i'll need to do it later today after i look at the maas code
<anastasiamac> \o/
<anastasiamac> i guess, i'll pay coffee in credit then ;D
<lazyPower> anastasiamac juju wait was owned by stub iirc
<alexisb> anastasiamac, fyi, I gave wallyworld a task that may prevent him from having time to update lp 1345440 today
<anastasiamac> lazyPower: tyvm \o/
<anastasiamac> alexisb: ack
<thumper> damn
<thumper> can't deploy anything from the charmstore with beta 7 as I get unknown channels
<thumper> and juju panics
 * thumper sighs
<veebers> thumper: I think beta13 is the oldest that can be used to deploy from charmstore (I hit this a little while back :-|)
 * thumper grabs marcoceppi's ubuntu charm directly from github
 * thumper hopes it deploys locally into beta 7
<thumper> well... something is happening
<balloons> anastasiamac, did you see my comments about juju-1-switch
<anastasiamac> yes, u r awesome: marked as duplicate
<anastasiamac> balloons: i was also told that u have a plan \o/
<balloons> anastasiamac, it seems like juju-1-switch should have worked, or...
<balloons> anastasiamac, the update-alternatives issue can be solved, but I'm afraid the juju-1-switch command still won't work
<dimitern> frobware: ping
<frobware> dimitern: pong
<frobware> dimitern: your PR is on my list
<dimitern> frobware: I've forked your kvm-maas repo and I'm close to sending a PR your way to handle multiple networks
<frobware> dimitern: I have a PR pending for that too. :)
<dimitern> frobware: yeah, a final look on my PR will be good at some point ;)
<dimitern> frobware: nice ;) we could integrate the approach later I guess?
<dimitern> frobware: there it is - even works :) https://github.com/frobware/kvm-maas/pull/1
<frobware> dimitern: \o/ ok, entering review mode now. first up is your net port prober.
<dimitern> frobware: cheers
<frobware> dimitern: ah, actually need to raise a bug first. :/
<dimitern> np :)
<frobware> dimitern: reviewed. just a few comments.
<dimitern> frobware: tyvm!
<dimitern> frobware: updated to use conn.Close() instead of defer conn.Close()
<frobware> dimitern: close early, close often. :)
<dimitern> frobware: yeah - originally I had it like this because the results chan was unbuffered, now it doesn't matter
<dimitern> ow ffs! maas 1.9 DNS can resolve only its nodes' hostnames, not its own hostname :/
<dimitern> frobware: oops sorry - I didn't see your "outdated" comments
<dimitern> frobware: compromise? ReachableHostPort() ?
<dimitern> or even SelectReachableHostPort
<frobware> dimitern: yep, prefer Reachable.
<frobware> dimitern: first of your alternatives.
<dimitern> frobware: ok, pushing in a moment.
<frobware> dimitern: fewer words. select.. what? ;)
<perrito666> morning
<dimitern> frobware: :)
<dooferlad> https://github.com/juju/juju/pull/6465 <-- dimitern, frobware, voidspace instead of messing with hostnames, just reverting an earlier change. Seems like something else fixed the container DNS issue.
<dimitern> dooferlad: LGTM
<voidspace> dimitern: I didn't get to your review yesterday, did you get one?
<dimitern> voidspace: I got one from frobware, but feel free to have a look :)
<voidspace> dimitern: you have the link handy?
<frobware> voidspace: https://github.com/juju/juju/pull/6454
<frobware> dimitern: I was trying your kvm-maas PR - "error: could not determine IP address for PXE network br-enp1s0f1 br-enp1s0f0[0]'"
<frobware> dimitern: need to investigate a bit
<dimitern> frobware: where is that? kvm-maas-add-node ?
<frobware> dimitern: not sure; flipping between too many things, but yes, my first add-node failed.
<dimitern> voidspace: sorry, just pushed the last change related to frobware's review
<voidspace> dimitern: whilst you're feeling talkative...
<voidspace> dimitern: I'm diagnosing issues with vsphere and xenial. Any unit gets an fe80:: address (or similar) - which is machine-local ipv6
<dimitern> frobware: why "br-enp1s0f1" ? I'd expect to see e.g. virbr42 instead (well in my case bridge-name==virt-net-name)
<voidspace> dimitern: and from my logging, as far as I can tell machine addresses are *never* set
<voidspace> dimitern: so juju/vsphere/xenial is unusable
<voidspace> dimitern: do you have suggestions as to my next step in debugging
<voidspace> dimitern: the machines get ipv4 addresses from vsphere
<dimitern> voidspace: fe80:: addresses are link-local ones
<voidspace> dimitern: right, link local
<voidspace> dimitern: I can't ssh to the machine via juju
<dimitern> voidspace: they always exist when ipv6 is enabled - so somewhere we're not filtering them properly
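The filtering dimitern refers to can be sketched with the standard library, since Go's `net.IP` already classifies link-local addresses (`fe80::/10` for IPv6, `169.254.0.0/16` for IPv4). This is an illustrative sketch, not juju's actual address-filtering code:

```go
package main

import (
	"fmt"
	"net"
)

// filterLinkLocal drops link-local unicast/multicast addresses, keeping only
// the ones another machine could actually use to reach the unit. A sketch of
// the filtering discussed above, not juju's real implementation.
func filterLinkLocal(addrs []string) []string {
	var out []string
	for _, a := range addrs {
		ip := net.ParseIP(a)
		if ip == nil || ip.IsLinkLocalUnicast() || ip.IsLinkLocalMulticast() {
			continue
		}
		out = append(out, a)
	}
	return out
}

func main() {
	fmt.Println(filterLinkLocal([]string{"fe80::1", "10.0.0.7", "169.254.1.1"}))
}
```

If filtering like this leaves an empty list, the symptom matches what voidspace sees next: the only addresses ever recorded were link-local, i.e. no usable addresses were set at all.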
<frobware> voidspace: is vsphere/trusty usable?
<voidspace> frobware: yes
<frobware> voidspace: last time I looked at this I could only do *anything* on vsphere as long as it was trusty
<voidspace> dimitern: it's not that we're not filtering - we don't have *any other addresses* for the machine in state
<voidspace> frobware: I believe that's still the case
<dimitern> voidspace: ah.. well if that's the case something is quite broken
<voidspace> dimitern: at least that's my current conclusion - adding more logging to confirm
<voidspace> dimitern: yep :-)
<dimitern> voidspace: on the provider side I'd guess
<dimitern> rick_h_: ping?
<rick_h_> dimitern: pong
<dimitern> rick_h_: should we do our 1:1?
<rick_h_> dimitern: I'm there
<dimitern> rick_h_: ok, omw
<voidspace> dimitern: yep, agreed - but it may be a problem with provisioning xenial images on vsphere
<voidspace> dimitern: I'd like to get into the machine to see
<voidspace> dimitern: will try the controller machine as I can contact the state server
<frobware> dimitern: http://paste.ubuntu.com/23343319/
<frobware> dimitern: haven't looked into why
<dimitern> voidspace: let me get back to you a bit later
<voidspace> dimitern: sure
<dimitern> frobware: found it! pushing update to the PR
<dimitern> voidspace: if you can ssh into the controller, what address(es) did you use?
<voidspace> dimitern: just hacking up some more logging, trying again shortly - will let you know
<dimitern> voidspace: ok
<mup> Bug #1324841 changed: Improve isolation in utils/file_test.go <juju-core:Won't Fix> <https://launchpad.net/bugs/1324841>
<mup> Bug #1325837 changed: juju run is updating ~root/.ssh/known_hosts <run> <ssh> <juju-core:Won't Fix> <https://launchpad.net/bugs/1325837>
<frobware> dimitern: closer... http://paste.ubuntu.com/23343396/
<frobware> dimitern: ahhh
<frobware> dimitern: that's because my virt-network is now a bridge. Ho-hum.
<dimitern> frobware: yeah - I use the same names for the bridge and the network; btw pushed another
<frobware> dimitern: different, my virt net work definition is actually a bridge. http://paste.ubuntu.com/23343406/
<frobware> dimitern: for that kind of definition this virt_network_address() will always fail
<dimitern> frobware: that looks almost the same as what I have locally, e.g. http://paste.ubuntu.com/23343418/ (for maas-int19)
<frobware> dimitern: but you have 'ip address=1.2.3.4'
<dimitern> frobware: ah, because it's a bridge without an address
<dimitern> yep
<frobware> dimitern: it's actually the hosts bridge (which does have an address)
<frobware> dimitern: I think that's a separate commit/fix/enhancement. when I first started doing this I only needed libvirt-derived bridges. Now I want them on my various VLANs
<dimitern> frobware: re https://github.com/juju/juju/pull/6454 - good to land?
<frobware> dimitern: yep
<frobware> dimitern: thanks all around
<dimitern> frobware: cheers!
<dimitern> jam: are you around?
<frobware> dimitern: I wonder whether we should just pass the QEMU connection string as an argument.
<dimitern> frobware: there are 2 of those actually - from the host POV and maas's POV
<frobware> dimitern: are they not the same?
<dimitern> frobware: nope
<frobware> dimitern: well, I guess that really depends on your initial MAAS setup. My MAAS setup always connects as 'me' to the host.
<dimitern> frobware: the former can be qemu:///system as a sane default, while the other is likely different, with qemu+ssh://$USER@$PXE_IP/system being a reasonable default, except if it's not :)
<frobware> dimitern: they are essentially always the latter, no?
<frobware> dimitern: or it can just degenerate to the latter
<dimitern> frobware: I used qemu+ssh://maas@$IP/system before, as I had a maas user only the vmaas-es can use to ssh into my laptop
<frobware> dimitern: if MAAS can do power-on/off, then you can use the same string to add/remove-node
<dimitern> frobware: if you're calling virsh locally, qemu:///system is assumed to be what you'd want
<dimitern> frobware: however, overriding it is useful if you're e.g. setting up a remote kvm host
<dimitern> in which case it's likely to be the same inside the vmaas host as well (assuming it's configured ok)
<frobware> dimitern: I never tried that, but I think it is a reasonable thing to assume would just work.
<dimitern> I was thinking of adding a simple check that qemu+ssh://$USER@$PXE_IP/system is reachable from the local host
<frobware> dimitern:  I think I'm going to make it a required arg. A bit sucky, but less magic.
<dimitern> but really we need such a check more importantly for the vmaas host
<frobware> dimitern: that would happen in kvm-maas-host - a new repo to setup MAAS controllers.
<frobware> dimitern: or, I combine them.
<dimitern> frobware: as long as it can be exported once and then used in the same shell - sure
<dimitern> frobware: yeah, I forked that one as well :) nice work so far - haven't tried it though
<frobware> dimitern: wouldn't bother. fundamentally broken until cloud-init is fixed.
<dimitern> frobware: oh, too bad :/
<frobware> dimitern: for reference, it is bug #1576692
<mup> Bug #1576692: fully support package installation in systemd <sts> <verification-done> <cloud-init:Fix Released> <cloud-init (Ubuntu):Fix Released> <init-system-helpers
<mup> (Ubuntu):Fix Released by pitti> <cloud-init (Ubuntu Xenial):Fix Released> <init-system-helpers (Ubuntu Xenial):Fix Released> <https://launchpad.net/bugs/1576692>
<dimitern> frobware: I see though you've thought about allowing lxd-based maas-es to be deployable - nice
<dimitern> frobware: well, I can see cloud-init 0.7.8-1... in xenial-updates now
 * frobware notes that this is now fix-released. 
<dimitern> frobware: so I'll give it a try later today
<frobware> dimitern: the focus is still wrong. We need to drive this from a network spec.
 * frobware lunches
<dimitern> mgz, balloons: how often/when does github-merge-develop-to-staging run?
<mup> Bug # changed: 1325946, 1329256, 1329480, 1329578, 1331691, 1332048, 1365665
<hackedbellini> Hey guys! A local charm is failing on its install hook. I fixed it, updated the charm and marked it as resolved. But in debug-log I see that it is still using the old install hook. How can I force it to use the new code?
<balloons> dimitern, it runs whenever there is a bless on develop
<balloons> dimitern, I believe there was a bless this morningish?
<dimitern> balloons: ah, I see - ok
<dimitern> balloons: I was wondering how a multi-PR fix will work if some PRs land in staging from develop before others
<balloons> dimitern, the ci run should fail if it gets picked up
<balloons> and thus, it shouldn't hit staging until it's all ok
<dimitern> balloons: right.. or if it doesn't fail, the later PR in the pipeline could be based on staging + cherry-picked PRs yet-to-land on staging
<dimitern> s/the later PR/the later PRs/
<balloons> dimitern, the need for rebasing might happen; see the discussion on the mailing list about this
<balloons> but from a job perspective, you understand what will happen now :-)
<dimitern> balloons: yeah, cheers :)
<mup> Bug # changed: 1365675, 1368254, 1369638, 1369900, 1369909
<mup> Bug #1373592 changed: When bootstrapping or deploying dont spec zone <bootstrap> <ec2-provider> <papercut> <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1373592>
<mup> Bug #1373768 changed: Juju doesn't inform users when MAAS is out of nodes <maas> <orange-box> <ui> <juju-core:Fix Released> <https://launchpad.net/bugs/1373768>
<mup> Bug #1375110 changed: "maintenance in progress" detection in the API server needs improving <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1375110>
<voidspace> help for debug-log shows --tail and --no-tail as arguments, yet they don't seem to be defined
<voidspace> hmm, they are on my local box - yet not on the remote one using the same (allegedly) version of juju
<voidspace> gah, my fault
<voidspace> "juju help debug-log --no-tail" doesn't work, perhaps unsurprisingly
<voidspace> dimitern: I see this in the logs: provider addresses: []state.address{state.address{Value:"fe80::1", AddressType:"ipv6", Scope:"public", Origin:"provider", SpaceName:""},
<voidspace> dimitern: so at some point we're getting an address with value fe80::1 come in with a Scope of "public"
<voidspace> dimitern: which is why we're setting it as a public address
<dimitern> voidspace: right!
<dimitern> voidspace: I remember seeing something nasty like using network.NewScopedAddress(..., network.ScopePublic) in the vsphere provider
<voidspace> dimitern: some smoking guns from the logs: http://pastebin.ubuntu.com/23343774/
<dimitern> to fake some address as a public one
<voidspace> dimitern: I will hunt that out
<voidspace> dimitern: ouch
<voidspace> dimitern: yep, environInstance.Addresses makes addresses public
<mup> Bug # changed: 1375268, 1376576, 1380659, 1380989, 1382063, 1382276, 1383260, 1384013, 1384336, 1384348, 1384369
<voidspace> dimitern: provider/vsphere/instance.go:58
<voidspace> dimitern: shall I just change that to always use a derived scope instead of the two explicit scopes?
<voidspace> dimitern: in fact dammit, I'll just try it
<voidspace> dimitern: if Type was a method and we *always* derived it then we wouldn't have this issue
<dimitern> voidspace: let me have a look
<voidspace> dimitern: I've made the change and I'm just scp'ing the binaries up to try it
<voidspace> dimitern: only takes ten minutes
<voidspace> dimitern: although, please look to see if there's any reason why we shouldn't rely on a derived scope there
<rock_> hi. We developed the "cinder-storage driver" charm. We want to install "git", but not as part of the install hook. Someone suggested "layer apt". In my charm folder I created a layer.yaml file as pasted: http://paste.openstack.org/show/586196/
<rock_> when will layer.yaml execute?
<rick_h_> natefinch: ping
<mup> Bug #1384549 changed: Running Juju ensure-availability twice in a row adds extra machines <canonical-bootstack> <canonical-is> <ha> <improvement> <maas-provider> <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1384549>
<mup> Bug #1385277 changed: malformed urls as environment variable values need to be handled better <tech-debt> <juju-core:Won't Fix> <https://launchpad.net/bugs/1385277>
<mup> Bug #1386222 changed: Usability: machine provisioning timeouts <deploy> <scalability> <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1386222>
<rock_> Before the install script, or after it?
<dimitern> voidspace: yeah, I think we should be using NewAddress() instead of NewScopedAddress() there
<voidspace> dimitern: bootstrapping now
<voidspace> dimitern: thanks
<rock_> Could anyone help me with this?
<dimitern> voidspace: it's commendable that whoever implemented the provider tried to convey the public vs. private distinction to juju with the scope, but ...
<natefinch> rick_h_: howdy
<voidspace> dimitern: yep, "but"
<voidspace> dimitern: hah, and with that change no tests fail...
<voidspace> or at least, no vsphere provider tests fail
<rick_h_> natefinch: can you do me a fav please? Can you generate a new fallback-clouds.yaml file with the change ian landed overnight and get it to abentley to test out please?
<dimitern> voidspace: sweet! :)
<natefinch> rick_h_: sure
<rick_h_> natefinch: ty
<voidspace> dimitern: well, either sweet or it was just untested....
<voidspace> dimitern: hopefully they pass because using a derived scope is the right thing anyway...
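Deriving the scope from the address itself, as voidspace and dimitern discuss above, can be sketched like this (type and constant names are illustrative, not juju's actual network package API):

```go
package main

import (
	"fmt"
	"net"
)

// Scope mirrors the idea behind juju's address scopes; the names
// here are illustrative only.
type Scope string

const (
	ScopeUnknown      Scope = "unknown"
	ScopePublic       Scope = "public"
	ScopeCloudLocal   Scope = "local-cloud"
	ScopeLinkLocal    Scope = "link-local"
	ScopeMachineLocal Scope = "local-machine"
)

// deriveScope classifies an address by inspecting the IP itself,
// rather than trusting a hard-coded scope from the provider -- the
// fix discussed above for fe80::1 being reported as "public".
func deriveScope(value string) Scope {
	ip := net.ParseIP(value)
	if ip == nil {
		return ScopeUnknown
	}
	switch {
	case ip.IsLoopback():
		return ScopeMachineLocal
	case ip.IsLinkLocalUnicast() || ip.IsLinkLocalMulticast():
		return ScopeLinkLocal
	case ip.IsPrivate():
		return ScopeCloudLocal
	default:
		return ScopePublic
	}
}

func main() {
	for _, v := range []string{"fe80::1", "10.0.8.25", "8.8.8.8"} {
		fmt.Printf("%s -> %s\n", v, deriveScope(v))
	}
}
```

With this approach a link-local fe80::1 can never be selected as a public address, regardless of what the provider claims.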
<rock_> dimitern: Hi. do you have any idea on my question? please tell me if you have.
<katco> rock_: try over in #juju. they usually discuss the charming side of things much more. marcoceppi or lazyPower may be able to help
<katco> rock_: or cory_fu
<mup> Bug #1384549 opened: Running Juju ensure-availability twice in a row adds extra machines <canonical-bootstack> <canonical-is> <ha> <improvement> <maas-provider> <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1384549>
<mup> Bug #1385277 opened: malformed urls as environment variable values need to be handled better <tech-debt> <juju-core:Won't Fix> <https://launchpad.net/bugs/1385277>
<mup> Bug #1386222 opened: Usability: machine provisioning timeouts <deploy> <scalability> <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1386222>
<rock_> katco: Thanks.
<voidspace> dimitern: all machines started, all have ipv4 addresses, can ssh to them
<katco> rock_: this channel is more for developers of juju itself
<voidspace> rick_h_: found and fixed the vsphere ipv6 bug (need a PR and tests of course - but verified the fix) - with a bit of help from dimitern as usual
<cory_fu> rock_: Hey.  I'm glad to help.
<rick_h_> voidspace: <3
<rick_h_> voidspace: dimitern frobware macgreagoir natefinch mgz ping for standup
<rock_> cory_fu: Hi. We developed the "cinder-storage driver" charm. Our charm depends on GitHub: during execution it fetches the latest files from Git and keeps them on the cinder node. We deploy our charm after deploying the OpenStack setup. So during execution it was giving a "git ERROR" (like git is not there).
<cory_fu> rock_: The apt layer will install that package more or less the first chance it gets.  Generally, this will mean during the install hook, though it can sometimes actually happen even earlier (due to leadership, storage, etc).  Essentially, as soon as the reactive framework is bootstrapped and the apt layer sees that the package has not yet been installed.
<cory_fu> rock_: That also applies to any of your own reactive handlers that don't have any other unsatisfied pre-conditions (e.g., @when decorators)
<cory_fu> rock_: So, what you'll want to do, is ensure that the code that depends on the "git" package being installed has a @when('apt.installed.git') decorator on it.  There's an example of this usage in the apt layer README: https://git.launchpad.net/layer-apt/tree/README.md#n69
<cory_fu> rock_: (That example also uses reactive code to perform the initial package install, but you can just as easily use the layer.yaml option definition if there are no conditions or other prerequisites that must be satisfied *before* installing the git package)
<rock_> cory_fu: Thanks. Yes, I saw this. But I need to install that git package before the install script runs.
<cory_fu> rock_: Right.  So just ensure that your initial "entry point" handler (i.e., the one with minimal or no pre-conditions) does at least have the precondition of @when('apt.installed.git')
<cory_fu> rock_: Let me see if I can find you a more concrete example
<rock_> cory_fu: I am new to this and didn't get that.
<rock_> cory_fu: Better if I explain my requirement to you.
<mup> Bug #1384549 changed: Running Juju ensure-availability twice in a row adds extra machines <canonical-bootstack> <canonical-is> <ha> <improvement> <maas-provider> <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1384549>
<mup> Bug #1385277 changed: malformed urls as environment variable values need to be handled better <tech-debt> <juju-core:Won't Fix> <https://launchpad.net/bugs/1385277>
<mup> Bug #1386222 changed: Usability: machine provisioning timeouts <deploy> <scalability> <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1386222>
<rock_> cory_fu: If we have a juju openstack setup, when we deploy our charm and add a relation to cinder, our charm will go and get our cinder-storage driver files from Git and keep them on the cinder node. But before cloning from git it was giving a "git ERROR". So we need to install the git package, but not as part of the install hook.
<cory_fu> rock_: Hrm.  All of the examples I can find use the apt.queue_install() method in the code, rather than the layer.yaml option, but it's functionally equivalent.  I think the README is probably the best example, in that case, just know that you can substitute the layer.yaml option for lines 77-79 and it will behave the same
<cory_fu> rock_: When using reactive, the idea is to think about the life-cycle less in the terms of hooks, and more in terms of what are the pre-conditions of the block of code (single function / handler) that you're concerned about.  In your case, you have a handler which uses git, and so that block of code needs to be decorated with @when('apt.installed.git') and then it will always be delayed until that dependency is met.
<cory_fu> That will likely still happen during the install hook, though, because it can really happen as soon as the apt package is done being installed.
<cory_fu> Unless it has other pre-requisites, such as depending on the relation, as you mentioned, in which case it needs to have more conditions specified in its @when decorators
<rock_> cory_fu: I developed my charm using shell script.
<cory_fu> rock_: That's fine.  You can use the reactive pattern with bash.  Here is an example, although it doesn't use the apt layer: https://github.com/juju-solutions/layer-openjdk/blob/master/reactive/openjdk.sh
<cory_fu> rock_: The main things to note with that are the "source charms.reactive.sh" at the top, and "reactive_handler_main" at the bottom.
<cory_fu> Otherwise, it's similar to any other reactive example in that you define a set of functions and decorate them with the pre-conditions that are required for each one to be able to run
<mup> Bug # changed: 1260247, 1262750, 1263196, 1267298, 1268917, 1270041, 1270858, 1270896, 1271502, 1271504, 1330473, 1386494, 1386926, 1389303, 1389324, 1389418, 1390284, 1391353
<cory_fu> Those pre-conditions depend on states (flags) that are set either by other handlers in your layer, or by other layers that you depend on.
<rock_> cory_fu: what will layer-git-deploy do?
<rock_> cory_fu: I am a little bit confused. Where to add, and what to add?
<cory_fu> rock_: I had not seen that layer yet.  To be honest, I'm not sure that it is complete.  Perhaps bdx (James Beedy) will chime in?
<rock_> cory_fu: Ok, thanks. Simply: what do I need to add to my charm to install the git package? Sorry, these layer terms are really new to me.
<cory_fu> rock_: If you haven't read through it yet, https://jujucharms.com/docs/stable/developer-layers covers the basics of layers and states to give you a better foundation.
<cory_fu> rock_: As for your specific question, I think that using the "apt: packages: [git]" in your layer.yaml as you mentioned initially, and adding a @when('apt.installed.git') decorator around the code that uses git should be all you need
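The precondition mechanism cory_fu describes can be illustrated with a toy model of flag-gated handlers (the real machinery lives in charms.reactive; the `when`/`dispatch` implementations below are stand-ins written for this sketch):

```python
# Toy model of the reactive pattern: handlers are gated on named
# flags ("states") and only run once every precondition is set.
flags = set()
handlers = []

def when(*preconditions):
    """Register a handler that runs only when all named flags are set."""
    def register(fn):
        handlers.append((set(preconditions), fn))
        return fn
    return register

def dispatch():
    """Run every handler whose preconditions are currently satisfied."""
    ran = []
    for preconditions, fn in handlers:
        if preconditions <= flags:
            fn()
            ran.append(fn.__name__)
    return ran

@when('apt.installed.git')
def clone_driver():
    # Safe to use git here: the flag guarantees the package exists.
    print('cloning cinder-storage driver')

# Nothing runs until the apt layer sets the flag:
assert dispatch() == []
flags.add('apt.installed.git')
assert dispatch() == ['clone_driver']
```

This is why ordering stops mattering: the handler that needs git simply cannot fire until `apt.installed.git` is set, whichever hook the framework happens to be servicing.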
<mup> Bug #1271923 changed: using lxc containers with maas provider always default to series of host service unit <lxc> <maas-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1271923>
<mup> Bug #1273216 changed: unknown --series to add-machine breaks provisioner <bitesize> <juju-core:Fix Released> <https://launchpad.net/bugs/1273216>
<mup> Bug #1274450 changed: VM locale handling <debug-hooks> <ssh> <juju-core:Fix Released> <https://launchpad.net/bugs/1274450>
<cory_fu> rock_: This all presupposes, of course, that you're writing your charm using layers and reactive, which we recommend, but is a very different style than writing traditional charms, in that you don't write the individual hooks directly, just reactive handlers that have preconditions.
<rock_> cory_fu: Oh. Actually I wrote my charm in a traditional way. So I asked in chat only about installing the "git" package. One guy suggested "layer apt", so I am asking about this.
<cory_fu> rock_: If you're writing your charm using the classic approach and creating your hooks yourself, then you can't use the apt layer, have to call apt-get install yourself, and must manage ensuring the ordered execution of your code yourself with the understanding that hooks are inherently life-cycle events and not procedural code paths.  Thus, I can promise that that approach can get difficult quite quickly
<cory_fu> rock_: We recommend writing all new charms using layers and reactive because it makes dealing with these exact coordination issues much, much easier
<cory_fu> rock_: I have to step away for a few minutes for a meeting, I'm afraid.  I will try to continue to respond, but may be slower for a little while
<rock_> cory_fu: Oh. OK, thanks for your help. One final question, please.
<rock_> cory_fu: We already used the classic approach, right, and I will follow this, because we have to deliver this charm to the client quickly. So in the present situation I will use "apt-get install git" in the install hook script. That will be fine, right?
<mup> Bug #1271923 opened: using lxc containers with maas provider always default to series of host service unit <lxc> <maas-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1271923>
<mup> Bug #1273216 opened: unknown --series to add-machine breaks provisioner <bitesize> <juju-core:Fix Released> <https://launchpad.net/bugs/1273216>
<mup> Bug #1274450 opened: VM locale handling <debug-hooks> <ssh> <juju-core:Fix Released> <https://launchpad.net/bugs/1274450>
<natefinch> gah..... why oh why does pastebin.ubuntu.com want me to log in with SSO to download the plaintext of a pastebin that I can see without logging in?  Geez.
<mup> Bug #1271923 changed: using lxc containers with maas provider always default to series of host service unit <lxc> <maas-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1271923>
<mup> Bug #1273216 changed: unknown --series to add-machine breaks provisioner <bitesize> <juju-core:Fix Released> <https://launchpad.net/bugs/1273216>
<mup> Bug #1274450 changed: VM locale handling <debug-hooks> <ssh> <juju-core:Fix Released> <https://launchpad.net/bugs/1274450>
<natefinch> abentley: new fallback-clouds.yaml http://pastebin.com/raw/rnxGWLjB
<abentley> natefinch: Thanks.
<cory_fu> rock_: Back, sorry.  So for the most part, yes.  Using `apt-get install git` in the install hook should be fine, but as I mentioned before, you do need to be aware that there are some conditions under which other hooks might run *before* the install hook, mainly storage-attached and leader-elected, I think.  I am not 100% certain, though, that relation hooks will *never* run before install.  So, you may find that you need to manually implement
<cory_fu> some sort of flag system to manage that.
<cory_fu> rock_: You could do that either with hidden dot files on the unit (I prefer to keep them out of the charm code directory, perhaps one level up, to keep that more clean for upgrades and debugging), or you can install the charmhelpers python library which includes a command-line interface to "unitdata" which makes it easy to manage persistent charm data like that, e.g.: chlp unitdata set foo true
<cory_fu> rock_: At any rate, just keep in mind that, while there are some assertions about when certain hooks will run, there is also a lot of uncertainty, which is inherent in the nature of the cloud.
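The flag-file guard cory_fu suggests for classic charms can be sketched as follows: any hook that needs git calls the guard first, so hook ordering no longer matters. The `FLAG_DIR` location is an assumption (cory_fu suggests keeping it outside the charm code directory), and the real apt-get call is commented out for the sketch:

```shell
#!/bin/sh
# Idempotent install guard for a classic (non-reactive) charm hook.
FLAG_DIR="${FLAG_DIR:-$(mktemp -d)/mycharm}"   # hypothetical location

ensure_git() {
    if [ -f "$FLAG_DIR/git.installed" ]; then
        return 0                 # already done; safe to call from any hook
    fi
    mkdir -p "$FLAG_DIR"
    # apt-get install -y git    # the real hook would install here
    touch "$FLAG_DIR/git.installed"
}

ensure_git
ensure_git                       # second call is a no-op
```

The same bookkeeping could instead live in unitdata via the charmhelpers CLI, as mentioned above, which keeps the state with the unit rather than on the filesystem.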
<dimitern> frobware, voidspace, dooferlad: a couple of small PRs (the latter includes the former): https://github.com/juju/juju/pull/6467 https://github.com/juju/juju/pull/6468 needed to (finally!) fix bug 1616098, please take a look if you can
<mup> Bug #1616098: Juju 2.0 uses random IP for 'PUBLIC-ADDRESS' with MAAS 2.0 <4010> <cpec> <juju:In Progress by dimitern> <https://launchpad.net/bugs/1616098>
<frobware> dimitern: will take a look in a bit. raising the LXD bug from yesterday...
<dimitern> frobware: sure
<rock_> cory_fu: Thank you for your valuable information. I will need to get good knowledge of layered charms.
<frobware> dimitern: reviewed, though it would be easier with distinct PRs. the last one I looked at had code that I had already reviewed.
<dimitern> frobware: I'm open to suggestions on how I should propose 2 related PRs, so they can still depend on each other but do not duplicate the diff
<dimitern> :)
<dimitern> frobware: thanks for the reviews though
<frobware> dimitern: I think propose them separately and land a new PR that has both (already) reviewed.
<frobware> dimitern: or merge one of them once independently reviewed and approved.
<dimitern> frobware: but wouldn't that be awkward to implement?
<dimitern> frobware: I mean, if PR1 changes package1, and PR2 changes package2, which uses package1 ..
<frobware> dimitern: so what I meant was have two PRs. get both reviewed. Once OK on both, land PR2 with PR1 merged.
<dimitern> frobware: there needs to be some artificial shim in package2 that uses a faked up version of package1
<frobware> dimitern: then I would say they are not independent and would benefit from being reviewed as a single PR.  $shrug.
<frobware> dimitern: look at it another way, could either have been independently reverted?
<dimitern> frobware: I guess so
<dimitern> frobware: well, the second one - not as proposed
<dimitern> balloons: ping
<dimitern> mgz, balloons: the windows instance used by github-merge-juju is out of disk space, and the lxd one is somehow borked (websocket.Dial wss://10.0.8.25:17070/model/bc2f61b3-fe8d-416a-8bba-19349301efd5/api: x509: certificate signed by unknown authority)
<balloons> sinzui, ^^
<balloons> Sorry away from computer ATM
<dimitern> np, just for the record :)
<sinzui> dimitern: balloons: a reboot is needed then
<dimitern> sinzui: in such cases, and when the PR is approved, I guess it's fine to still hit $$merge$$, right?
<sinzui> dimitern: yes, wait a moment for me to get the host back up. windows is slow
<dimitern> sinzui: np, I'll check back in an hour or so
<mgz> dimitern: it's fine to $$merge$$ again if the failure made it back to the pr
<dimitern> mgz: righto ;) cheers
<sinzui> balloons: rick_h_ : did someone set up a NEW windows host for the travis tests? We need a dedicated machine for each job. We seem to be seeing a lot of temporary disk full errors on windows over the last week
 * dimitern EODs
<natefinch> FYI, if you're looking for travis-like CI for windows: https://npf.io/2014/07/ci-for-windows-go-packages-with-appveyor/
<balloons> sinzui, send me your thoughts about scaling. I know the LXD tests will not scale well ( nor are they being cleaned up, need some magic for that better solved by running elsewhere perhaps)
<alexisb> perrito666, how is vsphere treating you?
<perrito666> alexisb: nicely, I am reviewing the provider and I believe I have a clue on what is happening on at least one of the bugs and I see voidspace took the other one that larry deemed very important, then the rest will come after
<alexisb> perrito666, how do you feel about the user experience for the provider?
<perrito666> I feel like there's something you did not tell me, because the one I have been provided works like any other juju
<alexisb> redir, ping
<redir> pong
<redir> alexisb:
<alexisb> heya redir, just checking in, any thing you need from me?
<redir> not at the moment
<redir> thanks
<alexisb> k
<redir> I'll reach out to katco shortly
<alexisb> ok, cool if she is pre-occupied let me know :)
<alexisb> and just to verify redir you are picking up the multi series charm bug, correct?
<redir> yes
<alexisb> coolio, thanks
<redir> np
<natefinch> heh nice, without even trying my PR is +1,000 −0
<rick_h_> natefinch: hah
<natefinch> sinzui, mgz: is there a list somewhere of which repos the merge bot handles?
<mgz> you can see either by looking at the github project perms or the jenkins job list
<sinzui> natefinch: yes, the jobs are named after the repo http://juju-ci.vapour.ws/view/Juju%20Ecosystem/
<perrito666> wtf/min increases
 * perrito666 cries in spanish
<natefinch> sinzui: perfect, thanks
<natefinch> sinzui, mgz: can one of you add a merge job for github.com/juju/jsonschema ?
<mgz> what deps does it have?
<natefinch> mgz: like go version, or like packages?
<mgz> packages
<sinzui> natefinch: mgz: no deps at this minute
<natefinch> I can add a dependencies.tsv if that makes your life easier
<sinzui> mgz: export the path to golang1.6, we don't want any other go being used
<mgz> natefinch: that's safest, but if it only have trivial ones the gating script can use one of the other modes instead
<natefinch> mgz: it does import 3rd party repos
<natefinch> mgz: https://github.com/natefinch/jsonschema/blob/0f97e8fd9f30f6c1e8c1756aacea26cd56792547/dependencies.tsv
<natefinch> mgz: one wrinkle - the dependencies.tsv only exists in the PR, not in the root of the repo (this is the first PR to populate the repo)
<mgz> natefinch: that will work fine, thanks
<mgz> sinzui: I'll go ahead and add this?
<sinzui> mgz: please
<mgz> okay, perms right on the branch, job created
<mgz> changing the config on the juju slave now then we're good to go
<mgz> natefinch: all done, you can go ahead and try $$merge$$ on your proposal now
<natefinch> mgz: awesome, thanks
<mgz> natefinch: one error in the dependencies.tsv?
<natefinch> mgz: yeah, my fault, was on a local branch that hasn't landed yet
<natefinch> git commit -a --amend --no-edit  is my BFF
<natefinch> yay for fast tests
<mgz> natefinch: seems to have worked
<natefinch> mgz: yep
<natefinch> mgz: thanks :)
<hackedbellini> hey guys! I deployed a charm here, but had to make some manual changes to it because of an error it was having. Now I'm trying to upgrade it to my local charm, but it is giving me this error:
<hackedbellini> ERROR cannot upgrade application "jenkins-slave-xenial" to charm "local:precise/jenkins-slave-16": cannot change an application's series
<hackedbellini> I'm trying to use the --force-series option but with no luck
<natefinch> hackedbellini: there should be a series: value in the metadata.yaml with a list of supported series, make sure xenial is in that list, or add it if it's not there
<hackedbellini> natefinch: hrm, this is a very old charm. It doesn't have that value. What should I put exactly?
<natefinch> series: ["xenial"]
<natefinch> that should work
<hackedbellini> natefinch: on the root, right? I'll try it right now, thanks
<natefinch> yep
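For reference, the series list natefinch mentions sits at the top level of metadata.yaml, alongside the charm's name and summary (values here just mirror this example):

```yaml
# metadata.yaml (top level of the charm directory)
name: jenkins-slave
summary: Jenkins slave for distributed builds
series:
  - xenial
  - trusty
```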
<hackedbellini> natefinch: I added both "xenial" and "trusty" to the list. After that I got this:
<hackedbellini> ERROR series "precise" not supported by charm, supported series are: xenial,trusty
<hackedbellini> then i added "precise" too, that made me return to the original issue
<hackedbellini> note that the charm is not deployed in a precise machine
<natefinch> weird
<natefinch> it's deployed to xenial, I presume, based on the name?
<hackedbellini> natefinch: yeah! The charm in question is jenkins-slave: https://jujucharms.com/jenkins-slave/
<hackedbellini> I deployed 2, one on trusty and one on xenial
<hackedbellini> but I had to modify the jenkins-slave.deb for xenial to update it to systemd (it was using upstart)
<natefinch> hackedbellini: what's your local series?  i.e. the one you're running deploy from?
<hackedbellini> xenial
<hackedbellini> https://www.irccloud.com/pastebin/FGgsTLGl/
<hackedbellini> natefinch: ^ that is my deploy
<natefinch> so, the store thinks this is a precise charm
<hackedbellini> yeah the store thinks that
<hackedbellini> natefinch: this is how the metadata.yaml of my local charm is looking:
<hackedbellini> https://www.irccloud.com/pastebin/YZjKVFvA/
<natefinch> are you using upgrade-charm --switch?
<hackedbellini> natefinch: I was trying with --path, but --switch gives something very alike. Let me give you some outputs
<natefinch> I think switch is what is supposed to work for what you want to do
<natefinch> "supposed to"
<hackedbellini> natefinch: this is the output when I just have xenial and trusty on series:
<hackedbellini> https://www.irccloud.com/pastebin/DkbF6lW0/
<hackedbellini> and this is when I add precise too
<hackedbellini> https://www.irccloud.com/pastebin/V7DjV3B4/
<hackedbellini> natefinch: do you know at least where I can change a file on the existing charm? Because juju keeps running a hook on it that reinstalls the old deb and I lose my modifications
<hackedbellini> nevermind, think I found it
<hackedbellini> but still want to know how to change the charm to the local one =P
<natefinch> --switch is supposed to work for that.  seems like you're hitting a bug
<hackedbellini> natefinch: haha I'm hitting a lot of bugs in this deployment =P
<natefinch> ahh man, our provider config code is so convoluted
<wallyworld> menn0: thumper: look at the last few lines in func addApplicationOps() in state/application.go.... what do you notice?
 * menn0 looks
<menn0> wallyworld: apart from "svc"?
<wallyworld> the Id value in the txn.Op does not match the docID value in the applicationDoc
<wallyworld> it uses app name for the txn.Op
<wallyworld> and uuid:name for the application doc id
<wallyworld> surely that's an issue?
<menn0> wallyworld: nope
<wallyworld> what am I missing?
<menn0> wallyworld: the multi-model txn layer will sort that out
<wallyworld> oh, so it even handles Id in txn.Op
<menn0> wallyworld: the Id field in the txn.Op gets the uuid: prefix added automatically before the txn is applied
<wallyworld> ok, didn't realise that
<perrito666> why would someone have type mismatches inside the same library?
<wallyworld> that example is different to all the others
<wallyworld> the others use doc id, which has the model uuid applied
<wallyworld> menn0: thanks for clarifying
<menn0> wallyworld: it's because you used to have to add the doc id on manually
<menn0> wallyworld: but then we improved the multi-model layer
<menn0> wallyworld: so there's a mix
<wallyworld> ok, i'm happier now, i didn't know about that improvement
<menn0> wallyworld: for new work, don't add the model uuid prefix yourself
<wallyworld> excellent, ok
<menn0> wallyworld: same goes for Find and FindId calls
<wallyworld> even betta
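What menn0 describes — the multi-model txn layer adding the model-UUID prefix to string ids automatically — can be sketched like this (the `Op` shape and function name are illustrative, not juju's actual state package API):

```go
package main

import (
	"fmt"
	"strings"
)

// Op mimics the shape of an mgo/txn operation (illustrative only).
type Op struct {
	C  string      // collection name
	Id interface{} // document id
}

// ensureModelUUID sketches the multi-model layer's behaviour: string
// ids get the "uuid:" prefix added before the txn is applied, so
// callers can use the bare name (e.g. the application name) and must
// not add the prefix themselves.
func ensureModelUUID(modelUUID string, op Op) Op {
	id, ok := op.Id.(string)
	if !ok {
		return op // non-string ids pass through untouched
	}
	prefix := modelUUID + ":"
	if strings.HasPrefix(id, prefix) {
		return op // already prefixed; don't double up
	}
	op.Id = prefix + id
	return op
}

func main() {
	op := Op{C: "applications", Id: "wordpress"}
	fmt.Println(ensureModelUUID("deadbeef", op).Id)
}
```

This is why the txn.Op in addApplicationOps can safely use the bare application name even though the applicationDoc's _id carries the uuid:name form, and why the same automatic prefixing applies to Find and FindId.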
<thumper> wallyworld: there are many places in the codebase that do things they don't need to do w.r.t. the model-uuid
<thumper> and menn0 did a great job with the multi-model layering
<wallyworld> thumper: agreed. i wasn't criticising, just very confused
<menn0> wallyworld, thumper: I started an effort to standardise all this ages ago but got pulled onto other things
<wallyworld> as always :-)
 * thumper nods
<thumper> I have been trying to clean things up too
<thumper> whenever I drive by
<wallyworld> i'm going to propose the cross model work be merged back into develop and wanted to be sure everything was good in how i did remote applications
<wallyworld> that f*cking branch was sooooooo stale
<alexisb> wallyworld, are you having a fun morning ? ;)
<wallyworld> alexisb: living the dream, wheeeeeee
<wallyworld> alexisb: only just woke up, it was more like a fun late night :-(
<wallyworld> alexisb: can you join the sts call?
<alexisb> possible
 * alexisb runs to pick up her son
<redir> what mongod do I need for 1.25.x?
<redir> 0.6?
<perrito666> 2.4
<redir> :)
<zeestrat> Anyone with a MAAS setup that could take a look at (and perhaps try to reproduce) https://bugs.launchpad.net/juju/+bug/1634390 ? Would be much appreciated.
<mup> Bug #1634390: jujud services not starting after reboot when /var is on separate partition  <juju:New> <https://launchpad.net/bugs/1634390>
<redir> :)
<redir> forgot all about 1.x blowing up the terminal with mongo logs
#juju-dev 2016-10-19
<mup> Bug # changed: 1274460, 1274922, 1276943, 1276976
 * redir eods a little early for M's birthday
<mup> Bug # changed: 1277307, 1277359, 1279093, 1281376, 1281377, 1282731, 1283814
<mup> Bug #1284183 changed: jujuclient.EnvError: <Env Error - Details:  {   u'Error': u'watcher was stopped', u'RequestId': 9, u'Response': {   }} <api> <status> <juju-core:Fix Released> <https://launchpad.net/bugs/1284183>
<mup> Bug #1285115 changed: very slow upgrade-charm run <upgrade-charm> <juju-core:Fix Released> <https://launchpad.net/bugs/1285115>
<mup> Bug # changed: 1286517, 1286570, 1287661, 1288034
<mup> Bug # changed: 1288745, 1293324, 1294458, 1294462
<mup> Bug # opened: 1288745, 1293324, 1294458, 1294462
<anastasiamac> wallyworld: menn0: thumper: m having one of those funnily interesting days \o/ where can I find a list of all hooks that we support and conditions under which they fire? :D
<wallyworld> source code :-)
<wallyworld> not sure there's doc?
<anastasiamac> :D where in source code?
<wallyworld> there might be
<anastasiamac> or docs?
<anastasiamac> we do not have a list?
<wallyworld> nfi about docs
<wallyworld> may do but i've never used them
<katco> anastasiamac: https://jujucharms.com/docs/stable/reference-charm-hooks
<menn0> anastasiamac: does this help: https://jujucharms.com/docs/stable/developer-event-cycle
<anastasiamac> katco:  menno: this is awesome :D tyvm
<wallyworld> in case docs are out of date, hooks are defined in charm/hooks/hooks.go
<mup> Bug # changed: 1288745, 1293324, 1294458, 1294462, 1294843, 1297940, 1298662, 1298755, 1299032
<anastasiamac> wallyworld: thank you :)
<mup> Bug #1299040 changed: juju should fire depart/join hooks when units are dead <hooks> <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1299040>
<mup> Bug #1299579 changed: support unconfined and nested container app armor profile in local provider <improvement> <local-provider> <ubuntu-engineering> <juju-core:Won't Fix> <https://launchpad.net/bugs/1299579>
<mup> Bug #1299584 changed: dead machines stuck in status <destroy-machine> <landscape> <juju-core:Fix Released> <https://launchpad.net/bugs/1299584>
<mup> Bug # changed: 1300033, 1300755, 1300823, 1301565, 1301999, 1302015, 1303204
<mup> Bug #1303205 changed: an option for add-machine to return back the registered machine id <add-machine> <improvement> <juju-core:Fix Released> <https://launchpad.net/bugs/1303205>
<mup> Bug #1307643 changed: juju upgrade-juju --upload-tools does not honor the arch <simplestreams> <upgrade-juju> <upload-tools> <juju-core:Fix Released> <https://launchpad.net/bugs/1307643>
<mup> Bug #1308101 changed: juju/testing: suite-level Patch never gets restored <tech-debt> <testing> <juju-core:Fix Released> <https://launchpad.net/bugs/1308101>
<wallyworld> menn0: at some point, tomorrow or whenever, i'd love eyeballs on this https://github.com/juju/juju/pull/6473 . There's one commit that's important; the rest can be eyeballed as it's already been reviewed when landing into the feature branch initially. And it's hidden by feature flag and will need rework anyway as spec has changed :-(
<mup> Bug # changed: 1308966, 1309441, 1311781, 1312786, 1312951, 1314682, 1317596
<mup> Bug # changed: 1318601, 1318923, 1319441, 1319608, 1320218, 1321407, 1321408, 1321793, 1322829, 1323441, 1323623
<menn0> wallyworld: ok looking now
<wallyworld> it can wait
<wallyworld> late for you
<menn0> wallyworld: yeah, actually it looks big :)
<menn0> wallyworld: which commit is the important one?
<wallyworld> the one commit that needs review is small
<wallyworld> it's in the PR
<menn0> wallyworld: duh
<menn0> :)
<frobware> hackedbellini: https://bugs.launchpad.net/juju/+bug/1634744
<mup> Bug #1634744:  bootstrap fails with LXD provider when not using lxdbr0 <bootstrap> <lxd-provider> <juju:New> <https://launchpad.net/bugs/1634744>
<menn0> wallyworld: reviewed that commit and found a few little problems
 * menn0 is EOD
<wallyworld> menn0: ty
<bradm> is there a reason there's a juju add-subnet, and lists-subnet, but no apparent way to edit said subnet?
<anastasiamac> bradm: no reason :D
<bradm> although there's not a lot in the way of editing you can do, I guess
<bradm> just remove would do
<mup> Bug # changed: 1313862, 1324097, 1324666, 1324949, 1367863
<anastasiamac> frobware: do u know if this is still applicable for juju 2?
<anastasiamac> https://bugs.launchpad.net/juju-core/+bug/928624
<mup> Bug #928624: cached unit public addresses are problematic when public ip address changes <addressability> <canonical-is> <canonical-webops-juju> <network> <juju-core:Triaged> <pyjuju:Won't Fix> <https://launchpad.net/bugs/928624>
<frobware>  anastasiamac: I would say yes, or still warrants investigation or a repro.
<anastasiamac> frobware: tyvm!
<mup> Bug # changed: 766721, 928624, 1371558, 1372016
<mup> Bug #1049340 changed: cross-model relations <canonical-is> <canonistack> <feature> <ubuntu-engineering> <juju:Triaged> <juju-core:Won't Fix> <pyjuju:Confirmed> <https://launchpad.net/bugs/1049340>
<mup> Bug #1634761 opened: juju 2 subnet refresh from cli <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1634761>
<mup> Bug # changed: 1049340, 1173089, 1183309, 1188126, 1194483, 1208430, 1208787, 1209112
<mup> Bug # changed: 1173089, 1183309, 1188126, 1194483, 1208430, 1208787, 1209112, 1209452, 1213186
<mup> Bug #1209452 opened: juju status should report hardware characteristics in a more structured format <hours> <status> <ui> <juju-core:Fix Released> <https://launchpad.net/bugs/1209452>
<mup> Bug #1213186 opened: Show user progress of deployed service <papercut> <performance> <ui> <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1213186>
<mup> Bug #1217726 changed: gui: viewing a service or a service unit causes the api server to enumerate all availalble tools <api> <juju-gui> <juju-core:Fix Released> <https://launchpad.net/bugs/1217726>
<mup> Bug #1218651 changed: Requests to meta-data service do not timeout <canonistack> <performance> <security> <juju:Triaged> <juju-core:Won't Fix> <https://launchpad.net/bugs/1218651>
<mup> Bug #1634761 changed: juju 2 subnet refresh from cli <canonical-bootstack> <juju:Triaged> <https://launchpad.net/bugs/1634761>
<mup> Bug # changed: 1221501, 1221685, 1223339, 1223752, 1227450, 1229755, 1234287, 1234677, 1236662
<mup> Bug # changed: 1238677, 1241763, 1241840, 1242725, 1243768, 1244382, 1244841
<mup> Bug # changed: 1248674, 1248800, 1252322, 1253651, 1254790, 1255786, 1257729, 1259496
<frobware> on my nuc: load average: 2251.20, 1112.39, 431.37
<frobware> lol - no wonder it is slow to login...
 * frobware grabs some coffee and waits for the load avg to dip below 100...
<voidspace> perrito666: you've touched the vsphere provider - do you know who wrote it?
<frobware> mgz: ping
<frobware> mgz: any chance you could jump on a HO w.r.t. simplestreams
<SimonKLB> hey, could anyone here give me some best practice for dealing with exceptions during a config update in a charmed software - to give some context, right now im using data_changed() to see if the configuration has changed but if the configuration task does not complete it might be unchanged in the software
<SimonKLB> and since data_changed is updating the unitdata it would not think the config has changed the next iteration
<SimonKLB> it would be nice with some kind of two-step update rather than writing the new data when using data_changed(), for example first being able to check if the data has changed and then committing it to the unitdata.kv when it's actually successfully completed
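One way to get the two-step behaviour SimonKLB is asking for is to separate "has the data changed?" (read-only) from "record it as applied" (write), and only do the write once the config task succeeds. A rough sketch with a plain dict standing in for `unitdata.kv` — `is_changed` and `commit` are invented names here, not charmhelpers API:

```python
import hashlib
import json

class ChangeTracker:
    """Two-phase change detection: check first, commit only on success."""

    def __init__(self, store=None):
        self.store = store if store is not None else {}  # stand-in for unitdata.kv

    def _digest(self, data):
        # Stable digest of the data so dict ordering doesn't matter.
        return hashlib.sha1(
            json.dumps(data, sort_keys=True).encode()).hexdigest()

    def is_changed(self, key, data):
        # Read-only: does NOT update the stored digest.
        return self.store.get(key) != self._digest(data)

    def commit(self, key, data):
        # Record the digest only after the config was applied successfully.
        self.store[key] = self._digest(data)

tracker = ChangeTracker()
config = {"port": 8080}
if tracker.is_changed("config", config):
    try:
        pass  # apply_config(config) -- the real work goes here
    except Exception:
        pass  # digest left alone; next hook run still sees a change
    else:
        tracker.commit("config", config)
```

On failure the stored digest is untouched, so the next hook invocation still sees the config as changed and retries.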
<aluria> hi o/ -- I'm trying to get behavior similar to when I deployed ubuntu --to lxc:N (using juju1) --> now, using --to lxd:N (using juju2)   --- on juju1+LXCs, I had to edit /usr/share/lxc/templates/lxc-ubuntu-cloud to map ifaces into bare metal bridges
<aluria> now, I see juju2+LXDs already creates all ifaces listed in "juju list-subnets" -- is that the expected behavior? Could I limit somehow the devices shown on "lxc config show <lxd-name>" ?
<aluria> ie. have an LXD only mapping a couple of ifaces into bare metal bridges
<aluria> frobware: if I may ask you ^^
<frobware> aluria: not at the moment, though that is something we are aware of. In a future version this may well be dynamic (i.e., we will only add what is needed in the container).
<frobware> aluria: just to be clear, you're seeing this on MAAS, correct?
<aluria> frobware: thx -- and how is the order chosen? ie. eth0 maps to br0-net2, eth1 to br0-net0, ...
<aluria> frobware: on MAAS 2, indeed
<perrito666> voidspace: answered you in priv
<frobware> aluria: they should be paired so that eth0 in the container will be br-eth0 on the host
<frobware> aluria: you can see the mapping if you run: lxc config show <container-name>
<aluria> frobware: hmm, we're using disable-network-management=True and copying our own /e/n/interfaces file, so names don't follow br-eth0, etc.
<dimitern> aluria: if you want not to bridge certain maas node interfaces, keep them unconfigured in maas - juju won't bridge unconfigured interfaces
<aluria> frobware: and yep, "lxc config show <lxd-name>" helps to troubleshoot, but I was wondering if that order will be followed on every redeploy (hope so) + on every lxd (I imagine that will happen)
<dimitern> aluria: can I ask why are you using disable-net-mgmt: true ?
<aluria> dimitern: I'm using "disable-net-mgmt: true" b/c we used to (in bootstack)... to be able to configure bonds, vlan tagging, bridges and make LXCs bridge via the bridge we want and not lxcbr0
<mgz> frobware: yo, how can I help?
<frobware> mgz: I'm stuck, blocked and confused. Can you jump on a HO?
<aluria> dimitern: with juju2, things will change (iirc, on maas 2.2?, bridges will be supported)
<dimitern> aluria: ah, I see - well, with juju2 you should have most of that config done by juju on maas
<dimitern> aluria: maas 2.1 has initial bridges support (already released?)
<mgz> frobware: same one as for standup?
<frobware> mgz: yep.
<aluria> dimitern: not on ppa:maas/stable but we'll have to check devel series in the near future -- thx for the info on "juju2 not bridging unconfigured interfaces"
<dimitern> aluria: it's on ppa:maas/next but on Y, IIRC
<aluria> dimitern: ah, it looks it's also on X, ta
<dimitern> aluria: yeah, for L2 support (e.g. neutron) that's how it'll work; in future releases we'll have a better way of communicating "I just want L2 device(s) thank you"
<dimitern> I'd appreciate a second look at https://github.com/juju/juju/pull/6468 please
<frobware> voidspace, dooferlad: ^^ any volunteers for dimitern's PR. I took a look but somebody else should too.
<dimitern> it's easier to review now, as the prereq has landed and I rebased onto develop
<voidspace> frobware: dimitern: looking
<dimitern> voidspace: ta!
<dimitern> frobware: I figured out how to trigger juju ssh hanging on demand :) not a blackhole route to the node's public address (gets EINVAL) but iptables -t filter -I OUTPUT 1 -s <IP>/32 -j DROP
<dimitern> even better (somewhat..) I managed to trigger the bug in parallel.Try
<dimitern> when a dual-NIC juju controller node has -j DROP rule on its first nic arriving from the client
<voidspace> dimitern: so you've added AllNetworkAddresses to the sshclient  shim
<dimitern> voidspace: yeah
<voidspace> dimitern: but this PR doesn't implement AllNetworkAddressess, was it pre-existing just not exposed?
<voidspace> dimitern: and what's the difference between Addresses and AllNetworkAddresses (except the error return value)
<dimitern> voidspace: it was in the prerequisite PR - 6467
<voidspace> dimitern: ah
<dimitern> voidspace: Addresses() returns the merged machine and provider addresses stored on the machine doc (considered obsolete source)
<dimitern> voidspace: and AllNetworkAddresses() returns the machine's addresses assigned to link layer devices
<voidspace> dimitern: the machine doc is now considered obsolete? what is authoritative
<voidspace> ah
<voidspace> cool
<dimitern> (but as []network.Address, more usable across package boundaries - i.e easier to test, than []*state.Address)
<voidspace> dimitern: so has PreferredPublic|PrivateAddress been updated to use link layer device addresses?
<dimitern> voidspace: nope, not yet
<voidspace> dimitern: I thought the preferred addresses were still using machine/provider
<voidspace> right
<voidspace> cool
<dimitern> voidspace: but I'd like to, eventually
<voidspace> dimitern: that PR looks fine to me, pretty straightforward really
<voidspace> dimitern: just going to read frobware's comments
<dimitern> voidspace: they are, but on maas the LLD addresses are a better source (and have all the "metadata" along with them - i.e. subnets, dns settings, etc.)
<dimitern> voidspace: tyvm! :)
<voidspace> dimitern: yep, I like the change to consider LLD authoritative (i.e. to have a proper network model)
<voidspace> dimitern: but we need to be consistent - having two sources of truth is bad :-(
<dimitern> voidspace: yeah :/ but we'll sort that along the way in the next releases I hope
<voidspace> dimitern: cool
<macgreagoir> aluria: Feel welcome to ping me if you want to go over any of those differences. You'll love it! ;-)
<hackedbellini> frobware: thanks for pointing me the bug
<rick_h_> voidspace: ping for chat
<rick_h_> voidspace: bailing out, will see you in standup
<dooferlad> dimitern: https://github.com/juju/juju/pull/6476 fixes one of your annoyances if you want to take a look
<voidspace> rick_h_: oh crap
<voidspace> rick_h_: forgot again dammit
<voidspace> rick_h_: was afk - see you in a few moments
<dimitern> dooferlad: cheers - looking
<rick_h_> voidspace: katco natefinch dimitern ping for standup
<lazyPower> has anyone seen an issue where juju deploying a bundle yields a response message like the following:  ERROR cannot deploy bundle: cannot create machine for holding easyrsa, etcd and kubernetes-master units: cannot add a new machine: An internal error has occurred (InternalError)
<lazyPower> this was filed here: https://github.com/juju-solutions/bundle-kubernetes-core/issues/30  -- and we've not seen anything like this before. I think its bug-worthy on the core side, but i'm not sure what else i can add aside from this bugs output as we didn't encounter it.
<katco> alexisb_: rick_h_: https://medium.com/feature-request-management-in-saas/roadmaps-a-product-teams-friend-or-foe-43b1d5cdc1a6#.kqna52meq
<rick_h_> qkaty
<rick_h_> bah
<rick_h_> katco: ty
 * macgreagoir is still here. Now no need for the early EOD.
<redir> does 1.25 series require the --upload-tools flag to upgrade jujud to a locally built version?
<dimitern> redir: still does, just 2.0 doesn't need it
<redir> tx dimitern
<redir> was there a way to see the logs like --show-logs in 1.24? or did it require  logging in?
<dimitern> redir: you could try `juju debug-log` ?
<redir> I'm a little late coming back from dr. Trains delayed -- should be back in about 15 minutes
<aluria> hi again -- in juju-deployer, we used to use "overrides" for config variables that existed in multiple services (now applications) -- I haven't found such an option on juju2 bundles -- would I need to repeat the same key-value on all charms that have it as config param?
<hoenir> can anyone review my latest patches ? https://github.com/juju/juju/pull/6414
<hoenir> https://github.com/juju/juju/pull/6464
<rick_h_> katco: can you help look at that first one please? ^
<rick_h_> perrito666: do you have time to look at the second one today? ^
<natefinch> aluria: try on #juju .... they generally know bundles better than we do
<perrito666> rick_h_: I do, I am actually waiting on something to break
<rick_h_> aluria: sorry, the built in bundles don't support the variable substitution like the deployer does
<perrito666> hoenir: reviewed the second; targeted to the wrong branch but otherwise seems fine
<hoenir> perrito666, and what branch is the "dev branch" ?
<perrito666> develop
<hoenir> "develop" branch?
<perrito666> I mistyped, wanted to type develope
<perrito666> develop
<hoenir> but why develop? Why has the dev design changed? Do we now make PRs only on develop? Why not on master "as usual"?
<hoenir> perrito666, could you add more details?
<rick_h_> hoenir: sorry, we've got a new development workflow that started this week
<natefinch> rick_h_: is the new development workflow on the wiki?
<rick_h_> natefinch: no, only in the email and docs
<perrito666> I was about to ask, do we have a pub... oh
<perrito666> ok
<hoenir> why not send an email to all contribs when a dev design is changing? It's really hard and somewhat annoying to "change overnight"...
<hoenir> I will make new pr's on develop, thanks guys.
<natefinch> it's pretty easy to move PRs, btw
<perrito666> hoenir: no need to, you can re-target your existing one
<hoenir> natefinch, perrito666 thanks, for the first one I've changed the branch in the end, but for the second I already created a new PR and closed the older one. So I think it's ok for now.
<perrito666> hoenir: pass me the second one so I can review/approve
<hoenir> perrito666, right away
<hoenir> https://github.com/juju/juju/pull/6477
<katco> rick_h_: hoenir: i will tal at 6414 later today; trying to wrap up some changes
<hoenir> katco, thanks !
<hoenir> perrito666, hmm smth failed, should I rebase with master and after that rebase with develop?
<perrito666> hoenir: no, just let me figure out how to fix what failed :)
<perrito666> rick_h_: sinzui any of you remember the invocation to re-run checks
<perrito666> ?
<rick_h_> perrito666: !!build!!
<perrito666> and perhaps we need to make godeps run a couple of times before failing that one
<perrito666> rick_h_: mm, I might not be in the right group of people then, it did nothing
<perrito666> hoenir: interesting approach
<hoenir> perrito666, thanks. The changes imply that the userdata script can be run multiple times without producing errors or changing the way jujud is installed and the env/Windows registry is set up, etc.
<perrito666> hoenir: now we need those tests to run again
<perrito666> rick_h_: can you try the !!build!! incantation?
<rick_h_> perrito666: link me please, ln the phone
<perrito666> rick_h_: https://github.com/juju/juju/pull/6477
<hoenir> https://github.com/juju/juju/pull/6477
<hoenir> why the first failed?
<perrito666> hoenir: godeps failed due to network/server issues
<perrito666> it is a check that runs a basic set of tests
<perrito666> once that passes you can merge
<hoenir> could anyone provide a link all of the "incantations"?
<perrito666> hoenir: there is only 2 as far as I know !!build!! will cause these to re-run and $$merge$$ that will cause a merge (I am not sure they will work for everyone though)
<perrito666> brb
<hoenir> and the log ? 2016-10-19 18:50:55 ERROR trusty failed with 2 perrito666
<hoenir> perrito666, never mind, I found it.
<hoenir> http://juju-ci.vapour.ws/job/github-check-merge-juju/79/artifact/artifacts/trusty-out.log
<hoenir> Yeah so the scripts are valid but they don't 100% match in the unit test (added one space by mistake or smth).
<hoenir> perrito666, https://www.diffchecker.com/k9QPGjvA fixing it now
<hoenir> perrito666, fixed !
 * redir lunches
 * perrito666 does some serious open heart surgery to vsphere provider
<katco> do we only support linux w/ the manual provider?
<perrito666> katco: afaik
<katco> perrito666: k just checking, ta
<perrito666> meh I am not sure if I should refactor this code or publish it as a bad practices manual
<katco> perrito666: refactor. here's why: http://ronjeffries.com/xprog/articles/refactoring-not-on-the-backlog/
<katco> perrito666: just try and keep it a narrow refactoring (difficult in our codebase bc there are sticky bad things that have propagated)
<perrito666> katco: well the code doesnt actually work so not that narrow
<perrito666> :p
<katco> lol
<katco> well then you don't have much of a choice do you lol
<perrito666> it does the wrong thing and written in the wrong way :p
<perrito666> so you see my issue  :p
<katco> perrito666: refactor. this shouldn't even be a question lol
<perrito666> katco: still could sell very well as an oreilly reference book :p
<katco> frobware: does this really work? https://github.com/juju/juju/blob/staging/environs/manual/addresses.go#L17
<katco> frobware: i still don't know much about our networking stuff, but i imagined a more robust resolution framework
<anastasiamac> I can see both $$merge$$ and !!build!! being used... is there a considerable difference between the two? is there a preference?
<perrito666> anastasiamac: afaik build only triggers the automated tests
<anastasiamac> perrito666: k so with !!build!!, there is no landing... we are just requesting a check that land is possible?
<perrito666> anastasiamac: exactly it is for cases where tests fail I presume
<anastasiamac> perrito666: thnx \o/
<wallyworld> menn0: i replied to your comments - care to take a look?
<alexisb> menn0, thumper ping
<thumper> yes?
<thumper> wallyworld: the folks behind exploding kittens have a new game coming
<thumper> which I have backed already
<wallyworld> thumper: me too!
<thumper> bearsvsbabies.com
<thumper> :)
<wallyworld> looks awesome
<thumper> yeah
<thumper> veebers: I was thinking of spending some time up the road at the cafe
<thumper> veebers: interested in joining me?
<wallyworld> thumper: still no coffee machine?
<thumper> wallyworld: no :-(
<wallyworld> ffs
<thumper> word
<menn0> thumper: hangout again? I've dug a bit further
<thumper> yep
 * perrito666 reaches new levels of not enough bandwidth
<perrito666> 39k upload...
<veebers> thumper: ugh, sorry I just organised to grab some lunch with the girls (we have Kerri's sister down for Kerri's b-day) :-\ Maybe later today or tomorrow?
<thumper> hmm...
<thumper> I'll probably be there 1-3pm
<thumper> ish
<veebers> cool, I'll be in touch :-)
<redir> what if there's no sdb for new regions. empty string?
<perrito666> yay vsphere provider working correctly
 * perrito666 jumps like doc brown
#juju-dev 2016-10-20
<redir> redir, yes empty string
<redir> part 1 https://github.com/go-amz/amz/pull/72 PTAL
<redir> anyone have a minute to look at something with me?
 * thumper heading up to the cafe
 * redir heads to GOSF to look at some talks. Most specifically one about https://github.com/ekanite/ekanite because... What if we just used a raft log to persist state? Where'd the logs go?
<redir> also there's this: https://github.com/reedobrien/juju/tree/lp/1612645%2B1634289_new-aws-regions
<redir> but it fails because: cannot write file "provider-state" to control bucket: The specified bucket does not exist
<redir> so I must have missed something
<redir> in adding the regions, or something else is wrong
<redir> later
<redir> anyone have any thoughts about why the control bucket doesn't exist?
<wallyworld> context?
<wallyworld> menn0: have you had a chance to take another look at that commit with my latest changes to fix import ordering etc?
<redir> wallyworld: added new ohio region to amz and 1.25 provider and tried to deploy
<redir> got the error 6 lines up
<wallyworld> 6 lines up from where?
<redir> wondering if I missed something obvious,
<menn0> wallyworld: sorry, I haven't looked yet, otp but will look after
<redir> this channel
<wallyworld> menn0: np, ty
<wallyworld> redir: not 6 lines for me, i have other notifications turned on, more like 20 and it's scrolled off my screen :-)
<redir> wallyworld: this channel at 17:33:07
<wallyworld> redir: did your environment.yaml have a control bucket? it shouldn't
 * redir looks
<wallyworld> also not 17:33 for me?
<wallyworld> :-)
<redir> hehe
<redir> at minutes 33:07 of the hour, two hours ago
<redir> no control-bucket in environments.yaml
<wallyworld> redir: can you provide debug logs? aws is expected to create a control bucket using a self-generated bucket name, from memory
<redir> wallyworld: I'll work on it tomorrow
<wallyworld> it's been 2 years since i saw that code, so will have to go digging
<redir> it only fails in US East new region. The AP South one I added works
<wallyworld> hmmm, interesting
 * redir guesses it assumes the wrong things for useast
<wallyworld> maybe that new region doesn't have s3 support yet
<wallyworld> or the s3 endpoints are wrong
<wallyworld> doesn't sound like a juju issue
<wallyworld> if it's just the one region. more likely a go-amz issue
<redir> it says it does, but not at s3.a....com it needs to be at s3.us-east-2....com
<wallyworld> ok, ok, so you'll need to update go-amz to ensure it has the right endpoints
<redir> wallyworld: right. My guess too, I'll dig in to goamz tomorrow
<wallyworld> ok
<redir> thanks wallyworld much appreciated. Wanted to make sure it wasn't something simple
<wallyworld> redir: tis ok, i didn't tell you much except propose a guess or two :-)
<redir> wallyworld: sometimes it's nice when the rubberduck squeaks:)
<wallyworld> depends on how hard you squeeze it :-)
<wallyworld> normally I quack
<redir> :)
<menn0> wallyworld: ship it
<wallyworld> menn0: tyvvvvvvm for looking
<menn0> wallyworld: opening the diff nearly killed my browser
<wallyworld> menn0: another reason why github reviews suck
<wallyworld> with rb, you get a paginated list of files
<wallyworld> i don't get how people navigate gh reviews
<wallyworld> without pulling their hair out
<menn0> wallyworld: I just had a quick play with the gerrit instance for the chromium project. it's pretty nice, if intimidating at first.
<menn0> wallyworld: knobs and buttons for everything
<menn0> wallyworld: but fast and has good keyboard shortcuts for navigation.
<wallyworld> menn0: would love to seriously consider that
<wallyworld> but people need to get over the NIH syndrome
<wallyworld> which I think will be impossible for our group :-(
<menn0> wallyworld: I don't know about that. I can see why some people are attracted to GH reviews.
<wallyworld> i am having trouble. they are really hard to navigate (all one big page), and so much wasted space
<wallyworld> and no way to manage comments
<wallyworld> the main argument seems to be "well it might get better sometime"
<wallyworld> but it's crap *now*
<thumper> w00t
<thumper> wow
<thumper> that looks really out of place now
<thumper> my view had this as the last comment:
<thumper> <perrito666> yay vsphere provider working correctly
<thumper> then I scrolled down
<thumper> heh
<redir> amz bits ready for review: PTAL https://github.com/go-amz/amz/pull/72
<alexisb> wallyworld, if you are still around can you provide a review for redir ^^
<wallyworld> can do in a minute
<alexisb> wallyworld, thank you!
<redir> ekanite looks pretty nice. A syslog server that accepts msgs over UDP/TCP [+TLS] parses RFC5424 headers, does full text indexing, and manages retention
<wallyworld> redir: did you sort the s3 issue?
<redir> also sorts by log timestamp rather than the order received
<redir> wallyworld: yup
<wallyworld> greta
<wallyworld> great
<redir> and will have a juju 1.25 PR when the AMZ bits are merged
<redir> wallyworld: ^^
<redir> oh and the ekanite binary is 12MB
<wallyworld> redir: did you research that us-east-2 requires the region bools to be true, ie bucket lower case etc?
<redir> wallyworld: that fixed the issue
<redir> and US East is the only region that doesn't
<redir> in amz
<redir> us-east-1
<wallyworld> interesting
<wallyworld> redir: lgtm
<redir> See https://github.com/reedobrien/amz/blob/2eab8e64ed1675ae196f785bbc2d293e45aaf7a1/aws/aws.go
<redir> everything else is not false.... so looks like they were lax on the original region
<redir> and changed their minds for everything else
<redir> wallyworld: my guess is that since the bucket can be used in the website url they wanted to make it consistent with case insensitivity...
<redir> after the original s3 in 2006 was let out into the wild
<wallyworld> sounds plausible yeah
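The naming quirk redir describes is real: us-east-1, the original 2006 region, accepted legacy bucket names (uppercase, underscores, up to 255 characters), while later regions enforce DNS-compliant lowercase names. A rough validity check illustrating the split — an illustration of the rule, not go-amz's actual code:

```python
import re

# DNS-compliant bucket name: 3-63 chars of lowercase letters, digits,
# dots and hyphens, starting and ending with a letter or digit.
DNS_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

def bucket_name_ok(name, region):
    """Rough check: us-east-1 tolerated legacy names; later regions
    (like us-east-2) require strict DNS-compliant names."""
    if region == "us-east-1":
        return 0 < len(name) <= 255 and re.match(r"^[A-Za-z0-9._-]+$", name) is not None
    return DNS_NAME.match(name) is not None

print(bucket_name_ok("MyBucket", "us-east-1"))  # True: legacy name tolerated
print(bucket_name_ok("MyBucket", "us-east-2"))  # False: uppercase rejected
```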
<redir> can we merge amz?
<redir> does all the $$merge$$ stuff work there?
<wallyworld> should do
<redir> k. tx
<dimitern> dooferlad, frobware, macgreagoir: I'd appreciate a review on https://github.com/juju/juju/pull/6481 fixing bug 1616048 ta!
<mup> Bug #1616048: Create interface for ofono <snapd-interface> <snapd (Ubuntu):In Progress by alfonsosanchezbeato> <https://launchpad.net/bugs/1616048>
<dimitern> oops - bug 1616098
<mup> Bug #1616098: Juju 2.0 uses random IP for 'PUBLIC-ADDRESS' with MAAS 2.0 <4010> <cpec> <juju:In Progress by dimitern> <https://launchpad.net/bugs/1616098>
<frobware> dimitern: will do - otp
<dimitern> frobware: no rush
<dimitern> ;)
 * dimitern steps out for ~1h
<dimitern> frobware: ping
<dimitern> macgreagoir: hey
<dimitern> macgreagoir: got some time for a review? :)
<anastasiamac> dimitern: I think ur PRs do not have a check run coz they r awesome :) m sure that's what balloons coded in the script... "if dimitern > don't check" :)
<dimitern> anastasiamac: :D
<dimitern> anastasiamac: that's a nicer way of looking at it, sure ;)
<dimitern> anastasiamac: I'd appreciate a review as well hehe
<anastasiamac> dimitern: i'd love to review but m furiously working to remove myself from keyboard: it's 11pm \o/ if it can wait til my morning -m happy to do it then :D
<dimitern> anastasiamac: uuh you should definitely get some rest!
<anastasiamac> dimitern: \o/ will soon :D thank you for care and concern :)
<dimitern> anastasiamac: any time ;)
<dimitern> rick_h_: ping
<rick_h_> dimitern: pong
<dimitern> rick_h_: I thought about picking up bug 1580501 tagged 1.25 .. it seems it's only related to 2.0 though
<mup> Bug #1580501: cloudimg-base-url parameters not in Juju2 anymore <4010> <cpe-sa> <orangebox> <sts> <juju:Triaged> <https://launchpad.net/bugs/1580501>
<rick_h_> dimitern: please stick with cards on the board
<dimitern> rick_h_: that's the first one on the board :)
<dimitern> rick_h_: top left
<rick_h_> oh wtf, the title of the bug and the card got out of whack
<rick_h_> dimitern: ugh, ok this is a can of worms because this falls under work planned for the lxd image cache stuff, and yes is 2.0 only
<rick_h_> dimitern: so going to yank that one off the board
<dimitern> rick_h_: +1 I'll look into bug 1589680 then
<rick_h_> dimitern: one sec, let me see if I did bad copy/paste voodoo
<mup> Bug #1589680: Upgrading to cloud-archive:mitaka breaks lxc creation <canonical-bootstack> <juju-core:Triaged> <juju-core 1.25:Triaged by rharding> <https://launchpad.net/bugs/1589680>
<rick_h_> dimitern: https://bugs.launchpad.net/juju-core/+bug/1560487
<mup> Bug #1560487: local provider fails to create lxc container from template <canonical-is> <local-provider> <juju-core:Won't Fix> <juju-core 1.25:Triaged by alexis-bruemmer> <OPNFV:New> <https://launchpad.net/bugs/1560487>
<rick_h_> dimitern: I pasted the wrong bug number to the card, my bad
<rick_h_> dimitern: apologies for the confusion
<dimitern> rick_h_: right :) np - do you want me to look into that last one instead?
<rick_h_> dimitern: either one is ok thank you
<dimitern> rick_h_: ok - 1589680 it is then
 * dimitern dusts off his 1.25 branch .. it's been a while
<mgz> hm, we're not doing staging and such like with 1.25 I presume?
<rick_h_> mgz: no, we're not
<dimitern> it's free for all :D
<rick_h_> frobware: ping, can I grab a sec when you're available?
<macgreagoir> dimitern: You mean 6481? Sorry, I started and got distracted by my own :-)
<rick_h_> dimitern: I'd expect that we can just upgrade the lxc in order to find a path forward
<rick_h_> dimitern: so if we deploy something on trusty/default lxc and it's pre-2.0, then manually add the PPA/upgrade lxc on there...how would Juju know to use/follow newer lxc patterns
<dimitern> macgreagoir: np, if you can have a look at it - great!
<dimitern> rick_h_: I'd expect setting openstack-origin to cloud-archive:mitaka will trigger adding the repo and the lxc upgrade..
<dimitern> will know soon anyway..
<rick_h_> dimitern: right, but my point is you can do this with the ubuntu charm
<rick_h_> dimitern: no need for anything openstack
<rick_h_> dimitern: fewest moving parts/complications the better
<dimitern> rick_h_: yeah, I guess so.. double checking c-a:m's lxc version..
<rick_h_> dimitern: just check the version in trusty vs the lxc ppa for trusty
<rick_h_> dimitern: if you can get a pre-2.0 in default trusty and then get post 2.0 via the ppa you can "do the upgrade" and chase down what we need to help update to make it work
<dimitern> yeah: lxc	2.0.5-0ubuntu1~ubuntu16.04.1~cloud0 is in cloud-archive:mitaka
<dimitern> and lxc is not there in c-a:icehouse
<chrome0> dimitern : afaicr the error was when going from c-a:trusty-liberty -> c-a:trusty-mitaka
<chrome0> And orig. install was plain trusty, sans c-a
<dimitern> chrome0: yeah, I'm trying exactly that now :)
<chrome0> +1
<dimitern> rick_h_: post 2.0 lxc won't work with 1.25 AIUI
<rick_h_> dimitern: can you hop in the standup real quick?
<dimitern> owm
<frobware> rick_h_: yep, back
<rick_h_> frobware: meet you in the standup room please?
<frobware> rick_h_: omw
<SimonKLB> is there a good way to find out when a relation is completely removed?
<SimonKLB> i find it really hard to test adding and removing relations
<SimonKLB> simply looking at the departed/broken hooks doesn't seem to be enough to determine the actual status of the relation in juju
<rick_h_> natefinch: ping for chat
<natefinch> rick_h_: oops coming
<rick_h_> frobware: dimitern natefinch ping for standup
<mgz> dooferlad: yo, free for bothering?
<frobware> rick_h_: I think we have a nice conclusion for the openstack bug
<frobware> rick_h_: https://bugs.launchpad.net/juju/+bug/1621590/comments/18
<mup> Bug #1621590: openstack provider ignores a properly created bootstrap machine <cpec> <rteam> <juju:In Progress by frobware> <https://launchpad.net/bugs/1621590>
<frobware> dimitern: now looking at your PR :)
<redir> mgz: yt?
<dimitern> frobware: sweet! :) thanks!
<redir> mgz: can you $$merge$$ this https://github.com/go-amz/amz/pull/72 or add me to the right group, or tell me who to ask to get added to the right group? Please:)
<rick_h_> dooferlad: ping
<mgz> redir: sure, I'll take a look
<dooferlad> rick_h_: hi
<dooferlad> mgz: sorry, still talking to Mick
<rick_h_> dooferlad: heads up, assigned another bug your way. It's another config param getting ignored like your current one.
<dooferlad> rick_h_: ok, thanks
<rick_h_> dooferlad: the possible idea is that it's a more generic problem and that two birds/one stone and such
<rick_h_> dooferlad: and it was brought up on the cross team call by the stakeholders there that it should be replicatable with canonistack
<rick_h_> dooferlad: so if getting OS going is a burden please dump that and run with an already running OS
<mgz> redir: heh, that's a group I'm not actually owner on, but I can trigger the merge for you
<rick_h_> dooferlad: bah, LP won't let me change the assignee atm, timing out. https://bugs.launchpad.net/juju/+bug/1614239
<mup> Bug #1614239: bootstrap-timeout ignored in --config <landscape> <juju:Triaged by rharding> <https://launchpad.net/bugs/1614239>
<mgz> redir: merged as 7754380
<frobware> dimitern: what were the rules on 500 lines of diff? You're dangerously close. :)
<dimitern> frobware: I really tried to minimize the changes :/
<dimitern> frobware: the only "luxuries" are a few added tests, but the refactoring of the common code / tests *did* reduce the original diff by a 100 lines :)
<frobware> dimitern: have time to HO?
<dimitern> frobware: yeah
<dimitern> frobware: I'm in 'core'
<frobware> dimitern: omw
<mgz> dooferlad: so... free now?
<redir> mgz: thanks
<redir> mgz you know who the owner is?
<mgz> redir: canonical-is and niemeyer
<mgz> I don't know who has access to the canonical-is role account
<dooferlad> mgz: sorry - have a headache so bad I want to be sick. Can we talk tomorrow? Perhaps invite me to a meeting so we have a slot booked?
<mgz> dooferlad: no worries, I'll set something up for tomorrow
<dooferlad> mgz: thanks - much appreciated
 * dooferlad goes to hide in a dark room
<redir> another PR ready for review; https://github.com/juju/juju/pull/6483 PTAL
<redir> bbiab after dentist
<SimonKLB> hello, anyone got some time guiding me on how to test adding and removing relations?
<SimonKLB> i was using the relation function in amulet to test whether or not the relation was successfully removed, and that worked, but now it doesn't anymore
<SimonKLB> would really appreciate some help!
<katco> dimitern: frobware: hey, is this robust? i.e. will it do the right thing in complicated network setups? https://github.com/juju/juju/blob/staging/environs/manual/addresses.go#L17
<redir> abentley: was that update for Ohio because of the amz release that happened?
<redir> this morning
<abentley> redir: No, it was because AWS announced the region on the 17.
<redir> i see, thanks abentley
<redir> anyone, PTAL https://github.com/juju/juju/pull/6483
<hoenir> axw, katco, https://github.com/juju/juju/pull/6414
<redir> ls
<redir> whoops
<natefinch> ahh the glorious feeling when you've gotten down in the weeds typing code and you finally get it back to the state where it'll gofmt.
<perrito666> looooool
 * redir lunches
<natefinch> made a new struct that helps you ask the user repeated questions.... called it Pollster.  Figured it was appropriate given current events.
<redir> As long as it produces output that isn't actually relevant to anyone, natefinch
<redir> and different results for the same input by different users
<natefinch> heh
<natefinch> rick_h_: http://pastebin.ubuntu.com/23355482/
<rick_h_> natefinch: coolio
<natefinch> rick_h_: so, one reason to use add-cloud for ec2 etc ..... it's the only way to save a cloud-specific config
<natefinch> rick_h_: instead of having to do bootstrap --config=myconfig.yaml
<rick_h_> natefinch: yea, but I think we need a different approach for config as a whole
<rick_h_> natefinch: because config is across all providers and that's more of an "edit cloud" than an add-cloud and such
<rick_h_> natefinch: so I've punted that for now because adding it here is a hack imo
<natefinch> rick_h_: that's fine.  I can pare the list of clouds back to just the custom ones
<rick_h_> natefinch: though I wonder if we can bootstrap by adding an empty config section whenever we write it out
<rick_h_> natefinch: so that it's ready to be edited ootb
<natefinch> well, you don't really need a config section
<rick_h_> natefinch: I mean that you can specify a config key to the cloud definition and then add overrides/etc to that file
<natefinch> rick_h_: one problem I'm not quite sure how to solve from a UX standpoint is how to query for "choose M of N", like for authentication types, you can choose 1-3 of 3 choices.
<natefinch> rick_h_: I guess just asking for them comma separated will work for our needs
<rick_h_> natefinch: yea, I think pick a format and validate it
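The comma-separated "choose M of N" idea could be sketched like this; the helper name and the auth-type values are illustrative, not the real juju code:

```go
package main

import (
	"fmt"
	"strings"
)

// parseMultiChoice splits a comma-separated answer and validates each
// entry against the allowed set: the "choose M of N" case discussed
// above, e.g. picking 1-3 of 3 authentication types.
func parseMultiChoice(answer string, allowed []string) ([]string, error) {
	ok := make(map[string]bool, len(allowed))
	for _, a := range allowed {
		ok[a] = true
	}
	var picked []string
	for _, part := range strings.Split(answer, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		if !ok[part] {
			return nil, fmt.Errorf("%q is not one of: %s", part, strings.Join(allowed, ", "))
		}
		picked = append(picked, part)
	}
	if len(picked) == 0 {
		return nil, fmt.Errorf("choose at least one of: %s", strings.Join(allowed, ", "))
	}
	return picked, nil
}

func main() {
	got, err := parseMultiChoice("userpass, access-key",
		[]string{"userpass", "access-key", "oauth1"})
	fmt.Println(got, err) // [userpass access-key] <nil>
}
```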
<redir> easy review https://github.com/juju/juju/pull/6483 PTAL
<mgz> redir: you haven't got anyone to bite yet? I'll take it.
<mgz> redir: you took the pricing stuff from the webpage, or andrews's new generation thing?
<redir> mgz the web page:/ Didn't know about a generator
<mgz> redir: it's what I did last time, I think that's reasonable for 1.25
<mgz> we know it's not always up to date but should be good enough
<mgz> redir: lgtmed
<redir> mgz: tx
<mgz> ...why did I give andrew an extra s...
<redir> channelling Gollum?
<mgz> preciousss anddrewsss
<natefinch> ^ nice
<natefinch> rick_h_: forgot about a school meeting I have in the morning at 9am.  Probably will miss standup, though I might get lucky and it won't take forever. Doing well with the add-cloud stuff, though.
<rick_h_> natefinch: k
<axw> wallyworld: is there no new price list info for us-east-2?
<wallyworld> axw: not yet, that bit was not updated in juju
<wallyworld> so far, just the region and endpoints have been updated so it works
<axw> wallyworld: yeah, it's just that it only works because of a fall-back. it could be inaccurate
<axw> anyways, nice that it works :)
<wallyworld> yeah
<axw> wallyworld: hm, there is info for ohio in the latest index
<axw> I'll update
<wallyworld> ta
<axw> ... right after I remember to go to standup
#juju-dev 2016-10-21
<redir> seems like a lot of failures on develop currently
<redir> I branched from staging and am proposing a merge to develop https://github.com/juju/juju/pull/6485
<redir> real easy review
<redir> except I don't know why I've got voidspace commits in there with mine.
<redir> apparently they aren't in develop but are in staging
<redir> yup they were merged directly to staging
<redir> anyhow PR ready for review: https://github.com/juju/juju/pull/6485 PTAL
<redir> wallyworld, anastasiamac, axw ^
<wallyworld> ok
<wallyworld> redir: not sure why you needed to branch from staging if you wanted to merge back into develop, anyway lgtm
<anastasiamac> wallyworld: isn't it our current workflow? branch from staging but PR against develop?
<wallyworld> not sure, i always just branch from develop. i've always thought it better to branch from the target to which you want to merge
<wallyworld> otherwise you end up with unrelated commits
<wallyworld> s/end up/can end up
<anastasiamac> that's not the workflow tho.. this is why remote is staging
<wallyworld> my remote is develop :-)
<anastasiamac> special \o/
<wallyworld> why have a remote that is different to the target of your PRs?
<wallyworld> it just introduces skew like reed saw
<anastasiamac> for one, there is no guarantee that what's in develop will be promoted to staging....
<anastasiamac> staging is meant to be stable branch
<anastasiamac> but we have difficulties promoting to staging atm
<anastasiamac> the idea is that a failed promotion will not land everything in develop to staging
<redir> tx wallyworld
<anastasiamac> so if u branch from develop, u'll have some stuff that is not stable yet
<anastasiamac> i agree that until the wrinkles are ironed, maybe it's worthwhile to consider branching from develop
<anastasiamac> altho if we do that, we'll never iron our wrinkles :D
 * redir goes back to branching from develop
<redir> I had started there but was seeing unexpected failures and then switched to staging
<redir> some of the failures were intermittent and others were because lxd setup on 16.10 defaults IPv6 to on.
<anastasiamac> redir: wallyworld: the promotion from develop to staging was meant to take only about 3hrs... however, atm, it's not happening
 * redir eod
<wallyworld> if you branch from develop you'll get stuff not in staging for sure, but most of the time that's what i want especially if i'm collaborating and need to pick up someone's work as soon as it lands
<wallyworld> otherwise i'm blocked until their work hits staging
<wallyworld> even if it's 3 hours, that's still half a day lost
<anastasiamac> in the situation where there are several PRs being promoted to staging but fail, u will not be able to re-submit ur PR branched from develop easily if the failure is with other PRs
<anastasiamac> wallyworld: axw: welcome message fix as per standup: https://github.com/juju/juju/pull/6487
<anastasiamac> PTAL at ur leisure :D
<wallyworld> sure, otp will look soon
<anastasiamac> hmm m not sure why i have 2 commits on it... ?? "Merge commit '7c21f4f4a09f727601fdce45cbd0230063f7f3a3' into HEAD "
<axw> anastasiamac: did you branch off master instead of staging perhaps?
<axw> anastasiamac: that's what I got when I did that
<anastasiamac> axw: i branched off staging.. but it's so far behind... m wondering if i should re-branch and re-propose against develop?
 * axw shrugs
<anastasiamac> axw: well, i guess my question is if i'll $$merge$$ on this PR (once lgtm'ed), will this commit hurt?
<mup> Bug #1493118 changed: Subordinates stuck in error state <juju-core:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1493118>
<mup> Bug #1566450 changed: Juju claims not authorized for LXD <bootstrap> <ci> <intermittent-failure> <lxd> <juju:Triaged> <https://launchpad.net/bugs/1566450>
<mup> Bug #1629919 changed: destroy-controller fails and a kill-controller is required. <juju-core:Invalid> <https://launchpad.net/bugs/1629919>
<axw> anastasiamac: I don't think it matters. it's just the merge commit. the child commits are already in develop AFAICT
<axw> might look a little ugly in history that's all
<anastasiamac> axw: i can't imagine anyone interested in history on this one... but for future PRs, i'll b kinder to posterity and future us :D
<mup> Bug #1493118 opened: Subordinates stuck in error state <juju-core:Won't Fix> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1493118>
<mup> Bug #1566450 opened: Juju claims not authorized for LXD <bootstrap> <ci> <intermittent-failure> <lxd> <juju:Triaged> <https://launchpad.net/bugs/1566450>
<mup> Bug #1629919 opened: destroy-controller fails and a kill-controller is required. <juju-core:Invalid> <https://launchpad.net/bugs/1629919>
<anastasiamac> wallyworld: tyvm for review \o/ I've addressed/replied to all... happy for it to merge?
<wallyworld> let me look
<wallyworld> anastasiamac: there's no use of template
<anastasiamac> wallyworld: m expecting in the long run to have manual page instead
<wallyworld> using templates is much better than printf with the same value repeated many times
<anastasiamac> wallyworld: it's temporary, bandage solution...
<wallyworld> ok
<wallyworld> lgtm then
<anastasiamac> \o/ i'll consider template after lunch unless something else'll come up :)
<anastasiamac> wallyworld: to clarify, coz i've seen u talking about versions on openstack endpoints this week, do openstack endpoints have to have a version?
<anastasiamac> axw: ^^ if u know...
<wallyworld> yep
<anastasiamac> m looking to pick up bug 1634770 and wondering what the right thing to do would b..
<mup> Bug #1634770: panic when bootstrapping with openstack provider if endpoint omits api version <bootstrap> <openstack-provider> <juju:Triaged> <https://launchpad.net/bugs/1634770>
<wallyworld> they all have a version in the url
<wallyworld> v1 or v2.1 etc
<anastasiamac> k.. so instead of panic, i'll just error :D
<axw> anastasiamac: it would be better if we did neither, and queried supported versions
<wallyworld> the panic would be because our code expects a version in the supplied url
<wallyworld> right now, we expect the user to know
<wallyworld> what version to specify
<wallyworld> but andrew is right, querying would be better
<anastasiamac> axw: wallyworld: i could probably return supported/available versions as part of error
<anastasiamac> when none is supplied as per bug..
<wallyworld> we do the query for identity for example to know if domain is supported
<wallyworld> we never used to do that originally
<wallyworld> the new goose code will query for versions
<wallyworld> it may be that for cloud endpoints, version becomes optional
<wallyworld> if none supplied, use the latest perhaps
<anastasiamac> use the latest and log Info msg that none was supplied and latest selected, for e.g.?
<axw> anastasiamac: yes, that would be ideal I think
<anastasiamac> \o/ awesome!
<axw> anastasiamac: maybe just log as debug, I'm not sure that it's that interesting
<anastasiamac> axw: u don't think that users might need to know? i'd imagine not many would run with debug on
<axw> anastasiamac: juju should just do the right thing. I don't think there's a reason for the user to care which version of the identity API we use
<anastasiamac> axw: k. i'll do debug.. we can always change if needed :)
<axw> yup
<anastasiamac> wallyworld: changed to use template \o/ made vars obvious as well! i think it looks even better now :)
<anastasiamac> wallyworld: still k to land from u?
<wallyworld> sure
 * wallyworld goes to look
<wallyworld> template much easier to understand. thanks
<wallyworld> but is only temporary as you say
<anastasiamac> sure but i thought since our *temporary* may last longer than usual "temporary", i'd better do the right thing :)
<anastasiamac> axw: wallyworld: there is no way to find latest identity version from api. m going to hard-code it to 3 as we've done everywhere else..
<wallyworld> be careful, v3 is not universally supported IIANM
<wallyworld> i think we hard code to v3 when domain is specified in credentials
<wallyworld> we do have a way to get identity version
<wallyworld> FetchAuthOptions
<anastasiamac> wallyworld: yep looking at it now ;)
<anastasiamac> it's not actually ideal as it does not tell you what is latest... it just tells u what's available...
<anastasiamac> i guess i could just error out and force user to supply version for now
<anastasiamac> and once we do have legitimate identity endpoints without version, we'll update openstack... u know, like on a per-need basis :)
<mup> Bug #1386284 changed: no warning when tools-metadata-url is misconfigured <bootstrap> <ci> <logging> <simplestreams> <juju-core:Expired> <https://launchpad.net/bugs/1386284>
<mup> Bug #1538462 changed: simplestreams debug content is useless (juju bootstrap --debug) <logging> <simplestreams> <juju-core:Expired> <https://launchpad.net/bugs/1538462>
<wallyworld> isn't the latest the one with the highest version number?
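The fallback being discussed (use the version in the endpoint URL if there is one, else pick the highest available) could be sketched like this; the helper, its regex, and the URL layout are illustrative assumptions, not goose's actual API:

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// versionRE matches a trailing API version segment like /v2.0 or /v3.
var versionRE = regexp.MustCompile(`/v(\d+(?:\.\d+)?)/?$`)

// endpointVersion extracts the API version from an endpoint URL.
// When none is present it falls back to the highest known version,
// the "use the latest and log it" behaviour discussed above. The
// boolean reports whether the version was explicit in the URL.
func endpointVersion(endpoint string, available []string) (string, bool) {
	if m := versionRE.FindStringSubmatch(endpoint); m != nil {
		return m[1], true
	}
	if len(available) == 0 {
		return "", false
	}
	sorted := append([]string(nil), available...)
	sort.Strings(sorted) // lexical sort; fine for single-digit major versions
	return sorted[len(sorted)-1], false
}

func main() {
	v, explicit := endpointVersion("https://keystone.example.com:5000/v2.0", []string{"2.0", "3"})
	fmt.Println(v, explicit) // 2.0 true
	v, explicit = endpointVersion("https://keystone.example.com:5000", []string{"2.0", "3"})
	fmt.Println(v, explicit) // 3 false
}
```

The caller could log at debug level when the second return value is false, per axw's suggestion.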
<axw> it's not every day that a house goes through the air over your head
 * axw stops hyperventilating
<wallyworld> axw: wtf happened?
<axw> wallyworld: heh :)  neighbour's second storey is going on
<wallyworld> via a huge crane?
<axw> I'll take a photo when the next module goes on
<axw> wallyworld: yup
<axw> wallyworld: and said crane took the first module pretty much over my head
<wallyworld> cool
<wallyworld> i'd be outside looking
<axw> i went and had a gander. they're just getting ready atm
<axw> for the next one
<wallyworld> i hope their slings and ropes etc are strong :-)
<axw> :)
<anastasiamac> wallyworld: from what I am seeing you do not get version numbers from FetchAuthOptions but auth mode. the translation from version id to auth mode is happening within goose
<anastasiamac> so by the time we get it we cannot figure out the latest available
<anastasiamac> unless we hardcode something
<anastasiamac> so m going to just error for now...
<anastasiamac> the problem is that our logic in version deduction was a bit error-prone
<anastasiamac> m dealing with test failures atm
<anastasiamac> then will propose
<wallyworld> goose will be gaining some logic to get endpoint versions without the post processing, we can use that when available
<anastasiamac> exactly \o/ for now i'll just error out
<anastasiamac> i believe that this was the intent of the code m changing... but it's buggy
<anastasiamac> u'll c what i mean when i propose
<axw> wallyworld: ok, going back to work now... https://goo.gl/photos/ZPmdqKnRv5aJtqp76
<axw> (the one with the shabby lawn is my place)
<wallyworld> axw: jeez, what ever happened to the brickie and chippie turning up on site with their utes and cattle dogs and a chicko roll
<axw> wallyworld: heh :)  plenty of guys standing around doing nothing at least
<wallyworld> ah so they work for the council then
<mgz> my internet has just been terrible this morning...
<dimitern> mgz: can you +1 https://github.com/juju/juju/pull/6481 if you think it's good to land please?
<mgz> dimitern: sure, I'll take a look
<dimitern> mgz: cheers!
<mgz> hm, I'm not sure about exposing the force of v1 all the way up to an environment variable
<dimitern> mgz: it's mostly harmless anyway,
<dimitern> mgz: but it does make testing easier, without introducing unnecessary patching / global vars
<mgz> yeah, I agree it's unlikely to hurt us, but really we just want that for the unit tests
<mgz> as the right way to functional test is with different juju versions
<dimitern> mgz: yeah, you're not wrong :) I should try adding one
<mgz> okay, this all makes sense to me
<mgz> to check I'm getting the important bits right
<dimitern> sweet!
<mgz> basically this is some of the way towards what we do on bootstrap ssh, for all juju ssh calls
<mgz> previously we'd just ask the api for an address (public or private depending on context) and try to ssh to that
<mgz> now the code gets every address the machine claims to have
<dimitern> and *only* that
<mgz> and does some inspecting of them, then hands one off that it reckons will actually work
<dimitern> yeah, but this approach is even better (faster) than the one used during bootstrap
<mgz> which isn't quite what bootstrap does
<mgz> (which is actually try to ssh to every address in parallel and see what happens)
<dimitern> yeah, hands off the one it did connect to successfully
<mgz> dimitern: okay, +1ed
<dimitern> it does, but the timeout used for the parallel.Try during bootstrap is appalling (10m IIRC)
<mgz> well, for good reason in that context
<mgz> as we're waiting for the remote machine to actually get stuff done as well
<mgz> so, it's not 10m of network timeout, it's 10m of please start your ssh server
<dimitern> which means if you hit a blackhole route *first* you'll sit there waiting for ssh to come back.. after 10m
<mgz> but yeah, it's all rather messy
<dimitern> mgz: awesome! thanks :)
<dimitern> I've seen it break with some unfortunate iptables rules set
<dimitern> but also parallel.Try in general assumes the func you pass to run in parallel won't block forever
<dimitern> and I've seen ssh doing that if given an address matching an OUTPUT -j DROP iptables rule
<dimitern> let's fix one thing at a time I suppose.. :)
<dimitern> mgz: does the $$fixes-BUG_ID$$ thing still work as before or I got to use $$merge$$ ?
<mgz> you might have to use $$merge$$ now, I'm not sure my little hack made it across
<dimitern> mgz: it does work it seems
<dimitern> mgz: however ... http://paste.ubuntu.com/23358843/
<dimitern> mgz: you might want to fix that :)
<dimitern> [develop: command not found
<mgz> heh, that's new
<mgz> or perhaps not, as it's not actually causing the run to fail
<dimitern> well now :) for a change I actually feel I accomplished something useful this week \o/
<dimitern> mgz: nope - it's not causing it to fail but might not stop people trying to land stuff via github-merge-juju directly in staging
<dimitern> (AIUI)
<mgz> hm, that's the code block that's aimed at stopping people landing directly on master
<mgz> and it's borked
<mgz> so...
<dimitern> :D
 * dimitern steps out for ~1h
<mgz> okay, fixed, people can no longer land directly on staging :)
<mup> Bug #1635622 opened: 'juju ssh <unit> ...' fails with Permission denied (publickey), for only one or two machines in a deployment <juju-core:New> <https://launchpad.net/bugs/1635622>
<anastasiamac> frobware: wallyworld: axw: PTAL https://github.com/juju/juju/pull/6488
<wallyworld> anastasiamac: lgtm
<dimitern> mgz: the windows vm is out of space again I presume
<mgz> ah, the gating job chucked you out on windows tests?
<dimitern> mgz: my fix failed twice on windows so far, with increasingly weird errors :)
<mgz> I'll have a look
<mgz> hm, first one windows tests pass, failed trying to get a trusty instance
<mgz> second windows tests failed in a pretty typical intermittent failure manner
<mgz> third we're into bad local connection weirdness, but nothing obviously saying oom vs other unhappiness
<dimitern> weird..
<mgz> we can try restarting that machine and running again
<dimitern> mgz: +1
<rick_h_> dimitern: ping for standup
<dimitern> omw
<mgz> oh, and there goes google dropping me
<mgz> so, if I'm vanishing from hangout, my internet is just terrible
<mgz> well, I can hear you all on the hangout, but can't get my audio through?
<mgz> and sometimes not text as well it seems
<mgz> anyway, my update on cards, have code reviewed by curtis and good to land, some small fixes to make and some unit test coverage to re-add
<mgz> then I have some more maas setup work to do
<dimitern> mgz: can't hear you :/
<mgz> yeah, net today has been just about good enough for irc and ssh session, but dropping packets everywhere
<dimitern> mgz: now the lxd and the windows vms for the merge job seem to be getting worse
<mgz> sinzui: ^do you have a particular procedure on trying to keep these things working? I remember your mail to nicholas about the issues here.
<sinzui> mgz: I restart the machine when it is irrational. But ssh drops from an AMI instance is another matter. we are not running race tests on the gating job because ssh consistently drops from the *new* juju-core-slave, but not the old one.
<dimitern> chrome0: ping
<chrome0> dimitern : Hola
<dimitern> chrome0: hey! I'm still trying to reproduce bug 1589680.. no luck so far though, it seems lxc-templates has been part of Recommends for lxc since 0.8.0
<mup> Bug #1589680: Upgrading to cloud-archive:mitaka breaks lxc creation <canonical-bootstack> <juju-core:Triaged> <juju-core 1.25:In Progress by dimitern> <https://launchpad.net/bugs/1589680>
<dimitern> chrome0: which happens to be the lxc version in trusty/main still
<dimitern> chrome0: juju doesn't specify package source when installing lxc, so it will get the most preferred, which is 1.0.8 from trusty/updates
<dimitern> chrome0: I bootstrapped trusty with 1.25.6 and added 1 lxc, then installed lxc from trusty-backports (2.0.4) and again did juju add-machine lxc:0 -- no issues or errors I can see
<chrome0> dimitern : Hm, we have trusty-updates on the machines this happened on too, and likely had it then as well
<dimitern> chrome0: is it possible lxc-1.0.3 was originally installed?
<natefinch> gah, we need better documentation on oauth1 vs oath2 and when you use one or the other
<dimitern> chrome0: before the mitaka upgrade? (can't see how though.. except manually)
<natefinch> a lot of our docs just say "oauth" which is Not A Thing™ in juju
<chrome0> dimitern : I can't say for sure anymore, but am reasonably certain that we didn't manually upgrade lxc
 * rick_h_ grabs lunchables
<rick_h_> katco: pinkg
<rick_h_> ping tha tis
<rick_h_> bah, /me blames tools for typing issues *bad keyboard, bad!*
<katco> rick_h_: lol hey
<rick_h_> katco: got a sec to chat real quick on dev workflowy bits?
<katco> rick_h_: sure
<rick_h_> meet you in ?core please
<natefinch> man I hate that all our configs use maps of name : object, rather than a list of object with a property that is the name.  it makes parsing like 10 times more difficult.
<katco> natefinch: i.e. map[string]interface{} vs. struct?
<natefinch> katco: yeah
<natefinch> well, vs []struct
<katco> natefinch: hm, i don't understand that part. not having the config as one large struct?
<rick_h_> katco: hmm, actually...in this case we don't need master any more.
<natefinch> it makes the name not part of the value, it's the key...  it's also then an extra layer of indirection, an extra layer of indenting in the config
<rick_h_> katco: because any hotfix would be against the support branches, never master
<katco> rick_h_: release-branches are cut from staging?
<rick_h_> katco: yea, so rather than making a new release by merging staging->master and then creating a new support branch from master
<rick_h_> katco: it would just be to create a new release by creating the new support branch
<rick_h_> katco: and skip that master middle man
<katco> rick_h_: i thought the purpose of master was to always be releasable? e.g. staging -> master is our opportunity to run the full CI suite?
 * redir wonders if we'll wind up with git-flow -- with different names
<rick_h_> katco: no, master's job was to be a place to perform hotfixes that only had one PR from the last release
<rick_h_> redir: lol, almost, except git-flow doesn't have the idea of dual test runs in it
<katco> rick_h_: where do we run our full CI suite w/ no master?
<rick_h_> katco: develop->staging
<rick_h_> katco: right now to get from develop->staging you need a bless
<katco> rick_h_: oh, it's PR->develop that is the partial isn't it
<rick_h_> katco: right
<katco> rick_h_: then yes, we don't need master; but i would get rid of staging instead since master is such a common thing with git
<rick_h_> katco: +1, just speaking in current terms so we follow what we're saying
<redir> ok master and develop check
<redir> hotfixes, check
<rick_h_> redir: where do you think I got those from :P
<redir> :)
<redir> captain we need more power
 * redir is missing the dual test runs
<rick_h_> redir: so the one plus is that develop->master is automated based on bless
<rick_h_> redir: on 1.25? or some other way?
<redir> like picture in picture on tv
<redir> never saw a use
<rick_h_> redir: ? /me isn't following
<redir> missing as in doesn't know -- is ignorant
<redir> ignore me you were making progress
<rick_h_> redir: oic, 30min test run vs 3hr test run
<katco> rick_h_: redir: in my head, someday there will be no 30min test run, only a sub-minute one and then CI tests
<redir> one to merge to develop and the other for merging to master
<redir> I thought the 30 minute one was the CI tests
<katco> redir: it's just a subset of our test suite, unfortunately
 * redir is curious where the sub minute test run lives:)
<katco> redir: in the future! when we've converted our suites to actual unit tests :)
<redir> ahhh I wan't in the time machine to the future
<rick_h_> :) in a dream land where you don't actually talk to a db in a unit test
<redir> s/wasn't
<katco> rick_h_: that's not a dream!
<rick_h_> katco: :)
<katco> rick_h_: e.g. the deploy command is now fully ready to be converted to unit tests. completely in-memory
<redir> +1 I think of that as reality and not that as a bad dream
<rick_h_> katco: <3
<katco> rick_h_: and some of the tests have already been converted as examples
<mgz> rick_h_: oh, misc comment, it seems the flag to limit pre-testing to certain users is not actually turned on?
<mgz> so, anyone proposing a branch against develop gets their code run on our setup
<mgz> I can turn it on and see if it works?
<redir> So I came over here to exit because DNS keeps failing. Rebooting everything from the modem in :(
<rick_h_> redir: there's a known big ddos attack going on
<redir> oh
<redir> the iot one?
<mgz> yeah, it's not you redir
<katco> redir: possibly
<redir> OK.
<katco> http://money.cnn.com/2016/10/21/technology/ddos-attack-popular-sites/index.html
<mgz> though confusingly my internet is just terrible as well today
<katco> yeah it's turning out to be a weird day
<redir> I guess irclogs.ubuntu.com is a popular site:|
<rick_h_> https://news.ycombinator.com/item?id=12759697
<rick_h_> redir: heh
<natefinch> it's everybody
<redir> OK not rebooting everything.
<redir> just the router and I'll use some non major DNS
<redir> because it is DNS for me
<redir> names not resolving
<natefinch> yeah, there's a DNS service down and it's screwing up a TON of sites
<redir> oh dyndns
<natefinch> yep
<redir> Saw they were down
<redir> thanks for the intervention
<redir> friends don't let friends reboot
<redir> unnecessarily
<katco> this is rather unprecedented isn't it?
<natefinch> I don't remember anything like this, no.
<katco> i mean i've seen plenty of ddos against a site or two
<natefinch> I think @FiloSottile said it best: Take this intuition. Now ask, "why doesn't my DNS resolver just remember the IPs that worked 1h ago?" NO GOOD REASON
<katco> but this seems like it's affecting a lot of critical sites/infrastructure
<natefinch> And: Any website would take stale IPs over downtime. Nobody relies on fast DNS changes anyway, everything is cacheable (DNS is best case) </rant>
<natefinch> well, hopefully after this, the DNS companies will get their act together and start planning for this sort of thing
<katco> the timing with the kernel privilege escalation is interesting too
<natefinch> lol, this is not the Russ Cox I was looking for: https://aussiecriminals.com.au/high-profile-criminals/russell-mad-dog-cox/
<katco> haha
<natefinch> just realized goimports doesn't support -s and now I have a sad
<redir> I once made a dns mistake (forgot a leading dot) and so the rply was with the wrong IP. That IP led to an http server's default site. I used long TTLs then because it shouldn't change often, so 86400. Google happened to crawl it before the correction propagated. So google returned the wrong results for another 7 days.
<redir> I started using short DNS TTLs after that
<natefinch> haha yeah
<redir> I'll give you one guess whose "business card" site was the default site.
<redir> There was a lot of telephone ringing that week
<redir> not the good kind
<katco> ohhhh shit... the fan in my main computer sounds like its bearing just went out
<katco> god wtf friday
<katco> sigh brb
<katco> (hopefully)
<redir> luck
<natefinch> gah, turning jsonschema into a generic UX is really really hard
<natefinch> er generic interactive UX
<rick_h_> natefinch: let me know if there are any particular pain points you want to brainstorm on
<natefinch> I think it's ok, there's just a huge matrix of interactions that exist.... for now I'm treading a narrow path of supporting only what I know we need to support... but it means there's a lot of edge cases that won't work if we want to use this as a generic "throw anything at it and it'll work" library (which was basically the whole point of writing it in the first place)
<rick_h_> natefinch: that's ok though. one stone at a time.
<rick_h_> as long as we stick to the types we need, but focus on supporting the type vs our need, we'll be in good shape imo
<natefinch> yep.  Basically, I wrote the schema for openstack, now I'm writing the UI code to handle that particular schema, while doing my best to add checks to properly error out if someone deviates from the supported schemas, instead of just doing the wrong thing.
<natefinch> openstack just adds three new ideas - arrays, an enum of values, and an object that is a map of name to object (regions).
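The "narrow path" natefinch describes — supporting only the schema shapes the openstack provider needs, and erroring loudly on anything else — can be sketched like this. The `schema` struct and `promptKind` are hypothetical simplifications, not the real jsonschema library's types:

```go
package main

import "fmt"

// schema is a drastically simplified stand-in for a JSON-schema node;
// the real library's types are much richer.
type schema struct {
	Type                 string            // "string", "array", "object"
	Enum                 []string          // non-empty: restrict input to these values
	Items                *schema           // element schema for arrays
	Properties           map[string]schema // fixed keys for objects
	AdditionalProperties *schema           // map of name to object, e.g. regions
}

// promptKind picks which interactive prompt a node needs. It errors on
// shapes outside the narrow supported path, instead of silently doing
// the wrong thing.
func promptKind(s schema) (string, error) {
	switch {
	case len(s.Enum) > 0:
		return "choose one of enum", nil
	case s.Type == "string":
		return "free-form line", nil
	case s.Type == "array" && s.Items != nil:
		return "repeat prompt for each element", nil
	case s.Type == "object" && s.AdditionalProperties != nil:
		return "ask name, then recurse into value", nil
	case s.Type == "object":
		return "walk fixed properties in order", nil
	default:
		return "", fmt.Errorf("unsupported schema shape: %q", s.Type)
	}
}

func main() {
	// The regions case: an object mapping arbitrary names to objects.
	regions := schema{Type: "object", AdditionalProperties: &schema{Type: "object"}}
	k, _ := promptKind(regions)
	fmt.Println(k) // ask name, then recurse into value
}
```

The design choice is the one described above: an explicit error for unhandled shapes keeps the library honest about its coverage until it can grow into a generic "throw anything at it" tool.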
<perrito666> oh, this is where we complain?
<natefinch> haha
<natefinch> where else? :)
<perrito666> I have been fixing tests for 2 days that basically require me to reverse engineer what a provider is doing in terms of api calls
<natefinch> the rest of the internet is broken ;)
<perrito666> speak for yourself, internet works here, its just crazy slow
<perrito666> ghaaa, kidding me? I added error.Traces to the code and the tests break? that was nasty
#juju-dev 2016-10-22
<redir> tumbleweeds
#juju-dev 2016-10-23
<mup> Bug #1635622 changed: 'juju ssh <unit> ...' fails with Permission denied (publickey), for only one or two machines in a deployment <juju:Triaged> <https://launchpad.net/bugs/1635622>
#juju-dev 2017-10-16
<axw> burton-aus: skip standup? just me and you today
<axw> burton-aus: I didn't get to your doc yet, but will take a look today
<burton-aus> Logging in now
<burton-aus> Yep, skipped it, Chrome is extremely slow on my machine
<burton-aus> axw let me just send it to you
<axw> burton-aus: you already did
<axw> burton-aus: I just haven't had a chance to review it yet
<axw> I'm talking about the CMR test plan
<axw> I realise I wasn't very specific... :p
<burton-aus> axw yes, the CMR plan.
<axw> burton-aus: what you've got in the test plan sounds good. I think we'll need to expand, but it's a good start, and captures the core requirements
<axw> burton-aus: do any of the tests in CI do chaos-y sorts of things? I found an issue last week where breaking a remote controller would cause major issues on the local controller. would be good to have some of that automated too
<burton-aus> axw not to my knowledge, no chaos-y kind of tests.
<burton-aus> axw would you send out a detailed story about it?
<mup> Bug #1654136 opened: Removing relationship didn't remove subordinate unit <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1654136>
#juju-dev 2017-10-17
<burton-aus> hml CI check passed, but may need someone to review and verify it https://github.com/juju/juju/pull/7940
<hml> burton-aus: ty
<burton-aus> babbageclunk and axw you may want to review the PR or verify it https://github.com/juju/juju/pull/7940
 * axw is looking
<babbageclunk> If it's a trivial backport (ie it didn't require changes to apply to the other branch) we typically don't require a review of the backport PR.
 * babbageclunk goes for a run
<burton-aus> hml you can issue the merge comment now, axw has approved the change.
<hml> burton-aus: done
<burton-aus> hml yep
<burton-aus> hml babbageclunk test failed http://ci.jujucharms.com/job/github-merge-juju/356/console, nothing is trivial.
<burton-aus> hml babbageclunk weird - [xenial] -bash: line 59: go: command not found
<babbageclunk> burton-aus: yeah, that seems pretty weird
<burton-aus> hml babbageclunk it was run on juju-core-slave-b, so I went there and got:
<burton-aus> hml babbageclunk which go
<burton-aus> hml babbageclunk /snap/bin/go
<burton-aus> hml babbageclunk not sure if I should create a soft link under /usr/bin
<burton-aus> hml babbageclunk let me re-run the job first.
<babbageclunk> burton-aus: did you work out what the problem with the test run was?
<burton-aus> hml babbageclunk email sent
<babbageclunk> burton-aus: if `which go` works for you, why can't the test run script run go?
<burton-aus> hml babbageclunk the test runs on an instance created on the fly, not the machine where I ran which go.
<burton-aus> hml babbageclunk overall I have no knowledge of the recent go version change, just debugging on the spot.
<babbageclunk> burton-aus: ah, right
<burton-aus> hml babbageclunk we may be able to merge it directly, to let CI handle the test.
<thumper> o/ redir
<redir> \o thumper
<redir> How was NYC?
<thumper> redir: pretty good
<redir> sounds highly productive:)
<thumper> redir: actually it was very good, good things happening
<redir> That is awesome to hear
<redir> \o/
#juju-dev 2017-10-18
<babbageclunk> wallyworld: should we fall back to released anytime we'd use devel - even if the user has specified devel explicitly?
<wallyworld> babbageclunk: yeah, IMO. develop just indicates the level of risk the user wants.
<babbageclunk> ok
<wallyworld> we need to behave like snaps
<babbageclunk> wallyworld: I'm not sure how snaps work in this case
<wallyworld> babbageclunk: you need to read up on them then :-)
<babbageclunk> wallyworld: trying - any pointers? The closest I can find is that you can publish to multiple channels when publishing - but that's not the case here is it?
<wallyworld> no, this isn't about publishing but the user specifying the level of risk they are willing to accept
<wallyworld> i can try and find some snappy docs
<babbageclunk> Right, but it seems like you could publish to beta without publishing to edge, and then someone who was on edge wouldn't see that version?
<babbageclunk> I'm reading the docs on snapcraft.io
<wallyworld> babbageclunk: they would see a higher version in beta
<wallyworld> if published there
<babbageclunk> whoa, there are channel branches?
<wallyworld> babbageclunk: you always get the highest version - the channel determines how far down you look to see what's available
<wallyworld> yeah
<wallyworld> you can create a short lived snap (30 days default) from a branch
<wallyworld> a feature branch
<wallyworld> if you want longer you need to pay
<babbageclunk> https://forum.snapcraft.io/t/channel-terminology-and-policy/551
<babbageclunk> Ok, I think I get the fallback between risk levels (modulo the talk about closing which I guess doesn't matter here).
<wallyworld> babbageclunk: correct, closing is not relevant here
<babbageclunk> wallyworld: what about the other stream levels (proposed, testing)
<wallyworld> ignore testing. we don't really use proposed in practice i don't think, but we could, so probably should support it
<babbageclunk> I think it makes sense for all of them to fallback to released, right?
<wallyworld> it's almost like testing < develop < proposed < released
<babbageclunk> But if they all had released after them that would be right? Or should I only have the fallback for develop?
<babbageclunk> wallyworld: ^
<wallyworld> babbageclunk: sorry, on call, so distracted
<wallyworld> the above order would be analogous to snap channels
<babbageclunk> wallyworld: ok thanks!
<wallyworld> babbageclunk: sorry, out of meeting now if you still had questions
<babbageclunk> wallyworld: no, I think I understand - I've made everything follow that fallback, just going through the places that call PreferredStream now to make them do the fallback if they need to. Thanks!
<wallyworld> babbageclunk: ok, sgtm
<axw> wallyworld: still at the guessing phase, but I have a suspicion that it's nothing to do with skews now. the logs say that the leadership worker is failing with "not found" and "state changing too quickly; try again soon"
<axw> wallyworld: from looking at the code, that doesn't make sense... unless there's 2 leadership workers
<wallyworld> interesting
<axw> wallyworld: we run a leadership worker *per state*
<axw> wallyworld: so one might be expiring stuff while the other is extending
<wallyworld> oh dear
<axw> wallyworld: I think the best thing to do would be to make it so we run one per model
<wallyworld> axw: that sounds logical, one per state sounds crazy
<wallyworld> axw: although i think there's an aspect where maybe there's a controller level doc, rather than per model
<wallyworld> the clock#singular-controller lease doc
<axw> wallyworld: I'm just talking about the applications one atm
<wallyworld> ah ok
<wallyworld> in that case sounds good to do
<axw> wallyworld: I think we'd want a per-machine one for singular
<axw> but I'll leave that for now
<wallyworld> per controller i think?
<axw> wallyworld: each controller machine needs to run the worker for failover, that's how the singular workers are implemented
<axw> they'll each try to claim, but only one will succeed
 * axw bbs
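The singular-worker pattern axw describes — every controller machine tries to claim the lease, but only one succeeds — boils down to a compare-and-set on the lease holder. A minimal sketch, with an in-memory `leaseStore` standing in for the mongo-backed lease collection (all names here are invented):

```go
package main

import (
	"fmt"
	"sync"
)

// leaseStore hands out a named lease to at most one holder at a time.
type leaseStore struct {
	mu      sync.Mutex
	holders map[string]string
}

// Claim succeeds only if the lease is unheld or already held by the
// same claimant. Every controller machine calls this; exactly one wins
// and runs the singular workers, the rest stand by for failover.
func (s *leaseStore) Claim(lease, claimant string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if h, ok := s.holders[lease]; ok && h != claimant {
		return false
	}
	s.holders[lease] = claimant
	return true
}

func main() {
	s := &leaseStore{holders: map[string]string{}}
	for _, m := range []string{"machine-0", "machine-1", "machine-2"} {
		fmt.Println(m, "claims singular-controller:", s.Claim("singular-controller", m))
	}
}
```

Failover falls out of the same loop: when the holder stops extending its lease and it expires, the next machine's claim attempt succeeds.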
<thumper> wallyworld: https://github.com/juju/names/pull/84
<wallyworld> thumper: looking
<wallyworld> thumper: lgtm
<axw> wallyworld: I was wrong, when we expire the lease, the other manager will refresh its cache when the txn fails. so :/
<axw> still shit, but probably not shit enough to continue trying to untangle everything
<axw> not sure why it's getting "state changing too quickly" errors though... that refresh should fix it
<redir> curl https://s3.amazonaws.com//aws-cloudwatch/downloads/latest/awslogs-agent-setup.py -O
<redir> chmod +x ./awslogs-agent-setup.py
<redir> jmeta-u./awslogs-agent-setup.py -n -r us-east-1 -c awslogs.confw
<redir> w
<redir> whoops
<redir> somehow I found the shortcut to tell terminator to broadcast to all terminal windows
<redir> :)
<redir> glad it wasn't a secret
<thumper> wallyworld: I think we should skip the release call again
<thumper> we know what we are doing
<wallyworld> ok
<thumper> wallyworld: https://github.com/juju/juju/pull/7943
<wallyworld> ok
<thumper> wallyworld: I'm confused, got a minute?
<wallyworld> sure
<thumper> 1:1 HO
<wallyworld> thumper: the hamster died, but we had finished
<thumper> yeah
<babbageclunk> wallyworld: I've just clicked that environs.Tools.Tools already does fallback between custom source and public source.
<babbageclunk> wallyworld: implementation wise it's simpler for me if the fallback goes:
<babbageclunk> custom/devel, public/devel, custom/released, public/released. Is that crazy or alright?
<wallyworld> babbageclunk: i'd have to re-read the code - i didn't think custom came into it for agent binaries, just image metadata
<wallyworld> for agent binaries, i thought we just had agent-stream to tweak things
<babbageclunk> sorry, thinko'd the function name: environs/tools.FindTools
<babbageclunk> It uses GetMetadataSources which adds one for agent-metadata-url and one for the default public source.
<wallyworld> we may use custom if agent binaries are sourced from metadata supplied at bootstrap
<wallyworld> but then it becomes custom > testing > develop > proposed > released
<babbageclunk> No, custom has streams in it.
<babbageclunk> So custom/public (and also cloud sources) are orthogonal to stream.
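Since sources (custom, public) are orthogonal to streams, the order babbageclunk proposes — custom/devel, public/devel, custom/released, public/released — is just a stream-major cross product. A hypothetical sketch (`searchOrder` is an invented name, not `environs/tools.FindTools` itself):

```go
package main

import "fmt"

// searchOrder crosses streams with metadata sources, stream-major:
// every source is tried at the requested risk level before falling
// back to the next stream, so a newer devel agent in the public source
// still beats an older released agent in the custom source.
func searchOrder(streams, sources []string) []string {
	var order []string
	for _, st := range streams {
		for _, src := range sources {
			order = append(order, src+"/"+st)
		}
	}
	return order
}

func main() {
	fmt.Println(searchOrder(
		[]string{"devel", "released"},
		[]string{"custom", "public"},
	))
}
```

The alternative, source-major order (custom/devel, custom/released, public/devel, public/released) would instead let a stale custom mirror shadow the public stream, which is presumably why the stream-major order felt simpler here.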
<hml> babbageclunk: joining us?  we miss you
<hml> :-)
#juju-dev 2017-10-19
<axw> wallyworld: did I miss standup? didn't realise we were starting today
<wallyworld> axw: yeah, sorry, i thought we had agreed. not anything new to share
<axw> wallyworld: ok
<axw> wallyworld: I'm back to thinking it's clock jumping now. I'll see what we can do, but I don't think we'll be getting anything in for 2.2 this week
<wallyworld> ok, what will be will be
<axw> wallyworld: once I've ruled out a short term change, I'll be looking into getting rid of using absolute times
<axw> wallyworld: I had some ideas on this when william was doing it, but never delved too deeply
<wallyworld> i've never dug too far into it myself
 * babbageclunk goes for a run
<thumper> wallyworld: http://paste.ubuntu.com/25769769/ so far
<thumper> almost there
<thumper> two failing tests...
<wallyworld> looking
<babbageclunk> wallyworld: ping?
<wallyworld> babbageclunk: sorry, was out for a bit, back now
<babbageclunk> wallyworld: hey hey
<babbageclunk> have you got time for a quick hangout?
<babbageclunk> in 1:1?
<wallyworld> sure
 * thumper disconnects to check that the tests aren't reaching out
<balloons> so I see https://launchpad.net/juju/+milestone/2.2.5 has 7 bugs on it, all by jam. We should just call 2.2.5 the juju jam edition
<hml> :-)
<thumper> babbageclunk, hml: I'm heading to my daughter's school for a meeting with a teacher, so won't make the team standup, can you make sure someone runs it please?
<thumper> thanks
<hml> thumper: sure
<babbageclunk> Is wallyworld out too?
<hml> heâs around
<babbageclunk> Just not here or on Canonical IRC? maybe he's on a call..
<babbageclunk> -.
<hml> he must have popped off then??
<babbageclunk> you think he's bought a farm?
<hml> more like a football league
#juju-dev 2017-10-20
<wallyworld> axw: you free to meet now?
<axw> wallyworld: okey dokey
<wallyworld> axw: give me a minute to reboot to get sound, see you in HO
<axw> sure
<soumaya_> Hi All.
<soumaya_> I need some information related to python-juju api
<soumaya_> Is it possible to bootstrap a controller using the python juju api?
<soumaya_> Actually I have to bootstrap a juju controller from an alpine-based docker container, and I have not found any juju client for alpine
<rick_h> soumaya_: no, until the controller is bootstrapped there's no API to communicate with. I'm not sure if the snaps would work on alpine but you can try to just grab the juju binary and see if it works for you at all.
<soumaya_> Thanks Richard ..
#juju-dev 2017-10-21
<t30aka> @here
#juju-dev 2017-10-22
<saadi> unable to remove an application that has an error status using remove-application or remove-unit. Finally I deleted the container, and juju status is still showing that application with an error. how do I remove it?
