#juju-dev 2012-06-25
<fwereade> heya TheMue
<TheMue> fwereade: Hi
<TheMue> fwereade: Phew, it's raining cats and dogs here.
<fwereade> TheMue, it's pretty hot here
<fwereade> TheMue, we're just approaching the too-damn-hot point of the year
<fwereade> TheMue, give me a few days and I'll be begging for a decent rainstorm :)
 * TheMue dcc's fwereade some rain.
<fwereade> TheMue, :)
<fwereade> TheMue, so how's it going? I hope the format 2 stuff isn't too much of a hassle -- I feel like I maybe should have done it myself, but I got caught up in the relations and I felt it was getting pretty important
<TheMue> fwereade: I'll do firewall first, then format 2.
<fwereade> TheMue, ah, excellent
<fwereade> TheMue, are we going with the security groups style or are we doing it properly this time?
<TheMue> fwereade: I'm talking about today's firewall.py in state. security and auth will be handled later.
<fwereade> TheMue, ah cool
<TheMue> fwereade: firewall is used by the PA.
<TheMue> fwereade: So I'm moving it out of state to cmd.
 * fwereade suddenly gets suspicious
 * fwereade goes to read code a mo
<fwereade> TheMue, doesn't implementing that presuppose the security groups approach?
<TheMue> fwereade: From what I've seen so far, no.
<TheMue> fwereade: But I've just started.
<fwereade> TheMue, it seems to me that if the PA is going to use it, then we're assuming that the PA will remain responsible for opening/closing ports
<fwereade> TheMue, a proper solution using firewalls on the units surely shouldn't involve the PA at all?
<TheMue> fwereade: Sorry, don't know.
<fwereade> TheMue, blast, wish niemeyer was on
<TheMue> fwereade: So what would your solution look like?
<fwereade> TheMue, unit agent messing with iptables, rather than PA messing with the provider
<fwereade> TheMue, we've certainly talked about our use of security groups being a serious problem, and about the need for a cross-provider firewall solution
<TheMue> fwereade: Pls go on ...
<fwereade> TheMue, but it would not necessarily be *irrational* for us to go with the tried, tested, known-working-at-small-scale solution (given the time constraints that are starting to wear at me slightly)
<fwereade> TheMue, the problems with security groups are (1) aws is really not designed to handle what we're doing with them and (2) the solution only works for aws
<fwereade> TheMue, (2) is not important wrt our critical short-term goals
<fwereade> TheMue, but disregarding (1) feels like the sort of decision that we should get some sort of consensus on before writing code that presupposes it
<fwereade> TheMue, s/presupposes it/presupposes that approach/
<TheMue> fwereade: Those are valid worries, OK, but what would a proper solution look like?
<fwereade> TheMue, I'm afraid I don't have a clear idea of the *precise* problem with our use of security groups... just that we're not meant to use any, and an apocryphal amazon engineer was said to look somewhat horrified by the prospect :)
<fwereade> TheMue, I think it comes down to the *unit* agents watching the ports that should be open in their container and taking charge of it themselves
<fwereade> TheMue, we'd still need *some* security groups, but probably just 2: one for PA machines and one for everything else
<fwereade> TheMue, make sense?
<TheMue> fwereade: Yep, so far understandable.
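The per-unit approach fwereade sketches above boils down to the unit agent issuing iptables rules inside its own container. A minimal sketch of what that might look like (hypothetical helper names; rule placement and chain choice are assumptions, not anything juju shipped at the time):

```go
package main

import "fmt"

// openPortCmd and closePortCmd build the iptables invocations a unit
// agent could run to open or close a port in its own container.
// Purely illustrative: the INPUT chain and -I/-D placement are assumed.
func openPortCmd(proto string, port int) string {
	return fmt.Sprintf("iptables -I INPUT -p %s --dport %d -j ACCEPT", proto, port)
}

func closePortCmd(proto string, port int) string {
	return fmt.Sprintf("iptables -D INPUT -p %s --dport %d -j ACCEPT", proto, port)
}

func main() {
	fmt.Println(openPortCmd("tcp", 80))
	fmt.Println(closePortCmd("tcp", 80))
}
```

Under this scheme the provider-level security groups shrink to the two coarse groups mentioned above, and per-port policy lives entirely on the machine.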
<TheMue> rogpeppe: Hey, you are not here. ;)
<rogpeppe> TheMue: that's right. i'm an invisible ghost.
<rogpeppe> TheMue: i've been given special dispensation :-)
<TheMue> rogpeppe: Ah, ok, then it's ok.
<fwereade> TheMue, that's pretty much it...
<fwereade> rogpeppe, heyhey
<rogpeppe> fwereade: yo!
<fwereade> rogpeppe, are you aware of any official preference as to how we implement firewalling this time round?
<rogpeppe> i seem to remember we've got a meeting scheduled in 13 minutes, so i thought i'd try and turn up for it...
<rogpeppe> (maybe i've got it wrong though!)
<fwereade> rogpeppe, btw, finished To Hold Infinity, very enjoyable
<rogpeppe> fwereade: cool, glad you enjoyed it. am enjoying wwz, in a slightly grim kinda way
<rogpeppe> hmm, firewalling
<rogpeppe> until we containerise everything, i think the current approach is probably the only one
<fwereade> rogpeppe, enjoying Axiomatic too, fun to have a more thinky, less experiencey read once in a while
<fwereade> rogpeppe, ah, expand please? I don't see the issue
<fwereade> rogpeppe, after all everything is containerised already... in a sense... which feels like the appropriate sense for this context
<rogpeppe> fwereade: how do we firewall without making use of ec2's facilities?
<fwereade> rogpeppe, iptables?
<rogpeppe> fwereade: can't anything get around that?
<fwereade> rogpeppe, I have always presumed that it works as advertised, but I can't point to anything proving that
<fwereade> rogpeppe, and I'm not saying we don't use security groups at all -- we have to -- but we know that using one per machine is a problem
<fwereade> rogpeppe, I just don't know whether it's the sort of problem we want to fix now, or the sort of problem we leave for 13.04
<rogpeppe> fwereade: am i right about the meeting, BTW?
<fwereade> rogpeppe, er, I have no idea... I had a vague feeling it was weds, but maybe I missed another change
<fwereade> rogpeppe, but davecheney is on, and that may lend support to your theory ;p
<rogpeppe> dammit, it's an hour later
<rogpeppe> bugger, my dispensation is invalid
<rogpeppe> fwereade: iptables are manipulatable by root, and the charms run as root.
<rogpeppe> fwereade: we need to talk to niemeyer about this
<rogpeppe> fwereade, TheMue: well, gotta go. will miss the meeting, i think. have fun, and post any interesting/relevant conversations to juju-dev, where i will see 'em and sneakily read 'em...
<fwereade> rogpeppe, yeah, indeed -- I'm not even sure I have a strong position on this, I just feel it's something we should get niemeyer's input on before we implement code that supposes either way
<fwereade> rogpeppe, enjoy the holiday :)
<TheMue> rogpeppe: OK, have fun.
<fwereade> TheMue, I think that either way you can certainly implement something that keeps an eye on both sets of conditions, and emits events when ports should actually open or close
<rogpeppe> fwereade: we could cache groups, because we're unlikely to have too many configurations of ports.
<rogpeppe> fwereade: which might mitigate the issue
<TheMue> fwereade: That's what firewall does today.
<rogpeppe> TheMue: ah, it must've changed since i last looked
<rogpeppe> TheMue: i thought there was one group for each machine
<rogpeppe> anyway, gotta go
<TheMue> rogpeppe: The firewall.py doesn't do very much. It's only used by the PA.
<fwereade> TheMue, where does it do that?
<fwereade> TheMue, I don't see anything that shares groups in there
<TheMue> fwereade: I didn't say anything about groups. I meant watching the ports.
<fwereade> TheMue, if anything does that, it's in the individual provider's open_port/close_port methods
<fwereade> TheMue, ah got you
<fwereade> TheMue, all I'd suggest then is to make sure that the thing that watches an individual machine remains distinct from the thing that watches all machines
<fwereade> TheMue, do I appear to be approximately sane there?
<TheMue> fwereade: I'll keep it in mind. I'm not yet deep enough in it. Just started the porting and as a prerequisite the watcher for the exposed flag.
<fwereade> TheMue, cool
 * fwereade starts to wonder whether he's right about it being up to the UA... maybe the MA would be better...
<TheMue> fwereade: You've got more insight than me. I sometimes miss an architecture diagram where the components, their responsibilities and roles, and how they communicate are visible.
 * TheMue is a very visual being.
<fwereade> TheMue, I think the issue there is that the responsibilities in python are not necessarily as they should be
<fwereade> TheMue, e.g. the MA being responsible for the first download of the charm, and the UA being responsible for subsequent ones
<TheMue> fwereade: OK, then two diagrams: today's implementation and the wanted implementation
<fwereade> TheMue, the first one is of limited value and the second one is subject to change as we figure out *how* we should be doing things...
<fwereade> *should*
<fwereade> TheMue, hopefully without succumbing to second-system effect
<TheMue> fwereade: That's a problem of working remotely. I've used whiteboards a lot to discuss how something is and how it should change.
<TheMue> fwereade: My intention is not a first-class diagram.
<Aram> moin.
<fwereade> Aram, heyhey
<TheMue> Aram: Moin
<Aram> fwereade: TheMue: had a little bit of fun yestarday: http://play.golang.org/p/D-qPq8uIw3
<fwereade> Aram, haha, nice
<TheMue> Aram: *lol*
<TheMue> Hmm, seems it's time for a topology watcher.
<TheMue> fwereade: Any experiences with the size of topologies in large installations?
<fwereade> TheMue, all I know is that yaml was too big for the 2k deployment, json makes it small enough for that with room to spare
<TheMue> fwereade: I'm asking because topology watchers keep an old one in memory and pass both it and the new one to the callbacks/watcher users.
<fwereade> TheMue, IIRC max ZK node size is 1MB, so order of that, I guess
<TheMue> fwereade: I would store it already parsed, so there should be no whitespace problem.
<fwereade> TheMue, it shouldn't be an overwhelming load though
<TheMue> fwereade: ok
<fwereade> TheMue, however you may want to look at recent topology watchers in go, which don't keep a whole topology around
<fwereade> TheMue, they just keep the bits they're interested in
<TheMue> fwereade: Which ones are you talking about? Most I've seen so far watch simple nodes.
<fwereade> TheMue, MachinesWatcher and MachineUnitsWatcher
<TheMue> fwereade: Also, the change event always forces me to read at least one complete node.
<fwereade> TheMue, also ServiceRelationsWatcher, new in review today
<fwereade> TheMue, yeah, you always read the whole new topology
<fwereade> TheMue, no reason to keep unit info around when all you care about is relations for one service
<TheMue> fwereade: Thx, will take a look. I need it for the ServiceUnitsWatcher.
<fwereade> TheMue, cool
<fwereade> TheMue, a suggestion, don't know if it applies:
 * TheMue listens
<fwereade> TheMue, when doing the ServiceRelationsWatcher, it was very convenient to add (*Service)relationsFromTopology(t *topology) and use it both in Relations and the watcher
<fwereade> TheMue, haven't looked at MW or MUW to see whether they'd benefit from similar
<TheMue> fwereade: OK, will look, it sounds good.
<fwereade> TheMue, it may be that the code to extract the stuff we care about is small enough not to bother in those cases and maybe in yours
<TheMue> fwereade: Huh, the last sentence is difficult for me to understand.
<fwereade> TheMue, sorry
<fwereade> TheMue, I'm saying that getting a []*Relation from a service and a topology is enough work to make it worth factoring out
<fwereade> TheMue, but getting a []*Unit from a service and a topology may be trivial enough that it's better to duplicate the code
<fwereade> TheMue, similar may apply to MW and MUW
<TheMue> fwereade: OK, understand, I will see how much it is.
<niemeyer> Hellos!
<TheMue> niemeyer: Hello to the far west.
<niemeyer> TheMue: Hi :)
<niemeyer> TheMue: How's been the weekend?
<TheMue> niemeyer: Fine, a bit of support for my brother-in-law, who is building a house, and then sitting on the couch on Sunday while it rained cats and dogs.
<TheMue> niemeyer: And your travel to SFO?
<niemeyer> TheMue: Hah :)
<niemeyer> TheMue: The trip was quite fine
<niemeyer> Hmm.. so it seems that Go's behavior on redirections has changed somehow.. lpad seems broken :(
 * niemeyer investigates
<fwereade> niemeyer, heyhey
<fwereade> niemeyer, TheMue: please confirm that it is not safe to select on a send to a channel that might be closed
<niemeyer> fwereade: It is actually safe
<fwereade> niemeyer, really? oh, cool
<niemeyer> fwereade: It depends a bit on what you mean by that, though
<niemeyer> fwereade: Oh, wait.. *send*.. hmm
<fwereade> niemeyer, select {dodgy <- event: blah; <-t.Dying():}
<niemeyer> fwereade: No, that's not ok, sorry for the misinfo
<fwereade> niemeyer, no worries :)
<niemeyer> fwereade: It's considered a bad practice (hence why it blows up) because it's a clear statement that the life time of the channel is messed up.
<fwereade> niemeyer, that was what I thought
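niemeyer's correction is right: a send case in a select panics if the channel has been closed, exactly like a plain send. A small self-contained demonstration (the `dying` channel stands in for the tomb's Dying channel from the snippet above):

```go
package main

import "fmt"

// sendOrDie performs a select that tries to send on ch, with a second
// case standing in for a tomb's Dying channel. If ch has been closed,
// the send case panics ("send on closed channel") even inside a
// select; sendOrDie reports whether that panic happened.
func sendOrDie(ch chan int, dying chan struct{}) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	select {
	case ch <- 1:
	case <-dying:
	}
	return false
}

func main() {
	ch := make(chan int, 1)
	dying := make(chan struct{})
	fmt.Println(sendOrDie(ch, dying)) // open channel: send succeeds → false
	close(ch)
	fmt.Println(sendOrDie(ch, dying)) // closed channel: send panics → true
}
```

Hence the advice that follows: if the channel's lifetime is murky, leave it open and let it be garbage collected rather than closing it.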
<Aram> hi niemeyer, how's SF?
<fwereade> niemeyer, and I'm pretty sure I'm in a situation where I can just leave the channel alone without ever closing it anyway :)
<niemeyer> Aram: Pretty nice, sunny.. had a good time with Andrew yesterday as well
<Aram> niemeyer: nice.
<niemeyer> fwereade: That's a possible answer
<Aram> niemeyer: did you see my silly paste entry? http://play.golang.org/p/D-qPq8uIw3
<niemeyer> Aram: Yeah, that was awesome :)
<niemeyer> robbiew: ping
<Aram> niemeyer: I could have made it an actual animated PNG, but animated PNGs don't work in webkit browsers yet.
<robbiew> niemeyer: pong
<niemeyer> robbiew: Heya
<niemeyer> robbiew: Do we have a meeting today?
<robbiew> niemeyer: heh...as usual, I have no idea...checking
<niemeyer> Aram: Surprisingly short
<niemeyer> robbiew: Cool.. I better find out a good way to call out of the hotel if so
<robbiew> niemeyer: no meeting
<niemeyer> robbiew: Super, thanks for checking
<fwereade> gn all
<fwereade> niemeyer, btw, I have to go again in a sec, but I meant to ask:
<fwereade> niemeyer, are we planning to replicate the security-group firewalling for 12.10?
<niemeyer> fwereade: Yeah
<niemeyer> fwereade: Should be easy, and gets us parity
<niemeyer> fwereade: We can then fix it another way later
<niemeyer> fwereade: But,
<niemeyer> fwereade: We should try to make the implementation sensible, so that we can reuse bits
<fwereade> niemeyer, yep, I approve (despite emotionally wanting to Do It Right ;))
<niemeyer> fwereade: I've been talking to Frank about that
<niemeyer> fwereade: He's working on the firewall port watcher stuff
<fwereade> niemeyer, excellent, I realised I didn't know what plan we were following when he mentioned it this morning
<niemeyer> fwereade: That we have under state/firewall.py in Python
<niemeyer> fwereade: But with some twists.. the Python version assumes it knows about a provider and what not
<niemeyer> fwereade: The Go version will be a normal watcher
<fwereade> niemeyer, yeah, I presume we'll just be outputting changes
<fwereade> niemeyer, perfect
<niemeyer> fwereade: Exactly
<fwereade> niemeyer, I would guess two levels of watchers so we can reuse the inner one when it becomes the MA (UA???)'s responsibility?
<niemeyer> fwereade: Yeah, we actually already have one in the unit
<niemeyer> fwereade: So this is adding the second one, on Machine
<fwereade> niemeyer, ah, nice
<niemeyer> fwereade: WatchPorts
<niemeyer> fwereade: I think we'll use the exact same thing when we move
<niemeyer> fwereade: The difference is that the machine agent will call Machine.WatchPorts, rather than the provisioning
<fwereade> niemeyer, perfect :)
<robbiew> mramm: looking for me?
<Aram> niemeyer: something intriguing is happening... compare this: http://bazaar.launchpad.net/~gophers/juju-core/trunk/view/head:/mstate/state.go#L56 with this: https://codereview.appspot.com/6304099/diff2/9002:18002/mstate/state.go
<Aram> the machine function
<Aram> is different :)
<Aram> how can this be?
<Aram> the AllMachines function is the same though, and both have been altered in the same commit.
<niemeyer> Aram: Why should they be the same, just so I get the context?
<Aram> niemeyer: because I submitted what's on codereview, and what's in launchpad seems an earlier version.
<niemeyer> Aram: Ah, it's actually not
<niemeyer> Aram: https://codereview.appspot.com/6330045/
<Aram> interesting.
<Aram> why the removal of that branch?
<niemeyer> Aram: The new error will look like "can't get machine 42: not found", which is fine
<niemeyer> Aram: I had to touch that logic due to the NotFound renaming
<niemeyer> Aram: (ErrNotFound now)
<Aram> yes, yes.
<niemeyer> Aram: But rather than replacing it, I just dropped and allowed the underlying error to go through as per the message above
<Aram> well yes, that was my initial version as well.
<niemeyer> Aram: Not really
<niemeyer> Aram: your initial version was the opposite.. any error would lead to "not found"
<Aram> right.
<Aram> niemeyer: anyway, thanks for clearing the confusion.
<niemeyer> Aram: np, and sorry for the trouble.. I wanted to ask for your review on it too, but at the same time didn't want to leave trunk broken
<Aram> of course
<Aram> niemeyer: first piece of the puzzle: https://codereview.appspot.com/6341050
<niemeyer> Aram: Awesome, thanks!
<Aram> niemeyer: the diff on codereview is always done against lp:juju-core? can't I do it against some other branch I have?
<niemeyer> Aram: You can, with -req
<niemeyer> Aram: It only allows trees rather than graphs, but it works
<Aram> strange, that's what I did, lbox propose -cr -wip -req="lp:~aramh/juju-core/mstate-charm-basic"
<niemeyer> Aram: -req has to be used at propose time
<Aram> but it generated this: https://codereview.appspot.com/6325057 which is wrong because it should only be two lines
<niemeyer> Aram: After the merge proposal is created, it doesn't work anymore
<niemeyer> Aram: (because Launchpad doesn't allow changing it)
<Aram> can I delete a merge proposal and do it again from the same branch?
<niemeyer> Aram: Yeah
<niemeyer> Aram: That works fine
<Aram> ok, thanks
<niemeyer> np
<niemeyer> Okay, lpad works again.. I'll go out for finding some food, and will be back to work on reviews
<Aram> morning davecheney
<davecheney> morning Aram
<davecheney> hows it going ?
<Aram> great
<Aram> niemeyer: I believe three pieces of the puzzle should be in the queue now
<niemeyer> Aram: Super, thanks!
<niemeyer> davecheney: Heya
<davecheney> howdy lads
#juju-dev 2012-06-26
<fwereade> niemeyer, tyvm for reviews, good comments
 * fwereade looks at time, goes to bed
<niemeyer> fwereade: My pleasure, good stuff too.. both seem almost mergeable
<niemeyer> fwereade: Have a good night!
<fwereade> niemeyer, cheers :)
<Aram> good night
<niemeyer> Okay, I'll get some food and head to bed.. looks like we'll have a meeting at 4AM (ouch)
<TheMue> Morning
<fwereade> TheMue, heyhey
<fwereade> TheMue, was just wondering about FlagWatcher... is it ok to assume that watched nodes will never have their content changed?
<TheMue> fwereade: I have to check whether it fires by accident. It is intended just to recognize whether a node is created or removed. It was first the exposed watcher, but niemeyer then recommended naming it flag watcher to be more flexible.
<TheMue> fwereade: Btw, the ServiceUnitsWatcher is for review at https://codereview.appspot.com/6325062. Your work helped a lot.
<fwereade> TheMue, I think it'll fire with every content change, won't it?
<fwereade> TheMue, sweet
<TheMue> fwereade: It uses the content watcher, so yes, it may be.
<TheMue> fwereade: Hmm, even if it is not intended to be used with nodes that have content, it's better to change it. Thx for the hint.
<fwereade> TheMue, cool, np
<TheMue> davecheney: Hi
<davecheney> TheMue: howdyu
<TheMue> davecheney: Howdyu? If it's the abbreviation I assume: fine. ;)
<fwereade> davecheney, heyhey
<TheMue> fwereade: The fix is in: https://codereview.appspot.com/6336058
<fwereade> TheMue, I feel it needs a test :)
<Aram> hi.
<Aram> meeting in one minute?
<davecheney> yy
<TheMue> Aram: Think so.
<Aram> where? :).
<TheMue> fwereade: OK, will add one to be sure.
<davecheney> Aram: g+ ?
<Aram> ok
<Aram> hi niemeyer
<fwereade> Aram, niemeyer: heyhey
<TheMue> davecheney: Yes G+
<davecheney> sup!
<niemeyer> Heya
<niemeyer> Is it party time yet?
<TheMue> niemeyer: Hi
<TheMue> niemeyer: Only waiting for mark to send the invitation.
<niemeyer> Where's Mark?
<TheMue> niemeyer: Not yet seen *sigh*
<niemeyer> Okay, I'll kick it off then
<TheMue> niemeyer: Thx
<davecheney> sigh
<davecheney> signed in with the wrong account
<davecheney> two secs
<niemeyer> davecheney: I can invite the other one too
<niemeyer> davecheney: If you want
<davecheney> nah
<davecheney> just logging in now
<niemeyer> Aram, William: Invite is out
<fwereade> niemeyer, cheers
<mramm> niemeyer: Aram: TheMue: fwereade:   good morning/afternoon/evening all
<fwereade> mramm, heyhey, come join us on g+
<niemeyer> mramm: Heya
<mramm> can somebody throw me the hangout link?
<Aram> laptop overheated
<niemeyer> mramm: Done
<niemeyer> mramm: https://plus.google.com/hangouts/_/02365fc5d5c685631f4c5d7068a7bab6c7e99543
<niemeyer> davecheney: https://codereview.appspot.com/6333056/
<davecheney> niemeyer: yes, i was looking at that one
<niemeyer> davecheney: I haven't looked, so I have no idea about what's there, but that's what rog had to say:
<niemeyer> Jun 22 13:39:56 <rog>   niemeyer: i've got a preliminary CL that starts the provisioning agent, but i haven't done the tests yet.        https://codereview.appspot.com/6333056/
<niemeyer> Jun 22 13:40:10 <rog>   niemeyer: someone might want to take it on while i'm away
<davecheney> i'll make it work
<davecheney> niemeyer: good luck with your presentation
<davecheney> i'm going to a live screening in AU at 2am I think thursday morning
<davecheney> m_3 is in town, i'll bring him along
<niemeyer> davecheney: oh, nice
<niemeyer> davecheney: The one I'll be in is early friday
<davecheney> hmm, is that the 2nd day or the third day
<niemeyer> davecheney: Third
<davecheney> i'll have to catch it on video
<davecheney> i doubt the owner of the cafe the local GDC group have taken over will put up with us for 3 days
<niemeyer> davecheney: hehe :)
<davecheney> niemeyer: good luck, what a coup!
<niemeyer> davecheney: THank you!
<niemeyer> Okay, I need to get some more rest, or they'll kick me out when they see I'm unable to speak at all :)
<niemeyer> See you all later!
<TheMue> niemeyer: Good night
<TheMue> Lunchtime
<TheMue> fwereade: Thx for review.
<fwereade> TheMue, yw, hope it's helpful
<Aram> hm?
<Aram> error: Failed to create merge proposal: Server returned 405 and no body.
<Aram> from lbox
<Aram> whose fault? :).
<fwereade> Aram, probably lp, give it a sec and try again
<Aram> ok
<fwereade> gn all, I have quite the stack of reviews if anyone's of a mind... https://code.launchpad.net/juju-core/+activereviews
<Aram> have a good one
<Aram> meh, lbox works for other branches
<Aram> damn
 * Aram goes to the store to buy some tea.
<niemeyer> Hi all!
<Aram> hi again niemeyer.
<Aram> niemeyer: I have a problem with lbox
<Aram> "error: Failed to create merge proposal: Server returned 405 and no body."
<Aram> I get this error on one branch, but not the others.
<niemeyer> Aram: Hmm
<niemeyer> Aram: Can you please post the "bzr info" for that one branch?
<Aram> sure
<Aram> niemeyer: http://paste.ubuntu.com/1061229/
<niemeyer> Aram: Hmm.. seems alright
<niemeyer> Aram: Can you please try the same operation with -debug?
<Aram> lbox pr bzr?
<Aram> lbox or bzr?
<niemeyer> Aram: lbox propose -debug
<Aram> yes
<Aram> niemeyer: http://paste.ubuntu.com/1061241/
<niemeyer> Aram: I think this is my fault
<niemeyer> Aram: lpad recently broke due to an API change introduced in Go itself
<niemeyer> Aram: Looks like I have done something wrong while working around the issue
<niemeyer> Aram: Looking right now
<Aram> right
<niemeyer> Aram: Can you please confirm the version of lbox you have installed (dpkg -l lbox)
<Aram> niemeyer:
<Aram> white:mstate$ dpkg -l lbox
<Aram> Desired=Unknown/Install/Remove/Purge/Hold
<Aram> | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
<Aram> |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
<Aram> ||/ Name           Version        Description
<Aram> +++-==============-==============-============================================
<Aram> ii  lbox           1.0-51.62.38.1 The Launchpad Toolbox
<niemeyer> Aram: Cheers
<niemeyer> I'm having some trouble trying to figure how this request could possibly be done with a POST
<niemeyer> I suspect I'm about to understand something curious on the http package
<niemeyer> Or about my own logic, of course
<niemeyer> Aram: I know what's wrong.. just adding a test and will have a working version
<Aram> great
<niemeyer> Building..
<Aram> niemeyer: hmm... I see new versions on launchpad, but apt-get update isn't picking them, is there some kind of delay/cache?
<Aram> ah, the amd64 package still needs building
<Aram> 4 minutes
<Aram> or not :)
<niemeyer> Aram: Yeah
<niemeyer> Aram: It's super slow for whatever reason
<niemeyer> Aram: lbox should be good
<niemeyer> Aram: Can you please see if it solves your problem?
<niemeyer> I'm going to grab some food if so
<Aram> yes
<Aram> checking now
<Aram> meh, still don't see the update.
<Aram> I can see it built on launchpad
<Aram> but apt-get doesn't pick it yet
<Aram> okay, updating now
<Aram> testing...
<Aram> niemeyer: works, thanks.
<niemeyer> Aram: Woohay
<niemeyer> Okay
<niemeyer> Lunch has gone by already.. will find something reasonable outside
<Aram> oh man, hope you didn't miss lunch because of fixing lbox :).
<Aram> added two reviews to the queue
<Aram> morning davecheney
<davecheney> morning!
<niemeyer> Aram: Thanks!  Nah, was just excited pushing stuff forward :)
#juju-dev 2012-06-27
<davecheney> niemeyer: thanks for your review
<niemeyer> davecheney: My pleasure
<davecheney> the reason the firewaller watches the environ is I imagined that would be how it would talk to the (unknown) firewalling service
<davecheney> as the Environ in this case represents ec2
<davecheney> niemeyer: in the pythonic version, the provider managed openport/closeport
<niemeyer> davecheney: Duh.. it is obvious indeed, sorry
<davecheney> niemeyer: well, the facility doesn't exist on environ, yet
<niemeyer> davecheney: Which facility?
<davecheney> open port/close port
<davecheney> it's there in goamz
<davecheney> but there is no generic interface in environs
<niemeyer> davecheney: Ah, I see, yep
<niemeyer> davecheney: It kind of sucks that we'll have to redo this soon to be machine-based
<niemeyer> davecheney: But it seems like the right thing to do, rather than get into a trip in unknown territory before we actually get things working at all
<davecheney> niemeyer: right o
<davecheney> niemeyer: i'm going to raise a bug on juju-core to define the open port / close port interface on environ
<davecheney> if that is ok
<niemeyer> davecheney: E.g. if we moved onto the machine agent, we actually have to watch relations too, at the machine level, so that we alter the firewall allowing them to go in and out
<niemeyer> davecheney: Sounds great
<niemeyer> davecheney: Thanks
<davecheney> niemeyer: the reason it's a new type, not an addition to the Provisioner, was a hope that the service could be moved between agents
<davecheney> but i don't know how realistic that will be
<niemeyer> davecheney: I love the fact you split it out
<niemeyer> davecheney: It was a bad decision on the original implementation
<niemeyer> davecheney: Even if we can't move it as-is, we can move something, or we can even erase it by itself if nothing else
<davecheney> yup
<niemeyer> davecheney: Hmm
<niemeyer> davecheney: I don't think we should allow the environment name to change
<niemeyer> davecheney: We use that value internally in the environment itself
<davecheney> niemeyer: ok, i'll do it another way
<niemeyer> davecheney: E.g. security groups
<davecheney> it wasn't being set at all
<niemeyer> davecheney: It feels slightly bad that we do this
<niemeyer> davecheney: It would be better to have that as an external property of the environment
<niemeyer> davecheney: But it's helpful to tell the user about what those machines he's running are about
<davecheney> niemeyer: the current thing that is blocking bootstrap is there are no tools in my s3 bucket
<davecheney> i can't see anything to generate those
<niemeyer> davecheney: Oh
<davecheney> 2012/06/27 10:27:45 JUJU findTools searching for {{0 0 0} precise amd64} in []
<davecheney> error: cannot start bootstrap instance: cannot find juju tools that would work on the specified instance: no compatible tools found
<niemeyer> davecheney: bootstrap -upload-tools?
<niemeyer> davecheney: bootstrap --upload-tools?
<niemeyer> davecheney: I think you're right, though
<niemeyer> davecheney: We do need to set the environment name, at least once
<davecheney> 2012/06/27 10:28:50 JUJU environs: putting tools tools/juju-0.0.0-precise-amd64.tgz
<davecheney> niemeyer: i'll find a nicer way to do it
<niemeyer> davecheney: I feel like your CL is a good step forward, actually
<niemeyer> davecheney: I'm probably mixing problems up
<niemeyer> davecheney: We should prevent the user from replacing the name, but that's not the way to do it
<davecheney> niemeyer: the name is an unfortunate property that comes out of the yaml file
<niemeyer> davecheney: I'll file a bug, and LGTM your change
<davecheney> niemeyer: ta, just a small cosmetic issue i found
<niemeyer> davecheney: This may well be what's preventing it from running
<davecheney> 2012/06/27 10:31:34 JUJU environs/ec2: started instance "i-acd0d3d5"
<davecheney> niemeyer: dude, it bootstrapped !
<niemeyer> davecheney: Woah!
<davecheney> niemeyer: i'm going to pull this branch apart into smaller pieces, but i think it's working
<davecheney> no idea if there is a PA at the other end yet
<davecheney> also, juju status would be nice to have :)
<niemeyer> davecheney: +1 :)
<davecheney> niemeyer: time for a celebratory cup of tea
<niemeyer> davecheney: Feel free to start looking at it actually, if you ever get blocked/bothered
<niemeyer> s/bothered/bored/
<davecheney> niemeyer: ok
<davecheney> niemeyer: i'm not going to step on frank's toes?
<davecheney> screw it, i'll just email juju dev and claim it
<davecheney> niemeyer: 2012/06/27 10:34:27 JUJU environs: putting tools tools/juju-0.0.0-precise-amd64.tgz
<davecheney> error: cannot upload tools: cannot write file "tools/juju-0.0.0-precise-amd64.tgz" to control bucket: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.
<davecheney> ^ kinda unreliable, but that is australia for ya
<davecheney> maybe i'll try a different availability zone
<niemeyer> davecheney: Definitely not stepping on Frank's.. he's not looking at status ATM
<davecheney> sorry, william
 * davecheney facepalm
<niemeyer> davecheney: That error was kind of weird
<niemeyer> davecheney: I don't think anyone is doing status yet
<niemeyer> davecheney: I've not seen that before.. worth investigating
<davecheney> niemeyer: anyway, time for a break
<niemeyer> Woohay.. new battery arrived
<niemeyer> It's time for me to leave too
<niemeyer> I'm late, actually
<niemeyer> Dinner with Go folks tonight.. back later
<davecheney> niemeyer: how was dinner ?
 * davecheney is a tad jealous
<niemeyer> davecheney: It was awesome indeed
<davecheney> niemeyer: i think i figured out why juju can't bootstrap in other avail zones
<niemeyer> davecheney: Not the full team, unfortunately
<davecheney> niemeyer: i bet rsc wasn't there
<davecheney> he's elusive
<niemeyer> davecheney: Yeah, we had Rob and Andrew
<niemeyer> davecheney: He lives elsewhere.. I bet he'd join if he was around
<davecheney> he lives in boston right ?
<davecheney> niemeyer: can I commit rogers branch ? or should I merge it into one I own to lbox propose ?
<niemeyer> davecheney: The right thing would be to branch from it and continue, or possibly even break it down in smaller chunks
<niemeyer> davecheney: Yeah, he lives in Boston, or next to it I think
<davecheney> niemeyer: i'll try to break it into two chunks, it's only 100 lines
<davecheney> actually less than 90
<niemeyer> davecheney: Yeah, but it's also incomplete, right?
<davecheney> it's very close
<davecheney> [  102.246904] init: juju-provision-agent main process (4356) terminated with status 127
<davecheney> [  102.246937] init: juju-provision-agent main process ended, respawning
<davecheney> ^ jujud isn't in the path, that is pretty much the only bit missing
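Exit status 127 is the shell's "command not found", which fits davecheney's diagnosis that jujud isn't on the PATH. A hypothetical upstart job for the agent, using the tools path and flags visible later in this log, would spell out the full binary path rather than rely on PATH:

```conf
# Hypothetical /etc/init/juju-provision-agent.conf sketch.
# Status 127 means "command not found", so exec must use jujud's full path.
description "juju provisioning agent"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
exec /var/lib/juju/tools/juju-0.0.0-precise-amd64/jujud provisioning \
    --zookeeper-servers localhost:2181 \
    --log-file /var/log/juju/provision-agent.log
```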
<niemeyer> davecheney: I had the impression it was a spike mostly.. rog mentioned it had no testing
<niemeyer> davecheney: I confess to not having looked into it, though
<davecheney> he's done some tests, > 50% of the change is in livetests
<niemeyer> davecheney: Ah, nice
<niemeyer> davecheney: I wonder what he was referring to then
<davecheney> maybe it was the environ name problem
<niemeyer> davecheney: Well, you're in a much better position than I am
<davecheney> since I fixed that it's been working pretty well for me
<niemeyer> davecheney: Whatever you feel should go to review, I'll be happy to look at
<davecheney> ok
<davecheney> niemeyer: did my diagnosis of the s3 problem make sense ?
<davecheney> niemeyer: ubuntu@domU-12-31-39-0C-58-F1:~$ pgrep jujud -lf
<davecheney> 4539 /var/lib/juju/tools/juju-0.0.0-precise-amd64/jujud provisioning --zookeeper-servers localhost:2181 --log-file /var/log/juju/provision-agent.log
<niemeyer> davecheney: I think it does.. it's definitely an error we'll want to fix
<davecheney> i think it should be pretty easy to add support for that
<niemeyer> davecheney: My surprise was with the fact Amazon refuses to take a request on a region-specific endpoint because the payload doesn't mention the region-specific endpoint name *as well*
<niemeyer> davecheney: Woah? Provisioner running?
<davecheney> yes, almost
<davecheney> i think i got initzk working as well (path issues)
<davecheney> but it's blocked waiting on the right value for /environment
<davecheney> niemeyer: [zk: localhost:2181(CONNECTED) 2] ls /
<davecheney> [services, charms, relations, zookeeper, initialized, machines, units]
<davecheney> initzk worked, just missing /environment
<davecheney> niemeyer: i don't understand why location constraint has to be xml, why can't it just be a header
<niemeyer> davecheney: That's expected
<niemeyer> davecheney: /environment comes from juju deploy
<niemeyer> davecheney: Well.. it doesn't even need to be a header
<niemeyer> davecheney: The endpoint URL itself is already a great indicator
<niemeyer> davecheney: There's no way to be talking to that endpoint URL and not wanting to.. well.. talk to it
<davecheney> niemeyer: i thought there was
<niemeyer> davecheney: Kind of strange.. either a very awkward decision, or I'm missing something more interesting
<davecheney> there is a note in one page I found that said there is a generic s3 location you can talk to
<davecheney> but they recommend not using it, because it will internally redirect your request
<niemeyer> davecheney: Sure, but we're not using it, AFAIK
<davecheney> either way, it's a bit of a wart
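The payload being puzzled over here is S3's CreateBucket body: for any region other than us-east-1, the region must be repeated inside an XML `CreateBucketConfiguration` element even though the endpoint URL already identifies it. A minimal Go sketch of building that body (the helper name `locationBody` is mine, not goamz's, and the real request also carries an xmlns attribute):

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// CreateBucketConfiguration mirrors the XML payload S3's CreateBucket
// call expects: the region repeated in the body as a LocationConstraint.
type CreateBucketConfiguration struct {
	XMLName            xml.Name `xml:"CreateBucketConfiguration"`
	LocationConstraint string   `xml:"LocationConstraint"`
}

// locationBody returns the request body for creating a bucket in region.
func locationBody(region string) (string, error) {
	// us-east-1 is the historical default and takes no body at all.
	if region == "us-east-1" {
		return "", nil
	}
	b, err := xml.Marshal(CreateBucketConfiguration{LocationConstraint: region})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, _ := locationBody("ap-southeast-1")
	// → <CreateBucketConfiguration><LocationConstraint>ap-southeast-1</LocationConstraint></CreateBucketConfiguration>
	fmt.Println(body)
}
```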
<davecheney> niemeyer: so, juju deploy will be responsible for keeping /environment up to date ?
<niemeyer> davecheney: Yeah
<niemeyer> davecheney: Well, hopefully not up-to-date
<niemeyer> davecheney: Only the first time
<niemeyer> davecheney: The Python version updates it every time, but that's a mistake
<davecheney> niemeyer: ok, so i should not make the bootstrap process seed /environment
<niemeyer> davecheney: It's tricky because to do that we'd have to wait until the environment bootstrapped
<niemeyer> davecheney: Chicken and egg
<niemeyer> davecheney: We don't want to send credentials over via user-data
<davecheney> niemeyer: right, so to confirm, the cloud-init script is not secure ?
<niemeyer> davecheney: Right, it puts thing up in an address that can be openly accessed from within the machine
<fwereade> niemeyer, davecheney: heyhey
<davecheney> fwereade: howdy
<fwereade> niemeyer, perhaps I'm missing something re: https://codereview.appspot.com/6335057/diff/3001/state/presence/presence.go#newcode397
<fwereade> niemeyer, but it seems to me that we can only guarantee absence notifications from the watches if we don't stop the childLoops at all
<fwereade> bah
<fwereade> ok, I'm off to take laura to nursery
<davecheney> yeah, i'm off soon as well
<niemeyer> davecheney: Yo
<niemeyer> davecheney: Please feel free to tackle the location bug if you're up for it
<niemeyer> davecheney: I'll certainly have a low attention level in the next few days with I/O going on
<niemeyer> Time to head to bed now
<niemeyer> davecheney: Have a good time there, and talk to you tomorrow
<niemeyer> Night all
<TheMue> Morning.
<fwereade> TheMue, heyhey
<fwereade> TheMue, any objection to me just straight-up deleting relationUnitWatcher? after the meeting yesterday we have a better plan for doing essentially the same stuff
<TheMue> fwereade: No problem, so I just won't touch it when doing the Deleted/Removed change.
<fwereade> TheMue, I consider this trivial enough to go straight in with a cursory review; sound sane?
<fwereade> TheMue, I'll be proposing in a couple of mins...
<TheMue> fwereade: Does the mimic also may relate the ServiceUnitsWatcher?
<fwereade> TheMue, sorry, can't parse
<TheMue> fwereade: The way you will do it in future, will this be interesting for the service units watcher too?
<fwereade> TheMue, ah sorry; I don't *think* so
<fwereade> TheMue, I don't think that should be considering agent presence
<fwereade> TheMue, but I may not have thought it through properly
<fwereade> TheMue, should it?
<TheMue> fwereade: OK, I'll take a look when you proposed it. ;)
<TheMue> fwereade: No, it's only that both are interested in units, but in different contexts.
<fwereade> TheMue, yeah, this is specifically unit *relation* presence
<TheMue> fwereade: ic
<fwereade> TheMue, (the better plan is to use the forthcoming presence.ChildrenWatcher)
<fwereade> TheMue, https://codereview.appspot.com/6351046
<fwereade> TheMue, should be trivial :)
<TheMue> fwereade: LGTM, those pure red ones are pretty simple. ;)
<fwereade> TheMue, cheers; agree ok to submit directly? :)
<TheMue> fwereade: If the removed watcher isn't yet used then it's ok.
<fwereade> TheMue, don't worry, I ran the tests :)
<fwereade> TheMue, submitting now
<fwereade> TheMue, btw, if you have another couple of moments
<fwereade> TheMue, it seems to me that the watcher use in cmd/jujud is a bit inconsistent/idiosyncratic
<fwereade> TheMue, I would appreciate your opinion on whether it's justified, or if we should clean it up
<fwereade> TheMue, by which I mainly mean moving stopWatcher and mustErr into state/watcher, and using them consistently
<TheMue> fwereade: I'll take a look
<fwereade> TheMue, cheers
<TheMue> fwereade: hmm, do you have a pointer for me?
<fwereade> TheMue, cmd/jujud/machine.go:93
<fwereade> TheMue, I'm pretty sure that it's wrong to let a tomb.ErrStillRunning slip through
<fwereade> TheMue, and in line 80 we discard errors
<fwereade> TheMue, I haven't looked at provisioner properly but I'm pretty sure it suffers similar problems
<fwereade> TheMue, it may or may not be justifiable to rearrange NewMachiner such that it follows the usual model -- ie start the watcher before we start the loop
<fwereade> TheMue, IMO it's a cleaner way of doing things, and is anyway better in the context of juju because it's more like the model we use throughout state
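The "usual model" described here, where the constructor starts the goroutine, the loop owns the changes channel, and Stop/Err expose the outcome, can be sketched roughly as below. All names are illustrative; the real juju code builds this on launchpad.net/tomb rather than the hand-rolled done/stop channels used here for self-containment:

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStillRunning stands in for tomb.ErrStillRunning (an assumption; the
// real code uses the tomb package for lifecycle management).
var ErrStillRunning = errors.New("still running")

// intWatcher follows the loop model: the constructor starts the
// goroutine, which alone sends on changes and closes it on exit.
type intWatcher struct {
	changes chan int
	done    chan struct{}
	stop    chan struct{}
	err     error
}

func newIntWatcher(src <-chan int) *intWatcher {
	w := &intWatcher{
		changes: make(chan int),
		done:    make(chan struct{}),
		stop:    make(chan struct{}),
	}
	go w.loop(src)
	return w
}

func (w *intWatcher) loop(src <-chan int) {
	defer close(w.done)
	defer close(w.changes)
	for {
		select {
		case <-w.stop:
			return
		case v, ok := <-src:
			if !ok {
				w.err = errors.New("source closed unexpectedly")
				return
			}
			select {
			case w.changes <- v:
			case <-w.stop:
				return
			}
		}
	}
}

func (w *intWatcher) Changes() <-chan int { return w.changes }

// Stop asks the loop to exit and reports its final error.
func (w *intWatcher) Stop() error {
	select {
	case <-w.stop:
	default:
		close(w.stop)
	}
	<-w.done
	return w.err
}

// Err reports ErrStillRunning while the loop is alive, mirroring tomb.
func (w *intWatcher) Err() error {
	select {
	case <-w.done:
		return w.err
	default:
		return ErrStillRunning
	}
}

func main() {
	src := make(chan int, 1)
	src <- 42
	w := newIntWatcher(src)
	fmt.Println(<-w.Changes()) // → 42
	fmt.Println(w.Stop())      // → <nil>
}
```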
<TheMue> fwereade: yes, this uncommon way may lead to misunderstandings
<fwereade> TheMue, cool, I'll have a go at that; we can see what niemeyer thinks when he;s around
<TheMue> fwereade: yes
<TheMue> fwereade: lunchtime
<fwereade> TheMue, cool, enjoy
<Aram> hi.
<fwereade> Aram, heyhey
<fwereade> TheMue, it crosses my mind that *actually* the PA has it right wrt the watcher... there doesn't ever seem to be a good reason to create a subwatcher outside the loop
<fwereade> TheMue, or even to keep it in a field on a, er, superwatcher
<TheMue> re
<TheMue> Aram: Hi
<TheMue> fwereade: Maybe, it only puzzles me if something is different than in most/all other places.
<fwereade> TheMue, I can't see anywhere a watcher field is used outside a loop method
<TheMue> fwereade: Maybe it's my old OO as well as Erlang thinking that I do most initialization in a constructor and use the loop method only as loop.
<fwereade> TheMue, IMO a field that's used in only one method should not really be a field ;)
<fwereade> TheMue, anyway, I'm experimenting with what might be a noticeable simplification of the stuff in state/watcher.go, I'll let you know if it works out :)
<TheMue> fwereade: But then I would not call the method loop(). I often name them backend() in my projects. ;)
<fwereade> TheMue, I would personally gravitate towards run() :)
<TheMue> fwereade: That's the job of the whole program: come on, run, Forrest ...
<fwereade> TheMue, haha
<fwereade> TheMue, making this change has made some of the state watchers start to look a bit inconsistent to me: an awful lot of them seem to me to be at risk of sending "changes" that don't actually represent changes
<fwereade> TheMue, I rather consider that to be poor behaviour; opinion?
<fwereade> TheMue, basically all of them seem to do what FlagWatcher did before I suggested changing it yesterday
<fwereade> TheMue, excluding the Machine ones, that is
<TheMue> fwereade: Hmm, maybe they should be fixed then one-by-one in small changes.
<fwereade> TheMue, yeah, that sounds sensible
<TheMue> fwereade: We've done it this way for the error messages in state.
<fwereade> TheMue, sorry, expand please
<fwereade> TheMue, ah, how we make the changes
<TheMue> fwereade: +1
<TheMue> fwereade: yep
<TheMue> fwereade: I think they should be fixed to work exactly as expected.
<fwereade> TheMue, cool
<fwereade> TheMue, I need to think about what I've done a little more but it feels like a big win ATM
<TheMue> fwereade: great
<fwereade> TheMue, I'll see what niemeyer thinks before I start hanging more changes off it though ;)
<TheMue> fwereade: I'm just doing the last changes for the service unit watcher. niemeyer had a very good hint on how to test it more elegantly.
<fwereade> TheMue, yeah, I saw that
<fwereade> TheMue, sounds good
<TheMue> fwereade: it's so much clearer now.
<fwereade> TheMue, sweet -- that style does make my head hurt a bit sometimes :)
<TheMue> fwereade: it still uses test tables, but simpler ones
<fwereade> TheMue, perfect
<TheMue> fwereade: will now propose it and then start a new branch for the deleted to removed change
 * fwereade worries that that change will be a real hassle to merge with his current work
<fwereade> TheMue, can you wait a few mins? I'll propose what I have -wip so you can (1) tell me if it's sane and (2) decide whether the conflicts will be worth the hassle
<TheMue> fwereade: Which change you mean, the first one or the deleted/removed one?
<fwereade> TheMue, added/removed
<fwereade> TheMue, (sorry, remind me what the first one is?)
<TheMue> fwereade: The fixes after the review of the service units watcher.
<TheMue> fwereade: They are proposed now.
<fwereade> TheMue, I'm happy to suck up the conflicts on that one myself
<TheMue> fwereade: The work on the deleted-to-be-named-removed will start now.
<fwereade> TheMue, but Added/Removed might hurt
<TheMue> fwereade: This is why I will do it in an extra branch. There are influences in more packages than just state.
<fwereade> TheMue, true
<TheMue> fwereade: But I'll wait for your proposal.
<fwereade> TheMue, https://codereview.appspot.com/6347045
<fwereade> TheMue, the reason I worry slightly is that it totally guts state/watcher.go
<fwereade> TheMue, actually, looking at it, I doubt Added/Removed will hurt that much
<fwereade> TheMue, but I'd still appreciate your preliminary opinion
<fwereade> TheMue, the important bit is the changes to state/watcher.go, most of the rest can probably be a separate CL
<TheMue> fwereade: In the moment the field is renamed in watcher.go any user of this field has to be renamed too.
<TheMue> fwereade: So it may be better type by type instead file by file.
<fwereade> TheMue, you mean the Added/Removed change?
<fwereade> TheMue, my CL does not alter observable behaviour AFAICT
<TheMue> fwereade: Yes
<fwereade> TheMue, cool
<TheMue> fwereade: The other one - don't raise events for unwanted content changes - is a different topic and should be discussed with niemeyer
<fwereade> TheMue, yeah, I haven't even started on that
<TheMue> fwereade: Maybe you should file it first.
<fwereade> TheMue, hopefully I'll get a chance to see him and discuss it tonight
<TheMue> fwereade: So which change of mine do you think could conflict yours?
<fwereade> TheMue, I decided it probably wouldn't hurt too much in the end
<fwereade> TheMue, there will be conflicts but they'll be trivial
<fwereade> TheMue, feared they might be ugly before I looked at the actual diff
<TheMue> fwereade: ah, ok, understand
<fwereade> TheMue, anyway, does that change look worth it to you?
<TheMue> fwereade: i'm not yet through, one moment
<fwereade> TheMue, MW and MUW are not finished there
<fwereade> TheMue, you should be able to ignore them and still have a clear idea of what's involved
<TheMue> fwereade: Mostly LGTM, I only have trouble with the panic() in MustErr().
<fwereade> TheMue, that was already there :)
<TheMue> fwereade: Maybe, but not by me. ;)
<fwereade> TheMue, the idea is that if a subwatcher's channel is closed while the watcher is reading from it, it indicates a fundamental logic error
<TheMue> fwereade: The question is: Is this situation simply an error itself and recoverable or do we really need to tear down the program?
<fwereade> TheMue, it means that we, as coders, are demonstrably on crack
<fwereade> TheMue, ie not recoverable IMO
<TheMue> fwereade: If it can't happen in any runtime situation then I'm fine with it.
<fwereade> TheMue, that's the idea
<TheMue> fwereade: This is what I really like in Erlang. Here a supervisor would recognize this crash and start the needed tasks to recover the system (which sometimes is only the restart and registration of a coroutine (process) and sometimes may be the stopping and restarting of large process trees).
<fwereade> TheMue, but if the error is "the programmers are demonstrably failing basic logic" then I think it's better to fail hard AAP
<TheMue> fwereade: supervisors can also supervise other ones, so it cascades if one fails during restart
<fwereade> ASAP
<TheMue> fwereade: yes, indeed, if it is definitely not recoverable and a logical error
<fwereade> TheMue, if a watcher has closed its change channel while its owner is still making use of it, that is IMO evidence of fundamental crackfulness
<twobottux> aujuju: Security groups creation in EC2 <http://askubuntu.com/questions/156715/security-groups-creation-in-ec2>
<mthaddon> hey folks, does anyone know who's responsible for maintaining http://jujucharms.com/ ?
<james_w> mthaddon, that's hazmat's
<mthaddon> james_w: ah, thx
#juju-dev 2012-06-28
<jcharette> anyone around to support MAAS with juju?
<jcharette> anyone have experience making juju work with MAAS
<zirpu> jcharette: did you try the #maas channel?
<jcharette> no, but i can
<jcharette> doesn't seem to be a maas problem, juju hasn't even tried to connect
<m_3> davecheney: yo
<m_3> davecheney: was gustavo's talk last night or is it tonight?
<m_3> davecheney: we got in about 6 this morning
<davecheney> m_3: hey hey
<davecheney> m_3: i think it will be friday
<davecheney> so maybe the east coast friday
<m_3> ha
<m_3> ok
<davecheney> m_3: so tired today, watched the google io demo live
<davecheney> got home at 5:30
<m_3> ah, ok... that's what I was asking about
<m_3> sorry thought that was gustavo's talk
<davecheney> fwereade: howdy
<davecheney> cmd/juju only lists bootstrap and destroy-environment as options, there is no deploy
<davecheney> was this intentional ?
<fwereade> davecheney, heh, no it wasn't
<fwereade> davecheney, sorry about that
<fwereade> davecheney, re your email, the ultra-short version is that no deploy doesn't publish secrets
<fwereade> davecheney, I *think* that is something that should be handled by juju.Conn in the background; the first thing that should be done on client access to state is to check for complete environ config, and if keys are missing the secrets should be pushed
<fwereade> davecheney, sounds sane?
<davecheney> fwereade: yup
<fwereade> davecheney, cool
<davecheney> i hope this can be solved soon, and finally
<davecheney> i always find myself butting up against it
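The idea floated above, that juju.Conn should check for a complete environ config on first client access and push any missing secrets, can be sketched like this. Every name here (Config, updateSecrets, the key list) is hypothetical, not juju's real API; the point is only the "fill in what's missing, never overwrite" behaviour:

```go
package main

import "fmt"

// Config is a stand-in for an environment configuration map.
type Config map[string]string

// updateSecrets copies any secret attribute missing from the stored
// config out of the locally held config, and reports what it pushed.
// Keys already present are left alone, matching the "only the first
// time" behaviour discussed for juju deploy.
func updateSecrets(stored, local Config, secretKeys []string) (pushed []string) {
	for _, k := range secretKeys {
		if _, ok := stored[k]; !ok {
			stored[k] = local[k]
			pushed = append(pushed, k)
		}
	}
	return pushed
}

func main() {
	stored := Config{"type": "ec2", "name": "sample"}
	local := Config{
		"type": "ec2", "name": "sample",
		"access-key": "AKEXAMPLE", "secret-key": "SKEXAMPLE",
	}
	fmt.Println(updateSecrets(stored, local, []string{"access-key", "secret-key"}))
	// → [access-key secret-key]
}
```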
<davecheney> oh, also,         spec, err := findInstanceSpec(defaultInstanceConstraint)
<fwereade> davecheney, yeah, that is basically completely broken IMO
<davecheney> ^ this is why we can't bootstrap into an arbitrary region :)
<davecheney> defaultInstanceConstraint says 'dude, use us-east-1'
<fwereade> davecheney, I know
<davecheney> not hard to fix
<fwereade> davecheney, I was looking at exactly that last night and started swinging the axe before I realised that it was a distraction from what I *actually* need to do right now
<davecheney> :)
<davecheney> btw, i went to use juju deploy and found the command wasn't enabled yet
<davecheney> is it not ready for prime time ?
<fwereade> davecheney, yeah, but it's doing a bunch of stuff that is unnecessary, and some stuff that is flat-out wrong, and I'm not happy with some of the terminology, and grumble grumble grumble
<fwereade> davecheney, it should *work* -- although it won't enable the PA -- it's just that I forgot to actually enable it
<davecheney> s'ok, it was a one line change
<fwereade> davecheney, propose it, it's a trivial
<davecheney> but again, no PA secrets; no machines :(
<davecheney> fwereade: kk
 * davecheney afk for a few mins
<fwereade> davecheney, step by step :)
<fwereade> davecheney, brb, weirdly sluggish system, restarting
<davecheney> fwereade: interesting ... even with the correct image spec, it still booted an instance in us-east-1
<fwereade> davecheney, as I said, I am generally underwhelmed by that code
<fwereade> davecheney, I think it needs a fair bit of work to be useful even *without* constraints
<fwereade> davecheney, and that when we add constraints it will need still more
<davecheney> right
<fwereade> davecheney, sorry to bear bad tidings :(
<davecheney> fwereade: gotta start somewhere
<fwereade> davecheney, indeed :)
<fwereade> davecheney, if you're planning to work on that code
<davecheney> can i take a moment to whinge that --debug doesn't actually output anything more than you previously got
<davecheney> thank you
<fwereade> davecheney, may I suggest starting by removing the useless fields from InstanceConstraint?
<fwereade> davecheney, we always want persistent storage, and we always want a released server image
<davecheney> yeah, why would we ever want to boot an unstable desktop
<davecheney> i'll make a bug for myself
<fwereade> davecheney, quite so
<fwereade> davecheney, cheers
<davecheney> fwereade: can we talk about juju.Conn for a moment ?
<fwereade> davecheney, ofc
<davecheney> is that something you are working on in your sphere ?
<davecheney> i kinda get the feeling that much has been discussed already
<fwereade> davecheney, hmmm, it's something I'm reluctant to commit to *implementing* myself in a short timeframe
<davecheney> fwereade: understood
<fwereade> davecheney, but I think I have a reasonably clear view of the issues
<davecheney> would i be well served by looking in the irc logs for the details ?
<davecheney> (where are the irc logs btw)
<davecheney> btw, great work on the watcher refactoring, the cargo culting of new watchers was starting to smell odd
<fwereade> davecheney, hmmmm, probably not
<fwereade> davecheney, cheers :)
<fwereade> davecheney, let me run down what springs to mind immediately
<davecheney> cool, I get the feeling that I'll be implementing it, 'cos I need it most urgently
<fwereade> davecheney, it should be done at the Conn level rather than in response to specific operations just because we have a history of forgetting that it might need to be done :)
<fwereade> davecheney, hmmm, it's a touch tangled in my mind, just trying to put thoughts in order
<fwereade> davecheney, a lot of it is down to an issue in my mind that remains a source of some discomfort
<fwereade> davecheney, I think there are 3 categories of environment setting
<fwereade> davecheney, (hm, maybe more) there are those that must always be present, regardless of provider
 * davecheney nods
<fwereade> davecheney, that is: type, name, default-series
<fwereade> davecheney, I am adding default-series and type handling to Initialize as we speak
<davecheney> fwereade: mandatory, secret (but required), and optional (no sane default)
<fwereade> davecheney, hmm, I hadn't been thinking of optional as a category
<fwereade> davecheney, I was thinking juju-mandatory, provider-mandatory, and secret
<fwereade> davecheney, I retain unease about the fact that we just poke all the juju-mandatory ones into the provider-specific config schemas -- that feels like a potential source of tedious bugs with little corresponding benefit
<fwereade> davecheney, "name" is a particularly interesting one, because at the moment Initialize does not set (or even have access to) the environment name
<fwereade> davecheney, in some ways this is good because the lack of a name should, I hope, be sufficient to indicate an incomplete environ config even in the case of the dummy provider
<davecheney> urk, that is because name is a synthetic property that comes via the yaml conversion ?
<fwereade> davecheney, but it *also* feels like a source of potential avoidable bugs, because as discussed elsewhere we'll be in a world of hurt if we try to change an env name at runtime
<fwereade> davecheney, I think it's just because "meh we don't need it yet"
<davecheney> fwereade: there are lots of those bits of information
<davecheney> ec2 region is another example
<fwereade> davecheney, however passing it up is a hassle -- we need to change userData, the initzk command, and state.Initialize to handle it
<davecheney> early on in the config validation the region string from the config is turned into an aws.Region struct
<davecheney> but, nowhere in that struct does it include its short name
<davecheney> and so forth
<fwereade> davecheney, yeah, I rather feel that the environ code has been implemented without paying close enough attention to the lessons learned working with multiple providers in the python
<fwereade> davecheney, it's entirely understandable and forgivable, because those lessons don't really strike home until you've done a bunch of work with them
<fwereade> davecheney, but still a shame
<davecheney> yup, maybe there is a case for doing an openstack or local provider concurrently
<davecheney> to expose these issues sooner
<fwereade> davecheney, there *definitely* is, but I think the decision to take those off the table is rational given the constraints we're working under
<fwereade> davecheney, explicitly acknowledged tech debt is a lot better than the accidental kind
<davecheney> yup, we don't get a prize if we have 3 sorta working providers, and no solid ec2 story
<fwereade> davecheney, it's just that we've also taken on some accidental debt as well in the course of implementation
<fwereade> davecheney, if you're looking into this stuff you could do worse than to take a casual look at the python provider implementations, though
<fwereade> davecheney, while it's not perfect I think I did a reasonable job of separating the common from the provider-specific
<davecheney> i think i don't spend enough time there
<fwereade> davecheney, although I never addressed the flat-out broken config issue whereby we leave "default" settings empty
<fwereade> davecheney, and then do `config.get(key, default)` with the same default in multiple places
<davecheney> ooh, that is a trap
<fwereade> davecheney, yeah -- just something to be aware of
<fwereade> davecheney, in the context of what you're currently working on, I think formalising (the equivalent of) the machine_data we have in python would be a good idea
<davecheney> fwereade: thanks, i'll check it out
<fwereade> davecheney, I'm not quite sure whether startInstance should take (*state.Info, *environs.MachineData), or just (*environs.MachineData)
<fwereade> davecheney, but I'm 98% sure that a MachineData is a good idea :)
<davecheney> i think the state.Info is a smell
<davecheney> along with the secrets problem
<fwereade> davecheney, there's something about that I'm not comfortable with, indeed, but I have yet to put my finger on it
<davecheney> the argument about which state.Info you get has taken more than its fair share of head space
<fwereade> davecheney, yeah
<fwereade> davecheney, there's probably a better way to do it :)
<fwereade> davecheney, so... I hope this was slightly helpful, I feel what I've mostly said is "there are dragons here, here, and here, good luck!"
<fwereade> davecheney, which is not without value but possibly I have skipped things that I think are "obvious"... so please let me know if I seem to have missed stuff
<davecheney> fwereade: wrt startInstance and a state.Info, i think it's overengineered
<davecheney> actually no
<fwereade> davecheney, agree, I think there's quite enough info available internally
<davecheney> i can't ever imagine how the provider could start a machine that would not talk to the state using exactly the same connection string as the pa
<davecheney> hmm, I shouldn't say "ever"
<davecheney> but it's certainly not the biggest problem at the moment
<fwereade> davecheney, ah, but we don't know what the PA machine's address is at bootstrap time
<fwereade> davecheney, so we have to start it pointing at localhost
<fwereade> davecheney, and then figure out the actual address to use for the other machines later
<fwereade> davecheney, it's not a problem we can just skip entirely
<davecheney> this sounds like another secrets passing problem
<fwereade> davecheney, it shares some features, yeah
<davecheney> the details about the instance that is started for machine/0 are so opaque we can't even say 'what is your public ip'
<fwereade> davecheney, but I don't think we need any external input to figure out the right address -- surely we can figure that out on the machine?
<fwereade> davecheney, doesn't the metadata service provide that?
<davecheney> not sure what you mean by metadata service
<fwereade> davecheney, environs/ec2/ec2.go:253
<fwereade> davecheney, it's what we use to figure out the instance id of the bootstrap machine at bootstrap time
<davecheney> fwereade: ta
<fwereade> davecheney, but honestly offhand I cannot remember what else it provides
<davecheney> fwereade: wtf
<davecheney> right so if you curl that url from inside your ec2 instance you can find out stuff about your instance
<fwereade> davecheney, exactly
<davecheney> that is both brilliant, and insane, in equal parts
<fwereade> davecheney, ha, yeah
<fwereade> davecheney, and ofc it can fail
<fwereade> davecheney, which is not something we have actually handled now I come to think of it
<fwereade> davecheney, I got bitten by that at least once at my last job
<davecheney> fwereade: maybe I should add something to goamz to talk directly to it
<davecheney> rather than using the shell hammer
<fwereade> davecheney, hmmmmm maybe
<fwereade> davecheney, in this specific case we have providers where we *do* know the instance id before launch
<fwereade> davecheney, and so passing it in on the command line seems to me like the right thing to do
<fwereade> davecheney, rather than worrying about a provider-specific initialize
<fwereade> davecheney, ...but yeah, getting it *reliably* is an issue that should be addressed
<fwereade> davecheney, and I'm not actually sure if we do use the metadata service to figure out the long-term state info
<fwereade> davecheney, I suspect that's something we do from the instance-id stored on S3, making use of secrets to interrogate aws
<fwereade> davecheney, that's certainly what we need to do on the client, I think
<fwereade> davecheney, but I'm getting a moderate sense that I may be talking crap, and that it would be wise to check the python
<davecheney> fwereade: have a great day mate, i'm going offline now
<fwereade> davecheney, cheers, take care
<TheMue> Morning
 * davecheney waves
 * TheMue waves back
<TheMue> Morning Aram
<Aram> morning
<TheMue> fwereade: Heya William, online again! ;)
<fwereade> TheMue, heyhey, no idea what the issue was
<TheMue> fwereade: It definitely shows how dependent we are.
<fwereade> TheMue, well, ok, the adsl was stuck "initializing LCP" for a long time... and then suddenly it wasn't :/
<fwereade> TheMue, indeed :)
<TheMue> fwereade: My current provider here is thankfully pretty good. But I'll change to a different one in July (with an overlapping time having both). I hope it's as reliable too.
<TheMue> fwereade: Got a neat bandwidth then.
<Aram> meh, some dude came to tell me I have to pay 52€/2 months for TV and radio.
<Aram> I don't have either.
<Aram> I haven't watched TV in like 15 years.
<TheMue> Aram: We have the same here. You may listen to the radio or watch TV, e.g. with the PC. So you've got to pay.
<Aram> meh.
<TheMue> Aram: But it's cheaper.
 * TheMue and his family indeed watch TV and listen to the radio. :)
<TheMue> fwereade: Just merged your and my change to watcher, little text conflicts but no probs.
<fwereade> TheMue, fantastic
<TheMue> fwereade: But I'll still have to adapt your changes to the ServiceUnitsWatcher, just seen. :( The other one has been the deleted-is-now-removed change.
<TheMue> fwereade: I can't build and test jujud, see http://paste.ubuntu.com/1064020/. Can you reproduce that?
<fwereade> TheMue, sorry, got distracted -- let me grab a fresh branch and try
<fwereade> TheMue, works for me; if you're using the move-branches-around style of development you may want to delete your built package
<TheMue> fwereade: Strange, no, will have a deeper look.
<TheMue> fwereade: I took your watcher.go as merge base and only reproduced the changes regarding Deleted/Removed.
<fwereade> TheMue, hmm; contentWatcher has Err and Stop, and everything should embed contentWatcher
<TheMue> fwereade: Found it, meld took the wrong file. *sigh* I hate text conflicts.
<fwereade> TheMue, heh, I just solve conflicts by hand every time, I think it saves me time on average ;)
<twobottux> aujuju: "init: juju-..." errors in syslog after uninstalling juju <http://askubuntu.com/questions/157093/init-juju-errors-in-syslog-after-uninstalling-juju>
<TheMue> fwereade: Yep, that has been the mistake, now it works.
<fwereade> TheMue, cool
<fwereade> TheMue, and, sorry: I only remember I said I'd wait for yours after I'd submitted :((
<TheMue> fwereade: No problem, that's always a risk using a VCS without locks (and with locks it's even worse; I was once forced to use ClearCase).
<fwereade> TheMue, haha
<TheMue> fwereade: go test ./... still leads to a funny problem here. Due to my German locale one error message of mongo doesn't match and the test fails. ;)
<fwereade> TheMue, eww
<fwereade> TheMue, what test?
<TheMue> fwereade: The store
<TheMue> fwereade: bzr: ERROR: Kein Zweig: »/non-existent/~jeff/charms/precise/bad/trunk/«.
<fwereade> TheMue, weird, that's only just started happening?
<TheMue> fwereade: Expected is "Not a branch" instead of "Kein Zweig".
<TheMue> fwereade: So far it's the only error. The returned error type is the right one, but the received string for it is not.
<TheMue> fwereade: That's also one reason why I dislike those generic error strings with errors.New() or fmt.Errorf().
<fwereade> TheMue, not i18nable?
<TheMue> fwereade: In my apps/pkgs I use error types so that they can be compared (and later also be returned in the UI in the right language).
<TheMue> fwereade: Yep
<fwereade> TheMue, yeah, I think I probably do favour your approach; OTOH I see where niemeyer's coming from, we had an embarrassing mess of error types in python that didn't really help anything at all :)
<TheMue> fwereade: As long as we only deploy instances with English locales everything is fine.
<fwereade> TheMue, I agree that we shouldn't ever be depending on string values of errors in "real" code, and given that we have at least one core dev with a non-english locale I also agree we probably shouldn't be in tests either
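The typed-error approach TheMue describes can be sketched like this. All the names below are hypothetical, modelled loosely on the failing store test; this is not juju-core's actual error handling, just an illustration of why comparing error types beats comparing locale-dependent message strings:

```go
package main

import "fmt"

// NotBranchError is a hypothetical typed error: callers can detect it
// with a type assertion instead of matching a (locale-dependent) string.
type NotBranchError struct {
	Path string
}

func (e *NotBranchError) Error() string {
	return fmt.Sprintf("not a branch: %q", e.Path)
}

// openBranch stands in for the real bzr call that failed in the store test.
func openBranch(path string) error {
	return &NotBranchError{Path: path}
}

func main() {
	err := openBranch("/non-existent/trunk")

	// Fragile: comparing message text breaks the moment the wording
	// (or, as in the store test, the system locale) changes.
	fmt.Println(err.Error() == "Not a branch")

	// Robust: a type assertion works in any locale.
	_, ok := err.(*NotBranchError)
	fmt.Println(ok)
}
```

With errors built via `errors.New` or `fmt.Errorf`, only the fragile string comparison is available; a named type makes the check locale-proof and lets a UI translate the message later.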
<TheMue> fwereade: We should talk about it in Lisbon.
<fwereade> TheMue, SGTM
<Aram> is machine 0 special?
<Aram> right now I have a lot of code that tries to detect if a unit doesn't have a machine assigned or if it's just machine 0.
<Aram> if machine 0 is special I could remove all that crap
<zirpu> Aram: machine 0 is currently the bootstrap node where zookeeper runs.
<zirpu> so it's "special" in that sense.
<Aram> so no new units can be assigned to it?
<Aram> that's great.
<zirpu> you can put subunits on it i think.
<Aram> that's fine in my design.
<zirpu> but mostly it's meant to orchestrate the rest of the units/services.
<zirpu> there's a bug about making the bootstrap node highly available, or giving it a secondary.
<zirpu> https://bugs.launchpad.net/juju/+bug/803042
<fwereade> Aram, sorry I missed that earlier: we really really should try not to make machine 0 special
<Aram> hmm.
<fwereade> Aram, it's also perfectly legitimate and expected to have any number of machines without units assigned
<Aram> of course.
<TheMue> fwereade: Just for info: I've adapted your watcher changes to the ServiceUnitsWatcher and it works just fine.
<fwereade> TheMue, great
<fwereade> TheMue, tiny nitpick -- any particular reason SUW is up at the top of the file away from the other topology watchers?
<Aram> fwereade: I'm refactoring some really awful stuff I've worked on yesterday, I have a much better data model and the problem I had will not exist anymore, so everything is fine.
<fwereade> Aram, cool
<Aram> fwereade: btw, lbox was broken yesterday, niemeyer was kind enough to fix it.
<Aram> or was that the day before?
<Aram> I guess the day before.
<fwereade> Aram, yeah, I saw, and serendipitously had exactly the same failure the following day
<fwereade> Aram, thanks for blazing that trail ;)
<twobottux> aujuju: Can I specify tighter security group controls in EC2? <http://askubuntu.com/questions/156715/can-i-specify-tighter-security-group-controls-in-ec2>
<fwereade> TheMue, how would you feel about vast and hideous surgery to the state tests?
<fwereade> TheMue, I feel that state_test.go is way too big and we'd benefit from breaking things up into both separate files and separate suites within those files
<TheMue> fwereade: +1
<fwereade> TheMue, cool, cheers
<TheMue> fwereade: It has grown a lot over the time.
 * fwereade dons tedium-proof outergarment
<TheMue> Did you know that rabbits can yawn? I'm sitting on the veranda, and one of our rabbits just did it. :D
<Aram> fwereade: yes, state_test.go is way too big, for mstate I was planning to split it up and make it provider independent.
<Aram> we could isolate mgo/zk things to a handful of utility functions.
<fwereade> Aram, ouch, what are the provider dependencies?
<Aram> and keep the tests pure.
<fwereade> Aram, ahh, got you
<TheMue> Hmm, firewall is tricky, multiple watchers, so multiple goroutines and tombs.
<bcsaller> google's compute engine will need a provider soon
#juju-dev 2012-06-29
<twobottux> aujuju: Unable to fully remove Juju <http://askubuntu.com/questions/157093/unable-to-fully-remove-juju>
<davecheney> m_3: still on for lunch today ?
<davecheney> m_3: that is more of a statement than a question, just confirming
<m_3> davecheney: yup
<davecheney> m_3: just sent you an email, do you like Thai hawker food ? Luksa and the like ?
<m_3> sounds good... I'm easy
<davecheney> m_3: groovy
<fwereade_> mornings :)
<davecheney> fwereade_: howdy
<fwereade_> davecheney, heyhey
<davecheney> things are looking up, we can bootstrap into other regions
 * fwereade_ cheers
<davecheney> and I'm just about to get the local ec2 test running
<fwereade_> davecheney, sweet
<davecheney> gustavo was supportive of moving the Provisioner into another package
<fwereade_> davecheney, I spent most of the night hitting everything related to testing and zookeeper with an axe
<davecheney> so i can import it into the local tests
<fwereade_> davecheney, yeah, I saw
<fwereade_> davecheney, hope he is supportive of this change too
<davecheney> fwereade_: looking for that race, or just because it deserved it
<fwereade_> davecheney, duplicated bits and pieces everywhere, subtly inconsistent bits and pieces
<fwereade_> davecheney, also a single monster test suite for almost everything in state
<fwereade_> davecheney, the collective pain passed my threshold
<fwereade_> davecheney, and it's the sort of thing I really *don't* want to do piecemeal
<davecheney> fwereade_: excellent work
<fwereade_> davecheney, any ad-hoc duplications I leave lying around will, according to the law of sod, be those that someone sees and copies next time they need to talk to ZK, or use a State, or whatever ;)
<davecheney> fwereade_: did you move zkSuite into juju-core/testing ?
<davecheney> or are you planning to?
<davecheney> the zkSuite i refer to comes from cmd/jujud
<fwereade_> davecheney, I'm planning to drop it and use juju-core/testing.ZkConnSuite for initzk, and juju-core/state/testing/StateSuite for the others
<fwereade_> davecheney, just unpicking that bit right now actually :)
<fwereade_> davecheney, neither of those exist yet, except in this tree ofc
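The shared-suite idea behind ZkConnSuite and StateSuite can be sketched with plain struct embedding. The real suites hook into the gocheck framework and a live ZooKeeper server; everything below is a stand-in, showing only how embedding lets many test suites reuse one connection lifecycle instead of duplicating it:

```go
package main

import "fmt"

// zkConn stands in for a real ZooKeeper connection.
type zkConn struct{ addr string }

// ZkConnSuite owns the connection lifecycle once; every test suite that
// embeds it inherits SetUpSuite/TearDownSuite instead of copy-pasting them.
type ZkConnSuite struct{ conn *zkConn }

func (s *ZkConnSuite) SetUpSuite()    { s.conn = &zkConn{addr: "localhost:2181"} }
func (s *ZkConnSuite) TearDownSuite() { s.conn = nil }

// StateSuite embeds ZkConnSuite and layers state initialization on top.
type StateSuite struct {
	ZkConnSuite
	stateReady bool
}

func (s *StateSuite) SetUpSuite() {
	s.ZkConnSuite.SetUpSuite() // reuse the shared setup
	s.stateReady = true
}

func main() {
	var s StateSuite
	s.SetUpSuite()
	fmt.Println(s.conn != nil, s.stateReady)
}
```

The payoff fwereade_ describes is exactly this: one canonical setup that jujud's tests (and anything else needing a ZK-backed State) embed, rather than N subtly inconsistent copies.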
<davecheney> fwereade_: excellent, I hope it gets committed soon
<davecheney> 'cos I need that to start a zk to back the provisioner for the local ec2 tests
<fwereade_> davecheney, me too...
<fwereade_> davecheney, ah, cool
<fwereade_> davecheney, I'm not sure it will actually get proposed until you've gone to bed
<fwereade_> davecheney, that said I hope it will before eod for me
<fwereade_> davecheney, so if you do happen to see it over the weekend, and are able to give the bits that hit your code a once-over (and hopeful LGTM), I think it would help
<fwereade_> davecheney, it's going to be much too big for a single person to review effectively, even though *most* of the changes are from splitting up state_test
<davecheney> understood
<davecheney> review backlog ... sigh
<fwereade_> davecheney, cmd/jujud/main_test.go
<fwereade_> davecheney, is it intentionally not in the main package?
<fwereade_> davecheney, I thought you changed all those to be internal?
<davecheney> let me check, that may have been an accident
<fwereade_> davecheney, it looks as though that test isn't actually run any more
<davecheney> sorry, that was an accident
<davecheney> hmm, looking further
<davecheney> i didn't touch main_test
<davecheney> but it should be fixed
<fwereade_> davecheney, cool, I'll pick those up with the one I'm doing
<davecheney> ok, thanks
<davecheney> I can do it too, but it'll be depending on my other services branch
<fwereade_> davecheney, since I'm already hitting everything to do with testing I think it fits best with what I'm doing
<davecheney> kk
<fwereade_> davecheney, actually, do you have a few minutes to take a preliminary look at what I've done? assuming I haven't broken anything in the latest pass, I'll propose -wip in a couple of mins
<fwereade_> davecheney, or will I be keeping you at work inappropriately? ;)
<davecheney> fwereade_: yeah fire it off
<fwereade_> davecheney, https://codereview.appspot.com/6348053
<fwereade_> davecheney, testing/zk.go, and state/testing/testing.go, are at the heart of it all
<fwereade_> davecheney, most of the stuff in the state package can otherwise be pretty much ignored for now, it's just test-moving
<fwereade_> davecheney, so if you take a look at those, and how they affect the jujud tests, that would probably be most directly useful
 * davecheney looks
 * fwereade_ passes davecheney the goggles, and hopes they do something
<davecheney> fwereade_: looks pretty good to me
<fwereade_> davecheney, cool, thanks for taking a look :)
<davecheney> you could probably split this into two changes, to reduce the sticker shock
<davecheney> but 90% of the change is one liners in a file
<fwereade_> davecheney, yeah, maybe I should do the state_test split separately
<fwereade_> davecheney, bah :)
<davecheney> i don't think it's worth it, this branch will probably take a few review cycles, so it's probably not saving much
<fwereade_> davecheney, yeah, I must admit I'm not especially enthused by the idea, but then I wouldn't be ;)
<fwereade_> davecheney, hmm, an idea
<fwereade_> davecheney, state_test.ConnSuite should probably not be used anywhere near as much as it is
<fwereade_> davecheney, if I move the tests that use zkConn directly out into their own zk-substrate-specific test it would probably be a good thing
<fwereade_> davecheney, but that's surely one for a new CL
<davecheney> fwereade_: stepping offline for a while, gonna see what is happening in the real world
<fwereade_> davecheney, take care, have fun
<davecheney> fwereade_: you too
<fwereade_> TheMue, heyhey
<TheMue> fwereade_: Hi
<TheMue> fwereade_: Oh, update tells me to reboot. One moment.
<TheMue> fwereade_: So, back again, just submitted the deleted-is-now-removed-change
<fwereade_> TheMue, cool
<fwereade_> TheMue, I'll merge that in a mo
<fwereade_> TheMue, I have a hideous monster of a branch in progress
<fwereade_> TheMue, but it does remove a whole bunch of duplication across the tests that hit state and/or zookeeper so hopefully it won't be too controversial
<fwereade_> TheMue, do you have anything else likely to land soon?
<TheMue> fwereade_: I'm still porting the firewall, it uses multiple watchers. And now I've got a method with a callback inside a callback. :(
<TheMue> fwereade_: No, I'll need some time.
<fwereade_> TheMue, cool
<fwereade_> TheMue, an internal updates channel may be helpful if I've understood what you're doing correctly
<fwereade_> TheMue, ie the main loop selects on w.subwatcher.Changes and w.updates
<fwereade_> TheMue, and a separate goroutine for each additional watcher sends on w.updates for the attention of the main loop
<fwereade_> TheMue, sensible/relevant?
<TheMue> fwereade_: Maybe an approach, yes. I'll take a look how it matches. But it SGTM.
<fwereade_> TheMue, if by callback-inside-callback you just mean consuming the initial event from a new watcher before sending it off on its own goroutine I think that's sensible and necessary
<TheMue> fwereade_: I've got to get deeper in the code first, understanding what happens there today.
<fwereade_> TheMue, SGTM, give me a shout if you see anything particularly surprising, I seem to have spent a disturbing amount of time thinking about this sort of problem recently
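The internal-updates-channel pattern fwereade_ sketches above can be written out roughly as follows. All type and field names here are illustrative, not the actual firewall code: the main loop selects on the primary watcher's Changes channel and on a single shared `updates` channel, while each additional sub-watcher runs in its own goroutine and forwards events onto `updates`:

```go
package main

import "fmt"

// update is an event forwarded from a per-unit sub-watcher to the main loop.
type update struct {
	unit  string
	ports []int
}

type firewallWatcher struct {
	changes chan string   // stands in for w.subwatcher.Changes
	updates chan update   // merged events from all per-unit watchers
	done    chan struct{} // closed to stop everything (a tomb in real code)
}

// watchUnit is the per-unit goroutine: it consumes its own source channel
// and forwards each event onto the shared updates channel.
func (w *firewallWatcher) watchUnit(unit string, src <-chan []int) {
	for {
		select {
		case ports, ok := <-src:
			if !ok {
				return
			}
			w.updates <- update{unit: unit, ports: ports}
		case <-w.done:
			return
		}
	}
}

// loop is the single main loop; all mutable state lives here, so no
// locking is needed despite the multiple goroutines.
func (w *firewallWatcher) loop(out chan<- string) {
	for {
		select {
		case unit := <-w.changes:
			out <- "new unit: " + unit
		case u := <-w.updates:
			out <- fmt.Sprintf("ports for %s: %v", u.unit, u.ports)
		case <-w.done:
			return
		}
	}
}

func main() {
	w := &firewallWatcher{
		changes: make(chan string),
		updates: make(chan update),
		done:    make(chan struct{}),
	}
	out := make(chan string, 4)
	go w.loop(out)

	src := make(chan []int)
	go w.watchUnit("wordpress/0", src)

	w.changes <- "wordpress/0"
	src <- []int{80, 443}
	fmt.Println(<-out)
	fmt.Println(<-out)
	close(w.done)
}
```

Consuming a new sub-watcher's initial event before spawning its goroutine (the "callback inside a callback" TheMue hit) fits naturally here: read it synchronously in the main loop, then hand the channel to `watchUnit`.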
<Aram> hey.
<fwereade_> Aram, heyhey
<TheMue> fwereade_: Could you please help me getting yield right?
<TheMue> Aram: Hi btw
<fwereade_> TheMue, I'll do what I can :)
<fwereade_> TheMue, what's the problem?
<TheMue> fwereade_: I understand it as a generator returning the yielded value with each call. So far no problem.
<fwereade_> TheMue, but @inlineCallbacks is weird?
<TheMue> fwereade_: But when the generator function takes a list as an argument, iterates over it inside, and yields each time, what do I get?
<TheMue> fwereade_: And those inlineCallbacks too, yes. ;)
<fwereade_> TheMue, not sure I follow the first bit, can you point me to the code?
<fwereade_> TheMue, the idea of inlineCallbacks is that you repeatedly yield a Deferred, and the result of that Deferred is sent back into the generator once it's fired
<fwereade_> TheMue, basically it lets you write callback-based code and kid yourself it's not really using callbacks
<fwereade_> TheMue, so the critical thing is not to think of inlineCallbacks-decorated methods as being generators at all
<fwereade_> TheMue, it's a creative and useful abuse of generators but that's not really a useful perspective from which to view them IMO
<TheMue> fwereade_: OK, thx, a first hint. And forget my first sentence, I interpreted the indent inside firewall.py wrong.
<TheMue> So, organized my checkup at the dentist on Monday morning. *dislike*
<fwereade_> TheMue, Aram: I have the most horrifying CL known to man, just proposed at https://codereview.appspot.com/6348053
<Aram> oh man
<fwereade_> TheMue, Aram: it's explained in great detail on codereview
<fwereade_> Aram, I hope that it moves us a significant step towards substrate-independent state testing, which is why I'm pointing it out to you in particular
<fwereade_> TheMue, because it hits a lot of tests that you yourself wrote, I would particularly appreciate your feedback
<TheMue> fwereade_: I'm already looking.
<TheMue> fwereade_: Btw, I often read CL here, but I've not been able to find the long version of this abbrev. Could you help me?
<fwereade_> TheMue, ChangeList, apparently :)
<TheMue> fwereade_: Thought so, indeed. But it seemed too simple. ;)
<TheMue> fwereade_: I know it as a change set.
<TheMue> fwereade_: http://en.wikipedia.org/wiki/Changeset
<fwereade_> TheMue, indeed
<fwereade_> TheMue, seemed easiest to just adopt the preferred local terminology :)
<TheMue> fwereade_: Yes, only wanted to know. Maybe there's a history behind it.
<fwereade_> TheMue, Aram: btw, if I do appear to be making unsupportable statements in the description, or to otherwise be on crack, please point it out... I'm somewhat nervous that niemeyer is going to reject it on principle, and I'd like it to be as sane as possible in all respects other than size ;)
<TheMue> fwereade_: Normally your ideas have very good reasons.
<fwereade_> TheMue, thank you -- but occasionally total nonsense slips out past the internal censor, and N other opinions are very helpful ;)
<TheMue> fwereade_: *lol*
<TheMue> ..ooOO( Note to Mr. Reade: This change is indeed a *monster*. )
<TheMue> fwereade_: You've got a review.
<fwereade_> TheMue, tyvm for review
<fwereade_> TheMue, how about "conf" instead of "any"?
<TheMue> fwereade_: I'm missing the intention of that type. The name should reflect this.
<fwereade_> TheMue, it's for holding environ configurations, haven't noticed any other uses
<TheMue> fwereade_: We could reuse it naming it like "dictionary", otherwise especially for this case "environConfig".
<fwereade_> TheMue, SGTM
<TheMue> fwereade_: But those are all minors. In general it seems to be an important change to me.
<fwereade_> TheMue, fantastic
<TheMue> fwereade_: And cleaning up the test tables (all in the methods or all out of them, simplification, anonymous structs) can be done later.
<fwereade_> TheMue, yeah, I'm not sure I feel those are important enough to do until I have reason to modify them
<fwereade_> TheMue, the big reason for this change is that I'm about to extend state.Initialize and without consistency that will be a nightmare
<fwereade_> TheMue, the little reason is just low-level ongoing frustration as I complained about ;)
<TheMue> fwereade_: :D
<twobottux> aujuju: juju: ERROR Unexpected Error interacting with provider: 409 CONFLICT <http://askubuntu.com/questions/157785/juju-error-unexpected-error-interacting-with-provider-409-conflict>
#juju-dev 2013-06-24
<thumper> wallyworld_: you around?
<wallyworld_> yep
<thumper> good
 * thumper is proposing the lxc-container branch *again*
<wallyworld_> where else would i be
<thumper> wallyworld_: I dunno, fucking around?
<bigjools> enjoying a hot shower?
<wallyworld_> well, the weather is nice outside
 * thumper snorts
<wallyworld_> not funny
<thumper> well, it is shit here
<thumper> wallyworld_: how's the bathroom?
<wallyworld_> not quite done. tradesman is a total fuckwit. we'll be reporting him to the Building Services Authority, but need to get him to finish as much as we can first
<bigjools> hope you took pictures
<wallyworld_> yep
<thumper> wow
<thumper> been watching some home renovation programs on sky, damn scary what some builders do
<thumper> hoping to get some that don't suck for our work
<bigjools> I can only think of one profession that doesn't con customers more than builders
 * thumper waits for lbox to continue generating the diff
<wallyworld_> yeah. and i got the worst tiler in queensland according to people who i have talked to since
<bigjools> that does, I mean
<thumper> bigjools: software developers?
<bigjools> thumper: got it in one
<wallyworld_> politician
<bigjools> I don't consider politics to be a profession
<thumper> ok, perhaps politicians
<thumper> haha
<bigjools> serial troughers maybe
<thumper> wallyworld_:  https://codereview.appspot.com/10370044 another round?
<wallyworld_> alright
<thumper> ta
<thumper> dog is being a nutter today
<thumper> is outside
<bigjools> so much for this being a family channel :)
<thumper> yeah, I felt guilty as soon as I swore :)
<thumper> you guys are a bad influence
<bigjools> I thought I was bad until I met Ian
<thumper> heh
<thumper> I was much worse when I first went to the UK
<thumper> had to tone things down a lot
<thumper> and not use swearing as punctuation, or as adjectives too often
<bigjools> I expect the politically correct HR drones had some policy on it ...
<bigjools> everyone got asked to go on some "sensitivity" training at one of the banks I was at
<bigjools> as a contractor I told them to get fucked
<wallyworld_> thumper: i had a branch that william has +1'ed to remove that Instance.Metadata() method. if you +1 it also, i can land it and you could tweak your branch before landing https://codereview.appspot.com/10384049/
<thumper> kk
<thumper> bigjools: haha
<bigjools> thumper: there's no crisis that can't be solved by sending in the diversity coordinators, apparently
<thumper> pfft
<bigjools> was that a fart?
<thumper> no
<thumper> like *snort*
<thumper> things you miss by typing stuff
<thumper> miss all that non-word verbal communication
<bigjools> aye, it's hard to talk without waving your arms around
<thumper> wallyworld_: so right now, we are doing nothing with this metadata?
<wallyworld_> thumper: i have a branch which has been reviewed which i need to fix some things on. the branch writes the metadata to state.
<thumper> wallyworld_: what happens if I return nil, or a real struct where all the values are nil?
<wallyworld_> just return nil
<wallyworld_> it will create an empty record in the db
<wallyworld_> with just the InstanceId and Nonce
<wallyworld_> and mem, cpu cores etc will be blank
<bigjools> sigh
<bigjools> seems like Go doesn't coerce ints to floats
<thumper> haha
<thumper> no
<thumper> it doesn't even promote ints
<thumper> into bigger ints as needed
<bigjools> O_o
<thumper> needs explicit cast
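The behaviour bigjools ran into looks like this in practice. Strictly speaking Go calls these *conversions* rather than casts: there is no implicit coercion between numeric types, even between int sizes:

```go
package main

import "fmt"

func main() {
	var n int = 7
	var f float64 = 2.5

	// fmt.Println(f * n) // compile error: mismatched types float64 and int
	fmt.Println(f * float64(n)) // explicit conversion required; prints 17.5

	// Even widening an int32 into an int64 must be spelled out.
	var small int32 = 1 << 30
	var big int64 = int64(small) * 4
	fmt.Println(big) // prints 4294967296
}
```

The upside of the verbosity is that the overflow and precision behaviour of every mixed-type expression is visible at the call site.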
<thumper> wallyworld_:  https://codereview.appspot.com/10409049 for the broker
<thumper> davecheney: you could cast your "on call reviewer" eye over too maybe?
<thumper> davecheney: are you still doing on-call reviewing in your current role?
<thumper> wallyworld_: ta
<wallyworld_> np
<thumper> wallyworld_: what is the point of the PublicAddress, PrivateAddress, and InstanceId functions on environs.EnvironProvider?
<thumper> they don't seem to make any sense
<thumper> why does a provider have an ip address or instance id?
<wallyworld_> thumper: i think each machine has a provider instance. i agree it doesn't make sense
<thumper> huh?
<thumper> that is nuts
<thumper> sounds broken
<wallyworld_> i'm not across the design decisions there, before my time
 * thumper nods
<wallyworld_> but i agree with you
<thumper> I'll be up late tonight to talk with fwereade__ anyway
<thumper> I'll add it to my WTF question list
<wallyworld_> if i'm around i'll listen in
<wallyworld_> i think william wants to refactor all this stuff
<wallyworld_> there's a few interfaces with inappropriate methods
<davecheney> thumper: am I on call today ?
<davecheney> as to the other question, NFI
<thumper> davecheney: according to the juju google calendar
 * thumper does school run
<davecheney> my bad
<bigjools> wallyworld_: does the provider interface require a separation between public and private storage?  Or can it all be private?
<wallyworld_> bigjools: it could all be private. may require a tweak to some auxiliary code
<bigjools> wallyworld_: is there any use for public at all then?
<wallyworld_> yes, to store the tools
<wallyworld_> and simplestreams metadata
<bigjools> does juju upload simplestreams stuff?
<wallyworld_> no
<bigjools> thought that was separate
<bigjools> ok
<wallyworld_> it is expected to be created to match the cloud
<wallyworld_> there's a tool which can be used to make a starting metadata file
<wallyworld_> the tooling needs to be refined
<bigjools> so if one person uploads tools to a public place and then removes them, does someone else get to auto-upload tools?
<bigjools> and how is authenticity guaranteed
<wallyworld_> it's not
<bigjools> \o/
<wallyworld_> design flaw
<wallyworld_> the idea right now is that only authorised people get to write to the public bucket
<wallyworld_> so those people need to be trusted
<bigjools> so there's just one public storage
<wallyworld_> one per cloud
<wallyworld_> there's one for ec2, canonistack, hp cloud etc
<wallyworld_> the tools tarballs are currently replicated across each
<bigjools> so what's the procedure for the first upload to a public place?
<wallyworld_> the juju release manager uploads to the ec2 public bucket, then uses sync-tools cmd to help replicate to other public buckets
<bigjools> ah ok
<wallyworld_> the new upcoming simplestreams stuff makes it a bit better
<wallyworld_> and the tarballs will be signed also
<bigjools> so for our new provider, someone would upload the tools before any deployments would take place
<wallyworld_> yes
<wallyworld_> for certified public cloud partners, the tools are uploaded/maintained as part of the release process
<bigjools> presumably *someone* must be thinking about some sort of GPG signature downloadable over https from a trusted site?
<wallyworld_> yes
<wallyworld_> for private clouds, we will provide tooling to make setup easy
<bigjools> ok so I'll work on a change I need to make public storage available then
<bigjools> was wondering whether it was really needed
<wallyworld_> bigjools:  is this for a CPC partner?
<wallyworld_> bigjools: if so, then the metadata for tools and images will be hosted on a canonical url, and that metadata will point to the cloud's public bucket to actually get the tools from
<bigjools> wallyworld_: right
<bigjools> and juju at some point will gpg verify the md5 of the tools.
<wallyworld_> yes
<bigjools> please tell me that is going to happen
<wallyworld_> real soon now
<wallyworld_> working on it
<bigjools> splendid
 * bigjools thinks about booking flights to Europe
<wallyworld_> a few things are happening in that space, might be a release or two away
<thumper> bigjools: you in IOM?
<thumper> bigjools: you get to fly?
<bigjools> thumper: supposed to be there
<bigjools> still not sure about going
<wallyworld_> bigjools: in the maas provider, i'm going to add a NOP when it comes to returning hardware characteristics about a started instance (mem, cores etc). i'd like to raise a bug for it. ec2 and openstack will have this supported. will it take much to do for maas?
<thumper> wallyworld_: are you going to land the instance metadata work?
 * thumper off to take kids ice skating
<thumper> back after dinner
 * thumper away
<bigjools> wallyworld_: easy, all that stuff is already supported for the python provider
<bigjools> so why not fix it :)
<wallyworld_> bigjools: i'll just land the current branch for now and raise a bug. thumper is queued up behind this branch
<bigjools> yeah no worries
<bigjools> it was on our hitlist when doing the new provider
<wallyworld_> bigjools: bug 1193998
<_mup_> Bug #1193998: mass provider doesn't return hardware characteristics of started instances <juju-core:Triaged> <https://launchpad.net/bugs/1193998>
<bigjools> ta.  if we get to it, we get to it
<wallyworld_> fwereade__: ping
<jam> wallyworld_: sorry I missed our 1:1. I rebooted my machine, and then wasn't paying attention. I'm making a coffee real quick and then I'll be around if you are available.
<wallyworld_> jam: hi, no problem. i saw you log off. can we reschedule to tomorrow. a *new* tiler has just arrived and i have a little bit of stuff to sort out
 * wallyworld_ still has no finished bathroom :-(
<jam> wallyworld_: wow. That bathroom is causing you so much grief. Time to just put in an Outhouse and be done with it. :)
<jam> wallyworld_: we can reschedule for tomorrow, no problem
<jam> can someone give a quick look at https://codereview.appspot.com/10465043/ it is a pretty small patch
<rogpeppe1> mornin' all
<rogpeppe1> fwereade__: ping
<wallyworld_> jam: question - i got a mp approved, and did a few small bits of followup after the lgtm, but now the bot is saying i have unapproved revisions :-(
<wallyworld_> that never used to happen
<jam> wallyworld_: you have to push, and let Launchpad see the new revs, and then Approve
<jam> I re-approved it for you
<jam> wallyworld_: you might also need to reload the MP
<wallyworld_> ok, thanks. i thought i had pushed but maybe not
<jam> as it stores the 'current' revision when it is loaded, and sets the approved revision to the one it saw
<jam> wallyworld_: you did push
<jam> you didn't wait for the MP to refresh before you marked it Approved
<wallyworld_> ah ok. in too much of a hurry
<jam> so the MP was at eg rev 10, you pushed rev 11, and marked rev 10 as approved, but not rev 11
<rogpeppe> wallyworld_: any chance of a review of https://codereview.appspot.com/10447047/ or https://codereview.appspot.com/10259049/ ? would be much appreciated.
<wallyworld_> rogpeppe: i'm running late for soccer. i'll be back in about 3 hrs. sorry
<rogpeppe> wallyworld_: np
<rogpeppe> wallyworld_: thought i was probably too late, just pushing my luck :-)
<rvba> wallyworld_: Hi… jam's explanation on MP is enough (I think) for me to work on the missing URL() method… but I'd rather work on it in a different branch and get this one (https://codereview.appspot.com/10237046/) reviewed now… could you please have another look when you're back from soccer?
<fwereade__> rogpeppe, heyhey
<rogpeppe> fwereade__: yo!
<rogpeppe> fwereade__: wondering if you fancy a chat about moving the api connect stuff forward
<fwereade__> rogpeppe, yeah, sounds sensible, would you start a hangout please?
<rogpeppe> fwereade__: doing
<rogpeppe> fwereade__: https://plus.google.com/hangouts/_/b51c68bb41b0ebd90ac5ae51531a89f50e298495?authuser=0&hl=en-GB
<jam> rogpeppe, fwereade__: Mind if I listen in? I probably won't add much, but I'd like to follow the conversation.
<jam> danilos: be there in just a second
<danilos> jam: ack
<thumper> fwereade__: hey
<fwereade__> thumper, heyhey
<thumper> fwereade__: got some time? I've got lots of questions
<fwereade__> thumper, how's it going?
<fwereade__> thumper, sure, just a sec
<fwereade__> thumper, ready
 * thumper starts a hangout
<bigjools> fwereade__: have you got provider-specific constraints on the plan?
<fwereade__> bigjools, sorry, 5 mins
<bigjools> fwereade__: we lost you
<fwereade__> heh, I can still hear you
<fwereade__> bigjools, rvba, I was just asking if it was just maas-name and maas-tags, or if we missed something else
<fwereade__> bigjools, rvba, if you've ok piggybacking name on tags I'm ok with adding tags
<rvba> fwereade__: that's all: maas-name (can probably be supported as tags) and maas-tags.
<jtv> fwereade__: does my MP from last week look OK now?  https://codereview.appspot.com/10407045/
<fwereade__> jtv, huh, looks like you're right about the validation on SetConfig. grar.
<fwereade__> jtv, I don't think it should really be the SetEnvironmentCommand's responsibility; I could accept it being *state*'s responsibility not to mess it up, but state can't do that very easily due to some decisions we made a while back
<fwereade__> bah
<jtv> fwereade__: true, the place where the responsibility really belongs is missing, isn't it?
<jtv> I was also wondering why the ec2 and openstack providers continue to write to  environ.name when it's being read without locking.
<fwereade__> jtv, search me... I would honestly not hold up any of our providers as shining beacons of best practice
<jtv> Functionality is easier to retrofit than concurrency correctness.
<fwereade__> jtv, would you be ok adding the validation in SetConfig regardless of what ec2/openstack do?
<jtv> already did – didn't the diff update?
<fwereade__> jtv, can't see the changes myself
<fwereade__> jtv, it doesn't happen automatically on push, you need to `lbox propose` again
<jtv> Argh
<jtv> Our lives would become so much easier if we could just use one kind of reviews...
<jtv> It's especially annoying if it means switching back to another branch while you're already working on another.
<fwereade__> jtv, heh, yeah, I can understand that
<rvba> wallyworld_: I just realized I forgot to lbox propose again (with the fixes) so you probably did not see the diff being updated, I've done it now (https://codereview.appspot.com/10237046/).
<jtv> (Also, it'd be nice if lbox could actually bother to check for fatal errors...  someone once wrote that whoever doesn't do that in Go ought to be fired :)
<fwereade__> jtv, I would be loath to give up the line-specific comments, but it seems to be fading in importance a little given the trunk-gating
 * fwereade__ refrains from comment
<jtv> What's trunk-gating?
<jtv> It sounds lumber-related.
<fwereade__> jtv, we now have tarmac to land approved branches
<jtv> Ah that
<jtv> Do the two interact?
<fwereade__> jtv, not very much, no -- `lbox submit -tarmac` might be nice, though
<fwereade__> jtv, we've had a few branches get in without being `go fmt`ed
<jtv> It'd also need to submit a commit message, rather than just a description...
<fwereade__> jtv, yeah, that's not such a huge change, really
<fwereade__> jtv, but nobody's actually working on changing it at the moment
<jtv> I also find the line-based comments rather hard to work with as a reviewee...  I'd be perfectly happy to go with the normal Canonical procedure completely.
<jtv> Well, it looks like I've re-proposed.  Let's see if the branches will marry.
<jam1> fwereade__: the tarmac rules should be "go fmt ./... && go build ./... && go test ./..." so we shouldn't have anything land without formatting. Are you talking about the lbox rules?
<fwereade__> jam1, hmm, I've had lbox complain about go fmt issues on propose twice now
<fwereade__> jam1, does it definitely merge the code it runs, and not just merge the source branch from which it got the code?
<fwereade__> jam,^
<mgz> double underscore versus 1...
<jam> fwereade__: It should run the test suite, and then commit/not commit, It doesn't restage the commit a second time after the test suite passed. (AFAICT)
<mgz> running go fmt as part of tarmac seems a little risky, as that has changed between go versions so is inconsistent
<jam> mgz: it just means we're 100% consistent in what lands on trunk :)
<fwereade__> mgz, jam, ahh -- remind me, which go does tarmac use?
<mgz> but does it actually commit post fmt?
<jam> mgz: it does the fmt pre merge check
<jam> so it should land it post fmt
<mgz> I guess we trust fmt not to break stuff...
<jam> mgz: I just checked the last 20 revs, and none of them had complaints for *my* go fmt.
<mgz> it's 1.0 from distro on the bot right?
<jam> fwereade__: so I know lbox check uses 'gofmt' rather than 'go fmt' I don't know if there is a difference there. But I just checked the last 100 commits to trunk, and 100% of them have 0 output running 'go fmt ./...'
<jam> to verify, I manually mutated a file while it was running, and it complained.
<jam> fwereade__: so if you see lbox complain, let me know
<jam> but I can't reproduce (and I'm guessing it might be because of your own local changes)
<jam> fwereade__: could it be a go 1.1 vs 1.0.3 formatting issue?
<jam> mgz: https://plus.google.com/hangouts/_/8868e66b07fa02bdc903be4601200d470dae9ee3
<mgz> ya
<fwereade__> jam, yeah, I was wondering about that
<fwereade__> jam, I'm on 1.1
<jam1> fwereade__: so go 1.0.3 is happy with all of the last 100 commits.
<jam1> fwereade__: so it must be you :)
<fwereade__> jam, ha, ok, I'll look closer next time I see it and try to figure it out
<fwereade__> jam, 1.1 vs 1.0.3 seems likely though
<rogpeppe1> any chance of a review of https://codereview.appspot.com/10447047/, please? it's considerably more deleted code than added code.
<rogpeppe1> jam: there are definitely gofmt 1.1 vs 1.0.3 differences
<rogpeppe1> jam: there's no difference between gofmt and go fmt (they use the same tool)
<jam> rogpeppe1: they take different args, at least.
<rogpeppe1> jam: go fmt calls gofmt
 * rogpeppe1 quickly goes to check he's not talking out of his arse
<rogpeppe1> yes, that's true
<rogpeppe1> jam: it's even in the docs: "
<rogpeppe1> Fmt runs the command 'gofmt -l -w' on the packages named
<rogpeppe1> by the import paths.
<rogpeppe1> "
<rogpeppe> jam: i have the go1.0.3 version of gofmt in my PATH, so that's what i always use to format code
<TheMue> rogpeppe: you've got a review
<rogpeppe> TheMue: thanks!
<wallyworld_> rvba: hi, sorry i missed your earlier message. i was at soccer. i looked at the codereview link but the 2nd patchset is not there. i looked at the diff in lp and it looks ok. i just have one more minor comment - use %q instead of '%s' to quote a string value in a printf call
<rogpeppe> mgz: any chance you'll be able to land https://codereview.appspot.com/10439043/ today? i have a branch that depends on it.
<jam> fwereade__: you asked for some more tests about the Machiner watcher code (to test the API .Next() behavior)
<jam> fwereade__: from what I can see, there is 0 client code that follows any watcher that was set up in the API
<fwereade__> jam, ah, yes -- I guess there's some subtlety that I'm missing?
<jam> there is a function "state/api/watcher.go" that has "newEntityWatcher" which is not an exposed function, and nothing calls it.
<jam> fwereade__: so the code adds an ability to start a watcher on the server
<jam> but AFAICT there is no functionality for actually following that watcher on the client.
<fwereade__> jam, I was hoping that the unit tests for the machiner would grow the infrastructure necessary to make a naked Next call with the returned watcher id and check that it does something
<fwereade__> jam, the client side is a different problem, I think
<rvba> wallyworld_: I think the new mp is there: https://codereview.appspot.com/10478045/
<jam> fwereade__: so there should be a APIServer.resources[id] object
<fwereade__> jam, I'm only thinking of the unit-ish-testing of the server side
<jam> fwereade__: is that reasonable?
<mgz> rogpeppe: as jam says, trying to do that now, but working out how to get reasonable test coverage is frustrating me
<jam> fwereade__: also, *I* am a little confused about what Next() is vs what Changes() is .
 * rogpeppe goes for lunch
<rvba> wallyworld_: I did not realize re-proposing a branch creates a new MP.
<fwereade__> jam, hmm, isn't the resources stuff done via an interface anyway?
<wallyworld_> rvba: no problem, looking now
<fwereade__> jam, so we can completely control that, I think
<fwereade__> jam, Next() means please-read-from-Changes()-and-return-what-you-got
<fwereade__> jam, but in fact I don't even care about that here, now I think about it
<fwereade__> jam, the Resourcerer or whatever interface will presumably be expected to have a watcher registered
<wallyworld_> rvba: done. thanks for adding the notfound code
<fwereade__> jam, the mock can check the registered resource is sane, and pick an id to give back to the machiner, and the test can then check that id came back
<fwereade__> jam, sane? I may be forgetting something about the resources, I don't know it well
<rvba> wallyworld_: thanks for spotting that problem :)
<wallyworld_> no wuckers
<fwereade__> jam, fwiw, there is some rambling in the -wip https://codereview.appspot.com/10495043 that I was working on over the weekend
<jam> fwereade__: right now the API for Resources is that they have a "Stop()" method.
<jam> however, he could probably cast it to the actual Watcher
<jam> because we humans know its actual type
<fwereade__> jam, +1
<fwereade__> jam, in fact, independent of its relevance this second, I would appreciate your thoughts on that CL
<fwereade__> jam, it's a bit dashed off because I have other stuff to do but I hope to land something related eventually
<jam> fwereade__: I will try to look at the code you mention.
<jam> fwereade__: in the mean time, you mention Next
<fwereade__> jtv, LGTM, just one tweak
<jam> but all I specifically see is "Changes"
<jtv> Thanks fwereade__
<fwereade__> jam, yeah, I had my layers confused
<jam> the thing stored in the resources is a state.EntityWatcher
<fwereade__> jam, and there was/will be some sort of Next method somewhere that takes a resource Id and returns the next thing to come out of its changes channel (or an error if it's stopped)
<fwereade__> jam, it appears that does not exist at present
<fwereade__> jam, but it's not actually needed for testing this
<fwereade__> jam, because we can write unitier tests anyway without bringing it into the mix
<fwereade__> jam, sorry, I didn't think that one all the way through
<jam> fwereade__: right, we were a bit confused because we couldn't find (a) anything client-side that could actually follow a watcher and (b) something called Next() :)
<jam> fwereade__: apparently (a) has been designed and is called Next()
<fwereade__> jam, that said, to write a client watcher, we will need to make Next calls -- and there is code in history that does that
<jam> but hasn't been implemented
<fwereade__> jam, I think it was removed with the change
<jam> fwereade__: there is nothing in trunk today that I could find.
<jam> fwereade__: well, there is for the AllWatcher
<fwereade__> jam, it would have gone in one of dimitern's commits the week before he left
<jam> but not for the EntityWatchers that we have.
<fwereade__> jam, if you find that code it will serve as a useful model
<fwereade__> jam, but it had a couple of issues -- I think the entitywatcher variant was ok, but the lifecyclewatcher would drop events
<fwereade__> jam, there are a couple of possible approaches to that
<jam> fwereade__: drop all events so nobody notices?
<jam> :)
<fwereade__> jam, the description in CL above really is relevant there :)
<fwereade__> jam, sorted
<fwereade__> jam, let's go shopping
<fwereade__> jam, basically we can either implement client-side coalescence when a second change comes in before we've delivered the first
<fwereade__> jam, (lots of work, duplicated from elsewhere, although it could probably be factored into cleanliness)
<fwereade__> jam, or we can flip-flop client-side watchers from send to receive mode
<fwereade__> jam, so we get the result of Next(), then sit and wait until our client reads it from our channel, and only then make another Next call
<fwereade__> jam, this is pretty nice, and a hell of a lot simpler
<jam> fwereade__: doesn't that leave you with stale Next() results? Is it possible to not call Next until someone actually cares and we have the coallescing done in the server?
<fwereade__> jam, well, the Next calls do in fact cause some coalescence on the server
<fwereade__> jam, because the watcher coalesces internally until someone reads from Changes, and that only happens when we call Next
<fwereade__> jam, however, there is going to be an unintended consequence there with the code as it stands
<fwereade__> jam, specifically: (1) api server syncs state (2) client calls Next, no more changes (3) state syncs again, 100 changes come in (4) the first change is delivered on Changes and returned from Next (5) the client has to handle that one, and call Next *again* before he sees the rest of that block of 100 changes
<jam> fwereade__: do you see a block of changes, or just one-by-one ?
<fwereade__> jam, in theory you can see blocks
<fwereade__> jam, how much you do in practice depends on the specific watcher
<jam> fwereade__: EntityWatcher.Changes() has a channel that is essentially just a boolean "something happened"
<jam> I guess LifeCycleWatcher gets an array of strings?
<fwereade__> jam, and they are all currently implemented in a way that, given the characteristics of the underlying state/watcher implementation, tends towards single events
<jam> s/array/slice/
<fwereade__> jam, yeah
<fwereade__> jam, coalescence is indeed much easier for the entity watchers
<fwereade__> jam, but even then, it tends towards suboptimal behaviour
<fwereade__> jam, hmm, actually, maybe not so much there
<fwereade__> jam, maybe single documents are ok across the board
<jam> fwereade__: well getting 100 'something has changed' calls all at once is a bit annoying. It really depends how the backlog is treated.
<fwereade__> jam, agreed
<fwereade__> jam, that shouldn't be a problem with the entity watcher
<fwereade__> jam, it is a problem I recently discovered in the scope watcher, though
<fwereade__> jam, that's that CL
<jam> fwereade__: mgz added a call to Changes, does it look reasonable to you? (https://code.launchpad.net/~gz/juju-core/057-api-machiner-watch/+merge/170586 line 212, CL is in the process of updating)
<fwereade> jam, mgz, that looks perfect in essence, but I'm suspicious of getting that event out of it; just a sec
<mgz> that event?
<mgz> NotNil is all I'm asking of it
<mgz> I'd *like* to make that a useful assertion
<fwereade> mgz, according to the previous implementation you wouldn't have got one at all
<mgz> and maybe have proper checks for yeah, no event at all/timeout/whatver
<fwereade> mgz, because that one read and returned the initial guaranteed Changes() event as part of the Watch() call
<fwereade> mgz, that approach makes some sense, I think, because all watchers are meant to return an initial event describing the difference between no-state and the current state
<fwereade> mgz, and it would be nice to help the client watcher implementation by delivering the first event straight-off
<fwereade> mgz, so in that case you'd be able to extract a resource that's an EntityWatcher for the machine you know about, and do a quick check that no events are produced until you write a change to the machine
<mgz> fwereade: I would *really* like to see an example test along those lines to work off
<fwereade> mgz, just as you do, but with one extra check verifying the bit of behaviour you missed because you had no way to know about it
<fwereade> mgz, I just (at least partially) unfucked most of the ones in state actually
<fwereade> mgz, state/watcher_test.go
<mgz> thanks
<fwereade> mgz, they're internal at the moment but I'm fine promoting them somewhere else if they're useful to you
<mgz> what do you mean by internal?
<fwereade> mgz, NotifyWatcherC looks like it might actually be *exactly* what you need
<fwereade> mgz, in the state_test package
<mgz> but... I could just use that, no?
<fwereade> mgz, hmm, I thought you didn't get _test.go code compiled in except for the package currently under test
<jam> fwereade: correct
<mgz> surely I can just import state/state_test
<jam> you have to move it to a testing module
<jam> mgz: nope
<mgz> ;_;
<jam> it would be 'state_test' but that won't look up in the module path
<mgz> okay, so I'll land this with copy-code, then maybe move that
 * fwereade grudgingly approves of mgz's plan, so long as he follows up with the fix asap ;p
<mgz> ...I keep getting "panic: watcher was stopped cleanly" instead of any useful failures
<rogpeppe> mgz: just looked at your branch
<rogpeppe> mgz: one thing:
<mgz> I guess the stop needs to be a defer or something
<rogpeppe> mgz: the event coming off the entity watcher channel can never be nil
<rogpeppe> mgz: it's a struct{}
<mgz> rogpeppe: that was an entirely stub assert
<rogpeppe> mgz: ah
<rogpeppe> mgz: the test should probably be something like: select {case _, ok := <-channel: c.Assert(ok, Equals, true); case <-time.After(timeout): c.Fatal("timed out")}
<rogpeppe> niemeyer: hiya
<niemeyer> rogpeppe: Hey, good morning
<mgz> rogpeppe: I tried that... (roughly)
<mgz> I don't see why I'm getting into MustErr at all...
<rogpeppe> mgz: and that's when you got your "watcher stopped cleanly" error?
<rogpeppe> mgz: hmm
<rogpeppe> mgz: where were you seeing that error?
<mgz> http://paste.ubuntu.com/5795495/
<mgz> I get that with my branch just by making an assert fail inside the select
<mgz> rogpeppe: ^
<mgz> I'll have some lunch, then everything will make sense
 * rogpeppe pulls mgz's branch
<rogpeppe> mgz: what revision of mgo are you using? it's probably not a related issue, but all those goroutines stuck at labix.org/v2/mgo/server.go:227 look suspicious
<rogpeppe> mgz: hmm, the tests pass for me (when testing that package on its own, at any rate)
<rogpeppe> mgz: tests pass for me altogether - i cannot reproduce your issue
<rvba> Could you guys please update the gwacl library in the landing environment (or tell me how to do it)?  I've made a change to gwacl recently and I need it to land a branch in juju-core.
<mgz> hm, not having any joy isolating this
<rvba> mgz: sorry to bother you, but would you happen to know how to update gwacl in juju-core's landing environment?  I can't land my approved branch because I need the last version of gwacl there.
<rvba> jam: ^ ? I know you're the tarmac master :).
<fwereade> rogpeppe, LGTM
<rogpeppe> fwereade: ta!
<mgz> rvba: sorry, didn't answer because I'd need to find out
<rogpeppe> fwereade: i was surprised how easy it was actually
<mgz> I think it's pretty trivial to bump the dep
<rvba> mgz: I guess bumping the dep is easy… but I'm not sure who has access to the tarmac machine… I certainly don't.
<mgz> hm, can't find the details anywhere, I'll bug jam when he's around again
<rvba> mgz: okay, thanks.
<rogpeppe> mgz: are you still having the "watcher died cleanly" issue?
<mgz> rogpeppe: yeah, with trunk mgo
<rogpeppe> mgz: go1.0.3 ?
<mgz> it's fine if the test passes, I just can't get it to fail
<rogpeppe> mgz: ah, i think i know what your problem might be
<mgz> rogpeppe: 1.0.2-2 from dist on raring
<rogpeppe> mgz: if your test fails, you will never stop the watcher
<rogpeppe> mgz: so the state will be torn down and the watcher will see its state/watcher torn down
<rogpeppe> mgz: tbh the error message should be better
<mgz> right, that was my assumption, need the stop in some kind of defer
<rogpeppe> mgz: i think using MustErr in EntityWatcher.loop is misleading
<mgz> but... the error/exception being backwards... and a panic, is mysterious
<rogpeppe> mgz: there's no indication that the *underlying* state/watcher Watcher might have been torn down
<rogpeppe> mgz: i think it's worth changing the error message there, probably
<rogpeppe> mgz: yes, you need to defer the Stop
<jam> rvba: will updating gwacl break juju core without your branch?
<rvba> jam: no
<jam> rvba: generally you have to ping someone who knows how to ssh into tarmac-bot, I've told some people (wallyworld has done it), but I'm a reasonable alternative.
<jam> rvba: now on 129
<rvba> jam: but that's a good question… what if I wanted to make a non-backward compatible change to gwacl (and then land the required change in the provider)?
<jam> rvba: we don't have dep management, so it is either always run tip and update on every revision, or do it manually.
<jam> rvba: then we need to land your patch as we update the dependency
<rvba> jam: all right :)
<jam> rvba: we want to end up with a file that says what version of each dependency to use, but we don't have that yet
<mgz> jam: can you record the ip of the machine somewhere?
<jam> mgz: 'nova list lcy02' when you use the shared credentials
<mgz> I'm assuming I could get in if I knew where it was :)
<jam> mgz: 10.55.63.190
<mgz> ah, yeah, that'd work
<rvba> jam: thanks for your help… I'm landing my branch now.
<jam> mgz: ssh in as ubuntu, sudo su - tarmac
<rogpeppe> fwereade: you've got a review of https://codereview.appspot.com/10495043/
<rogpeppe> right, time to go and trim a hedge
<rogpeppe> g'night all
<thumper> morning folks
<hatch> morning thumper
<mramm2> morning thumper
<thumper> mramm2: writing an email :)
<thumper> mramm2: but thought you might like to catch up after that?
<mramm2> sure
<thumper> wallyworld: can we chat when you are around?
<wallyworld> thumper: give me 30 mins or so. gotta have breakfast and tidy the house for the cleaner
<thumper> wallyworld: sure, np
<thumper> fwereade: around in your evening?
<dpb1> thumper: hi, if I have some feedback on an MP, where should I put it?
<dpb1> this one specifically, I just tested it: https://code.launchpad.net/~themue/juju-core/029-config-get-all/+merge/170789
<thumper> dpb1: at the moment, in the reitveld review
<thumper> otherwise the dev will likely miss it
<dpb1> thumper: ok
<thumper> dpb1: bonus points if you log in to reitveld with an email address that launchpad knows about :)
<dpb1> ahh, ok  will do
<wallyworld> thumper: hi, did you want a chat now? https://plus.google.com/hangouts/_/8868e66b07fa02bdc903be4601200d470dae9ee3
<thumper> wallyworld: ack
<thumper> wallyworld: https://codereview.appspot.com/10478043/
<wallyworld> thumper: in the above mp, the 2nd patch set doesn't seem to be pushed up
<thumper> wallyworld: it hasn't
<thumper> wallyworld: because it is a one word addition to an error
<thumper> wallyworld: and a documentation change
<thumper> s/function/method
<wallyworld> sure, normally when i see Done i look for the change to be there
<thumper> :)
 * thumper considers food
<thumper> and perhaps another coffee
 * thumper goes to inspect the fridge
#juju-dev 2013-06-25
<bigjools> why do we have canonical-written code that isn't hosted in LP?
<davecheney> bigjools: which code ?
<bigjools> mgo
<davecheney> bigjools: https://launchpad.net/mgo
<davecheney> it is
<davecheney> but gustavo is using a 'vanity url'
<bigjools> ah ok
<bigjools> thanks
<niemeyer> bigjools: mgo is not Canonical written, by the way
<niemeyer> bigjools: it is Gustavo written
<bigjools> I'd change that purely from the PoV of getting l<TAB> to work in the shell :)
<bigjools> niemeyer: under Canonical time?
<niemeyer> bigjools: under Gustavo time
 * davecheney butts out
<niemeyer> Its creation predates even Canonical's adoption in Go, so there's little to be said
<niemeyer> s/in Go/of Go/
<wallyworld> thumper: have you ever had a bzr pointless merge error?
<thumper> yes
<thumper> wallyworld: I'm off to walk the dog now
<thumper> bbl
<wallyworld> thumper: ok, can you ping me when you get back
<wallyworld> WARNING  Merging https://code.launchpad.net/~wallyworld/goose/null-project-description into https://code.launchpad.net/~go-bot/goose/trunk would be pointless
<wallyworld> i'm not sure why the merge fails
<wallyworld> the mp shows a nice diff etc
<thumper> wallyworld: interesting, haven't had exactly that before
<thumper> wallyworld: try merging trunk into null-project-description first
<thumper> wallyworld: also, try 'bzr missing --mine ' from the null-project
<wallyworld> thumper: ok. i had done that prior to proposing, i'd swear, but i'll try again
<wallyworld> thumper: i've commented on https://codereview.appspot.com/10447045/, perhaps you could do the same
 * thumper looks
<wallyworld> thumper: so, no revs to merge in from trunk, and bzr missing shows what i would expect https://pastebin.canonical.com/93338/
<thumper> wallyworld: ok, I have no idea
<wallyworld> np
<wallyworld> i'll bug john later
<wallyworld> thumper: i'd like another quick chat about constraints when you have a moment
<thumper> wallyworld: sure, let me go make a drink and I'll be right back
<wallyworld> kk
<thumper> got a hangout?
<wallyworld> i'll make one
<wallyworld> https://plus.google.com/hangouts/_/8868e66b07fa02bdc903be4601200d470dae9ee3
 * thumper bbl for more fwereade chats :)
<rogpeppe> mornin' all
<jam> morning rogpeppe, you seem up early
<rogpeppe> jam: i wanna get stuff done before i go away on thurs
<wallyworld> jam: i have to take my son to the dr but do you know why i get a bzr pointless merge error trying to get my branch into trunk?
<wallyworld> WARNING  Merging https://code.launchpad.net/~wallyworld/goose/null-project-description into https://code.launchpad.net/~go-bot/goose/trunk would be pointless
<wallyworld> i get the above in the tarmac log
<wallyworld> the mp diff looks fine
<wallyworld> bzr missing looks fine - shows the 2 revs that i have committed
<wallyworld> tim doesn't know what's wrong
<wallyworld> maybe you do?
<wallyworld> i'll check in when i get back
<jam> wallyworld: because goose successfully merged and committed your change, but failed to push it back to Launchpad
<jam> so its local branch has merged your changes.
<jam> wallyworld: I just pushed it out. I don't specifically know why it would have gotten into this situation.
<rogpeppe> jam: i have some outstanding reviews, BTW, which it would be great to get moving; in particular fwereade has verbally ok'd this; i need another review and i'd appreciate your input. https://codereview.appspot.com/10259049/
<bigjools> hi.  We need to generate an iso9660 from the user data that the Azure provider needs. We can shell out, or invoke some C, or .... anything better you can think of?
<jam> bigjools: the recommendation that seems to come from the lxc work is that you should shell out
<bigjools> jam: ok cheers.
<rogpeppe> bigjools: would it be that hard to write a little package that generates an iso9660 image? the format doesn't look that abstruse, at first glance anyway.
<bigjools> rogpeppe: I'd really rather not reinvent the wheel
<rogpeppe> bigjools: i'm slightly concerned about juju acquiring many external dependencies. is there something that can produce an iso9660 image installed by default in ubuntu?
<bigjools> yes
<bigjools> what's the problem with external dependencies?
<rogpeppe> bigjools: they might not always be available on platforms we want juju to run on.
<bigjools> ok
<bigjools> the one I am looking at is genisoimage
<bigjools> ah it gets installed by ubuntu-desktop
<bigjools> rogpeppe: I think we'll shell out for now (we're short on time) but the motivated can easily replace it with a native piece of code
<bigjools> at least, it's good to show the rest of the code works before spending the time writing this
<rogpeppe> bigjools: that seems reasonable. you could make a package for it anyway, designed to be reimplemented at some future point.
<bigjools> rogpeppe: yes, that would be good
<bigjools> gives a nice clean break
<TheMue> morning
<wallyworld> fwereade: would you be free for a quick chat?
<fwereade> wallyworld, sure, start a hangout, with you in a sec
<wallyworld> fwereade: https://plus.google.com/hangouts/_/8868e66b07fa02bdc903be4601200d470dae9ee3
<thumper> jam: hello there
<thumper> jam: landing issue because I tried to add a new dependency
<thumper> jam: launchpad.net/golxc
<thumper> mgz: morning
<thumper> jam: oh, just saw the emails, seems like you are on it
<mgz> hey thumper
<thumper> mgz: hey, have you managed to hand off the api stuff yet?
<mgz> thumper: yeah... +- one annoying branch that's ready to land
<jam> thumper: I installed the new dependency, still fails
<jam> # launchpad.net/juju-core/container/lxc container/lxc/instance.go:49: undefined: instance.Metadata
<thumper> oh ffs
<thumper> that's right
<thumper> wallyworld: removed it again
<thumper> I'll have to fix it tomorrow
<thumper> jam: but thanks
<jam> thumper: np. Poke me if you need updated golxc/etc.
<thumper> ok, ta
<wallyworld> thumper: i did tell you :-)
<rogpeppe> mgz: is that branch really ready to land now? i'm blocked on it
<mgz> rogpeppe: really :)
<rogpeppe> mgz: i don't see anything different from yesterday (assuming we're both talking about https://codereview.appspot.com/10439043)
<mgz> pushing now-ish
<jam> rogpeppe: so the reason thumper went with another package name was so that you could do: import . "launchpad.net/juju-core/testing/checkers" is there a reason you prefer not to use '.' ?
<jam> (the idea is to make it act the same as gocheck checkers)
<rogpeppe> jam: yeah, i really don't think we should do any more importing to .
<rogpeppe> jam: having one package imported from . is bad enough
<rogpeppe> jam: and i don't think the saved typing is a good enough justification
<jam> rogpeppe: I'd rather be consistent, so we should probably have that discussion on the list. thumper's assertion at least was that without 'import .' there isn't really a benefit of 'testing.IsTrue' over (Equals, true)
<jam> rogpeppe: I do think it makes the code 'read' better.
<jam> c.Assert(err, Satisfies, errors.IsNotFoundError) without the extra "checkers." in there.
<rogpeppe> jam: we're introducing pollution to the local name space. there's a reason Go doesn't do that all the time
<rogpeppe> jam: yes, i understand that point
<jam> rogpeppe: so at least my argument is: either (a) put them in 'testing' and import it as a module or (b) put them in checkers and import them as '.'
<rogpeppe> jam: i hadn't realised that was the only reason for putting the checkers in a new package
<rogpeppe> jam: in which case i'd put them all in testing
<rogpeppe> jam: there is another possible reason for putting them in a new package, which is that testing has lots of dependencies, where checkers has almost none.
<mgz> rogpeppe: 's up
<rogpeppe> mgz: ta
<mgz> I think william might have wanted something different on the initial-event handling, but I didn't understand where he wanted the change, and just landing this seems... like a desired thing
<jam> mgz: for the test, I think if we move NotifyWatcherC somewhere we can use it, then you can just NotifyWatcherC(resource).AssertOneChange()
<mgz> yup, though I got the impression he wanted the actual logic, not the test, to change? I wasn't completely clear.
<rogpeppe> jam: surely reading from a channel with a timeout is not something we *need* to factor out
<rogpeppe> jam: we do it all over the place
<jam> mgz: "then do some basic verification of the watcher's state with something like NotifyWatcherC"
<mgz> rogpeppe: the current code, for instance, does not check that there aren't further events
<jam> rogpeppe: I think NotifyWatcherC is intended to become something more than just reading off a channel.
<rogpeppe> mgz: that's true, but we aren't testing the watcher here
<mgz> helpers are as much about making sure everyone gets the code right as saving typing
<rogpeppe> mgz: we're testing that the watcher is there and watching the right thing
<rogpeppe> mgz: your test tests the former but not actually the latter
<rogpeppe> mgz: to be honest, i think the client test should test that and kill two birds with one stone
<mgz> I agree there's some question over what should be covered in the client tests rather than here
<rogpeppe> mgz: originally i *only* did client tests, reasoning that they cover almost exactly the same ground
<rogpeppe> mgz: but since the advent of bulk-for-everything (and the lack of a client interface to that), server-side-only tests are necessary
<jam> mgz: testing that the side effect happened is very unit-y vs integration-y, and since we don't have a client-side thing yet...
<jam> rogpeppe: while integration tests can cover everything unit-tests do, they tend to be overbroad and trigger too many failures when something low-level changes, vs a unit test that tends to be more precise. (I'm personally in favor of having both)
<rogpeppe> jam: i don't really see that interposing the API server makes it that much more of an integration test
<rogpeppe> jam: we're still calling all the way down to mongo
<jam> rogpeppe: more pieces == more integration
<jam> I don't see how that is hard to see.
<jam> testing-per-layer is a good thing to do IMO
<rogpeppe> jam: i'm more interested in test coverage
<rogpeppe> jam: and keeping tests from taking hours
<rogpeppe> jam: and i dislike having a lot of seriously overlapping test code
<rogpeppe> jam: because it wastes time and energy
<rogpeppe> jam: i do see your point about the failures being harder to diagnose though
<jam> rogpeppe: easier to diagnose failures often save *lots* of debugging time. Which in the lifetime of a project can easily dominate the overall cost.
<jam> rogpeppe: I agree spending huge amounts of runtime testing the same codepath is a bit of a waste.
<jam> I personally am in favor of layer testing, and a small number of integration tests.
<jam> So you don't have to look at all the edge cases at integration time
<jam> and just cover things that fail because of the combination, and general "does it work" tests.
<TheMue> fwereade: ping
<rogpeppe> jam: in this case, i tend to see the server type + the rpc package as one "layer"
<rogpeppe> jam: and that nothing other than the rpc package will ever talk to the server types
<mgz> so, I have a friend who might be interested in helping with the containerisation work
<rogpeppe> jam: and the overhead of talking through that is fairly minimal
<mgz> thumper, have we got any reasonably seperate parts that someone could have a go at?
<jam> rogpeppe: except anytime you have an RPC you *really* want to test them in isolation, so you don't run into the "client 1.11 tests all pass, and 1.10 tests all pass, but 1.11 can't talk to 1.10 because we weren't asserting what the conversation actually was"
<TheMue> jam: thx for review
<rogpeppe> jam: that's an interesting point; i think we need both.
<TheMue> jam: I came to chmod the dir because it is created externally in tests. this can also happen later
<rogpeppe> jam: i'm not entirely sure of the best way to do the compatibility checks.
<TheMue> jam: that's why I correct it during the initial writing ;)
<rogpeppe> mgz: you have a review
<jam> rogpeppe: I agree we need both
<rogpeppe> jam: i think we could have automated tests for rpc message compatibility, but version compatibility involves much more than just the format of the rpc messages
<frankban> fwereade, anyone else: could you please review  https://codereview.appspot.com/10497043 ? Thanks!
<rogpeppe> frankban: looking
<frankban> thanks
<fwereade> frankban, I'll take a look after lunch
<fwereade> TheMue, pong, very quickly
<frankban> fwereade: great thank you
<fwereade> mgz, rogpeppe, jam: I had liked rogpeppe's(?) original model in which the Watch call returns only when the guaranteed initial event has been read off the watcher, so the client-side watcher can hand that straight over to the out chan on creation, which seems kinda nice
<rogpeppe> fwereade: ah, good catch
<fwereade> mgz, rogpeppe, jam: ofc this is a degenerate case, there's no actual data to send, but for consistency's sake we should still be consuming the initial event before we hand it over as a resource to be Watch()ed
<rogpeppe> mgz: yeah, it should do that
<mgz> fwereade: okay, that's the statement from the review I wasn't clear on
<fwereade> mgz, Next()ed, rather
<mgz> so, the Watch call needs to pull from the channel, then the test needs to assert there are no further events
<fwereade> mgz, yeah, do an AssertNoChange, tweak the machine, do an AssertOneChange
<rogpeppe> mgz: yes to the former; for the latter, i'd change something in the Machine, Sync, and verify you get a change
<fwereade> rogpeppe, I think the NotifyWatcherC gives quite a good vocabulary for those tests
<rogpeppe> fwereade: i don't think it's necessary to AssertNoChange
<rogpeppe> fwereade: we're not testing the actual watcher here
<rogpeppe> fwereade: just that it exists and is attached to the right thing
<TheMue> fwereade: oh, a quick pong? so we should talk about auto-sync later
<fwereade> rogpeppe, how else do we verify the original event was read? we're testing the SUT by reference to the known and tested-elsewhere characteristics of the watcher
<rogpeppe> fwereade: hmm, good point
<fwereade> TheMue, ok, can we maybe talk about it just before kanban? or are you blocked? ...I never double-checked your english, sorry
<fwereade> TheMue, would you link me that CL quickly please?
<mgz> fwereade: so, I still don't know *where* exactly you want that initial event pulled off
<TheMue> fwereade: we can talk before kanban, yes
<fwereade> mgz, the Watch call should create the watcher, read its initial event, register it as a resource and return its resource id
<TheMue> fwereade: review is https://codereview.appspot.com/10441044/
<mgz> which one though, the MachinerAPI Watch call, or the Machine Watch call, or newEntityWatcher...
<fwereade> mgz, MachinerAPI
<mgz> okay.
<fwereade> mgz, this is what we're currently implementing, in terms of the watchers that already exist and have this somewhat convenient initial-event model, which we want to take advantage of in the api
<mgz> ...I think I'll do this in a new branch
<mgz> because it really wants the test helpers
<fwereade> mgz, ok, sgtm
<rogpeppe> fwereade: i have about two other branches blocked on mgz's branch BTW, so i'd very much like something to go in
<rogpeppe> fwereade: soon
<mgz> landing the current, then doing that change
<rogpeppe> fwereade, frankban: here's an interesting question: if someone has a service with minUnits=5, then destroys a unit, should the new unit be created when the old unit has gone away entirely, or when it starts to die?
<rogpeppe> davecheney: hiya
<rogpeppe> lunch
<frankban> rogpeppe: my understanding is that MinimumUnits expresses the concept of "minimum amount of units that should be alive". If that's correct, it seems sane to me to react right when one unit starts to die.
<jam> mramm: I saw you show up for a second. I just got back from the restroom
<rogpeppe> mgz: i think you need to re-approve your branch
<rogpeppe> mgz: it died with one of them "bad MAC" errors
<mgz> rogpeppe: do you know about those?
<mgz> first time I've seen it, will resubmitting help?
<rogpeppe> mgz: yeah, it's intermittent
<rogpeppe> mgz: i thought that was the problem that jam fixed
<jam> mgz, rogpeppe: I think this is a different error. This is using system mongo instead of the tarball mongo
<jam> I'll try to fix it quicky.
<jam> rogpeppe, mgz: I'm resubmitting now.
<mgz> jam: thanks!
<jam> mgz: when I update the config that the tarmac charm uses it rewrites crontab, which means I have to manually go fix it up, and I forgot to set the PATH
<mgz> urgh
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: rogpeppe | Bugs: 6 Critical, 81 High - https://bugs.launchpad.net/juju-core/
<TheMue> clear
<jam> mgz, rogpeppe: Looks like my fix worked.
<rogpeppe> jam: cool
<jam> (uninstalled the system mongo which the charm also installs, fixed up the PATH for the test suite.)
<jam> I'm actually pretty impressed that there is only 1 failing test
<jam> (well 2, but it looks like 1 fails to tear down so 2 notices the setup isn't clean)
<Makyo> gary_poster, mramm - is today the cross-team meeting today? I have something on my calendar, but it looks old.
<mramm> Makyo: it's thursdays now
<Makyo> mramm, Thanks.  Same time?
<Makyo> mramm, actually, I won't have the hangout; can you invite me?
<gary_poster> Makyo, sorry, I might have misunderstood what you said you wanted last Friday. cross-team is half hour after kanban time
<gary_poster> Makyo, I'll delete the Tuesday juju core kanban appt for you?
<Makyo> gary_poster, That's what I meant, I just had the old event on my board for today.
<Makyo> gary_poster, I think I beat you to it :)
<gary_poster> :-) you declined, I deleted
<Makyo> Ah, okay.
<rogpeppe> a branch to review, if anyone fancies it: https://codereview.appspot.com/10494043
<rogpeppe> frankban: your merge proposal seems to be corrupted; could you re-propose, please? https://codereview.appspot.com/10497043
<frankban> rogpeppe: done
<rogpeppe> frankban: ta
<ackk> hi, I'm hitting an error deploying with juju-core on openstack: instances goes into ERROR state with "ProcessExecutionError". I tried to remove the unit and destroy the service, but it all seems stuck
<mgz_> ackk: that sounds pretty solidly like an issue with nova, not juju
<ackk> mgz_, can you force the destroy of the service/unit in juju?
<mgz_> just `juju destroy-environment`, and manually `nova delete` if needed
<mgz_> you can use terminate-machine for one machine, but starting from scratch seems wiser
<ackk> mgz_, yeah I tried that, it doesn't work because the machine is still associated to the unit (which has life: dying)
<mgz_> right, you just need to wipe if things get that screwed
<rogpeppe> frankban: you've got a review
<frankban> rogpeppe: thanks
<ackk> mgz_, I see. thanks
<dpb1> To get the unit tests to pass, is there a reference ~/.juju/environments.yaml file that i need to have?
<dpb1> I'm trying to follow the "Testing" section from the "CONTRIBUTING" file, but what is written there is not working.
<rogpeppe> i have to go a bit early today
<rogpeppe> g'night all
<hazmat> fwereade, in your nomenclature LKP = ?
<mgz_> dpb1: do you mean the live tests? you don't need environments.yaml for the unit tests
<dpb1> mgz_: I'm running just make check from the juju-core checkout. I installed mongodb (obvious error); the errors I'm getting now are not as obvious. I can paste in one at a time, but I'm wondering if there is more than what the CONTRIBUTING file states.
<dpb1> mgz_: I checked out juju core with go get -v -u launchpad.net/juju-core/...
<dpb1> mgz_: I just found the README as well.  I'm missing some things, let me check back, thanks
<mgz_> dpb1: feel free to pastebin anything if you get stuck
<dpb1> mgz_: for starters, I'm getting this: http://paste.ubuntu.com/5799084/  (cstack2 is an environment from my ~/.juju/environments.yaml file, which is why I asked my original question the way I did).
<mgz_> that's probably just bad isolation in one test...
<mgz_> dpb1: try running with JUJU_HOME=/tmp or something
<dpb1> mgz_: ok, I have tons of failures like these, will try now.  Hope that will be it
<dpb1> mgz_:  you were close.  apparently some tests don't isolate against JUJU_ENV (which I had set).  All good now
<dpb1> lbox is throwing an error about not being able to diff branches when I submit a proposal: http://paste.ubuntu.com/5799185/  -- what am I missing?
<andreas__> so I have a unit in pending state, and nova list shows the instance is in ERROR. It never launched. Any idea how to recover without destroying the environment?
<andreas__> http://pastebin.ubuntu.com/5799285/
<andreas__> it's related to #1187959
<_mup_> Bug #1187959: juju does not detect instance launch error, waits forever? <juju-core:Triaged> <https://launchpad.net/bugs/1187959>
<andreas__> ok, a combination of juju destroy-service and destroy-unit allowed me to "terminate" that machine
<andreas__> ...except terminate-machine doesn't work
<andreas__> does nothing
<hazmat> maybe i'm misunderstanding something.. when does a service in state 'dying' get garbage collected?
<ahasenack> hazmat: I don't think it does
<ahasenack> hazmat: or I haven't waited long enough
<thumper> morning
<thumper> poke fwereade
<thumper> hazmat: ping
<hazmat> thumper, pong
<thumper> hazmat: what was the dependency tool you found that caused you to drop the requirements.txt work for juju-core?
<hazmat> thumper, none, the use case for a commit or ci test runner was fulfilled i thought
<thumper> hmm... not really
<thumper> not in a reproducible way
<hazmat> thumper, why is that.. it puts all deps at a known revision based on req.txt or gets head?
<thumper> head
<thumper> AFAIK
<thumper> I thought, and I may be wrong here, that we just have tarmac doing the landings for us
<thumper> not that there is any special revno checking there
<hazmat> thumper, a frontend script/make blows away the tree between runs
<hazmat> thumper, go get -u should still pull/update to trunk afaik, but blowing away the tree is simple as well
<hazmat> thumper, there's a bunch of other build tools, one other i might have mentioned is https://github.com/mozilla-services/heka-build
<thumper> ok, I may take a look
<hazmat> thumper, g+?
<thumper> hazmat: sure
<thumper> wallyworld_: you know how at the end of last week I said your watcher was returning dupes...
<thumper> wallyworld_: well, I was wrong
<thumper> wallyworld_: my code was bad
<wallyworld_> \o/
<thumper> I'm just trying to work out how to write a test for it
<wallyworld_> my code would never have any bugs :-P
<wallyworld_> thumper: talked to martin last night. he's across what we need to do. i asked him to send an email to us summarising the steps to address the main use case (deploy into container on new instance) as well as the next use case (mysql and wordpress in separate containers on an instance)
 * thumper nods
<wallyworld_> there's something easy we can do initially. it might get complicated later
<wallyworld_> but that can wait
<wallyworld_> lxc.net is an easy thing to get inter-container networking on the same machine
<wallyworld_> and we can bridge to get the first use case going
<wallyworld_> he already has some work in progress towards the goals so that's good
<thumper> wallyworld_: provisioner_test, TestProvisioningDoesNotOccurForContainers
<thumper> wallyworld_: why do you have cleanup code at the end of the test?
<thumper> wallyworld_: doesn't the test framework clean that up?
 * wallyworld_ looks
#juju-dev 2013-06-26
<wallyworld_> thumper: TestSimple also does its own cleanup. i think it's just to check that the EnsureDead/Stop etc methods work
<thumper> ok
<bigjools> wallyworld_: is the openstack provider configuring a cloud-drive ?
<bigjools> or config drive
<wallyworld_> no
<wallyworld_> the instance just uses the storage attached to the instance
<bigjools> what format is that storage?
<wallyworld_> it depends on the instance
<bigjools> I got pointed at this: http://bazaar.launchpad.net/~cloud-init-dev/cloud-init/trunk/view/head:/doc/sources/configdrive/README.rst
<wallyworld_> larger instances use ephemeral disks
<wallyworld_> perhaps that's a feature of cloud-init the provider doesn't yet support
<wallyworld_> not that i am aware of anyway
<bigjools> how do you get your user data into openstack?
<wallyworld_> magic
<bigjools> can you wave your wand on my azure stuff
<wallyworld_> i think it is passed via cloud-init somehow, but need to check that
<bigjools> yes, no shit :)
 * wallyworld_ waves his stick
<wallyworld_> sorry, not sure of the exact mechanism
<wallyworld_> it's been ages since i looked at that code
<bigjools> my question is how does cloud-init get it.  normally you set a url for the kernel and the userdata is in provider storage at that url
<bigjools> but we can't set kernel params in azure
<bigjools> hence the cloud-drive thing - which I thought openstack used as well
<wallyworld_> bigjools: just checked - openstack uses params passed when starting an instance to hold the user data
<wallyworld_> there's a userdata field in the struct
<bigjools> ah so it's direct
<wallyworld_> so it's not via cloud-init
<bigjools> cloud-init picks them up eventually
<wallyworld_> 		server, err = e.nova().RunServer(nova.RunServerOpts{
<wallyworld_> 			Name:               e.machineFullName(scfg.machineId),
<wallyworld_> 			FlavorId:           spec.InstanceType.Id,
<wallyworld_> 			ImageId:            spec.Image.Id,
<wallyworld_> 			UserData:           userData,
<wallyworld_> 			SecurityGroupNames: groupNames,
<wallyworld_> 		})
<wallyworld_> i'm not sure of the internals
<bigjools> you have a userData() func
<wallyworld_> yes, but i'm not sure how the instance works internally
<bigjools> which pretty much answers my question now
<bigjools> it renders to []byte
<bigjools> I need to render to a "clouddrive"
<bigjools> configdrive, even
<wallyworld_> wave your wand and it will happen
<thumper> oh FFS!!!!
<thumper> why do we insist on making testing things so fucking hard?
 * bigjools wonders why there's two cloudinit packages
<bigjools> thumper: if you want to take a break from fucking hard, can I have a pre-imp for something that is only moderately hard
<thumper> bigjools sure
<bigjools> thumper: gimme 2 mins and I'll call you
 * bigjools prods thumper to answer
<davecheney> Google Drive
<davecheney> The app is currently unreachable
<davecheney> shitter
<wallyworld_> thumper: for your todo list (tomorrow maybe) https://codereview.appspot.com/10534043/ and https://codereview.appspot.com/10447045/
<wallyworld_> davecheney: they must have used some microsoft code in there somewhere
<davecheney> wallyworld_: http://instantrimshot.com/classic/?sound=rimshot
 * wallyworld_ laughs
<jam> thumper: poke
<jam> I'm trying to sort out what an Envorin is, and why he knows a guy named 'eric' very intimately. (testing/environ_test.go)
<jam> I guess the goal is to just have a known home to compare against. Though we should probably s/Envorin/Environ/
<jam> rogpeppe: I just found that "go1.1" is a build constraint. If you wanted to write your Sliceof code in 2 modules and have a '// +build go1.1' and a '// +build !go1.1'
<jam> I don't think it is worth divergence just yet
<jam> but we do have a mechanism to take advantage of features when available
<rogpeppe> jam: that doesn't really solve my problem - my workaround for the SliceOf code is to have every module register the slice types - the go1.1 constraint would poison lots of packages.
<jam> sure
<rogpeppe> jam: mornin' BTW!
<jam> I just hadn't seen it before
<jam> morning
<rogpeppe> i've had this out for review all week and i've still only had one review of it. if i'm to get the API stuff in, i'd really like another review, please! https://codereview.appspot.com/10259049/
<rogpeppe> to whomsoever it may concern :-)
<jam> rogpeppe: to be honest, I've gone to look at it at least 3 times, it is just big enough and involves enough stuff I have trouble consuming it. I will certainly try again.
<rogpeppe> jam: sorry about that - there are probably a couple of trivial things that could be split off (the params.Life definitions for example) but i couldn't think of a way of splitting up the main change to agent.go and machine.go
<jam> rogpeppe: probably a lot of it is my unfamiliarity with parts of the internals, so I can't always grasp *why* a change is being made until I dig into it for a bit.
<rogpeppe> jam: well, it would be nice for the future if you acquire some familiarity with this part of the code, so please feel free to ask anything about what's going on - we could have chat about it if you want.
<jam> rogpeppe: so why did you need to add the worker.Worker.Kill implementation for the api server? It wasn't a worker before but should be? It was just useful for testing?
<rogpeppe> jam: it wasn't a worker before but now it is - rather than having explicit run loops in three places, the CL changes things to use worker.Runner in more places
<rogpeppe> jam: so in the machiner we've got the top level Runner that runs two workers; one of them runs any tasks that connect to the state; the other runs any tasks that connect to the API. each of them creates a runner themselves to run the tasks (so the runners are nested)
<jam> rogpeppe: and then presumably we iterate until one of those isn't necessary.
<jam> However, I would have thought that we would have tasks that couldn't quite be all API while we transition
<jam> so we could implement *an* API call and start using it
<jam> rather than having to finish off "everything a task might possibly need" before it can start using API.
<jam> Too much chance of split brain?
<rogpeppe> jam: i thought that too originally, but i think the idea is to transition an entire agent at a time
<rogpeppe> s/agent/worker
<jam> rogpeppe: Workers returns a worker that runs the unit agent workers. ???
<jam> why is the method plural, but the result singular
<jam> And the doc string doesn't seem to explain much.
<rogpeppe> jam: which method?
<jam> rogpeppe: https://codereview.appspot.com/10259049/patch/5001/6006
<rogpeppe> jam: ah, in unit.go
<jam> If it 'runs' it, I would have thought it would be a Runner
<jam> but a Runner is also a Worker?
<rogpeppe> jam: the unit agent has two levels of runner too
<rogpeppe> jam: yes
<rogpeppe> jam: that's an important point
<jam> For the docstring, I think it is trivial to see that Workers returns a worker, so focusing more about what the thing it returns will be used for might be helpful
<rogpeppe> jam: the Workers method is probably badly named
<jam> I also wouldn't make the method plural when it returns a single object, even if that object wraps multiple others.
<rogpeppe> jam: yeah, it's kind of a single object that encapsulates all the workers, but i agree
<jam> rogpeppe: MetaWorker ?
<rogpeppe> jam: TopLevelWorker?
<rogpeppe> jam: StateConnectingWorkerRunningOtherWorkers ?
<jam> // MetaWorker returns the Worker responsible for managing all individial workers that this Agent needs.
<jam> Just calling it Worker would eliminate the confusion on the returned type not being a slice
<jam> rogpeppe: so the thing you return isn't actually a runner, it is a newCloseWorker() which wraps a runner.
<rogpeppe> jam: StateWorker might be good enough
<rogpeppe> jam: yeah
<jam> rogpeppe: so do Runners actually implement the Worker api?
<rogpeppe> jam: that's so we close the state connection eventually
<rogpeppe> jam: yes
<rogpeppe> jam: (all it requires is Kill and Wait)
<jam> rogpeppe: why do you start the runner in cmd/jujud/machine/Init() rather than in Run
<rogpeppe> jam: that's so the tests can kill it
<jam> rogpeppe: I can see that as a reason to have it as an attribute, but Init seems more like setup stuff-without-starting anything, no?
<rogpeppe> jam: because we're not using our own tomb now - we're leaching off the Runner's liveness
<rogpeppe> jam: yeah, i take your point.
<rogpeppe> jam: even though the runner is empty, calling Init without Run will leak
<jam> rogpeppe: so you already have a method called StateWorker in MachineAgent code. Though perhaps the functionality is similiar enough that name sharing is a good thing?
<rogpeppe> jam: yes, that's what i was thinking
<rogpeppe> jam: and when we transition the unit agent, we'll probably have an APIWorker *and* a StateWorker in there until we can finally delete the StateWorker
<jam> rogpeppe: I would *really* like to call this thing the StateRunner instead, though.
<jam> That makes it clear the thing isn't actually doing anything on its own.
<rogpeppe> jam: yeah, seems like a good idea
<jam> But it is managing all the tasks that want access to State.
<jam> rogpeppe: It also seems odd that APIWorker (maybe called APIRunner) has a side effect of starting the StateRunner.
<jam> Rather than a separate
<jam> MachineAgent.StartAPIRunner() and MachineAgent.MaybeStartStateRunner ?
<jam> rogpeppe: naming aside, calling agent.APIFoo() and getting a runner, but having a side effect of mutating agent.runner seems a bad idea.
<rogpeppe> jam: unfortunately it needs to happen like that, because in general we can't start the state worker until we've connected to the API
<jam> certainly when reading the code it wasn't immediately obvious that "a.runner" isn't the same thing as the "runner" you are building up.
<jam> rogpeppe: you can check and refuse if the API side isn't running yet
<jam> and callers need to call them sequentially.
<jam> or we have a StartRunners call that does both
<dimitern> morning all!
<jam> welcome back dimitern
<rogpeppe> dimitern: yo!
<dimitern> thanks! :)
<rogpeppe> jam: we can't even know if we *need* to run a state server until we've connected to the API
<rogpeppe> s/state server/state worker/
<rogpeppe> jam: so there has to be some communication between the two
<rogpeppe> jam: i know it might seem awkward, but i haven't been able to think of a more elegant way of solving the problem
<jam> rogpeppe: again, you can always start the state worker after the API. I'm objecting to having a side effect of "create this runner for me", also starting and forcibly instantiating another runner that isn't actually the one I asked for.
<jam> rogpeppe: I'm just asking to move the code into another function that more clearly signals that it is potentially starting 2 things.
<jam> rogpeppe: If I'm understanding the code, the block from "m := entity.(*machineagent.Machine)" until the end of if needsStateWorker could be done after you return the newCloseWorker
<jam> which means it could be in a different function.
<jam> you need the 'entity', but you could expose that.
<rogpeppe> jam: i'm not sure it can be
<rogpeppe> jam: in the future, the information on how to connect to the state will be held in the API
<rogpeppe> jam: so we'll have to pass some information that we've obtained from the API into StateWorker
<jam> rogpeppe: so after we have the API runner up, we ask the API if we also need to start the StateWorker.
<rogpeppe> jam: yes
<rogpeppe> jam: that's what we do
<jam> rogpeppe: The point is you have a function labeled "Give me an APIWorker", that should be straightforward. But it has this side effect of maybe starting a StateWorker
<jam> which is nowhere in the contract we've defined
<jam> it doesn't even activate the APIWorker (doesn't add it to a.runner)
<jam> so it seems doubly strange that it would add a different worker to a.runner.
<rogpeppe> jam: unfortunately one of the things we have to do after connecting to the API is maybe start a new state worker
<jam> rogpeppe: so I'm just asking you to have a function called "StartWorkers" which connects to the API, creates an APIWorker, and then checks to see if it needs to start a state worker (and does so)
<rogpeppe> jam: how can it do that? - it won't have an API connection
<jam> rogpeppe: you just created the API connection
<jam> how does it not have one?
<rogpeppe> jam: creating an API worker doesn't create the API connection until some time later
<jam> rogpeppe: then how is your code doing it today?
<rogpeppe> jam: the only time we know we have an API connection is inside APIWorker itself
<jam> if it doesn't have the connection?
<rogpeppe> jam: so you'd be happy if StartWorkers was called from within APIWorker?
<jam> rogpeppe: wrong way around
<rogpeppe> jam: but if StartWorkers adds APIWorker to the top level runner, how can it get access to the API connection that only APIWorker has access to?
<rogpeppe> jam: perhaps i might understand better what you're suggesting if you paste some pseudocode
<jam> rogpeppe: rough sketch: http://paste.ubuntu.com/5800860/
<jam> rogpeppe: yeah, I was working on that
<rogpeppe> jam: what calls StartWorkers?
<jam> rogpeppe: right now it is an immediate replacement for whatever called APIWorker
<rogpeppe> jam: currently APIWorker is called by the top level Runner
<rogpeppe> jam: which is important because it gives us our top level run loop (retrying on failure)
<jam> rogpeppe: so MachineAgent is also a Worker?
<rogpeppe> jam: no
<rogpeppe> jam: machine.go:87,90
<jam> rogpeppe: is there a reason we couldn't just have Workers() here, which returns a slice of Workers, which might be 1 for just the API or 2 for both the API and State worker if we find we need it.
<rogpeppe> jam: that's how APIWorker gets called
<jam> Then we don't have to touch a.runner
<rogpeppe> jam: shall we G+ this - it seems like a bit of higher bandwidth might be useful
<rogpeppe> jam: i'm not sure what you mean with your slice suggestion. are you suggesting that MachineAgent.Run calls Workers directly?
<jam> rogpeppe: just a sec, digesting a bit.
<rogpeppe> jam: the difficulty is that eventually we *do* need two independent workers in the top level a.runner, but we only know whether to run the state worker *after* the first API connection has been made, which happens to be exactly what the other runner has to do as a first step. i suppose we could have another worker function which connects to the API and whose sole purpose is to add the other runners to the top-level runner
<rogpeppe> jam: but that's more code and not necessarily any clearer
<rogpeppe> jam: and there's also the point that at some point we might have to *remove* the state worker (if the machine jobs change), which will probably require the API worker to mess with the top-level runner in a similar way
<dimitern> jam: so now with the go-bot do I need to pull from a different location? bzr pull --overwrite --remember lp:juju-core ?
<rogpeppe> jam: would something like this help matters? http://paste.ubuntu.com/5800883/
<rogpeppe> dimitern:  if you have any existing proposals, they need to be reproposed
<rogpeppe> dimitern: but the pull location is still the same
<dimitern> rogpeppe: no, save for the one mgz_ took over
<rogpeppe> dimitern: did you have a good holiday, BTW?
<dimitern> rogpeppe: oh yeah, relaxed - just what I needed
<rogpeppe> dimitern: are you officially back now? (i thought you were back tomorrow)
<jam> dimitern: it depends how you got your branch. If you got it from 'bzr branch lp:juju-core' that still points at the right branch, if you use 'go get launchpad.net/juju-core' that points to the wrong one (because it pre-resolves the full branch URL). you can always just do it and not worry, because it won't be the wrong thing to do (bzr pull --remember lp:juju-core)
<dimitern> rogpeppe: i'm back today
<rogpeppe> dimitern: cool!
<rogpeppe> dimitern: we get a day and a half of overlap before i'm away...
<jam> rogpeppe: thinking a different way, what do we gain by not starting the StateWorker ?
<dimitern> rogpeppe: yeah
<dimitern> jam: will bzr info tell me that?
<rogpeppe> jam: if we do start the StateWorker, how does it know how to connect to the state?
<jam> rogpeppe: the fundamental bit for me is that it feels ugly to have APIWorker (create an api worker for me to start tasks on) have a side effect of touching the top level runner. We can manage the distrust by having 'Here be dragons' comments, but I'm trying to sort out if we can decouple it.
<jam> dimitern: it can
<rogpeppe> jam: i definitely see your point, but i *think* this is a fundamental causality issue
<rogpeppe> jam: only when we connect to the API can we work out if (and how) to start the state worker
<rogpeppe> jam: and the APIWorker is the only place that connects to the API
<jam> rogpeppe: well today we can just use a.Conf.Conf as you are already doing, no?
<rogpeppe> jam: currently we can, but in the future we won't be able to
<jam> And when we are at the point where API knows better how to connect to State, won't we be at the point where we refuse to let anything but the API server itself connect to state?
<jam> and the API server won't be connecting to itself.
<dimitern> jam: i have this: http://paste.ubuntu.com/5800889/
<jam> dimitern: parent branch: bzr+ssh://bazaar.launchpad.net/+branch/juju-core/
<jam> that is always "lp:juju-core" even when we point that at a different branch
<rogpeppe> jam: in the future, we will only know whether to run an API server by connecting to an API server
<dimitern> jam: so it's fine then?
<jam> dimitern: 'go get' would end with: parent branch: http://launchpad.net/~juju/juju-core/trunk/ (which would be the old wrong branch)
<jam> dimitern: yes, you are fine
<rogpeppe> jam: i do indeed hope that we get to the stage where the API server is the only thing that needs a state connection
<dimitern> jam: cheers
<jam> dimitern: note though that juju-core now has a bunch more external dependencies
<jam> lp:gwacl and lp:golxc come to mind
<jam> which also needs 'apt-get install libcurl4-openssl-dev'
<jam> and a branch from github
<rogpeppe> [09:35:02] <jam> and the API server won't be connecting to itself.
<jam> though 'go get launchpad.net/juju-core' will sort some of that out for you
<rogpeppe> jam: it will be connecting to other instances of itself
<dimitern> jam: yeah, just discovered these and i'm go getting them
<jam> rogpeppe: how does the first one start?
<rogpeppe> jam: bootstrap
<rogpeppe> jam: that's the "0" case at the top of the MachineAgent.Run
<rogpeppe> jam: that's the only time we can't first connect to an API server.
<jam> rogpeppe: in which case, I would offer that the task which actually runs an API server should be the thing starting the StateWorker and not APIWorker
<rogpeppe> jam: the StateWorker *is* the thing that runs an API server
<jam> anyway, I don't really want to argue it for too long, but having a "Create me one of these" have a side effect of creating a second one and mutating the state of the caller is hard to keep track of and should be well guarded.
<rogpeppe> jam: i agree with your discomfort, but i haven't seen a decent alternative yet
<rvba> Hi jam, I replied to your comment on lp https://code.launchpad.net/~rvb/juju-core/az-public-storage/+merge/171251/comments/382236
<jam> rogpeppe: the ordering *could* be, "connect to the API, poll it for the jobs to run, one of those jobs will start up, and ask the agent to start another worker, which then gets a task to run the API server"
<jam> rvba: I didn't see anything from jtv on the MP, sorry if I missed it.
<rogpeppe> jam: so rather than adding to the MachineAgent runner directly, we start another worker that does the same thing?
<jam> rvba: to be fair, I still don't see anything looking back again.
<rvba> jam: https://codereview.appspot.com/10541044/
<jam> rvba: ugh, split brain between LP and Rietveld. Rietveld mirrors most comments back into Launchpad, except when the Rietveld identity isn't known to LP, then those messages just get dropped by LP
<jam> rvba: if you look here: https://code.launchpad.net/~rvb/juju-core/az-public-storage/+merge/171251
<jam> it *also* means that JTV's message isn't in my email folder
<jam> because Rietveld only sends messages for things which you've already commented on.
<jam> sorry about that.
<rogpeppe> jam: in the end, starting the state worker *has* to be a consequence of connecting to the API. hmm, one possibility to make things easier to understand:
<jam> is it possible for jtv to register his alternative email with LP?
<rogpeppe> jam: we could explicitly pass the top level runner into the APIWorker
<rogpeppe> jam: so that it's more obvious that APIWorker can control it
<rogpeppe> jam: would that be better for you?
<jtv> Oh hi guys...  I'll try to register that email then.
<rogpeppe> jam: something like this: http://paste.ubuntu.com/5800919/
<jam> rogpeppe: I would be happy if the api was "add yourself to my runner", and then we just end up adding 2 things to the runner. rather than "create something and I'll take care of registering it" which then has a side effect of creating and registering an "unrelated" thing.
<dimitern> mgz_: ping
<rogpeppe> jam: the second thing has to be added to the runner as a side-effect of running the first thing. unless we have some entirely different code which connects to the API for the first time, starts *another* worker that connects to the API and a state worker if appropriate.
<dimitern> fwereade: ping
<jam> rvba: I just sent an email to juju-dev about the config naming question. I actually prefer your method, but we were explicitly asked to do the common-names thing. Which is why I proxied it to you.
<fwereade> dimitern, pong
<jam> rogpeppe: the issue is that APIWorker isn't *running* the worker yet, right?
<fwereade> dimitern, welcome back :)
<dimitern> fwereade: thanks :)
<rogpeppe> jam: which worker?
<dimitern> fwereade: i was thinking to pick up the deployer API stuff
<jam> it seems odd to have the StateWorker registered with the top level runner before the API Worker has been registered
<jam> calling APIWorker() creates a worker object
<jam> but hasn't added it to the topLevelRunner yet
<rvba> jam: okay, thanks for starting the discussion.
<dimitern> fwereade: as agreed before, if mgz_ hasn't started on it
<fwereade> dimitern, +1but speak to danilos -- he's about to go away but hasn't yet and has, AIUI, been looking into it
<fwereade> dimitern, I'm not up to date on where he is with it though
<jam> dimitern: right, danilos has started to do some of the infrastructure.
<rogpeppe> jam: erm, i don't quite understand. you don't actually add workers to runners, you add a function that starts a worker.
<jam> and has 3 days of overlap with you to hand it off.
<dimitern> fwereade, jam: ok
<dimitern> danilos__: hey
<rogpeppe> jam: the function that calls APIWorker is added to the top level runner immediately
<rogpeppe> jam: machine.go:87
<dimitern> jam: i saw the mail about the objectives - what's the deadline for that?
<jam> rogpeppe: so we have some more naming confusion.
<jam> StartWorker doesn't start anything
<jam> it registers something that you'll want to start later, right?
<rogpeppe> jam: no, it will call the function immediately
<jam> dimitern: officially the end of this week. With official recognition that it is likely we'll miss it by a bit.
<jam> dimitern: though it doesn't have to take super long. You can do it in an hour or so. You have a lot of other people's objectives you can crib from.
<dimitern> jam: which ones should I take as a template - yours or tim's?
<rogpeppe> jam: the slight tension is that the name StartWorker implies only a single worker, but actually there's a sequential succession of workers
<rogpeppe> jam: each one started some time after the last one has quit, assuming it didn't quit with a fatal error
<jam> dimitern: actually, you're officially under Tim now
<jam> according to directory.canonical.com
<jam> I wash my hands of you :)
<dimitern> jam: ok :) \o/
<jam> rvba: sorry if my comments came across as a bit attacking. It certainly wasn't meant that way. I only commented because of the naming thing, and then I was surprised about the 1 review bit.
<rvba> jam: no worries ;)
<jam> rogpeppe: so it feels like the 'right' fix is to change the StartWorkers api so that the function you call can optionally return more workers that you might want to start. However, your last paste seems reasonable enough for now, and has sufficient Here Be Dragons to avoid people getting lost as to why it is happening.
<rogpeppe> jam: thanks. i'm not sure how changing the StartWorker API would help really. it seems like it would make the Runner API more complex for no particular gain. if you want another worker, just adding it to the runner directly seems like a reasonable way to do it.
<fwereade> TheMue, config-get --all LGTM
<jam> rogpeppe: runner.StartWorkers(..., func()): it doesn't feel like that func() should know what runner it is being attached to, so that it can be arbitrarily called by some other runner.
<jam> rogpeppe: it is a 'do we have a singleton per process', imagine writing test cases for this
<jam> that want to start their own runner, and add this thing to them.
<jam> that should get added to whatever runner they created
<jam> rogpeppe: anyway, the bit you wrote is 'ok', it doesn't feel 'right', but it is acceptable
<jam> your pastebin does at least let you set which one you want it to add any future work to.
<rogpeppe> jam: yeah, it's pragmatic code - it's not beautifully regular or modular, but it encapsulates the task in hand and the scope is limited
<thumper> night all
<jam> rogpeppe: yeah, I think 1 exception is ok when well documented, if we end up with 2 we should wonder, and if we have 5 exceptions, then we probably have the design wrong.
<rogpeppe> jam: when we do multi-tenant, i think we will probably be doing a lot of runner-manipulating - there will probably be a worker whose sole responsibility is to add and remove other workers from the API runner.
<rogpeppe> jam: at least, that's my wand-wavy plan
<TheMue> fwereade: thx
<TheMue> fwereade: and also thx for the better doc ;)
<rogpeppe> jam: this is, i think, something more like you were suggesting. i don't think it's a great improvement: http://paste.ubuntu.com/5800976/
<jam> rogpeppe: I'm pretty sure line 32 is a.APIWorker
<rogpeppe> jam: indeed it is
<jam> rogpeppe: so the idea is you don't need the for{} in your original code because topLevelRunner handles that?
<rogpeppe> jam: yes
<jam> rogpeppe: I'll also note in your current proposal of "how do we know what to connect to", there is no actual connection of a.StateWorker() to anything the API returns
<rogpeppe> jam: not currently
<jam> though I realize you want to put something there.
<rogpeppe> jam: eventually the state server addresses will be accessible through the API
<jam> rogpeppe: I think changing the signature of StartWorker such that the function you pass in can take the runner as an argument, allowing it to add units would be a good idea. Especially under your proposal that there will be a worker which does a lot of start/stopping of stuff in its parent runner (rather than as children of itself).
<jam> However, your old pastebin is my current favorite for the time being.
<jam> (
<jam> http://paste.ubuntu.com/5800919/
<rogpeppe> jam: we *could* do that, but i think i'm happy having a runner as a closure variable too - there's a certain purity in having a function with zero args
<jam> rogpeppe: as long as we never have to do something like migrate the workers to another runner, etc
<jam> it *seems* like workers shouldn't know what runner they are running inside
<jam> which the closure breaks
<rogpeppe> jam: if we do, the code is small and easy to change
<jam> mgz_: did your patch finally land?
<rogpeppe> jam: how about this? http://paste.ubuntu.com/5800999/
<jam> rogpeppe: it is pretty much equivalent to http://paste.ubuntu.com/5800919/ for me.
<jam> rogpeppe: you still have to wrap that in a closure
<rogpeppe> jam: ok
<jam> that closure still has to save a.runner as a const in its closure
<jam> etc.
<rogpeppe> jam: i was trying to make it so that the worker doesn't know what runner it's running in
<jam> It is probably slightly better at having APIWorker not know the internals of what a StateWorker is
<jam> rogpeppe: sure, but passing that in is the same thing.
<rogpeppe> jam: that knowledge is held outside (in MachineAgent.Run)
<jam> rogpeppe: so if a runner passes itself into the thing it is calling, then the thing it is calling doesn't "know" where it is running, it is "told" where it is running.
<jam> which means if another runner got that task, it would just run on the other runner.
<jam> in this case, the closure has to *know*, which is true of either form of your code.
<rogpeppe> jam: that assumes that we always want to add the state worker to the same runner that's running the API worker.
<jam> rogpeppe: if given both, I probably like http://paste.ubuntu.com/5800999/ more, it just doesn't solve the specific issue I was having troubles with.
<jam> Having nested runners that can start other runners that *aren't* nested underneath them is also a little bit confusing
<rogpeppe> jam: the startStateAPI func thing works out quite nicely actually: http://paste.ubuntu.com/5801013/
<rogpeppe> jam: i agree it's a bit confusing, but the whole look-at-api-then-connect-to-state-but-not-if-we-are-bootstrapping thing is a little inherently confusing, i think
<rogpeppe> jam: and i *think* that the code is just a reflection of that fundamental awkwardness
<mgz_> jam: yup
<jam> fwereade, rogpeppe: so what was decided on package naming vs tasks. Specifically, what package should I put upgrader in? Its own as state/apiserver/upgrader/upgrader.go, or sharing machine as state/apiserver/machine/upgrader.go? (I get the impression it may be run by machine or unit, so it should be its own thing)
<rogpeppe> jam: it should be in its own package
<fwereade> jam, rogpeppe: this is kinda the problem with the "let's segment-by-agent" scheme
<rogpeppe> jam: this CL shows where i'm aiming https://codereview.appspot.com/10494043/
<fwereade> rogpeppe, would you remind me what your objection was to putting common test infrastructure in its own package?
<fwereade> rogpeppe, (iirc that was the main factor in your decision -- I hope I'm not misrepresenting?)
<jtv> rogpeppe: does this correctly reflect your notes on concurrency hazards in provider implementations?  https://codereview.appspot.com/10602043
<rogpeppe> fwereade: that was part of it. the main thing was to try to keep packages from proliferating wildly.
<rogpeppe> jtv: will look in a mo
<fwereade> rogpeppe, small packages with clear purposes are usually considered a good thing
<jtv> thx
<rogpeppe> fwereade: the way we're going, we'll have apiserver/machine, apiserver/machiner and apiserver/machineagent
<fwereade> rogpeppe, all the more so in go, surely, considering that the only encapsulation boundaries are at package edges
<rogpeppe> fwereade: i think it's reasonable to gather the things that will only ever run in one agent inside a package for that agent
 * jam goes and hides in a hole to get actual coding done
<jam> will emerge around standup time
<rogpeppe> fwereade: i'm happy to encapsulate by type as well as by package
<fwereade> rogpeppe, that's all very well in theory but when not enforced by language or ultra-strong convention it tends to degrade ;)
<fwereade> rogpeppe, I think that the details of exactly what runs where will be more fluid than you anticipate
<fwereade> rogpeppe, and I would prefer not to impede our flexibility by signalling that the most important feature of, say, upgrader, is which agent runs it
<fwereade> rogpeppe, we have a distributed system with a bunch of responsibilities
<rogpeppe> fwereade: i would definitely put reusable components inside their own packages
<rogpeppe> fwereade: so the upgrader would get its own package
<rogpeppe> fwereade: one mo, i'll paste a couple of sketches
<fwereade> rogpeppe, and uniter?
<fwereade> rogpeppe, I'll probably want to run some of those in machine agents at some point
<rogpeppe> fwereade: it would go into apiserver/unit
<rogpeppe> fwereade: really?
<fwereade> rogpeppe, remove-unit --force
<fwereade> rogpeppe, I'm not going to try a transaction to clean up the whole unit state
<fwereade> rogpeppe, revoking the original unit's access and running a sandboxed uniter with a fake charm would work just fine though
<rogpeppe> fwereade: woah
<rogpeppe> fwereade: that seems a bit... heavyweight
<rogpeppe> fwereade: interesting idea though
<fwereade> rogpeppe, I only thought of it relatively recently, but it seemed like a possible end-run around a lot of the difficulty
<fwereade> rogpeppe, it would definitely require that the uniter be decomposed a little but that's definitely not a bad thing
<rogpeppe> fwereade: it's already decomposing, arf arf :-)
<rogpeppe> fwereade: it sounds like an interesting approach
<fwereade> rogpeppe, I'd love to be able to drop most of the giant integrationy tests there and be able to run detailed unit tests on all the modes, for example
<rogpeppe> fwereade: using the unit agent for that purpose would presumably only require a small subset of the full uniter API?
<fwereade> rogpeppe, yeah, that was the thought
<rogpeppe> fwereade: so it might have its own API facade anyway?
<rogpeppe> fwereade: ok, so if we go with the "packages for everything" approach, this is how i see the machine package looking: http://paste.ubuntu.com/5801079/
<fwereade> rogpeppe, that hadn't been my thought in particular -- allowing access to that facade for force-dying units on ManageState machines' connections seems maybe plausible
<fwereade> rogpeppe, sorry, the machine agent implementation?
<rogpeppe> fwereade: what does cleaning up the unit state actually involve? is it that complex that it's a great help to have whole uniter around for it?
<fwereade> rogpeppe, basically a load of scope-leaving
<fwereade> rogpeppe, it may indeed not involve the whole thing, that's why I mention decomposing it
<rogpeppe> fwereade: so, the machine package would just integrate together all the APIs that we want to present to the machine agent.
<dimitern> danilos__: hey
<danilos> dimitern, hey-hey
<danilos> dimitern, welcome back, I hope it was nice two weeks off :)
<rogpeppe> danilos: hiya
<dimitern> danilos__: have you started some work on the deployer api stuff?
<danilos> rogpeppe, hey
<dimitern> danilos__: oh yeah it was :)
<danilos> dimitern, yeah, barely
<danilos> dimitern, basically, looking at that unification of watchUnits under Machine state object
<dimitern> danilos__: because i was thinking of picking that up
<dimitern> danilos__: if you don't mind
<danilos> dimitern, hum, perhaps not a bad idea and I can focus on finishing the python-env stuff
<dimitern> danilos__: sgtm
<fwereade> rogpeppe, that doesn't feel unreasonable to me -- it puts that responsibility in one place while we're firming up the stuff around it, and keeping it separate for now makes it easier to move it somewhere else if and when the need becomes apparent
<fwereade> rogpeppe, I'd really prefer to keep things separated by default, and only combined when it's clear that doing so fits the rest of the model so well that it's actively *bad* to keep them separate
<rogpeppe> fwereade: do you want a separate package for each watcher type?
<danilos> dimitern, some of unfinished code is up in lp:~danilo/juju-core/watch-units, though you might want to start over since it's not much (I've got some attempts at fixing tests uncommitted locally, but it doesn't solve them completely)
<danilos> dimitern, want me to push that too or you don't care? :)
<rogpeppe> fwereade: i was planning on putting them all into apiserver/watchers
<dimitern> danilos__: i'll take a look at that and what we planned before i left to bring myself up to speed
<fwereade> rogpeppe, depends where you mean... I think that what you describe may actually work pretty well for me -- IMO that's the one clear case we have of a really broadly shared capability
<danilos> dimitern, some notes I gathered are up in http://paste.ubuntu.com/5741603/ (including a link to your pastebin)
<danilos> dimitern, I meant https://docs.google.com/a/canonical.com/document/d/105xob7LVW63NoWoKoRhJNYN26_1GaAO-apiG8TKt_5s/edit :)
<dimitern> danilos__: cheers
<danilos> dimitern, shared it with you
<jam> fwereade, rogpeppe: by the same token, should it be called state/apiserver/upgrader/upgrader.go:UpgraderAPI  vs Upgrader ?
<jam> given MachinerAPI
<fwereade> rogpeppe, so you'd make the watcher API accessible separately to most of the other APIs, and those APIs' watch methods will return ids for use with the watcher service?
<fwereade> jam, if it has its own package it could just be upgrader.API, and if everything did it could be machiner.API, etc
<fwereade> rogpeppe, jam, watcher.API :)
<jam> fwereade: it does make using grep to find where this type is implemented pretty hard.
<rogpeppe> fwereade: i was thinking watchers.EntityWatcherAPI, watchers.EnvironWatcherAPI, etc
<rogpeppe> fwereade: because each one is actually really quite small
<fwereade> rogpeppe, fwiw I think EnvironWatcher should just be an EntityWatcher really
<rogpeppe> fwereade: whatever
<rogpeppe> fwereade: it was the second watcher i could think of :-)
<fwereade> rogpeppe, (bikeshed bikeshed: NotifyWatcher)
<rogpeppe> fwereade: if you wanna repaint that bikeshed, go for it
<rogpeppe> fwereade: (re-bikeshed: Watcher)
<fwereade> rogpeppe, yeah, we can do all this later :)
<rogpeppe> fwereade: +1
<rogpeppe> jam: don't use grep to find types :-)
<fwereade> rogpeppe, I'm comfortable with the broad shape of what you propose
<rogpeppe> fwereade: cool
<rogpeppe> jam: also grep '^type SomeType' is a good way of finding types, if that's not how you do it already
<fwereade> rogpeppe, so in *that* case can we make that machine package wither to almost nothing, and make it the individual workers' responsibility to pull the APIs they need off a single shared client object?
<rogpeppe> fwereade: yes
<fwereade> rogpeppe, this sgtm
<jam> rogpeppe: sure, but if everything is defined as type API ...
<jam> func.*FuncName is reasonable to find functions too
<rogpeppe> jam: well, you'll only ever see it referred to a pkgname.API
<rogpeppe> jam: which is fairly unambiguous as to where that particular API type is defined
<jam> rogpeppe: except in the package itself, and you don't know *which* file it is defined in
<jam> rogpeppe: gives you a directory
<rogpeppe> jam: my favourite is ' Foo\(' for finding method definitions
<jam> rogpeppe: func.*Foo finds both methods and free funcs
<rogpeppe> jam: but it also finds functions with Foo in the name
<rogpeppe> jam:  i like ' Foo\(' because it's pretty exact
<jam> rogpeppe: except it finds calls of a free func Foo
<jam> :)
<jam> I agree that I put ( on when I need it
<rogpeppe> jam: yeah, it finds method and func defns
<rogpeppe> jam: in general though, i use godef :-)
<jam> rogpeppe: which while it is a tool that works, it is also almost by definition a "tool that works for rogpeppe" :)
<jam> as 99.999999% of humans don't have it installed on their machine
<rogpeppe> jam: other people use it too, honest :-)
<rogpeppe> jam: go get works
<jam> rogpeppe: I would argue that the number of people comfortable using godef, and the number comfortable using grep
<jam> ...
<rogpeppe> jam: it's idiomatic in Go to have type names that work well when qualified by the package identifier
<rogpeppe> jam: i don't think it's necessary to make names that are unique across the code base because they're a bit awkward to find using grep.
<rogpeppe> jam: BTW i tried to write godef such that it would be almost trivial to integrate into another editor - all you need is the current file cursor position and the current file contents and it'll tell you where the definition is.
<rogpeppe> jam: perhaps i'll actually learn some vim programming and write some bindings for it at some point.
<wallyworld_> jam: mgz_: danilos__: can we do standup now? it's half time in the State Of Origin football and I want to watch the 2nd half
<mgz_> I could, not sure about de otros
 * wallyworld_ is hopeful
<dimitern> wallyworld_: me too
<wallyworld_> dimitern: hey, welcome back
<dimitern> wallyworld_: thanks
<wallyworld_> good holiday?
<jam> wallyworld_, mgz_, danilos__, dimitern: I'm there https://plus.google.com/hangouts/_/8868e66b07fa02bdc903be4601200d470dae9ee3
<dimitern> wallyworld_: oh yeah, although it seemed shorter ;)
<jam> mgz_: de otros? los otros?
<mgz_> something like that :)
<jam> danilos__: poke?
<dimitern> danilos_: https://plus.google.com/hangouts/_/8868e66b07fa02bdc903be4601200d470dae9ee3
<danilos_> dimitern, coming in a bit
<rogpeppe> jam: https://github.com/dgryski/vim-godef
<rogpeppe> jam: it's not fantastic (it doesn't know about the contents of the current buffer, so it will muck up if you're currently editing, but it might work ok otherwise)
<rogpeppe> jtv: reviewed
<TheMue> fwereade: ping
<fwereade> TheMue, pong
<TheMue> fwereade: wanna talk about autosync?
<fwereade> TheMue, sgtm, would you precis it here? would be worth the chance of other eyes passing over it I think
<TheMue> fwereade: currently simply wanted to know how you see this feature. so far I only have this topic "auto sync-tools" and found one mail
<TheMue> fwereade: as long as I understand it the need for an explicit sync-tools call shall be removed
<TheMue> fwereade: instead it should be handled automatically during bootstrap if needed
<fwereade> TheMue, I had been hoping you had been analysing the problem in the meantime -- when should we do it, when not, what are the drawbacks of the approach you recommend for various users, why do we consider them to be a price worth paying
<fwereade> TheMue, the problem is that sync-tools is an annoying extra step for first-time users, and we want to make their experience better
<TheMue> fwereade: so it's like I wrote. handle it automatically during bootstrap if needed (simplified)
<TheMue> fwereade: maybe I'm underestimating it and simply don't see the troubles you expect
<TheMue> fwereade: so it would be helpful what problems you see
<fwereade> TheMue, it's the "if needed" and the "simplified" I'd like to hear more about
<fwereade> TheMue, what's the trigger condition? what version(s) do we copy? what's the impact of these decisions?
<fwereade> TheMue, how do we make this a nice story for an isolated environment?
<TheMue> fwereade: the latter is the largest problem, indeed
<fwereade> TheMue, don't underestimate the former
<TheMue> fwereade: for the first parts I would use the same mechanisms like in sync-tools to check the existing tools in the environment
<TheMue> fwereade: using the same decision logic of which tools are to sync
<fwereade> TheMue, sync-tools will sync tools even if you already have valid tools available
<TheMue> fwereade: do we have known troubles with this logic (in sync-tools)?
<fwereade> TheMue, I don't think it's quite the same use case, is it?
<TheMue> fwereade: it uses tools.ErrNoTools when inspecting the environs storage with tools.ReadList()
<TheMue> fwereade: but public or private is explicitly set
<fwereade> TheMue, right -- there's a whole bunch of things to consider. what source we want now, whether we'll want alternative sources in the future, where we want to store the tools we autosync, etc
<fwereade> TheMue, at what point we check for tools, what sort of errors we could encounter, how we report those errors
<TheMue> fwereade: thought the source topic is an extra one
<fwereade> TheMue, ok, but I'm asking you to solve a mid-scale problem, and I need you to think through the solutions
<TheMue> fwereade: reasonable, ok
<fwereade> TheMue, the isolated environment case is one aspect of the problem; even if we don't solve that bit first (we won't) I'd like us to at least consider the problem's existence as we design it
<TheMue> fwereade: but then I need more than simply the three words "auto sync tools". that's why I asked you, to get more information which problem has to be solved
<rogpeppe> jam, fwereade: i've changed things a little bit and added some more comments. you might want to take a look before i approve the branch: https://codereview.appspot.com/10259049/diff2/5001:16001/cmd/jujud/machine.go
<fwereade> TheMue, so one consequence of isolated environments is that we'll have to, at some stage, have a pluggable tools source
<fwereade> rogpeppe, cheers, just a mo
<fwereade> TheMue, ok, I'm sorry, I thought I'd been clear that it's about streamlining the user's first experience
<TheMue> fwereade: so let me simply create an issue for this topic (currently only a tiny kanban card exists) where we can collect everything that should be solved
<fwereade> TheMue, ec2 has a nice story from that POV, the others not so much
<TheMue> fwereade: yeah, it's now more clear
<TheMue> fwereade: and where I underestimated it *sigh*
<TheMue> fwereade: hidden behind three nice words ...
<fwereade> TheMue, haha :)
<TheMue> fwereade: ;)
<jam> rogpeppe: you have some doc comments that are out of date (startStateWorker vs ensureStateWorker) otherwise LGTM
<fwereade> TheMue, the questions and answers are probably not actually that hard, it's just that I'd like to minimise surprising consequences and anything that involves copying several MB around is going to be a bit noticeable
<rogpeppe> jam: ah, i thought i'd fixed that, thanks
<TheMue> fwereade: yep
<TheMue> fwereade: what I got from the mail it sounded only like a "hey, we've got no tools. I want to bootstrap, so please call sync-tools"
<fwereade> rogpeppe, LGTM, one thought
<rogpeppe> trivial CL that fixes golang-tip govet against juju-core trunk: https://codereview.appspot.com/10607043/
<rogpeppe> please could someone have a look quickly, as the issue is stopping me from proposing anything currently.
<rogpeppe> fwereade: i believe you're on call :-)
<rogpeppe> fwereade: i invoke you
<fwereade> rogpeppe, cheers
<fwereade> rogpeppe, LGTM trivial
<rogpeppe> fwereade: ta
<jam> fwereade, rogpeppe: pulling out some of the testing code in apiserver/machine into apiserver/testing so that I can reuse it for upgrader: https://codereview.appspot.com/10608043/
<jam> rogpeppe: in the HA case, you'll still have the root machine that starts the HA off, right?
<rogpeppe> jam: yes
<jam> so the newly started will-be-root-nodes still have an API server to connect to to find out what they will be doing
<jam> so there is still only 1 'machine/0' that bootstraps the whole process
<rogpeppe> jam: yes
<wallyworld_> fwereade: no urgency, could you look at https://codereview.appspot.com/10534043/? i'm off to bed soon so will check any comments tomorrow
<rogpeppe> jam: yes
<rogpeppe> jam: but
<rogpeppe> jam: once it's bootstrapped the whole process, its jujud might restart
<jam> wallyworld_: did you win?
<wallyworld_> yes!
<fwereade> wallyworld_, ack
<wallyworld_> smashed them :-D
<jam> wallyworld_: grats
<wallyworld_> 1-1 now. we need the 3rd game
<rogpeppe> jam: and then it's no longer appropriate to open state without opening the API first
<rogpeppe> jam: assuming there's at least one other API server out there
<fwereade> rogpeppe, I'm wondering if there's some justification for always connecting to state if state info is available
<rogpeppe> fwereade: in the future, state info won't be passed in cloudinit
<fwereade> rogpeppe, if a machine comes up without it, it can request it and write it, and the state workers that it'll start anyway can just fail repeatedly until it's written out
<rogpeppe> fwereade: and it's better to connect with the latest info if we can
<fwereade> rogpeppe, the bootstrap case remains special
<rogpeppe> fwereade: and the API holds the freshest info
<fwereade> rogpeppe, sure, but I imagine the worker that keeps an eye on state info is going to have to be writing out updated versions every so often regardless
<jam> rogpeppe: why do we believe the information about what API server to connect to is any fresher than what State to connect to?
<rogpeppe> fwereade: sure. but we might come up after being partitioned from the API for some time
<fwereade> rogpeppe, in the bootstrap case only, write it in so the state workers can start and work immediately, and the api workers can sit and wait until there's an api available
<rogpeppe> jam: that's all we've got
<rogpeppe> fwereade: what happens when we can remove jobs from a machine?
<fwereade> rogpeppe, delete the stateinfo and bounce the agent I guess?
<rogpeppe> fwereade: the machine is disconnected and comes back up, thinking that it needs to connect to the state, but actually it no longer has that privilege, so it connects to the state repeatedly and fruitlessly
<jam> rogpeppe: until it manages to connect to the API and finds out it is no longer able to do so.
<fwereade> rogpeppe, what jam said
<rogpeppe> jam: yes, so either way the APIWorker code has to interfere with the StateWorker code.
<fwereade> rogpeppe, at which point it gets deleted and everybody's happy
<rogpeppe> jam, fwereade: i'm not sure that we gain much by connecting to the state server regardless of the API.
<fwereade> rogpeppe, that way the two tasks can be completely orthogonal, surely?
<rogpeppe> [13:56:27] <fwereade> rogpeppe, at which point it gets deleted and everybody's happy
<fwereade> rogpeppe, none of this conditional complexity and passing statey things into api methods
<rogpeppe> fwereade: the APIWorker has to do that deletion
<rogpeppe> fwereade: which makes the two tasks non-orthogonal
<fwereade> rogpeppe, there's an api worker solely responsible for getting, updating, deleting state info -- but that's not something the api runner needs to take into consideration, is it?
<rogpeppe> fwereade: having the two tasks communicating via the shared stateinfo state seems wrong to me
<rogpeppe> fwereade: when we know exactly when a state worker is needed
<rogpeppe> fwereade: and can start it then
<fwereade> rogpeppe, at the cost of forcing everyone who wants to know about the api tasks to also figure out what the deal is with the state tasks
<rogpeppe> fwereade: ??
<rogpeppe> fwereade: the only shared knowledge is those few lines in machine.go
<rogpeppe> fwereade: noone writing an API-based worker needs to know about any of that stuff
<rogpeppe> fwereade: we're talking < 10 lines of code here.
<fwereade> rogpeppe, agreed -- but to figure out what the agent does wrt the api, you need to derail into the state code
<rogpeppe> fwereade: i think that doing it directly is nicer than doing it indirectly by side-effect of changing the shared stateinfo
<rogpeppe> fwereade: which seems more like magic to me
<jtv> rogpeppe: thanks for the review - would you also have time for another?  It's this one: https://codereview.appspot.com/10480045/
<fwereade> rogpeppe, the actual state info is only "shared" between the thing that reads it and the thing that writes it
<fwereade> rogpeppe, we need both those things already
<fwereade> rogpeppe, why explicitly couple two distinct components when they'll work just fine independently anyway?
<fwereade> rogpeppe, if the api runner needs to know the state info, that's a smell
<fwereade> rogpeppe, if a component whose entire purpose is writing the state info knows about it, and *that* is run by an api worker, no problem
<fwereade> rogpeppe, distinction seem meaningful?
<rogpeppe> fwereade: it still feels icky to me; that's perhaps because i haven't absorbed its possibilities. at the moment it feels like smearing responsibilities (changing the stateinfo not only changes the stateinfo, but also has the side effect, at some point in the future maybe, of triggering state-based tasks to connect and run.)
<rogpeppe> fwereade: it *may* be a better way of doing it, but i'd need to think hard about it for a while
<rogpeppe> fwereade: and for the time being, i'm reasonably happy with the current approach, which i think works ok
<rogpeppe> fwereade: and i really want to get this stuff in
<fwereade> rogpeppe, yeah, I did LGTM it, I'm not trying to block you
<rogpeppe> fwereade: thanks
<fwereade> rogpeppe, I'm also just wittering on about how I'd like to see it evolve
<ackk> hi all, I have a question about juju-core API: after sending a WatchAll request, does the first AllWatchersNext response contain the whole current environment state, or can it be broken up in multiple responses?
<rogpeppe> fwereade: i'd rather that the component that was responsible for writing the stateinfo was also directly responsible for adding or deleting the state worker.
<rogpeppe> jtv: looking
<jtv> thanks
<rogpeppe> jtv: reviewed
<jtv> Thanks again!
<jtv> rogpeppe: not sure I understand your review comment... you want me to move the initialization of the environ's "name" attribute down by one line?
<rogpeppe> jtv: did you see what i did in the other providers to solve the same issue?
<jtv> Yes, I was the first to review the branch.  I did something as similar as I could here.
<rogpeppe> jtv: i made the setting of the name field entirely independent of SetConfig
<jtv> Yes, that's what I did too.
<rogpeppe> jtv: you set the name field within SetConfig, no?
<rogpeppe> jtv: i would set it in Open
<jtv> That's what I did.  I guess you just missed it then because it was so hassle-free.  :)
<jtv> env := azureEnviron{name: name}
<jtv> âinitializes "name" right from the start, for maximum protection as the ads say.  :)
<rogpeppe> jtv: ha ha!
<rogpeppe> jtv: i'd read the red code as green code
<rogpeppe> jtv: sorry for the bogus comment
<jtv> Red code much better than Green squad code!
<rogpeppe> jtv: :-)
<mgz_> now now, let's not have arguments :P
<rogpeppe> jtv: LGTM
<TheMue> fwereade: thx for review
<dimitern> fwereade: got a minute?
<jtv> mgz_: Arguments much better than global state!
<jtv> thanks again  rogpeppe
<mgz_> death by joke!
<jtv> In Soviet Russia, we kill joke.
<jtv> I'm sorry - stress-induced giddiness.
<fwereade> dimitern, heyhey
<fwereade> dimitern, sure
<dimitern> fwereade: I'm running into some issues with the deployer tests after removing the units watcher arg
<fwereade> dimitern, oh yes?
<dimitern> fwereade: more specifically - take a look at TestDeployRecallRemovePrincipals and TestDeployRecallRemoveSubordinates
<dimitern> fwereade: I changed the deployer to use the machineId instead of a tag to verify whether it's responsible
<dimitern> fwereade: so now unassigned units cannot be deployed
<fwereade> dimitern, cannot be removed you mean?
<dimitern> fwereade: and the tests assuming that seem kinda crackful
<dimitern> fwereade: that as well
<fwereade> dimitern, hmm, don't forget this code *does* need to work with units started with an old deployer
<dimitern> fwereade: but now I have this check: http://paste.ubuntu.com/5801495/
<dimitern> fwereade: before it was deployerTag, ok := unit.DeployerTag(); ok { responsible == tag == d.tag }
<fwereade> dimitern, ok, that looks reasonable, I thnk
<dimitern> fwereade: the thing is - now the above 2 tests timeout right about there: http://paste.ubuntu.com/5801499/
<fwereade> dimitern, then the only impact is making sure that we list units of old-style deployers as well in the manager
<dimitern> fwereade: and I can't get the idea behind unassigning and then checking whether it's still deployed
<fwereade> dimitern, well, unassigning is kinda madness and crack actually
<fwereade> dimitern, because it's all fundamentally unknown
<dimitern> fwereade: it's definitely confusing - what should it be instead?
<fwereade> dimitern, well, it's not a feature we've designed properly at all, it's an 18-month-old guess we still haven't found a use for
<fwereade> dimitern, so, to be clear, how do the tests fail? they don't remove the unit when it got unassigned, because..?
<dimitern> fwereade: http://paste.ubuntu.com/5801508/ - that's the test output
<fwereade> dimitern, ah!
<rogpeppe> fwereade, anyone: i'd appreciate reviews of https://codereview.appspot.com/10494043/ and https://codereview.appspot.com/10554043/ if possible
<fwereade> dimitern, so you saw a change
<fwereade> dimitern, asked the machine for its id
<fwereade> dimitern, it gave you that error and it filtered down
<fwereade> dimitern, instead of trapping that error (in your first paste) and setting responsible = false
<dimitern> fwereade: ah!
<dimitern> fwereade: good catch
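The fix fwereade is pointing at, trapping the error from the unassigned unit and setting `responsible = false` rather than letting it filter down and stall the deployer, follows a standard Go pattern. This sketch is illustrative; the error value and function names are stand-ins, not the actual juju state API:

```go
package main

import (
	"errors"
	"fmt"
)

// errNotAssigned stands in for the state error returned when a unit has
// no assigned machine; the real juju error type differs.
var errNotAssigned = errors.New("unit not assigned to a machine")

// assignedMachineId looks up the machine a unit is assigned to.
func assignedMachineId(unit string, assignments map[string]string) (string, error) {
	id, ok := assignments[unit]
	if !ok {
		return "", errNotAssigned
	}
	return id, nil
}

// responsible traps the not-assigned error and reports false instead of
// propagating it: an unassigned unit is simply not ours to deploy.
func responsible(unit, myMachine string, assignments map[string]string) (bool, error) {
	id, err := assignedMachineId(unit, assignments)
	if errors.Is(err, errNotAssigned) {
		return false, nil // unassigned: not responsible, but not a failure
	}
	if err != nil {
		return false, err
	}
	return id == myMachine, nil
}

func main() {
	assignments := map[string]string{"wordpress/0": "1"}
	ok, _ := responsible("mysql/0", "1", assignments)
	fmt.Println(ok)
}
```

Only genuinely unexpected errors are returned to the caller; the expected "not assigned" case becomes an ordinary boolean answer.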
<dimitern> fwereade: so this works: http://paste.ubuntu.com/5801529/ - but only for principals, the subordinates test still fails with the same timeout error
<dimitern> fwereade: log http://paste.ubuntu.com/5801531/ - the only thing I changed is to add these lines in prepareSubordinates: http://paste.ubuntu.com/5801536/
<fwereade> dimitern, I think that one now needs to take into account that the principal will also be deployed?
<dimitern> fwereade: you mean waitFor(c, isDeployed(ctx, u.Name()) first, then the sub?
<rogpeppe> jam: what does this mean: "No proposals found for merge of https://code.launchpad.net/~rogpeppe/juju-core/311-juju-bootstrap-state-change-password-1.6 into https://code.launchpad.net/~go-bot/juju-core/trunk."
<fwereade> dimitern, I suspect so
<rogpeppe> jam: i just got it as an error from the Go Bot
 * rogpeppe wishes there was some way of knowing what tarmac is up to at any given moment
<fwereade> dimitern, by the way, you know the separate branch I suggested that allows principal units to become dead without waiting for subs? I'm not quite so sure that's a good idea, don't waste any time removing that code :)
<dimitern> fwereade: I'll propose it shortly as a first step
<fwereade> dimitern, remember you can't land the change you're making without removing deployer from the unit agent
<dimitern> fwereade: sure
<fwereade> dimitern, and do please spin something up and test it live with an actual subordinate :)
<dimitern> fwereade: ok, will do
<fwereade> dimitern, and try taking things down in various ways
<fwereade> dimitern, destroy principal unit/service, destroy subordinate service, destroy relation
<fwereade> dimitern, at least you can --force machine and run all the tests in the same place ;p
<dimitern> fwereade: i might need some help on that
<fwereade> dimitern, sure -- I have a kanban meeting to get to now though
<mgz_> rogpeppe: not sure, nothing seems obviously wrong, but tarmac seems to have not found the merge proposal
<rogpeppe> mgz_: hmm, weird
<mgz_> ah, you reproposed after targetting the wrong branch?
<mgz_> that seems likely to be the issue
<rogpeppe> mgz_: i don't *think* so, but it's possible
<rogpeppe> fwereade: kanban?
<fss> niemeyer: ping
<niemeyer> fss: Yo
<rogpeppe> jam: any idea about this message: "No proposals found for merge of https://code.launchpad.net/~rogpeppe/juju-core/311-juju-bootstrap-state-change-password-1.6 into https://code.launchpad.net/~go-bot/juju-core/trunk."
<rogpeppe> jam: ?
<rogpeppe> jam: the go-bot has actually already merged that prereq
<rogpeppe> jam: despite it saying that it can't find it
<sidnei> rogpeppe: sometimes the lp api lags behind
<rogpeppe> jam: the weird thing is that the merge proposal has existed for days
<rogpeppe> sidnei: ^
<rogpeppe> sidnei: but maybe it's a misleading error message
<sidnei> in one case i had to delete the cache that the lazr.restfulclient keeps locally to get it to notice
 * TheMue is stepping out for some time and will be back later
<fwereade> rogpeppe, both LGTM
<rogpeppe> fwereade: thanks
<rogpeppe> the API-connection branch has finally landed in trunk
<rogpeppe> yay!
<dimitern> rogpeppe: you've got 2 reviews
<rogpeppe> dimitern: thanks
<rogpeppe> dimitern: i'm just moving towards approval
<dimitern> rogpeppe: only the panics are fishy
<rogpeppe> dimitern: the panics are there because i don't expect those methods to be called in the tests
<rogpeppe> dimitern: no point in writing code that's never used
<dimitern> rogpeppe: hmm.. well, if it's there we should test it (at some point)
<rogpeppe> dimitern: i'm sure that they'll be fleshed out later when it becomes common test code
<rogpeppe> dimitern: all in good time :-)
<dimitern> rogpeppe: ok then
<mgz_> rogpeppe: what did you do to get your merge unstuck?
<rogpeppe> mgz_: nothing at all
<rogpeppe> mgz_: ah, no
<rogpeppe> mgz_: i re-approved it
<mgz_> okay, good to know for future reference
<rogpeppe> "There are additional revisions which have not been approved in review. Please seek review and approval of these new revisions."
<rogpeppe> aargh
<rogpeppe> does that mean i can't make changes that people suggest and then just land the branch?
<mgz_> you just toggle the approval state for that
<mgz_> you can't approve then push (as I found, I think)
<rogpeppe> mgz_: i don't think i did. but maybe i did.
<rogpeppe> mgz_: sigh
<mgz_> there may also be a little lag
<rogpeppe> mgz_: i think tying the approval to the revno that was loaded in the lp page is bogus.
<rogpeppe> mgz_: too clever for its own good and ours
<dimitern> fwereade: so once the deployer creation is removed from the unit agent, there's no need to test for deployment right?
<rogpeppe> mgz_: i'm spending far too much time shepherding the submission process
<dpb1> fwereade: Thanks for the review.  So, in pyjuju setting "" even on the command line (juju set "foo=") would set an empty string in the config.  AFAIK, there was no way to get a nil/null/None into the charm config, since that state does not exist in Bash (lowest common denominator)
<fwereade> dpb1, hmm, so what were we doing with "" values at config-get time? were we always stripping them out?
<dpb1> "" -> remained as is.  something unset (no default in the config) was stripped out.  The only way to get a null was something unset in the config.yaml that you probed directly with json (config-get --format=json unset_key => null)
<dpb1> fwereade: my brain hurts just typing that. :)
<fwereade> dpb1, turned out pretty readable actually :)
<dimitern> fwereade: https://codereview.appspot.com/10617043 - first attempt to see the general direction, will do live testing now
<fwereade> dpb1, so, ok -- I guess there wasn't any way to explicitly clear a value in python and get back to the default then?
<fwereade> dpb1, sorry, I've been doing go for too long
<fwereade> dpb1, ah! right, I remember why we turned "" into nil on input -- because python couldn't distinguish between a default value and an explicit value that matched the default
<fwereade> dpb1, and hence could not handle changing default settings on charm upgrade
<dpb1> fwereade: correct.  once it is defined, no way to unset it, so "" was the defacto "unset"
<dpb1> oh.
<fwereade> dpb1, sorry, it was IIRC an overheard conversation at a sprint about a year ago
 * dpb1 parses
<fwereade> dpb1, I'm frantically loading state myself
<fwereade> dpb1, so, we took the ""-means-unset convention and formalized it a bit
<dpb1> interesting
<fwereade> dpb1, and reassured ourselves it couldn't possibly hurt anyone because the semantics were just the same
<fwereade> dpb1, ofc you can see how well that turned out
<dpb1> hehe
<dpb1> well, it's a small thing, and easy to fix in the charms, its just... there are a lot of them. :)
<fwereade> dpb1, but I think there's a germ of value in the idea somewhere
<fwereade> dpb1, I firmly believe we need to fix this
<fwereade> dpb1, we've broken people and we must unbreak them
<fwereade> dpb1, to me the question is whether the fix is temporary, with a deprecation warning, or grandfathered in forever
<dpb1> fwereade: agreed.  I was kind of surprised by it (as a charm author).  I'm sure others would be too
<dpb1> fwereade: to me it seemed straightforward, "default: " in config.yaml maps to "" when you read it out.  But you threw that wrinkle in about distinguishing at charm upgrade time
<fwereade> TheMue, ping
<fwereade> dpb1, yeah -- and the ability to reset to default is quite nice too
<dpb1> fwereade: I guess yaml is just not expressive enough to consider a nil case
<fwereade> dpb1, null is perfectly valid yaml but we don't want people to have to type that ;p
<dpb1> fwereade: ah...
<fwereade> dpb1, yaml can probably express "please fire all the nuclear missiles" if you're not careful
 * dpb1 wonders what would happen if he typed   juju set key=null  :)
<fwereade> dpb1, I don't *think* that goes through a yaml filter in python, and it certainly doesn't in go
<dpb1> fwereade: iirc, I've tried that and you are right.  just stringifies to null
<fwereade> dpb1, but if you used --config with a null value we'd interpret it as "please delete this setting" with the final impact being "replace with default"
<mgz_> fwereade: that's what safe_load is for :P
<fwereade> dpb1, except I'm suddenly not sure what would happen there in python
<fwereade> dpb1, it might well start coming out as None
<dpb1> fwereade: interesting.  I've never used this feature.  Since charms interpret "" as unset (for a variety of reasons we have discussed), that has always been how ive done it from the command line   juju set "key="
<dpb1> fwereade: in any case, having some change in behavior when using juju-core in this manner is fine, better than having to change charms in subtle ways, IMO.
<fwereade> dpb1, in the very narrow context of "" defaults being valid I completely agree
<fwereade> dpb1, you've brought up a more disturbing question though
<dpb1> I don't like the sound of that.
<fwereade> dpb1, that `juju set option=` is now sometimes a *very* different operation across go and python
<fwereade> dpb1, this only impacts string keys at least
<fwereade> dpb1, for other values it's new and useful and I don't think it hurts
<fwereade> dpb1, can you give a gut estimate for how often you used that for an option that had a non-empty default?
<dpb1> fwereade: so you are saying in juju-core if I do "juju set option=" it will revert to the default value?
<fwereade> dpb1, yeah
<fwereade> dpb1, that was something that happened so long ago I'd completely forgotten the original behaviour :/
<dpb1> well, that isn't entirely unexpected, I suppose.
 * fwereade wishes we hadn't developed in the dark for quite such a long time
<dpb1> but it brings up how you would actually set the value to ""
<dpb1> as you are getting at.
<fwereade> dpb1, quite so
<mgz_> `juju set option=\"\"` I guess
<fwereade> mgz_, except when `""` itself is valid
 * fwereade sighs
<dpb1> For me, I use that idiom all the time, when I'm developing the charm.  unset, set back, see what the charm does
<fwereade> dpb1, and this is when the default is not ""?
<dpb1> but, as you can expect, that differs depending on how "dynamic" the charm is supposed to be.  ie., are options used in install or config-changed.
<dpb1> fwereade: it's less important, since most options default to "" or have no default, especially as the charms grow.
<fwereade> dpb1, I'm really worried about that whole issue in general actually -- I rather feel that a charm that *requires* options is somewhat naughty, and one that can't handle changes is very naughty indeed ;)
<dpb1> fwereade: but, if I had found that before you mentioned it?  I probably wouldn't care an awfully lot.  it does something interesting (sets back to default).
<dpb1> as long as there is a way to unset, I would just use that.  modifying that behavior slightly isn't hard.
<fwereade> dpb1, where by "unset" you mean "set to empty", right?
<dpb1> fwereade: yes.  I think there are cases where you need options, but in general, it's something you should and do learn in charm building 101.  Don't require anything, and respond to change dynamically.
<fwereade> dpb1, ok, so, to summarize: `option=` is a nice way to reset, and you can live with that, but you need some way to express empty strings in general?
<dpb1> fwereade: hehe, well, that is the crux of what we are talking about.  In pyjuju, the two were equivalent (except for the case of config-get --format=json unset_value)
<dpb1> fwereade: ok, summarize... let me see
<dpb1> fwereade: the bug specifically mentions "default:" and how that should go to empty string.
<fwereade> dpb1, that's the easy bit -- if "" is valid, then we can keep that setting directly
<dpb1> beyond that, I think most of what we have been talking about it theory.  I don't need "option=" on the command line to behave in any certain way.
<fwereade> dpb1, so my issue is of global consistency
<fwereade> dpb1, if we allow "" in charm defaults, which I think is good, we should be able to express "make this the empty string, not the default" for the cases where the default is *not* ""
<fwereade> dpb1, as it is the range of possible values is constrained by input method and that kinda sucks
<dpb1> fwereade: yes, that makes sense.
<dpb1> fwereade: so something like , juju set --default option
<dpb1> (just throwing that out there)
<dpb1> afaik, pyjuju had no way of doing what you are saying, but I think it would be nice.  you can get the info out of "juju get" and just set it by hand.
<fwereade> dpb1, the specific scenario is for, say, tuning changes -- we'd like to update the preferred setting for users who haven't expressed a preference, but not for those who explicitly did
<fwereade> dpb1, the theory is that those who don't express a preference are those least likely to fix it manually and we should make it do the Right Thing by default
<dpb1> fwereade: right, but in that case, they aren't going to be changing things at the command line in the first place, right?
<dpb1> they will just deploy and let it work
<fwereade> dpb1, but it should be easy for them to fiddle with the settings and reset the ones they didn't want to -- but still get the benefits from upgrades without having to look
<dpb1> yes.
<dpb1> in that case having an explicit setting like --set-default, makes even more sense, avoid relying on "option=" doing something perhaps unexpected.
<fwereade> dpb1, I think I am becoming convinced, loath as I am to drop that cute little `float=` to reset to default in the cases where it's not clear
<dpb1> fwereade: in any case, it's a separate issue.  want me to file a bug about it?
<fwereade> dpb1, ok, that sounds good to me -- we fix charm defaults, with no deprecation warning, and file a bug that `juju set stringoption=` does the wrong thing when the default is not ""
<dpb1> ok, great.  bug coming along shortly.
<fwereade> TheMue, were you following along there roughly?
<dpb1> fwereade: https://bugs.launchpad.net/juju-core/+bug/1194945
<_mup_> Bug #1194945: juju set is overloaded <juju-core:New> <https://launchpad.net/bugs/1194945>
<fwereade> dpb1, thanks
<dpb1> fwereade: thx for chatting.  I'll be afk for a while now.
<fwereade> dpb1, cheers, anytime
<rogpeppe> dpb1, fwereade: juju unset ?
<rogpeppe> fwereade: trivial, i think: https://codereview.appspot.com/10595044
<fwereade> rogpeppe, I want to be able to do it in the same transaction as the sets, really
<fwereade> rogpeppe, trivial
<fwereade> rogpeppe, `juju set option-`? :)
<rogpeppe> fwereade: ?
<fwereade> rogpeppe, "-" indicating removal ;p
<rogpeppe> fwereade: i think i'd prefer juju set !option, but i think people with dodgy shells might object :-)
<rogpeppe> juju set myflag- foo=bar bar=tdfv
<rogpeppe> hmm, not entirely sure if that reads well
<fwereade> rogpeppe, yeah, it's a better idea in my head than on the screen
 * fwereade bbiab
<dpb1> I like unset from a readability point of view, for sure
<fwereade> dpb1, I kinda feel it's good to be able to make all config changes via the same command (and, under the hood transactionally)
<dpb1> fwereade: yes, I agree with you in that argument.
<rogpeppe> fwereade: another trivial? https://codereview.appspot.com/10620043
<arosales> have folks seen that when a service unit goes down (outside of Juju) it is still made available to haproxy?
<mgz_> AssertStrop... funny tyop
<dimitern> rogpeppe: not sure increasing the ping timeout to 5m is a good thing
<rogpeppe> dimitern: why does it need to be faster?
<rogpeppe> dimitern: what are we guarding against?
<dimitern> rogpeppe: we want dead connections to die quickly
<dimitern> rogpeppe: i.e. to be detected and closed earlier
<rogpeppe> dimitern: if we're using them, they'll die quickly - if not, we don't care that much
<rogpeppe> dimitern: 5 seconds is way too fast
<dimitern> rogpeppe: why so?
<rogpeppe> dimitern: because it's constant network traffic from every single node
<rogpeppe> dimitern: and a connection going down is a rare event
<dimitern> rogpeppe: how about 1m then?
<rogpeppe> dimitern: that's probably ok
<TheMue> fwereade: just returned to the screen and will now read the chat log
<dimitern> rogpeppe: reviewed
<rogpeppe> dimitern: ta
<fwereade> rogpeppe, LGTM also
<rogpeppe> fwereade: thanks
<rogpeppe> fwereade: do you have an opinion on the ping frequency?
<fwereade> rogpeppe, a minute sounded reasonable
<fwereade> rogpeppe, but probably only because it's the middle of the 3 values I was shown :)
<fwereade> rogpeppe, that's the sort of number I'm perfectly happy tuning in response to observation though
<rogpeppe> fwereade: yeah. i changed it from 5 seconds because i was watching the request log and it seemed way too fast
<fwereade> rogpeppe, fair enough, I feel like a minute is quite a nice resolution for now
<mgz_> fwereade: https://codereview.appspot.com/10623043
<mgz_> I can't find a nicer way of making those helpers shared
<TheMue> fwereade: so, read it, good discussion
<TheMue> fwereade: so when default: is '' the value is kept and only in this case (default: is a string) set foo= sets foo to an empty string
<TheMue> fwereade: so foo default: 'bar' and set foo= sets foo not to bar (the default) but to ''
<TheMue> fwereade: and the new issue will introduce unset foo or set !foo to reset it to the default value (or delete it if no default is specified)
<TheMue> fwereade: correct summary?
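The distinction TheMue is summarizing, an explicitly empty string versus an unset value that falls back to the charm default, is the classic nil-vs-empty problem. One way to model it in Go (purely illustrative, not juju-core's actual settings code) is a map of string pointers, where absence means "unset, use the default":

```go
package main

import "fmt"

// effective returns the value a charm would see for key: an explicit
// setting wins, even if it is "", and an absent or nil entry falls back
// to the charm default. Names here are hypothetical.
func effective(settings map[string]*string, defaults map[string]string, key string) string {
	if v, ok := settings[key]; ok && v != nil {
		return *v // explicit, possibly empty, string
	}
	return defaults[key] // unset: use the default
}

func main() {
	empty := ""
	defaults := map[string]string{"greeting": "hello"}

	// `set greeting=` with a string-typed option: explicitly "".
	settings := map[string]*string{"greeting": &empty}
	fmt.Printf("%q\n", effective(settings, defaults, "greeting"))

	// Never set (or unset): the default shows through, and a later
	// charm upgrade changing the default would take effect here.
	fmt.Printf("%q\n", effective(nil, defaults, "greeting"))
}
```

The whole pyjuju-vs-go debate above is about which of these two states `juju set foo=` should produce.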
<jam> mgz_: how was the Release meeting?
<mgz_> jam: fine, I'll link you the notes
<jam> rogpeppe: https://code.launchpad.net/~rogpeppe/juju-core/312-alt-api-jobs/+merge/170135
<jam> has a prerequisite of https://code.launchpad.net/~rogpeppe/juju-core/311-alt-juju-bootstrap-state-change-password-1.5
<jam> *not* 1.6
<rogpeppe> jam: that's not the one that failed
<jam> rogpeppe: which one failed?
<rogpeppe> jam: this one: https://code.launchpad.net/~rogpeppe/juju-core/305-alt-jujud-use-tasks-package/+merge/170856
<jam> rogpeppe: so see if this fits what happened. You had a branch with a prereq, and they both got marked Approved at roughly the same time. Go-bot landed one of them, and then rejected the other with "No Proposals Found".
<rogpeppe> jam: sounds right
<rogpeppe> jam: i thought i could approve both branches at the same time and the prereq would get merged first
<jam> rogpeppe: IIRC it doesn't sort the requests by prerequisite, so it tried to merge 305-alt-jujud first, which has an approved-but-not-merged-prereq, so it gets bounced, and then 311-alt-jujud- is triggered immediately after and lands.
<rogpeppe> jam: ha, that's not great
<rogpeppe> jam: so if there are prerequisites, you have to wait until the prereq is merged before approving the next in line
<jam> rogpeppe: it probably doesn't help that the dependent branch comes alphabetically before the prereq branch.
<jam> 305 depends on 311
<rogpeppe> ha ha ha
<rogpeppe> it sorts alphabetically?!
<rogpeppe> jam: it should really sort by approval time, if anything
<jam> rogpeppe: I think it sorts by whatever order LP gives it, which could be alphabetically
<jam> I don't think it does sort internally.
<rogpeppe> jam: can we change it?
<jam> rogpeppe: https://launchpad.net/tarmac code is available
<jam> rogpeppe: https://bugs.launchpad.net/tarmac/+bug/845706
<_mup_> Bug #845706: Tarmac fails to resolve pre-req branches when they've already been merged <Tarmac:Triaged by rockstar> <https://launchpad.net/bugs/845706>
<jam> rogpeppe: looking at the bug report
<rogpeppe> jam: but if we make changes, do they have to be back-ported to precise before we get the benefit?
<jam> it appears launchpad actually does stop showing you the merge proposal for some reason.
<jam> rogpeppe: I'm running from source, not a deb
<rogpeppe> jam: if you want to have a look, this branch relates somewhat to an earlier branch you proposed. https://codereview.appspot.com/10611046/
<jam> I don't know the validity of this bug comment: This is actually a bug in Launchpad, it seems. Sometimes it will fail to give the branch in the requested list of proposals. This seems to happen regardless of the prerequisite's merge status, but does seem to occur more often when it has been merged.
<rogpeppe> jam: thanks for pointing me to that report
<rogpeppe> jam: i've reached eod. time to do some packing... see you tomorrow morning.
<rogpeppe> g'night all
<jam> rogpeppe: have a good evening. I'm about 6 hours past mine :)
<jam> dimitern: you are attributed the code roger is removing in https://codereview.appspot.com/10611046/
<jam> there is an intermediate object that adds the Id and a pointer back to the resources map
<jam> which doesn't seem to be used
<jam> If it isn't needed, we can remove it, I just wanted to check if you remember a need for it (that may just not have been written yet)
<jam> It is also possible you are marked with the code because you moved it in a refactoring, and not because you wrote it.
<dimitern> jam: what?
<dimitern> jam: I introduced the interface yes
<jam> dimitern: so you have a Resource interface, and a registry which tracks them with secret srvResource type
<jam> the only thing srvResource provides over Resource
<jam> is the 2 attributes Id, and a pointer back to the resource registry
<jam> Roger's proposal removes those 2 attributes by just using a map of Resource directly.
<jam> and I'm checking if you had a reason to have those attributes, which are currently unused.
<dimitern> jam: well, originally I had to introduce the interface to decouple the srvRoot from the machiner facade
<dimitern> jam: and I think that CL is just reorganizing things around, rather than removing functionality
<jam> dimitern: well it specifically removes an intermediate object which has some attributes the new one doesn't have, but they don't seem to be used, and I didn't know if they had other plans to be used.
<dimitern> jam: i don't think so - if we need these, we'll reintroduce them
<rogpeppe> jam: those attributes were needed only because srvResource was originally embedded as an API object directly
<rogpeppe> jam: it implemented Stop
<rogpeppe> jam: so it needed to be able to find the resources struct to remove itself
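rogpeppe's explanation, that the Id and back-pointer existed only so a resource could remove itself from the registry when stopped, can be illustrated with a stripped-down registry. This is a hypothetical reconstruction of the old shape, not the actual apiserver code (and it omits the locking a real registry would need):

```go
package main

import "fmt"

// Resource is anything the API server tracks and can stop.
type Resource interface {
	Stop() error
}

// srvResource wraps a Resource with the id and back-pointer that let it
// remove itself from the registry on Stop. Once the registry performs
// the removal itself, a plain map[string]Resource suffices and both
// extra fields (and this wrapper) can be deleted, which is what the CL
// under review does.
type srvResource struct {
	Resource
	id        string
	resources *resources
}

func (r *srvResource) Stop() error {
	err := r.Resource.Stop()
	r.resources.remove(r.id) // the sole reason the back-pointer existed
	return err
}

type resources struct {
	m map[string]*srvResource
}

func (rs *resources) remove(id string) { delete(rs.m, id) }

type nopResource struct{}

func (nopResource) Stop() error { return nil }

func main() {
	rs := &resources{m: map[string]*srvResource{}}
	r := &srvResource{Resource: nopResource{}, id: "r1", resources: rs}
	rs.m["r1"] = r
	r.Stop()
	fmt.Println(len(rs.m))
}
```

Moving the `delete` into the registry's own Stop path removes the last use of both fields, which is why the CL can drop them without losing behavior.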
<fss> niemeyer: hi, took me all day to see your pong x)
<fss> niemeyer: are you going to fisl next week?
<niemeyer> fss: I'll try to
<thumper> morning
<thumper> fwereade: around?
<fss> niemeyer: I was wondering if we can chat about juju-core + vpc :)
<thumper> stabby stabby
<thumper> pretty happy that `bzr destroy-environment` doesn't do anything
<fwereade> thumper, heyhey
<thumper> fwereade: hey
 * thumper makes a sad face
<thumper> missing reviews
<fwereade> thumper, haha, I've done that bzr/juju thing
<fwereade> thumper, that's why I'm on :)
<thumper> ah
<thumper> fwereade: I've just merged trunk
<thumper> and made the lxc provisioner work with the new runner stuff
<thumper> and tested on ec2
<fwereade> thumper, oh yes..?
<thumper> had to fix one thing
<fwereade> that sounds like the past tense which I infer is a good thing
<thumper> where the api was assuming that for machine-0-lxc-0, that "0-lxc-0" was the machine id
<thumper> so used the MachineIdFromTag and it is all good
<thumper> so we are back to where it works
<thumper> did you want me to repropose
<thumper> actually, I'll do that anyway
<thumper> I do think however that the pinger is doing something a little weird...
<thumper> oh, perhaps not
<wallyworld_> fwereade: hi, thanks for the extra comments on the instance metadata branch
<thumper> it seems that the api writing code doesn't mention the source
<thumper> and it is written from the api serving side
<thumper> not the client side
<thumper> my dog is getting restless
<thumper> I think she needs to go outside for a bit
<thumper> fwereade, wallyworld_: Rietveld: https://codereview.appspot.com/10489043
<thumper> that is the branch that makes the lxc provisioner work out of the box
<wallyworld_> fwereade: if i keep Nonce on machine and write InstanceId to both machine and instanceData for compatibility, i think that may address your concerns?
<thumper> all previous pipes in that pipeline have landed
 * fwereade is working through it literally right now
<thumper> this one branch makes us provision containers
<fwereade> wallyworld_, yep, that sgtm
<wallyworld_> fwereade: are you +1 on the sig change to machine.InstanceId() - returns (string, bool, error)
<thumper> I have my daughter at home today finishing off her science fair project
<thumper> better a day off school, than staying up until midnight and getting tired and stressed out
<thumper> wallyworld_: what?
<thumper> wallyworld_: why three results?
<fwereade> wallyworld_, ah, hmm, not so sure there -- but if you keep instanceid on the machine for now we get to dodge that question for a bit, right?
<wallyworld_> fwereade: sure, but i'd rather alter the sig now to match what it will be in the end
<fwereade> wallyworld_, if it's an error that would conventionally replace the info encoded in the bool with a specific error
<wallyworld_> thumper: results is string (the value), bool (is the machine provisioned), error (was there an error finding the id)
<thumper> fwereade: I agree
<thumper> error NotProvisioned
<thumper> and match against that
<thumper> two, not three results
<fwereade> wallyworld_, I wouldn't be comfortable predicting that change right now tbh
<wallyworld_> ok, i can do that. i did like the bool
<wallyworld_> i'll make it an error, that will cover both cases
<thumper> wallyworld_: it just becomes one more step, error.IsNotProvisioned(err)
<fwereade> wallyworld_, when we've got it behind the API I suspect we'll find out some interesting things about actual usage
<wallyworld_> thumper: yes, agreed
<fwereade> wallyworld_, but if you feel strongly I'm not too bothered
 * wallyworld_ doesn't feel strongly
<thumper> fwereade: also, I have looked at a very interesting library
<thumper> fwereade: that may fix watchers
<wallyworld_> fwereade: with the constraints branch - i'm thinking i may need a deployment constraints struct that embeds the current constraints struct
<thumper> fwereade: and give us distributed pub/sub
<thumper> fwereade: http://code.google.com/p/go-router/
<thumper> fwereade: worth looking into I think, but perhaps for later
<fwereade> thumper, at the lowest level the watcher have to remain, but the distribution of the events is absolutely open for improvement imo
<thumper> fwereade: it could mean we have one watcher on things, and just publishes the change
<wallyworld_> fwereade: the deployment constraints struct has the container constraint; that will avoid the need for raising an error in places where the deployment constraint is not needed, like adding a machine
<fwereade> thumper, yeah, definitely
<thumper> listeners instead are then just subscribers
<fwereade> wallyworld_, yeah, I was wondering about that
<wallyworld_> thumper: \o/ bring it on
<thumper> I want to poke around with it for a while to test efficiency and whether it does what we need
<wallyworld_> fwereade: if you're not immediately -1 on the idea, i'll rework the branch to see how it pans out
<thumper> but the docs look promising
<fwereade> wallyworld_, but I'm a little reluctant too -- partly, I have a feeling that struct embedding may fuck unhelpfully with serialization
<wallyworld_> really?
<fwereade> wallyworld_, I think you should check before you go too far
<fwereade> wallyworld_, and we're still wearing these schema compatibility chains ;p
<wallyworld_> fwereade: you talking about serialisation over the wire?
<fwereade> wallyworld_, in mongo
<wallyworld_> fwereade: there's a separate mongo doc already and conversion functions
<wallyworld_> between the mongo doc and constraints entity
<fwereade> wallyworld_, ah, that's nice, I knew I did that for a reason
<wallyworld_> :-)
<wallyworld_> so i think it will be all good
<wallyworld_> tw, i would have done the same design if i were doing it, so you must have done it right :-)
<wallyworld_> bte
<wallyworld_> tw
<wallyworld_> btw ffs
<fwereade> wallyworld_, ok -- just one thought, though, that we're already expecting that different bits will pay attention to different constraints
<fwereade> wallyworld_, and I'm not quite sure that there's quite such a clean separation, once you consider matching against existing instances as well
<fwereade> wallyworld_, I might be wrong though
<fwereade> wallyworld_, can't hurt to investigate
<wallyworld_> i'm not sure right now, i'll look at how it pans out
 * fwereade goes back to thumper's branch
 * thumper appreciates that
<wallyworld_> thanks for the input
 * thumper takes the dog out
<wallyworld_> sorry thumper for jumping the queue :-)
<davecheney> arosales: ping
<fwereade> thumper, LGTM with waffling
<thumper> fwereade: thanks I'll look at those shortly
<thumper> need to head into town now to get science fair photos printed :-)
<thumper> wallyworld_: can I get you to look over that branch too?
<thumper> perhaps I'll get it landed this afternoon, which would be great
 * thumper heads into town
<arosales> davecheney, sorry we missed each other again
#juju-dev 2013-06-27
<thumper> wallyworld_: I can't use logger as there is already a package level logger called "logger"
<thumper> wallyworld_: and go doesn't give us a way to have file-local variables
<thumper> davecheney: does it?
<davecheney> thumper: no, there are no file local variables
<davecheney> only package, func and block scopes
<davecheney> what are you trying to do
<thumper> davecheney: don't suppose there could be?
<davecheney> ?
<thumper> davecheney: idiomatic usage, at least in other languages, is to have a file local "logger" variable
<thumper> this allows different files in one package to have different loggers
<davecheney> thumper: you already know what i'm going to say
<thumper> in my example...
<thumper> I have the lxc provisioner / broker in the worker/provisioner package
<thumper> davecheney: you are going to say "put them in a different package?"
<thumper> however, I have the tests for them share some common suite info
<thumper> and there doesn't seem to be a way to import _test declared stuff from another package
<thumper> so suites need exported stuff
<thumper> exported stuff in export_test only available in local package tests
<thumper> shared suites then impossible
<thumper> across other packages
<thumper> that is part of the problem
<davecheney> thumper: i am confused
<davecheney> have you moved on from file scopes vars ?
<thumper> davecheney: don't worry...
<thumper> I was talking myself through the problem of moving them to different packages
<davecheney> ok
<thumper> davecheney: I've been asked to ping you
<thumper> davecheney: about a problem that I have just fixed
<thumper> davecheney: JujuConnSuite wasn't closing the API connections, and fwereade was wondering if this may be related to the races you were looking into
<thumper> davecheney: the branch I'm about to land fixes this
<davecheney> related to the mgo stuff
<davecheney> could be
<davecheney> let me find my notes
<davecheney> land it and lets see
 * thumper nods
<thumper> davecheney: what is the difference between %v and %s in a sprintf?
<davecheney> %v chooses the %s,%f,%d or %b depending on the type of the thing
<davecheney> it's pretty much the default unless you absolutely know you want a string
<wallyworld_> thumper: thanks for extending the comment. i didn't make the connection that embedded = nested. makes more sense now
<thumper> wallyworld_: np
<thumper> wallyworld_: just did a local test run
 * thumper approves the mp
<wallyworld_> yay
 * thumper does a little dance
<thumper> bigger dance coming when it is actually landed
<thumper> wallyworld_: this gives automatic provisioning of lxc containers
<thumper> non-addressable containers, but containers none the less
<wallyworld_> yeah i know :-) will be good
 * thumper had forgotten to set a commit msg
<thumper> oh arse
<thumper> damn local dependencies...
<davecheney> arosales: ping me when you're ready
<arosales> davecheney, ping
<davecheney> two secs
<thumper> davecheney: how do I build the tests without running them?
<davecheney> go test -c is one way
<davecheney> that will produce $PKG.test
<thumper> hmm... don't really want anything created
<davecheney> it should be in $CWD
<davecheney> if it completes then your tests work fine
<davecheney> this is a way to check that your tests compile on other os's
<davecheney> GOOS=windows go test -c launchpad.net/juju-core/cmd/juju
<davecheney> eg
<davecheney> thumper: oh, you don't want it created
<davecheney> well... cd $(mktemp -d) && go test -c $PKG
<davecheney> ^ try that
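Pulling the thread together, a small helper along the lines davecheney suggests; it assumes a Go toolchain on PATH, and the package path in the comment is just an example:

```shell
# Compile a package's tests without running them. `go test -c` drops a
# $PKG.test binary into the current directory, so running it from a
# throwaway directory keeps the source tree clean.
compile_tests() {
    pkg=$1
    tmp=$(mktemp -d)
    (cd "$tmp" && go test -c "$pkg")
    rm -rf "$tmp"
}
# Cross-compiling the tests is a cheap check that they build on other OSes:
#   GOOS=windows go test -c launchpad.net/juju-core/cmd/juju
```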
 * thumper cocks his head
 * thumper sighs...
<thumper> import cycles
 * thumper beats the codebase into submission
<thumper> ok, trying to land again
<davecheney> thumper: sorry, was on da phone
<davecheney> i may have another way to build, but not run tests ...
<davecheney> actually ... no
<davecheney> what i was thinking of didn't work at all
<davecheney> bummer
<rogpeppe> thumper: i've got a script that builds all the tests but doesn't run them
<rogpeppe> thumper: but it uses some of my local stuff - you'd probably want to translate it into a shell you've got installed
<rogpeppe> thumper: http://paste.ubuntu.com/5803790/
<rogpeppe> thumper: i call it "gotest-c"
<rogpeppe> mornin' all, BTW
<jam> davecheney: you reviewed https://codereview.appspot.com/10465043/ but didn't LGTM. I don't quite understand what more you are asking for.
<jam> thumper: the idiom we've used in juju-core is to create a 'testing' package when we need test-related stuff shared between other packages. We've got a small number of them in the source tree already.
<jam> thumper: otherwise, why is the package level logger the wrong one?
<davecheney> jam: that is correct, i was confused about the whole passing back a *string
<jam> davecheney: well it works just fine. I suppose I could pass it back via a buffered channel if it makes you feel happier
<rogpeppe> davecheney, thumper: here's a bash version of my "gotest-c" script: http://paste.ubuntu.com/5803813/
<rogpeppe> davecheney, thumper: you'll have to go get code.google.com/p/rog-go/cmd/pxargs first though
<rogpeppe> davecheney, thumper: then you can do "gotest-c ./..." to build all tests without running them. i use it a lot
<thumper-afk> rogpeppe: ta
<rogpeppe> thumper-afk: you might want to adjust the "5" constant, which should really be dependent on the number of cpus you've got
<rogpeppe> thumper-afk: actually, i fell foul of the usual bourne shell quoting gotchas. this is slightly better: http://paste.ubuntu.com/5803829/
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: dimitern | Bugs: 6 Critical, 83 High - https://bugs.launchpad.net/juju-core/
<jam> rogpeppe: merge conflict on submitting your patch again. I got my patch landed which resolved some of the previous Resource changes.
<rogpeppe> jam: darn, ok thanks
<rogpeppe> jam: it had a transitory error when i landed it before (must fix those tests!)
<jam> rogpeppe: yeah, I haven't seen that failure before, so I wasn't sure.
 * thumper afk until the meeting in 2 hours
<wallyworld_> mgz_: ping
<rogpeppe> jam: re-approved. let's see if it lands this time.
<jtv> Probably a beginner's question... when export_test.go exports something from a package, is there anything else that needs to be done to make it available to tests outside the package?
<jtv> I tried importing launchpad.net/juju-core/environs/local, and then using local.Listen in my test -- but the compiler insisted that local.Listen was undefined.
<jtv> But the local provider's own tests can do the same thing and not have that problem.
<dimitern> https://codereview.appspot.com/10617043/
<jtv> dimitern: is that an answer for me or are you asking for a review?
<dimitern> jtv: for review :)
<jtv> Ah  :(
<jtv> Trade you?
<dimitern> jtv: sorry wasn't paying attention - will read the backlog
<jtv> Thanks.
<dimitern> jtv: so export_test makes package internals available to tests, but only inside the same package
<jtv> ...But the export_test.go is in the "local" package, and the test that makes use of it is in the "local_test" package.  How does that work?
<dimitern> jtv: go magic :)
 * jtv screams at the heavens
<dimitern> jtv: anything with packagename_test is accessible in tests, including stuff in export_test
<dimitern> jtv: if you need something from a tests package in more than one place, it's best to factor it out in a common testing package
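The rule jtv ran into can be seen in a (non-runnable) layout sketch: declarations in export_test.go are only visible to tests compiled into the same test binary as that file, which is why "local_test" sees them and the azure provider's tests do not. File paths and names below mirror the discussion, not the exact source tree:

```go
// environs/local/export_test.go
// Compiled only into the "local" package's own test binary.
package local

var Listen = listen // re-export the unexported listen for local's tests

// environs/local/local_test.go (external test package "local_test")
// Built together with export_test.go above, so local.Listen resolves here.
package local_test

// provider/azure/azure_test.go
// A *different* test binary: environs/local's export_test.go is never
// compiled into it, so local.Listen is undefined -- exactly jtv's error.
package azure_test
```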
<jtv> Damn.
<dimitern> wallyworld_: bug 1195223
<_mup_> Bug #1195223: juju all-machines.log is repetitive and grows unbounded <juju-core:New for wallyworld> <https://launchpad.net/bugs/1195223>
<jtv> Deeper and deeper into the rabbit hole for a single job...  :(
<dimitern> jtv: what do you need?
<jtv> I need to invoke local.Listen, for purposes related to the local package, in my test.
 * rogpeppe goes for breakfast
 * dimitern looks at local.Listen
<dimitern> jtv: and you need to invoke it where?
<jtv> In a test for the azure provider, where I use the local provider's storage implementation as a test double.
<dimitern> jtv: that's probably a bad idea
<dimitern> jtv: the local provider's storage is overengineered and probably needs to be simplified
<jtv> Oh great.  I was using the dummy provider's one, but fwereade preferred me to use the local provider's storage.
<dimitern> jtv: if you just need a simple http server, why not create one and handle a specific url?
<rogpeppe> jtv: why not just export local.listen?
<jtv> rogpeppe: that's what I did to get it working, but now I'm trying to find out what the "proper" way would have been!
<jtv> dimitern: I do not want a simple http server!
<jtv> I don't want any http at all.
<rogpeppe> jtv, dimitern: i think it's perfectly reasonable to have a working local storage provider from local that other things can use
<jtv> I just need a test double for a storage object.
<dimitern> jtv: yeah, that's another option - just export local.listen and NewStorage and get rid of export_test
<jtv> That's what I did, actually -- but I figured since *another* package used the export_test trick, it looked as if I ought to be able to as well.
<dimitern> jtv: nah.. it only works inside the same package
<jtv> Well it's exported from package "local" and imported into package "local_test"
<jtv> And I do mean package, not source file.
<dimitern> jtv: yeah, but the azure provider is in a different package, hence it's not visible, even if you import it
<jtv> Oh God don't tell me the rule is "exported exactly 1 package up, but no more"!
<dimitern> rogpeppe: care for a review after breakfast? https://codereview.appspot.com/10617043/
<jtv> dimitern: I was already looking at yours actually.  My end of the implied bargain.  :)
<dimitern> jtv: sweet!
<fwereade> jtv, dimitern: I suggested the local provider's storage because it's independent and it works
<fwereade> jtv, dimitern: and if we make it simpler, then great
<dimitern> fwereade: it has to be exported to work like that
<jtv> Not in this branch though.  Gotta manage the scope of a branch or things will get waaaay out of hand.  :)
<fwereade> dimitern, I'm fine exporting it somewhere -- tying storage implementations to providers is kinda dumb in the first place
<rogpeppe> jtv: i suggest that you change environs/local to export an interface like this: http://paste.ubuntu.com/5804251/
<rogpeppe> fwereade: does that seem reasonable to you?
<dimitern> rogpeppe: +1
<fwereade> rogpeppe, looks sane
<rogpeppe> actually, NewStorage should probably be func NewStorage(addr string) environs.Storage
<fwereade> rogpeppe, better yet
<rogpeppe> unless there were some fancy introspection methods we'd want to put on it
<rogpeppe> but i can't think of any
<rogpeppe> jtv: the changes to environs/local to do that should be pretty trivial
<fwereade> rogpeppe, it would still be best if it were actually outside local
<rogpeppe> fwereade: environs/localstorage ?
<fwereade> rogpeppe, sounds reasonable
<rogpeppe> filestorage?
<rogpeppe> localstorage.New(addr string) environs.Storage
<rogpeppe> or Client
<fwereade> rogpeppe, TrivialStorage :/
<rogpeppe> fwereade: nah - it's actually potentially useful and not *entirely* trivial
<fwereade> rogpeppe, as you like :)
<rogpeppe> jtv, fwereade, dimitern: i think this is a bit better actually: http://paste.ubuntu.com/5804267/
<dimitern> rogpeppe: sgtm
<jtv> dimitern: done with your review
 * jtv catches up on backscroll
<dimitern> jtv: thanks
<dimitern> jtv: helpful comments, cheers
<jtv> np
<jtv> Deal's a deal.  :)
<wallyworld_> fwereade: you happy with https://codereview.appspot.com/10447045/ now?
<jtv> rogpeppe, fwereade: it's never *entirely* trivial -- if you don't mind I'll just make that a separate branch, and first get this never-landing branch saga to a conclusion.
<wallyworld_> mgz_: ping
<fwereade> wallyworld_, helldamn I have drafts
<fwereade> wallyworld_, let me see what I said, just a mo
<fwereade> wallyworld_, sent; give it a quick read and let me know what you think
<wallyworld_> fwereade: also, on introducing a new DeploymentValue constraint embedding Value - to make this viable, I'll need to introduce a new Constraints interface
<wallyworld_> and use that throughout the codebase where feasible
<wallyworld_> given Go lacks polymorphism and other useful inheritance constructs
<fwereade> wallyworld_, I lean towards keeping it a single type until we've seen how the actual usage patterns end up
<wallyworld_> fwereade: so this would mean adding checks to other places where container constraint is not valid
<wallyworld_> besides add-machine
<fwereade> wallyworld_, I think that's more than we need actually -- just an understanding that different bits of the system pay attention to different parts of the constraints
<wallyworld_> fwereade: i'd rather code defensively and fail if people try and pass in the wrong thing
<wallyworld_> the system should fail if given invalid inputs
<wallyworld_> as people might have certain expectations and wonder why stuff didn't behave as expected
<fwereade> wallyworld_, agreed re *inputs* -- so I think what you did originally was fine -- but internally I think it's ok that different components pay attention to different parts of the structure
<fwereade> wallyworld_, if I knew for sure which parts handled what I'd be keener on splitting the type
<wallyworld_> fwereade: right, that's indeed how i coded it
<fwereade> wallyworld_, but I don't think we know enough to get it right there
<fwereade> wallyworld_, sorry for all the hassles
<fwereade> s/right there/right yet/
<wallyworld_> fwereade: i'm happy to defer the work, i just seem to recall you were unhappy with what i did but now it seems you agree?
<wallyworld_> i'm saying that from memory
<wallyworld_> will have to re-read the comments
<fwereade> wallyworld_, yeah, I was a bit more paranoid than I think I needed to be -- sorry about that
<wallyworld_> np
<wallyworld_> fwereade: just read the comments on the instance data branch. all good, i'll make the tweaks suggested. thanks for pointing out the checker, i didn't know we had one
<fwereade> wallyworld_, I think it's very new
<wallyworld_> fwereade: i'll try to get it done before bed, but if not first thing tomorrow
<dimitern> fwereade: can we get rid of WatchPrincipalUnits and WatchSubordinateUnits now after I land this?
<dimitern> fwereade: no one else is using them
<dimitern> jtv: if you don't mind I'll do the loggo and if/else if/else changes in follow-ups
<dimitern> fwereade: ping
<fwereade> dimitern, pong, quickly, meeting in 10 :)
<dimitern> fwereade: ^^
<fwereade> dimitern, ah sorry missed that
<fwereade> dimitern, hmm
<fwereade> dimitern, let's not just yet
<dimitern> fwereade: I mean in a follow-up
<dimitern> fwereade: or you think it's useful to keep them for now?
<fwereade> dimitern, yeah, I'm just not sure I want to drop them yet
<dimitern> fwereade: ok then
<dimitern> rogpeppe: so do we have API connection for all agents now?
<rogpeppe> dimitern: no, the unit agent doesn't have an API connection yet
<rogpeppe> dimitern: it needs the uniter facade first
<dimitern> rogpeppe: but openState now opens an API connection in agent.go?
<jtv> dimitern: fine with me!
<rogpeppe> dimitern: yes
 * jtv stalks off to get a stiff drink
<dimitern> rogpeppe: my question is: once I implement the deployer facade, can I start replacing state calls for api calls and have a connection?
<rogpeppe> dimitern: yes
<dimitern> rogpeppe: sweet!
<rogpeppe> dimitern: we also need the agent-alive mechanism in place
<dimitern> rogpeppe: yeah, but not for the deployer at least
<rogpeppe> dimitern: the deployer isn't its own agent, is it?
<rogpeppe> dimitern: ah, it's just a worker alongside the machiner
<rogpeppe> dimitern: cool
<dimitern> rogpeppe: no, but it has its own facade
<rogpeppe> dimitern: that sounds good
<rogpeppe> dimitern: when you do the deployer facade, please follow the example of apiserver/client
<dimitern> rogpeppe: that at least hasn't changed while i was gone, right? a facade per worker/agent
<dimitern> rogpeppe: you mean with root replacement after login?
<rogpeppe> dimitern: i.e. a top level API type with a single method: Deployer(id string) (*Deployer, error)
<rogpeppe> dimitern: currently there's still only a single root type after login
<dimitern> rogpeppe: ok, will follow it
<rogpeppe> dimitern: when all the facades are factored out into their own packages, we'll do a switch on the user name after login and choose a facade to present
<rogpeppe> dimitern: each kind of client will have its own package (e.g. machine, unit, client) which will have an API type that presents the whole API for that client, integrating in all the facades as necessary
<dimitern> rogpeppe: i'd like to see that
<rogpeppe> dimitern: if that seems ok to you
<rogpeppe> dimitern: i'm knocking up a spike proof of concept
<dimitern> rogpeppe: cool
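A toy version of the facade layout rogpeppe describes: a single root type presented after login, with one method per facade handing back a scoped object. All names and methods here are illustrative, not the actual apiserver code:

```go
package main

import "fmt"

// Deployer is the facade a machine agent's deployer worker would use.
type Deployer struct {
	id string
}

func (d *Deployer) DeployUnits() string {
	return fmt.Sprintf("deploying units for machine %s", d.id)
}

// Root sketches the top-level API type: rather than exposing everything,
// it has a single constructor per facade, e.g. Deployer(id) (*Deployer, error).
type Root struct{}

func (r *Root) Deployer(id string) (*Deployer, error) {
	if id == "" {
		return nil, fmt.Errorf("machine id required")
	}
	return &Deployer{id: id}, nil
}

func main() {
	root := &Root{}
	d, err := root.Deployer("0")
	if err != nil {
		panic(err)
	}
	fmt.Println(d.DeployUnits())
}
```

In the plan described above, the server would switch on the login name to choose which root (and hence which set of facades) a given kind of client is allowed to see.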
 * dimitern waiting impatiently for tarmac (hopefully not to kick me in the teeth :)
<rogpeppe> jtv, fwereade, dimitern: https://codereview.appspot.com/10678043/ (only slightly less trivial than i thought)
<dimitern> rogpeppe: on it
 * rogpeppe wants a "wait for tarmac" command that waits for a branch and prints tarmac's final judgement on it
<dimitern> juju wait-tarmac --retry-intermittent-failures ? :)
 * TheMue -> lunchtime
<dimitern> \o/ Merged!
<dimitern> rogpeppe: you've got a review
<rogpeppe> dimitern: ta!
<rogpeppe> TheMue: i'd appreciate a review from you too, as it was your code originally, i believe:  https://codereview.appspot.com/10678043/
<dimitern> so are we all doing the standup/meeting now instead of kanban?
<rogpeppe> hmm, trunk is broken live: 2013-06-27 11:28:30 ERROR juju runner.go:198 worker: fatal "lxc-provisioner": error executing "lxc-ls": exec: "lxc-ls": executable file not found in $PATH
<jam> mgz_, dimitern, danilos, wallyworld_: i'm in the manager meeting, but we'll likely wrapup quickly
<wallyworld_> rogpeppe: thumper's fault!
<rogpeppe> wallyworld_: yeah, but actually it's doubly broken. i have also broken it :-(
<rogpeppe> wallyworld_: am working on a fix for the other issue now
<wallyworld_> the more the merrier :-)
<rogpeppe> wallyworld_: well, i'll incorporate a fix for apt-get lxc too
<wallyworld_> \o/
<thumper-afk> rogpeppe: eh?
<jam> dimitern: https://code.launchpad.net/~jameinel/goose/transfer-content-length-1124561/+merge/170976
<thumper> rogpeppe: no, trunk is good
<rogpeppe> thumper: ah!
<thumper> it landed
<rogpeppe> thumper: i've just realised why it happened
<rogpeppe> thumper: i upgraded
<rogpeppe> thumper: and upgrading just gets the binaries
<rogpeppe> thumper: it doesn't install new dependencies
<thumper> rogpeppe: oh, and the machines don't have lxc
<rogpeppe> thumper: yeah
<thumper> that seems like it could be a problem
<thumper> do we have a plan for that?
<rogpeppe> thumper: not yet :)
<thumper> ok, and with that, I'm off
<rogpeppe> thumper: i think it could be quite simple though; we'll see
 * thumper nods
<thumper> night everyone
 * rogpeppe goes for some lunch and packing
<rogpeppe> i might be a couple of hours, but i will be back
<mattyw> fwereade, do you know if anyone is doing https://bugs.launchpad.net/juju-core/+bug/1191066? I thought I might give it a go to help me get setup on contributing to core
<_mup_> Bug #1191066: ssh command line help incorrect <bitesize> <cmdline> <juju-core:Triaged> <https://launchpad.net/bugs/1191066>
<TheMue> rogpeppe: back from lunch, let me just propose my code (test is done) and then you'll get the review
<fwereade> mattyw, not that I'm aware of
<dimitern> jtv, rogpeppe, fwereade: almost trivial https://codereview.appspot.com/10681044
<fwereade> mattyw, fwiw I am doing another pass through the bugs and have started to tag "easy" as well as "bitesize"
<mattyw> fwereade, ok cool, I'll keep an eye out for that
<fwereade> mattyw, where "bitesize" implies few lines of code to fix, and "easy" also implies that you shouldn't need to know too much arcane junk
<fwereade> to identify which lines
<wallyworld_> fwereade: with bug 1130051 you pointed out in your review, it is implicitly fixed by the new implementation :-)
<_mup_> Bug #1130051: juju ssh doesn't wait properly for instance id <bitesize> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1130051>
<fwereade> wallyworld_, hmm, doesn't it short-circuit to checking the doc id? maybe I misremembered
<fwereade> wallyworld_, I probably did
<wallyworld_> fwereade: only for old machines, where the instance id should be set already
<jam> mramm: there is no hangout for API meeting in 10 minutes. Are we just doing it on IRC?
<wallyworld_> fwereade:  new ones will have the instance id in the new doc
<mramm> jam: that is fine with me
<fwereade> wallyworld_, ok, and you always check for a doc first? cool
<wallyworld_> fwereade: yes
<mramm> or I can add one in quickly
<mramm> whatever is most efficient ;)
<mramm> added one to the invite in case we want it
<mramm> I *think* we can go more quickly this week, and probably don't need an hour.
<jam> mramm: I'm on the hangout
<jam> fwereade: care to join us for a fast API overview? https://plus.google.com/hangouts/_/91b92a337684b97321e7521463a14ec318401052
<mattyw> just made my first attempt at fixing a small bug: https://codereview.appspot.com/10683043/
<jcastro> hey guys, I'm trying to file bugs on the documentation, but they seem to show up on juju-core, not juju-core/docs
<jcastro> I submitted via https://bugs.launchpad.net/juju-core/docs
<jam> jcastro: there is no 'juju-core/docs' package, just a branch
<jam> jcastro: if you want them to show up in that series... just a sec
<jam> jcastro: can you link a bug?
<jam> jcastro: I just did something here: https://bugs.launchpad.net/juju-core/+bug/1195293
<_mup_> Bug #1195293: Docs need workflow contribution examples <juju-core:New> <juju-core docs:New> <https://launchpad.net/bugs/1195293>
<jam> you can use the "Target to series" link underneath the bug tasks
<jam> jcastro: I now see stuff in https://bugs.launchpad.net/juju-core/docs so hopefully its doing what you want
<jcastro> ok got it!
<jcastro> jam: so there's no way for me to report doc bugs without spamming you guys?
<jam> jcastro: 'juju-core/docs' is configured as a juju-core series, not as a separate project. So you have to target bugs to the series
<jcastro> yeah so nick just landed new docs, so there might be a buncha little bugs coming in
<jam> jcastro: when submitting a new bug, it is possible to target a milestone immediately, but for a series you have to submit the bug, then come back and set a target
 * jcastro nods
<jam> jcastro: it is a bit of a misuse of launchpad to have a branch called lp:juju-core/docs that isn't actually a branch of juju-core source code
<jam> Arguably it should be called lp:juju-core-docs
<jcastro> we were inside of core before, and then we moved out, and now we're moving back in
<robbiew> lol
<jam> jcastro: we can live with what you have, but it doesn't fit the LP model.
<jcastro> not my decision, and I don't even know why or how
<jam> jcastro: as in there will be a juju-core/docs subdirectory with the website ?
<jam> or just we are sharing the project name ?
<jcastro> the html docs are in there
<jcastro> and they went live this morning, and I just wanted to fix a <pre> tag. :)
<jam> jcastro: I mean, when I do "bzr branch lp:juju-core" today, I don't get docs
<jam> is the intent that the html docs will share the same source tree as the code
<jam> ?
<jcastro> I believe so
<jcastro> let me get evilnick for you.
<jam> jcastro: if so, then it makes sense to have them in one project, and eventually lp:juju-core/docs will be merged into lp:juju-core
<jam> If they are intended to be developed "independently" then it would make more sense to have them as separate LP projects.
 * dimitern lunch
<jcastro> I think the intent was together so like we can maintain docs for stable vs. trunk, etc.
<evilnickveitch> jcastro, hey
<jam> jcastro: well, you could do that anyway, though you'd have to take care to branch when we branch.
<jam> we could share the https://launchpad.net/juju-project
<jam> as for ownership, bug visibility, etc.
<jam> we already have several projects there related to dependent code that we are also maintaining
<jcastro> evilnickveitch: hey so jam thinks he can help organize this better
<jcastro> but all I wanted to do is drive by fix some docs with no other responsibilities. :)
<evilnickveitch> jcastro :)
<evilnickveitch> jam, so what's the plan?
 * jcastro slowly tiptoes away from the conversation
<jam> evilnickveitch: I'm just trying to get a feel for how we are wanting to use Launchpad to manage the thing currently called "juju-core/docs"
<jam> LP has the idea that everything underneath a given project shares the same source code and bugs
<jam> So having a "juju-core/docs" as a series (rather than a related-but-independent project) confuses things like bug triage
<evilnickveitch> jam, uhuh. I was following the model from the previous juju + juju/docs model. but yes, I have noticed that
<evilnickveitch> jam should we just have it as a standalone project then?
<jam> evilnickveitch: that is how I would do it
<jam> called juju-core-docs
<jam> or juju-docs
<jam> It can be part of the 'juju-project' if we want to be clear about the affiliation
<jam> and make it easy to share bug maintainers, etc.
<fwereade> evilnickveitch, fwiw, I just this morning tagged a couple of juju-core bugs as "doc" because I *think* they stem from inadequate communication
<jam> But otherwise, when people make a change to docs, it is going to show up as a branch against juju-core itself, etc.
<evilnickveitch> fwereade, thanks!
<fwereade> evilnickveitch, there will surely be more :)
<evilnickveitch> jam, yes, it is a bit confusing - I am not sure how it used to work with the old juju project.
<evilnickveitch> jam, but it will be painful for me to trawl through bugs and target them to docs series all the time
<jam> evilnickveitch: right, and if we really want a 'juju-core' bug to also be a 'juju-core-docs' bug, we can still target both projects
<evilnickveitch> jam, sure. I think that will happen!
<jam> evilnickveitch: I'm pretty sure having a separate project will just fit LP's model better, and still let us do everything we want.
<evilnickveitch> jam, cool. I will sort it out
<benji__> teknico: I'll take it
<benji__> pfft; wrong channel
<evilnickveitch> jam, although, another option might be to host the docs with the juju-core code.
<jam> evilnickveitch: I could live with either, but I think either would be better than what we have today.
<evilnickveitch> jam, okay, will discuss with arosales et al.
<jam> evilnickveitch: I probably slightly lean towards separate projects, mostly because I don't think people will want the raw HTML who are developing the source code (because the target audience is different), but I'm not strongly that way.
<evilnickveitch> jam, to be honest, in some ways it would be easier for me if it was a separate project too. But there are advantages to it being contained alongside the main code.
<evilnickveitch> it makes versioning a lot simpler for one thing, and maybe it isn't a bad idea that coders also realise that docs exist :)
<fwereade> jam, hey, about the upgrader -- for maximum simplicity, I think we should be able to just EntityWatcher the env config, and then have an api method taking series/arch and returning a *Tools
<fwereade> jam, we can still pass series/arch in for smarter watching in future
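A sketch of the suggested shape: the agent watches environment config for changes and, on each change, calls one API method keyed on its own series/arch to get the matching tools. Everything here (names, the Tools fields, the URL) is hypothetical:

```go
package main

import "fmt"

// Tools stands in for juju's tools metadata record (version, URL, ...).
type Tools struct {
	URL string
}

// proposedTools sketches the API method fwereade suggests: take the caller's
// series and arch, return a *Tools describing what it should upgrade to.
func proposedTools(series, arch string) (*Tools, error) {
	if series == "" || arch == "" {
		return nil, fmt.Errorf("series and arch required")
	}
	return &Tools{
		URL: fmt.Sprintf("https://example.com/tools/juju-%s-%s.tgz", series, arch),
	}, nil
}

func main() {
	t, err := proposedTools("precise", "amd64")
	if err != nil {
		panic(err)
	}
	fmt.Println(t.URL)
}
```

Passing series/arch explicitly, as noted above, leaves room for the server to watch more selectively later without changing the client-side call.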
<wallyworld_> fwereade: could you take a peek at my comments on https://codereview.appspot.com/10534043/ and if my thinking is aligned with yours, i'll polish off the branch tomorrow. in particular the bit on env vs deployment constraints and general usage
<fwereade> wallyworld_, will do
<ackk> hi, is there a way to get the environment UUID through the juju-core API?
<fwereade> ackk, ...apparently not? this is slightly surprising, but maybe the GUI hasn't wanted to use it yet
<fwereade> ackk, that's been the primary driver for the public API
<ackk> fwereade, right. It'd be nice to have it in the EnvironmentInfo, though
<fwereade> ackk, yeah, agreed, that should be a nice easy fix and it should kinda obviously be there
<ackk> fwereade, I just opened a bug for it, FYI: #1195344
<dimitern> ackk: if you type "bug 1195344" _mup_ will print a link to it
<_mup_> Bug #1195344: Add the environment UUID to EnvironmentInfo response <juju-core:New> <https://launchpad.net/bugs/1195344>
<ackk> dimitern, thanks
<fwereade> ackk, tyvm
<TheMue> dimitern: thx for review
<dimitern> TheMue: yw
<TheMue> dimitern: will add a comment, the assert is just to ensure that at least two runs have been done so that times can be compared
<dimitern> TheMue: my comment was not specifically for the assert, but for the for loop mostly
<dimitern> TheMue: thanks
<TheMue> dimitern: ah, ic
<TheMue> dimitern: ok, will find some words. it's based on a discussion we had during kanban ;)
<dimitern> TheMue: yeah, but you can see how it can be confusing - if it's hard to describe in a couple of sentences it might not be a good idea :)
<mramm> can somebody join the cross-team juju call?
<mramm> fwereade: ^^^
<mattyw> dimitern, linking an MP to the bug, do I still just do that as I normally would do in launchpad or is there something I need to do in rietveld?
<dimitern> mattyw: you can pass -bug="" in lbox or just link it in LP
<mattyw> dimitern, ok thanks
<dimitern> TheMue: can you take a look at this mostly trivial CL? https://codereview.appspot.com/10681044/
<TheMue> dimitern: sure
<fwereade_> mramm, hell, sorry, I was having a very late lunch
<dimitern> TheMue: thanks
<TheMue> dimitern: you've got a +1
<dimitern> TheMue: cheers
<rogpeppe> fwereade_: ping
<rogpeppe> fwereade_: ping
<wallyworld_> fwereade_: you around?
<thumper> morning
<fwereade_> wallyworld_, heyhey, I'm around if it's quick :)
<fwereade_> wallyworld_, I'm still not sure there's a firm distinction between provisioning and deployment constraints
<wallyworld_> fwereade_: hi, was just pinging you to hopefully +1 that instance metadata branch
<fwereade_> wallyworld_, but it did just cross my mind that we probably don't want to use "" to mean "no containerization", because the current meaning of "" elsewhere is "I don't care"
<fwereade_> wallyworld_, "none" might work better
<wallyworld_> fwereade_: ok, so i can introduce a new value
<fwereade_> wallyworld_, I'll take a look sorry about that
<wallyworld_> none sounds good
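The "" vs "none" point can be made concrete: elsewhere in constraints an empty string already means "unspecified", so an explicit "none" value is needed to say "no container". A sketch with illustrative names, not the actual constraints code:

```go
package main

import "fmt"

// ContainerType sketches the constraint value under discussion.
type ContainerType string

const (
	ContainerUnspecified ContainerType = ""     // caller doesn't care
	ContainerNone        ContainerType = "none" // explicitly no container
	ContainerLXC         ContainerType = "lxc"
)

func parseContainer(v string) (ContainerType, error) {
	switch ContainerType(v) {
	case ContainerUnspecified, ContainerNone, ContainerLXC:
		return ContainerType(v), nil
	}
	return "", fmt.Errorf("invalid container type %q", v)
}

func main() {
	for _, v := range []string{"", "none", "lxc"} {
		c, _ := parseContainer(v)
		fmt.Printf("%q -> %q\n", v, c)
	}
}
```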
<wallyworld_> if you +1 it, i'll then nag thumper :-)
<wallyworld_> thumper: could you look at this one (display hardware info in status)? https://codereview.appspot.com/10667043/
<thumper> ok
<wallyworld_> thumper: thanks. as a matter of consistency, i prefer not to name variables the same as packages irrespective of if the package is accessed in the scope of the variable. just easier to read the code etc
<fwereade_> wallyworld_, reviewed
 * fwereade_ bed
<wallyworld_> fwereade_: thanks
#juju-dev 2013-06-28
<davecheney> hello
<davecheney> starting the release process now
<davecheney> can anyone tell me what the new release number should be ?
<thumper> hi davecheney
<thumper> I thought we were going to release as 1.11.? and get it hammered by people
<thumper> and if that was good, release the same thing as 1.12
<thumper> so until we make the decision to go with 1.12
<thumper> I'd increment whatever the third number is
<thumper> 1.11.2?
<davecheney> thumper: yes, that is what I understood
<davecheney> we'll release 1.11.1, bump to 1.11.2
<thumper> ack
<thumper> +1
<davecheney> thumper: will 1.11.1 become 1.12.0 ?
<davecheney> ie, should I start a 1.12. branch ?
<thumper> as long as we tag the 1.11.1 release revision, we don't need to start a 1.12 branch until we are ready to release it
<thumper> and yes, if the release is good, 1.11.1 will be rebuilt as 1.12.0
<thumper> so branch off the 1.11.1 release tag, change the version number, rebuild
<thumper> push to lp:juju-core/1.12
<thumper> actually
<thumper> probably not
<thumper> but tag as 1.12
<thumper> merge into trunk
<thumper> and bump to 1.13.0
<thumper> we are then working on 1.13 in trunk
<thumper> that is how I'd do int
<thumper> s/int/it/
<thumper> we then have the 1.12 release tag in trunk, and working on 1.13
<thumper> davecheney: sound sane?
 * davecheney observes that tagging 1.11.1 is sort of pointless as we don't control the deps
<davecheney> but decides not to swallow that footgun
<davecheney> thumper: here is what I am thinking
<davecheney> tag release
<davecheney> then a script checks out the tag and all the other deps at thatpoint in time
<davecheney> and produces a tagball
<davecheney> that is what we push to lp
<thumper> davecheney: so effectively tarring up all the deps?
<thumper> so go install with that tar ball is repeatable?
<thumper> as long as we ignore the core libraries?
<davecheney> thumper: effectively making a fake gopath, bzr branch the tag, go get -v ... << will fetch all the deps at that point in time, then tar that up
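The "tagball" flow davecheney describes, as a sketch: build a throwaway GOPATH, fetch juju-core at the release tag plus all of its dependencies, and tar the result so the release is reproducible. The bzr and go steps are commented out here (they need network access and those tools on PATH) and replaced with a stand-in tree so the packaging step is demonstrable:

```shell
RELEASE_TAG=juju-1.11.1
GOPATH=$(mktemp -d)
mkdir -p "$GOPATH/src/launchpad.net"
# The real script would run (requires bzr and the go tool):
#   bzr branch -r tag:"$RELEASE_TAG" lp:juju-core "$GOPATH/src/launchpad.net/juju-core"
#   GOPATH=$GOPATH go get -v launchpad.net/juju-core/...
# Stand-in source tree so the tarball step below runs offline:
mkdir -p "$GOPATH/src/launchpad.net/juju-core"
echo "stand-in for the real source tree" > "$GOPATH/src/launchpad.net/juju-core/README"
# Tar up the whole src tree -- code plus dependencies as of the tag.
tar -C "$GOPATH" -czf "juju-core-$RELEASE_TAG.tar.gz" src
tar -tzf "juju-core-$RELEASE_TAG.tar.gz"
rm -rf "$GOPATH"
```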
 * thumper nods
 * davecheney wonders if lbox can do tags ...
<davecheney> aaaah, my local apt mirror is offline
<davecheney> and the answer to the previous question
<davecheney> lucky(~/src/launchpad.net/juju-core) % lbox propose
<davecheney> error: Failed to run "bzr push": exit status 3
<davecheney> -----
<davecheney> bzr: ERROR: These branches have diverged.  See "bzr help diverged-branches" for more information.
<davecheney> thumper: https://code.launchpad.net/~dave-cheney/juju-core/122-tag-release-1.11.1/+merge/171945
 * thumper is back
<thumper> sorry, was walking the dog
<davecheney> np
<davecheney> looks like lbox totally choked on a tag
<thumper> davecheney: how did you create the branch?
<davecheney> my first attempt ?
<davecheney> it was my trunk branch
<davecheney> it may not have been clean
<davecheney> try this fresh mp
<davecheney> https://code.launchpad.net/~dave-cheney/juju-core/122-tag-release-1.11.1/+merge/171945
<thumper> it says the diff is empty
<thumper> is it just a tag?
<davecheney> yes
<thumper> lgtm
<davecheney> thanks
<davecheney> thumper: https://codereview.appspot.com/10725044 << bump dev version to 1.11.2
<davecheney> thumper: question: if i'm writing a release script to build the full $GOPATH for a release
<davecheney> why don't I just write it in a way to checkout all the deps at a specific revision
<thumper> davecheney: that would be great if you did that
<davecheney> just because the go tool doesn't support that, doesn't mean we can't manage this ourselves
 * thumper nods
<davecheney> thumper: understood
<thumper> we talked about having a requirements.txt in root of trunk
<thumper> which listed revnos/hashes etc for dependencies
<davecheney> yeah, that never went anywhere
<davecheney> thumper: will the bot process approvals/merges in order ?
<davecheney> or should I wait til the tag branch lands ?
<thumper> um...
<thumper> I'm not actually sure on how it handles ordering
<thumper> to be safer I'd wait
<davecheney> understood
 * davecheney crickets
<davecheney> is the bot broken ? or just slow ?
<davecheney> wallyworld_: thumper can someone check the bot
<davecheney> it's been an hour
<thumper> davecheney: did you set a commit message?
<thumper> that is the normal blocker
 * davecheney goes to set a commit message
 * thumper fixes some dumbness
<davecheney> still waiting ...
<thumper> arse biscuits
<thumper> davecheney: I wonder if tarmac thinks that there is nothing to do.
<thumper> davecheney: you may want to do a pointless commit
<thumper> davecheney: 'bzr commit -m "Tag 1.11.1 release" --unchanged
<thumper> davecheney: then bzr push
<davecheney> third time's a charm
<thumper> then go and reapprove to get the last revision
<thumper> davecheney: may want to move the tag to that revision too
<thumper> to avoid confusion :-)
<thumper> anyone else seen this? PANIC: server_test.go:550: StoreSuite.TestBlitzKey
<davecheney> i've seen a lot of panics
<davecheney> that one rings a bell
<thumper> davecheney: good call using the gofmt
<davecheney> semantic noop
<bigjools> apparently Windows 8.1 has something called "charms" ...
<davecheney> ohh err
<davecheney> lets call ours Lucky Charms
<davecheney> oh, wait ...
<davecheney> thumper: https://code.launchpad.net/~dave-cheney/juju-core/123-set-development-version-to-1.11.2/+merge/171946
<davecheney> ^ can I get an approver ?
<thumper> bigjools: what do windows charms do?
<bigjools> NFI
<thumper> davecheney: what do you need?
<davecheney> dunno, bot is still being a turd
<thumper> davecheney: you only approved it 6 minutes ago
<thumper> sometimes the bot takes 10-15
<bigjools> thumper: it's like PQM all over again
<thumper> bigjools: slow tests is all
<thumper> but yes.
<davecheney> PQM ?
<bigjools> landing bot for Launchpad
<bigjools> LP's tests were 40 minutes when I started on LP, by the time I left they were ~5 hours IIRC
<davecheney> bigjools: Atlassian's tests took longer than a work day to run
<davecheney> committing fixes took an exponential time to completion
<bigjools> the ultimate "compiling?"
<wallyworld_> davecheney: sorry, was eating, missed the ping. is everything resolved?
<davecheney> yup
<davecheney> wallyworld_: should be
<davecheney> thanks mate
<davecheney> speaking of eating
<davecheney> it is 13:00 local, time for carbohydrates
<bigjools> liquid carbs?
<davecheney> bigjools: oh my, is it friday already
<bigjools> davecheney: seems so.  I thought Friday was yesterday and was rather disappointed when I found out it wasn't
 * bigjools launches unity-webapps-plugin into the neighbour's yard
<wallyworld_> thumper: maybe you could +1 this with the proviso i address the remaining couple of issues (which i am working on) and then william can +1 it tonight and i can land https://codereview.appspot.com/10447045/
<thumper> wallyworld_: ok, I'll look in a few minutes
<wallyworld_> no hurry. thanks
<davecheney> wallyworld_: did you land the goose fix you mentioned on the call yesterday ?
<wallyworld_> davecheney: sure did. i added the bug to the release notes at the same time
<davecheney> super
<davecheney> wallyworld_: new command, create-machine ?
<davecheney> wallyworld_: create machine doesn't appear to be part of the juju cli
<davecheney> just wondering if I should call it out in the release notes
<davecheney> thumper: would you be a dear and delete this https://code.launchpad.net/~thumper/+recipe/juju-core-daily
<davecheney> ^ it's no longer needed
<thumper> ack
<thumper> davecheney: done
 * thumper reviews wallyworld_'s branch again
<davecheney> SHIT
<davecheney> the PPA recipe has been using the old cgo based goyaml
<wallyworld_> davecheney: add-machine
<wallyworld_> it was create machine but someone (forget who) asked it to be changed
<davecheney> wallyworld_: no probs
<davecheney> so
<davecheney> add-machine can be used to prepopulate an environment with blank machines ?
<wallyworld_> yes, or containers
<wallyworld_> add-machine /lxc
<wallyworld_> creates a new instance with a lxc container on it
<wallyworld_> add-machine 1:/lxc
<davecheney> thumper: that tag didn't
<wallyworld_> adds a lxc container to machine 1
<davecheney> didn't tag
<thumper> :-(
<thumper> I don't know why
<davecheney> bzr tags is empty
<thumper> davecheney: I would ask jam, but he is working Sun->Thu now
<davecheney> or not empty
<thumper> davecheney: you could always poke on #bzr for some help
<davecheney> oh well, we'll tag it again another time
<davecheney> thumper: what is the bzr word for tip ?
<thumper> tip
<davecheney> bzr branch -r tip doesn't do what I think
<thumper> ah...
<thumper> heh
<thumper> um.. by default it does tip
<thumper> bzr branch foo bar # takes tip of foo
<thumper> -r -1 refers to the last one
<davecheney> thanks that'll do
<davecheney> FUCK
<davecheney> https://code.launchpad.net/~dave-cheney/+recipe/juju-core
<davecheney> ^ spot the problem with this build recipe
<davecheney> hint: tarmac
<thumper> hah, not the right branch
 * thumper EOWs
<thumper> see ya people
<thumper> wallyworld_: you have your review
<wallyworld_> thumper: \o/
<thumper> and I've put a few skeleton ones up for local provider
<wallyworld_> have a good weekend
<wallyworld_> ok
<davecheney> later
<davecheney> gahhh, lp access is so slow
<davecheney> i'm having to run my scripts in the states
<wallyworld_> sadly it has always been a bit slow from aus
<davecheney> checkout speeds are all over the shop today
<davecheney> gah
<davecheney> going to go home and try a different internet connection
<davecheney> LP is so slow i cannot download the debs from PPA
<davecheney> https://codereview.appspot.com/10676046/
<TheMue> fwereade_: ping
<mattyw> morning folks, when I try to lbox submit this https://codereview.appspot.com/10683043/ I get a readonly transport error from bazaar, anyone seen something like this?
<fwereade_> TheMue, pong
<fwereade_> mattyw, we're using tarmac now
<fwereade_> mattyw, set the commit message and approve it in LP
<mattyw> fwereade_, got it, thanks!
<mattyw> fwereade_, all I can do is set it needs review or merged in lp
<TheMue> fwereade_: one moment, phone ;)
<fwereade_> mattyw, ah, sorry, I'll approve it then
<mattyw> fwereade_, thanks :)
<fwereade_> mattyw, done, it should land soonish
<mattyw> fwereade_, thanks very much
<TheMue> fwereade_: aargh, half an hour administrative call *sigh*
<TheMue> fwereade_: just pinged you because of a method in charm/config.go
<fwereade_> TheMue, no worries :) which method?
<TheMue> fwereade_: there is a FilterSettings(), which is only used in state/service.go changeCharmOps()
<fwereade_> TheMue, yep
<fwereade_> TheMue, what's the issue with it?
<TheMue> fwereade_: I changed ReadConfig() in config.go so that it now allows a default with an empty string
<TheMue> fwereade_: tests are also fine (you can see it in the CL)
<TheMue> fwereade_: but in the test of the FilterSettings() an input of an empty string filters that setting to nil
<fwereade_> TheMue, as it should, you're just fixing the default bug today
<TheMue> fwereade_: and I'm not sure if that's correct (dimiter pointed me to that behavior)
<fwereade_> TheMue, if defaults are making it into FilterSettings I think we're Doing It Wrong somewhere
<TheMue> fwereade_: ah, fine, then I interpreted it correctly
<fwereade_> TheMue, cool
<TheMue> fwereade_: so then the CL waits for your review *smile*
<fwereade_> TheMue, when we figure out the nice way to promote "" to equality everywhere that stuff will have to change sometime
<fwereade_> TheMue, cool; link please?
<TheMue> fwereade_: https://codereview.appspot.com/10682043/
 * TheMue just downloads the whole stuff to OS X to build a client there
* ChanServ changed the topic of #juju-dev to: https://juju.ubuntu.com | On-call reviewer: - | Bugs: 7 Critical, 84 High - https://bugs.launchpad.net/juju-core/
<TheMue> Now let's start the dependency hell ...
<fwereade_> TheMue, LGTM, just drop the redundant/confusing test (and rearrange the logic if you think it looks like a good idea)
<TheMue> fwereade_: thx. one test uses a pre-created .juju dir, the other doesn't, that's the difference
<fwereade_> TheMue, huh, sorry?
<fwereade_> TheMue, why do we hit .juju in ReadConfig?
<fwereade_> TheMue, or do I have the wrong context?
<TheMue> fwereade_: aargh, no, I've mixed up two CLs
<TheMue> fwereade_: ;)
<TheMue> fwereade_: I should read your review before I say anything
<TheMue> fwereade_: so, read it, thx for your review
<fwereade_> TheMue, cool (btw which was the other one? should I take a look at that? rings a bell but it's a bit mixed up with all the other CLs in my mind)
<TheMue> fwereade_: this one https://codereview.appspot.com/10507043/
<TheMue> Huh, all deps seem to build fine on OS X.
<fwereade_> TheMue, nice
<fwereade_> TheMue, about that CL you just linked
<fwereade_> TheMue, ISTM that the two tests you wrote could just as easily be one
<fwereade_> TheMue, (that checks both dir and file permissions)
<fwereade_> TheMue, but that you should have a test with a pre-existing file, and a surprising dir permission, and check that the dir permission stays (but is logged) but the file is rewritten with 0600
<TheMue> fwereade_: in the case where WriteEnvirons() creates the directory, it will have 0700. otherwise it keeps whatever the user set before
<fwereade_> TheMue, quite so
<fwereade_> TheMue, but that's not tested that I can see
<TheMue> fwereade_: a pre-existing file will be overwritten
<fwereade_> TheMue, yes, I think you're telling me what I told you..?
<fwereade_> TheMue, the behaviour is fine but the tests don't exercise that path
<TheMue> fwereade_: eh, no, i'm only wondering, because we never tested that a file is overwritten
<TheMue> fwereade_: the tests only check the permissions, as this is the only changed feature
<fwereade_> TheMue, I think that now we're messing with permissions we need to write tests that actually cover all the relevant cases
<TheMue> fwereade_: what's indeed untested is the logging. have to see how this can be tested.
<fwereade_> TheMue, pre-existing lack of coverage is an opportunity to improve, not a straitjacket ;)
<fwereade_> TheMue, I'm pretty sure LoggingSuite will do the trick
<fwereade_> TheMue, c.GetTestLog()
<TheMue> fwereade_: ok, will change (even after your first lgtm *smile*)
<fwereade_> ;p
<TheMue> *drumroll*
<TheMue> Will build now ...
<TheMue> Da fuck *sorry*
<TheMue> the build is incredibly fast (I gave it all 8 cores)
<rogpeppe> fwereade_, dimitern: ping
<fwereade_> rogpeppe, heyhey, sorry I missed you last night
<dimitern> rogpeppe: heyhey
<rogpeppe> yo!
<TheMue> rogpeppe: hey, the norway guy
<TheMue> rogpeppe: you shall relax and recreate
<dimitern> rogpeppe: is it cold out there?
<rogpeppe> dimitern: not bad - about 18 degrees and reasonably sunny
<dimitern> rogpeppe: nice!
<TheMue> *sigh*
<rogpeppe> TheMue: i intend to - we head south to the coast and a hut tomorrow for a week
<TheMue> better than the wet 13° here
<rogpeppe> TheMue: well "hut" - it's probably quite ok for modern conveniences...
<TheMue> rogpeppe: as long as it has no network ;)
<rogpeppe> dimitern, fwereade_: did https://codereview.appspot.com/10684044 make some sort of sense?
<rogpeppe> TheMue: indeed!
<dimitern> rogpeppe: haven't looked yet; i'm deep in the amazon still, live testing :)
<dimitern> rogpeppe: will take a look a bit later
<rogpeppe> dimitern: please do - i'll be able to ask questions today, but not later
<fwereade_> rogpeppe, only just gave it the most cursory look, but it seems very nice, thanks
<rogpeppe> fwereade_: cool
<rogpeppe> fwereade_: hopefully someone can take it forward from there.
<dimitern> rogpeppe: i looked through it, seems nice
<rogpeppe> dimitern: thanks.
<dimitern> rogpeppe: how do i get a machiner client facade from the machineagent one?
<danilos> how would I best test in jujutest if NewConn is not failing? Calling Status() over API returns non-nil error even if "juju status" returns an error
<rogpeppe> dimitern: if you want, you can trivially implement a method on the client machineagent API that returns the machiner API
<rogpeppe> dimitern: i'm not entirely sure i'd bother though - not sure
<dimitern> rogpeppe: well, because the agent connects to state and has a different facade from its tasks, and the tasks themselves need a different facade (to emulate st *state.State objects)
<rogpeppe> dimitern: the agent could keep the api.State around to fetch facades from
<rogpeppe> dimitern: because in general you don't need to fetch a facade from a facade (the machineagent case is possibly the only one)
<rogpeppe> dimitern: and unitagent, of course
<dimitern> rogpeppe: ok, will have a closer look at that
<rogpeppe> dimitern: the api.State is kinda like the "facade facade" :-)
<dimitern> rogpeppe: i'm thinking wrt the deployer facade i need to implement today
<rogpeppe> dimitern: you may well find it more convenient to add a Machiner method to MachineAgent, client-side
<dimitern> rogpeppe: i see, yeah so far seems reasonable
<rogpeppe> dimitern: func (a *MachineAgent) Machiner() *machiner.Machiner {return machiner.NewState(a.state)}
<rogpeppe> dimitern: or something like that
<dimitern> rogpeppe: yeah
<rogpeppe> dimitern: if you went that way, you'd probably need to add all the facade methods to Machine
<rogpeppe> MachineAgent, sorry
<dimitern> rogpeppe: hmm.. will see about this - we want to enforce encapsulation at the type level, as agreed
<dimitern> rogpeppe: no extra methods callable from outside
<rogpeppe> dimitern: i wonder if it might be better just to make MachineAgent implement Call; then the client can just do machiner.NewState(machineagent) for any facade
<dimitern> rogpeppe: at client-side?
<rogpeppe> dimitern: yes
<rogpeppe> dimitern: ISTM that machineagent (and unitagent) are special in that respect - it's ok for an agent to use any call appropriate for an agent, and it can use that to create facades specifically for individual workers.
<dimitern> rogpeppe: but we'll still have only the needed methods at server-side, so you cannot call something you're not supposed to (and is used by a different worker on the same agent)
<rogpeppe> dimitern: i guess there are two levels of security here - connection security and type security
<dimitern> rogpeppe: yes
<rogpeppe> dimitern: for connection security, theoretically any worker can call methods designed for any other worker
<dimitern> rogpeppe: how so?
<rogpeppe> dimitern: because they're all sharing the same connection
<dimitern> rogpeppe: ah, right, but that's ok i think
<rogpeppe> dimitern: for type security, we can make sure that, by not providing a Call method or the password to a worker, that a worker can't call any methods outside its facade
<dimitern> rogpeppe: yeah
 * TheMue just deploys our standard sample - via OS X
<rogpeppe> dimitern: but the machineagent and unitagent facades are special in that they need to spawn workers themselves. so making them implement common.Caller seems like it might be a nice approach
<rogpeppe> dimitern: which avoids the client-side machineagent and unitagent packages depending on all the other facade packages
<dimitern> rogpeppe: will look into it
<dimitern> rogpeppe: won't that give the agents' facades too much knowledge? i mean from the agent you can somehow screw up a worker or something
<rogpeppe> dimitern: the agents can do anything anyway because they made the connection
<dimitern> rogpeppe: by calling a method on the worker facade the agent is not supposed to
<rogpeppe> dimitern: the agents must be able to create worker facades
<rogpeppe> dimitern: so i don't think we'd be giving them any power they don't already have
<rogpeppe> dimitern: and it's not like there's malicious code living in cmd/jujud :-)
<dimitern> rogpeppe: yeah, but i'm thinking of the future when we'd possibly have external agents using the api just like ours
<rogpeppe> dimitern: i don't see the issue - if we have external agents, they have exactly the same capabilities, surely?
<dimitern> rogpeppe: and we may very well need to think how to prevent malicious agents doing too much
<rogpeppe> dimitern: if we want to do that, we need to do it at the connection level, not the type level
<dimitern> rogpeppe: anyway, this is a bit on the philosophical side for now
<rogpeppe> dimitern: what we're talking about here is type-level security, which is trivially circumventable by anyone that can make a direct websocket connection.
<rogpeppe> dimitern: and even at the type level, the difference is just: machiner := machiner.New(machineagent) vs machiner := machineagent.Machiner()
<dimitern> rogpeppe: not really? do the agents use websockets as well as clients?
<rogpeppe> dimitern: yes, of course. what else would they use?
<dimitern> rogpeppe: wasn't sure - i thought we only use WS for clients
<rogpeppe> dimitern: (we could of course use another more efficient protocol for agents, but for the time being we just use the same thing for everyone - the API is the API)
<dimitern> rogpeppe: yeah, we could
<danilos> dimitern, mgz: hi, do you perhaps want to have an early stand-up call? (I may need to go out at exactly that time, though maybe not); we can potentially have it later if that suits you better as well
<rogpeppe> dimitern: at some point i envisage possibly using encoding/gob and direct TCP connections - that would make things quite a bit more efficient, but we orient everything around json currently.
<dimitern> danilos: i'm up for later :)
<rogpeppe> dimitern: right, i'm off to lie in the sun :-)
<dimitern> rogpeppe: would this lock us onto go-based agents only?
<dimitern> rogpeppe: enjoy :)
<danilos> dimitern, ok, I'll ping when I come back then :)
 * TheMue -> lunch
<rogpeppe> dimitern: we'd probably provide both protocols
<rogpeppe> dimitern: that's easy to do - just listen on two ports
<dimitern> fwereade_: ping
<fwereade_> dimitern, pong
<fwereade_> dimitern, crap, sorry, I saw that ping but got distracted in between seeing it and answering it
<dimitern> fwereade_: my bad :/ I realized my live testing yesterday wasn't done correctly
<fwereade_> btw, would someone please do another review of jtv's https://codereview.appspot.com/10500043/ branch?
<fwereade_> dimitern, bah, bad luck, what was wrong?
<dimitern> fwereade_: i didn't pass --upload-tools to bootstrap (no idea why I thought it wouldn't be necessary and would still use the latest source)
<fwereade_> dimitern, haha
<fwereade_> dimitern, for compatibility checking that's good though
<fwereade_> dimitern, bootstrap plain
<dimitern> fwereade_: i found out today when testing the deployer before and after the change - even after the change the upstart files were suspiciously the same :)
<fwereade_> dimitern, deploy some units and subordinates
<fwereade_> dimitern, juju upgrade-juju --upload-tools
<fwereade_> dimitern, cross fingers
<dimitern> fwereade_: now i'm testing on the proper version and the good news is the new deployer seems to work ok :)
<fwereade_> dimitern, awesome
<fwereade_> dimitern, is it live right now?
 * TheMue happily just bootstrapped, deployed, exposed and connected from OS X
<dimitern> fwereade_: so the scenario is: revert to r1356 (or whatever before the change to deployer); build the scenario (wp+mysql+nrpe on machine 0); upgrade-juju - then what should I look for?
<dimitern> fwereade_: it is
<fwereade_> dimitern, if so you should be able to check compatibility by stopping the various upstart jobs, renaming a couple to the old format, and starting the jobs up again
<dimitern> fwereade_: quick g+ perhaps?
<dimitern> i never tried running upgrade-juju before
<fwereade_> TheMue, nice!
<jtv> Thanks fwereade_
<hazmat> have we verified of all our dependency licenses?
<hazmat> i just had a round with the gocurl author because there was no license specified (now apache2)
<mgz> hazmat: nope
<danilos> dimitern, mgz: I am still out and my laptop battery died: I've got CL up that I'd appreciate a review for :) i am on my phone now, so slow to type
<dimitern> danilos: ok, i'll take a look shortly
<danilos> dimitern, thanks
<mgz> okay :)
<fwereade_> hey, wasn't there something that shortened the timeouts on the ec2 tests? that seems not to be working, I see what look suspiciously like a bunch of 5s waits
<jtv> Could anyone help me out with a review?  There's quite a lot of history to it now, so it may be best to read the discussion in-order before going into the diffs.  https://codereview.appspot.com/10500043/
<jtv> allenap or rvba maybe?
<TheMue> fwereade_: wanna see an interesting behavior: http://play.golang.org/p/XGf09f7F9v
<wallyworld_> fwereade_: 1000th time lucky on that metadata in state branch? sorry for forgetting that client code accesses state.Machine directly. sigh
<fwereade_> wallyworld_, bad luck, I hate writing that sort of code
<fwereade_> TheMue, that's the nil/nil thing isn't it
<fwereade_> TheMue, nastiest example I have yet encountered though
<allenap> jtv: I'll give it a go.
<TheMue> fwereade_: yeah, I wondered why a change didn't work
<TheMue> fwereade_: so I tried it isolated
<allenap> TheMue: That's very surprising behaviour!
<dimitern> TheMue: what's surprising about it?
<allenap> TheMue: I misread, it's not surprising.
<ahasenack> hi guys,
<ahasenack> 	imports launchpad.net/juju-core/environs/local: import "launchpad.net/juju-core/environs/local": cannot find package
<ahasenack> got this while updating this morning
<ahasenack> er
<ahasenack> line got cut off, sorry
<allenap> TheMue: I misread the err != nil as a test for err == nil.
<ahasenack> hm, no, it's that
<dimitern> ahasenack: that was moved into environs/localstorage - if you pull trunk tip should be ok
<ahasenack> I thought I was doing that
<TheMue> allenap: oh, yes, a typo
<ahasenack> go get -v -u <stuff>
<ahasenack> hm, now it has different output
 * ahasenack lets it run
<dimitern> ahasenack: or you can just: cd $GOPATH/src/launchpad.net/juju-core/ && bzr pull
<dimitern> fwereade_: i have a question
<fwereade_> dimitern, oh yes?
<dimitern> fwereade_: as the simple context is written, it'll only return stuff which match the deployer tag
<fwereade_> dimitern, yeah, I think that's the bit that needs fixing
<dimitern> fwereade_: but it must return both new and old ones, right? i.e. unit-wordpress-0:unit-nrpe-0 and 0:unit-wordpress-0 in the new case
<dimitern> fwereade_: or machine-0:unit-wordpress-0 in the old case
<fwereade_> dimitern, yeah, the new version of list has to return anything on the machine deployed either by itself or an older version of juju
<dimitern> fwereade_: ok
<fwereade_> dimitern, just a sec though
<fwereade_> dimitern, are you using the principal names in the subordinate conf names in the new version as well?
<fwereade_> dimitern, or did I misread you ^^
<dimitern> fwereade_: here's what I got from live tests: http://paste.ubuntu.com/5807731/
<fwereade_> dimitern, you know, I'm seriously questioning the value of those `0:`s
<fwereade_> dimitern, would we lose anything if we just forgot about identifying the deployer at all and just used the unit tag directly?
<fwereade_> dimitern, would be muuuch nicer, I think
<dimitern> fwereade_: you mean jujud-unit-mysql-0.conf instead?
<fwereade_> dimitern, yeah
<dimitern> fwereade_: how about if we have multiple deployers running on the same machine? or they'll be in containers?
<fwereade_> dimitern, the idea is one per machine, full stop
<fwereade_> dimitern, local provider won't be a problem because we won't be running uncontainerized units
<dimitern> fwereade_: seems reasonable i think
<fwereade_> dimitern, yeah, I think that if we end up with multiple machine agents ever running in the same instance the units will be the least of our worries
<dimitern> fwereade_: so we can make them jujud-%s.conf where %s is the deployed unit's tag
<fwereade_> dimitern, perfect
 * fwereade_ food
<dimitern> fwereade_: cheers, will do
<fwereade_> dimitern, whoa, is the branch without compatibility merged already?
<fwereade_> dimitern, I think dave's going to cut a release shortly, we should have both changes or neither in there
<fwereade_> dimitern, (I'm up for trying to get it in today if you think there's enough time)
<dimitern> fwereade_: i think the release was short-lived anyway - there was some mail about it
<dimitern> fwereade_: i'm mostly done anyway
<fwereade_> dimitern, but .2 is coming :)
<fwereade_> dimitern, cool
<dimitern> I already *hate* SimpleToolsFixture!!
<fwereade_> dimitern, if it's shit, kill it
<dimitern> fwereade_: it's written with one deployer in mind, I have to somehow convince it to create 2 separate fixtures for each one
<dimitern> fwereade_: it assumes too much
<danilos> dimitern, mgz: heya, if you want to have a (final) call, I am finally back :)
<dimitern> shit..
<dimitern> fwereade_, TheMue: are we doing kanban?
<fwereade_> dimitern, yeah
<dimitern> sorry, will join now
<danilos> dimitern, ok, got the point :)
<dimitern> danilos: sorry, but just forgot we have kanban now
<danilos> dimitern, no worries, ping me when you are done; I'd still appreciate a review for https://codereview.appspot.com/10733044/ while I am figuring out the livetest failure I am seeing
<dimitern> danilos: sure, once i have some time
<danilos> dimitern, thanks
<frankban> hi dimitern, could you please take another look at https://codereview.appspot.com/10675043/ ?
<dimitern> frankban: will do a bit later
<dimitern> frankban: I have to land a patch first before the release
<frankban> dimitern: cool, thank you
<mgz> danilos: answered a query in one of your mps
<mgz> will look at it in more detail later
<fwereade_> danilos, reviewed
<danilos> fwereade_, thanks, I was looking for something like Reset() (was even considering writing Unpoison) but on the storage, not on the provider itself
<fwereade_> danilos, np
<danilos> fwereade_, btw, do you mean I should isolate all the tests in the livetests as well? (not done for performance reasons, since bootstrapping takes such a long time, but I'd be happy to split them out if that's what you are suggesting)
<fwereade_> danilos, sorry, no, just the unit tests
<danilos> fwereade_, ack
<fwereade_> danilos, our depending on the lack of isolation between the live tests makes me feel somewhat grubby but it's practical
<danilos> fwereade_, well, to be honest, my tests could cause trouble as well (I should probably restore bootstrap-verify to "expected contents", or the VerifyStorage live tests might start failing when the order of execution changes)
<danilos> fwereade_, I had that code, but it seemed unwieldy (I was saving the existing contents of the file and then writing it back at the end of the test; perhaps it's good enough to write known-good content at the end of a test, which wouldn't look so ugly and wouldn't detract as much from the code being actually tested)
<fwereade_> danilos, I would be +1 on that
<fwereade_> danilos, pulling it out into a little deferred helper might not be so bad though?
<danilos> fwereade_, yeah, I'll do that
<danilos> fwereade_, updated all as suggested, livetests still pass, 'lbox proposed' again to update the CL :)
<fwereade_> danilos, cheers
<fwereade_> TheMue, dimitern: https://codereview.appspot.com/10751043
<TheMue> *click*
<fwereade_> danilos, reviewed again, just a couple of things
<dimitern> fwereade_: whew.. done finally with the tests, will propose shortly for you to take a look, if you're still around
<fwereade_> dimitern, I've got to go out for a bit now but there's half a chance that if you get another review first I'll be able to land it before dave gets there
<fwereade_> dimitern, please mail me and him re status, I'll follow up this evening
<dimitern> fwereade_: ok, will talk to dave as well
<danilos> fwereade_, hum, I am a bit confused about what you want regarding error status; in one of my previous branches, you mentioned how you prefer functions/methods to return their own errors, instead of propagating returned errors, which is why I did what I did; I am fine with doing what you suggest, though, but explaining my rationale here :)
 * dimitern on to reviews now..
<dimitern> frankban: reviewed
<TheMue> fwereade_: reviewed
<dimitern> fwereade_: reviewed
<dimitern> danilos: on to yours
<danilos> fwereade_, this is getting ugly again (with different errors returned, I am also updating tests to cope, but it means restoring more of the state, and then I need 'restoreBootstrapVerificationFile' in unit tests as well, and...), I'll finish it later I hope (need to go out now)
<dimitern> TheMue, fwereade_: https://codereview.appspot.com/10746044/
<dimitern> danilos: reviewed
<dimitern> TheMue: ping
<TheMue> dimitern: pong
<TheMue> dimitern: start reviewing
<dimitern> TheMue: cheers!
<TheMue> dimitern: you've got a review
<dimitern> TheMue: tyvm
<TheMue> dimitern: so, I'll step out. enjoy next week.
<dimitern> rogpeppe: wow enjoyed the sun?
<rogpeppe> dimitern: oh yes, had a nice swim too
<dimitern> rogpeppe: well rested for a quick easy review ? :)
<rogpeppe> dimitern: just came online to add more songs to my spotify playlist for the journey south
<rogpeppe> dimitern: no reviews, i'm officially On Holiday and it would not be tolerated :-)
<dimitern> rogpeppe: ah :) sure
<dimitern> waiting for dave cheney to appear
<dimitern> i need to land this shit before the release
#juju-dev 2013-06-30
<thumper> morning
<thumper> hi fwereade__
<thumper> fwereade__: seeing you do reviews...
<fwereade__> thumper, heyhey
<thumper> fwereade__: I cc'ed you on a container email
<thumper> I'm approaching a higher level of concern...
<thumper> however
<thumper> I did find this:
<thumper> http://lxc.sourceforge.net/index.php/about/kernel-namespaces/network/configuration/
<thumper> method 1 seems to closely match what we want
<thumper> but I don't entirely understand it
<thumper> I seem to be missing some understanding in it
<thumper> I seem to be lacking the key bit where the veth0 on host A is connected to container 1
<thumper> also there seems to be overlap in the vethN numbering in the host
<thumper> so I want to find someone who knows more
<fwereade__> thumper, yeah, I'm no wiser really -- was it serge who told us about namespaces in the first place?
<thumper> fwereade__: I think so
<thumper> it seems like it shouldn't be too hard
<thumper> just finding the right incantations
<fwereade__> thumper, that's my best guess there then -- yeah, indeed
<thumper> I'm putting on my review hat right now
<thumper> reviewing some of your pending work and ians
<fwereade__> thumper, re getting ip addresses assigned... mm, yeah
<thumper> particularly around the container constraint
<thumper> we need a way to ask for public/private ip addresses
<thumper> not sure how to handle container addressability without at least private
<fwereade__> thumper, yeah, indeed
<fwereade__> thumper, ec2 first sounds eminently sensible to me
<thumper> otherwise it seems the only way is to do port forwarding and fake it, which I loath as an idea
<thumper> I'd rather just say "sorry containers aren't supported on this provider because they suck"
<thumper> s/they suck/we can't get ip addresses dynamically/
<thumper> fwereade__: on the plus side, it appears the default lxc bridge will work fine for the local provider with no mods
<fwereade__> thumper, yeah, I don't think there's likely to be much mileage in faking it up
 * thumper nods
<thumper> I really don't want to put effort into a solution that doesn't take us towards a successful outcome
<fwereade__> thumper, more and more work for less and less gain
<thumper> I agree
<fwereade__> thumper, vg news re lxc though
<thumper> for the local provider?
<fwereade__> yeah
<thumper> yeah
<thumper> I'm going to go back to that after the reviews
<fwereade__> cool
<thumper> I had an interesting thought though...
<fwereade__> thumper, oh yes?
<thumper> we can have the containers auto restart when you reboot
<thumper> I suppose we need the same type of startup file that the machine agents have
<thumper> for the local machine as part of bootstrap?
<thumper> and clean it up with destroy-environment
<thumper> I'm not going to do it initially
<fwereade__> that's another one for serge I think -- some versions of lxc have autostart
<thumper> but will make a card for it
<thumper> fwereade__: reading the precise docs
<thumper> it seems that they do
<thumper> and it is easy
<fwereade__> sweet
<thumper> just symlink the config into /etc/lxc/autostart or something similar
<fwereade__> yeah, I just wasn't sure it was there or easy on precise
<thumper> but might be nice in the future for local provider to stay alive with a reboot
<fwereade__> thumper, tbh I think it's an important feature anyway for whatever containers
<thumper> I did notice though that the golxc impl uses lxc-stop and not lxc-shutdown
<thumper> shutdown is nice, stop is flicking the power switch
<fwereade__> thumper, we can't really expect that our cloud instances will never reboot
 * thumper nods
<thumper> easy to do by default I think
<thumper> I have a card on the kanban board already for it
<fwereade__> thumper, stop/shutdown is interesting... not sure when we'd stop one whose state we cared about
<fwereade__> but regardless
<thumper> it seems that we always go stop/destroy
<thumper> so we don't need to be nice
<thumper> but it does have me wondering what lxc does on system shutdown
<thumper> I'm going to trust the lxc devs here
<fwereade__> this is interesting but I'm flying horribly early tomorrow and hoping to be back at work by lunchtime, so I have to sleep now
<thumper> and guess they do a shutdown by default and kill if takes too long
<thumper> :-)
<thumper> ok
<thumper> ciao
<fwereade__> enjoy your day, regards to wallyworld and davecheney; see you soon :)
<wallyworld> see ya
<thumper> ffs
<thumper>     c.Assert(provider, Equals, &local.Provider)
<thumper> ... obtained *local.environProvider = &local.environProvider{}
<thumper> ... expected *local.environProvider = &local.environProvider{}
<thumper> wallyworld: that is me trying to test the actual provider for you
<thumper> wallyworld: interestingly DeepEquals works fine
<wallyworld> thumper: because they are pointers
<thumper> so
<wallyworld> the actual mem addresses are different
<wallyworld> but contents are the same
<thumper> no they aren't
<thumper> as in, they are the same object
 * wallyworld was just guessing
<wallyworld> in the past, i've seen that
<thumper> what I was referring to was the obtained/expected results above
<thumper> like, oh, you gave me X when I expected Y
<thumper> but here X and Y are the same
<wallyworld> deep equals says contents are the same
<wallyworld> but are you sure mem addresses are the same
<thumper> wallyworld: deepequals also checks for the same type
<thumper> wallyworld: well, one is an interface, the other is a struct
<wallyworld> sure, type is the same
<thumper> so I'm thinking deep equals should be ok...
<wallyworld> yes
<thumper> or perhaps I should create an interface for the struct
 * thumper tests that
<wallyworld> i'd stick with deep equals perhaps
 * wallyworld goes away for 10 minutes to buy tickets to the First Test in November
<thumper> nope
 * thumper nods
<thumper> deepequals it is
<wallyworld> thumper:  i have a failing test in lxcProvisionerSuite. i've tracked it down to expectStopped() being called and checking that the container is still provisioned. but advancing a machine's lifecycle to dead causes it to be removed and the instance metadata is removed also. i'm not sure why the test was written the way it was
<thumper> wallyworld: hmm... what changes have you made locally?
<thumper> what object is expectStopped on?
<wallyworld> this is the branch which introduces the instance metadata doc
<wallyworld> expect stopped is on lxcProvisionerSuite
<wallyworld> looking at a container
<wallyworld> i'm not sure how the test would have passed the first time
<wallyworld> i need to dig into it a bit
<thumper> which file?
<wallyworld> lxc-broker_test.go
<wallyworld> making a machine dead and expecting it to still have a valid instance id doesn't make sense to me?
<thumper> ah...
<thumper> here I was calling instance id from the object as it was just returning a cached local value
<thumper> so it was ok
<thumper> now you have turned this into a function doing work right?
<wallyworld> yes, but i check the local cached value first
<thumper> ok, so just change the expectStopped to take a machineId instead of state.Machine
<thumper> and get the instance id out before stopping
<thumper> make sense?
<wallyworld> but why would we want to have a instance id != "" after making a machine dead?
<thumper> what we are testing is that the right container was stopped
<thumper> the container uses the container name as the instance id
<thumper> we are only testing we stopped the right one
<thumper> not that there is something in state
<wallyworld> by checking instance id?
<thumper> we don't care about state here
<thumper> state happened to conveniently cache the id we care about
<thumper> how about...
<wallyworld> hmmm. seems like a fragile way to do it
<thumper> have expectStarted return the event.id as a container name string
<thumper> then pass that into the expect stopped
<thumper> make more sense?
<wallyworld> i'll look into it. a mechanism that doesn't rely on instance id being valid after life->dead is what we want
<thumper> wallyworld: right, what I just said fits that
<wallyworld> sure, just repeating it back
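The refactor agreed above can be sketched with toy types (illustrative names only, not the real lxcProvisionerSuite): the "started" helper hands the instance id back to the caller, so the "stopped" check never has to read instance data from state after the machine is dead.

```go
package main

import "fmt"

// Toy broker standing in for the lxc broker under test.
type broker struct{ running map[string]bool }

func (b *broker) start(name string) string {
	b.running[name] = true
	return name // container name doubles as the instance id
}

func (b *broker) stop(name string) { delete(b.running, name) }

// expectStarted returns the instance id so the test can keep it around.
func expectStarted(b *broker, name string) string { return b.start(name) }

// expectStopped takes the previously captured id instead of querying
// state, which may already have forgotten a dead machine.
func expectStopped(b *broker, id string) bool {
	b.stop(id)
	return !b.running[id]
}

func main() {
	b := &broker{running: make(map[string]bool)}
	id := expectStarted(b, "juju-machine-0-lxc-1")
	fmt.Println(expectStopped(b, id)) // true: the right container was stopped
}
```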
 * wallyworld is pleasantly surprised the test suite seems to run so much quicker as of late
<thumper> hmm...
<thumper> I suppose I should start writing my juju talk at some stage
#juju-dev 2014-06-23
<davecheney> https://github.com/juju/juju/pull/144
<axw> wallyworld: https://github.com/juju/juju/pull/134/files#diff-4fd108ff3861516a9ea367ed5e560d50R1534    does this look reasonable?
<waigani> thumper: sorry I'm late - in hangout now
<wallyworld> axw: for the first test, i think we need to load a new machine object on which to set addr1, addr0
<wallyworld> in the before hook
<axw> oops, yeah, did it in the second but not the first
<wallyworld> so that we don't mess up the in memory representation for machine
<axw> will do
<wallyworld> ah, haven't got to the 2nd test yet
<wallyworld> axw: yep, lgtm
<axw> wallyworld: thanks. will fix the first one and land
<wallyworld> ta
<perrito666> wallyworld: :) please take a look at your email I would really appreciate an answer to that last email
<perrito666> wallyworld: took me 3 days but my background brain thread finally returned the solution
<wallyworld> perrito666: will do, looking now
<wallyworld> perrito666: good pickup. my view is that all restore type operations should be run as admin. normal users have readwrite permissions but lack these ones mgo.RoleDBAdminAny, mgo.RoleUserAdminAny. i don't think it's appropriate for users other than admin to have such permissions. i'll reply to the email
<perrito666> wallyworld: ok, I'll propose a small patch and if we then decide that it's the other way around we can reject it
<wallyworld> perrito666: which lines fail from mongpoEval?
<wallyworld> perrito666: i'm also confused why it worked when run by hand?
<perrito666> wallyworld: at first sight none :| that is, none yields errors, but something might be lost when we go from bash through ssh to go..
<perrito666> wallyworld: there lies the solution actually
<perrito666> I realized that by hand I was using admin instead of tag user
<wallyworld> oh
<wallyworld> that was a bit of a red herring then :-)
<perrito666> wallyworld: all thanks to the talk we had the other day
<perrito666> wallyworld: I take that a red herring is not a fish in that sentence? :p
<wallyworld> no :-) a colloquial expression for an unintended diversion or misdirection
<perrito666> well yes and no, I focused on the fact that it worked by hand and tried to replace --eval with a js file and while I was writing that patch I realised I was not using admin on restore
<perrito666> sadly the realisation of this solution came on a sunday night, so here I am :|
<wallyworld> :-(
<wallyworld> so the solution is to remove mongoEval  and run everything as mongoAdminEval?
<wallyworld> or did you want to keep doing some things as non admin?
<perrito666> nope, everything as Admin
<perrito666> or I can try one by one the commands and do as admin only what is required
<perrito666> which is, as I see it, a great loss of time
<wallyworld> yep, agreed
<perrito666> ok, I'll go sleep and send the patch tomorrow AM
<wallyworld> sounds good, thanks for spending the extra time to find it
<davecheney> Session closed is getting me down
<davecheney> axw: wallyworld + echo 'Instance setup done:' Mon Jun 23 04:35:03 UTC 2014
<davecheney> ^ this is great
<davecheney> but there is no other timestamp in the log to see when it started
<davecheney> well there is this one
<davecheney> [workspace] $ /bin/bash /tmp/hudson2115592101584260118.sh
<davecheney> Started: Mon Jun 23 04:26:02 UTC 2014
<davecheney> seriously - 10 minutes to set up a machine in ec2 ...
<wallyworld> davecheney: yep :-(
<wallyworld> can be as short as 4 minutes
<wallyworld> that's why we're looking at using a nailed up instance
<wallyworld> there are 3 timestamps in the log
<wallyworld> tests started, instance finished/tests starting, all done
<wallyworld> s/test started/ job started
<davecheney> Instance has ip ec2-54-84-105-221.compute-1.amazonaws.com
<davecheney> Waiting for 22..............................
<davecheney> + set +e
<davecheney> ^ there should be a timestamp here
<wallyworld> can add one
<davecheney>   System information as of Mon Jun 23 04:29:53 UTC 2014
<wallyworld> the apt dance takes ages :-(
<davecheney> there is this one from the motd
<davecheney> i guess its up to date
<menn0> wallyworld: can I log in to Jenkins or is there a limited list of people who can? The current build is mine and has failed and can be cancelled.
<menn0> on a related note, there's been a lot of mgo panics today during test runs along the lines of "Session already closed"
<menn0> is this new or could it be related to davecheney's recent close-mongo-iterators PR?
<wallyworld> yes
<wallyworld> the tracebacks seemed to implicate root.go somehow
<wallyworld> and then i saw your email to dev and stopped looking
<wallyworld> could be related though
<menn0> no I think these are different from root.go
<menn0> there's been 2 regular build problems today: root.go and the mgo "session already closed"
<menn0> I think they're separate issues (probably)
<wallyworld> could be
<wallyworld> we've looked on and off and fixed several issues
<jam1> menn0: so the existing test that was failing does show a data race under "go test -race", I'll try to write up a simpler case, though.
<menn0> jam1: great that you were able to track it down quickly
<jam1> menn0: well, when the line that has the error is "objectCache[key] = obj"
<jam1> it gives a pretty good hint
<jam1> but yeah, I'm pretty familiar with the code since I've been working on it closely.
<menn0> jam1: note that it was a different test that failed on my machine, but in a similar way
<jam1> menn0: yeah, it is an API data race (concurrent mutation of a golang map, which are *not* concurrent safe, you have to wrap them in a mutex)
<menn0> jam1: yep. it could happen at any point right?
<menn0> any call
<jam1> menn0: yeah
<vladk> jam1: morning
<jam1> vladk: morning. Sorry I'm a bit late, looking into this data race, I'll be there in a couple mins
<vladk> jam1: go maps and slices are non-concurrent objects, only channels are concurrent and strings are immutable
<jam1> vladk: yeah, I'm aware, just wasn't thinking about the concurrent access when I was writing the code.
<jam1> menn0: https://github.com/juju/juju/pull/146
<jam1> it also fixes an only tangentially related race condition in state/api/watcher/watcher.go that I only noticed because the test that was failing in cmd/jujud had 2 sources of race conditions.
<jam1> "go test -race" is pretty nice, it's a shame it slows things down so much.
<jam1> menn0: you're also OCR for today, so poke for the review :)
<wallyworld> jam1: not looking in detail, but you may have fixed an ongoing intermittent test failure around watchers
<wallyworld> well at least i'm hoping :-)
<jam1> wallyworld: so the race for watchers is that it is possible for the loop() to terminate before it actually starts anything
<jam1> because it calls w.wg.Add() but only *inside a goroutine*
<jam1> which isn't, itself, protected by a wg.Add()
<wallyworld> ah ok. may not be the same issue then
<jam1> so you could start a watcher, have a couple pending goroutines, and then exit
<jam1> although thinking about it, I may need to move something around a bit
<wallyworld> good that we found and fixed this before 1.19.4 ships
<jam1> k, I don't need to move it after all. so my patch is ready for review.
<jam1> vladk: I'm in the hangount
<jam1> hangout
<jam1> wallyworld: I wonder if we want a CI test that runs the whole test suite in "go test -race" mode
<jam1> I don't think we're *quite* clean there, though.
<wallyworld> jam1: worth adding i reckon
<jam1> wallyworld: well it doesn't help if it never passes :)
<wallyworld> sure, so let's get it passing first :-)
<davecheney> ah mongo, how do you leak temporary files, let me count the ways ...
<rogpeppe1> davecheney: morning!
<davecheney> rogpeppe1: ahoy!
<TheMue> morning
<rogpeppe1> TheMue: hiya
<jam1> TheMue: morning ! I'm just finishing up lunch, I'll be there in about 5-10 min.
<TheMue> jam1: ok
<fwereade> wallyworld, ping
<wallyworld> fwereade: hey
<fwereade> wallyworld, hey, I was wondering about proof-of-access for the managed resource stuff
<wallyworld> proof of access?
<fwereade> wallyworld, ie "here store this file with md5/sha256" "ok I want the md5/sha256 of <random byte range>" "here you go" "ok cool your file is stored"
<wallyworld> ah that
<wallyworld> not implemented yet
<wallyworld> just getting basics landed
<fwereade> wallyworld, I'm wondering what impact that will have on this layer, because it's starting to feel like the right place for it
<fwereade> wallyworld, maybe I'm wrong
<fwereade> wallyworld, but the lower the layer that implements it the less opportunity we will have to fuck it up
<wwitzel3> perrito666: nice job :)
<fwereade> wallyworld, the higher the layer the less we need to thread the challenge/response stuff through, I understand it's a tradeoff
<wallyworld> fwereade: it will impact i think, may need an extension to the current api. workflow will be controlled by a layer above but primitives to make it work will be in the current layer
<fwereade> wallyworld, ok, cool, so long as it's on your mind and coming soon I won't worry about it for this CL
<wallyworld> fwereade: well, next on the todo list is the ToolsStorage facade so we can get rid of the http storage stuff for manual provider
<wallyworld> so it's on my mind but not on the very immediate next to do list
<wallyworld> does that work for you?
<fwereade> wallyworld, I worry that we'll want that functionality for all facades, and that changing the tools facade to accommodate it *as well as* the managedresource stuff will exert subtle pressures to do it less cleanly than we might
<wallyworld> fwereade: ok, i can add some new apis to the current design spec and do the proof of access stuff first then
<wallyworld> after i land the current pull request
<wallyworld> https://github.com/juju/juju/pull/124
<fwereade> wallyworld, great, thanks
<fwereade> wallyworld, yeah, was starting to look at that, that was what made me think of it :)
<wallyworld> lol ok, i figured as much
<menn0> jam1: sorry I had finished up for the day... I'm actually OCR tomorrow not today anyway
<jam1> menn0: I must not have refreshed the page
<jam1> np
<menn0> jam1: it did the same for me too, at first it said I was and then recalced
<menn0> jam1: it's good that you pointed it out anyway. I hadn't realised I was on tomorrow :)
<wallyworld> fwereade: what did we want for the challenge-response policy? retain the current Put() where the caller has to provide all the data (and it is de-duped on the server) but also add a *new* API where they provide just the checksums and then are issued a challenge for a segment checksum and if that passes they don't need to upload anything?
<wallyworld> not thinking too much, the new api will necessarily be stateful so we'll have to consider a timeout etc after the initial request
<wallyworld> ie if they don't respond soon enough the acceptable response expires and they would be issued with a new challenge
<fwereade> wallyworld, in my mind the main goal is to avoid having to send the bytes at all in the common case, so what I'd like us to *expose* is the stateful case, and only fall back to sending bytes in response to a never-heard-of-it result from the first call
<fwereade> wallyworld, yeah, we'd want a timeout, indeed
<wallyworld> fwereade: "avoid sending bytes in the common case" assumes there is a high chance the data is already uploaded
<wallyworld> i guess the caller can optimistically try and use just the checksums
<wallyworld> and if the server doesn't have the data, the caller is requested to upload everything
<perrito666> morning everyone
<fwereade> wallyworld, I think that globally there is a high enough chance that the (low) cost of even quite a lot of back-and-forths will be reasonable compared to the (high) cost of even a few ~gig-sized uploads
<fwereade> wallyworld, remember this is closely aligned with the fat charms case
<fwereade> wallyworld, those can often end up gig-sized
<wallyworld> fwereade: agreed. i'm just stating the obvious to be really explicit that we have a shared understanding
<jam1> dimitern: can you take a look at https://github.com/juju/juju/pull/146
<fwereade> wallyworld, cool, I think we do
<dimitern> jam1, looking
<wallyworld> fwereade: both Put(supplyTheData) and Put(supplyTheChecksums) will be exported so i guess the caller can decide which one they want to use
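The challenge step of the proof-of-access scheme being discussed can be sketched in a few lines. This is a hypothetical illustration of the idea (the function and variable names are not the juju managed-resource API): the server, which already holds the blob, asks the client to hash an arbitrary byte range of the file it claims to have; a matching digest proves possession without any upload.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// rangeDigest returns the hex SHA-256 of data[start:end]. Hashing a
// server-chosen random range (rather than the whole file) stops a
// client from replaying a known whole-file checksum it never had.
func rangeDigest(data []byte, start, end int) string {
	sum := sha256.Sum256(data[start:end])
	return hex.EncodeToString(sum[:])
}

func main() {
	stored := []byte("the large blob already held by the server")
	client := []byte("the large blob already held by the server")

	// Server picks a random byte range and issues the challenge.
	start, end := 4, 14
	challenge := rangeDigest(stored, start, end)

	// Client answers from its own copy; no file bytes cross the wire.
	answer := rangeDigest(client, start, end)
	fmt.Println(answer == challenge) // true: client proved possession
}
```

A real implementation would also need the timeout/expiry discussed above, since the server must remember which range it challenged each client with.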
<jam1> morning perrito666
<natefinch> jam: morning
<jam1> morning natefinch
<perrito666> mm, we no longer have a way to say "this fixes bug lp:#######" ?
<wwitzel3> perrito666: I think if there is  a lp issue, it is nice to mention it in the pull request comments
<perrito666> wwitzel3: yup, I just wanted to know if there is a way to trigger the "fix committed" status
<wwitzel3> perrito666: not that I am aware of
<wwitzel3> perrito666: I've been doing that manually
<perrito666> ok, I'll use what I see for other bugs
<dimitern> jam1, reviewed
<jam1> thx
<perrito666> natefinch: wwitzel3 wallyworld https://github.com/juju/juju/pull/147
<perrito666> there are things that upset me and then the fact that this bug is fixed with so little.... :p
<wallyworld> perrito666: \o/ thank you for fixing
<perrito666> :) now back to write a decent restore
<wallyworld> i bet you are sick of backup/restore now
<perrito666> wallyworld: no, I am actually very fond of it, I really look forward to having the new one implemented
<wallyworld> you have a lot of patience :-)
<perrito666> I am a bit sleepy tho, I slept only 4 hs last night
<wallyworld> :-(
<rogpeppe1> jam1: on further reflection, i don't think i understand your commonLoop changes
<rogpeppe1> jam1: i'm not convinced they're right
<rogpeppe1> jam1: specifically, i don't see how the changes ensure anything happens before the outer loop terminates
<jam1> rogpeppe1: so the race as detected by 'go test -race' is that 'NewNotifyWatcher' does a defer w.wg.Wait()" before calling loop. And nothing has been Added to the wg at that time.
<jam1> we then call "go w.commonLoop()" internally
<jam1> which will, eventually, call w.wg.Add() for the two goroutines that *it* spawns
<jam1> however, the 'go w.commonLoop()' hasn't actually incremented anything and can return out of "loop" before we've started it.
<jam1> I believe there is a secondary channel of information in the "w.in" so that the for{} loops never actually exit until commonLoop has entered.
<rogpeppe1> jam1: i see.
<rogpeppe1> jam1: a better solution (i think) is to avoid calling Wait in NewNotifyWatcher
<rogpeppe1> jam1: but to make sure loop waits for in to be closed before returning
<jam1> rogpeppe1: I personally felt like wg.Wait() should probably be called inside the loop() functions
<jam1> rogpeppe1: so in the case of "w.tomb.Dying" we can return without checking w.in
<jam1> is that ok ?
<rogpeppe1> jam1: the original scheme was the wg is for commonLoop's internal use only
<rogpeppe1> jam1: it's kinda weird that commonLoop is doing the Wait itself
<jam1> rogpeppe1: if it is internal to commonLoop, couldn't it just use a local var ?
<jam1> rogpeppe1: I certainly originally thought to change "go commonLoop()" to just be a synchronous "w.commonLoop()" and then wait outside
 * rogpeppe1 checks
<jam1> but that closes w.in in a defer
<jam1> so we could change it some other way
<rogpeppe1> jam1: yeah, that was my initial thought too
<rogpeppe1> jam1: i'm not keen on the current change as it adds more stuff that each caller of commonLoop must remember to do
<rogpeppe1> jam1: yes, i think wg could/should be a local var
<jam1> rogpeppe1: so 'must wait until in is closed' isn't quite true today, because of stuff like "the tomb can die first"
<rogpeppe1> jam1: yup
<rogpeppe1> jam1: if the tomb dies, we should wait for the in channel to be closed
<jam1> rogpeppe1: I'm not sure that it means the outer loop must not terminate before then
<rogpeppe1> jam1: because that's the way commonLoop signifies that it's finished
<rogpeppe1> jam1: it's instructional to see how the code has changed since the original version (state/api/apiclient.go in rev 1235)
<jam1> TheMue: standup ?
<perrito666> yay fix committed
<jam1> vladk: you dropped out? Is everything ok?
<jam1> it would be nice if they had a very soft ding when someone connects
<bodie_> morning all
<perrito666> morning bis
<perrito666> ericsnow: wallyworld I will go back to the new restore, what are you guys doing? I don't want to step on your toes
<bodie_> anyone have a free minute to scope a PR or two?
<bodie_> https://github.com/juju/juju/pull/140 and https://github.com/juju/juju/pull/141
<ericsnow> perrito666: I'm still working on the backup client code
<wallyworld> perrito666: i'm not working on it
<perrito666> wallyworld: I meant wwitzel3 sorry
<wallyworld> :-)
<perrito666> wallyworld: I am used to you not being here at this time :p
<wallyworld> can't sleep
<perrito666> wallyworld: try watching a movie, works wonders for my wife, in almost five years together I think she hardly saw more than 3 movies to full extent
<bodie_> hahahah
<wallyworld> lol
<wwitzel3> natefinch: standup
<perrito666> natefinch: taxes in MA are really low
<rogpeppe1> mgz: ping
<natefinch> perrito666: what's funny is that most people around here call it Taxachusetts.  However, I presume you're talking about sales tax
<natefinch> perrito666: sales tax in the US is done per state, Massachusetts is pretty middle of the road for states at 6.25% ... California being the highest AFAIK, at 10%, and several states have 0% (notably New Hampshire, which borders MA).
<TheMue> *sniff*
<TheMue> in Germany we've got 7% for food and books, magazines etc, but 19% for the rest
<alexisb> fwereade, having an issue with my hangouts, will be there shortly
<perrito666> natefinch: I am talking about the tax amazon collects from me when trying to ship you stuff :p
<fwereade> alexisb, oops, forgot we were meeting, omw too :)
<alexisb> :)
<perrito666> man, lenovo really makes it hard to find a replacement battery
<bac> cmars: ping
<cmars> bac, pong
<mgz> rogpeppe1: hey
<rogpeppe1> mgz: in a call currently, but are you around for a chat in 30 mins or so?
<mgz> sure thing
<rogpeppe1> mgz: also... did you manage to get around that godeps problem?
<mgz> rogpeppe1: yeah, should all be fine now
<rogpeppe1> mgz: what was the issue?
<mgz> unrelated repository issue on the bot
<rogpeppe1> mgz: which was?
<mgz> a repo is shared between a bunch of different things, including godeps apparently, and we hit a bzr bug which made every branch using the repo unhappy
<rogpeppe1> ah, i wondered if it was something like
<rogpeppe1> that
<perrito666> brb lunch
<rogpeppe1> mgz: hey
<mgz> rogpeppe1: hey
<rogpeppe1> mgz: hangout?
<rogpeppe1> mgz: if it's a hassle, np
<mgz> sure, lets use juju-core-team
<rogpeppe1> mgz: link?
<mgz> rogpeppe1: in the calendar for thursday or just ...plus.google.com/hangouts/_canonical.com/juju-core-team
<rogpeppe1> mgz: hmm, i get 404
<rogpeppe1> mgz: will try the link in the calendar
<mgz> after the _
<mgz> add /
<mgz> I mistyped
<sinzui> abentley, Juju-ci will fail juju for the wrong reasons.
<sinzui> abentley, ppc and arm64 access was restored, but ci missed the opportunity to make the debs. all those arch tests will fail
<abentley> Doh!
<sinzui> abentley, aws has 6 old instances still running, causing the manual test to fail.
<sinzui> I will restart the revision if no revision lands in the next hour
<sinzui> perrito666, I am restarting the current revision. CI ran out of AWS resources and ppc64 and arm64 machines. Many tests couldn't be run. Looks like the restore is working when there are resources
<perrito666> sinzui: \o/
<natefinch> fwereade: you around?
<natefinch> I love getting happy birthday emails from websites I don't even remember visiting
<perrito666> natefinch: is it your bday?
<natefinch> It is my birthday and my twin sister's birthday and my wife's birthday today.
<perrito666> uhh, that is a cool memory space saver
<perrito666> natefinch: well happy bday (and why is your bday not in the calendar for bdays?)
<natefinch> it's in my calendar, I dunno
<natefinch> and my aunt's birthday is tomorrow and Wednesday is Zoë, my younger daughter's birthday
<natefinch> and a couple days ago was my sister's step son's birthday.    My mother went to the store and bought 6 birthday cards last week :)
<perrito666> "I will not make friends with people that have birthdays outside this week" great technique
<rogpeppe> on reflection, i'm not sure that using gopkg.in/juju/charm.v2 gives significant advantage over using github.com/juju/charm.v2
<rogpeppe> mgz: ^
<natefinch> rogpeppe: I actually thought of that when Gustavo proposed gopkg.in
<rogpeppe> mgz: the main disadvantage of the latter that i can think of is that github.com/juju will show several more repos, one for each api version
<natefinch> yep
<rogpeppe> natefinch: what do you reckon?
<mgz> rogpeppe: it's mostly a benefit with lots of api bumps, and keeping a sane git branch workflow
<rogpeppe> mgz: i think that the git workflow can be pretty similar in both cases
<natefinch> rogpeppe: that you could make your own foo.v2 and not need his magic.  However, it does clean up the juju repo list
<rogpeppe> mgz: there's not much difference between a remote branch whichever repo it's in
<natefinch> rogpeppe: his magic does let you do v2.1 v2.2 and let import foo.v2  work with all of those
<natefinch> not sure how necessary that minor revision bumping is though
<rogpeppe> natefinch: that is true
<natefinch> it keeps the code separate, I guess, but there's little difference to the end user from it all being in the same branch
<rogpeppe> the thing is, the code will need to live in two separate directories anyway, because that's the way go works
<natefinch> I mean, it keeps the v2.1 separate from v2.2 in git
<natefinch> yes, on disk, foo.v2 will need to live separately from foo.v1
<natefinch> I think it's worth using gopkg.in to keep the juju repos cleaner
<natefinch> it's already getting a little noisy
<rogpeppe> my main inclination the other way is that it's nice to have all the juju packages live under github.com/juju in my $GOPATH
<rogpeppe> because i'll often do a recursive grep inside that dir
<jam1> alexisb: you dropped out at "lets say"
<alexisb> jam1, yeah
<alexisb> I am trying to get back in
<perrito666> oh the sweet looks of passing tests http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore/213/
<natefinch> perrito666:  beautiful
<sinzui> alexisb, natefinch, jam, 1.19.4 release is blocked by bug 1333357 which was introduced earlier today
<natefinch> dammit
 * perrito666 facepalms
<alexisb> ooo the saga continues
<natefinch> ahh, it's only gccgo, who cares?
 * natefinch is joking, mostly
<alexisb> IBM does
<alexisb> ;)
<perrito666> sinzui: do you really think that revision is the one that introduced the bug?
<sinzui> perrito666, it is the only rev that changed apiserver/networker in the last 2 days
<perrito666> the output of gccgo is less than useful
<natefinch> so, that's a compiler error, which means it's a gccgo bug not a juju bug... not that we don't still have to fix it in gccgo (and perhaps try to avoid it in juju)
<perrito666> natefinch: If I were a compiler dev, I really would like to have better error reports than that
<perrito666> do you know what the $ mean?
 * perrito666 takes a quick look at the code
<bac> hi sinzui, for deploying to prodstack one of the webops mentioned a while back that we should transition to storing charm dependencies in a bucket somewhere. can you point me to one of your deployments that does that so i can copy the hell out of your work?
<natefinch> perrito666: it has the exact line number and everything, though the message itself is not very useful
<sinzui> bac, I don't have an example.
<bac> doh
<sinzui> bac swift post charmworld-deps
<sinzui> bac, the charm can call swift download charmworld-deps <object>
<bac> sinzui: thanks
<sinzui> bac, you will probably want to make the container public so that the charm doesn't need creds. swift post -r '.r:*'
<jcastro> natefinch, do you have the URL handy for the in-progress API documentation? I believe you gave it to me before but I didn't bookmark it.
<sinzui> bac, I don't trust canonistack's swift this month. I got canonistack tests to pass by avoiding it. You probably will have a problem intermittently uploading files to it.
<perrito666> natefinch: I was curious of what line of go triggered the .cc to crash
<natefinch> perrito666: no idea.  the error doesn't really say, and I can't imagine "String()" would do it.
<perrito666> natefinch: the only nested something is on the test
 * perrito666 buys new guts for his computer
<alexisb> natefinch, I need a quick break and will be a few minutes late for our 1x1, I will ping you when I get back
<natefinch> alexisb: okie dokie
<natefinch> alexisb: are you in the call? I'm there but it says no one else is
<alexisb> I am there
<alexisb> video call
<natefinch> I'll rejoin
<alexisb> trying call in as well
<perrito666> ok, EOD, bye ppl
<natefinch> the bridge ID said not valid
<alexisb> yep
<alexisb> natefinch, are you on the video call?
<mattyw> thumper, morning
<mattyw> thumper, fwereade asked if you could take a look at this https://github.com/juju/juju/pull/108
<thumper> mattyw: ok, and otp
<mattyw> thumper, no problem, just wanted to let you know
<mattyw> I'll be heading to bed soon so anytime today is fine
<menn0> perrito666: ping
<perrito666> menn0: pong
<menn0> perrito666: I'm wanting to understand how the native backup solution is looking.
<menn0> just the high level design
<menn0> how much is committed already and how much is to come?
<perrito666> menn0: you have my divided attention between you and my merienda :)
<perrito666> if what you want is backup, its inner parts are already committed and not likely to change much
 * menn0 had to look up what merienda means :)
<perrito666> I looked it up and wp does not have a translation for it
<perrito666> menn0: as for restore, it is being done; I pretty much know how it will work, but it's not yet completed
<perrito666> we had a few days' setback because of a bug in the old restore
<menn0> so will backups be stored server side with the option of downloading or did your team go with the direct download to the client option?
<menn0> yeah I saw the discussions about the problem that was breaking CI
<perrito666> menn0: if you give me 5 minutes to remove my toasts from the fire we might solve this faster with a hangout
<menn0> sounds good
<perrito666> menn0: https://plus.google.com/hangouts/_/canonical.com/moonstone?authuser=3
<perrito666> menn0: ?
<menn0> perrito666: missed you
<menn0> try again?
<perrito666> https://plus.google.com/hangouts/_/canonical.com/moonstone?authuser=3
<menn0> perrito666: party is over. try this: https://plus.google.com/hangouts/_/g3qbgajp7bnquq576ulbflvdvia
<perrito666> menn0: I did not call :p that is the url for the moonstone hangout
<bodie_> anyone familiar with the permissions stuff in apiserver?
<bodie_> I'm trying to write a failing test for a unit without perms
<bodie_> but, I can't quite figure out how to find a suitable unit to try to query that I won't have perms for
<bodie_> I'm in state/apiserver/uniter
<bodie_> UniterAPI suite
<bodie_> sorry, uniterSuite
<bodie_> and batch Actions query is in
<bodie_> https://github.com/juju/juju/pull/140#discussion-diff-14067952
<bodie_> sorry, https://github.com/juju/juju/pull/140
<bodie_> ActionsWatcher API endpoint would be great to have a review on as well :)
<bodie_> PR 141
<menn0> bodie_: I'll take a look at that PR today
<bodie_> sweet, thanks menn0
<sinzui> wallyworld, We got another regression while ppc64 testing was down. I don't think perrito666 or natefinch made progress with it https://bugs.launchpad.net/juju-core/+bug/1333357
<wallyworld> :-(
<wallyworld> sinzui: ok, we'll fix today
<sinzui> wallyworld, I will grab the tarball and installer the moment I see CI pass to start the release process
<wallyworld> rightio. this release really is cursed so far
<dpb1> is there an equivalent to juju run, but for transferring files?   like, send this file to "--unit <unit list>"
<wallyworld> sinzui: we also have bug 1333098 that has not been fix commited yet afaict
<_mup_> Bug #1333098: API panic running test suite <api> <panic> <regression> <juju-core:In Progress by jameinel> <https://launchpad.net/bugs/1333098>
<wallyworld> dpb1: i think juju scp
<wallyworld> yup, type juju help scp
<dpb1> wallyworld: yes, but I have to iterate, right?  I have a big file, and I was looking for something that could copy once into the cloud, then distribute that to the units I specify.
<wallyworld> dpb1: ah i see what you want. no, sadly you have to iterate
<dpb1> wallyworld: k
<dpb1> thx
<wallyworld> sorry
<dpb1> np, I was just wishing. :)
<wallyworld> raise a bug if you want
 * dpb1 nods
<wallyworld> we may be able to do something
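Until something like dpb1's wished-for feature exists, the iteration wallyworld describes can be wrapped in a small helper. A minimal sketch, assuming the standard `juju scp` syntax; `push_to_units` and the destination path are made up for illustration.

```shell
# Fan out one file to several units by looping over `juju scp`.
# Note this re-uploads from the client once per unit, which is exactly
# the inefficiency dpb1 wants to avoid for big files.
push_to_units() {
    file="$1"; shift
    for unit in "$@"; do
        juju scp "$file" "$unit":/tmp/
    done
}
```

Usage would look like `push_to_units big.tar.gz mysql/0 mysql/1 mysql/2`.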
<perrito666> I did not, I just took a look at it
#juju-dev 2014-06-24
<davecheney> thumper: alexisb email sent
<davecheney> tell me what you think
<davecheney> certainly more than 6 weeks work :)
<davecheney> sinzui: i am blocked
<davecheney> something is fucked up on batuan and I cannot pull source from github
<davecheney> [rt.admin.canonical.com #72748]
<_mup_> Bug #72748: Crash while exiting <tea (Ubuntu):Invalid by tea-dev> <https://launchpad.net/bugs/72748>
<davecheney> going to talk to #is now
<sinzui> davecheney, I have copied files to a machine inside the network, hacked /etc/hosts to point to that machine. I have also created certs that claim to be the other site
 * sinzui doesn't accept no
<davecheney> sinzui: i'll try to debug this on amd64 for the moment
<menn0> bodie_: review of PR 141 finished for now
<davecheney> sinzui: crisis averted
<davecheney> connectivity restored
<davecheney> sinzui: interesting, I cannot reproduce the build failure on my ppc64 machine
<sinzui> we saw it several times in ci http://juju-ci.vapour.ws:8080/job/run-unit-tests-trusty-ppc64el/
<sinzui> dave the machine is stilson-07, do you want to visit it?
<sinzui> davecheney, ^
<sinzui> davecheney, ssh ubuntu@10.245.67.135. Your keys are there. maybe you can find a older or newer version of gcc there
<sinzui> davecheney, oh, 10.245.67.136 saw the same error. so both the unittest machine and the packaging machine hate juju
<davecheney> sinzui: interesting
<davecheney> sinzui: trying stilson now
<davecheney> sinzui: stilson-07 is running the outdated version of gccgo
<davecheney> sinzui: are you permitted to update it ?
<davecheney> sinzui: wtf, stilson-07 doesn't even have bzr in the path ...
<davecheney> that's like saying the machine is actually running fedora
<axw> menn0: thanks for the review
<axw> wallyworld: can you also take a look at https://github.com/juju/juju/pull/148/ ? there's some changes to the instance-type constraint handling that you added
<sinzui> davecheney, We try to keep the machines clean. tests purge the leftovers, and the tests often export all the locations needed
<wallyworld> axw: oh hi, i had looked, was waiting for you to come online
<waigani> thumper: added some comments on the doc, heading your way
<davecheney> sinzui: short version
<wallyworld> axw: i don't quite see the purpose from the covering letter - the azure provider already calls into common instance matching functionality
<davecheney> you need to upgrade the compiler
<sinzui> hurray
<davecheney> to the one that is (hopefully) in trusty updates
<axw> wallyworld: I'll find the code in question for you, just a sec
<davecheney> bug is fixed in that version
<axw> wallyworld: would've been more obvious with a branch prereq'ing ;)
<davecheney> sinzui: I don't know how you can reconcile that with the requirements of the lp builders
<wallyworld> yup
<axw> wallyworld: https://github.com/juju/juju/blob/master/provider/azure/instancetype.go#L19
<axw> all this preferredTypes stuff can go
<axw> and be replaced with a call to MatchingInstanceTypes
<wallyworld> axw: also, this line "if len(itypes) == 0 && cons.Mem != origCons.Mem {"  <-- the semantics seem slightly different from the original origCons.Mem != nil
<sinzui> davecheney, the unittest run called this to get the compiler
<sinzui> sudo apt-get install -y build-essential bzr distro-info-data git-core mercurial zip rsyslog-gnutls juju-mongodb gccgo-4.9 gccgo-go
<axw> wallyworld: the idea is that it'll only try again if it tried above with the implied mem=1G
<davecheney> sinzui: the update has not landed
<davecheney> it's still stuck in whatever process is blocking it
<davecheney> i had to install the compiler from ppa
<sinzui> The builder called this sudo apt-get install -y build-essential fakeroot dpkg-dev debhelper bash-completion gccgo-4.9 gccgo-go
<davecheney> yes, you said
<davecheney> but it looks like the compiler has not made it out of the -proposed pocket
<davecheney> the original ppc bug is not marked Fix Released
<axw> wallyworld: I could do "!cons.HasInstanceType() && origCons.Mem == nil", but I thought this was clearer
<davecheney> sinzui: https://bugs.launchpad.net/ubuntu/+source/gccgo-4.9/+bug/1304754
<_mup_> Bug #1304754: gccgo has issues when page size is not 4kB <ppc64el> <trusty> <gcc:Fix Released> <gcc-4.9 (Ubuntu):Fix Released> <gccgo-4.9 (Ubuntu):Invalid> <gcc-4.9 (Ubuntu Trusty):Invalid> <gccgo-4.9 (Ubuntu Trusty):Fix Committed by doko> <gcc-4.9 (Ubuntu Utopic):Fix Released> <gccgo-4.9 (Ubuntu Utopic):Invalid> <https://launchpad.net/bugs/1304754>
<sinzui> davecheney, more interestingly. all machines got their compilers from ppa, but the ppa
<davecheney> what ppa ?
<sinzui> davecheney, the machines use ubuntu-toolchain-r-ppa-trusty.list per your instructions
<wallyworld> axw: so you plan on changing the selectInstanceTypeAndImage() method?
<sinzui> davecheney, stilson-7 is missing it though
<axw> wallyworld: yes, but I think selectMachineType is more relevant
<axw> wallyworld: actually it could probably just be replaced
<axw> wallyworld: I could just propose my other branch
<axw> as a WIP
<wallyworld> axw: i'm thinking that both openstack and ec2 providers call into FindInstanceSpec() and their code handles instance type constraints etc as is - can we just tweak azure to use the same code?
<wallyworld> without exporting anything?
<axw> wallyworld: maybe... I'll see what they do
<davecheney> sinzui: ppa is not installed on stilson-07
<davecheney> apt-get update confirms it
<sinzui> I am adding it now. I think a test replaced it...
<axw> wallyworld: do they allow you to explicitly specify the image name?
<wallyworld> axw: i'm all for refactoring if it improves the code, so if it's necessary, go for it. but i also think we'd want azure, ec2, openstack, joyent etc to be consistent
<wallyworld> ah, imagename
<sinzui> davecheney, the machine with the right ppa failed the same way
<wallyworld> sorry
<wallyworld> i was getting confused with instance type
<axw> wallyworld: this is caused by having to deal with force-image-name
<wallyworld> hmmm. we specifically removed image name selection from ec2 etc
<wallyworld> ie the image-id config was removed
<wallyworld> because people could do bad things
<wallyworld> like forcibly specify an image that didn't match tools
<wallyworld> the fact that azure added support for it is unfortunate
<wallyworld> i'm almost inclined to see if we shouldn't look at deprecating it
<menn0> axw: np
<wallyworld> it may have been useful early on when azure images were changing rapidly and we didn't have simplestreams
<axw> wallyworld: can do, but I think we need to give people notice. is exposing this function really that bad?
<davecheney> sinzui: can I see the failure from that machine ?
<sinzui> davecheney, http://juju-ci.vapour.ws:8080/job/run-unit-tests-trusty-ppc64el/606/consoleFull
<wallyworld> axw: yeah, it would be deprecated over a release cycle. in which case i guess we could expose the function. but any time we expose previously internal stuff is potentially an issue we'd prefer to avoid. but in this case we could do at and hide it again when azure is changed
<axw> yep, that sounds fine to me
<wallyworld> ok, let me take another look at the pr
<axw> ta
<thumper> davecheney: arm email looks good to me
<wallyworld> axw: also, i'd be interested in your thoughts on review board as per martin's email
<thumper> davecheney: thanks
<axw> wallyworld: yeah taking a look now
<davecheney> sinzui: that machine is also lacking the ppa
<davecheney> oh no
<davecheney> i'm sorry
<sinzui> davecheney, ubuntu-toolchain-r?
<davecheney> i guess
<davecheney> i cannot see from that output
<davecheney> all i see is
<davecheney> Get:3 http://ppa.launchpad.net trusty/main ppc64el Packages [11.9 kB]
<davecheney> sinzui: can you add a
<davecheney> gccgo -v to the build script, just before doing go build
<davecheney> that will settle it
<sinzui> davecheney, that is a little tricky since that would error on golang
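One hedged way to reconcile davecheney's request with sinzui's objection (some slaves build with gc, not gccgo) is to guard the version print on the compiler being present. This is a sketch only, not the actual CI script; `report_compiler` is a made-up name.

```shell
# Print which compiler a slave will use, just before `go build`.
# Guarding on `command -v` avoids erroring on slaves that only have gc.
report_compiler() {
    if command -v gccgo >/dev/null 2>&1; then
        gccgo -v 2>&1 | tail -n 2  # gccgo prints its version info on stderr
    else
        go version                 # fall back to the gc toolchain
    fi
}
```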
<davecheney> eh ?
<sinzui> davecheney, http://pastebin.ubuntu.com/7693053/
<axw> wallyworld: are you able to get onto reviewboard atm? it's refusing my connection, just wondering if I've set up the tunnel right or not
<davecheney> interesting
<davecheney> that is the correct version
<davecheney> that matches winton-07
<davecheney> that matches winton-09
<axw> wallyworld_: are you able to get onto reviewboard atm? it's refusing my connection, just wondering if I've set up the tunnel right or not
<wallyworld_> axw: yeah, worked for me
<axw> hrm
<wallyworld_> did you set up proxy in browser?
<axw> wallyworld_: seems busted. I can't even telnet to 8080 on the machine.
<axw> oh, it's 80 isn't it
<wallyworld_> axw: all i did was run the ssh -D command, and then that logged me into a session. then i changed the browser to use a socks proxy and it worked
<wallyworld_> i used port 8080 in the socks proxy config
<axw> wallyworld_: I was trying to be clever and just forward the port, but I chose the wrong destination port. never mind :)
<wallyworld_> ok :-)
<sinzui> davecheney, stilson-07 with the correct ppa failed the same way http://juju-ci.vapour.ws:8080/job/publish-revision/546/console
<sinzui> davecheney, I am still deploying a hack to print the compiler to all the slaves
<davecheney> gccgo -v ::
<davecheney> sinzui: confirmed
<davecheney> same compiler
<davecheney> virtually same kernel
<davecheney> different results :(
<wallyworld_> sinzui: do you know what the policy would be in terms of deprecating and removing a juju feature which shipped in trusty? can we do that over a 1.20->1.22 release, or are we stuck with it forever?
<sinzui> wallyworld_, jamespage's proposal is co-installable versions, so we can deprecate in devel and obsolete next devel. We will continue to state that a modern client works with an older server one stable release behind
<sinzui> davecheney, I can reboot? I have rebooted stilson-08 a lot actually because the ppc bug this past week caused the disk and ram to fill up
<axw> wallyworld: how did you create a review from a PR in reviewboard?
<wallyworld> axw: i clicked on the new review link and it showed the recent branches
<axw> wallyworld: did you just get the diff and upload it?
<wallyworld> i then clicked on the branch
<axw> huh ok
<axw> I don't get that...
<wallyworld> no New Review menu?
<axw> wallyworld: it just shows me the juju/juju repo and master branch
<wallyworld> hmmm
<wallyworld> so you have the menu but it doesn't show any branches?
<axw> wallyworld: I clicked "New Review Request" at the top, and the only repo is "juju" which is juju/juuju
<axw> juju*
<axw> no obvious way to add my own
<wallyworld> hmm, let me re setup the tunnel etc and try
<davecheney> sinzui: sure
<davecheney> sinzui: stilson-07 looks fucking sit
<davecheney> dmesg shows continual crashes
<wallyworld> axw: just emailed a screenshot showing what i get when i click "New Review Request"
<davecheney> s/sit/sick
<axw> wallyworld: yeah that's what I see too. that's merged commits.
<wallyworld> oh balls, it is too
<wallyworld> i didn't look too closely
<wallyworld> so looks like the rbt tool is needed after all
<axw> yeah, which is a bit crap IMO
<wallyworld> yep :-(
<wallyworld> i wonder if gerrit is any better
<davecheney> waigani: you're ocr today ?
<davecheney> wallyworld: everyone says gerrit is the one to go for
<wallyworld> davecheney: we'll take a look at it. trouble is, they're all crap compared to lp
<davecheney> wallyworld: i'll agree to disagree with you there
<davecheney> no inline commenting on lp was a non starter for me
<wallyworld> davecheney: what don't you like about lp? it has the best review queue, can mark stuff as wip, is *not* patch based, supports pre-req branches etc etc
<wallyworld> none of the others do that very well at all
<davecheney> wallyworld: lack of inline commenting on reviews
<wallyworld> meh
<wallyworld> lp has that now anyway
<davecheney> damn, too late
<wallyworld> since maybe about a month
<davecheney> like i say, we don't have to agree on this point
<wallyworld> sure :-)
<davecheney> sinzui: did you reboot stilson ?
<sinzui> davecheney, 7 and 8 are rebooted. 7 is back
<sinzui> and 8 is back
<davecheney> [    5.238055] init: plymouth-upstart-bridge main process ended, respawning
<davecheney> [    7.117990] init: pollinate main process (797) terminated with status 1
<davecheney> this machine is sick
<sinzui> davecheney, we could back out your change for half a day to get a passing rev we can release. or maybe rewrite the offending code to be friendlier to gccgo
<davecheney> sinzui: yup, will revert
<davecheney> does anyone know how to do that ?
<davecheney> wallyworld: can you help revert a merge ?
<wallyworld> which one?
<davecheney> https://github.com/juju/juju/pull/144
<wallyworld> i think you just reverse cherry-pick and propose a pr
<davecheney> wallyworld: i have _never_ done this before on git
<wallyworld> me either :-)
<davecheney> it's guaranteed i'll screw it up
<davecheney> might be easier to work around the compiler bug
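For the record, the revert they are circling around is a one-liner once you know the trick: a merge commit has two parents, so `git revert` needs `-m 1` to keep the mainline side. A hedged sketch (`revert_merge` is a made-up wrapper; the actual SHA of the PR 144 merge is not shown in this log):

```shell
# Revert a merge commit: -m 1 says "keep parent #1 (the mainline)" and
# creates a new commit that undoes everything the merge brought in.
revert_merge() {
    git revert -m 1 --no-edit "$1"
}
```

Usage: `revert_merge <merge-sha>`, then propose the resulting commit as a new PR.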
<sinzui> davecheney, I don't have confidence in that myself.
<wallyworld> davecheney: it that what breaks the compiler?
<davecheney> sinzui: wallyworld I have a workaround for the compiler crash
<davecheney> pls hold
<wallyworld> ok
<sinzui> \o/
<davecheney> hang on
<davecheney> shit is weird
<davecheney> my local working copy of master does not match what i'm seeing on stilson
<davecheney> fuck, forgot to pull master
<davecheney> ok, running tests
<davecheney> PR coming ASAP
<axw> wallyworld: did you see the review request for gwacl?
<wallyworld> no, i'll look
<axw> wallyworld: https://code.launchpad.net/~axwalk/gwacl/rolesizes-update/+merge/224103
<wallyworld> axw: i know, let's just write a html page scraper, WCPGW
<axw> wallyworld: heh :)
<davecheney> sinzui: wallyworld https://github.com/juju/juju/pull/151
<sinzui> davecheney, jam made a similar fix last week
<wallyworld> davecheney: we could leave the authEntityTag alone and just do if names.NewMachineTag(parentId).String() == authEntityTag.String() {    ??
<davecheney> wallyworld: sure,
<wallyworld> just a suggestion to preserve more of the original branch
<davecheney> but i've tested this version
<wallyworld> ok, i was just thinking that we'd want to keep various getAuth functions consistent
<davecheney> this isn't forever
<wallyworld> ok, +1
<sinzui> davecheney, there is a cursed branch in front of yours. github-merge-juju will be available in 30 minutes
<davecheney> sinzui: ok
<davecheney> i don't know what you mean
<davecheney> but ok
<bodie_> I think this is ready to go -- https://github.com/juju/juju/pull/140
<bodie_> would appreciate a lgtm / lbtm :)
<menn0> bodie_: I'll take a look
<menn0> davecheney: do you want to get this merged or would you prefer that the tests I suggested get done first? https://github.com/juju/juju/pull/127
<davecheney> menn0: i'm not fussed
<menn0> davecheney: let's just get it in. it's got my LGTM.
<davecheney> cool, thanks
<davecheney> panic: runtime error: invalid memory address or nil pointer dereference
<davecheney> [signal 0xb code=0x1 addr=0x1 pc=0x40761f]
<davecheney> goroutine 1438 [running]:
<davecheney> runtime.panic(0xed70c0, 0x1f9c7c8) /usr/lib/go/src/pkg/runtime/panic.c:266 +0xb6
<davecheney> github.com/juju/juju/state/apiserver.func·009(0x0, 0x0, 0xc210256c60, 0x30941b60a0, 0x414361, ...) /home/ubuntu/juju-core_1.19.4/src/github.com/juju/juju/state/apiserver/root.go:169 +0x5ed
<davecheney> github.com/juju/juju/state/apiserver.(*srvCaller).Call(0xc21033cb40, 0x0, 0x0, 0x0, 0x0, ...) /home/ubuntu/juju-core_1.19.4/src/github.com/juju/juju/state/apiserver/root.go:101 +0x3f
<davecheney> github.com/juju/juju/rpc.(*Conn).runRequest(0xc210859500, 0x7f9e941e1b18, 0xc21033cb40, 0x129fb20, 0x12, ...) /home/ubuntu/juju-core_1.19.4/src/github.com/juju/juju/rpc/server.go:533 +0xd5
<davecheney> created by github.com/juju/juju/rpc.(*Conn).handleRequest /home/ubuntu/juju-core_1.19.4/src/github.com/juju/juju/rpc/server.go:462 +0x671
<davecheney> err
<davecheney> didn't jam make a fix for this ?
<davecheney>                 r.objectCache[objKey] = objValue
<davecheney>                 return objValue, nil
<davecheney> the hell
<davecheney> that means r.objectCache is nil
<bodie_> could use a bit more review on this as well -- menn0 if you're interested -- jcw4 https://github.com/juju/juju/pull/141
<menn0> davecheney: he has a fix but he hasn't merged it yet
<menn0> davecheney: PR 146
<davecheney> cool, thanks
<davecheney> i'll check it out
<menn0> davecheney: it's fairly sporadic
<menn0> bodie_: that extension to the tests for #141 looks good. I had some other comments too.
<menn0> thumper: did you see this? https://github.com/juju/juju/pull/108#issuecomment-46822183
<menn0> thumper: fwereade is right that the implementation is probably not quite right given the plans in the identity spec. most of the work is still useful though. I wonder if it should be merged but disabled.
<davecheney> menn0: urgh, shit, lots of races in cmd/jujud
<menn0> davecheney: where? what?
<davecheney> menn0: pls hold
<davecheney> sinzui: that gccgo fix landed
<sinzui> \o/
<davecheney> how do I kick off the ppc build ?
<davecheney> menn0: http://paste.ubuntu.com/7693425/
<davecheney> as suspected there is a race on the api server root hashmap
<sinzui> davecheney, you can't. CI already sees there is a new revision http://juju-ci.vapour.ws:8080/
<davecheney> sinzui: ok
<sinzui> davecheney, Both publish-revision and run-unit-tests-trusty-ppc64el will start in 5 minutes
<davecheney> sinzui: https://bugs.launchpad.net/juju-core/+bug/1333513
<_mup_> Bug #1333513: state/apiserver: data race on on apiserver method hashmap <race-condition> <juju-core:Confirmed> <https://launchpad.net/bugs/1333513>
<davecheney> ^ blocks 1.19.4 i'm afraid
<sinzui> :(
<davecheney> i can try jam's fix and see if that helps
<thumper> menn0: hey, back at my desk now
<thumper> menn0: let me look
<menn0> thumper: k
<menn0> davecheney: that fail is due to the thing that jam has fixed but not merged. I believe there was discussion about the best way to fix it which might be why there's a delay.
<bodie_> menn0, I had a few replies for your comments -- I need confirmation from jcw4 on at least one of them since 141 is actually an API for the Watcher, which itself is tested elsewhere -- I'm not entirely sure all the error cases need testing, since many of them are things from other packages
<bodie_> the API should really be simple, I think, but I'm open to being wrong about that :)
<menn0> bodie_: give me a bit, I'm in the middle of your other review
<bodie_> sure thing, it's way past EOD for me here so I'll be reviewing in the morning -- hopefully we can sync effectively :) I really appreciate the comments
<menn0> bodie_: no problems
<davecheney> sinzui: ppc build passed!
<davecheney> w00t
<davecheney> menn0: what should we do
<davecheney> 1. release with this known issue ?
<davecheney> 2. wait a few hours for jam ?
<davecheney> 3. try to fix it ourselves ?
<menn0> davecheney: I don't think we can release. This problem will happen in actual use.
<davecheney> ok
<davecheney> i agree
<davecheney> there are also a shitload of other races in that paste
<menn0> davecheney: it looks like jam has a valid fix that several people approved
<davecheney> i only made an issue for the first one
<davecheney> menn0: hit it with $$merge$$
<menn0> davecheney: there was quite a bit of discussion though so I wonder if jam1 was planning on doing something better
<jam1> davecheney: menn0: the fix for *my* code is just fine, the ancillary fix for other stuff that showed up with -race is in question, but I'm looking into it right now.
<davecheney> :)
<menn0> davecheney: maybe just go with what jam has right now and he can always change it later.
<davecheney> i agree
<menn0> hitting merge now
<jam1> davecheney: trying to use "go test -compiler=gccgo" I got this error: http://paste.ubuntu.com/7693569/
<jam1> thoughts?
<davecheney> jam1: fix in trunk
<jam1> running it a second time and it passed...
<jam1> ah, nm, still breaks
<davecheney> landed a few minutes ago
<jam1> I forgot I run the regular tests first
<jam1> davecheney: thanks
<menn0> jam1: did you catch that davecheney and I just hit $$merge$$ on your objectCache PR?
<jam1> davecheney: is that a fix in "juju" trunk, or a fix in gccgo trunk
<jam1> ?
<davecheney> jam1: workaround in juju
<jam1> menn0: yeah, though I did *just* push up the alternative fix
<menn0> jam1: crap.
<menn0> jam1: shall we cancel the merge then?
<jam1> menn0: meh we can just submit it again, I think if we missed something. It is noise but not terrible
<jam1>  menn0: there is a queue 3 deep already and maybe the bot doesn't track the tip version that was voted on, in which case it will get the updated one. We'll see.
<menn0> jam1: I was wondering the same thing.
<jam1> tarmac was careful about it, because you could be approving 3rd party proposals, though it often was annoying to have to go and reapprove since most of our branches were actually trusted.
<jam1> I don't know the new bot nearly as well.
<menn0> davecheney: I just send off that status API PR for merging too
<davecheney> thanks
<davecheney> sinzui: still there ?
<davecheney> when do we have to stop screwing around so you can cut a release ?
<wallyworld> jam1: did you still want to catch up?
<thumper> OMG... lots of addressing document comments, but no other work
 * thumper feels exhausted
<thumper> it is all rick_h_'s fault
<thumper> yes rick_h_, read this when you wake up and know that you did this to me :-P
 * thumper will add more tomorrow
<thumper> night all
<davecheney> wallyworld: axw  http://juju-ci.vapour.ws:8080/job/run-unit-tests-trusty-ppc64el/
<davecheney> ^ can you push go on this build please
<wallyworld> sure
<davecheney> ta
<wallyworld> davecheney: what revision? tip?
<davecheney> wallyworld: tip is fine
<wallyworld> ta, i started it with the sha of your latest commit
<davecheney> yup
<davecheney> that is the one i want to check
<davecheney> make sure I haven't just screwed things again
<jam1> fwereade: if you are around and have time for a 5-minute chat about a tasteful way to share common code I'd like to run some ideas by you.
<fwereade> jam1, sure
<jam1> fwereade: https://plus.google.com/hangouts/_/canonical.com/juju-sapphire
<rogpeppe> mornin' all
<davecheney> wallyworld: playing with wercker
<davecheney> it looks like we can setup custom build environments
<davecheney> trying to get the juju tests to pass
<wallyworld> cool
<davecheney> wallyworld: you get pre commit tests
<davecheney> on a per branch basis
<davecheney> the wercker bot participates on the PR and says if the build passes or not
<davecheney> i've probably screwed my working copy
<davecheney> doing all this
<davecheney> https://app.wercker.com/#buildstep/53a925f9770aadd70b0f1944
<davecheney> well
<davecheney> that didn't work
<davecheney> and i'm out of fucks for today
<jam1> fwereade: so I've made a couple comments, but now Google is telling me "file not found" anytime I click on the doc.
<jam1> fwereade: It comes to mind that if we do the "this hook is only called if the other hook doesn't exist" model, then we could call it "unhandled-changes"
<fwereade> jam1, that's just the "missing" hook, and it'll get called 10000 times, isn't it?
<fwereade> jam1, (fwiw loading it in a new tab works for me)
<jam1> fwereade: well it could be "missing" but have the semantics that it is only called once for a sequence of hooks
<fwereade> jam1, it's not quite working for me, but maybe I just need to think it through more
<jam1> fwereade: so I mentioned in my comments that either semantic would be possible
<jam1> "always queue one of these" vs "only queue if the hook wasn't there"
<jam1> fwereade: I still get it to load and show the first time, but clicking anywhere says there is a problem. Can you share the doc directly with me?
<fwereade> jam1, shared
<jam1> fwereade: thanks, it seems happier
<jam1> fwereade: I just brought it up because I think it might change the name of the thing, and naming it seems problematic right now :)
<fwereade> jam1, I dunno, calling it once in response to any sequence of unimplemented hooks feels quite different to calling it when stable (and occasionally if it looks like we're not stabilising any time soon)
<fwereade> jam1, tying it to other hooks feels like complexity for no benefit -- maybe I'm just not getting the value though
<fwereade> jam1, you can implement install, you can implement relation-broken, etc etc
<jam1> fwereade: so tying it to other hooks is because existing charms are going to have this symlink farm, and it will mean that it actually gets called significantly more often
<fwereade> jam1, having it so if you implement r-b you basically always see s-c, r-b, s-c, r-b, s-c, r-b, s-c, r-b, s-c when leaving a few relations -- vs r-b, r-b, r-b, r-b, s-c
<fwereade> jam1, ok, but the symlink farm is stupid and broken
<fwereade> jam1, getting rid of that is almost the whole point
<jam1> fwereade: sure, getting there from here is something to be aware of.
<fwereade> jam1, I don't follow
<jam1> fwereade: juju certainly doesn't have the concept of "I can deploy version X of charm Y", but not the latest.
<fwereade> jam1, symlink farms keep working until they delete everything
<fwereade> jam1, er, yes it does
<jam1> so charms that *just implement* the new hook still need the symlink farm
<fwereade> jam1, ahhh ok
<fwereade> jam1, sorry, yes, I think we need charm feature flags much sooner than later
<dimitern> jam1, i reverted the network model doc to before domas changes
<jam1> fwereade: so the thing with even feature flags is that Juju-1.18 could still install version 10 of the charm, it is just version 11 that it can no longer install, but juju has no way of knowing/representing that to the user.
<jam1> dimitern: is that the giant paste in the header?
<dimitern> jam1, yep
<fwereade> jam1, expand please?
<fwereade> jam1, the charm store needs to know about feature flags too, it is true
<fwereade> jam1, and getting a version of 1.18 out that understands and uses them is important
<jam1> fwereade: if I try to "juju deploy mysql" can it pick a version that doesn't have the features it doesn't know about ?
<fwereade> jam1, that's my expectation and assumption, yeah -- pass supported feature flags when querying charm store
<jam1> the spec as I read it says that it wouldn't list "mysql" if the last version of mysql had "feature=unknown"
<jam1> "charms with features not included in list"
<fwereade> jam1, it wouldn't list that version of mysql
<jam1> is not the same as
<jam1> "charm versions with features are not included in the list"
<fwereade> jam1, perhaps poorly worded then
<jam1> fwereade: oh google docs, why is "ctrl+alt+m" write a comment but "ctrl+shift+m" is go into some weird 320x200 mobile view
<jam1> the latter is far easier to type
<fwereade> jam1, but, er, I was not trying to propose a feature in which valid charm versions just *disappear* ;)
<fwereade> jam1, I was trying to convey that invalid ones are hidden from clients that can't handle them
<fwereade> jam1, I guess the issue is the fuzziness between charm and charm revision
<jam1> fwereade: so, the idea of r-b being implemented or not, and how it affects whether s-c gets queued.
<jam1> fwereade: you missed my point a bit
<jam1> the idea is that *if* r-b is implemented you get "r-b r-b r-b" full stop
<jam1> if it is not
<jam1> you get "s-c" full stop
<jam1> when I say "queued" I meant "queued for some time in the future when we reach quiescent point"
<jam1> so if you have config-changed, c-c, c-c, relation-changed, c-c,  and only relation-changed and something-changed are implemented, you would get
<jam1> r-c s-c
<jam1> after the first "c-c" was triggered, s-c would be marked as "I want to run this when I can"
<jam1> if, on the other hand, you had just "r-c r-c" then you would get exactly that, and not "r-c r-c s-c"
<jam1> I don't think anyone is saying *if* r-c is implemented then immediately call s-c afterward
<jam1> that would, indeed, be silly.
<jam1> I realized "queued" is a bit of a bad term to use here, as it has meta meaning and real-meaning in juju hooks
<rogpeppe> fwereade: i'm thinking of moving the charm package to gopkg.in/juju/charm.v2 to avoid breaking the current API (and to try to commit to maintaining a stable API in the future).
<rogpeppe> fwereade: also with a view to potentially moving other juju packages there too
<rogpeppe> jam1, fwereade: does that sound reasonable to you?
<rogpeppe> the actual branch would be named "v2" at github.com/juju/charm
<rogpeppe> jam1: with your something-changed hook, would that mean that even if there was an hour gap between two things changing, something-changed wouldn't be called the second time?
<fwereade> jam1, I am certainly saying "call s-c whenever the queues clear out, however many hooks were implemented or not in the interim"
<fwereade> jam1, what's the issue there?
<rogpeppe> fwereade: my issue with that is that "queue clearing out" is a very fuzzy concept (it's not with our current highly dubious 5-second polling system of course, but hopefully we can move to something better in the future and i hope we can design for that)
<rogpeppe> fwereade: queue clear for... how long?
<vladk> dimitern: please, take a look https://github.com/juju/juju/pull/121
<fwereade> rogpeppe, a few seconds with nothing else firing?
<fwereade> rogpeppe, but the 5-second polling is so many layers away that I don't really see the relevance
<dimitern> vladk, looking
<rogpeppe> fwereade: without the 5 second polling, things can keep on firing indefinitely, and maybe that's ok
<rogpeppe> fwereade: well, even *with* the 5 second polling, things can keep on firing indefinitely
<rogpeppe> fwereade: are you trying to address the "fire something when we're in a `stable state'" issue?
<fwereade> rogpeppe, sure, that's why we fire every N minutes when not stable
<fwereade> rogpeppe, I'm more trying to address the "most charms are stupid and wasteful and boilerplatey" issue
<fwereade> rogpeppe, and I think fire-a-hook-when-stable is a good solution to that
<rogpeppe> fwereade: oh, you mean it would be nice to write a charm that just had a "tell me when something happens" hook?
<fwereade> rogpeppe, that's what people do already
<fwereade> rogpeppe, but they have to implement every hook as a symlink to their entry point
<rogpeppe> fwereade: ah, but you want to amalgamate events
<fwereade> rogpeppe, yes, because what happens is that the entry point gets called 30 times and returns without doing anything because not enough context is around
<rogpeppe> fwereade: presumably you can't amalgamate events for different relations?
<rogpeppe> fwereade: or do you just have an env var that can hold all the relations that have changed?
<fwereade> rogpeppe, and then once there's enough, it gets called another 100 times in a row, rebuilding the full service config every time, diffing against the running config and maybe replacing and bouncing the service
<fwereade> rogpeppe, nobody cares what's changed
<fwereade> rogpeppe, they all just slurp up the complete environment state and translate it into a config
<rogpeppe> fwereade: right
<rogpeppe> fwereade: the main issue with amalgamating events is that you won't get such a timely response, because you can never fire a hook immediately
<fwereade> rogpeppe, but I think you *converge* much faster
<fwereade> rogpeppe, because you don't do the same processing 100x over
<fwereade> rogpeppe, you do no processing 100x
<rogpeppe> fwereade: you may do - it depends how costly your hook executions are
<fwereade> rogpeppe, and then do the actual work just once
<rogpeppe> fwereade: if your actual work only takes a millisecond, then it's not a problem
<fwereade> rogpeppe, quite -- this is presented as a way of working better with the charms which are just one, big, complex hook
<rogpeppe> fwereade: given that we currently always have up to a 5 second delay, perhaps the self-imposed queue-gathering delay is not a problem even in a non-polling system
<fwereade> rogpeppe, didn't quite follow that
<rogpeppe> fwereade: if you get an event, you have to wait for some length of time to gather other events before you can fire your something-changed hook
<rogpeppe> fwereade: otherwise you'll regress to always firing an event every time
<rogpeppe> fwereade: alternatively...
<rogpeppe> fwereade: (and probably better)
<fwereade> rogpeppe, or we could integrate the hook queues and have them generate an s-c whenever they empty out
<rogpeppe> fwereade: is to just fire the first hook anyway, then gather any events that happen while the hook is firing, then fire all of them at once when that completes
<fwereade> rogpeppe, may or may not be enough better to justify the cost
<fwereade> rogpeppe, I never imagined *not* firing any other hook -- just that I'd expect most of those hooks to not be implemented
<rogpeppe> fwereade: i don't really have an idea of what "integrating the hook queues" implies
<fwereade> rogpeppe, there's one per relation at the moment, and other hooks coming in from a variety of sources -- eg config-changed from the filter, install/start/stop according to the state machine
 * rogpeppe goes to have a glance at the uniter source
<fwereade> rogpeppe, move relationId from AliveHookQueue to UnitInfo and you're quite a lot of the way there, although you probably want to maintain a linked list per relation as well alongside the global one
<rogpeppe> fwereade: perhaps life would be easier if the filter only had a single output channel
<fwereade> rogpeppe, mmmmm if we only had one channel for *hooks*, yes, it probably would
<fwereade> rogpeppe, single output chan on filter doesn't seem helpful to me
<rogpeppe> fwereade: ah, yeah, i was thinking of a single channel that modeAbideAliveLoop could be waiting on
<fwereade> rogpeppe, but regardless, we don't *need* any of that
<fwereade> rogpeppe, waiting a few seconds and firing s-c if nothing else happens, then not waiting if we just ran s-c, would I think work fine
<rogpeppe> fwereade: mmm, probably. it does feel a bit hacky though
<rogpeppe> fwereade: after all, the recipient *knows* that something has changed
<fwereade> rogpeppe, recipient == the charm? or the uniter?
<rogpeppe> fwereade: the uniter
<rogpeppe> fwereade: specifically modeAbideAliveLoop, though there may be other places
<fwereade> rogpeppe, right, but that's the uniter setting things up to tell itself when things *stop* changing
<fwereade> rogpeppe, and deciding then to inform the charm that s-c
<rogpeppe> fwereade: isn't that what you want?
<fwereade> rogpeppe, it is, we may be in violent agreement -- I think I didn't understand
<fwereade> after all, the recipient *knows* that something has changed
<jam1> TheMue: I'd like to turn https://docs.google.com/a/canonical.com/document/d/1fPOSUu7Dc_23pil1HGNTSpdFRhkMHGxe4o6jBghZZ1A/edit# into a concrete doc in the source tree about how it was actually implemented and how people interact with the system. It is currently a little too "how do we do this" in places. Do you think you can work on that, or is it stuff that only I have in my head.
<TheMue> jam1: thatâs a doc that will move into the Juju API Design Specification Iâm currently working on
<TheMue> jam1: itâs pretty detailed and together with your code changes I think itâs no problem to get it
<TheMue> jam1: in unclear cases Iâll simply ask you :D
<rogpeppe> fwereade: FWIW i was thinking along these kinds of lines: http://paste.ubuntu.com/7694109/
<rogpeppe> fwereade: though i'm sure it doesn't interact correctly with hook error retries, shutdowns and all that jazz
<TheMue> jam1: btw, thx for review. beside changing the tests and adding a failing case, do you think I should add a validity check to the API, or even deeper the RPC layer?
<jam1> TheMue: I'd like a test at least at the state/api/ layer (which probably means state/apiserver/client_test.go)
<fwereade> rogpeppe, yeah, that's roughly what I was thinking too, although I agree it probably isn't *exactly* right as it is
<dimitern> vladk, reviewed
<vladk> dimitern: thanks
<rogpeppe> fwereade: cool
<rogpeppe> fwereade: i thought you were considering something much more upstream than that
<fwereade> rogpeppe, considering, yeah -- there's something architecturally skewed about relation handling and filters and so on -- but I think the s-c can be done at that layer completely independently
<rogpeppe> fwereade: cool
<fwereade> rogpeppe, I just keep wanting to find excuses to look into that stuff again ;)
<rogpeppe> fwereade: FWIW the 2 second delay could probably be 10 milliseconds and it would still be useful
<fwereade> rogpeppe, concur
<rogpeppe> fwereade: BTW did you have an opinion on moving the charm package to use gopkg.in ?
<fwereade> rogpeppe, I rather like gopkg.in, I don't see why not
<rogpeppe> fwereade: cool. i've got a couple of outstanding proposals which i'd like to merge, but they break the API horribly, so i was wanting to merge only after moving to a new api version
<vladk> dimitern: I answered some comments in https://github.com/juju/juju/pull/121
<dimitern> vladk, thanks, will look in a bit
<natefinch> morning all
<natefinch> jam1: do you have time to go over the multi env state server spec?  Seems like we have some comments to talk about
<fwereade> dimitern, vladk|offline: either we need a stringswatcher, in which case we should write it as a stringswatcher, or we want a one-shot, in which case we should write it as a one-shot
<wwitzel3> morning natefinch
<natefinch> wwitzel3: up early huh?
<wwitzel3> natefinch: woke up and it was only 30 minutes till my alarm anyway
<natefinch> ahh yeah, I know that one
<wwitzel3> natefinch: just reading over that email from wallyworld I'll get started on that
<natefinch> wwitzel3: yeah that seems good
<wallyworld_> jam1: i'm free wherever you are, just ping me when you have a break in your schedule
<wallyworld_> wwitzel3: i'm not sure how long that will take you, i can find more to do if you get that done :-)
<wallyworld_> fwereade: will you get a chance to review https://github.com/juju/juju/pull/124 today? I have most of the proof-of-access followup done which I plan to propose tomorrow
<fwereade> natefinch, jam1: based on a super-quick look at that, what I was hoping for was a plan for fixing the db to handle multiple envs, rather than a redesign of the stuff tim's been working on for the last couple of weeks -- added a couple more comments just now
<fwereade> wallyworld_, sorry :(
<wallyworld_> np :-)
<fwereade> wallyworld_, it mostly looked good, I will try to do it properly today
<wallyworld_> ok, ta
<natefinch> fwereade: well shit
<fwereade> natefinch, ehh, communications screwup, it happens
<dimitern> fwereade, we'll primarily watch the machine itself i think, but we'll also need to watch the req. networks (the watcher server-side will handle the machines + services deployed on it ofc, so we don't need to care about the services), raw addresses and subnets attached to the machine's interfaces
<fwereade> dimitern, (1) why watch the machine (2) are any of those other things relevant other than in the context of "we need to configure X network, let's look up the details of how we do so"?
<natefinch> fwereade: is there more to fixing the db than scoping each document's id by adding the environment UUID to it?
<dimitern> fwereade, check the https://docs.google.com/a/canonical.com/document/d/16SYAlZFc19YPXrB7BRwufZVoeLFpqGzBTAdo4EoQIHg/edit#heading=h.idpldjoq36jf section about the networker's responsibilities
<dimitern> fwereade, sorry, not the machine, but other things
<fwereade> natefinch, there's (1) picking a shard key -- I don't think _id makes for a good one -- and  (2) Getting There From Here: how we change apiserver/state to work with multi-env without requiring rewriting All The Things
<fwereade> dimitern, checking
<dimitern> fwereade, fwiw, it seems more and more we'll actually need a worker not based on a watcher though
<fwereade> dimitern, how can a subnet start dying when a machine's using it? I thought non-zero refcounts would block that
<dimitern> fwereade, but not a one-shot single use worker, something that can still handle things in a loop using multiple watchers and "watcher-like" things, i.e. monitoring ifaces
<dimitern> fwereade, a subnet with no enabled interfaces attached to it has a refcount of 0
<fwereade> dimitern, I am suspicious that there's a big pile of necessary jobs that are all being crammed into one worker because the name's roughly related
<fwereade> dimitern, then why can the machine see that subnet?
<dimitern> fwereade, i had that feeling as well
<fwereade> dimitern, mainly I'm scared of another firewaller
<dimitern> fwereade, because it has disabled ifaces attached to it
<fwereade> dimitern, that thing's a horrorshow
<fwereade> dimitern, hmm. can't we just disable ifaces that don't correspond to desired networks, bam, done?
<dimitern> fwereade, so we can split the tasks in two at least - an "addresser", which handles watching interfaces as they come up and updates the raw addresses of the machine, as opposed to the instance poller, which calls the provider; the addresser can take care of filtering addresses as well perhaps, moving them from raw to official
<fwereade> dimitern, the address handling stuff is IMO nothing to do with a worker
<dimitern> fwereade, why? who's gonna monitor what addresses are assigned to the new interfaces and save them in state?
<fwereade> dimitern, SetAddresses/SetMachineAddresses *themselves* are expected to look at all the raw addresses and update the *actual* addresses in one go
<fwereade> dimitern, SetMachineAddresses is really just "ehh, we have a bunch of IPs, figure them out please state server"
<dimitern> fwereade, yes, but someone has to call that with the discovered addresses, right?
<dimitern> fwereade, will it be another worker or something in the MA?
<fwereade> dimitern, sure, that's a worker, but *all* the worker does is discover IPs and call the API
<fwereade> dimitern, it is not expected to do anything sophisticated with those addresses
<dimitern> fwereade, right
<dimitern> fwereade, the networker can watch the machine's network interfaces to make sure we do the right thing when they are enabled/disabled
<dimitern> fwereade, which will happen as part of provisioning or deployment (i.e. when they're added in the first place, or when a unit gets deployed and the combined req. networks change, triggering interfaces being marked as enabled/disabled)
<fwereade> dimitern, maybe I'm being dense, but I still don't see why the networker needs to do anything other than (1) watch the list of networks and (2) every time it changes, look up relevant info on those nets from the API server, config the ones we have, and disable the ones that don't map to those reported active by the state server
<fwereade> dimitern, why watch the interfaces though?
<dimitern> fwereade, ok it seems we need another talk on this
<fwereade> dimitern, yeah, probably :)
<dimitern> fwereade, and sequence diagrams of the order of events and who handles them :)
<fwereade> dimitern, you free now?
<fwereade> dimitern, +1
<dimitern> fwereade, can we do it in 5m?
<fwereade> dimitern, sure
<dimitern> fwereade, sent you a link
<jam1> fwereade: so it turns out that there is a reason to use the 'interface' style for Client facing facades. because they will be exposing both BestAPIVersion *and* at least Close. I'm not sure if there will be another common function yet or not.
<jam1> And while yes, we could create 2 'simple' thunks
<jam1> once we have >1 it feels better to put that as a common embed to me.
<perrito666> morning everybody
<wwitzel3> morning perrito666
<perrito666> sinzui: did we manage to get 1.19.4 out?
<axw> wallyworld_: is it just me, or is the bot way worse than usual today?
<wallyworld_> not just you, does seem bad :-(
<jam1> wallyworld_: if you're still around, poke
<wallyworld_> hi
<mattyw> fwereade, OpenPort in state/unit.go - this looks like a bug - just checking it's not intended behaviour: https://github.com/juju/juju/blob/master/state/unit.go#L653
<jam1> wallyworld_: can you meet me over here: https://plus.google.com/hangouts/_/canonical.com/juju-sapphire?authuser=1
<wallyworld_> sure
<rogpeppe> bac, dimitern, jam1, wwitzel3, mgz, natefinch, perrito666, wallyworld_: here's the start of bundles implementation in the charm package; reviews very much appreciated: https://github.com/juju/charm/pull/9
<jam1> natefinch: poke
<natefinch> jam1: howdy
<wallyworld_> axw: finally landed :-/
<jam1> natefinch: hey, sorry we didn't get the focus on multi-environment stuff, I do feel we need to work on it. however, we have a release blocker bug (I believe)
<natefinch> the gccgo stuff?
<jam1> natefinch: I think I fixed those, this is an upgrade failure
<jam1> https://bugs.launchpad.net/juju-core/+bug/1333682
<_mup_> Bug #1333682: upgrading 1.18 to 1.19 breaks agent.conf <panic> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1333682>
<jam1> specifically, juju 1.19 expects there to be an "apiaddresses" line in agent.conf and panics on a null pointer dereference if it isn't there.
<jam1> natefinch: but 1.18 doesn't have it, and nothing seems to *put* it in there.
<perrito666> jam1: mm, shouldn't peergrouper do that?
<jam1> perrito666: peergrouper can't come up because the data isn't there and we try to connect to the state with a nil API address
<jam1> well, maybe it can come up, maybe not. It isn't quite clear, but the process itself panics
<jam1> perrito666: natefinch: this *might* not be a 1.19 regression, because it appears 1.18.1 is putting the line in agent.conf
<jam1> so the bug is that potentially these people are upgrading from originally 1.16
<jam1> and 1.18 doesn't add the line, but doesn't care if it isn't there
<jam1> and 1.19 now expects it to be there.
<bodie_> morning all
<jam1> morning bodie_
<jam1> mramm: ping
<mramm> jam1: pong
<jam1> mramm: were we doing 1:1 call now?
<mramm> oh, I'm in the cloudbase sprint
<mramm> I can drop out and come over to the 1 on 1
<mramm> be there in one min
<perrito666> bzr branch takes a bit....
<wwitzel3> natefinch: standup
<natefinch> wwitzel3: sorry, in a meeting with the cloudbase guys.  Can you guys deal for now?
<ericsnow> natefinch: we got started without you :)
<wwitzel3> natefinch: yep ^
<perrito666> dimitern: ping
<alexisb> mgz, you around?
<ericsnow> natefinch: will you be available for our 1on1 in a few minutes?
<dimitern> perrito666, hey
<natefinch> ericsnow: can we move it to this afternoon?
<ericsnow> sounds good
<mgz> alexisb: yup
<perrito666> mgz: ah you here too
<perrito666> good
<alexisb> mgz, can you reach out to perrito666, he is working a critical bug and could use your great wisdom :)
<mgz> perrito666: feel free to bug me :)
<TheMue> jam1: any chance to take another look at https://github.com/juju/juju/pull/150
<TheMue> jam1: ?
<alexisb> thanks mgz, this is important given it is our latest block for the release which is now a week behind :)
<jam1> TheMue: my immediate thought is that "juju set" isn't returning an error code when it gets bad data?
<TheMue> jam1: thought about it too, but it would also be for names or any string arguments which pass the API
<TheMue> jam1: best would be imho if here no invalid encoded data would pass
<jam1> TheMue: reviewed
<TheMue> jam1: thx
<TheMue> jam1: Main() so far returns no error as it doesn't recognize it as an error. as I said, to recognize it we would have to check every value for valid utf-8, ideally at the rpc level
<tasdomas> is there a way to connect to the mongo db created by a test ?
<natefinch> tasdomas: you mean manually?  by default they get cleaned up at the end of the test, so they go away... but you can comment out the cleanup code if you want to go poke at the DB by hand
<perrito666> is anyone else having juju report dns-name for local deploys as localhost?
<automatemecolema> I have a bug problem on precise with local provider https://gist.github.com/anonymous/9ecb23a51844627028b0 Anyone able to point out something here? Here's a bug that maybe somewhat related https://bugs.launchpad.net/juju-core/+bug/1330406 ??
<_mup_> Bug #1330406: juju deployed services to lxc containers error executing "lxc-create" with bad template: ubuntu-cloud <bootstrap> <local-provider> <lxc> <juju-core:Incomplete> <https://launchpad.net/bugs/1330406>
<sinzui> natefinch, wwitzel3: what do you think about https://bugs.launchpad.net/ubuntu/+source/juju-core/+bug/1328958
<natefinch> automatemecolema: maybe you need to update lxc?
<sinzui> The summary is juju requires the ubuntu user to be on the client machine...except I don't have one and I don't know anyone who does
<sinzui> I have never seen this issue...and if this is about server images, they come with ubuntu so juju local still works
<natefinch> sinzui: I think we create the user in cloud init if it doesn't exist
<sinzui> natefinch, right, but the issue here is local host machine your desktop
<natefinch> oh sorry, I missed that it was local
<sinzui> natefinch, the juju client does not create the ubuntu user on the host machine
<natefinch> sinzui: no, I wouldn't expect it would
 * fwereade was working at 1am and 8am today, calling it a day now
 * fwereade probably back on later, ping me if you need me
<tasdomas> natefinch, thanks - I'm actually suspending the test using a sleep
<sinzui> natefinch, I think the issue with https://bugs.launchpad.net/ubuntu/+source/juju-core/+bug/1328958 is that the localhost is a server image. juju recognises that, and then wants to use the ubuntu user. This works most of the time because servers always come with ubuntu user.
<sinzui> natefinch, The error implies that something in .juju/local is owned by ubuntu...I have never seen that on any of the 5 machines that test local. I would be looking though unless the test failed
<natefinch> tasdomas: that works
<tasdomas> natefinch: I can't however, find the correct username/password to log into the db
<natefinch> perrito666: ^^
 * perrito666 reads
<natefinch> perrito666: I know you were just talking about that... where do you get the username/pw?
<automatemecolema> natefinch: well I followed the juju guide on what packages needed to be installed
<perrito666> natefinch: var/lib/juju/agents/<tag>/agents.conf
<perrito666> tasdomas: ^
<automatemecolema> natefinch: apt-get install juju-local linux-image-generic-lts-raring linux-headers-generic-lts-raring
<perrito666> user==Tag: password=apipassword iirc
<tasdomas> perrito666, natefinch thanks
<automatemecolema> natefinch: not sure how more up to date I can get with lxc on precise??
<hatch> hey guys is there documentation for people who want to interact with the juju api like the GUI does?
<natefinch> automatemecolema: that sounds valid....
<natefinch> hatch: no, sorry
<hatch> natefinch ok np
<automatemecolema> natefinch: so I think I have a bug then....
<natefinch> hatch: right now I'm recommending that if you want to script against juju that you do it using the CLI, since that's a lot better documented and much more suitable for human consumption
<natefinch> hatch: which I know doesn't work for a lot of use cases
<natefinch> (like a web gui)
<hatch> natefinch yeah I'm sure the primary use cases would work with the CLI, but we should probably document it sometime :)
<natefinch> hatch: it's on the list of things to do, definitely.   You're far from the first to ask about it.
<hatch> natefinch, if only we had unlimited resources :)
<mattyw> fwereade, ping?
<natefinch> hatch: yep
<natefinch> automatemecolema: precise definitely works.... oh maybe you need to set default series
<natefinch> automatemecolema: in ~/.juju/environments.yaml   add default-series: precise   to the local provider section
<natefinch> automatemecolema: and/or edit ~/.juju/environments/local.jenv   to add default-series: precise
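natefinch's suggestion, as a config fragment. The section name `local` is an assumption taken from the provider being discussed; only the `default-series` line is the actual addition.

```yaml
# ~/.juju/environments.yaml (local provider section)
local:
  type: local
  default-series: precise
```

The same `default-series: precise` line can be added to ~/.juju/environments/local.jenv for an already-bootstrapped environment.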
<automatemecolema> natefinch: trying your suggestion out right now
<automatemecolema> natefinch: so I can't run a trusty charm with a bootstrapped precise environment?
<automatemecolema> the problem is trying to deploy trusty/juju-gui when my bootstrap node is precise
<natefinch> you should be able to deploy that to something other than the bootstrap node (since you can't do juju deploy trusty/juju-gui --to 0)
<automatemecolema> Yea, but every time I tried that it failed with an lxc-create problem
<natefinch> I seem to remember there being a problem running trusty containers on precise..... but I forget exactly what the problem was
<natefinch> automatemecolema: so.... generally when I hit an lxc problem with the local provider, I just reboot, because most of the time it's lxc getting itself wedged in a bad state
<automatemecolema> yea I tried a reboot, that didnt work out
<natefinch> mgz: any ideas to help out ^^?
<rick_h_> natefinch: there were issues in the original juju release especially without a default series and such defined.
<rick_h_> natefinch: maybe you were thinking of https://bugs.launchpad.net/ubuntu/+source/juju-quickstart/+bug/1306537 ?
<_mup_> Bug #1306537: LXC local provider fails to provision precise instances from a trusty host <deploy> <local-provider> <lxc> <juju-core:Fix Released by wallyworld> <juju-core 1.18:Fix Released by wallyworld> <juju-quickstart:Fix Released by frankban> <juju-quickstart (Ubuntu):New> <juju-quickstart (Ubuntu Trusty):New> <https://launchpad.net/bugs/1306537>
<natefinch> rick_h_: ahh that may have been what I was thinking of
<perrito666> natefinch: alexisb I need to step off for a moment, I did not manage to fix the upgrade issue (I did nevertheless manage to break part of my env). I'll be back later but if anyone else wants to take a look he/she is welcome
<bac> hi sinzui, are you running jenkins for any go projects?
<sinzui> bac no. which ones interest you
<ericsnow> natefinch: ping
<natefinch> ericsnow: coming
<perrito666> back
<bodie_> anyone clear enough on the State.Caller interface to tell me where to look for the point where I can fake it in my API client test?
<bodie_> basically I just want to have st.caller.Call("WatchActions", blah blah) use my mocked-up function instead of the real one
<bodie_> however, since caller isn't an exported field of uniter.State, I'm not sure where I can poke the new function (I'm using testing.Patch to replace the function)
<bodie_> the example I'm working from is state/api/usermanager/client_test.go but that appears to use a slightly different call technique
<perrito666> how do I specify to upgrade-juju --version (using local provider) my own stream?
<perrito666> I want to go to 1.19.4 from 1.18.1.1
<sinzui> perrito666, local provider ignores streams, it is impossible...we reported the bug months ago
<perrito666> sinzui: mmpf
<sinzui> perrito666, we have test streams in aws, hp, and joyent that you can use
<perrito666> sinzui: its ok, I have my own ;) I just thought that reproducing this bug was possible locally
<sinzui> perrito666, If you have both juju's installed, I think you can use the lower juju to upgrade (downgrade)
<sinzui> perrito666, one report of this bug was about local host I thought
<sinzui> perrito666, since local ignores streams, and only provides a crippled subset of archs, the machine had two versions of juju installed...
<sinzui> perrito666, and since few people know about the need to strictly define $PATH, the two jujus can mix
<lazypower> Can we call on user-data scripts / cloud init customizations at any time during the boot process? i dont see any documented constraints i can pass to populate user data on my cloud provider.
<lazypower> s/boot/provisioning
<natefinch> lazypower: no
<waigani> menn0: morning :)
<natefinch> lazypower: we don't expose that
<lazypower> natefinch: ok, so if we cant modify user data, can we specify custom AMI's then? or are we bound to the official ubuntu images?
<lazypower> the idea behind this is to eliminate bloating charms install hooks installing a-z toolkits that will be required on every machine for compliance.
<lazypower> wait, we do this. i found an AU post on it.
<lazypower> or rather, we did but dont now
<lazypower> http://askubuntu.com/questions/84333/how-do-i-use-a-specific-ami-for-juju-instances
<natefinch> lazypower: yeah, right now it's not implemented
<lazypower> ok, thats all i needed. Thanks natefinch
 * lazypower doffs hat
<natefinch> welcome
<wallyworld> natefinch: did anyone on your team make any progress on bug 1333682?
<mbruzek> Hey guys is there a way to set environment variables for the local machine/units in Juju?
<alexisb> wallyworld, I believe that is the one perrito666 is working on
<alexisb> let me go verify the number
<mbruzek> I was reading the documentation that describes changing the environment, but I am specifically interested in environment variables.
<wallyworld> alexisb: ok, thanks. the bug is still marked as triaged
<wallyworld> rather than in progress
<lazypower> mbruzek: doesnt juju set-environment do that?
<alexisb> wallyworld, yeah perrito666 has been looking at that one, but he will have to update you on progress I am not sure how far he has gotten
<lazypower> mbruzek: https://juju.ubuntu.com/docs/commands.html -- see: set-env
<mbruzek> lazypower, The set-environment command will set a configuration option to the specified value.
<lazypower> ah, since we use environment interchangeably in our jargon, i see what you're saying. Thats wrt your juju env, not the bash env correct?
<wallyworld> set-env only sets juju config values
<mbruzek> I think that means that it will set a config option in our environments.yaml
<wallyworld> i'm not sure about how to set environment variables
<wallyworld> no, it will set a config value in a running deployment
<wallyworld> environments.yaml is only used when first bootstrapping
<wallyworld> after that, juju maintains a database on the state servers which contain the system config, plus also a local jenv file with certs and things like that
<wallyworld> mbruzek: a perhaps crappy solution is to use juju run
<wallyworld> that command runs a given set of commands on each machine/unit
<wallyworld> the script that can be run could include export FOO=bar
<natefinch> yeah, that was my thought
<wallyworld> i think that's the only way at the moment
<mbruzek> OK thanks wallyworld and natefinch for responding
<wallyworld> good luck, ping back if you get stuck
<rick_h_> alexisb: wallyworld I saw him reply in one of these channels, looking for this comment on progress
<rick_h_> alexisb: wallyworld it was basically, someone can have fun with it tonight
<wallyworld> rick_h_: you talking about bug 1333682?
<rick_h_> oh hmm, that was some 4hrs ago so maybe there's more
<rick_h_> wallyworld: an upgrade issue?
<rick_h_> perrito66| natefinch: alexisb I need to step off for a moment, I did not manage to fix the upgrade issue (I did nevertheless manage to break part of my env) Ill be back later but if anyone else wants to take a look he/she is welcome
<wallyworld> rick_h_: yeah, the panic
<rick_h_> was the last thing I saw related in irc
<rick_h_> fyi
<wallyworld> rick_h_: great, thank you
<alexisb> rick_h_, I think perrito666 has since come back, but no sure if he is still around
<alexisb> eitherway, wallyworld I would consider that bug fair game
<alexisb> it needs to get resolved
<alexisb> and it is perrito666 eod
<wallyworld> alexisb: yes indeed. i just wanted to see where others may have got to before i started my day
<rick_h_> alexisb: yea, I hadn't realize how long ago that was
<rick_h_> time flies when you're fixing bugs
<wallyworld> alexisb: this 1.19 release sure is cursed :-(
<wallyworld> hopefully this will be the last blocker
 * alexisb keeps her fingers crossed
<perrito666> alexisb: looking for me?
<alexisb> perrito666, wallyworld was looking for an update on the bug given his team will be working the "night" shift on it
<alexisb> perrito666, can you please touch base with wallyworld ?
<wallyworld> s/team/wallyworld :-)
<perrito666> wallyworld: let me touch your base
<perrito666> :p
<wallyworld> ooooh
 * wallyworld braces
<alexisb> lol
 * perrito666 touched bases with wallyworld in priv
<perrito666> ok I dissapear, Ill be back later
<perrito666> wallyworld: anything else before I leave?
<wallyworld> perrito666: nah, thanks, enjoy your evening. you can touch my base anytime :-)
<perrito666> wallyworld: http://www.youtube.com/watch?v=z13qnzUQwuI
<wallyworld> lol
<perrito666> bye
<sinzui> wallyworld, I will be happy for you to declare the bug not a regression. I can release what we have while this be is fixed for the next release
<alexisb> alrighty all, I am headed into town
<alexisb> I will check back in later this evening, if you need me before then email or call my cell
<wallyworld> see ya
<wallyworld> sinzui: i am still ramping up but there's a line in the bug comment which says "So potentially it is a different bug, which is that 1.19 is actually *removing* the line that used to be there"
<wallyworld> if that's the case, then we do have a problem it seems
<sinzui> I defer to your judgement
<bodie_> menn0, I really want to avoid redundancy at high time cost in these tests, we have a meeting with Mark approaching Friday and we're really trying to push the Actions down through these layers to the RunHook call
<bodie_> I understand your concern over the test coverage, but I think the api client methods here should really only be tested as far as their responsibility goes, I'm not positive we need to add redundancy here
<bodie_> e.g., the StringsWatcher is being checked for duplication at the State level; therefore, if duplicates are coming in, it seems that would be an issue with the st.call() method
<bodie_> which itself is being exercised elsewhere
<menn0> bodie_: I think we're misunderstanding each other. I'm not talking about anything particularly heavyweight.
<menn0> bodie_: I'm just interested in seeing tests that hit the error handling lines that aren't currently being tested in state/api/uniter/unit.go:WatchActions
<bodie_> yeah, I was just discussing with jcw4 -- that function variable could be returned by the st.call() method
<menn0> specifically 423, 426 and 430
<menn0> bodie_: I'll whip up a quick example of what I mean ... it won't take long
<bodie_> we spent some time trying to emulate the usermanager Patch technique, but we were having trouble I think since we weren't using the function var as you'd mentioned
<menn0> bodie_: this is what I'm thinking: http://paste.ubuntu.com/7697305/
<menn0> untested but should be close
<thumper> fwereade: still around?
<menn0> bodie_: saw your comments about the time pressure you're under. Feel free to leave this bit until later if need be.
<bodie_> thanks, I think this should be pretty straightforward to implement :)
<bodie_> menn0, I was actually thinking of simply inserting a var call = ... which is then called by the existing st.call() function -- thus requiring no refactor
<bodie_> any reason not to do so?
<bodie_> besides being horribly lazy...
<menn0> bodie_: that should be ok as well I think
<menn0> bodie_: you'll probably still need to do something in export_test.go or the tests won't be able to get to call to patch it.
<bodie_> yep, got that bit in
<bodie_> thanks for the code example, btw!
<bodie_> I'm actually really happy about how close we are to finally getting Actions pulled together, the last few inches can be a little frustrating at times
<bodie_> and pr 141 updated
<bodie_> menn0, would much appreciate a clean bill of health ;)
<menn0> bodie_: in a call... should be done soon
<wallyworld> sinzui: i have a theory as to what's happening. it's only a guess based on reading the code. i will make a trivial change which may help but i cannot be certain it will definitively fix things https://bugs.launchpad.net/juju-core/+bug/1333682/comments/5
<_mup_> Bug #1333682: upgrading 1.18 to 1.19 breaks agent.conf <panic> <upgrade-juju> <juju-core:In Progress by wallyworld> <https://launchpad.net/bugs/1333682>
<menn0> I wonder how much work will be required to support this architecture? http://www.engadget.com/2014/06/23/russian-government-avoids-intel-amd-chips-for-baikal/
<mwhudson> well it's a cortex a57
<mwhudson> so once it's booted you should be ok...
<wallyworld> thumper: you got time for a trivial review? critical fix for 1.19.4 release https://github.com/juju/juju/pull/155
#juju-dev 2014-06-25
<davecheney> wallyworld: axw bot is still being a prick
<davecheney> did jam's fix for the data race in the api server land?
<wallyworld> sadly yes
<wallyworld> i think so
<wallyworld> there's 2 mongo races that i think are the cause of the remaining issues
<axw> I'll take a look at it this morning
<axw> oh
<axw> wallyworld: new ones though?
<axw> because it wasn't this bad last week
<wallyworld> axw: not sure, haven't looked in detail but there's been various changes this week
<wallyworld> axw: could you take a look at this first up, it attempts to fix another critical blocking 1.19.4 https://github.com/juju/juju/pull/155
<axw> wallyworld: sure, looking
<wallyworld> ta
<davecheney> wallyworld: haven't been able to land a branch all morning
<wallyworld> davecheney: so last week, we had the 2 intermittent mongo failures which only happened every so often. lots has changed this week. stuff has broken :-(
<wallyworld> i do think we've got systemic race conditions in our code which no one has really got a handle on
<wallyworld> and/or mongo is shit
<axw> wallyworld: do you know why the peergrouper is getting an empty set of machines in the first place?
<wallyworld> axw: not sure, it's all quite involved, i guessed that it could have been due to stuff initialising and replicaset stuff not being ready (likely during upgrade from non ha to ha env) and timer inside worker triggers publish of machine addresses but they aren't known yet
<wallyworld> there are 2 triggers to publish - machine watcher and timer
<axw> wallyworld: then I would expect it's just going from empty to empty? hmm. anyway, I'll LGTM because it doesn't make anything worse AFAICT
<wallyworld> axw: yes but the watcher won't have fired
<wallyworld> my analysis is in bug comment 5, i can't see an other reason for the issue :-(
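The guard wallyworld's change is described as adding — skipping a publish when the timer fires before the machine watcher has delivered any addresses — can be sketched roughly like this. The function name and types are assumptions for illustration, not juju's actual peergrouper code:

```go
package main

import (
	"errors"
	"fmt"
)

var errNoAddresses = errors.New("no machine addresses to publish")

// publishAPIServers refuses to publish an empty set of machine
// addresses. During upgrade from a non-HA environment the timer-driven
// publish can fire before any addresses are known; returning an error
// here lets the caller skip that round instead of wiping state.
func publishAPIServers(machineAddrs [][]string) error {
	if len(machineAddrs) == 0 {
		return errNoAddresses
	}
	fmt.Println("publishing", len(machineAddrs), "machines")
	return nil
}

func main() {
	if err := publishAPIServers(nil); err != nil {
		fmt.Println("skipped:", err) // skipped: no machine addresses to publish
	}
}
```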
<davecheney> wallyworld: ok, i'll focus on finding those race conditoins
<wallyworld> davecheney: we'll look as well
<wallyworld> davecheney: the 2 i was aware of were one in bug 1305014 and another inside the apiserver code (not sure of bug off hand)
<_mup_> Bug #1305014: panic: Session already closed in TestManageEnviron <intermittent-failure> <test-failure> <juju-core:Triaged by rogpeppe> <https://launchpad.net/bugs/1305014>
 * thumper goes to make a coffee
<axw> davecheney wallyworld: jam's fix has not landed
<axw> may explain why I'm getting panics and the race detector isn't happy with cmd/jujud
<wallyworld> oh, bollocks, i thought it had
<davecheney> axw: what can I do to help land that today
<wallyworld> axw: davecheney the fucking bug was marked as fix committed
<wallyworld> hence i thought it had landed
<davecheney> wallyworld: yay issue tracking
<wallyworld> but github says otherwise
<axw> davecheney: merge was attempted and tests failed, could just try again. rogpeppe has added a remark about the solution being non-ideal
<wallyworld> i've hit the "merge" button again
<wallyworld> we can fix the non-ideal bit after it lands, we need to get this release out imo
<davecheney> seconded
<axw> sgtm
<axw> trivial review please: https://code.launchpad.net/~axwalk/gwacl/vnet-location-docalignment/+merge/224254
<wallyworld> looking
<wallyworld> axw: dumb question, how is 2012 more recent than 2013?
<axw> wallyworld: "which is actually older than what was in the code."
<wallyworld> yeah read that but still don't understand
<axw> wallyworld: 2012-whatever is the most recent published version for that API
<wallyworld> so we were using an incorrect date?
<axw> wallyworld: yeah
<wallyworld> ok
<axw> I don't think it actually matters, but I'd rather go by the docs
<wallyworld> yup
<thumper> davecheney: re: https://github.com/juju/juju/pull/156
<thumper> davecheney: axw has approved but I have a few questions
<thumper> davecheney: just letting you know
<davecheney> thumper: sure
<davecheney> thumper: i'm not trying to land anything til the race fix lands
<thumper> davecheney: done with comments
<davecheney> thumper: yup
<davecheney> i see your comments
<davecheney> i think you found one of my TODOs there
<davecheney> fwiw, almost all this stuff was just a string slung straight to the api server
<davecheney> but I will adjust the signatures to mandate the correct type
<davecheney> test 0
<davecheney> ... Panic: Couldn't create temporary directory: mkdir /tmp/gocheck-3784560248718450071: file exists (PC=0x41078D)
<davecheney> wowo
<thumper> haha
<thumper> pseudo-random
<thumper> davecheney: should it be the actual type or an interface?
<thumper> just wondering
<davecheney> thumper: i'm thnking the concrete type
<thumper> ok
<davecheney> we want to say "only machine tags here"
<davecheney> or units
<davecheney> or whatever
 * thumper nods
<thumper> This gives the compiler control rather than runtime assertion
<thumper> better I think
<davecheney> thumper: yup
<davecheney> to fix this i'll need to change most of the places where we have a struct like
<davecheney> type Machine struct { tag names.Tag }
<davecheney> to
<davecheney> type Machine struct { tag names.MachineTag }
<davecheney> and this is a good thing
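The change davecheney describes — narrowing a struct field from the `names.Tag` interface to the concrete `names.MachineTag` type — moves the "only machine tags here" check from a runtime assertion to the compiler, as thumper notes. A toy version with stand-in types (not the real names package):

```go
package main

import "fmt"

// Minimal stand-ins for the names package's tag types, for illustration only.
type Tag interface{ String() string }

type MachineTag struct{ id string }

func (t MachineTag) String() string { return "machine-" + t.id }

type UnitTag struct{ name string }

func (t UnitTag) String() string { return "unit-" + t.name }

// Machine holds a concrete MachineTag rather than the Tag interface, so
// handing it a UnitTag is a compile-time error, not a failed runtime
// type assertion deep in the API server.
type Machine struct {
	tag MachineTag
}

func NewMachine(tag MachineTag) Machine { return Machine{tag: tag} }

func main() {
	m := NewMachine(MachineTag{id: "0"})
	fmt.Println(m.tag.String()) // machine-0

	// NewMachine(UnitTag{name: "wordpress/0"}) // would not compile
}
```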
<davecheney> thumper: consider yourself nagged for that sprint
<davecheney> its < 30 days
<thumper> ack
<menn0> bodie_: I just finished reviewing 141
<menn0> all good from my perspective
<davecheney> wallyworld: http://juju-ci.vapour.ws:8080/job/github-merge-juju/252/console
<davecheney> what ?
<davecheney> tests didn't fail
<davecheney> did someone nuke this build ?
<wallyworld> davecheney: looks like connection to ec2 instance disappeared half way through
<wallyworld> faaaark
<wallyworld> retry i guess
<bodie_> menn0, good to hear :)
<sinzui> wallyworld, the utopic ami expired a few hours ago. I just configured tests to use that to fix one test...maybe i need to check the ami used by git-merge-juju
<wallyworld> sinzui: ah, could be it
<wallyworld> although i thought we ran trusty
<wallyworld> or even precise actually for the tests
<wallyworld> yes, i'm sure it's precise
<sinzui> wallyworld, yeah, and trusty tests are happy. I will double check
<wallyworld> ok, but it's waaaay past your eod
<sinzui> wallyworld, oh...
<sinzui> wallyworld, I think there is a correlation between git-merge-juju and a number of hung instances from a few hours ago. I had to terminate them.
<wallyworld> really? oh
<wallyworld> maybe you terminated a git-merge-juju one as well?
<sinzui> wallyworld, I have requested an additional 20 instances.
<wallyworld> great
<sinzui> Maybe I can also update the tests to switch to hp which is very good now that we addressed the ram instance ratio
 * sinzui forces a rebuild
<wallyworld> i think that couldn't do any harm
<wallyworld> mgz is moving to get a nailed up hp cloud instance working to run the tests
<sinzui> I can do that in 30 minutes.
<sinzui> I can get slaves and slave helpers up quickly for trusty.
<sinzui> I have failed to create a fat 386 to test local deploy. I am pondering the consequences of never testing that 386 local provider works.
<wallyworld> i don't think too many people would be using it
<sinzui> a surprising number download it from the ppa
<wallyworld> once the nailed up instance is there, mgz needs to do a little script tweaking but it should be easy
<wallyworld> wow, ok
<wallyworld> i386 is so last century
<wallyworld> decade maybe
<sinzui> wallyworld, https://pastebin.canonical.com/112317/
<wallyworld> jeeez, ok
 * wallyworld is surprised
<sinzui> Indeed. abentley intends to break the numbers down by week to find proper trends in adoption and decline
<sinzui> your beta tester numbers are still generating
<wallyworld> ok
<wallyworld> i'd be interested in seeing those when done
<menn0> potentially silly question: are juju controlled machines (specifically state servers) able to SSH between each other? I'm thinking about something that would involve distributing a file to all state servers.
<sinzui> menn0, They don't have the private keys
<menn0> sinzui: ok thanks. (I think you mean the public keys right?)
<sinzui> menn0, they have public keys, that is why juju ssh works
<sinzui> menn0, there isn't a private key on any of the juju machines that would allow an agent to ssh to another machine
<menn0> sinzui: got it
<menn0> thanks
 * menn0 should have just looked in ~/.ssh ...
<sinzui> menn0, also, manual provision can happen across networks, so the design is that the agents can all find the state server, but not necessarily each other
<sinzui> juju ci's state server cannot reach the lab, but the 4 machines I provisioned there can reach the state server.
<menn0> sinzui: ok that's good to know.
<menn0> I want to get a file generated on one state server to all other state servers
<menn0> well that's one design I'm looking at
<sinzui> menn0, surely the master state-server is known to all machines. so all machines can ask for a task to pick up the file
<sinzui> wallyworld, beta testers https://pastebin.canonical.com/112318/
<menn0> sinzui: right, so you're thinking: a state server generates the file, tells the master to come and get it which it does, and then pushes it out to all other state servers.
<sinzui> menn0, I think that is possible
<menn0> that could work
<sinzui> menn0, My experience is from observation. I really don't know how juju works.
<wallyworld> 1.11.2 was big
<menn0> sinzui: alternatively I think I can make this work by extending the API a little...
<sinzui> on the other hand, I have done the nigh impossible because I didn't read the code http://curtis.hovey.name/2014/06/10/building-trans-cloud-environments-with-juju/
<menn0> sinzui: :)  thanks for your thoughts. you've nudged me in the right direction.
<sinzui> wallyworld, If you stare at the 1.19.x numbers you can see the transition from precise users to trusty users.
 * wallyworld needs another coffee before staring at numbers
<bodie_> if there's any chance I could get a LGTM on https://github.com/juju/juju/pull/140 it would be quite helpful in moving Actions forward :)
<bodie_> (LBTM equally appreciated, though)
<wallyworld> bodie_: doesn't look like william's issues with tags and canAccess function have been fixed?
<waigani> do we have any helper functions for merging two slices of structs, or should I just do it by hand?
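For waigani's question: the standard library has no generic slice-merge helper, so doing it by hand is the usual answer. One hedged sketch — the `Item` type and the dedup-by-key policy (later slice wins) are just one plausible interpretation of "merging":

```go
package main

import "fmt"

type Item struct {
	Key   string
	Value int
}

// mergeByKey merges two slices of structs, with entries in b overriding
// entries in a that share the same Key; output order follows first
// appearance. Hand-rolled, since there is no stdlib helper for this.
func mergeByKey(a, b []Item) []Item {
	index := make(map[string]int)
	var out []Item
	for _, item := range append(append([]Item{}, a...), b...) {
		if i, ok := index[item.Key]; ok {
			out[i] = item // later slice wins
			continue
		}
		index[item.Key] = len(out)
		out = append(out, item)
	}
	return out
}

func main() {
	a := []Item{{"x", 1}, {"y", 2}}
	b := []Item{{"y", 20}, {"z", 3}}
	fmt.Println(mergeByKey(a, b)) // [{x 1} {y 20} {z 3}]
}
```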
<sinzui> wallyworld, Are you getting more information about bad tests from walk-unit-tests-amd64-trusty than for the other unittests?
<wallyworld> sinzui: the idea was that we'd try and solve the intermittent failures. that job was useful but with the switch to running the tests on ec2 instead of canonistack/tarmac, the main landing tests have become unreliable instead :-(
<wallyworld> we can pass that job until things settle down a bit
<wallyworld> pause
<sinzui> done
<wallyworld> ok, ta
<waigani> thumper: I've got a question, hangout? should be quick
<thumper> sure
<waigani> thumper: https://plus.google.com/hangouts/_/g4rgbccmbg5lchy6qcx7hsny6qa?hl=en
<bodie_> wallyworld, I'll have to have a look tomorrow morning, well past eod here, but I think I've addressed tags.  I'm not totally certain what he was looking for with canAccess
<bodie_> for the queries from api to apiserver, everything is encapsulated by tags
<wallyworld> ok
<bodie_> thanks for having a look!  good night all
<sinzui> wallyworld, I need to sleep. I am a little concerned that local-upgrade on precise is not passing. maybe there is something in one of these logs if CI continues to fail http://juju-ci.vapour.ws:8080/job/local-upgrade-precise-amd64/1414/
<wallyworld> ok, thanks, we'll look
<davecheney> thumper: wallyworld
<davecheney> [LOG] 0:02.374 DEBUG juju.environs.simplestreams fetchData failed for "http://127.0.0.1:47994/v1/1/juju-test-e76bfc6736366a8a/images/streams/v1/index.json": failed to GET object images/streams/v1/index.json from container juju-test-e76bfc6736366a8a
<davecheney> caused by: Resource at http://127.0.0.1:47994/v1/1/juju-test-e76bfc6736366a8a/images/streams/v1/index.json not found
<davecheney> caused by: request (http://127.0.0.1:47994/v1/1/juju-test-e76bfc6736366a8a/images/streams/v1/index.json) returned unexpected status: 404; error info: 404 Not Found
<davecheney> The resource could not be found.
<davecheney> an example of the random test failures I get on my machine
<wallyworld> i'd need to see the whole log of the failed test and which test it is
<wallyworld> axw: if you get a chance at any point, could you look at https://github.com/juju/juju/pull/124 ? william is happy with the implementation but hasn't had time to do a more detailed review
<axw> wallyworld: yep, no worries
<wallyworld> ta, i have to go out for an hour or so, back later
<axw> adios
<vladk> dimitern: morning, I started to work on NetworkInterfacesWatcher. I am going to change NetworkInterfaceDoc according to the final draft. Is it ok?
<dimitern> vladk, better not i think, because we'll need to add subnets and other things as well
<dimitern> vladk, we can still watch ifaces as they are though
<vladk> dimitern: so I filter interfaces by machineId, rather than _id?
<dimitern> vladk, what do you mean?
<dimitern> vladk, i suppose there will be a machine.WatchInterfaces() method in state, returning a StringsWatcher reporting the ids of all interfaces for that machine, as they are created, enabled, disabled, or removed
<dimitern> vladk, you do need to add the IsDisabled flag to the iface doc though, but just that - for the other things that need changing, i think we should start with networks and subnets first, then the rest
<vladk> dimitern: to read the initial set of interfaces I need to filter interfaces from the state by machineId
<dimitern> vladk, yes, the interface doc has a machineId - you can use that, right?
<vladk> dimitern: yes
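The watcher dimitern describes — a `machine.WatchInterfaces()` returning a StringsWatcher that reports interface ids as they are created, enabled, disabled, or removed — would be consumed roughly like this. The type below is a bare sketch, not juju's real state watcher machinery:

```go
package main

import "fmt"

// StringsWatcher mirrors the shape discussed above: a Changes channel
// that delivers batches of ids. By convention the first event carries
// the initial set (here, interface ids filtered by machineId).
type StringsWatcher struct {
	changes chan []string
}

func (w *StringsWatcher) Changes() <-chan []string { return w.changes }

func main() {
	w := &StringsWatcher{changes: make(chan []string, 1)}
	w.changes <- []string{"iface-0", "iface-1"} // initial set
	close(w.changes)

	for ids := range w.Changes() {
		fmt.Println("interfaces changed:", ids)
	}
}
```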
<TheMue> jam: ping
<jam> TheMue: pong
<TheMue> jam: seen the latest comments on https://github.com/juju/juju/pull/150 ?
<jam> I haven't, I'll go look
<TheMue> jam: we have different levels to fix this issue. from low-level inside the jsonrpc package up to the command. I prefer ParseSettingsStrings in charm.Config
<jam> TheMue: well, I posted that you might even go for the 'schema' package, which is the stuff that validates already
<jam> however, I really don't want us to get too derailed here, this is mostly a passing-by stuff, and we thought things were working differently than they are.
<jam> so for now, document that we get \ufffd when you pass in invalid binary data (part of the actual api Client level tests), so that we at least assert the current behavior and know if we change it.
<jam> and I can live with whatever other results we get, as it won't be *worse* than what we've lived with forever.
<jam> TheMue: does that make sense to you?
<TheMue> jam: so you donât expect cmd.Main to return something else than success? because we simply accept and document this behavior?
<TheMue> jam: as the json encoding always âcorrectsâ the invalid sequence to \ufffd
<jam> TheMue: I can live with it, given we've been living with it forever.
<jam> TheMue: it isn't what I prefer, but we don't have to fix everything, we have more important things today
<TheMue> jam: yes, that have been my thoughts this morning too. we only discovered this issue by thinking about it, but it never happened in reality so far
<TheMue> jam: ok, then it makes sense and I'll change the latest added tests to be a bit more explicit and documenting, so that in case of somebody having to fix this error is having a good base to start from
<jam> TheMue: sounds good to me
<jam> vladk: just a reminder that you're On-Call-Reviewer today
<jam> and TheMue you're OCR tomorrow
<TheMue> jam: tested the json marshalling yesterday evening a bit in both directions as well as taking a look into the source. interesting that they removed the error for invalid utf-8 with go 1.2
<TheMue> jam: thx for the hint
<jam> TheMue: seems like a pretty significant change, but I guess that's what we have to live with.
<TheMue> jam: yeah, especially as we use the websocket json codec directly, so we don't even see the marshalled bytes. they are directly written to the socket
 * jam heads to lunch
<fwereade> davecheney, ping
<axw> fwereade: https://github.com/juju/juju/pull/158  - FYI
<fwereade> axw, thanks
<davecheney> fwereade: ack
<wallyworld> mgz: standup?
<fwereade> davecheney, can we have a quick chat about where you're heading with the tags stuff? there's clearly plenty to like but I don't have a clear idea of the plan
<davecheney> fwereade: sadly no
<davecheney> i am not at home
<davecheney> fwereade: you speak to tim frequently
<davecheney> he can represent me on this issue
<fwereade> davecheney, cool, thanks, I'll chat to him this evening
<wwitzel3> morning all
<TheMue> wwitzel3: morning
<perrito666> morning
<TheMue> vladk, jam: (hopefully) final changes https://github.com/juju/juju/pull/150. so please review (again)
<wallyworld__> jam: not sure if my "fix" for that agent.conf issue will do any good or not but it's all i could come up with as something to try from reading the code
<dimitern> jam, standup?
 * fwereade bbiab
<perrito666> stumble upon's logo reminds me of something
<sinzui> Looks like wallyworld's upgrade fix broke local precise upgrades
 * sinzui reports bug
<sinzui> natefinch, jam, fwereade, alexisb : We have a regression caused by the fix for the upgrade bug reported yesterday https://bugs.launchpad.net/juju-core/+bug/1334273
<_mup_> Bug #1334273: Upgrades of precise localhost fail <local-provider> <precise> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1334273>
<alexisb> sinzui, lovely
<alexisb> perrito666, wwitzel3, mgz ^^^ you guys available to take a look at this bug
<bodie_> morning all
<natefinch> wwitzel3: can we push our 1 on 1 to this afternoon?
<mgz> alexisb: nothing is jumping out at me from the change or the log in the bug...
<alexisb> mgz, well of course it couldn't be obvious that would be waaaay too easy ;)
<wwitzel3> natefinch: sure
<sinzui> mgz, The stall in the test log, where a single status call takes 10 minutes then fails the test, looks like the issue joyent upgrades had. Wallyworld said the fix was to ensure the state server got a sane list of addresses
<mgz> sinzui: so, all the change does is return an error if the publish function is given an empty list
<mgz> I don't particularly see how that regresses us... it must be something fun and side-effecty
<sinzui> mgz #136 from wallyworld/ignore-empty-machine-addresses
<sinzui> mgz, I am speculating...I really don't know
<alexisb> sinzui, mgz : does that mean we think that this current bug is a result of joyent shortcomings?
<sinzui> alexisb, no. I am just remarking that the fix for yesterdays critical broke another test in a similar way to a bug from last week
<sinzui> alexisb, CI loved juju yesterday. I could release something from yesterday as 1.19.4
<mgz> sinzui: I don't see that error propagated back up in the logs anywhere either
<perrito666> sinzui: github says https://github.com/juju/juju/commit/e01ac93e is a different commit
<mgz> which it *should* be if that change actually affected anything
<alexisb> sinzui, ok
<sinzui> perrito666, I gave you the latest data; there have been two broken revisions. I just gave you a log from what failed 15 minutes ago
<perrito666> sinzui: I was talking about "With the introduction of e01ac93e "Merge pull request #155 from wallyworld/peergrouper-publish-ignore-empty""
<sinzui> perrito666, http://juju-ci.vapour.ws:8080/job/local-upgrade-precise-amd64/ shows all the failed tests over the last 9 hours
<sinzui> oops
<perrito666> sinzui: as I say, the first failure comes from https://github.com/juju/juju/commit/e01ac93e0cbee312bf19ead78035a78ed2428871
<sinzui>  #153 from davecheney/112-state-life-takes-a-tag is the one
<perrito666> which is not wallyworld's tag
<perrito666> s/tag/rev
<sinzui> I updated the bug
<perrito666> mgz: you might find more answers there ^
<mgz> that does seem a little more promising
<alexisb> mgz, is there an obvious fix or is a commit we should consider backing out for this current dev release?
<mgz> I think we should consider backing out the change and seeing if that resolves the issue
<mgz> this might mean pain for upgrades later, this is an api change
<mgz> of sorts
<alexisb> mgz, that seems very reasonable to me, sinzui your thoughts?
<sinzui> mgz...well, I have the tarball and installer from the previous pass. I can start the release now with fa4f6106
<sinzui> mgz, are you preparing to backout the rev to test?
<bodie_> anyone happen to know whether fwereade is around?
<natefinch> alexisb: I presume I stick around with Gabriel and skip tosca this week?
<alexisb> natefinch, yes
<sinzui> mgz, perrito666 Is anyone working on bug 1334273 or planning to backout the revision where the error started?
<_mup_> Bug #1334273: Upgrades of precise localhost fail <local-provider> <precise> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1334273>
<perrito666> I am not, I was under the impression mgz was on it
<mgz> sinzui: I can do the backout if we want that
<mgz> I'm not completely sure it'll resolve the issue, but seems worth trying
<sinzui> mgz we can do that or I release the version before it.
<sinzui> mgz, if the test passes after the backing out, then we know the rev is bad, that is all we learn
<mgz> sinzui: I guess the question is, do we want the wallyworld change that comes after
<mgz> because *that* bug was the blocking cause previously right? (the thing wallyworld's change tries to fix, panic on not having addresses)
<sinzui> mgz, yes, if it is independent. wallyworld's change was a hope to fix the other critical
<mgz> okay, I'll do the revert, we send tip with that through ci, and release that if it's clean, otherwise release the rev before
<mgz> sounds good?
<alexisb> mgz, yes sounds good :)
<sinzui> mgz, alexisb. I have a lot of anxiety regarding the delay of the devel release. We have more than 50 issues people aren't testing because we always find one more critical bug. I am inclined to revert to my old behaviour of selecting a blessed revision from the past. In general, a bug that is in older versions of juju does not block a devel release, only recent regressions can block
<perrito666> +1
<alexisb> sinzui, +1
<mgz> can I get a check/stamp on pr160 please
<mgz> perrito666: ^
<perrito666> natefinch: checking
<mgz> >_<
<perrito666> aghh sorry, off by one in the kb
<perrito666> mgz: check
<mgz> okay, sent to bot
<wwitzel3> perrito666, natefinch: standup or are you guys busy with the regression?
<perrito666> wwitzel3: coming
<perrito666> natefinch: ?
<natefinch> wwitzel3, perrito666: let's meet this afternoon, sorry, in a virtual sprint with the cloudbase guys until 1-ish
<perrito666> cool, I can then use my bw to continue downloading broken sword  :p
<mgz> sinzui: revert has landed
<mgz> sinzui: the upgrade job is blue, but from before the revert landed anyway.
<sinzui> ?
<mgz> if I'm looking at the jenkins job correctly, the latest run is blue, but it's from an earlier landing anyway
<alexisb> what does blue mean?
<mgz> er, it passed. blue dot in the jenkins ui.
<sinzui> damn it
<sinzui> mgz, a race condition where upgrade won? There were a lot of failures and the rate is much worse than it was before the rev
<sinzui> mgz, if the job passes quickly in the next run, then I am convinced that a race condition was avoided
<mgz> yeah, makes seeing particular blame on the change much harder
<mgz> we have a few more trunk revs to see
<bodie_> does anyone know how to retrieve the Unit making the query in an APIserver endpoint?
<bodie_> I was passing the calling Unit's tag along with the entity request, but I was told that the UniterAPI has the calling Unit as a member variable
<alexisb> jam1, did you want to meet?
<jam1> alexisb: yeah, I think so
<alexisb> ok, our normal hangout
<wwitzel3> natefinch, perrito666, ericsnow: is now good?
<ericsnow> wwitzel3: good for me
<perrito666> wwitzel3: coming
<natefinch> wwitzel3: can we do it in an hour?
<perrito666> lol
<perrito666> not coming
<perrito666> ping me when you are ready
 * perrito666 cooks
<mgz> sinzui: I've not seen any more runs on the local upgrade job,
<mgz> but I've proposed a restore of dave's changes as it doesn't seem to have had an effect
<sinzui> mgz, It runs in a later phase
<sinzui> mgz, I was just looking if I could force the test to run earlier
<sinzui> mgz, were you planning to create a dedicate HP test machine?
<mgz> sinzui: yup, though not completely sure about the hp part
<sinzui> mgz, HP is good now that we fixed the RAM to instances imbalance
<sinzui> mgz, I can provide a slave for you before you wake tomorrow
<mgz> do I need a la
<mgz> *slave, the unit test script just takes any machine right?
<sinzui> mgz, I have been pondering an update to the run-unit* to provision nova instances
<mgz> that seems reasonable
<sinzui> A slave in our cloudy case means some jobs jump the queue. That helps you
<mgz> okay
<sinzui> We could choose to run tests on the slave machine. I am going to try that for some tests. I really want to minimise our aws resources...then I thought changing the run* tests to use nova would help me
<sinzui> oh
<sinzui> mgz, your tests try once fast, then try with -p 2. Should I be doing the same?
<sinzui> I try tests 4 times, maybe I could use your technique to minimise setups
<mgz> sinzui: , ideally not, but I was considering proposing a change to the script to take a flag that does that
<bac> i'm getting this error from jenkin-slave even though the file listed exists and is world-readable: 2014-06-25 17:39:24 ERROR juju runner.go:220 worker: exited "upgrader": cannot read tools metadata in tools directory: open /var/lib/juju/tools/1.17.4.1-trusty-amd64/downloaded-tools.txt: no such file or directory
<bac> any ideas?
<natefinch> perrito666, ericsnow: standup?
<perrito666> natefinch: sure
<ericsnow> natefinch: coming
<sinzui> I won't be helping a test job to run sooner. My effort led to a collision in lxc/mongod/jujud on the jenkins master. That 60 minute delay was not faster than letting tests run as they naturally want to
<wwitzel3> all the contents of the turkey wrap I was eating for lunch blew out the bottom and the cat got scared, ran through it, and I just finished wiping up all the mustard cat prints in my house.
<ericsnow> wwitzel3: just imagining that gave me a chuckle :) (hopefully it wasn't too much work)
<ericsnow> wwitzel3: I've had the same thing happen with small children
<wwitzel3> ericsnow: thankfully, wood floors
<thumper> fwereade: around?
<thumper> natefinch: o/
<natefinch> thumper:
<natefinch> howdy
<perrito666> ericsnow: do you eat small children? :p
<ericsnow> perrito666: only when they misbehave ;)
<thumper> natefinch: got time for a quick hangout?
<natefinch> yeah
<thumper> natefinch: https://plus.google.com/hangouts/_/g44xyud6z5kz3g6p775ebv6cfya?hl=en
<jcw4> I'd appreciate a review on https://github.com/juju/names/pull/11
<jcw4> Its an ActionTag refactor to hide the implementation details of extracting a UnitTag from an ActionTag
<jcw4> thumper: I think it's Thursday in your timezone >:-} ^^
<jcw4> Of course it's still Wednesday for mgz ;)
<thumper> hi jcw4
<thumper> yes, thursday here
<jcw4> hi thumper :)
<thumper> just after 9am
<perrito666> jcw4: there you go, you might want to look for better reviewers than me nevertheles
<jcw4> perrito666: your review comments are great
<thumper> I'll take a quick look too if you like
<jcw4> thumper: excellent, thank you yes
<jcw4> thumper: re: exported attributes... I wasn't clear if I could access the internal attributes outside of a method on the internalStruct
<jcw4> I'll change it and make sure it still works
<thumper> jcw4: all internal attributes are accessible inside the same package
<thumper> as long as it is only used inside the names package, you should be fine
<jcw4> thumper: perfect
<thumper> hmm...
 * thumper writes a longer general comment.
 * bodie_ grabs his binoculars and tries to catch a peek
 * jcw4 counts the minutes, wondering what could cause such long deliberation...
 * jcw4 is scared now
<fwereade> thumper, hey, I thought there was some meeting in 40 mins? don't seem to be invited any more
<thumper> hmm... this is getting longer
<thumper> fwereade: yes it is still on
<fwereade> thumper, since I seem to be up, I might swing by, it looked important iirc
<fwereade> thumper, what was it?
<thumper> docker stuff
<thumper> the meeting that is
<thumper> I'd like to chat with you about identity and multi-env stuff
<thumper> but reviewing jcw4's branch just now,
<thumper> with you shortly
<fwereade> thumper, and I with you, I might be back in a bit or I might do it after that meeting if you'll be free?
<thumper> fwereade: standup after the other meeting
<thumper> fwereade: how about pre other meeting, in 20 minutes say?
<fwereade> thumper, ok, sgtm, see you soon
<thumper> ack
<thumper> jcw4, bodie_: there you go
<thumper> damn the markdown squishing spaces
<jcw4> thumper: thanks!
<thumper> I should work out how to write code inside comments
<jcw4> thumper: excellent suggestion, thanks
<thumper> np
<jcw4> thumper: I considered it, but thought it might be too radical
<jcw4> :)
<thumper> nah
<jcw4> with your blessing though...
<jcw4> :)
<thumper> with the upcoming identity work, I'm going to do something similar
<thumper> the identity tag will have two parts: username@provider
<thumper> so I had thought through all this already :)
<jcw4> :)
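A minimal sketch of the two-part identity tag thumper describes, splitting a hypothetical "username@provider" string. The `parseUserTag` name and behaviour here are illustrative assumptions, not the real juju/names API:

```go
package main

import (
	"fmt"
	"strings"
)

// parseUserTag splits a hypothetical "username@provider" identity
// string into its two parts. The real names package API may differ;
// this only illustrates the two-part tag idea.
func parseUserTag(s string) (user, provider string, ok bool) {
	parts := strings.SplitN(s, "@", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", false
	}
	return parts[0], parts[1], true
}

func main() {
	user, provider, ok := parseUserTag("bob@local")
	fmt.Println(user, provider, ok)
}
```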
<sinzui> hi wallyworld thumper: I am creating a slave dedicated to running git-merge-juju. If you are contemplating running the unit tests on the same machine, perhaps in lxc, I can give it extra cpus to do that
<sinzui> otherwise I will use a medium instance and let the job create other instances to run the tests
<perrito666> thumper: ```
<thumper> sinzui: um... not sure I have enough context here, do we need to run tests on it?
<thumper> perrito666: yes?
<perrito666> thumper: ```code```
<thumper> perrito666: oh.. cool
<thumper> ta
<sinzui> thumper, the job currently creates a large ec2 instance to run tests. I was pondering support for hp instances, but maybe the core devs are thinking about running the unit tests in an lxc
<thumper> I've not thought about that, but others may have :)
<bodie_> I usually use ```go \n $code \n ```
<bodie_> you get highlighting by language
<bodie_> which is lovely
<perrito666> bodie_: good hint
<bodie_> perrito666, the other way is still good for inline monospaced code, though
<jcw4> bodie_, perrito666 (thumper); for inline monospace you just need `blah blah`  <-- only one tick
<thumper> ta
<perrito666> ok ppl EOD, ill still hang around bc I have no life
<perrito666> but I might not answer a lot since I just purchased the broken sword collection and I intend to play it :p
<bodie_> ooo
<bodie_> ... to the single backtick ;) though I have been known to have no life as well
<jcw4> bodie_: haha, I saw your ooo and thought you were talking about perrito666's broken sword collection
<menn0> morning all
<waigani> thanks menn0
<menn0> waigani: np
<menn0> it was top of my inbox and was straightforward :)
<waigani> menn0: nice, all my PRs should aim for that!
<wallyworld> sinzui: we are contemplating running unit tests in a container so extra cpus would be good
<sinzui> :( I just completed my setup of a 4 cpu machine
<wallyworld> that's ok
<wallyworld> we'll see how it goes
<wallyworld> it probably won't be something we can get done immediately
<sinzui> wallyworld, I can remake it if you are unhappy with the hp/nova provisioning I am adding to tests
<wallyworld> sinzui: ok, np. my view is that the immediate goal is to get the landing tests running in a nailed up instance of some sort
<sinzui> oh?
<wallyworld> containers would simplify isolation, but initially we can just do what we did in tarmac
<sinzui> wallyworld, I will restart this then. you and mgz will still have a dedicated slave in an hour
<wallyworld> sinzui: did we need a quick chat? seems like we are not quite aligned?
<wallyworld> although i have a meeting in 10 minutes
<sinzui> wallyworld, for the slave or f*ing  bug 1334273
<_mup_> Bug #1334273: Upgrades of precise localhost fail <local-provider> <precise> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1334273>
<bodie_> mornin'
<wallyworld> sinzui:  either or both. i'll ping you in an hour if you are still around. i'll also look into that bug. i was hoping it was ok when it started passing again yesterday :-(
<sinzui> wallyworld, mgz reverted Dave's commit and CI is playing it now. It has been a sad day. I will *not* try to make the tests run faster because my last effort caused lxc-based tests to collide
<jcw4> thumper, perrito666 I pushed an updated branch for https://github.com/juju/names/pull/11
<jcw4> thumper: if you have the time and inclination we can discuss my decision to make func ParseActionId(string) (ActionTag, bool) {} exported
<wallyworld> sinzui: ok, i'd still like to still ping you for a chat if you are free and around in an hour
<thumper> ok
<sinzui> wallyworld, I have no social life
<wallyworld> i hear you
 * jcw4 takes a mid afternoon break
<wallyworld> sinzui: if you are free https://plus.google.com/hangouts/_/gyul7vioiw2m7xnhi2tkfvpsjya
<sinzui> wallyworld, Google hates me
 * sinzui tries to hack the url
<wallyworld> i can send an invite
<sinzui> wallyworld, send me an invite, G is the suck
<wallyworld> already sent
<sinzui> hmm, why is it ringing
<wallyworld> sinzui: should i try another hangout?
<sinzui> yes, It keeps ringing and you don't hear it
<thumper> jcw4: looking now
<wallyworld> sinzui: https://plus.google.com/hangouts/_/gzdeubxezekdktt42jeva7ui5ma
<sinzui> party's over
<thumper> awww...
<thumper> no party for sinzui
<thumper> jcw4: why have ParseActionId public?  what is the rationale?
<jcw4> thumper: in juju/state/action.go the Action type returns a names.Tag
<jcw4> thumper: the Action uses ParseActionId to get that tag
<jcw4> thumper: the Action doesn't have a reference to a Unit or a UnitTag
<thumper> um... no
<thumper> jcw4: where in state/action.go?
<jcw4> line 84 now...
<jcw4> but that code isn't pushed yet
<thumper> if it is returning a names.Tag
<thumper> you should use a type assertion, not more parsing
<thumper> action, ok := (names.ActionTag)(tag)
 * thumper thinks
<thumper> that's the right syntax isn't it?
<jcw4> thumper: it doesn't have a tag, it's just part of the Entity interface to return one
<jcw4> action, ok := tag.(names.ActionTag)
<thumper> ok.. that
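The pattern jcw4 and thumper settle on can be sketched as follows. The `Tag` and `ActionTag` types here are cut-down stand-ins for the names package, just to show the type assertion replacing string re-parsing:

```go
package main

import "fmt"

// Minimal stand-ins for the names package types; the real
// names.ActionTag carries more information than this.
type Tag interface{ String() string }

type ActionTag struct{ id string }

func (t ActionTag) String() string { return "action-" + t.id }

func main() {
	var tag Tag = ActionTag{id: "mysql/0_a_1"}
	// Preferred: a type assertion on the interface value,
	// rather than parsing the string form back into parts.
	action, ok := tag.(ActionTag)
	fmt.Println(ok, action.String())
}
```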
<thumper> the action object though should have a unit and a sequence, yes?
<jcw4> thumper: yes...ish
<jcw4> the unit is encoded in the action Id
<thumper> so you return names.NewActionTag(unit, sequence)
<jcw4> thumper: so we
<thumper> ah...
<thumper> really?
<jcw4> thumper: it's partially due to watchers
 * thumper takes a deep breath
<jcw4> we want to watch the actions collection for actions specific to a unit
<thumper> right, so the document should have a unit attribute
<thumper> and not encoded in some id field
<thumper> it may well be encoded in id *as well*
<jcw4> thumper: +1
<thumper> but it shouldn't be the only source
<jcw4> thumper: that's how it is now but I agree
 * thumper nods
 * thumper realised he was late for the standup
<jcw4> thumper: thanks~
<jcw4> thumper: at your convenience; you'd prefer an actual unitName on the actionsDoc, over a method that parses it out of the _id?
<bodie_> jcw4, thumper -- it appears the UnitTag uses the parsing method as follows -- https://github.com/juju/names/blob/master/unit.go#L53-L63
<bodie_> fwiw
<thumper> jcw4: ack
<bodie_> we have a working ActionEvents filter channel :)
<bodie_> going to attempt to push down to the Hook level tomorrow
<bodie_> https://github.com/juju/juju/pull/163
<bodie_> any review would be very welcome :)
<jcw4> thumper: and I've updated https://github.com/juju/names/pull/11
<thumper> bodie_: I'm a reviewer today, and will try to get to it :)
<jcw4> thumper: given that perrito666 also reviewed, am I clear to merge?
<thumper> jcw4: yup
<jcw4> thumper: ta
<jcw4> thumper: et. al. do you use checklists when doing code reviews?  That was one of the suggestions in the article jam1 linked to on the email list
<thumper> bodie_: it seems I'm commenting on other parts of the work in the wrong PR
<thumper> jcw4: I have a mental checklist, but nothing formal at this stage
<jcw4> maybe we could compile the checklist items into a doc/code-review.md or something
<thumper> gym time... back later
<jcw4> have fun thumper
#juju-dev 2014-06-26
<bodie_> thumper -- sorry about that!  oops.... sigh
<bodie_> I'd cherry-picked the PR content it's waiting on in order to put the work through.  let's see here...
<bodie_> apologies for wasting your time with that
<bodie_> there we go
<bodie_> https://github.com/juju/juju/pull/163
<sinzui> wallyworld, the job failed even after I cleaned up the environment. Here are the logs http://juju-ci.vapour.ws:8080/job/local-upgrade-precise-amd64/1434/
<wallyworld> sinzui: sec, talking to alexisb , will contact you soon
<wwitzel3> wallyworld: do you know where I can get the instance type information I need for the client API?
<wallyworld> wwitzel3: yeah, sorry, i read your email and haven't had a chance to respond yet - been in meetings all morning. there is an api there, i just have to lookit and and let you know. will do so soon
<wwitzel3> wallyworld: np, thanks.
<wallyworld> sinzui: i can see the state server workers all start up, and also mongo. there appears to be no reason why the api client cannot connect to port 17070 - is it possible to do a netstat to see if the state server is indeed listening on the correct port?
<sinzui> wallyworld, This is what I saw during the upgrade http://pastebin.ubuntu.com/7703456/
<sinzui> wallyworld, WTF, this just happened on the next test
<sinzui> http://juju-ci.vapour.ws:8080/job/local-upgrade-precise-amd64/1435/console
<wallyworld> sinzui: is that done after these lines
<wallyworld> 2014-06-26 00:13:38 INFO juju.mongo open.go:90 dialled mongo successfully
<wallyworld> 2014-06-26 00:13:38 DEBUG juju.state open.go:58 connection established
 * wallyworld looks at new console
<wallyworld> seriously? it is mocing us
<wallyworld> mocing
<sinzui> wallyworld, it is a pass with a panic...that is a first
<wallyworld> mocking
<wallyworld> sinzui: i think that's due to the agent shutting down for upgrade, just caught it at a bad time
<sinzui> wallyworld, status may have panicked, it is called several times. the last call gave a result showing all machines upgraded
<sinzui> wallyworld, could be...but the code i wrote tries to capture that...that is why status will be called several times
<wallyworld> sinzui: i can see why status is behaving that way and there was a recent change there - it think it's missing a sanity check
<wallyworld> we can get back an error from the call to get status and still have partial status to display
<wallyworld> but we should check that we do indeed have some status and not nil
<wallyworld> so it will be a simple fix if i am correct
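The missing sanity check wallyworld describes can be sketched like this. `Status` and `getStatus` are illustrative stand-ins, not juju's actual client API; the point is that a call can return an error *and* a nil result, so the result must be nil-checked before use:

```go
package main

import (
	"errors"
	"fmt"
)

// Status is a stand-in for a partial status result.
type Status struct{ Machines int }

// getStatus simulates the failure mode: an error with no
// partial status to fall back on.
func getStatus() (*Status, error) {
	return nil, errors.New("upgrade in progress")
}

func main() {
	status, err := getStatus()
	if err != nil {
		fmt.Println("status error:", err)
	}
	if status == nil {
		// Without this guard, using status below would panic.
		fmt.Println("no partial status to display")
		return
	}
	fmt.Println("machines:", status.Machines)
}
```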
<wallyworld> sinzui: but i wonder why the CI job failed the first time, there's no obvious reason
<sinzui> wallyworld, I am unsure what to do now. If I wasn't watching that happen, I would declare the test good and just focus on the ill azure
<wallyworld> sinzui: 2 out of 3? :-D
<sinzui> wallyworld, exactly
<wallyworld> let's run it again
<sinzui> azure and joyent are messed up. I am in the consoles killing machines
<wallyworld> menn0: hi ya, i think there's an issue with the recent changes to status. i think we are missing a nil check. see http://pastebin.ubuntu.com/7703483/
<wallyworld> do you agree?
<menn0> yep. davecheney has already fixed it
<menn0> https://github.com/juju/juju/pull/127
<menn0> wallyworld: it looks like the landing bot didn't pick up the merge request though.
<menn0> wallyworld: how do we make it notice the PR?
<wallyworld> um
<wallyworld> $$merge$$ should have been enough
<wallyworld> i'll look into it
<menn0> It could be because this was a PR where the merge started and then was aborted because the initial proposal wasn't quite right
<menn0> wallyworld: ^^
<wallyworld> ah
<menn0> wallyworld: I think you were the one who killed it in Jenkins (at dave and my request)
<wallyworld> menn0: yes. but after that it needs $$merge$$ again to re-trigger
<wallyworld> i'll do that
<menn0> well it's had that
<menn0> but try again I guess
<sinzui> wallyworld, (Sorry for this awkward question from my daughter) Are all 40-50 year-old Aussies obsessed with ABBA?
<sinzui> I said no, but she doesn't believe me
<wallyworld> rotfl
<wallyworld> only some of us :-)
<wallyworld> abba were very, very popular here
<wwitzel3> who isn't obsessed with ABBA? am I right?
<sinzui> I know. I told her about how many weeks Fernando and Dancing Queen spent at number one and she then decided there is a cohort who can't let the band go
<wallyworld> she is right
<wallyworld> but i am not one of them :-)
<wallyworld> wwitzel3: i sent you an email - there's a little refactoring required, sorry
<wwitzel3> wallyworld: ok, yeah, I saw that .. I think I can just add that to the interface in common provider? But I am still not sure how to actually get the provider to call the method on.
<wallyworld> wwitzel3: you have it in your method
<wallyworld>  func (api *EnvironmentAPI) getInstanceTypes(env environs.Environ)
<wallyworld> env is the provider
<wallyworld> so you add the method to Environ
<wwitzel3> wallyworld: lol
<wwitzel3> wallyworld: of course it is
<wallyworld> the ConstraintsValidator() method is already there
<wallyworld> :-)
<wwitzel3> I was sooo close
<wallyworld> yep :-)
<wwitzel3> wallyworld: thanks :)
<wallyworld> np
<wallyworld> menn0: i have no idea what's wrong, i'm just going to merge it directly
<menn0> wallyworld: ok thanks
<wallyworld> sinzui: i just merged in a fix for that status panic
<sinzui> :)
<wallyworld> sinzui: it was proposed a few days ago it seems but the bot just didn't want to pick it up
<jcw4> review requested: https://github.com/juju/juju/pull/164 ;  this just updates for the newly updated names package and makes the internal structure of the Action consistent with other state structures.
<thumper> wallyworld: with you shortly
<wallyworld> ok
<waigani> thumper: I'm thinking of picking up this bug: https://github.com/juju/juju/issues/138
<waigani> thumper: I see there are two ssh clients: openssh and gocrypto embedded. Does the gocrypto save known_hosts?
<thumper> waigani: first thing, can you move that bug to launchpad?
<waigani> thumper: sure
<waigani> thumper: https://bugs.launchpad.net/juju-core/+bug/1334481
<_mup_> Bug #1334481: juju should not record ssh certificates of ephemeral hosts <juju-core:Triaged by waigani> <https://launchpad.net/bugs/1334481>
<thumper> waigani: can you also link that on github too? for the issue
<waigani> thumper: done
<waigani> added comment
<waigani> thumper: shall I do the same for this one: https://github.com/juju/juju/issues/133
<thumper> waigani: check to see if it has been done already, but yes
<waigani> though there has already been some discussion on github
<waigani> ok
<waigani> thumper: done, and linked on github
<thumper> waigani: yes, working on that issue would be good
<waigani> thumper: cool. I'll start with a failing test. So we will just not store the known hosts at all on ssh right?
<thumper> waigani: yeah... but just for juju ssh
<waigani> thumper: cmd/juju/ssh ?
<thumper> yup
<wallyworld> axw: got time for a quick hangout?
<axw> wallyworld: can you give me 5 mins please?
<wallyworld> sure
<axw> wallyworld: I'm in the tanzanite-daily hangout
<jcw4> thumper: fwiw, on PR 163, a lot of the tag/id stuff has been clarified with that last names package update I submitted
<thumper> heh...
<thumper> I'm just commenting on what I see
<thumper> :)
<jcw4> thumper: bodie_ told me this morning that he expected to have to do some refactoring once my change was in
<jcw4> thumper: :)
 * jcw4 is just nervous about when thumper's beady eye gets on my PR next
<thumper> jcw4: which PR is it you want me to look at?
<jcw4> 164
<jcw4> or not... y'know
<jcw4> if you need a nap or something...
<bodie_> thumper, cleaned up 163, btw -- it looks like you're commenting on the content I removed :(
<jcw4> all jokes aside, I'm *loving* the code review process on this project
<bodie_> it's still useful -- the code you're commenting on is for PR 140 and 141, so I can still make use of it
<thumper> bodie_: hmm... ok, what should I be looking at?
<bodie_> https://github.com/binary132/juju/commit/1de2d29aba97a422da32fcfde1a15c94e150e1ad
<jcw4> bodie_: is that because your PR 163 was rebased on top of your pending 140 and 141 ?
<thumper> bodie_, jcw4: something to be aware of, with the upcoming work on multi-environment state servers, all the document _id fields will change to include the env uuid
<bodie_> yes, I removed the condensed commit and pushed --force so I thought it would be clear it was gone from the PR
<bodie_> but, the rebased commit was just the same content from 140 and 141, so I could run tests
<jcw4> thumper: so if we hide that _id behind the public api of the state types we should be fine right?
<thumper> generally...
<rick_h_> wallyworld: :P how many bundles have you created by hand?
<wallyworld> rick_h_: none
<wallyworld> but it sure would be nice to have a cli for it
<rick_h_> wallyworld: sure, from an existing environment as a dump/backup.
<wallyworld> eg juju start bundle followed by a series of deploy, relate commands, then juju end bundle
<rick_h_> wallyworld: but anyway, replied. rubbing some ointment over the gui comment sting :P
<rick_h_> wallyworld: it's called a shell script, you can do that today
<wallyworld> rick_h_: sorry if it came out bad, wasn't intended
<rick_h_> wallyworld: I'm joking with you
<wallyworld> i just wanted to make the point that most of our target audience don't use guis
<rick_h_> wallyworld: except I don't think bundle creation is a cli/scriptable thing
<rick_h_> wallyworld: do we have a target audience now?
<wallyworld> devop people?
<wallyworld> i guess windows folks want a gui
<rick_h_> because if we're going to talk small devs I'll argue with you
<wallyworld> i'm a small dev
<wallyworld> i don't use guis
<rick_h_> but yes, at scale people want scriptable > * (*cough* thumper *cough*)
<wallyworld> yup
<rick_h_> wallyworld: never confuse you vs a target audience.
<rick_h_> I use vim and a terminal all day and the only gui app I run is a browser
<thumper> rick_h_: I'm going to make you happy and make it all script happy
<thumper> rick_h_: leave it with me :)
<rick_h_> thumper: :) hey wallyworld is preaching it too
<thumper> rick_h_: I've already cleared the approach with fwereade
<rick_h_> woot
 * rick_h_ does happy dance
<wallyworld> thumper: you have a doc to share on that?
<thumper> wallyworld: nope, it is inside my head, but not complex
<rick_h_> wallyworld: thumper and I were having this conversation yesterday so glad to see you chime in as well.
<wallyworld> thumper: must be simple to be in your head
<thumper> wallyworld: it is
<rick_h_> ouch
<rick_h_> maybe there's a GUI in it
<wallyworld> lol
<rick_h_> har har!
<wallyworld> you funny
<rick_h_> I try, it's past my bedtime
<wallyworld> careful, you'll turn into a pumpkin
<wallyworld> k
<rick_h_> half way there, let me get an orange shirt
<wallyworld> ll
<wallyworld> o
<wallyworld> and a green hat
<wallyworld> and a camera
 * thumper takes a deep breath and moves to the next PR
 * bodie_ hands thumper a bottle of water and cheers him on
<sinzui> thumper, wallyworld. A recent rev broke the win installer. We cannot compile it https://bugs.launchpad.net/juju-core/+bug/1334493
<_mup_> Bug #1334493: Cannot compile win client <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1334493>
 * sinzui forces a build of a revision before the win and os revisions
<thumper> bodie_, jcw4: some comments on PR 164
<jcw4> thumper: right behind you
<thumper> particularly the last one, as that is the biggest question I have
<jcw4> I'll comment on the pr thumper, but this goes back to that watcher point
<jcw4> if we have a watcher on the actions collection
<jcw4> and that watcher gets _id's for *free*
<jcw4> we can filter on those _id's without another db hit
<thumper> sure, but that doesn't answer the question
<jcw4> thumper: because there could be multiple actions with the same name
<thumper> jcw4: a key question is: "Is the combination of unit and action name unique?"
<jcw4> thumper: no
<thumper> why?
<thumper> what is the differentiating point that makes actions here special?
<thumper> how does a user differentiate?
<thumper> if I say "run the backup action" it may mean multple things?
<thumper> if so, why?
<thumper> or is this "an instance of someone running the backup action" ?
<jcw4> thumper: every time a user types 'juju do <action>' an Action gets queued on the actions collection using the assigned unit and <action> name
<jcw4> thumper: I may say the same command twice
<jcw4> thumper: intending it to run twice
<thumper> ok...
<thumper> how come an action doesn't have a user?
<thumper> or a date requested?
<thumper> I think an action should have a timestamp that it was created
<thumper> and who requested it
<jcw4> thumper: my very first PR for this document had unitName, timestamp, (no user), etc.
<thumper> heh
<jcw4> in discussion w/fwereade we eliminated the unitName because it would be encoded in the _id
<wallyworld> sinzui: looking
<jcw4> the timestamp was deemed unnecessary for now
<jcw4> thumper: the intent is for us to basically have a super lightweight 'tracer' implementation
 * thumper coughs
<jcw4> thumper: and then fill in the details later
 * thumper looks shiftily at fwereade's shadow
 * jcw4 feels guilty for throwing fwereade under the bus
<thumper> jcw4: what is the lifetime of an action?
<jcw4> fwiw, I think fwereade's case was sound
<thumper> when do we remove it?
<jcw4> thumper: as long as it takes for the unit to execute it.
<jcw4> probably minutes or seconds
<jcw4> usually
<thumper> so we end up with an action result?
<thumper> how long do they live?
<jcw4> forever
<jcw4> and ever
<thumper> ouch...
 * thumper forsees an issue
<thumper> we obviously have different definitions of lightweight
<thumper> to me remembering who asked and when is part of very lightweight
<jcw4> to be fair, we haven't discussed any archiving of the results yet
<thumper> when you record the result, you then have a timestamp for finish and can then deduce a duration
<thumper> jcw4: but results could be big right?
<jcw4> thumper: indeed
<thumper> jcw4: or do they point to locations on file?
<jcw4> not in the current implementation
<thumper> well... they could...
<jcw4> thumper: yep... tbh we hadn't thought that far ahead yet
<jcw4> (we being me)
<thumper> given that we want to back up the db periodically
<thumper> and I don't want all my postgresql database backups stored in mongo
 * davecheney shreeks
<thumper> sorry davecheney, bad moment to listen
<davecheney> i've been listening for a while
<davecheney> i just couldn't stand it any longer :)
<thumper> haha
<davecheney> http://paste.ubuntu.com/7703965/
<davecheney> still one more race in the state/apiserver package
<davecheney> i'm on it
<thumper> ta
<jcw4> thumper, davecheney to be fair we don't have *any* actions actually runnable yet, so the danger isn't there until we do :)
<davecheney> jcw4: anything that ends with 'my backups are stored in mongodb' is horrifying
<thumper> jcw4: so you are just going to hand us a hand grenade and say "here you go, juggle"
 * thumper chuckles
 * jcw4 wonders how to respond to that
<thumper> heh
<jcw4> well....
<jcw4> :)
<thumper> jcw4: we'll need a way for a user to say "please discard the results for this action now"
<davecheney> when the only tool you have is mongodb, everything looks like /dev/null
<thumper> davecheney: mongo is web scale
<davecheney> thumper: so's /dev/null
<jcw4> :)
<thumper> exactly
<davecheney> axw: wallyworld is there a race build in jenkins ?
<wallyworld> um
<jcw4> thumper: so... we're trying to build/define actions here as we go
<wallyworld> no
<davecheney> jcw4: that's going to end in tears
<thumper> haha
<wallyworld> we are considering it
<thumper> hmm...
<davecheney> possible race from sabdfl, possible sadness from your team
<davecheney> wallyworld: i'll add it to the weekly meeting notes as a discussion point
<wallyworld> sure
<davecheney> wallyworld: do you know the status of the release / upgrade ?
<davecheney> i was watching a bunch of reverts overnight
<davecheney> that then got reverted
<thumper> jcw4: so... one question
<wallyworld> davecheney: reverts were red herring, i think a few conclusions were jumped to
 * thumper tries to formulate...
<wallyworld> davecheney: someone broke the windows build, i'm fixing that now
<sinzui> wallyworld, the build of the older revision, the one that only reverts dave's rev
<davecheney> wallyworld: i think it was a good hunch
<jcw4> thumper, davecheney we've purposefully not exposed cli usage yet so that there's minimal exposure until we're done.
<thumper> jcw4: ack
<sinzui> wallyworld, This is the first time I have specifically tested a rev to get a pass
<thumper> jcw4: IMO, and fwereade may disagree, the id for any document should be composable from attributes in that document
<wallyworld> sinzui: you talking about the local upgrade?
<thumper> jcw4: so we don't need to parse the id to get attributes
<thumper> jcw4: especially if parts of said id are used in other places
<thumper> such as the tag
<sinzui> wallyworld, yes, but since dave's rev was immediately restored, CI never tested just the revision we wanted
<thumper> going from a set of attributes to an ID is easier than trying to do the reverse
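Thumper's suggestion can be sketched as follows: keep unit and sequence as real fields on the action document and *derive* the id from them, rather than parsing attributes back out of an id string. The field names and id layout here are illustrative, not juju's actual schema:

```go
package main

import "fmt"

// actionDoc keeps its attributes as first-class fields; the
// document id is composed from them, never the other way around.
type actionDoc struct {
	Unit     string
	Sequence int
}

// Id derives the document id from the doc's own attributes.
func (d actionDoc) Id() string {
	return fmt.Sprintf("%s_a_%d", d.Unit, d.Sequence)
}

func main() {
	doc := actionDoc{Unit: "mysql/0", Sequence: 3}
	fmt.Println(doc.Id())
}
```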
<sinzui> wallyworld, the rev could not be restored until CI had built juju without it
<jcw4> thumper: that makes sense; it feels a little redundant, but makes sense to me
<thumper> and the amount of data we are storing is minimal
<jcw4> thumper: +100
<thumper> seriously, minimal
<wallyworld> sinzui: oh, so that one *may* have broken upgrades? i thought we just got a passing CI test?
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1334500
<_mup_> Bug #1334500: state/apiserver: more data races <race-condition> <juju-core:Triaged by dave-cheney> <https://launchpad.net/bugs/1334500>
<wallyworld> thumper: https://github.com/juju/juju/pull/165 fixes a release blocker sinzui found
<jcw4> thumper, davecheney what we need is some way to incrementally build / design what we're doing, and get active feedback (like this), without a fully specified feature doc
<davecheney> i'll throw this back in the pool if I can't fix this by EOD
<sinzui> wallyworld, I think we got lucky. the test passed, yet there is a panic in it
<thumper> davecheney: ack
<thumper> wallyworld: looking
<davecheney> jcw4: yes, if you don't have that, success will be hard
<wallyworld> sinzui: the panic was just in a juju cmd
<sinzui> If this rev I am testing passes I will release it. That is all I want: a rev that passes that developers don't also say has a hidden bug
<thumper> wallyworld: lgtm
<wallyworld> it's fixed now but would have had little impact
<wallyworld> thumper: thanks
<wallyworld> thumper: i'm just going to hit merge directly so sinzui can rerun the windows build
<jcw4> davecheney, thumper I know you're in the middle of a couple other issues here; but we're also in somewhat of a tight spot because sabdfl is anxious for a version of actions that works end to end (even if it's very minimal)
<sinzui> wallyworld, I cannot
<sinzui> I am testing a previous rev
<thumper> jcw4: ok... please can we start by updating the state doc so the id is composed from other attributes?
<wallyworld> ah ok
<jcw4> thumper: +1
<sinzui> CI gets nasty if I try to make it change what is being tested
<thumper> jcw4: and if you decide not to add a timestamp and user, add a note that says thumper wants it there
<jcw4> thumper: absolutely, and I'll add jcw4 too
<thumper> \o/
<jcw4> thumper: I'll also add a note to ActionResults about the long term risk of not managing old results
<thumper> jcw4: I think that removing old action results must be part of the initial release
<thumper> otherwise crazy ensues
<jcw4> thumper: agreed
<thumper> probably something as easy as "juju action rm <resultid>"
<thumper> jcw4: so... actions are defined in the charm metadata, yes?
<jcw4> thumper: yes
<thumper> jcw4: do we do validation somewhere on action names being requested
<thumper> jcw4: is there a command to list action results?
<jcw4> thumper: yes, and will be
<thumper> jcw4: we are going to have to have user there... ASAP
<thumper> jcw4: because I will most of the time only be interested in seeing the actions I asked for
<thumper> jcw4: but I should be able to see all
<jcw4> thumper: interesting
<thumper> (assuming I have permissions)
<jcw4> thumper: makes sense
<thumper> jcw4: as an aside, we will probably have permissions fine-grained enough to say who can do what actions on which service
<jcw4> thumper: were you involved in the draft spec of Actions ?
 * thumper handwaves
<thumper> jcw4: not really, I think that was mostly sabdfl
<thumper> jcw4: although I have spent most of the last two weeks just writing specs
 * thumper sighs
<jcw4> :(
<jcw4> https://docs.google.com/document/d/14W1-QqB1pXZxyZW5QzFFoDwxxeQXBUzgj8IUkLId6cc/edit#heading=h.q6wtcjv2r9h
<jcw4> thumper: I think I want to capture a lot of your suggestions there
<thumper> heh
<thumper> jcw4: looks like the doc suggests a uuid for an action
<jcw4> yep
<jcw4> I don't recall if we explicitly discarded that idea or if it just slipped by us when we started worrying about filtering the events on the watcher
<thumper> jcw4: also notice that the spec shows that the action records when it was invoked
<thumper> that looks like a timestamp to me
 * jcw4 blushes
<jcw4> not for the first time tonight
<thumper> hmm...
<thumper> I do think that the design has gotten a little overcomplicated, in that we only need one action doc, not two
<jcw4> two?
<thumper> we should have the action results stored with the action
<jcw4> I see
<thumper> I don't think we need an ActionResult doc
<thumper> the result belongs to an action
<thumper> this way you don't need to copy fields across
<thumper> consider this:
<thumper> $ juju status action:UUID
<thumper> in the spec, there are two options:
<thumper> running, or failed
<thumper> this indicates to me that we are looking in one place to see the information
<thumper> which means a simple database query
<thumper> to get the action whether it is running or done
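Thumper's single-document design can be sketched as one action doc that carries its own status and, once finished, its result, so a `juju status action:UUID` style query is a single lookup. The field names and status strings are illustrative assumptions, not juju's actual schema:

```go
package main

import "fmt"

// actionDoc holds both the request and its eventual result in one
// document, so one query answers "running, done, or failed?".
type actionDoc struct {
	UUID   string
	Name   string
	Status string // e.g. "running", "done", "failed"
	Result string // empty until the action completes
}

func main() {
	doc := actionDoc{UUID: "u-1", Name: "backup", Status: "running"}
	fmt.Println(doc.Status, doc.Result == "")
	// When the unit finishes, status and result land in the same doc.
	doc.Status, doc.Result = "done", "backup written"
	fmt.Println(doc.Status, doc.Result)
}
```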
 * thumper takes a deep breath
<thumper> I feel a real design review coming along
<thumper> how much time do you have?
<jcw4> thumper: yes that makes sense... believe it or not, we started there and currents and eddies along the way pushed us to the two docs we have now
<jcw4> I *want* to go for hours
 * thumper smiles
<jcw4> I *should* have been off hours ago
<jcw4> :)
 * thumper looks in trunk
<thumper> hmm...
 * thumper goes back to the spec
<thumper> jcw4: ok where should I dump my thoughts?
<thumper> jcw4: I don't want to put them in the spec
<jcw4> How about an email to the list?
<thumper> jcw4: do you have a design spec?
<thumper> um... yeah... ok
<thumper> more potential for bikeshedding
<jcw4> I almost emailed the list a couple days ago, but didn't
<thumper> but ok
<jcw4> thumper: that's true
<thumper> lets try it :)
<jcw4> we started a couple spec docs, but nothing worth sharing
<jcw4> Maybe you might craft a new doc and link to it from an email?
<thumper> one may fall out of the conversation
<jcw4> thumper: ack
<jcw4> <--- did you notice that?
<jcw4> ;)
<jcw4> learning new catch phrases as I go
<thumper> jcw4: nice
<wallyworld> davecheney: i'm not sure the data races are critical blockers for the 1.19.4 release - so long as CI is happy, we can fix them post release
<davecheney> wallyworld: sure thing
<davecheney> you're the judge
<wallyworld> but if you can fix quickly....
<davecheney> i'm fixing it anyway
<davecheney> but please let's not block this release any further
<wallyworld> great, may be able to sneak it in :-)
<wallyworld> yup
<wallyworld> that was the thinking
<wallyworld> i'll take off the 1.19.4 milestone
<wallyworld> sinzui: so what's the verdict with the release at the moment?
<sinzui> If I must release, I can use a8f48d14 which is before the 1.18.x upgrade fix
<sinzui> The revision under test has that fix, is before the win build broke, and might be without the local precise upgrade problem
<wallyworld> looks reasonable
<wallyworld> does github show commits in order?
<thumper> jcw4: sent
<wallyworld> in order of merging?
<thumper> sinzui: remember that I want to disable the user command for the 1.20 release
<wallyworld> sinzui: if the commits are in order, the 1.18 upgrade fix predates commit a8f48d14 doesn't it?
<thumper> sinzui: but really just in the release branch
<wallyworld> and yes, i just checked that rev, and my 1.18 fix is there
<wallyworld> the one to stop peer grouper publishing empty api addresses
<thumper> jcw4: sorry I haven't looked in earnest before now
<jcw4> thumper: just one more thing on the list... I'm glad you've done what you have :)
<thumper> np
<sinzui> thumper, wallyworld I am honestly just looking for a passing rev. We wanted to release this week, so that rev would not have had any of these fixes. I was a moron for not choosing 6a2c202d when it passed 2 days ago
<thumper> sinzui: you still can right?
 * thumper is about to leave and cook
<thumper> 2 hours of meetings tonight
<davecheney> wallyworld: fix coming up, 20 seconds
<davecheney> FUCK
<davecheney> my working copy is screwed
<sinzui> I will release the newest passing rev tomorrow, when I am awake enough to not make a mistake
<wallyworld> sinzui: next time we'll branch off a release candidate so commits to trunk don't screw us
<davecheney> https://github.com/juju/juju/pull/166
<davecheney> wallyworld: pls review
<wallyworld> looing
<sinzui> wallyworld, Next time I won't listen to developers saying we have a blocking bug.
<davecheney> i'm removing that errant werker.yml file
<wallyworld> looking
<sinzui> wallyworld, I care about regressions in the recent commit. I don't care about something that has been broken for weeks or months
<wallyworld> sinzui: agreed. if something does come up and it "needs" to be marked critical, we should then get consensus at least
<sinzui> wallyworld, in the case of the upgrade bug, we took more time coming to consensus than fixing it. maybe a deadline is more important. time-based releases work best when we just release
<sinzui> wallyworld, 1.19.3 was made from a week-old rev because trunk was broken
<wallyworld> yeah, if we can keep to a short enough release cadence
<davecheney> thumper-afk: thanks for the review
<sinzui> wallyworld, maybe we need to stop the line. no one lands a branch until the critical is fixed... no one adds another regression until we fix the current one
<davecheney> anyone else ?
<wallyworld> davecheney: was it just that one attr?
<wallyworld> sinzui: i'd rather branch
<wallyworld> that way trunk development is not held up
<davecheney> wallyworld: thanks gents
<davecheney> i'll submit this now
<wallyworld> sinzui:  a few days before release cut a 1.19.4 rc branch
<davecheney> this doesn't feel like the root cause of the mongo panics
<davecheney> i'll keep looking
<wallyworld> and test it, and address any blockers on that branch
<wallyworld> davecheney: awesome for looking, thank you
<sinzui> wallyworld, when do we branch? trunk has been broken most of this month
<davecheney> thumper-afk: fyi - i'm going to stop landing names changes for the immediate
<davecheney> until
<wallyworld> sinzui: fair point. ideally, i'd say stop the line when CI breaks. but with unreliable clouds.....
<davecheney> a. 1.19.4 lands
<davecheney> b. i can find those f'n races
<sinzui> wallyworld, exactly. I am awake now because azure and joyent cannot be trusted
<wallyworld> davecheney: did the race detector pick up that envuuid one?
<wallyworld> sinzui: that's the root cause of some of our "trunk is broken" issues
<wallyworld> because we can't *enforce* a "you break it, you fix it" approach
<wallyworld> because we can't trust why CI is broken
<sinzui> wallyworld, HA is the root cause of most trunk brokeness, followed by API
<wallyworld> ah, true
 * jcw4 goes to bed... 
<wallyworld> more specifically, mongo is terrible
<wallyworld> and mongo + replicaset is even worse
<wallyworld> but mongo is web scale :-/
<davecheney> wallyworld: yup
<davecheney> i am *sure* there are other races
<davecheney> but right now, we can't see the wood for the trees
<wallyworld> davecheney: there indeed are, but if you fix the apiserver one, then woot
<wallyworld> there's another in our watcher shutdown code
<wallyworld> causing session closed errors
<davecheney> wallyworld: on it
<wallyworld> davecheney: you know the error i mean?
<davecheney> wallyworld: and it was the one we hoped jam's PR would fix
<wallyworld> davecheney: bug 1305014
<_mup_> Bug #1305014: panic: Session already closed in TestManageEnviron <intermittent-failure> <test-failure> <juju-core:Triaged by rogpeppe> <https://launchpad.net/bugs/1305014>
<wallyworld> that's the main one i think
<davecheney> shit
<davecheney> that was supposed to be fixed
<davecheney> we spent three days agonising over the bloody fix
<davecheney> before it landed
<wallyworld> oh?
<davecheney> wallyworld: this is _not_ the change that you hulk smashed for jam last night ?
<wallyworld> before the api races over the past week, that session closed one was the main reason tests failed
<wallyworld> nope
<wallyworld> my change was about stopping the peer grouper publishing empty apiaddress lists
<wallyworld> if that's the one you are referring to
<davecheney> right
<davecheney> on it then
<davecheney> wallyworld: is it always one package that blows up with the session already closed error?
<wallyworld> axw: wtf does this mean. i haven't pushed to my branch since last time, yet it is saying my local branch is behind its remote counterpart
<wallyworld> git push origin managed-resources
<wallyworld> To https://github.com/wallyworld/juju
<wallyworld>  ! [rejected]        managed-resources -> managed-resources (non-fast-forward)
<wallyworld> error: failed to push some refs to 'https://github.com/wallyworld/juju'
<wallyworld> davecheney: mostly i think
<rogpeppe> davecheney, wallyworld, axw: hiya
<wallyworld> the tracebacks in the bug report should show it
<wallyworld> hi
<davecheney> wallyworld: i'll figure it out
<davecheney> ... me waits for tests to run
<wallyworld> davecheney: sorry, yeah i don't have the info to hand, i'd need to go and look at the bug report
<davecheney> wallyworld: meh, i'll live
<wallyworld> axw: wtf, now the pull request diff shows all the changes to trunk after I did the initial proposal
<wallyworld> i did rebase my branch so i could confirm there were no conflicts with trunk since it may have bit rotted
<wallyworld> how the fark then do you do that and not have github mess everything up? this stuff just worked flawlessly with launchpad :-(
<jam1> wallyworld: "i did rebase" sounds like the start of your problems
<jam1> rebase throwing away history means DAG related operations lose context (IMO)
<jam1> wallyworld: so if you merge trunk, and then rebase that later
<jam1> all those changes look like you introduced them (I believe)
<jam1> depends on if rebase throws out the merge commit or not
<wallyworld> jam1: i thought rebase in this case simply moved your stuff out the way, merged in tip of trunk, and put your changes back?
<jam1> wallyworld: well in your history one of your changes is merging trunk, right?
<wallyworld> jam1: sure, but with launchpad, that all just worked
<jam1> anyway, it isn't something I've used tremendously
<jam1> wallyworld: you never rebased in LP
<jam1> you don't *have* to rebase in git
<wallyworld> so how else do i bring in trunk and not have my wip commits all sprined through?
<wallyworld> sprinkled
<jam1> wallyworld: live with them being sprinkled, like we did with LP
<wallyworld> well, not really
<jam1> wallyworld: they were hidden by default, but you can get that with "git log --first-parent" as well
<wallyworld> when you did the merge into trunk in lp, your merge commit was correctly placed in the timeline
<wallyworld> so if i $$merge$$ what's there now, i assume all the commits already in trunk will be ignored and just my new stuff will go in?
<jam1> wallyworld: well, it should try to merge the two, hopefully the changes that you brought in from trunk just apply cleanly
<jam1> You can try just doing "git merge master --no-commit"
<jam1> and see if that works without conflict.
<jam1> (or upstream/master, or however you relate to the github.com/juju/juju master branch)
<wallyworld> ok. the rebase workflow i got from rick - that's how he brings in trunk to ensure his work is sane with tip
<wallyworld> what should i use instead?
<jam1> wallyworld: "git merge master" is what I would ued
<wallyworld> pull upstream/master maybe
<jam1> use
<jam1> 'pull' might work, as I think it is just fetch+merge
<wallyworld> ok, will try that
<jam1> I'm just not sure if it is also going to change the defaults for "fast forward"
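[Editor's note] jam1's merge-based alternative to rebasing can be demonstrated end to end. This is a sketch on a throwaway repo; the branch name `managed-resources` comes from the conversation, everything else is illustrative:

```shell
#!/bin/sh
# Merge trunk into a feature branch instead of rebasing, so previously
# pushed history is never rewritten and no force-push is needed later.
set -e

tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git config user.email dev@example.com
git config user.name dev
trunk=$(git symbolic-ref --short HEAD)   # master or main, depending on git version

echo base > file
git add file
git commit -qm "base"

# Feature branch with local work.
git checkout -qb managed-resources
echo feature >> file
git commit -qam "feature work"

# Meanwhile trunk moves on.
git checkout -q "$trunk"
echo trunk > other
git add other
git commit -qm "trunk change"

# Bring trunk in with a merge: one merge commit, no rewritten history,
# so a later plain push (no -f) still works.
git checkout -q managed-resources
git merge --no-edit "$trunk"

# Hide the merged-in trunk commits when reading history, similar to
# bzr's collapsed default log view that jam1 mentions.
git log --first-parent --oneline
```

To preview conflicts without committing, as jam1 suggests, `git merge --no-commit <branch>` followed by `git merge --abort` works the same way against a real remote.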
<wallyworld> i really don't understand the love for git
<jam1> wallyworld: I don't know if it is love as much as "it does the job, it was popular so everyone jumped on the bandwagon, and switching tools is hard so I always prefer the one I know"
<jam1> and *probably* a little bit of Stockholm syndrome, "this was hard so it must be good"
<davecheney> wallyworld: good news and bad news
<wallyworld> sure. i wouldn't mind switching if git were better than bzr
<davecheney> good news: i found the data race
<davecheney> bad news: its in the upgrade code
<davecheney> which is probably why you guys can't cut a release
<wallyworld> davecheney: which one? that watcher/session closed?
<davecheney> wallyworld: this shit takes _so_ long to run, i'm only reporting what I can see
<davecheney> the more I look, the more i'll find
<wallyworld> ok
<wallyworld> davecheney: upgrade only fails on precise/local
<wallyworld> works on other clouds and series
 * davecheney reaches for table leg
<jam1> wallyworld: so a few things that I can concretely say are better: a) git commit is faster for really big trees and lots of history, b) git push/pull logs into github faster than Launchpad, because of LP limitations that I tried to fix, but ran into odd bugs and never got time to finally address, c) the actual transfer times are also a lot faster, d) colocated branches by default are a BigDeal(tm) that you could configure Bazaar to do well, but not out of the box
<wallyworld> jam1: so i came back to work on this branch after several days. when i went to push, it complained about my branch is behind remote counterpart and to pull. so i did, but there were conflicts which precisely corresponded to the changes i had made locally, and i had to resolve by "accept mine"
<wallyworld> jam1: bzr handles history and file renames better too
<jam1> wallyworld: that sounds like a mistaken set of targets for your push and pull
<jam1> wallyworld: bzr's view of history (default in log) is beautiful (IMO)
<jam1> *but*
<jam1> it is very expensive to compute
<jam1> as it is O(allhistory)
<wallyworld> i push/pulled from origin/<branch> where origin is gh.com/wallyworld/juju
<jam1> so we paid a lot of user visible performance, and didn't push hard enough for how much better it actually presents history
<wallyworld> sure, but computers are fast
<wallyworld> how fast is fast enough
<jam1> wallyworld: I certainly have my bias in that
<jam1> but it didn't actually win hearts and minds
<davecheney> wallyworld: faster than mercurial would be fine :)
<wallyworld> yup :-)
<axw> wallyworld: sorry was out. did you sort the PR issue?
<jam1> wallyworld: also, when we had breakpoints like trying to get Mozilla (lost to hg) we were *very* slow because we were using a bad format.
 * axw hasn't read all the history yet
<jam1> we fixed that format in the next release
<jam1> but too late
<jam1> same thing for python's switchover
<wallyworld> axw: it's all screwed. if you look, you'll see my latest commits at the end
<jam1> we had an improvement in the works, but it didn't land before they made their decision.
<wallyworld> yeah :-(
<jam1> mercurial has a strong advantage that they didn't try to abstract things
<jam1> they supported 1 format and focused tightly on it
<jam1> git and hg both went with the "sync to local is important, remote support is not" while Bazaar abstracted out "I can treat anything as just another branch"
<jam1> which also cost Performance and developer time
<jam1> wallyworld: but it means you can "bzr log lp:juju-core" whereas you can't do that with git
<wallyworld> i do like that about bzr
<jam1> git only supports sync to local, and then you log, etc locally
<wallyworld> a lot
<wallyworld> yep
<davecheney> sadface, http://paste.ubuntu.com/7704302/
<jam1> wallyworld: but it means the primitives for log, etc, know that they have a local file they can just mmap, etc.
<wallyworld> indeed
<wallyworld> davecheney: funny, that test never fails in practice
<wallyworld> we have other races in production code i'd be more interested in fixing
<davecheney> wallyworld: i think it's not a real race, it's just in the cleanup code, like most of our races
<wallyworld> ok
<axw> wallyworld: I suspect you rebased on something other than upstream/master
<wallyworld> axw: i rebased on master (local)
<wallyworld> after pulling in tip from remote master
<wallyworld> jam1: so the pr on github doesn't seem to show the latest diff vs tip of trunk like lp does after you just push shit up
<wallyworld> jam1: because all of the noise in the pr now are actually commits in juju master
<jam1> wallyworld: that I don't really know github, it is possible they find the ancestor they want when they start, and then they just  stick with that one for the rest of the review
<wallyworld> they should be ignored
<jam1> wallyworld: launchpad actually does a merge without committing it, and shows that diff
<jam1> which means it can even show you conflicts, etc.
<wallyworld> i suspect you are right which makes me very sad
<axw> jam1 wallyworld: yep, ancestor only for the initial diff AFAIK
<jam1> vs just "diff from common ancestor"
<wallyworld> that sucks balls
<wallyworld> really
<wallyworld> how do so many people work that way?
<wallyworld> makes it very hard to have work in progress
<axw> wallyworld: do you want to have a hangout and screen share to fix it?
<wallyworld> ok
<axw> brb
<axw> wallyworld: in the tanzanite hangout
<wallyworld> axw: changes pushed
<axw> wallyworld: cool, looks happier now
<axw> nfi what happened before though
<wallyworld> axw: still had to do a push -f even the second time
<axw> wallyworld: yeah because it failed to push before
<axw> wallyworld: every time you rewrite history you have to do that
<axw> force push that is. you can only push without force if previously pushed history is unchanged
<jam1> wallyworld: axw: who worked on "consider retry loop for failing direct db operations" ?
<jam1> It looks like a card your team would have worked on, but nobody is assigned
<wallyworld> jam1: no one yet
<jam1> wallyworld: it was in the 'merged' column as of last week
<jam1> when I moved everything from merged into the archive
<jam1> wallyworld: should it be pulled out somewhere?
<wallyworld> that wasn't intentional
<jam1> wallyworld: ok, put it in your todo then?
<axw> accidental, should be in backlog or deleted I think
<wallyworld> i think it can be deleted now
<wallyworld> no need for it atm
<jam1> wallyworld: and I'm pretty sure menn0 was the one who worked on "Show relation name in status output", correct?
<jam1> bug #1194481
<_mup_> Bug #1194481: Can't determine which relation is in error from status <hours> <observability> <ui> <juju-core:Fix Committed by menno.smits> <https://launchpad.net/bugs/1194481>
<wallyworld> yep
<wallyworld> i think so
<jam1> wallyworld: is there a user for "unit tests fail on utopic" ?
<wallyworld> jam1: is that a completed card?
<wallyworld> i fixed a couple of those
<jam1> bug #1325072
<_mup_> Bug #1325072: unit tests fail on utopic <ci> <test-failure> <utopic> <juju-core:Fix Committed by wallyworld> <juju-core 1.18:Fix Released by wallyworld> <https://launchpad.net/bugs/1325072>
<jam1> wallyworld: that is from "Week Ending June 6"
<wallyworld> sounds right
<jam1> wallyworld: k, I'm writing a script that pulls out stuff like velocity via the Kanban API and it's showing some holes in our old labels
<jam1> nothing too bad, and I probably won't worry much farther back
<wallyworld> ok
<jam1> bug #1281394
<_mup_> Bug #1281394: uniter failed to run non-existant config-changed hook <regression> <juju-core:Fix Released> <https://launchpad.net/bugs/1281394>
<axw> wallyworld: you changed the name of the result error but didn't change the defers
<wallyworld> oh ffs, sigh
<wallyworld> will fix
<wallyworld> axw: done
<axw> wallyworld: thanks, reviewed
<wallyworld> thank you
<wallyworld> was a good review
<davecheney> http://paste.ubuntu.com/7704476/
<davecheney> i'm trying to fix the race in the upgrade test
<davecheney> but now it fails constantly on the safety check i put in
<davecheney> goroutine 4930 [sleep]:
<davecheney> time.Sleep(0xdf8475800) /home/dfc/go/src/pkg/runtime/time.goc:39 +0x31
<davecheney> github.com/juju/juju/state/api.(*State).heartbeatMonitor(0xc20822d5e0, 0xdf8475800) /home/dfc/src/github.com/juju/juju/state/api/apiclient.go:264 +0x66
<davecheney> created by github.com/juju/juju/state/api.Open /home/dfc/src/github.com/juju/juju/state/api/apiclient.go:196 +0xae3
<davecheney> we leak a shitload of these goroutines
<davecheney> in the tests
 * davecheney creates issue
<wallyworld> jam1: i'm off to soccer, but maybe you could get someone to look at why we continue to have very limited success with CI passing the local upgrade test only on precise. the latest machine-0 log from the failed test shows nothing obvious to me - previously there were errors in the log which showed why api server on port 17070 didn't start. only thing i can see is an apt get of a mongo-server package in the middle of the re-start after
<wallyworld> upgrade initiated. could just be log interleaving, not sure. here's a link to the latest failing job from which the machine-0 log can be got http://juju-ci.vapour.ws:8080/job/local-upgrade-precise-amd64/1436/
<wallyworld> curtis considers this a release blocker
<wallyworld> the apt get mongo-server thing does look like the only suspicious thing i can see that may be different to trusty
<jam1> wtf... 10 minutes ago leankit was reporting 700 cards, it now only reports 225. it just decided that our archive was old enough it could throw it away.... ?
<jam1> wallyworld: k, I'll try to give it a look
<davecheney> jam1: there must be some cutoff
<davecheney> and many of those cards are OLD
<davecheney> many of them date back to Atlanta
<jam1> davecheney: sure, but 10 minutes ago it gave me 700, I don't think we crossed the threshold in 10 mins
<davecheney> jam1: dunno, just trying to help
<davecheney> i'm probably not helping
<wallyworld> jam1: i was thinking you'd delegate to someone
<jam1> wallyworld: I'm pretty good at debugging stuff like this, so I'll at least give it a shot.
<wallyworld> ok
<wallyworld> jam1: frustratingly it passes sometimes
<jam1> wallyworld: well if it is a racy install of stuff, and sometimes we manage to install first
<wallyworld> jam1: yeah, i didn't get to look to see at what stage we apt installed, i only just looked at the log
<jam1> wallyworld: "2014-06-26 06:26:41 INFO juju.state open.go:337 found existing state servers []
<jam1> "
<jam1> sounds problematic...
<davecheney> erk
<jam1> I don't know that it is the specific problem, that is in "cloud-init" so maybe no servers are available during the first connect, but it does seem weird.
<TheMue> morning
<sinzui> wallyworld, jam1: I am off to get some sleep. I will release the blessed revision from this page, http://juju-ci.vapour.ws:8080/job/revision-results/ it will probably be a8f48d14 because I don't believe trunk will get better in a few hours
<sinzui> wallyworld, jam1 The rev I forced CI to test will pass, though I still believe local precise upgrades are dodgy
<jam1> sinzui: so you think tip will pass, but we still should be investigating getting reliable P upgrades, right?
<sinzui> jam1 I didn't test tip. I tested an older rev that was skipped
<jam1> sinzui: do you mean a8f48d14
<jam1> or something else? as I don't see any other revs being tested in "revision-results"
<jam1> sinzui: the current local-upgrade-precise-amd64 is still blinking red, afaict
<sinzui> jam1 I tested 1d57f52
<jam1> sinzui: http://juju-ci.vapour.ws:8080/job/local-upgrade-precise-amd64/ shows that rev as failing 3 times
<sinzui> Jam1 yes I am waiting for the destroy-env to complete http://juju-ci.vapour.ws:8080/job/local-upgrade-precise-amd64/1439/console
<sinzui> ^ that is a pass
<sinzui> but I dare not hurry destroy-env for fear that the act will cause an error
<jam1> sinzui: so blinking red is because it was red in the past but is running now?
<sinzui> jam1 yes
<sinzui> not obvious
<sinzui> lxc-destroy is taking forever
<mivtachyahu> apologies for the cross-post, but has anyone ever seen a bug where juju confuses what machines are which machine numbers?
<sinzui> mivtachyahu, I haven't seen that before
<mivtachyahu> I've come into work this morning to find that all the servers are jumbled up, ie what was machine 7 yesterday is machine 12 today
<sinzui> mivtachyahu, yes, that happens; machine numbers cannot be reused, so a number given to a machine that is added then removed also removes the number forever
<mivtachyahu> ah, no, you misunderstand, 7 is now 12, 12 is 17, 17 is now 8, 8 is now 7, they're jumbled, not removed.
<mivtachyahu> (those numbers illustrative, I've not mapped which machines are actually which)
<sinzui> juju-ci's highest machine number is 52, but there are only 10 active machines
<sinzui> mivtachyahu, that is mad. How do you know they are jumbled? the ip addresses?
<sinzui> wallyworld, jam1, all the circles are blue http://juju-ci.vapour.ws:8080/
<mivtachyahu> when I juju ssh <machine number> they have the wrong contents, when I issue a juju status, the units are showing on the correct machine *numbers*, but the public-addresses have changed.
<jam1> sinzui: so we still have a chance for trunk tip if we get fixes, but we expect to release 1d57f52
<sinzui> jam1. You do. so CI has about 4 hours to work
<axw> mivtachyahu: which version of juju, and which provider type?
<mivtachyahu> juju 1.18.1 and on azure.
<axw> ok, nothing comes to mind. if it were in the 1.19 series then I'd be blaming availability sets because service units get a single load balanced IP
<sinzui> wallyworld, jam1. I reopened https://bugs.launchpad.net/juju-core/+bug/1334493 because juju doesn't execute after it is compiled on windows
<_mup_> Bug #1334493: Cannot compile win client <regression> <windows> <juju-core:Triaged by wallyworld> <https://launchpad.net/bugs/1334493>
 * sinzui tries to rebuild and hopes for the best
<axw> wallyworld: AFAICT, the Azure vhds cannot be reused. each one is a disk image for a separate VM instance, like you'd have if you were running VMs in VMWare or VirtualBox
<axw> wallyworld: i.e. they're not pristine OS images, but VM disks
<axw> wallyworld: going to move that card to "done"
<sinzui> axw, thank you for investigating that.
<axw> sinzui: nps
<axw> sinzui: I think we used to leak those VHDs because we were using a more error prone method of deleting disks before
<axw> sinzui: I switched the code over to using an API that deletes all associated disks when we terminate VMs
<axw> I think it's only in the 1.19 series tho
<sinzui> axw, the official api didn't let you delete them when you deleted VMs until a few months ago
<sinzui> I had to upgrade the libraries we use to delete them
<mivtachyahu> well, good news, my weird bug has fixed itself. :)
<axw> weird indeed. mivtachyahu if you stumble across the steps to reproduce the issue, please file a bug (or ping someone in here to do so)
<mivtachyahu> will do
<axw> davecheney: this is committed, right? https://bugs.launchpad.net/juju-core/+bug/1334500
<_mup_> Bug #1334500: state/apiserver: more data races <race-condition> <juju-core:In Progress by dave-cheney> <https://launchpad.net/bugs/1334500>
<davecheney> axw: yes, committed
<davecheney> sorry, i didn't update the status
<axw> nps
<jam> axw: I believe it is, but due to the revision that sinzui was actually able to get to pass CI, it probably won't be in 1.19.4
<jam> davecheney: fwiw, we really don't need a get+set operation; just a simple mutexed get that populates the cached value if it is empty would have been a better fit.
<davecheney> jam: i didn't want to hold the lock over that other operation
<jam> davecheney: given the whole point is that it is just a cache, I don't think we want to trigger the operation 2x while getting it. but it isn't like it is a big deal.
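[Editor's note] jam's preferred pattern, a single mutexed get that populates the cache on first use, can be sketched as follows. The names are invented for illustration, not juju's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// cachedGetter caches the result of an expensive operation, computing
// it at most once.
type cachedGetter struct {
	mu    sync.Mutex
	val   string
	ok    bool
	fetch func() string // the underlying operation being cached
}

// Get holds the lock across the fetch, so concurrent callers block
// briefly rather than triggering the operation twice -- jam's point
// about not wanting the fetch to run 2x for a value that is just a
// cache.
func (c *cachedGetter) Get() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if !c.ok {
		c.val = c.fetch()
		c.ok = true
	}
	return c.val
}

func main() {
	calls := 0
	c := &cachedGetter{fetch: func() string {
		calls++ // safe: only ever runs under c.mu
		return "env-uuid"
	}}
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Get()
		}()
	}
	wg.Wait()
	fmt.Println(c.Get(), "fetched", calls, "time(s)") // fetch ran exactly once
}
```

davecheney's counterargument is the tradeoff visible here: the lock is held for the full duration of the fetch. When the value never needs invalidating, `sync.Once` expresses the same populate-once semantics.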
<axw> davecheney: I'm gonna have a look at the leaking heartbeat goroutine bug
<axw> seems to be a bunch of api.Opens without corresponding Closes.
<jam> axw: we seem to do that a fair bit in the test suite, I've caught a few in the past.
<jam> Once you have a couple Copy & Paste helps ensure it spreads :)
<jam> axw: in other code bases we had things like "ensure 0 threads are running when a test ends"
<axw> yeah, that'd be nice
<axw> perhaps we should add that to the base suite's TearDownTest
<jam> axw: well, we could try to move towards that, I think we'd find a lot of problems to start with.
<jam> axw: I also don't know if golang gives you a great view of "what is running", but probably it does somewhere
<jam> dimitern: TheMue: vladk|offline: just a reminder that we're skipping our daily standup for the team standup in 10 min
<axw> jam: we could compare runtime.NumGoroutine() before/after test run. I expect you're right and it'd be painful initially
<dimitern> jam, ok
<jam> axw: so for threads we compared the set of thread ids at start and end.
<jam> axw: so set(at_end) - set(at_beginning) must be empty
<jam> axw: though my google-fu says "you can't get a list of all running goroutines"
<jam> I know that you can, given that panic can print it out, but I imagine using that trick would be really really bad :)
<axw> yeah. it would be nice to compare sets, but I think just comparing size would be good enough
<jam> axw: (runtime.Stack(buf, all=true) and then parsing that for what is running)
<perrito666> morning everyone
<jam> axw: main problem with just doing the count is that it sometimes passes accidentally, and it still doesn't give you any information about what is running that shouldn't be, which you need to go fix.
<jam> In that respect, the runtime.Stack() method actually isn't terrible, as you could print out "these goroutine stacks are running and probably shouldn't be"
<axw> jam: true, though in that case you could just dump runtime.Stack(..., all)
<jam> axw: or as I was pointing out, you could just use Stack(…, all) and use that for set difference
<axw> yes, I suppose you could compare entry points
<jam> TheMue: just a reminder you're OCR today
<TheMue> jam: sure, already done first
<TheMue> first ones
<jam> TheMue: great
<TheMue> jam: made a calendar entry for it to not forget it ;)
<jam> TheMue: :), team standup now
<TheMue> jam: yeah, here also my calendar reminded me
<TheMue> afk for lunch
<vladk> dimitern: ping
<vladk> dimitern: I created WatchInterfaces. My current problem is that it's now impossible to add network interfaces after they were provisioned.
<vladk> I can remove this check from machine.go, but this breaks some tests.
<vladk> dimitern: Otherwise, I can't test the watcher when I add the network interface
<dimitern> vladk, there was a slight change
<vladk> dimitern: what do you mean?
<dimitern> vladk, jam, fwereade and i discussed and we can use a notifywatcher instead of a stringswatcher
<dimitern> vladk, jam, for the network interfaces
<dimitern> vladk, jam, that way we don't need to care about tags for interfaces
<dimitern> vladk, as for your question, you'll need to change AddNetworkInterface slightly, so it doesn't fail when the machine is provisioned
<vladk> dimitern: this breaks some of the tests, so I need to fix them, too
<dimitern> vladk, i.e. assertAliveAndNotProvisioned becomes aliveDoc, and the if m.doc.Nonce != "" needs to go
<dimitern> vladk, yep, naturally
<vladk> dimitern: should I change stringswatcher to notifywatcher?
<dimitern> vladk, yes
<dimitern> vladk, i'm updating the model doc today to reflect what we discussed
<dimitern> vladk, that's the only thing affecting your work now
<vladk> dimitern: may I do a PR with the stringswatcher to get quick feedback and change it later?
<dimitern> vladk, of course
<vladk> thanks
<vladk> dimitern: please, review https://github.com/juju/juju/pull/169
<wallyworld> mgz: i'm still in a meeting, i'll ping you soon for 1:1
<mgz> wallyworld: sure, I'll hang out there for when you arrive
<wallyworld> be there soon
<rogpeppe2> trivial update to dependencies.tsv, anyone? https://github.com/juju/juju/pull/177
<rogpeppe2> fwereade, dimitern, mgz, natefinch, wwitzel3: ^
 * rogpeppe2 thinks it's trivial enough to just merge anyway
 * rogpeppe2 does that
<TheMue> rogpeppe2: taking a look
<TheMue> rogpeppe2: argh, too quick
<TheMue> rogpeppe2: ;)
<rogpeppe2> TheMue: that's ok - there's not exactly much to review...
<TheMue> vladk|offline: made some comments
<TheMue> rogpeppe2: have to compare this nice number to available revisions :D
<rogpeppe2> TheMue: the 'bot will complain if it doesn't work...
<TheMue> rogpeppe2: taedd? (trial-and-error driven development)
<rogpeppe2> TheMue: with changes that simple, it seems reasonable to me
<TheMue> rogpeppe2: yep
<fwereade> natefinch, I'll be with you soon
<lazypower> Greetings juju-core. There was an LXC update this morning that wipes mount fstype=rpc_pipefs, if i recall correctly this causes problems with containers does it not?
<lazypower> http://i.imgur.com/tjSkSG6.png
<TheMue> rogpeppe2: seen that the merge failed?
<rogpeppe2> TheMue: no i hadn't. thanks
<TheMue> rogpeppe2: yw
 * rogpeppe2 wants to work out a decent way to get an obvious warning when a merge fails
<TheMue> +1
<rogpeppe2> oh bugger, it's been changed to break the API
<rogpeppe2> i'm stuffed now
<rogpeppe2> because the new charm changes require the new names package
 * rogpeppe2 wonders why all those tag changes needed to happen
<rogpeppe> hmm, i guess i'll just hack around the issue for now
<lazypower> wrt my question above, here's a bug that was filed that shows the behavior: https://bugs.launchpad.net/juju-core/+bug/1319525
<_mup_> Bug #1319525: juju-local LXC containers hang due to AppArmor denial of rpc_pipefs mount with local charms <local-provider> <lxc> <juju-core:Invalid> <lxc (Ubuntu):Incomplete by tyhicks> <https://launchpad.net/bugs/1319525>
<perrito666> rogpeppe: "for now"®
<perrito666> is there anything in place to tell a machine "hey, apiserver and stateserver have changed" ?
<rogpeppe> perrito666: the state server addresses should change
<rogpeppe> perrito666: and they can be watched
<perrito666> rogpeppe: come again please, I cannot join those two things you just said into something I understand
<rogpeppe> perrito666: :)
<rogpeppe> perrito666: what are you trying to do?
<perrito666> rogpeppe: restore ;)
<perrito666> current restore ssh's into all of the agents and runs a sed script to change apiadresses and stateaddress
<perrito666> I really would like to do something prettier
<rogpeppe> perrito666: well, i think ssh'ing in is probably the only option
<rogpeppe> perrito666: but what you do *when* you've ssh'd in could be prettier
<rogpeppe> perrito666: you could add a jujud subcommand which updates the addresses in the agent.conf file
<rogpeppe> perrito666: and then invoke that from the ssh command
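rogpeppe's jujud-subcommand idea boils down to rewriting the address lists in agent.conf rather than sed'ing them. A minimal line-based sketch, assuming a simplified agent.conf-like YAML layout (the real file format and any such subcommand are assumptions here, not actual juju code):

```go
package main

import (
	"fmt"
	"strings"
)

// replaceListKey rewrites the "- item" entries under a top-level YAML key
// (e.g. "apiaddresses") with a new set of values. Line-based on purpose:
// a sketch of the idea, not a real YAML parser.
func replaceListKey(conf, key string, values []string) string {
	lines := strings.Split(conf, "\n")
	var out []string
	i := 0
	for i < len(lines) {
		out = append(out, lines[i])
		if strings.TrimRight(lines[i], " ") == key+":" {
			i++
			// skip the old entries
			for i < len(lines) && strings.HasPrefix(lines[i], "- ") {
				i++
			}
			for _, v := range values {
				out = append(out, "- "+v)
			}
			continue
		}
		i++
	}
	return strings.Join(out, "\n")
}

func main() {
	conf := "tag: machine-1\napiaddresses:\n- 10.0.0.1:17070\nstateaddresses:\n- 10.0.0.1:37017"
	fmt.Println(replaceListKey(conf, "apiaddresses", []string{"10.0.0.9:17070"}))
}
```

The ssh step then just invokes the subcommand instead of carrying a sed script around.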
<lazypower> Does anyone know why LXC would give up on round robin dns assignment? I have evidence here it has done so: http://pastebin.ubuntu.com/7705971/
<perrito666> rogpeppe: it will have to do ¯\_(ツ)_/¯
<rogpeppe> perrito666: what kind of thing would you *like* to be able to do?
<perrito666> rogpeppe: well I think that your idea pretty much sums what I would like to be able to do, perhaps wrapped, something like having agents listen for "control commands" and a mechanism to issue those, I think I have some bias for working too much with embedded devices :p
<rogpeppe> perrito666: it wouldn't be too hard to get agents to listen on a local socket for control commands
<sinzui> mgz, 1.19.4 will be the revision you created. CI had skipped it for a new rev yesterday. I made CI test just your rev to get a pass
<sinzui> mgz: I am very interested in your work to run unittests in lxc
<rogpeppe> perrito666: but that does mean the agent has to be up and running at the moment you're doing the restore
<bodie_> morning all
<perrito666> rogpeppe: well restore always assumed the agents are up
<rogpeppe> perrito666: really?
<rogpeppe> perrito666: how so?
<perrito666> rogpeppe: well, the script that runs on all machines does:
<perrito666> 450         initctl stop jujud-$agent
<perrito666> which would fail and exit the script if jujud-$agent was not up
<rogpeppe> perrito666: that'll work ok if the agent is already stopped though, won't it?
<rogpeppe> perrito666: oh really - i thought initctl stop was idempotent
<rogpeppe> perrito666: that's a bug then
<rogpeppe> perrito666: blame me :-)
<mgz> sinzui: ace, thanks - I did reland the change, so will keep an eye on the job as well
<perrito666> rogpeppe: :) oh, then I un-assume that
<rogpeppe> perrito666: no, you're right
<rogpeppe> i wonder if there's a way to tell initctl to stop a service only if it's already running
<perrito666> rogpeppe: || true
<rogpeppe> perrito666: ha ha
<rogpeppe> perrito666: that's indeed the simplest solution, though not great
<rogpeppe> perrito666: better would be to test the output of initctl status first, i think
<perrito666> rogpeppe: you would have to check status I guess
<rogpeppe> perrito666: yeah
<perrito666> returns stop/waiting or sth like that when not started
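Checking `initctl status` first, as suggested above, could be sketched like this; the status strings ("start/running", "stop/waiting") follow upstart's documented output format and are not taken from the log:

```go
package main

import (
	"fmt"
	"strings"
)

// isRunning reports whether an upstart "initctl status" line describes a
// running job, e.g. "jujud-machine-1 start/running, process 1234" versus
// "jujud-machine-1 stop/waiting". Checking this before "initctl stop"
// makes the stop effectively idempotent, without resorting to "|| true".
func isRunning(status string) bool {
	fields := strings.Fields(status)
	if len(fields) < 2 {
		return false
	}
	return strings.HasPrefix(fields[1], "start/")
}

func main() {
	fmt.Println(isRunning("jujud-machine-1 start/running, process 1234")) // true
	fmt.Println(isRunning("jujud-machine-1 stop/waiting"))                // false
}
```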
<jcastro> https://bugs.launchpad.net/juju-core/+bug/1334683
<_mup_> Bug #1334683: juju machine numbers being incorrectly assigned <juju-core:New> <https://launchpad.net/bugs/1334683>
<jcastro> has anyone seen this before?
<jcastro> it's affecting someone in production
<natefinch> jcastro: looking
<jcastro> they are early adopters, so any help you can lend would be <3
<alexisb> jcastro, what version of juju did they hit that bug?
<natefinch> I wonder if this is azure being wacky
<jcastro> I'll ask them to update the bug
<natefinch> alexisb: looks like 1.18.1
<alexisb> jcastro, wallyworld's team will be tackling azure issues this cycle, this may be one of them
<alexisb> ^^ just an fyi
<jcastro> rock and roll!
<natefinch> jcastro: updating to 1.18.4 couldn't hurt
<sinzui> natefinch, do you have a few minutes to review https://github.com/juju/juju/pull/178
<perrito666> sinzui: what happened with 1.19.5?
<sinzui> perrito666, well. we really want 1.20.0, though my scripts want to make 1.19.5. We will create a stable 1.20 branch and let master think it is 1.20.0
<perrito666> sinzui: that explains the commit message which says something very different from the actual patch
<sinzui> perrito666, natefinch yep. I realised that if I make the branch 1.19.5, I need to land another branch next week to get the version right for june 30
<sinzui> maybe I am wrong
<perrito666> sinzui: perhaps I am about to say something sinful but, wouldn't it be nice if you re-wrote a bit the past so that commit message says the right thing?
<sinzui> perrito666, I was thinking something a little different, but it also means retracking the PR
<perrito666> sinzui: if you use git commit --amend and then push, it should look as if this little mistake never happened
<natefinch> sinzui: LGTM
<sinzui> perrito666, I need to fork at juju-1.19.4 to create the stable 1.20 branch. I think I need to merge a branch into both devel and stable that sets the version. stable branch will want to be 1.20.0 and I will merge select revs from devel into it. Maybe devel needs to be 1.20-alpha to indicate it is devel
<sinzui> ^ natefinch maybe I want to do something different because I need a stable branch and juju will switch to the new version rules
<sinzui> perrito666, natefinch and the *next* unstable version that thumper and I discussed would be 1.21-alpha1
 * sinzui delete PR
<ericsnow> natefinch: are we doing standup now?
<natefinch> ericsnow: I can't, sorry.  Probably will have to be very late today if at all.  I have to take my daughter to her 1 year checkup in an hour.
<ericsnow> natefinch: no worries
<natefinch> Let's shoot for 3.5 hours for now, hopefully I'll be back in and working
<perrito666> natefinch: 3.5hours from now?
<ericsnow> natefinch: sounds good
<natefinch> perrito666: from now, yeah, sorry
<perrito666> natefinch: I think Ill be around
<alexisb> natefinch, Do we need more time scheduled with gsamfira and team for the workload stuff?
<alexisb> wwitzel3, ping
<natefinch> alexisb: probably.... it's been slow going.  Good, but not fast
<alexisb> natefinch, ok, I will put an hour on the calendar for tomorrow, then we can discuss if we want to do a few days next week
<natefinch> alexisb: I'm on vacation next week :/
<alexisb> crap thats right
<natefinch> they're actually doing well, so it might not be so bad
<alexisb> heh ok I will schedule a bit more time tomorrow then
<alexisb> and then we can exit with a game plan while you are gone
<bac> hi i'm trying to bootstrap an environment on azure and it is not coming up.  all-machines.log shows that 'machiner' cannot set the machine address and it is constantly restarting: http://paste.ubuntu.com/7707189/
<bac> is this situation recoverable?
<perrito666> sinzui: I'll lgtm if you promise me that you took care of the extra step that broke things the other time :)
<sinzui> perrito666, I am pondering those same consequences for my inc-1.20-alpha1 branch
<sinzui> perrito666, We change the transition number to 1.19.9...but I don't think we can land a version change to 1.20-alpha1 until after 1.20.0 is released. 1.18.x throws a wobbly when we ask it to upgrade to a version with alphas
 * sinzui ponders 1.19.5 for master until 1.20.0 is released
<perrito666> ericsnow:
<perrito666> news about nate?
<ericsnow> perrito666: nope
<perrito666> ericsnow: he is not in the hangout
<ericsnow> perrito666: yeah, not on IRC either
<perrito666> he most likely fell on the netsplit
<bac> sinzui: do you have much juju/azure experience?
<perrito666> sinzui: you got lgtmd
<sinzui> thank you perrito666
<sinzui> bac I have a lot of janitorial azure experience
<wallyworld> sinzui: hi, you finally got a rev to release :-) with the 1.20 branch you want to create off master, will CI be able to run tests for both the release candidate branch and trunk? will you set up a jenkins slave to test our future RC branches as well as trunk?
<sinzui> wallyworld CI knows how to watch any bzr or git branch
<sinzui> wallyworld_, will the lander/git-merge-juju work with a non-master branch?
<sinzui> I have a merge ready to try when we want
<sinzui> wallyworld_, also I have built a juju env from 3 clouds, a private vpn, and have some nigh-impossible archs http://juju-ci.vapour.ws:8080/computer/
<sinzui> I think I can now afford to be sick and get rest
<jcw4> is there a problem upgrading my 14.04 ubuntu that I use for development to go 1.3 ?
<sinzui> jcw4, You will discover the 1.2-1.3 bugs faster than CI's gccgo testing will report
<jcw4> hehe
<jcw4> that's what I was afraid of
<jcw4> does juju have a 'support matrix' of which versions of Go are supported on which platforms?
<sinzui> jcw4, OSX appears to be building with 1.3. It was disconcerting to see since I don't have osx hardware to test with.
<wallyworld_> sinzui: the lander should handle a non master branch - i'll confirm with martin
<jcw4> sinzui: I had a hard time getting all the tests to work (go 1.2) on osx
<sinzui> jcw4, We are officially 1.2 on all OSes for all series...except ubuntu doesn't officially provide 1.2 for precise
<jcw4> sinzui: I see
<sinzui> wallyworld_, do I not have $$merge$$ special powers? I thought my inc of master to 1.19.5 would work
<wallyworld_> sinzui: anyone in juju team should be able to type $$merge$$, did it not work?
<sinzui> jcw4, you added series (maverick) support to the version name? I was pleased to see that in my test of that today
<jcw4> wallyworld_: ^^
<sinzui> I may be impatient
<wallyworld_> jcw4: ?
<jcw4> sinzui addressed that comment to me but I think it was intended for you wallyworld_ ?
<wallyworld_> i didn't add maverick support, not sure why we did since maverick is EOL
<wallyworld_> isn't it?
<sinzui> jcw4, no you. I was surprised to not see unknown when I bootstrapped today with an osx client
<sinzui> sorry osx mavericks
<wallyworld_> sinzui: you are right, the lander has not picked up your $$merge$$, i'll look  into it
<jcw4> oh; no :-(
<jcw4> sinzui: I don't even know how to do that yet :)
<wallyworld_> sinzui: just to confirm - you created the 1.20 branch off the rev used to cut 1.19.4, right?
<sinzui> that's okay, I am EOD now. no 19-hour days now that I have a release to create stable from. and I have an army of slaves to do my bidding
<sinzui> wallyworld_, I sure did
<wallyworld_> awesome
<wallyworld_> i'll inc the version number to 1.20 also if it hasn't been done
<wallyworld_> sinzui: you have indeed been working too hard, you need to get rest and get better, perhaps with a glass of red
<sinzui> :)
<sinzui> I will call that medicine for my sore throat. a cough suppressant
<wallyworld_> sinzui: i'll add a lander job to look at and land stuff off the 1.20 branch
<wallyworld_> sinzui: so maybe tomorrow when you come in to work you can then hook 1.20 up to CI
<sinzui> wallyworld_, I will be visiting mgz tomorrow. Now that I have all my slaves, I want to run unit tests in lxc on them. I think that will take 30-40 minutes off the time it takes CI to run unittests, build packages, and test local provider
<wallyworld_> \o/
<sinzui> wallyworld_, I just added 1.20 to the list of branches to test. Ci is testing it now.
<wallyworld_> sinzui: you are f*cking amazing
<alexisb_afk> wallyworld_, +1 to that :)
 * perrito666 needs to autodocument his code because he is losing track of it
<wallyworld> sinzui: is there a separate dashboard for 1.20 vs trunk?
<sinzui> wallyworld, no, sorry
<wallyworld> that's ok just wondering
<wallyworld> so how do you see that 1.20 vs trunk is ok?
<wallyworld> thumper: can you ping me after your standup?
<waigani> davecheney: standup take two
<davecheney> waigani: rightou
<davecheney> wallyworld: is the tree open or closed ?
<wallyworld> open, we've created a separate 1.20 branch
<wallyworld> on my todo list to send email
<thumper> wallyworld: ping
<wallyworld> thumper: hey, have you seen this issue https://launchpad.net/bugs/1329051
<_mup_> Bug #1329051: local charm deployment fails on "git not found" due to wrong apt proxy <amd64> <apport-bug> <third-party-packages> <utopic> <juju-core (Ubuntu):New> <https://launchpad.net/bugs/1329051>
<wallyworld> wrong proxy being used inside lxc
<thumper> no
<wallyworld> ok, it seems Juju uses the apt_proxy setting from host machine when setting up proxy inside lxc
<wallyworld> which is wrong
<wallyworld> i'll schedule for next stable milestone
<thumper> yes, we do just blindly use the apt proxy of the host
<wallyworld> ok, seems like a legit issue then
#juju-dev 2014-06-27
<thumper> menn0: found anything yet?
<menn0> thumper: still trying to replicate it locally
<thumper> ok.
<menn0> did you notice that the all-machines.log build artifact appears to be imcomplete?
<menn0> incomplete even
 * thumper needs to do the self-review hr thingy
<thumper> menn0: no... didn't look in too much detail
<menn0> it doesn't even get to the upgrade to 1.19
<menn0> thumper: so things don't look awesome
<thumper> in what way?
<menn0> thumper: I've just done an upgrade from 1.18.4 to 1.19.4 and the env is pretty screwed
<menn0> no idea if it's the same problem as what showed up in CI
<menn0> the main thing seems to be that the addresses for machine-1 and machine-2 are now 127.0.0.1 instead of what they should be.
<menn0> does that ring any bells
<thumper> no, but we do have an address updater worker thingy
<thumper> I would have just started with no other machines and just done the bootstrap node
<thumper> see if that works
<thumper> then add machines
<menn0> also, during the upgrade the db-relation-changed hook for mysql failed (I was testing with the standard wp / mysql setup)
<menn0> thumper: the problem is my machine is trusty and the problem is supposedly precise specific
<thumper> is it?
<thumper> I have a precise machine here
<menn0> according to the bug
<menn0> I was about to test with canonistack because I have that set up and it's all precise there
<thumper> ok, try that too
<menn0> at any rate, what I did should have been fine and it definitely wasn't
<menn0> even if it's not the same bug as what showed up in CI it's pretty bad
<menn0> the CI test uses some simple dummy charms. I'll test with them on canonistack, replicating what CI does as closely as I can
<thumper> ok
<wallyworld> menn0: i think this 1.20 bug could also be related https://bugs.launchpad.net/bugs/1334773
<_mup_> Bug #1334773: Upgrade from 1.19.3 to 1.19.4 cannot set machineaddress <lxc> <maas-provider> <precise> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1334773>
<menn0> wallyworld: could be
<wallyworld> menn0: can you keep me in the loop as to where you get to?
<menn0> wallyworld: will do
<wallyworld> ta, i don't think that first bug is related necessarily, but i suspect the other could be
<thumper> menn0: do you want some help diagnosing this issue?
 * thumper just typed 'juju fetch upstream'
<thumper> unsurprisingly it didn't work
 * thumper considers writing a juju-fetch plugin
<thumper> how do I tell git to have my 1.20 branch track upstream 1.20?
<thumper> o/ mramm
 * thumper branches 1.20
<menn0> thumper: just got back from lunch
<menn0> thumper: yes please
<thumper> menn0: well there is certainly something wrong
<thumper> i'm looking at my current setup
<thumper> o..m..g...
<thumper> hmm...
<menn0> thumper: hangout?
<thumper> yeah...
<thumper> menn0: https://plus.google.com/hangouts/_/canonical.com/local-debugging
<axw> wallyworld: I'm back, ready for 1:1 whenever you are
<wallyworld> ok
<bodie_> I'm digging in the worker/uniter/uniter_test a bit
<bodie_> this is very unusual
<bodie_> does anyone know how to pass values around between steps?
<bodie_> I guess you'd have to set a value on the ctx struct?
<menn0> axw: are you around? at least one of the upgrade issues appears to be mongo replicaset related...
<axw> menn0: I am
<axw> oh? :(
<menn0> axw: https://plus.google.com/hangouts/_/canonical.com/local-debugging
<axw> brt
<waigani> axw: thanks
<waigani> axw: setting locale to default of UTF8 for charm hooks. Is that just a matter of setting an environment variable  LANG=en_US.UTF8 ?
<axw> waigani: yup
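In Go this is just a matter of extending the child process environment; a minimal sketch (the exact locale string mirrors the one in the question, and the real uniter assembles many more variables such as JUJU_UNIT_NAME, so treat the names here as illustrative):

```go
package main

import (
	"fmt"
	"os"
)

// hookEnv builds the environment for a charm hook, forcing a UTF-8 locale.
// Because later entries win when the child resolves duplicates, appending
// LANG overrides whatever the agent inherited.
func hookEnv() []string {
	return append(os.Environ(), "LANG=en_US.UTF8")
}

func main() {
	env := hookEnv()
	fmt.Println(env[len(env)-1]) // -> LANG=en_US.UTF8
}
```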
<axw> waigani: sorry gonna need a fix to the known_hosts path quoting
<axw> waigani: left a comment on the PR
<waigani> axw: ah shit, sorry I hit merge
<axw> nps, it's not an immediate problem
<waigani> axw: I ran the test and it did not quote
<axw> waigani: otp, I'll take a look in a bit
<waigani> axw: I think I've got it, using utils.CommandString in ssh_openssh.go
<waigani> axw: okay fixed, I pushed up changes. What happens, considering I already started the merge?
<axw> waigani: kinda dodgy now that I think of it, but if your original change passes it'll get merged :)
<axw> waigani: sorry, I don't think you need to quote the string at all in utils/ssh
<axw> waigani: it gets passed to os/exec.Command, not to a shell
<waigani> axw: oh right. I'll take out the utils.CommandString then
<waigani> axw: and not test for quotations? have a file name with no spaces?
<axw> waigani: sec
<axw> waigani: the way to test the quoting would be to update fakecommand in ssh_test.go
<axw> waigani: surrounding $@ with double quotes should make it quote the args
<waigani> axw: $@ already has double quotes
<axw> waigani: not in my branch. I have: echo $@ | tee $0.args
<waigani> axw: hehe, I was looking in utils
<waigani> axw: okay I'll have a play
<waigani> axw: just spotted scp tests failing
<waigani> axw: I'll fix those and push up again
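The os/exec point above is easy to demonstrate: arguments go straight to the child process via execve with no shell in between, so a path with spaces needs no quoting. A small sketch (using echo as a stand-in for ssh/scp):

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// echoArg passes a single argument to echo via exec.Command. No shell is
// involved, so the argument arrives verbatim -- spaces and all -- which is
// why quoting helpers are only needed when a string will be re-parsed by
// a shell.
func echoArg(arg string) string {
	out, err := exec.Command("echo", arg).Output()
	if err != nil {
		panic(err)
	}
	return strings.TrimRight(string(out), "\n")
}

func main() {
	fmt.Println(echoArg("a path with spaces"))
}
```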
<thumper> menn0,axw: did you find it?
<menn0> thumper: not yet but axw is pretty sure those asserts are it
<menn0> axw wanted to get some food so we broke the call
<menn0> I'm currently trying to understand why the problem has stopped happening for me
<menn0> I have a theory which I'm testing now
<axw> thumper: just making some lunch, will bbs
<thumper> axw: ack
<thumper> menn0: because the asserts aren't failing?
<thumper> now
<menn0> but they should be if I'm upgrading from 1.18 to 1.19 right?
<menn0> thumper: where are those asserts?
<thumper> state/machine.go:975
<waigani> axw: okay, should be good to go now.
<menn0> thumper: I can't make the problem happen again. Trying something else...
<thumper> menn0, axw: so looking at the difference for me, the doc doesn't have Scope values in the db, and we are trying to set with scope values
<thumper> and the new list has one more ipv6 address
<thumper> however that doesn't explain why the transaction is aborted
<axw> thumper: shouldn't matter what we're setting, only what we're comparing between the in-memory and in-db
<thumper> hmm...
<thumper> may have it...
<thumper> maybe
<thumper> we look at the actual serialized data we have...
<thumper> but set using a string value in bson.D
<thumper> what is the structure serialized as?
<thumper> could that be it?
<axw> "address"
<axw> I compared the structs between 1.18 and 1.19 and didn't see a difference
<axw> thumper: only difference is change in location of the structs... don't *think* that matters though...
<thumper> well, it shouldn't, but it might
<thumper> axw: is there any way to dump at the raw bson structure ?
<thumper> or, where are the bson serialisation commands?
<axw> thumper: yeah I was about to figure out how :)  there's an mgo.Raw type like in encoding/json
<thumper> here is a simpler case:
<thumper> 2014-06-27 04:32:09 DEBUG juju.state machine.go:931 addresses currently: []state.address{state.address{Value:"localhost", AddressType:"hostname", NetworkName:"", Scope:""}, state.address{Value:"10.0.3.1", AddressType:"ipv4", NetworkName:"", Scope:""}}
<thumper> 2014-06-27 04:32:09 DEBUG juju.state machine.go:978 updating addresses to: []state.address{state.address{Value:"localhost", AddressType:"hostname", NetworkName:"", Scope:"public"}, state.address{Value:"10.0.3.1", AddressType:"ipv4", NetworkName:"", Scope:"local-cloud"}}
<thumper> 2014-06-27 04:32:09 DEBUG juju.state.txn txn.go:91 0: err: &errors.errorString{s:"transaction aborted"}
<thumper> updating to add scope
<thumper> and that is all
<thumper> order is less likely to be an issue here with just two values
 * thumper has to go and run to get sushi
<thumper> before they run out
<thumper> I'll check in when I get back
<menn0> axw, thumper: I need to do a takeaway run myself but will join in again
<axw> menn0 thumper-afk: found it
<axw> a field name in state.address was changed
<axw> NetworkScope -> Scope
<axw> dimitern: I'm going to change the field name of state.address.Scope back to NetworkScope as it was before. let me know if you can think of any problem with that
<dimitern> axw, why?
<axw> dimitern: because it used to be called networkscope in state
<axw> dimitern: the change breaks upgrade
<dimitern> axw, ah.. dreaded schema changes
<dimitern> axw, ok, can you make it Scope string `bson:"networkscope"` ?
<axw> dimitern: yup, that's what I've done
<axw> looks like there's no queries on that field
<axw> so should be fine
<dimitern> great
<axw> dimitern: https://github.com/juju/juju/pull/183 please
<dimitern> axw, looking
<dimitern> axw, done
<axw> thanks
<thumper> axw: ah, so we read the data, scope now blank, asserted a doc which now didn't match, right?
<axw> thumper: something like that. we expected "scope", but state had "networkscope"
<axw> same value, different field name
<thumper> so our assertion failed
<axw> yup
<axw> uh oh, the merge bot picked up my 1.20 PR, is going to test it on trunk, and then land it in 1.20
<axw> oh well, it's trivial
<thumper> heh
<axw> waigani: sorry, looking at your changes now
<thumper> davecheney: making progress on bug 1334493 ?
<_mup_> Bug #1334493: Cannot compile/exec win client <regression> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1334493>
<davecheney> thumper: yup
<davecheney> https://github.com/juju/juju/pull/186
<wallyworld> axw: my irc sucks today. seems like you're making progess on bug 1334773?
<davecheney> should fix the faulty tools
<_mup_> Bug #1334773: Upgrade from 1.19.3 to 1.19.4 cannot set machineaddress <lxc> <maas-provider> <precise> <upgrade-juju> <juju-core:In Progress by axwalk> <https://launchpad.net/bugs/1334773>
<axw> wallyworld: yeah, I have a fix in the works
<axw> being merged
<wallyworld> great
<axw> merged in fact
<wallyworld> \o/
<menn0> thumper, axw: do you think it's possible that bug 1334273 may also be caused by the Scope problem?
<_mup_> Bug #1334273: Upgrades of precise localhost fail <local-provider> <precise> <regression> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1334273>
<axw> menn0: possibly related, but it's not local-specific, and definitely not precise-specific
<menn0> axw: yeah it's hard to see how the networkscope issue could be that specific.
<menn0> I still don't see why it magically stopped happening on my machine though. It doesn't seem like the problem should have been intermittent.
<menn0> axw: the symptoms of the precise local upgrade problem look a lot like what Tim saw when upgrading from 1.19.1 to 1.19.4 and that was the peer grouper setup not happening.
<axw> yeah...
<axw> hmm
<menn0> axw: but the CI test was going from 1.18.4 to 1.19.4 so it's not quite the same.
<wallyworld> menn0: when i looked at the logs on CI, there were no obvious errors logged by the peer grouper when it failed
<wallyworld> it seems it could be related to going from a non-ha to ha setup, and on precise we use an older mongo
<menn0> wallyworld: sorry, it's not the peergrouper itself but the replicaset configuration that's done at startup for the peergrouper (it uses code in the peergrouper's module)
<wallyworld> which has previously caused issues with local provider, hence it was disabled for a while
<wallyworld> yup
<wallyworld> my handwavy view is that mongo+precise+replicaset = shit
<wallyworld> but i have little evidence
<axw> wallyworld: was there ever a known problem with the older version of mongo? the only thing I absolutely know for sure that caused a problem was the oplog size
<wallyworld> not sure
<wallyworld> but the only thing that i can see as different on precise is the version of mongo
<menn0> wallyworld: we found today that upgrading from 1.19 before the replicaset work was done in this series to 1.19.4 causes broken upgrades because the replicaset setup is only done if you're upgrading from pre-1.19.0.
<wallyworld> joy
<menn0> wallyworld: the thing is the 1.20 blocker for precise upgrades looks awfully similar.
<wallyworld> but that's "ok"
<wallyworld> we don't support upgrades for dev versions
<menn0> very similar symptoms in the logs
<wallyworld> menn0: so my view is replicaset is broken somehow, seems like you agree?
<menn0> so I wonder if replicaset setup failed for some other reason
<menn0> wallyworld: I'm saying there could be a problem initialising the replicaset in some cases.
<wallyworld> if it works for a straight-up deploy from scratch, i wonder what's different when upgrading
<menn0> I need to go again for a bit but I'll dig through the CI failure again when I'm back
<menn0> something in the logs might jump out now that there's a theory as to what might be happening.
<wallyworld> yeah
<menn0> back in 30 mins ish
<wallyworld> i'll try and look after soccer
<wallyworld> but i want to talk to william also
<wallyworld> so may not get time
<axw> wallyworld: there's a bug in MachineAgent.ensureMongoServer, but I don't know if it's related or not. If we get as far as ensureMongoServer, but then maybeInitiateMongoServer fails (or something in between), then the process would exit and restart
<axw> wallyworld: we'd then try the isPreHAVersion block again and fail to connect to state
<axw> wallyworld: the reason being that we haven't yet initiated the replicaset, and haven't told mgo to make a Direct connection
<menn0> axw: I'm hunting through the logs from the upgrade failure in CI again. Nothing yet.
<axw> menn0: yeah nothing jumped out at me. I've put up https://github.com/juju/juju/pull/187 - it's a long shot, but possibly related
<menn0> axw: just had a look at that PR. Seems reasonable. That applies even with a single state server right?
<axw> menn0: yes, that's the only time it actually will be used (pre-HA)
<menn0> axw: duh. of course :)
<menn0> axw: have you tried a 1.18 to 1.19 upgrade with the changes in that PR?
<axw> menn0: yes, just tried and it works fine
<menn0> axw: cool, I'll LGTM it.
<axw> cheers
<menn0> axw: done
<menn0> axw: one thing that jumps out at me about the failure in CI is that it takes 8 minutes from the time jujud restarts to the new version before MaybeInitiateMongoServer gets called. That seems awfully long.
<axw> hmm
<axw> it does seem like a long time...
<menn0> another thing: would we expect the mongo admin user to get set up once jujud restarts into the new software version?
<menn0> mongo was started with --noauth and all that
<menn0> axw: ^^^
<menn0> in the CI test failure
<axw> menn0:  adding the admin user is part of the expected upgrade procedure
<menn0> axw: ok... well in the 1.18 to 1.19 upgrades on my machine which went without a hitch I see no evidence of that happening. But it did happen in the failed CI upgrade run.
<menn0> Could be coincidence but maybe not
<dimitern> axw, still around?
<axw> dimitern: yes
<dimitern> axw, I was wondering about the progress on relation addresses wrt charms
<dimitern> axw, there were some unresolved comments on the doc - did you reach agreement?
<dimitern> axw, about the new hooks and stuff
<axw> dimitern: not 100%. fwereade had a chat with hazmat and came to a vague agreement. I've got a PR up atm that triggers config-changed on units whenever the machine addresses change - will do relation addresses later
<dimitern> axw, i have to prepare a doc to sync up on how to expose IPv6 addresses to charms, as this is the most important take on ipv6 support in core from charmers perspective
<dimitern> axw, right, i'll ping fwereade and hazmat, cheers
<axw> ah, that'll be interesting...
<axw> menn0: did you get the full log for CI?
<axw> menn0: I upgraded 1.18.1 to 1.19.4 on my machine and I got "starting mongo with --noauth" and "setting admin password" on upgrade
<menn0> axw: no. all-machines.log at least is going to be incomplete because of the rsyslogd config changes during the upgrade. the separate machine logs are fine though.
<menn0> axw: I think I see a race
<axw> ah I see
<axw> race? where?
<menn0> axw: if the upgrade-steps worker finishes before ensureMongoServer is called by the state worker then the isPreHAVersion check will be false and we won't do the HA setup work.
<menn0> Does that sound plausible?
<menn0> upgrade-steps updates UpgradedToVersion once it's done
<menn0> and that is what ensureMongoServer is checking against
<menn0> might explain why I'm only able to see the issue sometimes.
<axw> menn0: well shit, I think you're right
<axw> we shouldn't do anything until we've upgraded mongo
<menn0> by ensureMongoServer I mean the the method not the function with the same name it calls
<axw> yep
<axw> menn0: that would better explain this bug
<menn0> axw: and possibly some of the other replicaset weirdness we've seen?
<axw> menn0: hmm actually...
<axw> ah never mind
<axw> yep, still looks like the culprit
<axw> menn0: dunno about other weirdness, this is upgrade specific
<menn0> ah ok. I thought the other issues were upgrade specific too.
<menn0> axw: so what's the fix? Do the mongo upgrade work before starting any of the workers?
 * menn0 suspects he needs to go so lest he upsets his wife
<menn0> s/so/soon/
<axw> menn0: I can handle it, it's pretty late there
<axw> but yes I think that's the solution
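The fix they converge on can be sketched as a one-shot gate: run the mongo/upgrade step first, and only then release the other workers. A closed channel works nicely for this; all names below (runSequence, upgradeDone) are illustrative, not juju's actual worker plumbing:

```go
package main

import (
	"fmt"
	"sync"
)

// runSequence starts two "workers" that block on a gate channel, performs
// the "upgrade" step, then closes the gate. Closing a channel releases
// every waiter at once, so the upgrade is guaranteed to be recorded first.
func runSequence() []string {
	upgradeDone := make(chan struct{})
	var wg sync.WaitGroup
	order := make(chan string, 3)

	for _, name := range []string{"state", "api"} {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			<-upgradeDone // wait for the upgrade step
			order <- name
		}(name)
	}

	order <- "upgrade" // the upgrade step runs before the gate opens
	close(upgradeDone)
	wg.Wait()
	close(order)

	var seq []string
	for s := range order {
		seq = append(seq, s)
	}
	return seq
}

func main() {
	fmt.Println(runSequence())
}
```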
<menn0> axw: sweet.
<menn0> have a good weekend.
<axw> cheers, you too
 * menn0 is relieved that the day ended with something productive
<rogpeppe2> if anyone has some time, i'd very much like a review of this pull request by someone on core, please. It's currently only been reviewed by gui people. https://github.com/juju/charm/pull/9
<fwereade> axw, ping
<axw> fwereade: pong
<fwereade> axw, any insight into https://bugs.launchpad.net/juju-core/+bug/1334683 ?
<_mup_> Bug #1334683: juju machine numbers being incorrectly assigned <azure-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1334683>
<axw> fwereade: fraid I've been bogged down with critical 1.20 things to even look into it
<axw> no ideas off the top of my head
<fwereade> axw, no worries, it's only 18h old :)
<fwereade> axw, any suggestions for someone awake now/soon who'd know azure?
<axw> fwereade: I don't know if anyone other than me does
<axw> fwereade: I can take a look if someone wants to take over this jujud upgrade race
<axw> fwereade: menn0 noticed that the upgrade steps could theoretically complete before the state worker starts, causing isPreHA to return false and things to go a bit pear shaped
<axw> fwereade: eh I think I see the problem in azure
<axw> fwereade: in the 1.18 code it's not returning the instances in the same order as the input ids
<axw> fwereade: good news is, it's fixed in 1.19.4
<axw> or 1.19.0 or whenever I made all the changes
<fwereade> axw, you're too awesome, I'd resigned myself to hassling nate about it :)
<fwereade> axw, would yu add a really quick note to the bug to that effect please?
<axw> will do
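The 1.18 azure bug described above is an ordering one: Instances(ids) must return results in the same order as the requested ids. A map-then-reindex pass fixes arbitrary provider ordering; the types here are hypothetical stand-ins, not the actual provider interface:

```go
package main

import "fmt"

type instance struct{ id string }

// sortByIDs reorders provider results to match the requested ids.
// nil marks an id the provider didn't return.
func sortByIDs(ids []string, got []*instance) []*instance {
	byID := make(map[string]*instance, len(got))
	for _, inst := range got {
		byID[inst.id] = inst
	}
	out := make([]*instance, len(ids))
	for i, id := range ids {
		out[i] = byID[id]
	}
	return out
}

func main() {
	got := []*instance{{"m2"}, {"m0"}, {"m1"}}
	for _, inst := range sortByIDs([]string{"m0", "m1", "m2"}, got) {
		fmt.Println(inst.id)
	}
}
```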
<bac> fwereade: i've been trying to get an azure instance to bootstrap with trusty with no success for a day. i've a partially booted node up. who might know a thing or two about azure and have a look?
<fwereade> bac, if axw is still up he's your best bet by a long way
<axw> bac: are you using 1.19.x?
<bac> axw: 1.19.4
<axw> bac: ok, that's a good start. you've "partially booted a node up"?
<bac> axw: i bootstrap with --debug and 'Running apt-get upgrade' is the last thing in the console
<axw> how? where did it break?
<axw> ok
 * axw checks he can bootstrap
<bac> axw: i can give you access to the instance
<axw> bac: sure, just import lp:~axwalk please
<bac> axw: juju-azure-ci3-7ul3u8075q.cloudapp.net
<axw> ta
<axw> bac: the agent is running... what happens when you run juju status?
<bac> axw: instance-state remains ReadyRole
<bac> axw: let me paste the whole thing.  until then, here is my yaml config block https://pastebin.canonical.com/112600/
<axw> thanks
<bac> axw: https://pastebin.canonical.com/112601/
<axw> bac: looks fine - what's the issue with it?
<bac> axw: well 'juju bootstrap' never terminates.  if i ctl-c out of it, the instance is torn down
<axw> ah right. hmm weird
<bac> axw: i assumed since it didn't complete it was still doing stuff.  and i don't know what ReadyRole is
<bac> so it was less than comforting
<axw> bac: that's just azure's term for "machine is started/running"
<bac> ok
<bac> i guess if you had to jam two random words together those are as good as any
<bac> axw: so is my use of --debug possibly the culprit in the non-termination of the bootstrap?
<axw> bac: I don't think so, I do that all the time...
<axw> I'm just bootstrapping my own instance now
<axw> bac: when you say you've been trying to get an azure instance to bootstrap for a day... do you mean that you left that one command running for a day, or you've tried it a bunch of times?
<bac> axw: i tried it a bunch of times.  most was on US East and i got different errors.  last night i switched to US West and launched the bootstrap at my eod.  this one has been active overnight
<axw> mk
<bac> axw: upgraded from 1.19.3 in the middle of my attempts
<bac> axw: if i cannot get it resolved this morning, we're going to have to switch our CI setup to another provider.
<axw> bac: are you doing this on Linux?
<axw> your client
<bac> trusty vm
<axw> and does it have ssh installed?
<bac> yes
<axw> bac: uh oh, I have reproduced this issue
<bac> yay, boo.
<axw> bac: I just realised I had "image-stream: daily" set in my environments.yaml; I took it out and bootstrap is just hanging there
<axw> fwereade: ^^ critical bug for 1.20 I think
 * fwereade reads back with an unhealthy sense of impending freakout
<bac> axw: cool, so i can add image-stream to my config and perhaps get past it, for testing purposes only, of course.
<fwereade> axw, bac: so, wait, is this juju or the image that's messed up?
<bac> fwereade: i can't say.
<axw> fwereade: has to be juju. the machine agent is coming up, but the bootstrap client is just hanging there
<axw> looks like someone already reported this: https://bugs.launchpad.net/juju-core/+bug/1316185
<_mup_> Bug #1316185: juju bootstrap hangs on Azure <juju-core:Triaged> <https://launchpad.net/bugs/1316185>
<axw> gonna try again with image-stream: daily back in
<bac> yep, that's the same issue
<bac> axw: i'll tear down that instance and try again with image-stream: daily to confirm from the other direction
<bac> axw: unless you want the instance to remain up
<axw> bac: nope, go for it
<axw> thanks
<bac> axw: fwiw, ctl-c yields
<bac> ^C2014-06-27 10:04:21 INFO juju.cmd cmd.go:113 Interrupt signalled: waiting for bootstrap to exit
<bac> 2014-06-27 10:04:21 ERROR juju.provider.common bootstrap.go:119 bootstrap failed: subprocess encountered error code 130
<bac> Stopping instance...
<axw> bac: thanks, 130 just means it was killed by Ctrl-C
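The 130 axw decodes above is the POSIX shell convention of 128 + signal number, and SIGINT is signal 2. A small demonstration (assumes a Linux-ish system with `sh` available):

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// ctrlCStatus reproduces the "error code 130" from the bootstrap log:
// a POSIX shell reports a child killed by a signal as 128 + the signal
// number, and SIGINT is 2, so Ctrl-C shows up as 128 + 2 = 130.
func ctrlCStatus() string {
	out, err := exec.Command("sh", "-c", `sh -c 'kill -INT $$'; echo "status=$?"`).Output()
	if err != nil {
		return "error: " + err.Error()
	}
	return strings.TrimSpace(string(out))
}

func main() {
	fmt.Println(ctrlCStatus()) // status=130 on Linux
}
```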
<axw> bac fwereade: yeah, putting "image-stream: daily" back in fixed it for me
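For reference, the workaround bac and axw converge on is a single extra line in the azure section of environments.yaml (the other values here are placeholders, not bac's actual config):

```yaml
my-azure:
    type: azure
    location: West US
    # works around bug 1316185 (bootstrap hangs with released images)
    image-stream: daily
```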
<axw> bugger.
<fwereade> axw, I has a very confused -- so juju is depending on something in the daily image that's not in the released one?
<axw> fwereade: I dunno what the difference is. The thing is, juju actually bootstraps and works with the released images. It's just that the ssh client doesn't see any output from the script past "apt-get upgrade"
<axw> and doesn't get EOF
<axw> fwereade: I guess one of the packages that gets upgraded is buggering up the communication
<axw> about to have a look at what the differences are
<bac> axw: confirmed that i can boot cleanly with image-stream: daily
<axw> thanks bac
 * bac breakfast
<axw> fwereade: is there some reason other than "we'd like people to have up-to-date machines" for running "apt-get upgrade" at bootstrap?
<bac> axw: this is almost certainly unrelated, but when i was trying to boot yesterday on US East i was getting repeated errors like http://paste.ubuntu.com/7707189/ -- it certainly muddied the waters.
<axw> bac: fixed on trunk today
<bac> ty
<fwereade> axw, apart from the fact that we install stuff and it's generally preferred that we update before doing so, I can't think of one
<axw> fwereade: bootstrapping with the daily images, at least on azure, is considerably faster without it
<axw> whether it's considerably more broken, I don't know :)
<fwereade> axw, so does it work if you drop the apt-get update?
<fwereade> axw, /upgrade
<axw> trying now
<axw> also going to try holding back bash and apt
<fwereade> cheers
<dimitern> jam1 (if you're here), vladk, standup?
<vladk> dimitern: yep
<axw> fwereade: works without apt-get update
<perrito666> morning
<natefinch> morn
<natefinch> ing
<natefinch> tab completion should work on the word I'm thinking of
<TheMue> hel
<TheMue> lo
<TheMue> na
<TheMue> tefi
<TheMue> nch
<TheMue> ;)
<wwitzel3> I do that in the shell sometimes, trying to a tab complete a URL I'm calling curl with.
<perrito666> yeah, I do it with passwords
<bac> hi mgz
<rogpeppe3> dimitern: any chance you could review this for me? it's blocked until i can get a review from someone from juju-core: https://github.com/juju/charm/pull/9
<bac> fwereade: now that axw has finished up, is anyone working now who i can ask about azure issues?
<axw> bac: I'm still around for the moment...
<rick_h_> axw: so in azure mode there's no manual placement. Is that the default in azure then? This means no machine view/colocating?
<bac> oh, hi axw.  had a question about azure availability sets.  just found a link to the doc
<axw> rick_h_: correct
<rick_h_> axw: can you do any form of colocation at all?
<axw> not in the current implementation
<axw> (unless you disable availability sets)
<fwereade> axw, rick_h_: yeah, I thought that if you used rubbish-mode you could still do manual placement
<rick_h_> axw: and is this somehting you can change after bootstrap? I see it's a 'bootstrap attribute' so is there a flag on juju bootstrap? Any other way to change afterwards?
<fwereade> axw, rick_h_: forgive the looseness of my terminology
<axw> rick_h_: no, it's immutable
<rick_h_> wow, ok will process and ponder. This will definitely cause some fun with our current machine view work and GUI along with other projects.
<axw> rick_h_: in case I wasn't clear, it's configurable at bootstrap time only, and immutable thereafter
<rick_h_> axw: gotcha.
<rick_h_> axw: just :(
<rick_h_> axw: but will spend some time on it thinking through it and it's chain of effects.
<axw> it's a bit of a PITA, I know, but azure's model dictates this
<bac> axw: is there a switch to juju bootstrap to turn it off?
<axw> bac: yeah you can set availability-sets-enabled=false in environments.yaml
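Written as yaml in the azure section of environments.yaml, the toggle axw mentions looks like this; per the discussion above it takes effect at bootstrap time only and is immutable afterwards:

```yaml
my-azure:
    type: azure
    # Disabling availability sets restores manual placement/colocation,
    # at the cost of azure's availability guarantees.
    availability-sets-enabled: false
```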
<bac> axw: ah, ok
<perrito666> wwitzel3: ericsnow is any of you taskless?
<wwitzel3> perrito666: nope, I'm working on tests for the env client api stuff and then moving on to the legacy setmongopassword cleanup.
<wallyworld__> fwereade: did you want a quick chat about charm storage?
<fwereade> wallyworld__, sure, 5 mins?
<wallyworld__> fwereade: ok, join me in https://plus.google.com/hangouts/_/canonical.com/tanzanite-daily
<wallyworld__> rick_h_: did you want to join too?
<rick_h_> wallyworld__: joining
<bodie_> morning all
<perrito666> well ericsnow if you are interested we need to implement this suggestion by rogpeppe "you could add a jujud subcommand which updates the addresses in the agent.conf file"
<fwereade> perrito666, ericsnow: doesn't `juju endpoints` have that effect anyway?
<fwereade> perrito666, ericsnow: if not, it should, because *every* command that hits the api ought to update the .jenv addresses if they've changed
<perrito666> fwereade: tell me/us more, the idea is to be able to let the agents know that the state server has changed
<fwereade> perrito666, oh! wait, agent.conf?
<perrito666> :)
<fwereade> perrito666, that should not be a command
<fwereade> perrito666, we update them on initial connect already
<fwereade> perrito666, what we should have is a watcher that updates them when they change
<perrito666> fwereade: this is something we need to do after a restore
<fwereade> perrito666, ah, ok I see, sorry, I misread two separate things
<perrito666> so the idea is, state server hung and was killed and we restored, agents have no clue on what the state server is
<fwereade> perrito666, just ignore me, if you haven't learned that lesson already
<perrito666> lol
<fwereade> perrito666, jujud command makes sense
<perrito666> is at least an improvement from current method, which is ssh+sed
<bac> mgz you around?
<alexisb> natefinch, fwereade: either of you available to join the cloudbase call?
<fwereade> alexisb, just finishing up another call, in there in a sec
<rogpeppe> please, would someone be able to review this code?! https://github.com/juju/charm/pull/9
<alexisb> awesome, thank you
<rogpeppe> natefinch, wwitzel3, jam1, wallyworld__, axw, dimitern, mgz: ^
<axw> sorry I'm logging off - will take a look on monday if it's still unreviewed
<fwereade> wallyworld__, are you still around?
<wallyworld__> yeah
<axw> fwereade: I give up on the azure problem for tonight. I ended up modifying bootstrap to upgrade the packages individually, and it worked :/
 * axw logs off
<wallyworld__> fwereade: but i'm too tired to really review anything
<fwereade> wallyworld__, np, you're not needed :)
<wallyworld__> so situation normal then
<natefinch> alexisb: coming
<sinzui> Can someone explain or fix me so that we can merge the juju version changes for master and 1.20. Juju CI will not test something that it knows is released
<sinzui> https://github.com/juju/juju/pull/181 and https://github.com/juju/juju/pull/180
<rogpeppe> *still* looking for a review of this, please: https://github.com/juju/charm/pull/9
<bac> sinzui: turns out this was the bug that i was hitting yesterday: bug 1316185
<_mup_> Bug #1316185: juju bootstrap hangs on Azure <juju-core:In Progress by axwalk> <juju-core 1.20:In Progress by axwalk> <https://launchpad.net/bugs/1316185>
<sinzui> Don't use daily
<natefinch> perrito666: how's your windows knowledge?
 * perrito666 sees an avalanche coming
<perrito666> natefinch: I know my way around windows 7 I rule at windows xp :p
<perrito666> I might work well enough with windows server if it looks anything like windows nt
<bac> sinzui: if daily is not a proper work around should it go back to critical?
<sinzui> yes. daily was required last year because saucy was the only series that had azure support
<natefinch> perrito666: we have work on getting charms deployable to windows
<sinzui> bac: also, daily now focuses on utopic. I think you want to use an LTS
<natefinch> I'm working with the cloudbase guys getting their code into Juju, but I'm on vacation next week and need someone else to help them out
<natefinch> perrito666: it's honestly less windows stuff and more just helping them get their code well integrated into Juju
<perrito666> natefinch: I guess I could help, although I would love a bit more info :)
<natefinch> perrito666: hop on here https://plus.google.com/hangouts/_/canonical.com/cloudbase-juju?authuser=1
<sinzui> If the bot is going to ignore me or the 1.20 branch, I can merge the 1.20.0 version change myself to unblock the release
<perrito666> natefinch: hold on a sec while I stop the radio, I was waiting for news on the country entering into economic default or not
<natefinch> heh
<natefinch> perrito666: no rush
<natefinch> perrito666: we'll be on there for a long time
<perrito666> also apparently our vice president might get arrested for fraud :p
<perrito666> natefinch: says google that the party is over
<sinzui> I manually merged 1.20.0 version change into 1.20 branch. CI test it in about 2 hours
<natefinch> perrito666: invited via the UI, that should work
<sinzui> oh, it will test in an hour because master has an invalid version, the test suite will exit early
<perrito666> natefinch: yup, scared me to death
<rogpeppe> another review if anyone cares to, much simpler one this time: https://github.com/juju/charmstore/pull/11
<perrito666> when did fwereade pop into that call? I was about to say that natefinch sounded a lot like fwereade today :p
<wwitzel3> England, New England .. same thing
<natefinch> haha
<TheMue> rogpeppe: 11 is reviewed
<perrito666> well, clearly NewEngland is a factory for England
<rogpeppe> TheMue: thanks
<natefinch> lol
<TheMue> perrito666: which package? and a reference or a copy?
<TheMue> perrito666: usa := europe.NewEngland() ?
<perrito666> TheMue: NewEngland() (*europe.England, error) {}
<perrito666> I understand England used that a lot around 500 ys ago
<TheMue> perrito666: oh, shit, compiler error, didn't ask for the error
<perrito666> :p
<perrito666> more in the spirit of NewEngland(unconquered world.Country) (*europe.England, error) {}
<perrito666> although that only makes sense in spanish where uk == england in daily use
<natefinch> uk == england in most of the united states.  Took me forever to understand the political structure of that little batch of islands
<perrito666> I was provided an educational video by an uk guy
<perrito666> which explained all of that
<TheMue> funnily many Country instances reacted with a panic() but England defered a recover()
<perrito666> we are a bunch of nerds
<wwitzel3> http://xkcd.com/850/
<TheMue> Meeeeee? Nooooo! *blush*
<TheMue> wwitzel3: perrito666: regarding england: http://twistedsifter.files.wordpress.com/2013/08/the-only-countries-britain-has-not-invaded.jpg
<wwitzel3> lol
<ericsnow> natefinch, fwereade: are we having that meeting?
 * perrito666 notices that he forgot to cook lunch
<jcw4> perrito666: eating it raw?
<jcw4> perrito666: that corruption scandal must have really got your attention
<perrito666> jcw4: no, actually I was self documenting code
<perrito666> :p
<jcw4> :)
<perrito666> alexisb: natefinch something seems wrong with the hangout
<perrito666> ericsnow: ping me when you are available
<ericsnow> perrito666: ping
<perrito666> ericsnow: priv
<rogpeppe> mgz: ping
<mgz> rogpeppe: hey
<rogpeppe> mgz: would you be able to review a change to godeps?
<mgz> sure thing
<rogpeppe> mgz: ta! https://codereview.appspot.com/106250043/
<mgz> hmm
<rogpeppe> mgz: do you think that fetching by default is a bad thing?
<mgz> just thinking if it's going to bork anything
<mgz> well, it borks the wait, I'm actually on a different branch case
<mgz> but I guess it's not too bad to learn to use -F if needed
<rogpeppe> mgz: i can't think of a case where i'd ever actually want to use -F
<rogpeppe> mgz: after all, the repo may be updated anyway, regardless of -F
<rogpeppe> mgz: i guess the only time i might want to use it is if my network connection is poor
<mgz> rogpeppe: when you want to see deps that need updating, but not screw with trees because you're not sure of their current state
<mgz> using godeps -u in that case is a little dodgy anyway
<rogpeppe> mgz: there's always the -n flag for that
<rogpeppe> ha, i've just spotted a bug
<mgz> -P of 10 by default is also maybe a question
<mgz> that's enough to make rural broadband pretty sad
<rogpeppe> mgz: i bet your web browser fetches more than 10 things at once...
<mgz> rogpeppe: really need some actual tests for the changes
<rogpeppe> mgz: yeah, i thought you might say that. the tests have been broken for a while :-(
<rogpeppe> i should really fix the tests
<mgz> rogpeppe: sure, but running ten git processes in parallel is more than just an http get
<mgz> rogpeppe: change seems fine in general though
<rogpeppe> mgz: you're worried about cpu resources?
<rogpeppe> mgz: or does git make lots of connections in fact?
<rogpeppe> mgz: thanks
<mgz> rogpeppe: more network/memory, but yeah, depending on the url of the repo, it's not just one connection
<rogpeppe> mgz: got a suggestion for a better default?
<mgz> 4?
<rogpeppe> mgz: ok, 4 it is
<alexisb> ericsnow, ping
<ericsnow> alexisb: coming :)
<alexisb> :)
<rogpeppe> mgz: well, i've got the original tests passing now at any rate, but no time left to add -u tests.
<rogpeppe> mgz: i'll wait until Mon before pushing the code, as i think it's worth having the changes even without tests
<rogpeppe> mgz: and at least then i'll be around when the 'bot breaks :-)
<jcw4> I've updated https://github.com/juju/juju/pull/164 to merge in bodie_'s work and address some of thumpers comments
<jcw4> PTAL ^^
 * jcw4 is eow
<perrito666> sinzui: I never realised you could menace the bot into merging stuff
<perrito666> wwitzel3: ericsnow standup?
<ericsnow> wwitzel3: yep
<sinzui> perrito666, I have a latent ability to do nigh-impossible things: intimidate bots, run 386 instances in HP cloud, create trans-cloud juju envs.
<wwitzel3> perrito666: yep omw
<sinzui> I attribute this to my new way of looking at issues, and I will cheat if necessary
<sinzui> perrito666, wwitzel3 any insight into this critical bug https://bugs.launchpad.net/juju-core/+bug/1335243
<_mup_> Bug #1335243: No tools available TestValidateConstraintsCalledWithMetadatasource <regression> <test-failure> <juju-core:Triaged> <juju-core 1.20:Triaged> <https://launchpad.net/bugs/1335243>
<wallyworld__> mgz: around?
#juju-dev 2014-06-29
<waigani> morning all
<waigani> thumper: how far away are you? shall we meet later?
<thumper> 2 minutes
<waigani> hi stub, I'm looking into #1334482, saw your comment. What do you think of setting LC_ALL=C in hookVars in worker/uniter/context.go ?
<_mup_> Bug #1334482: consider setting proper locale for hooks environments <juju-core:Triaged> <mongodb (Juju Charms Collection):New> <https://launchpad.net/bugs/1334482>
<waigani> thumper: stub commented on locale bug that we should specify UTF-8 in the open call instead of setting a default locale in env context. Where is the open call?
<thumper> no idea
<thumper> probably the open call in the mongo charm
<waigani> oh right, he means the fix should be implemented on the charm side.
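For context, the hookVars option waigani floats above would look roughly like this. juju builds a similar environment slice in worker/uniter/context.go; the variables shown here are illustrative, not the full real set:

```go
package main

import "fmt"

// hookVars sketches the environment a uniter passes to charm hooks,
// with the proposed fix of pinning the locale so hooks behave the same
// regardless of the host's locale settings.
func hookVars(charmDir, unitName string) []string {
	return []string{
		"CHARM_DIR=" + charmDir,
		"JUJU_UNIT_NAME=" + unitName,
		"LC_ALL=C", // proposed: a fixed locale for every hook execution
	}
}

func main() {
	fmt.Println(hookVars("/var/lib/juju/agents/unit-mongodb-0/charm", "mongodb/0"))
}
```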
#juju-dev 2015-06-22
<menn0> thumper: can you have another look at http://reviews.vapour.ws/r/1975/ pls when you're back?
<davecheney> thumper: james fixed the worker/provisioner race
<davecheney> https://github.com/juju/juju/pull/2566
<davecheney> so i guess that means we don't need to argue about gocheck maintainership
<thumper> menn0: ack
<thumper> hmm... looks like reviewboard host is out of disk space
<davecheney> rup row
<davecheney> score one for the cloud
<thumper> the cloud, where everything just works
<davecheney> yup, as well as it does right here in your home
<davecheney> or your money back
<axw> wallyworld: I'm not really sure what we need to say about iaas resource tagging beyond what's in the release notes. do you have any ideas?
<wallyworld> axw: from memory the release notes seemed to cover it. they explained what was done and how to add custom tags etc. so maybe just copy to a PR on the doc project
<axw> wallyworld: ok
<wallyworld> we just need to make sure the doc work is scheduled - hence the PR / bug being done
<axw> wallyworld: do I specifically need to do a PR? I'm not sure where this would best go, so ok if I just create an issue on the project with the text, and then keep an eye on it?
<wallyworld> yeah, that will be fine
<menn0> thumper: is the RB machine also the main Jenkins host?
<thumper> NFI
<menn0> thumper: just checked, they are
<axw> menn0: reviews. and juju-ci. both resolve to the same
<menn0> axw: yep, i just checked the same thing :)
<axw> :)
<menn0> so if RB is having trouble then Jenkins will be as well
<thumper> huzzah
<menn0> thumper: i'm on the host now... root volume is certainly full
<thumper> menn0: go delete some stuff
<thumper> pretty sure we don't need /var
<thumper> :-)
<menn0> i'm just looking for where the space is being used
 * thumper chuckles to himself
<menn0> it's painfully slow
<thumper> hmm
<thumper> need coffee
<menn0> thumper: you there?
<thumper> yup
<thumper> ugh... snow
<davecheney> thumper: ready when you are
<menn0> thumper: the culprit is the logs for the juju env that hosts the various CI services (reviews, CI proxy, reports)
<thumper> hah
<thumper> no rotation?
<menn0> thumper: the disk isn't really big enough to support the way Juju rotates the logs
<thumper> heh, oh the irony
<menn0> thumper: there's several units each with 2 backups of 300MB plus the current log file
<menn0> the disk is only 7GB total
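A quick back-of-envelope on the numbers menn0 gives above (the unit count is illustrative, since he only says "several units"):

```go
package main

import "fmt"

// worstCaseMB estimates the log footprint: each unit keeps the live log
// plus `backups` uncompressed backup files, on a 7GB root volume.
func worstCaseMB(units, backups, backupMB, liveMB int) int {
	return units * (backups*backupMB + liveMB)
}

func main() {
	// e.g. 6 units, 2 backups of 300MB each, plus a live file up to 300MB
	used := worstCaseMB(6, 2, 300, 300)
	fmt.Printf("%d MB of a ~7000 MB disk\n", used) // 5400 MB
}
```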
<menn0> thumper: also the logs are full of: exited "rsyslog": x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "juju-generated CA for environment \"rsyslog\"")
<menn0> weee
 * thumper stabs rsyslog
<menn0> wallyworld/thumper/davecheney: do you remember if the above has been fixed?
<thumper> I keep seeing it...
<wallyworld> hmmm
<wallyworld> i thought it had been
 * menn0 is compressing the backup logs
<menn0> that's better... 3.5GB free
<menn0> wallyworld, thumper: this env is 1.21. I don't think the problem has been fixed there.
<menn0> regardless, the disk is never going to be big enough
<menn0> i'll shoot out an email
<anastasiamac> menn0: tyvm for fixing/cleaning up the machine :D
<menn0> anastasiamac: np
<mup> Bug #1467362 opened: utils/ssh: data race in test <juju-core:New> <https://launchpad.net/bugs/1467362>
<davecheney> thumper: menn0 git push --set-upstream origin fixedbugs/1465115
<mup> Bug #1465115: api: data race in test <intermittent-failure> <race-condition> <unit-tests> <juju-core:In Progress by dave-cheney> <https://launchpad.net/bugs/1465115>
<davecheney> thumper: menn0 https://github.com/juju/juju/pull/2618
<davecheney> you'll love this one
 * menn0 looks
<menn0> davecheney: LGTM
<menn0> davecheney: nasty problem
<davecheney> yeah, that is a terrible footgun in the http api
<mwhudson> davecheney: so the situation with rugby is that it doesn't have any kind of useful outbound network access at all, right?
<mwhudson> not even via proxy
<mup> Bug #1467372 opened: api/cleaner: data race in test <juju-core:New> <https://launchpad.net/bugs/1467372>
<davecheney> mwhudson: yup
<davecheney> it broke a few weeks ago
<davecheney> this isn't the first time it broke
<mwhudson> davecheney: nice
<davecheney> but this was the first time I no longer had the strength to complain on #is
<davecheney> mwhudson: if you would take up the charge this time, I would be indebted
<mwhudson> davecheney: i'm happy to (tomorrow) if you can tell me what should be working
<davecheney> there is a proxy
<davecheney> it runs on batuan
<davecheney> well it used to
<davecheney> but it doesn't now
<mwhudson> davecheney: https_proxy=http://squid.internal:3128/ ?
<davecheney> this proxy isn't monitored by is
<davecheney> so it shits itself every now and then
<mwhudson> i see
<davecheney> and needs to be manually unshitted
<mup> Bug #1467372 changed: api/cleaner: data race in test <juju-core:New> <https://launchpad.net/bugs/1467372>
<mup> Bug #1467372 opened: api/cleaner: data race in test <juju-core:New> <https://launchpad.net/bugs/1467372>
<mwhudson> davecheney: you got some arm64 hw recently, right?
<mwhudson> that iirc you weren't very impressed with
<davecheney> yes, and yes
<mup> Bug #1447234 changed: juju prints "error" when deploying yet no units are in error <deployer> <lxc> <reliability> <juju-core:Expired> <https://launchpad.net/bugs/1447234>
<mup> Bug #1467374 opened: worker/uniter/filter: ci test failure <juju-core:New> <https://launchpad.net/bugs/1467374>
<mwhudson> davecheney: what was it?
 * mwhudson disappears, will read backlog later
<davecheney> mwhudson: xgene
<mup> Bug #1467379 opened: "attachmentcount" field not set when upgrading from 1.24 <storage> <juju-core:Triaged> <https://launchpad.net/bugs/1467379>
<jam> dimitern: I'm going to miss standup, I have to run an errand
<dimitern> jam, ok, np
<voidspace> dimitern: ping
<voidspace> dimitern: standup?
<fwereade> is anyone free to take a look at RB? [Errno 28] No space left on device: '/tmp/reviewboard.pcPtS2'
<voidspace> dimitern: https://github.com/juju/juju/pull/2598
<axw> evilnick: I possibly just inadvertently switched a checkbox on https://github.com/juju/docs/issues/444
<axw> evilnick: which one left as an exercise to the reader (I don't know which one it was :/)
<evilnick> axw hehehe. Thanks!
<axw> axw: sorry about that. didn't realise clicking them did things.
<evilnick> those things are a mixed blessing
<axw> untracked things at that
<evilnick> it's okay, we are nearly done with them anyhow - it will be pretty easy for me to tell what is changed
<axw> cool
<tasdomas> is the reviewboard server out of disk space? I see this message: '[Errno 28] No space left on device: '/tmp/reviewboard.iG8Eys'' on http://reviews.vapour.ws/r/1963/diff/#
<Muntaner> hi jujuers
<Muntaner> I'm having problems with a bootstrap
<Muntaner> http://paste.ubuntu.com/11755548/
<dooferlad> dimitern / TheMue: could you take a look at https://github.com/juju/juju/pull/2621 please? ReviewBoard hasn't found it (still out of disk space?) so please review on Github.
<mgz> Muntaner: what did you do to add an ubuntu image to your deployment and register it with simplestreams?
<mgz> Muntaner: I'm presuming you've read and followed jujucharms.com/docs/stable/howto-privatecloud
<dimitern> dooferlad, looking
<TheMue> dooferlad: *click*
<Muntaner> mgz, does juju work with the new vivid cloud ubuntu images?
<Muntaner> I mean, the 15.04
<Muntaner> because it worked flawlessly with the old ones (14.04), but with the new I'm getting strange errors
<dimitern> dooferlad, done
<mgz> Muntaner: yes, but in that bootstrap it's not finding the image stream at all
<Muntaner> mgz, it was an error in the metadata
<Muntaner> mgz, now I managed to go on: I get another error that I'm pasting
<Muntaner> mgz, http://paste.ubuntu.com/11756595/
<Muntaner> seems like it is searching for metadata for a 14.04 version, why does it?
<Muntaner> sorry, I pasted the same stuff for two times
<TheMue> dooferlad: agreeing to dimitern comments ;)
<mgz> Muntaner: looks like you are trying to bootstrap trusty and have no trusty images
<Muntaner> mgz, mmmh
<Muntaner> I'm not trying to bootstrap trusty: I wanna vivid
<Muntaner> in my environments.yaml, I got a default-series: vivid
<mgz> what's default-series in your environments.yaml?
<Muntaner> mgz -> http://paste.ubuntu.com/11756623/
<Muntaner> I got vivid
<Muntaner> mgz, also with juju bootstrap --debug --series=vivid --upload-tools I got the same result
<mgz> Muntaner: yeah, --series doesn't do that
<Muntaner> mgz, sooo ...
<Muntaner> mgz, maybe I solved
<Muntaner> via tools-metadata-url: https://streams.canonical.com/juju/tools/
<mgz> yeah, you also want to be able to access vivid tools, but that doesn't seem to be where it's getting stuck, judging from the logs
<Muntaner> mgz, are you a developer?
<Muntaner> aw yes, we are in dev chan
<Muntaner> I think that juju logs need to be revisited
<Muntaner> in other situations, I'm having a lot of troubles in understanding what isn't working
<mgz> simplestreams is unreadable junk
<mgz> and why we're still logging config contents >_<
<Muntaner> mgz, will juju never work with containers? :)
<mgz> ? it does.
<mgz> so, can you now bootstrap or are you still stuck on simplestreams?
<Muntaner> mgz, sorry, got confused! seems to be bootstrapping, but now it's stuck at Installing package: cloud-image-utils
<Muntaner> maybe I've got some issues in my openstack
<dooferlad> dimitern: replied to that review (https://github.com/juju/juju/pull/2621)
<dimitern> dooferlad, replied
<dooferlad> dimitern: thanks. Will fix up as suggested
<dimitern> dooferlad, cheers
<dooferlad> dimitern: by the way, I will probably be a bit late for the networking knowledge sharing because I need to pick my daughter up.
<Muntaner> maybe I'm having problems with security groups...
<Muntaner> guys, shall I open some ports in my default security group?
<Muntaner> because the environment gets bootstrapped
<Muntaner> but it gets stuck at the apt-get upgrade...
<Muntaner> can't ssh to machines, can't get ssh status
<Muntaner> juju status*
<dimitern> dooferlad, ok, no worries
<Muntaner>  DEBUG juju.api apiclient.go:337 error dialing "wss://192.168.0.97:17070/environment/46eefeea-bd6a-43a9-8571-23a841643c0f/api", will retry: websocket.Dial wss://192.168.0.97:17070/environment/46eefeea-bd6a-43a9-8571-23a841643c0f/api: dial tcp 192.168.0.97:17070: connection refused
<Muntaner> guys, can anybody help me in understanding why it's getting stuck at Installing package: cloud-image-utils
<Muntaner> ?
<Muntaner> mmmh...
<Muntaner> where can I find the tgz of the tools?
<natefinch> Muntaner: I haven't done this personally, but I Think this section shows the relevant topics: https://jujucharms.com/docs/stable/howto-privatecloud#image-metadata
<Muntaner> natefinch, I think I fixed the metadata problem
<natefinch> Muntaner: cool
<Muntaner> natefinch, the host machine got lost in this:
<Muntaner> in the /var/log/cloud-init-output.log, I got this:
<Muntaner> http://paste.ubuntu.com/11756909/
<Muntaner> a lot of
<natefinch> Muntaner: hmm
<natefinch> Muntaner: do the machines have outside access to ubuntu package archives?
<Muntaner> natefinch, naturally, yes
<Muntaner> natefinch, now It works, maybe I got some network hic-cups
<Muntaner> a non juju-related question: do skype work for you? for me, it crashes after 5 seconds under xubuntu and fedora
<natefinch> Muntaner: I don't use skype, just google hangouts.  Works very reliably.  Not sure if they use similar technology.
<Muntaner> natefinch, I got a problem with endpoints
<Muntaner> who is telling juju machine 0 the endpoints?
<Muntaner> because it is looking for "controller:8774"
<Muntaner> but naturally, the vm can't know who is "controller"
<natefinch> Muntaner: I'm not sure where that information is coming from.  That's not hardcoded or anything (neither is that port, I don't see 8774 in the code at all)
<Muntaner> natefinch, I think it is asking to my openstack "what are your endpoints"?
<mgz> Muntaner: it's in your keystone config
<Muntaner> so I probably neet to change them
<Muntaner> need*
<Muntaner> it's fine :)
<voidspace> dimitern: dooferlad: network problems are due to ethernet-over-power hardware problems
<voidspace> still seeing if they can be resolved or if I need new hardware
<Muntaner> guys
<Muntaner> I'm trying to deploy juju-gui
<Muntaner> on my fresh vivid environment
<Muntaner> does it exist for vivid?
<Muntaner> 'cos I'm getting a sad "ERROR juju.cmd supercommand.go:430 cannot resolve charm URL "cs:vivid/juju-gui": charm not found"
<natefinch> Muntaner: I don't think the gui charm exists for vivid.  Most charms don't exist for vivid.  rick_h_ would know ^
<Muntaner> aw!
<natefinch> Muntaner: you could always copy the charm and put it on launchpad under your own name for vivid... in fact, I wouldn't be surprised if someone else already had
<mgz> you can always download a trusty charm, rename the series, and try deploying from local:
<natefinch> that too ^
<rick_h_> natefinch: correct, we're only LTS
<natefinch> It's sort of unfortunate that we tie charms to series so tightly, when a lot of the time they work fine in other series.
<dimitern> voidspace, oh boy :/ one drawback of ethernet-over-power
<voidspace> dimitern: yeah
<voidspace> dimitern: a router or network card can just as easily fail too though
<voidspace> dimitern: I don't think they're inherently unreliable, just one extra piece that can go wrong
<voidspace> dimitern: looks like the remote one has just died
<voidspace> dimitern: a system reset on the main one hasn't helped and according to the diagnostic tool I have the remote one isn't working
<voidspace> isn't being detected at all
<Muntaner> hey guys
<Muntaner> last thing :)
<dimitern> voidspace, but what's the problem you've discovered?
<Muntaner> I should deploy a private local bundle on my fresh juju
<Muntaner> I've got my bundle.yaml there
<voidspace> dimitern: well, as the remote unit doesn't work I have no network
<Muntaner> what was the command?
<voidspace> dimitern: and my networking configuration for the machine requires eth0 to be connected
<dimitern> voidspace, remote unit being the other end of the EoP ?
<voidspace> dimitern: yep
<mgz> Muntaner: I should have mentioned earlier, but you'd really be better off in #juju rather than here
<voidspace> dimitern: the end my desktop is connected to
<dimitern> voidspace, right
<dimitern> voidspace, can you use a cable instead?
<mup> Bug #1467556 opened: TestMachineAgentRunsEnvironStorageWorker fails <ci> <intermittent-failure> <test-failure> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1467556>
<Muntaner> mgz, sorry
<voidspace> dimitern: my desktop is upstairs, so no
<voidspace> dimitern: I can go back to wifi and order a replacement -
 * TheMue is shortly afk to ride back home in a dry moment ;)
<dimitern> voidspace, I see, well I hope you figure out how to fix it :)
<voidspace> dimitern: I can probably set up a virtual maas with maas 1.8 to test code against until the replacement arrives
<voidspace> dimitern: I can't run my "real maas" without working ethernet
<dimitern> voidspace, sounds good, and I can give you a hand with testing on both my maas-es
<voidspace> dimitern: cool
<voidspace> dimitern: thanks
<dimitern> dooferlad, can you update bug 1463480 if there's anything you've missed ?
<mup> Bug #1463480: Failed upgrade, mixed up HA addresses <blocker> <canonical-bootstack> <ha> <upgrade-juju> <juju-core:Triaged> <juju-core 1.22:Triaged> <juju-core 1.24:Triaged> <hacluster (Juju Charms Collection):New> <https://launchpad.net/bugs/1463480>
<mup> Bug #1467590 opened: Running out of disk space blocks interacting with env on cli <juju-core:New> <https://launchpad.net/bugs/1467590>
<katco`> ericsnow: natefinch: planning meeting
<natefinch> katco: coming
<katco> ericsnow: wwitzel3: so i've never actually worked on a hook before. where are those defined?
<katco> ericsnow: wwitzel3: gh.com/juju/charm?
<katco> /hooks?
<ericsnow> katco: what do you mean by "hook"?
<katco> ericsnow: this is for the launch command
<ericsnow> katco: we aren't adding any hooks
<perrito666> katco: hooks are in the charm package
<katco> ericsnow: k, think i've found it: uniter/runner/jujuc/. however, where should ours live under process?
<ericsnow> katco: we already wrote all that
<ericsnow> katco: process/context
<ericsnow> katco: see register.go for a hook context command
<ericsnow> katco: launch will be very similar
<katco> ericsnow: awesome. ty
<ericsnow> katco: :)
<thumper> wallyworld: I thought you said this was fixed? http://reports.vapour.ws/releases/2801/job/run-unit-tests-precise-i386/attempt/2149
<wallyworld> it was - i checked the commits
<wallyworld> if it's still broken there's maybe a regression or another problem?
<wallyworld> in a meeting, will check soon
<mup> Bug #1467690 opened: inconsistent juju status from cli vs api <canonical-bootstack> <juju-core:New> <https://launchpad.net/bugs/1467690>
<mwhudson> davecheney: rugby can talk to its proxy again
<wallyworld> thumper: i had a look at the logs for that test - i call bullshit
<wallyworld> i can't see an int overflow there
<thumper> cmd/juju/addrelation_test.go:18: undefined: CmdBlockHelper
<wallyworld> there's a uniter test failure relating to an upgrade test
<thumper> I'll look, but curious that it doesn't fail all the time
<wallyworld> yeah, go figure
<axw> wallyworld: conn died
<davecheney> mwhudson: thanks
<davecheney> mwhudson: actually
<davecheney> it's  not working for me
<mwhudson> hm
<mwhudson> i managed to clone go a few minutes ago
<davecheney> are you still in #is
<mwhudson> i never leave!
<davecheney> welcome to canonical, where raising RTs is for the weak
#juju-dev 2015-06-23
<davecheney> thumper: http://paste.ubuntu.com/11759872/
<davecheney> this morning's results
<davecheney> there are three packages with races remaining
<davecheney> 2 of them are going to be involved fixes
<mup> Bug #1467712 opened: cmd/jujud/agent: data race in test <juju-core:New> <https://launchpad.net/bugs/1467712>
<thumper> hmm
<mup> Bug #1467715 opened: worker/peergrouper: data race in package <juju-core:New> <https://launchpad.net/bugs/1467715>
<davecheney> thumper: this one is worse https://bugs.launchpad.net/juju-core/+bug/1467715
<mup> Bug #1467715: worker/peergrouper: data race in package <juju-core:New> <https://launchpad.net/bugs/1467715>
<thumper> davecheney: what's it doing?
<davecheney> there are lots of races here
<davecheney> it looks like an internal slice has been leaked to the caller
<davecheney> and the caller is sorting it
<natefinch> ouch
<davecheney> _but_ the race only happens when the test fails
<davecheney> and these are failures we've seen in ci
<davecheney> if the test passes, there is no race
<davecheney> \o/
<thumper> natefinch: happy birthday from the future
<natefinch> thumper: thanks! :)
<natefinch> davecheney: lol
<davecheney> \o/ yes, stop working, go and have a birthday
<natefinch> Any time I don't have any kids in the same room with me, it's a party ;)
<anastasiamac> \o/
<perrito666> natefinch: aw, I am too lazy to wait until midnight to tell you hb, can I leverage the fact that most of my team is in tomorrowland and wish you a hb now?
<natefinch> perrito666: haha, sure :)  Thanks :)
<davecheney> thumper: menn0 https://github.com/juju/juju/pull/2624
<davecheney> just a small one
<wallyworld> thumper: want a bug to fix as part of your bug squad fun and games?
<menn0> davecheney: looking now... was having lunch
<menn0> davecheney: Ship It (although I see you were merging anyway)
<thumper> wallyworld: what is it?
<wallyworld> thumper: bug 1467690
<mup> Bug #1467690: inconsistent juju status from cli vs api <canonical-bootstack> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1467690>
<thumper> ha...
<thumper> I'm addressing ci blockers first
<wallyworld> i did some diagnostics, should be easy to demo using a test and then fix
<wallyworld> ok
<thumper> I'll make sure it is on the list...
<wallyworld> i might get time today
<wallyworld> if not, tonight
<thumper> ...
<thumper> [LOG] 0:00.122 ERROR juju.service.systemd invalid conf for service "jujud-machine-99": relative path in ExecStart (C:/Juju/lib/juju/init/jujud-machine-99/exec-start.sh) not valid
<thumper> this error message is all sorts of wrong
<thumper> log says vivid machine, tests running on windows
<thumper> pretty sure ExecStart shouldn't be C:/ ...
<thumper> WTF...
<thumper> test is weird
<thumper> says "Series": "quantal"
<thumper> but tools "1.2.3-vivid-amd64"
<thumper> talk about confused
<davecheney> thumper: i can explain that
<davecheney> we fake the series most of the time
<davecheney> and i know that those faked fixtures are reused many times
<davecheney> this is why I get so cross about version.Version being reused
<davecheney> this is the same problem
<thumper> ugh
 * thumper looks for the code that is obviously screwing up
 * thumper sighs
<davecheney> thumper: is the build blocked ?
<thumper> davecheney: yeah, looking at the blocker now
<davecheney> someone just marked it invalid
<thumper> I actually have a useful stack trace from the failure
<thumper> davecheney: someone did for trunk, it really shouldn't be invalid
<thumper> I'm looking at 1.24 first
<thumper>  (╯°□°）╯︵ ┻━┻)
<thumper> FAARRRRKKK!!!!!!!!!!!!!!!!!!
<thumper> davecheney: EXACTLY what you said before
<thumper> version.Current used for the wrong thing
 * thumper wants to stab someone
<thumper> systemd support
 * davecheney pats thumper on the back
 * davecheney leaves a little post it "i told u so"
<davecheney> thumper: the mess may be partly my fault
<davecheney> or at least
<davecheney> i was the last person to try to patch over the horror
<davecheney> if you find a bit of code that sniffs around the version string, then assiduously overwrites parts of it
<davecheney> that was my fault
<davecheney> that was what we needed to do to get ppc64 working for T
<thumper> no...
<thumper> this is just blatantly using version.Current.Series to work out the datadir for the systemd data directory
<thumper> which is wrong in so many ways
<thumper> works by accident for all juju devs using ubuntu
<thumper> trying to find the point to insert the right info
<menn0> thumper: the existing debuglog API handler doesn't stop if the API server stops... i've fixed that but am now wondering if that was intentional
<menn0> thumper: do you happen to remember?
<thumper> oversight
<thumper> it should
<menn0> good :)
<menn0> thumper: the final significant db-log PR is ready... just doing some manual testing
<thumper> kk
 * thumper headdesks
 * thumper raises head high
<thumper> WWWHHHYYY!!!!!
 * thumper wonders how much of a hatchet to wield
<menn0> thumper: what's broken?
<thumper> service detection code
<thumper> inappropriate structures passed around because it happens to have a few useful fields
<davecheney> when the only field you have is a hammer, everything looks like a hammer
<davecheney> 13:40 < thumper> this is just blatantly using version.Current.Series to work out the datadir for the systemd data directory
<davecheney> ^ this, now you feel my rage
 * thumper nods
<davecheney> thumper: how can I help ?
<davecheney> can i fetch you a refreshing clue bat ?
<thumper> find whoever added Version.OS and use the clue bat on them
<thumper> there is NO compelling reason to have it AFAICS
 * thumper sharpens the hatchet and wades into the code
 * thumper starts with service...
<davecheney> thumper: I'm sensing a theme for next weeks bug fixing
 * thumper sobs while hacking
<menn0> Achievement unlocked! Menno Smits got review request #2000!
<menn0> thumper: http://reviews.vapour.ws/r/2000/
<anastasiamac> menn0: well done! today is the day to get lotto, i guess :D
<mup> Bug #1404946 changed: charm-upgrade hangs forever <canonical-bootstack> <upgrade-charm> <juju-core:Expired> <https://launchpad.net/bugs/1404946>
<thumper>  (╯°□°）╯︵ ┻━┻
<mup> Bug #1467753 opened: cmd/jujud/agent: multiple data races detected <juju-core:New> <https://launchpad.net/bugs/1467753>
<thumper> davecheney: the map ordering problem is in the coreos/systemd repo
<davecheney> oh
<davecheney> dear
<thumper> hazaah
<davecheney> github.com/coreos/systemd ?
<thumper> aye
<thumper> go-systemd
 * thumper hacks and slashes
 * thumper grabs a copy of the serialization code
<menn0> thumper: another one: http://reviews.vapour.ws/r/2001/
<thumper> menn0: sorry, been stuck on the ci blocker
<menn0> thumper: no worries
 * thumper copies code from the go-systemd package for now
<menn0> these can wait
<thumper> pfft...
<thumper> the tests inside coreos/go-systemd/unit don't actually pass
<thumper> and can't pass if you look at the fucking code
 * thumper rages 
<thumper> anyone http://reviews.vapour.ws/r/2002/
<thumper> this fixes the current ci blocker, which fails on windows and ppc
 * thumper has to go help with dinner now.
<davecheney> ping, who's on call reviewer tonight ? http://reviews.vapour.ws/r/2003/
<wallyworld> fwereade: are you happy with horatio's latest uniter related PR? it looks ok at first read but i'm sure i won't pick up any subtle issues http://reviews.vapour.ws/r/1979
<wallyworld> fwereade: also, i need to talk to you later about idle time - i really want to leave it at 2 seconds otherwise status will be wrong a lot more often than it is right and so far it seems to be working fine in practice
<fwereade> wallyworld, thanks for the reminder, looking now
<dimitern> davecheney, me; looking
<dimitern> davecheney, ship it
<fwereade> wallyworld, so what I am mainly concerned about is that having an idleness timer that is higher-resolution than the event timer is going to lead to pathological flickering back and forth in certain circumstances
<fwereade> wallyworld, nothing-to-do-ohwait-look-nothing-to-do-oh-wait-look
<fwereade> wallyworld, and while the ideal is swift convergence to reliable values
<fwereade> wallyworld, I would prioritise the reliability over the swiftness
<fwereade> wallyworld, and as it stands even the best possible timings for relation chatter mean we can't reasonably infer that relation chatter has finished until at *least* 10s have elapsed, if not 15s
<rogpeppe1> fwereade: do you by any chance know how to turn on juju feature flags within tests?
<davecheney> dimitern: thanks
<davecheney> select {} is more like time.Sleep(some massive int)
<davecheney> but you don't need to import the time package
<dimitern> davecheney, I see - interesting trick though :)
<davecheney> maybe it could be written as
<davecheney> for { runtime.Gosched() }
<davecheney> maybe
<davecheney> which might have the same result
<davecheney> but probably not
<dimitern> so a goroutine using select{} is just not running and it's not scheduled
 * fwereade scratches head at rogpeppe1, I did it once, possibly via SetFlagsFromEnvironment?
<davecheney> yeah
<fwereade> rogpeppe1, it seemed a bit janky but doable
<rogpeppe1> fwereade: finally found it, yes
<rogpeppe1> fwereade: seems like there should be a SetFlags call really
<rogpeppe1> davecheney: out of interest, how was the deepCopy function failing?
<rogpeppe1> davecheney: (i think it was me that suggested that solution and I'm interested to know why it went wrong, and particularly so if it triggered some kind of race condition)
<davecheney> it didn't appear to be duplicating slices properly
<davecheney> http://paste.ubuntu.com/11759941/
<davecheney> here is the race failure
<rogpeppe1> davecheney: how is that possible?
<rogpeppe> davecheney: there's no connection between old and new values
<rogpeppe> davecheney: as the entire thing gets marshaled and unmarshaled to a byte slice
<davecheney> i dunno
<davecheney> but taking it out and doing it by hand fixed the issue
<rogpeppe> davecheney: there's much more likelihood of getting it wrong by doing it manually
<rogpeppe> davecheney: your manual copy doesn't actually copy as much
<davecheney> once we have the races fixed
<davecheney> we
<davecheney> we'll have a voting race build which will double check our work
<rogpeppe> davecheney: please try to take the time to understand *why* a race is happening rather than papering over the cracks
<rogpeppe> davecheney: i don't believe that deepCopy routine was at fault here, and by *not* deep copying, i'm concerned that there might be more potential for race conditions
<rogpeppe> davecheney: for example, there are quite a few Member fields which are pointers and are not now being appropriately copied
<wallyworld> fwereade: i have to go to soccer, i'll ping you later, i think we need to talk through the issue
<fwereade> wallyworld, sgtm, enjoy
<rogpeppe> davecheney: it's quite possible that changing the copy changed the timings so the race detector doesn't trigger (that does happen)
<rogpeppe> davecheney: i have a feeling that the change that actually fixed the issue was probably your changes on lines 111, 397 and 398
<rogpeppe> davecheney: as that means that the watcher won't trigger initially
<rogpeppe> davecheney: which actually breaks the expected watcher semantics, i think
<fwereade> rogpeppe, I think you're right there, not having looked at the actual CL
<rogpeppe> fwereade: i'm just looking at the race, trying to see what's actually going on
<rogpeppe> fwereade: ah, i think i understand the issue
<jam> fwereade: standup?
<fwereade> jam, oops, where does the time go, just a sec
<rogpeppe> fwereade, dimitern: an alternative fix for the peergrouper race: https://github.com/juju/juju/pull/2631
<fwereade> rogpeppe, assertMembers change LGTM, I trust that the rest is just reverts :)
<rogpeppe> fwereade: thanks. yes, that's the case.
<rogpeppe> fwereade: the peergrouper package is full of intermittent failures though. i don't remember it being like that before :)
<rogpeppe> fwereade: but probably it was all my fault
<fwereade> rogpeppe, I sort of doubt it actually
<fwereade> rogpeppe, races and suchlike do seem to get inserted quite a lot during "maintenance" :/
<rogpeppe> fwereade: yes, it's easy to do when the invariants aren't spelled out, which should probably be done better in this package
<TheMue> fwereade: did I get you right? create a type in ipaddress.go which implements the three corresponding methods (code moving) and embed it into State?
<rogpeppe> fwereade: and to be fair the testing style in worker/peergrouper is quite experimental
<rogpeppe> dimitern: i'd appreciate your take on this, as you signed off on the original PR http://reviews.vapour.ws/r/2003/
<rogpeppe> dimitern: oops, http://reviews.vapour.ws/r/2005/ of course
 * rogpeppe tries to avoid getting sucked into Fixing All The Things.
<fwereade> TheMue, I'd prefer explicit access over embedding, but, yeah
 * fwereade feels rogpeppe's pain
<dimitern> rogpeppe, looking
<rogpeppe> dimitern: thanks
<perrito666> Fwereade tx for the review I will look at it in depth when I get to something bigger than my phone
<TheMue> fwereade: ah, ok, I prefer explicit too. easier to maintain, and no methods available where you can't directly see where they are implemented
<natefinch> rogpeppe: if you have time today, I'd love a review of deputy, since it's in a relatively final state.  Here's a PR that has the full code for review: https://github.com/juju/deputy/pull/1
<evilnick> natefinch, the internet tells me it is your birthday today. If true, Happy Birthday :)
<natefinch> evilnick: the internet is correct, as it always is ;)  Thanks! :)
 * natefinch is having leftover cake from father's day for breakfast :D
<mup> Bug #1467873 opened: leadership lost during service teardown <juju-core:New> <https://launchpad.net/bugs/1467873>
<davecheney> rogpeppe: thanks for reverting that
<davecheney> i was just about to do that
<rogpeppe> davecheney: np
<rogpeppe> davecheney: there are a few other intermittent failures in peergrouper that would be nice to get to the bottom of
<rogpeppe> davecheney: i'm also seeing something that looks like a go bug, but i think that's probably a feature of tip only
<davecheney> ok, i'll try again tomorrow once the revert lands
<rogpeppe> natefinch: reviewed
<natefinch> rogpeppe: awesome, thanks!
 * natefinch just realized that he mocked out os.Exit, and his code is relying on it to terminate the function it's in.  
<rogpeppe1> does anyone know what should be the restrictions on the format of environment names?
<rogpeppe1> looks like it can't contain / or \, but other than that, i guess anything should be ok
<rogpeppe1> another random question without much hope of answer: anyone know what proxy.Settings.NoProxy is for?
<axw> rogpeppe1: pretty sure it's used for "no_proxy", as in wget and friends
<rogpeppe1> axw: ah, i didn't know about that - guess i should've google it :)
<rogpeppe1> googled
<axw> rogpeppe1: env name shouldn't contain "-
<axw> " either I think
<axw> otherwise tags would be broken?
<rogpeppe1> axw: ah
<rogpeppe1> axw: maybe that should be enforced in environs/config then?
<rogpeppe1> axw: (currently it just checks for / and \)
<axw> rogpeppe1: sorry, thinking of the IDs
<axw> rogpeppe1: which are just UUIDs... never mind
<rogpeppe1> axw: yeah, not id
<rogpeppe1> axw: it's just that kind of thing i'm wondering about though
<rogpeppe1> axw: as i'm just about to automatically generate an environment name, and i don't want to break things
<rogpeppe1> axw: i'm also trying to see a way forward to being able to call Provider.PrepareForCreateEnvironment on the server side not the client side
<axw> rogpeppe1: what's preventing that?
<rogpeppe1> axw: logic that gets env vars
<axw> ah
<rogpeppe1> axw: the real bad apple here is the local provider which does all kinds of shenanigans, running commands etc
<rogpeppe1> axw: which in a way doesn't matter (who wants to run multi environments locally anyway) but it would be nice to have it working for tests
<rogpeppe1> s/anyway/anyway?/
<rogpeppe1> axw: BTW do you think it's reasonable for someone to be able to specify a specific agent-version setting when creating a new environment in a JES?
<axw> rogpeppe1: I don't see why anyone would want to, but does allowing it (within reason) make something difficult?
<rogpeppe1> axw: just wondering - it could potentially be awkward i guess, if someone uses an incompatible agent version
<axw> rogpeppe1: ok. I'm not sure, sorry, better to ask thumper
<rogpeppe1> axw: you don't by any chance know off-hand what OS context the local provider's jujud runs in, do you?
<axw> rogpeppe1: what OS context? not sure what you mean by that
<rogpeppe1> axw: will it see the same environment variables that the user had when they bootstrapped?
<axw> rogpeppe1: ah. I don't think so - pretty sure we just write an upstart/systemd conf, and env vars won't generally be preserved
<rogpeppe1> axw: ah, ok so we really do need to do all that stuff locally
<axw> rogpeppe1: what exactly do you want to do server-side? generate the complete env config?
<rogpeppe1> axw: yes, to the greatest extent possible
<axw> rogpeppe1: and then something could, say, fetch the config into a .jenv file to run?
<mup> Bug #1467374 changed: worker/uniter/filter: ci test failure <juju-core:Triaged> <https://launchpad.net/bugs/1467374>
<mup> Bug #1456763 opened: TestUnitRemoval fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1456763>
<wallyworld> dimitern: can i have a small review to fix a 1.24.1 critical? http://reviews.vapour.ws/r/2007/
<wallyworld> or perrito666 ^^^^
<perrito666> wallyworld: reviewed
<wallyworld> perrito666: tyvm
<wallyworld> oh, anastasiamac also did it
<anastasiamac> wallyworld: perrito666: I just perused :D it does not really count :))
<natefinch> rogpeppe1, axw: we should really document the format of environment names
<perrito666> wallyworld: I am familiar with the code so mine does count :p
<wallyworld> perrito666: it is a bad bug - affecting a paying customer site
<wallyworld> so it seems, there were upgraded from 1.20.14
<katco> natefinch: standup
<rogpeppe1> natefinch: true, but environment names are getting increasingly redundant
<natefinch> rogpeppe1: not sure I agree with that.  you still need to know what environment Juju is talking about, and if you and juju disagree on what it's called, it'll be confusing
<rogpeppe1> natefinch: an environment can have many names
<rogpeppe1> natefinch: it's used to tag provider machines, but now we've got ResourceTags for that
<natefinch> rogpeppe: just thinking of reading logs etc.
<natefinch> rogpeppe: I guess if you never have mixed logs, it's not a problem
<rogpeppe> natefinch: the UUID is the thing that matters, not the name
<natefinch> rogpeppe: right, but uuids are not human-friendly
<natefinch> "wait am I supposed to be looking at environment de305d54-75b4-431b-adb2-eb6b9e546014 or 123e4567-e89b-12d3-a456-426655440000?"
<rogpeppe> natefinch: that can easily be dealt with with a tiny amount of tooling
<mup> Bug #1467964 opened: state still serializes external types <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1467964>
<rogpeppe> natefinch: sed "s/$UUID/my-environment/g"
<natefinch> rogpeppe: exactly the usability issue I was talking about
<rogpeppe> natefinch: how is that a usability issue? i don't believe that all logs should be directly human readable necessarily.
<rogpeppe> natefinch: and as i said, there is no one global name for an environment except the UUID
<natefinch> rogpeppe: I bet our users would disagree
<rogpeppe> natefinch: with the last fact?
<natefinch> rogpeppe: with the fact about logs not needing to be directly human readable
<rogpeppe> natefinch: lots of logs are in JSON format. that's not very human readable, but it's very useful and eminently toolable
<natefinch> rogpeppe: and if we don't give an environment a human-readable when it is created, that's our own fault.  I'm not saying that name has to be globally unique.  Or even unique at all. Just human readable.
<rogpeppe> natefinch: i'm not objecting to having a label that can be attached to an environment
<rogpeppe> natefinch: but that's quite a different role than the environment name has traditionally played in juju
<rogpeppe> natefinch: for example, if it's just a label, you might consider being able to change it
<natefinch> rogpeppe: yes.  And don't get me wrong, I definitely think we need a UUID on every environment to identify it to our code.
<natefinch> rogpeppe: just saying we should also have some kind of label for the poor sucker reading the logs at 3am
<rogpeppe> natefinch: log size is a real problem for us. i'm not sure i'd want us to put the env name *and* the UUID in every log message
<rogpeppe> natefinch: mostly you won't want to mix log files between environments anyway
<natefinch> rogpeppe: yeah, that was my second thought - hopefully they never end up in the same place anyway.
<rogpeppe> natefinch: if you *do* mix 'em, just prefix each line with the UUID and provide a trivial tool to relabel according to whatever labels you deem appropriate or just grep.
<katco> ericsnow: you have a review (http://reviews.vapour.ws/r/1963/). nothing really wrong with the patch, but a few suggestions i think are good.
<ericsnow> katco: thanks
<katco> ericsnow: also, do you think bug 1466565 is related to the lxc collision issue? please don't dig into it too much if you don't know.
<mup> Bug #1466565: Upgraded juju to 1.24 dies shortly after starting <cts> <landscape> <sts> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1466565>
<katco> ericsnow: gah... why did rb screw up my code formatting?
<ericsnow> katco: I was just wondering that :)
<ericsnow> katco: did you indent the block 4 spaces?
<katco> ericsnow: i put it between `'s with newlines
<katco> ericsnow: i did not. is there a way to edit?
<ericsnow> katco: no need to quote; just indent 4 spaces
<ericsnow> katco: I don't think you can edit
<katco> boo
<katco> well, copy/paste to vim i suppose
<mup> Bug #1467973 opened: uploadSuite.TearDownTest Fails <ci> <intermittent-failure> <unit-tests> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1467973>
<ericsnow> katco: just try replying to the comment
<natefinch> for multiline code formatting, you can do ``` before and after
<ericsnow> katco: sorry, not sure about that bug
<katco> ericsnow: no worries at all
<katco> ericsnow: just thought it might spark something
<katco> natefinch: ty
<katco> ericsnow: updated
<ericsnow> katco: thanks
<katco> ericsnow: it looks like go's code relies solely on pid to get current user. good call
<katco> ericsnow: sorry, meant uid
<ericsnow> katco: np :)
<natefinch> rogpeppe: about your comment on deputy for StderrLog and StdoutLog being the same... that would require a change to the API, since right now those are functions, which can't be compared, so we'd have to make them interfaces instead... which kind of complicates the package's API.
<katco> natefinch: how's that work for the demo coming? :)
<natefinch> katco: crap, is that this iteration? ;)
<katco> natefinch: haha :p just a gentle nudge on priorities given we are way over capacity
<natefinch> katco: yep
<rogpeppe> natefinch: no, you wouldn't need to change the API
<rogpeppe> natefinch: oh yes, you would... hmm
<rogpeppe> natefinch: yeah, i dunno. seems a pity but no obvious solution
<natefinch> rogpeppe: ok, sort of what I thought too, just hoping you'd see something I missed.
<katco> ericsnow: passing thought. what do you think of creating a "proccmd" package under process/context? this way, instead of "NewFooCmd(...)", it is "proccmd.Foo(...)"
<ericsnow> katco: I had considered that before but tabled it
<ericsnow> katco: +0
<katco> ericsnow: lol
<katco> ericsnow: fair enough
<ericsnow> katco: we won't have a lot of commands, but it might still be a good idea
<katco> ericsnow: we'll pain the bike shed at some point.
<katco> paint
<ericsnow> katco: k
<katco> ericsnow: it's unclear to me how the arguments in the spec translate to the arguments passed into Init(...) for commands
<ericsnow> katco: take a look at the register command (and registeringCommand)
<ericsnow> katco: RegisterCommand maps onto the register command in the spec
<katco> ericsnow: yes, i'm looking at that. however, in the spec there are options that it looks like we're not accounting for?
<ericsnow> katco: like what?
<katco> ericsnow: i.e. spec takes some flags, Init(...) on register only takes name, and info
<katco> ericsnow: --definition, --extend, --override
<ericsnow> katco: those are defined on registeringCommand
<ericsnow> katco: which RegisterCommand embeds
<katco> ericsnow: ah ok
<ericsnow> katco: we did it that way expressly for the future work on the launch command :)
<katco> ericsnow: :) sorry i didn't see it
<ericsnow> katco: np
<natefinch> gah, I hate using godeps
<fwereade> http://reviews.vapour.ws/r/2008/ if anyone's of a mind to
<katco> ericsnow: is there a way we could have factored the use of cmd.Context out so the interface for our commands is simpler?
<ericsnow> katco: hadn't thought about it
<katco> ericsnow: i really dislike the chaining of suites
<katco> ericsnow: and it's more difficult to write unit tests when we're relying on the suite chain
<natefinch> +1 for less chaining
<katco> ericsnow: well anyway, i'm just going to write the tests as register has, but it is causing me some discomfort ;)
<ericsnow> katco: how so?
<katco> ericsnow: there's too much stuffed into parent structs
<ericsnow> katco: that's making it hard to write new tests?
<katco> ericsnow: it's making it very easy for me to write new tests that i don't fully understand
<ericsnow> katco: k
<katco> ericsnow: this is the style of unit test i prefer: https://github.com/juju/juju/blob/master/leadership/leadership_test.go#L95-L108
<katco> ericsnow: it's very easy to tell where your stubs are coming from and what they're doing
<ericsnow> katco: so you prefer creating a new stub in each test?
<katco> ericsnow: i like defining the functionality i'm stubbing out within the test
<katco> ericsnow: which does require a new instance of a stub in each test
<ericsnow> katco: while I think the discussion of what test methods should look like will be valuable, perhaps we should table it for now
<katco> ericsnow: sure, i'm continuing with the style as it is defined, just thought you'd be interested
<katco> ericsnow: https://plus.google.com/+KatherineCoxBuday/posts/7odKtVXgRB1
<ericsnow> katco: I think we have different opinions here but I'd like to get on the same page
<natefinch> haha, I was going to say I agree with katco, but evidently already did back in August ;)
<ericsnow> katco: so I'm glad you've brought it up :)
<katco> natefinch: lol
<katco> ericsnow: it's the perks of being on an awesome team. good discussion :)
<ericsnow> katco: :)
<natefinch> I don't like setuptest and setupsuite because they're not obvious enough.  It's easy to be reading a test and not understand how it works, only to find out it relies on stuff 500 lines up the file in SetupTest, but it's magic, so you can't tell from the test.
 * natefinch looks at the setuptest and setupsuite he just wrote and winces.
<natefinch> ericsnow: I think we made a mistake in putting ProcDetails in the plugin package.  I think it should go in juju/charm ...that way it stays in lockstep with charm.Process
<natefinch> ericsnow: as the input and output of the plugin
<ericsnow> natefinch: but charms have nothing to do with ProcDetails
<natefinch> ericsnow: hrmph.... yeah.
<natefinch> ericsnow: I was trying to avoid copying and pasting the code for serialization of ProcDetails
<katco> ericsnow: natefinch: perhaps the notion of logical vs. physical boundaries is applicable here
<ericsnow> natefinch: why do you have to copy-and-paste?
<natefinch> ericsnow: I have to convert the json into a struct that the plugin code can rationalize about
<ericsnow> natefinch: worst-case you have to import github.com/juju/juju/process/plugin
<natefinch> ericsnow: I can't. That's not version controlled.
<natefinch> ericsnow: anything under github.com/juju/juju can change at any time
<ericsnow> natefinch: hmm
<natefinch> ericsnow: that's why charm.v5 would have worked, because it *is* version controlled
<ericsnow> natefinch: maybe *for now* it would make sense to just keep the plugin in github.com/juju/juju/process/plugin/docker
<ericsnow> natefinch: that would buy us time to sort out the issue
<natefinch> ericsnow: or like katco said, instead of purely logical boundaries, we use a physical boundary for this code
<natefinch> ericsnow: the plugin code already imports charm.v5 for the Process struct, it's not unreasonable to put the ProcDetails struct there... even if it's not strictly part of charm code (it does certainly relate to charms)
<ericsnow> natefinch: also, it's not like there's a lot of structure to what the plugin must serialize, right?
<natefinch> ericsnow: for now, sure :)
<natefinch> ericsnow: we can punt on it for now and I can copy pasta
<ericsnow> natefinch: I mean there shouldn't be much copying
<natefinch> ericsnow: there's not :)
<natefinch> ericsnow, katco: gotta run, birthday time
<ericsnow> natefinch: this does bring up the question of perhaps versioning the plugin serialization format
<katco> natefinch: have fun dude
<katco> natefinch: happy birthday to you and your wife
<ericsnow> natefinch: happy birthday!
<natefinch> ericsnow: we can talk later.  I do think versioning the format is a good idea. We'll have to figure out how to do that
<natefinch> thanks!
<ericsnow> natefinch: I'll add a card
<katco> ericsnow: keep in mind, versioning may be ok to fudge for the demo
<ericsnow> katco: agreed
<katco> ericsnow: a bit confused. is registeringCommand intended to be the base command for all commands? it seems geared towards register specifically?
<ericsnow> katco: it's for register and launch
<katco> ericsnow: // registeringCommand is the base for commands that register a process
<katco> // that has been launched.
<katco> ericsnow: maybe we should update that? it makes it seem like the process has already been launched?
<ericsnow> katco: that's correct
<ericsnow> katco: the launch command launches the proc via the plugin and then registers it
<katco> ericsnow: ah i think i see now. that is intended to be called after launch does its thing
<ericsnow> katco: for the launch command the Run method will make the call to the plugin and then call the register method with the result
<katco> ericsnow: gotcha
<katco> ericsnow: i'm assuming i want to convert from plugin.ProcStatus -> process.Status. is there a method defined for that already?
<ericsnow> katco: actually you don't
<ericsnow> katco: they are two different statuses
<ericsnow> katco: ProcStatus is sent as-is
<katco> ericsnow: plugin.Launch returns a plugin.ProcStatus, register wants a process.Status
<ericsnow> katco: Status is always set to Active
<katco> ericsnow: oh, surprising...
<katco> ericsnow: so we just ignore the actual status from launching the plugin?
<ericsnow> katco: Launch returns a process.Details
<ericsnow> katco: pretty much...it's just informational (we will display it in juju status)
<katco> ericsnow: so if plugin.Launch returns "error, this is absolutely not running", we still pass "StatusActive" to register?
<ericsnow> katco: in that case the command should fail
<ericsnow> katco: but that should be handled via the error return from the plugin
<ericsnow> katco: not the status
<katco> ericsnow: ah gotcha. so if no error is returned, we assume the plugin has done the right thing, and whatever status is displayed is representing some good state?
<ericsnow> katco: yep
<katco> ericsnow: k makes sense now
<ericsnow> katco: oh good
<katco> ericsnow: and i will comment to that effect ;)
<alexisb> thumper, ping
<alexisb> can you join us please
<thumper> coming
<katco> ericsnow: is baseCommand::getInfo().Process the correct place to get the charm.Process? it looks like that may be circular reasoning
<ericsnow> katco: when one of our hook context commands is run the user provides the name of the process
<wallyworld> thumper: menn0: for 1.24.2, just a reminder, don't forget to land the mgo v2 dep change once 1.24 is unlocked
<ericsnow> katco: the base command uses that name to extract that appropriate info from the hook context
<menn0> wallyworld: will do. thanks.
<ericsnow> katco: after that the info is available through the info field of the base command
<ericsnow> katco: thus you can then get the charm.Process via info.Process
<jw4> what is required for a change to get into 1.24 at this point?
<jw4> wallyworld, alexisb ^^ ?
<wallyworld> jw4: what change?
<alexisb> jw4 it needs to be a regression, critical impact
<alexisb> jw4, why?
<jw4> #eco is saying that bug 1457205 is blocking some critical features in CABS
<mup> Bug #1457205: Subordinate charm Action data not reported by API <actions> <charmers> <subordinate> <juju-core:Triaged by johnweldon4> <https://launchpad.net/bugs/1457205>
<katco> ericsnow: ok, looks like that's done through basecommand::init(...)?
<jw4> mind you they did not ask me to escalate
<ericsnow> katco: yep
<wallyworld> jw4: alexisb: marco seems to be happy for it to be fixed for 1.25, what's the reason for asking about 1.24?
<alexisb> wallyworld, it is coming up again in actions discussions
<jw4> wallyworld: arosales just discovered that it's impacting some critical functionality with CABS
<wallyworld> so we could target to 1.24.2
<jw4> is there a freeze/cut-off date?
<katco> ericsnow: am i free to make the ProcLaunchCommand ctor signature whatever, or does that conform to some function sig?
<alexisb> jw4, is this a bug you are willing to take?
<wallyworld> jw4: there's some upgrade issues to fix on 1.24.2 so it will be a few days i expect
<jw4> alexisb: it's assigned to me right now - I was just expecting a more sedentary approach
<jw4> :)
<wallyworld> by EOW would be good
<alexisb> wallyworld, based on our discussion today in the release call I would think the release target date to be 7/3 with a freeze date earlier in the week
<wallyworld> for 1.24.2? that seems a way off
<jw4> ... it's closer than it appears in the mirror
<alexisb> well given 1.24.1 is going out tomorrow that seems very reasonable
<alexisb> that is less than 2 weeks
<ericsnow> katco: it has to conform to the signature expected by the registration func; so keep it the same as NewProcRegistrationCommand
<wallyworld> ok
<katco> ericsnow: i.e. func(HookContext) (*ProcLaunchCommand, error)?
<wallyworld> my hope is we get stuff fixed sooner if possible so we are not under release pressure
<arosales> wallyworld, an issue for anyone wanting to do benchmarking with Juju and use subordinates.  We would like to make benchmarking with juju generally available next week
<ericsnow> katco: yep
<jw4> wallyworld: +1
<alexisb> wallyworld, agreed
<alexisb> arosales, that could be an issue
<alexisb> depending on how the fix can come about
<alexisb> ie I dont see us really getting 1.24.2 + fix for 1457205 out before eow next week (at the earliest)
<arosales> alexisb, Understood if it can't make next week, but we don't want benchmarking to be crippled for too long after release
<alexisb> ok arosales noted
<arosales> we could caveat in the release notes until 1.25, but the sooner the better so we could remove that caveat
<wallyworld> alexisb: if we aim for EOW this week for 1.24.2 fixes, then we can try for EOW next week for a release as a goal
<arosales> it's something we (eco eng) are tracking daily on our dev board
<alexisb> wallyworld, yep
<arosales> wallyworld, jw4, alexisb: thanks. Let us know if you need us to test anything
<katco> ericsnow: is that sig defined somewhere?
<wallyworld> arosales: will do. the benchmarking stuff is freaking awesome
<alexisb> arosales, in the bug we had an agreement for 1.25, so we will do the best we can to get it out in 1.24.2
<jw4> arosales: will do
<ericsnow> katco: worker/uniter/runner/factory.go?
<marcoceppi> wallyworld: just wait until next week's announcement ;)
<arosales> wallyworld, ya the devX team has done some good work there
<alexisb> and yes the benchmarking stuff is freak'n awesome
<wallyworld> marcoceppi: yes, looking forward to it :-)
<ericsnow> katco: look in component/all/processes.go to see how the commands get registered
<arosales> alexisb, understood and thanks for trying to get it in earlier
<alexisb> and marcoceppi I tweeted that "stuff" just for you ;)
<alexisb> and because it is freak'n awesome
<wallyworld> arosales: marcoceppi: in your announcement will you note the limitation with subordinates so folks don't get the breakage when jumping in to try it out?
<marcoceppi> wallyworld: yes, we have a page on the juju docs we're putting up, I'll highlight that limitation there
<katco> ericsnow: this is a little strange to me. i thought component was supposed to be the way features registered themselves? it looks like they're hard coded?
<arosales> marcoceppi, thanks and hopefully we don't have to note it for too long.
<arosales> keep rockin' it juju-core
<wallyworld> +1 :-)
<arosales> and thanks for working on https://bugs.launchpad.net/juju-core/+bug/1466629
<mup> Bug #1466629: Containers fail to get ip when non-maas dhcp/dns is used <dhcp> <dns> <lxc> <maas> <openstack-installer> <openstack-provider> <ubuntu-engineering> <ubuntu-openstack> <juju-core:Triaged> <https://launchpad.net/bugs/1466629>
<ericsnow> katco: the key phrase there is "register themselves"
<ericsnow> katco: code somewhere has to make the call
<ericsnow> katco: ergo "hard-coded"
<katco> ericsnow: i just pictured it inverted
<katco> ericsnow: features calling into this package to say "here i am"
<ericsnow> katco: component/all is the intersection point
<alexisb> arosales, I will follow-up with dimiter in the morning on https://bugs.launchpad.net/juju-core/+bug/1466629
<mup> Bug #1466629: Containers fail to get ip when non-maas dhcp/dns is used <dhcp> <dns> <lxc> <maas> <openstack-installer> <openstack-provider> <ubuntu-engineering> <ubuntu-openstack> <juju-core:Triaged> <https://launchpad.net/bugs/1466629>
<katco> ericsnow: this way, we have 1 package that imports the world
<ericsnow> katco: but they have to be imported to trigger that
<alexisb> arosales, but if logs can be provided that would be most helpful
<ericsnow> katco: I went away from such import side-effects
<katco> ericsnow: we want it inverted; every feature imports all and says "here i am"
<ericsnow> katco: but something has to import all the components we want
<katco> ericsnow: not if you use the registration pattern
<ericsnow> katco: that's the way we had it before
<ericsnow> katco: mind hopping into moonstone?
<katco> ericsnow: sure
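The registration pattern katco and ericsnow are debating can be sketched as below (illustrative names, not juju's component package). The wrinkle ericsnow raises is the crux: even with init()-based self-registration, *something* must import each feature package to trigger its init, which is exactly the role a central component/all-style package plays:

```go
// Sketch of init()-based self-registration. In a real tree the
// Register call would live in each feature's own package, inside
// init(), and a central package would blank-import the features.
package main

import "fmt"

// registry is the central "here i am" table features register into.
var registry = map[string]func() string{}

// Register is called by features to announce themselves.
func Register(name string, f func() string) { registry[name] = f }

// A feature registering itself; init runs when the package is imported.
func init() { Register("procs", func() string { return "process management" }) }

func main() {
	for name, f := range registry {
		fmt.Println(name+":", f())
	}
}
```

The design choice, then, is only about where the coupling lives: either the central package calls each feature's setup function explicitly ("hard-coded", easier to trace), or it blank-imports them and relies on init side-effects (inverted, but the import list is still a hard-coded enumeration).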
<arosales> alexisb, I'll see if jcastro can reach out to the bug submitter and see if we can get a reproduction
<alexisb> thanks arosales !
<mup> Bug #1466660 changed: Unable to create hosted environments on EC2 <config> <ec2-provider> <juju-core:Invalid by cherylj> <https://launchpad.net/bugs/1466660>
<mwhudson> davecheney: say um, how heavily do you think cgo on arm64 has been tested?
<wallyworld> anastasiamac: perrito666: axw: i'll be 15 minutes late for standup as i have a clash
<perrito666> wallyworld: ah, I hate when I have punk rock bands too
<perrito666> :p
<wallyworld> gawd, dad joke
<perrito666> that was a terrible joke
<perrito666> sorry
<wallyworld> so you should be
<perrito666> axw: ping me when you arrive pls
<alexisb> cherylj, thumper ping
<thumper> coming
 * menn0 likes perrito666's joke #dad
<axw> perrito666: I have arrived
<perrito666> axw: that sounded batmanish
<davecheney> thumper: sorry i missed the standup
<davecheney> it's so cold here, it's hard to get out of bed that early
<thumper> what? down below 15°C?
<davecheney> mwhudson: the heaviest tests have probably been the ones that come with the std lib
<davecheney> juju might exercise the glibc bindings a bit
<mwhudson> yeah
<davecheney> but the more esoteric stuff, nope
<mwhudson> looking at the code, i'm a little concerned that the thread local storage used to save g over a cgo call is not, in fact, thread local
<wallyworld> axw: here now
<mwhudson> but i'm not sure how to check
<axw> wallyworld: joining
<mwhudson> davecheney: runtime·clone has the wonderful comment "// TODO: setup TLS."
<axw> perrito666: ^^
<alexisb> thumper, that video is awesome!
<thumper> :-)
 * axw is intrigued
<alexisb> so juju core developers are so getting chairs in october
<alexisb> https://www.youtube.com/watch?v=Y9ttBt-4vWo
<menn0> wallyworld: reviewed your ResourceManager facade branch
<wallyworld> menn0: ty, will look after standup
<menn0> wallyworld: tl;dr is "ship it" :)
<wallyworld> menn0: \o/ ty
#juju-dev 2015-06-24
<anastasiamac> thumper: re chairs... it makes sense now why u r advocating standing tables :D
<anastasiamac> desks even
<anastasiamac> menn0: tyvm for looking :)
<anastasiamac> menn0: it's still WIP but I have added a li'l detail to description :)
<davecheney> thumper: http://paste.ubuntu.com/11765246/
<davecheney> down to three races
<davecheney> one is trivial
<davecheney> the other two are more complicated to fix
<davecheney> actually, it's probably 4
<davecheney> 2 are simple
<davecheney> nope, the last one is the gocheck issue
<davecheney> so, 1 simple, 3 hard
<davecheney> urgh
<davecheney> i can't cannot
<davecheney> count
<davecheney> short version, there are fewer races today
<davecheney> some are easy to fix
<davecheney> some are hard
<davecheney> there are less than yesterday
<mwhudson> and more than tomorrow?
<menn0> anastasiamac: thanks. i've decided to be more picky about that when doing reviews
<anastasiamac> menn0: \o/
<mup> Bug #1468153 opened: Charms with storage don't use cloud-native default if size is specified, but provider is omitted <storage> <juju-core:Triaged> <https://launchpad.net/bugs/1468153>
<thumper> davecheney: I guess that's good
<mup> Bug #1468153 changed: Charms with storage don't use cloud-native default if size is specified, but provider is omitted <storage> <juju-core:Triaged> <https://launchpad.net/bugs/1468153>
<mup> Bug #1464356 opened: TestCloudInit fails <ci> <gccgo> <regression> <test-failure> <windows> <juju-core:In Progress by thumper> <juju-core 1.24:Fix Committed by thumper> <https://launchpad.net/bugs/1464356>
<mup> Bug #1468153 opened: Charms with storage don't use cloud-native default if size is specified, but provider is omitted <storage> <juju-core:Triaged> <https://launchpad.net/bugs/1468153>
<axw> wallyworld: updated http://reviews.vapour.ws/r/1994/, PTAL
<wallyworld> sure
<wallyworld> axw: just to check, as well as  TestRemoveLastVolumeAttachment where we check Dead, there's also a TestRemoveNotLastVolumeAttachment where we assert that the volume is not cleaned up?
<axw> wallyworld: we can't do shared storage yet, so no
<wallyworld> ok
<wallyworld> lgtm
<axw> thanks
<thumper> axw: seen bug 1466167 ?
<mup> Bug #1466167: debug-hooks no longer works with 1.24-beta6 <debug-hooks> <regression> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1466167>
<thumper> axw: I was thinking that this was fixed now
<axw> thumper: hrm, I tested debug-hooks after my latest change to jujuc
 * axw tests again
<axw> thumper: oh this was fixed when waigani reverted an older change
<thumper> axw: I have had reports that it broke again...
<axw> status just needs to be updated. will do that
 * thumper shrugs
<axw> I'll test again anyway
<thumper> thanks
<thumper> 1.24 and master if you can
<thumper> then just update the bug
<axw> sure
<thumper> I thought you had fixed it
<thumper> but then someone said it was broken "again"
<thumper> and I wasn't sure how soon again was
<thumper> axw: thanks for double checking though, I appreciate it
<axw> thumper: no worries
<thumper> hazaar !! \o/
<thumper> my ci blocker fix landed in 1.24 and just now, master
 * thumper crosses fingers for a blessing
<thumper> menn0: I'll update both jes-cli and db-log with master now that the ci bugs *should* be fixed
<thumper> hmm...
 * thumper needs caffeine
<menn0> thumper: awesome
<axw> thumper: all good, updated the bug
<thumper> axw: awesome, thanks
 * thumper goes to turn on the coffee machine
<menn0> thumper: thanks for the reviews
<thumper> np
 * thumper is looking at what appears to be an intermittent failure
<thumper> WT actual F?
 * thumper sighs
<thumper> labix.org/v2/mgo/socket.go:285 is the only place this error text exists
<thumper> which is in the Close() method
<thumper> so... how did this test fail with that? http://juju-ci.vapour.ws:8080/job/github-merge-juju/3768/console
<thumper> passes here 40 times in a row
<thumper> what would close it?
<thumper> I don't see anything that would close it...
<thumper> bah humbug
<natefinch> thumper: I've seen that before, though not recently... I forget now what was causing it.
<thumper> natefinch: hmm...
<natefinch> thumper: google says gustavo said this about it at one point: "if a machine is explicitly removed
<natefinch> from a replica set, mgo will proactively close any open sockets to
<natefinch> prevent errors from happening due to logic that stays running with a
<natefinch> server that isn't updating."
<natefinch> from the same thread, the guy who was hitting this problem: "when this was happening, we were hitting our file descriptor limit"
<thumper> this use case doesn't touch any of that AFAICT
<thumper> huh
<thumper> interesting
<natefinch> It sounds like there's no one specific reason that it can happen
<axw> wallyworld: there was just one bug that affects 1.24, I've put up a PR. the other issues are only to do with destroying storage, so just on master
<wallyworld> ty, looking
<wallyworld> axw: lgtm
<axw> wallyworld: cheers
<mup> Bug #1467973 changed: uploadSuite.TearDownTest Fails <ci> <intermittent-failure> <unit-tests> <juju-core:Invalid> <juju-core 1.24:Invalid> <https://launchpad.net/bugs/1467973>
<natefinch> good old sudo wget | sh
<natefinch> er | sudo sh  whatever you know what I mean
<thumper> meh
 * thumper jumps through the hoops to get a win2012 server kvm env to run the windows tests
<thumper> damn greedy install, wants at least 12.5 gig disk
<mup> Bug #1466011 changed: apiserver tests fail on windows <ci> <regression> <test-failure> <windows> <juju-core:Fix Released by dave-cheney> <https://launchpad.net/bugs/1466011>
<mup> Bug #1468166 opened: serviceManagerSuite.TestCreate <blocker> <ci> <regression> <test-failure> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1468166>
<davecheney> thumper: do you have a second, well more like 10 mins, for a discussion on a race fix ?
<thumper> yeah...
<thumper> davecheney: hangout choice?
<davecheney> 1:1 ?
<davecheney> thumper: http://paste.ubuntu.com/11765877/
<thumper> davecheney: http://wiki.cloudbase.it/juju-testing
<davecheney> thumper: r u doing the standing desk ?
<mup> Bug #1438951 changed: destroy-enviroment --force destroy all aws instances <destroy-environment> <ec2-provider> <juju-core:Expired> <https://launchpad.net/bugs/1438951>
<thumper> davecheney: yeah
<davecheney> i moved my 'desk' down to the lounge room table now I'm on my own for two weeks
<davecheney> does that count ?
<thumper> well... are you standing up?
<davecheney> i have to stand up to go downstairs
<davecheney> this is undeniable
<mup> Bug #1438951 opened: destroy-enviroment --force destroy all aws instances <destroy-environment> <ec2-provider> <juju-core:Expired> <https://launchpad.net/bugs/1438951>
<davecheney> thumper: this race is really worrying
<davecheney> i think i need to bug it
<thumper> oh... kay...
<davecheney> http://paste.ubuntu.com/11766040/
<davecheney> this code alters the config value
<davecheney> which is really worrying
<davecheney> because the watcher fires to indicate st.EnvironConfig() has been updated
<davecheney> but that is then passed to environs.New()
<davecheney> which _alters_ the config object
<davecheney> even if there was a copy returned from st.EnvironConfig(), which there is not
<davecheney> this is still bonkers
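The hazard davecheney is describing can be sketched generically: a constructor that mutates the config value it is handed silently alters every other holder of that value, including the watcher that delivered it. Names below are illustrative (plain maps, not juju's config types); the fix is defensive copying before any mutation:

```go
// Sketch of the mutate-your-caller's-config bug pattern, and the
// copy-before-mutate fix. Illustrative types only.
package main

import "fmt"

type Config map[string]string

// badNew mutates its argument in place -- the bug pattern: the
// watcher's copy of the config is altered behind its back.
func badNew(cfg Config) Config {
	cfg["secret"] = "" // scrubbing a field alters the caller's map too
	return cfg
}

// goodNew copies first, so the caller's config is left alone.
func goodNew(cfg Config) Config {
	out := make(Config, len(cfg))
	for k, v := range cfg {
		out[k] = v
	}
	out["secret"] = ""
	return out
}

func main() {
	cfg := Config{"name": "local", "secret": "hunter2"}
	goodNew(cfg)
	fmt.Println(cfg["secret"]) // unchanged: caller unaffected
}
```

Shared mutable state like this is also exactly what makes the races hard to fix: the mutation is legal Go, so only the race detector (or a deadlock) reveals it.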
<mup> Bug #1438951 changed: destroy-enviroment --force destroy all aws instances <destroy-environment> <ec2-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1438951>
 * davecheney assumes the foetal position
<mup> Bug #1438951 opened: destroy-enviroment --force destroy all aws instances <destroy-environment> <ec2-provider> <juju-core:Triaged> <https://launchpad.net/bugs/1438951>
<davecheney> AAAAAAAAAAAAAAAARGH
<davecheney> http://paste.ubuntu.com/11766069/
<davecheney> watching for the environ config _and_ trying to retrieve the _current_ config at the same time will DEADLOCK
<thumper> wallyworld, axw, davecheney: anyone... http://reviews.vapour.ws/r/2015/
 * davecheney looking
<davecheney> LGTM
<thumper> cheers
<davecheney> that was pretty simple
<thumper> this only compiles and runs on windows
<thumper> davecheney: yeah
<thumper> most of the effort was in setting up the machine to test in
<davecheney> the log would say something like, check failed func(xx32842342) not comparable
<thumper>     c.Assert(s.getPasswd.Calls, gc.HasLen, 1)
<thumper> ... obtained func() []testing.StubCall = (func() []testing.StubCall)(0x43f760)
<thumper> ... n int = 1
<thumper> ... obtained value type has no length
<thumper> anyway, master doesn't have this problem
<thumper> just 1.24
<davecheney> yeah, what you said
<thumper> sinzui: fix submitted
 * thumper goes to start dinner
<mup> Bug #1468166 changed: serviceManagerSuite.TestCreate <blocker> <ci> <regression> <test-failure> <juju-core:Invalid> <juju-core 1.24:In Progress by thumper> <https://launchpad.net/bugs/1468166>
<mup> Bug #1468187 opened: Juju on Fedora <juju-core:New> <https://launchpad.net/bugs/1468187>
<mup> Bug #1468188 opened: environs/config: Validate mutates the config passed to it <juju-core:New> <https://launchpad.net/bugs/1468188>
<mup> Bug #1468223 opened: failed units fail to fail and fail to die <juju-core:New> <https://launchpad.net/bugs/1468223>
<dimitern> voidspace, TheMue, jam, fwereade, standup?
<jam> dimitern: brt, got logged out of google
<mup> Bug #1468349 opened: discoverySuite.TestDiscoverServiceLocalHost: invalid series for wily <test-failure> <unit-tests> <wily> <juju-core:Triaged> <https://launchpad.net/bugs/1468349>
<mup> Bug #1468349 changed: discoverySuite.TestDiscoverServiceLocalHost: invalid series for wily <test-failure> <unit-tests> <wily> <juju-core:Triaged> <https://launchpad.net/bugs/1468349>
<mup> Bug #1468349 opened: discoverySuite.TestDiscoverServiceLocalHost: invalid series for wily <test-failure> <unit-tests> <wily> <juju-core:Triaged> <https://launchpad.net/bugs/1468349>
<mup> Bug # opened: 1468354, 1468355, 1468357, 1468359
<mup> Bug #1468365 opened: internal compiler error: fault <ci> <intermittent-failure> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1468365>
<mup> Bug #1468365 changed: internal compiler error: fault <ci> <intermittent-failure> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1468365>
<mup> Bug #1468365 opened: internal compiler error: fault <ci> <intermittent-failure> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1468365>
<mup> Bug #1468369 opened: TestBootstrapNoToolsDevelopmentConfig fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1468369>
<mup> Bug #1468369 changed: TestBootstrapNoToolsDevelopmentConfig fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1468369>
<mup> Bug #1468369 opened: TestBootstrapNoToolsDevelopmentConfig fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1468369>
<natefinch> ericsnow: this is what we get back from docker inspect... thoughts on what we should include in our status string?  http://pastebin.ubuntu.com/11768085/
<ericsnow> Id
<ericsnow> natefinch: wish it was shorter
<natefinch> ericsnow: yes but, status
<ericsnow> natefinch: oh, right
<ericsnow> natefinch: something derived from State
<natefinch> ericsnow: and this is where I want to be able to return structured data as state ;)
<natefinch> er status
<ericsnow> natefinch: I'm not sure that exposing that whole thing to users when they call juju status is the right thing to do
<natefinch> ericsnow: not the whole thing, but like one of the statuses (running, etc) and maybe the PID and error if not empty
<ericsnow> natefinch: wwitzel3 et al. would probably have better insight into that though
<ericsnow> natefinch: you may have a point (PID would be nice in status)
<ericsnow> natefinch: let's get some feedback from wwitzel3, whit, etc.
<mup> Bug #1464356 changed: TestCloudInit fails <ci> <gccgo> <regression> <test-failure> <windows> <juju-core:Fix Released by thumper> <juju-core 1.24:Fix Released by thumper> <https://launchpad.net/bugs/1464356>
<natefinch> ericsnow: I think the "Name" can be used instead of the UUID...
<natefinch> ericsnow: in this case it's /dreamy_ptolemy
<ericsnow> natefinch: as long as it's unique we don't care :)
<natefinch> ericsnow: just might be more user friendly
<ericsnow> natefinch: my preference is certainly toward user-friendly
<natefinch> ericsnow: and I'm pretty sure it is unique per host machine
<ericsnow> natefinch: sounds good
<mup> Bug #1464356 opened: TestCloudInit fails <ci> <gccgo> <regression> <test-failure> <windows> <juju-core:Fix Released by thumper> <juju-core 1.24:Fix Released by thumper> <https://launchpad.net/bugs/1464356>
<mup> Bug #1464356 changed: TestCloudInit fails <ci> <gccgo> <regression> <test-failure> <windows> <juju-core:Fix Released by thumper> <juju-core 1.24:Fix Released by thumper> <https://launchpad.net/bugs/1464356>
<katco> ericsnow: http://reviews.vapour.ws/r/2020/
<ericsnow> katco: will review
<natefinch> aww man, there's some benefits to docker being a Go project, on the help for "docker inspect": --format=""    Format the output using the given go template.
<natefinch> actually, not sure how useful that is in practice, but it *seems*  cool
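docker's `--format` flag really is a Go text/template applied to the inspect result; a minimal sketch of the same mechanism against an inspect-like struct (field names here are simplified, not docker's full schema):

```go
// Sketch: rendering a Go template over a struct, the mechanism behind
// docker inspect --format. Struct fields are illustrative.
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

type State struct {
	Running bool
	Pid     int
}

type Container struct {
	Name  string
	State State
}

// render applies a --format-style template to the container data.
func render(c Container) string {
	t := template.Must(template.New("status").Parse(
		"{{.Name}} running={{.State.Running}} pid={{.State.Pid}}"))
	var b bytes.Buffer
	_ = t.Execute(&b, c)
	return b.String()
}

func main() {
	c := Container{Name: "/dreamy_ptolemy", State: State{Running: true, Pid: 12345}}
	fmt.Println(render(c))
}
```

In practice this is what makes `docker inspect --format '{{.State.Pid}}' <name>` handy for scripting: it avoids parsing the full JSON blob.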
<natefinch> ericsnow: charm.Process has a name, should I be setting the docker name to that?  Is it guaranteed to be unique?  I notice that validate requires it to be non-empty... docker will generate its own name for the container if you don't give it one
<ericsnow> natefinch: What happens if you try to launch more than one docker with the same name?
<natefinch> ericsnow: it'll probably fail.  The name is supposed to be a unique identifier.  I can try and see what happens
<ericsnow> natefinch: could the plugin append some unique suffix to the proc name?
<ericsnow> natefinch: actually, for now we could just use the proc name
<natefinch> ericsnow: sure.... or maybe we could just allow name to be empty and let the plugin give it a name
<ericsnow> natefinch: let's not go there for now :)
<natefinch> ericsnow: so for right now we just use whatever we're given and if there's a collision, we return an error and say "don't do that"?
<ericsnow> natefinch: exactly
<natefinch> ericsnow: k
<natefinch> ericsnow: what's "Type"?
<ericsnow> natefinch: the plugin name
<ericsnow> natefinch: e.g. "docker"
<natefinch> ericsnow: oh, ok, so the plugin itself can probably ignore that ;)
<ericsnow> natefinch: (a plugin could support more than one type)
<ericsnow> natefinch: sure
<ericsnow> natefinch: I suppose the plugin could fail if there's an unexpected type, but that really shouldn't be a factor
<ericsnow> natefinch: (in Juju anyway)
<natefinch> ericsnow: TypeOptions is just arguments to be given to the executable?
<ericsnow> natefinch: pretty much
<natefinch> ericsnow: I'm not really sure how to translate a map to a list of strings... maybe it would be better if that were just a slice of strings?
<natefinch> ericsnow: for example, if you wanted to pass just "-d" or something
<ericsnow> natefinch: "--<opt>=<val>"?
<ericsnow> natefinch: I expect you'll have to loop over the map and use a switch inside to compose the arg list
<natefinch> ericsnow: is there some reason we can't just make it a list of arguments?  That seems like the most direct way to get arguments into the command line
<natefinch> ericsnow: I don't really want to have to parse a map and try to guess what people mean.
<ericsnow> natefinch: I'm not sure here
<ericsnow> natefinch: I see what you mean, but I'm trying to remember why we went with a map
<katco> ericsnow: natefinch: haven't followed this conversation too closely, but would the flag package be helpful here?
<ericsnow> katco: unfortunately no
<natefinch> katco: not really... this is like the opposite of that
<natefinch> katco: I need the unflag package :)
<katco> natefinch: ah, taking a list of args and passing them to the command line as flags?
<ericsnow> katco: yep
<katco> ericsnow: natefinch: i may have something, gimme a few mins
<natefinch> katco: it would work fine as []string, but ericsnow is trying to remember why they made it a map
<natefinch> katco: I actually think a []string is much more intuitive and less error prone
<natefinch> I mean, that's what the CLI expects, what exec.Command expects, etc
<katco> ericsnow: natefinch: http://play.golang.org/p/CDP_7MagHH
<ericsnow> katco: using the flag package is way overkill here
<ericsnow> katco: the data is originating in the charm's metadata.yaml
<ericsnow> katco: (or similarly formatted at the CLI)
<katco> ericsnow: hm. can you elaborate on why? it seems to do what you need in a few lines of code and is robust
<natefinch> ericsnow: maybe it's a map so you can override things?
<ericsnow> natefinch: that works just as well with a list of strings
<ericsnow> natefinch: I think we used a map for reasons that no longer apply
<natefinch> katco: it's specified as a map in the yaml, and needs to be a []string for exec.Command
<natefinch> katco: except that it doesn't really need to be a map in the yaml
<ericsnow> katco: YAML -> flags -> JSON (YAML) -> string vs. YAML -> []string -> YAML -> string
<katco> natefinch: ah, i misunderstood. https://github.com/juju/juju/blob/feature-proc-mgmt/process/env.go#L36-L38 ?
<natefinch> katco: https://github.com/juju/charm/blob/v5/process.go#L23
<ericsnow> natefinch: however, with a map you avoid the ambiguity of duplicates and you lose any expectation of ordering
<natefinch> ericsnow: uh... ordering can matter and duplicates can be valid
<natefinch> ericsnow: that's just reasons not to use a map :)
<ericsnow> natefinch: there are several dedicated fields in charm.Process which are translated into args by the plugin
<natefinch> ericsnow: yep
<ericsnow> natefinch: I would not expect ordering to matter, but if it does then yeah, a map won't work
<ericsnow> natefinch: and yeah duplicates should be respected
<ericsnow> natefinch: so I'm pretty much convinced :)
<ericsnow> natefinch: how do we resolve the dedicated fields with order?  put them first?
<natefinch> ericsnow: *shrug* charms will need to know how to use their respective plugin
<natefinch> ericsnow: hopefully order won't matter
<ericsnow> natefinch: the plugin has to decide where to stick those values
<natefinch> ericsnow: yes... generally the only time order matters is at the very end, like grepping multiple files, the files all go at the end
<natefinch> ericsnow: unless we just let them specify the entire arg list, there's not much we can do about ordering
<ericsnow> natefinch: at least in the case of those dedicated fields
<natefinch> ericsnow: right
<natefinch> ericsnow: FWIW, the docker plugin won't have a problem with ordering... only the command is order dependent, so it's easy enough to put that in the right spot
<ericsnow> natefinch: we could, as a rule, always put them at the front of the option list (unless they are args)
<ericsnow> natefinch: with args they already have a specific order
<natefinch> ericsnow: right, so once we switch TypeOptions to a []string, I think we're done
<ericsnow> natefinch: k
<natefinch> (and can we call it Args, since that's what it is?)
<ericsnow> natefinch: please leave it TypeOptions
<natefinch> ericsnow: as a charm author, TypeOptions does not sound like "arguments to pass to the thing you're running"
<ericsnow> natefinch: it's the options for the process type
<rogpeppe> anyone else seen this error? (local provider, juju-core tip or close to it):
<rogpeppe> %  juju deploy -e local wordpress
<rogpeppe> ERROR cannot retrieve charm "cs:trusty/wordpress-2": cannot get archive: Get https://api.jujucharms.com/charmstore/v4/trusty/wordpress-2/archive: dial tcp: lookup api.jujucharms.com on [::1]:53: read udp [::1]:35163->[::1]:53: read: connection refused
<rogpeppe> i can get the URL just fine from my regular command line
<katco> ericsnow: natefinch: http://reviews.vapour.ws/r/2022/
<ericsnow> katco: dang it
<natefinch> ericsnow: I often find that the doc comment is revealing when looking at naming things.  The doc comment says "TypeOptions is a map of arguments for the process type"
<ericsnow> natefinch: that's what I just said :)
<natefinch> katco: use jc.SameContents
<ericsnow> katco, natefinch: yes, please :)
<natefinch> katco: it does slice comparison that ignores order
<katco> natefinch: cool, changing it
<natefinch> ericsnow: if there's a better word for describing the thing in the comments, that's probably the word you should use for the name.  // Args is a map of arguments for the process type
<natefinch> (except, now it's a list, of course)
<natefinch> ericsnow: I think the fact that I had to ask you what it was is kind of telling.  If it was called "arguments" or "args" I don't think I would have had to ask.
<ericsnow> natefinch: TypeArgs vs. TypeOptions?  Not much of a difference there
<katco> ericsnow: natefinch: http://reviews.vapour.ws/r/2022/
<ericsnow> natefinch: just Args (or Options) is too vague
<natefinch> ericsnow: why type?  the type is Docker or SystemD or Rocket.  The args are for whatever CLI command you're running....  but I'm not sure how to name that
<natefinch> ClientArgs?
<ericsnow> natefinch: they are options for use by the type
<ericsnow> natefinch: the plugin is responsible for interpreting their meaning and translating that into the CLI args
<natefinch> ericsnow: ok... let's roll back.  It sounds like these might not always be a direct mapping to CLI args. That might be my misunderstanding
<natefinch> ericsnow: so, like, this could just be "other stuff you need to specify but that we can't foresee for every plugin, so here's a map, use that"
<ericsnow> natefinch: yep
<natefinch> ericsnow: ok, sorry, that was my misunderstanding
<natefinch> ericsnow: so this requires some shared knowledge between plugin and charm on what the values in the map mean... and that's fine.
<ericsnow> natefinch: the options should still map pretty closely to the args the plugin will use
<natefinch> ericsnow: ideally, yes, so that it's easier to understand how to set the options.
<ericsnow> natefinch: yep
<natefinch> ericsnow: I think I'd just call it Options, then.  It's options for the process, the fact that it applies to the process type is implicit.  And I don't think I'd use "arguments" in the comment, because that makes it sound like CLI arguments, and they're almost certainly not a 1:1 mapping (if they were, you could just use a []string and specify CLI arguments directly)
<ericsnow> natefinch: we call it TypeOptions because it does not contain any other options (e.g. things used by Juju)
<ericsnow> natefinch: "explicit is better than implicit"
<katco> err... on build bot: "fatal: unable to access 'https://gopkg.in/mgo.v2/': Could not resolve host: gopkg.in"
<natefinch> ericsnow: but they're options for this specific process. Another process of the same type might have different options.  So they're options for this process.  Yes, they're type-dependent, but so is the command and image etc
<ericsnow> natefinch: yeah, but other parts of charm.Process are not type-dependent
<natefinch> katco: http://downforeveryoneorjustme.com/gopkg.in
<ericsnow> natefinch: and they are specific to that process because that charm.Process{} is specific to that process :)
<katco> natefinch: gustavo has an official uptime link: http://stats.pingdom.com/r29i3cfl66c0
<natefinch> ericsnow: the *process* itself is type dependent
<ericsnow> natefinch: exactly
<katco> natefinch: looks like it was a blip on our end
<natefinch> ericsnow: so my point is, type is redundant. The whole struct is type dependent.
<ericsnow> natefinch: not the whole struct
<natefinch> ericsnow: it is. You're passing it to a specific plugin for that type.  The way "port" is interpreted is type-dependent
<natefinch> ericsnow: can we at least change "arguments" to "options" in the comment?
<ericsnow> natefinch: sure
<natefinch> ericsnow: in my mind arguments == CLI and that is not necessarily the case
<ericsnow> natefinch: to me, in this context, they mean exactly the same thing
<ericsnow> natefinch: but I see what you mean
<natefinch> ericsnow: but that's because you wrote it and you know what you mean.  To someone else, they have to figure out what you mean
<ericsnow> natefinch: :)
<katco> natefinch: hmm... arguments get passed to functions as well
<natefinch> katco: but in this case, this is all stuff that corresponds to how the plugin will launch a process, which is almost certainly via a CLI (I suppose it could be an API of some sort... but I'm guessing it'll usually be a CLI)
<natefinch> ericsnow: It seems like it could be "PluginOptions"  .  "type" to me, is an ID.  You can't have options for an ID.  The plugin and the process both have a type, and that's "docker".  But the options are for the Plugin.  Docker itself won't know how to translate that map.
<ericsnow> natefinch: the fact that we are using plugins is an implementation detail
<ericsnow> natefinch: you could just as well call it TypeHandlerOptions
<katco> ericsnow: natefinch let's not bike shed this too much longer plx
<natefinch> katco: I wouldn't if it weren't for the fact that it's not changeable later
<natefinch> katco: this defines the API between Juju and the plugin
<natefinch> and the charm
<katco> natefinch: until this feature is live, can't it be changed?
<natefinch> katco: yes, true
<natefinch> ericsnow: how about this... leave the field name and just change the argument to "TypeOptions is a map of options for the process type handler."?
<natefinch> s/argument/comment
<ericsnow> natefinch: sounds good
<natefinch> ericsnow: awesome.   Periwinkle blue it is.
<ericsnow> natefinch: lol
<ericsnow> natefinch: hey, I thought you said Bullwinkle!
<natefinch> Too late!
<katco> ericsnow: i'm about ready to pair up once you review that branch
<ericsnow> katco: k
<katco> natefinch: feel free as well: http://reviews.vapour.ws/r/2020/
<natefinch> katco, ericsnow: what's up with the panics, anyway?
<ericsnow> natefinch: who's panicking?
<natefinch> ericsnow: there's some // TODO(ericsnow)  return an error instead
<katco> natefinch: just following the pattern; i just don't think we've defined the error path instead
<natefinch> ericsnow: in registerHookContextCommands
<katco> natefinch: it's not intended to stay that way obviously
<natefinch> katco: obv, just wondering what's up
<katco> natefinch: incremental progress, that's what's up ;)
<natefinch> oh... is this that stuff that gets called from Init() so panics aren't the end of the world?
<katco> natefinch: correct
<natefinch> ok.. I get it
<katco> natefinch: this is the very tip top of the chain.
<katco> natefinch: but my larger point is: this is the way it is because it doesn't matter right now. we can define a better way later
<katco> brb tea
<katco> ericsnow: moonstone?
<natefinch> ericsnow, katco: FYI docker's launch and status pretty much working with all options etc.  Destroy will be trivial too.  Still need tests... gotta run for a little bit, will be on later this afternoon and then more tonight.
<ericsnow> natefinch-afk: sweet!
<sinzui> katco Do you have a minute to review http://reviews.vapour.ws/r/2023/
<katco> sinzui: tal
<katco> sinzui: sorry, not sure if you saw: ship it!
<sinzui> katco: I did :)
<katco> sinzui: k :p
<natefinch> ericsnow: should destroy remove the volumes associated with the container?
<katco> natefinch: i don't think so
<natefinch> katco: k
<katco> natefinch: http://reviews.vapour.ws/r/2018/diff/# read the description
<natefinch> katco: kk
<natefinch> ericsnow: what's command for?
<natefinch> ericsnow: (on charm.Process) .... is that like the docker command to run (i.e. docker run <container> <command>)?
<ericsnow> natefinch: the command to pass to docker (etc.)
<ericsnow> natefinch: yep
<natefinch> ericsnow: hmm... that probably needs to be a []string then...   for example, if you pass "sleep 30" as the command, then docker tries to run "sleep 30" as the command name
<natefinch> (and says there's no command called "sleep 30")
<ericsnow> natefinch: then perhaps the plugin should address splitting the string for the user
<natefinch> ericsnow: that's incredibly tricky. What if command is `cowsay "howdy there!"`? What quoting scheme do we use: bash, or something else?
<natefinch> ericsnow: there's a reason exec.Command requires the user to split up the args :)
 * thumper shakes his fist at shitty tests
<natefinch> ericsnow: seems trivial to have the user write Command: [ "cowsay", "howdy there!"]   rather than Command: "cowsay 'howdy there!'"
<natefinch> ericsnow: and a lot less error prone
<ericsnow> natefinch: what do docker commands normally look like?
<natefinch> ericsnow: docker run container cowsay "howdy there!"
<natefinch> ericsnow: if Command is just a []string, then I can just do args = append(args, process.Command...) at the end... otherwise I have to get fancy and tricky and assume I know how the user expects their command string to get split up
<natefinch> katco, ericsnow: I gotta make dinner, but I'll be back in a little less than 4 hours.... I can make progress even if we don't figure out the command stuff... plenty of tests to write of the rest of the code.
<ericsnow> natefinch-afk: yeah, let's discuss the command stuff tomorrow
<davecheney> PASS: pinger_test.go:131: mongoPingerSuite.TestAgentConnectionsShutDownWhenStateDies    30.481s
<davecheney> thumper
<davecheney> exactly 30 seconds to run this test
 * davecheney sharpens knife
<davecheney> thumper: sorry, can you resend
<davecheney> basically nothing is working today
<ericsnow> axw: in case you didn't notice, I fixed the PR links in RB (for new PRs) :)
<davecheney> ericsnow: thanks
<davecheney> that was super annoying
<ericsnow> davecheney: agreed
<ericsnow> davecheney: sorry it took so long to get a round tuit
<davecheney> s'ok
<davecheney> fixed now
<katco> thumper: for serious, i would love to see coupling of commits/branches <--> bugs
<axw> ericsnow: I did, thanks very much
<thumper> davecheney: http://reviews.vapour.ws/r/2024/diff/#
<thumper> OMG running tests in the win2012 virtual server is SSLLLOOOOWWWW.....
<perrito666> thumper: really? It didn't seem that much slower
#juju-dev 2015-06-25
<davecheney> thumper: ship it
<davecheney> this is another example of the code being adapted to fit the shoe
<davecheney> s/shoe/tests
<davecheney> in that example the test code _relied_ on the value written into the apiInfo being localhost:nnnn
<davecheney> although in practice
<davecheney> 1. apiinfos retrieved from jenv and config files never pointed to localhost
<davecheney> 2. and even if they did point to localhost, they are always ip addresses, not the word "localhost"
<davecheney> thumper:         // Check that the returned environ is still the same.
<davecheney>         env = obs.Environ()
<davecheney>         c.Assert(env.Config().AllAttrs(), jc.DeepEquals, originalConfig.AllAttrs())
<davecheney> this test will always pass because originalConfig is the same reference returned by obs.Environ().Config()
<axw> wallyworld: I haven't been following the custom image metadata bug, but I just had a look and there is code to store custom image metadata in state already
<axw> wallyworld: it may not be working, but it is there...
<wallyworld> axw: where is that code?
<axw> wallyworld: it's a bit different to the tools metadata. the simplestreams files just get stored in the env storage verbatim
<axw> wallyworld: one sec
<axw> wallyworld: cmd/jujud/bootstrap.go, storeCustomImageMetadata
<wallyworld> axw: yes, there is code to put it in env storage
<wallyworld> let me look
<axw> wallyworld: I suppose I consider env storage part of state now.
<axw> wallyworld: anyway, the image lookup code is meant to look in env storage as one of the sources
<axw> wallyworld: during bootstrap, we marshal whatever's in the --metadata-source directory through the ssh-init script, store it on disk on the bootstrap machine and point "jujud bootstrap" at it
<axw> wallyworld: I'm *fairly* sure I tested it when I did it, so possibly something has broken since it was first done
<wallyworld> axw: yeah, for some reason that storage doesn't seem to be on the search path. i wasn't aware that we were putting the metadata on the root disk
<wallyworld> i *think* adding it to a collection in state is the right thing to do
<wallyworld> abstracts from simplestreams like for tools
<axw> wallyworld: I think there was a reason at the time, I don't recall though :/
<axw> maybe it was just to get it done quickly
<wallyworld> that is likely correct
<axw> wallyworld: if it's going to be changed again, whatever's in env storage should be migrated out
<wallyworld> axw: yeah, although if it doesn't work anyway....
<wallyworld> but we should import
<axw> wallyworld: whether or not it's being searched, the data may be there
<wallyworld> yep
<axw> anastasiamac: ^^  you might want to read up
<wallyworld> we can continue on the current path and do the import after
<axw> anastasiamac: there's some code in bootstrap already that you should be able to make use of to marshal custom image metadata
<anastasiamac> axw: :)
<anastasiamac> axw: tyvm
<axw> wallyworld: sounds fine. I just wanted to clarify that it was actually done, it's just broken. since all that's changing really is moving from unstructured to structured format, we'll still need to determine what's wrong with the existing code
<wallyworld> axw: thank you, i didn't realise we had done the work
<axw> wallyworld: nps, didn't tweak until I went and looked at the code :)
<axw> twig even
<wallyworld> i didn't know there was code to look at :-)
<davecheney> thumper: menn0 https://github.com/juju/juju/pull/2652
<davecheney> please review critically
<mup> Bug #1468579 opened: juju bootstrap failed - cannot dial mongo to initiate replicaset: no reachable servers <oil> <juju-core:New> <https://launchpad.net/bugs/1468579>
<thumper> davecheney: done
 * thumper is building a utopic lxc container to test bug 1468365
<mup> Bug #1468365: internal compiler error: fault <ci> <intermittent-failure> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1468365>
<davecheney> thumper: it would be faster to build go 1.3.3 from source instead
<thumper> :)
<thumper> My machine now has go 1.4.2 due to lxc/lxd ppa
<davecheney> also 1.3.3 is not supported
<davecheney> so any fix will land in 1.5
<davecheney> assuming it isn't aleady fixed in 1.4.2
<mup> Bug #1468581 opened: juju bootstrap fails - Waiting for API to become available ERROR cannot get all blocks: EOF <oil> <juju-core:New> <https://launchpad.net/bugs/1468581>
 * menn0 has to do the kindy run
<davecheney> thumper: I also remove a shitload of locking in the mock that was not needed
<davecheney> initially it was there because the testing stub thing was not concurrent safe
<davecheney> but I fixed that a while back
<davecheney> so those locks were just causing deadlocks in the test
<davecheney> ie, adding a lock to make s.config safe
<davecheney> would cause a deadlock somewhere else
<davecheney> the answer was to take almost all the locking out
<davecheney> as it was not protecting anything in the structure
<natefinch> thumper: I'm with Dave, you should be building go from source.  It's trivial and fast and makes it easy to switch versions (not to mention cross compile etc)
<natefinch> ericsnow: I don't suppose you're around?
<thumper> davecheney: if I wanted to compile go 1.3.3, how would I go about it?
<thumper> damn... should have said -v on that call
 * thumper waits
<natefinch> thumper: git clone https://github.com/golang/go && cd go/src && make.bash
<thumper> natefinch: surely that'll make tip
<natefinch> thumper: sorry, add a git checkout 1.3.3 in there
<natefinch> sorry, git checkout go1.3.3
<mup> Bug #1468581 changed: juju bootstrap fails - Waiting for API to become available ERROR cannot get all blocks: EOF <oil> <juju-core:New> <https://launchpad.net/bugs/1468581>
<mup> Bug #1421260 opened: juju 1.21.1 bootstrap timeout <bootstrap> <oil> <oil-bug-1372407> <juju-core:New> <https://launchpad.net/bugs/1421260>
<mup> Bug #1468581 opened: juju bootstrap fails - Waiting for API to become available ERROR cannot get all blocks: EOF <oil> <juju-core:New> <https://launchpad.net/bugs/1468581>
<mup> Bug #1468584 opened: juju bootstrap failed - cannot initiate replica set: cannot get replica set status: can't get local.system.replset config from self or any seed (EMPTYCONFIG) <oil> <juju-core:New> <https://launchpad.net/bugs/1468584>
<axw> wallyworld: can you please review my response to your comments before I land?
<wallyworld> sure
<wallyworld> axw: thanks for replies. i don't think i'll ever get used to Go interfaces tending to end with "er". /me cringes
<axw> wallyworld: I anticipated that :)  I did it to keep with the existing Lifer
<wallyworld> fair enough :-)
<axw> wallyworld: good to go?
<wallyworld> hmm, maybe i better read the diff, but i trust you :-)
<wallyworld> yeah lgtm
<axw> wallyworld: that's good, because I already hit $$merge$$ ;p
<wallyworld> figured you would
<mup> Bug #1468586 opened: juju 1.24.0 bootstrap failure - Waiting for API to become available - Error connection is shutdown <oil> <juju-core:New> <https://launchpad.net/bugs/1468586>
<menn0> thumper: shall I try and merge the db-log feature branch soon?
<menn0> thumper: ping?
<axw> wallyworld: two more PRs up for review, these should be the last. there's a TODO in the EnsureDead PR about adding a test in worker/machiner which I'm going to do before it's landed
<wallyworld> axw: ok, just in meeting, will look soon
<axw> wallyworld: nps, thanks
<mup> Bug #1468637 opened: action-set needs to accept input from stdin <juju-core:New> <https://launchpad.net/bugs/1468637>
<wallyworld> axw: anastasiamac: it's fairly large but based on existing code for tools etc. would appreciate a review at some point. i'll look to land tomorrow before i go away http://reviews.vapour.ws/r/2031/
<axw> wallyworld: will do
<wallyworld> ty, no rush
<anastasiamac> wallyworld: sounds gr8 :D
<mup> Bug #1468639 opened: leader-set needs to accept input from stdin <juju-core:New> <https://launchpad.net/bugs/1468639>
<mup> Bug #1468639 changed: leader-set needs to accept input from stdin <juju-core:New> <https://launchpad.net/bugs/1468639>
<mup> Bug #1468639 opened: leader-set needs to accept input from stdin <juju-core:New> <https://launchpad.net/bugs/1468639>
<TheMue> axw:  dimitern: thx for reviews, will complete it when in office.  dimitern, could run a bit late today.
<axw> TheMue: np
<dimitern> TheMue, ok, np
<axw> wallyworld: reviewed
<wallyworld> ty, that was quick
<mup> Bug #1468653 opened: jujud hanging after upgrading from 1.24.0 to 1.24.1 <canonical-bootstack> <juju-core:Triaged> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1468653>
<dooferlad> dimitern: hangout?
<jam> fwereade: ^^
<marcoceppi> how do I get all history from the api?
<fwereade> wallyworld, ^^
<wallyworld> history of what?
<marcoceppi> like, debug-log, but from the API
<marcoceppi> just all the events that have occurred
<wallyworld> what events?
<wallyworld> hook runs?
<wallyworld> actions executed?
<marcoceppi> things the user has done to the environment
<marcoceppi> or like, a historical allwatcher
<wallyworld> ah i see
<wallyworld> i don't know we have that capability ie audit
<rogpeppe> in case anyone's interested in making new juju plugins written in Go, i've just made a little tool to make that somewhat easier - it generates a skeleton for a new multi-command plugin: go get github.com/rogpeppe/misc/cmd/newjujuplugin
<thumper> rick_h_: our meeting in 9.5 hours clashes with the release meeting
<rick_h_> thumper: well boo
<thumper> rick_h_: can we either start later, or push it to next week?
<rick_h_> thumper: good, I think I have to cancel/move anyway because of swim class
<thumper> lets cancel then
<thumper> you know you can always ping me
<rick_h_> rgr
<rick_h_> sounds good
<thumper> laters folks
<voidspace> dimitern: ping
<voidspace> dimitern: I have a successful creation of a device with MAC and allocation of IP address
<voidspace> dimitern: container name is used as hostname
<voidspace> dimitern: the only "wrinkle" is that maas automatically appends ".maas" onto our hostname
<dimitern> voidspace, pong
<voidspace> dimitern: I can't tell if using that hostname actually routes successfully yet because container creation fails due to my networking issues
<dimitern> voidspace, awesome!
<voidspace> dimitern: however my replacement networking hardware just arrived
<dimitern> voidspace, that's to be expected and it's fine
<voidspace> dimitern: so I should be able to tell you shortly
<voidspace> dimitern: great
<voidspace> dimitern: I've added MAC address to IPAddress as well
<dimitern> voidspace, sweet! keep me posted then :)
<voidspace> dimitern: now a load of tests to fix and some to write...
<voidspace> dimitern: but no further tasks on this card that I'm *aware* of if nothing more comes up in testing
<dimitern> voidspace, well, it's definitely worth trying destroy-environment --force cleans up the devices
<dimitern> s/trying/verifying/
<voidspace> dimitern: yup, good point
<voidspace> can try that now
<voidspace> dimitern: yes it does
<voidspace> dimitern: device just disappeared...
<rogpeppe> dimitern: ping
<rogpeppe> dimitern: i'm lunching now. hopefully will catch you later.
<dimitern> rogpeppe, I'm about to go out, but will ping you when I'm back
 * dimitern steps out for ~1h
<mup> Bug #1468752 opened: "juju ssh" adds an additional strings to all commands when used on Windows, in interactive mode <juju-core:New> <https://launchpad.net/bugs/1468752>
<mup> Bug #1468756 opened: Bootstrapping local environment hangs when apt on host is upset <juju-core:New> <https://launchpad.net/bugs/1468756>
 * fwereade going to bed, hoping he can sleep a bit; probably back later
<TheMue> fwereade: have a good rest
<katco> natefinch: standup
<dimitern> rogpeppe, ping
<rogpeppe> dimitern: hiya
<dimitern> rogpeppe, hey, so what's up?
<rogpeppe> dimitern: i was just looking at configstore
<rogpeppe> dimitern: and wondering which state server addresses field i should use
<dimitern> rogpeppe, it depends I guess - if it's a local env or not
<dimitern> rogpeppe, ah
<dimitern> rogpeppe, well, any of them should work
<rogpeppe> dimitern: ok. i think i decided that server-hostnames was probably the best one to use
<rogpeppe> dimitern: because i'm passing them off to a remote server to do the connection
<dimitern> rogpeppe, just a quick note about it - those are never resolved if they contain hostnames, unlike the other
<rogpeppe> dimitern: is the only difference that one of them has resolved IP addresses in it?
<dimitern> rogpeppe, yeah, the hostnames are kept to make sure we only update the addresses if they changed (and not due to hostname-to-ip resolution)
<rogpeppe> dimitern: i didn't find the comments that clear
<rogpeppe> dimitern: it's not obvious that they both hold the same set of addresses in different forms
<rogpeppe> dimitern: just saying :)
<rogpeppe> dimitern: anyway, thanks for confirming
<dimitern> rogpeppe, fair point :)
<dimitern> rogpeppe, hostnames only serve a minor purpose - to avoid connection slowdown due to resolving names
<dimitern> (and are optional to begin with)
<rogpeppe> dimitern: hold on... which is which?
<dimitern> that is - unnecessary slowdown
<rogpeppe> dimitern: Hostnames is the unresolved set of addresses, no?
<dimitern> rogpeppe, hostnames are optional and are used if there when we're about to update the addresses (which are not optional)
<dimitern> s/if there when/if/
<dimitern> too many predicates :D
<rogpeppe> dimitern: oh, so i think i'm using the wrong one then
<dimitern> rogpeppe, yes, they are as we received them
<rogpeppe> dimitern: i really don't understand now
<dimitern> rogpeppe, in brief, you should use addresses
<dimitern> :)
<rogpeppe> dimitern: ok
<dimitern> as with hostnames YMMV
<rogpeppe> dimitern: so Hostnames contains resolved addresses?
<dimitern> rogpeppe, no, unresolved
<rogpeppe> dimitern: and Addresses is the addresses as received from the state server?
<dimitern> rogpeppe, it contains (most of the time, unless they're about to be updated) the same list as the addresses field, but when set, the addresses are resolved first and any unresolvable ones are dropped
<rogpeppe> dimitern: so why the "may contain unresolved hostnames" comment?
<dimitern> rogpeppe, Hostnames = as we received them, Addresses = only IPs after resolving Hostnames
<dimitern> rogpeppe, where?
<rogpeppe> 	// Hostnames holds a list of API addresses which may contain
<rogpeppe> 	// unresolved hostnames. It's used to compare more recent API
<rogpeppe> 	// addresses before resolving hostnames to determine if the cached
<rogpeppe> 	// addresses have changed and therefore perform (a possibly slow)
<rogpeppe> 	// local DNS resolution before comparing them against Addresses.
<rogpeppe> 	Hostnames []string
<rogpeppe> dimitern: that makes it sound like Hostnames is the one holding unresolved addresses
<dimitern> rogpeppe, ah, yeah - unresolved as in "the provider gave us a mix of ips and hostnames and we kept the latter as-is"
<dimitern> rogpeppe, makes sense?
<rogpeppe> dimitern: no, i'm even more confused now
<dimitern> rogpeppe, here's an example:
<rogpeppe> dimitern: you said i should use Addresses, but that doesn't have the unresolved host names in it that i almost certainly want because they're the actual DNS names
<dimitern> rogpeppe, Hostnames = ["sparkling-wine.maas", "172.10.20.30", "127.0.0.1"], Addresses = ["172.10.20.30", "127.0.0.1"] assuming the hostname resolves to the second item in Hostnames
<rogpeppe> dimitern: and FWIW my local.jenv has "localhost:17070" in both state-servers and server-hostnames fields, which isn't resolved
<dimitern> rogpeppe, but if we can't resolve sparkling-wine.maas (e.g. we're not using the maas dns server on the host using juju cli) it will be dropped
<rogpeppe> dimitern: ok, so in that case, i'm pretty sure i want to use Hostnames
<rogpeppe> dimitern: but you say that it may be empty even when Addresses is not?
<dimitern> rogpeppe, localhost is special - it's not resolved even if it appears in Hostnames
<rogpeppe> dimitern: weird - why not?
<dimitern> rogpeppe, it may be empty for example if you got a .jenv file from somewhere and want to use it - it will have addresses, but might not have hostnames
<rogpeppe> dimitern: even though Hostnames is the primary source?
<dimitern> rogpeppe, because localhost resolves both to 127.0.0.1 and ::1 and depending on prefer-ipv6 env setting one or the other is tried first
<rogpeppe> dimitern: ah, because it might be from an old version of juju which doesn't support that field?
<dimitern> rogpeppe, indeed
<rogpeppe> dimitern: i definitely think this could do with some better docs
<dimitern> rogpeppe, agreed
<dimitern> rogpeppe, for really bad cases of missing or misleading docs, please file a bug
<rogpeppe> dimitern: but thanks very much for the explanation
<rogpeppe> dimitern: i shall change my code to use Hostnames by preference and fall back to using Addresses if len(Hostnames) == 0
<dimitern> rogpeppe, no worries
<dimitern> rogpeppe, yes, unless the place you'll be using those addresses cannot resolve one or more hostnames in it
<rogpeppe> dimitern: well, it'll eventually just pass those addresses to api.Open
<dimitern> rogpeppe, yes
<dimitern> rogpeppe, and it might fail for unresolvable ones, but likely succeed for the others
<rogpeppe> dimitern: and there's no way that my code can tell whether the server it's passing the addresses to will be able to resolve the addrs
<dimitern> rogpeppe, except an educated guess, yes you can't generally know
<rogpeppe> dimitern: my client is crude and uneducated :)
<dimitern> rogpeppe, e.g. if you're trying to connect to a maas environment outside maas's network it will most likely fail; for ec2 OTOH it will work
<rogpeppe> dimitern: yeah, not much i can do about that
<dimitern> rogpeppe, yeah
<Syed_A> Hello Folks, Can anyone point out to me how quantum-gateway charm interacts with OpenStack Keystone. I don't see any relation between two.
<dimitern> Syed_A, keystone acts as a service directory and other openstack services can use it to discover others, outside the relation context
<Syed_A> dimitern: Yes but if a service wants to use keystone it must have AUTH_URL available to it. I am trying to figure from where metadata agent is getting keystone url.
<Syed_A> In my openstack environment, instances are failing to get metadata from nova-api-metadata. And apparently the reason is the metadata agent is trying to contact keystone on localhost instead of AUTH_URL
<mup> Bug # changed: 1348663, 1442308, 1447895, 1454678, 1462146, 1464616, 1466167, 1467690, 1468584, 1468586
<mbruzek1> alexisb: ping.
<alexisb> mbruzek1, pong
<mbruzek1> alexisb: Is anyone in core working on a digitalocean provider?  Kapil wrote a plugin that works with Digital Ocean, but that is a plugin.
<alexisb> mbruzek1, no
<mbruzek1> alexisb: I ran into a Digital Ocean person at Dockercon and he seemed very interested in talking with us.
<TheMue> hmpf, merge conflicts
<mbruzek1> alexisb: Is there some documentation / instructions about how to write a provider?  (Is this something I could contribute?)
<alexisb> mbruzek1, we would be happy to consult them through the process
<alexisb> mbruzek1, there is limited documentation
<alexisb> mbruzek1, but if they are willing to do the work we would be happy to help them along
<alexisb> mbruzek1, I will tell you that writing a provider is not a straight forward, painless process
<alexisb> something the team is eager to address, but work we are not currently scheduled to do
<mbruzek1> alexisb: I am going to email the D.O. person back who would be the contact on core?
<alexisb> myself and katco
<mbruzek1> OK, thank you
<voidspace> alexisb: just checking you're still available
<alexisb> voidspace, yes
<voidspace> alexisb: great
<alexisb> I moved us out 30 minutes
<alexisb> I will be there is one minute :)
<voidspace> alexisb: ok
<TheMue> gnah, one gofmt, try again
<mgz> ericsnow: bug 1468815
<mup> Bug #1468815: Upgrade fails moving syslog config files "invalid argument" <ci> <regression> <upgrade-juju> <juju-core:Invalid> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1468815>
<ericsnow> mgz: k
<alexisb> any juju-core members who are online, team call
<alexisb> cmars, ^^
<perrito666> this conversation made me hungry, I might go for a second apple
<mup> Bug #1468855 opened: Feature Request for enhanced configurability of lxc hostname naming convention <config> <feature> <juju-core:New> <https://launchpad.net/bugs/1468855>
<ericsnow> could someone spare me a moment for a quick review on a critical bug fix?  http://reviews.vapour.ws/r/2035/
<natefinch-afk> ericsnow: looking
<natefinch> ericsnow: ship it
<ericsnow> natefinch: thanks
<katco> ericsnow: is this what we're trying to write? https://github.com/juju/juju/blob/master/apiserver/common/registry.go#L121-L126
<ericsnow> katco: not exactly
<ericsnow> katco: we want to add methods onto an existing facade
<ericsnow> katco: I suppose we could use a separate facade...
<katco> ericsnow: shouldn't all components have their own facade?
<ericsnow> katco: that would likely be easier
<ericsnow> katco: they don't now
<ericsnow> katco: (at least for the uniter)
<katco> ericsnow: so the uniter is a facade, and we want to add methods onto it?
<ericsnow> katco: all of those are crammed into the uniter facade
<ericsnow> katco: right
<ericsnow> katco: though it might be worth seeing if there is any technical reason not to use a separate facade for the component
<katco> ericsnow: ty, i'll continue poking around
<ericsnow> katco: cool
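For reference, the registry idea katco and ericsnow are poking at can be sketched as a table of facade factories keyed by name and version; the `Process` facade name and the whole API below are purely illustrative, not juju's actual registration code:

```go
package main

import "fmt"

// facadeFactory stands in for whatever constructs a facade instance.
type facadeFactory func() interface{}

// registry maps facade name -> version -> factory, sketching the idea
// that a component can register its own facade instead of cramming
// more methods into an existing one (e.g. the Uniter facade).
type registry struct {
	facades map[string]map[int]facadeFactory
}

func newRegistry() *registry {
	return &registry{facades: make(map[string]map[int]facadeFactory)}
}

// Register adds a factory, refusing duplicate name/version pairs.
func (r *registry) Register(name string, version int, f facadeFactory) error {
	if _, ok := r.facades[name][version]; ok {
		return fmt.Errorf("facade %q version %d already registered", name, version)
	}
	if r.facades[name] == nil {
		r.facades[name] = make(map[int]facadeFactory)
	}
	r.facades[name][version] = f
	return nil
}

func main() {
	r := newRegistry()
	// A component-specific facade, separate from the big Uniter facade.
	if err := r.Register("Process", 0, func() interface{} { return struct{}{} }); err != nil {
		panic(err)
	}
	fmt.Println("registered facades:", len(r.facades))
}
```

The duplicate check is the part that matters in practice: two components must not silently claim the same facade name and version.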
<natefinch> ericsnow: can I make those changes to process?  Command as []string and port ranges?
<ericsnow> natefinch: give me a few minutes to wrap something up
<natefinch> ericsnow: port ranges are more of a nice to have, but the command thing I think is a must
<natefinch> ericsnow: np
<natefinch> I have an hour before I have to go make dinner though
<natefinch> whoever decided that . should not match \n should be flogged
<perrito666> natefinch: regexes are enough problem as they are without having to worry about them being multilined
<natefinch> perrito666: why should \n be special?  It's just another character.... by making . match everything except that, you're making a special case for no reason
<perrito666> natefinch: \n is not a newline everywhere :p
<natefinch> perrito666: that's even worse
<perrito666> if you are matching a multiline entity with a regex, you are a daltonic wiring a bomb
 * natefinch learns a new English word from the ESL guy.
<perrito666> natefinch: ?
<natefinch> daltonic
<perrito666> ah, I believe color blind might be more usual?
<natefinch> I never knew there was a word for "someone who is color blind"
<natefinch> perrito666: yep
<mgz> it's a brazilian/south americanism
<natefinch> google says it's from the French word daltonisme, which is where I suspect perrito666 got it
<perrito666>  first scientific paper on the subject of color blindness, Extraordinary facts relating to the vision of colours, was published by the English chemist John Dalton in 1798[5] after the realization of his own color blindness. Because of Dalton's work, the general condition has been called daltonism, although in English this term is now used only for deuteranopia.
<perrito666> it is called daltonismo in spanish too
<perrito666> natefinch: anyway I think you always tend to learn more words of your language by non native speakers as they lookup in formal sources to translate words they use regularly and you might not
<natefinch> ericsnow: 5 minutes
<ericsnow> natefinch: I've replied.
<natefinch> cool
<ericsnow> natefinch: basically I'd rather pursue alternatives
<ericsnow> natefinch: and for the portrange we should use the code we already have for that
<ericsnow> natefinch: basically move the core network package over to the utils repo or something
<thumper> marcoceppi: there is no audit yet, but we are hoping it will drop out of planned work
<thumper> marcoceppi: not any time soon though
<marcoceppi> :(
<thumper> sorry dude
<natefinch> ericsnow: thanks. I disagree about the c ommand thing, but don't have time to talk about it now.   Gotta go make dinner.
<ericsnow> natefinch: k
<fwereade> menn0, offhand, how does presence intersect with multi-env?
<fwereade> interact? whatever ;p
<fwereade> menn0, ah I see
<fwereade> menn0, not global, but parameterised by id
<fwereade> uuid
<fwereade> menn0, many thanks for your inestimable rubber-ducking
<fwereade> menn0, (I guess we have to do manual cleanup of those on env destroy then?)
<menn0> fwereade: in standup
<fwereade> menn0, np
<menn0> fwereade: so yes, not global
<menn0> fwereade: I was thinking that eventually there could be just one low level watcher goroutine that is env agnostic (potentially more efficient when you have 100's of envs in one state server)
<menn0> fwereade: what do you mean regarding manual cleanup?
<thumper> menn0: got a moment? I want to talk about a broken environment (mine)
<menn0> thumper: give me a minute. i've almost finished writing a complainy email.
<thumper> heh, ok
<wallyworld> fwereade: not sure if you're coherent enough to talk about mgo txns?
<menn0> waigani: chat?
<waigani> menn0: yep, standup chan?
<menn0> waigani: see u there
<thumper> wallyworld: I have a physio appt during our normal call time
<thumper> wallyworld: want to do it now?
<thumper> mramm: are you really here?
<mramm> I am really here
<mramm> though out on vacation this week
<mramm> but available if you want anything
<mramm> just chilling out going through the photos I took at my cousin's wedding sorting out the few good ones from the rest
<thumper> oh
<wallyworld> thumper: want to ping me when you're back? i got standup soon etc
<thumper> kk
<fwereade> wallyworld, thumper: when you're both there, any chance you could ping me first for a quick chat about mgo/txn and its particularly subtle-and-quick-to-anger characteristics?
<fwereade> wallyworld, thumper: and if I don't answer I've probably gone to sleep, but I haven't yet
<wallyworld> fwereade: can i ping you in 10? after standup
<wallyworld> perrito666: standup?
<thumper> fwereade: I'm here
<fwereade> thumper, wallyworld, ok, let me know when standup's over and maybe we could convene in there?
<thumper> fwereade: lets start a hangout now
<fwereade> thumper, sure
<thumper> because, hell, lets not do real work
<wallyworld> fwereade: thumper: i'm free now
<wallyworld> fwereade: thumper: you guys in a hangout?
<thumper> wallyworld: yeah... fwereade should bring you in
<thumper> 2015-06-19 13:36:56 ERROR juju.worker runner.go:218 exited "api": setting up container support: cannot load machine machine-0-lxc-0 from state: unknown object type "Provisioner"
#juju-dev 2015-06-26
<davecheney> da fuq ?
<thumper> davecheney: yeah, this seems to be the fundamental problem behind the lxc containers not upgrading
 * thumper is still digging
 * thumper tries to ignore work for a bit and go to lunch
<menn0> thumper: based on circumstantial evidence only it looks like a stuck lease/leadership worker is behind bug 1466565
<mup> Bug #1466565: Upgraded juju to 1.24 dies shortly after starting <cts> <landscape> <sts> <upgrade-juju> <juju-core:Triaged by menno.smits> <juju-core 1.24:Triaged by menno.smits> <https://launchpad.net/bugs/1466565>
<menn0> thumper: based on a log message indicating that a watcher fired due to a change in the leases collection long after just about everything else was dead
 * menn0 goes to try a quick repro
<davecheney> thumper: http://paste.ubuntu.com/11776424/
<davecheney> 5 races, including the obscure apiserver one
<davecheney> that we talked about in the standup
<natefinch> dave, always playing the race card
<natefinch> ericsnow: you around?
<davecheney> thumper: who maintains gomaasapi ?
<davecheney> https://bugs.launchpad.net/juju-core/+bug/1468972
<mup> Bug #1468972: provider/maas: race in launchpad.net/gomaasapi <juju-core:New> <https://launchpad.net/bugs/1468972>
<mup> Bug #1468972 opened: provider/maas: race in launchpad.net/gomaasapi <juju-core:New> <https://launchpad.net/bugs/1468972>
<menn0> thumper: bingo... able to repro
<menn0> wallyworld: we're never doing another 1.23 release again are we?
<wallyworld> no
<wallyworld> that's the plan
<menn0> cool
<menn0> wallyworld: the reason I ask is that I'm looking at a problem upgrading out of a 1.23 env which seems fairly easy to hit (almost certainly due to the lease/leadership workers not exiting)
<wallyworld> hmmm
<menn0> wallyworld: adam c has hit it and I can repro it pretty easily
<menn0> wallyworld: seems like anyone who ended up on 1.23 could have trouble getting off it
<wallyworld> i guess we could do another release then
<menn0> wallyworld: that wouldn't help
<wallyworld> or have to
<menn0> wallyworld: they wouldn't be able to upgrade to that either
<wallyworld> ah yeah
<menn0> wallyworld: the issue is preventing the agent from exiting to restart into the new version
<wallyworld> is there a workaround we can document?
<menn0> wallyworld: it should be possible to work around it by manually setting the symlink
<axw> menn0: I *think* killing the jujud process would fix it
<wallyworld> that will have to be what we do then i guess
<axw> it just deadlocks when shutting down
<axw> if it's the bug I fixed
<menn0> axw: no that doesn't help because the symlink gets changed as one of the very last things that jujud does b4 it exits
<axw> ah right
<menn0> axw: and b/c some workers aren't finishing it's not getting to that
 * axw nods
<axw> menn0: btw, reviewed your branches. sorry for not doing so yesterday
<menn0> adam gets a minute or so of working Juju before it wants to restart and then gets stuck
<menn0> axw: thanks. no worries.
<menn0> axw: good catches for both of the problems you noticed
<thumper> davecheney: technically we maintain gomaasapi
<thumper> menn0: which repro are you talking about?
<davecheney> launchpad.net/gomaasapi
<menn0> thumper: bug 1466565
<mup> Bug #1466565: Upgraded juju to 1.24 dies shortly after starting <cts> <landscape> <sts> <upgrade-juju> <juju-core:Triaged by menno.smits> <juju-core 1.24:Triaged by menno.smits> <https://launchpad.net/bugs/1466565>
<thumper> menn0: yes?
<menn0> thumper: this is pretty serious actually... anyone who upgraded to 1.23 is likely to have a hard time getting off it
<menn0> thumper: see the ticket for trivial repro steps
<menn0> thumper: manual steps are required to upgrade
<menn0> thumper: the culprit appears to be the lease worker not honouring kill requests
 * thumper nods
<axw> wallyworld: I'm playing around with the Azure portal, which looks like it's using the new model... and putting machines in the same AS still forces them to the same domain-name/IP
<axw> :(
<wallyworld> oh :-(
<wallyworld> can you email the ms guys we have been talking to and ask about it?
<axw> wallyworld: ok
<wallyworld> ty, may not be the answer we want but at least they may be able to explain why etc
<mup> Bug #1466565 changed: Upgraded juju to 1.24 dies shortly after starting <cts> <landscape> <sts> <upgrade-juju> <juju-core:Triaged by menno.smits> <juju-core 1.24:Triaged by menno.smits> <https://launchpad.net/bugs/1466565>
<wallyworld> axw: there's a blue card in the Next lane - binding volumes/filesystems. That one has actually been done as part of the volume deletion work
<axw> wallyworld: yes, apart from UI to change binding
<axw> wallyworld: so I'll change it to just the missing bits
<wallyworld> axw: so i reckon we should add an unplanned card worth 5 or 8 for the work done
<axw> wallyworld: it was part of the persistent volume deletion
<axw> which was just woefully underestimated
<wallyworld> yep, i under estimated the resources card too :-(
<wallyworld> axw: also, if/when you get a chance ptal at the resources pr again  :-)
<axw> wallyworld: sure, just writing this email to guy
<wallyworld> np
<mup> Bug #1466565 opened: Upgraded juju to 1.24 dies shortly after starting <cts> <landscape> <sts> <upgrade-juju> <juju-core 1.23:Won't Fix by menno.smits> <juju-core 1.24:Invalid by menno.smits> <https://launchpad.net/bugs/1466565>
<axw> wallyworld: sorry, dunno why I thought you were storing the URL now. I think I saw the params struct and thought that's what you were storing in state
<wallyworld> np
<axw> wallyworld: LGTM
<wallyworld> yay, ty
<mup> Bug #1466565 changed: Upgraded juju to 1.24 dies shortly after starting <cts> <landscape> <sts> <upgrade-juju> <juju-core 1.23:Won't Fix by menno.smits> <juju-core 1.24:Invalid by menno.smits> <https://launchpad.net/bugs/1466565>
<menn0> omg so much fail
<menn0> you pull a string and broken stuff appears everywhere
<thumper> wallyworld, axw: can you join a hangout plxz?
<thumper> https://plus.google.com/hangouts/_/canonical.com/onyx-standup
<axw> thumper: omw
<axw> thumper: are you in? just says "trying to join the call"
<thumper> axw: I had that earlier today too...
 * thumper tries a direct invite
<thumper> axw: when did this commit land BTW?
<axw> thumper: 1.24
<thumper> I'm wondering if we should pull 1.24.1
<thumper> because this problem will stop any non-state server upgrading I think
<axw> thumper: probably not a bad idea. how come this got through CI? is it only affecting things that don't support KVM?
<thumper> no idea
<thumper> maybe...
<thumper> there is an open issue though about CI around upgrades
<thumper> as we have found so many upgrade problems
<thumper> which CI didn't catch
<axw> thumper: got the OK from OIL too I think, though not sure if they do upgrade or clean install
<thumper> I assigned you to the wrong bug
<thumper> hang on
<axw> thumper: ta
<thumper> bug 1466969
<mup> Bug #1466969: Upgrading 1.20.14 -> 1.24.0 fails <canonical-bootstack> <upgrade-juju> <juju-core:Triaged> <juju-core 1.24:Triaged by axwalk> <https://launchpad.net/bugs/1466969>
<axw> hurngh, can't test because I have vivid
<axw> should fail from 1.23 I guess
<thumper> 1.23 is terrible
<thumper> you can't upgrade from 1.23 due to lease / leadership issues
<thumper> try 1.22 or 1.20
<thumper> I have some 1.20.14 binaries if you want them :)
<axw> thumper: I can build them, juju 1.20 doesn't work on vivid
<axw> no systemd
<thumper> ugh
<thumper> geez
<axw> never mind, I'll work something out
<thumper> axw: you could reproduce in ec2
<axw> yep. I think I have a VM anyway
<thumper> axw: by deploying ubuntu into a container
<thumper> ok
<thumper> axw: was this for 1.24.1 or 1.24.0?
<thumper> axw: because there is another bug about failing to upgrade from 1.24.0 to 1.24.1
<axw> thumper: pretty sure 1.24, I'll double check
<axw> .0 I mean
 * thumper wouldn't be surprised if it is a different bug
<thumper> so many bugs
<thumper> :-(
<axw> thumper: yep, 1.24.0
<thumper> ok... so this other upgrade problem is something else
 * thumper takes a deep breath
<axw> thumper: how do I work around this syslog upgrade issue?
<axw>       upgrade to 1.24.2.1 failed (will retry): move syslog config from LogDir to DataDir: error(s) while moving old syslog config files: invalid argument
<thumper> ha
<thumper> I build from the 1.24.1 tag
<axw> I see, that was only broken in 1.24.2 ?
<thumper> or mkdir /etc/juju-<namespace>
<thumper> yep
<thumper> it is the commit after updating the version to 1.24.2
<axw> okey dokey
<axw> I'll try that
<mup> Bug #1468994 opened: Multi-env unsafe leadership documents written to settings collection <juju-core:Triaged by menno.smits> <juju-core 1.24:In Progress by menno.smits> <https://launchpad.net/bugs/1468994>
<menn0> thumper: digging into the leadership settings issue... the _id field was being prefixed correctly
<menn0> thumper: but the env-uuid field wasn't being added
<menn0> thumper: so there's no cross-env leakage issues, but the upgrade step definitely gets confused
 * menn0 updates ticket
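The convention menn0 is describing can be sketched like this: a multi-env document's `_id` gets an env-UUID prefix and the same UUID is duplicated into an `env-uuid` field so queries can filter by environment; the bug was the second half being skipped. Helper and field names here are illustrative, not juju's actual state code:

```go
package main

import (
	"fmt"
	"strings"
)

// doc stands in for a raw mongo document.
type doc map[string]interface{}

// namespaceDoc applies both halves of the multi-env convention:
// prefix the _id with the env UUID AND record the UUID in env-uuid.
// Forgetting the second assignment is the class of bug discussed above.
func namespaceDoc(envUUID, localID string, d doc) doc {
	d["_id"] = envUUID + ":" + localID
	d["env-uuid"] = envUUID // the half that was being forgotten
	return d
}

// localID strips the env-UUID prefix back off a global _id.
func localID(envUUID, globalID string) string {
	return strings.TrimPrefix(globalID, envUUID+":")
}

func main() {
	d := namespaceDoc("abcd-1234", "leadership#wordpress", doc{})
	fmt.Println(d["_id"], d["env-uuid"])
	fmt.Println(localID("abcd-1234", d["_id"].(string)))
}
```

With only the prefix applied there is no cross-env leakage (ids stay unique), but anything that filters on the `env-uuid` field, such as an upgrade step, gets confused, which matches menn0's description.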
<menn0> axw: can I get a quick review of http://reviews.vapour.ws/r/2036/ please
<menn0> it's a one-liner :)
<axw> menn0: sure
<axw> menn0: is there a minimal test you can add for it? or is that coming later?
<menn0> axw: i'll have a look... i didn't have to change any tests when making this change
<axw> menn0: right, but we had missing test coverage right?
<axw> menn0: maybe not worthwhile. I'll LGTM and leave it to your discretion
<menn0> axw: thinking about it, a test at this layer doesn't make sense since it's actually the responsibility of a lower level to add the env-uuid field
<axw> menn0: fair enough
<menn0> axw: the fact that the lower layer didn't blow up when given a doc like this will be fixed in a later PR
<menn0> axw: and tested there
<axw> menn0: SGTM
<axw> shipit
<menn0> axw: cheers
<menn0> thumper: https://github.com/juju/juju/pull/2662 and https://github.com/juju/juju/pull/2661 are merging now. they're the minimum fixes for the leaderships settings doc env-uuid issue for 1.24 and master. More to come to avoid this kind of thing in the future of course.
<axw> thumper: seems there's another problem too :/    2015-06-26 06:22:31 ERROR juju.worker runner.go:218 exited "api": login for "machine-1" blocked because upgrade in progress
<axw> thumper: (machine-1 hasn't upgraded yet)
<dimitern> voidspace, dooferlad, hey guys, since you're on call reviewers today, along with fwereade, please review any non-reviewed PRs with priority
<fwereade> dimitern, am so doing :)
<dimitern> fwereade, cheers :)
<dooferlad> dimitern: on it.
<dimitern> dooferlad, ta!
<voidspace> cool
<dooferlad> dimitern: the other topic for the day seems to be bootstack related. Should we sync up with Peter now
<dooferlad> ?
<dimitern> dooferlad, I'm talking to him in #juju @c
<dooferlad> dimitern: ah, I was expecting on a different channel.
<dimitern> dooferlad, standup?
<mup> Bug #1469077 opened: Leadership claims, document larger than capped size <landscape> <leadership> <juju-core:New> <https://launchpad.net/bugs/1469077>
<Syed_A> Hello !
<Syed_A> submitted two bugs last night.
<Syed_A> [1] https://bugs.launchpad.net/charms/+source/quantum-gateway/+bug/1468939
<mup> Bug #1468939: Instances fail to get metadata: The 'service_metadata_proxy' option must be enabled. <quantum-gateway (Juju Charms Collection):New> <https://launchpad.net/bugs/1468939>
<Syed_A> https://bugs.launchpad.net/charms/+source/nova-cloud-controller/+bug/1468918/
<mup> Bug #1468918: neutron-server fails to start; python-neutron-vpnaas and python-neutron-lbaas packages are missing. <nova-cloud-controller (Juju Charms Collection):New> <https://launchpad.net/bugs/1468918>
<Syed_A> jamespage: Hello
<mup> Bug #1469130 opened: tools migration fails when upgrading 1.20.14 to 1.24.1 on ec2 <ec2-provider> <upgrade-juju> <juju-core:Triaged> <https://launchpad.net/bugs/1469130>
<fwereade> mattyw, would you close http://reviews.vapour.ws/r/1460/ one way or the other? looks like it has ship-its
<mattyw> fwereade, oooh, had forgotten about this
<fwereade> mattyw, cheers
<mattyw> fwereade, the comments seem controversial - care to make a casting vote - land or just close?
<fwereade> mattyw, I'm inclined to trust dave and andrew's apparent approval; nobody's complained, so land it
<mattyw> fwereade, landing, thanks very much
<mattyw> fwereade, thanks for noticing, had totally forgotten about this
<fwereade> niedbalski, niedbalski_: so, I'm sorry, I don't know what happened with your patches http://reviews.vapour.ws/r/1698/ and http://reviews.vapour.ws/r/1717/ ; it seems they got ship-its but never landed? if you check whether they need updating, and let me know their status, I will make sure they get landed
<fwereade> dimitern, http://reviews.vapour.ws/r/1403/ ?
<dimitern> fwereade, looking
<dimitern> fwereade, that needs to land yes, it's been a while
<dimitern> fwereade, I'll fix/respond to the current reviews and ask you for a final stamp
<fwereade> dimitern, cool
<jamespage> Syed_A, which openstack release?
<Syed_A> jamespage: Kilo
<jamespage> Syed_A, for that second bug, neutron-server is not supported on nova-cloud-controller - you have to use the neutron-api charm
<jamespage> that applies for >= kilo
<jamespage> Syed_A, can you make sure that your quantum-gateway charm is up-to-date - the kilo template should have the right things set
<Syed_A> jamespage: Ok, so if i deploy neutron-api charm i wouldn't need to install vpnaas or lbass ?
<jamespage> Syed_A, the neutron-api charm knows how to deploy those things for >= kilo
<Syed_A> jamespage: Roger that.
<jamespage> it will enable them - nova-cloud-controller only supported 'embedded neutron-server' up to juno I think
<Syed_A> jamespage: This may be a silly question but how can i make sure that quantum-gateway charm is up-to-date ?
<dimitern> fwereade, updated http://reviews.vapour.ws/r/1403/ PTAL
<jamespage> Syed_A, are you deployed from branches or from the juju charm store?
<Syed_A> jamespage: juju charm store.
<jamespage> Syed_A, which version does 'juju status' tell you have deployed then
<jamespage> Syed_A, version 16 has the required templates:
<jamespage> https://api.jujucharms.com/charmstore/v4/trusty/quantum-gateway-16/archive/templates/kilo/nova.conf
<Syed_A> Ok,,, checking ...
<Syed_A> jamespage: charm: cs:trusty/quantum-gateway-16
<fwereade> dimitern, LGTM
<jamespage> Syed_A, what's your openstack-origin configuration?
<Syed_A> jamespage: Unfortunately, in this setup openstack-origin is not present but there is an ansible variable which specifies the openstack release, which is set to kilo.
<Syed_A> jamespage: The variable is used to set this repository, repo="deb http://ubuntu-cloud.archive.canonical.com/ubuntu {{ ansible_lsb.codename }}-updates/{{ openstack_release }} main"
<jamespage> Syed_A, I need to understand what the charm thinks it should be doing
<jamespage> if openstack-origin is not set correctly, it won't use the right templates
<Syed_A> jamespage: Ok, i am going to set openstack_origin in the config right now.
<jamespage> irrespective of what you put in sources :)
<jamespage> Syed_A, this may have worked in the past, but for the last release we switched how we determine the openstack series to support the deploy from source feature in the charms
<jamespage> Syed_A, my statement about openstack-origin will apply across all of the openstack charms btw
<jamespage> the template loader is constructed based on that configuration
<jamespage> so it will assume a default of icehouse on trusty for example
<Syed_A> jamespage: Ohhh, i got it, so this might be the reason why this charm, which used to work fine, now fails.
<jamespage> Syed_A, that's quite possible
<jamespage> Syed_A, before we determined version based on packages installed - however for deploy from source, there are not any openstack packages installed :-)
<dimitern> fwereade, last look? http://reviews.vapour.ws/r/1403/
<fwereade> dimitern, if that's all you changed just land it :)
<dimitern> fwereade, cheers :) will do
<Syed_A> jamespage: I am deploying a fresh setup with these configs. [1] http://paste.ubuntu.com/11778630/ && [2] http://paste.ubuntu.com/11778641/
<jamespage> Syed_A, openstack-dashboard needs openstack-origin as well
<jamespage> but looks much better
<jamespage> Syed_A, I must introduce you to bundles :-)
<Syed_A> jamespage: bundles ? :)
<jamespage> Syed_A, hmm - you're doing a lot of --to=X to the same machines ?
<Syed_A> jamespage: Yes, specifying exactly where a service should go. Isn't that a good practice?
<jamespage> Syed_A, bundles - https://jujucharms.com/openstack-base/
<jamespage> Syed_A, pushing multiple services onto the same machines without using containers won't work
<jamespage> Syed_A, https://wiki.ubuntu.com/ServerTeam/OpenStackCharms/ProviderColocationSupport
<Syed_A> jamespage: This is why i was working on a lxc based OpenStack deployment. But for now we are just deploying nova-compute and quantum-gateway on separate machines, which used to work in the past.
<Syed_A> jamespage: Our lxc based bits are also ready just need to patch the lxc-ubuntu-cloud template for our 3 nics per container requirement.
<jamespage> Syed_A, I thought you were - good
<jamespage> your pastebin confused me
<Syed_A> jamespage: Sorry about that. alice(controller) is 1, bob(compute) is 2 and charlie(quantum-gateway) is 3. :)
<jamespage> Syed_A, but you are going to use lxc containers right?
<Syed_A> jamespage: No, not in this setup.
<jamespage> Syed_A, most of the controller services won't work
<jamespage> Syed_A, they assume control over the filesystem, so are not safe to deploy without containers
<Syed_A> jamespage: ohhh that would be a problem. :/
<jamespage> Syed_A, yeah - I know they will all at-least conflict on haproxy configuration
<jamespage> Syed_A, we enable that by default now
<Syed_A> jamespage: for haproxy, we have a customized haproxy.cfg which fixes the issue
 * fwereade was up until 2 last night, taking an extended break, may or may not be back at a reasonable time
<jamespage> Syed_A, you guys are terrifying me - all I can say is ymmv
<mbruzek> Has anyone seen a problem with the GCE provider today?  The juju bootstrap command is giving this error: ERROR failed to bootstrap environment: cannot start bootstrap instance: no "trusty" images in us-central1 with arches [amd64 arm64 armhf i386 ppc64el]
<sinzui> mbruzek: I am in #cloudware. I haven't gotten any answers
<sinzui> mbruzek: there are NO images for gce http://cloud-images.ubuntu.com/releases/streams/v1/com.ubuntu.cloud:released:gce.sjson
<Syed_A> jamespage: Our goal is to eventually move towards lxc based openstack deployment as suggested by the community. Right now i am only trying to fix this issue for the time being. We have every intention to follow the process as suggested on ubuntu wiki.
<mbruzek> sinzui: strange that this worked before, I am just seeing this error today
<sinzui> mbruzek: CI tests gce, we saw the failure about 15 hours ago.
<mbruzek> sinzui: Did you file a bug that I can contribute to?
<sinzui> mbruzek: no, because this is an ops issue. I am not aware of a project for gce images
<sinzui> mbruzek: I am crafting an email asking for someone with power to explain the situation
<Syed_A> jamespage: You were right about the conflict at haproxy, neutron-api failed to install and logs this: INFO install error: cannot open 9696/tcp (unit "neutron-api/0"): conflicts with existing 9696/tcp (unit "nova-cloud-controller/0")
<Syed_A> jamespage: Looks like nova-cloud-controller and neutron-api are both installing neutron-server.
<jamespage> Syed_A, yes
<jamespage> Syed_A, hmm - yes - that won't work well on a single unit
<jamespage> Syed_A, there is a huge assumption in the charms that they 'own' the unit
<Syed_A> jamespage: Ok, so how can i stop nova-cloud-controller from installing neutron-server.
<Syed_A> jamespage: Will it work if i deploy the neutron-api unit on the quantum-gateway node ?
<jamespage> Syed_A, nope - neutron-api will trample all over the gateway charms config files
<Syed_A> jamespage: compute node then ?
<jamespage> Syed_A, nova-cc decides to stop managing neutron-server - but not straight away
<jamespage> Syed_A, same problem - but this time neutron-openvswitch's config files
<jamespage> Syed_A, the charms are just not designed for this type of use
<mup> Bug #1469184 opened: listSuite teardown fails <ci> <intermittent-failure> <unit-tests> <juju-core:Incomplete> <juju-core 1.24:New> <https://launchpad.net/bugs/1469184>
<mup> Bug #1469186 opened: ContextRelationSuite teardown fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1469186>
<Syed_A> jamespage: Don't you think charms should be able to deploy a standalone controller node, say a VM?
<jamespage> Syed_A, I'm averse to changing the design principle each charm has in that it 'owns' the unit
<jamespage> Syed_A, LXC containers give us a lightweight way to manage this, without having to have a lot of complexity in the charms to deal with this problem
<Syed_A> jamespage: I am inclined to agree with you. LXC works better, but somebody wanting to deploy an openstack controller node without using lxc is still a valid use case.
<jamespage> Syed_A, I don't disagree with that - just saying maybe the charms are not the right way to fullfil that
<natefinch> fwereade: why did we write our own RPC implementation when there's one in the stdlib?
<mup> Bug #1469189 opened: unitUpgraderSuite teardown panic <ci> <intermittent-failure> <unit-tests> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1469189>
<mup> Bug #1469193 opened: juju selects wrong address for API <sts> <juju-core:New> <https://launchpad.net/bugs/1469193>
<mup> Bug #1469196 opened: runlistener nil pointer / invalid address <ci> <intermittent-failure> <unit-tests> <juju-core:Incomplete> <juju-core 1.24:Triaged> <https://launchpad.net/bugs/1469196>
<Syed_A> jamespage: Ok let's say i fix the neutron-server manually but what about the instance metadata not working ?
<jamespage> Syed_A, that should be fixed by correctly specifying openstack-origin
<Syed_A> jamespage: testing ...
<Syed_A> jamespage: Ok so instance metadata is working.
<fwereade> natefinch, I can't remember what it does that the stdlib one didn't, but I know it was something :/
<Syed_A> jamespage: As per your suggestion correctly specifying openstack-origin fixed the issue.
<fwereade> natefinch, rogpeppe would remember
<rogpeppe> natefinch: there were a few reasons
<rogpeppe> natefinch: the main one is that with the stdlib version you don't get to have per-connection context
<natefinch> rogpeppe: ahh, interesting, yeah
<rogpeppe> natefinch: also, the way you have to phrase the stdlib methods is awkward
<Syed_A> jamespage: If somebody is deploying openstack on a public cloud and they cannot use lxc, the suggestion here would be to start a new vm and install neutron-api as a standalone unit there?
<mup> Bug #1469199 opened: State server seems to have died <cloud-install-failure> <juju-core:New> <https://launchpad.net/bugs/1469199>
<natefinch> rogpeppe: yeah, the stdlib way is kind of annoying, I'm surprised they didn't do it the way ours does... (traditional val, error return)... but I'm sure there was a reason at the time
<rogpeppe> natefinch: it's simpler to implement the way they did it
<rogpeppe> natefinch: but my reasoning was we were going to be writing lots of API entry points, so the additional complexity in the rpc package was worth it
<voidspace> mgz: ping
<mgz> voidspace: hey
<voidspace> mgz: it's alright, I think I've sorted it
<voidspace> mgz: had a question about gomaasapi which you seem to have touched
<mgz> voidspace: okay, I shall remain in the dark
<voidspace> mgz: heh
<voidspace> mgz: I hate creating JSON maps in Go :-/
<mgz> voidspace: it is not the most fun
<jamespage> Syed_A, yes - but that is very much an edge case
<jamespage> most clouds are deployed on metal :-)
<jamespage> Syed_A, infact what you suggest is exactly how we test the openstack charms - we have a small QA cloud (5 compute nodes) which we can standup a full openstack cloud ontop of
<jamespage> we can run ~15 clouds in parallel
<jamespage> and do things like test HA etc...
<Syed_A> jamespage: Correct most clouds are deployed on metal. But with the latest charms neutron-api and nova-cloud-controller cannot be installed on the same physical machine ?
<jamespage> Syed_A, that is absolutley the case - and you will hit issues with other conflicts as well
<jamespage> Syed_A, which is why we have https://wiki.ubuntu.com/ServerTeam/OpenStackCharms/ProviderColocationSupport
<Syed_A> jamespage: We also have a small setup where we test openstack. I set up HA LXC openstack setup last week. Which was fun :)
<jamespage> :-)
<jamespage> Syed_A, its neat - the qa cloud i refer to is juju deployed, and is HA control plane under lxc as well
<Syed_A> jamespage: Cool !
<ericsnow> natefinch: regarding RB, did you mean the GH integration isn't working or something else?
<sinzui> mbruzek: gce streams are back
<mbruzek> sinzui: thank you
<natefinch> ericsnow: yes, the GH integration... like, I made a PR vs. juju-process-docker and no review was created on RB
<natefinch> ericsnow: I probably just missed a steo
<natefinch> ste
<natefinch> step
<natefinch> arg...
<ericsnow> natefinch: yeah, the repo did not have the web hook set up (I've added it)
<natefinch> ericsnow: can you document the steps in the wiki?
<ericsnow> natefinch: sure
<natefinch> ericsnow: so, process server api in process/api/server.go?
<ericsnow> natefinch: how about process/api/server/uniter.go
<ericsnow> natefinch: params would live in process/api/params.go
<natefinch> ericsnow: is there a reason to split out the params, server, and client stuff?  if each one is fairly simple and probably fits in a single file...
<ericsnow> natefinch: my expectation is that it won't fit well in a single file
<natefinch> ericsnow: ok
<natefinch> ericsnow: when are those state functions getting merged into the feature branch?
<ericsnow> natefinch: likely not before Monday
<natefinch> ericsnow: ok
<natefinch> this whole "duplicate every single struct in the API" thing gets really tiresome
<mup> Bug #1469318 opened: apitserver: TestAgentConnectionsShutDownWhenStateDies takes > 30 seconds to run <juju-core:New> <https://launchpad.net/bugs/1469318>
#juju-dev 2015-06-28
<thumper> bah humbug
<thumper> power failures are as intermittent as I thought
 * thumper runs the test 20 times ...
<davecheney> thumper: morning
<davecheney> https://github.com/juju/juju/pull/2665
<thumper> hi dave
<thumper> shipit
<davecheney> wheee
<thumper> I *think* I may have found a source of our agents not stopping...
<thumper> davecheney: I have a few questions about channels and select statements
<thumper> davecheney: do you have a few minutes for a hangout?
<davecheney> sure
<davecheney> let me get the other computer
<davecheney> i'll see in the 1:1
<thumper> davecheney: I'm there
<davecheney> thumper: i'm in the 1:1
<thumper> hmm
<davecheney> are you in the standup hangout ?
<thumper> so am I
 * thumper rejoins
#juju-dev 2016-06-27
<mup> Bug #1596462 opened: Deployment failed because state DB is locked <ci> <deploy> <reliability> <juju-core:Triaged> <https://launchpad.net/bugs/1596462>
<mup> Bug #1596462 changed: Deployment failed because state DB is locked <ci> <deploy> <reliability> <juju-core:Triaged> <https://launchpad.net/bugs/1596462>
<mup> Bug #1596045 changed: Juju says windows mongo: invalid version <blocker> <ci> <mongodb> <regression> <windows> <juju-core:Fix Released by 2-xtian> <https://launchpad.net/bugs/1596045>
<mup> Bug #1596462 opened: Deployment failed because state DB is locked <ci> <deploy> <reliability> <juju-core:Triaged> <https://launchpad.net/bugs/1596462>
<mup> Bug #1596476 opened: juju charm resources does not map application name to charmstore url <juju-core:New> <https://launchpad.net/bugs/1596476>
<mup> Bug #1596493 opened: github.comjuju/juju/apiserver package: first record does not look like a TLS handshake <blocker> <ci> <regression> <test-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1596493>
<mup> Bug #1596493 changed: github.comjuju/juju/apiserver package: first record does not look like a TLS handshake <blocker> <ci> <regression> <test-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1596493>
<mup> Bug #1596493 opened: github.comjuju/juju/apiserver package: first record does not look like a TLS handshake <blocker> <ci> <regression> <test-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1596493>
<mup> Bug #1596496 opened: Bootstrap should fail early if '-d controller' is specified <bitesize> <bootstrap> <juju-core:Triaged> <https://launchpad.net/bugs/1596496>
<lazyPower> ey core, thanks for getting beta10 out the door Friday. *fires up the user-testing engine*
<perrito666> lazyPower: our pleasure :p
<dimitern> lazyPower: what's that engine btw?
<lazyPower> dimitern my fingers and possibly duplicate bug reports ;)
 * dimitern is interested in all sorts of testing automation we can share and reuse
<dimitern> ah :)
<dimitern> lazyPower: ok then, hope it goes fine :)
<lazyPower> dimitern so far so good :) Yinzers do good work.
<mup> Bug #1576985 changed: aggregateSuite.TestBatching wrong size <ci> <intermittent-failure> <unit-tests> <windows> <juju-core:Invalid> <https://launchpad.net/bugs/1576985>
<dimitern> lazyPower: Yinzers ?
<lazyPower> Thats Pittsburghese for "You all"
<dimitern> :)
<mup> Bug #1576985 opened: aggregateSuite.TestBatching wrong size <ci> <intermittent-failure> <unit-tests> <windows> <juju-core:Invalid> <https://launchpad.net/bugs/1576985>
<mup> Bug #1559400 changed: TestManageModelRunsRegisteredWorkers is flaky <intermittent-failure> <juju-core:Invalid> <https://launchpad.net/bugs/1559400>
<mup> Bug #1576985 changed: aggregateSuite.TestBatching wrong size <ci> <intermittent-failure> <unit-tests> <windows> <juju-core:Invalid> <https://launchpad.net/bugs/1576985>
<mup> Bug #1580802 changed: NoContextWithLock fails on windows because of another process <ci> <intermittent-failure> <regression> <unit-tests> <windows> <juju-core:Invalid> <https://launchpad.net/bugs/1580802>
<mup> Bug #1559400 opened: TestManageModelRunsRegisteredWorkers is flaky <intermittent-failure> <juju-core:Invalid> <https://launchpad.net/bugs/1559400>
<mup> Bug #1580802 opened: NoContextWithLock fails on windows because of another process <ci> <intermittent-failure> <regression> <unit-tests> <windows> <juju-core:Invalid> <https://launchpad.net/bugs/1580802>
<mup> Bug #1559400 changed: TestManageModelRunsRegisteredWorkers is flaky <intermittent-failure> <juju-core:Invalid> <https://launchpad.net/bugs/1559400>
<mup> Bug #1580802 changed: NoContextWithLock fails on windows because of another process <ci> <intermittent-failure> <regression> <unit-tests> <windows> <juju-core:Invalid> <https://launchpad.net/bugs/1580802>
<perrito666> axw: ping
<wallyworld> ericsnow: katco: hey, did you guys have time for a status update hangout?
<ericsnow> wallyworld: sure
<wallyworld> ericsnow: katco: https://hangouts.google.com/hangouts/_/canonical.com/tanzanite-stand
<mup> Bug #1596559 opened: BootstrapSuite.TestRunTests <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1596559>
<ericsnow> katco: ping me when you're around
<mup> Bug #1596593 opened: juju show-model does not accept model as kwarg <juju-core:New> <https://launchpad.net/bugs/1596593>
<wallyworld> ericsnow: i looked at the WIP forwarding worker. see if the comments make sense - i'm confused about what is expected as controller config; it seems it's been implemented as model config? also the worker loop seems a little unusual in not using a channel as the result of Next()?
<ericsnow> wallyworld: thanks, I'll take a look
<thumper> urulama-sprint: http://reviews.vapour.ws/r/5171/
<thumper> some juju reviewer plz: http://reviews.vapour.ws/r/5171/diff/#
<urulama-sprint> frankban|afk: ^ PTAL
<mup> Bug # opened: 1596597, 1596603, 1596605, 1596607, 1596608, 1596609
<perrito666> thumper: reviewed, shipit with a couple of gotchas
<perrito666> I must admit I did not go through the whoooole menn0 checklist but I covered many points
<mup> Bug #1596612 opened: show-controller output updates <2.0> <usability> <juju-core:New> <https://launchpad.net/bugs/1596612>
<mup> Bug #1596615 opened: show-model output updates <2.0> <usability> <juju-core:New> <https://launchpad.net/bugs/1596615>
<mup> Bug #1596616 opened: commands do not use two spaces between columns/headings <2.0> <usability> <juju-core:New> <https://launchpad.net/bugs/1596616>
<katco> ericsnow: hey, i'm here. sorry, dentist appt.
<ericsnow> katco: np
<ericsnow> katco: just getting up to speed with wallyworld about status
<ericsnow> katco: we need to get together at some point about how to integrate audit logs with log forwarding
<ericsnow> * some point today
<katco> ericsnow: i'm free now if you'd like
<ericsnow> katco: sure, moonstone?
<katco> ericsnow: brt
<mup> Bug #1596619 opened: remove aliases from juju commands <2.0> <usability> <juju-core:New> <https://launchpad.net/bugs/1596619>
<mup> Bug #1596626 opened: juju gui not showing multiple models <juju-core:New> <https://launchpad.net/bugs/1596626>
<mup> Bug #1596626 changed: juju gui not showing multiple models <juju-core:New> <https://launchpad.net/bugs/1596626>
<mup> Bug #1596626 opened: juju gui not showing multiple models <juju-core:Invalid> <https://launchpad.net/bugs/1596626>
<mup> Bug #1596626 changed: juju gui not showing multiple models <juju-core:New> <https://launchpad.net/bugs/1596626>
<perrito666> if we actually see cloud init apt-get failing suggesting update, why on earth would we not try that?
<bogdanteleaga> do we have some sort of wiki entry for the multiple mongo gotchas that got posted lately to the ML?
<katco> ericsnow: given all my changes, i think i need another proper review of: http://reviews.vapour.ws/r/5089/
<ericsnow> katco: k
<katco> ericsnow: ta
<katco> bogdanteleaga: not that i know of
<katco> ericsnow: i think i need to defer william's comments about RequestNotifier having internal knowledge of state, and tests for that type, but everything else is fair game
<ericsnow> katco: k
<mup> Bug #1596687 opened: command list output not consistent <2.0> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1596687>
<katco> ericsnow: i'm ready to land http://reviews.vapour.ws/r/5089/. any thoughts?
<mup> Bug #1596687 changed: command list output not consistent <2.0> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1596687>
<ericsnow> katco: one minute
<katco> ericsnow: np... took me 4 tries to land the previous one =|
<mup> Bug #1596687 opened: command list output not consistent <2.0> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1596687>
<mup> Bug #1596688 opened: normalizing the shares and user commands <2.0> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1596688>
<ericsnow> katco: basically, ship-it
<mup> Bug #1596688 changed: normalizing the shares and user commands <2.0> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1596688>
<katco> ericsnow: ta
<mup> Bug #1596688 opened: normalizing the shares and user commands <2.0> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1596688>
<stokachu> ericsnow: are you familiar with the deploy api?
<ericsnow> stokachu: a little
<stokachu> im trying to get the latest beta10 to deploy via the api
<stokachu> https://paste.ubuntu.com/17991715/
<stokachu> we pass that to the apiserver but the response is always unable to find charm
<stokachu> was there an additional requirement added in beta10?
<stokachu> only thing we changed was to match the reworked  api parameter keys
<katco> stokachu: ok, sorry, i'm back and looking at this
<stokachu> thanks
<stokachu> ive been digging through the go code but nothing yet
<stokachu> katco: i did force a 'channel': 'stable' in the parameters
<stokachu> still failed though with same message
<katco> stokachu: hm
<katco> stokachu: do you happen to have the debug logs from the controller?
<stokachu> yea
<stokachu> what would you need?
<katco> stokachu: machine 0's log with debugging turned on and the deploy failure
<katco> stokachu: controller might give us a hint at what exactly is failing
<stokachu> whats the command to enable the debugging
<katco> stokachu: i always forget, sec. it's juju set-config log-level=DEBUG or something
<stokachu> juju set-config log-level=DEBUG
<stokachu> error: no application name specified
<stokachu> juju set-model-config logging-config='<root>=DEBUG;unit=DEBUG'
<katco> stokachu: that looks right. easy, huh? where'd you find it?
<stokachu> juju help set-model-config
<stokachu> katco: http://paste.ubuntu.com/17994956/
<stokachu> the bottom there has the latest deploy attempt
<katco> stokachu: ta
<stokachu> though it says params redacted
<stokachu> how do i get it to display it all
<katco> stokachu: probably can't; we hardcode that. let me 2xcheck though
<katco> stokachu: set level to trace
<katco> stokachu: and that should log the body as well
<stokachu> nice
<stokachu> katco: http://paste.ubuntu.com/17995148/
<stokachu> thats everything
<stokachu> line 5204 shows the deploy
<stokachu> we don't pass a series as that usually is handled from the charmurl
<stokachu> ericsnow: feel free to take a peek too
<katco> stokachu: just on a whim, can you try passing the series?
<stokachu> sure
<stokachu> katco: same error
<stokachu> https://paste.ubuntu.com/17995455/
<stokachu> 9203
<katco> stokachu: k, worth a try
<katco> stokachu: i think this is where the error is coming from: https://github.com/juju/juju/blob/master/state/charm.go#L696
<katco> stokachu: which would imply that it's not successfully being stored in mongo
<stokachu> weird
<stokachu> does the juju client code do something prior to calling the api?
<stokachu> as far as mongo is concerned
<katco> stokachu: yeah it does a few things, some related to resources
<katco> stokachu: that's all this: https://github.com/juju/juju/blob/master/cmd/juju/application/deploy.go#L301
<stokachu> so did that change from beta9 to beta10?
<katco> stokachu: nope
<stokachu> :(
<katco> stokachu: https://github.com/juju/juju/commit/136d03f5987c89946b6987832c520c8879ea225f#diff-bad794df7c6f66424c0b9cc0961da6cc
<katco> stokachu: looks... plausible
<stokachu> so we have to upload first
<stokachu> wtf
<katco> stokachu: it... looks like that may be the case. it looks like that was a 1.16 guarantee... wow
<stokachu> so i've no idea where to go from here
<stokachu> do i need to start uploading every charm?
<stokachu> this PR looks like it has the $jfdi$ tag
<stokachu> lol with no review
<katco> stokachu: http://reviews.vapour.ws/r/5092/
<katco> stokachu: unfortunately we are having to JFDI everything to meet deadlines, even though master is blocked
<stokachu> haha it just says shipit
<katco> stokachu: if there were no issues found, that's how the tool indicates it's safe to land
<katco> rick_h_: ping, you still up?
<stokachu> so im guessing deploying via the API with juju gui or another client is out of the scope of testing?
<katco> stokachu: upper management told us to move fast and break things as this is a beta
<katco> stokachu: alexisb usually bends over backwards to avoid doing this, but we lost that particular battle
<katco> stokachu: i don't know what the official stance on other clients is now. my guess is that yes, you have to upload charms/resources first... that seems weird to me
<katco> stokachu: seems like something the controller should be taking care of
<stokachu> yea
<stokachu> which it has been since forever
<katco> yeah
<katco> stokachu: can you send an email out to juju-dev asking for clarification about what's intended with a link to that PR?
<stokachu> katco: sure
<katco> stokachu: ta. sorry for the trouble. sometimes we don't have control over the experience :(
<stokachu> well the PR could've had more explanation at the very least
<stokachu> i understand its beta and we dont care about breaking things
<katco> ericsnow: wow, a lot of tests break if you add a new attribute to apiserver's ServerConfig struct :|
<ericsnow> :(
<katco> ericsnow: i suspect it'll take me a bit just to hunt all these down (sigh). FULL STACK TESTING FTW!
<ericsnow> katco: I feel your pain
<katco> ericsnow: do you? let's test it out. first, i need you to move to my city, then live in my house, etc. just to make sure we're exercising all aspects of me feeling pain
<ericsnow> katco: heh
#juju-dev 2016-06-28
<ericsnow> katco: how goes it?
<katco> ericsnow: ran into some issues in the rpc package
<katco> ericsnow: purely test related
<ericsnow> katco: :(
<katco> ericsnow: working through them now
<katco> ericsnow: i removed some dead code, but the tests are testing that the dead code worked
<ericsnow> katco: this is why we can't have nice things  (╯°□°)╯︵ ┻━┻
<katco> ericsnow: lol
<ericsnow> katco: BTW, one of my recent patches had one of those table flips in test data :)
<katco> ericsnow: rofl that's awesome
<ericsnow> katco: had to test UTF-9
<ericsnow> 8
<katco> ericsnow: i don't think i'm finishing this tonight
<ericsnow> katco: k
<ericsnow> katco: FYI, I'm probably not going to do much tomorrow
<katco> ericsnow: ok, no worries. it will probably take me most of the day to fix this up and land everything
<katco> ericsnow: i'll see if i can't have a go at the conversion func
<ericsnow> katco: k
<ericsnow> katco: ping
<urulama-sprint> thumper: https://github.com/juju/juju/blob/master/instance/placement.go#L23
<mup> Bug #1596842 opened: juju get-config does not accept keys to limit response <2.0> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1596842>
<mup> Bug #1596850 opened: Logs for hosted models in the controller are logged against the controller model <observability> <usability> <juju-core:New> <https://launchpad.net/bugs/1596850>
<mup> Bug #1596853 opened: juju deploy output is verbose exposing addCharm to users <2.0> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1596853>
<mup> Bug #1596858 opened: juju deploy a bundle output is too verbose <2.0> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1596858>
<perrito666> rick_h_: need to catch up with you after this
<mup> Bug #1596888 opened: actions status does not reference the action that was triggered <2.0> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1596888>
<rick_h_> perrito666: rgr
<mup> Bug #1596906 opened: there is no show-action command to learn how to use an action from the cli <2.0> <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1596906>
<mup> Bug #1596608 changed: MongoDB upserts can fail with duplicate key errors <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1596608>
<mup> Bug #1596609 changed: MongoDB upserts can fail with duplicate key errors <juju-core:In Progress by menno.smits> <https://launchpad.net/bugs/1596609>
<mup> Bug #1596066 changed: In some configurations, juju never achieves HA on maas 1.9 with trusty <blocker> <ci> <maas-provider> <regression> <trusty> <juju-core:Triaged> <https://launchpad.net/bugs/1596066>
<perrito666> running the whole test suite is a bit unfulfilling
<redir> a bit
<perrito666> natefinch: hey, can you run the state tests for this branch? https://github.com/juju/juju/pull/5727
<mup> Bug #1596960 opened: Intermittent test timeout in application tests <tech-debt> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1596960>
<mup> Bug #1596967 opened: Juju plugins must only start with a letter <usability> <juju-core:Triaged> <https://launchpad.net/bugs/1596967>
<thumper> natefinch: http://reviews.vapour.ws/r/5177/
<frobware> dimitern: https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/ - and a section on how to disable it
<mup> Bug #1588186 changed: reboot-executor does not run in jujud tests <juju-core:Triaged by fwereade> <https://launchpad.net/bugs/1588186>
<mup> Bug #1597063 opened: Can't destroy/kill a controller in an openstack cloud if cinder isn't being used <v-pil> <juju-core:New> <https://launchpad.net/bugs/1597063>
<mup> Bug #1597063 changed: Can't destroy/kill a controller in an openstack cloud if cinder isn't being used <v-pil> <juju-core:New> <https://launchpad.net/bugs/1597063>
<mup> Bug #1597063 opened: Can't destroy/kill a controller in an openstack cloud if cinder isn't being used <v-pil> <juju-core:New> <https://launchpad.net/bugs/1597063>
<mup> Bug #1597063 changed: Can't destroy/kill a controller in an openstack cloud if cinder isn't being used <v-pil> <juju-core:New> <https://launchpad.net/bugs/1597063>
<mup> Bug #1597078 opened: juju fails to delete openstack security groups when a controller is killed <v-pil> <juju-core:New> <https://launchpad.net/bugs/1597078>
<mup> Bug #1582896 changed: juju deploy is not working on juju 2.0 <lxd-provider> <juju-core:New> <https://launchpad.net/bugs/1582896>
<mup> Bug #1597078 changed: juju fails to delete openstack security groups when a controller is killed <v-pil> <juju-core:New> <https://launchpad.net/bugs/1597078>
#juju-dev 2016-06-29
<mup> Bug #1597170 opened: juju beta10 fails to deploy all services in aws if storage is not ready <conjure> <juju-core:New> <https://launchpad.net/bugs/1597170>
<hoenir> why, in github.com/juju/juju/provider/common/bootstrap.go at line 365, is the fmt.Fprintf target os.Stderr?
<hoenir> anyone?
<hoenir> It's just an informative logging message :-? so why not use the log system there?
<rick_h_> hoenir: will see if axw can help you there ^
<hoenir> no need, I figured out why
<mup> Bug #1564524 changed: Unable to deploy openstack base using juju2 and maas 1.9 <cdo-qa> <juju-core:Triaged> <https://launchpad.net/bugs/1564524>
<mup> Bug #1597318 opened: xenial containers on trusty host need lxc packages from trusty-backports <kanban-cross-team> <landscape> <juju-core:New> <https://launchpad.net/bugs/1597318>
<perrito666> oh, we should definitely try to work juju in this https://research.googleblog.com/2016/06/project-bloks-making-code-physical-for.html
<natefinch> fwereade: I think I have a nice simplification to that pinger code
<natefinch> fwereade: the trick is that Pinger itself is 99% the same as a time.Timer
<mup> Bug #1455627 opened: TestAgentConnectionDelaysShutdownWithPing fails <ci> <intermittent-failure> <lxc> <test-failure> <unit-tests> <windows> <juju-core:Triaged by dimitern> <https://launchpad.net/bugs/1455627>
<mup> Bug #1597342 opened: Juju 2 is using a LXD API that is not in the released version of LXD 2.0 <juju-core:In Progress by frobware> <https://launchpad.net/bugs/1597342>
<mup> Bug #1597342 changed: Juju 2 is using a LXD API that is not in the released version of LXD 2.0 <juju-core:In Progress by frobware> <https://launchpad.net/bugs/1597342>
<mup> Bug #1597342 opened: Juju 2 is using a LXD API that is not in the released version of LXD 2.0 <juju-core:In Progress by frobware> <https://launchpad.net/bugs/1597342>
<mup> Bug #1597354 opened: Juju 2.0 Resource Error - cannot add resource failed to write data: read tcp : i/o timeout <juju-core:New> <https://launchpad.net/bugs/1597354>
<mup> Bug #1597354 changed: Juju 2.0 Resource Error - cannot add resource failed to write data: read tcp : i/o timeout <juju-core:New> <https://launchpad.net/bugs/1597354>
<mup> Bug #1597354 opened: Juju 2.0 Resource Error - cannot add resource failed to write data: read tcp : i/o timeout <juju-core:New> <https://launchpad.net/bugs/1597354>
<mup> Bug #1597372 opened: juju2beta10 websockets api: Inconsistency in lowercasing of endpoints in deltas versus servers key in login response <kanban-cross-team> <juju-core:New> <https://launchpad.net/bugs/1597372>
<mup> Bug #1597378 opened: Juju 2.0-beta10 appears to be missing `resource-get` help output <juju-core:New> <https://launchpad.net/bugs/1597378>
<rogpeppe1> anyone dare to review this? http://reviews.vapour.ws/r/5166/
<redir> still looking for a shipit.... http://reviews.vapour.ws/r/5153/
<redir> still looking for a shipit.... http://reviews.vapour.ws/r/5153/
<redir> no idea if i've connected to IRC yet.
<redir> bbbiab
<mup> Bug #1597481 opened: Juju should not override explicit workload status <juju-core:New> <https://launchpad.net/bugs/1597481>
<mup> Bug #1597490 opened: juju 2.0-beta9.1: juju relation status PROVIDES and CONSUMES confused <juju-core:New> <https://launchpad.net/bugs/1597490>
<mup> Bug #1597516 opened: juju2beta10 websocket api: Inconsistency lower-case Scope and Directive placement parameters  <kanban-cross-team> <juju-core:New> <https://launchpad.net/bugs/1597516>
<mup> Bug #1597519 opened: juju 2 beta10, resources facade lower-case inconsistency with all other facades <conjure> <juju-core:New> <https://launchpad.net/bugs/1597519>
<mup> Bug #1597378 changed: Juju 2.0-beta10 appears to be missing `resource-get` help output <juju-core:New> <https://launchpad.net/bugs/1597378>
#juju-dev 2016-06-30
<mup> Bug #1597601 opened: ERROR cannot deploy bundle: cannot deploy application: i/o timeout <oil> <juju-core:New> <https://launchpad.net/bugs/1597601>
<perrito666> thumper: fix it then ship it
<urulama> thumper: hey ... any estimate when this will be fixed and shipped? :) http://reviews.vapour.ws/r/5185/
<thumper> urulama: before lunch
 * urulama lunches in 15min then ...
<urulama> :)
<ericsnow> katco: ping
<ericsnow> (oops, too early)
<thumper> perrito666 or natefinch: http://reviews.vapour.ws/r/5186/
<thumper> just picking on you because you did the other one
<perrito666> thumper: that is how you pay us
<thumper> penalty for helping
<perrito666> thumper: done
<perrito666> as usual, If menn0 reviews this he will find 120138019283019831023 more things than I did
<thumper> perrito666: ta
<rogpeppe> axw: ping
<menn0> perrito666: and wallyworld will find even more than me :)
<perrito666> So wallyworld should be the only reviewer, naturally
<rogpeppe> menn0: hiya
<rogpeppe> menn0: do you know what the status of the controller/model config split is, by any chance?
<rogpeppe> perrito666, wallyworld, axw: ^
<rogpeppe> thumper: ^
<thumper> rogpeppe: axw is working on it
<wallyworld> it's wip, partly done
<thumper> I'm not sure
<wallyworld> what is the specific question?
<rogpeppe> wallyworld: we're wondering whether the config is going to be split in the providers themselves
<rogpeppe> wallyworld: and if so, what's going to happen to Provider.Schema method?
<axw> rogpeppe: not sure what you mean - providers shouldn't need to know about controller config in general
<wallyworld> don't quite follow sorry. by the time the providers are instantiated, they will see environs/config.Config
<wallyworld> which will contain only model config
<rogpeppe> axw: ah, ok, cool.
<axw> rogpeppe: I guess we'll still have it, but it won't include controller config fields. we can have a separate controller config schema, but I don't think that's really useful for you?
<wallyworld> there's a state api call to get controller config
<rogpeppe> axw: so there will still only be one set of configuration attrs for a given model
<rogpeppe> axw: but... what about credentials?
<axw> rogpeppe: there's a separate schema for creds
<axw> rogpeppe: so, separate schemas for model, controller and credentials
<axw> tho controller is general, so doesn't really need a schema at all
<rogpeppe> axw: so environschema.AccountGroup becomes redundant?
<axw> rogpeppe: yep
<rogpeppe> axw: so Provider.Schema returns two values?
<rogpeppe> axw: can the attribute names overlap?
<axw> rogpeppe: maybe. EnvironProvider already has a CredentialSchemas method though
<axw> rogpeppe: credential and model attrs?
<rogpeppe> axw: Schemas plural?
<axw> rogpeppe: yeah, each provider can define multiple auth types
<axw> rogpeppe: e.g. access-key and userpass for openstack
<rogpeppe> axw: so where is the auth type specified?
<axw> rogpeppe: hypothetically the model and credentials could have overlapping attr names. they should be considered completely separate
<frankban> wallyworld: I am trying to convert the GUI API client for the new ModelManager facade (without ConfigSkeleton). I am encountering credential errors on ec2: e.g. http://pastebin.ubuntu.com/18158931/
<axw> rogpeppe: a credential is a key-value map, one of the keys has a special name "auth-type"
<axw> the value of which is interpreted in a provider-specific manner
<axw> (and must have a schema defined by the provider)
<rogpeppe> axw: how does the CredentialAttr.FilePath stuff work when credentials are passed across the network?
<axw> frankban: I think that's a small bug, can you try passing "user-admin@local" as the owner?
<wallyworld> frankban: you need to specify a credential to use if you are not the controller admin
<axw> rogpeppe: they have to be converted at the client first
<axw> rogpeppe: that bit is a bit iffy at the moment, needs some rework when we come to updating environschema
<frankban> axw: that works
<rogpeppe> axw: ISTM that most of this stuff isn't really compatible with the way environschema does things
<axw> frankban: thanks, I'll patch that shortly
<rogpeppe> axw: or could do things, even
<frankban> axw: cool, so we don't need to pass any config anymore, and no ssh keys, correct?
<axw> frankban: authorized-keys are still required at the moment, I have a patch up but can't land until the beta is out
<axw> frankban: i.e. the patch will make authorized-keys optional
<rogpeppe> axw: in particular, we've now got many overlapping attribute names with potentially different types depending on other attributes
<frankban> axw: well, this worked without ssh keys... http://pastebin.ubuntu.com/18159325/
<axw> frankban: ah, special case for admin user :) that will be going away
<axw> rogpeppe: example?
<frankban> axw: so, is it ok for the GUI, for the time being to not set CloudCredentials and to still pass fake ssh keys?
<frankban> axw: and to use @local in the owner-tag?
<rogpeppe> axw: for example, AFAICS you could have two auth types, both of which define a credential field with the same name but a different type
<axw> frankban: that's fine
<frankban> axw: cool thanks
<rogpeppe> axw: as returned from CredentialSchemas
<axw> rogpeppe: credentials are always strings
<frankban> axw: just to confirm, "user-admin@local" will continue working after your fix, correct?
<axw> frankban: yes
<rogpeppe> axw: ok, i see
<axw> frankban: we may want to stop accepting invalid SSH keys at some point, but that should be easy to drop in the GUI
<rogpeppe> axw: but different descriptions and other attributes, which amounts to a similar thing
<rogpeppe> axw: different sets of allowed values too
<axw> rogpeppe: I don't really understand the problem. You could hypothetically, but they're still relative to the auth-type. In the GUI you would select an auth-type, and then the attr name should make sense in that context
<frankban> axw: ok so params would be like this: {'config': {'authorized-keys': 'bad-wolf'}, 'name': 'test-model', 'owner-tag': 'user-admin@local'}
<frankban> correct?
<axw> frankban: yup
<rogpeppe> axw: the problem is that in our service, we have a unified view of "the model config" which includes all attributes. that's just become extremely special-case-y
<redir> PR seeks review: http://reviews.vapour.ws/r/5153/
<axw> rogpeppe: credentials are not part of model config *at all* now, though. you manage credentials separately in juju, using the Cloud facade. then when you create a model, you specify a credential by name
<rogpeppe> axw: i guess i was hoping to see a more general mechanism than just special-casing auth-type
<frankban> axw: to be more future-proof, should I always include the first credential name returned by Cloud.Credentials?
<rogpeppe> axw: they're part of the configuration you need when creating a model
<rogpeppe> axw: are you saying you can't explicitly pass credentials attributes now?
<axw> rogpeppe: that's something I'm working on fixing. you currently have to duplicate the creds into model config due to things being half done
<rogpeppe> axw: if possible, i think we'd prefer to avoid the *necessity* to have a named set of credentials in order to use some credentials, as it's just another thing to manage.
<axw> frankban: https://github.com/juju/juju/pull/5704 <- this PR adds a Cloud.CloudDefaults method, which includes the default credential name
<axw> frankban: I guess for now you could just use the first one in Cloud.Credentials
 * rogpeppe drops off the network for a few seconds
<rogpeppe> back
<frankban> axw: if I need to change it later, then maybe I'll just go with empty credential and therefore only allow admin to create models from the GUI for now. does it sound reasonable to you? and also could you please send me an email when everything is ready with a summary of the new strategy I should implement in the GUI?
<axw> rogpeppe: so credentials are managed separately from models now. you can have multiple models using the same credentials. to avoid duplication, and to support updating/revoking creds, we use a name for reference
<axw> rogpeppe: similar to how you specify credentials at bootstrap time
<axw> frankban: yes that sounds fine to me. will do
<frankban> axw: thanks a lot
<axw> np
<wallyworld> ericsnow: this is a quick and dirty pr just to get *something* done for the release deadline http://reviews.vapour.ws/r/5187/
<ericsnow> wallyworld: k
<axw> wallyworld: can you please review https://github.com/juju/juju/pull/5741
<wallyworld> axw: looking
<fwereade> katco, wallyworld, I am seeing a *lot* of audit spam that looks like: machine-0: 2016-06-30 10:56:18 CRITICAL juju.cmd.jujud machine.go:1055 ModelUUID not valid
<wallyworld> fwereade: i just found that myself :-( i've fixed it
<fwereade> wallyworld, <3
<fwereade> wallyworld, katco: do we know how this happened? presumably we did run a controller with these changes before we landed them?
<wallyworld> fwereade: i just found it running a controller to test the addition of a local audit log file sink
<mup> Bug #1385276 changed: juju leaves security groups behind in aws <bug-squad> <destroy-environment> <ec2-provider> <jujuqa> <repeatability> <security> <juju-core:Fix Released> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1385276>
<mup> Bug #1597704 opened: juju status --format=tabular goes double space if one message is long <2.0> <status> <ui> <juju-core:Triaged> <https://launchpad.net/bugs/1597704>
<thumper> jam: http://reviews.vapour.ws/r/5186/
<axw> cherylj: can I JFDI https://github.com/juju/juju/pull/5741, or would you like me to wait?
<jam> thumper: http://reviews.vapour.ws/r/5188/
<mup> Bug #1597720 opened: It is not possible to refer to multiple models with the same name from the CLI <juju-core:Triaged> <https://launchpad.net/bugs/1597720>
<ericsnow> axw: https://github.com/juju/juju/pull/5743
<redir> fwereade: diagram?
<fwereade> wallyworld, do you have a CL up for that logging yuck? it's... really quite inconvenient
<fwereade> redir, that sounds good, I thought we had a couple of sessions coming?
<redir> fwereade: OK. I am also good doing it by hangout with a screen large enough and a trackball
<natefinch> redir: https://github.com/natefinch/claymud/blob/master/util/query.go
 * perrito666 wonders if his credit card would mind if he nuked it with a laptop
<fwereade> redir, face to face is better, though: can we do it after the escape analysis session?
<redir> sure thing fwereade
<alexisb> babbageclunk, please update when you have a moment: https://bugs.launchpad.net/juju-core/+bug/1567708
<mup> Bug #1567708: unit tests fail with mongodb 3.2 <juju-core:In Progress by 2-xtian> <https://launchpad.net/bugs/1567708>
<alexisb> https://bugs.launchpad.net/juju-core/+bug/1579010
<mup> Bug #1579010: state: removing model can generate huge transactions <destroy-model> <juju-release-support> <scalability> <juju-core:Triaged> <https://launchpad.net/bugs/1579010>
<alexisb> ^^ thumper
<mup> Bug #1555368 changed: Panic due to sending on closed channel <ci> <intermittent-failure> <panic> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1555368>
<menn0> perrito666: http://reviews.vapour.ws/r/5190/
<babbageclunk> alexisb: yup, sorry - was afk
<cory_fu> junaidali: You asked in a PM about `unit-get public-address` returning an IP instead of a FQDN for your deployment and whether there was any way to work around that; I'm not entirely certain, but I think that Juju just passes through what the provider gives it.  Someone here might be able to give more insight
<mbruzek> alexisb: and or cherylj: A partner at IBM has pointed out a possible bug with Resources here: https://bugs.launchpad.net/juju-core/+bug/1597354  This is a blocker for IBM and they are very concerned about it. Can someone triage it and respond in the bug?
<mup> Bug #1597354: Juju 2.0 Resource Error - cannot add resource failed to write data: read tcp : i/o timeout <juju-core:New> <https://launchpad.net/bugs/1597354>
<cherylj> mbruzek: we need /var/log/syslog and /var/log/juju/machine-0.log  from the controller machine
<cherylj> mbruzek: I'll put that in the bug
<mbruzek> cherylj: they are chatting with me on IRC. I told them what you need as well.
<mbruzek> Thank you cherylj
<cherylj> sure thing.  I might need to pull in katco or someone else who's done work on resources to help
<cherylj> katco, ericsnow - could either of you take a look at this bug comment and see if there's any workaround?  https://bugs.launchpad.net/juju-core/+bug/1597354/comments/3
<mup> Bug #1597354: Juju 2.0 Resource Error - cannot add resource failed to write data: read tcp : i/o timeout <juju-core:Incomplete> <https://launchpad.net/bugs/1597354>
<ericsnow> cherylj: yep
<cherylj> thanks ericsnow!
<aisrael> cherylj: Do you know who'd be the point of contact for juju actions?
<cherylj> aisrael: bogdanteleaga might be able to help you out
<aisrael> cherylj: Thanks!
<cherylj> np!
<katco> fwereade: sorry about that. landed that change at close to midnight to try and unblock axw the following day. didn't have time to do any manual testing
<aisrael> bogdanteleaga: I think the issue w/the client I'm helping is resolved, so you're off the hook. I have some usability requests but I'll file bugs for those.
<katco> ericsnow: hey saw you pinged me earlier
<ericsnow> katco: just wanted to catch up
<ericsnow> katco: and see how things have gone with closing ;)
<katco> ericsnow: boring paperwork :)
<ericsnow> katco: yep
<cherylj> jam, alexisb - https://bugs.launchpad.net/juju-core/+bug/1594720
<mup> Bug #1594720: lxd containers not using configured proxy for downloading images <addressability> <lxd> <network> <proxy> <juju-core:Triaged> <https://launchpad.net/bugs/1594720>
<thumper> urulama: all api breaks landed
<thumper> as far as we are aware
<thumper> frankban: ^^
<thumper> if I have missed any, I'll be grumpy but let me know
<thumper> just emailed juju-dev list with details
<frankban> thumper: cool thanks
<thumper> some cribbed from frankban's email
<thumper> :)
<thumper> no worries
<urulama> thumper: all api breaks *that we are aware of atm* landed ? :)
<thumper> yeah
<frankban> urulama: ah "DestroyModel has moved from the "Client" facade to the "ModelManager" facade" I was not aware of that, we need to handle that as well
<thumper> I thought I got them the first time around
<urulama> frankban: yeah, seen the list :-/
<thumper> and the CharmInfo call
<urulama> frankban: but ATM, we don't destroy models in the gui as well
<thumper> not sure if you were getting it from "Client" or "Charms"
<frankban> urulama: I remember we do
<thumper> the go api only called the "Client"
<urulama> frankban: it was disabled
<frankban> thumper: yes I am working on CharmInfo
<frankban> urulama: cool, but let's just keep that in mind, the change should be trivial
<thumper> however there were two, slightly different, implementations in the server
<urulama> frankban: yeah, keeping the list visible on screen :)
<frankban> thumper: I see we had a side effect of this API change that is not only related to consistency at least: we discovered quite a lot of internal structures sent over the wire
<mup> Bug #1597830 opened: agent restarted as part of machine jobs update <juju-core:Triaged by anastasia-macmood> <juju-core 1.25:In Progress by anastasia-macmood> <https://launchpad.net/bugs/1597830>
<plars> Hi, anyone seen a situation where the bootstrap node constantly has *very* high load? I'm not sure if it's the cause or a symptom, but mongodb is hammering the logs
<plars> high as in - 400-500! restarting juju-db brings it down for a little while, but it goes back up pretty quickly
<plars> juju version is 1.25.5
<perrito666> plars: how many nodes?
<plars> perrito666: it seems to be just the bootstrap node where this is happening
<perrito666> plars: yes, but how many nodes are there in your environment?
<plars> perrito666: not many, I think 5 or so?
<plars> perrito666: I didn't see this problem until recently
<cherylj> bogdanteleaga: are you around?
<plars> perrito666: we had a power outage, and things came back. But I had a lot of trouble reaching anything or getting juju status. After a few timed out attempts, I was finally able to juju ssh 0 and see that the load was so high
<perrito666> plars: ok, that strange behavior with so little nodes I have not seen before, what size is your db? (you can tell by the size of /var/lib/juju/db folder in the bootstrap node
<bogdanteleaga> cheryl, yup
<cherylj> hey hey bogdanteleaga
<plars> perrito666: all the files taken together? there are a lot of numerical extension files there, like rotated logs?
<bogdanteleaga> what's up
<perrito666> all together
<cherylj> bogdanteleaga: I've been working more on bug 1577949
<mup> Bug #1577949: windows services cannot upgrade to 1.25.6 <blocker> <ci> <regression> <upgrade-juju> <windows> <juju-core:In Progress by anastasia-macmood> <juju-core 1.25:In Progress by cherylj> <https://launchpad.net/bugs/1577949>
<plars> perrito666: 7.8G according to du
<cherylj> bogdanteleaga:  and I'm seeing that when we restart the juju machine agent on the windows machine, and it's running 1.25.6, it thinks that the unit agent isn't running and that it has to start it again
<cherylj> bogdanteleaga: so the service.ListServices call isn't listing the unit agent
<cherylj> bogdanteleaga: I don't see that anything has changed in juju/juju/service/windows
<cherylj> bogdanteleaga: nor has the github.com/gabriel-samfira/sys dependency
<cherylj> BUT
<bogdanteleaga> we have had some problems some time ago with that particular function having weird behavior when compiled with different go versions
<cherylj> bogdanteleaga: we did change from using go 1.2 to using go 1.6 between 1.25.5 and 1.25.6
<anastasiamac> thumper: fwereade: https://github.com/juju/juju/pull/5746 (for some reason rb is not picking it up)
<bogdanteleaga> iirc, it didn't work with 1.4, but it did work with 1.6 in our tests, so we ended up not changing it
<bogdanteleaga> you can just do a small main function that calls it and see if it works
<bogdanteleaga> in case you have access to the failing machine
<perrito666> plars: for now, I can advise you to keep juju/juju-db running and, if the load can be sustained, leave it be and it might clean up the db; otherwise perhaps tomorrow I could give you a better answer, since we are working on a similar issue trying to pinpoint what is going on
<cherylj> bogdanteleaga: I have access, but I would need some handholding as I don't know windows much at all
<plars> perrito666: it's running ok, but it's been in this state for almost a week
<perrito666> plars: a full week running at a 400 load?
<perrito666> plars: could you ping me tomorrow same time?
<plars> perrito666: I modified rsyslog to have it discard a lot of those messages and was able to get the load down to 15-100 most of the time, but still not great
<perrito666> I might have a solution for you
<plars> perrito666: sure
<perrito666> tx a lot, if you could put up some logs and perhaps report a bug I would be most grateful
<plars> perrito666: sure, will do
<perrito666> again, tx a lot
<cherylj> bogdanteleaga: I need to run soon (I'm sprinting with the team in London this week and I need to get out of the office)
<bogdanteleaga> cherylj, so is it the deployer that checks for the unit agents?
<bogdanteleaga> yeah, I g2g soon too
<bogdanteleaga> so my first advice would be to just write a small main function and create an executable that just calls ListServices and check how the output looks
<bogdanteleaga> (run the executable on the failing machine, and also make sure to use the same go version for compilation)
<cherylj> bogdanteleaga: yeah - worker/deployer
<cherylj> bogdanteleaga: if it doesn't work, do you guys already know what changes were needed for 1.4?
<cherylj> (in case they're still broken)
<mgz> Odd_Bloke: /quot eod
<mgz> Odd_Bloke: sorry, ignore me
<mup> Bug #1597860 opened: "juju machine remove" cmd throughs error in 2.0  <juju-core:New> <https://launchpad.net/bugs/1597860>
<mup> Bug #1597879 opened: jujud hangs on trusty arm64 <juju-core:New> <https://launchpad.net/bugs/1597879>
<bogdanteleaga> cherylj, we needed a new package that used another windows api to fetch the info, you should probably ask gabriel about it
<mup> Bug #1597941 opened: juju2.0beta10: websockets API usability Application Deploy failure to inform of required addCharm pre-requisite <kanban-cross-team> <usability> <juju-core:New> <https://launchpad.net/bugs/1597941>
<mup> Bug #1597941 changed: juju2.0beta10: websockets API usability Application Deploy failure to inform of required addCharm pre-requisite <kanban-cross-team> <usability> <juju-core:New> <https://launchpad.net/bugs/1597941>
<mup> Bug #1597860 changed: "juju machine remove" cmd thows error in 2.0 <juju-core:Invalid> <https://launchpad.net/bugs/1597860>
<mup> Bug #1597941 opened: juju2.0beta10: websockets API usability Application Deploy failure to inform of required addCharm pre-requisite <kanban-cross-team> <usability> <juju-core:New> <https://launchpad.net/bugs/1597941>
#juju-dev 2016-07-01
<mup> Bug #1576366 changed: juju 2 beta6: show-controller --format=json is broken <landscape> <juju-core:Expired> <https://launchpad.net/bugs/1576366>
<perrito666> I believe reviewboard is not having a good day
<perrito666> this never got a rb link https://github.com/juju/juju/pull/5747
<mup> Bug #1597354 changed: Juju 2.0 Resource Error - cannot add resource failed to write data: read tcp : i/o timeout <juju-core:Incomplete> <https://launchpad.net/bugs/1597354>
<mup> Bug #1598049 opened: TestLogRecordForwarded fails on non-ubuntu <blocker> <centos> <ci> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1598049>
<mgz> cherylj, ericsnow: ^ bug A
<mgz> ericsnow, katco: also bug 1598063 (I have not assigned this one to milestone, we may be able to punt)
<mup> Bug #1598063: Data race in apiserver/observer package <race-condition> <juju-core:Triaged> <https://launchpad.net/bugs/1598063>
<mgz> wait, I did, but maybe we punt
<mup> Bug #1598063 opened: Data race in apiserver/observer package <race-condition> <juju-core:Triaged> <https://launchpad.net/bugs/1598063>
<jam> mgz: https://github.com/juju/juju/pull/5750
<mgz> sinzui: this looks good to me ^
<mgz> ericsnow:
<mgz> ok  	github.com/juju/juju/featuretests	138.465s
<mgz> http://reports.vapour.ws/releases/4108/job/run-unit-tests-win2012-amd64/attempt/2566
<mgz> ericsnow: that's 54617e0a from wed
<mgz> so, it passed (possibly with that junk in the logs, not shown), before your changes
<ericsnow> yep
<sinzui> mgz: jam: yes this looks good. but consider that run-unit-tests always calls "go test -i ./..." now
<jam> sinzui: I missed that, but I see it now.
<sinzui> jam: so gig mgz... who added it to the script :)
<sinzui> jam: mgz: In the past, we varied the command line for testing. 2 or 3 of the variations are no longer used. I think the only case where we don't use the makefile is running with race
<mgz> sinzui: yeah, we were looking at the gating job specifically
<sinzui> mgz: yeah that is one we flip-flop from makefile's test to --race
<sinzui> mgz: and the goal for xenial-amd64 is to use race next week
<sinzui> or today even
<perrito666> would anyone kindly review https://github.com/juju/juju/pull/5747 which lacks a reviewboard link for reasons that escape my control?
<perrito666> axw: I addressed your comments for register, please re-check
<axw> perrito666: reviewed
<perrito666> axw: tx
<perrito666> did you really go through the checklist?
<anastasiamac> perrito666: did u keep checklist in mind while coding? :D
 * axw looks shamefaced
<axw> we're not starting till next week :)
<perrito666> anastasiamac: evidently, I always do, even before it existed :p
<perrito666> thumper: as if it had heard you https://twitter.com/4BringingFire/status/748265855398576128
<mup> Bug #1598113 opened: resource-get should not download if not necessary <resources> <juju-core:New> <https://launchpad.net/bugs/1598113>
<perrito666> cherylj: got a failure from featuretests, which I little suspected were testing this
<cherylj> :/
<mgz> perrito666: bug 1598049?
<mup> Bug #1598049: TestLogRecordForwarded fails on non-ubuntu <blocker> <centos> <ci> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1598049>
<mgz> or a new one?
<perrito666> mgz: a new one, I made a change in juju register and it would seem that the same functionality is being tested in the cmd tests and in feature tests
<mup> Bug #1316223 changed: specifying juju deploy --networks=vlan:42 causes a panic <deploy> <juju-core:Invalid> <https://launchpad.net/bugs/1316223>
<mup> Bug #1584805 changed: Timeout in github.com/juju/juju/apiserver/service on windows <bitesize> <ci> <regression> <test-failure> <timeout> <unit-tests> <windows> <juju-core:Fix Released> <juju-core 1.25:New> <https://launchpad.net/bugs/1584805>
<mup> Bug #1595276 changed: TestDestroyControllerErrors failure with out of order errors <azure-provider> <ci> <intermittent-failure> <test-failure> <juju-core:Fix Released by fwereade> <https://launchpad.net/bugs/1595276>
<mup> Bug #1598118 opened: log-forwarder worker bounces endlessly when forwarding is not configured <2.0> <debug-log> <log-forwarding> <logging> <juju-core:Triaged> <https://launchpad.net/bugs/1598118>
<mup> Bug #1598127 opened: lxdbr0 spam in log file <logging> <juju-core:Triaged> <https://launchpad.net/bugs/1598127>
<perrito666> mgz: this is a flaky run, right? http://juju-ci.vapour.ws:8080/job/github-merge-juju/8340/console
<mup> Bug #1598127 changed: lxdbr0 spam in log file <logging> <juju-core:Triaged> <https://launchpad.net/bugs/1598127>
<mup> Bug #1598127 opened: lxdbr0 spam in log file <logging> <juju-core:Triaged> <https://launchpad.net/bugs/1598127>
<dimitern> cherylj: here it is - bug 1598164
<mup> Bug #1598164: [aws] adding a machine post-bootstrap on the controller model closes of api port in controller security group <add-machine> <addressability> <ec2-provider> <tech-debt> <juju-core:New> <https://launchpad.net/bugs/1598164>
<mup> Bug #1598164 opened: [aws] adding a machine post-bootstrap on the controller model closes of api port in controller security group <add-machine> <addressability> <ec2-provider> <tech-debt> <juju-core:New> <https://launchpad.net/bugs/1598164>
<cherylj> thanks, dimitern!
<katco> mgz: thanks, i'll pick up bug 1598063 shouldn't be a hard dx/fix
<mup> Bug #1598063: Data race in apiserver/observer package <race-condition> <juju-core:Triaged by cox-katherine-e> <https://launchpad.net/bugs/1598063>
<mgz> katco: thank you!
<mgz> perrito666: bug 1598049
<mup> Bug #1598049: TestLogRecordForwarded fails on non-ubuntu <blocker> <centos> <ci> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1598049>
<perrito666> mgz: tx
<katco> very simple review for someone: http://reviews.vapour.ws/r/5201/
<katco> mgz: perhaps you are interested
<mup> Bug #1598206 opened: lxc/lxd/shared/util_linux.go sys/types.h: No such file or directory <blocker> <ci> <ppc64el> <regression> <test-failure> <unit-tests> <juju-core:Triaged> <https://launchpad.net/bugs/1598206>
<mgz> katco: ta
<katco> wallyworld: ping
<wallyworld> katco: hey, how'd the house go?
<katco> wallyworld: fine, boring
<wallyworld> boring is good
<katco> wallyworld: yep :)
<katco> wallyworld: re. your comment on my review. are you referring to the observer multiplexer?
<katco> wallyworld: e.g. this? https://github.com/juju/juju/blob/master/apiserver/observer/observer.go#L102-L115
<katco> i.e. rather
<wallyworld> katco: https://github.com/juju/juju/blob/master/apiserver/observer/observer.go#L109
<wallyworld> the use of a go routine inside the loop
<katco> wallyworld: i don't think that's what's causing the issue
<katco> wallyworld: the race is between a call to ServerRequest and ServerReply
<wallyworld> the race output seemed to implicate that aspect; i was going by advice from william
<wallyworld> ie the race output specifically talks about calls from inside those goroutines
<katco> wallyworld: that loop will just call a single observer method on multiple observers concurrently.
<katco> wallyworld: yeah, but not at the top of the call-stack...
<katco> wallyworld: Previous write by goroutine 56:
<katco>   github.com/juju/juju/apiserver/observer.(*RequestNotifier).ServerRequest()
<katco> wallyworld: Read by goroutine 245:
<katco>   github.com/juju/juju/apiserver/observer.(*RequestNotifier).ServerReply()
<katco> wallyworld: the issue is that there are two calls coming into the same observer concurrently... i could remove the multiplexer entirely and this race would still occur
<wallyworld> ok, my brain hasn't yet delved into the full detail, so i can't confirm mentally one way or the other
<wallyworld> but we didn't see this race before, right
<katco> wallyworld: this observer had mutexes before
<wallyworld> the observer stuff seems to have introduced it
<wallyworld> where were the mutexes? why were they removed?
<wallyworld> did their removal introduce the race?
<katco> wallyworld: https://github.com/juju/juju/blob/bbc4a902fe44ee6effdd5e0216b3e0b8216643ef/apiserver/apiserver.go#L248
<katco> wallyworld: because of what i said in the PR... i incorrectly assumed that requests/replies would happen synchronously
<katco> wallyworld: the rpc server does not guarantee that. it has nothing to do with the multiplexer
<katco> wallyworld: here's where ServerReply is kicked off on a new goroutine: https://github.com/juju/juju/blob/master/rpc/server.go#L465
<wallyworld> katco: i'm slow today (or always) - so the above mutex on line 248 avoided races before this new work?
<katco> wallyworld: apparently so
<wallyworld> so why does this new work introduce the races then? is that mutex removed?
<wallyworld> the observer stuff should not need extra locking?
<katco> wallyworld: yes, because the rpc server will call ServerRequest, and then spawn another goroutine and call ServerReply
<katco> wallyworld: if it's not in the observer, we need to touch rpc.Conn to lock there
<katco> wallyworld: i.e. synchronize the reply to the request so it's synchronous
<katco> wallyworld: i can tell you don't believe me; i'll just inject the RequestNotifier and show that the problem doesn't reside in the multiplexer
<thumper> wallyworld: yarp
<perrito666> sinzui: 2 things, 1) how will I know when I have my new mongo 2) how did you find a mongo for windows with ssl?
<sinzui> perrito666: we rarely update the db. the choice to switch to 2.6 was driven by the fact that it is supported.
<sinzui> perrito666: When we want to test only with 3.2, we will put 3.2 on the host
<katco> wallyworld: ok, justification posted to the review
<perrito666> mgz: https://github.com/juju/juju/pull/5759
<mup> Bug # changed: 1575940, 1588403, 1594415, 1596967, 1597372, 1597516, 1597519
<wallyworld> katco: hey, sorry, have been smashed trying to get log forwarding working using the actual code for the build
<katco> wallyworld: no worries at all
<wallyworld> katco: one comment given to me in addition was the use of a RWMutex instead of just a sync.Mutex - the added complexity is not justified
<katco> wallyworld: i'm sorry, i don't understand how that's more complex?
<mup> Bug #1598272 opened: LogStreamIntSuite.TestFullRequest sometimes fails <ci> <intermittent-failure> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1598272>
<wallyworld> katco: it's cognitive overhead - when do i use read vs write lock, as opposed to just lock
<mup> Bug #1585825 changed: Takes too long to download a resource from a controller to unit <ci> <resources> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1585825>
<katco> wallyworld: shouldn't we be thinking about that?
<wallyworld> katco: not prematurely
<wallyworld> not unless it has been shown to be an issue
<katco> wallyworld: it's our RPC mechanism, we want to limit our critical section as much as possible
<mup> Bug #1585825 opened: Takes too long to download a resource from a controller to unit <ci> <resources> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1585825>
<mup> Bug #1585825 changed: Takes too long to download a resource from a controller to unit <ci> <resources> <test-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1585825>
<mup> Bug # opened: 1598286, 1598289, 1598290, 1598291
<mup> Bug # opened: 1598286, 1598289, 1598290, 1598291, 1598292, 1598293
<mup> Bug #1598292 changed: log forwarding subject to clock skew <juju-core:Triaged> <https://launchpad.net/bugs/1598292>
<mup> Bug #1598293 changed: log forwarding feature does not use updated config <juju-core:Triaged> <https://launchpad.net/bugs/1598293>
<mup> Bug #1598292 opened: log forwarding subject to clock skew <juju-core:Triaged> <https://launchpad.net/bugs/1598292>
<mup> Bug #1598293 opened: log forwarding feature does not use updated config <juju-core:Triaged> <https://launchpad.net/bugs/1598293>
<mup> Bug #1598319 opened: Openstack Provider - No way to use multiple images <juju-core:New> <https://launchpad.net/bugs/1598319>
#juju-dev 2016-07-02
<mup> Bug #1598329 opened: juju status showing charm unit in error state but getting "ERROR unit is not in an error state" message when resolving charm <juju-core:New> <https://launchpad.net/bugs/1598329>
<mup> Bug #1598362 opened: MongoDB replica error in machine-0 <bootstrap> <mongodb> <juju-core:New> <https://launchpad.net/bugs/1598362>
<mup> Bug #1598390 opened: Juju 2.0 Resources - Issue faced in the deployment of a charm from charm store when juju-attach is used <juju-core:New> <https://launchpad.net/bugs/1598390>
<fwereade> cmars, tasdomas: long shot, I expect, bit if you're around I would really appreciate a review of http://reviews.vapour.ws/r/5207/
<mup> Bug #1474607 changed: worker/uniter/relation: HookQueueSuite.TestAliveHookQueue failure <blocker> <ci> <go1.5> <go1.6> <regression> <windows> <juju-core:Fix Released by axwalk> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1474607>
<mup> Bug #1474607 opened: worker/uniter/relation: HookQueueSuite.TestAliveHookQueue failure <blocker> <ci> <go1.5> <go1.6> <regression> <windows> <juju-core:Fix Released by axwalk> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1474607>
<mup> Bug #1474607 changed: worker/uniter/relation: HookQueueSuite.TestAliveHookQueue failure <blocker> <ci> <go1.5> <go1.6> <regression> <windows> <juju-core:Fix Released by axwalk> <juju-core 1.25:Won't Fix> <https://launchpad.net/bugs/1474607>
#juju-dev 2017-06-26
<wallyworld> thumper: i added a video call to the meeting
<wallyworld> we'll project you
<thumper> oh awesome
<wallyworld> also created a shared doc
<thumper> wallyworld: I'm in the call
<stokachu> is the tenant-id for azure no longer required?
<stokachu> Unable to create model: ERROR finalizing "conjure-azure-517" credential for cloud "azure": unknown key "tenant-id"
#juju-dev 2017-06-27
<thumper> anyone... https://github.com/juju/juju/pull/7557
<thumper> axw: do you know the answer to stokachu above?
<axw> stokachu: correct, it is no longer required  - juju figures it out based on the subscription-id
<axw> thumper: a total connection counter metric with entity tag/ID as label would be nice :)
<stokachu> axw: application-id, subscription-id, and application-password are all thats needed now?
<thumper> axw: what do you mean?
<axw> stokachu: correct
<stokachu> axw:ty!
<axw> thumper: I mean, a prometheus metric which tracks number of logins. you can have labels on metrics, which gives them a dimension, so you end up with a separate count for each dimension
<axw> thumper: in this case, dimension=source agent
<thumper> well... most agents will have two and only two
<thumper> one for the api and one for logs
<thumper> not sure how we can turn that into useful info
<axw> thumper: I don't mean current number of conns. total = counter that keeps going up when it logs in, never goes down
<thumper> ah
<axw> thumper: so you can see if it goes up over time, because it's bouncing
<thumper> hmm... that would be interesting
<axw> thumper: e.g. I just added https://github.com/juju/juju/blob/2.2/mongo/mongometrics/dialmetrics.go, which lets us see the rate mongo dials over time
<axw> thumper: per server
<thumper> typo on line 39 :)
<thumper> dialng
<axw> oops
<thumper> :)
<frankban> hey all, I need a review for https://github.com/juju/juju/pull/7563, anyone available>
<frankban> ?
#juju-dev 2017-06-28
<frankban> wallyworld: do you have time to look at https://github.com/juju/juju/pull/7563 ?
<wallyworld> frankban: it doesn't look right - there is no application level workload version. workload version is set by the charm, and each unit can potentially run a separate charm revision until any upgrades sync across all units
<frankban> wallyworld: it's kind of an abstraction, the version set by the last unit wins, similar to the application status
<wallyworld> with status, that  is meant to only be set by the leader, so it's not really last one wins there
<frankban> wallyworld: 99% of the times the version is the same for all units, and "juju status" already uses that abstraction (version is displayed in the applications section, for each application)
<wallyworld> agree it is the same 99% of the time but the other 1% we are lying
<frankban> wallyworld: and we are already lying in "juju status"
<wallyworld> if others think it's ok to do this then it's ok i guess
<frankban> wallyworld: I mean, I am just trying to meet users expectations, and they want in the gui something similar to what they get in juju status
<frankban> wallyworld: we can add the more exact version in the unit info later if required
<wallyworld> that is fair enough, if it just matches status then it's no better or worse
<wallyworld> sgtm
<wallyworld> i didn't realise we were matching status
<frankban> wallyworld: ty
<wallyworld> frankban: np, sorry for pushing back a bit, i just wanted to be sure we were doing the right thing
<frankban> wallyworld: np
#juju-dev 2017-06-29
<axw> veebers: http://juju-ci.vapour.ws/job/github-check-merge-juju/1587/artifact/artifacts/xenial.log/*view*/
<mup> Bug #1694988 changed: AWS instances created by juju don't have an IPv6 assigned, even if "auto-assign IPv6 addresses" is enabled for the subnet <canonical-is> <juju:New> <https://launchpad.net/bugs/1694988>
#juju-dev 2017-06-30
<babbageclunk> wallyworld: Do I need a review for a revert?
<wallyworld> babbageclunk: nah, land it
<mup> Bug #1701481 opened: juju 1.25 leaks memory (1.25.11+) <juju-core:New> <https://launchpad.net/bugs/1701481>
<SimonKLB> hey! what is required to put my juju auth into a container? i've tried mounting both ~/.go-cookies and ~/.local/share/juju
<SimonKLB> i'm trying to deploy a charm that is not published yet, on the host it works since i've authenticated, but in the container i get {macaroon discharge required: authentication required}
<SimonKLB> solved it, host was running 2.1.3 and container 2.2, so the host didn't have the controller specific cookiejar
